Commun. Math. Phys. 285, 1–29 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0643-x
Communications in
Mathematical Physics
On Mason’s Rigidity Theorem
Piotr T. Chruściel¹,², Paul Tod³
¹ LMPT, Fédération Denis Poisson, Parc de Grandmont, 37200 Tours, France. E-mail: [email protected]; www.phys.univ-tours.fr/∼piotr
² Mathematical Institute and Hertford College, Oxford, UK
³ Mathematical Institute and St John’s College, Oxford, UK. E-mail: [email protected]
Received: 21 April 2007 / Accepted: 7 July 2008
Published online: 14 October 2008 – © Springer-Verlag 2008
Abstract: Following an argument proposed by Mason, we prove that there are no algebraically special asymptotically simple vacuum space-times with a smooth, shear-free, geodesic congruence of principal null directions extending transversally to a cross-section of $\mathscr{I}^+$. Our analysis leaves the door open for escaping this conclusion if the congruence is not smooth, or not transverse to $\mathscr{I}^+$. One of the elements of the proof is a new rigidity theorem for the Trautman-Bondi mass.

Contents
1. Introduction
2. Non-Differentiable Congruences
3. The Metric Form of Algebraically-Special Vacuum Solutions
4. Spacelike Hypersurfaces, a Rigid Positive Energy Theorem
5. Concluding Remarks
A. Smoothness of $\ell$ for Non-Branching Metrics
B. Rescalings, $\rho$ and Smooth Extendibility of $\tilde\ell$
1. Introduction

It is a long standing conjecture that the only vacuum algebraically special asymptotically simple space-time is Minkowski space. Arguments towards a proof have been presented in [14]. The aim of this work is to establish the conjecture under a set of restrictive conditions, with the aid of a rigidity theorem for Trautman-Bondi mass, a complete proof of which has not been presented previously.

A space-time $(\mathscr{M}, g)$ is said to admit a conformal boundary completion at infinity if there exists a manifold $\widetilde{\mathscr{M}} = \mathscr{M} \cup \mathscr{I}$ with boundary $\mathscr{I}$ and a function $\Omega$ on $\widetilde{\mathscr{M}}$ vanishing precisely on $\mathscr{I}$, with nowhere vanishing gradient there, such that the metric $\tilde g := \Omega^2 g$ extends smoothly to a tensor field with Lorentzian signature defined on $\widetilde{\mathscr{M}}$. We denote by
$\mathscr{I}^+$, respectively $\mathscr{I}^-$, the set of points on $\mathscr{I}$ which are end-points of null future directed, respectively past directed, geodesics. We will say that¹ $(\mathscr{M}, g)$ is past asymptotically simple if every maximally extended null geodesic acquires a past end-point on $\mathscr{I}^- \subset \mathscr{I}$. Future asymptotic simplicity is defined by changing time-orientation. Following [17], we use the term asymptotic simplicity if $\mathscr{M}$ does not contain closed timelike curves, and if both past and future asymptotic simplicity hold. An embedded submanifold of $\mathscr{I}$ will be said to be a cross-section if it meets generators of $\mathscr{I}$ transversally, and at most once each.

Asymptotically simple space-times with null conformal boundaries are known to be globally hyperbolic [17], with contractible Cauchy surfaces, with $\mathscr{I}^+$ and $\mathscr{I}^-$ containing $\mathbb{R} \times S^2$, where the $\mathbb{R}$ factor corresponds to motions along the null geodesic generators. Furthermore, $\mathscr{I}$ reduces to two copies of $\mathbb{R} \times S^2$ if one assumes that the extended space-time $(\widetilde{\mathscr{M}}, \tilde g)$ is strongly causal at $\mathscr{I}$.

It appears of some interest to consider algebraically special space-times which are asymptotically simple to the past, without necessarily being asymptotically simple.² Such space-times could describe e.g. the formation of a black hole in a space-time without singularities in the past.

Recall that a space-time $(\mathscr{M}, g)$ is algebraically special if at every point there exists a null vector $\ell$ such that the Weyl tensor $C_{\mu\nu\rho\sigma}$ satisfies
$$ C_{\mu\nu\rho[\sigma}\,\ell_{\pi]}\,\ell^\rho \ell^\nu = 0. \tag{1.1} $$
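For orientation, condition (1.1) can be restated in Newman-Penrose language. The block below is a sketch under standard conventions that are assumed here rather than quoted from this paper: a null tetrad $(\ell, n, m, \bar m)$ with $\ell$ as its first leg, the Weyl spinor $\Psi_{ABCD}$ with $o^A$ the spinor of $\ell$, and Weyl scalars $\Psi_0, \Psi_1$ defined up to convention-dependent overall signs.

```latex
% Weyl scalars adapted to \ell (overall signs depend on the convention used):
\Psi_0 \;\propto\; C_{\mu\nu\rho\sigma}\,\ell^\mu m^\nu \ell^\rho m^\sigma ,
\qquad
\Psi_1 \;\propto\; C_{\mu\nu\rho\sigma}\,\ell^\mu n^\nu \ell^\rho m^\sigma .

% \ell is a principal null direction iff \Psi_0 = 0, and a repeated one iff
% \Psi_1 vanishes as well; this is what (1.1) expresses:
C_{\mu\nu\rho[\sigma}\,\ell_{\pi]}\,\ell^\rho \ell^\nu = 0
\quad\Longleftrightarrow\quad
\Psi_{ABCD}\,o^A o^B o^C = 0
\quad\Longleftrightarrow\quad
\Psi_0 = \Psi_1 = 0 .
```

In this form the condition is insensitive to rescalings $\ell \to f\ell$ with $f$ nowhere zero, which is why $\ell$ may later be rescaled freely.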
Assume that $(\mathscr{M}, g)$ is vacuum, or that the Ricci tensor satisfies a set of restrictions listed in detail below. Then, on regions where the $\ell$'s can be chosen to produce a smooth vector field, near those orbits along which the Weyl tensor isn't zero everywhere, $\ell$ can be rescaled³ so that its integral curves are null affinely parameterized geodesics without shear. Conversely, the existence of such a congruence implies (1.1). In asymptotically simple space-times the integral curves of $\ell$ extend smoothly to the conformal boundary at their end points. The question then arises, whether a suitable rescaling $\tilde\ell$ of $\ell$ extends by continuity to a smooth vector field defined on the set
$$ \mathscr{M} \cup \underbrace{\{ p \in \mathscr{I} \mid p \text{ is an end point of precisely one integral curve of } \ell \}}_{\mathscr{U}} \subset \widetilde{\mathscr{M}} := \mathscr{M} \cup \mathscr{I}. $$
Since the integral curves of $\ell$ intersect $\mathscr{I}$ transversally, $\tilde\ell$ is transverse to $\mathscr{U}$. However, neither continuity nor differentiability of $\tilde\ell$ at $\mathscr{U}$ are clear. Further, $\tilde\ell$ might become singular as the boundary $\overline{\mathscr{U}} \setminus \mathscr{U}$ is approached, or perhaps develop zeros there. Problems will clearly arise at points at which more than one integral curve of $\ell$ meets $\mathscr{I}^+$, assuming that $\tilde\ell$ can be defined at those points at all.

One of our results here (see Appendix B) is the proof that smoothness and transversality to $\mathscr{I}^+$ of $\tilde\ell := \Omega^{-2}\ell$ is equivalent to the non-existence of zeros of the complex divergence $\rho = \bar m^\mu m^\nu \nabla_\mu \ell_\nu$ (see, e.g., [16] for details of the definition of $m^\mu$) of the congruence defined by $\ell$ in a neighborhood of $\mathscr{I}^+$.

¹ Recall that a space-time is a time-oriented Lorentzian manifold, the topology of which is assumed to be metrisable. In view of our extensive use of the NP formalism the signature $(+ - - -)$ is used.
² We are grateful to an anonymous referee for pointing out this possibility to us.
³ A priori this can be done only locally; however, in globally hyperbolic space-times (which is the case here) this can always be done globally when $\ell$ is globally smooth.
Assuming that $\ell$ is globally well defined, in Sect. 4 we prove:

Theorem 1.1. The following set of conditions is incompatible in vacuum:
1. $(\mathscr{M}, g)$ is past asymptotically simple and contains a contractible Cauchy surface.⁴
2. There exists on $\mathscr{M}$ a smooth, null, shear-free, geodesic vector field $\ell$.
3. We have $\mathscr{I}^- \approx \mathbb{R} \times S^2$, and there exists a compact cross-section $S^+$ of $\mathscr{I}^+$ near which a rescaling $\tilde\ell$ of $\ell$ extends by continuity to a smooth vector field which is transverse to $\mathscr{I}^+$.
The statement remains true for non-vacuum metrics if the dominant energy condition holds and if the Newman-Penrose components $\Phi_{00}$, $\Phi_{01}$, $\Phi_{02}$ and $\Lambda$ of the Ricci tensor,⁵ associated to the congruence defined by $\ell$, vanish, with the remaining components decaying fast enough.⁶

Remark 1.2. Theorem 1.1 holds under finite differentiability conditions on $\tilde\ell$, but we have not attempted to determine the threshold; in any case there are no a priori reasons for a geodesic shear-free null congruence to be globally $C^0$ or $C^1$, even if the metric is smooth. Indeed, poorly differentiable examples can be constructed in Minkowski space-time, see Sect. 2 where somewhat more general congruences are allowed.

Remark 1.3. As can be seen from footnote 5, the Ricci tensor conditions of Theorem 1.1 will hold in electro-vacuum if $o^A o^B \varphi_{AB} = 0$, where $\varphi$ is the Maxwell spinor, with $o^A$ as in Sect. 3.

Theorem 1.1 is similar in spirit to the results of Mason [14]. The differences between our hypotheses and those of [14] are as follows: First, algebraic speciality does not imply the smoothness of either $\tilde\ell$ or $\ell$. Next, neither existence nor transversality of $\tilde\ell$ at $\mathscr{I}^+$ are assumed in [14]. We further note that the hypotheses of Theorem 1.1 enforce nonvanishing of twist throughout a region of $\mathscr{M}$ relevant for the proof (see Proposition 3.1 below), while more general configurations are a priori allowed in [14].⁷ Finally, our argument requires the topology of $\mathscr{I}^-$ to be $\mathbb{R} \times S^2$ which, for asymptotically simple space-times as considered in [14], is only known to be true [17] when $\mathscr{M} \cup \mathscr{I}^-$ is strongly causal.

⁴ As already pointed out, all these conditions will hold by [17] if the space-time is asymptotically simple.
⁵ Here and elsewhere, we follow the conventions of [16] for the NP spin-coefficient formalism, so that these conditions on the Ricci tensor are equivalent to the vanishing of the scalar curvature and of $\Phi_{ABA'B'}\,o^A o^B$, where $o^A$ is the spinor obtained from $\ell$.
⁶ The exact decay rates needed can be found by chasing through the calculations in [15,25] that lead to the Natorf-Tafel mass aspect formula (3.33) below.
⁷ Theorem 2.1 below allows configurations somewhat more general than Theorem 1.1, but those are still less general than indicated in [14].

The key idea stems from [14], but some steps of the argument require careful reorganizations. The proof can be structured as follows: We start, in Sect. 3, by introducing a coordinate system based on the members of the congruence. This allows us to construct a cut $S^-$ of $\mathscr{I}^-$, associated to the cut $S^+$ of $\mathscr{I}^+$, on which $\tilde\ell$ is transverse. The calculations in [14] subsequently show that the Trautman-Bondi mass $m_{TB}(S^+)$ of $S^+$ is the negative of that of $S^-$. One then wishes to appeal to the positive energy theorem to show flatness of the metric near a cross-section of $\mathscr{I}^+$. This requires controlled spacelike hypersurfaces, say $\mathscr{S}$, which are constructed at the beginning of Sect. 4. So, the positive energy theorem
of [7] implies that $m_{TB}(S^+)$ vanishes, and that $\mathscr{S}$ carries a timelike KID. An analysis of Killing developments allows one to conclude that the initial data on $\mathscr{S}$ can be realized by embedding in Minkowski space-time; this is in fact a new rigidity result for the Trautman-Bondi mass, see Theorem 4.1. One concludes by showing that no congruences with the properties listed exist near a Minkowskian $\mathscr{I}^+$.

We shall say that an algebraically special space-time is non-branching if it is either type II or D everywhere,⁸ or type III everywhere, or type N everywhere. The point is that in these cases the Weyl tensor does not allow branching of the principal null directions. We then have the following related statement:

Theorem 1.4. The following conditions are incompatible:
1. $(\mathscr{M}, g)$ is past asymptotically simple and contains a contractible Cauchy surface.⁴
2. $(\mathscr{M}, g)$ is non-branching, vacuum, with $\mathscr{I}^- \approx \mathbb{R} \times S^2$, and the complex divergence $\rho$ of the congruence has no zeros near a compact cross-section $S^+$ of $\mathscr{I}^+$.
The conclusion remains true for non-vacuum space-times if the conditions on the Ricci tensor spelled out in Theorem 1.1 are met.

Indeed, assume that such a space-time exists. We show in Appendix A that $\ell$ can be chosen to be smooth throughout $\mathscr{M}$, and in Appendix B that $\Omega^{-2}\ell$ is smooth and transverse at $S^+$. By Proposition 4.5 below $(\mathscr{M}, g)$ contains a flat region, thus is of type O there, which gives a contradiction.

2. Non-Differentiable Congruences

It appears of interest to find a set of hypotheses, alternative to those of Theorem 1.1, which are compatible with at least one space-time. A possible direction of enquiries is to admit smoothness and transversality of $\tilde\ell$ near one or more sections of $\mathscr{I}^+$, but allow singularities of $\tilde\ell$ in the space-time. (The question of non-transversal congruences will be discussed in Sect. 5.)

Now, consider any maximally extended null geodesic $\gamma$ initially tangent to $\tilde\ell$ near $\mathscr{I}^+$. In the argument below it is necessary that the tangent to $\gamma$ remains proportional to $\tilde\ell$. If $\tilde\ell$ is allowed to become singular, this last property might not hold, and it is easy to construct congruences where this occurs. (Consider, for example, any timelike curve $\Gamma$ in Minkowski space-time extending from $i^-$ to $i^+$, let $u$ denote the retarded time function based on $\Gamma$, and let $\ell = du$ on $\mathscr{M} \setminus \Gamma$. Then every integral curve of $\ell$, when followed from $\mathscr{I}^+$ towards space-time, stops at $\Gamma$.) Clearly, any argument in which null geodesics need to be followed from $\mathscr{I}^+$ to $\mathscr{I}^-$ has no chance of succeeding in such situations.

In this last example of a congruence based on a curve $\Gamma$, one can smoothly flow the geodesics through $\Gamma$, landing on a second congruence generated by the past light-cones issued from $\Gamma$. To accommodate such situations in any kind of generality would require considering multiple-valued congruences.⁹

⁸ We allow the metric to be II at some places and D at others.
⁹ The density condition (2.1) below essentially forbids multiple-valued congruences, which therefore appear to be intractable by our arguments.

One could, however, enquire what happens if $\tilde\ell$ is smooth on a dense set, and if one further assumes that null geodesics somewhere tangent to $\tilde\ell$ remain tangent to $\tilde\ell$ at all those points at which $\tilde\ell$ is defined. The apparent difficulty of flowing along a
singular vector field $\tilde\ell$ is easily resolved by flowing along the associated geodesics. Anticipating, in such situations Proposition 3.1 below does not hold anymore, and one faces the problem of understanding what happens along those geodesics on which the twist vanishes. The hypothesis that the space-time is smooth together with the Newman-Penrose equations leads then to the vanishing of some components $\psi_i$ of the Weyl tensor along such geodesics, but the implications of this are not clear. It is conceivable that this might again lead to a mass changing sign as in Proposition 3.5 below, which would allow one to conclude, but this remains to be seen.

In spite of the above, some degree of singularity of $\ell$ can be allowed, as follows. To obtain more control of the space-time we will assume full asymptotic simplicity, and consider a sequence of spherical cuts $S_i^+$ of $\mathscr{I}^+$ near which $\tilde\ell$ is again assumed to be smooth and transverse. Let us set
$$ \mathring S_i^+ := \{ p \in S_i^+ : \ell \text{ is smooth in an } \mathscr{M}\text{-neighborhood of the maximally extended null geodesic with tangent } \tilde\ell \text{ at its end point } p \}. $$
Rather than assuming that $\ell$ is smooth throughout $\mathscr{M}$, so that $\mathring S_i^+ = S_i^+$, suppose instead that
$$ \mathring S_i^+ \text{ is dense within } S_i^+. \tag{2.1} $$

Let $\mathscr{I}_0 \subset \mathscr{I}$ be defined as $\mathscr{I}_0 := \{ p \in \mathscr{I} \mid \text{strong causality holds at } p \}$, with $\mathscr{I}_0^\pm = \mathscr{I}_0 \cap \mathscr{I}^\pm$. According to Newman [17], in asymptotically simple space-times each of $\mathscr{I}_0^\pm$ is diffeomorphic to $\mathbb{R} \times S^2$, with the generators of $\mathscr{I}$ tangent to the $\mathbb{R}$ factor, which we parameterize by $u$; we choose $u$ to be increasing to the future. We let $S_i^+$ be any $S^2$ included in $\mathscr{I}_0^+$ that intersects every generator of $\mathscr{I}_0^+$ precisely once; such sets will be called cross-sections of $\mathscr{I}^+$. (It actually follows from Theorem 2.1, which we are about to state, that $\mathscr{I}_0 = \mathscr{I}$ under the hypotheses there.) We claim that:

Theorem 2.1. Let $(\mathscr{M}, g)$ be an asymptotically simple space-time with smooth null asymptote $(\widetilde{\mathscr{M}}, \tilde g)$ such that $\mathscr{I}^- \approx \mathbb{R} \times S^2$. Assume that there exists a sequence of cross-sections $S_i^+$ of $\mathscr{I}_0^+$, $i \in \mathbb{N}$, such that $\mathscr{I}_0^+ \subset \cup_{i\in\mathbb{N}} J^+(S_i^+, \widetilde{\mathscr{M}})$ together with a geodesic, shear free, null vector field $\tilde\ell$ defined on a neighborhood of $\mathscr{M} \cup_{i\in\mathbb{N}} S_i^+$ satisfying (2.1). Assume that the dominant energy condition holds and that the Newman-Penrose components $\Phi_{00}$, $\Phi_{01}$, $\Phi_{02}$ and $\Lambda$ of the Ricci tensor, associated to the congruence defined by $\ell$, vanish, while the remaining ones decay fast enough. If $\tilde\ell$ is transverse to $\cup_{i\in\mathbb{N}} S_i^+$, then $(\mathscr{M}, g)$ is the Minkowski space-time $\mathbb{R}^{1,3}$.

Remark 2.2. The Kerr congruence in Minkowski space-time (see, e.g., the Appendix to [13]) satisfies the hypotheses of Theorem 2.1.

The proof of Theorem 2.1 can be found at the end of Sect. 4.
3. The Metric Form of Algebraically-Special Vacuum Solutions

We now run through the derivation of the metric form of algebraically-special space-times, first vacuum and then noting the changes for non-vacuum. References for this are [13,14,23]. The general technique is to construct a coordinate system with the aid of the geodesic and shear-free congruence and solve enough of the Newman-Penrose spin-coefficient equations to obtain the radial dependence of the metric. We follow [16] for the Newman-Penrose spin-coefficient equations, rather than the version in [23]. We modify the derivations in [23] and [14] in order to connect the coordinate system to standard coordinates on $\mathscr{I}^+$ from the start of the calculation.

Consider a manifold $S^+$ transverse to the generators of $\mathscr{I}^+$. We use a local coordinate $u$ along the generators and a local complex coordinate $\zeta$ on $S^+$, so that the (degenerate) metric of $\mathscr{I}^+$ is
$$ -4\,\frac{d\zeta\, d\bar\zeta}{P^2}. $$
As the calculations that follow are purely local in $u$ and $\zeta$ we can, without loss of generality, choose $P = 1 + \zeta\bar\zeta$. (If $S^+$ is a sphere, then $(u, \zeta, \bar\zeta)$ are Bondi coordinates at $\mathscr{I}^+$.) Recall that, in $(\widetilde{\mathscr{M}}, \tilde g)$, the usual spinor dyad $(\tilde O^A, \tilde I^A)$ and corresponding NP tetrad $(\tilde L^a, \tilde N^a, \tilde M^a, \bar{\tilde M}^a)$ are related to the coordinates by
$$ \tilde N^a \partial_a = \partial_u, \qquad \tilde M^a \partial_a = \frac{P}{\sqrt 2}\,\partial_\zeta, $$
and
$$ \tilde N_a\, dx^a = -d\Omega, \qquad \tilde M_a\, dx^a = -\frac{\sqrt 2}{P}\, d\bar\zeta, $$
where the last equation is understood as pulled back to $\mathscr{I}^+$, where $d\Omega$ pulls back to zero (the point is that there are different ways of extending the coordinates into the interior).

By assumption, $\ell$ generates a geodesic and shear-free null congruence and we may scale $\ell$ so that it is affinely parameterized. Therefore $\ell$ defines a smooth spinor field $o^A$, which in turn can be scaled to be parallelly-propagated along the congruence. In the NP formalism, this is
$$ D o^A := \ell^b \nabla_b o^A = 0. \tag{3.1} $$
There remains a residual freedom to rescale $o^A$ by a nowhere-zero function $F^0$ which is constant along the congruence, when $\ell$ rescales with $|F^0|^2$. Under conformal rescaling of the metric $\tilde g = \Omega^2 g$, the rescaling $\tilde\ell = \Omega^{-2}\ell$ takes the affinely normalized geodesic vector field $\ell$ to an affinely normalized geodesic one, leading to a vector field $\tilde\ell$ which is continuous at $\mathscr{I}$ by hypothesis. The rescaling $\tilde o^A = \Omega^{-1} o^A$ likewise extends to $\mathscr{I}^+$, where
$$ \tilde o^A = B\,\tilde O^A + C\,\tilde I^A \tag{3.2} $$
for smooth functions $B$ and $C$ on $\mathscr{I}^+$. The assumption that $\tilde\ell$ is transverse to $\mathscr{I}^+$ on $S^+$ translates to the requirement that $B$ be nonzero on that part of $\mathscr{I}^+$. We extend $B$ and $C$
into the interior as functions constant along the congruence and then rescale $\tilde o^A$ to set $B = 1$. Now, at $\mathscr{I}^+$, we have
$$ \tilde o^A = \tilde O^A + L(u, \zeta, \bar\zeta)\,\tilde I^A, \tag{3.3} $$
in terms of a function $L$ on $\mathscr{I}^+$. The assumption of transversality implies that $L$ is a smooth function on $S^+$. (We could define $L$ independently of the scaling of $\tilde o^A$ as $L = \tilde O_A \tilde o^A / (\tilde o_B \tilde I^B)$.)

Equation (3.1) implies that the spin-coefficients $\kappa$ and $\epsilon$ are zero and, by assumption, $\sigma$ is also zero. We extend the coordinates $u$ and $\zeta$ into the interior by taking them to be constant along the geodesics of the congruence, so that $Du = 0 = D\zeta$, and then $D = \partial/\partial r$ with $r$ as before. This fixes $r$ uniquely up to a shift of origin on each geodesic of the congruence. A convenient (and standard) way to choose the origin in $r$ is next to solve one of the spin-coefficient equations, (A.3a) as given in [16] which, with the restrictions that we currently have on the spin-coefficients, is just:
$$ D\rho = \rho^2. $$
The solution of this is either $\rho \equiv 0$ or
$$ \rho = -(r + r^0 + i\Sigma)^{-1}, \tag{3.4} $$
where $r^0$ and $\Sigma$ are real functions of integration, constant along the congruence (and so are functions only of $(u, \zeta, \bar\zeta)$; as before, we use the superscript 0 for functions independent of $r$ but, as is conventional, omit it from $\Sigma$). We choose the origin of $r$ so that $r^0 = 0$. Note that, if $\rho \not\equiv 0$ and $\Sigma$ ever vanishes, so that the twist of the congruence vanishes, then $\rho$ is real and in this case $\rho$ will diverge at a finite $r$. This is incompatible with smoothness of the congruence, unless $\rho$ identically vanishes, leading to:

Proposition 3.1. Under the hypotheses of Theorem 1.1, the divergence $\rho$ and the twist $\Sigma$ are nowhere vanishing on those integral curves of $\ell$ which have end points on $S^+$.

Proof. Since $\tilde\ell$ is smooth, transverse, and geodesic, we must have $\tilde\ell = \chi\,\Omega^{-2}\ell$ for some smooth nowhere vanishing function $\chi$. We might therefore without loss of generality assume, rescaling $\tilde\ell$ if necessary, that $\chi = 1$. From Equations (B.11) and (B.14) of Appendix B we obtain
$$ \rho = \Omega^2\tilde\rho + \Omega^{-1} D\Omega = -\frac{1}{r} + O(r^{-2}), $$
where $\tilde\rho$ is associated with $\tilde\ell$ just as $\rho$ is associated with $\ell$ (see Appendix B for the details of this). We conclude that $\rho$ has no zeros near $S^+$. Hence $\rho$ is not identically zero on any of the relevant members of the congruence, and since $\ell$ is smooth by hypothesis, (3.4) excludes zeros of $\Sigma$. □
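The integration behind (3.4), and the reason a vanishing twist forces a blow-up, can be spelled out in a few lines. This is a worked sketch, assuming $D = \partial_r$ as in the text and writing $c$ for the complex constant of integration (a label introduced here only for illustration, with $c = r^0 + i\Sigma$ in the notation of (3.4)):

```latex
\begin{align*}
D\rho = \rho^2,\ \rho \not\equiv 0
  &\;\Longrightarrow\;
  \partial_r\!\left(-\rho^{-1}\right) = \rho^{-2}\,\partial_r\rho = 1
  \;\Longrightarrow\;
  -\rho^{-1} = r + c , \\[2pt]
c = r^0 + i\Sigma,\quad r^0 = 0
  &\;\Longrightarrow\;
  \rho = -\frac{1}{r + i\Sigma}
       = -\frac{r - i\Sigma}{r^2 + \Sigma^2},
  \qquad
  |\rho|^2 = \frac{1}{r^2 + \Sigma^2} , \\[2pt]
\Sigma \neq 0
  &\;\Longrightarrow\;
  0 < |\rho| \le |\Sigma|^{-1} \ \text{for all real } r
  \quad \text{(neither a zero nor a blow-up at finite } r\text{)} , \\[2pt]
\Sigma = 0
  &\;\Longrightarrow\;
  \rho = -\frac{1}{r}, \qquad |\rho| \to \infty \ \text{as } r \to 0 .
\end{align*}
```

The imaginary part $\Sigma/(r^2+\Sigma^2)$ of $\rho$ carries the twist, so the last line is the statement that a twist-free, non-trivially diverging member of the congruence must focus at a finite affine distance.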
Following Mason [14], who in turn follows Debney et al. [9], we next consider the complex vector field
$$ W_a = o_B \nabla_a o^B. $$
By the geodesic, shear-free condition this is of the form $o_A \tau_{A'}$ for some spinor field $\tau_{A'}$ with $\bar o^{A'}\tau_{A'} = \rho$. By smoothness of the congruence, $\rho$ is smooth in the interior, and it does not vanish by Proposition 3.1. Therefore we can define the spinor field $\iota^A$, which makes up the NP dyad with $o^A$, via its complex conjugate by $\bar\iota_{A'} = -\rho^{-1}\tau_{A'}$. Then
$$ W_a = -\rho\, o_A \bar\iota_{A'} = -\rho\, m_a, \tag{3.5} $$
and the spin-coefficient $\tau$ is also zero. With the spin-coefficient equations numbered as in [16], from (A.3c) and (A.3p) with what we have now and the vacuum equations we find that $\pi$ and $\lambda$ vanish. With the aid of (A.3a), we calculate the exterior derivative of $W_a$ from (3.5) as
$$ \nabla_{[a} W_{b]} = X_1\, \ell_{[a} m_{b]} + X_2\, m_{[a} \bar m_{b]}, \tag{3.6} $$
in terms of two functions $X_1$ and $X_2$ whose precise form does not concern us. Thus, in the language of differential forms, $W \wedge dW = 0$. We note the following, presumably well known, complex version of the Frobenius theorem:

Lemma 3.2. There exist, locally, complex-valued functions $X_3$ and $X_4$ such that $W = X_3\, dX_4$.

Proof. Note that
$$ W \wedge \bar W = \rho\bar\rho\, m \wedge \bar m \neq 0 \tag{3.7} $$
by Proposition 3.1, which shows that the real and the imaginary part of $W$ are nowhere vanishing and linearly independent. Elementary algebra gives $dW = W \wedge Z$ for some complex-valued one-form $Z$. The usual calculation shows that the two-dimensional distribution defined by the collection of vector fields
$$ \{ X \in T\mathscr{M} : W(X) = 0 \} $$
is integrable. Hence there exist, locally, complex functions $\alpha$ and $\beta$, as well as real valued functions $f$ and $g$ such that $W = \alpha\, df + \beta\, dg$. Equation (3.7) shows that $\alpha$ and $\beta$ are nowhere-vanishing, and that $df$ and $dg$ are linearly independent. The equation $W \wedge dW = 0$ implies $\alpha/\beta = \varphi + i\psi$ for some functions $\varphi = \varphi(f, g)$ and $\psi = \psi(f, g)$, hence $W = \beta(\varphi\, df + dg + i\psi\, dg)$. Consider the two-dimensional Riemannian metric
$$ b := (\varphi\, df + dg)^2 + \psi^2\, dg^2. $$
By the uniformization theorem there exist, again locally, smooth functions $x$, $y$ and $h$ such that
$$ b = e^{2h}\big( (dx)^2 + (dy)^2 \big). $$
Changing $y$ to $-y$ if necessary, at each point the $b$-ON coframes $\{\varphi\, df + dg,\ \psi\, dg\}$ and $\{e^h dx,\ e^h dy\}$ are rotated with respect to each other, hence there exists a function $\theta = \theta(x, y)$ such that
$$ (\varphi\, df + dg + i\psi\, dg) = e^{h + i\theta}(dx + i\, dy). $$
The functions $X_3 = \beta e^{h+i\theta}$ and $X_4 = x + iy$ satisfy our claim. □

Returning to the problem at hand, either (3.7), or the argument in the proof of Lemma 3.2, shows that the real and imaginary parts of $X_4$ are independent functions. By (3.5) and (3.6), both $dX_3$ and $dX_4$ are orthogonal to $\ell$, so that both are functions only of $(u, \zeta, \bar\zeta)$, and we can determine them by looking at the value of $W$ at $\mathscr{I}^+$. We have
$$ W_a\, dx^a = o_B \nabla_a o^B\, dx^a = \Omega\,\tilde o_B \big( \tilde\nabla_{AA'} \tilde o^B + \Upsilon^B{}_{A'}\, \tilde o_A \big)\, dx^a, $$
where $\Upsilon_a = \partial_a \log\Omega$, we use the rules for conformal transformation of the spinor connection given in [20] and we use the rescaled dyad of (B.12). We calculate this from (3.3) and pull back to $\mathscr{I}^+$ to find that, at $\mathscr{I}^+$,
$$ W_a\, dx^a = \tilde M_a\, dx^a = -\frac{\sqrt 2}{P}\, d\bar\zeta. \tag{3.8} $$
However, from what we have said above about $X_3$ and $X_4$, (3.8) holds everywhere, so that, by (3.5), in the interior
$$ m_a\, dx^a = \frac{\sqrt 2}{\rho P}\, d\bar\zeta. \tag{3.9} $$
It follows at once from this that, in terms of the NP operators $\Delta$ and $\delta$, $\Delta\zeta = 0 = \delta\bar\zeta$, while $\delta\zeta = -\bar\rho P/\sqrt 2$.

We need covariant and contravariant expressions for the rest of the NP tetrad. We have
$$ \Delta = (\Delta u)\,\partial_u + (\Delta r)\,\partial_r, \qquad \delta = (\delta u)\,\partial_u + (\delta r)\,\partial_r - \frac{\bar\rho P}{\sqrt 2}\,\partial_\zeta. $$
From the commutator $[\Delta, D]$ (given in [16])
$$ D\Delta u = 0 $$
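so that $\Delta u = X_6(u, \zeta, \bar\zeta)$ for some function $X_6$, and analogously, from the commutator $[\delta, D]$ (using (3.4))
$$ \delta u = \bar\rho\, X_7 $$
in terms of another function $X_7$ of $(u, \zeta, \bar\zeta)$. Since $\ell_a$ is null, we have
$$ \ell_a\, dx^a = A\, du + B\, d\zeta + \bar B\, d\bar\zeta $$
for some real $A$ and complex $B$ (not to be confused with $B$ appearing in (3.2)), and then normalization against $m^a$ and $n^a$ forces
$$ A X_6 = 1, \qquad A\bar\rho X_7 - B\,\frac{\bar\rho P}{\sqrt 2} = 0, $$
so that $A$ and $B$ are independent of $r$ and can be found from $\tilde\ell$ at $\mathscr{I}^+$. There we have (3.3) so that, on $\mathscr{I}^+$,
$$ \ell_a\, dx^a = \big( \tilde L_a + \bar L\,\tilde M_a + L\,\bar{\tilde M}_a \big)\, dx^a = du - \frac{\sqrt 2\, L}{P}\, d\zeta - \frac{\sqrt 2\, \bar L}{P}\, d\bar\zeta. \tag{3.10} $$
Now we argue as for $m_a\, dx^a$: from what we have deduced already for $\ell_a\, dx^a$, we know that (3.10) holds in the interior. This gives the NP tetrad, as vector fields, in the form
$$ D = \partial_r, \tag{3.11} $$
$$ \Delta = \partial_u + H\,\partial_r, \tag{3.12} $$
$$ \delta = -\frac{\bar\rho P}{\sqrt 2}\Big( \partial_\zeta + \frac{\sqrt 2\, L}{P}\,\partial_u - Q\,\partial_r \Big), \tag{3.13} $$
where $H$ and $Q$ are still to be determined, and, as one-forms, in the form
$$ \ell_a\, dx^a = du - \frac{\sqrt 2\, L}{P}\, d\zeta - \frac{\sqrt 2\, \bar L}{P}\, d\bar\zeta, \tag{3.14} $$
$$ n_a\, dx^a = dr + Q\, d\zeta + \bar Q\, d\bar\zeta - H\,\ell_a\, dx^a, \tag{3.15} $$
$$ m_a\, dx^a = \frac{\sqrt 2}{\rho P}\, d\bar\zeta. \tag{3.16} $$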
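As a consistency check on (3.11)-(3.16), the tetrad normalization can be verified directly by contracting the vector fields with the one-forms. The block below is a sketch which assumes the following reading of the complex conjugations (an assumption made here, since overbars are easily lost in transcription): $m_a\,dx^a \propto d\bar\zeta$ with coefficient $\sqrt2/(\rho P)$, the $\partial_\zeta$-component of $\delta$ equal to $-\bar\rho P/\sqrt2$, and the signature $(+,-,-,-)$ of the paper.

```latex
% Components read off from (3.11)-(3.16):
%   \ell^a\partial_a = \partial_r,   n^a\partial_a = \partial_u + H\partial_r,
%   m^u = -\bar\rho L,   m^r = \bar\rho P Q/\sqrt2,   m^\zeta = -\bar\rho P/\sqrt2,
%   \ell_u = 1,  \ell_\zeta = -\sqrt2 L/P,
%   n_u = -H,  n_r = 1,  n_\zeta = Q + \sqrt2 H L/P,
%   m_{\bar\zeta} = \sqrt2/(\rho P).
\begin{align*}
\ell_a \ell^a &= \ell_r = 0, &
\ell_a n^a &= \ell_u = 1, \\
n_a n^a &= n_u + H\, n_r = -H + H = 0, &
n_a \ell^a &= n_r = 1, \\
\ell_a m^a &= \ell_u m^u + \ell_\zeta m^\zeta
            = -\bar\rho L + \bar\rho L = 0, &
m_a \ell^a &= m_a n^a = 0, \\
n_a m^a &= -\frac{\bar\rho P}{\sqrt2}
  \Big[ -\frac{\sqrt2 H L}{P} - Q + Q + \frac{\sqrt2 H L}{P} \Big] = 0, &
m_a m^a &= m_{\bar\zeta}\, m^{\bar\zeta} = 0, \\
m_a \bar m^a &= \frac{\sqrt2}{\rho P}\cdot\Big(-\frac{\rho P}{\sqrt2}\Big) = -1 .
\end{align*}
% Hence g_{ab} = 2\ell_{(a} n_{b)} - 2 m_{(a}\bar m_{b)}, i.e.
% g = 2\,(\ell_a dx^a)(n_b dx^b) - \frac{4}{\rho\bar\rho P^2}\, d\zeta\, d\bar\zeta .
```

With $\rho\bar\rho = (r^2+\Sigma^2)^{-1}$ from (3.4), the last line makes the $r$-dependence of the angular part of the metric explicit.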
Once we have the radial dependence of $Q$ and $H$, we have the radial dependence of the metric. From the $[\Delta, D]$ commutator we find
$$ DH = -(\gamma + \bar\gamma), \tag{3.17} $$
while from (A.3f) and (A.4c),
$$ D\gamma = \psi_2, \qquad D\psi_2 = 3\rho\,\psi_2, \tag{3.18} $$
so that, by (3.4),
$$ \psi_2 = \rho^3 \psi_2^0, \tag{3.19} $$
$$ \gamma = \gamma^0 + \frac{1}{2}\rho^2 \psi_2^0, \tag{3.20} $$
where $\psi_2^0$ and $\gamma^0$ are independent of $r$. Therefore
$$ H = H^0 - (\gamma^0 + \bar\gamma^0)\, r - \frac{1}{2}\rho\,\psi_2^0 - \frac{1}{2}\bar\rho\,\bar\psi_2^0, \tag{3.21} $$
where $H^0$ is independent of $r$; so (3.21) gives the radial dependence of $H$. For $Q$, the commutator $[\delta, D]$ gives
$$ -\frac{\bar\rho P}{\sqrt 2}\, DQ = \bar\alpha + \beta, $$
while (A.3d) and (A.3e) can be integrated to give
$$ \alpha = -\alpha^0 \rho\ ; \qquad \beta = -\beta^0 \bar\rho, $$
with $\alpha^0$ and $\beta^0$ independent of $r$. Therefore
$$ Q = Q^0 + \frac{\sqrt 2}{P}\,(\bar\alpha^0 + \beta^0)\, r, \tag{3.22} $$
with $Q^0$ independent of $r$.

For later use, we find the radial dependence of the remaining spin coefficients, $\mu$ and $\nu$. For $\mu$, we integrate (A.3h) to find
$$ \mu = \mu^0 \bar\rho + \frac{1}{2}\rho(\rho + \bar\rho)\,\psi_2^0, \tag{3.23} $$
where $\mu^0$ is independent of $r$. For $\nu$, first from (A.4e), assuming $\Phi_{12} = 0$,
$$ \psi_3 = \psi_3^0 \rho^2 + \psi_3^1 \rho^3 + \psi_3^2 \rho^4, $$
where the $\psi_3^i$ are independent of $r$, and then from (A.3i),
$$ \nu = \nu^0 + \psi_3^0 \rho + \frac{1}{2}\psi_3^1 \rho^2 + \frac{1}{3}\psi_3^2 \rho^3, \tag{3.24} $$
where $\nu^0$ is independent of $r$.

We now note the changes in the non-vacuum case. As in the Goldberg-Sachs theorem, we continue to insist on $\Phi_{00} = \Phi_{01} = \Phi_{02} = 0 = \Lambda$, but allow the possibility of non-zero $\Phi_{11}$, $\Phi_{12}$, and $\Phi_{22}$ (in fact, $\Phi_{22}$ doesn't arise in the calculation). This changes some of the details above. We still have $\kappa = \sigma = \epsilon = \tau = \pi = \lambda = 0$, and (3.4), but (3.17)-(3.20) change. For the radial dependence of $\Phi_{11}$ we have Eq. (A.4i):
$$ D\Phi_{11} = 2(\rho + \bar\rho)\,\Phi_{11}, $$
which integrates readily. With this, (A.4c) can be integrated for $\psi_2$, then (A.3f) for $\gamma$ and then $H$ obtained from the commutator $[\Delta, D]$. In place of (3.19)-(3.20) and (3.21), we find
$$ \Phi_{11} = (\rho\bar\rho)^2\, \Phi_{11}^0, \tag{3.25} $$
$$ \psi_2 = \rho^3 \psi_2^0 + 2\rho^3 \bar\rho\, \Phi_{11}^0, \tag{3.26} $$
$$ \gamma = \gamma^0 + \frac{1}{2}\rho^2 \psi_2^0 + \rho^2 \bar\rho\, \Phi_{11}^0, \tag{3.27} $$
$$ H = H^0 - (\gamma^0 + \bar\gamma^0)\, r - \frac{1}{2}\rho\,\psi_2^0 - \frac{1}{2}\bar\rho\,\bar\psi_2^0 - \rho\bar\rho\, \Phi_{11}^0, \tag{3.28} $$
where, as usual, quantities with a superscript zero are independent of $r$, and $\Phi_{11}^0$ is real. This is enough for the metric. For the spin-coefficient $\mu$, (A.3h) now gives
$$ \mu = \mu^0 \bar\rho + \frac{1}{2}\rho(\rho + \bar\rho)\,\psi_2^0 + \rho^2 \bar\rho\, \Phi_{11}^0. \tag{3.29} $$
The remaining spin-coefficient $\nu$ is altogether more complicated. We need to solve (A.4j) for $\Phi_{12}$, then (A.4e) for $\psi_3$ and then (A.3i) for $\nu$. The results are polynomials in $\rho$ and $\bar\rho$, with coefficients constant along $\ell$. We don't need the detailed expressions for these quantities, which can be found in [26]. For our purposes, the following suffices:
$$ \Phi_{21} = O(|\rho|^3), \qquad \psi_3 = O(|\rho|^2), \qquad \nu = \nu^0 + O(|\rho|). \tag{3.30} $$
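The radial integrations leading to (3.19)-(3.21) and (3.25)-(3.28) all rest on the single identity $D\rho = \rho^2$ (and its conjugate), so they can be checked by differentiation. The block below is a verification sketch; the source terms it uses for $\psi_2$ and $\gamma$ in the non-vacuum case, $D\psi_2 = 3\rho\psi_2 + 2\rho\Phi_{11}$ and $D\gamma = \psi_2 + \Phi_{11}$, are the standard NP forms under the stated vanishing of the other spin coefficients and are an assumption here, since the text does not write them out.

```latex
% Basic rule:  D(\rho^j \bar\rho^k) = (j\rho + k\bar\rho)\,\rho^j\bar\rho^k .
\begin{align*}
\text{(3.19):}\quad & D(\rho^3) = 3\rho\cdot\rho^3
   && \Rightarrow\ D\psi_2 = 3\rho\,\psi_2 , \\
\text{(3.20):}\quad & D\big(\tfrac12\rho^2\big) = \rho^3
   && \Rightarrow\ D\gamma = \rho^3\psi_2^0 = \psi_2 , \\
\text{(3.21):}\quad & D\rho = \rho^2,\ D\bar\rho = \bar\rho^2
   && \Rightarrow\ DH = -(\gamma^0+\bar\gamma^0)
      - \tfrac12\rho^2\psi_2^0 - \tfrac12\bar\rho^2\bar\psi_2^0
      = -(\gamma+\bar\gamma) , \\
\text{(3.25):}\quad & D(\rho\bar\rho)^2 = 2(\rho+\bar\rho)(\rho\bar\rho)^2
   && \Rightarrow\ D\Phi_{11} = 2(\rho+\bar\rho)\Phi_{11} , \\
\text{(3.26):}\quad & D(2\rho^3\bar\rho) = 3\rho\,(2\rho^3\bar\rho) + 2\rho\,(\rho\bar\rho)^2
   && \Rightarrow\ D\psi_2 = 3\rho\,\psi_2 + 2\rho\,\Phi_{11} , \\
\text{(3.27):}\quad & D(\rho^2\bar\rho) = 2\rho^3\bar\rho + (\rho\bar\rho)^2
   && \Rightarrow\ D\gamma = \psi_2 + \Phi_{11} , \\
\text{(3.28):}\quad & D(\rho\bar\rho) = \rho^2\bar\rho + \rho\bar\rho^2
   && \Rightarrow\ DH = -(\gamma+\bar\gamma)
      \ \text{ with } \gamma \text{ as in (3.27)} .
\end{align*}
```

Each line differentiates the claimed solution and recovers the corresponding propagation equation, with the $r$-independent coefficients ($\psi_2^0$, $\gamma^0$, $H^0$, $\Phi_{11}^0$) playing the role of integration constants.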
We are ready to prove now:

Proposition 3.3. Let $N = \{ p \in \mathscr{M} : \text{the null geodesic through } p \text{ tangent to } \ell(p) \text{ has an end point on } S^+ \}$. There exist coordinates $(u, r, \zeta)$ parameterizing a neighborhood of $N$, such that $(u, \zeta)$ coincide with Bondi coordinates on $S^+$, in which the metric takes the form
$$ 2\ell_{(a} n_{b)} - 2 m_{(a} \bar m_{b)}, $$
with $\ell$, $n$ and $m$ given by (3.9)-(3.16), with $\rho$ given by (3.4), $H$ given by (3.21), while $Q$ is given by (3.22), where $\alpha^0$, $\beta^0$, $\psi_2^0$, $H^0$, $Q^0$, $L$ and $\Sigma$ are smooth functions of $u$ and $\zeta$.

Proof. Transversality and smoothness of $\tilde\ell$ at $S^+$ imply that there exists a neighborhood of $S^+$ on which $\tilde\ell$ is transverse to $\mathscr{I}^+$, and the result follows from smoothness of $\ell$ together with the calculations above. □

Now we have the $r$-dependence of the metric. By construction, the coordinates $u$ and $\zeta$ are good coordinates on $\mathscr{I}^+$ near $S^+$, while $r \to \infty$. Rescaling the metric by $r^{-2}$ and setting $R = r^{-1}$, we obtain for the asymptotic behavior
$$ \tilde g = R^2 g = 2\Big( du - \sqrt 2\,\frac{L}{P}\, d\zeta - \sqrt 2\,\frac{\bar L}{P}\, d\bar\zeta \Big)\big( -dR + O(R) \big) - 4\,\frac{d\zeta\, d\bar\zeta}{P^2}\big( 1 + O(R^2) \big) \tag{3.31} $$
(recall that we shifted $r$ to obtain $r^0 = 0$), which shows explicitly that the space-time is weakly asymptotically simple with this choice of rescaling.

Let $S^-$ be obtained by flowing $S^+$ from $\mathscr{I}^+$ to $\mathscr{I}^-$ along $\tilde\ell$. We have:
Lemma 3.4. $S^-$ is a smooth acausal cross-section of $\mathscr{I}^-$, with both $S^+$ and $S^-$ diffeomorphic to $S^2$.

Proof. In the construction leading to (3.31) we consider instead $r \to -\infty$, taking $R = -r^{-1}$ to obtain a conformal completion $\widetilde U_i$ at past infinity of the coordinate patch, say $U_i$, constructed above. Consider the map $\psi$ which to $p \in S^+$ assigns the generator of $\mathscr{I}^- = \mathbb{R} \times S^2$ which is met by the maximally extended null geodesic tangent to $\tilde\ell$ and passing through $p$. Applying [5, Theorem 3.1] to $\widetilde U_i$ we conclude that there exists a smooth local diffeomorphism from $\widetilde U_i$ to $\widetilde{\mathscr{M}}$, so that $S^-$ is a smooth immersed submanifold of $\mathscr{I}^-$; note, however, that $S^-$ might fail to be embedded because some points of $\mathscr{I}^-$ could be met by more than one integral curve of $\tilde\ell$ emanating from $S^+$. In any case, we infer that $\psi$ is a local diffeomorphism. By [12, Exercise 11-9, p. 253] $\psi$ is a covering map, and since $S^2$ is simply connected it follows that $\psi$ is a diffeomorphism, so $S^+ \approx S^2$, and $S^-$ intersects every generator precisely once. As the only causal curves within $\mathscr{I}^-$ are the generators of $\mathscr{I}^-$, the result follows. □

As observed by Mason [14], one has

Proposition 3.5. The Trautman-Bondi mass $m_{TB}(S^+)$ of $S^+$ equals the negative of the Trautman-Bondi mass $m_{TB}(S^-)$ of $S^-$.

This will follow if the mass aspect has the same property. Mason suggests two proofs for this proposition: either via a direct check on the mass aspect or by exploiting an alternative formula for the Trautman-Bondi mass given in [20]. We shall present the first, exploiting a formula in [15] for the mass aspect. First we note that the correspondence $(u, r, \xi)_{NT} = (u, r, \zeta\sqrt 2)_{CT}$ relates our coordinates (subscript CT) to the ones used in [15] (subscript NT). Then the quantities $(L, H, W, m + iM, \Sigma, \hat P)_{NT}$ arising in their metric are for us $\big( -\tfrac{L}{P}, -H, \tfrac{Q}{\sqrt 2}, \psi_2^0, \Sigma, 1 \big)_{CT}$, and finally their operator $\partial$ translates for us as
$$ \partial = \frac{1}{\sqrt 2}\Big( \frac{\partial}{\partial\zeta} + \frac{L\sqrt 2}{P}\,\frac{\partial}{\partial u} \Big). \tag{3.32} $$
With these preliminaries, the Natorf-Tafel formula (47) of [15] for the integrand of the Trautman-Bondi mass (which we will refer to as the mass aspect; but note that this is not the original mass aspect function of [3,21]) translates for us to
$$ M = \frac{1}{2}\big( \psi_2^0 + \bar\psi_2^0 \big) + 3\Sigma\Sigma_{,u} + \frac{1}{2}(\tilde\Delta + 2)\eta - 2iP\big( L_{,u}\bar\partial\Sigma - \bar L_{,u}\partial\Sigma \big), \tag{3.33} $$
where $\tilde\Delta = P^2(\partial\bar\partial + \bar\partial\partial)$ and
$$ \eta = -\frac{1}{2} P^2 \Big( \bar\partial\,\frac{L}{P} + \partial\,\frac{\bar L}{P} \Big). $$

To investigate the mass aspect at $\mathscr{I}^-$, we define a time-reversed space-time for which the old $\mathscr{I}^-$ is now $\mathscr{I}^+$. With hatted quantities referring to the time-reversed space-time, we take coordinates $(\hat u, \hat r, \hat\zeta) = (-u, -r, \bar\zeta)$. (The redefinition $r \to -r$ should be clear in our context; the need to replace $\zeta$ by $\bar\zeta$ arises then from elementary orientation considerations; the transition $u \to -u$ arises from the fact that we will be using on
$\mathscr{I}^-$ a formula for the Trautman-Bondi mass which has been worked out at $\mathscr{I}^+$, and this requires a change of time-orientation). Then all our calculations so far can be repeated in the tetrad $(\hat\ell, \hat n, \hat m) = (-\ell, -n, \bar m)$. In particular $\hat\rho$ equals $-\bar\rho$, and tracing through the quantities in the mass aspect, we find $(\hat\Sigma, \hat L, \hat\eta, \hat\partial, \hat\psi_2^0) = (\Sigma, -\bar L, -\eta, \bar\partial, -\bar\psi_2^0)$. Using these in (3.33), we see that, as desired, $M$ changes sign. Now all quantities appearing in (3.33) are constant along $\ell$, and we conclude that the mass aspect at $S^-$ is the negative of the mass aspect at $S^+$. □
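The hatted quantities entering the sign change can be traced explicitly. The block below is a sketch of that bookkeeping, assuming the reading $(\hat\ell, \hat n, \hat m) = (-\ell, -n, \bar m)$ and $(\hat u, \hat r, \hat\zeta) = (-u, -r, \bar\zeta)$, together with the definition $\rho = \bar m^\mu m^\nu \nabla_\mu \ell_\nu$ from the Introduction, the one-form (3.14), the operator (3.32), and $\psi_2 \propto C_{\mu\nu\rho\sigma}\ell^\mu m^\nu \bar m^\rho n^\sigma$ (standard, up to a convention-dependent sign).

```latex
\begin{align*}
\hat\rho &= \bar{\hat m}{}^\mu \hat m^\nu \nabla_\mu \hat\ell_\nu
          = m^\mu \bar m^\nu \nabla_\mu(-\ell_\nu) = -\bar\rho
          = \frac{1}{r - i\Sigma}
          = -\frac{1}{\hat r + i\Sigma}
  &&\Longrightarrow\ \hat\Sigma = \Sigma , \\[2pt]
\hat\ell_a\,dx^a &= -\,du + \frac{\sqrt2 L}{P}\,d\zeta + \frac{\sqrt2 \bar L}{P}\,d\bar\zeta
          = d\hat u - \frac{\sqrt2(-\bar L)}{P}\,d\hat\zeta
                    - \frac{\sqrt2(-L)}{P}\,d\bar{\hat\zeta}
  &&\Longrightarrow\ \hat L = -\bar L , \\[2pt]
\hat\partial &= \frac{1}{\sqrt2}\Big(\frac{\partial}{\partial\hat\zeta}
          + \frac{\hat L\sqrt2}{P}\frac{\partial}{\partial\hat u}\Big)
          = \frac{1}{\sqrt2}\Big(\frac{\partial}{\partial\bar\zeta}
          + \frac{\bar L\sqrt2}{P}\frac{\partial}{\partial u}\Big)
  &&\Longrightarrow\ \hat\partial = \bar\partial , \\[2pt]
\hat\psi_2 &= \bar\psi_2 = \bar\rho^{\,3}\,\bar\psi_2^0
          = (-\hat\rho)^3\,\bar\psi_2^0
  &&\Longrightarrow\ \hat\psi_2^0 = -\bar\psi_2^0 .
\end{align*}
% Consequences used in the text:
%   \hat\psi_2^0 + \bar{\hat\psi}{}_2^0 = -(\psi_2^0 + \bar\psi_2^0),
%   \hat\Sigma\,\hat\Sigma_{,\hat u} = -\Sigma\Sigma_{,u}
%   (because \partial_{\hat u} = -\partial_u).
```

Both displayed consequences flip sign, in line with the statement that the mass aspect as a whole changes sign; the remaining terms of (3.33) follow the same pattern once $\hat L = -\bar L$ and $\hat\partial = \bar\partial$ are inserted.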
4. Spacelike Hypersurfaces, a Rigid Positive Energy Theorem

Choose a cut $S^+$ of $\mathscr I^+$ and consider the associated null boundary
$$N := \dot J^-(S^+, \mathscr M), \tag{4.1}$$
then $N$ is an achronal hypersurface generated by null geodesics orthogonal to $S^+$. Further there exists a neighborhood $O$ of $\mathscr I^+$ such that $N \cap O$ is smooth. If we assume $(\mathscr M, {}^4g)$ to be globally hyperbolic, there exists a time-function $\tau$ on $\mathscr M$ with the property that its level sets,
$$\mathring S_{\tau_0} := \{\tau = \tau_0\},$$
are smooth spacelike Cauchy surfaces (compare [2]). Define $N_\tau := \dot J^-(S^+) \cap \mathring S_\tau$; note that the intersection is transverse. Since $\mathscr M = \cup_\tau \mathring S_\tau$, we have that $\cup_\tau N_\tau = N$. This, together with Dini's theorem, shows that there exists $\tau_0$ such that $N_{\tau_0} \subset O$, thus $N_{\tau_0}$ is a smooth sphere.

For $\epsilon > 0$ let ${}^4\tilde g_\epsilon$ be a family of smooth Lorentzian metrics on $\mathscr M$ such that ${}^4\tilde g_\epsilon$ converges to ${}^4\tilde g$ on compact subsets of $\mathscr M$ as $\epsilon$ goes to zero in the $C^\infty$ topology, with the property that all vectors which are null for ${}^4\tilde g_\epsilon$ are spacelike for ${}^4\tilde g$. By continuous dependence of geodesics upon the metric, for $\epsilon > 0$ small enough all null ${}^4\tilde g_\epsilon$-geodesics normal to $\mathscr I^+$ intersect $\mathring S_{\tau_0}$ in a smooth sphere $\mathring N_\epsilon$, with the corresponding hypersurface $N_\epsilon$, defined as in (4.1) using the metric ${}^4\tilde g_\epsilon$, being smooth in its portion which is bounded by $S^+$ and by $\mathring N_\epsilon$; call this region $\mathscr S_{\rm ext}$.

The Cauchy surface $\mathring S_{\tau_0}$ is contractible by one of the hypotheses of Theorem 1.1, or by [17, Sect. 5] if full asymptotic simplicity is assumed. Simple connectedness of $\mathring S_{\tau_0}$ and elementary intersection theory show that $\mathring N_\epsilon$ separates $\mathring S_{\tau_0}$ into two components. From the Hurewicz isomorphism theorem [22, Chap. 7, Sect. 5] we further conclude that $H_2(\mathring S_{\tau_0})$ is trivial, which implies that one of the components separated by the sphere $\mathring N_\epsilon$, say $K$, is compact. Set $\mathscr S = K \cup \mathscr S_{\rm ext}$, then $\mathscr S$ is a piecewise differentiable ${}^4g$-spacelike hypersurface which is the union of a compact set and of an asymptotic region extending to $\mathscr I^+$. Smoothing out the corner at $\mathring N_\epsilon$ one obtains a smooth hypersurface, still denoted by $\mathscr S$. Next, the formulae of [6, App. C.3] show how to make a small deformation of $\mathscr S$ near $\mathscr I^+$ to obtain a hypersurface on which the induced metric asymptotes to a hyperbolic one, as needed for the proof of positivity of mass of [7]. Finally, we let $\hat{\mathscr S}$ be the universal cover of $\mathscr S$, then $\hat{\mathscr S}$ is complete, with one or more asymptotically hyperbolic ends.$^{10}$

10 Conceivably one can infer at this stage, from the results in [17], that $\mathscr S$ is simply connected; however, the argument that follows sidesteps this issue.
By the positive energy theorem of [7] applied to a chosen asymptotic end of the universal cover $\hat{\mathscr S}$ of $\mathscr S$ we conclude that the Trautman-Bondi mass associated with this end is non-negative. An identical construction starting from $S^-$ shows that $m_{TB}(S^-) \ge 0$. Since both masses are non-negative and, by Proposition 3.5, are negatives of each other, we infer that $m_{TB}(S^+) = 0$.

We continue with an investigation of the consequences of the Witten-type proof of the positive energy theorem on $\hat{\mathscr S}$. Let $\mathring\psi$ be a Dirac spinor which is parallel with respect to the spin-connection associated with the Minkowski metric, such that the resulting Killing vector in Minkowski space-time $\mathbb R^{1,3}$ equals $\partial_t$. Now, when $m = 0$, the proof of the positive energy theorem in [7] shows that the space-time metric ${}^4g$ is flat along $\hat{\mathscr S}$, and that there exists a spinor $\psi$, solution of the Witten equation, such that $\psi = \mathring\psi + \chi$, with $\chi$ in a weighted Sobolev space obtained by completing $C_0^\infty$ with respect to the norm defined by
$$\int |D\chi|^2 \, d\mu_g .$$
Furthermore, $\psi$ is parallel with respect to the space-time connection $\nabla$ associated to the initial data set $(\hat{\mathscr S}, g, K)$,
$$\nabla_i\psi := D_i\psi + \frac{1}{2}K_{ij}\gamma^j\gamma^0\psi = 0. \tag{4.2}$$
Let $(V, Y)$ be the KID defined by $\psi$,
$$V := \langle\psi, \psi\rangle, \qquad Y := \langle\psi, \gamma^0\gamma^j\psi\rangle\, e_j .$$
Equation (4.2) implies that $(V, Y)$ is parallel, in the following sense:
$$D_iV = K_{ij}Y^j, \qquad D_iY_j = VK_{ij}. \tag{4.3}$$
It follows that the Lorentzian norm squared $V^2 - |Y|_g^2$ of $(V, Y)$ is constant on $\hat{\mathscr S}$,
$$D_i\left(V^2 - |Y|_g^2\right) = 0. \tag{4.4}$$
It should follow from the methods in [1] that this norm is strictly positive by choice of $\mathring\psi$, so that the associated Killing vector is timelike; however, an argument which avoids the heavy machinery of the last reference proceeds as follows: If $V^2 - |Y|_g^2 = 0$, we choose a different asymptotic value of $\mathring\psi$. If the new resulting Killing vector is timelike we are done, otherwise there is a linear combination of the new Killing vector and of the old one which is timelike, and satisfies (4.3), leading to a timelike Killing vector for which (4.4) holds as well.

We consider the Killing development $(\mathring{\mathscr M}, \mathring g)$ defined by $(\hat{\mathscr S}, V, Y)$, thus $\mathring{\mathscr M}$ is $\mathbb R_t \times \hat{\mathscr S}$ with metric
$$\mathring g = V^2dt^2 - g_{ij}(dx^i + Y^idt)(dx^j + Y^jdt),$$
with Killing vector $X = \partial_t$. Letting
$$\exp(\mu) := {}^4g(X, X) = V^2 - |Y|_g^2, \tag{4.5}$$
we rewrite the Killing development metric $\mathring g$ in the following form:
$$\mathring g = \exp(\mu)(dt + \theta_idx^i)^2 - h, \tag{4.6}$$
as needed in Lemma 3.11 of [8], where the Riemannian metric $h$ is related to the initial data metric $g$ by the equation
$$h_{ij} = g_{ij} + \exp(\mu)\theta_i\theta_j, \tag{4.7}$$
and
$$\theta_i = -e^{-\mu}g_{ij}Y^j. \tag{4.8}$$
To apply that last lemma, we need to verify that the metric $h$ in (4.6) is complete, and that the $h$-length of $\theta$ is uniformly bounded on the universal cover $\hat{\mathscr S}$. Now, the hyperbolic asymptotics of $g$, together with compactness of $\mathscr S \cup \dot{\mathscr S}$ and the Hopf-Rinow theorem, imply completeness of $(\hat{\mathscr S}, g)$. Since the last term in (4.7) gives a non-negative contribution on any given vector, completeness of $(\hat{\mathscr S}, h)$ follows from that of $(\hat{\mathscr S}, g)$.

Next, it follows from (4.4) that $\exp(\mu)$ is constant over $\hat{\mathscr S}$. Further,
$$h^{ij} = g^{ij} - \frac{\exp(-\mu)}{1 + \exp(-\mu)|Y|_g^2}\,Y^iY^j,$$
so that
$$|\theta|_h^2 = h^{ij}\theta_i\theta_j = \frac{\exp(-2\mu)|Y|_g^2}{1 + \exp(-\mu)|Y|_g^2} \le \exp(-\mu) =: C,$$
which establishes the desired uniform bound on $|\theta|_h$.

From [8, Lemma 3.11] we conclude that the Killing development $(\mathring{\mathscr M}, \mathring g)$ of $(\hat{\mathscr S}, V, Y)$ is geodesically complete. Since $\hat{\mathscr S}$ is simply connected, so is $\mathring{\mathscr M} \approx \mathbb R \times \hat{\mathscr S}$. The Lorentzian version of the Hadamard-Cartan theorem [19, Prop. 23, p. 227] implies that $(\mathring{\mathscr M}, \mathring g)$ is $\mathbb R^{1,3}$. In particular $\hat{\mathscr S}$ is a hyperboloidal hypersurface in $\mathbb R^{1,3}$, and hence has only one asymptotically hyperbolic end. But if $\mathscr S$ were not simply connected, $\hat{\mathscr S}$ would have had more than one such end. We conclude that $\hat{\mathscr S} = \mathscr S$.

By hypothesis $(\mathscr M, {}^4g)$ satisfies the dominant energy condition, hence the domain of dependence $D(\mathscr S, \mathscr M)$ in the original space-time $\mathscr M$ is vacuum by [11, Sect. 4.3]. From [4] we conclude that $D(\mathscr S, \mathscr M)$ is isometrically diffeomorphic to a globally hyperbolic subset of the domain of dependence $D(\mathscr S, \mathring{\mathscr M})$ in the Killing development. This is a bijection when $(\mathscr M, {}^4g)$ is both past and future asymptotically simple, otherwise $D(\mathscr S, \mathscr M)$ could not be future null geodesically complete.

For the record, the above establishes the following rigidity statement (compare Theorems 5.4 and 5.7 of [7]; the reader is referred to this last reference for precise definitions):

Theorem 4.1. Let $\mu$, respectively $J^i$, denote the energy density, respectively the momentum density, of an initial data set $(\mathscr S, g, K)$. Suppose that $(\mathscr S, g)$ is geodesically complete without boundary, and that $\mathscr S$ contains an end which is $C^4 \times C^3$, or $C^1$ and polyhomogeneously, compactifiable and asymptotically CMC, with energy-momentum density decaying fast enough. If
$$g_{ij}J^iJ^j \le \mu, \tag{4.9}$$
and if the Trautman-Bondi mass of $\mathscr S$ vanishes, then $(\mathscr S, g, K)$ can be realized by embedding $\mathscr S$ into Minkowski space-time. If the initial data set is known to be vacuum near the conformal boundary from the outset, or to satisfy a set of equations which are well behaved under singular conformal transformations such as, e.g., the Einstein–Maxwell or Einstein–Yang-Mills equations, then the restriction that the data be asymptotically CMC is not needed.

Remark 4.2. It is still an open question whether a null Trautman-Bondi energy-momentum is compatible with the remaining hypotheses above; it would be of interest to settle that.

We continue by pushing $S^+$ slightly down the generators of $\mathscr I^+$, to conclude that the space-time metric is flat in a neighborhood of $S^+$. By [5] we conclude that

Proposition 4.3. There exists a neighborhood of $S^+$ which is isometrically diffeomorphic to a neighbourhood of a spherical cut of $\mathscr I^+$ in Minkowski space-time. Moreover the space-time metric is flat to the future of any spacelike hypersurface spanned by $S^+$.

Recalling that the congruence generated by $\ell$ has nowhere vanishing twist (see Proposition 3.1), Theorem 1.1 follows now from Proposition 4.4:

Proposition 4.4. There exists no smooth, null, geodesic congruence defined in a neighborhood of a cross-section $S^+$ of the Minkowskian $\mathscr I^+$ which is shear-free, transverse to $S^+$, and has nowhere vanishing twist.

Proof. Any smooth, null-geodesic congruence near a Minkowskian $\mathscr I^+$ defines a spin-weight one function $L$ as in (3.3) which is smooth on $\mathscr I^+$ in a neighbourhood of the cut $S^+ = \{u = 0\}$. By [13, Eq. (2.24)] the shear-free condition in Minkowski space is equivalent to
$$\eth L + L\dot L = 0, \tag{4.10}$$
where the dot stands for $\partial/\partial u$ and $\eth$ is the eth-operator of Newman and Penrose. Then $\Sigma$ in (3.4) is given by (see [13, Eq. (2.17)])
$$\Sigma = \frac{i}{2}\left(\eth\bar L + L\dot{\bar L} - \bar\eth L - \bar L\dot L\right). \tag{4.11}$$
Now consider $F := L\bar L$ restricted to the cut $u = 0$. Clearly $F$ has a maximum on this sphere, and at the maximum its gradient vanishes, so at any maximum
$$0 = \eth F = (\eth L)\bar L + L(\eth\bar L) = L(\eth\bar L - \bar L\dot L), \tag{4.12}$$
using (4.10) to eliminate $\eth L$. If $L \equiv 0$ on the cut $\{u = 0\}$ then $\Sigma = 0$ throughout the cut by (4.11), and we are done. Otherwise, at a maximum of $F$, $L$ does not vanish so that the second factor in (4.12) must. But by (4.11) this forces $\Sigma$ to vanish there, contradicting the hypothesis of nowhere vanishing twist.
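The last step of the proof can be spelled out as follows. This is only a sketch of the algebra, assuming the standard property that complex conjugation exchanges $\eth$ and $\bar\eth$, and writing (4.11) in the form $\Sigma = \frac{i}{2}(\eth\bar L + L\dot{\bar L} - \bar\eth L - \bar L\dot L)$. At a maximum of $F = L\bar L$ with $L \neq 0$, (4.12) and its complex conjugate give
$$\eth\bar L - \bar L\dot L = 0, \qquad \bar\eth L - L\dot{\bar L} = 0,$$
and regrouping the four terms of (4.11) into these two combinations yields
$$\Sigma = \frac{i}{2}\left[\left(\eth\bar L - \bar L\dot L\right) - \left(\bar\eth L - L\dot{\bar L}\right)\right] = 0$$
at that point.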
Proof of Theorem 2.1. Under (2.1), the argument of the proof of Theorem 1.1 with $S^+$ replaced by $S_i^+$ carries through with minor modifications. Indeed, the cross-sections $S_i^-$ are smooth acausal cross-sections of $\mathscr I^-$ as before, because they are constructed by flowing along the geodesics which start at $\mathscr I^+$, and those are insensitive to the smoothness of $\ell$ as a vector field on $\mathscr M$. Next, the argument that the Bondi mass aspect changes sign remains valid for those members of the congruence which have end points on $\mathring S_i^+$. But the Bondi mass aspect is a smooth function both on $S_i^+$ and $S_i^-$, and the density hypothesis (2.1) guarantees that the corresponding subset of $S_i^-$ is dense. Continuity allows us to conclude, as before, that the Trautman-Bondi mass changes sign when replacing $S_i^+$ with $S_i^-$.

Let $\mathscr S_i$ be a hypersurface $\mathscr S$ as in the proof of Theorem 1.1 with $S^+$ there replaced by $S_i^+$. We have shown so far that $\mathscr S_i$ is a hyperboloidal hypersurface in Minkowski space-time and, since it has no edge, its future (whether in Minkowski space-time or in $\mathscr M$) coincides with its future domain of dependence there:
$$D^+(\mathscr S_i, \mathscr M) = J^+(\mathscr S_i, \mathscr M). \tag{4.13}$$
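Now, by asymptotic simplicity, every generator of the Cauchy horizon$^{11}$ $\dot D^-(\mathscr S_i)$ has a future end point on $S_i^+$. This implies that
$$\dot D^-(\mathscr S_i, \mathscr M) = \dot J^-(S_i^+, \bar{\mathscr M}) \cap \mathscr M. \tag{4.14}$$
We continue by showing that
$$\underbrace{\cup_{i\in\mathbb N}\, D(\mathscr S_i, \mathscr M)}_{=:U} = \mathscr M. \tag{4.15}$$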
Suppose that this is not the case, then there exists a sequence of points $p_i \in \dot D^-(\mathscr S_i)$ which converges to a point $p$ belonging to the boundary $\dot U$ of $U$. Let $\dot\gamma_i$ be the vector tangent to a generator of $\dot D^-(\mathscr S_i)$ at $p_i$, normalized to unit norm with respect to an auxiliary Riemannian metric. Passing to a subsequence if necessary, the sequence $(\dot\gamma_i)$ converges to a null vector $\dot\gamma$ at $p$. Let $\gamma$ be a null geodesic through $p$ with tangent $\dot\gamma$ there, maximally extended in $\bar{\mathscr M}$, then $\gamma$ meets $\mathscr I^+$ at some point $q$. Without loss of generality, passing to a subsequence if necessary, we can assume that
$$S_{i-1}^+ \subset J^+(S_i^+, \bar{\mathscr M}).$$
Since $\mathscr I^+ \subset \cup_i J^+(S_i^+)$ there exists $i_0$ such that $q \in J^+(S_{i_0}^+)$. Then $\gamma$ intersects $D(\mathscr S_i)$ for every $i < i_0$, and since $p \notin D(\mathscr S_i)$ the null geodesic $\gamma$, when followed to the past starting from $q$, has to intersect $\dot J^-(S_i^+)$ before reaching $p$, compare (4.14). But the $\gamma_i$'s accumulate at $\gamma$ as $i$ tends to infinity, so that there exists $i_1 > i_0 + 1$ so that (by continuous dependence of solutions of ODE's upon initial values) the geodesic $\gamma_{i_1} \subset \dot J^-(S_{i_1}^+)$ intersects $\dot J^-(S_{i_0+1}^+)$. This is, however, not possible since $\dot J^-(S_{i_0+1}^+)$ is strictly interior to $J^+(\dot J^-(S_{i_1}^+))$. We conclude that (4.15) holds, and therefore ${}^4g$ is flat.

Summarising, $(\mathscr M, {}^4g)$ is a simply connected, flat, null geodesically complete manifold. Theorem 2.1 follows now from Proposition 4.5 below.$^{12}$
Proposition 4.5. Let $n \ge 1$. The only $(n+1)$-dimensional simply connected, flat, null or timelike geodesically complete Lorentzian manifold $(\mathscr M, g)$ is, up to isometric diffeomorphism, the Minkowski space-time $\mathbb R^{1,n}$.

Proof. Since $g$ is flat, the dimension of the set of germs of locally defined Killing vector fields is the same at every point. A theorem of Nomizu [18] shows then that every local Killing vector extends to a globally defined one. By [10, Lemma 1], all Killing vector fields$^{13}$ are complete. But, in a flat space-time, affinely parameterized geodesics are orbits of translational Killing vectors, hence $(\mathscr M, {}^4g)$ is geodesically complete. The result follows now from the Hadamard-Cartan theorem.

11 There are two conventions for defining $D(\mathscr S)$; we use the one in which inextendible timelike curves are required to intersect $\mathscr S$ precisely once.

12 We are grateful to a referee for a suggestion leading to Proposition 4.5.
13 The hypothesis that the Killing vector is timelike, made elsewhere in [10], is not used in the proof, which goes through unchanged with one exception: when $n = 1$, and the manifold is assumed to be null geodesically complete, and the Killing orbit is null. But in this case the orbit is a null geodesic, so null geodesic completeness implies completeness of that orbit trivially.

5. Concluding Remarks

One would like to remove all restrictive hypotheses of Theorem 2.1 and assert that the only algebraically special vacuum asymptotically simple space-time is the Minkowski one. Any proof of this, in a setting where the set
$$V := \{p \in \mathscr I^+ \mid p \text{ is an end-point of an integral curve } \gamma \text{ of } \ell\} \subset \mathscr I^+$$
does not cover a dense subset of some sequence of cross-sections of $\mathscr I^+$, has to use arguments going beyond those indicated by Mason. On the other hand, one could expect that some version of the current argument should apply if the last density property holds. However, attempts to include such situations face several difficulties. Suppose, for instance, that $S$ is a cross-section of $\mathscr I^+$ such that $V \cap S$ is dense in $S$. Now, Mason's construction requires flowing from $V \cap S$ to the past along the integral curves of $\tilde\ell$. Since $V \cap S$ is not compact anymore, the geometry of the resulting subset $S^-$ of $\mathscr I^-$ is not clear: By causality considerations, $S^-$ will be bounded to the future on $\mathscr I^-$, however, it could very well be unbounded to the past. Regardless of that issue, the closure $\overline{S^-}$ of $S^-$ might fail to be differentiable. Finally, $S^-$ might develop self-intersections. In all those cases a useful notion of mass of $S^-$ is not clear, and certainly no suitable positivity theorem is available.

Similar problems concerning the geometry of $S_i^-$ could occur in those space-times in which $\mathscr I^-$ is not diffeomorphic to $\mathbb R \times S^2$; while we are not aware of any such asymptotically simple examples, their existence has not been ruled out so far (strong causality at $\mathscr I^-$ must then necessarily fail, compare [17]).

In this context the following example is rather instructive: Consider a cut $S$ of $\mathscr I^+$ in Minkowski or Schwarzschild space-time given by the equation $u = \alpha$, then the integrand of the Trautman-Bondi mass of $S$ equals
$$\frac{m}{4\pi} + \frac{1}{16\pi}\Delta_2(\Delta_2 + 2)\alpha,$$
where $m$ is the Schwarzschild mass parameter (which we set to zero in the Minkowski case), while $\Delta_2$ is the Laplacian on $S^2$ (see, e.g., [6, p. 136]). Now, with a little work one finds that for any $c \in \mathbb R$ the function
$$\alpha_c = \frac{c}{4}\left(\cos\theta\,\ln\tan(\theta/2) - 2\ln\sin\theta\right)$$
is a solution of
$$\Delta_2(\Delta_2 + 2)\alpha_c = c \tag{5.1}$$
away from the north and south poles. We can add to $\alpha_c$ elements of the kernel of the operator appearing in (5.1) which, when allowing functions which are singular at the poles, contains $\ln\tan(\theta/2)$. By adding to $\alpha_c$ an appropriate multiple of this last function one can obtain a function $\alpha_{c,S}$ which solves (5.1) away from the south pole, as well as a function $\alpha_{c,N}$ which is a solution away from the north one:
$$\alpha_{c,N} = \alpha_c - \frac{c}{4}\ln\tan(\theta/2), \qquad \alpha_{c,S} = \alpha_c + \frac{c}{4}\ln\tan(\theta/2).$$
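Equation (5.1) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that on functions of $\theta$ alone the Laplacian on $S^2$ acts as $f'' + \cot\theta\, f'$, and it approximates the derivatives by standard five-point central differences. The function names and the `pole_term` parameter are hypothetical, introduced only for this illustration.

```python
import math

def alpha(theta, c, pole_term=0.0):
    """alpha_c(theta) plus pole_term * (c/4) * ln tan(theta/2).

    pole_term = 0 gives alpha_c, -1 gives alpha_{c,N}, +1 gives alpha_{c,S}.
    """
    log_tan = math.log(math.tan(theta / 2.0))
    return (c / 4.0) * ((math.cos(theta) + pole_term) * log_tan
                        - 2.0 * math.log(math.sin(theta)))

def laplacian(f, h=1e-2):
    """Approximate Laplacian on S^2 for axisymmetric f: f'' + cot(theta) f'.

    Uses five-point central differences (fourth-order accurate in h).
    """
    def lap_f(t):
        fm2, fm1, f0, fp1, fp2 = (f(t + k * h) for k in (-2, -1, 0, 1, 2))
        first = (fm2 - 8.0 * fm1 + 8.0 * fp1 - fp2) / (12.0 * h)
        second = (-fm2 + 16.0 * fm1 - 30.0 * f0 + 16.0 * fp1 - fp2) / (12.0 * h * h)
        return second + first / math.tan(t)
    return lap_f

def fourth_order_operator(f):
    """Return the function theta -> Delta(Delta + 2) f, as in (5.1)."""
    lap_f = laplacian(f)
    return laplacian(lambda t: lap_f(t) + 2.0 * f(t))

if __name__ == "__main__":
    c = -1.0
    for theta in (0.8, 1.3, 2.2):
        value = fourth_order_operator(lambda t: alpha(t, c))(theta)
        print(theta, value)  # each value should be close to c, away from the poles
```

Within finite-difference accuracy the same check applies to $\alpha_{c,N}$ and $\alpha_{c,S}$ (set `pole_term` to -1 or +1), since $\ln\tan(\theta/2)$ lies in the kernel of the operator.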
The graph $\{u = \alpha\}$ of each of these functions thus provides an example of an embedded smooth submanifold of $\mathscr I^+$ (which fails to be a cut of $\mathscr I^+$ because it misses one generator) with Trautman-Bondi mass $m_{TB}$, when naively defined as the integral of the mass aspect function, being an affine function of $c$; in particular both the mass aspect and $m_{TB}$ can be negative.

A piecewise smooth, non-differentiable, but (uniformly) Lipschitz continuous cross-section of the Minkowskian $\mathscr I^+$, with mass aspect function which is everywhere negative except at the equator where it is not defined, can be constructed by using the function $\alpha = \max(\alpha_{-1,S}, \alpha_{-1,N})$. The reader may readily devise a similar example in Schwarzschild space-time, or in any space-time with a complete $\mathscr I^+$ in which the relevant functions are uniformly bounded over $\mathscr I^+$.

Acknowledgements. We acknowledge useful discussions with, or comments from, M. Anderson, G. Galloway, W. Natorf, E.T. Newman, J. Tafel, and A. Trautman.
A. Smoothness of $\ell$ for Non-Branching Metrics

Let $\ell$ be the field of principal null directions of the Weyl tensor, normalized so that $\nabla_\ell\ell = 0$. In this Appendix we wish to prove that $\ell$ is smooth on the set where the Weyl tensor is non-branching, as defined in the introduction; thus either of type II or D throughout the set, or type III throughout the set, or type N throughout. As already mentioned, in the type II or D case we allow the type to change from point to point, as long as the Weyl tensor remains in the II or D class.

Since the claim is local, it is sufficient to establish the result in a neighborhood of a point $p$. So let $o^A$, $\iota^A$ be any local basis of the space of two component spinors near $p$, and let $\psi_{ABCD}$ be the Weyl spinor. Then, by definition, the Weyl tensor is type II or D if at least one of the solutions of the equation
$$0 = P(\lambda) := \psi_{ABCD}(\lambda\iota^A + o^A)(\lambda\iota^B + o^B)(\lambda\iota^C + o^C)(\lambda\iota^D + o^D) \equiv \psi_4\lambda^4 + \psi_3\lambda^3 + \psi_2\lambda^2 + \psi_1\lambda + \psi_0 \tag{A.1}$$
corresponds to a zero which is exactly of second order. The associated principal null direction is (whatever the type) determined by the null vector $(\lambda\iota^A + o^A)(\bar\lambda\bar\iota^{A'} + \bar o^{A'})$. So smoothness of $\ell$ near $p$, for a smooth metric, will be proved if we show that the solution $\lambda$ of (A.1) depends smoothly upon the coefficients $\psi_i$ appearing in (A.1). We will actually show that $\lambda$ is an analytic function of the coefficients, see Proposition A.1 below, so $\ell$ will be analytic if the Weyl tensor is. The analysis applies regardless of the order of the remaining roots of (A.1), which explains why the argument covers both the II and D Petrov-types (recall that type II is defined by requiring the remaining zeros to be simple, while type D corresponds to a second order zero for the other root).
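The smooth dependence of a double root on the coefficients can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not the authors' construction: it takes a monic quartic $W(\lambda) = \lambda^4 + \sum_i \alpha_i\lambda^i$ with a double root, locates the nearby zero of $W'$ by Newton's method, and compares a finite-difference derivative of that zero with the sensitivity formula $\partial\lambda/\partial\alpha_i = -i\lambda^{i-1}/W''(\lambda)$ obtained later in this appendix as (A.5). The function names and the sample polynomial $(\lambda-1)^2(\lambda+2)(\lambda-3)$ are hypothetical choices.

```python
def w_derivatives(alpha, lam):
    """Return (W'(lam), W''(lam)) for W(lam) = lam**N + sum_i alpha[i] * lam**i,
    where N = len(alpha) >= 2."""
    n = len(alpha)
    d1 = n * lam ** (n - 1)
    d2 = n * (n - 1) * lam ** (n - 2)
    for i, a in enumerate(alpha):
        if i >= 1:
            d1 += i * a * lam ** (i - 1)
        if i >= 2:
            d2 += i * (i - 1) * a * lam ** (i - 2)
    return d1, d2

def critical_point(alpha, guess, steps=50):
    """Newton iteration for W'(lam) = 0, started at `guess` (complex arithmetic)."""
    lam = complex(guess)
    for _ in range(steps):
        d1, d2 = w_derivatives(alpha, lam)
        lam -= d1 / d2
    return lam

def predicted_derivative(alpha, lam, i):
    """Sensitivity of the zero of W' to alpha[i]: -i * lam**(i-1) / W''(lam).
    Here i is the coefficient index, not the imaginary unit."""
    if i == 0:
        return 0j
    _, d2 = w_derivatives(alpha, lam)
    return -i * lam ** (i - 1) / d2

if __name__ == "__main__":
    # (lam - 1)^2 (lam + 2)(lam - 3) = lam^4 - 3 lam^3 - 3 lam^2 + 11 lam - 6
    alpha = [-6.0, 11.0, -3.0, -3.0]
    lam0 = critical_point(alpha, 1.0)
    eps = 1e-6
    for i in range(len(alpha)):
        shifted = list(alpha)
        shifted[i] += eps
        finite_diff = (critical_point(shifted, lam0) - lam0) / eps
        print(i, finite_diff, predicted_derivative(alpha, lam0, i))
```

The check covers only the first step of the argument (analyticity of the zero of $W'$ where $W'' \neq 0$); the restriction to the set where $W$ also vanishes is handled in the text.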
Similarly we define the Weyl tensor to be of type III throughout a set $U$ if one of the solutions of (A.1) corresponds to a zero of exactly third order throughout $U$; smoothness of the associated vector field $\ell$ follows then from Proposition A.2 below with $k = 3$. Finally type N is defined by requiring $P$ to have one single zero of order four, and smoothness is a consequence of Proposition A.2 with $k = 4$.

We start by noting that, by passing to a different basis of the space of spinors if necessary, we can assume $\psi_4$ is non-zero at $p$. Indeed, suppose that $\psi_4$ is zero in any basis at $p$, then also $\psi_0 = 0$ for any basis at $p$, which implies $P(\lambda) = 0$ for all $\lambda$. It follows that $\psi_i = 0$ for all $i \in \{0, \dots, 4\}$, hence $\psi_{ABCD} = 0$ at $p$, thus the Weyl tensor is of type 0 there, contradicting our hypothesis that the Weyl tensor is non-branching on the set under consideration. From now on we choose any basis so that $\psi_4(p) \neq 0$, but then by continuity there exists a neighborhood $V_p$ of $p$ on which $\psi_4$ has no zeros. All remaining considerations are restricted to $V_p$, which involves no loss of generality since $p$ is arbitrary within the non-branching set. Dividing by $\psi_4$, we are led to study the equation
$$0 = \lambda^N + \sum_{i=0}^{N-1}\alpha_i\lambda^i \equiv W(\lambda), \tag{A.2}$$
with smooth complex coefficients $\alpha_i$ (in the case of current interest, $\alpha_i = \psi_i/\psi_4$, and $N = 4$). Then $\lambda$ is a zero of order two if and only if
$$W(\lambda) = W'(\lambda) = 0, \quad\text{but}\quad W''(\lambda) \neq 0.$$
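We need to analyse the dependence of $\lambda$ upon the coefficients $\alpha_i$ of (A.2). Consider, first, the equation
$$W'(\lambda) = 0; \tag{A.3}$$
the holomorphic implicit function theorem shows that (A.3) defines an analytic function $\lambda \equiv \lambda(\alpha_i)$ on the set
$$U_2 := \{W''(\lambda) \neq 0,\ \lambda \in \mathbb C,\ (\alpha_i) \in \mathbb C^N\} \subset \mathbb C^{N+1}, \tag{A.4}$$
with
$$\frac{\partial\lambda}{\partial\alpha_i} = -\frac{i\lambda^{i-1}}{W''(\lambda)}. \tag{A.5}$$
Next, let the function $\varphi : U_2 \to \mathbb C$ be defined as $\varphi = W(\lambda(\alpha_i))$; by definition we have $W'(\lambda(\alpha_i)) = 0$ so that
$$d\varphi = \frac{\partial\varphi}{\partial\alpha_i}d\alpha_i = \Big(\underbrace{W'(\lambda)}_{=0}\frac{\partial\lambda}{\partial\alpha_i} + \lambda^i\Big)d\alpha_i = d\alpha_0 + \lambda\,d\alpha_1 + \dots + \lambda^{N-1}d\alpha_{N-1}. \tag{A.6}$$
It follows that $d\varphi$ has no zeros on $U_2$, hence $\{W(\lambda) = 0\}$ is an analytic submanifold of $U_2$. We have thus shown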
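Proposition A.1. The set
$$V_2 := \{\alpha_i : W(\lambda) = W'(\lambda) = 0,\ W''(\lambda) \neq 0 \text{ for some } \lambda \in \mathbb C\} \subset \mathbb C^N$$
is an analytic submanifold of co-dimension one in $\mathbb C^N$, with $\lambda$ being an analytic function on $V_2$.

The above generalizes immediately to zeros of $W$ which are exactly of order $k$: indeed, set
$$V_k := \left\{\alpha_i : \exists\, \lambda \in \mathbb C \text{ such that } W^{(i)}(\lambda) = 0,\ i = 0, \dots, k-1, \text{ but } W^{(k)}(\lambda) \neq 0\right\} \subset \mathbb C^N. \tag{A.7}$$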
Then the equation $W^{(k-1)}(\lambda) = 0$ defines a smooth function $\lambda$ on $V_k$ by the implicit function theorem, using an obvious generalization of (A.5), and for $k = 1$ we are done. Otherwise consider the map $\phi = (\phi^i) : V_k \to \mathbb R^k$, where $\phi^i = W^{(i)}(\lambda)$, $i = 0, \dots, k-1$. On the preimage $\phi^{-1}(\{0\})$ we have, as in (A.6), $\partial\phi^j/\partial\alpha_i = i(i-1)\cdots(i-j+1)\lambda^{i-j}$, so that the last $k$ columns of the Jacobi matrix take the form
$$\begin{pmatrix} \lambda^{k-1} & \lambda^{k-2} & \cdots & 0! \\ (k-1)\lambda^{k-2} & \cdots & 1! & 0 \\ \vdots & ⋰ & 0 & 0 \\ (k-1)! & 0 & 0 & 0 \end{pmatrix},$$
the determinant of which is clearly non-vanishing. By the rank theorem one concludes that:

Proposition A.2. The set $V_k$ is an analytic submanifold of co-dimension $k - 1$ in $\mathbb C^N$, with $\lambda$ being an analytic function on $V_k$.

Remark A.3. Identical arguments apply to polynomials with real coefficients, $\mathbb C$ being replaced by $\mathbb R$ and "analytic" being replaced by "real analytic" both in the statements and in the proofs.

The argument just given also settles the following closely related question: consider a smooth function $A : U \to \mathrm{End}(\mathbb C^N)$ or $A : U \to \mathrm{End}(\mathbb R^N)$, defined on an open subset $U$ of $\mathbb R^n$, with the property that for all $p \in U$ the dimension of the associated eigenspace equals $k$. We further assume that $A$ is hermitian in the complex case, or symmetric in the real one. We claim that the function which to $p \in U$ assigns the associated $k$-dimensional eigenspace is a smooth function on $U$.$^{14}$ In order to see this, let $\lambda$ be a solution of the characteristic equation,
$$0 = \lambda^N + \sum_{i=0}^{N-1}\alpha_i\lambda^i \equiv W(\lambda) := \det(A - \lambda\,\mathrm{Id}). \tag{A.8}$$
Then $\lambda$ will have algebraic multiplicity $k$ if and only if the $\alpha_i$'s belong to the set $V_k$ of (A.7). Composing with the map which to $A$ assigns its symmetric polynomials $\alpha_i$, and using Proposition A.2, we conclude that $\lambda$ is a smooth function on $U$ (analytic if $A$ is). This allows us to show that:

14 The question of multiple principal directions of the Weyl tensor, discussed at the beginning of this section, can also be formulated as such a problem [23].
Proposition A.4. The $k$-dimensional eigenspaces are smooth functions on $U$, analytic if $A$ is.

Proof. Let $p_0 \in U$ and let $A_0 = A(p_0)$, $\lambda_0 = \lambda(p_0)$, thus there exist $k$ linearly independent vectors $e_i \in \mathbb C^N$ such that
$$(A_0 - \lambda_0\mathrm{Id})e_1 = \dots = (A_0 - \lambda_0\mathrm{Id})e_k = 0.$$
We can complete $\{e_i\}_{i=1}^k$ to a basis $\{e_i\}_{i=1}^N$ of $\mathbb C^N$. In this basis any $A = A(p)$ can be written as
$$A = \lambda\,\mathrm{Id} + \begin{pmatrix} B & C \\ C^\dagger & E \end{pmatrix}, \quad\text{while}\quad A_0 = \lambda_0\,\mathrm{Id} + \begin{pmatrix} 0 & 0 \\ 0 & E_0 \end{pmatrix},$$
where $B$ is a $k \times k$ matrix, with $\lambda = \lambda(p)$, and with $B$, $C$, and $E$ being analytic functions of $A$, hence smooth in $p$ (analytic if $A$ is). Since $\dim\mathrm{Ker}(A_0 - \lambda_0\mathrm{Id}) = k$ we have $\det E_0 \neq 0$, hence there exists a neighborhood of $A_0$ on which $\det E \neq 0$. For $p$ within this neighborhood set $X_i = e_i + \hat X_i$, where the vectors $\hat X_i \in \mathrm{Vect}\{e_{k+1}, \dots, e_N\}$ are given by
$$\hat X_1 = -E^{-1}C^\dagger e_1, \quad\dots,\quad \hat X_k = -E^{-1}C^\dagger e_k.$$
Clearly the $X_i$'s are analytic functions of $A$, thus smooth (analytic if $A$ is) in $p$. As $\mathrm{Ker}(A - \lambda\mathrm{Id})$ has dimension precisely $k$ throughout $U$ by hypothesis, it easily follows that the $X_i$'s span $\mathrm{Ker}(A - \lambda\mathrm{Id})$.
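The formula $X_i = e_i - E^{-1}C^\dagger e_i$ can be tried out on a small example. The sketch below is an illustration only, with hypothetical names: it works with a real symmetric matrix (so $C^\dagger$ is the transpose of $C$), uses exact rational arithmetic, and takes as input $M = A - \lambda\,\mathrm{Id}$ for a sample matrix $A = \lambda\,\mathrm{Id} + uu^T + ww^T$ with $u = (1,0,1,0)$, $w = (0,1,1,1)$, whose eigenvalue $\lambda$ has a two-dimensional eigenspace.

```python
from fractions import Fraction

def solve(E, b):
    """Solve E x = b exactly by Gauss-Jordan elimination (E square and invertible)."""
    n = len(E)
    aug = [[Fraction(v) for v in row] + [Fraction(b[r])] for r, row in enumerate(E)]
    for col in range(n):
        # pick a row with a non-zero pivot and move it into place
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def kernel_vectors(M, k):
    """Given symmetric M = A - lam*Id in block form [[B, C], [C^T, E]] with B of
    size k x k and E invertible, return X_i = e_i - E^{-1} C^T e_i for i < k."""
    n = len(M)
    E = [row[k:] for row in M[k:]]
    vectors = []
    for i in range(k):
        ct_ei = [M[r][i] for r in range(k, n)]  # C^T e_i: i-th column of lower-left block
        x_hat = [-v for v in solve(E, ct_ei)]
        e_i = [Fraction(int(j == i)) for j in range(k)]
        vectors.append(e_i + x_hat)
    return vectors

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(Fraction(m) * v for m, v in zip(row, x)) for row in M]

if __name__ == "__main__":
    # M = u u^T + w w^T with u = (1,0,1,0), w = (0,1,1,1); its kernel is 2-dimensional
    M = [[1, 0, 1, 0],
         [0, 1, 1, 1],
         [1, 1, 2, 1],
         [0, 1, 1, 1]]
    for X in kernel_vectors(M, 2):
        print(X, apply(M, X))
```

In this example the Schur complement $B - CE^{-1}C^\dagger$ vanishes, which is what the dimension hypothesis forces in the proof, so each $X_i$ is annihilated by $M$.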
B. Rescalings, $\rho$ and Smooth Extendibility of $\tilde\ell$

Throughout this Appendix the symbol $\ell$ denotes a vector field satisfying $\nabla_\ell\ell = 0$ together with (1.1). We assume that the Ricci tensor of $(\mathscr M, g)$ satisfies the conditions spelled out in the last part of Theorem 1.1. The aim here is to prove the following:

Theorem B.1. Suppose that $\ell$ is smooth on the intersection $U \cap \mathscr M$ of a neighborhood $U$ of $\mathscr I^+$ with $\mathscr M$, and let
$$V := \{p \in \mathscr I^+ \mid p \text{ is an end-point of an integral curve } \gamma \text{ of } \ell\} \subset \mathscr I^+,$$
$$V_{\rho\not\equiv 0} := \{p \in \mathscr I^+ \mid p \text{ is an end-point of an integral curve } \gamma \text{ of } \ell \text{ with } \rho \not\equiv 0 \text{ on } \gamma\} \subset V,$$
$$U_{\rho\not\equiv 0} := \{p \in \mathscr I^+ \mid p \text{ is an end-point of precisely one integral curve } \gamma \text{ of } \ell \text{ with } \rho \not\equiv 0 \text{ on } \gamma\} \subset V_{\rho\not\equiv 0}.$$
Then
1. The field $\Omega^{-2}\ell$ extends smoothly and transversally to a neighborhood of $p \in V$ if and only if $p \in U_{\rho\not\equiv 0}$.
2. The sets $V_{\rho\not\equiv 0}$ and $U_{\rho\not\equiv 0}$ coincide, and are open subsets of $\mathscr I^+$ (perhaps empty).

Proof. Point 1. The necessity follows from Proposition 3.1, the sufficiency from Proposition B.3 below. Point 2 follows from Proposition B.3.
An example of a set $V$ which is the union of precisely one generator of $\mathscr I^+$ and one generator of $\mathscr I^-$ (and is therefore closed, without interior) is provided by the congruence of null geodesics with tangent vector $\partial_t + \partial_z$ in Minkowski space-time. Note that in this example $\ell$ extends to a smooth vector field everywhere tangent to $\mathscr I$, and thus $\Omega^{-2}\ell$ extends neither to $\mathscr I^+$ nor to $\mathscr I^-$. An example of $V$ which is not closed is provided by the Robinson congruence in Minkowski space-time [20, Volume I, p. 59], where $V$ equals $\mathscr I$ with one generator removed from each of $\mathscr I^+$ and $\mathscr I^-$.

Theorem B.1 has the following corollary:

Corollary B.2. Let $\ell$ be smooth on the intersection $U \cap \mathscr M$ of a neighborhood $U$ of $\mathscr I^+$ with $\mathscr M$, and suppose that all future directed integral curves of $\ell$ in $U$ have end points on $\mathscr I^+$. Then the following conditions are equivalent:
1. $\tilde\ell := \Omega^{-2}\ell$ extends smoothly and transversally to $\mathscr I^+$.
2. $\tilde\rho$ is bounded on $U$.
3. $\rho$ is nowhere vanishing on $U \cap \mathscr M$.

Proof. The implication 1 $\Longrightarrow$ 2 is obvious. Next, (B.15) below shows that $\rho$ does not vanish near $\mathscr I^+$ under the hypothesis of Point 2, but then $\rho$ is nowhere vanishing by (B.1) as long as the congruence remains smooth, and the implication 2 $\Longrightarrow$ 3 follows. Finally, the extendibility part of 3 $\Longrightarrow$ 1 follows from Theorem B.1; transversality follows from the construction in that theorem.
Before passing to the statement, and proof, of Proposition B.3, we analyse the transformation properties of the objects at hand under conformal rescalings. From the general theory of algebraically-special metrics [13,14,23], which has been reviewed in Sect. 3, there is a normalized spinor dyad $(o^A, \iota^A)$ related to the affinely-parameterized vector field $\ell$ by $\ell^a = o^A\bar o^{A'}$, and with the following restrictions on the spin-coefficients:
$$\kappa = \epsilon = \sigma = \tau = \pi = \lambda = 0.$$
In any region in which the complex expansion $\rho$ is non-zero, the $r$-dependence of the non-zero spin-coefficients for vacuum, where $r$ is an affine-parameter along $\ell$ so that $\ell(r) = 1$, has been explicitly found above as:
$$\rho = -(r + r^0 + i\Sigma)^{-1}, \tag{B.1}$$
$$\alpha = -\alpha^0\rho, \tag{B.2}$$
$$\beta = -\beta^0\rho, \tag{B.3}$$
$$\gamma = \gamma^0 + \frac{1}{2}\rho^2\psi_2^0, \tag{B.4}$$
$$\mu = \mu^0\rho + \frac{1}{2}\rho(\rho + \bar\rho)\psi_2^0, \tag{B.5}$$
$$\nu = \nu^0 + \psi_3^0\rho + \frac{1}{2}\psi_3^1\rho^2 + \frac{1}{3}\psi_3^2\rho^3. \tag{B.6}$$
In (B.1)-(B.6), the superscript zero indicates a function constant along $\ell$, and $\psi_3^1$, $\psi_3^2$ are also constant along $\ell$. For the non-vacuum case, $\gamma$, $\mu$ and $\nu$ are given instead by (3.27), (3.29) and (3.30), which will be sufficient for our conclusion below.
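As a side remark on (B.1), here is a short consistency check. It is a sketch that assumes, beyond what is stated above, that $r^0$ and $\Sigma$ are real, and reads (B.1) as $\rho = -(r + r^0 + i\Sigma)^{-1}$. Since $r^0$ and $\Sigma$ are constant along $\ell$,
$$|\rho|^2 = \frac{1}{(r + r^0)^2 + \Sigma^2} > 0, \qquad \frac{d\rho}{dr} = \frac{1}{(r + r^0 + i\Sigma)^2} = \rho^2,$$
so a $\rho$ of this form has no zeros at finite $r$ and tends to zero only as $|r| \to \infty$.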
With the conformal rescaling g = 2 g we obtain ([24]) ˜ ab + ˜ b − 1 −1 a ∇ gab ( g e f ∂e ∂ f ) = (− gab ), ∇ 2
(B.7)
where, following the usual NP conventions,
$$\Phi_{ab} = -\frac{1}{2}R_{ab} + \frac{1}{8}Rg_{ab}, \qquad \Lambda = \frac{1}{24}R,$$
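in terms of the Ricci tensor $R_{ab}$ and scalar curvature $R$, and the tilde indicates that these quantities are calculated for $\tilde g$. Now $\ell$ is geodesic, shear-free and affinely parameterized for $g$, and one readily finds that $\tilde\ell = \Omega^{-2}\ell$ has the same properties for $\tilde g$. Suppose an affine parameter for $\tilde\ell$ is $\tilde r$, so that $\tilde\ell(\tilde r) = 1$, as well as $\ell(r) = 1$. Then $\tilde\ell$ is bounded in $\bar{\mathscr M}$, being a solution of the equation $\tilde\nabla_{\tilde\ell}\tilde\ell = 0$ with smooth data at $\Omega = \epsilon > 0$. Contract (B.7) with $\tilde\ell^a\tilde\ell^b$ to find
$$\frac{d^2\Omega}{d\tilde r^2} = \Omega\left(-\tilde\Phi_{ab}\tilde\ell^a\tilde\ell^b\right). \tag{B.8}$$
Integrate this twice along a geodesic of the congruence, fixing the origin of $\tilde r$ to be at $\mathscr I^+$ (note that $\tilde r \le 0$ then), to obtain:
$$\frac{d\Omega}{d\tilde r} = A + \int_{\tilde r}^0 \Omega(s)\left(\tilde\Phi_{ab}\tilde\ell^a\tilde\ell^b\right)(s)\,ds, \tag{B.9}$$
$$\Omega = A\tilde r + \int_{\tilde r}^0 (\tilde r - s)\,\Omega(s)\left(\tilde\Phi_{ab}\tilde\ell^a\tilde\ell^b\right)(s)\,ds, \tag{B.10}$$
where $A$ is a constant of integration which can be written as
$$A = \frac{d\Omega}{d\tilde r}\Big|_{\mathscr I^+} = \tilde\ell^a\Omega_{,a}\big|_{\mathscr I^+}.$$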
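(The limit is negative since $\Omega$ decreases towards $\mathscr I^+$.) Suppose that $p \in \mathscr I^+$ is an end-point of an integral curve of $\ell$. Then $\tilde\ell$ is transverse to $\mathscr I^+$ at $p$ and we conclude that $A$ is nonzero there. We have a remaining freedom to multiply $\ell$ and hence also $\tilde\ell$ by a positive function and we may use this to set $A = -1$. Now
$$1 = \tilde\ell(\tilde r) = \Omega^{-2}\ell(\tilde r) = \Omega^{-2}\frac{d\tilde r}{dr},$$
from which $r\Omega \to 1$ as $r \to \infty$, and
$$\frac{d\tilde r}{dr} = \Omega^2.$$
The chosen rescaling of $\ell$ implies the following rescalings for the null tetrad:
$$\tilde\ell^a = \Omega^{-2}\ell^a, \qquad \tilde m^a = \Omega^{-1}m^a, \qquad \tilde n^a = n^a,$$
the following for the corresponding one-forms:
$$\tilde\ell_a = \ell_a, \qquad \tilde m_a = \Omega m_a, \qquad \tilde n_a = \Omega^2n_a, \tag{B.11}$$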
and the following for the spinor dyad: o˜ A = −1 o A , ι A = ι A , o˜ A = o A , ι A = ι A ,
(B.12)
while the spin-coefficients change according to: α˜ = β˜ = ρ˜ = τ˜ = γ˜ = π˜ = µ ˜ = ν˜ =
−1 α − −2 δ, −1 β, −2 (ρ − −1 D), −−2 δ, γ − −1 , −2 δ, µ − −1 , ν,
(B.13) (B.14) (B.15) (B.16) (B.17) (B.18) (B.19) (B.20)
as well as ˜ = κ˜ = σ˜ = λ˜ = 0. From (B.7) ˜ − ( D˜ m˜ a )∂a = − ˜ ab ˜a m˜ b := − ˜ 01 , D˜ δ ˜ or with (B.18) and the definition of π˜ and δ: ˜ −2 δ) = − ˜ 01 . D(
(B.21)
We are ready to prove now Proposition B.3. The set Vρ≡0 is open and coincides with Uρ≡0 . Moreover the field −2 extends by continuity to a smooth vector field ˜ on Vρ≡0 . Proof. Consider an integral curve of which has an end point on I + at p ∈ Vρ≡0 . ˜ still denoted by , which meets Then can be extended to a null geodesic with tangent , I + transversally at p. There exists 0 > 0 so that meets all the level set { = }, 0 ≤ ≤ 0 transversally. Let W ⊂ { = 0 } be a small conditionally compact open neighborhood of ∩{ = 0 }, on which ρ is not vanishing, and to which is transverse. Let the set O ⊂ M be the union of points obtained by flowing W along the geodesics tangent to ˜ from W to I + . We let O = O∩ M denote the intersection of O with the original space-time M . We start by showing that the tilded spin coefficients are uniformly bounded on O. To see that, integrate (B.21) to find that −2 δ is bounded up to I + , and therefore, by (B.16) and (B.18), so are τ˜ and π˜ . From (B.1)-(B.3) and (B.13)-(B.15), we may ˜ For ρ, conclude boundedness of α, ˜ β. ˜ straightforward manipulations using (B.10) lead to the following form of (B.15): 0 1 + r r˜ r 1 a ˜b ˜ ˜ i − + (˜r − s)(s)(ab )(s)ds ρ˜ = (r + i) r˜ 1 0 ˜ ab ˜a ˜b )(s)ds. (s)( (B.22) − r˜
Note that r → 1 and r r˜ → −1 as r˜ approaches zero, and boundedness of each term in (B.22) easily follows. For γ˜ , we return to (B.7) and contract with ˜a n˜ b to find ˜ −1 ) = |−2 δ|2 − ˜ 11 + , ˜ D(
(B.23)
using what we already have. Integrate this to find that −1 is bounded at I + and therefore so also is γ˜ , from (B.4). Finally, from (B.5), B.6), (B.19) and (B.20), µ ˜ and ν˜ are bounded. In the non-vacuum case, we need the modified expressions (3.27), (3.29) and (3.30) for γ , µ and ν but the conclusion is the same. Now a ˜b = ˜a ((γ˜ + γ˜ )˜b − τ˜ m˜ b − τ˜ m˜ b ) ∇ ˜ ˜b − ρ˜ m˜ b − ρ˜ m˜ b ) −m˜ a ((α˜ + β) ˜ ˜b − ρ˜ m˜ b − ρ˜ m˜ b ), −m˜ a ((α˜ + β)
(B.24)
with similar expressions for the derivatives of $\tilde n$ and $\tilde m$, and we have shown that all the covariant derivatives of the tetrad are uniformly bounded. It follows that the tetrad $\tilde\ell$, $\tilde n$, $\tilde m$ is uniformly Lipschitz, and therefore extends to a Lipschitz continuous tetrad on the $\overline{M}$-closure $\overline{O} \supset O$ of $O$. In particular the extended vector field $\tilde\ell$ is Lipschitz continuous. This implies that the map obtained by flowing along the geodesic with initial tangent $\tilde\ell$ from $\mathscr{I}^+$ for an affine parameter distance $\tilde r$ defines a Lipschitz continuous function of the coordinates, say $v^A$, on $\mathscr{I}^+$: indeed, by definition we have, in any smooth coordinate system near $\mathscr{I}^+$,
$$x^\mu(\tilde r, v^A) - x^\mu(\tilde r, v'^A) = -\int_{\tilde r}^{0} \Big(\tilde\ell^\mu\big(x^\nu(s, v^A)\big) - \tilde\ell^\mu\big(x^\nu(s, v'^A)\big)\Big)\, ds, \tag{B.25}$$
and the Lipschitz character of v A → x µ (˜r , v A ) follows from the Gronwall inequality. We now show (uniform) Lipschitz continuity of the connection coefficients. First, from (B.9)-(B.10), the functions , /˜r and d/d r˜ are now uniformly Lipschitz in the variables (˜r , v A ) by a calculation similar to that in (B.25). Next, we want to show Lipschitz continuity of the right-hand-side of (B.22), which we rewrite in the following way, convenient for the purposes here: ρ˜ =
0 1 1 + r r˜ r˜ s (s) s ˜ ˜a ˜b (ab )(s)ds i − + r r˜ 1− r˜ (r + i) (˜r ) r˜ s r˜ 0 (s) s ˜ ˜a ˜b r˜ (ab )(s)ds. (B.26) − (˜r ) r˜ s r˜
Consider the function h := r r˜ + 1; it follows from (B.11) that h satisfies the equation r˜
dh = h + H, d r˜
H=
r˜
2 − 1,
where the function H is already known to be uniformly Lipschitz in v A . Integration gives
r˜ H (s) h = C r˜ 1 + ds , (B.27) s2 r˜0 and uniform Lipschitz continuity of h — and hence also of r r˜ — follows by straightforward estimations. But now
r = (r r˜ ) r˜ is uniformly Lipschitz as well. Rewriting (B.10) with A = −1 as 0 s (s) ˜ ˜a ˜b +1= s(ab )(s)ds, (1 − ) r˜ r˜ s r˜ we find that /˜r + 1 is O(˜r 2 ), with v A -Hölder modulus of continuity also being O(˜r 2 ). But then 1 − r˜ 1 + r˜ H= 2 r˜
is $O(\tilde r^2)$, with $v^A$-Hölder modulus of continuity $O(\tilde r^2)$. Rewriting (B.27) as
r˜ H (s) h =C 1+ ds , r˜ s2 r˜0
(B.28)
we conclude that h/˜r is uniformly Lipschitz continuous in v A . It follows that h/ = (h/˜r )(˜r / ) also is. From the right-hand-side of (B.26) we conclude that ρ˜ is uniformly Lipschitz in v A . To continue, integration of (B.21) shows that −2 δ is a Lipschitz function of v A , which in turn justifies Lipschitz continuity of τ˜ and π˜ . Furthermore, the uniformly Lipschitz character of the flow of ˜ implies that all the functions such as , α 0 , etc., are Lipschitz continuous functions of v A , hence — by composition — Lipschitz continuous functions on M. This, together with (B.6) and (B.20) immediately shows that ν˜ is Lipschitz continuous. From what has been said and from (B.1)–(B.3), (B.13)–(B.14) we conclude that α˜ and β˜ are uniformly Lipschitz continuous. Finally, integration of the right-hand-side of (B.23) gives uniform Lipschitz continuity of −1 and hence, in view of (B.4), (B.5), (B.17) and (B.19), that of γ˜ and µ. ˜ But now the right-hand-side of (B.24) is uniformly Lipschitz continuous, and hence ˜ extends to a C 1,1 vector field on O. Similarly the remaining elements of the tetrad ∇ are C 1,1 on O. The vector field ˜ is transverse to I + at p by hypothesis, further ˜ is transverse to W , and the implicit function theorem applied to the map obtained by flowing from W to I + along ˜ provides a diffeomorphism from a neighborhood of ∩ W to a neighborhood of p within I + . This shows in particular that Vρ≡0 contains a neighborhood of p, hence Vρ≡0 is open. Further, every point near p is the end point of a unique element of the congruence generated by , so that Vρ≡0 = Uρ≡0 near p.
One can iterate the regularity argument above as many times as the differentiability of the metric allows, obtaining each time one more degree of differentiability of $\tilde\ell$ which, for smooth conformal boundary extensions, proves smoothness of $\tilde\ell$ near $p$. Since $p \in V_{\rho\equiv 0}$ is arbitrary, Proposition B.3 follows.
References
1. Andersson, L., Chruściel, P.T.: On asymptotic behavior of solutions of the constraint equations in general relativity with “hyperboloidal boundary conditions”. Dissert. Math. 355, 1–100 (1996)
2. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005)
3. Bondi, H., van der Burg, M.G.J., Metzner, A.W.K.: Gravitational waves in general relativity VII: Waves from axi-symmetric isolated systems. Proc. Roy. Soc. London A 269, 21–52 (1962)
4. Choquet-Bruhat, Y., Geroch, R.: Global aspects of the Cauchy problem in general relativity. Commun. Math. Phys. 14, 329–335 (1969)
5. Chruściel, P.T.: Conformal boundary extensions of Lorentzian manifolds. http://arxiv.org/abs/gr-qc/0606101, 2006
6. Chruściel, P.T., Jezierski, J., Kijowski, J.: Hamiltonian field theory in the radiating regime. Lect. Notes in Physics, Vol. 70, Berlin-Heidelberg-New York: Springer, 2001
7. Chruściel, P.T., Jezierski, J., Łęski, S.: The Trautman-Bondi mass of hyperboloidal initial data sets. Adv. Theor. Math. Phys. 8, 83–139 (2004)
8. Chruściel, P.T., Maerten, D., Tod, K.P.: Rigid upper bounds for the angular momentum and centre of mass of non-singular asymptotically anti-de Sitter space-times. JHEP 11, 084 (2006)
9. Debney, G.C., Kerr, R.P., Schild, A.: Solutions of the Einstein and Einstein-Maxwell equations. Jour. Math. Phys. 10, 1842–1854 (1969)
10. Garfinkle, D., Harris, S.G.: Ricci fall-off in static and stationary, globally hyperbolic, non-singular spacetimes. Class. Quantum Grav. 14, 139–151 (1997). http://arxiv.org/abs/gr-qc/9511050
11. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge Monographs on Mathematical Physics, No. 1, Cambridge: Cambridge University Press, 1973
12. Lee, J.M.: Introduction to topological manifolds. Graduate Texts in Mathematics, Vol. 202, New York: Springer-Verlag, 2000
13. Lind, R.W., Newman, E.T.: Complexification of the algebraically special gravitational fields. Jour. Math. Phys. 15, 1103–1112 (1974)
14. Mason, L.J.: The asymptotic structure of algebraically special spacetimes. Class. Quantum Grav. 15, 1019–1030 (1998)
15. Natorf, W., Tafel, J.: Asymptotic flatness and algebraically special metrics. Class. Quantum Grav. 21, 5397–5407 (2004)
16. Newman, E.T., Tod, K.P.: Asymptotically flat space-times. In: General relativity and gravitation, Vol. 2, Held, A. (ed.), New York: Plenum, 1980, pp. 1–36
17. Newman, R.P.A.C.: The global structure of simple space-times. Commun. Math. Phys. 123, 17–52 (1989)
18. Nomizu, K.: On local and global existence of Killing vector fields. Ann. Math. 72, 105–120 (1960)
19. O’Neill, B.: Semi-Riemannian geometry. Pure and Applied Mathematics, Vol. 103, New York: Academic Press, 1983
20. Penrose, R., Rindler, W.: Spinors and spacetime. Cambridge: Cambridge University Press, 1984 and 1989
21. Sachs, R.K.: Gravitational waves in general relativity VIII. Waves in asymptotically flat spacetime. Proc. Roy. Soc. London A 270, 103–126 (1962)
22. Spanier, E.H.: Algebraic topology. New York: Springer-Verlag, 1981
23. Stephani, H., Kramer, D., MacCallum, M., Hoenselaers, C., Herlt, E.: Exact solutions of Einstein’s field equations. 2nd ed., Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 2003. MR MR2003646 (2004h:83017)
24. Stewart, J.M.: The Cauchy problem and the initial boundary value problem in numerical relativity. Class. Quantum Grav. 15, 2865–2889 (1998) (Proc. of Topology of the Universe, Cleveland 1997, Starkman, G.D. (ed.))
25. Tafel, J.: Bondi mass in terms of the Penrose conformal factor. Class. Quantum Grav. 17, 4397–4408 (2000)
26. Trim, D.W., Wainwright, J.: Nonradiative algebraically special spacetimes. Jour. Math. Phys. 15, 535–546 (1974)

Communicated by G. W. Gibbons
Commun. Math. Phys. 285, 31–65 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0522-5
Communications in
Mathematical Physics
Hamiltonian Systems of Hydrodynamic Type in 2 + 1 Dimensions
E. V. Ferapontov (1), A. Moro (1), V. V. Sokolov (2)
1 Department of Mathematical Sciences, Loughborough University, Loughborough, Leicestershire LE11 3TU, United Kingdom. E-mail: [email protected]; [email protected]
2 Landau Institute for Theoretical Physics, Kosygina 2, 119334 Moscow, Russia. E-mail: [email protected]
Received: 10 October 2007 / Accepted: 12 December 2007 Published online: 4 June 2008 – © Springer-Verlag 2008
Abstract: We investigate multi-dimensional Hamiltonian systems associated with constant Poisson brackets of hydrodynamic type. A complete list of two- and three-component integrable Hamiltonians is obtained. All our examples possess dispersionless Lax pairs and an infinity of hydrodynamic reductions.

1. Introduction

Over the past three decades there has been significant progress in the theory of (1 + 1)-dimensional quasilinear systems,
$$u^i_t + v^i_j(u)\, u^j_x = 0, \tag{1}$$
which are representable in the Hamiltonian form $u^i_t + P^{ij} h_j = 0$. Here $h(u)$ is a Hamiltonian density, $h_j = \partial_{u^j} h$, and $P^{ij}$ is a Hamiltonian operator of differential-geometric type,
$$P^{ij} = g^{ij}(u)\frac{d}{dx} + b^{ij}_k(u)\, u^k_x,$$
generated by a metric $g^{ij}$ (assumed non-degenerate) and its Levi-Civita connection $\Gamma^i_{jk}$ via $b^{ij}_k = -g^{is}\Gamma^j_{sk}$. It was demonstrated in [6] that the metric $g^{ij}$ must necessarily be flat, and in the flat coordinates of $g^{ij}$ the operator $P^{ij}$ takes a constant coefficient form $P^{ij} = \epsilon^i \delta^{ij} \frac{d}{dx}$. In the same coordinates, Hamiltonian systems take a Hessian form $u^i_t + \epsilon^i h_{ij} u^j_x = 0$. It was observed that many particularly important examples arising in applications are diagonalizable, that is, reducible to the Riemann invariant form $R^i_t + v^i(R) R^i_x = 0$. We recall that there exists a simple tensor criterion of the diagonalizability
for an arbitrary hyperbolic system (1). Let us first calculate the Nijenhuis tensor of the matrix $v^i_j$,
$$N^i_{jk} = v^p_j\, \partial_{u^p} v^i_k - v^p_k\, \partial_{u^p} v^i_j - v^i_p\big(\partial_{u^j} v^p_k - \partial_{u^k} v^p_j\big), \tag{2}$$
and introduce the Haantjes tensor
$$H^i_{jk} = N^i_{pr} v^p_j v^r_k - N^p_{jr} v^i_p v^r_k - N^p_{rk} v^i_p v^r_j + N^p_{jk} v^i_r v^r_p. \tag{3}$$
It was observed in [16] that a (1, 1)-tensor $v^i_j$ with mutually distinct eigenvalues is diagonalizable if and only if the corresponding Haantjes tensor $H$ is identically zero. As demonstrated by Tsarev, a combination of the diagonalizability with the Hamiltonian property implies the integrability: all diagonalizable Hamiltonian systems possess an infinity of conservation laws and commuting flows, and can be solved by the generalized hodograph transform. We refer to [6,27] for further discussion and references.

The aim of our paper is to generalize this approach to (2+1)-dimensional Hamiltonian systems
$$\mathbf{u}_t + A(\mathbf{u})\mathbf{u}_x + B(\mathbf{u})\mathbf{u}_y = 0, \tag{4}$$
which are representable in the form $\mathbf{u}_t + P h_{\mathbf{u}} = 0$ where $h(\mathbf{u})$ is a Hamiltonian density, and $P$ is a two-dimensional Hamiltonian operator of differential-geometric type,
$$P^{ij} = g^{ij}(u)\frac{d}{dx} + b^{ij}_k(u)\, u^k_x + \tilde g^{ij}(u)\frac{d}{dy} + \tilde b^{ij}_k(u)\, u^k_y;$$
such operators are generated by a pair of metrics $g^{ij}$, $\tilde g^{ij}$ and the corresponding Levi-Civita connections $\Gamma^i_{jk}$, $\tilde\Gamma^i_{jk}$ via $b^{ij}_k = -g^{is}\Gamma^j_{sk}$, $\tilde b^{ij}_k = -\tilde g^{is}\tilde\Gamma^j_{sk}$. The theory of multi-dimensional Poisson brackets was constructed in [6,19,20]. The main difference from the one-dimensional situation is that, although both metrics $g^{ij}$ and $\tilde g^{ij}$ must necessarily be flat, they can no longer be reduced to a constant coefficient form simultaneously: there exist obstruction tensors. The obstruction tensors are known to vanish if either one of the metrics is positive definite, or a pair of metrics is non-singular in the sense of [20], that is, the mutual eigenvalues of $g^{ij}$ and $\tilde g^{ij}$ are distinct. In both cases, the operator $P^{ij}$ can be transformed to a constant coefficient form. In the two-component situation any non-singular Hamiltonian operator can be cast into a canonical form
$$P = \begin{pmatrix} d/dx & 0 \\ 0 & d/dy \end{pmatrix}$$
by an appropriate linear change of the independent variables $x, y$. The corresponding Hamiltonian systems take the form
$$u^1_t + (h_1)_x = 0, \qquad u^2_t + (h_2)_y = 0. \tag{5}$$
The ‘simplest’ non-trivial integrable Hamiltonian density is $h(u^1, u^2) = u^1 u^2 - \frac{1}{6}(u^1)^3$ (we point out that, up to certain natural equivalence, there exist no other integrable densities which are polynomial in $u^1, u^2$). The corresponding equations (5) take the form
$$u^1_t - u^1 u^1_x + u^2_x = 0, \qquad u^2_t + u^1_y = 0,$$
see Sect. 4.1. This system appears in the context of the genus zero universal Whitham hierarchy [17,18]. Setting $u^1 = -\varphi_{xt}$, $u^2 = \varphi_{xy}$ one obtains a second order PDE,
$$\varphi_{tt} - \varphi_{xy} + \tfrac{1}{2}\varphi_{xt}^2 = 0,$$
which is one of the Hirota equations of the dispersionless Toda hierarchy [9]. The same equation appeared in [22] in the classification of integrable Egorov’s hydrodynamic chains. Other examples of integrable Hamiltonian densities expressible in elementary functions include
$$h(u^1, u^2) = \tfrac{1}{2}(u^1 - u^2)^2 + e^{u^2}, \qquad h(u^1, u^2) = u^2\sqrt{u^1} + \alpha (u^1)^{5/2}, \qquad h(u^1, u^2) = (u^1 u^2)^{2/3},$$
etc. The problem of classification of integrable two-component Hamiltonian systems (5) was first addressed in [10] based on the method of hydrodynamic reductions. We recall that a multi-dimensional quasilinear system (4) is said to be integrable if it possesses an infinity of n-component hydrodynamic reductions parametrized by n arbitrary functions of a single variable (see Sect. 2 for more details). It was demonstrated in [10] that this requirement imposes strong restrictions on the corresponding Hamiltonian density $h(u^1, u^2)$. In Sect. 4 we provide a complete list of integrable Hamiltonian densities (Theorem 1), as well as the associated dispersionless Lax pairs (Sect. 4.1). The ‘generic’ density is expressed in terms of the Weierstrass elliptic functions.

In the three-component situation we consider Hamiltonian operators of the form
$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\frac{d}{dx} + \begin{pmatrix} \lambda^1 & 0 & 0 \\ 0 & \lambda^2 & 0 \\ 0 & 0 & \lambda^3 \end{pmatrix}\frac{d}{dy}, \tag{6}$$
here $\lambda^i$ are constant and pairwise distinct; the corresponding Hamiltonian systems are
$$u^i_t + (h_i)_x + \lambda^i (h_i)_y = 0. \tag{7}$$
There is a new phenomenon arising in the multi-component case: it was observed in [12] that the necessary condition for integrability of an n-component quasilinear system (4) is the vanishing of the Haantjes tensor for an arbitrary matrix of the form $(\alpha A + \beta B + \gamma I_n)^{-1}(\tilde\alpha A + \tilde\beta B + \tilde\gamma I_n)$. In fact, it is sufficient to require the vanishing of the Haantjes tensor for a two-parameter family $(kA + I_n)^{-1}(lB + I_n)$. We point out that in the two-component case the Haantjes tensor vanishes automatically. On the contrary, in the multi-component situation the vanishing of the Haantjes tensor is a very strong restriction. Systems with this property will be called ‘diagonalizable’ (we would like to stress that matrices $A$ and $B$ do not commute in general, and cannot be diagonalized simultaneously). In Sect. 5 we obtain a complete list of diagonalizable three-component Hamiltonian systems (7) (Theorem 3). It turns out that in this case the diagonalizability conditions are very restrictive, and imply the integrability. For technical reasons, the classification results take a much simpler form when expressed in terms of the Legendre transform $H$ of the Hamiltonian density $h$, rather than the Hamiltonian density $h$ itself (recall that $H = u^i h_i - h$, $H_i = u^i$, $u_i = h_i$; we use variables $u_i$ with lower indices for the arguments of $H$). We demonstrate
that the Legendre transform $H$ of the ‘generic’ integrable Hamiltonian density $h$ is given by the formula
$$H = \sum_{j \neq i} \frac{\lambda^i - \lambda^j}{a_i^2 a_j^2}\, V(a_i u_i, a_j u_j),$$
where
$$V(x, y) = Z(x - y) + \epsilon Z(x - \epsilon y) + \epsilon^2 Z(x - \epsilon^2 y);$$
here $a_i$ are arbitrary constants, $\epsilon = e^{2\pi i/3}$, and $Z'' = \zeta$ where $\zeta$ is the Weierstrass zeta-function: $\zeta' = -\wp$, $(\wp')^2 = 4\wp^3 - g_3$. Notice that we are dealing with an incomplete elliptic curve, $g_2 = 0$, and that the expression for $V$ is real. The above formula for $H$ has a natural multi-component extension, which is also integrable. This formula possesses a number of remarkable degenerations which are listed in Theorems 1 and 3. In particular, one has
$$H = \sum_{j \neq i} \frac{\lambda^i - \lambda^j}{a_i^2 a_j^2}\, (a_i u_i - a_j u_j) \ln(a_i u_i - a_j u_j).$$
We prove that all examples appearing in the classification possess dispersionless Lax pairs and an infinity of hydrodynamic reductions (Theorems 4 and 5 in Sect. 5.1 and 5.2). It is important to stress that, in 1+1 dimensions, integrable Hamiltonians are parametrized by $n(n-1)/2$ arbitrary functions of two variables. On the contrary, in 2 + 1 dimensions, the moduli spaces of integrable Hamiltonians are finite-dimensional. Furthermore, the results of Sect. 6 (Theorems 6 and 7) make it tempting to conjecture that there exist no non-trivial integrable Hamiltonian systems of hydrodynamic type in 3 + 1 dimensions. The analysis of the integrability conditions is considerably simplified after a transformation of a given Hamiltonian system into the so-called Godunov, or symmetric, form. This construction is briefly reviewed in Sect. 3. The necessary information on hydrodynamic reductions and dispersionless Lax pairs is summarized in Sect. 2.

2. Hydrodynamic Reductions and Dispersionless Lax Pairs

Applied to a (2 + 1)-dimensional system (4), the method of hydrodynamic reductions consists of seeking multi-phase solutions in the form
$$\mathbf{u}(x, y, t) = \mathbf{u}\big(R^1(x, y, t), \ldots, R^n(x, y, t)\big),$$
where the ‘phases’ $R^i(x, y, t)$ are required to satisfy a pair of (1+1)-dimensional systems of hydrodynamic type,
$$R^i_t = \nu^i(R)\, R^i_y, \qquad R^i_x = \mu^i(R)\, R^i_y.$$
Solutions of this form, known as ‘non-linear interactions of n planar simple waves’ [4,24,25], have been extensively discussed in gas dynamics; later, they reappeared in the context of the dispersionless KP hierarchy, see [14,15] and references therein. Technically, one ‘decouples’ a (2 + 1)-dimensional system (4) into a pair of commuting
n-component (1 + 1)-dimensional systems. Substituting the ansatz $\mathbf{u}(R^1, \ldots, R^n)$ into (4) one obtains
$$(\nu^i I_n + \mu^i A + B)\, \partial_i \mathbf{u} = 0, \qquad i = 1, \ldots, n, \tag{8}$$
$\partial_i = \partial/\partial R^i$, implying that both characteristic speeds $\nu^i$ and $\mu^i$ satisfy the dispersion relation
$$\det(\nu I_n + \mu A + B) = 0, \tag{9}$$
which defines an algebraic curve of degree n on the $(\nu, \mu)$-plane. Moreover, $\nu^i$ and $\mu^i$ have to satisfy the commutativity conditions
$$\frac{\partial_j \nu^i}{\nu^j - \nu^i} = \frac{\partial_j \mu^i}{\mu^j - \mu^i}, \tag{10}$$
$i \neq j$, see [27]. It was observed in [10] that the requirement of the existence of ‘sufficiently many’ hydrodynamic reductions imposes strong restrictions on the system (4), and provides an efficient classification criterion. To be precise, we will call a system (4) integrable if, for any n, it possesses infinitely many n-component hydrodynamic reductions parametrized by n arbitrary functions of a single variable. Thus, integrable systems are required to possess an infinity of n-phase solutions which can be viewed as natural dispersionless analogs of algebro-geometric solutions of soliton equations.

We recall that a system (4) is said to possess a dispersionless Lax pair
$$\psi_t = f(\mathbf{u}, \psi_y), \qquad \psi_x = g(\mathbf{u}, \psi_y), \tag{11}$$
if it can be recovered from the consistency condition $\psi_{xt} = \psi_{tx}$ (we point out that the dependence of $f$ and $g$ on $\psi_y$ is generally non-linear). Lax pairs of this type first appeared in the construction of the universal Whitham hierarchy, see [17] and references therein. It was observed in [28] that such non-linear Lax pairs arise from the usual ‘solitonic’ Lax pairs in the dispersionless limit, and the cases of polynomial/rational dependence of $f$ and $g$ on $\psi_y$ were investigated. In particular, a Hamiltonian formulation of such systems was uncovered, requiring a non-local Hamiltonian density. It was demonstrated in [10,13] that, for a number of particularly interesting classes of systems, the existence of a dispersionless Lax pair is equivalent to the existence of hydrodynamic reductions and, thus, to the integrability. Setting $\psi_y = p$ and calculating the consistency condition $\psi_{xt} = \psi_{tx}$ by virtue of (4), one arrives at the following relations for $f(\mathbf{u}, p)$ and $g(\mathbf{u}, p)$:
$$\operatorname{grad} f + \operatorname{grad} g\, A = 0, \qquad \operatorname{grad} g\, (f_p I_n + g_p A + B) = 0; \tag{12}$$
here grad is the gradient with respect to $\mathbf{u}$. In particular, this shows that $f_p$ and $g_p$ satisfy the dispersion relation (9), and the vector $\operatorname{grad} g$ belongs to the left characteristic cone of the system (4). Thus, as $p$ varies, the equations $\nu = f_p$, $\mu = g_p$ parametrize the dispersion curve (9), while $\operatorname{grad} g$ parametrizes the left characteristic cone. Throughout this paper we assume that the dispersion relation (9) defines an irreducible algebraic curve. This condition is satisfied for most examples discussed in the literature so far.
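The computation behind (12) is short, and a sketch of it may be helpful. The block below is our own reconstruction from (4) and (11), under the assumption that $\mathbf{u}_x$ and $\mathbf{u}_y$ may be treated as independent; it is not quoted from the original derivation.

```latex
% Sketch: consistency of the Lax pair (11), with p = \psi_y, by virtue of (4).
\begin{align*}
\psi_{tx} &= \operatorname{grad} f \cdot \mathbf{u}_x + f_p\, p_x, &
p_x &= \psi_{xy} = \operatorname{grad} g \cdot \mathbf{u}_y + g_p\, p_y, \\
\psi_{xt} &= \operatorname{grad} g \cdot \mathbf{u}_t + g_p\, p_t, &
p_t &= \psi_{ty} = \operatorname{grad} f \cdot \mathbf{u}_y + f_p\, p_y .
\end{align*}
% The f_p g_p p_y terms cancel. Substituting u_t = -A u_x - B u_y from (4):
\begin{align*}
0 = \psi_{tx} - \psi_{xt}
  = (\operatorname{grad} f + \operatorname{grad} g\, A)\, \mathbf{u}_x
  + (f_p \operatorname{grad} g - g_p \operatorname{grad} f
     + \operatorname{grad} g\, B)\, \mathbf{u}_y .
\end{align*}
% The u_x coefficient gives grad f + grad g A = 0. Using it to replace
% -g_p grad f by g_p grad g A in the u_y coefficient gives
% grad g (f_p I_n + g_p A + B) = 0, which is the second relation in (12).
```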
3. Transformation of a Hamiltonian System into Godunov’s Form

Recall that a system of hydrodynamic type (4) is said to be symmetrizable, or reducible to Godunov’s form [8], if it possesses a conservative representation of the form
$$(\partial_{u_i} p)_t + (\partial_{u_i} q)_x + (\partial_{u_i} r)_y = 0;$$
here the potentials $p$, $q$ and $r$ are certain functions of $u$. Any such system possesses an extra conservation law
$$L(p)_t + L(q)_x + L(r)_y = 0,$$
where $L$ denotes Legendre’s transform. Equations in Godunov’s form play an important role in the general theory of multi-dimensional hyperbolic conservation laws [5]. Given a Hamiltonian system (7) we perform the Legendre transform, $H = L(h) = u^i h_i - h$, $H_i = u^i$, $u_i = h_i$, to obtain a system in Godunov’s form,
$$(H_i)_t + (u_i)_x + \lambda^i (u_i)_y = 0,$$
which corresponds to the choice $p = H$, $q = \sum u_i^2/2$, $r = \sum \lambda^i u_i^2/2$. We assume that the Legendre transform is well-defined, that is, all partial derivatives $h_i$ are functionally independent. This condition is equivalent to the requirement that the Hessian matrix of $h$ is non-degenerate, which is automatically satisfied under the assumption of the irreducibility of the dispersion relation. It turns out that the integrability conditions take a much simpler form when represented in terms of the Legendre transform $H = L(h)$, rather than the Hamiltonian density $h$ itself. Thus, in what follows we will work with systems represented in Godunov’s form (to make the equations look formally ‘evolutionary’ we will relabel the independent variables as $x, y, t \to T, X, Y$). This results in
$$(u_i)_T + \lambda^i (u_i)_X + (H_i)_Y = 0; \tag{13}$$
systems of this type can be viewed as describing n linear waves (traveling with constant speeds $\lambda^i$ in the $X, T$-plane) which are non-linearly coupled in the $Y$-direction.

4. Integrable Hamiltonians in 2 + 1 Dimensions: Two-Component Case

In this section we classify two-component Hamiltonian systems (5). The corresponding Legendre transform is
$$v_T + (H_v)_Y = 0, \qquad w_X + (H_w)_Y = 0; \tag{14}$$
here $v = u_1$, $w = u_2$. We point out that this case was addressed previously in [10], although the classification was only sketched. Here we provide a complete list of integrable potentials $H(v, w)$, and calculate the corresponding dispersionless Lax pairs. For systems (14) the integrability conditions constitute an over-determined system of fourth order PDEs for the potential $H(v, w)$:
$$\begin{aligned}
H_{vw} H_{vvvv} &= 2 H_{vvv} H_{vvw}, \\
H_{vw} H_{vvvw} &= 2 H_{vvv} H_{vww}, \\
H_{vw} H_{vvww} &= H_{vvw} H_{vww} + H_{vvv} H_{www}, \\
H_{vw} H_{vwww} &= 2 H_{vvw} H_{www}, \\
H_{vw} H_{wwww} &= 2 H_{vww} H_{www}.
\end{aligned} \tag{15}$$
The system (15) is in involution, and its solution space is 10-dimensional [10]. We point out that the transformations
$$v \to av + b, \qquad w \to cw + d, \qquad H \to \alpha H + \beta v^2 + \gamma w^2 + \mu v + \nu w + \delta$$
generate a 10-dimensional group of Lie-point symmetries of the system (15). These transformations correspond to obvious linear changes of the independent variables $X, Y, T$ in the equations (14). One can show that the action of the symmetry group on the moduli space of solutions of the system (15) possesses an open orbit. The classification of integrable potentials $H(v, w)$ will be performed up to this equivalence. Moreover, we will not be interested in the potentials which are either quadratic in $v, w$ and generate linear systems (14), or separable potentials of the form $f(v) + g(w)$ giving rise to reducible systems. Our main result is the following complete list of integrable potentials:

Theorem 1. The ‘generic’ solution of the system (15) is given by the formula
$$H(v, w) = Z(v + w) + \epsilon Z(v + \epsilon w) + \epsilon^2 Z(v + \epsilon^2 w); \tag{16}$$
here $\epsilon = e^{2\pi i/3}$ and $Z''(s) = \zeta(s)$, where $\zeta$ is the Weierstrass zeta-function: $\zeta' = -\wp$, $(\wp')^2 = 4\wp^3 - g_3$. Degenerations of this solution correspond to
$$H(v, w) = \tfrac{1}{2} v^2 \zeta(w), \tag{17}$$
$$H(v, w) = (v + w)\ln(v + w), \tag{18}$$
as well as the following polynomial potentials:
$$H(v, w) = v^2 w^2, \tag{19}$$
$$H(v, w) = v w^2 + \frac{\alpha}{5} w^5, \qquad \alpha = \text{const}, \tag{20}$$
and
$$H(v, w) = v w + \frac{1}{6} w^3. \tag{21}$$
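That the polynomial potentials (19)–(21) solve the system (15) can be confirmed by direct differentiation. The following is a minimal sketch of such a check in exact rational arithmetic; the dictionary representation of polynomials and the helper names (`diff`, `D`, `mul`, `combine`, `satisfies_15`) are our own illustrative choices, and the value of α is an arbitrary sample.

```python
from fractions import Fraction

# A polynomial in (v, w) is stored as {(i, j): coeff}, meaning coeff * v**i * w**j.

def diff(poly, var):
    """Partial derivative with respect to v (var='v') or w (var='w')."""
    out = {}
    for (i, j), c in poly.items():
        power = i if var == "v" else j
        if power == 0:
            continue
        key = (i - 1, j) if var == "v" else (i, j - 1)
        out[key] = out.get(key, 0) + c * power
    return out

def D(poly, word):
    """Successive partial derivatives, e.g. D(H, 'vvw') is H_vvw."""
    for var in word:
        poly = diff(poly, var)
    return poly

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return out

def combine(p, q, factor):
    """Return p + factor * q with vanishing terms removed."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + factor * c
    return {k: c for k, c in out.items() if c != 0}

def residuals(H):
    """Left-hand side minus right-hand side for the five equations (15)."""
    Hvw = D(H, "vw")
    r1 = combine(mul(Hvw, D(H, "vvvv")), mul(D(H, "vvv"), D(H, "vvw")), -2)
    r2 = combine(mul(Hvw, D(H, "vvvw")), mul(D(H, "vvv"), D(H, "vww")), -2)
    rhs3 = combine(mul(D(H, "vvw"), D(H, "vww")), mul(D(H, "vvv"), D(H, "www")), 1)
    r3 = combine(mul(Hvw, D(H, "vvww")), rhs3, -1)
    r4 = combine(mul(Hvw, D(H, "vwww")), mul(D(H, "vvw"), D(H, "www")), -2)
    r5 = combine(mul(Hvw, D(H, "wwww")), mul(D(H, "vww"), D(H, "www")), -2)
    return [r1, r2, r3, r4, r5]

def satisfies_15(H):
    """True if all five residuals vanish identically."""
    return all(r == {} for r in residuals(H))

alpha = Fraction(3)  # arbitrary sample value of the constant in (20)
potentials = {
    "(19)": {(2, 2): Fraction(1)},
    "(20)": {(1, 2): Fraction(1), (0, 5): alpha / 5},
    "(21)": {(1, 1): Fraction(1), (0, 3): Fraction(1, 6)},
}
for label, H in potentials.items():
    print(label, satisfies_15(H))
```

A polynomial outside the list, for instance $H = v w^3$, fails the fourth equation of (15), which gives a quick sanity check that the test is not vacuous.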
Remark. The ‘elliptic’ examples (16) and (17) possess a specialization $g_3 = 0$: $\wp(w) \to 1/w^2$, $\zeta(w) \to 1/w$, $\sigma(w) \to w$, etc. This results in the potentials
$$H(v, w) = (v + w)\log(v + w) + \epsilon (v + \epsilon w)\log(v + \epsilon w) + \epsilon^2 (v + \epsilon^2 w)\log(v + \epsilon^2 w) \tag{22}$$
and
$$H(v, w) = \frac{v^2}{2w},$$
respectively. Dispersionless Lax pairs for Eqs. (14) corresponding to the potentials (16)–(21) are calculated in Sect. 4.1.

Proof of Theorem 1. The system (15) can be solved as follows. The first two equations imply that $H_{vvv}/H_{vw}^2 = \text{const}$. Similarly, the last two equations imply $H_{www}/H_{vw}^2 = \text{const}$. Setting $H_{vw} = e$ (here $e = e(v, w)$ denotes a function, not Euler’s number) one can parametrise the third order derivatives of $H$ in the form
$$H_{vvv} = \tfrac{1}{2} m e^2, \qquad H_{vvw} = e_v, \qquad H_{vww} = e_w, \qquad H_{www} = \tfrac{1}{2} n e^2, \tag{23}$$
here $m, n$ are arbitrary constants. The compatibility conditions for these equations, plus Eq. (15)$_3$, result in the following overdetermined system for $e$:
$$(\ln e)_{vw} = \frac{mn}{4} e^2, \qquad e_{vv} = m e e_w, \qquad e_{ww} = n e e_v. \tag{24}$$
The general solution of the first (Liouville) equation has the form
$$e^2 = \frac{4}{mn}\, \frac{p'(v)\, q'(w)}{(p(v) + q(w))^2}; \tag{25}$$
one has to consider separately the case $e = \text{const}$ (up to the equivalence transformations, this results in the potential (21)), as well as the case when $e$ depends on one variable only, say, on $w$ (this leads to the potential (20)). Let us assume that both constants $m$ and $n$ are nonzero (the cases when either of them vanishes will be discussed later). By scaling $v$ and $w$ one can assume $m = n = 1$. Setting
$$(p')^3 = P^2(p), \qquad (q')^3 = Q^2(q), \tag{26}$$
(here $P(p)$ and $Q(q)$ are functions to be determined), one obtains from the last two equations (24) the following functional-differential equations for $P$ and $Q$:
$$P''(p + q)^2 - 4P'(p + q) + 6P = 2Q'(p + q) - 6Q,$$
$$Q''(p + q)^2 - 4Q'(p + q) + 6Q = 2P'(p + q) - 6P;$$
these equations imply that both $P$ and $Q$ are cubic polynomials in $p$ and $q$,
$$P = ap^3 + bp^2 + cp + d, \qquad Q = aq^3 - bq^2 + cq - d,$$
where $a, b, c, d$ are arbitrary constants. Notice that the right hand side of (25) possesses the following $SL(2, \mathbb{R})$-invariance,
$$p \to \frac{\alpha p - \beta}{\gamma p + \delta}, \qquad q \to -\frac{\alpha q + \beta}{\gamma q - \delta},$$
which can be used to bring the polynomials $P(p)$ and $Q(q)$ to canonical forms. There are three cases to consider.

Three distinct roots. In this case one can reduce both $P(p)$ and $Q(q)$ to quadratics, so that the ODEs (26) assume the form
$$(p')^3 = \frac{27}{2}(p^2 + g_3)^2 \quad \text{and} \quad (q')^3 = \frac{27}{2}(q^2 + g_3)^2,$$
respectively. Thus, $p = \wp'(v)$, $q = \wp'(w)$, where $\wp$ is the Weierstrass $\wp$-function: $(\wp')^2 = 4\wp^3 - g_3$ (we point out that the value of $g_3$ is not really essential, and can be normalized to $\pm 1$). Setting
$$H_{vw} = e = -\frac{12\wp(v)\wp(w)}{\wp'(v) + \wp'(w)}$$
and integrating (23) with respect to $v$ and $w$ we obtain
$$H_{vv} = -6\zeta(w) - \frac{12\wp^2(w)}{\wp'(v) + \wp'(w)}, \qquad H_{vw} = -\frac{12\wp(v)\wp(w)}{\wp'(v) + \wp'(w)}, \qquad H_{ww} = -6\zeta(v) - \frac{12\wp^2(v)}{\wp'(v) + \wp'(w)}, \tag{27}$$
here the zeta-function is defined as $\zeta' = -\wp$. Since the $\wp$-function on the elliptic curve $y^2 = 4x^3 - g_3$ satisfies the automorphic property $\wp(\epsilon z) = \epsilon \wp(z)$, $\epsilon^3 = 1$, one can rewrite (27) in the following equivalent form:
$$\begin{aligned}
H_{vv} &= -2\left[\zeta(v + w) + \epsilon \zeta(v + \epsilon w) + \epsilon^2 \zeta(v + \epsilon^2 w)\right], \\
H_{vw} &= -2\left[\zeta(v + w) + \epsilon^2 \zeta(v + \epsilon w) + \epsilon \zeta(v + \epsilon^2 w)\right], \\
H_{ww} &= -2\left[\zeta(v + w) + \zeta(v + \epsilon w) + \zeta(v + \epsilon^2 w)\right].
\end{aligned}$$
Up to a constant multiple, these formulae give rise to (16).

Double root. In this case both $P(p)$ and $Q(q)$ can be reduced to $p$ and $q$, so that the ODEs (26) take the form $(p')^3 = 27p^2$ and $(q')^3 = 27q^2$, respectively. This leads to $p = v^3$, $q = w^3$, and a straightforward integration of (23) gives
$$H_{vv} = -\frac{6w^2}{v^3 + w^3}, \qquad H_{vw} = \frac{6vw}{v^3 + w^3}, \qquad H_{ww} = -\frac{6v^2}{v^3 + w^3};$$
notice that these formulae can be obtained as a degeneration of (27) corresponding to $g_3 = 0$. Up to a constant multiple, this leads to the potential (22).

Triple root. In this case both $P(p)$ and $Q(q)$ can be reduced to constants, so that the ODEs (26) take the form $(p')^3 = 1$ and $(q')^3 = 1$, respectively. This leads to $e = 2/(v + w)$, which, up to a constant multiple, results in the potential (18).

If $m = 0$, $n \neq 0$ (without any loss of generality we will again set $n = 1$), Eqs. (24) can be solved in the form $e = 6v\wp(w)$, where $\wp$ is the Weierstrass $\wp$-function: $(\wp')^2 = 4\wp^3 - g_3$. The corresponding potential $H$ is given by $H = -3v^2\zeta(w)$. Up to a multiple, this is the case (17). In the simplest case $m = n = 0$, Eqs. (24) imply $e = (\alpha v + \beta)(\gamma w + \delta)$, and the elementary integration of Eqs. (23) results in
$$H(v, w) = \left(\tfrac{1}{2}\alpha v^2 + \beta v\right)\left(\tfrac{1}{2}\gamma w^2 + \delta w\right);$$
here $\alpha, \beta, \gamma, \delta$ are arbitrary constants. Using the equivalence transformations one can reduce $H$ to either $H = v^2 w^2$ (both $\alpha$ and $\gamma$ are nonzero) or $H = v w^2$ ($\alpha = 0$). These are the polynomial cases (19) and a subcase of (20), respectively. This finishes the proof of Theorem 1.
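The claim in the proof that the cubics $P = ap^3 + bp^2 + cp + d$ and $Q = aq^3 - bq^2 + cq - d$ solve the pair of functional-differential equations for $P$ and $Q$ can be checked by substitution. Below is a minimal sketch in exact integer arithmetic; the helper names (`cubic`, `residuals`, `check`) and the sample coefficients are our own, and the check covers only this direction of the statement (that such cubics are solutions), not that they are the only ones.

```python
def cubic(a, b, c, d):
    """Return x -> a x^3 + b x^2 + c x + d together with its first two derivatives."""
    return (
        lambda x: a * x**3 + b * x**2 + c * x + d,
        lambda x: 3 * a * x**2 + 2 * b * x + c,
        lambda x: 6 * a * x + 2 * b,
    )

def residuals(P_coeffs, Q_coeffs, p, q):
    """Left minus right-hand sides of the two functional-differential equations."""
    P, dP, d2P = cubic(*P_coeffs)
    Q, dQ, d2Q = cubic(*Q_coeffs)
    s = p + q
    r1 = d2P(p) * s**2 - 4 * dP(p) * s + 6 * P(p) - (2 * dQ(q) * s - 6 * Q(q))
    r2 = d2Q(q) * s**2 - 4 * dQ(q) * s + 6 * Q(q) - (2 * dP(p) * s - 6 * P(p))
    return r1, r2

def check(a, b, c, d, points):
    """True if P = (a, b, c, d), Q = (a, -b, c, -d) solve both equations at all points."""
    return all(
        residuals((a, b, c, d), (a, -b, c, -d), p, q) == (0, 0) for p, q in points
    )

# Integer grid of sample values for (p, q); the coefficients are arbitrary.
points = [(p, q) for p in range(-3, 4) for q in range(-3, 4)]
print(check(2, -5, 7, 11, points))
```

The alternating signs in $Q$ matter: taking $Q$ with the same coefficients as $P$ already leaves a non-zero residual at $p = q = 1$ for $(a, b, c, d) = (1, 1, 0, 0)$.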
4.1. Dispersionless Lax pairs. In this section we calculate dispersionless Lax pairs for systems (14) corresponding to the potentials (16)–(21) of Theorem 1. We point out that, in spite of the deceptive simplicity of some of these potentials, the corresponding Lax pairs are quite non-trivial.

Potential (21). The corresponding system (14) takes the form
$$v_T + w_Y = 0, \qquad w_X + w w_Y + v_Y = 0; \tag{28}$$
it arises in the genus zero case of the universal Whitham hierarchy [17,18]. This system possesses the Lax pair
$$\psi_T = \tfrac{1}{2}\ln(\psi_Y + w/2), \qquad \psi_X = \psi_Y^2 + v/2.$$
A simple calculation shows that the Legendre transform of the potential $H(v, w) = vw + \frac{1}{6}w^3$, defined by the formulae $u^1 = H_v$, $u^2 = H_w$, $h(u^1, u^2) = vH_v + wH_w - H$, is also polynomial:
$$h(u^1, u^2) = u^1 u^2 - \tfrac{1}{6}(u^1)^3.$$
We point out that all other examples of integrable potentials $H(v, w)$ produce non-polynomial Hamiltonian densities $h(u^1, u^2)$.

Potential (20). The corresponding system (14) takes the form
$$v_T + (w^2)_Y = 0, \qquad w_X + 2(vw)_Y + \alpha (w^4)_Y = 0. \tag{29}$$
For $\alpha = 0$ it possesses the Lax pair
$$\psi_T = -\frac{w^2}{2\psi_Y^2}, \qquad \psi_X = \psi_Y^4 - 2v\psi_Y.$$
Setting $v = u_Y$, $w^2 = -u_T$ one can rewrite (29) (when $\alpha = 0$) as a single second order PDE
$$u_{XT} + 2u_Y u_{TY} + 4u_T u_{YY} = 0.$$
Up to a rescaling $X \to -2X$ this equation is a particular case of the generalized dispersionless Harry Dym equation [1,23]. For $\alpha \neq 0$ the Lax pair modifies to
$$\psi_T = f\!\left(\frac{w}{\psi_Y}\right), \qquad \psi_X = \psi_Y^4 - 2v\psi_Y,$$
where the function $f(s)$ satisfies the equation $f'(s) = -s/(\alpha s^3 + 1)$ (for $\alpha = 0$ one recovers the previous formula). The first equation of this Lax pair appeared in [23] as a generating function of conservation laws for the Kupershmidt hydrodynamic chain. Without any loss of generality one can set $\alpha = -1$, which gives
$$f(s) = \frac{1}{3}\left[\ln(s - 1) + \epsilon^2 \ln(s - \epsilon) + \epsilon \ln(s - \epsilon^2)\right], \qquad \epsilon^3 = 1.$$
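The closed form of $f(s)$ for $\alpha = -1$ amounts to a partial fraction decomposition of $-s/(\alpha s^3 + 1) = s/(s^3 - 1)$ over the cube roots of unity. A minimal numerical sketch of this identity follows; the function names are our own, and the derivative of the logarithmic expression is written out by hand rather than computed symbolically.

```python
import cmath

# Primitive cube root of unity, epsilon = exp(2*pi*i/3).
EPS = cmath.exp(2j * cmath.pi / 3)

def f_prime_closed_form(s):
    """Derivative of (1/3)[ln(s-1) + eps^2 ln(s-eps) + eps ln(s-eps^2)]."""
    return (1 / (s - 1) + EPS**2 / (s - EPS) + EPS / (s - EPS**2)) / 3

def f_prime_required(s, alpha=-1):
    """Right-hand side of the ODE f'(s) = -s / (alpha s^3 + 1)."""
    return -s / (alpha * s**3 + 1)

# Sample points away from the poles s = 1, eps, eps^2.
for s in (2.0, 0.3, -1.7, 0.5 + 0.2j):
    print(s, abs(f_prime_closed_form(s) - f_prime_required(s)))
```

At $s = 2$, for example, both expressions equal $2/7$, and the printed differences are at the level of floating-point rounding.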
Potential (19). The corresponding system (14) takes the form
$$v_T + 2(vw^2)_Y = 0, \qquad w_X + 2(v^2 w)_Y = 0. \tag{30}$$
It possesses the Lax pair
$$\psi_T = w^2 a(\psi_Y), \qquad \psi_X = -v^2 b(\psi_Y),$$
where the dependence of $a$ and $b$ on $\psi_Y \equiv \xi$ is governed by the ODEs
$$a' = -4\frac{a}{b} - 2, \qquad b' = 4\frac{b}{a} + 2.$$
To solve these equations we proceed as follows. Expressing $b$ from the first equation, $b = -4a/(a' + 2)$, and substituting into the second one arrives at a second order ODE $2aa'' - 3(a')^2 + 12 = 0$. It can be integrated once, $(a')^2 = 4ca^3 + 4$, where $c$ is a constant of integration. Without any loss of generality we will set $c = 1$. Thus, $a$ is the Weierstrass $\wp$-function: $a = \wp(\xi; 0, -4) = \wp(\xi)$. The corresponding $b$ is given by $b = -4\wp/(\wp' + 2)$. Notice that this expression for $b$ equals $\wp(\xi + c)$, where $c$ is the zero of the $\wp$-function such that $\wp(c) = 0$, $\wp'(c) = 2$ (use the addition theorem to calculate $\wp(\xi + c)$). Ultimately, we obtain the Lax pair
$$\psi_T = w^2 \wp(\psi_Y), \qquad \psi_X = -v^2 \wp(\psi_Y + c).$$
Setting $V = v^2$, $W = w^2$ one can rewrite (30) in the form where the non-linearity is quadratic:
$$V_T + 2W V_Y + 4V W_Y = 0, \qquad W_X + 2V W_Y + 4W V_Y = 0.$$

Potential (18). The corresponding system (14) takes the form
$$v_T + \frac{v_Y + w_Y}{v + w} = 0, \qquad w_X + \frac{v_Y + w_Y}{v + w} = 0.$$
It possesses the Lax pair ψT = − ln(w + ψY ), ψ X = ln(v − ψY ). This system also arises in the genus zero case of the universal Whitham hierarchy [17,18]; its dispersionful analogue was constructed in [26]. Potential (17). The corresponding system (14) takes the form 1 vT + ζ (w)vY − v℘ (w)wY = 0, w X − ℘ (w)vvY − v 2 ℘ (w)wY = 0. 2 One can show that it possesses the Lax pair 1 ψT = − f (w, ψY ), ψ X = − v 2 b(ψY ) 2 where, setting ψY ≡ ξ , the function f (w, ξ ) has to satisfy the equations fw =
2b(ξ )℘ (w) , b (ξ ) + ℘ (w)
f ξ = ζ (w) +
2℘ 2 (w) b (ξ ) + ℘ (w)
.
E. V. Ferapontov, A. Moro, V. V. Sokolov
We point out that the consistency condition $f_{\xi w} = f_{w\xi}$ implies a second order ODE $2bb'' - 3(b')^2 - 3g_3 = 0$ which, upon integration, gives $(b'(\xi))^2 = 4b^3(\xi) - g_3$ (the constant of integration is not essential). Thus, one can set $b = \wp(\xi)$ so that the equations for $f$ take the form
$$f_w = \frac{2\wp(\xi)\wp(w)}{\wp'(\xi) + \wp'(w)}, \qquad f_\xi = \zeta(w) + \frac{2\wp^2(w)}{\wp'(\xi) + \wp'(w)};$$
compare with (27)! Thus,
$$f(w, \xi) = \frac{1}{3}\ln\sigma(\xi + w) + \frac{\epsilon}{3}\ln\sigma(\xi + \epsilon w) + \frac{\epsilon^2}{3}\ln\sigma(\xi + \epsilon^2 w),$$
where $\sigma$ is the Weierstrass sigma-function: $\sigma'/\sigma = \zeta$. Ultimately, since $\psi_T = -f(w, \psi_Y)$, the Lax pair takes the form
$$\psi_T = -\frac{1}{3}\ln\sigma(\psi_Y + w) - \frac{\epsilon}{3}\ln\sigma(\psi_Y + \epsilon w) - \frac{\epsilon^2}{3}\ln\sigma(\psi_Y + \epsilon^2 w), \qquad \psi_X = -\frac{1}{2}v^2\wp(\psi_Y).$$
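Potential (16). The equations corresponding to $H/3$ take the form
$$v_T + \left[\zeta(w) + \frac{2\wp^2(w)}{\wp'(v) + \wp'(w)}\right]v_Y + \frac{2\wp(v)\wp(w)}{\wp'(v) + \wp'(w)}\,w_Y = 0,$$
$$w_X + \frac{2\wp(v)\wp(w)}{\wp'(v) + \wp'(w)}\,v_Y + \left[\zeta(v) + \frac{2\wp^2(v)}{\wp'(v) + \wp'(w)}\right]w_Y = 0.$$
One can show that the corresponding Lax pair is given by the equations
$$\psi_T = f(w, \psi_Y), \qquad \psi_X = g(v, \psi_Y),$$
where, setting $\psi_Y = \xi$, the first order partial derivatives of $f$ and $g$ are given by
$$f_w = -\frac{2\wp(\xi)\wp(w)}{\wp'(\xi) + \wp'(w)}, \qquad f_\xi = -\zeta(w) - \frac{2\wp^2(w)}{\wp'(\xi) + \wp'(w)},$$
and
$$g_v = -\frac{2\wp(\xi)\wp(v)}{\wp'(\xi) - \wp'(v)}, \qquad g_\xi = -\zeta(v) + \frac{2\wp^2(v)}{\wp'(\xi) - \wp'(v)},$$
respectively. Explicitly, one has
$$f(w, \xi) = -\frac{1}{3}\ln\sigma(\xi + w) - \frac{\epsilon}{3}\ln\sigma(\xi + \epsilon w) - \frac{\epsilon^2}{3}\ln\sigma(\xi + \epsilon^2 w),$$
$$g(v, \xi) = \frac{1}{3}\ln\sigma(\xi - v) + \frac{\epsilon}{3}\ln\sigma(\xi - \epsilon v) + \frac{\epsilon^2}{3}\ln\sigma(\xi - \epsilon^2 v).$$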
Notice that the expression for f (w, ξ ) coincides with the one from the previous case. This means that the corresponding Hamiltonian systems commute with each other — the fact which is, in a sense, unexpected.
Potential (22). This is the $g_3 = 0$ degeneration of the potential (16). The system corresponding to $H/3$ takes the form
$$v_T + \frac{w^2}{v^3 + w^3}\,v_Y - \frac{vw}{v^3 + w^3}\,w_Y = 0, \qquad w_X - \frac{vw}{v^3 + w^3}\,v_Y + \frac{v^2}{v^3 + w^3}\,w_Y = 0;$$
it possesses the Lax pair $\psi_T = f(w/\psi_Y)$, $\psi_X = g(v/\psi_Y)$, where the dependence of $f$ and $g$ on their arguments is specified by $f'(s) = s/(s^3 - 1)$, $g'(s) = s/(s^3 + 1)$. Explicitly, one has
$$f(s) = \frac{1}{3}\left[\ln(s - 1) + \epsilon^2\ln(s - \epsilon) + \epsilon\ln(s - \epsilon^2)\right],$$
$$g(s) = -\frac{1}{3}\left[\ln(s + 1) + \epsilon^2\ln(s + \epsilon) + \epsilon\ln(s + \epsilon^2)\right].$$

5. Integrable Hamiltonians in 2 + 1 Dimensions: Three-Component Case

In this section we classify three-component integrable equations of the form (13),
$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}_T + \begin{pmatrix} \lambda^1 & 0 & 0 \\ 0 & \lambda^2 & 0 \\ 0 & 0 & \lambda^3 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}_X + \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{12} & H_{22} & H_{23} \\ H_{13} & H_{23} & H_{33} \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}_Y = 0, \tag{31}$$
assuming that the constants $\lambda^i$ are pairwise distinct. As mentioned in the introduction, the integrability of the system (31) implies the vanishing of the Haantjes tensor for any matrix of the two-parameter family $(kA + I_3)^{-1}(lB + I_3)$. Here $A = \mathrm{diag}(\lambda^i)$ and $B = (H_{ij})$. To formulate the integrability conditions in a compact form we introduce the following notation:
$$R_1 = \frac{H_{12}H_{13}}{H_{23}}(\lambda^2 - \lambda^3), \qquad R_2 = \frac{H_{12}H_{23}}{H_{13}}(\lambda^3 - \lambda^1), \qquad R_3 = \frac{H_{13}H_{23}}{H_{12}}(\lambda^1 - \lambda^2);$$
we will see below that all mixed partial derivatives $H_{ij}$ must be non-zero, otherwise the system is either linear, or reducible. Moreover, we will need the quantities
$$I = \Delta^2 - 4(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1)H_{12}^2 - 4(\lambda^3 - \lambda^1)(\lambda^1 - \lambda^2)H_{23}^2 - 4(\lambda^1 - \lambda^2)(\lambda^2 - \lambda^3)H_{13}^2$$
and
$$J = (\lambda^2 - \lambda^3)H_{12}^2H_{13}^2 + (\lambda^3 - \lambda^1)H_{23}^2H_{12}^2 + (\lambda^1 - \lambda^2)H_{13}^2H_{23}^2 - \Delta H_{12}H_{23}H_{13},$$
where $\Delta = (\lambda^2 - \lambda^3)H_{11} + (\lambda^3 - \lambda^1)H_{22} + (\lambda^1 - \lambda^2)H_{33}$. Our first result is the following
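Theorem 2. The system (31) with an irreducible dispersion curve is diagonalizable if and only if the potential $H$ satisfies the relations
$$J = 0, \qquad H_{123} = 0,$$
$$\frac{\partial}{\partial u_1}\left[(\lambda^3 - \lambda^2)H_{11} + R_2 + R_3\right] = 0,$$
$$\frac{\partial}{\partial u_2}\left[(\lambda^1 - \lambda^3)H_{22} + R_1 + R_3\right] = 0, \tag{32}$$
$$\frac{\partial}{\partial u_3}\left[(\lambda^2 - \lambda^1)H_{33} + R_1 + R_2\right] = 0.$$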
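The quantities $J$, $\Delta$ and $R_i$ introduced before Theorem 2 are linked by the algebraic identity $J = H_{12}H_{13}H_{23}\,(R_1 + R_2 + R_3 - \Delta)$, so that for non-zero mixed derivatives $J = 0$ is the same as $R_1 + R_2 + R_3 = \Delta$. The sketch below checks this identity in exact rational arithmetic. The numerical values of $\lambda^i$ and $H_{ij}$ are arbitrary test data (an assumption made for illustration), not solutions of (32).

```python
from fractions import Fraction as Fr

def delta(lam, H):
    """Delta = (l2-l3) H11 + (l3-l1) H22 + (l1-l2) H33."""
    l1, l2, l3 = lam
    return (l2 - l3) * H[(1, 1)] + (l3 - l1) * H[(2, 2)] + (l1 - l2) * H[(3, 3)]

def J(lam, H):
    """The quantity J as defined in the text."""
    l1, l2, l3 = lam
    x, y, z = H[(1, 2)], H[(2, 3)], H[(1, 3)]
    return ((l2 - l3) * x * x * z * z + (l3 - l1) * y * y * x * x
            + (l1 - l2) * z * z * y * y - delta(lam, H) * x * y * z)

def R_sum(lam, H):
    """R1 + R2 + R3 as defined in the text."""
    l1, l2, l3 = lam
    x, y, z = H[(1, 2)], H[(2, 3)], H[(1, 3)]
    return x * z / y * (l2 - l3) + x * y / z * (l3 - l1) + z * y / x * (l1 - l2)

# Hypothetical test data: pairwise distinct lambdas, non-zero mixed entries.
lam = (Fr(1), Fr(3), Fr(7))
H = {(1, 1): Fr(2, 3), (2, 2): Fr(-5, 4), (3, 3): Fr(7, 2),
     (1, 2): Fr(3, 5), (2, 3): Fr(-2, 7), (1, 3): Fr(4, 9)}

lhs = J(lam, H)
rhs = H[(1, 2)] * H[(2, 3)] * H[(1, 3)] * (R_sum(lam, H) - delta(lam, H))
```

Because `Fraction` arithmetic is exact, `lhs == rhs` holds without any tolerance.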
Notice that, in contrast to the two-component situation (15), these relations are third order in the derivatives of $H$. We will demonstrate below that the necessary conditions (32) are, in fact, sufficient for the integrability, and imply the existence of dispersionless Lax pairs and an infinity of hydrodynamic reductions.
Remark. The condition $J = 0$, which is equivalent to $R_1 + R_2 + R_3 = \Delta$, has a simple geometric interpretation as the condition of reducibility of the left characteristic cone of the system (31) (see Sect. 2 for definitions). Indeed, the left characteristic cone consists of all vectors $g = (g_1, g_2, g_3)$ which satisfy the relation
$$g(\nu I_3 + \mu A + B) = 0. \tag{33}$$
Excluding $\nu$ and $\mu$, one obtains a single algebraic relation among $g_1, g_2, g_3$,
$$\left[H_{13}(g_1)^2g_2 + H_{23}g_1(g_2)^2 + H_{33}g_1g_2g_3\right](\lambda^1 - \lambda^2) + \left[H_{21}(g_2)^2g_3 + H_{13}g_2(g_3)^2 + H_{11}g_1g_2g_3\right](\lambda^2 - \lambda^3) + \left[H_{23}(g_3)^2g_1 + H_{12}g_3(g_1)^2 + H_{22}g_1g_2g_3\right](\lambda^3 - \lambda^1) = 0, \tag{34}$$
which is the equation of the left characteristic cone. The condition $J = 0$ is equivalent to its degeneration into a line and a conic:
$$\left[H_{12}H_{13}g_1 + H_{12}H_{23}g_2 + H_{13}H_{23}g_3\right]\left[H_{13}H_{23}(\lambda^1 - \lambda^2)g_1g_2 + H_{12}H_{23}(\lambda^3 - \lambda^1)g_1g_3 + H_{12}H_{13}(\lambda^2 - \lambda^3)g_2g_3\right] = 0. \tag{35}$$
We point out that, by virtue of (33), the left characteristic cone and the dispersion curve are birationally equivalent. This implies that the dispersion curve is necessarily rational, although not reducible (the linear factor of the left characteristic cone corresponds to a singular point on the dispersion curve; see Sect. 5.2 for explicit formulae).
Proof of Theorem 2. To simplify the calculation of the Haantjes tensor we multiply the matrix $(kA + I_3)^{-1}(lB + I_3)$ by $(k\lambda^1 + 1)(k\lambda^2 + 1)(k\lambda^3 + 1)$. This results in the matrix $\tilde{A}(lB + I_3)$, where
$$\tilde{A} = \mathrm{diag}\left[(k\lambda^2 + 1)(k\lambda^3 + 1),\ (k\lambda^1 + 1)(k\lambda^3 + 1),\ (k\lambda^1 + 1)(k\lambda^2 + 1)\right].$$
Since multiplication by a scalar does not affect the vanishing of the Haantjes tensor, we will work with the matrix $\tilde{A}(lB + I_3)$, which has the advantage of being polynomial in $k$ and $l$. Using computer algebra we calculate components of the Haantjes tensor $\mathcal{H}$
(which are certain polynomials in $k$ and $l$) and set them equal to zero. First of all, one can verify that all components of the form $\mathcal{H}^i_{ij}$ vanish identically, so that the only nonzero components are $\mathcal{H}^i_{jk}$, $i \neq j \neq k$. In the following we will focus on the analysis of the component $\mathcal{H}^3_{12}$: it turns out that the vanishing of $\mathcal{H}^3_{12}$ alone implies the vanishing of the full Haantjes tensor. Let us compute coefficients at different powers of the parameter $l$ and set them equal to zero. At the order $l^0$, all terms in $\mathcal{H}^3_{12}$ vanish identically since $\tilde{A}$ is a constant diagonal matrix. The coefficient at $l^1$ is a polynomial in $k$; however, setting its coefficients equal to zero we obtain only one independent relation: $H_{123} = 0$. Similarly, two extra relations come from the analysis of $l^2$-terms, three relations from $l^3$-terms, and four relations from $l^4$-terms. Ultimately, we end up with a set of 9 linear homogeneous equations for the 9 third order derivatives $H_{iii}$, $H_{iij}$. From these 9 relations it readily follows that if one of the mixed derivatives equals zero, say, $H_{12} = 0$, then either $H_{13}H_{23} = 0$ or $H_{ijk} = 0$ for all $i, j, k$. In the first case the system (31) decouples into a pair of independent $1 \times 1$ and $2 \times 2$ subsystems. The second case corresponds to linear systems with constant coefficients. Therefore, from now on we assume $H_{ij} \neq 0$ for any $i \neq j$.
The set of 9 relations so obtained is rather complicated, and the calculation of the corresponding $9 \times 9$ determinant is computationally intense. A simpler equivalent set of relations can be derived as follows: first, divide $\mathcal{H}^3_{12}$ by $(\lambda^1k + 1)(\lambda^2k + 1)^2(\lambda^3k + 1)^2$ (which is a common multiple), then equate to zero the coefficient of $l^2$ at $k = -1/\lambda^1, -1/\lambda^2$ (the coefficient at $k = -1/\lambda^3$ appears to be a linear combination of the previous two), the coefficient of $l^3$ at $k = -1/\lambda^1, -1/\lambda^2, -1/\lambda^3$ and the coefficient of $l^4$ at $k = 0, -1/\lambda^1, -1/\lambda^2, -1/\lambda^3$.
As a result, we arrive at a simpler set of 9 linearly independent relations that are nothing but linear combinations of the previous ones. If the determinant of this system is non-zero, then all remaining derivatives $H_{iii}$ and $H_{iij}$ vanish identically. This is the case of linear systems. Thus, to obtain non-linear examples, one has to require the vanishing of the determinant. It is straightforward to verify that this determinant factorizes as follows:
$$J^4\left[I^2 - 64(\lambda^1 - \lambda^2)(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1)J\right] = 0.$$
Thus, there are two cases to consider. If
$$I^2 - 64(\lambda^1 - \lambda^2)(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1)J = 0, \tag{36}$$
then the dispersion relation of the system (31) is reducible. To show this we introduce the quantities
$$\Delta_1 = \Delta H_{12} - 2H_{13}H_{23}(\lambda^1 - \lambda^2), \qquad \Delta_2 = \Delta H_{23} - 2H_{12}H_{13}(\lambda^2 - \lambda^3),$$
$$\Delta_3 = \Delta^2 - 4H_{13}^2(\lambda^1 - \lambda^2)(\lambda^2 - \lambda^3), \qquad \Delta_4 = H_{12}^2(\lambda^3 - \lambda^2) + H_{23}^2(\lambda^1 - \lambda^2),$$
which can be verified to satisfy the quadratic identity
$$(\lambda^2 - \lambda^3)\Delta_1^2 + (\lambda^2 - \lambda^1)\Delta_2^2 + \Delta_3\Delta_4 = 0. \tag{37}$$
In terms of these quantities, Eq. (36) can be rewritten as follows:
$$\left[\Delta_3 - 4(\lambda^1 - \lambda^3)\Delta_4\right]^2 + 16(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)\Delta_2^2 = 0, \tag{38}$$
or, equivalently,
$$\left[\Delta_3 + 4(\lambda^1 - \lambda^3)\Delta_4\right]^2 + 16(\lambda^1 - \lambda^3)(\lambda^2 - \lambda^3)\Delta_1^2 = 0; \tag{39}$$
one has to use the identity (37) to verify the equivalence of (38) and (39). Let us assume that $\lambda^1 < \lambda^2 < \lambda^3$. Since we are interested in real-valued solutions, Eq. (38) implies
$$\Delta_2 = 0, \qquad \Delta_3 = 4(\lambda^1 - \lambda^3)\Delta_4 \tag{40}$$
(one should use (39) if $\lambda^2 < \lambda^1 < \lambda^3$). In this case the identity (37) takes the form $(\lambda^2 - \lambda^3)\Delta_1^2 + 4(\lambda^1 - \lambda^3)\Delta_4^2 = 0$, so that
$$\Delta_1 = 0, \qquad \Delta_4 = 0.$$
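The identity (37) and the equivalence of (36), (38) and (39) are purely algebraic statements about $\lambda^i$ and the second derivatives $H_{ij}$, so they can be tested at arbitrary rational values. The following sketch does this in exact arithmetic. The input numbers are hypothetical test data, and the function name `branch_quantities` is introduced here for illustration only.

```python
from fractions import Fraction as Fr

def branch_quantities(l1, l2, l3, h11, h22, h33, h12, h23, h13):
    """Return the left-hand sides of (37), (36), (38) and (39)."""
    D = (l2 - l3) * h11 + (l3 - l1) * h22 + (l1 - l2) * h33
    I = (D * D - 4 * (l2 - l3) * (l3 - l1) * h12 ** 2
         - 4 * (l3 - l1) * (l1 - l2) * h23 ** 2
         - 4 * (l1 - l2) * (l2 - l3) * h13 ** 2)
    J = ((l2 - l3) * h12 ** 2 * h13 ** 2 + (l3 - l1) * h23 ** 2 * h12 ** 2
         + (l1 - l2) * h13 ** 2 * h23 ** 2 - D * h12 * h23 * h13)
    d1 = D * h12 - 2 * h13 * h23 * (l1 - l2)
    d2 = D * h23 - 2 * h12 * h13 * (l2 - l3)
    d3 = D * D - 4 * h13 ** 2 * (l1 - l2) * (l2 - l3)
    d4 = h12 ** 2 * (l3 - l2) + h23 ** 2 * (l1 - l2)
    identity37 = (l2 - l3) * d1 ** 2 + (l2 - l1) * d2 ** 2 + d3 * d4
    eq36 = I * I - 64 * (l1 - l2) * (l2 - l3) * (l3 - l1) * J
    eq38 = (d3 - 4 * (l1 - l3) * d4) ** 2 + 16 * (l1 - l2) * (l1 - l3) * d2 ** 2
    eq39 = (d3 + 4 * (l1 - l3) * d4) ** 2 + 16 * (l1 - l3) * (l2 - l3) * d1 ** 2
    return identity37, eq36, eq38, eq39

# Hypothetical rational test values (not a solution of (36)).
identity37, eq36, eq38, eq39 = branch_quantities(
    Fr(1), Fr(2), Fr(5), Fr(1, 3), Fr(-2), Fr(3, 4), Fr(5, 7), Fr(-1, 2), Fr(2, 9))
```

Here `identity37` vanishes identically, while `eq36`, `eq38` and `eq39` coincide as rational numbers, so each of the three equations holds exactly when the others do.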
These conditions lead to potentials of the form $H = u_2(\gamma u_1 + \delta u_3) + f(\gamma u_1 + \delta u_3)$; here the constants $\gamma$ and $\delta$ satisfy the relation $(\lambda^2 - \lambda^1)\delta^2 + (\lambda^2 - \lambda^3)\gamma^2 = 0$, and $f$ is an arbitrary function of the indicated argument. This ansatz, however, implies the reducibility of the dispersion relation as discussed in [12]. Thus, we are left with the second branch $J = 0$, in which case the rank of the system drops to 5, and we end up with Eqs. (32). This finishes the proof of Theorem 2.
The main result of this section is a complete list of integrable potentials $H(u_1, u_2, u_3)$ which come from a detailed analysis of Eqs. (32). The classification will be performed up to the following equivalence transformations, which constitute a group of point symmetries of the relations (32).
Equivalence transformations: transformations of the variables $u_i$: $u_i \to au_i + b_i$; transformations of the potential $H$:
$$H \to \alpha H + \beta\sum u_i^2/2 + \gamma\sum \lambda^i u_i^2/2 + \mu_i u_i + \delta,$$
the latter corresponding to $Y \to \alpha Y + \beta T + \gamma X$ in Eqs. (31). Moreover, relations (32) are invariant under arbitrary permutations of indices. Finally, we will not be interested in the potentials which are either quadratic in $u_i$ and generate linear systems (31), or separable potentials, e.g., $H = f(u_1) + g(u_2, u_3)$, giving rise to reducible systems.
Theorem 3. The 'generic' solution of Eqs. (32) is given by the formula
$$H = -\sum_{j \neq i}\frac{\lambda^i - \lambda^j}{6a_i^2a_j^2}\,V(a_iu_i, a_ju_j), \tag{41}$$
where
$$V(x, y) = Z(x - y) + \epsilon Z(x - \epsilon y) + \epsilon^2 Z(x - \epsilon^2 y); \tag{42}$$
here $\epsilon = e^{2\pi i/3}$ and $Z'' = \zeta$, where $\zeta$ is the Weierstrass zeta-function: $\zeta' = -\wp$, $(\wp')^2 = 4\wp^3 - g_3$. Degenerations of this solution correspond to
$$H = -\sum_{j \neq i}\frac{\lambda^i - \lambda^j}{3a_i^2a_j^2}\,\tilde{V}(a_iu_i, a_ju_j), \tag{43}$$
where $\tilde{V}(x, y) = (x - y)\ln(x - y) + \epsilon(x - \epsilon y)\ln(x - \epsilon y) + \epsilon^2(x - \epsilon^2 y)\ln(x - \epsilon^2 y)$, and
$$H = -\sum_{j \neq i}\frac{\lambda^i - \lambda^j}{a_i^2a_j^2}\,(a_iu_i - a_ju_j)\ln(a_iu_i - a_ju_j), \tag{44}$$
respectively. Further examples include
$$H = \frac{\lambda^1 - \lambda^2}{a_2^2}\,u_1^2\,\zeta(a_2u_2) + \frac{\lambda^1 - \lambda^3}{a_3^2}\,u_1^2\,\zeta(a_3u_3) - \frac{2}{3}\,\frac{\lambda^2 - \lambda^3}{a_2^2a_3^2}\,V(a_2u_2, a_3u_3), \tag{45}$$
where $V$ is the same as in (42). This potential possesses a degeneration
$$H = (\lambda^1 - \lambda^2)u_1^2u_2^2 + (\lambda^2 - \lambda^3)\zeta(u_3 + c)u_2^2 - (\lambda^3 - \lambda^1)\zeta(u_3)u_1^2, \tag{46}$$
here $\zeta' = -\wp$, $(\wp')^2 = 4\wp^3 + 4$, and $c$ is the zero of $\wp$ such that $\wp(c) = 0$, $\wp'(c) = 2$. It possesses a further quartic degeneration,
$$H = (\lambda^1 - \lambda^2)u_1^2u_2^2 + (\lambda^2 - \lambda^3)u_2^2u_3^2 + (\lambda^3 - \lambda^1)u_3^2u_1^2. \tag{47}$$
We have also found the following (non-symmetric) examples:
$$H = (pu_1 + qu_3)\ln(pu_1 + qu_3) - \frac{1}{6}p(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)u_1^3 - \frac{1}{6}q(\lambda^3 - \lambda^1)(\lambda^3 - \lambda^2)u_3^3 + p(\lambda^3 - \lambda^2)u_2u_3 + q(\lambda^2 - \lambda^1)u_1u_2, \tag{48}$$
$$H = (\lambda^2 - \lambda^1)u_2u_1^2 + (\lambda^2 - \lambda^3)u_2u_3^2 + \frac{1}{10}(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1)u_3^5 + \frac{u_1^2}{u_3}, \tag{49}$$
and
$$H = (\lambda^2 - \lambda^1)u_2u_1^2 + (\lambda^2 - \lambda^3)u_2u_3^2 + \frac{p}{15q^2}(\lambda^2 - \lambda^1)(\lambda^1 - \lambda^3)u_1^5 + \frac{q}{15p^2}(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1)u_3^5 + u_3\,G\!\left(\frac{u_1}{u_3}\right), \tag{50}$$
where $G(x) = (px + q)\log(px + q) + \epsilon(px + \epsilon q)\log(px + \epsilon q) + \epsilon^2(px + \epsilon^2 q)\log(px + \epsilon^2 q)$. Up to the equivalence transformations, the above examples exhaust the list of integrable potentials. We claim that all examples appearing in the classification possess dispersionless Lax pairs and an infinity of hydrodynamic reductions (this will be demonstrated in Sect. 5.1–5.2).
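For the polynomial and logarithmic examples of Theorem 3, the condition $J = 0$ from (32) can be tested directly. The sketch below does so in exact rational arithmetic for the quartic potential (47) and for the potential (48). The second derivatives were computed by hand from those formulas, and the sample values of $\lambda^i$, $u_i$, $p$, $q$ are arbitrary assumptions; only $J = 0$ is checked, not the remaining relations of (32).

```python
from fractions import Fraction as Fr

def J_value(lam, hess):
    """J from the text; hess maps (i, j), i <= j, to H_ij."""
    l1, l2, l3 = lam
    x, y, z = hess[(1, 2)], hess[(2, 3)], hess[(1, 3)]
    delta = ((l2 - l3) * hess[(1, 1)] + (l3 - l1) * hess[(2, 2)]
             + (l1 - l2) * hess[(3, 3)])
    return ((l2 - l3) * x * x * z * z + (l3 - l1) * y * y * x * x
            + (l1 - l2) * z * z * y * y - delta * x * y * z)

def hessian_quartic(lam, u):
    """Hand-computed second derivatives of the quartic potential (47)."""
    l1, l2, l3 = lam
    u1, u2, u3 = u
    return {(1, 1): 2 * (l1 - l2) * u2 * u2 + 2 * (l3 - l1) * u3 * u3,
            (2, 2): 2 * (l1 - l2) * u1 * u1 + 2 * (l2 - l3) * u3 * u3,
            (3, 3): 2 * (l2 - l3) * u2 * u2 + 2 * (l3 - l1) * u1 * u1,
            (1, 2): 4 * (l1 - l2) * u1 * u2,
            (2, 3): 4 * (l2 - l3) * u2 * u3,
            (1, 3): 4 * (l3 - l1) * u1 * u3}

def hessian_log(lam, u, p, q):
    """Hand-computed second derivatives of the potential (48)."""
    l1, l2, l3 = lam
    u1, _, u3 = u
    s = p * u1 + q * u3
    return {(1, 1): p * p / s - p * (l1 - l2) * (l1 - l3) * u1,
            (2, 2): Fr(0),
            (3, 3): q * q / s - q * (l3 - l1) * (l3 - l2) * u3,
            (1, 2): q * (l2 - l1),
            (2, 3): p * (l3 - l2),
            (1, 3): p * q / s}

# Hypothetical sample data.
LAM = (Fr(1), Fr(4), Fr(9))
POINT = (Fr(2, 3), Fr(-5, 7), Fr(3, 2))
j_quartic = J_value(LAM, hessian_quartic(LAM, POINT))
j_log = J_value(LAM, hessian_log(LAM, POINT, Fr(2), Fr(5)))
```

Both `j_quartic` and `j_log` come out as the exact rational number zero, as expected for potentials on the branch $J = 0$.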
Proof of Theorem 3. We can assume that all mixed partial derivatives $H_{ij}$ are non-zero. It follows from (32) that
$$\frac{\partial^3}{\partial u_1\partial u_2\partial u_3}\left(\frac{H_{12}H_{13}}{H_{23}}\right) = \frac{\partial^3}{\partial u_1\partial u_2\partial u_3}\left(\frac{H_{12}H_{23}}{H_{13}}\right) = \frac{\partial^3}{\partial u_1\partial u_2\partial u_3}\left(\frac{H_{13}H_{23}}{H_{12}}\right) = 0. \tag{51}$$
The further analysis depends on the value of the expression
$$\frac{\partial H_{12}}{\partial u_2}\frac{\partial H_{23}}{\partial u_3}\frac{\partial H_{13}}{\partial u_1} + \frac{\partial H_{12}}{\partial u_1}\frac{\partial H_{23}}{\partial u_2}\frac{\partial H_{13}}{\partial u_3}, \tag{52}$$
which appears as a denominator when solving Eqs. (51).
Case I. The expression (52) is nonzero. In this case Eqs. (51) are equivalent to
$$F_{u_1u_2} = \frac{K_{u_1}}{K}F_{u_2} + \frac{G_{u_2}}{G}F_{u_1} - \frac{K_{u_1}G_{u_2}}{KG}F,$$
$$G_{u_2u_3} = \frac{F_{u_2}}{F}G_{u_3} + \frac{K_{u_3}}{K}G_{u_2} - \frac{F_{u_2}K_{u_3}}{FK}G,$$
$$K_{u_3u_1} = \frac{G_{u_3}}{G}K_{u_1} + \frac{F_{u_1}}{F}K_{u_3} - \frac{G_{u_3}F_{u_1}}{GF}K,$$
where $F = 1/H_{12}$, $G = 1/H_{23}$, $K = 1/H_{13}$. Keeping in mind that $F_3 = G_1 = K_2 = 0$, we can rewrite these equations in the form
$$\left(\frac{F}{GK}\right)_{12} = 0, \qquad \left(\frac{G}{FK}\right)_{23} = 0, \qquad \left(\frac{K}{FG}\right)_{13} = 0, \qquad F_3 = G_1 = K_2 = 0. \tag{53}$$
The system (53) possesses obvious symmetries
$$F \to f_1(u_1)f_2(u_2)F, \quad G \to f_2(u_2)f_3(u_3)G, \quad K \to f_1(u_1)f_3(u_3)K, \quad u_1 \to g_1(u_1), \quad u_2 \to g_2(u_2), \quad u_3 \to g_3(u_3); \tag{54}$$
here $f_i$ and $g_i$ are six arbitrary functions of the indicated arguments. As a first step, we introduce the new variables
$$p = \frac{K_1}{K} - \frac{F_1}{F}, \qquad q = \frac{F_2}{F} - \frac{G_2}{G}, \qquad r = \frac{G_3}{G} - \frac{K_3}{K},$$
which are nothing but the invariants of the first 'half' of the symmetry group (54). In terms of $p, q, r$, Eqs. (53) take the form
$$q_1 = -p_2 = pq, \qquad r_2 = -q_3 = qr, \qquad p_3 = -r_1 = pr. \tag{55}$$
This system is straightforward to solve: assuming $p \neq 0$ (the case when $p = q = r = 0$ will be a particular case of the general formula), one has $q = -p_2/p$, $r = p_3/p$, along with the three commuting Monge-Ampère equations for $p$,
$$p_{23} = 0, \qquad (\ln p)_{12} = p_2, \qquad (\ln p)_{13} = -p_3. \tag{56}$$
The integration of the last two equations implies $p_1/p = p + 2\varphi(u_1, u_3)$ and $p_1/p = -p + 2\psi(u_1, u_2)$, respectively. Thus, $p = \psi(u_1, u_2) - \varphi(u_1, u_3)$, and the substitution back into the above equations gives $\psi_1(u_1, u_2) - \psi^2(u_1, u_2) = \varphi_1(u_1, u_3) - \varphi^2(u_1, u_3)$. The separation of variables provides a pair of Riccati equations, $\psi_1 = \psi^2 + V(u_1)$ and $\varphi_1 = \varphi^2 + V(u_1)$. Thus, $\psi = -[\ln v]_1$, $\varphi = -[\ln \tilde{v}]_1$, where $v$ and $\tilde{v}$ are two arbitrary solutions of the linear ODE $v_{11} + V(u_1)v = 0$. Therefore, we can represent $\psi$ and $\varphi$ in the form
$$\psi = -\left[\ln\big(q_2(u_2)p_1(u_1) - p_2(u_2)q_1(u_1)\big)\right]_1, \qquad \varphi = -\left[\ln\big(q_3(u_3)p_1(u_1) - q_1(u_1)p_3(u_3)\big)\right]_1,$$
where $p_1(u_1)$ and $q_1(u_1)$ form a basis of solutions of the linear ODE. Introducing $w_i(u_i) = q_i(u_i)/p_i(u_i)$, one obtains the final formula
$$p = \psi - \varphi = \frac{w_1'(w_3 - w_2)}{(w_2 - w_1)(w_3 - w_1)},$$
leading to
$$q = \frac{w_2'(w_1 - w_3)}{(w_2 - w_1)(w_2 - w_3)}, \qquad r = \frac{w_3'(w_2 - w_1)}{(w_3 - w_1)(w_3 - w_2)}.$$
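The formulas for $p$, $q$, $r$ in terms of three arbitrary functions $w_i(u_i)$ can be tested against the system (55) by finite differences. The sketch below is only a numerical illustration: the particular functions $w_1 = u_1$, $w_2 = u_2^2 + 2$, $w_3 = e^{u_3} + 5$, the sample point and the step size are all assumptions chosen for convenience.

```python
import math

def w_values(u1, u2, u3):
    """Hypothetical sample choice of w_1, w_2, w_3 and their derivatives."""
    w = (u1, u2 * u2 + 2.0, math.exp(u3) + 5.0)
    dw = (1.0, 2.0 * u2, math.exp(u3))
    return w, dw

def pqr(u1, u2, u3):
    """p, q, r from the closed-form expressions in the text."""
    (w1, w2, w3), (d1, d2, d3) = w_values(u1, u2, u3)
    p = d1 * (w3 - w2) / ((w2 - w1) * (w3 - w1))
    q = d2 * (w1 - w3) / ((w2 - w1) * (w2 - w3))
    r = d3 * (w2 - w1) / ((w3 - w1) * (w3 - w2))
    return p, q, r

def partial(component, axis, point, step=1e-5):
    """Central difference of p (0), q (1) or r (2) along u_{axis+1}."""
    hi, lo = list(point), list(point)
    hi[axis] += step
    lo[axis] -= step
    return (pqr(*hi)[component] - pqr(*lo)[component]) / (2 * step)

def residuals(point):
    """Residuals of the six equations in (55)."""
    p, q, r = pqr(*point)
    return [partial(1, 0, point) - p * q,    # q_1 = pq
            -partial(0, 1, point) - p * q,   # -p_2 = pq
            partial(2, 1, point) - q * r,    # r_2 = qr
            -partial(1, 2, point) - q * r,   # -q_3 = qr
            partial(0, 2, point) - p * r,    # p_3 = pr
            -partial(2, 0, point) - p * r]   # -r_1 = pr

POINT = (0.3, 0.7, -0.2)
worst = max(abs(x) for x in residuals(POINT))
```

With these choices `worst` is at the level of the finite-difference error, consistent with (55) holding identically for arbitrary $w_i$.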
Here $w_i(u_i)$ can be viewed as three arbitrary functions of one argument. The corresponding $F, G, K$ are given by $F = s_1s_2(w_1 - w_2)$, $G = s_2s_3(w_2 - w_3)$, $K = s_1s_3(w_3 - w_1)$, where $s_i(u_i)$ are three extra arbitrary functions. This implies the ansatz
$$H_{12} = \frac{P(u_1)Q(u_2)}{f(u_1) - g(u_2)}, \qquad H_{23} = \frac{Q(u_2)R(u_3)}{g(u_2) - h(u_3)}, \qquad H_{13} = \frac{P(u_1)R(u_3)}{h(u_3) - f(u_1)} \tag{57}$$
(with the obvious identification $w_1(u_1) \to f(u_1)$, $s_1(u_1) \to 1/P(u_1)$, etc.). We have to consider different cases depending on how many functions among $f, g, h$ are constant.
Subcase 1. $f' = g' = h' = 0$. Without any loss of generality one can assume $H_{12} = P(u_1)Q(u_2)$, $H_{23} = Q(u_2)R(u_3)$, $H_{13} = P(u_1)R(u_3)$. Substituting this ansatz into (32) one can show that the functions $P, Q, R$ must necessarily be linear. Up to the equivalence transformations, this leads to a unique quartic potential (47):
$$H = (\lambda^1 - \lambda^2)u_1^2u_2^2 + (\lambda^2 - \lambda^3)u_2^2u_3^2 + (\lambda^3 - \lambda^1)u_3^2u_1^2.$$
Subcase 2. $f' = g' = 0$. Without any loss of generality one can assume the following ansatz:
$$H_{12} = P(u_1)Q(u_2), \qquad H_{23} = Q(u_2)h_1(u_3), \qquad H_{13} = P(u_1)h_2(u_3). \tag{58}$$
The substitution into (32) implies that $P$ and $Q$ must necessarily be linear. Up to the equivalence transformations, this results in the potential
$$H = (\lambda^1 - \lambda^2)u_1^2u_2^2 + (\lambda^2 - \lambda^3)b(u_3)u_2^2 + (\lambda^3 - \lambda^1)a(u_3)u_1^2,$$
where the functions $a$ and $b$ satisfy the ODEs
$$a'' = 4\frac{a'}{b'} - 2, \qquad b'' = 4\frac{b'}{a'} - 2, \qquad a'b' = 2(a + b).$$
The special case $a = b = u_3^2$ brings us back to the quartic potential from the previous subcase. The generic solution of these ODEs takes the form $a(u_3) = -\zeta(u_3)$, $b(u_3) = \zeta(u_3 + c)$, where $\zeta$ is the Weierstrass $\zeta$-function, $\zeta' = -\wp$, $(\wp')^2 = 4\wp^3 + 4$, and $c$ is the zero of $\wp$ such that $\wp(c) = 0$, $\wp'(c) = 2$. This is the case (46).
Subcase 3. $f' = 0$. The analysis of this case leads to the ansatz
$$H = (\lambda^1 - \lambda^2)u_1^2a(u_2) + (\lambda^3 - \lambda^1)u_1^2b(u_3) + h(u_2, u_3),$$
where
$$a(u_2) = \frac{1}{a_2^2}\zeta(a_2u_2), \qquad b(u_3) = -\frac{1}{a_3^2}\zeta(a_3u_3)$$
(here $a_2, a_3$ are arbitrary constants), and the second order derivatives of $h(u_2, u_3)$ are given by
$$H_{23} = 4\,\frac{\lambda^2 - \lambda^3}{a_2a_3}\,\frac{\wp(a_2u_2)\wp(a_3u_3)}{\wp'(a_2u_2) - \wp'(a_3u_3)},$$
$$H_{22} = 4\,\frac{\lambda^2 - \lambda^3}{a_3^2}\left[\frac{1}{2}\zeta(a_3u_3) - \frac{\wp^2(a_3u_3)}{\wp'(a_2u_2) - \wp'(a_3u_3)}\right],$$
$$H_{33} = 4\,\frac{\lambda^3 - \lambda^2}{a_2^2}\left[\frac{1}{2}\zeta(a_2u_2) - \frac{\wp^2(a_2u_2)}{\wp'(a_3u_3) - \wp'(a_2u_2)}\right].$$
This is the case (45).
Generic subcase. $f'(x)\,g'(x)\,h'(x) \neq 0$. From (57) and (32) we find all third order derivatives of $H$. The compatibility conditions $\partial_i H_{jji} = \partial_j H_{iij}$ give rise to six functional-differential equations for the functions $f, g, h, P, Q, R$. It follows from (32) that
$$\partial_{u_1}\left[R_1 + (\lambda^1 - \lambda^3)H_{22} + (\lambda^2 - \lambda^1)H_{33}\right] = 0,$$
$$\partial_{u_2}\left[R_2 + (\lambda^2 - \lambda^1)H_{33} + (\lambda^3 - \lambda^2)H_{11}\right] = 0, \tag{59}$$
$$\partial_{u_3}\left[R_3 + (\lambda^3 - \lambda^2)H_{11} + (\lambda^1 - \lambda^3)H_{22}\right] = 0.$$
These give us three more equations for $f, g, h, P, Q, R$, so that we have nine equations altogether. Substituting the values of the third order derivatives of $H$ into the first Eq. (59), taking the numerator and dividing by the common factor $P(u_1)^2Q(u_2)^2R(u_3)$, we get a fourth degree polynomial in $f, g, h, P, Q, R$, and first order derivatives thereof. Applying to this polynomial the differential operator
$$\frac{1}{f'(u_1)}\partial_1\,\frac{1}{g'(u_2)}\partial_2\,\frac{1}{f'(u_1)}\partial_1\,\frac{1}{g'(u_2)}\partial_2\,\frac{1}{h'(u_3)}\partial_3,$$
we arrive at a separation of variables,
$$\frac{(\lambda^2 - \lambda^3)\big(P'''(u_1)f'(u_1) - P''(u_1)f''(u_1)\big)}{f'(u_1)^3} = \frac{(\lambda^3 - \lambda^1)\big(Q'''(u_2)g'(u_2) - Q''(u_2)g''(u_2)\big)}{g'(u_2)^3} = 2c.$$
Integrating twice, we obtain
$$(\lambda^2 - \lambda^3)P' = cf^2 + a_1f + b_1, \qquad (\lambda^3 - \lambda^1)Q' = cg^2 + a_2g + b_2.$$
Analogously, $(\lambda^1 - \lambda^2)R' = ch^2 + a_3h + b_3$. Using these relations we eliminate all derivatives of $P$, $Q$ and $R$ from our nine equations. As a result, we obtain a linear system of nine equations for the three unknowns $P, Q, R$. This system is consistent (that is, the rank of the extended matrix is $\leq 3$) if and only if $a_1 = a_2 = a_3 = a$, $b_1 = b_2 = b_3 = b$, and
$$4(ch^2 + bh + a)h'^2f'' - 4(cf^2 + bf + a)f'^2h'' + (f - h)\left[2c(f^2 + fh + h^2) + 3b(f + h) + 6a\right]f''h'' = 0,$$
$$4(cf^2 + bf + a)f'^2g'' - 4(cg^2 + bg + a)g'^2f'' + (g - f)\left[2c(g^2 + gf + f^2) + 3b(g + f) + 6a\right]g''f'' = 0, \tag{60}$$
$$4(cg^2 + bg + a)g'^2h'' - 4(ch^2 + bh + a)h'^2g'' + (h - g)\left[2c(h^2 + hg + g^2) + 3b(h + g) + 6a\right]h''g'' = 0.$$
Hence, we have either $c = b = a = 0$ or $f'' = g'' = h'' = 0$, otherwise $f''g''h'' \neq 0$. If $c = b = a = 0$ then the linear system for $P, Q, R$ becomes homogeneous. Its rank equals two if and only if $f'' = g'' = h'' = 0$. In this case
$$Pf'(\lambda^3 - \lambda^2) = Qg'(\lambda^1 - \lambda^3) = Rh'(\lambda^2 - \lambda^1) = \mathrm{const}. \tag{61}$$
If $f'' = g'' = h'' = 0$ then the rank of the system also equals two. The requirement that the rank of the extended matrix equals two as well leads to $c = b = a = 0$. Thus, this case reduces to the previous one. Suppose now that $f''g''h'' \neq 0$. Solving the linear system for $P, Q, R$ we get
$$P = \frac{2(cf^2 + bf + a)f'}{(\lambda^2 - \lambda^3)f''}, \qquad Q = \frac{2(cg^2 + bg + a)g'}{(\lambda^3 - \lambda^1)g''}, \qquad R = \frac{2(ch^2 + bh + a)h'}{(\lambda^1 - \lambda^2)h''}.$$
Separating the variables in (60) we ultimately obtain
$$f'^3 = c_1S^2(f), \qquad g'^3 = c_2S^2(g), \qquad h'^3 = c_3S^2(h), \tag{62}$$
and
$$P = \frac{(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)S(f)}{2f'}, \qquad Q = \frac{(\lambda^2 - \lambda^1)(\lambda^2 - \lambda^3)S(g)}{2g'}, \qquad R = \frac{(\lambda^3 - \lambda^1)(\lambda^3 - \lambda^2)S(h)}{2h'},$$
where $S(x)$ is a polynomial of degree $\leq 3$, and $c_i$ are arbitrary constants (the polynomial $S(z)$ can be recovered from $(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)(\lambda^2 - \lambda^3)S' = 6(cz^2 + bz + a)$). Notice that the case (61) is a particular case of the above with $S = \mathrm{const}$. We point out that the right hand sides of (57) possess the following $SL(2, R)$-invariance,
$$f \to \frac{\alpha f + \beta}{\gamma f + \delta}, \quad g \to \frac{\alpha g + \beta}{\gamma g + \delta}, \quad h \to \frac{\alpha h + \beta}{\gamma h + \delta}, \qquad P \to \frac{P}{\gamma f + \delta}, \quad Q \to \frac{Q}{\gamma g + \delta}, \quad R \to \frac{R}{\gamma h + \delta},$$
which can be used to bring the polynomial $S$ to a canonical form. There are three cases to consider.
Three distinct roots. In this case one can reduce $S$ to a quadratic, $S(x) = x^2 + g_3$, so that the ODEs (62) imply $f = \wp'(a_1u_1)$, $g = \wp'(a_2u_2)$, $h = \wp'(a_3u_3)$ where $27a_i^3 = 2c_i$ and $\wp$ is the Weierstrass $\wp$-function: $(\wp')^2 = 4\wp^3 - g_3$. Up to a constant multiple, this leads to
$$H_{ij} = \frac{\lambda^i - \lambda^j}{a_ia_j}\,\frac{\wp(a_iu_i)\wp(a_ju_j)}{\wp'(a_iu_i) - \wp'(a_ju_j)}, \qquad H_{ii} = \sum_{j \neq i}\frac{\lambda^i - \lambda^j}{a_j^2}\left[\frac{1}{2}\zeta(a_ju_j) - \frac{\wp^2(a_ju_j)}{\wp'(a_iu_i) - \wp'(a_ju_j)}\right].$$
The corresponding potential $H(u)$ is given by (41).
Double root. In this case one can assume $S(x) = x$, so that the ODEs (62) imply $f = (a_1u_1)^3$, $g = (a_2u_2)^3$, $h = (a_3u_3)^3$, here $27a_i^3 = c_i$. Up to a constant multiple, this leads to
$$H_{ij} = \frac{(\lambda^i - \lambda^j)u_iu_j}{(a_iu_i)^3 - (a_ju_j)^3}, \qquad H_{ii} = -\sum_{j \neq i}\frac{(\lambda^i - \lambda^j)u_j^2}{(a_iu_i)^3 - (a_ju_j)^3}.$$
The corresponding potential $H(u)$ is given by (43).
Triple root. In this case $S$ can be reduced to $S = 1$, so that the ODEs (62) imply $f = a_1u_1$, $g = a_2u_2$, $h = a_3u_3$, here $a_i^3 = c_i$. Up to a constant multiple, this leads to
$$H_{ij} = \frac{\lambda^i - \lambda^j}{a_ia_j(a_iu_i - a_ju_j)}, \qquad H_{ii} = -\sum_{j \neq i}\frac{\lambda^i - \lambda^j}{a_j^2(a_iu_i - a_ju_j)}, \tag{63}$$
and the corresponding potential $H(u)$ is given by (44). Notice, however, that for this potential the expression (52) equals zero. Formally, it should be considered as an example from Case II below.
Case II. This is the case when the expression (52) equals zero, although both terms in (52) are nonzero:
$$H_{122}H_{233}H_{113} = -H_{112}H_{223}H_{133} \neq 0; \tag{64}$$
an integrable example from this class is provided by
$$H_{ij} = \frac{\lambda^i - \lambda^j}{a_ia_j(a_iu_i - a_ju_j)}; \tag{65}$$
it appears in the 'triple root' case above. A detailed analysis below shows that this case possesses no other non-trivial solutions. Rewriting (64) in the form
$$\frac{H_{122}H_{233}H_{113}}{H_{112}H_{223}H_{133}} = -1,$$
one can set
$$\frac{H_{122}}{H_{112}} = -\frac{l(u_1)}{m(u_2)}, \qquad \frac{H_{233}}{H_{223}} = -\frac{m(u_2)}{n(u_3)}, \qquad \frac{H_{113}}{H_{133}} = -\frac{n(u_3)}{l(u_1)}.$$
Thus,
$$H_{12} = \frac{1}{P(x)}, \qquad H_{23} = \frac{1}{Q(y)}, \qquad H_{13} = \frac{1}{R(z)}, \tag{66}$$
where $x = \alpha(u_1) - \beta(u_2)$, $y = \beta(u_2) - \gamma(u_3)$ and $z = -x - y$ for some functions $\alpha, \beta, \gamma$ such that $\alpha' = 1/l$, $\beta' = 1/m$, $\gamma' = 1/n$. Substituting (66) into (32) and integrating once, one gets
$$H_{11} = \frac{(\lambda^1 - \lambda^2)P^2 + (\lambda^3 - \lambda^1)R^2}{(\lambda^2 - \lambda^3)PQR} + \mu(u_2, u_3),$$
$$H_{22} = \frac{(\lambda^1 - \lambda^2)P^2 + (\lambda^2 - \lambda^3)Q^2}{(\lambda^3 - \lambda^1)PQR} + \nu(u_1, u_3), \tag{67}$$
$$H_{33} = \frac{(\lambda^2 - \lambda^3)Q^2 + (\lambda^3 - \lambda^1)R^2}{(\lambda^1 - \lambda^2)PQR} + \eta(u_1, u_2).$$
Expressing six partial derivatives of the functions $\mu(u_2, u_3)$, $\nu(u_1, u_3)$, $\eta(u_1, u_2)$ from the six compatibility conditions $\partial_j H_{ii} = \partial_i H_{ij}$, and substituting them into the equations $\partial_1 J = 0$ and $\partial_2 J = 0$, we obtain
$$w_1\alpha'(u_1) + w_2\gamma'(u_3) = 0, \qquad \partial_2(w_1)\alpha'(u_1) + \partial_2(w_2)\gamma'(u_3) = 0,$$
$$w_3\beta'(u_2) + w_4\gamma'(u_3) = 0, \qquad \partial_1(w_3)\beta'(u_2) + \partial_1(w_4)\gamma'(u_3) = 0, \tag{68}$$
where
$$w_1 = (\lambda^2 - \lambda^3)Q(RP'Q' + QP'R' - PQ'R'), \qquad w_2 = (\lambda^2 - \lambda^1)P(RP'Q' - QP'R' + PQ'R'),$$
$$w_3 = (\lambda^1 - \lambda^3)R(RP'Q' + QP'R' - PQ'R'), \qquad w_4 = (\lambda^2 - \lambda^1)P(RP'Q' - QP'R' - PQ'R').$$
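Eqs. (68)$_2$ (the second column) are obtained from (68)$_1$ (the first column) upon differentiation by $u_2$ and $u_1$, respectively. Since $\alpha'$, $\beta'$ and $\gamma'$ are nonzero, the system (68) is consistent iff $P$, $Q$ and $R$ satisfy the following conditions:
$$w_1\partial_2(w_2) - w_2\partial_2(w_1) = 0, \qquad w_3\partial_1(w_4) - w_4\partial_1(w_3) = 0. \tag{69}$$
Let us observe that Eqs. (51) (which also hold in this case) can be rewritten as follows:
$$\frac{p'}{p} = \frac{pq}{r} - \frac{rq}{p} + \frac{r'}{r}, \qquad \frac{q'}{q} = \frac{pq}{r} - \frac{rp}{q} + \frac{r'}{r}, \qquad \frac{p'}{p} = \frac{rp}{q} - \frac{rq}{p} + \frac{q'}{q}; \tag{70}$$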
here we use the notation p = P /P, q = Q /Q, r = R /R, and prime denotes the derivative of functions with respect to their arguments. Note that only two of the above equations are independent. Differentiating, for instance, the first two equations in (70) by x and y and eliminating r and r , one ends up at the following relations involving p and q: 2 p 2 p q 2 q 2 p −q = p2 + q 2 , + − p q p q (71) p 2 p q 2 q − − = 2 p2 − q 2 − . p q p q Similarly, one can eliminate p and p obtaining the analog of (71) for q and r . All these relations imply p q r p2 − p4 = q 2 − q4 = r 2 − r 4. (72) p q r Thus, p, q and r must satisfy an ODE of the form f f − ( f )2 − f 4 + k = 0, where f = f (ζ ) and k is an arbitrary constant. If k = 0 then f =
ν sin νζ
f =
or
1 , ζ
where ν is an arbitrary constant. Note that the second solution is a limit of the first as ν → 0. If k = 0 we have 1 √ k4 1 1 f = , or f = sn ν kζ ; 4 1 ν ν k tanh k 4 ζ
where sn is the Jacobi elliptic sine function: (sn )2 = (1−sn 2 ) 1 − 1/(ν 4 k)sn 2 . Using p = P /P and integrating once, we obtain P = c3 x,
tan νx , P = c3 ν
1
P = c3
sinh k 4 x k
1 4
,
or
P = c3 dn − c3
cn √ . k
ν2
(73)
Analogously, $Q$ and $R$ can be obtained from these formulae by cycling the indices $c_3 \to c_1 \to c_2$ and the variables $x \to y \to z$. Here cn and dn are the Jacobi elliptic functions $\mathrm{cn}(\nu\sqrt{k}\,x;\ 1/(\nu^4k))$ and $\mathrm{dn}(\nu\sqrt{k}\,x;\ 1/(\nu^4k))$, and $c_1$, $c_2$ and $c_3$ are arbitrary constants. It turns out that only linear and trigonometric solutions in (73) satisfy the condition (69). Thus, hyperbolic and elliptic solutions can be dropped. The substitution of the linear solution into one of the Eqs. (68)$_1$ implies that the functions $\alpha$, $\beta$ and $\gamma$ must be linear. One recovers the solution (63) by setting $c_1 = a_2a_3/(\lambda^2 - \lambda^3)$, $c_2 = a_1a_3/(\lambda^3 - \lambda^1)$, $c_3 = a_1a_2/(\lambda^1 - \lambda^2)$. Finally, the substitution of the trigonometric solution (73) also implies that $\alpha$, $\beta$ and $\gamma$ must be linear; however, the compatibility conditions for the systems (66) and (67) are not satisfied.
Case III. This is the case when both terms in (52) equal zero separately:
$$\frac{\partial H_{12}}{\partial u_2}\frac{\partial H_{23}}{\partial u_3}\frac{\partial H_{13}}{\partial u_1} = \frac{\partial H_{12}}{\partial u_1}\frac{\partial H_{23}}{\partial u_2}\frac{\partial H_{13}}{\partial u_3} = 0. \tag{74}$$
Up to permutations of indices, we have to consider the following three subcases.
Subcase 1. $H_{12} = \mathrm{const} \neq 0$. It follows from (32) that
$$(\lambda^1 - \lambda^2)H_{13}H_{233} = (\lambda^3 - \lambda^1)H_{12}H_{223}, \qquad (\lambda^2 - \lambda^1)H_{23}H_{133} = (\lambda^3 - \lambda^2)H_{12}H_{113}.$$
Differentiating the first equation with respect to $u_1$ we obtain $H_{113}H_{233} = 0$. If $H_{233} = 0$ then the first equation implies $H_{23} = \mathrm{const}$. Otherwise, it follows from the second equation that $H_{13} = \mathrm{const}$. Without any loss of generality we assume that $H_{23} = \mathrm{const} \neq 0$. Setting $H_{12} = q(\lambda^2 - \lambda^1)$, $H_{23} = p(\lambda^3 - \lambda^2)$, $p, q = \mathrm{const}$, and substituting into (32) one arrives, up to the equivalence transformations, at the following potential $H$:
$$H(u_1, u_2, u_3) = (pu_1 + qu_3)\ln(pu_1 + qu_3) - \frac{1}{6}p(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)u_1^3 - \frac{1}{6}q(\lambda^3 - \lambda^1)(\lambda^3 - \lambda^2)u_3^3 + p(\lambda^3 - \lambda^2)u_2u_3 + q(\lambda^2 - \lambda^1)u_1u_2.$$
Subcase 2. $H_{12} = f(u_1)$, $H_{23} = g(u_3)$. One can prove that in this case
$$H(u_1, u_2, u_3) = \alpha u_1^2u_2 + \beta u_2u_3^2 + \gamma u_1^5 + \delta u_3^5 + u_3\,G\!\left(\frac{u_1}{u_3}\right)$$
for some constants $\alpha, \beta, \gamma, \delta$. The function $G$ has to satisfy an equation of the form
$$G''(x) = \frac{z_1}{z_2 + z_3x^3},$$
where $z_i$ are some constants. If $z_3 = 0$ we have $G(x) = x^2$. In this case
$$\alpha = (\lambda^2 - \lambda^1), \qquad \beta = (\lambda^2 - \lambda^3), \qquad \gamma = 0, \qquad \delta = \frac{1}{10}(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1),$$
which gives (49). The case $z_2 = 0$ is equivalent to the above. Otherwise,
$$G(x) = (px + q)\log(px + q) + \epsilon(px + \epsilon q)\log(px + \epsilon q) + \epsilon^2(px + \epsilon^2 q)\log(px + \epsilon^2 q).$$
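That the logarithmic $G$ satisfies an equation of the stated form can be seen by differentiating twice, term by term: $G''(x) = p^2\sum_k \epsilon^k/(px + \epsilon^k q)$, which sums to $3p^2q^2/(q^3 + p^3x^3)$, i.e. $z_1 = 3p^2q^2$, $z_2 = q^3$, $z_3 = p^3$. The sketch below checks this sum numerically, assuming $\epsilon = e^{2\pi i/3}$; the sample values of $x, p, q$ are arbitrary.

```python
import cmath

EPS = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity

def G_second_from_logs(x, p, q):
    """Term-by-term second derivative of the logarithmic G(x)."""
    return p * p * sum(EPS**k / (p * x + EPS**k * q) for k in range(3))

def G_second_closed(x, p, q):
    """Closed form z1/(z2 + z3 x^3) with z1 = 3 p^2 q^2, z2 = q^3, z3 = p^3."""
    return 3 * p * p * q * q / (q**3 + (p * x) ** 3)

# Hypothetical sample triples (x, p, q).
SAMPLES = [(0.4, 1.5, 2.0), (1.3, 0.7, -1.1), (-0.2, 2.0, 3.0)]
max_gap = max(abs(G_second_from_logs(*s) - G_second_closed(*s)) for s in SAMPLES)
```

The imaginary parts cancel and the real parts match to rounding error, so `max_gap` is tiny for real arguments.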
In this case
$$\alpha = (\lambda^2 - \lambda^1), \qquad \beta = (\lambda^2 - \lambda^3), \qquad \gamma = \frac{p}{15q^2}(\lambda^2 - \lambda^1)(\lambda^1 - \lambda^3), \qquad \delta = \frac{q}{15p^2}(\lambda^2 - \lambda^3)(\lambda^3 - \lambda^1),$$
which gives (50).
Subcase 3. $H_{12} = f(u_1)$, $H_{13} = g(u_1)$. A direct calculation shows that this case gives no non-trivial examples. This finishes the proof of Theorem 3.

5.1. Dispersionless Lax pairs. In this section we prove that the diagonalizability conditions (32) imply the existence of the dispersionless Lax pairs (Theorem 4), and explicitly calculate Lax pairs for some of the most 'symmetric' examples appearing in the classification list of Theorem 3.
Example 1. Let us consider the quartic potential (47),
$$H = (\lambda^1 - \lambda^2)u_1^2u_2^2 + (\lambda^2 - \lambda^3)u_2^2u_3^2 + (\lambda^3 - \lambda^1)u_3^2u_1^2,$$
which is a three-component generalization of the potential (19) from Theorem 1 (we have verified that this example possesses no natural four-component extensions). The corresponding system (31) has a Lax pair
$$\psi_T = \lambda^1a_1(\xi)u_1^2 + \lambda^2a_2(\xi)u_2^2 + \lambda^3a_3(\xi)u_3^2, \qquad \psi_X = -a_1(\xi)u_1^2 - a_2(\xi)u_2^2 - a_3(\xi)u_3^2;$$
here $\xi = \psi_Y$ and the functions $a_i(\xi)$ satisfy the ODEs
$$a_1' = \frac{4a_1}{a_3} + 2, \qquad a_2' = \frac{4a_2}{a_1} + 2, \qquad a_3' = \frac{4a_3}{a_2} + 2, \qquad a_1a_2 + a_2a_3 + a_3a_1 = 0.$$
Equivalently,
$$a_3 = \frac{4a_1}{a_1' - 2}, \qquad a_2 = -\frac{4a_1}{a_1' + 2}, \qquad 2a_1a_1'' = 3a_1'^2 - 12.$$
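The reduced form of the ODE system can be tested pointwise. On the curve $(a_1')^2 = 4a_1^3 + 4$ (the first integral used for the potential (19), with the integration constant set to 1) one has $a_1'' = 6a_1^2$, and all four original relations follow from the formulas for $a_2$ and $a_3$. The sketch below verifies this in exact arithmetic at a few rational points of the curve; the points are chosen by hand for illustration and the helper name `example1_residuals` is hypothetical.

```python
from fractions import Fraction as Fr

def example1_residuals(a1, da1):
    """Residuals of the ODE system at a point with (a1')^2 = 4 a1^3 + 4."""
    assert da1 * da1 == 4 * a1 ** 3 + 4, "point must lie on the curve"
    dda1 = 6 * a1 * a1                  # a1'' obtained by differentiating the curve
    a2 = -4 * a1 / (da1 + 2)
    a3 = 4 * a1 / (da1 - 2)
    # Derivatives of a2 and a3 by the quotient rule.
    da2 = -4 * da1 / (da1 + 2) + 4 * a1 * dda1 / (da1 + 2) ** 2
    da3 = 4 * da1 / (da1 - 2) - 4 * a1 * dda1 / (da1 - 2) ** 2
    return (2 * a1 * dda1 - (3 * da1 * da1 - 12),
            da1 - (4 * a1 / a3 + 2),
            da2 - (4 * a2 / a1 + 2),
            da3 - (4 * a3 / a2 + 2),
            a1 * a2 + a2 * a3 + a3 * a1)

# Rational points (a1, a1') on the curve (a1')^2 = 4 a1^3 + 4.
POINTS = [(Fr(2), Fr(6)), (Fr(2), Fr(-6)), (Fr(-1), Fr(0))]
all_residuals = [example1_residuals(a, da) for a, da in POINTS]
```

Every residual is exactly zero at the three sample points, in agreement with the equivalence stated above.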
Without any loss of generality one can set $a_1 = \wp(\xi)$, $a_2 = \wp(\xi + c)$, $a_3 = \wp(\xi - c)$, where $\wp$ is the Weierstrass $\wp$-function, $(\wp')^2 = 4\wp^3 + 4$, and $c$ is the zero of $\wp$ such that $\wp(c) = 0$, $\wp'(c) = 2$.
Example 2. Let us consider the potential (44),
$$H = -\sum_{j \neq i}\frac{\lambda^i - \lambda^j}{a_i^2a_j^2}\,(a_iu_i - a_ju_j)\ln(a_iu_i - a_ju_j),$$
which is a three-component generalization of the potential (18) from Theorem 1. The corresponding system (31) possesses the Lax pair
$$\psi_T = -\sum_i\frac{\lambda^i}{a_i^2}\ln(a_iu_i - \psi_Y), \qquad \psi_X = \sum_i\frac{1}{a_i^2}\ln(a_iu_i - \psi_Y).$$
This Lax pair appeared previously in [21].
Example 3. Let us consider the potential (41), H =−
λi − λ j V (ai u i , a j u j ). 6ai2 a 2j j=i
One can show that the corresponding system (31) has the Lax pair ψT = −
λi 1 f (a u , ψ ), ψ = f (ai u i , ψY ), i i Y X ai2 ai2
where the dependence of f (u, ξ ) on its arguments (here ξ = ψY ) is governed by fu =
℘ (u)℘ (ξ ) , ℘ (u) − ℘ (ξ )
fξ =
1 ℘ 2 (u) − ζ (u). ℘ (ξ ) − ℘ (u) 2
Explicitly, one has f (u, ξ ) =
1 2 ln σ (u − ξ ) + ln σ (u − ξ ) + ln σ ( 2 u − ξ ), 6 6 6
here $\sigma$ is the Weierstrass sigma-function: $\sigma'/\sigma = \zeta$. In a different parametrization, this Lax pair appeared in [21] in the classification of dispersionless Lax pairs with movable singularities.

We point out that both Examples 2 and 3 generalize to the n-component case in a straightforward way (allowing the summation to go from 1 to n). In fact, the following general result holds:

Theorem 4. Any system (31) satisfying the diagonalizability conditions (32) possesses a dispersionless Lax pair.

Proof. We look for a Lax pair in the form (11). The compatibility condition $\psi_{tx} = \psi_{xt}$ results in the following set of relations:
\[ f_1 = \lambda^1 g_1, \quad f_2 = \lambda^2 g_2, \quad f_3 = \lambda^3 g_3, \tag{75} \]
and
\[ \begin{aligned}
f_p g_1 &= H_{11} g_1 + H_{21} g_2 + H_{31} g_3 + g_p f_1, \\
f_p g_2 &= H_{12} g_1 + H_{22} g_2 + H_{32} g_3 + g_p f_2, \\
f_p g_3 &= H_{13} g_1 + H_{23} g_2 + H_{33} g_3 + g_p f_3,
\end{aligned} \tag{76} \]
where we have set $p = \psi_y$, $f_i = \partial_i f$, and $g_i = \partial_i g$. The relations (75) and (76) are equivalent to (12). Eliminating $f_p$ and $g_p$ from (76), one obtains a single algebraic constraint among the components $g_1, g_2, g_3$, which coincides with the left characteristic cone (34). The expressions for $f_p$ and $g_p$ obtained from the first two Eqs. (76) take the form
\[ \begin{aligned}
f_p &= \frac{(H_{11} g_1 + H_{12} g_2 + H_{13} g_3)\,\lambda^2 g_2 - (H_{12} g_1 + H_{22} g_2 + H_{23} g_3)\,\lambda^1 g_1}{g_1 g_2 (\lambda^2 - \lambda^1)}, \\
g_p &= \frac{(H_{11} g_1 + H_{12} g_2 + H_{13} g_3)\, g_2 - (H_{12} g_1 + H_{22} g_2 + H_{23} g_3)\, g_1}{g_1 g_2 (\lambda^2 - \lambda^1)}.
\end{aligned} \tag{77} \]
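The elimination leading to (77) can be checked numerically. The following is only a minimal sketch in exact rational arithmetic: the function name `check_77` and the random sample values are illustrative assumptions, not part of the paper. It substitutes the expressions (77) back into the first two equations of (76), using (75) for $f_1$ and $f_2$, and returns the two residuals.

```python
from fractions import Fraction as F
import random


def check_77(seed=0):
    """Residuals of the first two equations (76) after substituting (77).

    All numbers are random positive rationals (hypothetical test data);
    lam1 != lam2 and g1, g2 != 0 keep the denominator of (77) nonzero.
    """
    rng = random.Random(seed)
    rnd = lambda: F(rng.randint(1, 9), rng.randint(1, 9))

    # Entries of the symmetric Hessian H_ij and the derivatives g_i
    H11, H12, H13, H22, H23 = (rnd() for _ in range(5))
    g1, g2, g3 = (rnd() for _ in range(3))
    lam1, lam2 = F(1), F(2)

    # Relations (75): f_i = lambda^i g_i
    f1, f2 = lam1 * g1, lam2 * g2

    # Right-hand sides of the first two equations (76) without the g_p term
    S1 = H11 * g1 + H12 * g2 + H13 * g3
    S2 = H12 * g1 + H22 * g2 + H23 * g3

    # Expressions (77)
    den = g1 * g2 * (lam2 - lam1)
    fp = (S1 * lam2 * g2 - S2 * lam1 * g1) / den
    gp = (S1 * g2 - S2 * g1) / den

    # Residuals of f_p g_i = (H g)_i + g_p f_i for i = 1, 2
    r1 = fp * g1 - (S1 + gp * f1)
    r2 = fp * g2 - (S2 + gp * f2)
    return r1, r2
```

Both residuals vanish identically for any admissible sample, since (77) is just the solution of a 2 × 2 linear system for $f_p$ and $g_p$; the third equation of (76) then gives the constraint on $g_1, g_2, g_3$ mentioned in the text.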
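Using the compatibility conditions $f_{ij} = f_{ji}$ and $f_{ip} = f_{pi}$, we can express all second order derivatives of $g$ in the form
\[ \begin{aligned}
g_{12} &= g_{13} = g_{23} = 0, \\
g_{11} &= \frac{g_1 (H_{111} g_1 + H_{112} g_2 + H_{113} g_3)}{H_{12} g_2 + H_{13} g_3}, \\
g_{22} &= \frac{g_2 (H_{221} g_1 + H_{222} g_2 + H_{223} g_3)}{H_{12} g_1 + H_{23} g_3}, \\
g_{33} &= \frac{g_1 (H_{123} g_1 + H_{223} g_2 + H_{332} g_3)(\lambda^3 - \lambda^1)}{H_{13} g_2 (\lambda^3 - \lambda^2) + H_{23} g_1 (\lambda^1 - \lambda^3)}
 + \frac{g_2 (H_{113} g_1 + H_{123} g_2 + H_{331} g_3)(\lambda^2 - \lambda^3)}{H_{13} g_2 (\lambda^3 - \lambda^2) + H_{23} g_1 (\lambda^1 - \lambda^3)}.
\end{aligned} \tag{78} \]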
It was already mentioned that the condition $J = 0$ implies the decomposition of the left characteristic cone (34) into linear and quadratic factors, see (35). We will assume that $g_1, g_2, g_3$ lie on the quadratic branch,
\[ H_{13} H_{23} (\lambda^1 - \lambda^2) g_1 g_2 + H_{12} H_{23} (\lambda^3 - \lambda^1) g_1 g_3 + H_{12} H_{13} (\lambda^2 - \lambda^3) g_2 g_3 = 0. \tag{79} \]
One can verify that the differential consequences (80) of (79), that is, the vanishing of the derivatives of its left-hand side with respect to $u_1$, $u_2$, $u_3$ and $p$, hold identically modulo (78), (79) and (32). Finally, using computer algebra, it is straightforward to verify that the consistency conditions for the system (78) are satisfied identically modulo (79) and (32). This completes the proof of Theorem 4.

5.2. Hydrodynamic reductions.

The aim of this section is to prove that all examples listed in Theorem 3 possess infinitely many n-component hydrodynamic reductions parametrized by n arbitrary functions of a single variable. To do so one has to demonstrate the consistency of the relations (8), (10) where the characteristic speeds $\nu^i$ and $\mu^i$ satisfy the dispersion relation $\det(\nu I_3 + \mu A + B) = 0$, and $\partial_i u$ is the right eigenvector of the matrix $\nu^i I_3 + \mu^i A + B$ (see Sect. 2).

Theorem 5. The diagonalizability conditions (32) are necessary and sufficient for the existence of an infinity of n-component hydrodynamic reductions parametrized by n arbitrary functions of a single variable.

Proof. The necessity follows from the general result of [12] which states that, for a quasilinear system (4), the diagonalizability is a necessary condition for the existence of an infinity of hydrodynamic reductions. The first step to demonstrate the sufficiency is to explicitly parametrize the dispersion curve (9), which we know to be a rational curve of degree three (see the Remark after
Theorem 2). This can be done as follows. Let us first calculate the singular point $\nu_0, \mu_0$ on the dispersion curve. It corresponds to the situation when the rank of the matrix $\nu I_3 + \mu A + B$ drops to one. The associated left eigenvectors constitute a two-dimensional plane given by the first factor in the equation of the left characteristic cone (35). A simple calculation shows that $\nu_0$ and $\mu_0$ can be obtained from the linear system
\[ \begin{aligned}
\nu_0 + \lambda^1 \mu_0 &= \frac{H_{12} H_{13}}{H_{23}} - H_{11}, \\
\nu_0 + \lambda^2 \mu_0 &= \frac{H_{12} H_{23}}{H_{13}} - H_{22}, \\
\nu_0 + \lambda^3 \mu_0 &= \frac{H_{13} H_{23}}{H_{12}} - H_{33};
\end{aligned} \]
notice that these three relations are linearly dependent: indeed, multiplying the first by $\lambda^2 - \lambda^3$, the second by $\lambda^3 - \lambda^1$, the third by $\lambda^1 - \lambda^2$ and adding them together, one obtains $J = 0$, see (32). Next, we parametrize the quadratic branch of the left characteristic cone (35) in the form
\[ g_1 = \frac{1}{(\lambda^1 + s) H_{23}}, \quad g_2 = \frac{1}{(\lambda^2 + s) H_{13}}, \quad g_3 = \frac{1}{(\lambda^3 + s) H_{12}}, \]
here $s$ is a parameter. The corresponding relation (33) is equivalent to
\[ \begin{aligned}
\nu + \mu \lambda^1 + H_{11} + \frac{\lambda^1 + s}{\lambda^2 + s}\,\frac{H_{12} H_{23}}{H_{13}} + \frac{\lambda^1 + s}{\lambda^3 + s}\,\frac{H_{13} H_{23}}{H_{12}} &= 0, \\
\nu + \mu \lambda^2 + H_{22} + \frac{\lambda^2 + s}{\lambda^1 + s}\,\frac{H_{12} H_{13}}{H_{23}} + \frac{\lambda^2 + s}{\lambda^3 + s}\,\frac{H_{13} H_{23}}{H_{12}} &= 0, \\
\nu + \mu \lambda^3 + H_{33} + \frac{\lambda^3 + s}{\lambda^1 + s}\,\frac{H_{12} H_{13}}{H_{23}} + \frac{\lambda^3 + s}{\lambda^2 + s}\,\frac{H_{12} H_{23}}{H_{13}} &= 0;
\end{aligned} \]
we point out that these three relations are also linearly dependent. Solving them for $\nu(s)$ and $\mu(s)$ one obtains a rational parametrization of the dispersion curve:
\[ \begin{aligned}
\nu(s) &= \nu_0 - \frac{s}{\lambda^1 + s}\,\frac{H_{12} H_{13}}{H_{23}} - \frac{s}{\lambda^2 + s}\,\frac{H_{12} H_{23}}{H_{13}} - \frac{s}{\lambda^3 + s}\,\frac{H_{13} H_{23}}{H_{12}}, \\
\mu(s) &= \mu_0 - \frac{1}{\lambda^1 + s}\,\frac{H_{12} H_{13}}{H_{23}} - \frac{1}{\lambda^2 + s}\,\frac{H_{12} H_{23}}{H_{13}} - \frac{1}{\lambda^3 + s}\,\frac{H_{13} H_{23}}{H_{12}};
\end{aligned} \]
here $\nu_0$ and $\mu_0$ are coordinates of the singular point. Thus, the characteristic speeds $\nu^i(R)$ and $\mu^i(R)$ can be represented in the form
\[ \nu^i(R) = \nu(s^i), \quad \mu^i(R) = \mu(s^i), \tag{81} \]
where $s^i$, which are the parameter values of n points on the dispersion curve, are certain functions of the Riemann invariants: $s^i = s^i(R)$. Since in our case the matrix $\nu I_3 + \mu A + B$ is symmetric, the left characteristic cone coincides with the right characteristic cone. Thus, the right eigenvector corresponding to the point $\nu^i, \mu^i$ on the dispersion curve is
\[ \left( \frac{1}{(\lambda^1 + s^i) H_{23}},\ \frac{1}{(\lambda^2 + s^i) H_{13}},\ \frac{1}{(\lambda^3 + s^i) H_{12}} \right)^t, \]
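and the relations (8) take the form
\[ \partial_i u_2 = \frac{\lambda^1 + s^i}{\lambda^2 + s^i}\,\frac{H_{23}}{H_{13}}\,\partial_i u_1, \qquad \partial_i u_3 = \frac{\lambda^1 + s^i}{\lambda^3 + s^i}\,\frac{H_{23}}{H_{12}}\,\partial_i u_1. \tag{82} \]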
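The rational parametrization of the dispersion curve described above lends itself to a quick exact check. The sketch below rests on assumptions that are not stated in this part of the text: it takes $A = \mathrm{diag}(\lambda^1, \lambda^2, \lambda^3)$ and $B = (H_{ij})$, which is what the three displayed relations suggest, and it uses arbitrary sample values. The diagonal entries $H_{ii}$ are defined from the singular point relations, which makes the three linear equations for $\nu_0, \mu_0$ consistent by construction. The function name `dispersion_check` is hypothetical.

```python
from fractions import Fraction as F


def dispersion_check(s_values=(F(1, 3), F(2), F(-7, 5))):
    """For each s, return ((nu I + mu A + B) g, value of the quadratic form (79))."""
    lam = [F(1), F(2), F(4)]                 # sample lambda^1, lambda^2, lambda^3
    H12, H13, H23 = F(2), F(3), F(5)         # sample off-diagonal Hessian entries
    c = [H12 * H13 / H23, H12 * H23 / H13, H13 * H23 / H12]
    nu0, mu0 = F(1, 2), F(-3, 4)             # sample singular point

    # Diagonal entries fixed by nu0 + lambda^i mu0 = c_i - H_ii
    Hd = [c[i] - nu0 - lam[i] * mu0 for i in range(3)]
    B = [[Hd[0], H12, H13], [H12, Hd[1], H23], [H13, H23, Hd[2]]]

    results = []
    for s in s_values:
        # Rational parametrization nu(s), mu(s)
        nu = nu0 - sum(s * c[k] / (lam[k] + s) for k in range(3))
        mu = mu0 - sum(c[k] / (lam[k] + s) for k in range(3))
        # Point on the quadratic branch of the characteristic cone
        g = [1 / ((lam[0] + s) * H23),
             1 / ((lam[1] + s) * H13),
             1 / ((lam[2] + s) * H12)]
        # Matrix nu*I + mu*A + B under the stated assumption on A and B
        M = [[B[i][j] + (nu + mu * lam[i] if i == j else 0) for j in range(3)]
             for i in range(3)]
        Mg = [sum(M[i][j] * g[j] for j in range(3)) for i in range(3)]
        quad = (H13 * H23 * (lam[0] - lam[1]) * g[0] * g[1]
                + H12 * H23 * (lam[2] - lam[0]) * g[0] * g[2]
                + H12 * H13 * (lam[1] - lam[2]) * g[1] * g[2])
        results.append((Mg, quad))
    return results
```

Under these assumptions every returned vector is zero, so $g$ is an eigenvector with eigenvalue zero and the point $(\nu(s), \mu(s))$ lies on the dispersion curve; the quadratic form (79) vanishes as well. The ratios $g_2/g_1$ and $g_3/g_1$ of the same vector are exactly the coefficients appearing in (82).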
Substituting (81) into the commutativity conditions (10) and using (82) one obtains the relations
\[ \partial_j s^i = (\ldots)\,\partial_j u_1, \tag{83} \]
$i \neq j$, where dots denote a certain rational expression in $s^i$, $s^j$ whose coefficients depend on the second and third order derivatives of the potential $H$. For example, in the case of the quartic potential (47) these relations take the form
\[ \partial_j s^i = \frac{3 (\lambda^1 + s^i)(\lambda^2 + s^i)(\lambda^3 + s^i)(\lambda^1 + s^j)}{(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)(s^j - s^i)\, u_1}\,\partial_j u_1. \]
By virtue of (82) and (32), the consistency conditions $\partial_j \partial_i u_2 = \partial_i \partial_j u_2$ and $\partial_j \partial_i u_3 = \partial_i \partial_j u_3$ imply one and the same relation
\[ \partial_i \partial_j u_1 = (\ldots)\,\partial_i u_1\,\partial_j u_1, \tag{84} \]
$i \neq j$, where, again, dots denote a rational expression in $s^i$, $s^j$ whose coefficients depend on the second and third order derivatives of $H$. In the case (47), we have
\[ \partial_i \partial_j u_1 = \frac{Y(s^i, s^j)}{u_1}\,\partial_i u_1\,\partial_j u_1, \]
where
\[ Y(\alpha, \beta) = \frac{6 \alpha^2 \beta^2 + k_1 (\alpha^2 \beta + \alpha \beta^2) + k_2 (\alpha^2 + 4 \alpha \beta + \beta^2) + k_3 (\alpha + \beta) + k_4}{(\lambda^1 - \lambda^2)(\lambda^1 - \lambda^3)(\alpha - \beta)^2}, \]
\[ \begin{aligned}
k_1 &= 3 (\lambda^2 + \lambda^3 + 2 \lambda^1), & k_2 &= (\lambda^1)^2 + 2 \lambda^1 \lambda^2 + 2 \lambda^1 \lambda^3 + \lambda^2 \lambda^3, \\
k_3 &= 3 \lambda^1 (\lambda^1 \lambda^2 + \lambda^1 \lambda^3 + 2 \lambda^2 \lambda^3), & k_4 &= 6 (\lambda^1)^2 \lambda^2 \lambda^3.
\end{aligned} \]
The relations (83) and (84) constitute the so-called Gibbons-Tsarev-type equations which govern hydrodynamic reductions of the system (31). The last step is to verify their consistency, namely, $\partial_k \partial_j s^i = \partial_j \partial_k s^i$ and $\partial_i \partial_j \partial_k u_1 = \partial_i \partial_k \partial_j u_1$ (without any loss of generality one can set $i = 1$, $j = 2$, $k = 3$). If these consistency conditions are satisfied identically, the system (83), (84) will be in involution, with the general solution depending on 2n arbitrary functions of a single variable. Up to reparametrizations $R^i \to f^i(R^i)$ this gives an infinity of hydrodynamic reductions depending on n arbitrary functions.

We have verified the consistency for all examples appearing in Theorem 3. In fact, rather than considering them case-by-case, one can give a unified proof of the consistency using only the diagonalizability conditions (32). To do so one needs to bring the system (32) into a passive form. It turns out that all higher order partial derivatives of the potential $H$ can be expressed in terms of the second order derivatives $H_{ij}$ and the 4 third order derivatives, say, $H_{122}$, $H_{113}$, $H_{223}$, $H_{233}$. Second order derivatives are constrained by a single algebraic equation $J = 0$, while the values of $H$ and its first order derivatives $H_i$ are arbitrary. This calculation shows that the generic solution of the system (32) should depend on 13 arbitrary constants, which is in full accordance with the results of Sect. 5.
The computation of the expressions (83) and (84), as well as the verification of the consistency conditions, have been performed modulo this passive form. This means that all partial derivatives of $H$ except the basic ones were eliminated, and the basic derivatives were considered as independent variables related by a single algebraic equation $J = 0$. An intense computer calculation shows that all compatibility conditions are identities in the basic derivatives.

6. Hamiltonian Systems in 3 + 1 Dimensions

In this section we establish a number of non-existence results for integrable Hamiltonian systems of hydrodynamic type in 3 + 1 dimensions. We will begin with a two-component case. According to the results of [19], there exists a unique two-component Hamiltonian operator of hydrodynamic type which is essentially three-dimensional. Up to a linear transformation of the independent variables it can be cast into a canonical form
\[ P = \begin{pmatrix} d/dx & 0 \\ 0 & d/dy \end{pmatrix} + \begin{pmatrix} 0 & d/dz \\ d/dz & 0 \end{pmatrix}. \]
The corresponding Hamiltonian systems $u_t + P(h_u) = 0$ take the form
\[ u^1_t + (h_1)_x + (h_2)_z = 0, \quad u^2_t + (h_2)_y + (h_1)_z = 0. \]
Applying the Legendre transform, $u_1 = h_1$, $u_2 = h_2$, $H = u^1 h_1 + u^2 h_2 - h$, one can rewrite these equations in the equivalent form
\[ (u_1)_x + (u_2)_z + (H_1)_t = 0, \quad (u_2)_y + (u_1)_z + (H_2)_t = 0. \tag{85} \]
Notice that $H(u_1, u_2)$ is defined up to an arbitrary quadratic form (all quadratic terms in $H$ can be eliminated by appropriate linear changes of the independent variables). Our first result is the following

Theorem 6. Any integrable system (85) is necessarily linear (that is, the potential $H$ is quadratic in $u_1$, $u_2$).

Proof. Our strategy will be to consider reductions of the system (85) to various (2 + 1)-dimensional systems. In fact, it will be sufficient to look at reductions governing traveling wave solutions. If the original system (85) is integrable, all such reductions must be integrable as well. Since the integrability conditions for (2 + 1)-dimensional two-component systems of hydrodynamic type are explicitly known [11], this will provide a set of necessary conditions for the integrability of the system (85). It turns out that these conditions are very strong indeed, leading to the non-existence of non-quadratic integrable potentials $H$. Setting in Eqs. (85) $\partial_z = \mu \partial_t$, which is equivalent to seeking solutions in the form $u(x, y, t + \mu z)$, one obtains a (2 + 1)-dimensional Hamiltonian system
\[ (u_1)_x + (H_1 + \mu u_2)_t = 0, \quad (u_2)_y + (H_2 + \mu u_1)_t = 0, \]
with the Hamiltonian density $H(u_1, u_2) + \mu u_1 u_2$. According to our philosophy we have to require that it is integrable for an arbitrary value of the parameter $\mu$. The integrability conditions (15) readily imply that the corresponding $H$ must be cubic in $u_1$, $u_2$, and Theorem 1 tells us that the only two ‘suspicious’ cases to consider are $H = \frac{1}{6} u_2^3$ and
$H = \frac{1}{2} u_1 u_2^2$ (recall that we ignore quadratic terms in $H$). In the first case the system (85) takes the form
\[ (u_1)_x + (u_2)_z = 0, \quad (u_2)_y + (u_1)_z + u_2 (u_2)_t = 0. \]
Setting here $x = y$ (this amounts to seeking traveling wave solutions in the form $u(x + y, z, t)$), one obtains a (2 + 1)-dimensional system
\[ (u_1)_x + (u_2)_z = 0, \quad (u_2)_x + (u_1)_z + u_2 (u_2)_t = 0. \tag{86} \]
We recall that the paper [11] provides a complete set of the integrability conditions for two-component hydrodynamic type systems represented in the form
\[ \begin{pmatrix} v \\ w \end{pmatrix}_t + \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}_x + \begin{pmatrix} p & q \\ r & s \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}_y = 0. \]
The integrability conditions constitute a complicated over-determined system of PDEs for the coefficients $a, b, p, q, r, s$ as functions of $v, w$. Representing Eqs. (86) in the form
\[ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}_x + \begin{pmatrix} 0 & 0 \\ 0 & u_2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}_t + \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}_z = 0 \]
one can verify that these integrability conditions are not satisfied. Thus, the (3 + 1)-dimensional system corresponding to $H = \frac{1}{6} u_2^3$ is not integrable.

Similarly, for $H = \frac{1}{2} u_1 u_2^2$ the system (85) takes the form
\[ (u_1)_x + (u_2)_z + u_2 (u_2)_t = 0, \quad (u_2)_y + (u_1)_z + u_2 (u_1)_t + u_1 (u_2)_t = 0. \]
Setting, again, $x = y$, and changing to the new dependent variables $v = u_1 + u_2$, $w = u_2 - u_1$, one obtains the system
\[ \begin{pmatrix} v \\ w \end{pmatrix}_x + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}_z + \begin{pmatrix} \frac{3v + w}{4} & \frac{v - w}{4} \\ \frac{v - w}{4} & -\frac{v + 3w}{4} \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}_t = 0, \]
which also does not satisfy the integrability conditions. This finishes the proof of Theorem 6.

Our next result shows that any three-component (3 + 1)-dimensional integrable Hamiltonian system associated with a non-singular Poisson bracket of hydrodynamic type is either linear or reducible. Any such system can be brought to a canonical form
\[ u^1_t + (h_1)_x = 0, \quad u^2_t + (h_2)_y = 0, \quad u^3_t + (h_3)_z = 0, \]
with the Hamiltonian operator
\[ \begin{pmatrix} d/dx & 0 & 0 \\ 0 & d/dy & 0 \\ 0 & 0 & d/dz \end{pmatrix}. \tag{87} \]
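Returning briefly to the proof of Theorem 6: the change of dependent variables $v = u_1 + u_2$, $w = u_2 - u_1$ used there for $H = \frac{1}{2} u_1 u_2^2$ can be checked without any symbolic algebra. The sketch below is an illustration under stated assumptions: it treats the values of $u_1, u_2$ and of their first derivatives at a point as independent rational numbers, and compares the two rows of the $(v, w)$ system with the sum and the difference of the two reduced equations. The helper name `vw_residuals` and the sample data are hypothetical.

```python
from fractions import Fraction as F
import random


def vw_residuals(seed=1):
    """Compare the (v, w) system with the sum/difference of the reduced equations."""
    rng = random.Random(seed)
    rnd = lambda: F(rng.randint(-9, 9), rng.randint(1, 9))

    u1, u2 = rnd(), rnd()
    # d[(k, a)] stands for the derivative of u_k with respect to a in {x, z, t}
    d = {(k, a): rnd() for k in (1, 2) for a in "xzt"}

    # Reduced system for H = u1*u2^2/2 after replacing y-derivatives by x-derivatives
    E1 = d[1, "x"] + d[2, "z"] + u2 * d[2, "t"]
    E2 = d[2, "x"] + d[1, "z"] + u2 * d[1, "t"] + u1 * d[2, "t"]

    # New dependent variables and their derivatives
    v, w = u1 + u2, u2 - u1
    dv = {a: d[1, a] + d[2, a] for a in "xzt"}
    dw = {a: d[2, a] - d[1, a] for a in "xzt"}

    # Rows of the (v, w) system
    R1 = dv["x"] + dv["z"] + (3 * v + w) / 4 * dv["t"] + (v - w) / 4 * dw["t"]
    R2 = dw["x"] - dw["z"] + (v - w) / 4 * dv["t"] - (v + 3 * w) / 4 * dw["t"]

    return R1 - (E1 + E2), R2 - (E2 - E1)
```

Both differences vanish for every sample, which confirms that the first row is the sum and the second row the difference of the two reduced equations. Whether the resulting coefficients satisfy the integrability conditions of [11] is a separate computation that this sketch does not attempt.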
Performing the Legendre transform one obtains
\[ (H_1)_t + (u_1)_x = 0, \quad (H_2)_t + (u_2)_y = 0, \quad (H_3)_t + (u_3)_z = 0, \]
or, in matrix form, $A_0 u_t + A_1 u_x + A_2 u_y + A_3 u_z = 0$, where the 3 × 3 matrices $A_i$ are given by
\[ A_0 = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{12} & H_{22} & H_{23} \\ H_{13} & H_{23} & H_{33} \end{pmatrix}, \quad
A_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Theorem 7. Any integrable (3 + 1)-dimensional Hamiltonian system (87) is either linear or reducible.

Proof. As a necessary condition for integrability, one has to require the vanishing of the Haantjes tensor for an arbitrary matrix of the form
\[ (A_0 + \lambda A_1 + \beta A_2 + \gamma A_3)^{-1} (A_0 + \tilde\lambda A_1 + \tilde\beta A_2 + \tilde\gamma A_3). \]
This is equivalent to the vanishing of the Haantjes tensor for a family of matrices built from $A_0$ and two arbitrary 3 × 3 constant coefficient diagonal matrices (note that $\lambda A_1 + \beta A_2 + \gamma A_3 = \mathrm{diag}(\lambda, \beta, \gamma)$). Computing the Haantjes tensor and equating to zero coefficients at different monomials in the diagonal entries of these two matrices, one obtains that either all third order derivatives $H_{ijk}$ are identically zero (this corresponds to linear systems), or $H_{ij} = H_{ik} = 0$ for some $i \neq j \neq k$ (this corresponds to the reducible case).

We would like to conclude this section by formulating the following general

Conjecture. There exists no non-trivial integrable Hamiltonian system of hydrodynamic type in 3 + 1 dimensions corresponding to a local Poisson bracket of hydrodynamic type and a local Hamiltonian density.

7. Concluding Remarks

We have found a broad class of non-trivial potentials leading to integrable Hamiltonian systems of hydrodynamic type in 2 + 1 dimensions. There are a number of natural problems arising in this context, in particular:
− Describe the structure of the corresponding Hamiltonian hierarchies. The main difficulty here is the non-locality of higher symmetries/conservation laws.
− Construct the associated Hamiltonian hydrodynamic chains. This requires the introduction of a canonical set of non-local variables reducing all higher flows of the hierarchy to infinite-component systems of hydrodynamic type.
− Construct dispersive deformations of the examples arising in the classification, especially those with ‘elliptic’ Lax pairs.
− Study the behavior of exact solutions coming from hydrodynamic reductions.
We hope to address some of these questions elsewhere.
Acknowledgements. We thank B. Dubrovin, O. Mokhov, A. Odesskii, M. Pavlov and A. Veselov for clarifying discussions. This research was supported by the EPSRC grant EP/D036178/1. The work of EVF was also partially supported by the European Union through the FP6 Marie Curie RTN project ENIGMA (Contract number MRTN-CT-2004-5652), and the ESF programme MISGAM. The work of VVS was partially supported by the RFBI-grant 08-01-00461. VVS also thanks IHES, where the final part of this research was completed, for their hospitality.
References
1. Blaszak, M.: Classical R-matrices on Poisson algebras and related dispersionless systems. Phys. Lett. A 297(3–4), 191–195 (2002)
2. Blaszak, M., Szablikowski, B.M.: Classical R-matrix theory of dispersionless systems. II. (2 + 1) dimension theory. J. Phys. A35(48), 10345–10364 (2002)
3. Boyer, C.P., Finley, J.D.: Killing vectors in self-dual Euclidean Einstein spaces. J. Math. Phys. 23, 1126–1130 (1982)
4. Burnat, M.: The method of Riemann invariants for multi-dimensional nonelliptic system. Bull. Acad. Polon. Sci. Sr. Sci. Tech. 17, 1019–1026 (1969)
5. Dafermos, C.: Hyperbolic conservation laws in continuum physics. Berlin-Heidelberg-New York: Springer-Verlag, 2000
6. Dubrovin, B.A., Novikov, S.P.: Poisson brackets of hydrodynamic type. Dokl. Akad. Nauk SSSR 279(2), 294–297 (1984)
7. Dubrovin, B.A., Novikov, S.P.: Hydrodynamics of weakly deformed soliton lattices. Differential geometry and Hamiltonian Theory. Russ. Math. Surv. 44(6), 35–124 (1989)
8. Godunov, S.K.: An interesting class of quasi-linear systems. Dokl. Akad. Nauk SSSR 139, 521–523 (1961)
9. Fairlie, D.B., Strachan, I.A.B.: The algebraic and Hamiltonian structure of the dispersionless Benney and Toda hierarchies. Inverse Problems 12(6), 885–908 (1996)
10. Ferapontov, E.V., Khusnutdinova, K.R.: On integrability of (2+1)-dimensional quasilinear systems. Commun. Math. Phys. 248, 187–206 (2004)
11. Ferapontov, E.V., Khusnutdinova, K.R.: The characterization of 2-component (2+1)-dimensional integrable systems of hydrodynamic type. J. Phys. A: Math. Gen. 37(8), 2949–2963 (2004)
12. Ferapontov, E.V., Khusnutdinova, K.R.: Double waves in multi-dimensional systems of hydrodynamic type: the necessary condition for integrability. Proc. R. Soc. A462, 1197–1219 (2006)
13. Ferapontov, E.V., Khusnutdinova, K.R., Tsarev, S.P.: On a class of three-dimensional integrable Lagrangians. Commun. Math. Phys. 261(1), 225–243 (2006)
14. Gibbons, J., Tsarev, S.P.: Reductions of the Benney equations. Phys. Lett. A211, 19–24 (1996)
15. Gibbons, J., Tsarev, S.P.: Conformal maps and reductions of the Benney equations. Phys. Lett. A258, 263–271 (1999)
16. Haantjes, J.: On X_m-forming sets of eigenvectors. Indag. Math. 17, 158–162 (1955)
17. Krichever, I.M.: The τ-function of the universal Whitham hierarchy, matrix models and topological field theories. Comm. Pure Appl. Math. 47(4), 437–475 (1994)
18. Manas, M., Medina, E., Martínez Alonso, L.: On the Whitham hierarchy: dressing scheme, string equations and additional symmetries. J. Phys. A 39(10), 2349–2381 (2006)
19. Mokhov, O.I.: Poisson brackets of Dubrovin-Novikov type (DN-brackets). Funct. Anal. Appl. 22(4), 336–338 (1988)
20. Mokhov, O.I.: The classification of nonsingular multidimensional Dubrovin-Novikov brackets. http://arxiv.org/list/math/0611785, 2006
21. Odesskii, A., Sokolov, V.: On 3-D hydrodynamic type systems possessing pseudopotential with movable singularities. http://arxiv.org/list/math-ph/0702026, 2007
22. Pavlov, M.V.: Classification of integrable Egorov hydrodynamic chains. Theoret. and Math. Phys. 138(1), 45–58 (2004)
23. Pavlov, M.V.: The Kupershmidt hydrodynamic chains and lattices. Int. Math. Res. Not. Art. ID 46987 (2006), 43 pp.
24. Peradzyński, Z.: Riemann invariants for the nonplanar k-waves. Bull. Acad. Polon. Sci. Sr. Sci. Tech. 19, 717–724 (1971)
25. Sidorov, A.F., Shapeev, V.P., Yanenko, N.N.: The method of differential constraints and its applications in gas dynamics. Novosibirsk: ‘Nauka’ (1984), 272 pp.
26. Szablikowski, B.M., Blaszak, M.: Dispersionful analogue of the Whitham hierarchy. http://arxiv.org/dbs/0707.1082, 2007
27. Tsarev, S.P.: Geometry of hamiltonian systems of hydrodynamic type. Generalized Hodograph Method. Izv. AN USSR Math. 54(5), 1048–1068 (1990)
28. Zakharov, E.V.: Dispersionless limit of integrable systems in 2 + 1 dimensions, in Singular Limits of Dispersive Waves, Ed. N.M. Ercolani et al., NY: Plenum Press, 1994, pp. 165–174

Communicated by G.W. Gibbons
Commun. Math. Phys. 285, 67–140 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0593-3
Communications in
Mathematical Physics
Representations of the Weyl Algebra in Quantum Geometry
Christian Fleischhack1,2
1 Max-Planck-Institut für Mathematik in den Naturwissenschaften, Inselstraße 22–26, 04103 Leipzig, Germany
2 Department Mathematik, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany. E-mail: [email protected]
Received: 23 October 2007 / Accepted: 20 April 2008
Published online: 22 October 2008 – © The Author(s) 2008. This article is published with open access at Springerlink.com
Abstract: The Weyl algebra A of continuous functions and exponentiated fluxes, introduced by Ashtekar, Lewandowski and others, in quantum geometry is studied. It is shown that, in the piecewise analytic category, every regular representation of A having a cyclic and diffeomorphism invariant vector, is already unitarily equivalent to the fundamental representation. Additional assumptions concern the dimension of the underlying analytic manifold (at least three), the finite wide triangulizability of surfaces in it to be used for the fluxes and the naturality of the action of diffeomorphisms – but neither any domain properties of the represented Weyl operators nor the requirement that the diffeomorphisms act by pull-backs. For this, the general behaviour of C*-algebras generated by continuous functions and pull-backs of homeomorphisms, as well as the properties of stratified analytic diffeomorphisms are studied. Additionally, the paper includes also a short and direct proof of the irreducibility of A.

1. Introduction

Every physical theory requires fundamental mathematical assumptions at the very beginning. It is highly desirable to justify them by even more fundamental axioms that are both mathematically and physically as plausible as possible. In loop quantum gravity, there are a few of such technical prerequisites. First of all, of course, one assumes that all objects are constructed out of parallel transports along graphs in a base manifold of an SU(2) principal fibre bundle (or maybe also using higher dimensional objects like in spin foam theory). This is reasonable by the fact that classical (canonical) gravity is an SU(2) gauge field theory with constraints as discovered by Ashtekar in the mid-80s [1]. Secondly, one needs inputs about the quantization of this classical system. For this, at least the structure of the configuration space C of all those parallel transports (modulo gauge transforms) has to be fixed.
If one wants to use functional integrals for quantization, one is forced to study measures on that space. The usage of parallel transports corresponding to smooth connections only, however, has led to enormous mathematical
problems. These could be widely avoided only by including distributional connections as well [2]. Namely, by the assumption that the reductions of the full theory to finitely many degrees of freedom (i.e., parallel transports on a finite graph) are continuous, one finds that the topology of C is a projective limit topology¹ making C a compact space. Here, the compactness is induced by that of the underlying structure group SU(2) comprising the values of the parallel transports. This strategy can be reused to find natural measures on C – one simply uses the assumption that the restrictions of the theory to finite graphs push forward the measure on C to the Haar measures on the finite powers of SU(2). This leads to the Ashtekar-Lewandowski measure µ0 [6]. Of course, this measure is “natural”, since the Haar measure on a Lie group is “natural” as well. However, this is at most a mathematical statement or a statement of beauty. The deeper question behind is how one can justify this choice by mathematical physics arguments.

1.1. Early attempts.

For the first time, this problem has been raised by Sahlmann [32]. He considered the class of measures on C that are absolutely continuous w.r.t. µ0, and realized [32,33] that (up to some additional technical assumptions) only µ0 allows for a diffeomorphism invariant measure such that the flux variables are represented as operators on the corresponding L² space. Although these results were proven for the case of a U(1) gauge theory, they have been expected to hold also for the case of a general compact structure Lie group G. Moreover, it suggests that the diffeomorphism invariance of gravity together with its full phase space description could be responsible for the uniqueness of µ0. The situation is similar to ordinary quantum mechanics.
There, the Stone-von Neumann theorem [11] tells us that there is (up to equivalence) precisely one irreducible regular representation of the Weyl algebra generated by the exponentiated position and momentum operators together with their Poisson relations. In the standard Schrödinger representation on $L^2(\mathbb{R}, dx)$, these unitary operators are given by
\[ [e^{i\pi x}\psi](x) = e^{i\pi x}\,\psi(x) \quad \text{and} \quad [e^{i\xi p}\psi](x) = \psi(x + \xi). \]
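The commutation behaviour of these two families of operators can be made concrete with a small numerical sketch. The code below is an illustration only: the names `U`, `V`, `weyl_defect`, the parameter name `a` (standing for the real parameter in the multiplication operator), and the Gaussian test function are assumptions introduced here. It applies a multiplication operator and a translation operator in both orders and measures how far translation-after-multiplication deviates from a pure phase times multiplication-after-translation.

```python
import cmath
import math


def U(a):
    """Multiplication operator: (U psi)(x) = exp(i*a*x) * psi(x)."""
    return lambda psi: (lambda x: cmath.exp(1j * a * x) * psi(x))


def V(xi):
    """Translation operator: (V psi)(x) = psi(x + xi)."""
    return lambda psi: (lambda x: psi(x + xi))


def weyl_defect(a, xi, psi, xs):
    """Largest deviation from V(xi) U(a) = exp(i*a*xi) U(a) V(xi) on the sample points xs."""
    lhs = V(xi)(U(a)(psi))          # first multiply, then translate
    rhs = U(a)(V(xi)(psi))          # first translate, then multiply
    phase = cmath.exp(1j * a * xi)
    return max(abs(lhs(x) - phase * rhs(x)) for x in xs)


def gaussian(x):
    """A sample square-integrable test function (hypothetical choice)."""
    return math.exp(-x * x / 2)
```

For instance, `weyl_defect(0.7, 1.3, gaussian, [k / 10 for k in range(-30, 31)])` is zero up to floating-point rounding: the two orderings differ exactly by the phase $e^{i a \xi}$, which is the exponentiated form of the canonical commutation relation and involves only bounded (unitary) operators.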
In loop quantum gravity, on the other hand, the connections are the generalized positions and the densitized dreibein fields are the generalized momenta. Exponentiation here includes also smearing: Connections are smeared along one-dimensional objects (i.e., paths) and exponentiated to give parallel transports – dreibeine along one-codimensional objects (i.e., hypersurfaces) to give flux variables. Now, one possible (even irreducible and regular) representation for the corresponding Weyl algebra A is given by multiplication and translation operators, respectively, on L² functions on C w.r.t. the Ashtekar-Lewandowski measure. All that suggests that maybe this representation π0 is even uniquely determined as well by certain reasonable assumptions. Sahlmann and Thiemann [35,34], supported by results of Lewandowski and Okołów [30] (see also [25] for further discussion), had argued that π0 may be the only irreducible, regular and diffeomorphism invariant representation of A. Despite the progress given by these papers, there had remained many open points, both technically and conceptually. A conceptual one concerned the domain properties of the represented operators. In fact, all results for non-abelian structure groups in [35] relied crucially on the fact that the self-adjoint generators of both the represented and the non-represented unitary operators share a certain, but not naturally given common dense domain. Another issue regarding the smoothness properties of the diffeomorphisms will be discussed below.

¹ Of course, any refinement of this topology leads to continuous reductions again. However, for simplicity, one ignores this possibility.
1.2. Achievements of the present paper.

The situation above has described the status some five years ago. The goal of our present paper is now to give a complete and rigorous proof of a Stone-von Neumann-like theorem in quantum geometry avoiding most of these problems. More precisely, we will show that every regular representation of A that has a cyclic and diffeomorphism invariant vector, is unitarily equivalent to the fundamental representation π0, provided the action of diffeomorphisms satisfies some rather mild condition. The main conceptual achievements of our theorem, in comparison to [35], are the following:

• There are no longer any requirements concerning the domains of the operators in the game. This will be possible, since we consistently, from the very beginning, work with the exponentiated fluxes only. At no point will we use their self-adjoint generators. There is only one issue, where we use the relation between operators and their generators. This will concern one-parameter subgroups in a compact Lie group in order to get some estimate for certain products in it. However, we will completely leave this infinitesimal arena before going back to the Weyl algebra level.

• The requirements concerning the representations of the diffeomorphisms are drastically weakened. In [35], it had to be assumed that these are represented via pull-backs and respect the decomposition of the representation restricted to C(C) into cyclic generators. In particular, one had to assume that each of these components contains a diffeomorphism invariant cyclic vector. As to be discussed at the end of the paper, a priori these requirements drastically reduce the measures allowed in these decompositions. We will now be able to show that this assumption can be replaced by a weaker one. We only require that coinciding addends in the decomposition share the same representation of diffeomorphisms if at least one addend is diffeomorphism invariant.

• Moreover, we will be able to clarify the particular class of diffeomorphisms to be used. Analytic diffeomorphisms are unsatisfactory from two points of view: Physically, they contradict the notion of locality, i.e., if we transform some set in the space(-time) manifold locally, then we transform this manifold even globally. Mathematically, they are not flexible enough as well, i.e., it will often be very difficult, if not impossible, to locally map objects onto each other under very rigid conditions, as we will see below. Therefore, we are forced to extend the class of isomorphisms. In fact, it will be manageable to use stratified analytic diffeomorphisms, slightly modifying the similar structures in, e.g., [29,21,10]. This, at the same time, leads to a natural extension of the surfaces used to define the Weyl operators, from analytic submanifolds to semianalytic sets. However, this is not a severe extension, since every semianalytic set can be stratified into a locally finite set of analytic submanifolds being mutually disjoint, i.e., having commuting Weyl operators.
1.3. Idea of the proof. Let us very shortly outline the proof of the uniqueness theorem. As usual (see, e.g., [35]), the restriction of any representation π of a Weyl-like algebra to the continuous functions, can be decomposed into (w.r.t. C(C)) cyclic ones. These are always the canonical representations on some L 2 (C, µν ) with appropriate measures µν on C. Assuming that π contained a cyclic vector having some invariance property, we may find such a decomposition, such that one of the constant vectors 1ν ∈ L 2 (C, µν ) has these properties as well. Then, being the first step where we use the particular structures of quantum geometry, regularity and diffeomorphism invariance imply that this µν is the Ashtekar-Lewandowski measure. Now, being the second step relying
on quantum geometry, we may show that certain Weyl operators are diffeomorphism conjugate to their adjoints. By general arguments, using the two properties above and adding invariance and cyclicity of $1_\nu$, we prove that π equals (up to unitary equivalence) the fundamental representation of A.

1.4. Comparison with LOST paper.

While this paper was prepared, Lewandowski, Okołów, Sahlmann and Thiemann (LOST) were working on a similar problem for the holonomy-flux ∗-algebra. This algebra is given if the fluxes themselves are considered together with the continuous functions on C. Some time after the present article had been sent to the arxiv, the four-author paper [26] was finished and appeared there as well. In this subsection, we are going to compare the corresponding results.

As already mentioned, the most striking difference between the two approaches lies in the algebra: We use both exponentiated positions and momenta, but LOST exponentiate positions only and keep the fluxes non-exponentiated. Consequently, LOST investigate the holonomy-flux algebra, a ∗-algebra, but we consider the Weyl algebra – a C*-algebra. Here the exponentiated fluxes are implemented as unitaries, whereas LOST study implicitly their self-adjoint generators being, of course, unbounded. The price to pay is that, in contrast to our case, LOST have to get rid of the persistent domain problems. This is done very directly using a state, since that (via GNS) guarantees the existence of a common dense domain for all the operators. By construction, this domain is spanned by the cylindrical functions on C. On the other hand, we only assume that the Weyl operators are continuously represented w.r.t. their smearing. This means that each corresponding one-parameter subgroup has some self-adjoint generator. If this was not the case, it is expected that then there exist other diffeoinvariant representations of the Weyl algebra.
Nevertheless, note that our regularity assumption for each single one-parameter subgroup is much weaker than that of the existence of a certain common dense domain for all generators as in the LOST case. Indeed, our assumption follows from the LOST requirements: The GNS construction implies that, given a state, the ∗-invariant fluxes become symmetric operators. As it turns out, they are even self-adjoint. Hence they generate weakly continuous one-parameter subgroups. All that seems to show that our result is much stronger than that of LOST. However, there will be an additional assumption made in our paper only: the diffeomorphisms are implemented naturally. Until now, neither the relevance of this requirement nor its possible counterpart in the LOST paper is clear by any means. However, while, as a matter of principle, it cannot be expected that the domain assumptions above can be dropped by LOST, we do hope that the naturality condition can be shown obsolete sometime. The remaining differences are, from our point of view, secondary. Let us only sketch a few of them. The technical advantage of the ∗-algebra case is the linearity of the fluxes w.r.t. the smearing, which enables LOST to use the scalar-product trick by Okołów. At the same time, LOST have to use compactly supported smearing functions. We, on the other hand, are confined to (up-to-gauge) constant smearings, although there is some hope to relax that. Since compactly supported smearings mean that one can restrict oneself to “nice” parts of the surfaces and forget about near-boundary regions, LOST, in contrast to us, did not have to assume that the surfaces are (widely) triangulizable. Rather similar are the general assumptions concerning smoothness. The striking idea that underlies both investigations is that stratified analytic objects comprise both the advantages of analyticity and those of locality. Only the implementation somewhat differs.
Both are influenced by the notion of semianalyticity introduced mainly by Łojasiewicz, but – for simplicity – we mostly study these structures on a given analytic
Representations of the Weyl Algebra in Quantum Geometry
Table 1. Comparison between LOST and Fleischhack

(row) | LOST | Fleischhack
theory | gauge field theory | gauge field theory
geometric ingredients | principal fibre bundle P · structure group G · base manifold M | principal fibre bundle P · structure group G · base manifold M
smoothness | stratified analytic · C^k · semianalytic | stratified analytic · C^0 · semi- or subanalytic
basic assumptions | G compact connected Lie · M stratified analytic · dim M ≥ 2 | G compact connected Lie · M analytic · dim M ≥ 3
diffeomorphisms | stratified analytic | stratified analytic
positions | connections | connections
· exponentiated | yes | yes
· smeared along | paths | paths
paths | stratified analytic | stratified analytic
momenta | fluxes | fluxes
· exponentiated | no | yes
· smeared along | surfaces | surfaces
surfaces | stratified analytic · open · codimension 1 · none | stratified analytic · open · codimension 1+ · widely triangulizable
smearing functions | stratified analytic · compactly supported | stratified analytic · constant on strata
algebra | holonomy-flux algebra | Weyl algebra
type | ∗-algebra | C∗-algebra
generators: positions | cylindrical functions on C | continuous functions on C
generators: momenta | weak derivatives of pull-backs of left/right translations on C | (unitary) pull-backs of left/right translations on C
uniqueness | state | representation
assumed cyclicity | cyclic invariant vector | cyclic invariant vector
domain assumptions | common dense domain: cylindrical functions | none
regularity assumptions | none | regularity w.r.t. smearing
add’l assumptions | none | natural diffeo action
required invariance | all bundle automorphisms · diffeomorphisms · gauge transformations | some bundle automorphisms · some diffeomorphisms · none
manifold, whereas LOST define semianalytic structures in a more categorical way. Nevertheless, essentially all of our considerations should be directly transferable to the LOST framework and vice versa. There should also be no significant changes if we required semianalyticity to include not only continuity at the boundaries, but also $C^k$ as in the LOST regime. Only in the $C^\infty$ case is this not completely clear. Finally, we summarize our comparison in Table 1. Note that there we slightly modify the notions used in the respective articles to better explain coincidences and differences.

1.5. Further developments. Both the LOST paper and the present one originate from the quest for a quantum gravity theory. Therefore, as said above, their main application
Ch. Fleischhack
concerns an SU(2)-gauge field theory over a three-dimensional manifold M (i.e., some Cauchy surface) with diffeomorphism invariance as a fundamental symmetry. All the results contain, of course, this case, but go far beyond it. Nevertheless, some related questions are still unsolved. For instance, what about theories with other symmetries or another field content? First results have been obtained for homeomorphism invariant scalar field theories [23,24]. Here it turned out that there are indeed other states, labelled by the Euler characteristics, i.e., algebraic-topological properties of the hypersurfaces. Another approach, currently under investigation, has been taken by Bahr and Thiemann [9], extending the diffeomorphism group symmetry to general automorphisms of the path groupoid.
1.6. Structure of the article. To finish the introduction, let us briefly outline the present paper. In Sect. 2 we start with a general investigation of C∗-algebras that are generated by the continuous functions on a compact Hausdorff space X and by pull-backs of homeomorphisms of X. Afterwards, we switch over to quantum geometry. Since we would like the theory to be applicable to paths of weaker smoothness classes as well, we generalize the notion of oriented surfaces, introducing quasi-surfaces and intersection functions, in Sect. 3. Then, in Sect. 4, the Weyl algebra of quantum geometry is defined and the assumed structures regarding paths, hypersurfaces, diffeomorphisms etc. are fixed. After presenting a fairly short and direct proof for the irreducibility of the Weyl algebra in Sect. 5, we study the theory of stratified diffeomorphisms in detail in Sect. 6. The main result on the uniqueness of representations is then contained in Sect. 7, including a discussion of the assumptions made and the extensions possible.
2. General Setting

Let X be a compact Hausdorff space and $\mathrm{Homeo}(X)$ be the set of all homeomorphisms of X. Given some $\xi \in \mathrm{Homeo}(X)$, its pull-back to $C(X)$ is denoted by $w_\xi$ or, as usual, $\xi^*$. Correspondingly, for every $\mathcal{H} \subseteq \mathrm{Homeo}(X)$, the set $W_{\mathcal{H}} \equiv \mathcal{H}^* \subseteq \mathrm{Homeo}^*(X)$ contains precisely the pull-backs of all elements in $\mathcal{H}$. The other way round, given some pull-back $w \in \mathrm{Homeo}^*(X)$, the corresponding homeomorphism is denoted by $\xi_w$, i.e., we have $\xi_w^* = w$. Analogously, $\mathcal{H}_W \subseteq \mathrm{Homeo}(X)$ is defined for all $W \subseteq \mathrm{Homeo}^*(X)$. Moreover, we denote by $\langle W \rangle$ the (abstract) subgroup of $\mathrm{Homeo}^*(X)$ generated by W and define, analogously, $\langle \mathcal{H} \rangle$. Obviously, $\mathcal{H}_{\langle W \rangle} = \langle \mathcal{H}_W \rangle$ and $W_{\langle \mathcal{H} \rangle} = \langle W_{\mathcal{H}} \rangle$.

Next, for every measure² µ on X, we denote by $\mathcal{H}(\mu)$ the set of all homeomorphisms of X leaving µ invariant. Clearly, $\langle \mathcal{H}(\mu) \rangle = \mathcal{H}(\mu)$. Moreover, every $w \in W_{\mathcal{H}(\mu)}$ extends naturally to a unitary operator on $L^2(X,\mu)$, again denoted by w. By $w(f\psi) = w(f)\,w(\psi)$ for all $f \in C(X)$, $\psi \in L^2(X,\mu)$ and $w \in W_{\mathcal{H}(\mu)}$, we have $w \circ f \circ w^{-1} = w(f)$ as operators in $B(L^2(X,\mu))$. Sometimes, we will extend the notion to operators: $w_1(w_2) := w_1 \circ w_2 \circ w_1^{-1}$ for $w_1, w_2 \in W_{\mathcal{H}(\mu)}$. Finally, let $A(W,\mu)$ denote the C∗-subalgebra in $B(L^2(X,\mu))$ generated by $C(X)$ and $W \subseteq W_{\mathcal{H}(\mu)}$, and let $\pi_0$ be the identical (or fundamental) representation of $A(W,\mu)$ on $L^2(X,\mu)$.

Lemma 2.1. For every $W \subseteq W_{\mathcal{H}(\mu)}$, the subalgebra spanned by all products $f \circ w$ with $f \in C(X)$ and $w \in \langle W \rangle$ is dense in $A(W,\mu)$.

2 If not stated otherwise, by a measure we always mean a normalized regular Borel measure.
Proof. Since $w \circ f = w(f) \circ w$ for all $w \in \langle W \rangle$ and $f \in C(X)$, the product

$$f_1 \circ w_1 \circ f_2 \circ w_2 \circ \cdots \circ w_k \circ f_{k+1} = f_1 \cdot w_1(f_2) \cdots w_1(w_2(\ldots(w_k(f_{k+1}))\ldots)) \circ w_1 \circ \cdots \circ w_k$$

is always in $C(X) \circ \langle W \rangle$. Moreover, with f, also $f^* \equiv \bar{f}$ is in $C(X)$, and with w, also $w^* = w^{-1}$ is in $\langle W \rangle$. Therefore, the span of $C(X) \circ \langle W \rangle$ equals the ∗-subalgebra of $B(L^2(X,\mu))$ generated by $C(X)$ and W.
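The reordering used in this proof can be made concrete in a finite toy model. The following sketch is only an illustration under assumptions that are not part of the paper: X is a four-point discrete space, homeomorphisms are permutations, and the helper names (`pullback`, `mult_op`, `pullback_op`, `chain`) are hypothetical. It checks the relation $w \circ f = w(f) \circ w$ and the resulting normal ordering for k = 2 on explicit matrices.

```python
# Toy model (assumption): X = {0, 1, 2, 3} as a finite discrete space.
# A homeomorphism of X is then a permutation xi (stored as a list), and
# its pull-back acts on functions by (w f)(x) = f(xi(x)).
N = 4

def pullback(xi, f):
    """Pull-back w(f) = f o xi of a function f (list of values) along xi."""
    return [f[xi[x]] for x in range(N)]

def mult_op(f):
    """Matrix of the multiplication operator psi -> f * psi."""
    return [[f[x] if x == y else 0 for y in range(N)] for x in range(N)]

def pullback_op(xi):
    """Matrix of the pull-back operator psi -> psi o xi."""
    return [[1 if y == xi[x] else 0 for y in range(N)] for x in range(N)]

def matmul(a, b):
    """Plain matrix product of two N x N matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def chain(*ops):
    """Compose operators left to right: chain(A, B, C) = A B C."""
    result = ops[0]
    for op in ops[1:]:
        result = matmul(result, op)
    return result

xi1, xi2 = [1, 2, 3, 0], [0, 2, 1, 3]            # two sample permutations
f1, f2, f3 = [1, 2, 3, 4], [2, 0, 1, 5], [3, 1, 4, 1]
W1, W2 = pullback_op(xi1), pullback_op(xi2)

# Basic relation w o f = w(f) o w.
basic_relation_holds = (
    chain(W1, mult_op(f2)) == chain(mult_op(pullback(xi1, f2)), W1)
)

# Normal ordering: f1 o w1 o f2 o w2 o f3 = [f1 * w1(f2) * w1(w2(f3))] o w1 o w2.
lhs = chain(mult_op(f1), W1, mult_op(f2), W2, mult_op(f3))
g = [a * b * c for a, b, c in
     zip(f1, pullback(xi1, f2), pullback(xi1, pullback(xi2, f3)))]
rhs = chain(mult_op(g), W1, W2)
normal_ordering_holds = (lhs == rhs)
```

Integer-valued functions are used so that both comparisons are exact rather than approximate.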
Throughout the whole section, let µ be some arbitrary, but fixed measure on X.

2.1. First-step decomposition. Since every representation of a C∗-algebra is the direct sum of a zero representation and a non-degenerate one, we may restrict ourselves to non-degenerate representations in the following.

Lemma 2.2. Fix some $W \subseteq W_{\mathcal{H}(\mu)}$ and let π be a non-degenerate representation of $A(W,\mu)$ on some Hilbert space H. Then there are measures $\mu_\nu$ on X with ν running over some (not necessarily countable) index set N, such that $\pi|_{C(X)}$ is unitarily equivalent to the direct-sum representation $\bigoplus_\nu \pi_{\mu_\nu}$, where $\pi_{\mu_\nu}$ denotes the canonical representation of $C(X)$ on $L^2(X,\mu_\nu)$ by multiplication operators. Moreover, these measures may be chosen, such that two of them are equal if they are equivalent (w.r.t. absolute continuity).

Proof. Every non-degenerate representation of a C∗-algebra is (up to unitary equivalence) the direct sum of cyclic representations [12]. The first assertion now follows, because every cyclic representation of $C(X)$ is equivalent to the canonical representation on $L^2(X,\mu_\nu)$ by multiplication operators for some regular Borel measure $\mu_\nu$ [36]. Note that $\pi|_{C(X)}$ is non-degenerate by $1 \in C(X)$. Since measures on X are equivalent w.r.t. absolute continuity iff the corresponding canonical representations are equivalent [36], we get the proof.
Definition 2.1. A decomposition $\bigoplus_\nu \pi_{\mu_\nu}$ as given in Lemma 2.2 is called first-step decomposition of π. Sometimes we write $(\mu_\nu)_{\nu \in N}$ or shortly $\boldsymbol{\mu}$ to characterize such a decomposition. Moreover, if the particular W is not important, we will consider first-step decompositions without any reference to some π.

Definition 2.2. A first-step decomposition is called short iff N consists of a single element.

Remark. First-step decompositions are not at all unique. In fact, consider a short one with $\mu_\nu = \mu$ and choose $U \subseteq X$ with $0 < \mu(U) < 1$. Decomposing any $\psi \in H$ into $\psi = 1_U \psi + 1_{X \setminus U} \psi$ with $1_U$ being the characteristic function on U, we get a first-step decomposition $\pi_{\mu_U} \oplus \pi_{\mu_{X \setminus U}}$. Here, $\mu_U$ is the normalization of $1_U\, \mu$.

In the following, given some representation π of $A(W,\mu)$ on H, we will usually assume that $\pi|_{C(X)}$ equals (one of) its first-step decomposition(s). Moreover, we usually write shortly $\pi_\nu$ instead of $\pi_{\mu_\nu}$. By $\|\cdot\|_{\mu_\nu}$ we denote the norm on $L^2(X,\mu_\nu) =: H_\nu$ and by $P_\nu$ the respective orthogonal projector mapping H to $H_\nu$. In particular, we have $\|\pi(f)\psi\|_H^2 = \sum_\nu \|f \cdot P_\nu \psi\|_{\mu_\nu}^2$ for all $f \in C(X)$ and $\psi \in H$. Next, let $I_\nu : H_\nu \longrightarrow H$
denote the (norm-preserving) canonical embedding of $H_\nu$ into H and set $1_\nu := I_\nu(1)$, where 1 is seen not only as an element in $C(X)$, but in $H_\nu$ as well. Anyway, often we will simply drop $I_\nu$. Analogously, we do not explicitly mark the transition from continuous functions to their classes in $L^2$ when calculating scalar products. Note, however, that $C(X)$ is, in general, not embedded into $L^2(X,\mu_\nu)$. Let, e.g., $\mu_\nu$ be the Dirac measure at some point in X, then the image of $C(X)$ is isomorphic to $\mathbb{C}$. Therefore, one has to be careful when operating with pull-backs of homeomorphisms that do not leave $\mu_\nu$ invariant. Finally, for $\mu_{\nu_1} = \mu_{\nu_2}$ we denote the canonical isomorphism mapping $I_{\nu_1}(H_{\nu_1})$ to $I_{\nu_2}(H_{\nu_2})$ by $I^{\nu_2}_{\nu_1}$.

Definition 2.3. Let W be a subset of $W_{\mathcal{H}(\mu)}$ and let π be some representation of $A(W,\mu)$ on some Hilbert space H. A vector $\psi \in H$ is called W-invariant iff $\pi(w)\psi = \psi$ for all $w \in W$.

Note that we tacitly assume some information about π to be given when we speak of invariance w.r.t. some W. This will avoid some cumbersome notation when we study equivalent representations.

Lemma 2.3. Let W and $W'$ be subsets of $W_{\mathcal{H}(\mu)}$, let $\pi'$ be a representation of $A(W \cup W', \mu)$ on some Hilbert space H, and let $\psi \in H$ be a $W'$-invariant vector. Then there is a first-step decomposition $\bigoplus_{\nu \in N} \pi_{\mu_\nu}$ of $\pi'$ and some $\nu \in N$, such that $1_\nu$ is a $W'$-invariant vector. If, moreover, ψ is cyclic for $\pi'|_{A(W,\mu)}$, then $1_\nu$ may be chosen cyclic as well.

Proof. Define $H_\nu := \overline{\pi'(C(X))\psi} \subseteq H$. Then both $H_\nu$ and $H_\nu^\perp$ are invariant w.r.t. $\pi'(C(X))$. Since $H_\nu^\perp$ is non-degenerate (if not zero), the projection of $\pi'|_{C(X)}$ to $H_\nu^\perp$ is (up to equivalence) some direct sum $\bigoplus_{\nu' \in N'} \pi_{\mu_{\nu'}}$ of cyclic representations of $C(X)$. Since, on the other hand, $\pi'|_{C(X)}$ is cyclic on $H_\nu$, it is equivalent to the canonical representation $\pi_{\mu_\nu}$ of $C(X)$ on some $L^2(X,\mu_\nu)$, whereas the corresponding intertwiner maps ψ to $1_\nu$. Now, by construction, $\pi_{\mu_\nu} \oplus \bigoplus_{\nu' \in N'} \pi_{\mu_{\nu'}}$ is a first-step decomposition of $\pi'$. Moreover, the $W'$-invariance of ψ translates into that of $1_\nu$, and so does the cyclicity, if given.
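The step "the intertwiner maps ψ to $1_\nu$" can be illustrated in finite dimensions. The sketch below is an assumption-laden toy case, not the general construction: X has five points, H is $\mathbb{C}^5$ with $C(X)$ acting by pointwise multiplication, and the measure of the cyclic summand generated by a unit vector ψ is taken to be $\mu_\nu(\{x\}) = |\psi(x)|^2$. The names `inner_H`, `inner_mu` and `act` are hypothetical helpers. The check confirms that $\pi(f)\psi \mapsto f$ preserves scalar products.

```python
import math

# Toy model (assumption): X = {0,...,4}, H = C^5 with C(X) acting by
# pointwise multiplication, and psi a unit vector in H.
psi = [0.6, 0.0, 0.8j, 0.0, 0.0]
assert math.isclose(sum(abs(c) ** 2 for c in psi), 1.0)

# Measure of the cyclic summand generated by psi: mu_nu({x}) = |psi(x)|^2.
mu_nu = [abs(c) ** 2 for c in psi]

def inner_H(a, b):
    """Scalar product on H, antilinear in the first slot."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def inner_mu(f, g, mu):
    """Scalar product on L^2(X, mu) for functions given as value lists."""
    return sum(complex(a).conjugate() * b * m for a, b, m in zip(f, g, mu))

def act(f, vec):
    """pi(f) vec: pointwise multiplication by f."""
    return [a * v for a, v in zip(f, vec)]

f1 = [1, 2j, 3, -1, 0.5]
f2 = [2, 1, -1j, 4, 7]

# <pi(f1) psi, pi(f2) psi>_H should equal <f1, f2>_{mu_nu}.
lhs = inner_H(act(f1, psi), act(f2, psi))
rhs = inner_mu(f1, f2, mu_nu)
```

Note that `f1` and `f2` differ from each other on points of measure zero without affecting either side, which mirrors the remark that $C(X)$ is in general not embedded into $L^2(X,\mu_\nu)$.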
Now, throughout the whole Sect. 2, we let W and $W'$ be some arbitrary subsets of $W_{\mathcal{H}(\mu)}$, whereas $w'(W) \subseteq W$ for all $w' \in W'$. Note that we do not assume that they are fixed once and for all, i.e., they may be changed from one statement to the other. Next, π and $\pi'$ are always non-degenerate representations of $A(W,\mu)$ and $A(W \cup W', \mu)$, respectively, on some Hilbert space H, where π is the restriction of $\pi'$ to $A(W,\mu)$.³ We let $\bigoplus_\nu \pi_{\mu_\nu}$ be a fixed first-step decomposition of π on $H = \bigoplus_\nu H_\nu = \bigoplus_\nu L^2(X,\mu_\nu)$ and usually set $\pi_\nu := \pi_{\mu_\nu}$ for simplicity. Note that every first-step decomposition of π is also one for $\pi'$ and vice versa, since π and $\pi'$ coincide on $A(W,\mu)$, which contains $C(X)$. Moreover, if there is some $W'$-invariant (and π-cyclic) vector, then we assume that there is some $\nu \in N$, such that $1_\nu$ is $W'$-invariant (and π-cyclic). Note that this does not contradict the assumption above that measures in a first-step decomposition are equal if they are equivalent. Finally, in order to fix a home for the one-parameter subgroups in W introduced later, we fix some subset $\mathcal{R}$ in the set $\mathrm{Hom}(\mathbb{R}, W)$ of homomorphisms from $\mathbb{R}$ to W.

3 Often, we will not refer to $\pi'$ at all. Then, in general, we tacitly set $W' = \emptyset$ and $\pi' = \pi$.
2.2. $\pi_\nu$-Scalars and $\pi_\nu$-Units.

Definition 2.4. An element $w \in W$ is called
• $\pi_\nu$-scalar iff $P_\nu \pi(w) 1_\nu = c\, 1$ for some $c \in \mathbb{C}$;
• $\pi_\nu$-unit iff $P_\nu \pi(w) 1_\nu = 1$.
Analogously, we define these properties for $w' \in W'$.

Since w is unitary, we have

Lemma 2.4. $1_\nu$ is $\pi(w)$-invariant $\iff$ w is a $\pi_\nu$-unit $\iff$ $w^*$ is a $\pi_\nu$-unit.

Corollary 2.5. Any finite product of $\pi_\nu$-units is a $\pi_\nu$-unit.

Lemma 2.6. If $1_\nu$ is $\pi(w)$-invariant, then $H_\nu$ and $H_\nu^\perp$ are $\pi(w)$-invariant.

Proof. Fix some $\psi_\nu \in H_\nu$ and recall that $1_\nu$ is cyclic for $\pi_\nu$, i.e., for every $\varepsilon > 0$ there is some $f \in C(X)$ with $\|\pi(f) 1_\nu - \psi_\nu\| < \varepsilon$. Since w is a $\pi_\nu$-unit, we have $\pi(w)\pi(f) 1_\nu = \pi(w(f)) 1_\nu \in H_\nu$. By unitarity of w, we get $\|\pi(w)\psi_\nu - \pi(w)\pi(f) 1_\nu\| < \varepsilon$, i.e., $\mathrm{dist}(\pi(w)\psi_\nu, H_\nu) < \varepsilon$ for all $\varepsilon > 0$. Hence, $\pi(w)\psi_\nu \in H_\nu$. The invariance of $H_\nu^\perp$ follows from the unitarity of $\pi(w)$.

Corollary 2.7. If each $w \in W$ is a $\pi_\nu$-unit, then the restriction of π to $H_\nu$ is cyclic.

Proof. Since every $\pi_\nu$-unit leaves $H_\nu$ invariant, $\pi(A(W,\mu))$ leaves $H_\nu$ invariant. Since $1_\nu$ is already cyclic on $H_\nu$ for π restricted to $C(X) \subseteq A(W,\mu)$, we get the assertion.

Lemma 2.8. Let $w \in W$ be a $\pi_\nu$-scalar and assume $\psi_0 := (1 - I_\nu P_\nu)\pi(w) 1_\nu \neq 0$. Define ψ to be the normalization of $\psi_0$, and let $H_\psi$ be the completion of $\pi(C(X))\psi$. Finally, assume that $\mu_\nu$ is $\xi_w$-invariant, i.e., $w \in W_{\mathcal{H}(\mu_\nu)}$. Then the restriction of $\pi|_{C(X)}$ to its invariant subspace $H_\psi$ is equivalent to the canonical representation of $C(X)$ on $L^2(X,\mu_\nu)$. Moreover, $H_\nu$ and $H_\psi$ are orthogonal.

Proof. Of course, by definition, $H_\psi$ is invariant w.r.t. $\pi(C(X))$. Let now $f_1$ and $f_2$ be in $C(X)$. Then, by unitarity of $\pi(w)$ and $\xi_w$-invariance of $\mu_\nu$, we have

$$\begin{aligned}
\langle \pi(f_1)\psi_0, \pi(f_2)\psi_0 \rangle_H
&= \langle \pi(f_1)(1 - I_\nu P_\nu)\pi(w) 1_\nu,\ \pi(f_2)(1 - I_\nu P_\nu)\pi(w) 1_\nu \rangle_H \\
&= \langle (1 - I_\nu P_\nu)\pi(f_1)\pi(w) 1_\nu,\ (1 - I_\nu P_\nu)\pi(f_2)\pi(w) 1_\nu \rangle_H \\
&= \langle \pi(f_1)\pi(w) 1_\nu,\ \pi(f_2)\pi(w) 1_\nu \rangle_H - \langle I_\nu P_\nu \pi(f_1)\pi(w) 1_\nu,\ I_\nu P_\nu \pi(f_2)\pi(w) 1_\nu \rangle_H \\
&= \langle \pi(w)\pi(w^*(f_1)) 1_\nu,\ \pi(w)\pi(w^*(f_2)) 1_\nu \rangle_H - \langle f_1 \cdot P_\nu \pi(w) 1_\nu,\ f_2 \cdot P_\nu \pi(w) 1_\nu \rangle_{\mu_\nu} \\
&= \langle w^*(f_1), w^*(f_2) \rangle_{\mu_\nu} - |c|^2 \langle f_1, f_2 \rangle_{\mu_\nu} \\
&= (1 - |c|^2)\, \langle f_1, f_2 \rangle_{\mu_\nu},
\end{aligned}$$

where c is given by $P_\nu \pi(w) 1_\nu = c\, 1$. By the arguments above, $\|\psi_0\|^2 = 1 - |c|^2$, implying

$$\langle \pi(f_1)\psi, \pi(f_2)\psi \rangle_H = \langle f_1, f_2 \rangle_{\mu_\nu} = \langle \pi(f_1) 1_\nu, \pi(f_2) 1_\nu \rangle_H$$

for all $f_1, f_2 \in C(X)$. The orthogonality of $H_\nu$ and $H_\psi$ follows directly from that of $1_\nu$ and ψ.
Definition 2.5. π is called $W'$-natural iff, for any first-step decomposition $\bigoplus_{\nu \in N} \pi_{\mu_\nu}$ and for all $\nu_1, \nu_2 \in N$ with $\mu_{\nu_1} = \mu_{\nu_2}$, any $\pi'(W')$-invariance of $H_{\nu_1}$ implies that of $H_{\nu_2}$ and

$$I^{\nu_2}_{\nu_1} \circ \bigl( I_{\nu_1} P_{\nu_1}\, \pi'|_{A(W',\mu)} \bigr) = \bigl( I_{\nu_2} P_{\nu_2}\, \pi'|_{A(W',\mu)} \bigr) \circ I^{\nu_2}_{\nu_1}.$$

Obviously, π is $W'$-natural if the respective first-step decomposition is short. Moreover, if π is $W'$-natural and if $\mu_{\nu_1} = \mu_{\nu_2}$, then $1_{\nu_1}$ is $\pi'(w')$-invariant iff $1_{\nu_2}$ is $\pi'(w')$-invariant.

Corollary 2.9. Let $w \in W \cap W_{\mathcal{H}(\mu_\nu)}$ be a $\pi_\nu$-scalar. Moreover, let π be $W'$-natural. If $1_\nu$ is $\pi'(w')$-invariant, then $\pi(w) 1_\nu$ is $\pi'(w')$-invariant.

Proof. Since $w'$ is a $\pi_\nu$-unit, $H_\nu$ is $\pi'(w')$-invariant. Hence,

$$\pi'(w')\, I_\nu P_\nu \pi(w) 1_\nu = \pi'(w')\, c\, 1_\nu = c\, 1_\nu = I_\nu P_\nu \pi(w) 1_\nu.$$

If $\psi_0 := (1 - I_\nu P_\nu)\pi(w) 1_\nu = 0$, the statement is trivial. Otherwise, we know from the lemma above and the notations there that $H_\nu$ and $H_\psi$ are orthogonal. Choose a first-step decomposition of π containing the representation $\pi_{\mu_\nu}$ for $H_\nu$ and for $H_\psi$. In fact, simply construct a first-step decomposition of the orthogonal complement of $H_\nu \oplus H_\psi$ in H. Now, since π is $W'$-natural, ψ is $\pi'(w')$-invariant as well, by the $\pi'(w')$-invariance of $1_\nu$ and the lemma above. The proof follows from $\pi(w) 1_\nu = \psi_0 + I_\nu P_\nu \pi(w) 1_\nu$.

Corollary 2.10. Let $w \in W \cap W_{\mathcal{H}(\mu_\nu)}$. Additionally, let π be $W'$-natural, and let $1_\nu$ be $\pi'(w')$-invariant for some $w' \in W'$. If w is a $\pi_\nu$-scalar, then

$$\pi(w'(w)) 1_\nu = \pi'(w')\pi(w) 1_\nu = \pi(w) 1_\nu.$$

This means, in particular,
w is a $\pi_\nu$-scalar $\iff$ $w'(w)$ is a $\pi_\nu$-scalar,
w is a $\pi_\nu$-unit $\iff$ $w'(w)$ is a $\pi_\nu$-unit.

Corollary 2.11. Assume that π is $W'$-natural and that $1_\nu$ is $\pi'(W')$-invariant. Then, for all $\pi_\nu$-scalars $w \in W \cap W_{\mathcal{H}(\mu_\nu)}$ and all $w', w_1', w_2' \in W'$, we have
$w'(w) = w_1'(w) \circ w_2'(w) \implies$ w is a $\pi_\nu$-unit,
$w'(w) = w^* \implies$ $w^2$ is a $\pi_\nu$-unit.

Proof. Using Corollary 2.9 we have in the first case,

$$\begin{aligned}
\pi(w) 1_\nu &= \pi'(w_1')^* \pi(w) 1_\nu = \pi'(w_1')^* \pi(w'(w)) 1_\nu \\
&= \pi'(w_1')^* \pi\bigl(w_1'(w) \circ w_2'(w)\bigr) 1_\nu = \pi'(w_1')^* \pi(w_1'(w))\, \pi(w_2'(w)) 1_\nu \\
&= \pi(w)\, \pi'(w_1')^*\, \pi'(w_2')\, \pi(w)\, \pi'(w_2')^* 1_\nu = \pi(w)\pi(w) 1_\nu
\end{aligned}$$

and, in the second one, $\pi(w) 1_\nu = \pi(w'(w)) 1_\nu = \pi(w^*) 1_\nu$.
Lemma 2.12. If w is a $\pi_\nu$-unit and $\mu_\nu$ equals µ, then $P_\nu \pi(w) = \pi_0(w) P_\nu$.

Proof. We have $\pi(w) f 1_\nu = \pi(w)\pi(f) 1_\nu = \pi(w(f))\pi(w) 1_\nu = \pi(w(f)) 1_\nu = w(f) 1_\nu$, hence $P_\nu \pi(w) f 1_\nu = \pi_0(w) P_\nu f 1_\nu$ for all $f \in C(X)$. By continuity of $\pi_0(w)$ on $L^2(X,\mu)$ and by cyclicity of $1_\nu$ w.r.t. $C(X)$, we get $P_\nu \pi(w) = \pi_0(w) P_\nu$ on $H_\nu$. Finally, both $P_\nu \pi(w)$ (by Lemma 2.6) and $\pi_0(w) P_\nu$ are zero on $H_\nu^\perp$.

Corollary 2.13. Let w and $w_0$ be commuting elements in W. Moreover, let $\mu_\nu = \mu$. If $w_0$ is a $\pi_\nu$-unit, then it leaves $P_\nu \pi(w) 1_\nu$ invariant.

Proof.

$$\begin{aligned}
w_0\, P_\nu \pi(w) 1_\nu &\equiv \pi_0(w_0)\, P_\nu \pi(w) 1_\nu \\
&= P_\nu \pi(w_0)\pi(w) 1_\nu \qquad \text{(Lemma 2.12)} \\
&= P_\nu \pi(w)\pi(w_0) 1_\nu \\
&= P_\nu \pi(w) 1_\nu.
\end{aligned}$$
2.3. Continuous $\mu_0$-generating systems. Until the end of this subsection, let $\mu_0$ be some measure on X.

Definition 2.6. A subset E of $C(X)$ is called continuous $\mu_0$-generating system iff
• $1 \in E$ is orthogonal in $L^2(X,\mu_0)$ to each other element in E and
• $\mathrm{span}_{\mathbb{C}}\, E$ is dense both in $C(X)$ and in $L^2(X,\mu_0)$.

Lemma 2.14. Let $E \subseteq C(X)$ be a continuous $\mu_0$-generating system for some measure $\mu_0$, and let ψ be a vector in $L^2(X,\mu_0)$. Then $\langle f, \psi \rangle_{\mu_0} = 0$ for all $1 \neq f \in E$ implies that $\psi = c\, \|\psi\|\, 1$ for some $c \in U(1)$.

Proof. Use $L^2(X,\mu_0) = \overline{\mathrm{span}_{\mathbb{C}}\, E} = \overline{\mathrm{span}_{\mathbb{C}} \{1\} \oplus \mathrm{span}_{\mathbb{C}} (E \setminus \{1\})} = \mathbb{C}\, 1 \oplus \overline{\mathrm{span}_{\mathbb{C}} (E \setminus \{1\})}$.

Lemma 2.15. If $E \subseteq C(X)$ is a continuous generating system w.r.t. two measures $\mu_1$ and $\mu_2$, then $\mu_1$ equals $\mu_2$.

Proof. We have $\int_X f \, d\mu_1 = \langle 1, f \rangle_{\mu_1} = 0 = \langle 1, f \rangle_{\mu_2} = \int_X f \, d\mu_2$ for all $1 \neq f \in E$. Since $\mathrm{span}_{\mathbb{C}}\, E$ is dense in $C(X)$, the assertion follows from the regularity of the measures.

Lemma 2.16. Continuous $\mu_0$-generating systems always exist.

Proof. $C(X)$ always spans a dense subset in $L^2(X,\mu_0)$. Let now E contain 1 and all $f - \langle 1, f \rangle_{\mu_0}\, 1$ with f in $C(X)$.
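The construction in the proof of Lemma 2.16 is easy to test on a finite space. The following sketch assumes a four-point space X with an explicitly chosen normalized measure and real-valued functions; the names `inner`, `center` and `indicators` are hypothetical and only serve this illustration. It builds E from 1 and the centered indicator functions and checks that 1 is orthogonal to every other element.

```python
from fractions import Fraction

# Assumed toy data: X = {0, 1, 2, 3} with a normalized measure mu0.
mu0 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
assert sum(mu0) == 1

def inner(f, g):
    """<f, g>_{mu0} for real-valued functions on X."""
    return sum(a * b * m for a, b, m in zip(f, g, mu0))

one = [Fraction(1)] * 4

def center(f):
    """Return f - <1, f>_{mu0} 1, the element placed into E for f."""
    mean = inner(one, f)
    return [a - mean for a in f]

# Indicator functions span C(X) for finite X, so centering them suffices here.
indicators = [[Fraction(int(x == y)) for x in range(4)] for y in range(4)]
E = [one] + [center(f) for f in indicators]

# 1 should be orthogonal to every other element of E.
overlaps = [inner(one, e) for e in E[1:]]
```

Exact rational arithmetic is used so that the orthogonality check does not depend on a floating-point tolerance. Each indicator f is recovered as `center(f)` plus a multiple of 1, so the span of E is all of $C(X)$ in this toy case.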
Lemma 2.17. Let $w \in W$ be some element. Assume that π is $W'$-natural and that $1_\nu$ is $W'$-invariant. Moreover, let $E_0 \subseteq C(X)$ be some subset, such that for every non-constant $f \in E_0$ there are infinitely many elements $\{w_\iota'\}$ in $W'$ commuting with w, such that $\{w_\iota'(f)\} \subseteq C(X)$ forms an orthonormal system in $L^2(X,\mu_\nu)$. Then $P_{\nu'} \pi(w) 1_\nu$ is orthogonal to the span of $E_0$ for all $\nu' \in N$ with $\mu_{\nu'} = \mu_\nu$. Here, $E_0$ is seen as a subset in $H_{\nu'}$.
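Proof. Let $f \in E_0$. Then there are infinitely many $\{w_\iota'\}$ in $W'$ commuting with w and fulfilling

$$\langle \pi(w_{\iota_1}'(f)) 1_{\nu'},\ \pi(w_{\iota_2}'(f)) 1_{\nu'} \rangle_H = \langle w_{\iota_1}'(f), w_{\iota_2}'(f) \rangle_{\mu_\nu} = \delta_{\iota_1 \iota_2}.$$

By naturality, $1_{\nu'}$ is $W'$-invariant as well. Hence,

$$\begin{aligned}
\langle \pi(w_\iota'(f)) 1_{\nu'},\ \pi(w) 1_\nu \rangle_H
&= \langle \pi'(w_\iota')\pi(f)\pi'(w_\iota')^* 1_{\nu'},\ \pi(w) 1_\nu \rangle_H \\
&= \langle \pi(f) 1_{\nu'},\ \pi'(w_\iota')^* \pi(w) 1_\nu \rangle_H \\
&= \langle \pi(f) 1_{\nu'},\ \pi(w)\pi'(w_\iota')^* 1_\nu \rangle_H \\
&= \langle \pi(f) 1_{\nu'},\ \pi(w) 1_\nu \rangle_H
\end{aligned}$$

for all ι. Since infinitely many orthonormal vectors cannot all have the same non-zero scalar product with a fixed vector, we get $\langle f, P_{\nu'} \pi(w) 1_\nu \rangle_{\mu_{\nu'}} = \langle \pi(f) 1_{\nu'}, \pi(w) 1_\nu \rangle_H = 0$.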
2.4. Regularity.

Definition 2.7. Precisely the elements of $\mathrm{Hom}(\mathbb{R}, W)$ are called one-parameter subgroups in W, those in $\mathcal{R} \subseteq \mathrm{Hom}(\mathbb{R}, W)$ one-parameter $\mathcal{R}$-subgroups in W.

Definition 2.8. A one-parameter subgroup is called regular iff it is weakly continuous.

Definition 2.9. A representation π of $A(W,\mu)$ is called regular w.r.t. $\mathcal{R}$ iff π maps regular one-parameter $\mathcal{R}$-subgroups in W to weakly continuous one-parameter subgroups in $\pi(W)$. If $\mathcal{R}$ is clear from the context, we will simply speak about regular representations.

Definition 2.10.
• Two one-parameter subgroups $t \longmapsto w_{1,t}$ and $t \longmapsto w_{2,t}$ in W are called commuting iff $w_{1,t_1}$ and $w_{2,t_2}$ commute for all $t_1, t_2 \in \mathbb{R}$.
• The set given by all finite (pointwise) products of mutually commuting one-parameter $\mathcal{R}$-subgroups in W is denoted by $\langle \mathcal{R} \rangle$.

Lemma 2.18. The product of finitely many, mutually commuting one-parameter $\mathcal{R}$-subgroups in W is a one-parameter $\langle \mathcal{R} \rangle$-subgroup in W. Moreover, if π is regular w.r.t. $\mathcal{R}$, then π is regular w.r.t. $\langle \mathcal{R} \rangle$.

Proof. The first part is clear. For the second one use $\|\pi(w_t)\|_{B(H)} \le \|w_t\|_{A(W,\mu)} = 1$ for all t to show

$$\Bigl\| \prod_i \pi(w_{i,t})\, \psi - \psi \Bigr\| \le \sum_j \Bigl\| \prod_{i<j} \pi(w_{i,t})\, \bigl( \pi(w_{j,t})\psi - \psi \bigr) \Bigr\| \le \sum_j \| \pi(w_{j,t})\psi - \psi \| \to 0 \quad \text{for } t \to 0.$$

Therefore, in what follows, we will often assume that $\mathcal{R}$ is replaced tacitly by $\langle \mathcal{R} \rangle$.
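The telescoping estimate in the proof of Lemma 2.18 can be checked numerically in a simple commuting example. The sketch below is an illustration under assumptions not made in the paper: the Hilbert space is $\mathbb{C}^4$, and each one-parameter group is a diagonal unitary group $t \mapsto \mathrm{diag}(e^{i t a_k})$ with arbitrarily chosen real numbers $a_k$ (so all groups commute). The names `apply_diag` and `both_sides` are hypothetical.

```python
import cmath
import math

def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def apply_diag(phases, t, v):
    """Apply the unitary diag(exp(i t a_k)) to the vector v."""
    return [cmath.exp(1j * t * a) * c for a, c in zip(phases, v)]

# Three commuting one-parameter unitary groups on C^4 (assumed generators).
generators = [[1.0, -2.0, 0.5, 3.0],
              [0.0, 1.5, -1.0, 2.0],
              [4.0, 0.0, 0.0, -0.5]]
psi = [0.5, 0.5j, -0.5, 0.5]

def both_sides(t):
    """Return (||prod_i U_i(t) psi - psi||, sum_j ||U_j(t) psi - psi||)."""
    prod = psi
    for a in generators:
        prod = apply_diag(a, t, prod)
    lhs = norm([p - q for p, q in zip(prod, psi)])
    rhs = sum(norm([p - q for p, q in zip(apply_diag(a, t, psi), psi)])
              for a in generators)
    return lhs, rhs

checks = [both_sides(t) for t in (1.0, 0.1, 0.01, 0.001)]
```

In every sampled case the left-hand value stays below the right-hand one, and both shrink as t approaches 0, which is the mechanism by which regularity passes from the individual groups to their product.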
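2.5. Splitting.

Lemma 2.19. We have $\|\pi(w'(f))\|_{B(H)} \le \|f\|_\infty$ for all $w' \in W_{\mathcal{H}(\mu)}$ and all $f \in C(X)$. Here, the equality holds if π is faithful.

Proof. We get $\|\pi(w'(f))\|_{B(H)} \le \|w'(f)\|_{A(W,\mu)} = \|w'(f)\|_\infty = \|f\|_\infty$, since $w'$ is a pull-back of a homeomorphism. If π is faithful, then even $\|\pi(w'(f))\|_{B(H)} = \|w'(f)\|_{A(W,\mu)}$.

Lemma 2.20. We have

$$\sup_{w' \in W_{\mathcal{H}(\mu)}} \bigl| \langle \psi, \pi((w - 1)(w'(f)))\, \psi \rangle_H \bigr| \le 2\, \|f\|_\infty\, \|\psi\|_H\, \|(\pi(w) - 1)\psi\|_H$$

for all $\psi \in H$, $w \in W$ and $f \in C(X)$.

Proof. We have

$$\begin{aligned}
\bigl| \langle \psi, \pi((w - 1)(f))\, \psi \rangle_H \bigr|
&= \bigl| \langle \psi, \pi(w(f))\psi \rangle_H - \langle \psi, \pi(f)\psi \rangle_H \bigr| \\
&= \bigl| \langle \psi, \pi(w)\pi(f)\pi(w)^* \psi \rangle_H - \langle \psi, \pi(f)\psi \rangle_H \bigr| \\
&= \bigl| \langle \pi(w)^*\psi, \pi(f)\pi(w)^* \psi \rangle_H - \langle \psi, \pi(f)\psi \rangle_H \bigr| \\
&\le \bigl| \langle (\pi(w)^* - 1)\psi, \pi(f)\pi(w)^* \psi \rangle_H \bigr| + \bigl| \langle \psi, \pi(f)(\pi(w)^* - 1)\psi \rangle_H \bigr| \\
&\le \|\pi(f)\|_{B(H)}\, \bigl( \|\pi(w)^*\|_{B(H)} + 1 \bigr)\, \|\psi\|_H\, \|(\pi(w)^* - 1)\psi\|_H \\
&\le 2\, \|\pi(f)\|_{B(H)}\, \|\psi\|_H\, \|(\pi(w) - 1)\psi\|_H \qquad \text{(unitarity of } w\text{)},
\end{aligned}$$

hence, for all $w' \in W_{\mathcal{H}(\mu)}$,

$$\bigl| \langle \psi, \pi((w - 1)(w'(f)))\, \psi \rangle_H \bigr| \le 2\, \|\pi(w'(f))\|_{B(H)}\, \|\psi\|_H\, \|(\pi(w) - 1)\psi\|_H \le 2\, \|f\|_\infty\, \|\psi\|_H\, \|(\pi(w) - 1)\psi\|_H$$

with Lemma 2.19.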
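As a sanity check, the estimate of Lemma 2.20 can be evaluated in a small example. The sketch below assumes a toy setting that is not taken from the paper: X is the cyclic group $\mathbb{Z}_6$ with counting measure, W and $W'$ both consist of the pull-backs of the cyclic shifts, and π is the fundamental representation, in which $\pi(w)$ acts on vectors by the same shift. The function names `shift` and `lemma_sides` are hypothetical.

```python
import math

N = 6  # X = Z_6; pull-backs of cyclic shifts act on functions and on C^6

def shift(k, f):
    """Pull-back of the shift x -> x + k: (w_k f)(x) = f(x + k mod N)."""
    return [f[(x + k) % N] for x in range(N)]

def inner(a, b):
    """Scalar product, antilinear in the first slot."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(abs(inner(v, v)))

f = [1.0, -2.0, 0.5, 3.0, 0.0, -1.5]
psi = [0.3, 0.1 + 0.2j, -0.4, 0.7j, 0.2, -0.1]
sup_f = max(abs(a) for a in f)

def lemma_sides(k):
    """Both sides of the estimate for w = w_k, with the sup over all shifts w'."""
    lhs = 0.0
    for kp in range(N):                                  # w' runs over all shifts
        g = shift(kp, f)                                 # w'(f)
        diff = [a - b for a, b in zip(shift(k, g), g)]   # (w - 1)(w'(f))
        lhs = max(lhs, abs(inner(psi, [d * p for d, p in zip(diff, psi)])))
    w_psi = shift(k, psi)                                # pi(w) psi
    rhs = 2 * sup_f * norm(psi) * norm([a - b for a, b in zip(w_psi, psi)])
    return lhs, rhs

results = [lemma_sides(k) for k in range(N)]
```

For the trivial shift both sides vanish; for the other shifts the supremum on the left is bounded by the right-hand side, as the lemma asserts.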
Definition 2.11. Let $\psi \in H$ be some vector.
• Let $f \in C(X)$. We say $W'$ splits W at ψ for f iff there is a one-parameter $\mathcal{R}$-subgroup $w_t$ in W, some $\varepsilon > 0$ and some $t_0 > 0$, such that
$$\sup_{w' \in W'} \bigl| \langle \psi, \pi\bigl((w_t - 1)(w'(f))\bigr) \psi \rangle_H \bigr| \ge \varepsilon$$
for all $0 \neq |t| < t_0$.
• We say $W'$ splits W at ψ iff there is a continuous µ-generating system E, such that $W'$ splits W at ψ for every $f \in E$ with $f \neq 1$ and $\langle \psi, \pi(f)\psi \rangle_H \neq 0$.

In other words, $w_t$ is not uniformly weakly continuous on the $W'$-span of f. Moreover, note that the splitting property actually refers to the choice of $\mathcal{R}$. Since, in general, we will have fixed $\mathcal{R}$, we drop this reference here.

Proposition 2.21. Assume that $\pi \equiv \pi'|_{A(W,\mu)}$ is regular (w.r.t. $\mathcal{R}$). If $W'$ splits W at $1_{\nu_0}$, then $\mu_{\nu_0}$ equals µ.
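Proof. Choose a continuous µ-generating system E, such that $W'$ splits W at $1_{\nu_0}$ for every non-constant $f \in E$ with $\langle 1, f \rangle_{\nu_0} \equiv \langle 1_{\nu_0}, \pi(f) 1_{\nu_0} \rangle_H \neq 0$. Assume there is such an f with $\langle 1, f \rangle_{\nu_0} \neq 0$. Choose a one-parameter $\mathcal{R}$-subgroup $w_t$ in W, some sufficiently small $\varepsilon > 0$ and some $t_0 > 0$, such that

$$\sup_{w' \in W'} \bigl| \langle 1_{\nu_0}, \pi\bigl((w_t - 1)(w'(f))\bigr) 1_{\nu_0} \rangle_H \bigr| \ge \varepsilon$$

for all non-zero $|t| < t_0$. Hence, using Lemma 2.20,

$$2\, \|f\|_\infty\, \|\pi(w_t) 1_{\nu_0} - 1_{\nu_0}\|_H \ge \sup_{w' \in W'} \bigl| \langle 1_{\nu_0}, \pi\bigl((w_t - 1)(w'(f))\bigr) 1_{\nu_0} \rangle_H \bigr| \ge \varepsilon$$

for all non-zero $|t| < t_0$. This, however, is a contradiction to our assumption that π is regular, i.e., $t \longmapsto \pi(w_t)$ is weakly continuous. Hence, $\langle 1, f \rangle_{\nu_0} = 0$ for all non-constant f in E. By Lemma 2.15, we have $\mu = \mu_{\nu_0}$.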
2.6. Λ-Regularity.

Definition 2.12. Let A be any set.
• A set Λ is called set of A-functions iff its elements are A-valued functions (i.e., there is no restriction for the domains of these functions).
• A set Λ of A-functions is called topological (sequential) iff the domain of each $\lambda \in \Lambda$ is a topological (sequential topological) space.

Definition 2.13. Let A be some subset of a C∗-algebra $\mathfrak{A}$, and let π be a representation of $\mathfrak{A}$ on some Hilbert space H. Moreover, let Λ be a set of topological A-functions. Then π is called Λ-regular iff the mapping $\langle \psi_1, \pi(\lambda(\,\cdot\,))\psi_2 \rangle_H : \mathrm{dom}\, \lambda \longrightarrow \mathbb{C}$ is continuous for all $\psi_1, \psi_2 \in H$ and each $\lambda \in \Lambda$.

Remark. The ordinary regularity uses $\mathrm{dom}\, \lambda = \mathbb{R}$, where $\lambda : t \longmapsto w_t$ runs over all one-parameter $\mathcal{R}$-subgroups.

Let us return to the case that π is a representation of $A(W,\mu)$ on H.

Proposition 2.22. Let π be Λ-regular for some set Λ of W-functions. Fix for each $\lambda \in \Lambda$ some subset $Y_\lambda$ in $\mathrm{dom}\, \lambda$, such that $\lambda(Y_\lambda)$ consists of $\pi_\nu$-units only and $\bigcup_{\lambda \in \Lambda} \lambda(\overline{Y_\lambda})$ generates W. Then every $w \in W$ is a $\pi_\nu$-unit.

Proof. For all $\lambda \in \Lambda$ and all $y \in Y_\lambda$, we have $\langle 1_\nu, \pi(\lambda(y)) 1_\nu \rangle_H = \langle 1_\nu, 1_\nu \rangle_H = 1$. Consequently, by Λ-regularity, we even have $\langle 1_\nu, \pi(\lambda(y)) 1_\nu \rangle_H = 1$ for all $y \in \overline{Y_\lambda}$, hence $\pi(\lambda(y)) 1_\nu = 1_\nu$, i.e., $\lambda(\overline{Y_\lambda})$ contains $\pi_\nu$-units only. Since these sets generate full W and since, obviously, products and inverses of $\pi_\nu$-units are $\pi_\nu$-units again, all elements of W are $\pi_\nu$-units.
3. Quantum Geometric Background 3.1. Quantum geometric Hilbert space. In the remaining sections we will apply the general framework of Sect. 2 to quantum geometry. First, however, let us briefly recall in this subsection the basic facts and notations needed in the following. General expositions can be found in [6,4,3] for the analytic framework. The smooth case is dealt with in [8,7,27]. The facts on hyphs and the conventions are due to [14,16,19]. Let G be some arbitrary connected compact Lie group and M be some manifold. We let M be equipped with an arbitrary, but fixed differential structure. Later, we will restrict ourselves to analytic (or, if so desired, semianalytic) manifolds. A path is a piecewise differentiable map from [0, 1] to M, whereas differentiability is always understood in the chosen smoothness class. Moreover, we may restrict ourselves to use piecewise embedded paths only. A path is trivial iff its image is a single point. Two paths γ1 and γ2 are composable iff the end point γ1 (1) of the first one coincides with the starting point γ2 (0) of the second one. If they are composable, their product is given by
$$(\gamma_1 \gamma_2)(t) := \begin{cases} \gamma_1(2t) & \text{for } t \in [0, \tfrac{1}{2}], \\ \gamma_2(2t - 1) & \text{for } t \in [\tfrac{1}{2}, 1]. \end{cases}$$
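The product of composable paths is straightforward to express in code. The following minimal sketch assumes that a path is represented as a Python callable from [0, 1] to tuples of coordinates; the function name `compose` and the tolerance parameter `tol` are hypothetical choices for this illustration.

```python
def compose(gamma1, gamma2, tol=1e-12):
    """Product path: gamma1 traversed on [0, 1/2], then gamma2 on [1/2, 1].

    Raises ValueError if the paths are not composable, i.e. if the end
    point gamma1(1) differs from the starting point gamma2(0).
    """
    end, start = gamma1(1.0), gamma2(0.0)
    if any(abs(a - b) > tol for a, b in zip(end, start)):
        raise ValueError("paths are not composable")

    def product(t):
        # Both branches agree at t = 1/2 because the paths are composable.
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)

    return product

# Two straight edges in the plane forming an L-shaped path.
edge1 = lambda t: (t, 0.0)      # from (0, 0) to (1, 0)
edge2 = lambda t: (1.0, t)      # from (1, 0) to (1, 1)
path = compose(edge1, edge2)
```

For instance, `path(0.25)` lies halfway along the first edge and `path(0.75)` halfway along the second one.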
An edge e is a path having no self-intersections, i.e., $e(t_1) = e(t_2)$ implies that $|t_1 - t_2|$ either equals 0 or 1. Two paths $\gamma_1$ and $\gamma_2$ coincide up to the parametrization iff there is some orientation preserving piecewise diffeomorphism $\phi : [0, 1] \longrightarrow [0, 1]$, such that $\gamma_1 = \gamma_2 \circ \phi$. A path is called finite iff it equals up to the parametrization a finite product of edges and trivial paths. In what follows, every path will be assumed to be finite. Next, two paths are equivalent iff there is a finite sequence of paths, such that two subsequent paths coincide up to the parametrization or up to insertion or deletion of retracings $\delta \delta^{-1}$. Finally, we denote the set of all paths by $P_{\mathrm{gen}}$, that of all equivalence classes of paths by P. The multiplication of paths naturally turns P into a groupoid. Usually (but not in Subsects. 3.2 and 3.3), paths are understood to be equivalence classes of paths.

Initial and final segments of paths are naturally defined. We will write $\gamma_1 \uparrow\uparrow \gamma_2$ iff there is some path γ being (possibly up to the parametrization) an initial path of both $\gamma_1$ and $\gamma_2$. A hyph υ is some finite collection $(\gamma_1, \ldots, \gamma_n)$ of edges each having a “free” point. This means that, for at least one direction, none of the segments of $\gamma_i$ starting in that point in this direction is a full segment of some of the $\gamma_j$ with $j < i$. Graphs and webs are special hyphs. The subgroupoid generated (freely) by the paths in a hyph υ will be denoted by $P_\upsilon$. Hyphs are ordered in the natural way. In particular, $\upsilon' \le \upsilon''$ implies $P_{\upsilon'} \subseteq P_{\upsilon''}$.

The set $\overline{\mathcal{A}}$ of generalized connections $\overline{A}$ is now defined by

$$\overline{\mathcal{A}} := \varprojlim_{\upsilon} \overline{\mathcal{A}}_\upsilon \cong \mathrm{Hom}(P, G),$$

with $\overline{\mathcal{A}}_{\boldsymbol{\gamma}} := \mathrm{Hom}(P_{\boldsymbol{\gamma}}, G) \subseteq G^{\#\boldsymbol{\gamma}}$ given the topology which is induced by that of G, for all finite tuples $\boldsymbol{\gamma}$ of paths. Moreover, we define the (always continuous) map $\pi_{\boldsymbol{\gamma}} : \overline{\mathcal{A}} \longrightarrow G^{\#\boldsymbol{\gamma}}$ by $\pi_{\boldsymbol{\gamma}}(\overline{A}) := \overline{A}(\boldsymbol{\gamma}) \equiv h_{\overline{A}}(\boldsymbol{\gamma})$. Note that $\pi_{\boldsymbol{\gamma}}$ is surjective if $\boldsymbol{\gamma}$ is a hyph. Finally, for compact G, the Ashtekar-Lewandowski measure $\mu_0$ is the unique regular Borel measure on $\overline{\mathcal{A}}$ whose push-forward $(\pi_\upsilon)_* \mu_0$ to $\overline{\mathcal{A}}_\upsilon \cong G^{\#\upsilon}$ coincides with the Haar measure there for every hyph υ. It is used to span the auxiliary Hilbert space $H_{\mathrm{aux}} := L^2(\overline{\mathcal{A}}, \mu_0)$ of quantum geometry with scalar product $\langle \cdot, \cdot \rangle$.
If we included (generalized) gauge transforms into our considerations and studied the analytic category only, we could use the spin-network states to get a basis of $H_{\mathrm{aux,inv}} = L^2(\overline{\mathcal{A}}/\overline{\mathcal{G}}, \mu_0)$ with $\overline{\mathcal{G}}$ being the group of generalized gauge transforms. Here, however, we want to include gauge-variant functions as well and, moreover, do not want to restrict the smoothness class at the beginning. Therefore, we will consider now generating systems for $H_{\mathrm{aux}}$. For this, first of all, let us fix a representative in each equivalence class of irreducible representations of G, which we will refer to below. When considering matrix indices for matrices on some Euclidean space V, we assume that the underlying vectors are normalized. This means that for all $A \in \mathrm{End}\, V$ we have $|A^i_j| \le \|A\|$, where $\|\cdot\|$ denotes the standard operator norm.

Definition 3.1.
• For each non-trivial irreducible representation φ of G, we define $M_\phi$ to be the set
$$M_\phi := \bigl\{ \sqrt{\dim \phi}\; \phi^m_n \bigr\}_{m,n}$$
of normalized matrix functions, where m, n run over the set of matrix indices for φ.
• We define M to be the set
$$M := \bigcup_\phi M_\phi = \bigl\{ \sqrt{\dim \phi}\; \phi^m_n \bigr\}_{\phi,m,n}$$
of normalized matrix functions, where φ runs over the set of all (equivalence classes of) non-trivial irreducible representations of G.
• For every hyph υ with edges $\gamma_1, \ldots, \gamma_I$ we define the set $M_\upsilon$ of gauge-variant spin network states (gSN) of υ by
$$M_\upsilon := \bigotimes_i M \circ \pi_{\gamma_i}.$$
If υ is the empty hyph, we have $M_\upsilon := \{1\}$. The set of all gauge-variant spin network states will be denoted by $M_{\mathrm{SN}}$.
• More compactly, we set $(T_{\phi,\gamma})^m_n := \sqrt{\dim \phi}\; \phi^m_n \circ \pi_\gamma$ and
$$(T_{\boldsymbol{\phi},\boldsymbol{\gamma}})^{\boldsymbol{m}}_{\boldsymbol{n}} := \sqrt{\dim \boldsymbol{\phi}}\; \boldsymbol{\phi}^{\boldsymbol{m}}_{\boldsymbol{n}} \circ \pi_{\boldsymbol{\gamma}} \equiv \bigotimes_k \sqrt{\dim \phi_k}\; (\phi_k)^{m_k}_{n_k} \circ \pi_{\gamma_k}.$$
Observe that we get the same gauge-variant spin network state again if we simultaneously revert the orientations of an arbitrary number of edges and dualize the corresponding representations. This trivial overcompleteness will be ignored in the following, i.e., we will always identify graphs and hyphs differing in the ordering or the orientation of the edges only. Let us now recall

Lemma 3.1. For every hyph υ, the set $M_\upsilon$ of gauge-variant spin networks on υ is an orthonormal set in $L^2(\overline{\mathcal{A}}, \mu_0)$.

Note that (even after admitting only one edge orientation per hyph) $\bigcup_\upsilon M_\upsilon$ is a generating system for, but not an orthonormal set in $L^2(\overline{\mathcal{A}}, \mu_0)$. This would still be the case if we were in the (semi)analytic category and used graphs only (see below). In particular, we have
Lemma 3.2.
• We have $M_{\upsilon'} \subseteq \mathrm{span}\, M_\upsilon$ for all $\upsilon \ge \upsilon'$.
• We have $M^\phi_\gamma \subseteq \mathrm{span}\, M^\phi_\upsilon$ for all $\upsilon = \{\gamma_1, \ldots, \gamma_n\} \ge \gamma$ with $\prod_i \gamma_i = \gamma$.
Here, $M^\phi_\upsilon := \bigotimes_i M_\phi \circ \pi_{\gamma_i}$.

Lemma 3.3. $M_{\mathrm{SN}}$ is a continuous $\mu_0$-generating system in $L^2(\overline{\mathcal{A}}, \mu_0)$.

Nevertheless, we will be looking for orthogonal decompositions of $L^2(\overline{\mathcal{A}}, \mu_0)$. For that purpose, we will have to single out orthogonal subsets of gauge-variant spin network functions: Until the end of this subsection we will now consider piecewise analytic paths only. In contrast to the standard, i.e., gauge-invariant spin network states, the gauge-variant ones do not form an orthonormal basis for $L^2(\overline{\mathcal{A}}, \mu_0)$ even after dropping some subset of them. The problem is the states arising in the decomposition of an edge into a product of subedges, i.e., those having two-valent vertices. In the gauge-invariant case they can be dropped since, by invariance, they reproduce the original state. Here, however, in the gauge-variant case, we get a sum like

$$(T_{\gamma_1 \gamma_2, \phi})^m_n = \frac{1}{\sqrt{\dim \phi}} \sum_r (T_{\gamma_1, \phi})^m_r \otimes (T_{\gamma_2, \phi})^r_n,$$

where the (dim φ) gauge-variant spin network states together with that at the left-hand side span a (dim φ)-dimensional subspace of $L^2(\overline{\mathcal{A}}, \mu_0)$. We might simply drop the one at the left-hand side, but this would lead to consistency troubles since we could want to decompose those at the right-hand side again. A possible solution for this dilemma is given by the extended spin network states as defined by Ashtekar and Lewandowski in [5]. We do not want to introduce that notion here, but only study the “most dangerous” cases in our framework, namely those gSN with “matching” indices⁴ at each two-valent vertex. In the decomposition of the $\gamma_1 \gamma_2$-state above, this concerns the vertex at $\gamma_1(1) = \gamma_2(0)$.

4 Recall that a gSN is said to have “matching” indices at a two-valent vertex m iff the lower index, assigned to the incoming edge at m, and the upper index for the outgoing one are equal. Note that we possibly have to invert orientations before, in order to have an incoming and an outgoing edge at a two-valent vertex.

Lemma 3.4. Let two gauge-variant spin network states $T' := (T_{\boldsymbol{\phi}', \boldsymbol{\gamma}'})^{\boldsymbol{m}'}_{\boldsymbol{n}'}$ and $T'' := (T_{\boldsymbol{\phi}'', \boldsymbol{\gamma}''})^{\boldsymbol{m}''}_{\boldsymbol{n}''}$ with graphs $\boldsymbol{\gamma}'$ and $\boldsymbol{\gamma}''$ be given. Then $T'$ and $T''$ are orthogonal in $L^2(\overline{\mathcal{A}}, \mu_0)$ if
• $\mathrm{im}\, \boldsymbol{\gamma}' \neq \mathrm{im}\, \boldsymbol{\gamma}''$;
• there is a point $m \in \mathrm{int}\, \boldsymbol{\gamma}' \cap \mathrm{int}\, \boldsymbol{\gamma}''$, such that the representations for the edges in $\boldsymbol{\gamma}'$ and $\boldsymbol{\gamma}''$ running through m do not coincide;
• there is some $m \in M$ being a two-valent vertex with non-matching indices for one and being interior for the other graph; or
• there is some $m \in M$ being a two-valent vertex for both graphs, whereas both “incoming” or both “outgoing” indices are different.
Note that matrix indices are regarded as different if they belong to different representations.

Proof. The first two cases are obvious. The third one is clear observing our example above. Namely, decompose one of the graphs, say $\boldsymbol{\gamma}'$, by inserting m as a vertex. In the decomposition of $T'$ into a sum of gauge-variant spin network states of the enlarged
graph, the indices of every addend are matching. By the orthogonality properties of matrix functions w.r.t. the Haar measure, we get the assertion. The last case is now clear as well.
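The decomposition formula preceding Lemma 3.4 and the orthogonality of matrix functions used in its proof can be checked numerically in a toy setting. The following is only a sketch under loud assumptions: the compact group $G$ is replaced by the finite quaternion group $Q_8$ with its two-dimensional unitary irreducible representation, the Haar measure by the normalized counting measure, and the names `T`, `tensor`, `composite` and `inner` are hypothetical helpers, not notation from the text.

```python
import math
from itertools import product

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

IDENTITY = ((1, 0), (0, 1))
GEN_I = ((1j, 0), (0, -1j))   # quaternion unit i
GEN_J = ((0, 1), (-1, 0))     # quaternion unit j
DIM = 2                       # dimension of the representation phi

def generate_group(generators):
    """Close the generators under multiplication (finite group assumed)."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return list(group)

G = generate_group([GEN_I, GEN_J])   # toy stand-in for the structure group

def T(m, n):
    """Normalized matrix function sqrt(dim) * phi^m_n on a single edge."""
    return lambda g: math.sqrt(DIM) * g[m][n]

def tensor(m1, n1, m2, n2):
    """State T^{m1}_{n1} on the first edge times T^{m2}_{n2} on the second."""
    return lambda g1, g2: T(m1, n1)(g1) * T(m2, n2)(g2)

def composite(m, n):
    """State of the composed edge: sqrt(dim) * phi^m_n evaluated on g1 g2."""
    return lambda g1, g2: math.sqrt(DIM) * matmul(g1, g2)[m][n]

def inner(f, h, arity):
    """L2 inner product on G^arity for the normalized counting measure."""
    total = sum(
        complex(f(*gs)).conjugate() * h(*gs) for gs in product(G, repeat=arity)
    )
    return total / len(G) ** arity
```

In this model every addend `tensor(m, r, r, n)` has matching indices at the two-valent vertex, and a state with non-matching indices such as `tensor(0, 0, 1, 0)` is orthogonal to `composite(0, 0)`, which is the mechanism behind the third case of Lemma 3.4.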
Definition 3.2. Let $\gamma$ be an edge, let $\phi$ be a non-trivial irreducible representation of $G$, and let $T := (T_{\boldsymbol\phi,\boldsymbol\gamma})^{\boldsymbol m}_{\boldsymbol n}$ be a gauge-variant spin network state.
1. If $\gamma$ is non-closed, then $T$ is called $(\gamma, \phi)$-based iff
• $\gamma = \gamma_1 \circ \cdots \circ \gamma_{\#\boldsymbol\gamma}$;
• $\phi_k = \phi$ for all $k$; and
• all indices at two-valent vertices are matching, i.e., $m_{k+1} = n_k$ for all $k$.
2. If $\gamma$ is closed, then $T$ is called $(\gamma, \phi)$-based iff
• $\gamma_1 \circ \cdots \circ \gamma_{\#\boldsymbol\gamma}$ equals $\gamma$ or equals $\gamma|_{[\tau,1]} \circ \gamma|_{[0,\tau]}$ for some $\tau \in (0, 1)$;
• $\phi_k = \phi$ for all $k$; and
• all indices at two-valent vertices are matching, i.e., $m_{k+1} = n_k$ for all $k$ and $m_1 = n_{\#\boldsymbol\gamma}$.
The set of all $(\gamma, \phi)$-based gauge-variant spin network states will be denoted by $B_{\gamma,\phi}$. Moreover, we set $B_\gamma := \{1\} \cup \bigcup_\phi B_{\gamma,\phi}$, where the union runs over all non-trivial irreducible representations of $G$. It contains precisely the $\gamma$-based gauge-variant spin network states. Note again that $T$ is $(\gamma, \phi)$-based if the conditions above are met for some orientation and some ordering of $\boldsymbol\gamma$.
Lemma 3.5. $B_{\gamma,\phi}$ is orthogonal to its complement in the set of all gauge-variant spin network states, for every edge $\gamma$ and every irreducible representation $\phi$ of $G$.
Proof. Let $T = (T_{\boldsymbol\phi,\boldsymbol\gamma})^{\boldsymbol m}_{\boldsymbol n}$ be a gSN not contained in $B_{\gamma,\phi}$. If $\operatorname{im} \boldsymbol\gamma \neq \operatorname{im} \gamma$, the situation is clear. The same is true if $\phi_k \neq \phi$ for some $k$. Let now $\operatorname{im} \boldsymbol\gamma = \operatorname{im} \gamma$ and $\phi_k = \phi$ for all $k$. Then, possibly after modifying ordering or orientations, we have $\gamma = \gamma_1 \cdots \gamma_n$. Moreover, every vertex of $\boldsymbol\gamma$ is at most two-valent. Thus, the proof follows from Lemma 3.4.
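Corollary 3.6. For every edge $\gamma$, the Hilbert space $L^2(\overline{\mathcal{A}}, \mu_0)$ is the closure of
\[
\bigoplus_\phi \operatorname{span} B_{\gamma,\phi} \;\oplus\; \mathbb{C}\,1 \;\oplus\; \operatorname{span}\bigl(M_{\mathrm{SN}} \setminus B_\gamma\bigr).
\]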
3.2. Decomposition of paths. In the following we will study the intersection behaviour between paths and (generalized) surfaces. For this, we first consider how paths can be decomposed. Most of the relevant definitions and assertions are given in [13]. We will quote where appropriate and will simplify some assumptions and, therefore, proofs. Note that in this subsection we will often distinguish between P and Pgen ; paths here are genuine maps from [0, 1] to M, not equivalence classes. 3.2.1. Completeness. Definition 3.3. Let γ be some path. Then a finite sequence γ := (γ1 , . . . , γn ) in Pgen is called decomposition of γ iff γ1 · · · γn equals γ up to the parametrization.
This definition is well defined, since $\gamma_1(\gamma_2\gamma_3)$ equals $(\gamma_1\gamma_2)\gamma_3$ up to the parametrization. Moreover, observe that every reparametrization of $\gamma$ gives a decomposition of $\gamma$. If confusion is unlikely, we identify $\gamma_1 \cdots \gamma_n$, $\{\gamma_1, \ldots, \gamma_n\}$, and $(\gamma_1, \ldots, \gamma_n)$.
Definition 3.4. Let $\boldsymbol\gamma := \gamma_1 \cdots \gamma_I$ and $\boldsymbol\delta := \delta_1 \cdots \delta_J$ be decompositions of some path $\gamma$. Then $\boldsymbol\gamma$ is a refinement of $\boldsymbol\delta$ iff there are $0 = I_0 < I_1 < \cdots < I_J = I$, such that $\gamma_{I_{j-1}+1} \cdots \gamma_{I_j}$ is a decomposition of $\delta_j$ for all $j = 1, \ldots, J$. We write⁵ $\boldsymbol\gamma \geq \boldsymbol\delta$ iff $\boldsymbol\gamma$ is a refinement of $\boldsymbol\delta$.
It can easily be shown [13] that the set of all decompositions of a path $\gamma$ is directed w.r.t. $\geq$.
Definition 3.5. • A subset $Q$ of $\mathcal{P}_{\mathrm{gen}}$ is called hereditary iff for each $\gamma \in Q$
1. the inverse of $\gamma$ is in $Q$ again, and
2. every decomposition of $\gamma$ consists of paths in $Q$.
• A subset $Q$ of $\mathcal{P}_{\mathrm{gen}}$ is called complete iff it is hereditary and every path in $\mathcal{P}_{\mathrm{gen}}$ has a decomposition into paths in $Q$.
A decomposition consisting of paths in $Q$ only will be called a $Q$-decomposition.
Lemma 3.7. Let $Q \subseteq \mathcal{P}_{\mathrm{gen}}$ be complete. Then for every hyph $\upsilon$ there is a hyph $\upsilon' \geq \upsilon$ with $\upsilon' \subseteq Q$.
Proof. First decompose each $\gamma \in \upsilon$ into paths in $Q$. Collect all these paths in a set $\boldsymbol\gamma \geq \upsilon$. Since $\boldsymbol\gamma$ need not be a hyph again, refine, if necessary, the paths in $\boldsymbol\gamma$ further to get a hyph $\upsilon' \geq \boldsymbol\gamma \geq \upsilon$ [14]. By completeness, $\upsilon'$ contains only paths in $Q$.
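The refinement relation of Definition 3.4 and the directedness of the set of decompositions can be made concrete in a simplified model. The sketch below assumes a fixed parametrization, so that a decomposition of a path is encoded by its break points $0 = t_0 < \cdots < t_I = 1$ in the parameter interval; this encoding and the function names are illustrative assumptions, whereas the text works with decompositions up to reparametrization.

```python
from fractions import Fraction

def is_decomposition(breaks):
    """Break points must run strictly increasingly from 0 to 1."""
    return (
        len(breaks) >= 2
        and breaks[0] == 0
        and breaks[-1] == 1
        and all(a < b for a, b in zip(breaks, breaks[1:]))
    )

def is_refinement(fine, coarse):
    """fine >= coarse: each piece of `coarse` is a product of consecutive
    pieces of `fine`, i.e. every coarse break point is also a fine one."""
    return set(coarse) <= set(fine)

def block_indices(fine, coarse):
    """The indices 0 = I_0 < I_1 < ... < I_J = I of Definition 3.4."""
    return [fine.index(t) for t in coarse]

def common_refinement(first, second):
    """Directedness: the union of the break points refines both."""
    return sorted(set(first) | set(second))
```

For instance, the decompositions with break points $(0, \tfrac13, 1)$ and $(0, \tfrac12, 1)$ do not refine each other, but `common_refinement` returns $(0, \tfrac13, \tfrac12, 1)$, which refines both.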
⁵ By a slight abuse of notation we denote both graphs and decompositions by $\boldsymbol\gamma$, $\boldsymbol\delta$, etc., and denote both the relation on the set of hyphs (or graphs) and that of refinement by $\geq$. Confusion should be unlikely.
Lemma 3.8. The set of all edges and trivial paths in $\mathcal{P}_{\mathrm{gen}}$ is complete.
3.2.2. Main construction.
Definition 3.6. Let $Q$ be some hereditary subset of $\mathcal{P}_{\mathrm{gen}}$. Then a map $\rho : Q \to G$ is called a $Q$-germ iff for all $\gamma \in Q$
1. $\rho(\gamma^{-1}) = \rho(\gamma)^{-1}$, and
2. $\rho(\gamma) = \rho(\gamma_1)\rho(\gamma_2)$ for all decompositions $\gamma_1\gamma_2$ of $\gamma$.
The set of all $Q$-germs from $Q$ to $G$ is denoted by $\mathrm{Germ}(Q, G)$.
Observe that $\rho(\gamma)$ and $\rho(\delta)$ coincide if $\gamma$ and $\delta$ coincide up to the parametrization. In fact, since every decomposition $\gamma_1\gamma_2$ of $\gamma$ is also one of $\delta$, we may apply Property 2 above. Note that we will simply speak about germs instead of $Q$-germs, provided the domain $Q$ is clear from the context.
Proposition 3.9. Let $Q$ be some complete subset of $\mathcal{P}_{\mathrm{gen}}$, and let $\rho : Q \to G$ be a germ. Then we have:
• There is a unique germ $\widehat\rho : \mathcal{P}_{\mathrm{gen}} \to G$ extending $\rho$.
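• The map $\widehat\rho$ is given by
\[
\widehat\rho(\gamma) = \prod_{i=1}^{I} \rho(\gamma_i)
\]
for each $\gamma \in \mathcal{P}_{\mathrm{gen}}$, where $\gamma_1 \cdots \gamma_I$ is any⁶ $Q$-decomposition of $\gamma$.
• The map $\widehat\rho$ is constant on equivalence classes in $\mathcal{P}_{\mathrm{gen}}$.
• The induced map $[\widehat\rho] : \mathcal{P} \to G$ is a homomorphism, i.e., it is an element of $\overline{\mathcal{A}}$.
Proof. Let us first define the desired map $\widehat\rho$ as given in the proposition above and now check its properties.
1. $\widehat\rho$ does not depend on the choice of the $Q$-decomposition. Let $\boldsymbol\gamma$ and $\boldsymbol\delta$ be two $Q$-decompositions of $\gamma$. Since, by assumption, every path in $Q$ has $Q$-decompositions only, and since the set of decompositions of a path is directed w.r.t. $\geq$, we may assume $\boldsymbol\gamma \geq \boldsymbol\delta$. But in this case the well-definedness follows directly from the definitions and germ property 2 of $\rho$.
2. $\widehat\rho$ is constant on equivalence classes in $\mathcal{P}_{\mathrm{gen}}$. Let $\gamma$ and $\delta$ in $\mathcal{P}_{\mathrm{gen}}$ be equivalent. By definition, it is sufficient to check the following two cases:
• $\gamma$ and $\delta$ coincide up to the parametrization. This case is trivial, since every $Q$-decomposition of $\gamma$ is also one of $\delta$. Hence, $\widehat\rho(\gamma) = \widehat\rho(\delta)$.
• There is some $\varepsilon$ in $\mathcal{P}_{\mathrm{gen}}$ and some decomposition $\gamma_1\gamma_2$ of $\gamma$, such that $\delta$ equals the product of $\gamma_1$, $\varepsilon$, $\varepsilon^{-1}$ and $\gamma_2$. In this case, choose some $Q$-decompositions $\varepsilon_1 \cdots \varepsilon_K$ of $\varepsilon$ and $\gamma_{s1} \cdots \gamma_{sI_s}$ of $\gamma_s$ with $s = 1, 2$. Then $\gamma_{11} \cdots \gamma_{1I_1}\gamma_{21} \cdots \gamma_{2I_2}$ is a $Q$-decomposition of $\gamma$, and $\gamma_{11} \cdots \gamma_{1I_1}\, \varepsilon_1 \cdots \varepsilon_K\, \varepsilon_K^{-1} \cdots \varepsilon_1^{-1}\, \gamma_{21} \cdots \gamma_{2I_2}$ is one of $\delta$. Hence, we have
\begin{align*}
\widehat\rho(\delta)
 &= \rho(\gamma_{11}) \cdots \rho(\gamma_{1I_1})\, \rho(\varepsilon_1) \cdots \rho(\varepsilon_K)\, \rho(\varepsilon_K^{-1}) \cdots \rho(\varepsilon_1^{-1})\, \rho(\gamma_{21}) \cdots \rho(\gamma_{2I_2}) && \text{(Definition of $\widehat\rho$)} \\
 &= \rho(\gamma_{11}) \cdots \rho(\gamma_{1I_1})\, \rho(\gamma_{21}) \cdots \rho(\gamma_{2I_2}) && \text{(Property 1 of $\rho$)} \\
 &= \widehat\rho(\gamma). && \text{(Definition of $\widehat\rho$)}
\end{align*}
3. $\widehat\rho$ is a germ extending $\rho$, and $[\widehat\rho]$ is a homomorphism. This is proven as the statements above.
4. $\widehat\rho$ is the only germ extending $\rho$. If $\rho'$ is some other germ extending $\rho$ and different from $\widehat\rho$, then there is some $\gamma \in \mathcal{P}_{\mathrm{gen}}$ with $\rho'(\gamma) \neq \widehat\rho(\gamma)$. Now choose a $Q$-decomposition $\gamma_1 \cdots \gamma_I$ of $\gamma$. By the properties of a germ, there is some $i$ with $\rho'(\gamma_i) \neq \widehat\rho(\gamma_i)$. However, since both $\rho'$ and $\widehat\rho$ extend $\rho$, both sides are equal to $\rho(\gamma_i)$. Contradiction.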
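The extension of a germ by products over $Q$-decompositions, as in Proposition 3.9, can be illustrated in a one-dimensional toy model. Everything in this sketch is an assumption made for illustration: paths are oriented intervals `(a, b)` on the rational line, $Q$ consists of the intervals with no integer in their interior, $G$ is the permutation group $S_3$, and the germ `rho` is built from a hypothetical cell-dependent frame `u`; the hat notation `rho_hat` stands for the extended germ.

```python
import math
from fractions import Fraction as F
from itertools import permutations

PERMS = list(permutations(range(3)))   # the finite group S_3 as a toy G
E = (0, 1, 2)                          # identity permutation

def mul(p, q):
    """Group product (p after q) of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def in_Q(a, b):
    """Path (a, b) lies in Q iff no integer is strictly between a and b."""
    lo, hi = min(a, b), max(a, b)
    return lo < hi and math.floor(lo) + 1 >= hi

def u(cell, t):
    """Hypothetical frame, chosen independently in each cell [n, n+1]."""
    return PERMS[(5 * cell + int(4 * t)) % 6]

def rho(a, b):
    """A Q-germ: rho(a, b) = u(a)^{-1} u(b) inside the cell of the path."""
    assert in_Q(a, b)
    cell = math.floor(min(a, b))
    return mul(inv(u(cell, a)), u(cell, b))

def q_decomposition(a, b, extra=()):
    """Split (a, b) at all interior integers and at optional extra points."""
    lo, hi = min(a, b), max(a, b)
    cuts = set(range(math.ceil(lo), math.floor(hi) + 1)) | set(extra)
    points = [lo, *sorted(t for t in cuts if lo < t < hi), hi]
    if a > b:
        points.reverse()
    return list(zip(points, points[1:]))

def rho_hat(a, b, extra=()):
    """Extended germ: ordered product of rho over a Q-decomposition."""
    out = E
    for piece in q_decomposition(a, b, extra):
        out = mul(out, rho(*piece))
    return out
```

Passing additional break points through `extra` produces a finer $Q$-decomposition; in this model the value of `rho_hat` does not change, and inserting a detour $\varepsilon\varepsilon^{-1}$ leaves the product unchanged as well, mirroring steps 1 and 2 of the proof.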
Proposition 3.10. Let $Q$ be some complete subset of $\mathcal{P}_{\mathrm{gen}}$. Let $X$ be some topological space, and let $\lambda : X \to \mathrm{Germ}(Q, G)$ be some map. Finally, assume that the map $\lambda(\cdot)(\gamma) : X \to G$ is continuous for all $\gamma \in Q$. Then
\[
\widehat\lambda : X \to \overline{\mathcal{A}}, \qquad x \mapsto [\widehat{\lambda(x)}]
\]
is continuous, where $\widehat{\;\cdot\;}$ is given as in Proposition 3.9.
⁶ Recall that, by completeness of $Q$, such a decomposition always exists.
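Proof. It is sufficient [19] to prove that $\pi_\gamma \circ \widehat\lambda : X \to G$ is continuous for all edges $\gamma$. Since the multiplication in $G$ is continuous and $Q$ is complete, we may even restrict ourselves to the case of $\gamma \in Q$. Here, however, the assertion follows immediately from
\[
(\pi_\gamma \circ \widehat\lambda)(x) \equiv \pi_\gamma([\widehat{\lambda(x)}]) = [\widehat{\lambda(x)}]([\gamma]) = \widehat{\lambda(x)}(\gamma) \equiv \lambda(x)(\gamma),
\]
i.e., $\pi_\gamma \circ \widehat\lambda = \lambda(\cdot)(\gamma)$ for all $\gamma \in Q$.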
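Lemma 3.11. Two generalized connections coincide iff they coincide for all (equivalence classes of) paths of a complete subset of $\mathcal{P}_{\mathrm{gen}}$.
3.2.3. Application to Weyl-type operators.
Definition 3.7. Let $Q$ be some hereditary subset of $\mathcal{P}_{\mathrm{gen}}$. Then a map $\kappa : Q \to G$ is called admissible iff
• $\kappa(\delta_1) = \kappa(\delta_2)$ for all $\delta_1, \delta_2 \in Q$ with $\delta_1 \uparrow\uparrow \delta_2$, and
• $\kappa(\gamma_1^{-1}) = \kappa(\gamma_2)$ for all $\gamma \in Q$ and all decompositions $\gamma_1\gamma_2$ of $\gamma$.
Most relevant for the well-definedness of the Weyl operators to be introduced below will be
Theorem 3.12. Let $Q$ be a complete subset of $\mathcal{P}_{\mathrm{gen}}$ and $\kappa : Q \to G$ an admissible map. Then there is a unique map $\Theta : \overline{\mathcal{A}} \to \overline{\mathcal{A}}$, such that, for all $\gamma \in Q$,
\[
h_{\Theta(A)}([\gamma]) = \kappa(\gamma)^{-1}\, h_A([\gamma])\, \kappa(\gamma^{-1}).
\]
Moreover, $\Theta$ is a homeomorphism preserving the Ashtekar-Lewandowski measure $\mu_0$. Hence, the pull-back $\Theta^* : C(\overline{\mathcal{A}}) \to C(\overline{\mathcal{A}})$ is an isometry, and the induced operator in $B(L^2(\overline{\mathcal{A}}, \mu_0))$ is well defined and unitary.
A more general version is proven in [13]. We repeat the corresponding proof here.
Proof. • Define $\lambda : \overline{\mathcal{A}} \to \mathrm{Maps}(Q, G)$ by⁷
\[
\lambda(A)(\gamma) = \kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}).
\]
• First we show that $\lambda(A)$ is indeed in $\mathrm{Germ}(Q, G)$ for all $A \in \overline{\mathcal{A}}$. In fact, for all $\gamma \in Q$ and all decompositions $\gamma_1\gamma_2$ of $\gamma$, we have
\[
\lambda(A)(\gamma^{-1}) = \kappa(\gamma^{-1})^{-1}\, h_A(\gamma^{-1})\, \kappa(\gamma) = \bigl(\kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1})\bigr)^{-1} = \bigl(\lambda(A)(\gamma)\bigr)^{-1}
\]
and
\begin{align*}
\lambda(A)(\gamma) &= \kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}) \\
 &= \kappa(\gamma_1\gamma_2)^{-1}\, h_A(\gamma_1)\, h_A(\gamma_2)\, \kappa(\gamma_2^{-1}\gamma_1^{-1}) \\
 &= \kappa(\gamma_1)^{-1}\, h_A(\gamma_1)\, \kappa(\gamma_1^{-1})\, \kappa(\gamma_2)^{-1}\, h_A(\gamma_2)\, \kappa(\gamma_2^{-1}) \\
 &= \lambda(A)(\gamma_1)\, \lambda(A)(\gamma_2).
\end{align*}
Here, we used the admissibility of $\kappa$ with $\gamma_1 \uparrow\uparrow \gamma_1\gamma_2$ and $\gamma_2^{-1}\gamma_1^{-1} \uparrow\uparrow \gamma_2^{-1}$, together with $\kappa(\gamma_1^{-1}) = \kappa(\gamma_2)$.
⁷ From now on, we will drop the square brackets in all $h_A([\ldots])$.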
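• Next, observe that, for every fixed $\gamma \in Q$,
\[
\lambda(A)(\gamma) = \kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}) \equiv \kappa(\gamma)^{-1}\, \pi_\gamma(A)\, \kappa(\gamma^{-1})
\]
depends continuously on $A$, by definition of the projective-limit topology on $\overline{\mathcal{A}}$.
• Now, by Proposition 3.10, $\Theta := [\widehat{\lambda(\cdot)}] : \overline{\mathcal{A}} \to \overline{\mathcal{A}}$ is continuous, whereas for $\gamma \in Q$,
\[
h_{\Theta(A)}(\gamma) \equiv \Theta(A)([\gamma]) = [\widehat{\lambda(A)}]([\gamma]) = \kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}).
\]
The uniqueness of $\Theta$ follows from the completeness of $Q$.
• To prove that $\Theta$ is a homeomorphism, we explicitly describe the inverse of $\Theta$. Define $\kappa' : Q \to G$ by $\kappa'(\gamma) := \kappa(\gamma)^{-1}$. It is easy to check that $\kappa'$ is admissible. As already proven above, there is a unique continuous map $\Theta' : \overline{\mathcal{A}} \to \overline{\mathcal{A}}$ with $h_{\Theta'(A)}(\gamma) = \kappa'(\gamma)^{-1}\, h_A(\gamma)\, \kappa'(\gamma^{-1})$ for all $\gamma \in Q$. Altogether, this gives
\[
h_{\Theta'(\Theta(A))}(\gamma) = \kappa'(\gamma)^{-1}\, h_{\Theta(A)}(\gamma)\, \kappa'(\gamma^{-1}) = \kappa'(\gamma)^{-1}\, \kappa(\gamma)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1})\, \kappa'(\gamma^{-1}) = h_A(\gamma)
\]
for all $\gamma \in Q$. The completeness of $Q$ and Lemma 3.11 prove $\Theta' \circ \Theta = \mathrm{id}_{\overline{\mathcal{A}}}$. Analogously, one shows $\Theta \circ \Theta' = \mathrm{id}_{\overline{\mathcal{A}}}$.
• $\Theta$ even preserves the Ashtekar-Lewandowski measure. In fact, let $\upsilon$ be an arbitrary, but fixed hyph. By completeness, there is some hyph $\upsilon' = \{\gamma_1, \ldots, \gamma_Y\} \geq \upsilon$ with $Y$ edges and $\upsilon' \subseteq Q$. By construction, we have $\pi_{\upsilon'} \circ \Theta = (\Theta_{\gamma_1} \times \cdots \times \Theta_{\gamma_Y}) \circ \pi_{\upsilon'}$ with $\Theta_\gamma(g) := \kappa(\gamma)^{-1}\, g\, \kappa(\gamma^{-1})$ for $\gamma \in Q$. In other words, each $\Theta_\gamma$ consists of a left and a right translation, whence the Haar measure on $G$ is $\Theta_\gamma$-invariant. Since $\pi^{\upsilon'}_{\upsilon} \circ \pi_{\upsilon'} = \pi_\upsilon$ with continuous $\pi^{\upsilon'}_{\upsilon} : \overline{\mathcal{A}}_{\upsilon'} \to \overline{\mathcal{A}}_{\upsilon}$ and since $(\pi_{\upsilon'})_* \mu_0$ is the $Y$-fold product of the Haar measure on $G$, we get
\begin{align*}
(\pi_\upsilon)_* (\Theta_* \mu_0)
 &= (\pi^{\upsilon'}_{\upsilon})_* (\pi_{\upsilon'} \circ \Theta)_* \mu_0 \\
 &= (\pi^{\upsilon'}_{\upsilon})_* (\Theta_{\gamma_1} \times \cdots \times \Theta_{\gamma_Y})_* (\pi_{\upsilon'})_* \mu_0 \\
 &= (\pi^{\upsilon'}_{\upsilon})_* (\Theta_{\gamma_1} \times \cdots \times \Theta_{\gamma_Y})_* \mu_{\mathrm{Haar}}^{Y} \\
 &= (\pi^{\upsilon'}_{\upsilon})_* \mu_{\mathrm{Haar}}^{Y} \\
 &= (\pi^{\upsilon'}_{\upsilon})_* (\pi_{\upsilon'})_* \mu_0 = (\pi_\upsilon)_* \mu_0 .
\end{align*}
Since finite regular Borel measures on $\overline{\mathcal{A}}$ coincide iff their push-forwards w.r.t. all $\pi_\upsilon$ coincide, we get the assertion.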
We get immediately Corollary 3.13. Let Q be some complete subset of Pgen . Moreover, let Y be some topological space and let κ : Q × Y −→ G be some map, such that
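• $\kappa(\cdot, y) : Q \to G$ is admissible for all $y \in Y$, and
• $\kappa(\gamma, \cdot) : Y \to G$ is continuous for all $\gamma \in Q$.
Then there is a unique map $\Theta : \overline{\mathcal{A}} \times Y \to \overline{\mathcal{A}}$ with
\[
h_{\Theta(A,y)}(\gamma) = \kappa(\gamma, y)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}, y)
\]
for all $\gamma \in Q$. Moreover, $\Theta$ is continuous.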
3.3. Surfaces and fluxes. Originally (see, e.g., [32]), the action of flux operators on cylindrical functions has been given by self-adjoint differential operators. Since these operators are unbounded, one has to study their domains very carefully. To avoid this problem, one usually considers them as generators of unitary, i.e., bounded operators. Now, the flux operators turn into some sort of translation operators. In this section, we are going to shift this action to a still deeper level. We will see that it can be regarded as the pull-back of some continuous action of translations on A itself. 3.3.1. Quasi-surfaces. Before we can define this action we study how paths are decomposed by surfaces. Definition 3.8. Let S be a subset of M. • A path γ ∈ Pgen is called S-external iff (int γ ) ∩ S = ∅. • A path γ ∈ Pgen is called S-internal iff int γ ⊆ S. Observe that the end points of an S-external path may be contained in S. It is only required for the “interior part” of the path, i.e., for all γ (t) with 0 < t < 1 to be outside of S. If S is clear from the context, we simply speak about external and internal edges. Definition 3.9. Let S be some subset of M. Then Q S denotes the set of all paths that are S-external or S-internal. Definition 3.10. Let S be a subset of M and γ ∈ Pgen be an edge. Then a decomposition γ of γ is called S-admissible iff γ ⊆ Q S . In other words, γ = (γ1 , . . . , γ I ) is S-admissible iff γ equals γ1 · · · γ I up to the parametrization and each γi is S-internal or S-external. Lemma 3.14. Let S be a subset of M. Then Q S is complete, if every edge has an S-admissible decomposition. Proof. Heredity is clear. The completeness follows since any (finite) path can be decomposed into a product of edges and trivial paths, hence, by assumption, into a product of S-external or S-internal paths.
Definition 3.11. A subset S of M is called quasi-surface iff every edge γ ∈ Pgen has an S-admissible decomposition.
Examples of quasi-surfaces, in case we are in the (semi)analytic category for the paths, are embedded analytic submanifolds that are even semianalytic.⁸ Note that these submanifolds may have any dimension. Therefore, any collection of points having no accumulation point is a quasi-surface. This even remains true in the category of piecewise smooth paths. On the other hand, there are indeed non-semianalytic submanifolds that are quasi-surfaces. Consider, e.g., the smooth function $f$ on $\mathbb{R}$ with $f(x) := e^{-1/x^2}$ for $x \neq 0$ and $f(0) := 0$. Of course, it is analytic everywhere except at $x = 0$. But its graph $S$ does not form a semianalytic submanifold in, for simplicity, $\mathbb{R}^2$. Nevertheless, it is a quasi-surface. In fact, let $\gamma$ be a piecewise analytic path in $\mathbb{R}^2$. If it does not run through the origin, the statement is trivial. Assume now that $\gamma$ runs through the origin. Decomposing $\gamma$ appropriately, if necessary, we may restrict ourselves to the case of an analytic $\gamma$ starting at the origin without returning there at any other parameter time. Assume next that $\gamma$ has infinitely many intersection points with $S$, and let the origin $0$ be an accumulation point of $\operatorname{int} \gamma \cap S$. W.l.o.g.,⁹ we may finally consider $\gamma$ to be the graph of an analytic function on $\mathbb{R}$, again denoted by $\gamma$. Now use the fact that two $C^\infty$ functions $f_1, f_2$ have identical Taylor coefficients at $0$ if $0$ is an accumulation point of the set $\{f_1 = f_2\}$, to derive that $\gamma$ has only zero Taylor coefficients, just because $f$ does. Analyticity then implies that $\gamma$ is a straight edge along the $x$-axis, never intersecting $S$ again. This contradicts the assumption of infinitely many intersection points, so $\gamma$ meets $S$ only finitely often and the statement follows. If we would like to take even more quasi-surfaces into account, we may reduce the set of paths under consideration. This might be relevant, e.g., in the case of piecewise linear paths, although there the set of manifolds is usually also restricted a priori to that of piecewise linear submanifolds.
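The quasi-surface example above rests on the fact that $f(x) = e^{-1/x^2}$ (with $f(0) = 0$) has only vanishing Taylor coefficients at the origin. A short derivation of this standard fact, with the auxiliary polynomials $P_n$ introduced here purely for illustration, can be sketched as follows.

```latex
% Claim: for x \neq 0 and all n \geq 0 there is a polynomial P_n with
\[
  f^{(n)}(x) = P_n\!\left(\tfrac{1}{x}\right) e^{-1/x^2}, \qquad P_0 = 1 .
\]
% Induction step, writing s = 1/x and using ds/dx = -s^2:
\[
  f^{(n+1)}(x) = \bigl(-s^2 P_n'(s) + 2 s^3 P_n(s)\bigr)\, e^{-s^2}
  \quad\Longrightarrow\quad
  P_{n+1}(s) = -s^2 P_n'(s) + 2 s^3 P_n(s) .
\]
% Since s^k e^{-s^2} \to 0 for |s| \to \infty and every k, induction on n gives
\[
  f^{(n+1)}(0) = \lim_{x \to 0} \frac{f^{(n)}(x) - f^{(n)}(0)}{x}
               = \lim_{|s| \to \infty} s\, P_n(s)\, e^{-s^2} = 0 .
\]
```

Hence all Taylor coefficients of $f$ at $0$ vanish although $f$ is not identically zero, which is why $f$ fails to be analytic at $0$.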
The punctures leading to an $S$-admissible decomposition will be relevant for the definition of Weyl operators. In particular, these operators depend on the transversality properties between the path and the (oriented) hypersurface. Therefore, we need to introduce a general notion for the properties an orientation should encode.
Definition 3.12. Let $S$ be a quasi-surface of $M$.
• A function $\sigma_S : \mathcal{P}_{\mathrm{gen}} \to \mathbb{Z}$ is called
– outgoing intersection function for $S$ iff we have
1. $\sigma_S(\gamma) = 0$ if $\gamma(0) \notin S$ and
2. $\sigma_S(\gamma) = \sigma_S(\gamma')$ for all $\gamma, \gamma' \in \mathcal{P}_{\mathrm{gen}}$ with $\gamma \uparrow\uparrow \gamma'$;
– incoming intersection function for $S$ iff we have
1. $\sigma_S(\gamma) = 0$ if $\gamma(1) \notin S$ and
2. $\sigma_S(\gamma) = \sigma_S(\gamma')$ for all $\gamma, \gamma' \in \mathcal{P}_{\mathrm{gen}}$ with $\gamma \downarrow\downarrow \gamma'$.
• An outgoing intersection function $\sigma_S^-$ and an incoming intersection function $\sigma_S^+$ are called compatible iff $\sigma_S^-(\gamma) + \sigma_S^+(\gamma^{-1}) = 0$ for all $\gamma \in \mathcal{P}_{\mathrm{gen}}$.
⁸ This, however, is no longer true if we drop the semianalyticity (for its definition see Subsect. 6.3). In fact, consider $\mathbb{R}^2$ and a smooth path $\gamma$ in the closed half-plane $y \leq 0$, such that $\gamma$ connects $(-1, 0)$ and $(+1, 0)$ and intersects the straight line $\delta$ between these two points infinitely often without sharing a full segment. (See similar constructions, e.g., in [7,17].) Now define $S$ to be the upper one of the two open sets in $\mathbb{R}^2$ bounded by $\gamma$, by $x = -1$ and by $x = +1$. Of course, $S$ is an embedded analytic manifold, although it is not semianalytic in $\mathbb{R}^2$. Nevertheless, $\delta$ leaves $S$ and returns into it infinitely often. Therefore, there is no $S$-admissible decomposition of $\delta$, whence $S$ is not a quasi-surface.
⁹ Otherwise, restrict the domain of $\gamma$, such that the $x$-component of $\dot\gamma$ is non-zero everywhere. If this is not possible, the $x$-component of $\dot\gamma$ vanishes at $t = 0$. But then $\gamma$ is $S$-external anyway, at least locally.
For brevity, we will denote a compatible pair $(\sigma_S^-, \sigma_S^+)$ of an outgoing and an incoming intersection function by $\sigma_S$ and call it an intersection function for $S$. Moreover, we use $\sigma_S$ and $\sigma_S^-$ synonymously. Sometimes, we write $\sigma(S, \gamma)$ instead of $\sigma_S(\gamma)$ to emphasize that the intersection function may depend on both the quasi-surface and the path.
Definition 3.13. Let $S$ be a quasi-surface of $M$, and let $\sigma_S : \mathcal{P}_{\mathrm{gen}} \to \mathbb{Z}$ be some intersection function for $S$. Then the intersection function $-\sigma_S$ is called inverse to $\sigma_S$.
Definition 3.14. Let $S$ be a quasi-surface with intersection function $\sigma_S$, and let $\gamma \in \mathcal{P}_{\mathrm{gen}}$ be some path. Assume, moreover, that there are only finitely many $\tau_i \in [0, 1]$ with $\gamma(\tau_i) \in S$. We say that the orientation of $S$ coincides with the direction of $\gamma$ iff $\sigma_S^-(\gamma|_{[\tau_i,1]}) = 1$ for all $\tau_i \neq 1$ and $\sigma_S^+(\gamma|_{[0,\tau_i]}) = 1$ for all $\tau_i \neq 0$.
In our applications, we will, e.g., define $\sigma_S(\gamma)$ for an $S$-external path $\gamma$ to be $\pm 1$ (depending on the direction of $\gamma$) if its initial path intersects $S$ transversally, and equal to $0$ otherwise:
Definition 3.15. Let $S$ be an oriented (embedded) hypersurface in $M$ being a quasi-surface of $M$. Then we have:
1. The natural intersection function $\sigma_S : \mathcal{P}_{\mathrm{gen}} \to \mathbb{Z}$ is defined by:
• $\sigma_S(\gamma) = 0$ if $\gamma(0) \notin S$ or $\dot\gamma(0)$ is tangent to $S$;
• $\sigma_S(\gamma) = \pm 1$ if $\gamma(0) \in S$ and $\dot\gamma(0)$ is not tangent to $S$ and some initial path of $\gamma$ lies (except for $\gamma(0)$) above (below) $S$.
2. The topological intersection function $\sigma_S^{\mathrm{top}} : \mathcal{P}_{\mathrm{gen}} \to \mathbb{Z}$ is defined as follows:
• $\sigma_S^{\mathrm{top}}(\gamma) = 0$ if $\gamma(0) \notin S$ or some initial path of $\gamma$ is contained in $S$;
• $\sigma_S^{\mathrm{top}}(\gamma) = \pm 1$ if $\gamma(0) \in S$ and no initial path of $\gamma$ is contained in $S$ and some initial path of $\gamma$ lies (except for $\gamma(0)$) above (below) $S$.
Here, “above” and “below” refer to the orientation of $S$. Moreover, initial paths w.r.t. a trivial interval are not taken into consideration. It is easy to check that this definition is well defined.
Moreover, obviously, for every orientable S there are precisely two natural (and two topological) intersection functions corresponding to the two choices of orientations. They coincide up to the sign. If S is a submanifold of codimension larger than 1, there is no longer just a pair of natural orientations. Nevertheless, in view of the applications we aim at, we may define “natural” orientations: Definition 3.16. Let S be some embedded submanifold of M being a quasi-surface of M and having codimension 2 or higher. Then an intersection function σ S : Pgen −→ Z is called natural (topological) iff there is some oriented embedded hypersurface S in M being a quasi-surface and having σ S as its natural (topological) intersection function. One sees immediately that the number of natural intersection functions of such quasisurfaces with higher codimension may be rather large. For instance, let S be (a bounded part of) a line in R3 . Then we may take all the full circles in R3 having S as its diameter. Of course, there is a continuum of such circles each having another pair of natural or topological intersection functions.
Definition 3.17. • A quasi-surface $S'$ is called quasi-subsurface of some quasi-surface $S$ iff $S'$ is contained in $S$.
• Let $S'$ be a quasi-subsurface of a quasi-surface $S$ having intersection function $\sigma_S$. Then an intersection function $\sigma_{S'}$ is called induced by $\sigma_S$ iff $\sigma_{S'}(\gamma) = \sigma_S(\gamma)$ for all $\gamma$ with $\gamma(0) \in S'$.
Definition 3.16 gives an example for the induction of intersection functions.
Lemma 3.15. The complement of a quasi-surface is a quasi-surface.
Proof. An $S$-admissible decomposition of an edge is also $(M \setminus S)$-admissible.
Lemma 3.16. If S1 and S2 are quasi-surfaces, then S1 ∪S2 and S1 ∩S2 are quasi-surfaces. Proof. If γ is some edge, decompose each path of some S1 -admissible decomposition w.r.t. S2 . It is easy to check that this leads to an S-admissible decomposition of γ with S being S1 ∪ S2 or S1 ∩ S2 .
Corollary 3.17. Let $S_1$ and $S_2$ be quasi-surfaces with intersection functions $\sigma_{S_1}$ and $\sigma_{S_2}$, respectively. Then $\sigma_{S_1} + \sigma_{S_2}$ is an intersection function for $S := S_1 \cup S_2$. If, additionally, $\sigma_{S_1}$ and $\sigma_{S_2}$ coincide for all paths starting at $S_1 \cap S_2$, then the function $\sigma_{S_1 S_2}$ defined by
\[
\sigma_{S_1 S_2}(\gamma) :=
\begin{cases}
\sigma_{S_1}(\gamma) & \text{if } \gamma(0) \in S_1, \\
0 & \text{if } \gamma(0) \notin S_1 \cup S_2, \\
\sigma_{S_2}(\gamma) & \text{if } \gamma(0) \in S_2,
\end{cases}
\]
is an intersection function for $S$. It is called joint intersection function. Obviously, the joint intersection function equals $\sigma_{S_1} + \sigma_{S_2}$ if $S_1$ and $S_2$ are disjoint.
Sometimes, it is convenient to use some sort of standard decomposition of edges. Indeed, there is a minimal decomposition.
Definition 3.18. Let $S$ be a subset of $M$ and $\gamma \in \mathcal{P}_{\mathrm{gen}}$ be an edge. An $S$-admissible decomposition $\boldsymbol\gamma$ of $\gamma$ is called minimal iff $\boldsymbol\gamma' \geq \boldsymbol\gamma$ for any other $S$-admissible decomposition $\boldsymbol\gamma'$ of $\gamma$. In other words, $\boldsymbol\gamma = (\gamma_1, \ldots, \gamma_I)$ is minimal iff every other $S$-admissible decomposition $\boldsymbol\gamma' = (\gamma'_1, \ldots, \gamma'_J)$ is a refinement of $\boldsymbol\gamma$, i.e., there are $0 = j_0 < j_1 < \cdots < j_I = J$, such that $\gamma_i$ equals $\gamma'_{j_{i-1}+1} \cdots \gamma'_{j_i}$ up to the parametrization.
Lemma 3.18. If an edge $\gamma$ has any $S$-admissible decomposition, it also has a minimal $S$-admissible decomposition. Moreover, this minimal decomposition is unique up to the parametrization of its components.
Proof. Let $\boldsymbol\delta$ be an $S$-admissible decomposition of $\gamma$. Since $\gamma$ equals $\delta_1 \cdots \delta_K$ up to the parametrization, the parameter domain $[0, 1]$ of $\gamma$ may be decomposed into nontrivial closed intervals $R_k = [t_{k-1}, t_k] \subseteq [0, 1]$, such that each $\gamma|_{R_k}$ corresponds to $\delta_k$. Now cancel in $T := \{t_0, t_1, \ldots, t_K\}$ each $t_k \neq 0, 1$ with $\operatorname{int} \gamma|_{[t_{k-1},t_{k+1}]} \cap S = \emptyset$ or $\operatorname{int} \gamma|_{[t_{k-1},t_{k+1}]} \subseteq S$. The remaining set $T' = \{\tau_0, \ldots, \tau_I\} \subseteq T$ naturally defines another $S$-admissible decomposition $\boldsymbol\gamma = (\gamma_1, \ldots, \gamma_I)$ of $\gamma$ and a corresponding decomposition of $[0, 1]$ into intervals $P_i$. Let now $\boldsymbol\gamma' = (\gamma'_1, \ldots, \gamma'_J)$ be any $S$-admissible decomposition of $\gamma$. Then each $\gamma'_j$ corresponds to some interval $Q_j \subseteq [0, 1]$ with $\gamma|_{Q_j} = \gamma'_j$. Assume that $Q_j$ overlaps two different intervals $P_i$ and $P_{i+1}$, i.e., $\gamma(\tau_i) \in \operatorname{int} \gamma'_j$.
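• Let $\gamma(\tau_i) \in S$. Then $\operatorname{int} \gamma'_j = \operatorname{int} \gamma|_{Q_j} \subseteq S$, hence $\operatorname{int} \gamma|_{P_i} \subseteq S$ and $\operatorname{int} \gamma|_{P_{i+1}} \subseteq S$, by admissibility. Consequently, $\operatorname{int} \gamma|_{P_i \cup P_{i+1}} = \operatorname{int} \gamma|_{P_i} \cup \{\gamma(\tau_i)\} \cup \operatorname{int} \gamma|_{P_{i+1}} \subseteq S$. This implies $\tau_i \notin T'$, in contradiction to the construction of $\boldsymbol\gamma$.
• Let $\gamma(\tau_i) \notin S$. Then, analogously, we get a contradiction.
Consequently, $Q_j$ can overlap nontrivially only either $P_i$ or $P_{i+1}$.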
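The cancellation step in the proof of Lemma 3.18 can be sketched as a small procedure. The encoding is an assumption made for illustration only: an $S$-admissible decomposition is given by its break points in $[0, 1]$, a label `'ext'` or `'int'` for each piece, and a flag telling whether the image of each break point lies in $S$; the function name is hypothetical.

```python
from fractions import Fraction as F

def minimal_decomposition(breaks, kinds, in_S):
    """Remove every interior break point t_k whose two neighbouring pieces
    merge into a single S-external or S-internal piece.

    breaks : t_0 < t_1 < ... < t_K with t_0 = 0 and t_K = 1
    kinds  : kinds[k] is 'ext' or 'int' for the piece over [t_k, t_{k+1}]
    in_S   : in_S[k] is True iff gamma(t_k) lies in S
    """
    kept_breaks = [breaks[0]]
    kept_kinds = []
    for k in range(1, len(breaks) - 1):
        left, right = kinds[k - 1], kinds[k]
        # Merged interior = both interiors plus the point gamma(t_k):
        # it avoids S iff both pieces are external and gamma(t_k) is not in S,
        # it lies in S iff both pieces are internal and gamma(t_k) is in S.
        removable = left == right and in_S[k] == (left == "int")
        if not removable:
            kept_breaks.append(breaks[k])
            kept_kinds.append(left)
    kept_breaks.append(breaks[-1])
    kept_kinds.append(kinds[-1])
    return kept_breaks, kept_kinds
```

As a usage example, take $\gamma(t) = 2t - 1$ on the real line and $S = \{0\}$: the admissible decomposition with break points $0, \tfrac14, \tfrac12, \tfrac34, 1$ (all pieces external, only $\gamma(\tfrac12)$ in $S$) reduces to the break points $0, \tfrac12, 1$, and any other admissible decomposition of this path reduces to the same result.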
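Definition 3.19. Let $S$ be a quasi-surface with intersection function $\sigma_S$, let $\gamma$ be an edge and let $\boldsymbol\gamma = \{\gamma_i\}_{i=0}^{n}$ be its minimal $S$-admissible decomposition. Then a point $x \in M$ is called
• $\gamma$-puncture in $S$ iff there is an $i \in [1, n]$ with
\[
\gamma_{i-1}(1) = x = \gamma_i(0) \quad\text{and}\quad \sigma_S^+(\gamma_{i-1})\, \sigma_S^-(\gamma_i) > 0;
\]
• $\gamma$-half-puncture in $S$ iff there is an $i \in [0, n]$ with
\[
x = \gamma_i(0) \ \text{ and } \ \sigma_S^-(\gamma_i) \neq 0
\qquad\text{or}\qquad
x = \gamma_i(1) \ \text{ and } \ \sigma_S^+(\gamma_i) \neq 0.
\]
We say that $\gamma$ intersects $S$ completely transversally iff there are no $S$-internal edges in the minimal $S$-admissible decomposition of $\gamma$ and each $\gamma$-half-puncture is also a $\gamma$-puncture. Roughly speaking, $x$ is a $\gamma$-puncture iff $\gamma$ intersects $S$ (w.r.t. $\sigma_S$) transversally at $x$.
3.3.2. Quasi-flux action. In this subsection, $S$ is some quasi-surface and $\sigma_S$ some intersection function for $S$.
Proposition 3.19. There is a unique map $\Theta^{S,\sigma_S} : \overline{\mathcal{A}} \times \mathrm{Maps}(M, G) \to \overline{\mathcal{A}}$, such that
\[
h_{\Theta^{S,\sigma_S}(A,d)}(\gamma) =
\begin{cases}
d(\gamma(0))^{\sigma_S^-(\gamma)}\; h_A(\gamma)\; d(\gamma(1))^{\sigma_S^+(\gamma)} & \text{for $S$-external } \gamma, \\
h_A(\gamma) & \text{for $S$-internal } \gamma.
\end{cases}
\]
If $\mathrm{Maps}(M, G) \cong G^M$ is given the product topology, then $\Theta^{S,\sigma_S}$ is continuous. Moreover, the map $\Theta_d^{S,\sigma_S} : \overline{\mathcal{A}} \to \overline{\mathcal{A}}$, given by $\Theta_d^{S,\sigma_S}(A) := \Theta^{S,\sigma_S}(A, d)$, is a homeomorphism and preserves the Ashtekar-Lewandowski measure for each $d \in \mathrm{Maps}(M, G)$. Finally, the inverse of $\Theta_d^{S,\sigma_S}$ is given by $\Theta_{d^{-1}}^{S,\sigma_S}$.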
Proof. • $\Theta^{S,\sigma_S}$ exists uniquely and is continuous for the product topology on $\mathrm{Maps}(M, G)$. First note that $Q_S$ is complete by Lemma 3.14. Let now $Y := \mathrm{Maps}(M, G)$ and define
\[
\kappa(\gamma, d) :=
\begin{cases}
d(\gamma(0))^{-\sigma_S^-(\gamma)} & \text{if $\gamma$ is $S$-external,} \\
e_G & \text{if $\gamma$ is $S$-internal.}
\end{cases}
\]
The only nontrivial property of $\kappa$ in Corollary 3.13 to be checked is $\kappa(\gamma_1^{-1}, d) = \kappa(\gamma_2, d)$ for decompositions $\gamma_1\gamma_2$ of $S$-external $\gamma$. Observe, however, that here $\gamma_1^{-1}(0) \equiv \gamma_1(1) \equiv \gamma_2(0)$ is not contained in $S$, hence $\kappa(\gamma_1^{-1}, d) = e_G = \kappa(\gamma_2, d)$. The claim now follows from $\sigma_S^-(\gamma) + \sigma_S^+(\gamma^{-1}) = 0$ and Corollary 3.13.
• $\Theta_d^{S,\sigma_S}$ is a homeomorphism and leaves $\mu_0$ invariant. This now follows from Theorem 3.12.
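The last step of the first part of the proof can be spelled out. The following short computation is a sketch that writes $\Theta^{S,\sigma_S}$ for the quasi-flux action (a notational assumption here) and inserts the $\kappa$ defined in the proof into the formula of Corollary 3.13 for an $S$-external path $\gamma$, whose inverse $\gamma^{-1}$ is $S$-external as well.

```latex
\begin{align*}
h_{\Theta^{S,\sigma_S}(A,d)}(\gamma)
  &= \kappa(\gamma, d)^{-1}\, h_A(\gamma)\, \kappa(\gamma^{-1}, d) \\
  &= d(\gamma(0))^{\sigma_S^-(\gamma)}\; h_A(\gamma)\; d(\gamma^{-1}(0))^{-\sigma_S^-(\gamma^{-1})} \\
  &= d(\gamma(0))^{\sigma_S^-(\gamma)}\; h_A(\gamma)\; d(\gamma(1))^{\sigma_S^+(\gamma)} .
\end{align*}
% Last line: \gamma^{-1}(0) = \gamma(1), and compatibility applied to \gamma^{-1}
% gives \sigma_S^-(\gamma^{-1}) + \sigma_S^+(\gamma) = 0.
```

For $S$-internal $\gamma$ both factors $\kappa$ equal $e_G$, so the holonomy is unchanged, in agreement with the second case of Proposition 3.19.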
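Definition 3.20. $\Theta^{S,\sigma_S} : \overline{\mathcal{A}} \times \mathrm{Maps}(M, G) \to \overline{\mathcal{A}}$ is called quasi-flux action.
Remark. Note that $\Theta^{S,\sigma_S}$ is, in general, not a group action of $\mathrm{Maps}(M, G)$. But we have
Lemma 3.20. Let $S_1$ and $S_2$ be two quasi-surfaces, and let $d_1, d_2 : M \to G$ be functions commuting on $S_1 \cap S_2$. Let $d : M \to G$ be any function with
\[
d :=
\begin{cases}
d_1 & \text{on } S_1 \setminus S_2, \\
d_1 d_2 & \text{on } S_1 \cap S_2, \\
d_2 & \text{on } S_2 \setminus S_1.
\end{cases}
\]
If $\sigma_{S_1}$ and $\sigma_{S_2}$ coincide for all paths starting in $S_1 \cap S_2$ and both vanish for $S_1$- and $S_2$-internal paths, then
\[
\Theta_{d_1}^{S_1,\sigma_{S_1}} \circ \Theta_{d_2}^{S_2,\sigma_{S_2}}
 = \Theta_{d}^{S_1 \cup S_2,\, \sigma_{S_1 S_2}}
 = \Theta_{d_2}^{S_2,\sigma_{S_2}} \circ \Theta_{d_1}^{S_1,\sigma_{S_1}} .
\]
Proof. By direct calculation.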
Corollary 3.21. Let $d_1, d_2 : M \to G$ be two functions. If $d_1$ and $d_2$ commute pointwise, we have $\Theta_{d_1}^{S,\sigma_S} \circ \Theta_{d_2}^{S,\sigma_S} = \Theta_{d_1 d_2}^{S,\sigma_S}$.
Proof. Straightforward.
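The behaviour of the quasi-flux action under composition (Proposition 3.19, the Remark after Definition 3.20 and Corollary 3.21) can be explored in a one-dimensional toy model. All choices in this sketch are assumptions for illustration: $M$ is the rational line, $S = \{0\}$, paths are oriented intervals `(a, b)`, $G$ is the permutation group $S_3$, the intersection function counts paths leaving $0$ to the right as $+1$ and to the left as $-1$, and a connection is represented only through its holonomy function `hol`.

```python
from fractions import Fraction as F
from itertools import permutations

PERMS = list(permutations(range(3)))   # the finite group S_3 as a toy G
E = (0, 1, 2)                          # identity permutation

def mul(p, q):
    """Group product (p after q) of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def power(p, k):
    """Integer power of a group element, negative exponents allowed."""
    if k < 0:
        p, k = inv(p), -k
    out = E
    for _ in range(k):
        out = mul(out, p)
    return out

def sigma_minus(a, b):
    """Outgoing intersection function for S = {0}: zero unless a path starts at 0."""
    if a != 0:
        return 0
    return 1 if b > 0 else -1

def sigma_plus(a, b):
    """Compatible incoming function: sigma_plus(gamma) = -sigma_minus(gamma^{-1})."""
    return -sigma_minus(b, a)

def flux_action(hol, d):
    """Holonomies of the transformed connection for a function d: M -> G."""
    def new_hol(a, b):
        if min(a, b) < 0 < max(a, b):          # path crosses S: split at 0
            return mul(new_hol(a, 0), new_hol(0, b))
        left = power(d(a), sigma_minus(a, b))   # factor at the starting point
        right = power(d(b), sigma_plus(a, b))   # factor at the end point
        return mul(mul(left, hol(a, b)), right)
    return new_hol

def u(t):
    """Hypothetical frame used to build a sample connection."""
    return PERMS[int(2 * t) % 6]

def hol(a, b):
    """Sample holonomy function, multiplicative under composition of paths."""
    return mul(inv(u(a)), u(b))
```

In this model a path crossing $S$ transversally picks up the factor $d(0)^2$ at the puncture, the inverse of the action of $d$ is the action of $d^{-1}$, and the composition rule of Corollary 3.21 holds for commuting constant functions but fails on the path from $0$ to $-1$ for two non-commuting transpositions, which illustrates why the quasi-flux action is not a group action in general.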
3.4. Weyl operators. Recall that every continuous map $\psi : X \to X$ on a topological space $X$ defines a continuous pull-back map $\psi^* : C(X) \to C(X)$. This map is an isometry if $\psi$ is surjective. If $X$ is even a compact Hausdorff space, $\psi$ is surjective and $\mu$ is a (finite) regular Borel measure on $X$ with $\psi_* \mu = \mu$, then $\psi^*$ is a unitary operator on $L^2(X, \mu)$. This motivates
Definition 3.21. The operators $w_d^{S,\sigma_S} := (\Theta_d^{S,\sigma_S})^*$, with $S$ being a quasi-surface, $\sigma_S$ an intersection function and $d : M \to G$ being any function, are called Weyl operators.
Note that each Weyl operator is a map both on $C(\overline{\mathcal{A}})$ and on $L^2(\overline{\mathcal{A}}, \mu_0)$. In fact, Proposition 3.19 gives
Proposition 3.22. • Every Weyl operator is an isometry on $C(\overline{\mathcal{A}})$.
• Every Weyl operator is a unitary operator on $L^2(\overline{\mathcal{A}}, \mu_0)$.
Note, however, that measures, in general, lead to Weyl operators that are ill defined on the $L^2$-functions. For instance, let us work in the analytic category, and fix some hypersurface $S$ and some intersection function $\sigma_S$. Assume now that, $g$ running over $G$, we have all Weyl operators at our disposal that are given by
\[
w_g^{S,\sigma_S} := w_{d_g}^{S,\sigma_S},
\]
where $d_g$ is the constant function on $M$ with value $g \in G$. To make all these Weyl operators well defined as operators on $L^2(\overline{\mathcal{A}}, \mu)$ for some $\mu$, we have at least to demand that, for each $S$-external edge $\gamma$ (having only one end attached to $S$), the support of the push-forward measure $(\pi_\gamma)_* \mu$ equals $G$. Of course, there are many measures without this property.
Let us now collect some additional properties of Weyl operators, again following directly from the properties of $\Theta$ and the definition of Weyl operators by pull-backs.
Lemma 3.23. Let $S$ be a quasi-surface and let $d, d_1, d_2 : M \to G$ be some functions. Then we have (always dropping the upper indices $S, \sigma_S$ in $w_d^{S,\sigma_S}$):
1. $w_d(f_1 f_2) = w_d(f_1)\, w_d(f_2)$ for all functions $f_1, f_2$ on $\overline{\mathcal{A}}$.
2. $w_{d_1} w_{d_2} = w_{d_1 d_2}$, if $d_1 d_2 = d_2 d_1$.
Corollary 3.24. For all quasi-surfaces $S$, all intersection functions $\sigma_S$ and all functions $d : M \to G$, we have
\[
w_d^{S,-\sigma_S} = w_{d^{-1}}^{S,\sigma_S} = (w_d^{S,\sigma_S})^{-1} \equiv (w_d^{S,\sigma_S})^* .
\]
The preceding corollary implies that the inversion of the orientation of a quasi-surface leads to the adjoint Weyl operator. The uniqueness proof in Sect. 7 will heavily use this fact.
Corollary 3.25. Let $\upsilon = \{\gamma_1, \ldots, \gamma_n\}$ be a hyph. Then
\[
w_d^{S,\sigma_S}(T_1 \otimes \cdots \otimes T_n) = w_d^{S,\sigma_S}(T_1) \otimes \cdots \otimes w_d^{S,\sigma_S}(T_n)
\]
for all $T_i \in M_{\gamma_i}$ and all functions $d : M \to G$.
Corollary 3.17 implies
Lemma 3.26. Let $S_1$ and $S_2$ be disjoint quasi-surfaces with intersection functions $\sigma_{S_1}$ and $\sigma_{S_2}$, respectively. Let, moreover, $d_1, d_2 : M \to G$ be some functions. Then we have
\[
w_{d_1}^{S_1,\sigma_{S_1}} \circ w_{d_2}^{S_2,\sigma_{S_2}} = w_{d_2}^{S_2,\sigma_{S_2}} \circ w_{d_1}^{S_1,\sigma_{S_1}} .
\]
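Lemma 3.27. Let $\upsilon$ be a hyph and $w$ be a Weyl operator for some quasi-surface $S$. Then there is a hyph $\upsilon' \geq \upsilon$ with $w(M_\upsilon) \subseteq \operatorname{span} M_{\upsilon'}$. If, moreover, $\upsilon$ contains $S$-external and $S$-internal edges only, then $w(M_\upsilon) \subseteq \operatorname{span} M_\upsilon$.
Proof. Choose a hyph $\upsilon' \geq \upsilon$ containing $S$-external and $S$-internal edges only. One checks immediately that $w(M_{\upsilon'}) \subseteq \operatorname{span} M_{\upsilon'}$. Using $M_\upsilon \subseteq \operatorname{span} M_{\upsilon'}$ (Lemma 3.2), we get the assertion.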
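3.5. Regularity.
Proposition 3.28. Fix some quasi-surface $S$ and some intersection function $\sigma_S$ for $S$. Next, let $\Lambda_0$ be a set of sequential $\mathrm{Maps}(M, G)$-functions, such that¹⁰ $\mathrm{pr}_x \circ \lambda_0 : \operatorname{dom} \lambda_0 \to G$ is sequentially continuous for every $x \in M$ and each $\lambda_0 \in \Lambda_0$. Finally, assign to each $\lambda_0$ some
\[
\lambda : \operatorname{dom} \lambda_0 \to \mathcal{W}, \qquad y \mapsto w_{\lambda_0(y)}^{S,\sigma_S},
\]
$\mathcal{W}$ being the set of Weyl operators, and collect all such $\lambda$ into $\Lambda$. Then $\lambda(\cdot)\psi : \operatorname{dom} \lambda_0 \to \mathcal{H}_{\mathrm{aux}} \equiv L^2(\overline{\mathcal{A}}, \mu_0)$ is continuous for all $\psi \in \mathcal{H}_{\mathrm{aux}}$ and each $\lambda \in \Lambda$.
¹⁰ Here, $\mathrm{pr}_x : \mathrm{Maps}(M, G) \to G$ assigns to each function from $M$ to $G$ its value at $x$.
Proof. Fix some $\lambda \in \Lambda$ with corresponding $\lambda_0 \in \Lambda_0$ and recall that sequential continuity equals continuity if the domain is sequential. To avoid cumbersome notation, we write $w_y$ for $w_{\lambda_0(y)}^{S,\sigma_S}$.
• Of course, $w_y(1) = 1$ for all $y$.
• Let $\gamma$ be an edge and $T \in M_\gamma$ some gauge-variant spin network state over $\gamma$.
– If $\gamma$ is internal, then $w_y(T) = T$ for all $y$, hence $y \mapsto w_y(T)$ is continuous.
– If $\gamma$ is external, then with $T = \sqrt{\dim\phi}\; \phi^k_l$ and after a straightforward calculation, we have
\begin{align*}
\| w_{y'}(T) - w_y(T) \|^2_{\mathcal{H}_{\mathrm{aux}}}
 &= 2 - 2 \operatorname{Re} \Bigl[ \phi^k_k\bigl( [\lambda_0(y')](\gamma(0))^{\sigma_S^-(\gamma)}\, [\lambda_0(y)](\gamma(0))^{-\sigma_S^-(\gamma)} \bigr) \\
 &\qquad\qquad\quad \cdot\, \phi^l_l\bigl( [\lambda_0(y)](\gamma(1))^{-\sigma_S^+(\gamma)}\, [\lambda_0(y')](\gamma(1))^{\sigma_S^+(\gamma)} \bigr) \Bigr] \\
 &\equiv 2 - 2 \operatorname{Re} \Bigl[ \phi^k_k\bigl( [\mathrm{pr}_{\gamma(0)} \circ \lambda_0](y')^{\sigma_S^-(\gamma)}\, [\mathrm{pr}_{\gamma(0)} \circ \lambda_0](y)^{-\sigma_S^-(\gamma)} \bigr) \\
 &\qquad\qquad\quad \cdot\, \phi^l_l\bigl( [\mathrm{pr}_{\gamma(1)} \circ \lambda_0](y)^{-\sigma_S^+(\gamma)}\, [\mathrm{pr}_{\gamma(1)} \circ \lambda_0](y')^{\sigma_S^+(\gamma)} \bigr) \Bigr] .
\end{align*}
(There is no summation over $k$ and $l$.) Since, by assumption, each $\mathrm{pr}_x \circ \lambda_0$ is a continuous mapping from $\operatorname{dom} \lambda_0$ to $G$, we get $\| w_{y'}(T) - w_y(T) \|_{\mathcal{H}_{\mathrm{aux}}} \to 0$ for $y' \to y$, implying the desired continuity of $y \mapsto w_y(T)$.
• Let $\upsilon$ contain external and internal edges only. Let, moreover, $T = T_1 \otimes \cdots \otimes T_Y$ be in $M_\upsilon$. Then we have
\begin{align*}
\| w_{y'}(T) - w_y(T) \|^2_{\mathcal{H}_{\mathrm{aux}}}
 &= 2 - 2 \operatorname{Re} \langle w_{y'} T, w_y T \rangle_{\mathcal{H}_{\mathrm{aux}}} \\
 &= 2 - 2 \operatorname{Re} \langle w_{y'} T_1 \otimes \cdots \otimes w_{y'} T_Y,\; w_y T_1 \otimes \cdots \otimes w_y T_Y \rangle_{\mathcal{H}_{\mathrm{aux}}} \\
 &= 2 - 2 \operatorname{Re} \prod_i \langle w_{y'} T_i, w_y T_i \rangle_{\mathcal{H}_{\mathrm{aux}}} \\
 &\to 2 - 2 \operatorname{Re} \prod_i \langle w_y T_i, w_y T_i \rangle_{\mathcal{H}_{\mathrm{aux}}} && \text{($w_{y'} T_i \to w_y T_i$ by the preceding step)} \\
 &= 0
\end{align*}
for $y' \to y$. The factorization of the scalar products was possible because $w_y$ leaves the span of (non-trivial) matrix functions over $\gamma_i$ invariant and because such spans are orthogonal w.r.t. $\mu_0$ for paths in a hyph.
• Let now $T \in M_{\mathrm{SN}}$ be an arbitrary gauge-variant spin network function, i.e., there is a hyph $\upsilon$ with $T \in M_\upsilon$. Then there is some hyph $\upsilon' \geq \upsilon$ containing external and internal edges only. Since $M_\upsilon \subseteq \operatorname{span} M_{\upsilon'}$ by Lemma 3.2, we get $w_{y'}(T) \to w_y(T)$ for $y' \to y$.
• Now, Lemma A.1 gives the proof: The span of $M_{\mathrm{SN}} = \bigcup_\upsilon M_\upsilon$ is dense in $L^2(\overline{\mathcal{A}}, \mu_0)$, and $\| w_y \| = 1$ for all $y$ by unitarity.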
A typical example is given by the continuous (or differentiable) functions w.r.t. the supremum norm:

Definition 3.22. Let $S$ be some quasi-surface and $\sigma_S$ some intersection function for $S$. Now let $\Lambda_{p,S,\sigma_S}$ for $p \in \mathbb{N} \cup \{\infty, \omega\}$ contain precisely all mappings
$$w^{S,\sigma_S}_{\cdot} : C^p(M, G) \to \mathcal{W}, \qquad d \mapsto w^{S,\sigma_S}_d,$$
where $C^p(M, G)$ is equipped with the supremum norm on $S$.

We now may transfer this result to one-parameter subgroups. Using the one-parameter subgroups on $G$ induced by the elements of the Lie algebra $\mathfrak{g}$, we have

Corollary 3.29. Let $d : M \to \mathfrak{g}$ be a (not necessarily continuous) function, and define
$$E_d : \mathbb{R} \to \mathrm{Maps}(M, G), \qquad t \mapsto (e^{t\, d(x)})_{x \in M}.$$
Then we have:
1. $E_d(t_1) E_d(t_2) = E_d(t_1 + t_2)$ for all $t_1, t_2 \in \mathbb{R}$.
2. $\mathrm{pr}_x \circ E_d$ is continuous for every $x \in M$.
3. The one-parameter subgroup $t \mapsto w^{S,\sigma_S}_{E_d(t)}$ is strongly continuous w.r.t. $L_2(\overline{\mathcal{A}}, \mu_0)$ for each quasi-surface $S$ with intersection function $\sigma_S$.

Proof. The first two assertions are trivial. To see the strong continuity, apply Proposition 3.28 to the case $\Lambda_0 := \{E_d : \mathbb{R} \to \mathrm{Maps}(M, G)\}$.
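The first two assertions of Corollary 3.29 can be illustrated in a toy setting. The following is a minimal sketch, assuming $G = U(1)$ (unit complex numbers), $M = \mathbb{R}$, and a hypothetical discontinuous function `d`; none of these choices come from the text, and the names `E_d` and `pr` merely mirror the notation above.

```python
import cmath

def d(x: float) -> complex:
    """Hypothetical Lie-algebra-valued function on M = R for G = U(1).

    The Lie algebra of U(1) is the imaginary axis. The jump at x = 0 is
    deliberate, since d need not be continuous.
    """
    return 1j * (2.0 if x < 0 else -0.5)

def E_d(t: float):
    """Return E_d(t) in Maps(M, G), i.e. the map x -> exp(t * d(x))."""
    return lambda x: cmath.exp(t * d(x))

def pr(x: float, g):
    """Evaluation pr_x: a function from M to G is sent to its value at x."""
    return g(x)

sample_points = [-1.0, -1e-9, 0.0, 3.0]
t1, t2 = 0.7, -1.9

# Assertion 1: E_d(t1) E_d(t2) = E_d(t1 + t2), checked pointwise.
group_law_error = max(
    abs(pr(x, E_d(t1)) * pr(x, E_d(t2)) - pr(x, E_d(t1 + t2)))
    for x in sample_points
)

# Assertion 2: t -> pr_x(E_d(t)) is continuous; here probed at t = 0.
continuity_gap = max(
    abs(pr(x, E_d(1e-8)) - pr(x, E_d(0.0))) for x in sample_points
)
print(group_law_error, continuity_gap)
```

The sketch only probes the pointwise statements; the strong continuity of the associated Weyl operators is the content of Proposition 3.28 and is not simulated here.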
3.6. Graphomorphisms. One of the particular features of quantum geometry is its invariance w.r.t. diffeomorphisms of M. More precisely, diffeomorphisms act naturally on the paths inducing a µ0 -invariant action on A and, consequently, a unitary action on Haux . The question remains, what kind of diffeomorphisms are to be admitted: analytic, piecewise analytic, smooth or something else? Anyway, we will postpone this discussion to Sect. 4 and consider here only some sort of minimal requirements. For this, let us again fix some smoothness class for the manifold and the paths in it. Definition 3.23. A map ϕ : M −→ M is called graphomorphism iff ϕ is bijective and induces a groupoid isomorphism on P. [13]
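Here, $\varphi(\gamma) := \varphi \circ \gamma$. Graphomorphisms have a convenient characterization [13]:

Lemma 3.30. A bijection $\varphi$ on $M$ is a graphomorphism iff $\varphi$ and $\varphi^{-1}$ map edges to $\mathcal{P}_{\mathrm{gen}}$.

The action of graphomorphisms on $\mathcal{P}$ can be lifted to an action on $\overline{\mathcal{A}}$. In fact, each graphomorphism $\varphi$ defines via
$$[\varphi(A)](\gamma) := h_A(\varphi^{-1} \circ \gamma) \qquad \text{for all } \gamma \in \mathcal{P}$$
a map from $\overline{\mathcal{A}}$ to $\mathrm{Maps}(\mathcal{P}, G)$, again denoted by $\varphi$. We have

Proposition 3.31. Every graphomorphism $\varphi$ maps $\overline{\mathcal{A}}$ homeomorphically to $\overline{\mathcal{A}}$.

Proof. Of course, $\varphi$ maps $\overline{\mathcal{A}}$ to $\overline{\mathcal{A}}$. Moreover, $\pi_\gamma \circ \varphi = \pi_{\varphi^{-1} \circ \gamma}$ is continuous for all $\gamma \in \mathcal{P}$. Hence, $\varphi$ is continuous. The proof now follows, since $\varphi \circ \varphi^{-1}$ is the identity on $\overline{\mathcal{A}}$.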
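Proposition 3.32. The Ashtekar-Lewandowski measure $\mu_0$ is $\varphi$-invariant for all graphomorphisms $\varphi$.

Proof. This follows, because for all hyphs $\upsilon$,
$$(\pi_\upsilon)_* (\varphi_* \mu_0) = (\pi_{\varphi^{-1} \circ \upsilon})_* \mu_0 = \mu_{\mathrm{Haar}}^{\#(\varphi^{-1} \circ \upsilon)} = \mu_{\mathrm{Haar}}^{\#\upsilon} = (\pi_\upsilon)_* \mu_0.$$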
Definition 3.24. For each graphomorphism $\varphi$ define $\alpha_\varphi$ to be the pull-back of $\varphi^{-1}$.

Proposition 3.33. For every graphomorphism $\varphi$,
• $\alpha_\varphi$ is an isometry on $C(\overline{\mathcal{A}})$;
• $\alpha_\varphi$ is a unitary operator on $L_2(\overline{\mathcal{A}}, \mu_0)$.

The map $\varphi \mapsto \alpha_\varphi$ is even a representation of the group of graphomorphisms on $L_2(\overline{\mathcal{A}}, \mu_0)$, because $\alpha_{\varphi_1 \circ \varphi_2} = \alpha_{\varphi_1} \circ \alpha_{\varphi_2}$ and $\alpha_{\varphi^{-1}} = \alpha_\varphi^{-1}$.$^{11}$ Graphomorphisms do not only act on graphs, but also on quasi-surfaces, intersection functions and other functions.

Definition 3.25. Let $\varphi$ be a graphomorphism. Then we set:
• $\varphi(S) := \varphi \circ S$ for every quasi-surface $S$;
• $\varphi(d) := d \circ \varphi^{-1}$ for every function $d : M \to G$;
• $[\varphi(\sigma)](S, \gamma) := \sigma(\varphi^{-1}(S), \varphi^{-1}(\gamma))$ for every intersection function $\sigma$.

We, therefore, will have to guarantee that admissible homeomorphisms preserve not only the set of paths under consideration, but also that of quasi-surfaces, and we have to avoid ill-defined intersection functions, in particular if we aim at an “intrinsic” assignment of intersection functions to quasi-surfaces. All that will be provided by using stratified analytic isomorphisms, to be discussed below. Directly from the definitions, we finally get

Proposition 3.34. Let $\varphi : M \to M$ be a graphomorphism, $S$ a quasi-surface, $\sigma$ an intersection function and $d : M \to G$ a function. Then we have
$$w^{\varphi(S),\varphi(\sigma)}_{\varphi(d)} = \alpha_\varphi(w^{S,\sigma}_d) \equiv \alpha_\varphi \circ w^{S,\sigma}_d \circ \alpha_\varphi^{-1}.$$
11 Note that we did not care about the corresponding covariance property for the Weyl operators. In fact, there w is given by the pull-back of , not of −1. Since, however, the -transforms do not form a group, that does not matter.
3.7. Generalized gauge transforms. Any gauge theory incorporates gauge invariance. Therefore, we close this section with a few remarks on gauge transformations and, more generally, bundle automorphisms.

Definition 3.26. The elements of $\overline{\mathcal{G}} := \mathrm{Maps}(M, G)$ are called generalized gauge transforms.$^{12}$ $\overline{\mathcal{G}}$ is given the product topology inherited from the canonical isomorphism $\mathrm{Maps}(M, G) \cong G^M$ and its group structure is given by pointwise multiplication.

Proposition 3.35. $\overline{\mathcal{G}}$ is a topological group and acts continuously on $\overline{\mathcal{A}}$ via
$$h_{A \circ g}(\gamma) := g(\gamma(0))^{-1}\, h_A(\gamma)\, g(\gamma(1)) \qquad \text{for all } \gamma \in \mathcal{P}.$$

Proposition 3.36. The Ashtekar-Lewandowski measure $\mu_0$ is invariant w.r.t. all generalized gauge transforms.

Definition 3.27. For each generalized gauge transform $g$ define $\beta_g$ on functions on $\overline{\mathcal{A}}$ by $(\beta_g f)(A) := f(A \circ g)$.

Observe that $\beta_{g_1 \circ g_2} = \beta_{g_1} \circ \beta_{g_2}$ and $\beta_{g^{-1}} = \beta_g^{-1}$.

Proposition 3.37.
• $g \mapsto \beta_g$ is a representation of $\overline{\mathcal{G}}$ on $C(\overline{\mathcal{A}})$ by isometries.
• $g \mapsto \beta_g$ is a representation of $\overline{\mathcal{G}}$ on $L_2(\overline{\mathcal{A}}, \mu_0)$ by unitaries.

Generalized gauge transforms also act on the $G$-valued functions labelling the quasi-surfaces.

Definition 3.28. Let $g$ be a generalized gauge transform. Then we set:
• $g(d) := g \cdot d \cdot g^{-1}$ for every function $d : M \to G$.

Again, directly from the definitions, we get

Proposition 3.38. Let $g : M \to G$ be a generalized gauge transform, $S$ a quasi-surface, $\sigma$ an intersection function and $d : M \to G$ a function. Then we have
$$w^{S,\sigma}_{g(d)} = \beta_g(w^{S,\sigma}_d) \equiv \beta_g \circ w^{S,\sigma}_d \circ \beta_g^{-1}.$$
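The algebra behind Proposition 3.35 and the relation $\beta_{g_1 \circ g_2} = \beta_{g_1} \circ \beta_{g_2}$ can be checked on toy data. The following is a minimal sketch under stated assumptions: integer $2 \times 2$ matrices of determinant one stand in for $G$ (the check is purely algebraic, so compactness plays no role), and the three points, two paths and all matrix entries are hypothetical.

```python
def mul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(x):
    """Exact inverse of an integer matrix of determinant one."""
    (a, b), (c, d) = x
    assert a * d - b * c == 1
    return ((d, -b), (-c, a))

# Hypothetical toy data: two composable paths and a "connection" A given
# directly by its parallel transports h_A along them.
paths = {"e1": ("p", "q"), "e2": ("q", "r")}  # path -> (start, end)
h_A = {"e1": ((1, 2), (0, 1)), "e2": ((1, 0), (3, 1))}

# Two generalized gauge transforms: arbitrary G-valued functions on M.
g1 = {"p": ((2, 1), (1, 1)), "q": ((1, 1), (0, 1)), "r": ((1, 0), (1, 1))}
g2 = {"p": ((1, 0), (2, 1)), "q": ((0, -1), (1, 0)), "r": ((3, 2), (1, 1))}

def act(h, g):
    """h_{A o g}(gamma) = g(gamma(0))^{-1} h_A(gamma) g(gamma(1))."""
    return {
        name: mul(mul(inv(g[start]), h[name]), g[end])
        for name, (start, end) in paths.items()
    }

def pointwise(ga, gb):
    """Group structure of the gauge transforms: pointwise multiplication."""
    return {x: mul(ga[x], gb[x]) for x in ga}

# Acting with g1 and then g2 equals acting with the pointwise product g1 g2.
successive = act(act(h_A, g1), g2)
combined = act(h_A, pointwise(g1, g2))

# The transformed transports still multiply along the concatenation e1 e2.
g12 = pointwise(g1, g2)
concat_after = mul(combined["e1"], combined["e2"])
expected_concat = mul(mul(inv(g12["p"]), mul(h_A["e1"], h_A["e2"])), g12["r"])
```

Exact integer arithmetic is used so that both identities can be compared with `==` rather than up to rounding.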
3.8. Bundle automorphisms. Up to now, we have widely ignored the bundle structure of the gauge theory. Without a real need, we tacitly assumed to deal with a trivialized bundle, as we focused on the manifold $M$ and the structure group $G$ only. Of course, it made the notations simpler and can, moreover, be justified a posteriori: $\overline{\mathcal{A}}$ contains the $C^p$ connections of any $G$-principal bundle over $M$, independently of the bundle we started from. Similarly, $\overline{\mathcal{G}}$ contains all $C^p$ gauge transforms in any such bundle. But, conceptually, it is much more desirable to include the full bundle structure. Then we would also like to include the full group of bundle automorphisms. Note, here, that given any bundle automorphism $\theta : P \to P$ of the $G$-bundle $P$ over $M$, we may extract from it a diffeomorphism $\varphi_\theta : M \to M$ via
$$\varphi_\theta \circ \mathrm{pr}_M = \mathrm{pr}_M \circ \theta,$$
12 Starting from Sect. 4, we will usually drop the word “generalized” for simplicity.
where pr M denotes the canonical projection pr M : P −→ M. Moreover, the (smooth) gauge transforms correspond to vertical automorphisms; these are the bundle automorphisms with ϕθ = id M . Nevertheless, the full information on any (possibly stratified) C p bundle automorphism can be encoded in a (again, possibly stratified) C p diffeomorphism and a generalized gauge transform (even of any other bundle). The only danger arising from taking all the generalized gauge transforms of Subsect. 3.7 is to take too many gauge transforms. However, observe that, at least for the piecewise analytic category, the set of gauge orbits A/G is densely embedded into A/G and no two piecewise analytic connections fall into the same equivalence class by moding out the group of gauge transforms [18]. Finally, as it will turn out, the diffeomorphisms and the gauge transforms will play different rôles in the following proofs. Therefore, to make the basic ideas clearer and to sometimes allow for relaxed assumptions in the assertions, we will refrain from considering the fully automorphism invariant treatment of the Weyl algebra. Thus, w.l.o.g., we may pragmatically consider the bundle-automorphism invariance given by implementing both diffeomorphism and gauge invariance. The translation into the fully invariant language has to be left to the interested reader. 4. Weyl Algebra of Quantum Geometry 4.1. Structure data. In what follows, we are going to apply the above definitions and results to quantum geometry. Usually, this means to use piecewise analytic paths γ and oriented hypersurfaces S in M, whereas the intersection functions encode whether γ intersects S transversally or not and how its direction is related to the orientation of S. Moreover, (piecewise) analytic diffeomorphisms act on these objects. However, is it obvious that we should consider precisely these ingredients? Before we discuss this question, let us collect these assumptions to avoid cumbersome notation. Definition 4.1. 
The structure data of the theory under consideration contain:
• a manifold $M$;
• a Lie group $G$;
• a smoothness class used for the definition of the set $\mathcal{P}$ of paths in $M$;
• a subset $\mathcal{S}$ of the set of quasi-surfaces in $M$;
• for each $S \in \mathcal{S}$ a subset $\Sigma(S)$ of the set of intersection functions for $S$;
• for each $S \in \mathcal{S}$ a subset $\Delta(S)$ of the set of functions from $M$ to $G$;
• a subset $\mathcal{E}$ of the set $\overline{\mathcal{G}}$ of gauge transforms acting covariantly on $\Delta$;
• a subset $\mathcal{D}$ of the set of graphomorphisms on $M$ that leave $\mathcal{S}$ invariant and act covariantly on $\Sigma$ and $\Delta$.
Indeed, at first glance, there seems to be an enormous freedom in choosing structure data of a theory. However, there are several antagonists in the game. For instance, if we would enlarge P, we might have to reduce S, simply because we have to guarantee that there are at most finitely many (genuine) intersections of paths and quasi-surfaces. In fact, this practically excludes the choice of the smooth category for the paths: There are even analytic submanifolds having an infinite number of isolated transversal intersections with smooth paths. Therefore, we are – from the mathematical, technical point of view – quite forced to admit at most (piecewise) analytic paths. This however reduces the number of graphomorphisms in ϕ. Namely, they have to map analytic paths to (piecewise)
analytic ones. This would lead directly into conflicts, if general smooth diffeomorphisms were allowed. They have to be “analyticity preserving” – at least for one-dimensional submanifolds. There are indeed classes of homeomorphisms having this property: At first, of course, analytic diffeomorphisms fulfill this requirement. However, this will not be sufficient for two reasons: On the one hand, analyticity usually implies high nonlocality – a feature not desired in gravity for physical reasons. On the other hand, in the sequel, the proofs will, in general, crucially depend on the locality for technical reasons as we will see later. Thus, some sort of piecewise analytic diffeomorphisms are to be admitted. In a natural way, this leads to stratified diffeomorphisms, because they map semianalytic sets (disjoint unions of analytic submanifolds forming stratifications) into semianalytic sets. Next, we have to take care of the intersection functions. Given some oriented submanifold, say, a hypersurface, we would like to use this orientation to define such a function. However, this might lead to problems again: Using piecewise analytic diffeomorphisms, it may happen that a surface (including its orientation) is kept invariant, but an originally transversally intersecting path may now be mapped to a tangential one.13 This would contradict the concept that the intersection function encodes the transversality properties of a surface and its orientation, i.e., is assigned naturally and uniquely to an oriented surface. Of course, in contrast to the previous arguments, this rather is a conceptual demand and not a technical one. Moreover, it can be overcome using a slightly more special kind of piecewise analytic diffeomorphisms, as we will see later. Third, the selection of functions is to be discussed. 
Since we have argued that mostly analytic (or piecewise analytic) objects are to be used, we could restrict ourselves again to (piecewise) analytic functions (at least for the restrictions to the respective surface). However, although this is possible, we may consider more general classes. In particular, after decomposing a surface into several submanifolds, we may admit functions that are analytic only on these submanifolds, but do not satisfy any continuity condition at their “boundaries”. In fact, assume, e.g., that we are given a 2-surface $S$ and divide it by a line $S_0$ into two pieces $S_1$ and $S_2$ plus $S_0$ (like the interior of a circle is divided by a diameter). We now want to label $S$ on each $S_i$ by some analytic function $d_i$. We may take the Weyl operator $w_0$ for $S$ and $d_0$, then $w_{j0}$ for $S_j$ with $(d_0)^{-1}$, and, finally, $w_j$ for $S_j$ and $d_j$ ($j = 1, 2$). Now, $w_0 \circ w_{10} \circ w_{20} \circ w_1 \circ w_2$ is the Weyl operator for $S$ with a function whose restriction on each $S_i$ is $d_i$. We should remark that this way one may even define submanifolds with codimension 2 or larger to be (quasi-)surfaces. This, however, brings back the problem that the intersection function is not necessarily given directly by the orientation of the submanifold itself: the transversality between paths and such lower-dimensional submanifolds would, in general, be destroyed already by analytic diffeomorphisms. Thus, one should restrict oneself to hypersurfaces (or at least semianalytic sets of pure codimension 1) and control lower-dimensional surfaces by including labellings of hypersurfaces with functions $d$ that are nontrivial only on these “sub”-surfaces. Or, equivalently, one may give lower-dimensional surfaces orientations that are induced by hypersurfaces containing them. We will exploit this idea. Anyway, after all, it does not seem necessary to impose very strong smoothness restrictions on $\Delta(S)$ from the conceptual point of view.
Nevertheless, as we will see, there will be some technical difficulties that lead to restrictions.
13 Let $M$ be $\mathbb{R}^2$ and divide $M$ by the two lines $x = \pm 1$ into three open parts and the two lines. Now define $\varphi$ on the open strip between these two lines by $\varphi(x, y) := (x, y + \sqrt{1 - x^2})$ and let $\varphi$ be the identity otherwise. Of course, $\varphi$ is continuous everywhere and an analytic diffeomorphism on each of these five parts. Nevertheless, the path $\gamma$ with $\gamma(t) = (t, 0)$ is transversal w.r.t. $x = 1$, but $\varphi(\gamma)$ is tangent to it.
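The example of footnote 13 can be probed numerically. The following is a minimal sketch, assuming the map on the open strip is read as $\varphi(x, y) = (x, y + \sqrt{1 - x^2})$ (with this reading the image of $\gamma$ is a half circle, which is what makes it tangent to $x = 1$); the function names and the step size are illustrative choices.

```python
import math

def phi(x: float, y: float) -> tuple[float, float]:
    """Piecewise analytic map of footnote 13, with the square root as read here."""
    if -1.0 < x < 1.0:
        return (x, y + math.sqrt(1.0 - x * x))
    return (x, y)

def image_tangent(t: float, h: float = 1e-7) -> tuple[float, float]:
    """Unit tangent of phi(gamma) at parameter t for gamma(t) = (t, 0),
    estimated by a one-sided difference quotient taken inside the strip."""
    x0, y0 = phi(t - h, 0.0)
    x1, y1 = phi(t, 0.0)
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm)

# gamma itself crosses the line x = 1 with tangent (1, 0), i.e. transversally.
tangent_gamma = (1.0, 0.0)

# Just before reaching x = 1, the image path runs almost parallel to the line.
tangent_image = image_tangent(1.0 - 1e-6)
print(tangent_gamma, tangent_image)
```

The first component of `tangent_image` measures the part of the direction normal to the line $x = 1$; it shrinks to zero as the parameter approaches 1, while it stays equal to 1 for $\gamma$ itself.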
To summarize, in what follows we will always assume to work with “nice” structure data having the following minimal properties:

Definition 4.2. The structure data are called nice iff
• $M$ is an at least two-dimensional analytic manifold;
• $G$ is a nontrivial, connected compact Lie group;
• $\mathcal{P}$ consists of all piecewise analytic paths in $M$;
• $\mathcal{S}$ contains at most the stratified analytic sets in $M$;
• $\Sigma(S)$ contains at least the natural$^{14}$ intersection functions of $S$;
• $\Delta(S)$ contains at least the constant functions on $M$;
• $\mathcal{E}$ contains at least the trivial gauge transform;
• $\mathcal{D}$ contains at most the stratified analytic diffeomorphisms in $M$.
The requirements regarding regularity will be discussed in Subsect. 4.3. The precise definitions of stratified objects will be given in Sect. 6. Note, that whether we consider closed manifolds only or include open ones, is not decided here. The remaining “finetuning” will be made if needed.
4.2. Weyl algebra. Assume we are working with some arbitrary, but fixed “consistent” structure data. We define
$$\mathcal{W} := \{ w^{S,\sigma_S}_d \}_{S \in \mathcal{S},\; \sigma_S \in \Sigma(S),\; d \in \Delta(S)}$$
and set W :=
ϕ∈D
{αϕ }
and W :=
g∈E
{β g } .
Definition 4.3. The C ∗ -subalgebra A := A(W, µ0 ) of B(L 2 (A, µ0 )), generated by C(A) and W, is called Weyl algebra of quantum geometry. Definition 4.4. • ADiff := A(W∪W , µ0 ) denotes the C ∗ -subalgebra of B(L 2 (A, µ0 )) generated by A and W . • AAuto := A(W ∪ W ∪ W , µ0 ) denotes the C ∗ -subalgebra of B(L 2 (A, µ0 )) generated by A, W and W . Definition 4.5. Let π be a representation of ADiff on some Hilbert space H. 14 In contrast to Definition 3.16, we consider an intersection function on S with codim S ≥ 2 to be natural M iff it is induced by an embedded hypersurface S that is contained in S, not just in M. Moreover, one can directly extend the definition of natural intersection functions to stratified sets, e.g., using triangulations. However, since, at the end, we are interested mostly in the orientation of genuine submanifolds (possibly with boundary) only, we do not consider this issue in this paper in detail. Thus, at the moment, the statement “(S) contains at least the natural intersection functions of S” only refers to such submanifolds S.
• $\psi \in \mathcal{H}$ is called diffeomorphism invariant (w.r.t. $\pi$) iff $\pi(\alpha_\varphi)\psi = \psi$ for all $\varphi \in \mathcal{D}$.
• $\pi$ is called diffeomorphism invariant iff it has a diffeomorphism invariant vector.
Often we write “$\mathcal{D}$-invariant” instead of “diffeomorphism invariant”. Analogously, we speak about $\mathcal{D}$-natural representations meaning W -natural representations.

Definition 4.6. Let $\pi$ be a representation of $\mathfrak{A}_{\mathrm{Auto}}$ on some Hilbert space $\mathcal{H}$.
• $\psi \in \mathcal{H}$ is called automorphism invariant (w.r.t. $\pi$) iff $\pi|_{\mathfrak{A}_{\mathrm{Diff}}}$ is diffeomorphism invariant and $\pi(\beta_g)\psi = \psi$ for all $g \in \mathcal{E}$.
• $\pi$ is called automorphism invariant iff it has an automorphism invariant vector.
Usually we write “$\mathcal{D}$-$\mathcal{E}$-invariant” instead of “automorphism invariant”.

Definition 4.7. $\pi_0$ denotes the fundamental (i.e., identical) representation of $\mathfrak{A}$ on $L_2(\overline{\mathcal{A}}, \mu_0)$ (and, analogously, that of $\mathfrak{A}_{\mathrm{Diff}}$ and $\mathfrak{A}_{\mathrm{Auto}}$, respectively).

Since $1 \in L_2(\overline{\mathcal{A}}, \mu_0)$ is already cyclic for $C(X) \subseteq \mathfrak{A}$, and $\alpha_\varphi(1)$ equals 1 for all $\varphi \in \mathcal{D}$ as well as $\beta_g(1)$ does for all $g \in \mathcal{E}$, we have

Proposition 4.1. 1 is a cyclic, diffeomorphism and automorphism invariant vector for $\pi_0$.

The irreducibility of $\pi_0$ will be proven separately in Sect. 5.

4.3. Regularity. One of our goals in this paper is a uniqueness proof for certain representations of $\mathfrak{A}$. However, we will only be able to do this for certain regularity conditions. It is now reasonable to presuppose as little of them as possible. In other words, $\mathcal{R}$, which encodes the one-parameter subgroups to be mapped to weakly continuous ones, should be chosen as small as possible. As we will see, it will be sufficient to include all $t \mapsto w_t = w^{S,\sigma_S}_{d(t)}$ with $d(t) := e^{td} \in \Delta(S)$ for constant $d : M \to \mathfrak{g}$. Of course, more regularity, hence larger $\mathcal{R}$, will not reduce uniqueness, but may even lead to the case that there is no such regular representation at all. Therefore, we are faced with some maximality conditions as well. First of all, we may at most allow for those one-parameter subgroups that map to the Weyl operators given by the structure data.
Typically such restrictions are induced by the functions d at our disposal. For instance, let G, M and S be not simply connected, allow (S) to contain continuous functions only, and let d : M −→ G have nontrivial mapping degree. Then, in general, it is not possible to deform d in (S) continuously into the trivial function on G. This shows that it need not be possible to connect any Weyl operator to the identity within the limits of the structure data. Of course, using non-continuous d, it is always possible: Choose at every point x S in M some d(x) ∈ g with ed(x) = d(x) and define wt := w ES,σ for all t. But, moreover, d (t) even if we might find for each t some allowed d(t) with d(t1 + t2 ) = d(t1 )d(t2 ), the corresponding maps t −→ (d(t))(x) need not be continuous at all. The reason behind this is that the functional equation f (x + y) = f (x)+ f (y) has non-continuous, “cloudy” solutions. Then the corresponding one-parameter subgroups of Weyl operators are no longer weakly continuous, as one immediately checks. Therefore, we should restrict ourselves indeed to the functions generated by the Lie algebra functions. We summarize these considerations in
Definition 4.8. Let $\mathcal{S}$ contain some quasi-surfaces in $M$ and, for each $S \in \mathcal{S}$, let $\Sigma(S)$ contain intersection functions for $S$ and $\Delta(S)$ contain functions from $M$ to $G$. A set $\mathcal{R}$ of one-parameter subgroups in the set of Weyl operators is called full-consistent with $\mathcal{S}$, $\{\Sigma(S)\}$ and $\{\Delta(S)\}$ iff for every element $t \mapsto w_t$ in $\mathcal{R}$ there is some function $d : M \to \mathfrak{g}$ and some quasi-surface $S \in \mathcal{S}$ with intersection function $\sigma_S \in \Sigma(S)$, such that $d(t) := e^{td} \in \Delta(S)$ and $w_t = w^{S,\sigma_S}_{d(t)}$ for all $t$. $\mathcal{R}$ is called consistent with $\mathcal{S}$, $\{\Sigma(S)\}$ and $\{\Delta(S)\}$ iff $\mathcal{R}$ equals $\mathcal{R}_0$ for some $\mathcal{R}_0$ being full-consistent with $\mathcal{S}$, $\{\Sigma(S)\}$ and $\{\Delta(S)\}$.

After all, we enlarge the structure data above by some subset $\mathcal{R}$ of the set of one-parameter subgroups in $\mathcal{W}$.

Definition 4.9. The enlarged structure data are called nice iff the structure data are nice and
• $\mathcal{R}$ contains at most the one-parameter subgroups of Weyl operators consistent with $\mathcal{S}$, $\{\Sigma(S)\}$, $\{\Delta(S)\}$ and at least those consistent with $\mathcal{S}$, $\{\Sigma(S)\}$ and the constant functions.

Using Corollary 3.29 and Proposition 3.28, we have for nice enlarged structure data

Proposition 4.2.
1. $\pi_0$ is regular w.r.t. $\mathcal{R}$.
2. $\pi_0$ is $\Lambda$-regular with $\Lambda$ given in Proposition 3.28. In particular, $\pi_0$ is $\Lambda_{p,S,\sigma_S}$-regular for all $p \in \mathbb{N} \cup \{\infty, \omega\}$, $S \in \mathcal{S}$ and $\sigma_S \in \Sigma(S)$.

5. Irreducibility

In this section we are going to prove the irreducibility of $\mathfrak{A}$ for nice structure data. [15] Additionally, we assume that $\mathcal{S}$ contains at least the closed, oriented hypersurfaces of $M$. Since we do not need diffeomorphisms, there will be no restrictions for $\mathcal{D}$. Note that given the irreducibility of the Weyl algebra of quantum geometry for these structure data, we get it immediately for all larger structure data. In fact, since the Weyl algebra cannot shrink if the structure data get larger, the commutant of the Weyl algebra cannot get larger in this case. Since, however, we will see it is already trivial for the assumptions above, the enlarged Weyl algebra is again irreducible.

5.1. Nice intersections.
In this subsection, properties of intersections between graphs and surfaces, together with their implications for certain scalar products, are studied.

Definition 5.1. Let $\gamma$ be an edge and let $\boldsymbol{\gamma}$ be a (possibly trivial) graph. A surface $S$ is called $(\gamma, \boldsymbol{\gamma})$-nice iff
1. $S$ is naturally oriented;
2. $S$ and (the image of) $\boldsymbol{\gamma}$ are disjoint; and
3. $\gamma$ intersects $S$ in precisely one interior point $x$ of $\gamma$ transversally, such that the orientation of $S$ coincides with the direction of $\gamma$.
In this case, $x$ is called puncture of $S$ and $(\gamma, \boldsymbol{\gamma})$.

Lemma 5.1. Let $\gamma$ be an edge and $\boldsymbol{\gamma}$ be a (possibly empty) graph, such that $\gamma$ and (the edges in) $\boldsymbol{\gamma}$ intersect at most at their end points. Then for every interior point $x$ of $\gamma$, there is a $(\gamma, \boldsymbol{\gamma})$-nice hypersurface $S$ with corresponding puncture $x$.
Note that it does not matter whether we restrict ourselves to the case of closed surfaces or to that of open ones.

Proof. If we admit open surfaces $S$, then the assertion is trivial, since we may always find some neighbourhood of $x$ disjoint from $\boldsymbol{\gamma}$, where $\gamma$ is a straight line. Take for $S$ some sufficiently small hyperplane “orthogonal” to $\gamma$ that contains $x$. Let us, therefore, consider the case of closed surfaces. Roughly speaking, the problem here is that if $\gamma$ “enters” $S$ at some point, it has to “leave” it somewhere else. Thus, we have to ensure that this intersection is transversal at only one point. For that purpose, we consider some (real) analytic curve $c$ in $\mathbb{R}^2$ that has an inflection point, such that the corresponding tangent $t$ intersects $c$ in precisely one other point $y$ transversally. Such curves exist; take, e.g., an appropriate Cassini curve [39]. As in the case of open surfaces, consider now some neighbourhood of $x$ isomorphic to $\mathbb{R}^n \supseteq \mathbb{R}^2$ and disjoint from $\boldsymbol{\gamma}$, such that $x$ is mapped to $y$ and such that (the image of) $\gamma$ coincides with $t$ in some sufficiently large neighbourhood of $y$. Let now $S$ be the rotational surface given by $c$ and, e.g., the $x^1$-axis in $\mathbb{R}^2 \subseteq \mathbb{R}^n$. By construction, $S$ has the required properties. (If the direction of $\gamma$ and the orientation of $S$ at the puncture do not coincide, simply mirror $S$ at the hyperplane “orthogonal” to $\gamma$.)
Lemma 5.2. Let $\gamma$ be an edge and let $\boldsymbol{\gamma}$ be some (possibly trivial) graph, such that $\gamma$ and the edges in $\boldsymbol{\gamma}$ intersect each other at most at their end points. Moreover, let $S$, $S_1$ and $S_2$ be $(\gamma, \boldsymbol{\gamma})$-nice surfaces, such that the corresponding punctures are different. Finally, let $T$ be a gauge-variant spin network function of the form $T = (T_{\phi,\gamma})^m_n \otimes T'$ with $T' \in M_{\boldsymbol{\gamma}}$. Then we have
$$\langle w^{S_1}_{g_1}(T), w^{S_2}_{g_2}(T) \rangle = \frac{\chi_\phi(g_1^2)\, \chi_\phi(g_2^2)}{(\dim \phi)^2}$$
for all $g_1, g_2 \in G$. Moreover, if $\phi$ is abelian$^{15}$, we have $w^S_g(T) = \phi(g^2)\, T = \chi_\phi(g^2)\, T$ for all $g \in G$.

Here, $w^S_g$ is a shorter notation for $w^{S,\sigma_S}_{d_g}$ with $\sigma_S$ given by the natural orientation of $S$ and with $d_g$ being the function on $M$ constantly equal to $g \in G$.

Proof. First of all, note that $w^S_g(T) = w^S_g((T_{\phi,\gamma})^m_n) \otimes T'$ for all $g \in G$ and for all $(\gamma, \boldsymbol{\gamma})$-nice $S$. Assume now that $t_1 < t_2$, where $\gamma(t_j)$ is the intersection point of $S_j$ and $\gamma$. Decompose $\gamma$ into the three segments $\gamma_1$, $\gamma_0$ and $\gamma_2$ according to the parameter intervals $[0, t_1]$, $[t_1, t_2]$ and $[t_2, 1]$, respectively. Then we have
$$(T_{\phi,\gamma})^m_n = (T_{\phi,\gamma_1 \gamma_0 \gamma_2})^m_n = \frac{1}{\dim \phi}\, (T_{\phi,\gamma_1})^m_p \otimes (T_{\phi,\gamma_0})^p_q \otimes (T_{\phi,\gamma_2})^q_n.$$
15 Recall that a representation is called abelian (or linear) iff its character $\chi_\phi : G \to \mathbb{C}$ is multiplicative, i.e., $\chi_\phi(g_1)\chi_\phi(g_2) = \chi_\phi(g_1 g_2)$ for all $g_1, g_2 \in G$. An irreducible abelian representation of a connected compact group is necessarily one-dimensional, i.e., $\phi(g) = \chi_\phi(g)\,\mathbf{1}$ with $|\chi_\phi(g)| = 1$ for all $g \in G$. Moreover, every compact connected $G$ equals $(G_{\mathrm{ss}} \times G_{\mathrm{ab}})/N$ for some semisimple $G_{\mathrm{ss}}$, some torus $G_{\mathrm{ab}}$ and some discrete $N$ being central in $G_{\mathrm{ss}} \times G_{\mathrm{ab}}$. Hence, for every irreducible representation $\phi$ of $G$ there are irreducible representations $\phi_{\mathrm{ss}}$ and $\phi_{\mathrm{ab}}$ of $G_{\mathrm{ss}}$ and $G_{\mathrm{ab}}$, respectively, such that $\phi \circ \pi = \phi_{\mathrm{ss}} \otimes \phi_{\mathrm{ab}}$ with $\pi : G_{\mathrm{ss}} \times G_{\mathrm{ab}} \to G$ being the canonical projection. Then $\phi$ is abelian iff $\phi_{\mathrm{ss}}$ is trivial.
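Consequently,
$$\begin{aligned}
w^{S_1}_{g_1}((T_{\phi,\gamma})^m_n)
&= \frac{1}{\dim \phi}\, (T_{\phi,\gamma_1})^m_{r_1}\, \phi(g_1)^{r_1}_p \otimes \phi(g_1)^p_{s_1}\, (T_{\phi,\gamma_0})^{s_1}_q \otimes (T_{\phi,\gamma_2})^q_n \\
&= \frac{1}{\dim \phi}\, \phi(g_1^2)^{r_1}_{s_1}\, (T_{\phi,\gamma_1})^m_{r_1} \otimes (T_{\phi,\gamma_0})^{s_1}_q \otimes (T_{\phi,\gamma_2})^q_n
\end{aligned}$$
and, analogously,
$$w^{S_2}_{g_2}((T_{\phi,\gamma})^m_n) = \frac{1}{\dim \phi}\, \phi(g_2^2)^{r_2}_{s_2}\, (T_{\phi,\gamma_1})^m_p \otimes (T_{\phi,\gamma_0})^p_{r_2} \otimes (T_{\phi,\gamma_2})^{s_2}_n.$$
Since $\gamma_1$, $\gamma_0$, $\gamma_2$ and $\boldsymbol{\gamma}$ are independent, we get
$$\begin{aligned}
\langle w^{S_1}_{g_1} T, w^{S_2}_{g_2} T \rangle
&= \langle w^{S_1}_{g_1}((T_{\phi,\gamma})^m_n), w^{S_2}_{g_2}((T_{\phi,\gamma})^m_n) \rangle \cdot \langle T', T' \rangle \\
&= \frac{1}{(\dim \phi)^2}\, \phi(g_1^2)^{r_1}_{s_1}\, \phi(g_2^2)^{r_2}_{s_2} \cdot \langle (T_{\phi,\gamma_1})^m_{r_1}, (T_{\phi,\gamma_1})^m_p \rangle\, \langle (T_{\phi,\gamma_0})^{s_1}_q, (T_{\phi,\gamma_0})^p_{r_2} \rangle\, \langle (T_{\phi,\gamma_2})^q_n, (T_{\phi,\gamma_2})^{s_2}_n \rangle \\
&= \frac{1}{(\dim \phi)^2}\, \phi(g_1^2)^{r_1}_{s_1}\, \phi(g_2^2)^{r_2}_{s_2}\, \delta_{r_1 p}\, \delta^{s_1 p}\, \delta_{q r_2}\, \delta^{q s_2} \\
&= \frac{1}{(\dim \phi)^2}\, \operatorname{tr} \phi(g_1^2)\, \operatorname{tr} \phi(g_2^2).
\end{aligned}$$
If $t_1 > t_2$, the calculation is completely analogous. The assertion $w^S_g(T) = \phi(g^2)\, T$ for abelian $\phi$ follows directly from the definition of $w^S_g$. Recall that every abelian representation is one-dimensional and maps $G$ to $U(1)\,\mathbf{1}$.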
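The last step of the computation above is a pure index contraction: the four Kronecker deltas force $r_1 = s_1$ and $r_2 = s_2$, which turns the two matrix factors into traces. A minimal sketch that checks this by brute force follows; the integer matrices `A` and `B` are hypothetical stand-ins for $\phi(g_1^2)$ and $\phi(g_2^2)$ and are not meant to be group elements.

```python
from itertools import product

def contract(A, B):
    """Sum A[r1][s1] * B[r2][s2] * delta(r1,p) delta(s1,p) delta(q,r2) delta(q,s2)
    over all six indices, following the delta pattern of the proof."""
    n = len(A)

    def delta(i, j):
        return 1 if i == j else 0

    total = 0
    for r1, s1, r2, s2, p, q in product(range(n), repeat=6):
        total += (A[r1][s1] * B[r2][s2]
                  * delta(r1, p) * delta(s1, p)
                  * delta(q, r2) * delta(q, s2))
    return total

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

# Arbitrary integer test matrices, so the comparison is exact.
A = [[2, -1, 0], [5, 3, 1], [0, 4, -7]]
B = [[1, 2, 3], [0, -4, 6], [8, 0, 9]]

print(contract(A, B), trace(A) * trace(B))
```

The prefactor $1/(\dim \phi)^2$ is left out, since it is unaffected by the contraction.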
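5.2. Irreducibility proof.

Theorem 5.3. The Weyl algebra $\mathfrak{A}$ of quantum geometry is irreducible on $L_2(\overline{\mathcal{A}}, \mu_0)$.

Before proving the theorem, we set $L_\infty := L_\infty(\overline{\mathcal{A}}, \mu_0)$ and $L_2 := L_2(\overline{\mathcal{A}}, \mu_0)$.

Proof. We are now going to prove the irreducibility of $\mathfrak{A}$ by verifying that the commutant $\mathfrak{A}'$ of $\mathfrak{A}$ consists of scalars only [12]. Since $C(\overline{\mathcal{A}}) \subseteq \mathfrak{A}$, we have $\mathfrak{A}' \subseteq C(\overline{\mathcal{A}})' = L_\infty$ for the commutants [36]. Next, one checks immediately that $w(f)\, w(\psi) = w(f \psi)$ for all $w \in \mathcal{W}$, $f \in L_\infty$ and $\psi \in L_2$. In other words, $w(f) \circ w = w \circ f$ in $B(L_2)$. Let now $f \in \mathfrak{A}' \subseteq L_\infty$. Then we have $f \circ w = w \circ f = w(f) \circ w$ for all $w \in \mathcal{W}$, hence $w(f) = f$ in $L_\infty \subseteq L_2$ by invertibility of $w$. Consider additionally some nontrivial gauge-variant spin network function $T$. It can be written as $T = (T_{\phi,\gamma})^m_n \otimes T'$ with nontrivial $\phi$, where $T' \in M_{\boldsymbol{\gamma}}$ is a (possibly trivial) spin network function, such that $\gamma$ and the edges in $\boldsymbol{\gamma}$ intersect at most at their end points. By $w(f) = f$ and $w^* \in \mathcal{W}$ for all $w \in \mathcal{W}$, we have $\langle T, f \rangle = \langle T, w^*(f) \rangle = \langle w(T), f \rangle$ and, therefore, $\langle w(T), f \rangle = \langle T, f \rangle = \langle w'(T), f \rangle$ for all $w, w' \in \mathcal{W}$.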
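1. Let $\phi$ be abelian. Choose some $(\gamma, \boldsymbol{\gamma})$-nice surface $S$ by Lemma 5.1. Then we have $w^S_g(T) = \phi(g^2)\, T$ for all $g \in G$, by Lemma 5.2. Consequently, $\langle T, f \rangle = \langle w^S_g(T), f \rangle = \phi(g^2)\, \langle T, f \rangle$. Since $\phi$ is nontrivial, there is some $g \in G$ with $\phi(g^2) \neq 1$. Hence, $\langle T, f \rangle = 0$.
2. Let $\phi$ be nonabelian. Since $G$ is compact and connected, there is a square root for each element of $G$. Moreover, by [20], each nonabelian irreducible character has a zero. Hence, there is a $g \in G$ with $\chi_\phi(g^2) = 0$. Choose now, by Lemma 5.1, infinitely many $(\gamma, \boldsymbol{\gamma})$-nice surfaces $S_i$, whose punctures with $\gamma$ are mutually different. Then, by Lemma 5.2, we have
$$\langle w^{S_i}_g(T), w^{S_j}_g(T) \rangle = \frac{\chi_\phi(g^2)\, \chi_\phi(g^2)}{(\dim \phi)^2} = 0$$
for $i \neq j$, due to the choice of $g$. Since $w^{S_i}_g$ is unitary, $\{w^{S_i}_g(T)\}$ is an orthonormal system. Using $\langle w^{S_i}_g(T), f \rangle = \langle w^{S_j}_g(T), f \rangle$ for all $i, j$, this implies $\langle w^{S_i}_g(T), f \rangle = 0$ (by Bessel's inequality, infinitely many equal coefficients of $f \in L_2$ w.r.t. an orthonormal system have to vanish) and thus $\langle T, f \rangle = 0$.

Altogether, we have proven $\langle T, f \rangle = 0$ for all nontrivial gauge-variant spin network functions $T$. Therefore, $f \in \mathbb{C}\,1$, hence $\mathfrak{A}' = \mathbb{C}\,1$.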
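Step 2 of the proof relies on an element $g$ with $\chi_\phi(g^2) = 0$. For orientation, this can be made explicit in the special case $G = SU(2)$, which is an assumption of this sketch and not a restriction made in the text. There the spin-$j$ character at an element with eigenvalues $e^{\pm i\theta}$ is $\sin((2j+1)\theta)/\sin\theta$, and the function names below are illustrative.

```python
import math

def su2_character(two_j: int, theta: float) -> float:
    """Character of the spin-j representation of SU(2), j = two_j / 2, at an
    element with eigenvalues exp(+-i*theta): the sum of exp(i*k*theta) over
    the weights k = -two_j, -two_j + 2, ..., two_j (which is real)."""
    return sum(math.cos(k * theta) for k in range(-two_j, two_j + 1, 2))

def zero_angle(two_j: int) -> float:
    """An angle at which the spin-j character vanishes (two_j >= 1)."""
    return math.pi / (two_j + 1)

def square_root_angle(theta: float) -> float:
    """diag(e^{ia}, e^{-ia}) squares to diag(e^{2ia}, e^{-2ia}), so halving
    the angle yields a square root inside the maximal torus."""
    return theta / 2.0

for two_j in range(1, 7):
    theta = zero_angle(two_j)        # angle of g^2
    half = square_root_angle(theta)  # angle of g itself
    print(two_j, su2_character(two_j, 2.0 * half))
```

For $j = 1/2$ this yields $g = \mathrm{diag}(e^{i\pi/4}, e^{-i\pi/4})$, whose square $\mathrm{diag}(i, -i)$ has vanishing trace.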
Corollary 5.4. $\mathfrak{A}_{\mathrm{Diff}}$ and $\mathfrak{A}_{\mathrm{Auto}}$ are irreducible.

6. Stratified Diffeomorphisms

As we have mentioned in Sect. 4 and we will see in the proofs, analytic graphomorphisms will not always be sufficient for studying representations of $\mathfrak{A}$. A natural extension is stratified analytic isomorphisms. The theory of stratifications we will use here is motivated by [21]. The first definition will be quoted almost literally; that of stratified maps, however, is slightly sharpened. Although we will later apply the whole framework to the analytic category, we assume at this point only that we have fixed some smoothness category $C^p$ with $p \in \mathbb{N}$ or $p = \infty$ or $p = \omega$. Let $M$ and $N$ be $C^p$ manifolds.

Definition 6.1. Let $A$ be some subset in $M$.
• A stratification $\mathcal{M}$ of $M$ is a locally finite, disjoint decomposition of $M$ into connected embedded $C^p$ submanifolds $M_i$ of $M$ (the so-called strata), such that
$$M_i \cap \partial M_j \neq \emptyset \implies M_i \subseteq \partial M_j \ \text{ and } \ \dim M_i < \dim M_j$$
for all $M_i, M_j \in \mathcal{M}$.
• A stratification $\mathcal{M}$ of $M$ is called stratification of $A$ in $M$ iff $A$ is the union of certain elements in $\mathcal{M}$.
• $A$ is a stratified set (w.r.t. $M$) iff there is a stratification of $A$ in $M$.
Definition 6.2. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two stratifications of some subset of $A$. Then $\mathcal{M}_1$ is called finer than $\mathcal{M}_2$ iff each stratum in $\mathcal{M}_2$ is a union of strata in $\mathcal{M}_1$.

Definition 6.3. A map $f : M \to N$ is called
• stratified map iff $f$ is continuous and there are stratifications $\mathcal{M}$ and $\mathcal{N}$ of $M$ and $N$, respectively, such that for every $M_i \in \mathcal{M}$ there is an open $U_i \subseteq M$ and a $C^p$ differentiable map $f_i : M_i \subseteq U_i \to N$ with
$$\overline{M_i} \subseteq U_i, \qquad f_i|_{M_i} = f|_{M_i}, \qquad f_i(M_i) \in \mathcal{N}, \qquad \operatorname{rank} f|_{M_i} = \dim f(M_i);$$
• stratified monomorphism iff, additionally, $f|_{M_i}$ is injective;
• stratified isomorphism$^{16}$ iff, additionally, $f$ is a homeomorphism and each $f_i : U_i \to f_i(U_i)$ is a $C^p$ diffeomorphism.
If we drop the above conditions that $U_i$ is open and that $\overline{M_i}$ is contained in $U_i$, we speak about weakly stratified maps.

Definition 6.4. A stratified map $f : M \to M$ is called localized iff $f$ is the identity outside some compact subset of $M$.

Definition 6.5. Two stratified sets $S_1$ and $S_2$ in $M$ are called (weakly) strata equivalent iff there is a product of localized (weakly) stratified isomorphisms mapping $S_1$ onto $S_2$. They are called oriented-strata equivalent iff there is such a product mapping additionally the orientation of $S_1$ to that of $S_2$.
6.1. Localized Stratified Diffeomorphisms in Linear Spaces. In the sections below, we will have to study the local transformation behaviour of geometric objects in manifolds. To get prepared for this, we will now investigate first the corresponding problems in linear spaces. In particular, we will be able to rotate, scale and translate these objects locally, i.e., by transformations that are the identity outside some bounded region. This guarantees that we may lift the corresponding operations to manifolds.

We recall that a $q$-simplex $S$ in $\mathbb{R}^k$ with $q \leq k$ is the closed convex hull of $q + 1$ points in general position. The corresponding interior of $S$ is called open simplex. Moreover, the (open) faces of $S$ are the (open) simplices spanned by subsets of these $q + 1$ points. Additionally, we denote by $B^q_r(x)$, or shortly $B_r(x)$, some closed $q$-dimensional ball in $\mathbb{R}^k$ with radius $r$ around $x$. If $x$ is the origin, we simply write $B_r$. We remark that, in this subsection, nice orientations of some simplex or ball $S$ will always mean an orientation induced by that of some hyperplane (i.e., not by some more general hypersurface as for natural orientations) containing $S$. This implies, e.g., that the nice orientation of a $q$-simplex $S$ is always induced by some $(k - 1)$-simplex having $S$ as one of its faces. Finally, let us remark that in most of the statements of this subsection we will use 0 as a base point. It should be clear that all these statements hold analogously if 0 is replaced by any point in $\mathbb{R}^k$.
16 Sometimes we will use “stratified diffeomorphism” synonymously.
Representations of the Weyl Algebra in Quantum Geometry
6.1.1. Strata equivalence of star-shaped regions.

Lemma 6.1. Let k be a positive integer and let U be an open subset of $\mathbb{R}^k$ not containing 0. Next, let $a, b, p : U \to \mathbb{R}$ be $C^p$-functions, such that $a$, $p$ and $pa + b$ are all positive on U. Moreover, for every $\lambda > 0$, let $p(\lambda x) = \lambda p(x)$, $a(\lambda x) = a(x)$, $b(\lambda x) = b(x)$, whenever both $\lambda x$ and $x$ are contained in U. Finally define $\rho, \rho_{\mathrm{inv}} : U \to \mathbb{R}$ by

$$\rho := a + \frac{b}{p} \qquad\text{and}\qquad \rho_{\mathrm{inv}} := \frac{1}{a}\Bigl(1 - \frac{b}{p}\Bigr).$$

Then $\tilde\rho : U \to \mathbb{R}^k$ defined by $\tilde\rho(x) := \rho(x)\,x$ is a $C^p$ diffeomorphism between U and $\tilde\rho(U)$ and maps (subintervals of) each half-ray $\mathbb{R}_+ x$ into (subintervals of) the same half-ray. Moreover, its inverse is given by $\tilde\rho^{-1}(x) = \rho_{\mathrm{inv}}(x)\,x$.

Proof. $\tilde\rho$ is indeed $C^p$, since p never vanishes. Since $\tilde\rho$ at a single x is just a positive scalar multiplication, it maps (subintervals of) each half-ray $\mathbb{R}_+ x$ into (subintervals of) the same half-ray. Moreover, $\tilde\rho$ is injective and the image of $\tilde\rho$ is an open subset of $\mathbb{R}^k$. Finally, one checks immediately that $\tilde\rho^{-1}$ is $C^p$ and that it is the inverse of $\tilde\rho$ by $pa + b > 0$.
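The inverse formula of Lemma 6.1 can be checked numerically. The following is only a minimal sketch under assumptions that are not in the text: p is taken to be the Euclidean norm on $\mathbb{R}^3 \setminus \{0\}$, and a, b are the hypothetical constants `A`, `B` (constant functions are trivially invariant under rescaling of x).

```python
import math

# Hypothetical sample data (not from the paper): a and b are constants,
# p is the Euclidean norm, so p*a + b > 0 holds everywhere away from 0.
A, B = 2.0, 0.5

def p(x):
    """Euclidean norm: positive and homogeneous of degree one."""
    return math.sqrt(sum(c * c for c in x))

def rho_tilde(x):
    """The map x -> rho(x) x with rho = a + b/p."""
    scale = A + B / p(x)
    return tuple(scale * c for c in x)

def rho_tilde_inverse(y):
    """The map y -> rho_inv(y) y with rho_inv = (1/a)(1 - b/p)."""
    scale = (1.0 - B / p(y)) / A
    return tuple(scale * c for c in y)

x = (0.3, -1.2, 0.7)
y = rho_tilde(x)
round_trip = rho_tilde_inverse(y)
# Largest componentwise deviation after going there and back.
deviation = max(abs(u - v) for u, v in zip(x, round_trip))
```

The key identity behind the round trip is $p(\tilde\rho(x)) = a\,p(x) + b$, which is what allows p(x) to be recovered from the image point.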
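Lemma 6.2. Let k be a positive integer. Let $S_0$ and $S_1$ be the boundaries of two bounded open regions $R_0$ and $R_1$ in $\mathbb{R}^k$ both containing 0. Assume, moreover, that each $R_i$ is star-shaped, that the corresponding Minkowski functional $p_i$ for $R_i$ is $C^p$ and that each $S_i$ is an embedded $C^p$ submanifold of $\mathbb{R}^k$. Then, for all real $\lambda_\pm$ and $\lambda_{0,\pm}$ with

$$0 < \lambda_- < \inf_{\mathbb{R}^k\setminus\{0\}} \frac{p_1}{p_0} \le \lambda_{0,-} \qquad\text{and}\qquad \lambda_{0,+} \le \sup_{\mathbb{R}^k\setminus\{0\}} \frac{p_1}{p_0} < \lambda_+ ,$$

there are $C^p$ mappings $\tilde\rho_+$ and $\tilde\rho_-$ with the following properties:
1. $\tilde\rho_\pm$ is a $C^p$ diffeomorphism from some open neighbourhood of $V_\pm$ onto some neighbourhood of $W_\pm$. Here,
$V_- = \{x \in \mathbb{R}^k \mid \lambda_- \le p_1(x) \text{ and } p_0(x) \le 1\}$,
$V_+ = \{x \in \mathbb{R}^k \mid 1 \le p_0(x) \text{ and } p_1(x) \le \lambda_+\}$,
$W_- = \{x \in \mathbb{R}^k \mid \lambda_- \le p_1(x) \le \lambda_{0,-}\}$,
$W_+ = \{x \in \mathbb{R}^k \mid \lambda_{0,+} \le p_1(x) \le \lambda_+\}$
are compact sets with nonempty interior.
2. $\tilde\rho_\pm$ maps $S_0$ to $\lambda_{0,\pm} S_1$;
3. $\tilde\rho_+$ and $\tilde\rho_-$ coincide on $S_0$ if $\lambda_{0,-} = \lambda_{0,+}$;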
Ch. Fleischhack
4. $\tilde\rho_\pm$ is the identity on $\lambda_\pm S_1$;
5. $\tilde\rho_\pm$ maps subintervals of half-rays to subintervals of the same half-ray.
6. The restrictions of $\tilde\rho_\pm$ to (an appropriate open subset of) any linear subspace of $\mathbb{R}^k$ are diffeomorphisms into that linear subspace.

Corollary 6.3. Given the assumptions of Lemma 6.2, there is a stratified $C^p$ diffeomorphism ϕ mapping $S_0$ to $\lambda_0 S_1$ and $R_0$ to $\lambda_0 R_1$ for some $\lambda_- \le \lambda_0 \le \lambda_+$, such that ϕ is the identity inside $\lambda_- R_1$ and outside $\lambda_+ R_1$. Moreover, ϕ can be chosen such that it preserves half-rays and its restrictions to linear subspaces of $\mathbb{R}^k$ are stratified $C^p$ diffeomorphisms again.

Proof. Simply define ϕ to equal $\tilde\rho_\pm$ on $V_\pm$ and to be the identity otherwise. Since these mappings coincide on the corresponding overlaps $\lambda_- S_1$, $S_0$ and $\lambda_+ S_1$, we get the assertion.
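Note that $\lambda_\pm$ depends only on the relative shape of $S_0$ and $S_1$. In particular, $\lambda_\pm$ need not be changed if both $S_0$ and $S_1$ are scaled by the same factor.

Proof of Lemma 6.2. Denote $R_\pm := \lambda_\pm R_1$ and, correspondingly, $S_\pm := \partial R_\pm \equiv \lambda_\pm S_1$. By choice of $\lambda_\pm$, we have $\overline{R_-} \subseteq R_0 \subseteq \overline{R_0} \subseteq R_+$. Furthermore, let us define $q := p_1/p_0$ on $V := \mathbb{R}^k\setminus\{0\}$ and let

$$a_\pm := \frac{\lambda_\pm - \lambda_{0,\pm}}{\lambda_\pm - q} \qquad\text{and}\qquad b_\pm := \lambda_\pm\,\frac{\lambda_{0,\pm} - q}{\lambda_\pm - q}$$

define functions $a_\pm, b_\pm : V \to \mathbb{R}$. Of course, $a_\pm$ is positive. Since Minkowski functionals are semilinear17, we see immediately that q, and so $a_\pm$ and $b_\pm$ as well, are constant on each half-ray $\mathbb{R}_+ x$. Observe that $a_\pm$ and $b_\pm$ are well defined by choice of $\lambda_\pm$ and $\lambda_{0,\pm}$. Finally, we define $\tilde\rho_\pm(x) := \rho_\pm(x)\,x$ on V by

$$\rho_\pm := a_\pm + \frac{b_\pm}{p_1} = \frac{\lambda_\pm (p_1 + \lambda_{0,\pm} - q) - \lambda_{0,\pm}\, p_1}{p_1 (\lambda_\pm - q)} .$$

We have

$$\begin{aligned}
(q - \lambda_\pm)(p_1 a_\pm + b_\pm) &= p_1(\lambda_{0,\pm} - \lambda_\pm) + \lambda_\pm (q - \lambda_{0,\pm}) \\
&= p_1\bigl(\lambda_\pm (\tfrac{1}{p_0} - 1) + \lambda_{0,\pm}\bigr) - \lambda_\pm \lambda_{0,\pm} \\
&= (p_1 - \lambda_\pm)\lambda_{0,\pm} + p_1 \lambda_\pm (\tfrac{1}{p_0} - 1).
\end{aligned}$$

Let us check the properties of $\tilde\rho_\pm$:
• $\tilde\rho_\pm$ is obviously a $C^p$ function mapping subintervals of half-rays to subintervals of the same half-ray.
• Let $x \in S_0$, i.e., $p_0(x) = 1$, hence $q(x) = p_1(x)$. Then $p_1(\rho_\pm(x)\,x) = \rho_\pm(x)\,p_1(x) = \lambda_{0,\pm}$, i.e., $\tilde\rho_\pm(x) \in \lambda_{0,\pm} S_1$. In particular, $\tilde\rho_-(x) = \tilde\rho_+(x)$ if $\lambda_{0,-} = \lambda_{0,+}$.
• Let $x \in \lambda_\pm S_1$, i.e., $p_1(x) = \lambda_\pm$. Then $\rho_\pm(x) = 1$, hence $\tilde\rho_\pm(x) = x$.

17 This means $p(\lambda x) = \lambda p(x)$ for all $\lambda > 0$.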
• Let $x \in V_-$, i.e., $p_0(x) \le 1$ and $p_1(x) \ge \lambda_-$. From the lines above, we see that this implies $(p_1 a_- + b_-)(x) \ge 0$, the equality holding iff $p_0(x) = 1$ and $p_1(x) = \lambda_-$. This, however, is impossible, since $q(x)$ would then equal $\lambda_- < \inf_V q$. Therefore, $p_1 a_- + b_- > 0$ on $V_-$. Since, by construction, $V_-$ is compact, there is some open neighbourhood of $V_-$ where $p_1 a_- + b_- > 0$. Lemma 6.1 now shows that $\tilde\rho_-$ is a $C^p$ diffeomorphism on that neighbourhood. By the previous items we see that $\tilde\rho_-(V_-) = W_-$.
• The corresponding properties of $\tilde\rho_+$ are proven completely analogously.
• By intersecting $R_i$ and $S_i$ with linear subspaces we get $C^p$ boundaries with $C^p$ Minkowski functionals again. The remaining statements are now clear.
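To make the two boundary properties of the outer map concrete, here is a small numerical sketch. Everything specific in it is an assumption chosen for illustration: $R_0$ is the unit disc in $\mathbb{R}^2$, $R_1$ is the ellipse with semi-axes 2 and 1 (so $p_1/p_0$ ranges over $[1/2, 1]$), and the hypothetical parameters are $\lambda_{0,+} = 0.8$ and $\lambda_+ = 1.5$.

```python
import math

# Hypothetical parameters with lambda_0 <= sup(p1/p0) = 1 < lambda_plus.
LAM_PLUS = 1.5
LAM_0 = 0.8

def p0(x):
    """Minkowski functional of the unit disc."""
    return math.sqrt(x[0] ** 2 + x[1] ** 2)

def p1(x):
    """Minkowski functional of the ellipse with semi-axes 2 and 1."""
    return math.sqrt((x[0] / 2.0) ** 2 + x[1] ** 2)

def rho_tilde_plus(x):
    """x -> (a_+ + b_+/p1(x)) x, with a_+, b_+ as in the proof of Lemma 6.2."""
    q = p1(x) / p0(x)
    a = (LAM_PLUS - LAM_0) / (LAM_PLUS - q)
    b = LAM_PLUS * (LAM_0 - q) / (LAM_PLUS - q)
    scale = a + b / p1(x)
    return (scale * x[0], scale * x[1])

angles = [0.1 + 0.7 * n for n in range(9)]
# Points of S_0 (unit circle) and of lambda_+ S_1 (scaled ellipse).
on_s0 = [(math.cos(t), math.sin(t)) for t in angles]
on_outer = [(LAM_PLUS * 2.0 * math.cos(t), LAM_PLUS * math.sin(t)) for t in angles]

# p1-values of the images of S_0: all should be close to LAM_0.
image_levels = [p1(rho_tilde_plus(x)) for x in on_s0]
# Displacement of points on lambda_+ S_1: all should be close to 0.
outer_shift = [math.dist(x, rho_tilde_plus(x)) for x in on_outer]
```

In this example `image_levels` stays at $\lambda_{0,+}$ and `outer_shift` vanishes up to rounding, which mirrors properties 2 and 4 of Lemma 6.2 for $\tilde\rho_+$.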
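6.1.2. Scaling. To study geometric objects in charts, it may be necessary to first shrink them to have enough “space”. The following lemma guarantees that this is (almost) always possible using stratified diffeomorphisms.

Lemma 6.4. Let k be a positive integer and let R be a bounded star-shaped open region in $\mathbb{R}^k$ containing 0 and having a $C^p$ differentiable Minkowski functional p. Moreover, assume that the boundary S of R is an embedded $C^p$ submanifold of $\mathbb{R}^k$. Then for all $\lambda > 0$ and all $\varepsilon > 0$, there is a stratified $C^p$ isomorphism ϕ preserving half-rays, such that $\varphi = \lambda\,\mathrm{id}$ on R and $\varphi = \mathrm{id}$ outside $(1 + \varepsilon)\max(\lambda, 1)R$.

Proof.
• Assume first $\lambda \ge 1$. Choosing $\lambda_{0,+} := \sqrt{\lambda}$ and $\lambda_+ := (1 + \varepsilon)\sqrt{\lambda}$, we may apply Lemma 6.2 to $R_0 := R$ and $R_1 := \sqrt{\lambda}\,R$ with $p = p_0 = \sqrt{\lambda}\,p_1$. For this, define

$$\varphi(x) := \begin{cases} \lambda\,\mathrm{id} & \text{on } p^{-1}[0, 1], \\ \tilde\rho_+ & \text{on } p^{-1}[1, (1 + \varepsilon)\lambda], \\ \mathrm{id} & \text{on } p^{-1}[(1 + \varepsilon)\lambda, \infty). \end{cases}$$

Now, let $x \in S$, i.e., $p(x) = 1$. Then, by construction, $p_1(\tilde\rho_+(x)) = \lambda_{0,+} = \sqrt{\lambda}$, hence $p_0(\tilde\rho_+(x)) = \lambda = p_0(\lambda x)$. For $x \in (1 + \varepsilon)\lambda S$, we have $p(x) = (1 + \varepsilon)\lambda$, hence $p_1(x) = (1 + \varepsilon)\sqrt{\lambda} = \lambda_+$. By definition, we get $\tilde\rho_+(x) = x$. Altogether, $\tilde\rho_+$ equals $\lambda\,\mathrm{id}$ on $S = \partial R$ and $\mathrm{id}$ on $(1 + \varepsilon)\lambda S = (1 + \varepsilon)\lambda\,\partial R$. Therefore, ϕ is a stratified diffeomorphism having the desired properties.
• Assume now $\lambda \le 1$. Define $\lambda_+ := \sqrt{1 + \varepsilon}$ and $\lambda_{0,+} := \lambda(\sqrt{1 + \varepsilon})^{-1}$, and apply Lemma 6.2 to $R_0 := R$ and $R_1 := \sqrt{1 + \varepsilon}\,R$ with $p = p_0 = \sqrt{1 + \varepsilon}\,p_1$: Define

$$\varphi(x) := \begin{cases} \lambda\,\mathrm{id} & \text{on } p^{-1}[0, 1], \\ \tilde\rho_+ & \text{on } p^{-1}[1, 1 + \varepsilon], \\ \mathrm{id} & \text{on } p^{-1}[1 + \varepsilon, \infty). \end{cases}$$

For $x \in S$, i.e., $p(x) = 1$, we have $p_1(\tilde\rho_+(x)) = \lambda_{0,+} = \lambda(\sqrt{1 + \varepsilon})^{-1}$. Therefore, we get $p_0(\tilde\rho_+(x)) = \lambda = p_0(\lambda x)$. If, on the other hand, $x \in (1 + \varepsilon)S$, we have $p(x) = 1 + \varepsilon$ and $p_1(x) = \sqrt{1 + \varepsilon} = \lambda_+$, implying $\tilde\rho_+(x) = x$. Consequently, $\tilde\rho_+$ equals $\lambda\,\mathrm{id}$ on $S = \partial R$ and $\mathrm{id}$ on $(1 + \varepsilon)S = (1 + \varepsilon)\,\partial R$. Now, ϕ is a stratified diffeomorphism having the desired properties.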
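The case $\lambda \ge 1$ of the scaling lemma can be sketched numerically. The sketch below assumes, purely for illustration, that R is the open unit ball (so p is the Euclidean norm); the function name `scaling_map` and the sample values are hypothetical.

```python
import math

def scaling_map(x, lam, eps):
    """Localized scaling for lam >= 1, assuming R is the open unit ball.

    Inside the unit ball the map is multiplication by lam, outside the ball
    of radius (1 + eps) * lam it is the identity, and in between it is the
    radial interpolation rho_+ = a_+ + b_+/p1 from Lemma 6.2.
    """
    p = math.sqrt(sum(c * c for c in x))
    if p <= 1.0:
        scale = lam
    elif p >= (1.0 + eps) * lam:
        scale = 1.0
    else:
        root = math.sqrt(lam)
        lam_0, lam_plus = root, (1.0 + eps) * root
        q = 1.0 / root          # q = p1/p0 is constant because R1 = sqrt(lam) R0
        p1 = p / root
        a = (lam_plus - lam_0) / (lam_plus - q)
        b = lam_plus * (lam_0 - q) / (lam_plus - q)
        scale = a + b / p1
    return tuple(scale * c for c in x)

# Sample values: lam = 4, eps = 0.5, so the transition shell is 1 <= |x| <= 6.
inner = scaling_map((0.5, 0.0), 4.0, 0.5)
shell = scaling_map((3.0, 0.0), 4.0, 0.5)
outer = scaling_map((7.0, 0.0), 4.0, 0.5)
```

For these sample values the radial profile in the shell is affine, $|x| \mapsto 0.4\,|x| + 3.6$, which matches $\lambda\,|x|$ at $|x| = 1$ and $|x|$ at $|x| = 6$, so the three pieces fit together continuously.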
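6.1.3. Rotation.

Lemma 6.5. Let k be a positive integer and let $r_1 > r_2 > 0$ be real. Let $X \in so(k)$, define $A := e^X \in SO(k)$ and denote the orthogonal projection from $\mathbb{R}^k$ to $(\ker X)^\perp$ by P. Then there is a stratified diffeomorphism ϕ of $\mathbb{R}^k$, such that
• ϕ coincides with A on $B_{r_2}$;
• ϕ is the identity outside of $B_{r_1}$;
• ϕ is norm preserving;
• ϕ is homotopic to the identity;
• $P\varphi = P$.

Proof. We stratify $\mathbb{R}^k$ into $\mathrm{int}\,B_{r_2} \cup \partial B_{r_2} \cup (\mathrm{int}\,B_{r_1} \setminus B_{r_2}) \cup \partial B_{r_1} \cup (\mathbb{R}^k \setminus B_{r_1})$ and define three auxiliary $C^p$ functions $a_i : \mathbb{R} \to \mathbb{R}$ with

$$a_{12}(r) := 1, \qquad a_{234}(r) := \frac{r_1 - r}{r_1 - r_2} \qquad\text{and}\qquad a_{45}(r) := 0.$$

One now immediately checks that

$$\varphi(x) := \begin{cases} e^{a_{12}(\|x\|)X}\,x & \text{if } x \in \mathrm{int}\,B_{r_2} \cup \partial B_{r_2}, \\ e^{a_{234}(\|x\|)X}\,x & \text{if } x \in \partial B_{r_2} \cup (\mathrm{int}\,B_{r_1} \setminus B_{r_2}) \cup \partial B_{r_1}, \\ e^{a_{45}(\|x\|)X}\,x & \text{if } x \in \partial B_{r_1} \cup (\mathbb{R}^k \setminus B_{r_1}) \end{cases}$$

gives the desired map.18 For the homotopy property define $\varphi_t$ as above with $tX$ instead of $X$. Then $\varphi_1 = \varphi$ and $\varphi_0 = \mathrm{id}$.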
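The construction in the proof can be sketched in the plane. The example assumes k = 2 and takes X to be the generator of counterclockwise rotations scaled by an angle `theta`; the function names `cutoff` and `local_rotation` are hypothetical and only serve this illustration.

```python
import math

def cutoff(r, r1, r2):
    """The auxiliary profile: 1 up to r2, linear in between, 0 from r1 on."""
    if r <= r2:
        return 1.0
    if r >= r1:
        return 0.0
    return (r1 - r) / (r1 - r2)

def local_rotation(x, theta, r1, r2):
    """Rotate x by cutoff(|x|) * theta; assumes k = 2 and r1 > r2 > 0."""
    r = math.hypot(x[0], x[1])
    angle = cutoff(r, r1, r2) * theta
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

theta, r1, r2 = math.pi / 2, 2.0, 1.0
inside = local_rotation((0.5, 0.0), theta, r1, r2)   # full rotation by theta
between = local_rotation((1.5, 0.0), theta, r1, r2)  # rotation by theta / 2
outside = local_rotation((3.0, 0.0), theta, r1, r2)  # untouched
```

Because every point is only rotated about the origin, the map preserves norms, and replacing `theta` by `t * theta` for t in [0, 1] gives the homotopy to the identity used in the proof.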
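Immediately from the proof, we get with the above assumptions:

Corollary 6.6. Let k be a nonnegative integer and let $\varepsilon > 0$. Moreover, let $\gamma_\alpha$ be the straight line in $\mathbb{R}^2$ connecting $(-\cos\alpha, -\sin\alpha)$ and $(\cos\alpha, \sin\alpha)$. Then for each $\alpha \in \mathbb{R}$ there is a stratified isomorphism $\varphi_\alpha$ of $\mathbb{R}^2 \oplus \mathbb{R}^k$, such that
• $\varphi_\alpha$ is the identity outside $B_{1+\varepsilon} \subseteq \mathbb{R}^{k+2}$;
• $\varphi_\alpha$ is norm preserving;
• $\varphi_\alpha$ is homotopic19 to the identity;
• $P\varphi_\alpha = P$, where P is the canonical projection from $\mathbb{R}^2 \oplus \mathbb{R}^k$ to $\mathbb{R}^k$;
• $\varphi_\alpha$ maps $\gamma_0$ to $\gamma_\alpha$.

Proof. Choose

$$X = \alpha \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in so(2) \subseteq so(2) \times so(k) \subseteq so(2 + k).$$

18 Moreover, note that the three functions used to define ϕ are defined on full $\mathbb{R}^k$ (possibly, up to the origin).
19 The mapping is given by $t \mapsto \varphi_{t\alpha}$.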
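As a worked check of the generator chosen in the proof of Corollary 6.6, the exponential of the 2×2 block can be computed in closed form. The symbol $J$ is introduced here only for this illustration and is not notation from the text.

```latex
J := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad
J^2 = -\mathbb{1}
\quad\Longrightarrow\quad
e^{\alpha J}
= \sum_{n \ge 0} \frac{\alpha^n J^n}{n!}
= \cos\alpha \,\mathbb{1} + \sin\alpha \, J
= \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}.
```

So $e^{X}$ acts on the $\mathbb{R}^2$ factor as a planar rotation through an angle of magnitude $|\alpha|$ (the sense of rotation depends on the orientation convention) and trivially on the $\mathbb{R}^k$ factor, which is why the projection to $\mathbb{R}^k$ is unaffected.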
6.1.4. Translation.

Lemma 6.7. Let k be some positive integer. Let γ be some edge in $\mathbb{R}^k$ and U be some neighbourhood of γ. Choose $r > 0$, such that the balls with radius r centered at $\gamma(0)$ and $\gamma(1)$, respectively, are contained in U. Then there is a finite product of stratified $C^p$ diffeomorphisms of $\mathbb{R}^k$ being the identity outside U and the translation by $\gamma(1) - \gamma(0)$ on $B_r(\gamma(0))$.

Proof. We only give the idea of the proof. The technical details are similar to those for the preceding statements. Moreover, in Lemma C.1 we will give a proof for a more specific type of translation. Here, we cover γ by (non-trivial) balls. By compactness, there is some $r'$, such that finitely many balls with radius $r'$ centered at points of γ cover γ and such that the convex hull of “neighbouring” balls is contained in U. The idea now is to first shrink $B_r(\gamma(0))$ to $B_{r'}(\gamma(0))$, then move this ball parallelly through the convex hulls of neighbouring balls and finally blow it up to its original size. All these operations are possible by the statements above without moving any point outside U.
6.1.5. Strata equivalence of simplices and balls. Let us now show that all q-simplices are not only isomorphic as simplices themselves, but can also be mapped into each other by localized stratified $C^p$ diffeomorphisms. Moreover, they are equivalent to q-dimensional balls.

Proposition 6.8. Let q ≤ k be two positive integers. Then all q-simplices and all q-dimensional balls in $\mathbb{R}^k$ are strata equivalent.

For this, we first show that each q-simplex can be mapped to a q-dimensional ball.

Lemma 6.9. Let q ≤ k be two positive integers. Moreover, let $V := \{v_0, \ldots, v_q\} \subseteq \mathbb{R}^k$ contain q + 1 points in general position, such that 0 is contained in the interior of the q-simplex $R_V$ spanned by V. Finally, fix some $\varepsilon > 0$ and some $r > 0$, such that $R_V$ is contained completely in the interior of $B_r$. Then there is a stratified $C^p$ diffeomorphism ϕ, being the identity outside of $B_{(1+\varepsilon)r}$, such that $R_V$ is mapped to $B_r \cap \mathrm{span}_{\mathbb{R}} V$.

Proof. Choose some set $V' = \{v'_0, \ldots, v'_{k-q}\} \subseteq \mathbb{R}^k$ of k − q + 1 points in general position, such that its span is complementary to that of V and such that the (k − q)-simplex spanned by $V'$ contains 0 in its interior and is contained in $\mathrm{int}\,B_r$. Define for every 0 ≤ i ≤ q and 0 ≤ j ≤ k − q the set

$$V_{ij} := \{0\} \cup (V \setminus \{v_i\}) \cup (V' \setminus \{v'_j\}),$$

now containing k + 1 points in general position, hence each defining a k-simplex $R_{ij}$. These simplices form a complex, i.e., in particular, they share at most lower-dimensional faces. Let $R_0$ be the union of all these (k − q + 1)(q + 1) simplices. Its boundary is the union of the simplices $V^0_{ij}$ spanned by $V_{ij} \setminus \{0\}$.

Let us now invoke Corollary 6.3. First of all, observe that the statement there can be extended directly to the case that $R_0$ is formed by a finite number of cones each having tip at 0 and each defined by k-simplices, such that these cones fill $\mathbb{R}^k$ completely and share at most the boundaries with each other. Of course, the requirements for $S_0$ have to be relaxed accordingly. We refrained from explicitly giving this form of
Corollary 6.3 (and Lemma 6.2, respectively), since it would have made the proof even more technical without introducing new ideas. One simply has to construct the stratified diffeomorphism in the more general case for every cone (more precisely, some open appropriate neighbourhood of it) and then use that these mappings fit together at the boundaries. This however, follows from the coincidence of the Minkowski functionals at these boundaries, the construction of the maps in the proofs above and the invariance of half-rays. Coming back to the present proof, define R1 to be Br . Then, by R0 ⊆ Br , the corresponding Minkowski functionals fulfill p1 ≤ p0 , and we may choose λ+ = 1 + ε > 1. This means that, by Corollary 6.3, there is a stratified C p diffeomorphism ϕ being the identity outside λ+ Br , mapping R0 to R1 and ∂ R0 to ∂ R1 . Now the assertion follows, since ϕ preserves linear subspaces. Therefore, R V (being the intersection of R0 with spanR V ) is mapped to Br ∩ spanR V being a q-dimensional ball.
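The role of the Minkowski functionals in this proof can be illustrated with the simplest possible radial map. The sketch below is not the localized diffeomorphism of Corollary 6.3: it is the plain global rescaling $x \mapsto (p_0(x)/p_1(x))\,x$ along half-rays, which already carries the boundary of a polytope onto a sphere. The triangle `TRIANGLE` and the radius are hypothetical sample data in $\mathbb{R}^2$.

```python
import math

# Hypothetical triangle with the origin in its interior (counterclockwise order
# is not required, since only ratios of normal components are used).
TRIANGLE = [(1.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]

def p_triangle(x):
    """Minkowski functional of the triangle: equals 1 exactly on its boundary."""
    values = []
    n = len(TRIANGLE)
    for i in range(n):
        va, vb = TRIANGLE[i], TRIANGLE[(i + 1) % n]
        normal = (vb[1] - va[1], -(vb[0] - va[0]))      # normal of edge va-vb
        offset = normal[0] * va[0] + normal[1] * va[1]  # nonzero: 0 is interior
        values.append((normal[0] * x[0] + normal[1] * x[1]) / offset)
    return max(values)

def p_ball(x, r):
    """Minkowski functional of the ball of radius r."""
    return math.hypot(x[0], x[1]) / r

def radial_map(x, r):
    """Rescale x along its half-ray so that triangle levels become ball levels."""
    if x == (0.0, 0.0):
        return x
    scale = p_triangle(x) / p_ball(x, r)
    return (scale * x[0], scale * x[1])

radius = 2.0
boundary_points = TRIANGLE + [(0.0, 0.5), (-1.0, 0.25)]
image_norms = [math.hypot(*radial_map(x, radius)) for x in boundary_points]
```

Every entry of `image_norms` equals the chosen radius up to rounding, reflecting that the map sends the level set $p_0 = 1$ to the level set $p_1 = 1$ while preserving half-rays; the lemma additionally cuts this off so that nothing moves outside $B_{(1+\varepsilon)r}$.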
Proof of Proposition 6.8. Let two q-simplices be given. Using Lemma 6.7, translate both, such that they contain 0 in their interior. Then each of them is strata equivalent to some q-dimensional ball in $\mathbb{R}^k$, by Lemma 6.9. Shrinking these balls, if necessary, we make them of identical radius. Finally, by Lemma 6.5, we may find some localized stratified $C^p$ diffeomorphism rotating one ball into the other. Hence, these two q-simplices are strata equivalent to a (hence, any) q-dimensional ball.
Now we are going to mirror simplices and balls into each other.

Proposition 6.10. Let q < k be two non-negative integers. Then every q-simplex and every q-dimensional ball in $\mathbb{R}^k$ having a nice orientation is strata equivalent to itself having inverse orientation.

Proof. First assume that q = k − 1 and consider some q-dimensional ball B around the origin. Choose $X \in so(k)$, such that X is zero on some (k − 2)-dimensional linear subspace V of $\mathrm{span}_{\mathbb{R}} B$ and generates a rotation in the two-dimensional complement in $\mathbb{R}^k$ spanned by the normal of V in $\mathrm{span}_{\mathbb{R}} B$ and the normal of $\mathrm{span}_{\mathbb{R}} B$ in $\mathbb{R}^k$. In particular, it generates some map $A := e^{tX} \in SO(k)$, being minus the identity on this two-dimensional space for some t. Since only one of its “dimensions” belongs to B, the rotation A inverts the orientation of B. Now, Lemma 6.5 guarantees the existence of some stratified diffeomorphism inverting the orientation of B.

To prove the statement for q = k − 1 and a given q-simplex S, we map it to some q-dimensional ball B, invert its orientation and take the inverse of the first mapping to get S back. Of course, the orientation of S has been flipped.

Next, let q be arbitrary and consider a q-simplex S. Since we work with nice orientations only, there is some (k − 1)-simplex $S'$ in M having S as one of its faces and inducing its orientation. Since we may invert the orientation of $S'$, we also may invert that of S by localized stratified diffeomorphisms.

To prove the remaining case of q-balls for arbitrary q, reuse the argumentation above for q = k − 1 and reduce to the case of q-simplices.
Corollary 6.11. Let q < k be two non-negative integers. Then all q-simplices and all q-balls in Rk are oriented-strata equivalent, provided they have nice orientations. Proof. Assume, first of all, that S is a q-simplex or a q-ball containing the origin, and let S be given two nice orientations. This means there are linear subspaces T1 and T2
inducing these orientations by their own nice ones. There is now some A ∈ S O(k) leaving the q-plane spanned by S invariant and mapping T1 onto T2 . Hence A maps the one orientation of S to either the other one or the inverse of it. Hence, by Lemma 6.5, there is some localized stratified isomorphism mapping S onto itself and transforming the orientations by A. By adding, if necessary, some localized stratified isomorphism inverting the orientation as given by Proposition 6.10, we get such a transformation mapping the two orientations of S onto each other. Let now Si be a q-simplex or a q-ball for i = 1, 2. Then we may map them by localized stratified diffeomorphisms to some q-simplex S containing the origin. Since, as one checks immediately, these mappings can be chosen, such that the corresponding orientations of S are nice20 , there is a localized stratified diffeomorphism mapping one orientation of S to the other, by the arguments above.
Without explicitly stating the proof, we have by arguments as in the proposition above:

Corollary 6.12. For every nicely oriented 1-dimensional ball S in $\mathbb{R}^k$ with k ≥ 3, there are finitely many localized stratified isomorphisms, whose product is the identity on S, but inverts the orientation of S.

Finally, we are looking for objects that can be divided into two parts, such that the original one is, on the one hand, strata equivalent to both of them and, on the other hand, is the disjoint union of them. Moreover, the orientation should be preserved. For example, consider an open 2-simplex, i.e., a full open triangle. Intersecting it by a line through one corner and some point of the opposite edge, we get two triangles. If we take their interiors, then they are strata equivalent to the original triangle, however not a decomposition of it, since the border line is missing. On the other hand, if we were to add it to just one of the subtriangles, then they would no longer be strata equivalent. The solution of this problem is to consider at the beginning an open triangle plus one of its open edges. Then, as above, we may divide the triangle by a line, now through some boundary point of the added edge. Now it is clear that the triangle plus edge is divided into twice a triangle plus edge and all three objects are strata equivalent. The generalization to higher dimensions is straightforward, but more technical:

Proposition 6.13. Let q < k be two positive integers. Let S be some open q-simplex in $\mathbb{R}^k$, and let F be one of its open (q − 1)-faces. Finally, give $R := S \cup F$ the orientation induced by one of the nice orientations of $\overline{S} \supseteq R$. Then there are products $\varphi_0$ and $\varphi_1$ of localized stratified isomorphisms, such that R is the disjoint union of $\varphi_0 R$ and $\varphi_1 R$ and the intersection function of R is the joint intersection function of $\varphi_0 R$ and $\varphi_1 R$.

Proof.
First of all, choose some open (k − 1)-simplex $\widetilde{S}$, such that its orientation is induced by one of the nice orientations of its closure which, on the other hand, induces the orientations of S and R. Let V be the set of k points $\{v_0, \ldots, v_{k-1}\}$ in $\mathbb{R}^k$, such that
• $\widetilde{S}$ is the interior of the simplex spanned by $\{v_0, \ldots, v_{k-1}\}$,
• $\widetilde{F}$ is the interior of the simplex spanned by $\{v_1, \ldots, v_{k-1}\}$,
• $S$ is the interior of the simplex spanned by $\{v_0, \ldots, v_q\}$,
• $F$ is the interior of the simplex spanned by $\{v_1, \ldots, v_q\}$.

20 One sees that all necessary transformations are locally “affine”.
Define $\widetilde{R} := \widetilde{S} \cup \widetilde{F} \cup S \cup F$. Now choose some v in the open 1-face connecting $v_0$ and $v_1$, and cut $\widetilde{R}$ by the plane spanned by $\{v, v_2, \ldots, v_{k-1}\}$ into two parts $\widetilde{R}_0$ and $\widetilde{R}_1$, whereas the intersection of this plane with $\widetilde{R}$ is added to $\widetilde{R}_0$, and $\widetilde{R}_1$ is that “half” whose closure contains $v_1$. We now may decompose each $\widetilde{R}_i$ into $\widetilde{S}_i \cup \widetilde{F}_i \cup S_i \cup F_i$, where
• $\widetilde{S}_i$ is the interior of the simplex spanned by $\{v, v_i, v_2, \ldots, v_{k-1}\}$,
• $\widetilde{F}_i$ is the interior of the simplex spanned by $\{x_i, v_2, \ldots, v_{k-1}\}$,
• $S_i$ is the interior of the simplex spanned by $\{v, v_i, v_2, \ldots, v_q\}$,
• $F_i$ is the interior of the simplex spanned by $\{x_i, v_2, \ldots, v_q\}$
with $x_0 = v$ and $x_1 = v_1$. Obviously, $S \cup F = S_0 \cup F_0 \cup S_1 \cup F_1$. Let now $\varphi_i$ be products of localized stratified isomorphisms that leave $v_j$ with j ≥ 2 and $v_i$ invariant, map $v_{1-i}$ to v and map the simplex spanned by $\{v_0, \ldots, v_{k-1}\}$ onto that spanned by $\{\varphi_i(v_0), \varphi_i(v_1), v_2, \ldots, v_{k-1}\}$. It is easy to check that $\varphi_i(S \cup F) = S_i \cup F_i$ and that $\varphi_i$ may be chosen to have the desired orientation properties.
6.2. Localized stratified diffeomorphisms in manifolds. We are now going to transfer the results of the previous subsection to the case of general C p manifolds M. Definition 6.6. A subset S in M is called (nicely oriented) q-simplex in the chart (U, κ) iff S ⊆ U is mapped by κ to a q-simplex in Rdim M (and the orientation of S is induced by one of the natural orientations of some hyperplane in κ(U )). Analogously, we may define q-balls. The definition of faces of q-simplices should be clear as well. We will speak about q-simplices and q-balls in general iff there is a chart of M, in which they are q-simplices or q-balls. Note that, at least locally, every simplex or ball S having a natural orientation is nicely oriented, i.e., it is induced by some hypersurface being (an open set of) a hyperplane in some chart. In fact, let N be some embedded submanifold in M containing S as an embedded submanifold and inducing its orientation. Then we may find some chart mapping N locally into some hyperplane in the local chart image of M and mapping S locally into some plane in the local image of N . Proposition 6.14. The statements of Propositions 6.8, 6.10 and 6.13 as well as of Corollaries 6.11 and 6.12 remain valid if we replace Rk by M and assume all q-simplices and q-balls to be in one and the same connected chart and, moreover, nicely oriented there. Proof. The only point to be shown is the case that the localized stratified isomorphism ϕ needs more space in Rdim M than provided by the chart denoted by (U, κ). If this is the case, first shrink any occurring object S (being a ball or a simplex) to a sufficiently small size. Indeed, since simplices and balls are assumed to be closed and the chart is open, S – magnified (in the chart) by 1 + ε w.r.t. some interior point – is again in U for small ε. Therefore, the scaling lemma (Lemma 6.4) is applicable in order to shrink S by any factor λ ≤ 1. Now it may be necessary to move S to some other place in U inside this chart. 
To do this, we choose some path that S is moved along. By compactness and continuity reasons, there is a finite number of open k-balls in U covering this path. We now assume that λ is chosen small enough that the accordingly shrunk S can be transferred between any non-disjoint two of these balls by means of Lemma 6.7. This way it can be (parallelly) shifted between any two points in the chart. Using these ingredients of shrinking and shifting, it is now easy to generate the desired localized stratified isomorphisms by means of their counterparts in Rdim M .
6.3. Application to the analytic category. Let us now come back to the analytic case, i.e., p = ω. Recall [21] that a subset A of an analytic manifold M is called semianalytic iff M can be covered by some open sets $U_\iota$, such that each $U_\iota \cap A$ is a union of connected components of a set $f_1^{-1}(0) \setminus f_2^{-1}(0)$, for $f_1$ and $f_2$ belonging to some finite family of real-valued functions analytic in $U_\iota$. Complements, finite intersections and finite unions of semianalytic sets are semianalytic again [10]. Moreover, it can be shown [21,28] that every semianalytic set admits a semianalytic stratification, i.e., a stratification consisting of semianalytic strata only.21

Lemma 6.15. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two stratifications of M. Then there is a stratification $\mathcal{M}$ of M being finer than $\mathcal{M}_1$ and $\mathcal{M}_2$.

Proof. For every semianalytic, hence stratifiable set A in M and every nonnegative integer k, we choose some semianalytic stratification $\mathcal{N}(A)$ of A and let $\mathcal{N}_k(A)$ contain precisely the k-dimensional strata in $\mathcal{N}(A)$ contained in A. Moreover, let n be the dimension of M. Since the intersection of any two semianalytic sets is semianalytic, we may define

$$\mathcal{N}_{n,k} := \bigcup_{M_1 \in \mathcal{M}_1,\, M_2 \in \mathcal{M}_2} \mathcal{N}_k(M_1 \cap M_2), \qquad \ldots$$

… 0. Here, a is selected under the assumption that $y = 2a + 2\varepsilon_0$ and $y = -2\varepsilon_0$ are still hypersurfaces in the chart for some $\varepsilon_0 > 0$ (of course, only that part of the hypersurfaces whose x- and $\vec{z}$-values are admitted in the chosen cubic chart). Finally, choose some analytic $d : S \to G$ depending on z only, such that every element of G occurs somewhere (in a sufficiently close) neighbourhood of z = 0.25

Let now finitely many (mutually different) points $\tau_j \in I$ be given. The γ-images of these points will turn into the intersection points of the transformed γ with S. Fix, additionally, some small $\varepsilon < \varepsilon_0$, such that the distance of any two of the marked points in I is greater than 2ε (and that, if necessary, each of the ε-neighbourhoods of the $\tau_k$ are both in I and in the fixed chart).
Now, in the first step we move γ the following way inside the x-y-plane: On the one hand, each segment of γ outside the ε-neighbourhoods of the marked points is again parallel to the x-axis; now, however, alternately with y = 0 and y = 2a. The ε-neighbourhoods, on the other hand, are the straight lines connecting these alternatingly lifted and unlifted segments. This way, the centers of these neighbourhoods, i.e., the marked points themselves, are mapped to half-way between the levels y = 0 and y = 2a. In other words, precisely the marked points are the intersection points of the transformed γ with S. Note, in particular, that this transformation of γ can be done by a stratified isomorphism that only changes the y-coordinates of any point in M, but neither the x- nor the $\vec{z}$-values (see Lemma C.1). Moreover, note that we have tacitly used that there is an even number of $\tau_k$ to end up at level y = 0 again after moving the largest $\tau_k$. This finishes the first step.

We are now left with the problem of finding the intersection points having the correct values of d in the second step. Nicely, the idea of the first step can be used again. To see this, assume that n = 3 and look at the scene from above the (x, z)-plane. Since we only changed the y-values of γ, no change can be seen from this perspective. Using our assignment of d to S, we move the ε-neighbourhoods of the marked points of (the transformed) γ. Slightly more generally than in the first step, however, we let the “bumps” that these neighbourhoods are mapped to, return to the original line before this neighbourhood ends. More precisely, the segments outside these neighbourhoods are not shifted again, and the “bumps” map each $\tau_k$ to the correct “level” (i.e., z-coordinate) in order to get mapped to the point with the correct value $g_k$ of d. Note that here we only need to change the z-coordinates, but leave, in particular, the y-coordinates unchanged.
This implies that the parameter values where γ intersects S after having been transformed by both steps, are precisely those of the γ after the first transformation. If n > 3, this step is completely analogous.

25 Such a function indeed exists: Choose some b > 0, such that the surfaces defined by y = a and z = ib are contained in the chosen chart for all i = 1, …, #G. Choose, additionally, for each i, some polynomial $p_i$ with $p_i(l) = \delta_{il}$ and some $X_i \in \mathfrak{g}$ with $g_i = e^{X_i}$. Now, define $d(x, y, \vec{z}) := \prod_i e^{p_i(z/b) X_i}$. This function fulfills $d(x, y, ib) = g_i$ for all i.
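The interpolation idea behind the function d can be tried out in the simplest abelian setting. The sketch below is an illustration under assumptions not made in the text: the group is taken to be U(1), realised as unit complex numbers, the prescribed elements are $g_i = e^{\mathrm{i}\theta_i}$ for hypothetical angles `THETAS`, and the polynomials are the Lagrange basis polynomials on the nodes 1, …, N.

```python
import cmath

# Hypothetical data: three prescribed group elements g_i = exp(1j * theta_i)
# and a level spacing b (a power of two keeps z / b exact in floating point).
THETAS = [0.3, 1.1, 2.5]
B_SPACING = 0.25

def lagrange_basis(i, n, t):
    """Polynomial p_i of degree n - 1 with p_i(l) = 1 if l == i else 0, l = 1..n."""
    value = 1.0
    for l in range(1, n + 1):
        if l != i:
            value *= (t - l) / (i - l)
    return value

def d(z):
    """Product over i of exp(p_i(z / b) X_i) with X_i = 1j * theta_i.

    In U(1) all factors commute, so the product is the exponential of the sum.
    """
    n = len(THETAS)
    exponent = sum(
        lagrange_basis(i, n, z / B_SPACING) * THETAS[i - 1] for i in range(1, n + 1)
    )
    return cmath.exp(1j * exponent)

# Values of d on the levels z = b, 2b, 3b.
levels = [d(i * B_SPACING) for i in range(1, len(THETAS) + 1)]
targets = [cmath.exp(1j * theta) for theta in THETAS]
```

At the level z = ib only the i-th factor differs from the identity, so `levels` reproduces `targets`; the same observation is what makes the product formula work for a non-abelian group, where the order of the factors would otherwise matter.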
Fig. 1. Stratified Diffeomorphism in Lemma C.1
To summarize: It is clear that the constructed isomorphism has all the desired properties and that s can be chosen obviously. Originally, we looked for a stratified isomorphism mapping γ , such that its transform intersects S precisely for the parameter values τk and at points having the desired values of d. By the arguments above, we reduced this problem to the existence of a diffeomorphism in Rn as indicated in Fig. 1 that, in particular, does not move any point outside the given square (times some ε-ball in the remaining n − 2 dimensions not drawn there). The existence of such a diffeomorphism, however, is guaranteed by Lemma C.1. This furnishes the present proof.
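The first step of the construction summarized above, the alternating lifting of the edge, can be sketched as a piecewise linear height profile. The sketch assumes that S sits at the level y = a, as suggested by the marked points being mapped half-way between y = 0 and y = 2a; the function name `lifted_height` and the sample parameters are hypothetical.

```python
def lifted_height(t, taus, eps, a):
    """y-coordinate of the transformed edge at parameter t.

    taus: increasing marked parameters, pairwise more than 2 * eps apart.
    Outside the eps-neighbourhoods the edge runs alternately at y = 0 and
    y = 2a; inside them it is the straight connection between both levels.
    """
    for j, tau in enumerate(taus):
        if abs(t - tau) < eps:
            start = 0.0 if j % 2 == 0 else 2.0 * a
            end = 2.0 * a - start
            fraction = (t - (tau - eps)) / (2.0 * eps)
            return start + fraction * (end - start)
    # Number of marked points already passed decides the current level.
    passed = sum(1 for tau in taus if tau + eps <= t)
    return 2.0 * a if passed % 2 == 1 else 0.0

TAUS = [0.2, 0.5]   # an even number of marked points
EPS, A_LEVEL = 0.05, 1.0
heights_at_marks = [lifted_height(tau, TAUS, EPS, A_LEVEL) for tau in TAUS]
```

With an even number of marked points the profile starts and ends at y = 0, and it crosses the level y = a exactly at the marked parameters, alternately upwards and downwards. This alternation is what produces the sign pattern of the intersections.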
The crucial idea in the proof of Proposition 6.20 was to define for each element in G some domain on the surface S, such that for a given sequence in G, the transformed graph punctures S at the correct points and in the correct ordering, i.e., leading to the correct sequence of values for d. We constructed above a single surface with an analytic d on it. However, we even might use constant d, if we admit S to consist of more than one connected component. In other words, for any finite number we may find such a number of hypersurfaces $S_i$, such that γ may always be transformed to puncture these different surfaces in an arbitrarily given ordering. More precisely, choose for $S_i$ some open (cubic) subspace in S, and let the only restriction to $S_i$ be that its z-coordinate is in some sufficiently small interval $I_i$. We may assume that the closures of these intervals are disjoint. Moreover, each $S_i$ is a hypersurface of M. Reusing the argumentation of the proof of Proposition 6.20, we have shown

Proposition 6.21. Let dim M ≥ 3. Let $\boldsymbol{\gamma}$ be a graph, let $\gamma \in \boldsymbol{\gamma}$ be one of its edges and set $\boldsymbol{\gamma}' := \boldsymbol{\gamma} \setminus \{\gamma\}$. Moreover, let K be a positive integer. Then there is
• some subinterval I ⊆ [0, 1],
• some nicely oriented, open, embedded analytic hypersurfaces $S_i$ with i = 1, …, K, such that each $S_i$ and each $\partial S_i$ has a finite wide triangulation, each $S_i$ is disjoint to $\mathrm{im}\,\boldsymbol{\gamma}'$, and all $S_i$ are mutually disjoint; and
• some $s \in \mathbb{Z}$,
such that for any even integer J > 0, any function $l : [1, J] \to [1, K]$ and any sequence $(\tau_j) \subseteq I$ (with $\tau_j > \tau_{j'}$ for $j > j'$) having length J, there is a stratified analytic isomorphism ϕ with the following properties:
1. $\bigcup_i S_i$ and $\mathrm{im}\,\varphi(\boldsymbol{\gamma}')$ are disjoint;
2. $\varphi(\gamma(\tau_j)) \in S_{l(j)}$ for all j;
3. $\varphi(\gamma)$ intersects each $S_i$ completely transversally;
4. $\{\varphi(\gamma(\tau_j))\}_j$ is the set of $\varphi(\gamma)$-punctures of $\bigcup_i S_i$;
5. $\sigma^+(S_{l(j)}, \varphi(\gamma)|_{[\tau_{j-1}, \tau_j]}) = (-1)^{j+s} = \sigma^-(S_{l(j)}, \varphi(\gamma)|_{[\tau_j, \tau_{j+1}]})$ for all j.

6.4.2. Generation of independent paths. Transferred to the case of manifolds, Corollary 6.6 yields

Proposition 6.22. Let M be some n-dimensional manifold with n ≥ 2 and let S ⊆ M. Assume that S and ∂S are connected embedded submanifolds in M (without boundary) and that $\overline{S}$ is an embedded submanifold in M having boundary ∂S. Moreover, let $\boldsymbol{\gamma}$ be some nontrivial graph in M, such that the image of $\boldsymbol{\gamma}$ is neither equal to S, ∂S nor $\overline{S}$. Then there is a nontrivial path γ, a neighbourhood U of some $m \in \mathrm{im}\,\gamma$ in M and infinitely many stratified diffeomorphisms $\varphi_i$ of M, such that
• γ is the only edge in $\boldsymbol{\gamma}$ not disjoint to U;
• $\varphi_i$ is the identity outside U;
• $\varphi_i$ leaves the set S invariant;
• $\{\varphi_i(\gamma)\}_i$ is a hyph.26
If we additionally assume that S has one of its natural orientations, then each $\varphi_i$ may be chosen such that, additionally, it leaves the orientation of S invariant.

Proof.
1. $\mathrm{im}\,\boldsymbol{\gamma}$ is not contained in $\overline{S}$. Let γ be an edge of $\boldsymbol{\gamma}$ not contained in $\overline{S}$. Choose some interior point m of γ outside $\overline{S}$, and let U be some open neighbourhood of m disjoint to $\overline{S}$ and disjoint to all other edges in $\boldsymbol{\gamma}$ except for γ. Choose some chart whose closure is contained in U and whose intersection with (the image of) γ is mapped to a straight line with m mapped to the origin. Corollary 6.6 now gives a collection $\varphi_\alpha$ of stratified diffeomorphisms being the identity outside the chart that, therefore, may be extended to stratified diffeomorphisms of M that are the identity outside, at least, U. Since each $\varphi_\alpha(\gamma)$ with $\alpha \in [0, \pi)$ has some interior point not passed by any other $\varphi_{\alpha'}(\gamma)$, these paths are independent. The invariance of S is trivial as well as the fact that the orientation of S is preserved and that $\{\varphi_\alpha(\gamma)\}_\alpha$ is a hyph.
2. $\mathrm{im}\,\boldsymbol{\gamma}$ is contained in ∂S. In particular, this implies that ∂S is at least one-dimensional. In fact, otherwise ∂S would be a point and $\boldsymbol{\gamma}$ trivial. In the case that dim S < n − 1 and that we consider orientations, let, moreover, $S' \supseteq S$ be some (n − 1)-dimensional embedded submanifold of M inducing the orientation of S.
a) dim ∂S ≥ 2. Choose some interior point m of some edge γ in $\boldsymbol{\gamma}$ and some open neighbourhood U of m whose closure is disjoint to all edges in $\boldsymbol{\gamma}$ except for γ. By assumption, there is some chart whose closure is contained in U, such that the intersection of the chart
• with $S'$ (if applicable) is some open subset of $\mathbb{R}^{n-1}$,
• with S is some open subset of $\mathbb{R}^{\dim S}$ (if applicable, in $\mathbb{R}^{n-1}$),
• with ∂S is some open subset of $\mathbb{R}^{\dim S - 1} \subseteq \mathbb{R}^{\dim S}$,
• with im γ is a straight line in $\mathbb{R} \subseteq \mathbb{R}^{\dim S - 1}$, and
• with m is mapped to the origin.
26 Here, we extended the notion of a hyph naturally to the case of infinitely many paths.
Representations of the Weyl Algebra in Quantum Geometry
Since dim S > dim ∂S ≥ 2, Corollary 6.6 provides us, analogously to the first case, with stratified diffeomorphisms having the desired properties. In particular, observe that, although they are the identity neither on S nor on ∂S, they leave both S and ∂S (and, if applicable, S′) invariant. The orientation of S is obviously preserved for dim S = n. For dim S = n − 1, use the fact that $t \mapsto \varphi_{t\alpha}$ is a homotopy over diffeomorphisms having the properties above, whence the natural orientation of S is preserved by each diffeomorphism. If dim S < n − 1, then the natural orientations of S′ are preserved as above, whence the induced orientations on S are so as well.

b) dim ∂S = 1. Since ∂S is one-dimensional, it is isomorphic to either a line or a circle. Moreover, ∂∂S = ∅. Since the compact set im $\boldsymbol{\gamma}$ does not equal ∂S, there is some point m ∈ ∂(im $\boldsymbol{\gamma}$) ⊆ ∂S. Moreover, there is a (unique) edge, say γ, having m as one of its endpoints. We may assume γ(0) = m and choose some open neighbourhood U of m whose closure is disjoint to all edges in $\boldsymbol{\gamma}$ except for γ. Now, we select some chart whose closure is contained in U, such that the intersection of the chart
• with S′ is (if applicable) some open subset of $\mathbb{R}^{n-1}$,
• with S is some open subset of $\mathbb{R}^2$ (if applicable, in $\mathbb{R}^{n-1}$),
• with ∂S is some open subset of $\mathbb{R} \subseteq \mathbb{R}^2$,
• with im γ equals $[0, \tau) \subseteq \mathbb{R}$, and
• with m is mapped to the origin,
and such that $B_\tau \subseteq U$ for some τ > 0. By Lemma 6.4, for $\alpha \in [0, \tfrac{1}{3}\tau)$, there are now stratified diffeomorphisms $\varphi_\alpha$, taking $-\tfrac{1}{3}\tau \in \mathbb{R} \subseteq \mathbb{R}^n$ as the origin, such that (−τ, +τ) is mapped onto itself, such that $\varphi_\alpha([0, \tau]) = [\alpha, \tau]$ and such that $\varphi_\alpha$ is the identity outside U. In particular, each $\varphi_\alpha$ leaves S, ∂S and S′ invariant. Choosing some monotonically decreasing, infinite sequence $\alpha_i \to 0$, we get a hyph $\{\varphi_{\alpha_i}(\gamma)\}_{i\in\mathbb{N}}$, since $(\alpha_i, \alpha_{i-1})$ is passed by no $\varphi_{\alpha_j}(\gamma)$ with j < i. The preservation of orientation by $\varphi_{\alpha_i}$ is shown analogously to the case above.

3. im $\boldsymbol{\gamma}$ is contained in $\overline{S}$, but not in ∂S. As above, this implies that S is at least one-dimensional. Again, for dim S < n − 1 and if we consider orientations, we let S′ ⊇ S be some (n − 1)-dimensional embedded submanifold of M inducing the orientation of S.

a) dim S ≥ 2. Choose in $\boldsymbol{\gamma}$ some edge γ not fully contained in ∂S. We now may find some interior point m of γ being in the interior of S and fix some open neighbourhood U of m, whose closure is disjoint to ∂S and disjoint to all edges in $\boldsymbol{\gamma}$ except for γ. By assumption, there is some chart whose closure is contained in U, such that the intersection of the chart
• with S′ is (if applicable) some open subset of $\mathbb{R}^{n-1}$,
• with S is some open subset of $\mathbb{R}^{\dim S}$ (if applicable, in $\mathbb{R}^{n-1}$),
• with im γ is a straight line in $\mathbb{R} \subseteq \mathbb{R}^{\dim S}$, and
• with m is mapped to the origin.
As above, we may find stratified diffeomorphisms of the desired type, by Corollary 6.6.
Ch. Fleischhack
b) dim S = 1. Since S is one-dimensional, it is isomorphic to either a line or a circle. Hence ∂S consists of at most two points. Consequently, $\overline{S}$ is isomorphic either to a circle, a line, a ray or a closed interval. Since im $\boldsymbol{\gamma}$ ⊂ $\overline{S}$ is compact, there is some point m ∈ ∂(im $\boldsymbol{\gamma}$) ∩ S. Picking, as above, the (unique) edge γ having m as one of its endpoints, we now may find some open neighbourhood U of m whose closure is disjoint to ∂S and to all edges in $\boldsymbol{\gamma}$ except for γ. Again, we select some chart whose closure is contained in U, such that the intersection of the chart
• with S′ is (if applicable) some open subset of $\mathbb{R}^{n-1}$,
• with S is some open subset of $\mathbb{R}$ (if applicable, in $\mathbb{R}^{n-1}$),
• with im γ equals $[0, \tau) \subseteq \mathbb{R}$, and
• with m is mapped to the origin,
and such that Bτ ⊆ U for some τ > 0. Again, as in the case im γ ⊆ ∂ S and dim ∂ S = 1, we find the desired stratified diffeomorphisms by Lemma 6.4.
Proposition 6.23. Let M be some n-dimensional manifold with n ≥ 2 and let S ⊆ M. Assume that S and ∂S are connected embedded submanifolds in M (without boundary) and that the closure $\overline{S}$ is an embedded submanifold in M having boundary ∂S. Moreover, let either S or ∂S be an embedded 1-circle $S^1$. Finally, let $\boldsymbol{\gamma}$ be a graph whose image is this $S^1$ and let m be some vertex of $\boldsymbol{\gamma}$. Then there is a neighbourhood U of m in M, infinitely many different $m_i$ in $S^1 \cap U$ and for each i a stratified diffeomorphism $\varphi_i$ of M with the following properties:
• $\varphi_i$ is the identity outside U;
• $\varphi_i$ leaves the set S invariant;
• $\varphi_i$ maps m to $m_i$;
• $\varphi_i$ is the identity on all edges of $\boldsymbol{\gamma}$ not adjacent to m.
If we additionally assume that S has one of its natural orientations, then each $\varphi_i$ may be chosen such that, additionally, it leaves the orientation of S invariant.

Proof. This proof is closely analogous to that of Proposition 6.22. Therefore, we only present its main idea. First choose some U, small enough to intersect im $\boldsymbol{\gamma}$ only at its edges adjacent to m and such that U ∩ S is a domain of a straight line (if im $\boldsymbol{\gamma}$ = S) or of a half plane (if im $\boldsymbol{\gamma}$ = ∂S). Now choose some point near m as the origin for a local scaling as in Lemma 6.4. This way, we may move m to every other sufficiently nearby point, leaving ∂S or S, respectively, invariant, without moving any point outside U.
7. Representations of the Weyl Algebra

Now we are prepared to give a rigorous proof of (a stronger version of) the uniqueness theorem claimed by Sahlmann and Thiemann [35]. As they do, we will proceed in two steps: First we use regularity and diffeomorphism invariance to show that the first-step decomposition contains the Ashtekar-Lewandowski measure. This will follow from the fact that the diffeomorphisms split the Weyl operators, i.e., the weak convergence of Weyl operators is not uniform on states related by diffeomorphisms. Second, using diffeomorphism invariance again, we show that each Weyl operator is a scalar at this component. This enables us to use the naturality of the action of diffeomorphisms in order to prove that each Weyl operator is even a unit there. Cyclicity will give the proof. At the end, we discuss the technical assumptions made in the proofs.
7.1. Splitting property. As before, we assume to be given some nice enlarged structure data. Moreover, we restrict ourselves to the case that S and D contain at least those hypersurfaces and stratified isomorphisms, respectively, that are necessary to keep Proposition 6.21 valid. In other words, one possibility is to choose S to contain at least all of these “cubic” hypersurfaces and D to contain at least the stratified isomorphisms described in Subsubsect. 6.4.1. Throughout the whole subsection, let π be some representation of ADiff on H and denote by π := π |A the corresponding representation of A. Additionally, we require M to have at least dimension 3. Proposition 7.1. Assume π to be regular. Moreover, let 1ν0 be D-invariant for some ν0 . Then µν0 is the Ashtekar-Lewandowski measure µ0 . Recall that regularity always means regularity w.r.t. R, whereas R is taken from the nice enlarged structure data. Corollary 7.2. Let π be regular, and let there exist a (cyclic) D-invariant vector in H. Then there is a first-step decomposition of π , such that µν0 is the AshtekarLewandowski measure µ0 for some ν0 and 1ν0 is (cyclic and) D-invariant. Proof. According to Lemma 2.3 and the agreements thereafter, we may find some ν0 , such that 1ν0 is D-invariant (and cyclic). Now use the proposition above.
Note that, as mentioned earlier, we do not distinguish between the D-invariances on equivalent representations. More precisely, we should say in the corollary above: There is an isomorphism $U : H \to H'$ with $H' = \bigoplus_{\nu\in N} L^2(A, \mu_\nu)$ for certain measures $\mu_\nu$ on A, such that $U \circ \pi|_{C(X)} \circ U^{-1}$ is cyclic on each $L^2(A, \mu_\nu)$ with cyclic vector $1_\nu$; moreover, $\mu_{\nu_0}$ equals $\mu_0$ for some $\nu_0 \in N$ and $U \pi(\alpha_\varphi) U^{-1} 1_{\nu_0} = 1_{\nu_0}$ for all ϕ ∈ D. Before we are able to prove the proposition above, we have to provide two estimates.

Lemma 7.3. Let $T \in M_{\boldsymbol{\gamma}}$ be a gauge-variant spin network state, and let φ be some representation occurring in T. Denote the Casimir eigenvalue w.r.t. φ by $\lambda_\phi$ and set n := dim g. Finally, define $\eta : \mathbb{R}_+ \to \mathbb{R}_+$ according to Lemma B.2. Then there is a one-parameter group $w_t$ of Weyl operators, such that, for each $t_0 > 0$ and each even $J \in \mathbb{N}_+$, there are $(2n)^J$ diffeomorphisms $\varphi_l$, such that
\[
\Big\| \Big[ \frac{1}{(2n)^J} \sum_{l} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\, \mathbf{1} \Big] T \Big\|_\infty \;\le\; \|T\|_\infty \big( e^{\eta(t_0) J t^4} - 1 \big)
\]
for all $|t| < t_0$.

Proof. Fix some edge γ ∈ $\boldsymbol{\gamma}$, such that φ is the representation carried by γ in T. Let, according to Lemma B.2, $\{X_i\}_{i=1}^{n}$ be a basis of the Lie algebra g of G, such that $-\frac{1}{n}\sum_i \phi(X_i)\phi(X_i)$ is (up to the prefactor) the Casimir operator w.r.t. φ. Define
\[
c_{\phi,g}(t) := \frac{1}{2n} \sum_{i=1}^{n} \big[ \phi(e^{tX_i}) + \phi(e^{-tX_i}) \big] \equiv \frac{1}{2n} \sum_{i=1}^{2n} \phi(e^{tX_i}) \equiv \frac{1}{2n} \sum_{i=1}^{2n} \phi(e^{-tX_i})
\]
with $X_{i+n} := -X_i$. According to Proposition 6.21, choose some interval I ⊆ [0, 1], for each i = 1, ..., 2n some appropriate surface $S_i$ disjoint to im $\boldsymbol{\gamma}$ and some s ∈ Z.
Moreover, let $d_i : M \to g$ be the constant function of value $\tfrac{1}{2} X_i$, and fix some strictly increasing sequence $(\tau_j)_{j\in\mathbb{N}_+} \subseteq I$. We are now going to consider the one-parameter group
\[
w_t := \prod_{i=1}^{2n} w_{i,t} \equiv \prod_{i=1}^{2n} w^{S_i,\sigma_{S_i}}_{E_{d_i}(t)} .
\]
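In fact, this is a one-parameter group: All the $S_i$ are disjoint, whence $w_{i,t}$ and $w_{i',t'}$ commute by Lemma 3.26.27 Fix now some positive even integer J and some positive $t_0$. By the choice of $S_i$ and of $(\tau_j)$, for each $l : [1, J] \to [1, 2n]$ there is a diffeomorphism $\varphi_l \in D$ with the properties described in Proposition 6.21. In particular, since $\varphi_l(\gamma)$ intersects $S := \bigcup_i S_i$ completely transversally, the minimal S-admissible decomposition of $\varphi_l(\gamma)$ contains S-external edges only. More explicitly, it equals $\varphi_l(\gamma_0) \cdots \varphi_l(\gamma_J)$ with $\gamma_0 = \gamma|_{[0,\tau_1]}$, $\gamma_j = \gamma|_{[\tau_j,\tau_{j+1}]}$ and $\gamma_J = \gamma|_{[\tau_J,1]}$. By $\varphi_l(\gamma(\tau_j)) \in S_{l(j)}$ for all j, we see that $\varphi_l(\gamma_j)$ starts in $S_{l(j)}$ and ends in $S_{l(j+1)}$ (with the obvious exceptions for j = 0 and j = J). Since, moreover, by construction, $\sigma^+(S_{l(j)}, \varphi_l(\gamma_{j-1})) = (-1)^{j+s} = \sigma^-(S_{l(j)}, \varphi_l(\gamma_j))$ for j = 1, ..., J, we get
\begin{align*}
w_t\big(\alpha_{\varphi_l}(\phi \circ \pi_\gamma)\big) = w_t(\phi \circ \pi_{\varphi_l \gamma})
 &= \bigotimes_{j=0}^{J} w_t(\phi \circ \pi_{\varphi_l \gamma_j}) \\
 &= (\phi \circ \pi_{\varphi_l \gamma_0}) \otimes \bigotimes_{j=1}^{J} \phi\big(e^{(-1)^{j+s} X_{l(j)} t}\big) \cdot (\phi \circ \pi_{\varphi_l \gamma_j}),
\end{align*}
hence
\[
[\alpha_{\varphi_l}^{-1}(w_t)](\phi \circ \pi_\gamma) = (\phi \circ \pi_{\gamma_0}) \otimes \bigotimes_{j=1}^{J} \phi\big(e^{(-1)^{j+s} X_{l(j)} t}\big) \cdot (\phi \circ \pi_{\gamma_j})
\]
and
\begin{align*}
\frac{1}{(2n)^J} \sum_{l} [\alpha_{\varphi_l}^{-1}(w_t)](\phi \circ \pi_\gamma)
 &= (\phi \circ \pi_{\gamma_0}) \otimes \bigotimes_{j=1}^{J} \frac{1}{2n} \sum_{i=1}^{2n} \phi\big(e^{(-1)^{j+s} X_i t}\big) \cdot (\phi \circ \pi_{\gamma_j}) \\
 &= (\phi \circ \pi_{\gamma_0}) \otimes \bigotimes_{j=1}^{J} c_{\phi,g}(t) \cdot (\phi \circ \pi_{\gamma_j}).
\end{align*}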
Here, we used
\[
\sum_{l : [1,J] \to [1,2n]} \; \prod_{j=1}^{J} a_{l(j),j} \;=\; \prod_{j=1}^{J} \sum_{i=1}^{2n} a_{i,j} .
\]
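As a sanity check of this interchange of sum and product, the smallest even case J = 2 with only two index values (2n = 2) can be written out by hand; the four maps l : {1, 2} → {1, 2} are listed explicitly. The sketch below treats the $a_{i,j}$ as commuting quantities, which is an assumption made only for the illustration.

```latex
\begin{align*}
\sum_{l:\{1,2\}\to\{1,2\}} \; \prod_{j=1}^{2} a_{l(j),j}
 &= a_{1,1}a_{1,2} + a_{1,1}a_{2,2} + a_{2,1}a_{1,2} + a_{2,1}a_{2,2} \\
 &= (a_{1,1} + a_{2,1})\,(a_{1,2} + a_{2,2})
  = \prod_{j=1}^{2} \sum_{i=1}^{2} a_{i,j}.
\end{align*}
```

In general the left-hand side has $(2n)^J$ summands, one per map l, which matches the normalization factor $1/(2n)^J$ used in the averaging over diffeomorphisms.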
27 If we choose some $E(t) : M \to G$ with $E(t) = E_{d_i}(t)$ on each $S_i$ and define $\sigma_S$ to be the joint intersection function of $S_1, \ldots, S_{2n}$, we get $w_t = w^{S,\sigma_S}_{E(t)}$. Recall that we assumed that R contains not only the “genuine” subgroups in W, but also the finite products of such subgroups, provided they mutually commute. Therefore, it is not important that E(t) is possibly not included in (S).
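By assumption, we have $T = T_\gamma \otimes T_{\boldsymbol{\gamma}'} = \sqrt{\dim\phi}\,(\phi \circ \pi_\gamma)^k_l \otimes T_{\boldsymbol{\gamma}'}$ for some matrix indices k, l and some $T_{\boldsymbol{\gamma}'} \in M_{\boldsymbol{\gamma}'}$ with $\boldsymbol{\gamma}' = \boldsymbol{\gamma} \setminus \{\gamma\}$. Additionally using $S \cap \operatorname{im}(\varphi_l \boldsymbol{\gamma}') = \emptyset$ and $\|\phi^k_l\|_\infty \le \|\phi\|_\infty$, we get
\begin{align*}
\Big\| \Big[ \frac{1}{(2n)^J} \sum_{l} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\,\mathbf{1} \Big] T \Big\|_\infty
 &\le \sqrt{\dim\phi}\; \Big\| \Big( \Big[ \frac{1}{(2n)^J} \sum_{l} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\,\mathbf{1} \Big] (\phi \circ \pi_\gamma) \Big)^k_l \otimes T_{\boldsymbol{\gamma}'} \Big\|_\infty \\
 &\le \|T_\gamma\|_\infty \big( e^{\eta(t_0) J t^4} - 1 \big)\, \|T_{\boldsymbol{\gamma}'}\|_\infty
\end{align*}
for all $|t| < t_0$, by Lemma B.2 and the surjectivity [14] of $\pi_\upsilon : A \to G^{\#\upsilon}$ for every hyph υ.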
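The appearance of $e^{-\frac{1}{2}\lambda_\phi J t^2}$ and of an error of order $J t^4$ can be motivated by a small-t expansion of $c_{\phi,g}(t)$. The following sketch assumes the normalization $-\frac{1}{n}\sum_{i=1}^{n}\phi(X_i)\phi(X_i) = \lambda_\phi \mathbf{1}$ (the text only fixes this up to a prefactor, so the constant is an assumption here) and writes φ also for the induced Lie algebra representation; the precise estimate is the content of Lemma B.2 and is not reproduced.

```latex
\begin{align*}
\phi(e^{tX_i}) + \phi(e^{-tX_i})
 &= e^{t\phi(X_i)} + e^{-t\phi(X_i)}
  = 2\cdot\mathbf{1} + t^2\,\phi(X_i)^2 + O(t^4), \\
c_{\phi,g}(t)
 &= \frac{1}{2n}\sum_{i=1}^{n}\big[\phi(e^{tX_i}) + \phi(e^{-tX_i})\big]
  = \mathbf{1} + \frac{t^2}{2}\cdot\frac{1}{n}\sum_{i=1}^{n}\phi(X_i)^2 + O(t^4) \\
 &= \big(1 - \tfrac{1}{2}\lambda_\phi t^2\big)\mathbf{1} + O(t^4)
  = e^{-\frac{1}{2}\lambda_\phi t^2}\,\mathbf{1} + O(t^4).
\end{align*}
```

Each of the J factors $c_{\phi,g}(t)$ thus differs from $e^{-\frac{1}{2}\lambda_\phi t^2}\mathbf{1}$ by a term of order $t^4$, which is, heuristically, why the total deviation is controlled by $J t^4$ while the leading behaviour is $e^{-\frac{1}{2}\lambda_\phi J t^2}$.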
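Corollary 7.4. Let π be some representation of ADiff, let $T \in M_{\boldsymbol{\gamma}}$ be a nontrivial gauge-variant spin network and let ψ ∈ H be D-invariant with $\langle \psi, \pi(T)\psi \rangle_H \neq 0$. Then there is a one-parameter group $w_t$ of Weyl operators and $\varepsilon, t_0 > 0$, such that for all $0 \neq |t| < t_0$ there is a diffeomorphism ϕ with
\[
\big| \langle \psi, \pi\big( (w_t - \mathbf{1})(\alpha_\varphi T) \big) \psi \rangle_H \big| \ge \varepsilon .
\]

Proof. • Set $\varepsilon := \min\{ \tfrac{1}{2}, \tfrac{1}{4} |\langle \psi, \pi(T)\psi \rangle_H| \} > 0$. Fix $\tau_0 > 0$ and some non-trivial irreducible representation φ occurring in T. Next, choose positive real $\tau_2$ and $\tau_4$, such that (using the $\eta(\tau_0)$ as given in Lemma 7.3)
\[
\|\psi\|_H^2 \, \|T\|_\infty \big( e^{\eta(\tau_0)\tau} - 1 \big) < \varepsilon \quad \text{for all } |\tau| < \tau_4
\]
and
\[
e^{-\frac{1}{2}\lambda_\phi \tau} < \varepsilon \quad \text{for all } |\tau| > \tau_2 .
\]
Define now
\[
J_0 := \tfrac{1}{2}\sqrt{2\tau_2\tau_4} \quad \text{and} \quad t_0 := \min\Big\{ \frac{1}{\tau_2} J_0,\ \sqrt[3]{J_0},\ \tau_0 \Big\} .
\]
We say that $(J, t) \in \mathbb{N}_+ \times \mathbb{R}$ is an admissible pair iff
\[
0 < |t| < t_0 \quad \text{and} \quad \frac{J_0}{|t|^3} \le J \le 2\,\frac{J_0}{|t|^3} .
\]
As one checks easily, the admissibility of (J, t) implies $J t^2 > \tau_2$ and $J t^4 < \tau_4$. Moreover, for every $0 < |t| < t_0$, there is an even $J(t) \in \mathbb{N}_+$ such that (J(t), t) is admissible.

• Choose now a one-parameter subgroup $w_t$ of Weyl operators and, for all positive integers J, some diffeomorphisms $\varphi_l$, $l \in P_J$, as in Lemma 7.3. Consequently, for every admissible pair (J, t), we have
\begin{align*}
\Big| \frac{1}{(2n)^J} \sum_{l \in P_J} \big\langle \psi, \pi\big( (\alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\mathbf{1})(T) \big) \psi \big\rangle_H \Big|
 &= \Big| \Big\langle \psi, \pi\Big( \Big[ \frac{1}{(2n)^J} \sum_{l \in P_J} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\mathbf{1} \Big] T \Big) \psi \Big\rangle_H \Big| \\
 &\le \Big\| \pi\Big( \Big[ \frac{1}{(2n)^J} \sum_{l \in P_J} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\mathbf{1} \Big] T \Big) \Big\|_{B(H)} \, \|\psi\|_H^2 \\
 &\le \Big\| \Big[ \frac{1}{(2n)^J} \sum_{l \in P_J} \alpha_{\varphi_l}^{-1}(w_t) - e^{-\frac{1}{2}\lambda_\phi J t^2}\mathbf{1} \Big] T \Big\|_\infty \, \|\psi\|_H^2 \\
 &\le \big( e^{\eta(\tau_0) J t^4} - 1 \big) \, \|T\|_\infty \, \|\psi\|_H^2 \;<\; \varepsilon .
\end{align*}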
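• Finally, for all $0 < |t| < t_0$, we may choose a J(t) providing an admissible pair (J(t), t). By the lines above, there is some $\varphi \in \{ \varphi_l \mid l \in P_{J(t)} \}$, such that
\[
\big| \langle \psi, \pi\big( (w_t - \mathbf{1})(\alpha_\varphi T) \big) \psi \rangle_H \big| = \big| \langle \psi, \pi\big( (\alpha_\varphi^{-1}(w_t) - \mathbf{1}) T \big) \psi \rangle_H \big| > \varepsilon
\]
using the D-invariance of ψ.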
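The two steps that the proof leaves to the reader can be spelled out as follows. This is only a sketch: it assumes the reading $J_0 = \tfrac{1}{2}\sqrt{2\tau_2\tau_4}$ (so that $2J_0^2 = \tau_2\tau_4$) and $t_0 \le \min\{J_0/\tau_2, \sqrt[3]{J_0}\}$ for the constants, and the abbreviations $N$, $a_l$, $c$, $q$ are introduced here for convenience and do not occur in the original.

```latex
% Step 1: admissibility implies the two bounds.
% Let 0 < |t| < t_0 and J_0/|t|^3 <= J <= 2 J_0/|t|^3.
\begin{align*}
J t^2 &\ge \frac{J_0}{|t|} > \frac{J_0}{t_0} \ge \tau_2
  && \text{since } t_0 \le J_0/\tau_2, \\
J t^4 &\le 2 J_0 |t| < 2 J_0 t_0 \le \frac{2 J_0^2}{\tau_2} = \tau_4
  && \text{since } 2 J_0^2 = \tau_2 \tau_4 .
\end{align*}
% An even J exists: a := J_0/|t|^3 > 1 because |t|^3 < t_0^3 <= J_0,
% and every interval [a, 2a] with a >= 1 contains an even integer
% (the integer 2 if a < 2; otherwise the smallest even integer >= a,
%  which is < a + 2 <= 2a).

% Step 2: from the averaged estimate to a single diffeomorphism.
% Put N := (2n)^J,  c := <psi, pi(T) psi>,  q := exp(-lambda_phi J t^2 / 2),
%     a_l := <psi, pi(alpha_{phi_l}^{-1}(w_t) T) psi>.
% Known: |(1/N) sum_l a_l - q c| < eps,  0 < q < eps <= 1/2,  |c| >= 4 eps.
\begin{align*}
\Big| \frac{1}{N}\sum_{l} (a_l - c) \Big|
 &\ge (1-q)\,|c| - \Big| \frac{1}{N}\sum_{l} a_l - q\,c \Big| \\
 &> (1-\varepsilon)\cdot 4\varepsilon - \varepsilon
  \;\ge\; 2\varepsilon - \varepsilon \;=\; \varepsilon .
\end{align*}
% The average of the numbers |a_l - c| is at least the modulus of the
% average of a_l - c, so |a_l - c| > eps for at least one l.
```

Since $a_l - c = \langle \psi, \pi((\alpha_{\varphi_l}^{-1}(w_t) - \mathbf{1})T)\psi \rangle_H$, the last line is exactly the lower bound claimed for some $\varphi = \varphi_l$.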
Proof of Proposition 7.1. Corollary 7.4 shows that W splits W at $1_{\nu_0}$ for every nontrivial gauge-variant spin network state T with $\langle 1, T \rangle_{\nu_0} \equiv \langle 1_{\nu_0}, \pi(T) 1_{\nu_0} \rangle_H \neq 0$. Hence, W splits W at $1_{\nu_0}$, by Lemma 3.3. Since π is regular, Proposition 2.21 gives the assertion.
7.2. Naturality. Now we are using nice enlarged structure data and assume additionally that • D contains at least the stratified isomorphisms described in Subsect. 6.4 and at least those necessary to keep Proposition 6.14 valid; • S contains at most all D-orbits of semianalytic subsets in M having a finite wide triangulation and being of lower dimension than M, but contains at least the q-balls and q-simplices with q < dim M.
Proposition 7.5. Let π be a D-natural representation of ADiff, such that $1_{\nu_0}$ is a diffeomorphism-invariant vector and $\mu_{\nu_0}$ equals $\mu_0$ for some $\nu_0 \in N$. Then the restriction of π to $H_{\nu_0}$ is the fundamental representation $\pi_0$, i.e., we have $P_{\nu_0}\pi(a) = \pi_0(a) P_{\nu_0}$ for all a ∈ ADiff.

The proof of the proposition will use several steps we are now going to write down in separate lemmata. For this, throughout the whole section, we will assume that π is a D-natural representation of ADiff having some $1_{\nu_0}$ as a D-invariant vector. Moreover, $\mu_{\nu_0}$ equals $\mu_0$. Finally, as usual, we set π := π|A.

Lemma 7.6. Let $S_1$ and $S_2$ be elements in S having orientations $\sigma_{S_1}$ and $\sigma_{S_2}$. Assume that they are oriented-strata equivalent. Finally, let g ∈ G be some element and $d_i : S_i \to G$ for i = 1, 2 be the constant function with value g. Then $w^{S_1,\sigma_{S_1}}_{d_1}$ is a $\pi_{\nu_0}$-unit ($\pi_{\nu_0}$-scalar) iff $w^{S_2,\sigma_{S_2}}_{d_2}$ is a $\pi_{\nu_0}$-unit ($\pi_{\nu_0}$-scalar).

Proof. Let ϕ be a product of localized stratified isomorphisms mapping $S_1$ onto $S_2$ as well as their orientations. Then
\[
\alpha_\varphi\big( w^{S_1,\sigma_{S_1}}_{d_1} \big) = w^{\varphi(S_1),\varphi(\sigma_{S_1})}_{\varphi(d_1)} = w^{S_2,\sigma_{S_2}}_{d_2} .
\]
Now, the assertion follows from Corollary 2.10.
Lemma 7.7. Let S be some subset of M, such that S and ∂S are connected embedded submanifolds in M (without boundary) and that the closure $\overline{S}$ is an embedded submanifold in M having boundary ∂S. Moreover, assume that S has one of its natural orientations. Finally, let $w = w^{S,\sigma_S}_d$ for some constant d ∈ (S). Then $P_{\nu_0}\pi(w)1_{\nu_0} \in H_{\nu_0} \equiv L^2(A, \mu_0)$ is orthogonal to all non-trivial gauge-variant spin network states that are not based on an edge γ whose image equals S, ∂S or $\overline{S}$.

Recall that no edge of a gauge-variant spin network is labelled with the trivial representation.

Proof. Let T be a gauge-variant spin network state in $M_{\boldsymbol{\gamma}}$. There are two main cases:
• im $\boldsymbol{\gamma}$ neither equals S, ∂S nor $\overline{S}$. By Proposition 6.22, there is an infinite number of localized stratified diffeomorphisms $\varphi_i$, leaving S (including its orientation and d) and each edge of $\boldsymbol{\gamma}$ except for some γ invariant, and forming a hyph $\{\varphi_i(\gamma)\}_i$. Consequently, $\alpha_{\varphi_i} T$ and $\alpha_{\varphi_j} T$ are orthogonal for i ≠ j. Moreover, each $\alpha_{\varphi_i}$ commutes with w. Therefore, by Lemma 2.17, $P_{\nu_0}\pi(w)1_{\nu_0}$ is orthogonal to T.
• im $\boldsymbol{\gamma}$ equals S, ∂S or $\overline{S}$. Assume first im $\boldsymbol{\gamma}$ = ∂S. Since im $\boldsymbol{\gamma}$ is compact, ∂S has to be compact as well. Hence, it is isomorphic to $S^1$. After a possibly necessary re-orientation of $\boldsymbol{\gamma}$, the product of all paths in $\boldsymbol{\gamma}$ is a closed edge γ with image ∂S. Assume that T is not (γ, φ)-based for some φ. Then there is some vertex m in $\boldsymbol{\gamma}$, where the adjacent edges at m are either labelled with different representations or carry non-matching indices. Now, as in the previous case, but this time by Proposition 6.23, there are infinitely many localized stratified diffeomorphisms $\varphi_i$, leaving the sets S (including orientation and d) and ∂S invariant; they simply move m along ∂S, stretching ∂S a bit. By Lemma 3.4, any two $\alpha_{\varphi_i} T$ and $\alpha_{\varphi_j} T$ with i ≠ j are orthogonal. Since each $\alpha_{\varphi_i}$ commutes with w, Lemma 2.17 proves the orthogonality of $P_{\nu_0}\pi(w)1_{\nu_0}$ and T. The case of im $\boldsymbol{\gamma}$ = S is completely analogous. For im $\boldsymbol{\gamma}$ = $\overline{S}$ we may additionally get the case of an embedded interval. However, this is analogous as well.
Immediately from the proof of the lemma above and that of Proposition 6.22, we get

Corollary 7.8. Let S be some subset of M and $w^{S,\sigma_S}_d \in W$ be any Weyl operator. Moreover, let $\boldsymbol{\gamma}$ be a graph not contained in the closure of S. Then $P_{\nu_0}\pi(w)1_{\nu_0} \in H_{\nu_0} \equiv L^2(A, \mu_0)$ is orthogonal to all non-trivial gauge-variant spin network states in $M_{\boldsymbol{\gamma}}$.

We are now going to prove that the Weyl operators to open balls, given some constant “labelling” d, are $\pi_{\nu_0}$-units. We start with the dimensions 0 and 3+, but smaller than dim M, proceed with dimension 1 and end up with dimension 2.

Corollary 7.9. Let s < dim M be some non-negative integer with s ≠ 1, 2, and let S be an open or closed s-dimensional ball in M given a nice orientation. Then $w := w^{S,\sigma_S}_d$ is a $\pi_{\nu_0}$-unit for every constant d ∈ (S).

Proof. Let $\boldsymbol{\gamma}$ be a non-trivial graph. Since s ≠ 1, 2, neither S, ∂S nor $\overline{S}$ equals im $\boldsymbol{\gamma}$. Thus, $P_{\nu_0}\pi(w)1_{\nu_0}$ is orthogonal to each non-trivial $T \in M_{SN}$. Since $M_{SN}$ is a continuous $\mu_0$-generating system, w is a $\pi_{\nu_0}$-scalar (see also Lemma 2.14). To prove that w is even a $\pi_{\nu_0}$-unit observe first that, by Propositions 6.10 and 6.14, there is a stratified isomorphism ϕ mapping S onto itself, but reverting its orientation. Thus, $\alpha_\varphi(w) = w^*$, whence $w^2$ is a $\pi_{\nu_0}$-unit by Corollary 2.11. Since G is compact and connected, there is a square root for any element. Re-doing the proof for $d_1$ ∈ (S) with $d_1 d_1 = d$ gives the assertion.
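The square-root step can be made explicit. The sketch below uses the standard fact that the exponential map of a compact connected Lie group is surjective; connectedness matters (in O(2) a reflection has determinant −1 and hence is not a square). It also assumes that Weyl operators with constant labels on a fixed oriented S multiply like their labels, which the proof above uses implicitly; the concrete SU(2) element is chosen here purely for illustration.

```latex
% Existence of square roots in a compact connected Lie group G:
\[
d = \exp(X) \ \text{for some } X \in \mathfrak{g}
\quad\Longrightarrow\quad
d_1 := \exp\big(\tfrac{1}{2}X\big) \ \text{satisfies} \ d_1 d_1 = \exp(X) = d .
\]
% Illustration in G = SU(2), with \sigma_3 = diag(1,-1):
\[
-\mathbf{1} = \exp(i\pi\sigma_3), \qquad
d_1 = \exp\big(\tfrac{i\pi}{2}\sigma_3\big)
    = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad
d_1^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -\mathbf{1} .
\]
% Consequence (assuming multiplicativity in the constant label):
\[
w^{S,\sigma_S}_{d} = w^{S,\sigma_S}_{d_1 d_1} = \big(w^{S,\sigma_S}_{d_1}\big)^2 ,
\]
% so the statement ``the square of the Weyl operator is a unit'', applied to d_1,
% yields that the Weyl operator with label d is a unit.
```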
Lemma 7.10. Let w ∈ W be a Weyl operator for some quasi-surface S and some constant d ∈ (S), and let γ be an analytic edge, such that $P_{\nu_0}\pi(w)1_{\nu_0}$ is contained in the closure of span $B_\gamma$. If the image of γ is not completely contained in S, then w is a $\pi_{\nu_0}$-scalar.

Proof. Let m ∈ im γ \ S. If γ is closed, we may assume that m is not the base point of γ. Consider now for each g ∈ G the Weyl operator $w_{g,m}$ given by the quasi-surface $S_m := \{m\}$, whereas the orientation of $S_m$ is chosen such that the direction of γ coincides with the orientation of $S_m$. Since $S_m$ and S are disjoint, $w_{g,m}$ and w commute. Moreover, by Corollary 7.9, $w_{g,m}$ is a $\pi_{\nu_0}$-unit. Consequently, by Corollary 2.13, $w_{g,m}$ leaves $P_{\nu_0}\pi(w)1_{\nu_0}$ invariant.

• Let m be not an endpoint of γ. First of all, let $T = (T_{\boldsymbol{\gamma},\phi})^{i}_{j} \in B_{\gamma,\phi}$ for some non-trivial φ with $\phi_k = \phi$ for all k and with m being a vertex of $\boldsymbol{\gamma}$. One easily checks that
\[
w_{g,m}(T) = \sum_{r_1,r_2} \phi(g^2)^{r_1}_{r_2} \, (T_{\boldsymbol{\gamma},\phi})^{i(r_2)}_{j(r_1)} ,
\]
whereas i(r) is the tuple of all $i_k$ where the index belonging to the edge leaving at m is replaced by r. Hence,
\begin{align*}
\langle P_{\nu_0}\pi(w)1_{\nu_0}, T \rangle_{\mu_0}
 &= \langle w_{g,m}^{*}(P_{\nu_0}\pi(w)1_{\nu_0}), T \rangle_{\mu_0}
  = \langle P_{\nu_0}\pi(w)1_{\nu_0}, w_{g,m}(T) \rangle_{\mu_0} \\
 &= \sum_{r_1,r_2} \phi(g^2)^{r_1}_{r_2} \, \langle P_{\nu_0}\pi(w)1_{\nu_0}, (T_{\boldsymbol{\gamma},\phi})^{i(r_2)}_{j(r_1)} \rangle_{\mu_0}
\end{align*}
for all g ∈ G and therefore, since square roots exist in G,
\begin{align*}
\langle P_{\nu_0}\pi(w)1_{\nu_0}, T \rangle_{\mu_0}
 &= \int_G \langle P_{\nu_0}\pi(w)1_{\nu_0}, T \rangle_{\mu_0} \, d\mu_{\mathrm{Haar}}(g) \\
 &= \sum_{r_1,r_2} \langle P_{\nu_0}\pi(w)1_{\nu_0}, (T_{\boldsymbol{\gamma},\phi})^{i(r_2)}_{j(r_1)} \rangle_{\mu_0} \int_G \phi(g)^{r_1}_{r_2} \, d\mu_{\mathrm{Haar}}(g) \\
 &= 0 .
\end{align*}
Now, if $T = (T_{\boldsymbol{\gamma},\phi})^{i}_{j} \in B_{\gamma,\phi}$ for some non-trivial φ without m being a vertex of $\boldsymbol{\gamma}$, we may refine $\boldsymbol{\gamma}$ by inserting m as a new vertex. Then T is a (finite) sum of (γ, φ)-based gSNs each having m as a vertex of the underlying graph. Using the just shown result, we get $\langle P_{\nu_0}\pi(w)1_{\nu_0}, T \rangle_{\mu_0} = 0$ for all (γ, φ)-based gauge-variant spin network states. Altogether, this shows that $P_{\nu_0}\pi(w)1_{\nu_0}$ is orthogonal to all non-trivial gauge-variant spin network states, i.e., w is a $\pi_{\nu_0}$-scalar.

• Let m be an endpoint of γ. We argue analogously, using
\[
w_{g,m}(T) = \sum_{r} \phi(g)^{i_1}_{r} \, (T_{\boldsymbol{\gamma},\phi})^{i(r)}_{j}
\]
if m = γ(0), and similarly for m = γ(1).
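The vanishing of the Haar integral used above is the standard fact that a non-trivial irreducible representation of a compact group averages to zero. A short derivation, together with the simplest example G = U(1), is sketched below; the operator name P is introduced here only for the argument.

```latex
% Let \phi be a non-trivial irreducible unitary representation of the compact
% group G on V, and \mu the normalized Haar measure.  Define
\[
P := \int_G \phi(g)\, d\mu(g) .
\]
% Left invariance of \mu gives, for every h in G,
\[
\phi(h)\,P = \int_G \phi(hg)\, d\mu(g) = \int_G \phi(g)\, d\mu(g) = P ,
\]
% so every vector in the image of P is fixed by all \phi(h).  The fixed vectors
% form an invariant subspace; by irreducibility it is {0} or V, and V is
% excluded because \phi is non-trivial.  Hence
\[
P = 0, \qquad\text{i.e.}\qquad \int_G \phi(g)^{r_1}_{r_2}\, d\mu(g) = 0
\quad\text{for all } r_1, r_2 .
\]
% Example G = U(1), \phi_k(e^{i\theta}) = e^{ik\theta} with k \neq 0:
\[
\frac{1}{2\pi}\int_0^{2\pi} e^{ik\theta}\, d\theta
 = \frac{1}{2\pi}\left[\frac{e^{ik\theta}}{ik}\right]_0^{2\pi} = 0 .
\]
```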
Corollary 7.11. Let S be an open 1-dimensional ball in M given a nice orientation. Then $w := w^{S,\sigma_S}_d$ is a $\pi_{\nu_0}$-unit for every constant d ∈ (S).

Proof. Let γ be the edge whose interior is S and choose one of its orientations. By Lemma 7.7, $P_{\nu_0}\pi(w)1_{\nu_0}$ is orthogonal to all non-trivial gauge-variant spin network states that are not based on the edge γ. By Corollary 3.6, $P_{\nu_0}\pi(w)1_{\nu_0}$ is contained in the closure of the span of γ-based gSNs. Since, however, the endpoints of γ are not contained in S, Lemma 7.10 implies that w is a $\pi_{\nu_0}$-scalar. Now, by Proposition 6.10, there is some ϕ ∈ D being the identity on S, but inverting the orientation of S, i.e., $\alpha_\varphi(w) = w^*$. Corollary 2.11 implies that $w^2$ is a $\pi_{\nu_0}$-unit. As above, the assertion follows since square roots exist in G.

Corollary 7.12. Let S be an open 2-dimensional ball in M given a nice orientation. Then $w := w^{S,\sigma_S}_d$ is a $\pi_{\nu_0}$-unit for every constant d ∈ (S).

Proof. The image of an edge γ equals S, ∂S or $\overline{S}$ iff γ is a closed loop along $\partial S \cong S^1$. By Lemma 7.7, $P_{\nu_0}\pi(w)1_{\nu_0}$ is orthogonal to all non-trivial gauge-variant spin network states not based on such a γ. Hence, we have $P_{\nu_0}\pi(w)1_{\nu_0} \in \overline{\operatorname{span} B_\gamma}$ by Corollary 3.6. Observe that im γ ∩ S = ∅. Now argue as in Corollary 7.11.

Proposition 7.13. Let S be a finitely widely triangulizable subset in M having a natural orientation with dim S < dim M. Then $w := w^{S,\sigma_S}_d$ is a $\pi_{\nu_0}$-unit for every d ∈ (S) being constant on S.
Proof. • S is an open q-simplex having a nice orientation. By Corollary 6.11 and Proposition 6.14, S is oriented-strata equivalent to a nicely oriented q-ball. So we get the assertion, since q-balls lead to Weyl operators that are πν0 -units (Corollaries 7.9, 7.11 and 7.12) and since this property is inherited to all oriented-strata equivalent objects according to Lemma 7.6. • S is finitely widely triangulizable. This means, by definition, S is the finite disjoint union of nicely oriented simplices. Since disjoint unions lead to products of (commuting) Weyl operators (see Lemma 3.26), we get the assertion as well in the general case.
Proof of Proposition 7.5. Use Proposition 7.13 and Lemma 2.12, observing that each w ∈ W is a πν 0 -unit and that ADiff is generated by W, W and C(X ).
7.3. Classification.

Definition 7.1. Enlarged structure data are called optimal iff they are nice and
• M is at least three-dimensional;
• (S) contains
  – at most the constant functions together with their E-orbits;
• D contains
  – at least the stratified isomorphisms described in Subsect. 6.4,
  – at least those necessary to keep Proposition 6.14 valid;
• S contains
  – at least those hypersurfaces that are necessary to keep Proposition 6.21 valid,
  – at least the q-balls and q-simplices for q < dim M,
  – at most all D-orbits of semianalytic subsets in M having a finite wide triangulation and being of lower dimension than M.

Theorem 7.14. Let π be a representation of AAuto on H, such that π := π|A is regular and π := π|ADiff is D-natural. Moreover, let there exist some D-E-invariant vector in H being cyclic for π. Finally, let optimal enlarged structure data be given. Then π is unitarily equivalent to the fundamental representation of AAuto.

Proof. By Corollary 7.2, regularity and diffeomorphism invariance imply that there is a first-step decomposition of π, such that some $\mu_{\nu_0}$ is the Ashtekar-Lewandowski measure $\mu_0$ and that $1_{\nu_0}$ is D-invariant. Naturality, diffeomorphism invariance and Proposition 7.5 imply that each w ∈ W w.r.t. a constant d ∈ (S) is a $\pi_{\nu_0}$-unit. As $1_{\nu_0}$ is also E-invariant, w ∈ W is a $\pi_{\nu_0}$-unit for every d ∈ (S) by Lemma 2.4 and Proposition 3.38. Cyclicity gives the proof by Corollary 2.7.
We remark that the results above can be directly extended to semianalytic sets having the same dimension as M. Of course, the triangulizability has to be guaranteed and the intersection functions have to be adjusted. The latter one can be done, e.g., by setting σ S (γ ) for closed S to be one iff γ starts at the boundary of S and then leaves S nontangentially. The proofs, however, have to be modified accordingly. In particular, there is no longer an extra dimension available to mirror simplices and balls. Instead, we now use that there are diffeomorphisms mapping simplices (enlarged by one of its faces) onto two other, disjoint simplices whose union is the original simplex again (Proposition 6.13). The proofs that the corresponding Weyl operators are πν -units, should now use the first statement of Corollary 2.11 and proceed inductively on the dimension.
7.4. Discussion. Having now obtained the desired uniqueness theorem, we might ask whether the assumptions for it are reasonable.

7.4.1. Structure data. First of all, let us consider the enlarged structure data.

Lemma 7.15. The following enlarged structure data are optimal:
• M is an at least three-dimensional analytic manifold;
• G is a nontrivial connected compact Lie group;
• P consists of all piecewise analytic paths in M;
• E contains the generalized gauge transforms;28
• D contains the stratified analytic diffeomorphisms in M;
• S contains the semianalytic sets in M (together with their D-orbits) having lower dimension than M and having a finite wide triangulation;
• (S) contains the natural29 intersection functions of S;
• (S) contains the constant functions on M (together with their E-orbits);
• R contains the one-parameter subgroups of Weyl operators consistent with S, {(S)}, {(S)}.
From our point of view (see also the discussion in Subsect. 4.1), all ingredients are natural up to the restrictions on S and, maybe, on M and (S). The inclusion of semianalytic sets is reasonable, since the stratified diffeomorphisms map analytic hypersurfaces to semianalytic sets anyway. At the same time, the inclusion of lower-dimensional surfaces becomes natural. But, it would be desirable to at least replace the condition of wide triangulizability by the “standard” triangulizability, since in this case it is known that any semianalytic set is triangulizable. The requirement that each simplex in the triangulation is nicely oriented, is not too restrictive, since every naturally oriented, embedded surface is at least locally nicely oriented. The finiteness, on the other hand, cannot simply be dropped. This may at most be possible for compact M. In fact, then every semianalytic set has a compact closure and compact boundary. Then we may triangulize them finitely, by local finiteness. Redoing the procedure with the (lower-dimensional) semianalytic set given by the intersection of the original one with its boundary, we may successively get a finite decomposition of the original set into simplices. For non-compact M, this is no longer true. Simply take a hyperplane in R3 being triangulizable, of course, but not finitely. Well, although our proofs above have aimed at the finite case, we may extend the uniqueness result immediately to this example. Simply use that a hyperplane can be rotated onto itself inverting its orientation, and argue as in Corollary 7.9. In other words, it may be, as already mentioned above, that every analytic manifold is widely triangulizable; but even if not, there seems to be still some leeway in our argumentation above to keep the uniqueness given in the more general context. However, to explore this, several technical investigations in the field of semianalytic sets are necessary that go much beyond the scope of this paper. 
28 The statement remains true if we assume E to be any subset of G, e.g., just the (stratified analytic) gauge transforms of a particular principal fibre bundle.
29 To be precise, (S) should contain the natural intersection functions of S, if S is a submanifold. In the general case, include all intersection functions that are joint intersection functions given by the nice intersection functions for the submanifolds forming the respective triangulation. Finally, if necessary, collect all intersection functions generated by the action of D on stratified sets of the types mentioned previously.

We mentioned also the restriction that M has to be at least three-dimensional. Well, for quantum gravity this is no problem at all, since the space-like hypersurfaces are three-dimensional. The space-time is even four-dimensional, although this does not
seem relevant here, since we work with compact structure groups excluding the full covariant formulation of general relativity in four dimensions using the structure group S O(3, 1) or Sl(2, C). Nevertheless, we expect our result to be true in dimension 2 as well. In dimension 1, one should check it by hand – M can only be a line or a circle. Another issue concerns the choice of functions d ∈ (S) to label the stratified sets. Constant functions mark some minimal condition. On the other hand, our proofs in Subsect. 7.2 only go through for constant labellings. In fact, only these guarantee that diffeomorphisms mapping some S onto itself preserve even its labelling. The most obvious way out might be to add some stronger notion of regularity. In particular, we might reuse the idea of step functions for the definition of integrals. This means we should approximate an arbitrary (sufficiently “smooth”) function by simple functions, i.e., by sums of step functions, having sufficiently nice, disjoint supports. These sums now correspond to products of Weyl operators with constant labellings. Since these are represented identically, we would get the desired uniqueness for representations that are in this sense regular and if each d can be approximated this way. However, this approximation again may be in conflict with the triangulation problem above. Therefore, at this point, we state only the directly given Lemma 7.16. Besides nice enlarged structure data, assume that each (S) consists of some subset of continuous functions d : M −→ G. Equip (S) with the supremum norm on S induced by some fixed norm on G. Assume there is some sequence (di )i∈N with di → d in (S), such that for all i there are finitely many Si,ki forming a decomposition of S and each having a finite wide triangulation, whereas di is constant on each Si,ki . Then, given the assumptions of Theorem 7.14, π is equivalent to π0 , provided π is 0,S,σ S -regular for all S ∈ S and σ S ∈ (S). 
Recall that $\pi_0$ itself is always 0,S,σ S -regular, i.e., if $d_i$ converges pointwise on S to d, then the corresponding Weyl operators converge weakly.

Proof. Let d, S and $\sigma_S$ be fixed. The Weyl operators corresponding to $S_{i,k_i}$ and $d_i|_{S_{i,k_i}}$ are even $\pi_{\nu_0}$-units according to the proof of Theorem 7.14, hence each $w^{S,\sigma_S}_{d_i}$ as well, by Lemma 3.26. Proposition 2.22 and the 0,S,σ S -regularity imply that $w^{S,\sigma_S}_{d}$ is a $\pi_{\nu_0}$-unit as well. Corollary 2.7 gives the proof.
7.4.2. Further assumptions. Let us now say a few words about the other assumptions of Theorem 7.14. That we restrict ourselves to cyclic representations, is no restriction at all, since any (non-degenerate) representation can be decomposed into cyclic ones. Rather, the assumption that there is a cyclic vector being at the same time diffeomorphism invariant, is a restriction. This means that we only consider theories having a diffeomorphism invariant “vacuum”. Well, this may be justified by the corresponding invariance of general relativity leading to some special kind of quantum geometry. Next, we assumed at least the “standard” regularity mapping weakly continuous one-parameter subgroups into weakly continuous ones. It may be desirable to drop this assumption; however, even in the classical theory of quantum mechanics, the Stone-von Neumann theorem relies on the regularity assumption. Indeed, it is very difficult to prove results without referring to it. However, in our case, there may be some hope, since the diffeomorphism group is that large and may thence identify so many objects in order to, possibly, replace some or all of the regularity assumptions. The naturality of the action of diffeomorphisms is discussed below.
7.4.3. Improvements. Finally, we would like to emphasize that we were able to drop a crucial assumption and to weaken another made in the paper [35] by Sahlmann and Thiemann: First of all, we did not need any assumptions about the domains of the operators. This was possible, since we are working with the exponentiated Weyl operators from the very beginning. The only point where we went down to the non-exponentiated regime was in Subsect. 7.1 (and Appendix B). But even there, we did not do this for generators of the represented Weyl operators. In fact, we only used results for the convergence of the genuine Weyl operators w.r.t. the supremum norm. This way, we get some “analytic” convergence at the exponentiated level that, afterwards, leads to the emergence of the Ashtekar-Lewandowski measure by splitting and regularity. Secondly, we significantly weakened the assumptions on the representation of the diffeomorphisms. Although we re-used the name “natural” representation, our definition imposes far fewer restrictions than that in [35]. There, the action of diffeomorphisms is said to be natural if it is the pull-back representation of D on each $L^2(A, \mu_\nu)$. This, however, is well defined only if the pull-back action of D is well defined on $L^2(A, \mu_\nu)$. In general, it is not. To see this, we use again the general notation of Sect. 2. Namely, let $\mu_\nu$ be the Dirac measure at some $x \in X$ that is not invariant w.r.t. W. Then $H_\nu = L^2(X, \mu_\nu)$ is isomorphic to $\mathbb{C}$ by $\psi \mapsto \psi(x)$ for any measurable function $\psi$ on X. Take some $w \in W$ with $\xi_w(x) \neq x$, and let the (even continuous) function $\psi$ be one at $\xi_w(x)$, but zero at $x$. Therefore, $\psi = 0$ in $H_\nu$, but, at the same time, $(w(\psi))(x) = \psi(\xi_w(x)) = 1$, hence $w(\psi) = 1$ in $H_\nu$. In other words, the extension of the pull-back mapping from C(X) to $L^2(X, \mu_\nu)$ is ill defined. Additionally, one sees immediately that, even if the pull-back representation is well defined, it is unitary only if $\mu_\nu$ is W-invariant.
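The Dirac-measure counterexample above can be replayed on a two-point space. The following is only a toy sketch: the space `X`, the swap `xi_w` and all function names are illustrative assumptions, not objects from the article.

```python
# Two-point toy model of the counterexample (all names are illustrative).
X = ("x", "x_prime")

def xi_w(p):
    """A bijection of X that moves x: the swap of the two points."""
    return "x_prime" if p == "x" else "x"

def dirac_norm_sq(psi, at="x"):
    """Squared L^2-norm of psi w.r.t. the Dirac measure concentrated at `at`."""
    return abs(psi(at)) ** 2

def pull_back(psi):
    """Pull-back of psi along xi_w: (w psi)(p) = psi(xi_w(p))."""
    return lambda p: psi(xi_w(p))

def psi(p):
    """One at xi_w(x), zero at x."""
    return 1.0 if p == xi_w("x") else 0.0

norm_before = dirac_norm_sq(psi)            # psi vanishes at x
norm_after = dirac_norm_sq(pull_back(psi))  # the pull-back equals one at x
print(norm_before, norm_after)
```

In this model `psi` represents the zero vector of the one-dimensional Hilbert space, while its pull-back has norm one, so the pull-back does not descend to a well-defined map on equivalence classes.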
This, of course, restricts the possible measures drastically. We, instead, defined naturality (see Definition 2.5) much less restrictive. Firstly, we do not refer to the pull-back representation at all. Secondly, before we impose conditions on the projection of π to certain Hν , we check whether Hν is invariant w.r.t. W . Only then and only if µν1 and µν2 coincide, we, thirdly, require that the respective projections of π coincide. This way, the problems indicated above are circumvented. Nevertheless, one should think why one required naturality at all in our case X = A and W ∼ = D. Recall that there are three different objects to be considered: the continuous functions on A, the Weyl operators w ∈ W and the diffeomorphisms αϕ ∈ W . The first two of them are dynamical, the last one is just a constraint. Therefore, it is reasonable to distinguish between them. For instance, it is not required that π is regular. In fact, the diffeomorphisms act arbitrarily non-continuously on H already in the fundamental representation: Given some ψ ∈ L 2 (A, µ0 ), say a spin network state on some graph γ , then αϕ (ψ) is orthogonal to ψ provided γ is not preserved by ϕ (actually being a negligible restriction). Moreover, since C(A) is continuous, in any case, we may decompose the restriction of any representation to C(A) into canonical representations on A w.r.t. certain measures. It now may be conceivable that, if the continuous functions on A cannot distinguish between two of these addends, then the purely kinematical, constraining part cannot either. In other words, if two measures in the first-step decomposition coincide, then the induced representations should be identified. There is no obvious reason why diffeomorphisms should not keep the addends of the first-step decomposition invariant – but, there is also no reason why they should. 
Therefore, although it might be reasonable to restrict oneself to natural representations of diffeomorphisms, this assumption does not seem to be absolutely desirable. If we do this, however, observe that the arguments above should not be applied to the Weyl operators. Indeed, first of all, these are dynamical
objects, and, secondly, they act on a higher level, namely, on A affected by the dynamics and not on the paths being the ultimate constituents of the theory and being the domain for the homomorphisms in A.
7.4.4. Main open issues. Of course, the remarks above are by no means final answers to the question of why one should consider just these assumptions. At least from the mathematical point of view, it would be highly desirable to have more general results available. We have given some hints here for direct extensions; however, the field is still open, in particular:
Question 1. Is naturality w.r.t. diffeomorphisms necessary?
Question 2. Is regularity w.r.t. Weyl operators necessary?
Acknowledgements. The author thanks Abhay Ashtekar, Jerzy Lewandowski, Stefan Müller, Andrzej Okołów, Hans-Bert Rademacher, Hanno Sahlmann, Konrad Schmüdgen, Matthias Schwarz, Thomas Thiemann, Rainer Verch and Elmar Wagner for fruitful discussions. Moreover, the author is very grateful to Garth Warner for his remarks and, in particular, for pointing out a mistake in an earlier version of this article. Fortunately, all the results remained valid. Additionally, the author thanks the three anonymous referees for their very valuable comments and suggestions helping him to improve the article. The author has been supported in part by the Emmy-Noether-Programm (grant FL 622/1-1) of the Deutsche Forschungsgemeinschaft.
Appendix A. Continuity Criterion
Lemma A.1. Let Y be some sequential topological space. Let X be a Banach space and let $\lambda : Y \to B(X)$ be some map. Moreover, let $\|\lambda(\cdot)\| : Y \to \mathbb{R}$ be locally bounded. Assume, finally, that there is some subset $E \subseteq X$, such that $y \mapsto \lambda(y)e$ is continuous for all $e \in E$ and that span E is dense in X. Then $y \mapsto \lambda(y)x$ is continuous for all $x \in X$.
Proof. Fix some $y' \in Y$ and choose a neighbourhood $U$ of $y'$, such that $\|\lambda(\cdot)\|$ is bounded on $U$, say, by $c$. Let now $\varepsilon > 0$ and $x \in X$. Then there are $x_1, x_2 \in X$ with $x_1 \in \mathrm{span}\, E$ and $\|x_2\| \le \varepsilon$, such that $x = x_1 + x_2$. Since $y \mapsto \lambda(y)e$ is continuous for $e \in E$, so it is for $e \in \mathrm{span}\, E$. Hence, there is some neighbourhood $U' \subseteq U$ of $y'$, such that $\|\lambda(y)x_1 - \lambda(y')x_1\| \le \varepsilon$ for all $y \in U'$. Consequently,
\[
\begin{aligned}
\|\lambda(y)x - \lambda(y')x\| &\le \|\lambda(y)x_1 - \lambda(y')x_1\| + \|\lambda(y)x_2 - \lambda(y')x_2\| \\
&\le \|\lambda(y)x_1 - \lambda(y')x_1\| + \bigl(\|\lambda(y)\| + \|\lambda(y')\|\bigr)\|x_2\| \\
&\le (2c + 1)\varepsilon
\end{aligned}
\]
for all $y \in U'$. Hence, $y \mapsto \lambda(y)x$ is continuous in $y'$ for all $x \in X$. Since $y'$ was arbitrary, we get the proof.
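B. Two Estimates
Lemma B.1. Let H be some Hilbert space and $N \in \mathbb{N}$. Moreover, let $A$, $A_i$ and $B_i$ be linear continuous operators on H, such that $\|A\| \le 1$ and $\|B_i\| \le 1$ for all $i = 1, \dots, N$. Then
\[
\Bigl\| \prod_{i=1}^N A_i B_i - \prod_{i=1}^N A B_i \Bigr\| \le \prod_{i=1}^N \bigl(1 + \|A_i - A\|\bigr) - 1.
\]
Proof. We have
\[
\begin{aligned}
\Bigl\| \prod_{i=1}^N A_i B_i - \prod_{i=1}^N A B_i \Bigr\|
&= \Bigl\| \prod_{i=1}^N \bigl(A + [A_i - A]\bigr) B_i - \prod_{i=1}^N A B_i \Bigr\| \\
&\le \prod_{i=1}^N \bigl(\|A B_i\| + \|(A_i - A) B_i\|\bigr) - \prod_{i=1}^N \|A B_i\| \\
&\le \prod_{i=1}^N \bigl(\|A\| + \|A_i - A\|\bigr) - \prod_{i=1}^N \|A\| \\
&\le \prod_{i=1}^N \bigl(1 + \|A_i - A\|\bigr) - 1.
\end{aligned}
\]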
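The estimate of Lemma B.1 can be spot-checked numerically. The sketch below is not part of the article: it works with small real matrices and, as an assumption made for convenience, uses the max-row-sum norm instead of the Hilbert-space operator norm. This is legitimate for a sanity check because the proof only uses the triangle inequality and submultiplicativity. All helper names are hypothetical.

```python
import random

def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(P, Q):
    """Entrywise difference P - Q."""
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def add(P, Q):
    """Entrywise sum P + Q."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def norm(P):
    """Max-row-sum norm (a submultiplicative stand-in for the operator norm)."""
    return max(sum(abs(v) for v in row) for row in P)

def random_matrix(n, rng, size):
    """Random n x n matrix rescaled so that its norm is `size` up to rounding."""
    M = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    s = size / norm(M)
    return [[s * v for v in row] for row in M]

def product(factors):
    """Ordered product factors[0] * factors[1] * ..."""
    result = factors[0]
    for F in factors[1:]:
        result = matmul(result, F)
    return result

def check_lemma_b1(n=3, N=5, seed=0):
    """Return both sides of the estimate for one random instance."""
    rng = random.Random(seed)
    A = random_matrix(n, rng, 0.9)                         # ||A|| <= 1
    Bs = [random_matrix(n, rng, 0.9) for _ in range(N)]    # ||B_i|| <= 1
    As = [add(A, random_matrix(n, rng, 0.05)) for _ in range(N)]
    lhs = norm(sub(product([matmul(Ai, Bi) for Ai, Bi in zip(As, Bs)]),
                   product([matmul(A, Bi) for Bi in Bs])))
    rhs = 1.0
    for Ai in As:
        rhs *= 1.0 + norm(sub(Ai, A))
    return lhs, rhs - 1.0

lhs, rhs = check_lemma_b1()
print(lhs <= rhs)
```

The norms 0.9 and 0.05 are arbitrary choices that keep the hypotheses $\|A\| \le 1$, $\|B_i\| \le 1$ safely satisfied in floating point.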
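Lemma B.2. Let G be a connected, compact (hence linear) Lie group and let $\varphi$ be an irreducible representation of G on $V_\varphi$. Moreover, let $\{X_i\}_{i=1}^n$ be a basis of the Lie algebra $\mathfrak{g}$ of G, such that $-\frac{1}{n}\sum_i \varphi(X_i)\varphi(X_i)$ is (up to the prefactor) the (quadratic) Casimir operator $C_\varphi$ for $\varphi$. Set
\[
c_{\varphi,\mathfrak{g}}(t) := \frac{1}{2n} \sum_{i=1}^n \bigl(\varphi(e^{tX_i}) + \varphi(e^{-tX_i})\bigr).
\]
Then, for all $t_0 > 0$, there is some $\eta(t_0) > 0$ with
\[
\Bigl\| \varphi \otimes \bigotimes_{j=1}^J \bigl(c_{\varphi,\mathfrak{g}}(t) \cdot \varphi\bigr) - e^{-\frac{1}{2}\lambda_\varphi J t^2} \bigotimes_{j=0}^J \varphi \Bigr\|_\infty \le e^{\eta(t_0) J t^4} - 1
\]
for all $|t| < t_0$ and all positive integers J. Here, $\lambda_\varphi$ is the Casimir eigenvalue w.r.t. $\varphi$, and $\|\cdot\|_\infty$ denotes the supremum norm in $G^{J+1}$ induced by the standard operator norm $\|\cdot\|$ on $V_\varphi$.
Proof. Let
\[
f_1(t) := c_{\varphi,\mathfrak{g}}(t) \quad\text{and}\quad f_2(t) := e^{-\frac{1}{2}\lambda_\varphi t^2} \varphi(1).
\]
We have $f_1(0) = \varphi(1) = f_2(0)$. Next, $f_1''(0) = \frac{1}{n}\sum_{i=1}^n \varphi(X_i)\varphi(X_i) = -C_\varphi$. Since $C_\varphi = \lambda_\varphi \varphi(1)$, we have $f_1''(0) = -\lambda_\varphi \varphi(1) = f_2''(0)$. Since, moreover, the derivatives of odd degree vanish for both $f_1$ and $f_2$, the derivatives of $f_1$ and $f_2$ coincide up to degree 3. Hence, for every $t_0 > 0$ there is some $\eta(t_0) > 0$, such that $\|f_1(t) - f_2(t)\| < \eta(t_0) t^4$ for all $|t| < t_0$. Here, we used the analyticity of $f_1$ and $f_2$ on full $\mathbb{R}$. Using Lemma B.1 and $\|\varphi(g)\| = 1$ (by unitarity of G), we get
\[
\begin{aligned}
\Bigl\| \varphi(g_0) \prod_{j=1}^J c_{\varphi,\mathfrak{g}}(t)\, \varphi(g_j) - e^{-\frac{1}{2}\lambda_\varphi J t^2} \prod_{j=0}^J \varphi(g_j) \Bigr\|
&\le \prod_{j=1}^J \Bigl(1 + \bigl\| c_{\varphi,\mathfrak{g}}(t) - e^{-\frac{1}{2}\lambda_\varphi t^2} \varphi(1) \bigr\|\Bigr) - 1 \\
&\le \prod_{j=1}^J \bigl(1 + \eta(t_0) t^4\bigr) - 1 \\
&\le e^{\eta(t_0) J t^4} - 1
\end{aligned}
\]
for all $g_0, \dots, g_J \in G$ and all $|t| < t_0$.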
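For intuition, the bound of Lemma B.2 can be made explicit in the simplest case $G = U(1)$. The sketch below is an illustration under assumptions that are not taken from the article: the representation is $\varphi_k(e^{i\theta}) = e^{ik\theta}$, the Lie algebra basis is the single element $X_1 = i$, so that $c(t) = \cos(kt)$ and $\lambda_\varphi = k^2$, and the constant `eta` is estimated on a finite grid rather than derived analytically.

```python
import cmath
import math

def c_phi(k, t):
    """c(t) = (phi(e^{tX}) + phi(e^{-tX})) / 2 for U(1), i.e. cos(k t)."""
    return 0.5 * (cmath.exp(1j * k * t) + cmath.exp(-1j * k * t))

def gaussian(k, t, J=1):
    """exp(-lambda J t^2 / 2) with Casimir eigenvalue lambda = k^2."""
    return math.exp(-0.5 * k * k * J * t * t)

def check_u1(k=2, t0=1.0, J=7, samples=200):
    """Estimate eta on a grid in (0, t0) and return the worst violation."""
    ts = [t0 * s / (samples + 1) for s in range(1, samples + 1)]
    # Grid estimate of eta(t0): sup of |f1(t) - f2(t)| / t^4.
    eta = max(abs(c_phi(k, t) - gaussian(k, t)) / t ** 4 for t in ts)
    # All phi(g_j) are phases, so the left-hand side is |c^J - e^{-lambda J t^2/2}|.
    worst = max(abs(c_phi(k, t) ** J - gaussian(k, t, J))
                - math.expm1(eta * J * t ** 4) for t in ts)
    return eta, worst

eta, worst = check_u1()
print(eta, worst <= 1e-12)
```

Since $\cos x - e^{-x^2/2} = -x^4/12 + O(x^6)$, one expects `eta` to be close to $k^4/12$ for small $t_0$; a non-positive `worst` (up to rounding) means the bound held at every grid point.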
C. “Bumpy” Stratified Isomorphisms
Lemma C.1. Let $\tau_1$ and $\tau_2$ be real numbers with $\tau_1 < \tau_2$. Moreover, let $0 < \varepsilon < \frac{1}{2}(\tau_2 - \tau_1)$ and $a > 0$. Finally, let $n \ge 2$ be an integer and define
\[
C := [\tau_1 - \varepsilon, \tau_2 + \varepsilon] \times [-2\varepsilon, 2a + 2\varepsilon] \times B^{n-2}_{2\varepsilon} \subseteq \mathbb{R}^n,
\]
where $B^m_r$ is the ball around the origin in m dimensions with radius r. Then there is a stratified analytic isomorphism $\varphi$ of $\mathbb{R}^n$ with the following properties (see also Fig. 1 in Subsect. 6.4.1):
• $\varphi$ is the identity outside C;
• $\varphi$ changes the second (i.e., y-)coordinate only;
• $\varphi$ maps the first (i.e., x-)coordinate axis (restricted to $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$) to the union of straight lines connecting the points $(\tau_1 - \varepsilon, 0, \vec{0})$, $(\tau_1 + \varepsilon, 2a, \vec{0})$, $(\tau_2 - \varepsilon, 2a, \vec{0})$, and $(\tau_2 + \varepsilon, 0, \vec{0})$.
Proof. W.l.o.g. we may assume that $\tau := \tau_2 = -\tau_1$. Decompose C into the 18 subsets (see footnote 30)
\[
G_{ij0} := G_{ij} \times B^{n-2}_{\varepsilon} \quad\text{and}\quad G_{ij+} := G_{ij} \times \bigl(B^{n-2}_{2\varepsilon} \setminus \operatorname{int} B^{n-2}_{\varepsilon}\bigr)
\]
having overlapping boundaries (for the definition of $G_{ij}$, see Fig. 1). We are going to explicitly construct a diffeomorphism $\varphi$ mapping $G_{ij*}$ onto some $H_{ij*}$. Before stating the explicit formulae, we explain them verbally for n = 2. $G_{11}$ is mapped to $H_{11}$, such that lines parallel to the x-axis are mapped to lines through $(x_0, y_0)$, whereas the line $x = -\tau - \varepsilon$ is preserved pointwise. The mapping between $G_{12}$ and $H_{12}$ simply makes (mutually parallel) sloped lines out of lines parallel to the x-axis. $G_{13}$ is mapped to $H_{13}$ similarly as $G_{11}$ to $H_{11}$. The maps $G_{2i} \to H_{2i}$ map a line parallel to the x-axis again to such a line. The shift is completely determined by the shift on the left boundaries of the $G_{2i}$. These, of course, are already given by maps of the right boundaries of $G_{1i}$. The maps for $G_{3i}$ will not be given explicitly. They just follow by the reflection symmetry w.r.t. x = 0.
The ideas above largely fix $\varphi$. We only have to take care of the matching conditions in the $\vec{z}$-directions. Here, we introduce a “fall-off” when $\|\vec{z}\|$ is in $[\varepsilon, 2\varepsilon]$. For this, we define
\[
g(\vec{z}) := \tfrac{1}{2}\Bigl(1 - \cos\bigl(\tfrac{\pi}{\varepsilon}\|\vec{z}\|\bigr)\Bigr).
\]
In the following, we will use that for any analytic function $h : \mathbb{R} \times \mathbb{R}^{n-2} \to \mathbb{R}$ and any $y_0 \in \mathbb{R}$,
\[
\varphi_{\mathrm{aux}} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \bigl(x,\; y + h(x, \vec{z})(y_0 - y),\; \vec{z}\bigr)
\]
is invertible analytically on $U_h := \{(x, y, \vec{z}) \mid h(x, \vec{z}) \neq 1\}$ by
\[
\varphi_{\mathrm{aux}}^{-1} : U_h \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \Bigl(x,\; \frac{y - h(x, \vec{z})\, y_0}{1 - h(x, \vec{z})},\; \vec{z}\Bigr).
\]
Let us now state the diffeomorphism setting $y_0 := a + 2\varepsilon$:
30 Of course, if n = 3, there are 27 connected components, and for n = 2 there are only 9. We drop the corresponding cases here.
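• Define
\[
\varphi_{110} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \Bigl(x,\; y + \frac{a}{\varepsilon}\,\frac{x + \tau + \varepsilon}{2a + \varepsilon}\,(y_0 - y),\; \vec{z}\Bigr)
\]
and
\[
\varphi_{11+} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \Bigl(x,\; y + g(\vec{z})\,\frac{a}{\varepsilon}\,\frac{x + \tau + \varepsilon}{2a + \varepsilon}\,(y_0 - y),\; \vec{z}\Bigr).
\]
Both $\varphi_{110}$ and $\varphi_{11+}$ are analytically invertible on $x < -\tau + \varepsilon + \frac{\varepsilon^2}{a}$ and are the identity on $x = -\tau - \varepsilon$. Moreover, they coincide on $G_{110} \cap G_{11+}$. Finally, $\varphi_{11+}$ is the identity on $\|\vec{z}\| = 2\varepsilon$.
• Define
\[
\varphi_{120} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \Bigl(x,\; y + \frac{a}{\varepsilon}(x + \tau + \varepsilon),\; \vec{z}\Bigr)
\]
and
\[
\varphi_{12+} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \Bigl(x,\; y + g(\vec{z})\,\frac{a}{\varepsilon}(x + \tau + \varepsilon),\; \vec{z}\Bigr).
\]
Both $\varphi_{120}$ and $\varphi_{12+}$ are analytically invertible on full $\mathbb{R}^n$, coincide on $G_{120} \cap G_{12+}$ and are the identity on $x = -\tau - \varepsilon$. In particular, observe that
\[
\varphi_{120}(-\tau - \varepsilon, y, \vec{0}) = (-\tau - \varepsilon, y, \vec{0}) \quad\text{and}\quad \varphi_{120}(-\tau + \varepsilon, y, \vec{0}) = (-\tau + \varepsilon, y + 2a, \vec{0}),
\]
i.e., $\varphi_{120}(-\tau, 0, \vec{0}) = (-\tau, a, \vec{0})$.
• The maps $\varphi_{13*} : \mathbb{R}^n \to \mathbb{R}^n$ are defined analogously to the case of $\varphi_{11*}$.
• The maps $\varphi_{2i*}$ are given by
\[
\varphi_{2i*} : \mathbb{R}^n \to \mathbb{R}^n, \qquad (x, y, \vec{z}) \mapsto \bigl(x,\; \mathrm{pr}_y\, \varphi_{1i*}(-\tau + \varepsilon, y, \vec{z}),\; \vec{z}\bigr),
\]
where $\mathrm{pr}_y$ is the projection to the y-component.
• The remaining maps $\varphi_{3i*}$ are defined using the reflection symmetry w.r.t. x = 0.
One immediately checks that $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ defined by $\varphi|_{G_{ij*}} := \varphi_{ij*}$ and $\varphi|_{\mathbb{R}^n \setminus C} := \mathrm{id}$ is a well-defined stratified analytic isomorphism with the desired properties.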
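The auxiliary map used in the proof of Lemma C.1 and its inverse can be checked on concrete numbers. The following sketch is an illustration only: the values of `tau`, `eps`, `a`, the choice n = 4 (so that the transversal coordinate is a pair) and all function names are assumptions made for the example.

```python
import math

def make_phi_aux(h, y0):
    """Return the map (x, y, z) -> (x, y + h(x, z)(y0 - y), z) and its inverse."""
    def forward(x, y, z):
        return (x, y + h(x, z) * (y0 - y), z)

    def inverse(x, y, z):
        hv = h(x, z)
        if hv == 1.0:
            raise ValueError("h(x, z) = 1: point lies outside the domain U_h")
        return (x, (y - hv * y0) / (1.0 - hv), z)

    return forward, inverse

# Illustrative parameters (assumed, not from the article).
tau, eps, a = 1.0, 0.25, 0.5
y0 = a + 2 * eps

def g(z):
    """Fall-off in the transversal directions: 1 at |z| = eps, 0 at |z| = 2 eps."""
    r = math.sqrt(sum(c * c for c in z))
    return 0.5 * (1.0 - math.cos(math.pi * r / eps))

def h_110(x, z):
    """Coefficient of (y0 - y) in phi_110."""
    return (a / eps) * (x + tau + eps) / (2 * a + eps)

def h_11plus(x, z):
    """Coefficient of (y0 - y) in phi_11+."""
    return g(z) * h_110(x, z)

def phi_120(x, y, z):
    """Shear that turns horizontal lines into sloped ones."""
    return (x, y + (a / eps) * (x + tau + eps), z)

forward, inverse = make_phi_aux(h_11plus, y0)
p = (-1.1, 0.3, (0.3, 0.0))
q = inverse(*forward(*p))
print(p, q)
print(phi_120(-tau, 0.0, (0.0, 0.0)))
```

The round trip `inverse(forward(p))` reproduces `p` up to rounding, and `h_110` reaches the value 1 exactly at $x = -\tau + \varepsilon + \varepsilon^2/a$, which is where invertibility of $\varphi_{110}$ is lost.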
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References 1. Ashtekar, A.: New Hamiltonian formulation of general relativity. Phys. Rev. D 36, 1587–1602 (1987) 2. Ashtekar, A., Isham, C.J.: Representations of the holonomy algebras of gravity and nonabelian gauge theories. Class. Quant. Grav. 9, 1433–1468 (1992) 3. Ashtekar, A., Lewandowski, J.: Differential geometry on the space of connections via graphs and projective limits. J. Geom. Phys. 17, 191–230 (1995) 4. Ashtekar, A., Lewandowski, J.: Projective techniques and functional integration for gauge theories. J. Math. Phys. 36, 2170–2191 (1995) 5. Ashtekar, A., Lewandowski, J.: Quantum theory of geometry I: Area operators. Class. Quant. Grav. 14, A55–A82 (1997) 6. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C ∗ algebras. In: Baez, J.C. (ed.) Knots and Quantum Gravity (Riverside, CA, 1993), Oxford Lecture Series in Mathematics and its Applications 1, pp. 21–61, Oxford University Press, Oxford (1994)
7. Baez, J.C., Sawin, S.: Diffeomorphism-invariant spin network states. J. Funct. Anal. 158, 253–266 (1998) 8. Baez, J.C., Sawin, S.: Functional integration on spaces of connections. J. Funct. Anal. 150, 1–26 (1997) 9. Bahr, B., Thiemann, T.: Automorphisms in loop quantum gravity. http://arxiv.org/list/0711.0373[gr-qc] (2007) 10. Bierstone, E., Milman, P.D.: Semianalytic and subanalytic sets. Publ. Math. IHES 67, 5–42 (1988) 11. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics, vol. 2 (Equilibrium States, Models in Quantum Statistical Mechanics). Springer-Verlag, New York (1996) 12. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics, vol. 1 (C ∗ - and W ∗ -Algebras, Symmetry Groups, Decomposition of States). Springer-Verlag, New York (1987) 13. Fleischhack, Ch.: Construction of generalized connections. http://arxiv.org/list/math-ph/0601005 (2006) 14. Fleischhack, Ch.: Hyphs and the Ashtekar-Lewandowski Measure. J. Geom. Phys. 45, 231–251 (2003) 15. Fleischhack, Ch.: Irreducibility of the Weyl Algebra in Loop Quantum Gravity. Phys. Rev. Lett. 97, 061302 (2006) 16. Fleischhack, Ch.: Mathematische und physikalische Aspekte verallgemeinerter Eichfeldtheorien im Ashtekarprogramm (Dissertation). Universität Leipzig (2001) 17. Fleischhack, Ch.: Proof of a Conjecture by Lewandowski and Thiemann. Commun. Math. Phys. 249, 331–352 (2004) 18. Fleischhack, Ch.: Regular connections among generalized connections. J. Geom. Phys. 47, 469–483 (2003) 19. Fleischhack, Ch.: Stratification of the Generalized Gauge Orbit Space. Commun. Math. Phys. 214, 607–649 (2000) 20. Gallagher, P.X.: Zeros of group characters. Math. Zeitschr. 87, 363–364 (1965) 21. Hardt, R.M.: Stratification of Real Analytic Mappings and Images. Invent. Math. 28, 193–208 (1975) 22. Huebsch, W., Morse, M.: Diffeomorphisms of manifolds. Rend. Circ. Mat. Palermo 11, 291–318 (1962) 23. 
Kamiński, W., Lewandowski, J., Bobieński, M.: Background independent quantizations: the scalar field I. Class. Quant. Grav. 23, 2761–2770 (2006) 24. Kamiński, W., Lewandowski, J., Okołów, A.: Background independent quantizations: the scalar field II. Class. Quant. Grav. 23, 5547–5586 (2006) 25. Lewandowski, J., Okołów, A.: Automorphism covariant representations of the holonomy-flux ∗-algebra. Class. Quant. Grav. 22, 657–680 (2005) 26. Lewandowski, J., Okołów, A., Sahlmann, H., Thiemann, T.: Uniqueness of diffeomorphism invariant states on holonomy-flux algebras. Commun. Math. Phys. 267, 703–733 (2006) 27. Lewandowski, J., Thiemann, T.: Diffeomorphism invariant quantum field theories of connections in terms of webs. Class. Quant. Grav. 16, 2299–2322 (1999) 28. Łojasiewicz, S.: On semi-analytic and subanalytic geometry. In: Panoramas of mathematics (Warszawa, 1992/1994), Banach Center Publications 34, pp. 89–104, Polish Academy of Sciences, Warszawa (1995) 29. Łojasiewicz, S.: Triangulation of semi-analytic sets. Ann. Scuola Norm. Sup. Pisa 18, 449–474 (1964) 30. Okołów, A., Lewandowski, J.: Diffeomorphism covariant representations of the holonomy-flux ∗-algebra. Class. Quant. Grav. 20, 3543–3568 (2003) 31. Pezzana, M.: Strong analytic triangulation for manifolds (Review of an article by Massimo Ferrarotti (Boll. Un. Mat. Ital. A 17, 79–84 (1980))). Math. Rev. 81e, 32012 (1981) 32. Sahlmann, H.: Some results concerning the representation theory of the algebra underlying loop quantum gravity. http://arxiv.org/list/gr-qc/0207111 (2002) 33. Sahlmann, H.: When do measures on the space of connections support the triad operators of loop quantum gravity? http://arxiv.org/list/gr-qc/0207112 (2002) 34. Sahlmann, H., Thiemann, T.: Irreducibility of the Ashtekar-Isham-Lewandowski Representation. Class. Quant. Grav. 23, 4453–4472 (2006) 35. Sahlmann, H., Thiemann, T.: On the superselection theory of the Weyl algebra for diffeomorphism invariant quantum gauge theories. 
http://arxiv.org/list/gr-qc/0302090 (2003) 36. Takesaki, M.: Theory of Operator Algebras I (Encyclopaedia of Mathematical Sciences 124). Springer-Verlag, Berlin (2002) 37. Whitehead, J.H.C.: On $C^1$-complexes. Annals of Math. 41, 809–824 (1940) 38. Whitney, H.: Geometric Integration Theory. Princeton University Press, Princeton, NJ (1957) 39. Zeidler, E. (Hrsg.): Taschenbuch der Mathematik, Bd. 1. Teubner-Verlag, Leipzig, Stuttgart (1996)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 141–160 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0461-1
Quantum Group of Isometries in Classical and Noncommutative Geometry Debashish Goswami Stat-Math Unit, Kolkata Centre, Indian Statistical Institute, 203, B. T. Road, Kolkata 700 108, India. E-mail:
[email protected] Received: 29 October 2007 / Accepted: 16 December 2007 Published online: 28 March 2008 – © Springer-Verlag 2008
Abstract: We formulate a quantum generalization of the notion of the group of Riemannian isometries for a compact Riemannian manifold, by introducing a natural notion of smooth and isometric action by a compact quantum group on a classical or noncommutative manifold described by spectral triples, and then proving the existence of a universal object (called the quantum isometry group) in the category of compact quantum groups acting smoothly and isometrically on a given (possibly noncommutative) manifold satisfying certain regularity assumptions. The idea of ‘quantum families’ (due to Woronowicz and Soltan) is relevant to our construction. A number of explicit examples are given and possible applications of our results to the problem of constructing quantum group equivariant spectral triples are discussed.
1. Introduction Since the formulation of quantum automorphism groups by Wang ([17,18]), following suggestions of Alain Connes, many interesting examples of such quantum groups, particularly the quantum permutation groups of finite sets and finite graphs, have been extensively studied by a number of mathematicians (see, e.g. [1–3,19] and references therein), who have also found applications to and interaction with areas like free probability and subfactor theory. The underlying basic principle of defining a quantum automorphism group corresponding to some given mathematical structure (for example, a finite set, a graph, a C ∗ or von Neumann algebra) consists of two steps: first, to identify (if possible) the group of automorphisms of the structure as a universal object in a suitable category, and then, try to look for the universal object in a similar but bigger category by replacing groups by quantum groups of appropriate type. Supported in part by the Indian National Academy of Sciences.
However, most of the work done so far concerns some kind of quantum automorphism groups of a ‘finite’ structure, for example, of finite sets or finite dimensional matrix algebras. It is thus quite natural to try to extend these ideas to the ‘infinite’ or ‘continuous’ mathematical structures, for example classical and noncommutative manifolds. In the present article, we have made an attempt to formulate and study the quantum analogues of the groups of Riemannian isometries, which play a very important role in the classical differential geometry. The group of Riemannian isometries of a compact Riemannian manifold M can be viewed as the universal object in the category of all compact metrizable groups acting on M, with smooth and isometric action. Therefore, to define the quantum isometry group, it is reasonable to consider a category of compact quantum groups which act on the manifold (or more generally, on a noncommutative manifold given by spectral triple) in a ‘nice’ way, preserving the Riemannian structure in some suitable sense, to be precisely formulated. In this article, we have given a definition of such ‘smooth and isometric’ action by a compact quantum group on a (possibly noncommutative) manifold, extending the notion of smooth and isometric action by a group on a classical manifold. Indeed, the meaning of isometric action is nothing but that the action should commute with the ‘Laplacian’ coming from the spectral triple, and we should mention that this idea was already present in [2], though only in the context of a finite metric space or a finite graph. The universal object in the category of such quantum groups, if it exists, should be thought of as the quantum analogue of the group of isometries, and we have been able to prove its existence under some regularity assumptions, all of which can be verified for a general compact connected Riemannian manifold as well as the standard examples of noncommutative manifolds. 
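The principle that an isometric action is one commuting with the Laplacian can be illustrated in the finite-graph setting mentioned above. The sketch below is a toy example, not a construction from this article: it takes the 4-cycle, uses the combinatorial graph Laplacian as an assumed stand-in for the Laplacian of a spectral triple, and compares the permutations commuting with it to the permutations preserving the graph distance. All names are hypothetical.

```python
from itertools import permutations

def cycle_laplacian(n):
    """Combinatorial Laplacian (degree minus adjacency) of the n-cycle, n >= 3."""
    L = [[0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 2
        L[i][(i + 1) % n] = -1
        L[i][(i - 1) % n] = -1
    return L

def commutes_with_laplacian(p, L):
    """The operator f -> f o p commutes with L iff L[p(i)][p(j)] = L[i][j]."""
    n = len(L)
    return all(L[p[i]][p[j]] == L[i][j] for i in range(n) for j in range(n))

def cycle_distance(i, j, n):
    """Graph distance between vertices i and j on the n-cycle."""
    return min((i - j) % n, (j - i) % n)

def is_isometry(p, n):
    """True if the permutation p preserves the graph distance."""
    return all(cycle_distance(p[i], p[j], n) == cycle_distance(i, j, n)
               for i in range(n) for j in range(n))

n = 4
L = cycle_laplacian(n)
commuting = {p for p in permutations(range(n)) if commutes_with_laplacian(p, L)}
isometries = {p for p in permutations(range(n)) if is_isometry(p, n)}
print(len(commuting), commuting == isometries)
```

For the 4-cycle both sets consist of the eight elements of the dihedral group, which is the classical isometry group of this finite metric space.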
Motivated by the ideas of Woronowicz and Soltan, we actually consider a bigger category. The isometry group of a classical manifold, viewed as a compact metrizable space (forgetting the group structure), can be seen to be the universal object of a category whose object-class consists of subsets (not necessarily subgroups) of the set of smooth isometries of the manifold. Then it can be proved that this universal compact set has a canonical group structure. A natural quantum analogue of this has been formulated by us, called the category of ‘quantum families of smooth isometries’. The underlying C ∗ -algebra of the quantum isometry group has been identified with its universal object and moreover, it is shown to be equipped with a canonical coproduct making it into a compact quantum group. We believe that a detailed study of quantum isometry groups will not only give many new and interesting examples of compact quantum groups, it will also contribute to the understanding of quantum group-equivariant spectral triples. In fact, we have made some progress in this direction already by constructing a spectral triple (which is often closely related to the original spectral triple) on the Hilbert space of forms which is equivariant with respect to a canonical unitary representation of the quantum isometry group. In a companion article [4] with J. Bhowmick, we provide explicit computations of quantum isometry groups of a few classical and noncommutative manifolds. However, we briefly quote some of main results of [4] in the present article. One interesting observation is that the quantum isometry group of the noncommutative two-torus Aθ (with the canonical spectral triple) is (as a C ∗ algebra) a direct sum of four commutative and four noncommutative tori, and contains as a quantum subgroup (which is universal for a certain class of isometric actions called holomorphic isometries) the ‘quantum double-torus’ discovered and studied by Hajac and Masuda ([13]).
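2. Isometry Groups of Classical Manifolds
We begin with a well-known characterization of the isometry group of a (classical) compact Riemannian manifold. Let (M, g) be a compact Riemannian manifold and let $\Omega^1 = \Omega^1(M)$ be the space of smooth one-forms, which has a right Hilbert-$C^\infty(M)$-module structure given by the $C^\infty(M)$-valued inner product $\langle\langle \cdot, \cdot \rangle\rangle$ defined by
\[
\langle\langle \omega, \eta \rangle\rangle(m) = \langle \omega(m), \eta(m) \rangle|_m,
\]
where $\langle \cdot, \cdot \rangle|_m$ is the Riemannian metric on the cotangent space $T_m^* M$ at the point $m \in M$. The Riemannian volume form allows us to make $\Omega^1$ a pre-Hilbert space, and we denote its completion by $\mathcal{H}_1$. Let $\mathcal{H}_0 = L^2(M, \mathrm{dvol})$ and consider the de Rham differential $d$ as an unbounded linear map from $\mathcal{H}_0$ to $\mathcal{H}_1$, with the natural domain $C^\infty(M) \subset \mathcal{H}_0$, and also denote its closure by $d$. Let $\mathcal{L} := -d^* d$. The following identity can be verified by a direct and easy computation in local coordinates:
\[
(\partial\mathcal{L})(\phi, \psi) \equiv \mathcal{L}(\bar{\phi}\psi) - \mathcal{L}(\bar{\phi})\psi - \bar{\phi}\mathcal{L}(\psi) = 2\langle\langle d\phi, d\psi \rangle\rangle \quad\text{for } \phi, \psi \in C^\infty(M). \tag{1}
\]
Proposition 2.1. A smooth map $\gamma : M \to M$ is a Riemannian isometry if and only if $\gamma$ commutes with $\mathcal{L}$ in the sense that $\mathcal{L}(f \circ \gamma) = (\mathcal{L}(f)) \circ \gamma$ for all $f \in C^\infty(M)$.
Proof. If $\gamma$ commutes with $\mathcal{L}$ then from the identity (1) we get for $m \in M$ and $\phi, \psi \in C^\infty(M)$:
\[
\begin{aligned}
\langle d\phi|_{\gamma(m)}, d\psi|_{\gamma(m)} \rangle|_{\gamma(m)}
&= \langle\langle d\phi, d\psi \rangle\rangle(\gamma(m)) \\
&= \tfrac{1}{2}\bigl(\partial\mathcal{L}(\phi, \psi) \circ \gamma\bigr)(m) \\
&= \tfrac{1}{2}\,\partial\mathcal{L}(\phi \circ \gamma, \psi \circ \gamma)(m) \\
&= \langle\langle d(\phi \circ \gamma), d(\psi \circ \gamma) \rangle\rangle(m) \\
&= \langle d(\phi \circ \gamma)|_m, d(\psi \circ \gamma)|_m \rangle|_m \\
&= \langle (d\gamma|_m)^*(d\phi|_{\gamma(m)}), (d\gamma|_m)^*(d\psi|_{\gamma(m)}) \rangle|_m,
\end{aligned}
\]
which proves that $(d\gamma|_m)^* : T^*_{\gamma(m)} M \to T^*_m M$ is an isometry. Thus, $\gamma$ is a Riemannian isometry. Conversely, if $\gamma$ is an isometry, both the maps induced by $\gamma$ on $\mathcal{H}_0$ and $\mathcal{H}_1$, i.e. $U^0_\gamma : \mathcal{H}_0 \to \mathcal{H}_0$ given by $U^0_\gamma(f) = f \circ \gamma$ and $U^1_\gamma : \mathcal{H}_1 \to \mathcal{H}_1$ given by $U^1_\gamma(f\,d\phi) = (f \circ \gamma)\,d(\phi \circ \gamma)$, are unitaries. Moreover, $d \circ U^0_\gamma = U^1_\gamma \circ d$ on $C^\infty(M) \subset \mathcal{H}_0$. From this, it follows that $\mathcal{L} = -d^* d$ commutes with $U^0_\gamma$.
Now let us consider a compact metrizable (i.e. second countable) space Y with a continuous map $\theta : M \times Y \to M$. We abbreviate $\theta(m, y)$ as $my$ and denote by $\xi_y$ the map $M \ni m \mapsto my$. Let $\alpha : C(M) \to C(M) \otimes C(Y) \cong C(M \times Y)$ be the map given by $\alpha(f)(m, y) := f(my)$ for $y \in Y$, $m \in M$ and $f \in C(M)$. For a state $\phi$ on $C(Y)$, denote by $\alpha_\phi$ the map $(\mathrm{id} \otimes \phi) \circ \alpha : C(M) \to C(M)$. We shall also denote by $\mathcal{C}$ the subspace of $C(M) \otimes C(Y)$ generated by elements of the form $\alpha(f)(1 \otimes \psi)$, $f \in C(M)$, $\psi \in C(Y)$. Since $C(M)$ and $C(Y)$ are commutative algebras, it is easy to see that $\mathcal{C}$ is a ∗-subalgebra of $C(M) \otimes C(Y)$. Then we have the following: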
Theorem 2.2. (i) $\mathcal{C}$ is norm-dense in $C(M) \otimes C(Y)$ if and only if for every $y \in Y$, $\xi_y$ is one-to-one. (ii) The map $\xi_y$ is $C^\infty$ for every $y \in Y$ if and only if $\alpha_\phi(C^\infty(M)) \subseteq C^\infty(M)$ for all $\phi$. (iii) Under the hypothesis of (ii), each $\xi_y$ is also an isometry if and only if $\alpha_\phi$ commutes with $(\mathcal{L} - \lambda)^{-1}$ for all states $\phi$ and all $\lambda$ in the resolvent set of $\mathcal{L}$ (equivalently, $\alpha_\phi$ commutes with the Laplacian $\mathcal{L}$ on $C^\infty(M)$).
Proof. (i) First, assume that $\xi_y$ is one-to-one for all $y$. By the Stone-Weierstrass Theorem, it is enough to show that $\mathcal{C}$ separates points. Take $(m_1, y_1) \neq (m_2, y_2)$ in $M \times Y$. If $y_1 \neq y_2$, we can choose $\psi \in C(Y)$ which separates $y_1$ and $y_2$, hence $(1 \otimes \psi) \in \mathcal{C}$ separates $(m_1, y_1)$ and $(m_2, y_2)$. So, we can consider the case when $y_1 = y_2 = y$ (say), but $m_1 \neq m_2$. By injectivity of $\xi_y$, we have $m_1 y \neq m_2 y$, so there exists $f \in C(M)$ such that $f(m_1 y) \neq f(m_2 y)$, i.e. $\alpha(f)(m_1, y) \neq \alpha(f)(m_2, y)$. This proves the density of $\mathcal{C}$. For the converse, we argue as in the proof of Proposition 3.3 of [16]. Assume that $\mathcal{C}$ is dense in $C(M) \otimes C(Y)$, and let $y \in Y$, $m_1, m_2 \in M$ be such that $m_1 y = m_2 y$. That is, $\alpha(f)(1 \otimes \psi)(m_1, y) = \alpha(f)(1 \otimes \psi)(m_2, y)$ for all $f \in C(M)$, $\psi \in C(Y)$. By the density of $\mathcal{C}$ we get $\chi(m_1, y) = \chi(m_2, y)$ for all $\chi \in C(M \times Y)$, so $(m_1, y) = (m_2, y)$, i.e. $m_1 = m_2$.
(ii) The ‘if part’ of (ii) follows by considering the states corresponding to point evaluation, i.e. $C(Y) \ni \psi \mapsto \psi(y)$, $y \in Y$. For the converse, we note that an arbitrary state $\phi$ corresponds to a regular Borel measure $\mu$ on Y so that $\phi(h) = \int h\,d\mu$, and thus, $\alpha_\phi(f)(m) = \int f(my)\,d\mu(y)$ for $f \in C(M)$. From this, by interchanging differentiation and integration (which is allowed by the Dominated Convergence Theorem, since $\mu$ is a finite measure) we can prove that $\alpha_\phi(f)$ is $C^\infty$ whenever $f$ is so.
The assertion (iii) follows from Proposition 2.1 in a straightforward way.
Let us recall a few well-known facts about the Laplacian $\mathcal{L}$, viewed as a negative self-adjoint operator on the Hilbert space $L^2(M, \mathrm{dvol})$. It is known (see [14] and references therein) that $\mathcal{L}$ has compact resolvents and all its eigenvectors belong to $C^\infty(M)$. Moreover, it follows from the Sobolev Embedding Theorem that
\[
\bigcap_{n \ge 1} \mathrm{Dom}(\mathcal{L}^n) = C^\infty(M).
\]
Let $\{e_{ij},\ j = 1, \dots, d_i;\ i = 0, 1, 2, \dots\}$ be the set of (normalized) eigenvectors of $\mathcal{L}$, where $e_{ij} \in C^\infty(M)$ is an eigenvector corresponding to the eigenvalue $\lambda_i$, $0 = |\lambda_0| < |\lambda_1| < |\lambda_2| < \dots$. We have the following:
Lemma 2.3. The complex linear span of $\{e_{ij}\}$ is norm-dense in $C(M)$.
Proof. This is a consequence of the asymptotic estimates of eigenvalues $\lambda_i$, as well as the uniform bound of the eigenfunctions $e_{ij}$. For example, it is known ([11], Theorem 1.2) that there exist constants $C, C'$ such that $\|e_{ij}\|_\infty \le C|\lambda_i|^{\frac{n-1}{4}}$, $d_i \le C'|\lambda_i|^{\frac{n-1}{2}}$, where n is the dimension of the manifold M. Now, for $f \in C^\infty(M) \subseteq \bigcap_{k \ge 1} \mathrm{Dom}(\mathcal{L}^k)$, we write $f$ as an a-priori $L^2$-convergent series $\sum_{ij} f_{ij} e_{ij}$ ($f_{ij} \in \mathbb{C}$), and observe that $\sum_{ij} |f_{ij}|^2 |\lambda_i|^{2k} < \infty$ for every $k \ge 1$. Choose and fix sufficiently large k such that $\sum_{i \ge 0} |\lambda_i|^{n-1-2k} < \infty$, which is possible due to the well-known Weyl asymptotics of eigenvalues of $\mathcal{L}$. Now, by the Cauchy-Schwarz inequality and the estimate for $d_i$, we have
\[
\sum_{ij} |f_{ij}|\,\|e_{ij}\|_\infty \le C (C')^{\frac{1}{2}} \Bigl(\sum_{ij} |f_{ij}|^2 |\lambda_i|^{2k}\Bigr)^{\frac{1}{2}} \Bigl(\sum_{i \ge 0} |\lambda_i|^{n-1-2k}\Bigr)^{\frac{1}{2}} < \infty.
\]
Thus, the series $\sum_{ij} f_{ij} e_{ij}$ converges to $f$ in sup-norm, so $\mathrm{Sp}\{e_{ij},\ j = 1, 2, \dots, d_i;\ i = 0, 1, 2, \dots\}$ is dense in sup-norm in $C^\infty(M)$, hence in $C(M)$ as well.
Let us denote $\mathrm{Sp}\{e_{ij},\ j = 1, \dots, d_i;\ i \ge 0\}$ by $\mathcal{A}^\infty_0$ from now on. We shall now show that $C^\infty(M)$ can be replaced by the smaller subspace $\mathcal{A}^\infty_0$ in Theorem 2.2. We need a lemma for this, which will be useful later on too.
Lemma 2.4. Let $\mathcal{H}_1, \mathcal{H}_2$ be Hilbert spaces and for $i = 1, 2$, let $\mathcal{L}_i$ be a (possibly unbounded) self-adjoint operator on $\mathcal{H}_i$ with compact resolvents, and let $\mathcal{V}_i$ be the linear span of eigenvectors of $\mathcal{L}_i$. Moreover, assume that there is an eigenvalue of $\mathcal{L}_i$ for which the eigenspace is one-dimensional, say spanned by a unit vector $\xi_i$. Let $\Psi$ be a linear map from $\mathcal{V}_1$ to $\mathcal{V}_2$ such that $\mathcal{L}_2 \Psi = \Psi \mathcal{L}_1$ and $\Psi(\xi_1) = \xi_2$. Then we have
\[
\langle \xi_2, \Psi(x) \rangle = \langle \xi_1, x \rangle \quad \forall x \in \mathcal{V}_1. \tag{2}
\]
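Proof. By hypothesis on $\Psi$, it is clear that there is a common eigenvalue, say $\lambda_0$, of $\mathcal{L}_1$ and $\mathcal{L}_2$, with the eigenvectors $\xi_1$ and $\xi_2$ respectively. Let us write the set of eigenvalues of $\mathcal{L}_i$ as a disjoint union $\{\lambda_0\} \cup \Lambda_i$ ($i = 1, 2$), and let the corresponding orthogonal decomposition of $\mathcal{V}_i$ be given by $\mathcal{V}_i = \mathbb{C}\xi_i \oplus \bigoplus_{\lambda \in \Lambda_i} \mathcal{V}_i^\lambda \equiv \mathbb{C}\xi_i \oplus \mathcal{V}_i'$, say, where $\mathcal{V}_i^\lambda$ denotes the eigenspace of $\mathcal{L}_i$ corresponding to the eigenvalue $\lambda$. By assumption, $\Psi$ maps $\mathcal{V}_1^\lambda$ to $\mathcal{V}_2^\lambda$ whenever $\lambda$ is an eigenvalue of $\mathcal{L}_2$, i.e. $\mathcal{V}_2^\lambda \neq \{0\}$, and otherwise it maps $\mathcal{V}_1^\lambda$ into $\{0\}$. Thus, $\Psi(\mathcal{V}_1') \subseteq \mathcal{V}_2'$. Now, (2) is obviously satisfied for $x = \xi_1$, so it is enough to prove (2) for all $x \in \mathcal{V}_1'$. But we have $\langle \xi_1, x \rangle = 0$ for $x \in \mathcal{V}_1'$, and since $\Psi(x) \in \mathcal{V}_2' = \mathcal{V}_2 \cap \{\xi_2\}^\perp$, it follows that $\langle \xi_2, \Psi(x) \rangle = 0 = \langle \xi_1, x \rangle$.

Lemma 2.5. Let $Y$ and $\alpha$ be as in Theorem 2.2. Then the following are equivalent:
(a) For every $y \in Y$, $\xi_y$ is smooth isometric.
(b) For every state $\phi$ on $C(Y)$, we have $\alpha_\phi(\mathcal{A}^\infty_0) \subseteq \mathcal{A}^\infty_0$, and $\alpha_\phi \mathcal{L} = \mathcal{L} \alpha_\phi$ on $\mathcal{A}^\infty_0$.

Proof. We prove only the nontrivial implication (b) $\Rightarrow$ (a). Assume that $\alpha_\phi$ leaves $\mathcal{A}^\infty_0$ invariant and commutes with $\mathcal{L}$ on it, for every state $\phi$. To prove that $\alpha$ is a smooth isometric action, it is enough (see the proof of Theorem 2.2) to prove that $\alpha_y(\mathcal{A}^\infty) \subseteq \mathcal{A}^\infty$ for all $y \in Y$, where $\alpha_y(f) := (\mathrm{id} \otimes \mathrm{ev}_y)(\alpha(f)) = f \circ \xi_y$, $\mathrm{ev}_y$ being the evaluation at the point $y$. Let $M_1, \ldots, M_k$ be the connected components of the compact manifold $M$. Thus, the Hilbert space $L^2(M, d\mathrm{vol})$ admits an orthogonal decomposition $\bigoplus_{i=1}^k L^2(M_i, d\mathrm{vol})$, and the Laplacian $\mathcal{L}$ is of the form $\bigoplus_i \mathcal{L}_i$, where $\mathcal{L}_i$ denotes the Laplacian on $M_i$. Since each $M_i$ is connected, we have $\mathrm{Ker}(\mathcal{L}_i) = \mathbb{C}\chi_i$, where $\chi_i$ is the constant function on $M_i$ equal to 1. Now, we note that for fixed $y$ and $i$, the continuous function $\xi_y$ must map $M_i$ into a single component, say $M_j$. Thus, by applying Lemma 2.4 with $\mathcal{H}_1 = L^2(M_i)$, $\mathcal{H}_2 = L^2(M_j)$, $\Psi = \alpha_y$ (i.e. composition with $\xi_y$), and using the $L^2$-continuity of the map $f \mapsto \alpha_y(f) = f \circ \xi_y$, we have
$$\int_{M_j} \alpha_y(f)(x)\, d\mathrm{vol}(x) = \int_{M_i} f(x)\, d\mathrm{vol}(x)$$
for all $f$ in the linear span of eigenvectors of $\mathcal{L}_i$, hence (by density) for all $f$ in $L^2(M_i)$. It follows that $\int_M \alpha_y(f)\, d\mathrm{vol} = \int_M f\, d\mathrm{vol}$ for all $f \in L^2(M)$, in particular for all $f \in C(M)$. Since $\alpha_y$ is a $*$-homomorphism on $C(M)$, we have
$$\langle \alpha_y(f), \alpha_y(g) \rangle = \int_M \alpha_y(\bar{f} g)\, d\mathrm{vol} = \int_M \bar{f} g\, d\mathrm{vol} = \langle f, g \rangle$$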
for all $f, g \in C(M)$. Thus, $\alpha_y$ extends to an isometry on $L^2(M)$, to be denoted by the same notation, which by our assumption commutes with the self-adjoint operator $\mathcal{L}$ on the core $\mathcal{A}^\infty_0$, and hence $\alpha_y$ commutes with $\mathcal{L}^n$ for all $n$. In particular it leaves invariant the domain of each $\mathcal{L}^n$, which implies $\alpha_y(C^\infty(M)) \subseteq C^\infty(M)$.

In view of the fact that the set of isometries of $M$, denoted by $ISO(M)$, is a compact second countable (i.e. compact metrizable) group, we see that $ISO(M)$ is the maximal compact second countable group acting on $M$ such that the action is smooth and isometric. In other words, if we consider a category whose objects are compact metrizable groups acting smoothly and isometrically on $M$, and whose morphisms are the group homomorphisms commuting with the actions on $M$, then $ISO(M)$ (with its canonical action on $M$) is the initial object of this category. However, one can take a more general viewpoint and consider the category of compact metrizable spaces $Y$ equipped with a continuous map $\theta : M \times Y \to M$ satisfying (i)–(iii) of Theorem 2.2, or equivalently, the category of pairs $(\mathcal{B}, \alpha)$ consisting of a commutative unital $C^*$-algebra $\mathcal{B} = C(Y)$ and a unital $C^*$-homomorphism $\alpha : C(M) \to C(M) \otimes \mathcal{B}$ satisfying the conditions (i)–(iii). The set of isometries $ISO(M)$ (as a topological space) can be identified with the universal object of this category, and then one can prove that it has a group structure.

It is quite natural to formulate a quantum analogue of the above, by considering, in the spirit of Woronowicz and Soltan (see [22,15]), 'quantum families of isometries', which can be defined to be pairs $(\mathcal{B}, \alpha)$, where $\mathcal{B}$ is a (not necessarily commutative) $C^*$-algebra and $\alpha : C(M) \to C(M) \otimes \mathcal{B}$ is a unital $C^*$-homomorphism satisfying (i)–(iii) of Theorem 2.2, i.e. the linear span of $\alpha(C(M))(1 \otimes \mathcal{B})$ (which is not necessarily a $*$-subalgebra any more, $\mathcal{B}$ being possibly noncommutative) is norm-dense in $C(M) \otimes \mathcal{B}$, and for every state $\phi$ on $\mathcal{B}$, the map $\alpha_\phi$ keeps $C^\infty(M)$ invariant and commutes with the Laplacian $\mathcal{L}$.
The morphisms of this category are obvious. We shall prove that this category has a universal object at least when the manifold $M$ is connected, and that this universal object can be equipped with a canonical quantum group structure. This will define the quantum isometry group of a manifold. However, we shall go beyond classical manifolds and define a quantum isometry group $QISO(\mathcal{A}^\infty, \mathcal{H}, D)$ for a spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, with $\mathcal{A}^\infty$ being unital, and satisfying certain assumptions. To this end, we need to carefully formulate the notion of Laplacian in noncommutative geometry, which is the goal of the next section.

3. Laplacian in Noncommutative Geometry

Given a spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, we recall from [12,7] the construction of the space of one-forms. We have a derivation from $\mathcal{A}^\infty$ to the $\mathcal{A}^\infty$-$\mathcal{A}^\infty$ bimodule $\mathcal{B}(\mathcal{H})$ given by $a \mapsto [D, a]$. This induces a bimodule morphism $\pi$ from $\Omega^1(\mathcal{A}^\infty)$ (the bimodule of universal one-forms on $\mathcal{A}^\infty$) to $\mathcal{B}(\mathcal{H})$, such that $\pi(\delta(a)) = [D, a]$, where $\delta : \mathcal{A}^\infty \to \Omega^1(\mathcal{A}^\infty)$ denotes the universal derivation map. We set $\Omega^1_D \equiv \Omega^1_D(\mathcal{A}^\infty) := \Omega^1(\mathcal{A}^\infty)/\mathrm{Ker}(\pi) \cong \pi(\Omega^1(\mathcal{A}^\infty)) \subseteq \mathcal{B}(\mathcal{H})$. Assume, in case $\mathcal{H}$ is infinite-dimensional, that the spectral triple is of compact type and has a finite dimension in the sense of Connes ([7]), i.e. there is
some $p > 0$ such that the operator $|D|^{-p}$ (interpreted as the inverse of the restriction of $|D|^p$ on the closure of its range, which has a finite co-dimension since $D$ has compact resolvents) has finite nonzero Dixmier trace, denoted by $Tr_\omega$ (where $\omega$ is some suitable Banach limit, see, e.g. [7,12]). Consider the canonical 'volume form' $\tau$ coming from the Dixmier trace, i.e. $\tau : \mathcal{B}(\mathcal{H}) \to \mathbb{C}$ defined by
$$\tau(A) := \frac{1}{Tr_\omega(|D|^{-p})}\, Tr_\omega(A |D|^{-p}).$$
In case $\mathcal{H}$ is finite-dimensional, we shall take $\tau$ to be the usual trace instead. Let us at this point assume that the spectral triple is $QC^\infty$, i.e. $\mathcal{A}^\infty$ and $\{[D, a] : a \in \mathcal{A}^\infty\}$ are contained in the domains of all powers of the derivation $[|D|, \cdot]$. Under this assumption, $\tau$ is a positive faithful trace on the $C^*$-subalgebra generated by $\mathcal{A}^\infty$ and $\{[D, a] : a \in \mathcal{A}^\infty\}$, and the GNS Hilbert space $L^2(\mathcal{A}^\infty, \tau)$ is denoted by $\mathcal{H}^0_D$. Similarly, we equip $\Omega^1_D$ with a semi-inner product given by $\langle \eta, \eta' \rangle := \tau(\eta^* \eta')$, and denote the Hilbert space obtained from it by $\mathcal{H}^1_D$. The map $d_D : \mathcal{H}^0_D \to \mathcal{H}^1_D$ given by $d_D(\cdot) = [D, \cdot]$ is an unbounded densely defined linear map. We show that the map $d_D$ has nice properties under a very natural condition on the spectral triple.

Lemma 3.1. Suppose that for every element $a \in \mathcal{A}^\infty$, the map $\mathbb{R} \ni t \mapsto \alpha_t(X) := \exp(itD) X \exp(-itD)$ is differentiable at $t = 0$ in the norm-topology of $\mathcal{B}(\mathcal{H})$, where $X = a$ or $[D, a]$. Then we have:
(a) $d_D$ is closable (the closure is denoted again by $d_D$);
(b) $\mathcal{A}^\infty \subseteq \mathrm{Dom}(\mathcal{L})$, where $\mathcal{L} := -d_D^* d_D$ and $\mathcal{A}^\infty$ is viewed as a dense subspace of $\mathcal{H}^0_D$;
(c) $\mathcal{L}$ maps $\mathcal{A}^\infty$ into the weak closure of $\mathcal{A}^\infty$ in $\mathcal{B}(\mathcal{H}^0_D)$.

Proof. We first observe that $\tau(\alpha_t(A)) = \tau(A)$ for all $t$ and for all $A \in \mathcal{B}(\mathcal{H})$, since $\exp(itD)$ commutes with $|D|^{-p}$. If, moreover, $A$ belongs to the domain of norm-differentiability (at $t = 0$) of $\alpha_t$, i.e. $\frac{\alpha_t(A) - A}{t} \to i[D, A]$ in operator-norm, then it follows from the property of the Dixmier trace that
$$\tau([D, A]) = \frac{1}{i} \lim_{t \to 0} \frac{\tau(\alpha_t(A)) - \tau(A)}{t} = 0.$$
Now, since by assumption we have the norm-differentiability at $t = 0$ of $\alpha_t(A)$ for $A$ belonging to the $*$-subalgebra (say $\mathcal{B}$) generated by $\mathcal{A}^\infty$ and $[D, \mathcal{A}^\infty]$, it follows that $\tau([D, A]) = 0$ for all $A \in \mathcal{B}$. Let us now fix $a, b, c \in \mathcal{A}^\infty$ and observe that
$$\langle a\, d_D(b), d_D(c) \rangle = \tau\big((a\, d_D(b))^* d_D(c)\big) = -\tau\big([D, [D, b^*] a^* c]\big) + \tau\big([D, [D, b^*] a^*] c\big) = \tau\big([D, [D, b^*] a^*] c\big),$$
using the fact that $\tau([D, [D, b^*] a^* c]) = 0$. This implies
$$|\langle a\, d_D(b), d_D(c) \rangle| \le \|[D, [D, b^*] a^*]\|\, \tau(c^* c)^{1/2} = \|[D, [D, b^*] a^*]\|\, \|c\|_2,$$
where $\|c\|_2 = \tau(c^* c)^{1/2}$ denotes the $L^2$-norm of $c \in \mathcal{H}^0_D$. This proves that $a\, d_D(b)$ belongs to the domain of $d_D^*$ for all $a, b \in \mathcal{A}^\infty$, so in particular $\mathrm{Dom}(d_D^*)$ is dense, i.e. $d_D$ is closable. Moreover, taking $a = 1$, we see that $d_D(\mathcal{A}^\infty) \subseteq \mathrm{Dom}(d_D^*)$, or in other words, $\mathcal{A}^\infty \subseteq \mathrm{Dom}(d_D^* d_D)$. This proves (a) and (b). The fact (c) can be proved along the lines of Theorem 2.9, p. 129, [12].

We need a few more assumptions on the operator $\mathcal{L}$ to define the quantum isometry group.
Assumption (i). (a) $d_D$ is closable (the closure is denoted again by $d_D$); (b) $\mathcal{A}^\infty \subseteq \mathrm{Dom}(\mathcal{L})$, where $\mathcal{L} := -d_D^* d_D$;
Assumption (ii). $\mathcal{L}$ has compact resolvents;
Assumption (iii). $\mathcal{L}(\mathcal{A}^\infty) \subseteq \mathcal{A}^\infty$;
Assumption (iv). Each eigenvector of $\mathcal{L}$ (which has a discrete spectrum, hence a complete set of eigenvectors) belongs to $\mathcal{A}^\infty$;
Assumption (v) ('Connectedness assumption'). The kernel of $\mathcal{L}$ is one-dimensional, spanned by the identity $1$ of $\mathcal{A}^\infty$, viewed as a unit vector in $\mathcal{H}^0_D$;
Assumption (vi). The complex linear span of the eigenvectors of $\mathcal{L}$, say $\mathcal{A}^\infty_0$ (which is a subspace of $\mathcal{A}^\infty$ by assumption (iv)), is norm-dense in $\mathcal{A}^\infty$.

We call $\mathcal{L}$ the noncommutative Laplacian and $T_t = e^{t\mathcal{L}}$ the noncommutative heat semigroup. Let us summarize some simple observations in the form of the following:

Lemma 3.2. (a) If the assumptions (i)–(iii) are valid, then for $x \in \mathcal{A}^\infty$, we have $\mathcal{L}(x^*) = (\mathcal{L}(x))^*$.
(b) If $T_t = \exp(t\mathcal{L})$ maps $\mathcal{H}^0_D$ into $\mathcal{A}^\infty$ for all $t > 0$, then assumption (iv) is satisfied.

Proof. It follows by simple calculation, using the facts that $\tau$ is a trace and $d_D(x^*) = -(d_D(x))^*$, that
$$\tau(\mathcal{L}(x^*)^* y) = \tau(d_D(x) d_D(y)) = \tau(d_D(y) d_D(x)) = -\tau\big((d_D(y^*))^* d_D(x)\big) = \langle y^*, \mathcal{L}(x) \rangle = \tau(y \mathcal{L}(x)) = \tau(\mathcal{L}(x) y)$$
for all $y \in \mathcal{A}^\infty$. By density of $\mathcal{A}^\infty$ in $\mathcal{H}^0_D$, (a) follows. To prove (b), we note that if $x \in \mathcal{H}^0_D$ is an eigenvector of $\mathcal{L}$, say $\mathcal{L}(x) = \lambda x$ ($\lambda \in \mathbb{C}$), then we have $T_t(x) = e^{\lambda t} x$, hence $x = e^{-\lambda t} T_t(x) \in \mathcal{A}^\infty$.

Definition 3.3. We say that a spectral triple satisfying the assumptions (i)–(vi) is admissible.

Given a spectral triple satisfying assumptions (i)–(iv) and (vi), the Laplacian $\mathcal{L}$ has a countable set of eigenvalues, each with finite multiplicity; let us denote them by $\lambda_0 = 0, \lambda_1, \lambda_2, \ldots$ with $V_0, V_1, \ldots$ the corresponding (finite dimensional) eigenspaces, and for each $i$, let $\{e_{ij} : j = 1, \ldots, d_i\}$ be an orthonormal basis of $V_i$. We denote by $\mathcal{A}^\infty_0$ the linear subspace spanned by $\{e_{ij} : j = 1, \ldots, d_i;\ i \ge 0\}$. We have $V_i \subseteq \mathcal{A}^\infty$ for each $i$, $V_i$ is closed under $*$, and moreover, $\{e_{ij}^* : j = 1, \ldots, d_i\}$ is also an orthonormal basis for $V_i$, since $\tau(x^* y) = \tau(y x^*)$ for $x, y \in \mathcal{A}^\infty$. Moreover, if the spectral triple also satisfies assumption (v), i.e. is admissible, we have $V_0 = \mathbb{C}1$. We conclude the present section with the following useful lemma.

Lemma 3.4. Let us assume that the spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ is admissible. Let $\Psi : \mathcal{A}^\infty_0 \to \mathcal{A}^\infty_0$ be a (norm-)bounded linear map such that $\Psi(1) = 1$ and $\Psi \circ \mathcal{L} = \mathcal{L} \circ \Psi$ on the subspace $\mathcal{A}^\infty_0$ spanned (algebraically) by $V_i$, $i = 0, 1, 2, \ldots$. Then $\tau(\Psi(x)) = \tau(x)$ for all $x \in \mathcal{A}^\infty$.

Proof. By Lemma 2.4 with $\mathcal{H}_1 = \mathcal{H}_2 = \mathcal{H}^0_D$, $\xi_1 = \xi_2 = 1$, we have $\tau(\Psi(x)) = \tau(x)$ for all $x \in \mathcal{A}^\infty_0$. By the norm-continuity of $\Psi$ and $\tau$, this extends to the whole of $\mathcal{A}^\infty$.
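4. Definition and Existence of the Quantum Isometry Group

We begin by recalling the definition of compact quantum groups and their actions from [21,20]. A compact quantum group is given by a pair $(\mathcal{S}, \Delta)$, where $\mathcal{S}$ is a unital separable $C^*$-algebra equipped with a unital $C^*$-homomorphism $\Delta : \mathcal{S} \to \mathcal{S} \otimes \mathcal{S}$ (where $\otimes$ denotes the injective tensor product) satisfying
(ai) $(\Delta \otimes \mathrm{id}) \circ \Delta = (\mathrm{id} \otimes \Delta) \circ \Delta$ (co-associativity), and
(aii) the linear spans of $\Delta(\mathcal{S})(\mathcal{S} \otimes 1)$ and $\Delta(\mathcal{S})(1 \otimes \mathcal{S})$ are norm-dense in $\mathcal{S} \otimes \mathcal{S}$.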
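It is well-known (see [21,20]) that there is a canonical dense $*$-subalgebra $\mathcal{S}_0$ of $\mathcal{S}$, consisting of the matrix coefficients of the finite dimensional unitary (co)-representations of $\mathcal{S}$, and maps $\epsilon : \mathcal{S}_0 \to \mathbb{C}$ (co-unit) and $\kappa : \mathcal{S}_0 \to \mathcal{S}_0$ (antipode) defined on $\mathcal{S}_0$ which make $\mathcal{S}_0$ a Hopf $*$-algebra. We say that the compact quantum group $(\mathcal{S}, \Delta)$ acts on a unital $C^*$-algebra $\mathcal{B}$ if there is a unital $C^*$-homomorphism $\alpha : \mathcal{B} \to \mathcal{B} \otimes \mathcal{S}$ satisfying the following:
(bi) $(\alpha \otimes \mathrm{id}) \circ \alpha = (\mathrm{id} \otimes \Delta) \circ \alpha$, and
(bii) the linear span of $\alpha(\mathcal{B})(1 \otimes \mathcal{S})$ is norm-dense in $\mathcal{B} \otimes \mathcal{S}$.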
Let us now recall the concept of universal quantum groups as in [19,17] and references therein. We shall use most of the terminology of [17], e.g. Woronowicz $C^*$-subalgebra, Woronowicz $C^*$-ideal, etc., with the exception that we shall call the Woronowicz $C^*$-algebras just compact quantum groups, and not use the term compact quantum groups for the dual objects as done in [17]. Let $A_n$ denote the universal compact quantum group generated by $u_{ij}$, $i, j = 1, \ldots, n$, satisfying the relations
$$u u^* = I_n = u^* u, \qquad u' \bar{u} = I_n = \bar{u} u',$$
where $u = ((u_{ij}))$, $u' = ((u_{ji}))$ and $\bar{u} = ((u_{ij}^*))$. The coproduct, say $\tilde{\Delta}$, is given by
$$\tilde{\Delta}(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}.$$
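For orientation only, the coproduct formula has a familiar classical shadow: when the generators $u_{ij}$ are evaluated as coordinate functions on an ordinary group of unitary matrices, $\sum_k u_{ik} \otimes u_{kj}$ becomes the function $(g, h) \mapsto u_{ij}(gh)$, i.e. matrix multiplication. The following sketch checks this on two hand-picked $2 \times 2$ unitaries; the helper names and the choice of matrices are assumptions made for the illustration and do not appear in the paper.

```python
# Classical illustration: the coproduct of coordinate functions encodes
# matrix multiplication. All names below are hypothetical helpers.

def mat_mul(g, h):
    n = len(g)
    return [[sum(g[i][k] * h[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def u(i, j):
    """Coordinate function u_ij on matrices: g -> g[i][j]."""
    return lambda g: g[i][j]

def coproduct_u(i, j, n):
    """The two-variable function represented by sum_k u_ik (x) u_kj."""
    return lambda g, h: sum(u(i, k)(g) * u(k, j)(h) for k in range(n))

n = 2
g = [[0, 1], [1, 0]]      # a permutation matrix (unitary)
h = [[1j, 0], [0, -1j]]   # a diagonal unitary

agrees = all(
    coproduct_u(i, j, n)(g, h) == u(i, j)(mat_mul(g, h))
    for i in range(n) for j in range(n)
)
print(agrees)
```

In the genuinely quantum case the $u_{ij}$ do not commute and there is no underlying set of points, so this picture is only a heuristic for reading the formula.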
We refer the reader to [19] for a detailed discussion on the structure and classification of such quantum groups. Let us now consider a spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ satisfying the assumptions (i)–(iv) and (vi) of the previous section, but not necessarily (v) (i.e. the connectedness assumption). Let $\mathcal{L}$, $V_i$, $e_{ij}$, etc. be as in the previous section. We shall denote by $\mathcal{U}_i$ the quantum group $A_{d_i}$, where $d_i$ is the dimension of the subspace $V_i$. We fix a representation $\beta_i : V_i \to V_i \otimes \mathcal{U}_i$ of $\mathcal{U}_i$ on the Hilbert space $V_i$, given by $\beta_i(e_{ij}) = \sum_k e_{ik} \otimes u^{(i)}_{kj}$ for $j = 1, \ldots, d_i$, where $u_i \equiv ((u^{(i)}_{kj}))$ are the generators of $\mathcal{U}_i$ as discussed before. Thus, both $u_i$ and $\bar{u}_i$ are unitaries. It follows from [17] that the representations $\beta_i$ canonically induce a representation $\beta = *_i \beta_i$ of the free product $\mathcal{U} := *_i\, \mathcal{U}_i$ (which is a compact quantum group, see [17] for the details) on the Hilbert space $\mathcal{H}^0_D$, such that the restriction of $\beta$ to $V_i$ coincides with $\beta_i$ for all $i$. In view of the characterization of smooth isometric actions on a classical manifold, we give the following definitions.

Definition 4.1. A quantum family of smooth isometries of the noncommutative manifold $\mathcal{A}^\infty$ (or, more precisely, of the corresponding spectral triple) is a pair $(\mathcal{S}, \alpha)$, where
$\mathcal{S}$ is a separable unital $C^*$-algebra and $\alpha : \bar{\mathcal{A}} \to \bar{\mathcal{A}} \otimes \mathcal{S}$ (where $\bar{\mathcal{A}}$ denotes the $C^*$-algebra obtained by completing $\mathcal{A}^\infty$ in the norm of $\mathcal{B}(\mathcal{H}^0_D)$) is a unital $C^*$-homomorphism, satisfying the following:
(a) $\overline{\mathrm{Sp}}\, \alpha(\bar{\mathcal{A}})(1 \otimes \mathcal{S}) = \bar{\mathcal{A}} \otimes \mathcal{S}$,
(b) $\alpha_\phi := (\mathrm{id} \otimes \phi) \circ \alpha$ maps $\mathcal{A}^\infty_0$ into itself and commutes with $\mathcal{L}$ on $\mathcal{A}^\infty_0$, for every state $\phi$ on $\mathcal{S}$.

The quantum family of isometries $(\mathcal{S}, \alpha)$ is said to be volume-preserving if $(\tau \otimes \mathrm{id})(\alpha(a)) = \tau(a) 1_{\mathcal{S}}$ for all $a \in \mathcal{A}^\infty$. In case the $C^*$-algebra $\mathcal{S}$ has a coproduct $\Delta$ such that $(\mathcal{S}, \Delta)$ is a compact quantum group and $\alpha$ is an action of $(\mathcal{S}, \Delta)$ on $\bar{\mathcal{A}}$, we say that $(\mathcal{S}, \Delta)$ acts smoothly and isometrically on the noncommutative manifold.

Consider the category $\mathcal{Q}$ with the object-class consisting of all quantum families of isometries $(\mathcal{S}, \alpha)$ of the given noncommutative manifold, and the set of morphisms $\mathrm{Mor}((\mathcal{S}, \alpha), (\mathcal{S}', \alpha'))$ being the set of unital $C^*$-homomorphisms $\phi : \mathcal{S} \to \mathcal{S}'$ satisfying $(\mathrm{id} \otimes \phi) \circ \alpha = \alpha'$. We also consider another category $\mathcal{Q}'$ whose objects are triplets $(\mathcal{S}, \Delta, \alpha)$, where $(\mathcal{S}, \Delta)$ is a compact quantum group acting smoothly and isometrically on the given noncommutative manifold, with $\alpha$ being the corresponding action. The morphisms are the homomorphisms of compact quantum groups which are also morphisms of the underlying quantum families. The forgetful functor $F : \mathcal{Q}' \to \mathcal{Q}$ is clearly faithful, and we can view $F(\mathcal{Q}')$ as a subcategory of $\mathcal{Q}$. Let $\mathcal{Q}_0$ and $\mathcal{Q}_0'$ denote the full subcategories of $\mathcal{Q}$ and $\mathcal{Q}'$ respectively obtained by restricting the object-classes to the volume-preserving quantum families.

Lemma 4.2. In case the spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ is admissible, any quantum family of smooth isometries is automatically volume-preserving; hence $\mathcal{Q}_0 = \mathcal{Q}$ and $\mathcal{Q}_0' = \mathcal{Q}'$ as categories.

Proof. Let $(\mathcal{S}, \alpha)$ be any quantum family of smooth isometries, and let $\omega$ be any state on $\mathcal{S}$. We conclude by Lemma 3.4 that $\tau(\alpha_\omega(x)) = \tau(x) \omega(1)$ for all $x \in \bar{\mathcal{A}}$. Since $\omega$ is arbitrary, we have $(\tau \otimes \mathrm{id})(\alpha(x)) = \tau(x) 1_{\mathcal{S}}$ for all $x \in \bar{\mathcal{A}}$.

Remark 4.3. The assumption of admissibility is not actually necessary in the above lemma; there are non-admissible spectral triples for which the volume-preserving property is automatic.
For example, consider a finite set $X$ consisting of $n$ points and realize the algebra $\mathcal{A} = C(X)$ of complex-valued functions on $X$ as diagonal matrices acting on $\mathcal{H} = \mathbb{C}^n$ in the obvious way. Taking the trivial Dirac operator $D = I$, it is easy to see that the volume form $\tau$ is given by $\tau(e_i) = 1$ for $i = 1, \ldots, n$, where the $e_i$ are the canonical generators of $\mathcal{A}$ as in [18]. If $(\mathcal{S}, \Delta, \alpha)$ is a compact quantum group acting faithfully on $\mathcal{A}$, with $\alpha(e_i) = \sum_{j=1}^n e_j \otimes x_{ji}$, then it follows from [18] that $\sum_j x_{ji} = 1_{\mathcal{S}}$ for each $i$, hence $\alpha$ is automatically $\tau$-invariant, i.e. volume-preserving in our terminology. This proves $\mathcal{Q}_0' = \mathcal{Q}'$, even though $(\mathcal{A}, \mathcal{H}, D)$ is not admissible whenever $n \ge 2$, since $\mathcal{L} = 0$ has an $n$-dimensional kernel.

Remark 4.4. On the other hand, with $\mathcal{A} = M_n(\mathbb{C})$, $\mathcal{H} = \mathbb{C}^n$, $D = I$, we get an example of a non-admissible spectral triple for which the volume-preserving property is not automatic. To see this, it is enough to observe that $\tau$ coincides with the usual trace of $M_n(\mathbb{C})$, and there are actions of compact quantum groups on $M_n(\mathbb{C})$ which do not preserve the trace (see [18]). Note that since $\mathcal{L} = 0$ and $M_n(\mathbb{C})$ is finite-dimensional, any quantum group action is trivially smooth and isometric in this case.
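The computation in Remark 4.3 can be illustrated in the classical (commutative) special case, where the coefficients $x_{ji}$ are just the 0/1 entries of a permutation matrix. The sketch below is only an illustration under that assumption: it takes a hypothetical convention in which a permutation $\sigma$ acts by $e_i \mapsto e_{\sigma(i)}$, and checks that $\sum_j \tau(e_j) x_{ji} = \tau(e_i)$ with $\tau(e_k) = 1$, for every permutation of four points.

```python
from itertools import permutations

# Classical special case of Remark 4.3; the convention e_i -> e_{sigma(i)}
# is an assumption chosen for this example.

def action_matrix(sigma):
    """x[j][i] = 1 if sigma sends point i to point j, else 0, so that
    alpha(e_i) = sum_j e_j (x) x[j][i] = e_{sigma(i)}."""
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for i in range(n)] for j in range(n)]

def preserves_volume(x):
    """Check sum_j tau(e_j) * x[j][i] == tau(e_i), with tau(e_k) = 1."""
    n = len(x)
    return all(sum(x[j][i] for j in range(n)) == 1 for i in range(n))

all_preserved = all(
    preserves_volume(action_matrix(s)) for s in permutations(range(4))
)
print(all_preserved)
```

For a genuine quantum permutation group the entries $x_{ji}$ are projections in a noncommutative algebra rather than numbers, and the column-sum relation is the statement quoted from [18].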
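Lemma 4.5. Consider the spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ satisfying assumptions (i)–(iv) and (vi), and let $(\mathcal{S}, \alpha)$ be a quantum family of volume-preserving smooth isometries of the given spectral triple. Moreover, assume that the action $\alpha$ is faithful in the sense that there is no proper $C^*$-subalgebra $\mathcal{S}_1$ of $\mathcal{S}$ such that $\alpha(\mathcal{A}^\infty) \subseteq \mathcal{A}^\infty \otimes \mathcal{S}_1$. Then $\tilde{\alpha} : \mathcal{A}^\infty \otimes \mathcal{S} \to \mathcal{A}^\infty \otimes \mathcal{S}$ defined by $\tilde{\alpha}(a \otimes b) := \alpha(a)(1 \otimes b)$ extends to an $\mathcal{S}$-linear unitary on the Hilbert $\mathcal{S}$-module $\mathcal{H}^0_D \otimes \mathcal{S}$, denoted again by $\tilde{\alpha}$. Moreover, we can find a $C^*$-isomorphism $\phi : \mathcal{U}/\mathcal{I} \to \mathcal{S}$ between $\mathcal{S}$ and a quotient of $\mathcal{U}$ by a $C^*$-ideal $\mathcal{I}$ of $\mathcal{U}$, such that $\alpha = (\mathrm{id} \otimes \phi) \circ (\mathrm{id} \otimes \pi_{\mathcal{I}}) \circ \beta$ on $\mathcal{A}^\infty \subseteq \mathcal{H}^0_D$, where $\pi_{\mathcal{I}}$ denotes the quotient map from $\mathcal{U}$ to $\mathcal{U}/\mathcal{I}$. If, furthermore, there is a compact quantum group structure on $\mathcal{S}$ given by a coproduct $\Delta$ such that $(\mathcal{S}, \Delta, \alpha)$ is an object in $\mathcal{Q}_0'$, the map $\alpha : \mathcal{A}^\infty \to \mathcal{A}^\infty \otimes \mathcal{S}$ extends to a unitary representation (denoted again by $\alpha$) of the compact quantum group $(\mathcal{S}, \Delta)$ on $\mathcal{H}^0_D$. In this case, the ideal $\mathcal{I}$ is a Woronowicz $C^*$-ideal and the $C^*$-isomorphism $\phi : \mathcal{U}/\mathcal{I} \to \mathcal{S}$ is a morphism of compact quantum groups.

Proof. We have $(\tau \otimes \mathrm{id})(\alpha(a)) = \tau(a) 1_{\mathcal{S}}$ for all $a \in \mathcal{A}^\infty$, and hence for all $a \in \bar{\mathcal{A}}$ by continuity. So, $\langle \alpha(x), \alpha(y) \rangle_{\mathcal{S}} = \langle x, y \rangle 1_{\mathcal{S}}$, where $\langle \cdot, \cdot \rangle_{\mathcal{S}}$ denotes the $\mathcal{S}$-valued inner product of the Hilbert module $\mathcal{H}^0_D \otimes \mathcal{S}$. This proves that $\tilde{\alpha}$ defined by $\tilde{\alpha}(x \otimes b) := \alpha(x)(1 \otimes b)$ ($x \in \mathcal{A}^\infty$, $b \in \mathcal{S}$) extends to an $\mathcal{S}$-linear isometry on the Hilbert $\mathcal{S}$-module $\mathcal{H}^0_D \otimes \mathcal{S}$. Moreover, since $\alpha(\mathcal{A}^\infty)(1 \otimes \mathcal{S})$ is norm-total in $\bar{\mathcal{A}} \otimes \mathcal{S}$, it is clear that the $\mathcal{S}$-linear span of the range of $\alpha(\mathcal{A}^\infty)$ is dense in the Hilbert module $\mathcal{H}^0_D \otimes \mathcal{S}$, or in other words, the isometry $\tilde{\alpha}$ has a dense range, so it is a unitary. Since $\alpha_\omega$ leaves each $V_i$ invariant for every state $\omega$, it is clear that $\alpha$ maps $V_i$ into $V_i \otimes \mathcal{S}$ for each $i$. Let $v^{(i)}_{kj}$ ($j, k = 1, \ldots, d_i$) be the elements of $\mathcal{S}$ such that $\alpha(e_{ij}) = \sum_k e_{ik} \otimes v^{(i)}_{kj}$. Note that $v_i := ((v^{(i)}_{kj}))$ is a unitary in $M_{d_i}(\mathbb{C}) \otimes \mathcal{S}$. Moreover, the $*$-subalgebra generated by all $\{v^{(i)}_{kj} : i \ge 0,\ j, k \ge 1\}$ must be dense in $\mathcal{S}$ by the assumption of faithfulness. We have already remarked that $\{e_{ij}^*\}$ is also an orthonormal basis of $V_i$, and since $\alpha$, being a $C^*$-action on $\bar{\mathcal{A}}$, is $*$-preserving, we have $\alpha(e_{ij}^*) = (\alpha(e_{ij}))^* = \sum_k e_{ik}^* \otimes v^{(i)*}_{kj}$, and therefore $((v^{(i)*}_{kj}))$ is also unitary. By universality of $\mathcal{U}_i$, there is a $C^*$-homomorphism from $\mathcal{U}_i$ to $\mathcal{S}$ sending $u^{(i)}_{kj}$ to $v^{(i)}_{kj}$, and by definition of the free product, this induces a $C^*$-homomorphism, say $\Pi$, from $\mathcal{U}$ onto $\mathcal{S}$, so that $\mathcal{U}/\mathcal{I} \cong \mathcal{S}$, where $\mathcal{I} := \mathrm{Ker}(\Pi)$. In case $\mathcal{S}$ has a coproduct $\Delta$ making it into a compact quantum group and $\alpha$ is a quantum group action, it is easy to see that the subalgebra of $\mathcal{S}$ generated by $\{v^{(i)}_{kj} : i \ge 0,\ j, k = 1, \ldots, d_i\}$ is a Hopf algebra, with $\Delta(v^{(i)}_{kj}) = \sum_l v^{(i)}_{kl} \otimes v^{(i)}_{lj}$. From this, it follows that $\Pi$ is a Hopf-algebra morphism, hence $\mathcal{I}$ is a Woronowicz $C^*$-ideal.

Before we state and prove the main theorem, let us note the following elementary fact about $C^*$-algebras.

Lemma 4.6. Let $\mathcal{C}$ be a $C^*$-algebra and $\mathcal{F}$ be a nonempty collection of $C^*$-ideals (closed two-sided ideals) of $\mathcal{C}$. Then for any $x \in \mathcal{C}$, we have
$$\sup_{I \in \mathcal{F}} \|x + I\| = \|x + I_0\|,$$
where $I_0$ denotes the intersection of all $I$ in $\mathcal{F}$ and $\|x + I\| = \inf\{\|x - y\| : y \in I\}$ denotes the norm in $\mathcal{C}/I$.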
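A commutative toy case may help in reading Lemma 4.6. Assume (for illustration only) that $\mathcal{C} = C(X)$ for a finite set $X$, so that each ideal has the form $I_S = \{f : f|_S = 0\}$ for a subset $S \subseteq X$, the quotient norm $\|x + I_S\|$ is $\max_{s \in S} |x(s)|$, and the intersection of the ideals $I_S$ corresponds to the union of the subsets $S$. The sketch below checks the identity of the lemma for one hand-picked function and one family of subsets.

```python
# Commutative illustration of Lemma 4.6 on a 5-point space. The function x
# and the family of subsets are arbitrary choices made for this example.

def quotient_norm(x, S):
    """Norm of x + I_S in C(X)/I_S, i.e. the maximum of |x| over the
    non-empty subset S of X."""
    return max(abs(x[s]) for s in S)

x = [0.5, -3.0, 2.0, 1.5, -0.25]   # a function on X = {0, 1, 2, 3, 4}
family = [{0, 1}, {1, 2}, {4}]      # subsets defining the ideals I_S
union = set().union(*family)        # corresponds to the intersection I_0

sup_of_norms = max(quotient_norm(x, S) for S in family)
norm_mod_intersection = quotient_norm(x, union)
print(sup_of_norms, norm_mod_intersection)
```

In this setting the lemma reduces to the observation that a maximum over a union of sets is the largest of the maxima over the individual sets.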
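Proof. It is clear that $\sup_{I \in \mathcal{F}} \|x + I\|$ defines a norm on $\mathcal{C}/I_0$, which is in fact a $C^*$-norm since each of the quotient norms $\|\cdot + I\|$ is so. Thus the lemma follows from the uniqueness of the $C^*$-norm on the $C^*$-algebra $\mathcal{C}/I_0$.

Theorem 4.7. For any spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ satisfying assumptions (i)–(iv) and (vi), the category $\mathcal{Q}_0$ of quantum families of volume-preserving smooth isometries has a universal (initial) object, say $(\mathcal{G}, \alpha_0)$. Moreover, $\mathcal{G}$ has a coproduct $\Delta_0$ such that $(\mathcal{G}, \Delta_0)$ is a compact quantum group and $(\mathcal{G}, \Delta_0, \alpha_0)$ is a universal object in the category $\mathcal{Q}_0'$ of compact quantum groups having smooth isometric volume-preserving action on the given spectral triple. The action $\alpha_0$ is faithful.

Proof. Recall the $C^*$-algebra $\mathcal{U}$ considered before, and the map $\beta$ from $\mathcal{H}^0_D$ to $\mathcal{H}^0_D \otimes \mathcal{U}$. By our definition of $\beta$, it is clear that $\beta(\mathcal{A}^\infty_0) \subseteq \mathcal{A}^\infty_0 \otimes_{\mathrm{alg}} \mathcal{U}$. However, $\beta$ is only a linear map (unitary) but not necessarily a $*$-homomorphism. We shall construct the universal object as a suitable quotient of $\mathcal{U}$. Let $\mathcal{F}$ be the collection of all those $C^*$-ideals $\mathcal{I}$ of $\mathcal{U}$ such that the composition $\Gamma_{\mathcal{I}} := (\mathrm{id} \otimes \pi_{\mathcal{I}}) \circ \beta : \mathcal{A}^\infty_0 \to \mathcal{A}^\infty_0 \otimes_{\mathrm{alg}} (\mathcal{U}/\mathcal{I})$ extends to a $C^*$-homomorphism from $\bar{\mathcal{A}}$ to $\bar{\mathcal{A}} \otimes (\mathcal{U}/\mathcal{I})$, and $(\tau \otimes \mathrm{id})(\Gamma_{\mathcal{I}}(a)) = \tau(a) 1_{\mathcal{U}/\mathcal{I}}$ for all $a \in \mathcal{A}^\infty_0$ (i.e. $(\tau \otimes \mathrm{id})(\beta(a)) - \tau(a) 1_{\mathcal{U}} \in \mathcal{I}$), where $\pi_{\mathcal{I}}$ denotes the quotient map from $\mathcal{U}$ onto $\mathcal{U}/\mathcal{I}$. This collection is nonempty, since the trivial one-dimensional $C^*$-algebra $\mathbb{C}$ gives an object in $\mathcal{Q}_0$ and by Lemma 4.5 we do get a member of $\mathcal{F}$. Now, let $\mathcal{I}_0$ be the intersection of all ideals in $\mathcal{F}$. We claim that $\mathcal{I}_0$ is again a member of $\mathcal{F}$. Since any $C^*$-homomorphism is contractive, we have $\|\Gamma_{\mathcal{I}}(a)\| \equiv \|\beta(a) + \bar{\mathcal{A}} \otimes \mathcal{I}\| \le \|a\|$ for all $a \in \mathcal{A}^\infty_0$ and $\mathcal{I} \in \mathcal{F}$. By Lemma 4.6, we see that $\|\Gamma_{\mathcal{I}_0}(a)\| \le \|a\|$ for $a \in \mathcal{A}^\infty_0$, so $\Gamma_{\mathcal{I}_0}$ extends to a norm-contractive map on $\bar{\mathcal{A}}$ by the density of $\mathcal{A}^\infty_0$ in $\bar{\mathcal{A}}$. Moreover, for $a, b \in \bar{\mathcal{A}}$ and for $\mathcal{I} \in \mathcal{F}$, we have $\Gamma_{\mathcal{I}}(ab) = \Gamma_{\mathcal{I}}(a) \Gamma_{\mathcal{I}}(b)$. Since $\pi_{\mathcal{I}}$ factors through $\pi_{\mathcal{I}_0}$ (because $\mathcal{I}_0 \subseteq \mathcal{I}$), we can rewrite the homomorphic property of $\Gamma_{\mathcal{I}}$ as
$$\Gamma_{\mathcal{I}_0}(ab) - \Gamma_{\mathcal{I}_0}(a) \Gamma_{\mathcal{I}_0}(b) \in \bar{\mathcal{A}} \otimes (\mathcal{I}/\mathcal{I}_0).$$
Since this holds for every $\mathcal{I} \in \mathcal{F}$, we conclude that $\Gamma_{\mathcal{I}_0}(ab) - \Gamma_{\mathcal{I}_0}(a) \Gamma_{\mathcal{I}_0}(b) \in \bigcap_{\mathcal{I} \in \mathcal{F}} \bar{\mathcal{A}} \otimes (\mathcal{I}/\mathcal{I}_0) = (0)$, i.e. $\Gamma_{\mathcal{I}_0}$ is a homomorphism. In a similar way, we can show that it is a $*$-homomorphism. Since each $\beta_i$ is a unitary representation of the compact quantum group $\mathcal{U}_i$ on the finite dimensional space $V_i$, it follows that $\beta_i(V_i)(1 \otimes \mathcal{U}_i)$ is total in $V_i \otimes \mathcal{U}_i$. In particular, for any vector $w \in V_i$ ($i$ arbitrary), the element $w \otimes 1_{\mathcal{U}_i} = w \otimes 1_{\mathcal{U}}$ belongs to the linear span of $\beta_i(V_i)(1 \otimes \mathcal{U}_i) \subset \beta(V_i)(1 \otimes \mathcal{U})$. Thus, $\mathcal{A}^\infty_0 \otimes 1_{\mathcal{U}}$ is contained in the linear span of $\beta(\mathcal{A}^\infty_0)(1 \otimes \mathcal{U})$, and hence $\mathcal{A}^\infty_0 \otimes 1_{\mathcal{U}/\mathcal{I}_0}$ lies in the linear span of $\Gamma_{\mathcal{I}_0}(\mathcal{A}^\infty_0)(1 \otimes \mathcal{U}/\mathcal{I}_0)$. By the norm-density of $\mathcal{A}^\infty_0$ in $\bar{\mathcal{A}}$ and the contractivity of the quotient map, it follows that $\bar{\mathcal{A}} \otimes \mathcal{U}/\mathcal{I}_0$ is the closed linear span of $\Gamma_{\mathcal{I}_0}(\mathcal{A}^\infty_0)(1 \otimes \mathcal{U}/\mathcal{I}_0)$. This proves that $(\mathcal{U}/\mathcal{I}_0, \Gamma_{\mathcal{I}_0})$ is indeed an object of $\mathcal{Q}$. Moreover, it is an object in $\mathcal{Q}_0$, since for $a \in \mathcal{A}^\infty_0$ we have $(\tau \otimes \mathrm{id})(\beta(a)) - \tau(a) 1_{\mathcal{U}} \in \mathcal{I}$ for every $\mathcal{I} \in \mathcal{F}$, hence $(\tau \otimes \mathrm{id})(\beta(a)) - \tau(a) 1_{\mathcal{U}} \in \mathcal{I}_0$.

We now show that $\mathcal{G} := \mathcal{U}/\mathcal{I}_0$ is a universal object in $\mathcal{Q}_0$. To see this, consider any object $(\mathcal{S}, \alpha)$ of $\mathcal{Q}_0$. Without loss of generality we can assume the action to be faithful, since otherwise we can replace $\mathcal{S}$ by the $C^*$-subalgebra generated by the elements $\{v^{(i)}_{kj}\}$ appearing in the proof of Lemma 4.5. But by Lemma 4.5 we can further assume that $\mathcal{S}$ is isomorphic with $\mathcal{U}/\mathcal{I}$ for some $\mathcal{I} \in \mathcal{F}$. Since $\mathcal{I}_0 \subseteq \mathcal{I}$, we have a $C^*$-homomorphism from $\mathcal{U}/\mathcal{I}_0$ onto $\mathcal{U}/\mathcal{I}$, sending $x + \mathcal{I}_0$ to $x + \mathcal{I}$, which is clearly a morphism in the category $\mathcal{Q}_0$. This is indeed the unique such morphism, since it is uniquely determined on the dense subalgebra of $\mathcal{G}$ generated by $\{u^{(i)}_{kj} + \mathcal{I}_0 : i \ge 0,\ j, k \ge 1\}$.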
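To construct the coproduct on $\mathcal{G} = \mathcal{U}/\mathcal{I}_0$, we first consider $\alpha^{(2)} = (\Gamma_{\mathcal{I}_0} \otimes \mathrm{id}) \circ \Gamma_{\mathcal{I}_0} : \bar{\mathcal{A}} \to \bar{\mathcal{A}} \otimes \mathcal{G} \otimes \mathcal{G}$. It is easy to verify that $(\mathcal{G} \otimes \mathcal{G}, \alpha^{(2)})$ is an object in the category $\mathcal{Q}_0$, so by the universality of $(\mathcal{G}, \Gamma_{\mathcal{I}_0})$, we have a unique unital $C^*$-homomorphism $\Delta_0 : \mathcal{G} \to \mathcal{G} \otimes \mathcal{G}$ satisfying
$$(\mathrm{id} \otimes \Delta_0) \circ \Gamma_{\mathcal{I}_0}(x) = \alpha^{(2)}(x) \quad \forall x \in \bar{\mathcal{A}}.$$
Taking $x = e_{ij}$, we get
$$\sum_l e_{il} \otimes (\pi_{\mathcal{I}_0} \otimes \pi_{\mathcal{I}_0}) \Big( \sum_k u^{(i)}_{lk} \otimes u^{(i)}_{kj} \Big) = \sum_l e_{il} \otimes \Delta_0\big(\pi_{\mathcal{I}_0}(u^{(i)}_{lj})\big).$$
Comparing coefficients of $e_{il}$, and recalling that $\tilde{\Delta}(u^{(i)}_{lj}) = \sum_k u^{(i)}_{lk} \otimes u^{(i)}_{kj}$ (where $\tilde{\Delta}$ denotes the coproduct on $\mathcal{U}$), we have
$$(\pi_{\mathcal{I}_0} \otimes \pi_{\mathcal{I}_0}) \circ \tilde{\Delta} = \Delta_0 \circ \pi_{\mathcal{I}_0} \tag{3}$$
on the linear span of $\{u^{(i)}_{jk} : i \ge 0,\ j, k \ge 1\}$, and hence on the whole of $\mathcal{U}$. This implies that $\tilde{\Delta}$ maps $\mathcal{I}_0 = \mathrm{Ker}(\pi_{\mathcal{I}_0})$ into $\mathrm{Ker}(\pi_{\mathcal{I}_0} \otimes \pi_{\mathcal{I}_0}) = (\mathcal{I}_0 \otimes 1 + 1 \otimes \mathcal{I}_0) \subset \mathcal{U} \otimes \mathcal{U}$. In other words, $\mathcal{I}_0$ is a Hopf $C^*$-ideal, and hence $\mathcal{G} = \mathcal{U}/\mathcal{I}_0$ has the canonical compact quantum group structure as a quantum subgroup of $\mathcal{U}$. It is clear from the relation (3) that $\Delta_0$ coincides with the canonical coproduct of the quantum subgroup $\mathcal{U}/\mathcal{I}_0$ inherited from that of $\mathcal{U}$. It is also easy to see that the object $(\mathcal{G}, \Delta_0, \Gamma_{\mathcal{I}_0})$ is universal in the category $\mathcal{Q}_0'$, using the fact that (by Lemma 4.5) any compact quantum group $(\mathcal{S}, \Delta)$ acting smoothly and isometrically on the given spectral triple is isomorphic with a quantum subgroup $\mathcal{U}/\mathcal{I}$, for some Hopf $C^*$-ideal $\mathcal{I}$ of $\mathcal{U}$.

Finally, the faithfulness of $\alpha_0$ follows from the universality by standard arguments which we briefly sketch. If $\mathcal{G}_1 \subset \mathcal{G}$ is a $C^*$-subalgebra of $\mathcal{G}$ such that $\alpha_0(\bar{\mathcal{A}}) \subseteq \bar{\mathcal{A}} \otimes \mathcal{G}_1$, it is easy to see that $(\mathcal{G}_1, \Delta_0, \alpha_0)$ is also a universal object, and by definition of universality of $\mathcal{G}$ it follows that there is a unique morphism, say $j$, from $\mathcal{G}$ to $\mathcal{G}_1$. But the map $i \circ j$ is a morphism from $\mathcal{G}$ to itself, where $i : \mathcal{G}_1 \to \mathcal{G}$ is the inclusion. Again by universality, we have that $i \circ j = \mathrm{id}_{\mathcal{G}}$, so in particular, $i$ is onto, i.e. $\mathcal{G}_1 = \mathcal{G}$.

Since we have already observed that for admissible spectral triples the categories $\mathcal{Q}$ and $\mathcal{Q}'$ coincide with $\mathcal{Q}_0$ and $\mathcal{Q}_0'$ respectively, we obtain the following:

Corollary 4.8. For any admissible spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, the category $\mathcal{Q}$ of quantum families of smooth isometries has a universal (initial) object, say $(\mathcal{G}, \alpha_0)$. Moreover, $\mathcal{G}$ has a coproduct $\Delta_0$ such that $(\mathcal{G}, \Delta_0)$ is a compact quantum group and $(\mathcal{G}, \Delta_0, \alpha_0)$ is a universal object in the category $\mathcal{Q}'$ of compact quantum groups having smooth isometric action on the given spectral triple. The action $\alpha_0$ is faithful.

Remark 4.9. The conclusion of Corollary 4.8 may not hold if we drop the assumption of admissibility.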
For example, for the finite-dimensional spectral triple on $M_n(\mathbb{C})$ considered in Remark 4.4, the category $\mathcal{Q}'$ does not admit a universal object. Indeed, as we have already noted in Remark 4.4, any compact quantum group action on $M_n(\mathbb{C})$ is automatically smooth isometric, so the category $\mathcal{Q}'$ is nothing but the category of all compact quantum groups acting on $M_n(\mathbb{C})$, which does not have a universal object (see [18]).
Definition 4.10. We shall call the universal object $(\mathcal{G}, \Delta_0)$ obtained in Corollary 4.8 above the quantum isometry group of the admissible spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ and denote it by $QISO(\mathcal{A}^\infty, \mathcal{H}, D)$, or just $QISO(\mathcal{A}^\infty)$ (or sometimes $QISO(\bar{\mathcal{A}})$) if the spectral triple is understood from the context.

Remark 4.11. It has been shown in Sect. 2 that any classical spectral triple $(\mathcal{A}^\infty = C^\infty(M), \mathcal{H}, D)$, where $M$ is a compact connected Riemannian spin manifold, $\mathcal{H}$ is the $L^2$ space of square integrable spinors and $D$ is the Dirac operator, is indeed admissible in our sense. Moreover, most of the known examples of noncommutative spectral triples, e.g. those on $\mathcal{A}_\theta$ and on quantum Heisenberg manifolds, are easily seen to be admissible. So, we can define quantum isometry groups for such classical as well as noncommutative manifolds.

Remark 4.12. Let us now briefly indicate how one can weaken the hypothesis of connectedness and define the quantum isometry group for spectral triples satisfying assumptions (i)–(iv) and (vi), but not necessarily the connectedness assumption. Such an extension of our results is desirable to accommodate disconnected classical spaces, including finite sets and graphs, in our framework. One possibility is to define the quantum isometry group as the universal object of $\mathcal{Q}_0'$ obtained in Theorem 4.7. Applying this proposed definition to the spectral triple of Remark 4.4, we get the quantum automorphism group $A_u(I)$, which is the quantum group defined in the statement of Theorem 4.1 of [18]. For a finite set $X$, the categories $\mathcal{Q}'$ and $\mathcal{Q}_0'$ coincide anyway, and we get the quantum permutation group defined in [18] as the quantum isometry group of the spectral triple described in Remark 4.3. However, we would like to regard the above proposal for the definition of $QISO$ of a non-admissible spectral triple as tentative at the moment, keeping open the possibility of exploring more satisfactory alternatives in future work.

Remark 4.13.
In our definition of quantum family of smooth isometries, we have required the invariance of the subspace $\mathcal{A}^\infty_0$ under the quantum action $\alpha$. This is equivalent to the invariance of the possibly bigger space $\mathcal{A}^\infty$ for the classical spectral triples with $\mathcal{A}^\infty = C^\infty(M)$, as seen in Sect. 2. Thus, it is quite natural to look for sufficient conditions for invariance of $\mathcal{A}^\infty$ under the action $\alpha$ in the noncommutative situation. The purpose of this remark is to argue that for an admissible spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, the condition $\bigcap_n \mathrm{Dom}(\mathcal{L}^n) = \mathcal{A}^\infty$ is sufficient to have the invariance of $\mathcal{A}^\infty$ under the action of any quantum family of smooth isometries $(\mathcal{S}, \alpha)$. Indeed, by Lemma 4.5 we have that the map $\tilde{\alpha}$ from $\bar{\mathcal{A}} \otimes \mathcal{S}$ to itself extends to an $\mathcal{S}$-linear unitary on the Hilbert $\mathcal{S}$-module $\mathcal{H}^0_D \otimes \mathcal{S}$, i.e. $\tilde{\alpha}$ can be viewed as a unitary in $\mathcal{B}(\mathcal{H}^0_D) \otimes \mathcal{S}$. Clearly, for any state $\phi$ on $\mathcal{S}$, we have $\alpha_\phi = (\mathrm{id} \otimes \phi)(\tilde{\alpha}) \in \mathcal{B}(\mathcal{H}^0_D)$. Now, by the definition of a smooth isometric action, the bounded operator $\alpha_\phi$ commutes with the self-adjoint operator $\mathcal{L}$ on $\mathcal{A}^\infty_0$, which is a core for $\mathcal{L}$. So, $\alpha_\phi$ must commute with $\mathcal{L}^n$ for all $n$, and in particular keeps $\mathcal{A}^\infty = \bigcap_n \mathrm{Dom}(\mathcal{L}^n)$ invariant.

We shall conclude this section with a brief discussion on how to generalize our formulation and results to the twisted or type-III spectral triples in the sense of Connes and Moscovici ([9]). Let $(\mathcal{A}^\infty, \mathcal{H}, D)$ be a $\sigma$-twisted, Lipschitz-regular spectral triple of finite dimension $n$ ($n$ a positive integer), where $\sigma$ is a unital algebra automorphism of $\mathcal{A}^\infty$ satisfying $\sigma(a)^* = \sigma^{-1}(a^*)$ for all $a \in \mathcal{A}^\infty$. In this case $Da - \sigma(a)D \in \mathcal{B}(\mathcal{H})$ for all $a \in \mathcal{A}^\infty$, and using the faithful positive linear functional $\tau$ given by
$$\tau(X) = \frac{Tr_\omega(X |D|^{-n})}{Tr_\omega(|D|^{-n})}$$
discussed in [9], we can construct analogues of the Hilbert spaces $\mathcal{H}^0_D$ and $\mathcal{H}^1_D$, with the map $d_D$ replaced by $d_D^\sigma$ given by $d_D^\sigma(a) = Da - \sigma(a)D$. Let us make the assumptions (i)–(iv) and (vi) (with $d_D$ replaced by $d_D^\sigma$), and moreover, say that the ($\sigma$-twisted)
spectral triple is admissible if assumption (v) also holds. We shall use the notations $V_i$, $\{\lambda_i\}$ and $\{e_{ij}\}$, and define quantum families of smooth isometries, smooth isometric actions by compact quantum groups and volume-preserving actions as in the case of usual (untwisted) spectral triples. Note that in this case $\tau$ is not necessarily a trace, hence the $\{e_{ij}^*\}$ are not necessarily orthogonal; however, they are still linearly independent. Thus, $Q_i = ((\langle e_{ij}^*, e_{ik}^* \rangle))_{j,k=1}^{d_i} = ((\tau(e_{ij} e_{ik}^*)))_{j,k=1}^{d_i}$ is a nonsingular $d_i \times d_i$ matrix. Let $\mathcal{U}_i^\sigma = A_u(Q_i)$ be the compact quantum group studied in [18,19], which is the universal $C^*$-algebra generated by $\{u^{Q_i}_{kj} : k, j = 1, \ldots, d_i\}$ such that $u := ((u^{Q_i}_{kj}))$ satisfies
$$u u^* = I_n = u^* u, \qquad u' Q_i \bar{u} Q_i^{-1} = I_n = Q_i \bar{u} Q_i^{-1} u'. \tag{4}$$
We set $\beta_i^\sigma : V_i \to V_i \otimes \mathcal{U}_i^\sigma$ by
$$\beta_i^\sigma(e_{ij}) = \sum_k e_{ik} \otimes u^{Q_i}_{kj},$$
and then define a unitary representation $\beta^\sigma$ of the free product $\mathcal{U}^\sigma = *_i\, \mathcal{U}_i^\sigma$ on $\mathcal{H}^0_D$ by taking $\beta^\sigma = *_i \beta_i^\sigma$ as before. It is now straightforward to prove an analogue of Lemma 4.5, with $\mathcal{U}$ replaced by $\mathcal{U}^\sigma$. With the notation used in the proof of Lemma 4.5, we see that the map $\tilde{\alpha}$ is an $\mathcal{S}$-linear unitary on the Hilbert module $\mathcal{H}^0_D \otimes \mathcal{S}$, and it leaves the subspace $V_i \otimes \mathcal{S} \equiv \mathrm{Sp}\{e_{ij}^* : j = 1, \ldots, d_i\} \otimes \mathcal{S}$ invariant. So, the $\mathcal{S}$-valued matrix $\bar{v}_i \equiv ((v^{(i)*}_{kj}))$ obtained from the expansion of $\alpha(e_{ij}^*)$ is an invertible element of $M_{d_i}(\mathbb{C}) \otimes \mathcal{S}$. However, taking the $\mathcal{S}$-valued inner product $\langle \cdot, \cdot \rangle_{\mathcal{S}}$ on both sides of $\alpha(e_{ij}^*) = \sum_k e_{ik}^* \otimes (v^{(i)}_{kj})^*$ and using the fact that $\langle \alpha(x), \alpha(y) \rangle_{\mathcal{S}} = \tau(x^* y) 1_{\mathcal{S}}$, we obtain $Q_i = v_i' Q_i \bar{v}_i$. Thus, $Q_i^{-1} v_i' Q_i$ must be the (both-sided) inverse of $\bar{v}_i$, from which we see that the relations (4) are satisfied with $u$ replaced by $v_i$, inducing a natural $C^*$-homomorphism from $\mathcal{U}_i^\sigma$ into $\mathcal{S}$. The rest of the proof of Lemma 4.5, as well as the proofs of Theorem 4.7 and Corollary 4.8, go through verbatim, just replacing $\mathcal{U}$ and $\beta$ by $\mathcal{U}^\sigma$ and $\beta^\sigma$ respectively. This allows us to define the quantum isometry group of an admissible $\sigma$-twisted spectral triple.

5. Construction of Quantum Group-Equivariant Spectral Triples

In this section, we shall briefly discuss the relevance of the quantum isometry group to the problem of constructing quantum group equivariant spectral triples, which is important for understanding the role of quantum groups in the framework of noncommutative geometry. There has been a lot of activity in this direction recently; see, for example, the articles by Chakraborty and Pal ([6]), Connes ([8]), Landi et al. ([10]) and the references therein. In the classical situation, there exists a natural unitary representation of the isometry group $G = ISO(M)$ of a manifold $M$ on the Hilbert space of forms, so that the operator $d + d^*$ (where $d$ is the de Rham differential operator) commutes with the representation. Indeed, $d + d^*$ is also a Dirac operator for the spectral triple given by the natural representation of $C^\infty(M)$ on the Hilbert space of forms, so we have a canonical construction of a $G$-equivariant spectral triple. Our aim in this section is to generalize this to the noncommutative framework, by proving that $d_D + d_D^*$ is equivariant with respect to a canonical unitary representation on the Hilbert space of 'noncommutative forms' (see, for example, [12] for a detailed discussion of such forms).
Consider an admissible spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ and moreover, make the assumption of Lemma 3.1, i.e. assume that $t \mapsto e^{itD} x e^{-itD}$ is norm-differentiable at $t = 0$ for all $x$ in the $*$-algebra $B$ generated by $\mathcal{A}^\infty$ and $[D, \mathcal{A}^\infty]$.

Lemma 5.1. In the notation of Lemma 3.1, we have the following (where $b, c \in \mathcal{A}^\infty$):
\[
d_D^*(d_D(b)c) = -\frac{1}{2}\,\bigl(bL(c) - L(b)c - L(bc)\bigr). \tag{5}
\]

Proof. Denote by $\chi(b, c)$ the right hand side of Eq. (5) and fix any $a \in \mathcal{A}^\infty$. Using the facts that the functional $\tau$ is a faithful trace on the $*$-algebra $B$, that $L = -d_D^* d_D$ and that $\tau([D, X]) = 0$ for any $X$ in $B$, we have
\[
\begin{aligned}
\tau(a^* \chi(b, c)) &= -\frac{1}{2}\,\{\tau(a^* b L(c)) - \tau(c a^* L(b)) - \tau(a^* L(bc))\} \\
&= \frac{1}{2}\,\{\tau([D, a^* b][D, c]) - \tau([D, c a^*][D, b]) - \tau([D, a^*][D, bc])\} \\
&= \frac{1}{2}\,\{\tau(a^* [D, b][D, c]) - \tau([D, c] a^* [D, b]) - \tau(c [D, a^*][D, b]) - \tau([D, a^*][D, b] c)\} \\
&= -\tau([D, a^*][D, b] c) = \tau([D, a]^* [D, b] c) \\
&= \langle d_D(a), d_D(b) c \rangle = \tau(a^* (d_D^*(d_D(b) c))).
\end{aligned}
\]

From this, we get the following by a simple computation:
\[
\langle a\, d_D(b), a'\, d_D(b') \rangle = -\frac{1}{2}\, \tau\bigl(b^* \Psi(a^* a', b')\bigr), \tag{6}
\]
for $a, b, a', b' \in \mathcal{A}^\infty$, and where $\Psi(x, y) := L(xy) - L(x)y + xL(y)$.

Now, let us denote the quantum isometry group of the given spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ by $(G, \Delta, \alpha)$. Let $\mathcal{A}_0$ denote the $*$-algebra generated by $\mathcal{A}^\infty_0$, and let $G_0$ denote the $*$-algebra of $G$ generated by matrix elements of irreducible representations. Clearly, $\alpha : \mathcal{A}_0 \to \mathcal{A}_0 \otimes_{\rm alg} G_0$ is a Hopf-algebraic action of $G_0$ on $\mathcal{A}_0$. Define a $\mathbb{C}$-bilinear map $\tilde{\Psi} : (\mathcal{A}_0 \otimes_{\rm alg} G_0) \times (\mathcal{A}_0 \otimes_{\rm alg} G_0) \to \mathcal{A}_0 \otimes_{\rm alg} G_0$ by setting
\[
\tilde{\Psi}((x \otimes q), (x' \otimes q')) := \Psi(x, x') \otimes (q q').
\]
It follows from the relation $(L \otimes \mathrm{id}) \circ \alpha = \alpha \circ L$ on $\mathcal{A}_0$ that
\[
\tilde{\Psi}(\alpha(x), \alpha(y)) = \alpha(\Psi(x, y)). \tag{7}
\]
We now define a linear map $\alpha^{(1)}$ from the linear span of $\{a\, d_D(b) : a, b \in \mathcal{A}_0\}$ to $\mathcal{H}^1_D \otimes G$ by setting
\[
\alpha^{(1)}(a\, d_D(b)) := \sum_{i,j} a_i^{(1)} d_D(b_j^{(1)}) \otimes a_i^{(2)} b_j^{(2)},
\]
where for any $x \in \mathcal{A}_0$ we write $\alpha(x) = \sum_i x_i^{(1)} \otimes x_i^{(2)} \in \mathcal{A}_0 \otimes_{\rm alg} G_0$ (summation over finitely many terms). We shall sometimes use the Sweedler convention of writing the above simply as $\alpha(x) = x^{(1)} \otimes x^{(2)}$. It then follows from the identities (6) and (7), and also the fact that $(\tau \otimes \mathrm{id})(\alpha(a)) = \tau(a) 1$ for all $a \in \mathcal{A}_0$, that
\[
\begin{aligned}
\langle \alpha^{(1)}(a\, d_D(b)), \alpha^{(1)}(a'\, d_D(b')) \rangle_G
&= -\frac{1}{2}\, (\tau \otimes \mathrm{id})\bigl(\alpha(b^*)\, \tilde{\Psi}(\alpha(a^* a'), \alpha(b'))\bigr) \\
&= -\frac{1}{2}\, (\tau \otimes \mathrm{id})\bigl(\alpha(b^*)\, \alpha(\Psi(a^* a', b'))\bigr) \\
&= -\frac{1}{2}\, (\tau \otimes \mathrm{id})\bigl(\alpha(b^* \Psi(a^* a', b'))\bigr) \\
&= -\frac{1}{2}\, \tau\bigl(b^* \Psi(a^* a', b')\bigr) 1_G \\
&= \langle a\, d_D(b), a'\, d_D(b') \rangle 1_G.
\end{aligned}
\]
This proves that $\alpha^{(1)}$ is indeed well-defined and extends to a $G$-linear isometry on $\mathcal{H}^1_D \otimes G$, to be denoted by $U^{(1)}$, which sends $(a\, d_D(b)) \otimes q$ to $\alpha^{(1)}(a\, d_D(b))(1 \otimes q)$, $a, b \in \mathcal{A}_0$, $q \in G$. Moreover, since the linear span of $\alpha(\mathcal{A}^\infty_0)(1 \otimes G)$ is dense in $\mathcal{H}^0_D \otimes G$, it is easily seen that the range of the isometry $U^{(1)}$ is the whole of $\mathcal{H}^1_D \otimes G$, i.e. $U^{(1)}$ is a unitary. In fact, from its definition it can also be shown that $U^{(1)}$ is a unitary representation of the compact quantum group $G$ on $\mathcal{H}^1_D$.

In a similar way, we can construct a unitary representation $U^{(n)}$ of $G$ on the Hilbert space of $n$-forms for any $n \geq 1$, by defining
\[
U^{(n)}\bigl((a_0\, d_D(a_1)\, d_D(a_2) \cdots d_D(a_n)) \otimes q\bigr) = a_0^{(1)} d_D(a_1^{(1)}) \cdots d_D(a_n^{(1)}) \otimes \bigl(a_0^{(2)} a_1^{(2)} \cdots a_n^{(2)} q\bigr)
\]
(where $a_i \in \mathcal{A}^\infty_0$, $q \in G$, and the Sweedler convention is used), and verifying that it extends to a unitary. We also denote by $U^{(0)}$ the unitary representation $\tilde{\alpha}$ on $\mathcal{H}^0_D$ discussed before. Finally, we have a unitary representation $U = \bigoplus_{n \geq 0} U^{(n)}$ of $G$ on $\tilde{\mathcal{H}} := \bigoplus_n \mathcal{H}^n_D$, and we also extend $d_D$ as a closed densely defined operator on $\tilde{\mathcal{H}}$ in the obvious way, by defining $d_D(a_0\, d_D(a_1) \cdots d_D(a_n)) = d_D(a_0) \cdots d_D(a_n)$. It is now straightforward to see the following:

Theorem 5.2. The operator $D' := d_D + d_D^*$ is equivariant in the sense that $U(D' \otimes 1) = (D' \otimes 1)U$.
We point out that there is a natural representation $\pi$ of $\mathcal{A}$ on $\tilde{\mathcal{H}}$ given by $\pi(a)(a_0\, d_D(a_1) \cdots d_D(a_n)) = a a_0\, d_D(a_1) \cdots d_D(a_n)$, and $(\pi(\mathcal{A}^\infty), \tilde{\mathcal{H}}, D')$ is indeed a spectral triple, which is $G$-equivariant. Although the relation between the spectral properties of $D$ and $D'$ is not clear in general, in many cases of interest (e.g. when there is an underlying type $(1, 1)$ spectral data in the sense of [12]) these two Dirac operators are closely related. As an illustration, consider the canonical spectral triple on the noncommutative 2-torus $\mathcal{A}_\theta$, which is discussed in some detail in the next section. In this case, the Dirac operator $D$ acts on $L^2(\mathcal{A}_\theta, \tau) \otimes \mathbb{C}^2$, and it can easily be shown (see [12]) that the Hilbert space of forms is isomorphic with $L^2(\mathcal{A}_\theta, \tau) \otimes \mathbb{C}^4 \cong L^2(\mathcal{A}_\theta) \otimes \mathbb{C}^2$; thus $D'$ is essentially the same as $D$ in this case.
6. Examples and Computations

We give some simple yet interesting explicit examples of quantum isometry groups here. However, we give only some computational details for the first example, and for the rest, the reader is referred to a companion article ([4]).

Example 1. Commutative torus. Consider $M = \mathbb{T}$, the one-torus, with the usual Riemannian structure. The $*$-algebra $\mathcal{A}^\infty = C^\infty(M)$ is generated by one unitary $U$, which is the multiplication operator by $z$ in $L^2(\mathbb{T})$. The Laplacian is given by $L(U^n) = -n^2 U^n$. If a compact quantum group $(S, \Delta_S)$ acts on $\mathcal{A}^\infty$ smoothly, let $A_n$, $n \in \mathbb{Z}$, be elements of $S$ such that $\alpha_0(U) = \sum_n U^n \otimes A_n$ (here $\alpha_0 : \mathcal{A}^\infty \to \mathcal{A}^\infty \otimes_{\rm alg} S$ is the $S$-action on $\mathcal{A}^\infty$). Note that this infinite sum converges at least in the topology of the Hilbert space $L^2(\mathbb{T}) \otimes L^2(S)$, where $L^2(S)$ denotes the GNS space for the Haar state of $S$. It is clear that the condition $(L \otimes \mathrm{id}) \circ \alpha_0 = \alpha_0 \circ L$ forces $A_n = 0$ for all $n$ except $n = \pm 1$. The conditions $\alpha_0(U) \alpha_0(U)^* = \alpha_0(U)^* \alpha_0(U) = 1 \otimes 1$ further imply the following:
\[
A_1^* A_1 + A_{-1}^* A_{-1} = 1 = A_1 A_1^* + A_{-1} A_{-1}^*,
\]
\[
A_1^* A_{-1} = A_{-1}^* A_1 = A_1 A_{-1}^* = A_{-1} A_1^* = 0.
\]
It follows that $A_{\pm 1}$ are partial isometries with orthogonal domains and ranges. Say, $A_1$ has domain $P$ and range $Q$. Hence the domain and range of $A_{-1}$ are respectively $1 - P$ and $1 - Q$. Consider the unitary $V = A_1 + A_{-1}$, so that $V P = A_1$, $V(1 - P) = A_{-1}$. Now, from the fact that $(L \otimes \mathrm{id})(\alpha_0(U^2)) = \alpha_0(L(U^2))$ it is easy to see that the coefficient of $1 \otimes 1$ in the expression of $\alpha_0(U)^2$ must be $0$, i.e. $A_1 A_{-1} + A_{-1} A_1 = 0$. From this, it follows that $V$ and $P$ commute and therefore $P = Q$. By straightforward calculation using the facts that $V$ is unitary, $P$ is a projection and $V$ and $P$ commute, we can verify that $\alpha_0$ given by $\alpha_0(U) = U \otimes V P + U^{-1} \otimes V(1 - P)$ extends to a $*$-homomorphism from $\mathcal{A}^\infty$ to $\mathcal{A}^\infty \otimes C^*(V, P)$ satisfying $(L \otimes \mathrm{id}) \circ \alpha_0 = \alpha_0 \circ L$.
It follows that the $C^*$-algebra $QISO(\mathbb{T})$ is commutative and generated by a unitary $V$ and a projection $P$, or equivalently by two partial isometries $A_1$, $A_{-1}$ such that $A_1^* A_1 = A_1 A_1^*$, $A_{-1}^* A_{-1} = A_{-1} A_{-1}^*$, $A_1 A_{-1} = A_{-1} A_1 = 0$. So, as a $C^*$-algebra it is isomorphic with $C(\mathbb{T}) \oplus C(\mathbb{T}) \cong C(\mathbb{T} \times \mathbb{Z}_2)$. The coproduct (say $\Delta_0$) can easily be calculated from the requirement of co-associativity, and the Hopf algebra structure of $QISO(\mathbb{T})$ can be seen to coincide with that of the semi-direct product of $\mathbb{T}$ by $\mathbb{Z}_2$, where the generator of $\mathbb{Z}_2$ acts on $\mathbb{T}$ by sending $z \mapsto \bar{z}$. We summarize this in the form of the following.

Theorem 6.1. The quantum isometry group $QISO(\mathbb{T})$ of the one-torus $\mathbb{T}$ is isomorphic (as a quantum group) with $C(\mathbb{T} \rtimes \mathbb{Z}_2) = C(ISO(\mathbb{T}))$.

We can easily extend this result to higher dimensional commutative tori, and can prove that the quantum isometry group coincides with the classical isometry group. This is some kind of rigidity result, and it will be interesting to investigate the nature of quantum isometry groups of more general classical manifolds.

Example 2. Noncommutative torus; holomorphic isometries. Next we consider the simplest and well-known example of a noncommutative manifold, namely the noncommutative two-torus $\mathcal{A}_\theta$, where $\theta$ is a fixed irrational number (see [7]). It is the universal $C^*$-algebra generated by two unitaries $U$ and $V$ satisfying the commutation relation $UV = \lambda VU$, where $\lambda = e^{2\pi i \theta}$. There is a canonical faithful trace $\tau$ on $\mathcal{A}_\theta$ given by
$\tau(U^m V^n) = \delta_{m0}\,\delta_{n0}$. We consider the canonical spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, where $\mathcal{A}^\infty$ is the unital $*$-algebra spanned by $U$, $V$; $\mathcal{H} = L^2(\tau) \oplus L^2(\tau)$ and $D$ is given by
\[
D = \begin{pmatrix} 0 & d_1 + i d_2 \\ d_1 - i d_2 & 0 \end{pmatrix},
\]
where $d_1$ and $d_2$ are closed unbounded linear maps on $L^2(\tau)$ given by $d_1(U^m V^n) = m U^m V^n$, $d_2(U^m V^n) = n U^m V^n$. It is easy to compute the space of one-forms $\Omega^1_D$ (see [5,12,7]) and the Laplacian $L = -d^* d$ is given by $L(U^m V^n) = -(m^2 + n^2) U^m V^n$. For simplicity of computation, instead of the full quantum isometry group we at first concentrate on an interesting quantum subgroup $G = QISO^{\rm hol}(\mathcal{A}^\infty, \mathcal{H}, D)$, which is the universal quantum group which leaves invariant the subalgebra of $\mathcal{A}^\infty$ consisting of polynomials in $U$, $V$ and $1$, i.e. the span of $U^m V^n$ with $m, n \geq 0$. The proof of existence and uniqueness of such a universal quantum group is more or less identical to the proof of existence and uniqueness of QISO. We call $G$ the quantum group of "holomorphic" isometries, and observe in the theorem stated below without proof (see [4]) that this quantum group is nothing but the quantum double torus studied in [13].

Theorem 6.2. Consider the following co-product $\Delta_B$ on the $C^*$-algebra $B = C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta}$, given on the generators $A_0, B_0, C_0, D_0$ as follows (where $A_0, D_0$ correspond to $C(\mathbb{T}^2)$ and $B_0, C_0$ correspond to $\mathcal{A}_{2\theta}$):
\[
\Delta_B(A_0) = A_0 \otimes A_0 + C_0 \otimes B_0, \qquad \Delta_B(B_0) = B_0 \otimes A_0 + D_0 \otimes B_0,
\]
\[
\Delta_B(C_0) = A_0 \otimes C_0 + C_0 \otimes D_0, \qquad \Delta_B(D_0) = B_0 \otimes C_0 + D_0 \otimes D_0.
\]
Then $(B, \Delta_B)$ is a compact quantum group and it has an action $\alpha_0$ on $\mathcal{A}_\theta$ given by
\[
\alpha_0(U) = U \otimes A_0 + V \otimes B_0, \qquad \alpha_0(V) = U \otimes C_0 + V \otimes D_0.
\]
Moreover, $(B, \Delta_B)$ is isomorphic (as a quantum group) with $G = QISO^{\rm hol}(\mathcal{A}^\infty, \mathcal{H}, D)$.

We refer to [4] for a proof of the above result, and to [13] for the computation of the Haar state and representation theory of the quantum group $G$.

Example 3. Noncommutative torus; full quantum isometry group.
By similar but somewhat tedious calculations (see [4]) one can also describe explicitly the full quantum isometry group $QISO(\mathcal{A}^\infty, \mathcal{H}, D)$. As a $C^*$-algebra it has eight direct summands, four of which are isomorphic with the commutative algebra $C(\mathbb{T}^2)$, and the other four are irrational rotation algebras.

Theorem 6.3. $QISO(\mathcal{A}_\theta) = \bigoplus_{k=1}^{8} C^*(U_{k1}, U_{k2})$ (as a $C^*$-algebra), where for odd $k$, $U_{k1}, U_{k2}$ are the two commuting unitary generators of $C(\mathbb{T}^2)$, and for even $k$, $U_{k1} U_{k2} = \exp(4\pi i \theta)\, U_{k2} U_{k1}$, i.e. they generate $\mathcal{A}_{2\theta}$. The (co)-action on the generators $U, V$ (say) of $\mathcal{A}_\theta$ is given by the following:
\[
\alpha_0(U) = U \otimes (U_{11} + U_{41}) + V \otimes (U_{52} + U_{61}) + U^{-1} \otimes (U_{21} + U_{31}) + V^{-1} \otimes (U_{71} + U_{81}),
\]
\[
\alpha_0(V) = U \otimes (U_{62} + U_{72}) + V \otimes (U_{12} + U_{22}) + U^{-1} \otimes (U_{51} + U_{82}) + V^{-1} \otimes (U_{32} + U_{42}).
\]
From the co-associativity condition, the co-product of $QISO(\mathcal{A}_\theta)$ can easily be calculated. For the detailed description of the coproduct, counit, antipode and study of the representation theory of $QISO(\mathcal{A}_\theta)$, the reader is referred to [4]. It is interesting to mention here that the quantum isometry group of $\mathcal{A}_\theta$ is a Rieffel type deformation of the
isometry group (which is the same as the quantum isometry group) of the commutative two-torus. The commutative two-torus is a subgroup of its isometry group, but when the isometry group is deformed into QISO(Aθ ), the subgroup relation is not respected, and the deformation of the commutative torus, which is A2θ , sits in QISO(Aθ ) just as a C ∗ subalgebra (in fact a direct summand) but not as a quantum subgroup any more. This perhaps provides some explanation of the non-existence of any Hopf algebra structure on the noncommutative torus. Acknowledgement. The author would like to thank P. Hajac for drawing his attention to the article [13], and S.L. Woronowicz for many valuable comments and suggestions which led to substantial improvement of the paper. The author gratefully acknowledges support obtained from the Indian National Academy of Sciences through the grants for a project on ‘Noncommutative Geometry and Quantum Groups’, and also wishes to thank The Abdus Salam ICTP (Trieste), where a major part of the work was done during a visit as Junior Associate.
References

1. Banica, T.: Quantum automorphism groups of small metric spaces. Pacific J. Math. 219(1), 27–51 (2005)
2. Banica, T.: Quantum automorphism groups of homogeneous graphs. J. Funct. Anal. 224(2), 243–280 (2005)
3. Bichon, J.: Quantum automorphism groups of finite graphs. Proc. Amer. Math. Soc. 131(3), 665–673 (2003)
4. Bhowmick, J., Goswami, D.: Quantum isometry groups: examples and computations. http://arxiv.org/abs/0707.2648[math.QA], 2007
5. Chakraborty, P.S., Goswami, D., Sinha, K.B.: Probability and geometry on some noncommutative manifolds. J. Operator Theory 49(1), 185–201 (2003)
6. Chakraborty, P.S., Pal, A.: Equivariant spectral triples on the quantum SU(2) group. K-Theory 28, 107–126 (2003)
7. Connes, A.: "Noncommutative Geometry". London-New York: Academic Press, 1994
8. Connes, A.: Cyclic cohomology, quantum group symmetries and the local index formula for $SU_q(2)$. J. Inst. Math. Jussieu 3(1), 17–68 (2004)
9. Connes, A., Moscovici, H.: Type III and spectral triples. http://arxiv.org/abs/math/0609703v2[math.OA], 2006
10. Dabrowski, L., Landi, G., Sitarz, A., van Suijlekom, W., Varilly, J.C.: The Dirac operator on $SU_q(2)$. Commun. Math. Phys. 259(3), 729–759 (2005)
11. Donnelly, H.: Eigenfunctions of Laplacians on Compact Riemannian Manifolds. Asian J. Math. 10(1), 115–126 (2006)
12. Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory and non-commutative geometry. Commun. Math. Phys. 203(1), 119–184 (1999)
13. Hajac, P., Masuda, T.: Quantum Double-Torus. Comptes Rendus Acad. Sci. Paris 327(6), Ser. I, Math. 553–558 (1998)
14. Rosenberg, S.: "The Laplacian on a Riemannian Manifold". Cambridge: University Press, 1997
15. Soltan, P. M.: Quantum families of maps and quantum semigroups on finite quantum spaces. http://arxiv.org/abs/math/0610922v4[math.OA], 2006
16. Maes, A., Van Daele, A.: Notes on compact quantum groups. Nieuw Arch. Wisk. 4 16(1–2), 73–112 (1998)
17. Wang, S.: Free products of compact quantum groups. Commun. Math. Phys. 167(3), 671–692 (1995)
18. Wang, S.: Quantum symmetry groups of finite spaces. Commun. Math. Phys. 195, 195–211 (1998)
19. Wang, S.: Structure and isomorphism classification of compact quantum groups $A_u(Q)$ and $B_u(Q)$. J. Operator Theory 48, 573–583 (2002)
20. Woronowicz, S.L.: Compact matrix pseudogroups. Commun. Math. Phys. 111(4), 613–665 (1987)
21. Woronowicz, S.L.: "Compact quantum groups". In: Symétries quantiques (Quantum symmetries) (Les Houches, 1995), edited by A. Connes et al., Amsterdam: Elsevier, 1998, pp. 845–884
22. Woronowicz, S.L.: Pseudogroups, pseudospaces and Pontryagin duality. Proceedings of the International Conference on Mathematical Physics, Lausanne. Lecture Notes in Physics 116, 407–412 (1979)

Communicated by A. Connes
Commun. Math. Phys. 285, 161–174 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0582-6
Communications in
Mathematical Physics
On the Chernoff Distance for Asymptotic LOCC Discrimination of Bipartite Quantum States

William Matthews$^1$, Andreas Winter$^2$

$^1$ Department of Mathematics, University of Bristol, Bristol BS8 1TW, U.K. E-mail: [email protected]; [email protected]
$^2$ Centre for Quantum Technologies, National University of Singapore, 2 Science Drive 3, Singapore 117542, Singapore. E-mail: [email protected]

Received: 29 October 2007 / Accepted: 3 April 2008
Published online: 29 July 2008 – © Springer-Verlag 2008
Abstract: Motivated by the recent discovery of a quantum Chernoff theorem for asymptotic state discrimination, we investigate the distinguishability of two bipartite mixed states under the constraint of local operations and classical communication (LOCC), in the limit of many copies. While for two pure states a result of Walgate et al. shows that LOCC is just as powerful as global measurements, data hiding states (DiVincenzo et al.) show that locality can impose severe restrictions on the distinguishability of even orthogonal states. Here we determine the optimal error probability and measurement to discriminate many copies of particular data hiding states (extremal $d \times d$ Werner states) by a linear programming approach. Surprisingly, the single-copy optimal measurement remains optimal for $n$ copies, in the sense that the best strategy is measuring each copy separately, followed by a simple classical decision rule. We also put a lower bound on the bias with which states can be distinguished by separable operations.

1. Introduction

The non-classical nature of information represented in states of a bipartite quantum system is strikingly evident in the fact that, even allowing the experimenters (Alice and Bob) holding each of the subsystems to use local operations and classical communication (LOCC) freely, they cannot access the information as well as if they were in the same lab or could exchange quantum states. Thus, there is a specifically quantum obstruction to the distributed analysis of data and investigating this obstruction is a way of obtaining an understanding of the quantum nature of information. The problem of LOCC discrimination of two or more states has recently attracted quite considerable attention [1–13] and what can be said at the very least is that it is difficult.
In the simplest example, the experimenters are given one of two states at random according to some probability distribution and their task is to unambiguously determine which state they have with the smallest possible error probability. Throughout this paper we'll use $P^X_{\rm err}(\rho_1, \rho_2; p)$ to denote the minimum error with which the states $\rho_1$ and $\rho_2$, with prior probabilities $p$ and $1 - p$ respectively, can be distinguished by a POVM that can be implemented by operations in the class $X$. It will sometimes be convenient to refer to the optimal bias (over random guessing) instead of the optimal probability. This we define, as usual, by
\[
B^X = 1 - 2 P^X_{\rm err}. \tag{1}
\]
In this work we will talk about the well known classes of PPT-preserving (PPT) operations, separable (SEP) operations [14] and local operations with classical communication (LOCC), which obey the strict inclusions [15] LOCC ⊂ SEP ⊂ PPT ⊂ ALL ,
(2)
where ALL simply denotes the set of all possible global operations. Briefly, the POVMs which can be implemented by operations in these different classes can be characterized as follows. An LOCC POVM is one which can be implemented as a multi-round process where each round consists of a partial measurement of one party, which can depend on previously generated classical messages, and whose result is broadcast. A POVM is in SEP if and only if its elements can be written as positive linear combinations of product operators. A POVM can be implemented by PPT operations if and only if its constituent operators have positive partial transpose. The inclusion structure immediately implies the ordering
\[
P^{\rm LOCC}_{\rm err} \geq P^{\rm SEP}_{\rm err} \geq P^{\rm PPT}_{\rm err} \geq P^{\rm ALL}_{\rm err} = \frac{1}{2} - \frac{1}{2}\, \| p \rho_1 - (1 - p) \rho_2 \|_1. \tag{3}
\]
The final equality is the classic result of Helstrom and Holevo [16]. A similar closed form expression does not seem to exist for $P^{\rm LOCC}_{\rm err}$ or any of the other bipartite $P^X_{\rm err}$.

Motivated by the recent development of a quantum Chernoff theorem [17], we are interested here in the asymptotic behaviour of the quantity $P^X_{\rm err}(\rho_1^{\otimes n}, \rho_2^{\otimes n}; p)$ as the number of copies, $n$, goes to infinity. We can define the Chernoff distance with respect to a class of operations $X$, between the states $\rho_1$ and $\rho_2$, by
\[
\xi^X(\rho_1, \rho_2) = \lim_{n \to \infty} -\frac{1}{n} \log P^X_{\rm err}\bigl(\rho_1^{\otimes n}, \rho_2^{\otimes n}; p\bigr). \tag{4}
\]
(We note that the Chernoff distance is not strictly a distance since it does not obey the triangle inequality and that it is independent of the prior probabilities as long as they are both non-zero.) In [17], it was determined that the (unconstrained) quantum Chernoff distance $\xi^{\rm ALL}(\rho_1, \rho_2)$ is given by the formula (note the independence of $p$):
\[
\xi^{\rm ALL}(\rho_1, \rho_2) = -\min_{0 \leq s \leq 1} \log \operatorname{Tr} \rho_1^{1-s} \rho_2^{s}. \tag{5}
\]
This is a pleasantly straightforward generalisation of the classical Chernoff theorem for probability distributions, where for probability distribution vectors $p$ and $q$,
\[
\xi(p, q) = -\min_{0 \leq s \leq 1} \log \sum_{i=1}^{n} p_i^{1-s} q_i^{s}. \tag{6}
\]
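As a numerical illustration of the classical formula (6), the following minimal Python sketch evaluates $\xi(p, q)$ for two finite distributions. The function name `chernoff_distance`, the use of natural logarithms, and the ternary search over $s$ are illustrative assumptions, not part of the original analysis; the search relies on the fact that $\sum_i p_i^{1-s} q_i^s$ is convex in $s$, and outcomes on which either distribution vanishes are dropped, which corresponds to taking the limit from the interior of $[0, 1]$.

```python
import math


def chernoff_distance(p, q, iters=200):
    """Classical Chernoff distance -min_{0<=s<=1} log sum_i p_i^(1-s) q_i^s.

    Natural logarithm; p and q are sequences of probabilities of equal length.
    """
    # Outcomes where either probability is zero contribute nothing for 0 < s < 1.
    pairs = [(a, b) for a, b in zip(p, q) if a > 0 and b > 0]
    if not pairs:
        return math.inf  # disjoint supports: perfectly distinguishable

    def overlap(s):
        return sum(a ** (1 - s) * b ** s for a, b in pairs)

    # overlap(s) is a sum of exponentials in s, hence convex: ternary search works.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if overlap(m1) < overlap(m2):
            hi = m2
        else:
            lo = m1
    return -math.log(overlap((lo + hi) / 2))
```

For instance, two identical distributions give distance zero, while for the pair $(0.9, 0.1)$ and $(0.1, 0.9)$ the minimum is attained at $s = 1/2$.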
It is useful to define yet another Chernoff distance on quantum states, for an even more restricted class of measurements than LOCC. Let $(M, \mathbb{1} - M)$ be the optimal single-copy LOCC POVM. $\xi^{\rm SC}(\rho_1, \rho_2; p)$ is the classical Chernoff distance between the probability distributions on the outcome of this measurement when it is performed on $\rho_1$ or $\rho_2$. (Outside the bipartite setting this notion was considered before by Kargin [18].) If we write
\[
p_{j1} = \operatorname{Tr} M \rho_j, \qquad p_{j2} = \operatorname{Tr}\, (\mathbb{1} - M) \rho_j, \tag{7}
\]
we can summarize the relationships between the Chernoff distances we have defined as follows:
\[
-\min_{0 \leq s \leq 1} \log \sum_{i=1}^{2} p_{1i}^{1-s} p_{2i}^{s} = \xi^{\rm SC} \leq \xi^{\rm LOCC} \leq \xi^{\rm SEP} \leq \xi^{\rm PPT} \leq \xi^{\rm ALL} = -\min_{0 \leq s \leq 1} \log \operatorname{Tr} \rho_1^{1-s} \rho_2^{s}. \tag{8}
\]
Before proceeding with our main new results, we would like to make some general remarks about these quantities and describe some of the existing knowledge about them. One striking difference between global and local state discrimination can be seen in the effect of adding an ancilla. In the global case, this has no effect on our ability to distinguish between states, asymptotically or otherwise. That is, for any state $\tau$,
\[
\begin{aligned}
P^{\rm ALL}_{\rm err}(\rho_1, \rho_2; p) &= P^{\rm ALL}_{\rm err}(\rho_1 \otimes \tau, \rho_2 \otimes \tau; p), \\
\xi^{\rm ALL}(\rho_1, \rho_2; p) &= \xi^{\rm ALL}(\rho_1 \otimes \tau, \rho_2 \otimes \tau; p).
\end{aligned} \tag{9}
\]
This is hardly surprising when one considers that the addition of any ancilla state is subsumed by the POVM formalism in the global case. In cases where our ability to distinguish between two states (of a $d \times d$ system, let's say) is worsened by restriction to LOCC, then we will indeed be helped by the provision of a $d \times d$ maximally entangled ancilla: by using it to teleport Alice's half to Bob (say), we have restored the ability to make global measurements and will be able to decrease the error probability accordingly.

It is not always the case that the restriction to LOCC will impair our performance, however. It was shown by Walgate et al. [1] (and generalized to non-orthogonal states by Virmani et al. [2]) that LOCC can do just as well in distinguishing between two pure states as a global measurement can:
\[
P^{\rm ALL}_{\rm err}(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|; p) = P^{\rm LOCC}_{\rm err}(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|; p). \tag{10}
\]
Naturally, the corresponding Chernoff distances are also equal when both states are pure. Recently, Nathanson [19] has generalized this to the case of discriminating a mixed state from a pure state. He finds that under certain conditions on the fidelity of the states and the Schmidt coefficients of the pure state, $\xi^{\rm LOCC}(\rho_1, \rho_2) = \xi^{\rm ALL}(\rho_1, \rho_2)$, even though the single-copy error probabilities may differ.

From our perspective, it is more interesting to look at pairs of states where the LOCC constraint reduces our ability to distinguish them. In this paper we discuss an example of such a case. Let $\sigma_d$ and $\alpha_d$ denote the completely symmetric and completely antisymmetric Werner states in $d \times d$ dimensions, respectively (when $d$ is a power of two, these are the states used by DiVincenzo et al. [20] for "data hiding"; see also [21]). In this paper we calculate the Chernoff distance between these states, $\xi^{\rm LOCC}(\sigma_d, \alpha_d)$, and to do so, we actually give an expression for $P^{\rm LOCC}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr)$.

The rest of this paper is organized as follows: In the next section we present an LOCC protocol which puts an upper bound on $P^{\rm LOCC}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr)$. In Sect. 3, we formulate the minimization of the error which can be achieved by PPT operations as a linear program, and by solving the dual program show that the LOCC upper bound is also a lower bound on $P^{\rm PPT}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr)$, and hence on $P^{\rm LOCC}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr)$, thus proving the optimality of our LOCC protocol, and allowing us to calculate the Chernoff distance. In Sect. 4, we prove a lower bound on $B^{\rm SEP}(\rho_1, \rho_2; p)$ in terms of $B^{\rm ALL}(\rho_1, \rho_2; p)$, after which we conclude.

To describe asymptotic behaviours we will use 'Big-O' notation (including $\Omega$, $\Theta$ and $\sim$). If $X$ is an operator on a bipartite Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$, we use $X^\Gamma$ to denote its partial transpose, which is defined (for some orthonormal product basis $\{|i\rangle_A \otimes |j\rangle_B\}$) by
\[
\bigl(|i\rangle_A \otimes |j\rangle_B\, \langle k|_A \otimes \langle l|_B\bigr)^\Gamma = |i\rangle_A \otimes |l\rangle_B\, \langle k|_A \otimes \langle j|_B. \tag{11}
\]
2. LOCC Discrimination Protocol

Proposition 1. There is an LOCC protocol (requiring only one-way communication) which demonstrates that
\[
P^{\rm LOCC}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr) \leq \min\left\{ p \left( \frac{d-1}{d+1} \right)^{n},\ 1 - p \right\}.
\]

Proof. Alice and Bob take each copy in turn and measure in the computational basis. They share their results. If they recorded different results for every copy then they guess that they have the anti-symmetric state. Otherwise, they have obtained the same result for at least one copy and they know with certainty that they share the symmetric state. For a single copy, the POVM implemented by this measurement is
\[
\left\{ G_d = \sum_{i \neq j} |ij\rangle\langle ij|,\quad \mathbb{1} - G_d = \sum_{i=0}^{d-1} |ii\rangle\langle ii| \right\}. \tag{12}
\]
Because the states to be distinguished are both $U \otimes U$-invariant, it is convenient to apply the twirl operation to the two operators in the POVM; doing so also emphasizes the symmetry of the states that are to be distinguished. We then have the following single-copy POVM of equal performance:
\[
\left\{ M_d = \frac{d-1}{d+1}\, \Pi_s + \Pi_a,\quad \mathbb{1} - M_d = \frac{2}{d+1}\, \Pi_s \right\}, \tag{13}
\]
where $\Pi_s$ and $\Pi_a$ are the projections onto the symmetric and anti-symmetric subspaces, respectively. The POVM element $M_d$ corresponds to Alice and Bob having different measurement outcomes on a single copy. For $n$ copies the POVM is
\[
\bigl\{ M_d^{\otimes n},\ \mathbb{1} - M_d^{\otimes n} \bigr\}, \tag{14}
\]
since $M_d^{\otimes n}$ corresponds to Alice and Bob getting different outcomes for every copy they measure. Let $A_k$ denote the sum of all elements of $\{\Pi_s, \Pi_a\}^{\otimes n}$ which have $k$ copies of $\Pi_a$. Expanding in terms of the $n + 1$ orthogonal projection operators $\{A_0, \ldots, A_n\}$, we find that
\[
M_d^{\otimes n} = \sum_{k=0}^{n} \left( \frac{d-1}{d+1} \right)^{n-k} A_k, \tag{15}
\]
\[
P_{\rm err} = p \operatorname{Tr} M_d^{\otimes n} \sigma_d^{\otimes n} + (1 - p) \operatorname{Tr} \bigl( \mathbb{1} - M_d^{\otimes n} \bigr) \alpha_d^{\otimes n}, \tag{16}
\]
where the first term is the probability that Alice and Bob have the symmetric state and mistake it for the anti-symmetric state and the second term is the probability that they share the anti-symmetric state and mistake it for the symmetric state. Substituting (15) into (16) and using the fact that $\sigma_d^{\otimes n} \propto A_0$ and $\alpha_d^{\otimes n} \propto A_n$, we obtain
\[
P_{\rm err} = p \left( \frac{d-1}{d+1} \right)^{n} \operatorname{Tr} A_0\, \sigma_d^{\otimes n} + (1 - p) \operatorname{Tr}\, (\mathbb{1} - A_n)\, \alpha_d^{\otimes n} = p \left( \frac{d-1}{d+1} \right)^{n}. \tag{17}
\]
If $P_{\rm err} > 1 - p$ then we will do better to simply guess that we have the symmetric state all the time. Adding this proviso to our strategy, we obtain the desired result.

Remark 2. We note that the second term in the expression for the error probability is zero, meaning that all the error is due to the case where the symmetric state is mistaken for the anti-symmetric state. This is just what we would expect given that our protocol reports that we have a symmetric state only when it is certain that we have one.

We shall now show that (17) is the optimum error probability that can be achieved using LOCC by showing that it is the best that can be achieved even if we use the larger class of measurements that can be implemented using PPT preserving operations.

3. Optimal PPT Preserving POVM

We shall first formulate the minimisation of the error probability over PPT preserving POVMs [14] as a linear programming problem (see [22], for instance) by taking advantage of the symmetries of the states we wish to distinguish. We will then show that there is a solution to the dual linear program which lower bounds the error probability to exactly that achieved by the LOCC procedure given above. The states $\alpha_d^{\otimes n}$ and $\sigma_d^{\otimes n}$ are invariant under permutations of the copies and under bi-unitary transformations of the individual copies. We can assume therefore that our two POVM elements have the same symmetries (this is a trick that was used before in [24] to solve a relative entropy minimisation problem).
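The most general operator with these symmetries is a linear combination of the $n + 1$ operators $A_k$ which we defined above, so we write our POVM as:
\[
\left\{ \sum_{k=0}^{n} x_k A_k,\quad \sum_{k=0}^{n} (1 - x_k) A_k \right\}. \tag{18}
\]
The probability of error is given by
\[
P_{\rm err} = p \operatorname{Tr} \left( \sum_{k=0}^{n} x_k A_k\, \sigma_d^{\otimes n} \right) + (1 - p) \operatorname{Tr} \left( \sum_{k=0}^{n} (1 - x_k) A_k\, \alpha_d^{\otimes n} \right) = (1 - p) + p \left( x_0 - \frac{1-p}{p}\, x_n \right). \tag{19}
\]
The constraints
\[
x_k \geq 0 \quad \text{for } k = 0, \ldots, n, \tag{20}
\]
\[
x_k \leq 1 \quad \text{for } k = 0, \ldots, n \tag{21}
\]
are necessary and sufficient to ensure that the two operators do in fact comprise a POVM.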
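To make the parametrisation concrete, the following Python sketch evaluates the error probability (19) for a given coefficient vector and applies it to the coefficients $x_k = \bigl(\tfrac{d-1}{d+1}\bigr)^{n-k}$ of the LOCC measurement read off from (15). The function names `error_probability` and `locc_coefficients` are hypothetical helpers introduced only for this illustration; exact rational arithmetic is used so that the comparison with (17) involves no rounding.

```python
from fractions import Fraction


def error_probability(x, p):
    """Eq. (19): P_err = (1 - p) + p * (x_0 - ((1 - p) / p) * x_n)."""
    p = Fraction(p)
    # Constraints (20)-(21): the two operators form a POVM iff 0 <= x_k <= 1.
    if not all(0 <= xk <= 1 for xk in x):
        raise ValueError("POVM constraints require 0 <= x_k <= 1 for all k")
    return (1 - p) + p * (x[0] - (1 - p) / p * x[-1])


def locc_coefficients(d, n):
    """Coefficients x_k = ((d-1)/(d+1))^(n-k) of M_d tensored n times, Eq. (15)."""
    rho = Fraction(d - 1, d + 1)
    return [rho ** (n - k) for k in range(n + 1)]
```

Evaluating `error_probability(locc_coefficients(d, n), p)` reproduces the value $p\bigl(\tfrac{d-1}{d+1}\bigr)^n$ of (17), since $x_0 = \bigl(\tfrac{d-1}{d+1}\bigr)^n$ and $x_n = 1$.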
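The partial transpose of the flip operator $F$ is equal to $d\,\Phi_d$, where $\Phi_d = \frac{1}{d} \sum_{i,j=0}^{d-1} |ii\rangle\langle jj|$ is the maximally entangled state. Since $\Pi_s = (\mathbb{1} + F)/2$ and $\Pi_a = (\mathbb{1} - F)/2$, we have
\[
\Pi_s^\Gamma = \frac{1}{2}\,(\mathbb{1} + d\,\Phi_d) = \frac{1}{2}\,\bigl( (\mathbb{1} - \Phi_d) + (1 + d)\,\Phi_d \bigr), \tag{22}
\]
\[
\Pi_a^\Gamma = \frac{1}{2}\,(\mathbb{1} - d\,\Phi_d) = \frac{1}{2}\,\bigl( (\mathbb{1} - \Phi_d) + (1 - d)\,\Phi_d \bigr), \tag{23}
\]
so the operators $A_k^\Gamma$ can be written as linear combinations of operators from the set of $2^n$ orthogonal operators $\{(\mathbb{1} - \Phi_d), \Phi_d\}^{\otimes n}$. Let $S_k^n$ denote the subset of strings in $\{0, 1\}^n$ which have exactly $k$ ones. Then,
\[
\begin{aligned}
A_k^\Gamma &= 2^{-n} \sum_{v \in S_k^n} \bigotimes_{i=1}^{n} \Bigl[ (\mathbb{1} - \Phi_d) + \bigl(1 + (-1)^{v_i} d\bigr)\, \Phi_d \Bigr] \\
&= 2^{-n} \sum_{l=0}^{n} \sum_{0 \leq j \leq l,k} \binom{n-l}{k-j} \binom{l}{j} (1 - d)^{j} (1 + d)^{l-j}\, T_l,
\end{aligned} \tag{24}
\]
where $T_l$ is the sum over all elements of $\{(\mathbb{1} - \Phi_d), \Phi_d\}^{\otimes n}$ which have $l$ copies of $\Phi_d$. A POVM is PPT preserving if and only if all of the operators that comprise it have positive partial transpose [14]. A necessary and sufficient condition for the POVM to be PPT preserving is therefore given by the following inequalities:
\[
\sum_{k=0}^{n} x_k \sum_{0 \leq j \leq l,k} \binom{n-l}{k-j} \binom{l}{j} (1 - d)^{j} (1 + d)^{l-j} \geq 0 \quad \text{for } l = 0, \ldots, n, \tag{25}
\]
\[
\sum_{k=0}^{n} (1 - x_k) \sum_{0 \leq j \leq l,k} \binom{n-l}{k-j} \binom{l}{j} (1 - d)^{j} (1 + d)^{l-j} \geq 0 \quad \text{for } l = 0, \ldots, n. \tag{26}
\]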
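Let $Q$ be an $(n + 1) \times (n + 1)$ matrix with elements
\[
Q_{lk} = \sum_{0 \leq j \leq l,k} \binom{n-l}{k-j} \binom{l}{j} (1 - d)^{j} (1 + d)^{l-j}. \tag{27}
\]
We note that
\[
\begin{aligned}
\sum_{k=0}^{n} Q_{lk} &= (1 + d)^{l} \sum_{m=0}^{n-l} \binom{n-l}{m} \sum_{j=0}^{l} \binom{l}{j} \left( \frac{1-d}{1+d} \right)^{j} \\
&= (1 + d)^{l} \sum_{m=0}^{n-l} \binom{n-l}{m} \left( 1 + \frac{1-d}{1+d} \right)^{l} \\
&= (1 + d)^{l}\, 2^{n-l} \left( \frac{2}{1+d} \right)^{l} = 2^{n}.
\end{aligned} \tag{28}
\]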
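The matrix (27) and the row-sum identity (28) are easy to check for small cases. The sketch below builds $Q$ with exact integer arithmetic; the function name `q_matrix` is an illustrative choice, and the check is only a finite sanity test for the values of $d$ and $n$ supplied, not a substitute for the derivation above.

```python
from math import comb


def q_matrix(d, n):
    """Return Q as a list of rows, with
    Q[l][k] = sum_j C(n-l, k-j) * C(l, j) * (1-d)^j * (1+d)^(l-j),
    where j runs over 0 <= j <= min(l, k), as in Eq. (27)."""
    return [
        [
            sum(
                # comb(a, b) returns 0 when b > a, which removes terms with k-j > n-l.
                comb(n - l, k - j) * comb(l, j) * (1 - d) ** j * (1 + d) ** (l - j)
                for j in range(min(l, k) + 1)
            )
            for k in range(n + 1)
        ]
        for l in range(n + 1)
    ]


def row_sums(d, n):
    """Row sums of Q; Eq. (28) states that each of them equals 2^n."""
    return [sum(row) for row in q_matrix(d, n)]
```

For $d = 2$ and $n = 2$ this gives the rows $(1, 2, 1)$, $(3, 2, -1)$ and $(9, -6, 1)$, each summing to $2^2 = 4$.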
Defining the vectors $c$ and $b$ as follows:
\[
c_i = \delta_{0i} - \frac{1-p}{p}\, \delta_{ni}, \tag{29}
\]
\[
b_i = \begin{cases}
0 & \text{for } i = 0, \ldots, n, \\
-2^{n} & \text{for } i = n + 1, \ldots, 2n + 1, \\
-1 & \text{for } i = 2n + 2, \ldots, 3n + 2,
\end{cases} \tag{30}
\]
we can write the optimisation in standard linear programming form,
\[
\min_{x} \{ c^T \cdot x \mid P \cdot x \geq b,\ x \geq 0 \} \quad \text{where} \quad P = \begin{pmatrix} Q \\ -Q \\ -\mathbb{1} \end{pmatrix}. \tag{31}
\]
Writing (19) in terms of the objective function $c^T \cdot x$, we see that the POVM corresponding to the vector $x$ has error probability
\[
P_{\rm err}(x) = (1 - p) + p\, c^T \cdot x. \tag{32}
\]
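Proposition 3. The probability of error for a PPT preserving POVM to distinguish $\sigma_d^{\otimes n}$ and $\alpha_d^{\otimes n}$ with prior probabilities $p$ and $1 - p$, $P^{\rm PPT}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr)$, is bounded below by $\min\left\{ p \left( \frac{d-1}{d+1} \right)^{n},\ 1 - p \right\}$.

Proof. The linear program dual to (31) is just
\[
\max_{y} \{ b^T \cdot y \mid P^T \cdot y \leq c,\ y \geq 0 \}. \tag{33}
\]
Indeed, the duality of linear programs tells us that for any primal feasible point $x$ and any dual feasible point $y$,
\[
c^T \cdot x \geq b^T \cdot y, \tag{34}
\]
so any dual feasible point $y$ gives us a lower bound on the error probability:
\[
P^{\rm PPT}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr) \geq (1 - p) + p\, b^T \cdot y. \tag{35}
\]
It is convenient to write $y$ as the direct sum of three $(n + 1)$-dimensional vectors $y = u \oplus v \oplus w$ so that we can rewrite the dual program as
\[
\max_{y} \left\{ -2^{n} \sum_{i=0}^{n} v_i - \sum_{i=0}^{n} w_i \ \middle|\ u \geq 0,\ v \geq 0,\ w \geq 0,\ Q^T \cdot u - Q^T \cdot v - w \leq c \right\}. \tag{36}
\]
Consider the point $y^* = u^* \oplus v^* \oplus w^*$ defined by
\[
u_i^* = \binom{n}{i} \frac{(d - 1)^{n-i} \bigl( (d + 1)^{i} - (1 - d)^{i} \bigr)}{(2d)^{n} (d + 1)^{i}}, \tag{37}
\]
\[
v_i^* = 0, \tag{38}
\]
\[
w_i^* = \begin{cases}
0 & \text{for } i = 0, \ldots, n - 1, \\
\max\left\{ \frac{1-p}{p} - \left( \frac{d-1}{d+1} \right)^{n},\ 0 \right\} & \text{for } i = n.
\end{cases} \tag{39}
\]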
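We show that the point $y^*$ is dual feasible in Appendix A. The dual objective function at this point is
\[
-2^{n} \sum_{i=0}^{n} v_i^* - \sum_{i=0}^{n} w_i^* = -w_n^* = \min\left\{ \left( \frac{d-1}{d+1} \right)^{n} - \frac{1-p}{p},\ 0 \right\}, \tag{40}
\]
so, substituting $y^*$ into (35), we obtain the bound:
\[
P^{\rm PPT}_{\rm err}\bigl(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p\bigr) \geq \min\left\{ p \left( \frac{d-1}{d+1} \right)^{n},\ 1 - p \right\}. \tag{41}
\]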
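The dual point (37)–(39) can be sanity-checked numerically for small $d$ and $n$. The Python sketch below is such a check under stated assumptions: the helper names `q_matrix`, `dual_point_u`, `q_transpose_times_u` and `ppt_lower_bound` are hypothetical, and the code does not replace the feasibility proof that the text defers to Appendix A. It computes $Q^T \cdot u^*$ exactly, so that the constraint $Q^T \cdot u^* - w^* \leq c$ of (36) can be compared entry by entry with (29), and it evaluates the right-hand side of (35) at $y^*$.

```python
from fractions import Fraction
from math import comb


def q_matrix(d, n):
    """Q[l][k] from Eq. (27), with j running over 0 <= j <= min(l, k)."""
    return [
        [
            sum(
                comb(n - l, k - j) * comb(l, j) * (1 - d) ** j * (1 + d) ** (l - j)
                for j in range(min(l, k) + 1)
            )
            for k in range(n + 1)
        ]
        for l in range(n + 1)
    ]


def dual_point_u(d, n):
    """u*_i from Eq. (37), as exact fractions."""
    return [
        Fraction(
            comb(n, i) * (d - 1) ** (n - i) * ((d + 1) ** i - (1 - d) ** i),
            (2 * d) ** n * (d + 1) ** i,
        )
        for i in range(n + 1)
    ]


def q_transpose_times_u(d, n):
    """The vector Q^T . u* appearing in the dual constraint of Eq. (36)."""
    Q = q_matrix(d, n)
    u = dual_point_u(d, n)
    return [sum(Q[l][k] * u[l] for l in range(n + 1)) for k in range(n + 1)]


def ppt_lower_bound(d, n, p):
    """Right-hand side of (35) at y*: (1 - p) - p * w*_n, with w*_n from Eq. (39)."""
    p = Fraction(p)
    w_n = max((1 - p) / p - Fraction(d - 1, d + 1) ** n, 0)
    return (1 - p) - p * w_n
```

In the cases tried, $Q^T \cdot u^*$ equals $\bigl(1, 0, \ldots, 0, -\bigl(\tfrac{d-1}{d+1}\bigr)^n\bigr)$, so the constraint is tight in every component except possibly the last, and `ppt_lower_bound` returns $\min\bigl\{p\bigl(\tfrac{d-1}{d+1}\bigr)^n, 1 - p\bigr\}$ in agreement with (41).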
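Corollary 4. Substituting the results of Proposition 1 and Proposition 3 into (3), we have shown that
\begin{align}
P_{\mathrm{err}}^{\mathrm{PPT}}(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p) &= P_{\mathrm{err}}^{\mathrm{SEP}}(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p) = P_{\mathrm{err}}^{\mathrm{LOCC}}(\sigma_d^{\otimes n}, \alpha_d^{\otimes n}; p) \\
&= \min\left\{ p \left(\frac{d-1}{d+1}\right)^n,\ 1-p \right\}. \tag{42}
\end{align}
Substituting into the definition of the Chernoff distance for each class of operations and noting that each copy is measured separately in the optimal strategy, we obtain our main result:

Theorem 5. Whenever $0 < p < 1$, we have
\begin{align}
\xi^{\mathrm{PPT}}(\sigma_d, \alpha_d) &= \xi^{\mathrm{SEP}}(\sigma_d, \alpha_d) = \xi^{\mathrm{LOCC}}(\sigma_d, \alpha_d) \\
&= \xi^{\mathrm{SC}}(\sigma_d, \alpha_d) = \log \frac{d+1}{d-1} \sim \frac{2 \log e}{d-1}. \tag{43}
\end{align}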
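As a numerical illustration of how (42) leads to (43), the following minimal sketch evaluates the finite-$n$ decay rate $-\frac{1}{n}\log P_{\mathrm{err}}$ and compares it with $\log\frac{d+1}{d-1}$ and with the large-$d$ approximation $2/(d-1)$. It assumes natural logarithms (so that $\log e = 1$) and $d \ge 2$; the function names are hypothetical and not taken from the paper.

```python
import math

def log_perr(d: int, n: int, p: float) -> float:
    """Natural log of min{p((d-1)/(d+1))^n, 1-p}, the n-copy error probability in (42)."""
    # Work in the log domain: ((d-1)/(d+1))**n underflows to 0.0 for large n.
    return min(math.log(p) + n * math.log((d - 1) / (d + 1)), math.log(1 - p))

def error_exponent(d: int, n: int, p: float) -> float:
    """Finite-n decay rate -(1/n) log P_err."""
    return -log_perr(d, n, p) / n

def chernoff_distance(d: int) -> float:
    """Limiting rate log((d+1)/(d-1)) of Theorem 5 (natural logarithm assumed)."""
    return math.log((d + 1) / (d - 1))

if __name__ == "__main__":
    for d in (2, 3, 10, 100):
        # Columns: d, finite-n rate, exact limit, large-d approximation 2/(d-1).
        print(d, error_exponent(d, 10_000, 0.3), chernoff_distance(d), 2 / (d - 1))
```

The prior $p$ only contributes the term $-\frac{1}{n}\log p$, which vanishes as $n \to \infty$; this is why the limit in Theorem 5 does not depend on $p$.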
4. A Lower Bound on Bias for Single-Copy Separable Measurements

The fact that $\xi^{\mathrm{LOCC}}(\sigma_d, \alpha_d) = \xi^{\mathrm{SC}}(\sigma_d, \alpha_d)$ shows that our ability to distinguish the extremal Werner states cannot be improved by measurements which are entangled across copies. This is the least favorable many-copy behaviour possible. It would be interesting to know if the single-copy error probability for these states also has the worst kind of scaling with dimension. In terms of bias, we have shown that
\[ \frac{B^{\mathrm{LOCC}}(\sigma_d, \alpha_d; p)}{B^{\mathrm{ALL}}(\sigma_d, \alpha_d; p)} = \Theta\!\left(\frac{1}{d}\right). \tag{44} \]
Is $1/d$ an asymptotic lower bound whatever states we choose? If we relax the LOCC constraint and allow separable operations then we can show that it is.

Proposition 6. If $\rho_1$ and $\rho_2$ are bipartite states on a system of overall dimension $D$, then
\[ B^{\mathrm{SEP}}(\rho_1, \rho_2; p) \ge \frac{1}{2\sqrt{D}}\, B^{\mathrm{ALL}}(\rho_1, \rho_2; p). \tag{45} \]
Proof. We know that the optimal error probability for global measurements is given by the Holevo-Helstrom POVM, the elements of which are generally not even PPT. It was shown by Barnum and Gurvits [23] that every Hermitian operator in the ball centred on the identity, with radius one in the Hilbert-Schmidt norm, is separable. If we add to each element of the Holevo-Helstrom POVM the minimum amount of the identity operator necessary to put the resulting operator inside this ball, and normalize the POVM, we obtain the separable POVM
\[ \left( \frac{1}{2}\left( \mathbb{1} + \frac{M}{\|M\|_2} \right),\ \frac{1}{2}\left( \mathbb{1} - \frac{M}{\|M\|_2} \right) \right), \tag{46} \]
where $M$ is the projector onto the support of the positive part of $(1-p)\rho_2 - p\rho_1$ if $p \le 1/2$ (and minus one times the projector onto the support of the negative part otherwise). This POVM yields the error probability
\[ P_{\mathrm{err}} = \frac{1}{2}\left( 1 - \frac{1}{2\|M\|_2}\left( |1 - 2p| + \|(1-p)\rho_2 - p\rho_1\|_1 \right) \right). \tag{47} \]
Using the fact that $\|M\|_2 \le \sqrt{D}$, we get the bound
\begin{align}
\|(1-p)\rho_2 - p\rho_1\|_1 = B^{\mathrm{ALL}} \ge B^{\mathrm{PPT}} \ge B^{\mathrm{SEP}} &\ge \frac{1}{2\sqrt{D}} \|(1-p)\rho_2 - p\rho_1\|_1 \\
&= \frac{1}{2\sqrt{D}}\, B^{\mathrm{ALL}}. \tag{48}
\end{align}
So, for states of a $d \times d$ system: $B^{\mathrm{SEP}}/B^{\mathrm{ALL}} \in \Omega(1/d)$. This result, combined with our result for the data hiding states, leads us to the following conjecture.

Conjecture 7. For states on a $d \times d$ system,
\[ \frac{B^{\mathrm{LOCC}}}{B^{\mathrm{ALL}}} \ge \frac{1}{d}. \tag{49} \]
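The arithmetic behind (47) and (48) can be checked on a toy case. The sketch below is an illustration under simplifying assumptions, not the construction of the paper in full generality: both states are taken to be diagonal in the same basis (so they are represented by probability vectors), $p \le 1/2$, and the bias is taken to be $1 - 2P_{\mathrm{err}}$, which is the convention consistent with the chain (48). All names are hypothetical.

```python
import math

def separable_povm_check(q1, q2, p):
    """Compare the error of the POVM (46) with formula (47) for commuting states.

    q1, q2: eigenvalue lists of rho_1, rho_2 in a common eigenbasis (assumption).
    Returns (direct error, error from (47), trace norm of (1-p)rho_2 - p rho_1).
    """
    assert 0 < p <= 0.5, "this sketch only treats the case p <= 1/2"
    x = [(1 - p) * b - p * a for a, b in zip(q1, q2)]
    m = [1.0 if xi > 0 else 0.0 for xi in x]      # projector onto the positive part
    norm2 = math.sqrt(sum(m))                     # Hilbert-Schmidt norm of M
    assert norm2 > 0, "degenerate case (rho_1 = rho_2 with p = 1/2) is excluded"
    guess_2 = [0.5 * (1 + mi / norm2) for mi in m]  # POVM element for outcome "rho_2"
    guess_1 = [0.5 * (1 - mi / norm2) for mi in m]  # POVM element for outcome "rho_1"
    direct = (p * sum(a * e for a, e in zip(q1, guess_2))
              + (1 - p) * sum(b * e for b, e in zip(q2, guess_1)))
    trace_norm = sum(abs(xi) for xi in x)
    formula = 0.5 * (1 - (abs(1 - 2 * p) + trace_norm) / (2 * norm2))
    return direct, formula, trace_norm

if __name__ == "__main__":
    q1 = [0.5, 0.3, 0.2, 0.0]
    q2 = [0.1, 0.1, 0.3, 0.5]
    direct, formula, tn = separable_povm_check(q1, q2, 0.4)
    bias = 1 - 2 * direct
    print(direct, formula, bias, tn / (2 * math.sqrt(len(q1))))
```

For commuting states every POVM element above is diagonal, so separability is trivial here; the point of the example is only that the directly computed error agrees with (47) and that the resulting bias respects the lower bound $\frac{1}{2\sqrt{D}}\|(1-p)\rho_2 - p\rho_1\|_1$.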
To put the insights and conjecture above into a different and wider perspective, let us look at the biases $B^X$ for the particular value $p = \frac{1}{2}$:
\[ B^X(\rho_1, \rho_2) := B^X\!\left(\rho_1, \rho_2; \tfrac{1}{2}\right), \tag{50} \]
for which, by definition, it is clear that it is symmetric: $B^X(\rho_1, \rho_2) = B^X(\rho_2, \rho_1)$. Furthermore, for all the classes $X$ considered in the introduction, $B^X(\rho_1, \rho_2) = 0$ if and only if $\rho_1 = \rho_2$. Indeed, the $B^X$ are all metrics, as they obey the triangle inequality: $B^X(\rho_1, \rho_3) \le B^X(\rho_1, \rho_2) + B^X(\rho_2, \rho_3)$ for any states $\rho_1$, $\rho_2$ and $\rho_3$. To be more precise, they derive from operator norms $\|\cdot\|_X$, defined on trace-free hermitian operators:
\[ B^X(\rho_1, \rho_2) = \left\| \tfrac{1}{2}(\rho_1 - \rho_2) \right\|_X, \quad \text{with} \quad \|M\|_X = \sup_{\text{POVM } (M_i)_i \in X} \sum_i |\operatorname{Tr} M M_i|. \tag{51} \]
We note that the supremum in (51) is always attained by a POVM with two elements (one with Tr (M M1 ) ≥ 0 and the other with Tr (M M2 ) = − Tr (M M1 ) ≤ 0).
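For example, by Helstrom's theorem [16], $B^{\mathrm{ALL}}(\rho_1, \rho_2) = \frac{1}{2}\|\rho_1 - \rho_2\|_1$, so $\|\cdot\|_{\mathrm{ALL}} = \|\cdot\|_1$. Of course, all norms on finite-dimensional spaces are equivalent up to constant factors. Equation (48) translates into the ordering of norms$^1$
\[ \|M\|_1 = \|M\|_{\mathrm{ALL}} \ge \|M\|_{\mathrm{PPT}} \ge \|M\|_{\mathrm{SEP}} \ge \frac{1}{\sqrt{D}} \|M\|_{\mathrm{ALL}}, \tag{52} \]
and Conjecture 7 can be expressed as $\|M\|_{\mathrm{LOCC}} \ge \frac{1}{d}\|M\|_{\mathrm{ALL}}$ for $d \times d$ systems. Note that the existence of data hiding states implies that this would be essentially best possible, as for $M = \frac{1}{2}(\alpha_d - \sigma_d)$,
\[ \|M\|_{\mathrm{LOCC}} \le \|M\|_{\mathrm{SEP}} \le \|M\|_{\mathrm{PPT}} = \frac{2}{d+1} \|M\|_{\mathrm{ALL}}. \tag{53} \]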
5. Discussion We have calculated the Chernoff distance between the extremal d × d Werner states, under the constraint of LOCC operations, for all values of d. This is the first time the LOCC Chernoff distance has been calculated for states where it differs from the unconstrained Chernoff distance. In this case, we have also been able to calculate the smallest error probability that can be achieved by LOCC for any finite number of copies. The solution has at least two remarkable features: First, the error probability is—up to constant factors—equal to the nth power of the single-copy error probability, showing that in a sense n copies don’t give disproportionate advantage over one copy, in this case. Secondly, even the optimal n-copy measurement reflects this structurally; namely, it can be implemented by measuring the single-copy optimal POVM n times, followed by a trivial classical post-processing. As discussed in the introduction, this is a “worst-case” strategy for many copies. Both of these properties distinguish the solution from what is to be expected in the quantum Chernoff problem: e.g., discriminating two (non-orthogonal) pure states has a very simple optimal strategy, but for n copies (which is also a problem of discriminating two pure states) this strategy is highly collective over the n systems. Also, in general, even classically, the error probability shows only an asymptotically exponential decay, but here it is exactly exponential. Our result also leads to a number of further questions. An extension of the work which we are currently considering is to see if we can find Chernoff bounds for the discrimination of pairs of general Werner states. Preliminary and ongoing investigations suggest that some interesting effects occur when at least one state is non-extremal. Also, as discussed above, it would be interesting to know how close to “worst possible” is our example in terms of comparing LOCC to unrestricted measurements? 
That is, we would like to resolve our Conjecture 7 on the single-copy LOCC bias.

Acknowledgements. WM acknowledges support from the U.K. EPSRC; AW was supported through an Advanced Research Fellowship of the U.K. EPSRC, the EPSRC’s “QIP IRC”, and the European Commission IP “QAP”. The Centre for Quantum Technologies is funded by the Singapore Ministry of Education and the National Research Foundation as part of the Research Centres of Excellence programme. The authors would like to acknowledge useful discussions with Keiji Matsumoto, Chris King and Michael Nathanson and to thank Aram Harrow for a stimulating conversation on the design of the optimally discriminating POVM.

1 The final lower bound can be made tighter by a factor of two when the prior probabilities are equal.
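Appendix A: Proof of Dual Feasibility

We note that $u_i^* \ge 0$ for $i = 0, \dots, n$:
\begin{align}
u_k^* &= (2d)^{-n} \binom{n}{k} \left[ (d-1)^{n-k} - (-1)^k \frac{(d-1)^n}{(d+1)^k} \right] \\
&\ge (2d)^{-n} \binom{n}{k} \left[ (d-1)^{n-k} - \frac{(d-1)^n}{(d+1)^k} \right] \\
&= \binom{n}{k} \left( \frac{d-1}{2d} \right)^n \left[ \frac{1}{(d-1)^k} - \frac{1}{(d+1)^k} \right] \ge 0. \tag{A1}
\end{align}
It is obvious that $v^* \ge 0$ and $w^* \ge 0$, so the first three inequalities of (36) are satisfied. We now show that the remaining inequality,
\[ Q^T \cdot u - Q^T \cdot v - w \le c, \tag{A2} \]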
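is also satisfied:
\begin{align}
(Q^T \cdot u^*)_k &= \frac{(d-1)^n}{(2d)^n} \sum_{l=0}^{n} \sum_{0 \le j \le l,k} \binom{n-l}{k-j} \binom{l}{j} \binom{n}{l} \frac{(d+1)^l - (1-d)^l}{(d-1)^l (d+1)^l} (1-d)^j (1+d)^{l-j} \\
&= s_1(d,n;k) - s_2(d,n;k), \tag{A3}
\end{align}
where
\begin{align}
s_1(d,n;k) &= \frac{(d-1)^n}{(2d)^n} \sum_{l=0}^{n} \sum_{0 \le j \le l,k} \binom{n-l}{k-j} \binom{l}{j} \binom{n}{l} \frac{(d+1)^l}{(d-1)^l (d+1)^l} (1-d)^j (1+d)^{l-j} \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{l=0}^{n} \sum_{0 \le j \le l,k} \binom{n-l}{k-j} \binom{l}{j} \binom{n}{l} (-1)^j \left( \frac{d+1}{d-1} \right)^{l-j}, \tag{A4}
\end{align}
\begin{align}
s_2(d,n;k) &= \frac{(d-1)^n}{(2d)^n} \sum_{l=0}^{n} \sum_{0 \le j \le l,k} \binom{n-l}{k-j} \binom{l}{j} \binom{n}{l} \frac{(1-d)^l}{(d-1)^l (d+1)^l} (1-d)^j (1+d)^{l-j} \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{l=0}^{n} \sum_{0 \le j \le l,k} \binom{n-l}{k-j} \binom{l}{j} \binom{n}{l} (-1)^{j+l} \left( \frac{d-1}{d+1} \right)^{j}. \tag{A5}
\end{align}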
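Defining $m = l - j$ we can rewrite the first sum (A4) as
\begin{align}
s_1(d,n;k) &= \frac{(d-1)^n}{(2d)^n} \sum_{m=0}^{n-k} \sum_{j=0}^{k} \binom{n-(m+j)}{k-j} \binom{m+j}{j} \binom{n}{m+j} (-1)^j \left( \frac{d+1}{d-1} \right)^m \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{m=0}^{n-k} \sum_{j=0}^{k} \frac{n!}{(k-j)!\,(n-(m+k))!\,m!\,j!} (-1)^j \left( \frac{d+1}{d-1} \right)^m \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{m=0}^{n-k} \sum_{j=0}^{k} \frac{n!}{(n-m)!\,m!} \frac{(n-m)!}{((n-m)-k)!\,k!} \frac{k!}{(k-j)!\,j!} (-1)^j \left( \frac{d+1}{d-1} \right)^m \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{m=0}^{n-k} \binom{n}{m} \binom{n-m}{k} \left( \frac{d+1}{d-1} \right)^m \sum_{j=0}^{k} \binom{k}{j} (-1)^j. \tag{A6}
\end{align}
The sum over $j$ is 0 except when $k = 0$, so
\begin{align}
s_1(d,n;k) &= \delta_{0k} \frac{(d-1)^n}{(2d)^n} \sum_{m=0}^{n} \binom{n}{m} \left( \frac{d+1}{d-1} \right)^m \\
&= \delta_{0k} \frac{(d-1)^n}{(2d)^n} \left( 1 + \frac{d+1}{d-1} \right)^n = \delta_{0k}. \tag{A7}
\end{align}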
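Making the same change of variables ($m = l - j$) in (A5), we obtain
\begin{align}
s_2(d,n;k) &= \frac{(d-1)^n}{(2d)^n} \sum_{j=0}^{k} \sum_{l=j}^{n+j-k} \binom{n}{l} \binom{n-l}{k-j} \binom{l}{j} (-1)^{j+l} \left( \frac{d-1}{d+1} \right)^j \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{j=0}^{k} \sum_{m=0}^{n-k} \binom{n}{m+j} \binom{n-(m+j)}{k-j} \binom{m+j}{j} (-1)^j (-1)^{m+j} \left( \frac{d-1}{d+1} \right)^j \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{j=0}^{k} \sum_{m=0}^{n-k} \frac{n!}{(k-j)!\,(n-(m+k))!\,m!\,j!} (-1)^{2j} (-1)^m \left( \frac{d-1}{d+1} \right)^j \\
&= \frac{(d-1)^n}{(2d)^n} \sum_{j=0}^{k} \sum_{m=0}^{n-k} \frac{n!}{(n-k)!\,k!} \frac{(n-k)!}{((n-k)-m)!\,m!} \frac{k!}{(k-j)!\,j!} (-1)^m \left( \frac{d-1}{d+1} \right)^j \\
&= \frac{(d-1)^n}{(2d)^n} \binom{n}{k} \sum_{m=0}^{n-k} \binom{n-k}{m} (-1)^m \sum_{j=0}^{k} \binom{k}{j} \left( \frac{d-1}{d+1} \right)^j \\
&= \delta_{nk} \frac{(d-1)^n}{(2d)^n} \left( 1 + \frac{d-1}{d+1} \right)^n = \delta_{nk} \left( \frac{d-1}{d+1} \right)^n. \tag{A8}
\end{align}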
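Substituting (A7) and (A8) into (A3) we find that $(Q^T \cdot u^*)_k = \delta_{0k} - \delta_{nk} \left( \frac{d-1}{d+1} \right)^n$, so the constraint (A2) is satisfied:
\[ (Q^T \cdot u^* - Q^T \cdot v^* - w^*)_k = \delta_{0k} - \delta_{nk} \left( \frac{d-1}{d+1} \right)^n - \max\left\{ \frac{1-p}{p} - \left( \frac{d-1}{d+1} \right)^n,\ 0 \right\} \delta_{nk} \le c_k. \tag{A9} \]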
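The identities of this appendix can be cross-checked with exact rational arithmetic. The sketch below is only a sanity check under stated assumptions: the matrix entries $Q_{lk}$ are read off from the summand of (A3), and the helper names are hypothetical. It verifies that $u^* \ge 0$, that $(Q^T \cdot u^*)_k = \delta_{0k} - \delta_{nk}\left(\frac{d-1}{d+1}\right)^n$, and that inserting the dual objective (40) into (35) reproduces the bound (41).

```python
from fractions import Fraction
from math import comb

def u_star(d, n):
    """u*_i from (37), as exact rationals."""
    return [Fraction(comb(n, i) * (d - 1) ** (n - i) * ((d + 1) ** i - (1 - d) ** i),
                     (2 * d) ** n * (d + 1) ** i)
            for i in range(n + 1)]

def q_entry(d, n, l, k):
    """Q_{lk} as read off from (A3) (assumption: this is the matrix Q of the paper)."""
    # comb(a, b) is 0 when b > a, which enforces k - j <= n - l automatically.
    return sum(comb(n - l, k - j) * comb(l, j) * (1 - d) ** j * (1 + d) ** (l - j)
               for j in range(min(l, k) + 1))

def qt_u_star(d, n):
    """The vector Q^T u*, computed directly from the definitions."""
    u = u_star(d, n)
    return [sum(q_entry(d, n, l, k) * u[l] for l in range(n + 1)) for k in range(n + 1)]

def closed_form(d, n):
    """delta_{0k} - delta_{nk} ((d-1)/(d+1))^n, the result of (A7) and (A8)."""
    r = Fraction(d - 1, d + 1)
    return [Fraction(int(k == 0)) - int(k == n) * r ** n for k in range(n + 1)]

def lower_bound(d, n, p):
    """(1-p) + p * (dual objective at y*), i.e. the right-hand side of (41)."""
    r_n = Fraction(d - 1, d + 1) ** n
    w_n = max((1 - p) / p - r_n, 0)   # w*_n from (39)
    return (1 - p) - p * w_n

if __name__ == "__main__":
    for d in range(2, 6):
        for n in range(1, 7):
            assert all(x >= 0 for x in u_star(d, n))
            assert qt_u_star(d, n) == closed_form(d, n)
    print(lower_bound(2, 3, Fraction(1, 2)), lower_bound(2, 3, Fraction(99, 100)))
```

Using `Fraction` avoids any floating-point tolerance: the two sides of the identity are compared for exact equality.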
References
1. Walgate, J., Short, A.J., Hardy, L., Vedral, V.: Local Distinguishability of Multipartite Orthogonal Quantum States. Phys. Rev. Lett. 85(23), 4972–4975 (2000)
2. Virmani, S., Sacchi, M.F., Plenio, M.B., Markham, D.: Optimal local discrimination of two multipartite pure states. Phys. Lett. A 288, 62–68 (2001)
3. Fan, H.: Distinguishing bipartite states by local operations and classical communication. Phys. Rev. A 75, 014305 (2007)
4. Hayashi, M., Markham, D., Murao, M., Owari, M., Virmani, S.: Bounds on Multipartite Entangled Orthogonal State Discrimination Using Local Operations and Classical Communication. Phys. Rev. Lett. 96, 040501 (2006)
5. Watrous, J.: Bipartite Subspaces Having No Bases Distinguishable by Local Operations and Classical Communication. Phys. Rev. Lett. 95, 080505 (2005)
6. Horodecki, M., Oppenheim, J., Sen(De), A., Sen, U.: Distillation Protocols: Output Entanglement and Local Mutual Information. Phys. Rev. Lett. 93, 170503 (2004)
7. Ghosh, S., Joag, P., Kar, G., Kunkri, S., Roy, A.: Locally accessible information and distillation of entanglement. Phys. Rev. A 71, 012321 (2005)
8. Walgate, J., Hardy, L.: Nonlocality, Asymmetry, and Distinguishing Bipartite States. Phys. Rev. Lett. 89, 147901 (2002)
9. Groisman, B., Reznik, B.: Measurements of semilocal and nonmaximally entangled states. Phys. Rev. A 66(2), 022110 (2002)
10. Chefles, A.: Condition for unambiguous state discrimination using local operations and classical communication. Phys. Rev. A 69, 050307(R) (2004)
11. Hayashi, M., Matsumoto, K., Tsuda, Y.: A study of LOCC-detection of a maximally entangled state using hypothesis testing. J. Phys. A: Math. Gen. 39, 14427–14446 (2006)
12. King, C., Matysiak, D.: On the existence of LOCC-distinguishable bases in three-dimensional subspaces of bipartite 3 × n systems. J. Phys. A: Math. Theor. 40, 7939–7944 (2007)
13. Nathanson, M.: Distinguishing bipartite orthogonal states using LOCC: Best and worst cases. J. Math. Phys. 46, 062103 (2005)
14. Rains, E.M.: A semidefinite program for distillable entanglement. IEEE Trans. Inf. Theory 47(7), 2921–2933 (2001)
15. Bennett, C.H., DiVincenzo, D.P., Fuchs, C.A., Mor, T., Rains, E., Shor, P.W., Smolin, J.A., Wootters, W.K.: Quantum nonlocality without entanglement. Phys. Rev. A 59(2), 1070–1091 (1999)
16. Helstrom, C.W.: Quantum Detection and Estimation Theory. Academic Press, New York (1976)
17. Audenaert, K.M.R., Calsamiglia, J., Munoz-Tapia, R., Bagan, E., Masanes, Ll., Acín, A., Verstraete, F.: Discriminating States: The Quantum Chernoff Bound. Phys. Rev. Lett. 98, 160501 (2007); Nussbaum, M., Szkola, A.: A lower bound of Chernoff type for symmetric quantum hypothesis testing. Annals of Statistics (in press); available at http://arxiv.org/list/quant-ph/0607216v1, 2006; Audenaert, K.M.R., Nussbaum, M., Szkola, A., Verstraete, F.: Asymptotic Error Rates in Quantum Hypothesis Testing. Comm. Math. Phys. 279, 251–283 (2008)
18. Kargin, V.: On the Chernoff Bound for Efficiency of Quantum Hypothesis Testing. Ann. Stat. 33(2), 959–976 (2005)
19. Nathanson, M.: Distinguishing a pure state from an arbitrary mixed state using LOCC. Private communication (2007)
20. DiVincenzo, D.P., Leung, D.W., Terhal, B.M.: Quantum data hiding. IEEE Trans. Inf. Theory 48(3), 580–598 (2002)
21. Eggeling, T., Werner, R.F.: Hiding Classical Data in Multipartite Quantum States. Phys. Rev. Lett. 89, 097905 (2002)
22. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley and Sons, New York (1998)
23. Barnum, H., Gurvits, L.: Largest separable balls around the maximally mixed bipartite quantum state. Phys. Rev. A 66, 062311 (2002)
24. Audenaert, K.M.R., Eisert, J., Jané, E., Plenio, M.B., Virmani, S.S., De Moor, B.: Phys. Rev. Lett. 87, 217902 (2002)

Communicated by M.B. Ruskai
Commun. Math. Phys. 285, 175–217 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0610-6
Central Limit Theorem for Locally Interacting Fermi Gas

V. Jakšić1, Y. Pautrat2, C.-A. Pillet3
1 Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC, H3A 2K6, Canada
2 Univ. Paris-Sud, Laboratoire de Mathématiques d’Orsay, Orsay cedex, F-91405, France. E-mail: [email protected]
3 Centre de Physique Théorique, Université du Sud Toulon-Var, B.P. 20132, 83957 La Garde Cedex, France

Received: 1 November 2007 / Accepted: 22 May 2008 Published online: 2 September 2008 – © Springer-Verlag 2008
Abstract: We consider a locally interacting Fermi gas in its natural non-equilibrium steady state and prove the Quantum Central Limit Theorem (QCLT) for a large class of observables. A special case of our results concerns finitely many free Fermi gas reservoirs coupled by local interactions. The QCLT for flux observables, together with the Green-Kubo formulas and the Onsager reciprocity relations previously established [JOP4], complete the proof of the Fluctuation-Dissipation Theorem and the development of linear response theory for this class of models. 1. Introduction This paper and its companion [AJPP3] are first in a series of papers dealing with fluctuation theory of non-equilibrium steady states in quantum statistical mechanics. They are part of a wider program initiated in [Ru2,Ru3,JP1,JP2,JP4] which deals with the development of a mathematical theory of non-equilibrium statistical mechanics in the framework of algebraic quantum statistical mechanics [BR1,BR2,Pi]. For additional information about this program we refer the reader to the reviews [Ru4,JP3,AJPP1]. In this paper we study the same model as in [JOP4]: A free Fermi gas in a quasi-free state perturbed by a sufficiently regular local interaction. It is well-known that under the influence of such a perturbation this system approaches, as time t → +∞, a steady state commonly called the natural non-equilibrium steady state (NESS) [BM1,AM,BM2, FMU,JOP4]. Our main result is that under very general conditions the Quantum Central Limit Theorem (QCLT) holds for this NESS. Combined with the results of [JOP4], the QCLT completes the proof of the near-equilibrium Fluctuation-Dissipation Theorem and the development of linear response theory for this class of models. The rest of this introduction is organized as follows. In Subsect. 1.1 for notational purposes we review a few basic concepts of algebraic quantum statistical mechanics. 
UMR 6207, CNRS, Université de la Méditerranée, Université de Toulon et Université de Provence.
In this subsection the reader can find the definition of QCLT for quantum dynamical systems and a brief review of related literature. Our main result is stated in Subsect. 1.2. In Subsect. 1.3 we discuss our results in the context of linear response theory.

1.1. Central limit theorem for quantum dynamical systems. Let $\mathcal{O}$ be a $C^*$-algebra with identity $\mathbb{1}$ and let $\tau^t$, $t \in \mathbb{R}$, be a strongly continuous group of $*$-automorphisms of $\mathcal{O}$. The pair $(\mathcal{O}, \tau)$ is called a $C^*$-dynamical system. A positive normalized element of the dual $\mathcal{O}^*$ is called a state on $\mathcal{O}$. In what follows $\omega$ is a given $\tau$-invariant state on $\mathcal{O}$. The triple $(\mathcal{O}, \tau, \omega)$ is called a quantum dynamical system. The system $(\mathcal{O}, \tau, \omega)$ is called ergodic if
\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t \omega\left( B^* \tau^s(A) B \right) ds = \omega(B^* B)\, \omega(A), \]
and mixing if
\[ \lim_{|t| \to \infty} \omega\left( B^* \tau^t(A) B \right) = \omega(B^* B)\, \omega(A), \]
for all $A, B \in \mathcal{O}$.

We denote by $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ the GNS-representation of the $C^*$-algebra $\mathcal{O}$ associated to the state $\omega$. The state $\omega$ is called modular if $\Omega_\omega$ is a separating vector for the enveloping von Neumann algebra $\pi_\omega(\mathcal{O})''$. The states of thermal equilibrium are described by the $(\tau, \beta)$-KMS condition where $\beta > 0$ is the inverse temperature. Any $(\tau, \beta)$-KMS state on $\mathcal{O}$ is $\tau$-invariant and modular.

For any subset $\mathcal{A} \subset \mathcal{O}$ we denote by $\mathcal{A}_{\mathrm{self}} = \{ A \in \mathcal{A} \mid A = A^* \}$ the set of self-adjoint elements of $\mathcal{A}$. Let $f$ be a bounded Borel function on $\mathbb{R}$ and $A \in \mathcal{O}_{\mathrm{self}}$. With a slight abuse of notation, in the sequel we will often denote $f(\pi_\omega(A))$ by $f(A)$ and write $\omega(f(A)) = (\Omega_\omega, f(\pi_\omega(A)) \Omega_\omega)$. With this convention, $1_{[a,b]}(A)$ denotes the spectral projection on the interval $[a,b]$ of $\pi_\omega(A)$. We shall use the same convention for the products $f_1(\pi_\omega(A_1)) \cdots f_n(\pi_\omega(A_n))$, etc.

An involutive antilinear $*$-automorphism $\Theta$ of $\mathcal{O}$ is called time-reversal if $\Theta \circ \tau^t = \tau^{-t} \circ \Theta$. A state $\eta$ on $\mathcal{O}$ is called time-reversal invariant if $\eta \circ \Theta(A) = \eta(A^*)$ holds for all $A \in \mathcal{O}$.

We say that a subset $\mathcal{A} \subset \mathcal{O}$ is $L^1$-asymptotically abelian for $\tau$ if for all $A, B \in \mathcal{A}$,
\[ \int_{-\infty}^{\infty} \left\| [A, \tau^t(B)] \right\| dt < \infty. \]
Throughout the paper we shall use the shorthand
\[ \tilde{A}_t \equiv \frac{1}{\sqrt{t}} \int_0^t \left( \tau^s(A) - \omega(A) \right) ds. \]

Definition 1.1. Let $\mathcal{C}$ be a $*$-vector subspace of $\mathcal{O}$. We say that $\mathcal{C}$ is CLT-admissible if for all $A, B \in \mathcal{C}$,
\[ \int_{-\infty}^{\infty} \left| \omega(\tau^t(A) B) - \omega(A)\omega(B) \right| dt < \infty. \]
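For $A, B \in \mathcal{C}$ we set
\begin{align}
L(A,B) &\equiv \int_{-\infty}^{\infty} \omega\left( (\tau^t(A) - \omega(A))(B - \omega(B)) \right) dt = \int_{-\infty}^{\infty} \left( \omega(\tau^t(A) B) - \omega(A)\omega(B) \right) dt, \\
\varsigma(A,B) &\equiv \frac{1}{2i} \int_{-\infty}^{\infty} \omega\left( [\tau^t(A), B] \right) dt = \frac{1}{2i} \left( L(A,B) - L(B,A) \right).
\end{align}
The functional $(A,B) \mapsto L(A,B)$ is obviously bilinear. Other properties of this functional are summarized in:

Proposition 1.2. Suppose that $\mathcal{C}$ is CLT-admissible and let $A, B \in \mathcal{C}$. Then:
(i) $L(A^*, A) \ge 0$.
(ii) $\overline{L(A,B)} = L(B^*, A^*)$. In particular, if $A$ and $B$ are self-adjoint, then $\varsigma(A,B) = \operatorname{Im} L(A,B)$.
(iii) $|L(A^*, B)|^2 \le L(A^*, A)\, L(B^*, B)$.
(iv) $(A,B) \mapsto \varsigma(A,B)$ is a (possibly degenerate) symplectic form on the real vector space $\mathcal{C}_{\mathrm{self}}$.
(v) If $\omega$ is a mixing $(\tau, \beta)$-KMS state, then $\varsigma = 0$.
(vi) Suppose that $\varsigma = 0$, that $\mathcal{C}$ is dense in $\mathcal{O}$ and $L^1$-asymptotically abelian for $\tau$, and that $\omega$ is either a factor state or 3-fold mixing: For all $A_1, A_2, A_3 \in \mathcal{O}$,
\[ \lim_{\min_{i \ne j} |t_i - t_j| \to \infty} \omega\left( \tau^{t_1}(A_1) \tau^{t_2}(A_2) \tau^{t_3}(A_3) \right) = \omega(A_1)\omega(A_2)\omega(A_3). \]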
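Then $\omega$ is a $(\tau, \beta)$-KMS state for some $\beta \in \mathbb{R} \cup \{\pm\infty\}$.

Proof. Note that
\[ 0 \le \omega\left( \tilde{A}_t^* \tilde{A}_t \right) = \int_{-t}^{t} \left( 1 - \frac{|s|}{t} \right) \omega\left( (\tau^s(A^*) - \omega(A^*))(A - \omega(A)) \right) ds. \]
This identity and the dominated convergence theorem yield
\[ L(A^*, A) = \lim_{t \to \infty} \omega\left( \tilde{A}_t^* \tilde{A}_t \right) \ge 0, \]
and (i) follows. Parts (ii) and (iv) are obvious. (i) and (ii) imply the Cauchy-Schwarz inequality (iii). Part (v) follows from Proposition 5.4.12 in [BR2]. Part (vi) is the celebrated stability result of Bratteli, Kishimoto and Robinson [BKR], see Proposition 5.4.20 in [BR2].

Definition 1.3. Let $\mathcal{C}$ be CLT-admissible. We shall say that the Simple Quantum Central Limit Theorem (SQCLT) holds for $\mathcal{C}$ w.r.t. $(\mathcal{O}, \tau, \omega)$ if for all $A \in \mathcal{C}_{\mathrm{self}}$,
\[ \lim_{t \to \infty} \omega\left( e^{i \tilde{A}_t} \right) = \exp\left( -\frac{1}{2} L(A,A) \right). \]
We shall say that the Quantum Central Limit Theorem (QCLT) holds for $\mathcal{C}$ if for all $n$ and all $A_1, \dots, A_n$ in $\mathcal{C}_{\mathrm{self}}$,
\[ \lim_{t \to \infty} \omega\left( e^{i \tilde{A}_{1t}} \cdots e^{i \tilde{A}_{nt}} \right) = \exp\left( -\frac{1}{2} \sum_{1 \le j,k \le n} L(A_k, A_j) - i \sum_{1 \le j < k \le n} \varsigma(A_j, A_k) \right). \tag{1.1} \]

[...] $\epsilon > 0$ and $t > 0$, let
\[ \hat{A}_\epsilon(t) \equiv \epsilon \int_0^{t/\epsilon^2} \left( \tau^s(A) - \omega(A) \right) ds. \]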
We say that $\mathcal{C}$ has QHL w.r.t. $(\mathcal{O}, \tau, \omega)$ if for all $A_1, \dots, A_n \in \mathcal{C}_{\mathrm{self}}$ and all $t_1 > 0, \dots, t_n > 0$,
\[ \lim_{\epsilon \downarrow 0} \omega\left( e^{i \hat{A}_{1\epsilon}(t_1)} \cdots e^{i \hat{A}_{n\epsilon}(t_n)} \right) = \omega_L\left( W(\chi_{[0,t_1]} \otimes A_1) \cdots W(\chi_{[0,t_n]} \otimes A_n) \right), \tag{1.10} \]
where, in the definition of the Weyl algebra, the bilinear form $L$ must be replaced by
\[ L_{\mathrm{QHL}}(\chi_{[0,s]} \otimes A, \chi_{[0,t]} \otimes B) = \inf(s,t)\, L(A,B). \]
The special case where all $t_j$'s are equal corresponds to QCLT. The QHL is interpreted as the weak convergence of the quantum stochastic process $\hat{A}_\epsilon(t)$ to a quantum Brownian motion. With the obvious reformulation, Theorem 1.5 holds for QHL. Convergence of moments
\[ \lim_{\epsilon \downarrow 0} \omega\left( \hat{A}_{1\epsilon}(t_1) \cdots \hat{A}_{n\epsilon}(t_n) \right) = \omega_L\left( \varphi_L(\chi_{[0,t_1]} \otimes A_1) \cdots \varphi_L(\chi_{[0,t_n]} \otimes A_n) \right), \tag{1.11} \]
is of independent interest. Even more generally, one may associate to a class $\mathcal{F}$ of real valued integrable functions on $\mathbb{R}$ the observables
\[ \hat{A}_\epsilon(f) \equiv \epsilon \int_0^{\infty} f(\epsilon^2 t) \left( \tau^t(A) - \omega(A) \right) dt, \]
with $f \in \mathcal{F}$, $A \in \mathcal{C}$ and study the limit $\epsilon \downarrow 0$ of
\[ \omega\left( e^{i \hat{A}_{1\epsilon}(f_1)} \cdots e^{i \hat{A}_{n\epsilon}(f_n)} \right). \tag{1.12} \]
Note that QHL corresponds to the choice $\mathcal{F} = \{ \chi_{[0,t]} \mid t > 0 \}$. For reasons of space and notational simplicity we will focus in the paper on the QCLT for locally interacting fermionic systems. With only notational changes our proofs can be extended to establish QHL and (1.11). It is likely that the proofs can be extended to a much larger class of functions $\mathcal{F}$, but we shall not pursue this question here (see [De1] for a related discussion).

We conclude this subsection with a few remarks about earlier quantum central-limit type results. First, notice that, since the law of one single observable is well-defined, the description of the limiting law of a family $(\tilde{A}_x)_{x \ge 0}$ of observables as the parameter $x \to \infty$ is covered by the classical Lévy-Cramér theorem. Several results of interest exist which are only of quantum nature insofar as the computation of the limit $\lim_{x \to \infty} \omega(e^{i\alpha \tilde{A}_x})$ is made more complicated by the quantum setting. Truly quantum central limit theorems therefore involve an attempt to describe the limiting joint behavior of the law of a family $(\tilde{A}_x^{(1)}, \dots, \tilde{A}_x^{(p)})$ of observables as $x \to \infty$. The earliest results of this type were obtained in a quantum probabilistic approach and were non-commutative analogues of classical results concerning sums of independent, identically distributed variables. Such results can be translated in a physical setting as applying to space fluctuations of one-site observables in quantum spin systems with respect to translation-invariant product states. The generality of the framework and the formulation of the limit vary. We mention in particular [AB] where matrix elements of approximate Weyl operators constructed from Pauli matrices are considered; [GvW] which holds in the general $*$-algebra case but where only convergence of moments is proved; [Kup] which works in a general $C^*$-algebra setting and where a true convergence in distribution (to a classical Gaussian family) is proved, but only with respect to a tracial state. We also mention [CH] which, although not a central limit theorem, is a first attempt to characterize a convergence in distribution of a family of non-commuting operators, in terms of a (pseudo)-characteristic function. See [JPP] for a more detailed discussion.

The papers [GVV1–GVV6] aim at more physical applications: a satisfactory algebra of fluctuations is constructed for space fluctuations of local observables in a quantum spin system with a translation-invariant state. That state does not have to be a product state; however, the ergodic assumptions on that state are so strong that no nontrivial application was found beyond the product case. However, these papers were a conceptual improvement and our construction owes much to them. The papers [Ma1,Ma2] had a similar spirit but, using less stringent ergodic conditions, gave nontrivial applications to space fluctuations of local observables in XY chains.

A distinct feature of our work is that we study QCLT with respect to the group $\tau^t$ describing the microscopic dynamics of the system. There are a number of technical and conceptual aspects of QCLT which are specific to the dynamical group. For example, the ergodic properties of the system (laws of large numbers), which have to be established prior to studying its fluctuations, are typically much harder to prove for the dynamical group than for the lattice translation group. As for the conceptual differences, we mention that if $\omega$ is a $(\tau, \beta)$-KMS state, then by Proposition 1.2 (v), $\varsigma = 0$ and the CCR algebra of fluctuations $\mathcal{W}$ is commutative (Part (vi) provides a partial converse to this statement). This is in sharp contrast with QCLT w.r.t. the translation group, where even in the simple example of product states of spin systems the fluctuation algebra is non-commutative.
The CLT for classical dynamical systems is discussed in [Li]. For a review of results on dynamical CLT for interacting particle systems in classical statistical mechanics we refer the reader to [Sp] and [KL]. The CLT for classical spin systems is discussed in Sect. V.7 of [E]. After this paper was completed, we have learned of the work [De1] which is technically and conceptually related to ours. We shall comment on Dereziński’s result at the end of Subsect. 3.3.
1.2. QCLT for locally interacting fermions. A free Fermi gas is described by the $C^*$-dynamical system $(\mathcal{O}, \tau_0)$ where:
(i) $\mathcal{O} = \mathrm{CAR}(\mathfrak{h})$ is the CAR algebra over the single particle Hilbert space $\mathfrak{h}$;
(ii) $\tau_0^t$ is the group of Bogoliubov $*$-automorphisms generated by the single particle Hamiltonian $h_0$,
\[ \tau_0^t(a^\#(f)) = a^\#(e^{ith_0} f), \]
where $a^*(f)$/$a(f)$ are the Fermi creation/annihilation operators associated to $f \in \mathfrak{h}$ and $a^\#$ stands for either $a$ or $a^*$.

We denote by $\delta_0$ the generator of $\tau_0$. Let $\mathcal{O}$ be the $\tau_0$-invariant $C^*$-subalgebra of $\mathrm{CAR}(\mathfrak{h})$ generated by $\{ a^*(f) a(g) \mid f, g \in \mathfrak{h} \}$ and $\mathbb{1}$. Physical observables are gauge invariant and hence are elements of this subalgebra. Let $\mathfrak{v}$ be a vector subspace of $\mathfrak{h}$ and let $\mathcal{O}(\mathfrak{v})$ be the collection of the elements of the form
\[ A = \sum_{k=1}^{K} \prod_{j=1}^{n_k} a^*(f_{kj})\, a(g_{kj}), \tag{1.13} \]
where $K$ and the $n_k$'s are finite and $f_{kj}, g_{kj} \in \mathfrak{v}$. We denote $n_A \equiv \max_k n_k$ and $F(A) \equiv \{ f_{kj}, g_{kj} \mid j = 1, \dots, n_k,\ k = 1, \dots, K \}$ (to indicate the dependence of $K$ on $A$ we will also denote it by $K_A$). $\mathcal{O}(\mathfrak{v})$ is a $*$-subalgebra of $\mathcal{O}$, and if $\mathfrak{v}$ is dense in $\mathfrak{h}$, then $\mathcal{O}(\mathfrak{v})$ is norm dense in $\mathcal{O}$. Our main assumption is:

(A) There exists a dense vector subspace $\mathfrak{d} \subset \mathfrak{h}$ such that the functions $\mathbb{R} \ni t \mapsto (f, e^{ith_0} g)$ are in $L^1(\mathbb{R}, dt)$ for all $f, g \in \mathfrak{d}$.

This assumption implies that $h_0$ has purely absolutely continuous spectrum. Specific physical models which satisfy this assumption are discussed at the end of this subsection.

Let $V \in \mathcal{O}(\mathfrak{d})_{\mathrm{self}}$ be a self-adjoint perturbation. We shall always assume that $n_V \ge 2$. The special case $n_V = 1$ leads to quasi-free perturbed dynamics and is discussed in detail in the companion paper [AJPP3], see also [AJPP1,AJPP2,JKP] and the Remark after Theorem 1.7 below. Let $\lambda \in \mathbb{R}$ be a coupling constant and let $\tau_\lambda$ be the $C^*$-dynamics generated by $\delta_\lambda = \delta_0 + i\lambda [V, \cdot\,]$. By rescaling $\lambda$, without loss of generality we may assume that
\[ \max_{f \in F(V)} \|f\| = 1. \tag{1.14} \]
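We shall consider the locally interacting fermionic system described by $(\mathcal{O}, \tau_\lambda)$. Note that $\tau_\lambda$ preserves $\mathcal{O}$ and that the pair $(\mathcal{O}, \tau_\lambda)$ is also a $C^*$-dynamical system. Let
\[ \lambda_V \equiv \frac{1}{2 n_V K_V \ell_V} \frac{(2n_V - 2)^{2n_V - 2}}{(2n_V - 1)^{2n_V - 1}}, \tag{1.15} \]
where
\[ \ell_V \equiv \int_{-\infty}^{\infty} \sup_{f,g \in F(V)} \left| (f, e^{ith_0} g) \right| dt. \tag{1.16} \]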
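To get a feeling for the size of the coupling threshold, the following minimal sketch evaluates (1.15) and the smaller constant $\tilde{\lambda}_V = 2^{-8(n_V - 1)} \lambda_V$ of (1.17). The value of $\ell_V$ in (1.16) is model dependent and is simply passed in as a number here; the function names are hypothetical.

```python
from fractions import Fraction

def lambda_V(n_V: int, K_V: int, ell_V):
    """Threshold (1.15): (2n-2)^(2n-2) / ((2n-1)^(2n-1) * 2 n K ell)."""
    assert n_V >= 2, "the text assumes n_V >= 2"
    ratio = Fraction((2 * n_V - 2) ** (2 * n_V - 2), (2 * n_V - 1) ** (2 * n_V - 1))
    # Exact if ell_V is an int or Fraction; a float ell_V yields a float result.
    return ratio / (2 * n_V * K_V * ell_V)

def lambda_tilde_V(n_V: int, K_V: int, ell_V):
    """Threshold (1.17): 2^(-8(n_V - 1)) * lambda_V."""
    return lambda_V(n_V, K_V, ell_V) / 2 ** (8 * (n_V - 1))

if __name__ == "__main__":
    for n in (2, 3, 4):
        # ell_V = 1 and K_V = 1 are placeholder values, not taken from any model.
        print(n, lambda_V(n, 1, 1), lambda_tilde_V(n, 1, 1))
```

Both constants decrease quickly with the degree $n_V$ of the interaction, so the results are perturbative statements for weak coupling.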
The following result was proven in [JOP4] (see also [BM1,AM,BM2,FMU]).

Theorem 1.6. Suppose that (A) holds. Then:
1. For all $A \in \mathcal{O}(\mathfrak{d})$ and any monomial $B = a^\#(f_1) \cdots a^\#(f_m)$ with $\{ f_1, \dots, f_m \} \subset \mathfrak{d}$, one has
\[ \sup_{|\lambda| \le \lambda_V} \int_{\mathbb{R}} \left\| [\tau_\lambda^t(A), B] \right\| dt < \infty. \]
2. For $|\lambda| \le \lambda_V$ the Møller morphisms
\[ \gamma_\lambda^+ \equiv \operatorname*{s-lim}_{t \to \infty} \tau_0^{-t} \circ \tau_\lambda^t \]
exist and are $*$-automorphisms of $\mathcal{O}$.

In what follows we shall assume that (A) holds. Let $T$ be a self-adjoint operator on $\mathfrak{h}$ satisfying $0 \le T \le I$ and $[T, e^{ith_0}] = 0$ for all $t$, and let $\omega_0$ be the gauge invariant quasi-free state on $\mathcal{O}$ associated to $T$. We will sometimes call $T$ the density operator. The state $\omega_0$ is $\tau_0$-invariant and is the initial (reference) state of our fermionic system. The quantum dynamical system $(\mathcal{O}, \tau_0, \omega_0)$ is mixing. We denote by $\mathcal{N}_0$ the set of all $\omega_0$-normal states on $\mathcal{O}$. Theorem 1.6 yields that any state $\eta \in \mathcal{N}_0$ evolves to the limiting state $\omega_\lambda^+ = \omega_0 \circ \gamma_\lambda^+$, i.e., for $A \in \mathcal{O}$ and $|\lambda| \le \lambda_V$,
\[ \lim_{t \to \infty} \eta(\tau_\lambda^t(A)) = \omega_\lambda^+(A), \]
see, e.g., [Ro,AJPP1]. The state $\omega_\lambda^+$ is the NESS (non-equilibrium steady state) of $(\mathcal{O}, \tau_\lambda)$ associated to the initial state $\omega_0$. Clearly, $\omega_\lambda^+$ is $\tau_\lambda$-invariant and $\gamma_\lambda^+$ is an isomorphism of the quantum dynamical systems $(\mathcal{O}, \tau_0, \omega_0)$ and $(\mathcal{O}, \tau_\lambda, \omega_\lambda^+)$. In particular, the system $(\mathcal{O}, \tau_\lambda, \omega_\lambda^+)$ is mixing.

In what follows we shall always assume that $\operatorname{Ker} T = \operatorname{Ker}(I - T) = \{0\}$. This assumption ensures that the states $\omega_0$ and $\omega_\lambda^+$ are modular.

Let $\mathfrak{c} \subset \mathfrak{d}$ be a vector subspace such that the functions $\mathbb{R} \ni t \mapsto (f, e^{ith_0} T g)$ are in $L^1(\mathbb{R}, dt)$ for all $f, g \in \mathfrak{c}$. In general, it may happen that $\mathfrak{c} = \{0\}$, and so the existence of a non-trivial $\mathfrak{c}$ is a dynamical regularity property of the pair $(T, h_0)$. If $T = F(h_0)$, where $F \in L^1(\mathbb{R}, dx)$ is such that its Fourier transform
\[ \hat{F}(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{itx} F(x)\, dx \]
is also in $L^1(\mathbb{R}, dt)$, then one can take $\mathfrak{c} = \mathfrak{d}$. Let
\[ \tilde{\lambda}_V \equiv 2^{-8(n_V - 1)} \lambda_V, \tag{1.17} \]
and $\mathcal{C} \equiv \mathcal{O}(\mathfrak{c})$. The main result of this paper is:

Theorem 1.7. Suppose that (A) holds, that $V \in \mathcal{C}_{\mathrm{self}}$, and that $|\lambda| \le \tilde{\lambda}_V$. Then $\mathcal{C}$ is CLT-admissible and the QCLT holds for $\mathcal{C}$ w.r.t. $(\mathcal{O}, \tau_\lambda, \omega_\lambda^+)$.

Remark. If $n_V = 1$, then Theorem 1.6 holds for any $0 < \lambda_V < (2 K_V \ell_V)^{-1}$, see [JOP4]. With this change, Theorem 1.7 holds with $\tilde{\lambda}_V = \lambda_V$. The case $n_V = 1$ is however very special. If $V = \sum_k a^*(f_k) a(g_k)$, then $\tau_\lambda$ is the quasi-free dynamics generated by $h_\lambda = h_0 + \lambda \sum_k (g_k, \cdot\,) f_k$ and Theorem 1.6 can be derived from the scattering theory of the pair $(h_\lambda, h_0)$, see [Ro,AJPP1]. This alternative approach is technically simpler, yields better constants, and can also be used to prove a Large Deviation Principle and to discuss additional topics like the Landauer-Büttiker formula which cannot be handled by the method of [JOP4] and this paper. For this reason, we shall discuss this special case separately in the companion paper [AJPP3].

As we have already remarked, our proof of Theorem 1.7 also yields the convergence of moments (see Theorem 3.2), and is easily extended to the proof of existence of QHL for locally interacting fermionic systems (recall (1.10), (1.11)).

We finish this subsection with some concrete models to which Theorem 1.7 applies. The models on graphs are the same as in [JOP4]. Let $G$ be the set of vertices of a connected graph of bounded degree, $\Delta_G$ the discrete Laplacian acting on $\ell^2(G)$, and $\delta_x$ the Kronecker delta function at $x \in G$. We shall call a graph $G$ admissible if there exists $\gamma > 1$ such that for all $x, y \in G$,
\[ \left| (\delta_x, e^{-it\Delta_G} \delta_y) \right| = O(|t|^{-\gamma}), \tag{1.18} \]
as $t \to \infty$. Examples of admissible graphs are $G = \mathbb{Z}^d$ for $d \ge 3$, $G = \mathbb{Z}_+ \times \mathbb{Z}^{d-1}$, where $\mathbb{Z}_+ = \{0, 1, \cdots\}$ and $d \ge 1$, tubular graphs of the type $\mathbb{Z}_+ \times \Gamma$, where $\Gamma \subset \mathbb{Z}^{d-1}$ is finite, a rooted Bethe lattice, etc. Assumption (A) holds and Theorem 1.7 holds with $\mathfrak{c} = \mathfrak{d}$ if:
(i) $G$ is an admissible graph;
(ii) $\mathfrak{h} = \ell^2(G)$ (or more generally $\ell^2(G) \otimes \mathbb{C}^L$) and $h_0 = -\Delta_G$;
(iii) $\mathfrak{d}$ is the subspace of finitely supported elements of $\mathfrak{h}$;
(iv) $T = F(h_0)$, where $\hat{F} \in L^1(\mathbb{R}, dt)$ and $0 < F(x) < 1$ for $x \in \mathrm{sp}(h_0)$.
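The admissibility condition (1.18) can be illustrated on $G = \mathbb{Z}^d$. For the nearest-neighbour lattice Laplacian the propagator factorizes over coordinates, and in one dimension its modulus is a Bessel function, $|(\delta_x, e^{-it\Delta}\delta_y)| = |J_{x-y}(2t)|$, which decays like $t^{-1/2}$. Hence the decay exponent on $\mathbb{Z}^d$ is $\gamma = d/2$, which exceeds 1 exactly when $d \ge 3$. The sketch below is a plain-Python illustration under these assumptions (nearest-neighbour Laplacian, Bessel functions evaluated through the standard integral representation with a midpoint rule); the function names are hypothetical.

```python
import math

def bessel_j(order: int, z: float, steps: int = 4000) -> float:
    """J_order(z) = (1/pi) * int_0^pi cos(order*tau - z*sin(tau)) dtau, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(order * tau - z * math.sin(tau))
    return total / steps  # equals (h / pi) * total

def propagator_modulus(x, y, t: float) -> float:
    """|(delta_x, exp(-it*Laplacian) delta_y)| on Z^d, for lattice points given as tuples."""
    result = 1.0
    for xi, yi in zip(x, y):
        result *= abs(bessel_j(abs(xi - yi), 2.0 * t))
    return result

if __name__ == "__main__":
    origin = (0, 0, 0)
    for t in (1.0, 10.0, 50.0, 100.0):
        amp = propagator_modulus(origin, origin, t)
        # On Z^3 the rescaled quantity t^{3/2} * |propagator| stays bounded.
        print(t, amp, t ** 1.5 * amp)
```

The midpoint rule is adequate here because the integrand extends to a smooth periodic function, so the quadrature error is negligible once the number of steps is well above $z$.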
The continuous examples are similar. Let $D \subset \mathbb{R}^d$ be a domain and let $\Delta_D$ be the Dirichlet Laplacian on $L^2(D, dx)$. We shall say that a domain $D$ is admissible if there exists $\gamma > 1$ such that
\[ \left| (f, e^{-it\Delta_D} g) \right| = O(|t|^{-\gamma}), \]
for all bounded $f$ and $g$ with compact support. Examples of admissible domains are $D = \mathbb{R}^d$ for $d \ge 3$, $D = \mathbb{R}_+ \times \mathbb{R}^{d-1}$ for $d \ge 1$, tubular domains of the type $\mathbb{R}_+ \times \Gamma$, where $\Gamma \subset \mathbb{R}^{d-1}$ is a bounded domain, etc. Assumption (A) holds and Theorem 1.7 holds with $\mathfrak{c} = \mathfrak{d}$ if:
(i) $D$ is an admissible domain;
(ii) $\mathfrak{h} = L^2(D, dx)$ (or more generally $L^2(D, dx) \otimes \mathbb{C}^L$) and $h_0 = -\Delta_D$;
(iii) $\mathfrak{d}$ is the subspace of bounded compactly supported elements of $\mathfrak{h}$;
(iv) $T = F(h_0)$, where $\hat{F} \in L^1(\mathbb{R}, dt)$ and $0 < F(x) < 1$ for $x \in \mathrm{sp}(h_0)$.
1.3. QCLT, linear response and the Fluctuation-Dissipation Theorem. In addition to the assumptions of the previous subsection, we assume that $\mathfrak{h}$, $h_0$, $T$ have the composite structure
\[ \mathfrak{h} = \bigoplus_{j=1}^{M} \mathfrak{h}_j, \qquad h_0 = \bigoplus_{j=1}^{M} h_j, \qquad T = \bigoplus_{j=1}^{M} \frac{1}{1 + e^{\beta_j (h_j - \mu_j)}}, \tag{1.19} \]
where the $h_j$'s are bounded from below self-adjoint operators on the Hilbert subspaces $\mathfrak{h}_j$, $\beta_j > 0$, and $\mu_j \in \mathbb{R}$. We denote by $p_j$ the orthogonal projections onto $\mathfrak{h}_j$. The subalgebras $\mathcal{O}_j = \mathrm{CAR}(\mathfrak{h}_j)$ describe Fermi gas reservoirs $\mathcal{R}_j$ which are initially in equilibrium at inverse temperatures $\beta_j$ and chemical potentials $\mu_j$. The perturbation $\lambda V$ describes the interaction between the reservoirs and allows for the flow of heat and charges within the system. The non-equilibrium statistical mechanics of this class of models has been studied recently in [JOP4] (see also [FMU] for related models and results). We briefly recall the results we need. Suppose that $p_j\mathcal{F}(V) \subset \mathrm{Dom}(h_j)$ for all $j$. The entropy production observable of $(\mathcal{O}, \tau_\lambda)$ associated to the reference state $\omega_0$ is
$$\sigma_\lambda \equiv -\sum_{j=1}^{M} \beta_j(\Phi_j - \mu_j J_j),$$
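where $\Phi_j \equiv i\lambda[d\Gamma(h_j p_j), V]$ and $J_j \equiv i\lambda[d\Gamma(p_j), V]$. Explicitly,
$$\Phi_j = \lambda\sum_{k=1}^{K_V}\sum_{l=1}^{n_k}\left[\prod_{i=1}^{l-1} a^*(f_{ki})a(g_{ki})\right]\Big\{a^*(ih_jp_jf_{kl})a(g_{kl}) + a^*(f_{kl})a(ih_jp_jg_{kl})\Big\}\left[\prod_{i=l+1}^{n_k} a^*(f_{ki})a(g_{ki})\right],$$
$$J_j = \lambda\sum_{k=1}^{K_V}\sum_{l=1}^{n_k}\left[\prod_{i=1}^{l-1} a^*(f_{ki})a(g_{ki})\right]\Big\{a^*(ip_jf_{kl})a(g_{kl}) + a^*(f_{kl})a(ip_jg_{kl})\Big\}\left[\prod_{i=l+1}^{n_k} a^*(f_{ki})a(g_{ki})\right].$$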
The observable $\Phi_j$/$J_j$ describes the heat/charge flux out of the reservoir $\mathcal{R}_j$ (note that $\Phi_j, J_j \in \mathcal{O}$). The conservation laws
$$\sum_{j=1}^{M}\omega_\lambda^+(\Phi_j) = 0, \qquad \sum_{j=1}^{M}\omega_\lambda^+(J_j) = 0,$$
V. Jakši´c, Y. Pautrat, C.-A. Pillet
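hold. By the general result of [JP1,Ru2,JP4], the entropy production of the NESS $\omega_\lambda^+$ is non-negative,
$$\mathrm{Ep}(\omega_\lambda^+) \equiv \omega_\lambda^+(\sigma_\lambda) = -\sum_{j=1}^{M}\beta_j\big(\omega_\lambda^+(\Phi_j) - \mu_j\,\omega_\lambda^+(J_j)\big) \ge 0.$$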
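If all $\beta_j$'s and $\mu_j$'s are equal, i.e. $\beta_1 = \cdots = \beta_M = \beta$ and $\mu_1 = \cdots = \mu_M = \mu$, then $\omega_0\!\upharpoonright\!\mathcal{O}$ is a $(\tau_0, \beta)$-KMS state and so the reference state is a thermal equilibrium state of the unperturbed system. Then $\omega_\lambda^+\!\upharpoonright\!\mathcal{O}$ is a $(\tau_\lambda, \beta)$-KMS state, $\omega_\lambda^+(\Phi_j) = \omega_\lambda^+(J_j) = 0$ for all $j$, and in particular $\mathrm{Ep}(\omega_\lambda^+) = 0$, see [JOP2]. On physical grounds, vanishing of the fluxes and the entropy production in thermal equilibrium is certainly an expected result. It is also expected that if either the $\beta_j$'s or the $\mu_j$'s are not all equal, then $\mathrm{Ep}(\omega_\lambda^+) > 0$. For specific interactions $V$ one can compute $\omega_\lambda^+(\sigma_\lambda)$ to the first non-trivial order in $\lambda$ and hence establish the strict positivity of entropy production by a perturbative calculation (see [FMU,JP6] and [JP3] for related results). The strict positivity of the entropy production for a generic perturbation $\lambda V$ has been established in [JP5].

To establish QCLT for the flux observables in addition to Assumption (A) we need:

(B) For all $j$, $h_jp_j\mathfrak{d} \subset \mathfrak{d}$.

This assumption and the specific form of the density operator ensure that one may take $\mathfrak{c} = \mathfrak{d}$ and that if $V \in \mathcal{C}_{\mathrm{self}}$, then $\{\Phi_j, J_j\} \subset \mathcal{C}_{\mathrm{self}}$. Hence, for $|\lambda| \le \tilde\lambda_V$ the QCLT holds for the flux observables.

We finish with a discussion of linear response theory (for references and additional information about linear response theory in the algebraic formalism of quantum statistical mechanics we refer the reader to [AJPP1] and [JOP1–JOP4]). We will need the following two assumptions:

(C) The operators $h_j$ are bounded.

(D) There exists a complex conjugation $c$ on $\mathfrak{h}$ which commutes with all $h_j$ and satisfies $cf = f$ for all $f \in \mathcal{F}(V)$.

Assumption (C) is of a technical nature and can be relaxed. Assumption (D) ensures that the system $(\mathcal{O}, \tau_\lambda, \omega_0)$ is time-reversal invariant. Time-reversal invariance is of central importance in linear response theory.

Let $\beta_{\mathrm{eq}} > 0$ and $\mu_{\mathrm{eq}} \in \mathbb{R}$ be given equilibrium values of the inverse temperature and chemical potential. We denote $\vec\beta = (\beta_1, \ldots, \beta_M)$, $\vec\mu = (\mu_1, \ldots, \mu_M)$, $\vec\beta_{\mathrm{eq}} = (\beta_{\mathrm{eq}}, \ldots, \beta_{\mathrm{eq}})$, $\vec\mu_{\mathrm{eq}} = (\mu_{\mathrm{eq}}, \ldots, \mu_{\mathrm{eq}})$, and we shall indicate explicitly the dependence of $\omega_\lambda^+$ on $\vec\beta$ and $\vec\mu$ by $\omega^+_{\lambda,\vec\beta,\vec\mu}$. Similarly, we shall indicate explicitly the dependence of $L(A, B)$ on $\lambda, \vec\beta, \vec\mu$ by $L_{\lambda,\vec\beta,\vec\mu}$. Since $\omega^+_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(\Phi_j) = \omega^+_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(J_j) = 0$,
$$L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(A, B) = \int_{-\infty}^{\infty}\omega^+_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}\big(A\,\tau_\lambda^t(B)\big)\,dt,$$
for $A, B \in \{\Phi_j, J_j \mid 1 \le j \le M\}$. Assuming the existence of derivatives, the kinetic transport coefficients are defined by
$$\begin{aligned}
L^{kj}_{\lambda hh} &\equiv -\partial_{\beta_j}\omega^+_{\lambda,\vec\beta,\vec\mu}(\Phi_k)\Big|_{\vec\beta=\vec\beta_{\mathrm{eq}},\,\vec\mu=\vec\mu_{\mathrm{eq}}}, &
L^{kj}_{\lambda hc} &\equiv \beta_{\mathrm{eq}}\,\partial_{\mu_j}\omega^+_{\lambda,\vec\beta,\vec\mu}(\Phi_k)\Big|_{\vec\beta=\vec\beta_{\mathrm{eq}},\,\vec\mu=\vec\mu_{\mathrm{eq}}},\\
L^{kj}_{\lambda ch} &\equiv -\partial_{\beta_j}\omega^+_{\lambda,\vec\beta,\vec\mu}(J_k)\Big|_{\vec\beta=\vec\beta_{\mathrm{eq}},\,\vec\mu=\vec\mu_{\mathrm{eq}}}, &
L^{kj}_{\lambda cc} &\equiv \beta_{\mathrm{eq}}\,\partial_{\mu_j}\omega^+_{\lambda,\vec\beta,\vec\mu}(J_k)\Big|_{\vec\beta=\vec\beta_{\mathrm{eq}},\,\vec\mu=\vec\mu_{\mathrm{eq}}},
\end{aligned} \tag{1.20}$$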
where the indices h/c stand for heat/charge. We then have
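Theorem 1.8. Suppose that Assumptions (A)–(D) hold. Then, for any $|\lambda| < \lambda_V$, the functions
$$(\vec\beta, \vec\mu) \mapsto \omega^+_{\lambda,\vec\beta,\vec\mu}(\Phi_j), \qquad (\vec\beta, \vec\mu) \mapsto \omega^+_{\lambda,\vec\beta,\vec\mu}(J_j),$$
are analytic in a neighborhood of $(\vec\beta_{\mathrm{eq}}, \vec\mu_{\mathrm{eq}})$. Moreover,

(1) The Green-Kubo formulas hold:
$$\begin{aligned}
L^{kj}_{\lambda hh} &= \tfrac12 L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(\Phi_k, \Phi_j), &
L^{kj}_{\lambda hc} &= \tfrac12 L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(\Phi_k, J_j),\\
L^{kj}_{\lambda ch} &= \tfrac12 L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(J_k, \Phi_j), &
L^{kj}_{\lambda cc} &= \tfrac12 L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(J_k, J_j).
\end{aligned} \tag{1.21}$$

(2) The Onsager reciprocity relations hold:
$$L^{kj}_{\lambda hh} = L^{jk}_{\lambda hh}, \qquad L^{kj}_{\lambda cc} = L^{jk}_{\lambda cc}, \qquad L^{kj}_{\lambda hc} = L^{jk}_{\lambda ch}. \tag{1.22}$$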
(3) Let $\mathcal{C}$ denote the linear span of $\{\Phi_j, J_j \mid 1 \le j \le M\}$. For $|\lambda| \le \tilde\lambda_V$, $\mathcal{C}$ is CLT-admissible and the QCLT holds for $\mathcal{C}$ w.r.t. $(\mathcal{O}, \tau_\lambda, \omega_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}})$. The associated fluctuation algebra $\mathcal{W}$ is commutative.

Remark 1. Parts (1) and (2) of Theorem 1.8 are proven in [JOP4]. Part (3) is a special case of Theorem 1.7. Parts (1) and (3) relate linear response to thermodynamical forces to fluctuations in thermal equilibrium and constitute the Fluctuation-Dissipation Theorem for our model. The physical aspects of linear response theory and the Fluctuation-Dissipation Theorem are discussed in the classical references [DGM,KTH].

Remark 2. The arguments in [JOP4] do not establish that the functions
$$t \mapsto \omega^+_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}\big(A\,\tau_\lambda^t(B)\big), \tag{1.23}$$
are absolutely integrable for $A, B \in \{\Phi_j, J_j \mid 1 \le j \le M\}$ and in Part (2) $L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(A, B)$ is defined by
$$L_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}(A, B) = \lim_{t\to\infty}\int_{-t}^{t}\omega^+_{\lambda,\vec\beta_{\mathrm{eq}},\vec\mu_{\mathrm{eq}}}\big(A\,\tau_\lambda^s(B)\big)\,ds.$$
The absolute integrability of the correlation functions (1.23) is a delicate dynamical problem resolved in Part (3) for $|\lambda| \le \tilde\lambda_V$ (see Definition 1.1).

Remark 3. Remarks 4 and 6 after Theorem 1.5 in [JOP4] apply without changes to Theorem 1.8. Remark 7 is also applicable and allows one to extend the Fluctuation-Dissipation Theorem to a large class of so-called centered observables.

Remark 4. Although the time-reversal Assumption (D) plays no role in Part (3) of Theorem 1.8, it is a crucial ingredient in the proofs of Parts (1) and (2) (see [JOP4,AJPP3] for a discussion). The Fluctuation-Dissipation Theorem fails for locally interacting open fermionic systems which are not time-reversal invariant.
A class of concrete models for which (A)-(B)-(D) hold is easily constructed following the examples discussed at the end of Subsect. 1.2. Let $G_1, \ldots, G_M$ be admissible graphs. Then (A)–(D) hold if $\mathfrak{h}_j = \ell^2(G_j)$ (or $\ell^2(G_j) \otimes \mathbb{C}^L$), $h_j = -\Delta_{G_j}$, and $\mathfrak{d}$ is the subspace of finitely supported elements of $\mathfrak{h}$. A physically important class of allowed interactions is $V = V_{\mathrm{hop}} + V_{\mathrm{int}}$, where
$$V_{\mathrm{hop}} = \sum_{x,y} t(x, y)\big(a^*(\delta_x)a(\delta_y) + a^*(\delta_y)a(\delta_x)\big),$$
and $t : G \times G \to \mathbb{R}$ is a finitely supported function ($G = \cup_j G_j$), and
$$V_{\mathrm{int}} = \sum_{x,y} v(x, y)\,a^*(\delta_x)a^*(\delta_y)a(\delta_y)a(\delta_x),$$
where $v : G \times G \to \mathbb{R}$ is finitely supported. $V_{\mathrm{hop}}$ describes tunneling junctions between the reservoirs and $V_{\mathrm{int}}$ is a local pair interaction.

2. General Aspects of CLT

2.1. Proof of Theorem 1.4. Our argument follows the ideas of [GV]. For $A, B$ in $\mathcal{O}_{\mathrm{self}}$ we set
$$D(A, B) \equiv e^{iA}e^{iB} - e^{i(A+B)}e^{-\frac12[A,B]}.$$
The first ingredient of the proof is:

Proposition 2.1. If the set $\{A, B\} \subset \mathcal{O}_{\mathrm{self}}$ is $L^1$-asymptotically abelian for $\tau$ then the asymptotic 2nd-order Baker-Campbell-Hausdorff formula
$$\lim_{t\to\infty} D(\tilde A_t, \tilde B_t) = 0,$$
holds.

Note that Proposition 2.1 is not a simple consequence of the BCH formula because its hypothesis does not ensure that the double commutator $[\tilde A_t, [\tilde A_t, \tilde B_t]]$ vanishes as $t \to \infty$. To prove Proposition 2.1 we need the following estimate.

Lemma 2.2. If $A, B, a, b$ are bounded self-adjoint operators then
$$\|D(A + a, B + b)\| \le \|D(A, B)\| + 4\big(\|a\|^3 + \|b\|^3\big) + \|[[A, B], [a, b]]\| + (2 + \|a\| + \|b\|)\sum_{\substack{X\in\{A,B\}\\ y\in\{a,b\}}}\|[X, y]\|.$$
Proof. We decompose $D(A + a, B + b) = \sum_{j=1}^{9} D_j$ according to the following table and get an upper bound of the norm of each term using the elementary estimates
$$\|e^{i(x+y)} - e^{ix}\| \le \|y\|, \qquad \|e^{i(x+y)} - e^{ix}e^{iy}\| \le \tfrac12\|[x, y]\|, \qquad \|e^{ix}e^{iy} - e^{iy}e^{ix}\| \le \|[x, y]\|.$$

$$\begin{array}{c|l|l}
j & D_j & \text{upper bound on } \|D_j\| \\ \hline
1 & \big(e^{i(A+a)} - e^{ia}e^{iA}\big)e^{i(B+b)} & \tfrac12\|[A, a]\| \\
2 & e^{ia}e^{iA}\big(e^{i(B+b)} - e^{ib}e^{iB}\big) & \tfrac12\|[B, b]\| \\
3 & e^{ia}\big(e^{iA}e^{ib} - e^{ib}e^{iA}\big)e^{iB} & \|[A, b]\| \\
4 & e^{ia}e^{ib}\big(e^{iA}e^{iB} - e^{i(A+B)}e^{-\frac12[A,B]}\big) & \|D(A, B)\| \\
5 & \big(e^{ia}e^{ib} - e^{i(a+b)}e^{-\frac12[a,b]}\big)e^{i(A+B)}e^{-\frac12[A,B]} & \|D(a, b)\| \\
6 & e^{i(a+b)}\big(e^{-\frac12[a,b]}e^{i(A+B)} - e^{i(A+B)}e^{-\frac12[a,b]}\big)e^{-\frac12[A,B]} & \tfrac12\|[A + B, [a, b]]\| \\
7 & e^{i(a+b)}e^{i(A+B)}\big(e^{-\frac12[a,b]}e^{-\frac12[A,B]} - e^{-\frac12[A,B]-\frac12[a,b]}\big) & \tfrac18\|[[A, B], [a, b]]\| \\
8 & e^{i(a+b)}e^{i(A+B)}\big(e^{-\frac12[A,B]-\frac12[a,b]} - e^{-\frac12[A+a,B+b]}\big) & \tfrac12\big(\|[A, b]\| + \|[B, a]\|\big) \\
9 & \big(e^{i(a+b)}e^{i(A+B)} - e^{i(A+B+a+b)}\big)e^{-\frac12[A+a,B+b]} & \tfrac12\big(\|[A, a]\| + \|[A, b]\| + \|[B, a]\| + \|[B, b]\|\big)
\end{array}$$

From the BCH estimate we further get
$$\|D_5\| \le \|D(a, b)\| \le \|[a, [a, b]]\| + \|[b, [a, b]]\| \le 4\big(\|a\|^3 + \|b\|^3\big),$$
and the Jacobi identity yields
$$\|D_6\| \le \|a\|\big(\|[A, b]\| + \|[B, b]\|\big) + \|b\|\big(\|[A, a]\| + \|[B, a]\|\big).$$
The result follows.
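The three elementary estimates used in the proof of Lemma 2.2 are easy to sanity-check numerically on finite matrices. The following sketch is only an illustration: the helper names (`expi`, `op_norm`, `random_hermitian`, `check_estimates`) and the choice of random $4 \times 4$ Hermitian matrices are ours, and a finite sample of course proves nothing.

```python
import numpy as np

def expi(h):
    """exp(i h) for a Hermitian matrix h, via its eigendecomposition."""
    w, u = np.linalg.eigh(h)
    return (u * np.exp(1j * w)) @ u.conj().T

def op_norm(m):
    """Operator (spectral) norm."""
    return np.linalg.norm(m, 2)

def comm(x, y):
    return x @ y - y @ x

def random_hermitian(rng, dim, scale):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return scale * (m + m.conj().T) / 2

def check_estimates(x, y, tol=1e-10):
    """Check the three elementary estimates for one pair of Hermitian matrices."""
    c = op_norm(comm(x, y))
    e1 = op_norm(expi(x + y) - expi(x)) <= op_norm(y) + tol
    e2 = op_norm(expi(x + y) - expi(x) @ expi(y)) <= 0.5 * c + tol
    e3 = op_norm(expi(x) @ expi(y) - expi(y) @ expi(x)) <= c + tol
    return bool(e1 and e2 and e3)

rng = np.random.default_rng(0)
results = [
    check_estimates(random_hermitian(rng, 4, s), random_hermitian(rng, 4, s))
    for s in (0.1, 0.5, 1.0, 3.0)
    for _ in range(25)
]
```

Each entry of `results` reports whether all three inequalities held for one random pair; the scales from 0.1 to 3.0 are meant to cover both the nearly commuting and the strongly non-commuting regime.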
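Proof of Proposition 2.1. For $t > 0$ and $j \in \mathbb{N}$ set $p(t) \equiv \log(1 + t)$ and $I_j(t) \equiv [jp(t), (j + 1)p(t)[$. For $X \in \mathcal{O}_{\mathrm{self}}$ define [...]

[...] $> 0$ and $|\lambda| \le \tilde\lambda_V$,
$$\int_{-t}^{t}\left(1 - \frac{|s|}{t}\right)\Big|\omega_\lambda^+\big((\tau_\lambda^s(A) - \omega_\lambda^+(A))(A - \omega_\lambda^+(A))\big)\Big|\,ds \le 2C_{V,A}^2.$$
As $t \to \infty$ the monotone convergence theorem yields
$$\int_{-\infty}^{\infty}\Big|\omega_\lambda^+\big((\tau_\lambda^s(A) - \omega_\lambda^+(A))(A - \omega_\lambda^+(A))\big)\Big|\,ds \le 2C_{V,A}^2.$$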
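In particular, we derive that $\mathcal{C}$ is CLT-admissible. The second ingredient of the proof of Theorem 1.7 is:

Theorem 3.2. For $|\lambda| \le \tilde\lambda_V$ and all $n \ge 1$,
$$\lim_{t\to\infty}\omega_\lambda^+\big(\tilde A_t^{\,n}\big) = \begin{cases}\dfrac{n!}{2^{n/2}(n/2)!}\,L(A, A)^{n/2} & \text{if } n \text{ is even},\\[2mm] 0 & \text{if } n \text{ is odd}.\end{cases}$$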
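Remark. With only notational changes the proof of Theorem 3.2 yields that for all $A_1, \ldots, A_n \in \mathcal{C}$,
$$\lim_{t\to\infty}\omega_\lambda^+\big(\tilde A_{1t}\cdots\tilde A_{nt}\big) = \omega_L\big(\varphi_L(A_1)\cdots\varphi_L(A_n)\big),$$
where the r.h.s. is defined by (1.6).

Given Theorems 3.1 and 3.2, we can complete:

Proof of Theorem 1.7. Let $A \in \mathcal{C}_{\mathrm{self}}$. For $\alpha \in \mathbb{C}$ one has
$$\omega_\lambda^+\big(e^{i\alpha\tilde A_t}\big) = \sum_{n\ge0}\frac{(i\alpha)^n}{n!}\,\omega_\lambda^+\big((\tilde A_t)^n\big). \tag{3.36}$$
Let $\epsilon = 1/(2C_{V,A})$ and suppose that $|\lambda| \le \tilde\lambda_V$. Theorems 3.1 and 3.2 yield that
$$\sup_{|\alpha|<\epsilon,\,t>0}\big|\omega_\lambda^+\big(e^{i\alpha\tilde A_t}\big)\big| < \infty,$$
and that for $|\alpha| < \epsilon$,
$$\lim_{t\to\infty}\omega_\lambda^+\big(e^{i\alpha\tilde A_t}\big) = e^{-L(A,A)\alpha^2/2}. \tag{3.37}$$
Proposition 2.4 yields that (3.37) holds for all $\alpha \in \mathbb{R}$, and so SQCLT holds for $\mathcal{C}$ w.r.t. $(\mathcal{O}, \tau_\lambda, \omega_\lambda^+)$. Our standing assumption $\mathrm{Ker}(T) = \mathrm{Ker}(I - T) = \{0\}$ ensures that the state $\omega_0$ is modular, and since $\omega_\lambda^+ = \omega_0\circ\gamma_\lambda^+$, the state $\omega_\lambda^+$ is also modular. By Theorem 1.6, if $|\lambda| \le \lambda_V$, then $\mathcal{C}$ is $L^1$-asymptotically Abelian for $\tau_\lambda$ and it follows from Theorem 1.4 that the QCLT also holds.

Notice that in the initial step of the proof we did not use the assumption that $A$ is self-adjoint, and so the following weak form of QCLT holds for any $A \in \mathcal{C}$:

Corollary 3.3. For any $A \in \mathcal{C}$ there exists $\epsilon > 0$ such that for $|\lambda| \le \tilde\lambda_V$ and $|\alpha| < \epsilon$,
$$\lim_{t\to\infty}\omega_\lambda^+\big(e^{i\alpha\tilde A_t}\big) = e^{-L(A,A)\alpha^2/2}.$$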
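The step from the moment limits of Theorem 3.2 to the Gaussian limit (3.37) amounts to summing the series (3.36) term by term. A minimal sketch of that arithmetic follows; the function names are ours, and `variance` is simply a stand-in number for $L(A, A)$.

```python
from math import exp, factorial

def limiting_moment(n, variance):
    """n-th limiting moment from Theorem 3.2, with `variance` playing the role of L(A, A)."""
    if n % 2 == 1:
        return 0.0
    half = n // 2
    return factorial(n) / (2 ** half * factorial(half)) * variance ** half

def characteristic_function_from_moments(alpha, variance, n_terms=80):
    """Partial sum of sum_n (i alpha)^n / n! * moment_n, mirroring the series (3.36)."""
    return sum(
        (1j * alpha) ** n / factorial(n) * limiting_moment(n, variance)
        for n in range(n_terms)
    )

# Compare the summed series with the Gaussian characteristic function exp(-L alpha^2 / 2).
alpha, variance = 0.7, 1.3
series_value = characteristic_function_from_moments(alpha, variance)
gaussian_value = exp(-variance * alpha ** 2 / 2)
```

The even moments $n!/(2^{n/2}(n/2)!)$ equal $(n-1)!!$, the moments of a standard Gaussian, which is why `series_value` and `gaussian_value` agree up to rounding.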
In the rest of this section we shall describe the strategy of the proof of Theorems 3.1 and 3.2.
3.2. The commutator estimate. We shall need the following result.

Theorem 3.4. Suppose that Assumption (A) holds. Let $V \in \mathcal{O}(\mathfrak{d})_{\mathrm{self}}$ be a perturbation such that $n_V \ge 2$ and
$$\max_{f\in\mathcal{F}(V)}\|f\| = 1.$$
Let $A = a^\#(f_1)\cdots a^\#(f_m)$ be a monomial such that $\mathcal{F}(A) = \{f_1, \ldots, f_m\} \subset \mathfrak{d}$, and let
$$C_A^{(n)}(s_1, \ldots, s_n) = [\tau_0^{s_n}(V), [\cdots, [\tau_0^{s_1}(V), A]\cdots]].$$
Then for all $n \ge 0$ there exist a finite index set $Q_n(A)$, monomials $F_{A,q}^{(n)} \in \mathcal{O}$, and scalar functions $G_{A,q}^{(n)}$ such that
$$C_A^{(n)}(s_1, \ldots, s_n) = \sum_{q\in Q_n(A)} G_{A,q}^{(n)}(s_1, \ldots, s_n)\,F_{A,q}^{(n)}(s_1, \ldots, s_n). \tag{3.38}$$
Moreover,

1. The order of the monomial $F_{A,q}^{(n)}$ does not exceed $2n(n_V - 1) + m$.

2. If $m$ is even then the order of $F_{A,q}^{(n)}$ is also even.

3. The factors of $F_{A,q}^{(n)}$ are from
$$\big\{a^\#(e^{ish_0}g)\,\big|\,g \in \mathcal{F}(V),\ s \in \{s_1, \ldots, s_n\}\big\} \cup \big\{a^\#(g)\,\big|\,g \in \mathcal{F}(A)\big\}.$$
The number of factors from the first set does not exceed $n(2n_V - 1)$ while the number of factors from the second set does not exceed $m - 1$. In particular, $\|F_{A,q}^{(n)}\| \le \max\big(1, \max_{f\in\mathcal{F}(A)}\|f\|^{m-1}\big)$.

4. Let $\lambda_V$ be given by (1.15). Then
$$W_{V,A} \equiv \sum_{n=1}^{\infty}|\lambda_V|^n\sum_{q\in Q_n(A)}\ \int\limits_{-\infty<s_n\le\cdots\le s_1\le0}\big|G_{A,q}^{(n)}(s_1, \ldots, s_n)\big|\,ds_1\cdots ds_n < \infty. \tag{3.39}$$
The proof of Theorem 3.4 is identical to the proof of Theorem 1.1 in [JOP4]. Parts 1–3 are simple and are stated for reference purposes. Part 4 is a relatively straightforward consequence of the fundamental Botvich-Guta-Maassen integral estimate [BGM] which also gives an explicit estimate on $W_{V,A}$. A pedagogical exposition of the Botvich-Guta-Maassen estimate can be found in [JP6].

If $A$ is as in Theorem 3.4 then
$$\gamma_\lambda^+(A) = \lim_{t\to\infty}\tau_0^{-t}\circ\tau_\lambda^t(A)$$
can be expanded in a power series in $\lambda$ which converges for $|\lambda| \le \lambda_V$. Indeed, it follows from the Dyson expansion that
$$\tau_0^{-t}\circ\tau_\lambda^t(A) = A + \sum_{n=1}^{\infty}(i\lambda)^n\int\limits_{-t\le s_n\le\cdots\le s_1\le0}[\tau_0^{s_n}(V), [\ldots, [\tau_0^{s_1}(V), A]\cdots]]\,ds_1\cdots ds_n.$$
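The structure of the Dyson expansion can be checked on a toy model. The sketch below is a finite-dimensional analogue and not the CAR setting of the paper: we assume (our choice) that the dynamics act on matrices as $\tau^t(A) = e^{ith}Ae^{-ith}$, with random $3 \times 3$ Hermitian matrices `h0`, `v`, `a` standing in for the free generator, the perturbation and the observable. It compares $\tau_0^{-t}\circ\tau_\lambda^t(A)$ with the expansion truncated after the $n = 1$ term, so the remainder should scale like $\lambda^2$.

```python
import numpy as np

def expi(h, t):
    """exp(i t h) for a Hermitian matrix h."""
    w, u = np.linalg.eigh(h)
    return (u * np.exp(1j * t * w)) @ u.conj().T

def tau(h, t, a):
    """Heisenberg evolution e^{ith} a e^{-ith} (toy analogue of tau^t)."""
    e = expi(h, t)
    return e @ a @ e.conj().T

def commutator(x, y):
    return x @ y - y @ x

def dyson_first_order(h0, v, a, lam, t, n_steps=2000):
    """A + i*lam * int_{-t}^{0} [tau_0^s(V), A] ds, using the trapezoidal rule."""
    s_grid = np.linspace(-t, 0.0, n_steps + 1)
    vals = np.array([commutator(tau(h0, s, v), a) for s in s_grid])
    ds = t / n_steps
    integral = ds * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)
    return a + 1j * lam * integral

def exact(h0, v, a, lam, t):
    """tau_0^{-t} o tau_lam^t (A) with h_lam = h0 + lam * v."""
    return tau(h0, -t, tau(h0 + lam * v, t, a))

def random_hermitian(rng, dim):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = (m + m.conj().T) / 2
    return m / np.linalg.norm(m, 2)  # normalize to operator norm 1

rng = np.random.default_rng(0)
h0, v, a = (random_hermitian(rng, 3) for _ in range(3))

def remainder(lam, t=1.0):
    """Norm of the difference between the exact expression and the first-order truncation."""
    diff = exact(h0, v, a, lam, t) - dyson_first_order(h0, v, a, lam, t)
    return np.linalg.norm(diff, 2)
```

Evaluating `remainder(1e-2)` and `remainder(1e-3)` and taking their ratio gives a quick check of the quadratic scaling; the actual content of Theorem 3.4 is the much harder statement that the full series converges uniformly in $t$.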
Hence, for $|\lambda| \le \lambda_V$,
$$\gamma_\lambda^+(A) = A + \sum_{n=1}^{\infty}(i\lambda)^n\sum_{q\in Q_n(A)}\ \int\limits_{-\infty<s_n\le\cdots\le s_1\le0}G_{A,q}^{(n)}(s_1, \ldots, s_n)\,F_{A,q}^{(n)}(s_1, \ldots, s_n)\,ds_1\cdots ds_n, \tag{3.40}$$
where the series on the right-hand side is norm convergent by Parts 3 and 4 of Theorem 3.4. This expansion will be used in the proof of Theorems 3.1 and 3.2.

3.3. Quasi-free correlations. Let $\mathcal{O}$, $\tau_0$ and $\omega_0$ be as in Subsect. 1.2. We denote by
$$\varphi(f) = \frac{1}{\sqrt2}\big(a(f) + a^*(f)\big),$$
the Fermi field operator associated to $f \in \mathfrak{h}$. The Fermi field operators satisfy the anticommutation relation
$$\varphi(f)\varphi(g) + \varphi(g)\varphi(f) = \mathrm{Re}(f, g)\,\mathbb{1},$$
and the CAR algebra $\mathcal{O}$ is generated by $\{\varphi(f) \mid f \in \mathfrak{h}\}$. Clearly,
$$a(f) = \frac{1}{\sqrt2}\big(\varphi(f) + i\varphi(if)\big), \qquad a^*(f) = \frac{1}{\sqrt2}\big(\varphi(f) - i\varphi(if)\big). \tag{3.41}$$
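We recall that $\omega_0$, the gauge invariant quasi-free state associated to the density operator $T$, is uniquely specified by
$$\omega_0\big(a^*(f_n)\cdots a^*(f_1)a(g_1)\cdots a(g_m)\big) = \delta_{n,m}\det\{(g_i, Tf_j)\}.$$
Alternatively, $\omega_0$ can be described by its action on the Fermi field operators. Let $\mathcal{P}_n$ be the set of all permutations $\pi$ of $\{1, \ldots, 2n\}$ described in Subsect. 1.1 (recall (1.5)). Denote by $\epsilon(\pi)$ the signature of $\pi \in \mathcal{P}_n$. $\omega_0$ is the unique state on $\mathcal{O}$ such that
$$\omega_0(\varphi(f_1)\varphi(f_2)) = \frac12(f_1, f_2) - i\,\mathrm{Im}(f_1, Tf_2),$$
and
$$\omega_0(\varphi(f_1)\cdots\varphi(f_n)) = \begin{cases}\displaystyle\sum_{\pi\in\mathcal{P}_{n/2}}\epsilon(\pi)\prod_{j=1}^{n/2}\omega_0\big(\varphi(f_{\pi(2j-1)})\varphi(f_{\pi(2j)})\big) & \text{if } n \text{ is even};\\[3mm] 0 & \text{if } n \text{ is odd}.\end{cases}$$
For any bounded subset $\mathfrak{f} \subset \mathfrak{h}$ we set
$$M_{\mathfrak{f}} = \sup_{f\in\mathfrak{f}}\|f\|,$$
and
$$C_{\mathfrak{f}} = \max\left(1, \sup_{f,g\in\mathfrak{f}}\frac{2}{\|f\|\,\|g\|}\int_{-\infty}^{\infty}\big|\omega_0\big(\varphi(f)\tau_0^t(\varphi(g))\big)\big|\,dt\right),$$
and we denote by $\mathcal{M}(\mathfrak{f})$ the set of monomials with factors from $\{\varphi(f) \mid f \in \mathfrak{f}\}$. We further say that $A \in \mathcal{M}(\mathfrak{f})$ is of degree at most $k$ if, for some $f_1, \ldots, f_k \in \mathfrak{f}$, one can write $A = \varphi(f_1)\cdots\varphi(f_k)$.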
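Theorem 3.5. Suppose that $C_{\mathfrak{f}} < \infty$. Then for any $A_1, \ldots, A_n \in \mathcal{M}(\mathfrak{f})$ of degrees at most $k_1, \ldots, k_n$ the following holds:

1.
$$\sup_{t>0}\ t^{-n/2}\int_{[0,t]^n}\left|\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)\right|dt_1\cdots dt_n \le \big(2^{7/2}M_{\mathfrak{f}}\big)^{\sum_i k_i}\,C_{\mathfrak{f}}^{\,n}\,n!.$$

2. If $n$ is odd,
$$\lim_{t\to\infty}t^{-n/2}\int_{[0,t]^n}\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)dt_1\cdots dt_n = 0.$$

3. If $n$ is even,
$$\lim_{t\to\infty}t^{-n/2}\int_{[0,t]^n}\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)dt_1\cdots dt_n = \sum_{\pi\in\mathcal{P}_{n/2}}\prod_{j=1}^{n/2}L_0(A_{\pi(2j-1)}, A_{\pi(2j)}),$$
where
$$L_0(A_i, A_j) = \int_{-\infty}^{\infty}\omega_0\big((\tau_0^t(A_i) - \omega_0(A_i))(A_j - \omega_0(A_j))\big)\,dt. \tag{3.42}$$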
Remark. As in Remark 2 after Theorem 3.1, Part 1 of the previous theorem with $n = 2$ implies that
$$\int_{-\infty}^{\infty}\big|\omega_0\big((\tau_0^t(A_i) - \omega_0(A_i))(A_j - \omega_0(A_j))\big)\big|\,dt < \infty,$$
and so $L_0(A_i, A_j)$ is well defined.

Theorem 3.5 is in essence the main technical result of our paper. Its proof is given in Sect. 4. We have formulated Theorem 3.5 in terms of field operators since that allows for a combinatorially natural approach to its proof. Using the identities (3.41) one effortlessly gets the following reformulation which is more convenient for our application.

Denote by $\tilde{\mathcal{M}}(\mathfrak{f})$ the set of monomials with factors from $\{a^\#(f) \mid f \in \mathfrak{f}\}$. $A \in \tilde{\mathcal{M}}(\mathfrak{f})$ is of degree at most $k$ if, for some $f_1, \ldots, f_k \in \mathfrak{f}$, one can write $A = a^\#(f_1)\cdots a^\#(f_k)$. Let
$$D_{\mathfrak{f}} = \max\left(1, \sup_{f,g\in\mathfrak{f}\cup i\mathfrak{f}}\frac{2}{\|f\|\,\|g\|}\int_{-\infty}^{\infty}\big|\omega_0\big(\varphi(f)\tau_0^t(\varphi(g))\big)\big|\,dt\right).$$

Corollary 3.6. Suppose that $D_{\mathfrak{f}} < \infty$. Then for any $A_1, \ldots, A_n \in \tilde{\mathcal{M}}(\mathfrak{f})$ of degrees at most $k_1, \ldots, k_n$ the following holds:

1.
$$\sup_{t>0}\ t^{-n/2}\int_{[0,t]^n}\left|\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)\right|dt_1\cdots dt_n \le \big(2^{4}M_{\mathfrak{f}}\big)^{\sum_i k_i}\,D_{\mathfrak{f}}^{\,n}\,n!.$$

2. If $n$ is odd,
$$\lim_{t\to\infty}t^{-n/2}\int_{[0,t]^n}\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)dt_1\cdots dt_n = 0.$$

3. If $n$ is even,
$$\lim_{t\to\infty}t^{-n/2}\int_{[0,t]^n}\omega_0\left(\prod_{i=1}^{n}\big(\tau_0^{t_i}(A_i) - \omega_0(A_i)\big)\right)dt_1\cdots dt_n = \sum_{\pi\in\mathcal{P}_{n/2}}\prod_{j=1}^{n/2}L_0(A_{\pi(2j-1)}, A_{\pi(2j)}),$$
where $L_0(A_i, A_k)$ is defined by (3.42).

Note that if $\mathfrak{c}$ is as in Subsect. 1.2 and $\mathfrak{f}$ is a finite subset of $\mathfrak{c}$, then $C_{\mathfrak{f}} < \infty$ and $D_{\mathfrak{f}} < \infty$.

After this paper was completed we have learned of a beautiful paper [De1] which is perhaps the deepest among early works on quantum central limit theorems (Dereziński's work was motivated by [Ha1,Ha2,Ru1,HL1,HL2,Da2]). In relation to our work, in [De1] Theorem 3.5 was proven in the special case $k_1 = \cdots = k_n = 2$ of quadratic interactions. This suffices for the proof of SQCLT for quasi-free dynamics and for observables which are polynomials in Fermi fields. The proofs of Parts (2) and (3) of Theorem 3.5 are not that much different in the general case $k_j \ge 2$. The key difference is in Part (1) which in the quadratic case follows easily from Stirling's formula. To prove Part (1) for any $k_j \ge 2$ is much more difficult and the bulk of the proof of Theorem 3.5 in Sect. 4 is devoted to this estimate. The proof of QCLT for locally interacting fermionic systems critically depends on this result.
3.4. Proofs of Theorems 3.1 and 3.2. In this subsection we shall show that Theorems 3.4 and 3.5 imply Theorems 3.1 and 3.2, thereby reducing the proof of Theorem 1.7 to the proof of Theorem 3.5.

If $\eta$ is a state, we shall denote
$$\eta_T(A_1, \ldots, A_n) \equiv \eta\big((A_1 - \eta(A_1))\cdots(A_n - \eta(A_n))\big).$$
Let
$$A = \sum_{k=1}^{K_A}A_k, \qquad A_k = \prod_{j=1}^{n_k}a^*(f_{kj})a(g_{kj}), \tag{3.43}$$
be an element of C. Without loss of generality we may assume that max f ∈F (A) f = 1. With ! f = eish 0 f f ∈ F(V ) ∪ F(A), s ∈ R , ∞ 1 −1 ith 0 ith 0 DV,A = max 1, 2 |( f, e g)| + |( f, e T g)| dt , max f,g∈F (V )∪F (A) f g −∞ we clearly have Mf = 1 and Df ≤ DV,A . Proof of Theorem 3.1. For |λ| ≤ λV , t1 + ωλT τλ (A), . . . , τλtn (A) =
$$\sum_{k_1,\ldots,k_n=1}^{K_A}\omega_{0T}\big(\tau_0^{t_1}\circ\gamma_\lambda^+(A_{k_1}), \ldots, \tau_0^{t_n}\circ\gamma_\lambda^+(A_{k_n})\big), \tag{3.44}$$
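and the expansion (3.40) yields that
$$\tau_0^t\circ\gamma_\lambda^+(A_k) - \omega_0\circ\gamma_\lambda^+(A_k) = \sum_{j\ge0}(i\lambda)^j\sum_{q\in Q_j(A_k)}\int_{\Delta_j}G_{A_k,q}^{(j)}(s)\Big(\tau_0^t\big(F_{A_k,q}^{(j)}(s)\big) - \omega_0\big(F_{A_k,q}^{(j)}(s)\big)\Big)\,ds, \tag{3.45}$$
where $\Delta_j$ denotes the simplex $\{s = (s_1, \ldots, s_j) \in \mathbb{R}^j \mid -\infty < s_j < \cdots < s_1 < 0\}$. We have adopted the convention that $Q_0(A_k)$ is a singleton, that $G_{A_k,q}^{(0)} = 1$ and that $F_{A_k,q}^{(0)} = A_k$. Moreover, integration over the empty simplex $\Delta_0$ is interpreted as the identity map. Applying Fubini's theorem we get
$$\begin{aligned}
t^{-n/2}&\int_{[0,t]^n}\omega_{0T}\big(\tau_0^{t_1}\circ\gamma_\lambda^+(A_{k_1}), \ldots, \tau_0^{t_n}\circ\gamma_\lambda^+(A_{k_n})\big)\,dt_1\cdots dt_n\\
&= \sum_{j_1,\ldots,j_n\ge0}(i\lambda)^{j_1+\cdots+j_n}\sum_{q_1\in Q_{j_1}(A_{k_1}),\ldots,q_n\in Q_{j_n}(A_{k_n})}\int_{\Delta_{j_1}}ds_1\cdots\int_{\Delta_{j_n}}ds_n\left(\prod_{l=1}^{n}G_{A_{k_l},q_l}^{(j_l)}(s_l)\right)C_t(j, q, s; A_{k_1}, \ldots, A_{k_n}),
\end{aligned} \tag{3.46}$$
where we have set
$$C_t(j, q, s; A_{k_1}, \ldots, A_{k_n}) = t^{-n/2}\int_{[0,t]^n}\omega_{0T}\Big(\tau_0^{t_1}\big(F_{A_{k_1},q_1}^{(j_1)}(s_1)\big), \ldots, \tau_0^{t_n}\big(F_{A_{k_n},q_n}^{(j_n)}(s_n)\big)\Big)\,dt_1\cdots dt_n.$$
We derive from Corollary 3.6 and Theorem 3.4 that
$$\big|C_t(j, q, s; A_{k_1}, \ldots, A_{k_n})\big| \le 2^{8(n_V-1)\sum_{l=1}^{n}j_l}\,\big(2^{8n_A}D_{\mathfrak{f}}\big)^n\,n!, \tag{3.47}$$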
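holds for $t > 0$. Using this bound we further get from (3.46)
$$\begin{aligned}
\sup_{t>0}\ t^{-n/2}&\int_{[0,t]^n}\big|\omega_{0T}\big(\tau_0^{t_1}\circ\gamma_\lambda^+(A_{k_1}), \ldots, \tau_0^{t_n}\circ\gamma_\lambda^+(A_{k_n})\big)\big|\,dt_1\cdots dt_n\\
&\le \prod_{l=1}^{n}\left(2^{8n_A}D_{\mathfrak{f}}\sum_{j_l\ge0}\big|2^{8(n_V-1)}\lambda\big|^{j_l}\sum_{q_l\in Q_{j_l}(A_{k_l})}\int_{\Delta_{j_l}}\big|G_{A_{k_l},q_l}^{(j_l)}(s_l)\big|\,ds_l\right)n!.
\end{aligned} \tag{3.48}$$
For $|\lambda| \le \tilde\lambda_V$ we have (recall Definitions (1.17) and (3.39)),
$$\sum_{j_l\ge0}\big|2^{8(n_V-1)}\lambda\big|^{j_l}\sum_{q_l\in Q_{j_l}(A_{k_l})}\int_{\Delta_{j_l}}\big|G_{A_{k_l},q_l}^{(j_l)}(s_l)\big|\,ds_l \le 1 + W_{V,A_{k_l}}.$$
By Theorem 3.4, the right hand side of this inequality is finite. Combining this bound with (3.44) and (3.48) we finally obtain
$$\sup_{|\lambda|<\tilde\lambda_V,\,t>0}\ t^{-n/2}\int_{[0,t]^n}\big|\omega^+_{\lambda T}\big(\tau_\lambda^{t_1}(A), \ldots, \tau_\lambda^{t_n}(A)\big)\big|\,dt_1\cdots dt_n \le \left(2^{8n_A}D_{\mathfrak{f}}\sum_{k=1}^{K_A}\big(1 + W_{V,A_k}\big)\right)^{n}n!,$$
which concludes the proof.

The above proof gives that in Theorem 3.1 one may take
$$C_{V,A} = 2^{8n_A}D_{V,A}\sum_{k=1}^{K_A}\big(1 + W_{V,A_k}\big). \tag{3.49}$$
For an explicit estimate on $W_{V,A_k}$ we refer the reader to [JOP4].

Proof of Theorem 3.2. Note that
$$\omega_\lambda^+\big((\tilde A_t)^n\big) = \sum_{k_1,\ldots,k_n=1}^{K_A}t^{-n/2}\int_{[0,t]^n}\omega_{0T}\big(\tau_0^{t_1}\circ\gamma_\lambda^+(A_{k_1}), \ldots, \tau_0^{t_n}\circ\gamma_\lambda^+(A_{k_n})\big)\,dt_1\cdots dt_n. \tag{3.50}$$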
In the proof of Theorem 3.1 we have established that the power series (3.46) converges uniformly for $|\lambda| \le \tilde\lambda_V$ and $t > 0$. Suppose first that $n$ is odd. Corollary 3.6 yields that
$$\lim_{t\to\infty}C_t(j, q, s; A_{k_1}, \ldots, A_{k_n}) = 0. \tag{3.51}$$
By (3.47) and Part 3 of Theorem 3.4 we can apply the dominated convergence theorem to the $s$-integration in (3.46) to conclude that each term of this power series vanishes as $t \to \infty$, and so
$$\lim_{t\to\infty}\omega_\lambda^+\big((\tilde A_t)^n\big) = 0,$$
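for $|\lambda| \le \tilde\lambda_V$.

If $n$ is even, Corollary 3.6 yields
$$\begin{aligned}
\lim_{t\to\infty}C_t(j, q, s; A_{k_1}, \ldots, A_{k_n})
&= \sum_{\pi\in\mathcal{P}_{n/2}}\prod_{i=1}^{n/2}L_0\Big(F_{A_{k_{\pi(2i-1)}},q_{\pi(2i-1)}}^{(j_{\pi(2i-1)})}(s_{\pi(2i-1)}),\ F_{A_{k_{\pi(2i)}},q_{\pi(2i)}}^{(j_{\pi(2i)})}(s_{\pi(2i)})\Big)\\
&= \sum_{\pi\in\mathcal{P}_{n/2}}\int_{\mathbb{R}^{n/2}}\prod_{i=1}^{n/2}\omega_{0T}\Big(\tau_0^{t_i}\big(F_{A_{k_{\pi(2i-1)}},q_{\pi(2i-1)}}^{(j_{\pi(2i-1)})}(s_{\pi(2i-1)})\big),\ F_{A_{k_{\pi(2i)}},q_{\pi(2i)}}^{(j_{\pi(2i)})}(s_{\pi(2i)})\Big)\,dt_1\cdots dt_{n/2}.
\end{aligned}$$
The estimate (3.47) (applied in the case $n = 2$) yields that
$$\int_{\mathbb{R}}\Big|\omega_{0T}\Big(\tau_0^t\big(F_{A_k,q}^{(j)}(s)\big),\ F_{A_{k'},q'}^{(j')}(s')\Big)\Big|\,dt \le \big(2^{8n_A+1/2}D_{\mathfrak{f}}\big)^2\,2^{8(n_V-1)(j+j')},$$
from which we obtain
$$\Big|\lim_{t\to\infty}C_t(j, q, s; A_{k_1}, \ldots, A_{k_n})\Big| \le \big(2^{8n_A+1/2}D_{\mathfrak{f}}\big)^n\,2^{8(n_V-1)\sum_i j_i}.$$
Arguing as in the previous case we get, for $|\lambda| \le \tilde\lambda_V$, the expansion
$$\begin{aligned}
\lim_{t\to\infty}\ &t^{-n/2}\int_{[0,t]^n}\omega_{0T}\big(\tau_0^{t_1}\circ\gamma_\lambda^+(A_{k_1}), \ldots, \tau_0^{t_n}\circ\gamma_\lambda^+(A_{k_n})\big)\,dt_1\cdots dt_n\\
&= \sum_{j_1,\ldots,j_n\ge0}(i\lambda)^{j_1+\cdots+j_n}\sum_{q_1\in Q_{j_1}(A_{k_1}),\cdots,q_n\in Q_{j_n}(A_{k_n})}\int_{\Delta_{j_1}}ds_1\cdots\int_{\Delta_{j_n}}ds_n\prod_{l=1}^{n}G_{A_{k_l},q_l}^{(j_l)}(s_l)\\
&\qquad\times\sum_{\pi\in\mathcal{P}_{n/2}}\int_{\mathbb{R}^{n/2}}\prod_{i=1}^{n/2}\omega_{0T}\Big(\tau_0^{t_i}\big(F_{A_{k_{\pi(2i-1)}},q_{\pi(2i-1)}}^{(j_{\pi(2i-1)})}(s_{\pi(2i-1)})\big),\ F_{A_{k_{\pi(2i)}},q_{\pi(2i)}}^{(j_{\pi(2i)})}(s_{\pi(2i)})\Big)\,dt_1\cdots dt_{n/2}.
\end{aligned} \tag{3.52}$$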
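By Fubini's theorem, this can be rewritten as
$$\sum_{\pi\in\mathcal{P}_{n/2}}\int_{\mathbb{R}^{n/2}}\left[\sum_{j_1,\ldots,j_n\ge0}(i\lambda)^{j_1+\cdots+j_n}\sum_{q_1\in Q_{j_1}(A_{k_1}),\cdots,q_n\in Q_{j_n}(A_{k_n})}\int_{\Delta_{j_1}}ds_1\cdots\int_{\Delta_{j_n}}ds_n\prod_{l=1}^{n}G_{A_{k_l},q_l}^{(j_l)}(s_l)\prod_{i=1}^{n/2}\omega_{0T}\Big(\tau_0^{t_i}\big(F_{A_{k_{\pi(2i-1)}},q_{\pi(2i-1)}}^{(j_{\pi(2i-1)})}(s_{\pi(2i-1)})\big),\ F_{A_{k_{\pi(2i)}},q_{\pi(2i)}}^{(j_{\pi(2i)})}(s_{\pi(2i)})\Big)\right]dt_1\cdots dt_{n/2}.$$
By Expansion (3.40), the expression inside the square brackets is
$$\prod_{i=1}^{n/2}\omega_{0T}\big(\tau_0^{t_i}\circ\gamma_\lambda^+(A_{k_{\pi(2i-1)}}),\ \gamma_\lambda^+(A_{k_{\pi(2i)}})\big) = \prod_{i=1}^{n/2}\omega^+_{\lambda T}\big(\tau_\lambda^{t_i}(A_{k_{\pi(2i-1)}}),\ A_{k_{\pi(2i)}}\big),$$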
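so that, by (3.50),
$$\begin{aligned}
\lim_{t\to\infty}\omega_\lambda^+\big((\tilde A_t)^n\big)
&= \sum_{k_1,\ldots,k_n=1}^{K_A}\sum_{\pi\in\mathcal{P}_{n/2}}\prod_{i=1}^{n/2}\int_{\mathbb{R}}\omega^+_{\lambda T}\big(\tau_\lambda^t(A_{k_{\pi(2i-1)}}),\ A_{k_{\pi(2i)}}\big)\,dt\\
&= \sum_{\pi\in\mathcal{P}_{n/2}}\prod_{i=1}^{n/2}\int_{\mathbb{R}}\omega^+_{\lambda T}\big(\tau_\lambda^t(A),\ A\big)\,dt\\
&= \frac{n!}{2^{n/2}(n/2)!}\,L(A, A)^{n/2}.
\end{aligned}$$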
4. Proof of Theorem 3.5

For notational simplicity throughout this section we shall drop the subscript 0 and write $h$ for $h_0$, $\tau$ for $\tau_0$, $\omega$ for $\omega_0$. We shall also use the shorthand (3.43).

4.1. Graphs, Pairings and Pfaffians. A graph is a pair of sets $g = (V, E)$ where $E$ is a set of 2-element subsets of $V$. The elements of $V$ are called points or vertices of $g$, those of $E$ are its edges. Abusing notation, we shall write $v \in g$ for vertices of $g$ and $e \in g$ for its edges. If $v \in e \in g$ we say that the edge $e$ is incident to the vertex $v$. If the edge $e$ is incident to the vertices $u$ and $v$ we write $e = uv$ and say that the edge $e$ connects $u$ to $v$. The degree of a vertex $v \in g$ is the number of distinct edges $e \in g$ incident to $v$. A graph is $k$-regular if all its vertices share the same degree $k$. A vertex $v \in g$ of degree 0 is said to be isolated.

A path on $g$ is a sequence $(v_0, e_1, v_1, e_2, \ldots, e_n, v_n)$, where $v_i \in V$, $e_i \in E$ and $e_i = v_{i-1}v_i$. We say that such a path connects the vertices $v_0$ and $v_n$. If $v_0 = v_n$ the path is closed and is called a loop. The graph $g$ is connected if, given any pair $v, v' \in V$ there is a path on $g$ which connects $v$ and $v'$. A connected graph without loops is a tree.

A graph $g' = (V', E')$ is a subgraph of the graph $g = (V, E)$ if $V' \subset V$ and $E' \subset E$. A subgraph $g'$ of $g$ is said to be spanning $g$ if $V' = V$. A connected graph $g$ has a spanning tree, i.e., a subgraph which is spanning and is a tree.

Let $g = (V, E)$ be a graph. To a subset $W \subset V$ we associate a subgraph $g|_W = (W, E|_W)$ of $g$ by setting $E|_W = \{e = uv \in E \mid u, v \in W\}$. Given two graphs $g_1 = (V_1, E_1)$ and $g_2 = (V_2, E_2)$ such that $V_1$ and $V_2$ are disjoint we denote by $g_1 \vee g_2$ the joint graph $(V_1 \cup V_2, E_1 \cup E_2)$.

Let $g = (V, E)$ be a graph and $\Pi = \{V_1, \ldots, V_n\}$ a partition of $V$. The set
$$E/\Pi = \{V_iV_j \mid \text{there are } u \in V_i,\ v \in V_j \text{ such that } uv \in E\}$$
defines a graph $g/\Pi = (\Pi, E/\Pi)$. We say that $g/\Pi$ is the $\Pi$-skeleton of $g$.

A graph $g = (V, E)$ is said to be $(V_1, V_2)$-bipartite if there is a partition $V = V_1 \cup V_2$ such that all edges $e \in E$ connect a vertex of $V_1$ to a vertex of $V_2$.

A pairing on a set $V$ is a graph $p = (V, E)$ such that every vertex $v \in V$ belongs to exactly one edge $e \in E$. Equivalently, $p = (V, E)$ is a pairing if $E$ is a partition of $V$ or if it is 1-regular. We denote by $\mathcal{P}(V)$ the set of all pairings on $V$. Clearly, only sets $V$ of even parity $|V| = 2n$ admit pairings and in this case one has
$$|\mathcal{P}(V)| = \frac{(2n)!}{2^n n!} = (2n - 1)!!.$$
If the set $V = \{v_1, \ldots, v_{2n}\}$ is completely ordered, $v_1 < v_2 < \cdots < v_{2n}$, writing
$$E = \{\pi(v_1)\pi(v_2), \pi(v_3)\pi(v_4), \ldots, \pi(v_{2n-1})\pi(v_{2n})\},$$
sets a one-to-one correspondence between pairings $p = (V, E)$ and permutations $\pi \in S_V$ such that $\pi(v_{2i-1}) < \pi(v_{2i})$ and $\pi(v_{2i-1}) < \pi(v_{2i+1})$ for $i = 1, \ldots, n$ (compare with (1.5)). In the sequel we will identify the two pictures and denote by $p$ the permutation of $V$ associated to the pairing $p$. In particular, the signature $\varepsilon(p)$ of a pairing $p$ is given by the signature of the corresponding permutation.

A diagrammatic representation of a pairing $p \in \mathcal{P}(V)$ is obtained by drawing the vertices $v_1, \ldots, v_{2n}$ as $2n$ consecutive points on a line. Each edge $e \in p$ is drawn as an arc connecting the corresponding points above this line (see Fig. 1). It is well known that the signature of $p$ is then given by $\varepsilon(p) = (-1)^k$, where $k$ is the total number of intersection points of these arcs.

Fig. 1. Diagrammatic representation of a pairing p

If $V = V_1 \cup V_2$ is a partition of $V$ into two equipotent subsets we denote by $\mathcal{P}(V_1, V_2) \subset \mathcal{P}(V)$ the corresponding set of $(V_1, V_2)$-bipartite pairings and note that $|\mathcal{P}(V_1, V_2)| = n!$. If $V_1 = \{v_1, \ldots, v_n\}$ and $V_2 = \{v_{n+1}, \ldots, v_{2n}\}$ are completely ordered by $v_1 < \cdots < v_n < \cdots < v_{2n}$ then $p(v_{2i-1}) = v_i$ and $\sigma(v_{n+i}) = p(v_{2i})$ for $1 \le i \le n$ defines a one-to-one correspondence between bipartite pairings $p \in \mathcal{P}(V_1, V_2)$ and permutations $\sigma \in S_{V_2}$. A simple calculation shows that $\varepsilon(p) = (-1)^{n(n-1)/2}\varepsilon(\sigma)$.

In the special case $V = \{1, \ldots, 2n\}$, $V_1 = \{1, \ldots, n\}$ and $V_2 = \{n + 1, \ldots, 2n\}$ we shall set $\mathcal{P}(V) = \mathcal{P}_n$ and $\mathcal{P}(V_1, V_2) = \widetilde{\mathcal{P}}_n$.

The Pfaffian of a $2n \times 2n$ skew-symmetric matrix $M$ is defined by
$$\mathrm{Pf}(M) = \sum_{p\in\mathcal{P}_n}\varepsilon(p)\prod_{i=1}^{n}M_{p(2i-1)p(2i)}.$$
If $B$ is an $n \times n$ matrix and
$$M = \begin{pmatrix}0 & B\\ -B^T & 0\end{pmatrix},$$
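then only bipartite pairings $p \in \widetilde{\mathcal{P}}_n$ contribute to the Pfaffian of $M$ which reduces to
$$\mathrm{Pf}(M) = \sum_{p\in\widetilde{\mathcal{P}}_n}\varepsilon(p)\prod_{i=1}^{n}B_{p(2i-1)p(2i)} = \sum_{\sigma\in S_n}(-1)^{n(n-1)/2}\varepsilon(\sigma)\prod_{i=1}^{n}B_{i\sigma(i)} = (-1)^{n(n-1)/2}\det(B). \tag{4.53}$$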
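The combinatorial facts of this subsection can be verified by brute force for small $n$. The sketch below is only an illustration with helper names of our own choosing and vertices labelled $0, \ldots, 2n-1$. It enumerates all pairings, compares the permutation signature with the crossing-number rule $\varepsilon(p) = (-1)^k$, and evaluates the Pfaffian directly from its definition so that it can be compared with the right-hand side of (4.53).

```python
import numpy as np

def pairings(vertices):
    """Yield all pairings of an ordered list as lists of pairs (u, v) with u < v,
    the pairs being sorted by their first element."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def permutation_sign(seq):
    """Signature of a permutation given as a sequence, via its inversion count."""
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return -1 if inversions % 2 else 1

def pairing_sign(p):
    """Signature of the permutation (u_1, v_1, u_2, v_2, ...) attached to p."""
    return permutation_sign([x for pair in p for x in pair])

def crossings(p):
    """Number of intersecting arcs in the diagram of p."""
    return sum(1 for (a, b) in p for (c, d) in p if a < c < b < d)

def pfaffian(m):
    """Pfaffian of a skew-symmetric matrix, summed over all pairings."""
    total = 0.0
    for p in pairings(list(range(m.shape[0]))):
        total += pairing_sign(p) * np.prod([m[u, v] for (u, v) in p])
    return total

n = 3
all_pairings = list(pairings(list(range(2 * n))))
bipartite = [p for p in all_pairings if all(u < n <= v for (u, v) in p)]

rng = np.random.default_rng(0)
B = rng.normal(size=(n, n))
M = np.block([[np.zeros((n, n)), B], [-B.T, np.zeros((n, n))]])
```

The enumeration grows like $(2n-1)!!$, so this direct definition is only practical for small matrices.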
4.2. Truncating quasi-free expectations. Let V ⊂ h be finite and totally ordered. To any subset W ⊂ V we assign the monomial (W ) ≡ ϕ(u), u∈W
where the product is ordered from left to right in increasing order of the index u. Let ω be a gauge invariant quasi-free state on CAR(h). We define a |V | × |V | skewsymmetric matrix by setting uv ≡ ω(ϕ(u)ϕ(v)), for u, v ∈ V and u < v. We also denote by W the sub-matrix of whose row and column indices belong to W . Then we have Pf(W ) if |W | is even, ω((W )) = (4.54) 0 otherwise. If |W | is even, assigning to any pairing p ∈ P(W ) the weight ( p) ≡ uv , uv∈ p u xi (g); |L ∩ Vi | is even; |M ∩ Vi | = |M ∩ Vi |.
If X, Y are two subsets of V denote by X,Y the sub-matrix of with row (resp. column) indices in X (resp. Y ).
Fig. 4. The partition of Vi induced by a pairing p. Solid lines belong to the exit graph ex( p)
Lemma 4.3. For g ∈ Ex() one has
S(g) =
ε(θ )ω((R))
θ=(X,L ,M,M ,R)∈(g)
ω((L ∩ Vi )) det( M∩Vi ,M ∩Vi ) , (4.61)
i∈I
where (g) denotes the set of g-admissible partitions of V and ε(θ ) ≡ ε(X, L ∩ V1 , . . . , L ∩ Vn , (M ∪ M ) ∩ V1 , . . . , (M ∪ M ) ∩ Vn , R)ε(g|X ) (−1)|M∩Vi |(|M∩Vi |−1)/2 . i∈I
Proof. Let us have a closer look at a pairing p whose exit graph is g. What happens in X i (g) ≡ Vi ∩ X (g) is completely determined by g. However, the structure of p|Vi (g) , where Vi (g) ≡ Vi ∩ V (g) depends on finer details of p. Edges of p which are incident to a vertex in Vi (g) located to the left of the exit point xi (g) must connect this vertex to another vertex in Vi (g). These edges split in two categories: the ones which connect two vertices on the left of the exit point and the ones which connect a vertex on the left to a vertex on the right. We denote by L i ( p) the set of vertices which belong to an edge of the first category, and by Mi ( p) the vertices located to the left of xi (g) and belonging to an edge of the second one. By Mi ( p) we denote the set of vertices which are connected to elements of Mi ( p). This subset of Vi (g) is located on the right of the exit point. We group the remaining vertices of Vi (g), which are all on the right of the exit point, into a fourth set Ri ( p). Elements of this set connect among themselves or with elements of R j ( p) for some j = i (see Fig. 4). Setting L( p) ≡
+
L i ( p),
M( p) ≡
i∈I
+ i∈I
Mi ( p),
M ( p) ≡
+
Mi ( p),
i∈I
we obtain a partition θ ( p) ≡ (X (g), L( p), M( p), M ( p), R( p)),
R( p) ≡
+ i∈I
Ri ( p),
of V which is clearly g-admissible. Moreover, setting li ( p) ≡ p|L( p)∩Vi ∈ P(L( p) ∩ Vi ), m i ( p) ≡ p|(M( p)∪M ( p))∩Vi ∈ P(M( p) ∩ Vi , M ( p) ∩ Vi ), r ( p) ≡ p|R( p) ∈ P(R( p)), we obtain a map from ex−1 ({g}) to the set ,
+
{θ }×
θ=(X,L ,M,M ,R)∈(g)
P(L ∩ Vi ) ×
i∈I
-
P(M ∩ Vi , M ∩ Vi ) ×P(R) .
i∈I
Since p=g∨
*
li ( p) ∨
i∈I
*
m i ( p) ∨ r ( p),
i∈I
is injective. For any g-admissible partition θ = (X, L , M, M , R) and any li ∈ P(L ∩ Vi ), m i ∈ P(M ∩ Vi , M ∩ Vi ), r ∈ P(R),
(4.62)
* * p=g∨ li ∨ mi ∨ r
(4.63)
the pairing
i∈I
i∈I
satisfies ex( p) = g, θ ( p) = θ, li ( p) = li , m i ( p) = m i , r ( p) = r. We conclude that is bijective. Thus, using Lemma 4.1, we can rewrite the sum S(g) as
ε(g|X )ε(X, L ∩ V1 , . . . , L ∩ Vn , (M ∪ M )
θ=(X,L ,M,M ,R)∈(g)
∩V1 , . . . , (M ∪ M ) ∩ Vn , R) ⎞ ⎛ ⎛ ⎝ ⎝ ε(li )(li )⎠ i∈I
li ∈P (L∩Vi )
i∈I
⎞
ε(m i )(m i )⎠
m i ∈P (M∩Vi ,M ∩Vi )
The result now follows from Eq. (4.53) and (4.55).
r ∈P (R)
ε(r )(r ).
4.4. Estimating truncated expectations. Apart from the entropic factor |(g)|, the following lemma controls the partial sum S(g). Lemma 4.4. For g ∈ Ex() one has |S(g)| ≤ 2−|V (g)|/2 | (g)|
v.
v∈V (g)
Proof. Since ϕ( f )2 =
1 ∗ 1 {a ( f ), a( f )} = f 2 , 2 2
we have, for any X ⊂ V , the simple bound |ω((X ))| ≤ 2−|X |/2
v.
v∈X
Combining this estimate with the following lemma, the result is an immediate consequence of Formula (4.61).

Lemma 4.5. Let $B$ be the $k \times k$ matrix defined by $B_{ij} = \omega(\varphi(u_i)\varphi(v_j))$. Then, the estimate
$$|\det(B)| \le 2^{-k}\prod_{i=1}^{k}\big(\|u_i\|\,\|v_i\|\big)$$
holds.

Proof. Let $\bar{\,\cdot\,}$ be a complex conjugation on $\mathfrak{h}$. The real-linear map
$$Q : \mathfrak{h} \to \mathfrak{h} \oplus \mathfrak{h}, \qquad f \mapsto (1 - T)^{1/2}f \oplus \bar T^{1/2}\bar f,$$
is isometric and such that
$$\omega(\varphi(u_i)\varphi(v_j)) = \frac12(u_i, v_j) - \frac12(u_i, Tv_j) + \frac12\overline{(u_i, Tv_j)} = \frac12(Qu_i, Qv_j).$$
It immediately follows that
$$\det(B) = 2^{-k}\,\omega_{\mathrm{Fock}}\big(a(Qu_1)\cdots a(Qu_k)a^*(Qv_k)\cdots a^*(Qv_1)\big),$$
where $\omega_{\mathrm{Fock}}$ denotes the Fock-vacuum state on $\mathrm{CAR}(\mathfrak{h} \oplus \mathfrak{h})$. The fact that $\|a(Qu)\| = \|a^*(Qu)\| = \|Qu\| = \|u\|$ for any $u \in \mathfrak{h}$ yields the result.
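The determinant bound of Lemma 4.5 can be spot-checked numerically. The sketch below makes several assumptions of our own: the one-particle space is $\mathbb{C}^5$, the inner product is antilinear in its first argument, $T$ is a random Hermitian matrix with spectrum in $(0, 1)$, and the two-point function is taken in the form $\omega(\varphi(u)\varphi(v)) = \frac12(u, v) - i\,\mathrm{Im}(u, Tv)$ recalled in Subsect. 3.3.

```python
import numpy as np

def random_density_operator(rng, dim):
    """Random Hermitian T with eigenvalues strictly between 0 and 1."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(m)  # random unitary
    eigenvalues = rng.uniform(0.05, 0.95, size=dim)
    return (q * eigenvalues) @ q.conj().T

def two_point(u, v, T):
    """omega(phi(u) phi(v)) = (u, v)/2 - i Im(u, T v); np.vdot conjugates its first argument."""
    return 0.5 * np.vdot(u, v) - 1j * np.imag(np.vdot(u, T @ v))

def determinant_bound_holds(rng, dim, k, tol=1e-12):
    """Draw random u_1..u_k, v_1..v_k and test |det B| <= 2^{-k} prod ||u_i|| ||v_i||."""
    T = random_density_operator(rng, dim)
    us = rng.normal(size=(k, dim)) + 1j * rng.normal(size=(k, dim))
    vs = rng.normal(size=(k, dim)) + 1j * rng.normal(size=(k, dim))
    B = np.array([[two_point(u, v, T) for v in vs] for u in us])
    bound = 2.0 ** (-k) * np.prod(
        np.linalg.norm(us, axis=1) * np.linalg.norm(vs, axis=1)
    )
    return bool(abs(np.linalg.det(B)) <= bound + tol)

rng = np.random.default_rng(1)
checks = [determinant_bound_holds(rng, 5, k) for k in (1, 2, 3, 4) for _ in range(20)]
```

Random vectors are typically far from saturating the bound; equality requires the vectors $Qu_i$ and $Qv_i$ to be suitably aligned and mutually orthogonal.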
For u, v ∈ V such that u < v set uv ≡ 2
|uv | |ω(ϕ(u)ϕ(v))| =2 , u v u v
and for any graph p on V set ( p) ≡
uv .
uv∈ p u l( j), in (g). It follows that (g) ≤
e j (s j ),
j∈I \{r }
and hence [0,t]n
(g) dt1 · · · dtn ≤ 0
⎛ t
⎝
⎞ t
j∈I \{r } −t
e j (s j )ds j ⎠ dsr ≤ C n−1 t.
In the general case, g/ is the disjoint union of Nc (g) connected subgraphs. Applying the above estimate to each of them yields the result.
Fig. 5. The pairing π induced by a maximally disconnected pairing p
Inserting the estimate (4.65) into Eq. (4.64) and using Lemma 4.8 we finally obtain, taking into account the fact that the skeleton of an exit graph can have at most n/2 connected components [0,t]n
2N √ |ωT (A1 , . . . , An )| dt1 · · · dtn ≤ 8 2 max f i j C n t n/2 n!, ij
which concludes the proof of Part 1. To prove part 2 it suffices to notice that if n is odd then the skeleton of an exit graph can have at most (n − 1)/2 connected components. To prove part 3, we go back to Formula (4.57) and write −n/2 −n/2 t ωT (A1 , . . . , An ) dt1 · · · dtn = ε( p) t ( p) dt1 · · · dtn . [0,t]n
p∈P ()
[0,t]n
(4.66) By Lemmata 4.6 and 4.9 one has, as t → ∞, −n/2 t ( p) dt1 · · · dtn = O(t Nc ( p)−n/2 ). [0,t]n
Thus, the pairings p ∈ P() which contribute to the limit t → ∞ are maximally disconnected in the sense that their skeleton have exactly n/2 connected components. The skeleton p/ of such a pairing induces a pairing π ∈ Pn/2 such that p = p1 ∨ · · · ∨ pn/2 ,
p j ∈ P0 (Vπ(2 j−1) , Vπ(2 j) ),
where P0 (Vi , V j ) denotes the set of pairings on Vi ∪V j whose skeleton w.r.t. the partition (Vi , V j ) has no isolated vertex (see Fig. 5). Since the map p → (π, p1 , . . . , pn/2 ) is clearly bijective we can, for the purpose of computing the limit of (4.66) as t → ∞, replace ωT (A1 , . . . , An ) by ε( p1 ∨ · · · ∨ pn/2 ) ( p1 ∨ · · · ∨ pn/2 ). π ∈Pn/2
p j ∈P0 (Vπ(2 j−1) ,Vπ(2 j) )
By Lemma 4.1 we have ε( p1 ∨ · · · ∨ pn/2 ) = ε(Vπ(1) , . . . , Vπ(n) )ε( p1 ) · · · ε( pn/2 ), = ( p1 ) · · · ( pn/2 )
( p1 ∨ · · · ∨ pn/2 )
and by the remark following it, ε(Vπ(1) , . . . , Vπ(n) ) = 1. Thus, the last expression can be rewritten as ⎞ ⎛ n/2 ⎝ ε( p j )( p j )⎠ . π ∈Pn/2 j=1
p j ∈P0 (Vπ(2 j−1) ,Vπ(2 j) )
Finally observe that, by Lemma 4.2, ε( p)( p) = ωT (Ai , A j ). p∈P0 (Vi ,V j )
One easily concludes the proof by the remark following Theorem 3.5 and the dominated convergence theorem. Acknowledgement. A part of this work has been done during Y.P.’s stay at McGill University and the C.R.M. as ISM Postdoctoral Fellow, during his visit to McGill University funded by NSERC and his visit to Erwin Schrödinger Institut. The research of V.J. was partly supported by NSERC. We wish to thank Manfred Salmhofer for useful discussions.
Communicated by H.-T. Yau
Commun. Math. Phys. 285, 219–264 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0542-1
Communications in
Mathematical Physics
Current Algebra on the Torus
Louise Dolan1, Peter Goddard2
1 Department of Physics, University of North Carolina, Chapel Hill, NC 27599, USA. E-mail: [email protected]
2 Institute for Advanced Study, Princeton, NJ 08540, USA
Received: 2 November 2007 / Accepted: 31 January 2008 Published online: 24 June 2008 – © Springer-Verlag 2008
Abstract: We derive the N-point one-loop correlation functions for the currents of an arbitrary affine Kac-Moody algebra. The one-loop amplitudes, which are elliptic functions defined on the torus Riemann surface, are specified by group invariant tensors and certain constant tau-dependent functions. We compute the elliptic functions via a generating function, and explicitly construct the invariant tensor functions recursively in terms of Young tableaux. The lowest tensors are related to the character formula of the representation of the affine algebra. These general current algebra loop amplitudes provide a building block for open twistor string theory, among other applications.

1. Introduction

Current algebra conformal field theory is often an important ingredient to supply gauge symmetry in string theory. The tree level N-point correlation functions of the currents [1] of an affine Kac-Moody Lie algebra [2], $\hat{g}$,
$$[J^a_m, J^b_n] = f^{ab}{}_c\, J^c_{m+n} + \kappa^{ab}\, m\, \delta_{m,-n}, \tag{1.1}$$
associated with a finite-dimensional algebra, $g$, are especially simple, and expressed as a sum over products of differences, with the group tensors given by the level and structure constants of the affine algebra. The current correlators on the torus have more structure, but turn out to be computable in terms of elliptic functions, and specified by constant but tau-dependent group invariant tensors. Recursion relations for these correlation functions [3] become tedious to evaluate for large numbers of currents. In this paper we calculate the one-loop N-point current correlation functions explicitly for an arbitrary Lie group, and describe their dependence on rather neat combinations of Weierstrass functions and on group tensors given in terms of the character of the representation. Loop calculations were considered for vertex operator algebras in [4,5], for particular constructions and levels of current algebras in [6,7], and for particular Lie groups [8].
Our general treatment of the affine current correlators is possible due to the simple holomorphic operator products of the currents. Loop correlation functions for other fields related to current algebras tend to be less completely accessible, although widely studied [9–18]. Our interest in the current algebra torus correlator was initially motivated by its appearance in the gluon loop amplitude [3] for open twistor string theory [19,20]. The N-point torus current correlator should be helpful to pursue perturbation theory there. The twistor string [21,19], and efforts to formulate it as a heterotic theory [22], although mixing conformal supergravity with Yang-Mills, also provides an enticing framework for a QCD string. Our analysis of current algebra on the torus provides a fundamental building block that will have general applications. The plan of this paper is as follows. In Sect. 2.1, we first use the representation of a current algebra as bilinear expressions in (Neveu-Schwarz) fermions to evaluate current algebra tree amplitudes. The expressions obtained involve the tensors formed from the traces of products of the real matrices representing g and can be described by a set of graphical rules that will be extended later in the paper to yield loop amplitudes. Although the tensors depend on the representation chosen, this dependence cancels out in the expressions for the tree amplitudes because these are determined by κ ab . For a compact simple algebra, we can take κ ab = kδ ab and we can obtain the general tree amplitude by scaling terms in the result obtained for any given representation by appropriate powers of k. Notwithstanding this, in Sect. 2.2, we find it useful to give a more generally phrased version of the construction, due to Frenkel and Zhu (FZ) [1]. 
This generalizes the traces of representation matrices to invariant $m$-th order tensors $\kappa_m$, satisfying conditions (2.15) and (2.16), which determine $\kappa_m$ in terms of $\kappa_{m-1}$ uniquely up to an arbitrary symmetric invariant tensor $\omega_m$. The successive freedoms, represented by the $\omega_n$, have no effect on the tree amplitudes constructed using the $\kappa_m$. We isolate a “connected” part of the tree amplitude, which possesses only simple poles, and show that, like the full amplitude, this just depends on $\kappa_2 = \kappa$, and so not on the $\omega_m$. In Sect. 2.3 we give a proof that, given a suitable $\kappa_{m-1}$, there exists a $\kappa_m$ satisfying (2.15) and (2.16), and we give explicit formulae for the general $\kappa_3$ and $\kappa_4$. Our proof of the existence of $\kappa_n$ does not itself provide a convenient algorithmic construction and we give this in Appendix A using Young tableaux and the representation theory of the permutation group. In Sect. 3.1, we begin by computing the n-point one loop amplitude using the representation of the current algebra as bilinears in Neveu-Schwarz fermions. The result is given by a modification of the graphical rules used in Sect. 2.1 to describe tree amplitudes. Similar rules describe two other versions of the loop: one in which we use Neveu-Schwarz fermions but also incorporate a factor of $(-1)^{N_b}$, where $N_b$ is the fermion number operator, into the trace defining the loop; and one where we use Ramond rather than Neveu-Schwarz fermions. These rules involve the tensors constructed from traces of representation matrices, used in the fermionic construction of tree amplitudes. There nearly all the structure resulting from varying the representation, reflected in the ‘arbitrary’ symmetric tensors $\omega_m$ coming into the FZ construction, was irrelevant, but this is not so for the loops. To approach the construction of the general one loop current algebra amplitude, we isolate a connected part of the amplitude in Sect. 3.2, which has only single poles, as we did for the tree amplitudes.
The residues of this connected part for the n-point loop are specified in terms of the (n − 1)-point loop and this means that the n point loop is determined in this way up to a symmetric invariant n th order tensor, ωn (τ ),
depending only on the torus modulus, $\tau$. In Sect. 3.3, we first obtain general forms for the two- and three-point loops in terms of symmetric invariant tensors $\omega_2(\tau)$ and $\omega_3(\tau)$ and Weierstrass $\wp$ and $\zeta$ functions. The general form for the n-point loop is given by an adaptation of the rules for tree amplitudes, expressed in terms of Weierstrass $\sigma$ functions through
$$\nu^{-n} H_n = \prod_{j=1}^{n} \frac{\sigma(\mu_j + \nu, \tau)}{\sigma(\nu, \tau)\,\sigma(\mu_j, \tau)} = \sum_{m=0}^{\infty} H_{n,m}\, \nu^{m-n}, \tag{1.2}$$
which is elliptic as a function of $\nu$ and $\mu_1, \ldots, \mu_n$, provided that $\sum_{j=1}^{n} \mu_j = 0$, and in terms of $n$-th order invariant tensor functions of $\tau$, $\kappa_{n,m}(\tau)$, with $n \ge m \ge 2$, defined inductively by (3.55) and (3.56) (which are similar to (2.15) and (2.16)), starting from invariant symmetric tensors $\kappa_{n,0}(\tau) = \omega_n(\tau)$. In Appendix B we discuss properties of the functions $H_{n,m}$ and in Appendix C we show how the general results of this section relate to those previously obtained in [3] for two-, three- and four-point loops.

The symmetric tensors $\omega_n$, irrelevant in the construction of tree amplitudes in Sect. 2, provide the extra structure necessary for the construction of the one-loop amplitudes. They are not arbitrary but can be determined in terms of traces of zero modes of the currents, $\mathrm{tr}\big(J_0^{a_{i_1}} J_0^{a_{i_2}} \cdots J_0^{a_{i_n}}\, w^{L_0}\big)$, symmetrized over the indices $a_j$. In Sect. 4.1, we establish recurrence relations relating the traces over symmetrized products of currents, in terms of which the $\omega_n(\tau)$ are initially defined, to symmetrized traces of their zero modes, showing how this works out in detail for $n = 2$, 3 and 4. More precisely, $\omega_n(\tau)$ are defined in terms of the connected parts of the symmetrized traces of currents and, in Sect. 4.2, we use the recurrence relations to determine $\omega_n(\tau)$ in terms of the connected part of the symmetrized trace of zero modes. Then, in Sect. 4.3, we show how the symmetrized traces of zero modes of the currents can be determined in terms of
$$\chi(\theta, \tau) = \mathrm{tr}\big(e^{iH\cdot\theta}\, w^{L_0}\big), \tag{1.3}$$
the character of the representation of $\hat{g}$ provided by the space of states of the theory. While the analysis up to this point has not made any assumptions about the Lie algebra $g$, in this section we assume that it is compact and, for ease of exposition, take it to be simple.
The method depends on using the Harish-Chandra isomorphism of the center of the enveloping algebra of $g$, that is the ring of Casimir operators of $g$, onto the polynomials in $H$ invariant under the action of the Weyl group, $W_g$ of $g$. Section 5 provides a summary of our results.

2. Current Algebra Trees

2.1. Current algebra and the Fermionic tree construction. We consider a conformal field theory containing the affine algebra, $\hat{g}$, given by (1.1), where $m, n$ are integers and $f^{ab}{}_c$ are the structure constants of $g$ and $\kappa^{ab}$ is a symmetric tensor invariant with respect to $g$. [If the generators of the algebra satisfy the hermiticity condition $J^{a\dagger}_n = J^a_{-n}$, $f^{ab}{}_c$ is pure imaginary and $\kappa^{ab}$ is real.] For a general introductory review see [23]. We consider evaluating the vacuum expectation value
$$A^{a_1 a_2 \ldots a_n}_{\rm tree}(z_1, z_2, \ldots, z_n) = \langle 0|J^{a_1}(z_1) J^{a_2}(z_2) \cdots J^{a_n}(z_n)|0\rangle, \tag{2.1}$$
where
$$J^a(z) = \sum_{n} J^a_n z^{-n-1}, \qquad J^a_n|0\rangle = 0,\ n \ge 0, \qquad (J^a_n)^\dagger = J^a_{-n}. \tag{2.2}$$
The currents $J^a(z)$ satisfy the operator product expansion
$$J^a(z_1) J^b(z_2) \sim \frac{\kappa^{ab}}{(z_1 - z_2)^2} + \frac{f^{ab}{}_c\, J^c(z_2)}{z_1 - z_2} \tag{2.3}$$
and the tree amplitudes satisfy the asymptotic condition
$$A^{a_1 a_2 \ldots a_n}_{\rm tree}(z_1, z_2, \ldots, z_n) = O(z_j^{-2}) \quad \text{as } z_j \to \infty, \tag{2.4}$$
because in this limit $\langle 0|J^a(z) \sim \langle 0|J^a_1 z^{-2}$. Because of the locality of the currents $J^a(z)$ relative to one another, the tree amplitude (2.1) is symmetric under simultaneous permutations of the $z_i$ and $a_i$,
$$A^{a_{\rho(1)} a_{\rho(2)} \ldots a_{\rho(n)}}_{\rm tree}(z_{\rho(1)}, z_{\rho(2)}, \ldots, z_{\rho(n)}) = A^{a_1 a_2 \ldots a_n}_{\rm tree}(z_1, z_2, \ldots, z_n), \tag{2.5}$$
where $\rho \in S_n$, the group of permutations on $n$ objects. The condition (2.3) gives all the singularities of the n-point function in terms of $(n-1)$- and $(n-2)$-point functions. Thus, given (2.4), using Cauchy’s Theorem, we can inductively calculate the n-point function for any $n$ starting from the two-point function,
$$\langle 0|J^a(z_1) J^b(z_2)|0\rangle = \frac{\kappa^{ab}}{(z_1 - z_2)^2}, \tag{2.6}$$
i.e. the n-point function is determined by the invariant symmetric tensor $\kappa^{ab}$. A general prescription for doing this has been given by Frenkel and Zhu [1], which we shall discuss in Sect. 2.2, but first we shall note the explicit calculation when $J^a(z)$ is given as a bilinear in fermionic oscillators. Given a representation $J^a_0 \mapsto t^a = iM^a$ of $g$, where the $M^a$ are $N$-dimensional real antisymmetric matrices satisfying
$$[M^a, M^b] = -i f^{ab}{}_c M^c, \tag{2.7}$$
we can represent $J^a(z)$ as a bilinear in Neveu-Schwarz fermionic fields,
$$J^a(z) = \sum_{n \in \mathbb{Z}} J^a_n z^{-n-1} = \frac{i}{2} M^a_{ij}\, b^i(z) b^j(z), \tag{2.8}$$
where
$$b^i(z) = \sum_{r \in \mathbb{Z} + \frac12} b^i_r z^{-r-\frac12}, \qquad \{b^i_r, b^j_s\} = \delta_{r,-s}\, \delta^{ij}, \qquad b^i_r|0\rangle = 0,\ r > 0, \qquad 1 \le i, j \le N. \tag{2.9}$$
Then $J^a_n$ satisfies (1.1) with
$$\kappa^{ab} = -\tfrac12 \mathrm{tr}(M^a M^b) = \tfrac12 \mathrm{tr}(t^a t^b). \tag{2.10}$$
Note
$$b^i(z_1) b^j(z_2) = {:}b^i(z_1) b^j(z_2){:} + \frac{\delta^{ij}}{z_1 - z_2}, \tag{2.11}$$
with the usual definition of normal ordering. Using Wick’s theorem, we can evaluate the tree amplitude (2.1) and describe the result as follows. The n-point function can be written as a sum over permutations $\rho \in S_n$ with no fixed point. Each such permutation can be written as a product of cycles, $\rho = \xi_1 \xi_2 \ldots \xi_r$, and we associate to $\rho$ a product $F_\rho = (-1)^r f_{\xi_1} f_{\xi_2} \ldots f_{\xi_r}$, where the function $f_\xi$ is associated with the cycle $\xi = (i_1, i_2, \ldots, i_m)$, defined by
$$f_\xi = \frac{\tfrac12 \mathrm{tr}(t^{a_{i_1}} t^{a_{i_2}} \cdots t^{a_{i_m}})}{(z_{i_1} - z_{i_2})(z_{i_2} - z_{i_3}) \cdots (z_{i_m} - z_{i_1})}. \tag{2.12}$$
The n-point tree amplitude is then constructed as the sum of these products over the permutations $\rho \in S'_n$, the subset of $S_n$ with no fixed points,
$$A^{a_1 a_2 \ldots a_n}_{\rm tree}(z_1, z_2, \ldots, z_n) = \sum_{\rho \in S'_n} F_\rho^{a_1 a_2 \ldots a_n}(z_1, z_2, \ldots, z_n). \tag{2.13}$$
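The sum over fixed-point-free permutations in (2.12) and (2.13) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the generators are taken to be $t^a = iM^a$ in the three-dimensional representation of so(3), which is our choice for the example and not one made in the text, and the function names (`so3_generators`, `cycles_of`, `f_cycle`, `tree_amplitude`) are hypothetical. Indices are zero-based.

```python
import itertools
import numpy as np

def so3_generators():
    """t^a = i M^a with M^a real antisymmetric 3x3 matrices (illustrative choice)."""
    eps = np.zeros((3, 3, 3))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[a, b, c] = 1.0
        eps[a, c, b] = -1.0
    return [1j * eps[a] for a in range(3)]

def cycles_of(perm):
    """Decompose a permutation (tuple mapping i -> perm[i]) into cycles."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cyc, j = [], start
        while j not in seen:
            seen.add(j)
            cyc.append(j)
            j = perm[j]
        out.append(cyc)
    return out

def f_cycle(cyc, labels, z, t):
    """Cycle factor of (2.12): half the trace over the product of pole factors."""
    mat = np.eye(t[0].shape[0], dtype=complex)
    for i in cyc:
        mat = mat @ t[labels[i]]
    denom = 1.0
    for k in range(len(cyc)):
        denom *= z[cyc[k]] - z[cyc[(k + 1) % len(cyc)]]
    return 0.5 * np.trace(mat) / denom

def tree_amplitude(labels, z, t):
    """Sum (2.13) over permutations with no fixed point, with a factor -1 per cycle."""
    n, total = len(labels), 0.0
    for perm in itertools.permutations(range(n)):
        if any(perm[i] == i for i in range(n)):
            continue
        term = 1.0
        for cyc in cycles_of(perm):
            term *= -f_cycle(cyc, labels, z, t)
        total += term
    return total
```

For two currents with equal labels, `tree_amplitude([0, 0], [z1, z2], so3_generators())` reproduces $\kappa^{ab}/(z_1 - z_2)^2$ with $\kappa^{ab} = \tfrac12\mathrm{tr}(t^a t^b) = \delta^{ab}$ in this normalisation. The brute-force enumeration grows factorially with $n$ and is meant only for small checks.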
2.2. The Frenkel-Zhu construction. Frenkel and Zhu have shown how the fermionic construction of the last section can be modified to give the general construction for the tree amplitude (2.1). Again the n-point function (2.1) is written as a sum over permutations $\rho$ with no fixed point, $\rho = \xi_1 \xi_2 \ldots \xi_r$, written as a product of cycles, with which is associated $F_\rho = (-1)^r f_{\xi_1} f_{\xi_2} \ldots f_{\xi_r}$, where now
$$f_\xi = \frac{\kappa_m^{a_{i_1} a_{i_2} \ldots a_{i_m}}}{(z_{i_1} - z_{i_2})(z_{i_2} - z_{i_3}) \cdots (z_{i_m} - z_{i_1})}, \tag{2.14}$$
and the $m$-order tensors $\kappa_m$ are defined inductively by the conditions
$$\kappa_m^{a_1 a_2 a_3 \ldots a_m} - \kappa_m^{a_2 a_1 a_3 \ldots a_m} = f^{a_1 a_2}{}_b\, \kappa_{m-1}^{b a_3 \ldots a_m}, \tag{2.15}$$
and
$$\kappa_m^{a_1 a_2 a_3 \ldots a_m} = \kappa_m^{a_2 a_3 \ldots a_m a_1}. \tag{2.16}$$
The n-point tree amplitude is then constructed as in (2.13). A graphical way of describing the Frenkel-Zhu construction (or the fermionic construction) is to say that the n-point tree amplitude is given by summing over all graphs with $n$ vertices where the vertices carry the labels $1, 2, \ldots, n$, and each vertex is connected by directed lines to other vertices, one of the lines at each vertex pointing towards it and one away from it. Then each graph consists of a number of directed “loops” or cycles, $\xi = (i_1, i_2, \ldots, i_m)$, with which we associate the expression (2.14), and the expression associated with the whole graph is the product of the expressions for the various cycles multiplied by a factor of $-1$ for each cycle. As is implied by comparing (2.12) and (2.14), a solution to the conditions (2.15) and (2.16) can be constructed by setting $\kappa_m^{a_1 a_2 a_3 \ldots a_m} = \mathrm{tr}(t^{a_1} t^{a_2} \cdots t^{a_m})$ or, more generally,
$$\kappa_m^{a_1 a_2 \ldots a_m} = \mathrm{tr}(K\, t^{a_1} t^{a_2} \cdots t^{a_m}), \tag{2.17}$$
where $t^a$ is any finite-dimensional representation of $g$, i.e.
$$[t^a, t^b] = f^{ab}{}_c\, t^c, \tag{2.18}$$
and $K$ is any matrix commuting with all the $t^a$, i.e. invariant under the action of $g$. $K$ is to be chosen so that $\kappa_2^{ab} = \mathrm{tr}(K t^a t^b) = \kappa^{ab}$ as in (1.1), which can be done for any invariant tensor $\kappa^{ab}$ if $g$ is compact and $t^a$ a faithful representation. It is straightforward to verify that (2.13) has the singularity structure implied by the operator product expansion (2.3), provided that $\kappa_m$ satisfies (2.15) and (2.16), and satisfies the asymptotic condition (2.4), and thus is inductively determined by Cauchy’s Theorem, given the two-point function (2.6). Thus, it does not depend on the choice of $\kappa_m$ satisfying (2.15) and (2.16), apart from through $\kappa_2 = \kappa$. (In particular, although different choices of representation $t^a$ result in different tensors $\kappa_m$, as defined through (2.17), these differences cancel out in (2.14), apart from dependence on $\kappa_2$.) In fact, the stronger statement holds that the connected parts, that is the sums of (2.14) over permutations of $(i_1, i_2, \ldots, i_m)$, only depend on the $\kappa$’s through $\kappa_2$. This is expressed in the following proposition:

Proposition 1. If $g$ is a Lie algebra and the tensors $\kappa_m^{a_1 a_2 \ldots a_m}$, where $1 \le a_j \le \dim g$, are defined for $m \le N$, and satisfy
$$\kappa_m^{a_1 a_2 a_3 \ldots a_m} - \kappa_m^{a_2 a_1 a_3 \ldots a_m} = f^{a_1 a_2}{}_b\, \kappa_{m-1}^{b a_3 \ldots a_m}, \tag{2.19}$$
and
$$\kappa_m^{a_1 a_2 a_3 \ldots a_m} = \kappa_m^{a_2 a_3 \ldots a_m a_1}, \tag{2.20}$$
where $f^{ab}{}_c$ are the structure constants of $g$, then the tensor functions
$$A^{a_1 a_2 \ldots a_m}_{{\rm tree},C}(z_1, z_2, \ldots, z_m) = \frac{1}{m} \sum_{\rho \in S_m} f_{(\rho(1), \rho(2), \ldots, \rho(m))} \tag{2.21}$$
$$= \frac{1}{m} \sum_{\rho \in S_m} \frac{\kappa_m^{a_{\rho(1)} a_{\rho(2)} \ldots a_{\rho(m)}}}{(z_{\rho(1)} - z_{\rho(2)})(z_{\rho(2)} - z_{\rho(3)}) \cdots (z_{\rho(m)} - z_{\rho(1)})}$$
$$= \sum_{\rho \in S_{m-1}} \frac{\kappa_m^{a_{\rho(1)} a_{\rho(2)} \ldots a_{\rho(m-1)} a_m}}{(z_{\rho(1)} - z_{\rho(2)})(z_{\rho(2)} - z_{\rho(3)}) \cdots (z_{\rho(m-1)} - z_m)(z_m - z_{\rho(1)})} \tag{2.22}$$
depend on the $\kappa_m$ only through $\kappa_2$.

Proof of the Proposition. The result follows from Cauchy’s Theorem because the functions defined by (2.22) satisfy
$$A^{a_1 a_2 \ldots a_m}_{{\rm tree},C}(z_1, z_2, \ldots, z_m) = O(z_1^{-2}), \quad \text{as } z_1 \to \infty,$$
and
$$A^{a_1 a_2 a_3 \ldots a_m}_{{\rm tree},C}(z_1, z_2, \ldots, z_m) \sim \frac{f^{a_1 a_2}{}_b}{z_1 - z_2}\, A^{b a_3 \ldots a_m}_{{\rm tree},C}(z_2, \ldots, z_m) \quad \text{as } z_1 \to z_2,$$
and so can be calculated inductively from
$$A^{ab}_{{\rm tree},C}(z_1, z_2) = \frac{\kappa^{ab}}{(z_1 - z_2)^2}.$$
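The conditions (2.19) and (2.20) on the tensors $\kappa_m$ can be verified numerically for the trace solution $\kappa_m = \mathrm{tr}(t^{a_1} \cdots t^{a_m})$ of (2.17) with $K = 1$. The sketch below is only an illustration: it assumes the three-dimensional so(3) generators $t^a = iM^a$, normalised so that $\mathrm{tr}(t^a t^b) = 2\delta^{ab}$, and the helper names are hypothetical.

```python
import numpy as np

def so3_generators():
    """t^a = i M^a, M^a real antisymmetric 3x3; array of shape (3, 3, 3)."""
    eps = np.zeros((3, 3, 3))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[a, b, c] = 1.0
        eps[a, c, b] = -1.0
    return 1j * eps

def structure_constants(t):
    """f^{ab}_c read off from [t^a, t^b] = f^{ab}_c t^c, using tr(t^c t^d) = 2 delta^{cd}
    (true for the generators above; an assumption for any other input)."""
    comm = np.einsum('aij,bjk->abik', t, t) - np.einsum('bij,ajk->abik', t, t)
    return np.einsum('abik,cki->abc', comm, t) / 2.0

def trace_tensor(t, m):
    """kappa_m^{a_1...a_m} = tr(t^{a_1} ... t^{a_m})."""
    prod = t
    for _ in range(m - 1):
        prod = np.einsum('...ij,ajk->...aik', prod, t)
    return np.trace(prod, axis1=-2, axis2=-1)

def satisfies_fz_conditions(t, m):
    """Return (commutator condition holds, cyclic condition holds) for kappa_m."""
    f = structure_constants(t)
    k_m, k_prev = trace_tensor(t, m), trace_tensor(t, m - 1)
    lhs = k_m - np.swapaxes(k_m, 0, 1)                  # swap a_1 and a_2
    rhs = np.einsum('abc,c...->ab...', f, k_prev)       # f^{a1 a2}_b kappa_{m-1}^{b a3...}
    cyclic = np.allclose(k_m, np.moveaxis(k_m, 0, -1))  # one cyclic shift of the indices
    return bool(np.allclose(lhs, rhs)), bool(cyclic)
```

For example, `satisfies_fz_conditions(so3_generators(), 4)` checks both conditions for the fourth-order trace tensor built from the third-order one.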
2.3. The tensors $\kappa_n$. The conditions
$$\kappa_n^{a_1 a_2 a_3 \ldots a_n} - \kappa_n^{a_2 a_1 a_3 \ldots a_n} = f^{a_1 a_2}{}_b\, \kappa_{n-1}^{b a_3 \ldots a_n}, \tag{2.23}$$
and
$$\kappa_n^{a_1 a_2 a_3 \ldots a_n} = \kappa_n^{a_2 a_3 \ldots a_n a_1} \tag{2.24}$$
are sufficient to ensure that the amplitudes defined by (2.13) with $F_\rho = (-1)^r f_{\xi_1} f_{\xi_2} \ldots f_{\xi_r}$, where $f_\xi$ is given by (2.14), depend on the $\kappa_m$ only through $\kappa_2$. However, $\kappa_2$ does not uniquely determine $\kappa_m$ through (2.23) and (2.24). In this section, we shall discuss the existence and uniqueness of solutions to these equations. Although the arbitrariness in $\kappa_m$, given $\kappa_2$, is not relevant for tree amplitudes, we shall see in §3 that this freedom is very relevant for the construction of the one-loop amplitudes.

The conditions (2.23) and (2.24) have some immediate consequences. First, if $\kappa_{n-1}$ satisfies (2.23) for some $\kappa_n$ which also satisfies (2.24), then $\kappa_{n-1}$ is invariant because
$$\sum_{j=1}^{n} f^{b a_j}{}_c\, \kappa^{a_1 \ldots a_{j-1} c a_{j+1} \ldots a_n} = \sum_{j=1}^{n} \left(\kappa^{a_1 \ldots a_{j-1} b a_j a_{j+1} \ldots a_n} - \kappa^{a_1 \ldots a_{j-1} a_j b a_{j+1} \ldots a_n}\right) = 0,$$
using (2.23) and then (2.24). Thus for (2.23) and (2.24) to have a solution for a given $\kappa_{n-1}$ then this tensor must be invariant.

Second, if $\kappa_n^{a_1 a_2 a_3 \ldots a_n}$ and $\tilde\kappa_n^{a_1 a_2 a_3 \ldots a_n}$ both satisfy (2.23) with the same $\kappa_{n-1}^{a_1 a_2 a_3 \ldots a_{n-1}}$ and both satisfy the cyclic property (2.24), then the difference $\omega_n^{a_1 a_2 a_3 \ldots a_n} = \tilde\kappa_n^{a_1 a_2 a_3 \ldots a_n} - \kappa_n^{a_1 a_2 a_3 \ldots a_n}$ is cyclically symmetric and satisfies $\omega_n^{a_1 a_2 a_3 \ldots a_n} = \omega_n^{a_2 a_1 a_3 \ldots a_n}$. These two symmetries generate the whole of $S_n$ so that $\omega_n$ must be a symmetric tensor. Conversely, if $\omega_n$ is symmetric, it follows that $\tilde\kappa_n = \kappa_n + \omega_n$ satisfies (2.23) and (2.24) if $\kappa_n$ does. So $\kappa_{n-1}$ defines $\kappa_n$ through (2.23) and (2.24), assuming a solution exists, up to a symmetric tensor $\omega_n$. We establish the existence of the solution in the following proposition:

Proposition 2. If $g$ is a Lie algebra, define inductively the spaces $\mathcal{K}_n$ to consist of the invariant $n$-th order tensors $\kappa_n^{a_1 a_2 \ldots a_n}$, where $1 \le a_j \le \dim g$, such that
$$\kappa_n^{a_1 a_2 a_3 \ldots a_n} - \kappa_n^{a_2 a_1 a_3 \ldots a_n} = f^{a_1 a_2}{}_b\, \kappa_{n-1}^{b a_3 \ldots a_n}, \tag{2.25}$$
for some $\kappa_{n-1} \in \mathcal{K}_{n-1}$, where $f^{ab}{}_c$ are the structure constants of $g$, and
$$\kappa_n^{a_1 a_2 a_3 \ldots a_n} = \kappa_n^{a_2 a_3 \ldots a_n a_1}, \tag{2.26}$$
with $\mathcal{K}_0 = \{0\}$. Then, for each $\kappa_{n-1} \in \mathcal{K}_{n-1}$, there exists a $\kappa_n$ satisfying (2.25) and (2.26) that is unique up to the addition of a symmetric invariant tensor $\omega_n$. The solution can be uniquely specified by requiring that it be orthogonal to all symmetric $n$-th order tensors.
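The freedom described in Proposition 2 can be illustrated numerically: subtracting the total symmetrization from a solution leaves (2.25) and (2.26) intact and yields a tensor orthogonal to every symmetric tensor. The sketch below assumes, purely for illustration, the three-dimensional so(3) generators and the trace tensor $\kappa_4 = \mathrm{tr}(t^{a_1} t^{a_2} t^{a_3} t^{a_4})$; the helper names are hypothetical.

```python
import itertools
import numpy as np

def so3_generators():
    """t^a = i M^a, M^a real antisymmetric 3x3; array of shape (3, 3, 3)."""
    eps = np.zeros((3, 3, 3))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[a, b, c] = 1.0
        eps[a, c, b] = -1.0
    return 1j * eps

def trace_tensor(t, m):
    """kappa_m^{a_1...a_m} = tr(t^{a_1} ... t^{a_m})."""
    prod = t
    for _ in range(m - 1):
        prod = np.einsum('...ij,ajk->...aik', prod, t)
    return np.trace(prod, axis1=-2, axis2=-1)

def symmetrize(tau):
    """Average of tau over all permutations of its indices."""
    perms = list(itertools.permutations(range(tau.ndim)))
    return sum(np.transpose(tau, p) for p in perms) / len(perms)

def remove_symmetric_part(kappa):
    """kappa minus its symmetrization: the representative orthogonal to symmetric tensors."""
    return kappa - symmetrize(kappa)
```

With `k4 = trace_tensor(so3_generators(), 4)`, the tensor `remove_symmetric_part(k4)` has the same antisymmetric combination in its first two indices as `k4`, remains cyclically symmetric, and has vanishing symmetrization, while `symmetrize(k4)` itself is non-zero.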
Proof of the Proposition. We define the action of $\rho \in S_n$ on $n$-th order tensors $\tau_n$ by
$$(\rho \tau_n)^{a_1 a_2 \ldots a_n} = \tau_n^{a_{\rho^{-1}(1)} a_{\rho^{-1}(2)} \ldots a_{\rho^{-1}(n)}}$$
so that this provides a representation of $S_n$ on $n$-th order tensors: $(\rho\sigma)\tau_n = \rho(\sigma\tau_n)$. For any $n$-th order tensor $\tau_n$ write
$$\eta(\rho, \tau_n) = \tau_n - \rho\tau_n; \tag{2.27}$$
then we can write
$$\tau_n = \frac{1}{n!} \sum_{\rho \in S_n} \eta(\rho, \tau_n) + \omega_n, \tag{2.28}$$
where
$$\omega_n = \frac{1}{n!} \sum_{\rho \in S_n} \rho\tau_n, \tag{2.29}$$
is the symmetrization of the tensor $\tau_n$. If $\tau_n$ is invariant, $\eta(\rho, \tau_n)$ is also invariant. Then, if $\kappa_n \in \mathcal{K}_n$,
$$\frac{1}{n!} \sum_{\rho \in S_n} \eta(\rho, \kappa_n) \tag{2.30}$$
is also in $\mathcal{K}_n$ and satisfies (2.25) for the same $\kappa_{n-1} \in \mathcal{K}_{n-1}$; further, it is orthogonal to any symmetric tensor. It is clear from (2.28) that, taking $\tau_n = \kappa_n \in \mathcal{K}_n$, $\kappa_n$ is orthogonal to all symmetric tensors only if $\omega_n$, defined as in (2.29), vanishes. Thus, (2.30) is the unique solution to (2.25) and (2.26) for the given $\kappa_{n-1}$, with this property. We now proceed to use the expression (2.30) to show there exists a solution to (2.25) and (2.26) for a given $\kappa_{n-1} \in \mathcal{K}_{n-1}$. From (2.27),
$$\eta(\rho_1\rho_2, \kappa_n) = \kappa_n - \rho_1\rho_2\kappa_n = \eta(\rho_1, \kappa_n) + \rho_1 \eta(\rho_2, \kappa_n) \tag{2.31}$$
and, so,
$$\eta(\rho_1 \ldots \rho_k, \kappa_n) = \sum_{j=1}^{k} \rho_1 \ldots \rho_{j-1}\, \eta(\rho_j, \kappa_n). \tag{2.32}$$
We can use this to give a formula for a given $\kappa_n$, in terms of $\kappa_{n-1} \in \mathcal{K}_{n-1}$, by expressing each $\rho \in S_n$ as a product of transpositions of adjacent indices and then using (2.25) and (2.26). However, such expressions are not unique, so we need to address this by first working in the free group, $\tilde{S}_n$, generated by these transpositions, defining a function $\tilde\varphi : \tilde{S}_n \to \mathcal{T}_n$, the space of $n$-th order invariant tensors, for each $\kappa_{n-1} \in \mathcal{K}_{n-1}$, and then checking that we can impose the appropriate relations to obtain a definition for $\rho \in S_n$. In this way, we will obtain an $n$-th order tensor $\varphi(\rho, \kappa_{n-1})$, $\rho \in S_n$, $\kappa_{n-1} \in \mathcal{K}_{n-1}$, which will provide the desired element $\varphi(\kappa_{n-1}) \in \mathcal{K}_n$ on averaging over $\rho \in S_n$. To show this, we finally show that $\varphi(\kappa_{n-1})$ satisfies (2.25) and (2.26). $S_n$ is generated by transpositions $\{\sigma_i : 1 \le i \le n-1\}$, where
$$\sigma_i(i) = i + 1, \qquad \sigma_i(i+1) = i, \qquad \sigma_i(j) = j, \quad j \ne i, i+1, \tag{2.33}$$
which satisfy the relations σi2 = 1,
(σi σi+1 )3 = 1,
(σi σ j )2 = 1, |i − j| > 1.
(2.34)
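The relations (2.34) and the statement that the adjacent transpositions generate $S_n$ can be confirmed by direct enumeration for small $n$. The following is a minimal sketch; the function names are illustrative and permutations are stored as 0-based image tuples.

```python
def adjacent_transposition(i, n):
    """sigma_i, swapping i and i+1 (1-based labels), as a 0-based image tuple."""
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)


def compose(p, q):
    """Right-to-left composition: (p q)(k) = p(q(k))."""
    return tuple(p[q[k]] for k in range(len(p)))


def power(p, m):
    result = tuple(range(len(p)))
    for _ in range(m):
        result = compose(result, p)
    return result


def relations_hold(n):
    """Check the three families of relations in (2.34) for S_n."""
    e = tuple(range(n))
    s = {i: adjacent_transposition(i, n) for i in range(1, n)}
    squares = all(power(s[i], 2) == e for i in s)
    braids = all(power(compose(s[i], s[i + 1]), 3) == e for i in range(1, n - 1))
    commuting = all(power(compose(s[i], s[j]), 2) == e
                    for i in s for j in s if abs(i - j) > 1)
    return squares and braids and commuting


def generated_group_size(n):
    """Number of permutations reachable as products of sigma_1..sigma_{n-1}."""
    gens = [adjacent_transposition(i, n) for i in range(1, n)]
    seen = {tuple(range(n))}
    frontier = list(seen)
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = compose(s, g)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return len(seen)


print(relations_hold(5), generated_group_size(4))
```

For $n=4$ the closure contains all $4! = 24$ permutations, consistent with the $n-1$ generators $\sigma_1,\ldots,\sigma_{n-1}$ sufficing.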
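Let $\tilde S_n$ be the free group on the generators $\{\tilde\sigma_i : 1\le i\le n-1\}$ and $W_n$ the smallest normal subgroup of $\tilde S_n$ containing $\{\tilde\sigma_i^2,\ 1\le i\le n-1;\ (\tilde\sigma_i\tilde\sigma_{i+1})^3,\ 1\le i\le n-2;\ (\tilde\sigma_i\tilde\sigma_j)^2,\ |i-j|>1\}$. Then $\tilde S_n/W_n\cong S_n$ with $\tilde\sigma_i\mapsto\sigma_i$ defining an homomorphism $\tilde S_n\to S_n$, which we shall denote by $\tilde\varrho\mapsto\varrho$. (See [24], p. 63.) Each $\tilde\varrho\in\tilde S_n$ can be written $\tilde\varrho = \tilde\sigma_{i_1}\tilde\sigma_{i_2}\cdots\tilde\sigma_{i_k}$, where $1\le|i_j|\le n-1$ and $\tilde\sigma_i^{-1} = \tilde\sigma_{-i}$. We can define a function $\tilde\phi:\tilde S_n\to K_n$ in terms of $\tilde\phi(\tilde\sigma_i)$, $1\le i\le n-1$, with $\tilde\phi(\tilde\sigma_i^{-1}) = \tilde\phi(\tilde\sigma_i)$ and $\tilde\phi(1) = 0$, by
$$\tilde\phi(\tilde\sigma_{i_1}\cdots\tilde\sigma_{i_k}) = \sum_{j=1}^{k}\sigma_{i_1}\cdots\sigma_{i_{j-1}}\,\tilde\phi(\tilde\sigma_{i_j}). \tag{2.35}$$
Then
$$\tilde\phi(\tilde\varrho_1\tilde\varrho_2) = \varrho_1\,\tilde\phi(\tilde\varrho_2) + \tilde\phi(\tilde\varrho_1). \tag{2.36}$$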
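We now show that $\operatorname{Ker}\tilde\phi\cap W_n$ is a normal subgroup of $\tilde S_n$. Suppose $\tilde\varrho\in\operatorname{Ker}\tilde\phi\cap W_n$, so that $\varrho = 1\in S_n$ and $\tilde\phi(\tilde\varrho) = 0$. Then
$$\tilde\phi(\tilde\varrho_1\tilde\varrho\tilde\varrho_1^{-1}) = \varrho_1\tilde\phi(\tilde\varrho\tilde\varrho_1^{-1}) + \tilde\phi(\tilde\varrho_1) = \varrho_1\tilde\phi(\tilde\varrho) + \varrho_1\tilde\phi(\tilde\varrho_1^{-1}) + \tilde\phi(\tilde\varrho_1) = \tilde\phi(1) = 0,$$
so that $\tilde\varrho_1\tilde\varrho\tilde\varrho_1^{-1}\in\operatorname{Ker}\tilde\phi\cap W_n$ and this is a normal subgroup of $\tilde S_n$. So if we can show that
$$\{\tilde\sigma_i^2,\ 1\le i\le n-1;\ (\tilde\sigma_i\tilde\sigma_{i+1})^3,\ 1\le i\le n-2;\ (\tilde\sigma_i\tilde\sigma_j)^2,\ |i-j|>1\}\subset\operatorname{Ker}\tilde\phi, \tag{2.37}$$
we must have $\operatorname{Ker}\tilde\phi\cap W_n = W_n$, i.e. $W_n\subset\operatorname{Ker}\tilde\phi$, because $W_n$ is the smallest normal subgroup containing these elements. Then $\tilde\phi$ induces a map $\phi:S_n\to K_n$ with $\phi(\varrho) = \tilde\phi(\tilde\varrho)$, because $\tilde\phi(\tilde\varrho\tilde w) = \tilde\phi(\tilde\varrho) + \varrho\,\tilde\phi(\tilde w) = \tilde\phi(\tilde\varrho)$ if $\tilde w\in W_n$. Next we show that (2.37) holds if we define
$$\tilde\phi(\tilde\sigma_i,\kappa_{n-1}) = f^{\alpha_i\alpha_{i+1}}{}_{\beta}\,\kappa_{n-1}^{\alpha_1\ldots\alpha_{i-1}\beta\alpha_{i+2}\ldots\alpha_n}\in T_n, \tag{2.38}$$
for $\kappa_{n-1}\in K_{n-1}$. We will write out the argument for $\tilde\sigma_1^2$, $(\tilde\sigma_1\tilde\sigma_2)^3$, $(\tilde\sigma_1\tilde\sigma_3)^2$ and the arguments for other values of $i, j$ follow by similar arguments. First, writing $\kappa = \kappa_{n-1}$,
$$\tilde\phi(\tilde\sigma_1^2,\kappa) = \tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\tilde\phi(\tilde\sigma_1,\kappa),$$
implying
$$\tilde\phi(\tilde\sigma_1^2,\kappa)^{\alpha_1\alpha_2\alpha_3\ldots\alpha_n} = f^{\alpha_1\alpha_2}{}_{\beta}\,\kappa^{\beta\alpha_3\ldots\alpha_n} + f^{\alpha_2\alpha_1}{}_{\beta}\,\kappa^{\beta\alpha_3\ldots\alpha_n} = 0.$$
Second,
$$\begin{aligned}\tilde\phi((\tilde\sigma_1\tilde\sigma_3)^2,\kappa) &= \tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\tilde\phi(\tilde\sigma_3,\kappa) + \sigma_1\sigma_3\tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\sigma_3\sigma_1\tilde\phi(\tilde\sigma_3,\kappa)\\ &= \tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\tilde\phi(\tilde\sigma_3,\kappa) + \sigma_1\sigma_3\tilde\phi(\tilde\sigma_1,\kappa) + \sigma_3\tilde\phi(\tilde\sigma_3,\kappa)\end{aligned} \tag{2.39}$$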
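implying
$$\begin{aligned}\tilde\phi((\tilde\sigma_1\tilde\sigma_3)^2,\kappa)^{\alpha_1\alpha_2\alpha_3\alpha_4\ldots\alpha_n} &= f^{\alpha_1\alpha_2}{}_{\beta}\,\kappa^{\beta\alpha_3\alpha_4\ldots\alpha_n} + f^{\alpha_2\alpha_1}{}_{\beta}\,\kappa^{\beta\alpha_4\alpha_3\ldots\alpha_n} + f^{\alpha_3\alpha_4}{}_{\gamma}\,\kappa^{\alpha_2\alpha_1\gamma\ldots\alpha_n} + f^{\alpha_4\alpha_3}{}_{\gamma}\,\kappa^{\alpha_1\alpha_2\gamma\ldots\alpha_n}\\ &= f^{\alpha_1\alpha_2}{}_{\beta}f^{\alpha_3\alpha_4}{}_{\gamma}\,\kappa^{\beta\gamma\alpha_5\ldots\alpha_n} + f^{\alpha_2\alpha_1}{}_{\beta}f^{\alpha_3\alpha_4}{}_{\gamma}\,\kappa^{\beta\gamma\alpha_5\ldots\alpha_n} = 0.\end{aligned} \tag{2.40}$$
Third,
$$\begin{aligned}\tilde\phi((\tilde\sigma_1\tilde\sigma_2)^3,\kappa) &= \tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\tilde\phi(\tilde\sigma_2,\kappa) + \sigma_1\sigma_2\tilde\phi(\tilde\sigma_1,\kappa) + \sigma_1\sigma_2\sigma_1\tilde\phi(\tilde\sigma_2,\kappa) + \sigma_2\sigma_1\tilde\phi(\tilde\sigma_1,\kappa) + \sigma_2\tilde\phi(\tilde\sigma_2,\kappa)\\ &= f^{\alpha_1\alpha_2}{}_{\beta}\,\kappa^{\beta\alpha_3\alpha_4\ldots\alpha_n} + f^{\alpha_1\alpha_3}{}_{\beta}\,\kappa^{\alpha_2\beta\alpha_4\ldots\alpha_n} + f^{\alpha_2\alpha_3}{}_{\beta}\,\kappa^{\beta\alpha_1\alpha_4\ldots\alpha_n}\\ &\quad + f^{\alpha_2\alpha_1}{}_{\beta}\,\kappa^{\alpha_3\beta\alpha_4\ldots\alpha_n} + f^{\alpha_3\alpha_1}{}_{\beta}\,\kappa^{\beta\alpha_2\alpha_4\ldots\alpha_n} + f^{\alpha_3\alpha_2}{}_{\beta}\,\kappa^{\alpha_1\beta\alpha_4\ldots\alpha_n}\\ &= f^{\alpha_1\alpha_2}{}_{\beta}f^{\beta\alpha_3}{}_{\gamma}\,\kappa^{\gamma\alpha_4\ldots\alpha_n} + f^{\alpha_1\alpha_3}{}_{\beta}f^{\alpha_2\beta}{}_{\gamma}\,\kappa^{\gamma\alpha_4\ldots\alpha_n} + f^{\alpha_2\alpha_3}{}_{\beta}f^{\beta\alpha_1}{}_{\gamma}\,\kappa^{\gamma\alpha_4\ldots\alpha_n} = 0\end{aligned} \tag{2.41}$$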
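by the Jacobi identity. This establishes (2.37). Thus, for each $\kappa_{n-1}\in K_{n-1}$, we can define $\phi(\varrho,\kappa_{n-1}) = \tilde\phi(\tilde\varrho,\kappa_{n-1})$ and define $\phi:K_{n-1}\to K_n$ by
$$\phi(\kappa_{n-1}) = \frac{1}{n!}\sum_{\varrho\in S_n}\phi(\varrho,\kappa_{n-1}). \tag{2.42}$$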
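We shall now show that $\phi(\kappa_{n-1})$ satisfies (2.25) and (2.26). From (2.36),
$$\phi(\varrho_1\varrho_2,\kappa_{n-1}) = \varrho_1\phi(\varrho_2,\kappa_{n-1}) + \phi(\varrho_1,\kappa_{n-1}). \tag{2.43}$$
Then, for any $\sigma\in S_n$,
$$\begin{aligned}\phi(\kappa_{n-1}) - \sigma\phi(\kappa_{n-1}) &= \frac{1}{n!}\sum_{\varrho\in S_n}\phi(\varrho,\kappa_{n-1}) - \frac{1}{n!}\sum_{\varrho\in S_n}\sigma\phi(\varrho,\kappa_{n-1})\\ &= \frac{1}{n!}\sum_{\varrho\in S_n}\phi(\varrho,\kappa_{n-1}) - \frac{1}{n!}\sum_{\varrho\in S_n}\phi(\sigma\varrho,\kappa_{n-1}) + \frac{1}{n!}\sum_{\varrho\in S_n}\phi(\sigma,\kappa_{n-1})\\ &= \phi(\sigma,\kappa_{n-1}).\end{aligned} \tag{2.44}$$
Taking $\sigma = \sigma_1$ in (2.44), we have
$$\phi(\kappa_{n-1})^{\alpha_1\alpha_2\alpha_3\ldots\alpha_n} - \phi(\kappa_{n-1})^{\alpha_2\alpha_1\alpha_3\ldots\alpha_n} = f^{\alpha_1\alpha_2}{}_{\beta}\,\kappa_{n-1}^{\beta\alpha_3\ldots\alpha_n}$$
so that (2.25) holds.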
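If $\sigma = \sigma_1\sigma_2\cdots\sigma_{n-1}$, then $\sigma(j) = j+1$, $1\le j\le n-1$, and $\sigma(n) = 1$, i.e. $\sigma$ is a cyclic permutation of $(1,2,\ldots,n)$, so that
$$\begin{aligned}\phi(\sigma,\kappa_{n-1}) &= \sum_{j=1}^{n-1}\sigma_1\cdots\sigma_{j-1}\,\phi(\sigma_j,\kappa_{n-1})\\ &= \sum_{j=1}^{n-1}\sigma_1\cdots\sigma_{j-1}\,f^{\alpha_j\alpha_{j+1}}{}_{\beta}\,\kappa_{n-1}^{\alpha_1\ldots\alpha_{j-1}\beta\alpha_{j+2}\ldots\alpha_n}\\ &= \sum_{j=1}^{n-1}f^{\alpha_1\alpha_{j+1}}{}_{\beta}\,\kappa_{n-1}^{\alpha_2\ldots\alpha_j\beta\alpha_{j+2}\ldots\alpha_n} = 0\end{aligned} \tag{2.45}$$
as $\kappa_{n-1}\in K_{n-1}$ is invariant. Thus putting $\sigma = \sigma_1\sigma_2\cdots\sigma_{n-1}$ in (2.44) gives $\phi(\kappa_{n-1})^{\alpha_1\alpha_2\ldots\alpha_n} = \phi(\kappa_{n-1})^{\alpha_n\alpha_1\ldots\alpha_{n-1}}$, so that (2.26) holds. By averaging (2.44) over $\sigma\in S_n$, we see that
$$\sum_{\sigma\in S_n}\sigma\phi(\kappa_{n-1}) = 0,$$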
so that it is orthogonal to all symmetric tensors and so the unique solution to (2.25) and (2.26) with this property. This completes the proof of Proposition 2. Note that Proposition 2 implies that $K_n/S_n\cong K_{n-1}$, where $S_n\subset K_n$ is the space of symmetric invariant tensors. As particular instances, we have that if $\kappa^{ab}$ is an invariant symmetric tensor, the general solution for (2.25) and (2.26) for $n = 3$ is
$$\kappa_3^{abc} = \tfrac12 f^{ab}{}_{e}\,\kappa^{ec} + \omega^{abc}, \tag{2.46}$$
where $\omega^{abc}$ is symmetric, and invariant for $\kappa_3$ to be invariant. In this case, the general solution to (2.25) and (2.26) for $n = 4$ is
$$\kappa_4^{abcd} = \tfrac16 f^{ab}{}_{e}f^{cd}{}_{g}\,\kappa^{eg} + \tfrac16 f^{da}{}_{e}f^{bc}{}_{g}\,\kappa^{eg} + \tfrac12 f^{ab}{}_{e}\,\omega^{ecd} + \tfrac12 f^{bc}{}_{e}\,\omega^{ead} + \tfrac12 f^{ac}{}_{e}\,\omega^{ebd} + \omega^{abcd}, \tag{2.47}$$
where $\omega^{abcd}$ is symmetric. While Proposition 2 proves the existence of a solution to (2.25) and (2.26) for a given $\kappa_{n-1}\in K_{n-1}$, it does not provide an explicit expression for such a solution unless we have a method of specifying expressions for each element $\varrho\in S_n$ as a product $\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_k}$ of transpositions. In Appendix A, we derive an explicit expression for $\kappa_n$ in terms of $\kappa_{n-1}$ using the representation theory of $S_n$ and Young tableaux.
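As an illustration, the expressions (2.46) and (2.47) can be tested in the simplest case. The sketch below assumes $g = su(2)$ in an orthonormal basis, so that $f^{abc}$ is the Levi-Civita symbol and indices are raised and lowered with $\delta^{ab}$, takes $\kappa^{ab} = \delta^{ab}$, and sets the symmetric tensors $\omega^{abc}$ and $\omega^{abcd}$ to zero; these choices are assumptions made for the example only. It checks the antisymmetrization condition, in the form $\kappa_n^{a_1a_2\ldots} - \kappa_n^{a_2a_1\ldots} = f^{a_1a_2}{}_b\,\kappa_{n-1}^{b\ldots}$, and the cyclic property.

```python
from fractions import Fraction
from itertools import product

R3 = range(3)


def eps(a, b, c):
    """Levi-Civita symbol: su(2) structure constants in an orthonormal basis."""
    return (a - b) * (b - c) * (c - a) // 2


def kappa2(a, b):
    return 1 if a == b else 0  # assumed kappa^{ab} = delta^{ab}


def kappa3(a, b, c):
    """(2.46) with omega^{abc} = 0."""
    return Fraction(1, 2) * sum(eps(a, b, e) * kappa2(e, c) for e in R3)


def kappa4(a, b, c, d):
    """(2.47) with omega^{abc} = 0 and omega^{abcd} = 0."""
    return Fraction(1, 6) * sum(
        (eps(a, b, e) * eps(c, d, g) + eps(d, a, e) * eps(b, c, g)) * kappa2(e, g)
        for e in R3 for g in R3
    )


def check_n3():
    antisym = all(
        kappa3(a, b, c) - kappa3(b, a, c) == sum(eps(a, b, e) * kappa2(e, c) for e in R3)
        for a, b, c in product(R3, repeat=3)
    )
    cyclic = all(kappa3(a, b, c) == kappa3(b, c, a) for a, b, c in product(R3, repeat=3))
    return antisym and cyclic


def check_n4():
    antisym = all(
        kappa4(a, b, c, d) - kappa4(b, a, c, d) == sum(eps(a, b, e) * kappa3(e, c, d) for e in R3)
        for a, b, c, d in product(R3, repeat=4)
    )
    cyclic = all(kappa4(a, b, c, d) == kappa4(b, c, d, a)
                 for a, b, c, d in product(R3, repeat=4))
    return antisym and cyclic


print(check_n3(), check_n4())
```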
3. Current Algebra on the Torus
3.1. Fermionic loop constructions. We consider the loop amplitude
$$\mathcal{A}_{\rm loop}^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau) = \mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)(2\pi i)^n\prod_{j=1}^{n}\rho_j, \tag{3.1}$$
where
$$\rho_j = e^{2\pi i\nu_j}, \qquad w = e^{2\pi i\tau}, \tag{3.2}$$
and begin by reviewing the explicit expressions for this amplitude when $J^a(\rho)$ is given as a bilinear in fermionic fields. First, we take $J^a(\rho)$ to be given in terms of Neveu-Schwarz fields by (2.8). Defining the partition function
$$\chi_{NS}(\tau) = \prod_{r=\frac12}^{\infty}(1+w^r)^N, \tag{3.3}$$
we can write
$$\mathrm{tr}\left(b^i(\rho_1)b^j(\rho_2)\,w^{L_0}\right) = \frac{1}{2\pi i(\rho_1\rho_2)^{\frac12}}\,\chi_{NS}(\nu_1-\nu_2,\tau)\,\chi_{NS}(\tau)\,\delta^{ij}, \tag{3.4}$$
where
$$\chi_{NS}(\nu,\tau) = 2\pi i\sum_{r=\frac12}^{\infty}\frac{e^{-2\pi ir\nu} + w^re^{2\pi ir\nu}}{1+w^r} = \frac{\theta_1'(0,\tau)\theta_3(\nu,\tau)}{\theta_3(0,\tau)\theta_1(\nu,\tau)}\sim\frac{1}{\nu}\quad\text{as }\nu\to0. \tag{3.5}$$
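With $J^a(\rho)$ given by (2.8), the two-point function is
$$\mathrm{tr}\left(J^a(\rho_1)J^b(\rho_2)\,w^{L_0}\right) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\chi_{NS}(\nu_1-\nu_2,\tau)^2\chi_{NS}(\tau) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\mathcal{P}_{NS}(\nu_1-\nu_2,\tau)\chi_{NS}(\tau), \tag{3.6}$$
where $\kappa^{ab} = -\frac12\mathrm{tr}(M^aM^b) = \frac12\mathrm{tr}(t^at^b)$, and
$$\begin{aligned}\mathcal{P}_{NS}(\nu,\tau) &= \frac{\theta_1'(0,\tau)^2\theta_3(\nu,\tau)^2}{\theta_3(0,\tau)^2\theta_1(\nu,\tau)^2}\sim\frac{1}{\nu^2}\quad\text{as }\nu\to0\\ &= \frac{\theta_3''(0,\tau)}{\theta_3(0,\tau)} - \left(\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)}\right)'\\ &= \frac{\theta_3''(0,\tau)}{\theta_3(0,\tau)} + 2\eta(\tau) + \mathcal{P}(\nu,\tau).\end{aligned} \tag{3.7}$$
Here the Weierstrass $\mathcal{P}$ function,
$$\mathcal{P}(\nu,\tau) = -\left(\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)}\right)' - 2\eta(\tau), \tag{3.8}$$
with
$$\eta(\tau) = -\frac16\,\frac{\theta_1'''(0,\tau)}{\theta_1'(0,\tau)}. \tag{3.9}$$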
(See [25], p. 361.) The general prescription for the $n$-point loop amplitude (3.1), with $J^a(\rho)$ given by (2.8), is given by a modification of the Frenkel-Zhu construction of §2.2, by writing (3.1) as a sum over permutations $\rho\in S_n$ with no fixed point. If $\rho = \xi_1\xi_2\cdots\xi_r$, a product of disjoint cycles, we associate to $\rho$ a product
$$F_\rho^{NS} = (-1)^r f_{\xi_1}^{NS}f_{\xi_2}^{NS}\cdots f_{\xi_r}^{NS}\,\chi_{NS}(\tau), \tag{3.10}$$
where the function $f_\xi^{NS}$ associated with the cycle $\xi = (i_1, i_2\ldots i_m)$ is defined by
$$f_\xi^{NS} = \kappa^{a_{i_1}a_{i_2}\ldots a_{i_m}}\,\chi_{NS}(\nu_{i_1}-\nu_{i_2},\tau)\chi_{NS}(\nu_{i_2}-\nu_{i_3},\tau)\cdots\chi_{NS}(\nu_{i_m}-\nu_{i_1},\tau), \tag{3.11}$$
$$\kappa^{a_1a_2\ldots a_n} = \tfrac12\mathrm{tr}(t^{a_1}t^{a_2}\cdots t^{a_n}) = \tfrac12 i^n\,\mathrm{tr}(M^{a_1}M^{a_2}\cdots M^{a_n}). \tag{3.12}$$
The $n$-point loop amplitude is then constructed as the sum of these products over the permutations $\rho\in S_n'$, the subset of $S_n$ with no fixed points,
$$\mathcal{A}_{\rm loop}^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau) = \sum_{\rho\in S_n'}F_\rho^{NS\,a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau). \tag{3.13}$$
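To make the combinatorics of (3.10) and (3.13) concrete, the permutations with no fixed point can be enumerated together with their cycle decompositions and the sign $(-1)^r$. This is only a sketch of the bookkeeping; the function names are illustrative and the functions $f_\xi^{NS}$ themselves are not evaluated.

```python
from itertools import permutations


def cycles(p):
    """Disjoint cycles of a permutation given as a 0-based image tuple."""
    seen, out = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cyc, k = [], start
        while k not in seen:
            seen.add(k)
            cyc.append(k)
            k = p[k]
        out.append(tuple(cyc))
    return out


def fixed_point_free(n):
    """All permutations of n labels with no fixed point."""
    return [p for p in permutations(range(n)) if all(p[i] != i for i in range(n))]


def sign_and_cycles(p):
    """The sign (-1)^r of (3.10) and the cycles, relabelled 1..n."""
    cs = cycles(p)
    return (-1) ** len(cs), [tuple(i + 1 for i in c) for c in cs]


for perm in fixed_point_free(4):
    print(sign_and_cycles(perm))
```

For $n = 4$ this lists six single four-cycles with sign $-1$ and three products of two two-cycles with sign $+1$.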
Again this construction can be described graphically by summing over all graphs with $n$ vertices where the vertices carry the labels $1, 2, \ldots, n$, and each vertex is connected by directed lines to other vertices, one of the lines at each vertex pointing towards it and one away from it. An expression (3.11) is associated with each cycle, together with a factor of $-1$, and the product of these cycle expressions is associated with the whole graph. For example, this gives as the expression for the three-point loop
$$\mathrm{tr}\left(J^a(\rho_1)J^b(\rho_2)J^c(\rho_3)\,w^{L_0}\right) = \frac{-ik}{8\pi^3\rho_1\rho_2\rho_3}\,f^{abc}\,\chi_{NS}(\tau)\,\chi_{NS}(\nu_1-\nu_2,\tau)\,\chi_{NS}(\nu_2-\nu_3,\tau)\,\chi_{NS}(\nu_3-\nu_1,\tau) \tag{3.14}$$
if $\mathrm{tr}(M^aM^b) = -2k\delta^{ab}$, so that $\kappa^{ab} = k\delta^{ab}$, and $\delta^{ab}$ is used to raise and lower indices. We can modify the above to give a second fermionic construction by defining the partition function
$$\chi_{NS}^{-}(\tau) = \prod_{r=\frac12}^{\infty}(1-w^r)^N. \tag{3.15}$$
We can write
$$\mathrm{tr}\left(b^i(\rho_1)b^j(\rho_2)\,w^{L_0}(-1)^{N_b}\right) = \frac{1}{2\pi i(\rho_1\rho_2)^{\frac12}}\,\chi_{NS}^{-}(\nu_1-\nu_2,\tau)\,\chi_{NS}^{-}(\tau)\,\delta^{ij}, \tag{3.16}$$
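where
$$\chi_{NS}^{-}(\nu,\tau) = 2\pi i\sum_{r=\frac12}^{\infty}\frac{e^{-2\pi ir\nu} - w^re^{2\pi ir\nu}}{1-w^r} = \frac{\theta_1'(0,\tau)\theta_4(\nu,\tau)}{\theta_4(0,\tau)\theta_1(\nu,\tau)}\sim\frac{1}{\nu}\quad\text{as }\nu\to0; \tag{3.17}$$
and, if we replace $\chi_{NS}(\nu,\tau)$ by $\chi_{NS}^{-}(\nu,\tau)$ in (3.11), with $\chi_{NS}^{-}(\tau)$ replacing $\chi_{NS}(\tau)$ in (3.10), the above construction for the loop amplitudes gives
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}(-1)^{N_b}\right), \tag{3.18}$$
where $N_b = \sum_{r>0}b_{-r}b_r$. In particular, the two-point function is
$$\mathrm{tr}\left(J^a(\rho_1)J^b(\rho_2)\,w^{L_0}(-1)^{N_b}\right) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\chi_{NS}^{-}(\nu_1-\nu_2,\tau)^2\chi_{NS}^{-}(\tau) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\mathcal{P}_{NS}^{-}(\nu_1-\nu_2,\tau)\chi_{NS}^{-}(\tau), \tag{3.19}$$
where
$$\begin{aligned}\mathcal{P}_{NS}^{-}(\nu,\tau) &= \frac{\theta_1'(0,\tau)^2\theta_4(\nu,\tau)^2}{\theta_4(0,\tau)^2\theta_1(\nu,\tau)^2}\\ &= \frac{\theta_4''(0,\tau)}{\theta_4(0,\tau)} - \left(\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)}\right)'\\ &= \frac{\theta_4''(0,\tau)}{\theta_4(0,\tau)} + 2\eta(\tau) + \mathcal{P}(\nu,\tau).\end{aligned} \tag{3.20}$$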
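A third fermionic construction is given by using the Ramond operators,
$$d^i(\rho) = \sum_{m\in\mathbb{Z}}d^i_m\rho^{-m+\frac12}, \qquad \{d^i_m, d^j_n\} = \delta_{m,-n}\delta^{ij}, \qquad d^i_m|0\rangle = 0,\ m>0, \qquad 1\le i,j\le N. \tag{3.21}$$
Defining the Ramond partition function
$$\chi_R(\tau) = \prod_{n=1}^{\infty}(1+w^n)^N, \tag{3.22}$$
we can write
$$\mathrm{tr}\left(d^i(\rho_1)d^j(\rho_2)\,w^{L_0}\right) = \frac{1}{2\pi i(\rho_1\rho_2)^{\frac12}}\,\chi_R(\nu_1-\nu_2,\tau)\,\chi_R(\tau)\,\delta^{ij},$$
where
$$\chi_R(\nu,\tau) = \pi i + 2\pi i\sum_{m=1}^{\infty}\frac{e^{-2\pi im\nu} + w^me^{2\pi im\nu}}{1+w^m} = \frac{\theta_1'(0,\tau)\theta_2(\nu,\tau)}{\theta_2(0,\tau)\theta_1(\nu,\tau)}\sim\frac{1}{\nu}\quad\text{as }\nu\to0. \tag{3.23}$$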
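If we now replace $\chi_{NS}(\nu,\tau)$ by $\chi_R(\nu,\tau)$ in (3.11), with $\chi_R(\tau)$ replacing $\chi_{NS}(\tau)$ in (3.10), the construction for the loop amplitudes gives
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right), \tag{3.24}$$
where now, instead of (2.8),
$$J^a(\rho) = \frac{i}{2}M^a_{ij}\,d^i(\rho)d^j(\rho). \tag{3.25}$$
The two-point function is now
$$\mathrm{tr}\left(J^a(\rho_1)J^b(\rho_2)\,w^{L_0}\right) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\chi_R(\nu_1-\nu_2,\tau)^2\chi_R(\tau) = -\frac{\kappa^{ab}}{4\pi^2\rho_1\rho_2}\,\mathcal{P}_R(\nu_1-\nu_2,\tau)\chi_R(\tau) \tag{3.26}$$
where
$$\begin{aligned}\mathcal{P}_R(\nu,\tau) &= \frac{\theta_1'(0,\tau)^2\theta_2(\nu,\tau)^2}{\theta_2(0,\tau)^2\theta_1(\nu,\tau)^2}\\ &= \frac{\theta_2''(0,\tau)}{\theta_2(0,\tau)} - \left(\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)}\right)'\\ &= \frac{\theta_2''(0,\tau)}{\theta_2(0,\tau)} + 2\eta(\tau) + \mathcal{P}(\nu,\tau).\end{aligned} \tag{3.27}$$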
3.2. General torus amplitudes and connected parts. The loop amplitude
$$\mathcal{A}_{\rm loop}^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau) = \mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)(2\pi i)^n\prod_{j=1}^{n}\rho_j, \tag{3.28}$$
is invariant under $\nu_j\to\nu_j+1$ and $\nu_j\to\nu_j+\tau$ for each $j$ individually, so that it is defined on the torus obtained by identifying $\nu\in\mathbb{C}$ with $\nu+1$ and $\nu+\tau$. Because of the locality of the currents $J^{a_j}(\rho_j)$, the amplitude is also symmetric under simultaneous permutations of the $\rho_j$ and the $a_j$. From (2.3) we have that
$$J^a(\rho_1)J^b(\rho_2)\sim\frac{\kappa^{ab}}{(2\pi i)^2\rho_1\rho_2(\nu_1-\nu_2)^2} + \frac{f^{ab}{}_{c}\,J^c(\rho_2)}{2\pi i\rho_1(\nu_1-\nu_2)}\quad\text{as }\nu_1\to\nu_2. \tag{3.29}$$
Thus the singularities of the n-point loop amplitude on the torus are determined in terms of the (n −1)-point and (n −2)-point loop amplitudes. This means that knowledge of the (n − 1)-point and (n − 2)-point loop amplitudes determines the n-point loop amplitude up to a constant on the torus, that is a function of τ . (See, e.g., [26], p. 29.) Because of the permutation symmetry of the amplitude (3.28), this leaves the n-point loop determined up to a symmetric invariant tensor function of τ , given the (n −1)-point and (n −2)-point loops.
The sum over permutations in the expression (3.13) for the loop in the fermionic construction cases can be divided into terms which collect together the same $\rho_i$ in each cycle. Such terms are labeled by the division of the variables $\{\rho_1,\rho_2,\ldots,\rho_n\}$ into subsets, each consisting of at least two elements (corresponding to the restriction to permutations with no fixed points). The full loop amplitude is then the sum over these terms. Such terms are products of “connected parts”, each of which involves one of the subsets of $\{\rho_1,\rho_2,\ldots,\rho_n\}$, say $\{\rho_{i_1},\rho_{i_2},\ldots,\rho_{i_m}\}$, given by an expression like (2.21),
$$\mathcal{A}_{{\rm loop},C}^{a_1a_2\ldots a_m}(\nu_1,\nu_2,\ldots,\nu_m,\tau) = -\frac{1}{m}\sum_{\varrho\in S_m}f^{NS}_{(\varrho(1),\varrho(2),\ldots,\varrho(m))}\,\chi_{NS}(\tau), \tag{3.30}$$
in the NS case. The amplitudes $\mathcal{A}_{{\rm loop},C}$ have a simpler structure than the full amplitudes $\mathcal{A}_{\rm loop}$ in that they have only single poles for $m>2$, rather than both single and double poles. For $m>2$, the connected amplitudes satisfy the conditions
$$\mathcal{A}_{{\rm loop},C}^{a_1a_2\ldots a_m}(\nu_1,\nu_2,\ldots,\nu_m,\tau)\sim\frac{1}{\nu_m-\nu_j}\,f^{a_ma_j}{}_{a_j'}\,\mathcal{A}_{{\rm loop},C}^{a_1a_2\ldots a_{j-1}a_j'a_{j+1}\ldots a_{m-1}}(\nu_1,\nu_2,\ldots,\nu_{m-1},\tau), \tag{3.31}$$
which are sufficient to specify the $m$-point connected amplitude $\mathcal{A}_{{\rm loop},C}$ in terms of the $(m-1)$-point connected amplitude, again up to a symmetric invariant tensor function of $\tau$. Motivated by the fermionic constructions, we can give a general definition of the connected part of the loop amplitude in a familiar way. If $A = \{i_1, i_2,\ldots,i_n\}$ is a set of distinct positive integers, define
$$\mathcal{A}^A\equiv\mathcal{A}_{\rm loop}^{a_{i_1}a_{i_2}\ldots a_{i_n}}(\nu_{i_1},\nu_{i_2},\ldots,\nu_{i_n},\tau) = \mathrm{tr}\left(J^{a_{i_1}}(\rho_{i_1})J^{a_{i_2}}(\rho_{i_2})\cdots J^{a_{i_n}}(\rho_{i_n})\,w^{L_0}\right)(2\pi i)^n\prod_{j=1}^{n}\rho_{i_j}. \tag{3.32}$$
Let $P = (A_1, A_2,\ldots,A_r)$ be a division of the integers $A = \{i_1, i_2,\ldots,i_n\} = A_1\cup A_2\cup\ldots\cup A_r$ into a number of disjoint subsets; let $\mathcal{P}_A$ denote the collections of such divisions; and denote the partition function by
$$\chi(\tau) = \mathrm{tr}\left(w^{L_0}\right). \tag{3.33}$$
Then we can define the connected amplitude $\mathcal{A}_C^A$ inductively by
$$\mathcal{A}^A = \sum_{P\in\mathcal{P}_A}\chi(\tau)^{1-|P|}\prod_{A_j\in P}\mathcal{A}_C^{A_j}, \tag{3.34}$$
where $|P| = r$, the number of subsets contained in the division $P$, together with the vanishing of the one point function $\mathcal{A}_C^{\{i\}} = 0$, and, consequently,
$$\mathcal{A}_C^{\{i,j\}} = \mathcal{A}^{\{i,j\}} = \mathrm{tr}\left(J^{a_i}(\rho_i)J^{a_j}(\rho_j)\,w^{L_0}\right)(2\pi i)^2\rho_i\rho_j. \tag{3.35}$$
Equation (3.34) is of the form given for the NS case by (3.13) together with (3.10), where
$$\mathcal{A}_C^{\{i_1\ldots i_m\}} = -\frac{1}{m}\sum_{\varrho\in S_m}f^{NS}_{(i_{\varrho(1)}\ldots i_{\varrho(m)})}\,\chi, \qquad \chi(\tau) = \chi_{NS}(\tau),$$
in agreement with (3.30). Equation (3.34) defines an inductive procedure because we can write it as
$$\mathcal{A}_C^A = \mathcal{A}^A - \sum_{P\in\mathcal{P}_A'}\chi(\tau)^{1-|P|}\prod_{A_j\in P}\mathcal{A}_C^{A_j}, \tag{3.36}$$
where $\mathcal{P}_A'$ denotes the same collection of divisions of $A$ into disjoint subsets but omitting the division of $A$ into the single set consisting of itself. If we single out a point $i\in A$, we can rewrite the inductive definition of $\mathcal{A}_C^A$,
$$\mathcal{A}^A = \mathcal{A}_C^A + \sum_{B\in R_i^A}\mathcal{A}_C^B\,\mathcal{A}^{A\sim B}/\chi, \tag{3.37}$$
where $R_i^A$ denotes the proper subsets of $A$ which contain $i$. The point of this definition of
$$\mathcal{A}_C^A\equiv\mathcal{A}_C^{a_{i_1}a_{i_2}\ldots a_{i_n}}(\nu_{i_1},\nu_{i_2},\ldots,\nu_{i_n},\tau) \tag{3.38}$$
is that, for $m>2$, the double poles at $\nu_i = \nu_j$ present in $\mathcal{A}^A$ have been removed and only single poles remain. A double pole remains in $\mathcal{A}_C^{\{i,j\}}$ defined by (3.35),
$$\mathcal{A}_C^{\{i,j\}}\sim\frac{\kappa^{a_ia_j}\chi(\tau)}{(\nu_i-\nu_j)^2}\quad\text{as }\nu_i\sim\nu_j,$$
and this is its only singularity. To demonstrate the absence of the double pole at $\nu_i = \nu_j$ in $\mathcal{A}_C^A$, $m>2$, we use induction and (3.37). We note that the residue of the double pole on the left hand side is $k\delta^{a_ia_j}\mathcal{A}^{A\sim\{i,j\}}$, and in the sum on the right hand side, assuming inductively that the result is true for smaller amplitudes, the double pole occurs only in the term involving $\mathcal{A}_C^B$ for $B = \{i,j\}$ and the residue for this term is $\delta^{a_ia_j}k\chi$ multiplied by $\mathcal{A}^{A\sim B}/\chi$, i.e. the same as on the left hand side, so that these residues cancel and $\mathcal{A}_C^A$ has no double pole at $\nu_i = \nu_j$. A similar argument shows that $\mathcal{A}_C^A$ satisfies the same relations for the residues at single poles as $\mathcal{A}^A$, so that (3.31) holds.
3.3. Structure of torus amplitudes. In general write
$$\mathcal{A}_{{\rm loop},C}^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau)\equiv-F_n^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau)\,\chi(\tau), \tag{3.39}$$
so that, for $n>2$,
$$F_n^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau)\sim\frac{1}{\nu_n-\nu_j}\,f^{a_na_j}{}_{a_j'}\,F_{n-1}^{a_1a_2\ldots a_{j-1}a_j'a_{j+1}\ldots a_{n-1}}(\nu_1,\nu_2,\ldots,\nu_{n-1},\tau) \tag{3.40}$$
as $\nu_n\sim\nu_j$, which specifies $F_n$ on the torus in terms of $F_{n-1}$ up to a function of $\tau$,
$$\omega_n^{a_1a_2\ldots a_n}(\tau), \tag{3.41}$$
which, because of the properties of $F_n$, must be an invariant symmetric tensor. Inductively, this determines $F_n$ in terms of $F_2$ and these invariant tensors, $\omega_m$, $2<m\le n$. The 2-point function, $F_2$, has only a double pole,
$$F_2^{ab}(\nu_1,\nu_2,\tau)\sim-\frac{\kappa^{ab}}{(\nu_1-\nu_2)^2}\quad\text{as }\nu_1\to\nu_2. \tag{3.42}$$
In general (3.42) implies that the general form of the two-point function is
$$F_2^{ab}(\nu_1,\nu_2,\tau) = -\kappa^{ab}\,\mathcal{P}(\nu_1-\nu_2,\tau) + \omega_2^{ab}(\tau), \tag{3.43}$$
where $\omega_2^{ab}(\tau)$ is a symmetric invariant tensor. In the NS, NS$^-$ and R cases,
$$\omega_2^{ab}(\tau) = -\kappa^{ab}\left[\frac{\theta_s''(0,\tau)}{\theta_s(0,\tau)} + 2\eta(\tau)\right] \tag{3.44}$$
with $s = 3, 4, 2$, respectively. We can construct the general three-point loop, $F_3$, as follows; we start by rewriting (3.43) as
$$F_2^{ab}(\nu_1,\nu_2,\tau) = -\kappa^{ab}\,\mathcal{P}_{NS}(\nu_1-\nu_2,\tau) + \tilde\omega_2^{ab}(\tau). \tag{3.45}$$
We then have that $F_3^{abc}(\nu_1,\nu_2,\nu_3,\tau)$ differs from what it is in the NS case,
$$k f^{abc}\,\chi_{NS}(\nu_1-\nu_2,\tau)\chi_{NS}(\nu_2-\nu_3,\tau)\chi_{NS}(\nu_3-\nu_1,\tau), \tag{3.46}$$
by a function defined on the torus, whose residues at $\nu_1 = \nu_2$, $\nu_2 = \nu_3$, $\nu_3 = \nu_1$ are all $f^{ab}{}_{e}\,\tilde\omega_2^{ec}(\tau)$. To construct such a function, consider the Weierstrass $\zeta$ function (see [27], p. 445),
$$\zeta(\nu,\tau) = \frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)} + 2\eta(\tau)\nu. \tag{3.47}$$
$\zeta(\nu,\tau)$ has the properties:
$$\zeta(\nu+1,\tau) = \zeta(\nu,\tau) + 2\eta(\tau), \qquad \zeta(\nu+\tau,\tau) = \zeta(\nu,\tau) + 2\eta(\tau)\tau - 2\pi i, \tag{3.48}$$
$$\zeta'(\nu,\tau) = -\mathcal{P}(\nu,\tau), \qquad \zeta(-\nu,\tau) = -\zeta(\nu,\tau), \qquad \zeta(\nu,\tau) = \frac{1}{\nu} + O(\nu^3)\quad\text{as }\nu\to0. \tag{3.49}$$
It follows that
$$\zeta(\nu_1-\nu_2,\tau) + \zeta(\nu_2-\nu_3,\tau) + \zeta(\nu_3-\nu_1,\tau) \tag{3.50}$$
is defined on the torus and has residue 1 at $\nu_1 = \nu_2$, $\nu_2 = \nu_3$ and $\nu_3 = \nu_1$. Thus the general form is
$$\begin{aligned}F_3^{abc}(\nu_1,\nu_2,\nu_3,\tau) &= k f^{abc}\,\chi_{NS}(\nu_1-\nu_2,\tau)\chi_{NS}(\nu_2-\nu_3,\tau)\chi_{NS}(\nu_3-\nu_1,\tau)\\ &\quad + f^{ab}{}_{e}\,\tilde\omega_2^{ec}(\tau)\left[\zeta(\nu_1-\nu_2,\tau) + \zeta(\nu_2-\nu_3,\tau) + \zeta(\nu_3-\nu_1,\tau)\right] + 2\omega_3^{abc}(\tau),\end{aligned} \tag{3.51}$$
where $\omega_3$ is a symmetric invariant tensor, because this has the residues specified by (3.40). We could proceed to express the $n$-point connected loop amplitude as the expression in the NS case (3.11) with additional terms, but, instead, we adopt an approach that is more symmetric between all the terms. To this end, we define functions, $H_{n,m}(\mu_1,\ldots,\mu_n,\tau)$, symmetric under the permutations of the $\mu_i$, initially for $0\le m\le4$, by
$$\begin{aligned}H_{n,0}(\mu,\tau) &= 1,\\ H_{n,1}(\mu,\tau) &= \sum_{j=1}^{n}\zeta_j,\\ 2H_{n,2}(\mu,\tau) &= \Big(\sum_{j=1}^{n}\zeta_j\Big)^2 + \sum_{j=1}^{n}\zeta_j',\\ 6H_{n,3}(\mu,\tau) &= \Big(\sum_{j=1}^{n}\zeta_j\Big)^3 + 3\sum_{j=1}^{n}\zeta_j\sum_{j=1}^{n}\zeta_j' + \sum_{j=1}^{n}\zeta_j'',\\ 24H_{n,4}(\mu,\tau) &= \Big(\sum_{j=1}^{n}\zeta_j\Big)^4 + 6\Big(\sum_{j=1}^{n}\zeta_j\Big)^2\sum_{j=1}^{n}\zeta_j' + 3\Big(\sum_{j=1}^{n}\zeta_j'\Big)^2 + 4\sum_{j=1}^{n}\zeta_j\sum_{j=1}^{n}\zeta_j'' + \sum_{j=1}^{n}\zeta_j''' + k_4,\end{aligned} \tag{3.52}$$
where $\mu = (\mu_1,\ldots,\mu_n)$, $\zeta_j = \zeta(\mu_j,\tau)$, and $k_4 = k_4(\tau)$ is a constant on the torus to be determined. Then the singularities in the $\mu_j$ of $H_{n,m}(\mu_1,\ldots,\mu_n,\tau)$ are simple poles at $\mu_j = 0$ for $n>2$, and the residue
$$\operatorname*{Res}_{\mu_n=0}H_{n,m}(\mu_1,\ldots,\mu_n,\tau) = H_{n-1,m-1}(\mu_1,\ldots,\mu_{n-1},\tau), \tag{3.53}$$
for $1\le m\le4$ and $n>2$. This can be verified case by case but we shall give a general argument below. The $H_{n,m}(\mu,\tau)$ are not single valued for $\mu_j$ on the torus but, if we impose the constraint that $\mu_1+\ldots+\mu_n = 0$, they are. So $H_{n,m}(\nu_{12},\ldots,\nu_{n1},\tau)$, where $\nu_{ij} = \nu_i-\nu_j$, is defined on the torus and, for $n>2$, just has poles at $\nu_i = \nu_{i+1}$, $1\le i\le n$, with $\nu_{n+1}\equiv\nu_1$. For $n = 2$,
$$H_{2,1}(\nu_{12},\nu_{21},\tau) = 0, \qquad H_{2,2}(\nu_{12},\nu_{21},\tau) = -\mathcal{P}(\nu_{12},\tau). \tag{3.54}$$
By Proposition 2 of Sect. 2.3, we can define $n$th order tensors $\kappa_{n,m}(\tau)$, $n\ge m\ge0$, $n\ge2$, by the conditions
$$\kappa_{n,m}^{a_1a_2\ldots a_n}(\tau) - \kappa_{n,m}^{a_2a_1\ldots a_n}(\tau) = f^{a_1a_2}{}_{b}\,\kappa_{n-1,m-1}^{ba_3\ldots a_n}(\tau), \tag{3.55}$$
$$\kappa_{n,m}^{a_1a_2\ldots a_n}(\tau) = \kappa_{n,m}^{a_2\ldots a_na_1}(\tau), \tag{3.56}$$
together with the requirement that $\kappa_{n,m}$ be orthogonal to all symmetric tensors for $m>0$ and $n>2$, and the initial condition that $\kappa_{2,2} = \kappa$, $\kappa_{2,1} = 0$ and $\kappa_{n,0}(\tau) = \omega_n(\tau)$, a symmetric invariant tensor. Then, setting
$$[\kappa_{n,m}H_{n,m}]^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau) = \frac{1}{n}\sum_{\varrho\in S_n}\kappa_{n,m}^{a_{\varrho(1)}\ldots a_{\varrho(n)}}\,H_{n,m}(\nu_{\varrho(1)\varrho(2)},\ldots,\nu_{\varrho(n)\varrho(1)},\tau), \tag{3.57}$$
$F_n = \kappa_{n,n-m}H_{n,n-m}$, $n\ge m$, provides a solution to (3.40) for each $m$. By the linearity of those equations, we obtain the solution,
$$F_n = \sum_{m=0}^{n}\kappa_{n,m}H_{n,m}, \qquad n\ge2. \tag{3.58}$$
Because so far we only have $H_{n,m}$ for $0\le m\le4$, (3.58) is only valid for $2\le n\le4$. Explicitly,
$$F_2 = \kappa_2H_{2,2} + \kappa_{2,0}, \qquad F_3 = \kappa_3H_{3,3} + \kappa_{3,1}H_{3,1} + \kappa_{3,0}, \qquad F_4 = \kappa_4H_{4,4} + \kappa_{4,2}H_{4,2} + \kappa_{4,1}H_{4,1} + \kappa_{4,0}, \tag{3.59}$$
where we have written $\kappa_n\equiv\kappa_{n,n}$. To demonstrate that $H_{n,m}$ has the desired properties, $0\le m\le4$, and to extend its definition to higher values of $m$, we note we can write
$$H_{n,m}(\mu,\tau) = \frac{1}{m!}\Big[\sum_{j=1}^{n}(\partial_j+\zeta_j)\Big]^m 1, \qquad\text{for }1\le m\le3, \tag{3.60}$$
where $\partial_j = \partial/\partial\mu_j$. (Here, and in what follows, $n\ge2$.) The $\zeta$ function can be written in terms of the Weierstrass $\sigma$ function (see [27], p. 447)
$$\zeta(\mu,\tau) = \frac{\sigma'(\mu,\tau)}{\sigma(\mu,\tau)}, \qquad \sigma(\mu,\tau) = e^{\eta(\tau)\mu^2}\,\frac{\theta_1(\mu,\tau)}{\theta_1'(0,\tau)}, \tag{3.61}$$
with $\sigma(-\mu,\tau) = -\sigma(\mu,\tau)$, and
$$\sigma(\mu,\tau) = \sum_{s=0}^{\infty}f_s(\tau)\mu^{2s+1} = \mu + f_2(\tau)\mu^5 + \ldots, \qquad\text{because }f_1 = 0. \tag{3.62}$$
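The operator form (3.60) can be compared with the explicit list (3.52) by a short symbolic computation. The sketch below uses its own minimal polynomial representation rather than a computer algebra library: a polynomial in the sums $z_k = \sum_j\zeta_j^{(k)}$ is stored as a dictionary from exponent tuples to integer coefficients, and $\sum_j(\partial_j+\zeta_j)$ acts as multiplication by $z_0$ plus the derivation sending $z_k\to z_{k+1}$. The names and the truncation `ORDER` are illustrative choices.

```python
from collections import defaultdict

ORDER = 6  # variables z_0 .. z_{ORDER-1}; results are exact for powers m <= ORDER


def apply_operator(poly):
    """Apply sum_j (d/dmu_j + zeta_j) to a polynomial in the sums z_k."""
    out = defaultdict(int)
    for expo, coeff in poly.items():
        # multiplication by z_0 = sum_j zeta_j
        out[(expo[0] + 1,) + expo[1:]] += coeff
        # derivation: z_k^e -> e * z_k^(e-1) * z_{k+1}
        for k, e in enumerate(expo[:-1]):
            if e:
                new = list(expo)
                new[k] -= 1
                new[k + 1] += 1
                out[tuple(new)] += coeff * e
    return dict(out)


def operator_power(m):
    """[sum_j (d_j + zeta_j)]^m applied to 1, i.e. m! times the right side of (3.60)."""
    poly = {(0,) * ORDER: 1}
    for _ in range(m):
        poly = apply_operator(poly)
    return poly


for m in (2, 3, 4):
    print(m, operator_power(m))
```

For $m = 2$ and $m = 3$ the output reproduces the coefficients of $2H_{n,2}$ and $6H_{n,3}$ in (3.52); for $m = 4$ it gives the part of $24H_{n,4}$ other than the constant $k_4$, with coefficients $1, 6, 3, 4, 1$.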
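Then, defining $\hat H_{n,m}(\mu,\tau)$ by the right hand side of (3.60) for all $m\ge0$, we have as $\mu_n\to0$,
$$\begin{aligned}\hat H_{n,m}(\mu,\tau) &= \frac{1}{m!\,\sigma(\mu_n,\tau)}\Big[\partial_n + \sum_{j=1}^{n-1}(\partial_j+\zeta_j)\Big]^m\sigma(\mu_n,\tau)\\ &= \frac{1}{m!\,\mu_n}\Big[\partial_n + \sum_{j=1}^{n-1}(\partial_j+\zeta_j)\Big]^m\sum_{s=0}^{\infty}f_s\,\mu_n^{2s+1} + O(1)\\ &= \frac{1}{\mu_n}\sum_{s=0}^{[\frac12m-\frac12]}f_s(\tau)\,\hat H_{n-1,m-2s-1}(\mu',\tau) + O(1),\end{aligned} \tag{3.63}$$
where $\mu' = (\mu_1,\ldots,\mu_{n-1})$ and $[\frac12m-\frac12]$ is the greatest integer less than or equal to $\frac12m-\frac12$. Thus
$$\operatorname*{Res}_{\mu_n=0}\hat H_{n,m}(\mu,\tau) = \sum_{s=0}^{[\frac12m-\frac12]}f_s(\tau)\,\hat H_{n-1,m-2s-1}(\mu',\tau); \tag{3.64}$$
In particular,
$$\operatorname*{Res}_{\mu_n=0}\hat H_{n,m}(\mu,\tau) = \hat H_{n-1,m-1}(\mu',\tau), \qquad 1\le m\le4,$$
so that (3.53) holds for $1\le m\le4$ and $n>2$, but
$$\operatorname*{Res}_{\mu_n=0}\hat H_{n,5}(\mu,\tau) = \hat H_{n-1,4}(\mu',\tau) + f_2\,\hat H_{n-1,0}(\mu',\tau).$$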
To see how to modify $\hat H_{n,m}$ to give an $H_{n,m}$ that satisfies (3.53) for all $m\ge1$, write
$$\hat H_n(\mu,\tau;\nu) = \sum_{m=0}^{\infty}\nu^m\hat H_{n,m}(\mu,\tau); \tag{3.65}$$
we have from (3.64),
$$\operatorname*{Res}_{\mu_n=0}\hat H_n(\mu,\tau;\nu) = \sigma(\nu,\tau)\,\hat H_{n-1}(\mu',\tau;\nu). \tag{3.66}$$
So, if we define
$$H_n(\mu,\tau;\nu) = \frac{\nu^n}{\sigma(\nu,\tau)^n}\,\hat H_n(\mu,\tau;\nu) = \sum_{m=0}^{\infty}H_{n,m}(\mu,\tau)\nu^m, \tag{3.67}$$
the definition of $H_{n,m}$ in (3.52) is unchanged for $1\le m\le4$ (except that the constant, $k_4/24$, in the definition of $H_{n,4}$ is determined to be $-nf_2$), and
$$\operatorname*{Res}_{\mu_n=0}H_n(\mu,\tau;\nu) = \nu H_{n-1}(\mu',\tau;\nu), \qquad n\ge1, \tag{3.68}$$
i.e.
$$\operatorname*{Res}_{\mu_n=0}H_{n,m}(\mu,\tau) = H_{n-1,m-1}(\mu',\tau), \qquad n\ge m\ge1. \tag{3.69}$$
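From (3.60)
$$\begin{aligned}\hat H_n(\mu,\tau;\nu) &= \sum_{m=0}^{\infty}\frac{\nu^m}{m!}\Big[\sum_{j=1}^{n}(\partial_j+\zeta_j)\Big]^m 1\\ &= \Big[\prod_{j=1}^{n}\frac{1}{\sigma(\mu_j,\tau)}\Big]\sum_{m=0}^{\infty}\frac{\nu^m}{m!}\Big[\sum_{j=1}^{n}\partial_j\Big]^m\prod_{j=1}^{n}\sigma(\mu_j,\tau)\\ &= \prod_{j=1}^{n}\frac{\sigma(\mu_j+\nu,\tau)}{\sigma(\mu_j,\tau)}\end{aligned} \tag{3.70}$$
and so
$$H_n(\mu,\tau;\nu) = \frac{\nu^n}{\sigma(\nu,\tau)^n}\prod_{j=1}^{n}\frac{\sigma(\mu_j+\nu,\tau)}{\sigma(\mu_j,\tau)}. \tag{3.71}$$
Note that
$$\nu^{-n}H_n(\mu,\tau;\nu) = \prod_{j=1}^{n}\frac{\sigma(\mu_j+\nu,\tau)}{\sigma(\nu,\tau)\sigma(\mu_j,\tau)} \tag{3.72}$$
is elliptic as a function of the $\mu_j$ and of $\nu$ provided that we impose the constraint that $\mu_1+\mu_2+\ldots+\mu_n = 0$. From (3.71) we see directly that
$$\operatorname*{Res}_{\mu_n=0}H_n(\mu,\tau;\nu) = \frac{\nu^n}{\sigma(\nu,\tau)^n}\,\frac{\sigma(\nu,\tau)}{\sigma'(0,\tau)}\prod_{j=1}^{n-1}\frac{\sigma(\mu_j+\nu,\tau)}{\sigma(\mu_j,\tau)} = \nu H_{n-1}(\mu',\tau;\nu). \tag{3.73}$$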
Properties of $H_n(\mu,\tau;\nu)$ are discussed in Appendix B. In particular, it is shown that
$$H_{n,n-1}(\nu_{12},\ldots,\nu_{n-1,n},\nu_{n,1},\tau) = 0. \tag{3.74}$$
The relation (3.69) shows that (3.58) provides the general form of the $n$-point connected loop amplitude, with $H_{n,m}$ defined as the moments of (3.71). It specifies $F_n$ in terms of the invariant symmetric tensors $\kappa_{2,2} = \kappa$ and $\kappa_{n,0} = \omega_n$,
$$F_n^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau) = \frac{1}{n}\sum_{\varrho\in S_n}\sum_{m=0}^{n}\kappa_{n,m}^{a_{\varrho(1)}\ldots a_{\varrho(n)}}\,H_{n,m}(\nu_{\varrho(1)\varrho(2)},\ldots,\nu_{\varrho(n)\varrho(1)},\tau). \tag{3.75}$$
Since the symmetrization of $\kappa_{n,m}^{a_1a_2\ldots a_n}$ is zero for $m>0$ and $n\ge3$, we can evaluate $\kappa_{n,0}$ in terms of connected parts of traces of the $J^a(\rho)$ by symmetrizing (3.75) over the group indices only, yielding
$$F^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau)_S = (n-1)!\,\kappa_{n,0}^{a_1\ldots a_n}(\tau), \qquad n\ge3, \tag{3.76}$$
where we define
$$F^{a_1a_2\ldots a_n}(\nu_1,\nu_2,\ldots,\nu_n,\tau)_S = \frac{1}{n!}\sum_{\varrho\in S_n}F_n^{a_{\varrho(1)}\ldots a_{\varrho(n)}}(\nu_1,\nu_2,\ldots,\nu_n,\tau). \tag{3.77}$$
Equation (3.76) can be written
$$\omega_n^{a_1\ldots a_n}(\tau)\equiv\kappa_{n,0}^{a_1\ldots a_n}(\tau) = -\frac{(2\pi i)^n}{(n-1)!\,\chi(\tau)}\Big[\prod_{j=1}^{n}\rho_j\Big]\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)_{C,S}, \tag{3.78}$$
for $n\ge3$; $\omega_2$ is determined by (3.43). Note that this implies that the symmetrized connected part of the trace
$$\Big[\prod_{j=1}^{n}\rho_j\Big]\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)_{C,S} \tag{3.79}$$
is independent of the $\nu_j$. (This follows directly from symmetrizing (3.40) because this shows that all the residues of this elliptic function vanish, implying that it is a constant on the torus.) We will relate this to the trace of zero modes of the currents in Sect. 4. In Appendix C we show that the formulae given for two-, three- and four-point loops in [3] are equivalent to (3.75) for $n\le4$.
4. Zero Modes
4.1. Recurrence relations and traces of zero modes. The symmetric tensor $\omega_n$ is given in terms of the symmetrized connected part of the trace of currents by (3.78). We seek to express this in terms of traces of symmetrized products of zero modes, $J_0^a$. To this end consider
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_r}(\rho_r)J_0^{a_{r+1}}\cdots J_0^{a_n}\,w^{L_0}\right)\prod_{j=1}^{r}\rho_j. \tag{4.1}$$
These functions are not elliptic as functions of the $\nu_j$, $1\le j\le r$, if $r<n$; to see this move $J^{a_1}(\rho_1)$ around the trace, through $w^{L_0}$, to calculate the effect of sending $\nu_1\to\nu_1+\tau$, and we find that it is not invariant because terms proportional to $f^{a_1a_j}{}_{e}$ are generated on commuting $J^{a_1}(\rho_1)$ with $J_0^{a_j}$, $j>r$. However, these terms clearly disappear on symmetrizing over all the indices $(a_1, a_2,\ldots,a_n)$, so that
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_r}(\rho_r)J_0^{a_{r+1}}\cdots J_0^{a_n}\,w^{L_0}\right)_S\prod_{j=1}^{r}\rho_j, \tag{4.2}$$
is elliptic in $\nu_j$, $1\le j\le r$, and so a suitable function to consider. Symmetrizing the recurrence relation [3]
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right) &= \rho_1^{-1}\,\mathrm{tr}\left(J_0^{a_1}J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)\\ &\quad + i\sum_{j=2}^{n}\frac{\Delta_1(\nu_j-\nu_1,\tau)}{\rho_1}\,f^{a_1a_j}{}_{a_j'}\,\mathrm{tr}\left(J^{a_2}(\rho_2)\cdots J^{a_{j-1}}(\rho_{j-1})J^{a_j'}(\rho_j)J^{a_{j+1}}(\rho_{j+1})\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)\\ &\quad + k\sum_{j=2}^{n}\frac{\Delta_2(\nu_j-\nu_1,\tau)}{\rho_1\rho_j}\,\delta^{a_1a_j}\,\mathrm{tr}\left(J^{a_2}(\rho_2)\cdots J^{a_{j-1}}(\rho_{j-1})J^{a_{j+1}}(\rho_{j+1})\cdots J^{a_n}(\rho_n)\,w^{L_0}\right),\end{aligned} \tag{4.3}$$
where
$$\Delta_1(\nu,\tau) = \frac{i}{2\pi}\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)} - \frac12, \qquad \Delta_2(\nu,\tau) = \frac{1}{4\pi^2}\left(\frac{\theta_1'(\nu,\tau)}{\theta_1(\nu,\tau)}\right)' = -\frac{1}{4\pi^2}\mathcal{P}(\nu,\tau) - \frac{1}{2\pi^2}\eta(\tau), \tag{4.4}$$
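we obtain
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)_S &= \frac{1}{\rho_1}\,\mathrm{tr}\left(J_0^{a_1}J^{a_2}(\rho_2)\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)_S\\ &\quad + k\sum_{j=2}^{n}\frac{\Delta_2(\nu_j-\nu_1,\tau)}{\rho_1\rho_j}\left[\delta^{a_1a_j}\,\mathrm{tr}\left(J^{a_2}(\rho_2)\cdots J^{a_{j-1}}(\rho_{j-1})J^{a_{j+1}}(\rho_{j+1})\cdots J^{a_n}(\rho_n)\,w^{L_0}\right)\right]_S.\end{aligned} \tag{4.5}$$
This generalizes to
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\cdots J^{a_r}(\rho_r)J_0^{a_{r+1}}\cdots J_0^{a_n}\,w^{L_0}\right)_S &= \frac{1}{\rho_1}\,\mathrm{tr}\left(J^{a_2}(\rho_2)\cdots J^{a_r}(\rho_r)J_0^{a_1}J_0^{a_{r+1}}\cdots J_0^{a_n}\,w^{L_0}\right)_S\\ &\quad + k\sum_{j=2}^{r}\frac{\Delta_2(\nu_j-\nu_1,\tau)}{\rho_1\rho_j}\left[\delta^{a_1a_j}\,\mathrm{tr}\left(J^{a_2}(\rho_2)\cdots J^{a_{j-1}}(\rho_{j-1})J^{a_{j+1}}(\rho_{j+1})\cdots J^{a_r}(\rho_r)J_0^{a_{r+1}}\cdots J_0^{a_n}\,w^{L_0}\right)\right]_S.\end{aligned} \tag{4.6}$$
Applying this for $r = n = 2$,
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\,w^{L_0}\right)\rho_1\rho_2 &= \mathrm{tr}\left(J_0^{a_1}J^{a_2}(\rho_2)\,w^{L_0}\right)\rho_2 + k\Delta_2(\nu_2-\nu_1,\tau)\,\delta^{a_1a_2}\chi(\tau)\\ &= \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\,w^{L_0}\right) + k\Delta_2(\nu_2-\nu_1,\tau)\,\delta^{a_1a_2}\chi(\tau);\end{aligned} \tag{4.7}$$
using (3.43) and (4.4), we have
$$\omega_2^{ab} = \frac{4\pi^2}{\chi(\tau)}\,\mathrm{tr}\left(J_0^aJ_0^b\,w^{L_0}\right) - 2\delta^{ab}k\eta(\tau). \tag{4.8}$$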
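Taking $r = n = 3$,
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)J^{a_3}(\rho_3)\,w^{L_0}\right)_S\rho_1\rho_2\rho_3 &= \mathrm{tr}\left(J_0^{a_1}J^{a_2}(\rho_2)J^{a_3}(\rho_3)\,w^{L_0}\right)_S\rho_2\rho_3\\ &= \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}J^{a_3}(\rho_3)\,w^{L_0}\right)_S\rho_3\\ &= \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}J_0^{a_3}\,w^{L_0}\right)_S\end{aligned} \tag{4.9}$$
because $\mathrm{tr}(J^{a_3}(\rho_3)w^{L_0}) = \mathrm{tr}(J_0^{a_3}w^{L_0}) = 0$, so that
$$\omega_3^{abc} = \frac{4\pi^3i}{\chi(\tau)}\,\mathrm{tr}\left(J_0^aJ_0^bJ_0^c\,w^{L_0}\right)_S. \tag{4.10}$$
For $r = n = 4$, writing $\Delta_2(\nu)$ for $\Delta_2(\nu,\tau)$,
$$\begin{aligned}&\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)_S\rho_1\rho_2\rho_3\rho_4\\ &\quad= \mathrm{tr}\left(J_0^{a_1}J^{a_2}(\rho_2)J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)_S\rho_2\rho_3\rho_4\\ &\qquad + k\Delta_2(\nu_2-\nu_1)\left[\delta^{a_1a_2}\,\mathrm{tr}\left(J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)\right]_S\rho_3\rho_4\\ &\qquad + k\Delta_2(\nu_3-\nu_1)\left[\delta^{a_1a_3}\,\mathrm{tr}\left(J^{a_2}(\rho_2)J^{a_4}(\rho_4)\,w^{L_0}\right)\right]_S\rho_2\rho_4\\ &\qquad + k\Delta_2(\nu_4-\nu_1)\left[\delta^{a_1a_4}\,\mathrm{tr}\left(J^{a_2}(\rho_2)J^{a_3}(\rho_3)\,w^{L_0}\right)\right]_S\rho_2\rho_3\\ &\quad= \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)_S\rho_3\rho_4\\ &\qquad + k\left(\Delta_2(\nu_2-\nu_1) + \Delta_2(\nu_3-\nu_1) + \Delta_2(\nu_4-\nu_1)\right)\left[\delta^{a_1a_2}\,\mathrm{tr}\left(J_0^{a_3}J_0^{a_4}\,w^{L_0}\right)\right]_S\\ &\qquad + k^2\left(\Delta_2(\nu_2-\nu_1)\Delta_2(\nu_3-\nu_4) + \Delta_2(\nu_3-\nu_1)\Delta_2(\nu_2-\nu_4) + \Delta_2(\nu_4-\nu_1)\Delta_2(\nu_2-\nu_3)\right)\left[\delta^{a_1a_2}\delta^{a_3a_4}\right]_S\chi(\tau)\\ &\qquad + k\Delta_2(\nu_3-\nu_2)\left[\delta^{a_2a_3}\,\mathrm{tr}\left(J_0^{a_1}J^{a_4}(\rho_4)\,w^{L_0}\right)\right]_S\rho_4 + k\Delta_2(\nu_4-\nu_2)\left[\delta^{a_2a_4}\,\mathrm{tr}\left(J_0^{a_1}J^{a_3}(\rho_3)\,w^{L_0}\right)\right]_S\rho_3\\ &\quad= \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}J_0^{a_3}J_0^{a_4}\,w^{L_0}\right)_S + k\left[\delta^{a_1a_2}\,\mathrm{tr}\left(J_0^{a_3}J_0^{a_4}\,w^{L_0}\right)\right]_S\sum_{i<j}\Delta_2(\nu_i-\nu_j)\\ &\qquad + k^2\left(\Delta_2(\nu_2-\nu_1)\Delta_2(\nu_3-\nu_4) + \Delta_2(\nu_3-\nu_1)\Delta_2(\nu_2-\nu_4) + \Delta_2(\nu_4-\nu_1)\Delta_2(\nu_2-\nu_3)\right)\left[\delta^{a_1a_2}\delta^{a_3a_4}\right]_S\chi(\tau).\end{aligned} \tag{4.11}$$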
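Then, since
$$\begin{aligned}\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)_C &= \mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)\\ &\quad - \Big[\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\,w^{L_0}\right)\mathrm{tr}\left(J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)\\ &\qquad + \mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_3}(\rho_3)\,w^{L_0}\right)\mathrm{tr}\left(J^{a_2}(\rho_2)J^{a_4}(\rho_4)\,w^{L_0}\right)\\ &\qquad + \mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_4}(\rho_4)\,w^{L_0}\right)\mathrm{tr}\left(J^{a_2}(\rho_2)J^{a_3}(\rho_3)\,w^{L_0}\right)\Big]/\chi(\tau),\end{aligned} \tag{4.12}$$
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)J^{a_3}(\rho_3)J^{a_4}(\rho_4)\,w^{L_0}\right)_{CS}\prod_{j=1}^{4}\rho_j = \mathrm{tr}\left(J_0^{a_1}J_0^{a_2}J_0^{a_3}J_0^{a_4}\,w^{L_0}\right)_S - 3\left[\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\,w^{L_0}\right)\mathrm{tr}\left(J_0^{a_3}J_0^{a_4}\,w^{L_0}\right)\right]_S/\chi(\tau). \tag{4.13}$$
From (3.78), this shows that $\omega_4$ is given as a “connected part” of a trace of zero modes. In the next section we define such connected parts and show that $\omega_n$ is given in terms of them for all $n\ge2$.
4.2. Connected parts of zero mode amplitudes. Because of the locality of the currents, $\mathcal{A}^A$, defined as in (3.32), and so $\mathcal{A}_C^A$, defined inductively by (3.36), is symmetric under simultaneous permutations of the indices $a_j$ and the variables $\nu_j$. We define the symmetrization $\mathcal{A}_S^A$ of $\mathcal{A}^A$ by symmetrizing on the $a_j$ alone:
$$\mathcal{A}_S^{a_{i_1}a_{i_2}\ldots a_{i_n}}(\nu_{i_1},\nu_{i_2},\ldots,\nu_{i_n},\tau) = \frac{1}{n!}\sum_{\varrho\in S_n}\mathcal{A}^{a_{i_{\varrho(1)}}a_{i_{\varrho(2)}}\ldots a_{i_{\varrho(n)}}}(\nu_{i_1},\nu_{i_2},\ldots,\nu_{i_n},\tau); \tag{4.14}$$
equivalently we could symmetrize on the variables $\nu_j$ alone. We define $\mathcal{A}_{CS}^A$, the symmetrization of $\mathcal{A}_C^A$, similarly. We consider the trace of zero modes of the currents,
$$Z^A\equiv Z^{a_{i_1}a_{i_2}\ldots a_{i_n}}(\tau) = \mathrm{tr}\left(J_0^{a_{i_1}}J_0^{a_{i_2}}\cdots J_0^{a_{i_n}}\,w^{L_0}\right)(2\pi i)^n, \tag{4.15}$$
and, more particularly, its symmetrization, $Z_S^A$, defined as in (4.14). We can define a “connected part”, $Z_{CS}^A$, inductively for $Z_S^A$, following (3.36),
$$Z_{CS}^A = Z_S^A - \sum_{P\in\mathcal{P}_A'}\chi(\tau)^{1-|P|}\prod_{A_j\in P}Z_{CS}^{A_j}, \tag{4.16}$$
(where again $\mathcal{P}_A'$ denotes the same collection of divisions of $A$ into disjoint subsets but omitting the division of $A$ into the single set consisting of itself) together with the vanishing of the one point function $Z_{CS}^{\{i\}} = 0$, and with the two-point function given by
$$Z_{CS}^{\{i,j\}} = Z_S^{\{i,j\}} = \mathrm{tr}\left(J_0^{a_i}J_0^{a_j}\,w^{L_0}\right)(2\pi i)^2. \tag{4.17}$$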
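For $A = \{i_1, i_2,\ldots,i_{2m}\}$, define
$$P^A = \frac{(2\pi i)^{2m}k^m}{2^m\,m!}\sum_{\varrho\in S_{2m}}\prod_{j=1}^{m}\delta^{a_{i_{\varrho(2j-1)}}a_{i_{\varrho(2j)}}}\,\Delta_2(\nu_{i_{\varrho(2j-1)}}-\nu_{i_{\varrho(2j)}}), \tag{4.18}$$
and define $P^A = 0$ if $A$ has an odd number of elements. Then
$$\mathcal{A}_{CS}^{\{i,j\}} = Z_{CS}^{\{i,j\}} + P^{\{i,j\}}\chi, \tag{4.19}$$
and the recurrence relation (4.6) leads to
$$\mathcal{A}_S^A = Z_S^A + \sum_{B\in R^A}\left[P^BZ_S^{A\sim B}\right]_S, \tag{4.20}$$
where $R^A$ denotes the subsets of $A$, excluding the empty set but including $A$ itself. Now, symmetrizing (3.36),
$$\mathcal{A}_S^A = \mathcal{A}_{CS}^A + \sum_{P\in\mathcal{P}_A'}\chi^{1-|P|}\Big[\prod_{A_j\in P}\mathcal{A}_{CS}^{A_j}\Big]_S. \tag{4.21}$$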
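If we assume, as the inductive hypothesis, that $\mathcal{A}_{CS}^B = Z_{CS}^B$, for $2<|B|<|A|$, and that (4.19) holds when $|B| = 2$, we have, on substituting for $\mathcal{A}_{CS}^{A_j}$ and symmetrizing, that
$$\begin{aligned}\mathcal{A}_S^A &= \mathcal{A}_{CS}^A + \sum_{P\in\mathcal{P}_A'}\chi^{1-|P|}\Big[\prod_{A_j\in P}Z_{CS}^{A_j}\Big]_S + \sum_{B\in R^A}\Big[P^B\sum_{R\in\mathcal{P}_{A\sim B}}\chi^{1-|R|}\prod_{D_j\in R}Z_{CS}^{D_j}\Big]_S\\ &= \mathcal{A}_{CS}^A + \sum_{P\in\mathcal{P}_A'}\chi^{1-|P|}\Big[\prod_{A_j\in P}Z_{CS}^{A_j}\Big]_S + \sum_{B\in R^A}\left[P^BZ_S^{A\sim B}\right]_S\end{aligned} \tag{4.22}$$
by (4.16). Then using (4.20),
$$Z_S^A = \mathcal{A}_{CS}^A + \sum_{P\in\mathcal{P}_A'}\chi^{1-|P|}\Big[\prod_{A_j\in P}Z_{CS}^{A_j}\Big]_S, \tag{4.23}$$
so that $\mathcal{A}_{CS}^A$ satisfies the recurrence relation (4.16) for $Z_{CS}^A$ and we can conclude inductively that $\mathcal{A}_{CS}^A = Z_{CS}^A$, for $|A|>2$. It follows from (3.78),
$$\omega_n^{a_1\ldots a_n}(\tau) = \kappa_{n,0}^{a_1\ldots a_n}(\tau) = -\frac{(2\pi i)^n}{(n-1)!\,\chi(\tau)}\,\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\cdots J_0^{a_n}\,w^{L_0}\right)_{C,S}, \qquad n\ge3, \tag{4.24}$$
with $\omega_2$ given by (4.8).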
4.3. Traces of zero modes and characters. In this section we will relate the symmetrized traces of the zero modes
$$Z_S^{a_{i_1}a_{i_2}\ldots a_{i_n}}(\tau) = \mathrm{tr}\left(J_0^{a_{i_1}}J_0^{a_{i_2}}\cdots J_0^{a_{i_n}}\,w^{L_0}\right)_S(2\pi i)^n, \tag{4.25}$$
to the character of the representation of the affine algebra, $\hat g$, defined by (1.1), in the space of states,
$$\chi(\theta,\tau) = \mathrm{tr}\left(e^{iH\cdot\theta}\,w^{L_0}\right). \tag{4.26}$$
Here $H$ denotes the generators of a Cartan subalgebra, $h$, of the finite-dimensional algebra $g$ formed by the zero modes,
$$[J_0^a, J_0^b] = f^{ab}{}_{c}\,J_0^c. \tag{4.27}$$
For convenience of exposition, we shall take $g$ to be simple in what follows. For fixed $\tau$, $\mathrm{tr}\left(J_0^{a_{i_1}}J_0^{a_{i_2}}\cdots J_0^{a_{i_n}}\,w^{L_0}\right)_S$ is an invariant symmetric tensor for $g$. The space of symmetric tensors, $S(g)$, is isomorphic (as a vector space) to $U(g)$, the universal enveloping algebra of $g$,
$$\omega_{a_1a_2\ldots a_n}\mapsto\omega_{a_1a_2\ldots a_n}J_0^{a_1}J_0^{a_2}\cdots J_0^{a_n}. \tag{4.28}$$
The invariant tensors $S(g)^g\subset S(g)$ correspond to the center $Z(U(g))$ of $U(g)$, i.e. the elements of $U(g)$ that commute with $g$. This is a ring generated by $\mathrm{rank}\,g$ elements (e.g. [28], p. 337), the basic Casimir operators, or primitive invariant tensors. These can be taken to be orthogonal,
$$\omega_{a_1a_2\ldots a_n}\,\omega'^{\,a_1a_2\ldots a_m} = 0, \tag{4.29}$$
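where $\omega$, $\omega'$ are primitive invariant symmetric tensors of orders $n$, $m$, $m < n$. We can use a Cartan-Weyl basis for $g$, using $\Phi$ to denote the set of roots of $g$,
$$\begin{aligned}
[H^{i}, H^{j}] &= 0, && 1\le i,j\le \mathrm{rank}\,g;\\
[H^{i}, E^{\alpha}] &= \alpha^{i}E^{\alpha}, && \alpha\in\Phi,\ 1\le i\le \mathrm{rank}\,g;\\
[E^{\alpha}, E^{\beta}] &= \epsilon(\alpha,\beta)E^{\alpha+\beta}, && \alpha,\beta,\alpha+\beta\in\Phi;\\
&= \frac{2}{\alpha^{2}}\,\alpha\cdot H, && \beta = -\alpha\in\Phi;\\
&= 0, && \text{otherwise.}
\end{aligned}\tag{4.30}$$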
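(We omit the suffix 0 on $H$, $E^{\alpha}$.) With this choice of basis, the quadratic Casimir operator is
$$\begin{aligned}
J^{a}J^{a} &= H^{2} + \sum_{\alpha>0}\frac{\alpha^{2}}{2}\left(E^{-\alpha}E^{\alpha} + E^{\alpha}E^{-\alpha}\right)\\
&= H^{2} + 2\delta\cdot H + \sum_{\alpha>0}\alpha^{2}E^{-\alpha}E^{\alpha}\\
&= (H+\delta)^{2} - \delta^{2} + \sum_{\alpha>0}\alpha^{2}E^{-\alpha}E^{\alpha}, \qquad \delta = \frac12\sum_{\alpha>0}\alpha.
\end{aligned}\tag{4.31}$$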
Current Algebra on the Torus
247
The value of $J^{2}$ in a representation with highest weight $\lambda$ can be obtained by evaluating this on the highest weight state $|\lambda\rangle$, which has $E^{\alpha}|\lambda\rangle = 0$ for $\alpha > 0$. Thus the value of $J^{2}$ in this representation is
$$\lambda\cdot(\lambda+2\delta) = (\lambda+\delta)^{2} - \delta^{2}. \tag{4.32}$$
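If $\xi^{a_1a_2\ldots a_n}$ is any invariant tensor for $g$,
$$C_{\xi} = \xi^{a_1a_2\ldots a_n}J_0^{a_1}J_0^{a_2}\ldots J_0^{a_n}\in Z(U(g)), \tag{4.33}$$
and we can evaluate its value in the representation with highest weight $\lambda$ by expressing it in the Cartan-Weyl basis and moving the $E^{\alpha}$, $\alpha>0$, to the right. Because $[H^{i}, C_{\xi}] = 0$, we can write
$$C_{\xi} = \phi_{\xi}(H) + \sum_{\alpha>0}F_{\xi,\alpha}E^{\alpha}, \quad\text{for suitable } F_{\xi,\alpha}\in U(g), \tag{4.34}$$
where $\phi_{\xi}(H)$ is a polynomial of degree $n$ in the $H^{i}$. Then $C_{\xi}|\lambda\rangle = \phi_{\xi}(\lambda)|\lambda\rangle$, so that $C_{\xi}$ takes the value $\phi_{\xi}(\lambda)$ in the representation with highest weight $\lambda$. Given two such invariant tensors $\xi_1$, $\xi_2$,
$$\begin{aligned}
C_{\xi_1}C_{\xi_2} &= \Big(\phi_{\xi_1}(H) + \sum_{\alpha>0}F_{\xi_1,\alpha}E^{\alpha}\Big)\Big(\phi_{\xi_2}(H) + \sum_{\beta>0}F_{\xi_2,\beta}E^{\beta}\Big)\\
&= \phi_{\xi_1}(H)\phi_{\xi_2}(H) + \sum_{\alpha>0}F_{\xi_1,\alpha}E^{\alpha}\phi_{\xi_2}(H) + \sum_{\beta>0}\phi_{\xi_1}(H)F_{\xi_2,\beta}E^{\beta} + \sum_{\alpha,\beta>0}F_{\xi_1,\alpha}E^{\alpha}F_{\xi_2,\beta}E^{\beta}\\
&= \phi_{\xi_1}(H)\phi_{\xi_2}(H) + \sum_{\alpha>0}F_{\xi_1,\alpha}\phi_{\xi_2}(H-\alpha)E^{\alpha} + \sum_{\beta>0}\phi_{\xi_1}(H)F_{\xi_2,\beta}E^{\beta} + \sum_{\alpha,\beta>0}F_{\xi_1,\alpha}E^{\alpha}F_{\xi_2,\beta}E^{\beta},
\end{aligned}\tag{4.35}$$
where every term after the first ends in some $E^{\gamma}$ with $\gamma>0$, so that
$$\phi_{\xi_1\xi_2}(H) = \phi_{\xi_1}(H)\phi_{\xi_2}(H). \tag{4.36}$$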
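If $\phi_{\xi_1} = \phi_{\xi_2}$, then $C_{\xi_1} = C_{\xi_2}$ acting in each highest weight representation of $g$. It follows from this that $C_{\xi_1} = C_{\xi_2}$ as elements of $U(g)$ (see, e.g., [29], p. 251). Thus $C_{\xi}\mapsto\phi_{\xi}$ defines a map $Z(U(g))\to S(h)$, which is an algebra homomorphism and is one-to-one.

The elements $\phi_{\xi}\in S(h)$ obtained in this way have an invariance under the Weyl group, $W_g$, of $g$ as we shall now show (see [30], p. 130, or [29], p. 246). Consider the action of $\phi_{\xi}$ in the infinite-dimensional representation, $\tilde V_{\lambda}$, with highest weight $\lambda$, where $\lambda\in\Lambda_g$, the weight lattice of $g$, with $\alpha\cdot\lambda\ge 0$ for $\alpha>0$, whose states are generated by the action of $E^{-\alpha}$, $\alpha>0$, on a state $|\lambda\rangle$. The finite-dimensional representation, $V_{\lambda}$, is the quotient of $\tilde V_{\lambda}$ by its largest invariant subspace. Taking a basis of simple roots, $\alpha_1,\alpha_2,\ldots,\alpha_r$, $r = \mathrm{rank}\,g$, with $m_i = 2\alpha_i\cdot\lambda/\alpha_i^{2}\in\mathbb{Z}$ and $\alpha_i\cdot\lambda\ge 0$, consider $(E^{-\alpha_i})^{m_i+1}|\lambda\rangle$. Now
$$\alpha_i\cdot H\,(E^{-\alpha_i})^{s}|\lambda\rangle = \tfrac12\alpha_i^{2}(m_i-2s)\,(E^{-\alpha_i})^{s}|\lambda\rangle,$$
so
$$E^{\alpha_i}(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = \sum_{s=0}^{m_i}(m_i-2s)\,(E^{-\alpha_i})^{m_i}|\lambda\rangle = 0,$$
and $E^{\alpha_j}(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = 0$ for $i\ne j$ because $[E^{\alpha_j}, E^{-\alpha_i}] = 0$, $i\ne j$. It follows that $E^{\alpha}(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = 0$ for $\alpha>0$ and so the states $(E^{-\alpha_i})^{m_i+1}|\lambda\rangle\ne 0$ generate an invariant subspace of $\tilde V_{\lambda}$ (which is divided out in the construction of $V_{\lambda}$). Then
$$C_{\xi}(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = \phi_{\xi}(H)(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = \phi_{\xi}(\lambda - m_i\alpha_i - \alpha_i)(E^{-\alpha_i})^{m_i+1}|\lambda\rangle. \tag{4.37}$$
But, on the other hand,
$$C_{\xi}(E^{-\alpha_i})^{m_i+1}|\lambda\rangle = (E^{-\alpha_i})^{m_i+1}C_{\xi}|\lambda\rangle = \phi_{\xi}(\lambda)(E^{-\alpha_i})^{m_i+1}|\lambda\rangle. \tag{4.38}$$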
Thus, for each simple root, αi , φξ (λ) = φξ (λ − m i αi − αi ).
(4.39)
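If $\sigma_i$ denotes the element of the Weyl group $W_g$ corresponding to reflection in the hyperplane orthogonal to $\alpha_i$, then $\sigma_i(\lambda) = \lambda - m_i\alpha_i$, and $\sigma_i(\delta) = \delta - \alpha_i$, because $2\delta\cdot\alpha_i/\alpha_i^{2} = 1$ for each simple root $\alpha_i$. Thus (4.39) can be rewritten
$$\phi_{\xi}(\lambda) = \phi_{\xi}(\sigma_i(\lambda+\delta)-\delta), \tag{4.40}$$
and, if we define $\tilde\phi_{\xi}(\lambda) = \phi_{\xi}(\lambda-\delta)$, then
$$\tilde\phi_{\xi}(\lambda) = \phi_{\xi}(\lambda-\delta) = \phi_{\xi}(\sigma_i(\lambda)-\delta) = \tilde\phi_{\xi}(\sigma_i(\lambda)). \tag{4.41}$$
Because the reflections in the simple roots, $\sigma_i$, generate the Weyl group $W_g$, $\tilde\phi_{\xi}(\lambda) = \phi_{\xi}(\lambda-\delta)$ defines a function invariant under the whole Weyl group. Thus $C_{\xi}\mapsto\tilde\phi_{\xi}$ defines a homomorphism $Z(U(g))\to S(h)^{W}$, the polynomials in $H$ invariant under the Weyl group. In fact, this map is an isomorphism, called the Harish-Chandra isomorphism. That $C_{\xi}\mapsto\tilde\phi_{\xi}$ is onto follows from the fact that $S(h)^{W}$ is spanned by $\phi_{\xi}$ for $\xi^{a_1a_2\ldots a_n} = \mathrm{tr}(t^{a_1}t^{a_2}\ldots t^{a_n})$, where the $t^{a}$ are the representations of $J_0^{a}$ in the finite-dimensional representation $V_{\lambda}$, $\lambda\in\Lambda_g$ (see, e.g., [29], p. 253). Now, writing
$$\mathrm{tr}\left(e^{iH\cdot\theta}w^{L_0}\right) = \chi(\theta,\tau) = \sum_{\lambda\in\Lambda_g^{+}}b_{\lambda}(w)\chi^{\lambda}(\theta), \tag{4.42}$$
where $\Lambda_g^{+} = \{\lambda\in\Lambda_g : \alpha\cdot\lambda\ge 0 \text{ for } \alpha>0\}$,
$$\mathrm{tr}\left(C_{\xi}w^{L_0}\right) = \sum_{\lambda\in\Lambda_g^{+}}b_{\lambda}(w)\,\tilde\phi_{\xi}(\lambda+\delta)\dim V_{\lambda}, \tag{4.43}$$
and the character for the finite-dimensional representation $V_{\lambda}$ of $g$, $\chi^{\lambda}(\theta)$, is given by the Weyl character formula,
$$\chi^{\lambda}(\theta) = \frac{1}{\Delta_g(\theta)}\sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta+\lambda)\cdot\theta}, \tag{4.44}$$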
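with $\epsilon(\sigma) = \pm1$ being the determinant of $\sigma$, and the Weyl denominator being given by
$$\Delta_g(\theta) = \prod_{\alpha>0}\left(e^{\frac{i}{2}\alpha\cdot\theta} - e^{-\frac{i}{2}\alpha\cdot\theta}\right), \tag{4.45}$$
where the product is over the positive roots of $g$. (See [30], p. 139.) The dimension $\dim V_{\lambda} = \chi^{\lambda}(0)$, but to evaluate this from (4.44), we need to take a limit on the right-hand side. In fact $\Delta_g(\theta) = O(\theta^{n_+})$, as $\theta\to0$, where $n_+$ is the number of positive roots of $g$. Now
$$\begin{aligned}
\prod_{\alpha>0}\alpha\cdot\partial_{\theta}\sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta+\lambda)\cdot\theta} &= \sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta+\lambda)\cdot\theta}\prod_{\alpha>0} i\,\sigma(\delta+\lambda)\cdot\alpha\\
&= \sum_{\sigma\in W_g}\epsilon(\sigma)\prod_{\alpha>0} i\,(\delta+\lambda)\cdot\sigma^{-1}(\alpha), \quad\text{when } \theta=0.
\end{aligned}\tag{4.46}$$
As $\alpha$ runs over the positive roots of $g$, $\sigma(\alpha)$ will range over a set obtained from the positive roots by reversing some of their signs. The product of these sign changes equals $\epsilon(\sigma) = \epsilon(\sigma^{-1})$. Hence the sign changes cancel the effect of $\epsilon(\sigma)$ in (4.46) and we have
$$\prod_{\alpha>0}\alpha\cdot\partial_{\theta}\sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta+\lambda)\cdot\theta}\bigg|_{\theta=0} = i^{n_+}|W_g|\prod_{\alpha>0}(\delta+\lambda)\cdot\alpha. \tag{4.47}$$
Since $\chi^{0}(\theta) = 1$,
$$\Delta_g(\theta) = \sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta)\cdot\theta}, \tag{4.48}$$
and, hence,
$$\prod_{\alpha>0}\alpha\cdot\partial_{\theta}\,\Delta_g(\theta)\bigg|_{\theta=0} = i^{n_+}|W_g|\prod_{\alpha>0}\delta\cdot\alpha, \tag{4.49}$$
and
$$\dim V_{\lambda} = \chi^{\lambda}(0) = \prod_{\alpha>0}\frac{(\delta+\lambda)\cdot\alpha}{\delta\cdot\alpha}. \tag{4.50}$$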
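The Casimir value (4.32) and the dimension formula (4.50) can be checked by hand in a small case. The following is a minimal sketch for $g = su(3)$, assuming an illustrative embedding of the roots in $\mathbb{R}^3$ with $\alpha^2 = 2$; the function names and the Dynkin-label parametrization $(m_1, m_2)$ of $\lambda$ are choices made here, not notation from the paper.

```python
from fractions import Fraction

# Assumed realization of the su(3) root system in R^3, normalized so alpha^2 = 2.
SIMPLE_ROOTS = [(1, -1, 0), (0, 1, -1)]
POSITIVE_ROOTS = SIMPLE_ROOTS + [(1, 0, -1)]

# Fundamental weights w_i, defined by 2 w_i . alpha_j / alpha_j^2 = delta_ij.
FUNDAMENTAL_WEIGHTS = [
    (Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)),
    (Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)),
]


def dot(x, y):
    """Euclidean inner product, kept exact with Fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))


def add(x, y):
    return tuple(Fraction(a) + Fraction(b) for a, b in zip(x, y))


def scale(c, x):
    return tuple(Fraction(c) * Fraction(a) for a in x)


def weyl_vector():
    """delta = half the sum of the positive roots."""
    total = (0, 0, 0)
    for alpha in POSITIVE_ROOTS:
        total = add(total, alpha)
    return scale(Fraction(1, 2), total)


def highest_weight(m1, m2):
    """Weight with Dynkin labels m_i = 2 alpha_i . lambda / alpha_i^2."""
    return add(scale(m1, FUNDAMENTAL_WEIGHTS[0]), scale(m2, FUNDAMENTAL_WEIGHTS[1]))


def weyl_dimension(m1, m2):
    """Product over positive roots of (delta + lambda).alpha / delta.alpha."""
    delta = weyl_vector()
    shifted = add(delta, highest_weight(m1, m2))
    result = Fraction(1)
    for alpha in POSITIVE_ROOTS:
        result *= dot(shifted, alpha) / dot(delta, alpha)
    return result


def quadratic_casimir(m1, m2):
    """Value lambda . (lambda + 2 delta) of the quadratic Casimir."""
    lam = highest_weight(m1, m2)
    return dot(lam, add(lam, scale(2, weyl_vector())))


if __name__ == "__main__":
    for labels in [(1, 0), (1, 1), (2, 0)]:
        print(labels, int(weyl_dimension(*labels)), quadratic_casimir(*labels))
```

With this normalization the triplet, octet and sextet come out with dimensions 3, 8 and 6, and the two forms of the Casimir value in (4.32) agree.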
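Applying
$$(\beta\cdot\partial_{\theta})^{n}\prod_{\alpha>0}\alpha\cdot\partial_{\theta} \tag{4.51}$$
to the equation
$$\chi^{\lambda}(\theta)\Delta_g(\theta) = \sum_{\sigma\in W_g}\epsilon(\sigma)e^{i\sigma(\delta+\lambda)\cdot\theta}, \tag{4.52}$$
we obtain
$$\begin{aligned}
(\beta\cdot\partial_{\theta})^{n}\prod_{\alpha>0}\alpha\cdot\partial_{\theta}\,\chi^{\lambda}(\theta)\Delta_g(\theta)\bigg|_{\theta=0} &= i^{n_++n}\prod_{\alpha>0}(\delta+\lambda)\cdot\alpha\sum_{\sigma\in W_g}(\sigma(\delta+\lambda)\cdot\beta)^{n}\\
&= i^{n_++n}\prod_{\alpha>0}\delta\cdot\alpha\sum_{\sigma\in W_g}(\sigma(\delta+\lambda)\cdot\beta)^{n}\dim V_{\lambda},
\end{aligned}\tag{4.53}$$
and so
$$\prod_{\alpha>0}\frac{\alpha\cdot p_{\theta}}{\alpha\cdot\delta}\,(\beta\cdot p_{\theta})^{n}\,\chi(\theta,\tau)\Delta_g(\theta)\bigg|_{\theta=0} = \sum_{\lambda}b_{\lambda}(w)\sum_{\sigma\in W_g}(\sigma(\delta+\lambda)\cdot\beta)^{n}\dim V_{\lambda}, \tag{4.54}$$
where $p_{\theta} = -i\partial_{\theta}$. If
$$\tilde\phi(\lambda) = \sum_{\sigma\in W_g}(\beta\cdot\sigma(\lambda))^{n} = \sum_{\sigma\in W_g}(\sigma(\beta)\cdot\lambda)^{n}, \tag{4.55}$$
then
$$\begin{aligned}
\tilde\phi(p_{\theta})\prod_{\alpha>0}\frac{\alpha\cdot p_{\theta}}{\alpha\cdot\delta}\,\chi(\theta,\tau)\Delta_g(\theta)\bigg|_{\theta=0} &= \sum_{\lambda}b_{\lambda}(w)\sum_{\sigma,\sigma'\in W_g}(\sigma(\delta+\lambda)\cdot\sigma'(\beta))^{n}\dim V_{\lambda}\\
&= |W_g|\sum_{\lambda}b_{\lambda}(w)\,\tilde\phi(\delta+\lambda)\dim V_{\lambda}.
\end{aligned}\tag{4.56}$$
The functions (4.55) span the polynomial functions $\tilde\phi(\lambda)$ invariant under the Weyl group and so (4.56) holds for any such function. From (4.43), it follows that
$$\mathrm{tr}\left(C_{\xi}w^{L_0}\right) = \frac{1}{|W_g|}\,\tilde\phi_{\xi}(p_{\theta})\prod_{\alpha>0}\frac{\alpha\cdot p_{\theta}}{\alpha\cdot\delta}\,\chi(\theta,\tau)\Delta_g(\theta)\bigg|_{\theta=0}. \tag{4.57}$$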
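The symmetrized products of the primitive symmetric invariant tensors form a basis for all symmetric invariant tensors. Suppose $\omega_j^{a_1a_2\ldots a_n}$, $1\le j\le N$, forms an orthonormal basis for the symmetric invariant tensors of order $n$, so that
$$\omega_j^{a_1a_2\ldots a_n}\,\omega_k^{a_1a_2\ldots a_n} = \delta_{jk}. \tag{4.58}$$
Then we can write
$$\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\ldots J_0^{a_n}w^{L_0}\right)_{S} = \sum_{j=1}^{N}f_j(w)\,\omega_j^{a_1a_2\ldots a_n}, \tag{4.59}$$
where
$$f_j(w) = \mathrm{tr}\left(C_{\omega_j}w^{L_0}\right) \tag{4.60}$$
and
$$\begin{aligned}
\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\ldots J_0^{a_n}w^{L_0}\right)_{S}
&= \frac{1}{|W_g|}\sum_{j=1}^{N}\omega_j^{a_1a_2\ldots a_n}\,\tilde\phi_{\omega_j}(p_{\theta})\prod_{\alpha>0}\frac{\alpha\cdot p_{\theta}}{\alpha\cdot\delta}\,\chi(\theta,\tau)\Delta_g(\theta)\bigg|_{\theta=0}\\
&= \frac{1}{|W_g|}\sum_{j=1}^{N}\omega_j^{a_1a_2\ldots a_n}\sum_{\sigma\in W_g}\epsilon(\sigma)\,\tilde\phi_{\omega_j}(p_{\theta}+\sigma(\delta))\prod_{\alpha>0}\frac{\alpha\cdot(p_{\theta}+\sigma(\delta))}{\alpha\cdot\delta}\,\chi(\theta,\tau)\bigg|_{\theta=0}\\
&= \frac{1}{|W_g|}\sum_{j=1}^{N}\omega_j^{a_1a_2\ldots a_n}\sum_{\sigma\in W_g}\phi_{\omega_j}(\sigma(p_{\theta}))\prod_{\alpha>0}\frac{\alpha\cdot(\delta+\sigma(p_{\theta}))}{\alpha\cdot\delta}\,\chi(\theta,\tau)\bigg|_{\theta=0}.
\end{aligned}\tag{4.61}$$

5. Summary and Conclusions

In this paper, we have constructed a general formula for the loop amplitude
$$\mathrm{tr}\left(J^{a_1}(\rho_1)J^{a_2}(\rho_2)\ldots J^{a_n}(\rho_n)w^{L_0}\right), \tag{5.1}$$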
where the currents $J^{a}(\rho)$ satisfy the operator product expansion
$$J^{a}(z_1)J^{b}(z_2)\sim\frac{\kappa^{ab}}{(z_1-z_2)^{2}} + \frac{f^{ab}{}_{c}\,J^{c}(z_2)}{z_1-z_2}, \tag{5.2}$$
which is equivalent to the affine algebra $\hat g$, defined by (1.1). This formula extends the Frenkel-Zhu construction for tree amplitudes [1], described in Sect. 2.2, and generalizes the results obtained when the currents are given as bilinear expressions in fermionic fields, which are reviewed in Sect. 3.1. The general formula is described graphically by summing over all graphs with $n$ vertices where the vertices carry the labels $a_1, a_2, \ldots, a_n$ and each vertex is connected by directed lines to other vertices, one of the lines at each vertex pointing towards it and one away from it. Each graph consists of a number, $r$, of directed "loops" or cycles, $\xi = (i_1, i_2, \ldots, i_\ell)$, with which we associate an expression $f_{\xi}$. The expression associated with the whole graph consists of a factor of $1/2\pi i\rho_j$ for each current $J^{a_j}(\rho_j)$ and $-f_{\xi_i}$, $1\le i\le r$, for each cycle,
$$\sum_{\text{diagrams}}\left[\prod_{j=1}^{n}\frac{1}{2\pi i\rho_j}\right](-1)^{r}\prod_{i=1}^{r}f_{\xi_i}. \tag{5.3}$$
For $\xi = (i_1, i_2, \ldots, i_\ell)$,
$$f_{\xi} = \sum_{m=0}^{\ell}\kappa_{\ell,m}^{a_{i_1}a_{i_2}\ldots a_{i_\ell}}\,H_{\ell,m}(\nu_{i_1i_2},\ldots,\nu_{i_{\ell-1}i_\ell},\nu_{i_\ell i_1},\tau). \tag{5.4}$$
The functions H,m are defined in terms of the Weierstrass σ function by ∞ σ (µ j + ν, τ ) ν H,m (µ1 , µ2 , . . . , µ , τ )ν m , = σ (µ j , τ ) σ (ν, τ )
(5.5)
m=0
j=1
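and the invariant tensors $\kappa_{\ell,m}$ are defined inductively by the equations
$$\kappa_{\ell,m}^{a_1a_2\ldots a_\ell}(\tau) - \kappa_{\ell,m}^{a_2a_1\ldots a_\ell}(\tau) = f^{a_1a_2}{}_{b}\,\kappa_{\ell-1,m-1}^{ba_3\ldots a_\ell}(\tau), \tag{5.6}$$
$$\kappa_{\ell,m}^{a_1a_2\ldots a_\ell}(\tau) = \kappa_{\ell,m}^{a_2\ldots a_\ell a_1}(\tau), \tag{5.7}$$
together with the requirement that $\kappa_{\ell,m}$ be orthogonal to all symmetric tensors for $m>0$ and $\ell>2$, and the initial condition that $\kappa_{2,2} = \kappa$, $\kappa_{2,1} = 0$ and $\kappa_{\ell,0}(\tau) = \omega_{\ell}(\tau)$, where the symmetric invariant tensor
$$\omega_{\ell}^{a_1a_2\ldots a_\ell}(\tau) = -\frac{(2\pi i)^{\ell}}{(\ell-1)!\,\chi(\tau)}\,\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\ldots J_0^{a_\ell}w^{L_0}\right)_{C,S}, \qquad \ell\ge3; \tag{5.8}$$
$$\omega_2^{ab} = \frac{4\pi^{2}}{\chi(\tau)}\,\mathrm{tr}\left(J_0^{a}J_0^{b}w^{L_0}\right) - 2\delta^{ab}k\eta(\tau). \tag{5.9}$$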
A proof that $\kappa_{\ell,m}$ exists and is defined uniquely by (5.6) and (5.7) is given in Sect. 2.3 and an algorithmic method for constructing them inductively using Young tableaux is given in Appendix A. The results described so far in this section apply to the affine algebra, $\hat g$, associated with any finite-dimensional Lie algebra, $g$. In Sect. 4.3 we gave a method for calculating the traces of zero modes, necessary to determine the symmetric tensors $\omega_{\ell}$, in terms of the character
$$\chi(\theta,\tau) = \mathrm{tr}\left(e^{iH\cdot\theta}w^{L_0}\right) \tag{5.10}$$
of the representation provided by the space of states of the theory. The method would apply to any compact $g$ but we took it to be simple for ease of exposition. The "connected" symmetrized trace (5.8) is defined in Sect. 4.2 in terms of the "full" symmetrized traces, which themselves can be expanded in terms of an orthonormal basis of symmetric invariant tensors of order $\ell$,
$$\mathrm{tr}\left(J_0^{a_1}J_0^{a_2}\ldots J_0^{a_\ell}w^{L_0}\right)_{S} = \sum_{j=1}^{N}f_j(w)\,\omega_j^{a_1a_2\ldots a_\ell}, \tag{5.11}$$
where $N$ is the number of independent symmetric invariant tensors of order $\ell$, $f_j(w) = \mathrm{tr}\left(C_{\omega_j}w^{L_0}\right)$ and the Casimir operator $C_{\omega_j} = \omega_j^{a_1a_2\ldots a_\ell}J_0^{a_1}J_0^{a_2}\ldots J_0^{a_\ell}$. In Sect. 4.3, we reviewed how a normal ordering of the $J_0^{a}$ in $C_{\omega_j}$, obtained by writing $C_{\omega_j} = \phi_{\omega_j}(H) + N_j$, where $H$ denotes the elements of a Cartan subalgebra and $N_j$ annihilates highest weight states, is such that $C_{\omega_j}\mapsto\phi_{\omega_j}(H)$ defines the Harish-Chandra isomorphism of the center of the enveloping algebra of $g$ (that is, the ring of Casimir operators) onto the polynomials in $H$ invariant under the action of the Weyl group, $W_g$, of $g$. This leads to the expression
$$\frac{1}{|W_g|}\sum_{j=1}^{N}\omega_j^{a_1a_2\ldots a_n}\sum_{\sigma\in W_g}\phi_{\omega_j}(\sigma(p_{\theta}))\prod_{\alpha>0}\frac{\alpha\cdot(\delta+\sigma(p_{\theta}))}{\alpha\cdot\delta}\,\chi(\theta,\tau)\bigg|_{\theta=0} \tag{5.12}$$
for the symmetrized trace (5.11), where pθ = −i∂θ , α denotes a root of g and δ denotes half the sum of positive roots. With this we have assembled all the elements of an explicit expression for the loop amplitude (5.1). In Appendix C, this is compared with expressions given previously [3] for n = 2, 3, 4. Acknowledgements. We are grateful to Matthias Gaberdiel for helpful correspondence. LD thanks the Institute for Advanced Study at Princeton for its hospitality, and was partially supported by the U.S. Department of Energy, Grant No. DE-FG01-06ER06-01, Task A.
A. Explicit Construction of the Tensor $\kappa_n$ in Terms of $\kappa_{n-1}$

As a preparation for giving an explicit construction of the tensor $\kappa_n$ in terms of $\kappa_{n-1}$, we review some salient features of the representation theory of $S_n$ (see e.g. [31], p. 44). The number of inequivalent irreducible representations of $S_n$, the group of permutations of $n$ objects, is $p(n)$, the number of partitions of $n$. Each partition, $p = (p_1,\ldots,p_m)$, with $p_i\ge p_j$ if $i\le j$ and $\sum_{i=1}^{m}p_i = n$, determines a Young diagram, consisting of $n$ boxes arranged into $m$ rows and $p_1$ columns, with $p_i$ boxes in the $i$th row and the number of boxes in the $j$th column equal to the number of $p_k\ge j$. We can identify the partition $p$ with the corresponding Young diagram. The Young diagrams label the inequivalent irreducible representations. Given a Young diagram $p$, a Young tableau, $\lambda$, is defined by an assignment of the integers $1,\ldots,n$ to the $n$ boxes of $p$. This gives $n!$ Young tableaux associated with a given Young diagram. A standard Young tableau is one for which the numbers assigned to the boxes increase along each row (from left to right) and down each column. The number of standard Young tableaux associated with the Young diagram $p$ is
$$d_p = \frac{n!}{\ell_1!\ldots\ell_m!}\prod_{i<j}(\ell_i-\ell_j), \qquad\text{where } \ell_j = p_j + m - j, \tag{A.1}$$
and this is also the dimension of the irreducible representation associated with $p$. The regular representation of $S_n$, $V$, which consists of linear combinations $\sum_{g\in S_n}x_g\,g$ of elements of $S_n$, contains $d_p$ representations of the type labeled by $p$, so that $|S_n| = \sum_p d_p^{2}$, which we can regard as being labeled by the standard Young tableaux associated with $p$. We label these $\lambda_i^{p}$, $1\le i\le d_p$. Given a Young tableau, $\lambda$, we define $A_{\lambda}$ to be the subgroup of $S_n$ consisting of those permutations which map each row of $\lambda$ into itself and define $B_{\lambda}$ to be the subgroup of $S_n$ consisting of those permutations which map each column of $\lambda$ into itself. Let
$$a_{\lambda} = \sum_{\rho\in A_{\lambda}}\rho, \qquad b_{\lambda} = \sum_{\rho\in B_{\lambda}}\epsilon(\rho)\rho, \tag{A.2}$$
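where $\epsilon(\rho)$ denotes the sign of the permutation $\rho$. Then
$$\rho a_{\lambda} = a_{\lambda}\rho = a_{\lambda}, \quad \rho\in A_{\lambda}; \qquad \rho b_{\lambda} = b_{\lambda}\rho = \epsilon(\rho)b_{\lambda}, \quad \rho\in B_{\lambda}.$$
Define the Young symmetrizer
$$c_{\lambda} = a_{\lambda}b_{\lambda}. \tag{A.3}$$
Then
$$c_{\lambda}^{2} = N_p c_{\lambda}, \qquad\text{where } N_p = \frac{n!}{d_p}, \tag{A.4}$$
and
$$c_{\lambda}c_{\mu} = 0, \tag{A.5}$$
if $\lambda$, $\mu$ have different shapes, i.e. are associated with different Young diagrams (partitions). If the distinct Young tableaux $\lambda$, $\mu$ are associated with the same Young diagram, $p$, we can find a permutation $\sigma_{\lambda\mu}\in S_n$, which takes $\mu$ into $\lambda$; then
$$a_{\lambda} = \sigma_{\lambda\mu}a_{\mu}\sigma_{\mu\lambda}, \qquad b_{\lambda} = \sigma_{\lambda\mu}b_{\mu}\sigma_{\mu\lambda}, \qquad c_{\lambda} = \sigma_{\lambda\mu}c_{\mu}\sigma_{\mu\lambda}, \qquad \sigma_{\mu\lambda} = \sigma_{\lambda\mu}^{-1}. \tag{A.6}$$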
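The dimension formula (A.1) and the counting identity $|S_n| = \sum_p d_p^2$ are easy to test numerically. The sketch below is an illustration only; the helper names `partitions` and `dimension` are introduced here and are not from the text.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples (p_1, ..., p_m)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dimension(p):
    """d_p = n! / (l_1! ... l_m!) * prod_{i<j} (l_i - l_j), l_j = p_j + m - j."""
    n = sum(p)
    m = len(p)
    # The text indexes rows from 1; Python indexes from 0, hence (j + 1).
    ell = [p[j] + m - (j + 1) for j in range(m)]
    numerator = factorial(n)
    for i in range(m):
        for j in range(i + 1, m):
            numerator *= ell[i] - ell[j]
    denominator = 1
    for value in ell:
        denominator *= factorial(value)
    return numerator // denominator


if __name__ == "__main__":
    for n in range(1, 7):
        dims = [dimension(p) for p in partitions(n)]
        print(n, dims, sum(d * d for d in dims) == factorial(n))
```

For $n = 3$ the shapes $(3)$, $(2,1)$, $(1,1,1)$ give dimensions 1, 2, 1, whose squares sum to $3! = 6$.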
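Further (see [32], p. 393, or [33], p. 75), either there exists a pair $(j,k)$ contained in a single column of $\lambda$ and a single row of $\mu$, in which case, if $t\in S_n$ is the transposition interchanging $j$ and $k$, $t\in B_{\lambda}\cap A_{\mu}$, $t^{2} = 1$, so that
$$b_{\lambda}a_{\mu} = b_{\lambda}t^{2}a_{\mu} = -b_{\lambda}a_{\mu}, \quad\text{implying } c_{\lambda}c_{\mu} = 0, \tag{A.7}$$
or the elements of each given column of $\lambda$ are in different rows in $\mu$, in which case
$$\sigma_{\lambda\mu} = \beta_{\lambda\mu}\alpha_{\lambda\mu}, \quad\text{for some } \alpha_{\lambda\mu}\in A_{\mu},\ \beta_{\lambda\mu}\in B_{\lambda}, \tag{A.8}$$
so that
$$b_{\lambda}a_{\mu} = \epsilon_{\lambda\mu}b_{\lambda}\beta_{\lambda\mu}\alpha_{\lambda\mu}a_{\mu} = \epsilon_{\lambda\mu}b_{\lambda}\sigma_{\lambda\mu}a_{\mu}, \quad\text{where } \epsilon_{\lambda\mu} = \epsilon(\beta_{\lambda\mu}), \tag{A.9}$$
implying $c_{\lambda}c_{\mu} = N_p\epsilon_{\lambda\mu}\sigma_{\lambda\mu}c_{\mu} = N_p\epsilon_{\lambda\mu}c_{\lambda}\sigma_{\lambda\mu}$. If we define the normalized Young symmetrizer $\hat c_{\lambda} = c_{\lambda}/N_p$, and $\hat a_{\lambda} = a_{\lambda}/\sqrt{N_p}$, $\hat b_{\lambda} = b_{\lambda}/\sqrt{N_p}$, and $\lambda$, $\mu$ have the same shape, then
$$\hat b_{\lambda}\hat a_{\mu} = \epsilon_{\lambda\mu}\hat b_{\lambda}\sigma_{\lambda\mu}\hat a_{\mu}, \qquad \hat c_{\lambda}\hat c_{\mu} = \epsilon_{\lambda\mu}\hat c_{\lambda}\sigma_{\lambda\mu} = \epsilon_{\lambda\mu}\sigma_{\lambda\mu}\hat c_{\mu}, \tag{A.10}$$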
where $\epsilon_{\lambda\lambda} = 1$, $\epsilon_{\lambda\mu} = \epsilon(\beta_{\lambda\mu})$ if the elements of each given column of $\lambda$ are in different rows in $\mu$, and $\epsilon_{\lambda\mu} = 0$ otherwise.

We write $\lambda_i\equiv\lambda_i^{p}$, $1\le i\le d_p$, for the $d_p$ standard Young tableaux of type $p$, in lexicographical order; that is, if $i<j$ and we compare the entries of integers in the boxes of $\lambda_i$ and $\lambda_j$ reading along each row from left to right starting with the first row and proceeding to the second, and so on, then for the first discrepancy the integer in the relevant box in $\lambda_j$ is greater than the one in the corresponding box in $\lambda_i$; in this case we write $\lambda_i<\lambda_j$ if $i<j$. Then, writing $a_i = a_{\lambda_i}$, $b_i = b_{\lambda_i}$, $\sigma_{ij} = \sigma_{\lambda_i\lambda_j}$,
$$b_ia_j = 0 \ \text{ if } i>j, \qquad \hat b_i\hat a_j = \epsilon_{ij}\hat b_i\sigma_{ij}\hat a_j, \tag{A.11}$$
where $\epsilon_{ij}$ is defined as in (A.10), for $i\le j$. For each Young tableau $\lambda$ of type $p$,
$$V_{\lambda} = Vc_{\lambda} \tag{A.12}$$
defines an irreducible representation subspace of the regular representation $V$ of type $p$, dimension $d_p$. The spaces $V_{\lambda_i}$, $1\le i\le d_p$, provide $d_p$ irreducible representations of type $p$ in $V$. In fact,
$$V\cong\bigoplus_{p}\bigoplus_{i=1}^{d_p}V_{\lambda_i^{p}}. \tag{A.13}$$
Corresponding to this decomposition into irreducible components, a basis for $V$ is provided by
$$\left\{\sigma_{\lambda_i^{p}\lambda_j^{p}}\hat c_{\lambda_j^{p}} = \hat a_{\lambda_i^{p}}\sigma_{\lambda_i^{p}\lambda_j^{p}}\hat b_{\lambda_j^{p}} : 1\le i,j\le d_p;\ p\in P(n)\right\}, \tag{A.14}$$
where $P(n)$ denotes the set of partitions of $n$. To establish that this is a basis, it is enough to show that the states $\hat a_i\sigma_{ij}\hat b_j = \sigma_{ij}\hat c_j$, $1\le i,j\le d_p$, using the notation of (A.11), are linearly independent. If
$$\sum_{1\le i,j\le d_p}x_{ij}\sigma_{ij}\hat c_j = 0, \quad\text{then}\quad \sum_{1\le i,j\le d_p}x_{ij}\hat c_{\ell}\sigma_{ij}\hat c_j\hat c_k = 0,$$
implying
$$\sum_{\ell\le i,\ j\le k}x_{ij}\hat c_{\ell}\sigma_{ij}\hat c_j\hat c_k = 0. \tag{A.15}$$
Suppose some $x_{ij}\ne0$; choose $\ell$ so that $\ell$ is the largest value of $i$ for which this is true and then $k$ so that it is the smallest value of $j$ for which $x_{\ell j}\ne0$. Then all the terms on the left hand side of (A.15) are zero except for one, leaving $x_{\ell k}\hat c_{\ell}\sigma_{\ell k}\hat c_k\hat c_k = x_{\ell k}\hat c_{\ell}\sigma_{\ell k} = 0$, which implies $x_{\ell k} = 0$, a contradiction. Thus, we conclude that $x_{ij} = 0$ for all $i$, $j$ and so that the states (A.14) form a basis.

Now we seek to determine $x_{ij}$, $1\le i<j\le d_p$, so that
$$P_p = \sum_{1\le i\le d_p}\hat c_i + \sum_{1\le i<j\le d_p}x_{ij}\sigma_{ij}\hat c_j \tag{A.16}$$
is the projection onto the spaces corresponding to the standard Young tableaux of shape $p$,
$$V_p = \bigoplus_{i=1}^{d_p}V_{\lambda_i^{p}}. \tag{A.17}$$
(See [33], p. 76.) A necessary and sufficient condition for this is $P_p\sigma_{ij}\hat c_j = \sigma_{ij}\hat c_j$ for $1\le i,j\le d_p$. If this holds we will have $\sum_p P_p = 1$, because $P_{p'}\sigma_{ij}\hat c_j = 0$ if $p'$ is another shape of Young tableaux: $P_p\sigma_{k\ell}\hat c_{\ell} =$
cˆi cˆk σk +
1≤i≤d p
= cˆk2 σk +
= σk cˆ +
1≤i 0 for all i }
(2.21)
with the induced group operation of G is a Lie subgroup of G which integrates l.
Camassa-Holm Equation
271
Proof. It is clear that the group operation of Gis closed on L. On the other hand, if n n A ∈ l, we can show that the Neumann series ∞ n=0 (−1) A converges. To see this, take a nonzero vector u = (u(1), u(2), · · · ) ∈ H. Then from the inequality in (2.19), we have ⎞n−1 ⎛ j ∞ 2 ||u||2 A ( j) 1 ⎝ A1 (k)2 ⎠ || An u||2 ≤ (n − 1)! j=1 k=1 ⎞n ⎛ ∞ ||u||2 ⎝ = A1 ( j)2 ⎠ (2.22) n! j=1
=
||u||2 n!
|| A||2n 2 ,
from which we obtain the estimate || A||n || An || ≤ √ 2 . n!
(2.23)
The convergence of the Neumann series is now clear from (2.23). Thus $I + A$ is invertible with $(I + A)^{-1} = \sum_{n=0}^{\infty}(-1)^{n}A^{n}\in L$ if $I + A\in L$. This shows that $L$ is a subgroup of $G$. As $L$ is clearly a submanifold of $G$, this completes the proof of the assertion.
Our next result is the global version of the direct sum decomposition in Proposition 2.2.

Proposition 2.8. Suppose $I + A\in G$, then $I + A$ has a unique factorization
$$I + A = b_-b_+^{-1}, \tag{2.24}$$
where $b_-\in L$ and $b_+\in K$.

Proof. The factorization problem in (2.24) is equivalent to $(I + A)^{T} = b_+b_-^{T}$, where $b_-\in L$ and $b_+\in K$. Now we can certainly obtain a unique orthogonal $b_+\in GL(H)$ and a unique lower triangular $b_-\in GL(H)$ with $(b_-)_{ii}>0$ by applying the Gram-Schmidt orthogonalization process to the vectors $(I + A)^{T}e_1, (I + A)^{T}e_2, \cdots$. To complete the proof, it suffices to show that $b_-\in L$. To this end, note that $(I + A)(I + A)^{T} = b_-b_-^{T}$. Since $g$ is a 2-sided ideal in $B(H)$ which is closed under the operation of taking the transpose, we can rewrite the above relation in the form
$$K = b_- - (b_-^{-1})^{T}, \tag{2.25}$$
where
$$K = \left((I + A)(I + A)^{T} - I\right)(b_-^{-1})^{T}\in g. \tag{2.26}$$
Now, from $\|K\|_2^{2}<\infty$ and the relation in (2.25), we infer that
$$\sum_{i>j}(b_-)_{ij}^{2} + \sum_{i<j}\left((b_-^{-1})_{ji}\right)^{2} + \sum_{i=1}^{\infty}\frac{\left((b_-)_{ii}^{2}-1\right)^{2}}{(b_-)_{ii}^{2}}<\infty. \tag{2.27}$$
272
L.-C. Li
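But as
$$\frac{\left((b_-)_{ii}^{2}-1\right)^{2}}{(b_-)_{ii}^{2}} - \left((b_-)_{ii}-1\right)^{2} = \frac{\left((b_-)_{ii}-1\right)^{2}\left(2(b_-)_{ii}+1\right)}{(b_-)_{ii}^{2}}\ge0, \tag{2.28}$$
it follows on using (2.28) in (2.27) that
$$\|b_- - I\|_2^{2} = \sum_{i>j}(b_-)_{ij}^{2} + \sum_{i=1}^{\infty}\left((b_-)_{ii}-1\right)^{2}<\infty, \tag{2.29}$$
as desired.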
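For intuition, the Gram-Schmidt step in the proof of Proposition 2.8 can be carried out explicitly for a finite matrix. The sketch below is a finite-dimensional analogue only (the Hilbert-Schmidt estimates are not modelled); the function names and the sample matrix are assumptions made for illustration.

```python
import math


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gram_schmidt_qr(m):
    """Return (q, r) with m = q r, q orthogonal, r upper triangular, r_ii > 0.

    Classical Gram-Schmidt applied to the columns m e_1, m e_2, ...;
    m is assumed to be invertible.
    """
    n = len(m)
    columns = transpose(m)
    q_columns = []
    r = [[0.0] * n for _ in range(n)]
    for j, column in enumerate(columns):
        v = list(column)
        for i, q in enumerate(q_columns):
            r[i][j] = sum(a * b for a, b in zip(q, column))
            v = [a - r[i][j] * b for a, b in zip(v, q)]
        r[j][j] = math.sqrt(sum(a * a for a in v))
        q_columns.append([a / r[j][j] for a in v])
    return transpose(q_columns), r


def factorize(g):
    """Return (b_minus, b_plus) with g = b_minus b_plus^{-1}.

    Uses g^T = b_plus b_minus^T, so b_plus is the orthogonal factor of g^T
    and b_minus is the transpose of its upper triangular factor.
    """
    b_plus, r = gram_schmidt_qr(transpose(g))
    return transpose(r), b_plus


if __name__ == "__main__":
    g = [[1.1, 0.2, 0.0], [0.3, 0.8, 0.1], [0.0, 0.4, 1.2]]  # sample I + A
    b_minus, b_plus = factorize(g)
    # For orthogonal b_plus, b_plus^{-1} is its transpose.
    print(matmul(b_minus, transpose(b_plus)))
```

Since $b_+$ is orthogonal, $b_+^{-1} = b_+^{T}$, so the product printed at the end should reproduce the sample matrix up to rounding.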
We are now ready to give the solution to the (±) Toda flow. As the proof is quite standard, we refer the reader to the analogous situation in Theorem 3.2 and Remark 3.3 (b) of [L2]. (See [RSTS] for the general theory of the factorization method.)

Theorem 2.9. Let $L_0\in g$, and let $b_-(t)\in L$, $b_+(t)\in K$ be the unique solution of the factorization problem
$$\exp\left(\pm\tfrac12 tL_0\right) = b_-(t)\,b_+(t)^{-1}. \tag{2.30}$$
Then for all $t$,
$$L(t) = b_+^{-1}(t)L_0b_+(t) = b_-^{-1}(t)L_0b_-(t) \tag{2.31}$$
solves the initial value problem
$$\dot L = \pm\tfrac12[\Pi_{k}L, L], \qquad L(0) = L_0. \tag{2.32}$$

We next give the first result on the long time behaviour of the (±) Toda flow when the initial data $L_0\in p$. It is in fact just a special case of Proposition 5 in Sect. 2 of [DLT1]. (The proof is a modification of Moser's argument in [Mo].)

Proposition 2.10. Let $L(t)$ be the solution of $\dot L = \pm\tfrac12[\Pi_{k}L, L]$, $L(0) = L_0\in p$. Then $L(t)$ converges strongly to a diagonal operator $L^{\pm}(\infty) = \mathrm{diag}(\alpha_1^{\pm},\alpha_2^{\pm},\cdots)$ with $\alpha_i^{\pm}$ belonging to the spectrum $\sigma(L_0)$ of $L_0$ as $t\to\infty$.

Remark 2.11. Recall that $L(t)$ converges strongly to $L^{\pm}(\infty)$ as $t\to\infty$ means $\|L(t)u - L^{\pm}(\infty)u\|\to0$ for each $u\in H$ (see [RS1]). Since $H$ is infinite dimensional, this notion of convergence is weaker than norm convergence, so in general the spectrum of $L^{\pm}(\infty)$ can shrink. (See VIII.7 of [RS1] for a discussion of such matters.) As the reader will see, this is indeed what happens in Sect. 4 below.
In the rest of the section, we will describe the symplectic leaves of the Lie-Poisson structure $\{\cdot,\cdot\}_R$ which are given by the coadjoint orbits of the infinite dimensional Lie group $G_R$ which integrates $g_R$. In particular, we will consider the coadjoint action of $G_R$ on the class $p_*$ of semiseparable operators $L\in g$. By definition, a Hilbert-Schmidt operator $L = (L_{ij})_{i,j=1}^{\infty}\in p_*$ if and only if
$$L_{ij} = \begin{cases}u_iv_j, & i\le j\\ u_jv_i, & i>j,\end{cases} \tag{2.33}$$
where $u = (u_1,u_2,\cdots)$, $v = (v_1,v_2,\cdots)$ are sequences of real numbers, which are not necessarily in $l_2^{+}$. The Lie group $G_R$ can be described in the following way (cf. [DLT2]): the underlying manifold is $G$, but now the group operation is defined by
$$g*h\equiv g_-hg_+^{-1}, \tag{2.34}$$
where $g = g_-g_+^{-1}$ is the unique factorization into $g_-\in L$ and $g_+\in K$. Moreover, the coadjoint action of $G_R$ on $g_R^{*}\simeq g$ is given by
$$\mathrm{Ad}^{*}_{G_R}(g^{-1})L = \Pi_{l}^{*}(g_-Lg_-^{-1}) + \Pi_{k}^{*}(g_+Lg_+^{-1}), \tag{2.35}$$
and the orbits of this action are the symplectic leaves of $\{\cdot,\cdot\}_R$.

Proposition 2.12. The class $p_*\subset p$ of Hilbert-Schmidt operators in $H$ which are semiseparable is invariant under $\mathrm{Ad}^{*}_{G_R}$.

Proof. From (2.35), we have $\mathrm{Ad}^{*}_{G_R}(g)L = \Pi_{l}^{*}(g_-^{-1}Lg_-)$ for $L\in p_*$. If
$$L_{ij} = \begin{cases}u_iv_j, & i\le j\\ u_jv_i, & i>j,\end{cases} \tag{2.36}$$
for some sequences of real numbers $u = (u_1,u_2,\cdots)$ and $v = (v_1,v_2,\cdots)$, a straightforward computation shows that
$$(g_-^{-1}Lg_-)_{ij} = (g_-^{-1}u)_i\,(g_-^{T}v)_j, \qquad i\le j,$$
where we have used the fact that $g_-$ is lower triangular. Therefore the assertion follows from the formula for $\Pi_{l}^{*}$ in (2.11).
From this result, it follows that if the initial data L 0 of (2.32) is in p∗ , then L(t) ∈ p∗ for all t. In the next section, the reader will see that we will be dealing with the (±) Toda flow on some rather special semiseparable operators which are related to the CH equation.
3. A Class of Low-Regularity Solutions of the Camassa-Holm Equation

In this section, we will consider a class of weak solutions of the CH equation
$$u_t - u_{xxt} + 3uu_x = 2u_xu_{xx} + uu_{xxx}, \tag{3.1}$$
of the form
$$u(x,t) = \frac12\sum_{j=1}^{\infty}e^{-|x-q_j(t)|}p_j(t), \tag{3.2}$$
where $p_j(t)\ne0$ for all $j\in\mathbb{N}$ and such that $p_j(t)\to0$ sufficiently fast as $j\to\infty$. To be more precise, we assume (for small values of $t$) that $q(t) = (q_1(t),q_2(t),\cdots)\in l_{\infty}^{+}$, while $p(t) = (p_1(t),p_2(t),\cdots)\in l_{1,2}^{+}$. Here $l_{\infty}^{+}$ and $l_{1,2}^{+}$ are Banach spaces defined as follows:
$$\begin{aligned}
l_{\infty}^{+} &= \{q = (q_1,q_2,\cdots) \mid \|q\|_{\infty} = \sup_j|q_j|<\infty\},\\
l_{1,2}^{+} &= \Big\{p = (p_1,p_2,\cdots) \ \Big|\ \|p\|_{1,2} = \sum_{j=1}^{\infty}j^{2}|p_j|<\infty\Big\}.
\end{aligned}\tag{3.3}$$
Following [BSS], we rewrite the CH equation (3.1) in the form
$$m_t + (mu)_x + mu_x = 0, \qquad m = u - u_{xx}. \tag{3.4}$$
Then the solution in (3.2) corresponds to the measure
$$m(x,t) = \sum_{j=1}^{\infty}p_j(t)\,\delta(x-q_j(t)). \tag{3.5}$$
Therefore, if we mimic the calculation in [BSS], we find that $u(x,t)$ and $m(x,t)$ satisfy the equation in (3.4) in a weak sense if and only if
$$\dot q_j = \frac12\sum_{k=1}^{\infty}e^{-|q_j-q_k|}p_k, \qquad \dot p_j = \frac12\,p_j\sum_{k=1}^{\infty}\mathrm{sgn}(q_j-q_k)\,e^{-|q_j-q_k|}p_k, \qquad j\in\mathbb{N}, \tag{3.6}$$
where we adopt the convention that $\mathrm{sgn}\,0 = 0$. Note that our assumptions above mean that we are considering these equations in the Banach space direct sum $l_{\infty}^{+}\oplus l_{1,2}^{+}$, equipped with the norm $\|(q,p)\| = \|q\|_{\infty} + \|p\|_{1,2}$. Clearly, the signs of the $p_j$'s are preserved as long as no blowup occurs. (See Remark 3.3 (c) below.) In this work, we will focus on the case where $p_j>0$ for all $j$. Indeed, we will restrict our attention to the two sectors
$$S_- = \{(q,p)\in l_{\infty}^{+}\oplus l_{1,2}^{+} \mid q_1<q_2<\cdots,\ p_j>0 \text{ for all } j\}, \tag{3.7}$$
and
$$S_+ = \{(q,p)\in l_{\infty}^{+}\oplus l_{1,2}^{+} \mid q_1>q_2>\cdots,\ p_j>0 \text{ for all } j\} \tag{3.8}$$
for the most part. At the end of the paper, we will show how to adapt our analysis to other sectors which are defined by a restricted class of permutations of $\mathbb{N}$.
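As a rough numerical illustration of the system (3.6), one can truncate it to finitely many peakons and integrate with a standard fourth-order Runge-Kutta step. This is only a sketch under that truncation assumption: the function names, the initial data in the sector $S_-$ and the step size are choices made here, and the infinite-dimensional setting of the text is not modelled.

```python
import math


def sign(x):
    """sgn with the convention sgn 0 = 0."""
    return (x > 0) - (x < 0)


def peakon_rhs(q, p):
    """Right-hand side of the truncated system (3.6) for N peakons."""
    n = len(q)
    dq = [0.5 * sum(math.exp(-abs(q[j] - q[k])) * p[k] for k in range(n))
          for j in range(n)]
    dp = [0.5 * p[j] * sum(sign(q[j] - q[k]) * math.exp(-abs(q[j] - q[k])) * p[k]
                           for k in range(n))
          for j in range(n)]
    return dq, dp


def shifted(state, derivative, factor):
    return [s + factor * d for s, d in zip(state, derivative)]


def rk4_step(q, p, h):
    """One classical Runge-Kutta step of size h."""
    k1q, k1p = peakon_rhs(q, p)
    k2q, k2p = peakon_rhs(shifted(q, k1q, h / 2), shifted(p, k1p, h / 2))
    k3q, k3p = peakon_rhs(shifted(q, k2q, h / 2), shifted(p, k2p, h / 2))
    k4q, k4p = peakon_rhs(shifted(q, k3q, h), shifted(p, k3p, h))
    new_q = [q[i] + h / 6 * (k1q[i] + 2 * k2q[i] + 2 * k3q[i] + k4q[i])
             for i in range(len(q))]
    new_p = [p[i] + h / 6 * (k1p[i] + 2 * k2p[i] + 2 * k3p[i] + k4p[i])
             for i in range(len(p))]
    return new_q, new_p


def evolve(q, p, t_final, steps):
    """Integrate from t = 0 to t_final with a fixed number of steps."""
    h = t_final / steps
    for _ in range(steps):
        q, p = rk4_step(q, p, h)
    return q, p


if __name__ == "__main__":
    q0 = [0.0, 1.0, 2.5]        # q_1 < q_2 < q_3, i.e. the sector S_-
    p0 = [1.0, 0.25, 1.0 / 9]   # positive amplitudes decaying like 1/j^2
    q1, p1 = evolve(q0, p0, 2.0, 2000)
    print("P(0) =", sum(p0), " P(2) =", sum(p1))
    print("q(2) =", q1)
```

In such a run the total $P = \sum_j p_j$ stays constant up to rounding, the amplitudes remain positive and the ordering of the $q_j$ is preserved, which is the behaviour the a priori estimates in the text rely on.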
Proposition 3.1. Suppose $(q^{0},p^{0})\in S_{\pm}$. Then the initial value problem
$$\begin{aligned}
\dot q_j &= \frac12\sum_{k=1}^{\infty}e^{-|q_j-q_k|}p_k,\\
\dot p_j &= \frac12\,p_j\sum_{k=1}^{\infty}\mathrm{sgn}(q_j-q_k)\,e^{-|q_j-q_k|}p_k = \pm\frac12\,p_j\sum_{k=1}^{\infty}\mathrm{sgn}(k-j)\,e^{-|q_j-q_k|}p_k,\\
q_j(0) &= q_j^{0}, \qquad p_j(0) = p_j^{0}, \qquad j\in\mathbb{N}
\end{aligned}\tag{3.9}$$
has a unique global solution in $S_{\pm}$.

Proof. For $(q,p)\in S_{\pm}$, put
$$f(q,p) = (f_1(q,p), f_2(q,p)), \tag{3.10}$$
where
$$f_1(q,p) = \Big(\frac12\sum_{k=1}^{\infty}e^{-|q_j-q_k|}p_k\Big)_{j=1}^{\infty}, \qquad f_2(q,p) = \Big(\frac12\,p_j\sum_{k=1}^{\infty}\mathrm{sgn}(q_j-q_k)\,e^{-|q_j-q_k|}p_k\Big)_{j=1}^{\infty}. \tag{3.11}$$
We first show $f: S_{\pm}\to l_{\infty}^{+}\oplus l_{1,2}^{+}$. For this purpose (and for later usage), we put
$$\|p\|_1 = \sum_{k=1}^{\infty}p_k, \qquad p = (p_k)_{k=1}^{\infty}\in l_{1,2}^{+},\ p_k>0. \tag{3.12}$$
Then from the expressions for $f_1$ and $f_2$ above, we find that
$$\|f(q,p)\|\le\frac12\|p\|_1\left(1+\|p\|_{1,2}\right), \tag{3.13}$$
as desired. We next show $f$ is locally Lipschitz. To do this, take $(q,p)$, $(\tilde q,\tilde p)$ in an open ball centered at $(q^{0},p^{0})$ which is contained in $S_{\pm}$. Then by making use of the inequality $|e^{-\xi}-e^{-\eta}|\le|\xi-\eta|$ for $\xi,\eta>0$ and the triangle inequalities, we have
$$\begin{aligned}
\Big|\sum_{k=1}^{\infty}e^{-|q_j-q_k|}p_k - \sum_{k=1}^{\infty}e^{-|\tilde q_j-\tilde q_k|}\tilde p_k\Big|
&\le\sum_{k=1}^{\infty}e^{-|q_j-q_k|}|p_k-\tilde p_k| + \sum_{k=1}^{\infty}|\tilde p_k|\,\big|e^{-|q_j-q_k|}-e^{-|\tilde q_j-\tilde q_k|}\big|\\
&\le\sum_{k=1}^{\infty}|p_k-\tilde p_k| + \sum_{k=1}^{\infty}|\tilde p_k|\left(|q_j-\tilde q_j|+|q_k-\tilde q_k|\right)\\
&\le\sum_{k=1}^{\infty}|p_k-\tilde p_k| + \|\tilde p\|_1|q_j-\tilde q_j| + \|\tilde p\|_1\|q-\tilde q\|_{\infty}.
\end{aligned}\tag{3.14}$$
Consequently,
$$\|f_1(q,p)-f_1(\tilde q,\tilde p)\|_{\infty}\le\frac12\|p-\tilde p\|_{1,2} + \|\tilde p\|_1\|q-\tilde q\|_{\infty}. \tag{3.15}$$
On the other hand, by using the fact that $\mathrm{sgn}(q_j-q_k) = \mathrm{sgn}(\tilde q_j-\tilde q_k) = \pm\,\mathrm{sgn}(k-j)$ and the estimate in (3.14), we find
$$\begin{aligned}
\|f_2(q,p)-f_2(\tilde q,\tilde p)\|_{1,2}
&\le\frac12\sum_{j=1}^{\infty}\sum_{k=1}^{\infty}j^{2}\,\big|e^{-|q_j-q_k|}p_k - e^{-|\tilde q_j-\tilde q_k|}\tilde p_k\big|\,p_j + \frac12\sum_{j=1}^{\infty}\sum_{k=1}^{\infty}j^{2}e^{-|\tilde q_j-\tilde q_k|}\tilde p_k\,|p_j-\tilde p_j|\\
&\le\|p\|_{1,2}\Big(\frac12\|p-\tilde p\|_1 + \|\tilde p\|_1\|q-\tilde q\|_{\infty}\Big) + \frac12\|\tilde p\|_1\|p-\tilde p\|_{1,2}.
\end{aligned}\tag{3.16}$$
Therefore, upon combining (3.15) and (3.16), we obtain
$$\|f(q,p)-f(\tilde q,\tilde p)\|\le C(\|p\|_{1,2},\|\tilde p\|_1)\,\|(q,p)-(\tilde q,\tilde p)\|. \tag{3.17}$$
Finally, to establish global existence, let us suppose the solution $(q(t),p(t))$ exists for $0\le t\le T$ for some $T>0$. We will establish an a priori estimate for $\|(q(t),p(t))\|$. To do so, observe that $P = \sum_{j=1}^{\infty}p_j(t)$ is a conserved quantity. Hence the equations for $q_j(t)$ give $|\dot q_j(t)|\le\frac12P$, from which we obtain the estimate
$$\|q(t)\|_{\infty}\le\|q(0)\|_{\infty} + \frac12Pt. \tag{3.18}$$
Similarly, the equations for $p_j(t)$ give $|\dot p_j(t)|\le\frac12p_j(t)P$, from which we find $p_j(t)\le p_j(0)\,e^{\frac12Pt}$. Therefore,
$$\|p(t)\|_{1,2}\le\|p(0)\|_{1,2}\,e^{\frac12Pt}. \tag{3.19}$$
Consequently, on combining (3.18) and (3.19), we conclude that
$$\|(q(t),p(t))\|\le\|q(0)\|_{\infty} + \frac12Pt + \|p(0)\|_{1,2}\,e^{\frac12Pt} \tag{3.20}$$
for $0\le t\le T$. This completes the proof.
Our next result relates the equations in (3.6) with $(q,p)\in S_{\pm}$ to the (±) Toda flow and gives the spectral properties of the Lax operator.

Proposition 3.2. For $(q,p)\in S_{\pm}$, define the operator $L(q,p) = (L_{ij}(q,p))_{i,j=1}^{\infty}$ on $l_2^{+}$ by
$$L_{ij}(q,p) = \frac12\,e^{-\frac12|q_i-q_j|}\sqrt{p_ip_j}, \qquad i,j\ge1. \tag{3.21}$$
Then

(a) $L(q,p)$ is a positive, semiseparable trace-class operator. Indeed,
$$L_{ij}(q,p) = \begin{cases}u_i(q,p)v_j(q,p), & i\le j\\ u_j(q,p)v_i(q,p), & i>j,\end{cases} \tag{3.22a}$$
for $(q,p)\in S_-$, while
$$L_{ij}(q,p) = \begin{cases}u_j(q,p)v_i(q,p), & i\le j\\ u_i(q,p)v_j(q,p), & i>j,\end{cases} \tag{3.22b}$$
for $(q,p)\in S_+$, where
$$u_i(q,p) = \frac{1}{\sqrt2}\,e^{\frac12q_i}\sqrt{p_i}, \qquad v_i(q,p) = \frac{1}{\sqrt2}\,e^{-\frac12q_i}\sqrt{p_i}, \qquad i\ge1. \tag{3.23}$$

(b) If $f = (f(1),f(2),\cdots)$ is an eigenvector of $L(q,p)$, then $f(1)\ne0$.

(c) The eigenvalues of $L(q,p)$ are simple and $\ker(L(q,p)) = \{0\}$.

(d) If $(q,p)$ evolves under (3.6), then
$$\dot L(q,p) = \begin{cases}\frac12[L(q,p),\Pi_{k}L(q,p)], & (q,p)\in S_-\\ \frac12[\Pi_{k}L(q,p),L(q,p)], & (q,p)\in S_+.\end{cases} \tag{3.24}$$
Proof. We will give the proof for (q, p) ∈ S− , the arguments for the other case are similar. (a) It is clear from the definition of L(q, p) that L i j (q, p) is of the form given in (3.22a) with u i (q, p), vi (q, p) defined in (3.23). Moreover, it follows from the natural ordering of the qi ’s that 0
0. For $n>1$, it follows from [GK] [Eq. (29) on p.78] that
$$\begin{aligned}
\det(L^{(n)}(q,p)) &= u_1(q,p)\,v_n(q,p)\prod_{i=1}^{n-1}\det\begin{pmatrix}v_i(q,p) & v_{i+1}(q,p)\\ u_i(q,p) & u_{i+1}(q,p)\end{pmatrix}\\
&= v_1^{2}(q,p)\cdots v_n^{2}(q,p)\prod_{j=1}^{n}\left(\frac{u_j(q,p)}{v_j(q,p)} - \frac{u_{j-1}(q,p)}{v_{j-1}(q,p)}\right),
\end{aligned}\tag{3.26}$$
where we formally set $\frac{u_0(q,p)}{v_0(q,p)} = 0$. Hence we conclude from (3.25) that $\det(L^{(n)}(q,p))>0$. Consequently, $L(q,p)$ is positive. Finally, the fact that $L(q,p)$ is trace-class follows as we have $\sum_{j=1}^{\infty}(e_j,L(q,p)e_j) = \frac12\sum_{j=1}^{\infty}p_j<\infty$.

(b) Suppose $L(q,p)f = \lambda f$. Writing this out in terms of components, we have
$$\begin{aligned}
u_1(v_1f(1)+v_2f(2)+v_3f(3)+\cdots) &= \lambda f(1),\\
u_1v_2f(1)+u_2(v_2f(2)+v_3f(3)+\cdots) &= \lambda f(2),\\
u_1v_3f(1)+u_2v_3f(2)+u_3(v_3f(3)+\cdots) &= \lambda f(3),\\
&\ \,\vdots
\end{aligned}\tag{3.27}$$
where $u_i = u_i(q,p)$ and $v_i = v_i(q,p)$.
If $f(1) = 0$, then it follows from the first equation of (3.27) that $v_2(q,p)f(2)+v_3(q,p)f(3)+\cdots = 0$. Substituting this into the second equation of (3.27), we find $f(2) = 0$. Hence $v_3(q,p)f(3)+\cdots = 0$. When we substitute this into the third equation of (3.27), we obtain $f(3) = 0$. Proceeding inductively, it is easy to see that $f = 0$, a contradiction to the assumption that $f$ is an eigenvector.

(c) Suppose there exist independent eigenvectors $f$ and $g$ of $L(q,p)$ corresponding to the eigenvalue $\lambda$. Since $f(1), g(1)\ne0$, we can find $c_1,c_2\in\mathbb{R}\setminus\{0\}$ such that $c_1f(1)+c_2g(1) = 0$. But on the other hand, $c_1f+c_2g$ is obviously an eigenvector of $L(q,p)$, which is impossible as $c_1f(1)+c_2g(1) = 0$. To show that $\ker(L(q,p)) = \{0\}$, suppose $L(q,p)h = 0$. Writing this out in terms of components, with $u_i = u_i(q,p)$ and $v_i = v_i(q,p)$, we get
$$\begin{aligned}
u_1(v_1h(1)+v_2h(2)+v_3h(3)+\cdots) &= 0,\\
u_1v_2h(1)+u_2(v_2h(2)+v_3h(3)+\cdots) &= 0,\\
u_1v_3h(1)+u_2v_3h(2)+u_3(v_3h(3)+\cdots) &= 0,\\
&\ \,\vdots
\end{aligned}\tag{3.28}$$
Next, we multiply the second equation of (3.28) by $-\frac{u_1}{u_2}$, and add it to the first equation; this gives
$$\frac{u_1}{u_2}\,v_1v_2\left(\frac{u_2}{v_2}-\frac{u_1}{v_1}\right)h(1) = 0. \tag{3.29}$$
Thus it follows from (3.29) and (3.25) that $h(1) = 0$. Clearly, if we proceed inductively and make use of (3.25), the conclusion is that $h = 0$, as required.

(d) See Remark 3.3 (a) below.
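The closed form (3.26) for the principal minors, and the trace computation used for the trace-class property, can be checked on a finite truncation. The sketch below assumes a point in the sector $S_-$ (increasing $q_j$, positive $p_j$); the function names and sample data are illustrative choices, not part of the text.

```python
import math


def lax_matrix(q, p):
    """n x n truncation of L_ij = (1/2) exp(-|q_i - q_j|/2) sqrt(p_i p_j)."""
    n = len(q)
    return [[0.5 * math.exp(-0.5 * abs(q[i] - q[j])) * math.sqrt(p[i] * p[j])
             for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def determinant_formula(q, p):
    """v_1^2 ... v_n^2 * prod_j (u_j/v_j - u_{j-1}/v_{j-1}), with u_0/v_0 = 0."""
    product = 1.0
    previous_ratio = 0.0
    for qj, pj in zip(q, p):
        u = math.exp(0.5 * qj) * math.sqrt(pj) / math.sqrt(2.0)
        v = math.exp(-0.5 * qj) * math.sqrt(pj) / math.sqrt(2.0)
        product *= v * v * (u / v - previous_ratio)
        previous_ratio = u / v
    return product


if __name__ == "__main__":
    q = [0.0, 1.0, 2.5]          # q_1 < q_2 < q_3 (sector S_-)
    p = [1.0, 0.25, 1.0 / 9]
    lax = lax_matrix(q, p)
    print(determinant(lax), determinant_formula(q, p))
    print(sum(lax[i][i] for i in range(3)), 0.5 * sum(p))
```

Both determinants should agree and be positive because $u_j/v_j = e^{q_j}$ is increasing in $S_-$, and the trace equals $\frac12\sum_j p_j$ since $L_{jj} = \frac12 p_j$.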
Remark 3.3. (a) If we replace $\infty$ by a positive integer $N$ in (3.2), we obtain the multipeakon solutions of the CH equation [CH,BSS] if $p_j(t)>0$ for $j = 1,\cdots,N$. In that case, the Lax pair of the Hamiltonian equations for $(q,p)\in\mathbb{R}^{2N}$ with $p_j>0$ for all $j$ was discovered in [CF] in a remarkable calculation. On the other hand, in the sector where the $q_j$'s satisfy the natural ordering $q_1<q_2<\cdots<q_N$, the realization that the Lax equation is just a special case of the Toda flows on $N\times N$ matrices was pointed out in [RB]. In this regard, the calculation which leads to (3.24) is just an extension of the one in [CF]. We should mention, however, that the r-matrix in [RB] is not the appropriate one to use from the point of view of Poisson geometry. Indeed, it is easy to check that the set of $N\times N$ symmetric matrices is not even a Poisson submanifold of the Lie-Poisson structure associated with the corresponding R-bracket. (b) In the case of the peakons lattice in [RB], the Lax operator is invertible. (This also follows from (3.26) above.) In our case, although 0 is not an eigenvalue of $L(q,p)$ by Proposition 3.2 (c) above, it is in the essential spectrum $\sigma_{ess}(L(q,p))$ (indeed, $\sigma_{ess}(L(q,p)) = \{0\}$). Hence $L(q,p)$ is not invertible. (c) A general solution of the CH equation of the form given in (3.2) is a superposition of peakons ($p_j(t)>0$) and antipeakons ($p_j(t)<0$). If we have a mix of peakons and antipeakons, $p_j(t)$ can blow up in finite time if a collision occurs and we have to consider continuation of the solution $u(x,t)$ beyond wave breaking. (For the finite dimensional version of (3.2) as given in (1.5), see [BSS] for a detailed
Camassa-Holm Equation
analysis using the explicit solution formulas.) For a method of continuation based on the introduction of a new set of independent and dependent variables, we refer the reader to the recent work in [BC1,BC2]. In this work, however, since we make the assumption that p j (0) > 0 for all j ∈ N, it follows from the above results that p j (t) for all j ∈ N will remain positive and bounded for all t > 0. Hence no wave breaking can occur. We close this section with the following result which is a consequence of Proposition 2.10, Proposition 3.2 and the equation for q j in (3.6). Proposition 3.4. Let (q(t), p(t)) be a solution of (3.6) in the sector S± . Then
(a) $L(q(t),p(t)) \to \operatorname{diag}(\alpha_1^{\pm}, \alpha_2^{\pm}, \cdots)$ strongly as $t \to \infty$, where $\alpha_i^{\pm} \in \sigma(L(q(0),p(0)))$, and hence $\lim_{t\to\infty} p_j(t) = 2\alpha_j^{\pm}$ for each $j \in \mathbb{N}$,
(b) $\dot q_j(t) > 0$ for each $j \in \mathbb{N}$ and $\lim_{t\to\infty} e^{-\frac12 |q_j(t)-q_k(t)|}\sqrt{p_j(t)p_k(t)} = 0$ whenever $j \neq k$.

Remark 3.5. (a) From Proposition 3.4 (b) above, we conclude that the peakons are traveling to the right.

(b) In Sects. 4 and 5, we will show that the peakons separate out, i.e., the $q_j(t)$'s have the scattering behaviour. However, in contrast to the semi-infinite Toda lattice [L1], this does not follow immediately from the long time behaviour of $L(q(t),p(t))$. This is clear from Proposition 3.4 (b) above, as $\lim_{t\to\infty} e^{-\frac12 |q_j(t)-q_k(t)|} = 0$, $j \neq k$, does not follow automatically from $\lim_{t\to\infty} e^{-\frac12 |q_j(t)-q_k(t)|}\sqrt{p_j(t)p_k(t)} = 0$. Indeed, for $(q(t),p(t)) \in S_-$, we will show that $\lim_{t\to\infty} p_j(t) = 0$ for all $j \in \mathbb{N}$, so even the explicit values of the $\alpha_j^-$'s are of no help in this case in establishing the scattering behaviour.

4. Long Time Behaviour in the Sector $S_-$

Let $(q(t),p(t))$ be the solution of (3.6) with $(q(0),p(0)) = (q^0,p^0) \in S_-$, and let $\sigma(L(q^0,p^0)) \setminus \{0\} = \{\lambda_i\}_{i=1}^{\infty}$. In view of Proposition 3.2, we will order the eigenvalues as follows:
\[
0 < \cdots < \lambda_3 < \lambda_2 < \lambda_1. \tag{4.1}
\]
We will also take the normalized eigenvectors $\phi_1(t), \phi_2(t), \ldots$ of $L(q(t),p(t))$ to be such that $\phi_k(1,t) > 0$, $k = 1, 2, \ldots$. Now it follows from the same proposition that $L(q(t),p(t))$ is one-to-one. Hence $L(q(t),p(t))$ has a left inverse $J(q(t),p(t))$ which is an unbounded operator defined on the dense linear subspace $\operatorname{Ran} L(q(t),p(t))$ of $H$. In order to give the formula for $J(q(t),p(t))$, set
\[
e_j(t) = e^{-\frac12 (q_{j+1}(t) - q_j(t))}, \quad j \in \mathbb{N}. \tag{4.2}
\]
Proposition 4.1. The matrix of $J(q(t),p(t))$ is tridiagonal:
\[
J(q(t),p(t)) =
\begin{pmatrix}
a_1(t) & -b_1(t) & 0 & \cdots \\
-b_1(t) & a_2(t) & -b_2(t) & \ddots \\
0 & -b_2(t) & a_3(t) & \ddots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}
\tag{4.3}
\]
L.-C. Li
with
\[
a_j(t) = \frac{2}{p_j(t)}\, \frac{1 - e_{j-1}^2(t)\, e_j^2(t)}{(1 - e_{j-1}^2(t))(1 - e_j^2(t))}, \qquad
b_j(t) = \frac{2}{\sqrt{p_j(t)\, p_{j+1}(t)}}\, \frac{e_j(t)}{1 - e_j^2(t)}, \quad j \in \mathbb{N}, \tag{4.4}
\]
where we formally set $e_0(t) = 0$. Moreover,
\[
e_j(t)\, \frac{\phi_k(j+1,t)}{\sqrt{p_{j+1}(t)}}
= -\frac{e_{j-1}(t)(1 - e_j^2(t))}{1 - e_{j-1}^2(t)}\, \frac{\phi_k(j-1,t)}{\sqrt{p_{j-1}(t)}}
+ \frac12 (1 - e_j^2(t))\, p_j(t) \left( a_j(t) - \frac{1}{\lambda_k} \right) \frac{\phi_k(j,t)}{\sqrt{p_j(t)}} \tag{4.5}
\]
for all $j, k \in \mathbb{N}$.

Proof. To obtain the formula for $J(q(t),p(t))$, we solve $L(q(t),p(t)) f = g$ recursively for $f(1), f(2), \cdots$ in terms of the components of $g$. On the other hand, from $L(q(t),p(t))\phi_k(t) = \lambda_k \phi_k(t)$, we find $J(q(t),p(t))\phi_k(t) = \frac{1}{\lambda_k}\phi_k(t)$. Since $J(q(t),p(t))$ is given by (4.3), we obtain the recurrence relation
\[
b_j(t)\phi_k(j+1,t) = -b_{j-1}(t)\phi_k(j-1,t) + \left( a_j(t) - \frac{1}{\lambda_k} \right)\phi_k(j,t). \tag{4.6}
\]
Therefore, the formula in (4.5) follows upon multiplying both sides of (4.6) by $\frac12 \sqrt{p_j(t)}\,(1 - e_j^2(t))$ and making use of the formulas in (4.4).
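The tridiagonal structure in Proposition 4.1 can be sanity-checked on a finite truncation. The sketch below is only illustrative and rests on assumptions that are not stated in this section: it takes the $N \times N$ peakon Lax matrix in the form $L_{ij} = \frac12 \sqrt{p_i p_j}\, e^{-\frac12 |q_i - q_j|}$ with $q_1 < \cdots < q_N$, and it sets $e_N = 0$ in addition to $e_0 = 0$ to close the truncation. The function names are hypothetical.

```python
import math

def lax_matrix(q, p):
    """Assumed finite Lax matrix: L_ij = 0.5*sqrt(p_i p_j)*exp(-|q_i - q_j|/2)."""
    n = len(q)
    return [[0.5 * math.sqrt(p[i] * p[j]) * math.exp(-0.5 * abs(q[i] - q[j]))
             for j in range(n)] for i in range(n)]

def tridiagonal_inverse(q, p):
    """Candidate inverse built from a_j, b_j as in (4.4), with e_0 = e_N = 0."""
    n = len(q)
    # e[k] holds e_k (1-based index k); e[0] and e[n] are the formal zeros.
    e = [0.0] + [math.exp(-0.5 * (q[j + 1] - q[j])) for j in range(n - 1)] + [0.0]
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):                      # row i corresponds to j = i + 1
        em, ep = e[i], e[i + 1]             # e_{j-1} and e_j
        J[i][i] = (2.0 / p[i]) * (1 - em**2 * ep**2) / ((1 - em**2) * (1 - ep**2))
        if i + 1 < n:
            b = 2.0 * ep / (math.sqrt(p[i] * p[i + 1]) * (1 - ep**2))
            J[i][i + 1] = J[i + 1][i] = -b
    return J

def matmul(A, B):
    """Plain dense matrix product for small lists of lists."""
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(r)]
            for i in range(n)]

if __name__ == "__main__":
    q = [0.0, 1.0, 2.5, 3.0]               # strictly increasing positions
    p = [1.0, 2.0, 0.5, 1.5]               # positive momenta
    for row in matmul(tridiagonal_inverse(q, p), lax_matrix(q, p)):
        print([round(x, 12) for x in row])
```

In the finite case the product is the identity up to rounding; in the semi-infinite setting of the text $J$ is only a left inverse on $\operatorname{Ran} L$, which this truncation does not capture.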
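Our next result shows that $\phi_k(1,t)$ can be solved explicitly.

Lemma 4.2. For each $k \in \mathbb{N}$,
\[
\phi_k(1,t) = \frac{e^{-\frac12 \lambda_k t}\, \phi_k(1,0)}{\left( \sum_{j=1}^{\infty} e^{-\lambda_j t}\, \phi_j^2(1,0) \right)^{\frac12}}. \tag{4.7}
\]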
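Formula (4.7) is easy to evaluate once the spectrum and the initial first components are known. The following sketch truncates the infinite sum to finitely many eigenvalues, which is an assumption made purely for illustration; the eigenvalues and initial data below are made-up sample values, and the function name is hypothetical.

```python
import math

def phi_first_component(lams, phi0, t):
    """Evaluate (4.7) for a finite list of eigenvalues lams and data phi0 = phi_k(1, 0)."""
    weights = [math.exp(-lam * t) * f * f for lam, f in zip(lams, phi0)]
    norm = math.sqrt(sum(weights))
    return [math.exp(-0.5 * lam * t) * f / norm for lam, f in zip(lams, phi0)]

if __name__ == "__main__":
    # Sample eigenvalues decreasing towards 0 (illustrative only).
    lams = [2.0 ** (-k) for k in range(1, 9)]
    raw = [1.0 / k for k in range(1, 9)]
    scale = math.sqrt(sum(x * x for x in raw))
    phi0 = [x / scale for x in raw]          # normalised: sum of squares is 1
    phi_t = phi_first_component(lams, phi0, 3.0)
    print(sum(x * x for x in phi_t))         # stays normalised for every t
```

By construction the squares of the returned values sum to one, and the ratio of any two entries evolves by the exponential factor $e^{-\frac12(\lambda_r - \lambda_s)t}$, which is the relation used at the start of the proof of Theorem 4.3.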
Proof. Let $L(t) = L(q(t),p(t))$. Then $L(t)$ evolves under the $(-)$ Toda flow. By Theorem 2.9, for any $j \in \mathbb{N}$,
\[
(e_1, \phi_j(t)) = (e_1, b_+(t)^{-1}\phi_j(0))
= ((b_-(t)^{-1})^T e_1, e^{-\frac12 t L(0)}\phi_j(0))
= \frac{e^{-\frac12 \lambda_j t}\,(e_1, \phi_j(0))}{(b_-(t))_{11}}. \tag{4.8}
\]
The assertion therefore follows from (4.8) and the relation $\sum_{j=1}^{\infty} \phi_j^2(1,t) = 1$.
As the λ j ’s accumulate at 0, it is a difficult problem to get the asymptotics of φk (1, t) as t → ∞ from (4.7) above. In the following, we will bypass this difficulty.
Theorem 4.3. Let $(q(t),p(t))$ be the solution of (3.6) with $(q(0),p(0)) = (q^0,p^0) \in S_-$, then
(a) $\lim_{t\to\infty} p_j(t) = 0$ for all $j \in \mathbb{N}$,
(b) $\lim_{t\to\infty} |q_j(t) - q_k(t)| = \infty$ for all $j \neq k$,
(c) $q_j(t) \to \infty$ as $t \to \infty$ for all $j \in \mathbb{N}$,
(d) $q_j(t) = o(t)$ as $t \to \infty$ for all $j \in \mathbb{N}$.
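Proof. From (4.7), we have
\[
\frac{\phi_r(1,t)}{\phi_s(1,t)} = e^{-\frac12 (\lambda_r - \lambda_s)t}\, \frac{\phi_r(1,0)}{\phi_s(1,0)} \tag{4.9}
\]
for all $r, s \in \mathbb{N}$. Therefore, $\frac{\phi_k^2(1,t)}{p_1(t)} = \frac{\phi_k^2(1,t)}{2\sum_{j=1}^{\infty} \lambda_j \phi_j^2(1,t)}$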
0 for t sufficiently large.
(4.20)
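We next use the recurrence relation (4.6) for $j = 1$, (4.9) and (4.17) to obtain
\[
\frac{\phi_r(2,t)}{\phi_s(2,t)}
= \frac{p_1(t)\bigl(a_1(t) - \frac{1}{\lambda_r}\bigr)\phi_r(1,t)}{p_1(t)\bigl(a_1(t) - \frac{1}{\lambda_s}\bigr)\phi_s(1,t)}
\sim e^{-\frac12 (\lambda_r - \lambda_s)t}\, \frac{\phi_r(1,0)}{\phi_s(1,0)} \quad \text{as } t \to \infty. \tag{4.21}
\]
From this, it follows that $\frac{\phi_k^2(2,t)}{p_2(t)} = \frac{\phi_k^2(2,t)}{2\sum_{j=1}^{\infty} \lambda_j \phi_j^2(2,t)}$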
0 for t sufficiently large.
(4.32)
Now let n ≥ 3 and assume by induction that the following sequence of assertions holds for j ≤ n − 1 for each k, r, s ∈ N: φr ( j,t) φs ( j,t)
(1) j (2) j
φk2 ( j,t) p j (t)
0 for $t$ sufficiently large by the induction assumptions. Consequently,
\[
\lim_{t\to\infty} \phi_k(n,t) = 0 \tag{4.37}
\]
and so
\[
\lim_{t\to\infty} p_n(t) = \lim_{t\to\infty} 2\sum_{j=1}^{\infty} \lambda_j \phi_j^2(n,t) = 0. \tag{4.38}
\]
To establish $(6)_n$, we make use of the formula for $a_n(t)$ in (4.4), the induction assumptions and (4.38), thus
\[
(1 - e_n^2(t))\, p_n(t) \left( a_n(t) - \frac{1}{\lambda_k} \right)
= \frac{2\lambda_k (1 - e_{n-1}^2(t) e_n^2(t)) - p_n(t)(1 - e_n^2(t))(1 - e_{n-1}^2(t))}{\lambda_k (1 - e_{n-1}^2(t))}
\sim 2 \quad \text{as } t \to \infty. \tag{4.39}
\]
Therefore, on using the recurrence relation (4.5) for $j = n$, the induction assumptions $(3)_{n-1}$, $(8)_{n-1}$ together with (4.36) and (4.39), we obtain
\[
\lim_{t\to\infty} e_n(t)\, \frac{\phi_k(n+1,t)}{\sqrt{p_{n+1}(t)}} = 0. \tag{4.40}
\]
Hence
\[
\lim_{t\to\infty} e_n(t) = \lim_{t\to\infty} \left( 2\sum_{k=1}^{\infty} \lambda_k\, \frac{\phi_k(n,t)}{\sqrt{p_n(t)}}\, e_n(t)\, \frac{\phi_k(n+1,t)}{\sqrt{p_{n+1}(t)}} \right)^{\frac12} = 0 \tag{4.41}
\]
by (4.36) and (4.40). Using this result in (4.39), we obtain $(9)_n$. Finally, the assertion $(10)_n$ follows from
\[
p_{n+1}(t)\, b_n(t)\, \frac{\phi_k(n,t)}{\phi_k(n+1,t)}
= \frac{p_{n+1}(t)\, b_n^2(t)}{\bigl(a_n(t) - \frac{1}{\lambda_k}\bigr) - b_{n-1}(t)\frac{\phi_k(n-1,t)}{\phi_k(n,t)}}
= \frac{4 e_n^2(t)}{(1 - e_n^2(t))^2 \left[ p_n(t)\bigl(a_n(t) - \frac{1}{\lambda_k}\bigr) - p_n(t)\, b_{n-1}(t)\frac{\phi_k(n-1,t)}{\phi_k(n,t)} \right]} \tag{4.42}
\]
upon using $(9)_n$, the induction assumptions $(8)_{n-1}$, $(10)_{n-1}$ and (4.41). This completes the proof of the sequence of assertions $(1)_j$-$(10)_j$ by induction. Hence we have established parts (a) and (b) of the theorem, as the relation $\lim_{t\to\infty} e_j(t) = 0$ is equivalent to $\lim_{t\to\infty} (q_{j+1}(t) - q_j(t)) = \infty$. To prove part (c), we begin with the assertion for $j = 1$. For this case, note that from the equation for $p_1(t)$, we have
\[
p_1(t) = p_1(0) \exp\left( -\frac12 \int_0^t \sum_{k \neq 1} e^{-(q_k(s) - q_1(s))}\, p_k(s)\, ds \right). \tag{4.43}
\]
Since $\lim_{t\to\infty} p_1(t) = 0$, we conclude from (4.43) that
\[
\int_0^t \sum_{k \neq 1} e^{-(q_k(s) - q_1(s))}\, p_k(s)\, ds \to \infty \quad \text{as } t \to \infty. \tag{4.44}
\]
Meanwhile, from the equation for $q_1(t)$ in (3.6), we find
\[
q_1(t) = q_1(0) + \frac12 \int_0^t \sum_{k=1}^{\infty} e^{-|q_k(s) - q_1(s)|}\, p_k(s)\, ds
> q_1(0) + \frac12 \int_0^t \sum_{k \neq 1} e^{-(q_k(s) - q_1(s))}\, p_k(s)\, ds, \quad t > 0. \tag{4.45}
\]
Therefore, on taking the limit as $t$ tends to infinity in (4.45) and making use of (4.44), we conclude that $q_1(t) \to \infty$ as $t \to \infty$. To show that $q_j(t) \to \infty$ as $t \to \infty$ for $j > 1$, note that $\dot q_{j-1}(t) > 0$ by Proposition 3.4 (b). Since $(q(t),p(t)) \in S_-$, it follows from this property that
\[
0 < q_j(t) - q_{j-1}(t) < q_j(t) - q_{j-1}(0) \quad \text{for all } t > 0. \tag{4.46}
\]
Hence the assertion follows from (4.46), as we have $\lim_{t\to\infty} (q_j(t) - q_{j-1}(t)) = \infty$. To establish part (d), note that by parts (a) and (b) above, and the equation for $q_j(t)$ in (3.6), we have $\lim_{t\to\infty} \dot q_j(t) = 0$. Since $q_j(t) \to \infty$ as $t \to \infty$, it follows by L'Hôpital's rule that
\[
\lim_{t\to\infty} \frac{q_j(t)}{t} = \lim_{t\to\infty} \dot q_j(t) = 0. \tag{4.47}
\]
Hence $q_j(t) = o(t)$ as $t \to \infty$, as asserted.
Corollary 4.4. If $u_0(x) = \frac12 \sum_{j=1}^{\infty} e^{-|x - q_j^0|}\, p_j^0$, where $(q^0,p^0) \in S_-$, then the solution $u(x,t)$ of the CH equation (3.1) with $u(x,0) = u_0(x)$ is such that
\[
u(x,t) \to 0 \quad \text{as } t \to \infty. \tag{4.48}
\]
Remark 4.5. (a) Theorem 4.3 (a) can also be proved using the method in Sect. 5 below. (b) From the fact that Ran L(q, p) = H, it is straightforward to show that the equation ˙ L(q, p) = 21 [ L(q, p), k L(q, p) ] implies 1 ˙ J(q, p) = [ J(q, p), k L(q, p) ]. 2 However, as L(q, p) is not invertible, it is not possible to express this equation in terms of J(q, p) alone. (c) From the definition of a j (t) in (4.4) and the proof of Theorem 4.3 above, we see that a j (t) → ∞ as t → ∞. 5. Long Time Behaviour in S+ and Other Sectors Let (q(t), p(t)) be the solution of (3.6) with (q(0), p(0)) = (q 0 , p 0 ) ∈ S+ . Then L(t) = L(q(t), p(t)) evolves under the (+) Toda flow with initial condition L(0) = L(q 0 , p 0 ). As in Sect. 4, we order the eigenvalues of L(q 0 , p 0 ) in such a way that 0 < · · · < λ3 < λ2 < λ1 . Also, we let φk (t) be the normalized eigenvector of L(t) corresponding to λk with φk (1, t) > 0, k = 1, 2, . . . .
Denote by $\wedge^k H$ the $k$-th exterior power of $H$, $k \ge 1$. (See [LS] and [RS2] for more details.) Then the operator $L(t)$ gives rise to the induced derivations
\[
(L(t))_k \equiv \sum_{r=0}^{k-1} I^r \otimes L(t) \otimes I^{k-1-r} : \wedge^k H \longrightarrow \wedge^k H, \quad k \ge 1, \tag{5.1}
\]
where I r and I k−1−r are the identity operators on ∧r H and ∧k−1−r H respectively. Since ker L(t) = {0}, {φk (t)}∞ k=1 is an orthonormal basis of H. Thus {φi1 (t) ∧ · · · ∧ φik (t) | 1 ≤ i 1 < · · · < i k }
(5.2)
is an orthonormal basis of ∧k H with respect to the natural inner product on ∧k H defined by (ξ1 ∧ · · · ∧ ξk , η1 ∧ · · · ∧ ηk ) = det ((ξi , η j )).
(5.3)
Moreover, the elements φi1 (t) ∧ · · · ∧ φik (t) (i 1 < · · · < i k ) are eigenvectors of (L(t))k as we have (L(t))k (φi1 (t) ∧ · · · ∧ φik (t)) = (λi1 + · · · + λik )(φi1 (t) ∧ · · · ∧ φik (t)).
(5.4)
Lemma 5.1. For any increasing sequence 1 ≤ i 1 < · · · < i k of numbers from N, (e1 ∧ · · · ∧ ek , φi1 (t) ∧ · · · ∧ φik (t))2 =
e(λi1 +···+λik )t (e1 ∧ · · · ∧ ek , φi1 (0) ∧ · · · ∧ φik (0))2 1≤ j1 0 the solution of the Cauchy problem (0.10) is given by
1 3 2 t ei[x·ξ +(e −1)|ξ |] H+ ; 1; 2iet |ξ | H− ; 3; 2i|ξ | u(x, t) = −i n (2π) Rn 2 2 1 3 t − ei[x·ξ −(e −1)|ξ |] H− ; 1; 2iet |ξ | H+ ; 3; 2i|ξ | |ξ |2 ϕˆ0 (ξ )dξ 2 2
1 1 1 i[x·ξ +(et −1)|ξ |] t −i e H+ ; 1; 2ie |ξ | H− ; 1; 2i|ξ | n (2π) Rn 2 2 1 1 t − ei[x·ξ −(e −1)|ξ |] H− ( ; 1; 2iet |ξ |)H+ ; 1; 2i|ξ | ϕˆ1 (ξ )dξ. 2 2 In the notations of [2] the last functions are H− (α; γ ; z) = eiαπ (α; γ ; z) and H+ (α; γ ; z) = eiαπ (γ −α; γ ; −z), where function (a; c; z) is defined in [2, Sect. 6.5]. Here ϕ(ξ ˆ ) is a Fourier transform of ϕ(x).
K. Yagdjian, A. Galstian
The $L^p - L^q$ decay estimates obtained in [15] by dyadic decomposition of the phase space contain some derivative loss. More precisely, it is proved that for the solution $u = u(x,t)$ to the Cauchy problem (0.10) with $n \ge 2$, $\varphi_0(x) \in C_0^{\infty}(\mathbb{R}^n)$ and $\varphi_1(x) = 0$ for all large $t \ge T > 0$, the following estimate is satisfied:
\[
\| u(x,t) \|_{L^q(\mathbb{R}^n)} \le C (1 + e^t)^{-\frac12 (n-1)\left( \frac1p - \frac1q \right)} \| \varphi_0 \|_{W_p^N(\mathbb{R}^n)}, \tag{0.11}
\]
where $1 < p \le 2$, $\frac1p + \frac1q = 1$, and $\frac12 (n+1)\left( \frac1p - \frac1q \right) \le N < \frac12 (n+1)\left( \frac1p - \frac1q \right) + 1$, and $W_p^N(\mathbb{R}^n)$ is the Sobolev space. In particular, in (0.11) the derivative loss, $N$, is positive, unless $p = q = 2$. This derivative loss phenomenon exists for the classical wave equation as well. Indeed, it is well-known (see, e.g., [19,20,23]) that for the Cauchy problem $u_{tt} - \Delta u = 0$, $u(x,0) = \varphi(x)$, $u_t(x,0) = 0$, the estimate $\| u(x,t) \|_{L^q(\mathbb{R}^n_x)} \le C \| \varphi(x) \|_{L^q(\mathbb{R}^n_x)}$ fails even for small positive $t$ unless $q = 2$. According to Theorem 1 [15], for the solution $u = u(x,t)$ to the Cauchy problem (0.10) with $n \ge 2$, $\varphi_0(x) = 0$ and $\varphi_1(x) \in C_0^{\infty}(\mathbb{R}^n)$ for all large $t \ge T > 0$ and for any small $\varepsilon > 0$, the following estimate is satisfied:
\[
\| u(x,t) \|_{L^q(\mathbb{R}^n)} \le C_{\varepsilon} (1 + t)(1 + e^t)^{r_0 - n\left( \frac1p - \frac1q \right)} \| \varphi_1 \|_{W_p^N(\mathbb{R}^n)},
\]
where $1 < p \le 2$, $\frac1p + \frac1q = 1$, $r_0 = \max\left\{ \varepsilon;\ \frac{n+1}{2}\left( \frac1p - \frac1q \right) - \frac1q \right\}$, and $\frac{n+1}{2}\left( \frac1p - \frac1q \right) - \frac1q \le N < \frac{n+1}{2}\left( \frac1p - \frac1q \right) + \frac1p$.

The nonlinear Eqs. (0.6) and (0.7) are those we would like to solve, but the linear problem is a natural first step. An exceptionally efficient tool for studying nonlinear equations is the fundamental solution of the associated linear operator. The fundamental solutions for the operator of Eq. (0.10) are constructed in [32] and the representations of the solutions of the Cauchy problem
\[
u_{tt} - e^{2t} \Delta u = f(x,t), \quad u(x,0) = \varphi_0(x), \quad u_t(x,0) = \varphi_1(x),
\]
are given in terms of the solutions of the wave equation in Minkowski spacetime. Then in [32] for $n \ge 2$ the following decay estimate:
\[
\| (-\Delta)^{-s} u(x,t) \|_{L^q(\mathbb{R}^n)} \le C e^{t\left( 2s - n\left( \frac1p - \frac1q \right) \right)} \int_0^t (1 + t - b)\, \| f(x,b) \|_{L^p(\mathbb{R}^n)}\, db
+ C (e^t - 1)^{2s - n\left( \frac1p - \frac1q \right)} \left\{ \| \varphi_0(x) \|_{L^p(\mathbb{R}^n)} + (1 + t)(1 - e^{-t}) \| \varphi_1(x) \|_{L^p(\mathbb{R}^n)} \right\}
\]
is proven, provided that $1 < p \le 2$, $\frac1p + \frac1q = 1$, $\frac12 (n+1)\left( \frac1p - \frac1q \right) \le 2s \le n\left( \frac1p - \frac1q \right) < 2s + 1$. The decay given by this estimate is exponential, in contrast to the polynomial decay estimate for the wave equation in Minkowski spacetime (cf. with [5] and [22]). Moreover, this estimate is fulfilled for $n = 1$ and $s = 0$ as well, if $\varphi_0(x) = 0$ and $\varphi_1(x) = 0$. The case of $n = 1$, $f(x,t) = 0$, and non-vanishing $\varphi_0(x)$ and $\varphi_1(x)$ also is discussed in Sect. 8 [32]. In the construction of the fundamental solutions for the operator (0.8) we follow the approach proposed in [28] that allows us to represent the fundamental solutions as some integral of the family of the fundamental solutions of the Cauchy problem for the wave equation without source term. The kernel of that integral contains Gauss's hypergeometric function. In that way, many properties of the wave equation can be extended to the hyperbolic equations with the time dependent speed of propagation.
Fundamental Solutions for Klein-Gordon Equation in de Sitter Spacetime
That approach was successfully applied in [30,31] by the first author to investigate the semilinear Tricomi-type equations. Thus, in the present paper we consider the Klein-Gordon operator in the de Sitter model of the universe, that is
\[
S := \partial_t^2 - e^{-2t} \Delta + M^2,
\]
where $M$ is the curved mass, $M \ge 0$, and $x \in \mathbb{R}^n$, $t \in \mathbb{R}$. We look for the fundamental solution $E = E(x,t;x_0,t_0)$,
\[
E_{tt} - e^{-2t} \Delta E + M^2 E = \delta(x - x_0, t - t_0),
\]
with support in the "forward light cone" $D_+(x_0,t_0)$, $x_0 \in \mathbb{R}^n$, $t_0 \in \mathbb{R}$, and for the fundamental solution with support in the "backward light cone" $D_-(x_0,t_0)$, $x_0 \in \mathbb{R}^n$, $t_0 \in \mathbb{R}$, defined as follows:
\[
D_{\pm}(x_0,t_0) := \left\{ (x,t) \in \mathbb{R}^{n+1};\ |x - x_0| \le \pm (e^{-t_0} - e^{-t}) \right\}. \tag{0.12}
\]
In fact, any intersection of $D_-(x_0,t_0)$ with the hyperplane $t = \mathrm{const} < t_0$ determines the so-called dependence domain for the point $(x_0,t_0)$, while the intersection of $D_+(x_0,t_0)$ with the hyperplane $t = \mathrm{const} > t_0$ is the so-called domain of influence of the point $(x_0,t_0)$. Equation (0.8) is non-invariant with respect to time inversion. Moreover, the dependence domain is wider than any given ball if time const > t0 is sufficiently large, while the domain of influence is permanently, for all time const < t0, in the ball of the radius et0.

Define for $t_0 \in \mathbb{R}$ in the domain $D_+(x_0,t_0) \cup D_-(x_0,t_0)$ the function
\[
E(x,t;x_0,t_0) = (4e^{-t_0 - t})^{iM} \left( (e^{-t} + e^{-t_0})^2 - (x - x_0)^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-t_0} - e^{-t})^2 - (x - x_0)^2}{(e^{-t_0} + e^{-t})^2 - (x - x_0)^2} \right), \tag{0.13}
\]
where $F(a, b; c; \zeta)$ is the hypergeometric function (see, e.g., [2]). Let $E(x,t;0,b)$ be function (0.13), and set
\[
E_+(x,t;0,t_0) := \begin{cases} E(x,t;0,t_0) & \text{in } D_+(0,t_0), \\ 0 & \text{elsewhere}, \end{cases}
\qquad
E_-(x,t;0,t_0) := \begin{cases} E(x,t;0,t_0) & \text{in } D_-(0,t_0), \\ 0 & \text{elsewhere}. \end{cases}
\]
Since the function $E = E(x,t;0,t_0)$ is smooth in $D_{\pm}(0,t_0)$ and is locally integrable, it follows that $E_+(x,t;0,t_0)$ and $E_-(x,t;0,t_0)$ are distributions whose supports are in $D_+(0,t_0)$ and $D_-(0,t_0)$, respectively. From now on we restrict ourselves to particles with "large" mass $m \ge n/2$, that is, with nonnegative curved mass $M \ge 0$, to make the presentation more transparent. The case of complex-valued curved mass $M$, and, in particular, of $m < n/2$, will be discussed in a forthcoming paper. The next theorem gives our first result.
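For readers who want to experiment with (0.12) and (0.13) in the one-dimensional case, here is a minimal numerical sketch. It is not taken from the paper: the function names are hypothetical, and the hypergeometric function is evaluated by its Gauss power series, which is an assumption that is only adequate when the argument lies well inside the unit disc (inside the cones the argument lies in $[0,1)$, but convergence slows down as it approaches 1).

```python
import math

def hyp2f1(a, b, c, z, tol=1e-15, max_terms=10000):
    """Gauss series sum_n (a)_n (b)_n / ((c)_n n!) z^n, intended for |z| < 1."""
    term = 1.0 + 0j
    total = term
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def in_cone(x, t, x0, t0, sign):
    """Membership in D_+ (sign=+1) or D_- (sign=-1) from (0.12), for scalar x."""
    return abs(x - x0) <= sign * (math.exp(-t0) - math.exp(-t))

def E(x, t, x0, t0, M):
    """The function (0.13); meaningful inside D_+ or D_- where s > 0."""
    s = (math.exp(-t0) + math.exp(-t)) ** 2 - (x - x0) ** 2
    d = (math.exp(-t0) - math.exp(-t)) ** 2 - (x - x0) ** 2
    g = 0.5 + 1j * M
    return (4.0 * math.exp(-t0 - t)) ** (1j * M) * s ** (-g) * hyp2f1(g, g, 1.0, d / s)

if __name__ == "__main__":
    print(in_cone(0.5, 1.0, 0.0, 0.0, +1))   # inside the forward cone of (0, 0)
    print(E(0.2, 1.0, 0.0, 0.0, 1.5))        # complex value of (0.13) at that point
```

At the vertex $(x_0,t_0)$ the hypergeometric argument is zero and the two powers combine to $(4e^{-2t_0})^{-1/2} = e^{t_0}/2$, which gives a quick consistency check on the implementation.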
Theorem 0.1. Suppose that $M \ge 0$. The distributions $E_+(x,t;0,t_0)$ and $E_-(x,t;0,t_0)$ are the fundamental solutions for the operator $S = \partial_t^2 - e^{-2t}\partial_x^2 + M^2$ relative to the point $(0,t_0)$, that is $S E_{\pm}(x,t;0,t_0) = \delta(x, t - t_0)$, or
\[
\frac{\partial^2}{\partial t^2} E_{\pm}(x,t;0,t_0) - e^{-2t} \frac{\partial^2}{\partial x^2} E_{\pm}(x,t;0,t_0) + M^2 E_{\pm}(x,t;0,t_0) = \delta(x, t - t_0).
\]
To motivate our construction for the higher dimensional case $n \ge 2$ we follow the approach suggested in [28] and represent the fundamental solution $E_+(x,t;0,t_0)$ as follows:
\[
E_+(x,t;0,t_0) = \int_{e^{-t} - e^{-t_0}}^{e^{-t_0} - e^{-t}} (4e^{-t_0 - t})^{iM} \left( (e^{-t_0} + e^{-t})^2 - r^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-t_0} - e^{-t})^2 - r^2}{(e^{-t_0} + e^{-t})^2 - r^2} \right) E^{string}(x,r)\, dr, \quad t > t_0,
\]
where the distribution $E^{string}(x,t)$ is the fundamental solution of the Cauchy problem for the string equation:
\[
\frac{\partial^2}{\partial t^2} E^{string} - \frac{\partial^2}{\partial x^2} E^{string} = 0, \quad E^{string}(x,0) = \delta(x), \quad E^{string}_t(x,0) = 0.
\]
Hence, $E^{string}(x,t) = \frac12 \{ \delta(x + t) + \delta(x - t) \}$. The integral makes sense in the topology of the space of distributions. The fundamental solution $E_-(x,t;0,t_0)$ for $t < t_0$ admits a similar representation. We appeal to the wave equation in Minkowski spacetime to obtain in the next theorem very similar representations of the fundamental solutions of the higher dimensional equation in de Sitter spacetime.

Theorem 0.2. If $x \in \mathbb{R}^n$, $n \ge 2$, and $M \ge 0$, then for the operator $S = \partial_t^2 - e^{-2t}\Delta + M^2$ the fundamental solution $E_{+,n}(x,t;x_0,t_0)$ $(= E_{+,n}(x - x_0,t;0,t_0))$ with support in the forward cone $D_+(x_0,t_0)$, $x_0 \in \mathbb{R}^n$, $t_0 \in \mathbb{R}$, $\operatorname{supp} E_{+,n} \subseteq D_+(x_0,t_0)$, is given by the following integral ($t > t_0$):
\[
E_{+,n}(x - x_0,t;0,t_0) = 2 \int_0^{e^{-t_0} - e^{-t}} (4e^{-t_0 - t})^{iM} \left( (e^{-t_0} + e^{-t})^2 - r^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-t_0} - e^{-t})^2 - r^2}{(e^{-t_0} + e^{-t})^2 - r^2} \right) E^w(x - x_0, r)\, dr. \tag{0.14}
\]
Here the distribution $E^w(x,t)$ is a fundamental solution to the Cauchy problem for the wave equation $E^w_{tt} - \Delta E^w = 0$, $E^w(x,0) = \delta(x)$, $E^w_t(x,0) = 0$.
The fundamental solution $E_{-,n}(x,t;x_0,t_0)$ $(= E_{-,n}(x - x_0,t;0,t_0))$ with support in the backward cone $D_-(x_0,t_0)$, $x_0 \in \mathbb{R}^n$, $t_0 \in \mathbb{R}$, $\operatorname{supp} E_{-,n} \subseteq D_-(x_0,t_0)$, is given by the following integral ($t < t_0$):
\[
E_{-,n}(x - x_0,t;0,t_0) = -2 \int_{e^{-t_0} - e^{-t}}^{0} (4e^{-t_0 - t})^{iM} \left( (e^{-t_0} + e^{-t})^2 - r^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-t_0} - e^{-t})^2 - r^2}{(e^{-t_0} + e^{-t})^2 - r^2} \right) E^w(x - x_0, r)\, dr. \tag{0.15}
\]
In particular, the formula (0.14) shows that Huygens's Principle is not valid for waves propagating in the de Sitter model of the universe (cf. with [27]). Next we use Theorem 0.1 to solve the Cauchy problem for the one-dimensional equation
\[
u_{tt} - e^{-2t} u_{xx} + M^2 u = f(x,t), \quad t > 0,\ x \in \mathbb{R}, \tag{0.16}
\]
with vanishing initial data,
\[
u(x,0) = u_t(x,0) = 0. \tag{0.17}
\]
Theorem 0.3. Assume that the function $f$ is continuous along with all its second order derivatives, and that for every fixed $t$ it has compact support, $\operatorname{supp} f(\cdot,t) \subset \mathbb{R}$. Then the function $u = u(x,t)$ defined by
\[
u(x,t) = \int_0^t db \int_{x - (e^{-b} - e^{-t})}^{x + e^{-b} - e^{-t}} dy\, f(y,b)\, (4e^{-b - t})^{iM} \left( (e^{-t} + e^{-b})^2 - (x - y)^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-b} - e^{-t})^2 - (x - y)^2}{(e^{-b} + e^{-t})^2 - (x - y)^2} \right)
\]
is a $C^2$-solution to the Cauchy problem for Eq. (0.16) with vanishing initial data, (0.17).

The representation of the solution of the Cauchy problem for the one-dimensional case ($n = 1$) of Eq. (0.8) without source term is given by the next theorem.

Theorem 0.4. The solution $u = u(x,t)$ of the Cauchy problem
\[
u_{tt} - e^{-2t} u_{xx} + M^2 u = 0, \quad u(x,0) = \varphi_0(x), \quad u_t(x,0) = \varphi_1(x),
\]
with $\varphi_0, \varphi_1 \in C_0^{\infty}(\mathbb{R})$ can be represented as follows:
\[
u(x,t) = \frac12 e^{\frac t2} \left[ \varphi_0(x + 1 - e^{-t}) + \varphi_0(x - 1 + e^{-t}) \right]
+ \int_0^{1 - e^{-t}} \left[ \varphi_0(x - z) + \varphi_0(x + z) \right] K_0(z,t)\, dz
+ \int_0^{1 - e^{-t}} \left[ \varphi_1(x - z) + \varphi_1(x + z) \right] K_1(z,t)\, dz, \tag{0.18}
\]
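where the kernels $K_0(z,t)$ and $K_1(z,t)$ are defined by
\[
\begin{aligned}
K_0(z,t) := {} & -\left[ \frac{\partial}{\partial b} E(z,t;0,b) \right]_{b=0} \\
= {} & (4e^{-t})^{iM} \left( (1 + e^{-t})^2 - z^2 \right)^{-iM} \frac{1}{\left[ (1 - e^{-t})^2 - z^2 \right] \sqrt{(1 + e^{-t})^2 - z^2}} \\
& \times \left[ \left( e^{-t} - 1 - iM(e^{-2t} - 1 - z^2) \right) F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(1 - e^{-t})^2 - z^2}{(1 + e^{-t})^2 - z^2} \right) \right. \\
& \left. \quad + \left( 1 - e^{-2t} + z^2 \right) \left( \frac12 - iM \right) F\left( -\frac12 + iM, \frac12 + iM; 1; \frac{(1 - e^{-t})^2 - z^2}{(1 + e^{-t})^2 - z^2} \right) \right], \quad 0 \le z < 1 - e^{-t},
\end{aligned}
\]
and
\[
K_1(z,t) := E(z,t;0,0) = (4e^{-t})^{iM} \left( (1 + e^{-t})^2 - z^2 \right)^{-\frac12 - iM} F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(1 - e^{-t})^2 - z^2}{(1 + e^{-t})^2 - z^2} \right), \quad 0 \le z \le 1 - e^{-t},
\]
respectively. The kernels $K_0(z,t)$ and $K_1(z,t)$ play leading roles in the derivation of $L^p - L^q$ estimates. Their main properties are listed and proved in Sect. 8. Next we turn to the higher-dimensional equation with $n \ge 2$.

Theorem 0.5. If $n$ is odd, $n = 2m + 1$, $m \in \mathbb{N}$, then the solution $u = u(x,t)$ to the Cauchy problem
\[
u_{tt} - e^{-2t} \Delta u + M^2 u = f, \quad u(x,0) = 0, \quad u_t(x,0) = 0, \tag{0.19}
\]
with $f \in C^{\infty}(\mathbb{R}^{n+1})$ and with vanishing initial data is given by the next expression
\[
\begin{aligned}
u(x,t) = {} & 2 \int_0^t db \int_0^{e^{-b} - e^{-t}} dr_1 \left[ \frac{\partial}{\partial r} \left( \frac1r \frac{\partial}{\partial r} \right)^{\frac{n-3}{2}} \frac{r^{n-2}}{\omega_{n-1} c_0^{(n)}} \int_{S^{n-1}} f(x + ry, b)\, dS_y \right]_{r = r_1} \\
& \times (4e^{-b - t})^{iM} \left( (e^{-t} + e^{-b})^2 - r_1^2 \right)^{-\frac12 - iM} F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-b} - e^{-t})^2 - r_1^2}{(e^{-b} + e^{-t})^2 - r_1^2} \right),
\end{aligned}
\tag{0.20}
\]
where $c_0^{(n)} = 1 \cdot 3 \cdot \cdots \cdot (n - 2)$, and the constant $\omega_{n-1}$ is the area of the unit sphere $S^{n-1} \subset \mathbb{R}^n$.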
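If $n$ is even, $n = 2m$, $m \in \mathbb{N}$, then the solution $u = u(x,t)$ is given by the next expression
\[
\begin{aligned}
u(x,t) = {} & 2 \int_0^t db \int_0^{e^{-b} - e^{-t}} dr_1 \left[ \frac{\partial}{\partial r} \left( \frac1r \frac{\partial}{\partial r} \right)^{\frac{n-2}{2}} \frac{2 r^{n-1}}{\omega_{n-1} c_0^{(n)}} \int_{B_1^n(0)} \frac{f(x + ry, b)}{\sqrt{1 - |y|^2}}\, dV_y \right]_{r = r_1} \\
& \times (4e^{-b - t})^{iM} \left( (e^{-t} + e^{-b})^2 - r_1^2 \right)^{-\frac12 - iM} F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-b} - e^{-t})^2 - r_1^2}{(e^{-b} + e^{-t})^2 - r_1^2} \right).
\end{aligned}
\tag{0.21}
\]
Here $B_1^n(0) := \{ |y| \le 1 \}$ is the unit ball in $\mathbb{R}^n$, while $c_0^{(n)} = 1 \cdot 3 \cdot \cdots \cdot (n - 1)$. Thus, in both cases, of even and odd $n$, one can write
\[
u(x,t) = 2 \int_0^t db \int_0^{e^{-b} - e^{-t}} dr\, v(x,r;b)\, (4e^{-b - t})^{iM} \left( (e^{-t} + e^{-b})^2 - r^2 \right)^{-\frac12 - iM}
F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(e^{-b} - e^{-t})^2 - r^2}{(e^{-b} + e^{-t})^2 - r^2} \right), \tag{0.22}
\]
where the function $v(x,t;b)$ is a solution to the Cauchy problem for the wave equation $v_{tt} - \Delta v = 0$, $v(x,0;b) = f(x,b)$, $v_t(x,0;b) = 0$. The next theorem gives representation of the solutions of Eq. (0.8) with the initial data prescribed at $t = 0$.

Theorem 0.6. The solution $u = u(x,t)$ to the Cauchy problem
\[
u_{tt} - e^{-2t} \Delta u + M^2 u = 0, \quad u(x,0) = \varphi_0(x), \quad u_t(x,0) = \varphi_1(x), \tag{0.23}
\]
with $\varphi_0, \varphi_1 \in C_0^{\infty}(\mathbb{R}^n)$, $n \ge 2$, can be represented as follows:
\[
u(x,t) = e^{\frac t2} v_{\varphi_0}(x, \phi(t)) + 2 \int_0^1 v_{\varphi_0}(x, \phi(t)s)\, K_0(\phi(t)s, t)\, \phi(t)\, ds
+ 2 \int_0^1 v_{\varphi_1}(x, \phi(t)s)\, K_1(\phi(t)s, t)\, \phi(t)\, ds, \quad x \in \mathbb{R}^n,\ t > 0, \tag{0.24}
\]
$\phi(t) := 1 - e^{-t}$, and where the kernels $K_0$ and $K_1$ have been defined in Theorem 0.4. Here for $\varphi \in C_0^{\infty}(\mathbb{R}^n)$ and for $x \in \mathbb{R}^n$, $n = 2m + 1$, $m \in \mathbb{N}$,
\[
v_{\varphi}(x, \phi(t)s) := \left[ \frac{\partial}{\partial r} \left( \frac1r \frac{\partial}{\partial r} \right)^{\frac{n-3}{2}} \frac{r^{n-2}}{\omega_{n-1} c_0^{(n)}} \int_{S^{n-1}} \varphi(x + ry)\, dS_y \right]_{r = \phi(t)s},
\]
while for $x \in \mathbb{R}^n$, $n = 2m$, $m \in \mathbb{N}$,
\[
v_{\varphi}(x, \phi(t)s) := \left[ \frac{\partial}{\partial r} \left( \frac1r \frac{\partial}{\partial r} \right)^{\frac{n-2}{2}} \frac{2 r^{n-1}}{\omega_{n-1} c_0^{(n)}} \int_{B_1^n(0)} \frac{\varphi(x + ry)}{\sqrt{1 - |y|^2}}\, dV_y \right]_{r = s\phi(t)}.
\]
The function $v_{\varphi}(x, \phi(t)s)$ coincides with the value $v(x, \phi(t)s)$ of the solution $v(x,t)$ of the Cauchy problem $v_{tt} - \Delta v = 0$, $v(x,0) = \varphi(x)$, $v_t(x,0) = 0$.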
As a consequence of the above theorems we obtain in Sects. 9–10 for $n \ge 2$ and for the particles with "large" mass $m$, $m \ge n/2$, that is, with nonnegative curved mass $M \ge 0$, the following $L^p - L^q$ estimate
\[
\begin{aligned}
\| (-\Delta)^{-s} u(x,t) \|_{L^q(\mathbb{R}^n)} \le {} & C \int_0^t \| f(x,b) \|_{L^p(\mathbb{R}^n)}\, e^{-b} \left( e^{-b} - e^{-t} \right)^{1 + 2s - n\left( \frac1p - \frac1q \right)} (1 + t - b)^{1 - \operatorname{sgn} M}\, db \\
& + C (1 + t)^{1 - \operatorname{sgn} M} (1 - e^{-t})^{2s - n\left( \frac1p - \frac1q \right)} \left\{ e^{\frac t2} \| \varphi_0(x) \|_{L^p(\mathbb{R}^n)} + (1 - e^{-t}) \| \varphi_1 \|_{L^p(\mathbb{R}^n)} \right\},
\end{aligned}
\tag{0.25}
\]
provided that $1 < p \le 2$, $\frac1p + \frac1q = 1$, $\frac12 (n+1)\left( \frac1p - \frac1q \right) \le 2s \le n\left( \frac1p - \frac1q \right) < 2s + 1$. Moreover, according to Theorem 7.1 the estimate (0.25) with $\varphi_0(x) = 0$ and $\varphi_1(x) = 0$ is valid for $n = 1$ and $s = 0$ as well. The case of $n = 1$, $f(x,t) = 0$, and non-vanishing $\varphi_0(x)$ and $\varphi_1(x)$ is discussed in Sect. 8. Here we have to emphasize that the estimate (0.25) does not imply any decay for large time. It is essentially different from the decay estimate obtained in [32] for the wave equation in the anti-de Sitter spacetime. This difference is caused by the striking difference between the global geometries of the forward and backward light cones of Eq. (0.16).

The paper is organized as follows. In Sect. 1 we construct the Riemann function of the operator of (0.8) in the characteristic coordinates for the case of $n = 1$. That Riemann function is used in Sect. 2 to prove Theorem 0.1. Then in Sect. 3 we apply the fundamental solutions to solve the Cauchy problem with the source term and with vanishing initial data given at $t = 0$. More precisely, we give a representation formula for the solutions. In that section we also prove several basic properties of the function $E(x,t;x_0,t_0)$. In Sects. 4–5 we use the formulas of Sect. 3 to derive and to complete the list of representation formulas for the solutions of the Cauchy problem for the case of one-dimensional spatial variable. The higher-dimensional equation with the source term is considered in Sect. 6, where we derive a representation formula for the solutions of the Cauchy problem with the source term and with vanishing initial data given at $t = 0$. In the same section this formula is used to derive the fundamental solutions of the operator and to complete the proof of Theorem 0.6. Then in Sects. 7–10 we establish the $L^p - L^q$ decay estimates. Applications of all these results to the nonlinear equations will be done in a forthcoming paper.

1. The Riemann Function

In the characteristic coordinates $l$ and $m$,
\[
l = x + e^{-t}, \qquad m = x - e^{-t},
\]
one has
\[
\frac{\partial^2}{\partial t^2} = \frac14 (l - m)^2 \left( \frac{\partial^2}{\partial l^2} - 2\frac{\partial^2}{\partial l\, \partial m} + \frac{\partial^2}{\partial m^2} \right) + \frac12 (l - m) \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right)
\]
and
\[
e^{-2t} \frac{\partial^2}{\partial x^2} = \frac14 (l - m)^2 \left( \frac{\partial^2}{\partial l^2} + 2\frac{\partial^2}{\partial l\, \partial m} + \frac{\partial^2}{\partial m^2} \right). \tag{1.1}
\]
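Then the operator $S$ of Eq. (0.16) reads
\[
S := \frac{\partial^2}{\partial t^2} - e^{-2t} \frac{\partial^2}{\partial x^2} + M^2
= -(l - m)^2 \left\{ \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) - \frac{1}{(l - m)^2} M^2 \right\}.
\]
In particular, in the new variables the equation
\[
\left( \frac{\partial^2}{\partial t^2} - e^{-2t} \frac{\partial^2}{\partial x^2} + M^2 \right) u = 0
\quad \text{implies} \quad
\left( \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) \right) u - \frac{1}{(l - m)^2} M^2 u = 0.
\]
We need the following lemma with $\gamma = \frac12 + iM$.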
∂ ∂2 1 ∂ − γ − V (l, m; a, b) = 0 . ∂l ∂m (l − m) ∂l ∂m
(1.2)
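Before going through the proof, the statement of Lemma 1.1 can be probed numerically. The following is a rough sketch under stated assumptions: the hypergeometric function is summed by its Gauss power series (adequate only for $|z| < 1$), the derivatives are central finite differences, and the sample point and the value of $\gamma$ are arbitrary. All function names are hypothetical.

```python
def hyp2f1(a, b, c, z, tol=1e-15, max_terms=10000):
    """Gauss series sum_n (a)_n (b)_n / ((c)_n n!) z^n, intended for |z| < 1."""
    term = 1.0 + 0j
    total = term
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def V(l, m, a, b, gamma):
    """V(l, m; a, b) from Lemma 1.1; requires l > b and a > m for real bases."""
    z = (l - a) * (m - b) / ((l - b) * (m - a))
    return (l - b) ** (-gamma) * (a - m) ** (-gamma) * hyp2f1(gamma, gamma, 1.0, z)

def residual(l, m, a, b, gamma, h=1e-4):
    """Finite-difference value of V_lm - gamma/(l-m) * (V_l - V_m), cf. (1.2)."""
    V_l = (V(l + h, m, a, b, gamma) - V(l - h, m, a, b, gamma)) / (2 * h)
    V_m = (V(l, m + h, a, b, gamma) - V(l, m - h, a, b, gamma)) / (2 * h)
    V_lm = (V(l + h, m + h, a, b, gamma) - V(l + h, m - h, a, b, gamma)
            - V(l - h, m + h, a, b, gamma) + V(l - h, m - h, a, b, gamma)) / (4 * h * h)
    return V_lm - gamma / (l - m) * (V_l - V_m)

if __name__ == "__main__":
    gamma = 0.5 + 0.8j                      # gamma = 1/2 + iM with M = 0.8
    print(abs(residual(0.5, -0.3, 1.0, -1.0, gamma)))
```

The printed residual is at the level of the discretisation error, which is consistent with (1.2) but of course does not replace the proof that follows.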
Proof. In fact, the result follows by direct computations and basic properties of the hypergeometric function. We denote the argument of the hypergeometric function by $z$, and evaluate its derivatives,
\[
z := \frac{(l - a)(m - b)}{(l - b)(m - a)}, \qquad
\frac{\partial}{\partial l} z = \frac{(a - b)(b - m)}{(l - b)^2 (a - m)}, \qquad
\frac{\partial}{\partial m} z = -\frac{(a - b)(a - l)}{(b - l)(a - m)^2}.
\]
Further, we obtain
\[
\frac{\partial}{\partial l} V(l,m;a,b) = (a - m)^{-\gamma} (l - b)^{-\gamma - 1} \left\{ -\gamma F(\gamma, \gamma; 1; z) + \frac{(a - b)(b - m)}{(l - b)(a - m)} F_z(\gamma, \gamma; 1; z) \right\}.
\]
Next
\[
\frac{\partial}{\partial m} V(l,m;a,b) = (a - m)^{-\gamma - 1} (l - b)^{-\gamma} \left\{ \gamma F(\gamma, \gamma; 1; z) - \frac{(a - b)(a - l)}{(b - l)(a - m)} F_z(\gamma, \gamma; 1; z) \right\}.
\]
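Then
\[
\begin{aligned}
\left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) V(l,m;a,b)
= {} & (a - m)^{-\gamma} (l - b)^{-\gamma} (-\gamma) F(\gamma, \gamma; 1; z) \left\{ \frac{1}{(l - b)} + \frac{1}{(a - m)} \right\} \\
& + (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} F_z(\gamma, \gamma; 1; z)\, (a - b) \left\{ \frac{(b - m)}{(l - b)} - \frac{(a - l)}{(a - m)} \right\}.
\end{aligned}
\]
Furthermore,
\[
\begin{aligned}
\frac{\partial^2}{\partial l\, \partial m} V(l,m;a,b)
= {} & (a - m)^{-\gamma - 1} (-\gamma)(l - b)^{-\gamma - 1} \left\{ \gamma F(\gamma, \gamma; 1; z) - \frac{(a - b)(a - l)}{(b - l)(a - m)} F_z(\gamma, \gamma; 1; z) \right\} \\
& + (a - m)^{-\gamma - 1} (l - b)^{-\gamma} \frac{\partial}{\partial l} \left\{ \gamma F(\gamma, \gamma; 1; z) - \frac{(a - b)(a - l)}{(b - l)(a - m)} F_z(\gamma, \gamma; 1; z) \right\}.
\end{aligned}
\]
We calculate
\[
\begin{aligned}
\frac{\partial}{\partial l} \left\{ \gamma F(\gamma, \gamma; 1; z) - \frac{(a - b)(a - l)}{(b - l)(a - m)} F_z(\gamma, \gamma; 1; z) \right\}
= {} & \gamma F_z(\gamma, \gamma; 1; z) \frac{(a - b)(b - m)}{(l - b)^2 (a - m)} - \frac{(a - b)^2}{(b - l)^2 (a - m)} F_z(\gamma, \gamma; 1; z) \\
& - \frac{(a - b)(a - l)}{(b - l)(a - m)}\, \frac{(a - b)(b - m)}{(l - b)^2 (a - m)} F_{zz}(\gamma, \gamma; 1; z).
\end{aligned}
\]
Here
\[
\frac{\partial}{\partial l} \frac{(a - b)(a - l)}{(b - l)(a - m)} = \frac{(a - b)^2}{(b - l)^2 (a - m)}.
\]
Finally,
\[
\begin{aligned}
\frac{\partial^2}{\partial l\, \partial m} V(l,m;a,b)
= {} & (a - m)^{-\gamma - 1} (-\gamma)(l - b)^{-\gamma - 1} \left\{ \gamma F(\gamma, \gamma; 1; z) - \frac{(a - b)(a - l)}{(b - l)(a - m)} F_z(\gamma, \gamma; 1; z) \right\} \\
& + (a - m)^{-\gamma - 1} (l - b)^{-\gamma} \left\{ \gamma F_z(\gamma, \gamma; 1; z) \frac{(a - b)(b - m)}{(l - b)^2 (a - m)} - \frac{(a - b)^2}{(b - l)^2 (a - m)} F_z(\gamma, \gamma; 1; z) \right. \\
& \left. \qquad - \frac{(a - b)(a - l)}{(b - l)(a - m)}\, \frac{(a - b)(b - m)}{(l - b)^2 (a - m)} F_{zz}(\gamma, \gamma; 1; z) \right\}.
\end{aligned}
\]
The coefficients of the derivatives of the hypergeometric function, $F_{zz}$, $F_z$, and of $F$ in the expression for $\frac{\partial^2}{\partial l\, \partial m} V(l,m;a,b)$ are
\[
(a - m)^{-\gamma - 3} (l - b)^{-\gamma - 3} (a - b)^2 (a - l)(b - m),
\]
\[
(a - m)^{-\gamma - 2} (l - b)^{-\gamma - 2} (a - b) \left\{ \gamma (b - m - a + l) - (a - b) \right\},
\]
\[
(a - m)^{-\gamma - 1} (-\gamma)(l - b)^{-\gamma - 1} \{ -(-\gamma) \} = -(a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} \gamma^2,
\]
respectively. The coefficients of $F_z$ and $F$ in the expression for $\frac{1}{(l - m)} \gamma \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) V(l,m;a,b)$ are
\[
\frac{1}{(l - m)} \gamma (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (a - b) \left\{ \frac{(b - m)}{(l - b)} - \frac{(a - l)}{(a - m)} \right\}
\]
and
\[
-\frac{1}{(l - m)} \gamma (a - m)^{-\gamma} (l - b)^{-\gamma} \gamma \left\{ \frac{1}{(l - b)} + \frac{1}{(a - m)} \right\}.
\]
Now we turn to Eq. (1.2). The coefficients of $F$ and $F_z$ in that equation are
\[
\frac{1}{(m - l)} (a - b)(a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (-\gamma^2),
\]
and
\[
\frac{1}{(m - l)} (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (a - b) \left\{ \frac{(m - l)}{(a - m)(l - b)} \gamma (b - m - a + l) + \gamma \left( \frac{(b - m)}{(l - b)} - \frac{(a - l)}{(a - m)} \right) - \frac{(m - l)(a - b)}{(a - m)(l - b)} \right\}.
\]
The first two terms in the brackets can be written as follows:
\[
\frac{(m - l)}{(a - m)(l - b)} \gamma (b - m - a + l) + \gamma \left( \frac{(b - m)}{(l - b)} - \frac{(a - l)}{(a - m)} \right) = -2\gamma z,
\]
while the last term can be transformed to
\[
-\frac{(m - l)(a - b)}{(a - m)(l - b)} = 1 - z.
\]
Thus, the coefficient of $F_z$ in Eq. (1.2) is
\[
\frac{1}{(m - l)} (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (a - b) \left[ 1 - (1 + 2\gamma) z \right].
\]
Finally, the coefficient of $F_{zz}$ in Eq. (1.2) is
\[
\frac{1}{(m - l)} (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (a - b)\, \frac{(a - b)(a - l)(b - m)(m - l)}{(a - m)^2 (l - b)^2},
\]
where
\[
\frac{(a - b)(a - l)(b - m)(m - l)}{(a - m)^2 (l - b)^2} = z(1 - z).
\]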
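Hence, the left-hand side of (1.2) reads
\[
\left\{ \frac{\partial^2}{\partial l\, \partial m} - \gamma \frac{1}{(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) \right\} V(l,m;a,b)
= \frac{1}{(m - l)} (a - m)^{-\gamma - 1} (l - b)^{-\gamma - 1} (a - b) \left\{ z(1 - z) F_{zz} + \left[ 1 - (1 + 2\gamma) z \right] F_z - \gamma^2 F \right\} = 0,
\]
and vanishes, since $F$ solves the Gauss hypergeometric equation with $c = 1$, $a = \gamma$, and $b = \gamma$. The lemma is proven.

Lemma 1.2. For $\gamma \in \mathbb{C}$ such that $F(\gamma, \gamma; 1; z)$ is well defined, the function
\[
\widetilde{E}(l,m;a,b) := (a - b)^{\gamma - \frac12} (l - m)^{\gamma - \frac12} V(l,m;a,b)
= (a - b)^{\gamma - \frac12} (l - m)^{\gamma - \frac12} (l - b)^{-\gamma} (a - m)^{-\gamma} F\left( \gamma, \gamma; 1; \frac{(l - a)(m - b)}{(l - b)(m - a)} \right)
\]
solves the equation
\[
\left\{ \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) \right\} \widetilde{E}(l,m;a,b) + \frac{1}{(l - m)^2} \left( \frac12 - \gamma \right)^2 \widetilde{E}(l,m;a,b) = 0.
\]
Proof. Indeed, straightforward calculations show
\[
(a - b)^{-\gamma + \frac12} \left\{ \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) + \frac{1}{(l - m)^2} \left( \frac12 - \gamma \right)^2 \right\} \widetilde{E}
= (l - m)^{\gamma - \frac12} \left\{ V_{lm} - \frac{1}{(l - m)} \gamma (V_l - V_m) \right\} = 0.
\]
The lemma is proven.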
Consider now the operator
\[
S_{ch}^{*} := \frac{\partial^2}{\partial l\, \partial m} + \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) - \frac{1}{(l - m)^2} (M^2 + 1),
\]
which is formally adjoint to the operator
\[
S_{ch} := \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) - \frac{1}{(l - m)^2} M^2.
\]
In the next proposition the Riemann function (see, e.g., [9, Ch.V, §5]) is presented.

Proposition 1.3. The function
\[
R(l,m;a,b) = (l - m) \widetilde{E}(l,m;a,b)
= (a - b)^{iM} (l - m)^{1 + iM} (l - b)^{-\frac12 - iM} (a - m)^{-\frac12 - iM} F\left( \frac12 + iM, \frac12 + iM; 1; \frac{(l - a)(m - b)}{(l - b)(m - a)} \right)
\]
defined for all $l, m, a, b \in \mathbb{R}$, such that $l > m$, is a unique solution of the equation $S_{ch}^{*} R = 0$, which satisfies the following conditions:
(i) $R_l = \frac{1}{2(l - m)} R$ along the line $m = b$;
(ii) $R_m = -\frac{1}{2(l - m)} R$ along the line $l = a$;
(iii) $R(a,b;a,b) = 1$.
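Proof. Indeed, if we denote $\gamma = \frac12 + iM$, then for the Riemann function we have
\[
R(l,m;a,b) = (a - b)^{\gamma - \frac12} (l - m)^{\gamma + \frac12} V(l,m;a,b) = (l - m) \widetilde{E}(l,m;a,b).
\]
The operators $S_{ch}$ and $S_{ch}^{*}$ can be written as follows:
\[
S_{ch} = \frac{\partial^2}{\partial l\, \partial m} - \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) + \frac{1}{(l - m)^2} \left( \gamma - \frac12 \right)^2,
\]
\[
S_{ch}^{*} = \frac{\partial^2}{\partial l\, \partial m} + \frac{1}{2(l - m)} \left( \frac{\partial}{\partial l} - \frac{\partial}{\partial m} \right) - \frac{1}{(l - m)^2} \left[ 1 - \left( \gamma - \frac12 \right)^2 \right].
\]
Direct calculations show that, if the function $u$ solves the equation $S_{ch} u = 0$, then the function $v = (l - m)u$ solves the equation $S_{ch}^{*} v = 0$, and vice versa. Then Lemma 1.2 completes the proof. The proposition is proven.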
2. Proof of Theorem 0.1

Next we use the Riemann function $R(l,m;a,b)$ and the function $E(x,t;x_0,t_0)$ defined by (0.13) to complete the proof of Theorem 0.1, which gives the fundamental solution with support in the forward cone $D_+(x_0,t_0)$, $x_0\in\mathbb{R}^n$, $t_0\in\mathbb{R}$, and the fundamental solution with support in the backward cone $D_-(x_0,t_0)$, $x_0\in\mathbb{R}^n$, $t_0\in\mathbb{R}$, defined by (0.12) with plus and minus, respectively. We present a proof for $E_+(x,t;0,t_0)$ since for $E_-(x,t;0,t_0)$ it is similar. First, we note that the operator $S$ is formally self-adjoint, $S=S^{*}$. We must show that
\[
\langle E_+,S\varphi\rangle=\varphi(0,t_0)\quad\text{for every }\varphi\in C_0^\infty(\mathbb{R}^2).
\]
Since the distribution $E_+(x,t;0,t_0)$ is locally integrable in $\mathbb{R}^2$, this is equivalent to showing that
\[
\iint_{\mathbb{R}^2}E_+(x,t;0,t_0)\,S\varphi(x,t)\,dx\,dt=\varphi(0,t_0)\quad\text{for every }\varphi\in C_0^\infty(\mathbb{R}^2). \tag{2.1}
\]
Let $(b,-b):=(e^{-t_0},-e^{-t_0})$ be the image of the point $(0,t_0)$ in the characteristic coordinates $l$ and $m$. The image of the interior of $D_+(0,t_0)$ in the characteristic coordinates $l$ and $m$ is the open triangle
\[
T_+(b,-b):=\{(l,m)\in\mathbb{R}^2\;|\;l>m,\ m>-e^{-t_0},\ l<e^{-t_0}\}.
\]
For the functions $\varphi$ and $E$ in the new variables we use the notations $\widetilde\varphi$ and $\widetilde E$, respectively, that is, $\varphi(x,t)=\widetilde\varphi(l,m)$ and $E(x,t;0,t_0)=\widetilde E(l,m;b,-b)$. It is evident that $\widetilde\varphi\in C_0^\infty(\mathbb{R}^2)$ and $\operatorname{supp}\widetilde\varphi\subset\{(l,m)\in\mathbb{R}^2\;|\;l>m\}$. Meanwhile, $|d(x,t)/d(l,m)|=(l-m)^{-1}$ is the Jacobian of the transformation (1.1). Hence the integral in the left-hand side of (2.1) is equal to
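\[
\begin{aligned}
\iint_{\mathbb{R}^2}E_+(x,t;0,t_0)\,S\varphi(x,t)\,dx\,dt
&=\int_{t_0}^{\infty}dt\int_{e^{-t}-e^{-t_0}}^{e^{-t_0}-e^{-t}}E(x,t;0,t_0)\,S\varphi(x,t)\,dx\\
&=-\iint_{T_+(b,-b)}R(l,m;b,-b)\left\{\frac{\partial^2}{\partial l\,\partial m}-\frac{1}{2(l-m)}\left(\frac{\partial}{\partial l}-\frac{\partial}{\partial m}\right)-\frac{1}{(l-m)^2}M^2\right\}\widetilde\varphi\,dl\,dm\\
&=-\int_{-\infty}^{b}dl\int_{-b}^{\infty}dm\,R(l,m;b,-b)\left\{\frac{\partial^2}{\partial l\,\partial m}-\frac{1}{2(l-m)}\left(\frac{\partial}{\partial l}-\frac{\partial}{\partial m}\right)-\frac{1}{(l-m)^2}M^2\right\}\widetilde\varphi .
\end{aligned}
\tag{2.2}
\]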
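The change of variables used above can be checked numerically. The following is a minimal sketch, assuming that the transformation (1.1) is $l=x+e^{-t}$, $m=x-e^{-t}$ (an assumption inferred from the image $(e^{-t_0},-e^{-t_0})$ of the point $(0,t_0)$; the function names are hypothetical). It estimates the Jacobian $|d(x,t)/d(l,m)|$ by central differences and tests membership in the triangle $T_+(b,-b)$.

```python
import math

def to_characteristic(x, t):
    """(x, t) -> (l, m), assuming l = x + e^{-t}, m = x - e^{-t} (our reading of (1.1))."""
    e = math.exp(-t)
    return x + e, x - e

def from_characteristic(l, m):
    """Inverse map (l, m) -> (x, t); requires l > m."""
    if l <= m:
        raise ValueError("need l > m")
    return 0.5 * (l + m), -math.log(0.5 * (l - m))

def jacobian_xt_lm(l, m, h=1e-6):
    """Central-difference estimate of |d(x,t)/d(l,m)|."""
    x_lp, t_lp = from_characteristic(l + h, m)
    x_lm, t_lm = from_characteristic(l - h, m)
    x_mp, t_mp = from_characteristic(l, m + h)
    x_mm, t_mm = from_characteristic(l, m - h)
    x_l, t_l = (x_lp - x_lm) / (2 * h), (t_lp - t_lm) / (2 * h)
    x_m, t_m = (x_mp - x_mm) / (2 * h), (t_mp - t_mm) / (2 * h)
    return abs(x_l * t_m - x_m * t_l)

def in_forward_triangle(l, m, t0):
    """Membership in T_+(b, -b) with b = e^{-t0}."""
    b = math.exp(-t0)
    return l > m and m > -b and l < b
```

For a point inside the forward cone, for instance $(x,t)=(0.1,2)$ with $t_0=1$, the estimate returned by `jacobian_xt_lm` should agree with $(l-m)^{-1}$ up to discretization error.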
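The rest of the proof is standard (see, e.g., [9, Ch.V, §5]), but we give it to make this section self-contained. We consider the first term of (2.2) and integrate it by parts:
\[
\begin{aligned}
&\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\,R(l,m;b,-b)\,\frac{\partial^2\widetilde\varphi}{\partial l\,\partial m}\\
&\quad=\int_{-b}^{\infty}R(b,m;b,-b)\,\frac{\partial\widetilde\varphi}{\partial m}\Big|_{l=b}\,dm
-\int_{-\infty}^{b}dl\left[-\frac{\partial R(l,-b;b,-b)}{\partial l}\,\widetilde\varphi\Big|_{m=-b}
-\int_{-b}^{\infty}dm\,\frac{\partial^2R(l,m;b,-b)}{\partial l\,\partial m}\,\widetilde\varphi\right]\\
&\quad=-\widetilde\varphi(b,-b)-\int_{-b}^{\infty}\Big(\frac{\partial}{\partial m}R(b,m;b,-b)\Big)\widetilde\varphi(b,m)\,dm
+\int_{-\infty}^{b}\Big(\frac{\partial}{\partial l}R(l,-b;b,-b)\Big)\widetilde\varphi(l,-b)\,dl\\
&\qquad+\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\,\Big(\frac{\partial^2}{\partial l\,\partial m}R(l,m;b,-b)\Big)\widetilde\varphi .
\end{aligned}
\]
Then, for the second term in Eq. (2.2) we obtain
\[
\begin{aligned}
&-\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\,R(l,m;b,-b)\,\frac{1}{2(l-m)}\left(\frac{\partial}{\partial l}-\frac{\partial}{\partial m}\right)\widetilde\varphi\\
&\quad=-\int_{-\infty}^{b}\frac{1}{2(l+b)}R(l,-b;b,-b)\,\widetilde\varphi(l,-b)\,dl
-\int_{-b}^{\infty}\frac{1}{2(b-m)}R(b,m;b,-b)\,\widetilde\varphi(b,m)\,dm\\
&\qquad-\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\,\frac{1}{(l-m)^2}R(l,m;b,-b)\,\widetilde\varphi(l,m)\\
&\qquad+\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\left\{\frac{1}{2(l-m)}\left(\frac{\partial}{\partial l}-\frac{\partial}{\partial m}\right)R(l,m;b,-b)\right\}\widetilde\varphi(l,m).
\end{aligned}
\]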
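Using properties of the Riemann function we derive
\[
\begin{aligned}
&-\int_{-b}^{\infty}dm\int_{-\infty}^{b}dl\,R(l,m;b,-b)\left\{\frac{\partial^2}{\partial l\,\partial m}-\frac{1}{2(l-m)}\left(\frac{\partial}{\partial l}-\frac{\partial}{\partial m}\right)-\frac{1}{(l-m)^2}M^2\right\}\widetilde\varphi\\
&\quad=\widetilde\varphi(b,-b)+\int_{-b}^{\infty}\Big(\frac{\partial}{\partial m}R(b,m;b,-b)\Big)\widetilde\varphi(b,m)\,dm-\int_{-\infty}^{b}\Big(\frac{\partial}{\partial l}R(l,-b;b,-b)\Big)\widetilde\varphi(l,-b)\,dl\\
&\qquad+\int_{-\infty}^{b}\frac{1}{2(l+b)}R(l,-b;b,-b)\,\widetilde\varphi(l,-b)\,dl+\int_{-b}^{\infty}\frac{1}{2(b-m)}R(b,m;b,-b)\,\widetilde\varphi(b,m)\,dm\\
&\quad=\widetilde\varphi(b,-b)=\varphi(0,t_0).
\end{aligned}
\]
The theorem is proven.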
3. Application to the Cauchy Problem: Source Term and n = 1

Consider now the Cauchy problem for Eq. (0.16) with vanishing initial data (0.17). For every $(x,t)\in D_+(0,b)$ one has $e^{-t}-e^{-b}\le x\le e^{-b}-e^{-t}$, so that
\[
E(x,t;0,b)=(4e^{-b-t})^{iM}\big((e^{-t}+e^{-b})^2-x^2\big)^{-\frac12-iM}
F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^{-b}-e^{-t})^2-x^2}{(e^{-b}+e^{-t})^2-x^2}\right).
\]
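The kernel written above can be evaluated directly from the Gauss series, since inside the cone the argument of the hypergeometric function lies in $[0,1)$. The following is a minimal sketch under that assumption; the names `hyp2f1` and `kernel_E` are hypothetical, and the plain power series is chosen for transparency rather than efficiency (it converges slowly when the argument is close to 1).

```python
import math

def hyp2f1(a, b, c, w, tol=1e-15, max_terms=100000):
    """Gauss series sum_n (a)_n (b)_n / ((c)_n n!) w^n, valid for |w| < 1."""
    term = 1.0 + 0j
    total = term
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * w
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def kernel_E(x, t, b, M):
    """E(x, t; 0, b) for t > b and |x| < e^{-b} - e^{-t}, as in the formula above."""
    et, eb = math.exp(-t), math.exp(-b)
    if not abs(x) < eb - et:
        raise ValueError("point is outside the open cone")
    plus = (et + eb) ** 2 - x * x      # (e^{-t} + e^{-b})^2 - x^2 > 0
    minus = (eb - et) ** 2 - x * x     # (e^{-b} - e^{-t})^2 - x^2 > 0
    g = 0.5 + 1j * M
    return (4.0 * eb * et) ** (1j * M) * plus ** (-g) * hyp2f1(g, g, 1.0, minus / plus)
```

For $M=0$ the value is real, and the series can be cross-checked against the classical arithmetic-geometric mean identity $F(\frac12,\frac12;1;w)=1/\operatorname{agm}(1,\sqrt{1-w})$.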
The coefficient in Eq. (0.8) is independent of $x$, therefore $E_+(x,t;y,b)=E_+(x-y,t;0,b)$. Using the fundamental solution from Theorem 0.1 one can write the convolution
\[
u(x,t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}E_+(x,t;y,b)f(y,b)\,db\,dy=\int_0^t db\int_{-\infty}^{\infty}E_+(x-y,t;0,b)f(y,b)\,dy, \tag{3.1}
\]
which is well-defined since $\operatorname{supp}f\subset\{t\ge0\}$. Then according to the definition of the distribution $E_+$ we obtain the statement of Theorem 0.3. Thus, Theorem 0.3 is proven.

Remark 3.1. The argument of the hypergeometric function is nonnegative and bounded, 0 ≤
(e−b − e−t )2 − z 2 0 has an extension as an even function for $r<0$ and hence $\Omega_r(u)$ has a natural extension as an odd function. That allows replacing the mixed problem with the Cauchy problem. Namely, let the functions $\widetilde v$ and $\widetilde F$ be the continuations of the function $v$ and the forcing term $F$, respectively, by
\[
\widetilde v(x,r,t)=\begin{cases}v(x,r,t), & \text{if } r\ge0,\\ -v(x,-r,t), & \text{if } r\le0,\end{cases}
\qquad
\widetilde F(x,r,t)=\begin{cases}F(x,r,t), & \text{if } r\ge0,\\ -F(x,-r,t), & \text{if } r\le0.\end{cases}
\]
Then $\widetilde v$ solves the Cauchy problem
\[
\begin{cases}
\widetilde v_{tt}(x,r,t)-e^{-2t}\widetilde v_{rr}(x,r,t)+M^2\widetilde v(x,r,t)=\widetilde F(x,r,t) & \text{for all } t\ge0,\ r\in\mathbb{R},\ x\in\mathbb{R}^n,\\
\widetilde v(x,r,0)=0,\quad \widetilde v_t(x,r,0)=0 & \text{for all } r\in\mathbb{R},\ x\in\mathbb{R}^n.
\end{cases}
\]
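The odd continuation used here is easy to express in code. The following is a small sketch (the helper names are hypothetical, and the function $g$ is an arbitrary illustration with $g(0)=0$, not a quantity from the paper). The second helper illustrates why the continuation is convenient: for fixed $r_1>0$ the quotient $\frac1r\{\widetilde g(r-r_1)+\widetilde g(r+r_1)\}$ tends to $2g'(r_1)$ as $r\to0^+$.

```python
def odd_extension(g):
    """Return g_tilde with g_tilde(r) = g(r) for r >= 0 and -g(-r) for r < 0."""
    def g_tilde(r):
        return g(r) if r >= 0 else -g(-r)
    return g_tilde

def symmetric_quotient(g, r1, r):
    """(g_tilde(r - r1) + g_tilde(r + r1)) / r; tends to 2 g'(r1) as r -> 0+."""
    g_tilde = odd_extension(g)
    return (g_tilde(r - r1) + g_tilde(r + r1)) / r

# Illustration with g(r) = r + r^2, for which 2 g'(1/2) = 4.
g = lambda r: r + r * r
quotient = symmetric_quotient(g, 0.5, 1e-4)
```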
Hence, according to Theorem 0.3, one has the representation
\[
\widetilde v(x,r,t)=\int_0^t db\int_{r-(e^{-b}-e^{-t})}^{r+e^{-b}-e^{-t}}dr_1\,\widetilde F(x,r_1,b)\,(4e^{-b-t})^{iM}\big((e^{-t}+e^{-b})^2-(r-r_1)^2\big)^{-\frac12-iM}
F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^{-b}-e^{-t})^2-(r-r_1)^2}{(e^{-b}+e^{-t})^2-(r-r_1)^2}\right).
\]
Since $u(x,t)=\lim_{r\to0}\widetilde v(x,r,t)/(c_0^{(n)}r)$, we consider the case of $r<t$ in the above representation to obtain:
\[
u(x,t)=\frac{1}{c_0^{(n)}}\int_0^t db\int_0^{e^{-b}-e^{-t}}dr_1\,\lim_{r\to0}\frac1r\Big\{\widetilde F(x,r-r_1,b)+\widetilde F(x,r+r_1,b)\Big\}
(4e^{-b-t})^{iM}\big((e^{-t}+e^{-b})^2-r_1^2\big)^{-\frac12-iM}
F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^{-b}-e^{-t})^2-r_1^2}{(e^{-b}+e^{-t})^2-r_1^2}\right).
\]
Then by definition of the function $\widetilde F$, we replace $\lim_{r\to0}\frac1r\big\{\widetilde F(x,r-r_1,b)+\widetilde F(x,r+r_1,b)\big\}$ with $2\big(\frac{\partial}{\partial r}F(x,r,b)\big)_{r=r_1}$ in the last formula. The definitions of $F(x,r,t)$ and of the operator $\Omega_r$ yield:
\[
u(x,t)=\frac{2}{c_0^{(n)}}\int_0^t db\int_0^{e^{-b}-e^{-t}}dr_1\left(\frac{\partial}{\partial r}\Big(\frac1r\frac{\partial}{\partial r}\Big)^{m-1}r^{2m-1}I_f(x,r,t)\right)_{r=r_1}
(4e^{-b-t})^{iM}\big((e^{-t}+e^{-b})^2-r_1^2\big)^{-\frac12-iM}
F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^{-b}-e^{-t})^2-r_1^2}{(e^{-b}+e^{-t})^2-r_1^2}\right),
\]
where $x\in\mathbb{R}^n$, $n=2m+1$, $m\in\mathbb{N}$. Thus, the solution to the Cauchy problem is given by (0.20). We employ the method of descent (see, e.g., [9, Ch.VI, §12]) to complete the proof for the case of even $n$, $n=2m$, $m\in\mathbb{N}$. Theorem 0.5 is proven.

Proof of (0.14) and (0.15). We set $f(x,b)=\delta(x)\delta(t-b)$ in (0.20) and (0.21), and we obtain (0.14) and (0.15), where if $n$ is odd, then
\[
E^{w}(x,t):=\frac{1}{\omega_{n-1}\,1\cdot3\cdot5\cdots(n-2)}\,\frac{\partial}{\partial t}\Big(\frac1t\frac{\partial}{\partial t}\Big)^{\frac{n-3}{2}}\frac1t\,\delta(|x|-t),
\]
while for $n$ even we have
\[
E^{w}(x,t):=\frac{2}{\omega_{n-1}\,1\cdot3\cdot5\cdots(n-1)}\,\frac{\partial}{\partial t}\Big(\frac1t\frac{\partial}{\partial t}\Big)^{\frac{n-2}{2}}\frac{1}{\sqrt{t^2-|x|^2}}\,\chi_{B_t}(x).
\]
Here $\chi_{B_t}(x)$ denotes the characteristic function of the ball $B_t(x):=\{x\in\mathbb{R}^n;\ |x|\le t\}$. The constant $\omega_{n-1}$ is the area of the unit sphere $S^{n-1}\subset\mathbb{R}^n$. The distribution $\delta(|x|-t)$ is defined by $\langle\delta(|\cdot|-t),f(\cdot)\rangle=\int_{|x|=t}f(x)\,dx$ for $f\in C_0^\infty(\mathbb{R}^n)$.

Proof of Theorem 0.6. First we consider the case of $\varphi_0(x)=0$. More precisely, we have to prove that the solution $u(x,t)$ of the Cauchy problem (0.23) with $\varphi_0(x)=0$ can be represented by (0.24) with $\varphi_0(x)=0$. The next lemma will be used in both cases.

Lemma 6.1. Consider the mixed problem
\[
\begin{cases}
v_{tt}-e^{-2t}v_{rr}+M^2v=0 & \text{for all } t\ge0,\ r\ge0,\\
v(r,0)=\tau_0(r),\quad v_t(r,0)=\tau_1(r) & \text{for all } r\ge0,\\
v(0,t)=0 & \text{for all } t\ge0,
\end{cases}
\]
and denote by $\widetilde\tau_0(r)$ and $\widetilde\tau_1(r)$ the continuations of the functions $\tau_0(r)$ and $\tau_1(r)$ for negative $r$ as odd functions: $\widetilde\tau_0(-r)=-\tau_0(r)$ and $\widetilde\tau_1(-r)=-\tau_1(r)$ for all $r\ge0$, respectively. Then the unique solution $v(r,t)$ to the mixed problem is given by the restriction of (4.1) to $r\ge0$:
\[
\begin{aligned}
v(r,t)={}&\frac12e^{\frac t2}\Big[\widetilde\tau_0(r+1-e^{-t})+\widetilde\tau_0(r-1+e^{-t})\Big]
+\int_0^1\Big[\widetilde\tau_0(r-\phi(t)s)+\widetilde\tau_0(r+\phi(t)s)\Big]K_0(\phi(t)s,t)\,\phi(t)\,ds\\
&+\int_0^1\Big[\widetilde\tau_1(r+\phi(t)s)+\widetilde\tau_1(r-\phi(t)s)\Big]K_1(\phi(t)s,t)\,\phi(t)\,ds,
\end{aligned}
\]
where $K_0(z,t)$ and $K_1(z,t)$ are defined in Theorem 0.4 and $\phi(t)=1-e^{-t}$.

Proof. This lemma is a direct consequence of Theorem 0.4.
Now let us consider the case of $x\in\mathbb{R}^n$, where $n=2m+1$. First for the given function $u=u(x,t)$ we define the spherical means of $u$ about the point $x$. One can recover the functions by means of (6.1), (6.2), and
\[
\varphi_i(x)=\lim_{r\to0}I_{\varphi_i}(x,r)=\lim_{r\to0}\frac{1}{c_0^{(n)}r}\,\Omega_r(\varphi_i)(x),\qquad i=0,1.
\]
Then we arrive at the following mixed problem:
\[
\begin{cases}
v_{tt}(x,r,t)-e^{-2t}v_{rr}(x,r,t)+M^2v(x,r,t)=0 & \text{for all } t\ge0,\ r\ge0,\ x\in\mathbb{R}^n,\\
v(x,0,t)=0 & \text{for all } t\ge0,\ x\in\mathbb{R}^n,\\
v(x,r,0)=0,\quad v_t(x,r,0)=\Phi_1(x,r) & \text{for all } r\ge0,\ x\in\mathbb{R}^n,
\end{cases}
\]
with the unknown function $v(x,r,t):=\Omega_r(u)(x,r,t)$, where
\[
\Phi_i(x,r):=\Omega_r(\varphi_i)(x)=\Big(\frac1r\frac{\partial}{\partial r}\Big)^{m-1}r^{2m-1}\frac{1}{\omega_{n-1}}\int_{S^{n-1}}\varphi_i(x+ry)\,dS_y, \tag{6.3}
\]
\[
\Phi_i(x,0)=0,\quad i=0,1,\quad\text{for all } x\in\mathbb{R}^n. \tag{6.4}
\]
Then, according to Lemma 6.1 and to $u(x,t)=\lim_{r\to0}v(x,r,t)/(c_0^{(n)}r)$, we obtain:
\[
u(x,t)=\frac{1}{c_0^{(n)}}\lim_{r\to0}\frac1r\int_0^1\Big[\widetilde\Phi_1\big(x,r+\phi(t)s\big)+\widetilde\Phi_1\big(x,r-\phi(t)s\big)\Big]K_1(\phi(t)s,t)\,\phi(t)\,ds.
\]
The last limit is equal to
\[
2\int_0^1\Big(\frac{\partial}{\partial r}\Phi_1(x,r)\Big)_{r=\phi(t)s}K_1(\phi(t)s,t)\,\phi(t)\,ds
=2\int_0^1\left(\frac{\partial}{\partial r}\Big(\frac1r\frac{\partial}{\partial r}\Big)^{\frac{n-3}{2}}\frac{r^{n-2}}{\omega_{n-1}}\int_{S^{n-1}}\varphi_1(x+ry)\,dS_y\right)_{r=\phi(t)s}K_1(\phi(t)s,t)\,\phi(t)\,ds.
\]
Thus, Theorem 0.6 in the case of $\varphi_0(x)=0$ is proven. Now we turn to the case of $\varphi_1(x)=0$. Thus, we arrive at the following mixed problem:
\[
\begin{cases}
v_{tt}(x,r,t)-e^{-2t}v_{rr}(x,r,t)+M^2v(x,r,t)=0 & \text{for all } t\ge0,\ r\ge0,\ x\in\mathbb{R}^n,\\
v(x,r,0)=\Phi_0(x,r),\quad v_t(x,r,0)=0 & \text{for all } r\ge0,\ x\in\mathbb{R}^n,\\
v(x,0,t)=0 & \text{for all } t\ge0,\ x\in\mathbb{R}^n,
\end{cases}
\]
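with the unknown function $v(x,r,t):=\Omega_r(u)(x,r,t)$ defined by (6.3), (6.4). Then, according to Lemma 6.1 and to $u(x,t)=\lim_{r\to0}v(x,r,t)/(c_0^{(n)}r)$, we obtain:
\[
\begin{aligned}
u(x,t)&=\frac{1}{c_0^{(n)}}e^{\frac t2}\lim_{r\to0}\frac{1}{2r}\Big[\widetilde\Phi_0(x,r+\phi(t))+\widetilde\Phi_0(x,r-\phi(t))\Big]\\
&\quad+\frac{2}{c_0^{(n)}}\int_0^1\lim_{r\to0}\frac{1}{2r}\Big[\widetilde\Phi_0(x,r-\phi(t)s)+\widetilde\Phi_0(x,r+\phi(t)s)\Big]K_0(\phi(t)s,t)\,\phi(t)\,ds\\
&=\frac{1}{c_0^{(n)}}e^{\frac t2}\Big(\frac{\partial}{\partial r}\Phi_0(x,r)\Big)_{r=\phi(t)}
+\frac{2}{c_0^{(n)}}\int_0^1\Big(\frac{\partial}{\partial r}\Phi_0(x,r)\Big)_{r=\phi(t)s}K_0(\phi(t)s,t)\,\phi(t)\,ds\\
&=e^{\frac t2}v_{\varphi_0}(x,\phi(t))+2\int_0^1v_{\varphi_0}(x,\phi(t)s)\,K_0(\phi(t)s,t)\,\phi(t)\,ds .
\end{aligned}
\]
Theorem 0.6 is proven.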
7. $L^p-L^q$ and $L^q-L^q$ Estimates for the Solutions of the One-dimensional Equation, n = 1

Consider now the Cauchy problem for Eq. (0.16) with the source term and with vanishing initial data (0.17).

Theorem 7.1. For every function $f\in C^2(\mathbb{R}\times[0,\infty))$ such that $f(\cdot,t)\in C_0^\infty(\mathbb{R}_x)$ the solution $u=u(x,t)$ of the Cauchy problem (0.16), (0.17) satisfies the inequality
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le C_M\,e^{t(1-1/\rho)}\int_0^t(1+t-b)^{1-\operatorname{sgn}M}(e^{t-b}-1)^{1/\rho}(e^{t-b}+1)^{-1}\,\|f(x,b)\|_{L^p(\mathbb{R}_x)}\,db
\]
for all $t>0$, where $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\rho<2$, $\frac1\rho+\frac{1}{\rho'}=1$.
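The relations between the exponents in Theorem 7.1 are those of Young's inequality for convolutions, $\frac1p+\frac1\rho=1+\frac1q$. The following is a minimal sketch that computes $\rho'$ and $q$ from $p$ and $\rho$ with exact rational arithmetic; the function name is hypothetical, and the sketch covers only the range $1<\rho<2$.

```python
from fractions import Fraction

def young_exponents(p, rho):
    """Return (rho_conj, q) with 1/rho + 1/rho_conj = 1 and 1/q = 1/p - 1/rho_conj.

    Requires 1 < rho < 2 and 1 < p < rho_conj, as in the hypotheses above.
    """
    p, rho = Fraction(p), Fraction(rho)
    if not (1 < rho < 2):
        raise ValueError("need 1 < rho < 2")
    rho_conj = rho / (rho - 1)
    if not (1 < p < rho_conj):
        raise ValueError("need 1 < p < rho'")
    q = 1 / (1 / p - 1 / rho_conj)
    return rho_conj, q

# Example: rho = 3/2 gives rho' = 3; with p = 2 one gets q = 6.
rho_conj, q = young_exponents(2, Fraction(3, 2))
```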
Proof. Using the fundamental solution from Theorem 0.1 one can write the convolution
\[
u(x,t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}E_+(x,t;y,b)f(y,b)\,db\,dy=\int_0^t db\int_{-\infty}^{\infty}E_+(x-y,t;0,b)f(y,b)\,dy.
\]
Due to Young's inequality we have
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le c\int_0^t db\left(\int_{-(\phi(t)-\phi(b))}^{\phi(t)-\phi(b)}|E(x,t;0,b)|^{\rho}\,dx\right)^{1/\rho}\|f(x,b)\|_{L^p(\mathbb{R}_x)},
\]
where $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$, $\phi(t)=1-e^{-t}$. The integral in parentheses can be transformed as follows:
\[
\int_{-(\phi(t)-\phi(b))}^{\phi(t)-\phi(b)}|E(x,t;0,b)|^{\rho}\,dx
=2e^{-t+t\rho}\int_0^{e^{t-b}-1}\big((e^{t-b}+1)^2-y^2\big)^{-\frac\rho2}
\left|F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^{t-b}-1)^2-y^2}{(e^{t-b}+1)^2-y^2}\right)\right|^{\rho}dy.
\]
Denote $z:=e^{t-b}$, where $t\ge b$ and $z\in[1,\infty)$, and consider the integral
\[
\int_0^{z-1}\big((z+1)^2-y^2\big)^{-\frac\rho2}\left|F\!\left(\frac12+iM,\frac12+iM;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\right)\right|^{\rho}dy.
\]
First, consider the case of $M>0$. There is a formula (see 15.3.6 of Ch. 15 [1] and [2]) that ties together points $z=0$ and $z=1$:
\[
\begin{aligned}
F(a,b;c;z)={}&\frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}\,F(a,b;a+b-c+1;1-z)\\
&+(1-z)^{c-a-b}\,\frac{\Gamma(c)\Gamma(a+b-c)}{\Gamma(a)\Gamma(b)}\,F(c-a,c-b;c-a-b+1;1-z),\qquad|\arg(1-z)|<\pi.
\end{aligned}
\tag{7.1}
\]
Each term of the last formula has a pole when $c=a+b\pm m$ $(m=0,1,2,\ldots)$; this case is covered by 15.3.10 of Ch. 15 [1]:
\[
F(a,b;a+b;z)=\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\sum_{n=0}^{\infty}\frac{(a)_n(b)_n}{(n!)^2}\big[2\psi(n+1)-\psi(a+n)-\psi(b+n)-\ln(1-z)\big](1-z)^n,
\qquad|\arg(1-z)|<\pi,\ |1-z|<1.
\tag{7.2}
\]
If $\Re(c-a-b)>0$, then $F(a,b;c;1)=\dfrac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}$. For every given $\varepsilon\in(0,1)$ the first formula implies $\big|F\big(\frac12+iM,\frac12+iM;1;z\big)\big|\le C_{M,\varepsilon}$ for all $z\in[\varepsilon,1)$ and, consequently, together with the second one, means
\[
\Big|F\Big(\frac12+iM,\frac12+iM;1;z\Big)\Big|\le C_M\big(1-\ln(1-z)\big)^{1-\operatorname{sgn}M}\quad\text{for all } z\in[0,1). \tag{7.3}
\]
Thus,
\[
\int_{-(\phi(t)-\phi(b))}^{\phi(t)-\phi(b)}|E(x,t;0,b)|^{\rho}\,dx\le C_M\,e^{-t+t\rho}\int_0^{e^{t-b}-1}\big((e^{t-b}+1)^2-y^2\big)^{-\frac\rho2}dy.
\]
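For $M=0$ the logarithmic bound (7.3) can be inspected numerically. The sketch below relies on the classical identity $F(\frac12,\frac12;1;z)=1/\operatorname{agm}(1,\sqrt{1-z})$, which is not taken from the paper, and evaluates the ratio of $F$ to $1-\ln(1-z)$; comparing Taylor coefficients suggests that this ratio does not exceed 1 on $[0,1)$. The function names are hypothetical.

```python
import math

def agm(a, g, iterations=40):
    """Arithmetic-geometric mean of a and g (quadratically convergent iteration)."""
    for _ in range(iterations):
        a, g = 0.5 * (a + g), math.sqrt(a * g)
    return a

def hyp_half_half_one(z):
    """F(1/2, 1/2; 1; z) for 0 <= z < 1, via F = 1 / agm(1, sqrt(1 - z))."""
    if not 0.0 <= z < 1.0:
        raise ValueError("need 0 <= z < 1")
    return 1.0 / agm(1.0, math.sqrt(1.0 - z))

def log_bound_ratio(z):
    """Ratio F(1/2, 1/2; 1; z) / (1 - ln(1 - z)) appearing in (7.3) with M = 0."""
    return hyp_half_half_one(z) / (1.0 - math.log(1.0 - z))

samples = [0.0, 0.25, 0.5, 0.9, 0.999, 1.0 - 1e-9]
ratios = [log_bound_ratio(z) for z in samples]
```

On this grid the ratio starts at 1 for $z=0$ and decreases towards $1/\pi$ as $z\to1$, which is consistent with a bound of the form (7.3) for $M=0$.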
Then, for all $z>1$ the following equality
\[
\int_0^{z-1}\big((z+1)^2-r^2\big)^{-\frac\rho2}dr=(z-1)(z+1)^{-\rho}\,F\!\left(\frac12,\frac\rho2;\frac32;\frac{(z-1)^2}{(z+1)^2}\right) \tag{7.4}
\]
holds, provided that $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$. In particular, if $\rho<2$, then
\[
\int_0^{z-1}\big((z+1)^2-r^2\big)^{-\frac\rho2}dr\le C_\rho\,(z-1)(z+1)^{-\rho}.
\]
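The identity (7.4) can be checked numerically. The following is a rough sketch that compares a composite Simpson quadrature of the left-hand side with the Gauss series on the right-hand side; the helper names, the number of panels, and the tolerances are illustrative choices, not part of the paper.

```python
import math

def hyp2f1_real(a, b, c, w, tol=1e-16, max_terms=10000):
    """Gauss series for F(a, b; c; w) with real parameters and 0 <= w < 1."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * w
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def lhs(z, rho, panels=2000):
    """Composite Simpson rule for int_0^{z-1} ((z+1)^2 - r^2)^(-rho/2) dr (panels even)."""
    b = z - 1.0
    h = b / panels
    f = lambda r: ((z + 1.0) ** 2 - r * r) ** (-0.5 * rho)
    s = f(0.0) + f(b)
    for k in range(1, panels):
        s += (4 if k % 2 else 2) * f(k * h)
    return s * h / 3.0

def rhs(z, rho):
    """(z-1)(z+1)^(-rho) F(1/2, rho/2; 3/2; (z-1)^2/(z+1)^2)."""
    w = ((z - 1.0) / (z + 1.0)) ** 2
    return (z - 1.0) * (z + 1.0) ** (-rho) * hyp2f1_real(0.5, 0.5 * rho, 1.5, w)
```

For $\rho=1$ both sides reduce to $\arcsin\frac{z-1}{z+1}$, which gives an independent check; for example, $z=3$ yields $\pi/6$.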
The last estimate completes the proof of the theorem in the case of $M>0$. Next we consider the case of $M=0$. Thus,
\[
\int_{-(\phi(t)-\phi(b))}^{\phi(t)-\phi(b)}|E(x,t;0,b)|^{\rho}\,dx
=2e^{-t+t\rho}\int_0^{e^{t-b}-1}\big((e^{t-b}+1)^2-y^2\big)^{-\frac\rho2}
\left|F\!\left(\frac12,\frac12;1;\frac{(e^{t-b}-1)^2-y^2}{(e^{t-b}+1)^2-y^2}\right)\right|^{\rho}dy.
\]
Lemma 7.2. [32]. For all $z>1$ the following estimate
\[
\int_0^{z-1}\big((z+1)^2-r^2\big)^{-\frac\rho2}\left|F\!\left(\frac12,\frac12;1;\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\right)\right|^{\rho}dr
\le C(1+\ln z)^{\rho}(z-1)(z+1)^{-\rho}\,F\!\left(\frac12,\frac\rho2;\frac32;\frac{(z-1)^2}{(z+1)^2}\right)
\]
is fulfilled, provided that $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$. In particular, if $\rho<2$, then
\[
\int_0^{z-1}\big((z+1)^2-r^2\big)^{-\frac\rho2}\left|F\!\left(\frac12,\frac12;1;\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\right)\right|^{\rho}dr\le C(1+\ln z)^{\rho}(z-1)(z+1)^{-\rho}.
\]
Thus for $\rho<2$ and $z=e^{t-b}$ we have
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le C_M\,e^{t(1-1/\rho)}\int_0^t(1+t-b)(e^{t-b}-1)^{1/\rho}(e^{t-b}+1)^{-1}\,\|f(x,b)\|_{L^p(\mathbb{R}_x)}\,db .
\]
The last inequality implies the estimate of the statement of the theorem if $M=0$. Theorem 7.1 is proven.

Proposition 7.3. The solution $u=u(x,t)$ of the Cauchy problem
\[
u_{tt}-e^{-2t}u_{xx}+M^2u=0,\qquad u(x,0)=\varphi_0(x),\qquad u_t(x,0)=\varphi_1(x),
\]
with $\varphi_0,\varphi_1\in C_0^\infty(\mathbb{R})$ satisfies the following estimate
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le C(1+t)^{1-\operatorname{sgn}M}\Big\{e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R}_x)}+(e^t-1)e^{-t}\|\varphi_1(x)\|_{L^q(\mathbb{R}_x)}\Big\}\quad\text{for all } t\in(0,\infty). \tag{7.5}
\]
Proof. First we consider the equation without source but with the second datum, that is, the case of $\varphi_0=0$. We apply the representation given by Theorem 0.4 for the solution $u=u(x,t)$ of the problem, and obtain
\[
u(x,t)=\int_0^{1-e^{-t}}\big[\varphi_1(x-z)+\varphi_1(x+z)\big]K_1(z,t)\,dz,
\]
where the kernel $K_1(z,t)$ is defined in Theorem 0.4. Hence, we arrive at the inequality
\[
\|u(x,t)\|_{L^q(\mathbb{R})}\le2\|\varphi_1(x)\|_{L^q(\mathbb{R})}\int_0^{1-e^{-t}}|K_1(r,t)|\,dr
=2\|\varphi_1(x)\|_{L^q(\mathbb{R})}\int_0^{e^t-1}\frac{1}{\sqrt{(e^t+1)^2-y^2}}\left|F\!\left(\frac12+iM,\frac12+iM;1;\frac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\right)\right|dy.
\]
To estimate the last integral we introduce $z=e^t>1$ and denote the integral by $I_1$,
\[
I_1(z):=\int_0^{z-1}\frac{1}{\sqrt{(z+1)^2-y^2}}\left|F\!\left(\frac12+iM,\frac12+iM;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\right)\right|dy.
\]
First we consider the case of $M>0$. Then, according to (7.4) (with $\rho=1$) we have for that integral the following estimate:
\[
I_1(e^t)\le C_M(e^t-1)(e^t+1)^{-1}.
\]
The last inequality implies the $L^q-L^q$ estimate (7.5) for the case of $M>0$. Then we consider the case of $M=0$, that is,
\[
I_1(z)=\int_0^{z-1}\frac{1}{\sqrt{(1+z)^2-y^2}}\,F\!\left(\frac12,\frac12;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\right)dy.
\]
According to Lemma 7.2 (with $\rho=1$) we have for $I_1(z)$ the following estimate:
\[
I_1(e^t)\le C(1+t)(e^t-1)(e^t+1)^{-1}. \tag{7.6}
\]
Finally, (7.6) implies the $L^q-L^q$ estimate (7.5) for the case of $M=0$ and $\varphi_0=0$. Next we consider the equation without source but with the first datum, that is, the case of $\varphi_1=0$. We apply the representation given by Theorem 0.4 for the solution $u=u(x,t)$ of the Cauchy problem with $\varphi_1=0$, and obtain
\[
u(x,t)=\frac12e^{\frac t2}\Big[\varphi_0(x+1-e^{-t})+\varphi_0(x-1+e^{-t})\Big]+\int_0^{1-e^{-t}}\big[\varphi_0(x-r)+\varphi_0(x+r)\big]K_0(r,t)\,dr,
\]
where the kernel $K_0(r,t)$ is defined in Theorem 0.4. Then we easily obtain the following two estimates:
\[
\Big\|u(x,t)-\int_0^{1-e^{-t}}\big[\varphi_0(x-r)+\varphi_0(x+r)\big]K_0(r,t)\,dr\Big\|_{L^q(\mathbb{R})}\le e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R})}
\]
and
\[
\|u(x,t)\|_{L^q(\mathbb{R})}\le e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R})}+2\|\varphi_0(x)\|_{L^q(\mathbb{R})}\int_0^{1-e^{-t}}|K_0(r,t)|\,dr.
\]
Finally, the following lemma completes the proof of the proposition.

Lemma 7.4. The kernel $K_0(r,t)$ has an integrable singularity at $r=1-e^{-t}$; more precisely, one has
\[
\int_0^{1-e^{-t}}|K_0(r,t)|\,dr\le C_M(1+t)^{1-\operatorname{sgn}M}(e^t-1)e^{-\frac12t}\quad\text{for all } t\in[0,\infty).
\]
Proof. For the integral we obtain
\[
\begin{aligned}
\int_0^{1-e^{-t}}|K_0(r,t)|\,dr\le\int_0^{z-1}&\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\\
&\quad+\big(z^2-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\Big|\,dy
\end{aligned}
\]
for all $z:=e^t>1$. We divide the domain of integration into two zones,
\[
Z_1(\varepsilon,z):=\left\{(z,r)\;\Big|\;\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\le\varepsilon,\ 0\le r\le z-1\right\}, \tag{7.7}
\]
\[
Z_2(\varepsilon,z):=\left\{(z,r)\;\Big|\;\varepsilon\le\frac{(z-1)^2-r^2}{(z+1)^2-r^2},\ 0\le r\le z-1\right\}, \tag{7.8}
\]
and split the integral into two parts,
\[
\int_0^{e^t-1}|K_0(r,t)|\,dr=\int_{(z,r)\in Z_1(\varepsilon,z)}|K_0(r,t)|\,dr+\int_{(z,r)\in Z_2(\varepsilon,z)}|K_0(r,t)|\,dr.
\]
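Since the argument $\frac{(z-1)^2-r^2}{(z+1)^2-r^2}$ decreases in $r$ on $[0,z-1]$, the two zones are separated by a single radius. The following is a minimal sketch computing that radius by solving the defining equation for $r$; the function names are hypothetical.

```python
import math

def hyp_argument(z, r):
    """Argument ((z-1)^2 - r^2) / ((z+1)^2 - r^2) of the hypergeometric functions."""
    return ((z - 1.0) ** 2 - r * r) / ((z + 1.0) ** 2 - r * r)

def zone_boundary(z, eps):
    """Radius r_* in [0, z-1] such that the argument is >= eps for r <= r_* (zone Z_2)
    and <= eps for r >= r_* (zone Z_1). Returns 0.0 if the argument never exceeds eps."""
    if not (z > 1.0 and 0.0 < eps < 1.0):
        raise ValueError("need z > 1 and 0 < eps < 1")
    num = (z - 1.0) ** 2 - eps * (z + 1.0) ** 2
    if num <= 0.0:
        return 0.0
    return math.sqrt(num / (1.0 - eps))

# Example: z = 3, eps = 0.1 gives r_*^2 = (4 - 1.6) / 0.9.
r_star = zone_boundary(3.0, 0.1)
```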
In the first zone we have
\[
F\Big(\frac12+iM,\frac12+iM;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)
=1+\Big(\frac12+iM\Big)^2\frac{(z-1)^2-y^2}{(z+1)^2-y^2}+O\!\left(\Big(\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)^2\right), \tag{7.9}
\]
\[
F\Big(-\frac12+iM,\frac12+iM;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)
=1-\Big(\frac14+M^2\Big)\frac{(z-1)^2-y^2}{(z+1)^2-y^2}+O\!\left(\Big(\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)^2\right). \tag{7.10}
\]
We use the last formulas to estimate the term containing the hypergeometric functions:
\[
\begin{aligned}
&\Big|\big(z-z^2-iM(1-z^2-r^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)
+\big(z^2-1+r^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\Big|\\
&\quad\le\frac12\big[(z-1)^2-r^2\big]
+\Big|\big(z-z^2-iM(1-z^2-r^2)\big)\big(\tfrac12+iM\big)^2-\big(z^2-1+r^2\big)\big(\tfrac12-iM\big)\big(\tfrac14+M^2\big)\Big|\,\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\\
&\qquad+\Big(\big|z-z^2-iM(1-z^2-r^2)\big|+\big|z^2-1+r^2\big|\Big)\,O\!\left(\Big(\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)^2\right)\\
&\quad=\frac12\big[(z-1)^2-r^2\big]
+\frac18\,\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\,\Big|(1-2iM)(-1-4M^2)(r^2+z^2-1)+2(1+2iM)^2\big(-z^2+z+iM(r^2+z^2-1)\big)\Big|\\
&\qquad+\Big(\big|z-z^2-iM(1-z^2-r^2)\big|+\big|z^2-1+r^2\big|\Big)\,O\!\left(\Big(\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)^2\right).
\end{aligned}
\tag{7.11}
\]
Hence, we have to consider the following three integrals, which can be easily evaluated and estimated:
\[
A_1:=\int_{(z,r)\in Z_1(\varepsilon,z)}\frac{1}{\sqrt{(z+1)^2-r^2}}\,dr\le\operatorname{Arctan}\frac{z-1}{2\sqrt z}\le\frac\pi2,
\]
\[
A_2:=\int_{(z,r)\in Z_1(\varepsilon,z)}\frac{z^2}{\big((z+1)^2-r^2\big)\sqrt{(z+1)^2-r^2}}\,dr\le(z+1)^{-1/2}(z-1),
\]
and
\[
A_3:=\int_{(z,r)\in Z_1(\varepsilon,z)}\Big(\big|z-z^2-iM(1-z^2-r^2)\big|+\big|z^2-1+r^2\big|\Big)\frac{(z-1)^2-r^2}{\big((z+1)^2-r^2\big)^2\sqrt{(z+1)^2-r^2}}\,dr\le C_M(z+1)^{-1/2}(z-1)
\]
for all $z\in[1,\infty)$. Finally, for the integral over the first zone we have obtained
\[
\begin{aligned}
\int_{(z,r)\in Z_1(\varepsilon,z)}&\frac{1}{[(z-1)^2-r^2]\sqrt{(z+1)^2-r^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-r^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\\
&\quad+\big(z^2-1+r^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\Big|\,dr
\le C_M(z+1)^{-1/2}(z-1)
\end{aligned}
\]
for all $z\in[1,\infty)$. In the second zone we have
\[
\varepsilon\le\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\le1\quad\text{and}\quad\frac{1}{(z-1)^2-r^2}\le\frac{1}{\varepsilon\,[(z+1)^2-r^2]}. \tag{7.12}
\]
First consider the case of $M>0$. According to (7.3) the hypergeometric functions obey the estimate
\[
\Big|F\Big(-\frac12+iM,\frac12+iM;1;x\Big)\Big|\le C\quad\text{and}\quad\Big|F\Big(\frac12+iM,\frac12+iM;1;x\Big)\Big|\le C_M\quad\text{for all } x\in[\varepsilon,1). \tag{7.13}
\]
This allows us to estimate the integral over the second zone:
\[
\begin{aligned}
\int_{(z,r)\in Z_2(\varepsilon,z)}&\frac{1}{[(z-1)^2-r^2]\sqrt{(z+1)^2-r^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-r^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\\
&\quad+\big(z^2-1+r^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\Big|\,dr\\
&\le C_M\int_{(z,r)\in Z_2(\varepsilon,z)}\frac{z^2}{[(z-1)^2-r^2]\sqrt{(z+1)^2-r^2}}\,dr\\
&\le C_{M,\varepsilon}\,z^2\int_0^{z-1}\frac{1}{\big((z+1)^2-r^2\big)^{3/2}}\,dr\\
&\le C_{M,\varepsilon}\,(z+1)^{-1/2}(z-1)
\end{aligned}
\]
for all $z\in[1,\infty)$. In the case of $M=0$ we have
\[
\Big|F\Big(\frac12,\frac12;1;\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\Big|\le C(1+\ln z)\quad\text{for all }(z,r)\in Z_2(\varepsilon,z). \tag{7.14}
\]
The rest of the proof is a repetition of the arguments used above. Thus, the lemma is proven.

8. $L^p-L^q$ Estimates for the Equation with n = 1 and without Source Term. Some Estimates of the Kernels $K_0$ and $K_1$

Theorem 8.1. Let $u=u(x,t)$ be a solution of the Cauchy problem
\[
u_{tt}-e^{-2t}u_{xx}+M^2u=0,\qquad u(x,0)=\varphi_0(x),\qquad u_t(x,0)=\varphi_1(x),
\]
with $\varphi_0,\varphi_1\in C_0^\infty(\mathbb{R})$. If $\rho\in(1,2)$, then
\[
\begin{aligned}
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le{}&e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R}_x)}+C_{M,\rho}(1+t)^{1-\operatorname{sgn}M}(e^t-1)^{\frac1\rho}e^{t(\frac12-\frac1\rho)}\|\varphi_0(x)\|_{L^p(\mathbb{R}_x)}\\
&+C_{M,\rho}(1+t)^{1-\operatorname{sgn}M}(e^t-1)^{\frac1\rho}e^{-\frac t\rho}\|\varphi_1(x)\|_{L^p(\mathbb{R}_x)}
\end{aligned}
\]
for all $t\in(0,\infty)$. Here $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$. If $\rho=1$, then
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le C_M(1+t)^{1-\operatorname{sgn}M}\Big\{e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R}_x)}+(e^t-1)e^{-t}\|\varphi_1(x)\|_{L^q(\mathbb{R}_x)}\Big\}
\]
for all $t\in(0,\infty)$.

Proof. For $\rho=1$ we just apply Proposition 7.3. To prove this theorem for $\rho>1$ we need some auxiliary estimates for the kernels $K_0$ and $K_1$. We start with the case of $\varphi_0=0$, where the kernel $K_1$ appears. The application of Theorem 0.4 and Young's inequality lead to
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le2\left(\int_0^{1-e^{-t}}|K_1(x,t)|^{\rho}\,dx\right)^{1/\rho}\|\varphi_1(x)\|_{L^p(\mathbb{R}_x)},
\]
where $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$. Now we have to estimate the last integral.

Proposition 8.2. We have
\[
\left(\int_0^{1-e^{-t}}|K_1(x,t)|^{\rho}\,dx\right)^{1/\rho}\le C(1+t)^{1-\operatorname{sgn}M}(1-e^{-t})^{1/\rho}\quad\text{for all } t\in(0,\infty).
\]
Proof. For $M=0$ one can write
\[
\left(\int_0^{1-e^{-t}}|K_1(x,t)|^{\rho}\,dx\right)^{1/\rho}
\le Ce^{t(1-1/\rho)}\left(\int_0^{e^t-1}\big((e^t+1)^2-y^2\big)^{-\frac\rho2}\Big|F\Big(\frac12,\frac12;1;\frac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big)\Big|^{\rho}dy\right)^{1/\rho}.
\]
Denote $z:=e^t>1$ and consider the integral
\[
\int_0^{z-1}\left(\frac{1}{\sqrt{(z+1)^2-x^2}}\right)^{\rho}\Big|F\Big(\frac12,\frac12;1;\frac{(z-1)^2-x^2}{(1+z)^2-x^2}\Big)\Big|^{\rho}dx
\]
of the right-hand side. Then we apply Lemma 7.2 and obtain
\[
\left(\int_0^{1-e^{-t}}|K_1(x,t)|^{\rho}\,dx\right)^{1/\rho}\le Ce^{t(1-1/\rho)}(1+\ln e^t)(e^t-1)^{1/\rho}(e^t+1)^{-1}\le C(1+t)(1-e^{-t})^{1/\rho}.
\]
For $M>0$ we apply (7.3):
\[
\left(\int_0^{1-e^{-t}}|K_1(x,t)|^{\rho}\,dx\right)^{1/\rho}
\le Ce^{t(1-1/\rho)}\left(\int_0^{e^t-1}\big((e^t+1)^2-y^2\big)^{-\frac\rho2}dy\right)^{1/\rho}
\le Ce^{t(1-1/\rho)}(e^t-1)^{1/\rho}(e^t+1)^{-1}\le C(1-e^{-t})^{1/\rho}.
\]
The proposition is proven.
Thus, the theorem in the case of $\varphi_0=0$ is proven. Now we turn to the case of $\varphi_1=0$, where the kernel $K_0$ appears. An application of Theorem 0.4 leads to
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R}_x)}+\left\|\int_0^{1-e^{-t}}\big[\varphi_0(x-z)+\varphi_0(x+z)\big]K_0(z,t)\,dz\right\|_{L^q(\mathbb{R}_x)}.
\]
Similarly to the case of the second datum we apply Young's inequality and arrive at
\[
\|u(x,t)\|_{L^q(\mathbb{R}_x)}\le e^{\frac t2}\|\varphi_0(x)\|_{L^q(\mathbb{R}_x)}+2\|\varphi_0(x)\|_{L^p(\mathbb{R}_x)}\left(\int_0^{1-e^{-t}}|K_0(r,t)|^{\rho}\,dr\right)^{1/\rho}.
\]
The next proposition gives an estimate for the integral of the last inequality.

Proposition 8.3. Let $1<p<\rho'$, $\frac1q=\frac1p-\frac{1}{\rho'}$, $\frac1\rho+\frac{1}{\rho'}=1$, and $\rho\in[1,2)$. We have
\[
\left(\int_0^{1-e^{-t}}|K_0(r,t)|^{\rho}\,dr\right)^{1/\rho}\le C_\rho(1+t)^{1-\operatorname{sgn}M}(e^t-1)^{\frac1\rho}e^{t(\frac12-\frac1\rho)}\quad\text{for all } t\in(0,\infty).
\]
Proof. The case of $\rho=1$ is just Lemma 7.4; therefore we bring up only the details which in the case of $\rho>1$ differ from the ones used in the proof of that lemma. We turn to the integral ($z:=e^t>1$)
\[
\begin{aligned}
\left(\int_0^{1-e^{-t}}|K_0(r,t)|^{\rho}\,dr\right)^{1/\rho}
=\Bigg(\int_0^{z-1}&\left(\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\right)^{\rho}\\
&\times\Big|\big(z-z^2-iM(1-z^2-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\\
&\quad+\big(z^2-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\Big|^{\rho}dy\Bigg)^{1/\rho}.
\end{aligned}
\]
The formulas (7.9) and (7.10) describe the behavior of the hypergeometric functions in the neighbourhood of zero. Consider therefore two zones, $Z_1(\varepsilon,z)$ and $Z_2(\varepsilon,z)$, defined in (7.7) and (7.8), respectively. We split the integral into two parts:
\[
\int_0^{1-e^{-t}}|K_0(r,t)|^{\rho}\,dr=\int_{(z,r)\in Z_1(\varepsilon,z)}|K_0(r,t)|^{\rho}\,dr+\int_{(z,r)\in Z_2(\varepsilon,z)}|K_0(r,t)|^{\rho}\,dr.
\]
In the proof of Lemma 7.4 the relation (7.11) was checked in the first zone. If $1\le z\le N$ with some constant $N$, then the argument of the hypergeometric functions is bounded,
\[
\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\le\frac{(z-1)^2}{(z+1)^2}\le\frac{(N-1)^2}{(N+1)^2}<1\quad\text{for all } r\in(0,z-1), \tag{8.1}
\]
and we obtain with $z=e^t$,
\[
\begin{aligned}
\left(\int_0^{1-e^{-t}}|K_0(r,t)|^{\rho}\,dr\right)^{1/\rho}
&\le C_{M,N}\Bigg(\int_0^{z-1}\left(\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\right)^{\rho}\\
&\qquad\times\left|\frac12[(z-1)^2-y^2]+z^2\frac{(z-1)^2-y^2}{(z+1)^2-y^2}+z^2\Big(\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)^2\right|^{\rho}dy\Bigg)^{1/\rho}\\
&\le C_{M,N}\left(\int_0^{z-1}\left(\frac{1}{\sqrt{(z+1)^2-y^2}}\Big(1+z^2\frac{1}{(z+1)^2-y^2}\Big)\right)^{\rho}dy\right)^{1/\rho}\\
&\le C_{M,N}(z-1)^{1/\rho}(z+1)^{-1}.
\end{aligned}
\]
Thus, we can restrict ourselves to the case of large $z\ge N$ in both zones. Consider therefore for $\rho\in(1,2)$ the following integrals over the first zone:
\[
A_4:=\int_{(z,r)\in Z_1(\varepsilon)}\left(\frac{1}{\sqrt{(z+1)^2-r^2}}\right)^{\rho}dr\le\int_0^{z-1}\left(\frac{1}{\sqrt{(z+1)^2-r^2}}\right)^{\rho}dr
=(z-1)(z+1)^{-\rho}F\Big(\frac12,\frac\rho2;\frac32;\frac{(z-1)^2}{(z+1)^2}\Big)\le C_\rho(z-1)(z+1)^{-\rho},
\]
\[
A_5:=\int_{(z,r)\in Z_1(\varepsilon)}\left(\frac{z^2}{[(z+1)^2-r^2]\sqrt{(z+1)^2-r^2}}\right)^{\rho}dr\le\int_0^{z-1}\left(\frac{z^2}{[(z+1)^2-r^2]\sqrt{(z+1)^2-r^2}}\right)^{\rho}dr
=z^{2\rho}(z-1)(z+1)^{-3\rho}F\Big(\frac12,\frac{3\rho}{2};\frac32;\frac{(z-1)^2}{(z+1)^2}\Big).
\]
Then, we use (7.1) and (7.2) with the argument $(z-1)^2/(z+1)^2$, to obtain for $\rho<2$ and large $z\ge N$ the following estimate for the hypergeometric function,
\[
\Big|F\Big(\frac12,\frac{3\rho}{2};\frac32;\frac{(z-1)^2}{(z+1)^2}\Big)\Big|\le C(z+1)^{-1+\frac{3\rho}{2}}.
\]
Thus,
\[
A_5\le C(z-1)(z+1)^{-1+\frac\rho2}. \tag{8.2}
\]
For the next term we obtain a similar estimate,
\[
A_6:=\int_{(z,r)\in Z_1(\varepsilon)}\left(\frac{z^2}{\big((z-1)^2-r^2\big)\sqrt{(z+1)^2-r^2}}\Big(\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)^2\right)^{\rho}dr\le C(z-1)(z+1)^{-1+\frac\rho2}.
\]
Hence,
\[
\int_{(z,r)\in Z_1(\varepsilon,z)}|K_0(r,t)|^{\rho}\,dr\le C(z-1)(z+1)^{-1+\frac\rho2}.
\]
Now we consider the case of $M=0$. In the second zone $Z_2(\varepsilon,z)$ we have (7.14), while for the argument of the hypergeometric functions we have (7.12). We have to estimate the following integral:
\[
A_7:=\int_{(z,r)\in Z_2(\varepsilon,z)}\left(\frac{z^2(1+\ln z)}{\big((z-1)^2-r^2\big)\sqrt{(z+1)^2-r^2}}\right)^{\rho}dr.
\]
We apply (7.12) and (8.2) and follow the calculations that have been used for the estimate of $A_5$. Then we obtain
\[
A_7\le C(1+\ln z)^{\rho}(z-1)(z+1)^{-1+\frac\rho2}.
\]
Hence,
\[
\int_{(z,r)\in Z_2(\varepsilon,z)}|K_0(r,t)|^{\rho}\,dr\le C(1+\ln z)^{\rho}(z-1)(z+1)^{-1+\frac\rho2}\quad\text{for all } z\ge N.
\]
The case of $M>0$ needs evident modifications and we skip it. The proposition is proven.

9. $L^p-L^q$ Estimates for the Equation with Source, n ≥ 2

For the wave equation Duhamel's principle allows us to reduce the case with a source term to the case of the Cauchy problem without source term and consequently to derive the $L^p-L^q$-decay estimates for the equation. For (0.8) Duhamel's principle is not applicable straightforwardly and we have to appeal to the representation formula of Theorem 0.5. In this section we consider the Cauchy problem (0.19) for the equation with the source term and with zero initial data.

Theorem 9.1. Let $u=u(x,t)$ be a solution of the Cauchy problem (0.19). Then for $n\ge2$ one has the following estimate:
\[
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}\le C\int_0^t db\,\|f(x,b)\|_{L^p(\mathbb{R}^n)}\int_0^{e^{-b}-e^{-t}}dr\;r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(e^{-t}+e^{-b})^2-r^2}}
\left|F\Big(\frac12,\frac12;1;\frac{(e^{-b}-e^{-t})^2-r^2}{(e^{-b}+e^{-t})^2-r^2}\Big)\right|^{1-\operatorname{sgn}M},
\]
provided that $1<p\le2$, $\frac1p+\frac1q=1$, $\frac12(n+1)\big(\frac1p-\frac1q\big)\le2s\le n\big(\frac1p-\frac1q\big)<2s+1$.
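The admissibility conditions on $p$, $q$ and $s$ in Theorem 9.1 can be unpacked with exact rational arithmetic. The following is a minimal sketch, with hypothetical function names: given $n$ and $p$, it returns the conjugate exponent $q$ together with the bounds on $s$ implied by $\frac12(n+1)(\frac1p-\frac1q)\le2s\le n(\frac1p-\frac1q)<2s+1$.

```python
from fractions import Fraction

def admissible_s_range(n, p):
    """Return (q, lower, upper, strict_lower): 1/p + 1/q = 1, lower <= s <= upper, s > strict_lower."""
    p = Fraction(p)
    if n < 2 or not (1 < p <= 2):
        raise ValueError("need n >= 2 and 1 < p <= 2")
    q = p / (p - 1)
    delta = 1 / p - 1 / q                  # 1/p - 1/q
    lower = Fraction(n + 1, 4) * delta     # from (n+1)/2 * delta <= 2s
    upper = Fraction(n, 2) * delta         # from 2s <= n * delta
    strict_lower = (n * delta - 1) / 2     # from n * delta < 2s + 1
    return q, lower, upper, strict_lower

def is_admissible(n, p, s):
    """Check whether s satisfies all three conditions for the given n and p."""
    _, lower, upper, strict_lower = admissible_s_range(n, p)
    s = Fraction(s)
    return lower <= s <= upper and s > strict_lower

# Example: n = 3, p = 4/3 gives q = 4 and 1/2 <= s <= 3/4.
q, lower, upper, strict_lower = admissible_s_range(3, Fraction(4, 3))
```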
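Proof. In both cases, of even and odd $n$, one can write the representation (0.22). Due to the results of [5,22] for the wave equation, we have
\[
\begin{aligned}
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}
&\le C_M\int_0^t db\int_0^{e^{-b}-e^{-t}}\|(-\Delta)^{-s}v(x,r;b)\|_{L^q(\mathbb{R}^n)}\frac{1}{\sqrt{(e^{-t}+e^{-b})^2-r^2}}
\left|F\Big(\frac12,\frac12;1;\frac{(e^{-b}-e^{-t})^2-r^2}{(e^{-b}+e^{-t})^2-r^2}\Big)\right|^{1-\operatorname{sgn}M}dr\\
&\le C_M\int_0^t db\,\|f(x,b)\|_{L^p(\mathbb{R}^n)}\int_0^{e^{-b}-e^{-t}}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(e^{-t}+e^{-b})^2-r^2}}
\left|F\Big(\frac12,\frac12;1;\frac{(e^{-b}-e^{-t})^2-r^2}{(e^{-b}+e^{-t})^2-r^2}\Big)\right|^{1-\operatorname{sgn}M}dr.
\end{aligned}
\]
The theorem is proven.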
We are going to transform the estimate of the last theorem to a more compact form. To this aim we estimate for $n(\frac1p-\frac1q)<2s+1$ the last integral of the right-hand side. If we replace $e^{-b}/e^{-t}>1$ with $z:=e^{-b}/e^{-t}>1$, then the integral will be simplified:
\[
\begin{aligned}
&\int_0^{e^{-b}-e^{-t}}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(e^{-t}+e^{-b})^2-r^2}}\left|F\Big(\frac12,\frac12;1;\frac{(e^{-b}-e^{-t})^2-r^2}{(e^{-b}+e^{-t})^2-r^2}\Big)\right|^{1-\operatorname{sgn}M}dr\\
&\quad=e^{-t[2s-n(\frac1p-\frac1q)]}\int_0^{z-1}y^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-y^2}}\left|F\Big(\frac12,\frac12;1;\frac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\right|^{1-\operatorname{sgn}M}dy.
\end{aligned}
\]
We skip the proof of the next lemma; it is very similar to the proof of Lemma 9.2 [32].

Lemma 9.2. Assume that $0\ge2s-n(\frac1p-\frac1q)>-1$. Then
\[
\int_0^{z-1}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-r^2}}\left|F\Big(\frac12,\frac12;1;\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)\right|^{1-\operatorname{sgn}M}dr
\le Cz^{-1}(z-1)^{1+2s-n(\frac1p-\frac1q)}(1+\ln z)^{1-\operatorname{sgn}M},
\]
for all $z>1$.

Corollary 9.3. Let $u=u(x,t)$ be a solution of the Cauchy problem (0.19). Then for $n\ge2$ one has the following estimate:
\[
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}\le C_M\int_0^t\|f(x,b)\|_{L^p(\mathbb{R}^n)}\,e^{-b}\big(e^{-b}-e^{-t}\big)^{1+2s-n(\frac1p-\frac1q)}(1+t-b)^{1-\operatorname{sgn}M}\,db,
\]
provided that $1<p\le2$, $\frac1p+\frac1q=1$, $\frac12(n+1)\big(\frac1p-\frac1q\big)\le2s\le n\big(\frac1p-\frac1q\big)<2s+1$.

Proof. Indeed, we apply Lemma 9.2 with $z=e^{t-b}$ to the right-hand side of the estimate given by Theorem 9.1:
\[
\begin{aligned}
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}
&\le C\int_0^t db\,\|f(x,b)\|_{L^p(\mathbb{R}^n)}\,e^{-t[2s-n(\frac1p-\frac1q)]}z^{-1}(z-1)^{1+2s-n(\frac1p-\frac1q)}(1+\ln z)^{1-\operatorname{sgn}M}\\
&\le C\int_0^t\|f(x,b)\|_{L^p(\mathbb{R}^n)}\,e^{-b}\big(e^{-b}-e^{-t}\big)^{1+2s-n(\frac1p-\frac1q)}(1+t-b)^{1-\operatorname{sgn}M}\,db.
\end{aligned}
\]
The corollary is proven.
10. $L^p-L^q$ Estimates for Equation without Source, n ≥ 2

The $L^p-L^q$-decay estimates for the energy of the solution of the Cauchy problem for the wave equation without source can be proved by the representation formula, $L^1-L^\infty$ and $L^2-L^2$ estimates, and an interpolation argument (see, e.g., [24, Theorem 2.1]). There is also a proof of the $L^p-L^q$-decay estimates for the solutions that is based on microlocal considerations and a dyadic decomposition of the phase space (see, e.g., [5,22]). To avoid the derivative loss and obtain sharper estimates we appeal to the representation formula provided by Theorem 0.6 and then apply the results of [5,22].

Theorem 10.1. The solution $u=u(x,t)$ of the Cauchy problem (0.23) satisfies the following $L^p-L^q$ estimate:
\[
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}\le C_M(1+t)^{1-\operatorname{sgn}M}(1-e^{-t})^{2s-n(\frac1p-\frac1q)}\Big\{e^{\frac t2}\|\varphi_0(x)\|_{L^p(\mathbb{R}^n)}+(1-e^{-t})\|\varphi_1\|_{L^p(\mathbb{R}^n)}\Big\}
\]
for all $t\in(0,\infty)$, provided that $1<p\le2$, $\frac1p+\frac1q=1$, $\frac12(n+1)\big(\frac1p-\frac1q\big)\le2s\le n\big(\frac1p-\frac1q\big)<2s+1$.

Proof. We start with the case of $\varphi_0=0$. Due to Theorem 0.6 for the solution $u=u(x,t)$ of the Cauchy problem (0.23) with $\varphi_0=0$ and to the results of [5,22] we have:
\[
\begin{aligned}
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}
&\le C\|\varphi_1\|_{L^p(\mathbb{R}^n)}\int_0^{1-e^{-t}}r^{2s-n(\frac1p-\frac1q)}|K_1(r,t)|\,dr\\
&\le C_M\|\varphi_1\|_{L^p(\mathbb{R}^n)}\,e^{-t(2s-n(\frac1p-\frac1q))}\int_0^{e^t-1}y^{2s-n(\frac1p-\frac1q)}\big((e^t+1)^2-y^2\big)^{-\frac12}
\left|F\Big(\frac12,\frac12;1;\frac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big)\right|^{1-\operatorname{sgn}M}dy.
\end{aligned}
\]
To continue we apply Lemma 9.2 and obtain
\[
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}\le C_M\|\varphi_1\|_{L^p(\mathbb{R}^n)}(1+t)^{1-\operatorname{sgn}M}(1-e^{-t})^{1+2s-n(\frac1p-\frac1q)}.
\]
Thus, in the case of $\varphi_0=0$ the theorem is proven.
Next we turn to the case of $\varphi_1=0$. Due to Theorem 0.6 for the solution $u=u(x,t)$ of the Cauchy problem (0.23) with $\varphi_1=0$ and to the results of [5,22] we have:
\[
\|(-\Delta)^{-s}u(x,t)\|_{L^q(\mathbb{R}^n)}\le C\left\{e^{\frac t2}(1-e^{-t})^{2s-n(\frac1p-\frac1q)}+\int_0^{1-e^{-t}}r^{2s-n(\frac1p-\frac1q)}|K_0(r,t)|\,dr\right\}\|\varphi_0(x)\|_{L^p(\mathbb{R}^n)}.
\]
One can estimate the last integral:
\[
\begin{aligned}
\int_0^{1-e^{-t}}r^{2s-n(\frac1p-\frac1q)}|K_0(r,t)|\,dr
\le e^{-t[2s-n(\frac1p-\frac1q)]}\int_0^{e^t-1}&y^{2s-n(\frac1p-\frac1q)}\frac{1}{[(e^t-1)^2-y^2]\sqrt{(e^t+1)^2-y^2}}\\
&\times\Big|\big(e^t-e^{2t}-iM(1-e^{2t}-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big)\\
&\quad+\big(e^{2t}-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big)\Big|\,dy.
\end{aligned}
\]
The following proposition gives the remaining estimate for that integral and completes the proof of the theorem.

Proposition 10.2. If $2s-n(\frac1p-\frac1q)>-1$, then
\[
\begin{aligned}
\int_0^{z-1}&y^{2s-n(\frac1p-\frac1q)}\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\\
&\quad+\big(z^2-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\Big|\,dy\\
&\le C_M\,z^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)}(1+\ln z)^{1-\operatorname{sgn}M}
\end{aligned}
\]
for all $z>1$.
Proof. We follow the arguments which have been used in the proof of Proposition 8.3. If $1\le z\le N$ with some constant $N$, then the argument of the hypergeometric functions is bounded (8.1), and the integral can be estimated by:
\[
\begin{aligned}
\int_0^{z-1}&y^{2s-n(\frac1p-\frac1q)}\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\\
&\quad+\big(z^2-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\Big|\,dy\\
&\le C_M\int_0^{z-1}y^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-y^2}}\Big(1+z^2\frac{1}{(z+1)^2-y^2}\Big)dy\\
&\le C_M\,z^{-1}(z-1)^{1+2s-n(\frac1p-\frac1q)}
\end{aligned}
\]
for all $z\in[1,N]$.
Thus, we can restrict ourselves to the case of large $z\ge N$ in both zones $Z_1(\varepsilon,z)$ and $Z_2(\varepsilon,z)$, defined in (7.7) and (7.8), respectively. In the first zone we have (7.11). Consider therefore the following three estimates. For the first one we have
\[
A_8:=\int_{(z,r)\in Z_1(\varepsilon,z)}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-r^2}}\,dr\le Cz^{-1}(z-1)^{1+2s-n(\frac1p-\frac1q)}\quad\text{for all } z\in[N,\infty).
\]
For $0\ge a>-1$ and $z\ge N$ the following integral can be easily estimated:
\[
\begin{aligned}
\int_0^{z-1}r^{a}\frac{1}{\big((z+1)^2-r^2\big)^{3/2}}\,dr
&=\int_0^{z/2}r^{a}\frac{1}{\big((z+1)^2-r^2\big)^{3/2}}\,dr+\int_{z/2}^{z-1}r^{a}\frac{1}{\big((z+1)^2-r^2\big)^{3/2}}\,dr\\
&\le\frac{16}{9}z^{-3}\int_0^{z/2}r^{a}\,dr+\frac{z^{a}}{4^{a}}\int_{z/2}^{z-1}\frac{1}{\big((z+1)^2-r^2\big)^{3/2}}\,dr\\
&\le Cz^{a-3/2}\quad\text{for all } z\in[N,\infty).
\end{aligned}
\]
Hence,
\[
\begin{aligned}
A_9&:=z^2\int_{(z,r)\in Z_1(\varepsilon,z)}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\big((z+1)^2-r^2\big)\sqrt{(z+1)^2-r^2}}\,dr\\
&\le z^2\int_0^{z-1}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-r^2}}\,\frac{1}{(z+1)^2-r^2}\,dr
\le Cz^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)}\quad\text{for all } z\in[N,\infty),
\end{aligned}
\]
and
\[
\begin{aligned}
A_{10}&:=z^2\int_{(z,r)\in Z_1(\varepsilon,z)}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\big((z-1)^2-r^2\big)\sqrt{(z+1)^2-r^2}}\Big(\frac{(z-1)^2-r^2}{(z+1)^2-r^2}\Big)^2dr\\
&\le z^2\int_{(z,r)\in Z_1(\varepsilon,z)}r^{2s-n(\frac1p-\frac1q)}\frac{1}{\sqrt{(z+1)^2-r^2}}\,\frac{1}{(z+1)^2-r^2}\,dr
\le Cz^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)}\quad\text{for all } z\in[N,\infty).
\end{aligned}
\]
Finally,
\[
\begin{aligned}
\int_{(z,y)\in Z_1(\varepsilon,z)}&y^{2s-n(\frac1p-\frac1q)}\frac{1}{[(z-1)^2-y^2]\sqrt{(z+1)^2-y^2}}\\
&\times\Big|\big(z-z^2-iM(1-z^2-y^2)\big)F\Big(\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\\
&\quad+\big(z^2-1+y^2\big)\big(\tfrac12-iM\big)F\Big(-\tfrac12+iM,\tfrac12+iM;1;\tfrac{(z-1)^2-y^2}{(z+1)^2-y^2}\Big)\Big|\,dy\\
&\le Cz^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)}
\end{aligned}
\]
for all $z\in[1,\infty)$.
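In the second zone we use (7.12), (7.13), and (7.14). Thus, we have to estimate the next two integrals:

$$
\begin{aligned}
A_{11} &:= z^2 \int_{(z,r)\in Z_2(\varepsilon,z)} r^{2s-n(\frac1p-\frac1q)} \frac{1}{((z-1)^2-r^2)\sqrt{(z+1)^2-r^2}}\, dr, \\
A_{12} &:= z^2 (1+\ln z)^{1-\operatorname{sgn} M} \int_{(z,r)\in Z_2(\varepsilon,z)} r^{2s-n(\frac1p-\frac1q)} \frac{1}{((z-1)^2-r^2)\sqrt{(z+1)^2-r^2}}\, dr.
\end{aligned}
$$

We apply (7.12) to $A_{11}$ and obtain

$$
A_{11} \le C_\varepsilon z^2 \int_{(z,r)\in Z_2(\varepsilon,z)} r^{2s-n(\frac1p-\frac1q)} \frac{1}{[(z+1)^2-r^2]}\, \frac{1}{\sqrt{(z+1)^2-r^2}}\, dr \le C_\varepsilon z^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)}
$$

for all $z \in [1, \infty)$, while

$$
A_{12} \le C_\varepsilon z^{-\frac12}(z-1)^{1+2s-n(\frac1p-\frac1q)} (1+\ln z)^{1-\operatorname{sgn} M}
$$

for all $z \in [1, \infty)$. The proposition is proven.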
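To complete the proof of the theorem we write

$$
\begin{aligned}
&\int_0^{1-e^{-t}} r^{2s-n(\frac1p-\frac1q)}\, |K_0(r,t)|\, dr \\
&\le e^{-t[2s-n(\frac1p-\frac1q)]} \int_0^{e^t-1} y^{2s-n(\frac1p-\frac1q)} \frac{1}{[(e^t-1)^2-y^2]\sqrt{(e^t+1)^2-y^2}} \\
&\qquad \times \Big| \big[e^t - e^{2t} - iM(1 - e^{2t} - y^2)\big]\, F\Big(\tfrac12+iM, \tfrac12+iM; 1; \tfrac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big) \\
&\qquad\quad + \big(e^{2t} - 1 + y^2\big)\Big(\tfrac12 - iM\Big) F\Big(-\tfrac12+iM, \tfrac12+iM; 1; \tfrac{(e^t-1)^2-y^2}{(e^t+1)^2-y^2}\Big) \Big|\, dy \\
&\le C e^{-t[\frac12+2s-n(\frac1p-\frac1q)]} (e^t-1)^{1+2s-n(\frac1p-\frac1q)} (1+t)^{1-\operatorname{sgn} M}.
\end{aligned}
$$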
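Thus,

$$
\begin{aligned}
\|(-\Delta)^{-s} u(x,t)\|_{L^q(\mathbb{R}^n)}
&\le C \Big( e^{\frac{t}{2}} (1-e^{-t})^{2s-n(\frac1p-\frac1q)} + e^{-t[\frac12+2s-n(\frac1p-\frac1q)]} (e^t-1)^{1+2s-n(\frac1p-\frac1q)} (1+t)^{1-\operatorname{sgn} M} \Big) \|\varphi_0(x)\|_{L^p(\mathbb{R}^n)} \\
&\le C (1+t)^{1-\operatorname{sgn} M} e^{\frac{t}{2}} (1-e^{-t})^{2s-n(\frac1p-\frac1q)} \|\varphi_0(x)\|_{L^p(\mathbb{R}^n)}.
\end{aligned}
$$

The theorem is proven.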
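The last inequality rests on an elementary identity, sketched here with the abbreviation $a := 2s - n(\frac1p - \frac1q)$, which is introduced only for readability and is not notation from the text. For $t > 0$,

$$
e^{-t[\frac12 + a]} (e^t - 1)^{1+a} = e^{-\frac{t}{2} - at}\, e^{t(1+a)} (1 - e^{-t})^{1+a} = e^{\frac{t}{2}} (1 - e^{-t})^{1+a} \le e^{\frac{t}{2}} (1 - e^{-t})^{a},
$$

since $0 < 1 - e^{-t} < 1$. Together with $(1+t)^{1-\operatorname{sgn} M} \ge 1$, this allows both terms to be absorbed into $C (1+t)^{1-\operatorname{sgn} M} e^{t/2} (1-e^{-t})^{a}$.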
Acknowledgement. The authors would like to express their sincere gratitude to the referee for his/her careful reading, suggestions, and comments, which helped to improve the text.
References
1. Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions with formulas, graphs, and mathematical tables. National Bureau of Standards Applied Mathematics Series, 55, Washington, DC: Nat. Bur. of Standards, 1964
2. Bateman, H., Erdelyi, A.: Higher Transcendental Functions. Vol. 1,2, New York: McGraw-Hill, 1953
3. Birrell, N.D., Davies, P.C.W.: Quantum fields in curved space. Cambridge, New York: Cambridge University Press, 1984
4. Bony, J.-F., Hafner, D.: Decay and non-decay of the local energy for the wave equation in the De Sitter - Schwarzschild metric. http://arXiv.org/abs/0706.0350v1
5. Brenner, P.: On $L_p - L_q$ estimates for the wave-equation. Math. Zeit. 145, 251–254 (1975)
6. Brozos-Vázquez, M., García-Río, E., Vázquez-Lorenzo, R.: Locally conformally flat multidimensional cosmological models and generalized Friedmann-Robertson-Walker spacetimes. J. Cosmol. Astropart. Phys. JCAP12 008 (2004) doi:10.1088/1475-7516/2004/12/008
7. Chandrasekhar, S.: The Mathematical Theory of Black Holes. Oxford, New York: Clarendon Press Oxford University Press, 1998
8. Christodoulou, D., Klainerman, S.: The global nonlinear stability of the Minkowski space. Princeton Mathematical Series, 41. Princeton, NJ: Princeton University Press, 1993
9. Courant, R., Hilbert, D.: Methods of mathematical physics. Vol. II: Partial differential equations. New York-London: Interscience Publishers, 1962
10. Dafermos, M., Rodnianski, I.: The wave equation on Schwarzschild-de Sitter spacetimes. http://arXiv.org/abs/0709.2766v1 [gr-qc], 2007
11. De Sitter, W.: On Einstein's Theory of Gravitation, and its astronomical consequences. II, III. Roy. Astron. Soc. 77, 155–184 (1917); 78, 3–28 (1917)
12. Einstein, A.: Kosmologische Betrachtungen zur allgemeinen Relativitätstheorie. Berlin: Sitzungsber Preuss. Akad. Wiss., 142–152 (1917)
13. Finster, F., Kamran, N., Smoller, J., Yau, S.-T.: Decay of solutions of the wave equation in the Kerr geometry. Commun. Math. Phys. 264, 465–503 (2006)
14. Friedrich, H., Rendall, A.: The Cauchy problem for the Einstein equations. Einstein's field equations and their physical implications. Lecture Notes in Phys. 540, Berlin: Springer 2000, pp. 127–223
15. Galstian, A.: $L_p$-$L_q$ decay estimates for the wave equations with exponentially growing speed of propagation. Appl. Anal. 82, 197–214 (2003)
16. Heinzle, J.M., Rendall, A.: Power-law inflation in spacetimes without symmetry. Commun. Math. Phys. 269, 1–15 (2007)
17. Hörmander, L.: The analysis of linear partial differential operators. IV. Fourier integral operators. Grundlehren der Mathematischen Wissenschaften 275. Berlin: Springer-Verlag, 1994
18. Kronthaler, J.: The Cauchy problem for the wave equation in the Schwarzschild geometry. J. Math. Phys. 47(4), 042501, 29 pp (2006)
19. Littman, W.: The wave operator and $L_p$ norms. J. Math. Mech. 12, 55–68 (1963)
20. Littman, W., McCarthy, C., Rivière, N.: The non-existence of $L_p$ estimates for certain translation-invariant operators. Studia Math. 30, 219–229 (1968)
21. Møller, C.: The theory of relativity. Oxford, Clarendon Press, 1952
22. Pecher, H.: $L^p$-Abschätzungen und klassische Lösungen für nichtlineare Wellengleichungen. I. Math. Zeit. 150, 159–183 (1976)
23. Peral, J.C.: $L^p$ estimates for the wave equation. J. Funct. Anal. 36(1), 114–145 (1980)
24. Racke, R.: Lectures on Nonlinear Evolution Equations. Aspects of Mathematics. Braunschweig/Wiesbaden: Vieweg, 1992
25. Rendall, A.: Asymptotics of solutions of the Einstein equations with positive cosmological constant. Ann. Henri Poincaré 5(6), 1041–1064 (2004)
26. Shatah, J., Struwe, M.: Geometric wave equations. Courant Lecture Notes in Mathematics, 2. New York University, Courant Institute of Mathematical Sciences, Amer. Math. Soc. New York: Providence, RI 1998
27. Sonego, S., Faraoni, V.: Huygens' principle and characteristic propagation property for waves in curved space-times. J. Math. Phys. 33(2), 625–632 (1992)
28. Yagdjian, K.: A note on the fundamental solution for the Tricomi-type equation in the hyperbolic domain. J. Differ. Eqs. 206, 227–252 (2004)
29. Yagdjian, K.: Global existence in the Cauchy problem for nonlinear wave equations with variable speed of propagation, New trends in the theory of hyperbolic equations, Oper. Theory Adv. Appl., 159, Basel: Birkhäuser, 2005, pp. 301–385
30. Yagdjian, K.: Global existence for the n-dimensional semilinear Tricomi-type equations. Comm. Partial Diff. Eqs. 31, 907–944 (2006)
31. Yagdjian, K.: Self-similar solutions of semilinear wave equation with variable speed of propagation. J. Math. Anal. Appl. 336, 1259–1286 (2007)
32. Yagdjian, K., Galstian, A.: Fundamental Solutions for Wave Equation in de Sitter Model of Universe. University of Potsdam, August, Preprint 2007/06, available at http://arXiv.org/abs/0710.3878v1 [math.AP], 2007

Communicated by G. W. Gibbons
Commun. Math. Phys. 285, 345–398 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0579-1
Communications in
Mathematical Physics
High-Velocity Estimates for the Scattering Operator and Aharonov-Bohm Effect in Three Dimensions
Miguel Ballesteros, Ricardo Weder
Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, Gustaf Hallstromin katu 2b, Helsinki, FI-00014, Finland.
E-mail: [email protected]; [email protected]
Received: 19 November 2007 / Accepted: 11 March 2008
Published online: 1 August 2008 – © Springer-Verlag 2008
Abstract: We obtain high-velocity estimates with error bounds for the scattering operator of the Schrödinger equation in three dimensions with electromagnetic potentials in the exterior of bounded obstacles that are handlebodies. A particular case is a finite number of tori. We prove our results with time-dependent methods. We consider high-velocity estimates where the direction of the velocity of the incoming electrons is kept fixed as its absolute value goes to infinity. In the case of one torus our results give a rigorous proof that quantum mechanics predicts the interference patterns observed in the fundamental experiments of Tonomura et al. that gave conclusive evidence of the existence of the Aharonov-Bohm effect using a toroidal magnet. We give a method for the reconstruction of the flux of the magnetic field over a cross-section of the torus modulo 2π. Equivalently, we determine modulo 2π the difference in phase for two electrons that travel to infinity, when one goes inside the hole and the other outside it. For this purpose we only need the high-velocity limit of the scattering operator for one direction of the velocity of the incoming electrons. When there are several tori, or more generally handlebodies, the information that we obtain on the fluxes, and on the difference of phases, depends on the relative position of the tori and on the direction of the velocities when we take the high-velocity limit of the incoming electrons. For some locations of the tori we can determine all the fluxes modulo 2π by taking the high-velocity limit in only one direction. We also give a method for the unique reconstruction of the electric potential and the magnetic field outside the handlebodies from the high-velocity limit of the scattering operator.
Research partially supported by CONACYT under Project P42553F. On leave of absence from Departamento de Métodos Matemáticos y Numéricos. Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas. Universidad Nacional Autónoma de México. Apartado Postal 20-726, México DF 01000. Ricardo Weder is a Fellow of the Sistema Nacional de Investigadores.
1. Introduction

The Aharonov-Bohm effect is a fundamental quantum mechanical phenomenon wherein charged particles, like electrons, are physically influenced, in the form of a phase shift, by the existence of magnetic fields in regions that are inaccessible to the particles. This genuinely quantum mechanical phenomenon was predicted by Aharonov and Bohm [3]. See also Ehrenberg and Siday [9]. This phenomenon has been extensively studied both from the theoretical and the experimental points of view. For a review of the literature see [29] and [30]. There has been a large controversy, involving over three hundred papers, concerning the existence of the Aharonov-Bohm effect. For a detailed discussion of this controversy see [30]. The issue was finally settled by the fundamental experiments of Tonomura et al. [37,38], who used toroidal magnets to enclose a magnetic flux inside them. In remarkable experiments they were able to superimpose behind the magnet an electron beam that traveled inside the hole of the magnet with another electron beam that traveled outside the magnet, and they measured the phase shift produced by the magnetic flux enclosed in the magnet, giving conclusive evidence of the existence of the Aharonov-Bohm effect.

In this paper we give a rigorous mathematical analysis of this scattering problem with time-dependent methods. In particular, we give a rigorous mathematical proof that quantum mechanics predicts the phase shifts observed in the Tonomura et al. experiments [37,38]. We consider bounded obstacles, $K$, whose connected components are handlebodies. In particular, they can be the union of a finite number of bodies diffeomorphic to tori or to balls. Some of them can be patched through the boundary. We study the high-velocity limit of the scattering operator in the complement, $\Lambda$, of the obstacle, $K$, for the Schrödinger equation with magnetic field and electric potential in $\Lambda$ and with magnetic fluxes enclosed in the obstacle $K$.

We obtain high-velocity estimates with error bounds for the scattering operator using the time-dependent method of [14]. We consider high-velocity limits where the direction of the velocity of the incoming electrons is kept fixed as its absolute value goes to infinity. The leading term of our estimate gives us a reconstruction formula that allows us to reconstruct the circulation of the magnetic potential modulo 2π along lines in the direction of the velocity (the X-ray transform). From these line integrals we uniquely reconstruct the magnetic field in some region of $\Lambda$. The error term for the leading order goes to zero as a constant divided by the absolute value of the velocity. The next term in our high-velocity estimate allows us to reconstruct the integral of the electric potential along lines in the direction of the velocity (the X-ray transform). We uniquely reconstruct the electric potential in a region of $\Lambda$ from these line integrals. The error term for this high-velocity estimate goes to zero as a constant divided by a power of the absolute value of the velocity that depends on the decay rate at infinity of the magnetic field and of the electric potential. If we have enough decay this power is one, as for the leading order. The leading-order high-velocity estimate is given in Theorem 5.7 and the next term in our high-velocity estimate is given in Theorem 5.9. The unique reconstruction of the magnetic field and the electric potential in a region of $\Lambda$ is given in Theorem 6.3. The reconstruction method is summarized in Remark 6.4.

Then, we consider the Aharonov-Bohm effect. We assume that the magnetic field in $\Lambda$ is identically zero. On the contrary, the electric potential is not assumed to be zero. In other words, we analyze the Aharonov-Bohm effect in the presence of an electric potential. We use for reconstruction only the leading-order high-velocity estimate. As
for high velocities, the electric potential gives a lower-order contribution; it plays no role in the Aharonov-Bohm effect. However, to allow for a non-trivial electric potential could be of interest from the experimental point of view. In Theorem 7.1 we reconstruct the circulation of the magnetic potential, modulo 2π, over a set of closed paths in $\Lambda$ and in Remark 7.3 we reconstruct the projection of the de Rham cohomology class of the magnetic potential onto a subspace of $H^1_{\mathrm{de\,R}}(\Lambda)$ in the sense that we reconstruct, modulo 2π, the expansion coefficients of the projection into the subspace of the de Rham cohomology class of the magnetic potential in any basis of the subspace. The set of circulations and the projection of the de Rham cohomology class of the magnetic potential that we can reconstruct depend on the relative position of the handlebodies and on the direction of the velocity of the incoming electrons. In Theorem 7.11, Corollary 7.12 and Remark 7.13 we give our method for the reconstruction of the fluxes inside the obstacle $K$, modulo 2π. Since the scattering operator is invariant under short-range gauge transformations that change the fluxes by multiples of 2π, the fluxes can only be reconstructed modulo 2π. Again, the fluxes that we reconstruct depend on the relative position of the handlebodies and on the direction of the velocity of the incoming electrons. In Example 7.14 we give obstacles that consist of a finite number of tori and manifolds diffeomorphic to balls, where from the high-velocity limit of the scattering operator in only one direction we reconstruct modulo 2π all the circulations in $\Lambda$ of the magnetic potential, its de Rham cohomology class modulo 2π, and the flux modulo 2π of the magnetic field over the cross section of all the tori.

Finally, we discuss the fundamental experiments of Tonomura et al. [37,38] in Sect. 8. We show that our results give a rigorous proof that quantum mechanics predicts the interference patterns between electron beams that go inside and outside the torus, that were observed in these remarkable experiments.

The paper is organized as follows. In Sect. 2 we give a precise definition of the obstacle, $K$, and we study in a detailed way the homology and the cohomology of $K$ and $\Lambda$. This allows us to construct homology and cohomology bases that have clear physical significance. Using these results we construct in Sect. 3 classes of magnetic potentials characterized by the magnetic field in $\Lambda$ and by the fluxes of the magnetic field in the cross sections of the components of $K$ that have holes. We construct classes of magnetic potentials where the fluxes are fixed, and classes where the fluxes are only fixed modulo 2π. We study the gauge transformations between these magnetic potentials. In Sect. 4 we define the Hamiltonian of our system. In Sect. 5 we study our direct scattering problem. We prove the existence of the wave operators and we define the scattering operator. We analyze how the wave and scattering operators change under the change of the magnetic potential when the fluxes are only fixed modulo 2π. We also prove our high-velocity estimates. In Sect. 6 we give our method for the reconstruction of the magnetic field and the electric potential in a region of $\Lambda$. In Sect. 7 we obtain our results on the Aharonov-Bohm effect and in Sect. 8 we discuss the Tonomura et al. experiments [37,38]. In Appendixes A and B we prove results in homology that we need.

For the Aharonov-Bohm effect in scattering in two dimensions see [28] and [40]. For inverse scattering by magnetic fields in all space see [4–6,20–22]. For properties of the scattering matrix for scattering by Aharonov-Bohm potentials in all space see [33,34] and [42,43]. For the Aharonov-Bohm effect in inverse boundary-value problems see [10–13], [24] and [25].

Finally, some words about our notations and definitions. We use notions of homology and cohomology as defined, for example, in [7,16,17,8] and [41]. In particular, we consider homology and cohomology groups on open sets of $\mathbb{R}^n$, $n = 2, 3$ with coefficients in
$\mathbb{Z}$ and in $\mathbb{R}$. As these singular homology and cohomology groups are isomorphic to the $C^\infty$ homology and cohomology groups ([7], p. 291), we will identify them. We also use differential forms, or just forms, in open sets of $\mathbb{R}^3$ with regular boundary (or in their closure) with the Euclidean metric, as defined, for example, in [8,35,41]. For such a set, $O$, we denote by $\Omega^k(O)$ the set of all $k$-forms in $O$. We use the standard identification between concepts of vector calculus and differential forms in three dimensions in the interior of $O$, that we denote by $\overset{\circ}{O}$, [35]. Let $\{x^i\}_{i=1}^3$ be the Euclidean coordinates of $\mathbb{R}^3$. We identify vectors and 1-differential forms as

$$(A_1, A_2, A_3) \Longleftrightarrow \sum_{j=1}^{3} A_j\, dx^j.$$

We identify vectors and 2-differential forms as

$$(B_1, B_2, B_3) \Longleftrightarrow B_3\, dx^1 \wedge dx^2 - B_2\, dx^1 \wedge dx^3 + B_1\, dx^2 \wedge dx^3.$$

We identify scalars and 3-differential forms as

$$f \Longleftrightarrow f\, dx^1 \wedge dx^2 \wedge dx^3.$$

The exterior derivative, $d$, in 1-forms is equivalent to the curl of the associated vector, and in 2-forms is equivalent to the divergence of the associated vector. In particular, a 1-form, $A$, is closed if $dA = 0$, or equivalently, if the associated vector has curl zero, and a 2-form, $B$, is closed if $dB = 0$, or equivalently, if the associated vector has divergence zero. For 0-forms the exterior derivative coincides with the gradient $\nabla$.

We will always assume that the coefficients of our forms are at least locally integrable in any coordinate chart. Hence, they define distributions or currents [8]. We say that a form belongs to some space if its coefficients in any coordinate chart belong to that space. For example, we say that a form is continuous if it has continuous coefficients or that it is $L^p$ if its coefficients are in $L^p$. In the case of a 2-form, $B$, we will say that $B \in L^p_{\Omega^2}(O)$, or, equivalently, that the associated vector $B \in L^p(\overset{\circ}{O})$. For forms defined in $O$ that are not differentiable in the classical sense, the derivatives are taken in the distribution sense in $O$, if $O$ is open, or in $\overset{\circ}{O}$ if it is closed.

For any $x \in \mathbb{R}^3$, $x \neq 0$, we denote $\hat{x} := x/|x|$. By $B_r^{\mathbb{R}^n}(x_0)$, $n = 2, 3$, we denote the open ball of center $x_0$ and radius $r$. By $S^2$ we denote the unit sphere in $\mathbb{R}^3$. For any set $O$ we denote by $F(x \in O)$ the operator of multiplication by the characteristic function of $O$. The symbol $\cong$ means isomorphism, the symbol $\simeq$ means homotopic equivalence and the symbol $\approx$ means homeomorphism. We define the Fourier transform as a unitary operator on $L^2(\mathbb{R}^3)$ as follows:

$$\hat{\phi}(p) := F\phi(p) := \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} e^{-ip\cdot x} \phi(x)\, dx.$$

We define functions of the operator $\mathbf{p} := -i\nabla$ by Fourier transform,

$$f(\mathbf{p})\phi := F^* f(p) F\phi, \qquad D(f(\mathbf{p})) := \{\phi \in L^2(\mathbb{R}^3) : f(p)\, \hat{\phi}(p) \in L^2(\mathbb{R}^3)\},$$

for every measurable function $f$.
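Two standard instances of this definition may help fix the conventions. This is only a sketch: the vector $a \in \mathbb{R}^3$ and the Sobolev space $H^2(\mathbb{R}^3)$ are notation introduced here for illustration. Taking $f(p) = |p|^2$ gives the free Laplacian, and taking $f(p) = e^{-ia\cdot p}$ gives a translation:

$$
|\mathbf{p}|^2 \phi = F^* |p|^2 F\phi = -\Delta \phi, \qquad D(|\mathbf{p}|^2) = \{\phi \in L^2(\mathbb{R}^3) : |p|^2 \hat{\phi}(p) \in L^2(\mathbb{R}^3)\} = H^2(\mathbb{R}^3),
$$

$$
\big(e^{-ia\cdot\mathbf{p}} \phi\big)(x) = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} e^{ip\cdot x}\, e^{-ia\cdot p}\, \hat{\phi}(p)\, dp = \phi(x - a).
$$

The second identity uses only that $F^*$ is the inverse Fourier transform with the same normalization $(2\pi)^{-3/2}$.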
2. The Obstacle

2.1. Handlebodies. Let us designate by $S^1$ the unit circle. We denote by $T := S^1 \times \overline{B_1^{\mathbb{R}^2}(0)}$ the solid torus of dimension 3. We orient $T$ assuming that the inverse of the following function is a chart that belongs to the orientation of $T$,

$$U : (0, 1) \times B_1^{\mathbb{R}^2}(0) \to T, \qquad U(t, x, y) = (e^{2\pi i t}, x, y). \tag{2.1}$$

The boundary sum of $T$ with itself is defined as follows. See [15], p. 19. Let $D_1 \subseteq \partial T$ be a disc contained in a chart, $(U_1, \phi_1)$, belonging to the orientation of $T$ and let $D_2 \subseteq \partial T$ be a disc contained in a chart, $(U_2, \phi_2)$, belonging to the opposite orientation. We define the boundary sum $T \natural T$ as the disjoint union of $T$ with itself, identifying $D_1$ in the first torus with $D_2$ in the second torus by means of the charts, in such a way that $T \natural T$ is an oriented differentiable manifold, the inclusion $l_1 : T \to T \natural T$ in the first torus is a homeomorphism onto its image whose restriction to $\overset{\circ}{T}$ is a diffeomorphism that preserves orientation and the inclusion $l_2 : T \to T \natural T$ in the second torus is a homeomorphism onto its image whose restriction to $\overset{\circ}{T}$ is a diffeomorphism that inverts orientation.

We define the boundary sum of $k$ tori by induction. Suppose that we already defined the boundary sum $(k-1)T := T \natural T \natural \cdots \natural T$, $k-1$ times, of $k-1$ tori. Let $l_j$, $j = 1, 2, \ldots, k-1$ be the inclusion of $T$ on the $j$-th torus. As before, let $D_1 \subseteq \partial T$ be a disc contained in a chart $(U_1, \phi_1)$ belonging to the orientation of $T$ if $k-1$ is odd, or belonging to the opposite orientation if $k-1$ is even. Moreover, we assume that $l_{k-1}(U_1)$ does not intersect any of the union charts in $(k-1)T$. This is always possible choosing the union charts small enough. Let $D_2 \subseteq \partial T$ be a disc contained in a chart $(U_2, \phi_2)$ belonging to the opposite orientation of $T$. Then, the boundary sum $kT := T \natural T \natural \cdots \natural T$, $k$ times, is obtained from $T \natural \cdots \natural T$, $k-1$ times, identifying $l_{k-1}(D_1)$ with $D_2$ by means of the charts $(l_{k-1}(U_1), \phi_1 \circ l_{k-1}^{-1})$ and $(U_2, \phi_2)$ in such a way that $kT$ is an oriented differentiable manifold, the inclusion $(k-1)T \to kT$ in the first $k-1$ tori is a homeomorphism onto its image whose restriction to the interior is a diffeomorphism that preserves orientation and the inclusion $l_k : T \to kT$ in the last torus is a homeomorphism onto its image whose restriction to $\overset{\circ}{T}$ is a diffeomorphism that inverts orientation. The structure of $kT$ as an oriented differentiable manifold does not depend on the discs used to join the tori [15], p. 19.

We will say that any oriented differentiable manifold diffeomorphic to $kT$ is a handlebody with $k$ handles, where the diffeomorphism is oriented. We will denote by $0T$ any oriented manifold that is diffeomorphic to the closed ball in $\mathbb{R}^3$ of center zero and radius one. Note that the inclusions $l_j : T \to kT$ onto the $j$-th torus are homeomorphisms onto their images whose restrictions to the interior are diffeomorphisms that preserve orientation if $j$ is odd and change orientation if $j$ is even.
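2.2. Homology of handlebodies. We define the functions $\gamma_\pm : [0, 1] \to T$, $\gamma_\pm(t) = (e^{\pm 2\pi i t}, 0, 0)$, and

$$
Z_j(t) := \begin{cases} l_j \circ \gamma_+(t) & \text{if } j \text{ is odd}, \\ l_j \circ \gamma_-(t) & \text{if } j \text{ is even}. \end{cases} \tag{2.2}
$$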
For any $\xi \in S^1$ we define

$$B_\xi := \{\xi\} \times \overline{B_1^{\mathbb{R}^2}(0)} \subseteq T. \tag{2.3}$$

We orient $B_\xi$ by requiring that the inverse of the inclusion $\overline{B_1^{\mathbb{R}^2}(0)} \to B_\xi$ belongs to the orientation of $B_\xi$, i.e., the inverse of the inclusion is a chart. The image of $Z_j$ in $kT$ is a submanifold that we orient by means of the curve $Z_j$. We assume that $l_j(B_\xi)$ does not intersect any of the union charts, which is always possible if the union charts are small enough. We orient the submanifold $l_j(B_\xi)$ by the orientation of $B_\xi$. Let $v_1 \in T_{l_j(\xi,0,0)}(Z_j([0,1])) \subseteq T_{l_j(\xi,0,0)}(kT)$ be a tangent vector in the orientation of $Z_j([0,1])$, and let $v_2, v_3 \in T_{l_j(\xi,0,0)}(l_j(B_\xi)) \subseteq T_{l_j(\xi,0,0)}(kT)$ with $(v_2, v_3)$ in the positive orientation of $l_j(B_\xi)$. Then, $(v_1, v_2, v_3)$ is positively oriented in the tangent space $T_{l_j(\xi,0,0)}(kT)$. This means that $Z_j([0,1])$ and $l_j(B_\xi)$ intersect in a positive way.

Let us denote by $H_1(kT; \mathbb{R})$ the first group of singular homology of $kT$ with coefficients in $\mathbb{R}$. See [16], p. 47. In Appendix A we give a proof, for the reader's convenience, that $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ is a basis of $H_1(kT; \mathbb{R})$.

2.3. Definition of the obstacle.

Assumption 2.1. We assume that the obstacle $K$ is a compact submanifold of $\mathbb{R}^3$ of dimension three oriented with the orientation of $\mathbb{R}^3$. Moreover, $K = \cup_{j=1}^{L} K_j$, where $K_j$, $1 \le j \le L$ are the connected components of $K$. We assume that the $K_j$ are handlebodies.

By our assumption there exist numbers $m_j \in \mathbb{N} \cup \{0\}$ and oriented diffeomorphisms $F_j : m_j T \to K_j$, $1 \le j \le L$. We denote by $J_j$ the inclusion $K_j \to K$. The diffeomorphisms $F_j$ induce a diffeomorphism

$$G : \bigsqcup_{j=1}^{L} m_j T \to K,$$

where the symbol $\bigsqcup$ means disjoint union. We denote,

$$
J := \{ j \in \{1, 2, \ldots, L\} : m_j > 0 \}, \qquad m := \sum_{j=1}^{L} m_j,
$$

$$
\{\gamma_k\}_{k=1}^{m} := \{ J_j \circ F_j \circ Z_i \;|\; j \in J,\ i \in \{1, 2, \ldots, m_j\} \}. \tag{2.4}
$$

Choose a $\xi \in S^1$ such that $l_i(B_\xi)$ does not intersect any chart of union in $m_j T$, $\forall j \in J$, $\forall i \in \{1, 2, \ldots, m_j\}$. This is always possible by choosing the charts of union in a proper way. If $\gamma_k = J_j \circ F_j \circ Z_i$ we define $B_k := J_j \circ F_j (l_i(B_\xi))$. $B_k$ is a manifold that we orient by means of the orientation of $B_\xi$. As $F_j$ is an oriented diffeomorphism and $Z_i$ intersects $l_i(B_\xi)$ in a positive way, it follows that $\gamma_k$ intersects $B_k$ in a positive way. We define,

$$W_\xi : [0, 1] \to T, \quad W_\xi(t) := (\xi, \cos t, \sin t) \quad \text{and} \quad \tilde{\gamma}_k := J_j \circ F_j \circ l_i \circ W_\xi. \tag{2.5}$$
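Take $\varepsilon > 0$ such that $\{x \,|\, \mathrm{distance}(x, \partial K) < \varepsilon\}$ is diffeomorphic to $\partial K \times (-\varepsilon, \varepsilon)$. This is possible by the tubular neighborhood theorem. See Theorem 11.4, p. 93 of [7]. We define,

$$\hat{\gamma}_k(t) := \tilde{\gamma}_k(t) + \frac{\varepsilon}{2} N(\tilde{\gamma}_k(t)), \tag{2.6}$$

where $N(\tilde{\gamma}_k(t))$ is the exterior normal to $K$ at the point $\tilde{\gamma}_k(t)$. Note that $\partial B_k = \tilde{\gamma}_k([0,1])$, and that the orientation on $\tilde{\gamma}_k([0,1])$ induced by $B_k$ is the orientation induced by the curve $\tilde{\gamma}_k$.

2.4. The homology of the obstacle.

Proposition 2.2. $\{[\gamma_k]_{H_1(K;\mathbb{R})}\}_{k=1}^{m}$ is a basis of $H_1(K; \mathbb{R})$.

Proof. As $G : \bigsqcup_{j=1}^{L} m_j T \to K$ is a diffeomorphism and since

$$H_1\Big( \bigsqcup_{j=1}^{L} m_j T; \mathbb{R} \Big) \cong \oplus_{j=1}^{L} H_1(m_j T; \mathbb{R}),$$

by Proposition 9.5, p. 47 of [16], it follows from Proposition 9.3 of Appendix A that $\{[\gamma_k]_{H_1(K;\mathbb{R})}\}_{k=1}^{m}$ is a basis of $H_1(K; \mathbb{R})$.

2.5. The cohomology of the obstacle. As $K$ is an ANR (absolute neighborhood retract, p. 225 and Theorem 26.17.4 of [16]) we have that

$$\check{H}^1(K; \mathbb{R}) \cong H^1(K; \mathbb{R}), \tag{2.7}$$

by Proposition 27.1, p. 230 of [16] (see also p. 347, Theorem 7.15 of [7]). By Alexander's duality theorem (see Theorem 27.5, p. 233 of [16])

$$\check{H}^1(K; \mathbb{R}) \cong H_2(\mathbb{R}^3, \mathbb{R}^3 \setminus K; \mathbb{R}). \tag{2.8}$$

By Theorem 14.1, p. 75 of [16] we have the following exact sequence:

$$H_2(\mathbb{R}^3; \mathbb{R}) \to H_2(\mathbb{R}^3, \mathbb{R}^3 \setminus K; \mathbb{R}) \to H_1(\mathbb{R}^3 \setminus K; \mathbb{R}) \to H_1(\mathbb{R}^3; \mathbb{R}).$$

As $\mathbb{R}^3$ is homotopically equivalent to a point, it follows from Theorem 11.3, p. 59 and Example 9.4, p. 47 of [16] that $H_2(\mathbb{R}^3; \mathbb{R}) = 0$ and $H_1(\mathbb{R}^3; \mathbb{R}) = 0$. Then, we have the exact sequence

$$0 \to H_2(\mathbb{R}^3, \mathbb{R}^3 \setminus K; \mathbb{R}) \to H_1(\mathbb{R}^3 \setminus K; \mathbb{R}) \to 0,$$

and then,

$$H_2(\mathbb{R}^3, \mathbb{R}^3 \setminus K; \mathbb{R}) \cong H_1(\mathbb{R}^3 \setminus K; \mathbb{R}). \tag{2.9}$$

By (2.7), (2.8), (2.9),

$$H^1(K; \mathbb{R}) \cong H_1(\mathbb{R}^3 \setminus K; \mathbb{R}). \tag{2.10}$$

By the theorem of universal coefficients, p. 198 of [17], $H^1(K; \mathbb{R}) \cong \mathrm{Hom}_{\mathbb{R}}(H_1(K; \mathbb{R}), \mathbb{R})$. Then, it follows that,

$$\dim H_1(K; \mathbb{R}) = \dim H_1(\mathbb{R}^3 \setminus K; \mathbb{R}) = m. \tag{2.11}$$

We denote, $\Lambda := \mathbb{R}^3 \setminus K$. We will prove in Corollary 2.4 that $\{[\hat{\gamma}_k]_{H_1(\Lambda;\mathbb{R})}\}_{k=1}^{m}$ is a basis of $H_1(\Lambda; \mathbb{R})$.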
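2.6. de Rham cohomology of $\Lambda$. Let us define

$$
G^{(j)}(x) := \frac{1}{4\pi} \operatorname{curl} \int_{\gamma_j} \frac{1}{|x - y|}\, d\gamma_j(y) := \frac{1}{4\pi} \operatorname{curl} \int_0^1 \frac{1}{|x - \gamma_j(t)|}\, \dot{\gamma}_j(t)\, dt. \tag{2.12}
$$

Then,

$$
\operatorname{curl} G^{(j)}(x) = 0, \quad x \in \mathbb{R}^3 \setminus \gamma_j, \quad \text{and} \quad \int_{\hat{\gamma}_k} G^{(j)} = \delta_{k,j}, \quad j, k = 1, 2, \ldots, m. \tag{2.13}
$$

Equation (2.12) is the law of Biot-Savart that gives the magnetic field created by a current in $\gamma_j$ and (2.13) is Ampere's law. For a proof see Satz 1.4, p. 33, of [26].

Proposition 2.3. $\big\{[G^{(j)}]_{H^1_{\mathrm{de\,R}}(\Lambda)}\big\}_{j=1}^{m}$ is a basis of $H^1_{\mathrm{de\,R}}(\Lambda)$.

Proof. We first prove that they are linearly independent. Suppose that $\sum_{j=1}^{m} \alpha_j [G^{(j)}]_{H^1_{\mathrm{de\,R}}(\Lambda)} = 0$, then $\sum_{j=1}^{m} \alpha_j G^{(j)} = d\lambda$ for some 0-form $\lambda$. Hence,

$$\int_{\hat{\gamma}_k} \sum_{j=1}^{m} \alpha_j G^{(j)} = \alpha_k = 0.$$

By de Rham's Theorem (Theorem 4.17, p. 154 of [41]) the dual space to $H_1(\Lambda; \mathbb{R})$ is isomorphic to $H^1_{\mathrm{de\,R}}(\Lambda)$. The isomorphisms are given by

$$\big\langle [\alpha]_{H_1(\Lambda;\mathbb{R})}, [A]_{H^1_{\mathrm{de\,R}}(\Lambda)} \big\rangle := \int_{\alpha} A.$$

Then, by (2.11),

$$\dim H^1_{\mathrm{de\,R}}(\Lambda) = \dim H_1(\Lambda; \mathbb{R}) = m,$$

and this proves the proposition.

Corollary 2.4. $\big\{[\hat{\gamma}_r]_{H_1(\Lambda;\mathbb{R})}\big\}_{r=1}^{m}$ is a basis of $H_1(\Lambda; \mathbb{R})$.

Proof. By (2.13) $\big\{[\hat{\gamma}_r]_{H_1(\Lambda;\mathbb{R})}\big\}_{r=1}^{m}$ is the dual basis, in the sense of de Rham's Theorem, to the basis $\big\{[G^{(r)}]_{H^1_{\mathrm{de\,R}}(\Lambda)}\big\}_{r=1}^{m}$ of $H^1_{\mathrm{de\,R}}(\Lambda)$.

Proposition 2.5. Let $A$ be a closed 1-form with continuous coefficients defined in $\Lambda$ and such that

$$\int_{\hat{\gamma}_r} A = 0, \quad r = 1, 2, \ldots, m.$$

Then, there is a continuously differentiable 0-form, $\lambda$, such that $A = d\lambda$. Moreover, we can take $\lambda(x) := \int_{C(x_0,x)} A$, where $x_0$ is any fixed point in $\Lambda$ and $C(x_0, x)$ is any curve from $x_0$ to $x$.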
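Proof. By Theorem 12, p. 68, of [8] there is a regularization $R(\epsilon)$ and an operator $\mathcal{A}(\epsilon)$ such that if $\alpha$ is a continuous $k$-form on $\Lambda$, $R\alpha$ is a $C^\infty$ $k$-form on $\Lambda$ and $\mathcal{A}\alpha$ is a continuous $(k-1)$-form on $\Lambda$. Moreover, $\lim_{\epsilon \to 0} R\alpha = \alpha$ uniformly on compact sets in $\Lambda$. Furthermore,

$$R\alpha - \alpha = b\mathcal{A}\alpha + \mathcal{A}b\alpha, \tag{2.14}$$

where $b\alpha := (-1)^{\mathrm{grade}(\alpha)-1} d\alpha$. Multiplying (2.14) on the left by $b$ and applying it to $b\alpha$ we prove that $Rb = bR$. As $A$ is closed, it follows from (2.14) that $RA - A = b\mathcal{A}A$. In particular, this implies that $b\mathcal{A}A$ is continuous. Let $C$ be a closed curve. Then, by Stokes theorem,

$$\int_C b\mathcal{A}A = \lim_{\epsilon \to 0} \int_C R\, b\mathcal{A}A = \lim_{\epsilon \to 0} \int_C b R \mathcal{A}A = 0,$$

and then,

$$\int_C RA = \int_C A,$$

and in particular,

$$\int_{\hat{\gamma}_r} RA = \int_{\hat{\gamma}_r} A = 0, \quad r = 1, 2, \ldots, m.$$

As $RA$ is $C^\infty$ and closed, and since $\big\{[\hat{\gamma}_r]_{H_1(\Lambda;\mathbb{R})}\big\}_{r=1}^{m}$ is a basis of $H_1(\Lambda; \mathbb{R})$ it follows from de Rham's Theorem (Theorem 4.17, p. 154, [41]) that there is an infinitely differentiable 0-form $\alpha$ such that $RA = b\alpha$. But then, using Stokes theorem again,

$$\int_C RA = \int_C b\alpha = 0,$$

and we obtain that,

$$\int_C A = 0,$$

for any closed curve $C$ and we can define $\lambda := \int_{C(x_0,x)} A$. Clearly, $A = d\lambda$.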
Recall that $\{K_j\}_{j=1}^{L}$ is the set of connected components of $K$. For each $j \in \{1, 2, \cdots, L\}$ we choose a $x_j$ in the interior of $K_j$. We define the vector,

$$D_j := -\operatorname{grad} \frac{1}{4\pi} \frac{1}{|x - x_j|}, \quad x \in \mathbb{R}^3 \setminus \{x_j\}, \tag{2.15}$$

and according to our convention, we denote by the same symbol the associated 2-form. Note that $\operatorname{div} D_j(x) = dD_j = -\frac{1}{4\pi} \Delta \frac{1}{|x - x_j|} = 0$, $x \neq x_j$, $j = 1, 2, \ldots, L$ and that,

$$|D_j(x)| \le C(1 + |x|)^{-2}, \quad x \in \Lambda.$$

For any $r > 0$ such that $K \subset B_r^{\mathbb{R}^3}(0)$ we denote,

$$\Lambda_r := \Lambda \cap B_r^{\mathbb{R}^3}(0), \quad \text{and} \quad \Lambda_\infty := \Lambda. \tag{2.16}$$
L 2 Proposition 2.6. [D j ] H 2 is a basis of Hde ( ) R (r ) for r ≤ ∞. de R r j=1 Proof. Let us first consider the case r = ∞. As in the proof of (2.11) we prove that dim H0 (K ; R) = dim H2 (; R). But by Proposition 9.6, p. 48 of [16], H0 (K ; R) ∼ = ⊕ Lj=1 R. Moreover, by de Rham’s Theorem (Theorem 4.17, p. 154 of [41]) 2 ∗ ∼ Hde R () = (H2 (; R)) .
(2.17)
2 dim Hde R () = dim H2 (; R) = L .
(2.18)
Then,
Let us now consider $r < \infty$. We define $f: \Lambda \to \Lambda_r$,
\[ f(x) := \begin{cases} r_1\dfrac{x}{|x|}, & \text{if } |x| \ge r_1, \\ x, & \text{if } |x| \le r_1, \end{cases} \]
and $H(x,t): (\Lambda \times [0,1]) \to \Lambda$,
\[ H(x,t) := \begin{cases} x + t\left(r_1\dfrac{x}{|x|} - x\right), & \text{if } |x| \ge r_1, \\ x, & \text{if } |x| \le r_1, \end{cases} \]
where $r_1 < r$ and $K \subset B_{r_1}^{\mathbb{R}^3}(0)$. Let $l$ be the inclusion $l: \Lambda_r \to \Lambda$. Then as $l \circ f(x) = l \circ H(x,1) = H(x,1)$ and $H(x,0) = I(x)$, we have that $l \circ f$ is homotopic to the identity. Let us denote by $\tilde H(x,t)$ the restriction of $H(x,t)$ to $\Lambda_r$. Then $f \circ l(x) = \tilde H(l(x),1) = \tilde H(x,1)$, and as $\tilde H(x,0) = I(x)$ we also have that $f \circ l$ is homotopic to the identity. Hence, by Theorem 11.3, p. 59 [16] the inclusion $l$ induces an isomorphism in homology. In particular, $H_2(\Lambda_r;\mathbb{R}) \cong H_2(\Lambda;\mathbb{R})$ and then,
\[ \dim H_2(\Lambda_r;\mathbb{R}) = \dim H_2(\Lambda;\mathbb{R}) = L. \tag{2.19} \]
It follows from Stokes' theorem and as $-\frac{1}{4\pi}\Delta\frac{1}{|x - x_j|} = \operatorname{div} D_j(x) = \delta(x - x_j)$ that
\[ \int_{\partial K_i} D_j = \int_{\partial B_\rho^{\mathbb{R}^3}(x_i)} D_j = \delta_{i,j}, \tag{2.20} \]
for $\rho$ small enough and $i, j = 1, 2, \dots, L$. This easily implies that the set $\{[D_j]_{H^2_{deR}(\Lambda_r)}\}_{j=1}^{L}$ is linearly independent.
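The normalization (2.20) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes two hypothetical centers `x1`, `x2` standing in for the points $x_j$, writes $D_j(x) = \frac{1}{4\pi}\frac{x - x_j}{|x - x_j|^3}$ (the gradient in (2.15) evaluated explicitly), and approximates the outward flux through a small sphere with a midpoint rule.

```python
import math

def D(x, xj):
    """Vector field D_j(x) = (1/4pi) (x - x_j)/|x - x_j|^3, i.e. (2.15) written out."""
    d = [x[k] - xj[k] for k in range(3)]
    r = math.sqrt(sum(c * c for c in d))
    return [c / (4.0 * math.pi * r ** 3) for c in d]

def flux_through_sphere(center, radius, xj, n_theta=100, n_phi=200):
    """Midpoint-rule approximation of the outward flux of D_j through a sphere."""
    total = 0.0
    dth = math.pi / n_theta
    dph = 2.0 * math.pi / n_phi
    for a in range(n_theta):
        th = (a + 0.5) * dth
        for b in range(n_phi):
            ph = (b + 0.5) * dph
            # outward unit normal at the quadrature point
            n = [math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th)]
            x = [center[k] + radius * n[k] for k in range(3)]
            Dx = D(x, xj)
            # surface element on the sphere: radius^2 sin(theta) dtheta dphi
            total += sum(Dx[k] * n[k] for k in range(3)) * radius ** 2 * math.sin(th) * dth * dph
    return total

# Hypothetical centers (assumed for illustration only).
x1 = (0.0, 0.0, 0.0)
x2 = (3.0, 0.0, 0.0)
flux_11 = flux_through_sphere(x1, 0.5, x1)  # sphere around x1, field D_1
flux_12 = flux_through_sphere(x1, 0.5, x2)  # sphere around x1, field D_2
```

Under these assumptions `flux_11` is close to 1 and `flux_12` close to 0, which is the Kronecker-delta pattern that gives linear independence of the classes $[D_j]$.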
Lemma 2.7. Suppose that $\{[S_j]_{H_2(\Lambda_r;\mathbb{R})}\}_{j=1}^{L}$ is a basis of $H_2(\Lambda_r;\mathbb{R})$ for $r \le \infty$. Let $D$ be a closed 2-form with continuous coefficients in $\overline{\Lambda_r}$. Then,
\[ \int_{\partial K_j} D = 0,\ \forall j \in \{1, 2, \dots, L\} \iff \int_{S_j} D = 0,\ \forall j \in \{1, 2, \dots, L\}. \tag{2.21} \]
Proof. Denote $K_{j,\varepsilon} := \{x \in \mathbb{R}^3 : \operatorname{dist}(x, K_j) < \varepsilon\}$, where $\varepsilon$ is so small that the tubular neighborhood theorem applies, and let $R$ be the regularization operator. Suppose that the left side of (2.21) holds. Then, as $D$ is closed, we prove using Stokes' theorem that
\[ \int_{\partial K_{j,\varepsilon}} RD = 0, \quad \forall j \in \{1, 2, \dots, L\}. \]
As $RD$ is $C^\infty$ and closed, since $bR = Rb$, there are coefficients $\lambda_j$, $j = 1, 2, \dots, L$, and a 1-form $\alpha$ such that
\[ RD = \sum_{j=1}^{L} \lambda_j D_j + d\alpha. \]
Then, it follows from (2.20) (with $K_{j,\varepsilon}$ instead of $K_j$) and Stokes' theorem that
\[ 0 = \int_{\partial K_{j,\varepsilon}} RD = \lambda_j, \]
and we obtain that $RD = d\alpha$. Furthermore, using the regularization operator and Stokes' theorem we prove that
\[ \int_{S_j} D = \int_{S_j} RD = \int_{S_j} d\alpha = 0, \quad j \in \{1, 2, \dots, L\}. \]
Assume now that $\int_{S_j} D = 0$, $j \in \{1, 2, \dots, L\}$. We prove as above that $\int_{S_j} RD = 0$, $j \in \{1, 2, \dots, L\}$, and by de Rham's Theorem (Theorem 4.17, p. 154 of [41]) there is a 1-form $\alpha$ such that $RD = d\alpha$. Hence,
\[ \int_{\partial K_{j,\varepsilon}} D = \int_{\partial K_{j,\varepsilon}} RD = \int_{\partial K_{j,\varepsilon}} d\alpha = 0, \]
and then,
\[ \int_{\partial K_j} D = \lim_{\varepsilon \to 0} \int_{\partial K_{j,\varepsilon}} D = 0, \quad j \in \{1, 2, \dots, L\}. \]
3. Magnetic Field and Magnetic Potentials

In this section we introduce the class of magnetic fields that we consider and we construct a class of associated magnetic potentials with nice behavior at infinity that will allow us to solve our scattering problems.

Definition 3.1. We say that a form $B$ in $\overline{\Lambda}$ is continuous in a neighborhood of $\partial K$ if there is an $\varepsilon > 0$ such that the coefficients of $B$ are continuous in $\overline{\Lambda} \cap K_\varepsilon$, where $K_\varepsilon := \{x \in \mathbb{R}^3 : \operatorname{dist}(x, K) < \varepsilon\}$.

Below we assume that the magnetic field, $B$, is a 2-form that is continuous in a neighborhood of $\partial K$ and satisfies
\[ \int_{\partial K_j} B = 0, \quad j \in \{1, 2, \dots, L\}. \tag{3.22} \]
This condition means that the total contribution of magnetic monopoles inside each component $K_j$ of the obstacle is 0. In a formal way we can use Stokes' theorem to conclude that
\[ \int_{\partial K_j} B = 0 \iff \int_{K_j} \operatorname{div} B = 0, \quad j \in \{1, 2, \dots, L\}. \]
As $\operatorname{div} B$ is the density of magnetic charge, $\int_{\partial K_j} B$ is the total magnetic charge inside $K_j$, and our condition (3.22) means that the total magnetic charge inside $K_j$ is zero. This condition is fulfilled if there is no magnetic monopole inside $K_j$, $j \in \{1, 2, \dots, L\}$.

Theorem 3.2. Let $B$ be a 2-form in $L^p_{loc}\Omega^2(\overline{\Lambda})$, $p \ge 2$, that is continuous in a neighborhood of $\partial K$ and satisfies (3.22). Suppose that the restriction of $B$ to $\Lambda$ is closed ($dB|_\Lambda = 0$) as a distribution (or current [8]). Then, $B$ has an extension to a closed 2-form $\overline{B} \in L^p_{loc}\Omega^2(\mathbb{R}^3)$ such that $\overline{B}|_\Lambda = B$.

Proof. Let us denote $M := \overline{\Lambda_r}$, $r < \infty$. $M$ is a compact manifold. We denote by $B_M$ the restriction of $B$ to $M$. As $dB|_\Lambda = 0$, it follows from Green's formula (Prop. 2.12, p. 60, [35]) that
\[ \langle B_M, \delta\eta \rangle = 0, \quad \forall \eta \in C_0^\infty\Omega^3(\mathring{M}). \tag{3.23} \]
We denote (Definition 2.4.1, p. 80 [35]) $\mathcal{C}^k(M) := \{\delta\eta \mid \eta \in H^1\Omega^{k+1}_N(M)\}$, and (Definition 2.2.1, p. 67 [35]) $H^1\Omega^k_N(M) := \{\eta \in H^1\Omega^k(M) \mid n\eta = 0\}$. Let us recall (p. 27 [35]) that given $\eta \in \Omega^3(M)$ and tangent vectors $v_i \in T_x(M)$, $x \in \partial M$, $i \in \{1, 2, 3\}$,
\[ t\eta(v_1, v_2, v_3) = \eta(v_1^{\parallel}, v_2^{\parallel}, v_3^{\parallel}), \]
where $v_i^{\parallel}$ is the projection of $v_i$ into $T_x(\partial M)$. As $\eta$ is a multi-linear function and $v_1^{\parallel}, v_2^{\parallel}, v_3^{\parallel}$ are linearly dependent, $t\eta = 0$. By the definition in p. 27 [35], $n\eta := \eta - t\eta = \eta$. It follows that $n\eta = \eta$, $\eta \in H^1\Omega^3(M)$. Let $\eta \in H^1\Omega^3_N(M)$; then there exists $f \in W^{1,2}(M)$ such that
\[ \eta|_{\mathring{M}} = f|_{\mathring{M}}\, dx^1 \wedge dx^2 \wedge dx^3. \]
As $n\eta = \eta = 0$, it follows that $f|_{\partial M} = 0$ in trace sense. Hence (Theorem 4.7.1, p. 330, [36]), $f$ can be approximated in the $W^{1,2}(M)$ norm by functions in $C_0^\infty(\mathring{M})$, and then $\eta$ can be approximated in the $H^1\Omega^3(M)$ norm by forms in $\Omega^3(\mathring{M})$ with compact support. Whence, it follows from (3.23) that
\[ \langle B_M, \delta\eta \rangle = 0, \quad \forall\, \delta\eta \in \mathcal{C}^2(M). \tag{3.24} \]
By Corollary 2.4.9, p. 87 [35]
\[ B_M = d\alpha + \delta\beta + d\epsilon + \gamma \in \mathcal{E}^2(M) \oplus \mathcal{C}^2(M) \oplus L^2\mathcal{H}^2_{ext}(M) \oplus \mathcal{H}^2_N(M), \tag{3.25} \]
where (Definition 2.4.1, p. 80 [35]) $\mathcal{E}^k(M) := \{d\alpha \mid \alpha \in H^1\Omega^{k-1}_D(M)\}$ and (Definition 2.2.1, p. 67 [35]) $H^1\Omega^k_D(M) := \{\eta \in H^1\Omega^k(M) \mid t\eta = 0\}$. Furthermore (p. 86 [35]), $\mathcal{H}^k_{ext}(M) := \{\eta \in \mathcal{H}^k(M) \mid \eta = d\epsilon\}$, and (Definition 2.2.1, p. 67 [35]) $\mathcal{H}^k(M) := \{\eta \in H^1\Omega^k(M) \mid d\eta = 0,\ \delta\eta = 0\}$ are the harmonic fields, and $\mathcal{H}^k_N(M) := \mathcal{H}^k(M) \cap H^1\Omega^k_N(M)$. Note that Theorem 2.2.7, p. 72 [35] implies that $\mathcal{H}^2_N(M)$ consists of $C^\infty$ forms. Furthermore by Lemma 2.4.11, p. 90 [35] we can choose $\alpha \in W^{1,p}\Omega^1_D(M)$, and by
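Theorem 2.4.8, p. 86 and Theorems 2.2.6 and 2.2.7, p. 72 [35] $\epsilon \in W^{1,p}\Omega^1_N(M)$. Moreover, the decomposition (3.25) is orthogonal in $L^2\Omega^2(M)$, and then by (3.24) $\delta\beta = 0$.
Let $R$ be the regularization operator in $\Lambda_r = \mathring{M}$. Then, as in the proof of Lemma 2.7 we prove that
\[ \int_{\partial K_{j,\varepsilon}} RB = 0. \]
Hence,
\[ 0 = \int_{\partial K_{j,\varepsilon}} RB = \int_{\partial K_{j,\varepsilon}} d(R\alpha + R\epsilon) + \int_{\partial K_{j,\varepsilon}} R\gamma = \int_{\partial K_{j,\varepsilon}} R\gamma. \]
Then, $\int_{\partial K_{j,\varepsilon}} R\gamma = 0$, $j \in \{1, 2, \dots, L\}$, and when the parameter of the regularization tends to zero we obtain $\int_{\partial K_{j,\varepsilon}} \gamma = 0$, $j \in \{1, 2, \dots, L\}$. As $\gamma$ is harmonic it is closed and it follows from Stokes' theorem that
\[ \int_{\partial K_j} \gamma = 0, \quad j \in \{1, 2, \dots, L\}. \]
Then, by Lemma 2.7
\[ \int_{S_j} \gamma = 0, \quad j \in \{1, 2, \dots, L\}. \]
By de Rham's Theorem $\gamma|_{\mathring{M}} = d\lambda$, $\lambda \in \Omega^1(\mathring{M})$. Denote $M_\varepsilon := \{x \in M : \operatorname{dist}(x, \partial M) \ge \varepsilon\}$. Let $\gamma_\varepsilon$ be the restriction of $\gamma$ to $M_\varepsilon$. Then $\gamma_\varepsilon$ is exact and by Lemma 3.2.1, p. 119 [35], and its proof, $\gamma_\varepsilon = d\omega_\varepsilon$ with $\omega_\varepsilon \in H^1\Omega^1(M_\varepsilon)$ and
\[ \|\omega_\varepsilon\|_{H^1\Omega^1(M_\varepsilon)} \le C\|\gamma_\varepsilon\|_{L^2\Omega^2(M_\varepsilon)} \le C\|\gamma\|_{L^2\Omega^2(M)}, \]
where the constant $C$ can be taken independent of $\varepsilon$ for $0 < \varepsilon < \varepsilon_0$ for $\varepsilon_0$ small enough. Let us denote by $\Lambda^k(M)$, $\Lambda^k(M_\varepsilon)$, respectively, the exterior $k$-form bundle of $M$, $M_\varepsilon$ (see Definition 1.3.8 in p. 39 of [35]). For any vector bundle, $F$, over a manifold $N$ we denote by $\Gamma(F)$ the space of all smooth sections of $F$ (see Definition 1.1.9, p. 17 of [35]). Note that the norm, $C_1$, of the trace operator (Theorem 1.3.7, p. 38 [35]) from $H^1\Gamma(\Lambda^k(M_\varepsilon))$ into $L^2\Gamma(\Lambda^k(M_\varepsilon)|_{\partial M_\varepsilon})$ can be taken independent of $\varepsilon$ for $0 < \varepsilon < \varepsilon_0$. By Green's formula and as $\delta\gamma_\varepsilon = 0$,
\[ \langle \gamma_\varepsilon, \gamma_\varepsilon \rangle = \langle d\omega_\varepsilon, \gamma_\varepsilon \rangle = \int_{\partial M_\varepsilon} t\omega_\varepsilon \wedge * n\gamma_\varepsilon. \tag{3.26} \]
But as
\[ \|t\omega_\varepsilon\|_{L^2\Gamma(\Lambda^1(M_\varepsilon)|_{\partial M_\varepsilon})} \le C_1\|\omega_\varepsilon\|_{H^1\Omega^1(M_\varepsilon)} \le C_1 C\|\gamma_\varepsilon\|_{L^2\Omega^2(M_\varepsilon)} \le C_1 C\|\gamma\|_{L^2\Omega^2(M)}, \]
and
\[ \lim_{\varepsilon \to 0} \|n\gamma_\varepsilon\|_{L^2\Gamma(\Lambda^2(M_\varepsilon)|_{\partial M_\varepsilon})} = 0, \]
it follows from (3.26) and the Schwarz inequality that
\[ \|\gamma\|^2_{L^2\Omega^2(M)} = \lim_{\varepsilon \to 0} \|\gamma_\varepsilon\|^2_{L^2\Omega^2(M)} = 0. \]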
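Then $\gamma = 0$ and we have that
\[ B_M = dA_M, \tag{3.27} \]
where $A_M := \alpha + \epsilon \in W^{1,p}\Omega^1(\mathring{M})$. It follows from Theorem 4.2.2, p. 311 [36] that there is $\overline{A}_M \in W^{1,p}\Omega^1(B_r^{\mathbb{R}^3}(0))$ such that $\overline{A}_M|_M = A_M$. We define
\[ \overline{B}(x) = \begin{cases} d\overline{A}_M(x), & \text{if } x \in B_r^{\mathbb{R}^3}(0), \\ B(x), & \text{if } x \in \mathbb{R}^3 \setminus B_r^{\mathbb{R}^3}(0). \end{cases} \tag{3.28} \]
Hence, $\overline{B}$ is the required extension.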
Recall that the functions $\{\hat\gamma_j\}_{j=1}^{m}$ were defined in (2.6). We introduce now a function that gives the magnetic flux across surfaces that have $\{\hat\gamma_j\}_{j=1}^{m}$ as their boundaries.

Definition 3.3. The flux, $\Phi$, is a function $\Phi: \{\hat\gamma_j\}_{j=1}^{m} \to \mathbb{R}$.

We now define a class of magnetic potentials with a given flux.

Definition 3.4. Let $B \in L^p\Omega^2(\overline{\Lambda})$, $p > 3$, be a closed 2-form that is continuous in a neighborhood of $\partial K$, where $K$ is as in Assumption 2.1. Assume, furthermore, that (3.22) holds. We denote by $\mathcal{A}_\Phi(B)$ the set of all continuous 1-forms in $\overline{\Lambda}$ that satisfy:
1.
\[ |A(x)| \le C\frac{1}{1+|x|}, \quad a(r) := \max_{x \in \overline{\Lambda},\, |x| \ge r}\{|A(x) \cdot \hat x|\} \in L^1(0, \infty). \tag{3.29} \]
2.
\[ \int_{\hat\gamma_j} A = \Phi(\hat\gamma_j), \quad j \in \{1, 2, \dots, m\}. \tag{3.30} \]
3.
\[ dA|_\Lambda = B|_\Lambda. \tag{3.31} \]

The definition of the flux depends on the particular choice of the curves $\{\hat\gamma_j\}_{j=1}^{m}$. However, the class $\mathcal{A}_\Phi(B)$ is independent of this particular choice, as we prove below. Recall that by Corollary 2.4 $\beta := \{[\hat\gamma_j]_{H_1(\Lambda;\mathbb{R})}\}_{j=1}^{m}$ is a basis of $H_1(\Lambda;\mathbb{R})$. Let $\beta' := \{[C_j]_{H_1(\Lambda;\mathbb{R})}\}_{j=1}^{m}$ be another basis of $H_1(\Lambda;\mathbb{R})$. We define $\Phi_{\beta'}: \{C_j\}_{j=1}^{m} \to \mathbb{R}$ as follows. As $\beta$ is a basis of $H_1(\Lambda;\mathbb{R})$ there are real numbers $b^i_j$ and chains $\sigma_j$ such that
\[ C_j = \sum_{i=1}^{m} b^i_j \hat\gamma_i + \partial\sigma_j. \tag{3.32} \]
We define
\[ \Phi_{\beta'}(C_j) := \sum_{i=1}^{m} b^i_j \Phi(\hat\gamma_i) + \int_{\sigma_j} B. \tag{3.33} \]
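We denote by $\mathcal{A}_{\Phi_{\beta'}}(B)$ the set of continuous 1-forms $A$ in $\overline{\Lambda}$ that satisfy 1 and 3 of Definition 3.4 and moreover,
\[ \int_{C_j} A = \Phi_{\beta'}(C_j), \quad j = 1, 2, \dots, m. \]
Proposition 3.5. $\mathcal{A}_{\Phi_{\beta'}}(B) = \mathcal{A}_\Phi(B)$.
Proof. Let $A \in \mathcal{A}_\Phi(B)$. Then, by (3.32)
\[ \int_{C_j} A = \sum_{i=1}^{m} b^i_j \int_{\hat\gamma_i} A + \int_{\sigma_j} dA = \Phi_{\beta'}(C_j), \quad j = 1, 2, \dots, m, \]
and it follows that $A \in \mathcal{A}_{\Phi_{\beta'}}(B)$. Suppose now that $A \in \mathcal{A}_{\Phi_{\beta'}}(B)$. As $\beta$ and $\beta'$ are bases, the numbers $b^i_j$, $i, j = 1, 2, \dots, m$, determine an invertible matrix. We denote by $\tilde b^j_i$ the entries of the inverse matrix. Hence,
\[ \hat\gamma_i = \sum_{j,s=1}^{m} \tilde b^j_i b^s_j \hat\gamma_s = \sum_{j=1}^{m} \tilde b^j_i (C_j - \partial\sigma_j), \]
and then by (3.33),
\[ \int_{\hat\gamma_i} A = \sum_{j=1}^{m} \tilde b^j_i \left( \Phi_{\beta'}(C_j) - \int_{\sigma_j} B \right) = \Phi(\hat\gamma_i). \]
This implies that $A \in \mathcal{A}_\Phi(B)$.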
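The change of basis in (3.32), (3.33) and its inversion in the proof of Proposition 3.5 amount to a small linear-algebra computation. The sketch below is only an illustration under assumed data: `b[i][j]` plays the role of $b^i_j$, `phi[i]` of $\Phi(\hat\gamma_i)$, and `surf[j]` of $\int_{\sigma_j} B$; all numbers are hypothetical.

```python
def flux_in_new_basis(b, phi, surf):
    """Phi_{beta'}(C_j) = sum_i b[i][j] * phi[i] + surf[j], as in (3.33)."""
    m = len(phi)
    return [sum(b[i][j] * phi[i] for i in range(m)) + surf[j] for j in range(m)]

def flux_in_old_basis(b, phi_new, surf):
    """Recover phi[i] from phi_new[j] by solving sum_i b[i][j] phi[i] = phi_new[j] - surf[j].

    Uses Gauss-Jordan elimination with partial pivoting; this is equivalent to
    applying the inverse matrix used in the proof of Proposition 3.5.
    """
    m = len(phi_new)
    # Row j of the augmented system: coefficients b[i][j] (unknown index i) and right-hand side.
    aug = [[float(b[i][j]) for i in range(m)] + [phi_new[j] - surf[j]] for j in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-14:
            raise ValueError("matrix b is not invertible")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

# Hypothetical data for m = 2.
b = [[2.0, 1.0], [1.0, 1.0]]
phi = [0.3, -1.2]
surf = [0.5, 0.25]
phi_new = flux_in_new_basis(b, phi, surf)
phi_back = flux_in_old_basis(b, phi_new, surf)
```

With these assumed values the round trip returns the original fluxes, which mirrors the statement that the class of potentials does not depend on the choice of basis.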
By Stokes' theorem the circulation $\int_{\hat\gamma_j} A$ of a potential $A \in \mathcal{A}_\Phi(B)$ represents the flux of the magnetic field $B$ in any surface whose boundary is $\hat\gamma_j$, $j = 1, 2, \dots, m$. As the magnetic field is a priori known outside the obstacle, it is natural to specify the magnetic potentials by fixing fluxes of the magnetic field in surfaces inside the obstacle. This is accomplished by fixing the circulations $\int_{\tilde\gamma_j} A$ instead of the circulations $\int_{\hat\gamma_j} A$, as we prove below. Recall that $\tilde\gamma_j$ is defined in (2.5). With $\varepsilon$ as in (2.6) we define,
\[ S_j := \left\{ \tilde\gamma_j(t) + s\frac{\varepsilon}{2} N(\tilde\gamma_j(t)) \,\middle|\, t, s \in [0, 1] \right\}. \]
We give $S_j$ the structure of an oriented surface with boundary $\hat\gamma_j - \tilde\gamma_j$. By Stokes' theorem and regularizing we prove that
\[ \int_{\tilde\gamma_j} A = \int_{\hat\gamma_j} A - \int_{S_j} B. \]
We define the fluxes $\tilde\Phi: \{\tilde\gamma_j\}_{j=1}^{m} \to \mathbb{R}$ accordingly,
\[ \tilde\Phi(\tilde\gamma_j) = \Phi(\hat\gamma_j) - \int_{S_j} B. \]
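We denote by $\tilde{\mathcal{A}}_{\tilde\Phi}(B)$ the set of continuous 1-forms, $A$, in $\overline{\Lambda}$ that satisfy 1 and 3 of Definition 3.4 and moreover,
\[ \int_{\tilde\gamma_j} A = \tilde\Phi(\tilde\gamma_j), \quad j = 1, 2, \dots, m. \]
Proposition 3.6. $\tilde{\mathcal{A}}_{\tilde\Phi}(B) = \mathcal{A}_\Phi(B)$.
Proof. Let $A \in \mathcal{A}_\Phi(B)$. By Stokes' theorem and regularizing,
\[ \int_{\tilde\gamma_j} A = \int_{\hat\gamma_j} A - \int_{S_j} B = \tilde\Phi(\tilde\gamma_j). \]
Then, $A \in \tilde{\mathcal{A}}_{\tilde\Phi}(B)$. We prove in the same way that $A \in \tilde{\mathcal{A}}_{\tilde\Phi}(B) \Rightarrow A \in \mathcal{A}_\Phi(B)$.

Note that for 1-forms $A = \sum_{i=1}^{3} A_i\, dx^i$, $\delta A = -\sum_{i=1}^{3} \frac{\partial}{\partial x_i} A_i = -\operatorname{div} A$ [35]. We use the definition of divergence of a vector field, $A$, as is usual in vector calculus. The definition given in [35] differs from ours by a minus sign.

Theorem 3.7. (Coulomb Potential). Let $B \in L^p\Omega^2(\overline{\Lambda})$, $p > 3$, be a closed 2-form that is continuous in a neighborhood of $\partial K$, where $K$ is as in Assumption 2.1. Assume, furthermore, that (3.22) holds and that for some $r$ with $K \subset B_r^{\mathbb{R}^3}(0)$,
\[ |B(x)| \le C(1+|x|)^{-\mu}, \quad |x| \ge r, \quad \mu > 2. \tag{3.34} \]
Then, for any flux, $\Phi$, there is a potential $A_C \in \mathcal{A}_\Phi(B)$ such that $A_C = A_{(C,1)} + A_{(C,2)}$, where $A_{(C,1)}$ is continuous on $\overline{\Lambda}$, $A_{(C,2)}$ is $C^\infty$ on $\overline{\Lambda}$, and $\delta A_{(C,j)} = -\operatorname{div} A_{(C,j)} = 0$, $j = 1, 2$. Furthermore,
\[ |A_{(C,1)}(x)| \le C(1+|x|)^{-\min(2-\varepsilon,\, \mu-1)}, \quad \forall \varepsilon > 0, \tag{3.35} \]
\[ |A_{(C,2)}(x)| \le C(1+|x|)^{-2}. \tag{3.36} \]
Proof. Let $\overline{B}$ be the extension to $\mathbb{R}^3$ of $B$ given by Theorem 3.2. By Proposition 2.6 of [22], and its proof, we can take as $A_{(C,1)}$ the Coulomb gauge of $\overline{B}$,
\[ A_{(C,1)} := -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{x - y}{|x - y|^3} \times \overline{B}(y)\, dy, \tag{3.37} \]
where we use the notation of vector calculus. We define $A_{(C,2)}$ as follows:
\[ A_{(C,2)} := \sum_{j=1}^{m} \left( \Phi(\hat\gamma_j) - \int_{\hat\gamma_j} A_{(C,1)} \right) G^{(j)}, \tag{3.38} \]
where $G^{(j)}$, $j = 1, 2, \dots, m$, are defined in (2.12) and we used (2.13). Clearly, $G^{(j)} \in C^\infty(\overline{\Lambda})$ and $|G^{(j)}(x)| \le C(1+|x|)^{-2}$.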
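Note that in $\mathbb{R}^3$ $A_C$ is the Coulomb potential that corresponds to the magnetic field
\[ \overline{B} + \sum_{j=1}^{m} \left( \Phi(\hat\gamma_j) - \int_{\hat\gamma_j} A_{(C,1)} \right) \delta(x - \gamma_j)\, d\vec{\gamma}_j, \]
with
\[ \langle \delta(x - \gamma_j)\, d\vec{\gamma}_j, \phi \rangle := \int_{\gamma_j} \phi\, d\vec{\gamma}_j. \]
The div-curl problem in exterior domains in the case of $C^1$ vector fields with Hölder continuous first derivatives was considered in [39].

Lemma 3.8. (Gauge Transformations). Suppose that $A, \tilde A \in \mathcal{A}_\Phi(B)$. Then, there is a $C^1$ 0-form $\lambda$ in $\overline{\Lambda}$ such that $\tilde A - A = d\lambda$. Moreover, we can take $\lambda(x) := \int_{C(x_0,x)} (\tilde A - A)$, where $x_0$ is any fixed point in $\Lambda$ and $C(x_0, x)$ is any curve from $x_0$ to $x$. Furthermore, $\lambda_\infty(x) := \lim_{r \to \infty} \lambda(rx)$ exists and it is continuous in $\mathbb{R}^3 \setminus \{0\}$ and homogeneous of order zero, i.e. $\lambda_\infty(rx) = \lambda_\infty(x)$, $r > 0$, $x \in \mathbb{R}^3 \setminus \{0\}$. Moreover,
\[ |\lambda_\infty(x) - \lambda(x)| \le \int_{|x|}^{\infty} b(r)\, dr, \text{ for some } b(r) \in L^1(0, \infty), \text{ and } |\lambda_\infty(x + y) - \lambda_\infty(x)| \le C|y|, \ \forall x: |x| = 1, \text{ and } \forall y: |y| < 1/2. \tag{3.39} \]
Proof. The existence of $\lambda$ follows from Proposition 2.5. The existence of $\lambda_\infty$ and the first equation in (3.39) follow from Condition 1 in Definition 3.4. The homogeneity follows from the definition. Denote $G := \tilde A - A$. Take $m > 1$ such that $K \subset B_{m/2}^{\mathbb{R}^3}(0)$. Suppose that $|x| = 1$ and that $|y| < 1/2$. Denote $x' := mx$, $y' := m\frac{x+y}{|x+y|} - x'$. Then, $\lambda_\infty(x) = \lambda_\infty(x')$, $\lambda_\infty(x + y) = \lambda_\infty(x' + y')$. Hence,
\[ \lambda_\infty(x + y) - \lambda_\infty(x) = \lambda_\infty(x' + y') - \lambda_\infty(x') = \int_{x'}^{x'+y'} G + \int_{x'+y'}^{\infty} G - \int_{x'}^{\infty} G = \lim_{r \to \infty} \int_{rmx}^{rm(x+y)/|x+y|} G, \]
where we used Stokes' theorem and $dG = 0$. Then,
\[ |\lambda_\infty(x' + y') - \lambda_\infty(x')| \le \lim_{r \to \infty} \int_{rmx}^{rm(x+y)/|x+y|} |G| \le C|y|. \]
This proves (3.39).

We now consider potentials that satisfy the flux condition modulo $2\pi$.

Definition 3.9. Let $B$ be as in Definition 3.4. We denote by $\mathcal{A}_{\Phi,2\pi}(B)$ the set of all continuous 1-forms in $\overline{\Lambda}$ that satisfy 1 and 3 of Definition 3.4 and moreover,
\[ \int_{\hat\gamma_j} A = \Phi(\hat\gamma_j) + 2\pi n_j(A), \quad n_j(A) \in \mathbb{Z}, \quad j \in \{1, 2, \dots, m\}. \]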
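Given $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we define,
\[ A_\Phi := A - \sum_{j=1}^{m} 2\pi n_j(A)\, G^{(j)}. \tag{3.40} \]
By (2.13) $A_\Phi \in \mathcal{A}_\Phi(B)$. Suppose that $A, \tilde A \in \mathcal{A}_{\Phi,2\pi}(B)$. Then, $A_\Phi, \tilde A_\Phi \in \mathcal{A}_\Phi(B)$, and by Lemma 3.8,
\[ \tilde A_\Phi - A_\Phi = d\lambda, \tag{3.41} \]
and it follows that
\[ \tilde A - A = d\lambda + A_{\mathbb{Z}}, \tag{3.42} \]
where
\[ A_{\mathbb{Z}} := \sum_{j=1}^{m} 2\pi (n_j(\tilde A) - n_j(A))\, G^{(j)}. \tag{3.43} \]
Let $C$ be any closed curve in $\Lambda$. Then, by Proposition 10.1 in Appendix B,
\[ C = \sum_{j=1}^{m} n_j(C)\hat\gamma_j + \partial\sigma, \quad n_j(C) \in \mathbb{Z}. \]
Hence,
\[ \int_C (\tilde A - A) = 2\pi N, \text{ for some } N \in \mathbb{Z}. \tag{3.44} \]
Whence, we can define the non-integrable factors [44],
\[ U_{\tilde A, A}(x) := e^{i\int_{C(x_0,x)} (\tilde A - A)} = e^{i\left(\lambda(x) + \int_{C(x_0,x)} A_{\mathbb{Z}}\right)}, \tag{3.45} \]
where $x_0$ is any fixed point in $\Lambda$ and $C(x_0, x)$ is any curve in $\Lambda$ from $x_0$ to $x$. Clearly, $U_{\tilde A, A} \in C^1(\Lambda)$ and can be extended to a continuous function defined in $\overline{\Lambda}$ that we denote with the same symbol. Moreover, if $\tilde A, A \in \mathcal{A}_\Phi(B)$ we have that $A_{\mathbb{Z}} = 0$, and then
\[ U_{\tilde A, A}(x) = e^{i\lambda(x)}, \quad \tilde A, A \in \mathcal{A}_\Phi(B). \tag{3.46} \]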
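The reason (3.45) is well defined is (3.44): two curves from $x_0$ to $x$ differ by a closed curve, whose circulation is an integer multiple of $2\pi$. A minimal numerical sketch of this mechanism follows. It uses a planar vortex potential $n(-y, x)/(x^2+y^2)$ as a hypothetical stand-in for $A_{\mathbb{Z}}$ (it is not the $G^{(j)}$ of the paper), with circulation $2\pi n$ around the origin, and compares the phase factor along two different paths.

```python
import cmath
import math

def vortex_potential(n):
    """Planar potential n * (-y, x) / (x^2 + y^2); circulation 2*pi*n around the origin.
    Hypothetical stand-in for A_Z, chosen only for illustration."""
    def A(x, y):
        r2 = x * x + y * y
        return (-n * y / r2, n * x / r2)
    return A

def line_integral(A, pts):
    """Midpoint-rule approximation of the circulation of A along a polyline."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        ax, ay = A(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += ax * (x1 - x0) + ay * (y1 - y0)
    return total

def arc(theta0, theta1, steps=2000):
    """Points on the unit circle from angle theta0 to angle theta1."""
    return [(math.cos(theta0 + (theta1 - theta0) * k / steps),
             math.sin(theta0 + (theta1 - theta0) * k / steps)) for k in range(steps + 1)]

def phase_factor(A, pts):
    """exp(i * integral of A along the path), the analogue of (3.45)."""
    return cmath.exp(1j * line_integral(A, pts))

upper = arc(0.0, math.pi)    # from (1, 0) to (-1, 0) through the upper half plane
lower = arc(0.0, -math.pi)   # same endpoints through the lower half plane
integer_gap = abs(phase_factor(vortex_potential(2), upper) - phase_factor(vortex_potential(2), lower))
half_gap = abs(phase_factor(vortex_potential(0.5), upper) - phase_factor(vortex_potential(0.5), lower))
```

For the integer value the two phase factors agree up to quadrature error, while for the non-integer value `0.5` they differ, so the factor would then depend on the path.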
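Lemma 3.10. Suppose that $\tilde A, A \in \mathcal{A}_{\Phi,2\pi}(B)$. Then, for $x \neq 0$,
\[ \lim_{r \to \infty} U_{\tilde A, A}(rx) = e^{i(\lambda_\infty(x) + C_{\tilde A, A})}, \tag{3.47} \]
with $\lambda_\infty(x) := \lim_{r \to \infty} \lambda(rx)$ given by Lemma 3.8 with $\lambda$ as in (3.41), and where $C_{\tilde A, A}$ is a real number that is independent of $x$. Furthermore,
\[ \left| U_{\tilde A, A}(x) - e^{i(\lambda_\infty(x) + C_{\tilde A, A})} \right| \le \int_{|x|}^{\infty} c(r)\, dr, \text{ for some } c(r) \in L^1(0, \infty). \tag{3.48} \]
Moreover, if $\tilde A, A \in \mathcal{A}_\Phi(B)$ we have that $C_{\tilde A, A} = 0$.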
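Proof. Let $r_0$ be such that $K \subset B_{r_0}^{\mathbb{R}^3}(0)$. Take in (3.45) any curve from $x_0$ to $r_0\hat x$ and then the straight line from $r_0\hat x$ to $r\hat x$ with $r_0 \le r < \infty$. By (2.12, 3.29, 3.39, 3.42)
\[ \lim_{r \to \infty} U_{\tilde A, A}(rx) = e^{i\lambda_\infty(x)} \lim_{r \to \infty} e^{i\int_{C(x_0, r\hat x)} A_{\mathbb{Z}}} \]
and
\[ \left| U_{\tilde A, A}(x) - e^{i\lambda_\infty(x)} \lim_{r \to \infty} e^{i\int_{C(x_0, r\hat x)} A_{\mathbb{Z}}} \right| \le \int_{|x|}^{\infty} c(r)\, dr, \text{ for some } c(r) \in L^1(0, \infty). \]
For any $y \neq 0$, $\hat y \neq \pm\hat x$, let $C(r\hat x, r\hat y)$ be the straight line from $r\hat x$ to $r\hat y$. Then,
\[ \lim_{r \to \infty} e^{i\left(\int_{C(x_0, r\hat x)} A_{\mathbb{Z}} - \int_{C(x_0, r\hat y)} A_{\mathbb{Z}}\right)} = \lim_{r \to \infty} e^{-i\int_{C(r\hat x, r\hat y)} A_{\mathbb{Z}}} = 1, \]
and it follows that,
\[ \lim_{r \to \infty} e^{i\int_{C(x_0, r\hat x)} A_{\mathbb{Z}}} = \lim_{r \to \infty} e^{i\int_{C(x_0, r\hat y)} A_{\mathbb{Z}}} = e^{iC_{\tilde A, A}} \]
for some $C_{\tilde A, A} \in \mathbb{R}$ that is independent of $x$. If $\tilde A, A \in \mathcal{A}_\Phi(B)$, $n_j(\tilde A) = n_j(A) = 0$, $j = 1, 2, \dots, m$, and hence, $A_{\mathbb{Z}} = 0$, which implies that $C_{\tilde A, A} = 0$.

4. The Hamiltonian

Let us denote $p := -i\nabla$. The Schrödinger equation for an electron in $\Lambda$ with electric potential $\mathbf{V}$ and magnetic field $\mathbf{B}$ is given by
\[ i\hbar\frac{\partial}{\partial t}\phi = \frac{1}{2M}\left(P - \frac{q}{c}\mathbf{A}\right)^2\phi + q\mathbf{V}\phi, \tag{4.1} \]
where $\hbar$ is Planck's constant, $P := \hbar p$ is the momentum operator, $c$ is the speed of light, $M$ and $q$ are, respectively, the mass and the charge of the electron and $\mathbf{A}$ is a magnetic potential with $\operatorname{curl}\mathbf{A} = \mathbf{B}$. To simplify the notation we multiply both sides of (4.1) by $\frac{1}{\hbar}$ and we write Schrödinger's equation as follows:
\[ i\frac{\partial}{\partial t}\phi = \frac{1}{2m}(p - A)^2\phi + V\phi, \tag{4.2} \]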
with $m := M/\hbar$, $A = \frac{q}{\hbar c}\mathbf{A}$ and $V := \frac{q}{\hbar}\mathbf{V}$. Note that since we write Schrödinger's equation in this form our Hamiltonians below are the physical Hamiltonians divided by $\hbar$. We fix the flux modulo $2\pi$ by taking $A \in \mathcal{A}_{\Phi,2\pi}(B)$, where $B := \frac{q}{\hbar c}\mathbf{B}$. Note that this corresponds to fixing the circulations of $\mathbf{A}$ modulo $\frac{\hbar c}{q}2\pi$, or equivalently, to fixing the fluxes of the magnetic field $\mathbf{B}$ modulo $\frac{\hbar c}{q}2\pi$. For any open set, $O$, we denote by $H_s(O)$, $s = 1, 2, \dots$, the Sobolev spaces [1] and by $H_{s,0}(O)$ the closure of $C_0^\infty(O)$ in the norm of $H_s(O)$. We define the quadratic form,
\[ h_0(\phi, \psi) := \frac{1}{2m}(p\phi, p\psi), \quad D(h_0) := H_{1,0}(\Lambda). \tag{4.3} \]
The associated positive operator in $L^2(\Lambda)$ [23,31] is $-\frac{1}{2m}\Delta_D$, where $\Delta_D$ is the Laplacian with Dirichlet boundary condition on $\partial\Lambda$. We define $H(0, 0) := -\frac{1}{2m}\Delta_D$. By elliptic regularity [2], $D(H(0, 0)) = H_2(\Lambda) \cap H_{1,0}(\Lambda)$.
For any $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we define,
\[ h_A(\phi, \psi) := \frac{1}{2m}((p - A)\phi, (p - A)\psi) = h_0(\phi, \psi) + \frac{1}{2m}\left(-(p\phi, A\psi) - (A\phi, p\psi)\right) + \frac{1}{2m}(A\phi, A\psi), \quad D(h_A) = H_{1,0}(\Lambda). \tag{4.4} \]
As the quadratic form $-\frac{1}{2m}((p\phi, A\psi) + (A\phi, p\psi)) + \frac{1}{2m}(A\phi, A\psi)$ is $h_0$-bounded with relative bound zero, $h_A$ is closed and positive. We denote by $H(A, 0)$ the associated positive self-adjoint operator [23,31]. $H(A, 0)$ is the Hamiltonian with magnetic potential $A$. Note that as the operator $\frac{1}{2m}(-2A_C \cdot p + A_C^2)$ is $H(0, 0)$ compact we have that $H(0, 0) - \frac{1}{m}A_C \cdot p + \frac{1}{2m}A_C^2$ is self-adjoint on the domain of $H(0, 0)$, and then
\[ H(A_C, 0) = H(0, 0) - \frac{1}{m}A_C \cdot p + \frac{1}{2m}A_C^2, \quad D(H(A_C, 0)) = H_2(\Lambda) \cap H_{1,0}(\Lambda). \tag{4.5} \]
The electric potential $V$ is a measurable real-valued function defined on $\Lambda$. We assume that $|V|$ is $h_0$-bounded with relative bound zero. Under this condition [23,31] the quadratic form,
\[ h_{A,V}(\phi, \psi) := h_A(\phi, \psi) + (V\phi, \psi), \quad D(h_{A,V}) = H_{1,0}(\Lambda), \tag{4.6} \]
is closed and bounded from below. The associated operator, $H(A, V)$, is self-adjoint and bounded from below. $H(A, V)$ is the Hamiltonian with magnetic potential $A$ and electric potential $V$. If furthermore, $V$ is $-\Delta_D$ compact, the operator $H(0, 0) - \frac{1}{m}A_C \cdot p + \frac{1}{2m}A_C^2 + V$ is self-adjoint on the domain of $H(0, 0)$ and then,
\[ H(A_C, V) = H(0, 0) - \frac{1}{m}A_C \cdot p + \frac{1}{2m}A_C^2 + V, \quad D(H(A_C, V)) = H_2(\Lambda) \cap H_{1,0}(\Lambda). \tag{4.7} \]
We will denote by $U_{\tilde A, A}$ the operator of multiplication by $U_{\tilde A, A}(x)$. See (3.45). Note that $U_{\tilde A, A}$ is unitary in $L^2(\Lambda)$ and that $U^*_{\tilde A, A}$ is the operator of multiplication by $U_{A, \tilde A}(x)$.

Theorem 4.1. Suppose that $\tilde A, A \in \mathcal{A}_{\Phi,2\pi}(B)$. Then $H(\tilde A, V)$ and $H(A, V)$ are unitarily equivalent,
\[ H(\tilde A, V) = U_{\tilde A, A}\, H(A, V)\, U^*_{\tilde A, A}, \quad D(H(\tilde A, V)) = U_{\tilde A, A}\, D(H(A, V)). \]
Proof. As $U_{\tilde A, A}$ and $U^*_{\tilde A, A}$ are bijections on $H_{1,0}(\Lambda)$ we have that
\[ h_{\tilde A, V}(\phi, \psi) = h_{A,V}\left(U^*_{\tilde A, A}\phi, U^*_{\tilde A, A}\psi\right), \quad \phi, \psi \in H_{1,0}(\Lambda). \]
Suppose that $\phi \in D(H(\tilde A, V))$. Then, for every $\chi \in H_{1,0}(\Lambda)$,
\[ (U^*_{\tilde A, A} H(\tilde A, V)\phi, \chi) = h_{A,V}(U^*_{\tilde A, A}\phi, \chi). \]
This implies that $U^*_{\tilde A, A}\phi \in D(H(A, V))$ and that
\[ H(A, V)\, U^*_{\tilde A, A}\phi = U^*_{\tilde A, A}\, H(\tilde A, V)\phi, \tag{4.8} \]
which proves the theorem.
5. Scattering

In the following assumptions we summarize the conditions on the magnetic field and the electric potential that we use. We denote by $\Delta$ the self-adjoint realization of the Laplacian in $L^2(\mathbb{R}^3)$ with domain $H_2(\mathbb{R}^3)$. Below we assume that $V$ is $\Delta$-bounded with relative bound zero. By this we mean that the extension of $V$ to $\mathbb{R}^3$ by zero is $\Delta$-bounded with relative bound zero. Using an extension operator from $H_2(\Lambda)$ to $H_2(\mathbb{R}^3)$ [36] we prove that this is equivalent to requiring that $V$ is bounded from $H_2(\Lambda)$ into $L^2(\Lambda)$ with relative bound zero. We denote by $\|\cdot\|$ the operator norm in $L^2(\mathbb{R}^3)$.

Assumption 5.1. We assume that the magnetic field, $B$, is a real-valued, bounded 2-form in $\overline{\Lambda}$, that is continuous in a neighborhood of $\partial K$, where $K$ satisfies Assumption 2.1, and furthermore,
1. $B$ is closed: $dB|_\Lambda \equiv \operatorname{div} B = 0$.
2. There are no magnetic monopoles in $K$:
\[ \int_{\partial K_j} B = 0, \quad j \in \{1, 2, \dots, L\}. \tag{5.1} \]
3.
\[ |B(x)| \le C(1+|x|)^{-\mu}, \text{ for some } \mu > 2. \tag{5.2} \]
4. $d * B|_\Lambda \equiv \operatorname{curl} B$ is bounded and,
\[ |\operatorname{curl} B| \le C(1+|x|)^{-\mu}. \tag{5.3} \]
5. The electric potential, $V$, is a real-valued function, it is $\Delta$-bounded, and
\[ \|F(|x| \ge r)\, V (-\Delta + I)^{-1}\| \le C(1+r)^{-\alpha}, \text{ for some } \alpha > 1. \tag{5.4} \]

Note that (5.4) implies that $V$ is $h_0$-bounded with relative bound zero. Furthermore, condition (5.4) is equivalent to the following assumption [32]:
\[ \|V (-\Delta + I)^{-1} F(|x| \ge r)\| \le C(1+r)^{-\alpha}, \text{ for some } \alpha > 1. \tag{5.5} \]
Condition (5.4) has a clear intuitive meaning: it is a condition on the decay of $V$ at infinity. However, in the proofs below we use the equivalent statement (5.5). Let us define,
\[ H_0 := -\frac{1}{2m}\Delta, \quad D(H_0) = H_2(\mathbb{R}^3). \]
Let $J$ be the identification operator from $L^2(\mathbb{R}^3)$ onto $L^2(\Lambda)$ given by multiplication by the characteristic function of $\Lambda$. The wave operators are defined as follows:
\[ W_\pm(A, V) := \text{s-}\lim_{t \to \pm\infty} e^{itH(A,V)} J e^{-itH_0}, \tag{5.6} \]
provided that the strong limits exist. We first prove that they exist in the Coulomb gauge.

Proposition 5.2. Suppose that $B$ and $V$ satisfy Assumption 5.1. Then, the wave operators $W_\pm(A_C, V)$ exist and are isometric.
Proof. Let $\chi \in C^\infty(\mathbb{R}^3)$ satisfy $\chi(x) = 0$ in a neighborhood of $K$ and $\chi(x) = 1$ for $|x| \ge r_0$ with $r_0$ large enough. Then, since $(1 - \chi(x))(H_0 + I)^{-1}$ is compact,
\[ W_\pm(A_C, V) = \text{s-}\lim_{t \to \pm\infty} e^{itH(A_C,V)} \chi(x)\, e^{-itH_0}. \]
By Duhamel's formula, for $\phi \in D(H_0)$,
\[ W_\pm(A_C, V)\phi = \chi(x)\phi(x) + \int_0^{\pm\infty} i\, e^{itH(A_C,V)} [H(A_C, V)\chi(x) - \chi(x)H_0]\, e^{-itH_0}\phi(x)\, dt. \tag{5.7} \]
By Theorem 3.7 the proof that the integral in the right-hand side of (5.7) is absolutely convergent is standard. For example, it follows from Lemma 2.2 of [14] taking $\phi = e^{im\mathbf{v} \cdot x}\varphi$, with $\mathbf{v} \in \mathbb{R}^3$, $|\mathbf{v}| \ge 4\eta > 0$, and $\hat\varphi \in C_0^\infty(B_{m\eta}^{\mathbb{R}^3}(0))$, which is a dense set in $L^2(\mathbb{R}^3)$.

Lemma 5.3. (Gauge Transformations). Suppose that Assumption 5.1 is true. Then, for every $A \in \mathcal{A}_{\Phi,2\pi}(B)$ the wave operators $W_\pm(A, V)$ exist and are isometric. Moreover, if $\tilde A \in \mathcal{A}_{\Phi,2\pi}(B)$, then,
\[ W_\pm(\tilde A, V) = e^{-iC_{\tilde A, A}}\, U_{\tilde A, A}\, W_\pm(A, V)\, e^{-i\lambda_\infty(\pm p)}. \tag{5.8} \]
Proof. Since we already know that $W_\pm(A_C, V)$ exist and are isometric it is enough to prove the gauge transformation formula (5.8). We argue as in the proof of Lemma 2.3 of [40]. By (4.8),
\[ W_\pm(\tilde A, V) = U_{\tilde A, A}\ \text{s-}\lim_{t \to \pm\infty} e^{itH(A,V)}\, U_{A, \tilde A}\, J e^{-itH_0} = U_{\tilde A, A}\ \text{s-}\lim_{t \to \pm\infty} e^{itH(A,V)} J e^{-i(\lambda_\infty(x) + C_{\tilde A, A})} e^{-itH_0}, \]
where we used that by Lemma 3.10 and the Rellich selection theorem $U_{A, \tilde A} - e^{-i(\lambda_\infty(x) + C_{\tilde A, A})}$ is a compact operator from $D(H_0)$ into $L^2(\mathbb{R}^3)$. We finish the proof of the lemma as in the proof of Eq. (2.29) of [40], using the second equation in (3.39).

The scattering operator is defined as
\[ S(A, V) := W_+^*(A, V)\, W_-(A, V). \]
By (5.8)
\[ S(\tilde A, V) = e^{i\lambda_\infty(p)}\, S(A, V)\, e^{-i\lambda_\infty(-p)}, \quad \tilde A, A \in \mathcal{A}_{\Phi,2\pi}(B). \tag{5.9} \]
Definition 5.4. We say that $A \in \mathcal{A}_{\Phi,2\pi}(B)$ is short-range if
\[ |A(x)| \le C(1+|x|)^{-1-\varepsilon}, \text{ for some } \varepsilon > 0. \tag{5.10} \]
We denote the set of all short-range potentials in $\mathcal{A}_{\Phi,2\pi}(B)$ by $\mathcal{A}_{\Phi,2\pi,SR}(B)$.
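Note that if $\tilde A, A \in \mathcal{A}_{\Phi,2\pi}(B)$ and $\tilde A - A$ satisfies (5.10), $\lambda_\infty$ is constant, and then,
\[ S(\tilde A, V) = S(A, V), \quad \tilde A, A \in \mathcal{A}_{\Phi,2\pi}(B) \text{ and } \tilde A - A \text{ satisfies (5.10)}. \tag{5.11} \]
This implies that,
\[ S(A_\Phi, V) = S(A, V), \text{ for any } A \in \mathcal{A}_{\Phi,2\pi}(B), \tag{5.12} \]
where $A_\Phi$ is defined in (3.40). Remark that (5.11) holds if $\tilde A, A \in \mathcal{A}_{\Phi,2\pi,SR}(B)$. We quote below the following result of [40] that we will often use.

Lemma 5.5. For any $f \in C_0^\infty(B_{m\eta}^{\mathbb{R}^3}(0))$, $0 \le \rho < 1$, and for any $j = 1, 2, \dots$ there is a constant $C_j$ such that
\[ \left\| F\left(|x - \mathbf{v}t| > \frac{|\mathbf{v}t|}{4}\right) e^{-itH_0} f\left(\frac{p - m\mathbf{v}}{v^\rho}\right) F(|x| \le |\mathbf{v}t|/8) \right\| \le C_j (1+|\mathbf{v}t|)^{-j}, \tag{5.13} \]
for $v := |\mathbf{v}| > (8\eta)^{1/(1-\rho)}$.
Proof. Corollary 2.2 of [40].

5.1. High-velocity estimates I. The magnetic potential.

We denote,
\[ \Lambda_{\hat{\mathbf{v}}} := \{x \in \Lambda : x + \tau\hat{\mathbf{v}} \in \Lambda,\ \forall \tau \in \mathbb{R}\}, \text{ for } \mathbf{v} \neq 0, \tag{5.14} \]
\[ L_{A,\hat{\mathbf{v}}}(t) := \int_0^t \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau, \quad -\infty \le t \le \infty. \tag{5.15} \]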
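To make the line integral in (5.15) concrete, the sketch below evaluates $L_{A,\hat{\mathbf{v}}}(\infty) - L_{A,\hat{\mathbf{v}}}(-\infty) = \int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau$ numerically. The potential used is an idealized straight flux line along the $z$-axis, $A = \frac{\Phi}{2\pi}\frac{(-y, x, 0)}{x^2 + y^2}$; this is an assumption made only for illustration (an infinite flux line is not a compact obstacle in the sense of the paper), and the function names are hypothetical.

```python
import math

def flux_line_potential(flux):
    """Idealized potential of a flux line on the z-axis (assumed example, not from the paper)."""
    def A(x, y, z):
        c = flux / (2.0 * math.pi * (x * x + y * y))
        return (-c * y, c * x, 0.0)
    return A

def line_phase(A, x, vhat, n=4000):
    """Approximate the integral over the whole line of vhat . A(x + tau*vhat) d tau.

    The substitution tau = tan(theta), theta in (-pi/2, pi/2), maps the line to a
    bounded interval; a midpoint rule is then applied in theta.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        tau = math.tan(theta)
        point = [x[i] + tau * vhat[i] for i in range(3)]
        a = A(*point)
        # d tau = d theta / cos(theta)^2
        total += sum(a[i] * vhat[i] for i in range(3)) / math.cos(theta) ** 2 * h
    return total

A = flux_line_potential(1.7)
above = line_phase(A, (0.0, 2.0, 0.0), (1.0, 0.0, 0.0))   # line passing on one side of the flux
below = line_phase(A, (0.0, -2.0, 0.0), (1.0, 0.0, 0.0))  # line passing on the other side
```

In this idealized setting the two values are $\mp\Phi/2$, so their difference equals the enclosed flux; this is the kind of information carried by the phase $e^{iL_{A,\hat{\mathbf{v}}}}$.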
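Remark that under translation in configuration or momentum space generated, respectively, by $p$ and $x$ we obtain
\[ e^{ip \cdot \mathbf{v}t} f(x)\, e^{-ip \cdot \mathbf{v}t} = f(x + \mathbf{v}t), \tag{5.16} \]
\[ e^{-im\mathbf{v} \cdot x} f(p)\, e^{im\mathbf{v} \cdot x} = f(p + m\mathbf{v}), \tag{5.17} \]
and, in particular,
\[ e^{-im\mathbf{v} \cdot x}\, e^{-itH_0}\, e^{im\mathbf{v} \cdot x} = e^{-imv^2 t/2}\, e^{-ip \cdot \mathbf{v}t}\, e^{-itH_0}. \tag{5.18} \]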
The purpose of the obstacle $K$ is to shield the incoming electrons from the magnetic field inside the obstacle. In order to separate the scattering effect of the magnetic potential from that of the magnetic field inside the obstacle $K$, we consider asymptotic configurations that have negligible interaction with $K$ for all times in the high-velocity limit. For any non-zero $\mathbf{v} \in \mathbb{R}^3$ we take asymptotic configurations $\phi$ with compact support in $\Lambda_{\hat{\mathbf{v}}}$. The free evolution boosted by $\hat{\mathbf{v}}$ is given by (5.18) and, to a good approximation, in the limit when $v \to \infty$ with $\hat{\mathbf{v}}$ fixed this can be replaced (modulo an unimportant phase factor) by the classical translation $e^{-ip \cdot \mathbf{v}t}$. Then, in the high-velocity limit it is a good approximation to assume that the free evolution of our asymptotic configuration is given by $e^{-ip \cdot \mathbf{v}t}\phi_0 = \phi_0(x - \mathbf{v}t)$, and as $\phi_0$ has support in $\Lambda_{\hat{\mathbf{v}}}$, it has negligible interaction with $K$ for all times. Note that instead of boosting the observables we can boost the asymptotic configurations and consider the high-velocity asymptotic configurations $\phi_{\mathbf{v}} := e^{im\mathbf{v} \cdot x}\phi_0$.
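Lemma 5.6. Suppose that $B$, $V$ satisfy Assumption 5.1. Let $\Lambda_0$ be a compact subset of $\Lambda_{\hat{\mathbf{v}}}$, with $\mathbf{v} \in \mathbb{R}^3 \setminus \{0\}$. Then, for all $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi}(B)$ there is a constant $C$ such that
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, W_\pm(A, V)\, e^{im\mathbf{v} \cdot x} - e^{-iL_{A,\hat{\mathbf{v}}}(\pm\infty)} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}, \tag{5.19} \]
and if moreover, $\operatorname{div} A \in L^2_{loc}$,
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, W_\pm^*(A, V)\, e^{im\mathbf{v} \cdot x} - e^{iL_{A,\hat{\mathbf{v}}}(\pm\infty)} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}, \tag{5.20} \]
for all $\phi \in H_2(\mathbb{R}^3)$ with support $\phi \subset \Lambda_0$.
Proof. We follow the proof of Lemma 2.4 of [40]. We first give the proof in the case of the Coulomb potential $A_C$. We give the proof for $W_+(A_C, V)$. The proof for $W_-(A_C, V)$ follows in the same way. By Theorem 3.7, $A_C = A_{(C,1)} + A_{(C,2)}$, where $A_{(C,1)}$ is the Coulomb potential for the extension $\overline{B}$ of the magnetic field. Then, $A_{(C,1)}$ is actually defined in $\mathbb{R}^3$. We can extend $A_{(C,2)}|_{\overline{\Lambda}}$ as an $n$-times, $n = 1, 2, \dots$, continuously differentiable vector valued function defined in $\mathbb{R}^3$ (Theorem 4.2.2, p. 311 [36]). Consequently, we can extend $A_C$ to a continuous vector valued function defined in $\mathbb{R}^3$ such that $\operatorname{div} A_C$ is infinitely differentiable with support contained in the obstacle $K$. We denote also by $A_C$ this extension. Let $g \in C_0^\infty(\mathbb{R}^3)$ satisfy $g(p) = 1$, $|p| \le 1$, $g(p) = 0$, $|p| \ge 2$. Denote
\[ \tilde\phi := g(p/v^\rho)\,\phi, \quad \frac{1}{2} \le \rho < 1. \tag{5.21} \]
Then,
\[ \|\tilde\phi - \phi\|_{L^2(\mathbb{R}^3)} \le \frac{1}{v^{2\rho}}\|\phi\|_{H_2(\mathbb{R}^3)}. \tag{5.22} \]
Hence, it is enough to prove (5.19) for $\tilde\phi$. By our assumption there is a function $\chi \in C^\infty(\mathbb{R}^3)$ such that $\chi \equiv 0$ in a neighborhood of $K$ and $\chi(x) = 1$, $x \in \{x : x = y + \tau\hat{\mathbf{v}},\ y \in \text{support } \phi,\ \tau \in \mathbb{R}\} \cup \{x : |x| \ge M\}$ for some $M$ large enough. We use the following notation:
\[ H_1 := \frac{1}{v} e^{-im\mathbf{v} \cdot x} H_0\, e^{im\mathbf{v} \cdot x}, \quad H_2 := \frac{1}{v} e^{-im\mathbf{v} \cdot x} H(A_C, V)\, e^{im\mathbf{v} \cdot x}. \tag{5.23} \]
Note that
\[ \left( e^{-im\mathbf{v} \cdot x}\, W_+(A_C, V)\, e^{im\mathbf{v} \cdot x} - \chi(x) e^{-iL_{A_C,\hat{\mathbf{v}}}(\infty)} \right)\tilde\phi = \text{s-}\lim_{t \to \infty} \left[ e^{itH_2}\chi(x) e^{-itH_1} - \chi(x) e^{-iL_{A_C,\hat{\mathbf{v}}}(t)} \right]\tilde\phi. \tag{5.24} \]
Denote
\[ P(t, \tau) := e^{i\tau H_2}\, i \left[ H_2\, e^{-iL_{A_C,\hat{\mathbf{v}}}(t-\tau)}\chi(x) - e^{-iL_{A_C,\hat{\mathbf{v}}}(t-\tau)}\chi(x) \left( H_1 - \hat{\mathbf{v}} \cdot A_C(x + (t-\tau)\hat{\mathbf{v}}) \right) \right] e^{-i\tau H_1}\tilde\phi. \tag{5.25} \]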
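Then, by Duhamel's formula,
\[ \left[ e^{itH_2}\chi(x) e^{-itH_1} - \chi(x) e^{-iL_{A_C,\hat{\mathbf{v}}}(t)} \right]\tilde\phi = \int_0^t d\tau\, P(t, \tau). \tag{5.26} \]
We designate
\[ b(x, t) := A_C(x + t\hat{\mathbf{v}}) + \int_0^t (\hat{\mathbf{v}} \times B)(x + \tau\hat{\mathbf{v}})\, d\tau. \tag{5.27} \]
For $f: \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$ with $f_t(x) := f(x, t) \in L^1_{loc}(\mathbb{R}^3, \mathbb{R}^3)$ we define
\[ \Xi_f(x, t) := \frac{1}{2m}\chi(x) \left[ -p \cdot f(x, t) - f(x, t) \cdot p + (f(x, t))^2 \right]. \tag{5.28} \]
We have that [40]
\[ P(t, \tau) = T_1 + T_2 + T_3, \tag{5.29} \]
with
\[ T_1 := \frac{1}{v} e^{i\tau H_2}\, i\, e^{-iL_{A_C,\hat{\mathbf{v}}}(x,t-\tau)} \left( \Xi_b(x, t - \tau) + \chi V(x) \right) e^{-i\tau H_1}\tilde\phi, \tag{5.30} \]
\[ T_2 := \frac{1}{2mv} e^{i\tau H_2}\, i\, e^{-iL_{A_C,\hat{\mathbf{v}}}(x,t-\tau)} \left\{ -(\Delta\chi) + 2(p\chi) \cdot p - 2b(x, t - \tau) \cdot (p\chi) \right\} e^{-i\tau H_1}\tilde\phi, \tag{5.31} \]
\[ T_3 := e^{i\tau H_2}\, i\, e^{-iL_{A_C,\hat{\mathbf{v}}}(x,t-\tau)}\, (p\chi) \cdot \hat{\mathbf{v}}\, e^{-i\tau H_1}\tilde\phi. \tag{5.32} \]
Note that ([4], Eq. (2.18))
\[ \left\| \int_0^{t-\tau} d\nu\, (\hat{\mathbf{v}} \times B)(x + \nu\hat{\mathbf{v}})\, F(|x - \tau\hat{\mathbf{v}}| \le |\tau|/4) \right\|_{L^\infty(\mathbb{R}^3)} \le C\frac{1}{(1+|\tau|)^{\mu-1}}, \tag{5.33} \]
\[ \left\| \int_0^{t-\tau} d\nu\, (\nabla \cdot (\hat{\mathbf{v}} \times B))(x + \nu\hat{\mathbf{v}})\, F(|x - \tau\hat{\mathbf{v}}| \le |\tau|/4) \right\|_{L^\infty(\mathbb{R}^3)} = \left\| \int_0^{t-\tau} d\nu\, (\hat{\mathbf{v}} \cdot \operatorname{curl} B)(x + \nu\hat{\mathbf{v}})\, F(|x - \tau\hat{\mathbf{v}}| \le |\tau|/4) \right\|_{L^\infty(\mathbb{R}^3)} \le C\frac{1}{(1+|\tau|)^{\mu-1}}. \tag{5.34} \]
Using Theorem 3.7, Lemma 5.5, (5.2, 5.3, 5.5, 5.33, 5.34) we prove as in the proof of Lemma 2.4 of [40] that
\[ \|T_1(\tau)\|_{L^2(\mathbb{R}^3)} \le \frac{C}{v}\frac{1}{(1+|\tau|)^{\min(2-\varepsilon,\, \mu-1,\, \alpha)}}\|\phi\|_{H_2(\mathbb{R}^3)}, \tag{5.35} \]
\[ \|T_2(\tau)\|_{L^2(\mathbb{R}^3)} \le \frac{1}{v}\frac{C_j}{(1+|\tau|)^j}\|\phi\|_{H_2(\mathbb{R}^3)}, \quad j = 1, 2, \dots, \tag{5.36} \]
\[ \int_{-\infty}^{\infty} d\tau\, \|T_3(\tau)\|_{L^2(\mathbb{R}^3)} \le \frac{C}{v}\|\phi\|_{H_2(\mathbb{R}^3)}. \tag{5.37} \]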
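For the reader's convenience we estimate one of the terms. Denote by
\[ \eta(x, t) := \int_0^t (\hat{\mathbf{v}} \times B)(x + \tau\hat{\mathbf{v}})\, d\tau. \tag{5.38} \]
Then, by Lemma 5.5 and (5.33),
\[ \left\| \frac{1}{mv} e^{-iL_{A_C,\hat{\mathbf{v}}}(x,t-\tau)}\, \eta(x, t - \tau)\, e^{-i\tau H_1} \cdot p\,\tilde\phi \right\|_{L^2(\mathbb{R}^3)} \le \frac{C}{v} \Big[ \left\| \eta(x, t - \tau)\, F(|x - \tau\hat{\mathbf{v}}| > |\tau|/4)\, e^{-iH_0\tau/v}\, g\!\left(\frac{p - m\mathbf{v}}{v^\rho}\right) F(|x| \le |\tau|/8) \right\| \|\phi\|_{H_2(\mathbb{R}^3)} + \left\| \eta(x, t - \tau)\, F(|x - \tau\hat{\mathbf{v}}| \le |\tau|/4) \right\|_{L^\infty(\mathbb{R}^3)} \|\phi\|_{H_2(\mathbb{R}^3)} + \left\| F(|x| \ge |\tau|/8)\, p\,\tilde\phi \right\|_{L^2(\mathbb{R}^3)} \Big] \le \frac{C}{(1+|\tau|)^{\mu-1}}\|\phi\|_{H_2(\mathbb{R}^3)}. \]
By (5.26, 5.29, 5.35, 5.36, 5.37)
\[ \left\| \left[ e^{itH_2}\chi(x) e^{-itH_1} - \chi(x) e^{-iL_{A_C,\hat{\mathbf{v}}}(t)} \right]\tilde\phi \right\|_{L^2(\mathbb{R}^3)} \le \frac{C}{v}\|\phi\|_{H_2(\mathbb{R}^3)}. \tag{5.39} \]
By (5.24) this proves (5.19) for $A_C$. Given $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we define $A_\Phi$ as in (3.40). As $A_\Phi \in \mathcal{A}_\Phi(B)$, we prove that (5.19) holds for $A_\Phi$ as in the proof of Lemma 2.4 of [40] using the formulae for change of gauge (5.8). Then, we prove that it is true for $A$ using the gauge transformation formulae between $A$ and $A_\Phi$ (note that in this case $\lambda \equiv \lambda_\infty \equiv 0$), observing that
\[ e^{-iC_{A, A_\Phi}} = e^{-i\left( \int_{C(x_0,x)} A_{\mathbb{Z}} + \int_0^{\pm\infty} \hat{\mathbf{v}} \cdot A_{\mathbb{Z}}(x + \tau\hat{\mathbf{v}})\, d\tau \right)} = (U_{A, A_\Phi})^*\, e^{-i\int_0^{\pm\infty} \hat{\mathbf{v}} \cdot A_{\mathbb{Z}}(x + \tau\hat{\mathbf{v}})\, d\tau}, \tag{5.40} \]
and using (3.42) with $\lambda \equiv 0$. We now prove (5.20). Note that ([4], Eq. 2.12)
\[ (p - A(x))\, e^{-iL_{A,\hat{\mathbf{v}}}(t)} = e^{-iL_{A,\hat{\mathbf{v}}}(t)} \left( p - A(x + t\hat{\mathbf{v}}) - \int_0^t (\hat{\mathbf{v}} \times B)(x + \tau\hat{\mathbf{v}})\, d\tau \right). \tag{5.41} \]
Then, since $\operatorname{div} A \in L^2_{loc}$ it follows from Sobolev's imbedding theorem [1] that
\[ \left\| e^{iL_{A,\hat{\mathbf{v}}}(\pm\infty)}\phi \right\|_{H_2(\mathbb{R}^3)} \le C\|\phi\|_{H_2(\mathbb{R}^3)}. \tag{5.42} \]
For simplicity we denote below $W_\pm(A, V)$ by $W_\pm$ and we define
\[ W_{\pm,\mathbf{v}} := e^{-im\mathbf{v} \cdot x}\, W_\pm\, e^{im\mathbf{v} \cdot x}. \tag{5.43} \]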
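As the wave operators are isometric, $W^*_{\pm,\mathbf{v}} W_{\pm,\mathbf{v}} = I$, and then
\[ \left\| \left( W^*_{\pm,\mathbf{v}} - e^{iL_{A,\hat{\mathbf{v}}}(\pm\infty)} \right)\phi \right\|_{L^2(\mathbb{R}^3)} = \left\| W^*_{\pm,\mathbf{v}}\phi - W^*_{\pm,\mathbf{v}} W_{\pm,\mathbf{v}}\, e^{iL_{A,\hat{\mathbf{v}}}(\pm\infty)}\phi \right\|_{L^2(\mathbb{R}^3)} \le \left\| \left( W_{\pm,\mathbf{v}} - e^{-iL_{A,\hat{\mathbf{v}}}(\pm\infty)} \right) e^{iL_{A,\hat{\mathbf{v}}}(\pm\infty)}\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}. \]
We now state the main result of this subsection.

Theorem 5.7. (Reconstruction Formula I). Suppose that $B$, $V$ satisfy Assumption 5.1. Let $\Lambda_0$ be a compact subset of $\Lambda_{\hat{\mathbf{v}}}$, with $\mathbf{v} \in \mathbb{R}^3 \setminus \{0\}$. Then, for all $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi}(B)$ there is a constant $C$ such that
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, S(A, V)\, e^{im\mathbf{v} \cdot x} - e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}, \tag{5.44} \]
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, S(A, V)^*\, e^{im\mathbf{v} \cdot x} - e^{-i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}, \tag{5.45} \]
for all $\phi \in H_2(\mathbb{R}^3)$ with support $\phi \subset \Lambda_0$.
Proof. We use the same notation as in the end of the proof of Lemma 5.6. First we prove (5.44) and (5.45) for $A_C$,
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, S(A_C, V)\, e^{im\mathbf{v} \cdot x} - e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A_C(x + \tau\hat{\mathbf{v}})\, d\tau} \right)\phi \right\|_{L^2(\mathbb{R}^3)} = \left\| W^*_{+,\mathbf{v}} W_{-,\mathbf{v}}\phi - W^*_{+,\mathbf{v}} W_{+,\mathbf{v}}\, e^{i(L_{A_C,\hat{\mathbf{v}}}(\infty) - L_{A_C,\hat{\mathbf{v}}}(-\infty))}\phi \right\|_{L^2(\mathbb{R}^3)} \]
\[ \le \left\| \left[ W_{-,\mathbf{v}} - e^{-iL_{A_C,\hat{\mathbf{v}}}(-\infty)} \right]\phi - \left[ W_{+,\mathbf{v}} - e^{-iL_{A_C,\hat{\mathbf{v}}}(\infty)} \right] e^{i(L_{A_C,\hat{\mathbf{v}}}(\infty) - L_{A_C,\hat{\mathbf{v}}}(-\infty))}\phi \right\|_{L^2(\mathbb{R}^3)} \le \frac{C}{v}\|\phi\|_{H_2(\mathbb{R}^3)}. \]
The proof for $S(A_C, V)^*$ follows in the same way. Now we prove (5.44) for $A \in \mathcal{A}_{\Phi,2\pi}(B)$; the proof of (5.45) follows in the same way. By (5.12), $S(A, V) = S(A_\Phi, V)$. From (5.40) it follows that
\[ e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A_{\mathbb{Z}}(x + \tau\hat{\mathbf{v}})\, d\tau} = e^{-iC_{A, A_\Phi}}\, e^{iC_{A, A_\Phi}} = 1, \]
and thus
\[ e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau} = e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A_\Phi(x + \tau\hat{\mathbf{v}})\, d\tau}. \tag{5.46} \]
Then it is enough to prove (5.44) for $A = A_C + \nabla\lambda$. By (5.9), (5.17) and as $\lambda_\infty$ is homogeneous of order zero,
\[ \left\| \left( e^{-im\mathbf{v} \cdot x}\, S(A, V)\, e^{im\mathbf{v} \cdot x} - e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \]
\[ = \left\| \left( e^{i\lambda_\infty(\frac{p}{mv} + \hat{\mathbf{v}})}\, e^{-im\mathbf{v} \cdot x}\, S(A_C, V)\, e^{im\mathbf{v} \cdot x}\, e^{-i\lambda_\infty(-\frac{p}{mv} - \hat{\mathbf{v}})} - e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A(x + \tau\hat{\mathbf{v}})\, d\tau} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \]
\[ \le \left\| e^{i\lambda_\infty(\frac{p}{mv} + \hat{\mathbf{v}})}\, e^{-im\mathbf{v} \cdot x}\, S(A_C, V)\, e^{im\mathbf{v} \cdot x} \left( e^{-i\lambda_\infty(-\frac{p}{mv} - \hat{\mathbf{v}})} - e^{-i\lambda_\infty(-\hat{\mathbf{v}})} \right)\phi \right\|_{L^2(\mathbb{R}^3)} \]
\[ + \left\| e^{i\lambda_\infty(\frac{p}{mv} + \hat{\mathbf{v}})} \left( e^{-im\mathbf{v} \cdot x}\, S(A_C, V)\, e^{im\mathbf{v} \cdot x} - e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A_C(x + \tau\hat{\mathbf{v}})\, d\tau} \right) e^{-i\lambda_\infty(-\hat{\mathbf{v}})}\phi \right\|_{L^2(\mathbb{R}^3)} \]
\[ + \left\| \left( e^{i\lambda_\infty(\frac{p}{mv} + \hat{\mathbf{v}})} - e^{i\lambda_\infty(\hat{\mathbf{v}})} \right) e^{i\int_{-\infty}^{\infty} \hat{\mathbf{v}} \cdot A_C(x + \tau\hat{\mathbf{v}})\, d\tau}\, e^{-i\lambda_\infty(-\hat{\mathbf{v}})}\phi \right\|_{L^2(\mathbb{R}^3)} \le C\frac{1}{v}\|\phi\|_{H_2(\mathbb{R}^3)}. \]
The last inequality follows from (3.39), (5.42) and (5.44) for $A_C$.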
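5.2. High-velocity estimates II. The electric potential. Recall that $\tilde\phi$ is defined in (5.21) and that $H_1$ is given by (5.23).

Lemma 5.8. Let $h : \mathbb{R}^3 \to \mathbb{R}$ be a bounded function with compact support contained in $\mathbb{R}^3 \setminus \Lambda_{\hat v}$, and let $\phi$ be a function in $H_6(\mathbb{R}^3)$ with compact support contained in $\Lambda_{\hat v}$. Then, for any $l \in \mathbb{N}$ there exists a constant $C_l$ such that the following inequalities hold:

i) $\big\|h\,e^{-i\tau H_1}\tilde\phi\big\|_{L^2(\mathbb{R}^3)} \le C_l\,\dfrac{1}{(1+|\tau|)^l}\,\dfrac{1}{v^{3-\epsilon}}\,\|\phi\|_{H_6(\mathbb{R}^3)}$, $\forall\,\epsilon > 0$,

ii) $\big\|h\,p\,e^{-i\tau H_1}\tilde\phi\big\|_{L^2(\mathbb{R}^3)} \le C_l\,\dfrac{1}{(1+|\tau|)^l}\,\dfrac{1}{v^{2-\epsilon}}\,\|\phi\|_{H_5(\mathbb{R}^3)}$, $\forall\,\epsilon > 0$.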
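Proof. We prove i); ii) follows in a similar way. Clearly,
\[
\|\tilde\phi - \phi\|_{L^2(\mathbb{R}^3)} \le \frac{1}{v^{6\rho}}\,\|\phi\|_{H_6(\mathbb{R}^3)}, \quad \text{where } \rho \ge 1/2. \tag{5.47}
\]
It follows from (5.18) and the properties of the support of $h$ and $\phi$ that
\[
\big\|h\,e^{-i\tau H_1}\phi\big\|_{L^2(\mathbb{R}^3)} = \Big\|h\,e^{-i\tau p\cdot\hat v}\Big(e^{-i\tau\frac{p^2}{2mv}} - I - \Big(-i\tau\frac{p^2}{2mv}\Big) - \frac{1}{2}\Big(-i\tau\frac{p^2}{2mv}\Big)^{2}\Big)\phi\Big\|_{L^2(\mathbb{R}^3)}.
\]
Observing that
\[
\Big|e^{-i\tau\frac{p^2}{2mv}} - I - \Big(-i\tau\frac{p^2}{2mv}\Big) - \frac{1}{2}\Big(-i\tau\frac{p^2}{2mv}\Big)^{2}\Big| \le C\,|\tau|^3\,\frac{p^6}{(2mv)^3},
\]
we obtain
\[
\big\|h\,e^{-i\tau H_1}\phi\big\|_{L^2(\mathbb{R}^3)} \le C\,\frac{(1+|\tau|)^3}{(2mv)^3}\,\|\phi\|_{H_6(\mathbb{R}^3)}. \tag{5.48}
\]
We prove as in (5.36) that there exists a constant $C_l$ such that
\[
\big\|h\,e^{-i\tau H_1}\tilde\phi\big\|_{L^2(\mathbb{R}^3)} \le C_l\,\frac{1}{(1+|\tau|)^l}\,\|\phi\|_{L^2(\mathbb{R}^3)}. \tag{5.49}
\]
Finally we obtain i) from (5.47) and interpolating (5.48, 5.49).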
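The cubic bound used to obtain (5.48) is, at the level of scalars, the standard Taylor remainder estimate $|e^{-is} - 1 - (-is) - \tfrac12(-is)^2| \le |s|^3/6$ for real $s$, applied with $s = \tau p^2/(2mv)$. The following minimal sketch checks this scalar inequality numerically; the function names and the sample grid are illustrative assumptions, not part of the paper.

```python
import cmath

def taylor_remainder(s):
    """|e^{-is} - (1 + (-is) + (-is)^2 / 2)| for a real number s."""
    w = -1j * s
    return abs(cmath.exp(w) - (1 + w + w * w / 2))

def cubic_bound(s):
    """Right-hand side |s|^3 / 6 of the second-order Taylor remainder estimate."""
    return abs(s) ** 3 / 6

# Hypothetical sample grid on [-50, 50]; any real s should satisfy the bound.
samples = [k / 10 for k in range(-500, 501)]
violations = [s for s in samples
              if taylor_remainder(s) > cubic_bound(s) + 1e-12]
```

With $s = \tau p^2/(2mv)$ the bound reads $|\tau|^3 p^6 / (6\,(2mv)^3)$, which has the form of the estimate preceding (5.48) with $C = 1/6$.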
We denote
\[
a(\hat v, x) := \int_{-\infty}^{\infty} A(x+\tau\hat v)\cdot\hat v\,d\tau, \tag{5.50}
\]
and for $\phi_0 \in H_6(\mathbb{R}^3)$ with compact support in $\Lambda_{\hat v}$, $\phi_v := e^{imv\cdot x}\phi_0$. Recall that $\Lambda_{\hat v}$ is defined in (5.14), that f (x, t) is defined in (5.28), that η is defined in (5.38), and that $\mathcal{A}_{\Phi,2\pi}(B)$ is defined in Definition 5.4.
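To make the line integral (5.50) concrete, here is a minimal numerical sketch. The vector potential below is a hypothetical smooth, rapidly decaying field chosen only for illustration (it is not one of the potentials of the paper), and the integral is truncated to $[-T, T]$ and approximated by the trapezoid rule.

```python
import math

def sample_potential(x):
    """Hypothetical decaying vector potential A(x) = (-x2, x1, 0) / (1 + |x|^2)^2."""
    r2 = x[0] ** 2 + x[1] ** 2 + x[2] ** 2
    w = 1.0 / (1.0 + r2) ** 2
    return (-x[1] * w, x[0] * w, 0.0)

def line_integral(A, x, v_hat, T=200.0, n=200_000):
    """Trapezoid approximation of a(v_hat, x) = int_{-T}^{T} A(x + tau v_hat) . v_hat dtau."""
    h = 2.0 * T / n
    total = 0.0
    for k in range(n + 1):
        tau = -T + k * h
        point = tuple(x[i] + tau * v_hat[i] for i in range(3))
        a = A(point)
        value = sum(a[i] * v_hat[i] for i in range(3))
        total += value * (0.5 if k in (0, n) else 1.0)
    return total * h

x0 = (0.0, 1.0, 0.0)
a_plus = line_integral(sample_potential, x0, (1.0, 0.0, 0.0))
a_minus = line_integral(sample_potential, x0, (-1.0, 0.0, 0.0))
```

For this sample field the integrand along the line is $-1/(2+\tau^2)^2$, so the exact value is $-\pi/(4\sqrt2)$. Reversing the direction $\hat v$ flips the sign of $a(\hat v, x)$, as the substitution $\tau \to -\tau$ in (5.50) shows.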
Theorem 5.9. (Reconstruction Formula II). Suppose that B, V satisfy Assumption 5.1. Let 0 be a compact subset of vˆ , with v ∈ R \ {0}. Then, for all and all A ∈ A,2π,SR (B), # " ∞ v S(A, V ) − eia(ˆv,x) φv , ψv = −ieia(ˆv,x) −∞ V (x + τ vˆ ) dτ φ0 , ψ0 0 + −ieia(ˆv,x) −∞ η (x + τ vˆ , −∞) dτ φ0 , ψ0 ∞ + −i 0 η (x + τ vˆ , ∞) dτ eia(ˆv,x) φ0 , ψ0 + R(v, φ0 , ψ0 ),
(5.51)
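where
\[
|R(v,\phi_0,\psi_0)| \le C\,\|\phi_0\|_{H_6(\mathbb{R}^3)}\,\|\psi_0\|_{H_6(\mathbb{R}^3)}
\begin{cases}
\dfrac{1}{v^{\min(\mu-2,\,\alpha-1)}}, & \text{if } \min(\mu-3,\,\alpha-2) < 0,\\[2mm]
\dfrac{|\ln v|}{v}, & \text{if } \min(\mu-3,\,\alpha-2) = 0,\\[2mm]
\dfrac{1}{v}, & \text{if } \min(\mu-3,\,\alpha-2) > 0,
\end{cases} \tag{5.52}
\]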
for some constant C and all φ0 , ψ0 ∈ H6 (R3 ) with compact support in 0 . Proof. We first prove the theorem in the Coulomb gauge AC . Note that # " v S(A, V ) − eia φv , ψv = v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 +v R− φ0 , e−i L AC ,ˆv (∞) ψ0 + v (R− φ0 , R+ ψ0 ) , (5.53) where R± := e−imv·x W± (AC , V )eimv·x − e−i L AC ,ˆv (±∞) . By Lemma 5.6, 1 (5.54) v |(R− φ0 , R+ ψ0 )| ≤ C φ0 H6 (R3 ) ψ0 H6 (R3 ) . v We prove below that v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 ! ∞ (η (x + τ vˆ , ∞) + χ V (x + τ vˆ )) dτ eia φ0 , ψ0 + R+ (v, φ0 , ψ0 ), = −i 0
(5.55) v R− φ0 , e−i L AC ,ˆv (∞) ψ0 ! 0 = −ieia (η (x + τ vˆ , −∞) + χ V (x + τ vˆ )) dτ φ0 , ψ0 + R− (v, φ0 , ψ0 ), −∞
(5.56)
where R± satisfy (5.52). Note that (5.56) follows from (5.55) by time inversion and charge conjugation in the magnetic potential, i.e., by taking complex conjugates and changing AC to −AC . It can also be proved as in the proof of (5.55) that we give below in seven steps. We use the notation of the proof of Lemma 5.6. For simplicity we denote by O(r ) a term that satisfies |O(r )| ≤ Cφ0 H6 (R3 ) ψ0 H6 (R3 ) r. Step 1. v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 = e−i L AC ,ˆv (−∞) φ0 , limt→∞
t 0
dτ eiτ H2 ie−i L AC ,ˆv (t−τ ) [
b (x, t
− τ ) + χ V (x)]e−iτ H1 ψ˜ 0
(5.57) + O(1/v).
Equation (5.57) follows from (5.24), (5.26), (5.29) (with $\phi_0$ instead of $\phi$) and the following formula that is easily obtained from Lemma 5.8:
\[
\|T_2 + T_3\|_{L^2(\mathbb{R}^3)} \le C_l\,\frac{\|\phi_0\|_{H_6(\mathbb{R}^3)}}{v^{3-\epsilon}\,(1+|\tau|)^l}, \quad \forall\,\epsilon > 0,\ l = 1, 2, \dots,
\]
that improves (5.36, 5.37). Step 2. t limt→∞ 0 dτ eiτ H2 ie−i L AC ,ˆv (t−τ ) [b (x, t − τ ) + χ V (x)]e−iτ H1 ψ˜ 0 = limt→∞
t 0
dτ eiτ H2 ie−i L AC ,ˆv (t−τ ) [η (x, t − τ ) + χ V (x)]e−iτ H1 ψ˜ 0 .
(5.58)
(5.59)
This follows from Lebesgue’s dominated convergence theorem and as
lim b (x, t − τ ) − η (x, t − τ ) e−iτ H1 ψ˜ 0 2 3 = 0, L (R )
t→∞
and, moreover,
b (x, t − τ ) − η (x, t − τ ) e−it H1 ψ˜ 0
L 2 (R 3 )
≤ h(τ ), for some h(τ ) ∈ L 1 (0, ∞).
This estimate is proven as in the proof of Lemma 5.6, using Lemma 5.5. Step 3. t v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 = 0 dτ e−i L AC ,ˆv (−∞) φ0 , eiτ H2 ie−i L AC ,ˆv (∞) [η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0
+O(1/v) + 0 1/(1 + |t|)min(µ−2,α−1) .
(5.60)
This follows from Steps 1 and 2, and from the following argument. As in the proof of Lemma 5.6 we prove that [η (x, t − τ ) + χ V (x)]e−iτ H1 ψ˜ 0 2 3 ≤ C 1/(1 + |τ |)min(µ−1,α) ψ0 H2 (R3 ) . L (R )
(5.61)
Then by Fatou’s lemma [η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0
L 2 (R 3 )
≤ C 1/(1 + |τ |)min(µ−1,α) ψ0 H2 (R3 ) . (5.62)
Hence, by Lebesque’s dominated convergence theorem, t limt→∞ 0 dτ eiτ H2 ie−i L AC ,ˆv (t−τ ) [η (x, t − τ ) + χ V (x)]e−iτ H1 ψ˜ 0 =
∞ 0
dτ eiτ H2 ie−i L AC ,ˆv (∞) [η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 ,
where the limit is on the strong topology of L 2 (R3 ). We complete the proof of (5.60) using (5.62). We now estimate the integrand in (5.60). Step 4. e−i L AC ,ˆv (−∞) φ0 , eiτ H2 e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 = ei(L AC ,ˆv (τ )−L AC ,ˆv (−∞)) φ0 , eiτ H1 e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 ! 1 1 + O . (5.63) v (1 + |τ |)min(µ−2,α−1) Denote by χ the characteristic function of . Then e−i L AC ,ˆv (−∞) φ0 , eiτ H2 e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 = eiτ H1 χ e−iτ H2 e−i L AC ,ˆv (−∞) φ0 , eiτ H1 e−i L AC ,ˆv (∞) i[η (x, ∞) +χ V (x)]e−iτ H1 ψ˜ 0 . Hence, (5.63) will be proved if we can replace eiτ H1 χ e−iτ H2 by χ ei L AC ,ˆv (τ ) adding the error term. But, this follows from (5.62) and the estimate, 1 + |τ | iτ H1 φ0 H2 (R3 ) , χ e−iτ H2 − χ ei L AC ,ˆv (τ ) e−i L AC ,ˆv (−∞) φ0 ≤ C e v (5.64) that we prove below. We designate ϕτ := ei(L AC ,ˆv (τ )−L AC ,ˆv (−∞)) φ0 . We have that eiτ H1 χ e−iτ H2 − χ ei L AC ,ˆv (τ ) e−i L AC ,ˆv (−∞) φ0 = eiτ H1 χ e−iτ H2 e−i L AC ,ˆv (τ ) − eiτ H2 χ e−iτ H1 ϕτ + eiτ H1 χ e−iτ H1 − χ ϕτ . (5.65) By (5.41), ϕτ H2 (R3 ) ≤ Cφ0 H2 (R3 ) .
(5.66)
Hence, using |τ | p 2 2 −iτ ( p+mv)2 /2mv , − e−iτ ( p·ˆv+v /2mv) ≤ C e 2mv we prove that 2 −iτ H1 − e−iτ (p·ˆv+v /2mv) ϕτ e
L 2 (R 3 )
≤C
|τ | φ0 H2 (R3 ) , v
and since χ − 1 ≡ 0 on the support of e−iτ (p·ˆv+v /2mv) ϕτ , iτ H
e 1 χ e−iτ H1 − χ ϕτ 2 3 L (R ) 2 = eiτ H1 (χ − 1)(e−iτ H1 − e−iτ (p·ˆv+v /2mv) )ϕτ
(5.67)
2
≤
C |τv | φ0 H2 (R3 ) .
L 2 (R 3 )
(5.68)
Then (5.64) follows from (5.39,5.65, 5.66, 5.68). 2 Step 5. We now replace e±iτ H1 by e±i(τ p·ˆv+v /2mv) . We will prove that ei(L AC ,ˆv (τ )−L AC ,ˆv (−∞)) φ0 , eiτ H1 e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 = ei(L AC ,ˆv (τ )−L AC ,ˆv (−∞)) φ0 , (5.69) 1 eiτ p·ˆv e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ p·ˆv ψ˜ 0 + v1 O (1+|τ |)min(µ−2,α−1) 1 , τ ≥ 0. = φ0 , e−ia(ˆv,x) i[η (x + τ vˆ , ∞) + χ V (x +τ vˆ )]ψ˜ 0 + v1 O (1+|τ |)min(µ−2,α−1) Recall that ϕτ is defined below (5.64). By (5.62) and (5.67), 2 e−iτ H1 ϕτ , e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 = e−(iτ p·ˆv+v /2mv) ϕτ , (5.70) 1 e−i L AC ,ˆv (∞) i[η (x, ∞) + χ V (x)]e−iτ H1 ψ˜ 0 + v1 O (1+|τ |)min(µ−2,α−1) . The first equality in (5.69) follows from (5.67) and as [η (x, ∞) + χ V (x)]ei L AC ,ˆv (∞) e−i(τ p·ˆv+v 1 ≤ C (1+|τ |)min(µ−1,α) φ0 H2 (R3 ) , τ > 0,
2 /2mv)
ϕτ L 2 (R3 )
(5.71)
because φ0 has compact support, e−iτ p·ˆv is just a translation and the decay properties of V (x) and η (x, ∞) (in the direction vˆ ). The second equality is immediate. By (5.60, 5.63, 5.69) t v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 = dτ φ0 , e−ia(ˆv,x) i[η (x + τ vˆ , ∞) + χ V (x + τ vˆ )]ψ˜ 0 0
+O (1/v) + O 1/(1 + |t|)min(µ−2,α−1) % 1 v O (ln(1 + |t|)) , if min(µ − 2, α − 1) = 1, + 1 1 otherwise. v O (1+|t|)min(µ−3,α−2,0) ,
(5.72)
Step 6. We now prove that t −ia(ˆv,x) i[ (x + τ v ˆ , ∞) + χ V (x + τ vˆ )]ψ˜ 0 η 0 dτ φ0 , e ∞ − 0 dτ φ0 , e−ia(ˆv,x) i[η (x + τ vˆ , ∞) + χ V (x + τ vˆ )]ψ0 1 , t > 0. = O(1/v) + O (1+|t|)min(µ−2,α−1) As φ0 has compact support, [η (x + τ vˆ , ∞) + χ V (x + τ vˆ )]eia(ˆv,x) φ0 ≤C
(5.73)
L 2 (R 3 )
1 φ0 H2 (R3 ) , τ > 0. (1 + |τ |)min(µ−1,α)
(5.74)
Equations (5.22) and (5.74) prove (5.73). By (5.72, 5.73)
v e−i L AC ,ˆv (−∞) φ0 , R+ ψ0 =
dτ φ0 , e−ia(ˆv,x) i[η (x + τ vˆ , ∞) + χ V (x + τ vˆ )]ψ0 0 +O (1/v) + O 1/(1 + |t|)min(µ−2,α−1) % 1 v O (ln(1 + |t|)), if min(µ − 2, α − 1) = 1, + 1 1 otherwise. v O (1+|t|)min(µ−3,α−2,0) , ∞
(5.75)

Finally, taking $t = v$ we obtain (5.55) in the Coulomb gauge, and then (5.51) is proven for $A_C$. Suppose that $A \in \mathcal{A}_{\Phi,2\pi,SR}(B)$. By (5.11), $S(A,V) = S(A_C,V)$. As $\lambda_\infty$ is constant,
\[
e^{i\int_{-\infty}^{\infty}A(x+\tau\hat v)\cdot\hat v\,d\tau} = e^{i\int_{-\infty}^{\infty}A_C(x+\tau\hat v)\cdot\hat v\,d\tau},
\]
and it follows that (5.51) holds for $A \in \mathcal{A}_{\Phi,2\pi}(B)$.
6. Reconstruction of the Magnetic Field and the Electric Potential Outside the Obstacle

In this section we obtain a method for the unique reconstruction of the magnetic field and the electric potential outside the obstacle, $K$, from the high-velocity limit of the scattering operator. The method is given in the proof of Theorem 6.3 and is summarized in Remark 6.4.

Definition 6.1. We denote by $\Lambda_{rec}$ the set of points $x \in \Lambda$ such that for some two-dimensional plane $P_x$ we have that $x + P_x \subset \Lambda$. Note that if $K$ is convex, $\Lambda_{rec} = \Lambda$.

Lemma 6.2. For every $A \in \mathcal{A}_{\Phi,2\pi}(B)$ and every unit vector $\hat v$ in $\mathbb{R}^3$, we have that
\[
\nabla\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau = \int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau, \tag{6.1}
\]
in distribution sense in $\Lambda_{\hat v}$.
Proof. The following identity holds in distribution sense in $\Lambda_{\hat v}$ (this is just the triple vector product formula):
\[
\hat v\times(\nabla\times A) = \nabla(\hat v\cdot A) - (\hat v\cdot\nabla)A. \tag{6.2}
\]
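Identity (6.2) can be checked numerically for a smooth field. The sketch below uses a hypothetical test field and central differences (both are assumptions made only for this illustration). Note that the identity is algebraic in the first derivatives $\partial_k A_i$, so it holds for the approximate Jacobian up to rounding.

```python
import math

def A(x):
    """Hypothetical smooth test field, chosen only for illustration."""
    return (math.sin(x[1] * x[2]), x[0] ** 2 * x[2], math.exp(x[0]) * x[1])

def jacobian(f, x, h=1e-4):
    """J[i][k] approximates d f_i / d x_k by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        fp, fm = f(xp), f(xm)
        for i in range(3):
            J[i][k] = (fp[i] - fm[i]) / (2 * h)
    return J

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def identity_residual(v, x):
    """Max component of v x (curl A) - [grad(v.A) - (v.grad)A] at the point x."""
    J = jacobian(A, x)
    curl = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
    lhs = cross(v, curl)
    grad_vA = [sum(v[k] * J[k][i] for k in range(3)) for i in range(3)]
    v_grad_A = [sum(v[k] * J[i][k] for k in range(3)) for i in range(3)]
    return max(abs(lhs[i] - (grad_vA[i] - v_grad_A[i])) for i in range(3))

residual = identity_residual((0.6, 0.0, 0.8), (0.3, -0.2, 0.5))
```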
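Then, for every $\phi \in C_0^\infty(\Lambda_{\hat v})$,
\[
\begin{aligned}
\Big(\int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau\Big)[\phi]
&= \int_{\mathbb{R}^3}dx\int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau\,\phi(x) \\
&= \int_{\mathbb{R}^3}dx\,\lim_{r\to\infty}\int_{-r}^{r}d\tau\,\hat v\times B(x)\,\phi(x-\tau\hat v) \\
&= \int_{\mathbb{R}^3}dx\,\lim_{r\to\infty}\int_{-r}^{r}d\tau\,\big[-\hat v\cdot A(x)\,(\nabla\phi)(x-\tau\hat v) + A(x)\,(\hat v\cdot\nabla\phi)(x-\tau\hat v)\big] \\
&= \Big(\nabla\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau\Big)[\phi] + \lim_{r\to\infty}\int_{\mathbb{R}^3}dx\,A(x)\big(\phi(x-r\hat v) - \phi(x+r\hat v)\big) \\
&= \Big(\nabla\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau\Big)[\phi],
\end{aligned}
\]
where in the last equality we used the decay of $A$ and the fact that $\phi$ has compact support.

Theorem 6.3. (Reconstruction of the Magnetic Field and the Electric Potential). Suppose that $B, V$ satisfy Assumption 5.1. Then, for any flux $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi}(B)$, the high-velocity limits of $S(A,V)$ in (5.44), known for all $\Lambda_0$, all unit vectors $\hat v$ and all $\phi_0 \in H_2(\mathbb{R}^3)$ with support $\phi_0 \subset \Lambda_0$, uniquely determine $B(x)$ for almost every $x \in \Lambda_{rec}$. Furthermore, for any flux $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi,SR}(B)$, the high-velocity limits of $S(A,V)$ in (5.51), known for all $\Lambda_0$, all unit vectors $\hat v$ and all $\phi_0, \psi_0 \in H_6(\mathbb{R}^3)$ with support $\phi_0$, support $\psi_0 \subset \Lambda_0$, uniquely determine $V(x)$ for almost every $x \in \Lambda_{rec}$.

Proof. We proceed as in the proof of Theorem 1.1 of [14] (see also the proof of Theorem 1.4 of [40]) with the modifications that are necessary to take the obstacle into account and to reconstruct the magnetic field. Let us fix $x_0 \in \Lambda_{rec}$. For each $j = 1, 2, 3$ we take unit vectors $\hat u_j, \hat v_j$, and $\varepsilon > 0$ such that the following conditions are satisfied:

1. $\hat u_j\cdot\hat v_i = 0$, $i, j \in \{1, 2, 3\}$.
2. The unit vectors $\hat n_j := \hat u_j\times\hat v_j$, $j = 1, 2, 3$, are linearly independent.
3. $B_\varepsilon^{\mathbb{R}^3}(x_0) + p(\hat u_j, \hat v_j) \subset \Lambda$, $j = 1, 2, 3$,

where $p(\hat u_j, \hat v_j)$ is the two-dimensional plane generated by $\hat u_j, \hat v_j$. For any $z = (z_1, z_2) \in \mathbb{R}^2$ we define
\[
\phi_j(z) := e^{-i(z_1\hat u_j + z_2\hat v_j)\cdot p}\phi_0, \quad \psi_j(z) := e^{-i(z_1\hat u_j + z_2\hat v_j)\cdot p}\psi_0, \quad j = 1, 2, 3, \quad \phi_0, \psi_0 \in C_0^\infty\big(B_\varepsilon^{\mathbb{R}^3}(x_0)\big). \tag{6.3}
\]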
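From the limit (5.44) we uniquely reconstruct
\[
e^{i\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau}
\]
for all $x \in \Lambda_{\hat v}$, and then we reconstruct $\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + 2\pi n(x,\hat v)$ with $n(x,\hat v)$ an integer that is locally constant. By Lemma 6.2 we also reconstruct uniquely
\[
\int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau \tag{6.4}
\]
for a.e. $x \in \Lambda_{\hat v}$. Take now $\hat v \in p(\hat u_j, \hat v_j)$. Hence, we uniquely reconstruct
\[
\int_{-\infty}^{\infty}\hat n_j\cdot B(x+\tau\hat v)\,d\tau = -\hat n_j\cdot\Big(\hat v\times\int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau\Big), \tag{6.5}
\]
for a.e. $x \in \Lambda_{\hat v}$. We used the triple vector product formula, $a\times(b\times c) = (a\cdot c)\,b - (a\cdot b)\,c$. We now define $F_j : \mathbb{R}^2 \to \mathbb{C}$,
\[
F_j(z) := \big(\hat n_j\cdot B(x)\,\phi_j(z),\ \psi_j(z)\big).
\]
$F_j$ is continuous and $|F_j(z)| \le C(1+|z|)^{-\mu}$, $j = 1, 2, 3$. Moreover, we uniquely reconstruct from (5.44) the Radon transforms,
\[
\tilde F_j(\hat w; z) := \int_{-\infty}^{\infty}F_j(z+\tau\hat w)\,d\tau = \Big(\int_{-\infty}^{\infty}\hat n_j\cdot B\big(x+\tau(\hat w_1\hat u_j + \hat w_2\hat v_j)\big)\,d\tau\ \phi_j(z),\ \psi_j(z)\Big),
\]
where $z \in \mathbb{R}^2$ and $\hat w := (\hat w_1, \hat w_2) \in \mathbb{R}^2$ has modulus one.

Inverting this Radon transform (see Theorem 2.17 of [18,19,27]) we uniquely reconstruct $F_j(z)$ and in particular $F_j(0) = (\hat n_j\cdot B\,\phi_0, \psi_0)$, and hence we uniquely reconstruct $\hat n_j\cdot B(x)$, $j = 1, 2, 3$, for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. As the $\hat n_j$ are linearly independent, we uniquely reconstruct $B(x)$ for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. Since $x_0 \in \Lambda_{rec}$ is arbitrary, we uniquely reconstruct $B(x)$ for a.e. $x \in \Lambda_{rec}$.

We now uniquely reconstruct $V$. Take any $x_0 \in \Lambda_{rec}$. Let $\hat u, \hat w$ be orthonormal vectors such that $B_\varepsilon^{\mathbb{R}^3}(x_0) + p(\hat u, \hat w) \subset \Lambda_{\hat v}$. We define
\[
\phi(z) := e^{-i(z_1\hat u + z_2\hat w)\cdot p}\phi_0, \quad \psi(z) := e^{-i(z_1\hat u + z_2\hat w)\cdot p}\psi_0, \quad \phi_0, \psi_0 \in C_0^\infty\big(B_\varepsilon^{\mathbb{R}^3}(x_0)\big),
\]
and the function $F : \mathbb{R}^2 \to \mathbb{C}$,
\[
F(z) := \big(V(x)\,\phi(z),\ \psi(z)\big).
\]
$F$ is continuous and $|F(z)| \le C(1+|z|)^{-\alpha}$.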
Moreover, since $B$ is already known in $\Lambda_{rec}$, we uniquely reconstruct from (5.51) the Radon transforms,
\[
\tilde F(\hat y; z) := \int_{-\infty}^{\infty}F(z+\tau\hat y)\,d\tau = \Big(\int_{-\infty}^{\infty}V\big(x+\tau(\hat y_1\hat u + \hat y_2\hat w)\big)\,d\tau\ \phi(z),\ \psi(z)\Big),
\]
where $z \in \mathbb{R}^2$ and $\hat y := (\hat y_1, \hat y_2) \in \mathbb{R}^2$ has modulus one.

As above, inverting these Radon transforms we uniquely reconstruct $F(z)$, and in particular $F(0) = (V\phi_0, \psi_0)$, which uniquely determines $V(x)$ for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. Since $x_0 \in \Lambda_{rec}$ is arbitrary, $V(x)$ is uniquely reconstructed for a.e. $x \in \Lambda_{rec}$.

Remark 6.4. Let us summarize the reconstruction method given by Theorem 6.3. From the high-velocity limit (5.44) we uniquely reconstruct
\[
e^{i\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau}, \tag{6.6}
\]
and from this we uniquely reconstruct
\[
\int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau, \quad x \in \Lambda_{\hat v}, \tag{6.7}
\]
which gives us the Radon transform
\[
\tilde F_j(\hat w; z) := \int_{-\infty}^{\infty}F_j(z+\tau\hat w)\,d\tau = \Big(\int_{-\infty}^{\infty}\hat n_j\cdot B\big(x+\tau(\hat w_1\hat u_j + \hat w_2\hat v_j)\big)\,d\tau\ \phi_j(z),\ \psi_j(z)\Big), \tag{6.8}
\]
where $z \in \mathbb{R}^2$ and $\hat w := (\hat w_1, \hat w_2) \in \mathbb{R}^2$ has modulus one.

Inverting this Radon transform we uniquely reconstruct $F_j(z)$ and in particular $F_j(0) = (\hat n_j\cdot B\,\phi_0, \psi_0)$, and hence we uniquely reconstruct $\hat n_j\cdot B(x)$, $j = 1, 2, 3$, for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. As the $\hat n_j$ are linearly independent, we uniquely reconstruct $B(x)$ for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. Since $x_0 \in \Lambda_{rec}$ is arbitrary, we uniquely reconstruct $B(x)$ for a.e. $x \in \Lambda_{rec}$. Note that to reconstruct $B$ almost everywhere in a neighborhood of a point $x_0$ we only need the high-velocity limit of the scattering operator applied to wave functions with support in a neighborhood of three two-dimensional planes. For the inversion of the Radon transform see Theorem 2.17 of [18] and [19,27].

Remember that given any $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we can always find a potential in $\mathcal{A}_\Phi(B)$ with the same scattering operator. We can take, for example, $A_\Phi$. See Eq. (5.12). Then there is no loss of generality in taking $A \in \mathcal{A}_\Phi(B)$. Note that (6.6) is not a gauge invariant quantity. If $\tilde A, A \in \mathcal{A}_\Phi(B)$ and $\tilde A = A + d\lambda$, then
\[
\int_{-\infty}^{\infty}\hat v\cdot\tilde A(x+\tau\hat v)\,d\tau = \int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v).
\]
We can, however, reconstruct (6.7) from the gauge invariant quantity
\[
R(x,y) := e^{i\int_{-\infty}^{\infty}\hat v\cdot[A(x+\tau\hat v) - A(y+\tau\hat v)]\,d\tau}, \quad x, y \in \Lambda_{\hat v}.
\]
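We have that
\[
\frac{1}{i}\,\overline{R(x,y)}\,\nabla_x R(x,y) = \nabla_x\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau = \int_{-\infty}^{\infty}\hat v\times B(x+\tau\hat v)\,d\tau, \quad x \in \Lambda_{\hat v}.
\]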
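The mechanism behind this formula is that $R$ has unit modulus, so $\frac{1}{i}\overline{R}\,\nabla_x R$ is the gradient of its phase and does not depend on $y$. A minimal one-dimensional sketch, in which the phase function `f` is a hypothetical stand-in for the line integral $\int\hat v\cdot A(x+\tau\hat v)\,d\tau$:

```python
import cmath
import math

def f(x):
    """Hypothetical smooth phase, standing in for the line integral of A."""
    return math.sin(x) + 0.5 * x * x

def f_prime(x):
    return math.cos(x) + x

def R(x, y):
    """Unit-modulus quantity e^{i (f(x) - f(y))}, the analogue of R(x, y)."""
    return cmath.exp(1j * (f(x) - f(y)))

def phase_gradient(x, y, h=1e-5):
    """(1/i) * conj(R) * dR/dx, with dR/dx taken by central differences."""
    dR = (R(x + h, y) - R(x - h, y)) / (2 * h)
    return R(x, y).conjugate() * dR / 1j

g1 = phase_gradient(0.7, -1.3)
g2 = phase_gradient(0.7, 2.9)
```

Both `g1` and `g2` approximate `f_prime(0.7)` with negligible imaginary part, independently of the reference point `y`.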
We now uniquely reconstruct $V$. Since $B$ is already known in $\Lambda_{rec}$, for any $\hat v \in p(\hat u, \hat w)$ we uniquely reconstruct from (5.51) the Radon transforms,
\[
\tilde F(\hat y; z) := \int_{-\infty}^{\infty}F(z+\tau\hat y)\,d\tau = \Big(\int_{-\infty}^{\infty}V\big(x+\tau(\hat y_1\hat u + \hat y_2\hat w)\big)\,d\tau\ \phi(z),\ \psi(z)\Big),
\]
where $z \in \mathbb{R}^2$ and $\hat y := (\hat y_1, \hat y_2) \in \mathbb{R}^2$ has modulus one.

As above, inverting these Radon transforms we uniquely reconstruct $F(z)$, and in particular $F(0) = (V\phi_0, \psi_0)$, which uniquely determines $V(x)$ for a.e. $x \in B_\varepsilon^{\mathbb{R}^3}(x_0)$. Since $x_0 \in \Lambda_{rec}$ is arbitrary, $V(x)$ is uniquely reconstructed for a.e. $x \in \Lambda_{rec}$.

7. The Aharonov-Bohm Effect

In this section we assume that $B \equiv 0$, i.e., that there is no magnetic field in $\Lambda$. In contrast, the electric potential, $V$, is not assumed to be zero. In other words, we will analyze the Aharonov-Bohm effect in the presence of an electric potential. As we will show, for high velocities the electric potential gives a lower-order contribution that plays no role in the Aharonov-Bohm effect. However, it could be of interest to allow for a non-trivial electric potential from the experimental point of view.

For any $x \in \mathbb{R}^3$ and any unit vector $\hat v \in S^2$ we denote
\[
L(x,\hat v) := x + \mathbb{R}\hat v,
\]
and we give to $L(x,\hat v)$ the orientation of $\hat v$. Suppose that $x, y \in \mathbb{R}^3$, $\hat v, \hat w \in S^2$ satisfy $\hat v\cdot\hat w \ge 0$ and that
\[
L(x,\hat v)\cup L(y,\hat w) \subset \Lambda.
\]
Take $\rho > 0$ so large that
\[
\mathrm{convex}\big((x + (-\infty,-\rho]\hat v)\cup(y + (-\infty,-\rho]\hat w)\big)\ \cup\ \mathrm{convex}\big((x + [\rho,\infty)\hat v)\cup(y + [\rho,\infty)\hat w)\big) \subset \big(B_r^{\mathbb{R}^3}(0)\big)^c,
\]
where $K \subset B_r^{\mathbb{R}^3}(0)$, $\big(B_r^{\mathbb{R}^3}(0)\big)^c$ is the complement of $B_r^{\mathbb{R}^3}(0)$ and the symbol $\mathrm{convex}(\cdot)$ denotes the convex hull of the indicated set.

We denote by $\gamma(x,y,\hat v,\hat w)$ the continuous, simple, oriented and closed curve with sides $x + [-\rho,\rho]\hat v$, oriented in the direction of $\hat v$, $y + [-\rho,\rho]\hat w$, oriented in the direction of $-\hat w$, and the oriented straight lines that join the points $x + \rho\hat v$ with $y + \rho\hat w$ and $y - \rho\hat w$ with $x - \rho\hat v$. Suppose that $A$ is short-range (see Definition 5.4). For example, we can take $A = A_C$. We denote $x_{\perp,\hat v} := x - (x,\hat v)\hat v$. It follows from Stokes' theorem that if $|x_{\perp,\hat v}| \ge r$,
\[
\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau = \int_{-\infty}^{\infty}\hat v\cdot A(x_\perp+\tau\hat v)\,d\tau = \lim_{s\to\infty}\int_{-\infty}^{\infty}\hat v\cdot A(s\,x_\perp+\tau\hat v)\,d\tau = 0. \tag{7.1}
\]
By Stokes' theorem and arguing as in the proof of (7.1) we prove that for short-range $A$,
\[
\int_{\gamma(x,y,\hat v,\hat w)}A = \int_{L(x,\hat v)}A - \int_{L(y,\hat w)}A. \tag{7.2}
\]
Take any $z \in \mathbb{R}^3$ such that $|(x+z)_{\perp,\hat v}| \ge r$, $|(y+z)_{\perp,\hat w}| \ge r$. By Stokes' theorem and (7.1),
\[
\int_{L(x+z,\hat v)}A = \int_{L(y+z,\hat w)}A = 0.
\]
Then, adding zero we write (7.2) as
\[
\int_{\gamma(x,y,\hat v,\hat w)}A = \Big(\int_{L(x,\hat v)}A - \int_{L(x+z,\hat v)}A\Big) - \Big(\int_{L(y,\hat w)}A - \int_{L(y+z,\hat w)}A\Big). \tag{7.3}
\]
The point is that for any $A \in \mathcal{A}_{\Phi,2\pi}(0)$ there is $\tilde A \in \mathcal{A}_{\Phi,2\pi}(0)$ with $A = \tilde A + \nabla\lambda$ and $\tilde A$ short-range; consequently (7.3) holds for any $A \in \mathcal{A}_{\Phi,2\pi}(0)$. It follows that from the high-velocity limit (5.44) we can reconstruct $\int_{\gamma(x,y,\hat v,\hat w)}A$, modulo $2\pi$. We have proven the following theorem.

Theorem 7.1. Suppose that $B \equiv 0$ and that $V$ satisfies Assumption 5.1. Then, for any flux $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi}(0)$, the high-velocity limits of $S(A,V)$ in (5.44), known for $\hat v$ and $\hat w$, determine the fluxes
\[
\int_{\gamma(x,y,\hat v,\hat w)}A \tag{7.4}
\]
modulo $2\pi$, for all curves $\gamma(x,y,\hat v,\hat w)$.

Remark 7.2. Theorem 7.1 implies that from the high-velocity limit (5.44) for $\hat v$ and $\hat w$ we can reconstruct the fluxes
\[
\int_\alpha A
\]
for any closed curve $\alpha$ such that there is a surface (or chain) $S$ in $\Lambda$ with $\partial S = \alpha - \gamma(x,y,\hat v,\hat w)$, because by Stokes' theorem,
\[
\int_\alpha A = \int_{\gamma(x,y,\hat v,\hat w)}A + \int_S B = \int_{\gamma(x,y,\hat v,\hat w)}A.
\]
Remember also that given any $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we can always find a potential in $\mathcal{A}_\Phi(B)$ with the same scattering operator. We can take, for example, $A_\Phi$. See Eq. (5.12). Then, there is no loss of generality in taking $A \in \mathcal{A}_\Phi(0)$. Furthermore, notice that we can at most reconstruct the fluxes modulo $2\pi$ because by (5.12) $S(A_\Phi,V) = S(A,V)$ and the fluxes of $A_\Phi$ and $A$ differ by integer multiples of $2\pi$. For general $A \in \mathcal{A}_{\Phi,2\pi}(0)$ we recover the fluxes from Eq. (7.3). However, if $A$ is short-range we can use the simpler formula (7.2).
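Since only the phases $e^{i\int_L A}$ are available from (5.44), formula (7.3) is used at the level of unit-modulus numbers, which is why the flux over $\gamma$ is recovered only modulo $2\pi$. A minimal sketch, in which the four line integrals are hypothetical numbers chosen for illustration:

```python
import cmath
import math

def phase_factor(line_integral):
    """e^{i a}, the only information about a line integral a provided by (5.44)."""
    return cmath.exp(1j * line_integral)

def flux_mod_2pi(p_x, p_xz, p_y, p_yz):
    """Flux over gamma from (7.3), modulo 2*pi, given the four phase factors."""
    ratio = (p_x / p_xz) / (p_y / p_yz)
    return cmath.phase(ratio) % (2 * math.pi)

# Hypothetical line integrals over L(x,v), L(x+z,v), L(y,w), L(y+z,w).
a_x, a_xz, a_y, a_yz = 0.9, 0.0, 0.3, 0.0
flux = flux_mod_2pi(*(phase_factor(a) for a in (a_x, a_xz, a_y, a_yz)))

# Shifting any line integral by a multiple of 2*pi leaves the result unchanged.
shifted = flux_mod_2pi(phase_factor(a_x + 2 * math.pi), phase_factor(a_xz),
                       phase_factor(a_y - 4 * math.pi), phase_factor(a_yz))
```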
Remark 7.3. As $\gamma(x,y,\hat v,\hat w)$ is a cycle, the homology class $[\gamma(x,y,\hat v,\hat w)]_{H_1(\Lambda;\mathbb{R})}$ is well defined. We denote
\[
H_{1,rec}(\Lambda;\mathbb{R}) := \big\{[\gamma(x,y,\hat v,\hat w)]_{H_1(\Lambda;\mathbb{R})} : L(x,\hat v)\cup L(x,\hat w) \subset \Lambda\big\}. \tag{7.5}
\]
$H_{1,rec}(\Lambda;\mathbb{R})$ is a vector subspace of $H_1(\Lambda;\mathbb{R})$. Let us denote by $H^1_{de\,R,\,rec}(\Lambda)$ the vector subspace of $H^1_{de\,R}(\Lambda)$ that is the dual to $H_{1,rec}(\Lambda;\mathbb{R})$, given by de Rham's Theorem. Then, for all $\Phi$ and all $A \in \mathcal{A}_{\Phi,2\pi}(0)$, from the high-velocity limit (5.44) known for all $\hat v, \hat w$ we reconstruct the projection of $A$ into $H^1_{de\,R,\,rec}(\Lambda)$ modulo $2\pi$, as we now show. Let
\[
\big\{[\sigma_j]_{H_{1,rec}(\Lambda;\mathbb{R})}\big\}_{j=1}^{m}
\]
be a basis of $H_{1,rec}(\Lambda;\mathbb{R})$, and let
\[
\big\{[A_j]_{H^1_{de\,R,\,rec}(\Lambda)}\big\}_{j=1}^{m}
\]
be the dual basis, i.e.,
\[
\int_{\sigma_j}A_k = \delta_{j,k}, \quad j, k = 1, 2, \dots, m.
\]
Let us denote by $P_{rec}$ the projector onto $H^1_{de\,R,\,rec}(\Lambda)$. Hence, for any $A \in \mathcal{A}_{\Phi,2\pi}(B)$,
\[
P_{rec}[A]_{H^1_{de\,R}(\Lambda)} = \sum_{j=1}^{m}\lambda_j\,[A_j]_{H^1_{de\,R,\,rec}(\Lambda)},
\]
and, furthermore, as
\[
\lambda_j = \int_{\sigma_j}A,
\]
we reconstruct $\lambda_j$, $j = 1, 2, \dots, m$ (modulo $2\pi$) from the high-velocity limit (5.44) known for all $\hat v, \hat w$.

We now give a precise definition of when a line $L(x,\hat v)$ goes through a hole of $K$. Take $r > 0$ such that $K \subset B_r^{\mathbb{R}^3}(0)$. Suppose that $L(x,\hat v) \subset \Lambda$ and $L(x,\hat v)\cap B_r^{\mathbb{R}^3}(0) \neq \emptyset$. We denote by $c(x,\hat v)$ the curve consisting of the segment $L(x,\hat v)\cap B_r^{\mathbb{R}^3}(0)$ and an arc on $\partial B_r^{\mathbb{R}^3}(0)$ that connects the points $L(x,\hat v)\cap\partial B_r^{\mathbb{R}^3}(0)$. We orient $c(x,\hat v)$ in such a way that the segment of straight line has the orientation of $\hat v$.

Definition 7.4. A line $L(x,\hat v) \subset \Lambda$ goes through a hole of $K$ if $L(x,\hat v)\cap B_r^{\mathbb{R}^3}(0) \neq \emptyset$ and $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} \neq 0$. Otherwise we say that $L(x,\hat v)$ does not go through a hole of $K$.
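Note that this characterization of lines that go or do not go through a hole of $K$ is independent of the $r$ that was used in the definition. This follows from the homotopic invariance of homology. See Theorem 11.2, p. 59 of [16]. In an intuitive sense, $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = 0$ means that $c(x,\hat v)$ is the boundary of a surface (actually of a chain) that is contained in $\Lambda$, and then it cannot go through a hole of $K$. Obviously, as $K \subset B_r^{\mathbb{R}^3}(0)$, if $L(x,\hat v)\cap B_r^{\mathbb{R}^3}(0) = \emptyset$ the line $L(x,\hat v)$ cannot go through a hole of $K$.

Definition 7.5. Two lines $L(x,\hat v), L(y,\hat w) \subset \Lambda$ that go through a hole of $K$ go through the same hole if $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = \pm[c(y,\hat w)]_{H_1(\Lambda;\mathbb{R})}$. Furthermore, we say that the lines go through the hole in the same direction if $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = [c(y,\hat w)]_{H_1(\Lambda;\mathbb{R})}$.

Lemma 7.6. Let $A, A_0 \in \mathcal{A}_\Phi(0)$ with $A_0$ short-range and let $\lambda$ be such that $A_0 = A + d\lambda$. Assume that $L(x,\hat v)$ and $L(y,\hat w)$ go through the same hole of $K$. Then,
\[
\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v) = \pm\Big(\int_{-\infty}^{\infty}\hat w\cdot A(y+\tau\hat w)\,d\tau + \lambda_\infty(\hat w) - \lambda_\infty(-\hat w)\Big), \tag{7.6}
\]
if $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = \pm[c(y,\hat w)]_{H_1(\Lambda;\mathbb{R})}$. Moreover,
\[
\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v) = \int_{-\infty}^{\infty}\hat v\cdot A_0(x+\tau\hat v)\,d\tau = \int_{c(x,\hat v)}A_0 = \int_{c(x,\hat v)}A. \tag{7.7}
\]
Proof. By (7.1) and Stokes' theorem,
\[
\begin{aligned}
\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v)
&= \int_{-\infty}^{\infty}\hat v\cdot A_0(x+\tau\hat v)\,d\tau = \int_{c(x,\hat v)}A_0 = \pm\int_{c(y,\hat w)}A_0 \\
&= \pm\Big(\int_{-\infty}^{\infty}\hat w\cdot A(y+\tau\hat w)\,d\tau + \lambda_\infty(\hat w) - \lambda_\infty(-\hat w)\Big).
\end{aligned}
\]
Lemma 7.7. Let $A, A_0 \in \mathcal{A}_\Phi(0)$ with $A_0$ short-range and let $\lambda$ be such that $A_0 = A + d\lambda$. Assume that $L(x,\hat v)$ does not go through a hole of $K$. Then,
\[
\int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v) = 0. \tag{7.8}
\]
Proof. If $L(x,\hat v)\cap B_r^{\mathbb{R}^3}(0) = \emptyset$ it follows from (7.1) and Stokes' theorem that (7.8) holds. Otherwise, $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = 0$, and then, by Stokes' theorem,
\[
\int_{c(x,\hat v)}A = 0.
\]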
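Take $z \in \partial B_r^{\mathbb{R}^3}(0)\cap c(x,\hat v)$ such that $L(z,\hat v)$ is tangent to $\partial B_r^{\mathbb{R}^3}(0)$. By the argument above,
\[
\int_{-\infty}^{\infty}\hat v\cdot A(z+\tau\hat v)\,d\tau + \lambda_\infty(\hat v) - \lambda_\infty(-\hat v) = 0.
\]
Finally, using once more Stokes' theorem we obtain that
\[
0 = \int_{c(x,\hat v)}A = \int_{-\infty}^{\infty}\hat v\cdot A(x+\tau\hat v)\,d\tau - \int_{-\infty}^{\infty}\hat v\cdot A(z+\tau\hat v)\,d\tau,
\]
and then (7.8) is proven.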
Remark 7.8. If $(x,\hat v) \in \Lambda\times S^2$, there are neighborhoods $B_x \subset \mathbb{R}^3$, $B_{\hat v} \subset S^2$ such that $(x,\hat v) \in B_x\times B_{\hat v}$ and, if $(y,\hat w) \in B_x\times B_{\hat v}$, the following is true: if $L(x,\hat v)$ does not go through a hole of $K$, then $L(y,\hat w)$ does not go through a hole of $K$ either. If $L(x,\hat v)$ goes through a hole of $K$, then $L(y,\hat w)$ goes through the same hole and in the same direction. This follows from the homotopic invariance of homology, Theorem 11.2, p. 59 of [16].

Definition 7.9. For any $\hat v \in S^2$ we denote by $\Lambda_{\hat v,out}$ the set of points $x \in \Lambda_{\hat v}$ such that $L(x,\hat v)$ does not go through a hole of $K$. We call this set the region without holes of $\Lambda_{\hat v}$. The holes of $\Lambda_{\hat v}$ is the set $\Lambda_{\hat v,in} := \Lambda_{\hat v}\setminus\Lambda_{\hat v,out}$. We define the following equivalence relation on $\Lambda_{\hat v,in}$. We say that $x\,R_{\hat v}\,y$ if and only if $L(x,\hat v)$ and $L(y,\hat v)$ go through the same hole and in the same direction. By $[x]$ we designate the classes of equivalence under $R_{\hat v}$. We denote by $\{\Lambda_{\hat v,h}\}_{h\in I}$ the partition of $\Lambda_{\hat v,in}$ given by this equivalence relation. It is defined as follows:
\[
I := \{[x]\}_{x\in\Lambda_{\hat v,in}}.
\]
Given $h \in I$ there is $x \in \Lambda_{\hat v,in}$ such that $h = [x]$. We denote
\[
\Lambda_{\hat v,h} := \{y \in \Lambda_{\hat v,in} : y\,R_{\hat v}\,x\}.
\]
Then
\[
\Lambda_{\hat v,in} = \cup_{h\in I}\Lambda_{\hat v,h}, \quad \Lambda_{\hat v,h_1}\cap\Lambda_{\hat v,h_2} = \emptyset, \quad h_1 \neq h_2.
\]
We call $\Lambda_{\hat v,h}$ the hole $h$ of $K$ in the direction of $\hat v$. Note that
\[
\{\Lambda_{\hat v,h}\}_{h\in I}\cup\{\Lambda_{\hat v,out}\} \tag{7.9}
\]
is an open disjoint cover of $\Lambda_{\hat v}$.

Definition 7.10. For any $\Phi$, $A \in \mathcal{A}_\Phi(0)$, $\hat v \in S^2$, and $h \in I$ we define
\[
F_h := \int_{c(x,\hat v)}A,
\]
where $x$ is any point in $\Lambda_{\hat v,h}$. Note that $F_h$ is independent of the $x \in \Lambda_{\hat v,h}$ that we choose. $F_h$ is the flux of the magnetic field over any surface (or chain) in $\mathbb{R}^3$ whose boundary is $c(x,\hat v)$. We call $F_h$ the magnetic flux on the hole $h$ of $K$.
Let us take $\phi_0 \in H_2(\mathbb{R}^3)$ with compact support in $\Lambda_{\hat v}$. Then, since (7.9) is a disjoint open cover of $\Lambda_{\hat v}$,
\[
\phi_0 = \sum_{h\in I}\varphi_h + \varphi_{out}, \tag{7.10}
\]
with $\varphi_h, \varphi_{out} \in H_2(\mathbb{R}^3)$, where $\varphi_h$ has compact support in $\Lambda_{\hat v,h}$, $h \in I$, and $\varphi_{out}$ has compact support in $\Lambda_{\hat v,out}$. The sum is finite because $\phi_0$ has compact support. We denote
\[
\phi_v := e^{imv\cdot x}\phi_0, \quad \varphi_{h,v} := e^{imv\cdot x}\varphi_h, \quad \varphi_{out,v} := e^{imv\cdot x}\varphi_{out}.
\]
Theorem 7.11. Suppose that $B \equiv 0$ and that $V$ satisfies Assumption 5.1. Then, for any $\Phi$ and any $A \in \mathcal{A}_\Phi(0)$,
\[
S(A,V)\,\phi_v = e^{-i(\lambda_\infty(\hat v)-\lambda_\infty(-\hat v))}\Big(\sum_{h\in I}e^{iF_h}\varphi_{h,v} + \varphi_{out,v}\Big) + O\Big(\frac{1}{v}\Big). \tag{7.11}
\]
Proof. The theorem follows from Theorem 5.7 and Lemmas 7.6, 7.7.

Corollary 7.12. Under the conditions of Theorem 7.11,
\[
\big(S(A,V)\,\phi_v,\ \varphi_{h,v}\big) = e^{-i(\lambda_\infty(\hat v)-\lambda_\infty(-\hat v))}\,e^{iF_h} + O\Big(\frac{1}{v}\Big), \quad h \in I, \tag{7.12}
\]
\[
\big(S(A,V)\,\phi_v,\ \varphi_{out,v}\big) = e^{-i(\lambda_\infty(\hat v)-\lambda_\infty(-\hat v))} + O\Big(\frac{1}{v}\Big). \tag{7.13}
\]
Moreover, the high-velocity limit of $S(A,V)$ in the direction $\hat v$ determines $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$ and the fluxes $F_h$, $h \in I$, modulo $2\pi$.

Proof. The corollary follows immediately from Theorem 7.11.

Remark 7.13. Equations (7.12, 7.13) are reconstruction formulae that allow us to reconstruct $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$ and the fluxes $F_h$, $h \in I$, modulo $2\pi$, from the high-velocity limit of the scattering operator in the direction $\hat v$. Recall that $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$ is independent of the particular short-range potential that we use to define $\lambda$. Remember also that given any $A \in \mathcal{A}_{\Phi,2\pi}(B)$ we can always find a potential in $\mathcal{A}_\Phi(B)$ with the same scattering operator. We can take, for example, $A_\Phi$. See Eq. (5.12). Then, there is no loss of generality in taking $A \in \mathcal{A}_\Phi(0)$.

Note that it is quite remarkable that we can determine $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$, since it is not a gauge invariant quantity. According to the standard interpretation of quantum mechanics only gauge invariant quantities are physically relevant. Note that if $A$ is short-range, $\lambda_\infty$ is constant. In this case $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v) \equiv 0$ and it drops out from all our formulae. We see that one possibility is to consider that only short-range potentials are physically admissible. This is consistent with the usual interpretation of quantum mechanics in three dimensions. However, we can also go beyond the standard interpretation of quantum mechanics and consider the class of long-range potentials $\mathcal{A}_\Phi(B)$ as physically admissible. This raises the interesting question of what is the physical significance of $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$.
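As an illustration of how (7.12) and (7.13) are used, the sketch below recovers $\lambda_\infty(\hat v)-\lambda_\infty(-\hat v)$ and $F_h$ modulo $2\pi$ from the two leading coefficients. It assumes that the $O(1/v)$ terms have already been discarded and that the coefficients have unit modulus; the numerical values are hypothetical.

```python
import cmath
import math

TWO_PI = 2 * math.pi

def recover(c_hole, c_out):
    """Return (delta_lambda, F_h) modulo 2*pi from the leading terms of (7.12), (7.13).

    c_out  ~ exp(-i * delta_lambda)              leading term of (7.13)
    c_hole ~ exp(-i * delta_lambda) * exp(i F_h) leading term of (7.12)
    """
    delta_lambda = (-cmath.phase(c_out)) % TWO_PI
    flux = cmath.phase(c_hole / c_out) % TWO_PI
    return delta_lambda, flux

# Hypothetical data: delta_lambda = 0.4 and F_h = 2.0 + 3 * (2*pi).
true_delta, true_flux = 0.4, 2.0 + 3 * TWO_PI
c_out = cmath.exp(-1j * true_delta)
c_hole = cmath.exp(-1j * true_delta) * cmath.exp(1j * true_flux)
delta_rec, flux_rec = recover(c_hole, c_out)
```

The recovered flux is `2.0`, not `2.0 + 6π`: the integer multiple of $2\pi$ is invisible in the scattering data, in line with the "modulo $2\pi$" in Corollary 7.12.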
Example 7.14. Here we consider a simple example where we give an explicit description of the holes. Furthermore, the fluxes of the holes are the fluxes of the magnetic field over cross sections of the tori. We reconstruct all the fluxes modulo $2\pi$ and we also determine the cohomology class of the magnetic potential modulo $2\pi$, from the high-velocity limit of the scattering operator in only one direction. Given a vector $z \in \mathbb{R}^3$ and $a > b > 0$ we denote by $T(z,a,b)$ the following set:
\[
T(z,a,b) := \big\{z + a(\cos\theta, \sin\theta, 0) + b\big(x(\cos\theta, \sin\theta, 0) + y(0,0,1)\big) : \theta \in [0,2\pi],\ (x,y) \in B_1^{\mathbb{R}^2}(0)\big\}.
\]
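The parametrization of $T(z,a,b)$ can be sketched in code as follows. The helper names are hypothetical, and the sketch admits points with $x^2+y^2 \le 1$ (treating the disc as closed is an assumption made here). It checks that each point of the parametrization lies within distance $b$ of the core circle $z + a(\cos t, \sin t, 0)$.

```python
import math

def torus_point(z, a, b, theta, x, y):
    """Point z + a*e_r + b*(x*e_r + y*e_3), with e_r = (cos theta, sin theta, 0)."""
    if x * x + y * y > 1.0:
        raise ValueError("(x, y) must lie in the unit disc")
    e_r = (math.cos(theta), math.sin(theta), 0.0)
    radial = a + b * x  # positive because a > b > 0 and |x| <= 1
    return (z[0] + radial * e_r[0], z[1] + radial * e_r[1], z[2] + b * y)

def distance_to_core_circle(p, z, a):
    """Distance from p to the circle {z + a (cos t, sin t, 0)}."""
    rho = math.hypot(p[0] - z[0], p[1] - z[1])
    return math.hypot(rho - a, p[2] - z[2])

z0, a0, b0 = (1.0, 2.0, 3.0), 2.0, 0.5
boundary_pt = torus_point(z0, a0, b0, 0.8, 0.6, -0.8)  # x^2 + y^2 = 1
interior_pt = torus_point(z0, a0, b0, 2.1, 0.3, 0.4)   # x^2 + y^2 = 0.25
```

A point with $x^2+y^2 = s^2$ sits at distance $b\,s$ from the core circle, so `boundary_pt` is at distance `0.5` and `interior_pt` at distance `0.25`.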
The map $F_{z,a,b} : T \to T(z,a,b)$ given by
\[
F_{z,a,b}\big((\cos\theta, \sin\theta), (x,y)\big) \mapsto z + a(\cos\theta, \sin\theta, 0) + b\big(x(\cos\theta, \sin\theta, 0) + y(0,0,1)\big)
\]
is a diffeomorphism.

The obstacle. We now define the obstacle $K$. We assume that $v = (0,0,1)$. As before the connected components of $K$ are $K_j$, $j = 1, 2, \dots, L$. Let us denote $J = \{1, 2, \dots, m\}$ and $I = \{m+1, \dots, L\}$. If $m = L$, then $I = \emptyset$. We assume that $K$ satisfies the following assumptions:

1. There are vectors $z_j \in \mathbb{R}^3$ and numbers $a_j > b_j$, $j = 1, 2, \dots, m$, such that
\[
K_j = T(z_j, a_j, b_j),\ \forall j \in J, \qquad K_j \cong B_1^{\mathbb{R}^3}(0),\ j \in I.
\]
2.
\[
\big(\mathrm{convex}(K_j) + \mathbb{R}v\big)\cap\big(\mathrm{convex}(K_l) + \mathbb{R}v\big) = \emptyset,\ j, l \in J, \qquad \big(\mathrm{convex}(K_j) + \mathbb{R}v\big)\cap\big(K_l + \mathbb{R}v\big) = \emptyset,\ j \in J,\ l \in I.
\]
We denote as before by $\mathrm{convex}(\cdot)$ the convex hull of the indicated set.

The curves $\gamma_j, \tilde\gamma_j, \hat\gamma_j$. Let $\theta_j$ be such that $z_j = r_j(\cos\theta_j, \sin\theta_j, 0) + (0, 0, (z_j)_3)$. The curves $\gamma_j$, $j \in J$, are given by
\[
\gamma_j(t) := z_j + a_j(\cos t, \sin t, 0),
\]
and the curves $\tilde\gamma_j$, $j \in J$, are
\[
\tilde\gamma_j := z_j + a_j(\cos\theta_j, \sin\theta_j, 0) + b_j\big(\cos t\,(\cos\theta_j, \sin\theta_j, 0) + \sin t\,(0,0,1)\big).
\]
Furthermore, the curves $\hat\gamma_j$, $j \in J$, are
\[
\hat\gamma_j := z_j + a_j(\cos\theta_j, \sin\theta_j, 0) + (b_j + \delta/2)\big(\cos t\,(\cos\theta_j, \sin\theta_j, 0) + \sin t\,(0,0,1)\big),
\]
where $\delta > 0$ is so small that $\delta < a_j - b_j$, and
\[
\big(\mathrm{convex}(K_{j,\delta}) + \mathbb{R}v\big)\cap\big(\mathrm{convex}(K_{l,\delta}) + \mathbb{R}v\big) = \emptyset,\ j, l \in J, \qquad \big(\mathrm{convex}(K_{j,\delta}) + \mathbb{R}v\big)\cap\big(K_{l,\delta} + \mathbb{R}v\big) = \emptyset,\ j \in J,\ l \in I.
\]
The subindex $\delta$ denotes the set of points that are at distance up to $\delta$ of the indicated set.

The flux $\Phi$. We define the following sets:
\[
h_j := \big\{z_j + t(\cos\theta, \sin\theta) : \theta \in [0,2\pi],\ t \in [0, a_j - b_j)\big\} + \mathbb{R}\hat v, \quad j \in J.
\]
We have that
\[
[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = [c(y,\hat v)]_{H_1(\Lambda;\mathbb{R})}, \quad \forall x, y \in h_j,\ j \in J. \tag{7.14}
\]
Since $c(x,\hat v)$ and $c(y,\hat v)$ are homotopic in $\Lambda$, this follows from the homotopic invariance of homology, see Theorem 11.2, p. 59 of [16] (the curves $c(x,\hat v)$ and $c(y,\hat v)$ are defined previously in this section). Then, we can associate a flux $\Phi_j$ to each $h_j$, $j \in J$, as follows:
\[
\Phi_j = \int_{c(x,\hat v)}A, \quad \text{for some } x \in h_j,\ j \in J.
\]
We have that
\[
[c(y,\hat v)]_{H_1(\Lambda;\mathbb{R})} = 0, \quad \forall y \in \big(\Lambda_{\hat v}\setminus\cup_{j\in J}h_j\big)\cap\big(B_r^{\mathbb{R}^3}(0) + \mathbb{R}\hat v\big), \tag{7.15}
\]
where the radius $r$ is the one taken to define the curves $c(y,\hat v)$. Let us prove this. As the segment of straight line in $c(y,\hat v)$ does not belong to any of the sets $\mathrm{convex}(K_j) + \mathbb{R}\hat v$, $j \in J$, we have that for any $j \in J$ there is a surface (or a chain) $\sigma_j$ contained in the complement of $K_j$ such that $\partial\sigma_j = c(y,\hat v)$. Let $\big\{[G^{(j)}]_{H^1_{de\,R}(\Lambda)}\big\}_{j=1}^{m}$ be the basis of $H^1_{de\,R}(\Lambda)$ constructed in Proposition 2.3. Then, as $dG^{(j)} = 0$, it follows from Stokes' theorem that
\[
\int_{c(y,\hat v)}G^{(j)} = 0, \quad \forall j \in J.
\]
Hence, (7.15) follows from de Rham's Theorem, Theorem 4.17, p. 154 of [41]. Let us prove now that
\[
\Phi_j = \Phi(\hat\gamma_j), \quad j \in J. \tag{7.16}
\]
For any $j \in J$ we define
\[
x_j := z_j + a_j(\cos\theta_j, \sin\theta_j, 0) - (b_j + \delta/2)(\cos\theta_j, \sin\theta_j, 0), \quad y_j := z_j + a_j(\cos\theta_j, \sin\theta_j, 0) + (b_j + \delta/2)(\cos\theta_j, \sin\theta_j, 0).
\]
We choose the curves $c(x_j,\hat v)$, $c(y_j,\hat v)$ in such a way that the arc in $c(y_j,\hat v)$ is contained in the arc in $c(x_j,\hat v)$. Let $c_j$ be the curve obtained by taking the segments of straight line in $c(x_j,\hat v)$ and in $c(y_j,\hat v)$ and the two arcs that are obtained by cutting from the arc in $c(x_j,\hat v)$ the arc in $c(y_j,\hat v)$. We orient $c_j$ in such a way that the segment of straight line in $c(x_j,\hat v)$ has the orientation of $\hat v$. Then, in homology,
\[
[c_j]_{H_1(\Lambda;\mathbb{R})} = [c(x_j,\hat v)]_{H_1(\Lambda;\mathbb{R})} - [c(y_j,\hat v)]_{H_1(\Lambda;\mathbb{R})}. \tag{7.17}
\]
This follows from de Rham's Theorem (Theorem 4.17, p. 154 of [41]), since for any closed 1-form $D$,
\[
\int_{c_j}D = \int_{c(x_j,\hat v)}D - \int_{c(y_j,\hat v)}D.
\]
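The curves $\hat\gamma_j$ and $c_j$ are homotopically equivalent in $\Lambda$. Hence, by the homotopical invariance of homology, Theorem 11.2, p. 59 of [16],
$$[c_j]_{H_1(\Lambda;\mathbb{R})} = [\hat\gamma_j]_{H_1(\Lambda;\mathbb{R})}. \tag{7.18}$$
Then, by (7.15, 7.17),
$$[\hat\gamma_j]_{H_1(\Lambda;\mathbb{R})} = [c(x_j,\hat v)]_{H_1(\Lambda;\mathbb{R})}, \tag{7.19}$$
and hence,
$$\int_{c(x_j,\hat v)} A = \int_{\hat\gamma_j} A, \quad j \in J,$$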
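which proves (7.16).

The holes of K. Recall that $\Lambda_{\hat v,\mathrm{out}}$ and $\Lambda_{\hat v,\mathrm{in}}$ were defined in Definition 7.9, that the holes of $K$ are the sets $\Lambda_{\hat v,h}$, $h \in I$, that $F_h$ is the flux over the hole $\Lambda_{\hat v,h}$, $h \in I$, and that $\Lambda_{\hat v,\mathrm{in}} = \cup_{h\in I} \Lambda_{\hat v,h}$. Then, we have that
1. The index set $I$ can be taken as $I = \{h_j\}_{j\in J} \sim J$. Moreover, denoting $\Lambda_{\hat v,j} = \Lambda_{\hat v,h_j}$, we have that $\Lambda_{\hat v,j} = h_j$ and $\Lambda_{\hat v,\mathrm{in}} = \cup_{j\in J} h_j$.
2. We designate $F_j := F_{h_j}$. Then $F_j = \Phi(\hat\gamma_j)$, $j \in J$.
3. $\Lambda_{\hat v,\mathrm{out}} = \mathbb{R}^3 \setminus \Big(\Lambda_{\hat v,\mathrm{in}} \cup \bigcup_{j=1}^{L} (K_j + \hat v\,\mathbb{R})\Big)$.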
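Let us prove this. By (7.15), $[c(y,\hat v)]_{H_1(\Lambda;\mathbb{R})} = 0$, $\forall y \in \Lambda_{\hat v} \setminus \cup_{j\in J} h_j$ such that $B_r^{\mathbb{R}^3}(0) \cap L(y,\hat v) \neq \emptyset$. Then
$$\Big(\Lambda_{\hat v} \setminus \cup_{j\in J} h_j\Big) \cap \Big(B_r^{\mathbb{R}^3}(0) + \mathbb{R}\hat v\Big) \subset \Lambda_{\hat v,\mathrm{out}}.$$
But by our definition the complement in $\Lambda_{\hat v}$ of $B_r^{\mathbb{R}^3}(0) + \mathbb{R}\hat v$ is contained in $\Lambda_{\hat v,\mathrm{out}}$. It follows that
$$\Lambda_{\hat v} \setminus \cup_{j\in J} h_j \subset \Lambda_{\hat v,\mathrm{out}}.$$
Moreover, if $x \in h_j$ for some $j \in J$, since $[\hat\gamma_j]_{H_1(\Lambda;\mathbb{R})} \neq 0$, it follows from (7.14) and (7.19) that $[c(x_j,\hat v)]_{H_1(\Lambda;\mathbb{R})} = [c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} \neq 0$, and then $x \notin \Lambda_{\hat v,\mathrm{out}}$. Then, we have proven that
$$\Lambda_{\hat v} \setminus \cup_{j\in J} h_j = \Lambda_{\hat v,\mathrm{out}}, \tag{7.20}$$
and hence, $\Lambda_{\hat v,\mathrm{in}} = \cup_{j\in J} h_j$. Item 3 is now obvious. By (7.14), if $x, y \in h_j$, then $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} = [c(y,\hat v)]_{H_1(\Lambda;\mathbb{R})}$. Hence, $x R_{\hat v} y$, which implies that $h_j$ is contained in some hole of $K$. But by (7.14) and (7.19), if $x \in h_j$, $y \in h_l$, $j \neq l$, then $[c(x,\hat v)]_{H_1(\Lambda;\mathbb{R})} \neq [c(y,\hat v)]_{H_1(\Lambda;\mathbb{R})}$ because, as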
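the $[\hat\gamma_j]_{H_1(\Lambda;\mathbb{R})}$, $j \in J$, are a basis of $H_1(\Lambda;\mathbb{R})$, they are different. In consequence, $x$ and $y$ belong to different holes of $K$. Then, since (7.20) holds, we have proven Item 1. Item 2 follows from (7.16).

By Corollary 7.12 and Remark 7.13, this proves that from the high-velocity limit of $S(A,V)$ in the direction of $\hat v$ we reconstruct all the fluxes $\Phi(\hat\gamma_j)$, $j \in J$, modulo $2\pi$. Let us now prove that from the high-velocity limit of $S(A,V)$ in the direction of $\hat v$ we also reconstruct the cohomology class $[A]_{H^1_{\mathrm{de\,R}}(\Lambda)}$ modulo $2\pi$, in the sense that we reconstruct modulo $2\pi$ the coefficients of $[A]_{H^1_{\mathrm{de\,R}}(\Lambda)}$ in any basis of $H^1_{\mathrm{de\,R}}(\Lambda)$.

Let $\{[M_j]_{H^1_{\mathrm{de\,R}}(\Lambda)}\}_{j=1}^{m}$ be any basis of $H^1_{\mathrm{de\,R}}(\Lambda)$ and let $\{[\Gamma_j]_{H_1(\Lambda;\mathbb{R})}\}_{j=1}^{m}$ be the dual basis of $H_1(\Lambda;\mathbb{R})$ given by de Rham's Theorem,
$$\int_{\Gamma_j} M_l = \delta_{j,l}, \quad j, l \in J.$$
Let $\{\alpha_j\}_{j\in J}$ be the expansion coefficients of $A$,
$$[A]_{H^1_{\mathrm{de\,R}}(\Lambda)} = \sum_{j\in J} \alpha_j [M_j]_{H^1_{\mathrm{de\,R}}(\Lambda)}, \qquad \alpha_j = \int_{\Gamma_j} A.$$
By Proposition 10.1, $\{[\hat\gamma_j]_{H_1(\Lambda;\mathbb{Z})}\}_{j=1}^{m}$ is a basis of $H_1(\Lambda;\mathbb{Z})$. Then,
$$[\Gamma_j]_{H_1(\Lambda;\mathbb{Z})} = \sum_{l\in J} n(j,l)\,[\hat\gamma_l]_{H_1(\Lambda;\mathbb{Z})},$$
where the coefficients $n(j,l)$ are integers. Finally,
$$\alpha_j = \int_{\Gamma_j} A = \sum_{l\in J} n(j,l) \int_{\hat\gamma_l} A = \sum_{l\in J} n(j,l)\,\Phi(\hat\gamma_l), \quad j \in J,$$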
and since we have already determined the $\Phi(\hat\gamma_l)$ modulo $2\pi$, the coefficients $\alpha_j$, $j \in J$, are determined modulo $2\pi$.

8. The Tonomura et al. Experiments

The fundamental experiments of Tonomura et al. [37,38] gave conclusive evidence of the existence of the Aharonov-Bohm effect. For a detailed account see [30]. Tonomura et al. [37,38] did their experiments in the case of toroidal magnets. This corresponds to our Example 7.14 with only one torus, i.e., L = 1, J = {1}. In very careful and precise experiments they managed to superimpose two electron beams behind the toroidal magnet. One of them traveled inside the hole of the toroidal magnet and the other, the reference beam, outside it. They measured the interference fringes between the two beams produced by the magnetic flux inside the torus. We show now that our results give a rigorous mathematical proof that quantum mechanics predicts the interference fringes observed by Tonomura et al. [37,38] in their remarkable experiments.
An equivalent description of these experiments is to consider that both electron beams traveled inside the hole of the torus, one of them with a nonzero magnetic flux inside the torus, and the other, the reference beam, with the magnetic flux inside the torus set to zero. Note that it follows from Theorem 7.11 that particles that go outside the holes only feel the long-range part of the potential, given by the factor $e^{-i(\lambda_\infty(\hat v) - \lambda_\infty(-\hat v))}$, for large velocities. Therefore, we can model the particles that go outside the hole (in the Tonomura et al. experiments [37,38]) by particles that go inside the hole when the fluxes $F_h$ are equal to zero and that feel the same long-range effect $e^{-i(\lambda_\infty(\hat v) - \lambda_\infty(-\hat v))}$ for large velocities. As in this model long-range magnetic potentials add a global constant phase that does not affect the interference pattern, we take, for simplicity, a short-range magnetic potential. According to Theorem 7.11, for the particle that goes inside the hole with the magnetic flux present, up to an error of order $1/v$, we have that
$$S(A,V)\,\phi_v = e^{i\frac{q\Phi}{\hbar c}}\,\phi_v, \tag{8.1}$$
where we have taken physical units, with $\Phi$ the flux of the physical magnetic field $B$ and $\phi_v = e^{i\frac{M}{\hbar} v\cdot x}\phi_0$. See Sect. 4. For the particle that goes outside the hole of the magnet, or equivalently inside the hole with the magnetic field set to zero,
$$S(A,V)\,\phi_v = \phi_v. \tag{8.2}$$
If we superimpose both asymptotic states we obtain the wave function
$$\Big(1 + e^{i\frac{q\Phi}{\hbar c}}\Big)\phi_v, \tag{8.3}$$
up to an error of order $1/v$. This shows the interference patterns that were observed experimentally by Tonomura et al. [37,38]. For example, if $\frac{q\Phi}{\hbar c}$ is an odd multiple of $\pi$ there is destructive interference and there is a dark zone behind the hole of the magnet, as observed experimentally. Tonomura et al. [37,38] also considered the case when the reference beam is slightly tilted. In this case the reference beam is given by
$$\phi_{v+v_0} = e^{i\frac{M}{\hbar} v_0\cdot x}\,\phi_v,$$
and (8.2) is replaced by
$$S(A,V)\,\phi_{v+v_0} = \phi_{v+v_0} = e^{i\frac{M}{\hbar} v_0\cdot x}\,\phi_v.$$
In this case we obtain the wave function
$$e^{i\frac{M}{\hbar} v_0\cdot x}\Big(1 + e^{-i\frac{M}{\hbar} v_0\cdot x}\, e^{i\frac{q\Phi}{\hbar c}}\Big)\phi_v,$$
up to an error of order $1/v$. We see that the factor
$$1 + e^{-i\frac{M}{\hbar} v_0\cdot x}\, e^{i\frac{q\Phi}{\hbar c}}$$
produces the parallel fringes that were observed experimentally by Tonomura et al. [37,38].
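The interference factors discussed in this section can be explored numerically. The following Python sketch is only an illustration and not part of the original argument: the function names are hypothetical, `phase` stands for the dimensionless Aharonov-Bohm phase (the flux term in the exponent of (8.1)), and `tilt_phase` stands for the dimensionless product of the tilt velocity $v_0$ with the position $x$ appearing in the tilted reference beam.

```python
import cmath
import math


def intensity_factor(phase):
    """Squared modulus of 1 + exp(i*phase), the factor multiplying phi_v in (8.3).

    `phase` is an assumed dimensionless stand-in for the Aharonov-Bohm phase.
    """
    return abs(1 + cmath.exp(1j * phase)) ** 2


def tilted_intensity_factor(phase, tilt_phase):
    """Squared modulus of 1 + exp(-i*tilt_phase) * exp(i*phase).

    `tilt_phase` is an assumed dimensionless stand-in for the tilt term v0.x;
    varying it along x moves through the parallel fringes.
    """
    return abs(1 + cmath.exp(-1j * tilt_phase) * cmath.exp(1j * phase)) ** 2


# Odd multiple of pi: destructive interference (dark zone behind the hole).
dark = intensity_factor(math.pi)
# Zero phase: fully constructive interference.
bright = intensity_factor(0.0)
# With a tilted reference beam the intensity oscillates with the tilt phase.
fringes = [tilted_intensity_factor(math.pi / 3, 0.1 * k) for k in range(100)]
```

Since $|1 + e^{i\theta}|^2 = 2 + 2\cos\theta$, the tilted case depends only on the difference of the two phases, which is why the fringes are parallel and are merely shifted by the flux.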
9. Appendix A

In this Appendix we prove, for the reader's convenience, that $H_s(kT; R) = 0$, $s \geq 2$, that $H_1(kT;\mathbb{R}) \cong \oplus_{i=1}^{k}\mathbb{R}$, and that $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{R})$. Here $R$ is $\mathbb{Z}$ or $\mathbb{R}$. Recall that we defined $\gamma_\pm : [0,1] \to T$, $\gamma_\pm(t) = (e^{\pm 2\pi i t}, 0, 0)$.

Proposition 9.1. $H_s(T; R) = 0$, $s \geq 2$, and $H_1(T;\mathbb{Z}) \cong \mathbb{Z}$, and each of $[\gamma_\pm]_{H_1(T;\mathbb{Z})}$ is a basis of $H_1(T;\mathbb{Z})$.

Proof. We define $\tilde\gamma_\pm : [0,1] \to S^1$, $\tilde\gamma_\pm(t) := e^{\pm 2\pi i t}$, and let $I_{S^1} : S^1 \to T$ be the inclusion given by $I_{S^1}(s) := (s, 0, 0)$. Clearly, $I_{S^1} \circ \tilde\gamma_\pm = \gamma_\pm$. It is easy to see that $S^1$ is homotopically equivalent to $T$ and that the inclusion $I_{S^1} : S^1 \to T$ is a homotopic equivalence. It follows that $I_{S^1}$ induces an isomorphism in homology given by $H_s(I_{S^1})$ (see Theorem 11.3, p. 59 of [16]). Then, $H_s(T; R) \cong H_s(S^1; R)$ and hence, we have that $H_s(T; R) = 0$, $s \geq 2$, by Corollary 15.5, p. 84 of [16]. For $s = 1$ and $R = \mathbb{Z}$, the isomorphism is given in the following way (see p. 49 of [16]). Let $\sigma_i : [0,1] \to S^1$ be continuous functions and let $n_i \in \mathbb{Z}$. Let us assume that $\sum n_i \sigma_i$ is a cycle (its boundary is zero). Then,
$$H_1(I_{S^1})\Big[\sum n_i \sigma_i\Big]_{H_1(S^1;\mathbb{Z})} := \Big[\sum n_i\, I_{S^1} \circ \sigma_i\Big]_{H_1(T;\mathbb{Z})}.$$
As $I_{S^1} \circ \tilde\gamma_\pm = \gamma_\pm$, it follows that $H_1(I_{S^1})[\tilde\gamma_\pm]_{H_1(S^1;\mathbb{Z})} = [\gamma_\pm]_{H_1(T;\mathbb{Z})}$. Then, to prove the proposition it is enough to prove that $H_1(S^1;\mathbb{Z}) \cong \mathbb{Z}$ and that each of $[\tilde\gamma_\pm]_{H_1(S^1;\mathbb{Z})}$ is a basis of $H_1(S^1;\mathbb{Z})$. By Theorem 12.1, p. 63 of [16], there is a homomorphism $\pi_1(S^1; 1) \to H_1(S^1;\mathbb{Z})$ that sends a homotopy class to its homology class. In our case this homomorphism is an isomorphism since $\pi_1(S^1; 1)$ is abelian. Actually, $\pi_1(S^1; 1) \cong \mathbb{Z}$. See Theorem 4.4, p. 17 of [16]. Then, $\mathbb{Z} \cong \pi_1(S^1; 1) \cong H_1(S^1;\mathbb{Z})$. To prove that each of $[\tilde\gamma_\pm]_{H_1(S^1;\mathbb{Z})}$ is a basis of $H_1(S^1;\mathbb{Z})$ it is enough to prove that each of $[\tilde\gamma_\pm]_{\pi_1(S^1;1)}$ is a basis of $\pi_1(S^1; 1)$. The isomorphism $\pi_1(S^1; 1) \to \mathbb{Z}$ is given (see Theorem 4.4, p. 17 of [16]) as follows. Given a path $\sigma$ with $[\sigma]_{\pi_1(S^1;1)} \in \pi_1(S^1; 1)$, let $\bar\sigma : [0,1] \to \mathbb{R}$ satisfy $\bar\sigma(0) = 0$ and $e^{2\pi i \bar\sigma(t)} = \sigma(t)$. Then, $[\sigma]_{\pi_1(S^1;1)}$ is sent to $\bar\sigma(1)$.
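The lifting construction just described can be illustrated numerically. The Python sketch below is not part of the proof; the function name `winding_number` and the sampling scheme are assumptions made for illustration. It approximates the endpoint of the lift of a loop in the unit circle by accumulating small phase increments between consecutive sample points, which is valid only when the sampling is fine enough that each increment stays well inside $(-\pi, \pi)$.

```python
import cmath
import math


def winding_number(loop, samples=1000):
    """Approximate the endpoint of the lift of `loop`, where the lift starts at 0.

    `loop` maps [0, 1] to the unit circle in the complex plane and is assumed
    to be closed, so that the endpoint of the lift is an integer.
    """
    lift = 0.0                      # value of the lift at t = 0
    previous = loop(0.0)
    for k in range(1, samples + 1):
        current = loop(k / samples)
        # Phase increment between neighbouring samples, measured in turns.
        lift += cmath.phase(current / previous) / (2.0 * math.pi)
        previous = current
    return round(lift)


def gamma_plus(t):
    """The loop t -> exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def gamma_minus(t):
    """The loop t -> exp(-2*pi*i*t)."""
    return cmath.exp(-2j * math.pi * t)


n_plus = winding_number(gamma_plus)
n_minus = winding_number(gamma_minus)
```

For these two loops the computed values are the generators of the integers, in agreement with the argument that follows.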
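In our case, if we take $\bar{\tilde\gamma}_\pm(t) := \pm t$, then $\bar{\tilde\gamma}_\pm(0) = 0$ and $\bar{\tilde\gamma}_\pm(1) = \pm 1$. It follows that $[\tilde\gamma_\pm]_{\pi_1(S^1;1)}$ is sent to $\pm 1$. As each of $\pm 1$ is a basis of $\mathbb{Z}$, it follows that each of $[\tilde\gamma_\pm]_{\pi_1(S^1;1)}$ is a basis of $\pi_1(S^1; 1)$, and this concludes the proof that each of $[\gamma_\pm]_{H_1(T;\mathbb{Z})}$ is a basis of $H_1(T;\mathbb{Z})$.

Proposition 9.2. For $s \geq 2$, $H_s(kT; R) = 0$. Furthermore, $H_1(kT;\mathbb{Z}) \cong \oplus_{i=1}^{k}\mathbb{Z}$, and $\{[Z_j]_{H_1(kT;\mathbb{Z})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{Z})$.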
Proof. We prove the proposition by induction in $k$. For $k = 1$, $Z_1 = \gamma_+$ and the result follows from Proposition 9.1. Let us assume that $H_s((k-1)T; R) \cong \oplus_{i=1}^{k-1} R$ and that $\{[Z_j]_{H_1((k-1)T;\mathbb{Z})}\}_{j=1}^{k-1}$ is a basis of $H_1((k-1)T;\mathbb{Z})$. Let $X_1$ and $X_2$ be open subsets of $kT$ such that $\cup_{j\leq k-1} l_j(T) \subseteq X_1$, $l_k(T) \subseteq X_2$, $X_1 \simeq \cup_{j\leq k-1} l_j(T) \approx (k-1)T$, $X_2 \simeq l_k(T) \approx T$, and $X_1 \cap X_2$ is contractible, i.e. $X_1 \cap X_2 \simeq$ a single point. The symbol $\simeq$ means homotopic equivalence and $\approx$ means homeomorphism. By Example 17.1, p. 98 of [16], $(kT, X_1, X_2)$ is an exact triad and we can apply the sequence of Mayer-Vietoris (17.7 p. 99 and 17.9 p. 100 of [16]),
$$H_s(X_1 \cap X_2; R) \to H_s(X_1; R) \oplus H_s(X_2; R) \to H_s(kT; R) \to H_{s-1}(X_1 \cap X_2; R).$$
As $X_1 \cap X_2$ is homotopically equivalent to a point (that we denote by $\{*\}$) we have that $H_s(X_1 \cap X_2; R) \cong H_s(\{*\}; R) = 0$, $H_{s-1}(X_1 \cap X_2; R) \cong H_{s-1}(\{*\}; R) = 0$ (see Theorem 11.3, p. 59, Example 9.4, p. 47 and Example 9.7, p. 48 of [16]). Hence, we obtain the isomorphism
$$H_s(X_1; R) \oplus H_s(X_2; R) \to H_s(kT; R). \tag{9.1}$$
This isomorphism is given by (see 17.4, p. 99 of [16])
$$\big([c_1]_{H_s(X_1;R)}, [c_2]_{H_s(X_2;R)}\big) \mapsto -[c_1]_{H_s(kT;R)} + [c_2]_{H_s(kT;R)}. \tag{9.2}$$
As $\cup_{j\leq k-1} l_j(T) \simeq X_1$, $l_k(T) \simeq X_2$, the inclusions $\cup_{j\leq k-1} l_j(T) \to X_1$, $l_k(T) \to X_2$ induce isomorphisms in homology (see Theorem 11.3, p. 59 of [16]). We have, then, the following isomorphisms:
$$H_s((k-1)T; R) \xrightarrow{\ \cong\ } H_s(\cup_{j\leq k-1} l_j(T); R) \xrightarrow{\ \cong\ } H_s(X_1; R), \tag{9.3}$$
$$H_s(T; R) \xrightarrow{\ \cong\ } H_s(l_k(T); R) \xrightarrow{\ \cong\ } H_s(X_2; R). \tag{9.4}$$
By our induction hypothesis and (9.1, 9.2, 9.3, 9.4), $H_s(kT; R) \cong \oplus_{i=1}^{k} R$. Hence, by Proposition 9.1, $H_s(kT; R) = 0$, $s \geq 2$. Moreover, by the induction hypothesis and (9.3), it also follows that $\{[Z_j]_{H_1(X_1;\mathbb{Z})}\}_{j=1}^{k-1}$ is a basis of $H_1(X_1;\mathbb{Z})$. By Proposition 9.1 and (9.4), $H_1(X_2;\mathbb{Z}) \cong \mathbb{Z}$; furthermore, by the definition of $Z_k$ (see (2.2)) and as the homeomorphism $l_k : T \to l_k(T)$ induces an isomorphism in homology, it follows from Proposition 9.1 that $[Z_k]_{H_1(l_k(T);\mathbb{Z})}$ is a basis of $H_1(l_k(T);\mathbb{Z})$ and then, by (9.4), it is also a basis of $H_1(X_2;\mathbb{Z})$. Finally, it follows from (9.2) that $H_1(kT;\mathbb{Z}) \cong \oplus_{i=1}^{k}\mathbb{Z}$ and $\{[Z_j]_{H_1(kT;\mathbb{Z})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{Z})$.

Proposition 9.3. $H_1(kT;\mathbb{R}) \cong \oplus_{i=1}^{k}\mathbb{R}$ and $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{R})$.
Proof. The homology group $H_1(kT;\mathbb{R})$ is a module over the ring $\mathbb{R}$, i.e. it is a vector space (p. 47 of [16]). If $G$ is an abelian group we can also define the homology groups as in p. 153 of [17]. In this case $H_1(kT; G)$ is a group. As $\mathbb{R}$ is a group and a ring we can define the homology groups as modules and as groups. To differentiate them we will denote by $H_1(kT;\mathbb{R})$ the homology module considering $\mathbb{R}$ as a ring, and by $\tilde H_1(kT;\mathbb{R})$ the homology group considering $\mathbb{R}$ as a group. Actually, $H_1(kT;\mathbb{R})$ and $\tilde H_1(kT;\mathbb{R})$ are equal as sets and as groups. By the theorem of universal coefficients (Corollary 3A.4, p. 264 of [17]) there is the exact sequence
$$0 \to \tilde H_s(kT;\mathbb{Z}) \otimes \mathbb{R} \to \tilde H_s(kT;\mathbb{R}) \to \mathrm{Tor}(\tilde H_{s-1}(kT;\mathbb{Z}), \mathbb{R}) \to 0.$$
As $\mathbb{R}$ is torsion free, $\mathrm{Tor}(\tilde H_{s-1}(kT;\mathbb{Z}), \mathbb{R}) = 0$. See Proposition 3A.5, p. 265 of [17]. In consequence,
$$\tilde H_s(kT;\mathbb{Z}) \otimes \mathbb{R} \xrightarrow{\ \cong\ } \tilde H_s(kT;\mathbb{R}). \tag{9.5}$$
The isomorphism, $I$, is given as follows. Let $\sigma$ be a singular simplex and take $r \in \mathbb{R}$. Then, $I([\sigma] \otimes r) = [r\sigma]$.
See Eq. (iv) and Lemma 3A.1, pp. 261, 262 of [17]. By Proposition 9.2, $H_1(kT;\mathbb{Z}) \cong \oplus_{i=1}^{k}\mathbb{Z}$ and $\{[Z_j]_{H_1(kT;\mathbb{Z})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{Z})$. Then
$$\oplus_{i=1}^{k}\mathbb{R} \cong \tilde H_1(kT;\mathbb{Z}) \otimes \mathbb{R}.$$
The isomorphism is given by
$$\oplus_{j=1}^{k}\mathbb{R} \to \oplus_{j=1}^{k}(\mathbb{Z}\otimes\mathbb{R}) \to (\oplus_{j=1}^{k}\mathbb{Z})\otimes\mathbb{R} \to \tilde H_1(kT;\mathbb{Z})\otimes\mathbb{R},$$
$$(r_1,\dots,r_k) \mapsto (1\otimes r_1,\dots,1\otimes r_k) \mapsto (1,0,\dots,0)\otimes r_1 + \cdots + (0,0,\dots,1)\otimes r_k \mapsto \sum_{j=1}^{k} [Z_j]_{\tilde H_1(kT;\mathbb{Z})} \otimes r_j.$$
It follows that the morphism
$$I : \oplus_{j=1}^{k}\mathbb{R} \to \tilde H_1(kT;\mathbb{R}), \qquad I((r_1,\dots,r_k)) := \sum_{j=1}^{k} [r_j Z_j]_{\tilde H_1(kT;\mathbb{R})},$$
is an isomorphism of groups. We now prove that this implies that $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ is a basis of $H_1(kT;\mathbb{R})$ as a vector space. As $\tilde H_1(kT;\mathbb{R})$ and $H_1(kT;\mathbb{R})$ are equal as sets and as groups, the morphism
$$I : \oplus_{j=1}^{k}\mathbb{R} \to H_1(kT;\mathbb{R}), \qquad I((r_1,\dots,r_k)) := \sum_{j=1}^{k} [r_j Z_j]_{H_1(kT;\mathbb{R})},$$
is an isomorphism of groups. By the structure of vector space of $H_1(kT;\mathbb{R})$ we have that $\sum_{j=1}^{k} [r_j Z_j]_{H_1(kT;\mathbb{R})} = \sum_{j=1}^{k} r_j [Z_j]_{H_1(kT;\mathbb{R})}$. As $I$ is an isomorphism of groups we have that $\forall \sigma \in H_1(kT;\mathbb{R})$ there are real numbers $\{r_j\}_{j=1}^{k}$ such that $\sigma = \sum_{j=1}^{k} r_j [Z_j]_{H_1(kT;\mathbb{R})}$. This means that $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ generates $H_1(kT;\mathbb{R})$. Moreover, if $0 = \sum_{j=1}^{k} r_j [Z_j]_{H_1(kT;\mathbb{R})} = \sum_{j=1}^{k} [r_j Z_j]_{H_1(kT;\mathbb{R})}$ we have that $(r_1, r_2, \dots, r_k) = 0$, and we conclude that $\{[Z_j]_{H_1(kT;\mathbb{R})}\}_{j=1}^{k}$ is a linearly independent set and, since it also generates $H_1(kT;\mathbb{R})$, it is a basis.

10. Appendix B

In this appendix we prove, for completeness, the following proposition.

Proposition 10.1. $\{[\hat\gamma_j]_{H_1(\Lambda;\mathbb{Z})}\}_{j=1}^{m}$ is a basis of $H_1(\Lambda;\mathbb{Z})$.

Proof. For simplicity we will omit $\mathbb{Z}$ in the homology groups in this proof.

Step 1. As in the proof of (2.9) we prove that $H_2(\mathbb{R}^3, \mathbb{R}^3\setminus K) \cong H_1(\mathbb{R}^3\setminus K)$. Moreover the isomorphism is given by (p. 75 of [16])
$$[\sigma]_{H_2(\mathbb{R}^3,\mathbb{R}^3\setminus K)} \mapsto [\partial\sigma]_{H_1(\mathbb{R}^3\setminus K)}. \tag{10.1}$$
Step 2. Define $K_\varepsilon := \{x \in \mathbb{R}^3 : \mathrm{dist}(x, K) \leq \varepsilon\}$. Since $\overline{\mathbb{R}^3\setminus K_\varepsilon} \subset (\mathbb{R}^3\setminus K)^{\circ}$, it follows from the excision theorem (p. 82 of [16]) that the inclusion $(K_\varepsilon, K_\varepsilon\setminus K) \to (\mathbb{R}^3, \mathbb{R}^3\setminus K)$ induces an isomorphism in homology.
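Step 3. Let $K_{\varepsilon,j}$, $j = 1, 2, \dots, L$, be the connected components of $K_\varepsilon$ for $\varepsilon$ small enough. Then, $K_{\varepsilon,j} = \{x \in \mathbb{R}^3 : \mathrm{dist}(x, K_j) \leq \varepsilon\}$. By Proposition 13.9, p. 72 of [16],
$$H_2(K_\varepsilon, K_\varepsilon\setminus K) \cong \oplus_{j=1}^{L} H_2(K_{\varepsilon,j}, K_{\varepsilon,j}\setminus K_j).$$
Step 4. We have the following homotopic equivalence, $K_{\varepsilon,j}\setminus K_j \simeq \partial K_{\varepsilon,j}$, that induces the isomorphism in homology $H_k(K_{\varepsilon,j}\setminus K_j) \cong H_k(\partial K_{\varepsilon,j})$. Let us consider the exact homology sequences of the pairs $(K_{\varepsilon,j}, K_{\varepsilon,j}\setminus K_j)$ and $(K_{\varepsilon,j}, \partial K_{\varepsilon,j})$. The first starts at $H_k(K_{\varepsilon,j}\setminus K_j)$ and ends at $H_{k-1}(K_{\varepsilon,j})$ and the second starts at $H_k(\partial K_{\varepsilon,j})$ and ends at $H_{k-1}(K_{\varepsilon,j})$. By the five lemma (p. 77 of [16]) the inclusion $(K_{\varepsilon,j}, \partial K_{\varepsilon,j}) \to (K_{\varepsilon,j}, K_{\varepsilon,j}\setminus K_j)$ induces the isomorphism in homology
$$H_k(K_{\varepsilon,j}, \partial K_{\varepsilon,j}) \cong H_k(K_{\varepsilon,j}, K_{\varepsilon,j}\setminus K_j).$$
Step 5. By the exact homology sequence for the pair $(K_{\varepsilon,j}, \partial K_{\varepsilon,j})$ we obtain the sequence
$$\to H_2(K_{\varepsilon,j}) \to H_2(K_{\varepsilon,j}, \partial K_{\varepsilon,j}) \xrightarrow{\ \Delta_2\ } H_1(\partial K_{\varepsilon,j}) \xrightarrow{\ I\ } H_1(K_{\varepsilon,j}) \to,$$
where $\Delta_2$ is taking boundary and $I$ is the inclusion. By Proposition 9.2, $H_2(K_{\varepsilon,j}) = 0$. Hence we obtain the exact sequence
$$0 \to H_2(K_{\varepsilon,j}, \partial K_{\varepsilon,j}) \xrightarrow{\ \Delta_2\ } H_1(\partial K_{\varepsilon,j}) \xrightarrow{\ I\ } H_1(K_{\varepsilon,j}) \to. \tag{10.2}$$
Let $\mathcal{J}_j \subset \{1, 2, \dots, m\}$ be such that $\{[\gamma_i]_{H_1(K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}$ is a basis of $H_1(K_{\varepsilon,j})$ (see Subsect. 2.4). Let $\{\alpha_i\}_{i\in\mathcal{J}_j}$, $\{\beta_i\}_{i\in\mathcal{J}_j}$ be the curves defined in Example 2A.2, p. 168 of [17]. Note that we can choose $\alpha_i = \hat\gamma_i$ (see (2.6), just take $2\varepsilon$ instead of $\varepsilon$ in $K_\varepsilon$). Moreover, as $\gamma_i \simeq \beta_i$ we have that (see Theorem 11.2, p. 59 of [16]) $[\beta_i]_{H_1(K_{\varepsilon,j})} = [\gamma_i]_{H_1(K_{\varepsilon,j})}$. Then, by Example 2A.2, p. 168 of [17],
$$\big\{[\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}, [\beta_i]_{H_1(\partial K_{\varepsilon,j})}\big\}_{i\in\mathcal{J}_j}$$
is a basis of $H_1(\partial K_{\varepsilon,j})$. It is clear that $I([\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}) = 0$, $i \in \mathcal{J}_j$. Moreover, $I([\beta_i]_{H_1(\partial K_{\varepsilon,j})}) = [\beta_i]_{H_1(K_{\varepsilon,j})} = [\gamma_i]_{H_1(K_{\varepsilon,j})}$. Hence, $\mathrm{Kern}\, I = \big\langle\{[\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}\big\rangle$, the free $\mathbb{Z}$-module or the free group generated by $\{[\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}$. We obtain then that
$$H_2(K_{\varepsilon,j}, \partial K_{\varepsilon,j}) \xrightarrow{\ \Delta_2\ } \mathrm{Kern}(I) = \big\langle\{[\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}\big\rangle.$$
It follows that to construct a basis of $H_2(K_{\varepsilon,j}, \partial K_{\varepsilon,j})$ it is enough to compute the inverse image under $\Delta_2$ of the $\{[\hat\gamma_i]_{H_1(\partial K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}$. Let us take then $[\sigma_i]_{H_2(K_{\varepsilon,j},\partial K_{\varepsilon,j})}$ such that $\partial\sigma_i = \hat\gamma_i$. Hence, $\{[\sigma_i]_{H_2(K_{\varepsilon,j},\partial K_{\varepsilon,j})}\}_{i\in\mathcal{J}_j}$ is a basis of $H_2(K_{\varepsilon,j}, \partial K_{\varepsilon,j})$. Finally, by Steps 4 and 5, $\{[\sigma_i]_{H_2(K_{\varepsilon,j},K_{\varepsilon,j}\setminus K_j)}\}_{i\in\mathcal{J}_j}$ is a basis of $H_2(K_{\varepsilon,j}, K_{\varepsilon,j}\setminus K_j)$. By Step 3, $\{[\sigma_i]_{H_2(K_\varepsilon,K_\varepsilon\setminus K)}\}_{i=1}^{m}$ is a basis of $H_2(K_\varepsilon, K_\varepsilon\setminus K)$. By Step 2, $\{[\sigma_i]_{H_2(\mathbb{R}^3,\mathbb{R}^3\setminus K)}\}_{i=1}^{m}$ is a basis of $H_2(\mathbb{R}^3, \mathbb{R}^3\setminus K)$. By Step 1, $\{[\hat\gamma_i]_{H_1(\mathbb{R}^3\setminus K)}\}_{i=1}^{m}$ is a basis of $H_1(\mathbb{R}^3\setminus K)$.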
Acknowledgements. This work was partially done while we were visiting the Department of Mathematics and Statistics of the University of Helsinki. We thank Prof. Lassi Päivärinta for his kind hospitality.
References
1. Adams, R.A., Fournier, J.J.F.: Sobolev Spaces, Oxford: Amsterdam Academic Press, 2003
2. Agmon, S.: Lectures on Elliptic Boundary Value Problems. Princeton, NJ: D. Van Nostrand, 1965
3. Aharonov, Y., Bohm, D.: Significance of electromagnetic potentials in the quantum theory. Phys. Rev. 115, 485–491 (1959)
4. Arians, S.: Geometric approach to inverse scattering for the Schrödinger equation with magnetic and electric potentials. J. Math. Phys. 38, 2761–2773 (1997)
5. Arians, S.: Inverse Streutheorie für die Schrödinger Gleichung mit Magnet Felder. Dissertation RWTH-Aachen, Berlin, Logos, 1997
6. Arians, S.: Geometric approach to inverse scattering for hydrogen-like systems in a homogeneous magnetic field. J. Math. Phys. 39, 1730–1743 (1998)
7. Bredon, G.E.: Topology and Geometry. New York: Springer-Verlag, 1993
8. de Rham, G.: Differentiable Manifolds. Berlin: Springer-Verlag, 1984
9. Ehrenberg, W., Siday, R.E.: The refractive index in electron optics and the principles of dynamics. Proc. Phys. Soc. London B 62, 8–21 (1949)
10. Eskin, G.: Inverse boundary value problems and the Aharonov-Bohm effect. Inverse Problems 19, 49–62 (2003)
11. Eskin, G.: Inverse boundary value problems in domains with several obstacles. Inverse Problems 20, 1497–1516 (2004)
12. Eskin, G.: Inverse problems for the Schrödinger equations with time-dependent electromagnetic potentials and the Aharonov-Bohm effect. http://arXiv.org/list/math.AP/0611342v1, 2006
13. Eskin, G.: Optical Aharonov-Bohm effect: an inverse hyperbolic problems approach. http://arXiv.org/abs/0707.2835v2[math-ph], 2007
14. Enss, V., Weder, R.: The geometrical approach to multidimensional inverse scattering. J. Math. Phys. 36, 3902–3921 (1995)
15. Gompf, R.E., Stipsicz, A.I.: 4-Manifolds and Kirby Calculus, Graduate Studies in Mathematics 20, Providence, RI: Amer. Math. Soc., 1999
16. Greenberg, M.J., Harper, J.R.: Algebraic Topology, A First Course, New York: Addison-Wesley, 1981
17. Hatcher, A.: Algebraic Topology. Cambridge: Cambridge University Press, 2002
18. Helgason, S.: Groups and Geometric Analysis. Orlando: Academic Press, 1984
19. Helgason, S.: The Radon Transform. Progress in Mathematics Vol 5, 2nd edn, Berlin: Birkhäuser, 1999
20. Jung, W.: Der geometrische Ansatz zur inversen Streutheorie bei der Dirac-Gleichung. Diplomarbeit RWTH-Aachen, 1996
21. Jung, W.: Geometrical approach to inverse scattering for the Dirac equation. J. Math. Phys. 38, 39–48 (1997)
22. Jung, W.: Gauge transformations and inverse quantum scattering with medium-range magnetic fields. Math. Phys. Electron. J. 11, paper 5 (2005), 32 pp
23. Kato, T.: Perturbation Theory for Linear Operators. Second Edition, Berlin: Springer-Verlag, 1976
24. Katchalov, A., Kurylev, Ya.: Multidimensional inverse problem with incomplete boundary spectral data. Comm. Part. Differ. Eqs. 23, 55–95 (1998)
25. Kurylev, Y., Lassas, M.: Inverse problems and index formulae for Dirac operators. http://arXiv.org/list/math.AP/0501049v2, 2006
26. Martensen, E.: Potentialtheorie. Stuttgart: B.G. Teubner, 1968
27. Natterer, F.: The Mathematics of Computerized Tomography. Stuttgart: Teubner, 1986
28. Nicoleau, F.: An inverse scattering problem with the Aharonov-Bohm effect. J. Math. Phys. 41, 5223–5237 (2000)
29. Olariu, S., Popescu, I.I.: The quantum effects of electromagnetic fluxes. Rev. Mod. Phys. 57, 339–436 (1985)
30. Peshkin, M., Tonomura, A.: The Aharonov-Bohm Effect. Lecture Notes in Phys. 340, Berlin: Springer-Verlag, 1989
31. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, II. Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
32. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, III. Scattering Theory. New York: Academic Press, 1979
33. Roux, Ph.: Scattering by a toroidal coil. J. Phys. A: Math. Gen. 36, 5293–5304 (2003)
34. Roux, Ph., Yafaev, D.: On the mathematical theory of the Aharonov-Bohm effect. J. Phys. A: Math. Gen. 35, 7481–7492 (2002)
35. Schwarz, G.: Hodge Decomposition - A Method for Solving Boundary Value Problems. Lecture Notes in Mathematics 1607, Berlin: Springer, 1995
36. Triebel, H.: Interpolation Theory, Function Spaces, Differential Operators. Amsterdam: North-Holland, 1978
37. Tonomura, A., Matsuda, T., Suzuki, R., Fukuhara, A., Osakabe, N., Umezaki, H., Endo, J., Shinagawa, K., Sugita, Y., Fujiwara, H.: Phys. Rev. Lett. 48, 1443–1446 (1982)
38. Tonomura, A., Osakabe, N., Matsuda, T., Kawasaki, T., Endo, J., Yano, S., Yamada, H.: Phys. Rev. Lett. 56, 792–795 (1986)
39. Neudert, M., von Wahl, W.: Asymptotic behaviour of the div-curl problem in exterior domains. Adv. Differ. Eqs. 6, 1347–1376 (2001)
40. Weder, R.: The Aharonov-Bohm effect and time-dependent inverse scattering theory. Inverse Problems 18, 1041–1056 (2002)
41. Warner, F.W.: Foundations of Differentiable Manifolds. Berlin: Springer-Verlag, 1983
42. Yafaev, D.R.: Scattering matrix for magnetic potentials with Coulomb decay at infinity. Int. Eqs. Opr Theory 47, 217–249 (2003)
43. Yafaev, D.R.: Scattering by magnetic fields. St. Petersburg Math. J. 17, 875–895 (2006)
44. Wu, T.T., Yang, C.N.: Concept of nonintegrable phase factors and global formulation of gauge fields. Phys. Rev. D (3) 12, 3845–3857 (1975)

Communicated by I.M. Sigal
Commun. Math. Phys. 285, 399–420 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0549-7
Communications in
Mathematical Physics
The Wave Equation on Singular Space-Times James D.E. Grant, Eberhard Mayerhofer, Roland Steinbauer Faculty of Mathematics, University of Vienna, Nordbergstrasse 15, 1090 Vienna, Austria. E-mail:
[email protected] Received: 11 October 2007 / Accepted: 15 February 2008 Published online: 5 July 2008 – © Springer-Verlag 2008
Abstract: We prove local unique solvability of the wave equation for a large class of weakly singular, locally bounded space-time metrics in a suitable space of generalised functions.
1. Introduction The notion of a singularity in general relativity significantly differs from that in other field theories. In the absence of a background metric, one has to detect the presence of singularities by showing that the space-time is “incomplete” in some sense. In the standard approach to singularities (see, e.g., Hawking and Ellis [12, Ch. 8]), a singularity is regarded as an obstruction to extending geodesics. However, this definition does not correspond very closely to one’s physical intuition and classifies many space-times that have been used to model physically reasonable scenarios as being “singular”. Such “weakly singular” space-times have long been used to describe, for example, impulsive gravitational waves, shell-crossing singularities and thin cosmic strings. Typically these space-times admit a metric that is locally bounded but its differentiability is below C 1,1 (i.e., the first derivative locally Lipschitz)—the largest differentiability class where standard differential geometric properties, such as existence and uniqueness of geodesics, remain valid. For a recent review on the use of metrics of low regularity in general relativity, see [24]. This set of problems has stimulated considerations of whether physical objects would be subjected to unbounded tidal forces on approaching the singularity and was formulated mathematically in terms of strong curvature conditions. Unfortunately, it is hard to model the behaviour of real physical objects in a strong gravitational field. This led Clarke [2] to suggest that one consider the behaviour of physical fields (for which one has a precise mathematical description) near the singularity instead. According to this philosophy of “generalised hyperbolicity” one should regard singularities as obstructions
to the Cauchy development of these fields rather than as an obstruction to the extension of geodesics. However, the weak singularities mentioned above are obstructions if one formulates the Cauchy problem for the wave equation in the standard theory of distributions. More precisely, there is no generally valid distributional solution concept for the wave equation on a space-time with a non-smooth metric. The equation, although linear, involves coefficients of low regularity that cannot be multiplied with the distributional solution. To resolve this problem in the case of shell-crossing singularities, Clarke [3] introduced a specific weak solution concept (called □-global hyperbolicity) to prove unique solvability of the wave equation, hence showing that these space-times, indeed, satisfy the conditions of generalised hyperbolicity. On the other hand, Vickers and Wilson [25] used the setting of Colombeau algebras [4,5] to arrive at a valid formulation of the Cauchy problem for the wave equation on conical space-times (modelling a thin cosmic string) and showed the existence and uniqueness of solutions in a suitable algebra, G, of generalised functions. Hence they showed that conical space-times are generalised hyperbolic or, more precisely, G-hyperbolic. Vickers and Wilson also showed that their unique generalised solution corresponds to the “forbidden” distributional solution expected on physical grounds (via the concept of association; see Sect. 2.1 below). Their key tool is a refinement of the energy estimates for hyperbolic PDEs (see, e.g., [12, Sect. 7.4], [1, Sect. 4.4]), which makes them applicable in the new situation. In this paper, we generalise this method to a much wider class of weakly singular space-times and prove G-hyperbolicity for this class. Since our approach is based on regularisation of the singular metric by sequences of smooth ones, we must put restrictions on the growth of the sequence with respect to the regularisation parameter ε. 
Essentially we shall assume (see Sect. 2.3) asymptotic local uniform boundedness with respect to ε. Recall that the space-times of interest here typically possess a locally bounded metric. In particular, our class includes impulsive pp-waves (in the Rosen form), expanding spherical impulsive waves, and conical space-times (thereby generalising the results of Vickers and Wilson [25]). This work is organised in the following way. In Sect. 2 we fix our notation, recall some facts on the geometric theory of generalised functions and define our class of weakly singular space-times (Sect. 2.3). We state our main result in Theorem 3.1 of Sect. 3: given a point p in a weakly singular space-time, there exists a neighbourhood, V , of p such that the initial value problem for the wave equation admits a unique solution in G(V ). The proof is split into several steps: (generalised) higher order energy integrals are introduced and proved to be equivalent to suitable Sobolev norms in Sect. 4. The energy estimates are provided in Sect. 5, while some auxiliary estimates are proved in Sect. 6. Finally these results are collected to provide the proof of the main theorem in Sect. 7. We end with some concluding remarks.
2. Prerequisites In this section, we give a precise definition of the class of weakly singular metrics that we are going to consider in the sequel. Prior to that, and for the convenience of the reader, we give a brief summary of the geometric theory of generalised functions in the sense of Colombeau. Our main reference for the latter is [8, Sect. 3.2] and we adopt most notational conventions from there. For an overview of the use of these constructions in general relativity, we refer to [24].
2.1. Geometric theory of generalised functions. The basic idea of Colombeau's approach to generalised functions [4,5] is regularisation by sequences (nets) of smooth functions and the use of asymptotic estimates in terms of a regularisation parameter $\varepsilon$. Let $M$ be a separable, smooth, orientable, Hausdorff manifold of dimension $n$, and let $\mathfrak{X}(M)$ denote the space of smooth vector fields on $M$. We consider nets $(u_\varepsilon)_{\varepsilon\in(0,1]}$ with $u_\varepsilon \in C^\infty(M)$ for all $\varepsilon$. The (special) algebra of generalised functions on $M$ is defined as the quotient $\mathcal{G}(M) := \mathcal{E}_M(M)/\mathcal{N}(M)$ of the moderate nets modulo the negligible nets, where the respective notions are defined by the following asymptotic estimates:
$$\mathcal{E}_M(M) := \Big\{(u_\varepsilon)_\varepsilon : \forall K \subset\subset M,\ \forall k \in \mathbb{N}_0\ \exists N \in \mathbb{N}\ \forall \eta_1,\dots,\eta_k \in \mathfrak{X}(M) : \sup_{p\in K} |L_{\eta_1}\cdots L_{\eta_k} u_\varepsilon(p)| = O(\varepsilon^{-N})\Big\},$$
$$\mathcal{N}(M) := \Big\{(u_\varepsilon)_\varepsilon : \forall K \subset\subset M,\ \forall k, q \in \mathbb{N}_0\ \forall \eta_1,\dots,\eta_k \in \mathfrak{X}(M) : \sup_{p\in K} |L_{\eta_1}\cdots L_{\eta_k} u_\varepsilon(p)| = O(\varepsilon^{q})\Big\}.$$
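The two growth conditions can be illustrated on nets of numbers, ignoring derivatives and compact sets. The following Python sketch is only a heuristic check on a finite sample of values of the regularisation parameter and cannot establish an asymptotic statement; the helper `is_bounded_by_power` and the two example nets are assumptions introduced for illustration.

```python
import math


def is_bounded_by_power(net, power, eps_values, const=1.0):
    """Check |net(eps)| <= const * eps**power on the finite sample eps_values."""
    return all(abs(net(eps)) <= const * eps ** power for eps in eps_values)


# Sample of small regularisation parameters eps = 2^-7, ..., 2^-39.
eps_values = [2.0 ** (-k) for k in range(7, 40)]


def moderate_net(eps):
    """Grows like eps^-1: moderate (N = 1 works) but not negligible."""
    return 1.0 / eps


def negligible_net(eps):
    """Decays faster than every power of eps: negligible."""
    return math.exp(-1.0 / eps)


moderate_ok = is_bounded_by_power(moderate_net, -1, eps_values, const=2.0)
negligible_ok = all(
    is_bounded_by_power(negligible_net, q, eps_values) for q in (0, 1, 5, 20)
)
```

On this sample the first net satisfies the moderateness bound with exponent 1 but fails every bound with a positive power, while the second satisfies all the tested negligibility bounds, so the two nets represent different elements of the quotient.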
Elements of $\mathcal{G}(M)$ are denoted by $u = [(u_\varepsilon)_\varepsilon] = (u_\varepsilon)_\varepsilon + \mathcal{N}(M)$. With component-wise operations, $\mathcal{G}(M)$ is a fine sheaf of differential algebras with respect to the Lie derivative with respect to classical vector fields defined by $L_\eta u := [(L_\eta u_\varepsilon)_\varepsilon]$. The spaces of moderate resp. negligible sequences and hence the algebra itself may be characterised locally, i.e., $u \in \mathcal{G}(M)$ iff $u \circ \psi_\alpha \in \mathcal{G}(\psi_\alpha(V_\alpha))$ for all charts $(V_\alpha, \psi_\alpha)$, where, on the open set $\psi_\alpha(V_\alpha) \subset \mathbb{R}^n$, Lie derivatives are replaced by partial derivatives in the respective estimates. Smooth functions are embedded into $\mathcal{G}$ simply by the “constant” embedding $\sigma$, i.e., $\sigma(f) := [(f)_\varepsilon]$. On open sets of $\mathbb{R}^n$, compactly supported distributions are embedded into $\mathcal{G}$ via convolution with a mollifier $\rho \in \mathcal{S}(\mathbb{R}^n)$ with unit integral satisfying $\int \rho(x)\,x^\alpha\,dx = 0$ for all $|\alpha| \geq 1$; more precisely, setting $\rho_\varepsilon(x) = (1/\varepsilon^n)\rho(x/\varepsilon)$, we have $\iota(w) := [(w * \rho_\varepsilon)_\varepsilon]$. In the case where $\mathrm{supp}(w)$ is non-compact, one uses a sheaf-theoretical construction which can be lifted to the manifold using a partition of unity. From the explicit formula, it is clear that the embedding commutes with partial differentiation. This embedding, however, is not canonical since it depends on the mollifier as well as the partition of unity. A canonical embedding of $\mathcal{D}'$ is provided by the so-called full version of the construction (see [9], resp. [10] for the tensor case). However, since we will model our weakly singular metrics in generalised functions from the start (see Sect. 2.2 and the discussion at the end of Sect. 2.3 below) we have chosen to work in the so-called special setting which is technically more accessible. Note that this is in contrast to [25]. Inserting $p \in M$ into $u \in \mathcal{G}(M)$ yields a well-defined element of the ring of constants (also called generalised numbers) $\tilde{\mathbb{K}}$ (corresponding to $\mathbb{K} = \mathbb{R}$ resp. $\mathbb{C}$), defined as the set of moderate nets of numbers ($(r_\varepsilon)_\varepsilon \in \mathbb{K}^{(0,1]}$ with $|r_\varepsilon| = O(\varepsilon^{-N})$ for some $N$) modulo negligible nets ($|r_\varepsilon| = O(\varepsilon^m)$ for each $m$). Finally, generalised functions on $M$ are characterised by their generalised point values, i.e., by their values on points in $\tilde M_c$, the space of equivalence classes of compactly supported nets $(p_\varepsilon)_\varepsilon \in M^{(0,1]}$ with respect to the relation $p_\varepsilon \sim p'_\varepsilon :\Leftrightarrow d_h(p_\varepsilon, p'_\varepsilon) = O(\varepsilon^m)$ for all $m$, where $d_h$ denotes the distance on $M$ induced by any Riemannian metric. As is evident from the definitions, all estimates are only required to hold for $\varepsilon$ small enough, that is, there exists $\varepsilon_0$ such that for all $\varepsilon < \varepsilon_0$ the respective statement holds. However, in order not to unnecessarily complicate our formulations we will notationally suppress this fact most of the time. The $\mathcal{G}(M)$-module of generalised sections in vector bundles, especially the space of generalised tensor fields $\mathcal{G}^r_s(M)$, is defined along the same lines using analogous
J. D. E. Grant, E. Mayerhofer, R. Steinbauer
asymptotic estimates with respect to the norm induced by any Riemannian metric on the respective fibers. However, it is more convenient to use the following algebraic description of generalised tensor fields:
\[ \mathcal{G}^r_s(M) = \mathcal{G}(M) \otimes \mathcal{T}^r_s(M), \tag{2.1} \]
where $\mathcal{T}^r_s(M)$ denotes the space of smooth tensor fields and the tensor product is taken over the module $C^\infty(M)$. Hence generalised tensor fields are just given by classical ones with generalised coefficient functions. Many concepts of classical tensor analysis carry over to the generalised setting [14], in particular Lie derivatives with respect to both classical and generalised vector fields, Lie brackets, exterior algebra, etc. Moreover, generalised tensor fields may also be viewed as $\mathcal{G}(M)$-multilinear maps taking generalised vector and covector fields to generalised functions, i.e., as $\mathcal{G}(M)$-modules we have
\[ \mathcal{G}^r_s(M) \cong L_{\mathcal{G}(M)}\big(\mathcal{G}^0_1(M)^r, \mathcal{G}^1_0(M)^s; \mathcal{G}(M)\big). \tag{2.2} \]
Finally, in light of the Schwartz impossibility result [22], the setting introduced above gives a minimal framework within which tensor fields may be subjected to nonlinear operations, while maintaining consistency with smooth geometry and allowing an embedding of the distributional geometry as developed in [16,19]. Moreover, the interplay between generalised functions and distributions is most conveniently formalised in terms of the notion of association. A generalised function $u \in \mathcal{G}(M)$ is called associated to zero, $u \approx 0$, if one (hence any) representative $(u_\varepsilon)_\varepsilon$ converges to zero weakly. The equivalence relation $u \approx v :\Leftrightarrow u - v \approx 0$ gives rise to a linear quotient of $\mathcal{G}$ that extends distributional equality. Moreover, we call a distribution $w \in \mathcal{D}'(M)$ the distributional shadow or macroscopic aspect of $u$ and write $u \approx w$ if, for all compactly supported $n$-forms $\nu$ and one (hence any) representative $(u_\varepsilon)_\varepsilon$, we have
\[ \lim_{\varepsilon \to 0} \int_M u_\varepsilon\,\nu = \langle w, \nu \rangle, \]
where $\langle\,,\rangle$ denotes the distributional action. By (2.1), the concept of association extends to generalised tensor fields in a natural way.
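The notion of association can be illustrated numerically. The following is only a sketch under stated assumptions: a Gaussian stands in for the mollifier $\rho$ (it has unit integral but does not satisfy the vanishing-moment conditions), and the test function `phi`, the grid, and the sample values of $\varepsilon$ are hypothetical choices made for the example. Under these assumptions the net $u_\varepsilon = \rho_\varepsilon$ should have the Dirac delta as its distributional shadow, so the pairing $\int u_\varepsilon \varphi$ should approach $\varphi(0)$.

```python
import numpy as np

def rho(x):
    # Gaussian with unit integral, used as a stand-in mollifier (assumption).
    return np.exp(-x**2) / np.sqrt(np.pi)

def rho_eps(x, eps):
    # One-dimensional scaling rho_eps(x) = (1/eps) * rho(x/eps).
    return rho(x / eps) / eps

def phi(x):
    # Hypothetical smooth, rapidly decaying test function with phi(0) = 1.
    return np.cos(x) * np.exp(-x**2)

def pairing(eps, test_function, half_width=10.0, n=400_001):
    # Riemann-sum approximation of the integral of rho_eps * test_function.
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    return float(np.sum(rho_eps(x, eps) * test_function(x)) * dx)

# Distance of the pairing from phi(0) for a decreasing sequence of eps.
errors = [abs(pairing(eps, phi) - phi(0.0)) for eps in (0.5, 0.1, 0.02)]
```

The list `errors` should shrink as `eps` decreases, which is the weak convergence that the relation $u \approx \delta$ expresses.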
2.2. Elements of Lorentzian geometry. A generalised pseudo-Riemannian metric is defined to be a symmetric, generalised (0, 2)-tensor field $g$ with a representative $g_\varepsilon$ that is a smooth pseudo-Riemannian metric for each $\varepsilon$ such that the determinant $\det(g)$ is invertible in the generalised sense. The latter condition is equivalent to the following notion called strictly nonzero on compact sets: for any representative $(\det(g_\varepsilon))_\varepsilon$ of $\det(g)$ we have $\forall K \subset\subset M\ \exists m \in \mathbb{N} : \inf_{p \in K} |\det(g_\varepsilon)| \ge \varepsilon^m$. This notion captures the intuitive idea of a generalised metric as a net of classical metrics approaching a singular limit in the following precise sense: $g$ is a generalised metric iff on every relatively compact open subset $V$ of $M$ there exists a representative $(g_\varepsilon)_\varepsilon$ of $g$ such that, for fixed $\varepsilon$, $g_\varepsilon$ is a classical metric and its determinant, $\det(g)$, is invertible in the generalised sense, i.e., does not go to zero too fast as $\varepsilon \to 0$. Note that we work exclusively with representatives of generalised metrics that are classical metrics for each $\varepsilon$. If $g$ is Lorentzian, i.e., there exists a representative which is Lorentzian, we call the pair $(M, g)$ a generalised space-time.
The Wave Equation on Singular Space-Times
A generalised metric induces a $\mathcal{G}(M)$-linear isomorphism from $\mathcal{G}^1_0(M)$ to $\mathcal{G}^0_1(M)$. The inverse of this isomorphism gives a well-defined element of $\mathcal{G}^2_0(M)$ (i.e., independent of the representative $(g_\varepsilon)_\varepsilon$). This is the “inverse metric”, which we denote by $g^{-1}$, with representative $(g_\varepsilon^{-1})_\varepsilon$. The generalised covariant derivative, as well as the generalised Riemann-, Ricci- and Einstein tensors, of a generalised metric is defined by the usual formulae on the level of representatives. For further details see [15]. Next, we review the concept of causality in the generalised framework. Let $\xi \in \mathcal{G}^1_0(M)$ be a generalised vector field on $M$. Then, by (2.2), $g(\xi, \xi) \in \mathcal{G}(M)$. For functions $f \in \mathcal{G}(M)$ we have the following notion of strict positivity:
\[ f > 0 \ :\Longleftrightarrow\ \forall K \subset\subset M,\ \exists m \in \mathbb{N} : \inf_{p \in K} f_\varepsilon(p) \ge \varepsilon^m \quad \text{as } \varepsilon \to 0, \]
and we define time-like, null and space-like for $\xi$ by demanding $g(\xi, \xi) < 0$, $g(\xi, \xi) = 0$, respectively $g(\xi, \xi) > 0$. (See [17] for details, as well as for a general account of basic Lorentzian geometry in the present setting.)
2.3. A class of metrics. We are now ready to define the class of metrics that we will study. Let $(M, g)$ be a generalised space-time, and $g_\varepsilon$ a representative of the generalised metric. Let $p \in M$, $U$ a relatively compact open neighbourhood of $p$, and let $t : U \to \mathbb{R}$ be a smooth map with the properties that $t(p) = 0$, $dt \neq 0$ on $U$. We assume that there exists an $M_0 > 0$ with $g_\varepsilon^{-1}(dt, dt) \le -1/M_0^2$, as $\varepsilon \to 0$ on $U$. Therefore the level sets of the function $t$, $\Sigma_\tau := \{q \in U : t(q) = \tau\}$, are space-like hyper-surfaces with respect to the representative metrics, $g_\varepsilon$, uniformly as $\varepsilon \to 0$. We define the normal covector field to these hyper-surfaces $\sigma := -dt \in \Omega^1(U)$ which, via the constant embedding, may also be viewed as a generalised covector field on $U$. We define the corresponding generalised normal vector field, $\xi$, by its representative $\xi_\varepsilon \in \mathfrak{X}(U)$, given, for each $\varepsilon$, by $\sigma = g_\varepsilon(\xi_\varepsilon, \cdot)$. We now define the generalised function, $V$, on $U$ by its representative $V_\varepsilon : U \to \mathbb{R}^+$, given by $V_\varepsilon^2 = -g_\varepsilon(\xi_\varepsilon, \xi_\varepsilon)$. We will also require the corresponding normalised versions of the generalised normal vector field, $\hat\xi = [(\hat\xi_\varepsilon)_\varepsilon] = [(\xi_\varepsilon/V_\varepsilon)_\varepsilon]$, and covector field, $\hat\sigma = [(\hat\sigma_\varepsilon)_\varepsilon] = g(\hat\xi, \cdot)$. Observe that, although $\sigma$ does not depend on $\varepsilon$, the quantities derived from it, i.e., $\hat\sigma_\varepsilon$, $\xi_\varepsilon$ and $\hat\xi_\varepsilon$ necessarily do, since we are dealing with a generalised metric. Using these quantities, one may construct a positive-definite metric associated with the generalised space-time (cf. [17, Sec. 4]). In particular, we define
\[ e_\varepsilon := g_\varepsilon + 2\,\hat\sigma_\varepsilon \otimes \hat\sigma_\varepsilon, \]
which clearly, for each fixed $\varepsilon$, is a Riemannian metric on $U$. Additionally, the resulting class $e = [(e_\varepsilon)_\varepsilon]$ defines a generalised Riemannian metric on $U$ ([17, Prop. 4.3]). We denote by $\Sigma := \Sigma_0$ the three-dimensional space-like hypersurface through $p$. Let $m$ be a background Riemannian metric on $U$ and denote by $\|\cdot\|_m$ the norm induced on the fibers of the respective tensor bundle on $U$.
We demand the following conditions:
(A) For all $K$ compact in $U$, for all orders of derivative $k \in \mathbb{N}_0$ and all $k$-tuples of vector fields $\eta_1, \dots, \eta_k \in \mathfrak{X}(U)$ and for any representative $(g_\varepsilon)_\varepsilon$ we have:
• $\sup_K \|L_{\eta_1} \cdots L_{\eta_k} g_\varepsilon\|_m = O(\varepsilon^{-k})$ ($\varepsilon \to 0$);
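• $\sup_K \|L_{\eta_1} \cdots L_{\eta_k} g_\varepsilon^{-1}\|_m = O(\varepsilon^{-k})$ ($\varepsilon \to 0$).
In particular, this implies (for $k = 0$) that the metrics $g_\varepsilon$ and their inverses $g_\varepsilon^{-1}$ are locally uniformly bounded with respect to $\varepsilon$.
(B) For all $K$ compact in $U$, we have
\[ \sup_K \|\nabla^\varepsilon \xi_\varepsilon\|_m = O(1) \quad (\varepsilon \to 0), \tag{2.3} \]
where $\nabla^\varepsilon$ denotes the covariant derivative with respect to the Lorentzian metric $g_\varepsilon$.
(C) For each representative $(g_\varepsilon)_\varepsilon$ of the metric $g$ on $U$, $\Sigma$ is a past-compact space-like hypersurface such that $\partial J_\varepsilon^+(\Sigma) = \Sigma$. Here $J_\varepsilon^+(\Sigma)$ denotes the topological closure of the future emission $D_\varepsilon^+(\Sigma) \subset U$ of $\Sigma$ with respect to $g_\varepsilon$. Moreover, there exists a nonempty open set $A \subseteq M$ and an $\varepsilon_0$ such that
\[ A \subseteq \bigcap_{\varepsilon < \varepsilon_0} J_\varepsilon^+(\Sigma). \]
[...] Condition (A) states that there exists $M_k$ such that
\[ \left| \frac{\partial^k g^\varepsilon_{ab}}{\partial x^{a_1} \cdots \partial x^{a_k}} \right| \le \frac{M_k}{\varepsilon^k}, \qquad \left| \frac{\partial^k g_\varepsilon^{ab}}{\partial x^{a_1} \cdots \partial x^{a_k}} \right| \le \frac{M_k}{\varepsilon^k}. \]
• Conditions (A) and (B) imply that the generalised Riemannian metric, $e$, obeys the asymptotic condition
\[ \|\nabla^\varepsilon e_\varepsilon^{-1}\|_m = O(1) \quad \text{as } \varepsilon \to 0. \]
From Condition (A), it follows that $\|e_\varepsilon\|_m, \|e_\varepsilon^{-1}\|_m = O(1)$. (This can most easily be deduced from the form of the metric given, below, in (3.2).) Therefore, by the Cauchy-Schwarz inequality for the inner product induced by $m$ on the bundle of (2, 1) tensors on $M$, we have
\[ \|\nabla^\varepsilon e_\varepsilon^{-1}\|_{e_\varepsilon} = O(1) \quad \text{as } \varepsilon \to 0. \tag{2.5} \]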
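Similarly, “lowering the index” on $\xi_\varepsilon$ in Eq. (2.3) implies that $\sup_K \|\nabla^\varepsilon \sigma\|_m = O(1)$ as $\varepsilon \to 0$, where we have again used the fact that $\|g_\varepsilon\|_m = O(1)$. Taking the symmetric part of $\nabla^\varepsilon \sigma$ implies that $\|L_{\xi_\varepsilon} g_\varepsilon\|_m = O(1)$ as $\varepsilon \to 0$. Finally, again using the Cauchy-Schwarz inequality, we deduce that
\[ \|L_{\xi_\varepsilon} g_\varepsilon\|_{e_\varepsilon} = O(1) \quad \text{as } \varepsilon \to 0. \tag{2.6} \]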
These estimates will be required in Sect. 5. • Condition (C) is necessary to ensure existence of smooth solutions on the level of representatives on a common domain (cf. Step 1 in the proof of Theorem 3.1 in Sect. 7). Remark 2.1. Conditions (A) and (B) are given in terms of the ε-asymptotics of the generalised metric. There is, however, the following close connection to the classical situation. Assume that we are given a space-time metric that is locally bounded but not necessarily C 1,1 or of Geroch-Traschen class [7] (i.e., the largest class that allows a “reasonable” distributional treatment). We may then embed this metric into the space of generalised metrics by convolution with a standard mollifier (cf. Sect. 2.1). From the explicit form of the embedding it is then clear that Condition (A) holds. We recall that in the special version of Colombeau’s construction the embedding is non-geometric and we could – at the price of technical complications – resort to the full version where a geometric embedding is available (as was done in [25]). Nevertheless, in the full construction generalised functions that are embeddings of locally bounded functions still display the ε-asymptotics of Condition (A). Moreover our approach using the special version offers more flexibility: Whenever one succeeds, e.g. by using some physically motivated procedure, to model a singular metric by a sequence of classical metrics obeying (A)–(C), then our results apply. Condition (B) on the other hand demands somewhat better asymptotics of the derivatives of the (0, 0)-component of the metric in adapted coordinates (see also (3.4) below). This is a technical condition that is satisfied by several relevant examples (see below). As to Condition (C), the only part that exceeds the classical condition for existence and uniqueness of solutions is the existence of the non-empty open set A. Geometrically, this means that the light-cones of the metric gε do not collapse as ε → 0. 
In terms of regularisations of classical metrics, this condition will always be satisfied if the classical metric is non-degenerate. Examples. To begin with we discuss the conical space-times of [25]. They fall into our class since estimates (6) and (7) in [25] for the embedded metric imply our Condition (A), while (B) is immediate from the staticity of the metric. The metric of impulsive pp-waves (in “Rosen form”) falls into our class. For simplicity we only consider plane waves of constant linear polarisation, i.e., $-du\,dv + (1 + u_+)^2\,dx^2 + (1 - u_+)^2\,dy^2$, where $u_+ := u\,H(u)$ denotes the kink function. This metric is locally bounded (actually continuous) and, since the non-trivial behaviour involves simply the spatial part of the metric, will therefore obey Conditions (A) and (B) when embedded with a standard mollifier, or – more generally – if we use any other regularisation that converges at least locally uniformly to the original metric.
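The $\varepsilon$-asymptotics of a regularised kink function can be checked numerically. The sketch below is illustrative only and rests on assumptions not taken from the text: a Gaussian is used as the mollifier, the convolution is a discrete sum on a uniform grid, derivatives are finite differences, and suprema are taken over the hypothetical interior window $|u| \le 0.5$ to avoid boundary effects. It regularises $u_+ = \max(u, 0)$ and reports the sup-norm error, the sup of the first derivative, and the sup of the second derivative.

```python
import numpy as np

def regularised_kink(eps, half_width=2.0, n=8001):
    """Convolve u_+ = max(u, 0) with a scaled Gaussian on a uniform grid."""
    u = np.linspace(-half_width, half_width, n)
    du = u[1] - u[0]
    kink = np.maximum(u, 0.0)
    # Sample the kernel on the same grid spacing, truncated at 6 * eps.
    m = int(np.ceil(6.0 * eps / du))
    s = np.arange(-m, m + 1) * du
    kernel = np.exp(-(s / eps) ** 2)
    kernel /= kernel.sum() * du  # enforce unit (discrete) integral
    smooth = np.convolve(kink, kernel, mode="same") * du
    return u, du, kink, smooth

def sup_norms(eps, interior=0.5):
    """Sup norms of the error and of the first two derivatives on |u| <= interior."""
    u, du, kink, smooth = regularised_kink(eps)
    d1 = np.gradient(smooth, du)
    d2 = np.gradient(d1, du)
    mask = np.abs(u) <= interior
    return (float(np.max(np.abs(smooth - kink)[mask])),
            float(np.max(np.abs(d1)[mask])),
            float(np.max(np.abs(d2)[mask])))

report = {eps: sup_norms(eps) for eps in (0.2, 0.1, 0.05)}
```

In this setup the error should tend to zero with $\varepsilon$ (locally uniform convergence), the first derivative should stay bounded, and the second derivative should grow like $1/\varepsilon$, which is within the $O(\varepsilon^{-k})$ growth permitted by Condition (A).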
Similarly, in [21], metrics for expanding spherical impulsive waves of the form
\[ 2\,du\,dv + 2v^2 \left| dz + \frac{u_+}{2v}\,\bar{H}\,d\bar{z} \right|^2 \]
were studied, where $H(z)$ is the Schwarzian derivative of an arbitrary analytic function $h(z)$. Again, this metric is continuous and the non-trivial behaviour occurs in only the spatial directions. So we obtain Conditions (A) and (B) as for the above case. In all of these examples, the discussion at the end of Remark 2.1 implies that Condition (C) is also satisfied.
3. The Main Result
We are interested in the initial value problem for the wave equation
\[ \Box u = 0, \qquad u|_\Sigma = v, \qquad L_{\hat\xi} u|_\Sigma = w, \tag{3.1} \]
on the subset $U$ of a weakly singular space-time $(M, g)$ (i.e., $g$ subject to the assumptions (A)–(C) of Sect. 2.3). Here $\Sigma := \Sigma_0$ denotes the level set $\{q \in U : t(q) = 0\}$ of the function $t : U \to \mathbb{R}$ introduced in Sect. 2.3. The initial conditions are defined by $v$ and $w$, which are given functions in $\mathcal{G}(\Sigma)$. Note that this, in particular, includes the case of arbitrary distributional initial data. We are interested in finding a local solution $u \in \mathcal{G}$ on $U$ resp. an open subset thereof. A general strategy to solve PDEs in $\mathcal{G}$ is the following. First, solve the equation for fixed $\varepsilon$ in the smooth setting and form the net $(u_\varepsilon)_\varepsilon$ of smooth solutions. This will be a candidate for a solution in $\mathcal{G}$, but particular care has to be taken to guarantee that the $u_\varepsilon$ share a common domain of definition. In the second step, one shows that the solution candidate $(u_\varepsilon)_\varepsilon$ is a moderate net, hence obtaining existence of the solution $[(u_\varepsilon)_\varepsilon]$ in $\mathcal{G}$. Finally, to obtain uniqueness of solutions, one has to prove that changing representatives of the data or the metric leads to a solution that is still in the class $[(u_\varepsilon)_\varepsilon]$. Note that this amounts to an additional stability of the equation with respect to negligible perturbations of the initial data and the metric. According to this strategy, given the point $p$ in $\Sigma$ we may, without loss of generality, assume that $(U, \{x^a\})$, where $(x^a)_{a=0,1,2,3} = (t, x^i)$, is a coordinate neighbourhood of $p$, and formulate the initial value problem (3.1) in terms of representatives on $U$. To this end, given a representative $(g_\varepsilon)_\varepsilon$ of the metric $g$, there exist functions $h^\varepsilon_{ij}$, $N^i_\varepsilon$ on $U$ such that
\[ g_\varepsilon = -V_\varepsilon^2\,dt^2 + h^\varepsilon_{ij}\,\big(dx^i - N^i_\varepsilon\,dt\big) \otimes \big(dx^j - N^j_\varepsilon\,dt\big). \tag{3.2} \]
We further choose representatives $(v_\varepsilon)_\varepsilon$, $(w_\varepsilon)_\varepsilon$ of the data and a negligible net $(f_\varepsilon)_\varepsilon$ on $U$. We then consider the initial value problem
\[ \Box_\varepsilon u_\varepsilon = f_\varepsilon, \qquad u_\varepsilon(t = 0, x^i) = v_\varepsilon(x^i), \qquad L_{\hat\xi_\varepsilon} u_\varepsilon(t = 0, x^i) = w_\varepsilon(x^i), \tag{3.3} \]
where $\Box_\varepsilon$ is the d’Alembertian derived from our particular representative $g_\varepsilon$, i.e.,
\begin{align*}
\Box_\varepsilon u_\varepsilon &= |g_\varepsilon|^{-1/2}\,\partial_a\big(|g_\varepsilon|^{1/2}\,g_\varepsilon^{ab}\,\partial_b u_\varepsilon\big) \\
&= -\frac{1}{V_\varepsilon^2}\,\partial_t^2 u_\varepsilon - \frac{2}{V_\varepsilon^2}\,N^i_\varepsilon\,\partial_t \partial_i u_\varepsilon + \Big(h_\varepsilon^{ij} - \frac{1}{V_\varepsilon^2}\,N^i_\varepsilon N^j_\varepsilon\Big)\partial_i \partial_j u_\varepsilon - g_\varepsilon^{ab}\,\Gamma[g_\varepsilon]^c{}_{ab}\,\frac{\partial u_\varepsilon}{\partial x^c},
\end{align*}
and $h_\varepsilon^{ij}$ are the components of the inverse of $h^\varepsilon_{ij}$, $g_\varepsilon := \det g_\varepsilon$, and $\Gamma[g_\varepsilon]^c{}_{ab}$ denote the Christoffel symbols of the metric $g_\varepsilon$. Note that, by Conditions (A) and (B) of Sect. 2.3, the following asymptotic estimates hold for the components of the metric in the above coordinate system as $\varepsilon \to 0$:
\[ \left. \begin{array}{l} V_\varepsilon,\ h^\varepsilon_{ij},\ N^i_\varepsilon = O(1) \\ \partial_a V_\varepsilon = O(1) \\ \partial^\alpha V_\varepsilon,\ \partial^\alpha h^\varepsilon_{ij},\ \partial^\alpha N^i_\varepsilon = O(\varepsilon^{-|\alpha|}) \ \text{for all multi-indices } \alpha \text{ with } |\alpha| \ge 1 \end{array} \right\} \tag{3.4} \]
Following the general strategy outlined above, we will prove local unique solvability of (3.1) by showing that the smooth solutions, $(u_\varepsilon)_\varepsilon$, of (3.3) form a moderate net, and hence determine a class in $\mathcal{G}$, and that this class is independent of the choice of representatives of $v$, $w$ and $g$. More precisely, our main result is the following:
Theorem 3.1 (Local existence and uniqueness of generalised solutions). Let $(M, g)$ be a generalised space-time and assume that Conditions (A)–(C) of Sect. 2.3 hold. Then, for each $p \in \Sigma$, there exists an open neighbourhood $V$ on which the initial value problem for the wave equation (3.1) has a unique solution in $\mathcal{G}(V)$.
We split the proof in a series of arguments, the core of which are higher order energy estimates. To prepare for these, we first introduce suitable energy tensors and energy integrals.
4. Energy Integrals
By assumption, we have a point $p \in M$ and an open neighbourhood of $p$, $U$, and a map $t : U \to \mathbb{R}$ with $t(p) = 0$ such that $U$ is foliated by the level sets of the function $t$, $\Sigma_\tau := \{q \in U : t(q) = \tau\}$, $\tau \in [-\gamma, \gamma]$, for some $\gamma > 0$. Moreover, the level sets $\Sigma_\tau$ are space-like with respect to the generalised metric $g$. We now consider solving the forward in time initial value problem for the wave equation on $U$, i.e., with $\tau \ge 0$ (see Fig. 1). Given $p \in \Sigma = \Sigma_0$, let $\Omega$ be a neighbourhood of $p$ with the properties that $\bar\Omega \subset U$, and such that the boundary of the region $\Omega \cap \{q \in U : t(q) \ge 0\}$ is space-like$^1$. We denote by $S_\tau := \Sigma_\tau \cap \Omega$ and by $\Omega_\tau$ the open part of $\Omega$ between $\Sigma$ and $\Sigma_\tau$. We denote the part of the boundary of $\Omega_\tau$ with $0 \le t \le \tau$ by $S_{\Omega,\tau}$, so that $\partial\Omega_\tau = S_0 \cup S_\tau \cup S_{\Omega,\tau}$.
Notation. In order to simplify calculations, from now on we will adopt abstract index notation for (generalised) tensorial objects (see, e.g., [20]). In particular, representatives of the metric $g_\varepsilon$ and its inverse will be denoted by $g^\varepsilon_{ab}$ and $g_\varepsilon^{ab}$, respectively, and similarly for the corresponding Riemannian metric $e_\varepsilon$. We denote the representative of the
$^1$ The existence of such a set, $\Omega$, follows from the fact that $\|g_\varepsilon^{-1}\|_m = O(1)$ as $\varepsilon \to 0$. Geometrically, this condition means that the collection of timelike directions at a given point is not collapsing to the empty set.
Fig. 1. Local foliation of space-time
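generalised normal vector field, $\xi$, by $\xi_\varepsilon^a$, and the corresponding generalised covector field, $\sigma$, by $\xi^\varepsilon_a$. In addition, to simplify the notation for tensors we are going to use capital letters to abbreviate tuples of indices, i.e., we will write $T^I_J$ for $T^{p_1 \dots p_r}_{q_1 \dots q_s}$ with $|I| = r$, $|J| = s$. Also for $I$, $J$ of equal length, say $r$, we write $e_{IJ}$ for $e_{p_1 q_1} e_{p_2 q_2} \cdots e_{p_r q_r}$. We now use the Riemannian metric $e_\varepsilon$ and the covariant derivative with respect to $g_\varepsilon$ (which we have denoted by $\nabla^\varepsilon$) to define $\varepsilon$-dependent Sobolev norms on $U$.
Definition 4.1 (Sobolev norms). Let $T^I_J$ be a smooth tensor field and $u$ a smooth function on $U$, $\varepsilon > 0$, $0 \le \tau \le \gamma$ and $k, j \in \mathbb{N}_0$.
1. We define the “pointwise” norm of $T^I_J$ by
\[ \|T^I_J\|^2_{e_\varepsilon} := e^\varepsilon_{KL}\,e_\varepsilon^{IJ}\,T^K_I\,T^L_J \]
and the “pointwise norm” of covariant derivatives of $u$ by
\[ |\nabla_\varepsilon^{(j)} u|^2 := \|\nabla^\varepsilon_{p_1} \cdots \nabla^\varepsilon_{p_j} u\|^2_{e_\varepsilon} = e_\varepsilon^{p_1 q_1} \cdots e_\varepsilon^{p_j q_j}\,\big(\nabla^\varepsilon_{p_1} \cdots \nabla^\varepsilon_{p_j} u\big)\big(\nabla^\varepsilon_{q_1} \cdots \nabla^\varepsilon_{q_j} u\big). \]
2. On $\Omega_\tau$ we define Sobolev norms with respect to $\nabla^\varepsilon_a$ resp. partial derivatives by
\[ {}^{\nabla}\|u\|^k_{\Omega_\tau,\varepsilon} := \Bigg( \sum_{j=0}^{k} \int_{\Omega_\tau} |\nabla_\varepsilon^{(j)}(u)|^2\,\mu^\varepsilon \Bigg)^{1/2}, \qquad {}^{\partial}\|u\|^k_{\Omega_\tau,\varepsilon} := \Bigg( \sum_{\substack{p_1, \dots, p_j \\ 0 \le j \le k}} \int_{\Omega_\tau} |\partial_{p_1} \cdots \partial_{p_j} u|^2\,\mu^\varepsilon \Bigg)^{1/2}, \]
where $\mu^\varepsilon$ denotes the volume form derived from $g_\varepsilon$.
3. The respective “three-dimensional” Sobolev norms are defined by
\[ {}^{\nabla}\|u\|^k_{S_\tau,\varepsilon} := \Bigg( \sum_{j=0}^{k} \int_{S_\tau} |\nabla_\varepsilon^{(j)}(u)|^2\,\mu^\varepsilon_\tau \Bigg)^{1/2}, \qquad {}^{\partial}\|u\|^k_{S_\tau,\varepsilon} := \Bigg( \sum_{\substack{p_1, \dots, p_j \\ 0 \le j \le k}} \int_{S_\tau} |\partial_{p_1} \cdots \partial_{p_j} u|^2\,\mu^\varepsilon_\tau \Bigg)^{1/2}, \]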
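where $\mu^\varepsilon_\tau$ is the unique three-form induced on $S_\tau$ by $\mu^\varepsilon$ such that $dt \wedge \mu^\varepsilon_\tau = \mu^\varepsilon$ holds on $S_\tau$. Note that although the integration is performed over the three-dimensional manifold $S_\tau$ only, derivatives are not confined to directions tangential to $S_\tau$. Observe that, due to the use of a generalised metric, even the norms ${}^{\partial}\|u\|^k_{S_\tau,\varepsilon}$ depend on $\varepsilon$. However, due to Condition (A), with $k = 0$, they are equivalent to an $\varepsilon$-independent norm derived, for example, from the fixed background metric $m$. In the following, we will provide suitable higher order energy estimates for nets of solutions of the wave equation. These estimates are best expressed in terms of energy momentum tensors and energy integrals, which we define following [25, Sect. 4]. For the “classical” case, see [12, Sect. 7.4], [1, Sect. 4.4] and [11] for a recent review.
Definition 4.2 (Energy momentum tensors and energy integrals). Let $u \in C^\infty(U)$ and $k \in \mathbb{N}_0$. On $\Omega$ we define
1. the energy momentum tensors by ($k > 0$),
\begin{align*}
T_\varepsilon^{ab,0}(u) &:= -\tfrac{1}{2}\,g_\varepsilon^{ab}\,u^2, \\
T_\varepsilon^{ab,k}(u) &:= \Big(g_\varepsilon^{ac} g_\varepsilon^{bd} - \tfrac{1}{2}\,g_\varepsilon^{ab} g_\varepsilon^{cd}\Big)\,e_\varepsilon^{p_1 q_1} \cdots e_\varepsilon^{p_{k-1} q_{k-1}} \times \big(\nabla^\varepsilon_c \nabla^\varepsilon_{p_1} \cdots \nabla^\varepsilon_{p_{k-1}} u\big)\big(\nabla^\varepsilon_d \nabla^\varepsilon_{q_1} \cdots \nabla^\varepsilon_{q_{k-1}} u\big),
\end{align*}
2. the energy integrals by
\[ E^k_{\tau,\varepsilon}(u) := \sum_{j=0}^{k} \int_{S_\tau} T_\varepsilon^{ab,j}(u)\,\xi_a\,\hat\xi^\varepsilon_b\,\mu^\varepsilon_\tau, \qquad k \ge 0. \tag{4.1} \]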
It may be verified, by direct calculation, that the tensor fields $T_\varepsilon^{ab,k}(u)$ satisfy the dominant energy condition. Indeed it suffices to observe that, for any future-directed time-like vector field $U$, the expression $U^a U^b - \tfrac{1}{2}\,g_\varepsilon(U, U)\,g_\varepsilon^{ab}$ defines a Riemannian metric for fixed $\varepsilon$. For details, see Proposition 3.6 of [18]. For a generalised formulation of the dominant energy condition, see [17].
Remark 4.1. The energy momentum tensors introduced above are related to the superenergy tensors of Senovilla [23]. Omitting indices and $\varepsilon$’s for the moment, we construct the super-energy tensor, $S^k$, of type $(0, 2k)$ (see Definition 3.1 in [23]). Then $T^k(u)$ are the $(2k - 2)$-fold contraction of $S^k$ with the time-like vector field $\xi$. Theorem 4.1 of [23] then implies that the tensors $S^k$ ($k \ge 0$) satisfy the dominant super-energy property, from which it follows more elegantly that the $T^k(u)$ satisfy the dominant energy condition.
Remark 4.2. The energy integrals may be written in the more symmetrical form
\[ E^k_{\tau,\varepsilon}(u) := \sum_{j=0}^{k} \int_{S_\tau} T_\varepsilon^{ab,j}(u)\,\xi^\varepsilon_a\,\xi^\varepsilon_b\,\tilde\mu^\varepsilon_\tau, \]
where we have defined the volume element $\tilde\mu^\varepsilon_\tau = V_\varepsilon^{-1}\,\mu^\varepsilon_\tau$ on $S_\tau$. In terms of the decomposition of the metric given in Eq. (3.2), $\tilde\mu^\varepsilon_\tau$ is the volume form on $S_\tau$ defined by the three-dimensional metric $h_\varepsilon := h^\varepsilon_{ij}\,dx^i \otimes dx^j$.
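Since the part $S_{\Omega,\tau}$ of the boundary of $\Omega_\tau$ is space-like and $T_\varepsilon^{ab,j}(u)$ satisfies the dominant energy condition, an application of the Stokes theorem yields
\begin{align*}
\int_{\Omega_\tau} \nabla^\varepsilon_a\big(T_\varepsilon^{ab,j}(u)\,\xi_b\big)\,\mu^\varepsilon &= \int_{S_\tau} T_\varepsilon^{ab,j}(u)\,\xi_b\,\hat\xi^\varepsilon_a\,\mu^\varepsilon_\tau - \int_{S_0} T_\varepsilon^{ab,j}(u)\,\xi_b\,\hat\xi^\varepsilon_a\,\mu^\varepsilon_0 + \int_{S_{\Omega,\tau}} T_\varepsilon^{ab,j}(u)\,\xi_b\,n^\varepsilon_a\,dS_\varepsilon \\
&\ge \int_{S_\tau} T_\varepsilon^{ab,j}(u)\,\xi_b\,\hat\xi^\varepsilon_a\,\mu^\varepsilon_\tau - \int_{S_0} T_\varepsilon^{ab,j}(u)\,\xi_b\,\hat\xi^\varepsilon_a\,\mu^\varepsilon_0,
\end{align*}
where $n_\varepsilon$ and $dS_\varepsilon$ denote the unit normal and surface element on $\partial\Omega_\tau$, respectively. Hence summing over $j$ we have the following energy inequality for each $\varepsilon > 0$ and each $0 \le \tau \le \gamma$,
\[ E^k_{\tau,\varepsilon}(u) \le E^k_{\tau=0,\varepsilon}(u) + \sum_{j=0}^{k} \int_{\Omega_\tau} \Big( \xi_b\,\nabla^\varepsilon_a T_\varepsilon^{ab,j}(u) + T_\varepsilon^{ab,j}(u)\,\nabla^\varepsilon_a \xi_b \Big)\,\mu^\varepsilon. \tag{4.2} \]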
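Note that the energy integrals and foliation used here correspond closely to those used in [12, Sect. 4.3]. In [25, pp. 1341], due to a different choice of foliation, inequality (4.2) is replaced with an equality. This alternative foliation allows one to work without the explicit use of the dominant energy condition, but complicates some of the resulting energy estimates. To end this section, we prove the equivalence of the Sobolev norms and the energy integrals. Note that this result is the analogue of Lemma 1 in [25] for our class of metrics, and is one of the key estimates in our approach.
Lemma 4.1 (Energy integrals and Sobolev norms).
1. There exist constants $A$, $A'$ such that for each $k \ge 0$,
\[ A'\,\big({}^{\nabla}\|u\|^k_{S_\tau,\varepsilon}\big)^2 \le E^k_{\tau,\varepsilon}(u) \le A\,\big({}^{\nabla}\|u\|^k_{S_\tau,\varepsilon}\big)^2. \tag{4.3} \]
2. For each $k \ge 1$, there exist positive constants $B_k$, $B'_k$ such that
\[ \big({}^{\nabla}\|u\|^k_{S_\tau,\varepsilon}\big)^2 \le B_k \sum_{j=1}^{k} \frac{1}{\varepsilon^{2(k-j)}}\,\big({}^{\partial}\|u\|^j_{S_\tau,\varepsilon}\big)^2, \tag{4.4} \]
\[ \big({}^{\partial}\|u\|^k_{S_\tau,\varepsilon}\big)^2 \le B'_k \sum_{j=1}^{k} \frac{1}{\varepsilon^{2(k-j)}}\,\big({}^{\nabla}\|u\|^j_{S_\tau,\varepsilon}\big)^2. \tag{4.5} \]
For $k = 0$ we simply have $\big({}^{\nabla}\|u\|^0_{S_\tau,\varepsilon}\big)^2 = \big({}^{\partial}\|u\|^0_{S_\tau,\varepsilon}\big)^2$.
Proof. (1) For $k = 0$ we have
\[ T_\varepsilon^{ab,0}(u)\,\xi_a\,\hat\xi^\varepsilon_b = -\tfrac{1}{2}\,g_\varepsilon^{ab}\,\xi_a\,\hat\xi^\varepsilon_b\,u^2 = \tfrac{1}{2}\sqrt{-g_\varepsilon(\xi_\varepsilon, \xi_\varepsilon)}\;u^2 = \frac{V_\varepsilon}{2}\,u^2, \]
hence by (2.4), setting $A := M_0/2$ and $A' := 1/(2M_0)$, we obtain
\[ A'\,u^2 \le T_\varepsilon^{ab,0}(u)\,\xi_a\,\hat\xi^\varepsilon_b \le A\,u^2, \]
which upon integrating gives the result.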
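For the case $k > 0$ note that
\[ \Big(g_\varepsilon^{ac} g_\varepsilon^{bd} - \tfrac{1}{2}\,g_\varepsilon^{ab} g_\varepsilon^{cd}\Big)\,\xi_a\,\hat\xi^\varepsilon_b = \tfrac{1}{2}\,V_\varepsilon \Big(g_\varepsilon^{cd} + \frac{2}{V_\varepsilon^2}\,\xi_\varepsilon^c \xi_\varepsilon^d\Big) = \tfrac{1}{2}\,V_\varepsilon\,e_\varepsilon^{cd}. \]
Hence, we may write
\begin{align*}
T_\varepsilon^{ab,j}(u)\,\xi_a\,\hat\xi^\varepsilon_b &= \tfrac{1}{2}\,V_\varepsilon\,e_\varepsilon^{cd}\,e_\varepsilon^{p_1 q_1} \cdots e_\varepsilon^{p_{j-1} q_{j-1}}\,\big(\nabla^\varepsilon_c \nabla^\varepsilon_{p_1} \cdots \nabla^\varepsilon_{p_{j-1}} u\big)\big(\nabla^\varepsilon_d \nabla^\varepsilon_{q_1} \cdots \nabla^\varepsilon_{q_{j-1}} u\big) \\
&= \tfrac{1}{2}\,V_\varepsilon\,|\nabla_\varepsilon^{(j)} u|^2.
\end{align*}
Using (A), this implies that
\[ A'\,|\nabla_\varepsilon^{(j)} u|^2 \le T_\varepsilon^{ab,j}(u)\,\xi_a\,\hat\xi^\varepsilon_b \le A\,|\nabla_\varepsilon^{(j)} u|^2, \]
which upon summation and integration establishes the claim. (2) follows by (A) from the fact that on the compact closure of $\Omega$ the metrics $e_\varepsilon$ and $\delta_{ab}$ are equivalent and the Christoffel symbols and their derivatives are bounded by the respective inverse powers of $\varepsilon$.
5. Energy Estimates
In this section, we establish the core estimates needed in the proof of our main theorem.
Proposition 5.1. Let $u_\varepsilon$ be a solution of (3.3) on $U$. Then, for each $k \ge 1$, there exist positive constants $C'_k$, $C''_k$, $C'''_k$ such that for each $0 \le \tau \le \gamma$ we have
\[ E^k_{\tau,\varepsilon}(u_\varepsilon) \le E^k_{0,\varepsilon}(u_\varepsilon) + C'_k\,\big({}^{\nabla}\|f_\varepsilon\|^{k-1}_{\Omega_\tau,\varepsilon}\big)^2 + C''_k \sum_{j=1}^{k-1} \frac{1}{\varepsilon^{2(1+k-j)}} \int_{\zeta=0}^{\tau} E^j_{\zeta,\varepsilon}(u_\varepsilon)\,d\zeta + C'''_k \int_{\zeta=0}^{\tau} E^k_{\zeta,\varepsilon}(u_\varepsilon)\,d\zeta. \tag{5.1} \]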
Before proving this statement, we draw the essential conclusions from it. Observe that the constant in front of the highest order term on the r.h.s. does not depend on $\varepsilon$, hence we obtain the following by an application of Gronwall’s lemma.
Corollary 5.1. Let $u_\varepsilon$ be a solution of (3.3) on $U$. Then, for each $k \ge 1$, there exist positive constants $C'_k$, $C''_k$, $C'''_k$ such that for each $0 \le \tau \le \gamma$,
\[ E^k_{\tau,\varepsilon}(u_\varepsilon) \le \Bigg( E^k_{0,\varepsilon}(u_\varepsilon) + C'_k\,\big({}^{\nabla}\|f_\varepsilon\|^{k-1}_{\Omega_\tau,\varepsilon}\big)^2 + C''_k \sum_{j=1}^{k-1} \frac{1}{\varepsilon^{2(1+k-j)}} \int_{\zeta=0}^{\tau} E^j_{\zeta,\varepsilon}(u_\varepsilon)\,d\zeta \Bigg)\,e^{C'''_k \tau}. \tag{5.2} \]
This statement immediately implies the main result in this section.
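The Gronwall step that turns an integral inequality of the type (5.1) into an exponential bound of the type (5.2) can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions and not part of the proof: the function `damping`, the constants `a` and `c`, and the explicit Euler discretisation are hypothetical choices. It builds a function $E$ with $E(\tau) \le a + c \int_0^\tau E\,d\zeta$ and verifies $E(\tau) \le a\,e^{c\tau}$.

```python
import numpy as np

def damping(tau):
    # Hypothetical non-negative loss term; it makes the integral inequality strict.
    return 0.5 * (1.0 + np.sin(5.0 * tau))

def gronwall_check(a, c, gamma=1.0, n=20001):
    """Return (hypothesis_holds, conclusion_holds) for a toy energy E(tau)."""
    tau = np.linspace(0.0, gamma, n)
    h = tau[1] - tau[0]
    energy = np.empty(n)
    energy[0] = a
    # Explicit Euler for E' = c * E - damping, so E(tau) <= a + c * int_0^tau E.
    for i in range(n - 1):
        energy[i + 1] = energy[i] + h * (c * energy[i] - damping(tau[i]))
    # Trapezoidal approximation of the running integral of E.
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (energy[1:] + energy[:-1]) * h)))
    tol = 1e-9 * a
    hypothesis = bool(np.all(energy <= a + c * integral + tol))
    conclusion = bool(np.all(energy <= a * np.exp(c * tau) + tol))
    return hypothesis, conclusion

# The initial value a plays the role of a possibly large (moderate) initial energy.
results = [gronwall_check(a, 2.0) for a in (1.0, 100.0)]
```

Because the exponential factor depends only on `c` and `tau`, a bound of order $\varepsilon^{-N}$ on the initial value `a` gives a bound of the same order on the whole interval, which is how moderateness is propagated in the corollary that follows.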
Corollary 5.2. Let $u_\varepsilon$ be a solution of (3.3) on $U$. If, for all $k \ge 1$, the initial energy $(E^k_{0,\varepsilon}(u_\varepsilon))_\varepsilon$ is a moderate resp. negligible net of real numbers, and $(f_\varepsilon)_\varepsilon$ is negligible, then
\[ \sup_{0 \le \tau \le \gamma} \big(E^k_{\tau,\varepsilon}(u_\varepsilon)\big)_\varepsilon \]
is moderate resp. negligible.
Proof of Proposition 5.1. We begin by estimating the second integrand on the r.h.s. of Eq. (4.2). Using the fact that the energy tensors are symmetric then, by an application of the Cauchy-Schwarz inequality to the inner product induced on the tensor bundle $\mathcal{T}^2_0(M)$ by the metric $e_\varepsilon$, we deduce that
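\[ \big| T_\varepsilon^{ab,j}(u_\varepsilon)\,\nabla^\varepsilon_a \xi_b \big| \le \|T_\varepsilon^{ab,j}(u_\varepsilon)\|_{e_\varepsilon}\,\|\nabla^\varepsilon_{(a} \xi_{b)}\|_{e_\varepsilon} = \tfrac{1}{2}\,\|T_\varepsilon^{ab,j}(u_\varepsilon)\|_{e_\varepsilon}\,\|L_{\xi_\varepsilon} g_\varepsilon\|_{e_\varepsilon}. \tag{5.3} \]
Equation (2.6) implies that there exists a constant $K > 0$ such that $\|L_{\xi_\varepsilon} g_\varepsilon\|_{e_\varepsilon} \le K$. In the case $j = 0$, we have
\[ \|T_\varepsilon^{ab,0}(u_\varepsilon)\|^2_{e_\varepsilon} = \Big(-\tfrac{1}{2}\Big)^2 u_\varepsilon^4\,e^\varepsilon_{ab}\,e^\varepsilon_{cd}\,g_\varepsilon^{ac}\,g_\varepsilon^{bd} = u_\varepsilon^4, \]
so $\|T_\varepsilon^{ab,0}(u_\varepsilon)\|_{e_\varepsilon} = u_\varepsilon^2 = |\nabla_\varepsilon^{(0)} u_\varepsilon|^2$. For $j \ge 1$, we have
\begin{align*}
\|T_\varepsilon^{ab,j}(u_\varepsilon)\|^2_{e_\varepsilon} &= e^\varepsilon_{aa'}\,e^\varepsilon_{bb'} \Big(g_\varepsilon^{ac} g_\varepsilon^{bd} - \tfrac{1}{2}\,g_\varepsilon^{ab} g_\varepsilon^{cd}\Big) \Big(g_\varepsilon^{a'c'} g_\varepsilon^{b'd'} - \tfrac{1}{2}\,g_\varepsilon^{a'b'} g_\varepsilon^{c'd'}\Big) \\
&\qquad \times \nabla_c \nabla_I u_\varepsilon\,\nabla_d \nabla_J u_\varepsilon\,\nabla_{c'} \nabla_{I'} u_\varepsilon\,\nabla_{d'} \nabla_{J'} u_\varepsilon\,e_\varepsilon^{IJ}\,e_\varepsilon^{I'J'} \\
&= e_\varepsilon^{cc'}\,e_\varepsilon^{dd'}\,\nabla_c \nabla_I u_\varepsilon\,\nabla_d \nabla_J u_\varepsilon\,\nabla_{c'} \nabla_{I'} u_\varepsilon\,\nabla_{d'} \nabla_{J'} u_\varepsilon\,e_\varepsilon^{IJ}\,e_\varepsilon^{I'J'} \\
&\le A_j^2\,|\nabla_\varepsilon^{(j)} u_\varepsilon|^4,
\end{align*}
where $A_j \sim 4^j$ are combinatorial constants. Letting $A_0 := 1$, we deduce that
\[ \big| T_\varepsilon^{ab,j}(u_\varepsilon)\,\nabla^\varepsilon_a \xi_b \big| \le \tfrac{1}{2}\,A_j\,K\,|\nabla_\varepsilon^{(j)} u_\varepsilon|^2, \qquad \text{for } j \ge 0. \]
Letting $\tilde{A}_k := \max_{j=0,\dots,k} A_j$, we therefore find that
\[ \sum_{j=0}^{k} \int_{\Omega_\tau} \big| T_\varepsilon^{ab,j}(u_\varepsilon)\,\nabla^\varepsilon_a \xi_b \big|\,\mu^\varepsilon \le \tfrac{1}{2}\,\tilde{A}_k\,K\,\big({}^{\nabla}\|u_\varepsilon\|^k_{\Omega_\tau,\varepsilon}\big)^2 \tag{5.4} \]
for $k \ge 0$.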
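We now consider the first integrand on the r.h.s. of (4.2). Beginning with the case $k = 1$, the divergence terms that we require take the form
\begin{align*}
\nabla^\varepsilon_a T_\varepsilon^{ab,0}(u_\varepsilon) &= -\tfrac{1}{2}\,(\nabla^\varepsilon_a g_\varepsilon^{ab})\,u_\varepsilon^2 - \tfrac{1}{2}\,g_\varepsilon^{ab}\,(2 u_\varepsilon \nabla^\varepsilon_a u_\varepsilon) = -u_\varepsilon\,\nabla_\varepsilon^b u_\varepsilon, \\
\nabla^\varepsilon_a T_\varepsilon^{ab,1}(u_\varepsilon) &= \Big(g_\varepsilon^{ac} g_\varepsilon^{bd} - \tfrac{1}{2}\,g_\varepsilon^{ab} g_\varepsilon^{cd}\Big) \big(\nabla^\varepsilon_a \nabla^\varepsilon_c u_\varepsilon\,\nabla^\varepsilon_d u_\varepsilon + \nabla^\varepsilon_c u_\varepsilon\,\nabla^\varepsilon_a \nabla^\varepsilon_d u_\varepsilon\big) \\
&= \nabla_\varepsilon^c \nabla^\varepsilon_c u_\varepsilon\,\nabla_\varepsilon^b u_\varepsilon = (\Box_\varepsilon u_\varepsilon)\,\nabla_\varepsilon^b u_\varepsilon = f_\varepsilon\,\nabla_\varepsilon^b u_\varepsilon.
\end{align*}
Inserting this and the $k = 1$ form of (5.4) into (4.2) yields
\begin{align*}
E^1_{\tau,\varepsilon}(u_\varepsilon) &\le E^1_{0,\varepsilon}(u_\varepsilon) + \tfrac{1}{2}\,\tilde{A}_1 K\,\big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2 + \int_{\Omega_\tau} \xi_\varepsilon^a \nabla^\varepsilon_a u_\varepsilon\,(f_\varepsilon - u_\varepsilon)\,\mu^\varepsilon \\
&\le E^1_{0,\varepsilon}(u_\varepsilon) + \tfrac{1}{2}\,\tilde{A}_1 K\,\big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2 + \Big(\int_{\Omega_\tau} (\xi_\varepsilon^a \nabla^\varepsilon_a u_\varepsilon)^2\,\mu^\varepsilon\Big)^{1/2} \Big(\int_{\Omega_\tau} |f_\varepsilon - u_\varepsilon|^2\,\mu^\varepsilon\Big)^{1/2} \\
&\le E^1_{0,\varepsilon}(u_\varepsilon) + \tfrac{1}{2}\,\tilde{A}_1 K\,\big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2 + M_0 \Big(\int_{\Omega_\tau} |\nabla_\varepsilon^{(1)} u_\varepsilon|^2\,\mu^\varepsilon\Big)^{1/2} \Bigg[ \Big(\int_{\Omega_\tau} |f_\varepsilon|^2\,\mu^\varepsilon\Big)^{1/2} + \Big(\int_{\Omega_\tau} |u_\varepsilon|^2\,\mu^\varepsilon\Big)^{1/2} \Bigg] \\
&\le E^1_{0,\varepsilon}(u_\varepsilon) + \tfrac{1}{2}\,\tilde{A}_1 K\,\big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2 + \frac{M_0}{2} \Bigg[ \int_{\Omega_\tau} \big(|\nabla_\varepsilon^{(1)} u_\varepsilon|^2 + |u_\varepsilon|^2\big)\,\mu^\varepsilon + \int_{\Omega_\tau} |\nabla_\varepsilon^{(1)} u_\varepsilon|^2\,\mu^\varepsilon + \int_{\Omega_\tau} |f_\varepsilon|^2\,\mu^\varepsilon \Bigg] \\
&\le E^1_{0,\varepsilon}(u_\varepsilon) + \frac{M_0}{2}\,\big({}^{\nabla}\|f_\varepsilon\|^0_{\Omega_\tau,\varepsilon}\big)^2 + \Big(M_0 + \tfrac{1}{2}\,\tilde{A}_1 K\Big) \big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2,
\end{align*}
where we have repeatedly used the Cauchy-Schwarz inequality. Now we use (4.3) to obtain
\[ \big({}^{\nabla}\|u_\varepsilon\|^1_{\Omega_\tau,\varepsilon}\big)^2 = \int_{\zeta=0}^{\tau} \big({}^{\nabla}\|u_\varepsilon\|^1_{S_\zeta,\varepsilon}\big)^2\,d\zeta \le \frac{1}{A'} \int_{\zeta=0}^{\tau} E^1_{\zeta,\varepsilon}(u_\varepsilon)\,d\zeta. \]
Setting $C'_1 := M_0/2$, $C''_1 = 0$, and $C'''_1 := (M_0 + \tfrac{1}{2}\tilde{A}_1 K)/A' = 2M_0\,(M_0 + \tfrac{1}{2}\tilde{A}_1 K)$ yields the claim for $k = 1$. We now turn to the case $k > 1$. We first derive an estimate for $\xi_b\,\nabla^\varepsilon_a T_\varepsilon^{ab,k}(u_\varepsilon) = I_1 + I_2 + I_3$, where we have defined
\begin{align*}
I_1 &:= \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \big(\nabla^\varepsilon_a e_\varepsilon^{IJ}\big) (\nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon)(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon), \\
I_2 &:= -2\,e_\varepsilon^{IJ}\,\big(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big)\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\,\nabla^\varepsilon_{[a} \nabla^\varepsilon_{c]} \nabla^\varepsilon_I u_\varepsilon, \\
I_3 &:= e_\varepsilon^{IJ}\,\big(\xi_\varepsilon^d \nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big)\,g_\varepsilon^{ac}\,\nabla^\varepsilon_a \nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon.
\end{align*}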
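The strategy is, again, to remove the terms involving derivatives of order $k + 1$ using the wave equation. This requires interchanging the order of covariant derivatives, and therefore introduces additional curvature terms. We now calculate the moduli of the terms $I_1$, $I_2$, $I_3$ separately. We begin by estimating $|I_1|$:
\begin{align*}
|I_1| &= \Big| \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \big(\nabla^\varepsilon_a e_\varepsilon^{IJ}\big) (\nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon)(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon) \Big| \\
&\le \Big\| \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \nabla^\varepsilon_a e_\varepsilon^{IJ} \Big\|_{e_\varepsilon} \cdot \big\| (\nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon)(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon) \big\|_{e_\varepsilon} \\
&= \Big\| \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \nabla^\varepsilon_a e_\varepsilon^{IJ} \Big\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon|^2,
\end{align*}
where the inequality in the second line results from applying the Cauchy-Schwarz inequality to the inner product induced on the tensor bundle $\mathcal{T}^{2k}_0(M)$ by the metric $e_\varepsilon$. The square of the first term may then be evaluated as
\begin{align*}
\Big\| \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \nabla^\varepsilon_a e_\varepsilon^{IJ} \Big\|^2_{e_\varepsilon} &= e^\varepsilon_{cc'}\,e^\varepsilon_{dd'}\,e^\varepsilon_{II'}\,e^\varepsilon_{JJ'} \Big(g_\varepsilon^{ac}\,\xi_\varepsilon^d - \tfrac{1}{2}\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\Big) \nabla^\varepsilon_a e_\varepsilon^{IJ} \\
&\qquad \times \Big(g_\varepsilon^{a'c'}\,\xi_\varepsilon^{d'} - \tfrac{1}{2}\,\xi_\varepsilon^{a'}\,g_\varepsilon^{c'd'}\Big) \nabla^\varepsilon_{a'} e_\varepsilon^{I'J'} \\
&= \|\xi_\varepsilon\|^2_{e_\varepsilon}\,e_\varepsilon^{aa'}\,e^\varepsilon_{II'}\,e^\varepsilon_{JJ'}\,\nabla^\varepsilon_a e_\varepsilon^{IJ}\,\nabla^\varepsilon_{a'} e_\varepsilon^{I'J'} \\
&= \|\xi_\varepsilon\|^2_{e_\varepsilon} \cdot \big\|\nabla^\varepsilon_a e_\varepsilon^{IJ}\big\|^2_{e_\varepsilon}.
\end{align*}
We now note that, by Condition (A) and Eq. (2.5) of Sect. 2.3, we have
\[ \big\|\nabla^\varepsilon_a e_\varepsilon^{IJ}\big\|_{e_\varepsilon} = O(1) \quad (\varepsilon \to 0). \]
In particular, on each compact set there exists a positive constant, $C_k$, such that $\|\nabla^\varepsilon_a e_\varepsilon^{IJ}\|_{e_\varepsilon} \le C_k$, as $\varepsilon \to 0$. Therefore, we have the following estimate for $I_1$:
\[ |I_1| \le C_k \cdot \|\xi_\varepsilon\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon|^2 \le C_k\,M_0 \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon|^2, \tag{5.5} \]
locally, as $\varepsilon \to 0$. Next we turn to $I_2$. We then have
\begin{align*}
|I_2| &= \Big| 2\,e_\varepsilon^{IJ}\,\big(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big)\,\xi_\varepsilon^a\,g_\varepsilon^{cd}\,\nabla^\varepsilon_{[a} \nabla^\varepsilon_{c]} \nabla^\varepsilon_I u_\varepsilon \Big| \\
&= \Big| e_\varepsilon^{IJ}\,\big(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big)\,\xi_\varepsilon^a\,e_\varepsilon^{cd}\,\big[\nabla^\varepsilon_a, \nabla^\varepsilon_c\big] \nabla^\varepsilon_I u_\varepsilon \Big| \\
&= \Big| e_\varepsilon^{IJ}\,\xi_\varepsilon^a\,e_\varepsilon^{cd}\,\big(\nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big) \big[\nabla^\varepsilon_a, \nabla^\varepsilon_c\big] \nabla^\varepsilon_I u_\varepsilon \Big| \\
&\le \big\| \xi_\varepsilon^a\,\nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon} \cdot \big\| \big[\nabla^\varepsilon_a, \nabla^\varepsilon_c\big] \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon} \\
&= \|\xi_\varepsilon\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon| \cdot \big\| \big[\nabla^\varepsilon_a, \nabla^\varepsilon_c\big] \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon},
\end{align*}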
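where the equality in the first line follows from skew-symmetry in $a$, $c$, and the inequality on the third line follows from the Cauchy-Schwarz inequality. Moreover, from Condition (A), we have the following estimates for the curvature on compact sets:
\[ \big\| \nabla^\varepsilon_{a_1} \cdots \nabla^\varepsilon_{a_l} R^\varepsilon_{ab}{}^{c}{}_{d} \big\|_{e_\varepsilon} \le \frac{F_l}{\varepsilon^{2+l}}, \qquad l \ge 0. \tag{5.6} \]
Using this estimate with $l = 0$ and the Ricci identity, we deduce the existence of a combinatorial constant $N_k$ depending only on $k$ such that
\[ \big\| \big[\nabla^\varepsilon_a, \nabla^\varepsilon_c\big] \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon} \le N_k\,\frac{F_0}{\varepsilon^2}\,|\nabla_\varepsilon^{(k-1)} u_\varepsilon|. \]
Hence, we have
\[ |I_2| \le N_k\,\frac{F_0}{\varepsilon^2}\,\|\xi_\varepsilon\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon| \cdot |\nabla_\varepsilon^{(k-1)} u_\varepsilon| \le \frac{N_k F_0 M_0}{2} \Big( |\nabla_\varepsilon^{(k)} u_\varepsilon|^2 + \frac{1}{\varepsilon^4}\,|\nabla_\varepsilon^{(k-1)} u_\varepsilon|^2 \Big) \tag{5.7} \]
on compact sets. Finally, we consider the term $I_3$. We then have, by the Cauchy-Schwarz inequality
\begin{align*}
|I_3| &= \Big| e_\varepsilon^{IJ}\,\big(\xi_\varepsilon^d \nabla^\varepsilon_d \nabla^\varepsilon_J u_\varepsilon\big)\,g_\varepsilon^{ac}\,\nabla^\varepsilon_a \nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon \Big| \\
&\le \big\| \xi_\varepsilon^d \nabla^\varepsilon_d \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon} \cdot \big\| g_\varepsilon^{ac}\,\nabla^\varepsilon_a \nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon} \\
&\le P_k\,\|\xi_\varepsilon\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon| \cdot \big\| g_\varepsilon^{ac}\,\nabla^\varepsilon_a \nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon \big\|_{e_\varepsilon},
\end{align*}
where $P_k$ is a combinatorial constant depending only on $k$. Again using the Ricci identities, and the fact that $u_\varepsilon$ is a solution of (3.3), we may write
\[ g_\varepsilon^{ac}\,\nabla^\varepsilon_a \nabla^\varepsilon_c \nabla^\varepsilon_I u_\varepsilon = \nabla^\varepsilon_I f_\varepsilon + \sum_{j=1}^{k-1} R^{(k-1,j)}_{\varepsilon\,I}\,u_\varepsilon, \]
where $R^{(k-1,j)}_\varepsilon u_\varepsilon$ denotes a linear combination of contractions of the $(k - j - 1)$th covariant derivative of the Riemann tensor with the $j$th covariant derivative of $u_\varepsilon$. A second appeal to (5.6) implies that on each compact set there exists a constant $G_k$ such that
\[ \big\| R^{(k-1,j)}_{\varepsilon\,I}\,u_\varepsilon \big\|_{e_\varepsilon} \le \frac{G_k}{\varepsilon^{k-j+1}}\,|\nabla_\varepsilon^{(j)} u_\varepsilon| \quad (\varepsilon \to 0). \]
We therefore have
\begin{align*}
|I_3| &\le P_k\,\|\xi_\varepsilon\|_{e_\varepsilon} \cdot |\nabla_\varepsilon^{(k)} u_\varepsilon| \Bigg( |\nabla_\varepsilon^{(k-1)} f_\varepsilon| + G_k \sum_{j=1}^{k-1} \frac{1}{\varepsilon^{1+k-j}}\,|\nabla_\varepsilon^{(j)} u_\varepsilon| \Bigg) \\
&\le \frac{P_k M_0}{2} \Bigg( k\,|\nabla_\varepsilon^{(k)} u_\varepsilon|^2 + |\nabla_\varepsilon^{(k-1)} f_\varepsilon|^2 + G_k^2 \sum_{j=1}^{k-1} \frac{1}{\varepsilon^{2(1+k-j)}}\,|\nabla_\varepsilon^{(j)} u_\varepsilon|^2 \Bigg). \tag{5.8}
\end{align*}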
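Putting together (5.5), (5.7), and (5.8), we have
\[ \big| \xi_b\,\nabla^\varepsilon_a T_\varepsilon^{ab,k}(u_\varepsilon) \big| \le \alpha_k\,|\nabla_\varepsilon^{(k)} u_\varepsilon|^2 + \beta_k\,|\nabla_\varepsilon^{(k-1)} f_\varepsilon|^2 + \gamma_k \sum_{j=1}^{k-1} \frac{|\nabla_\varepsilon^{(j)} u_\varepsilon|^2}{\varepsilon^{2(1+k-j)}}, \]
for positive constants $\alpha_k$, $\beta_k$, $\gamma_k$. Summation over $k = 1 \dots m$ and integration yields positive constants $\tilde\alpha_m$, $\tilde\beta_m$, $\tilde\gamma_m$ such that
\[ \sum_{k=0}^{m} \int_{\Omega_\tau} \big| \xi_b\,\nabla^\varepsilon_a T_\varepsilon^{ab,k}(u_\varepsilon) \big|\,\mu^\varepsilon \le \tilde\alpha_m\,\big({}^{\nabla}\|u_\varepsilon\|^m_{\Omega_\tau,\varepsilon}\big)^2 + \tilde\beta_m\,\big({}^{\nabla}\|f_\varepsilon\|^{m-1}_{\Omega_\tau,\varepsilon}\big)^2 + \tilde\gamma_m \sum_{j=1}^{m-1} \frac{1}{\varepsilon^{2(1+m-j)}}\,\big({}^{\nabla}\|u_\varepsilon\|^j_{\Omega_\tau,\varepsilon}\big)^2. \]
On substituting this inequality and (5.4) into Eq. (4.2), we deduce that
\[ E^m_{\tau,\varepsilon}(u_\varepsilon) \le E^m_{0,\varepsilon}(u_\varepsilon) + \Big(\tilde\alpha_m + \tfrac{1}{2}\,\tilde{A}_m K\Big) \big({}^{\nabla}\|u_\varepsilon\|^m_{\Omega_\tau,\varepsilon}\big)^2 + \tilde\beta_m\,\big({}^{\nabla}\|f_\varepsilon\|^{m-1}_{\Omega_\tau,\varepsilon}\big)^2 + \tilde\gamma_m \sum_{j=1}^{m-1} \frac{1}{\varepsilon^{2(1+m-j)}}\,\big({}^{\nabla}\|u_\varepsilon\|^j_{\Omega_\tau,\varepsilon}\big)^2. \tag{5.9} \]
As in the case with $k = 1$, we may use Lemma 4.1 to write
\[ \big({}^{\nabla}\|u_\varepsilon\|^j_{\Omega_\tau,\varepsilon}\big)^2 = \int_{\zeta=0}^{\tau} \big({}^{\nabla}\|u_\varepsilon\|^j_{S_\zeta,\varepsilon}\big)^2\,d\zeta \le \frac{1}{A'} \int_{\zeta=0}^{\tau} E^j_{\zeta,\varepsilon}(u_\varepsilon)\,d\zeta, \]
for $j = 1, \dots, m$. Substituting these relations into (5.9) yields the inequality (5.1), with $C'_m := \tilde\beta_m$, $C''_m := \tilde\gamma_m/A'$ and $C'''_m := (\tilde\alpha_m + \tfrac{1}{2}\tilde{A}_m K)/A'$.
Remark 5.1. As can be seen from the expression for $\nabla^\varepsilon_a T_\varepsilon^{ab,0}(u_\varepsilon)$, there is no estimate of the form (5.1) for $E^0_{\tau,\varepsilon}(u_\varepsilon)$. However, $E^0_{\tau,\varepsilon}(u_\varepsilon)$ is estimated in terms of $E^k_{\tau,\varepsilon}(u_\varepsilon)$, with $k \ge 1$; a fact that is implicit in Proposition 5.1.
6. Auxiliary Estimates
In this section, we complement the energy inequalities derived in Sect. 5 with estimates that allow us to utilise the former in the proof of the main result. In particular, we shall prove that
(i) suitable bounds on the initial data give suitable bounds on the initial energies $E^k_{0,\varepsilon}(u_\varepsilon)$;
(ii) suitable bounds on the energies $E^k_{\tau,\varepsilon}(u_\varepsilon)$ give suitable bounds on the solution $u_\varepsilon$.
The existence as well as the uniqueness part of the proof of the main theorem will then use (i) combined with Corollary 5.2 and (ii) to establish moderateness resp. negligibility of the candidate solution.
Lemma 6.1 (Bounds on initial energies from initial data). Let $u_\varepsilon$ be a solution of (3.3). If $(v_\varepsilon)_\varepsilon$, $(w_\varepsilon)_\varepsilon$ are moderate resp. negligible, then the initial energies $(E^k_{0,\varepsilon}(u_\varepsilon))_\varepsilon$, for each $k \ge 0$, are moderate resp. negligible nets of real numbers.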
The Wave Equation on Singular Space-Times
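Proof. The estimates for the spatial derivatives $\partial_{x^{i_1}} \cdots \partial_{x^{i_k}} u_\varepsilon(0, x^i) = \partial_{x^{i_1}} \cdots \partial_{x^{i_k}} v_\varepsilon(x^i)$ are immediate. To estimate $\partial_t \partial_{x^{i_1}} \cdots \partial_{x^{i_k}} u_\varepsilon(0, x^i)$, we rewrite the initial conditions in Eq. (3.3) in the form $u_\varepsilon(t = 0, x^i) = v_\varepsilon(x^i)$, $\partial_t u_\varepsilon(t = 0, x^i) = \tilde w_\varepsilon(x^i)$, where we define $\tilde w_\varepsilon := V_\varepsilon w_\varepsilon - N^i_\varepsilon \partial_{x^i} v_\varepsilon$. It is straightforward to show, using the asymptotic estimates (3.4), that $(v_\varepsilon, w_\varepsilon)$ being moderate resp. negligible implies moderateness resp. negligibility of $(v_\varepsilon, \tilde w_\varepsilon)$. Therefore moderateness resp. negligibility of $(v_\varepsilon, w_\varepsilon)$ implies moderateness resp. negligibility of $\partial_t \partial_{x^{i_1}} \cdots \partial_{x^{i_k}} u_\varepsilon(0, x^i) \equiv \partial_{x^{i_1}} \cdots \partial_{x^{i_k}} \tilde w_\varepsilon(x^i)$. The estimates for higher (mixed) time derivatives follow inductively by rewriting the wave equation in the form
$$\partial_t^2 u_\varepsilon = -V_\varepsilon^2 \left( f_\varepsilon + \frac{2}{V_\varepsilon^2} N^i_\varepsilon\, \partial_t \partial_i u_\varepsilon - \Big( h^{ij}_\varepsilon - \frac{1}{V_\varepsilon^2} N^i_\varepsilon N^j_\varepsilon \Big) \partial_i \partial_j u_\varepsilon + g^{ab}_\varepsilon\, \Gamma[g_\varepsilon]^c{}_{ab}\, \frac{\partial u_\varepsilon}{\partial x^c} \right)$$
and using again the estimates (3.4) for $V_\varepsilon$, $N^i_\varepsilon$, $h^{ij}_\varepsilon$ as well as $f_\varepsilon$, $v_\varepsilon$, $w_\varepsilon$.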
Lemma 6.2 (Bounds on solutions from bounds on energies). For $m > 3/2$ an integer, there exists a constant $K$ and number $N$ such that for all $u \in C^\infty(\Omega_\tau)$ and for all $\zeta \in [0, \tau]$ we have
$$\sup_{x \in \Omega_\tau} |\partial_{x^{a_1}} \cdots \partial_{x^{a_l}} u(x)| \le K \varepsilon^{-N} \sup_{0 \le \zeta \le \tau} E^{m+l}_{\zeta,\varepsilon}(u).$$

Remark 6.1. Note that the statement is for all $u \in C^\infty(\Omega_\tau)$. In the proof of the main theorem, we will apply it to a solution, $u_\varepsilon$, of the wave equation.

Proof of Lemma 6.2. First we combine the standard Sobolev embedding theorem on $S_\tau$ with the fact that by assumption (A) the metric and hence the volume is $O(1)$ to obtain for $m > 3/2$,
$$\sup_{x \in S_\zeta} |u(x)| \le K\, {}^{\partial}\|u\|^{m}_{S_\zeta,\varepsilon}. \tag{6.1}$$
Then we successively apply (4.5) and (4.3) to obtain
$$\sup_{x \in S_\zeta} |u(x)| \le \varepsilon^{-N} E^{m}_{\zeta,\varepsilon}(u).$$
Taking the supremum over $\zeta \in [0, \tau]$ on the right hand side gives the result for $l = 0$. To prove the general result, we replace $u$ by the respective derivatives. In some more detail, note that time derivatives are not covered by the Sobolev embedding theorem since they are transversal to $S_\tau$, i.e., we have to replace (6.1) by the estimate
$$\sup_{x \in S_\zeta} |\partial_{\rho_1} \cdots \partial_{\rho_k} \partial_t^s u| \le K\, {}^{\partial}\|\partial_t^s u\|^{m+k}_{S_\zeta,\varepsilon} \le K\, {}^{\partial}\|u\|^{m+k+s}_{S_\zeta,\varepsilon},$$
where the last inequality holds because the norm ${}^{\partial}\|\cdot\|^{m}_{S_\zeta,\varepsilon}$, in addition, contains time derivatives.
J. D. E. Grant, E. Mayerhofer, R. Steinbauer
7. Proof of the Main Theorem

We finally prove the main result by putting together the estimates achieved so far.

Proof of Theorem 3.1. Step 1. Existence of classical solutions. Due to assumption (C), classical theory provided us with smooth solutions for fixed ε. More precisely, by [6, Theorem 5.3.2], for ε fixed there exists a unique smooth function $u_\varepsilon$ solving (3.3) on A ⊆ ε

0 such that the operator $|D|^{-p}$ (interpreted as the inverse of the restriction of $|D|^{p}$ on the closure of its range, which has a finite co-dimension since $D$ has compact resolvents) has finite nonzero Dixmier trace, denoted by $Tr_\omega$ (where ω is some suitable Banach limit). Consider the canonical 'volume form' τ coming from the Dixmier trace, i.e. $\tau : \mathcal{B}(\mathcal{H}) \to \mathbb{C}$ defined by $\tau(A) := \frac{1}{Tr_\omega(|D|^{-p})} Tr_\omega(A |D|^{-p})$. We also assume that the spectral triple is $QC^\infty$, i.e. $\mathcal{A}^\infty$ and $\{[D, a],\ a \in \mathcal{A}^\infty\}$ are contained in the domains of all powers of the derivation $[|D|, \cdot]$. Under this assumption, τ is a positive faithful trace on the C∗-subalgebra generated by $\mathcal{A}^\infty$ and $\{[D, a],\ a \in \mathcal{A}^\infty\}$, and using this there is a canonical construction of the Hilbert space of forms, denoted by $\mathcal{H}^n_D$, $n \ge 0$ (see [6] for details), with $\mathcal{H}^0_D = L^2(\mathcal{A}^\infty, \tau)$. It is assumed that the unbounded densely defined map $d_D$ from $\mathcal{H}^0_D$ to $\mathcal{H}^1_D$ given by $d_D(a) = [D, a]$ for $a \in \mathcal{A}^\infty$, is closable, $L := -d_D^* d_D$ has $\mathcal{A}^\infty$ in its domain, and it is left invariant by $L$. Moreover, we assume that $L$ has compact resolvents, with its eigenvectors belonging to $\mathcal{A}^\infty$, and the kernel of $L$ is the one-dimensional subspace spanned by the identity 1 of $\mathcal{A}^\infty$. The linear span of eigenvectors of $L$, which is a subspace of $\mathcal{A}^\infty$, is denoted by $\mathcal{A}^\infty_0$, and it is assumed that $\mathcal{A}^\infty_0$ is norm-dense in the C∗-algebra $\mathcal{A}$ obtained by completing $\mathcal{A}^\infty$. The ∗-subalgebra of $\mathcal{A}^\infty$ generated by $\mathcal{A}^\infty_0$ is denoted by $\mathcal{A}_0$.

It is clear that $L(\mathcal{A}^\infty_0) \subseteq \mathcal{A}^\infty_0$, and a compact quantum group $(\mathcal{G}, \Delta)$ which has an action α on $\mathcal{A}$ is said to act smoothly and isometrically on the noncommutative manifold $(\mathcal{A}^\infty, \mathcal{H}, D)$ if $(\mathrm{id} \otimes \phi) \circ \alpha(\mathcal{A}^\infty_0) \subseteq \mathcal{A}^\infty_0$ for all states φ on $\mathcal{G}$, and also $(\mathrm{id} \otimes \phi) \circ \alpha$ commutes with $L$ on $\mathcal{A}^\infty_0$. One can consider the category of all compact quantum groups acting smoothly and isometrically on $\mathcal{A}$, where the morphisms are quantum group morphisms which intertwine the actions on $\mathcal{A}$. It is proved in [4] (under some regularity assumptions, which are valid for any compact connected Riemannian spin manifold with the usual Dirac operator) that there exists a universal object in this category, and this universal object is defined to be the quantum isometry group of $(\mathcal{A}^\infty, \mathcal{H}, D)$, denoted by $QISO(\mathcal{A}^\infty, \mathcal{H}, D)$, or simply as $QISO(\mathcal{A}^\infty)$ or even $QISO(\mathcal{A})$ if the spectral triple is understood. In fact, we have considered a bigger category, namely the category of 'quantum families of smooth isometries' (see [4] for details), which is motivated by the ideas of Woronowicz and Soltan ([18,17]), and identified the underlying C∗-algebra of the quantum isometry group as a universal object in this bigger category.

We believe that a detailed study of quantum isometry groups will not only give many new and interesting examples of compact quantum groups, it will also contribute to the understanding of quantum group covariant spectral triples. For this, it is important to explicitly describe quantum isometry groups of sufficiently many classical and noncommutative manifolds. This is our aim in this paper. We have computed quantum isometry groups of classical and noncommutative spheres and tori, and also obtained a general principle for computing such quantum groups, by proving that the quantum isometry group of an isospectral deformation of a (classical or noncommutative) manifold is a deformation of the quantum isometry group of the original (undeformed) manifold.
Throughout the paper, we have denoted by A1 ⊗A2 the minimal (injective) C ∗ -tensor product between two C ∗ -algebras A1 and A2 . The symbol ⊗alg has been used to denote the algebraic tensor product between vector spaces or algebras. For a compact quantum group G, the dense unital ∗-subalgebra generated by the matrix coefficients of irreducible unitary representations has been denoted by G0 . The
Quantum Isometry Groups: Examples and Computations
coproduct of $\mathcal{G}$, say Δ, maps $\mathcal{G}_0$ into the algebraic tensor product $\mathcal{G}_0 \otimes_{alg} \mathcal{G}_0$, and there exist a canonical antipode and counit defined on $\mathcal{G}_0$ which make it into a Hopf ∗-algebra (see [10] for the details).

2. Computation of the Quantum Isometry Groups of the Sphere and Tori

2.1. Computation for the commutative spheres. Let $\mathcal{Q}$ be the quantum isometry group of $S^2$ and let α be the action of $\mathcal{Q}$ on $C(S^2)$. Let $L$ be the Laplacian on $S^2$ given by
$$L = \frac{\partial^2}{\partial\theta^2} + \cot(\theta) \frac{\partial}{\partial\theta} + \frac{1}{\sin^2(\theta)} \frac{\partial^2}{\partial\psi^2},$$
and the cartesian coordinates $x_1, x_2, x_3$ for $S^2$ are given by $x_1 = r\cos\psi\sin\theta$, $x_2 = r\sin\psi\sin\theta$, $x_3 = r\cos\theta$. In the cartesian coordinates, $L = \sum_{i=1}^{3} \frac{\partial^2}{\partial x_i^2}$.

The eigenspaces of $L$ on $S^2$ are of the form
$$E_k = \mathrm{Sp}\Big\{ (c_1 X_1 + c_2 X_2 + c_3 X_3)^k : c_i \in \mathbb{C},\ i = 1, 2, 3,\ \sum c_i^2 = 0 \Big\},$$
where $k \ge 1$. $E_k$ consists of harmonic homogeneous polynomials of degree $k$ on $\mathbb{R}^3$ restricted to $S^2$. (See [11], p. 29–30.) We begin with the following lemma, which says that any smooth isometric action by a quantum group must be 'linear'.

Lemma 2.1. The action α satisfies $\alpha(x_i) = \sum_{j=1}^{3} x_j \otimes Q_{ij}$, where $Q_{ij} \in \mathcal{Q}$, $i = 1, 2, 3$.
Proof. α is a smooth isometric action of $\mathcal{Q}$ on $C(S^2)$, so α has to preserve the eigenspaces of the Laplacian $L$. In particular, it has to preserve $E_1 = \mathrm{Sp}\{c_1 x_1 + c_2 x_2 + c_3 x_3 : c_i \in \mathbb{C},\ i = 1, 2, 3,\ \sum_{i=1}^{3} c_i^2 = 0\}$. Now note that $x_1 + i x_2,\ x_1 - i x_2 \in E_1$, hence $x_1, x_2 \in E_1$. Similarly $x_3 \in E_1$ too. Therefore $E_1 = \mathrm{Sp}\{x_1, x_2, x_3\}$, which completes the proof of the lemma.
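The condition $\sum c_i^2 = 0$ in the description of the eigenspaces can be checked directly. The following short computation is only a sketch: it writes $\Delta_{\mathbb{R}^3}$ for the Cartesian Laplacian $\sum_i \partial^2/\partial x_i^2$ and $c \cdot x$ for $c_1 x_1 + c_2 x_2 + c_3 x_3$, both of which are shorthand introduced here for illustration.

```latex
\begin{align*}
\frac{\partial}{\partial x_i}(c\cdot x)^k &= k\,c_i\,(c\cdot x)^{k-1},
\qquad
\frac{\partial^2}{\partial x_i^2}(c\cdot x)^k = k(k-1)\,c_i^2\,(c\cdot x)^{k-2},\\
\Delta_{\mathbb{R}^3}(c\cdot x)^k &= k(k-1)\Big(\sum_{i=1}^{3} c_i^2\Big)(c\cdot x)^{k-2} = 0
\quad\text{whenever } \sum_{i=1}^{3} c_i^2 = 0 .
\end{align*}
```

So each generator $(c \cdot x)^k$ is a harmonic homogeneous polynomial of degree $k$, in line with the statement that $E_k$ consists of such polynomials restricted to $S^2$. For $k = 1$, the choice $c = (1, \pm i, 0)$ gives the elements $x_1 \pm i x_2$ used in the proof.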
Now we state and prove the main result of this section, which identifies $\mathcal{Q}$ with the commutative C∗ algebra of continuous functions on the isometry group of $S^2$, i.e. $O(3)$.

Theorem 2.2. The quantum isometry group $\mathcal{Q}$ is commutative as a C∗ algebra.

Proof. We begin with the expression
$$\alpha(x_i) = \sum_{j=1}^{3} x_j \otimes Q_{ij}, \quad i = 1, 2, 3,$$
and also note that $x_1, x_2, x_3$ form a basis of $E_1$ and $\{x_1^2, x_2^2, x_3^2, x_1 x_2, x_1 x_3, x_2 x_3\}$ is a basis of $E_2$. Since $x_i^* = x_i$ for each $i$ and α is a ∗-homomorphism, we must have $Q_{ij}^* = Q_{ij}$ for all $i, j = 1, 2, 3$. Moreover, the condition $x_1^2 + x_2^2 + x_3^2 = 1$ and the fact that α is a homomorphism gives: $Q_{1j}^2 + Q_{2j}^2 + Q_{3j}^2 = 1$, for all $j = 1, 2, 3$.
J. Bhowmick, D. Goswami
Again, the condition that $x_i, x_j$ commute for all $i, j$ gives
$$Q_{ij} Q_{kj} = Q_{kj} Q_{ij} \quad \forall i, j, k, \tag{1}$$
$$Q_{ik} Q_{jl} + Q_{il} Q_{jk} = Q_{jk} Q_{il} + Q_{jl} Q_{ik}. \tag{2}$$
Now, it follows from Lemma 2.12 in [4] that $\tilde\alpha : C(S^2) \otimes \mathcal{Q} \to C(S^2) \otimes \mathcal{Q}$ defined by $\tilde\alpha(X \otimes Y) = \alpha(X)(1 \otimes Y)$ extends to a unitary of the Hilbert $\mathcal{Q}$-module $L^2(S^2) \otimes \mathcal{Q}$ (or in other words, α extends to a unitary representation of $\mathcal{Q}$ on $L^2(S^2)$). But α keeps $V = \mathrm{Sp}\{x_1, x_2, x_3\}$ invariant. So α is a unitary representation of $\mathcal{Q}$ on $V$, i.e. $Q = ((Q_{ij})) \in M_3(\mathcal{Q})$ is a unitary, hence $Q^{-1} = Q^* = Q^T$, since in this case entries of $Q$ are self-adjoint elements. Clearly, the matrix $Q$ is a 3-dimensional unitary representation of $\mathcal{Q}$. Recall that (cf. [13]) the antipode κ on the matrix elements of a finite-dimensional unitary representation $U^\alpha \equiv (u^\alpha_{pq})$ is given by $\kappa(u^\alpha_{pq}) = (u^\alpha_{qp})^*$. So we obtain
$$\kappa(Q_{ij}) = Q^{-1}_{ij} = Q^{T}_{ij} = Q_{ji}. \tag{3}$$
Now from (1), we have $Q_{ij} Q_{kj} = Q_{kj} Q_{ij}$. Applying κ on this equation and using the fact that κ is an antihomomorphism along with (3), we have $Q_{jk} Q_{ji} = Q_{ji} Q_{jk}$. Similarly, applying κ on (2), we get $Q_{lj} Q_{ki} + Q_{kj} Q_{li} = Q_{li} Q_{kj} + Q_{ki} Q_{lj}$ for all $i, j, k, l$. Interchanging between $k$ and $i$ and also between $l, j$ gives
$$Q_{jl} Q_{ik} + Q_{il} Q_{jk} = Q_{jk} Q_{il} + Q_{ik} Q_{jl} \quad \forall i, j, k, l. \tag{4}$$
Now, by (2)–(4), we have [Q ik , Q jl ] = [Q jl , Q ik ], hence [Q ik , Q jl ] = 0. Therefore the entries of the matrix Q commute among themselves. However, by faithfulness of the action of Q, it is clear that the C ∗ -subalgebra generated by entries of Q (which forms a quantum subgroup of Q acting on C(S 2 ) isometrically) must be the same as Q, so Q is commutative.
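The step "by (2)–(4)" is a subtraction of relation (4) from relation (2). Written out, as a sketch using only the two displayed relations:

```latex
\begin{align*}
\text{(2)}:&\quad Q_{ik}Q_{jl} + Q_{il}Q_{jk} = Q_{jk}Q_{il} + Q_{jl}Q_{ik},\\
\text{(4)}:&\quad Q_{jl}Q_{ik} + Q_{il}Q_{jk} = Q_{jk}Q_{il} + Q_{ik}Q_{jl},\\
\text{(2)}-\text{(4)}:&\quad Q_{ik}Q_{jl} - Q_{jl}Q_{ik} = Q_{jl}Q_{ik} - Q_{ik}Q_{jl},\\
&\quad\text{i.e. } [Q_{ik},Q_{jl}] = [Q_{jl},Q_{ik}] = -[Q_{ik},Q_{jl}]
\ \Longrightarrow\ 2\,[Q_{ik},Q_{jl}] = 0 .
\end{align*}
```

The mixed terms $Q_{il}Q_{jk}$ and $Q_{jk}Q_{il}$ cancel, which is why only the commutator of $Q_{ik}$ and $Q_{jl}$ survives.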
So Q = C(G) for some compact group G acting by isometry on C(S 2 ) and is also universal in this category, i.e. Q = C(O(3)). Remark 2.3. Similarly, it can be shown that Q I S O(S n ) is commutative for all n ≥ 2.
2.2. The commutative one-torus. Let $C = C(S^1)$ be the C∗-algebra of continuous functions on the one-torus $S^1$. Let us denote by $z$ and $\bar z$ the identity function (which is the generator of $C(S^1)$) and its conjugate respectively. The Laplacian coming from the standard Riemannian metric is given by $L(z^n) = -n^2 z^n$, for $n \in \mathbb{Z}$, hence the eigenspace corresponding to the eigenvalue −1 is spanned by $z$ and $\bar z$. Thus, the action of a compact quantum group acting smoothly and isometrically (and faithfully) on $C(S^1)$ must be linear in the sense that its action must map $z$ into an element of the form $z \otimes A + \bar z \otimes B$. However, we show below that this forces the quantum group to be commutative as a C∗ algebra, i.e. it must be the function algebra of some compact group.

Theorem 2.4. Let α be a faithful, smooth and linear action of a compact quantum group $(\mathcal{Q}, \Delta)$ on $C(S^1)$ defined by $\alpha(z) = z \otimes A + \bar z \otimes B$. Then $\mathcal{Q}$ is a commutative C∗ algebra.

Proof. By the assumption of faithfulness, it is clear that $\mathcal{Q}$ is generated (as a unital C∗ algebra) by $A$ and $B$. Moreover, recall that smoothness in particular means that $A$ and $B$ must belong to the algebra $\mathcal{Q}_0$ spanned by matrix elements of irreducible representations of $\mathcal{Q}$. Since $z \bar z = \bar z z = 1$ and α is a ∗-homomorphism, we have $\alpha(z)\alpha(\bar z) = \alpha(\bar z)\alpha(z) = 1 \otimes 1$. Comparing coefficients of $z^2$, $\bar z^2$ and 1 in both sides of the relation $\alpha(z)\alpha(\bar z) = 1 \otimes 1$, we get
$$A B^* = B A^* = 0, \qquad A A^* + B B^* = 1. \tag{5}$$
Similarly, $\alpha(\bar z)\alpha(z) = 1 \otimes 1$ gives
$$B^* A = A^* B = 0, \qquad A^* A + B^* B = 1. \tag{6}$$
Let $U = A + B$, $P = A^* A$, $Q = A A^*$. Then it follows from (5) and (6) that $U$ is a unitary and $P$ is a projection since $P$ is self adjoint and
$$P^2 = A^* A A^* A = A^* A (1 - B^* B) = A^* A - A^* A B^* B = A^* A = P.$$
Moreover,
$$U P = (A + B) A^* A = A A^* A + B A^* A = A A^* A \ (\text{since } B A^* = 0 \text{ from (5)}) = A (1 - B^* B) = A - A B^* B = A.$$
Thus, $A = U P$, $B = U - U P = U(1 - P) \equiv U P^\perp$, so $\mathcal{Q} = C^*(A, B) = C^*(U, P)$. We can rewrite the action α as follows: $\alpha(z) = z \otimes U P + \bar z \otimes U P^\perp$. The coproduct Δ can easily be calculated from the requirement $(\mathrm{id} \otimes \Delta)\alpha = (\alpha \otimes \mathrm{id})\alpha$, and it is given by:
$$\Delta(U P) = U P \otimes U P + P^\perp U^{-1} \otimes U P^\perp, \tag{7}$$
$$\Delta(U P^\perp) = U P^\perp \otimes U P + P U^{-1} \otimes U P^\perp. \tag{8}$$
From this, we get
$$\Delta(U) = U \otimes U P + U^{-1} \otimes U P^\perp, \tag{9}$$
$$\Delta(P) = \Delta(U^{-1})\Delta(U P) = P \otimes P + U P^\perp U^{-1} \otimes P^\perp. \tag{10}$$
It can be checked that Δ given by the above expression is coassociative. Let $h$ denote the right-invariant Haar state on $\mathcal{Q}$. By the general theory of compact quantum groups, $h$ must be faithful on $\mathcal{Q}_0$. We have (by right-invariance of $h$): $(\mathrm{id} \otimes h)(P \otimes P + U P^\perp U^{-1} \otimes P^\perp) = h(P) 1$. That is, we have
$$h(P^\perp)\, U P^\perp U^{-1} = h(P)\, P^\perp. \tag{11}$$
Since $P$ is a positive element in $\mathcal{Q}_0$ and $h$ is faithful on $\mathcal{Q}_0$, $h(P) = 0$ if and only if $P = 0$. Similarly, $h(P^\perp) = 0$, i.e. $h(P) = 1$, if and only if $P = 1$. However, if $P$ is either 0 or 1, clearly $\mathcal{Q} = C^*(U, P) = C^*(U)$, which is commutative. On the other hand, if we assume that $P$ is not a trivial projection, then $h(P)$ is strictly between 0 and 1, and we have from (11)
$$U P^\perp U^{-1} = \frac{h(P)}{1 - h(P)}\, P^\perp.$$
Since both $U P^\perp U^{-1}$ and $P^\perp$ are nontrivial projections, they can be scalar multiples of each other if and only if they are equal, so we conclude that $U P^\perp U^{-1} = P^\perp$, i.e. $U$ commutes with $P^\perp$, hence with $P$, and $\mathcal{Q}$ is commutative.
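The fact used at the end of the proof, that two nonzero projections which are scalar multiples of each other must coincide, takes one line. In the sketch below $E$, $F$ and $c$ are placeholder names introduced for illustration: $E = U P^\perp U^{-1}$, $F = P^\perp$ and $c = h(P)/(1 - h(P))$.

```latex
\begin{align*}
E = cF,\quad E^2 = E \neq 0,\quad F^2 = F
&\ \Longrightarrow\ cF = E = E^2 = c^2 F^2 = c^2 F\\
&\ \Longrightarrow\ c(c-1)F = 0
\ \Longrightarrow\ c = 1 \quad (\text{as } c \neq 0,\ F \neq 0).
\end{align*}
```

With $c = h(P)/(1 - h(P))$ this also shows that the non-trivial case can only occur with $h(P) = 1/2$.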
2.3. Commutative and noncommutative two-tori. Fix a real number θ, and let $\mathcal{A}_\theta$ be the universal C∗ algebra generated by two unitaries $U$ and $V$ such that $U V = \lambda V U$, where $\lambda := e^{2\pi i \theta}$. It is well-known (see [3]) that the set $\{U^m V^n : m, n \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathcal{A}_\theta, \tau)$, where τ denotes the unique faithful normalized trace on $\mathcal{A}_\theta$ given by $\tau(\sum a_{mn} U^m V^n) = a_{00}$. We shall denote by $\langle A, B \rangle = \tau(A^* B)$ the inner product on $\mathcal{H}_0 := L^2(\mathcal{A}_\theta, \tau)$. Let $\mathcal{A}^{fin}_\theta$ be the unital ∗-subalgebra generated by finite complex linear combinations of $U^m V^n$, $m, n \in \mathbb{Z}$, and let $d_1, d_2$ be the maps on $\mathcal{A}^{fin}_\theta$ defined by $d_1(U^m V^n) = m U^m V^n$, $d_2(U^m V^n) = n U^m V^n$. We consider the canonical spectral triple (see [3] for details) $(\mathcal{A}^{fin}_\theta, \mathcal{H}, D)$, where $\mathcal{H} = \mathcal{H}_0 \oplus \mathcal{H}_0$,
$$D = \begin{pmatrix} 0 & d_1 + i d_2 \\ d_1 - i d_2 & 0 \end{pmatrix},$$
and the representation of $\mathcal{A}_\theta$ on $\mathcal{H}$ is the diagonal one, i.e. $a \mapsto \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$. Clearly, the corresponding Laplacian $L$ is given by $L(U^m V^n) = -(m^2 + n^2) U^m V^n$, and it is also easy to see that the algebraic span of eigenvectors of $L$ is nothing but the space $\mathcal{A}^{fin}_\theta$, and moreover, all the assumptions in [4] required for defining the quantum isometry group are satisfied. Let $\mathcal{Q}$ be the quantum isometry group of the above spectral triple, with the smooth isometric action on $\mathcal{A}_\theta$ given by $\alpha : \mathcal{A}_\theta \to \mathcal{A}_\theta \otimes \mathcal{Q}$. By definition, α must keep invariant the eigenspace of $L$ corresponding to the eigenvalue −1, spanned by $U, V, U^{-1}, V^{-1}$. Thus, the action α is given by:
$$\alpha(U) = U \otimes A_1 + V \otimes B_1 + U^{-1} \otimes C_1 + V^{-1} \otimes D_1,$$
$$\alpha(V) = U \otimes A_2 + V \otimes B_2 + U^{-1} \otimes C_2 + V^{-1} \otimes D_2,$$
for some $A_i, B_i, C_i, D_i \in \mathcal{Q}$, $i = 1, 2$, and by faithfulness of the action of the quantum isometry group (see [4]), the norm-closure of the unital ∗-algebra generated by $A_i, B_i, C_i, D_i$; $i = 1, 2$ must be the whole of $\mathcal{Q}$. Next we derive a number of conditions on $A_i, B_i, C_i, D_i$, $i = 1, 2$ using the fact that α is a ∗-homomorphism.

Lemma 2.5. The condition $U^* U = 1 = U U^*$ gives:
$$A_1^* A_1 + B_1^* B_1 + C_1^* C_1 + D_1^* D_1 = 1, \tag{12}$$
$$A_1^* B_1 + \lambda D_1^* C_1 = A_1^* D_1 + \bar\lambda B_1^* C_1 = 0, \tag{13}$$
$$C_1^* D_1 + \lambda B_1^* A_1 = C_1^* B_1 + \bar\lambda D_1^* A_1 = 0, \tag{14}$$
$$A_1^* C_1 = B_1^* D_1 = C_1^* A_1 = D_1^* B_1 = 0, \tag{15}$$
$$A_1 A_1^* + B_1 B_1^* + C_1 C_1^* + D_1 D_1^* = 1, \tag{16}$$
$$A_1 B_1^* + \lambda D_1 C_1^* = A_1 D_1^* + \bar\lambda B_1 C_1^* = 0, \tag{17}$$
$$C_1 D_1^* + \lambda B_1 A_1^* = C_1 B_1^* + \bar\lambda D_1 A_1^* = 0, \tag{18}$$
$$A_1 C_1^* = B_1 D_1^* = C_1 A_1^* = D_1 B_1^* = 0. \tag{19}$$
Proof. We get (12) - (15) by using the condition U ∗ U = 1 along with the fact that α is a homomorphism and then comparing the coefficients of 1, U ∗ V, U ∗ 2 , U ∗ V ∗ , U V ∗ , V ∗ 2 , U 2 , U V, V 2 . Similarly the condition UU ∗ = 1 gives (16)–(19).
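Comparing coefficients in the proof above relies on reordering products such as $V U^{-1}$ or $V^{-1} U^{-1}$ by means of $U V = \lambda V U$. As a sanity check, the reordering rules can be tested numerically in a finite-dimensional model. The sketch below is illustrative only: it assumes the rational value $\theta = 1/n$ and uses the standard clock and shift matrices as stand-ins for $U$ and $V$; the function names are made up for this example and do not come from the paper.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose; equals the inverse for the unitaries below."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def clock_shift(n):
    """Return (lam, U, V) with U V = lam V U and lam = exp(2*pi*i/n)."""
    lam = cmath.exp(2j * cmath.pi / n)
    U = [[lam ** i if i == j else 0j for j in range(n)] for i in range(n)]
    V = [[1 + 0j if i == (j + 1) % n else 0j for j in range(n)] for i in range(n)]
    return lam, U, V

def check_relations(n):
    """Check the reordering rules used when comparing coefficients."""
    lam, U, V = clock_shift(n)
    Ui, Vi = dagger(U), dagger(V)
    lam_bar = lam.conjugate()
    return {
        "UV = lam VU": close(matmul(U, V), scale(lam, matmul(V, U))),
        "VU^-1 = lam U^-1 V": close(matmul(V, Ui), scale(lam, matmul(Ui, V))),
        "V^-1 U = lam U V^-1": close(matmul(Vi, U), scale(lam, matmul(U, Vi))),
        "V^-1 U^-1 = conj(lam) U^-1 V^-1":
            close(matmul(Vi, Ui), scale(lam_bar, matmul(Ui, Vi))),
    }
```

Calling `check_relations(5)` returns a dictionary of booleans, one per rule. The rules with $\lambda$ account for the terms $\lambda D_1^* C_1$ and $\lambda B_1^* A_1$, while the rule with the conjugate accounts for the coefficient of $U^{-1} V^{-1}$.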
Lemma 2.6. We have analogues of (12)–(19) with A1 , B1 , C1 , D1 replaced by A2 , B2 , C2 , D2 respectively. Proof. We use the condition V ∗ V = V V ∗ = 1.
Now we note that if $\alpha(U^m V^n) = \sum c_{kl}\, U^k V^l \otimes Q_{kl}$ for some $Q_{kl} \in \mathcal{Q}$, then the condition that α commutes with the Laplacian implies $c_{kl} = 0$ unless $k^2 + l^2 = m^2 + n^2$. We use this observation in the next lemma.

Lemma 2.7. Inspecting the terms with zero coefficient in $\alpha(U^* V)$, $\alpha(V U^*)$, $\alpha(U V)$, $\alpha(V U)$, we get
$$C_1^* A_2 = 0, \quad D_1^* B_2 = 0, \quad A_1^* C_2 = 0, \quad B_1^* D_2 = 0, \tag{20}$$
$$A_2 C_1^* = 0, \quad B_2 D_1^* = 0, \quad C_2 A_1^* = 0, \quad D_2 B_1^* = 0, \tag{21}$$
$$A_1 A_2 = 0, \quad B_1 B_2 = 0, \quad C_1 C_2 = 0, \quad D_1 D_2 = 0, \tag{22}$$
$$A_2 A_1 = 0, \quad B_2 B_1 = 0, \quad C_2 C_1 = 0, \quad D_2 D_1 = 0. \tag{23}$$
Proof. Equation (20) is obtained from the coefficients of U 2 , V 2 , U ∗ 2 , V ∗ 2 in α(U ∗ V ) while (21), (22), (23) are obtained from the same coefficients in α(V U ∗ ), α(U V ), α(V U ) respectively.
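Lemma 2.8.
$$A_1 B_2 + \bar\lambda B_1 A_2 = \lambda A_2 B_1 + B_2 A_1,$$
$$A_1 D_2 + \lambda D_1 A_2 = \lambda A_2 D_1 + \lambda^2 D_2 A_1,$$
$$C_1 B_2 + \lambda B_1 C_2 = \lambda C_2 B_1 + \lambda^2 B_2 C_1,$$
$$C_1 D_2 + \bar\lambda D_1 C_2 = \lambda C_2 D_1 + D_2 C_1.$$

Proof. Follows from the relation $\alpha(U V) = \lambda \alpha(V U)$ and equating non-zero coefficients of $U V$, $U V^{-1}$, $U^{-1} V$ and $U^{-1} V^{-1}$.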
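As an illustration of how these relations arise, here is the coefficient of $U V$ worked out in full. This is a sketch that uses only $U V = \lambda V U$ (equivalently $V U = \bar\lambda U V$) and the expressions for $\alpha(U)$ and $\alpha(V)$ given earlier.

```latex
\begin{align*}
\alpha(U)\alpha(V)\big|_{UV}
  &= (U\otimes A_1)(V\otimes B_2) + (V\otimes B_1)(U\otimes A_2)
   = UV\otimes\big(A_1B_2 + \bar\lambda\,B_1A_2\big),\\
\lambda\,\alpha(V)\alpha(U)\big|_{UV}
  &= \lambda\big[(U\otimes A_2)(V\otimes B_1) + (V\otimes B_2)(U\otimes A_1)\big]
   = UV\otimes\big(\lambda A_2B_1 + B_2A_1\big).
\end{align*}
```

Equating the two gives $A_1 B_2 + \bar\lambda B_1 A_2 = \lambda A_2 B_1 + B_2 A_1$, which rearranges to $A_1 B_2 - B_2 A_1 = \lambda (A_2 B_1 - \bar\lambda^2 B_1 A_2)$. The other three relations follow in the same way from the coefficients of $U V^{-1}$, $U^{-1} V$ and $U^{-1} V^{-1}$.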
Now, by Lemma 2.12 in [4] it follows that $\tilde\alpha : \mathcal{A}_\theta \otimes \mathcal{Q} \to \mathcal{A}_\theta \otimes \mathcal{Q}$ defined by $\tilde\alpha(X \otimes Y) = \alpha(X)(1 \otimes Y)$ extends to a unitary of the Hilbert $\mathcal{Q}$-module $L^2(\mathcal{A}_\theta, \tau) \otimes \mathcal{Q}$ (or in other words, α extends to a unitary representation of $\mathcal{Q}$ on $L^2(\mathcal{A}_\theta, \tau)$). But α keeps $W = \mathrm{Sp}\{U, V, U^*, V^*\}$ invariant (as observed in the beginning of this section). So α is a unitary representation of $\mathcal{Q}$ on $W$. Hence, the matrix (say $M$) corresponding to the 4-dimensional representation of $\mathcal{Q}$ on $W$ is a unitary in $M_4(\mathcal{Q})$. From the definition of the action it follows that
$$M = \begin{pmatrix} A_1 & A_2 & C_1^* & C_2^* \\ B_1 & B_2 & D_1^* & D_2^* \\ C_1 & C_2 & A_1^* & A_2^* \\ D_1 & D_2 & B_1^* & B_2^* \end{pmatrix}.$$
Since $M$ is the matrix corresponding to a finite dimensional unitary representation, $\kappa(M_{kl}) = (M^{-1})_{kl}$, where κ denotes the antipode of $\mathcal{Q}$ (see [13]). But $M$ is a unitary, $M^{-1} = M^*$. So,
$$(\kappa(M_{kl})) = \begin{pmatrix} A_1^* & B_1^* & C_1^* & D_1^* \\ A_2^* & B_2^* & C_2^* & D_2^* \\ C_1 & D_1 & A_1 & B_1 \\ C_2 & D_2 & A_2 & B_2 \end{pmatrix}.$$

Lemma 2.9. $A_1$ is a normal partial isometry and hence has the same domain and range.

Proof. From the relation $A_1^* A_1 + B_1^* B_1 + C_1^* C_1 + D_1^* D_1 = 1$ in Lemma 2.5, we have by applying κ, $A_1^* A_1 + A_2^* A_2 + C_1 C_1^* + C_2 C_2^* = 1$. Applying $A_1$ on the right of this equation and using $C_1^* A_1 = 0$ from Lemma 2.5, and $A_2 A_1 = A_1^* C_2 = 0$ from Lemma 2.7, we have
$$A_1^* A_1 A_1 = A_1. \tag{24}$$
Again, from the relation $A_1 A_1^* + B_1 B_1^* + C_1 C_1^* + D_1 D_1^* = 1$ in Lemma 2.5, applying κ and multiplying by $A_1^*$ on the right, and then using $C_1 A_1^* = 0$ from Lemma 2.5, $A_1 A_2 = C_2 A_1^* = 0$ from Lemma 2.7, we have
$$A_1 A_1^* A_1^* = A_1^*. \tag{25}$$
From (24), we have
$$(A_1^* A_1)(A_1 A_1^*) = A_1 A_1^*. \tag{26}$$
By taking ∗ on (25), we have
$$A_1 A_1 A_1^* = A_1. \tag{27}$$
So, by multiplying by $A_1^*$ on the left, we have
$$(A_1^* A_1)(A_1 A_1^*) = A_1^* A_1. \tag{28}$$
From (26) and (28), we have A1 A∗1 = A∗1 A1 , i.e A1 is normal. So, A1 = A∗1 A1 A1 (from (24)) = A1 A∗1 A1 . Therefore, A1 is a partial isometry which is normal and hence has same domain and range.
Remark 2.10. In an exactly similar way, it can be proved that D1 is a normal partial isometry and hence has the same domain and range.
Lemma 2.11. We have $C_1^* B_1^* = C_2^* B_2^* = A_1 D_1 = A_2 D_2 = B_1 C_1^* = B_1^* C_1^* = B_1 A_1 = A_1 B_1^* = D_1 A_1 = A_1^* D_1 = C_1^* B_1 = D_1 C_1^* = 0$.

Proof. Using $A_2 C_1^* = B_2 D_1^* = C_2 A_1^* = D_2 B_1^* = 0$ from Lemma 2.7 and applying κ, we have the first four equalities. But $A_1 D_1 = 0$. Hence
$$\mathrm{Ran}(D_1) \subseteq \mathrm{Ker}(A_1). \tag{29}$$
By the above made remark,
$$\mathrm{Ran}(D_1) = \mathrm{Ran}(D_1^*). \tag{30}$$
Equations (29) and (30) imply $\mathrm{Ran}(D_1^*) \subseteq \mathrm{Ker}(A_1)$, so $A_1 D_1^* = 0$. But from Lemma 2.5, we have $A_1 D_1^* + \bar\lambda B_1 C_1^* = 0$, which gives $B_1 C_1^* = 0$. From Lemma 2.7, we have $C_1^* A_2 = A_2 A_1 = 0$, from which it follows by applying κ, that $B_1^* C_1^* = A_1^* B_1^* = 0$. So, $B_1 A_1 = 0$. So, $\mathrm{Ran}(A_1) \subseteq \mathrm{Ker}(B_1)$. But by Lemma 2.9, $A_1$ is a normal partial isometry and so has the same range and domain. Thus, $\mathrm{Ran}(A_1^*) \subseteq \mathrm{Ker}(B_1)$ which implies $B_1 A_1^* = 0$, i.e.,
$$A_1 B_1^* = 0. \tag{31}$$
Again, from Lemma 2.7, $A_1^* C_2 = 0$. Hence, by applying κ, $D_1 A_1 = 0$, i.e., $A_1^* D_1^* = 0$. But $D_1$ is a partial isometry (from the remark following Lemma 2.9), so we conclude $A_1^* D_1 = 0$. But by Lemma 2.5, we have $A_1^* D_1 + \bar\lambda B_1^* C_1 = 0$. But $A_1^* D_1 = 0$ implies $B_1^* C_1 = 0$, i.e., $C_1^* B_1 = 0$. Also, $A_1 B_1^* = 0$ (from (31)) and $A_1 B_1^* + \lambda D_1 C_1^* = 0$ (by Lemma 2.5), so $D_1 C_1^* = 0$.
Lemma 2.12. $C_1$ is a normal partial isometry and hence has the same domain and range.

Proof. From the relation $A_1^* A_1 + B_1^* B_1 + C_1^* C_1 + D_1^* D_1 = 1$ in Lemma 2.5, multiplying by $C_1^*$ on the right and using $A_1 C_1^* = 0$ from Lemma 2.5, and $B_1 C_1^* = D_1 C_1^* = 0$ from Lemma 2.11, we have
$$C_1^* C_1 C_1^* = C_1^*. \tag{32}$$
Therefore, $C_1^*$ and hence $C_1$ is a partial isometry. Also, from Lemma 2.5, $A_1^* A_1 + B_1^* B_1 + C_1^* C_1 + D_1^* D_1 = 1 = A_1 A_1^* + B_1 B_1^* + C_1 C_1^* + D_1 D_1^*$. Applying the normality of $A_1$ and $D_1$ (obtained from Lemma 2.9 and the remark following it) to this equation, we have
$$B_1^* B_1 + C_1^* C_1 = B_1 B_1^* + C_1 C_1^*. \tag{33}$$
Multiplying by $C_1^*$ on the left of (33), and using $C_1^* B_1^* = C_1^* B_1 = 0$ from Lemma 2.11, we have: $C_1^* C_1^* C_1 = C_1^* C_1 C_1^*$. But $C_1^* C_1 C_1^* = C_1^*$ (from (32)), hence $C_1^* C_1^* C_1 = C_1^*$. Applying $C_1$ on the left, we have
$$(C_1 C_1^*)(C_1^* C_1) = C_1 C_1^*. \tag{34}$$
Now multiplying by $C_1^*$ on the right of (33) and using $B_1 C_1^* = B_1^* C_1^* = 0$ from Lemma 2.11, we have $C_1^* C_1 C_1^* = C_1 C_1^* C_1^*$ and using (32), we have $C_1 C_1^* C_1^* = C_1^*$. Thus, $C_1^* C_1 = (C_1 C_1^*)(C_1^* C_1) = C_1 C_1^*$ (by (34)), hence $C_1$ is a normal partial isometry and so has the same domain and range.
Remark 2.13. 1. In the same way, it can be proved that $B_1$ is a normal partial isometry and hence has the same domain and range. 2. In an exactly similar way, it can be proved that $A_2, B_2, C_2, D_2$ are normal partial isometries and hence have the same domain and range.

Lemma 2.14. We have $A_1 C_2 = B_1 D_2 = C_1 A_2 = D_1 B_2 = 0$.

Proof. By Lemma 2.11, we have $A_1 D_1 = A_2 D_2 = C_2^* B_2^* = 0$. Now, using the fact that $D_1$, $D_2$ and $B_2$ are normal partial isometries, we have $A_1 D_1^* = A_2 D_2^* = C_2^* B_2 = 0$. Taking adjoints and applying κ, we have the first, second and the fourth equalities. To prove the third one, we take the adjoint of the relation $C_1^* B_1 = 0$ obtained from Lemma 2.11 and then apply κ.
Now we define for $i = 1, 2$
$$A_i^* A_i = P_i, \quad B_i^* B_i = Q_i, \quad C_i^* C_i = R_i, \quad S_i = 1 - P_i - Q_i - R_i,$$
$$A_i A_i^* = P_i', \quad B_i B_i^* = Q_i', \quad C_i C_i^* = R_i', \quad S_i' = 1 - P_i' - Q_i' - R_i'.$$
By Lemma 2.5, and the remark following it, we have $D_i^* D_i = 1 - (P_i + Q_i + R_i)$ and $D_i D_i^* = 1 - (P_i' + Q_i' + R_i')$. Also we note that, since $A_i, B_i, C_i, D_i$ are normal, it follows that $P_i = P_i'$, $Q_i = Q_i'$, $R_i = R_i'$, $S_i = S_i'$.

Lemma 2.15. $P_1 + R_1 = 1 - (P_2' + R_2')$.

Proof. From Lemma 2.7, $A_1 A_2 = B_1 B_2 = C_1 C_2 = D_1 D_2 = 0$ and from the first relation, we have $A_1^* A_1 A_2 A_2^* = 0$ which gives
$$P_1 P_2' = 0. \tag{35}$$
From the second relation, we have $B_1^* B_1 B_2 B_2^* = 0$, hence
$$Q_1 Q_2' = 0. \tag{36}$$
Similarly, the third and fourth relations imply
$$R_1 R_2' = 0 \tag{37}$$
and
$$(1 - (P_1 + Q_1 + R_1))(1 - (P_2' + Q_2' + R_2')) = 0 \tag{38}$$
respectively. Now applying the same method to the relations $A_1 C_2 = B_1 D_2 = C_1 A_2 = D_1 B_2 = 0$ obtained from Lemma 2.14, we obtain
$$P_1 R_2' = 0, \tag{39}$$
$$Q_1 (1 - (P_2' + Q_2' + R_2')) = 0, \tag{40}$$
$$R_1 P_2' = 0, \tag{41}$$
$$(1 - (P_1 + Q_1 + R_1)) Q_2' = 0. \tag{42}$$
From (38), we get:
$$1 - (P_2' + Q_2' + R_2') - P_1 + P_1 (P_2' + Q_2' + R_2') - Q_1 + Q_1 (P_2' + Q_2' + R_2') - R_1 + R_1 (P_2' + Q_2' + R_2') = 0.$$
Hence,
$$1 - (P_2' + Q_2' + R_2') - P_1 + P_1 (P_2' + Q_2' + R_2') - Q_1 (1 - (P_2' + Q_2' + R_2')) - R_1 + R_1 (P_2' + Q_2' + R_2') = 0.$$
Applying (40), we have
$$1 - (P_2' + Q_2' + R_2') - P_1 + P_1 (P_2' + Q_2' + R_2') - R_1 + R_1 (P_2' + Q_2' + R_2') = 0.$$
Now, using (36), we write this as:
$$-(1 - (P_1 + Q_1 + R_1)) Q_2' + 1 - P_2' - R_2' - P_1 + P_1 P_2' + P_1 R_2' - R_1 + R_1 P_2' + R_1 R_2' = 0.$$
Now using (35), (39), (41), (37), (42), we obtain $1 - P_2' - R_2' - P_1 - R_1 = 0$. So, we have $P_1 + R_1 = 1 - (P_2' + R_2')$.
Remark 2.16. 1. From Lemma 2.15 and the fact that $P_i = P_i'$, $Q_i = Q_i'$, $R_i = R_i'$, $i = 1, 2$, we have
$$P_1 + R_1 = 1 - (P_2' + R_2'), \tag{43}$$
$$P_1 + R_1 = 1 - (P_2 + R_2), \tag{44}$$
$$P_1' + R_1' = 1 - (P_2' + R_2'), \tag{45}$$
$$P_1' + R_1' = 1 - (P_2 + R_2). \tag{46}$$
2. From the above results, we observe that if $\mathcal{Q}$ is imbedded in $\mathcal{B}(H)$ for some Hilbert space $H$, then $H$ breaks up into two orthogonal complements, the first being the range of $P_1$ and $R_1$ and the other being the range of $Q_1$ and $S_1$. Let $p = P_1 + R_1$. Then $p$ is also equal to $P_1' + R_1' = Q_2 + S_2 = Q_2' + S_2'$ and $p^\perp = Q_1 + S_1 = P_2 + R_2 = P_2' + R_2' = Q_1' + S_1'$.

Lemma 2.17.
$$A_1 B_2 - B_2 A_1 = 0 = A_2 B_1 - \bar\lambda^2 B_1 A_2,$$
$$A_1 D_2 - \lambda^2 D_2 A_1 = 0 = A_2 D_1 - D_1 A_2,$$
$$C_1 B_2 - \lambda^2 B_2 C_1 = 0 = B_1 C_2 - C_2 B_1,$$
$$C_1 D_2 - D_2 C_1 = 0 = D_1 C_2 - \lambda^2 C_2 D_1.$$

Proof. From Lemma 2.8, we have $A_1 B_2 + \bar\lambda B_1 A_2 = \lambda A_2 B_1 + B_2 A_1$. So, $A_1 B_2 - B_2 A_1 = \lambda (A_2 B_1 - \bar\lambda^2 B_1 A_2)$. Now, $\mathrm{Ran}(A_1 B_2 - B_2 A_1) \subseteq \mathrm{Ran}(A_1) + \mathrm{Ran}(B_2) = \mathrm{Ran}(A_1 A_1^*) + \mathrm{Ran}(B_2 B_2^*) = \mathrm{Ran}(P_1') + \mathrm{Ran}(Q_2') \subseteq \mathrm{Ran}(p)$. On the other hand,
$$\mathrm{Ran}(A_2 B_1 - \bar\lambda^2 B_1 A_2) \subseteq \mathrm{Ran}(A_2) + \mathrm{Ran}(B_1) = \mathrm{Ran}(P_2') + \mathrm{Ran}(Q_1') \subseteq \mathrm{Ran}(p^\perp).$$
So, $A_1 B_2 - B_2 A_1 = 0 = A_2 B_1 - \bar\lambda^2 B_1 A_2$. Similarly, the other three relations can be proved.
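Let us now consider a C∗ algebra $\mathcal{B}$, which has eight direct summands, four of which are isomorphic with the commutative algebra $C(\mathbb{T}^2)$, and the other four are irrational rotation algebras. More precisely, we take $\mathcal{B} = \oplus_{k=1}^{8} C^*(U_{k1}, U_{k2})$, where for odd $k$, $U_{k1}, U_{k2}$ are the two commuting unitary generators of $C(\mathbb{T}^2)$, and for even $k$, $U_{k1} U_{k2} = \exp(4\pi i \theta) U_{k2} U_{k1}$, i.e. they generate $\mathcal{A}_{2\theta}$. We set the following:
$$\tilde A_1 := U_{11} + U_{41}, \quad \tilde B_1 := U_{52} + U_{61}, \quad \tilde C_1 := U_{21} + U_{31}, \quad \tilde D_1 := U_{71} + U_{81},$$
$$\tilde A_2 := U_{62} + U_{72}, \quad \tilde B_2 := U_{12} + U_{22}, \quad \tilde C_2 := U_{51} + U_{82}, \quad \tilde D_2 := U_{32} + U_{42}.$$
Denote by $\tilde M$ the $4 \times 4$ $\mathcal{B}$-valued matrix given by
$$\tilde M = \begin{pmatrix} \tilde A_1 & \tilde A_2 & \tilde C_1^* & \tilde C_2^* \\ \tilde B_1 & \tilde B_2 & \tilde D_1^* & \tilde D_2^* \\ \tilde C_1 & \tilde C_2 & \tilde A_1^* & \tilde A_2^* \\ \tilde D_1 & \tilde D_2 & \tilde B_1^* & \tilde B_2^* \end{pmatrix}.$$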
We have the following:

Lemma 2.18. (i) The ∗-subalgebra generated by the elements $\tilde A_i, \tilde B_i, \tilde C_i, \tilde D_i$, $i = 1, 2$ is dense in $\mathcal{B}$;
(ii) There is a unique compact (matrix) quantum group structure on $\mathcal{B}$, where the corresponding coproduct $\Delta_0$, counit $\epsilon_0$ and antipode $\kappa_0$ (say) are given on the above generating elements by
$$\Delta_0(\tilde M_{ij}) = \sum_{k=1}^{4} \tilde M_{ik} \otimes \tilde M_{kj}, \qquad \kappa_0(\tilde M_{ij}) = \tilde M_{ji}^*, \qquad \epsilon_0(\tilde M_{ij}) = \delta_{ij}.$$
The proof can be given by routine verification and hence is omitted. Moreover, we have an action of $\mathcal{B}$ on $\mathcal{A}_\theta$, as given by the following lemma.

Lemma 2.19. There is a smooth isometric action of $\mathcal{B}$ on $\mathcal{A}_\theta$, which is given by the following:
$$\alpha_0(U) = U \otimes (U_{11} + U_{41}) + V \otimes (U_{52} + U_{61}) + U^{-1} \otimes (U_{21} + U_{31}) + V^{-1} \otimes (U_{71} + U_{81}),$$
$$\alpha_0(V) = U \otimes (U_{62} + U_{72}) + V \otimes (U_{12} + U_{22}) + U^{-1} \otimes (U_{51} + U_{82}) + V^{-1} \otimes (U_{32} + U_{42}).$$

Proof. It is straightforward to verify that the above indeed defines a smooth action of the quantum group $\mathcal{B}$ on $\mathcal{A}_\theta$. To complete the proof, we need to show that $\alpha_0$ keeps the eigenspaces of $L$ invariant. For this, we observe that, since $U_{ij} U_{kl} = 0$ if $i \neq k$, we have
$$\alpha_0(U^m) = U^m \otimes (U_{11} + U_{41})^m + V^m \otimes (U_{52} + U_{61})^m + U^{-m} \otimes (U_{21} + U_{31})^m + V^{-m} \otimes (U_{71} + U_{81})^m,$$
$$\alpha_0(V^n) = U^n \otimes (U_{62} + U_{72})^n + V^n \otimes (U_{12} + U_{22})^n + U^{-n} \otimes (U_{51} + U_{82})^n + V^{-n} \otimes (U_{32} + U_{42})^n.$$
From this, it is clear that in the expression of $\alpha_0(U^m)\alpha_0(V^n)$, only coefficients of $U^i V^j$ survive, where $(i, j)$ is one of the following: $(m, n), (m, -n), (-m, n), (-m, -n), (n, m), (n, -m), (-n, m), (-n, -m)$. This completes the proof of the action being isometric.
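The last step of the proof uses the fact that every surviving monomial $U^i V^j$ has the same Laplacian eigenvalue as $U^m V^n$. The following minimal sketch checks this for a range of exponents, using only the formula $L(U^i V^j) = -(i^2 + j^2) U^i V^j$ quoted earlier; the function names are hypothetical and introduced only for this illustration.

```python
def surviving_indices(m, n):
    """Index pairs (i, j) of the monomials U^i V^j listed in the proof."""
    return {(m, n), (m, -n), (-m, n), (-m, -n),
            (n, m), (n, -m), (-n, m), (-n, -m)}

def laplacian_eigenvalue(i, j):
    """Eigenvalue of L on U^i V^j, from L(U^i V^j) = -(i^2 + j^2) U^i V^j."""
    return -(i * i + j * j)

def stays_in_eigenspace(m, n):
    """True if all surviving monomials share the eigenvalue of U^m V^n."""
    target = laplacian_eigenvalue(m, n)
    return all(laplacian_eigenvalue(i, j) == target
               for i, j in surviving_indices(m, n))
```

The eight index pairs are the orbit of $(m, n)$ under sign changes and the swap of the two coordinates, all of which preserve $i^2 + j^2$, so `stays_in_eigenspace(m, n)` holds for every pair of integers.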
Now we are in a position to describe $\mathcal{Q} = QISO(\mathcal{A}_\theta)$ explicitly.

Theorem 2.20. $\mathcal{Q} = QISO(\mathcal{A}_\theta)$ is isomorphic (as a quantum group) with $\mathcal{B} = C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta} \oplus C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta} \oplus C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta} \oplus C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta}$, with the coproduct described before.

Proof. Define $\phi : \mathcal{B} \to \mathcal{Q}$ by
$$\phi(U_{11}) = A_1 P_1' Q_2', \quad \phi(U_{12}) = B_2 P_1' Q_2', \quad \phi(U_{21}) = C_1 P_1'^{\perp} Q_2', \quad \phi(U_{22}) = B_2 P_1'^{\perp} Q_2',$$
$$\phi(U_{31}) = C_1 P_1'^{\perp} Q_2'^{\perp}, \quad \phi(U_{32}) = D_2 P_1'^{\perp} Q_2'^{\perp}, \quad \phi(U_{41}) = A_1 P_1' Q_2'^{\perp}, \quad \phi(U_{42}) = D_2 P_1' Q_2'^{\perp},$$
$$\phi(U_{51}) = C_2 P_2'^{\perp} Q_1', \quad \phi(U_{52}) = B_1 P_2'^{\perp} Q_1', \quad \phi(U_{61}) = B_1 P_2' Q_1', \quad \phi(U_{62}) = A_2 P_2' Q_1',$$
$$\phi(U_{71}) = D_1 P_2' Q_1'^{\perp}, \quad \phi(U_{72}) = A_2 P_2' Q_1'^{\perp}, \quad \phi(U_{81}) = D_1 P_2'^{\perp} Q_1'^{\perp}, \quad \phi(U_{82}) = C_2 P_2'^{\perp} Q_1'^{\perp}.$$
We show that φ is well defined and indeed gives a ∗-homomorphism. Using the facts that $A_1, B_2$ are commuting normal partial isometries, we have
$$A_1 P_1' Q_2' B_2 P_1' Q_2' = A_1 A_1 A_1^* B_2 B_2^* B_2 A_1 A_1^* B_2 B_2^* = A_1 A_1^* A_1 B_2 A_1 A_1^* B_2 B_2^* = A_1 B_2 A_1 A_1^* B_2 B_2^*,$$
$$B_2 P_1' Q_2' A_1 P_1' Q_2' = B_2 A_1 A_1^* B_2 B_2^* A_1 A_1 A_1^* B_2 B_2^* = A_1 B_2 A_1^* B_2 B_2^* A_1 A_1^* A_1 B_2 B_2^* = A_1 B_2 A_1^* B_2 B_2^* A_1 B_2 B_2^* = A_1 B_2 A_1^* B_2 A_1 B_2^* B_2 B_2^* = A_1 B_2 A_1^* A_1 B_2 B_2^* = A_1 B_2 A_1 A_1^* B_2 B_2^*.$$
So, $\phi(U_{11}) = A_1 P_1' Q_2'$ and $\phi(U_{12}) = B_2 P_1' Q_2'$ commute and they are clearly unitaries when viewed as operators on the range of $P_1' Q_2'$, which proves that there exists a unique C∗-homomorphism from $C(\mathbb{T}^2) \cong C^*(U_{11}, U_{12})$ to $\mathcal{Q}$ which sends $U_{11}$ and $U_{12}$ to $A_1 P_1' Q_2'$ and $B_2 P_1' Q_2'$ respectively. Again, using the facts that $C_1$ and $B_2$ are normal partial isometries satisfying the relation $B_2 C_1 = \frac{1}{\lambda^2} C_1 B_2$, we have
$$\phi(U_{22})\phi(U_{21}) = B_2 P_1'^{\perp} Q_2'\, C_1 P_1'^{\perp} Q_2' = \frac{1}{\lambda^2}\, C_1 P_1'^{\perp} Q_2'\, B_2 P_1'^{\perp} Q_2' = \frac{1}{\lambda^2}\, \phi(U_{21})\phi(U_{22}),$$
i.e., $\phi(U_{21})\phi(U_{22}) = \lambda^2 \phi(U_{22})\phi(U_{21})$, and they are clearly unitaries on the range of $P_1'^{\perp} Q_2'$, which proves that there exists a unique C∗-homomorphism from $\mathcal{A}_{2\theta} \cong C^*(U_{21}, U_{22})$ to $\mathcal{Q}$ which sends $U_{21}$ and $U_{22}$ to $C_1 P_1'^{\perp} Q_2'$ and $B_2 P_1'^{\perp} Q_2'$ respectively. The other cases can be worked out similarly and thus it is shown that φ defines a C∗-homomorphism from $\mathcal{B}$ to $\mathcal{Q}$ and moreover, it is easy to see that $\phi(\tilde M_{ij}) = M_{ij}$, and thus φ is a morphism of quantum groups, and it clearly satisfies $(\mathrm{id} \otimes \phi) \circ \alpha_0 = \alpha$. By universality of the quantum isometry group $\mathcal{Q}$, this completes the proof of $\mathcal{Q} \cong \mathcal{B}$ as compact quantum groups.
Remark 2.21. In particular, we note that if θ is taken to be 1/2, then we have a commutative compact quantum group as the quantum isometry group of a noncommutative C∗ algebra.

We conclude this section with an identification of the 'quantum double torus' discovered and studied by Hajac and Masuda ([12]) with an interesting quantum subgroup of $QISO(\mathcal{A}_\theta)$. Consider the C∗-ideal $\mathcal{I}$ of $\mathcal{Q}$ generated by $\tilde C_i, \tilde D_i$, $i = 1, 2$. It is easy to verify that $\mathcal{I} = C^*(U_{ik},\ i = 2, 3, 4, 5, 7, 8;\ k = 1, 2)$, hence $\mathcal{Q}/\mathcal{I} \cong C^*(U_{1k}, U_{6k},\ k = 1, 2)$. Moreover, $\mathcal{I}$ is in fact a Hopf ideal, i.e. $\mathcal{Q}/\mathcal{I}$ is a quantum subgroup of $\mathcal{Q}$. Denoting by $A_0, B_0, C_0, D_0$ the elements $U_{11}, U_{61}, U_{62}, U_{12}$ respectively, we can describe the structure of $\mathcal{Q}/\mathcal{I}$ as follows:

Theorem 2.22. Consider the C∗ algebra $\mathcal{Q}^{hol} = C(\mathbb{T}^2) \oplus \mathcal{A}_{2\theta}$, given by the generators $A_0, B_0, C_0, D_0$ (where $A_0, D_0$ correspond to $C(\mathbb{T}^2)$ and $B_0, C_0$ correspond to $\mathcal{A}_{2\theta}$), with the following coproduct:
$$\Delta_h(A_0) = A_0 \otimes A_0 + C_0 \otimes B_0, \qquad \Delta_h(B_0) = B_0 \otimes A_0 + D_0 \otimes B_0,$$
$$\Delta_h(C_0) = A_0 \otimes C_0 + C_0 \otimes D_0, \qquad \Delta_h(D_0) = B_0 \otimes C_0 + D_0 \otimes D_0.$$
Then $(\mathcal{Q}^{hol}, \Delta_h)$ is a compact quantum group isomorphic with $\mathcal{Q}/\mathcal{I}$. It has an action $\beta_0$ on $\mathcal{A}_\theta$ given by
$$\beta_0(U) = U \otimes A_0 + V \otimes B_0, \qquad \beta_0(V) = U \otimes C_0 + V \otimes D_0.$$
Moreover, $\mathcal{Q}^{hol}$ is universal among the compact quantum groups acting 'holomorphically' on $\mathcal{A}_\theta$ in the following sense: whenever a compact quantum group $(\mathcal{S}, \Delta)$ has a smooth isometric action γ on $\mathcal{A}_\theta$ satisfying the additional condition that γ leaves the subalgebra generated by $\{U^m V^n,\ m, n \ge 0\}$ invariant, then there is a unique morphism from $\mathcal{Q}^{hol}$ to $\mathcal{S}$ which intertwines the respective actions.

Proof. We need to prove only the universality of $\mathcal{Q}^{hol}$. Indeed, it follows from the universality of $\mathcal{Q}$ in the category of smooth isometric actions that there is a unique morphism φ (say) from $\mathcal{Q}$ to $\mathcal{S}$ such that $\gamma = (\mathrm{id} \otimes \phi) \circ \alpha_0$. Writing this relation on $U$, $V$ and noting that by assumption on γ, the coefficients of $U^{-1}, V^{-1}$ in the expression of $\gamma(U), \gamma(V)$ are 0, it is immediate that $\phi(\tilde C_i) = \phi(\tilde D_i) = 0$ for $i = 1, 2$, i.e. $\phi(\mathcal{I}) = \{0\}$. Thus, φ induces a morphism $\tilde\phi$ (say) from $\mathcal{Q}/\mathcal{I}$ to $\mathcal{S}$ satisfying $\gamma = (\mathrm{id} \otimes \tilde\phi) \circ \beta_0$.

3. Quantum Isometry Group of Deformed Spectral Triples

In this section, we give a general scheme for computing quantum isometry groups by proving that the quantum isometry group of a deformed noncommutative manifold coincides with (under reasonable assumptions) a similar deformation of the quantum isometry group of the original manifold. To make this precise, we introduce a few notations and terminologies.

We begin with some generalities on compact quantum groups. Given a compact quantum group $(\mathcal{G}, \Delta)$, recall that the dense unital ∗-subalgebra $\mathcal{G}_0$ of $\mathcal{G}$ generated by the matrix coefficients of the irreducible unitary representations has a canonical Hopf ∗-algebra structure. Moreover, given an action $\gamma : \mathcal{B} \to \mathcal{B} \otimes \mathcal{G}$ of the compact quantum group $(\mathcal{G}, \Delta)$ on a unital C∗-algebra $\mathcal{B}$, it is known that one can find a dense, unital ∗-subalgebra $\mathcal{B}_0$ of $\mathcal{B}$ on which the action becomes an action by the Hopf ∗-algebra $\mathcal{G}_0$ (see, for example, [7,19]). We shall use the Sweedler convention of abbreviating $\gamma(b) \in \mathcal{B}_0 \otimes_{alg} \mathcal{G}_0$ by $b_{(1)} \otimes b_{(2)}$, for $b \in \mathcal{B}_0$. This applies in particular to the canonical action of the quantum group $\mathcal{G}$ on itself, by taking $\gamma = \Delta$. Moreover, for a linear functional $f$ on $\mathcal{G}_0$ and an element $c \in \mathcal{G}_0$ we shall define the 'convolution' maps $f \ast c := (f \otimes \mathrm{id})\Delta(c)$ and $c \ast f := (\mathrm{id} \otimes f)\Delta(c)$. We also define convolution of two functionals $f$ and $g$ by $(f \ast g)(c) = (f \otimes g)(\Delta(c))$.

Let us now consider a C∗ dynamical system $(\mathcal{A}, \mathbb{T}^n, \beta)$ where β is an action of $\mathbb{T}^n$, and assume that there exists a spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$ on the smooth subalgebra $\mathcal{A}^\infty$ w.r.t. the action of $\mathbb{T}^n$, such that the spectral triple satisfies all the assumptions of [4] for ensuring the existence of the quantum isometry group. Let $\mathcal{Q} \equiv QISO(\mathcal{A})$ denote the quantum isometry group of the spectral triple $(\mathcal{A}^\infty, \mathcal{H}, D)$, with $L$ denoting the corresponding Laplacian as in [4].
Let $A_0$ be the ∗-algebra generated by the complex linear (algebraic, not closed) span $A^\infty_0$ of the eigenvectors of $L$, which has a countable discrete set of eigenvalues each with finite multiplicity, by the assumptions in [4]; it is assumed, as in [4], that $A^\infty_0$ is a subset of $A^\infty$ and is norm-dense in $A$. Moreover, we make the following assumptions:
(i) $A_0$ is dense in $A^\infty$ w.r.t. the Frechet topology coming from the action of $T^n$.
(ii) $\bigcap_{n \ge 1} \mathrm{Dom}(L^n) = A^\infty$.
(iii) $L$ commutes with the $T^n$-action β, hence $C(T^n)$ can be identified as a quantum subgroup of $Q$.
Let π denote the surjective map from $Q$ to its quantum subgroup $C(T^n)$, which is a morphism of compact quantum groups. We denote by $\alpha : A \to A \otimes Q$ the action of
Quantum Isometry Groups: Examples and Computations
$Q = QISO(A)$ on $A$, and note that on $A_0$ this action is algebraic, i.e. it is an action of the Hopf ∗-algebra $Q_0$ consisting of matrix elements of finite dimensional unitary representations of $Q$. We have $(\mathrm{id} \otimes \pi) \circ \alpha = \beta$. We shall abbreviate $e^{2\pi i u}$ by $e(u)$ ($u \in R^n$), and shall denote by η the canonical homomorphism from $R^n$ to $T^n$ given by $\eta(x_1, x_2, \ldots, x_n) = (e(x_1), e(x_2), \ldots, e(x_n))$. For $u \in R^n$, $\alpha_u$ will denote the $R^n$-action on $A$ given by $\alpha_u(a) := (\mathrm{id} \otimes \Omega(u))(\alpha(a))$, where $\Omega(u) := \mathrm{ev}_{\eta(u)} \circ \pi$ for $u \in R^n$ ($\mathrm{ev}_x$ being the state on $C(T^n)$ obtained by evaluation of a function at the point $x \in T^n$).

Let us now briefly recall Rieffel's formulation of deformation quantization (see, e.g. [14]). Let $J$ denote a skew-symmetric $n \times n$ matrix with real entries. We define a 'deformed' or 'twisted' multiplication $\times_J : A^\infty \times A^\infty \to A^\infty$ given by
\[ a \times_J b := \int\!\!\int \alpha_{Ju}(a)\, \alpha_v(b)\, e(u.v)\, du\, dv, \]
where $u.v$ denotes the standard (Euclidean) inner product on $R^n$ and the integral makes sense as an oscillatory integral, described in detail in [14] and the references therein. This defines an associative algebra structure on $A^\infty$, with the ∗ of $A$ being an involution w.r.t. the new product $\times_J$ as well, and one can also get a $C^*$-algebra, denoted by $A_J$, by completing $A^\infty$ in a suitable norm denoted by $\|\cdot\|_J$ (see [14]), which is a $C^*$-norm w.r.t. the product $\times_J$. We shall denote by $A^\infty_J$ the vector space $A^\infty$ equipped with the ∗-algebra structure coming from $\times_J$. One has a natural Frechet topology on $A^\infty_J$, given by a family of seminorms $\{\|\cdot\|_{n,J}\}$, where
\[ \|a\|_{n,J} = \sum_{|\mu| \le n} (|\mu|!)^{-1}\, \|\alpha_{X^\mu}(a)\|_J \]
($\alpha_{X^\mu}$ as in [14]), in which $A^\infty_J$ is complete. Moreover, it follows from the estimates (Proposition 4.10, p. 35) in [14] that $A^\infty = A^\infty_J$ as topological spaces, i.e. they coincide as sets and the corresponding Frechet topologies are also equivalent. In view of this, we shall denote this space simply by $A^\infty$, unless one needs to consider it as a Frechet algebra, in which case the suffix $J$ will be used.
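The twisted multiplication recalled above can be made concrete on finite Fourier expansions. The following is a minimal sketch, not the paper's construction: it assumes that an element $a$ with $\alpha_u(a) = e(p.u)\,a$ and an element $b$ with $\alpha_v(b) = e(q.v)\,b$ (for integer vectors $p, q$) multiply as $a \times_J b = e(-p.Jq)\,ab$, which is what a formal evaluation of the oscillatory integral gives. The names `twisted_product` and `bilinear` and the dictionary representation are illustrative choices only.

```python
import cmath
from collections import defaultdict

def e(x):
    """e(x) = exp(2*pi*i*x), the abbreviation used in the text."""
    return cmath.exp(2j * cmath.pi * x)

def bilinear(p, J, q):
    """Return p . (J q) for integer vectors p, q and a real matrix J."""
    n = len(p)
    return sum(p[i] * J[i][j] * q[j] for i in range(n) for j in range(n))

def twisted_product(a, b, J):
    """Deformed product of two finite Fourier expansions on the torus.

    a, b map frequency tuples p to coefficients. Assumption (see lead-in):
    modes of frequency p and q multiply with the phase e(-p.Jq).
    """
    out = defaultdict(complex)
    for p, ca in a.items():
        for q, cb in b.items():
            s = tuple(pi + qi for pi, qi in zip(p, q))
            out[s] += ca * cb * e(-bilinear(p, J, q))
    return dict(out)

# Two generators on the 2-torus and a skew-symmetric J (theta is arbitrary).
theta = 0.3
J = [[0.0, theta], [-theta, 0.0]]
U = {(1, 0): 1.0}
V = {(0, 1): 1.0}
UV = twisted_product(U, V, J)
VU = twisted_product(V, U, J)
```

Under this convention the two generators satisfy `UV = e(-2*theta) * VU`, a noncommutative-torus type relation; how the parameter here matches the θ of $A_\theta$ depends on conventions not fixed in this sketch. Associativity holds because the phase is bilinear in the frequencies.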
Assume furthermore that for each skew-symmetric matrix $J$, there exists a spectral triple on $A^\infty_J$ satisfying the assumptions in [4] for defining the quantum isometry group $QISO(A_J)$, and assume also that the corresponding Laplacians, say $L_J$, coincide with $L$ on $A^\infty \subset A_J$, so that the quantum isometry group $QISO(A_J)$ is the universal compact quantum group acting on $A_J$, with the action keeping each of the eigenspaces of $L$ invariant. Note that the algebraic span of eigenvectors of $L_J$ coincides with that of $L$, i.e. $A^\infty_0$, which is already assumed to be Frechet-dense in $A^\infty_J = A^\infty$, hence in particular norm-dense in $A_J$. We now state and prove a criterion, to be used later, for extending positive maps defined on $A_0$.

Lemma 3.1. Let $B$ be another unital $C^*$-algebra equipped with a $T^n$-action, so that we can consider the $C^*$-algebras $B_J$ for any skew-symmetric $n \times n$ matrix $J$. Let $\phi : A^\infty \to B^\infty$ be a linear map satisfying the following:
(a) φ is positive w.r.t. the deformed products $\times_J$ on $A_0$ and $B^\infty$, i.e. $\phi(a^* \times_J a) \ge 0$ (in $B^\infty_J \subset B_J$) for all $a \in A_0$, and
(b) φ extends to a norm-bounded map (say $\phi_0$) from $A$ to $B$.
Then φ also has an extension $\phi_J$ as a $\|\cdot\|_J$-bounded positive map from $A_J$ to $B_J$ satisfying $\|\phi_J\| = \|\phi(1)\|_J$.

Proof. We can view φ as a map between the Frechet spaces $A^\infty$ and $B^\infty$, which is clearly closable, since it is continuous w.r.t. the norm-topologies of $A$ and $B$, which are
weaker than the corresponding Frechet topologies. By the Closed Graph Theorem, we conclude that φ is continuous in the Frechet topology. Since $A^\infty = A^\infty_J$ and $B^\infty = B^\infty_J$ as Frechet spaces, we may consider φ as a continuous map from $A^\infty_J$ to $B^\infty_J$, and it follows by the Frechet-continuity of $\times_J$ and ∗ and the Frechet-density of $A_0$ in $A^\infty_J$ that the positivity (w.r.t. $\times_J$) of the restriction of φ to $A_0 \subset A^\infty_J$ is inherited by the extension on $A^\infty = A^\infty_J$. Indeed, given $a \in A^\infty_J = A^\infty$, choose a sequence $a_n \in A_0$ such that $a_n \to a$ in the Frechet topology. We have $\phi(a^* \times_J a) = \lim_n \phi(a_n^* \times_J a_n)$ in the Frechet topology, so in particular $\phi(a_n^* \times_J a_n) \to \phi(a^* \times_J a)$ in the norm of $B_J$, which implies that $\phi(a^* \times_J a)$ is a positive element of $B_J$, since $\phi(a_n^* \times_J a_n)$ is so for each $n$. Next, we note that $A^\infty$ is closed under holomorphic functional calculus as a unital ∗-subalgebra of $A_J$ (the identity of $A^\infty_J$ is the same as that of $A$), so any positive map defined on $A^\infty$ admits a bounded extension (say $\phi_J$) on $A_J$, which will still be a positive map, so in particular the norm of $\phi_J$ is the same as $\|\phi_J(1)\|$.
We shall also need Rieffel-type deformation of compact quantum groups (due to Rieffel and Wang, see [9,15] and references therein), w.r.t. the action by a quantum subgroup isomorphic to $C(T^n)$ for some $n$. Indeed, for each skew-symmetric $n \times n$ real matrix $J$, we can consider a $2n$-parameter action on the compact quantum group, and equip the corresponding Rieffel deformation with the structure of a compact quantum group. We will discuss it in some more detail later on.

For a fixed $J$, we shall work with several multiplications on the vector space $A_0 \otimes_{alg} Q_0$, where $Q_0$ is the dense Hopf ∗-algebra generated by the matrix coefficients of irreducible unitary representations of the quantum isometry group $Q$. We shall denote the counit and antipode of $Q_0$ by ε and κ respectively. Let us define the following:
\[ x \odot y = \int_{R^{4n}} e(-u.v)\, e(w.s)\, \big(\Omega(-Ju) \diamond x \diamond \Omega(Jw)\big)\big(\Omega(-v) \diamond y \diamond \Omega(s)\big)\, du\, dv\, dw\, ds, \]
where $x, y \in Q_0$. This is clearly a bilinear map, and will be seen to be an associative multiplication later on. Moreover, we define two bilinear maps • and $\bullet_J$ by setting $(a \otimes x) \bullet (b \otimes y) := ab \otimes x \odot y$ and $(a \otimes x) \bullet_J (b \otimes y) := (a \times_J b) \otimes (x \odot y)$, for $a, b \in A_0$, $x, y \in Q_0$. We have $\Omega(u) \diamond (\Omega(v) \diamond c) = (\Omega(u) \diamond \Omega(v)) \diamond c$.

Lemma 3.2. The map ⊙ satisfies
\[ \int_{R^{2n}} (\Omega(Ju) \diamond x) \odot (\Omega(v) \diamond y)\, e(u.v)\, du\, dv = \int_{R^{2n}} (x \diamond \Omega(Ju))(y \diamond \Omega(v))\, e(u.v)\, du\, dv, \]
for $x, y \in Q_0$.

Proof. We have LHS
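\[ = \int_{R^{2n}} (\Omega(Ju') \diamond x) \odot (\Omega(v') \diamond y)\, e(u'.v')\, du'\, dv' \]
\[ = \int_{R^{2n}} \Big\{ \int_{R^{4n}} e(-u.v)\, e(w.s)\, \big(\Omega(-Ju) \diamond (\Omega(Ju') \diamond x) \diamond \Omega(Jw)\big)\big(\Omega(-v) \diamond (\Omega(v') \diamond y) \diamond \Omega(s)\big)\, du\, dv\, dw\, ds \Big\}\, e(u'.v')\, du'\, dv' \]
\[ = \int_{R^{6n}} \big(\Omega(J(u'-u)) \diamond x \diamond \Omega(Jw)\big)\big(\Omega(v'-v) \diamond y \diamond \Omega(s)\big)\, e(u'.v')\, e(-u.v)\, e(w.s)\, du\, dv\, dw\, ds\, du'\, dv' \]
\[ = \int_{R^{2n}} e(w.s)\, dw\, ds \int_{R^{4n}} e(u'.v')\, e(-u.v)\, \big(\Omega(J(u'-u)) \diamond x_w\big)\big(\Omega(v'-v) \diamond y_s\big)\, du\, dv\, du'\, dv', \]
where $x_w = x \diamond \Omega(Jw)$, $y_s = y \diamond \Omega(s)$. The proof of the lemma will be complete if we show
\[ \int_{R^{4n}} e(u'.v')\, e(-u.v)\, \big(\Omega(J(u'-u)) \diamond x_w\big)\big(\Omega(v'-v) \diamond y_s\big)\, du\, dv\, du'\, dv' = x_w . y_s. \]
By changing variables in the above integral, with $z = u' - u$, $t = v' - v$, it becomes
\[ \int_{R^{4n}} e(-u.v)\, e((u+z).(v+t))\, \phi(z,t)\, du\, dv\, dz\, dt = \int_{R^{4n}} \phi(z,t)\, e(u.t + z.v)\, e(z.t)\, du\, dv\, dz\, dt, \]
where $\phi(z,t) = (\Omega(J(z)) \diamond x_w)(\Omega(t) \diamond y_s)$. By taking $(z,t) = X$, $(v,u) = Y$, and $F(X) = \phi(z,t)\, e(z.t)$, the integral can be written as
\[ \int\!\!\int F(X)\, e(X.Y)\, dX\, dY = F(0) \quad \text{(by Corollary 1.12 of [14], page 9)} \]
\[ = (\Omega(J(0)) \diamond x_w)(\Omega(0) \diamond y_s) = x_w . y_s, \]
since $\Omega(J(0)) \diamond x_w = (\mathrm{ev}_{\eta(0)} \pi \otimes \mathrm{id})\Delta(x_w) = (\epsilon_{T^n} \circ \pi \otimes \mathrm{id})\Delta(x_w) = (\epsilon \otimes \mathrm{id})\Delta(x_w) = x_w$ and similarly $\Omega(0) \diamond y_s = y_s$ (here $\epsilon_{T^n}$ denotes the counit of the quantum group $C(T^n)$). This proves the claim and hence the lemma.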
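Lemma 3.3. We have, for $a \in A_0$,
\[ \alpha(\alpha_u(a)) = a_{(1)} \otimes (\mathrm{id} \otimes \Omega(u))(\Delta(a_{(2)})). \]

Proof. We have
\[ \alpha_u(a) = (\mathrm{id} \otimes \Omega(u))\alpha(a) = (\mathrm{id} \otimes \Omega(u))(a_{(1)} \otimes a_{(2)}) = a_{(1)}\, (\Omega(u))(a_{(2)}). \]
This gives
\[ \alpha(\alpha_u(a)) = \alpha(a_{(1)})\, \Omega(u)(a_{(2)}) = (\mathrm{id} \otimes \mathrm{id} \otimes \Omega(u))(\alpha(a_{(1)}) \otimes a_{(2)}) = (\mathrm{id} \otimes \mathrm{id} \otimes \Omega(u))((\alpha \otimes \mathrm{id})\alpha(a)) \]
\[ = (\mathrm{id} \otimes \mathrm{id} \otimes \Omega(u))((\mathrm{id} \otimes \Delta)\alpha(a)) = a_{(1)} \otimes (\mathrm{id} \otimes \Omega(u))(\Delta(a_{(2)})). \]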
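Lemma 3.4. For $a, b \in A_0$, we have
\[ \alpha(a \times_J b) = a_{(1)} b_{(1)} \otimes \Big\{ \int\!\!\int (a_{(2)} \diamond \Omega(Ju))(b_{(2)} \diamond \Omega(v))\, e(u.v)\, du\, dv \Big\}. \]

Proof. Using the notations and definitions on pp. 4-5 of [14], we note that for any $f : R^2 \to C$ belonging to $IB(R^2)$ and fixed $x \in E$ (where $E$ is a Banach algebra), the function $F(u,v) = x f(u,v)$ belongs to $IB^E(R^2)$ and we have
\[ x \int\!\!\int f(u,v)\, e(u.v)\, du\, dv = x \Big( \lim_L \sum_{p \in L} \int\!\!\int (f\phi_p)(u,v)\, e(u.v)\, du\, dv \Big) \]
\[ = \lim_L \sum_{p \in L} \int\!\!\int x\, (f\phi_p)(u,v)\, e(u.v)\, du\, dv = \int\!\!\int x f(u,v)\, e(u.v)\, du\, dv. \]
Then,
\[ \alpha(a \times_J b) = \alpha\Big( \int\!\!\int \alpha_{Ju}(a)\, \alpha_v(b)\, e(u.v)\, du\, dv \Big) \]
\[ = \alpha\Big( \int\!\!\int a_{(1)} (\Omega(Ju))(a_{(2)})\, b_{(1)} (\Omega(v))(b_{(2)})\, e(u.v)\, du\, dv \Big) \]
\[ = \alpha\Big( a_{(1)} b_{(1)} \int\!\!\int (\Omega(Ju))(a_{(2)})\, (\Omega(v))(b_{(2)})\, e(u.v)\, du\, dv \Big) \]
\[ = \alpha(a_{(1)})\alpha(b_{(1)}) \int\!\!\int (\Omega(Ju))(a_{(2)})\, (\Omega(v))(b_{(2)})\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int \alpha(a_{(1)})\alpha(b_{(1)})\, (\Omega(Ju))(a_{(2)})\, (\Omega(v))(b_{(2)})\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int \alpha(\alpha_{Ju}(a))\, \alpha(\alpha_v(b))\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int \big(a_{(1)} \otimes (\mathrm{id} \otimes \Omega(Ju))(\Delta(a_{(2)}))\big)\big(b_{(1)} \otimes (\mathrm{id} \otimes \Omega(v))(\Delta(b_{(2)}))\big)\, e(u.v)\, du\, dv \quad \text{(using Lemma 3.3)} \]
\[ = a_{(1)} b_{(1)} \otimes \int\!\!\int (a_{(2)} \diamond \Omega(Ju))(b_{(2)} \diamond \Omega(v))\, e(u.v)\, du\, dv. \]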
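Lemma 3.5. For $a, b \in A_0$,
\[ \alpha(a) \bullet_J \alpha(b) = a_{(1)} b_{(1)} \otimes \Big\{ \int\!\!\int (\Omega(Ju) \diamond a_{(2)}) \odot (\Omega(v) \diamond b_{(2)})\, e(u.v)\, du\, dv \Big\}. \]

Proof. We have
\[ \alpha(a) \bullet_J \alpha(b) = (a_{(1)} \otimes a_{(2)}) \bullet_J (b_{(1)} \otimes b_{(2)}) = a_{(1)} \times_J b_{(1)} \otimes (a_{(2)} \odot b_{(2)}) \]
\[ = \int\!\!\int \alpha_{Ju}(a_{(1)})\, \alpha_v(b_{(1)})\, e(u.v)\, du\, dv \otimes (a_{(2)} \odot b_{(2)}). \]
Let $\epsilon : Q_0 \to C$ be the counit of the compact quantum group $Q$. So we have $(\mathrm{id} \otimes \epsilon)\alpha = \mathrm{id}$. This gives
\[ \alpha(a) \bullet_J \alpha(b) = \int\!\!\int (\mathrm{id} \otimes \epsilon)\big(\alpha(\alpha_{Ju}(a_{(1)}))\big)\, (\mathrm{id} \otimes \epsilon)\big(\alpha(\alpha_v(b_{(1)}))\big)\, e(u.v)\, du\, dv \otimes (a_{(2)} \odot b_{(2)}). \]
Note that, by Lemma 3.4,
\[ \int\!\!\int (\mathrm{id} \otimes \epsilon)\big(\alpha(\alpha_{Ju}(a_{(1)}))\big)\, (\mathrm{id} \otimes \epsilon)\big(\alpha(\alpha_v(b_{(1)}))\big)\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int (\mathrm{id} \otimes \epsilon)\big(a_{(1)(1)} \otimes (\mathrm{id} \otimes \Omega(Ju))(\Delta(a_{(1)(2)}))\big)\, (\mathrm{id} \otimes \epsilon)\big(b_{(1)(1)} \otimes (\mathrm{id} \otimes \Omega(v))(\Delta(b_{(1)(2)}))\big)\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int (\mathrm{id} \otimes \epsilon)\big(a_{(1)(1)} \otimes (a_{(1)(2)} \diamond \Omega(Ju))\big)\, (\mathrm{id} \otimes \epsilon)\big(b_{(1)(1)} \otimes (b_{(1)(2)} \diamond \Omega(v))\big)\, e(u.v)\, du\, dv \]
\[ = a_{(1)(1)} b_{(1)(1)} \int\!\!\int \epsilon(a_{(1)(2)} \diamond \Omega(Ju))\, \epsilon(b_{(1)(2)} \diamond \Omega(v))\, e(u.v)\, du\, dv. \]
Using the fact that $f \diamond \epsilon = \epsilon \diamond f = f$ for any functional $f$ on $Q_0$, one has $\epsilon(a_{(1)(2)} \diamond \Omega(Ju)) = \Omega(Ju)(a_{(1)(2)})$ and $\epsilon(b_{(1)(2)} \diamond \Omega(v)) = \Omega(v)(b_{(1)(2)})$, from which it follows that
\[ \alpha(a) \bullet_J \alpha(b) = a_{(1)(1)} b_{(1)(1)} \int\!\!\int \Omega(Ju)(a_{(1)(2)})\, \Omega(v)(b_{(1)(2)})\, e(u.v)\, du\, dv \otimes (a_{(2)} \odot b_{(2)}) \]
\[ = \int\!\!\int (\mathrm{id} \otimes \Omega(Ju) \otimes \mathrm{id})(a_{(1)(1)} \otimes a_{(1)(2)} \otimes a_{(2)}) \bullet (\mathrm{id} \otimes \Omega(v) \otimes \mathrm{id})(b_{(1)(1)} \otimes b_{(1)(2)} \otimes b_{(2)})\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int (\mathrm{id} \otimes \Omega(Ju) \otimes \mathrm{id})(a_{(1)} \otimes \Delta(a_{(2)})) \bullet (\mathrm{id} \otimes \Omega(v) \otimes \mathrm{id})(b_{(1)} \otimes \Delta(b_{(2)}))\, e(u.v)\, du\, dv \]
\[ = \int\!\!\int \{a_{(1)} \otimes (\Omega(Ju) \otimes \mathrm{id})\Delta(a_{(2)})\} \bullet \{b_{(1)} \otimes (\Omega(v) \otimes \mathrm{id})\Delta(b_{(2)})\}\, e(u.v)\, du\, dv \]
\[ = a_{(1)} b_{(1)} \otimes \int\!\!\int \big((\Omega(Ju) \otimes \mathrm{id})\Delta(a_{(2)})\big) \odot \big((\Omega(v) \otimes \mathrm{id})\Delta(b_{(2)})\big)\, e(u.v)\, du\, dv \]
\[ = a_{(1)} b_{(1)} \otimes \int\!\!\int (\Omega(Ju) \diamond a_{(2)}) \odot (\Omega(v) \diamond b_{(2)})\, e(u.v)\, du\, dv, \]
where we have used the relation $(\alpha \otimes \mathrm{id})\alpha = (\mathrm{id} \otimes \Delta)\alpha$ to get $a_{(1)(1)} \otimes a_{(1)(2)} \otimes a_{(2)} = a_{(1)} \otimes \Delta(a_{(2)})$ and similarly $b_{(1)(1)} \otimes b_{(1)(2)} \otimes b_{(2)} = b_{(1)} \otimes \Delta(b_{(2)})$.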
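Combining Lemma 3.2, Lemma 3.4 and Lemma 3.5 we conclude the following.

Lemma 3.6. For $a, b \in A_0$, we have $\alpha(a) \bullet_J \alpha(b) = \alpha(a \times_J b)$.

We shall now identify ⊙ with the multiplication of a Rieffel-type deformation of $Q$. Since $Q$ has a quantum subgroup isomorphic with $T^n$, we can consider the following canonical action λ of $R^{2n}$ on $Q$, given by
\[ \lambda_{(s,u)} = (\Omega(-s) \otimes \mathrm{id})\Delta\, (\mathrm{id} \otimes \Omega(u))\Delta. \]
Now, let $\tilde J := -J \oplus J$, which is a skew-symmetric $2n \times 2n$ real matrix, so one can deform $Q$ by defining the product of $x$ and $y$ ($x, y \in Q_0$, say) to be the following:
\[ \int\!\!\int \lambda_{\tilde J(u,w)}(x)\, \lambda_{(v,s)}(y)\, e((u,w).(v,s))\, d(u,w)\, d(v,s). \]
We claim that this is nothing but ⊙ introduced before.

Lemma 3.7. $x \odot y = x \times_{\tilde J} y$ for all $x, y \in Q_0$.

Proof. Let us first observe that
\[ \lambda_{\tilde J(u,w)}(x) = (\Omega(Ju) \otimes \mathrm{id})\Delta\, (\mathrm{id} \otimes \Omega(Jw))\Delta(x) = \Omega(Ju) \diamond x \diamond \Omega(Jw), \]
and similarly $\lambda_{(v,s)}(y) = \Omega(-v) \diamond y \diamond \Omega(s)$. Thus, we have
\[ x \odot y = \int_{R^{4n}} (\Omega(-Ju) \diamond x \diamond \Omega(Jw))(\Omega(-v) \diamond y \diamond \Omega(s))\, e(-u.v)\, e(w.s)\, du\, dv\, dw\, ds \]
\[ = \int_{R^{4n}} (\Omega(Ju') \diamond x \diamond \Omega(Jw))(\Omega(-v) \diamond y \diamond \Omega(s))\, e(u'.v)\, e(w.s)\, du'\, dv\, dw\, ds \]
\[ = \int_{R^{2n}} \int_{R^{2n}} \lambda_{\tilde J(u,w)}(x)\, \lambda_{(v,s)}(y)\, e((u,w).(v,s))\, d(u,w)\, d(v,s), \]
which proves the claim.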
Let us denote by $Q_{\tilde J}$ the $C^*$-algebra obtained from $Q$ by the Rieffel deformation w.r.t. the matrix $\tilde J$ described above. It has been shown in [9] that the coproduct Δ on $Q_0$ extends to a coproduct for the deformed algebra as well, and $(Q_{\tilde J}, \Delta)$ is a compact quantum group.

Lemma 3.8. The Haar state (say $h$) of $Q$ coincides with the Haar state on $Q_{\tilde J}$ (say $h_{\tilde J}$) on the common subspace $Q^\infty$, and moreover, $h(a \times_{\tilde J} b) = h(ab)$ for $a, b \in Q^\infty$.
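Proof. From [9] (Remark 3.10(2)), we have that $h = h_{\tilde J}$ on $Q_0$. By using $h(\Omega(-s) \otimes \mathrm{id})\Delta = \Omega(-s)(\mathrm{id} \otimes h)\Delta$ and $h(\mathrm{id} \otimes \Omega(u))\Delta = \Omega(u)(h \otimes \mathrm{id})\Delta$, we have for $a \in Q_0$,
\[ h(\lambda_{s,u}(a)) = \Omega(-s)(\mathrm{id} \otimes h)\Delta\, (\mathrm{id} \otimes \Omega(u))\Delta(a) = \Omega(-s)\big(h((\mathrm{id} \otimes \Omega(u))\Delta(a))\, 1\big) \]
\[ = h((\mathrm{id} \otimes \Omega(u))\Delta(a)) = \Omega(u)(h(a).1) = h(a). \]
Therefore, $h \lambda_{s,u}(b) = h(b)$ for all $b \in Q_0$. Now,
\[ h(a \times_{\tilde J} b) = \int\!\!\int h(\lambda_{\tilde J u}(a)\, \lambda_v(b))\, e(u.v)\, du\, dv = \int\!\!\int h(\lambda_v(\lambda_{\tilde J u - v}(a)\, b))\, e(u.v)\, du\, dv = \int\!\!\int h(\lambda_t(a)\, b)\, e(s.t)\, ds\, dt, \]
where $s = -u$, $t = \tilde J u - v$, which by Corollary 1.12 of [14] equals $h(\lambda_0(a) b) = h(ab)$. That is, we have proved
\[ \langle a, b \rangle_{\tilde J} = \langle a, b \rangle \quad \forall a, b \in Q_0, \tag{47} \]
where $\langle \cdot, \cdot \rangle_{\tilde J}$ and $\langle \cdot, \cdot \rangle$ respectively denote the inner products of $L^2(h_{\tilde J})$ and $L^2(h)$. We now complete the proof of the lemma by extending (47) from $Q_0$ to $Q^\infty$, by using the fact that $Q^\infty$ is a common subspace of the Hilbert spaces $L^2(h)$ and $L^2(h_{\tilde J})$ and moreover, $Q_0$ is dense in both these Hilbert spaces. In particular, taking $a = 1 \in Q_0$, we have $h = h_{\tilde J}$ on $Q^\infty$.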
Remark 3.9. Lemma 3.8 implies in particular that for every fixed $a_1, a_2 \in Q_0$, the functional $Q_0 \ni b \mapsto h(a_1 \times_{\tilde J} b \times_{\tilde J} a_2) = h((\kappa^2(a_2) \times_{\tilde J} a_1) b)$ extends to a bounded linear functional on $Q$.

Lemma 3.10. If $h$ is faithful on $Q$, then $h_{\tilde J}$ is faithful on $Q_{\tilde J}$.

Proof. Let $a \ge 0$, $a \in Q_{\tilde J}$, be such that $h_{\tilde J}(a) = 0$. Let $e$ be the identity of $T^{2n}$ and $U_n$ be a sequence of neighborhoods of $e$ shrinking to $e$, and let $f_n$ be smooth, positive functions with support contained inside $U_n$ such that $\int f_n(z)\, dz = 1$ for all $n$. Define $\lambda_{f_n}(a) = \int_{T^{2n}} \lambda_z(a) f_n(z)\, dz$. It is clear that $\lambda_{f_n}(a) \in Q^\infty$ and is positive in $Q_{\tilde J}$. Moreover, using the fact that the map $z \mapsto \lambda_z(a)$ is continuous for all $a$, we have $\lambda_{f_n}(a) \to a$ as $n \to \infty$. Now
\[ h_{\tilde J}(\lambda_{f_n}(a)) = \int_{T^{2n}} h_{\tilde J}(\lambda_z(a))\, f_n(z)\, dz = \int_{T^{2n}} h_{\tilde J}(a)\, f_n(z)\, dz = 0, \]
so we have $h(\lambda_{f_n}(a)) = 0$, since $h$ and $h_{\tilde J}$ coincide on $Q^\infty$ by Lemma 3.8.

Now we fix some notation which we are going to use in the rest of the proof. Let $L^2(h)$ and $L^2(h_{\tilde J})$ denote the G.N.S. spaces of $Q$ and $Q_{\tilde J}$ respectively with respect to the Haar states. Let $i$ and $i_{\tilde J}$ be the canonical maps from $Q$ and $Q_{\tilde J}$ to $L^2(h)$ and $L^2(h_{\tilde J})$ respectively. Also, let $\Pi_{\tilde J}$ denote the G.N.S. representation of $Q_{\tilde J}$. Using the facts $h(b^* \times_{\tilde J} b) = h(b^* b)$ for all $b \in Q^\infty$ and $h = h_{\tilde J}$ on $Q^\infty$, we get $\|i_{\tilde J}(b)\|^2_{L^2(h_{\tilde J})} = \|i(b)\|^2_{L^2(h)}$ for all $b \in Q^\infty$. So the map sending $i(b)$ to $i_{\tilde J}(b)$ is an isometry from a dense subspace of $L^2(h)$ onto a dense subspace of $L^2(h_{\tilde J})$, hence it extends to a unitary, say $\Gamma : L^2(h) \to L^2(h_{\tilde J})$. We also note that the maps $i$ and $i_{\tilde J}$ agree on $Q^\infty$.

Now, $\lambda_{f_n}(a) = b^* \times_{\tilde J} b$ for some $b \in Q_{\tilde J}$. So $h(\lambda_{f_n}(a)) = 0$ implies $\|i_{\tilde J}(b)\|^2_{L^2(h_{\tilde J})} = 0$. Therefore, one has $\Pi_{\tilde J}(b^*)\, i_{\tilde J}(b) = 0$, and hence $i_{\tilde J}(b^* b) = i_{\tilde J}(\lambda_{f_n}(a)) = 0$. It thus follows that $\Gamma(i(\lambda_{f_n}(a))) = 0$, which implies $i(\lambda_{f_n}(a)) = 0$. But the faithfulness of $h$ means that $i$ is one-to-one, hence $\lambda_{f_n}(a) = 0$ for all $n$. Thus, $a = \lim_{n \to \infty} \lambda_{f_n}(a) = 0$, which proves the faithfulness of $h_{\tilde J}$.
Theorem 3.11. If the Haar state is faithful on $Q$, then $\alpha : A_0 \to A_0 \otimes Q_0$ extends to an action of the compact quantum group $Q_{\tilde J}$ on $A_J$, which is isometric, smooth and faithful.

Proof. We have already seen in Lemma 3.6 that α is an algebra homomorphism from $A_0$ to $A_0 \otimes_{alg} Q_0$ (w.r.t. the deformed products), and it is also a ∗-homomorphism since it is so for the undeformed case and the involution ∗ is the same for the deformed and undeformed algebras. It now suffices to show that α extends to $A_J$ as a $C^*$-homomorphism. Let us fix any faithful imbedding $A_J \subseteq B(H_0)$ (where $H_0$ is a Hilbert space) and consider the imbedding $Q_{\tilde J} \subseteq B(L^2(h_{\tilde J}))$. By definition, the norm on $A_J \otimes Q_{\tilde J}$ is the minimal (injective) $C^*$-norm, so it is equal to the norm inherited from the imbedding $A_J \otimes_{alg} Q_{\tilde J} \subseteq B(H_0 \otimes L^2(h_{\tilde J}))$. Let us consider the dense subspace $D \subset H_0 \otimes L^2(h_{\tilde J})$ consisting of vectors which are finite linear combinations of the form $\sum_i u_i \otimes x_i$, with $u_i \in H_0$, $x_i \in Q_0 \subset L^2(h_{\tilde J})$. Fix such a vector $\xi = \sum_{i=1}^k u_i \otimes x_i$ and consider $B := A \otimes M_k(C)$, with the $T^n$-action $\beta \otimes \mathrm{id}$ on $B$. Let $\phi : A^\infty \to B^\infty$ be the map given by
\[ \phi(a) := \big( (\mathrm{id} \otimes \phi_{(x_i,x_j)})(\alpha(a)) \big)_{i,j=1}^k, \]
where $\phi_{(x,y)}(z) := h(x^* \times_{\tilde J} z \times_{\tilde J} y)$ for $x, y, z \in Q_0$. Note that the range of φ is in $B^\infty = A^\infty \otimes M_k(C)$ since we have $\phi_{(x,y)}(A^\infty) \subseteq A^\infty$ by Remark 2.16 of [4], using our assumption (ii) that $\bigcap_{n \ge 1} \mathrm{Dom}(L^n) = A^\infty$. Since α maps $A_0$ into $A_0 \otimes_{alg} Q_0$ and $h = h_{\tilde J}$ on $Q_0$, it is easy to see that for $a \in A_0$, $\phi(a^* \times_J a)$ is positive in $B_J$. Moreover, by Remark 3.9, $\phi_{(x_i,x_j)}$ extends to $Q$ as a bounded linear functional, hence φ extends to a bounded linear (but not necessarily positive) map from $A$ to $B$. Thus, the hypotheses of Lemma 3.1 are satisfied and we conclude that φ admits a positive extension, say $\phi_J$, from $A_J$ to $B_J = A_J \otimes M_k(C)$. Thus, we have for $a \in A_0$,
\[ \sum_{i,j=1}^k \langle u_i, \phi(a^* \times_J a) u_j \rangle \le \|a\|_J^2 \sum_{ij} \langle u_i, \phi(1) u_j \rangle = \|a\|_J^2 \sum_{ij} \langle u_i, u_j \rangle\, h(x_i^* \times_{\tilde J} x_j) \]
\[ = \|a\|_J^2 \sum_{ij} \langle u_i \otimes x_i, u_j \otimes x_j \rangle = \|a\|_J^2\, \Big\| \sum_{i=1}^k u_i \otimes x_i \Big\|^2. \]
This implies $\|\alpha(a)\xi\|^2 = \langle \xi, \alpha(a^* \times_J a)\xi \rangle \le \|a\|_J^2 \|\xi\|^2$ for all $\xi \in D$ and $a \in A_0$, hence α admits a bounded extension which is clearly a $C^*$-homomorphism.
Let $C_J$ be the category of compact quantum groups acting isometrically on $A_J$, with objects being the pairs $(S, \alpha_S)$, where the compact quantum group $S$ acts isometrically on $A_J$ by the action $\alpha_S$. If the action is understood, we may simply write $(S, \alpha_S)$ as $S$. For any two compact quantum groups $S_1$ and $S_2$ in $C_J$, we write $S_1 < S_2$ if there is a surjective $C^*$-homomorphism π from $S_2$ to $S_1$ preserving the respective coproducts (i.e. $S_1$ is a quantum subgroup of $S_2$) and π also satisfies $\alpha_{S_1} = (\mathrm{id} \otimes \pi) \circ \alpha_{S_2}$.

Remark 3.12. It can easily be seen that $S_1 < S_2$ means that $(S_1)_{\tilde J} < (S_2)_{\tilde J}$.

Theorem 3.13. If the Haar state on $QISO(A)$ is faithful, we have the isomorphism of compact quantum groups:
\[ (QISO(A))_{\tilde J} \cong QISO(A_J). \]

Proof. Let $Q(A_J)$ be the universal object in $C_J$. By Theorem 3.11, we have seen that $(Q(A))_{\tilde J}$ also acts faithfully, smoothly and isometrically on $A_J$, which implies $(Q(A))_{\tilde J} < Q(A_J)$ in $C_J$. So, by Remark 3.12, $((Q(A))_{\tilde J})_{-\tilde J} < (Q(A_J))_{-\tilde J}$ in $C_0$, hence $Q(A) < (Q(A_J))_{-\tilde J}$. Replacing $A$ by $A_{-J}$, we have
\[ Q(A_{-J}) < (Q((A_{-J})_J))_{-\tilde J} \ (\text{in } C_{-J}) \cong (Q(A))_{-\tilde J} \ (\text{in } C_{-J}) \cong (Q(A))_{\widetilde{-J}}. \]
Thus, $Q(A_J) < (Q(A))_{\tilde J}$ in $C_J$, which implies $Q(A_J) \cong (Q(A))_{\tilde J}$ in $C_J$.
Example 3.14. We recall that $A_\theta$ is a Rieffel-type deformation of $C(T^2)$ (see [14], Ex. 10.2, p. 69), and it can be easily verified that in this case the hypotheses of this section are true. So Theorem 3.13 can be applied to compute $QISO(A_\theta)$. This gives an alternative way to prove the results obtained in Subsection 2.3.

Example 3.15. We can apply our result to the isospectral deformations of compact oriented Riemannian manifolds considered in [16], in particular to the deformations $S^n_\theta$ of the classical n-sphere, with the spectral triple defined in [16]. Since we have proved that $QISO(S^n) \cong C(O(n))$, it will follow that $QISO(S^n_\theta) \cong O_\theta(n)$, where $O_\theta(n)$ is the compact quantum group obtained in [16] as the θ-deformation of $C(O(n))$.

Remark 3.16. We would like to conclude this article with the following important and interesting open question: does there exist a connected, compact manifold whose quantum isometry group is noncommutative as a $C^*$-algebra? We have already observed that for $S^n$, $T^1$, $T^2$, the answer is negative.

Acknowledgement. The authors thank P. Hajac and S.L. Woronowicz for some stimulating discussion.
References
1. Banica, T.: Quantum automorphism groups of small metric spaces. Pacific J. Math. 219(1), 27–51 (2005)
2. Banica, T.: Quantum automorphism groups of homogeneous graphs. J. Funct. Anal. 224(2), 243–280 (2005)
3. Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
4. Goswami, D.: Quantum Group of Isometries in Classical and Noncommutative Geometry. Preprint, 2007
5. Wang, S.: Free products of compact quantum groups. Commun. Math. Phys. 167(3), 671–692 (1995)
6. Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory and non-commutative geometry. Commun. Math. Phys. 203(1), 119–184 (1999)
7. Wang, S.: Quantum symmetry groups of finite spaces. Commun. Math. Phys. 195, 195–211 (1998)
8. Wang, S.: Structure and isomorphism classification of compact quantum groups Au(Q) and Bu(Q). J. Operator Theory 48, 573–583 (2002)
9. Wang, S.: Deformation of Compact Quantum Groups via Rieffel's Quantization. Commun. Math. Phys. 178, 747–764 (1996)
10. Woronowicz, S.L.: Compact quantum groups. In: Symétries quantiques (Quantum symmetries) (Les Houches, 1995), edited by A. Connes et al., Amsterdam: Elsevier 1998, pp. 845–884
11. Helgason, S.: Topics in Harmonic analysis on homogeneous spaces. Boston-Basel-Stuttgart: Birkhäuser, 1981
12. Hajac, P., Masuda, T.: Quantum Double-Torus. Comptes Rendus Acad. Sci. Paris 327(6), Ser. I, Math. 553–558 (1998)
13. Maes, A., Van Daele, A.: Notes on Compact Quantum Groups. Nieuw Arch. Wisk. (4) 16(1–2), 73–112 (1998)
14. Rieffel, M.A.: Deformation Quantization for actions of R^d. Memoirs of the American Mathematical Society, Volume 106, Number 506, 1993
15. Rieffel, M.A.: Compact Quantum Groups associated with toral subgroups. Contemp. Math. 145, 465–491 (1992)
16. Connes, A., Dubois-Violette, M.: Noncommutative finite-dimensional manifolds. I. Spherical manifolds and related examples. Commun. Math. Phys. 230(3), 539–579 (2002)
17. Soltan, P.M.: Quantum families of maps and quantum semigroups on finite quantum spaces. http://arXiv.org/list/math/0610922, 2006
18. Woronowicz, S.L.: Pseudogroups, pseudospaces and Pontryagin duality. In: Proceedings of the International Conference on Mathematical Physics, Lausanne (1979), Lecture Notes in Physics 116, Berlin-Heidelberg-New York: Springer, 1980, pp. 407–412
19. Podles, P.: Symmetries of quantum spaces. Subgroups and quotient spaces of quantum SU(2) and SO(3) groups. Commun. Math. Phys. 170, 1–20 (1995)

Communicated by A. Connes
Commun. Math. Phys. 285, 445–468 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0583-5
Communications in
Mathematical Physics
Melting Crystal, Quantum Torus and Toda Hierarchy Toshio Nakatsu1 , Kanehisa Takasaki2 1 Department of Physics, Graduate School of Science, Osaka University,
Toyonaka, Osaka 560-0043, Japan. E-mail:
[email protected] 2 Graduate School of Human and Environmental Studies, Kyoto University,
Yoshida, Sakyo, Kyoto 606-8501, Japan. E-mail:
[email protected] Received: 8 November 2007 / Accepted: 14 April 2008 Published online: 1 August 2008 – © Springer-Verlag 2008
Abstract: Searching for the integrable structures of supersymmetric gauge theories and topological strings, we study the melting crystal model, also known as random plane partition, from the viewpoint of integrable systems. We show that a series of partition functions of melting crystals gives rise to a tau function of the one-dimensional Toda hierarchy, where the models are defined by adding suitable potentials, endowed with a series of coupling constants, to the standard statistical weight. These potentials can be converted to a commutative sub-algebra of the quantum torus Lie algebra. This perspective reveals a remarkable connection between random plane partition and the quantum torus Lie algebra, and is what essentially enables us to prove the statement. Based on the result, we briefly discuss the integrable structures of five-dimensional N = 1 supersymmetric gauge theories and A-model topological strings. The aforementioned potentials correspond to gauge theory observables analogous to the Wilson loops, and thereby the partition functions are translated in the gauge theory to generating functions of their correlators. In topological strings, we comment in particular on a possibility of topology change caused by condensation of these observables, giving a simple example.

1. Introduction

An unanticipated but very exciting connection between the statistical mechanical problem of the melting crystal, known as random plane partition, and A-model topological strings has been revealed [1], based on the topological vertex [2,3]. The topological vertex is a diagrammatical method which enables one to compute all genus topological A-model string amplitudes for a certain class of local geometries. We can view the positive octant $Z^3_{\ge 0} \subset R^3$ as the neighborhood of a corner of the crystal, by putting unit cubes (atoms) on the lattice points in the octant. The frozen crystal occupies the whole positive octant $Z^3_{\ge 0}$. As the crystal melts, we remove atoms from the corner.
We identify the configuration of crystal melting as the configuration of plane partition or the three-dimensional Young diagrams, as depicted in Fig. 1.
Fig. 1. The corner of the melting crystal and the corresponding plane partition (the three-dimensional Young diagram)

Removing each atom contributes the factor $q = e^{-\mu/T}$ to the Boltzmann weight of the configuration, where $T$ is the temperature and µ is the chemical potential. Heating up the crystal leads to melting of it.

Random plane partition also has a significant relation with five-dimensional N = 1 supersymmetric gauge theories. Nekrasov's formula [4,5] for five-dimensional N = 1 supersymmetric SU(N) Yang-Mills theory can be retrieved from the partition function of a random plane partition [6], where the model is interpreted as a q-deformed random partition. The all genus topological A-model string amplitude for the local SU(N) geometry is evaluated by the topological vertex and reproduces Nekrasov's formula for N = 2 SU(N) gauge theory [7,8], as predicted in the geometric engineering. It is shown in [5] that the Seiberg-Witten solutions [9] of four-dimensional N = 2 supersymmetric gauge theories emerge through random partition, where Nekrasov's functions for four-dimensional N = 2 supersymmetric gauge theories are understood as the partition functions of a random partition. The integrable structure of a random partition is elucidated in [10], and thereby the integrability of correlation functions among certain observables in four-dimensional N = 2 supersymmetric gauge theories is explained. Motivated by these results, we study in this article the integrable structure of a random plane partition in order to search for the integrable structures of supersymmetric gauge theories and topological strings.

A partition $\lambda = (\lambda_1, \lambda_2, \cdots)$ is a sequence of non-negative integers satisfying $\lambda_i \ge \lambda_{i+1}$ for all $i \ge 1$. Partitions are identified with the Young diagrams. The size is defined by $|\lambda| = \sum_{i \ge 1} \lambda_i$, which is the total number of boxes of the diagram. A plane partition π is an array of non-negative integers
\[ \begin{array}{cccc} \pi_{11} & \pi_{12} & \pi_{13} & \cdots \\ \pi_{21} & \pi_{22} & \pi_{23} & \cdots \\ \pi_{31} & \pi_{32} & \pi_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{array} \tag{1.1} \]
satisfying $\pi_{ij} \ge \pi_{i+1\,j}$ and $\pi_{ij} \ge \pi_{i\,j+1}$ for all $i, j \ge 1$. Plane partitions are identified with the three-dimensional Young diagrams. The three-dimensional diagram π is a set of unit cubes such that $\pi_{ij}$ cubes are stacked vertically on each $(i, j)$-element of π. The size is defined by $|\pi| = \sum_{i,j \ge 1} \pi_{ij}$, which is the total number of cubes of the diagram. Diagonal slices of a plane partition π become partitions, as depicted in Fig. 2.
Fig. 2. Plane partition (the three-dimensional Young diagram) (a) and the corresponding sequence of partitions (the two-dimensional Young diagrams) (b)
Let $\pi(m)$ denote the partition along the $m$-th diagonal slice, where $m \in Z$. In particular, $\pi(0) = (\pi_{11}, \pi_{22}, \cdots)$ is the main diagonal partition. This series of partitions satisfies the condition
\[ \cdots \prec \pi(-2) \prec \pi(-1) \prec \pi(0) \succ \pi(1) \succ \pi(2) \succ \cdots, \tag{1.2} \]
where $\mu \succ \nu$ means the interlace relation between two partitions µ, ν,
\[ \mu \succ \nu \iff \mu_1 \ge \nu_1 \ge \mu_2 \ge \nu_2 \ge \mu_3 \ge \cdots. \tag{1.3} \]
A statistical model of plane partitions is introduced by the following partition function:
\[ Z \equiv \sum_{\pi} q^{|\pi|}, \tag{1.4} \]
where the sum is over all plane partitions. The parameter $q$ is an indeterminate satisfying $0 < q < 1$ and plays the role of the chemical potential (the energy of the removal of an atom from the crystal). Summing over plane partitions, we obtain
\[ \sum_{\pi} q^{|\pi|} = \prod_{n=1}^{\infty} \frac{1}{(1 - q^n)^n}. \tag{1.5} \]
The partition function of the model is the generating function of plane partitions, known as the McMahon function.
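The product formula (1.5) and the interlacing condition (1.2) can be checked by brute force for small sizes. The sketch below is only an illustration: the helper names are hypothetical, plane partitions are stored as lists of rows, and the sign convention for the slice index $m$ (entries $\pi_{i,i+m}$ for $m \ge 0$) is an assumption, since the text does not fix it here.

```python
def rows_below(bound, max_size):
    """Yield nonempty weakly decreasing rows r with r[j] <= bound[j], sum(r) <= max_size."""
    def rec(j, prev, remaining):
        if j == len(bound):
            return
        for v in range(1, min(prev, bound[j], remaining) + 1):
            yield (v,)
            for tail in rec(j + 1, v, remaining - v):
                yield (v,) + tail
    yield from rec(0, max_size, max_size)

def count_plane_partitions(n):
    """Count plane partitions of size exactly n, building them row by row."""
    def rec(bound, remaining):
        if remaining == 0:
            return 1
        return sum(rec(row, remaining - sum(row))
                   for row in rows_below(bound, remaining))
    return rec((n,) * n, n)

def macmahon_coefficients(N):
    """Coefficients of prod_{n>=1} (1 - q^n)^(-n) up to q^N."""
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(n):              # multiply by 1/(1 - q^n), n times
            for d in range(n, N + 1):
                coeffs[d] += coeffs[d - n]
    return coeffs

def entry(pp, i, j):
    """Entry of a plane partition given as a list of rows (0-indexed), 0 outside."""
    return pp[i][j] if i < len(pp) and j < len(pp[i]) else 0

def diagonal_slice(pp, m):
    """Partition along the m-th diagonal (assumed convention, see lead-in)."""
    out, i = [], 0
    while True:
        v = entry(pp, i, i + m) if m >= 0 else entry(pp, i - m, i)
        if v == 0:
            return tuple(out)
        out.append(v)
        i += 1

def interlaces(mu, nu):
    """True if mu_1 >= nu_1 >= mu_2 >= nu_2 >= ... as in (1.3)."""
    L = max(len(mu), len(nu)) + 1
    mu = tuple(mu) + (0,) * (L - len(mu))
    nu = tuple(nu) + (0,) * (L - len(nu))
    return all(mu[i] >= nu[i] >= mu[i + 1] for i in range(L - 1))

example = [[3, 2, 1], [2, 1], [1]]
slices = {m: diagonal_slice(example, m) for m in range(-2, 3)}
```

For the example array, the slices are `(3, 1)` for $m = 0$, `(2,)` for $m = \pm 1$ and `(1,)` for $m = \pm 2$, and their sizes add up to $|\pi| = 10$.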
1.1. The models. We introduce a series of potentials for partitions and put the main diagonal partition of π in these potentials. For later convenience, we introduce them as functions on charged partitions, which are partitions paired with integers. Let $\Phi_k$ ($k = 1, 2, \cdots$) be the following functions on charged partitions:
\[ \Phi_k(\lambda, p) = \sum_{i=1}^{\infty} q^{k(p + \lambda_i - i + 1)} - \sum_{i=1}^{\infty} q^{k(-i+1)}, \tag{1.6} \]
where $(\lambda, p)$ denotes a charged partition. Actually, the right hand side of this formula becomes a finite sum by cancellation of terms between the two sums. More precisely,
\[ \Phi_k(\lambda, p) = \sum_{i=1}^{\infty} \big( q^{k(p + \lambda_i - i + 1)} - q^{k(p - i + 1)} \big) + q^k\, \frac{1 - q^{pk}}{1 - q^k}. \tag{1.7} \]
With each fixed value of $p$, these provide a series of potentials for partitions. These functions have been exploited in [10] from the four-dimensional gauge theory viewpoint, with $q$ or $q^k$ being replaced by a generating spectral parameter. Introducing the coupling constants $t = (t_1, t_2, \cdots)$, we write their combination as
\[ \Phi(t; p)(\lambda) = \sum_{k=1}^{\infty} t_k\, \Phi_k(\lambda, p). \tag{1.8} \]
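The cancellation behind the finite-sum form (1.7) can be made explicit in a few lines. This is a sketch under stated assumptions: `phi_k` implements (1.7) directly, while `phi_k_truncated` is a hypothetical regularisation of (1.6) that cuts the first sum at $i = M$ and the second at $i = M - p$, so that the tails cancel term by term. Exact rational arithmetic is used so the two can be compared without rounding.

```python
from fractions import Fraction

def phi_k(lam, p, k, q):
    """Phi_k(lambda, p) from the finite-sum form (1.7)."""
    finite = sum(q ** (k * (p + part - i + 1)) - q ** (k * (p - i + 1))
                 for i, part in enumerate(lam, start=1))
    return finite + q ** k * (1 - q ** (p * k)) / (1 - q ** k)

def phi_k_truncated(lam, p, k, q, M):
    """(1.6) with the first sum cut at i = M and the second at i = M - p.

    Assumes M >= len(lam) and M - p >= 0, so the discarded tails cancel.
    """
    padded = list(lam) + [0] * (M - len(lam))
    first = sum(q ** (k * (p + part - i + 1))
                for i, part in enumerate(padded, start=1))
    second = sum(q ** (k * (-i + 1)) for i in range(1, M - p + 1))
    return first - second

q = Fraction(1, 2)
lam = (3, 1, 1)
values = {(p, k): phi_k(lam, p, k, q) for p in (-2, 0, 3) for k in (1, 2)}
```

The combination (1.8) is then just `sum(t[k - 1] * phi_k(lam, p, k, q) for k in range(1, len(t) + 1))` for a finite list of couplings `t`.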
The partition function of the random plane partition whose main diagonal partition is in the potential (1.8) is defined by Z p (t) ≡ q |π | e(t; p) (π(0)) . (1.9) π
The model has an interpretation as a q-deformed random partition. To see this, note that, by virtue of the interlacing relations (1.2), the two series of partitions $\{\pi(m)\}_{m=0}^{\infty}$ and $\{\pi(-m)\}_{m=0}^{\infty}$ represent a pair $(T, T')$ of semi-standard Young tableaux of shape π(0), in which the m-th skew Young diagram π(±m)/π(±(m + 1)) is filled with m + 1. The partition function can thereby be reorganized into a sum over the Young diagram λ = π(0) and the pair $(T, T')$ of semi-standard Young tableaux of shape λ as
$$Z_p(t) = \sum_{\lambda}\ \sum_{T,T':\ \text{shape}\ \lambda} q^{T} q^{T'}\, e^{\Phi(t;p)(\lambda)}, \tag{1.10}$$
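where $q^{T} = \prod_{m=0}^{\infty} q^{(m+\frac12)(|\pi(m)|-|\pi(m+1)|)}$ and $q^{T'} = \prod_{m=0}^{\infty} q^{(m+\frac12)(|\pi(-m)|-|\pi(-m-1)|)}$. The partial sum over the semi-standard tableaux gives the Schur function $s_\lambda(q^{\rho}) = s_\lambda(x_1, x_2, \cdots)$ specialized to $x_i = q^{i-\frac12}$:
$$\sum_{T:\ \text{shape}\ \lambda} q^{T} = \sum_{T':\ \text{shape}\ \lambda} q^{T'} = s_\lambda(q^{\rho}). \tag{1.11}$$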
Therefore the partition function can eventually be expressed as
$$Z_p(t) = \sum_{\lambda} e^{\Phi(t;p)(\lambda)}\, s_\lambda(q^{\rho})^{2}. \tag{1.12}$$
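The representation of $s_\lambda(q^{\rho})$ in terms of the hook polynomial [11] allows us to write it further as
$$Z_p(t) = \sum_{\lambda}\left(\frac{q^{\frac12|\lambda|+n(\lambda)}}{\prod_{s\in\lambda}\left(1-q^{h(s)}\right)}\right)^{2} e^{\Phi(t;p)(\lambda)}, \tag{1.13}$$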
where h(s) denotes the hook length of the box s ∈ λ.
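A quick consistency check of (1.13): at t = 0 the sum over λ should reproduce the plane-partition generating function (1.5). The sketch below expands each summand as a power series in q, assuming the standard convention n(λ) = Σ (i − 1) λ_i, which is not spelled out in this section; all function names are illustrative.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(lam):
    """Hook lengths h(s) of all boxes s of the Young diagram lam."""
    if not lam:
        return []
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    return [lam[i] - j + conj[j] - i - 1
            for i in range(len(lam)) for j in range(lam[i])]


def schur_squared_series(lam, order):
    """Series of s_lam(q^rho)^2 from the hook formula, truncated at q^order."""
    n_lam = sum(i * part for i, part in enumerate(lam))   # n(lam) = sum (i-1) lam_i
    lowest = sum(lam) + 2 * n_lam                          # leading power of q
    series = [0] * (order + 1)
    if lowest > order:
        return series
    series[lowest] = 1
    for h in hook_lengths(lam):
        for _ in range(2):                                 # divide by (1 - q^h)^2
            for d in range(h, order + 1):
                series[d] += series[d - h]
    return series


def partition_function_series(order):
    """Coefficients of Z at t = 0, i.e. sum over lam of s_lam(q^rho)^2, up to q^order."""
    total = [0] * (order + 1)
    for size in range(order + 1):
        for lam in partitions(size):
            for d, c in enumerate(schur_squared_series(lam, order)):
                total[d] += c
    return total


if __name__ == "__main__":
    print(partition_function_series(7))
```

The output should coincide with the first coefficients of the product in (1.5), namely the numbers of plane partitions with 0, 1, 2, ... boxes.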
1.2. Main result. In this article, we study these models of random plane partition from the viewpoint of integrable systems. Our main result is that the following series of the partition functions is a tau function of an integrable hierarchy:
$$\tau(t;p) \equiv q^{\frac16 p(p+1)(2p+1)}\, Z_p(t), \qquad p\in\mathbb{Z}. \tag{1.14}$$
More precisely, we show in the text that τ(t; p) is a tau function of the one-dimensional Toda hierarchy, where the coupling constants t = (t1, t2, · · ·) are interpreted as a single series of time variables of the one-dimensional Toda hierarchy. In particular, τ(t; p) is shown to have a representation by two-dimensional free fermions or free bosons as
$$\tau(t;p) = e^{\sum_{k=1}^{\infty}\frac{t_k q^{k}}{1-q^{k}}}\ \langle p|\; e^{\frac12\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\; g_{\star}\; e^{\frac12\sum_{k=1}^{\infty}(-)^{k} t_k J_{-k}}\; |p\rangle. \tag{1.15}$$
In the right-hand side of this formula, $g_{\star}$ is an element of GL(∞) and $J_{\pm k}$ denote the modes of the U(1) current associated with the complex fermions. Thus τ(t; p) is a matrix element, taken between the Dirac seas of U(1) charge p, of an element of GL(∞). The exponential operators on both sides of $g_{\star}$ are generators of the commuting flows of the one-dimensional Toda hierarchy.

Once it is known that the partition function is related to the tau function, one can obtain an infinite set of non-linear differential equations that the partition function obeys. This is because tau functions of an integrable hierarchy satisfy an infinite set of non-linear differential equations. These non-linear differential equations are all encoded in the bilinear identity. It follows from the above that the partition functions satisfy the following bilinear identities in the one-dimensional Toda hierarchy:
$$\oint\frac{dz}{2\pi i}\, z^{p-p'}\, e^{\frac12\sum_{k\ge1}(t_k-t'_k)z^{k}}\, Z_p(t-[z^{-1}])\,Z_{p'}(t'+[z^{-1}]) = q^{(p+p'+1)(p-p'+1)}\oint\frac{dz}{2\pi i}\, z^{p-p'}\, e^{-\frac12\sum_{k\ge1}(t_k-t'_k)z^{-k}}\, Z_{p+1}(t+[z])\,Z_{p'-1}(t'-[z]), \tag{1.16}$$
where p, p′ are arbitrary. The integral on the left-hand side means taking the residue at z = ∞ and multiplying it by −1; the integral on the right-hand side is understood to be the residue at z = 0. We also use notations like $t \pm [z] = (t_1 \pm z,\ t_2 \pm \frac{z^2}{2},\ t_3 \pm \frac{z^3}{3}, \cdots)$. Towers of non-linear differential equations are obtained from (1.16) as the coefficients of the Taylor expansions along the diagonal t = t′, that is, the coefficients of the expansions of the bilinear identities in the variables $t_k - t'_k$. For instance, let p = p′. The first equation one gets in the tower is nothing but the Toda equation in Hirota's bilinear form,
$$D_{t_1}^{2}\, Z_p\cdot Z_p = q^{2p+1} Z_{p+1} Z_{p-1}, \tag{1.17}$$
where D denotes Hirota's derivative, defined by $D_x f(x)\cdot g(x) = \lim_{y\to x}(\partial_x-\partial_y) f(x)g(y)$.
The partition functions also give rise to a solution of the modified KP hierarchy. The corresponding bilinear identities read
$$\oint\frac{dz}{2\pi i}\, z^{p-p'}\, e^{\sum_{k\ge1}(t_k-t'_k)z^{k}}\, Z_p(t-[z^{-1}])\,Z_{p'}(t'+[z^{-1}]) = 0, \tag{1.18}$$
where p ≥ p′. These identities include in the towers the modified KP equations as well as the KP equation. For instance, let p′ = p − 1. The first equation one gets in this tower is
$$(D_{t_1}^{2} - D_{t_2})\, Z_p(t)\cdot Z_{p-1}(t) = 0. \tag{1.19}$$
This is the first equation of the modified KP hierarchy.
1.3. Transfer matrix approach. The main tool we use in the text is the transfer matrix formulation of the random plane partition. The hamiltonian picture is suggested by the interlacing relations (1.2), which state that plane partitions are certain evolutions of partitions in the discretized time m. In particular, the transfer matrix formulation [12] makes it possible to express the partition function (1.9) in terms of two-dimensional conformal field theory (2d free fermions). Let $\psi(z) = \sum_{m\in\mathbb{Z}}\psi_m z^{-m-1}$ and $\psi^{*}(z) = \sum_{m\in\mathbb{Z}}\psi^{*}_m z^{-m}$ be complex fermions with the anti-commutation relations $\{\psi_m,\psi^{*}_n\} = \delta_{m+n,0}$ and $\{\psi_m,\psi_n\} = \{\psi^{*}_m,\psi^{*}_n\} = 0$. The Noether current of the U(1) rotation is given by
$$J(z) = {:}\psi(z)\psi^{*}(z){:} = \sum_{m\in\mathbb{Z}} z^{-m-1} J_m, \tag{1.20}$$
where : : denotes the normal ordering of fermions that is defined by
$$\psi(z)\psi^{*}(w) = {:}\psi(z)\psi^{*}(w){:} + \frac{1}{z-w}, \qquad |z|>|w|. \tag{1.21}$$
It is well known that partitions are realized as states of the fermion Fock space. In particular, charged partitions are realized by the states of the same U(1) charges. For a charged partition (λ, p), the corresponding state is
$$|\lambda;p\rangle = \prod_{i=1}^{\infty}\psi_{i-\lambda_i-1-p}\,\psi^{*}_{-i+1+p}\,|p\rangle, \tag{1.22}$$
where $|p\rangle$ denotes the Dirac sea having the U(1) charge p, and is defined by the conditions
$$\psi_m|p\rangle = 0 \ \text{ for } \forall\, m\ge -p, \qquad \psi^{*}_m|p\rangle = 0 \ \text{ for } \forall\, m\ge p+1. \tag{1.23}$$
Using the realization (1.22), it can be seen that the function (1.6) corresponds to the following fermion bilinear operator:
$$H_k \equiv \sum_{m\in\mathbb{Z}} q^{km}\, {:}\psi_{-m}\psi^{*}_m{:}. \tag{1.24}$$
Actually, the potential function (1.6) is reproduced as
$$H_k\,|\lambda;p\rangle = \Phi_k(\lambda,p)\,|\lambda;p\rangle. \tag{1.25}$$
These operators are commutative. The following combination reproduces the potential (1.8):
$$H(t) = \sum_{k=1}^{\infty} t_k H_k. \tag{1.26}$$
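The transfer matrices [12] are vertex operators of the following forms:
$$\Gamma_{+}(m) = \exp\Big(\sum_{k=1}^{+\infty}\frac{1}{k}\, q^{-k(m+\frac12)} J_k\Big), \tag{1.27}$$
$$\Gamma_{-}(m) = \exp\Big(\sum_{k=1}^{+\infty}\frac{1}{k}\, q^{k(m+\frac12)} J_{-k}\Big), \tag{1.28}$$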
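where $J_{\pm k}$ are the modes of the U(1) current. The matrix elements between partitions of different charges always vanish, while those between partitions of the same charge are
$$\langle\lambda;p|\,\Gamma_{+}(m)\,|\mu;p\rangle = \begin{cases} q^{-(m+\frac12)(|\mu|-|\lambda|)} & \lambda\prec\mu \\ 0 & \text{otherwise,}\end{cases} \tag{1.29}$$
$$\langle\mu;p|\,\Gamma_{-}(m)\,|\lambda;p\rangle = \begin{cases} q^{(m+\frac12)(|\mu|-|\lambda|)} & \mu\succ\lambda \\ 0 & \text{otherwise.}\end{cases} \tag{1.30}$$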
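The selection rule in (1.29) can be sketched in a few lines. The code assumes the usual meaning of the interlacing relation, λ ≺ μ if and only if μ1 ≥ λ1 ≥ μ2 ≥ λ2 ≥ · · · (the relation (1.2) itself is stated earlier in the article and is not repeated here), and it takes q^{1/2} as input so that all exponents are integers. The function names are hypothetical.

```python
from fractions import Fraction


def interlaces(lam, mu):
    """True if lam ≺ mu, i.e. mu_1 >= lam_1 >= mu_2 >= lam_2 >= ... (assumed convention)."""
    n = max(len(lam), len(mu)) + 1
    lam = list(lam) + [0] * (n - len(lam))
    mu = list(mu) + [0] * (n - len(mu))
    return all(mu[i] >= lam[i] >= mu[i + 1] for i in range(n - 1))


def gamma_plus_element(lam, mu, m, sqrt_q):
    """Matrix element <lam;p| Gamma_+(m) |mu;p> of (1.29), with q = sqrt_q**2."""
    if not interlaces(lam, mu):
        return 0
    # q^{-(m + 1/2)(|mu| - |lam|)} written with integer powers of sqrt_q
    return sqrt_q ** (-(2 * m + 1) * (sum(mu) - sum(lam)))


if __name__ == "__main__":
    print(interlaces((2, 1), (3, 1, 1)))
    print(gamma_plus_element((2, 1), (3, 1, 1), -1, Fraction(1, 2)))
    print(gamma_plus_element((1,), (2, 2), -1, Fraction(1, 2)))
```

The last call returns 0 because the skew diagram (2,2)/(1) is not a horizontal strip, so the two partitions do not interlace.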
By comparing these formulas with the interlacing relations (1.2), we see that $\Gamma_{\pm}(m)$ describe the evolutions of partitions. More precisely, the evolution at a negative time m ≤ −1 is given by $\Gamma_{+}(m)$, while the evolution at a nonnegative time m ≥ 0 is given by $\Gamma_{-}(m)$. Taking the hamiltonian picture of plane partitions, the partition function (1.9) can be reproduced in the transfer matrix formulation. Actually, following the same steps as we translated the partition function to the q-deformed random partition (1.10), but using the transfer matrices in place of the Schur functions, the partition function (1.9) is eventually expressed as
$$Z_p(t) = \langle p|\, G_{+}\, e^{H(t)}\, G_{-}\,|p\rangle, \tag{1.31}$$
where $G_{\pm}$ are the propagators which are responsible respectively for the negative time evolutions and the nonnegative time evolutions of partitions. These are the operators given by the following infinite products:
$$G_{+} \equiv \prod_{m=-\infty}^{-1}\Gamma_{+}(m), \tag{1.32}$$
$$G_{-} \equiv \prod_{m=0}^{+\infty}\Gamma_{-}(m). \tag{1.33}$$
1.4. Quantum torus Lie algebra and random plane partition. Starting from the expression (1.31) of the partition function (1.9), we prove in the text that the series of the partition functions (1.14) is a tau function of the one-dimensional Toda hierarchy. Remarkable relations between random plane partition and quantum torus Lie algebra are revealed in the course of the proof. Throughout the text, we take the perspective that such a quantum Lie algebra is a hidden symmetry of the random plane partition. We realize the quantum torus Lie algebra in terms of the complex fermions. Using this realization, we can regard the operators $H_k$ as a commutative sub-algebra of the quantum torus Lie algebra. The adjoint actions of the propagators $G_{\pm}$ on the Lie algebra generate automorphisms of the algebra. By taking advantage of such automorphisms, we provide a proof of the statement. Actually, among such automorphisms, we pay special attention to the shift symmetry, that is, the automorphism generated by the adjoint action of the product $G_{-}G_{+}$ or equivalently $G_{+}G_{-}$. By utilizing this symmetry, we can eventually express the partition function in the form (1.15). In particular, the element of GL(∞) in the formula (1.15) is given by
$$g_{\star} \equiv q^{\frac{W}{2}}\,(G_{-}G_{+})^{2}\, q^{\frac{W}{2}}, \tag{1.34}$$
where W is a generator of the $W_\infty$-algebra of the following form:
$$W \equiv \sum_{m\in\mathbb{Z}} m^{2}\, {:}\psi_{-m}\psi^{*}_m{:}. \tag{1.35}$$
Using this formula, we finally confirm the statement by showing that $g_{\star}$ actually realizes a solution of the one-dimensional Toda hierarchy. Taking the formula (1.15), Virasoro/W-constraints on the partition function (1.9) and the tau function (1.14) can be obtained from the transformation of the $W_\infty$-algebra by the adjoint action of $g_{\star}$. In the same way, the transformation of the quantum torus Lie algebra by the adjoint action of $g_{\star}$ gives rise to quantum torus analogues of the Virasoro/W-constraints on the partition function (1.9) and the tau function (1.14). As we argue subsequently, such quantum torus analogues of the Virasoro/W-constraints are also obtainable in five-dimensional N = 1 supersymmetric gauge theories and certain topological string amplitudes. These are reported in a separate publication [13].

1.5. Integrability of 5d N = 1 SUSY gauge theories from random plane partition. The random plane partition has a significant relation with five-dimensional N = 1 supersymmetric gauge theories [6,14–16]. Our study of the integrable structure of the random plane partition is motivated by a quest for the integrable structure of five-dimensional N = 1 supersymmetric gauge theories and topological strings.

1.5.1. Random plane partition and 5d N = 1 SUSY gauge theories. Nekrasov's functions for five-dimensional SU(N) gauge theories [4,5] are interpreted as partition functions of the random plane partition [6]. Actually, the original partition function (1.4) reproduces Nekrasov's function for five-dimensional U(1) gauge theory, by adding a simple chemical potential $Q^{|\pi(0)|}$ to the statistical weight. In the transfer matrix approach, that partition function becomes
$$Z^{5d\,U(1)} = \langle 0|\, G_{+}\, Q^{L_0}\, G_{-}\,|0\rangle, \tag{1.36}$$
where $L_0 \equiv \sum_{m\in\mathbb{Z}} m\, {:}\psi_{-m}\psi^{*}_m{:}$. The above matrix element is easily computed and leads to
$$Z^{5d\,U(1)} = \prod_{n=1}^{+\infty}\frac{1}{(1-Qq^{n})^{n}}. \tag{1.37}$$
The indeterminates q, Q in the right-hand side of this formula must be interpreted in terms of the gauge theory parameters to reproduce Nekrasov's function for the five-dimensional gauge theory. The gauge theory lives on $\mathbb{R}^4\times S^1_R$, where R denotes the radius of the circle, and has the dynamical scale Λ. The indeterminates are identified with these parameters by the relations
$$q = e^{-R\hbar}, \qquad Q = (R\Lambda)^{2}. \tag{1.38}$$
By shrinking the circle to a point, from Nekrasov's functions for five-dimensional gauge theories, one obtains the four-dimensional versions. Considering the relations (1.38), this indicates that the partition function (1.36) is a q-analogue of the four-dimensional version. To see this, we note that the four-dimensional limit of the right-hand side of (1.36) is obtained by employing the dressed propagators, $Q^{-\frac12 L_0} G_{+} Q^{\frac12 L_0}$ and $Q^{\frac12 L_0} G_{-} Q^{-\frac12 L_0}$. Actually, taking the relations (1.38), these dressed operators become nonsingular at the limit R → 0, and eventually give the following operators:
$$\lim_{R\to0} Q^{-\frac12 L_0}\, G_{+}\, Q^{\frac12 L_0} = \Lambda^{-L_0}\, e^{\frac{1}{\hbar}J_1}\, \Lambda^{L_0}, \tag{1.39}$$
$$\lim_{R\to0} Q^{\frac12 L_0}\, G_{-}\, Q^{-\frac12 L_0} = \Lambda^{L_0}\, e^{\frac{1}{\hbar}J_{-1}}\, \Lambda^{-L_0}. \tag{1.40}$$
By using these formulas, one obtains
$$\lim_{R\to0}\langle 0|\, G_{+} Q^{L_0} G_{-}\,|0\rangle = \lim_{R\to0}\langle 0|\, Q^{-\frac12 L_0} G_{+} Q^{\frac12 L_0}\; Q^{\frac12 L_0} G_{-} Q^{-\frac12 L_0}\,|0\rangle = \langle 0|\, e^{\frac{1}{\hbar}J_1}\, \Lambda^{2L_0}\, e^{\frac{1}{\hbar}J_{-1}}\,|0\rangle. \tag{1.41}$$
The right-hand side of this formula is nothing but Nekrasov's function for four-dimensional U(1) gauge theory.

1.5.2. Integrable structure of 5d N = 1 SUSY gauge theories. The operator $H_k$ has a counterpart in five-dimensional gauge theories. It corresponds to the Wilson loop operator encircling the circle k times [17],
$$O_k = \mathrm{Tr}\, P\, e^{\,k\oint dt\,(A_4 + i\varphi)}, \tag{1.42}$$
where $A_4$ and ϕ denote respectively the fifth component of the gauge field and the real scalar field in the vector multiplet. The generating function of the correlators among these observables thereby becomes the following analogue of (1.31):
$$Z^{5d\,U(1)}_{p}(t) = \langle p|\, G_{+}\, e^{H(t)}\, Q^{L_0}\, G_{-}\,|p\rangle. \tag{1.43}$$
In the same way as we stated for (1.14), the series of the generating functions (1.43) also gives rise to a tau function of the one-dimensional Toda hierarchy. In particular, the tau function has the following expression analogous to (1.15):
$$\tau^{5d\,U(1)}(t;p) = e^{\sum_{k=1}^{\infty}\frac{t_k q^{k}}{1-q^{k}}}\ \langle p|\; e^{\frac12\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\; g^{5d\,U(1)}_{\star}\; e^{\frac12\sum_{k=1}^{\infty}(-)^{k} t_k J_{-k}}\; |p\rangle, \tag{1.44}$$
where $g^{5d\,U(1)}_{\star}$ is the element of GL(∞) given by
$$g^{5d\,U(1)}_{\star} \equiv q^{\frac{W}{2}}\,(G_{-}G_{+})\, Q^{L_0}\, (G_{-}G_{+})\, q^{\frac{W}{2}}. \tag{1.45}$$
We note that this formula is easily generalized to the SU(N) gauge theory, where the series of the corresponding generating functions of the correlators becomes a tau function of the one-dimensional Toda hierarchy. In four dimensions, such an integrable structure of N = 2 supersymmetric gauge theories has been found [10,18] among the generating functions of the higher Casimir operators $\mathrm{Tr}\,\phi^{k}$, where φ is the complex scalar in the vector multiplet, corresponding to $A_4 + i\varphi$ in five-dimensional theories. One might expect that the tau function (1.44) is a q-analogue of the tau function of the four-dimensional theory, just like the partition function (1.36) is the q-analogue. However, this is not the case. The relation between these two integrable structures is not straightforward. The subtlety can be found, for instance, in that the tau function (1.44) becomes trivial at the four-dimensional limit, owing to the degeneration of all the observables (1.42). Actually, when R is nearly zero, the generating function (1.43) behaves as
$$Z^{5d\,U(1)}_{p}(t) \sim (R\Lambda)^{p(p+1)}\, e^{\,p\sum_{k=1}^{\infty} t_k}\ \langle 0|\, e^{\frac{1}{\hbar}J_1}\, \Lambda^{2L_0}\, e^{\frac{1}{\hbar}J_{-1}}\,|0\rangle. \tag{1.46}$$
1.6. Integrability of topological string amplitudes from random plane partition. In addition to the gauge theory interpretation, the partition function (1.36) has an interpretation as an all genus A-model topological string amplitude on $\mathcal{O}\oplus\mathcal{O}(-2)\to\mathbb{CP}^1$. It is a non-compact toric Calabi-Yau threefold, often called a local geometry. The toric description is given by a fan consisting of rational cones of dimensions ≤ 3 on $\mathbb{R}^3$. The topological vertex [2,3] is a diagrammatical method to compute all genus A-model topological string amplitudes for such local geometries. The diagram can be drawn from a polyhedron which is obtained by taking duals to the cones in $\mathbb{R}^3$. For the local geometry $\mathcal{O}\oplus\mathcal{O}(-2)\to\mathbb{CP}^1$, the relevant diagram is depicted in Fig. 3. The topological vertex computation based on the diagram gives the all genus A-model topological string amplitude on $\mathcal{O}\oplus\mathcal{O}(-2)\to\mathbb{CP}^1$ as
$$A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}} = \prod_{n=1}^{+\infty}\frac{1}{(1-e^{-a}e^{-n g_{st}})^{n}}, \tag{1.47}$$
where a denotes the Kähler volume of the base $\mathbb{CP}^1$, and $g_{st}$ is the string coupling constant. Comparing the two formulas (1.37) and (1.47), one sees that the partition function (1.36) becomes an all genus A-model topological string amplitude on $\mathcal{O}\oplus\mathcal{O}(-2)\to\mathbb{CP}^1$, by the following identification of the parameters:
$$q = e^{-g_{st}}, \qquad Q = e^{-a}. \tag{1.48}$$
Fig. 3. The diagram for O ⊕ O(−2) → CP1
As in the case of the five-dimensional gauge theory, we generalize the amplitude by including the operators $H_k$ in it, anticipating that a wound Euclidean brane along the M-theory circle$^1$ corresponds to the observable $O_k$ of the gauge theories. We physically conjecture such a generalization to be
$$A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t;p) = \langle p|\, G_{+}\, e^{H(t)}\, Q^{L_0}\, G_{-}\,|p\rangle, \tag{1.49}$$
where the right-hand side of this equation is the same as the formula (1.43) but with the different interpretation (1.48). The series of the generating functions (1.49) gives the same tau function as in the case of the five-dimensional U(1) gauge theory. It seems rather rare that the one-dimensional Toda hierarchy shows up as an integrable structure of topological strings. One such rare case is the topological sigma model (in other words, the Gromov-Witten invariants) of $\mathbb{CP}^1$ [19–24]. On the other hand, the “relative” or “equivariant” versions of those Gromov-Witten invariants have a different integrable structure, namely, the two-dimensional Toda hierarchy [23,25]. It is remarkable that substantially the same quantum torus Lie algebra is used in the work of Okounkov and Pandharipande [25], but we have been unable to see whether there is a deep connection with our work. To obtain the generators $V_m^{(k)}$ of our quantum torus Lie algebra, one has to specialize the parameter z of Okounkov and Pandharipande's operators $\mathcal{E}_m(z)$ to $e^{z} = q^{k}$; such powers of q apparently play no role in the work of Okounkov and Pandharipande. The two-dimensional Toda hierarchy is also known to arise in the generating function $\tau(x,\bar{x}) = \sum_{\lambda,\mu} s_\lambda(x)\, s_\mu(\bar{x})\, c_{\lambda\mu\bullet}$ of the two-legged topological vertex $c_{\lambda\mu\bullet}$ [26]. By changing variables from $x, \bar{x}$ to $T_k = \frac{1}{k}\sum_i x_i^{k}$, $\bar{T}_k = -\frac{1}{k}\sum_i \bar{x}_i^{k}$, $\tau(x,\bar{x})$ coincides with the special part $\tau^{2Toda}(T,\bar{T},0)$ of a tau function $\tau^{2Toda}(T,\bar{T},p)$ of the two-dimensional Toda hierarchy. As Zhou pointed out, the tau function $\tau^{2Toda}(T,\bar{T},p)$ has a fermionic representation in terms of an element g of GL(∞) of the form
$$g = q^{\frac{K}{2}}\, G_{+} G_{-}\, q^{\frac{K}{2}}, \tag{1.50}$$
where $K = \sum_{m}(m-\frac12)^{2}\, {:}\psi_{-m}\psi^{*}_m{:}$. Thus the building blocks of this tau function are similar to the foregoing partition functions of U(1) gauge theory. The difference between K and W is almost negligible, and can be absorbed by rescaling of $T_k$, $\bar{T}_k$ and an exponential prefactor.

$^1$ We thank Y. Hyakutake for suggesting such a possibility to us.
1.6.1. Emergence of geometry from condensation. It is interesting to comment on the possibility that another geometry emerges from a condensation in a local geometry. Consider the generating function (1.49) for the case of p = 0, and write it simply as $A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t) \equiv A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t;p=0)$. As explained in the beginning of Sect. 4, (1.49) has another representation of the following form:
$$A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t) = e^{\sum_{k=1}^{\infty}\frac{t_k q^{k}}{1-q^{k}}}\ \langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\; g^{5d\,U(1)}_{\star}\,|0\rangle. \tag{1.51}$$
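This becomes a tau function of the KP hierarchy. The generating function has a factorization in terms of the skew Schur functions. The matrix element in the right-hand side of formula (1.51) is factored by plugging in the unity $1 = \sum_{p\in\mathbb{Z}}\sum_{\lambda}|\lambda;p\rangle\langle\lambda;p|$ so that it divides $g^{5d\,U(1)}_{\star}$ into two parts. Owing to the charge conservation, the sum over p truncates to p = 0. Writing $|\lambda;0\rangle$ simply as $|\lambda\rangle$, we obtain
$$\langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\, g^{5d\,U(1)}_{\star}\,|0\rangle = \sum_{\lambda}\langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\, q^{\frac{W}{2}}\, G_{-}\,|\lambda\rangle \times \langle\lambda|\, G_{+}\, Q^{L_0}\, G_{-}\,|0\rangle. \tag{1.52}$$
Matrix elements in the right-hand side of this equation are expressed in terms of the skew Schur functions as follows:
$$\langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\, q^{\frac{W}{2}}\, G_{-}\,|\lambda\rangle = \sum_{\mu} q^{\frac{\kappa(\mu)}{2}+\frac{|\mu|}{2}}\, s_\mu(x)\, s_{\mu/\lambda}(q^{\rho}), \tag{1.53}$$
$$\langle\lambda|\, G_{+}\, Q^{L_0}\, G_{-}\,|0\rangle = \frac{Q^{|\lambda|}\, s_\lambda(q^{\rho})}{\prod_{n=1}^{+\infty}(1-Qq^{n})^{n}}, \tag{1.54}$$
where variables are changed from t to x by $t_k = \frac{1}{k}\sum_i(-x_i)^{k}$. Combining these two formulas, the right-hand side of formula (1.51) is expressed in terms of the skew Schur functions. Therefore, the factorization of the generating function, normalized by the original A-model topological string amplitude, can be written as
$$\frac{A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t)}{A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(0)} = e^{\sum_{k=1}^{\infty}\frac{t_k q^{k}}{1-q^{k}}}\sum_{\lambda} Q^{|\lambda|}\, s_\lambda(q^{\rho})\Big(\sum_{\mu} q^{\frac{\kappa(\mu)}{2}+\frac{|\mu|}{2}}\, s_\mu(x)\, s_{\mu/\lambda}(q^{\rho})\Big). \tag{1.55}$$
We examine the condensation by choosing the coupling constants t at certain values. In particular, we take the following values:
$$t^{\star}_k = \frac{q^{\frac{k}{2}}}{k(1-q^{k})}, \qquad k = 1, 2, \ldots. \tag{1.56}$$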
Owing to the identification (1.48), $t^{\star}$ behaves as $\sim g_{st}^{-1}$ when $g_{st}$ is nearly zero, and therefore a possible condensation becomes nonperturbative in IIA superstrings. However, one can compute the right-hand side of formula (1.55). The computation eventually yields the amplitude as
$$\frac{A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(t^{\star})}{A^{\mathcal{O}\oplus\mathcal{O}(-2)}_{\mathrm{string}}(0)} = \prod_{n=1}^{+\infty}\left(1-Qq^{n+\frac12}\right)^{n}. \tag{1.57}$$
Fig. 4. The diagram for two conifolds
The right-hand side of this formula coincides with the all genus A-model topological string amplitude on the resolved conifold $\mathcal{O}(-1)\oplus\mathcal{O}(-1)\to\mathbb{CP}^1$, where the Kähler volume of the base $\mathbb{CP}^1$ is $a + \frac12 g_{st}$. Emergence of the resolved conifold in (1.57) seems mysterious. However, this can be explained as follows. Let us consider the local geometry of coupled conifolds. Coupled conifolds can be obtained by patching torically together the foregoing local geometry and $\mathbb{C}^2$. The diagram of coupled conifolds is depicted in Fig. 4, where $Q_{1,2}$ are the Kähler parameters attached to the internal edges of the diagram. Each internal edge corresponds to a $\mathbb{CP}^1$. The Kähler parameters are given by $Q_{1,2} = e^{-a_{1,2}}$, where $a_{1,2}$ denote the Kähler volumes of the corresponding $\mathbb{CP}^1$s. Based on this diagram, the topological vertex computation yields the all genus A-model topological string amplitude as
$$A^{\text{two conifolds}}_{\mathrm{string}} = \frac{\prod_{n=1}^{+\infty}(1-Q_1Q_2q^{n})^{n}\cdot\prod_{n=1}^{+\infty}(1-Q_2q^{n})^{n}}{\prod_{n=1}^{+\infty}(1-Q_1q^{n})^{n}}. \tag{1.58}$$
It is remarkable that the topological string amplitude (1.58) appears, by tuning the Kähler volumes $a_{1,2}$, as a building block of the generating function (1.51) at $t = t^{\star}$. Actually, the matrix element $\langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\, g^{5d\,U(1)}_{\star}\,|0\rangle$ is evaluated by using formulas (1.53), (1.54) and eventually becomes
$$\langle 0|\, e^{\sum_{k=1}^{\infty}(-)^{k} t_k J_k}\, g^{5d\,U(1)}_{\star}\,|0\rangle\Big|_{t=t^{\star}} = A^{\text{two conifolds}}_{\mathrm{string}}, \tag{1.59}$$
where the Kähler volumes $a_{1,2}$ in the right-hand side of this formula are
$$a_1 = a, \qquad a_2 = \frac12 g_{st}. \tag{1.60}$$
This formula indicates that the condensation (1.56) changes the original geometry into two-conifolds with two-cycles having the Kähler volumes (1.60), and that the ratio (1.57) counts the worldsheet instantons wrapping the two-cycles simultaneously. Further details will be reported in [13].

Organization of the article. The purpose of this article is to show that the series of the partition functions (1.14) is a tau function of the one-dimensional Toda hierarchy. We start Sect. 2 by giving the realization of the quantum torus Lie algebra using the complex fermions. In Sect. 3, we argue that automorphisms of the Lie algebra are generated by the adjoint actions of $G_{\pm}$. By using such automorphisms we confirm that the series of the partition functions (1.14) satisfy the Toda equation (1.17). In Sect. 4, we provide a proof of the statement.
2. Quantum Torus Lie Algebra

Let $V_m^{(k)}$, where k = 0, 1, 2, · · · and m ∈ Z, be a set of operators which are defined by the following generating function for each k:
$${:}\psi(q^{\frac{k}{2}}z)\,\psi^{*}(q^{-\frac{k}{2}}z){:} = \sum_{m\in\mathbb{Z}} z^{-m-1}\, q^{-\frac{k}{2}}\, V_m^{(k)}. \tag{2.1}$$
This formula yields the following expression of $V_m^{(k)}$:
$$V_m^{(k)} = q^{\frac{k}{2}}\oint\frac{dz}{2\pi i}\, z^{m}\, {:}\psi(q^{\frac{k}{2}}z)\,\psi^{*}(q^{-\frac{k}{2}}z){:}, \tag{2.2}$$
where the integral means taking the residue at z = 0. This integral can be evaluated by plugging the mode expansion of ψ(z), ψ∗(z) into the generating function. Thereby, the right-hand side of Eq. (2.2) reads
$$V_m^{(k)} = q^{-\frac{km}{2}}\sum_{n\in\mathbb{Z}} q^{kn}\, {:}\psi_{m-n}\psi^{*}_n{:}. \tag{2.3}$$
The operator $H_k$ (1.24) and the U(1) current $J_m$ (1.20) are represented as
$$H_k = V_0^{(k)}, \qquad J_m = V_m^{(0)}. \tag{2.4}$$
Henceforth, taking the viewpoint of integrable systems, we call $H_k$ (k ≥ 1) hamiltonians. We note that the normal ordering in the formula (2.1) is redundant when k ≠ 0. Actually, the fermion bilinear form $-\psi^{*}(q^{-\frac{k}{2}}z)\,\psi(q^{\frac{k}{2}}z)$ is regularized solely by the point splitting$^2$ $z \to q^{\pm\frac{k}{2}}z$, without any normal ordering. The effect of the normal ordering is known from (1.21) and becomes the subtraction of a finite term as
$$-\psi^{*}(q^{-\frac{k}{2}}z)\,\psi(q^{\frac{k}{2}}z) = {:}\psi(q^{\frac{k}{2}}z)\,\psi^{*}(q^{-\frac{k}{2}}z){:} - \frac{q^{\frac{k}{2}}}{z(1-q^{k})}. \tag{2.5}$$
Thereby, the normal ordering only makes a finite gap between $V_0^{(k)}$ and that obtained without the normal ordering, as follows:
$$V_0^{(k)} = -\sum_{m} q^{km}\,\psi^{*}_m\psi_{-m} + \frac{q^{k}}{1-q^{k}}. \tag{2.6}$$
The operators $V_m^{(k)}$ satisfy the quantum torus Lie algebra (the sine-algebra [27]). The following commutation relations can be found:
$$\left[V_m^{(k)},\, V_n^{(l)}\right] = \left(q^{\frac{lm-kn}{2}} - q^{-\frac{lm-kn}{2}}\right)\left(V_{m+n}^{(k+l)} - \delta_{m+n,0}\,\frac{q^{k+l}}{1-q^{k+l}}\right), \tag{2.7}$$
where k, l = 0, 1, · · · and m, n ∈ Z. The commutation relations (2.7) become the standard ones by shifting $V_0^{(k)} \to V_0^{(k)} - \frac{q^{k}}{1-q^{k}}$ for k ≠ 0. The hamiltonians $H_k$ (k ≥ 0), where $H_0 \equiv V_0^{(0)}$ is included, generate a commutative sub-algebra of this Lie algebra.

$^2$ We take 0 < q < 1.
The non-negative modes $\{J_m\}_{m\ge0}$ and the non-positive modes $\{J_{-m}\}_{m\ge0}$ of the U(1) current do as well. In addition to these sub-algebras, there are two more commutative sub-algebras that are generated respectively by $\{V_k^{(k)}\}_{k\ge0}$ and $\{V_{-k}^{(k)}\}_{k\ge0}$. All these sub-algebras are found to relate with one another. The appearance of a quantum torus Lie algebra might be unexpected. However it can be explained from the viewpoint of the sine-algebra [27]. The sine-algebra is obtained from sl(N) by taking the large N limit of its trigonometrical basis. Let X, Y be two N × N unitary matrices given by
$$X = \sum_{i=1}^{N-1} E_{i,i+1} + E_{N,1}, \qquad Y = \sum_{i=1}^{N}\omega^{i-1} E_{i,i}, \tag{2.8}$$
where ω is an N-th root of unity. These two matrices satisfy the relation XY = ωYX. Their non-commutative monomials $X^{m}Y^{k}$ for 0 ≤ m < N and 0 ≤ k < N give the trigonometrical basis of sl(N). The sine-algebra is the Lie algebra of $X^{m}Y^{k}$ at N = ∞ and is identified with the quantum torus Lie algebra. It is the Lie algebra derived from a quantum two-torus (non-commutative two-torus), that is, a unital algebra with two generators U, V satisfying the relation UV = qVU. Here q is regarded as the non-commutative parameter. The trigonometrical basis $X^{m}Y^{k}$ is transmuted to the non-commutative monomials $U^{m}V^{k}$. Let us normalize them as follows:
$$v_m^{(k)} = q^{-\frac{km}{2}}\, U^{m}V^{k}. \tag{2.9}$$
These normalized ones satisfy the commutation relations (2.7), apart from the shift of the zero modes. The sine-algebra is the algebra of trigonometric basis of sl(∞). Making use of the embedding sl(∞) ⊂ gl(∞), as utilized in (2.8) for finite N, a trigonometric basis can be realized in terms of q-difference operators with respect to z. Among them, the fundamental operators are z and $q^{-z\frac{d}{dz}} = \exp(-\log q\; z\frac{d}{dz})$, which are the counterparts of X and Y. These two operators satisfy the relation $z\, q^{-z\frac{d}{dz}} = q\, q^{-z\frac{d}{dz}}\, z$. The normalized basis (2.9) of the sine-algebra corresponds to
$$v_m^{(k)} = q^{-\frac{km}{2}}\, z^{m}\, q^{-kz\frac{d}{dz}}. \tag{2.10}$$
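The statement that the normalized monomials satisfy (2.7) without the central shift can be tested numerically at finite N. The sketch below builds the matrices X, Y of (2.8) with ω = exp(2πi/N), takes U = X, V = Y and q = ω, and measures how far the commutator of two normalized monomials is from the right-hand side of (2.7) with the δ-term dropped. Function names, the choice N = 5 and the particular square root of ω are assumptions made for the illustration.

```python
import cmath


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(a, n):
    """n-th power (n >= 0) of a square matrix."""
    size = len(a)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result


def shift_and_clock(size):
    """Matrices X, Y of (2.8) with omega = exp(2 pi i / size), and a square root of omega."""
    omega_half = cmath.exp(1j * cmath.pi / size)
    omega = omega_half ** 2
    x = [[1 if j == (i + 1) % size else 0 for j in range(size)] for i in range(size)]
    y = [[omega ** i if i == j else 0 for j in range(size)] for i in range(size)]
    return x, y, omega_half


def basis(m, k, x, y, q_half):
    """Normalized monomial q^(-km/2) X^m Y^k, the finite-N analogue of (2.9)."""
    scale = q_half ** (-k * m)
    return [[scale * entry for entry in row]
            for row in matmul(matpow(x, m), matpow(y, k))]


def max_abs_diff(a, b):
    """Largest absolute entrywise difference of two square matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


def commutator_defect(m, k, n, l, size=5):
    """Deviation of [v_m^(k), v_n^(l)] from (q^(c/2) - q^(-c/2)) v_(m+n)^(k+l), c = lm - kn."""
    x, y, q_half = shift_and_clock(size)
    a, b = basis(m, k, x, y, q_half), basis(n, l, x, y, q_half)
    ab, ba = matmul(a, b), matmul(b, a)
    c = l * m - k * n
    coeff = q_half ** c - q_half ** (-c)
    target = basis(m + n, k + l, x, y, q_half)
    lhs = [[ab[i][j] - ba[i][j] for j in range(size)] for i in range(size)]
    rhs = [[coeff * target[i][j] for j in range(size)] for i in range(size)]
    return max_abs_diff(lhs, rhs)


if __name__ == "__main__":
    print(commutator_defect(1, 2, 3, 1))   # expected to be at rounding-error level
```

The defect vanishes up to floating-point rounding because the identity only uses XY = ωYX, which is the finite-N counterpart of UV = qVU.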
The operators $V_m^{(k)}$ are nothing but the second-quantizations of $v_m^{(k)}$ by means of the fermions:
$$V_m^{(k)} = \oint\frac{dz}{2\pi i}\, {:}\psi(z)\, v_m^{(k)}\, \psi^{*}(z){:}. \tag{2.11}$$
Since we have restricted k ≥ 0, the operators $V_m^{(k)}$ or the basis (2.10) generate only half of the quantum torus Lie algebra. Needless to say, the operators for k < 0 are obtainable from the generating function (2.1) by choosing k as such; however, those play no role in this article. The above half is, so to speak, a quantum cylinder Lie algebra to be obtained from a quantum cylinder. In the context of the random plane partition, this quantum cylinder becomes a classical cylinder $\mathbb{C}^{*}$ at the thermodynamic limit or the semi-classical limit; the Seiberg-Witten hyper-elliptic curves of five-dimensional N = 1 supersymmetric gauge theories emerge as the double cover of the cylinder. See [14,16] for details.
3. One-Dimensional Toda Chain

In this section we show that the series of the partition functions (1.14) provide a solution of the one-dimensional Toda chain. We will identify the partition functions with dynamical variables $\phi_p$ on a Z-lattice by
$$e^{\phi_p} \equiv \frac{\tau(t;p+1)}{\tau(t;p)} = q^{(p+1)^{2}}\,\frac{Z_{p+1}(t)}{Z_p(t)}, \qquad p\in\mathbb{Z}. \tag{3.1}$$
Then, writing x = t1, the variables $\phi_p$ satisfy the following Toda equation:
$$\partial_x^{2}\phi_p = e^{\phi_{p+1}-\phi_p} - e^{\phi_p-\phi_{p-1}}, \qquad p\in\mathbb{Z}. \tag{3.2}$$

3.1. Transformations by adjoint action. We discuss transformations of the generators $V_m^{(k)}$ by the adjoint action of the propagators $G_{\pm}$. For this purpose, it is convenient to start with the transformations of the fermions ψ(z), ψ∗(z). These follow from their rotations by the transfer matrices $\Gamma_{\pm}(m)$, since $G_{\pm}$ are just the products of $\Gamma_{\pm}(m)$. The rotations are easily computed by recalling that each mode of the U(1) current rotates the fermions as follows:
$$e^{\theta J_m}\,\psi(z)\,e^{-\theta J_m} = e^{\theta z^{m}}\psi(z), \qquad e^{\theta J_m}\,\psi^{*}(z)\,e^{-\theta J_m} = e^{-\theta z^{m}}\psi^{*}(z), \qquad \theta\in\mathbb{C}. \tag{3.3}$$
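For instance, the rotation of ψ(z) by $\Gamma_{+}(m)$ is computed as follows:
$$\begin{aligned}
\Gamma_{+}(m)\,\psi(z)\,\Gamma_{+}(m)^{-1} &= \prod_{k=1}^{+\infty} e^{\frac{1}{k}q^{-k(m+\frac12)}J_k}\ \psi(z)\ \prod_{k=1}^{+\infty} e^{-\frac{1}{k}q^{-k(m+\frac12)}J_k} \\
&= e^{\sum_{k=1}^{+\infty}\frac{1}{k}z^{k}q^{-k(m+\frac12)}}\ \psi(z) \\
&= \left(1-zq^{-(m+\frac12)}\right)^{-1}\psi(z).
\end{aligned} \tag{3.4}$$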
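By using this formula repeatedly according to (1.32), we can compute the adjoint action of $G_{+}$. In this way, we eventually obtain the following transformations of the fermions:
$$\begin{cases}
G_{+}\,\psi(z)\,(G_{+})^{-1} = \displaystyle\prod_{m=1}^{+\infty}\left(1-zq^{m-\frac12}\right)^{-1}\psi(z), \\[2mm]
G_{+}\,\psi^{*}(z)\,(G_{+})^{-1} = \displaystyle\prod_{m=1}^{+\infty}\left(1-zq^{m-\frac12}\right)\psi^{*}(z),
\end{cases} \tag{3.5}$$
$$\begin{cases}
(G_{-})^{-1}\,\psi(z)\,G_{-} = \displaystyle\prod_{m=0}^{+\infty}\left(1-z^{-1}q^{m+\frac12}\right)\psi(z), \\[2mm]
(G_{-})^{-1}\,\psi^{*}(z)\,G_{-} = \displaystyle\prod_{m=0}^{+\infty}\left(1-z^{-1}q^{m+\frac12}\right)^{-1}\psi^{*}(z).
\end{cases} \tag{3.6}$$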
Next we examine the transformations of the generating function (2.1) of the quantum torus Lie algebra. These transformations are obtained by using formulas (3.5), (3.6), taking (2.5) into account. We illustrate the computation for the case of $G_{+}$:
$$\begin{aligned}
G_{+}\, {:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:}\, (G_{+})^{-1}
&= G_{+}\left(-\psi^{*}(q^{-\frac{k}{2}}z)\,\psi(q^{\frac{k}{2}}z) + \frac{q^{\frac{k}{2}}}{z(1-q^{k})}\right)(G_{+})^{-1} \\
&= -\,G_{+}\psi^{*}(q^{-\frac{k}{2}}z)(G_{+})^{-1}\; G_{+}\psi(q^{\frac{k}{2}}z)(G_{+})^{-1} + \frac{q^{\frac{k}{2}}}{z(1-q^{k})} \\
&= -\prod_{m=1}^{k}\left(1-zq^{\frac{k+1}{2}-m}\right)\psi^{*}(q^{-\frac{k}{2}}z)\,\psi(q^{\frac{k}{2}}z) + \frac{q^{\frac{k}{2}}}{z(1-q^{k})}.
\end{aligned} \tag{3.7}$$
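The last line in (3.7) can be rewritten in terms of the generating function itself. We thus obtain
$$G_{+}\left({:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:} - \frac{q^{\frac{k}{2}}}{z(1-q^{k})}\right)(G_{+})^{-1} = \prod_{m=1}^{k}\left(1-zq^{\frac{k+1}{2}-m}\right)\left({:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:} - \frac{q^{\frac{k}{2}}}{z(1-q^{k})}\right). \tag{3.8}$$
A similar computation goes through for the case of $G_{-}$ and gives
$$(G_{-})^{-1}\left({:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:} - \frac{q^{\frac{k}{2}}}{z(1-q^{k})}\right)G_{-} = \prod_{m=1}^{k}\left(1-z^{-1}q^{-\frac{k+1}{2}+m}\right)\left({:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:} - \frac{q^{\frac{k}{2}}}{z(1-q^{k})}\right). \tag{3.9}$$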
The transformations of $V_m^{(k)}$ can be obtained from formulas (3.8), (3.9) by reading the coefficients of the Laurent expansions of the equations around z = 0. It is evident but nevertheless surprising that $G_{\pm}$ generate automorphisms of the Lie algebra (2.7) by adjoint action. Let us concentrate on the transformations of the hamiltonians $H_k = V_0^{(k)}$. These are read from formulas (3.8), (3.9) as follows:
$$G_{+}\, V_0^{(k)}\, (G_{+})^{-1} = q^{\frac{k}{2}}\oint\frac{dz}{2\pi i}\prod_{m=1}^{k}\left(1-zq^{\frac{k+1}{2}-m}\right){:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:}, \tag{3.10}$$
$$(G_{-})^{-1}\, V_0^{(k)}\, G_{-} = q^{\frac{k}{2}}\oint\frac{dz}{2\pi i}\prod_{m=1}^{k}\left(1-z^{-1}q^{-\frac{k+1}{2}+m}\right){:}\psi(q^{\frac{k}{2}}z)\psi^{*}(q^{-\frac{k}{2}}z){:}, \tag{3.11}$$
where the integrals denote taking the residue at z = 0. These integrals can be evaluated by using the q-binomial theorem:
$$\prod_{m=1}^{k}(1+xq^{m}) = \sum_{i=0}^{k} q^{\frac{i(i+1)}{2}}\begin{bmatrix}k\\ i\end{bmatrix}_{q} x^{i}, \tag{3.12}$$
where
$$\begin{bmatrix}k\\ i\end{bmatrix}_{q} = \frac{(q:q)_k}{(q:q)_i\,(q:q)_{k-i}}, \qquad (q:q)_n = (1-q)(1-q^{2})\cdots(1-q^{n}). \tag{3.13}$$
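The q-binomial theorem (3.12) and the expansion of the symmetric product appearing in (3.10) can be checked in exact rational arithmetic. In the sketch below the helper names are illustrative; the second pair of functions takes s with q = s² so that the half-integer powers of q become integer powers of s.

```python
from fractions import Fraction


def q_factorial(q, n):
    """(q:q)_n = (1 - q)(1 - q^2)...(1 - q^n), as in (3.13)."""
    result = Fraction(1)
    for j in range(1, n + 1):
        result *= 1 - q ** j
    return result


def q_binomial(k, i, q):
    """q-binomial coefficient of (3.13)."""
    return q_factorial(q, k) / (q_factorial(q, i) * q_factorial(q, k - i))


def product_side(k, x, q):
    """Left-hand side of (3.12)."""
    result = Fraction(1)
    for m in range(1, k + 1):
        result *= 1 + x * q ** m
    return result


def sum_side(k, x, q):
    """Right-hand side of (3.12)."""
    return sum(q ** (i * (i + 1) // 2) * q_binomial(k, i, q) * x ** i
               for i in range(k + 1))


def shifted_product(k, z, s):
    """prod_{m=1}^k (1 - z q^((k+1)/2 - m)) with q = s^2, the factor in (3.10)."""
    result = Fraction(1)
    for m in range(1, k + 1):
        result *= 1 - z * s ** (k + 1 - 2 * m)
    return result


def shifted_sum(k, z, s):
    """sum_i (-1)^i q^(-i(k-i)/2) [k, i]_q z^i with q = s^2."""
    q = s * s
    return sum((-1) ** i * s ** (-i * (k - i)) * q_binomial(k, i, q) * z ** i
               for i in range(k + 1))


if __name__ == "__main__":
    q, x, s = Fraction(1, 3), Fraction(-2, 5), Fraction(2, 3)
    print(all(product_side(k, x, q) == sum_side(k, x, q) for k in range(6)))
    print(all(shifted_product(k, x, s) == shifted_sum(k, x, s) for k in range(6)))
```

The second identity follows from (3.12) by the substitution x → −z q^{−(k+1)/2}, and its coefficients are the ones multiplying the modes of the generating function when the residues in (3.10) are taken.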
By expanding the products in the right-hand sides of Eqs. (3.10), (3.11), these are brought to the following form:
$$G_{+}\, V_0^{(k)}\, (G_{+})^{-1} = \sum_{i=0}^{k}(-)^{i}\, q^{-\frac{i(k-i)}{2}}\begin{bmatrix}k\\ i\end{bmatrix}_{q} V_i^{(k)}, \tag{3.14}$$
$$(G_{-})^{-1}\, V_0^{(k)}\, G_{-} = \sum_{i=0}^{k}(-)^{i}\, q^{-\frac{i(k-i)}{2}}\begin{bmatrix}k\\ i\end{bmatrix}_{q} V_{-i}^{(k)}. \tag{3.15}$$

3.2. Toda equation. The transformations of $H_1 = V_0^{(1)}$ in formulas (3.14), (3.15) read as
$$G_{+}\, H_1\, (G_{+})^{-1} = V_0^{(1)} - V_1^{(1)}, \tag{3.16}$$
$$(G_{-})^{-1}\, H_1\, G_{-} = V_0^{(1)} - V_{-1}^{(1)}. \tag{3.17}$$
These transformations enter naturally in the evolution of the partition function (1.9) in the time x = t1. The operator $G_{+}e^{H(t)}G_{-}$ that appears in the expression (1.31) evolves according to
$$\partial_x\left(G_{+}e^{H(t)}G_{-}\right) = G_{+}H_1(G_{+})^{-1}\; G_{+}e^{H(t)}G_{-} = \left(V_0^{(1)} - V_1^{(1)}\right)G_{+}e^{H(t)}G_{-}. \tag{3.18}$$
It can also be written as
$$\partial_x\left(G_{+}e^{H(t)}G_{-}\right) = G_{+}e^{H(t)}G_{-}\;(G_{-})^{-1}H_1G_{-} = G_{+}e^{H(t)}G_{-}\left(V_0^{(1)} - V_{-1}^{(1)}\right). \tag{3.19}$$
These two descriptions lead to
$$\partial_x Z_p(t) = \langle p|\left(V_0^{(1)} - V_1^{(1)}\right)G_{+}e^{H(t)}G_{-}\,|p\rangle \tag{3.20}$$
$$\phantom{\partial_x Z_p(t)} = \langle p|\, G_{+}e^{H(t)}G_{-}\left(V_0^{(1)} - V_{-1}^{(1)}\right)|p\rangle. \tag{3.21}$$
Let us derive the Toda Eq. (3.2). Owing to the identification (3.1), it suffices to prove the following identity:
$$Z_p\,\partial_x^{2}Z_p - (\partial_x Z_p)^{2} = q^{2p+1}\, Z_{p+1}Z_{p-1}, \qquad p\in\mathbb{Z}. \tag{3.22}$$
We first rewrite the left hand side of Eq. (3.22), using the expression (1.31) and applying formulas (3.18), (3.19) to it, as follows:
\begin{align}
Z_p\, \partial_x^2 Z_p - (\partial_x Z_p)^2
&= Z_p \cdot \langle p|\, \bigl(V_0^{(1)} - V_1^{(1)}\bigr)\, G_+ e^{H(t)} G_-\, \bigl(V_0^{(1)} - V_{-1}^{(1)}\bigr) \,|p\rangle \nonumber\\
&\quad - \langle p|\, \bigl(V_0^{(1)} - V_1^{(1)}\bigr)\, G_+ e^{H(t)} G_- \,|p\rangle \cdot \langle p|\, G_+ e^{H(t)} G_-\, \bigl(V_0^{(1)} - V_{-1}^{(1)}\bigr) \,|p\rangle. \tag{3.23}
\end{align}
The matrix elements in the right-hand side of this equation can be translated to fermion correlation functions, by replacing $V_0^{(1)} - V_{\pm 1}^{(1)}$ with the states generated by these operators. The corresponding states are
\[
\bigl(V_0^{(1)} - V_{-1}^{(1)}\bigr)\,|p\rangle = q\,\frac{1 - q^p}{1 - q}\,|p\rangle + q^{p+\frac{1}{2}}\, \psi_{-p-1} \psi_p^*\,|p\rangle \tag{3.24}
\]
and its conjugate state. Thereby, we can eventually translate the right-hand side of Eq. (3.23) into the following combination of correlation functions:
\begin{align}
Z_p\, \partial_x^2 Z_p - (\partial_x Z_p)^2
&= q^{2p+1} Z_p \cdot \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle \nonumber\\
&\quad - q^{2p+1} \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_- \,|p\rangle \cdot \langle p|\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle. \tag{3.25}
\end{align}
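Wick's theorem shows that correlation functions of free fermions factorize into products of their two point functions. The four point function in the right hand side of Eq. (3.25) factorizes as
\begin{align}
\frac{1}{Z_p} \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle
&= \frac{1}{Z_p} \langle p|\, \psi_{p+1}^*\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \,|p\rangle \cdot \frac{1}{Z_p} \langle p|\, \psi_{-p}\, G_+ e^{H(t)} G_-\, \psi_p^* \,|p\rangle \nonumber\\
&\quad + \frac{1}{Z_p} \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_- \,|p\rangle \cdot \frac{1}{Z_p} \langle p|\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle. \tag{3.26}
\end{align}
In the right hand side of this equation, the first term equals $Z_p^{-2} Z_{p+1} Z_{p-1}$, making use of the relations $|p+1\rangle = \psi_{-p-1}|p\rangle$ and $|p-1\rangle = \psi_p^*|p\rangle$. Thus, Wick's theorem leads to
\begin{align}
Z_p \cdot \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle
&= Z_{p+1} Z_{p-1} \nonumber\\
&\quad + \langle p|\, \psi_{-p} \psi_{p+1}^*\, G_+ e^{H(t)} G_- \,|p\rangle \cdot \langle p|\, G_+ e^{H(t)} G_-\, \psi_{-p-1} \psi_p^* \,|p\rangle. \tag{3.27}
\end{align}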
By plugging this formula into the right hand side of Eq. (3.25), we obtain $q^{2p+1} Z_{p+1} Z_{p-1}$. This completes the proof.

4. One-Dimensional Toda Hierarchy

In this section we prove that the series of the partition functions (1.14) is a tau function of the one-dimensional Toda hierarchy.

We first recall the theory of tau functions of the Toda hierarchy [28–30]. The two-dimensional Toda hierarchy has two series of commuting flows and thereby two series of time variables, $T = (T_1, T_2, \cdots)$ and $\bar T = (\bar T_1, \bar T_2, \cdots)$, each of which describes one series of the commuting flows. Tau functions of the two-dimensional Toda hierarchy admit several expressions, including realizations by means of free fermions or free bosons. The standard description in terms of free fermions is
\[
\tau^{2\,Toda}(T, \bar T; p) = e^{\sum_{k=1}^{\infty} (c_k T_k + \bar c_k \bar T_k)}\, \langle p|\, e^{\sum_{k=1}^{\infty} T_k J_k}\, g\, e^{-\sum_{k=1}^{\infty} \bar T_k J_{-k}} \,|p\rangle, \tag{4.1}
\]
where $g$ is an element of $GL(\infty)$, and $c_k$ and $\bar c_k$ are numerical constants which originate in the ambiguity of the tau function. The two-dimensional Toda hierarchy reduces to the one-dimensional Toda hierarchy when the two-sided time evolutions degenerate. This reduction imposes the following constraint on the tau function:
\[
\Bigl(\frac{\partial}{\partial T_k} + \frac{\partial}{\partial \bar T_k}\Bigr)\, \tau^{2\,Toda}(T, \bar T; p) = 0, \qquad k = 1, 2, \cdots. \tag{4.2}
\]
When the condition is fulfilled, the tau function reduces to the tau function of the one-dimensional Toda hierarchy. Owing to the degeneration, the one-dimensional Toda hierarchy has one series of commuting flows. The corresponding time variables can be identified with $T = (T_1, T_2, \cdots)$. The tau function has the following expression in terms of free fermions:
\[
\tau^{1\,Toda}(T; p) = e^{\sum_{k=1}^{\infty} c_k T_k}\, \langle p|\, e^{\frac{1}{2}\sum_{k=1}^{\infty} T_k J_k}\, g\, e^{\frac{1}{2}\sum_{k=1}^{\infty} T_k J_{-k}} \,|p\rangle. \tag{4.3}
\]
The coupling constants $t = (t_1, t_2, \cdots)$ in the series of the partition functions (1.14) are eventually identified with the standard Toda time variables $T = (T_1, T_2, \cdots)$ by $T_k = (-)^k t_k$. To implement the constraint (4.2), $g$ is chosen to satisfy the constraint
\[
J_k\, g = g\, J_{-k}, \qquad k = 1, 2, \cdots. \tag{4.4}
\]
Under this condition, the foregoing expression of $\tau^{1\,Toda}(T; p)$ can be rewritten as
\[
\tau^{1\,Toda}(T; p) = e^{\sum_{k=1}^{\infty} c_k T_k}\, \langle p|\, e^{\sum_{k=1}^{\infty} T_k J_k}\, g \,|p\rangle. \tag{4.5}
\]
This shows that $\tau^{1\,Toda}(T; p)$ is a tau function of the modified KP hierarchy as well.
4.1. Shift symmetry. Among automorphisms of the Lie algebra (2.7), we pay special attention to the following shift symmetry:
\[
V_m^{(k)} - \delta_{m,0}\, \frac{q^k}{1 - q^k} \;\longrightarrow\; V_{m+k}^{(k)} - \delta_{m+k,0}\, \frac{q^k}{1 - q^k}, \qquad k \geq 1, \tag{4.6}
\]
and $V_m^{(0)}$ are left unchanged. Using this symmetry, the three commutative sub-algebras generated respectively by $\{V_0^{(k)}\}_{k \geq 0}$, $\{V_k^{(k)}\}_{k \geq 0}$ and $\{V_{-k}^{(k)}\}_{k \geq 0}$ become conjugate to one another. The symmetry (4.6) turns out to be one of the automorphisms generated by the adjoint action of $G_\pm$. Combining the transformations (3.8), (3.9), we find
\begin{align}
&(G_- G_+) \Bigl( :\psi(q^{\frac{k}{2}} z)\,\psi^*(q^{-\frac{k}{2}} z): - \frac{q^{\frac{k}{2}}}{z(1 - q^k)} \Bigr) (G_- G_+)^{-1} \nonumber\\
&\qquad = (-)^k z^k \Bigl( :\psi(q^{\frac{k}{2}} z)\,\psi^*(q^{-\frac{k}{2}} z): - \frac{q^{\frac{k}{2}}}{z(1 - q^k)} \Bigr). \tag{4.7}
\end{align}
Taking account of the mode expansion (2.1), we can read the symmetry (4.6) from formula (4.7) in the following form:
\[
(G_- G_+) \Bigl( V_m^{(k)} - \delta_{m,0}\, \frac{q^k}{1 - q^k} \Bigr) (G_- G_+)^{-1} = (-)^k \Bigl( V_{m+k}^{(k)} - \delta_{m+k,0}\, \frac{q^k}{1 - q^k} \Bigr). \tag{4.8}
\]
This formula shows that the conjugacy between the three sub-algebras is realized by the adjoint actions of $(G_- G_+)^{\pm 1}$. We shall describe such transformations in some detail. Let $k \geq 1$. Putting $m = 0$ in (4.8), we obtain
\[
(G_- G_+) \Bigl( V_0^{(k)} - \frac{q^k}{1 - q^k} \Bigr) (G_- G_+)^{-1} = (-)^k V_k^{(k)}. \tag{4.9}
\]
Similarly, putting $m = -k$ in (4.8), we obtain
\[
(G_- G_+)^{-1} \Bigl( V_0^{(k)} - \frac{q^k}{1 - q^k} \Bigr) (G_- G_+) = (-)^k V_{-k}^{(k)}. \tag{4.10}
\]
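Let us consider two more commutative sub-algebras which are generated respectively by $\{V_k^{(0)}\}_{k \geq 0}$ and $\{V_{-k}^{(0)}\}_{k \geq 0}$. These sub-algebras become conjugate to the aforementioned three sub-algebras by the adjoint actions of $(G_- G_+)^{\pm 1}$ and $q^{\pm \frac{W}{2}}$. To see this, note that $W$ (1.35) is able to rotate $V_{\pm k}^{(k)}$ to $V_{\pm k}^{(0)}$ by the adjoint action. Actually, we have the following transformations:
\begin{align}
q^{\frac{W}{2}}\, V_k^{(k)}\, q^{-\frac{W}{2}} &= V_k^{(0)}, \tag{4.11}\\
q^{-\frac{W}{2}}\, V_{-k}^{(k)}\, q^{\frac{W}{2}} &= V_{-k}^{(0)}. \tag{4.12}
\end{align}
To obtain these formulas, note that, by the adjoint action of $q^{\frac{W}{2}}$, the fermions transform as $q^{\frac{W}{2}} \psi_m q^{-\frac{W}{2}} = q^{\frac{m^2}{2}} \psi_m$ and $q^{\frac{W}{2}} \psi_m^* q^{-\frac{W}{2}} = q^{-\frac{m^2}{2}} \psi_m^*$. By using these transformations, the left hand side of (4.11) can be computed as
\begin{align}
q^{\frac{W}{2}}\, V_k^{(k)}\, q^{-\frac{W}{2}}
&= q^{-\frac{k^2}{2}} \sum_{n \in \mathbb{Z}} q^{kn}\, q^{\frac{W}{2}} :\psi_{k-n} \psi_n^*: q^{-\frac{W}{2}} \nonumber\\
&= q^{-\frac{k^2}{2}} \sum_{n \in \mathbb{Z}} q^{kn + \frac{(k-n)^2}{2} - \frac{n^2}{2}} :\psi_{k-n} \psi_n^*: \nonumber\\
&= \sum_{n \in \mathbb{Z}} :\psi_{k-n} \psi_n^*:, \tag{4.13}
\end{align}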
which is nothing but $V_k^{(0)}$. Thus we obtain the formula (4.11). A similar computation leads to the formula (4.12).

The formulas (4.11), (4.12), together with (4.9), (4.10), show that all five sub-algebras are conjugate to one another by the adjoint actions of $(G_- G_+)^{\pm 1}$ and $q^{\pm \frac{W}{2}}$. Therefore, the hamiltonian $H_k = V_0^{(k)}$ can be transformed to $J_{\pm k} = V_{\pm k}^{(0)}$ by the adjoint actions of $(G_- G_+)^{\pm 1}$ and $q^{\pm \frac{W}{2}}$. Actually, as seen from formulas (4.9), (4.11), $q^{\frac{W}{2}} (G_- G_+)$ rotates $H_k$ to $J_k$, while $q^{-\frac{W}{2}} (G_- G_+)^{-1}$ rotates $H_k$ to $J_{-k}$, as seen from formulas (4.10), (4.12). We write down these transformations in the following form, convenient for later use:
\begin{align}
G_+ \Bigl( V_0^{(k)} - \frac{q^k}{1 - q^k} \Bigr) (G_+)^{-1} &= (-)^k \bigl(q^{\frac{W}{2}} G_-\bigr)^{-1} J_k\, \bigl(q^{\frac{W}{2}} G_-\bigr), \tag{4.14}\\
(G_-)^{-1} \Bigl( V_0^{(k)} - \frac{q^k}{1 - q^k} \Bigr) G_- &= (-)^k \bigl(G_+ q^{\frac{W}{2}}\bigr)\, J_{-k}\, \bigl(G_+ q^{\frac{W}{2}}\bigr)^{-1}. \tag{4.15}
\end{align}
4.2. The proof.

4.2.1. New representation of the partition functions. We derive the expression (1.15) of the partition function (1.9). Let us first rewrite $G_+ e^{H(t)} G_-$ as follows:
\begin{align}
G_+ e^{H(t)} G_- &= G_+ e^{\frac{1}{2} H(t)}\, e^{\frac{1}{2} H(t)} G_- \nonumber\\
&= G_+ e^{\frac{1}{2} H(t)} (G_+)^{-1}\; G_+ G_-\; (G_-)^{-1} e^{\frac{1}{2} H(t)} G_-. \tag{4.16}
\end{align}
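The transformations of $e^{\frac{1}{2} H(t)}$ by the adjoint actions of $G_+$, $(G_-)^{-1}$ in this expression can be evaluated by using formulas (4.14), (4.15). Eventually, these transformations are expressed as
\begin{align}
G_+ e^{\frac{1}{2} H(t)} (G_+)^{-1} &= e^{\sum_{k=1}^{\infty} \frac{t_k q^k}{2(1 - q^k)}}\, \bigl(q^{\frac{W}{2}} G_-\bigr)^{-1}\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_k}\, \bigl(q^{\frac{W}{2}} G_-\bigr), \tag{4.17}\\
(G_-)^{-1} e^{\frac{1}{2} H(t)} G_- &= e^{\sum_{k=1}^{\infty} \frac{t_k q^k}{2(1 - q^k)}}\, \bigl(G_+ q^{\frac{W}{2}}\bigr)\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_{-k}}\, \bigl(G_+ q^{\frac{W}{2}}\bigr)^{-1}. \tag{4.18}
\end{align}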
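By plugging these expressions into the right hand side of (4.16), we obtain the following formula:
\begin{align}
G_+ e^{H(t)} G_- &= e^{\sum_{k=1}^{\infty} \frac{t_k q^k}{1 - q^k}}\, \bigl(q^{\frac{W}{2}} G_-\bigr)^{-1}\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_k} \nonumber\\
&\quad \times g_\star\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_{-k}}\, \bigl(G_+ q^{\frac{W}{2}}\bigr)^{-1}, \tag{4.19}
\end{align}
where $g_\star$ is the element of $GL(\infty)$ given by (1.34). Making use of this formula we arrange the expression (1.31) as
\begin{align}
Z_p(t) &= \langle p|\, G_+ e^{H(t)} G_- \,|p\rangle \nonumber\\
&= e^{\sum_{k=1}^{\infty} \frac{t_k q^k}{1 - q^k}}\, \langle p|\, (G_-)^{-1} q^{-\frac{W}{2}}\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_k}\, g_\star\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_{-k}}\, q^{-\frac{W}{2}} (G_+)^{-1} \,|p\rangle \nonumber\\
&= e^{\sum_{k=1}^{\infty} \frac{t_k q^k}{1 - q^k}}\, \langle p|\, q^{-\frac{W}{2}}\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_k}\, g_\star\, e^{\frac{1}{2} \sum_{k=1}^{\infty} (-)^k t_k J_{-k}}\, q^{-\frac{W}{2}} \,|p\rangle. \tag{4.20}
\end{align}
Considering in the last line that the $W$-charge of the state $|p\rangle$ is $\frac{1}{6} p(p+1)(2p+1)$, we finally obtain the expression (1.15).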
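The value of the $W$-charge can be cross-checked from the transformation rules of the fermions under $q^{W/2}$ quoted above, together with $|p+1\rangle = \psi_{-p-1}|p\rangle$ and $|p-1\rangle = \psi_p^*|p\rangle$. The sketch below assumes that the state $|0\rangle$ has $W$-charge 0, which is not stated in this section; the function names are illustrative.

```python
def w_charge(p):
    """W-charge of |p>, built one fermion at a time from an assumed charge 0 for |0>.

    Adding psi_{-n} to |n-1> raises the charge by n^2; applying psi_n^* to |n>
    lowers it by n^2. Both rules follow from the action of q^{W/2} on the modes.
    """
    charge = 0
    if p >= 0:
        for n in range(1, p + 1):   # |n> = psi_{-n} |n-1>
            charge += n * n
    else:
        for n in range(p + 1, 1):   # |n-1> = psi_n^* |n>, n = 0, -1, ..., p+1
            charge -= n * n
    return charge


def closed_form(p):
    """p(p+1)(2p+1)/6, always an integer."""
    return p * (p + 1) * (2 * p + 1) // 6


if __name__ == "__main__":
    for p in range(-20, 21):
        assert w_charge(p) == closed_form(p)
    print("W-charge check passed")
```

For $p \geq 0$ this is just the sum of squares $1^2 + 2^2 + \cdots + p^2$.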
4.2.2. Reduction to one-dimensional Toda hierarchy. Let us show that $\tau(t; p) = q^{\frac{1}{6} p(p+1)(2p+1)} Z_p(t)$ is a tau function of the one-dimensional Toda hierarchy. The key is the fact that $g_\star$ satisfies (4.4). This can be seen by using formulas (4.9), (4.10), (4.11), (4.12) as follows:
\begin{align}
J_k\, g_\star &= J_k\, q^{\frac{W}{2}} (G_- G_+)^2 q^{\frac{W}{2}} \nonumber\\
&= q^{\frac{W}{2}}\, V_k^{(k)}\, (G_- G_+)^2 q^{\frac{W}{2}} \nonumber\\
&= q^{\frac{W}{2}} (G_- G_+)\, (-)^k \Bigl( V_0^{(k)} - \frac{q^k}{1 - q^k} \Bigr) (G_- G_+)\, q^{\frac{W}{2}} \nonumber\\
&= q^{\frac{W}{2}} (G_- G_+)^2\, V_{-k}^{(k)}\, q^{\frac{W}{2}} \nonumber\\
&= q^{\frac{W}{2}} (G_- G_+)^2 q^{\frac{W}{2}}\, J_{-k} \nonumber\\
&= g_\star\, J_{-k}. \tag{4.21}
\end{align}
The constraint (4.4) is equivalent to the constraint (4.2) on the tau function given by (4.1), taking $g = g_\star$. This means that $g_\star$ actually gives a solution of the one-dimensional Toda hierarchy. Therefore it follows from (4.3) that $\tau(t; p)$ is the corresponding tau function under the identifications $T_k = (-)^k t_k$. This completes the proof.

5. Conclusion and Discussion

We investigated a melting crystal, which is known as a random plane partition, from the viewpoint of integrable systems. We proved that a series of partition functions of melting crystals gives rise to a tau function of the one-dimensional Toda hierarchy, where the models are defined by adding suitable potentials, endowed with a series of coupling constants, to the standard statistical weight. We showed that these potentials are converted to a commutative sub-algebra of a quantum torus Lie algebra. Further exploiting the underlying algebraic structure, a remarkable connection between the random plane partition and the quantum torus Lie algebra was revealed. This connection is what enabled us to prove the statement.

Based on the result, we briefly studied the integrable structures of five-dimensional N = 1 supersymmetric gauge theories and A-model topological strings. The aforementioned potentials correspond to gauge theory observables analogous to the Wilson loops, and thereby the partition functions are translated in the gauge theory to generating functions of their correlators. In topological strings, we particularly comment on the possibility of topology change caused by condensation of these observables, giving a simple example.

In four-dimensional N = 2 supersymmetric gauge theories, the authors of [10] obtained the generating function of correlation functions of the higher Casimir operators in the fermionic form
\[
Z_p^{4d\,U(1)}(x) = \langle p|\, e^{\frac{1}{\hbar} J_1}\, e^{\sum_{k=0}^{\infty} \frac{x_k}{(k+1)!} P_{k+1}}\, e^{\frac{1}{\hbar} J_{-1}} \,|p\rangle, \tag{5.1}
\]
where $P_k$ are the fermion bilinear forms introduced by Okounkov and Pandharipande [23], and $x_k$ are the coupling constants of the higher Casimirs of the gauge theory. The above partition function also appears in Gromov-Witten theory as the generating function of the absolute Gromov-Witten invariants of $\mathbb{CP}^1$ [23,18]. Before this fermionic representation was presented, Getzler had conjectured, and later proved [21], that the generating function is a tau function of the one-dimensional Toda hierarchy. Getzler's proof is, however, fairly complicated and somewhat indirect, combining the Virasoro conjecture [24] with the partial result that the Toda equation holds on the subspace $x_2 = x_3 = \cdots = 0$. It will therefore be an interesting problem to give a more direct proof on the basis of the fermionic representation. A possible scenario will be, as we have done in the five-dimensional case, to find a suitable analogue of $g_\star$ and to rewrite the foregoing fermionic representation into a standard form like (4.3) and (4.5). Unfortunately, as we remarked at the end of Sect. 1.5.2, the naive four-dimensional ($R \to 0$) limit of $g_\star$ itself does not work. Since the role of $g_\star$ reminds us of various "dressing operators" in the work of Okounkov and Pandharipande [23,25], a correct four-dimensional analogue of $g_\star$ might be hidden therein.

Acknowledgements. We are very grateful to T. Tamakoshi for his participation in the research at an early stage of this work. T.N. benefited from discussion with K. Tsuda and Y. Noma. K.T. is also grateful to M. Mulase for fruitful discussion on a related issue. Finally we thank the referees for useful comments and helpful suggestions. K.T. is supported in part by Grant-in-Aid for Scientific Research No. 18340061 and No. 19540179.
References

1. Okounkov, A., Reshetikhin, N., Vafa, C.: Quantum Calabi-Yau and Classical Crystals. http://arxiv.org/list/hep-th/0309208, 2003
2. Iqbal, A.: All Genus Topological String Amplitudes and 5-brane Webs as Feynman Diagrams. http://arxiv.org/list/hep-th/0207114, 2002
3. Aganagic, M., Klemm, A., Marino, M., Vafa, C.: The Topological Vertex. Commun. Math. Phys. 259, 425–478 (2005)
4. Nekrasov, N.A.: Seiberg-Witten prepotential from instanton counting. Adv. Theor. Math. Phys. 7, 831 (2004)
5. Nekrasov, N., Okounkov, A.: Seiberg-Witten Theory and Random Partitions. http://arxiv.org/list/hep-th/0306238
6. Maeda, T., Nakatsu, T., Takasaki, K., Tamakoshi, T.: Five-Dimensional Supersymmetric Yang-Mills Theories and Random Plane Partitions. JHEP 0503, 056 (2005)
7. Iqbal, A., Kashani-Poor, A.-K.: SU(N) geometries and topological string amplitudes. Adv. Theor. Math. Phys. 10, 1–32 (2006)
8. Eguchi, T., Kanno, H.: Topological strings and Nekrasov's formulas. JHEP 12, 006 (2003)
9. Seiberg, N., Witten, E.: Electric-Magnetic Duality, Monopole Condensation, and Confinement in N = 2 Supersymmetric Yang-Mills Theory. Nucl. Phys. B426, 19 (1994); Erratum, ibid. B430, 485 (1994); "Monopoles, Duality and Chiral Symmetry Breaking in N = 2 Supersymmetric QCD." ibid. B431, 484 (1994)
10. Marshakov, A., Nekrasov, N.: Extended Seiberg-Witten Theory and Integrable Hierarchy. JHEP 0701, 104 (2007); Marshakov, A.: On Microscopic Origin of Integrability in Seiberg-Witten Theory. http://arxiv.org/list/0706.2857 [hep-th], 2007
11. Macdonald, I.G.: Symmetric Functions and Hall Polynomials. Gloucestershire: Clarendon Press, 1995
12. Okounkov, A., Reshetikhin, N.: Correlation function of Schur process with application to local geometry of a random 3-dimensional Young diagram. J. Amer. Math. Soc. 16(3), 81 (2005)
13. Nakatsu, T., Takasaki, K.: In preparation
14. Maeda, T., Nakatsu, T., Takasaki, K., Tamakoshi, T.: Free fermion and Seiberg-Witten differential in random plane partitions. Nucl. Phys. B 715, 275 (2005)
15. Maeda, T., Nakatsu, T., Noma, Y., Tamakoshi, T.: Gravitational quantum foam and supersymmetric gauge theories. Nucl. Phys. B 735, 96 (2005)
16. Maeda, T., Nakatsu, T.: Amoebas and instantons. Internat. J. Modern Phys. A 22, 937 (2007)
17. Baulieu, L., Losev, A., Nekrasov, N.: Chern-Simons and twisted supersymmetry in various dimensions. Nucl. Phys. B 522, 52–104 (1998)
18. Losev, A., Marshakov, A., Nekrasov, N.: Small Instantons, Little Strings and Free Fermions. http://arxiv.org/list/hep-th/0302191
19. Eguchi, T., Hori, K., Yang, S.-K.: Topological σ models and large-N matrix integral. Internat. J. Modern Phys. A 10, 4203 (1995)
20. Eguchi, T., Hori, K., Xiong, C.-S.: Quantum cohomology and Virasoro algebra. Phys. Lett. B 402, 71 (1997)
21. Getzler, E.: The Toda conjecture. In: Fukaya, K. et al. (eds.) Symplectic Geometry and Mirror Symmetry (KIAS, Seoul, 2000), Singapore: World Scientific, 2001, pp. 51–79
22. Pandharipande, R.: The Toda equations and Gromov-Witten theory of the Riemann sphere. Lett. Math. Phys. 53, 59 (2000)
23. Okounkov, A., Pandharipande, R.: Gromov-Witten theory, Hurwitz theory, and completed cycles. Annals of Math. 163(2), 517–560 (2006)
24. Givental, A.: Gromov-Witten invariants and quantization of quadratic hamiltonians. Mosc. Math. J. 1, no. 4, 551–568 (2001)
25. Okounkov, A., Pandharipande, R.: The equivariant Gromov-Witten theory of P^1. Ann. of Math. 163(2), 561–605 (2006)
26. Zhou, J.: Hodge Integrals and Integrable Hierarchies. http://arxiv.org/list/math.AG/0310408
27. Fairlie, D.B., Fletcher, P., Zachos, C.K.: Trigonometric structure constants for new infinite-dimensional algebras. Phys. Lett. B 218, 203 (1989)
28. Ueno, K., Takasaki, K.: Toda lattice hierarchy. Adv. Studies in Pure Math. 4, Group Representations and Systems of Differential Equations, Amsterdam: North-Holland, 1984, pp. 1–95
29. Jimbo, M., Miwa, T.: Solitons and infinite dimensional Lie algebras. Publ. RIMS, Kyoto Univ. 19, 943–1001 (1983)
30. Takebe, T.: Representation theoretical meaning of the initial value problem for the Toda lattice hierarchy I. Lett. Math. Phys. 21 (1991), 77–84; Ibid., II, Publ. RIMS, Kyoto Univ. 27, 491–503 (1991)

Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 469–501 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0584-4
Communications in
Mathematical Physics
Spatial Random Permutations and Infinite Cycles Volker Betz, Daniel Ueltschi Department of Mathematics, University of Warwick, Coventry, CV4 7AL, England. E-mail:
[email protected];
[email protected] Received: 8 November 2007 / Accepted: 15 April 2008 Published online: 8 August 2008 – © Springer-Verlag 2008
Abstract: We consider systems of spatial random permutations, where permutations are weighed according to the point locations. Infinite cycles are present at high densities. The critical density is given by an exact expression. We discuss the relation between the model of spatial permutations and the ideal and interacting quantum Bose gas.

Contents
1. Introduction
2. The Model in Finite Volume
3. The Model in Infinite Volume
   3.1 The σ-Algebra
   3.2 An extension theorem
   3.3 Permutation cycles and probability measure
   3.4 Finite vs infinite volume
4. A Regime without Infinite Cycles
5. The One-Body Model
   5.1 Occurrence of infinite cycles
   5.2 Fourier representation for spatial permutations
6. The Quantum Bose Gas
   6.1 Feynman-Kac representation of the Bose gas
   6.2 Discussion: Relevant interactions for spatial permutations
7. A Simple Model of Spatial Random Permutations with Interactions
   7.1 Approximation and definition of the model
   7.2 Pressure and critical density
   7.3 Occurrence of infinite cycles
Appendix A: Macroscopic Occupation of the Zero Fourier Mode
Appendix B: Convexity and Fourier Positivity
References
1. Introduction

This article is devoted to random permutations on countable sets that possess a spatial structure. Let $\boldsymbol{x}$ be a finite set of points $x_1, \dots, x_N \in \mathbb{R}^d$, and let $S_N$ be the set of permutations of $N$ elements. We are interested in probability measures on $S_N$ where permutations with long jumps are discouraged. The main example deals with "Gaussian" weights, where the probability of $\pi \in S_N$ is proportional to
\[
\exp\Bigl\{ -\frac{1}{4\beta} \sum_{i=1}^{N} |x_i - x_{\pi(i)}|^2 \Bigr\},
\]
with $\beta$ a parameter. But we are also interested in more general weights on permutations. In addition, we want to allow the distribution of points to be random and to depend on the permutations.

We are mostly interested in the existence and properties of such measures in the thermodynamic limit. That is, assuming that the $N$ points $\boldsymbol{x}$ belong to a cubic box $\Lambda \subset \mathbb{R}^d$, we consider the limit $|\Lambda|, N \to \infty$, keeping the density $\rho = N/|\Lambda|$ fixed. The main question is the possible occurrence of infinite cycles. As will be seen, infinite cycles occur when the density is larger than a critical value.

Mathematicians and physicists have devoted much effort to investigating properties of non-spatial random permutations, when all permutations carry equal weight. In particular, a special emphasis has been put on the study of longest increasing subsequences [1,3] and their implications for such diverse areas as random matrices [3,21], Gromov-Witten theory [22] or polynuclear growth [8], and spectacular results have been obtained. The situation is very different for random permutations involving spatial structure; we are only aware of the works [10,16,17] (and [11]). This lack of attention is odd since spatial random permutations are natural and appealing notions in probability theory; it becomes even more astonishing when considering that they play an important rôle in the study of quantum bosonic systems: Feynman [9] and Penrose and Onsager [23] pointed out the importance of long cycles for Bose-Einstein condensation, and later Sütő clarified the notion of infinite cycles, also showing that infinite, macroscopic cycles are present in the ideal Bose gas [25,26]. These works, however, never leave the context of quantum mechanics. We believe that the time is ripe for introducing a general mathematical framework of spatial random permutations. The goal of this article is to clarify the setting and the open questions, and also to present some results.

In Sect. 2, we introduce a model for spatial random permutations in a bounded domain $\Lambda \subset \mathbb{R}^d$. As stated above, the intuition is to suppress permutations with large jumps. We achieve this by assigning a "one-body energy" of the form $\sum_i \xi(x_i - x_{\pi(i)})$ to a given permutation $\pi$ on a finite set $\boldsymbol{x}$. The one-body potential $\xi$ is nonnegative, spherically symmetric, and typically monotonically increasing, although we will allow more general cases. In addition, we will introduce "many-body potentials" depending on several jumps, as well as a weight for the points $\boldsymbol{x}$.

As usual, the most interesting mathematical structures will emerge in the thermodynamic limit $|\Lambda|, N \to \infty$, where $N$ is the number of points in $\boldsymbol{x}$. The ambitious approach is to consider and study the infinite volume limit of probability measures. The limiting measure should be a well-defined joint probability measure on countably infinite (but locally finite) sets $\boldsymbol{x} \subset \mathbb{R}^d$, and on permutations of $\boldsymbol{x}$. To establish such a limit seems fairly difficult; as an alternative one can settle for constructing an infinite volume measure for permutations only, with a fixed $\boldsymbol{x}$ chosen according to some point process. We provide a framework for doing so in Sect. 3, and give a natural criterion
for the existence of the infinite volume limit (it is a generalisation of the one given in [11]). This criterion is trivially fulfilled if the interaction prohibits jumps greater than a certain finite distance; however, its verification for the physically most interesting cases remains an open problem.

Another option for taking the thermodynamic limit is to focus on the existence of the limiting distribution of one special random variable as $|\Lambda|, N \to \infty$. Motivated by its relevance to Bose-Einstein condensation, our choice of random variable is the probability of the existence of long cycles; more precisely, we will study the fraction of indices that lie in a cycle of macroscopic length. The general intuition is that in situations where points are sparse (low density), or where moderately long jumps are strongly discouraged (high temperature), the typical permutation is a small perturbation of the identity map, and there are no infinite cycles. In Sect. 4 we give a criterion (that corresponds to low densities, resp. high temperatures) for the absence of infinite cycles.

On the other hand, infinite cycles are usually present for high density. The density where infinite cycles first occur is called the critical density. We establish the occurrence of infinite, macroscopic cycles in Sect. 5 for the case where only the one-body potential is present, and where we average over the point configurations $\boldsymbol{x}$ in a suitable way. An especially pleasing aspect of the result is the existence of a simple, exact formula for the critical density. It turns out to be nothing else than the critical density of the ideal Bose gas, first computed by Einstein in 1925! The experienced physicist may shrug this fact off in hindsight. However, it is a priori not apparent why quantum mechanics should be useful in understanding this problem, and it is fortunate that much progress has been achieved on bosonic systems over the years. Of direct relevance here is Sütő's study of the ideal gas [26], and the work of Buffet and Pulé on distributions of occupation numbers [6].

Section 6 is devoted to the relation between models of spatial random permutations and the Feynman-Kac representation of the Bose gas. We are particularly interested in the effect of interactions on the Bose-Einstein condensation. While this question has been largely left to numerical experts in path-integral Monte Carlo methods, we expect that weakly interacting bosons can be exactly described by a model of spatial permutations with two-body interactions. An interesting open problem is to establish this fact rigorously. Numerical simulations of the model of spatial permutations should be rather easy to perform, and they should help us to understand the phase transition to a Bose condensate.

In Sect. 7 we simplify the interacting model of Sect. 6. It turns out that the largest terms contributing to the interactions between permutation jumps are due to cycles of length 2. Retaining this contribution only, we obtain a toy model of interacting random permutations that is simple enough to handle, but which allows us to explore some of the effects of interactions on Bose-Einstein condensation. In particular, we are able to compute the critical temperature exactly. It turns out to be higher than the non-interacting one and to deviate linearly in the scattering length of the interaction potential. This is in qualitative agreement with the findings of the physics community [2,14,15,20]. In addition, we show that infinite cycles occur in our toy model whenever the density is sufficiently high. However, our condition on the density is not optimal: it is higher than the critical density of our interacting model. We expect the existence of infinite cycles right down to the interacting critical density, but this is yet another open problem.
Fig. 1. Illustration for a random set of points x, and for a permutation π on x. Isolated points are sent onto themselves
2. The Model in Finite Volume

Let $\Lambda$ be a bounded open domain in $\mathbb{R}^d$, and let $V$ denote its volume (Lebesgue measure). The state space of our model is the cartesian product
\[
\Omega_{\Lambda,N} = \Lambda^N \times S_N, \tag{2.1}
\]
where $S_N$ is the symmetric group of permutations of $\{1, \dots, N\}$. The state space $\Omega_{\Lambda,N}$ can be equipped with the product σ-algebra of the Borel σ-algebra for $\Lambda^N$ and the discrete σ-algebra for $S_N$. An element $(x_1, \dots, x_N) \times \pi \in \Omega_{\Lambda,N}$ is viewed as a spatial random permutation in the sense that $x_j$ is mapped to $x_{\pi(j)}$ for all $j$. Figure 1 illustrates this.

The probability measure on $\Omega_{\Lambda,N}$ is obtained in the usual way of statistical mechanics: a reference measure, in our case the product of Lebesgue measure on $\Lambda^N$ and uniform measure on permutations, is perturbed by a density given by the exponential of a Hamiltonian, i.e. a function $H : \Omega_{\Lambda,N} \to (-\infty, \infty]$. We will shortly specify the shape of relevant Hamiltonians.

We are interested in properties of permutations rather than positions, and we only consider random variables on $S_N$. We consider two different expectations: $E_{\boldsymbol{x}}$, when positions $\boldsymbol{x} \in \Lambda^N$ are fixed; and $E_{\Lambda,N}$, when we average over positions. For this purpose we introduce the partition functions
\begin{align}
Y(\boldsymbol{x}) &= \sum_{\pi \in S_N} e^{-H(\boldsymbol{x}, \pi)}, \nonumber\\
Z(\Lambda, N) &= \frac{1}{N!} \int_{\Lambda^N} Y(\boldsymbol{x})\, d\boldsymbol{x}. \tag{2.2}
\end{align}
In the last line, $d\boldsymbol{x}$ denotes the Lebesgue measure on $\mathbb{R}^{dN}$. The factor $1/N!$ implies that $Z(\Lambda, N) \sim e^{Vq}$ for large $V$, $N$, and for "reasonable" Hamiltonians, which is a desirable property in statistical mechanics. Then, for $\theta : S_N \to \mathbb{R}$ a random variable on the set of permutations, we define
\[
E_{\boldsymbol{x}}(\theta) = \frac{1}{Y(\boldsymbol{x})} \sum_{\pi \in S_N} \theta(\pi)\, e^{-H(\boldsymbol{x}, \pi)}, \tag{2.3}
\]
and
\begin{align}
E_{\Lambda,N}(\theta) &= \frac{1}{Z(\Lambda, N)\, N!} \int_{\Lambda^N} d\boldsymbol{x} \sum_{\pi \in S_N} \theta(\pi)\, e^{-H(\boldsymbol{x}, \pi)} \nonumber\\
&= \frac{1}{Z(\Lambda, N)\, N!} \int_{\Lambda^N} E_{\boldsymbol{x}}(\theta)\, Y(\boldsymbol{x})\, d\boldsymbol{x}. \tag{2.4}
\end{align}
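For a handful of points, $Y(\boldsymbol{x})$ and $E_{\boldsymbol{x}}(\theta)$ can be evaluated by brute-force enumeration of $S_N$. The sketch below is only an illustration under stated assumptions: it takes the Gaussian weight from the Introduction as the Hamiltonian, and the function names, the sample points and the choice of $\theta$ (number of fixed points) are hypothetical. It is usable only for small $N$, since it visits all $N!$ permutations.

```python
import itertools
import math


def gaussian_energy(points, perm, beta):
    """Assumed Hamiltonian: sum_i |x_i - x_pi(i)|^2 / (4 beta)."""
    return sum(
        math.dist(points[i], points[perm[i]]) ** 2 / (4.0 * beta)
        for i in range(len(points))
    )


def expectation(points, theta, beta):
    """E_x(theta) = (1 / Y(x)) * sum_pi theta(pi) exp(-H(x, pi)), as in Eq. (2.3)."""
    partition = 0.0  # accumulates Y(x) of Eq. (2.2)
    weighted = 0.0   # accumulates sum_pi theta(pi) exp(-H(x, pi))
    for perm in itertools.permutations(range(len(points))):
        weight = math.exp(-gaussian_energy(points, perm, beta))
        partition += weight
        weighted += theta(perm) * weight
    return weighted / partition


def fixed_points(perm):
    """Example random variable: number of indices i with pi(i) = i."""
    return sum(1 for i, j in enumerate(perm) if i == j)


if __name__ == "__main__":
    close = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
    far = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    print(expectation(close, fixed_points, beta=1.0))  # nearly uniform permutations
    print(expectation(far, fixed_points, beta=1.0))    # identity dominates
```

When the points are far apart relative to $\sqrt{\beta}$, the identity permutation carries almost all the weight, which matches the intuition that long jumps are discouraged.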
We will be mostly interested in the possible occurrence of long cycles. Thus we introduce a random variable that measures the density of points in cycles of length between $m$ and $n$:
\[
\varrho_{m,n}(\pi) = \frac{1}{V}\, \#\{ i = 1, \dots, N : m \leq \ell_i(\pi) \leq n \}. \tag{2.5}
\]
Here, $\ell_i(\pi)$ denotes the length of the cycle that contains $i$; that is, $\ell_i(\pi)$ is the smallest number $n \geq 1$ such that $\pi^{n}(i) = i$. We also have
\[
\varrho_{m,n}(\pi) = \frac{1}{V} \sum_{i=1}^{N} \chi_{[m,n]}(\ell_i(\pi)) \tag{2.6}
\]
with $\chi_I$ denoting the characteristic function for the interval $I$. We denote by $R_N$ the space of random variables on $S_N$ that are invariant under relabeling. That is, $\theta \in R_N$ satisfies
\[
\theta(\sigma^{-1} \pi \sigma) = \theta(\pi) \tag{2.7}
\]
for all $\sigma, \pi \in S_N$. Random variables in $R_N$ have the useful property that they do not depend on the way the set $\boldsymbol{x} = \{x_1, \dots, x_N\}$ is labeled. Instead, they only depend on the set $\boldsymbol{x}$ itself, and in that sense are the most natural quantities to study. Notice that $\varrho_{m,n} \in R_N$.

We now discuss the form of relevant Hamiltonians. $H$ is given by the sum
\[
H(\boldsymbol{x}, \pi) = H^{(1)}(\boldsymbol{x}, \pi) + \sum_{k \geq 2} H^{(k)}(\boldsymbol{x}, \pi) + G(\boldsymbol{x}), \tag{2.8}
\]
where the terms satisfy the following properties. Let $\boldsymbol{x} = (x_1, \dots, x_N)$.

• The one-body Hamiltonian $H^{(1)}$ has the form
\[
H^{(1)}(\boldsymbol{x}, \pi) = \sum_{i=1}^{N} \xi(x_i - x_{\pi(i)}). \tag{2.9}
\]
We suppose that $\xi$ is a spherically symmetric function $\mathbb{R}^d \to [0, \infty]$, that $\xi(0) = 0$, and that $e^{-\xi}$ is integrable.

• The $k$-body term $H^{(k)} : \Omega_{\Lambda,N} \to \mathbb{R}$ can be negative; it has the form
\[
H^{(k)}(\boldsymbol{x}, \pi) = \sum_{\substack{A \subset \{1, \dots, N\} \\ |A| = k}} V\bigl( (x_j, x_{\pi(j)})_{j \in A} \bigr). \tag{2.10}
\]

• The function $G : \mathbb{R}^{dN} \to \mathbb{R}$ depends on the points only. It has no effect on the expectation $E_{\boldsymbol{x}}$, but it modifies the expectation $E_{\Lambda,N}$.

We will discuss in Sect. 6 the links between spatial random permutations and the quantum Bose gas. We will see that the physically relevant terms are $\xi(x) = |x|^2/4\beta$, that the interactions are two-body ($H^{(k)} = 0$ for $k > 2$), and that $G(\boldsymbol{x}) \equiv 0$. From the mathematical point of view it is interesting to consider a more general setting. In particular, we can restrict the jumps by setting $\xi(x) = \infty$ for $|x|$ bigger than some cutoff distance $R$.

The effect of $G$ is to modify the typical sets of points. We can choose it such that $Y(\boldsymbol{x}) \equiv 1$. We refer to this case as "Poisson", since positions are independent of each other, and they are uniformly spread. The point process for the Bose gas is not Poisson, however. The fluctuations of the number of points in a subdomain were studied in [18]. They were shown to satisfy a large deviation principle with a rate function that is different from Poisson's.

3. The Model in Infinite Volume

As usual, the most interesting structures emerge in the infinite volume limit $V \to \infty$ or, more precisely, the thermodynamic limit $V \to \infty$, $N = \rho V$. The easiest way to take this limit is to consider a fixed random variable, e.g. $\varrho_{m,n}(\pi)$ from (2.5), and study its distribution as $V \to \infty$ and $N = \rho V$. We will indeed do this in Sect. 5; an advantage of this approach is that we do not have to worry about infinite volume probability measures. But these infinite volume measures are very interesting objects to study directly, in the same spirit as when constructing infinite volume Gibbs measures. We advocate this point of view in the present section, and introduce a framework for spatial permutations in unbounded domains.

3.1. The σ-Algebra. The present and the following subsection contain preparatory results about permutations on $\mathbb{N}$.
We introduce the “cylinder sets” $B_{i,j}$ that consist of all permutations where $i \in \mathbb{N}$ is sent to $j \in \mathbb{N}$:
$$B_{i,j} = \{\pi \in S_{\mathbb{N}} : \pi(i) = j\}. \tag{3.1}$$
Let $\Sigma'$ denote the collection of finite intersections of cylinder sets and their complements. One can check that it is closed under finite intersections, and also that the difference of two sets is equal to a finite union of disjoint sets. Such a family of sets is called a semiring by probabilists. Semirings are useful because they are easy to build from basic sets, and because premeasures on semirings can be extended to measures by the Carathéodory-Fréchet theorem. Let $\Sigma$ be the σ-algebra generated by the $B_{i,j}$.

We start by proving a structural lemma that we shall use when extending finite volume measures to an infinite volume one. For $A_1, A'_1, \dots, A_m, A'_m \subset \mathbb{N}$, let us define
$$B^{A'_1 \dots A'_m}_{A_1 \dots A_m} = \bigl\{ \pi \in S_{\mathbb{N}} : \pi^{-1}(i) \in A_i \text{ and } \pi(i) \in A'_i,\ 1 \le i \le m \bigr\}. \tag{3.2}$$
One easily checks that any element of the semiring $\Sigma'$ can be represented by a set of the form (3.2). Also, the intersection of two such sets satisfies
$$B^{A'_1 \dots A'_m}_{A_1 \dots A_m} \cap B^{C'_1 \dots C'_m}_{C_1 \dots C_m} = B^{A'_1 \cap C'_1 \,\dots\, A'_m \cap C'_m}_{A_1 \cap C_1 \,\dots\, A_m \cap C_m}. \tag{3.3}$$
Lemma 3.1. Let $A_1, A'_1, A_2, A'_2, \dots$ be finite subsets of $\mathbb{N}$. If
$$B^{A'_1 \dots A'_m}_{A_1 \dots A_m} \neq \emptyset$$
for all finite $m$, then
$$\lim_{m\to\infty} B^{A'_1 \dots A'_m}_{A_1 \dots A_m} \neq \emptyset.$$
It is crucial that both $A_i$ and $A'_i$ be finite for all $i$; counter-examples are easily found otherwise. For instance, choose $A_1 = \mathbb{N}$, $A_i = \{i-1\}$ for $i > 1$, and $A'_i = \{i+1\}$ for all $i$; then each finite intersection is non-empty. The infinite intersection is empty, on the other hand, since there is no possibility left for the preimage of 1. Similarly, choosing $A'_1 = \mathbb{N}$, $A'_i = \{i-1\}$ for $i > 1$ and $A_i = \{i+1\}$ does not leave a possible image for 1. These two cases should be kept in mind when reading the proof. A claim similar to Lemma 3.1 and Theorem 3.2 was proposed in [11]. The proof there contains a small flaw that is corrected here.

Proof. We write $B^{a'}_{a}$ instead of $B^{\{a'\}}_{\{a\}}$, etc. We have
$$B^{A'_1 \dots A'_m}_{A_1 \dots A_m} = \bigcup_{\substack{a_1,\dots,a_m \\ a'_1,\dots,a'_m}} B^{a'_1 \dots a'_m}_{a_1 \dots a_m}. \tag{3.4}$$
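The union is over $a_1 \in A_1, \dots, a'_m \in A'_m$, with the restriction that $a_i \neq a_j$ and $a'_i \neq a'_j$ for $i \neq j$. The union is disjoint, $B^{a'_1 \dots a'_m}_{a_1 \dots a_m} \neq \emptyset$, and
$$B^{a'_1 \dots a'_{m+1}}_{a_1 \dots a_{m+1}} \subset B^{a'_1 \dots a'_m}_{a_1 \dots a_m}. \tag{3.5}$$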
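Permutations are characterised by a “limiting set”, namely
$$\{\pi\} = B^{a'_1 a'_2 \dots}_{a_1 a_2 \dots}, \tag{3.6}$$
where $\{a_i\}$, $\{a'_i\}$ are given by
$$\pi^{-1}(i) = a_i \quad\text{and}\quad \pi(i) = a'_i. \tag{3.7}$$
Conversely, given $\{a_i\}$, $\{a'_i\}$, the set $B^{a'_1 a'_2 \dots}_{a_1 a_2 \dots}$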
is either empty, or it contains the permutation π that satisfies (3.7). We now check that the decomposition (3.4) yields at least one non-empty limiting set.

The sets $B^{a'_1 \dots a'_m}_{a_1 \dots a_m}$ that appear in (3.4) can be organised as a tree. The root is $S_{\mathbb{N}}$. The sets with $m = 1$ are connected to $S_{\mathbb{N}}$, i.e. all the sets $B^{a'_1}_{a_1}$, $a_1 \in A_1$ and $a'_1 \in A'_1$. The sets with $m = 2$ are connected to those with $m = 1$. Precisely, the sets $B^{a'_1 a'_2}_{a_1 a_2}$, $a_2 \in A_2$ and $a'_2 \in A'_2$, are connected to $B^{a'_1}_{a_1}$. And so on. There are infinitely many vertices, but finitely many vertices at finite distance from the root.

We can select a limiting set as follows. Let $\ell^{a'_1 \dots a'_m}_{a_1 \dots a_m}$ denote the length of the longest path descending from $B^{a'_1 \dots a'_m}_{a_1 \dots a_m}$. For each $m$, there is at least one $\{a_i\}$, $\{a'_i\}$ such that $\ell^{a'_1 \dots a'_m}_{a_1 \dots a_m} = \infty$. (Otherwise the tree cannot have infinitely many vertices, since the sets $A_1, A'_1, \dots$ have finite cardinality.) Further, there exist $a_{m+1} \in A_{m+1}$ and $a'_{m+1} \in A'_{m+1}$ such that $\ell^{a'_1 \dots a'_{m+1}}_{a_1 \dots a_{m+1}} = \infty$. We can choose an infinite descending path such that $\ell^{a'_1 \dots a'_m}_{a_1 \dots a_m}$ is always infinite. The sets $B^{a'_1 \dots a'_m}_{a_1 \dots a_m}$ are not empty for any $m$, so that $B^{a'_1 a'_2 \dots}_{a_1 a_2 \dots}$ is a non-empty limiting set.
3.2. An extension theorem. We now give a criterion for a set function µ on $\Sigma'$ to extend to a measure on $\Sigma$.

Theorem 3.2. Let $\Sigma'$ be the semiring generated by the cylinder sets $B_{i,j}$ in (3.1), and let µ be an additive set function on $\Sigma'$ with $\mu(S_{\mathbb{N}}) < \infty$. We assume that for all fixed $i \in \mathbb{N}$, we have
$$\sum_{j \ge 1} \mu(B_{i,j}) = 1 \quad\text{and}\quad \sum_{j \ge 1} \mu(B_{j,i}) = 1. \tag{3.8}$$
Then µ extends uniquely to a probability measure on $(S_{\mathbb{N}}, \Sigma)$.

If µ is symmetric with respect to the inversion of permutation, i.e. $\mu(B_{i,j}) = \mu(B_{j,i})$, then the two conditions are equivalent. Since $S_{\mathbb{N}} = \cup_j B_{i,j}$, this amounts to σ-additivity for a restricted class of disjoint unions. Thus (3.8) is also a necessary condition. Theorem 3.2 does not contain any reference to space; but we shall see in the following section that the spatial structure provides a natural way for (3.8) to be fulfilled in certain cases.

Proof. It follows from the assumption of the theorem that for any $\varepsilon > 0$, there exist finite sets $C_1, C'_1, C_2, C'_2, \dots$ such that, for all $i$,
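$$\mu\bigl(\{\pi \in S_{\mathbb{N}} : \pi^{-1}(i) \in C_i \text{ and } \pi(i) \in C'_i\}\bigr) > 1 - 2^{-i-1}\varepsilon. \tag{3.9}$$
Then for any $m$, we have
$$\mu\bigl(B^{C'_1 \dots C'_m}_{C_1 \dots C_m}\bigr) > 1 - \tfrac{1}{2}\varepsilon. \tag{3.10}$$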
We prove the following property, which is equivalent to σ-additivity in $\Sigma'$. For any decreasing sequence $(G_n)$ of sets in $\Sigma'$ such that $\mu(G_n) > \varepsilon$ for all $n$, we have $\lim_n G_n \neq \emptyset$. As we have observed above, $G_n$ can be written as
$$G_n = B^{A'_{n1} \dots A'_{nm}}_{A_{n1} \dots A_{nm}} \tag{3.11}$$
for some finite $m$ that depends on $n$. Without loss of generality, we can suppose that $m \ge n$ (choosing some sets to be $\mathbb{N}$ if necessary); we can actually take $m = n$ (by restricting to a supersequence if necessary). This lightens the argument a little. We have $G_n \cap G_{n+1} = G_{n+1}$. Using (3.3), we can choose the sets such that $A_{ni} \supset A_{n+1,i}$ and $A'_{ni} \supset A'_{n+1,i}$ for each $i$. The sets $A_{ni}$, $A'_{ni}$ may be infinite. We therefore define the finite sets $D_{ni} = A_{ni} \cap C_i$ and $D'_{ni} = A'_{ni} \cap C'_i$. Then
$$\mu\bigl(B^{D'_{n1} \dots D'_{nn}}_{D_{n1} \dots D_{nn}}\bigr) = \mu\bigl(G_n \cap B^{C'_1 \dots C'_n}_{C_1 \dots C_n}\bigr) \ge \mu(G_n) - \mu\Bigl(\bigl(B^{C'_1 \dots C'_n}_{C_1 \dots C_n}\bigr)^{c}\Bigr) > \tfrac{1}{2}\varepsilon. \tag{3.12}$$
The sets $D_{ni}$, $D'_{ni}$ are finite and decreasing for fixed $i$. Let
$$D_i = \lim_n D_{ni}, \qquad D'_i = \lim_n D'_{ni}. \tag{3.13}$$
For any $k$, there exists $n$ large enough such that
$$B^{D'_{n1} \dots D'_{nn}}_{D_{n1} \dots D_{nn}} \subset B^{D'_1 \dots D'_k}_{D_1 \dots D_k} \subset G_k. \tag{3.14}$$
The set in the left side is not empty since it has positive measure. Thus the set in the middle is not empty for any $k$. By Lemma 3.1, its limit as $k \to \infty$ is not empty, which proves that $\lim_k G_k \neq \emptyset$.
3.3. Permutation cycles and probability measure. Here we discuss the connection between the considerations of the two previous subsections, and the spatial structure. The set of permutations where $i$ belongs to a cycle of length $n$ can be expressed as
$$B_i^{(n)} = \bigcup_{j_1,\dots,j_n} \bigcap_{i=1}^{n} B_{j_{i-1}, j_i}, \tag{3.15}$$
where $j_1, \dots, j_n$ are distinct integers, and where we set $j_0 \equiv j_n$. The union is countable and $B_i^{(n)}$ belongs to the σ-algebra $\Sigma$. The event where $i$ belongs to an infinite cycle is then
$$B_i^{(\infty)} = \Bigl(\bigcup_{n \ge 1} B_i^{(n)}\Bigr)^{c} \tag{3.16}$$
and it also belongs to $\Sigma$. We can introduce the random variable $\ell_i$ for the length of the cycle that contains $i$. It can take the value $\infty$. Since $\ell_i^{-1}(\{n\}) = B_i^{(n)}$ for all $n = 1, 2, \dots, \infty$, we see that $\ell_i$ is measurable.

In general the probability distribution of $\ell_i$ depends on $i$. Thus we average over points in a large domain. Let $x \subset \mathbb{R}^d$ be a countable set with no accumulation points (that is, if $\Lambda \subset \mathbb{R}^d$ is bounded, then $x \cap \Lambda$ is finite). The elements $x_1, x_2, \dots$ of $x$ can be ordered according to their distance to the origin. More exactly, we suppose that for any cube $\Lambda$ centered at the origin, there exists $N$ such that $x_i \in \Lambda$ iff $i \le N$. Let $V$ be the volume of $\Lambda$. We introduce the density of points in cycles of length between $m$ and $n$, by
$$\varrho^{(\Lambda)}_{m,n}(\pi) = \frac{1}{V}\, \#\{i = 1, \dots, N : m \le \ell_i(\pi) \le n\}. \tag{3.17}$$
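The density (3.17) is straightforward to evaluate for a finite permutation. The following is a minimal sketch, not taken from the paper: the function names are illustrative, indices are 0-based rather than 1-based, and the volume is passed in as a plain number.

```python
def cycle_lengths(perm):
    """Return a list whose i-th entry is the length of the cycle containing i.

    `perm` is a permutation of range(len(perm)) given as a list, perm[i] = pi(i).
    """
    n = len(perm)
    lengths = [0] * n
    for start in range(n):
        if lengths[start]:
            continue  # this cycle was already visited
        # Walk the cycle through `start` and record its members.
        cycle = [start]
        j = perm[start]
        while j != start:
            cycle.append(j)
            j = perm[j]
        for i in cycle:
            lengths[i] = len(cycle)
    return lengths


def cycle_density(perm, m, n, volume):
    """Density as in (3.17): indices in cycles of length in [m, n], divided by V."""
    return sum(1 for length in cycle_lengths(perm) if m <= length <= n) / volume


# Example: pi = (0 1 2)(3 4)(5) on six points in a box of volume 2.
pi = [1, 2, 0, 4, 3, 5]
print(cycle_lengths(pi))              # [3, 3, 3, 2, 2, 1]
print(cycle_density(pi, 2, 3, 2.0))   # 2.5
```

Each index is visited once, so the cost is linear in the number of points.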
This expression is of course very similar to Eq. (2.5) for the model in finite volume. Next we define the relevant measure on $S_{\mathbb{N}}$. Let $S_N$ be the set of permutations that are trivial for indices larger than $N$:
$$S_N = \{\pi \in S_{\mathbb{N}} : \pi(i) = i \text{ if } i > N\}. \tag{3.18}$$
We define the finite volume probability of a set $B$ in the semiring $\Sigma'$ by
$$\nu_x^{(\Lambda)}(B) = \frac{1}{Y(x_\Lambda)} \sum_{\pi \in B \cap S_N} e^{-H(x_\Lambda, \pi)}. \tag{3.19}$$
Here, $x_\Lambda = x \cap \Lambda$. The Hamiltonian $H(x_\Lambda, \pi)$ and the normalisation $Y(x_\Lambda)$ are given by the same expression as in Sect. 2.
The existence of the thermodynamic limit turns out to be difficult to establish. If $x$ is a lattice such as $\mathbb{Z}^d$, or if $x$ is the realisation of a translation invariant point process, we expect that $\nu_x^{(\Lambda)}(B)$ converges as $V \to \infty$. We cannot prove such a strong statement, but it follows from Cantor’s diagonal argument that there exists a subsequence $(V_n)$ of increasing volumes, such that $\nu_x^{(\Lambda_n)}(B)$ converges for all $B \in \Sigma'$. (Here, $\Lambda_n$ is the cube of volume $V_n$ centered at 0.) Thus we have existence of a limiting set function, $\nu_x$, but we cannot guarantee its uniqueness. If ξ involves a cutoff, i.e. if $e^{-\xi(x)}$ is zero for $x$ large enough, (3.8) is fulfilled and $\nu_x$ extends to an infinite volume measure thanks to Theorem 3.2. But we cannot prove the criterion of the theorem for more general ξ, not even the Gaussian. It is certainly true, though.

For relevant choices of point processes and of permutation measures, we expect that $\lim_{V\to\infty} \varrho^{(\Lambda)}_{m,n}(\pi)$ exists for a.e. $x$ and a.e. π. Let $\varrho_{m,n}$ denote the limiting random variable. It allows us to define the expectation
$$E(\varrho_{m,n}) = \int d\mu(x) \int d\nu_x(\pi)\, \varrho_{m,n}(\pi). \tag{3.20}$$
It would be interesting to obtain properties such as concentration of the distribution of $\varrho_{m,n}$. The simplest case should be the Poisson point process.

3.4. Finite vs infinite volume. The finite volume setting of Sect. 2 can be rephrased in the infinite volume setting as follows. Let $\mu^{(\Lambda,N)}$ be the point process on $\mathbb{R}^d$ such that
$$\mu^{(\Lambda,N)}(dx) = \begin{cases} \dfrac{Y(x)}{Z(\Lambda,N)\,N!}\, dx & \text{if } x \subset \Lambda \text{ and } |x| = N, \\ 0 & \text{otherwise.} \end{cases} \tag{3.21}$$
Next, let $\nu_x^{(\Lambda)}$ be as in (3.19). For $\Lambda_1 \supset \Lambda_2 \supset \Lambda_3$, let us consider the expectation
$$E_{\Lambda_1,\Lambda_2,N}\bigl(\varrho^{(\Lambda_3)}_{m,n}\bigr) = \int d\mu^{(\Lambda_1,N)}(x) \int d\nu_x^{(\Lambda_2)}(\pi)\, \varrho^{(\Lambda_3)}_{m,n}(\pi). \tag{3.22}$$
This can be compared to Eqs (2.4) and (2.5). Namely, we have
$$E_{\Lambda,N}(\varrho_{m,n}) = E_{\Lambda,\Lambda,N}\bigl(\varrho^{(\Lambda)}_{m,n}\bigr). \tag{3.23}$$
It would be interesting to prove that the infinite volume limits in (3.22) can be taken separately. Precisely, we expect that
$$\lim_{\Lambda \nearrow \mathbb{R}^d} E_{\Lambda,\rho|\Lambda|}(\varrho_{m,n}) = \lim_{\Lambda_3 \nearrow \mathbb{R}^d}\, \lim_{\Lambda_2 \nearrow \mathbb{R}^d}\, \lim_{\Lambda_1 \nearrow \mathbb{R}^d} E_{\Lambda_1,\Lambda_2,\rho|\Lambda_1|}\bigl(\varrho^{(\Lambda_3)}_{m,n}\bigr). \tag{3.24}$$
4. A Regime without Infinite Cycles

At low density the jumps are very much discouraged and the typical permutations resemble the identity permutation, up to a few small cycles here and there. In this section we give a sufficient condition for the absence of infinite cycles. We consider the infinite volume framework. The condition has two parts: (a) a bound on the strength of the interactions; (b) in essence, that the points of $x$ lie far apart compared to the decay length of $e^{-\xi}$. Here, $B^{(\gamma)}$ denotes the set of permutations where the cycle γ is present.

Theorem 4.1. Let $x \subset \mathbb{R}^d$ be a countable set, and let $i$ be an integer. We assume the following conditions on interactions and on jump factors, respectively.

(a) There exists $0 \le s < 1$ such that for all cycles $\gamma = (j_1, \dots, j_n) \subset \mathbb{N}$ long enough, with $j_1 = i$, and all permutations $\pi \in B^{(\gamma)}$,
$$\sum_{\substack{A \subset \mathbb{N} \\ A \cap \gamma \neq \emptyset}} \Bigl[ V\bigl((x_j, x_{\pi(j)})_{j\in A}\bigr) - V\bigl((x_j, x_{\pi'(j)})_{j\in A}\bigr) \Bigr] \ge -s \sum_{j=1}^{n} \xi(x_j - x_{j-1}),$$
with $x_0 \equiv x_n$, and where $\pi'$ is the permutation obtained from π by removing γ, i.e.
$$\pi'(j) = \begin{cases} \pi(j) & \text{if } j \neq j_1, \dots, j_n, \\ j & \text{if } j = j_k \text{ for some } k. \end{cases}$$

(b) With the same $s$ as in (a),
$$\sum_{n \ge 1} \sum_{\substack{\gamma = (j_1,\dots,j_n) \\ j_1 = i}} \prod_{k=1}^{n} e^{-(1-s)\,\xi(x_{j_k} - x_{j_{k-1}})} < \infty.$$

Then we have
$$\lim_{n\to\infty} \lim_{V\to\infty} \sum_{k \ge n} \nu_x^{(\Lambda)}\bigl(B_i^{(k)}\bigr) = 0.$$
The condition (a) is stated “for all cycles long enough”, i.e. it must hold for all cycles of length $n \ge n_0$, for some fixed $n_0$ that depends on $i$ only. This weakening of the condition is useful; many points are involved, they cannot be too close, and $\xi(\cdot)$ in the right side cannot be too small. If $\nu_x$ extends to a measure on $\Sigma$, then
$$\lim_{n\to\infty} \lim_{V\to\infty} \sum_{k \ge n} \nu_x^{(\Lambda)}\bigl(B_i^{(k)}\bigr) = \nu_x\bigl(B_i^{(\infty)}\bigr), \tag{4.1}$$
and the theorem states that $\nu_x(B_i^{(\infty)}) = 0$. An open problem is to provide sufficient conditions on the Hamiltonian such that, in the finite volume framework, we have
$$\lim_{M\to\infty} \lim_{V\to\infty} E_{\Lambda,\rho V}(\varrho_{M,\rho V}) = 0.$$
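Proof of Theorem 4.1. By definition,
$$\nu_x^{(\Lambda)}\bigl(B_i^{(k)}\bigr) = \frac{1}{Y(x_\Lambda)} \sum_{\substack{\gamma = (j_1,\dots,j_k) \\ j_1 = i}} \sum_{\pi \in B_N^{(\gamma)}} \exp\Bigl\{ -\sum_{l=1}^{k} \xi(x_{j_l} - x_{j_{l-1}}) - \sum_{A \cap \gamma \neq \emptyset} V\bigl((x_j, x_{\pi(j)})_{j\in A}\bigr) - H(x_\Lambda \setminus \gamma, \pi) \Bigr\} \tag{4.2}$$
with $B_N^{(\gamma)} = B^{(\gamma)} \cap S_N$. By restricting the sum over permutations, we get a lower bound for the partition function:
$$Y(x_\Lambda) \ge \sum_{\pi \in B_N^{(\gamma)}} \exp\Bigl\{ -\sum_{A \cap \gamma \neq \emptyset} V\bigl((x_j, x_{\pi'(j)})_{j\in A}\bigr) - H(x_\Lambda \setminus \gamma, \pi') \Bigr\}. \tag{4.3}$$
Observe that $H(x_\Lambda \setminus \gamma, \pi) = H(x_\Lambda \setminus \gamma, \pi')$. Combining with the two conditions of the theorem, we obtain
$$\sum_{k \ge n} \nu_x^{(\Lambda)}\bigl(B_i^{(k)}\bigr) \le \sum_{k \ge n} \sum_{\substack{\gamma = (j_1,\dots,j_k) \\ j_1 = i}} \prod_{l=1}^{k} e^{-(1-s)\,\xi(x_{j_l} - x_{j_{l-1}})}. \tag{4.4}$$
The right side does not depend on Λ, and it vanishes in the limit $n \to \infty$.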
5. The One-Body Model

5.1. Occurrence of infinite cycles. The occurrence of infinite cycles can be proved in a large class of models with one-body Hamiltonians, with the critical density being exactly known! Here the domain Λ is the cubic box of size $L$, volume $V = L^d$. We fix the density ρ of points, that is, we take $N = \rho V$. Recall the definition (2.5) for the random variable $\varrho_{m,n}$ that gives the density of points in cycles of length between $m$ and $n$. We consider the Hamiltonian
$$H(x, \pi) = \sum_{i=1}^{N} \xi_\Lambda(x_i - x_{\pi(i)}), \tag{5.1}$$
where the function $\xi_\Lambda$ is a slight modification of the function ξ, defined by the relation
$$e^{-\xi_\Lambda(x)} = \sum_{y \in \mathbb{Z}^d} e^{-\xi(x - Ly)}. \tag{5.2}$$
Our conditions on ξ ensure that the latter sum is finite, and that $\xi_\Lambda$ converges pointwise to ξ as $V \to \infty$. This technical modification should not be necessary, but it simplifies the proof of Theorem 5.1 below. In essence, this amounts to choosing periodic boundary conditions.

Let $C = \int e^{-\xi}$. We suppose that the Fourier transform of $e^{-\xi(x)}$ is nonnegative, and we denote it $C e^{-\varepsilon(k)}$: for $k \in \mathbb{R}^d$,
$$C e^{-\varepsilon(k)} = \int_{\mathbb{R}^d} e^{-2\pi i k x}\, e^{-\xi(x)}\, dx. \tag{5.3}$$
The “dispersion relation” $\varepsilon(k)$ always satisfies
• $\varepsilon(0) = 0$; $\varepsilon(k) > a|k|^2$ for small $k$ (indeed, the Laplacian of $e^{-\varepsilon(k)}$ would otherwise be zero at $k = 0$; then $\int |x|^2 e^{-\xi(x)}\, dx = 0$, which is absurd).
• $\varepsilon(k) > 0$ uniformly in $k$ away from zero;
• $\int e^{-\varepsilon(k)}\, dk = C^{-1} < \infty$.
Among many examples of functions that satisfy the conditions above, let us mention several important cases.
(1) The Gaussian:
$$e^{-\xi(x)} = e^{-|x|^2/4\beta}, \qquad \varepsilon(k) = 4\pi^2\beta|k|^2, \qquad C = (4\pi\beta)^{d/2}. \tag{5.4}$$
(2) In dimension $d = 3$, the exponential:
$$e^{-\xi(x)} = e^{-|x|/\beta}, \qquad \varepsilon(k) = 2\log\bigl[1 + (2\pi\beta|k|)^2\bigr], \qquad C = 8\pi\beta^3. \tag{5.5}$$
(3) In d = 3, a sufficient condition for positive Fourier transform is that 1 r
e−ξ(r )
(5.6)
be monotone decreasing; here, ξ depends on $r = |x|$ only. See e.g. [13] and references therein. Thus there exist functions $e^{-\xi}$ with compact support and positive Fourier transform.
(4) In $d = 1$,
$$e^{-\xi(x)} = (|x| + 1)^{-3/2}. \tag{5.7}$$
Its Fourier transform is positive by Lemma B.1. It can be checked that $\varepsilon(k) \sim |k|^{1/2}$ for small $k$, so that its critical density is finite.

We define the critical density by
$$\rho_c = \int_{\mathbb{R}^d} \frac{dk}{e^{\varepsilon(k)} - 1}. \tag{5.8}$$
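The critical density is finite for $d \ge 3$, but it can be infinite in $d = 1, 2$. The following theorem claims that infinite cycles are present for $\rho > \rho_c$ only, and that they are macroscopic. It extends results of Sütő for the ideal Bose gas [26], which corresponds to the Gaussian case $\xi(x) \sim |x|^2$. The theorem also applies when $\rho_c = \infty$.

Theorem 5.1. Let ξ satisfy the assumptions above. Then for all $0 < a < b < 1$, and all $s \ge 0$,
(a) $\displaystyle \lim_{V\to\infty} E_{\Lambda,\rho V}(\varrho_{1,V^a}) = \begin{cases} \rho & \text{if } \rho \le \rho_c; \\ \rho_c & \text{if } \rho \ge \rho_c; \end{cases}$
(b) $\displaystyle \lim_{V\to\infty} E_{\Lambda,\rho V}(\varrho_{V^a,V^b}) = 0$;
(c) $\displaystyle \lim_{V\to\infty} E_{\Lambda,\rho V}(\varrho_{V^b,sV}) = \begin{cases} 0 & \text{if } \rho \le \rho_c; \\ s & \text{if } 0 \le s \le \rho - \rho_c; \\ \rho - \rho_c & \text{if } 0 \le \rho - \rho_c \le s. \end{cases}$

The rest of this section is devoted to the proof of this theorem.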
5.2. Fourier representation for spatial permutations. The first step is to reformulate the problem in the Fourier space. In the Gaussian case, $\xi(x) = |x|^2$, this is traditionally done using the Feynman-Kac formula and unitary transformations of Hilbert spaces. There is no Feynman-Kac formula for our more general setting, but we can directly use the Fourier transform. This actually simplifies the situation. Here and in the sequel, we use the definition (5.3) for the Fourier transform. Let Λ be the unit cube, i.e. we fix the length to $L = 1$. For $f \in L^1(\mathbb{R}^d)$, let $f_\Lambda(x) = \sum_{y\in\mathbb{Z}^d} f(x + y)$; notice that $f_\Lambda \in L^1(\Lambda)$.

Lemma 5.2. Let $f \in L^1(\mathbb{R}^d)$. For all $n = 1, 2, \dots$, we have
$$\int_{\Lambda^n} dx_1 \dots dx_n \prod_{i=1}^{n} f_\Lambda(x_i - x_{i-1}) = \sum_{k \in \mathbb{Z}^d} \hat f(k)^n.$$
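(By definition, $x_0 = x_n$.)

Proof. The $n$th convolution of the function $f_\Lambda$ with itself satisfies
$$(f_\Lambda^{*n})(x) = \int_{\Lambda^{n-1}} dx_1 \dots dx_{n-1}\, f_\Lambda(x_1)\, f_\Lambda(x_2 - x_1) \dots f_\Lambda(x - x_{n-1}). \tag{5.9}$$
The product $\prod f_\Lambda(x_i - x_{i-1})$ is translation invariant, so that
$$\begin{aligned} \int_{\Lambda^n} dx_1 \dots dx_n \prod_{i=1}^{n} f_\Lambda(x_i - x_{i-1}) &= \int_{\Lambda^{n-1}} dx_1 \dots dx_{n-1}\, f_\Lambda(x_1)\, f_\Lambda(x_2 - x_1) \dots f_\Lambda(-x_{n-1}) \\ &= f_\Lambda^{*n}(0) \\ &= \sum_{k\in\mathbb{Z}^d} \widehat{f_\Lambda^{*n}}(k) \\ &= \sum_{k\in\mathbb{Z}^d} \hat f(k)^n. \end{aligned} \tag{5.10}$$
The hat symbol in the third line denotes the $L^2(\Lambda)$-Fourier transform; but in the fourth line, it denotes the $L^2(\mathbb{R}^d)$-Fourier transform.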
We will use a corollary of this lemma.

Corollary 5.3. Let Λ be a cube of size $L$, $\Lambda^* = (\tfrac{1}{L}\mathbb{Z})^d$ be the dual space, and let $f_\Lambda(x) = \sum_{y\in\mathbb{Z}^d} f(x + Ly)$. Then for any permutation $\pi \in S_N$ we have:
$$\int_{\Lambda^N} dx_1 \dots dx_N \prod_{i=1}^{N} f_\Lambda(x_i - x_{\pi(i)}) = \sum_{k_1,\dots,k_N \in \Lambda^*} \prod_{i=1}^{N} \delta_{k_i, k_{\pi(i)}} \prod_{i=1}^{N} \hat f(k_i).$$

Proof. It is enough to consider the case $L = 1$; the general case can be obtained by scaling. The multiple integral factorises according to the cycles of the permutation. Using Lemma 5.2 for each cycle, we get the result.
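We are now in position to reformulate the problem in the Fourier space. By Corollary 5.3, we have the relation
$$E_{\Lambda,N}(\theta) = \frac{1}{Z'(\Lambda,N)\,N!} \sum_{\pi\in S_N} \theta(\pi) \sum_{k_1,\dots,k_N\in\Lambda^*} e^{-\sum_{i=1}^{N} \varepsilon(k_i)} \prod_{i=1}^{N} \delta_{k_i,k_{\pi(i)}} \tag{5.11}$$
with $Z'(\Lambda,N) = C^{-N} Z(\Lambda,N)$. A simpler expression than (5.11) is available for random variables invariant under relabeling, i.e. $\theta \in R_N$. To obtain it, we introduce the set $\mathcal{N}_\Lambda$ of “occupation numbers” on $\Lambda^*$. An element $\boldsymbol{n} \in \mathcal{N}_\Lambda$ is a sequence of integers $\boldsymbol{n} = (n_k)$, $n_k = 0, 1, 2, \dots$, indexed by $k \in \Lambda^*$. We denote by $\mathcal{N}_{\Lambda,N}$ the set of occupation numbers with total number $N$:
$$\mathcal{N}_{\Lambda,N} = \Bigl\{ \boldsymbol{n} \in \mathcal{N}_\Lambda : \sum_{k\in\Lambda^*} n_k = N \Bigr\}. \tag{5.12}$$
Let $\boldsymbol{k} = (k_1, \dots, k_N)$ be an $N$-tuple of elements of $\Lambda^*$. We can assign an element $\boldsymbol{n} = \boldsymbol{n}(\boldsymbol{k})$ in $\mathcal{N}_{\Lambda,N}$ by defining, for each $k \in \Lambda^*$,
$$n_k = \#\{i = 1, \dots, N : k_i = k\}. \tag{5.13}$$
In other words, $n_k$ is the number of occurrences of the vector $k$ in $\boldsymbol{k}$. The map $\boldsymbol{k} \mapsto \boldsymbol{n}(\boldsymbol{k})$ is onto but not one-to-one; the number of $\boldsymbol{k}$’s that are sent to a given $\boldsymbol{n}$ is equal to $N! \big/ \prod_{k\in\Lambda^*} n_k!$.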
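The counting statement above can be checked by brute force on a small example. This is a sketch only: the three mode labels are placeholders standing in for elements of the dual lattice, and the helper names are not from the paper.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def occupation_numbers(k_tuple):
    """Map an N-tuple of modes to its occupation numbers n_k, as in (5.13)."""
    return Counter(k_tuple)


def preimage_size(occ):
    """Number of N-tuples with the given occupation numbers: N! / prod_k n_k!."""
    n_total = sum(occ.values())
    return factorial(n_total) // prod(factorial(c) for c in occ.values())


# Enumerate all N-tuples over a toy set of three modes and group them
# by occupation numbers (zero occupations are simply absent from the Counter).
modes = ("k0", "k1", "k2")
N = 4
counts = Counter(
    frozenset(occupation_numbers(t).items()) for t in product(modes, repeat=N)
)
for key, count in counts.items():
    assert count == preimage_size(dict(key))
print(len(counts))  # 15 distinct occupation-number configurations
```

The 15 configurations are the ways of distributing 4 indistinguishable units over 3 modes, and their preimage sizes add up to $3^4 = 81$ tuples.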
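We now introduce probabilities for Fourier modes, permutations, and occupation numbers. The sample space is $(\Lambda^*)^N \times S_N$; it is discrete, and we consider the discrete σ-algebra. When taking probabilities, we write $\boldsymbol{k} \times \pi$ for $\{(\boldsymbol{k}, \pi)\}$; $\boldsymbol{k}$ for $\{\boldsymbol{k}\} \times S_N$; $\boldsymbol{n}$ for $\{\boldsymbol{k} : \boldsymbol{n}(\boldsymbol{k}) = \boldsymbol{n}\} \times S_N$ and π for $(\Lambda^*)^N \times \{\pi\}$. We define the probability $P_{\Lambda,N}$ by
$$P_{\Lambda,N}(\boldsymbol{k} \times \pi) = \begin{cases} \dfrac{1}{Z'(\Lambda,N)\,N!}\, e^{-\sum_{i=1}^{N} \varepsilon(k_i)} & \text{if } k_i = k_{\pi(i)} \text{ for all } i, \\ 0 & \text{otherwise.} \end{cases} \tag{5.14}$$
Here, $\boldsymbol{k} = (k_1, \dots, k_N)$. Summing over permutations, we get
$$P_{\Lambda,N}(\boldsymbol{k}) = \frac{1}{Z'(\Lambda,N)\,N!}\, e^{-\sum_{k\in\Lambda^*} \varepsilon(k) n_k} \prod_{k\in\Lambda^*} n_k! \tag{5.15}$$
with $(n_k) = \boldsymbol{n}(\boldsymbol{k})$. Finally, summing over Fourier modes that are compatible with occupation numbers yields
$$P_{\Lambda,N}(\boldsymbol{n}) = \frac{1}{Z'(\Lambda,N)}\, e^{-\sum_{k\in\Lambda^*} \varepsilon(k) n_k}. \tag{5.16}$$
It follows that the partition function $Z'(\Lambda,N)$ can be expressed using occupation numbers as
$$Z'(\Lambda,N) = \sum_{\boldsymbol{n}\in\mathcal{N}_{\Lambda,N}} e^{-\sum_{k\in\Lambda^*} \varepsilon(k) n_k}. \tag{5.17}$$
These definitions allow us to express $E_{\Lambda,N}(\theta)$ in an illuminating form.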
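Lemma 5.4. For all $\theta \in R_N$,
$$E_{\Lambda,N}(\theta) = \sum_{\boldsymbol{n}\in\mathcal{N}_{\Lambda,N}} P_{\Lambda,N}(\boldsymbol{n}) \sum_{\pi\in S_N} \theta(\pi)\, P_{\Lambda,N}(\pi|\boldsymbol{k}),$$
with $\boldsymbol{k}$ any $N$-tuple of Fourier modes that is compatible with $\boldsymbol{n}$, i.e. such that $\boldsymbol{n}(\boldsymbol{k}) = \boldsymbol{n}$.

Proof. Using the definitions (5.14)–(5.16), the expectation (5.11) of θ can be written as
$$E_{\Lambda,N}(\theta) = \sum_{\boldsymbol{k}\in(\Lambda^*)^N} \sum_{\pi\in S_N} \theta(\pi)\, P_{\Lambda,N}(\boldsymbol{k}\times\pi) = \sum_{\boldsymbol{k}\in(\Lambda^*)^N} P_{\Lambda,N}(\boldsymbol{k}) \sum_{\pi\in S_N} \theta(\pi)\, P_{\Lambda,N}(\pi|\boldsymbol{k}). \tag{5.18}$$
The latter sum does not depend on the ordering of the $k_i$’s; it depends only on occupation numbers (notice that $P_{\Lambda,N}(\pi|\sigma(\boldsymbol{k})) = P_{\Lambda,N}(\sigma^{-1}\pi\sigma|\boldsymbol{k})$ for all $\sigma \in S_N$). The lemma follows from (5.16).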
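Lemma 5.5.
$$\sum_{\pi\in S_N} \varrho_{m,n}(\pi)\, P_{\Lambda,N}(\pi|\boldsymbol{k}) = \frac{1}{V} \sum_{k\in\Lambda^*} \begin{cases} n - m + 1 & \text{if } 1 \le m \le n \le n_k, \\ n_k - m + 1 & \text{if } 1 \le m \le n_k \le n, \\ 0 & \text{if } n_k < m \le n. \end{cases}$$

Proof. It follows from (5.14) that, given $\boldsymbol{k}$, permutations are uniformly distributed (over compatible permutations). That is,
$$P_{\Lambda,N}(\pi|\boldsymbol{k}) = \begin{cases} 1 \big/ \prod_{k\in\Lambda^*} n_k! & \text{if } k_{\pi(i)} = k_i \text{ for all } i, \\ 0 & \text{otherwise.} \end{cases} \tag{5.19}$$
Given $\boldsymbol{k} = (k_1, \dots, k_N)$, a permutation π that leaves it invariant can be decomposed into a collection $(\pi_k)$, $\pi_k \in S_{n_k}$, of permutations for each Fourier mode, namely
$$\sum_{\pi\in S_N} \varrho_{m,n}(\pi)\, P_{\Lambda,N}(\pi|\boldsymbol{k}) = \Bigl( \prod_{k\in\Lambda^*} \sum_{\pi_k\in S_{n_k}} \frac{1}{n_k!} \Bigr)\, \varrho_{m,n}\bigl((\pi_k)_{k\in\Lambda^*}\bigr). \tag{5.20}$$
In addition, we have
$$\varrho_{m,n}\bigl((\pi_k)_{k\in\Lambda^*}\bigr) = \frac{1}{V} \sum_{k\in\Lambda^*} N_{m,n}(\pi_k), \tag{5.21}$$
with $N_{m,n}(\pi)$ being the number of indices that belong to cycles of length between $m$ and $n$. We obtain
$$\sum_{\pi\in S_N} \varrho_{m,n}(\pi)\, P_{\Lambda,N}(\pi|\boldsymbol{k}) = \frac{1}{V} \sum_{k\in\Lambda^*} \frac{1}{n_k!} \sum_{\pi_k\in S_{n_k}} N_{m,n}(\pi_k). \tag{5.22}$$
We see that modes have been decoupled; further, we only need to average $N_{m,n}(\pi)$ over uniform permutations. One easily checks that the probability for an index to belong to a cycle of length $\ell$, with $N$ indices, is equal to $1/N$ for any $\ell$. Then
$$\frac{1}{N!} \sum_{\pi\in S_N} N_{m,n}(\pi) = n - m + 1, \tag{5.23}$$
for integers $1 \le m \le n \le N$. The lemma follows.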
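The average (5.23) over uniform permutations can be confirmed by exhaustive enumeration for small $N$. The sketch below is an illustration with hypothetical function names and 0-based indices; it is only practical for small sizes, since it loops over all $N!$ permutations.

```python
from itertools import permutations
from math import factorial


def indices_in_cycles(perm, m, n):
    """N_{m,n}(pi): number of indices lying in cycles of length between m and n."""
    count = 0
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Follow the cycle through `start`, marking its members.
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if m <= length <= n:
            count += length
    return count


def average_indices_in_cycles(size, m, n):
    """Average of N_{m,n} over all permutations of `size` indices (brute force)."""
    total = sum(indices_in_cycles(p, m, n) for p in permutations(range(size)))
    return total / factorial(size)


# Check (5.23) for every 1 <= m <= n <= N with N = 5.
N = 5
for m in range(1, N + 1):
    for n in range(m, N + 1):
        assert average_indices_in_cycles(N, m, n) == n - m + 1
```

The check reflects the fact that each cycle length between 1 and $N$ contributes exactly one index on average.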
We have all the elements for the proof of Theorem 5.1. Proof of Theorem 5.1. We introduce a set Aη of occupation numbers that is typical. With ρ0 = max(0, ρ − ρc ), let ⎧ ⎫ ⎨ n ⎬ 0 Aη = n ∈ N,N : − ρ0 < η; n k < ηV ; n k < V 3η for all |k| V −η . ⎩ ⎭ V −η 0 2, and let δ as in Lemma 7.4. For all m 0,
Na,n (π )
π ∈Sn
e−α N2 (π ) (n − a − 2m)(1 − δ m ). h n (α)n!
Proof. When summing over permutations, all indexes i in the definition of Na,n (π ) are equivalent, so that
Na,n (π )
π ∈Sn
e−α N2 (π ) e−α N2 (π ) χ [a,n] ( 1 (π )) =n . h n (α)n! h n (α)n!
(7.24)
π ∈Sn
Summing over the lengths of the cycle that contains 1, we get π ∈Sn
n e−α N2 (π ) = Na,n (π ) n(n − 1) . . . (n − j + 1) h n (α)n!
π ∈Sn− j
j=a
=
n j=a
e−α N2 (π ) h n (α)n!
h n− j (α) . h n (α)
(7.25)
We get a lower bound by summing up to n − 2m. By Lemma 7.4 (c), we have for 2 < j < n − 2m, h n− j (α) e−δ − δ m /m! −δ 1 − δm . h n (α) e + δ n/2 / n2 !
(7.26)
The last result that is needed for the proof of Theorem 7.2 is that our interacting model displays Bose-Einstein condensation, in the sense that the zero Fourier mode is macroscopically occupied. Proposition 7.6. The expectation for the occupation of the zero Fourier mode is bounded below by lim E ,N
V →∞
n0 V
ρ−
4 ρ (0) . (1 + e−α )2 c
Proof. We proceed as in Appendix B of [27]. We have E ,N and E ,N (n k ) =
n0 V
=ρ−
E ,N (n k ),
(7.27)
k∈∗ \{0}
P,N (n k j)
j 1
=
1 V
j 1 n∈N,N − j
e−ε(k) j
k ∈∗
e−ε(k )n k h n k (α) h n k + j (α) . Z (, N ) h n k (α)
(7.28)
The latter ratio is smaller than (1 − δ)−1 by Lemma 7.4 (b). By restricting occupation numbers to n 0 j, we also have
Z (, N )
e−ε(k)n k h n k (α)
n∈N,N − j k∈∗
h n 0 + j (α) h n 0 (α)
Z (, N − j)(1 − δ).
(7.29)
Then E ,N (n k ) (1 − δ)−2
e−ε(k) j = (1 − δ)−2
j 1
1 . eε(k) − 1
(7.30)
It follows that E ,N
n0 V
ρ − (1 − δ)−2
We get the proposition by letting V → ∞.
1 V
k∈∗ \{0}
1 eε(k)
−1
.
(7.31)
Proof of Theorem 7.2. From Lemma 7.3, we have E ,N (m,n ) =
P,N (n)
n∈N,N
e−α N2 (π ) . k∈∗ h n k (α)n k !
m,n (π )
π ∈S N ki =kπ(i) ∀i
(7.32)
Compatible permutations factorise according to Fourier modes, i.e. π = (πk ) with πk ∈ Sn k . Also, N2 (π ) = k N2 (πk ). Then
E ,N (m,n ) =
⎛ P,N (n) ⎝
k∈∗ πk ∈Sn k
n∈N,N
=
P,N (n)
⎞ e−α N2 (πk ) ⎠ m,n ((πk )) h n k (α)n k !
m,n (πk )
k∈∗ πk ∈Sn k
n∈N,N
e−α N2 (πk ) . h n k (α)n k !
(7.33)
We keep only the term k = 0. Using Lemma 7.5, we obtain the lower bound E ,N ( V b ,N )
P,N (n)
n∈N,N
E ,N The claim follows from Proposition 7.6.
n0 V
n 0 − 3V b b (1 − δ V ) V
− 4V b−1 .
(7.34)
Appendix A: Macroscopic Occupation of the Zero Fourier Mode

In this appendix we investigate the random variable $\boldsymbol{n} \mapsto n_0$ under the measure (5.16) in the thermodynamic limit $V \to \infty$, $N = \rho V$, for all density parameters ρ. We want to show that $n_0/V$ approaches a limit for each density ρ. We will actually show much more, by giving the limiting moment generating function of $n_0/V$. We partly follow Buffet and Pulé [6], who considered the ideal Bose gas in arbitrary domains.

Theorem A.1. Let $\rho_0 = \max(0, \rho - \rho_c)$, with $\rho_c$ the critical density defined in (5.8). Then
$$\lim_{V\to\infty} E_{\Lambda,\rho V}\bigl(e^{\lambda n_0/V}\bigr) = e^{\lambda\rho_0}$$
for all $\lambda \ge 0$.

Our first proof applies only when $\rho_c$ is finite. We give below an argument that completes the proof.

Proof when $\rho_c < \infty$. It is shown in [27], Appendix B, that for all $k \in \Lambda^*$, we have
$$P_{\Lambda,N}(n_k \ge j) = e^{-\varepsilon(k) j}\, \frac{Z'(\Lambda, N - j)}{Z'(\Lambda, N)}. \tag{A.1}$$
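The identity (A.1) holds for any finite set of modes, which makes it easy to test numerically: shifting $n_k \mapsto n_k - j$ in the sum over occupation numbers produces the factor $e^{-\varepsilon(k) j}$ and a partition function with $N - j$ particles. The sketch below checks this by direct enumeration. The three mode energies are hypothetical values chosen for illustration, and the helper names are not from the paper.

```python
from itertools import product
from math import exp, isclose


def occupations(num_modes, total):
    """All tuples of non-negative integers of length num_modes summing to total."""
    return [n for n in product(range(total + 1), repeat=num_modes) if sum(n) == total]


def weight(eps, n):
    """Boltzmann weight exp(-sum_k eps_k n_k) of one occupation-number configuration."""
    return exp(-sum(e * nk for e, nk in zip(eps, n)))


def partition_function(eps, total):
    """Sum of the weights over occupation numbers with the given total, cf. (5.17)."""
    return sum(weight(eps, n) for n in occupations(len(eps), total))


def prob_at_least(eps, total, mode, j):
    """P(n_mode >= j) under the measure (5.16), by direct enumeration."""
    restricted = sum(weight(eps, n) for n in occupations(len(eps), total) if n[mode] >= j)
    return restricted / partition_function(eps, total)


eps = [0.0, 0.7, 1.9]   # hypothetical energies; eps[0] = 0 plays the role of the zero mode
N = 6
for mode in range(len(eps)):
    for j in range(N + 1):
        lhs = prob_at_least(eps, N, mode, j)
        rhs = exp(-eps[mode] * j) * partition_function(eps, N - j) / partition_function(eps, N)
        assert isclose(lhs, rhs, rel_tol=1e-12)
```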
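Since $P(n_k = j) = P(n_k \ge j) - P(n_k \ge j + 1)$, we find for $b > 0$,
$$E_{\Lambda,N}\bigl(e^{\nu n_0}\bigr) = \frac{1}{Z'(\Lambda,N)} \sum_{j=0}^{N} e^{\nu j} \bigl(Z'(\Lambda,N-j) - Z'(\Lambda,N-j-1)\bigr) = \frac{e^{\nu N}}{Z'(\Lambda,N)} \sum_{j=0}^{N} e^{-\nu j} \bigl(Z'(\Lambda,j) - Z'(\Lambda,j-1)\bigr). \tag{A.2}$$
Here, we used the convention $Z'(\Lambda,-1) := 0$ and the fact that $P_{\Lambda,N}(n_k \ge N + 1) = 0$. Putting in $N = \rho V$ and setting $\nu = \lambda/V$ we obtain
$$E_{\Lambda,\rho V}\bigl(e^{\lambda n_0/V}\bigr) = \frac{e^{\lambda\rho}}{Z'(\Lambda,\rho V)} \sum_{j=0}^{\rho V} e^{-\frac{\lambda}{V} j} \bigl(Z'(\Lambda,j) - Z'(\Lambda,j-1)\bigr). \tag{A.3}$$
Above, we wrote $\rho V$ instead of $\lfloor \rho V \rfloor$ and we will continue to do so, to simplify notation. Now comes the clever insight of Buffet and Pulé [6]: in (A.3), both $Z'(\Lambda,\rho V)$ and the sum over $j$ can be written as integrals with respect to a purely atomic, $V$-dependent measure $\mu_\Lambda$ on $\mathbb{R}_+$; since the functions that are being integrated will not depend on $V$, we only need to study the limit of $\mu_\Lambda$. The measure $\mu_\Lambda$ is given by
$$\mu_\Lambda := C_\Lambda \sum_{j=0}^{\infty} \bigl(Z'(\Lambda,j) - Z'(\Lambda,j-1)\bigr)\, \delta_{j/V}$$
on $\mathbb{R}_+$. Here $\delta_x$ denotes the Dirac measure at $x$, and the constant $C_\Lambda$ will be fixed later in order to obtain a limit measure. From (A.3) it is now immediate that
$$E_{\Lambda,\rho V}\bigl(e^{\lambda n_0/V}\bigr) = e^{\lambda\rho}\, \frac{\int 1_{[0,\rho]}(x)\, e^{-\lambda x}\, \mu_\Lambda(dx)}{\int 1_{[0,\rho]}(x)\, \mu_\Lambda(dx)}. \tag{A.4}$$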
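What makes the idea work is that we can actually calculate the Laplace transform of $\mu_\Lambda$ and take the limit. We have
$$\begin{aligned} \int_0^\infty e^{-\lambda x}\, \mu_\Lambda(dx) &= C_\Lambda \sum_{j=0}^{\infty} e^{-\frac{\lambda j}{V}} \bigl(Z'(\Lambda,j) - Z'(\Lambda,j-1)\bigr) \\ &= C_\Lambda \bigl(1 - e^{-\frac{\lambda}{V}}\bigr) \sum_{j=0}^{\infty} e^{-\frac{\lambda j}{V}} Z'(\Lambda,j) \\ &= C_\Lambda \bigl(1 - e^{-\frac{\lambda}{V}}\bigr) \exp\Bigl( -\sum_{k\in\Lambda^*} \log\bigl(1 - e^{-\frac{\lambda}{V} - \varepsilon(k)}\bigr) \Bigr) \\ &= C_\Lambda \exp\Bigl( -\sum_{k\in\Lambda^*\setminus\{0\}} \log\bigl(1 - e^{-\frac{\lambda}{V} - \varepsilon(k)}\bigr) \Bigr). \end{aligned}$$
The second equality above is just an index shift, the third is the well-known formula for the pressure of the ideal Bose gas, Eq. (7.11), and the last line follows from $\varepsilon(0) = 0$. The sum in the exponent of the last line is actually quite manageable. By the fundamental theorem of calculus we have for each $k \neq 0$,
$$\log\bigl(1 - e^{-\frac{\lambda}{V} - \varepsilon(k)}\bigr) = \frac{1}{V} \int_0^\lambda \frac{1}{e^{y/V + \varepsilon(k)} - 1}\, dy + \log\bigl(1 - e^{-\varepsilon(k)}\bigr),$$
and summation over $k$ gives
$$\sum_{k\in\Lambda^*\setminus\{0\}} \log\bigl(1 - e^{-\frac{\lambda}{V} - \varepsilon(k)}\bigr) = \sum_{k\in\Lambda^*\setminus\{0\}} \log\bigl(1 - e^{-\varepsilon(k)}\bigr) + \lambda\rho_{c,\Lambda}, \tag{A.5}$$
with
$$\rho_{c,\Lambda} = \frac{1}{\lambda} \int_0^\lambda \frac{1}{V} \sum_{k\in\Lambda^*\setminus\{0\}} \frac{1}{e^{y/V + \varepsilon(k)} - 1}\, dy.$$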
The first term in (A.5) diverges as V → ∞ and defines C . The second term converges to the critical density ρc : the integrand is decreasing as a function of y and converges to ρc as a Riemann sum for each fixed y, since 1 1 1 1 1 1 . e−y/V ε(k) y/V +ε(k) ε(k) − 1 V V V e − 1 e − 1 e ∗ ∗ ∗ k∈ \{0}
k∈ \{0}
k∈ \{0}
Dominated convergence in $y$ now proves convergence to $\rho_c$. We have thus shown that for all $\lambda > 0$,
$$\lim_{V\to\infty} \int_0^\infty e^{-\lambda x}\, \mu_\Lambda(dx) = e^{-\lambda\rho_c},$$
and thus by the general theory of Laplace transforms $\mu_\Lambda \to \delta_{\rho_c}$ weakly. When used in (A.4), this shows the claim for $\rho > \rho_c$. For $\rho < \rho_c$, both denominator and numerator go to zero, and we need a different argument: we note that by what we have just proved
$$\lim_{\rho \searrow \rho_c} \lim_{V\to\infty} E_{\Lambda,\rho V}\bigl(e^{\lambda n_0/V}\bigr) = 1.$$
Since the expectation above can never be less than one, all we need to show is monotonicity in ρ, i.e.
$$P_{\Lambda,N+1}(n_0 \ge j) \ge P_{\Lambda,N}(n_0 \ge j). \tag{A.6}$$
For this we use (A.1) and we obtain
$$P_{\Lambda,N+1}(n_k \ge j) = P_{\Lambda,N}(n_k \ge j)\, \frac{P_{\Lambda,N+1}(n_k \ge 1)}{P_{\Lambda,N-j+1}(n_k \ge 1)},$$
so it will be enough to show (A.6) for $j = 1$. By (A.1) this means we have to show $Z'(\Lambda,N)^2 \ge Z'(\Lambda,N-1)\, Z'(\Lambda,N+1)$. Davies [7] showed that the finite volume free energy is convex, which proves the inequality above.
Proof of Theorem A.1 when ρc = ∞. We get from Eq. (A.2), after some rearrangements of the terms, E ,N ( eλn 0 /V ) − e−λ/V
N −λ/V λρ = (1 − e ) e e−λi/V e−V [q (i/V )−q (N /V )] . i=0
(A.7) q (ρ) is convex and its limit q(ρ) is strictly decreasing for ρ < ρc (and ρc = ∞ here). Besides, we have q (i/V ) − q (N /V ) b > 0, for all 0 i N /2, uniformly in V and N = ρV . The right side of (A.7) is then less than ⎡ ⎤ N /2 N e−V b + e−λi/V ⎦. (1 − e−λ/V ) eλρ ⎣ i=N /2
i=0
This clearly vanishes in the limit V → ∞.
Recall the definition of the typical set of occupation numbers Aη in (5.24). Proposition A.2. For any density ρ, and any η > 0, lim P,ρV (Aη ) = 1.
V →∞
Proof. Let us introduce the following sets of unlikely occupation numbers: , + A(1) = (n k ) : n 0 /V − ρ0 > η , ⎧ ⎫ ⎨ ⎬ A(2) = (n k ) : n k ηV , ⎩ ⎭ 0 0 and q ≥ 0 such that − ln r = lim
N →∞
1 1 ln C N = sup ln C N , q = lim Cn r n . n→∞ N N N
Moreover, r satisfies the bound ⎛ ⎜ p ⎜e ⎝
n 1 ,...,n p ∈Z: n 1 +...+n p =0
⎞−1 ⎟ p eγ π2 exp − 2 (n 21 + . . . + n 2p ) ⎟ ≤ r p 1− 2 ( √ )1− p ≤ 1. ⎠ pγ π
(43)
Proof. We know already that − ln r = limn n −1 log Cn exists, see (38). The strict positivity of r will follow from Ineq. (43). The observation of the previous subsection yields the existence of q = limn Cn r n ; note that α1 = ||ψ0 ||2 = 1 so that the sequence of activities (αn ) fulfills the aperiodicity condition (42). Thus it remains to prove Ineq. (43). The√idea is to use the representation (4) of N as the p th power of a determinant times 1/ N ! and to give lower and upper bounds on C N using Hölder’s and Hadamard’s inequalities. These inequalities have already been used in [FGIL], Sect. 3.3., to derive bounds on the free energy of jellium on a sphere. We start with the application of Hadamard’s inequality, which gives N 1 | N (z 1 , . . . , z N )|2 ≤ N!
j=1
N −1 k=0
p |ϕk (z j )|2
.
It follows that CN
⎛ p ⎞N N −1 1 ⎝ ∞ 1 (s − pkγ )2 ) ds ⎠ . ≤ exp(− √ N! p π −∞ k=0
The integral from −∞ to − pγ /2 and from (N − 1/2) pγ to ∞ can be bounded by an N -independent constant. The integral from − pγ /2 to (N − 1/2) pγ is bounded from above by pγ ∞ 1 (x − pkγ )2 p . (44) f (x) dx, f (x) := exp − √ N p π 0 k=−∞
Representing f as a Fourier series via Poisson’s summation formula, we find that the first expression in (44) equals N b(γ ) with √ p−1 p π π2 b(γ ) = p 1− 2 exp − 2 (n 21 + . . . + n 2p ) . γ pγ n 1 ,...,n p ∈Z: n 1 +...+n p =0
Thus we get C N ≤ (N b(γ ) + c) N /N !, from which the lower bound on r is easily obtained. Now we turn to a lower bound for C N . With Hölder’s inequality written as p g p ≥ || p−1 g ,
applied to the domain of integration ([− pγ /2, (N − 1/2) pγ ] × [0, 2π γ −1 ]) N , we find √ N p−1 N −1 p N! π √ pN 1 − N −k− 1 − k+ 1 , p CN ≥ N 2 2 (N pγ ) k=0
√
where m = [erfc(m pγ )]/2 and erfc is the complementary error function. The product over k does not contribute to lim N −1 log C N . Making use of Stirling’s formula, we obtain the desired upper bound to r . Remark. The bounds (43) lead to a statement on the thick cylinder asymptotics of r : r = O(γ p−1 ) as γ → 0. This complements the thin cylinder (γ → ∞) asymptotics given in Eq. (48) below. Lemma 4 leaves open the question whether q > 0 or q = 0, i.e., whether the associated renewal process has finite or infinite mean. In order to answer this question, it is useful to have a closer look at the activity (αn ). Lemma 5 (γ -dependence of the activity). Monomers have activity α1 = 1. The activity of a polymer of length N ≥ 2 is a polynomial of exp(−γ 2 ) with minimal degree p(N − 1) and coefficient in N0 . In particular, p(N −1) 2 as γ → ∞. α N = O exp(−γ )
Hence, in the thin cylinder limit, only monomers have a non-vanishing activity. Proof. The monomer functions are u {k} (z) = ψ pk (z), whence α1 = 1. Combining Eqs. (20) and (36), we find N 2 2 2 2 αN = |b N (m)|2 (e−γ ) j=1 ( p ( j−1) −m j ) . (45) m irreducible
By Eq. (32), the expansion coefficients b N (m) are sums of signs of permutations and therefore integers. The proof of the lemma is concluded by the following observation: if m 1 ≤ . . . ≤ m N is N -admissible and irreducible, N ( p 2 ( j − 1)2 − m 2j ) ≥ p(N − 1).
(46)
j=1
Due to N-admissibility, we can write $m_j = p(j-1) + \nu_j - \nu_{j-1}$ with $\nu_0 = \nu_N = 0$ and $\nu_1, \dots, \nu_{N-1} \ge 0$ (see also [FGIL], Property 3). $\nu_k$ is just $\sum_{j=1}^{k} [m_j - p(j-1)]$. Because of irreducibility, $\nu_1, \dots, \nu_{N-1}$ must be strictly positive. In the left-hand side of (46), we insert the expression of $m_j$ in terms of $\nu_k$ and perform a summation by parts, and obtain Ineq. (46).

Now recall that q > 0 if and only if $\sum_n \alpha_n r^n = 1$ and $\sum_n n\,\alpha_n r^n < \infty$. The crucial observation is that these two conditions are automatically fulfilled when the generating series of $(\alpha_n)$ has a radius of convergence $R_\alpha$ strictly larger than the radius of convergence r of the power series with coefficients $(C_n)$. Therefore we are going to compare domains of convergence. For a monomer system with $\alpha_1 = 1$ (and $\alpha_n = 0$ for n ≥ 2), the quantities are trivial to compute:
$$C_n \equiv 1, \qquad r = 1, \qquad R_\alpha = \infty, \qquad q = 1. \tag{47}$$
We will show that on sufficiently thin cylinders, the quantities $r$, $R_\alpha$, $q$ take values close to the monomer values (47). For this purpose it is useful to keep track of the γ-dependence in the notation. By Lemma 5 and Eq. (37), the activity and the normalization constants are polynomials of $e^{-\gamma^2}$ with coefficients in $\mathbb{N}$. Therefore we write $\alpha_n(e^{-\gamma^2})$, $C_n(e^{-\gamma^2})$. The power series
$$C(t, e^{-\gamma^2}) := 1 + \sum_{n=1}^{\infty} C_n(e^{-\gamma^2})\, t^n, \qquad A(t, e^{-\gamma^2}) := t + \sum_{n=2}^{\infty} \alpha_n(e^{-\gamma^2})\, t^n$$
are actually power series of two variables, t and $u = e^{-\gamma^2}$. They have non-negative integer coefficients and are related through
$$C(t, u) = \frac{1}{1 - A(t, u)},$$
see Eq. (40). The curves r = r (u) and Rα = Rα (u) delimit the domains of convergence of C(t, u) and A(t, u), see Fig. 3. The following theorem states that the curve r (u) stays strictly below R(u), at least for small u (large γ ), and r (u) and q(u) converge to the monomer values r = q = 1 in the limit of thin cylinders (u → 0).
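The relation C(t, u) = 1/(1 − A(t, u)) is equivalent to the renewal recursion $C_n = \sum_{k=1}^{n} \alpha_k C_{n-k}$ with $C_0 = 1$, which makes the quantities r and q easy to explore numerically. The following is a minimal sketch, not the authors' method: the function names are hypothetical, the activities are supplied by hand as a finite dictionary, and the second activity set used in the usage note is a toy choice that is not derived from Laughlin's state.

```python
def normalization_constants(alpha, n_max):
    """Return [C_0, ..., C_{n_max}] from activities alpha = {n: alpha_n}.

    Uses C(t) = 1 / (1 - A(t)), i.e. C_n = sum_{k=1}^{n} alpha_k * C_{n-k}
    with C_0 = 1.
    """
    C = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        C[n] = sum(alpha.get(k, 0) * C[n - k] for k in range(1, n + 1))
    return C


def solve_r(alpha, iterations=200):
    """Solve A(r) = 1 on (0, 1] by bisection.

    Assumes finitely many activities, alpha_1 = 1 and alpha_n >= 0, so that
    A is increasing with A(0) = 0 and A(1) >= 1.
    """
    def A(t):
        return sum(a * t ** n for n, a in alpha.items())

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if A(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def renewal_q(alpha, r):
    """Inverse mean of the renewal process: q = 1 / sum_n n * alpha_n * r^n."""
    return 1.0 / sum(n * a * r ** n for n, a in alpha.items())
```

For the monomer system `{1: 1}` this reproduces the values (47). For the toy activities `{1: 1, 2: 1}` the constants $C_n$ are Fibonacci numbers, and $r^n C_n$ can be compared directly with `renewal_q`.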
Symmetry Breaking in Laughlin’s State on a Cylinder
Fig. 3. Domains of convergence of A(t, u) and C(t, u) for p ≥ 2. The curve R(u) delimits the domain of convergence of A, r(u) the domain of convergence of C. Both series diverge when u ≥ 1. We know that $r(u) = 1 + O(u)$ and $R(u) \ge \mathrm{const} \cdot u^{-p}$ as u → 0. When $u < \exp(-\gamma_p^2)$, r(u) < R(u). It is an open question whether the curves r and R touch for some $u = \exp(-\gamma_p^2)$ strictly below 1.
Theorem 1. Let p ≥ 2 be fixed. Let r, q be such that $r^N C_N \to q$ as in Lemma 4. The following holds:
1. There exists a $\gamma_p > 0$ such that for $\gamma > \gamma_p$, $r < R_\alpha$.
2. The functions $]\gamma_p, \infty[\ \ni \gamma \mapsto r, q$ are analytic and strictly positive. As γ → ∞,
$$r = 1 + O(e^{-\gamma^2}), \qquad q = 1 + O(e^{-\gamma^2}). \tag{48}$$
Proof. 1. Let 0 < u ≤ v < 1. By Lemma 5, $\alpha_n(u) = \sum_{m \ge p(n-1)} b_{mn} u^m \le (u/v)^{p(n-1)} \alpha_n(v)$ with suitable non-negative integers $b_{mn}$. It follows that $u^p R_\alpha(u) \ge v^p R_\alpha(v)$. We fix v and let u → 0. Since $R_\alpha(v) \ge r(v) > 0$, we obtain that $R_\alpha(u)$ goes to infinity when u → 0, as expected from Eq. (47). On the other hand, we know that $C_N \ge \alpha_1^N = 1$, hence r(u) ≤ 1. Thus for sufficiently small $u = \exp(-\gamma^2)$, $r(u) \le 1 < R_\alpha(u)$.

2. The positivity of r was proved in Lemma 4. The positivity of q is a consequence of $r < R_\alpha$. Now, notice that the power series A(t, u) defines a holomorphic function of two complex variables in the domain $|t| < R_\alpha(|u|)$. For $0 \le u < \exp(-\gamma_p^2)$, r(u) is the unique solution of A(r(u), u) = 1. By Lemma 5, A(t, 0) = t. Thus the thin cylinder limit corresponds to the point (r(0), 0) = (1, 0). We can apply an implicit function theorem for holomorphic functions to obtain the analyticity of r. The analyticity of q follows from
$$q(u) = \big(r(u)\,(\partial_t A)(r(u), u)\big)^{-1} = \Big(\sum_n n\,\alpha_n(u)\, r(u)^n\Big)^{-1}. \tag{49}$$
Both r(u) and q(u) can be extended to holomorphic functions in a complex neighborhood of u = 0 and take the value 1 at 0, whence Eq. (48).
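To see the mechanism of Theorem 1 in closed form, one can work through a toy activity that is not taken from the paper: suppose, purely as an assumption, that $\alpha_n(u) = u^{p(n-1)}$, the lowest-degree term allowed by Lemma 5 with coefficient 1. Everything is then explicit:

```latex
\begin{align*}
A(t,u) &= \sum_{n\ge 1} u^{p(n-1)} t^n = \frac{t}{1-u^p t},
  & R_\alpha(u) &= u^{-p}, \\
C(t,u) &= \frac{1}{1-A(t,u)} = \frac{1-u^p t}{1-(1+u^p)\,t},
  & C_n(u) &= (1+u^p)^{n-1} \quad (n \ge 1), \\
r(u) &= \frac{1}{1+u^p} < u^{-p} = R_\alpha(u),
  & q(u) &= \lim_{n\to\infty} r(u)^n C_n(u) = \frac{1}{1+u^p}.
\end{align*}
```

In this toy case r(u) and q(u) are analytic near u = 0, both equal 1 at u = 0, and $R_\alpha(u) \to \infty$ as u → 0, in line with the statement of the theorem. The inequality $r < R_\alpha$ happens to hold for every u > 0 here, which need not be true for the actual activities.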
Theorem 1 is the central technical result of the present work, as all our results on the one particle density will rely on the condition q > 0.

3.4. Thermodynamic limits of correlation functions and symmetries. In this subsection, we show that Laughlin's state has a unique thermodynamic limit (Theorem 2) and that the limiting state is periodic in the axial direction, with pγ as one of its periods. The proof that pγ is actually the smallest period is deferred to the next subsection. The results presented here hold provided $\lim r^n C_n = q > 0$, i.e., the associated renewal process has finite mean. From the previous subsection, we know that this condition is indeed fulfilled on sufficiently thin cylinders.

In the following, $\mathcal{A}$ is the $C^*$-algebra generated by the fermionic creation and annihilation operators $c^*(f)$, $c(g)$, with $f, g \in L^2(\mathbb{R} \times [0, 2\pi/\gamma])$. The operators associated with the lowest Landau level basis state $\psi_k$ are denoted $c_k^*$, $c_k$.

Theorem 2 (Existence of the thermodynamic limit). Suppose $r^n C_n \to q > 0$. There is a state $\langle\cdot\rangle$ such that for every sequence of integers $(a_N)$ such that $a_N \to -\infty$ and $N + a_N \to \infty$, the states associated with the shifted Laughlin functions $\tilde\Psi_N = t(a_N p\gamma e_x)^{\otimes N} \Psi_N$ converge to $\langle\cdot\rangle$: for all $a \in \mathcal{A}$,
$$\langle a\rangle_N = \frac{1}{C_N} \langle \tilde\Psi_N, a\,\tilde\Psi_N\rangle \xrightarrow[N\to\infty]{} \langle a\rangle. \tag{50}$$
Proof. For $L = \{\ell_1 < \dots < \ell_r\} \subset \mathbb{Z}$, let $c_L := c_{\ell_1} \cdots c_{\ell_r}$ and $c_L^* := (c_L)^*$. It is enough to prove the convergence (50) for operators $a = c_{L'}^* c_L$ with |L'| = |L|. The key idea of the proof is to show a formula similar to the one given in Prop. 1 for the solvable model, and then to use the asymptotics of the normalization constants, just as we did in the proof of Corollary 1. Let $b_N := N + a_N$. We will see that $\langle c_{L'}^* c_L\rangle_N$ can be written as
$$\langle c_{L'}^* c_L\rangle_N = \sum_{n=1}^{N} \sum_{j=a_N}^{b_N - n} \frac{C_{j-a_N}\, C_{b_N - j - n}}{C_N}\, f_n(L' - pj, L - pj) \tag{51}$$
for a suitable N-independent family of functions $(f_n)_{n\in\mathbb{N}}$. We use the notation $L - pj = \{\ell_1, \dots, \ell_r\} - pj = \{\ell_1 - pj, \dots, \ell_r - pj\}$. The $f_n$'s have finite support,
$$f_n(L', L) \ne 0 \ \Rightarrow\ L' \cup L \subset \{0, \dots, pn - p\}, \tag{52}$$
and non-negative "diagonal" values: $\forall L \subset \mathbb{Z}:\ f_n(L, L) \ge 0$. For fixed j and n, $C_{j-a_N} C_{b_N - j - n}/C_N \to q r^n$ due to $C_N r^N \to q > 0$. Thus formally, the right-hand side of Eq. (51) converges to
$$\langle c_{L'}^* c_L\rangle = \sum_{n=1}^{\infty} q r^n \sum_{j=-\infty}^{\infty} f_n(L' - pj, L - pj), \tag{53}$$
provided the series converges.
We prove the theorem in three steps. First, we define the auxiliary functions $f_n$ and prove the representation (51) of the correlation functions. Second, we look at "diagonal" correlation functions (L' = L) and show that the series (53) is bounded and equals the limit of correlation functions:
$$\langle c_L^* c_L\rangle \le 1, \qquad \lim_{N\to\infty} \langle c_L^* c_L\rangle_N = \langle c_L^* c_L\rangle.$$
As a last step, we turn to off-diagonal values (L' ≠ L). We prove that the series (53) is absolutely convergent:
$$\sum_{j,n} q r^n\, |f_n(L' - pj, L - pj)| \le \langle c_{L'}^* c_{L'}\rangle^{1/2}\, \langle c_L^* c_L\rangle^{1/2} \le 1, \tag{54}$$
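and show that $\langle c_{L'}^* c_L\rangle_N \to \langle c_{L'}^* c_L\rangle$. Note that once we know that Eq. (53) defines a state on $\mathcal{A}$, the inequality (54) with absolute value bars outside the sum is just Cauchy-Schwarz for the state $\langle\cdot\rangle$.

1. Representation of correlation functions. Let $L, L' \subset \mathbb{Z}$ with |L'| = |L|. We start with the representation
$$\langle \Psi_N, c_{L'}^* c_L \Psi_N\rangle = \sum_{m', m} \overline{a_N(m')}\, a_N(m)\, \langle \psi_{m'_1} \wedge \dots \wedge \psi_{m'_N},\ c_{L'}^* c_L\, \psi_{m_1} \wedge \dots \wedge \psi_{m_N}\rangle. \tag{55}$$
The sum ranges over N-admissible sequences m, m'. Suppose m and m' have common renewal points s, t such that $L \cup L' \subset \{ps, \dots, pt - p\}$. Then
$$\begin{aligned}
\overline{a_N(m')}\, a_N(m)\, &\langle \psi_{m'_1} \wedge \dots \wedge \psi_{m'_N},\ c_{L'}^* c_L\, \psi_{m_1} \wedge \dots \wedge \psi_{m_N}\rangle \\
&= \prod_{j \in \{1,\dots,s\} \cup \{t,\dots,N\}} \delta_{m_j, m'_j}\ |a_s(m_1^s)|^2\, |a_{N-t}(m_{t+1}^N - pt)|^2 \\
&\quad\cdot \overline{a_{t-s}(m'^{\,t}_{s+1} - ps)}\, a_{t-s}(m_{s+1}^t - ps) \\
&\quad\cdot \langle \psi_{m'_{s+1}} \wedge \dots \wedge \psi_{m'_t},\ c_{L'}^* c_L\, \psi_{m_{s+1}} \wedge \dots \wedge \psi_{m_t}\rangle.
\end{aligned} \tag{56}$$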
Let M be the set of pairs (m, m') such that
1. m, m' are both N-admissible;
2. m and m' have no common renewal point s below or above $p^{-1}(L \cup L')$.
By "s is below $p^{-1}(L \cup L')$" we mean $L \cup L' \subset \{0, \dots, ps - p\}$, and we say "s is above $p^{-1}(L \cup L')$" when $L \cup L' \subset \{ps, \dots, pn - p\}$. The set M consists of the pairs (m, m') for which no simplification of the type (56) is possible. $f_N(L', L)$ is defined by the sum (55), except that the summation includes only (m, m') from M. With this definition, combining (55) and (56) we obtain (51).

2. "Diagonal" correlation functions. For L = L', the definition of $f_n$ gives
$$f_N(L, L) = \sum_{m} |a_N(m)|^2\, \chi_{L \subset \{m_1, \dots, m_N\}},$$
where the sum ranges over N-admissible sequences that are L-irreducible. In particular, $f_N(L, L) \ge 0$. Moreover, if $f_N(L, L) \ne 0$, there exists an N-admissible sequence m such that $L \subset \{m_1, \dots, m_N\}$, whence $L \subset \{0, \dots, pN - p\}$.
Let $d \in \mathbb{N}$. From (51) and $f_n(L, L) \ge 0$ we get
$$\sum_{n=1}^{d} \sum_{j=a_N}^{b_N - n} \frac{C_{j-a_N}\, C_{b_N - j - n}}{C_N}\, f_n(L - pj, L - pj) \le \langle c_L^* c_L\rangle_N \le 1. \tag{57}$$
If $f_n(L - pj, L - pj) \ne 0$ we must have $L - pj \subset \{0, \dots, pn - p\}$, thus only a finite, N-independent number of j's contribute to the sum and we can take the limit N → ∞, which gives
$$\sum_{n=1}^{d} \sum_{j=-\infty}^{\infty} q r^n f_n(L - pj, L - pj) \le 1.$$
Letting d → ∞, we obtain the bound $\langle c_L^* c_L\rangle \le 1$. The proof of $\langle c_L^* c_L\rangle_N \to \langle c_L^* c_L\rangle$ is then completed by an ε/3 argument. We leave the details to the reader and mention only a useful inequality on quotients of normalization constants. Using the supermultiplicativity of $(C_N)$, $0 < C_n r^n \le 1$ and $C_n r^n \to q > 0$, we get $\inf_n r^n C_n =: c > 0$ and
$$\frac{C_j\, C_{N-j-n}}{C_N} \le \frac{C_{N-n}}{C_N} \le \frac{r^{-(N-n)}}{c\, r^{-N}} = \frac{1}{c}\, r^n.$$

3. "Off-diagonal" correlation functions (L ≠ L'). The procedure is similar to Step 2, but the analogue of the bound (57) is slightly more delicate to obtain. Let $d \in \mathbb{N}$. Then
$$\begin{aligned}
\sum_{n=1}^{d} \sum_{j=a_N}^{b_N - n} \frac{C_{j-a_N}\, C_{b_N - n - j}}{C_N}\, |f_n(L' - pj, L - pj)|
&\le \sum_{n=1}^{N} \sum_{j=a_N}^{b_N - n} \frac{C_{j-a_N}\, C_{b_N - n - j}}{C_N}\, |f_n(L' - pj, L - pj)| \\
&= C_N^{-1} \sum_{m, m'} \big| a_N(m')\, a_N(m)\, \langle \psi_{m'_1} \wedge \dots \wedge \psi_{m'_N},\ c_{L'}^* c_L\, \psi_{m_1} \wedge \dots \wedge \psi_{m_N}\rangle \big| \\
&= C_N^{-1} \sum_{K \subset \mathbb{Z},\ |K| = N - |L|} \big| a_N(L' \cup K)\, a_N(L \cup K) \big| \\
&\le C_N^{-1} \Big( \sum_K |a_N(L' \cup K)|^2 \Big)^{1/2} \Big( \sum_K |a_N(L \cup K)|^2 \Big)^{1/2} \\
&= \langle c_{L'}^* c_{L'}\rangle_N^{1/2}\, \langle c_L^* c_L\rangle_N^{1/2} \le 1.
\end{aligned}$$
The notation $a_N(L \cup K)$ refers to the amplitude of the increasing sequence obtained by rearranging the elements of L ∪ K. Letting first N and then d go to infinity, we obtain the bound (54). The convergence $\langle c_{L'}^* c_L\rangle_N \to \langle c_{L'}^* c_L\rangle$ can be shown with an ε/3 argument.
Remark. The representation (53) of correlation functions does not lend itself to a simple interpretation. However, it leads to a very nice formula for the one particle density. Let $\hat n_k := c_k^* c_k$ be the number operator for the lowest Landau level state $\psi_k$. The quantity $f_n(\{k\}, \{k\})$ equals $\langle u_{X_0}, \hat n_k u_{X_0}\rangle$ with the polymer $X_0 = \{0, \dots, pn - p\}$. Therefore Eq. (53) may be rewritten as
$$\langle \hat n_k\rangle = \sum_X \rho^P(X)\, v_X(k) \tag{58}$$
with $\rho^P(X) = q\, \alpha_{N(X)}\, r^{N(X)}$ and $v_X(k) = \langle u_X, \hat n_k u_X\rangle / \|u_X\|^2$. The sum is over all polymers $X = \{j, \dots, j + n - 1\}$, $j \in \mathbb{Z}$, $n \in \mathbb{N}$. This formula has an intuitive probabilistic interpretation: $v_X(k)$ is the probability of finding a particle in the "site" k, given that k is contained in the polymer X (or, strictly speaking, in $\{p \min X, \dots, p \max X\}$), which happens with probability $\rho^P(X)$.⁵ Together with a similar formula for two-point correlations $\langle \hat n_k \hat n_j\rangle$, Eq. (58) will serve as a useful guide in Sect. 3.5 when we investigate clustering properties.
Now let us turn to the symmetries of Laughlin's state. Let $\tau_x$ be the automorphism of the algebra $\mathcal{A}$ associated with the magnetic translation $t(\gamma e_x)$. Let $\tau_y^a$ be the morphism associated with the translation $t(a e_y)$ in the y-direction, and $\tau_s$ the morphism induced by the reversal $(s_0 \psi)(z) = \psi(-z)$.

Proposition 3 (Symmetries). The state $\omega(\cdot) = \langle\cdot\rangle$ of the previous theorem is invariant with respect to reversal, translations in the x-direction by multiples of pγ, and arbitrary translations in the y-direction:
$$\forall n \in \mathbb{Z},\ \forall a \in \mathbb{R}: \quad \omega = \omega \circ \tau_s = \omega \circ \tau_x^{np} = \omega \circ \tau_y^a.$$

Proof. The invariance with respect to y-translations is a direct consequence of the fact that $\Psi_N$ has a definite y-momentum, see Eq. (21). The reversal invariance follows from the invariance for finitely many particles (22), see also [ŠWK]. The periodicity with respect to magnetic translations in the direction along the cylinder follows from the representation (53) of correlation functions.

Theorem 2 and Prop. 3 lead to a simple corollary on the one-particle density:

Corollary 2. Let $\rho_N(z)$ be the one-particle density of Laughlin's state $\Psi_N$. Under the assumptions of Theorem 2, the shifted density converges pointwise to the one-particle density ρ(z) of the limiting state $\langle\cdot\rangle$:
$$\lim_{N\to\infty} \rho_N(z - pN/2\gamma) = \rho(z), \qquad \rho(z) = \sum_{k=-\infty}^{\infty} \langle \hat n_k\rangle\, |\psi_k(z)|^2. \tag{59}$$
The density is independent of the coordinate y = Im z around the cylinder. The density as well as the occupation numbers are periodic and reversal invariant:
$$\rho(x + p\gamma) = \rho(x), \quad \langle \hat n_{k+p}\rangle = \langle \hat n_k\rangle, \quad \rho(-x) = \rho(x), \quad \langle \hat n_{-k}\rangle = \langle \hat n_k\rangle.$$
Note that weak*-convergence of the state $\langle\cdot\rangle_N$ is replaced with pointwise convergence of the one-particle density. This uses the representation of the density as a sum of Gaussians with occupation numbers as coefficients as in Eq. (23). Due to the good localization of the Gaussians, summation and limits can be interchanged, whence Eq. (59).

⁵ The notation $\rho^P$ refers to polymer correlation functions as defined in [GK]. Later, we shall use not only $\rho^P(X)$ but also $\rho^P(X, Y)$, see material following Remarks of Subsect. 3.5.
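The periodicity statements of Corollary 2 can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration: $|\psi_k(z)|^2$ is modelled by an unnormalized Gaussian $\exp(-(x - k\gamma)^2)$ centred at x = kγ (Eq. (23) is not reproduced in this section), and the occupation numbers are a hypothetical p-periodic, reversal-symmetric toy profile rather than the true values for Laughlin's state.

```python
import math


def toy_occupation(k, p, eps=0.05):
    """Hypothetical occupation numbers: close to 1 on multiples of p, small elsewhere."""
    return 1.0 - eps if k % p == 0 else eps


def density(x, gamma, p, occupation, k_cut=200):
    """rho(x) = sum_k <n_k> |psi_k|^2, with |psi_k|^2 modelled as exp(-(x - k*gamma)^2).

    The Gaussian profile and the truncation at |k| <= k_cut are assumptions of
    this sketch, not formulas taken from the paper.
    """
    return sum(
        occupation(k, p) * math.exp(-(x - k * gamma) ** 2)
        for k in range(-k_cut, k_cut + 1)
    )
```

With `gamma = 2.0` and `p = 3`, the toy density repeats after a shift by `p * gamma` and is even in x, while a shift by `gamma` alone changes it noticeably. This is the kind of period enlargement that the next subsection establishes for the actual state on thin cylinders.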
3.5. Symmetry breaking and clustering. This subsection contains the second part of the main results of this paper: Theorem 3 shows that on sufficiently thin cylinders, pγ is actually the smallest period of the limiting state $\langle\cdot\rangle$ as well as the one-particle density ρ of the previous subsection. Thus the state $\langle\cdot\rangle$ has a larger minimal period than the Hamiltonian describing interacting electrons in a magnetic field, whose ground states it is supposed to approximate. In this sense, there is symmetry breaking. In addition, we prove that the state $\langle\cdot\rangle$ is mixing with respect to magnetic translations in the direction of the cylinder axis (Theorem 4).

Theorem 3 (Symmetry breaking). Suppose $C_n r^n \to q > 0$. Let ρ(x) be the infinite cylinder density from Cor. 2. Then on sufficiently thin cylinders, pγ is the smallest period of ρ(x).

Proof. Due to Eq. (59) and Lemma 2, it is enough to look at the occupation numbers. We will show that
$$\langle \hat n_k\rangle = \begin{cases} 1 + O(\exp(-\gamma^2)), & \text{if } k \in p\mathbb{Z}, \\ O(\exp(-\gamma^2)), & \text{else.} \end{cases} \tag{60}$$
Thus for sufficiently large γ, the sequence of occupation numbers has p as the smallest period and pγ is the smallest period of the one-particle density. The idea behind (60) is that the thin cylinder limit is at the same time a monomer limit, see Lemma 5 and the material following it. The wave function corresponding to a pure monomer system, for N particles, is
$$u_{\{0\}} \wedge u_{\{1\}} \wedge \dots \wedge u_{\{N-1\}} = \psi_0 \wedge \psi_p \wedge \dots \wedge \psi_{pN-p}. \tag{61}$$
In the limit N → ∞, the corresponding monomer occupation numbers $\langle \hat n_k\rangle_{\mathrm{mon}}$ equal 1 if k is a multiple of p, and 0 otherwise. Equation (60) now is a consequence of the following observation: the occupation numbers $\langle \hat n_k\rangle$ are functions of $v = \exp(-\gamma^2)$ that can be extended to holomorphic functions of v in a complex neighborhood of the monomer point v = 0. This can be shown with the representation
$$\langle \hat n_k\rangle = \sum_{n=1}^{\infty} q r^n \sum_{j=-\infty}^{\infty} f_n(\{k - pj\}, \{k - pj\}) = \sum_{n=1}^{\infty} q r^n g_n(k),$$
see Eq. (53). $g_n(k)$ is a polynomial of $\exp(-\gamma^2)$, and q and r are analytic functions of $\exp(-\gamma^2)$. We can adapt the procedure used in the proof of Theorem 1 and deduce the analyticity of $\langle \hat n_k\rangle$ for small u.

Remarks. 1. The monomer state (61) is the Tao-Thouless state, corresponding to the reference configuration $m^{\mathrm{TT}}$ in Eq. (29) and what follows. The fact that Laughlin's wave function for a fixed, finite number of particles on very thin cylinders approaches the Tao-Thouless state has been observed by Rezayi and Haldane [RH]. The novelty here is twofold: first, the limits N → ∞ and γ → ∞ can be interchanged; second, the periodicity survives for small but non-vanishing cylinder radius.

2. If we assimilate orbitals $\psi_k$ with lattice sites $k \in \mathbb{Z}$, the restriction of the state ω to the algebra generated by the number operators $\hat n_k$ can be described by a probability distribution P on particle configurations on $\mathbb{Z}$. Adapting techniques from [AM,AGL], one can show that the probability measures corresponding to ω and the shifted states $\omega \circ \tau_x, \dots, \omega \circ \tau_x^{p-1}$ are mutually singular. This result holds provided $r^n C_n \to q > 0$ and the second moment $\sum_n n^2 \alpha_n r^n$ is finite. Again, this condition is fulfilled when γ is large enough. As a consequence, the p quantum-mechanical states $\omega, \dots, \omega \circ \tau_x^{p-1}$ are not only distinct, but also orthogonal in the sense of [BR], Def. 4.1.20.

Now we come to clustering properties. Before we state our result in its general form, let us have a look at two-point correlations $\langle \hat n_k \hat n_l\rangle$, where $\hat n_k = c_k^* c_k$ is the number operator for the state $\psi_k$. In the spirit of the Remark in Subsect. 3.4, $\langle \hat n_k \hat n_l\rangle$ may be interpreted as the probability of finding a particle in the site k and another particle in the site l. In fact, we have a formula analogous to Eq. (58) for the one-particle density. Define $\rho^P(X)$ and $v_X(k)$ as in Subsect. 3.4. Let $v_X(k, l) := \langle u_X, \hat n_k \hat n_l u_X\rangle / \|u_X\|^2$ and
$$\rho^P(X, Y) = q\, r^{N(X)} \alpha_{N(X)}\, r^{d(X,Y)} C_{d(X,Y)}\, r^{N(Y)} \alpha_{N(Y)},$$
where X is to the left-hand side of Y, separated from Y by the distance $d(X, Y) := \min Y - \max X - 1 \ge 0$. $v_X(k, l)$ is the probability of finding particles in the sites k and l given that k and l are contained in $[p \min X, p \max X]$, while $\rho^P(X, Y)$ is the probability of finding the rods X and Y. Suppose that k < l. Then the diagonal two-point correlation equals
nˆ k nˆ l =
ρ P (X, Y )v X (k)vY (l) +
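X 0. Then the state $\langle\cdot\rangle$ of Theorem 2 is mixing with respect to the shifts $\tau_x^{np}$, $n \in \mathbb{Z}$:
$$\forall a, b \in \mathcal{A}: \quad \lim_{n\to\infty} \langle a\, \tau_x^{pn}(b)\rangle = \langle a\rangle \langle b\rangle. \tag{63}$$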
Proof. We use the notation from the proof of Theorem 2. It is enough to check (63) for operators $a = c_{L'}^* c_L$, $b = c_{K'}^* c_K$ with $L, L', K, K' \subset \mathbb{Z}$. Because of particle number conservation and y-invariance, the only interesting case is
$$|L| = |L'|, \qquad |K| = |K'|, \qquad \sum_{k \in K \cup L} k = \sum_{k \in K' \cup L'} k. \tag{64}$$
In the following we will assume that (64) holds and show that $\langle ab\rangle - \langle a\rangle\langle b\rangle$ is small when L ∪ L' is far to the left of K ∪ K'. The main idea is to generalize the formula (62) for two-point correlations. We will see that $\langle c_{L'}^* c_L c_{K'}^* c_K\rangle = F + G$, $q\, r^{N(X)+N(Y)} f_X(L', L)\, C_{d(X,Y)}\, r^{d(X,Y)} f_Y(K', K)$, F= X 0 such that: i) For $0 < p < p_0$,
$$E_{\min}(p) = \inf\{E(v),\ v \in W(\mathbb{R}^3),\ p(v) = p\} = \sqrt{2}\, p,$$
and the infimum is not achieved in $W(\mathbb{R}^3)$.
ii) For $p \ge p_0$, there exists a non-constant finite energy solution $u_p \in W(\mathbb{R}^3)$ to Eq. (TWc), with $c = c(u_p)$, such that $p(u_p) = p$, $E(u_{p_0}) = E_{\min}(p_0) = \sqrt{2}\, p_0$ and, for $p > p_0$,
$$E(u_p) = E_{\min}(p) = \inf\{E(v),\ v \in W(\mathbb{R}^3),\ p(v) = p\} < \sqrt{2}\, p. \tag{1.7}$$
iii) There exists some positive constant $E_0 \le \sqrt{2}\, p_0$, such that there are no non-trivial finite energy solutions v to Eq. (TWc) satisfying $E(v) < E_0$.
iv) We have sup{c(u p), p ≥ p0}
0,
0 < c(u p)
0 of the minimizer would lead to the differentiability of the full curve, which in turn would provide a full interval of speeds. In dimension two, the spectrum of speeds would then be the interval $(0, \sqrt{2})$. Although we did not work out a proof here, we believe that our compactness result would show that, if at some point $E_{\min}$ were not differentiable, then there are at least two different minimizers with different speeds. In that case, the function $p \mapsto c(u_p)$ is not single valued. However, we can prove that it is a decreasing (possibly multivalued) function. In dimension two, the function $E_{\min}$ has the following graph:

[Figure: graph of $E = E_{\min}(p)$ in the (p, E)-plane, together with the line $E = \sqrt{2}\, p$.]
F. Béthuel, P. Gravejat, J.-C. Saut
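In dimension three, the graph of $E_{\min}$ has the following form:

[Figure: E-p diagram in dimension three showing the curves $E = E_{\mathrm{up}}(p)$, $E = \sqrt{2}\, p$ and $E = E_{\min}(p)$, with $p_b$ and $p_0$ marked on the p-axis and $E_b$ and $E(u_{p_0})$ marked on the E-axis.]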
Notice that as a consequence of Theorem 2, $E(u_{p_0}) = \sqrt{2}\, p_0$, and that, in view of (1.8), the slope of the curve $E_{\min}$ at the point $(p_0, \sqrt{2}\, p_0)$ is strictly less than $\sqrt{2}$. Our results are in full agreement with the corresponding figure given in [31]. In dimension three, the numerical value found in [31] for $p_0$ is close to 80.

Jones and Roberts have also shown in [31], mainly by numerical means, that in dimension three, the branch of solutions $u_p$ can be extended past the curve $E = \sqrt{2}\, p$. This curve is represented in the E-p diagram above as the curve $E_{\mathrm{up}}$. Starting from the point $(p_0, \sqrt{2}\, p_0)$, the curve $E_{\mathrm{up}}$ goes down to the left staying above the curve $E = E_{\min}(p) = \sqrt{2}\, p$, until it reaches the bifurcation point $(p_b, E_b)$. After this bifurcation point, the curve $E_{\mathrm{up}}$ goes up to the right, and is asymptotic from above to the curve $E = \sqrt{2}\, p$, as p → +∞. We believe that the presence of the bifurcation point $(p_b, E_b)$ is due to the choice of our representation, and that the curve $p \mapsto u_p$ is actually differentiable.

At this stage, there is no mathematical proof of the existence of the upper branch of solutions. The fact that the slope of the curve at the point $(p_0, \sqrt{2}\, p_0)$ is strictly less than $\sqrt{2}$ leaves some hope to use an implicit function theorem to construct the curve $E_{\mathrm{up}}$, at least near $(p_0, \sqrt{2}\, p_0)$.

One important point which we have not addressed in this paper is the appearance of vortices (that is zeroes of solutions). It is known that in dimension two, solutions have two vortices for large p (see [4]), whereas there are vortex rings in dimension three for large p (see [3,8]). Jones, Putterman and Roberts conjectured in [31] the existence of some momentum $p_1$ such that $u_p$ has vortices for $p \ge p_1$, and has no vortex otherwise. The numerical value found in [31] for $p_1$ is close to 75.

The next step in the analysis is to describe some properties of the solutions we obtained in Theorems 1 and 2.

Theorem 3. Let N = 2 or N = 3, p > 0 and assume that $E_{\min}(p)$ is achieved by $u_p$. Then $u_p$ is real-analytic on $\mathbb{R}^N$, and is, up to a translation, axisymmetric. More precisely, there exists a function $\mathrm{u}_p : \mathbb{R} \times \mathbb{R}_+ \to \mathbb{C}$ such that
$$u_p(x) = \mathrm{u}_p(x_1, |x_\perp|), \quad \forall x = (x_1, x_\perp) \in \mathbb{R}^N.$$

In another direction, the limit p → +∞ has already been discussed in [3,4,8], and the analogy with solutions of the incompressible Euler equations in fluid dynamics, stressed. In dimension two, we wish to initiate here the rigorous mathematical study of the other
Travelling Waves for the Gross-Pitaevskii Equation II
end of the $E_{\min}$ curve, namely the asymptotic properties in the limit p → 0, for which many interesting results have been derived in the physical literature. In [31,30], it is formally shown that, if $u_c$ is a solution to (TWc) in dimension two, then, after a suitable rescaling, the function $1 - |u_c|^2$ converges, as the speed c converges to $\sqrt{2}$, to a solitary wave solution to the two-dimensional Kadomtsev-Petviashvili equation (KP I), which is written
$$\partial_t u + u\,\partial_1 u + \partial_1^3 u - \partial_1^{-1}(\partial_2^2 u) = 0. \tag{KP I}$$
As (GP), equation (KP I) is hamiltonian, with Hamiltonian given by
$$E_{KP}(u) = \frac{1}{2} \int_{\mathbb{R}^2} (\partial_1 u)^2 + \frac{1}{2} \int_{\mathbb{R}^2} \big(\partial_1^{-1}(\partial_2 u)\big)^2 - \frac{1}{6} \int_{\mathbb{R}^2} u^3, \tag{1.9}$$
and the $L^2$-norm of u is conserved as well. Solitary-wave solutions $u(x, t) = w(x_1 - \sigma t, x_2)$ may be obtained in dimension two minimizing the Hamiltonian, keeping the $L^2$-norm fixed (see [9,10]). The equation for the profile w of a solitary wave of speed σ = 1 is given by
$$\partial_1 w - w\,\partial_1 w - \partial_1^3 w + \partial_1^{-1}(\partial_2^2 w) = 0. \tag{1.10}$$
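As a quick consistency check, written out here as a worked computation rather than quoted from the references, one can insert the travelling-wave ansatz into (KP I) for a general speed σ and track how each term behaves under the anisotropic rescaling $w_\sigma(x_1, x_2) = \sigma\, w(\sqrt{\sigma}\, x_1, \sigma x_2)$:

```latex
% Ansatz u(x,t) = w(x_1 - \sigma t, x_2) in (KP I), multiplied by -1:
\sigma\,\partial_1 w - w\,\partial_1 w - \partial_1^3 w
  + \partial_1^{-1}(\partial_2^2 w) = 0 .
% Rescaling w_\sigma(x_1,x_2) = \sigma\, w(\sqrt{\sigma}\,x_1, \sigma x_2),
% with all derivatives of w evaluated at (\sqrt{\sigma}\,x_1, \sigma x_2):
\begin{align*}
\sigma\,\partial_1 w_\sigma &= \sigma^{5/2}\,\partial_1 w, &
w_\sigma\,\partial_1 w_\sigma &= \sigma^{5/2}\, w\,\partial_1 w, \\
\partial_1^3 w_\sigma &= \sigma^{5/2}\,\partial_1^3 w, &
\partial_1^{-1}(\partial_2^2 w_\sigma) &= \sigma^{5/2}\,\partial_1^{-1}(\partial_2^2 w).
\end{align*}
```

Every term picks up the same factor $\sigma^{5/2}$, so $w_\sigma$ solves the speed-σ profile equation exactly when w solves (1.10).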
In contrast with the Gross-Pitaevskii equation, the range of speeds is the positive axis. Indeed, for any given σ > 0, a solitary wave $w_\sigma$ of speed σ is deduced from a solution w to (1.10) by the scaling
$$w_\sigma(x_1, x_2) = \sigma\, w(\sqrt{\sigma}\, x_1, \sigma x_2).$$
The correspondence between the two equations is given as follows. Setting $\varepsilon \equiv \sqrt{2 - c^2}$ and $\eta_c \equiv 1 - |u_c|^2$, and performing the change of variables
$$w_c(x) = \frac{6}{\varepsilon^2}\, \eta_c\Big(\frac{x_1}{\varepsilon}, \frac{\sqrt{2}\, x_2}{\varepsilon^2}\Big), \tag{1.11}$$
it is shown that $w_c$ approximately solves (1.10) as c converges to $\sqrt{2}$. Set
$$S(v) = E_{KP}(v) + \frac{1}{2} \int_{\mathbb{R}^2} v^2.$$
We will term a ground-state a solution w to (1.10) which minimizes the action S among all the solutions to (1.10) (see [11] for more details). In dimension two, it is shown in [9] that w is a ground state if and only if it minimizes the Hamiltonian keeping the $L^2$-norm fixed. The constant $S_{KP}$ denotes the action S(w) of the ground-state solutions w to Eq. (1.10). In the asymptotic limit p → 0, the value $E_{\min}(p)$ relates to the constant $S_{KP}$ as the next result shows.

Proposition 1. Assume N = 2.
i) There exist some constants $p_1 > 0$, $K_0$ and $K_1$ such that we have the asymptotic behaviours
$$\frac{48\sqrt{2}}{S_{KP}^2}\, p^3 - K_0\, p^4 \le \sqrt{2}\, p - E_{\min}(p) \le K_1\, p^3, \quad \forall\, 0 \le p \le p_1. \tag{1.12}$$
ii) Let u p be as in Theorem 1. Then, there exist some constants p2 > 0, K 2 > 0 and K 3 such that √ (1.13) K 2 p2 ≤ 2 − c(u p) ≤ K 3 p2 , ∀0 ≤ p < p2 . Moreover, the map u p verifies |u p| ≥ 21 , so that we may write u p = p exp iϕp, and we have the estimates 2 2 2 2 |∇p| + |∂2 u p| + (1 − p)|∇ϕp| ≤ K p3 , (1.14) R2
R2
and |u p(0)| = min |u p(x)| ≤ 1 − K p2 , x∈R2
(1.15)
where K is some universal constant.

In a separate paper, we will provide a rigorous proof of the asymptotic expansion given in [29,30], under specific assumptions on the solutions $u_p$: these assumptions are in particular verified by the solutions constructed in Theorem 1, thanks in particular to estimates (1.14).

Remark 6. If $u_c$ is a solution to (TWc) in dimension three, then it is also formally shown in [29–31], that the function $w_c$ defined by
$$w_c(x) = \frac{6}{\varepsilon^2}\, \big(1 - |u_c|^2\big)\Big(\frac{x_1}{\varepsilon}, \frac{\sqrt{2}\, x_2}{\varepsilon^2}, \frac{\sqrt{2}\, x_3}{\varepsilon^2}\Big),$$
converges, as the speed c converges to $\sqrt{2}$, to a solitary wave solution w to the three-dimensional Kadomtsev-Petviashvili equation (KP I), which writes
$$\partial_t u + u\,\partial_1 u + \partial_1^3 u - \partial_1^{-1}(\partial_2^2 u + \partial_3^2 u) = 0.$$
In particular, the equation for the solitary wave w is now written as
$$\partial_1 w - w\,\partial_1 w - \partial_1^3 w + \partial_1^{-1}(\partial_2^2 w + \partial_3^2 w) = 0.$$

Remark 7. For N = 2 and N = 3, the Cauchy problem for (GP) is known to be well-posed in the energy space $E(\mathbb{R}^N)$ (see [15,16]), as well as in the space $\{v\} + H^1(\mathbb{R}^N)$, where v is a finite energy solution to (TWc) (see [14], and also [4,17,27]). An important advantage of the space $\{v\} + H^1(\mathbb{R}^N)$, besides the fact that it is affine, is that the momentum is well-defined (in contrast as mentioned above with the energy space). Moreover, it can be shown that the momentum, defined by (1.5) in the three-dimensional case, and after an integration by parts, by
$$p(v) = -\int_{\mathbb{R}^N} \partial_1(\mathrm{Im}(v))\,(\mathrm{Re}(v) - 1),$$
in the two-dimensional case, is a conserved quantity. In particular, the fact that $u_p$ solves the minimization problem (1.2) strongly suggests that $u_p$ is orbitally stable. A rigorous proof of the orbital stability of $u_p$ would require, in addition to solving the Cauchy problem, to obtain compactness properties for minimizing sequences for (1.2). We will not tackle this problem here.
1.2. Some elements in the proofs. The starting point of our proofs is a careful analysis of the curve $p \mapsto E_{\min}(p)$.

Theorem 4. Let N = 2 or N = 3. For any p, q ≥ 0, we have the inequality
$$|E_{\min}(p) - E_{\min}(q)| \le \sqrt{2}\, |p - q|, \tag{1.16}$$
i.e. the real-valued function $p \mapsto E_{\min}(p)$ is Lipschitz, with Lipschitz constant $\sqrt{2}$. Moreover, it is concave and non-decreasing on $\mathbb{R}_+$. Set
$$\Xi(p) = \sqrt{2}\, p - E_{\min}(p). \tag{1.17}$$
The function $p \mapsto \Xi(p)$ is nonnegative, convex, continuous, non-decreasing on $\mathbb{R}_+$, and tends to +∞ as p → +∞. In particular, there exists some number $p_0 \ge 0$ such that $\Xi(p) = 0$, if $p \le p_0$, and $\Xi(p) > 0$, otherwise.

An important consequence of the concavity of the function $E_{\min}(p)$ is the following
Corollary 1. The function $E_{\min}$ is subadditive, that is, for any non-negative numbers $p_1, \dots, p_\ell$, we have
$$\sum_{i=1}^{\ell} E_{\min}(p_i) \ge E_{\min}\Big(\sum_{i=1}^{\ell} p_i\Big). \tag{1.18}$$
Moreover, if ℓ ≥ 2 and (1.18) is an equality, then the function $E_{\min}$ is linear on (0, p), where $p \equiv \sum_{i=1}^{\ell} p_i$.
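The content of Corollary 1 is that any concave function vanishing at 0 is subadditive, which is easy to sanity-check numerically. The sketch below uses a hypothetical stand-in for $E_{\min}$, namely $f(p) = \sqrt{2}\, p/(1 + p)$; it is concave, non-decreasing, vanishes at 0 and has Lipschitz constant $\sqrt{2}$, but it is chosen purely for illustration and is not the actual minimal energy.

```python
import math


def toy_energy(p):
    """Hypothetical concave stand-in for E_min, with f(0) = 0 and slope at most sqrt(2)."""
    return math.sqrt(2) * p / (1.0 + p)


def subadditivity_gap(f, parts):
    """Return sum_i f(p_i) - f(sum_i p_i); non-negative when f is subadditive."""
    return sum(f(p) for p in parts) - f(sum(parts))
```

For a linear function such as `lambda p: math.sqrt(2) * p` the gap vanishes for any choice of `parts`, which corresponds to the equality case in the corollary.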
Proof. Since $E_{\min}(0) = 0$, and since $E_{\min}$ is concave, its graph lies above the segment joining (0, 0) and $(p, E_{\min}(p))$. In particular, for any 0 ≤ q ≤ p, we have
$$E_{\min}(q) \ge q\, \frac{E_{\min}(p)}{p}.$$
Specifying this relation for $p_i$, and adding these inequalities, we obtain (1.18). If (1.18) is an equality, then necessarily $E_{\min}(p_i) = p_i\, E_{\min}(p)/p$, and the graph has to be linear.

Since the function $E_{\min}$ is Lipschitz, non-decreasing and concave, its left and right derivatives exist for any p ≥ 0, are equal except possibly on a countable subset Q of $\mathbb{R}_+$, are non-negative and non-increasing, and satisfy the inequality
$$0 \le \frac{d^+}{dp} E_{\min}(p) \le \frac{d^-}{dp} E_{\min}(p) \le \sqrt{2},$$
where we have set
$$\frac{d^\pm}{dp} E_{\min}(p) \equiv \lim_{\delta p \to 0^+} \frac{E_{\min}(p \pm \delta p) - E_{\min}(p)}{\pm\,\delta p}.$$
Moreover, the derivatives are related to the speed $c(u_p)$ as follows.
Lemma 1. Let p > 0 and assume that $E_{\min}(p)$ is achieved by a solution $u_p$ of (TWc) of speed $c(u_p)$. Then we have
$$\frac{d^+}{dp} E_{\min}(p) \le c(u_p) \le \frac{d^-}{dp} E_{\min}(p). \tag{1.19}$$

Strict concavity also plays an important role in our argument. In that direction, we have

Lemma 2. Let $0 \le p_1 < p_2$ and assume the function $E_{\min}$ is affine on $(p_1, p_2)$. Then, for any $p_1 < p < p_2$, the infimum $E_{\min}(p)$ is not achieved in $W(\mathbb{R}^N)$.

So far, we have not addressed the existence problem for $u_p$. Notice that as a direct consequence of (1.16), one has
$$E_{\min}(p) \le \sqrt{2}\, p. \tag{1.20}$$
Inequality (1.20) corresponds in some sense to a linearization of the equation, or alternatively to an asymptotic situation where only quadratic terms in the functional are kept. To get some feeling for its proof, we consider a map $v \in \{1\} + C_c^\infty(\mathbb{R}^N)$ such that
$$\delta = \inf_{x \in \mathbb{R}^N} |v(x)| \ge \frac{1}{2},$$
so that we may write $v = \rho \exp i\varphi$. To obtain (1.20) we need to construct v so that $E(v) \simeq \sqrt{2}\, |p(v)|$. In view of formula (1.6) for the momentum, we have
$$|p(v)| = \frac{1}{2} \Big| \int_{\mathbb{R}^N} (1 - \rho^2)\, \partial_1 \varphi \Big| \le \frac{1}{2\delta} \int_{\mathbb{R}^N} |1 - \rho^2|\, \rho\, |\partial_1 \varphi|. \tag{1.21}$$
Using the inequality $ab \le \frac{1}{2}(a^2 + b^2)$, we observe therefore that
$$|p(v)| \le \frac{1}{\sqrt{2}\,\delta} \Big( \frac{1}{2} \int_{\mathbb{R}^N} \rho^2 |\nabla \varphi|^2 + \frac{1}{4} \int_{\mathbb{R}^N} (1 - \rho^2)^2 \Big) \le \frac{1}{\sqrt{2}\,\delta}\, E(v),$$
i.e. $\sqrt{2}\,\delta\, |p(v)| \le E(v)$. To obtain a map such that $E(v) \simeq \sqrt{2}\, |p(v)|$, we need therefore to have δ close to 1, and the inequality $ab \le \frac{1}{2}(a^2 + b^2)$ close to an equality, that is $a \simeq b$, or in our case
$$\sqrt{2}\, \partial_1 \varphi \simeq 1 - \rho^2.$$
This elementary observation is the starting point in the proof of the existence of solutions minimizing $E_{\min}$, in the case $\Xi(p) > 0$, that is for $p > p_0$. As a matter of fact, the discrepancy term
$$\Sigma(v) = \sqrt{2}\, p(v) - E(v)$$
is central in our analysis. Notice in particular, that
$$\Xi(p) = \sup\{\Sigma(v),\ v \in W(\mathbb{R}^N),\ p(v) = p\},$$
and that, in view of Theorem 4, a way to formulate the fact that $p_0 < p$ is to say that there exists a map $v \in W(\mathbb{R}^N)$ such that $\Sigma(v) > 0$, and p(v) = p. In this situation, we have
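Lemma 3. Let $v \in W(\mathbb{R}^N)$ and assume p(v) > 0. Then we have
$$\inf_{x \in \mathbb{R}^N} |v(x)| \le \max\Big\{ \frac{1}{2},\ 1 - \frac{\Sigma(v)}{\sqrt{2}\, p(v)} \Big\}. \tag{1.22}$$

Proof. Set as above
$$\delta \equiv \inf_{x \in \mathbb{R}^N} |v(x)|.$$
If $\delta \le \frac{1}{2}$, there is nothing to prove. Otherwise, one may show that v has a lifting, i.e. that we may write $v = \rho \exp i\varphi$. It follows therefore from (1.21) that
$$\sqrt{2}\,\delta\, p(v) = \sqrt{2}\,\delta\, |p(v)| \le E(v) = \sqrt{2}\, p(v) - \Sigma(v),$$
and hence
$$1 - \delta \ge \frac{\Sigma(v)}{\sqrt{2}\, p(v)},$$
which yields the conclusion.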
Lemma 3 is the starting point in the analysis of minimizing sequences for (1.2). In particular, it shows that their modulus cannot converge uniformly to 1 in the case $p_0 < p$. However, to go beyond that simple observation, one needs to get a better control on minimizing sequences. Among the analytical difficulties which have already been stressed in [3,4,8], the first one is presumably the lack of compactness of minimizing sequences or Palais-Smale sequences. Working directly with arbitrary Palais-Smale sequences, i.e. approximate solutions to (TWc) with an additional small $H^{-1}$-term, leads to substantial technical issues, which seem hard to remove. The lack of regularity of the $H^{-1}$-term raises some major problems, for instance concerning regularity of the functions at hand, as well as Pohozaev type identities which turn out to be crucial in our arguments, in particular in order to bound the Lagrange multiplier.¹

To overcome the difficulties related to a direct approach, we specify the way minimizing sequences are constructed. There are presumably many ways to proceed. Here, we consider the corresponding minimization problem on expanding tori (as in [3]). This choice has several advantages. First, the torus is compact, so that the existence of minimizers presents no major difficulty. Second, it has no boundary, so that the elliptic theory is essentially the local one and concentration near the boundary is avoided. The torus captures also some of the translation invariance for the problem on $\mathbb{R}^N$. Finally, Pohozaev's identities yield comfortable bounds for the Lagrange multipliers, which provide a uniform control on the ellipticity of (TWc). Our strategy to obtain compactness for the sequence of approximate minimizers is then to develop the elliptic theory for the equation on tori, derive several estimates which do not rely on the size of the torus, and then to pass to the limit when the size of the torus tends to infinity.

More precisely, we introduce the flat torus, for N = 2 and N = 3, defined by
$$\mathbb{T}_n^N \simeq \Omega_n^N \equiv [-\pi n, \pi n]^N,$$

¹ This direct approach would have the important advantage of paving the way to the study of the orbital stability of the travelling waves.
578
F. Béthuel, P. Gravejat, J.-C. Saut
for any n ∈ N∗ (with opposite faces identified), and the space 1 X nN = H 1 (TnN , C) Hper (nN , C)
of 2n-periodic H 1 -functions. We define the energy E n and the momentum pn on X nN by 1 1 E n (v) = |∇v|2 + (1 − |v|2 )2 = e(v), 2 TnN 4 TnN TnN and
1 pn (v) = i∂1 v, v, 2 TnN
N which clearly √ defines a quadratic functional on X nN, as well as the discrepancy term n (v) = 2 pn (v) − E n (v). We introduce the set n (p) defined in dimension three by n3 (p) ≡ {u ∈ X n3 , pn (u) = p}, whereas in dimension two, its definition is slightly more involved, and is given by
n2 (p) ≡ {u ∈ X n2 , pn (u) = p} ∩ Sn0 . The set Sn0 corresponds to a topological sector of the energy E n , following the approach of Almeida [1]. We will define it precisely in definition (4.14) of Subsection 4.2: at this stage, let us just mention that we introduce the set Sn0 to have appropriate lifting properties far from the possibly vorticity set. We consider the minimization problem n E n (u) . E min (p) ≡ inf (PnN (p)) u∈nN (p)
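To make the definitions of $E_n$, $p_n$ and the discrepancy term concrete, the following minimal sketch evaluates them for a plane wave $v(x) = \exp(i m x_1 / n)$, which is $2\pi n$-periodic when $m$ is an integer. The function names, the grid size `M` and the finite-difference discretization are illustrative assumptions, not part of the paper; the bracket is taken to be $\langle z_1, z_2 \rangle = \mathrm{Re}(z_1 \bar{z}_2)$, and the topological sector $S_n^0$ used in dimension two is ignored.

```python
import cmath
import math


def torus_energy_momentum(n, m, N=3, M=4000):
    """Approximate (E_n(v), p_n(v)) for v(x) = exp(i*m*x1/n) on [-pi n, pi n]^N.

    v depends on x1 only, so the integral over the other N-1 variables
    is just a factor (2*pi*n)**(N-1).
    """
    side = 2.0 * math.pi * n      # side length of the torus
    h = side / M                  # grid spacing in the x1 direction
    k = m / n                     # wave number; integer m keeps v periodic
    energy_1d = 0.0
    momentum_1d = 0.0
    for j in range(M):
        x = -math.pi * n + j * h
        v = cmath.exp(1j * k * x)
        # central difference for d/dx1 (v is known in closed form)
        dv = (cmath.exp(1j * k * (x + h)) - cmath.exp(1j * k * (x - h))) / (2 * h)
        # energy density e(v) = |grad v|^2 / 2 + (1 - |v|^2)^2 / 4
        e = 0.5 * abs(dv) ** 2 + 0.25 * (1 - abs(v) ** 2) ** 2
        # momentum density <i d1 v, v> = Re(i * d1 v * conj(v))
        g = (1j * dv * v.conjugate()).real
        energy_1d += e * h
        momentum_1d += 0.5 * g * h
    transverse = side ** (N - 1)
    return energy_1d * transverse, momentum_1d * transverse


def discrepancy(energy, momentum):
    """Discrepancy term sqrt(2) * p_n(v) - E_n(v)."""
    return math.sqrt(2.0) * momentum - energy
```

For this plane wave one expects $E_n(v) = \frac{1}{2}(m/n)^2 (2\pi n)^N$ and $p_n(v) = -\frac{1}{2}(m/n)(2\pi n)^N$, so the momentum is positive for $m < 0$.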
The constraint is non void, so that it is possible to prove the existence of a minimizer for ($P_n^N(p)$).

Proposition 2. Assume $N = 3$, or $N = 2$ and $n \geq \tilde{n}(p)$, where $\tilde{n}(p)$ is some integer only depending on $p$. Then, there exists a minimizer $u_p^n \in \Gamma_n^N(p)$ for $E^n_{\min}(p)$, and some constant $c_p^n \in \mathbb{R}$ such that $u_p^n$ satisfies (TW$c_p^n$), i.e.
$$i c_p^n \partial_1 u_p^n + \Delta u_p^n + u_p^n (1 - |u_p^n|^2) = 0 \ \text{on } \mathbb{T}_n^N.$$
In particular, $u_p^n$ is smooth. Moreover, if $\Xi(p) > 0$, then there exist a constant $K(p)$, and an integer $n(p)$, only depending on $p$, such that
$$|c_p^n| \leq K(p) \tag{1.23}$$
for any $n \geq n(p)$. In particular, for any $k \in \mathbb{N}$, there exists some constant $K_k(p)$ only depending on $p$ and $k$ such that we have
$$\|u_p^n\|_{C^k(\mathbb{T}_n^N)} \leq K_k(p). \tag{1.24}$$
The last estimate in Proposition 2 is a simple consequence of bound (1.23) on $c_p^n$, combined with standard elliptic estimates. In view of the invariance by translation of the problem ($P_n^N(p)$) on the torus $\mathbb{T}_n^N$, we may assume without loss of generality that the infimum of $|u_p^n|$ is achieved at the point 0, that is
$$|u_p^n(0)| = \min_{x \in \mathbb{T}_n^N} |u_p^n(x)|. \tag{1.25}$$
On the other hand, the argument of the proof of Lemma 3 carries over for continuous maps $v \in X_n^2 \cap S_n^0$, resp. $v \in X_n^3$, for $n$ sufficiently large, so that
$$\min_{x \in \mathbb{T}_n^N} |v(x)| \leq \sup\Big\{ \frac{1}{2},\ 1 - \frac{\Sigma_n(v)}{\sqrt{2}\, p_n(v)} \Big\}, \tag{1.26}$$
if $p_n(v) > 0$, and $n$ is sufficiently large. Indeed, it follows from Lemma 4.4, resp. Lemma 4.2, that continuous maps $v \in X_n^2 \cap S_n^0$, resp. $v \in X_n^3$, such that $|v| \geq \frac{1}{2}$ on $\mathbb{T}_n^N$, have a lifting on $\mathbb{T}_n^N$, so that the argument of the proof of Lemma 3 applies without any change. Combining (1.25) and (1.26), we are led to
$$\limsup_{n \to +\infty} |u_p^n(0)| \leq \sup\Big\{ \frac{1}{2},\ 1 - \frac{\Xi(p)}{\sqrt{2}\, p} \Big\}.$$
If $\Xi(p) > 0$, then we may use Ascoli's and Rellich's compactness theorems to assert

Proposition 3. Let $N = 2$ or $N = 3$, $p > 0$, and assume that
$$\Xi(p) > 0. \tag{1.27}$$
Then, there exists a non-trivial finite energy solution $u_p$ to (TWc) such that, passing possibly to a subsequence, we have
$$u_p^n \to u_p \ \text{in } C^k(K), \ \text{as } n \to +\infty, \tag{1.28}$$
for any $k \in \mathbb{N}$, and any compact set $K$ in $\mathbb{R}^N$. Moreover, we have $E(u_p) \leq E_{\min}(p)$, and
$$|u_p(0)| \leq \sup\Big\{ \frac{1}{2},\ 1 - \frac{\Xi(p)}{\sqrt{2}\, p} \Big\} < 1.$$
Notice that, provided one is able to verify condition (1.27), Proposition 3 already provides the existence of non-trivial travelling wave solutions. To go further and establish the statements of Theorems 1 and 2, one needs to pass to the limit in the integral quantities E and p, the main difficulty being that the domain is not bounded, and therefore we have only weak convergences in L 2 . The possible failure of convergence is described in the next proposition, which is a classical concentration-compactness result.
Proposition 4. Let $N = 2$ or $N = 3$, $p > 0$, and assume (1.27) is satisfied. Let $u_p^n$ and $u_p$ be as in Proposition 3. Then, there exists an integer $\ell_0$ depending only on $p$, and $\ell$ points $x_1^n = 0, \ldots, x_\ell^n$ depending on $n$, $\ell$ finite energy solutions $u_1 = u_p, \ldots, u_\ell$ to (TWc), and a subsequence of $(u_p^n)_{n \in \mathbb{N}^*}$ still denoted $(u_p^n)_{n \in \mathbb{N}^*}$ such that $\ell \leq \ell_0$,
$$|x_i^n - x_j^n| \to +\infty, \ \text{as } n \to +\infty, \tag{1.29}$$
for any $i \neq j$, and
$$u_p^n(\cdot + x_i^n) \to u_i(\cdot) \ \text{in } C^k(K), \ \text{as } n \to +\infty, \tag{1.30}$$
for any compact set $K \subset \mathbb{R}^N$ and any $k \in \mathbb{N}$. Moreover, we have the identities
$$E_{\min}(p) = \sum_{i=1}^{\ell} E(u_i), \ \text{and } p = \sum_{i=1}^{\ell} p(u_i).$$
The maps u i are minimizers for E min (pi ), where pi = p(u i ), and 0 < c(u p)
p0 , where p0 ≥ 0 is defined in Theorem 4, then E min (p) is achieved by the map u p ∈ W (R N ) constructed in Proposition 3. Theorems 1 and 2 are then deduced from Theorem 5 and various aspects of the analysis presented above. We will show in particular, thanks to Proposition 1, that p0 = 0 in dimension two, whereas p0 > 0 in dimension three. This fact accounts in large part for the differences in the statements of Theorems 1 and 2.
1.3. Outline of the paper. The paper is organized as follows. In the next section, we present some known and also some new properties of finite energy travelling wave solutions. In Sect. 3, we provide properties of $E_{\min}$, in particular, we prove Theorems 3 and 4. In Sect. 4, we study solutions on tori, whereas in Sect. 5, we study their asymptotic properties on expanding tori. In particular, we prove the concentration-compactness result. In Sect. 6, we consider the minimization problem on tori, and we provide the asymptotic limits of the energy and momentum on expanding tori. In Sect. 7, we finally complete the proofs of Theorems 1, 2 and 5.

2. Properties of Finite Energy Solutions on $\mathbb{R}^N$

In this section, we recall some known facts about finite energy solutions to (TWc) on $\mathbb{R}^N$, and supplement them with some results which enter in our analysis.
2.1. Pointwise estimates. The following results were proved in [13] (see also [40]).

Lemma 2.1 ([13,40]). Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$. Then, $v$ is a smooth, bounded function on $\mathbb{R}^N$. Moreover, there exist some constants $K(N)$ and $K(c, k, N)$ such that
$$\big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^N)} \leq \max\Big\{ 1, \frac{c}{2} \Big\}, \tag{2.1}$$
$$\|\nabla v\|_{L^\infty(\mathbb{R}^N)} \leq K(N) \Big( 1 + \frac{c^2}{4} \Big)^{\frac{3}{2}}, \tag{2.2}$$
and more generally,
$$\|v\|_{C^k(\mathbb{R}^N)} \leq K(c, k, N), \ \forall k \in \mathbb{N}. \tag{2.3}$$

Proof. In view of [13,40] (see also [4,20]), a finite energy solution $v$ to Eq. (TWc) is a smooth, bounded function on $\mathbb{R}^N$, such that
$$|v(x)| \to 1, \ \text{as } |x| \to +\infty. \tag{2.4}$$
In particular, we have $\|v\|_{L^\infty(\mathbb{R}^N)} \geq 1$. In order to prove inequalities (2.1), (2.2) and (2.3), we compute the laplacian of $|v|^2$. By the inequality $ab \leq \frac{1}{2}(a^2 + b^2)$, we have
$$\Delta |v|^2 = 2 \langle \Delta v, v \rangle + 2 |\nabla v|^2 = 2 |\nabla v|^2 - 2c \langle i \partial_1 v, v \rangle - 2 |v|^2 (1 - |v|^2) \geq 2 |\nabla v|^2 - 2 |\partial_1 v|^2 - \frac{c^2}{2} |v|^2 - 2 |v|^2 (1 - |v|^2),$$
so that
$$\Delta |v|^2 + 2 |v|^2 \Big( 1 + \frac{c^2}{4} - |v|^2 \Big) \geq 0. \tag{2.5}$$
When $\|v\|_{L^\infty(\mathbb{R}^N)} > 1$, it follows from (2.4) and (2.5) that we can apply the weak maximum principle to $|v|^2$ to obtain
$$|v|^2 \leq 1 + \frac{c^2}{4}. \tag{2.6}$$
When $\|v\|_{L^\infty(\mathbb{R}^N)} = 1$, inequality (2.6) is straightforward, so that it holds in any case. In particular, the function $1 - |v|$ satisfies
$$\big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^N)} \leq \max\Big\{ 1, \sqrt{1 + \frac{c^2}{4}} - 1 \Big\} \leq \max\Big\{ 1, \frac{c}{2} \Big\},$$
so that inequality (2.1) is established. We turn to (2.2). Consider the function $w$ defined by
$$w(x) = v(x) \exp\Big( i \frac{c}{2} x_1 \Big), \ \forall x \in \mathbb{R}^N. \tag{2.7}$$
By Eq. (TWc), $w$ satisfies
$$\Delta w + w \Big( 1 + \frac{c^2}{4} - |w|^2 \Big) = 0.$$
Letting $x_0 \in \mathbb{R}^N$, and denoting $B(x_0, 1) = \{y \in \mathbb{R}^N, \ \text{s.t. } |y - x_0| < 1\}$, the ball with center $x_0$ and radius 1, it holds from inequality (2.6) that
$$\|\Delta w\|_{L^\infty(B(x_0, 1))} \leq \frac{2}{3\sqrt{3}} \Big( 1 + \frac{c^2}{4} \Big)^{\frac{3}{2}}. \tag{2.8}$$
By standard elliptic theory, there exists some constant $K(N)$ such that
$$|\nabla w(x_0)| \leq K(N) \Big( \|w\|_{L^\infty(B(x_0, 1))} + \|\Delta w\|_{L^\infty(B(x_0, 1))} \Big),$$
so that, by inequalities (2.6) and (2.8), and definition (2.7),
$$|\nabla w(x_0)| \leq 2 K(N) \Big( 1 + \frac{c^2}{4} \Big)^{\frac{3}{2}}.$$
Hence, by definition (2.7) once more,
$$|\nabla v(x_0)| \leq |\nabla w(x_0)| + \frac{c}{2} |v(x_0)| \leq (2K(N) + 1) \Big( 1 + \frac{c^2}{4} \Big)^{\frac{3}{2}},$$
which gives inequality (2.2). Finally, one invokes standard estimates for the laplacian to prove (2.3).

Lemma 2.2 ([40]). Let $r > 0$, and assume that $v$ is a finite energy solution to (TWc) on $\mathbb{R}^N$. There exists some constant $K(N)$ such that for any $x_0 \in \mathbb{R}^N$,
$$\big\| 1 - |v| \big\|_{L^\infty(B(x_0, \frac{r}{2}))} \leq \max\Big\{ K(N) \Big( 1 + \frac{c^2}{4} \Big)^{2} E\big(v, B(x_0, r)\big)^{\frac{1}{N+2}},\ \frac{K(N)}{r^{\frac{N}{2}}} E\big(v, B(x_0, r)\big)^{\frac{1}{2}} \Big\}, \tag{2.9}$$
where $E(v, B(x_0, r)) \equiv \int_{B(x_0, r)} e(v)$.
Proof. Let $\eta = 1 - |v|^2$. By Lemma 2.1, the function $\eta$ is smooth on $\mathbb{R}^N$, and satisfies
$$\|\nabla \eta\|_{L^\infty(\mathbb{R}^N)} \leq 2 \|v \nabla v\|_{L^\infty(\mathbb{R}^N)} \leq K(N) \Big( 1 + \frac{c^2}{4} \Big)^{2}.$$
Let $x$ be some point in $B(x_0, \frac{r}{2})$ such that
$$|\eta(x)| = \sup_{y \in B(x_0, \frac{r}{2})} |\eta(y)|.$$
We compute
$$|\eta(y)| \geq |\eta(x)| - K(N) \Big( 1 + \frac{c^2}{4} \Big)^{2} |y - x|,$$
so that
$$|\eta(y)| \geq \frac{|\eta(x)|}{2},$$
for any point $y \in B(x, \mu)$, where $\mu = \frac{|\eta(x)|}{2 K(N) (1 + \frac{c^2}{4})^2}$. Therefore, we are led to
$$E(v, B(x_0, r)) \geq \frac{1}{16} \int_{B(x, \min\{\mu, \frac{r}{2}\})} \eta(x)^2 \, dy = \min\Big\{ \frac{|\eta(x)|^{N+2} |B(x_0, 1)|}{2^{N+4} K(N)^N (1 + \frac{c^2}{4})^{2N}},\ \frac{|\eta(x)|^2 r^N |B(x_0, 1)|}{2^{N+4}} \Big\}. \tag{2.10}$$
In conclusion, $\| 1 - |v| \|_{L^\infty(B(x_0, \frac{r}{2}))} \leq \|\eta\|_{L^\infty(B(x_0, \frac{r}{2}))} = |\eta(x)|$, so that inequality (2.9) follows from inequality (2.10).
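If we next consider the class of functions
$$\Big\{ v \in C^0(\mathbb{R}^N), \ \text{s.t. } E(v) < +\infty, \ \text{and } \exists R(v) > 0, \ \text{s.t. } |v(x)| \geq \frac{1}{2}, \ \forall |x| \geq R(v) \Big\},$$
a rather direct consequence of Lemma 2.2 is

Corollary 2.1. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$. Then $v$ belongs to this class.

Proof. Indeed, the energy of $v$ being finite, there exists some radius $R(v) > 1$ so that
$$E(v, B(x_0, 1)) \leq \int_{\mathbb{R}^N \setminus B(0, R(v) - 1)} e(v) \leq \frac{1}{\big( 2 K(N) (1 + \frac{c^2}{4})^2 \big)^{N+2}}, \ \forall |x_0| \geq R(v),$$
where $K(N)$ is the constant in inequality (2.9). Hence, $v$ belongs to the class above in view of (2.9).

2.2. Alternate definitions of the momentum. If $v$ belongs to the class above, we may write, for $|x| > R(v)$,
$$v = \varrho \exp i\varphi, \tag{2.11}$$
where $\varphi$ is a real function on $\mathbb{R}^N \setminus B(0, R(v))$ defined modulo a multiple of $2\pi$, and $\varrho = |v|$. Notice that, if $v$ may be written as in (2.11), then we have
$$\partial_j v = \big( i \varrho \partial_j \varphi + \partial_j \varrho \big) \exp i\varphi,$$
so that
$$\langle i \partial_1 v, v \rangle = -\varrho^2 \partial_1 \varphi, \ \text{and } e(v) = \frac{1}{2} \big( |\nabla \varrho|^2 + \varrho^2 |\nabla \varphi|^2 \big) + \frac{1}{4} \big( 1 - \varrho^2 \big)^2. \tag{2.12}$$
In this context, the next elementary observation will be used in several places of our work.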
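Lemma 2.3. Let $\varrho$ and $\varphi$ be $C^1$ scalar functions on a domain $U$ in $\mathbb{R}^N$, such that $\varrho$ is positive. Set $v = \varrho \exp i\varphi$. Then, we have the pointwise bound
$$\big| (\varrho^2 - 1) \partial_1 \varphi \big| \leq \frac{\sqrt{2}}{\varrho} e(v).$$

Proof. Notice that we have by (2.12),
$$e(v) = \frac{1}{2} \big( |\nabla \varrho|^2 + \varrho^2 |\nabla \varphi|^2 \big) + \frac{1}{4} \big( 1 - \varrho^2 \big)^2 \geq \frac{1}{2} \varrho^2 |\partial_1 \varphi|^2 + \frac{1}{4} (1 - \varrho^2)^2.$$
The conclusion then follows from the inequality $ab \leq \frac{1}{2}(a^2 + b^2)$ applied to $a = \frac{1}{\sqrt{2}} (\varrho^2 - 1)$ and $b = \varrho \partial_1 \varphi$.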
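The pointwise bound of Lemma 2.3 lends itself to a quick numerical sanity check. The sketch below samples random values of the modulus and of the two gradients and reports the smallest value of the right-hand side minus the left-hand side; the function names and sampling ranges are illustrative assumptions, and the energy density is taken in the polar form $e(v) = \frac{1}{2}(|\nabla \varrho|^2 + \varrho^2 |\nabla \varphi|^2) + \frac{1}{4}(1 - \varrho^2)^2$.

```python
import math
import random


def energy_density(rho, grad_rho, grad_phi):
    """Energy density e(v) for v = rho * exp(i * phi), in polar form."""
    grad_rho_sq = sum(g * g for g in grad_rho)
    grad_phi_sq = sum(g * g for g in grad_phi)
    return 0.5 * (grad_rho_sq + rho ** 2 * grad_phi_sq) + 0.25 * (1 - rho ** 2) ** 2


def lemma_gap(rho, grad_rho, grad_phi):
    """(sqrt(2)/rho) * e(v) - |(rho^2 - 1) * d1(phi)|; the lemma says this is >= 0."""
    lhs = abs((rho ** 2 - 1) * grad_phi[0])
    rhs = math.sqrt(2.0) / rho * energy_density(rho, grad_rho, grad_phi)
    return rhs - lhs


def min_gap(samples=10000, dim=3, seed=0):
    """Smallest gap observed over random samples (rho > 0)."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        rho = rng.uniform(0.05, 3.0)
        grad_rho = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        grad_phi = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        worst = min(worst, lemma_gap(rho, grad_rho, grad_phi))
    return worst
```

Equality is attained when $\nabla \varrho = 0$, only $\partial_1 \varphi$ is non-zero, and $a = b$ in the proof, for instance $\varrho = 2$ and $\partial_1 \varphi = 3/(2\sqrt{2})$.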
The previous observations lead as in [3,20] to an alternate definition of the momentum on the class of functions introduced before Corollary 2.1. For that purpose, consider the function
$$g(v) = \langle i \partial_1 v, v \rangle + \partial_1 \big( (1 - \chi) \varphi \big),$$
where $v = \varrho \exp i\varphi$ on $\mathbb{R}^N \setminus B(0, R(v))$, and $\chi$ is an arbitrary smooth function with compact support such that $\chi \equiv 1$ on $B(0, R(v))$ and $0 \leq \chi \leq 1$. It follows from (2.12) that

Lemma 2.4. If $v$ belongs to this class, then $g(v)$ belongs to $L^1(\mathbb{R}^N)$. Moreover the integral
$$\tilde{p}(v) \equiv \frac{1}{2} \int_{\mathbb{R}^N} g(v) = \frac{1}{2} \int_{\mathbb{R}^N} \Big( \langle i \partial_1 v, v \rangle + \partial_1 \big( (1 - \chi) \varphi \big) \Big)$$
does not depend on the choice of $\chi$.

Proof. As a consequence of (2.12), we verify that
$$g(v) = (1 - \varrho^2) \partial_1 \varphi, \ \text{on } \mathbb{R}^N \setminus \mathrm{supp}(\chi),$$
so that in particular, it follows from Lemma 2.3 that
$$|g(v)| \leq 2\sqrt{2}\, e(v) \ \text{on } \mathbb{R}^N \setminus \mathrm{supp}(\chi),$$
and hence, since $v$ is smooth on $\mathbb{R}^N$, and its energy is bounded, the function $g(v)$ is integrable on $\mathbb{R}^N$. The last assertion is a direct consequence of the integration by parts formula.

2.3. Decay at infinity. We will use in several places decay properties of solutions which have been established in [18,20,21]. They play, among other things, a central role in our concentration-compactness results.

Proposition 2.1 ([18,20,21]). Let $v$ be a finite energy solution to (TWc).
i) There exists a constant $v_\infty$, such that $|v_\infty| = 1$ and $v(x) \to v_\infty$, as $|x| \to \infty$. Without loss of generality, we may assume $v_\infty = 1$.
ii) Assume $c(v) < \sqrt{2}$. Then, there exists some constant $K > 0$ depending only on $c(v)$, $E(v)$ and the dimension $N$, such that the following estimates hold for any $x \in \mathbb{R}^N$,
$$|\mathrm{Im}(v(x))| \leq \frac{K}{1 + |x|^{N-1}}, \quad |\mathrm{Re}(v(x)) - 1| \leq \frac{K}{1 + |x|^{N}}, \quad |\nabla \mathrm{Im}(v(x))| \leq \frac{K}{1 + |x|^{N}}, \quad |\nabla \mathrm{Re}(v(x))| \leq \frac{K}{1 + |x|^{N+1}}. \tag{2.13}$$
iii) Assume $N = 3$ and $c(v) = \sqrt{2}$. Then, $\mathrm{Re}(v) - 1$ and $\nabla \mathrm{Im}(v)$ belong to $L^p(\mathbb{R}^3)$, for any $p > \frac{5}{3}$, $\nabla \mathrm{Re}(v)$ belongs to $L^p(\mathbb{R}^3)$, for any $p > \frac{5}{4}$, whereas $\mathrm{Im}(v)$ belongs to $L^p(\mathbb{R}^3)$, for any $p > \frac{15}{4}$.

A first consequence is

Corollary 2.2. Let $v$ be a finite energy solution to (TWc), and assume $v_\infty = 1$. Then $v$ belongs to $W(\mathbb{R}^N)$.

Proof. In view of definitions (1.3) and (1.4), Corollary 2.2 directly follows from the decay estimates (ii) and (iii) of Proposition 2.1.

Remark 2.1. Since any finite energy solution $v$ to (TWc) has a limit $v_\infty$ at infinity, we may write $v = \varrho \exp i\varphi$ outside some ball $B(0, R)$, for some $R > 0$, where $\varphi$ is a smooth function on $\mathbb{R}^N \setminus B(0, R)$, which is defined up to an integer multiple of $2\pi$. Moreover the function $\varphi$ has a limit at infinity $\varphi_\infty$, which we may take equal to 0, if we assume that $v_\infty = 1$. The statements given in [18,20,21] are actually expressed in terms of the real functions $\varrho$ and $\varphi$ as follows:

(i) If $0 \leq c(v) < \sqrt{2}$, there exists some constant $K > 0$ depending only on $c(v)$, $E(v)$ and $N$ such that
$$|\varphi(x)| \leq \frac{K}{1 + |x|^{N-1}}, \quad |1 - \varrho(x)| \leq \frac{K}{1 + |x|^{N}}, \quad |\nabla \varphi(x)| \leq \frac{K}{1 + |x|^{N}}, \quad |\nabla \varrho(x)| \leq \frac{K}{1 + |x|^{N+1}}. \tag{2.14}$$
(ii) If $c(v) = \sqrt{2}$, the function $\varphi$ belongs to $L^p(\mathbb{R}^3 \setminus B(0, R))$, for any $p > \frac{15}{4}$, $\varrho - 1$ and $\nabla \varphi$ belong to $L^p(\mathbb{R}^3 \setminus B(0, R))$, for any $p > \frac{5}{3}$, whereas $\nabla \varrho$ belongs to $L^p(\mathbb{R}^3 \setminus B(0, R))$, for any $p > \frac{5}{4}$.

The previous inequalities are easily seen to be equivalent to those given in Proposition 2.1. Indeed, we have $v = \varrho \big( \cos(\varphi) + i \sin(\varphi) \big)$, so that $\mathrm{Re}(v) - 1 = \varrho \cos(\varphi) - 1$ and $\mathrm{Im}(v) = \varrho \sin(\varphi)$, and hence
$$|\mathrm{Re}(v) - 1| \leq K \big( |\varrho - 1| + \varphi^2 \big), \quad |\mathrm{Im}(v)| \leq K |\varphi|, \quad |\nabla \mathrm{Re}(v)| \leq K \big( |\nabla \varrho| + |\varphi| |\nabla \varphi| \big), \quad |\nabla \mathrm{Im}(v)| \leq K \big( |\nabla \varphi| + |\varphi| |\nabla \varrho| \big),$$
where $K > 0$ is some constant. A remarkable consequence of Proposition 2.1 is
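Proposition 2.2. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$. Then, we have $\tilde{p}(v) = p(v)$.

Proof. Let $R(v) > 0$ be such that $|v| \geq \frac{1}{2}$ on $\mathbb{R}^N \setminus B(0, R(v))$. As in Proposition 2.1 and Remark 2.1, we may assume without loss of generality that $v_\infty = 1$ and $\varphi_\infty = 0$. If $|x|$ is sufficiently large, the expansion of the sin function yields
$$\Big| \frac{\mathrm{Im}(v(x))}{\varrho(x)} - \varphi(x) \Big| \leq \frac{|\varphi(x)|^3}{6}.$$
Let $R > R(v)$ be sufficiently large. We have, integrating by parts,
$$\int_{B(0, R)} \langle i \partial_1 v, 1 \rangle = -\frac{1}{R} \int_{\partial B(0, R)} \mathrm{Im}(v(x))\, x_1 \, dx,$$
and
$$\int_{B(0, R)} \partial_1 \big( (1 - \chi) \varphi \big) = \frac{1}{R} \int_{\partial B(0, R)} \varphi(x)\, x_1 \, dx,$$
so that it follows
$$\int_{B(0, R)} \Big( \langle i \partial_1 v, v - 1 \rangle - g(v) \Big) = \frac{1}{R} \int_{\partial B(0, R)} \big( \mathrm{Im}(v(x)) - \varphi(x) \big) x_1 \, dx.$$
We write $\mathrm{Im}(v) - \varphi = \big( \frac{\mathrm{Im}(v)}{\varrho} - \varphi \big) + \frac{\mathrm{Im}(v)}{\varrho} (\varrho - 1)$, so that
$$\Big| \int_{B(0, R)} \Big( \langle i \partial_1 v, v - 1 \rangle - g(v) \Big) \Big| \leq \int_{\partial B(0, R)} \Big( \frac{|\varphi|^3}{6} + 2 |\mathrm{Im}(v)| |\varrho - 1| \Big). \tag{2.15}$$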
We next distinguish two cases. Case 1. N = 2 or N = 3, and c(v)
$\frac{15}{4}$, and $\varrho - 1$ belongs to $L^q(\mathbb{R}^3 \setminus B(0, R(v)))$ for any $q > \frac{5}{3}$, so that by Proposition 2.1, the function $f \equiv \frac{|\varphi|^3}{6} + 2 |\mathrm{Im}(v)| |\varrho - 1|$ belongs to $L^q(\mathbb{R}^3 \setminus B(0, R(v)))$ for any $q > \frac{5}{4}$. Given $R > R(v)$ and $q > \frac{5}{4}$ to be determined later, we may find some $R \leq R' \leq 2R$, and some constant $K(q)$ only depending on $q$, such that
$$\int_{\partial B(0, R')} f^q \leq \frac{K(q)}{R},$$
so that by Hölder's inequality, we are led to
$$\int_{\partial B(0, R')} f \leq K(q) R^{2 - \frac{3}{q}}.$$
Choosing $q = \frac{4}{3}$, we obtain that $\int_{\partial B(0, R')} f \to 0$, as $R \to +\infty$. This yields by (2.15),
$$\int_{B(0, R')} \Big( \langle i \partial_1 v, v - 1 \rangle - g(v) \Big) \to 0, \ \text{as } R \to +\infty,$$
which yields the conclusion again, since the integrand is integrable by Lemma 2.4 and Corollary 2.2.

2.4. Pohozaev's type identities.

Lemma 2.5. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$, with speed $c = c(v)$. We have the identities
$$E(v) = \int_{\mathbb{R}^N} |\partial_1 v|^2,$$
and for any $2 \leq j \leq N$,
$$E(v) = \int_{\mathbb{R}^N} |\partial_j v|^2 + c(v) p(v).$$
Moreover, if $c(v) > 0$ and $v$ is not constant, then $p(v) > 0$.

Proof. The first identity was established in [19], whereas, concerning the second identity, it was proved there that for any $2 \leq j \leq N$,
$$E(v) = \int_{\mathbb{R}^N} |\partial_j v|^2 + c(v) \tilde{p}(v).$$
The conclusion then follows from Proposition 2.2.

Notice that adding the identities in Lemma 2.5 we obtain
$$\frac{N - 2}{2} \int_{\mathbb{R}^N} |\nabla v|^2 + \frac{N}{4} \int_{\mathbb{R}^N} (1 - |v|^2)^2 - c(v) (N - 1) p(v) = 0. \tag{2.16}$$
Notice also that introducing the quantities $\Sigma(v) = \sqrt{2}\, p(v) - E(v)$ and $\varepsilon(v) = \sqrt{2 - c(v)^2}$, the second identity in Lemma 2.5 may be recast as
$$\int_{\mathbb{R}^N} |\partial_j v|^2 + \Sigma(v) = \Big( \sqrt{2} - \sqrt{2 - \varepsilon(v)^2} \Big) p(v) = \frac{\varepsilon(v)^2}{\sqrt{2} + \sqrt{2 - \varepsilon(v)^2}}\, p(v). \tag{2.17}$$

Corollary 2.3. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$, with speed $c = \sqrt{2}$ and such that $\Sigma(v) \geq 0$. Then, $v$ is a constant.
Proof. Since $\varepsilon(v) = 0$ and $\Sigma(v) \geq 0$, identity (2.17) implies that, for any $2 \leq j \leq N$,
$$\int_{\mathbb{R}^N} |\partial_j v|^2 = 0,$$
so that $v$ depends only on the $x_1$ variable. Since the energy is finite, this is impossible, unless $v$ is constant.

Notice more generally that, if $\Sigma(v) > 0$, then identity (2.17) gives
$$\int_{\mathbb{R}^N} |\partial_j v|^2 \leq \varepsilon(v) p(v), \ \forall\, 2 \leq j \leq N.$$
In connection with the previous inequality, the next result gives a more quantitative version of Corollary 2.3.

Lemma 2.6. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$, and set $\eta = 1 - |v|^2$. Then, there exists a constant $K(c) > 0$, possibly depending on $c$, such that
$$\|\eta\|_{L^\infty(\mathbb{R}^N)}^{N+1} \leq K(c) \int_{\mathbb{R}^N} \Big( \lambda |\partial_j v|^2 + \frac{\eta^2}{\lambda} \Big),$$
for any $2 \leq j \leq N$, and for any $\lambda > 0$. In particular, we have
$$\|\eta\|_{L^\infty(\mathbb{R}^N)}^{N+1} \leq K(c) \Big( \lambda \big( \varepsilon(v) p(v) - \Sigma(v) \big) + \frac{E(v)}{\lambda} \Big).$$

Proof. Set $\eta_\infty = \|\eta\|_{L^\infty(\mathbb{R}^N)}$. We may assume without loss of generality, that $|\eta(0)| = \eta_\infty$. In view of the uniform bound (2.2), there exists some constant $K(c)$ depending only on $c$ such that
$$|\eta(x)| \geq \frac{\eta_\infty}{2}, \ \forall x \in B\Big( 0, \frac{\eta_\infty}{2K(c)} \Big). \tag{2.18}$$
We next consider for any point $a = (a_1, \ldots, a_{j-1}, 0, a_{j+1}, \ldots, a_N)$, the line $D_j(a)$ parallel to the axis $x_j$, that is the set $D_j(a) = \{a_j(x) \equiv (a_1, \ldots, a_{j-1}, x, a_{j+1}, \ldots, a_N), \ x \in \mathbb{R}\}$. We claim that
$$\frac{|\eta_\infty|^2}{4} \leq \int_{D_j(a)} \Big( \lambda (\partial_j \eta)^2 + \frac{\eta^2}{\lambda} \Big), \tag{2.19}$$
for any $a = (a_1, \ldots, a_{j-1}, 0, a_{j+1}, \ldots, a_N) \in B(0, \frac{\eta_\infty}{2K(c)})$. Indeed, since $\eta(x) \to 0$, as $|x| \to +\infty$, we have by integration,
$$\eta(a)^2 = 2 \int_{-\infty}^{0} \partial_j \eta(a_j(x))\, \eta(a_j(x)) \, dx \leq \int_{\mathbb{R}} \Big( \lambda \big( \partial_j \eta(a_j(x)) \big)^2 + \frac{\eta(a_j(x))^2}{\lambda} \Big) dx. \tag{2.20}$$
Invoking inequality (2.18), one derives (2.19). The conclusion then follows integrating inequality (2.19) on $a = (a_1, \ldots, a_{j-1}, 0, a_{j+1}, \ldots, a_N) \in B(0, \frac{\eta_\infty}{2K(c)})$.
2.5. Analyticity. The proofs of Theorem 3 and Lemma 2 rely on the following result, which is of independent interest.

Proposition 2.3. Let $v$ be a finite energy solution of (TWc) on $\mathbb{R}^N$, with $0 \leq c < \sqrt{2}$. Then, each component of $v$ is real-analytic on $\mathbb{R}^N$.

This is a consequence of the next more general result.

Theorem 2.1. Let $N \geq 2$ and let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$, with $0 \leq c < \sqrt{2}$. There exists some number $\lambda_0 > 0$, possibly depending on $v$, such that $\mathrm{Re}(v)$ and $\mathrm{Im}(v)$ extend to analytic functions on the cylinder $C_{\lambda_0} = \{z \in \mathbb{C}^N, \ |\mathrm{Im}(z)| < \lambda_0\}$.

Proof. The argument is reminiscent of [6,7] (see also [32,35,36]). The idea is to prove the convergence of the Taylor series of $v$,
$$T_{v,x}(z) = \sum_{\alpha \in \mathbb{N}^N} \frac{\partial^\alpha v(x)}{\alpha!} (z - x)^\alpha, \tag{2.21}$$
on a complex neighbourhood of an arbitrary point $x \in \mathbb{R}^N$, the required estimates for the derivatives being provided by the partial differential equation, standard $L^q$-multiplier theory, and Sobolev's embedding theorem. We apply here this strategy to the functions $v_1 = \mathrm{Re}(v) - 1$ and $v_2 = \mathrm{Im}(v)$, which satisfy the equations
$$\Delta^2 v_1 - 2 \Delta v_1 + c^2 \partial_1^2 v_1 = \Delta F_1(v_1, v_2) + c \partial_1 F_2(v_1, v_2), \tag{2.22}$$
$$\Delta^2 v_2 - 2 \Delta v_2 + c^2 \partial_1^2 v_2 = -c \partial_1 F_1(v_1, v_2) - 2 F_2(v_1, v_2) + \Delta F_2(v_1, v_2), \tag{2.23}$$
where the functions $F_1$ and $F_2$ are defined from $\mathbb{C}^2$ to $\mathbb{C}$ by
$$F_1(z_1, z_2) = 3 z_1^2 + z_2^2 + z_1^3 + z_1 z_2^2, \quad F_2(z_1, z_2) = 2 z_1 z_2 + z_1^2 z_2 + z_2^3. \tag{2.24}$$
Indeed, by Eq. (TWc),
$$\Delta v_1 - c \partial_1 v_2 - 2 v_1 = F_1(v_1, v_2), \tag{2.25}$$
$$\Delta v_2 + c \partial_1 v_1 = F_2(v_1, v_2), \tag{2.26}$$
so that Eq. (2.22) is derived applying the differential operator $\Delta$ to Eq. (2.25), the operator $c \partial_1$ to Eq. (2.26), and adding the corresponding relations, whereas Eq. (2.23) is derived applying the differential operator $\Delta - 2$ to Eq. (2.26), the operator $-c \partial_1$ to Eq. (2.25), and adding the corresponding relations. Taking the Fourier transforms of Eqs. (2.22) and (2.23) and denoting $H_{j,k}$, $H_{1,j,k}$ and $K_{j,k}$, the kernels defined by
$$\widehat{H}_{j,k}(\xi) = \frac{\xi_j \xi_k |\xi|^2}{|\xi|^4 + 2|\xi|^2 - c^2 \xi_1^2}, \quad \widehat{H}_{1,j,k}(\xi) = \frac{\xi_1 \xi_j \xi_k}{|\xi|^4 + 2|\xi|^2 - c^2 \xi_1^2}, \quad \widehat{K}_{j,k}(\xi) = \frac{\xi_j \xi_k (2 + |\xi|^2)}{|\xi|^4 + 2|\xi|^2 - c^2 \xi_1^2},$$
for any $1 \leq j, k \leq N$, Eqs. (2.22) and (2.23) may be recast as
$$\partial^2_{jk} v_1 = H_{j,k} * F_1(v_1, v_2) - i c H_{1,j,k} * F_2(v_1, v_2), \tag{2.27}$$
$$\partial^2_{jk} v_2 = i c H_{1,j,k} * F_1(v_1, v_2) + K_{j,k} * F_2(v_1, v_2). \tag{2.28}$$
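The polynomials $F_1$ and $F_2$ of (2.24) collect the nonlinear part of (TWc) once one writes $v = 1 + v_1 + i v_2$: the real and imaginary parts of $v(1 - |v|^2)$ should equal $-2v_1 - F_1(v_1, v_2)$ and $-F_2(v_1, v_2)$ respectively, which is what (2.25) and (2.26) encode. The following short sketch checks this algebra numerically at random real points; the function names and sampling range are illustrative assumptions.

```python
import random


def F1(z1, z2):
    """F1 from (2.24)."""
    return 3 * z1 ** 2 + z2 ** 2 + z1 ** 3 + z1 * z2 ** 2


def F2(z1, z2):
    """F2 from (2.24)."""
    return 2 * z1 * z2 + z1 ** 2 * z2 + z2 ** 3


def nonlinearity_residual(v1, v2):
    """|v(1 - |v|^2) - (-2*v1 - F1 - i*F2)| for v = 1 + v1 + i*v2."""
    v = complex(1.0 + v1, v2)
    nonlinear = v * (1.0 - abs(v) ** 2)
    expected = complex(-2.0 * v1 - F1(v1, v2), -F2(v1, v2))
    return abs(nonlinear - expected)


def max_residual(samples=1000, seed=0):
    """Largest residual over random real pairs (v1, v2) in [-1, 1]^2."""
    rng = random.Random(seed)
    return max(
        nonlinearity_residual(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        for _ in range(samples)
    )
```

A residual at the level of floating-point rounding is consistent with the splitting used in (2.25) and (2.26).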
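In order to compute $L^q$-estimates of $\partial^2_{jk} v_1$ and $\partial^2_{jk} v_2$, we show that $\widehat{H}_{j,k}$, $\widehat{H}_{1,j,k}$ and $\widehat{K}_{j,k}$ are $L^q$-multipliers for any $1 < q < +\infty$. For that purpose, we use Theorem 8 of [33].²

Theorem 2.2 ([33]). Let $0 \leq \alpha < 1$. Consider a bounded function $\widehat{K}$ in $C^N(\mathbb{R}^N \setminus \{0\})$, and assume that
$$\prod_{j=1}^{N} \xi_j^{\alpha + k_j}\, \partial_1^{k_1} \cdots \partial_N^{k_N} \widehat{K}(\xi) \in L^\infty(\mathbb{R}^N) \tag{2.29}$$
for any $(k_1, \ldots, k_N) \in \{0, 1\}^N$ such that $k_1 + \ldots + k_N \leq N$. Then, $\widehat{K}$ is a multiplier from $L^q(\mathbb{R}^N)$ to $L^{\frac{q}{1 - \alpha q}}(\mathbb{R}^N)$ for any $1 < q < \frac{1}{\alpha}$ (with the usual convention $\frac{1}{0} = +\infty$). More precisely, there exists a constant $K(q)$ only depending on $q$, such that the multiplier operator $\mathcal{K}$ defined by
$$\widehat{\mathcal{K}(f)}(\xi) = \widehat{K}(\xi) \widehat{f}(\xi),$$
satisfies for any $1 < q < \frac{1}{\alpha}$,
$$\|\mathcal{K}(f)\|_{L^{\frac{q}{1 - \alpha q}}(\mathbb{R}^N)} \leq K(q) M(\widehat{K}) \|f\|_{L^q(\mathbb{R}^N)}, \tag{2.30}$$
where
$$M(\widehat{K}) \equiv \sup\Big\{ \prod_{j=1}^{N} |\xi_j|^{\alpha + k_j} \big| \partial_1^{k_1} \cdots \partial_N^{k_N} \widehat{K}(\xi) \big|, \ \xi \in \mathbb{R}^N, \ (k_1, \ldots, k_N) \in \{0, 1\}^N, \ k_1 + \ldots + k_N \leq N \Big\}. \tag{2.31}$$

By an inductive argument, the kernels $\widehat{H}_{j,k}$, $\widehat{H}_{1,j,k}$ and $\widehat{K}_{j,k}$ satisfy assumption (2.29) for $\alpha = 0$ and any $(k_1, \ldots, k_N) \in \{0, 1\}^N$ such that $k_1 + \ldots + k_N \leq N$. Therefore, they are $L^q$-multipliers for any $1 < q < +\infty$. This implies

Step 1. Let $1 \leq j, k \leq N$, $\alpha \in \mathbb{N}^N$ and $1 < q < +\infty$. There exists some positive number $K_1(q)$, possibly depending on $q$, but not on $\alpha$, such that
$$\|\partial^\alpha \partial^2_{jk} v_1\|_{L^q(\mathbb{R}^N)} + \|\partial^\alpha \partial^2_{jk} v_2\|_{L^q(\mathbb{R}^N)} \leq K_1(q) \Big( \|\partial^\alpha F_1(v_1, v_2)\|_{L^q(\mathbb{R}^N)} + \|\partial^\alpha F_2(v_1, v_2)\|_{L^q(\mathbb{R}^N)} \Big). \tag{2.32}$$
By [20], $\partial^\alpha v_1$ and $\partial^\alpha v_2$ belong to $L^q(\mathbb{R}^N)$ for any $\frac{N}{N-1} < q < +\infty$, if $\alpha = 0$, and for any $1 < q < +\infty$ otherwise. Using the chain rule, it follows that $\partial^\alpha F_1(v_1, v_2)$ and $\partial^\alpha F_2(v_1, v_2)$ are in $L^q(\mathbb{R}^N)$ for any $1 < q < +\infty$. On the other hand, by (2.27) and (2.28),
$$\partial^\alpha \partial^2_{jk} v_1 = H_{j,k} * \partial^\alpha F_1(v_1, v_2) - i c H_{1,j,k} * \partial^\alpha F_2(v_1, v_2),$$
$$\partial^\alpha \partial^2_{jk} v_2 = i c H_{1,j,k} * \partial^\alpha F_1(v_1, v_2) + K_{j,k} * \partial^\alpha F_2(v_1, v_2),$$
and inequality (2.32) follows from the fact that $\widehat{H}_{j,k}$, $\widehat{H}_{1,j,k}$ and $\widehat{K}_{j,k}$ are $L^q$-multipliers for any $1 < q < +\infty$. We are now in position to obtain uniform estimates of $\partial^\alpha v_1$ and $\partial^\alpha v_2$.

² Estimate (2.30) in Theorem 2.2 is more precisely a consequence of the proof of Theorem 8, and Lemma 6 of [33].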
Step 2. Let $1 \leq j \leq N$, $\alpha \in \mathbb{N}^N$ and $\frac{N}{2} < q < +\infty$. There exist some positive numbers $K_2(q)$ and $K_3(q)$, possibly depending on $q$, but not on $\alpha$, such that
$$\|\partial^\alpha v_1\|_{L^\infty(\mathbb{R}^N)} + \|\partial^\alpha v_2\|_{L^\infty(\mathbb{R}^N)} \leq K_2(q) F_q(\alpha), \qquad \|\partial^\alpha \partial_j v_1\|_{L^q(\mathbb{R}^N)} + \|\partial^\alpha \partial_j v_2\|_{L^q(\mathbb{R}^N)} \leq K_3(q) F_q(\alpha), \tag{2.33}$$
where we have set
$$F_q(\alpha) = \max_{0 \leq \beta \leq \alpha} \Big( \|\partial^\beta F_1(v_1, v_2)\|_{L^q(\mathbb{R}^N)} + \|\partial^\beta F_2(v_1, v_2)\|_{L^q(\mathbb{R}^N)} \Big). \tag{2.34}$$
By Sobolev's embedding theorem and inequality (2.32), we have
$$\|\partial^\alpha v_1\|_{L^\infty(\mathbb{R}^N)} \leq K_S(q) \Big( \|\partial^\alpha v_1\|_{L^q(\mathbb{R}^N)} + \|\partial^\alpha d^2 v_1\|_{L^q(\mathbb{R}^N)} \Big) \leq 2 K_S(q) K_1(q) F_q(\alpha).$$
Using the same argument for $\partial^\alpha v_2$, the first estimate of (2.33) holds with $K_2(q) = 4 K_S(q) K_1(q)$. On the other hand, using Gagliardo-Nirenberg's inequality and inequality (2.32), we are led to
$$\|\partial^\alpha \partial_j v_1\|_{L^q(\mathbb{R}^N)} \leq K_{GN}(q) \|\partial^\alpha v_1\|_{L^q(\mathbb{R}^N)}^{\frac{1}{2}} \|\partial^\alpha d^2 v_1\|_{L^q(\mathbb{R}^N)}^{\frac{1}{2}} \leq K_{GN}(q) K_1(q) F_q(\alpha),$$
and the second inequality of (2.33) also holds with $K_3(q) = 2 K_{GN}(q) K_1(q)$.

We now come back to the convergence of the Taylor series $T_{v_1,x}(z)$ and $T_{v_2,x}(z)$, defined in (2.21). Using the uniform estimates of Step 2, it suffices to prove the convergence of the series
$$S_{q,x_0}(z) = \sum_{\alpha \in \mathbb{N}^N} \frac{F_q(\alpha)}{\alpha!} |z - x_0|^{|\alpha|}, \tag{2.35}$$
for $z$ sufficiently close to $x_0$, and some suitable exponent $q$. This follows from the next estimate.

Step 3. Let $\alpha \in \mathbb{N}^N$ and $\frac{N}{2} < q < +\infty$. There exists some positive number $K_4(q)$, possibly depending on $q$, but not on $\alpha$, such that
$$F_q(\alpha) \leq K_4(q)^{|\alpha|} \alpha^{\tilde{\alpha}}, \tag{2.36}$$
where we have set $\tilde{\alpha} = (\max\{\alpha_1 - 1, 0\}, \ldots, \max\{\alpha_N - 1, 0\})$.

The proof is by induction on $l = |\alpha|$. Inequality (2.36) being valid for $0 \leq l \leq 5$ and any constant $K_4(q)$ sufficiently large, we assume that it holds for any multi-index $\alpha$ such that $|\alpha| \leq l$, and consider some multi-index $\alpha = (\beta_1, \ldots, \beta_j + 1, \ldots, \beta_N)$ such that $|\alpha| = |\beta| + 1 = l + 1$. Using the inductive assumption for $\beta$, we claim that

Claim 1. Let $(a, b, c) \in \{1, 2\}^3$. Then,
$$\|\partial_j \partial^\beta (v_a v_b)\|_{L^q(\mathbb{R}^N)} \leq 2^{2N+1} K_2(q) K_3(q) K_4(q)^l \beta^{\tilde{\beta}}, \qquad \|\partial_j \partial^\beta (v_a v_b v_c)\|_{L^q(\mathbb{R}^N)} \leq 4^{2N+1} K_2(q) K_3(q)^2 K_4(q)^l \beta^{\tilde{\beta}}. \tag{2.37}$$
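We postpone the proof of Claim 1 and first complete the proof of Step 3. By definitions (2.24) and estimates (2.37),
$$\|\partial_j \partial^\beta F_1(v_1, v_2)\|_{L^q(\mathbb{R}^N)} + \|\partial_j \partial^\beta F_2(v_1, v_2)\|_{L^q(\mathbb{R}^N)} \leq 4^{2N+2} K_2(q) K_3(q) (1 + K_3(q)) K_4(q)^l \beta^{\tilde{\beta}},$$
so that choosing $K_4(q) = 4^{2N+2} K_2(q) K_3(q) (1 + K_3(q))$, and using the above definition of $\alpha$, we are led to
$$\|\partial^\alpha F_1(v_1, v_2)\|_{L^q(\mathbb{R}^N)} + \|\partial^\alpha F_2(v_1, v_2)\|_{L^q(\mathbb{R}^N)} \leq K_4(q)^{l+1} \beta^{\tilde{\beta}},$$
which finally gives $F_q(\alpha) \leq K_4(q)^{l+1} \alpha^{\tilde{\alpha}}$, and completes the inductive proof of Step 3.

Proof of Claim 1. Let $(a, b) \in \{1, 2\}^2$. The Leibniz rule gives
$$\|\partial_j \partial^\beta (v_a v_b)\|_{L^q} \leq \sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \Big( \|\partial_j \partial^\gamma v_a\|_{L^q} \|\partial^{\beta - \gamma} v_b\|_{L^\infty} + \|\partial^\gamma v_a\|_{L^\infty} \|\partial_j \partial^{\beta - \gamma} v_b\|_{L^q} \Big),$$
so that using estimates (2.33),
$$\|\partial_j \partial^\beta (v_a v_b)\|_{L^q(\mathbb{R}^N)} \leq 2 \sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} K_2(q) K_3(q) F_q(\gamma) F_q(\beta - \gamma).$$
Hence, by the inductive assumption,
$$\|\partial_j \partial^\beta (v_a v_b)\|_{L^q(\mathbb{R}^N)} \leq 2 K_2(q) K_3(q) K_4(q)^l \sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\tilde{\gamma}} (\beta - \gamma)^{\widetilde{\beta - \gamma}}. \tag{2.38}$$
At this stage, we require the next elementary lemma.

Lemma 2.7. Let $\beta \in \mathbb{N}^N \setminus \{0\}$. Then,
$$\sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\tilde{\gamma}} (\beta - \gamma)^{\widetilde{\beta - \gamma}} \leq 4^N \beta^{\tilde{\beta}}. \tag{2.39}$$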
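Proof of Lemma 2.7. Assume first that $N = 1$ and $\beta \geq 2$. By Abel's identity (as in [35]),
$$\sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\tilde{\gamma}} (\beta - \gamma)^{\widetilde{\beta - \gamma}} = 2 \beta^{\beta - 1} + \sum_{1 \leq \gamma \leq \beta - 1} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\gamma - 1} (\beta - \gamma)^{\beta - \gamma - 1} = 2 (2\beta - 1) \beta^{\beta - 2},$$
so that
$$\sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\tilde{\gamma}} (\beta - \gamma)^{\widetilde{\beta - \gamma}} \leq 4 \beta^{\beta - 1} = 4 \beta^{\tilde{\beta}}.$$
Since this inequality is straightforward in the cases $\beta = 0$ or $\beta = 1$, (2.39) is proved for $N = 1$. If $N > 1$, we invoke a little algebra and the one-dimensional case to conclude by the relations
$$\sum_{0 \leq \gamma \leq \beta} \frac{\beta!}{\gamma! (\beta - \gamma)!} \gamma^{\tilde{\gamma}} (\beta - \gamma)^{\widetilde{\beta - \gamma}} = \prod_{i=1}^{N} \sum_{0 \leq \gamma_i \leq \beta_i} \frac{\beta_i!}{\gamma_i! (\beta_i - \gamma_i)!} \gamma_i^{\tilde{\gamma}_i} (\beta_i - \gamma_i)^{\widetilde{\beta_i - \gamma_i}} \leq \prod_{i=1}^{N} 4 \beta_i^{\tilde{\beta}_i} = 4^N \beta^{\tilde{\beta}}.$$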
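The one-dimensional case of Lemma 2.7 can be checked with exact integer arithmetic. The sketch below evaluates the sum for $N = 1$, with the convention $0^0 = 1$ and the exponent $\tilde{\gamma} = \max\{\gamma - 1, 0\}$, and compares it with the closed form $2(2\beta - 1)\beta^{\beta - 2}$ given by Abel's identity; the helper names are illustrative.

```python
from math import comb


def tilde_power(base):
    """base ** max(base - 1, 0), with the convention 0 ** 0 = 1."""
    return base ** max(base - 1, 0)


def abel_sum(beta):
    """Left-hand side of (2.39) for N = 1."""
    return sum(
        comb(beta, gamma) * tilde_power(gamma) * tilde_power(beta - gamma)
        for gamma in range(beta + 1)
    )


def closed_form(beta):
    """Closed form 2 * (2*beta - 1) * beta ** (beta - 2), valid for beta >= 2."""
    return 2 * (2 * beta - 1) * beta ** (beta - 2)
```

For instance, `abel_sum(3)` and `closed_form(3)` both evaluate to 30, which is below the bound $4 \cdot 3^{2} = 36$.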
Using Lemma 2.7, we are in position to prove Claim 1. Inequalities (2.38) and (2.39) yield the first inequality of (2.37). The second follows using the Leibniz rule, uniform estimates (2.33) and Lemma 2.7 twice.

Step 4. The Taylor series $T_{v_1,x_0}(z)$ and $T_{v_2,x_0}(z)$ are uniformly convergent with respect to $x_0 \in \mathbb{R}^N$ on the set $D(x_0, \lambda_0) = \{z \in \mathbb{C}^N, \ |z - x_0| < \lambda_0\}$, where $\lambda_0 = \frac{1}{e K_4(N)}$.

We choose $q = N$. By Stirling's formula, the series
$$\sum_{\alpha \geq 0} K_4(N)^{|\alpha|} \frac{\alpha^\alpha}{\alpha!} z^\alpha$$
converges on $D(0, \lambda_0)$, so that using estimate (2.36), the series $S_{N,x_0}(z)$ converges for any $z \in D(x_0, \lambda_0)$, uniformly with respect to $x_0 \in \mathbb{R}^N$. By assertion (2.35), $T_{v_1,x_0}(z)$ and $T_{v_2,x_0}(z)$ converge the same way.

By Step 4, we conclude that there exist some positive number $\lambda_0$ and two analytic functions $V_1$ and $V_2$ on $C_{\lambda_0}$, such that $v_1$, respectively $v_2$, are identically equal to $V_1$, respectively $V_2$, on $\mathbb{R}^N$. Therefore, $\mathrm{Re}(v) = 1 + v_1$ and $\mathrm{Im}(v) = v_2$ extend to analytic functions $1 + V_1$ and $V_2$ on $C_{\lambda_0}$, which completes the proof of Theorem 2.1.

2.6. Solutions without vortices. In this subsection, we consider only solutions $v$ to (TWc) on $\mathbb{R}^N$ which do not vanish. In particular, we assume throughout that
$$|v| \geq \frac{1}{2}, \tag{2.40}$$
so that $v$ may be written as in (2.11), $v = \varrho \exp i\varphi$. Using (2.12), the energy is written in the variables $\varrho$ and $\varphi$,
$$E(v) = E(\varrho, \varphi) \equiv \frac{1}{2} \int_{\mathbb{R}^N} \Big( |\nabla \varrho|^2 + \varrho^2 |\nabla \varphi|^2 + \frac{(1 - \varrho^2)^2}{2} \Big), \tag{2.41}$$
whereas for the momentum, we have $\langle i \partial_1 v, v \rangle = -\varrho^2 \partial_1 \varphi$. Therefore, it follows from Proposition 2.2 that
$$p(v) = \tilde{p}(v) = \frac{1}{2} \int_{\mathbb{R}^N} \Big( -\varrho^2 \partial_1 \varphi + \partial_1 \big( (1 - \chi) \varphi \big) \Big) = \frac{1}{2} \int_{\mathbb{R}^N} (1 - \varrho^2) \partial_1 \varphi. \tag{2.42}$$
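The system for $\varrho$ and $\varphi$ is written
$$\begin{cases} \dfrac{c}{2} \partial_1 \varrho^2 + \mathrm{div}\big( \varrho^2 \nabla \varphi \big) = 0, \\[2mm] c \varrho \partial_1 \varphi - \Delta \varrho - \varrho \big( 1 - \varrho^2 \big) + \varrho |\nabla \varphi|^2 = 0. \end{cases} \tag{2.43}$$
Notice that the quantity $\eta = 1 - \varrho^2$ satisfies the equation
$$\Delta^2 \eta - 2 \Delta \eta + c^2 \partial_1^2 \eta = -2 \Delta \big( |\nabla v|^2 + \eta^2 - c \eta \partial_1 \varphi \big) - 2 c \partial_1 \mathrm{div}(\eta \nabla \varphi), \tag{2.44}$$
where the l.h.s is linear with respect to $\eta$, whereas the r.h.s is quadratic. A first elementary remark is

Lemma 2.8. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$ satisfying (2.40). Then, we have the identity
$$c\, p(v) = \int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2. \tag{2.45}$$

Proof. The identity is obtained multiplying the first equation in (2.43) by $\varphi$ and integrating by parts using the decay properties of Remark 2.1.

Lemma 2.8 has the following remarkable consequence.

Lemma 2.9. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$ satisfying (2.40). Then,
$$E(v) \leq 7 c(v)^2 \int_{\mathbb{R}^N} \eta^2.$$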
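Proof. We have, in view of assumption (2.40), Lemma 2.8 and the Cauchy-Schwarz inequality,
$$\int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2 = \frac{c}{2} \int_{\mathbb{R}^N} (1 - \varrho^2) \partial_1 \varphi \leq c \Big( \int_{\mathbb{R}^N} \eta^2 \Big)^{\frac{1}{2}} \Big( \int_{\mathbb{R}^N} |\nabla \varphi|^2 \Big)^{\frac{1}{2}} \leq 2c \Big( \int_{\mathbb{R}^N} \eta^2 \Big)^{\frac{1}{2}} \Big( \int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2 \Big)^{\frac{1}{2}}.$$
Hence, we deduce
$$\int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2 \leq 4 c^2 \int_{\mathbb{R}^N} \eta^2. \tag{2.46}$$
It remains to bound the integral of $|\nabla \varrho|^2$. For that purpose we multiply the second equation in (2.43) by $\varrho^2 - 1$ and integrate by parts on $\mathbb{R}^N$ using the decay properties of Remark 2.1. This yields
$$\int_{\mathbb{R}^N} \varrho \Big( 2 |\nabla \varrho|^2 + (1 - \varrho^2)^2 \Big) = c \int_{\mathbb{R}^N} \varrho (1 - \varrho^2) \partial_1 \varphi + \int_{\mathbb{R}^N} \varrho (1 - \varrho^2) |\nabla \varphi|^2, \tag{2.47}$$
so that
$$\int_{\mathbb{R}^N} \Big( |\nabla \varrho|^2 + \frac{1}{2} (1 - \varrho^2)^2 \Big) \leq c \Big( \int_{\mathbb{R}^N} \varrho^2 |\partial_1 \varphi|^2 \Big)^{\frac{1}{2}} \Big( \int_{\mathbb{R}^N} (1 - \varrho^2)^2 \Big)^{\frac{1}{2}} + 2 \int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2,$$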
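by (2.40) and the Cauchy-Schwarz inequality. Invoking (2.46) in order to bound the r.h.s of this inequality in terms of the integral of $\eta^2$, we deduce the desired inequality.

2.7. Subsonic vortexless solutions. We next specify a little further the analysis assuming that the solution $v$ verifies the additional condition
$$0 < c(v) < \sqrt{2}. \tag{2.48}$$
We set, for such a solution,
$$\varepsilon(v) = \sqrt{2 - c(v)^2}.$$
We first have the bound

Proposition 2.4. Let $v$ be a non-trivial finite energy solution to (TWc) on $\mathbb{R}^N$ satisfying (2.48). Then,
$$\big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^N)} \geq \frac{\varepsilon(v)^2}{10}.$$

Proof. Set $\delta = \| 1 - |v| \|_{L^\infty(\mathbb{R}^N)}$. If $\delta \geq \frac{1}{2}$, then the proof is straightforward. Otherwise, assumption (2.40) is satisfied and going back to identity (2.45), we observe that by Lemma 2.3,
$$c\, p(v) = \frac{c}{2} \int_{\mathbb{R}^N} (1 - \varrho^2) \partial_1 \varphi \leq \frac{c}{\sqrt{2} (1 - \delta)} \int_{\mathbb{R}^N} e(v),$$
so that, using identity (2.42) and Lemma 2.8,
$$\int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2 \leq \frac{c}{\sqrt{2} (1 - \delta)} \int_{\mathbb{R}^N} e(v). \tag{2.49}$$
Similarly, we have for the r.h.s of identity (2.47)
$$c \int_{\mathbb{R}^N} \varrho (1 - \varrho^2) \partial_1 \varphi \leq \sqrt{2}\, c \int_{\mathbb{R}^N} e(v), \tag{2.50}$$
and using (2.40),
$$\int_{\mathbb{R}^N} \varrho (1 - \varrho^2) |\nabla \varphi|^2 \leq 6 \delta \int_{\mathbb{R}^N} e(v). \tag{2.51}$$
Combining (2.47), (2.50) and (2.51), and using the fact that $\varrho \geq 1 - \delta > 0$, we are led to
$$\int_{\mathbb{R}^N} \Big( \frac{|\nabla \varrho|^2}{2} + \frac{(1 - \varrho^2)^2}{4} \Big) \leq \frac{\sqrt{2}\, c + 6 \delta}{4 (1 - \delta)} \int_{\mathbb{R}^N} e(v). \tag{2.52}$$
Finally, from (2.49) and (2.52), we derive
$$\lambda \int_{\mathbb{R}^N} e(v) \leq 0,$$
where we have set $\lambda = 1 - \frac{c}{\sqrt{2} (1 - \delta)} - \frac{3 \delta}{2 (1 - \delta)}$. Since $v$ is non-trivial, its energy is not equal to 0 so that $\lambda \leq 0$ and
$$\delta \geq \frac{2}{5} \Big( 1 - \frac{c}{\sqrt{2}} \Big) = \frac{2}{5} \Big( 1 - \sqrt{1 - \frac{\varepsilon^2}{2}} \Big) \geq \frac{\varepsilon^2}{10},$$
which is the desired inequality.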
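The last chain of inequalities in the proof of Proposition 2.4 can be unpacked as follows. This is only a sketch of the elementary algebra, using $\lambda \leq 0$, $0 \leq \delta < \frac{1}{2}$ and $c = \sqrt{2 - \varepsilon^2}$; the first line is obtained by multiplying $\lambda$ by $1 - \delta > 0$.

```latex
\begin{align*}
\lambda \leq 0
  &\iff (1 - \delta) - \frac{c}{\sqrt{2}} - \frac{3\delta}{2} \leq 0
   \iff \delta \geq \frac{2}{5}\Big(1 - \frac{c}{\sqrt{2}}\Big), \\
1 - \frac{c}{\sqrt{2}}
  &= 1 - \sqrt{1 - \frac{\varepsilon^2}{2}}
   = \frac{\varepsilon^2 / 2}{1 + \sqrt{1 - \varepsilon^2 / 2}}
   \geq \frac{\varepsilon^2}{4}, \\
\delta &\geq \frac{2}{5} \cdot \frac{\varepsilon^2}{4} = \frac{\varepsilon^2}{10}.
\end{align*}
```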
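Combining Lemma 2.5 with Lemma 2.8, we are led to

Lemma 2.10. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$ satisfying (2.40) and (2.48). Then,
$$\Sigma(v) + \frac{1}{N} \int_{\mathbb{R}^N} |\nabla \varrho|^2 = \frac{\varepsilon(v)^2}{\sqrt{2} + c(v)}\, p(v). \tag{2.53}$$
Moreover, if $N = 2$, then we have
$$\int_{\mathbb{R}^2} |\nabla \varrho|^2 \Big( 1 + \frac{1}{\varrho^2} \Big) = \int_{\mathbb{R}^2} \eta |\nabla \varphi|^2. \tag{2.54}$$

Proof. For the proof of equality (2.53), we first add equality (2.16) and the identity provided in Lemma 2.8, to obtain
$$\frac{N - 2}{2} \int_{\mathbb{R}^N} |\nabla \varrho|^2 + \frac{N}{2} \int_{\mathbb{R}^N} \varrho^2 |\nabla \varphi|^2 + \frac{N}{4} \int_{\mathbb{R}^N} \eta^2 = c N p(v),$$
using the identity $|\nabla v|^2 = |\nabla \varrho|^2 + \varrho^2 |\nabla \varphi|^2$. This yields
$$E(v) - c\, p(v) = \frac{1}{N} \int_{\mathbb{R}^N} |\nabla \varrho|^2,$$
and equality (2.53) follows from the definitions of $\Sigma(v)$ and $\varepsilon(v)$. For equality (2.54), we multiply the second equation in (2.43) by $\frac{\eta}{\varrho}$, and integrate on $\mathbb{R}^N$. This yields
$$\int_{\mathbb{R}^N} \Big( c \eta \partial_1 \varphi - \frac{\Delta \varrho}{\varrho} \eta - \eta^2 + \eta |\nabla \varphi|^2 \Big) = 0.$$
Since by definition $\eta = 1 - \varrho^2$, we obtain, integrating by parts,
$$\int_{\mathbb{R}^N} \frac{\Delta \varrho}{\varrho} \eta = -\int_{\mathbb{R}^N} \nabla \varrho \cdot \nabla \Big( \frac{1 - \varrho^2}{\varrho} \Big) = \int_{\mathbb{R}^N} |\nabla \varrho|^2 \Big( 1 + \frac{1}{\varrho^2} \Big).$$
On the other hand, for $N = 2$, identities (2.16) and (2.42) yield
$$\int_{\mathbb{R}^2} \eta^2 = 2 c\, p(v) = c \int_{\mathbb{R}^2} \eta \partial_1 \varphi.$$
Conclusion (2.54) follows combining the last three relations.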
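2.8. Use of Fourier transform. We consider for $\xi \in \mathbb{R}^N$, and a function $f$ defined on $\mathbb{R}^N$, its Fourier transform $\widehat{f}(\xi)$ defined by the integral
\[ \widehat{f}(\xi) = \int_{\mathbb{R}^N} f(x) e^{-i x \cdot \xi}\, dx. \]

Lemma 2.11. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^N$ satisfying (2.40). Then, for any $\xi \in \mathbb{R}^N$, we have
\[ \Big( |\xi|^2 + 2 - c^2 \frac{\xi_1^2}{|\xi|^2} \Big) \widehat{\eta}(\xi) = 2 \widehat{R_0}(\xi) - 2c \sum_{j=2}^N \frac{\xi_j^2}{|\xi|^2} \widehat{R_1}(\xi) + 2c \sum_{j=2}^N \frac{\xi_1 \xi_j}{|\xi|^2} \widehat{R_j}(\xi), \tag{2.55} \]
where we have set $R_0 = |\nabla v|^2 + \eta^2$ and $R_j = \eta \partial_j \varphi$.

Proof. It suffices to consider the Fourier transform of (2.44).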
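Using (2.55), we deduce in the two-dimensional case,

Lemma 2.12. Let $N = 2$ and let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^2$ satisfying (2.40) and (2.48). Then there exists some universal constant $K > 0$ such that
\[ \varepsilon(v) \leq K E(v). \]

Proof. We first notice that for any integrable function $f$ on $\mathbb{R}^N$ and any $\xi \in \mathbb{R}^N$, we have in view of the definition of $\widehat{f}(\xi)$, $|\widehat{f}(\xi)| \leq \| f \|_{L^1(\mathbb{R}^N)}$, so that
\[ \| \widehat{R_i} \|_{L^\infty(\mathbb{R}^N)} \leq \| R_i \|_{L^1(\mathbb{R}^N)} \leq K E(v). \tag{2.56} \]
It follows from integrating (2.55) that we have for any $1 \leq q \leq +\infty$,
\[ \int_{\mathbb{R}^N} |\widehat{\eta}(\xi)|^q\, d\xi \leq K (1 + c^q) \Big( \int_{\mathbb{R}^N} |L_\varepsilon(\xi)|^q\, d\xi \Big) E(v)^q, \]
where we have set, for any $\xi \in \mathbb{R}^N$,
\[ L_\varepsilon(\xi) = \frac{1}{|\xi|^2 + 2 - c^2 \frac{\xi_1^2}{|\xi|^2}} = \frac{|\xi|^2}{|\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2}. \tag{2.57} \]
In the case $q = 2$, this leads in view of Plancherel's formula to
\[ \int_{\mathbb{R}^N} \eta(x)^2\, dx \leq K (1 + c^2) \Big( \int_{\mathbb{R}^N} |L_\varepsilon(\xi)|^2\, d\xi \Big) E(v)^2. \tag{2.58} \]
We claim that
\[ \int_{\mathbb{R}^2} L_\varepsilon(\xi)^2\, d\xi = \frac{\pi}{\sqrt{2}\, \varepsilon(c)}. \tag{2.59} \]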
We postpone the proof of the claim, and complete the proof of Lemma 2.12. Combining (2.48), (2.58) and Lemma 2.9, we have
\[ E(v) \leq 7 c^2 \int_{\mathbb{R}^N} \eta^2 \leq \frac{K}{\varepsilon(c)} E(v)^2, \]
where we used claim (2.59) for the second inequality. This concludes the proof of Lemma 2.12.
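Proof of Claim (2.59). Introducing polar coordinates, we have
\[ \int_{\mathbb{R}^2} L_\varepsilon(\xi)^2\, d\xi = \int_0^{2\pi} \int_0^{+\infty} \frac{r\, dr\, d\theta}{(r^2 + 2 - c^2 \cos^2(\theta))^2} = \frac{1}{2} \int_0^{2\pi} \frac{d\theta}{2 - c^2 \cos^2(\theta)}. \]
Using the change of variables $t = \tan(\theta)$ and $u = \sqrt{\frac{2}{2 - c^2}}\, t$, we obtain
\[ \int_{\mathbb{R}^2} L_\varepsilon(\xi)^2\, d\xi = 2 \int_0^{+\infty} \frac{dt}{2 - c^2 + 2 t^2} = \sqrt{\frac{2}{2 - c^2}} \int_0^{+\infty} \frac{du}{1 + u^2} = \frac{\pi}{\sqrt{2}\, \varepsilon(c)}. \]

In dimension three, the previous analysis leads to a result of a very different nature.

Lemma 2.13. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^3$ satisfying (2.40) and (2.48). Then, there exists some universal constant $K > 0$ such that
\[ E(v) \geq \mathcal{E}_0(c) \equiv \frac{K}{c (1 + c^2) \arcsin\big( \frac{c}{\sqrt{2}} \big)}. \]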
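Proof. The argument is parallel to the argument of Lemma 2.12 up to (2.58). However, in contrast with (2.59) we have in dimension three,
\[ \int_{\mathbb{R}^3} L_\varepsilon(\xi)^2\, d\xi = \frac{\pi^2}{c} \arcsin\Big( \frac{c}{\sqrt{2}} \Big). \tag{2.60} \]
We postpone the proof of (2.60), and complete the proof of Lemma 2.13. By (2.58) and Lemma 2.9, there exists some constant $K > 0$ such that
\[ E(v) \leq 7 c^2 \int_{\mathbb{R}^3} \eta^2 \leq K c^2 (1 + c^2) \Big( \int_{\mathbb{R}^3} L_\varepsilon(\xi)^2\, d\xi \Big) E(v)^2. \]
Lemma 2.13 follows using (2.60).

Proof of Claim (2.60). Introducing spherical coordinates, and using the change of variables $u = \cos(\theta)$, we have
\[ \int_{\mathbb{R}^3} L_\varepsilon(\xi)^2\, d\xi = 2\pi \int_0^{+\infty} \int_0^{\pi} \frac{r^2 \sin(\theta)\, d\theta\, dr}{(r^2 + 2 - c^2 \cos^2(\theta))^2} = 4\pi \int_0^{+\infty} \int_0^1 \frac{r^2\, du}{(r^2 + 2 - c^2 u^2)^2}\, dr. \]
An integration by parts with respect to the variable $r$ gives
\[ \int_{\mathbb{R}^3} L_\varepsilon(\xi)^2\, d\xi = 2\pi \int_0^{+\infty} \int_0^1 \frac{du\, dr}{r^2 + 2 - c^2 u^2}. \]
Using the change of variables $v = \frac{r}{\sqrt{2 - c^2 u^2}}$ and $w = \frac{c u}{\sqrt{2}}$, we obtain
\[ \int_{\mathbb{R}^3} L_\varepsilon(\xi)^2\, d\xi = 2\pi \int_0^{+\infty} \frac{dv}{1 + v^2} \int_0^1 \frac{du}{\sqrt{2 - c^2 u^2}} = \pi^2 \int_0^1 \frac{du}{\sqrt{2 - c^2 u^2}} = \frac{\pi^2}{c} \int_0^{\frac{c}{\sqrt{2}}} \frac{dw}{\sqrt{1 - w^2}} = \frac{\pi^2}{c} \arcsin\Big( \frac{c}{\sqrt{2}} \Big). \]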
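Claims (2.59) and (2.60) lend themselves to a quick numerical sanity check. The sketch below is an illustration only, not part of the argument: it starts from the one-dimensional integrals left after the radial integration in the two proofs, approximates them with a midpoint rule, and compares with the closed forms. It assumes $\varepsilon(c) = \sqrt{2 - c^2}$ and a subsonic speed $0 < c < \sqrt{2}$; all function names are hypothetical.

```python
import math


def integral_2d(c, n=2000):
    """Midpoint rule for (1/2) * int_0^{2 pi} dtheta / (2 - c^2 cos^2(theta))."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += 1.0 / (2.0 - (c * math.cos(theta)) ** 2)
    return 0.5 * h * total


def integral_3d(c, n=2000):
    """Midpoint rule for pi^2 * int_0^1 du / sqrt(2 - c^2 u^2)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += 1.0 / math.sqrt(2.0 - (c * u) ** 2)
    return math.pi ** 2 * h * total


def closed_form_2d(c):
    """Right-hand side of (2.59), with eps(c) = sqrt(2 - c^2) (assumed)."""
    return math.pi / (math.sqrt(2.0) * math.sqrt(2.0 - c * c))


def closed_form_3d(c):
    """Right-hand side of (2.60)."""
    return math.pi ** 2 / c * math.asin(c / math.sqrt(2.0))
```

The two closed forms behave very differently as $c \to \sqrt{2}$: the two-dimensional value blows up like $1/\varepsilon(c)$, whereas the three-dimensional one stays bounded, which is the difference exploited in Lemmas 2.12 and 2.13.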
The previous analysis extends to any finite energy subsonic and sonic solutions, providing a proof of iii) in Theorem 2.

Lemma 2.14. Let $v$ be a non-trivial finite energy solution to (TWc) on $\mathbb{R}^3$. Then
\[ E(v) \geq \mathcal{E}_0, \]
where $\mathcal{E}_0$ is some positive universal constant.

Proof. In view of the results of [19], we know that there are no supersonic travelling waves on the whole space, that is $0 < c(v) \leq \sqrt{2}$. Moreover, in dimensions $N > 2$, it follows from Lemma 2.2 that
\[ \big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^N)} \leq K E(v)^{\frac{1}{N + 2}}, \]
so that, choosing possibly a smaller constant $\mathcal{E}_0$, we may assume that $v$ satisfies (2.40), and $v$ may be written as in (2.11), $v = \rho \exp i\varphi$. If $v$ is subsonic, i.e. $c(v) < \sqrt{2}$, then Lemma 2.13 yields the conclusion, since the function $c \mapsto c (1 + c^2) \arcsin\big( \frac{c}{\sqrt{2}} \big)$ is bounded on $(0, \sqrt{2})$. In the sonic case, one observes similarly that
\[ \int_{\mathbb{R}^3} L_0(\xi)^2\, d\xi = \frac{\pi^3}{2 \sqrt{2}} < +\infty, \]
and the same proof as in Lemma 2.13 applies to yield the conclusion.
In the same spirit, but with more involved methods, we may prove

Lemma 2.15. Let $\frac{5}{3} < q < +\infty$, and let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^3$ satisfying (2.40). Then, there exists a constant $K(q)$ only depending on $q$ such that
\[ \| \eta \|_{L^q(\mathbb{R}^3)} \leq K(q) E(v)^{\frac{1}{q} + \frac{2}{5}}. \]
Proof. We first recall that there are no non-trivial finite energy supersonic travelling waves on $\mathbb{R}^3$. Hence, we may assume that $0 \leq c \leq \sqrt{2}$. In view of Eq. (2.55), we have
\[ \widehat{\eta}(\xi) = L_\varepsilon(\xi) \bigg( 2 \widehat{R_0}(\xi) - 2c \sum_{j=2}^3 \frac{\xi_j^2}{|\xi|^2} \widehat{R_1}(\xi) + 2c \sum_{j=2}^3 \frac{\xi_1 \xi_j}{|\xi|^2} \widehat{R_j}(\xi) \bigg), \tag{2.61} \]
so that the proof reduces to estimating the r.h.s of (2.61) by using the multiplier theory developed in [33]. Indeed, using Lemma 2.1, and the fact that $0 \leq c \leq \sqrt{2}$, we first notice that there exists some universal constant $K$ such that
\[ |R_j| \leq K e(v). \]
Invoking Lemma 2.1 and the fact that $0 \leq c \leq \sqrt{2}$ once more, we have for any $1 < q < +\infty$,
\[ \| R_j \|_{L^q(\mathbb{R}^3)} \leq K \Big( \int_{\mathbb{R}^3} e(v)^q \Big)^{\frac{1}{q}} \leq K E(v)^{\frac{1}{q}}. \]
On the other hand, it follows from standard Riesz-operator theory (see [39]) that the functions $\xi \mapsto \frac{\xi_j \xi_k}{|\xi|^2}$ are $L^q$-multipliers for any $1 < q < +\infty$ and any $1 \leq j, k \leq 3$.
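Hence, there exists some constant $K(q)$, possibly depending on $q$, so that the function $F$ defined by
\[ \widehat{F}(\xi) = 2 \widehat{R_0}(\xi) - 2c \sum_{j=2}^3 \frac{\xi_j^2}{|\xi|^2} \widehat{R_1}(\xi) + 2c \sum_{j=2}^3 \frac{\xi_1 \xi_j}{|\xi|^2} \widehat{R_j}(\xi), \tag{2.62} \]
belongs to $L^q(\mathbb{R}^3)$ for any $1 < q < +\infty$, and satisfies
\[ \| F \|_{L^q(\mathbb{R}^3)} \leq K(q) \sum_{j=0}^3 \| R_j \|_{L^q(\mathbb{R}^3)} \leq K(q) E(v)^{\frac{1}{q}}. \tag{2.63} \]
Finally, in view of Theorem 2.2, $L_\varepsilon$ is a multiplier from $L^q(\mathbb{R}^3)$ to $L^{\frac{5q}{5 - 2q}}(\mathbb{R}^3)$ for any $1 < q < \frac{5}{2}$. More precisely, denoting $\mathcal{L}_\varepsilon$ the multiplier operator given by
\[ \widehat{\mathcal{L}_\varepsilon(f)}(\xi) = L_\varepsilon(\xi) \widehat{f}(\xi), \tag{2.64} \]
there exists some constant $K(q)$ possibly depending on $q$ but not on $c$, such that, for any $1 < q < \frac{5}{2}$,
\[ \| \mathcal{L}_\varepsilon(f) \|_{L^{\frac{5q}{5 - 2q}}(\mathbb{R}^3)} \leq K(q) \| f \|_{L^q(\mathbb{R}^3)}, \quad \forall f \in L^q(\mathbb{R}^3). \tag{2.65} \]
We postpone the proof of claim (2.65), and complete the proof of Lemma 2.15. Indeed, it follows from (2.61), (2.62) and (2.64) that $\eta = \mathcal{L}_\varepsilon(F)$. Therefore, by (2.63) and (2.65), we have
\[ \| \eta \|_{L^{\frac{5q}{5 - 2q}}(\mathbb{R}^3)} \leq K(q) E(v)^{\frac{1}{q}}, \]
for any $1 < q < \frac{5}{2}$. Letting $q' = \frac{5q}{5 - 2q}$, that is $\frac{1}{q} = \frac{1}{q'} + \frac{2}{5}$, this ends the proof of Lemma 2.15.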
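The exponent bookkeeping at the end of this proof can be verified with exact rational arithmetic. The sketch below is only an illustration (the function names are hypothetical): it maps a source exponent $1 < q < \frac{5}{2}$ to the target exponent $q' = \frac{5q}{5 - 2q}$ and checks that the power $\frac{1}{q}$ of $E(v)$ equals $\frac{1}{q'} + \frac{2}{5}$, as stated in Lemma 2.15.

```python
from fractions import Fraction


def target_exponent(q):
    """Return q' = 5q / (5 - 2q), the target Lebesgue exponent in (2.65)."""
    q = Fraction(q)
    if not (1 < q < Fraction(5, 2)):
        raise ValueError("q must satisfy 1 < q < 5/2")
    return 5 * q / (5 - 2 * q)


def energy_power(q_prime):
    """Return 1/q' + 2/5, the power of E(v) in Lemma 2.15 for the exponent q'."""
    return 1 / Fraction(q_prime) + Fraction(2, 5)
```

As $q$ ranges over $(1, \frac{5}{2})$, the target exponent $q'$ ranges over $(\frac{5}{3}, +\infty)$, which is the range of exponents in the statement of the lemma.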
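Proof of Claim (2.65). Claim (2.65) is a consequence of Theorem 2.2 applied to $L_\varepsilon$. Indeed, we may check that $L_\varepsilon$ satisfies the assumptions of Theorem 2.2, i.e. that the quantity
\[ M(L_\varepsilon) \equiv \sup \bigg\{ \prod_{j=1}^3 |\xi_j|^{\alpha + k_j} \big| \partial_1^{k_1} \partial_2^{k_2} \partial_3^{k_3} L_\varepsilon(\xi) \big|,\ \xi \in \mathbb{R}^3,\ (k_1, k_2, k_3) \in \{0, 1\}^3,\ k_1 + k_2 + k_3 \leq 3 \bigg\}, \]
is finite for some suitable choice of $\alpha$. This follows from the next computation of some derivatives of $L_\varepsilon$,
\[ \partial_j L_\varepsilon(\xi) = \frac{2 \xi_j}{(|\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2)^2} \Big( - |\xi|^4 - c^2 \xi_1^2 + c^2 \delta_{1,j} |\xi|^2 \Big), \tag{2.66} \]
for any $1 \leq j \leq 3$,
\[ \partial^2_{jk} L_\varepsilon(\xi) = \frac{4 \xi_j \xi_k}{(|\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2)^3} \Big( 2 |\xi|^6 + c^2 \big( 6 \xi_1^2 |\xi|^2 + 4 \xi_1^2 - (\delta_{1,j} + \delta_{1,k}) (3 |\xi|^4 + 2 |\xi|^2 + c^2 \xi_1^2) \big) \Big), \tag{2.67} \]
for any $1 \leq j \neq k \leq 3$, and
\[ \partial^3_{123} L_\varepsilon(\xi) = \frac{16 \xi_1 \xi_2 \xi_3}{(|\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2)^4} \Big( - 3 |\xi|^8 + c^2 \big( 6 |\xi|^6 - 18 \xi_1^2 |\xi|^4 + 8 |\xi|^4 + 6 (c^2 - 4) \xi_1^2 |\xi|^2 - 3 c^2 \xi_1^4 + 4 |\xi|^2 + 4 c^2 \xi_1^2 - 12 \xi_1^2 \big) \Big). \tag{2.68} \]
Considering some multi-index $(k_1, k_2, k_3) \in \{0, 1\}^3$ such that $k_1 + k_2 + k_3 \leq 3$, it follows from (2.57), (2.66), (2.67) and (2.68), and the fact that $0 \leq c \leq \sqrt{2}$, that there exists some universal constant $K$ such that, for any $|\xi| \geq 1$,
\[ \prod_{j=1}^3 |\xi_j|^{\alpha + k_j} \big| \partial_1^{k_1} \partial_2^{k_2} \partial_3^{k_3} L_\varepsilon(\xi) \big| \leq \frac{K}{|\xi|^{2 - 3\alpha}} \leq K, \tag{2.69} \]
provided $\alpha \leq \frac{2}{3}$. On the other hand, if $|\xi| \leq 1$, we deduce from (2.57), (2.66), (2.67) and (2.68), that there exists some universal constant $K$ such that
\[ \big| \partial_1^{k_1} \partial_2^{k_2} \partial_3^{k_3} L_\varepsilon(\xi) \big| \leq K \frac{|\xi_1|^{k_1} |\xi_2|^{k_2} |\xi_3|^{k_3}}{(|\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2)^{1 + k_1 + k_2 + k_3}} \times \Big( (1 - k_1) |\xi|^2 + k_1 \big( |\xi|^4 + 2 |\xi|^2 - c^2 \xi_1^2 \big) \Big). \]
Therefore, denoting $\xi = \rho \sigma$, where $\rho \geq 0$ and $\sigma = (\sigma_1, \sigma_\perp) \in S^{N-1}$, we are led to
\[ \prod_{j=1}^3 |\xi_j|^{\alpha + k_j} \big| \partial_1^{k_1} \partial_2^{k_2} \partial_3^{k_3} L_\varepsilon(\xi) \big| \leq K \frac{\rho^{2(1 + k_2 + k_3) + 3\alpha} |\sigma_\perp|^{2(\alpha + k_2 + k_3)}}{\rho^{2(1 + k_2 + k_3)} (\rho^2 + 2 |\sigma_\perp|^2)^{1 + k_2 + k_3}} \leq K \max\{\rho, |\sigma_\perp|\}^{5\alpha - 2} \leq K, \tag{2.70} \]
provided that $\alpha \geq \frac{2}{5}$. Using (2.69) and (2.70), and choosing $\alpha = \frac{2}{5}$, the quantity $M(L_\varepsilon)$ is bounded by some constant $K$ not depending on $\varepsilon$, so that, by Theorem 2.2, $L_\varepsilon$ is a multiplier from $L^q(\mathbb{R}^3)$ to $L^{\frac{5q}{5 - 2q}}(\mathbb{R}^3)$ for any $1 < q < \frac{5}{2}$. Hence, inequality (2.65) follows from (2.30) and (2.31).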
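The derivative formulae (2.66) and (2.67) can be cross-checked against central finite differences of $L_\varepsilon(\xi) = |\xi|^2 / (|\xi|^4 + 2|\xi|^2 - c^2 \xi_1^2)$. The sketch below is a numerical sanity check under that definition, not a proof; indices are 0-based, so index 0 stands for $\xi_1$, and all function names are hypothetical.

```python
def sq_norm(xi):
    """|xi|^2 for a 3-vector xi."""
    return sum(x * x for x in xi)


def denom(xi, c):
    """|xi|^4 + 2|xi|^2 - c^2 xi_1^2."""
    s = sq_norm(xi)
    return s * s + 2.0 * s - c * c * xi[0] ** 2


def L(xi, c):
    """The symbol L_eps(xi) from (2.57)."""
    return sq_norm(xi) / denom(xi, c)


def dL(xi, c, j):
    """First derivative of L_eps with respect to xi_{j+1}, formula (2.66)."""
    s = sq_norm(xi)
    delta = 1.0 if j == 0 else 0.0
    inner = -s * s - c * c * xi[0] ** 2 + c * c * delta * s
    return 2.0 * xi[j] * inner / denom(xi, c) ** 2


def d2L(xi, c, j, k):
    """Mixed second derivative for j != k, formula (2.67)."""
    s = sq_norm(xi)
    d = (1.0 if j == 0 else 0.0) + (1.0 if k == 0 else 0.0)
    inner = 2.0 * s ** 3 + c * c * (
        6.0 * xi[0] ** 2 * s + 4.0 * xi[0] ** 2
        - d * (3.0 * s * s + 2.0 * s + c * c * xi[0] ** 2)
    )
    return 4.0 * xi[j] * xi[k] * inner / denom(xi, c) ** 3


def central_diff(f, xi, j, h=1e-5):
    """Central finite difference of f in the j-th coordinate."""
    up, down = list(xi), list(xi)
    up[j] += h
    down[j] -= h
    return (f(up) - f(down)) / (2.0 * h)
```

For instance, `central_diff(lambda x: L(x, 1.0), xi, 0)` should agree with `dL(xi, 1.0, 0)` up to discretisation error at any point `xi` away from the origin.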
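We will use the following consequence.

Corollary 2.4. There exist some constants $K > 0$ and $\alpha > 0$ such that
\[ \big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^3)} \geq \frac{K}{E(v)^\alpha}. \]

Proof. If $\| 1 - |v| \|_{L^\infty(\mathbb{R}^3)} \geq \frac{1}{2}$, then we are done in view of Lemma 2.14, for $K = \frac{\mathcal{E}_0^\alpha}{2}$ and any $\alpha > 0$. Therefore, we may assume $\| 1 - |v| \|_{L^\infty(\mathbb{R}^3)} \leq \frac{1}{2}$. In that case, using Lemmas 2.9 and 2.15, and the fact that $0 \leq c \leq \sqrt{2}$, we write for any $\frac{5}{3} < q < 2$,
\[ E(v) \leq 7 c^2 \| \eta \|^2_{L^2(\mathbb{R}^3)} \leq 14 \| \eta \|^q_{L^q(\mathbb{R}^3)} \| \eta \|^{2 - q}_{L^\infty(\mathbb{R}^3)} \leq K(q) \| \eta \|^{2 - q}_{L^\infty(\mathbb{R}^3)} E(v)^{1 + \frac{2q}{5}}. \]
Hence,
\[ \big\| 1 - |v| \big\|_{L^\infty(\mathbb{R}^3)} \geq \frac{\| \eta \|_{L^\infty(\mathbb{R}^3)}}{1 + \| \rho \|_{L^\infty(\mathbb{R}^3)}} \geq \frac{2}{5} \| \eta \|_{L^\infty(\mathbb{R}^3)} \geq \frac{K(q)}{E(v)^{\frac{2q}{5(2 - q)}}}. \]
We conclude, choosing for instance $q = \frac{7}{4}$ and $\alpha = \frac{14}{5}$.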
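The value $\alpha = \frac{14}{5}$ comes from evaluating the exponent $\frac{2q}{5(2 - q)}$ at $q = \frac{7}{4}$. A minimal check in exact arithmetic, with a hypothetical helper name, reads as follows.

```python
from fractions import Fraction


def alpha_from_q(q):
    """Exponent 2q / (5(2 - q)) obtained in the proof, for 5/3 < q < 2."""
    q = Fraction(q)
    if not (Fraction(5, 3) < q < 2):
        raise ValueError("q must satisfy 5/3 < q < 2")
    return 2 * q / (5 * (2 - q))
```

Any other admissible $q$ works equally well; the exponent increases with $q$ and blows up as $q \to 2$, so smaller values of $q$ give smaller $\alpha$.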
Combining the previous result with Lemma 2.6, we obtain the following bound for $\varepsilon(v)$.

Lemma 2.16. Let $v$ be a finite energy solution to (TWc) on $\mathbb{R}^3$, such that $\Sigma(v) > 0$. Then, there exists some constant $K(c)$ depending only on $c$, and some universal constant $\alpha > 0$, such that
\[ \varepsilon(v)\, p(v) \geq \frac{K(c)}{E(v)^{8\alpha + 1}}. \]

Proof. In view of Lemma 2.6, we have
\[ \lambda \varepsilon(v) p(v) + \frac{E(v)}{\lambda} \geq K(c) \| \eta \|^4_{L^\infty(\mathbb{R}^3)} \geq \frac{K(c)}{E(v)^{4\alpha}}, \quad \forall \lambda > 0, \]
where we have made use of Corollary 2.4 for the last inequality. The choice $\lambda = \frac{2 E(v)^{4\alpha + 1}}{K(c)}$ yields the desired result.
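The choice of $\lambda$ can be made explicit. Rearranging the inequality gives $\varepsilon(v) p(v) \geq \frac{1}{\lambda} \big( \frac{K}{E^{4\alpha}} - \frac{E}{\lambda} \big)$, and the stated $\lambda$ maximises the right-hand side. The sketch below illustrates this with exact arithmetic; it assumes, for illustration only, that the same constant $K$ appears in the inequality and in the choice of $\lambda$, and the function names are hypothetical.

```python
from fractions import Fraction


def best_lambda(K, E, alpha):
    """lambda = 2 E^(4 alpha + 1) / K, the choice made in the proof."""
    return 2 * E ** (4 * alpha + 1) / K


def lower_bound(K, E, alpha, lam):
    """Lower bound on eps(v) p(v) implied by lam*eps*p + E/lam >= K / E^(4 alpha)."""
    return (K / E ** (4 * alpha) - E / lam) / lam
```

With this choice the bound equals $K^2 / (4 E^{8\alpha + 1})$, which is of the form claimed in the lemma after renaming the constant.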
3. Properties of the Function $E_{\min}(\mathfrak{p})$

The main purpose of this section is to provide the proofs to Theorem 4, Lemma 1 and Lemma 2, as well as the proof of Theorem 3.

3.1. Proof of Theorem 4. We begin this subsection with a number of elementary observations.

Lemma 3.1. For $N = 2$ and $N = 3$, we have the inclusion $W(\mathbb{R}^N) \subset E(\mathbb{R}^N)$. Moreover, the functions $E$ and $p$ are continuous on $W(\mathbb{R}^N)$.

Proof. Concerning the momentum $p$, we have already seen that, in view of (1.5) and Hölder's inequality, it is well-defined and continuous on $W(\mathbb{R}^N)$. For the energy $E$, we start with the identity
\[ (1 - |1 + w|^2)^2 = 4 {\rm Re}(w)^2 + 4 {\rm Re}(w) |w|^2 + |w|^4, \tag{3.1} \]
for any $w \in V(\mathbb{R}^N)$, so that
\[ (1 - |1 + w|^2)^2 \leq 8 {\rm Re}(w)^2 + 4 {\rm Re}(w)^4 + 4 {\rm Im}(w)^4, \]
and the l.h.s of this inequality belongs to $L^1(\mathbb{R}^N)$, whenever $w$ belongs to $V(\mathbb{R}^N)$. Hence, $W(\mathbb{R}^N)$ is included in $E(\mathbb{R}^N)$, and the $L^1$-norm of the l.h.s of (3.1) being continuous on $V(\mathbb{R}^N)$, $E$ is also continuous on $W(\mathbb{R}^N)$.
Lemma 3.2. Assume $N = 2$ or $N = 3$ and let $v = 1 + w$ be in $W(\mathbb{R}^N)$. There exists a sequence of maps $(w_n)_{n \in \mathbb{N}}$ in $C_c^\infty(\mathbb{R}^N)$ such that, setting $v_n = 1 + w_n$,
\[ w_n \to w \text{ in } V(\mathbb{R}^N), \text{ as } n \to +\infty, \quad p(v_n) = p(v), \quad \text{and } E(v_n) \to E(v), \text{ as } n \to +\infty. \]
In particular, given any $\mathfrak{p} \geq 0$, there exists a sequence of maps $(w_n)_{n \in \mathbb{N}}$ in $C_c^\infty(\mathbb{R}^N)$ such that $p(1 + w_n) = \mathfrak{p}$, and $E(1 + w_n) \to E_{\min}(\mathfrak{p})$, as $n \to +\infty$, so that
\[ E_{\min}(\mathfrak{p}) = \inf \{ E(1 + v),\ v \in C_c^\infty(\mathbb{R}^N),\ p(1 + v) = \mathfrak{p} \}. \tag{3.2} \]

Proof. In view of the continuity properties stated in Lemma 3.1 and the density of $C_c^\infty(\mathbb{R}^N)$ in $V(\mathbb{R}^N)$, given any $w \in V(\mathbb{R}^N)$, there exists a sequence of maps $(\tilde{w}_n)_{n \in \mathbb{N}}$ in $C_c^\infty(\mathbb{R}^N)$ such that
\[ p(\tilde{v}_n) \to p(v), \text{ and } E(\tilde{v}_n) \to E(v), \text{ as } n \to +\infty, \]
where we have set $\tilde{v}_n = 1 + \tilde{w}_n$, and $v = 1 + w$. In order to prove the first assertion, we distinguish two cases. First, if $p(v) \neq 0$, then we may assume without loss of generality that $p(v) > 0$, so that by continuity, we have $p(\tilde{v}_n) > 0$, for $n$ sufficiently large. In this case, we set
\[ w_n = \sqrt{\frac{p(v)}{p(\tilde{v}_n)}}\, \tilde{w}_n, \text{ and } v_n = 1 + w_n, \]
so that $p(v_n) = p(v)$, whereas $w_n \to w$ in $V(\mathbb{R}^N)$, as $n \to +\infty$, which yields the conclusion.

The case $p(v) = 0$ (which is actually not the most relevant one for our discussion) is treated by an approximation argument. Indeed, in this case, we may assume that $p(\tilde{v}_n) \neq 0$ for $n$ sufficiently large (otherwise, up to a subsequence, the conclusion holds for $w_n = \tilde{w}_n$). For given $\delta > 0$, we may construct (see Lemma 3.3 below) a map $f_\delta \in C_c^\infty(\mathbb{R}^N)$ such that $p(1 + f_\delta) = \delta$, $E(1 + f_\delta) \leq 2 |\delta|$ and $\| f_\delta \|_{V(\mathbb{R}^N)} \leq K \sqrt{|\delta|}$ for some universal constant $K$. Denoting $\check{f}_\delta(x_1, x_\perp) = f_\delta(-x_1, x_\perp)$, this construction is also possible for any $\delta < 0$. We then consider the map
\[ w_n = \tilde{w}_n + f_{\delta_n}(\cdot - a_n), \]
where $\delta_n = - p(1 + \tilde{w}_n) \to 0$, as $n \to +\infty$, and the point $a_n \in \mathbb{R}^N$ is chosen sufficiently large so that the supports of $\tilde{w}_n$ and $f_{\delta_n}(\cdot - a_n)$ do not intersect. Denoting $v_n = 1 + w_n$, we have
\[ p(v_n) = p(\tilde{v}_n) + p(1 + f_{\delta_n}) = 0, \quad |E(v_n) - E(v)| \leq |E(\tilde{v}_n) - E(v)| + |E(1 + f_{\delta_n})| \to 0, \]
and
\[ \| w_n - w \|_{V(\mathbb{R}^N)} \leq \| \tilde{w}_n - w \|_{V(\mathbb{R}^N)} + \| f_{\delta_n} \|_{V(\mathbb{R}^N)} \to 0, \text{ as } n \to +\infty, \]
which completes the proof of the first assertion. The two last assertions follow, once it is proved that the set $\{ w \in W(\mathbb{R}^N), \text{ s.t. } p(w) = \mathfrak{p} \}$ is not empty. This is again a consequence of Lemma 3.3 below.

As a rather direct consequence of Lemma 3.2, we have

Corollary 3.1. Let $\mathfrak{p} > 0$. Then,
\[ \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) \leq E_{\min}(\mathfrak{p}). \]
Proof. In view of identity (3.2), given any $\delta > 0$, there exists a map $v = 1 + w \in \{1\} + C_c^\infty(\mathbb{R}^N)$ such that
\[ E_{\min}(\mathfrak{p}) \leq E(v) \leq E_{\min}(\mathfrak{p}) + \delta, \text{ and } p(v) = \mathfrak{p}. \]
Since $w$ has compact support in some ball $B(0, R)$, for some radius $R > 0$, the restriction of $w$ to the set $\Omega_n^N$ vanishes on the boundary $\partial \Omega_n^N$, provided $\pi n > R$, and hence defines a map in $H^1(\mathbb{T}_n^N)$. Adding to $w$ the constant function 1, we have similarly $v \in H^1(\mathbb{T}_n^N, \mathbb{C})$. Moreover, in the two-dimensional case, if $n \geq \big( \frac{R}{\pi} \big)^{\frac{4}{3}}$, then $w \in S_n^0$ (see definition (4.14) in Subsect. 4.2 below). This implies
\[ E(v) \geq E^n_{\min}(\mathfrak{p}), \quad \forall n \geq \Big( \frac{R}{\pi} \Big)^{\frac{4}{3}}. \]
Hence,
\[ E^n_{\min}(\mathfrak{p}) \leq E_{\min}(\mathfrak{p}) + \delta, \]
and the conclusion follows letting $\delta$ tend to zero.
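Next, we have

Lemma 3.3. Let $N \geq 2$ and $s > 0$ be given. There exists a sequence of non-constant maps $(\gamma_n)_{n \in \mathbb{N}}$ in $\{1\} + C_c^\infty(\mathbb{R}^N)$ such that
\[ p(\gamma_n) = s, \quad \| \gamma_n \|_{W(\mathbb{R}^N)} \leq K \sqrt{s}, \quad \text{and } E(\gamma_n) \to \sqrt{2} s, \text{ as } n \to +\infty, \]
where $K$ is some universal constant. In particular, $E_{\min}(\mathfrak{p}) \leq \sqrt{2} \mathfrak{p}$, for any $\mathfrak{p} \geq 0$, and the map $\mathfrak{p} \mapsto \Xi(\mathfrak{p})$ is non-negative.

Proof. Recall that if $v = \rho \exp i\varphi \in \{1\} + C_c^\infty(\mathbb{R}^N)$, then the energy and momentum write
\[ E(v) = \frac{1}{2} \int_{\mathbb{R}^N} \Big( |\nabla \rho|^2 + |\nabla \varphi|^2 + \frac{\eta^2}{2} \Big) - \frac{1}{2} \int_{\mathbb{R}^N} \eta |\nabla \varphi|^2, \text{ and } p(v) = \frac{1}{2} \int_{\mathbb{R}^N} \eta \partial_1 \varphi. \]
As mentioned in the Introduction, if one keeps only the quadratic terms, minimizing the energy for fixed momentum amounts to having $\sqrt{2} \partial_1 \varphi \simeq \eta$. For this simplified problem, the infimum is not achieved, and for minimizing sequences, transverse derivatives tend to zero, as well as the modulus. In view of this observation, we take an arbitrary map $\varphi \in C_c^\infty(\mathbb{R}^N)$, and we construct by scaling and multiplication by a scalar a sequence which has the properties announced in the statement of Lemma 3.3. Inspired by the scaling (1.11), we introduce three parameters $\alpha > 1$, $\lambda > 1$ and $0 < \mu < 1$, and consider the map $\Psi = \rho \exp i\Phi$, where the phase $\Phi$ is given by $\Phi(x) = \sqrt{2} \mu \varphi\big( \frac{x_1}{\lambda}, \frac{x_\perp}{\lambda^\alpha} \big)$ and the modulus $\rho$ by $\rho(x) = 1 - \frac{\mu}{\lambda} \partial_1 \varphi\big( \frac{x_1}{\lambda}, \frac{x_\perp}{\lambda^\alpha} \big)$. Notice in particular that, if $\mu \to 0$ and $\lambda \to +\infty$, then
\[ \sqrt{2} \partial_1 \Phi = 2 (1 - \rho) \simeq 1 - \rho^2 \equiv \eta, \]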
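and that the transverse derivatives are of lower order. Next, we compute
\[ \int_{\mathbb{R}^N} \big( \rho^2 |\partial_1 \Phi|^2 + |\partial_1 \rho|^2 \big) = \mu^2 \lambda^{(N-1)\alpha - 1} \bigg( 2 \int_{\mathbb{R}^N} \Big( 1 - \frac{\mu}{\lambda} \partial_1 \varphi \Big)^2 (\partial_1 \varphi)^2 + \frac{1}{\lambda^2} \int_{\mathbb{R}^N} (\partial_1^2 \varphi)^2 \bigg), \]
\[ \int_{\mathbb{R}^N} \big( \rho^2 |\nabla_\perp \Phi|^2 + |\nabla_\perp \rho|^2 \big) = \mu^2 \lambda^{(N-3)\alpha + 1} \bigg( 2 \int_{\mathbb{R}^N} \Big( 1 - \frac{\mu}{\lambda} \partial_1 \varphi \Big)^2 |\nabla_\perp \varphi|^2 + \frac{1}{\lambda^2} \int_{\mathbb{R}^N} |\nabla_\perp \partial_1 \varphi|^2 \bigg), \]
\[ \frac{1}{4} \int_{\mathbb{R}^N} \eta^2 = \mu^2 \lambda^{(N-1)\alpha - 1} \int_{\mathbb{R}^N} \Big( 1 - \frac{\mu}{2\lambda} \partial_1 \varphi \Big)^2 (\partial_1 \varphi)^2, \tag{3.3} \]
whereas
\[ p(\Psi) = \sqrt{2} \mu^2 \lambda^{(N-1)\alpha - 1} \int_{\mathbb{R}^N} \Big( 1 - \frac{\mu}{2\lambda} \partial_1 \varphi \Big) (\partial_1 \varphi)^2. \tag{3.4} \]
For given $n \in \mathbb{N}$, we choose $\lambda = n$, and determine the parameter $\mu$ so that $p(\Psi) = s$. In particular, this choice leads to
\[ \mu \sim \frac{\sqrt{s}}{2^{\frac{1}{4}} \| \partial_1 \varphi \|_{L^2(\mathbb{R}^N)}}\, n^{\frac{1 - (N-1)\alpha}{2}}, \text{ as } n \to +\infty, \]
so that $\mu \to 0$ and $\frac{\mu}{\lambda} \to 0$, as $n \to +\infty$. In view of (3.3) and (3.4), choosing $\gamma_n = \Psi$ (with the particular choices of $\lambda$ and $\mu$ above), we are led to
\[ p(\gamma_n) = s, \text{ and } E(\gamma_n) \sim 2 \mu^2 \lambda^{(N-1)\alpha - 1} \int_{\mathbb{R}^N} (\partial_1 \varphi)^2 \sim \sqrt{2} s, \text{ as } n \to +\infty, \]
for any $\alpha > 1$. In order to complete the proof of Lemma 3.3, we now turn to the norm of the function $\gamma_n$ in $W(\mathbb{R}^N)$. Using the fact that $\frac{\mu}{\lambda} \to 0$, as $n \to +\infty$, we compute
\[ |{\rm Re}(\gamma_n) - 1| \leq K_\varphi (|\rho - 1| + \Phi^2), \quad |{\rm Im}(\gamma_n)| \leq K_\varphi |\Phi|, \quad \text{and } |\nabla {\rm Re}(\gamma_n)| \leq K_\varphi (|\nabla \rho| + |\Phi| |\nabla \Phi|), \]
where $K_\varphi$ is some constant possibly depending on $\varphi$, but not on $\alpha$ and $n$. Hence, we are led to
\[ \int_{\mathbb{R}^N} |{\rm Re}(\gamma_n) - 1|^2 \leq K_\varphi \mu^2 \lambda^{(N-1)\alpha - 1} \bigg( \int_{\mathbb{R}^N} (\partial_1 \varphi)^2 + \mu^2 \lambda^2 \int_{\mathbb{R}^N} \varphi^4 \bigg), \]
\[ \int_{\mathbb{R}^N} |{\rm Im}(\gamma_n)|^4 \leq K_\varphi \mu^4 \lambda^{(N-1)\alpha + 1} \int_{\mathbb{R}^N} \varphi^4, \]
\[ \int_{\mathbb{R}^N} |\partial_1 {\rm Re}(\gamma_n)|^{\frac{4}{3}} \leq K_\varphi \mu^{\frac{4}{3}} \lambda^{(N-1)\alpha - \frac{5}{3}} \bigg( \int_{\mathbb{R}^N} |\partial_1^2 \varphi|^{\frac{4}{3}} + \mu^{\frac{4}{3}} \lambda^{\frac{4}{3}} \int_{\mathbb{R}^N} |\varphi|^{\frac{4}{3}} |\partial_1 \varphi|^{\frac{4}{3}} \bigg), \]
\[ \int_{\mathbb{R}^N} |\nabla_\perp {\rm Re}(\gamma_n)|^{\frac{4}{3}} \leq K_\varphi \mu^{\frac{4}{3}} \lambda^{(N - \frac{7}{3})\alpha - \frac{1}{3}} \bigg( \int_{\mathbb{R}^N} |\nabla_\perp \partial_1 \varphi|^{\frac{4}{3}} + \mu^{\frac{4}{3}} \lambda^{\frac{4}{3}} \int_{\mathbb{R}^N} |\varphi|^{\frac{4}{3}} |\nabla_\perp \varphi|^{\frac{4}{3}} \bigg), \]
so that, assuming that $\alpha = \frac{3}{N - 1}$,
\[ \| \gamma_n \|_{W(\mathbb{R}^N)} \sim K_\varphi \sqrt{s}, \text{ as } n \to +\infty. \]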
This concludes the proof of Lemma 3.3. Indeed, the last assertions of this lemma are direct consequences of the definitions (1.2) and (1.17) of $E_{\min}$ and $\Xi$.
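Lemma 3.4. We have, for any $\mathfrak{p}, \mathfrak{q} \geq 0$,
\[ |E_{\min}(\mathfrak{p}) - E_{\min}(\mathfrak{q})| \leq \sqrt{2} |\mathfrak{p} - \mathfrak{q}|. \tag{3.5} \]
In particular, the function $\mathfrak{p} \mapsto E_{\min}(\mathfrak{p})$ is Lipschitz continuous on $\mathbb{R}_+$, with Lipschitz constant $\sqrt{2}$, and the function $\mathfrak{p} \mapsto \Xi(\mathfrak{p})$ is non-negative, non-decreasing and continuous on $\mathbb{R}_+$.

Proof. We may assume without loss of generality that $\mathfrak{q} \geq \mathfrak{p}$. We show first that
\[ E_{\min}(\mathfrak{q}) \leq E_{\min}(\mathfrak{p}) + \sqrt{2} (\mathfrak{q} - \mathfrak{p}). \tag{3.6} \]
For that purpose, let $\delta > 0$ be given, and consider a map $v_\delta = 1 + w_\delta$, where $w_\delta \in C_c^\infty(\mathbb{R}^N)$, such that
\[ p(v_\delta) = \mathfrak{p}, \text{ and } E(v_\delta) \leq E_{\min}(\mathfrak{p}) + \frac{\delta}{2}. \]
The existence of such a map $v_\delta$ follows from identity (3.2) in Lemma 3.2. Set $s = \mathfrak{q} - \mathfrak{p}$ and let $f_\delta$ be in $C_c^\infty(\mathbb{R}^N)$ such that $p(1 + f_\delta) = s$, and $E(1 + f_\delta) \leq \sqrt{2} s + \frac{\delta}{2}$. The existence of such a map $f_\delta$ follows from Lemma 3.3. We set
\[ v = 1 + w_\delta + f_\delta(\cdot - a_\delta), \]
where $a_\delta \in \mathbb{R}^N$ is chosen so that the supports of $w_\delta$ and $f_\delta(\cdot - a_\delta)$ do not intersect. In particular, we have
\[ E(v) = E(v_\delta) + E(1 + f_\delta), \text{ and } p(v) = p(v_\delta) + p(1 + f_\delta) = \mathfrak{p} + s = \mathfrak{q}. \]
It follows that
\[ E_{\min}(\mathfrak{q}) \leq E(v) \leq E(v_\delta) + \sqrt{2} s + \frac{\delta}{2} \leq E_{\min}(\mathfrak{p}) + \sqrt{2} (\mathfrak{q} - \mathfrak{p}) + \delta, \]
which yields (3.6) in the limit $\delta \to 0$. Next we turn to the inequality
\[ E_{\min}(\mathfrak{p}) \leq E_{\min}(\mathfrak{q}) + \sqrt{2} (\mathfrak{q} - \mathfrak{p}). \tag{3.7} \]
We similarly consider a map $\tilde{v}_\delta = 1 + \tilde{w}_\delta$, where $\tilde{w}_\delta \in C_c^\infty(\mathbb{R}^N)$, such that
\[ p(\tilde{v}_\delta) = \mathfrak{q}, \text{ and } E(\tilde{v}_\delta) \leq E_{\min}(\mathfrak{q}) + \frac{\delta}{2}. \]
We set $\tilde{v} = 1 + \tilde{w}_\delta + \check{f}_\delta(\cdot - b_\delta)$, where the map $\check{f}_\delta$ is defined as in the proof of Lemma 3.2, and where $b_\delta$ is chosen so that the supports of $\tilde{w}_\delta$ and $\check{f}_\delta(\cdot - b_\delta)$ do not intersect. Notice that we have
\[ E\big( 1 + \check{f}_\delta(\cdot - b_\delta) \big) = E(1 + f_\delta) \leq \sqrt{2} s + \frac{\delta}{2}, \]
and
\[ p\big( 1 + \check{f}_\delta(\cdot - b_\delta) \big) = - p(1 + f_\delta) = -s, \]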
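so that $p(\tilde{v}) = p(\tilde{v}_\delta) - s = \mathfrak{p}$. Hence we have
\[ E_{\min}(\mathfrak{p}) \leq E(\tilde{v}) = E(\tilde{v}_\delta) + E\big( 1 + \check{f}_\delta(\cdot - b_\delta) \big) \leq E_{\min}(\mathfrak{q}) + \sqrt{2} s + \delta, \]
and the conclusion (3.7) follows letting $\delta$ tend to 0. This completes the proof of Lemma 3.4, the last assertion being a consequence of (3.5).

Lemma 3.5. Let $\mathfrak{p}, \mathfrak{q} \geq 0$. Then,
\[ E_{\min}\Big( \frac{\mathfrak{p} + \mathfrak{q}}{2} \Big) \geq \frac{E_{\min}(\mathfrak{p}) + E_{\min}(\mathfrak{q})}{2}. \]

Proof. The main idea is to construct comparison maps using a reflection argument. For that purpose, for a map $f \in W(\mathbb{R}^N)$, and $a \in \mathbb{R}$, we consider the map $T_a^\pm f$ defined by
\[ T_a^\pm f = f \circ P_a^\pm, \]
where $P_a^+$ (resp. $P_a^-$) restricted to the set $\Omega_a^+ = \{ x = (x_1, \ldots, x_N) \in \mathbb{R}^N,\ x_N \geq a \}$ (resp. the set $\Omega_a^- = \{ x = (x_1, \ldots, x_N) \in \mathbb{R}^N,\ x_N \leq a \}$) is the identity, whereas its restriction to the set $\Omega_a^-$ (resp. $\Omega_a^+$) is the symmetry with respect to the hyperplane of equation $x_N = a$. In coordinates, this reads as
\[ T_a^+ f(x_1, \ldots, x_N) = f(x_1, \ldots, x_N) \text{ if } x_N \geq a, \quad T_a^+ f(x_1, \ldots, x_N) = f(x_1, \ldots, 2a - x_N) \text{ if } x_N \leq a. \]
One similarly defines $T_a^- f$, reversing the inequalities at the end of each line. We verify that $T_a^\pm f$ belongs to $W(\mathbb{R}^N)$, and that
\[ E(T_a^\pm f) = 2 E(f, \Omega_a^\pm), \text{ and } p(T_a^\pm f) = 2 \Big( \frac{1}{2} \int_{\Omega_a^\pm} \langle i \partial_1 f, f - 1 \rangle \Big). \tag{3.8} \]
We also notice that the function $a \mapsto p(T_a^+ f)$ is continuous and, by Lebesgue's theorem, tends to zero, as $a \to +\infty$, and to $2 p(f)$, as $a \to -\infty$. Therefore, it follows by continuity that, for every $\alpha \in (0, p(f))$, there exists a number $a \in \mathbb{R}$ such that
\[ p(T_a^+ f) = 2\alpha, \text{ and } p(T_a^- f) = 2 (p(f) - \alpha). \tag{3.9} \]
Next, we consider, for any $\mathfrak{p}, \mathfrak{q} \geq 0$ and any $\delta > 0$, a map $v \in W(\mathbb{R}^N)$ such that
\[ p(v) = \frac{\mathfrak{p} + \mathfrak{q}}{2}, \text{ and } E(v) \leq E_{\min}\Big( \frac{\mathfrak{p} + \mathfrak{q}}{2} \Big) + \frac{\delta}{2}. \]
Invoking (3.9) for $f = v$ and $\alpha = \frac{\mathfrak{p}}{2}$, we may find some $a \in \mathbb{R}$ such that
\[ p(T_a^+ v) = \mathfrak{p}, \text{ and } p(T_a^- v) = \mathfrak{q}. \]
It then follows from (3.8) that
\[ E_{\min}(\mathfrak{p}) \leq E(T_a^+ v) \leq 2 E(v, \Omega_a^+), \text{ and } E_{\min}(\mathfrak{q}) \leq E(T_a^- v) \leq 2 E(v, \Omega_a^-). \]
Adding these relations, we obtain
\[ E_{\min}(\mathfrak{p}) + E_{\min}(\mathfrak{q}) \leq 2 E(v, \Omega_a^-) + 2 E(v, \Omega_a^+) = 2 E(v) \leq 2 E_{\min}\Big( \frac{\mathfrak{p} + \mathfrak{q}}{2} \Big) + \delta. \]
The conclusion follows, letting $\delta$ tend to 0.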
Corollary 3.2. The function $\mathfrak{p} \mapsto E_{\min}(\mathfrak{p})$ is concave and non-decreasing on $\mathbb{R}_+$.

Proof. Continuous functions $f$ satisfying the inequality
\[ f\Big( \frac{\mathfrak{p} + \mathfrak{q}}{2} \Big) \geq \frac{f(\mathfrak{p}) + f(\mathfrak{q})}{2} \]
are concave. Similarly, concave non-negative functions on $\mathbb{R}_+$ are non-decreasing, so that, in view of Lemmas 3.4 and 3.5, $E_{\min}$ is concave and non-decreasing on $\mathbb{R}_+$.

Proof of Theorem 4 completed. Combining Lemma 3.4 and Corollary 3.2, all the statements in Theorem 4 are proved, except the fact that $\Xi(\mathfrak{p})$ tends to $+\infty$, as $\mathfrak{p} \to +\infty$ (the existence of $\mathfrak{p}_0$ being a consequence of the properties of $\Xi$). This fact is a direct consequence of the vortex solutions constructed in dimension two in [4], and of the vortex ring solutions constructed in dimension three in [3,8]. As a matter of fact, these results show that
\[ E_{\min}(\mathfrak{p}) \leq 2\pi \ln(\mathfrak{p}) + K, \text{ as } \mathfrak{p} \to +\infty, \tag{3.10} \]
in case $N = 2$, respectively
\[ E_{\min}(\mathfrak{p}) \sim \pi \sqrt{\mathfrak{p}} \ln(\mathfrak{p}), \text{ as } \mathfrak{p} \to +\infty, \tag{3.11} \]
in case $N = 3$, so that $\Xi(\mathfrak{p}) \sim \sqrt{2} \mathfrak{p}$, as $\mathfrak{p} \to +\infty$.
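Remark 3.1. We actually believe that the arguments in [4] might lead to the estimate $E_{\min}(\mathfrak{p}) \sim 2\pi \ln(\mathfrak{p})$, as $\mathfrak{p} \to +\infty$.

3.2. Proof of Lemma 1. Let $\mathfrak{p} > 0$ be given, and assume that $E_{\min}$ is achieved by a solution $u = u_\mathfrak{p}$ of (TWc) of speed $c = c(u_\mathfrak{p})$. Equation (TWc), which is the Euler-Lagrange equation for the constrained minimization problem $E_{\min}$, may be recast in a more abstract form as
\[ c\, dp(u_\mathfrak{p}) = dE(u_\mathfrak{p}), \]
where $dp$ and $dE$ denote the Fréchet differentials of $p$ and $E$ given, for any $\psi \in C_c^\infty(\mathbb{R}^N)$, by
\[ dp(u_\mathfrak{p})(\psi) = \int_{\mathbb{R}^N} \langle i \partial_1 u_\mathfrak{p}, \psi \rangle, \text{ and } dE(u_\mathfrak{p})(\psi) = - \int_{\mathbb{R}^N} \langle \Delta u_\mathfrak{p} + u_\mathfrak{p} (1 - |u_\mathfrak{p}|^2), \psi \rangle. \]
We claim that $dp(u_\mathfrak{p}) \neq 0$. Indeed, if we take formally $\psi_0 = u_\mathfrak{p} - 1$, then $dp(u_\mathfrak{p})(\psi_0) = 2\mathfrak{p} \neq 0$. By density of smooth functions with compact support in $V(\mathbb{R}^N)$, the claim follows. Let therefore $\psi_1$ be a function in $C_c^\infty(\mathbb{R}^N)$ such that $dp(u_\mathfrak{p})(\psi_1) = 1$. We consider the curve $\gamma : \mathbb{R} \to W(\mathbb{R}^N)$ defined by $\gamma(t) = u_\mathfrak{p} + t \psi_1$. Since the functions $E$ and $p$ are smooth on $W(\mathbb{R}^N)$, we have
\[ p(\gamma(t)) = \mathfrak{p} + s, \text{ where } s = t + p(\psi_1) t^2, \quad E(\gamma(t)) = E_{\min}(\mathfrak{p}) + c t + \underset{t \to 0}{O}(t^2), \]
so that
\[ E_{\min}(\mathfrak{p} + s) - E_{\min}(\mathfrak{p}) \leq E(\gamma(t)) - E_{\min}(\mathfrak{p}) \leq c s + \underset{s \to 0}{O}(s^2). \]
Conclusion (1.19) follows, letting $s \to 0^\pm$.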
3.3. Proof of Lemma 2. We consider in this subsection two numbers 0 ≤ p1 < p2 and assume throughout this section that E min is affine on the interval (p1 , p2 ) that is E min θ p1 + (1 − θ )p2 = θ E min (p1 ) + (1 − θ )E min (p2 ), ∀θ ∈ [0, 1]. (3.12) The first observation is Lemma 3.6. Assume that assumption (3.12) holds and that, for some 0 ≤ p1 < p < p2 the infimum E min (p) is achieved by some function u p. Then, we have c(u p) = Moreover, 0 < c(u p )
1. Moreover, its gradient belongs to $L^q(\mathbb{R}^2)$ for any $q > 1$ as well. Its Fourier transform satisfies the relation
\[ \widehat{w}(\xi) = \frac{1}{2} \frac{\xi_1^2}{|\xi|^2 + \xi_1^4} \widehat{w^2}(\xi). \tag{3.19} \]
As a consequence of (3.19), there exists a smooth function $v$ which solves $w = \partial_1 v$. Indeed, at least formally, the distribution $v$ whose Fourier transform is given by
\[ \widehat{v}(\xi) = - \frac{i}{2} \frac{\xi_1}{|\xi|^2 + \xi_1^4} \widehat{w^2}(\xi), \tag{3.20} \]
has the desired property. More precisely, the necessary properties of $v$ can be deduced from properties of the kernel $H_0$ given by
\[ \widehat{H_0}(\xi) = \frac{\xi_1}{|\xi|^2 + \xi_1^4}. \]
First, we notice that, by Theorems 3 and 4 of [24], the kernel $H_0$ belongs to $L^q(\mathbb{R}^2)$ for any $2 < q < +\infty$. Since the function $w^2$ as well as any of its derivatives is a smooth function in $L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$, the map $v$ defined in (3.20) is also smooth and belongs to $L^q(\mathbb{R}^2)$ for any $q > 2$, and then relations (3.19) and (3.20) yield as expected $w = \partial_1 v$. We also deduce that the gradient of $v$ belongs to $L^q(\mathbb{R}^2)$ for any $q > 1$. Indeed, the first order partial derivatives of $v$ are given by
\[ \widehat{\partial_j v}(\xi) = \frac{1}{2} \frac{\xi_1 \xi_j}{|\xi|^2 + \xi_1^4} \widehat{w^2}(\xi). \]
By Propositions 1 and 2 of [24], the kernels $H_j$ defined by
\[ \widehat{H_j}(\xi) = \frac{\xi_1 \xi_j}{|\xi|^2 + \xi_1^4} \]
belong to $L^q(\mathbb{R}^2)$ for any $1 < q < 3$, if $j = 1$, and for any $1 < q < \frac{3}{2}$, otherwise. Since $w^2$ is in $L^q(\mathbb{R}^2)$ for any $q \geq 1$, the function $\nabla v$ does belong to $L^q(\mathbb{R}^2)$ for any $q > 1$. We are now in position to define a comparison map $w_\mathfrak{p}$ making use of the formal scaling (1.11),
(3.21)
where and ϕ are the functions defined by (x1 , x2 ) = 1 − and
t 2 2 w x1 , √ x2 , 2 2
t 2 ϕ (x1 , x2 ) = √ v x1 , √ x2 . 2 2
(3.22)
(3.23)
Here, t denotes some positive parameter to be fixed later, whereas the constant is chosen so that the identity w2 (3.24) p= 72 R2 holds. We claim that the map w belongs to W (R2 ). Indeed, the integrability properties for v and w first show that the functions − 1, ∇ and ∇ϕ belong to L q (R2 ) for any q > 1, whereas ϕ is in L q (R2 ) for any q > 2. On the other hand, it follows from definitions (3.21), (3.22) and (3.23), and the boundedness of v and w that there exists some constant K > 0 such that |Re(w ) − 1| ≤ K | − 1| + ϕ2 , |Im(w )| ≤ K |ϕ |, |∇Re(w )| ≤ K |∇ | + |ϕ ||∇ϕ | , |∇Im(w )| ≤ K |∇ϕ | + |ϕ ||∇ | , so that w does belong to W (R2 ). The next step is to determine the value of the parameter t so that p(w ) = p. For that purpose, we compute the momentum of w making use of formula (1.6). This gives t 2 3t 3 p(w ) = w2 − w3 . (3.25) 2 R2 8 R2 Therefore, there exists a positive number tp such that p(w ) = p. We expand tp using Eqs. (3.24) and (3.25), ! 3 4 1 2 R2 w tp = + 18 3 p + O p . p→0 2 6 2 w
(3.26)
R
On the other hand, using definition (3.21) and formula (2.41), the value of E(w ) is given by 5 3 2 2 2 2 2 (∂1 w) + (∂2 v) + √ E(w ) = t w + √ (∂2 w) √ 2 R2 4 2 R2 8 2 R2 t − √ 3 3 w3 + 5 w∂2 v 4 2 R2 R2 5 2 5 t + √ w4 − 7 w 2 (∂2 v)2 . 16 2 2 R2 R2 Since for t = tp, p(w ) = p, we may invoke the definition of E min (p) to conclude that there exists some constant K > 0 only depending on w such that tp2 E min (p) ≤ E(w ) ≤ √ 2
R2
+
w2 −
3 2 4
2 4
R2
R2
(∂1 w)2 + (∂2 v)2
3 w tp + K 5 .
By (3.24) and (3.26), we are led to √ 2 E min (p) ≤ 2p 1 + A2 (w)p + K p4 ,
(3.27)
where the number $A_2(w)$ is given, in view of (1.9), by
\[ A_2(w) = 2592\, \frac{E_{KP}(w)}{\big( \int_{\mathbb{R}^2} w^2 \big)^3}. \]
In order to complete the proof of inequality (3.18), we optimize the value of the constant $A_2(w)$ with a suitable choice of the solution $w$ to Eq. (1.10). For this purpose, we first relate the coefficient $A_2(w)$ with the action $S(w)$ of the considered solution $w$. By [10,24], the following equalities hold
\[ E_{KP}(w) = - \frac{1}{6} \int_{\mathbb{R}^2} w^2, \text{ and } S(w) = \frac{1}{3} \int_{\mathbb{R}^2} w^2, \tag{3.28} \]
so that
\[ A_2(w) = - \frac{48}{S(w)^2}. \]
Since $S(w) > 0$ by the second equation of (3.28), it remains to minimize the action $S(w)$ among all non-trivial solutions $w$ to Eq. (1.10). In view of the definition of ground states, the minimizer $S_{KP}$ is precisely the action of a ground-state solution to Eq. (1.10), so that the optimal inequality provided by (3.27) is exactly (3.18).

4. Properties of Solutions on $\mathbb{T}_n^N$

The purpose of this section is to present a number of results concerning solutions to (TWc) on the torus which are presumably of independent interest. Many of the results presented here have already been derived elsewhere (in particular in [3]), possibly in a slightly different form. We nevertheless wish to give here a self-contained presentation. Since many of the results in this section are valid in higher dimensions, we consider more generally the torus $\mathbb{T}_n^N$ in dimension $N$, and let $v \equiv v_c^n$ be an arbitrary non-trivial solution to (TWc) on $\mathbb{T}_n^N$. As we have seen in the introduction, working on tori has a number of important advantages. The first one is that they are compact, which makes it quite easy to establish the existence of minimizing solutions. The second one is that the torus is invariant by translation. Working on tori introduces however a number of small new difficulties. Some of them are related to the identification $\mathbb{T}_n^N \simeq [-\pi n, \pi n]^N$, which we clarify next, together with the notion of unfolding.

4.1. Working on tori. Working on tori presents a number of peculiarities which we would like to point out in this subsection. We begin with the usual definition
\[ \mathbb{T}_n^N = \mathbb{R}^N / (2\pi n \mathbb{Z})^N \]
obtained by the identification $x \sim x'$ if and only if $x - x' \in (2\pi n \mathbb{Z})^N$. For $\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{R}^N$, the cube $C_\alpha \equiv \prod_{i=1}^N [-\pi n + \alpha_i, \pi n + \alpha_i[$ contains a unique element of each equivalence class ($C_\alpha$ is often termed a fundamental domain). It may therefore be identified with $\mathbb{T}_n^N$. Given $\alpha \in \mathbb{R}^N$, the unfolding $\tau_\alpha$ of $\mathbb{T}_n^N$ associated to $\alpha$ is by definition the one-to-one mapping
\[ \tau_\alpha : \mathbb{T}_n^N \to \Omega_n^N \equiv [-\pi n, \pi n[^N, \quad p = [(x_1 + \alpha_1, \ldots, x_N + \alpha_N)] \mapsto (x_1, \ldots, x_N). \]
This corresponds to a translation of the origin in $\mathbb{R}^N$, and thus on the torus. For a given function $f$ defined on $\mathbb{T}_n^N$, each unfolding $\tau_\alpha$ induces a $2\pi n$-periodic function $f_\alpha$ defined on $\Omega_n^N$. In some computations (in particular, dealing with integration by parts for functions which are not necessarily all periodic), we will need to estimate boundary integrals. The following lemma provides a choice of a “good” unfolding of the torus, by averaging.

Lemma 4.1. Let $f$ be a $2\pi n$-periodic function of $L^1(\Omega_n^N)$, and let $A$ be a subset of $[-\pi n, \pi n]$. There exists some $\alpha_N$ in $[-\pi n, \pi n] \setminus A$ such that
\[ \bigg| \int_{[-\pi n, \pi n]^{N-1} \times \{\alpha_N\}} f(x)\, dx \bigg| \leq \frac{1}{2\pi n - |A|} \int_{\Omega_n^N} |f(x)|\, dx. \tag{4.1} \]
In particular, we may find an unfolding $\tau_\alpha$ of the torus $\mathbb{T}_n^N$ such that
\[ \bigg| \int_{[-\pi n, \pi n]^{N-1} \times \{-\pi n, \pi n\}} f_\alpha(x)\, dx \bigg| \leq \frac{2}{2\pi n - |A|} \int_{\Omega_n^N} |f(x)|\, dx, \]
and
\[ \bigg| \int_{\partial \Omega_n^N} f_\alpha(x)\, dx \bigg| \leq \Big( \frac{2}{2\pi n - |A|} + \frac{N - 1}{\pi n} \Big) \int_{\Omega_n^N} |f(x)|\, dx. \]
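Proof. Integrate the l.h.s of (4.1) for $\alpha_N \in [-\pi n, \pi n] \setminus A$ and use the mean-value theorem.

Remark 4.1. The trace of $f_\alpha$ is well-defined for almost every unfolding. In the sequel, we will no longer distinguish $f$ and $f_\alpha$: this hopefully will not lead to any confusion.

Remark 4.2. Recall that, on $X_n^N \equiv H^1(\mathbb{T}_n^N)$, we have defined the momentum as
\[ p_n(v) = \frac{1}{2} \int_{\mathbb{T}_n^N} \langle i \partial_1 v, v \rangle. \]
For a map $u$ in the space $Y_n^N \equiv \{ u \in H^1(\Omega_n^N, \mathbb{C}),\ u \equiv 1 \text{ on } \partial \Omega_n^N \} \subset X_n^N$, we claim that we have
\[ p_n(u) = m_n(u) \equiv \frac{1}{2} \int_{\Omega_n^N} \langle J u, \zeta_1 \rangle. \tag{4.2} \]
Here, the Jacobian $J u$ of $u$ denotes the 2-form defined on $\Omega_n^N$ by
\[ J u \equiv \frac{1}{2} d(u \times du) = \sum_{1 \leq i < j \leq N} (\partial_i u \times \partial_j u)\, dx_i \wedge dx_j, \]
and $\zeta_1$ denotes the 2-form defined on $\Omega_n^N$ by
\[ \zeta_1(x) \equiv - \frac{2}{N - 1} \sum_{i=2}^N x_i\, dx_1 \wedge dx_i. \tag{4.3} \]
Finally, $\langle \cdot, \cdot \rangle$ stands for the scalar product of 2-forms. Claim (4.2) reduces in fact to an integration by parts. Indeed, we write
\[ 2 \langle J u, \zeta_1 \rangle\, dx_1 \wedge \ldots \wedge dx_N = d(u \times du) \wedge \star\zeta_1 = d\big( (u \times du) \wedge \star\zeta_1 \big) + (u \times du) \wedge d(\star\zeta_1), \tag{4.4} \]
where $\star$ denotes the Hodge-star operator for differential forms. The special choice of $\zeta_1$ yields the identity
\[ (u \times du) \wedge d(\star\zeta_1) = 2 \langle i \partial_1 u, u \rangle\, dx_1 \wedge \ldots \wedge dx_N. \]
Combining with (4.4) and integrating on the torus, we are led to
\[ \int_{\mathbb{T}_n^N} \langle i \partial_1 u, u \rangle - \int_{\Omega_n^N} \langle J u, \zeta_1 \rangle = - \frac{1}{2} \int_{\partial \Omega_n^N} (u \times du) \wedge (\star\zeta_1). \tag{4.5} \]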
If $u$ belongs to $Y_n^N$, then the boundary term is zero, and claim (4.2) follows. As we will see later, the term $m_n(u)$ explicitly appears in Pohozaev's formula. On the torus $\mathbb{T}_n^N$ however, $m_n$ is not well-defined (due to the fact that the 2-form $\zeta_1$ is not periodic, and hence not well-defined on the torus). We will circumvent this difficulty by choosing suitable unfoldings.

4.2. Lifting properties and topological sectors. In several places of this paper, we have to face the following situation. Let $R \geq 1$ and $\ell_0 \in \mathbb{N}^*$ be given, and consider points $x_1, \ldots, x_\ell$ on $\mathbb{T}_n^N$ with $\ell \leq \ell_0$. Assume that we have
\[ |x_i - x_j| \geq 2R, \tag{4.6} \]
for any $i \neq j$, and that $v$ is a map in $H^1(\mathbb{T}_n^N)$ such that
\[ |v(x)| \geq \frac{1}{2}, \quad \forall x \in \mathcal{O}_n\Big( \frac{R}{2} \Big) = \mathbb{T}_n^N \setminus \bigcup_{j=1}^{\ell} B\Big( x_j, \frac{R}{2} \Big). \tag{4.7} \]
The problem we wish to investigate is the following: find conditions such that one may lift the map $v$ on $\mathcal{O}_n(R) = \mathbb{T}_n^N \setminus \bigcup_{j=1}^{\ell} B(x_j, R)$ as $v = \rho \exp i\varphi$, with $\varphi \in H^1(\mathcal{O}_n(R))$. In dimension three, we have a simple answer.

Lemma 4.2. Assume $N = 3$. Given any numbers $E > 0$, $R > 1$ and $\ell_0 \geq 1$, there exists a constant $n(E, R, \ell_0)$, such that if $n \geq n(E, R, \ell_0)$, and $v \in H^1(\mathbb{T}_n^3, \mathbb{C})$ satisfies (4.6), (4.7) and
\[ E_n(v) \leq E, \tag{4.8} \]
then $v = |v| \exp i\varphi$ on $\mathcal{O}_n(R)$, with $\varphi \in H^1(\mathcal{O}_n(R), \mathbb{R})$.
Proof. We consider first a special case.

Case 1. $\mathcal{O}_n(R) = \mathbb{T}_n^3$, i.e. $|v| \ge \frac12$ on $\mathbb{T}_n^3$. In this case we may write $v = |v| w$, with $|w| = 1$, and perform the Hodge-de-Rham decomposition of $w \times dw$. Since $d(w\times dw) = 0$, the Hodge-de-Rham decomposition is written as
$$w \times dw = d\varphi + \sum_{j=1}^{3} \alpha_j\, dx_j,$$
where $\varphi$ is a $H^1$-function, each $\alpha_j$ is a real number, and $dx_j$ stands for the canonical harmonic forms on $\mathbb{T}_n^3$. One then checks that
$$w(x) = \exp i\Big(\varphi(x) + \sum_{j=1}^{3} \alpha_j x_j + \theta\Big),$$
for some constant $\theta \in \mathbb{R}$. Periodicity implies that $\alpha_j$ has the form $\alpha_j = \frac{k_j}{n}$, for some integer $k_j$. The $L^2$-orthogonality of the Hodge-de-Rham decomposition yields
$$\|w \times dw\|^2_{L^2(\mathbb{T}_n^3)} = \|d\varphi\|^2_{L^2(\mathbb{T}_n^3)} + 8\pi^3 n \sum_{j=1}^{3} k_j^2, \tag{4.9}$$
which implies by (4.8),
$$\sum_{j=1}^{3} k_j^2 \le \frac{1}{8\pi^3 n} \|\nabla w\|^2_{L^2(\mathbb{T}_n^3)} \le \frac{E}{\pi^3 n}.$$
Choosing $n(E, 0, 0) = \frac{2}{\pi^3} E$, the previous inequality implies $k_j = 0$ for $n \ge n(E, 0, 0)$. On the other hand, it follows from (4.9) that $\varphi \in H^1(\mathbb{T}_n^N, \mathbb{R})$, so that the conclusion of Lemma 4.2 holds.

Case 2. The general case. Arguing by a density argument, we may assume without loss of generality that $v$ is smooth. Next we claim that there exists a map $\tilde v = |\tilde v| \tilde w$, with $|\tilde w| = 1$, such that
$$|\tilde v| \ge \frac12 \text{ on } \mathbb{T}_n^3, \quad \text{and} \quad \tilde v = v \text{ on } \mathcal{O}_n(R), \tag{4.10}$$
and
$$\|\nabla \tilde w\|^2_{L^2(\mathbb{T}_n^3)} \le K(R, \ell_0) E(v), \tag{4.11}$$
where $K(R, \ell_0)$ is some constant depending only on $R$ and $\ell_0$. The conclusion then follows using the argument of Case 1 for the function $\tilde v$. In particular, it only remains to prove Claims (4.10)–(4.11).
Proof of Claims (4.10)–(4.11). The construction of $\tilde v$ relies on some standard topological arguments. Using the mean-value inequality, there exists some radius $\frac R2 < R_j < R$ such that
$$\int_{\partial B(x_j, R_j)} e(v) \le \frac{2}{R} \int_{B(x_j, R)} e(v). \tag{4.12}$$
On the other hand, $v$ is a continuous function on $\partial B(x_j, R_j)$ such that $|v| \ge \frac12$, so that it can be written as $v = \varrho \exp i\varphi$ on $\partial B(x_j, R_j)$, with $\varrho = |v|$. Denoting $\varrho_j$, resp. $\varphi_j$, the harmonic extensions of $\varrho$, resp. $\varphi$, on $B(x_j, R_j)$, we consider the function $\tilde v$ defined by
$$\tilde v = v \text{ on } \mathbb{T}_n^3 \setminus \bigcup_{j=1}^{\ell} B(x_j, R_j), \qquad \tilde v = \varrho_j \exp i\varphi_j \text{ on } B(x_j, R_j), \tag{4.13}$$
so that $\tilde v$ satisfies (4.10), and $|\tilde v| \ge \frac12$ on $\mathbb{T}_n^3$ (by (4.7) and the maximum principle). Moreover, using standard trace theorems, the harmonicity of $\varphi_j$ gives
$$\int_{B(x_j, R_j)} |\nabla \tilde w|^2 = \int_{B(x_j, R_j)} |\nabla \varphi_j|^2 \le K(R) \int_{\partial B(x_j, R_j)} |\nabla \varphi|^2 \le K(R) \int_{\partial B(x_j, R_j)} e(v),$$
where $K(R)$ denotes some constant only depending on $R$, so that, by (4.12),
$$\int_{B(x_j, R_j)} |\nabla \tilde w|^2 \le K(R) \int_{B(x_j, R)} e(v).$$
Claim (4.11) then follows from definition (4.13), and the fact that $\ell \le \ell_0$.
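The counting step of Case 1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it only evaluates the squared $L^2$ norm of the harmonic part $\sum_j (k_j/n)\,dx_j$ on the flat torus $[-\pi n, \pi n]^N$, together with the bound $E/(\pi^3 n)$ and the threshold $2E/\pi^3$ used above for $N = 3$.

```python
import math


def harmonic_l2_norm_sq(ks, n):
    """Squared L^2 norm of sum_j (k_j / n) dx_j on the torus [-pi*n, pi*n]^N,
    where N = len(ks). The form has constant coefficients, so the norm is
    (sum_j k_j^2 / n^2) times the volume (2*pi*n)^N."""
    dim = len(ks)
    volume = (2.0 * math.pi * n) ** dim
    return sum(k * k for k in ks) / (n * n) * volume


def sum_k_squared_bound(energy, n):
    """Upper bound E / (pi^3 * n) on sum_j k_j^2 in dimension three."""
    return energy / (math.pi ** 3 * n)


def threshold_n(energy):
    """The value n(E, 0, 0) = 2E / pi^3 chosen in Case 1 (hypothetical helper name)."""
    return 2.0 * energy / math.pi ** 3


# In dimension three the harmonic part costs 8*pi^3*n per unit of k^2, as in (4.9).
print(harmonic_l2_norm_sq((1, 0, 0), 5), 8 * math.pi ** 3 * 5)
# In dimension two the cost 4*pi^2 does not depend on n.
print(harmonic_l2_norm_sq((1, 0), 5), harmonic_l2_norm_sq((1, 0), 500))
```

At the threshold $n = 2E/\pi^3$ the bound on $\sum_j k_j^2$ equals $\frac12$, which is why the integers $k_j$ must all vanish for larger $n$. In dimension two the same quantity is independent of $n$, so no such threshold exists.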
In dimension two, the situation is very different. In particular, the minimal energy of harmonic 1-forms does not depend on the size $n$ of the torus (by conformal invariance of the energy), so that the statement of Lemma 4.2 does not extend. As an example, one may take the map $v_n = \exp\big(i \frac{x_1}{n}\big)$, whose energy does not depend on $n$, and which is not liftable. However, when the energy is small, the situation is parallel to the three-dimensional case.

Lemma 4.3. Assume $N = 2$, and consider a function $v \in H^1(\mathbb{T}_n^2)$ such that
$$E(v) \le E_0 = \frac{\pi^2}{3}, \quad \text{and} \quad |v| \ge \frac12 \text{ on } \mathbb{T}_n^2,$$
then $v = |v| \exp i\varphi$ on $\mathbb{T}_n^2$, with $\varphi \in H^1(\mathbb{T}_n^2, \mathbb{R})$.

Proof. The proof is identical to Case 1 of the proof of Lemma 4.2, apart from Eq. (4.9), which becomes here
$$\|w \times dw\|^2_{L^2(\mathbb{T}_n^2)} = \|d\varphi\|^2_{L^2(\mathbb{T}_n^2)} + 4\pi^2 \sum_{j=1}^{2} k_j^2,$$
so that
$$\sum_{j=1}^{2} k_j^2 \le \frac{1}{4\pi^2} \|\nabla w\|^2_{L^2(\mathbb{T}_n^2)} \le \frac{2E_0}{\pi^2} \le \frac23.$$
Hence, the numbers $k_j$ are identically equal to 0, so that the conclusion of Lemma 4.3 holds, following the lines of Case 1 of the proof of Lemma 4.2.

If we wish to have a lifting property when the energy is not small, we need to restrict the class of test maps. Indeed, although the zero set of Ginzburg-Landau maps on $\mathbb{T}_n^2$ may be not empty, a restriction on the Ginzburg-Landau energy allows us to define a notion of degree with suitable continuity properties. First, notice that by Sobolev's embedding theorem, for $v \in H^1(\mathbb{T}_n^2)$ and $j \in \{1, 2\}$, the restriction of $v$ to $I^j_{n,r}$ is continuous for almost any $r \in [-\pi n, \pi n]$, where we have set
$$I^1_{n,r} = \{r\} \times [-\pi n, \pi n] \simeq S^1_n, \quad \text{and} \quad I^2_{n,r} = [-\pi n, \pi n] \times \{r\} \simeq S^1_n.$$
In particular, if $v$ does not vanish on $I^j_{n,r}$, in view of periodicity, we may define the degree of $\frac{v}{|v|}$ restricted to $I^j_{n,r}$. Denoting $B_j(v)$ the subset of numbers $r$ in $[-\pi n, \pi n]$ for which the restriction of $v$ to $I^j_{n,r}$ is continuous and does not vanish, we set
$$\mathcal{T}_n^{d_1,d_2} = \Big\{u \in H^1(\mathbb{T}_n^2), \text{ s.t. } \forall j \in \{1,2\},\ \exists B_j \subset B_j(u), \text{ s.t. } |B_j| \ge 2\pi\big(n - n^{\frac34}\big), \text{ and } \deg(u, I^j_{n,r}) = d_j,\ \forall r \in B_j\Big\},$$
for any $(d_1, d_2) \in \mathbb{Z}^2$. It follows from this definition that $\mathcal{T}_n^{d_1,d_2} \cap \mathcal{T}_n^{d_1',d_2'} = \emptyset$ if $(d_1, d_2) \ne (d_1', d_2')$, and $\bigcup_{d\in\mathbb{Z}^2} \mathcal{T}_n^{d} \ne H^1(\mathbb{T}_n^2)$. In the Introduction, we have introduced the set $\mathcal{S}_n^0$, which we next define as
$$\mathcal{S}_n^0 = \mathcal{T}_n^{0,0}. \tag{4.14}$$
We claim that if a map $v$ in $\mathcal{S}_n^0$ satisfies suitable topological assumptions, then $v$ has a lifting.

Lemma 4.4. Assume $N = 2$, and that $v \in \mathcal{S}_n^0 = \mathcal{T}_n^{0,0}$ satisfies (4.6), (4.7) and
$$\deg\Big(\frac{v}{|v|}, \partial B(x_j, R)\Big) = 0, \quad \forall j \in \{1, \ldots, \ell\}. \tag{4.15}$$
Then, there exists a constant $n(R, \ell_0)$ depending only on $R$ and $\ell_0$ such that, if $n \ge n(R, \ell_0)$, then $v = |v| \exp i\varphi$ on $\mathcal{O}_n(R)$, with $\varphi \in H^1(\mathcal{O}_n(R), \mathbb{R})$.

Proof. The proof is very similar to the proof of Lemma 4.2. Using (4.15), we may construct in the two-dimensional case, as in the three-dimensional case, a map $\tilde v$ satisfying (4.10), and such that $|\tilde v| \ge \frac12$ on $\mathbb{T}_n^2$. Moreover, since $\ell \le \ell_0$, we may check that $\tilde v$ belongs to $\mathcal{S}_n^0$ for $n$ sufficiently large, so that we may restrict ourselves to the case $\mathcal{O}_n(R) = \mathbb{T}_n^2$. In this situation, denoting $v = |v| w$, with $|w| = 1$, the argument of Case 1 of the proof of Lemma 4.2 gives that there exist a function $\varphi \in H^1(\mathbb{T}_n^2)$, a constant $\theta$ and integers $k_j$ such that
$$w(x) = \exp i\Big(\varphi(x) + \sum_{j=1}^{2} \frac{k_j x_j}{n} + \theta\Big),$$
for any $x \in \mathbb{T}_n^2$. It follows that $v$ is in $\mathcal{T}_n^{k_1,k_2}$, so that $k_1 = k_2 = 0$, which completes the proof of Lemma 4.4.

We will use the following consequence in the spirit of Lemma 4.3.

Corollary 4.1. Assume $N = 2$, and that $v \in \mathcal{S}_n^0$ satisfies (4.6), (4.7) and
$$E_n\Big(v, \mathcal{O}_n\Big(\frac R2\Big)\Big) < \frac{\pi}{8}. \tag{4.16}$$
Then, there exists a constant $n(R, \ell_0)$ depending only on $R$ and $\ell_0$ such that, if $n \ge n(R, \ell_0)$, then $v = |v| \exp i\varphi$ on $\mathcal{O}_n(R)$, with $\varphi \in H^1(\mathcal{O}_n(R), \mathbb{R})$.

Proof. Corollary 4.1 is a direct consequence of Lemma 4.4, once it is proved that assumption (4.15) is a consequence of assumption (4.16). This last fact follows from a direct computation of the topological degree of $\frac{v}{|v|}$ on $\partial B(x_j, R)$. Indeed, by the mean-value inequality, there exists some radius $\frac R2 \le r_j \le R$ such that
$$\int_{\partial B(x_j, r_j)} e(v) \le \frac{2}{R} E_n\Big(v, \mathcal{O}_n\Big(\frac R2\Big)\Big),$$
so that by Cauchy-Schwarz's inequality,
$$\int_{\partial B(x_j, r_j)} |\nabla v| \le \Big(\frac{8\pi r_j}{R} E_n\Big(v, \mathcal{O}_n\Big(\frac R2\Big)\Big)\Big)^{\frac12}. \tag{4.17}$$
However, the topological degree of $\frac{v}{|v|}$ on $\partial B(x_j, r_j)$ is defined by
$$\deg\Big(\frac{v}{|v|}, \partial B(x_j, r_j)\Big) = \frac{1}{2\pi} \int_{\partial B(x_j, r_j)} \frac{v}{|v|} \wedge \partial_\tau \Big(\frac{v}{|v|}\Big),$$
where $\tau$ denotes the properly oriented unit tangent vector to $\partial B(x_j, r_j)$. Hence, we have by (4.7), (4.16) and (4.17),
$$\Big|\deg\Big(\frac{v}{|v|}, \partial B(x_j, r_j)\Big)\Big| \le \frac{1}{2\pi} \int_{\partial B(x_j, r_j)} \frac{|\nabla v|}{|v|} \le \Big(\frac{8}{\pi} E_n\Big(v, \mathcal{O}_n\Big(\frac R2\Big)\Big)\Big)^{\frac12} < 1,$$
so that
$$\deg\Big(\frac{v}{|v|}, \partial B(x_j, r_j)\Big) = 0.$$
Since $v$ does not vanish on $B(x_j, R) \setminus B(x_j, r_j)$, assumption (4.15) holds. The conclusion then follows invoking Lemma 4.4.
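The degree integral used in this proof has a simple discrete analogue, which may help to fix ideas. The sketch below is an illustration only: `degree_on_circle` is a hypothetical helper, complex numbers stand for points of the plane, and the integral of $\frac{v}{|v|} \wedge \partial_\tau \frac{v}{|v|}$ is replaced by a sum of phase increments between consecutive sample points, which is valid as long as $v$ does not vanish on the circle and the sampling is fine enough.

```python
import cmath
import math


def degree_on_circle(v, center, radius, samples=2048):
    """Approximate the topological degree of v/|v| on the circle
    |z - center| = radius, with v a nonvanishing complex-valued function."""
    points = [center + radius * cmath.exp(2j * math.pi * m / samples)
              for m in range(samples)]
    unit = [v(z) / abs(v(z)) for z in points]
    total = 0.0
    for m in range(samples):
        # Phase increment between consecutive samples, taken in (-pi, pi].
        total += cmath.phase(unit[(m + 1) % samples] / unit[m])
    return round(total / (2.0 * math.pi))


# A single zero inside the circle gives degree 1; no zero inside gives 0.
print(degree_on_circle(lambda z: z, 0j, 1.0))
print(degree_on_circle(lambda z: z, 3 + 0j, 1.0))
```

Since the result is an integer, any bound strictly less than 1 on its absolute value forces it to vanish, which is exactly how (4.16) is used above.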
It remains to verify that these sets have appropriate properties with respect to the methods of calculus of variations. For that purpose we restrict ourselves to the sublevel sets $\mathcal{E}_{n,\Lambda}$ of $H^1(\mathbb{T}_n^2)$ defined by
$$\mathcal{E}_{n,\Lambda} = \{u \in H^1(\mathbb{T}_n^2), \text{ s.t. } E_n(u) \le \Lambda\},$$
and set
$$\mathcal{S}_{n,\Lambda}^{d_1,d_2} = \mathcal{E}_{n,\Lambda} \cap \mathcal{T}_n^{d_1,d_2}.$$
The following result was readily proved by Almeida (see Theorem 6 in [1]).

Theorem 4.1 ([1]). Let $\Lambda > 0$ be given. There exists an integer $n_\Lambda$, such that for every $n \ge n_\Lambda$, we have the partition
$$\mathcal{E}_{n,\Lambda} = \bigsqcup_{(d_1,d_2)\in\mathbb{Z}^2} \mathcal{S}_{n,\Lambda}^{d_1,d_2}.$$
Moreover, the degree application
$$\deg : \mathcal{E}_{n,\Lambda} \to \mathbb{Z}^2, \quad u \mapsto \deg(u) = (d_1(u), d_2(u))$$
is continuous on $\mathcal{E}_{n,\Lambda}$, so that $\mathcal{S}_{n,\Lambda}^{d_1,d_2}$ is a closed subset of $H^1(\mathbb{T}_n^2)$, whereas $\dot{\mathcal{S}}_{n,\Lambda}^{d_1,d_2} \equiv \{u \in H^1(\mathbb{T}_n^2), \text{ s.t. } E_n(u) < \Lambda\} \cap \mathcal{T}_n^{d_1,d_2}$ is an open subset of $H^1(\mathbb{T}_n^2)$.

Notice that in [1] there is a small parameter $\varepsilon > 0$ appearing in the energy functional. In our case, $\varepsilon$ corresponds to $\varepsilon = \frac1n$, and one recovers the context of Theorem 6 in [1] (in the particular case of the torus $\mathbb{T}^2 = \mathbb{R}/(2\pi\mathbb{Z})$) performing the change of scale $\tilde x = \frac{x}{n}$, with the exception of one major difference. Indeed, the sets $\mathcal{T}_n^{d_1,d_2}$ are defined in [1] by
$$\mathcal{T}_n^{d_1,d_2} = \Big\{u \in H^1(\mathbb{T}_n^2), \text{ s.t. } \forall j \in \{1,2\},\ \exists B_j \subset B_j(u), \text{ s.t. } |B_j| \ge \frac{3\pi n}{2}, \text{ and } \deg(u, I^j_{n,r}) = d_j\Big\}.$$
The difference in the assumption on the length of the sets $B_j$ comes from the fact that the proof performed in [1] requires that the sets $B_j$ are at some distance larger than $\frac{\pi n}{4}$ ($\frac18$ in the $\varepsilon$ context of [1]) from the boundary of a suitable unfolding of the torus $\mathbb{T}_n^2$. However, it can be proved using exactly the same arguments as in [1] that this assumption can be replaced by a less restrictive one, where the sets $B_j$ are at some distance only larger than $\frac{\pi n^{3/4}}{2}$ ($\frac{\varepsilon^{1/4}}{2}$ in the $\varepsilon$ context of [1]) from the boundary of a suitable unfolding of the torus $\mathbb{T}_n^2$, so that Theorem 6 in [1] extends to the case considered here.
4.3. Pointwise estimates on $\mathbb{T}_n^N$. Our first result provides local bounds. For a domain $U \subset \mathbb{T}_n^N$, we consider the local energy
$$E_n(v, U) \equiv \int_U e(v) \equiv \int_U \frac{|\nabla v|^2}{2} + \frac{(1 - |v|^2)^2}{4}.$$
As for the whole space, we have on the torus $\mathbb{T}_n^N$ the pointwise estimates.

Lemma 4.5. Let $n \in \mathbb{N}^*$, and let $v$ be a finite energy solution to (TWc) on $\mathbb{T}_n^N$. There exist some constants $K(N)$ and $K(c, k, N)$ such that
$$\|1 - |v|\|_{L^\infty(\mathbb{T}_n^N)} \le \max\Big\{1, \frac c2\Big\}, \qquad \|\nabla v\|_{L^\infty(\mathbb{T}_n^N)} \le K(N)\Big(1 + \frac{c^2}{4}\Big)^{\frac32},$$
and more generally,
$$\|v\|_{C^k(\mathbb{T}_n^N)} \le K(c, k, N), \quad \forall k \in \mathbb{N}.$$

Lemma 4.6. Let $n \in \mathbb{N}^*$ and $r > 0$. Assume that $v$ is a finite energy solution to (TWc) on $\mathbb{T}_n^N$. There exists some constant $K(N)$ such that for any $x_0 \in \mathbb{T}_n^N$,
$$\|1 - |v|\|_{L^\infty(B(x_0, \frac r2))} \le \max\Big\{K(N)\Big(1 + \frac{c^2}{4}\Big)^2 E_n\big(v, B(x_0, r)\big)^{\frac{1}{N+2}},\ \frac{K(N)}{r^N} E_n\big(v, B(x_0, r)\big)^{\frac12}\Big\}.$$
The proofs are, almost word for word, identical to the proofs of Lemmas 2.1 and 2.2 respectively. Therefore, we omit them.

4.4. Upper bounds for the velocity. We first notice that, if $v$ is non-constant, there is at most one value of $c$ for which $v$ might be a solution to (TWc): we sometimes emphasize this fact writing for a non-trivial solution $c = c(v)$. We consider next, for a solution $v$ to (TWc), the discrepancy term
$$\Sigma_n(v) = \sqrt2\, p_n(v) - E_n(v).$$
The main result of this subsection asserts that, in dimensions two and three, if $\Sigma_n(v) > 0$, the speed $c(v)$ can be bounded by a function of $\Sigma_n(v)$ and the energy $E_n(v)$. More precisely, we have

Theorem 4.2. Assume $N = 2$ or $N = 3$, and let $E_0 > 0$ and $\Sigma_0 > 0$ be given. Let $v$ be a non-trivial finite energy solution to (TWc) in $X_n^2 \cap \mathcal{S}_n^0$, resp. $X_n^3$, with $c = c(v) \in \mathbb{R}$, $E_n(v) \le E_0$ and $0 < \Sigma_0 \le \Sigma_n(v)$. Then, there is some constant $n_0 \in \mathbb{N}$ depending only on $E_0$ and $\Sigma_0$ such that, if $n \ge n_0$, then
$$|c(v)| \le K \frac{E_n(v)}{|\Sigma_n(v)|},$$
where $K > 0$ is some universal constant.
Remark 4.3. In contrast with the results in [19], which assert that every finite energy solution to (TWc) with $c > \sqrt2$ is constant, the corresponding result is presumably not true on general tori.

The proof of Theorem 4.2 relies on Pohozaev's formula, which we specify next for solutions to Eq. (TWc) on the torus $\mathbb{T}_n^N$.

Lemma 4.7. Let $n \in \mathbb{N}^*$, and let $v$ be a solution to (TWc) on $\mathbb{T}_n^N$. We have, for any unfolding,
$$\frac{N-2}{2} \int_{\Omega_n^N} |\nabla v|^2 + \frac{N}{4} \int_{\Omega_n^N} (1 - |v|^2)^2 - c(v) \frac{N-1}{2} \int_{\Omega_n^N} \langle Jv, \zeta_1\rangle = \pi n \int_{\partial\Omega_n^N} \Big(\frac{|\nabla v|^2}{2} + \frac{(1-|v|^2)^2}{4}\Big) - \int_{\partial\Omega_n^N} \partial_\nu v \cdot \sum_{j=1}^{N} x_j \partial_j v, \tag{4.18}$$
where $\zeta_1$ is the 2-form defined by (4.3).

Remark 4.4. 1. Notice that $\zeta_1$ is not periodic and therefore (4.18) depends on the choice of the unfolding. 2. Identity (4.18) actually holds for any subdomain $U \subset \Omega_n^N$ replacing the integrals on $\Omega_n^N$ by integrals on $U$, and the boundary integrals by integrals on $\partial U$. In particular, if $0 < R < \pi n$, then $B(0, R) \subset \Omega_n^N$ and Pohozaev's identity yields the inequality
$$\Big|\frac{N-2}{2} \int_{B(0,R)} |\nabla v|^2 + \frac{N}{4} \int_{B(0,R)} (1 - |v|^2)^2 - c\, \frac{N-1}{2} \int_{B(0,R)} \langle Jv, \zeta_1\rangle\Big| \le R \int_{\partial B(0,R)} e(v).$$

The starting point in order to prove Theorem 4.2 and to bound $c(v)$ is formula (4.18). The use of this formula requires an upper bound on the boundary terms on the r.h.s as well as a lower bound for the quantity
$$\int_{\Omega_n^N} \langle Jv, \zeta_1\rangle, \tag{4.19}$$
which depends on the unfolding. We have already noticed that (4.19) is related to the momentum $p_n(v)$ (they would actually even be equal if $v$ were constant on $\partial\Omega_n^N$). An appropriate choice of the unfolding allows to obtain the suitable bounds as the next proposition shows.

Proposition 4.1. Assume $N = 2$ or $N = 3$, and let $E_0 > 0$ be given. Let $v$ be a non-trivial finite energy solution to (TWc) in $X_n^2 \cap \mathcal{S}_n^0$, resp. $X_n^3$, with $E_n(v) \le E_0$. Given any $\delta_0 > 0$, there exists a constant $n_0 \in \mathbb{N}$ depending only on $E_0$ and $\delta_0$, such that, if $n \ge n_0$, then there exists an unfolding of $\mathbb{T}_n^N$ such that
$$\Big|p_n(v) - \frac12 \int_{\Omega_n^N} \langle Jv, \zeta_1\rangle\Big| \le \frac{E_n(v)}{\sqrt2} + \delta_0, \tag{4.20}$$
and
$$n \int_{\partial\Omega_n^N} e(v) \le 2 \int_{\mathbb{T}_n^N} e(v). \tag{4.21}$$
For the proof of Proposition 4.1, we invoke several elementary lemmas. The first one is used throughout the paper.

Lemma 4.8. Let $I$ be an interval of $\mathbb{R}$, such that $|I| \ge 1$. Given any $\delta > 0$, there exists a constant $\mu_0(\delta) > 0$, such that if $u \in H^1(\mathbb{R}, \mathbb{C})$ satisfies
$$\int_I e(u) \le \mu_0(\delta), \tag{4.22}$$
then
$$1 - |u| \le \delta \text{ on } I. \tag{4.23}$$

Proof. By the energy bound, we have
$$\frac12 \int_I |\nabla u|^2 + \frac14 \int_I (1 - |u|^2)^2 \le \mu_0. \tag{4.24}$$
Hence, from the inequality $ab \le \frac12(a^2 + b^2)$, it holds
$$\int_I |\nabla u|\,\big|1 - |u|^2\big| \le \frac{1}{\sqrt2} \int_I |\nabla u|^2 + \frac{\sqrt2}{4} \int_I (1 - |u|^2)^2 \le \sqrt2\, \mu_0,$$
that is
$$\int_I |\nabla \xi(|u|)| \le \sqrt2\, \mu_0,$$
where the function $\xi$ is defined by $\xi(t) = t - \frac{t^3}{3}$. In particular, $\xi$ has a strict local maximum at $t = 1$. Going back to (4.24), the mean-value inequality and the fact that $|I| \ge 1$ yield the existence of some point $x_0 \in I$ such that
$$\big|1 - |u(x_0)|^2\big| \le 2\sqrt{\mu_0}.$$
Combining both the previous inequalities, we deduce that
$$\sup_{x\in I} \big|\xi(|u(x)|) - \xi(1)\big| \le \int_I |\nabla \xi(|u|)| + \big|\xi(|u(x_0)|) - \xi(1)\big| \le \Big(\frac83 + 2\Big)\sqrt{\mu_0},$$
from which (4.23) follows invoking the coercivity of $\xi$ at $t = 1$.
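The coercivity of $\xi$ at $t = 1$ invoked at the end of this proof can be made explicit: one has $\xi(1) - \xi(t) = \frac13 (1-t)^2 (t+2)$, hence $\xi(1) - \xi(t) \ge \frac23 (1-t)^2$ for $t \ge 0$. The factorization is a direct expansion and is not stated in the text; the short sketch below (with hypothetical function names) checks it numerically on a grid.

```python
def xi(t):
    """The function xi(t) = t - t^3 / 3 from the proof of Lemma 4.8."""
    return t - t ** 3 / 3.0


def gap(t):
    """Distance of xi(t) below its local maximum value xi(1) = 2/3."""
    return xi(1.0) - xi(t)


def factored_gap(t):
    """Closed form (1 - t)^2 (t + 2) / 3 of the same quantity."""
    return (1.0 - t) ** 2 * (t + 2.0) / 3.0


# Compare both expressions on a grid of [0, 2] and record the largest mismatch.
grid = [m / 100.0 for m in range(201)]
worst = max(abs(gap(t) - factored_gap(t)) for t in grid)
print(worst)
```

In particular a bound of the form $|\xi(|u|) - \xi(1)| \le C\sqrt{\mu_0}$ gives $(1 - |u|)^2 \le \frac32 C \sqrt{\mu_0}$, which is smaller than $\delta^2$ once $\mu_0$ is small enough.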
Lemma 4.9. Let $u$ be in $H^1([a, b])$ with $|a - b| \ge 1$, such that $u = |u| \exp i\varphi$, with $\varphi(a) = \varphi(b)$. Assume moreover that for some $0 \le \delta < \frac12$, $u$ satisfies (4.22). Then, we have
$$\Big|\int_I \langle i\dot u, u\rangle\Big| \le \frac{\sqrt2}{1-\delta} \int_I e(u).$$

Proof. Since by assumptions, $u = \rho \exp i\varphi$ on $I$, we have $\langle i\dot u, u\rangle = -\rho^2 \dot\varphi$, and $|\dot u|^2 = \rho^2 \dot\varphi^2 + \dot\rho^2$. We next compute
$$\Big|\int_I \langle i\dot u, u\rangle\Big| = \Big|\int_I \rho^2 \dot\varphi\Big| = \Big|\int_I (\rho^2 - 1)\dot\varphi\Big| \le \frac{\sqrt2}{1-\delta} \int_I e(u),$$
where we used the results of Lemma 2.3 and Lemma 4.8 for the last inequality.

In dimension two, a related result is
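Lemma 4.10. Let $0 < \delta < \frac12$ be given. There exists some constant $\mu_1(\delta) > 0$, such that, for any map $u \in H^1(\mathbb{T}_n^2)$ which satisfies
$$\int_{\mathbb{T}_n^2} e(u) \le \mu_1(\delta), \tag{4.25}$$
we have the estimate
$$\Big|\int_{\mathbb{T}_n^2} \langle i\partial_j u, u\rangle\Big| \le \frac{\sqrt2}{1-\delta} \int_{\mathbb{T}_n^2} e(u), \quad \forall j \in \{1, 2\}.$$

Proof. If we knew that $|u| \ge 1 - \delta$, then the proof would essentially follow the same arguments as the proof of Lemma 4.9. However, in contrast with the one-dimensional case, smallness of the energy does not allow to draw that conclusion. To overcome the difficulty, we introduce an approximation of $u$ for which the proof of Lemma 4.9 applies. Indeed, for $\lambda > 1$ to be determined later, we consider a map $u_\lambda \in H^1(\mathbb{T}_n^2, \mathbb{C})$ solution to the minimization problem
$$F_\lambda(u_\lambda) = \inf\{F_\lambda(v),\ v \in H^1(\mathbb{T}_n^2, \mathbb{C})\}, \quad \text{where} \quad F_\lambda(v) = \frac{\lambda}{2} \int_{\mathbb{T}_n^2} |u - v|^2 + \int_{\mathbb{T}_n^2} e(v).$$
Existence of $u_\lambda$ is straightforward. By minimality of $u_\lambda$, we have
$$\frac{\lambda}{2} \int_{\mathbb{T}_n^2} |u - u_\lambda|^2 + \int_{\mathbb{T}_n^2} e(u_\lambda) \le \int_{\mathbb{T}_n^2} e(u), \tag{4.26}$$
and the Euler-Lagrange equation writes
$$-\Delta u_\lambda = \lambda(u - u_\lambda) + u_\lambda(1 - |u_\lambda|^2) \text{ on } \mathbb{T}_n^2.$$
In particular $u_\lambda$ is smooth. We compute the difference
$$\langle i\partial_j u_\lambda, u_\lambda\rangle - \langle i\partial_j u, u\rangle = \langle i\partial_j u_\lambda, u_\lambda - u\rangle + \langle i\partial_j(u_\lambda - u), u\rangle,$$
so that integrating by parts, we are led to the identity
$$\int_{\mathbb{T}_n^2} \langle i\partial_j u_\lambda, u_\lambda\rangle - \int_{\mathbb{T}_n^2} \langle i\partial_j u, u\rangle = \int_{\mathbb{T}_n^2} \langle i\partial_j u_\lambda, u_\lambda - u\rangle + \int_{\mathbb{T}_n^2} \langle i(u - u_\lambda), \partial_j u\rangle,$$
and hence
$$\Big|\int_{\mathbb{T}_n^2} \langle i\partial_j u_\lambda, u_\lambda\rangle - \int_{\mathbb{T}_n^2} \langle i\partial_j u, u\rangle\Big| \le \|u - u_\lambda\|_{L^2(\mathbb{T}_n^2)} \Big(\|\nabla u_\lambda\|_{L^2(\mathbb{T}_n^2)} + \|\nabla u\|_{L^2(\mathbb{T}_n^2)}\Big) \le \frac{4}{\sqrt\lambda} \int_{\mathbb{T}_n^2} e(u), \tag{4.27}$$
where we have used (4.26) for the last inequality. We choose therefore the value of the parameter $\lambda = \lambda(\delta)$ so that
$$\frac{1}{\sqrt{\lambda(\delta)}} = \frac{1}{2\sqrt2}\Big(\frac{1}{1-\delta} - \frac{1}{1-\frac\delta2}\Big). \tag{4.28}$$
For this choice of $\lambda$, we claim that there exists a constant $\mu_1(\delta) > 0$, such that, if (4.25) is satisfied, then
$$|u_\lambda| \ge 1 - \frac\delta2 \text{ on } \mathbb{T}_n^2. \tag{4.29}$$
We postpone the proof of Claim (4.29), and complete the proof of Lemma 4.10. Indeed, invoking Lemma 4.3, and using (4.25) and (4.26), with $\mu_1(\delta)$ sufficiently small, we can assume that $u_\lambda$ is written as $u_\lambda = |u_\lambda| \exp i\varphi$ on $\mathbb{T}_n^2$, with $\varphi \in H^1(\mathbb{T}_n^2)$. In view of Claim (4.29), the same argument as in Lemma 4.9 then shows that
$$\Big|\int_{\mathbb{T}_n^2} \langle i\partial_j u_\lambda, u_\lambda\rangle\Big| \le \frac{\sqrt2}{1 - \frac\delta2} \int_{\mathbb{T}_n^2} e(u),$$
so that the conclusion follows from (4.27) and the choice (4.28) of $\lambda(\delta)$.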
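Proof of Claim (4.29). We write
$$\big|u_\lambda(1 - |u_\lambda|^2)\big| \le 2\big|1 - |u_\lambda|^2\big| 1_{\{|u_\lambda| \le 2\}} + \big||u_\lambda|^2 - 1\big|^{\frac32} 1_{\{|u_\lambda| \ge 2\}},$$
so that if $u$ satisfies (4.25) with $0 \le \mu_1(\delta) \le 1$, then
$$\big\|u_\lambda(1 - |u_\lambda|^2)\big\|_{L^2 + L^{\frac43}(\mathbb{T}_n^2)} \le 2\big\|1 - |u_\lambda|^2\big\|_{L^2(\mathbb{T}_n^2)} + \big\|1 - |u_\lambda|^2\big\|^{\frac32}_{L^2(\mathbb{T}_n^2)} \le 10 \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12}.$$
By (4.28), we have
$$\|\lambda(u - u_\lambda)\|_{L^2(\mathbb{T}_n^2)} \le \lambda\delta \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12},$$
so that
$$\|\Delta u_\lambda\|_{L^2 + L^{\frac43}(\mathbb{T}_n^2)} \le 10(\lambda\delta + 1) \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12}.$$
By (4.26), we are led to
$$\|\nabla u_\lambda\|_{H^1 + W^{1,\frac43}(\mathbb{T}_n^2)} \le K(\lambda\delta + 1) \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12},$$
where $K$ is some universal constant, so that, by Sobolev's embedding theorem,
$$\|\nabla u_\lambda\|_{L^4(B(x,1))} \le K(\lambda\delta + 1) \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12},$$
for any point $x \in \mathbb{T}_n^2$. It follows therefore from Morrey's embedding theorem that we have
$$|u_\lambda(x) - u_\lambda(y)| \le K(\lambda\delta + 1) \Big(\int_{\mathbb{T}_n^2} e(u)\Big)^{\frac12} |x - y|^{\frac12} \le K(\lambda\delta + 1)\mu_1(\delta)^{\frac12} |x - y|^{\frac12},$$
for any $|x - y| \le 1$. To conclude, assume by contradiction that there is a point $x_0$ such that $|u_\lambda(x_0)| \le 1 - \frac\delta2$. Then, we have $|u_\lambda(x)| \le 1 - \frac\delta4$ for any $x \in B(x_0, r_0)$, where the radius $r_0$ is given by
$$r_0 = \frac{\delta^2}{16K^2(\lambda\delta + 1)^2 \mu_1(\delta)},$$
so that integrating, we obtain
$$\int_{B(x_0, r_0)} (1 - |u_\lambda|^2)^2 \ge \frac{\pi r_0^2 \delta^2}{16} = \frac{\pi\delta^6}{(16)^3 K^4 (\lambda\delta + 1)^4 \mu_1(\delta)^2},$$
which implies using (4.25), $\mu_1(\delta)^3 \ge K\delta^{10}$, and leads to a contradiction if the number $\mu_1(\delta)$ is chosen sufficiently small.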
We are now in position to give the proof of Proposition 4.1.

Proof of Proposition 4.1. The starting point is formula (4.5), the main point being to estimate the boundary term. Since the computations depend on the dimension, we distinguish two cases $N = 2$ and $N = 3$.

Case 1. $N = 2$. We have $\zeta_1 = -2x_2\, dx_1 \wedge dx_2$, so that $\star\zeta_1 = -2x_2$. Inserting this identity into (4.5), we obtain
$$\int_{\mathbb{T}_n^2} \langle i\partial_1 v, v\rangle - \int_{\Omega_n^N} \langle Jv, \zeta_1\rangle = n\pi \int_{[-\pi n, \pi n] \times \{-\pi n, \pi n\}} \langle i\partial_1 v, v\rangle \tag{4.30}$$
for any unfolding of the torus. Let $\delta > 0$ be fixed, to be determined later, and consider the subset $A$ of $[-\pi n, \pi n]$ defined by $\alpha \in A$ if and only if
$$\int_{[-\pi n, \pi n] \times \{\alpha\}} e(v) \ge \frac{\mu_0(\delta)}{2},$$
where $\mu_0(\delta)$ is the constant provided by Lemma 4.8. Notice in particular that by integration
$$|A| \le \frac{2E_n(v)}{\mu_0(\delta)} \le \frac{2E_0}{\mu_0(\delta)} \le 2\pi n^{\frac34}, \tag{4.31}$$
as soon as $n \ge \big(\frac{E_0}{\pi\mu_0(\delta)}\big)^{\frac43}$. Using Lemma 4.8, it follows that
$$|v| \ge 1 - \delta \text{ on } [-\pi n, \pi n] \times \{\alpha\},$$
for any $\alpha \in [-\pi n, \pi n] \setminus A$. Since $v \in \mathcal{S}_n^0$, the topological degree of $\frac{v}{|v|}$ is equal to 0 on $[-\pi n, \pi n] \times \{\alpha\}$, for any $\alpha$ in a subset $B$ of $[-\pi n, \pi n] \setminus A$ with $|B| \ge 2\pi(n - 2n^{\frac34})$ by (4.31), so that $v$ may be written as
$$v = |v| \exp i\phi \text{ on } [-\pi n, \pi n] \times \{\alpha\}, \tag{4.32}$$
for any $\alpha \in B$. We next apply Lemma 4.1 to the function $f \equiv e(v)$ and the complementary $B^c$ of the set $B$. This yields an unfolding of the torus such that $\pm\pi n \notin A$,
$$\int_{[-\pi n, \pi n] \times \{-\pi n, \pi n\}} e(v) \le \mu_0(\delta), \qquad \int_{[-\pi n, \pi n] \times \{-\pi n, \pi n\}} e(v) \le \frac{1}{\pi\big(n - 2n^{\frac34}\big)} E_n(v),$$
and
$$\int_{\partial\Omega_n^N} e(v) \le \Big(\frac{1}{\pi n} + \frac{1}{\pi\big(n - 2n^{\frac34}\big)}\Big) E_n(v). \tag{4.33}$$
Invoking (4.32) to apply Lemma 4.9 to this choice of unfolding, we have therefore
$$n\pi \Big|\int_{[-\pi n, \pi n] \times \{-\pi n, \pi n\}} \langle i\partial_1 v, v\rangle\Big| \le \frac{\sqrt2\, n E_n(v)}{\big(n - 2n^{\frac34}\big)(1 - \delta)}.$$
We first fix δ
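0. Let $\delta_0 > 0$ be given, and consider the unfolding provided by Proposition 4.1 for any $n \ge n_0$. It follows from (4.20) that
$$\Sigma_n(v) \le \frac{1}{\sqrt2} \Big|\int_{\Omega_n^N} \langle Jv, \zeta_1\rangle\Big| + \sqrt2\, \delta_0.$$
Choosing $\delta_0$ so that $\sqrt2\, \delta_0 < \frac{\Sigma_0}{2}$, we are led to the inequality
$$\Sigma_n(v) \le \sqrt2\, \Big|\int_{\Omega_n^N} \langle Jv, \zeta_1\rangle\Big|, \tag{4.37}$$
for any $n \ge n_0$. On the other hand, we may invoke Lemma 4.7 to assert that
$$\Big|\frac{N-2}{2} \int_{\Omega_n^N} |\nabla v|^2 + \frac{N}{4} \int_{\Omega_n^N} (1 - |v|^2)^2 - c(v) \frac{N-1}{2} \int_{\Omega_n^N} \langle Jv, \zeta_1\rangle\Big| \le \pi n \int_{\partial\Omega_n^N} e(v),$$
which yields, combined with (4.21) and provided $n \ge n_0$,
$$|c(v)|\, \Big|\int_{\Omega_n^N} \langle Jv, \zeta_1\rangle\Big| \le K E_n(v), \tag{4.38}$$
for some universal constant $K > 0$. Combining (4.37) and (4.38), we deduce
$$|c(v)|\, \Sigma_n(v) \le K E_n(v),$$
which yields the desired conclusion.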
4.5. Concentration of energy. The results in this section are a first step towards the proof of Proposition 4. By a standard covering argument, we first deduce from Lemma 4.6,

Lemma 4.11. Let $c_0 > 0$ and $E_0 > 0$ be given. Let $v$ be a finite energy solution to (TWc) on $\mathbb{T}_n^N$ such that $|c(v)| \le c_0$ and $E_n(v) \le E_0$. Given any $\delta > 0$, there exists a number $\ell_0 \in \mathbb{N}$ depending only on $c_0$, $E_0$ and $\delta$, such that there exists a finite number $\ell(v) \le \ell_0$ of points $x_1, \ldots, x_{\ell(v)}$ in $\mathbb{T}_n^N$ which satisfy
$$1 - |v| \le \delta \text{ on } \mathbb{T}_n^N \setminus \bigcup_{i=1}^{\ell(v)} B(x_i, 1), \tag{4.39}$$
and, for any $i \in \{1, \ldots, \ell(v)\}$,
$$1 - |v(x_i)| \ge \delta. \tag{4.40}$$

Proof. It follows from Lemma 4.5 that $v$ is continuous on $\mathbb{T}_n^N$, so that the set $V_\delta$ defined by
$$V_\delta = \{x \in \mathbb{T}_n^N, \text{ s.t. } |1 - |v|| \ge \delta\},$$
is compact, and is therefore included in a finite collection of balls $B(x_i, \frac15)$ with $x_i \in V_\delta$. Using Vitali's covering lemma, there exists a finite subcollection of balls $(B(x_i, \frac15))_{1 \le i \le \ell(v)}$ so that
$$V_\delta \subset \bigcup_{i=1}^{\ell(v)} B(x_i, 1),$$
and
$$B\Big(x_i, \frac15\Big) \cap B\Big(x_j, \frac15\Big) = \emptyset, \quad \forall 1 \le i \ne j \le \ell(v). \tag{4.41}$$
In particular, conclusions (4.39) and (4.40) hold for this subcollection. On the other hand, we deduce from Lemma 4.6 that
$$E_n\Big(v, B\Big(x_i, \frac15\Big)\Big) \ge K(N, c_0, \delta),$$
where $K(N, c_0, \delta)$ is some positive constant depending on $N$, $c_0$ and $\delta$, so that invoking (4.41),
$$E_0 \ge \sum_{i=1}^{\ell(v)} E_n\Big(v, B\Big(x_i, \frac15\Big)\Big) \ge K(N, c_0, \delta)\, \ell(v).$$
Hence, there exists some integer $\ell_0 \in \mathbb{N}$ depending only on $c_0$, $E_0$ and $\delta$ so that $\ell(v) \le \ell_0$, which completes the proof of Lemma 4.11.

Considering clusters of the balls $B(x_i, 1)$, and enlarging possibly the radius, we may assume that their mutual distance is even larger in view of the following abstract but elementary lemma.

Lemma 4.12. Let $X$ be a metric space, and consider $\ell$ distinct points $x_1, \ldots, x_\ell$ in $X$. Let $\mu_0 > 0$ and $0 < \kappa \le \frac12$ be given. Then, there exists $\mu > 0$ such that
$$\mu_0 \le \mu \le \Big(\frac2\kappa\Big)^{\ell} \mu_0,$$
and a subset $\{x_j\}_{j\in J}$ of $\{x_i\}_{1\le i\le \ell}$ such that
$$\bigcup_{i=1}^{\ell} B(x_i, \mu_0) \subset \bigcup_{j\in J} B(x_j, \mu), \tag{4.42}$$
and
$$\operatorname{dist}(x_j, x_k) \ge \frac\mu\kappa, \quad \forall j \ne k \in J. \tag{4.43}$$
Proof. The proof is by iteration, in at most $\ell$ steps. First, consider the collection $\{x_i\}_{1\le i\le \ell}$. If (4.42) and (4.43) are verified with $\mu = \mu_0$, there is nothing else to do. Otherwise, take two points, say $x_1$ and $x_2$ such that $\operatorname{dist}(x_1, x_2) \le \kappa^{-1}\mu_0$, consider the collection $\{x_2, \ldots, x_\ell\}$, and set $\mu = 2\kappa^{-1}\mu_0$. If (4.42) is verified, we stop. Otherwise, we go on in the same way. If the process does not stop in $\ell - 1$ steps, at the $\ell$th step, we are left with one single ball of radius $\mu = 2^{\ell}\kappa^{-\ell}\mu_0$, and (4.43) is void.

We may apply Lemma 4.12 to the points $x_1, \ldots, x_{\ell(v)}$ provided by Lemma 4.11. It follows that there exists some $1 \le \mu \le \big(\frac2\kappa\big)^{\ell(v)}$, and a subset $J$ of $\{1, \ldots, \ell(v)\}$ such that
$$\bigcup_{i=1}^{\ell(v)} B(x_i, 1) \subset \bigcup_{j\in J} B(x_j, \mu), \quad \text{and} \quad |x_j - x_k| \ge \frac\mu\kappa, \quad \forall j \ne k \in J. \tag{4.44}$$
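The iteration in the proof of Lemma 4.12 is easy to run by hand or by machine. The sketch below is one possible reading of it, under stated assumptions: `merge_clusters` is a hypothetical name, the metric is passed as a function, and at each step one point of a pair closer than $\mu/\kappa$ is discarded while the radius is replaced by $2\mu/\kappa$. Since $\kappa \le \frac12$, the ball around the discarded point stays inside the enlarged ball around its neighbour, so the covering (4.42) is preserved.

```python
def merge_clusters(points, mu0, kappa, dist):
    """Return (centers, mu) such that the balls B(x, mu0), x in points, are
    covered by the balls B(c, mu), c in centers, and distinct centers are at
    distance at least mu / kappa from each other."""
    if not 0 < kappa <= 0.5:
        raise ValueError("kappa must lie in (0, 1/2]")
    centers = list(points)
    mu = mu0
    while True:
        # Look for a pair of centers violating the separation condition.
        close_pair = next(
            ((a, b)
             for a in range(len(centers))
             for b in range(a + 1, len(centers))
             if dist(centers[a], centers[b]) < mu / kappa),
            None,
        )
        if close_pair is None:
            return centers, mu
        # Discard one point of the pair and enlarge the common radius.
        del centers[close_pair[0]]
        mu = 2 * mu / kappa


centers, mu = merge_clusters([0.0, 1.0, 100.0], 1.0, 0.5, lambda a, b: abs(a - b))
print(centers, mu)
```

Each pass removes one point, so the loop ends after at most $\ell - 1$ enlargements and the final radius is at most $(2/\kappa)^{\ell}\mu_0$, in agreement with the statement of the lemma.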
In particular,
$$|v| \ge 1 - \delta \ge \frac12 \text{ on } \mathcal{O}_n(\mu) \equiv \mathbb{T}_n^N \setminus \bigcup_{j\in J} B(x_j, \mu), \tag{4.45}$$
and for any $j \in J$, $1 - |v(x_j)| \ge \delta$. The main result in this subsection is
Proposition 4.2. Let $E_0$, $c_0$, $\delta$ and $v$ be as in Lemma 4.11, and assume the points $x_1, \ldots, x_{\ell(v)}$, and the set $J$ are such that (4.44) and (4.45) are satisfied. i) Let 0 < κ
0, let $v$ be a non-trivial finite energy solution to (TWc) in $X_n^2 \cap \mathcal{S}_n^0$, resp. $X_n^3$, satisfying (4.57), and $E_n(v) \le E_0$. Then there exists some integer $n_0$ only depending on $E_0$ such that
$$\|1 - |v|\|_{L^\infty(\mathbb{T}_n^N)} \ge \frac{\varepsilon(v)^2}{10}, \quad \forall n \ge n_0.$$

Proof. The proof is similar to the proof of Proposition 2.4, invoking Lemmas 4.2 and 4.4 to construct a lifting of $v$. Therefore, we omit it.

5. Asymptotics of Solutions on Expanding Tori

In this section, we consider the asymptotics $n \to +\infty$ for a sequence $(v^n)_{n\in\mathbb{N}^*}$ of solutions of (TWc) on tori $\mathbb{T}_n^N$. Our purpose is to carry out the asymptotic analysis of the sequence, in the spirit of concentration-compactness. We assume throughout that there exists a constant $K > 0$, which is independent of $n$, such that, for any $n \in \mathbb{N}$,
$$E_n(v^n) \le K, \quad 0 \le p_n(v^n) \le K, \quad \text{and} \quad c(v^n) \le K.$$
Passing possibly to a subsequence, we may assume, without loss of generality, that for some numbers $E \ge 0$, $p \ge 0$ and $c \ge 0$, we have
$$E_n(v^n) \to E, \quad p_n(v^n) \to p, \quad \text{and} \quad c(v^n) \to c, \quad \text{as } n \to +\infty. \tag{5.1}$$
The main result in this section is
Theorem 5.1. Assume $N = 2$ or $N = 3$, and let $(v^n)_{n\in\mathbb{N}^*}$ be a sequence of solutions of (TWc) in $X_n^2 \cap \mathcal{S}_n^0$, resp. $X_n^3$, satisfying (5.1). Assume moreover that $E > 0$ and
$$0 < c < \sqrt2. \tag{5.2}$$
There exist an integer $\ell_0$ depending only on $c$ and $E$, $\ell$ non-trivial finite energy solutions $v_1, \ldots, v_\ell$ to (TWc) on $\mathbb{R}^N$ of speed $c$ with $1 \le \ell \le \ell_0$, $\ell$ points $x_1^n, \ldots, x_\ell^n$, and a subsequence of $(v^n)_{n\in\mathbb{N}^*}$ still denoted $(v^n)_{n\in\mathbb{N}^*}$ such that
$$|x_i^n - x_j^n| \to +\infty, \quad \text{as } n \to +\infty, \tag{5.3}$$
and
$$v^n(\cdot + x_i^n) \to v_i(\cdot) \text{ in } C^k(K), \quad \text{as } n \to +\infty, \tag{5.4}$$
for any $1 \le i \ne j \le \ell$, any $k \in \mathbb{N}$, and any compact set $K \subset \mathbb{R}^N$. Moreover, we have the identities
$$E = \lim_{n\to+\infty} E_n(v^n) = \sum_{i=1}^{\ell} E(v_i), \quad \text{and} \quad p = \lim_{n\to+\infty} p_n(v^n) = \sum_{i=1}^{\ell} p(v_i). \tag{5.5}$$
In Theorem 5.1, the tori $\mathbb{T}_n^N$ are identified with the subdomains $\Omega_n^N$ of $\mathbb{R}^N$, using possibly a suitable unfolding, so that convergence (5.3) makes sense. We will also need a variant of Theorem 5.1 in the sonic case.

Theorem 5.2. Assume $N = 3$, and let $(v^n)_{n\in\mathbb{N}^*}$ be as in Theorem 5.1 with assumption (5.2) replaced by
$$c = \sqrt2.$$
Let $0 < \delta < 1$ be given. There exist an integer $\ell_0$ depending only on $E$, $\ell$ non-trivial finite energy solutions $v_1, \ldots, v_\ell$ to (TWc) on $\mathbb{R}^N$ of speed $\sqrt2$ with $0 \le \ell \le \ell_0$, $\ell$ points $x_1^n, \ldots, x_\ell^n$, and a subsequence of $(v^n)_{n\in\mathbb{N}^*}$ still denoted $(v^n)_{n\in\mathbb{N}^*}$ such that (5.3) and (5.4) hold. Moreover, there exist real numbers $\mu \ge 0$ and $\nu$ such that we have the identities
$$E = \lim_{n\to+\infty} E_n(v^n) = \sum_{i=1}^{\ell} E(v_i) + \mu, \quad \text{and} \quad p = \lim_{n\to+\infty} p_n(v^n) = \sum_{i=1}^{\ell} p(v_i) + \nu, \tag{5.6}$$
and the inequality
$$|\mu - \sqrt2\, \nu| \le K\delta\mu, \tag{5.7}$$
where $K$ is some universal constant.

Remark 5.1. In contrast with the subsonic case, where $\ell \ge 1$, i.e. there is always at least one non-trivial finite energy solution on $\mathbb{R}^N$, with speed $c$, namely $v_1$, appearing in the limiting behaviour, here, we may well have $\ell = 0$. In this situation we have $E = \mu$ and $p = \nu$.
The first observation which paves the way to the proof of Theorem 5.1 is that the elliptic estimates derived in Subsect. 4.3 lead to the compactness of sequences of solutions, when we consider their restrictions to bounded domains. More precisely, as a direct consequence of Lemma 4.5, we have

Lemma 5.1. Let $(v^n)_{n\in\mathbb{N}^*}$ be as in Theorem 5.1 or 5.2. There exists a finite energy solution $v$ to (TWc) with speed $c$, and a subsequence of $(v^n)_{n\in\mathbb{N}}$ still denoted $(v^n)_{n\in\mathbb{N}}$ such that
$$v^n \to v \text{ in } C^k(K), \quad \text{as } n \to +\infty,$$
for any $k \in \mathbb{N}$, and any compact set $K \subset \mathbb{R}^N$.

Proof. Since the family of speeds $(c_n)_{n\in\mathbb{N}}$ is bounded, it follows from Lemma 4.5, that, for any $k \in \mathbb{N}$, there exists a constant $K(k)$ such that
$$\|v^n\|_{C^k(\mathbb{T}_n^N)} \le K(k), \quad \forall n \in \mathbb{N}.$$
Hence, it follows from Ascoli's theorem, that, considering for any $j \in \mathbb{N}$, the compact balls $B(0, j)$, there exists a subsequence (depending possibly on $j$), and a smooth map $v_j$ on $B(0, j)$ such that
$$v^n \underset{n\to+\infty}{\to} v_j \text{ in } C^k(B(0, j)), \quad \forall k \in \mathbb{N}.$$
Taking the limit in Eq. (TWc) for $v^n$, one verifies that the map $v_j$ solves (TWc), with speed $c$, the limit of the speeds $c(v^n)$ as $n \to +\infty$. To conclude, we let $j \to +\infty$, and we invoke a diagonal argument, so that in particular $v = v_j$ does not depend on the ball $B(0, j)$.

Lemma 5.1 does not handle the invariance by translations of the equation, which is a source of non-compactness. In particular, if we do not take care of this invariance, the limit $v$ provided by Lemma 5.1 might well be equal to a constant, so that there is no hope to have conservation of energy and momentum. In order to handle the invariance by translations, we invoke the following general result.

Lemma 5.2. Let $\ell \in \mathbb{N}^*$, and consider for any $n \in \mathbb{N}^*$, a family $\{x_1^n, \ldots, x_\ell^n\}$ of $\ell$ points in $\mathbb{T}_n^N$. There exist a subset $J_\infty$ of $\{1, \ldots, \ell\}$, a non-decreasing injection $\sigma : \mathbb{N} \to \mathbb{N}$, and sequences $(\kappa_k)_{k\in\mathbb{N}}$, $(R_k)_{k\in\mathbb{N}}$ and $(n(k))_{k\in\mathbb{N}}$ such that
$$0 < \kappa_k < \frac{1}{64}, \quad \kappa_k \to 0, \text{ as } k \to +\infty, \quad \text{and} \quad 1 \le R_k < \Big(\frac{2}{\kappa_k}\Big)^{\ell}, \quad \forall k \in \mathbb{N},$$
and such that we have the relations
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma(n)}, 1\big) \subset \bigcup_{j\in J_\infty} B\big(x_j^{\sigma(n)}, R_k\big),$$
and
$$\operatorname{dist}\big(x_i^{\sigma(n)}, x_j^{\sigma(n)}\big) \ge \frac{R_k}{4\kappa_k}, \quad \forall i \ne j \in J_\infty,$$
for any $n \ge n(k)$.
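Proof. We iterate Lemma 4.12 applied to the family $\{x_1^n, \ldots, x_\ell^n\}$ with values of $\kappa$ going to zero. More precisely, we introduce a new parameter $m \in \mathbb{N}$ which will eventually go to $+\infty$ and take $\kappa$ of the form
$$\kappa_m = \frac{1}{m + 64}.$$
Starting with $m = 0$, Lemma 4.12 yields for any $n \in \mathbb{N}$, a subset $J_0^n$ of $\{1, \ldots, \ell\}$, and a number $0 < \mu_0^n \le \big(\frac{2}{\kappa_0}\big)^{\ell}$ such that
$$\bigcup_{i=1}^{\ell} B(x_i^n, 1) \subset \bigcup_{j\in J_0^n} B(x_j^n, \mu_0^n), \quad \text{and} \quad \operatorname{dist}(x_i^n, x_j^n) \ge \frac{\mu_0^n}{\kappa_0}, \quad \forall i \ne j \in J_0^n.$$
We may extract a subsequence $\sigma_0 : \mathbb{N} \to \mathbb{N}$ such that $J_0^{\sigma_0(n)}$ does not depend on $n$, so that we may denote it $J_0$, and such that the sequence $(\mu_0^{\sigma_0(n)})_{n\in\mathbb{N}}$ has a limit which we denote $\mu_0$. In particular, there exists some integer $n_0$ such that
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma_0(n)}, 1\big) \subset \bigcup_{j\in J_0} B\big(x_j^{\sigma_0(n)}, 2\mu_0\big),$$
and
$$\operatorname{dist}\big(x_i^{\sigma_0(n)}, x_j^{\sigma_0(n)}\big) \ge \frac{\mu_0}{2\kappa_0}, \quad \forall i \ne j \in J_0,$$
for any $n \ge n_0$. We next proceed the same way with $\{x_1^{\sigma_0(n)}, \ldots, x_\ell^{\sigma_0(n)}\}_{n\in\mathbb{N}}$ and $\kappa_1$, so that by Lemma 4.12, we find a subset $J_1^n$ of $\{1, \ldots, \ell\}$, and a number $0 < \mu_1^n \le \big(\frac{2}{\kappa_1}\big)^{\ell}$ such that
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma_0(n)}, 1\big) \subset \bigcup_{j\in J_1^n} B\big(x_j^{\sigma_0(n)}, \mu_1^n\big),$$
and
$$\operatorname{dist}\big(x_i^{\sigma_0(n)}, x_j^{\sigma_0(n)}\big) \ge \frac{\mu_1^n}{\kappa_1}, \quad \forall i \ne j \in J_1^n,$$
for any $n \in \mathbb{N}$. We may extract a new subsequence $\tilde\sigma_1 : \mathbb{N} \to \mathbb{N}$ such that, for $\sigma_1 = \tilde\sigma_1 \circ \sigma_0$, the set $J_1^{\sigma_1(n)}$ does not depend on $n$, so that we may denote it $J_1$, and such that the sequence $(\mu_1^{\sigma_1(n)})_{n\in\mathbb{N}}$ has a limit which we denote $\mu_1$. In particular, there exists some integer $n_1$ such that
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma_1(n)}, 1\big) \subset \bigcup_{j\in J_1} B\big(x_j^{\sigma_1(n)}, 2\mu_1\big),$$
and
$$\operatorname{dist}\big(x_i^{\sigma_1(n)}, x_j^{\sigma_1(n)}\big) \ge \frac{\mu_1}{2\kappa_1}, \quad \forall i \ne j \in J_1,$$
for any $n \ge n_1$. We then iterate this argument to construct for any $j \in \mathbb{N}$, some non-decreasing injection $\tilde\sigma_j : \mathbb{N} \to \mathbb{N}$, some subset $J_j$ of $\{1, \ldots, \ell\}$, some number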
$1 < \mu_j \le \big(\frac{2}{\kappa_j}\big)^{\ell}$, and some integer $n_j$, such that, if we set $\sigma_j = \tilde\sigma_j \circ \sigma_{j-1}$, then we have
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma_j(n)}, 1\big) \subset \bigcup_{k\in J_j} B\big(x_k^{\sigma_j(n)}, 2\mu_j\big), \quad \text{and} \quad \operatorname{dist}\big(x_i^{\sigma_j(n)}, x_k^{\sigma_j(n)}\big) \ge \frac{\mu_j}{2\kappa_j}, \tag{5.8}$$
for any $i \ne k \in J_j$, and any $n \ge n_j$. We then set for any $n \in \mathbb{N}$, following the usual diagonal argument,
$$\sigma(n) = \sigma_n(n) \quad \text{and} \quad R_n = 2\mu_n.$$
In view of (5.8), we have for this choice
$$\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma(n)}, 1\big) \subset \bigcup_{j\in J_m} B\big(x_j^{\sigma(n)}, R_m\big),$$
and
$$\operatorname{dist}\big(x_i^{\sigma(n)}, x_j^{\sigma(n)}\big) \ge \frac{R_m}{4\kappa_m}, \quad \forall i \ne j \in J_m,$$
for any $m \in \mathbb{N}$, and any $n \ge n_m$. To conclude, we finally extract a subsequence $(\alpha(m))_{m\in\mathbb{N}}$ such that $J_{\alpha(m)}$ does not depend on $m$.

In the course of the proof of Theorem 5.1, we will combine Lemma 5.2 with Proposition 4.2. For that purpose, some decay properties of travelling wave solutions, which we recalled in Subsect. 2.3, turn out to be central in the arguments.

Proof of Theorem 5.1. Set $\varepsilon = \sqrt{2 - c^2}$, so that we may assume, without loss of generality that $\frac\varepsilon2 \le \varepsilon(v^n) = \sqrt{2 - c(v^n)^2} \to \varepsilon$, as $n \to +\infty$, and that the energy $E_n(v^n)$ is uniformly bounded by some constant $E_0$. Our starting point is Lemma 4.11, which we apply to $v^n$, with $c_0 = \sqrt{2 - \frac{\varepsilon^2}{4}}$, and $\delta > 0$ taken as
δ = δ0 (c0 ) = inf
√
2 − c0 1 , , √ 2 2(K (c0 ) + 1) 2
(5.9)
where $K(c_0)$ is the constant appearing in Proposition 4.2. This yields a finite number $1 \le \ell_n \le \ell(\delta_0)$ of points $x_1^n, \ldots, x_{\ell_n}^n$ in $\mathbb{T}_n^N$ such that
$$1 - |v^n| \le \delta \text{ on } \mathbb{T}_n^N \setminus \bigcup_{i=1}^{\ell_n} B(x_i^n, 1),$$
and
$$1 - |v^n(x_i^n)| \ge \delta(c_0), \quad \forall 1 \le i \le \ell_n. \tag{5.10}$$
Notice that the collection is not empty (i.e. $\ell_n \ge 1$), otherwise the map $v^n$ would be constant, in view of Proposition 4.3. Passing possibly to a subsequence, we may assume that the number $\ell_n$ is independent of $n$, so that we may denote it $\ell$. We next apply Lemma 5.2 to the family $\{x_1^n, \ldots, x_\ell^n\}_{n\in\mathbb{N}^*}$. Passing to a further subsequence, there exist a subset $J_\infty$ of $\{1, \ldots, \ell\}$, and sequences $(\kappa_k)_{k\in\mathbb{N}}$, $(n(k))_{k\in\mathbb{N}}$ and $(R_k)_{k\in\mathbb{N}}$ such that
$$0 < \kappa_k < \frac{1}{64}, \quad \kappa_k \to 0, \text{ as } k \to +\infty, \quad \text{and} \quad 1 \le R_k < \Big(\frac{2}{\kappa_k}\Big)^{\ell},$$
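for any $k \in \mathbb{N}$, and such that we have the relations
\[
\bigcup_{i=1}^{\ell} B\big(x_i^{\sigma(n)}, 1\big) \subset \bigcup_{j \in J_\infty} B\big(x_j^{\sigma(n)}, R_k\big), \quad \text{and} \quad \operatorname{dist}\big(x_i^{\sigma(n)}, x_j^{\sigma(n)}\big) \geq \frac{R_k}{4\kappa_k}, \tag{5.11}
\]
for any $i \neq j \in J_\infty$, and any $n \geq n(k)$. We next apply Lemma 5.1 to the sequences of maps $(v^n(\cdot + x_j^n))_{n \in \mathbb{N}^*}$, for any $j \in J_\infty$, to assert that there exists some finite energy solution $v_j$ with speed $c$ to (TWc) such that we have, up to a subsequence,
\[
v^n(\cdot + x_j^n) \to v_j \ \text{ in } \ C^m(B(0, R)), \ \text{ as } n \to +\infty, \tag{5.12}
\]
for any $R > 0$, and any $m \in \mathbb{N}$. It follows from (5.10) that $1 - |v_j(0)| \geq \delta(c_0)$, so that $v_j$ is not constant. At this stage, it remains to establish identities (5.5).

Identity for the limiting energy. We introduce the number $\mu_k = \frac{R_k}{8\sqrt{\kappa_k}}$ for any $k \in \mathbb{N}$, as well as the exterior set
\[
U_k = \mathbb{T}_n^N \setminus \bigcup_{j \in J_\infty} B\Big(x_j^n, \frac{R_k}{8\kappa_k}\Big) = \mathbb{T}_n^N \setminus \bigcup_{j \in J_\infty} B\Big(x_j^n, \frac{\mu_k}{\sqrt{\kappa_k}}\Big),
\]
so that $\mu_k \to +\infty$ as $k \to +\infty$. In view of relations (5.11), we are in position to apply Proposition 4.2 to $v^n$, with $\mu = \mu_k$ and $\kappa = \sqrt{\kappa_k}$, which yields
\[
\int_{U_k} e(v^n) \leq \frac{K(c_0) E_0}{|\ln(\kappa_k)| \varepsilon^2} = \underset{k \to +\infty}{o}(1), \ \forall n \geq n(k), \tag{5.13}
\]
where $K(c_0)$ is some constant possibly depending on $c_0$. In view of convergence (5.12), and since $\frac{\mu_k}{\sqrt{\kappa_k}} \to +\infty$, as $k \to +\infty$, we have for any fixed $k$,
\[
\lim_{n \to +\infty} E_n\Big(v^n, B\Big(x_j^n, \frac{\mu_k}{\sqrt{\kappa_k}}\Big)\Big) = E\Big(v_j, B\Big(0, \frac{\mu_k}{\sqrt{\kappa_k}}\Big)\Big) = E(v_j) + \underset{k \to +\infty}{o}(1). \tag{5.14}
\]
Combining (5.13) and (5.14), we deduce that
\[
\lim_{n \to +\infty} E_n(v^n) = \sum_{j \in J_\infty} E(v_j) + \underset{k \to +\infty}{o}(1).
\]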
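Identity for the limiting momentum. Let $r > 0$ be such that $|v_j| \geq \frac{1}{2}$ on $\mathbb{R}^N \setminus B(0, r)$, for any $j \in J_\infty$, so that in particular we may write $v_j = \varrho_j \exp i\varphi_j$ on $\mathbb{R}^N \setminus B(0, r)$. Let $0 \leq \chi \leq 1$ be a smooth function with compact support on $\mathbb{R}^N$ such that $\chi = 1$ on $B(0, r)$. In view of Lemma 2.4, we have
\[
\mathfrak{p}_j \equiv p(v_j) = \tilde{p}(v_j) = \frac{1}{2} \int_{\mathbb{R}^N} \Big( \langle i\partial_1 v_j, v_j \rangle + \partial_1\big((1 - \chi)\varphi_j\big) \Big).
\]
Since the integrand is integrable on $\mathbb{R}^N$, and since $r_k \equiv \frac{\mu_k}{\sqrt{\kappa_k}} \to +\infty$, as $k \to +\infty$, we have
\[
p(v_j) = p_k(v_j) + \underset{k \to +\infty}{o}(1), \tag{5.15}
\]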
where we have set
\[
p_k(v_j) = \frac{1}{2} \int_{B(0, r_k)} \langle i\partial_1 v_j, v_j \rangle + \frac{1}{2 r_k} \int_{\partial B(0, r_k)} \varphi_j(x) x_1 \, dx.
\]
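We now go back to the sequence $(v^n)_{n \in \mathbb{N}^*}$. Using Lemma 4.2, we may write $v^n = \varrho^n \exp i\varphi^n$ on the set
\[
O_n(2R_k) \equiv \mathbb{T}_n^N \setminus \bigcup_{j \in J_\infty} B(x_j^n, 2R_k)
\]
in dimension three for any $n \geq n(k)$. The existence of such a lifting in the two-dimensional case is more involved. Using (5.13), we can invoke Corollary 4.1 to write $v^n = \varrho^n \exp i\varphi^n$ on the set $O_n(2\mu_k) \equiv \mathbb{T}_n^N \setminus \bigcup_{j \in J_\infty} B(x_j^n, 2\mu_k)$, so that, since $v^n$ does not vanish on each annulus $B(x_j^n, 2\mu_k) \setminus B(x_j^n, 2R_k)$, we may also write $v^n = \varrho^n \exp i\varphi^n$ on the set $O_n(2R_k)$ in dimension two. In particular, $\langle i\partial_1 v^n, v^n \rangle = -(\varrho^n)^2 \partial_1 \varphi^n$ for any $n \geq n(k)$, so that, since $U_k$ is included in $O_n(2R_k)$, we have
\[
\int_{U_k} \langle i\partial_1 v^n, v^n \rangle = -\int_{U_k} (\varrho^n)^2 \partial_1 \varphi^n = \int_{U_k} \big(1 - (\varrho^n)^2\big) \partial_1 \varphi^n + \sum_{j \in J_\infty} \frac{1}{r_k} \int_{\partial B(x_j^n, r_k)} \varphi^n(x) x_1 \, dx.
\]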
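It follows that
\[
p_n(v^n) = \sum_{j \in J_\infty} P_{j,k}(v^n) + \frac{1}{2} \int_{U_k} \big(1 - (\varrho^n)^2\big) \partial_1 \varphi^n, \tag{5.16}
\]
where we have set
\[
P_{j,k}(v^n) = \frac{1}{2} \int_{B(x_j^n, r_k)} \langle i\partial_1 v^n, v^n \rangle + \frac{1}{2 r_k} \int_{\partial B(x_j^n, r_k)} \varphi^n(x) x_1 \, dx.
\]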
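We deduce from convergence (5.12) that, for any fixed $k$, we have
\[
P_{j,k}(v^n) \to p_k(v_j), \ \text{ as } n \to +\infty. \tag{5.17}
\]
On the other hand, we have by Lemma 2.3 and inequality (5.13),
\[
\bigg| \int_{U_k} \big(1 - (\varrho^n)^2\big) \partial_1 \varphi^n \bigg| \leq 2\sqrt{2} \int_{U_k} e(v^n) \leq \frac{K(c_0) E_0}{|\ln(\kappa_k)| \varepsilon^2} = \underset{k \to +\infty}{o}(1). \tag{5.18}
\]
Combining (5.15), (5.16), (5.17) and (5.18), we obtain
\[
\lim_{n \to +\infty} p_n(v^n) = \sum_{j \in J_\infty} p(v_j),
\]
so that the proof is complete.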
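For orientation, the way (5.15), (5.16), (5.17) and (5.18) combine can be laid out as a single estimate. This is only a sketch in the notation of the proof; taking the limit in $n$ first and in $k$ second is our reading of the argument, not something the text states explicitly.

```latex
% Sketch: triangle inequality built on the decomposition (5.16).
\begin{align*}
\Big| p_n(v^n) - \sum_{j \in J_\infty} p(v_j) \Big|
  &\leq \sum_{j \in J_\infty} \big| P_{j,k}(v^n) - p_k(v_j) \big|
      + \sum_{j \in J_\infty} \big| p_k(v_j) - p(v_j) \big|
      + \frac{1}{2} \bigg| \int_{U_k} \big(1 - (\varrho^n)^2\big) \partial_1 \varphi^n \bigg| . \\
% First term: tends to 0 as n -> +infinity for fixed k, by (5.17).
% Second term: o(1) as k -> +infinity, by (5.15), and J_infty is finite.
% Third term: o(1) as k -> +infinity, uniformly in n >= n(k), by (5.18).
\limsup_{n \to +\infty} \Big| p_n(v^n) - \sum_{j \in J_\infty} p(v_j) \Big|
  &\leq \underset{k \to +\infty}{o}(1).
\end{align*}
```

Since the left-hand side of the last line does not depend on $k$, letting $k \to +\infty$ gives the stated limit.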
Proof of Theorem 5.2. The beginning of the proof is identical to the proof of Theorem 5.1 above, with the exception of one major difference: whereas the choice of $\delta$ (when applying Lemma 4.11) is given in the proof of Theorem 5.1 by (5.9) (which would yield 0 in the sonic case), here we use the parameter $\delta$ which is provided in the statement of Theorem 5.2. Another difference concerns the integer $\ell$: it might be equal to 0, and at this stage is bounded by a number possibly depending on $\delta$. All the arguments and estimates extend to the sonic case, except estimates (5.13) for the integral of the energy density on $U_k$, and estimate (5.18) for the momentum on $U_k$. Since the total energy is bounded, passing possibly to a further subsequence, we may assume that there exists some number $0 \leq \mu \leq E$ such that
\[
\int_{U_k} e(v^n) \to \mu, \ \text{ as } n \to +\infty. \tag{5.19}
\]
Combining (5.19) with (5.14), we deduce the first equality in (5.6). For the second equality, using Lemma 4.2, we may write $v^n = \varrho^n \exp i\varphi^n$ on the set $O_n(2R_k)$, and passing possibly to another subsequence, we may assume similarly that, for some number $\nu \in \mathbb{R}$, we have
\[
\frac{1}{2} \int_{U_k} \big(1 - (\varrho^n)^2\big) \partial_1 \varphi^n \to \nu, \ \text{ as } n \to +\infty, \tag{5.20}
\]
so that combining (5.20) with (5.15), (5.16) and (5.17), we deduce the identity for the limiting momentum in (5.6). For inequality (5.7), we invoke inequality (4.47) in Proposition 4.2 which yields
√ 2 1 − (n )2 ∂1 ϕ n − e(v n ) ≤ K (c0 ) δ e(v n ) + o (1) , k→+∞ 2 Uk Uk
which yields the desired conclusion, taking the limit $n \to +\infty$. Finally, in order to see that the number $\ell$ may be bounded independently of $\delta$, we invoke Lemma 2.14. Indeed, each function $v_j$ is a non-trivial finite energy solution to (TWc) on $\mathbb{R}^3$, so that by Lemma 2.14, $E(v_j) \geq \mathcal{E}_0$. Hence,
\[
E = \sum_{j=1}^{\ell} E(v_j) + \mu \geq \ell \mathcal{E}_0,
\]
so that $\ell \leq \ell_0 \equiv \frac{E}{\mathcal{E}_0}$ is bounded independently of $\delta$.
6. Properties of $E^n_{\min}(\mathfrak{p})$ and $u^n_{\mathfrak{p}}$
In this section, we provide the proofs of Propositions 2 and 4. We also show that $E^n_{\min}(\mathfrak{p})$ converges to $E_{\min}(\mathfrak{p})$ as $n \to +\infty$.
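6.1. Proof of Proposition 2. The first task is to establish the existence of a minimizer for $(\mathcal{P}_n^N(\mathfrak{p}))$. For that purpose, we consider a minimizing sequence $(w_k)_{k \in \mathbb{N}}$ for $(\mathcal{P}_n^N(\mathfrak{p}))$. Since $E_n(w_k)$ is uniformly bounded with respect to $k$, $(w_k)_{k \in \mathbb{N}}$ is bounded in $H^1(\mathbb{T}_n^N)$, so that, passing possibly to a subsequence, we may assume that
\[
w_k \rightharpoonup u^n_{\mathfrak{p}} \ \text{ in } H^1(\mathbb{T}_n^N), \ \text{ as } k \to +\infty, \tag{6.1}
\]
for some $u^n_{\mathfrak{p}} \in H^1(\mathbb{T}_n^N)$. By weak lower semi-continuity and Rellich's compactness theorem, we infer that
\[
E_n(u^n_{\mathfrak{p}}) \leq \liminf_{k \to +\infty} E_n(w_k) = E^n_{\min}(\mathfrak{p}), \tag{6.2}
\]
and
\[
p_n(u^n_{\mathfrak{p}}) = \lim_{k \to +\infty} \frac{1}{2} \int_{\mathbb{T}_n^N} \langle i w_k, \partial_1 w_k \rangle = \mathfrak{p}. \tag{6.3}
\]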
The sequel of the proof is different in dimension two, or in dimension three. The two-dimensional case is substantially more difficult due to the topological constraint on the test functions.

Case 1. $N = 3$. Since $u^n_{\mathfrak{p}}$ belongs to $\Gamma_n^3(\mathfrak{p})$ by (6.3), it is a minimizer for $(\mathcal{P}_n^N(\mathfrak{p}))$, so that the Lagrange multiplier rule implies that $dE_n(u^n_{\mathfrak{p}}) = c^n_{\mathfrak{p}} \, dp_n(u^n_{\mathfrak{p}})$, for some $c^n_{\mathfrak{p}} \in \mathbb{R}$. The previous equality is precisely the weak formulation for the equation
\[
i c^n_{\mathfrak{p}} \partial_1 u^n_{\mathfrak{p}} + \Delta u^n_{\mathfrak{p}} + u^n_{\mathfrak{p}} \big(1 - |u^n_{\mathfrak{p}}|^2\big) = 0 \ \text{ on } \mathbb{T}_n^3,
\]
whose finite energy solutions are smooth by standard elliptic theory.

Case 2. $N = 2$. In order to prove that $u^n_{\mathfrak{p}}$ is a minimizer for $(\mathcal{P}_n^2(\mathfrak{p}))$, it remains to verify in view of (6.2) and (6.3), that
\[
u^n_{\mathfrak{p}} \in S_n^0, \tag{6.4}
\]
a difficulty which was not present in the three-dimensional case. To prove (6.4), we are going to show that a suitable choice of the minimizing sequence yields strong convergence to $u^n_{\mathfrak{p}}$. This will yield the conclusion in view of the closedness of $S_n^0 \cap E^{n,\Lambda}$, for any fixed $\Lambda$. The main tool is Ekeland's variational principle. Indeed, we consider some number $\Lambda$ such that $\Lambda > E_{\min}(\mathfrak{p})$. Using Corollary 3.1, there exists some integer $n(\Lambda)$ such that $E^n_{\min}(\mathfrak{p}) < \Lambda$ for any $n \geq n(\Lambda)$. In particular, using Ekeland's variational principle (see [12]), we can construct a minimizing sequence $(w_k)_{k \in \mathbb{N}}$ for $(\mathcal{P}_n^2(\mathfrak{p}))$ such that
\[
E^n_{\min}(\mathfrak{p}) \leq E_n(w_k) < \Lambda, \ \forall k \in \mathbb{N}, \tag{6.5}
\]
and
\[
E_n(w_k) - E_n(w) \leq \frac{1}{k} \| w_k - w \|_{H^1(\mathbb{T}_n^2)}, \ \forall w \in \Gamma_n^2(\mathfrak{p}), \ \forall k \in \mathbb{N}^*. \tag{6.6}
\]
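Letting $\delta > 0$, and $\psi \in H^1(\mathbb{T}_n^2)$, and invoking Theorem 4.1 and (6.5), the function $w_k - \delta\psi$ belongs to $E^{n,\Lambda} \cap S_n^0$ for any $\delta$ sufficiently small, and any $n$ sufficiently large. Moreover,
\[
p_n(w_k - \delta\psi) = p_n(w_k) - \delta \int_{\mathbb{T}_n^2} \langle i\partial_1 w_k, \psi \rangle + \delta^2 p_n(\psi) \to \mathfrak{p}, \ \text{ as } \delta \to 0,
\]
so that the function $z_{k,\delta} = \sqrt{\frac{\mathfrak{p}}{p_n(w_k - \delta\psi)}} \, (w_k - \delta\psi)$ belongs to $\Gamma_n^2(\mathfrak{p})$ for $\delta$ sufficiently small. Setting $w = z_{k,\delta}$ in inequality (6.6), and taking the limit $\delta \to 0$ after dividing by $\delta$, we are led to
\[
\big| \lambda_k \, dp_n(w_k)(\psi) - dE_n(w_k)(\psi) \big| \leq \frac{1}{k} \bigg\| \frac{dp_n(w_k)(\psi)}{2\mathfrak{p}} w_k - \psi \bigg\|_{H^1(\mathbb{T}_n^2)},
\]
where $\lambda_k = \frac{1}{2\mathfrak{p}} dE_n(w_k)(w_k)$. By (6.5), this gives
\[
\big| \lambda_k \, dp_n(w_k)(\psi) - dE_n(w_k)(\psi) \big| \leq \frac{K(\Lambda)}{k} \| \psi \|_{H^1(\mathbb{T}_n^2)},
\]
where $K(\Lambda)$ is some constant only depending on $\Lambda$. In particular, choosing $\psi = u^n_{\mathfrak{p}}$, we are led to
\[
\lambda_k \, dp_n(w_k)(u^n_{\mathfrak{p}}) - dE_n(w_k)(u^n_{\mathfrak{p}}) \to 0, \ \text{ as } k \to +\infty.
\]
On the other hand, it follows from (6.1) that
\[
dp_n(w_k)(u^n_{\mathfrak{p}}) \to 2 p_n(u^n_{\mathfrak{p}}) = 2\mathfrak{p}, \ \text{ as } k \to +\infty,
\]
and
\[
dE_n(w_k)(u^n_{\mathfrak{p}}) \to \int_{\mathbb{T}_n^2} \Big( |\nabla u^n_{\mathfrak{p}}|^2 - |u^n_{\mathfrak{p}}|^2 \big(1 - |u^n_{\mathfrak{p}}|^2\big) \Big), \ \text{ as } k \to +\infty,
\]
so that
\[
\lambda_k \to \frac{1}{2\mathfrak{p}} \int_{\mathbb{T}_n^2} \Big( |\nabla u^n_{\mathfrak{p}}|^2 - |u^n_{\mathfrak{p}}|^2 \big(1 - |u^n_{\mathfrak{p}}|^2\big) \Big), \ \text{ as } k \to +\infty.
\]
Hence, using (6.1) and Rellich's compactness theorem, we have
\[
\int_{\mathbb{T}_n^2} |\nabla w_k|^2 \to 2\mathfrak{p} \lim_{k \to +\infty} \lambda_k + \int_{\mathbb{T}_n^2} |u^n_{\mathfrak{p}}|^2 \big(1 - |u^n_{\mathfrak{p}}|^2\big) = \int_{\mathbb{T}_n^2} |\nabla u^n_{\mathfrak{p}}|^2, \ \text{ as } k \to +\infty,
\]
which proves the strong $H^1$-convergence of the sequence $(w_k)_{k \in \mathbb{N}}$ towards $u^n_{\mathfrak{p}}$. In particular, since $E^{n,\Lambda} \cap S_n^0$ is closed by Theorem 4.1, $u^n_{\mathfrak{p}}$ belongs to $S_n^0$, so that $u^n_{\mathfrak{p}}$ is a minimizer for $(\mathcal{P}_n^2(\mathfrak{p}))$. Moreover, the set $\{u \in H^1(\mathbb{T}_n^2), \ \text{s.t. } E_n(u) < \Lambda\} \cap S_n^0$ is open by Theorem 4.1, so that the Lagrange multiplier rule implies that
\[
i c^n_{\mathfrak{p}} \partial_1 u^n_{\mathfrak{p}} + \Delta u^n_{\mathfrak{p}} + u^n_{\mathfrak{p}} \big(1 - |u^n_{\mathfrak{p}}|^2\big) = 0 \ \text{ on } \mathbb{T}_n^2,
\]
for some $c^n_{\mathfrak{p}} \in \mathbb{R}$. Hence, $u^n_{\mathfrak{p}}$ is also smooth in the two-dimensional case.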
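The step from (6.6) to the inequality involving $\lambda_k$ compresses a first-order expansion in $\delta$. A sketch of that expansion follows. It assumes that $w_k$ satisfies the constraint $p_n(w_k) = \mathfrak{p}$, that $dp_n(w_k)(\psi) = \int_{\mathbb{T}_n^2} \langle i\partial_1 w_k, \psi \rangle$, and that $z_{k,\delta}$ is normalised with a square root (the natural choice for a quadratic functional such as $p_n$); these are our readings of the proof, not statements quoted from it.

```latex
% Sketch of the first-order expansion behind the Ekeland step.
\begin{align*}
p_n(w_k - \delta\psi)
  &= \mathfrak{p} - \delta\, dp_n(w_k)(\psi) + \delta^2 p_n(\psi), \\
\sqrt{\frac{\mathfrak{p}}{p_n(w_k - \delta\psi)}}
  &= 1 + \delta\, \frac{dp_n(w_k)(\psi)}{2\mathfrak{p}} + O(\delta^2), \\
z_{k,\delta}
  &= w_k + \delta \Big( \frac{dp_n(w_k)(\psi)}{2\mathfrak{p}}\, w_k - \psi \Big) + O(\delta^2), \\
E_n(w_k) - E_n(z_{k,\delta})
  &= \delta \Big( dE_n(w_k)(\psi) - \lambda_k\, dp_n(w_k)(\psi) \Big) + O(\delta^2),
  \qquad \lambda_k = \frac{dE_n(w_k)(w_k)}{2\mathfrak{p}}, \\
\| w_k - z_{k,\delta} \|_{H^1(\mathbb{T}_n^2)}
  &= \delta\, \Big\| \frac{dp_n(w_k)(\psi)}{2\mathfrak{p}}\, w_k - \psi \Big\|_{H^1(\mathbb{T}_n^2)} + O(\delta^2).
\end{align*}
```

Inserting the last two lines into (6.6) with $w = z_{k,\delta}$, dividing by $\delta$ and letting $\delta \to 0$ gives a one-sided bound; applying it to $-\psi$ as well yields the two-sided estimate.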
We finally turn to (1.23) and (1.24) to complete the proof of Proposition 2. We first notice that, by Corollary 3.1,
\[
\limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) \leq E_{\min}(\mathfrak{p}), \ \forall \mathfrak{p} > 0,
\]
so that
\[
\liminf_{n \to +\infty} \Sigma(u^n_{\mathfrak{p}}) \geq \Xi(\mathfrak{p}).
\]
In particular, if $\Xi(\mathfrak{p}) > 0$, it follows that there exists some integer $n(\mathfrak{p})$, and some number $\Sigma_0 > 0$ such that $\Sigma(u^n_{\mathfrak{p}}) \geq \Sigma_0$, $\forall n \geq n(\mathfrak{p})$. Invoking Theorem 4.2, we obtain (1.23), whereas (1.24) follows from Lemma 4.5.

6.2. Proof of Proposition 4. By Proposition 2, for given $\mathfrak{p} > 0$, the sequence $(u^n_{\mathfrak{p}})_{n \in \mathbb{N}^*}$ is a sequence of finite energy solutions of (TWc), with uniformly bounded energy, and such that $(c(u^n_{\mathfrak{p}}))_{n \in \mathbb{N}^*}$ is bounded. By Proposition 3, it converges up to a subsequence towards a non-trivial finite energy solution $u_{\mathfrak{p}}$ to (TWc) on $\mathbb{R}^N$ of speed $c$, which satisfies in particular $\partial_1 u_{\mathfrak{p}} \neq 0$. Hence, in view of convergence (1.28) of Proposition 3, we have $c^n_{\mathfrak{p}} \equiv c(u^n_{\mathfrak{p}}) \to c$, as $n \to +\infty$. Moreover, we deduce from the results of [19] that $0 < c \leq \sqrt{2}$. On the other hand, we may assume up to another subsequence that
\[
E(u^n_{\mathfrak{p}}) = E^n_{\min}(\mathfrak{p}) \to \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}), \ \text{ as } n \to +\infty.
\]
We next distinguish two cases.

Case 1. $0 < c < \sqrt{2}$. In this case, we may apply directly Theorem 5.1 to the sequence $(u^n_{\mathfrak{p}})_{n \in \mathbb{N}^*}$. This yields (1.29), (1.30) as well as
\[
\mathfrak{p} = \sum_{i=1}^{\ell} \mathfrak{p}_i, \ \text{ and } \ \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) = \sum_{i=1}^{\ell} E(u_i). \tag{6.7}
\]
Moreover, since $c > 0$, it follows from Lemma 2.5 that $\mathfrak{p}_i = p(u_i) > 0$. In view of the definition of $E_{\min}$, we have $E(u_i) \geq E_{\min}(\mathfrak{p}_i)$, whereas by Corollary 3.1, we have $\limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) \leq E_{\min}(\mathfrak{p})$, so that identity (6.7) yields
\[
\sum_{i=1}^{\ell} E_{\min}(\mathfrak{p}_i) \leq E_{\min}(\mathfrak{p}).
\]
Comparing with inequality (1.18) of Corollary 1, which is precisely the reverse inequality, we deduce that all inequalities actually have to be identities, that is
\[
E(u_i) = E_{\min}(\mathfrak{p}_i), \ \text{ and } \ \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) = E_{\min}(\mathfrak{p}),
\]
and the proof is complete in the considered case.
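Case 2. $c = \sqrt{2}$. We are going to show that this case is excluded, so that Case 1 always holds. For that purpose, we apply Theorem 5.2 instead of Theorem 5.1 to the sequence $(u^n_{\mathfrak{p}})_{n \in \mathbb{N}^*}$, with a parameter $\delta > 0$ to be determined later. It yields a number $\ell \geq 1$ of finite energy solutions $u_i$ of speed $\sqrt{2}$ on $\mathbb{R}^N$, and numbers $\mu \geq 0$, and $\nu \geq 0$ such that
\[
|\mu - \sqrt{2}\nu| \leq K\delta\mu, \tag{6.8}
\]
where $K$ is some universal constant,
\[
\mathfrak{p} = \tilde{\mathfrak{p}} + \nu, \ \text{ where } \ \tilde{\mathfrak{p}} = \sum_{i=1}^{\ell} p(u_i) = \sum_{i=1}^{\ell} \mathfrak{p}_i, \tag{6.9}
\]
and
\[
\limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) = \sum_{i=1}^{\ell} E(u_i) + \mu.
\]
Invoking as above Corollary 3.1, we are led to
\[
\sum_{i=1}^{\ell} E_{\min}(\mathfrak{p}_i) + \mu \leq E_{\min}(\mathfrak{p}).
\]
In view of Corollary 2.3, we have $\Sigma(u_i) < 0$, so that $E(u_i) > \sqrt{2}\mathfrak{p}_i$. Combining with (6.8) and (6.9), we obtain
\[
\sqrt{2}\mathfrak{p} = \sqrt{2} \Big( \sum_{i=1}^{\ell} \mathfrak{p}_i + \nu \Big) \leq E_{\min}(\mathfrak{p})(1 + K\delta),
\]
that is $\Xi(\mathfrak{p}) \leq K\delta E_{\min}(\mathfrak{p})$. Since $\delta$ was arbitrary, we may let it go to zero, so that $\Xi(\mathfrak{p}) \leq 0$, which is a contradiction with assumption (1.27), and completes the proof of Proposition 4.

Remark 6.1. In the course of the proof, we have proved the identity
\[
\limsup_{n \to +\infty} E_n(u^n_{\mathfrak{p}}) = \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) = E_{\min}(\mathfrak{p}).
\]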
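The contradiction obtained in Case 2 of the proof of Proposition 4 rests on a short chain of inequalities, sketched below. The sketch assumes the definitions $\Sigma(u) = \sqrt{2}\,p(u) - E(u)$ and $\Xi(\mathfrak{p}) = \sqrt{2}\,\mathfrak{p} - E_{\min}(\mathfrak{p})$, which are not restated in this section, and uses $\mu \leq E_{\min}(\mathfrak{p})$, which follows from the energy identity and Corollary 3.1 since every $E_{\min}(\mathfrak{p}_i)$ is non-negative.

```latex
% Sketch: from (6.8), (6.9) and Sigma(u_i) < 0 to Xi(p) <= K delta E_min(p).
\begin{align*}
\sqrt{2}\,\mathfrak{p}
  &= \sqrt{2} \sum_{i=1}^{\ell} \mathfrak{p}_i + \sqrt{2}\,\nu
     && \text{by (6.9)} \\
  &< \sum_{i=1}^{\ell} E(u_i) + \mu + K\delta\mu
     && \text{since } \Sigma(u_i) < 0 \text{ and by (6.8)} \\
  &= \limsup_{n \to +\infty} E^n_{\min}(\mathfrak{p}) + K\delta\mu
     && \text{by the energy identity} \\
  &\leq E_{\min}(\mathfrak{p})\,(1 + K\delta)
     && \text{by Corollary 3.1 and } \mu \leq E_{\min}(\mathfrak{p}).
\end{align*}
```

Subtracting $E_{\min}(\mathfrak{p})$ from both ends gives $\Xi(\mathfrak{p}) \leq K\delta E_{\min}(\mathfrak{p})$.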
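7. Proof of the Main Theorems
7.1. Proof of Theorem 5. If $\mathfrak{p} > \mathfrak{p}_0$, it follows from Theorem 4 that $\Xi(\mathfrak{p}) > 0$, so that Propositions 3 and 4 apply. In particular, it follows from Proposition 4 that there exist some integer $\ell \geq 1$, and some positive numbers $\mathfrak{p}_1, \ldots, \mathfrak{p}_\ell$ such that $E_{\min}(\mathfrak{p}_i)$ is achieved by some map $u_i$ for any $i \in \{1, \ldots, \ell\}$, with
\[
\mathfrak{p} = \sum_{i=1}^{\ell} \mathfrak{p}_i, \ \text{ and } \ E_{\min}(\mathfrak{p}) = \sum_{i=1}^{\ell} E_{\min}(\mathfrak{p}_i). \tag{7.1}
\]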
We claim that
\[
\ell = 1. \tag{7.2}
\]
Indeed, assume by contradiction that $\ell \geq 2$. Then, it follows from Corollary 1 that $E_{\min}$ is linear on the interval $(0, \mathfrak{p})$. By Lemma 2, we deduce that $E_{\min}(\mathfrak{q})$ is not achieved for any $0 < \mathfrak{q} < \mathfrak{p}$. This contradicts the fact that $E_{\min}(\mathfrak{p}_i)$ is achieved by $u_i$, for any $i \in \{1, \ldots, \ell\}$, and establishes therefore (7.2). Going back to (7.1), we have $\mathfrak{p} = \mathfrak{p}_1 = p(u_1)$, and $E_{\min}(\mathfrak{p}) = E_{\min}(\mathfrak{p}_1) = E(u_1)$. This shows that $E_{\min}(\mathfrak{p})$ is achieved by the map $u_1 = u_{\mathfrak{p}}$, which belongs to $W(\mathbb{R}^N)$, up to a multiplication by a constant of modulus one, by Corollary 2.2.
7.2. The two-dimensional case: Proof of Theorem 1 and Proposition 1. In this section, we provide the proof of the main results in dimension two, namely the proofs to Theorem 1 and Proposition 1.

Proof of Theorem 1. In view of Theorem 5, it is sufficient to show that $\mathfrak{p}_0 = 0$ if $N = 2$. Going back to the definition of $\mathfrak{p}_0$ and the properties stated in Theorem 4, this is equivalent to show that
\[
\Xi(\mathfrak{p}) > 0, \ \forall \mathfrak{p} > 0. \tag{7.3}
\]
Since the function $\Xi$ is non-decreasing, it is sufficient to check that property for $\mathfrak{p}$ sufficiently small. By Lemma 3.9, we have for any $\mathfrak{p}$ sufficiently small,
\[
\Xi(\mathfrak{p}) \geq \frac{48\sqrt{2}}{\mathcal{S}_{KP}^2} \mathfrak{p}^3 - K_0 \mathfrak{p}^4, \tag{7.4}
\]
which yields (7.3), then the desired conclusion.

Proof of Proposition 1. We divide the proof into several steps.

Step 1. There exists $\mathfrak{p}_1 > 0$ such that, if $0 < \mathfrak{p} < \mathfrak{p}_1$, and $u$ is a finite energy solution to (TWc) on $\mathbb{R}^2$ such that $p(u) = \mathfrak{p}$, and $E(u) = E_{\min}(\mathfrak{p})$, then
\[
|u(x)| \geq \frac{1}{2}, \ \forall x \in \mathbb{R}^2.
\]
This is a consequence of inequality (2.9) of Lemma 2.2, and the facts that $0 \leq c \leq \sqrt{2}$, and $E_{\min}(\mathfrak{p}) \leq \sqrt{2}\mathfrak{p}$.

Step 2. If $0 < \mathfrak{p} < \mathfrak{p}_1$, and $u$ is a finite energy solution to (TWc) on $\mathbb{R}^2$ such that $p(u) = \mathfrak{p}$ and $E(u) = E_{\min}(\mathfrak{p})$, then
\[
K_2 \mathfrak{p} \leq \varepsilon(u) \leq K_3 \mathfrak{p},
\]
where $K_2 > 0$ and $K_3$ are some universal constants. In particular, (1.13) is established. The second inequality follows from Step 1, Lemma 2.12, and the fact that $E_{\min}(\mathfrak{p}) \leq \sqrt{2}\mathfrak{p}$. For the first one, we invoke equality (2.17) for a minimizer $u_{\mathfrak{p}}$, which yields in particular,
\[
\Xi(\mathfrak{p}) = \Sigma(u_{\mathfrak{p}}) \leq \frac{\varepsilon(u_{\mathfrak{p}})^2}{\sqrt{2}} p(u_{\mathfrak{p}}) = \frac{\varepsilon(u_{\mathfrak{p}})^2}{\sqrt{2}} \mathfrak{p}. \tag{7.5}
\]
The conclusion follows using (7.4).

Step 3. Proof of inequality (1.12). The lower bound for $\Xi$ in inequality (1.12) is provided by Lemma 3.9. Concerning the upper bound, we invoke inequality (7.5) and Step 2 to obtain $\Xi(\mathfrak{p}) \leq K\mathfrak{p}^3$.

Step 4. Proof of inequality (1.14). Combining inequality (2.17) with Step 1 and Step 2, we obtain
\[
\int_{\mathbb{R}^2} |\partial_2 u_{\mathfrak{p}}|^2 \leq \int_{\mathbb{R}^2} |\partial_2 u_{\mathfrak{p}}|^2 + \Xi(\mathfrak{p}) \leq K \varepsilon(u_{\mathfrak{p}})^2 \mathfrak{p} \leq K\mathfrak{p}^3,
\]
whereas Lemma 2.10 yields a similar estimate for the two other terms on the l.h.s. of (1.14).

Step 5. Proof of inequality (1.15). In view of the invariance by translation, we may assume without loss of generality that the infimum of $|u_{\mathfrak{p}}|$ is achieved at the point 0, that is
\[
|u_{\mathfrak{p}}(0)| = \min_{x \in \mathbb{R}^2} |u_{\mathfrak{p}}(x)|.
\]
Inequality (1.15) is then a consequence of (1.22) for $v = u_{\mathfrak{p}}$, and (7.5).
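The lower bound in Step 2 is stated tersely ("The conclusion follows using (7.4)"). A sketch of the omitted computation follows, under the assumption that (7.4) is a lower bound of the form $\Xi(\mathfrak{p}) \geq A\mathfrak{p}^3 - K_0\mathfrak{p}^4$ with a constant $A > 0$; the name $A$ is introduced here only for illustration.

```latex
% Sketch: combining (7.4) and (7.5) to get K_2 p <= eps(u_p) for small p.
\begin{align*}
\frac{\varepsilon(u_{\mathfrak{p}})^2}{\sqrt{2}}\, \mathfrak{p}
  &\geq \Xi(\mathfrak{p}) \geq A\,\mathfrak{p}^3 - K_0\,\mathfrak{p}^4
  && \text{by (7.5) and (7.4)}, \\
\varepsilon(u_{\mathfrak{p}})^2
  &\geq \sqrt{2}\,\mathfrak{p}^2 \big( A - K_0\,\mathfrak{p} \big)
   \geq \frac{A}{\sqrt{2}}\, \mathfrak{p}^2
  && \text{whenever } 0 < \mathfrak{p} \leq \frac{A}{2K_0}, \\
\varepsilon(u_{\mathfrak{p}})
  &\geq K_2\, \mathfrak{p},
  \qquad K_2 = \Big( \frac{A}{\sqrt{2}} \Big)^{1/2} .
\end{align*}
```

The restriction $\mathfrak{p} \leq A/(2K_0)$ can be absorbed by decreasing $\mathfrak{p}_1$ if necessary.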
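7.3. The three dimensional case: Proof of Theorem 2.
Lemma 7.1. Let $N = 3$. We have
\[
\mathfrak{p}_0 \geq \frac{\mathcal{E}_0}{\sqrt{2}}, \tag{7.6}
\]
where $\mathcal{E}_0 > 0$ is the constant provided in Lemma 2.14.

Proof. Assume by contradiction that $\mathfrak{p}_0 < \frac{\mathcal{E}_0}{\sqrt{2}}$. Then, by Theorem 5, $E_{\min}(\mathfrak{p})$ is achieved for any $\mathfrak{p} > \mathfrak{p}_0$, by some map $u_{\mathfrak{p}}$. On the other hand, $E_{\min}(\mathfrak{p}) \leq \sqrt{2}\mathfrak{p}$, so that if $\mathfrak{p}_0 < \mathfrak{p} < \frac{\mathcal{E}_0}{\sqrt{2}}$, then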
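$E(u_{\mathfrak{p}}) < \mathcal{E}_0$, whereas, in view of Lemma 2.14, there is no finite energy solution to (TWc) with energy smaller than $\mathcal{E}_0$. This gives a contradiction.

Lemma 7.2. Given any $\mathfrak{p} > \mathfrak{p}_0$, let $u_{\mathfrak{p}}$ be a minimizer for $E_{\min}(\mathfrak{p})$ given by Theorem 5. Then, there exists a function $u_{\mathfrak{p}_0} \in W(\mathbb{R}^3)$ such that $u_{\mathfrak{p}} \to u_{\mathfrak{p}_0}$ in $C^\infty_{loc}(\mathbb{R}^3)$, as $\mathfrak{p} \to \mathfrak{p}_0$, with $p(u_{\mathfrak{p}_0}) = \mathfrak{p}_0$, and $E(u_{\mathfrak{p}_0}) = \sqrt{2}\mathfrak{p}_0$. In particular, $E_{\min}(\mathfrak{p}_0)$ is achieved.

Proof. We first notice that it follows from Theorem 4 and Corollary 2.4 that there exists a universal constant $K > 0$, such that, for any $\mathfrak{p} > \mathfrak{p}_0$, we have
\[
\big\| 1 - |u_{\mathfrak{p}}| \big\|_{L^\infty(\mathbb{R}^3)} \geq \frac{K}{E_{\min}(\mathfrak{p}_0)^\alpha} \geq K. \tag{7.7}
\]
Without loss of generality, we may assume, in view of the invariance by translation, that
\[
|u_{\mathfrak{p}}(0)| = \inf_{x \in \mathbb{R}^3} |u_{\mathfrak{p}}(x)|,
\]
so that it follows from (7.7) that $|u_{\mathfrak{p}}(0)| \leq 1 - K < 1$. Arguing as in the proof of Proposition 3, there exists a non-trivial finite energy solution $u_1$ to (TWc) with $c = \limsup_{\mathfrak{p} \to \mathfrak{p}_0} c(\mathfrak{p})$, such that, passing possibly to a subsequence, we have
\[
u_{\mathfrak{p}} \to u_1 \ \text{ in } C^k(K), \ \text{ as } \mathfrak{p} \to \mathfrak{p}_0,
\]
for any compact set $K$ in $\mathbb{R}^3$, and any $k \in \mathbb{N}$. Moreover, we have
\[
E(u_1) = \liminf_{\mathfrak{p} \to \mathfrak{p}_0} \big(E(u_{\mathfrak{p}})\big) = E_{\min}(\mathfrak{p}_0) = \sqrt{2}\mathfrak{p}_0, \ \text{ and } \ |u_1(0)| \leq 1 - K < 1,
\]
so that $u_1$ is non-trivial. Assuming first that $c = \sqrt{2}$, we next apply Theorem 5.2 to the sequence $(u_{\mathfrak{p}})_{\mathfrak{p} > \mathfrak{p}_0}$, with a parameter $\delta > 0$ to be determined later. Indeed, following the lines of the proofs of Theorems 5.1 and 5.2, we may verify that their statements remain valid for any sequence $(v^n)_{n \in \mathbb{N}}$ of finite energy solutions to (TWc$_n$) on $\mathbb{R}^N$ satisfying (5.1). Hence, there exist a number $\ell \geq 1$ of finite energy solutions $u_i$ of (TWc) on $\mathbb{R}^3$, and numbers $\mu \geq 0$ and $\nu \geq 0$ such that
\[
|\mu - \sqrt{2}\nu| \leq K\delta\mu, \quad \mathfrak{p}_0 = \tilde{\mathfrak{p}} + \nu, \ \text{ where } \ \tilde{\mathfrak{p}} = \sum_{i=1}^{\ell} p(u_i) = \sum_{i=1}^{\ell} \mathfrak{p}_i, \ \text{ and } \ \sqrt{2}\mathfrak{p}_0 = E_{\min}(\mathfrak{p}_0) = \sum_{i=1}^{\ell} E(u_i) + \mu,
\]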
so that, by Theorem 4,
\[
E_{\min}(\mathfrak{p}_0 - \nu) = \sqrt{2}(\mathfrak{p}_0 - \nu) = E_{\min}(\mathfrak{p}_0) - \sqrt{2}\nu \geq \sum_{i=1}^{\ell} E(u_i) - K\delta\mu.
\]
In view of inequality (1.18) of Corollary 1, we deduce that
\[
E(u_i) \leq E_{\min}(\mathfrak{p}_i) + K\delta\mu \leq \sqrt{2}\mathfrak{p}_i + K\delta E_{\min}(\mathfrak{p}_0).
\]
Taking $i = 1$ and letting $\delta \to 0$, we deduce that $\Sigma(u_1) \geq 0$. It then follows from Corollary 2.3 that
\[
c = \limsup_{\mathfrak{p} \to \mathfrak{p}_0} c(u_{\mathfrak{p}}) < \sqrt{2},
\]
which gives a contradiction. Hence, we may assume that $c < \sqrt{2}$, and apply Theorem 5.1 to the sequence $(u_{\mathfrak{p}})_{\mathfrak{p} > \mathfrak{p}_0}$. We then conclude as in the proof of Theorem 5 that $\ell = 1$, $p(u_1) = \mathfrak{p}_0$, and $E_{\min}(\mathfrak{p}_0) = E(u_1)$.

Proof of Theorem 2 completed. It follows from Lemma 7.1 that $\mathfrak{p}_0$ is not equal to zero, from Theorem 5 that $E_{\min}(\mathfrak{p})$ is achieved for $\mathfrak{p} > \mathfrak{p}_0$, and from Lemma 7.2 that $E_{\min}(\mathfrak{p}_0)$ is achieved, with $E_{\min}(\mathfrak{p}_0) = \sqrt{2}\mathfrak{p}_0$. In view of Theorem 4, $E_{\min}$ is affine on $(0, \mathfrak{p}_0)$, so that it is not achieved on $(0, \mathfrak{p}_0)$ by Lemma 2. Hence, statements i) and ii) of Theorem 2 hold, as does statement iii), which follows from Lemma 2.14 and (7.6). Finally, statement iv) results from Lemma 2.16. Indeed, this yields that any minimizer $u_{\mathfrak{p}_0}$ of $E_{\min}(\mathfrak{p}_0) = E(u_{\mathfrak{p}_0}) = \sqrt{2}\mathfrak{p}_0$ satisfies $\varepsilon(u_{\mathfrak{p}_0}) \geq$
K p8α+2 0
> 0,
for some universal constants $\alpha$ and $K$, so that $c(u_{\mathfrak{p}_0}) < \sqrt{2}$, and the conclusion follows using the monotonicity properties of the function $\mathfrak{p} \mapsto c(u_{\mathfrak{p}})$.

Acknowledgement. The authors are grateful to L. Almeida, D. Chiron, A. Farina, C. Gallo, P. Gérard, S. Gustafson, M. Maris, K. Nakanishi, G. Orlandi, D. Smets, and T.-P. Tsai for interesting and helpful discussions. The authors are also thankful to the referee for providing valuable comments and suggestions. The first and second authors acknowledge support from the ANR project JC05-51279, “Équations de Gross-Pitaevskii, d’Euler, et phénomènes de concentration”, of the French Ministry of Research.
References
1. Almeida, L.: Topological sectors for Ginzburg-Landau energy. Rev. Mat. Iber. 15(3), 487–546 (1999)
2. Berloff, N.: Quantum vortices, travelling coherent structures and superfluid turbulence. Preprint
3. Béthuel, F., Orlandi, G., Smets, D.: Vortex rings for the Gross-Pitaevskii equation. J. Eur. Math. Soc. 6(1), 17–94 (2004)
4. Béthuel, F., Saut, J.-C.: Travelling waves for the Gross-Pitaevskii equation I. Ann. Inst. Henri Poincaré, Physique Théorique. 70(2), 147–238 (1999)
5. Béthuel, F., Saut, J.-C.: Vortices and sound waves for the Gross-Pitaevskii equation. In: Nonlinear PDE’s in Condensed Matter and Reactive Flows, Volume 569 of NATO Science Series C. Mathematical and Physical Sciences, Dordrecht, Kluwer Academic Publishers, 2002, pp. 339–354
6. Bona, J.L., Li, Y.A.: Analyticity of solitary-wave solutions of model equations for long waves. SIAM J. Math. Anal. 27(3), 725–737 (1996)
7. Bona, J.L., Li, Y.A.: Decay and analyticity of solitary waves. J. Math. Pures Appl. 76(5), 377–430 (1997)
8. Chiron, D.: Travelling waves for the Gross-Pitaevskii equation in dimension larger than two. Nonlinear Anal. 58(1-2), 175–204 (2004)
9. de Bouard, A., Saut, J.-C.: Remarks on the stability of generalized KP solitary waves. In: Mathematical problems in the theory of water waves (Luminy, 1995), Volume 200 of Contemp. Math., Providence, RI: Amer. Math. Soc., 1996, pp. 75–84
10. de Bouard, A., Saut, J.-C.: Solitary waves of generalized Kadomtsev-Petviashvili equations. Ann. Inst. Henri Poincaré, Analyse Non Linéaire. 14(2), 211–236 (1997)
11. de Bouard, A., Saut, J.-C.: Symmetries and decay of the generalized Kadomtsev-Petviashvili solitary waves. SIAM J. Math. Anal. 28(5), 1064–1085 (1997)
12. Ekeland, I.: On the variational principle. J. Math. Anal. Appl. 47, 324–353 (1974)
13. Farina, A.: From Ginzburg-Landau to Gross-Pitaevskii. Monatsh. Math. 139, 265–269 (2003)
14. Gallo, C.: The Cauchy problem for defocusing nonlinear Schrödinger equations with non-vanishing initial data at infinity. Preprint
15. Gérard, P.: The Cauchy problem for the Gross-Pitaevskii equation. Ann. Inst. Henri Poincaré, Analyse Non Linéaire. 23(5), 765–779 (2006)
16. Gérard, P.: The Gross-Pitaevskii equation in the energy space. Preprint
17. Goubet, O.: Two remarks on solutions of Gross-Pitaevskii equations on Zhidkov spaces. Monatsh. Math. 151(1), 39–44 (2007)
18. Gravejat, P.: Limit at infinity for travelling waves in the Gross-Pitaevskii equation. C. R. Math. Acad. Sci. Paris. 336(2), 147–152 (2003)
19. Gravejat, P.: A non-existence result for supersonic travelling waves in the Gross-Pitaevskii equation. Commun. Math. Phys. 243(1), 93–103 (2003)
20. Gravejat, P.: Decay for travelling waves in the Gross-Pitaevskii equation. Ann. Inst. Henri Poincaré, Analyse Non Linéaire. 21(5), 591–637 (2004)
21. Gravejat, P.: Limit at infinity and nonexistence results for sonic travelling waves in the Gross-Pitaevskii equation. Differ. Int. Eqs. 17(11–12), 1213–1232 (2004)
22. Gravejat, P.: Asymptotics for the travelling waves in the Gross-Pitaevskii equation. Asymptot. Anal. 45(3–4), 227–299 (2005)
23. Gravejat, P.: First order asymptotics for the travelling waves in the Gross-Pitaevskii equation. Adv. Differ. Eqs. 11(3), 259–280 (2006)
24. Gravejat, P.: Asymptotics of the solitary waves for the generalised Kadomtsev-Petviashvili equations. Disc. Cont. Dynam. Syst. 21(3), 835–882 (2008)
25. Gross, E.P.: Hydrodynamics of a superfluid condensate. J. Math. Phys. 4(2), 195–207 (1963)
26. Gustafson, S., Nakanishi, K., Tsai, T.-P.: Scattering theory for the Gross-Pitaevskii equation in three dimensions. Preprint, available at http://arxiv.org/abs/0803.3208vi[math.AP], 2008
27. Gustafson, S., Nakanishi, K., Tsai, T.-P.: Scattering for the Gross-Pitaevskii equation. Math. Res. Lett. 13(2), 273–285 (2006)
28. Gustafson, S., Nakanishi, K., Tsai, T.-P.: Global dispersive solutions for the Gross-Pitaevskii equation in two and three dimensions. Ann. Henri Poincaré. 8(7), 1303–1331 (2007)
29. Iordanskii, S.V., Smirnov, A.V.: Three-dimensional solitons in He II. JETP Lett. 27(10), 535–538 (1978)
30. Jones, C.A., Putterman, S.J., Roberts, P.H.: Motions in a Bose condensate V. Stability of solitary wave solutions of nonlinear Schrödinger equations in two and three dimensions. J. Phys. A, Math. Gen. 19, 2991–3011 (1986)
31. Jones, C.A., Roberts, P.H.: Motions in a Bose condensate IV. Axisymmetric Solitary Waves. J. Phys. A, Math. Gen. 5, 2599–2619 (1982)
32. Kato, K., Pipolo, P.-N.: Analyticity of solitary wave solutions to generalized Kadomtsev-Petviashvili equations. Proc. Roy. Soc. Edinb. A. 131(2), 391–424 (2001)
33. Lizorkin, P.I.: On multipliers of Fourier integrals in the spaces L p,θ . Proc. Steklov Inst. Math. 89, 269–290 (1967)
34. Lopes, O.: A constrained minimization problem with integrals on the entire space. Bol. Soc. Bras. Mat. 25(1), 77–92 (1994)
35. Maris, M.: Analyticity and decay properties of the solitary waves to the Benney-Luke equation. Differ. Int. Eqs. 14(3), 361–384 (2001)
36. Maris, M.: On the existence, regularity and decay of solitary waves to a generalized Benjamin-Ono equation. Nonlinear Anal. 51(6), 1073–1085 (2002)
37. Nakanishi, K.: Scattering theory for the Gross-Pitaevskii equation. Preprint
38. Pitaevskii, L.P.: Vortex lines in an imperfect Bose gas. Sov. Phys. JETP. 13(2), 451–454 (1961)
39. Stein, E.M.: Harmonic analysis: Real-variable methods, orthogonality, and oscillatory integrals. Volume 43 of Princeton Mathematical Series. Monographs in Harmonic Analysis. Princeton, NJ: Princeton Univ. Press, 1993 (With the assistance of T.S. Murphy)
40. Tarquini, É.: A lower bound on the energy of travelling waves of fixed speed for the Gross-Pitaevskii equation. Monatsh. Math. 151(4), 333–339 (2007)

Communicated by P. Constantin
Commun. Math. Phys. 285, 653–672 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0616-0
Communications in
Mathematical Physics
A Random Matrix Decimation Procedure Relating β = 2/(r + 1) to β = 2(r + 1) Peter J. Forrester Department of Mathematics and Statistics, University of Melbourne, Victoria 3010, Australia. E-mail:
[email protected] Received: 19 November 2007 / Accepted: 20 May 2008 Published online: 2 September 2008 – © Springer-Verlag 2008
Abstract: Classical random matrix ensembles with orthogonal symmetry have the property that the joint distribution of every second eigenvalue is equal to that of a classical random matrix ensemble with symplectic symmetry. These results are shown to be the case r = 1 of a family of inter-relations between eigenvalue probability density functions for generalizations of the classical random matrix ensembles referred to as β-ensembles. The inter-relations give that the joint distribution of every (r + 1)st eigenvalue in certain β-ensembles with β = 2/(r + 1) is equal to that of another β-ensemble with β = 2(r + 1). The proof requires generalizing a conditional probability density function due to Dixon and Anderson. 1. Introduction 1.1. The setting and summary of results. The Dixon-Anderson conditional probability density function (PDF) refers to the function of {λ j } specified by [3,1] n
(
j=1 s j )
(s1 ) · · · (sn )
1≤ j a2 > λ2 > · · · > λn−1 > an , specifying an interlaced region. It occurs as the conditional PDF of the zeros of the random rational function n j=1
wj , aj − λ
(1.2)
where the w j have a Dirichlet distribution (this interpretation of the workings of [3,1] is given in [12]). Being a (conditional) PDF in {λ j }, (1.1) has the property that integration over these variables gives unity. When cleared of denominators, the resulting multiple integral is referred to as the Dixon-Anderson integral (see e.g. [9, Ch. 3], [14]). An analogous conditional PDF for angles {ψ j } j=1,...,n , due to Forrester and Rains [13], is specified by n n iψk − eiψ j | 2 (( n−1 j=0 α j + 1)/2) 1≤ j
λ˜ l(r −1)
> ··· >
λ˜ l(1)
j = 1, . . . , n − 1 ( j = l),
> a˜ l .
(2.10)
Furthermore, we see from Fig. 1 that for some $p = 0, \ldots, r$, $q = 0, \ldots, r$,
\[
\tilde{a}_{l+1} > \tilde{\lambda}_{l+1}^{(1)} > \tilde{\lambda}_{l+1}^{(2)} > \cdots > \tilde{\lambda}_{l+1}^{(p)} > \tilde{a}_l, \tag{2.11}
\]
\[
\tilde{a}_{l+1} > \tilde{\lambda}_{l-1}^{(r-q+1)} > \tilde{\lambda}_{l-1}^{(r-q+2)} > \cdots > \tilde{\lambda}_{l-1}^{(r)} > \tilde{a}_l \tag{2.12}
\]
(note that these configurations are empty if $p = 0$, $q = 0$ respectively). In other words, between $\tilde{a}_l$ and $\tilde{a}_{l+1}$ there are exactly $r$ coordinates of species $l$, $p$ of species $l + 1$ and $q$ of species $l - 1$.

A crucial feature of the contour integrals is that only configurations with $p = q = 0$ contribute, due to cancellation effects for $p$ and/or $q$ non-zero. To quantify the latter,
consider first the case that $p = 0$ while $q \geq 1$, and suppose that to begin the $r$ species $l$ variables are to the left of the $q$ species $l - 1$ variables in the interval $(\tilde{a}_l, \tilde{a}_{l+1})$. We see from (2.9) that interchanging the position of coordinates corresponding to different species does not change the magnitude of the integrand but it does change the phase, with each interchange of a species $l - 1$ and left neighbouring species $l$ contributing $e^{-2\pi i/(r+1)}$. Hence for a general ordering of the $r$ species $l$ variables and $q$ species $l - 1$ variables amongst a given set of $(r + q)$ positions in $(\tilde{a}_l, \tilde{a}_{l+1})$ the phase is given by
\[
e^{-2\pi i K(A)/(r+1)}. \tag{2.13}
\]
Here $K(A)$ is as in Lemma 1 with the 0's corresponding to species $l$ and the 1's to species $l - 1$. But Lemma 1 tells us that if we sum (2.13) over all arrangements we get zero, which is the claimed cancellation effect in this case. Essentially the same argument, making use of Lemma 1 with the role of the 0's and 1's interchanged, gives cancellation of the contribution to the contour integrals from configurations with $q = 0$, $p \geq 1$.

It remains to consider the cases $p, q \geq 1$. In such cases, with the positions of the species $l - 1$ coordinates fixed (we could just as well fix the position of the species $l + 1$ coordinates), we see that the contribution to the phase of each such coordinate is equal to $e^{-2\pi i l^*/(r+1)}$, where $l^*$ is the number of both species $l$, $l + 1$ to its left, and in particular is independent of their ordering. But we know that summing over this latter ordering gives the cancellation (2.1), so in all cases there is no contribution from non-empty configurations (2.11) and (2.12).

As a consequence of both (2.11) and (2.12) having to be empty for a non-zero contribution to the contour integral, it follows that (2.10) can be supplemented by the requirements that
\[
\tilde{a}_l > \tilde{\lambda}_{l+1}^{(1)} > \tilde{\lambda}_{l+1}^{(2)} > \cdots > \tilde{\lambda}_{l+1}^{(r)} > \tilde{a}_{l+2}, \qquad
\tilde{a}_{l-1} > \tilde{\lambda}_{l-1}^{(1)} > \tilde{\lambda}_{l-1}^{(2)} > \cdots > \tilde{\lambda}_{l-1}^{(r)} > \tilde{a}_{l+1}.
\]
Up to a phase, this contour integral is precisely (2.1) with the position of $a_l$ and $a_{l+1}$ interchanged, and correspondingly $s_l$ and $s_{l+1}$ interchanged. The phase is straightforward to calculate, giving as the final result (2.3).
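The cancellation attributed to Lemma 1 can be checked numerically for small $r$ and $q$. The sketch below is an illustration only. It assumes that $K(A)$ counts, for a word $A$ made of $r$ zeros and $q$ ones, the pairs in which a 1 stands to the left of a 0, so that the reference arrangement with all zeros first has $K = 0$; Lemma 1 is not reproduced in this section, so this reading is an assumption, and the function names are ours. Under that assumption the sum of the phases (2.13) is a Gaussian binomial coefficient evaluated at a primitive $(r+1)$-th root of unity, which vanishes for $1 \leq q \leq r$.

```python
import cmath
from itertools import combinations


def interchange_count(word):
    """Number of pairs (i, j) with i < j, word[i] == 1 and word[j] == 0.

    Assumed stand-in for K(A): the number of adjacent interchanges needed
    to reach `word` from the arrangement with all zeros on the left.
    """
    ones_seen = 0
    count = 0
    for letter in word:
        if letter == 1:
            ones_seen += 1
        else:
            count += ones_seen  # this 0 has `ones_seen` ones to its left
    return count


def phase_sum(r, q):
    """Sum of exp(-2*pi*i*K(A)/(r+1)) over all words with r zeros and q ones."""
    omega = cmath.exp(-2j * cmath.pi / (r + 1))
    n = r + q
    total = 0j
    for one_positions in combinations(range(n), q):
        word = [0] * n
        for pos in one_positions:
            word[pos] = 1
        total += omega ** interchange_count(word)
    return total


if __name__ == "__main__":
    for r in range(1, 5):
        for q in range(1, r + 1):
            print(r, q, abs(phase_sum(r, q)))
```

For $q = 0$ there is a single arrangement and the sum equals 1, which matches the statement that only the empty configurations contribute.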
2.3. Proof of Theorem 1 for (1.7). As remarked below (2.2), we must show that $L_{r,n}(\{a_p\})$ is proportional to $R_{r,n}(\{a_p\})$, and then determine the proportionality. For the former task, our strategy is to show that $L_{r,n}(\{a_p\})$ factorizes into a term singular in $\{a_p\}$, and a term analytic in $\{a_p\}$. The singular factor is precisely $R_{r,n}(\{a_p\})$, while a scaling argument shows that the analytic factor must be a constant. Intermediate working relating to the singular terms allows the proportionality to be determined.

Consider $L_{r,n}(\{a_p\})$ as an analytic function of $a_1$ in the appropriately cut complex $a_1$-plane. Singularities occur as $a_1$ approaches any of $a_2, \ldots, a_n$. The singular behaviour as $a_1$ approaches $a_2$ can be determined directly from (2.1). Thus, as $a_1 \to a_2$ the integral over species 1 effectively factorizes from the integral over the other species, showing
\[
L_{r,n}(\{a_p\}) = \prod_{p=3}^{n} (a_2 - a_p)^{r(s_p - 1)} \, I_r(a_1, a_2) \times L_{r,n-1}(\{a_p\}_{p=2,\ldots,n}) \Big|_{s_1 \to s_1 + s_2 + 2/(r+1) - 1} F(a_1 - a_2; \{a_p\}_{p=2,\ldots,n}), \tag{2.14}
\]
P. J. Forrester
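where $F(z;\{a_p\}_{p=2,\dots,n})$ is analytic about z = 0 and equal to unity at z = 0, and
$$I_r(a_1,a_2) := \int_{a_1>\lambda_1^{(1)}>\cdots>\lambda_1^{(r)}>a_2} d\lambda_1^{(1)}\cdots d\lambda_1^{(r)} \prod_{1\le\nu<\mu\le r}\bigl(\lambda_1^{(\nu)}-\lambda_1^{(\mu)}\bigr)^{2/(r+1)} \prod_{\nu=1}^{r}\bigl(a_1-\lambda_1^{(\nu)}\bigr)^{s_1-1}\bigl(\lambda_1^{(\nu)}-a_2\bigr)^{s_2-1}. \tag{2.15}$$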
Thus the singular behaviour is determined by the singular behaviour of $I_r(a_1,a_2)$. This in turn is revealed by a simple scaling of the integrand, which shows
$$I_r(a_1,a_2) = (a_1-a_2)^{r(r-1)/(r+1)+r(s_1+s_2-1)}\,\frac{1}{r!}\,S_r(s_1-1,s_2,1/(r+1)), \tag{2.16}$$
where $S_n(\lambda_1,\lambda_2,\lambda)$ denotes the Selberg integral (1.5). For the singular behaviour as $a_1$ approaches $a_k$ ($k \neq 2$), we make use of Proposition 1 which says that up to a phase the function of $\{a_p\}_{p=1,\dots,n}$ obtained from $L_{r,n}(\{a_p\})$ by analytic continuation is symmetric in $\{(a_p,s_p)\}$. Hence as a function of $a_1$ it must be that
$$L_{r,n}(\{a_p\}) = \prod_{k=2}^{n}(a_j-a_k)^{r(r-1)/(r+1)+r(s_j+s_k-1)}\,\tilde F(a_1;\{a_p\}_{p=2,\dots,n}), \tag{2.17}$$
where $\tilde F$ is analytic in $a_1$. Further, repeating the argument with $L_{r,n}(\{a_p\})$ regarded as a function of $a_2,\dots,a_n$ in turn shows
$$L_{r,n}(\{a_p\}) = R_{r,n}(\{a_p\})\,G(\{a_p\}), \tag{2.18}$$
where $G$ is analytic in $\{a_p\}$ and symmetric in $\{(a_p,s_p)\}$. It remains to determine $G$. This can be done by considering the scaling properties of both sides of (2.18) upon the replacements $\{a_p\}\to\{ca_p\}$, $c>0$. After changing variables $\lambda_k\to c\lambda_k$ ($k = 1,\dots,n(r-1)$) in (2.1) we see
$$L_{r,n}(\{ca_p\}) = c^{\,r(n-1)+r(n-1)(r(n-1)-1)/(r+1)+r(n-1)\sum_{p=1}^{n}(s_p-1)}\,L_{r,n}(\{a_p\}), \tag{2.19}$$
while we read off from (2.2) that
$$R_{r,n}(\{ca_p\}) = \prod_{1\le j<k\le n} c^{\,r(s_j+s_k-2/(r+1))}\,R_{r,n}(\{a_p\}). \tag{2.20}$$
w, and Cψ ≥ C ψ ≥ ψ0 ˙ 1 . themselves whenever R H
By the standard contraction mapping theorem, an immediate corollary to Proposition 3.4 is the existence of a unique fixed point (t → Zt ) ∈ Cw0 (R+ , ), with (t → ψt ) ∈ Cb0 (R+ , H˙ 1 (R3 )), of the fixed point equation with the truncated F,
Z. = F .,0 (Z. |Z0 ). (76)
By a regularity bootstrapping argument, fixed points of the untruncated F exist and are in C 1 (R, ), indeed, furnishing unique -strong Vlasov solutions. Thus we may state our main existence and uniqueness theorem of this section.
Theorem 3.5. For every Z0 ∈ B, H˙ there exists w > 0 such that whenever w > w, the Vlasov fixed point equation (67) with Cauchy data limt→0 Zt = Z0 is solved by a unique curve t → Zt ∈ Cw0 (R, ); since in particular (ψ0 , 0 ) ∈ ( H˙ 2 × H˙ 1 )(R3 ), the map t → Zt ∈ (Cw0 ∩ C 1 )(R, ), and thus it is the unique -strong solution to (51)–(54) conserving mass (58), momentum (59), angular momentum (60), and energy (61).
3.4.3. Proof of Proposition 3.4. We begin with auxiliary results concerning the flow on the particle sub-phase space.
Lemma 3.6. Given any curve ζ. ∈ C k (R, ( H˙ 1 ⊕ L 2 )(R3 )), k = 0, 1, . . . , we have
(i) J · ∂z H( · ; ζ. ) ∈ C k (R × R6 , R6 ),
(ii) J · ∂z H( · ; ζt ) ∈ C ∞ (R6 , R6 );
(iii) ∂z · J · ∂z H( · ; ζt ) ≡ 0;
(iv) |J · ∂z H( · ; ζt )| ≤ 1 + L 2 ψt H˙ 1 .
Proof of Lemma 3.6. Regularity (i), (ii), and incompressibility (iii) are obvious. The bound (iv) obtains by using the triangle inequality, then $|p| \le \sqrt{1+|p|^2}$ for the momentum part, respectively for the space part the Cauchy–Schwarz inequality to get | ∗ ∇ψ|(q) ≤ L 2 ψ H˙ 1 for all q; cf. (34).
As a spin-off of Lemma 3.6, we have
Corollary 3.7. If ζ. ∈ C k (R, ( H˙ 1 ⊕ L 2 )(R3 )), then .,. [ζ· ] ∈ C k (R × R × R6 , R6 ), and t,t [ζ· ] is a symplectomorphism ∀t, t ∈ R; in particular, det ∂z t,t [ζ· ](z) = 1.
Proof of Corollary 3.7. This is a standard corollary. See, e.g., [HiSm74].
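Returning to the bound (iv) of Lemma 3.6, its two ingredients can be written out as below. This is only a sketch, under the assumption that the one-particle Hamiltonian is $\sqrt{1+|p|^2}$ plus the field $\psi$ convolved with a form factor; the letter $\Xi$ for that form factor is a notational choice made here, and $J$ is used only as an isometry of $\mathbb{R}^6$.

```latex
\begin{align*}
|J\cdot\partial_z H(z;\zeta_t)| = |\partial_z H(z;\zeta_t)|
  &\le |\partial_p H(z;\zeta_t)| + |\partial_q H(z;\zeta_t)|
   = \frac{|p|}{\sqrt{1+|p|^2}} + \bigl|(\Xi*\nabla\psi_t)(q)\bigr|, \\
\frac{|p|}{\sqrt{1+|p|^2}} &\le 1, \\
\bigl|(\Xi*\nabla\psi_t)(q)\bigr|
  &= \Bigl|\int_{\mathbb{R}^3}\Xi(q-q')\,\nabla\psi_t(q')\,dq'\Bigr|
   \le \|\Xi\|_{L^2}\,\|\nabla\psi_t\|_{L^2}
   = \|\Xi\|_{L^2}\,\|\psi_t\|_{\dot H^1}.
\end{align*}
```

The right-hand side does not depend on $q$ or $p$, which is what makes the bound uniform on phase space.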
Controlling the field space component of F .,0 requires only the following lemma:
Y. Elskens, M. K.-H. Kiessling, V. Ricci
Lemma 3.8. If µ. ∈ C 0 (R, P1 ), then .,. [µ. ] ∈ C 0 (R × R × ( H˙ 1 ⊕ L 2 )(R3 ), ( H˙ 1 ⊕ L 2 )(R3 )).
Proof of Lemma 3.8. Since (72), (73) are quite explicit, a straightforward calculation proves the lemma in the special case that ζ0 is classical. Since classical fields are dense in ( H˙ 1 ⊕ L 2 )(R3 ), the Hahn–Banach theorem now completes the proof.
Proof of Proposition 3.4. We first show that, given any Z0 ∈ B, H˙ , the map F ·,0 ( · |Z0 ) is Lipschitz-continuous from a closed subset of Cw0 (R+ , |Z0 ), defined by the condition supt≥0 ψt H˙ 1 ≤ Cψ with Cψ ≥ C ψ ≥ ψ0 H˙ 1 , to Cw0 (R+ , |Z0 ) whenever w > w, with w depending at most on , Cψ , and the Lipschitz constant at most on , Cψ , w. We emphasize that the conditioning limt→0 Zt = Z0 and limt→0 Zt = Z0 implied by the definition of Cw0 (R+ , |Z0 ) do not enter our estimates. To break up the proof into two parts, we use the triangle inequality in the form
F.,0 (µ. ; ζ. |Z0 ) − F.,0 (µ˜ . ; ζ˜. |Z0 ) w≤ F.,0 (µ. ; ζ. |Z0 ) − F.,0 (µ˜ . ; ζ. |Z0 ) w + F.,0 (µ˜ . ; ζ. |Z0 ) − F.,0 (µ˜ . ; ζ˜. |Z0 ) w .
(77)
Given Z0 = (µ0 ; ζ0 ) and Cψ ≥ ψ0 H˙ 1 , we show: (a) given ζ. ∈ C 0 (R+ , ( H˙ 1 ⊕ 1 ) L 2 )(R3 )) satisfying ψt H˙ 1 ≤ Cψ for all t > 0, for any two µ. and µ˜ . in C 0 (R+ , P and all w > 0 we have (78)
F.,0 (µ. ; ζ. |Z0 ) − F.,0 (µ˜ . ; ζ. |Z0 ) w ≤ L 1 [; w] sup e−wt µt − µ˜ t L∗ ; t≥0
1 ), for any two ζ. and ζ˜. in C 0 (R+ , ( H˙ 1 ⊕ L 2 )(R3 )), satisfying (b) given µ. ∈ C 0 (R+ , P ˜ max{ ψt H˙ 1 , ψt H˙ 1 } ≤ Cψ for all t > 0, and for all w > w[; Cψ ] we have
F.,0 (µ. ; ζ. |Z0 ) − F.,0 (µ. ; ζ˜. |Z0 ) w ≤ L 2 [; w, w] sup e−wt ζt − ζ˜t H L ; (79) t≥0
for then it follows from (77), (78), (79) that, given any Z0 and Cψ ≥ C ψ ≥ ψ0 H˙ 1 , Z. |Z0 ) w ≤ L[; w, w] Z. − Z. w
F.,0 (Z. |Z0 ) − F.,0 (
(80)
whenever w > w[, Cψ ], with L[; w, w] := max{L 1 [; w], L 2 [; w, w]}. Part a) To prove (78), we fix Z0 and ζ. and note that in this case
Ft,0 (µ. ; ζ. |Z0 ) − Ft,0 (µ˜ . ; ζ. |Z0 ) = t,0 [µ. ](ζ0 ) − t,0 [µ˜ . ](ζ0 ) H L ,
(81)
where, in components, ψ
ψ
t,0 [µ. ](ζ0 ) − t,0 [µ˜ . ](ζ0 ) H L= t,0 [µ. ](ζ0 ) − t,0 [µ˜ . ](ζ0 ) H˙ 1 + ˜ . ](ζ0 ) L 2 . t,0 [µ. ](ζ0 ) − t,0 [µ
(82)
Furthermore, using (72) and then the definition of · H˙ 1 , we have ψ
ψ
2
t,0 [µ. ](ζ0 )−t,0 [µ˜ . ](ζ0 ) H˙ 1 =
2 t − (t − t )∇[ ∗ (ρt − ρ˜t )](q )ddt dq, 2 0
S
(83)
Vlasov Limit for a System of Particles which Interact with a Wave Field
while with (73) and the definition of · L 2 , we find
˜ . ](ζ0 ) 2L 2 t,0 [µ. ](ζ0 ) − t,0 [µ 2 t = − 1 + (t − t ) · ∇ [ ∗ (ρt − ρ˜t )](q )ddt dq. 0
S2
(84)
As to (83), triangle and Jensen’s inequalities, and Fubini, yield the estimate ψ
ψ
2
t,0 [µ. ](ζ0 ) − t,0 [µ˜ . ](ζ0 ) H˙ 1 t 2 ≤− (t − t ) ∇ ∗ (ρt − ρ˜t ) (q ) dt dqd. S2
(85)
0
Now multiply (85) by e−2wt , pull e−2wt under the square in r.h.s. (85) and note that t 2 −wt e (t − t ) ∇ ∗ (ρt − ρ˜t ) (q ) dt dq 0t 2 1 − 21 w(t−t ) − 2 w(t−t ) −wt e (t − t ) e e = ∇ ∗ (ρt − ρ˜t ) (q ) dt dq 0 t t 2 ≤ e−w(t−t ) (t − t )2 dt e−w(t−t ) e−2wt ∇ ∗ (ρt − ρ˜t ) (q ) dqdt 0 0 2 ≤ w24 sup e−2wt ∇ ∗ (ρt − ρ˜t ) (q ) dq , (86) t ≥0
the first inequality by Cauchy–Schwarz followed by Fubini, the second one by Hölder followed by
$$\int_0^t e^{-w(t-t')}(t-t')^2\,dt' \int_0^t e^{-w(t-t')}\,dt' \le \int_0^\infty e^{-w\tau}\tau^2\,d\tau \int_0^\infty e^{-w\tau}\,d\tau = 2/w^4.$$
We next estimate the remaining dq integral by itself. For this we first rewrite it with the help of one of Green’s identities, a change of integration variables q → q′, and Fubini’s theorem, exchanging the dq integration with one of the convolution integrations ($d\tilde q$, say); we then apply the Kantorovich–Rubinstein duality twice to obtain generalized Hölder estimates, then use the estimate ρ − ρ
˜ L∗ ≤ µ − µ
˜ L∗ for ρ(dq) = µ(dq × R3 ) (similarly for ρ). ˜ Thus, independently of , we have 2 ˜ (ρt − ρ˜t )(dq) ˜ ∇ ∗ (ρt − ρ˜t ) (q ) dq = − ∗ ∇ 2 ∗ (ρt − ρ˜t ) (q) ≤ Lip ∗ ∇ 2 ∗ (ρt − ρ˜t ) ρt − ρ˜t L∗ 2 ≤ Lip◦2 ∗ ∇ 2 ρt − ρ˜t L∗ 2 ≤ Lip◦2 ∗∇ 2 µt − µ˜ t L∗ , (87) where Lip◦2 ∗∇ 2 is the iterated Lipschitz constant 16 of ∗∇ 2 . We estimate r.h.s. (86) with the help of (87), which in turn estimates (e−2wt r.h.s. (85)) in such a way 16 The iterated Lipschitz constant is defined by
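$$\mathrm{Lip}^{\circ 2}(f) = \sup_{x\neq y}\,\sup_{\tilde x\neq\tilde y}\, \frac{|f(x-\tilde x)+f(y-\tilde y)-f(x-\tilde y)-f(y-\tilde x)|}{|x-y|\,|\tilde x-\tilde y|}.$$
If $f\in C^2(\mathbb{R}^d)$, then $\mathrm{Lip}^{\circ 2}(f)=\sup_{x\in\mathbb{R}^d}\|\nabla^{\otimes 2}f(x)\|_\infty$, where $\nabla^{\otimes 2}f(x)$ is the Hessian of $f$ at $x$ and $\|M\|_\infty$ the sup norm (i.e. spectral radius) of a real symmetric matrix $M$.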
that the integration over d now factors out, yielding the factor unity. Thus, taking supt≥0 (e−2wt l.h.s. (85)) and then square roots yields ψ ψ sup e−wt t,0 [µ. ](ζ0 ) − t,0 [µ˜ . ](ζ0 ) H˙ 1 ≤ Lip◦2 ∗∇ 2 w24 t≥0
× sup e−wt µt − µ˜ t L∗ .
(88)
t≥0
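The constant $2/w^4$ entering (86) and (88) is an elementary product of two Laplace-type integrals. A short check, assuming only $w>0$ and using the substitution $\tau=t-t'$:

```latex
\begin{align*}
\int_0^t e^{-w(t-t')}(t-t')^2\,dt'
  &= \int_0^t e^{-w\tau}\tau^2\,d\tau
   \le \int_0^\infty e^{-w\tau}\tau^2\,d\tau
   = \frac{\Gamma(3)}{w^3} = \frac{2}{w^3}, \\
\int_0^t e^{-w(t-t')}\,dt'
  &= \frac{1-e^{-wt}}{w} \le \frac{1}{w}, \\
\int_0^t e^{-w(t-t')}(t-t')^2\,dt'\,\int_0^t e^{-w(t-t')}\,dt'
  &\le \frac{2}{w^3}\cdot\frac{1}{w} = \frac{2}{w^4}.
\end{align*}
```

Both bounds are uniform in $t\ge 0$, which is why the supremum over $t$ can be taken afterwards without changing the constant.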
Similarly, except that after the Cauchy-Schwarz and Fubini steps we now also use 1 + (t − t ) · ∇ g(q)2 dq = g(q) 1 − (t − t )2 ( · ∇)2 g(q) dq, (89)
where g(q) = [ ∗ (ρt − ρ˜t )](q ), for (84) we obtain sup e−wt ˜ . ](ζ0 ) L 2 t,0 [µ. ](ζ0 ) − t,0 [µ t≥0 ≤ Lip◦2 ( ∗ )w12 +Lip◦2 ( ∗ (0 · ∇)2 ) w24 sup e−wt µt − µ˜ t L∗ , (90) t≥0
where 0 ∈ S2 is arbitrary. We now recall (82). Noting that, by triangle inequality, supt≥0 e−wt l.h.s. (82) ) is not bigger than the sum of (88) and (90), we arrive at sup e−wt t,0 [µ. ](ζ0 )−t,0 [µ˜ . ](ζ0 ) H L ≤ L 1 [; w] sup e−wt µt − µ˜ t L∗ , (91) t≥0
t≥0
with L 1 [; w] =
Lip◦2 ∗∇ 2 w24 + Lip◦2 ( ∗ )w12 +Lip◦2 ( ∗ (0 · ∇)2 ) w24 . (92)
Finally recalling (81), we see that our proof of (78) is concluded. Part b) To prove (79), we fix Z0 , and µ. and note that in this case
Ft,0 (µ. ; ζ. |Z0 ) − Ft,0 (µ. ; ζ˜. |Z0 ) = µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ˜. ] L∗ .
(93)
Recalling (62) and Corollary 3.7, we note next that
µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ˜. ] L∗= sup g d(µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ˜. ]) g∈C 0,1 (R6 ) Lip(g)≤1
= sup g∈C 0,1 (R6 ) Lip(g)≤1
(g ◦ t,0 [ζ. ] − g ◦ [ζ˜. ])dµ0 . t,0 (94)
Pulling | · | under the last integral in (94) and using Lip (g) ≤ 1 gives the estimate
µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ˜. ] L∗ ≤ t,0 [ζ. ](z) − t,0 [ζ˜. ](z)µ0 (dz). (95)
By (70), for z . , z˜ . ∈ C 1 (R, R6 ) solving the characteristic equations for given t → ψt and t → ψ˜ t , respectively, with17 z 0 = z = z˜ 0 , we have t J · ∂z H(z τ ; ζτ ) − J · ∂z H(˜z τ ; ζ˜τ ) dτ. (96) t,0 [ζ. ](z) − t,0 [ζ˜. ](z) = 0
We now insert (96) in the right-hand side of (95), estimate the resulting expression by pulling the absolute bars under the t-integral and applying the triangle inequality, next simplify by noting that J is an isometry on R6 , and use Fubini’s theorem to exchange dτ and dµ0 integrations. We thus obtain the estimate t ˜
µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ. ] L∗ ≤ ∂z H(z τ ; ζτ ) − ∂z H(z τ ; ζ˜τ ) µ0 (dz)dτ 0 t + ∂z H(z τ ; ζ˜τ ) − ∂z H(˜z τ ; ζ˜τ ) µ0 (dz)dτ. 0
(97) Since we want an estimate for supt≥0 e−wt l.h.s. (97) , we next consider the exponentially weighted suprema of the two integrals on r.h.s. (97) separately. The first integral on r.h.s. (97) is estimated as follows. By (69) with z τ = (qτ , pτ ) and by the Cauchy–Schwarz inequality, we have |∂z H(z τ ; ζτ ) − ∂z H(z τ ; ζ˜τ )| = ∗ ∇ψτ − ∇ ψ˜ τ (qτ ) ≤ L 2 ψτ − ψ˜ τ H˙ 1 . (98)
Since r.h.s. (98) is independent of z, integration w.r.t. dµ0 factors out and yields 1. Proceeding now similarly as in estimate (86), we obtain
$$\sup_{t\ge 0} e^{-wt}\int_0^t \|\psi_\tau-\tilde\psi_\tau\|_{\dot H^1}\,d\tau \le \frac{1}{w}\,\sup_{t\ge 0} e^{-wt}\|\psi_t-\tilde\psi_t\|_{\dot H^1}. \tag{99}$$
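One way to see (99) is the following one-line weighting argument, sketched here for a generic nonnegative function $f$; in the application $f(\tau)=\|\psi_\tau-\tilde\psi_\tau\|_{\dot H^1}$ and $w>0$. The symbol $f$ is introduced only for this illustration.

```latex
\begin{align*}
e^{-wt}\int_0^t f(\tau)\,d\tau
  &= \int_0^t e^{-w(t-\tau)}\,\bigl(e^{-w\tau}f(\tau)\bigr)\,d\tau \\
  &\le \Bigl(\sup_{\tau\ge 0} e^{-w\tau}f(\tau)\Bigr)\int_0^t e^{-w(t-\tau)}\,d\tau \\
  &= \frac{1-e^{-wt}}{w}\,\sup_{\tau\ge 0} e^{-w\tau}f(\tau)
   \;\le\; \frac{1}{w}\,\sup_{\tau\ge 0} e^{-w\tau}f(\tau).
\end{align*}
```

Taking the supremum over $t\ge 0$ on the left then gives the stated inequality.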
Multiplying (99) by L 2 gives an upper bound for $\sup_{t\ge 0} e^{-wt}\int_0^t\!\int \text{r.h.s. (98)}\,d\mu_0\,d\tau$,
which in turn is an upper bound for supt≥0 (e−wt first integral on r.h.s. (97)). As to the second integral on r.h.s. (97), its integrand is rewritten using (69), with z τ = (qτ , pτ ) and z˜ τ = (q˜τ , p˜ τ ), then estimated by the triangle inequality, giving pτ p˜ τ ˜ ˜ − ∂z H(z τ ; ζτ )−∂z H(˜z τ ; ζτ ) ≤ 1 + | p τ |2 1 + | p˜ τ |2 + ∗ ∇ ψ˜ τ (qτ ) − ∗ ∇ ψ˜ τ (q˜τ ). (100) The two expressions on the right-hand side of (100) are now estimated separately. The second term on r.h.s. (100) is estimated as follows. Spelling out the convolutions and factoring out ∇ ψ˜ in the integrand, then pulling | · | into the convolution integral, then using the Cauchy–Schwarz inequality, we get ( ∗ ∇ ψ˜ τ )(q) − ( ∗ ∇ ψ˜ τ )(q) ˜ L 2 ψ˜ τ H˙ 1 . (101) ˜ ≤ ( · − q) − ( · − q)
17 The initial data condition z = z˜ derives from the Z in F ( · ; · |Z ). 0 0 0 .,0 0
Now recall that for any two equi-measurable √ translates 1 and 2 of a bounded domain one has χ1 + χ2 − χ1 ∩2 L 2 ≤ 2||, where χ is the characteristic function of . This, the compact support of , and its Lipschitz continuity, then yield ∗ ∇ ψ˜ τ (qτ ) − ∗ ∇ ψ˜ τ (q˜τ ) ≤ C Cψ |qτ − q˜τ |, (102) where C = 2|supp()| Lip (), and where we also used supτ ≥0 ψ˜ τ H˙ 1 ≤ Cψ . We next estimate |qτ − q˜τ |, for which purpose we (i) use the integrated characteristic equations for qτ and q˜τ , with q(0) = q(0), ˜ (ii) pull | · | under the time integral, (iii) use that ∂ p 1 + | p|2 ∈ Cb0,1 (R3 , R3 ) with Lip ∂ p 1 + | p|2 = 1, and obtain τ τ pτ p˜ τ |qτ − q˜τ | ≤ − | pτ − p˜ τ |dτ . (103) dτ ≤ 1 + | p˜ τ |2 0 1 + | p τ |2 0 Repeating steps (i) and (ii) now for pτ and p˜ τ with p0 = p˜ 0 , then applying the triangle inequality to the resulting integrand, followed by applications of (102) and supτ ≥0 ψτ H˙ 1 ≤ Cψ , respectively using (98), gives the string of estimates τ | ∗ ∇ψτ (qτ ) − ∗ ∇ ψ˜ τ (q˜τ )|dτ | pτ − p˜ τ | ≤ 0 τ | ∗ ∇ψτ (qτ ) − ∗ ∇ψτ (q˜τ )| ≤ 0 +| ∗ ∇ψτ − ∇ ψ˜ τ (q˜τ )| dτ τ τ ≤ C Cψ |qτ − q˜τ |dτ + L 2 ψτ − ψ˜ τ H˙ 1 dτ , (104) 0
0
with C given below (102), and Cψ ≥ ψ0 H˙ 1 chosen later. Inserting (104) into (103), recalling from (97) that τ ≤ t, then employing a second order variant of the Gronwall lemma (see Appendix A.2) with z 0 = z˜ 0 , we find for all τ ≤ t that τ τ cosh w(τ − τ )
ψτ − ψ˜ τ H˙ 1 dτ dτ , (105) |qτ − q˜τ | ≤ L 2 0
0
with w = C Cψ . With (105) and (102) we have the relevant estimates for the second term on the right-hand side of (100). To estimate the first term on r.h.s. (100), recall that ∂ p 1 + | p|2 ∈ Cb0,1 (R3 , R3 ) with Lip ∂ p 1 + | p|2 = 1, then recall (104). Estimating |qτ − q˜τ | by r.h.s. (105) and inserting this estimate into (104), we now find that for all τ ≤ t we have τ pτ p˜ τ ≤
−
ψτ − ψ˜ τ H˙ 1 dτ L2 2 1 + | p τ |2 1 + | p˜ τ | 0 τ τ + L 2 w 2 cosh w(τ − τ )
τ
× 0
0
0
ψτˇ − ψ˜ τˇ H˙ 1 dτˇ dτ dτ ,
which provides an upper bound to the first term on r.h.s. (100).
(106)
The bounds on the two terms of r.h.s. (100), i.e. (105) with (102), and (106), combine into an estimate of l.h.s. (100) which is independent of z τ and z˜ τ , the solutions to the characteristic equations for given fields ψ and ψ˜ with initial data z 0 = z = z˜ 0 , respectively. For all z we have τ |∂z H(z τ ; ζ˜τ ) − ∂z H(˜z τ ; ζ˜τ )| ≤ L 2
ψτ − ψ˜ τ H˙ 1 dτ 0 τ τ 2 + L 2 w cosh w(τ − τ )
ψτ − ψ˜ τ H˙ 1 dτ dτ 0 0 τ τ τ + L 2 w2 cosh w(τ − τ )
ψτˇ − ψ˜ τˇ H˙ 1 dτˇ dτ dτ . (107) 0
0
0
We integrate (107) w.r.t. µ0 (dz); due to the z-independence of the integrand on the righthand side, that z-integral factors out there and equals 1. It thus remains to integrate (107) w.r.t. dτ from 0 to t, to multiply by e−wt and to take the supremum over t ≥ 0. The three terms on r.h.s. (107) are estimated by repeating the strategy used in (86) a total of 9 times (however, one of the estimates is just (99) again). For all w > w we thereby arrive at the desired estimate for supt≥0 (e−wt second integral on r.h.s. (97)), t −wt ˜ ˜ sup e ∂z H(z τ ; ζτ )−∂z H(˜z τ ; ζτ ) dµ0 dτ t≥0 0 1 w2 w2 −wt ˜ t H˙ 1 , + sup e (108)
ψ − ψ ≤ L 2 2 + t w w(w 2 − w 2 ) w 2 (w 2 − w 2 ) t≥0 with w = C Cψ from (105). The estimates given by (108) and by (99) (and ensuing text), together with (97), now give the estimate sup e−wt µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ˜. ] L∗ ≤ L 2 sup e−wt ψt − ψ˜ t H˙ 1 (109) t≥0
t≥0
whenever w > w, with 1+w
L 2. (110) w2 − w2 Since supt≥0 e−wt ψτ − ψ˜ τ H˙ 1 ≤ supt≥0 e−wt ζt − ζ˜t H L , and because of (93), we see that (109) proves (79). Part b) is completed. We have thus proved that F.,0 ( · |Z0 ) is a Lipschitz map from a closed subset of Cw0 (R+ , |Z0 ), defined by the condition supt≥0 ψt H˙ 1 ≤ Cψ , to Cw0 (R+ , |Z0 ) whe never Cψ ≥ C ψ ≥ ψ0 H˙ 1 and w > w = C Cψ . The Lipschitz constant is L[; w, w] = max{L 1 [; w], L 2 [; w, w]}, with L 1 given in (92) and L 2 in (110). Finally, we note that everything proven so far for F.,0 ( · |Z0 ) remains valid for its trunψ cation F .,0 ( · |Z0 ), obtained for all t > 0 by replacing t,0 [µ. ] by its upper truncation L 2 [; w, w] =
ψ
t,0 [µ. ] given in (75). By time reversal symmetry, the same conclusions hold with Cw0 (R+ , |Z0 ) replaced by Cw0 (R, |Z0 ). This concludes the Lipschitz continuity part of the proof. ≥ Z0. , the map We show next that for sufficiently large C ψ ≥ ψ0 H˙ 1 and R w 0 0 F.,0 ( · |Z0 ) sends a closed subset of a closed ball B R(Z. ) ⊂ Cw (R+ , |Z0 ), satisfying
supt≥0 ψt H˙ 1 ≤ Cψ , into B R(Z0. ) whenever Cψ ≥ C ψ and w > w = C Cψ ; recall that Z0. = F.,0 (0. |Z0 ) ∈ Cw0 (R, |Z0 ) denotes of Z0 (∈ ), where the free6 evolution 1 (R ) ⊕ H˙ 1 (R3 ) ⊕ L 2 (R3 ) . Note 0. is the trivial constant map (t → 0) ∈ Cw0 R, M that l.h.s. (80) is well-defined if we substitute 0. for Z. ; clearly, 0. ∈ Cw0 (R, |Z0 ). However, since µ. − 0. ∈ P1 − P1 , r.h.s. (80) cannot be directly used to estimate this particular version of l.h.s. (80). One possible way out is to derive an analog of (80) for more general measures involving our extension · L∗ of the dual Lipschitz distance, see Appendix A.1. A more direct way out is as follows. We invoke the triangle inequality to estimate t,0 [µ. ](ζ0 ) − t,0 [0. ](ζ0 ) H L ≤ t,0 [µ. ](ζ0 ) − t,0 [µ0. ](ζ0 ) H L +
t,0 [µ0. ](0) H L , where we used that t,0 [µ0. ](ζ0 ) − t,0 [0. ](ζ0 ) = t,0 [µ0. ](0), and also to estimate
µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [0. ] L∗ ≤ µ0 ◦ 0,t [ζ. ] − µ0 ◦ 0,t [ζ.0 ] L∗ + µ0 ◦ 0,t [ζ.0 ] − µ0 ◦ 0,t [0. ] L∗ . (111) Recalling that F.,0 (0. |Z0 ) = Z0. , it is straightforward to verify that (80) modifies to
F.,0 (Z. |Z0 ) − Z0. w ≤ L Z. − Z0. w + K , (112) whenever w > w = C Cψ , with L[; w, w] = max{L 1 [; w], L 2 [; w, w]} as before, and K = sup e−wt t,0 [µ0. ](0) H L + µ0 ◦ 0,t [ζ.0 ] − µ0 ◦ 0,t [0. ] L∗ . (113) t≥0 ψ
0 Since t,0 [µ0. ](0) H˙ 1 ≤ L 2 t and t,0 [µ. ](0) L 2 ≤ L 2 t by (30), and since ψ sup e−wt µ0 ◦ 0,t [ζ.0 ] − µ0 ◦ 0,t [0. ] L∗ ≤ L 2 sup e−wt t,0 [0. ](ζ0 ) H˙ 1 t≥0
t≥0
(114) ψ
by (109), with t,0 [0. ](ζ0 ) H˙ 1 ≤ (2EW (ζ0 ))1/2 + L 2 t (by (30) again), we find that K ≤ w1 2e L 2 + L 2 max w1 L 2 , (2EW (ζ0 ))1/2 . (115) Recalling that L[; w, w] = max{L 1 [; w], L 2 [; w, w]} with L 1 given in (92) and L 2 in (110), we see that both L and K are monotonically decreasing functions of w(> w), with asymptotic decay to zero ∝ 1/w for large w. Now let w be large enough such that L ≤ 1/2. Pick an R∗ , independent of w, such that K ≤ L R∗ (clearly such an R∗ exists). Now, either Z. − Z0. w ≤ R∗ or R∗ ≤ Z. − Z0. w . In the former case, F.,0 (Z. |Z0 ) − Z0. w ≤ 2L R∗ , i.e. F.,0 ( . |Z0 ) maps any closed subset of the closed ball B R∗ (Z0. ) satisfying supt≥0 ψt H˙ 1 ≤ Cψ into B2L R∗ (Z0. ); clearly, since by assumption w is large enough so that L ≤ 1/2, the ball B2L R∗ (Z0. ) ⊂ B R∗ (Z0. ). In the latter case on the other hand we have F.,0 (Z. |Z0 ) − Z0. w ≤ ≥ R∗ , the fixed point 2L Z. − Z0. w , with L ≤ 1/2. Thus we conclude that, as long as R 0 map F.,0 ( . |Z0 ) sends any closed subset of B R(Z. ) which satisfies supt≥0 ψt H˙ 1 ≤ Cψ into B R(Z0 ), given that L ≤ 1/2. Again, we note that everything proven in this paragraph for F.,0 ( · |Z0 ) remains valid for its truncation F .,0 ( · |Z0 ).
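The two-case argument above can be condensed as follows. This is only a restatement under the assumptions already made in the text, namely (112), $L\le 1/2$ and $K\le LR_*$; here $R\ge R_*$ stands for the radius of the ball and $Z^0_\cdot$ for the free evolution.

```latex
\begin{align*}
\text{If } \|Z_\cdot-Z^0_\cdot\|_w \le R_*:\quad
  & \|F_{\cdot,0}(Z_\cdot|Z_0)-Z^0_\cdot\|_w
    \le L R_* + K \le 2LR_* \le R_* \le R, \\
\text{if } R_* \le \|Z_\cdot-Z^0_\cdot\|_w \le R:\quad
  & \|F_{\cdot,0}(Z_\cdot|Z_0)-Z^0_\cdot\|_w
    \le L\,\|Z_\cdot-Z^0_\cdot\|_w + L R_*
    \le 2L\,\|Z_\cdot-Z^0_\cdot\|_w \le R.
\end{align*}
```

In both cases the image stays within distance $R$ of $Z^0_\cdot$, which is the invariance of the ball used in the contraction argument.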
It remains to notice that the truncated map F .,0 ( · |Z0 ) sends any closed subset of Cw0 (R+ , |Z0 ) satisfying supt≥0 ψt H˙ 1 ≤ Cψ , with Cψ ≥ ψ0 H˙ 1 , to itself. Hence, ≥ R∗ and C ψ ≥ ψ0 ˙ 1 , the truncated map F .,0 ( · |Z0 ) sends the intersection for R H of any closed ball B R(Z0. ) ⊂ Cw0 (R+ , |Z0 ) with any closed subsets of Cw0 (R+, |Z0 ) satisfying supt≥0 ψt H˙ 1 ≤ Cψ to itself whenever Cψ ≥ C ψ and w > w = C Cψ are large enough so that L ≤ 1/2. This completes the proof of Proposition 3.4. Remark 3.9. By Proposition 3.4 it suffices to work with w = 2w, in which case the Lipschitz constant becomes 2 ◦2 Lip (∗∇ 2 ) Lip◦2 (∗) Lip◦2 (∗(0 ·∇)2 ) 1 L2 , (116) L = max + + , 1+ 4 2 4 2w 3w 8w 4w 8w which decreases monotonically to 0 as w → ∞. Setting now Cψ = C ψ so that w = C C ψ , we see that there is a unique C ∗ψ [] such that r.h.s. (116)≤ 1/2 for
C ψ ≥ C ∗ψ []. Choosing C ψ ≥ max{C ∗ψ [], ψ0 H˙ 1 } will do.
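For orientation, the substitution $w=2\bar w$ behind (116) can be made explicit on the $w$-dependent coefficients alone. The sketch below assumes that the coefficients in (92) read $2/w^4$, $1/w^2$, $2/w^4$, that the bracket in (108) reads $1/w^2+\bar w^2/(w(w^2-\bar w^2))+\bar w^2/(w^2(w^2-\bar w^2))$, and that (99) contributes $1/w$; the common norm and Lipschitz prefactors are omitted.

```latex
\begin{align*}
\frac{1}{w}+\frac{1}{w^2}
  +\frac{\bar w^2}{w(w^2-\bar w^2)}
  +\frac{\bar w^2}{w^2(w^2-\bar w^2)}
  &= \Bigl(\frac{1}{w}+\frac{1}{w^2}\Bigr)\frac{w^2}{w^2-\bar w^2}
   = \frac{1+w}{w^2-\bar w^2}, \\
w=2\bar w:\qquad
\frac{2}{w^4} = \frac{1}{8\bar w^4},\qquad
\frac{1}{w^2} &= \frac{1}{4\bar w^2},\qquad
\frac{1+w}{w^2-\bar w^2} = \frac{1+2\bar w}{3\bar w^2}
   = \frac{2}{3\bar w}\Bigl(1+\frac{1}{2\bar w}\Bigr).
\end{align*}
```

Each of these expressions is decreasing in $\bar w>0$ and tends to $0$ as $\bar w\to\infty$, consistent with the monotonicity stated for (116).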
3.4.4. Proof of Theorem 3.5. To prove Theorem 3.5 for some w and all w > w, it suffices to set w = (1 + θ )w and prove Theorem 3.5 for some w and all θ > 0. Yet, to keep the presentation as simple as possible, we will prove Theorem 3.5 only for θ = 1, viz. w = 2w (as used for illustration purposes already in Remark 3.9). The straightforward adaptation of our proof to arbitrary θ > 0 is left to the reader. Proof of Theorem 3.5. By Remark 3.9 it suffices to set Cψ = C ψ ≥ max{C ∗ψ [],
ψ0 H˙ 1 } which implies that w = C C ψ . As remarked above, we also set w = 2w. Then, as shown in the proof of Proposition 3.4, and its ensuing remark, F .,0 (Z. |Z0 ) is a contraction map, with Lipschitz constant L ≤ 1/2, from the intersection of any ≥ R∗ with any closed subset of closed ball B R(Z0. ) ⊂ Cw0 (R, |Z0 ) of radius R Cw0 (R, |Z0 ) defined by the condition supt≥0 ψt H˙ 1 ≤ Cψ (= C ψ ), to itself. The standard contraction mapping theorem now guarantees the existence of a unique fixed point t → Zt ∈ Cw0 (R, ), with t → ψt ∈ Cb0 (R, H˙ 1 (R3 )), of the fixed point equation with the truncated F, Z. = F .,0 (Z. |Z0 ).
(117)
We will now show that for sufficiently large C ψ ≥ max{C ∗ψ [], ψ0 H˙ 1 } the solutions of (117) are fixed points for F, with w = 2w, moreover of type C 1 (R, ), thus furnishing unique -strong Vlasov solutions, as claimed in√ Theorem 3.5. To this effect, choose C ψ > max C ∗ψ [], 2EW (ζ0 ), 4 + 4E 0 − 8E g , where EW (ζ0 ) is the initial field energy, E 0 = E(Z0 ) is the total energy of the initial state, and where E g is the ground state energy of the N -body Hamiltonian given in (35); note that the ground state energy is N -independent and therefore identical to the ground state of the continuum (Vlasov) limit energy functional (61). Note that√automatically we have C ψ > ψ0 H˙ 1 , for it is trivially obvious that ψ0 H˙ 1 ≤ 2EW (ζ0 ) and
ψ0 H˙ 1 ≤ 4 + 4E 0 − 8E g by (41). With the so chosen C ψ there then exists (at least) a small neighborhood of t = 0 such that for all t in this neighborhood the fixed points of (117), which are of type C 0 (R, ), by continuity satisfy Zt = Ft,0 (Z. |Z0 ).
(118)
Now recall that (ψt , t ) ∈ ( H˙ 2 × H˙ 1 )(R3 ), ensuring a strong solution of the wave 1 ). equation, by the Hille–Yosida theorem. Thus it remains to prove that µ · ∈ C 1 (R, P To this end, recall Remark 2.5 after Lemma 2.4; i.e.
ψt H˙ 1 ≤ (2EW (ζ0 ))1/2 + L 2 |t|,
(119)
for the strong solution of the wave equation given any subluminal source ∗ ρ · ∈ C 0 (R, Cc∞ (R3 )). Clearly, there is a unique T > 0 for which C ψ = (2EW (ζ0 ))1/2 + L 2 T ,
(120)
such that ψt H˙ 1 < C ψ strictly for all |t| < T , by (119). This now implies that there exist T ≥ T such that the fixed point Z. of (117) satisfies (118) for all t ∈ [−T, T ]. We now show that sup{T : (118) holds for all |t| ≤ T } = ∞. Thus, suppose that max{T : (118) holds for all |t| ≤ T } = T∗ < ∞. Then either for t = T∗ + or for t = −T∗ − (or both), Zt is given by (117) but not by (118). For the sake of concreteness, assume that this is so for t = T∗ + , and thus in an open right neighborhood of T∗ . This then means that for all t ∈ (T∗ , T∗ + ) we ψ have t,0 [µ. ](ζ0 ) H˙ 1 ≥ C ψ > 4 + 4E − 8E g , which in particular implies that ψ limt↓T∗ t,0 H˙ 1 ≥ C ψ > 4 + 4E − 8E g . On the other hand, for all t ∈ [−T∗ , T∗ ], the solution Z. of (117) satisfies (118), and ζ. is a strong solution of the wave equation. But then, by Corollary 3.7, t → Zt ∈ C 1 ([−T∗ , T∗ ], ) is a -strong solution, which by Theorem 3.3 conserves energy. As a consequence of energy conservation, for all t ∈ [−T∗ , T∗ ], and in particular for t = T∗ , we have ψ (121)
t,0 H˙ 1 ≤ 4 + 4E − 8E g . ψ Since t → t,0 ∈ C 0 (R, H˙ 1 (R3 )), we thus have a contradiction to the previously ψ concluded t,0 H˙ 1 ≥ C ψ > 4 + 4E − 8E g for t > T∗ . Hence, T∗ = ∞.
Remark 3.10. By the proof of Theorem 3.5 it suffices to work with w = 2w∗ , where ∗ ∗ (122) w = C max C ψ [], 4 + 4E − 8E g , with C ∗ψ [] defined in Remark 3.9. Remark 3.11. In the proof of Theorem 3.5 we only made use of the a priori bound (121) following from Theorem 3.3 and the analog of the proof of our Proposition 2.8 for the regularized Vlasov model. The other bounds expressed in Proposition 2.8 are carried over as follows. Let t → Z(t) ∈ C 0 (R, ) be a generalized solution of the wave gravity Vlasov equations which conserves energy E, momentum P, and angular momentum J , and of course mass M = 1. Then, beside (121), uniformly in t we have 6E g − 3E − 3 ≤
t 2L 2 ≤ 2E − 2E g , 1 + | p|2 µt (dz) ≤ 1 + E − E g ,
∗ ψt (q)µt (dz) ≤ E − 1.
(123) (124) (125)
4. The Limit N → ∞ We prove first that the -strong N -body generalized solutions of the Vlasov model converge · w -strongly to solutions when N → ∞. We then specify when these limit solutions are continuum solutions. 4.1. The -strong limit of the N -body generalized solutions. Suppose the family of 1 -strongly when N → ∞, written ε[z(N ) ](dz) initial empirical measures converges M 0 µ0 (dz). Then the microscopic ‘density’ ρ (N ) ( ·, 0), as given in (15) with t = 0, converges strongly (in the marginal measures’ subspace) to the ‘density’ ρ( ·, 0). The initial fields are assumed to be functions of the initial data z0(N ) , rather than merely of N , and denoted by ζ [z0(N ) ], rather than just by ζ0(N ) . Of course we assume that ζ [z0(N ) ] → ζ0 ∈ H˙ 1 (R3 ) ⊕ L 2 (R3 ) satisfying (55), (56) with (ψt , t ) ∈ ( H˙ 2 × H˙ 1 )(R3 ). Our goal is to show that, when N → ∞, the generalized solution t → (ε[zt(N ) ]; ζt(N ) ) ∈ (Cw0 ∩ C 1 )(R, ) associated with this converging family of initial data in turn converges in . w norm to a solution t → (µt ; ζt ) ∈ (Cw0 ∩ C 1 )(R, ) of the regularized wave gravity Vlasov fixed ˙1
) (N ) H point equation. In the following, Z(N −→ ψt ∈ H˙ 2 (R3 ) t −→Zt ∈ B, H˙ means ψt 2
L 1 satisfying (55), t(N ) −→ t ∈ H˙ 1 (R3 ), satisfying (56), and ε[zt(N ) ] µt in P satisfying (57). The main result is an immediate consequence of the following theorem, which states that the . w induced distance between any two Cw0 (R, ) solutions of our Vlasov fixed point equation is controlled by the distance of their initial states in B, H˙ .
Proposition 4.1. Let I ⊂ N or I ⊂ R be an index set, and let {Z(α) ∈ Cw0 (R, )}α∈I . be a family of solutions of the Vlasov fixed point equation (67), having initial data (α) (α) Z0 ∈ B, H˙ with (ψ0 , 0 )(α) ∈ ( H˙ 2 × H˙ 1 )(R3 ), for which E ∗ := supα∈I {E(Z0 )} exists. Define (126) w¯ = C max C ∗ψ [], 4 + 4E ∗ − 8E g , with C = 2|supp()| Lip () (cf. text below (102)). Then for all w ≥ 2w¯ there exists ¯ such that for any (α, α) ˜ ∈ I2 , a constant L 0 [w] (α)
(α) ˜
(α) ˜
Z(α) . − Z. w ≤ L 0 Z0 − Z0 .
(127)
Before we prove this proposition, we state and prove its main corollary, Thm. 4.2. ) Theorem 4.2. Let t → Z(N ∈ (Cw0 ∩ C 1 )(R, ) be the -strong N -body solution t ) of the regularized wave gravity Vlasov equations (51)–(54) with Cauchy data Z(N 0 =
) (N ) limt→0 Z(N t described in Theorem 3.3. Suppose Z0 −→Z0 ∈ B, H˙ , with Z0 having mass ) M(= 1), energy E, momentum P, and angular momentum J . Then Z(N . − Z. w → 0, 0 where t → Zt ∈ Cw (R, ) is the unique solution of (67) described in Theorem 3.5. Furthermore, since (ψ0 , 0 ) ∈ ( H˙ 2 × H˙ 1 )(R3 ), we have Z. ∈ C 1 (R, ). Beside mass, t → Zt also conserves energy, momentum, and angular momentum.
) Proof of Theorem 4.2. By Theorem 3.5, there exist unique Cw0 (R, ) solutions Z(N . , Z. (N ) of the fixed point equation (67) for Cauchy data Z0 ∈ B, H˙ , Z0 ∈ B, H˙ . Recall that by Lemma 3.6 such solutions are of type (Cw0 ∩ C 1 )(R, ), conserving mass, momentum, angular momentum, and energy. Let Z.(∞) ≡ Z. , and set I = N ∪ {∞}. Since energy is conserved by each solution, (α) E ∗ = supα∈I {E(Z0 )} exists. Thus, w¯ exists. Pick any w > 2w. ¯ Now Proposition 4.1 (N ) } , and since
Z − Z
→ 0 by hypothesis, Proposiapplies to our family {Z(α) 0 α∈ I . 0 ) (∞) → 0. − Z tion 4.1 now implies that Z(N . . w
To prepare the proof of Proposition 4.1, we will need the following lemmata. Lemma 4.3. Let ζ. ∈ Cb0 (R, ( H˙ 1 ⊕ L 2 )(R3 )), with supt≥0 ψt H˙ 1 ≤ Cψ , and let w = C Cψ . Then t,t [ζ· ] ∈ C 0,1 (R6 , R6 ), with Lipschitz constant18 Lip t,t [ζ· ] = √1 (2 + max{w, 1/w})ew |t−t | . (128) 2
H˙ 1 (R3 )) be given, with supt≥0 ψt H˙ 1 ≤ Cψ . Proof of Lemma 4.3. Let ψt ∈ Let t → z t ∈ R6 and t → z˜ t be two distinct solutions of (7), (8) for this ψ. . ˜ and allowing z 0 = z˜ 0 , we Retracing the steps taken in (103) and (104), now for ψ = ψ, find t p˜ τ pτ |q˜t − qt | ≤ |q˜t − qt | + − dτ 2 2 1 + | p˜ τ | 1 + | pτ | t t ≤ |q˜t − qt | + | p˜ τ − pτ |dτ, (129) t t | ∗ ∇ψτ (q˜τ ) − ∗ ∇ψτ (qτ )|dτ | p˜ t − pt | ≤ | p˜ t − pt | + t t ≤ | p˜ t − pt | + C Cψ |q˜τ −qτ |dτ, (130) Cb0 (R, ∈ R6
t
with C and Cψ as stated in the lemma. Inserting (130) into (129) and using the second order variant of Gronwall’s lemma gives |q˜t − qt | ≤ |q˜t − qt | cosh w(t − t ) + | p˜ t − pt | w1 sinh w|t − t | , (131) with w = C Cψ . Back-substituting (131) into (130) and integrating then gives | p˜ t − pt | ≤ | p˜ t − pt | cosh w(t − t ) + |q˜t − qt |w sinh w|t − t | . (132) To get from (131) and (132) to the conclusion of Lemma √ 4.3, use cosh(x) ≤ e|x| and |x| sinh(|x|) ≤ e /2, as well as the familiar v 2 ≤ v 1 ≤ 2 v 2 for v ∈ Rn . 1 . The next lemma transfers control about the flow t,t on R6 to the flow †t,t on P Lemma 4.4. For any symplectomorphism on R6 which in addition is a Lipschitz map with Lipschitz constant , the adjoint map † : M(R6 ) → M(R6 ), defined by † (σ ) := σ ◦ −1 , is a positivity- and · TV -preserving smooth automorphism of 1 (R6 ) for · L∗ with Lipschitz M(R6 ), and it is a Lipschitz homeomorphism on M constant . 18 Incidentally, by Lemma 4.3, the largest Liapunov exponent for [ζ ] is bounded above by w. t,t ·
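The final step of the proof of Lemma 4.3 can be spelled out as follows; this is a sketch using only (131), (132) and the three elementary inequalities quoted there. The abbreviations $a=|\tilde q_{t'}-q_{t'}|$, $b=|\tilde p_{t'}-p_{t'}|$, $x=w|t-t'|$ and $m=\max\{w,1/w\}$ are introduced here for brevity, and the norm comparison is applied to the two-component vector $(a,b)$.

```latex
\begin{align*}
|\tilde q_t-q_t|+|\tilde p_t-p_t|
  &\le (\cosh x + w\sinh x)\,a + \bigl(\cosh x + w^{-1}\sinh x\bigr)\,b \\
  &\le \bigl(1+\tfrac{m}{2}\bigr)\,e^{x}\,(a+b)
   \le \sqrt{2}\,\bigl(1+\tfrac{m}{2}\bigr)\,e^{x}\,\sqrt{a^2+b^2}, \\
|\tilde z_t-z_t|
  &\le |\tilde q_t-q_t|+|\tilde p_t-p_t|
   \le \tfrac{1}{\sqrt{2}}\,(2+m)\,e^{w|t-t'|}\,|\tilde z_{t'}-z_{t'}| .
\end{align*}
```

The prefactor $\tfrac{1}{\sqrt2}(2+\max\{w,1/w\})$ and the exponential rate $w$ are exactly those appearing in (128).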
Proof of Lemma 4.4. First, since is a symplectomorphism, by way of the definition of its adjoint, † maps M(R6 ) smoothly onto M(R6 ), and it preserves (a) σ TV for σ ∈ M and (b) the positivity of µ ∈ M+ . Furthermore, since is invertible, so is † . 1 (R6 ), we only need to show that † maps To see that † is a homeomorphism of M 6 1 into M 1 . Thus, let z ∗ ∈ R be the unique element of ker . Then note that by the M definition of † and the Lipschitz property of we have |z|σ ◦ −1 (dz) = |(z) − (z ∗ )|σ (dz) ≤ |z − z ∗ ||σ |(dz), (133) 1 . where |σ | is the total variation of σ ; the last integral exists for σ ∈ M As for the Lipschitz continuity of the adjoint flow, let σˆ , σˇ ∈ M1 (R6 ). We have g d σˆ ◦ −1 − σˇ ◦ −1 : Lip (g) ≤ 1
† (σˆ ) − † (σˇ ) L∗ = sup g∈C 0,1 (R6 ) g ◦ d(σˆ − σˇ ) : Lip (g) ≤ 1 = sup g∈C 0,1 (R6 ) 1 : Lip (g) ≤ 1 = sup g ◦ d( σ ˆ − σ ˇ ) g∈C 0,1 (R6 ) ≤ σˆ − σˇ L∗ . (134) In the last step, we used that −1 g ◦ ∈ C 0,1 (R6 ) with Lip −1 g ◦ ≤ 1. Proof of Proposition 4.1. Pick w ≥ 2w, ¯ with w¯ defined in (126), and pick Z. , Z. ∈ Cw0 (R, ) from the family of solutions Z(α) of the Vlasov fixed point equation (67) . specified in Proposition 4.1. Then
$$\|Z_. - \widetilde{Z}_.\|_w = \|F_{.,0}(Z_.|Z_0) - F_{.,0}(\widetilde{Z}_.|\widetilde{Z}_0)\|_w. \tag{135}$$

By the triangle inequality,

$$\|F_{.,0}(Z_.|Z_0) - F_{.,0}(\widetilde{Z}_.|\widetilde{Z}_0)\|_w \le \|F_{.,0}(Z_.|Z_0) - F_{.,0}(Z_.|\widetilde{Z}_0)\|_w + \|F_{.,0}(Z_.|\widetilde{Z}_0) - F_{.,0}(\widetilde{Z}_.|\widetilde{Z}_0)\|_w. \tag{136}$$

Now, $\|F_{.,0}(Z_.|\widetilde{Z}_0) - F_{.,0}(\widetilde{Z}_.|\widetilde{Z}_0)\|_w$ was estimated already in the proof of Proposition 3.4, see (80) (recall that the conditioning $\lim_{t\to 0} Z_t = Z_0 = \lim_{t\to 0}\widetilde{Z}_t$ that entered the statement of Proposition 3.4 did not enter the estimates for (80) themselves). Furthermore, with $w \ge 2\bar{w}$ it follows that the parameter conditions in the proof of Theorem 3.5 are fulfilled for each $Z_0$; hence, in (80) we have L[; w, w] ≤ 1/2 for each $Z_0$. Thus, by (135), (136), and (80), and with (1 − L[; w, w])^{-1} ≤ 2, we arrive at the estimate

$$\|Z_. - \widetilde{Z}_.\|_w \le 2\,\|F_{.,0}(Z_.|Z_0) - F_{.,0}(Z_.|\widetilde{Z}_0)\|_w, \tag{137}$$

uniformly for all $Z_0 \in \{Z_0^{(\alpha)}\}_{\alpha\in I}$. The proof of Proposition 4.1 has thus been reduced to proving Lipschitz continuity of $F_{.,0}$ in its second argument, given the first. Since, by the triangle inequality,
$$\|F_{.,0}(Z_.|\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\tilde\zeta_0)\|_w \le \|F_{.,0}(Z_.|\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\zeta_0)\|_w + \|F_{.,0}(Z_.|\tilde\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\tilde\zeta_0)\|_w, \tag{138}$$

it suffices to show that for given $Z_.$ and $\zeta_0$, we have

$$\|F_{.,0}(Z_.|\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\zeta_0)\|_w \le L^*_1\,\|\mu_0 - \tilde\mu_0\|_{L^*}, \tag{139}$$

and for given $Z_.$ and $\tilde\mu_0$,

$$\|F_{.,0}(Z_.|\tilde\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\tilde\zeta_0)\|_w \le L^*_2\,\|\zeta_0 - \tilde\zeta_0\|_{HL}, \tag{140}$$

with $L^*_1$, $L^*_2$ depending at most on $\bar{w}$. For then it follows from (138), (139), (140) that

$$\|F_{.,0}(Z_.|Z_0) - F_{.,0}(Z_.|\widetilde{Z}_0)\|_w \le L^*\,\|Z_0 - \widetilde{Z}_0\|, \tag{141}$$

with $L^*[\bar{w}] := \max\{L^*_1, L^*_2\}$, completing the proof of Proposition 4.1, with $L_0 = 2L^*$. As to (139), for all $t \in \mathbb{R}$ we have
Ft,0 (Z. |µ0 ; ζ0 ) − Ft,0 (Z. |µ˜ 0 ; ζ0 ) = †t,0 [ζ. ](µ0 ) − †t,0 [ζ. ](µ˜ 0 ) L∗ ≤ L ∗1 exp(w|t|) µ ¯ ˜ 0 L∗ , (142) 0−µ L ∗1 [w] ¯ =
√
√ 2 + max{w, ¯ 1/w}/ ¯ 2,
(143)
the inequality by Lemmas 4.3 and 4.4. Since w ≥ 2w, ¯ the supt∈R e−w|t| (142) exists. Estimating it further with w − w¯ ≥ w¯ for w ≥ 2w¯ gives (139), with L ∗1 given in (143). As to (140), for all t ∈ R we have ψ
Ft,0 (Z. |µ˜ 0 ; ζ0 ) − Ft,0 (Z. |µ˜ 0 ; ζ˜0 ) = t,0 [0. ](ζ0 − ζ˜0 ) H˙ 1
˜ + t,0 [0. ](ζ0 − ζ0 ) L 2 ,
(144)
ψ
where t,0 [0. ]( · ) and t,0 [0. ]( · ) are the free propagators obtained from Kirchhoff’s ψ
formulas (72) and (73) by replacing µ. → 0. ; note that t,0 [0. ]( · ) and t,0 [0. ]( · ) are linear operators. Since the field energy EW (ζ.0 ) is conserved for solutions of the homogeneous wave equation with initial data (ψ0 , 0 ) ∈ ( H˙ 2 × H˙ 1 )(R3 ), we have the bounds √ 1/2 ψ ˜ ˜ ˜
t,0 [0. ](ζ0 − ζ˜0 ) H˙ 1 + t,0 [0. ](ζ0 − ζ0 ) L 2 ≤ 2EW (ζ0 − ζ0 ) ≤ 2 ζ0 − ζ0 H L , (145) √ √ so L ∗2 = 2. Estimates (139) (with (143)) and (140) (with L ∗2 = 2 < L ∗1 ) now give
$$\|F_{.,0}(Z_.|\mu_0;\zeta_0) - F_{.,0}(Z_.|\tilde\mu_0;\tilde\zeta_0)\|_w \le L^*_1\,\|Z_0 - \widetilde{Z}_0\|. \tag{146}$$

The proof of Proposition 4.1 is complete, with $L_0[\bar{w}] = 2L^*[\bar{w}] = 2L^*_1[\bar{w}]$.
4.2. The continuum limit. Note that so far nothing prevents the measure µ0 (dz), which obtains as limit of the ε[z0(N ) ](dz) when N → ∞, from being as singular as the ε[z0(N ) ](dz) are. In particular, we may even allow ε[z0(N ) ](dz) δz 0 (dz). Since in physical applications of Vlasov theory one is typically interested in continuum solutions, we now suppose that when N → ∞, the family of initial empirical measures ε[z0(N ) ](dz) converges 1 -strongly to a measure µ0 (dz) which is absolutely continuous w.r.t. Lebesgue meaM sure. We write µ(dz) = µ f (dz) = f (z)dz for the absolutely continuous measures 6 in P1 (R6 ). The set of their Radon–Nikodym derivatives is denoted L 1,1 +,1 (R ); thus 6 f0 ft f ∈ L 1,1 +,1 (R ). We now show that when µ0 (dz) = µ (dz), then µt = µ , with 6 f t ( ·, · ) ∈ L 1,1 +,1 (R ) for all t ∈ R.
Proposition 4.5. If (µ. ; ζ. ) ∈ (Cw0 ∩C 1 )(R, ) solves (67) with µ0 = µ f0 , f 0 ∈ (L 1,1 +,1 ∩ p 6 L p )(R6 ) for some p ≥ 1, then µt = µ ft with f t ( ·, · ) ∈ (L 1,1 +,1 ∩ L )(R ) for all t ∈ R; 6 p 6 note that p ≥ 1 includes the case that f 0 ∈ L 1,1 +,1 (R ) while f 0 ∈ L (R ) for any p > 1.
1 ) is a strong generalized solution of Proof of Proposition 4.5. Suppose µ. ∈ C 1 (R, P the Vlasov continuity equation (54) for given ζ. ∈ (Cb0 ∩ C 1 )(R, ( H˙ 1 ⊕ L 2 )(R3 )), with p 6 ft Cauchy data µ0 = µ f0 , f 0 ∈ (L 1,1 +,1 ∩ L )(R ) for some p ≥ 1. Then µt = µ p 6 with f t ( ·, · ) ∈ (L 1,1 +,1 ∩ L )(R ) for all t ∈ R. This follows from the definition of a generalized solution, a straightforward change of variables from z to t,0 [ζ. ](z) under the integral, noting the properties of the flow .,. summarized in Corollary 3.7.
4.2.1. Additional conservation laws for continuum solutions. The argument used to prove Proposition 4.5 has the useful corollary that continuum solutions $Z^f_.$ with $f_0 \in L^{1,1}_{+,1}(\mathbb{R}^6)$ enjoy additional conservation laws if $f_0$ satisfies stronger integrability requirements. Here we wrote $Z^f_.$ for $Z_. = (\mu_.;\zeta_.)$ with $\mu_t = \mu^{f_t}$. Thus, for any $g : \mathbb{R}_+ \to \mathbb{R}$, we define the g-Casimir functional of $Z^f$ by

$$C^{(g)}\big(Z^f\big) = \int g\circ f\,dz, \quad \text{for all } f \in L^{1,1}_{+,1} \text{ such that } g\circ f \in L^1(\mathbb{R}^6). \tag{147}$$

For $g(\,\cdot\,) = (\mathrm{id}(\,\cdot\,))^p$, $p \ge 1$, we get the $p$-th power of the $L^p$ norm of $f$; when $p = 1$ this yields just the mass functional (58) for absolutely continuous $\mu_t = \mu^{f_t}$. For $g(\,\cdot\,) = -\mathrm{id}(\,\cdot\,)\log(\mathrm{id}(\,\cdot\,)/f_*)$ we get the entropy of $f$ relative to some arbitrary $f_* \in L^{1,1}_{+,1}(\mathbb{R}^6)$,

$$C^{(-\mathrm{id}\log(\mathrm{id}/f_*))}\big(Z^f\big) = -\int f\ln(f/f_*)\,dz \equiv S(f|f_*). \tag{148}$$

Proposition 4.6. Let t → Zt ∈ (Cw0 ∩ C1)(R, ) be a generalized solution of the regularized wave gravity Vlasov model for which $\mu_t$ is absolutely continuous, i.e. $\mu_t = \mu^{f_t}$. Then, beside the conservation laws (65), whenever $C^{(g)}\big(Z^f_0\big)$ exists, also

$$C^{(g)}\big(Z^f_.\big) = C^{(g)}\big(Z^f_0\big). \tag{149}$$

In particular, if the relative entropy $S(f_0|f_*)$ exists initially, then at later times

$$S(f_t|f_*) = S(f_0|f_*). \tag{150}$$
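The mechanism behind these conservation laws is that the density is transported by a measure-preserving map, so every integral of the form (147) is unchanged. The following Python sketch is only a discrete caricature of this, under loudly stated assumptions: the phase space is replaced by an N × N grid on the unit torus, and the Vlasov flow is replaced by the Arnold cat map, an area-preserving bijection of the grid chosen for convenience. The names `cat_map`, `transport`, `casimir` and `bump` are hypothetical and do not come from the paper.

```python
import math

N = 64               # grid cells per direction on the unit torus (illustrative)
H2 = 1.0 / (N * N)   # area of one grid cell

def cat_map(i, j):
    """Arnold cat map on the N x N grid: a bijection with unit determinant."""
    return (2 * i + j) % N, (i + j) % N

def transport(f):
    """Push a cell-wise density forward: the value at (i, j) moves to cat_map(i, j)."""
    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            a, b = cat_map(i, j)
            out[a][b] = f[i][j]
    return out

def casimir(f, g):
    """Riemann-sum analogue of the g-Casimir functional, int g(f) dz."""
    return math.fsum(g(v) for row in f for v in row) * H2

def bump(i, j):
    """Strictly positive Gaussian bump centred in the unit square."""
    x, y = (i + 0.5) / N, (j + 0.5) / N
    return math.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / 0.02)

raw = [[bump(i, j) for j in range(N)] for i in range(N)]
mass = math.fsum(v for row in raw for v in row) * H2
f0 = [[v / mass for v in row] for row in raw]   # normalized initial density

f1 = transport(transport(transport(f0)))        # three steps of the map

for name, g in [("mass", lambda v: v),
                ("L2 norm squared", lambda v: v * v),
                ("entropy", lambda v: -v * math.log(v))]:
    print(name, casimir(f0, g), casimir(f1, g))
```

Because the map only permutes grid cells, each Casimir sum runs over the same multiset of values before and after transport, which is the discrete counterpart of the change of variables used in the continuum argument.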
Acknowledgement. The work of M.K. was supported in part by the NSF under Grants No. DMS-0103808, DMS-0406951, and in part by CNRS through a poste rouge to M.K. while visiting CNRS-Université de Provence; the work of Y.E. by Université de Provence through a congé pour recherche and by CNRS through a delegation position; the work of V.R., when the collaboration started, by the Foundation BLANCEFLOR Boncompagni Ludovisi née Bildt. The participation of A. Nouri in the early stages of this work is gratefully acknowledged. M.K. thanks M. Kunze, C. Lancellotti, H. Spohn, P. Smith, and A.S. Tahvildar-Zadeh for valuable discussions. The authors thank two anonymous referees for their constructive criticisms.
Appendix

A.1. Nested modes of convergence of probability measures. A certain frustration about the absence of an authoritative survey of the relationships of various important notions of convergence that are used in the probability literature has already been expressed in [GiSu02], where that gap has been filled to some extent. Unfortunately, [GiSu02] does not cover all our needs. Furthermore, when addressing a mixed readership of mathematical physicists, analysts and probabilists, the frustration can get compounded by the various ‘competing’ terminologies and notations that are in use in these areas of activity. In view of this, it seems advisable to be more explicit about the notions of convergence that we use. The following general notions hold (and are formulated) for any dimension $d \ge 1$.

We recall that, if $\{\mu_n\}_{n\in\mathbb{N}}$ is a sequence of Borel probability measures on $\mathbb{R}^d$ and $\mu \in P(\mathbb{R}^d)$, too, and if $\int f\,d\mu_n \to \int f\,d\mu$ for every bounded continuous function $f \in C^0_b(\mathbb{R}^d)$, then one says that $\mu_n$ converges to $\mu$ in law,$^{19}$ written $\mu_n \xrightarrow{L} \mu$; see p. 292 of [Dud02]. Clearly, since $C^0_0(\mathbb{R}^d) \subset C^0_b(\mathbb{R}^d)$, convergence in law $\mu_n \xrightarrow{L} \mu$ implies w* convergence of $\mu_n$ to $\mu$.

Convergence in law can be metrized as follows. Let $C^{0,\alpha}_b(\mathbb{R}^d)$ denote the subset of the bounded continuous functions on $\mathbb{R}^d$ which are also Hölder continuous with exponent $\alpha \in (0,1]$. Now $C^{0,\alpha}_b(\mathbb{R}^d)$ is not a closed subspace of $C^0_b(\mathbb{R}^d)$ w.r.t. $\|\cdot\|_u$, but

$$\|g\|_{u,\alpha} \equiv \max\big\{\|g\|_u,\ \mathrm{H\ddot{o}l}_\alpha(g)\big\}, \tag{151}$$

where

$$\mathrm{H\ddot{o}l}_\alpha(g) \equiv \sup_{\xi\neq\xi'\in\mathbb{R}^d} \frac{|g(\xi) - g(\xi')|}{|\xi - \xi'|^\alpha} \tag{152}$$

is the α-Hölder seminorm of $g$, turns $C^{0,\alpha}_b(\mathbb{R}^d)$ into a (non-separable) Banach space. The positive cone in $C^{0,\alpha}_b(\mathbb{R}^d)$ is denoted by $C^{0,\alpha}_{b,+}(\mathbb{R}^d)$. If the suffix $b$ is replaced by the suffix $0$, we mean the corresponding subsets of these functions that vanish at infinity. In much of what follows, we will need $C^{0,1}_b(\mathbb{R}^d)$, the space of bounded Lipschitz functions on $\mathbb{R}^d$, and we write$^{20}$ $\mathrm{Lip}(g)$ for $\mathrm{H\ddot{o}l}_\alpha(g)$ when $\alpha = 1$.$^{21}$

Now let $\mu_1 \in P(\mathbb{R}^d)$ and $\mu_2 \in P(\mathbb{R}^d)$ be two Borel probability measures on $\mathbb{R}^d$. We define the dual bounded-Lipschitz distance between $\mu_1$ and $\mu_2$ as$^{22}$

$$\mathrm{dist}_{bL^*}(\mu_1,\mu_2) := \sup_{g\in C^{0,1}_b(\mathbb{R}^d)} \Big\{ \int g\,d(\mu_1 - \mu_2) : \|g\|_{u,1} \le 1 \Big\}. \tag{153}$$
19 In the probability literature, convergence in law is usually called “weak convergence” of probability measures; however, this notion generally differs from the analysts’ notion of weak convergence on M.
20 Since $\mathrm{Lip}(\,\cdot\,)$ is a seminorm, we prefer this notation over $\|\cdot\|_L$, which is also in use in the literature.
21 We recall that if $g \in C^1(\mathbb{R}^d)$, then $\mathrm{Lip}(g) = \sup_{x\in\mathbb{R}^d} |\nabla g(x)|$.
22 The * at $\mathrm{dist}_{bL^*}(\,,\,)$ refers to the Kantorovich–Rubinstein duality theorems; see below.
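To make definition (153) concrete, the following Python sketch evaluates the pairing of one admissible test function with the difference of two discrete measures on the real line (d = 1), which by (153) gives a lower bound for the dual bounded-Lipschitz distance. The helper names and the particular clipped test function are assumptions made for this example. For two Dirac masses at a and b the bound is attained: any admissible g satisfies |g(a) − g(b)| ≤ min{2, |a − b|}, and the clipped function below achieves this value.

```python
def clipped_test_function(a, b):
    """g(x) = clip(x - (a + b)/2, -1, 1): sup|g| <= 1 and Lip(g) <= 1,
    so g is admissible in (153)."""
    mid = 0.5 * (a + b)
    return lambda x: max(-1.0, min(1.0, x - mid))

def pairing(g, atoms1, weights1, atoms2, weights2):
    """Integral of g against mu1 - mu2 for two discrete measures on the line."""
    return (sum(w * g(x) for x, w in zip(atoms1, weights1))
            - sum(w * g(x) for x, w in zip(atoms2, weights2)))

def dirac_lower_bound(a, b):
    """Lower bound for the dual bounded-Lipschitz distance between the
    Dirac masses at a and b, obtained from a single admissible g."""
    g = clipped_test_function(a, b)
    return abs(pairing(g, [a], [1.0], [b], [1.0]))

for a, b in [(0.0, 0.5), (0.0, 1.5), (0.0, 10.0)]:
    print(a, b, dirac_lower_bound(a, b), min(2.0, abs(a - b)))
```

The saturation at the value 2 for distant points reflects the sup-norm constraint in (153), in contrast to a purely Lipschitz constraint, for which the pairing would grow like |a − b|.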
Our dual bounded-Lipschitz distance, though not identical to it, is equivalent to the Fortet–Mourier β-distance (p. 395 of [Dud02]), which instead of $\|g\|_{u,1} \le 1$ works with the equivalent condition $\|g\|_u + \mathrm{Lip}(g) \le 1$. Therefore, by Proposition 11.3.2 of [Dud02], $\mathrm{dist}_{bL^*}(\,,\,)$ is a metric on the convex set $P(\mathbb{R}^d)$, and by Corollary 11.5.5 of [Dud02], $P(\mathbb{R}^d)$ is complete for $\mathrm{dist}_{bL^*}(\,,\,)$. Furthermore, by Theorem 11.3.3 of [Dud02], if $\{\mu_n\}_{n\in\mathbb{N}}$ is a sequence of Borel probability measures on $\mathbb{R}^d$, and $\mu \in P(\mathbb{R}^d)$, too, then $\mathrm{dist}_{bL^*}(\mu_n,\mu) \to 0$ as $n \to \infty$ is equivalent to $\mu_n \xrightarrow{L} \mu$ as $n \to \infty$. Hence, $\mathrm{dist}_{bL^*}(\,,\,)$ metrizes convergence in law of the Borel probability measures on $\mathbb{R}^d$.

Our dual bounded-Lipschitz distance $\mathrm{dist}_{bL^*}(\,,\,)$ is equivalent, but not identical, to the distance obtained by restricting $g$ to $C^{0,1}_{b,+}(\mathbb{R}^d)$, here denoted $d_{bL^*}(\,,\,)$ (following [Spo91], Def. 2.2; actually, Spohn writes $d_{bL}(\,,\,)$, but we here better keep the *). Clearly, $\mathrm{dist}_{bL^*}(\mu_n,\mu) \to 0$ implies $d_{bL^*}(\mu_n,\mu) \to 0$. The converse of this follows from three simple observations: first, the integral on the r.h.s. of (153) is invariant under $g \to g + \|g\|_u$, so that in our definition of $\mathrm{dist}_{bL^*}(\,,\,)$ we can replace $C^{0,1}_b(\mathbb{R}^d)$ by $C^{0,1}_{b,+}(\mathbb{R}^d)$ and simultaneously replace the condition $\|g\|_{u,1} \le 1$ with the condition $\max\{\tfrac12\|g\|_u, \mathrm{Lip}(g)\} \le 1$; second, $\{g \in C^{0,1}_{b,+}(\mathbb{R}^d) : \|g\|_u \le 2, \mathrm{Lip}(g) \le 1\}$ is a strict subset of $\{g \in C^{0,1}_{b,+}(\mathbb{R}^d) : \|g\|_u \le 2, \mathrm{Lip}(g) \le 2\}$; third, the simple scaling $g \to 2g$ reveals that the sup of $\int g\,d(\mu_1 - \mu_2)$ over $\{g \in C^{0,1}_{b,+}(\mathbb{R}^d) : \|g\|_u \le 2, \mathrm{Lip}(g) \le 2\}$ is twice the sup of $\int g\,d(\mu_1 - \mu_2)$ over $\{g \in C^{0,1}_{b,+}(\mathbb{R}^d) : \|g\|_u \le 1, \mathrm{Lip}(g) \le 1\}$. These three facts together imply that $\mathrm{dist}_{bL^*}(\mu_1,\mu_2) \le 2\,d_{bL^*}(\mu_1,\mu_2)$, and this means that $\mathrm{dist}_{bL^*}(\mu_n,\mu) \to 0$ whenever $d_{bL^*}(\mu_n,\mu) \to 0$.

Recall that the general Kantorovich–Rubinstein distance$^{23}$ is defined as

$$\mathrm{dist}_{KR_c}(\mu_1,\mu_2) := \inf_{\mu\in P_c(\mathbb{R}^{2d}|\mu_1,\mu_2)} \iint \mathrm{cost}(\xi_1,\xi_2)\,\mu(d\xi_1 d\xi_2), \tag{154}$$

where $\mathrm{cost}(\xi,\xi') = \mathrm{dist}_{KR_c}(\delta_\xi,\delta_{\xi'})$ for $\xi, \xi' \in \mathbb{R}^d$ is the “cost (per transport unit) function,” and where $P_c(\mathbb{R}^{2d}|\mu_1,\mu_2)$ is the set of Borel probability measures $\mu$ on $\mathbb{R}^d\times\mathbb{R}^d$ satisfying $\mu(d\xi_1\times\mathbb{R}^d) = \mu_1(d\xi_1)$ and $\mu(\mathbb{R}^d\times d\xi_2) = \mu_2(d\xi_2)$, with $\mu_1$ and $\mu_2$ satisfying $\int \mathrm{cost}(\xi_1,\xi)\,\mu_1(d\xi_1) < \infty$ and $\int \mathrm{cost}(\xi,\xi_2)\,\mu_2(d\xi_2) < \infty$ for some $\xi \in \mathbb{R}^d$. By the Kantorovich–Rubinstein theorem ([Dud02], Theorem 11.8.2), $\mathrm{dist}_{bL^*}(\mu_1,\mu_2)$ is identical to the Kantorovich–Rubinstein distance for $\mathrm{cost}(\xi_1,\xi_2) = \min\{2, |\xi_1 - \xi_2|\}$. Incidentally, $\mathrm{cost}(\xi_1,\xi_2) = \min\{1, |\xi_1 - \xi_2|\}$ is the cost function for the particular Kantorovich–Rubinstein distance identical to $d_{bL^*}(\,,\,)$. The dual bounded-Lipschitz distance ($d_{bL^*}$) is used in [NeWi74,BrHe77,Dob79,Neu85,Spo91] and [FiEl98].

23 Also associated with the names of Monge and Wasserstein.

However, if one is only interested, as we are, in the subset $P_1(\mathbb{R}^d) \subset P(\mathbb{R}^d)$, it is rather prudent to work with the dual Lipschitz distance in $P_1(\mathbb{R}^d)$, given by

$$\mathrm{dist}_{L^*}(\mu_1,\mu_2) := \sup_{g\in C^{0,1}(\mathbb{R}^d)} \Big\{ \int g\,d(\mu_1 - \mu_2) : \mathrm{Lip}(g) \le 1 \Big\}, \tag{155}$$
which is identical with the standard$^{24}$ Kantorovich–Rubinstein distance, given by

$$\mathrm{dist}_{KR}(\mu_1,\mu_2) := \inf_{\mu\in P_1(\mathbb{R}^{2d}|\mu_1,\mu_2)} \iint |\xi_1 - \xi_2|\,\mu(d\xi_1 d\xi_2), \tag{156}$$
where P1 (R2d |µ1 , µ2 ) is the set of Borel probability measures µ on Rd × Rd satisfying µ(dξ1 × Rd ) = µ1 (dξ1 ) ∈ P1 (Rd ) and µ(Rd × dξ2 ) = µ2 (dξ2 ) ∈ P1 (Rd ). We write µn µ if distL∗ (µn , µ) → 0. Clearly, dist L∗ (µn , µ) → 0 implies25 distbL∗ (µn , µ) → 0. We note that the metric dist L∗ ( · , · ) defines a norm · L∗ on (P1 − P1 ) ⊂ M by26 σ L∗ := distL∗ (σ+ , σ− ). This definition extends identically to λ(P1 − P1 ) for any λ ∈ R. To extend · L∗ to the linear span of P1 , for σ ∈ lsp P1 we define
σ L∗ := distL∗ (σ − σ (Rd )µ) ˜ + , (σ − σ (Rd )µ) ˜ − + |σ (Rd )|, (157) where µ˜ ∈ P1 (Rd ) is arbitrary but fixed; e.g. µ˜ = δ0 . Clearly, for σ ∈ P1 − P1 , such that σ (Rd ) = 0, (157) reduces to σ L∗ = distL∗ (σ+ , σ− ), i.e. σ L∗ = σ L∗ whenever σ (Rd ) = 0. It is straightforward to verify that · L∗ is a norm on lsp P1 . 1 (Rd ), is a Banach The completion of the linear span of P1 (Rd ) w.r.t. (157), denoted M 1 (Rd ) for P1 (Rd ) → M 1 (Rd ). space with norm · L∗ given in (157). We write P A.2. The second order variant of the Gronwall lemma. The standard Gronwall lemma provides a simple upper bound on a function t → u(t) satisfying the first order differential inequality d u ≤ f (t)u + g(t) dt
(158)
for all $t \in \mathbb{R}_+$, with $u(0) = u_0 > 0$, and with $f(t)$ and $g(t)$ given positive continuous functions; namely, with the help of an integrating factor one finds right away that $u$ is bounded by

$$u(t) \le u_0 \exp\Big(\int_0^t f(\tau)\,d\tau\Big) + \int_0^t \exp\Big(\int_\tau^t f(\tilde\tau)\,d\tilde\tau\Big)\, g(\tau)\,d\tau. \tag{159}$$

In particular, if $f(t) \equiv \gamma > 0$ is a constant, then

$$u(t) \le u_0 \exp(\gamma t) + \int_0^t \exp[\gamma(t - \tau)]\, g(\tau)\,d\tau. \tag{160}$$
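The constant-coefficient bound (160) can be checked numerically. The Python sketch below is only an illustration under assumed data: the values of `GAMMA` and `U0`, the forcing `g`, and the two right-hand sides are hypothetical choices, and the integrals are approximated by a trapezoid rule and a classical Runge-Kutta scheme. One right-hand side realizes equality in the differential inequality, in which case the solution should reproduce the bound; the other stays strictly below it.

```python
import math

def gronwall_bound(u0, gamma, g, t, n=2000):
    """Right-hand side of (160), with the integral done by the trapezoid rule."""
    if t == 0.0:
        return u0
    h = t / n
    total = 0.0
    for k in range(n + 1):
        s = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(gamma * (t - s)) * g(s)
    return u0 * math.exp(gamma * t) + h * total

def rk4(rhs, u0, t, n=2000):
    """Integrate u' = rhs(s, u) from 0 to t with the classical Runge-Kutta scheme."""
    h = t / n
    u, s = u0, 0.0
    for _ in range(n):
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, u + h * k1 / 2)
        k3 = rhs(s + h / 2, u + h * k2 / 2)
        k4 = rhs(s + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return u

GAMMA, U0 = 1.5, 1.0  # illustrative constants, not taken from the paper

def g(s):
    return 1.0 + math.sin(s) ** 2  # positive and continuous

def rhs_equal(s, u):
    return GAMMA * u + g(s)  # equality in the differential inequality

def rhs_below(s, u):
    return GAMMA * u * math.cos(s) ** 2 + g(s)  # <= GAMMA*u + g(s) while u >= 0

for t in (0.5, 1.0, 2.0):
    print(t, rk4(rhs_below, U0, t), rk4(rhs_equal, U0, t),
          gronwall_bound(U0, GAMMA, g, t))
```

In the equality case the computed solution and the bound agree up to discretization error, which reflects that the estimate is sharp.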
24 The word “standard” refers to the custom in the probability community that, by default, the cost function is identified with the metric of the underlying complete metric space on which the Borel probability measures are defined; in standard Euclidean $\mathbb{R}^d$ this gives $\mathrm{cost}(\xi_1,\xi_2) = |\xi_1 - \xi_2|$.
25 The converse is not true. In particular, Dudley gives the following counterexample for $d = 1$: $\mu_n = (1 - n^{-1})\delta_0 + n^{-1}\delta_n$ and $\mu = \delta_0$, for which $\mathrm{dist}_{L^*}(\mu_n,\mu) = 1$ while $\mathrm{dist}_{bL^*}(\mu_n,\mu) \le 2n^{-1} \downarrow 0$.
26 In particular, if $\sigma = \mu_1 - \mu_2$ with $\mu_1, \mu_2 \in P_1$, then $\mathrm{dist}_{L^*}(\sigma_+,\sigma_-) = \mathrm{dist}_{L^*}(\mu_1,\mu_2)$; note, however, that generally $\mu_1 \neq (\mu_1 - \mu_2)_+$ and $\mu_2 \neq (\mu_1 - \mu_2)_-$.

However, (159) does not suit our purposes; instead, we need the following second order variant of (159):
Lemma A1. Let $\gamma > 0$ be a given constant and $g(t)$ a given positive continuous function. Suppose $t \to u(t)$ satisfies the second order differential inequality

$$\frac{d^2 u}{dt^2} \le \gamma^2 u + g(t) \tag{161}$$

for all $t \in \mathbb{R}_+$, with $u(0) = u_0 \ge 0$ and $u'(0) = v_0 \ge 0$. Then $u$ is bounded by

$$u(t) \le u_0\cosh(\gamma t) + v_0\,\tfrac{1}{\gamma}\sinh(\gamma t) + \int_0^t \cosh[\gamma(t - \tau)] \int_0^\tau g(\tilde\tau)\,d\tilde\tau\,d\tau \tag{162}$$
for all $t \in \mathbb{R}_+$.

Proof of Lemma A1. Denote r.h.s. (162) $= U(t) = U_{hom}(t) + U_{inh}(t)$, where $U_{inh}(t)$ is the term linear in $g$. By direct computation one verifies that the function $t \to U(t)$ satisfies (161) with “=” instead of ≤, and $U(0) = u_0$ and $U'(0) = v_0$. Since the Cauchy problem for (161) with positive data has a unique positive solution, it follows that $u(t) \le U(t)$ by the usual subsolution argument.

References

[And05] Andréasson, H.: The Einstein–Vlasov system/kinetic theory. Living Rev. Relativ. 8 (2005). URL: http://www.livingreviews.org/lrr-2005-2 (2008). Cited on April 20, 2008
[Ang00] Anguige, K.: Isotropic cosmological singularities. III. The Cauchy problem for the inhomogeneous conformal Einstein–Vlasov equations. Ann. Phys. 282, 395–419 (2000)
[AnTo99] Anguige, K., Tod, K.P.: Isotropic cosmological singularities. II. The Einstein–Vlasov system. Ann. Phys. 276, 294–320 (1999)
[ApKi01] Appel, W., Kiessling, M.K.-H.: Mass and spin renormalization in Lorentz electrodynamics. Ann. Phys. (NY) 289, 24–83 (2001)
[AsUk86] Asano, K., Ukai, S.: On the Vlasov–Poisson limit of the Vlasov–Maxwell equations. In: Nishida, T., Mimura, M., Fujii, H. (eds.) Patterns and Waves. Stud. Math. Appl. vol. 18, pp. 369–383. North-Holland, Amsterdam (1986)
[BEGMY02] Bardos, C., Erdös, L., Golse, F., Mauser, N., Yau, H.-T.: Derivation of the Schrödinger–Poisson equation from the quantum n-body problem. C. R. Acad. Sci. Ser. I Math. 334, 515–520 (2002)
[BaDü01] Bauer, G., Dürr, D.: The Maxwell–Lorentz system of a rigid charge. Ann. Inst. H. Poincaré 2, 179–196 (2001)
[Ber88] Bernstein, J.: Kinetic Theory in the Expanding Universe. Cambridge University Press, Cambridge (1988)
[BGP00] Bouchut, F., Golse, F., Pallard, C.: Nonresonant smoothing for coupled wave + transport equations and the Vlasov–Maxwell system. In: Anton, A. et al. (eds.) Dispersive Corrections to Transport Phenomena. IMA Proceedings, vol. 136, (Minneapolis, 2000). Springer, New York (2004)
[BGP03] Bouchut, F., Golse, F., Pallard, C.: Classical solutions and the Glassey–Strauss theorem for the 3D Vlasov–Maxwell system. Arch. Rat. Mech. Anal. 170, 1–15 (2003)
[BrHe77] Braun, W., Hepp, K.: The Vlasov dynamics and its fluctuations in the 1/n limit of interacting classical particles. Commun. Math. Phys. 56, 101–113 (1977)
[Bre93] Brezis, H.: Analyse Fonctionnelle—Théorie et applications. Masson, Paris (1993)
[CaRe03] Calogero, S., Rein, G.: On classical solutions of the Nordström–Vlasov system. Commun. Partial Diff. Eqn. 28, 1863–1885 (2003)
[CaRe04] Calogero, S., Rein, G.: Global weak solutions to the Nordström–Vlasov system. J. Diff. Eqn. 204, 323–338 (2004)
[CIP91] Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Springer, Berlin (1991)
[Deg86] Degond, P.: Local existence of solutions of the Vlasov–Maxwell equations and convergence to the Vlasov–Poisson equations for infinite light velocity. Math. Meth. Appl. Sci. 8, 533–558 (1986)
[dPLi89b] DiPerna, R.J., Lions, P.L.: Global weak solutions of Vlasov–Maxwell systems. Commun. Pure Appl. Math. 42, 729–757 (1989)
[Dob79] Dobrushin, R.L.: Vlasov equations, Funkts. Anal. Pril. 13(2), 48–58 (1979) [Engl. Transl. Funct. Anal. Appl. 13, 115–123 (1979)]
[Dud02] Dudley, R.M.: Real analysis and probability. Cambridge Univ. Press, Cambridge (2002)
[Ehl71] Ehlers, J.: General relativity and kinetic theory. In: Sachs, R.K. (ed.) General relativity and cosmology. Proceedings of the International Enrico Fermi School of Physics, vol. 47, pp. 1–70. Academic Press, New York (1971)
[Ehl73] Ehlers, J.: Survey of general relativity theory. In: Israel, W. (ed.) Relativity, Astrophysics and Cosmology. D. Reidel Publ. Co., Amsterdam (1973)
[FiEl98] Firpo, M.-C., Elskens, Y.: Kinetic limit of n-body description of wave-particle selfconsistent interaction. J. Stat. Phys. 93, 193–209 (1998)
[GaVi89] Ganguly, K., Victory, H.D. Jr.: On the convergence of particle methods for multidimensional Vlasov–Poisson systems. SIAM J. Num. Anal. 26, 249–288 (1989)
[GLV91] Ganguly, K., Lee, J.T., Victory, H.D. Jr.: On simulation methods for Vlasov–Poisson systems with particles initially asymptotically distributed. SIAM J. Num. Anal. 28, 1574–1609 (1991)
[GiSu02] Gibbs, A.L., Su, E.F.: On choosing and bounding probability metrics. Int. Stat. Rev. 70, 419–435 (2002)
[GiTr01] Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order, 3rd edn. Springer, New York (2001)
[Gla96] Glassey, R.: The Cauchy Problem in Kinetic Theory. SIAM, Philadelphia (1996)
[GlSch85] Glassey, R., Schaeffer, J.: On symmetric solutions to the relativistic Vlasov–Poisson system. Commun. Math. Phys. 101, 459–473 (1985)
[GlSch88] Glassey, R., Schaeffer, J.: Global existence for the relativistic Vlasov–Maxwell system with nearly neutral initial data. Commun. Math. Phys. 119, 353–384 (1988)
[GlSt86] Glassey, R., Strauss, W.: Singularity formation in a collisionless plasma could occur only at high velocities. Arch. Rat. Mech. Anal. 92, 59–90 (1986)
[GlSt87a] Glassey, R., Strauss, W.: Absence of shocks in an initially dilute collisionless plasma. Commun. Math. Phys. 113, 191–208 (1987)
[GlSt87b] Glassey, R., Strauss, W.: High velocity particles in a collisionless plasma. Math. Meth. Appl. Sci. 9, 46–52 (1987)
[HiSm74] Hirsch, M.W., Smale, S.: Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, New York (1974)
[Hor86] Horst, E.: Global Solutions of the Relativistic Vlasov–Maxwell System of Plasma Physics. Habilitationsschrift. Ludwig Maximilian Universität, München (1986)
[Ika00] Ikawa, M.: Hyperbolic Partial Differential Equations and Wave Phenomena. Trans. Math. Monogr. (B.I. Kurpita, trans.) vol. 189. Amer. Math. Soc., Providence (2000)
[Jan77] Janicke, L.: Non-linear electromagnetic waves in a relativistic plasma. J. Plasma Phys. 19, 209–228 (1977)
[Kie99] Kiessling, M.K.-H.: Classical electron theory and conservation laws. Phys. Lett. A 258, 197–204 (1999)
[Kie04] Kiessling, M.K.-H.: Electromagnetic field theory without divergence problems. 1. The Born legacy. J. Stat. Phys. 116, 1057–1122 (2004)
[KiTZ08] Kiessling, M.K.-H., Tahvildar-Zadeh, A.S.: On the relativistic Vlasov–Poisson system. Indiana Univ. Math. J. (2008, in press)
[KlSt02] Klainerman, S., Staffilani, G.: A new approach to study the Vlasov–Maxwell system. Commun. Pure Appl. Anal. 1, 103–125 (2002)
[KoSp00] Komech, A., Spohn, H.: Long-time asymptotics for the coupled Maxwell–Lorentz equations. Commun. PDE 25, 559–584 (2000)
[KKSp99] Komech, A., Kunze, M., Spohn, H.: Effective dynamics for a mechanical particle coupled to a wave field. Commun. Math. Phys. 203, 1–19 (1999)
[KSpK97] Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Commun. PDE 22, 307–335 (1997)
[KuRe01a] Kunze, M., Rendall, A.D.: The Vlasov–Poisson system with radiation damping. Ann. H. Poincaré 2, 857–886 (2001)
[KuRe01b] Kunze, M., Rendall, A.D.: Simplified models of electromagnetic and gravitational radiation damping. Class. Quantum Grav. 18, 3573–3587 (2001)
[KuSp00a] Kunze, M., Spohn, H.: Adiabatic limit of the Maxwell–Lorentz equations. Ann. Inst. H. Poincaré, Phys. Théor. 1, 625–653 (2000)
[KuSp00b] Kunze, M., Spohn, H.: Radiation reaction and center manifolds. SIAM J. Math. Anal. 32, 30–53 (2000)
[KuSp00c] Kunze, M., Spohn, H.: Slow motion of charges interacting through the maxwell field. Commun. Math. Phys. 212, 437–467 (2000)
[LuVl50] Luchina, A.A., Vlasov, A.A.: Chap. II, Sect. 12 in Ref. [Vla61]
[MMW84] Marsden, J., Morrison, P.J., Weinstein, A.: The Hamiltonian structure of the BBGKY hierarchy equations. In: Proceedings of “Fluids and Plasmas: Geometry and Dynamics” (Boulder, Colo., 1983). Contemp. Math. vol. 28, pp. 115–124. Amer. Math. Soc., Providence (1984)
[Mor80] Morrison, P.J.: The Maxwell-Vlasov equations as a continuous hamiltonian system. Phys. Lett. A 80, 383–386 (1980)
[NaSe81] Narnhofer, H., Sewell, G.L.: Vlasov hydrodynamics of a quantum mechanical model. Commun. Math. Phys. 79, 9–24 (1981)
[NeWi74] Neunzert, H., Wick, J.: Die Approximation der Lösung von Integro-Differentialgleichungen durch endliche Punktmengen. In: Numerische Behandlung nichtlinearer Integrodifferential- und Differentialgleichungen, Tagung, Math. Forschungsinst., Oberwolfach, 1973. Lecture Notes in Math. vol. 395, pp. 275–290. Springer, Berlin (1974)
[Neu85] Neunzert, H.: An introduction to the nonlinear Boltzmann-Vlasov equation. In: Kinetic theories and the Boltzmann equation, Montecatini, 1981. Proceedings, Lecture Notes in Math. vol. 1048, pp. 60–110. Springer, Berlin (1985)
[Pfa89] Pfaffelmoser, K.: Globale klassische Lösungen des dreidimensionalen Vlasov–Poissonsystems. Doctoral Dissertation, Ludwig Maximilian Universität, München (1989)
[Pfa92] Pfaffelmoser, K.: Global classical solutions of the Vlasov–Poisson system in three dimensions with generic initial data. J. Diff. Eqn. 95, 281–303 (1992)
[Rei90] Rein, G.: Generic global solutions of the relativistic Vlasov–Maxwell system of plasma physics. Commun. Math. Phys. 135, 41–78 (1990)
[Rei95] Rein, G.: The Vlasov–Einstein System with Surface Symmetry. Habilitationsschrift, Ludwig Maximilian Universität, München (1995)
[Rei97] Rein, G.: Self-gravitating systems in Newtonian theory—the Vlasov–Poisson system. In: Mathematics of Gravitation, Part I, Banach Center Publ. vol. 41, pp. 179–194. Polish Acad. Sci., Warszawa (1997)
[ReRe92] Rein, G., Rendall, A.D.: Global existence of solutions of the spherically symmetric Vlasov–Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992) [Errata. Commun. Math. Phys. 176, 475–478 (1996)]
[RRSch95] Rein, G., Rendall, A.D., Schaeffer, J.: A regularity theorem for solutions of the spherically symmetric Vlasov–Einstein system. Commun. Math. Phys. 168, 467–478 (1995)
[ReRo03] Rein, G., Rodewis, T.: Convergence of a particle-in-cell scheme for the spherically symmetric Vlasov–Einstein system. Indiana Univ. Math. J. 52, 821–862 (2003)
[Ren94] Rendall, A.: The newtonian limit for asymptotically flat solutions of the Vlasov–Einstein system. Commun. Math. Phys. 163, 89–112 (1994)
[Sch86] Schaeffer, J.: The classical limit of the relativistic Vlasov–Maxwell system. Commun. Math. Phys. 104, 403–421 (1986)
[Sch91] Schaeffer, J.: Global existence of smooth solutions to the Vlasov–Poisson system in three dimensions. Commun. PDE 16, 1313–1335 (1991)
[SchJ73] Schindler, K., Janicke, L.: Large amplitude electromagnetic waves in hot relativistic plasmas. Phys. Lett. 45A, 91–92 (1973)
[ShSt00] Shatah, J., Struwe, M.: Geometric wave equations. In: Courant Lect. Notes, vol. 2. 2nd edn, Courant Institute. Amer. Math. Soc., New York (2000)
[Spo91] Spohn, H.: Large scale dynamics of interacting particles. Texts and Monographs in Physics. Springer, Cambridge (1991)
[Spo04] Spohn, H.: Dynamics of Charged Particles and their Radiation Fields. Cambridge Univ. Press, Cambridge (2004)
[ViAl91] Victory, H.D. Jr., Allen, E.J.: The convergence theory of particle-in-cell methods for multidimensional Vlasov–Poisson systems. SIAM J. Num. Anal. 28, 1207–1241 (1991)
[VTG91] Victory, H.D. Jr., Tucker, G., Ganguly, K.: The convergence analysis of fully discretized particle methods for solving Vlasov–Poisson systems. SIAM J. Num. Anal. 28, 955–989 (1991)
[Vla38] Vlasov, A.A.: On vibrational properties of a gas of electrons. Zh. E.T.F. 8, 291–318 (1938)
[Vla61] Vlasov, A.A.: Many-particle theory and its application to plasma, In: Russian Monographs and Texts on Advanced Mathematics and Physics, vol. 7. Gordon and Breach, New York (1961); originally published by: Moscow and Leningrad, State Publishing House for Technical-Theoretical Literature (1950)
[WeMo81] Weinstein, A., Morrison, P.J.: Comments on: “the Maxwell-Vlasov equations as a continuous hamiltonian system” by Morrison. Phys. Lett. A 86, 235–236 (1981)
[Wol00] Wollman, S.: On the approximation of the Vlasov–Poisson system by particle methods. SIAM J. Num. Anal. 37, 1369–1398 (2000)
[WON01] Wollman, S., Ozizmir, E., Narasimhan, R.: The convergence of the particle method for the Vlasov–Poisson system with equally spaced initial data points. Transp. Theory Stat. Phys. 30, 1–62 (2001)
Communicated by H. Spohn
Commun. Math. Phys. 285, 713–762 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0619-x
Communications in
Mathematical Physics
Massless Sine-Gordon and Massive Thirring Models: Proof of Coleman’s Equivalence
G. Benfatto^1, P. Falco^2, V. Mastropietro^1
^1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, via della Ricerca Scientifica, I-00133 Roma, Italy. E-mail: [email protected]
^2 Mathematics Department, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
Received: 21 December 2007 / Accepted: 16 May 2008 Published online: 30 September 2008 – © Springer-Verlag 2008
Abstract: We prove Coleman’s conjecture on the equivalence between the massless Sine-Gordon model with finite volume interaction and the Thirring model with a finite volume mass term.

1. Introduction

1.1. Coleman’s Equivalence. One of the most fascinating aspects of QFT in d = 1+1 is the phenomenon of bosonization; fermionic systems can be mapped into bosonic ones and vice versa. The simplest example is provided by the equivalence between free massless Dirac fermions and free massless bosons with the identifications (see for instance [ID]):

$$\bar\psi_x(1 + \sigma\gamma_5)\psi_x \sim b_0 :\!e^{i\sigma\sqrt{4\pi}\phi_x}\!:, \qquad \bar\psi_x\gamma^\mu\psi_x \sim -\frac{1}{\sqrt{\pi}}\,\varepsilon^{\mu\nu}\partial_\nu\phi_x, \tag{1.1}$$
where $\sigma = \pm 1$ and $b_0$ is a suitable constant, depending on the precise definition of the Wick product. Such equivalence can be extended to interacting theories; Coleman [C] showed the equivalence, in the zero charge sector, between the massive Thirring model, with Lagrangian (with our conventions)

$$\mathcal{L} = iZ\bar\psi_x \not\!\partial\,\psi_x - Z_1\mu\,\bar\psi_x\psi_x - \frac{\lambda}{4}Z^2 j_{\mu,x}j^\mu_x, \tag{1.2}$$

where $Z$ and $Z_1$ are (formal) renormalization constants, $j_{\mu,x} = \bar\psi_x\gamma^\mu\psi_x$, and the massless Sine-Gordon model, with Lagrangian

$$\mathcal{L} = \frac{1}{2}\partial_\mu\varphi_x\partial^\mu\varphi_x + \zeta :\!\cos(\alpha\phi_x)\!: \tag{1.3}$$

with the identifications

$$Z_1\bar\psi_x(1 + \sigma\gamma_5)\psi_x \sim b_0 :\!e^{i\alpha\sigma\phi_x}\!:, \qquad Z\bar\psi_x\gamma^\mu\psi_x \sim -b_1\varepsilon^{\mu\nu}\partial_\nu\phi_x, \tag{1.4}$$
where $b_0$, $b_1$ are two suitable constants, depending on λ and the details of the ultraviolet regularization. Moreover, this equivalence is valid if certain relations between the Thirring parameters λ, µ and the Sine-Gordon parameters α, ζ are assumed. The case α² = 4π is special, as it corresponds to free fermions (λ = 0); the choice ζ = 0 (free bosons) corresponds to massless fermions (µ = 0). In order to establish such equivalence, Coleman considered a fixed infrared regularization of the models (1.2) and (1.3), replacing µ in (1.2) with µχΛ(x) and ζ with ζχΛ(x), with χΛ(x) a compactly supported function; this means that the mass term in the Thirring model and the interaction in the Sine-Gordon model are concentrated on a finite volume Λ. Such a regularization makes possible a perturbative expansion, in µ for the Thirring model and in ζ for the Sine-Gordon model; it turned out that the coefficients of these series expansions can be explicitly computed (in the case of the Thirring coefficients this was possible thanks to the explicit formulas for the correlations of the massless Thirring model given first in [Ha,K]) and that they are order by order identical if the identification (1.4) is made and suitable relations between the parameters are imposed. The identification of the coefficients of the series expansions would give a rigorous proof of the equivalence provided that the series are convergent. The issue of convergence, which was mentioned but not addressed in [C], is technically quite involved and crucial; there are several physical examples in which order by order arguments without convergence lead to incorrect predictions. The search for a rigorous proof of Coleman’s equivalence was the subject of intense investigation in the framework of constructive QFT, leading to a number of impressive results.
The equivalence between the massive Sine-Gordon model (with mass $M$ large enough) at $\alpha^2 < 4\pi$ and a Thirring model with a large long-range interaction was rigorously proved in [FS]; similar ideas were also used in [SU]. The properties of the massive Sine-Gordon model for $\alpha^2 \ge 4\pi$ were later investigated in depth. In [BGN] and [NRS] it was proved that the model is stable if one adds a finite number, increasing with $\alpha$, of vacuum counterterms, while the full construction of the model, through a cluster expansion, was partially realized in [DH]. In [DH] it was also proved that the correlation functions are analytic in $\zeta$, for any $\alpha^2 < 8\pi$. A proof of analyticity, based only on a multiscale analysis of the perturbative expansion, was first given in [B], for $\alpha^2 < 4\pi$, and then extended in [BK] up to $\alpha^2 < 16\pi/3$. Using the results in [DH] for a fixed finite volume, Dimock [D] was finally able to achieve a proof of Coleman’s equivalence, in the Euclidean version of the models, for the case $\alpha^2 = 4\pi$; such a value is quite special as it corresponds to $\lambda = 0$, that is, the equivalence is with a free massive fermionic system, without current-current interaction. This limitation was mainly due to the fact that the constructive analysis of interacting fermionic systems was much less developed at that time: indeed a rigorous construction of the massive Thirring model in a functional integral approach has been achieved only quite recently [BFM]. A more physically oriented line of research on Coleman’s equivalence focused on recovering bosonization in the framework of the (formal) path-integral approach [N,FGS]. The idea is to introduce a vector field $A_\mu$ and to use the identity
$$ \exp\Big\{-\frac{\lambda}{4}\int dx\, j_{\mu,x}\, j_{\mu,x}\Big\} = \int DA\, \exp\Big\{\int dx\, \big(-A_{\mu,x}^2 + \sqrt{\lambda}\, A_{\mu,x}\, j_{\mu,x}\big)\Big\}. \tag{1.5} $$
By parameterizing $A_\mu$ in terms of scalar fields $\xi_x$, $\phi_x$,
$$ A_\mu = \partial_\mu \xi_x + \varepsilon_{\mu,\nu}\,\partial_\nu \phi_x, \tag{1.6} $$
it turns out that the massive Thirring model can be expressed in terms of the boson fields $\xi_x$ and $\phi_x$: the first is a massless free field, while the second has an exponential interaction when $\mu \neq 0$. In the expectations of the operators $\bar\psi_x (1 + \sigma\gamma_5)\psi_x$ and $j_{\mu,x}$, the $\xi_x$ field plays no role and can be integrated out; the resulting correlations imply the identification (1.4). Such computations are however based on formal manipulations of functional integrals (with no cut-offs, hence formally infinite) and it is well known that such arguments can lead to incorrect results (see for instance the discussion in §1 of [BFM]). In this paper we will give the first proof of Coleman’s equivalence between the Euclidean massive Thirring model, with a small interaction and the mass term restricted to a fixed finite volume $\Lambda$, and the Euclidean massless Sine-Gordon model, with the interaction restricted to the same volume $\Lambda$ and $\alpha^2$ around $4\pi$. We will follow Coleman’s strategy, but an extension of the multiscale techniques developed in [B] for the Sine-Gordon model and in [BM,BFM] for the Thirring model allows us to achieve the convergence of the expansion.
1.2. Main results. We start from a suitable regularization of the Sine-Gordon and Thirring models via the introduction of infrared and ultraviolet cut-offs, which will be removed at the end, keeping fixed the volume $\Lambda$ in the interaction term of the Sine-Gordon model and in the mass term of the Thirring model. Let us consider first the (Euclidean) Sine-Gordon model. Let $\gamma > 1$, $h$ be a large negative integer ($\gamma^h$ is the infrared cutoff) and $N$ be a large positive integer ($\gamma^N$ is the ultraviolet cutoff). Moreover, let $\varphi_x$ be a 2-dimensional bosonic field and $P_{h,N}(d\varphi)$ be the Gaussian measure with covariance $C_{h,N}(x) \overset{def}{=} \sum_{j=h}^{N} C_0(\gamma^j x)$, for
$$ C_0(x) \overset{def}{=} \frac{1}{(2\pi)^2} \int \frac{dk}{k^2} \left[ e^{-k^2} - e^{-(\gamma k)^2} \right] e^{ikx}. \tag{1.7} $$
Given the two real parameters $\zeta$, the coupling, and $\alpha$ (related with the inverse temperature $\beta$, in the Coulomb gas interpretation of the model, by the relation $\beta = \alpha^2$), the Sine-Gordon model with finite volume interaction and ultraviolet and infrared cutoffs is defined by the interacting measure $P_{h,N}(d\varphi) \exp\{\zeta_N V(\varphi)\}$, with
$$ V(\varphi) = \int_\Lambda dx\, \cos(\alpha\varphi_x), \qquad \zeta_N = e^{\frac{\alpha^2}{2} C_{0,N}(0)}\, \zeta, \tag{1.8} $$
where $\Lambda$ is a fixed volume of size 1. Note that $\zeta_N V(\varphi) = \zeta \int_\Lambda dx\, {:}\cos(\alpha\varphi_x){:}$, where
$$ {:}e^{ia\varphi_x}{:} \overset{def}{=} e^{ia\varphi_x}\, e^{\frac{a^2}{2} C_{0,N}(0)} \tag{1.9} $$
is the Wick ordered exponential $e^{ia\varphi_x}$, $a \in \mathbb{R}$, with respect to the measure with covariance $C_{0,N}(x)$ (for any $h$); hence $\zeta_N$ has the role of the bare strength.

We consider now the Thirring model. The precise regularization of the path integral for fermions was already described in [BFM], §1.2, therefore we only recall the main features. We introduce in $\Lambda_L \equiv [-L/2, L/2] \times [-L/2, L/2]$ a lattice $\Lambda_a$ whose sites represent the space-time points. We also consider the lattice $D_a$ of space-time momenta $k = (k, k_0)$. We introduce a set of Grassmann spinors $\psi_k, \bar\psi_k$, $k \in D_a$, such that
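$\psi_k = (\psi^-_{k,+}, \psi^-_{k,-})$, $\bar\psi = \psi^+ \gamma^0$ and $\psi^+_k = (\psi^+_{k,+}, \psi^+_{k,-})$. The $\gamma$ matrices are explicitly given by
$$ \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^1 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \gamma^5 = -i\gamma^0\gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. $$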
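As a quick sanity check of these conventions, the following sketch multiplies the 2×2 matrices explicitly. It is only an illustration: the helper names `matmul`, `scale`, `add` and `anticommutator` are hypothetical and not part of the paper. It confirms that $-i\gamma^0\gamma^1$ is the diagonal matrix quoted for $\gamma^5$ and that $\gamma^0$, $\gamma^1$ satisfy the Euclidean relations $\{\gamma^\mu, \gamma^\nu\} = 2\delta_{\mu\nu}$.

```python
# 2x2 matrices are stored as nested tuples; entries may be complex.
def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return tuple(tuple(c * x for x in row) for row in a)

def add(a, b):
    """Entrywise sum of two matrices."""
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def anticommutator(a, b):
    """Return ab + ba."""
    return add(matmul(a, b), matmul(b, a))

IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))
GAMMA0 = ((0, 1), (1, 0))
GAMMA1 = ((0, -1j), (1j, 0))

# gamma^5 computed from its definition -i gamma^0 gamma^1
GAMMA5 = scale(-1j, matmul(GAMMA0, GAMMA1))
```

With these definitions `GAMMA5` evaluates to the diagonal matrix with entries 1 and -1, so the projectors $(1 \pm \gamma_5)/2$ appearing later select the two spinor components.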
We also define a Grassmann field on the lattice $\Lambda_a$ by Fourier transform, according to the following convention:
$$ \psi^{[h,N]\sigma}_{x,\omega} \overset{def}{=} \frac{1}{L^2} \sum_{k \in D_a} e^{i\sigma kx}\, \psi^{[h,N]\sigma}_{k,\omega}, \qquad x \in \Lambda_a. \tag{1.10} $$
Sometimes $\psi^{[h,N]\sigma}_{x,\omega}$ will be shortened into $\psi^{\sigma}_{x,\omega}$. Moreover, since the limit $a \to 0$ is trivial [BFM], we shall consider in the following $\psi^{[h,N]\sigma}_{x,\omega}$ as defined in the continuous box $\Lambda_L$. In order to introduce an ultraviolet and an infrared cutoff, we could use a gaussian cut-off as in (2.4), but for a technical reason, and to use the results of [BFM], we find it more convenient to use a compact support cut-off. We define the function $\chi_{h,N}(k)$ in the following way; let $\chi \in C^\infty(\mathbb{R}_+)$ be a Gevrey function of class 2, non-negative, non-increasing smooth function such that
$$ \chi(t) = \begin{cases} 1 & \text{if } 0 \le t \le 1, \\ 0 & \text{if } t \ge \gamma_0, \end{cases} \tag{1.11} $$
for a fixed choice of $\gamma_0$: $1 < \gamma_0 \le \gamma$; then we define, for any $h \le j \le N$,
$$ f_j(k) = \chi\big(\gamma^{-j}|k|\big) - \chi\big(\gamma^{-(j-1)}|k|\big) \tag{1.12} $$
and $\chi_{h,N}(k) = \sum_{j=h}^{N} f_j(k)$; hence $\chi_{h,N}(k)$ acts as a smooth cutoff for momenta $|k| \ge \gamma^{N+1}$ and $|k| \le \gamma^{h-1}$. Given two real parameters, the bare coupling $\lambda$ and the bare mass $\mu$, the Thirring model with finite volume mass term and ultraviolet and infrared cutoffs is defined by the interacting measure $P_{h,N}(d\psi) \exp\{V(\psi)\}$, with
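$$ V(\psi) = -\frac{\lambda}{4} Z_N^2 \int_{\Lambda_L} dx\, \big(\bar\psi_x \gamma^\mu \psi_x\big)^2 + Z_N^{(1)} \mu \int_{\Lambda} dx\, \bar\psi_x \psi_x + E_{h,N} |\Lambda_L| \tag{1.13} $$
and
$$ P_{h,N}(d\psi) \overset{def}{=} d\psi \prod_{k \in D^{[h,N]}} \Big[ L^{-4} Z_N^2\, (-|k|^2)\, C_{h,N}(k)^2 \Big]^{-1} \cdot \exp\Big\{ -Z_N \frac{1}{L^2} \sum_{\omega = \pm} \sum_{k \in D^{[h,N]}} \frac{D_\omega(k)}{\chi_{h,N}(k)}\, \psi^+_{k,\omega} \psi^-_{k,\omega} \Big\}, \tag{1.14} $$
where $D_\omega(k) \overset{def}{=} -ik_0 + \omega k_1$ and $E_{h,N}$ is a constant chosen so that, if $\mu = 0$, $\int P_{h,N}(d\psi) \exp\{V(\psi)\} = 1$. We will prove the following theorem.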
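Theorem 1.1. Let $\Lambda$ be a fixed volume of size 1 and assume $|\zeta|, |\lambda|, |\mu|$ small enough, $\alpha^2 < 16\pi/3$; then there exist two constants $\eta_- = a\lambda^2 + O(\lambda^3)$ and $\eta_+ = b\lambda + O(\lambda^2)$, with $a, b > 0$, independent of $\mu$ and analytic in $\lambda$, such that, if we put
$$ Z_N = \gamma^{-\eta_- N}, \qquad Z_N^{(1)} = \gamma^{-\eta_+ N}, \tag{1.15} $$
then, if $r = 0$ and $q \ge 2$ or $r \ge 1$, for any choice of the non coinciding points $(x_1, \dots, x_q, y_1, \dots, y_r)$, and of $\sigma_i = \pm 1$, $i = 1, \dots, q$, $\nu_j = 0, 1$, $j = 1, \dots, r$,
$$ \begin{aligned} &\lim_{-h,N \to \infty} \Big\langle \Big[ \prod_{i=1}^{q} {:}e^{i\sigma_i \alpha \varphi_{x_i}}{:} \Big] \Big[ \prod_{j=1}^{r} (-1)\, \varepsilon^{\nu_j \mu}\, \partial^\mu \varphi_{y_j} \Big] \Big\rangle^T_{SG} \\ &\quad = \lim_{-h,N \to \infty} \big(b_0 Z_N^{(1)}\big)^q \big(b_1 Z_N\big)^r \Big\langle \Big[ \prod_{i=1}^{q} \bar\psi_{x_i} \frac{1 + \sigma_i \gamma_5}{2} \psi_{x_i} \Big] \Big[ \prod_{j=1}^{r} \bar\psi_{y_j} \gamma^{\nu_j} \psi_{y_j} \Big] \Big\rangle^T_{Th}, \end{aligned} \tag{1.16} $$
where $\langle \cdot \rangle^T_{Th}$ and $\langle \cdot \rangle^T_{SG}$ denote the truncated expectations in the Thirring (in the limit $L \to \infty$) and Sine-Gordon models, respectively, $b_0$ and $b_1$ are bounded functions of $\lambda$ and the following relations between the parameters of the two models have to be verified:
$$ \frac{\alpha^2}{4\pi} = 1 + \eta_- - \eta_+, \qquad \zeta = b_0 \mu. \tag{1.17} $$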
If $q = 1$ and $r = 0$ both the r.h.s. and the l.h.s. of (1.16) are diverging for $\lambda \le 0$, while the equality still holds for $\lambda > 0$. A divergence also appears, for $\lambda \le 0$, in the pressure, but only for the second order term in $\zeta$ or $\mu$; however, if we add a suitable vacuum counterterm, the pressures are also equal.

This theorem proves Coleman’s equivalence (1.4). We remark that the relations between the Sine-Gordon parameters and the Thirring parameters in (1.17) are slightly different from those in [C], for $\lambda \neq 0$; this is true in particular for the first equation, involving only quantities which have a physical meaning in the removed cutoff limit, if we express them in terms of $\lambda$, as Coleman does. This is not surprising, as the relations between the physical quantities, like the critical indices $\eta_\pm$, and the bare coupling depend on the details of the regularization, and in our Renormalization Group analysis the running coupling constants have a bounded but non-trivial flow from the ultraviolet to the infrared scales. Indeed, with a different regularization of the Thirring model (that is, starting from a non-local current-current interaction and performing the local limit after the limit $N \to \infty$), as in [M1,M2], one would get a simple relation between $\alpha$ and $\lambda$. This new relation again is not equal to that of [C], but is in agreement with the regularization procedure of [J], see footnote 7 of [C].

Another important remark concerns the limit $\Lambda \to \infty$. In the case of the Sine-Gordon model, one expects that, in this limit, there is an exponential decrease of correlations (implying the screening phenomenon in the Coulomb gas interpretation), which is not compatible with convergence of the perturbative expansion (in this case the correlations would have a power decay as in the free theory). Up to now, screening has been proved only for $\alpha^2 \ll 4\pi$ [Y], by extending to dimension two the analogous result obtained in three dimensions by Brydges and Federbush [BF], but screening is expected to hold in the whole range of validity of the model ($\alpha^2 < 8\pi$), hence even around $\alpha^2 = 4\pi$.
However, if the interaction is restricted to a fixed finite volume, convergence is possible and we could indeed prove it, for $\alpha^2 < 6\pi$; in this paper, for simplicity, we give the proof only for $\alpha^2 < 16\pi/3$, which is sufficient to state the main result. The situation for the massive Thirring model is slightly different, because it has been shown [BFM] that it is well defined in the limit $\Lambda \to \infty$ and that its correlations decay at least as $\exp(-c\,|\mu|^{1+O(\lambda)} |x|)$. Hence, even if the power expansion in the mass can be convergent only if we fix the volume, the proof of Coleman’s conjecture strongly supports the related conjecture that even the Sine-Gordon model is well defined around $\alpha^2 = 4\pi$ in the infinite volume limit and has exponential decrease of correlations.

The proof is organized in the following way. In §2 we analyze the massless Sine-Gordon model with finite volume interaction and $\alpha^2 < 16\pi/3$, extending the proof of analyticity in $\zeta$ given in [B] for the massive case in the infinite volume limit and $\alpha^2 < 4\pi$. With respect to the technique used in [D], where only the case $\alpha^2 = 4\pi$ was analyzed, our method has the advantage that an explicit expression of the coefficients can be easily achieved; this is probably possible even with the other method, but the proof was given only for $\alpha^2 < 4\pi$ and, as a consequence, the correlations in the model with $\alpha^2 = 4\pi$ were defined as the limit $\alpha^2 \to 4\pi$ of those with $\alpha^2 < 4\pi$. In §3 we use the methods developed in [BFM,M1,M2] to prove the analyticity in $\mu$ of the Thirring model; the explicit expressions of the coefficients are obtained in §4, by using the explicit expression of the field correlation functions given in the Appendix (through the solution of a Schwinger-Dyson equation, based on a rigorous implementation of Ward Identities) and by a rigorous implementation, in a RG context, of the point splitting procedure used in theoretical physics.
An important role in the analysis is also played by the proof of the following exact relation between critical indices:
$$ (1 + \eta_-)^2 = 1 + \eta_+^2, \tag{1.18} $$
which is used in order to exclude the presence of an extra massless Gaussian field $\xi_x$ in the second part of (1.4).

2. The Massless Sine-Gordon Model with a Finite Volume Interaction

We want to study the measure defined in §1.2 in the limit of removed cutoff, $-h, N \to \infty$. To this purpose, we consider the generating functional, $K_{h,N}(J, A, \zeta)$, defined by the equation
$$ K_{h,N}(J, A, \zeta) = \log \int P_{h,N}(d\varphi)\, e^{\zeta_N V(\varphi)} \cdot \exp\Big\{ \sum_{\sigma = \pm 1} \int dx\, J^\sigma_x\, {:}e^{i\alpha\sigma\varphi_x}{:} + \sum_{\nu = 0,1} \int dy\, A^\nu_y\, \partial^\nu \varphi_y \Big\}, \tag{2.1} $$
where $J^\sigma_z$ and $A^\mu_y$ are two-dimensional, external bosonic fields. Then, given two non-negative integers $q$ and $r$, as well as two sets of labels $\sigma = (\sigma_1, \dots, \sigma_q)$ and $\nu = (\nu_1, \dots, \nu_r)$, together with two sets of two by two distinct points $z = (z_1, \dots, z_q)$ and $y = (y_1, \dots, y_r)$, we consider the Schwinger functions, defined by the equation
$$ K^{(q,r;\zeta)}_{h,N}(z, y; \sigma, \nu) \overset{def}{=} \frac{\partial^{q+r} K_{h,N}}{\partial J^{\sigma_1}_{z_1} \cdots \partial J^{\sigma_q}_{z_q}\, \partial A^{\nu_1}_{y_1} \cdots \partial A^{\nu_r}_{y_r}} (0, 0, \zeta). \tag{2.2} $$
Theorem 2.1. If $|\zeta|$ is small enough, $\alpha^2 < 16\pi/3$ and $q \ge 2$, if $r = 0$, or $q \ge 0$, if $r \ge 1$, the limit
$$ K^{(q,r;\zeta)}(z, y; \sigma, \nu) \overset{def}{=} \lim_{-h,N \to +\infty} K^{(q,r;\zeta)}_{h,N}(z, y; \sigma, \nu) \tag{2.3} $$
exists and is analytic in $\zeta$. In the case $q = r = 0$ (the pressure), the limit does exist and is analytic, up to a divergence in the second order term, present only for $\alpha^2 \ge 4\pi$.

For clarity’s sake, we prefer to give the proof of the above theorem in the special cases $(q, r) = (k, 0)$ and $(q, r) = (0, k)$ separately; the proof in the general case is a consequence of the very same ideas that will be discussed for the special ones, but it needs a more involved notation, so we will not report its details.
2.1. The free measure. By the definitions given in §1.2, the regularized free measure is the two-dimensional boson Gaussian measure with covariance
$$ C_{h,N}(x) = \frac{1}{(2\pi)^2} \int \frac{dk}{k^2} \left[ e^{-(\gamma^{-N} k)^2} - e^{-(\gamma^{-h+1} k)^2} \right] e^{ikx} = \sum_{j=h}^{N} C_0(\gamma^j x). \tag{2.4} $$
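The equality between the two forms of the covariance in (2.4) is a telescoping sum in momentum space: after rescaling $k \to \gamma^{-j} k$, the slice $C_0(\gamma^j x)$ carries the weight $e^{-(\gamma^{-j}k)^2} - e^{-(\gamma^{-j+1}k)^2}$. The sketch below checks this numerically for sample values; the function names and the sample parameters are illustrative assumptions, not taken from the paper.

```python
import math

def slice_weight(k, j, gamma):
    """Momentum-space weight of the single-scale covariance C_0(gamma**j * x)."""
    return math.exp(-(gamma ** (-j) * k) ** 2) - math.exp(-(gamma ** (-j + 1) * k) ** 2)

def summed_weight(k, h, N, gamma):
    """Sum of the single-scale weights for j = h, ..., N."""
    return sum(slice_weight(k, j, gamma) for j in range(h, N + 1))

def closed_form_weight(k, h, N, gamma):
    """Weight appearing in the integral representation of C_{h,N}(x)."""
    return math.exp(-(gamma ** (-N) * k) ** 2) - math.exp(-(gamma ** (-h + 1) * k) ** 2)
```

For instance, `summed_weight(1.0, -3, 4, 2.0)` and `closed_form_weight(1.0, -3, 4, 2.0)` agree up to rounding, since all intermediate exponentials cancel in pairs.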
The two-dimensional massless boson Gaussian measure is obtained by taking the limits $h \to -\infty$ and $N \to \infty$. It is easy to prove that
$$ C_0(0) = \frac{\log\gamma}{2\pi}, \qquad \big| \partial^{q_0}_{x_0} \partial^{q_1}_{x_1} C_0(x) \big| \le A_{q_0,q_1,\kappa}\, e^{-\kappa|x|}, \tag{2.5} $$
where $q_0, q_1$ are non-negative integers and $\kappa$ is an arbitrary positive constant. Let us now consider the function
$$ C_{h,\infty}(x) = \lim_{N \to \infty} C_{h,N}(x) = C_{0,\infty}(\gamma^h x). \tag{2.6} $$
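The value of $C_0(0)$ quoted in (2.5) can be checked numerically. In polar coordinates the definition of $C_0$ at $x = 0$ reduces to $(2\pi)^{-1} \int_0^\infty (dk/k)\, [e^{-k^2} - e^{-(\gamma k)^2}]$; the sketch below evaluates this with a trapezoidal rule after the substitution $k = e^u$. The truncation window and the number of steps are assumptions chosen for illustration only.

```python
import math

def c0_at_origin(gamma, u_min=-30.0, u_max=5.0, steps=20000):
    """Trapezoidal estimate of C_0(0), using the substitution |k| = exp(u)."""
    def integrand(u):
        k_squared = math.exp(2.0 * u)
        return math.exp(-k_squared) - math.exp(-gamma * gamma * k_squared)

    du = (u_max - u_min) / steps
    total = 0.5 * (integrand(u_min) + integrand(u_max))
    for i in range(1, steps):
        total += integrand(u_min + i * du)
    # the angular integration gives 2*pi, which combines with 1/(2*pi)**2
    return total * du / (2.0 * math.pi)
```

Comparing `c0_at_origin(2.0)` with `math.log(2.0) / (2 * math.pi)` reproduces the closed form to high accuracy, because the integrand is smooth and vanishes very rapidly at both ends of the window.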
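It is easy to show, by a standard calculation, that there exists a constant $c$ such that
$$ \Big| C_{0,\infty}(x) + \frac{1}{4\pi} \log(c|x|^2) \Big| \le C|x|^2. \tag{2.7} $$
Hence, $C_{h,\infty}(x)$ diverges for $h \to -\infty$ as $-(2\pi)^{-1} \log(\gamma^h |x|)$. However, if we define
$$ \Delta^{-1}_{h,\infty}(x) = C_{h,\infty}(x) + \frac{1}{4\pi} \log(c\gamma^{2h}), \tag{2.8} $$
we have, by (2.7):
$$ \Delta^{-1}(x) \overset{def}{=} \lim_{h \to -\infty} \Delta^{-1}_{h,\infty}(x) = -\frac{1}{2\pi} \log|x|. \tag{2.9} $$
Then it is natural to define the Coulomb potential with ultraviolet cutoff by
$$ \Delta^{-1}_N(x) \overset{def}{=} \lim_{h \to -\infty} \Delta^{-1}_{h,N}(x), \qquad \Delta^{-1}_{h,N}(x) \overset{def}{=} C_{h,N}(x) + \frac{1}{4\pi} \log(c\gamma^{2h}). \tag{2.10} $$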
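Since $C_{h,N}(x) = C_{h,\infty}(x) - C_{N,\infty}(x)$, by using (2.5) and (2.9), we see that
$$ \Big| \Delta^{-1}_N(x) + \frac{1}{2\pi} \log|x| \Big| \le C e^{-\kappa\gamma^N |x|}, \qquad \gamma^N |x| \ge 1, \tag{2.11} $$
and, by using (2.7), we see that
$$ \Big| \Delta^{-1}_N(x) - \frac{1}{4\pi} \log(c\gamma^{2N}) \Big| \le C\gamma^{2N} |x|^2, \qquad \gamma^N |x| \le 1. \tag{2.12} $$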
We define $E_{h,N}$ and $E_j$ to be the expectation with respect to the Gaussian measures with covariance $C_{h,N}(x)$ and $C_j(x) = C_0(\gamma^j x)$, respectively; a superscript $T$ in the expectation will indicate a truncated expectation. Recall that, for a generic probability measure with expectation $E$, and any family of random variables $(f_1, \dots, f_s)$, $E^T$ is defined as
$$ E^T[f_1; \dots; f_s] = \sum_{\Pi} (-1)^{|\Pi|-1} (|\Pi| - 1)! \prod_{X \in \Pi} E\Big[ \prod_{i \in X} f_i \Big], \tag{2.13} $$
where $\sum_\Pi$ denotes the sum over the partitions of the set $(1, \dots, s)$. Finally we recall that ${:}e^{i\sigma\beta\varphi_x}{:}$ is the Wick normal ordering of $e^{i\sigma\beta\varphi_x}$ always taken with respect to the measure with covariance $C_{0,N}(x)$ (see definitions in §1.2).
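Definition (2.13) of the truncated expectation as a signed sum over set partitions can be made concrete with a short sketch. The code assumes that the plain expectations are supplied by a user-provided function `expectation(block)`, returning $E[\prod_{i \in \text{block}} f_i]$ for a tuple of indices; this interface, like the function names, is a hypothetical choice made only for illustration.

```python
from math import factorial, prod

def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]

def truncated_expectation(expectation, s):
    """E^T[f_1; ...; f_s] computed from the partition formula."""
    total = 0
    for partition in set_partitions(list(range(s))):
        m = len(partition)
        term = prod(expectation(tuple(sorted(block))) for block in partition)
        total += (-1) ** (m - 1) * factorial(m - 1) * term
    return total
```

For $s = 2$ the formula reduces to the covariance $E[f_1 f_2] - E[f_1]E[f_2]$, and for independent variables every truncated expectation of order $s \ge 2$ vanishes, which is the property exploited in the cluster bounds.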
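Lemma 2.1. Let $\sigma_i \in \{-1, +1\}$, $i = 1, \dots, n$, and $\alpha \in \mathbb{R}$. If $Q \overset{def}{=} \sum_r \sigma_r$, then
$$ \lim_{h \to -\infty} E_{h,N}\Big[ \prod_{r=1}^{n} {:}e^{i\alpha\sigma_r \varphi_{x_r}}{:} \Big] = \delta_{Q,0}\, c^{-\frac{\alpha^2}{8\pi} n}\, e^{-\alpha^2 \sum_{r<s} \sigma_r \sigma_s \Delta^{-1}_N(x_r - x_s)}. \tag{2.14} $$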
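Proof. We first notice that if the Wick product had been defined with respect to the covariance $C_{h,N}$, then $\log E_{h,N}\big[ \prod_{r=1}^{n} {:}e^{i\alpha\sigma_r \varphi_{x_r}}{:} \big]$ would have been equal to $-\alpha^2 \sum_{r<s} \sigma_r \sigma_s C_{h,N}(x_r - x_s)$. Hence, by definition (2.10), we get
$$ \begin{aligned} \log E_{h,N}\Big[ \prod_{r=1}^{n} {:}e^{i\alpha\sigma_r \varphi_{x_r}}{:} \Big] &= \frac{\alpha^2}{4\pi}\, hn \log\gamma - \alpha^2 \sum_{r<s} \sigma_r \sigma_s C_{h,N}(x_r - x_s) \\ &= \frac{\alpha^2}{4\pi}\, h Q^2 \log\gamma + \frac{\alpha^2}{8\pi} (Q^2 - n) \log c - \alpha^2 \sum_{r<s} \sigma_r \sigma_s \Delta^{-1}_{h,N}(x_r - x_s), \end{aligned} $$
which immediately implies the lemma. □

If $E$ is the expectation $E_{h,N}$ in the limit $-h, N \to \infty$, by taking the limit $N \to \infty$ in the r.h.s. of (2.14), we get, in the case $Q = 0$,
$$ E\Big[ \prod_{r=1}^{n} {:}e^{i\alpha\sigma_r \varphi_{x_r}}{:} \Big] = \delta_{Q,0}\, c^{-\frac{\alpha^2}{8\pi} n} \prod_{r<s} |x_r - x_s|^{\sigma_r \sigma_s \frac{\alpha^2}{2\pi}}. \tag{2.15} $$
We are now ready to consider the interacting measure.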
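The second equality in the proof of Lemma 2.1 uses the elementary identity $2 \sum_{r<s} \sigma_r \sigma_s = Q^2 - n$, valid for charges $\sigma_r = \pm 1$ with total charge $Q$, which converts the pairwise sum of the constants $\frac{1}{4\pi}\log(c\gamma^{2h})$ into the terms proportional to $Q^2$ and $Q^2 - n$. A brute-force check over all charge assignments, written as a sketch with hypothetical function names, reads as follows.

```python
from itertools import product

def pair_sum(charges):
    """Sum of sigma_r * sigma_s over all pairs r < s."""
    n = len(charges)
    return sum(charges[r] * charges[s] for r in range(n) for s in range(r + 1, n))

def identity_holds(n):
    """Check 2 * sum_{r<s} sigma_r sigma_s == Q**2 - n for all 2**n assignments."""
    return all(
        2 * pair_sum(charges) == sum(charges) ** 2 - n
        for charges in product((-1, 1), repeat=n)
    )
```

The identity follows from expanding $Q^2 = \sum_r \sigma_r^2 + 2 \sum_{r<s} \sigma_r \sigma_s$ with $\sigma_r^2 = 1$; the enumeration merely confirms it for small $n$.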
2.2. The case q = r = 0 (the pressure). To begin with we analyze the pressure:
$$ p(\zeta) \overset{def}{=} \lim_{-h,N \to \infty} \log Z_{h,N}(\zeta), \qquad Z_{h,N}(\zeta) \overset{def}{=} \int P_{h,N}(d\varphi)\, e^{\zeta_N V(\varphi)}. \tag{2.16} $$
We proceed as in [B], by studying the multiscale expansion associated with the following decomposition of the covariance:
$$ C_{h,N}(x) = \sum_{j=0}^{N} C_j(x) + C_{h,-1}(x), \qquad C_j(x) \overset{def}{=} C_0(\gamma^j x). \tag{2.17} $$
In comparison with [B], where the case $h = 0$ (the “Yukawa gas”) was considered, here we are collecting in a single integration step all scales below $h = 0$: as we shall see, this is effective since the volume size is fixed to be 1. To simplify the notation, from now on $E_{-1}$ will denote the expectation w.r.t. $C_{h,-1}(x)$, while $E_j$ will have the previous meaning for $j \ge 0$.

Let $T_n^{(N)}$, $n \ge 2$, be the family of labelled trees with the following properties:
1) there is a root $r$ and $n$ ordered endpoints $e_i$, $i = 1, \dots, n$, which are connected by the tree; the tree is ordered from the root to the endpoints;
2) each vertex $v$ carries a frequency label $h_v$, which is an integer taking values between $-1$ and $N + 1$, with the condition that $h_u < h_v$, if $u$ precedes $v$ in the order of the tree; moreover, the root has frequency $-1$ and the endpoint $e_i$ has frequency $h_i + 1$, if $h_i$ is the frequency of the higher vertex preceding it;
3) the endpoint $e_i$ carries two other labels, the charge $\sigma_i$ and the position $x_i$.
These trees differ from those used in [BFM] for the Thirring model, because there are no “trivial vertices” on the lines of the tree. Since $T_n^{(N)} \subset T_n^{(N+1)}$, then $T_n^{(\infty)} = \lim_{N \to \infty} T_n^{(N)}$ is the family of trees whose frequency labels are free to vary between $-1$ and $\infty$. We shall also use the following definitions:
a) Given a tree $\tau$, we shall call non trivial (n.t. in the following) the tree vertices different from the root and from the endpoints. If $v \in \tau$ is a n.t. vertex, $s_v \ge 2$ will denote the number of lines branching from $v$ in the positive direction, $v' \in \tau$ is the higher non trivial vertex preceding $v$, if it does exist, or the root, otherwise. Moreover, $X_v$ will be the set of endpoints following $v$ along the tree; $X_v$ will be called the cluster of $v$ and $n_v$ will denote the number of its elements. If $v$ is an endpoint, $X_v$ will denote the endpoint itself. Finally we define $\Phi(X_v) = \sum_{i : e_i \in X_v} \sigma_i \varphi_{x_i}$.
b) Given a n.t. vertex $v$ and an integer $j \in [-1, N]$, we shall denote
$$ U_j(v) \overset{def}{=} \sum_{r,m :\, e_r, e_m \in X_v} \sigma_r \sigma_m C_j(x_r - x_m) = E_j\big[ \Phi^2(X_v) \big] \ge 0 \tag{2.18} $$
the (double of the) total energy on scale $j$ associated with its cluster. If $k' + 1 \le k - 1$, we shall also define
$$ U_{k',k}(v) \overset{def}{=} \sum_{j=k'+1}^{k-1} U_j(v). \tag{2.19} $$
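c) If $X$ and $Y$ are two disjoint clusters and $j$ is an integer contained in $[-1, N]$, we denote
$$ W_j(X, Y) \overset{def}{=} \sum_{r,m :\, e_r \in X,\, e_m \in Y} \sigma_r \sigma_m C_j(x_r - x_m) = E_j[\Phi(X)\Phi(Y)] \tag{2.20} $$
the interaction energy on scale $j$ between $X$ and $Y$.
d) Given a n.t. vertex $v$, $(v_1, \dots, v_{s_v})$ will denote the set of vertices following it along the tree; moreover we define
$$ G_j(v_1, \dots, v_{s_v}) = E^T_j\big[ e^{i\alpha\Phi(X_{v_1})}; \dots; e^{i\alpha\Phi(X_{v_{s_v}})} \big]. \tag{2.21} $$
By proceeding as in [B], it is easy to see that
$$ Z_{h,N}(\zeta) = \int P_{h,-1}(d\varphi)\, e^{\zeta V(\varphi) + \sum_{n=2}^{\infty} \zeta^n V_n^{(N)}(\varphi)}, \tag{2.22} $$
where
$$ V_n^{(N)}(\varphi) = \sum_{\tau \in T_n^{(N)}} \frac{1}{2^n} \sum_{\sigma_1, \dots, \sigma_n} \int_{\Lambda^n} dx_1 \cdots dx_n\, e^{i\alpha \sum_{r=1}^{n} \sigma_r \varphi_{x_r}}\, V_\tau(\sigma, x), \tag{2.23} $$
$$ V_\tau(\sigma, x) = \Big[ \prod_{i=1}^{n} \gamma^{\frac{\alpha^2}{4\pi} h_{e_i}} \Big] \prod_{\text{n.t. } v \in \tau} \frac{1}{s_v!}\, G_{h_v}(v_1, \dots, v_{s_v})\, e^{-\frac{\alpha^2}{2} U_{h_{v'},h_v}(v)}. \tag{2.24} $$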
We note that $V_\tau(\sigma, x)$ is independent of $N$ and $h$. In order to prove that the pressure, see (2.16), is well defined, the main step is to verify that, uniformly in $h$ and $N$, $Z_{h,N}(\zeta) = 1 + O(\zeta)$. As we will discuss later in this section, since the only dependence on $h$ in (2.22) is through the measure $P_{h,-1}(d\varphi)$, which has support on smooth functions for any $h$, the wanted bound for $Z_{h,N}(\zeta)$ is an easy consequence of a uniform $C^n$ bound of $V_n^{(N)}(\varphi)$. Since
$$ |V_n^{(N)}(\varphi)| \le \sum_{\tau \in T_n^{(N)}} b_\tau, \qquad b_\tau \overset{def}{=} \frac{1}{2^n} \sum_{\sigma_1, \dots, \sigma_n} \int_{\Lambda^n} dx_1 \cdots dx_n\, |V_\tau(\sigma, x)|, \tag{2.25} $$
and the number of trees is of order $C^n$, we shall look for a “good” bound of $b_\tau$. The main ingredients in this task are the positivity of $U_j(v)$, see (2.18), and the Battle-Federbush formula for the truncated expectations (see [Br]):
$$ G_j(v_1, \dots, v_s) = \sum_{T \in \bar T_s} \prod_{\langle r,m \rangle \in T} \big[ -\alpha^2 W_j(X_{v_r}, X_{v_m}) \big] \int dp_T(t)\, e^{-\frac{\alpha^2}{2} U_j(v,t)}, \tag{2.26} $$
where $s = s_v$, $\bar T_s$ is the family of connected tree graphs on the set of integers $\{1, \dots, s\}$, $\frac{1}{2} U_j(v, t)$ is obtained by taking a sequence of convex linear combinations, with parameters $t$, of the energies of suitable subsets of $X_v = \cup_i X_{v_i}$ (hence $U_j(v, t) \ge 0$) and $dp_T(t)$ is a probability measure.
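By using (2.24), (2.26) and (2.20), we get
$$ |V_\tau(\sigma, x)| \le \Big[ \prod_{i=1}^{n} \gamma^{\frac{\alpha^2}{4\pi} h_{e_i}} \Big] \cdot \prod_{\text{n.t. } v \in \tau} \frac{\alpha^{2(s_v - 1)}}{s_v!} \sum_{T \in \bar T_{s_v}} \prod_{\langle r,m \rangle \in T}\ \sum_{e \in X_{v_r},\, e' \in X_{v_m}} \big| C_{h_v}(x_e - x_{e'}) \big|. \tag{2.27} $$
On the other hand, for any given $\varepsilon > 0$, we can use the bound
$$ \sum_{T \in \bar T_s} \prod_{\langle r,m \rangle \in T} n_{v_r} n_{v_m} \le s!\, \varepsilon^{-2(s-1)} \prod_{r=1}^{s} e^{2\varepsilon n_{v_r}}. \tag{2.28} $$
Moreover, by (2.17) and (2.5), for $h_v \ge 0$,
$$ \int_\Lambda dx\, |C_{h_v}(x)| \le C\gamma^{-2h_v}. \tag{2.29} $$
The trees, as defined after (2.17), satisfy the following identity: $\sum_{w \ge v} (s_w - 1) = n_v - 1$; as a consequence, if $v_0$ is the first non trivial vertex of $\tau$,
$$ \sum_{\text{n.t. } v \in \tau} h_v (s_v - 1) = h_r (n_{v_0} - 1) + \sum_{\text{n.t. } v \in \tau} (h_v - h_{v'})(n_v - 1), $$
where $v'$ is the n.t. vertex immediately preceding $v$ or the root, if $v = v_0$. This allows us to write:
$$ b_\tau \le C_\varepsilon^n \Big[ \prod_{i=1}^{n} \gamma^{\frac{\alpha^2}{4\pi} h_{e_i}} \Big] \prod_{\text{n.t. } v \in \tau} \gamma^{-2h_v(s_v - 1)} e^{2\varepsilon n_v} \le C_\varepsilon^n \prod_{\text{n.t. } v \in \tau} \gamma^{-(h_v - h_{v'})[D(n_v) - 2\varepsilon n_v]}, \tag{2.30} $$
where the dimension $D(n)$ is given by
$$ D(n) = 2(n - 1) - \frac{\alpha^2}{4\pi}\, n. \tag{2.31} $$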
Let us consider first the case $\alpha^2 < 4\pi$. This condition implies that $D(n) > 0$ for any $n \ge 2$; hence, the bound (2.30) implies in the usual way that $|V_\tau(\sigma, x)| \le C_\alpha^n$, for a constant $C_\alpha$, which diverges as $\alpha^2 \to (4\pi)^-$; since $|\Lambda| = 1$, this bound is valid also for $b_\tau$. By a little further effort, see below, one can then prove that the pressure $p(\zeta)$ is an analytic function of $\zeta$, for $\zeta$ small enough.

If $4\pi \le \alpha^2 < 16\pi/3$, $D(n) > 0$ only for $n \ge 3$, so that the previous bound is divergent for all trees containing at least one vertex with $n_v = 2$. In particular, $V_2^{(N)}(\varphi)$ diverges as $N \to \infty$; this divergence is related with the fact that the term of order $\zeta^2$ and $\sigma_1 = -\sigma_2$ in the perturbative expansion of the pressure is really divergent, as one can easily check. The only way to cure this specific divergence is to renormalize the model by subtracting a suitable constant of order $\zeta^2$ from the potential, as we shall see below. However, all other terms, even those associated with a tree containing at least one vertex with $n_v = 2$, are indeed bounded uniformly in $N$; in order to prove this claim, we need to improve the bound (2.30) by the two following lemmas.
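The thresholds $4\pi$ and $16\pi/3$ follow directly from the formula $D(n) = 2(n-1) - \alpha^2 n/(4\pi)$: the condition $D(n) > 0$ is equivalent to $\alpha^2 < 8\pi(n-1)/n$. The sketch below tabulates this; the function names are illustrative and not from the paper.

```python
import math

def dimension(n, alpha_sq):
    """Scaling dimension D(n) = 2(n - 1) - alpha^2 n / (4 pi) of an n-point cluster."""
    return 2 * (n - 1) - alpha_sq * n / (4 * math.pi)

def threshold(n):
    """Value of alpha^2 below which D(n) > 0, namely 8 pi (n - 1) / n."""
    return 8 * math.pi * (n - 1) / n

def improved_dimension_equal_charges(alpha_sq):
    """D(2) + alpha^2 / pi, the gain available for two endpoints of equal charge."""
    return dimension(2, alpha_sq) + alpha_sq / math.pi

def improved_dimension_neutral_pair(alpha_sq):
    """D(2) + 1, the gain available for two endpoints of opposite charge."""
    return dimension(2, alpha_sq) + 1
```

Here `threshold(2)` equals $4\pi$, `threshold(3)` equals $16\pi/3$, and `threshold(n)` increases towards $8\pi$ as $n$ grows, so for $4\pi \le \alpha^2 < 16\pi/3$ only the clusters with $n_v = 2$ have non-positive dimension. The two improved dimensions are $2 + \alpha^2/(2\pi)$, which is always positive, and $3 - \alpha^2/(2\pi)$, which is positive for $\alpha^2 < 6\pi$.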
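Lemma 2.2. If $n_v = 2$ and $v_1, v_2$ are the two endpoints following $v$ with positions $x_1, x_2$ respectively and equal charges $\sigma_1 = \sigma_2$, then
$$ \Big| G_{h_v}(v_1, v_2)\, e^{-\frac{\alpha^2}{2} U_{h_{v'},h_v}(v)} \Big| \le C \gamma^{-\frac{\alpha^2}{\pi}(h_v - h_{v'})}\, e^{-\gamma^{h_v} |x_1 - x_2|}. \tag{2.32} $$
Proof. Since $h_{v'} + 1 \ge 0$, it is easy to check that
$$ G_{h_v}(v_1, v_2)\, e^{-\frac{\alpha^2}{2} U_{h_{v'},h_v}(v)} = \big[ e^{-\alpha^2 C_{h_v}(x_1 - x_2)} - 1 \big]\, e^{-\alpha^2 C_0(0)} \cdot e^{-\alpha^2 \sum_{k=h_{v'}+1}^{h_v - 1} [C_0(0) + C_k(x_1 - x_2)]} \overset{def}{=} F(x_1 - x_2). $$
Hence, by using (2.5) with $\kappa > 1$ and since $C_0(x) \le C_0(0)$, we get
$$ \begin{aligned} |F(z)| &\le C e^{-\kappa\gamma^{h_v}|z|}\, e^{-2\alpha^2 C_0(0)(h_v - h_{v'})}\, e^{-\alpha^2 \sum_{k=h_{v'}+1}^{h_v - 1} [C_0(\gamma^k z) - C_0(0)]} \\ &\le C \gamma^{-\frac{\alpha^2}{\pi}(h_v - h_{v'})}\, e^{2\alpha^2 R C_0(0)}\, e^{-(\kappa - \kappa_R)\gamma^{h_v}|z|}, \end{aligned} $$
where $\kappa_R \overset{def}{=} \alpha^2 B \sum_{r=R}^{\infty} \gamma^{-r}$, the constant $B$ is such that $|C_0(x) - C_0(0)| \le B|x|$ (it exists by (2.5) for $(q_0, q_1) = (1, 0), (0, 1)$) and $R$ is an arbitrary positive integer, that we can choose so that $\kappa - \kappa_R \ge 1$. □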
Lemma 2.3. For $j \ge 0$, if the cluster $X$ is made of two endpoints with positions $x_1, x_2$ and opposite charges, and $Y$ is another arbitrary cluster, then
$$ |W_j(X, Y)| \le C\gamma^j |x_1 - x_2| \sum_{y \in Y} \int_0^1 dt\, e^{-\gamma^j |x_2 + t(x_1 - x_2) - y|}. \tag{2.33} $$
Moreover, if also the cluster $Y$ is made of two endpoints with opposite charge and positions $y_1, y_2$, then
$$ |W_j(X, Y)| \le C\gamma^{2j} |x_1 - x_2|\, |y_1 - y_2| \int_0^1\!\!\int_0^1 dt\, ds\, e^{-\gamma^j |x_2 + t(x_1 - x_2) - y_2 - s(y_1 - y_2)|}. \tag{2.34} $$
Proof. By using the identity
$$ C_0(x_1 - y) - C_0(x_2 - y) = \sum_{a=0,1} (x_1 - x_2)_a \int_0^1 dt\, (\partial_a C_0)[x_2 + t(x_1 - x_2) - y] \tag{2.35} $$
together with (2.5) (with $\kappa \ge 1$), we get the bound
$$ |C_j(x_1 - y) - C_j(x_2 - y)| \le C\gamma^j |x_1 - x_2| \int_0^1 dt\, e^{-\gamma^j |x_2 + t(x_1 - x_2) - y|}, \tag{2.36} $$
which immediately implies (2.33). The bound (2.34) is proved in a similar way, by using the identity
$$ \begin{aligned} &C_0(x_1 - y_1) - C_0(x_2 - y_1) - C_0(x_1 - y_2) + C_0(x_2 - y_2) \\ &\quad = -\sum_{a,b=0,1} (x_1 - x_2)_a\, (y_1 - y_2)_b \int_0^1\!\!\int_0^1 dt\, ds\, (\partial_a \partial_b C_0)[x_2 + t(x_1 - x_2) - y_2 - s(y_1 - y_2)]. \end{aligned} \tag{2.37} $$
□
Let us now consider a tree with $n \ge 3$ endpoints. By using Lemma 2.2, we can improve the bound (2.30) by replacing $D(n_v)$ with $D(n_v) + \alpha^2/\pi$ in all vertices with $n_v = 2$ and $\sigma_1 = \sigma_2$. Since $D(2) + \alpha^2/\pi = 2 + \alpha^2/(2\pi) > 0$, this is sufficient to make the corresponding sum over $h_v - h_{v'}$ convergent. It follows that the sum over all trees with $n \ge 3$ and no vertex with $n_v = 2$ and $Q = 0$ is finite, uniformly in $h$, if $\alpha^2 < 16\pi/3$. A similar argument can be used to control the vertices with $n_v = 2$ and $Q = 0$. In fact, if $v$ is a vertex with $n_v = 2$, then $v'$ is certainly a n.t. vertex, otherwise $n$ would be equal to 2 and we are supposing $n \ge 3$. Hence, we can use Lemma 2.3 in (2.26) for the vertex $v'$, which allows us to improve the bound of (2.26) for the vertex $v$: since $\gamma^{h_{v'}} |x_1 - x_2|\, |G_{h_v}(v_1, v_2)| \le C\gamma^{-(h_v - h_{v'})} e^{-\gamma^{h_v} |x_1 - x_2|/2}$, if $v_1$ and $v_2$ are the two endpoints following $v$, we can modify the bound (2.30) by adding 1 to the dimension $D(n_v)$ of $v$; this is sufficient, since $D(2) + 1 = 3 - \alpha^2/(2\pi)$ is positive for $\alpha^2 < 6\pi$. It follows that $|V_\tau(\sigma, x)| \le C_\alpha^n$ holds for all $n \ge 3$, with $C_\alpha \to \infty$ as $\alpha^2 \to (16\pi/3)^-$. By a further effort, one could prove that $C_\alpha$ can be substituted with a new constant, which is indeed finite up to $6\pi$, but we do not need this stronger property here.

Let us now come back to the terms of order two. It is easy to see that
$$ V_2^{(N)}(\varphi) = \frac{\zeta^2}{2} \sum_{\sigma = \pm 1} V^{(N)}_{2,\sigma}(\varphi), \tag{2.38} $$
$$ V^{(N)}_{2,\sigma}(\varphi) = \int_{\Lambda^2} dx\, dy\, \cos[\alpha\varphi_x + \sigma\alpha\varphi_y]\, W^{(N)}_\sigma(x - y), \tag{2.39} $$
$$ W^{(N)}_\sigma(x - y) = \frac{1}{2} \sum_{j=0}^{N} \gamma^{\frac{\alpha^2}{2\pi}(j-1)} \big[ e^{-\sigma\alpha^2 C_j(x - y)} - 1 \big]\, e^{-\alpha^2 \sum_{r=0}^{j-1} [C_r(0) + \sigma C_r(x - y)]}. \tag{2.40} $$
By proceeding as in Lemma 2.2, it is easy to see that $V^{(N)}_{2,+}(\varphi)$ is bounded uniformly in $N$ for any $\alpha$. This is not true for $V^{(N)}_{2,-}(\varphi)$; in fact, if we define
$$ c_N = E_{h,-1}\big( V^{(N)}_{2,-}(\varphi) \big) \tag{2.41} $$
one can easily check that $c_N$ diverges for $N \to \infty$ and that $\zeta^2 c_N/2$ is equal to the term of order $\zeta^2$ and $\sigma_1 = -\sigma_2$ in the perturbative expansion of the pressure. However, if
Fig. 2.1. A subtree of $\tau$ with $n_v = 2$ and $Q = 0$. While $v_1$ and $v_2$ are endpoints, therefore their scale has to be $h_v + 1$, $v'$ is the higher non trivial vertex of $\tau$ preceding $v$, hence the only constraint is that $h_v - h_{v'} \le N$
we define $\tilde Z_{h,N}(\zeta) = Z_{h,N}(\zeta)\, e^{-\frac{\zeta^2}{2} c_N}$, we can show that the renormalized pressure (in the presence of the cutoffs) $\tilde p_{h,N}(\zeta) = \log \tilde Z_{h,N}(\zeta)$ has a power expansion uniformly convergent as $-h, N \to \infty$, for $\alpha^2 < 16\pi/3$ (the result is indeed true for $\alpha^2 < 6\pi$). It is easy to see that
$$ \tilde p_{h,N}(\zeta) = \zeta \gamma^{\frac{\alpha^2}{4\pi} h} + \sum_{n=2}^{\infty} \zeta^n \sum_{\tau \in \tilde T_n^{(N)}} p_\tau^{(h)}, \tag{2.42} $$
where $\tilde T_n^{(N)}$ is a family of trees defined as $T_n^{(N)}$, with the following differences: 1) the root has scale $-2$; 2) there is no tree which has only two endpoints with opposite charge. Moreover
$$ p_\tau^{(h)} = \frac{1}{2^n} \sum_{\sigma_1, \dots, \sigma_n} \int_{\Lambda^n} dx_1 \cdots dx_n\, \tilde V_\tau^{(h)}(\sigma, x), \tag{2.43} $$
$$ \tilde V_\tau^{(h)}(\sigma, x) = \Big[ \prod_{i=1}^{n} \gamma^{\frac{\alpha^2}{4\pi}(h_i + 1)} \Big] \prod_{\text{n.t. } v \in \tau} \frac{\tilde G_{h_v}(v_1, \dots, v_{s_v})}{s_v!}\, e^{-\frac{\alpha^2}{2} U_{h_{v'},h_v}(v)}, \tag{2.44} $$
where $\tilde G_{h_v}(v_1, \dots, v_{s_v}) = G_{h_v}(v_1, \dots, v_{s_v})$, if $h_v \ge 0$, while, if $h_v = -1$ and $s_v = s$,
$$ \tilde G_{-1}(v_1, \dots, v_s) = E^T_{h,-1}\big[ F(\varphi, X_{v_1}); \dots; F(\varphi, X_{v_s}) \big] \tag{2.45} $$
with, given a cluster $X$,
$$ F(\varphi, X) = \begin{cases} \cos[\alpha\Phi(X)] - 1, & \text{if } |X| = 2 \text{ and } \sigma_1 = -\sigma_2, \\ \cos[\alpha\Phi(X)], & \text{otherwise}, \end{cases} \tag{2.46} $$
where we subtracted a $-1$ in the terms with $|X| = 2$ and $\sigma_1 = -\sigma_2$ (without changing the value of the truncated expectation, since $s \ge 2$), in order to improve the bound in the corresponding vertex, with an argument similar to that used before. In fact, in order to bound (2.45), we shall use the definition (2.13) and the bound
$$ \Big| E_{h,-1}\Big[ \prod_{i=1}^{m} F(\varphi, X_i) \Big] \Big| \le \Big[ \prod_{i : |X_i| = 2} \frac{\alpha^2}{2} \big| x_1^{(i)} - x_2^{(i)} \big|^2 \Big] \cdot \sup_{y_1, \dots, y_{m_2} \in \Lambda} E_{h,-1}\big[ |\partial\varphi_{y_1}|^2 \cdots |\partial\varphi_{y_{m_2}}|^2 \big], \tag{2.47} $$
where $m_2 \le m$ is the number of clusters with 2 endpoints and, for each cluster of this type, $x_1^{(i)}$ and $x_2^{(i)}$ are the two endpoint positions; $|\partial\varphi_y|^2 \overset{def}{=} (\partial_0 \varphi_y)^2 + (\partial_1 \varphi_y)^2$. On the other hand, it is easy to see that there is a constant $c_0$, independent of $h$, such that $\big| E_{h,-1}[\partial_{a_1}\varphi_{x_1} \partial_{a_2}\varphi_{x_2}] \big| \le c_0$. It follows, by using the Wick Theorem, that $E_{h,-1}\big[ |\partial\varphi_{y_1}|^2 \cdots |\partial\varphi_{y_q}|^2 \big] \le 2^q c_0^q (2q - 1)!! \le C^q q!$, so that, if we choose $C \ge 1$
(which allows us to substitute $m_2$ with $m$) and use (2.13), we obtain the following bound:
$$ \begin{aligned} |\tilde G_{-1}(v_1, \dots, v_s)| \le{} & \Big[ \prod_{i : |X_{v_i}| = 2} \big| x_1^{(i)} - x_2^{(i)} \big|^2 \Big] \\ & \cdot \sum_{k=1}^{s} (k - 1)!\, \frac{1}{k!} \sum_{\substack{m_1, \dots, m_k \ge 1 \\ \sum_{r=1}^{k} m_r = s}} \frac{s!}{m_1! \cdots m_k!} \prod_{r=1}^{k} \big( C^{m_r} m_r! \big). \end{aligned} \tag{2.48} $$
The sum in the second line is equal to $C^s s! \sum_{k=1}^{s} \frac{1}{k} \binom{s-1}{k-1} \le 2^{s-1} C^s s!$, so that
$$ |\tilde G_{-1}(v_1, \dots, v_s)| \le C^s s! \Big[ \prod_{i : |X_{v_i}| = 2} \big| x_1^{(i)} - x_2^{(i)} \big|^2 \Big]. \tag{2.49} $$
The factors $|x_1^{(i)} - x_2^{(i)}|^2$ can be used to control the sum over the scale labels of the vertices with $|X_{v_i}| = 2$, by the same argument used in the discussion following (2.37). Hence, if we define the function $W_{n,h,N}(x)$ so that $\sum_{\tau \in \tilde T_n^{(N)}} p_\tau^{(h)} = \int_{\Lambda^n} dx_1 \cdots dx_n\, W_{n,h,N}(x)$, the previous arguments imply that there exist positive functions $f_\tau(x)$, independent of $h$ and $N$, and a constant $C$, such that
$$ |W_{n,h,N}(x)| \le \sum_{\tau \in \tilde T_n^{(N)}} f_\tau(x) \overset{def}{=} H_{n,N}(x), \qquad \int_{\Lambda^n} dx\, H_{n,N}(x) \le C^n. \tag{2.50} $$
Since $\tilde T_n^{(N)} \subset \tilde T_n^{(N+1)}$, $H_{n,N}(x)$ is monotone in $N$. Hence, by the Monotone Convergence Theorem, $H_{n,N}(x)$ has a $L^1$ limit $H_n(x)$, as $N \to \infty$; by (2.50), $|W_{n,h,N}(x)| \le H_n(x)$. On the other hand, by definition we have
$$ W_{n,h,N}(x) = \frac{1}{n!} \frac{1}{2^n} \sum_{\sigma_1, \dots, \sigma_n} E^T_{h,N}\big[ {:}e^{i\alpha\sigma_1 \varphi_{x_1}}{:}; \dots; {:}e^{i\alpha\sigma_n \varphi_{x_n}}{:} \big] \tag{2.51} $$
and Lemma 2.1, (2.11) and (2.13) imply that $W_{n,h,N}(x)$ is almost everywhere convergent as $-h, N \to \infty$. Then, by the Dominated Convergence Theorem, (2.42) and (2.50), $\tilde p(\zeta) = \lim_{-h,N \to \infty} \tilde p_{h,N}(\zeta)$ does exist and is an analytic function of $\zeta$, for $\zeta$ small enough; moreover, $\tilde p(\zeta) = \sum_{n=2}^{\infty} p_n \zeta^n$ and, if $n \ge 3$, by (2.15),
$$ p_n = \frac{c^{-\frac{\alpha^2}{8\pi} n}}{n!\, 2^n} \sum_{\substack{\sigma_1, \dots, \sigma_n \\ Q = 0}} \int_{\Lambda^n} dx_1 \cdots dx_n \cdot \Big\{ \sum_{\Pi} (-1)^{|\Pi| - 1} (|\Pi| - 1)! \prod_{Y \in \Pi} \prod_{\substack{r,s \in Y \\ r < s}} |x_r - x_s|^{\sigma_r \sigma_s \frac{\alpha^2}{2\pi}} \Big\}, \tag{2.52} $$
where $\sum_\Pi$ denotes the sum over the partitions of the set $\{1, \dots, n\}$. If $\alpha^2 < 4\pi$, the previous expression is well defined also for $n = 2$, and gives the coefficient of order 2 of $p(\zeta)$. We stress that the integral and the sum over the partitions can not be exchanged.
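The combinatorial step between (2.48) and (2.49) can be verified by direct enumeration. For fixed $k$, every composition $m_1 + \dots + m_k = s$ with $m_r \ge 1$ contributes $\frac{s!}{m_1! \cdots m_k!} \prod_r C^{m_r} m_r! = C^s s!$, and there are $\binom{s-1}{k-1}$ such compositions; the prefactor $(k-1)!/k!$ then gives $1/k$. The sketch below, with hypothetical function names and exact rational arithmetic, compares the explicit sum with the closed form and with the bound $2^{s-1} C^s s!$.

```python
from fractions import Fraction
from math import comb, factorial

def compositions(s, k):
    """Yield all k-tuples of integers >= 1 whose sum is s."""
    if k == 1:
        if s >= 1:
            yield (s,)
        return
    for first in range(1, s - k + 2):
        for rest in compositions(s - first, k - 1):
            yield (first,) + rest

def second_line_sum(s, C):
    """Explicit evaluation of the double sum in the second line of (2.48)."""
    total = Fraction(0)
    for k in range(1, s + 1):
        inner = Fraction(0)
        for m in compositions(s, k):
            multinomial = Fraction(factorial(s))
            weight = Fraction(1)
            for m_r in m:
                multinomial /= factorial(m_r)
                weight *= Fraction(C) ** m_r * factorial(m_r)
            inner += multinomial * weight
        total += Fraction(factorial(k - 1), factorial(k)) * inner
    return total

def closed_form(s, C):
    """C^s s! sum_k (1/k) binom(s-1, k-1)."""
    binomial_sum = sum(Fraction(comb(s - 1, k - 1), k) for k in range(1, s + 1))
    return Fraction(C) ** s * factorial(s) * binomial_sum
```

Since $\frac{1}{k}\binom{s-1}{k-1} = \frac{1}{s}\binom{s}{k}$, the binomial sum equals $(2^s - 1)/s$, which is at most $2^{s-1}$ for every $s \ge 1$.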
2.3. The case r = 0 (the charge correlation functions). Let $\xi_i = (z_i, \sigma_i)$, $i = 1, \dots, k$, be a family of fixed positions and charges, such that $z_i \neq z_j$ for $i \neq j$, and $\mu_i$, $i = 1, \dots, k$, a set of real numbers. If
$$ Z_{h,N}(\zeta, \xi, \mu) = \int P_{[h,N]}(d\varphi)\, e^{\zeta_N V(\varphi) + \sum_{r=1}^{k} \mu_r {:}e^{i\alpha\sigma_r \varphi(z_r)}{:}}, \tag{2.53} $$
the charge correlation function of order $k$, $k \ge 1$, defined by (2.2), is given by
$$ K^{(k,\zeta)}_{h,N}(z, \sigma) = \frac{\partial^k}{\partial\mu_1 \cdots \partial\mu_k} \log Z_{h,N}(\zeta, \xi, \mu) \Big|_{\mu = 0}. \tag{2.54} $$
By proceeding as in Sect. 2.2, one can show that
$$ Z_{h,N}(\zeta, \xi, \mu) = \int P_{[h,-1]}(d\varphi)\, e^{V^{(N)}_{eff}(\zeta, \varphi) + B^{(N)}(\zeta, \varphi, \xi, \mu) + R^{(N)}(\zeta, \varphi, \xi, \mu)}, \tag{2.55} $$
where $V^{(N)}_{eff}(\zeta, \varphi) = \zeta V(\varphi) + \sum_{n=2}^{\infty} \zeta^n V_n^{(N)}(\varphi)$, $B^{(N)}(\zeta, \varphi, \xi, \mu)$ is the sum over the terms of order at most 1 in each of the $\mu_r$, and $R^{(N)}(\zeta, \varphi, \xi, \mu)$ is the rest. Equation (2.54) implies that
$$ K^{(k,\zeta)}_{h,N}(z, \sigma) = \frac{\partial^k}{\partial\mu_1 \cdots \partial\mu_k} \log \tilde Z_{h,N}(\zeta, \xi, \mu) \Big|_{\mu = 0}, \tag{2.56} $$
where
$$ \tilde Z_{h,N}(\zeta, \xi, \mu) = \int P_{h,-1}(d\varphi)\, e^{V^{(N)}_{eff}(\zeta, \varphi) - \zeta^2 c_N/2 + B^{(N)}(\zeta, \varphi, \xi, \mu)}. \tag{2.57} $$
In order to describe the functional $B^{(N)}(\zeta, \varphi, \xi, \mu)$, we need to introduce a new definition. We shall call $T^{(N)}_{n,k}$ the family of labelled trees whose properties are very similar to those of $T^{(N)}_n$, with the only difference that there are $n + m$ endpoints, $n \ge 0$, $1 \le m \le k$; $n$ endpoints, to be called normal, are associated as before to the interaction, while the others, to be called special, are associated with $m$ different variables $\xi_i$, whose set of indices we shall denote $I_\tau$, while $\xi_\tau$ will denote the set of variables itself. It is easy to see that
$$ B^{(N)}(\zeta, \varphi, \xi, \mu) = \sum_{n=0}^{\infty} \zeta^n B_n^{(N)}(\varphi, \xi, \mu), \tag{2.58} $$
$$ \begin{aligned} B_n^{(N)}(\varphi, \xi, \mu) ={} & \delta_{n,0} \sum_{r=1}^{k} \mu_r {:}e^{i\alpha\sigma_r \varphi_{z_r}}{:} + \sum_{\substack{\tau \in T^{(N)}_{n,k} \\ n + m \ge 2}} \frac{1}{2^n} \sum_{\sigma_1, \dots, \sigma_n} \int_{\Lambda^n} dx_1 \cdots dx_n \\ & \cdot \Big[ \prod_{s \in I_\tau} \mu_s \Big]\, e^{i\alpha \left[ \sum_{r=1}^{n} \sigma_r \varphi_{x_r} + \sum_{s \in I_\tau} \sigma_s \varphi_{z_s} \right]}\, V_\tau(\sigma, x, \xi_\tau), \end{aligned} \tag{2.59} $$
where $V_\tau(\sigma, x, \xi_\tau)$ is defined exactly as in (2.24), with $(\sigma, \sigma_\tau, x, z_\tau)$ in place of $(\sigma, x)$. One can easily check that, if $\alpha^2 \ge 4\pi$, the terms with $n = k = 1$ and $\sigma_1 = -\bar\sigma_1$ in the r.h.s. of (2.59) have a divergent bound as $N \to \infty$. This is related to the fact that the
(1,ζ )
function K h,N (z; σ ) is indeed divergent at the first order in ζ . However, if we regularize these terms by subtracting their value at ϕ = 0, the counterterms give no contribution (k,ζ ) to K h,N (z; σ ), for k ≥ 2. Hence, we can proceed as in the bound of the pressure and we get similar results. There are however a few differences to discuss. (k,ζ ) Given a tree τ (with root of scale −2) contributing to K h,N (z; σ ), we call τ ∗ the tree which is obtained from τ by erasing all the vertices which are not needed to connect the m ≤ k special endpoints. The endpoints of τ ∗ are the m special endpoints of τ , which we denote ei∗ , i = 1, . . . , m. Given a vertex v ∈ τ ∗ , we shall call zv the subset of the positions associated with the endpoints following v in τ ∗ ; moreover, we shall call sv∗ the number of branches following v in τ ∗ . The positions in zv are connected in our bound by a spanning tree of propagators of scales j ≥ h v ; hence, if we use the bound e−2γ
h |x|
≤ e−γ
h |x|
· e−c
h
j=0 γ
j |x|
, c=
∞
γ − j/2
(2.60)
j=0
and define δ = min1≤i< j≤k |zi − z j |, it is easy to see that we can extract, for any v ∈ τ ∗ , hv j a factor e−cγ δ from the propagators bound, by leaving a decaying factor e−γ |x| for each propagator (of scale j) of the spanning tree. On the other hand, the fact that the points in zv are not integrated implies that there are sv∗ − 1 less integrations to do by using propagators of scale h v , for each vertex v ∈ τ ∗ . In conclusion, with respect to the pressure bound, we have to add, for each tree τ , a factor ∗ hv γ 2h v (sv −1) e−cγ δ ≤ (cδ)−2(m−1) (2m − 2)!, (2.61) v∈τ ∗
where we used the identity $\sum_{v\in\tau^*}(s^*_v - 1) = m - 1$. Since $m \le k$, the sum over the scale labels can be done exactly as in the pressure case, up to a $C^k(2k)!$ overall factor.

There is another difference to analyze, related to the fact that, in the analogue of (2.45), the function $F(\varphi, X_v)$ corresponding to a cluster with two endpoints of opposite charge, one normal and one special, is bounded by $|x - z| \sup_y|\partial\varphi_y|$, rather than by $|x - z|^2 \sup_y|\partial\varphi_y|^2$. The fact that the zero in the positions is of order one has no consequence, since such a zero is sufficient to regularize the bound over a cluster with two endpoints of opposite charges. The fact that $|\partial\varphi_y|$ appears, instead of its square, is also irrelevant, since the only consequence is that, in the bound analogous to (2.47), one has to substitute $E_{h,-1}\bigl[|\partial\varphi_{y_1}|^2\cdots|\partial\varphi_{y_{m_2}}|^2\bigr]$ with $E_{h,-1}\bigl[|\partial\varphi_{y_1}|\cdots|\partial\varphi_{y_{m'_2}}|\bigr]$, with $m'_2 \le 2m_2$. However, by the Schwarz inequality, $E_{h,-1}\bigl[|\partial\varphi_{y_1}|\cdots|\partial\varphi_{y_{m'_2}}|\bigr] \le \sqrt{E_{h,-1}\bigl[|\partial\varphi_{y_1}|^2\cdots|\partial\varphi_{y_{m'_2}}|^2\bigr]}$, and we can still use the Wick Theorem to get an even better bound.

The previous arguments allow us to prove that $K^{(k,\zeta)}(z;\sigma) = \lim_{-h,N\to+\infty} K^{(k,\zeta)}_{h,N}(z;\sigma)$ does exist and is an analytic function of $\zeta$ around $\zeta = 0$, with a radius of convergence independent of $\delta$ (the minimum distance between two points in $z$). On the other hand, it is easy to check the well known identity
\[ K^{(k,\zeta)}_{h,N}(z;\sigma) = \sum_{n=0}^{\infty}\frac{\zeta^n}{n!}\frac{1}{2^n}\sum_{\sigma'}\int dx_1\cdots dx_n\, E^T_{h,N}\bigl[:e^{i\alpha\sigma_1\varphi_{z_1}}:;\ldots;:e^{i\alpha\sigma_k\varphi_{z_k}}:;:e^{i\alpha\sigma'_1\varphi_{x_1}}:;\ldots;:e^{i\alpha\sigma'_n\varphi_{x_n}}:\bigr]. \tag{2.62} \]
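The combinatorial factor in (2.61) can be checked vertex by vertex. The following is only a sketch, assuming the elementary inequalities $x^n e^{-x} \le n!$ (for $x \ge 0$) and $a!\,b! \le (a+b)!$; the shorthands $s = s^*_v$ and $x = c\gamma^{h_v}\delta$ are introduced here for illustration.

```latex
% One vertex v of \tau^*, with s = s_v^* and x = c\gamma^{h_v}\delta:
\gamma^{2h_v(s-1)}\, e^{-c\gamma^{h_v}\delta}
  = (c\delta)^{-2(s-1)}\, x^{2(s-1)} e^{-x}
  \le (c\delta)^{-2(s-1)}\, (2s-2)! .
% Product over the vertices, using \sum_{v\in\tau^*}(s_v^*-1) = m-1:
\prod_{v\in\tau^*} (c\delta)^{-2(s_v^*-1)}\,(2s_v^*-2)!
  \le (c\delta)^{-2(m-1)}\,\Bigl(\sum_{v\in\tau^*}(2s_v^*-2)\Bigr)!
  = (c\delta)^{-2(m-1)}\,(2m-2)! .
% Counting identity: if V is the set of vertices of \tau^* that are not
% endpoints, every element of V other than the first one, and each of the
% m endpoints, is reached by exactly one branch, so
\sum_{v\in V} s_v^* = (|V|-1) + m
  \quad\Longrightarrow\quad
\sum_{v\in V} (s_v^*-1) = m-1 .
```

The same counting argument, applied to the subtree following a given vertex, gives the analogous identities used later for $\tau^*$ and $\tau^o$ in the Thirring model.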
An argument similar to that used at the end of §2.2 allows us to prove that the power expansion of $K^{(k,\zeta)}(z;\sigma)$ is obtained from the previous equation, by substituting in the r.h.s. $E^T_{h,N}$ with $E^T$. Hence, by using (2.15), we get that $K^{(k,\zeta)}(z;\sigma) = \sum_{n=0}^{\infty}\zeta^n g_{k,n}(z;\sigma)$, with
\[ g_{k,n}(z;\sigma) = \frac{c^{-\frac{\alpha^2}{8\pi}(n+k)}}{n!\,2^n} \sum_{\substack{\sigma'_1,\ldots,\sigma'_n \\ \sum_{i=1}^{n}\sigma'_i + \sum_{r=1}^{k}\sigma_r = 0}} \int_{\Lambda^n} dx_1\cdots dx_n \cdot \Bigl\{ \sum_{\Pi} (-1)^{|\Pi|-1}(|\Pi|-1)! \prod_{Y\in\Pi}\prod_{\substack{r,s\in Y \\ r<s}} |y_r - y_s|^{\bar\sigma_r\bar\sigma_s\frac{\alpha^2}{2\pi}} \Bigr\}, \tag{2.63} \]
where $y = (x, z)$, $\bar\sigma = (\sigma', \sigma)$ and $\sum_\Pi$ denotes the sum over the partitions of the set $(1, \ldots, n+k)$.
2.4. The case r > 0 (the ∂ϕ correlation functions).
Let $y = (y_1, \ldots, y_k)$ be a set of $k \ge 1$ distinct fixed positions, $\nu = (\nu_1, \ldots, \nu_k)$ be a set of derivative indices and $\mu = (\mu_1, \ldots, \mu_k)$ a set of real numbers. If
\[ Z_{h,N}(\zeta, y, \nu, \mu) = \int P_{[h,N]}(d\varphi)\, e^{\zeta_N V(\varphi) + \sum_{r=1}^{k}\mu_r\partial^{\nu_r}\varphi(y_r)}, \tag{2.64} \]
the $\partial\varphi$ correlation function of order $k$, $k \ge 1$, is given by
\[ K^{(k,\zeta)}_{h,N}(y;\nu) = \frac{\partial^k}{\partial\mu_1\cdots\partial\mu_k} \log Z_{h,N}(\zeta, y, \nu, \mu)\Big|_{\mu=0}. \tag{2.65} \]
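We can proceed as in §2.3 and we can represent $K^{(k,\zeta)}_{h,N}(y;\nu)$ as in (2.56), that is we can substitute in (2.65) $Z_{h,N}(\zeta, y, \nu, \mu)$ with
\[ \tilde Z_{h,N}(\zeta, y, \nu, \mu) = \int P_{[h,-1]}(d\varphi)\, e^{V^{(N)}_{eff}(\zeta,\varphi) - 2\zeta^2 c_N + B^{(N)}(\zeta,\varphi,y,\nu,\mu)}. \tag{2.66} \]
It is not hard to see that $B^{(N)}(\zeta, \varphi, y, \nu, \mu) = \sum_{n=0}^{\infty}\zeta^n B^{(N)}_n(\varphi, y, \nu, \mu)$, with
\[ B^{(N)}_n(\varphi, y, \nu, \mu) = \delta_{n,0}\sum_{r=1}^{k}\mu_r\partial^{\nu_r}\varphi(y_r) + \sum_{\substack{\tau\in\mathcal{T}^{(N)}_{n,k} \\ n+m\ge 2}} \frac{1}{2^n}\sum_{\sigma_1,\ldots,\sigma_n}\int_{\Lambda^n} dx_1\cdots dx_n \cdot e^{i\alpha\sum_{r=1}^{n}\sigma_r\varphi(x_r)}\, \tilde V_\tau(\sigma, \nu, x, y_\tau)\prod_{r\in I_\tau}\mu_r, \tag{2.67} \]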
where $\mathcal{T}^{(N)}_{n,k}$ is defined exactly as in (2.59), except for the fact that the $m$ special endpoints ($1 \le m \le k$) are associated with the $\partial\varphi$ terms; moreover $\tilde V_\tau(\sigma, \nu, x, y_\tau)$ is defined in a way similar to $V_\tau(\sigma, x, \xi_\tau)$, but, before giving its expression, we need a few new definitions. If $v$ is a non trivial vertex, we shall call $I_v \subset I_\tau$ the set of special endpoints immediately following $v$ (that is the set of $\partial\varphi$ endpoints which are contracted in $v$), $\bar s_v$ the number of vertices immediately following $v$, which are not special endpoints, and $s^*_v = |I_v|$ (hence $s_v = \bar s_v + s^*_v$). Moreover, we shall use $X_v$ to denote the set of normal endpoints (instead of all endpoints) following $v$. Then we can write
\[ \tilde V_\tau(\sigma, \nu, x, y_\tau) = \Bigl[\prod_{i=1}^{n}\gamma^{\frac{\alpha^2}{4\pi}h_{e_i}}\Bigr] \prod_{\text{n.t. } v\in\tau} \frac{\tilde G_{h_v}(v_1, \ldots, v_{s_v})}{s_v!}\, e^{-\frac{\alpha^2}{2}U_{h_{v'},h_v}(v)}, \tag{2.68} \]
with $\tilde G_j(v_1, \ldots, v_s) = E^T_j\bigl[F_{v_1}(\varphi); \ldots; F_{v_s}(\varphi)\bigr]$, where $F_v(\varphi) = \partial^\nu\varphi_y$, if the vertex $v$ is a special endpoint with position $y$ and label $\nu$, otherwise $F_v(\varphi) = \exp(i\alpha\Phi(X_v))$. We can always rearrange the order of the arguments so that the first possibility happens for $i = 1, \ldots, m$. If $m = 0$, we can use the identity (2.26), otherwise we can write
\[ \tilde G_j(v_1, \ldots, v_s) = \frac{\partial^m}{\partial\lambda_1\cdots\partial\lambda_m} H_j(\lambda_1, \ldots, \lambda_m)\Big|_{\lambda=0}, \tag{2.69} \]
where
\[ H_j(\lambda) = E^T_j\bigl(e^{\lambda_1\partial^{\nu_1}\varphi_{y_1}}; \ldots; e^{\lambda_m\partial^{\nu_m}\varphi_{y_m}}; e^{i\alpha\Phi(X_{v_{m+1}})}; \ldots; e^{i\alpha\Phi(X_{v_s})}\bigr) \tag{2.70} \]
is a quantity which satisfies an identity similar to (2.26), that is
\[ H_j(\lambda) = \sum_{T\in\bar{\mathcal{T}}_s}\prod_{\langle a,b\rangle\in T} c_{a,b} \int dp_T(t)\, e^{-\frac{1}{2}\tilde U_j(v,t,\lambda)}, \tag{2.71} \]
where
\[ \begin{cases} c_{a,b} = \tilde c_{a,b} \stackrel{def}{=} -\alpha^2 W_j(X_{v_a}, X_{v_b}) & \text{if } a, b > m, \\ c_{a,b} = \lambda_a\tilde c_{a,b} \stackrel{def}{=} i\alpha\lambda_a\sum_{r:\, e_r\in X_{v_b}}\sigma_r\partial^{\nu_a}C_j(y_a - x_r) & \text{if } a \le m < b, \\ c_{a,b} = \lambda_a\lambda_b\tilde c_{a,b} \stackrel{def}{=} -\lambda_a\lambda_b\partial^{\nu_a}\partial^{\nu_b}C_j(y_a - y_b) & \text{if } a, b \le m. \end{cases} \tag{2.72} \]
It follows that
\[ \tilde G_j(v_1, \ldots, v_s) = \sideset{}{^*}\sum_{T\in\bar{\mathcal{T}}_s}\prod_{\langle a,b\rangle\in T}\tilde c_{a,b} \int dp_T(t)\, e^{-\frac{1}{2}\tilde U_j(v,t,0)}, \tag{2.73} \]
where the sum over $T$ in (2.73) runs only over those $T \in \bar{\mathcal{T}}_s$ such that all special endpoints are leaves of $T$. Note that $\tilde U_j(v, t, 0)$ is a positive quantity, since it is a convex combination of "interaction energies" which do not involve the special vertices; hence we can safely bound the r.h.s. of (2.73), as in the previous sections. Let us define
\[ \tilde b_\tau(y) = \frac{1}{2^n}\sum_{\nu,\sigma}\int_{\Lambda^n} dx\, |\tilde V_\tau(\sigma, \nu, x, y_\tau)|. \tag{2.74} \]
The bound of $\tilde b_\tau(y)$ differs from the r.h.s. of (2.30) for the following reasons:
1) there is a $\gamma^{k_i}$ factor more, coming from the field derivative, for the $i$-th special endpoint, if $k_i$ is the scale label of the higher n.t. vertex preceding it (the vertex where it is contracted);
2) there is a factor $\gamma^{2h_v(s^*_v-1)}$ more, which takes into account the fact that the special endpoints positions are not integrated, for each n.t. vertex $v$ such that $s^*_v > 0$;
3) if $\delta = \min_{1\le i<j\le k}|y_i - y_j|$, there is a factor $\exp(-c\gamma^{h_v}\delta)$ for each n.t. vertex $v$ such that $s^*_v > 0$, coming from the same argument used in the case of the charge correlation functions.
Hence, if $m_\tau \ge 1$ is the number of special endpoints in $\tau$, we get
\[ \tilde b_\tau(y) \le C_\varepsilon^{n+m_\tau}\Bigl[\prod_{i=1}^{n}\gamma^{\frac{\alpha^2}{4\pi}h_{e_i}}\Bigr]\Bigl[\prod_{\text{n.t. } v\in\tau}\gamma^{-2h_v(s_v-1)}e^{2\varepsilon n_v}\Bigr]\Bigl[\prod_{i=1}^{m_\tau}\gamma^{k_i}\Bigr] \cdot \prod_{\text{n.t. } v:\, s^*_v>0}\gamma^{2h_v(s^*_v-1)}e^{-c\gamma^{h_v}\delta}. \tag{2.75} \]
The last product can be bounded as in (2.61), so that, by "distributing along the tree" the other factors, we get
\[ \tilde b_\tau(y) \le C_\varepsilon^{n+k}(\delta)^{-2(m_\tau-1)}(2m_\tau-2)!\prod_{\text{n.t. } v\in\tau}\gamma^{-(h_v-h_{v'})(\tilde D(n_v,m_v)-2\varepsilon n_v)}, \tag{2.76} \]
where $n_v$ and $m_v$ denote the number of normal and special endpoints following $v$, respectively, and
\[ \tilde D(n, m) = 2(n-1) - \frac{\alpha^2}{4\pi}n + m. \tag{2.77} \]
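For orientation, the values of (2.77) that enter the case analysis on $\alpha^2$ can be tabulated directly. This is only a check of the arithmetic; the shorthand $a = \alpha^2/(4\pi)$ is introduced here for convenience and is not part of the original notation.

```latex
% \tilde D(n,m) = 2(n-1) - a\,n + m = n(2-a) + m - 2, with a = \alpha^2/(4\pi):
\tilde D(0,2) = 0, \qquad
\tilde D(1,1) = 1 - a, \qquad
\tilde D(2,0) = 2 - 2a, \qquad
\tilde D(3,0) = 4 - 3a .
% a < 1  (\alpha^2 < 4\pi):
%   every value with n + m \ge 2 is strictly positive, except \tilde D(0,2) = 0.
% 1 \le a < 4/3  (4\pi \le \alpha^2 < 16\pi/3):
%   \tilde D(1,1) \le 0 and \tilde D(2,0) \le 0, while \tilde D(3,0) = 4 - 3a > 0,
%   \tilde D(1,2) = 2 - a > 0 and \tilde D(2,1) = 3 - 2a > 0;
%   each further endpoint adds 2 - a > 0 (normal) or 1 (special).
```

In particular, $\alpha^2 = 16\pi/3$ is exactly the value at which $\tilde D(3,0)$ stops being positive, while $\tilde D(0,2) = 0$ for every $\alpha$.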
Let us consider first the case $\alpha^2 < 4\pi$. Since $n_v + m_v \ge 2$, $\tilde D(n_v, m_v)$ is always positive, except if $n_v = 0$ and $m_v = 2$. However, no tree may have a non trivial vertex of this type, except the trees with only two special endpoints and no normal endpoint, that is the trees belonging to $\mathcal{T}^{(N)}_{0,2}$, and it is very easy to see that
\[ \sum_{\tau\in\mathcal{T}^{(N)}_{0,2}}\tilde V_\tau(\nu_1, \nu_2, y_1, y_2) = -\sum_{j=0}^{N}\gamma^{2j}\,\partial^{\nu_1}\partial^{\nu_2}C_0\bigl(\gamma^j(y_1 - y_2)\bigr). \tag{2.78} \]
By (2.5), this quantity has a finite limit as $N \to \infty$, if $y_1 \ne y_2$, as we are supposing. Hence there is no ultraviolet divergence in the expansion (2.67) of $B^{(N)}_n(\varphi, y, \nu, \mu)$ and we have only to check that there is no infrared problem related with the integration over the $\varphi$ field in (2.66). This follows as in §2.2, by using the identity (2.13); it is sufficient to observe that
\[ \Bigl| E_{h,-1}\Bigl[\Bigl(\prod_{i=1}^{m}\partial^{\nu_i}\varphi_{y_i}\Bigr)\prod_{j=1}^{s}e^{i\alpha\Phi(X_{v_j})}\Bigr]\Bigr| \le \sqrt{E_{h,-1}\Bigl[\prod_{i=1}^{m}\bigl(\partial^{\nu_i}\varphi_{y_i}\bigr)^2\Bigr]} \tag{2.79} \]
and then apply the arguments used in §2.2 to bound the sum over the partitions.

Let us now suppose that $4\pi \le \alpha^2 < 16\pi/3$. In this case $\tilde D(n_v, m_v)$ can be non positive only if either $m_v = 0$ and $n_v = 2$ or $m_v = n_v = 1$. The vertices satisfying the first condition can be regularized as before; for the others we can use the factor $e^{-\frac{\alpha^2}{2}U_{h_{v'},h_v}(v)} = \gamma^{-\frac{\alpha^2}{4\pi}(h_v - h_{v'} - 1)}$ to make their dimension positive; in fact $\tilde D(1,1) + \alpha^2/(4\pi) = 1$. The integration over the $\varphi$ field in (2.66) can now be done by an obvious modification of the argument used for the charge correlation functions.

It is now easy to prove, as in the previous sections, that $K^{(k,\zeta)}_{h,N}(y;\nu)$ has a finite limit, as $-h, N \to \infty$, if $\delta > 0$, and that this limit is an analytic function of $\zeta$ around $\zeta = 0$, with a radius of convergence independent of $\delta$. On the other hand,
\[ K^{(k,\zeta)}_{h,N}(y;\nu) = \sum_{n=0}^{\infty}\frac{\zeta^n}{n!}\frac{1}{2^n}\sum_{\sigma}\int dx_1\cdots dx_n\, E^T_{h,N}\bigl[\partial^{\nu_1}\varphi_{y_1}; \ldots; \partial^{\nu_k}\varphi_{y_k}; :e^{i\alpha\sigma_1\varphi_{x_1}}:; \ldots; :e^{i\alpha\sigma_n\varphi_{x_n}}:\bigr]. \tag{2.80} \]
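An argument similar to that used at the end of §2.2 allows us to prove that the power expansion of $K^{(k,\zeta)}(y;\nu)$ is obtained from the previous equation, by substituting in the r.h.s. $E^T_{h,N}$ with $E^T$. Moreover, it is not hard to check that, if $n > 0$, $Q \equiv \sum_{i=1}^{n}\sigma_i$ and $h_{k,n}(x, y; \sigma, \nu)$ is the limiting value of the truncated expectation in (2.80), we have
\[ h_{k,n}(x, y; \sigma, \nu) = \delta_{Q,0}\, c^{-\frac{\alpha^2}{8\pi}n}\Bigl[\prod_{r=1}^{k}W(y_r, x; \sigma, \nu_r)\Bigr] \cdot \Bigl\{\sum_{\Pi}(-1)^{|\Pi|-1}(|\Pi|-1)!\prod_{Y\in\Pi}\prod_{\substack{r,s\in Y \\ r<s}}|x_r - x_s|^{\sigma_r\sigma_s\frac{\alpha^2}{2\pi}}\Bigr\}, \tag{2.81} \]
where $\sum_\Pi$ denotes the sum over the partitions of the set $(1, \ldots, n)$ and
\[ W(y, x; \sigma, \nu) = i\alpha\sum_{i=1}^{n}\sigma_i\,\partial^\nu\Delta^{-1}(y - x_i) = \frac{i\alpha}{2\pi}\sum_{i=1}^{n}\sigma_i\frac{(x_i - y)^\nu}{|y - x_i|^2}, \tag{2.82} \]
while, if $n = 0$,
\[ h_{k,0}(y, \nu) = \delta_{k,2}\, E^T\bigl[\partial^{\nu_1}\varphi_{y_1}; \partial^{\nu_2}\varphi_{y_2}\bigr] = \delta_{k,2}\, h^{\nu_1,\nu_2}(y_1 - y_2) \tag{2.83} \]
with
\[ h^{\nu_1,\nu_2}(y) = \frac{1}{2\pi|y|^2}\Bigl[\delta^{\nu_1,\nu_2} - 2\frac{y^{\nu_1}y^{\nu_2}}{|y|^2}\Bigr]. \]
Hence, we get that
\[ K^{(k,\zeta)}(y;\nu) = \sum_{n=0}^{\infty}\zeta^n\,\bar h_{k,n}(y;\nu), \tag{2.84} \]
with
\[ \bar h_{k,n}(y;\nu) = \frac{1}{n!}\frac{1}{2^n}\sum_{\substack{\sigma_1,\ldots,\sigma_n \\ \sum_{i=1}^{n}\sigma_i = 0}}\int_{\Lambda^n} dx_1\cdots dx_n\, h_{k,n}(x, y; \sigma, \nu). \tag{2.85} \]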
Note that h¯ 1,n (y, ν) = 0 for any n, since W (y, x, σ , ν) is odd in σ and the sum in (2.85) is restricted to the σ such that Q = 0; hence K (1,ζ ) (y, ν) = 0.
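The vanishing of the one-point function can be made explicit with a charge-reversal argument. The lines below are a sketch of that step, assuming the forms of $h_{k,n}$ and $W$ given in (2.81) and (2.82).

```latex
% Under \sigma \to -\sigma, for k = 1 and n > 0 (cf. (2.81), (2.82)):
W(y,x;-\sigma,\nu) = -W(y,x;\sigma,\nu), \qquad
\sigma_r\sigma_s \to \sigma_r\sigma_s, \qquad
\delta_{Q,0} \to \delta_{-Q,0} = \delta_{Q,0},
% so that h_{1,n}(x,y;-\sigma,\nu) = -h_{1,n}(x,y;\sigma,\nu).
% The set \{\sigma : Q = 0\} is invariant under \sigma \to -\sigma, hence
\sum_{\sigma:\,Q=0} h_{1,n}(x,y;\sigma,\nu)
  = \sum_{\sigma:\,Q=0} h_{1,n}(x,y;-\sigma,\nu)
  = -\sum_{\sigma:\,Q=0} h_{1,n}(x,y;\sigma,\nu) = 0 .
% For n = 0 the factor \delta_{k,2} in (2.83) gives h_{1,0} = 0 directly.
```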
3. The Thirring Model with a Finite Volume Mass Term
The Generating Functional, $W_{h,N}(J, A, \mu)$, of the Thirring model with cutoff and with a mass term in finite volume is defined by the equation
\[ W_{h,N}(J, A, \mu) \stackrel{def}{=} \log\int P_{h,N}(d\psi)\exp\Bigl\{-\lambda Z_N^2 V_L(\psi) + \mu Z^{(1)}_N\int_\Lambda dx\,\bar\psi_x\psi_x + Z^{(1)}_N\sum_{\sigma=\pm 1}\int dx\, J^\sigma_x\bar\psi_x\Gamma^\sigma\psi_x + Z_N\sum_{\nu=0,1}\int dx\, A^\nu_x\bar\psi_x\gamma^\nu\psi_x\Bigr\}, \tag{3.1} \]
where the free measure $P_{h,N}(d\psi)$ is defined by (1.14), $Z_N$ and $Z^{(1)}_N$ are defined in (1.15), $J^\sigma_z$ and $A^\mu_y$ are two-dimensional, external bosonic fields and
\[ Z_N^2 V_L(\psi) \stackrel{def}{=} \frac{1}{4}\int_{\Lambda_L} dx\,\bigl(Z_N\bar\psi_x\gamma^\mu\psi_x\bigr)^2 + E_{h,N}|\Lambda_L|, \qquad \Gamma^\sigma \stackrel{def}{=} \frac{I + \sigma\gamma^5}{2}, \tag{3.2} \]
$E_{h,N}$ being the vacuum counterterm introduced in (1.13); it is chosen so that $W_{h,N}(0, 0, 0) = 0$. Given the set of non coinciding points $x = (x_1, \ldots, x_q)$ and the set $\sigma = (\sigma_1, \ldots, \sigma_q)$, $\sigma_i = \pm 1$, we want to study the Schwinger functions
\[ G^{(q,r;\mu)}_{h,N}(x, y; \sigma, \nu) \stackrel{def}{=} \lim_{a^{-1},L\to\infty}\frac{\partial^{q+r}W_{h,N}}{\partial J^{\sigma_1}_{x_1}\cdots\partial J^{\sigma_q}_{x_q}\,\partial A^{\nu_1}_{y_1}\cdots\partial A^{\nu_r}_{y_r}}(0, 0, \mu). \tag{3.3} \]
Theorem 3.1. If $\mu$ and $\lambda$ are small enough and $q \ge 2$, if $r = 0$, or $q \ge 0$, if $r \ge 1$, the limit
\[ G^{(q,r;\mu)}(z, w; \sigma, \nu) \stackrel{def}{=} \lim_{-h,N\to+\infty} G^{(q,r;\mu)}_{h,N}(z, w; \sigma, \nu) \tag{3.4} \]
exists and is analytic in $\mu$. In the case $q = r = 0$ (the pressure), the limit does exist and is analytic, up to a divergence in the second order term, present only for $\lambda \le 0$.

As in §2, we shall give the proof of the above theorem only in the special cases $(q, r) = (k, 0)$ and $(q, r) = (0, k)$ separately. In order to prove Theorem 3.1, we note first that definition (3.3) and the identity $\bar\psi_x\psi_x = \sum_\sigma\bar\psi_x\Gamma^\sigma\psi_x$ imply that
\[ G^{(q,r;\mu)}_{h,N}(z, y; \sigma, \nu) = \sum_{\substack{p = 2n-q \\ n\ge 0}}\frac{\mu^p}{p!}\sum_{\sigma'}\int dx\,\chi_\Lambda(x)\, S^{(2n,r)}_{h,N}(z\,x, y; \sigma\,\sigma', \nu), \tag{3.5} \]
where $z\,x = (z_1, \ldots, z_q, x_1, \ldots, x_p)$, $\sigma\,\sigma' = (\sigma_1, \ldots, \sigma_q, \sigma'_1, \ldots, \sigma'_p)$, we defined
\[ \chi_\Lambda(x) \stackrel{def}{=} \chi_\Lambda(x_1)\cdots\chi_\Lambda(x_p), \qquad S^{(m,r)}_{h,N}(x, y; \sigma, \nu) \stackrel{def}{=} G^{(m,r;0)}_{h,N}(x, y; \sigma, \nu), \tag{3.6} \]
and we used the fact that $G^{(m,r;0)}_{h,N}(x, y; \sigma, \nu)$ can be different from 0 only if $\sum_{i=1}^{m}\sigma_i = 0$, implying in particular that $m$ is even.
In the following we shall give a bound for the functions $S^{(m,r)}_{h,N}(x, y; \sigma, \nu)$, uniform in the cutoffs and implying (by an argument similar to that used for the Sine-Gordon model, that we shall skip here) that the limit exists, is integrable and is exchangeable with the integral in (3.5). It follows that
\[ G^{(q,r;\mu)}(z, y; \sigma, \nu) = \sum_{\substack{p = 2n-q \\ n\ge 0}}\frac{\mu^p}{p!}\sum_{\sigma'}\int dx\,\chi_\Lambda(x)\, S^{(2n,r)}(z\,x, y; \sigma\,\sigma', \nu). \tag{3.7} \]
We remark that $\chi_\Lambda(x)$ is not a regular test function since it is not vanishing for coinciding points, and hence we could encounter divergences caused by the ultraviolet problem. Indeed, as we shall see, the integration of $G^{(2,0;0)}$ will be finite only for $\lambda > 0$ (and small in absolute value), so that the pressure $G^{(0,0;\mu)}$ and, if $x \in \Lambda$, the "density" $G^{(1,0;\mu)}(x, \sigma)$ are really divergent for $\lambda \le 0$, since this is true for the terms with $2n = 2$ and $r = 0$ in the r.h.s. of (3.7). As announced in the introduction, we first consider the case $q = r = 0$, then we discuss the case $q > 0$ and $r = 0$, and finally the case $q = 0$ and $r > 0$.

3.1. Case q = r = 0 (the pressure).
Our definitions imply that $S^{(0,0)} = 0$. If $m \ge 2$ and even (otherwise it is 0 by symmetry), the $m$-points Schwinger function $S^{(m,0)}(x, \sigma)$ is obtained as the $m$-th order functional derivative of the generating function $W_{h,N}(J, 0, 0)$ with respect to $J^{\sigma_1}_{x_1}, \ldots, J^{\sigma_m}_{x_m}$ at $J = 0$. We can proceed as in [BFM] and we get an expansion similar to Eq. (2.28) of that paper, which we refer to for the notation. The only difference is that the special endpoints of type $J$ are associated with the terms $Z^{(1)}_j\sum_\sigma\bar\psi_x\Gamma^\sigma\psi_x = Z^{(1)}_j\sum_\sigma\psi^+_{x,-\sigma}\psi^-_{x,\sigma}$ instead of $Z_j\sum_\sigma\psi^+_{x,\sigma}\psi^-_{x,\sigma}$, but this does not change the structure of the expansion; we only have to add, for each special endpoint of scale $h_i$, a factor $Z^{(1)}_{h_i}/Z_{h_i}$, which can be controlled by studying the flow of $Z^{(1)}_j$. It turns out that there are two constants $\eta_+(\lambda) = b\lambda + O(\lambda^2)$, $b > 0$, and $c_+(\lambda) = 1 + O(\lambda)$, such that, in the limit $N \to \infty$, $Z^{(1)}_j = c_+(\lambda)\gamma^{-\eta_+ j}$; this result is obtained by an argument similar to that used in [BFM] to prove that there are two constants $\eta_-(\lambda) = a\lambda^2 + O(\lambda^3)$, $a > 0$, and $c_-(\lambda) = 1 + O(\lambda)$, such that, in the limit $N \to \infty$, $Z_j = c_-(\lambda)\gamma^{-\eta_- j}$ (in [BFM] $c_-(\lambda) = 1$, since the definition of $Z_N$ differs by a constant chosen to get this result). In analogy to Eq. (2.40) of [BFM], we can write
\[ S^{(m,0)}(x, \sigma) = m!\lim_{|h|,N\to\infty}\sum_{n=0}^{\infty}\sum_{j_0=-\infty}^{N-1}\sum_{\tau\in\mathcal{T}^{0,m}_{j_0,n}}\sum_{P\in\mathcal{P}} S_{0,m,\tau,\sigma}(x). \tag{3.8} \]
Given a tree $\tau$ contributing to the r.h.s. of (3.8), we call $\tau^*$ the tree which is obtained from $\tau$ by erasing all the vertices which are not needed to connect the $m$ special endpoints (all of type $J$). The endpoints of $\tau^*$ are the $m$ special endpoints of $\tau$, which we denote $v^*_i$, $i = 1, \ldots, m$; with each of them a space-time point $x_i$ and a label $\sigma_i$ are associated. Given a vertex $v \in \tau^*$, we shall call $u_v$ the set of the space-time points associated with the normal endpoints of $\tau$ that follow $v$ in $\tau$ (in [BFM] they were called internal points); $x_v$ will denote the subset of $x$ made of all points associated with the endpoints of $\tau^*$ following $v$. Furthermore, we shall call $s^*_v$ the number of branches of $\tau^*$ following $v \in \tau^*$, $s^{*,1}_v$ the number of branches containing only one endpoint and $s^{*,2}_v = s^*_v - s^{*,1}_v$. For each n.t. vertex or endpoint $v \in \tau^*$, shortening the notation of $s^*_v$ into $s$, we choose one point in $x_v$, let it be called $w_v$, with the only constraint that, if $v_1, \ldots, v_s$ are the n.t. vertices or endpoints following $v$, then $w_v$ is one among $\mathbf{w}_v \equiv \{w_{v_1}, \ldots, w_{v_s}\}$.

The bound of $S_{0,m,\tau,\sigma}(x)$ will be done as in [BFM], by comparing it with the bound of its integral over $x$, given by Eq. (2.36) of that paper. However, we shall slightly modify the procedure, to get an estimate more convenient for our actual needs. Given the space-time points $v = (v_1, \ldots, v_p)$ connected by the tree $T$, we shall define, if $v_{l,i}$ and $v_{l,f}$ denote the endpoints of the line $l \in T$,
\[ D_T(v) \stackrel{def}{=} \|T\| \stackrel{def}{=} \sum_{l\in T}\sqrt{|v_{l,i} - v_{l,f}|}. \tag{3.9} \]
Now we want to show that, from the bounds of the propagators associated with the lines $l$ of the spanning tree $T_\tau = \bigcup_v T_v$, we can extract a factor $e^{-c'\sqrt{\gamma^{h_v}}D_{C_v}(\mathbf{w}_v)}$ for each n.t. vertex $v \in \tau^*$, where $C_v$ is a chain of segments that only depends on $\tau$ and $T_\tau$, and connects the space-time points $\mathbf{w}_v$. Indeed, given a n.t. $v \in \tau^*$, there is a subtree $T^*_v$ of $T_\tau$ connecting the points $\mathbf{w}_v$ together with a subset of $x_v \cup u_v$. Since $T^*_v$ is made of lines of scale $j \ge h_v$, the decaying factors in the bounds of the propagators in $T^*_v$ can be written as
\[ e^{-c\sqrt{\gamma^h|x|}} = e^{-\frac{c}{2}\sqrt{\gamma^h|x|}} \cdot e^{-2c'\sum_{j=-\infty}^{h}\sqrt{\gamma^j|x|}}, \tag{3.10} \]
for $c' = c/\bigl(4\sum_{j=0}^{\infty}\gamma^{-j/2}\bigr)$. Hence, collecting the latter factor for each of the lines of $T^*_v$, we obtain $e^{-2c'\sqrt{\gamma^{h_v}}\|T^*_v\|}$. We finally would like to replace, in the previous bound, $\|T^*_v\|$ with $\|C_v\|$, up to a constant, for a $C_v$ which does not depend on the position of the internal points of $T^*_v$. This is possible as a consequence of the following lemma.

Lemma 3.1. Let $T$ be a tree graph connecting the points $\{w_j\}_{j=1}^{l}$ together with other "internal points", $\{u_j\}_{j=1}^{q}$. Then there exists a chain $C$ connecting all and only the points $\{w_j\}_{j=1}^{l}$ such that $2\|T\| \ge \|C\|$ and $C$ only depends on $T$.

Proof. Suppose that the points $\{u_j\}_{j=1}^{q}$ are fixed in an arbitrary way and let us consider the oriented closed path $\bar C$ obtained by "circumnavigating" $T$, for example in the clockwise direction; this path contains twice each branch of $T$, with both possible orientations. We shall still call $\bar C$ the oriented closed path obtained by continuous deformation of $\bar C$, as the points $\{u_j\}_{j=1}^{q}$ vary in $\mathbb{R}^2$. The path $\bar C$ allows us to reorder the points $w_1, \ldots, w_l$ into $w_{t(1)}, \ldots, w_{t(l)}$, by putting $t(1) = 1$ and by choosing $t(i+1)$, $1 \le i \le l-1$, so that $w_{t(i+1)}$ is the point following $w_{t(i)}$ on $\bar C$. The chain $C$ is obtained by joining with a segment $w_{t(i)}$ and $w_{t(i+1)}$, for $i = 1, \ldots, l-1$; the condition $2\|T\| \ge \|C\|$ then easily follows from the triangle inequality for the function $x \to |x|^{1/2}$.

As a consequence of the above lemma and (3.10), we can extract from the propagator bounds, for each choice of $T_v$, a factor $\prod_{v\in\tau^*}e^{-c'\sqrt{\gamma^{h_v}}D_{C_v}(\mathbf{w}_v)}$, which does not depend on the internal points positions, by leaving a factor $e^{-(c/2)\sqrt{\gamma^j|x-y|}}$ for each propagator of $T_\tau$, to be used for bounding the integral over the internal points.
The final bound of $S_{0,m,\tau,\sigma}(x)$ will be obtained by "undoing", in the r.h.s. of Eq. (2.36) of [BFM], the sum over $T_v$ for any $v \in \tau^*$ (note that $C_v$ depends on $T^*_v$ and hence on $T_v$), then adding the factors coming from the previous considerations, together with a factor taking into account that there are $1 + \sum_{v\in\tau^*}(s^*_v - 1) = m$ fewer integrations to do. By suitably choosing them, the lacking integrations produce in the bound an extra factor $L^{-2}\prod_{v\in\tau^*}\gamma^{2h_v(s^*_v-1)}$, so that we get
\[ |S_{0,m,\tau,\sigma}(x)| \le C^m(C\bar\lambda_{j_0})^n\gamma^{-j_0(-2+m)}\Bigl[\prod_{v\ \text{not e.p.}}\Bigl(\frac{Z_{h_v}}{Z_{h_v-1}}\Bigr)^{|P_v|/2}\gamma^{-d_v}\Bigr] \cdot \Bigl[\prod_{i=1}^{m}\frac{Z^{(1)}_{h_i}}{Z_{h_i}}\Bigr]\Bigl(\prod_{\text{n.t. } v\in\tau^*}\frac{1}{s_v!}\sum_{T_v}\gamma^{2h_v(s^*_v-1)}e^{-c'\sqrt{\gamma^{h_v}}D_{C_v}(\mathbf{w}_v)}\Bigr), \tag{3.11} \]
where $h_i$ is the scale of the $i$-th endpoint of type $J$ and
\[ d_v \stackrel{def}{=} -2 + m_v + |P_v|/2 + z_v, \tag{3.12} \]
with $m_v = |x_v|$ and $z_v$ equal to the parameter $\tilde z(P_v)$ defined by Eq. (2.38) of [BFM].

We have now to bound the integral of $S_{0,m,\tau,\sigma}(x)\chi_\Lambda(x)$, let us call it $I_{m,\tau,\sigma}$. In order to exploit the improvement related with the restriction of the integration variables to a fixed volume of size 1, we shall proceed in a way different with respect to that followed in [BFM], that is we bound the integral before the sums over the trees $T_v$. We use the bound:
\[ \int dx\,\chi_\Lambda(x)\, e^{-c\sqrt{\gamma^h|x-y|}} \le C\begin{cases}\gamma^{-2h} & \text{if } h > 0, \\ 1 & \text{if } h \le 0.\end{cases} \tag{3.13} \]
The sum over the tree graphs is done in the usual way and we get
\[ I_{m,\tau,\sigma} \le C^m(C\bar\lambda_{j_0})^n\gamma^{-j_0(-2+m)}\Bigl[\prod_{\text{n.t. } v\in\tau^*}\gamma^{2h_v(s^*_v-1)}\Bigr]\Bigl[\prod_{i=1}^{m}\frac{Z^{(1)}_{h_i}}{Z_{h_i}}\Bigr] \cdot \Bigl[\prod_{v\ \text{not e.p.}}\Bigl(\frac{Z_{h_v}}{Z_{h_v-1}}\Bigr)^{|P_v|/2}\gamma^{-d_v}\Bigr]\Bigl[\prod_{\substack{\text{n.t. } v\in\tau^* \\ h_v>0}}\gamma^{-2h_v(s^*_v-1)}\Bigr]. \tag{3.14} \]
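The two cases of the bound (3.13) can be verified by a direct computation. What follows is a sketch, assuming that the integration is over $\mathbb{R}^2$ and that $\chi_\Lambda$ is the characteristic function of a region $\Lambda$ of volume 1, as stated in the text; the explicit constant is only illustrative.

```latex
% Case h > 0: use \chi_\Lambda \le 1 and rescale x - y = \gamma^{-h} u in R^2:
\int dx\, e^{-c\sqrt{\gamma^{h}|x-y|}}
  = \gamma^{-2h} \int_{\mathbb{R}^2} du\, e^{-c\sqrt{|u|}}
  = \gamma^{-2h}\, 2\pi \int_0^\infty dr\, r\, e^{-c\sqrt{r}}
  = \gamma^{-2h}\, 4\pi \int_0^\infty dt\, t^{3} e^{-ct}
  = \frac{24\pi}{c^{4}}\,\gamma^{-2h} .
% (substitution r = t^2, dr = 2t\,dt, and \int_0^\infty t^3 e^{-ct}\,dt = 3!/c^4)
%
% Case h \le 0: bound the exponential by 1 and use |\Lambda| = 1:
\int dx\, \chi_\Lambda(x)\, e^{-c\sqrt{\gamma^{h}|x-y|}}
  \le \int dx\, \chi_\Lambda(x) = |\Lambda| = 1 .
```

The gain with respect to the infinite volume estimate is thus confined to the scales $h \le 0$, which is why only the vertices with $h_v > 0$ carry the factor $\gamma^{-2h_v(s^*_v-1)}$ in (3.14).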
Let us now call $E_i$ the family of trivial vertices belonging to the branch of $\tau^*$ which connects $v^*_i$ with the higher non trivial vertex of $\tau^*$ preceding it and note that, by the remark preceding (3.8), $Z^{(1)}_{h_i}/Z_{h_i} \le C\gamma^{-h_i\bar\eta}$, with $\bar\eta = c_0\lambda + O(\bar\lambda^2_{j_0})$, $c_0 > 0$. Hence, the definition of $s^{*,1}_v$ implies that, if $E = \bigcup_i E_i$,
\[ \Bigl[\prod_{i=1}^{m}\frac{Z^{(1)}_{h_i}}{Z_{h_i}}\Bigr] \le C^m\Bigl[\prod_{v\in E}\gamma^{-\bar\eta}\Bigr]\prod_{\text{n.t. } v\in\tau^*}\gamma^{-h_v\bar\eta s^{*,1}_v}. \tag{3.15} \]
Let $v^*_0$ be the first vertex with $s^*_v \ge 2$ following $v_0$ (recall that $v_0$ is the vertex immediately following the root of $\tau$, of scale $j_0 + 1$); since $m \ge 2$, this vertex is certainly present. Then, since $m_v = m$ for $v_0 \le v \le v^*_0$, we have the identity
\[ \gamma^{-j_0(-2+m)}\prod_{v_0\le v\,\cdots}\gamma^{-d_v} = \gamma^{-h_{v^*_0}(-2+m_{v^*_0})}\prod_{v_0\le v\,\cdots}\gamma^{-\tilde d_v}, \tag{3.16} \]
γ
−2h v (sv∗ −1)
n.t.v∈τ ∗
|Pv |/2
γ
−d¯v
,
(3.26)
with αv = α v + ε(sv∗,2 − 1) and d¯v = dv − ε, if v0∗ ≤ v ∈ E; and d¯v = dv otherwise. Let us now define χv = 1 if h v > 0 and χv = 0 for h v ≤ 0. If we put w = v0∗ , we can write γ
εh v ∗
0
" γ
αv h v
n.t.v∈τ ∗
h v >0
# γ
−2h v (sv∗ −1)
n.t.v∈τ ∗ ∗
= γ [αw +ε−2χw (sw −1)]h w
∗
γ [αv −2χv (sv −1)]h v ,
(3.27)
n.t.v∈τ ∗ v =w
and, if |λ| 0
∗
γ [αw +ε−2(sw −1)]h +
γ [αw +ε]h .
(3.30)
h≤0
The second sum is always finite since $\alpha_w + \varepsilon \ge \varepsilon s^*_w \ge 2\varepsilon$. Regarding the first sum, we note that
\[ \alpha_w + \varepsilon - 2(s^*_w - 1) = 2 - (1 + \bar\eta)s^*_w - s^{*,2}_w(1 - \varepsilon - \bar\eta) \le 2 - (1 + \bar\eta)s^*_w. \tag{3.31} \]
Hence, the sum is always bounded, except in the case $\bar\eta \le 0$ (that is $\lambda \le 0$) with $s^*_w = s^{*,1}_w = 2$. It follows, by (3.8), that
\[ \int dz\,\chi_\Lambda(z)\,|S^{(m)}(z, \sigma)| \le m!\,C^m, \qquad m \ge 3, \tag{3.32} \]
so that, by (3.7), the pressure can be defined only by subtracting from $G^{(0,0;\mu)}$ the term with $m = 2$. The renormalized pressure is analytic in $\mu$, for $\mu$ small enough.

3.2. Case q ≥ 2, r = 0.
We have for $S^{(m)}(z\,x; \sigma, \sigma')$ an expansion analogous to (3.8), but now the special endpoints are associated with two different types of space-time points, those which have to be integrated as before ($x$) and those which are fixed ($z$). We denote by $x_v$ and $z_v$ the points following $v$ of the two types and we slightly modify the definition of the point $w_v$ to be one point in $z_v$, if $z_v \ne \emptyset$, or one point in $x_v$, otherwise; we still require that $w_v \in \mathbf{w}_v$. We want to mimic the strategy used for the Sine-Gordon correlation functions. Therefore we introduce a new tree $\tau^o$, that is obtained from $\tau^*$ by erasing all the vertices which are not needed to connect the $q$ special endpoints carrying a space-time point of type $z$. (We remark that the roles of the trees $\tau$ and $\tau^*$ of the bosonic theory are played here by $\tau^*$ and $\tau^o$ respectively.) Correspondingly, we define $s^o_v$ as the number of the branches of $\tau^o$ following $v \in \tau^o$; note that the space-time points associated with the endpoints of $\tau^o$ following $v \in \tau^o$ are those in $z_v$, hence $\sum_{w\ge v}(s^o_w - 1) = |z_v| - 1$. A bound similar to (3.11) holds. In this case, anyway, we prefer to have a separate decaying factor in the distance of the points $z$: for each nontrivial vertex $v$ of $\tau^o$,
\[ D_{C_v}(\mathbf{w}_v) \ge \frac{1}{2}D_{\tilde C_v}(z_v\cap\mathbf{w}_v) + \frac{1}{2}D_{C_v}(\mathbf{w}_v), \tag{3.33} \]
where $\tilde C_v$ denotes the ordered path connecting the points in $(z_v\cap\mathbf{w}_v)$, made of lines which connect a point with that following it in the ordered path $C_v$, see Lemma 3.1.
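Therefore, in place of (3.11), we have:
\[ |S_{0,m,\tau,\sigma}(z, x)| \le C^m(C\bar\lambda_{j_0})^n\gamma^{-j_0(-2+m)}\Bigl[\prod_{v\ \text{not e.p.}}\Bigl(\frac{Z_{h_v}}{Z_{h_v-1}}\Bigr)^{|P_v|/2}\gamma^{-d_v}\Bigr] \cdot \Bigl[\prod_{i=1}^{m}\frac{Z^{(1)}_{h_i}}{Z_{h_i}}\Bigr]\prod_{\text{n.t. } v\in\tau^*}\frac{1}{s_v!}\sum_{T_v}\gamma^{2h_v(s^*_v-1)}\exp\Bigl[-\frac{c'}{2}\gamma^{\frac{h_v}{2}}D_{C_v}(\mathbf{w}_v)\Bigr] \cdot \prod_{\text{n.t. } v\in\tau^o}\exp\Bigl[-\frac{c'}{2}\gamma^{\frac{h_v}{2}}D_{\tilde C_v}(z_v\cap\mathbf{w}_v)\Bigr]. \tag{3.34} \]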
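We can repeat, with no essential modification, the steps that from (3.11) have led to (3.14). Hence, if we call $I_{m,\tau,\sigma}(z)$ the integral over $x$ of $S_{0,m,\tau,\sigma}(z\,x)\chi_\Lambda(x)$, we get the bound:
\[ I_{m,\tau,\sigma}(z) \le C^m(C\bar\lambda_{j_0})^n\gamma^{-j_0(-2+m)}\Bigl[\prod_{\text{n.t. } v\in\tau^*}\gamma^{2h_v(s^*_v-1)}\Bigr] \cdot \Bigl[\prod_{i=1}^{m}\frac{Z^{(1)}_{h_i}}{Z_{h_i}}\Bigr]\Bigl[\prod_{v\ \text{not e.p.}}\Bigl(\frac{Z_{h_v}}{Z_{h_v-1}}\Bigr)^{|P_v|/2}\gamma^{-d_v}\Bigr]\Bigl[\prod_{\substack{\text{n.t. } v\in\tau^* \\ h_v>0}}\gamma^{-2h_v(s^*_v-1)}\Bigr] \cdot \Bigl[\prod_{\substack{\text{n.t. } v\in\tau^o \\ h_v>0}}\gamma^{2h_v(s^o_v-1)}\Bigr]\prod_{\text{n.t. } v\in\tau^o}\exp\Bigl[-\frac{c'}{2}\gamma^{\frac{h_v}{2}}D_{\tilde C_v}(z_v\cap\mathbf{w}_v)\Bigr]. \tag{3.35} \]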
Indeed, we observe that the chain $C_v$ is a spanning tree of propagators with root in one of the $z_v$ points (if any, see the definition of $w_v$). Hence, integrating down the position of the vertices $x_v$ from the endpoints of such a tree to the root, in the case at hand there are, with respect to the procedure for $q = 0$, $s^o_v - 1$ missing integrations for each nontrivial vertex $v$ of the tree $\tau^o$. By (3.13), this means a factor $\gamma^{-2h_v(s^o_v-1)}$ less, if $h_v > 0$, and a constant factor less, if $h_v \le 0$; this explains the last line of (3.35). Going on in parallel with §3.1, we obtain the analogue of (3.29); recalling that $w$ is the lowest n.t. vertex of the tree $\tau^*$,
\[ I_{m,\tau,\sigma}(z) \le C^m(C\bar\lambda_{j_0})^n\gamma^{[\alpha_w+\varepsilon-2\chi_w(s^*_w-1)]h_w}\Bigl[\prod_{v\ \text{not e.p.}}\Bigl(\frac{Z_{h_v}}{Z_{h_v-1}}\Bigr)^{|P_v|/2}\gamma^{-\bar d_v}\Bigr] \cdot \Bigl[\prod_{\substack{\text{n.t. } v\in\tau^o \\ h_v>0}}\gamma^{2h_v(s^o_v-1)}\Bigr]\prod_{\text{n.t. } v\in\tau^o}\exp\Bigl[-\frac{c'}{2}\gamma^{\frac{h_v}{2}}D_{\tilde C_v}(z_v\cap\mathbf{w}_v)\Bigr]. \tag{3.36} \]
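At this point, in contrast with the pressure bound, we want to take advantage of the exponential fall off in the diameter of $z_v\cap\mathbf{w}_v$ to prove the convergence of the correlations (with $q \ge 2$) for any sign of $\bar\eta$. Note that our definitions imply that $\bigcup_{\text{n.t. } v\in\tau^o}(z_v\cap\mathbf{w}_v) = z$ and that $\bigcup_{\text{n.t. } v\in\tau^o}\tilde C_v$ is a tree connecting all the points in $z$. This remark, together with the trivial bound $h_v \ge h_{v^*_0}$, implies that
\[ \prod_{\text{n.t. } v\in\tau^o}\exp\Bigl[-\frac{c'}{4}\gamma^{\frac{h_v}{2}}D_{\tilde C_v}(z_v\cap\mathbf{w}_v)\Bigr] \le \exp\Bigl[-\frac{c'}{4}\sqrt{\gamma^{h_{v^*_0}}\,\mathrm{diam}(z)}\Bigr]. \tag{3.37} \]
On the other hand, since $|z_v\cap\mathbf{w}_v| \ge 2$ for any n.t. $v \in \tau^o$, if we define $\delta \stackrel{def}{=} \min_{i\ne j}|z_i - z_j|$, we have
\[ \gamma^{2h_v(s^o_v-1)}\exp\Bigl[-\frac{c'}{4}\gamma^{\frac{h_v}{2}}D_{\tilde C_v}(z_v\cap\mathbf{w}_v)\Bigr] \le \Bigl(\frac{C}{\delta}\Bigr)^{2(s^o_v-1)}(s^o_v-1)^{4(s^o_v-1)}, \tag{3.38} \]
so that, by using also the identity $\sum_{v\in\tau^o}(s^o_v - 1) = q - 1$,
\[ \Bigl[\prod_{\substack{\text{n.t. } v\in\tau^o \\ h_v>0}}\gamma^{2h_v(s^o_v-1)}\Bigr]\prod_{\text{n.t. } v\in\tau^o}\exp\Bigl[-\frac{c'}{2}\sqrt{\gamma^{h_v}\,\mathrm{diam}(z_v\cap\mathbf{w}_v)}\Bigr] \le \Bigl(\frac{C}{\delta}\Bigr)^{2(q-1)}[(q-1)!]^4\exp\Bigl[-\frac{c'}{4}\sqrt{\gamma^{h_{v^*_0}}\,\mathrm{diam}(z)}\Bigr]. \tag{3.39} \]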
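The single-vertex bound (3.38) reduces to maximizing a power times an exponential. The sketch below assumes that the chain $\tilde C_v$ contains at least one segment joining two distinct points of $z$, so that $D_{\tilde C_v}(z_v\cap\mathbf{w}_v) \ge \sqrt{\delta}$; the shorthands $n$, $t$ and $a$, and the plain symbol $c$ for the constant in the exponent, are introduced here for illustration.

```latex
% Set n = s_v^o - 1 \ge 1, t = \gamma^{h_v/2}, a = (c/4)\sqrt{\delta}. Then
\gamma^{2h_v n}\exp\Bigl[-\frac{c}{4}\gamma^{\frac{h_v}{2}} D_{\tilde C_v}\Bigr]
  \le t^{4n} e^{-a t}
  \le \sup_{t\ge 0} t^{4n} e^{-at}
  = \Bigl(\frac{4n}{a e}\Bigr)^{4n}
  = \Bigl(\frac{16\,n}{c\,e\,\sqrt{\delta}}\Bigr)^{4n}
  = \Bigl(\frac{C}{\delta}\Bigr)^{2n} n^{4n},
\qquad C = \Bigl(\frac{16}{c\,e}\Bigr)^{2} .
% The supremum is attained at t = 4n/a, where d/dt\,(4n\log t - at) = 0.
```

Taking the product over the nontrivial vertices of $\tau^o$ and using $\sum_{v\in\tau^o}(s^o_v - 1) = q - 1$ together with $n^n \le e^n n!$ then leads to a bound of the form appearing on the right side of (3.39), up to a redefinition of $C$.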
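Let us now remark that the quantity
\[ \bigl[1 + \mathrm{diam}(z)^{(\alpha_w+\varepsilon)}\bigr]\sum_{h=-\infty}^{+\infty}\gamma^{[\alpha_w+\varepsilon-2\chi_w(s^*_w-1)]h}\exp\Bigl[-c_0\sqrt{\gamma^h\,\mathrm{diam}(z)}\Bigr] \tag{3.40} \]
is bounded by a constant. In fact, the series is convergent also without the exponential, as shown before, and this is sufficient, if $\mathrm{diam}(z) \le 1$; if $\mathrm{diam}(z) = \gamma^{-h_0}$, $h_0 \le 0$, we can bound the series by $2\sum_{h=-\infty}^{+\infty}\gamma^{(\alpha_w+\varepsilon)h}\exp[-c_0\gamma^h]$, which is convergent, since $\alpha_w + \varepsilon \ge 2\varepsilon$. Hence we get, by using $2\varepsilon \le \alpha_w + \varepsilon \le q(1 + \varepsilon - \bar\eta)$, that there is a constant $C_q$, such that
\[ \int dx\,\chi_\Lambda(x)\,|S^{(m)}(z, x, \sigma)| \le m!\,\bigl(1 + \delta^{-2(q-1)}\bigr)\frac{C_q}{1 + \mathrm{diam}(z)^{2\varepsilon}}. \tag{3.41} \]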
3.3. Case r ≥ 1, q = 0.
This case is very similar to the previous one; therefore we limit ourselves to the discussion of the differences. Formula (3.34) still holds, with $y_v$ in place of $z_v$ (to be consistent with the notation in (3.7)) and with the replacement $\prod_{i=1}^{m}(Z^{(1)}_{h_i}/Z_{h_i}) \to \prod_{i=1}^{p}(Z^{(1)}_{h_i}/Z_{h_i})$, following from the fact that the strength renormalization of the field $\bar\psi\gamma^\mu\psi$ is equal to $Z_h$. It is easy to go along the developments of §3.2 again, up to a couple of differences. The minor one is that in formulas (3.15) and (3.24) the set $E$ has to be replaced with the set $E\setminus Y$, where $Y$ is the family of trivial vertices of $\tau^*$ belonging to the branches ending up with an endpoint of type $y$; but this is not a problem, since the dimensions of all the vertices remain strictly positive. The major difference is that in (3.15), in the case at hand, there is $h_v\bar\eta(s^{*,1}_v - t^{*,1}_v)$ in place of $h_v\bar\eta s^{*,1}_v$, if $t^{*,1}_v$ is the number of branches departing from $v$ and ending up with one endpoint of type $y$ (hence $0 \le t^{*,1}_v \le s^{*,1}_v$). At the end of the developments, the latter fact generates a new $\alpha_v$, that we have to prove to be positive in order to control the bound in the vertices $v \ne w$ such that $h_v \le 0$ (as done in (3.28) for the old one). With simple computations we find:
\[ \alpha_v = \varepsilon(s^*_v - 1) + (s^{*,1}_v - t^{*,1}_v)(1 - \bar\eta - \varepsilon) + t^{*,1}_v(1 - \varepsilon) \ge \varepsilon. \tag{3.42} \]
Also, we need to prove that $\alpha_v - 2(s^*_v - 1)$ is negative, in order to control the bound in the vertices $v \ne w$ such that $h_v > 0$; and indeed:
\[ \alpha_v - 2(s^*_v - 1) = (2 - \varepsilon) - s^*_v - (s^{*,1}_v - t^{*,1}_v)\bar\eta - s^{*,2}_v(1 - \varepsilon) \le (2 - \varepsilon) - (1 - |\bar\eta|)s^*_v < 0. \tag{3.43} \]
Finally, the summation on the scale of $w$ is controlled by the exponential fall off in the diameter of $y$, as in (3.40).

4. Explicit Expression of the Coefficients in the Mass Expansion and Proof of Theorem 1.1
4.1. The case r = 0.
As explained in the remark preceding (3.7), in order to get an explicit expression for the coefficients of the expansion (3.7), it is sufficient to calculate the correlations $S^{(m,0)}(x, \sigma)$. We now show how to get this result by computing the correlations of the $\psi$ field at non coinciding points. We consider the following generating function:
\[ W_{N,\varepsilon}(J) = \lim_{h\to-\infty}\log\int P_{h,N}(d\psi)\, e^{-\lambda Z_N^2 V(\psi) + \bar Z^{(1)}_N\sum_\sigma\int dx\,dy\, J^\sigma_x\delta_\varepsilon(x-y)\bar\psi_x\Gamma^\sigma\psi_y}, \tag{4.1} \]
where $\delta_\varepsilon(x)$ is a smooth approximation of the delta function, rotational invariant, whose support does not contain the point $x = 0$; for definiteness we will choose $\delta_\varepsilon(x) = \varepsilon^{-2}v(\varepsilon^{-1}|x|)$, $v(\rho)$ being a function on $\mathbb{R}$ with support in $[1, 2]$, such that $\int d\rho\,\rho\,v(\rho) = (2\pi)^{-1}$ (so that $\int dx\,\delta_\varepsilon(x) = 1$). We define
\[ \bar S^{(m)}_{N,\varepsilon}(x, \sigma) = \frac{\partial^m}{\partial J^{\sigma_1}_{x_1}\cdots\partial J^{\sigma_m}_{x_m}}W_{N,\varepsilon}(J)\Big|_{J=0}, \tag{4.2} \]
while $S^{(m)}_{N,0}(x, \sigma)$ will denote the analogous quantity with $\delta_\varepsilon(x - y) \to \delta(x - y)$. Note that $S^{(m,0)}(x, \sigma) = \lim_{N\to\infty}S^{(m)}_{N,0}(x, \sigma)$.
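The normalization and the support of $\delta_\varepsilon$ follow from a change of variables in polar coordinates. The lines below are a short check, using only the stated properties of $v$ (support in $[1, 2]$ and $\int d\rho\,\rho\,v(\rho) = (2\pi)^{-1}$).

```latex
% Polar coordinates in R^2 and the substitution r = \varepsilon\rho:
\int dx\, \delta_\varepsilon(x)
  = 2\pi \int_0^\infty dr\, r\, \varepsilon^{-2}\, v(\varepsilon^{-1} r)
  = 2\pi \int_0^\infty d\rho\, \rho\, v(\rho) = 1 ,
% and, since v is supported in [1,2],
\operatorname{supp}\delta_\varepsilon \subset \{x : \varepsilon \le |x| \le 2\varepsilon\} .
```

So $\delta_\varepsilon$ never sees the coinciding point $x = y$, while its support shrinks to it as $\varepsilon \to 0$.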
Lemma 4.1. If $\lambda$ is small enough, there exists a constant $c_1 = 1 + O(\lambda)$, such that, if we put $\bar Z^{(1)}_N = c_1\varepsilon^{\eta_+}$, then, for any set $x$ of $m$ distinct points,
\[ \lim_{\varepsilon\to 0}\lim_{N\to\infty}\bar S^{(m)}_{N,\varepsilon}(x, \sigma) = \lim_{N\to\infty}S^{(m)}_{N,0}(x, \sigma). \tag{4.3} \]
Proof. The proof of the lemma is based on a multiscale analysis of the functional $W_{N,\varepsilon}(J)$, performed by using the techniques explained in Sect. 2 of [BFM]. We shall not give here the detailed proof, but we shall stress only the relevant differences with respect to the case studied there. First of all, the external field $\varphi$ is zero and the free measure has mass zero. Moreover the terms linear in $J$ and quadratic in $\psi$ contain the monomial $\psi^+_{x,-\sigma}\psi^-_{y,\sigma} = \bar\psi_x\Gamma^\sigma\psi_y$, instead of $\psi^+_{x,\sigma}\psi^-_{y,\sigma}$. This difference is unimportant from the point of view of dimensional analysis, so that, in the case $\varepsilon = 0$, we can essentially repeat the analysis of [BFM] with obvious minor changes. The situation is different for $\varepsilon > 0$, since in this case these terms (which are marginal) are not local on the scale $N$, so that they need a more accurate discussion.

Let us call $B^{(j)}_J(\psi)$ the contribution to the effective potential on scale $j$, which is linear in $J$ and has as external fields $\psi^{[h,j]+}_{x,\omega}$ and $\psi^{[h,j]-}_{y,-\omega}$, and let $h_\varepsilon$ be the largest integer such that $\gamma^{-h_\varepsilon} \ge \varepsilon$ and let $N > h_\varepsilon$. We want to show that, if $N \ge j \ge h_\varepsilon$, this term, which is dimensionally marginal, is indeed irrelevant, so there is no need to localize it. This follows from the observation that $B^{(j)}_J(\psi)$ is of the form
\[ B^{(j)}_J(\psi) = \bar Z^{(1)}_N\sum_\omega\int dx\,dy\, J^\omega_x\delta_\varepsilon(x - y)\psi^{[h,j]+}_{x,-\omega}\psi^{[h,j]-}_{y,\omega} + \sum_\omega\int dz\, J^\omega_z\int d\bar z\,dx\,dy\,\delta_\varepsilon(z - \bar z)W_j(z, \bar z, x, y)\psi^{[h,j]+}_{x,-\omega}\psi^{[h,j]-}_{y,\omega}, \tag{4.4} \]
where W j (z, z¯ , x, y) is the kernel of the sum over all graphs containing at least one λ vertex. It is easy to see that it is of the form $ j (z, x)W $ j (¯z, y) + W¯ j (z, z¯ , x, y), W j (z, z¯ , x, y) = W (4.5) where the second term is given by the sum over the graphs which stay connected after cutting the line δε , while the first term is associated with the other graphs. The first term $ j (z, x) and W $ j (¯z, y) are sums does not need a localization, even for j < h ε , because W over graphs with two external lines, one (the one contracted with the J vertex) of scale h 1 > j, the other one of scale h 2 ≤ j. The momentum conservation and the compact support properties of the single scale propagators imply that h 1 = j + 1, so that there is no diverging sum associated with h 1 , as one could expect since the first term has a bound C|λ|. On the other hand, it is easy to see that the second term satisfies the bound (4.6) d z¯ dxdyδε (z − z¯ )|W¯ j (z, z¯ , x, y)| ≤ C|λ|γ −2( j−h ε ) . This immediately follows by comparing this bound with the analogous one for ε = 0, which is C|λ| for dimensional reasons. With respect to the case ε = 0, we have a new vertex z¯ , which is linked to the graph by the line δε and a propagator of scale j > j. The bound (4.6) is obtained by using the decaying properties of this propagator to integrate over z¯ and by bounding δε by Cε−2 . Note that this procedure is convenient only because j ≥ h ε , otherwise it would be convenient to integrate over z¯ by using δε and we should get the dimensional bound C|λ| ( j) of the case ε = 0. It follows that, starting from j = h ε , we have to apply to B J (ψ) the localization procedure; then we define, if j ≥ h ε ,
$$\mathcal L B_J^{(j)}(\psi) = \bar Z_j^{(1)} \sum_\omega \int dz\, J_{z,\omega}\,\psi^{[h,j]+}_{z,-\omega}\psi^{[h,j]-}_{z,\omega}, \qquad (4.7)$$

and we perform the limit $N \to \infty$. In this limit, $\bar Z_j^{(1)}$ can be represented as an expansion in terms of trees, which have one special vertex (the $J$ vertex) and an arbitrary number of normal vertices, the normal vertices being associated with the limiting value $\lambda_{-\infty}$ of the running coupling (whose flow is independent of the $\bar Z_j^{(1)}$ flow). It follows that $\bar Z^{(1)}_{h_\varepsilon} = c_1\gamma^{-h_\varepsilon\eta_+}[1 + O(\lambda)]$ and that, if $j < h_\varepsilon$,

$$\bar Z^{(1)}_{j-1} = \bar Z^{(1)}_j\gamma^{\eta_+} + O(|\lambda|\gamma^{-h_\varepsilon\eta_+}\gamma^{-(h_\varepsilon - j)/2}), \qquad (4.8)$$

where the first term comes from the trees with the special vertex of scale $\le h_\varepsilon$; it is exactly equal to the term one would get in the theory with $\varepsilon = 0$, in the limit $N \to \infty$.
The second term is the contribution of the trees with the special vertex of scale $> h_\varepsilon$ (these trees must have at least one normal vertex); it is of course proportional to $\varepsilon^{\eta_+}$ and takes into account the “short memory property” (exponential decrease of the influence of the irrelevant terms). The flow (4.8) immediately implies that, for any fixed $j$ and $|\eta_+| < 1/2$, $\lim_{\varepsilon\to 0}(\bar Z^{(1)}_{j-1}/\bar Z^{(1)}_j) = \gamma^{\eta_+} = (Z^{(1)}_{j-1}/Z^{(1)}_j)$ and that $\bar Z^{(1)}_j = c_1[1 + O(\lambda)]Z^{(1)}_j$. Hence, by suitably choosing $c_1$, we can get $\lim_{\varepsilon\to 0}\bar Z^{(1)}_j = Z^{(1)}_j$.

Note that $S^{(m)}(x,\omega)$ is different from 0 only if $m$ is even and $\sum_i \omega_i = 0$; moreover the truncated correlations can be written as sums over the non truncated ones. Hence, in order to get an explicit formula for $S^{(m)}(x,\omega)$, it is sufficient to calculate the correlation

$$K^{(n)}(x,u) = \lim_{-h,N\to\infty} (Z_N^{(1)})^{2n} \Big\langle \prod_{j=1}^n \bar\psi_{x_j}\Gamma^+\psi_{x_j}\,\bar\psi_{u_j}\Gamma^-\psi_{u_j} \Big\rangle, \qquad (4.9)$$
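where $\langle\cdot\rangle$ denotes the expectation with respect to the zero mass Thirring measure. By using Lemma 4.1, we have

$$K^{(n)}(x,u) = c_1^{2n}\lim_{\varepsilon\to 0}\varepsilon^{2n\eta_+}\int dy\,dv\,\Big[\prod_{i=1}^n \delta_\varepsilon(x_i - y_i)\,\delta_\varepsilon(u_i - v_i)\Big]\,\widetilde K^{(2n)}(x,y,u,v), \qquad (4.10)$$

where

$$\widetilde K^{(2n)}(x,y,u,v) = \Big\langle \prod_{j=1}^n \bar\psi_{y_j}\Gamma^+\psi_{x_j}\,\bar\psi_{v_j}\Gamma^-\psi_{u_j} \Big\rangle. \qquad (4.11)$$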
On the other hand, by using the results of [BFM], see Theorem A.1 below, one can prove that, if $\langle\cdot\rangle_0$ is the mean value for $\lambda = 0$ and $\psi_i^- \stackrel{\rm def}{=} \psi^-_{x_i,\omega_i}$, $\psi_i^+ \stackrel{\rm def}{=} \psi^+_{y_i,\omega_i}$,
ψn− · · · ψ1− ψ1+ · · · ψn+
=
i
¯ c0λA(a−a)n ψn− · · · ψ1− ψ1+ · · · ψn+ 0
·
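s 1, $S_{\omega,\omega}(x,y)$ satisfy (A.10), that, after suitable derivatives in the fields, reads

$$\partial^{\omega_1}_{x_1} S^{(2n)}_{\omega,\omega}(x,y) = \lambda A\Big[\sum_{h=2}^n A_{-\omega_1\omega_h}\,g_{-\omega_1}(x_1 - x_h) - \sum_{h=1}^n A_{-\omega_1\omega_h}\,g_{-\omega_1}(x_1 - y_h)\Big] S^{(2n)}_{\omega,\omega}(x,y) + \lambda A \sum_{h=2}^n \sum^{**}_{X_1,X_h} \sum_{\pi\in P_{X_1,X_h}} (-1)^\pi M^{n,h}_{X_1,X_h}(x, y\circ\pi) \qquad (A.41)$$

with

$$M^{n,h}_{X_1,X_h}(x,y) \stackrel{\rm def}{=} \Big[A_{-\omega_1\omega_h}\,g_{-\omega_1}(x_1 - x_h) - A_{-\omega_1\omega_h}\,g_{-\omega_1}(x_1 - y_h)\Big] \cdot S^{(2|X_1|)}_{\omega_{X_1},\omega_{X_1}}(x_{X_1}, y_{X_1})\, S^{(2|X_h|)}_{\omega_{X_h},\omega_{X_h}}(x_{X_h}, y_{X_h}). \qquad (A.42)$$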
References

[B] Benfatto, G.: An iterated mayer expansion for the yukawa gas. J. Stat. Phys. 41, 671–684 (1985)
[BF] Brydges, D., Federbush, P.: Debye screening. Commun. Math. Phys. 73, 197–246 (1980)
[BGN] Benfatto, G., Gallavotti, G., Nicolò, F.: On the massive sine-gordon equation in the first few regions of collapse. Commun. Math. Phys. 83, 387–410 (1982)
[BK] Brydges, D., Kennedy, T.: Mayer expansion and the hamilton-jacobi equation. J. Stat. Phys. 48, 14–49 (1987)
[BM] Benfatto, G., Mastropietro, V.: Ward identities and chiral anomaly in the luttinger liquid. Commun. Math. Phys. 258, 609–655 (2005)
[BFM] Benfatto, G., Falco, P., Mastropietro, V.: Functional integral construction of the massive thirring model: verification of axioms and massless limit. Commun. Math. Phys. 273, 67–118 (2007)
[Br] Brydges, D.: A short course on Cluster Expansions, Les Houches 1984, K. Osterwalder, R. Stora, eds., Amsterdam: North Holland Press, 1986
[C] Coleman, S.: Quantum sine-gordon equation as the massive thirring model. Phys. Rev. D 11, 2088–2097 (1975)
[D] Dimock, J.: Bosonization of massive fermions. Commun. Math. Phys. 198, 247–281 (1998)
[DH] Dimock, J., Hurd, T.R.: Construction of the two-dimensional Sine-Gordon model for β < 8π, Commun. Math. Phys. 156, 547–580 (1993); corrected by the same authors in Ann. Henri Poincaré 1, 499–541 (2000)
[F] Folland, G.B.: Introduction to partial differential equations, Mathematical Notes, Princeton, NJ: Princeton Univ. Press, 1976
[FGS] Furuya, K., Gamboa Saravi, S., Schaposnik, F.A.: Path integral formulation of chiral invariant fermion models in two dimensions. Nucl. Phys. B 208, 159–181 (1982)
[FS] Fröhlich, J., Seiler, E.: The massive thirring-schwinger model (qed2): convergence of perturbation theory and particle structure. Helv. Phys. Acta 49, 889–924 (1976)
[ID] Itzykson, C., Drouffe, J.: Statistical field theory, Cambridge Monographs in Mathematical Notes, Cambridge: Cambridge Univ. Press, 1989
[J] Johnson, K.: Solution of the equations for the green’s functions of a two dimensional relativistic field theory. Nuovo Cimento 20, 773–790 (1961)
[K] Klaiber, B.: The Thirring model. In: Quantum theory and statistical physics, Vol X A, Barut, A.O., Brittin, W.F., eds. London: Gordon and Breach, 1968
[Ha] Hagen, C.R.: New solutions of the thirring model. Nuovo Cimento 1, 5861–5878 (1967)
[H] Hua, L.: Harmonic analysis of functions of several complex variables in the classical domains. Providence, RI: Amer. Math. Soc., 1963
[N] Naon, C.M.: Abelian and non-abelian bosonization in the path integral framework. Phys. Rev. D 31, 2035–2044 (1976)
[M1] Mastropietro, V.: Non-perturbative Adler-Bardeen Theorem. J. Math. Phys. 48(2), 022302 (2007)
[M2] Mastropietro, V.: Non-perturbative aspects of chiral anomalies. J. Phys. A: Math. Theor. 33, 10349–10365 (2007)
[NRS] Nicolò, F., Renn, J., Steinmann, A.: On the massive sine-gordon equation in all regions of collapse. Commun. Math. Phys. 105, 291–326 (1986)
[P] Polchinski, J.: String duality. Rev. Mod. Phys. 68, 1245–1258 (1996)
[S] Seiler, E.: Phys. Rev. D 22, 2412–2418 (1980)
[So] Solyom, J.: Adv. in Phys. 28, 201–303 (1979)
[SU] Seiler, R., Uhlenbrock, D.A.: On the massive thirring model. Ann. Physics 105, 81–110 (1977)
[Y] Yang, W.-S.: Debye screening for two-dimensional coulomb systems at high temperatures. J. Stat. Phys. 49, 1–32 (1987)
Communicated by G. Gallavotti
Commun. Math. Phys. 285, 763–798 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0613-3
Communications in
Mathematical Physics
Scaling Algebras and Pointlike Fields
A Nonperturbative Approach to Renormalization

Henning Bostelmann1, Claudio D’Antoni1, Gerardo Morsella2

1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]; [email protected]
2 Scuola Normale Superiore di Pisa, Piazza dei Cavalieri, 7, 56126 Pisa, Italy. E-mail: [email protected]

Received: 22 December 2007 / Accepted: 6 May 2008
Published online: 22 October 2008 – © Springer-Verlag 2008
Dedicated to Klaus Fredenhagen on the occasion of his 60th birthday.

Abstract: We present a method of short-distance analysis in quantum field theory that does not require choosing a renormalization prescription a priori. We set out from a local net of algebras with associated pointlike quantum fields. The net has a naturally defined scaling limit in the sense of Buchholz and Verch; we investigate the effect of this limit on the pointlike fields. Both for the fields and their operator product expansions, a well-defined limit procedure can be established. This can always be interpreted in the usual sense of multiplicative renormalization, where the renormalization factors are determined by our analysis. We also consider the limits of symmetry actions. In particular, for suitable limit states, the group of scaling transformations induces a dilation symmetry in the limit theory.
Work supported by MIUR, GNAMPA-INDAM, and the EU network “Quantum Spaces – Non Commutative Geometry” (HPRN-CT-2002-00280).

1. Introduction

Renormalization has proven to be one of the key concepts of quantum field theory, in particular in the construction of models. We can roughly divide its mathematical and physical implications as follows, even though they are often mixed in one approach. First, renormalization has a constructive aspect: It serves as a tool to remove divergences, momentum-space cutoffs, or lattice restrictions from unphysical theories in order to arrive at a physical limit theory. This aspect is found both in perturbative approaches and in mathematically rigorous constructions of quantum field theory, usually in the Euclidean regime [FRS]. Second, renormalization is a means of short distance analysis: It allows one to pass from a given physical theory to a theory which describes the behavior at short distances, often expected to be simpler than the full theory, in particular when this limit theory is a
model of free particles (asymptotic freedom). This is the idea that underlies quantum chromodynamics and the parton picture in high energy physics. It is the second aspect that we are interested in here. We assume that a fully constructed, mathematically rigorous quantum field theory “at finite scales” is given, and investigate its short distance behavior in a physically natural setting. As Buchholz and Verch have shown [B-V1], it is possible to define the short-distance limit of such a theory in a model-independent way. This method is based on the algebraic approach to quantum field theory [Ha], and allows applications in particular to the charge structure of the theory in the scaling limit [Bu,DMV,D-M]. This algebraic approach to renormalization does not depend on technical details of the theory, such as a choice of generating quantum fields. In fact, it is not even necessary to make any reference to pointlike localized quantum fields at all, or even assume their existence; the renormalization limit can be defined referring only to the algebras of bounded operators associated with finite regions. On the other hand, it is not obvious how this approach relates to pointlike quantum fields, given that they exist in the theory, and how the usual picture of point field renormalization emerges. The present work aims at clarifying these questions. We set out from a theory given as a local net of algebras, but assume that pointlike fields are associated to these algebras in the way proposed in [Bo2]. Then we consider the scaling limit of the theory in the sense of Buchholz and Verch, and analyze its effect on the pointlike fields. Specifically, the description of fields in [Bo2] is based on a certain phase space condition. We show that, given that this condition holds at finite scales, it carries over to the scaling limit theory, so that also the limit theory has a well-described connection with pointlike quantities. 
We analyze in detail how limits of pointlike objects arise from the algebras, and show that a multiplicative renormalization of quantum fields naturally follows as a consequence of our setting. We also discuss the effect of renormalization transformations on the operator product expansion in the sense of Wilson [W,W-Z], which is known to have a precise meaning at finite scales [Bo3]. The operator product expansion plays an important role in this context: It reflects how renormalization transformations change the interaction of the theory, which is captured here in the structure constants of the “improper algebra” of pointlike fields.

On a heuristic level, our method can be understood as follows: In the usual field-theoretic setting, renormalization of a pointlike field $\phi(x)$ is established by a purely geometric scaling in space-time, combined with a multiplication by a c-number $Z_\lambda$ depending on scale. The field at scale $\lambda$ is given by

$$\phi_\lambda(x) = Z_\lambda\,\phi(\lambda x), \qquad (1.1)$$
where e.g. for a real scalar free field in physical space-time, one would choose $Z_\lambda = \lambda$. This $\phi_\lambda$ converges to a limit field $\phi_0$, e.g. in the sense of Wightman functions or of suitable matrix elements. The choice of $Z_\lambda$ is not unique, and may contain ambiguities to some extent, even if these do not influence the structure of the limit theory for a free field.

In the algebraic setting, one considers the set of all such possible renormalization schemes, without selecting a preferred choice. One abstractly works with functions $\lambda \mapsto \underline A_\lambda$, valued in the bounded operators, that are subject only to a geometric condition, $\underline A_\lambda \in \mathfrak A(\lambda\mathcal O)$ for some region $\mathcal O$, and to a continuity condition that serves to keep the unit of action constant in the limit. A typical function may be thought of as

$$\underline A_\lambda = \exp(i Z_\lambda\,\phi(f_\lambda)), \qquad f_\lambda(x) = \lambda^{-4} f(\lambda^{-1}x), \qquad (1.2)$$
where $f$ is a fixed test function. (This is up to a necessary smearing in space-time.) However, the explicit form of $Z_\lambda$ need not be fixed in this approach, since it does not influence the norm of $\underline A_\lambda$.

The important point of our analysis is now that the $Z_\lambda$ can be constructed from the algebraic setting, rather than being fixed from the outset. Namely, following [Bo2], the link between fields and bounded operators is given as follows: Bounded operators $A$ localized in a double cone $\mathcal O_r$ of radius $r$ centered at the origin can be approximated by local fields by means of a series expansion,

$$A \approx \sum_j \sigma_j(A)\,\phi_j, \qquad (1.3)$$

where $\sigma_j$ are normal functionals and $\phi_j$ are quantum fields localized at $x = 0$, both independent of $r$. The sum is to be understood as an asymptotic series in $r$. For example, for the real scalar free field in 3 + 1 dimensions, the first terms of the expansion are

$$A \approx (\Omega|A\Omega)\,1 + \sigma_1(A)\,\phi, \qquad (1.4)$$
where $\phi$ is the free field and $\sigma_1$ a certain matrix element with 1-particle functions (cf. [Bo2, Eq. A19]). Now this $\sigma_1$ has the property that $\|\sigma_1 \lceil \mathfrak A(\mathcal O_r)\| \sim r$. Thus inserting operators $\underline A_\lambda \in \mathfrak A(\lambda\mathcal O)$ with some fixed region $\mathcal O$, we obtain $\sigma_1(\underline A_\lambda) \sim \lambda$, and can expect that $Z_\lambda := \sigma_1(\underline A_\lambda)$ is a suitable renormalization factor for the field $\phi$. We shall see in Sect. 3 that this heuristic expectation can indeed be made precise. The central point here is that the factors $Z_\lambda$ arise as a consequence of our analysis; they are determined by the scaling limit of the algebras, it is not necessary to put them in explicitly.

Another important aspect of our analysis is the description of symmetries, in particular dilations. It is heuristically expected that scaling transformations, which would map $\underline A_\lambda$ to $\underline A_{\mu\lambda}$, or $Z_\lambda\phi$ to $Z_{\mu\lambda}\phi$, relate to a dilation symmetry in the limit theory. In order to make this precise, we need to generalize the structures introduced in [B-V1], since the limit states considered there are not invariant under scaling. We propose more general limit states that are in fact invariant under scaling transformations of the above kind, and allow a canonical implementation of these transformations in the limit, yielding an action of the dilation group. However, these generalized limit states are no longer pure states; and pure limit states are not dilation covariant.

The paper is organized as follows: In Sect. 2, we recall the algebraic approach to renormalization, and introduce the generalizations needed for our analysis. This includes the dilation invariant limit states mentioned above, but also a generalization of the scaling limit from bounded operators to unbounded objects. Section 3 then describes pointlike fields associated with the theory, and analyzes their scaling limit, giving a construction for the renormalization factors $Z_\lambda$. Section 4 concerns operator product expansions and their scaling limits. We end with a conclusion in Sect. 5, in particular discussing the expected situation in quantum chromodynamics. In the Appendix, we handle a technical construction regarding states on $C^*$ algebras.

One notational convention applies throughout the paper: In order to avoid complicated index notation, we will sometimes write the index of a symbol in brackets following it; e.g. we write $\alpha[x,\Lambda]$ as a synonym for $\alpha_{x,\Lambda}$.

2. The Algebraic Approach to Renormalization

We use the algebraic approach to quantum field theory [Ha] for our analysis. Let us briefly summarize the basic structure, since we will use several variants of it: A net of
algebras is a map $\mathfrak A : \mathcal O \mapsto \mathfrak A(\mathcal O)$ that assigns to each open bounded subset $\mathcal O \subset \mathbb R^{s+1}$ of Minkowski space a $C^*$ algebra $\mathfrak A(\mathcal O)$, such that isotony holds, i.e. $\mathfrak A(\mathcal O_1) \subset \mathfrak A(\mathcal O_2)$ if $\mathcal O_1 \subset \mathcal O_2$. For such a net, we can define the quasilocal algebra, again denoted by $\mathfrak A$ following the usual convention, as the closure (or inductive limit) of $\cup_{\mathcal O}\mathfrak A(\mathcal O)$, where the union runs over all open bounded regions.

Definition 2.1. Let $\mathcal G$ be a Lie group of point transformations of Minkowski space that includes the translation group. A local net of algebras with symmetry group $\mathcal G$ is a net of algebras $\mathfrak A$ together with a representation $g \mapsto \alpha_g$ of $\mathcal G$ as automorphisms of $\mathfrak A$, such that
(i) $[A_1, A_2] = 0$ if $\mathcal O_1$, $\mathcal O_2$ are two spacelike separated regions, and $A_i \in \mathfrak A(\mathcal O_i)$ (locality);
(ii) $\alpha_g\mathfrak A(\mathcal O) = \mathfrak A(g.\mathcal O)$ for all $\mathcal O$, $g$ (covariance).
We call $\mathfrak A$ a net in a positive energy representation if, in addition, the $\mathfrak A(\mathcal O)$ are $W^*$ algebras acting on a common Hilbert space $\mathcal H$, and
(iii) there is a strongly continuous unitary representation $g \mapsto U(g)$ of $\mathcal G$ on $\mathcal H$ such that $\alpha_g = \mathrm{ad}\,U(g)$;
(iv) the joint spectrum of the generators of translations $U(x)$ lies in the closed forward light cone $\bar V_+$ (spectrum condition);
(v) there exists a vector $\Omega \in \mathcal H$ which is invariant under all $U(g)$ and cyclic for $\mathfrak A$.
We call $\mathfrak A$ a net in the vacuum sector if, in addition,
(vi) the vector $\Omega$ is unique (up to scalar factors) as an invariant vector for the translation group.

We are frequently interested in special regions $\mathcal O$, namely standard double cones $\mathcal O_r$ of radius $r$ centered at the origin. For their associated algebra $\mathfrak A(\mathcal O_r)$, we often use the shorthand notation $\mathfrak A(r)$.

The group $\mathcal G$ will usually be the (proper orthochronous) Poincaré group $\mathcal P_+^\uparrow$, but in some cases additionally include the dilations. In a slight abuse of notation, we will sometimes refer to translations as $\alpha_x$ or $U(x)$, to Lorentz transforms as $\alpha_\Lambda$ or $U(\Lambda)$, etc., leaving out those components of the group element that equal the identity of the corresponding subgroups.
In the Hilbert space case, we write the positive generator of time translations as H , and its spectral projectors as P(E). We denote the vacuum state as ω = (Ω| · |Ω). In the case of a vacuum sector, it follows from condition (vi) that ω is a pure state.
2.1. Scaling algebra. Our approach to renormalization in the context of algebraic quantum field theory is based on the results of Buchholz and Verch [B-V1], however with some modifications. Let us briefly recall the notions introduced there.

We assume in the following that a theory “at scale 1”, denoted by $\mathfrak A$, is given, and fulfills the requirements of Definition 2.1 for a local net in the vacuum sector, with symmetry group $\mathcal P_+^\uparrow$. We will analyze the scaling limit of this theory. To this end, we define the set of “scaling functions”,

$$\underline{\mathfrak B} := \{\underline B : \mathbb R_+ \to \mathfrak B(\mathcal H) \mid \sup_\lambda \|\underline B_\lambda\| < \infty\}, \qquad (2.1)$$
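where we write $\underline B_\lambda$ rather than $\underline B(\lambda)$ for the image points. Equipping $\underline{\mathfrak B}$ with pointwise addition, multiplication, and $*$ operation, and with the norm $\|\underline B\| = \sup_\lambda \|\underline B_\lambda\|$, it is easily seen that $\underline{\mathfrak B}$ is a $C^*$ algebra.

On $\underline{\mathfrak B}$, we introduce an automorphic action $\underline\alpha$ of the Poincaré group $\mathcal P_+^\uparrow$ by

$$(\underline\alpha_{x,\Lambda}(\underline B))_\lambda := \alpha_{\lambda x,\Lambda}(\underline B_\lambda). \qquad (2.2)$$

We also have an automorphic action $\underline\delta$ of the dilation group $\mathbb R_+$ on $\underline{\mathfrak B}$:

$$(\underline\delta_\mu(\underline B))_\lambda := \underline B_{\mu\lambda}. \qquad (2.3)$$

It is easily checked that the $\underline\alpha_{x,\Lambda}$ and $\underline\delta_\mu$ fulfill the usual commutation rules. We will combine them into a larger Lie group $\mathcal G$, with elements $g = (\mu, x, \Lambda)$, and their representation denoted by $\underline\alpha$ again:

$$\underline\alpha_g = \underline\alpha_{\mu,x,\Lambda} := \underline\delta_\mu \circ \underline\alpha_{x,\Lambda} = \underline\alpha_{\mu x,\Lambda} \circ \underline\delta_\mu. \qquad (2.4)$$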
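The commutation rule in Eq. (2.4) can be checked in a toy model. The following is only a sketch under simplifying assumptions that are not part of the paper: an "operator" is modelled as a real function of one variable, translations act by shifting the argument, and the Lorentz part is dropped. All function names (`translate`, `scaled_translate`, `dilate`, `example_B`, `commutation_defect`) are hypothetical.

```python
import math

def translate(x, op):
    """Unscaled translation alpha_x in the toy model: shift the argument by x."""
    return lambda t: op(t - x)

def scaled_translate(x, B):
    """(alpha_x B)_lambda := alpha_{lambda x}(B_lambda), the translation part of Eq. (2.2)."""
    return lambda lam: translate(lam * x, B(lam))

def dilate(mu, B):
    """(delta_mu B)_lambda := B_{mu lambda}, as in Eq. (2.3)."""
    return lambda lam: B(mu * lam)

def example_B(lam):
    """A sample scaling function: a Gaussian bump whose width shrinks with lambda."""
    return lambda t: math.exp(-((t / lam) ** 2))

def commutation_defect(mu, x, lam, t):
    """|(delta_mu alpha_x B)_lambda(t) - (alpha_{mu x} delta_mu B)_lambda(t)| for example_B."""
    lhs = dilate(mu, scaled_translate(x, example_B))(lam)(t)
    rhs = scaled_translate(mu * x, dilate(mu, example_B))(lam)(t)
    return abs(lhs - rhs)
```

Both sides reduce to the same shifted function $B_{\mu\lambda}(t - \mu\lambda x)$, so `commutation_defect` is zero up to floating-point rounding for any choice of arguments.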
We are now ready to define a new set of local algebras as subalgebras of $\underline{\mathfrak B}$:

$$\underline{\mathfrak A}(\mathcal O) := \{\underline A \in \underline{\mathfrak B} \mid \underline A_\lambda \in \mathfrak A(\lambda\mathcal O) \text{ for all } \lambda > 0;\ g \mapsto \underline\alpha_g(\underline A) \text{ is norm continuous}\}. \qquad (2.5)$$

This defines a new local net of algebras in the sense of Definition 2.1, with symmetry group $\mathcal G$, referring to its usual geometric action. We denote by $\underline{\mathfrak A}$ the associated quasilocal algebra, i.e. the norm closure of $\cup_{\mathcal O}\underline{\mathfrak A}(\mathcal O)$; our interest is actually in its Hilbert space representations. Note that $\underline{\mathfrak A}$ has a large center $Z(\underline{\mathfrak A})$, consisting of those $\underline A \in \underline{\mathfrak A}$ where each $\underline A_\lambda$ is a multiple of 1. This will turn out to be important in our analysis.

Let us recall from [B-V1] what the two conditions on $\underline A \in \underline{\mathfrak A}(\mathcal O)$ heuristically stand for: $\underline A_\lambda \in \mathfrak A(\lambda\mathcal O)$ means that our scaled operators are localized in smaller and smaller regions, as required for the scaling limit. The norm continuity of $g \mapsto \underline\alpha_g(\underline A)$ ensures that the unit of action is kept constant in this limit. We are requiring a bit more here than in [B-V1], inasmuch as also the action of dilations is required to be norm continuous; so we are actually considering subalgebras of those investigated by Buchholz and Verch. This will not influence the construction, but allow us to implement a continuous action of the dilation group in the limit theory later.

We note that the continuity conditions imposed are not too strong restrictions, since operators $\underline A$ fulfilling them can be constructed in abundance. In order to show this, we need some technical preparations, which will be useful also in the following. We denote here by $B(\mathbb R_+)$ and $C_b(\mathbb R_+)$ the $C^*$-algebras of bounded functions and of bounded continuous functions on $\mathbb R_+$ respectively, equipped with the supremum norm $\|f\|_\infty = \sup_{\lambda>0}|f(\lambda)|$.

Lemma 2.2. Let $a > 1$, and let $f \in C_b(\mathbb R_+)$ with $\mathrm{supp}\, f \subset (1/a, a)$. There exists a bounded linear operator $K_f : B(\mathbb R_+) \to B(\mathbb R_+)$ such that:
(i) $|(K_f g)(\lambda)| \le 2\log a\,\|f\|_\infty \sup_{\mu\in[1/a,a]}|g(\lambda\mu)|$ for all $\lambda > 0$; in particular, $\|K_f\| \le 2\|f\|_\infty\log a$;
(ii) at fixed $a$, the map $f \mapsto K_f$ is linear;
(iii) if $\mu > 0$ is such that $\mu^{-1}\,\mathrm{supp}\, f \subset (1/a, a)$, then $K[f(\mu\,\cdot)]\,g(\mu\,\cdot) = K[f]\,g$;
(iv) for each $g \in B(\mathbb R_+)$, the map $\mu \in \mathbb R_+ \mapsto (K_f g)(\mu\,\cdot) \in B(\mathbb R_+)$ is continuous;
(v) if $g \in C_b(\mathbb R_+)$, then $(K_f g)(\lambda) = \int_{\mathbb R_+} f(\mu)\,g(\lambda\mu)\,d\mu/\mu$.
Proof. Using the map $\lambda \in [1/a, a] \mapsto \log_a\lambda \in [-1, 1]$, and considering $[-1, 1]$ with endpoints identified, we can endow the interval $[1/a, a]$ with the structure of an abelian group under multiplication. There exists therefore an invariant mean $m_a$ over the Banach space of bounded functions on $[1/a, a]$, i.e. a bounded linear functional on this space such that $m_a(1) = 1$, $m_a(h) \ge 0$ for $h \ge 0$, and $m_a(h(\mu\,\cdot)) = m_a(h)$ for all $\mu \in [1/a, a]$. We define then

$$(K_f g)(\lambda) := 2\,m_a\big(f(\cdot)\,g(\lambda\,\cdot)\big)\log a. \qquad (2.6)$$

Properties (i), (ii), and (iii) then follow from boundedness, linearity, and invariance of the mean $m_a$, respectively. For property (iv), we first note that it is sufficient to show continuity at $\mu = 1$. For $\mu$ close enough to 1, we can use (i)–(iii) to show

$$\|(K_f g)(\mu\,\cdot) - K_f g\|_\infty \le 2\log a\,\|f(\mu^{-1}\,\cdot) - f\|_\infty\,\|g\|_\infty. \qquad (2.7)$$

The right hand side converges to 0 as $\mu \to 1$, thanks to the uniform continuity of $f$, which proves (iv). Finally, by uniqueness of the (normalized) Haar measure on $[-1, 1]$ we immediately have that, for $g \in C_b(\mathbb R_+)$,

$$m_a\big(f(\cdot)\,g(\lambda\,\cdot)\big) = \frac{1}{2}\int_{-1}^{1} dx\, f(a^x)\,g(\lambda a^x) = \frac{1}{2\log a}\int_{\mathbb R_+}\frac{d\mu}{\mu}\, f(\mu)\,g(\lambda\mu). \qquad (2.8)$$

(Note that the first integrand is a continuous function on $[-1, 1]$.) This proves (v).
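For continuous $g$, property (v) gives $K_f$ as an explicit integral, so properties (i) and (iii) can be checked numerically. The sketch below assumes $a = e$, a particular smooth bump $f$ and a particular bounded $g$; these choices, the quadrature rule and all names (`A`, `HALF_WIDTH`, `f`, `g`, `K`) are illustrative assumptions, not taken from the paper.

```python
import math

A = math.e          # the constant a > 1 of Lemma 2.2; log a = 1 here
HALF_WIDTH = 0.5    # f is supported where |log mu| < HALF_WIDTH, inside (1/a, a)

def f(mu):
    """Smooth bump in the variable log(mu), with support inside (1/a, a)."""
    s = math.log(mu) / HALF_WIDTH
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def g(lam):
    """A bounded continuous function on R_+ with sup |g| <= 1.5."""
    return math.sin(3.0 * math.log(lam)) + 0.5

def K(f_fn, g_fn, lam, n=4000):
    """Trapezoidal approximation of int f(mu) g(lam*mu) dmu/mu over (1/a, a).

    With t = log(mu) the measure dmu/mu becomes dt on [-log a, log a].
    """
    log_a = math.log(A)
    h = 2.0 * log_a / n
    total = 0.0
    for k in range(n + 1):
        mu = math.exp(-log_a + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f_fn(mu) * g_fn(lam * mu)
    return total * h
```

For example, with `nu = math.exp(0.3)` the rescaled support still lies in $(1/a, a)$, and `K(lambda mu: f(nu * mu), lambda lam: g(nu * lam), 0.7)` agrees with `K(f, g, 0.7)` up to quadrature error, which is property (iii).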
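We now use the above lemma to show that it is possible to smear elements of $\underline{\mathfrak B}$ with respect to dilations.

Lemma 2.3. Let $a > 1$, and let $f \in C_b(\mathbb R_+)$ with $\mathrm{supp}\, f \subset (1/a, a)$. There exists a bounded linear operator $\underline\delta[f] : \underline{\mathfrak B} \to \underline{\mathfrak B}$ such that:
(i) $\|\underline\delta[f]\| \le 2\|f\|_\infty\log a$;
(ii) if $\underline B_\lambda \in \mathfrak A(\lambda\mathcal O)$ for all $\lambda > 0$, then $(\underline\delta[f]\underline B)_\lambda \in \mathfrak A(\lambda\mathcal O_1)$, where $\mathcal O_1$ is any open bounded region such that $\mu\mathcal O \subset \mathcal O_1$ for all $\mu \in [1/a, a]$, and if furthermore the function $(x,\Lambda) \in \mathcal P_+^\uparrow \mapsto \underline\alpha_{x,\Lambda}(\underline B)$ is norm continuous, then $\underline\delta[f]\underline B \in \underline{\mathfrak A}(\mathcal O_1)$;
(iii) if $\underline B \in \underline{\mathfrak A}$, then $\underline\delta[f]\underline B = \int_{\mathbb R_+} f(\mu)\,\underline\delta_\mu(\underline B)\,d\mu/\mu$ as a Bochner integral.

Proof. Let $\chi, \psi \in \mathcal H$ be arbitrary vectors. Given $\underline B \in \underline{\mathfrak B}$, consider the function $g \in B(\mathbb R_+)$ defined by $g(\lambda) = (\chi|\underline B_\lambda\psi)$. Since $\|g\|_\infty \le \|\underline B\|\,\|\chi\|\,\|\psi\|$, by (i) of the previous lemma the equation

$$(\chi|(\underline\delta[f]\underline B)_\lambda\psi) = (K_f g)(\lambda) \qquad (2.9)$$

uniquely defines an element $\underline\delta[f]\underline B \in \underline{\mathfrak B}$, and property (i) is satisfied for $\underline\delta[f]$. If furthermore $\underline B \in \underline{\mathfrak A}$, then $g \in C_b(\mathbb R_+)$, and therefore (iii) follows from the analogous statement of the previous lemma. If now $\underline B_\lambda \in \mathfrak A(\lambda\mathcal O)$ and $\mathcal O_1 \supset \mu\mathcal O$ for all $\mu \in [1/a, a]$, substituting $(\chi|\,\cdot\,\psi)$ with $(\chi|[A,\,\cdot\,]\psi)$, $A \in \mathfrak A(\lambda\mathcal O_1)'$, in Eq. (2.9) immediately entails $(\underline\delta[f]\underline B)_\lambda \in \mathfrak A(\lambda\mathcal O_1)$. Suppose now that $(x,\Lambda) \in \mathcal P_+^\uparrow \mapsto \underline\alpha_{x,\Lambda}(\underline B)$ is norm continuous. In order to show that $\underline\delta[f]\underline B \in \underline{\mathfrak A}(\mathcal O_1)$, it is now sufficient to show that the functions $\mu \in \mathbb R_+ \mapsto \underline\delta_\mu(\underline\delta[f]\underline B)$ and $(x,\Lambda) \in \mathcal P_+^\uparrow \mapsto \underline\alpha_{x,\Lambda}(\underline\delta[f]\underline B)$ are norm continuous at the identity of the respective groups. Since $(\chi|\underline\delta_\mu(\underline\delta[f]\underline B)_\lambda\psi) = (K_f g)(\mu\lambda)$, continuity with respect to dilations follows at once from the estimate (2.7). For Poincaré transformations, we proceed as follows. For $\kappa \in \mathbb R_+$, set

$$g_{x,\Lambda,\kappa}(\lambda) := \big(U(\kappa x,\Lambda)^*\chi\,\big|\,\underline B_\lambda\,U(\kappa x,\Lambda)^*\psi\big). \qquad (2.10)$$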
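With this definition, we have

$$\big(\chi\,\big|\,\big((\underline\alpha_{x,\Lambda}\underline\delta[f]\underline B)_\lambda - (\underline\delta[f]\underline B)_\lambda\big)\psi\big) = \big(K_f(g_{x,\Lambda,\lambda} - g)\big)(\lambda). \qquad (2.11)$$

Estimating by Lemma 2.2 (i), we obtain after a straightforward computation,

$$\|\underline\alpha_{x,\Lambda}\underline\delta[f]\underline B - \underline\delta[f]\underline B\| \le 2\log a\,\|f\|_\infty \sup_{\mu\in[1/a,a]} \|\underline\alpha[\mu^{-1}x,\Lambda]\underline B - \underline B\|. \qquad (2.12)$$

This vanishes as $(x,\Lambda) \to \mathrm{id}$, since Poincaré transformations act norm-continuously on $\underline B$ by assumption; thus (ii) is proved.

Since elements $\underline B \in \underline{\mathfrak B}$ such that $(x,\Lambda) \mapsto \underline\alpha_{x,\Lambda}(\underline B)$ is norm continuous can be easily constructed by smearing in the traditional way over the Poincaré group [B-V1], the above lemma shows the existence of a large family of elements satisfying the continuity conditions imposed in the definition of the scaling algebra $\underline{\mathfrak A}$.

2.2. Scaling limit. The scaling limit of the theory is defined by the limits of our operator-valued scaling functions in the vacuum state. Consider the following states $\underline\omega_\lambda$ on $\underline{\mathfrak A}$:

$$\underline\omega_\lambda : \underline A \mapsto \omega(\underline A_\lambda). \qquad (2.13)$$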
The rough idea for constructing a scaling limit theory is to take the limit of these states as $\lambda \to 0$, and then to consider the GNS representation of $\underline{\mathfrak A}$ with respect to this limit state. However, while the above expression may converge for certain operators $\underline A$ in relevant examples [B-V2], the limit will certainly not exist in general. The approach taken in [B-V1] is to choose a weak-∗ cluster point of the set $\{\underline\omega_\lambda\}$, which exists by the Alaoglu-Bourbaki theorem. These states were shown to be Poincaré invariant pure vacuum states.

We will use a somewhat more general approach here: Let $m$ be a mean on the semigroup $(0, 1]$, with multiplication. That is, $m$ is a linear functional on $B((0, 1])$, such that $m(1) = 1$, and $m(f) \ge 0$ for $f \ge 0$. The functional is automatically bounded by $|m(f)| \le \sup_\lambda |f(\lambda)|$. We now define¹ a functional $\underline\omega$ on the scaling algebra as

$$\underline\omega(\underline A) := m(\omega(\underline A_\lambda)), \qquad \underline A \in \underline{\mathfrak A}. \qquad (2.14)$$

Then $\underline\omega$ is a linear, positive, normalized functional on $\underline{\mathfrak A}$, hence a state. This construction covers in particular the following relevant cases:
(a) The mean $m$ is an evaluation functional, i.e. $m(f) = f(\lambda_0)$ for some fixed $\lambda_0$. Then $\underline\omega$ is the “vacuum at scale $\lambda_0$”, and the GNS representation corresponds to the theory at this scale.
(b) $m$ is a weak-∗ limit point of such evaluation functionals as $\lambda_0 \to 0$. These are the scaling limit states considered by Buchholz and Verch in [B-V1].
(c) $m$ is an invariant mean on the semigroup; that is, $m(f_\mu) = m(f)$, where $f_\mu(\lambda) = f(\mu\lambda)$, $0 < \mu \le 1$. It is well known that such invariant means exist, although they are not unique. This gives an alternative version of the scaling limit theory, which we will investigate in more detail below.

¹ For simplicity of notation, we will often write $m(f(\lambda))$ as a shorthand for $m(\lambda \mapsto f(\lambda))$.
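The difference between cases (a) and (c) can be made concrete numerically. The sketch below uses a finite logarithmic average over $\lambda = e^{-t}$, $t \in [0, T]$, as a stand-in for an invariant mean. This is an assumption made only for illustration: the finite-$T$ average is merely approximately invariant and is not the mean used in the text. The names `evaluation_mean`, `log_average_mean`, `f`, `f_squared` and `dilated` are hypothetical.

```python
import math

def evaluation_mean(lam0):
    """Case (a): the evaluation functional m(f) = f(lam0)."""
    return lambda fn: fn(lam0)

def log_average_mean(T, n=20000):
    """Average of f(exp(-t)) over t in [0, T] by the midpoint rule.

    Only approximately invariant under f -> f(mu * .); the defect is O(1/T).
    """
    h = T / n
    return lambda fn: sum(fn(math.exp(-(k + 0.5) * h)) for k in range(n)) * h / T

def f(lam):
    """A bounded function on (0, 1] that oscillates as lam -> 0."""
    return math.cos(math.log(lam))

def f_squared(lam):
    return f(lam) ** 2

def dilated(fn, mu):
    """The function f_mu(lam) = fn(mu * lam)."""
    return lambda lam: fn(mu * lam)
```

With `m = log_average_mean(200.0)`, one finds `m(f)` close to 0 but `m(f_squared)` close to 1/2, so this averaged functional is far from multiplicative, whereas `evaluation_mean(0.3)` satisfies $m(f^2) = m(f)^2$ by construction.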
Cases (a) and (b) share the property that $m$ is multiplicative, i.e. $m(fg) = m(f)m(g)$. In fact, it is known [Du-S, Ch. IV.6] that weak-∗ limit points of evaluation functionals are the only multiplicative means on $(0, 1]$. On the other hand, $m$ is precisely not multiplicative in case (c), since there can be no multiplicative invariant means on nontrivial abelian semigroups [M]. Further, cases (b) and (c) are asymptotic means, in the sense that $m(f) = 0$ whenever $f$ vanishes on a neighborhood of 0. Such asymptotic means are generalizations of the limit $\lambda \to 0$: Namely, if $f(\lambda)$ converges as $\lambda \to 0$, it follows that $m(f) = \lim_{\lambda\to 0} f(\lambda)$. Also, along the line of ideas given in [B-V1, Corollary 4.2], one can show that if $m$ is asymptotic, the vacuum state $\omega$ in Eq. (2.14) can be replaced with any other locally normal state in the original theory, without changing the resulting state $\underline\omega$ on the scaling algebra.

For the following, we will usually choose a fixed mean $m$. Since we are mainly interested in asymptotic means,² we will refer to the corresponding state on $\underline{\mathfrak A}$ as the scaling limit state $\underline\omega_0$. The scaling limit theory is now obtained by a GNS construction with respect to this state. We denote this GNS representation of $\underline{\mathfrak A}$ as $\pi_0$, and the representation Hilbert space as $\mathcal H_0$, where $\Omega_0 \in \mathcal H_0$ is the GNS vector. We can also transfer the symmetry group action to the representation space $\mathcal H_0$; but here the properties of $m$ are crucial. We first note:

² However, most of our results do not rely on this property; they apply to other means $m$ as well, and are not limited to cases (b) and (c).

Lemma 2.4. One has $\underline\omega_0 \circ \underline\alpha_g = \underline\omega_0$ for all Poincaré transformations $g = (1, x, \Lambda)$. If $m$ is invariant, the same holds for all $g \in \mathcal G$.

Proof. For Poincaré transformations, $\underline\omega_0 \circ \underline\alpha_g = \underline\omega_0$ follows from the invariance of $\omega$ under $\alpha_{x,\Lambda}$ at finite scales, and for dilations it follows from the invariance of $m$.

Now, in the limit theory, we can implement the subgroup of those symmetries that leave $\underline\omega_0$ invariant:

Theorem 2.5. Let $\underline\omega_0$ be a scaling limit state, and let $\mathcal G_0 \subset \mathcal G$ be the subgroup of all $g$ which fulfill $\underline\omega_0 \circ \underline\alpha_g = \underline\omega_0$. There exists a strongly continuous unitary representation $U_0$ of $\mathcal G_0$ on $\mathcal H_0$, such that $\alpha_{0,g} \circ \pi_0 = \pi_0 \circ \underline\alpha_g$ with $\alpha_{0,g} = \mathrm{ad}\,U_0(g)$, for all $g \in \mathcal G_0$. One has $U_0(g)\Omega_0 = \Omega_0$ for $g \in \mathcal G_0$. The representation $U_0(x)$ of translations fulfills the spectrum condition.

Proof. We set

$$U_0(g)\,\pi_0(\underline A)\,\Omega_0 := \pi_0(\underline\alpha_g\underline A)\,\Omega_0. \qquad (2.15)$$

This defines $U_0(g)$ on a dense set; it is well-defined, since $\ker\pi_0$ is invariant under $\underline\alpha_g$ by assumption. One easily checks that $U_0(g)$ is norm-preserving, and thus can be extended to a unitary on all of $\mathcal H_0$. Strong continuity of the representation follows from norm continuity of $g \mapsto \underline\alpha_g\underline A$. The properties $U_0(g)\Omega_0 = \Omega_0$ and $\alpha_{0,g} \circ \pi_0 = \pi_0 \circ \underline\alpha_g$ are immediate. For the spectrum condition, it suffices to show that

$$\int dx\, f(x)\,\big(\pi_0(\underline A)\Omega_0\,\big|\,U_0(x)\,\pi_0(\underline A')\Omega_0\big) = 0 \qquad (2.16)$$
for all A, A ∈ A, and all test functions f ∈ S(Rs+1 ) such that the Fourier transform of f vanishes on the closed forward lightcone. Due to norm continuity of the representation, we can rewrite this expression as
d x f (x) π0 (A)Ω0 U0 (x)π0 (A )Ω0 = m d x f (x) ω(A∗λ αλx Aλ ) . (2.17) But the right-hand side vanishes due to the spectrum condition in the original theory A. This concludes the proof. We now define the local net in the limit theory as A0 (O) := π0 (A(O)) . It is clear that A0 is local, isotone, and covariant under α0,g , since these properties transfer from A on an ultraweakly dense set. By the above results, we have established A0 as a local net of algebras in a positive energy representation with symmetry group G0 , in the sense of Definition 2.1. Now as a last and crucial point, we ask whether the limit vacuum is a pure state, or (equivalently) Ω0 is unique as a translation-invariant vector, or (equivalently) the representation π0 is irreducible. For the case of multiplicative means m in s ≥ 2 dimensions,3 a positive answer has been given in [B-V1]. For non-multiplicative m, and in particular if m is invariant, the same is however impossible: Here already π0 Z(A) is known to be reducible, and the space of translationinvariant vectors must be more than 1-dimensional. The structure of the limit theory may be more complicated in this case. Since it is not directly relevant to our current line of arguments, we will confine ourselves to some remarks here, leaving the details – which are of interest in their own right – to a separate discussion [BDM]. First, it can be shown that the nontrivial image of the center Z(A) is the only “source” of reducibility: namely one has π0 (A) = π0 (Z(A)) if s ≥ 2. Here the case of a vacuum limit state arises as a special case for π0 (Z(A)) = C1. For more general limit states, one would like to decompose the limit net A0 along its center. By the representation theory of the commutative C ∗ algebra Z(A), one has a canonical decomposition dν(z) ω z (A), A ∈ A, (2.18) ω0 (A) = Z
where $Z$ is a compact Hausdorff space, $\nu$ a regular Borel measure on $Z$, and $\omega_z$ are scaling limit states that correspond to multiplicative means. One would naturally want to interpret Eq. (2.18) in terms of a direct integral of Hilbert spaces, and decompose the representation $\pi_0$ into corresponding irreducible representations $\pi_z$. This faces technical problems however: In particular for invariant means $m$, the limit Hilbert space $\mathcal{H}_0$ is not separable [Do], and so the standard methods of decomposition theory cannot be applied (see e.g. [D-Su, K-R, Ch. 14]). One needs to use generalized notions of direct integrals for nonseparable spaces [Ws, S]. We do not enter this discussion here. Let us just note that for the case of a unique vacuum structure, as introduced in [B-V1], and under additional regularity assumptions, it can be shown that the limit theory $\mathfrak{A}_0$ has a simple product structure:

$$\mathfrak{A}_0(\mathcal{O}) \cong \pi_0(Z(\underline{\mathfrak{A}}))'' \,\bar{\otimes}\, \hat{\mathfrak{A}}(\mathcal{O}), \tag{2.19}$$
3 In 1 + 1 space-time dimensions, the analogue is false: Even if $m$ is multiplicative, $\pi_0(\underline{\mathfrak{A}})$ may contain a nontrivial center. An example for this behavior occurs in the Schwinger model [B-V2]. However, since the phase space conditions we use in Sect. 3 do not apply to this class of models a priori, we do not place emphasis on this situation here.
where $\hat{\mathfrak{A}}$ is a fixed local net in a vacuum sector, independent of $m$. For a free real scalar field of mass $m$ in physical space-time, the net $\hat{\mathfrak{A}}$ would correspond to a massless free field. Note that the factor $\pi_0(Z(\underline{\mathfrak{A}}))$ depends on the mean $m$, but not on the specific theory $\mathfrak{A}$ in question. The symmetry group operators $U_0(g)$ also factorize along the above product: One has $U_0(g) \cong (U_0(g) \lceil \mathcal{H}_Z) \otimes \hat{U}(g)$, where $\mathcal{H}_Z$ is the Hilbert space generated by $\pi_0(Z(\underline{\mathfrak{A}}))$, and $\hat{U}$ the representation associated with $\hat{\mathfrak{A}}$. In the case of an invariant mean, this representation includes the dilations. But while for Poincaré transformations $U_0(g) \lceil \mathcal{H}_Z$ is trivial, dilations have a nontrivial action on $\mathcal{H}_Z$.

To summarize: If $m$ is multiplicative, in particular in case (b) above, we obtain a theory in the vacuum sector, with a pure vacuum state. It need not be dilation covariant, however. If $m$ is invariant, i.e. in case (c), dilations can canonically be implemented in the limit; but the limit state $\omega_0$ is not pure.

2.3. States and energy bounds in the limit. So far, we have described the theory on the level of bounded observables, at which the scaling limit can be computed. For the analysis of pointlike quantum fields, however, we need to consider a more general structure, which is tied to the Hilbert space representations of the scaling algebra. In this section, we will not yet refer to the locality properties of pointlike fields, but only be concerned with their singular high energy behavior.

We first describe the situation in the original theory $\mathfrak{A}$. Here, let $\Sigma = \mathfrak{B}(\mathcal{H})_*$ be the predual Banach space of $\mathfrak{B}(\mathcal{H})$, i.e. the set of normal functionals. We are interested in functionals with an energy cutoff $E$, and therefore define

$$\Sigma(E) := \{ \sigma(P(E)\,\cdot\,P(E)) \mid \sigma \in \Sigma \}. \tag{2.20}$$

Setting $R := (1 + H)^{-1}$, a bounded operator, we can also consider the space of smooth normal functionals:

$$C^\infty(\Sigma) := \{ \sigma \in \Sigma \mid \|\sigma(R^{-\ell}\,\cdot\,R^{-\ell})\| < \infty \text{ for all } \ell > 0 \}. \tag{2.21}$$
We equip this space with the Fréchet topology induced by all the norms $\|\sigma\|^{(\ell)} := \|\sigma(R^{-\ell}\,\cdot\,R^{-\ell})\|$, $\ell > 0$. Its dual space in this topology is

$$C^\infty(\Sigma)^* = \{ \phi : C^\infty(\Sigma) \to \mathbb{C} \mid \|\phi\|^{(\ell)} := \|R^\ell \phi R^\ell\| < \infty \text{ for some } \ell > 0 \}. \tag{2.22}$$

This is the space in which we expect our pointlike fields $\phi(x)$ to be contained. The “polynomial energy damping” with powers of $R$ plays an important role in our analysis, and we will often need the following key lemma; cf. [Bo1, Lemma 3.27].

Lemma 2.6. For any $\ell > 0$ there exists a constant $c_\ell > 0$ with the following property: Let $\mathcal{H}$ be a Hilbert space, and $H \ge 0$ a positive selfadjoint operator, possibly unbounded, on a dense domain in $\mathcal{H}$. Let $P(E)$ be its spectral projections, and $R = (1 + H)^{-1}$. If $c > 0$, and $\phi$ is a sesquilinear form on a dense set of $\mathcal{H} \times \mathcal{H}$ such that

$$\|P(E)\, \phi\, P(E)\| \le c \cdot (1 + E)^{\ell - 1} \quad \forall E > 0,$$

then it follows that

$$\|R^\ell \phi R^\ell\| \le c_\ell\, c.$$

It is important here that $c_\ell$ depends on $\ell$ only, not on $H$, $\phi$, or $c$.
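For orientation, the constant in Lemma 2.6 can be made explicit. The double integral appearing in the estimate (2.25) of the proof has the closed form $2/(\ell+1)$, so $c_\ell = 2\ell^2/(\ell+1)$ appears to be an admissible choice. The Python sketch below checks this numerically. It is an illustration only: the function names and the substitution $u_i = 1/(1+E_i)$ are our own choices and are not part of the original argument.

```python
def lemma_2_6_integral(ell, n=400):
    """Approximate the double integral in (2.25) by a midpoint rule.

    With u_i = 1 / (1 + E_i), the integrand
        (1 + max(E1, E2))**(ell - 1) / ((1 + E1)**(ell + 1) * (1 + E2)**(ell + 1))
    over [0, inf)^2 becomes max(u1, u2)**(ell - 1) over the unit square.
    (This substitution is chosen for the sketch; it is not taken from the paper.)
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u1 = (i + 0.5) * h
        for j in range(n):
            u2 = (j + 0.5) * h
            total += max(u1, u2) ** (ell - 1.0)
    return total * h * h


def c_ell(ell):
    """Closed-form candidate for the constant: ell^2 * 2 / (ell + 1)."""
    return 2.0 * ell ** 2 / (ell + 1.0)


if __name__ == "__main__":
    for ell in (1.0, 2.0, 3.0):
        numeric = ell ** 2 * lemma_2_6_integral(ell)
        print(ell, numeric, c_ell(ell))
```

The numerical and closed-form values should agree up to the discretization error of the midpoint rule, and neither involves $H$, $\phi$, or $c$.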
Proof. The spectral theorem applied to $R^\ell$ yields for any $\chi \in \mathcal{H}$,

$$(\chi | R^\ell \chi) = \int_0^\infty (1 + E)^{-\ell}\, d(\chi | P(E) \chi). \tag{2.23}$$

Integrating by parts in this Lebesgue-Stieltjes integral [Sa, Ch. III Thm. (14.1)], we obtain the following formula in the sense of matrix elements:

$$R^\ell = \ell \int_0^\infty dE\, (1 + E)^{-\ell - 1} P(E). \tag{2.24}$$
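Now let $E > 0$ be fixed, and let $\chi, \chi' \in P(E)\mathcal{H}$ be unit vectors. Using Eq. (2.24) twice, we obtain

$$|(\chi | R^\ell \phi R^\ell \chi')| = \ell^2 \left| \int dE_1\, dE_2\, \frac{(\chi |\, P(E_1)\, \phi\, P(E_2)\, | \chi')}{(1 + E_1)^{\ell+1} (1 + E_2)^{\ell+1}} \right| \le c \cdot \ell^2 \int_0^\infty \!\! \int_0^\infty dE_1\, dE_2\, \frac{(1 + \max\{E_1, E_2\})^{\ell-1}}{(1 + E_1)^{\ell+1} (1 + E_2)^{\ell+1}}. \tag{2.25}$$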
In the estimate, we have used the hypothesis of the lemma. The integral on the right-hand side exists, since the integrand vanishes like $E_i^{-2}$ in both variables; and this bound is independent of $E$. This allows us to extend $R^\ell \phi R^\ell$ to a bounded operator, and gives us an estimate for its norm that depends only on $\ell$.

In the limit theory, we use analogous definitions for the spaces $\Sigma_0$, $\Sigma_0(E)$, $C^\infty(\Sigma_0)$, and $C^\infty(\Sigma_0)^*$, referring to the Hamiltonian $H_0$ and its spectral projectors $P_0(E)$. Note that Lemma 2.6 above holds true for $H_0$ in place of $H$ as well.

It is not obvious however how to obtain corresponding structures on the scaling algebra $\underline{\mathfrak{A}}$, in order to describe the limiting procedure. Difficulties arise because the representation $\underline{\alpha}$ is not unitarily implemented, and hence we cannot refer to its generators and their spectral projections: an energy operator on the scaling algebra is not available. However, it is possible to consider operators in $\underline{\mathfrak{A}}$ of finite energy-momentum transfer. We describe them here as follows:$^4$

$$\tilde{\underline{\mathfrak{A}}}(E) := \Big\{ \underline{\alpha}_f \underline{A} = \int d^{s+1}x\, f(x)\, \underline{\alpha}_x \underline{A} \;\Big|\; \underline{A} \in \underline{\mathfrak{A}},\ f \in \mathcal{S}(\mathbb{R}^{s+1}),\ \operatorname{supp} \tilde{f} \subset (-E, E)^{s+1} \Big\}. \tag{2.26}$$

Here we refer to the translation subgroup of $\underline{\alpha}$ only. It is clear that $\tilde{\underline{\mathfrak{A}}}(E)$ is a linear space, closed under the $*$ operation. Moreover, for any fixed $g \in \mathcal{G}$, we have $\underline{\alpha}_g \tilde{\underline{\mathfrak{A}}}(E) \subset \tilde{\underline{\mathfrak{A}}}(E')$ for suitable $E'$.

The important point is now that the representors of $\underline{A} \in \tilde{\underline{\mathfrak{A}}}(E)$ generate energy-bounded vectors from the vacuum, with the correct renormalization. To formulate this, we consider in addition to $P(E)$ also $P(E - 0)$, the spectral projector of $H$ for the interval $(-\infty, E)$; it differs from $P(E)$ by the projector onto the eigenspace$^5$ with eigenvalue $E$. Correspondingly, we use $P_0(E - 0)$ in the limit theory.

4 One can more abstractly define the spectral support of an operator $\underline{A} \in \underline{\mathfrak{A}}$ with respect to translations, and define the space $\tilde{\underline{\mathfrak{A}}}(E)$ by this means. We do not need the general formalism here, however.
5 It is in fact not expected that H has any eigenvectors other than Ω, at least under reasonable assumptions on the phase space behavior of the theory [Dy]. We do however not rely on this property.
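Proposition 2.7. For any $\underline{B} \in \tilde{\underline{\mathfrak{A}}}(E)$, one has $\underline{B}_\lambda \Omega \in P(E/\lambda - 0)\mathcal{H}$ and $\pi_0(\underline{B})\Omega_0 \in P_0(E - 0)\mathcal{H}_0$. Further, the inclusion $\pi_0(\tilde{\underline{\mathfrak{A}}}(E))\Omega_0 \subset P_0(E - 0)\mathcal{H}_0$ is dense.

Proof. Let $\chi \in \mathcal{H}$, and let $\underline{B} = \underline{\alpha}_f \underline{A} \in \tilde{\underline{\mathfrak{A}}}(E)$. With $Q_\kappa$ being the spectral projectors of the momentum operators $P_\kappa$, we compute

$$(\chi | \underline{B}_\lambda \Omega) = \int d^{s+1}x\, f(x)\, (\chi | U(\lambda x) \underline{A}_\lambda \Omega) = \int \tilde{f}(p)\, d^{s+1} \Big( \chi \,\Big|\, \prod_\kappa Q_\kappa(p_\kappa / \lambda)\, \underline{A}_\lambda \Omega \Big). \tag{2.27}$$

Now if $\chi \in (1 - P(E/\lambda))\mathcal{H}$, or if $\chi$ is an eigenvector of $H$ with eigenvalue $E/\lambda$, then the right-hand side vanishes by the support properties of $\tilde{f}$. This shows $\underline{B}_\lambda \Omega \in P(E/\lambda - 0)\mathcal{H}$. The proof for $\pi_0(\underline{B})\Omega_0$ is analogous.

Now suppose that the inclusion $\pi_0(\tilde{\underline{\mathfrak{A}}}(E))\Omega_0 \subset P_0(E - 0)\mathcal{H}_0$ was not dense. Then we can find $\chi \in P_0(E - 0)\mathcal{H}_0$, $\chi \neq 0$, such that

$$(\chi | \pi_0(\underline{B})\Omega_0) = 0 \quad \text{for all } \underline{B} \in \tilde{\underline{\mathfrak{A}}}(E). \tag{2.28}$$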
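This means that, whenever $f$ is a test function with $\operatorname{supp} \tilde{f} \subset (-E, E)^{s+1}$, we have

$$0 = \int d^{s+1}x\, f(x)\, (\chi | U_0(x) \pi_0(\underline{A}) \Omega_0) = \int_{\bar{V}_+} \tilde{f}(p)\, d^{s+1}\nu(p) \quad \text{for all } \underline{A} \in \underline{\mathfrak{A}}, \tag{2.29}$$

where $\nu(p)$ is a measure, the Fourier transform of $(\chi | U_0(x) \pi_0(\underline{A}) \Omega_0)$. We specifically choose $f(x) = f_T(x^0) f_S(\mathbf{x})$, where $\operatorname{supp} \tilde{f}_T \subset (-E, E)$, $\operatorname{supp} \tilde{f}_S \subset (-E, E)^s$. By the spectrum condition, the support of $\nu(p)$ is within the closed forward light cone; thus we can actually remove the constraints on $\operatorname{supp} \tilde{f}_S$. Choosing for $f_S$ a delta sequence in configuration space, we obtain that

$$0 = \int \tilde{f}_T(p_0)\, d\nu_T(p_0) \quad \text{whenever } \operatorname{supp} \tilde{f}_T \subset (-E, E). \tag{2.30}$$

Here $\nu_T(p_0)$ is the Fourier transform of $(\chi | U_0(t) \pi_0(\underline{A}) \Omega_0)$, referring to time translations only. Thus the measure $\nu_T$ must have its support in $(-E, E)^c$. On the other hand, the spectrum condition implies that $\operatorname{supp} \nu_T \subset [0, E]$. Hence $\operatorname{supp} \nu_T = \{E\}$, and $\nu_T$ is a delta measure: $\nu_T = c\, \delta(p_0 - E)$ with some constant $c$. By Fourier transformation, that yields

$$(\chi | U_0(t) \pi_0(\underline{A}) \Omega_0) = (\chi | \pi_0(\underline{A}) \Omega_0)\, e^{iEt} \quad \text{for all } \underline{A} \in \underline{\mathfrak{A}},\ t \in \mathbb{R}. \tag{2.31}$$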
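Since $\pi_0(\underline{\mathfrak{A}})\Omega_0$ is dense in $\mathcal{H}_0$, this means that $\chi$ is an eigenvector of $H_0$ with eigenvalue $E$. But this contradicts $\chi \in P_0(E - 0)\mathcal{H}_0$. Thus the inclusion in question must be dense.

We will now investigate in more detail the convergence of states in the limit. The heuristic picture is that functionals like

$$\sigma_\lambda = (\underline{B}_\lambda \Omega |\, \cdot\, | \underline{B}'_\lambda \Omega), \quad \underline{B}, \underline{B}' \in \tilde{\underline{\mathfrak{A}}}(E), \tag{2.32}$$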
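“converge” to an energy-bounded limit state as $\lambda \to 0$. This can easily be understood if evaluating them on bounded operators $\underline{A} \in \underline{\mathfrak{A}}$; we will however need the same also for unbounded objects. Let us formalize this more strictly. We set

$$\underline{\Sigma}(E) := \{ \underline{\sigma} : \mathbb{R}_+ \to \Sigma \mid \underline{\sigma}_\lambda(A) = \omega(\underline{B}_\lambda^* A \underline{B}'_\lambda) \text{ for some } \underline{B}, \underline{B}' \in \tilde{\underline{\mathfrak{A}}}(E) \}. \tag{2.33}$$

We also define $\underline{\Sigma} = \cup_{E > 0} \underline{\Sigma}(E)$. These sets are not linear spaces, but this will not be needed for our purposes. Note that, via the action on $\underline{B}$ and $\underline{B}'$, we have a natural action $\underline{\alpha}_g^*$ of $\mathcal{G}$ on $\underline{\Sigma}$, which fulfills

$$(\underline{\alpha}_g^* \underline{\sigma})_\lambda ((\underline{\alpha}_g \underline{A})_\lambda) = \underline{\sigma}_{\mu\lambda}(\underline{A}_{\mu\lambda}), \quad \text{where } \underline{\sigma} \in \underline{\Sigma},\ \underline{A} \in \underline{\mathfrak{A}},\ g = (\mu, x, \Lambda) \in \mathcal{G}. \tag{2.34}$$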
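We now define the scaling limit of $\underline{\sigma} \in \underline{\Sigma}$, denoted by $\pi_0^* \underline{\sigma} \in \Sigma_0$, via

$$\pi_0^* \underline{\sigma}(\pi_0 \underline{A}) := m(\underline{\sigma}_\lambda(\underline{A}_\lambda)) = \omega_0(\underline{B}^* \underline{A}\, \underline{B}'). \tag{2.35}$$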
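It is clear that this is well-defined. By Proposition 2.7 above, the span of all $\pi_0^* \underline{\sigma}$, $\underline{\sigma} \in \underline{\Sigma}(E)$, is dense in $\Sigma_0(E - 0)$, and the union over all $E > 0$ is dense in $C^\infty(\Sigma_0)$ in the corresponding topology. We also have $\pi_0^*(\underline{\alpha}_g^* \underline{\sigma}) = (\pi_0^* \underline{\sigma}) \circ \alpha_{0,g}^{-1}$ in a natural way. Let us further note:

Lemma 2.8. For all $\underline{\sigma} \in \underline{\Sigma}$, it holds that $m(\|\underline{\sigma}_\lambda\|) \le \|\pi_0^* \underline{\sigma}\|$.

Proof. First, it is clear that we have $\|\pi_0(\underline{A})\Omega_0\|^2 = m(\|\underline{A}_\lambda \Omega\|^2)$ for any $\underline{A} \in \underline{\mathfrak{A}}$. Now let $\underline{\sigma} \in \underline{\Sigma}$ with $\underline{\sigma}_\lambda = (\underline{B}_\lambda \Omega |\, \cdot\, | \underline{B}'_\lambda \Omega)$. The mean $m$, as a state on a commutative algebra, satisfies the Cauchy-Schwarz inequality, from which we can conclude:

$$m(\|\underline{\sigma}_\lambda\|) = m(\|\underline{B}_\lambda \Omega\|\, \|\underline{B}'_\lambda \Omega\|) \le m(\|\underline{B}_\lambda \Omega\|^2)^{1/2}\, m(\|\underline{B}'_\lambda \Omega\|^2)^{1/2} = \|\pi_0(\underline{B})\Omega_0\|\, \|\pi_0(\underline{B}')\Omega_0\| = \|\pi_0^* \underline{\sigma}\|. \tag{2.36}$$

This proves the lemma.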
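The Cauchy-Schwarz step in the proof above only uses that $m$ is a positive, normalized linear functional on bounded functions of $\lambda$. As a rough illustration, the Python sketch below replaces $m$ by a plain average over finitely many sample scales and checks the inequality for two bounded sequences playing the roles of $\|\underline{B}_\lambda \Omega\|$ and $\|\underline{B}'_\lambda \Omega\|$. The finite average is a stand-in of our own (genuine means on $\mathbb{R}_+$ are not of this form), and the sample data are hypothetical.

```python
import math


def toy_mean(values):
    """Stand-in for the mean m: average over finitely many sample scales."""
    return sum(values) / len(values)


def cauchy_schwarz_sides(f, g):
    """Return (m(f*g), m(f^2)^(1/2) * m(g^2)^(1/2)) for nonnegative sequences."""
    lhs = toy_mean([a * b for a, b in zip(f, g)])
    rhs = math.sqrt(toy_mean([a * a for a in f])) * math.sqrt(toy_mean([b * b for b in g]))
    return lhs, rhs


# Hypothetical sample data: two norm functions at the scales 1, 1/2, ..., 1/50.
scales = [1.0 / k for k in range(1, 51)]
f = [1.0 + math.sin(1.0 / lam) ** 2 for lam in scales]
g = [2.0 / (1.0 + lam) for lam in scales]

lhs, rhs = cauchy_schwarz_sides(f, g)
print(lhs <= rhs)
```

Equality holds in this toy setting when the two sequences are proportional, for instance when `g` is replaced by `f`.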
We can now use this structure to describe the scaling limit behavior of unbounded objects, more general than the bounded operator sequences $\underline{A} \in \underline{\mathfrak{A}}$. Namely, we consider functions $\lambda \mapsto \underline{\phi}_\lambda$, with values in $C^\infty(\Sigma)^*$. These will later be sequences of pointlike fields with renormalization factors. Here we are only interested in their high-energy behavior. We use a notion of “uniform” polynomial energy damping at all scales: For $\lambda > 0$, set $\underline{R}_\lambda := (1 + \lambda H)^{-1}$. This $\underline{R}$ is an element of $\underline{\mathfrak{B}}$, but not of $\underline{\mathfrak{A}}$. We can, however, multiply $\underline{\phi}$ with powers of $\underline{R}$ from the left or right, this product being understood “pointwise”. We can then consider the norms

$$\|\underline{\phi}\|^{(\ell)} = \sup_\lambda \|\underline{R}_\lambda^\ell\, \underline{\phi}_\lambda\, \underline{R}_\lambda^\ell\| \tag{2.37}$$
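on the spaces of those functions $\underline{\phi}$ where the supremum is finite. We also have a notion of symmetry transformation on the functions $\underline{\phi}$, defined in a natural way as

$$(\underline{\alpha}_{\mu,x,\Lambda} \underline{\phi})_\lambda := U(\mu\lambda x, \Lambda)\, \underline{\phi}_{\lambda\mu}\, U(\mu\lambda x, \Lambda)^*. \tag{2.38}$$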
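Our space of “regular” $\underline{\phi}$ is now defined as follows, in analogy to the algebras $\underline{\mathfrak{A}}$:

$$\underline{\Phi} := \{ \underline{\phi} : \mathbb{R}_+ \to C^\infty(\Sigma)^* \mid \exists\, \ell : \|\underline{\phi}\|^{(\ell)} < \infty;\ g \mapsto \underline{\alpha}_g \underline{\phi} \text{ is continuous in } \|\cdot\|^{(\ell)} \}. \tag{2.39}$$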
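By the natural inclusion $\mathfrak{B}(\mathcal{H}) \hookrightarrow C^\infty(\Sigma)^*$, we have an embedding $\underline{\mathfrak{A}} \hookrightarrow \underline{\Phi}$, compatible with the symmetry action. We note that the continuity requirement in Eq. (2.39) is trivially fulfilled for the translation subgroup, since $\|\lambda P_\mu \underline{R}_\lambda\| \le 1$ at all scales by the spectrum condition. For Lorentz transformations and dilations, the requirement is nontrivial; it suffices however to check continuity at $g = \mathrm{id}$, as is easily verified using the commutation relations between $\underline{\alpha}_g$ and $\underline{R}$. The functions in $\underline{\Phi}$ are sufficiently regular to allow the definition of a scaling limit. Namely, we have:

Proposition 2.9. Let $\underline{\phi} \in \underline{\Phi}$. There exists a unique element $\pi_0 \underline{\phi} \in C^\infty(\Sigma_0)^*$ such that $(\pi_0^* \underline{\sigma})(\pi_0 \underline{\phi}) = m(\underline{\sigma}_\lambda(\underline{\phi}_\lambda))$ for all $\underline{\sigma} \in \underline{\Sigma}$. The map $\underline{\phi} \mapsto \pi_0 \underline{\phi}$ is linear, and one has for any $E > 0$, $\|\pi_0 \underline{\phi} \lceil \Sigma_0(E - 0)\| \le \sup \|\underline{\phi}_\lambda \lceil \Sigma(E/\lambda)\|$. 0 0 is fixed in the following. Set φ E := sup00 R H.

In generalization of [F-He, Eq. 2.4], one easily proves the following relation, valid at fixed $\lambda$ in the sense of matrix elements:

$$[\underline{R}, \underline{\alpha}_f \underline{\phi}] = -i\, \underline{R}\, (\underline{\alpha}[\partial_0 f] \underline{\phi})\, \underline{R}. \tag{2.45}$$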
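This will be crucial in our approximation of pointlike fields later.

As a last point, let us consider a finite-dimensional subspace $\mathcal{E} \subset C^\infty(\Sigma)^*$. It is well known that $\mathcal{E}$ is always complementable in the locally convex space $C^\infty(\Sigma)^*$; i.e. there exists a continuous projection $p$ onto $\mathcal{E}$. We will need the fact that such projections can be chosen uniformly at all scales. To that end, we set $\underline{\Phi}^{(\ell)} := \{ \underline{\phi} \in \underline{\Phi} \mid \|\underline{\phi}\|^{(\ell)} < \infty \}$.

Proposition 2.10. Let $\mathcal{E} \subset C^\infty(\Sigma)^*$ be a finite-dimensional subspace such that $\alpha_\Lambda \mathcal{E} = \mathcal{E}$ for all Lorentz transforms $\Lambda$. Let $\ell > 0$ be large enough such that $\|\phi\|^{(\ell)} < \infty$ for all $\phi \in \mathcal{E}$. There exists a map $\underline{p} : \underline{\Phi}^{(\ell)} \to \underline{\Phi}^{(\ell)}$ of the form $(\underline{p}\, \underline{\phi})_\lambda = p_\lambda \underline{\phi}_\lambda$, where each $p_\lambda$ is a projector onto $\mathcal{E}$, and a constant $c > 0$, such that $\|\underline{p}\, \underline{\phi}\|^{(\ell)} \le c\, \|\underline{\phi}\|^{(\ell)}$.

We will refer to the map $\underline{p}$ (for given $\mathcal{E}$ and $\ell$) as a uniform projector onto $\mathcal{E}$.

Proof. Let $\mathcal{E}_\lambda$ be the space $\mathcal{E}$ equipped with the norm $\|\cdot\|_\lambda = \|\underline{R}_\lambda^\ell \,\cdot\, \underline{R}_\lambda^\ell\|$, and let $\mathcal{F}_\lambda$ be $\{ \phi \in C^\infty(\Sigma)^* \mid \|\phi\|^{(\ell)} < \infty \}$ with the same norm. Then there exists [L] a projection $q_\lambda : \mathcal{F}_\lambda \to \mathcal{E}_\lambda$ such that $\|q_\lambda \phi\|_\lambda \le \sqrt{n}\, \|\phi\|_\lambda$, where $n = \dim \mathcal{E}$. We now choose two positive test functions, $f$ on the Lorentz group and $h$ on $\mathbb{R}_+$, both of compact support and normalized in the respective $L^1$ norm, and define $p_\lambda$ as

$$\sigma(p_\lambda \phi) := K_h \Big( \mu \mapsto \int d\nu(\Lambda)\, f(\Lambda)\, \sigma(\alpha_\Lambda q_\mu \alpha_\Lambda^{-1} \phi) \Big)(\lambda) \quad \text{for } \sigma \in C^\infty(\Sigma),\ \phi \in C^\infty(\Sigma)^*. \tag{2.46}$$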
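Here $\nu$ is the Haar measure on the Lorentz group, and $K_h$ is the map established in Lemma 2.2. Since $\mathcal{E}$ is invariant under $\alpha_\Lambda$, each $\alpha_\Lambda q_\mu \alpha_\Lambda^{-1}$ is a projector onto $\mathcal{E}$, and one easily sees in matrix elements that then the same is true for $p_\lambda$. Using Lemma 2.2 (i), we find the following estimate for $p_\lambda$:

$$|\sigma(p_\lambda \phi)| \le 2 \log a\, \|h\|_\infty \|f\|_1 \sup_{\mu \in [1/a, a]}\ \sup_{\Lambda \in \operatorname{supp} f} |\sigma(\alpha_\Lambda q_{\lambda\mu} \alpha_\Lambda^{-1} \phi)|. \tag{2.47}$$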
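Here we can further estimate:

$$|\sigma(\alpha_\Lambda q_{\lambda\mu} \alpha_\Lambda^{-1} \phi)| \le \sqrt{n}\, \|\sigma(\underline{R}_\lambda^{-\ell} \,\cdot\, \underline{R}_\lambda^{-\ell})\|\, \|\underline{R}_\lambda^\ell \phi \underline{R}_\lambda^\ell\| \times \|\underline{R}_\lambda^\ell \alpha_\Lambda(\underline{R}_\lambda^{-\ell})\|^2\, \|\underline{R}_\lambda^{-\ell} \underline{R}_{\mu\lambda}^\ell\|^2\, \|\underline{R}_\lambda^\ell \underline{R}_{\mu\lambda}^{-\ell}\|^2\, \|\underline{R}_\lambda^\ell \alpha_\Lambda^{-1}(\underline{R}_\lambda^{-\ell})\|^2. \tag{2.48}$$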
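Using spectral analysis of the $\underline{R}_\lambda$, we can find a uniform bound on the right-hand side when $\mu$ and $\Lambda$ range over a compact set. Thus, combining Eqs. (2.47) and (2.48), we obtain a constant $c$ (depending on $f$, $h$ and $\ell$) such that the proposed estimate for $\|\underline{p}\, \underline{\phi}\|^{(\ell)}$ holds:

$$|\sigma(p_\lambda \phi)| \le c\, \|\sigma(\underline{R}_\lambda^{-\ell} \,\cdot\, \underline{R}_\lambda^{-\ell})\|\, \|\underline{R}_\lambda^\ell \phi \underline{R}_\lambda^\ell\|. \tag{2.49}$$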
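It remains to show that the symmetry transforms act continuously on $\underline{p}\, \underline{\phi}$. Here continuity for the translations is clear; we check continuity of $\underline{\alpha}_{\mu,\Lambda}\, \underline{p}\, \underline{\phi}$ as $(\mu, \Lambda) \to \mathrm{id}$. Using the translation invariance of the Haar measure and of the convolution $K_h$, we can derive the following for any $\lambda > 0$ and $\sigma \in C^\infty(\Sigma)$:

$$\begin{aligned}
\sigma\big((\underline{\alpha}_{\mu,\Lambda}\, \underline{p}\, \underline{\phi} - \underline{p}\, \underline{\phi})_\lambda\big)
&= K[h(\mu^{-1}\, \cdot\,) - h] \Big( \mu' \mapsto \int d\nu(\Lambda')\, f(\Lambda')\, \sigma(\alpha_{\Lambda\Lambda'} q_{\mu'} \alpha_{\Lambda'}^{-1} \underline{\phi}_{\mu\lambda}) \Big)(\lambda) \\
&\quad + K[h] \Big( \mu' \mapsto \int d\nu(\Lambda')\, f(\Lambda')\, \sigma(\alpha_{\Lambda\Lambda'} q_{\mu'} \alpha_{\Lambda'}^{-1} (\underline{\phi}_{\mu\lambda} - \underline{\phi}_\lambda)) \Big)(\lambda) \\
&\quad + K[h] \Big( \mu' \mapsto \int d\nu(\Lambda')\, \big( f(\Lambda^{-1}\Lambda') - f(\Lambda') \big)\, \sigma(\alpha_{\Lambda'} q_{\mu'} \alpha_{\Lambda'}^{-1} \alpha_\Lambda \underline{\phi}_\lambda) \Big)(\lambda) \\
&\quad + K[h] \Big( \mu' \mapsto \int d\nu(\Lambda')\, f(\Lambda')\, \sigma(\alpha_{\Lambda'} q_{\mu'} \alpha_{\Lambda'}^{-1} (\alpha_\Lambda \underline{\phi}_\lambda - \underline{\phi}_\lambda)) \Big)(\lambda).
\end{aligned} \tag{2.50}$$

Now we can use the following uniform estimates: Since $\underline{\phi} \in \underline{\Phi}$, we have $\|\underline{\alpha}_\Lambda \underline{\phi} - \underline{\phi}\|^{(\ell')} \to 0$ and $\|\underline{\alpha}_\mu \underline{\phi} - \underline{\phi}\|^{(\ell')} \to 0$ as $(\mu, \Lambda) \to \mathrm{id}$, where $\ell'$ is sufficiently large. Further, since $f$ and $h$ are test functions, one has $\|f(\Lambda^{-1}\, \cdot\,) - f\|_1 \to 0$ and $\|h(\mu^{-1}\, \cdot\,) - h\|_\infty \to 0$ in that limit. Applying all these to Eq. (2.50), and using similar techniques as in Eqs. (2.47)–(2.49), we can obtain $\|\underline{\alpha}_{\mu,\Lambda}\, \underline{p}\, \underline{\phi} - \underline{p}\, \underline{\phi}\|^{(\ell')} \to 0$. So $\underline{p}\, \underline{\phi} \in \underline{\Phi}$.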
3. Pointlike Fields

In this section, our task will be to analyze the behavior of pointlike quantum fields in the scaling limit. In order to relate these to the local algebras in question, we use the methods of [Bo2]. These methods are based on the assumption of a certain regularity condition, the microscopic phase space condition, which we shall recall in a moment. They allow for a full description of the field content of the net $\mathfrak{A}$ in the sense of Fredenhagen and Hertel [F-He]. We will introduce a phase space condition that is slightly stronger than the one proposed in [Bo2]. Assuming this condition, we show that the limit theory for pure limit states fulfills the original condition of [Bo2]. We describe the scaling limit of pointlike fields in detail. In particular, we are able to recover the usual picture of multiplicative renormalization of pointlike fields in our context.

3.1. Phase space conditions. Let us first recall the microscopic phase space condition from [Bo2]. It demands that the natural inclusion map $\Xi : C^\infty(\Sigma) \to \Sigma$ can be approximated by finite-rank maps, when the image functionals $\Xi(\sigma) \in \Sigma$ are restricted to small local algebras $\mathfrak{A}(r)$, $r \to 0$. The approximation quality, measured in the norms and seminorms introduced in Sect. 2.3, can be chosen to any given polynomial order in $r$. The precise definition is as follows.

Phase Space Condition I. For every $\gamma \ge 0$ there exist an $\ell \ge 0$ and a map $\psi : C^\infty(\Sigma) \to \Sigma$ of finite rank, such that

$$\|\psi\|^{(\ell)} < \infty, \qquad \|(\Xi - \psi) \lceil \mathfrak{A}(r)\|^{(\ell)} = o(r^\gamma) \text{ as } r \to 0.$$
We shall call the map $\psi$ appearing above an approximating map (I) of order $\gamma$, with the roman numeral referring to the phase space condition. It is known that the image of the dual map $\psi^*$ essentially consists of pointlike fields. More precisely, let us consider for $\gamma \ge 0$ the following space:$^6$

$$\Phi_\gamma = \{ \phi \in C^\infty(\Sigma)^* \mid \sigma(\phi) = 0 \text{ whenever } \|\sigma \lceil \mathfrak{A}(r)\| = O(r^{\gamma + \epsilon}) \text{ for some } \epsilon > 0 \}. \tag{3.1}$$

We know from [Bo2] that, if $\psi$ is an approximating map (I) of order $\gamma + \epsilon$ for some $\epsilon > 0$, then $\Phi_\gamma \subset \operatorname{img} \psi^*$. If $\psi$ is of minimal rank with this property, then equality holds. As shown in [Bo2], the phase space condition guarantees that the finite-dimensional spaces $\Phi_\gamma$ consist of pointlike quantum fields, which fulfill the Wightman axioms after smearing with test functions. Their union $\Phi_{\mathrm{FH}} = \cup_\gamma \Phi_\gamma$ is equal to the field content of the theory as introduced by Fredenhagen and Hertel [F-He]. The spaces are invariant under Lorentz transforms and other symmetries of the theory. Further, an operator product expansion exists between those fields [Bo3].

However, the above Condition I does not seem strict enough to guarantee a regular scaling limit of the fields. This is, roughly, because the estimates are not preserved under scaling: The short distance dimension $\gamma$ and the energy dimension $\ell$ of the fields are not required to coincide. We therefore propose a stricter condition.

Phase Space Condition II. For every $\gamma \ge 0$ there exist $c, \epsilon, r_1 > 0$ and a map $\psi : C^\infty(\Sigma) \to \Sigma$ of finite rank, such that

$$\|\psi \lceil \Sigma(E), \mathfrak{A}(r)\| \le c\,(1 + Er)^\gamma \quad \text{for } E \ge 1,\ r \le r_1,$$
$$\|(\Xi - \psi) \lceil \Sigma(E), \mathfrak{A}(r)\| \le c\,(Er)^{\gamma + \epsilon} \quad \text{for } E \ge 1,\ Er \le r_1.$$

Here the restriction $\lceil \Sigma(E), \mathfrak{A}(r)$ is to be understood as follows: the map (e.g. $\psi$) is restricted to $\Sigma(E)$, and its image points, being linear forms on $\mathfrak{A}$, are then restricted to $\mathfrak{A}(r)$. Again, we shall call the map $\psi$ an approximating map (II) of order $\gamma$.
This stricter criterion, which is still fulfilled in free field theory in physical space-time [Bo1], will allow us to pass to the scaling limit; only the scale-invariant expression $Er$ enters in its estimates. For consistency, we show that the image of $\psi^*$ has similar properties to those implied by Condition I.

Proposition 3.1. If $\psi$ is an approximating map (II) of order $\gamma$, then $\Phi_\gamma \subset \operatorname{img} \psi^*$. If $\psi$ is of minimal rank with this property, then $\Phi_\gamma = \operatorname{img} \psi^*$.

Proof. For the first part, it suffices to show that $\psi(\sigma) = 0$ implies $\sigma \lceil \Phi_\gamma = 0$. So let $\sigma \in C^\infty(\Sigma)$ with $\psi(\sigma) = 0$. For any $E \ge 1$, set $\sigma_E = \sigma(P(E)\, \cdot\, P(E))$. Then, we can obtain for any $\ell > 0$,

$$\|\sigma - \sigma_E\| \le 2\, \|\sigma\|^{(\ell)} (1 + E)^{-\ell}; \qquad \|\sigma - \sigma_E\|^{(\ell)} \le 2\, \|\sigma\|^{(2\ell)} (1 + E)^{-\ell}. \tag{3.2}$$
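Let us consider the estimate

$$\|\sigma \lceil \mathfrak{A}(r)\| = \|(\sigma - \psi(\sigma)) \lceil \mathfrak{A}(r)\| \le \|(\Xi - \psi)(\sigma_E) \lceil \mathfrak{A}(r)\| + \|\sigma - \sigma_E\| + \|\psi(\sigma - \sigma_E)\|. \tag{3.3}$$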
6 Note that this definition differs slightly from the convention chosen in [Bo2]. The effect of this change is that the map γ → dim Φγ is guaranteed to be continuous from the right.
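Here we choose $E = r^{-\eta}$ with sufficiently small $\eta > 0$, and observe that $\|\psi\|^{(\ell)} < \infty$ for large $\ell$, which is easily seen by expanding $\psi$ in a basis and applying Lemma 2.6. Using this, Eq. (3.2), and the fact that $\psi$ is approximating (II), we can achieve that the right-hand side of (3.3) vanishes like $O(r^{\gamma + \epsilon'})$ for some $\epsilon' > 0$. But that implies $\sigma \lceil \Phi_\gamma = 0$. Hence $\Phi_\gamma \subset \operatorname{img} \psi^*$.

Now suppose that there is $\phi \in \operatorname{img} \psi^*$ with $\phi \notin \Phi_\gamma$. Then there exists $\sigma \in C^\infty(\Sigma)$ with $\sigma(\phi) = 1$, $\|\sigma \lceil \mathfrak{A}(r)\| = O(r^{\gamma + \epsilon'})$ for some $\epsilon' > 0$. We note the following: With $E \ge 1$ and $\ell > 0$ to be specified later, we have per Eq. (3.2),

$$\begin{aligned}
\|\psi(\sigma) \lceil \mathfrak{A}(r)\| &\le \|(\psi - \Xi)(\sigma_E) \lceil \mathfrak{A}(r)\| + \|(\psi - \Xi)(\sigma - \sigma_E)\| + \|\sigma \lceil \mathfrak{A}(r)\| \\
&\le O((Er)^{\gamma + \epsilon}) + 2\, \|\psi - \Xi\|^{(\ell)} (1 + E)^{-\ell} \|\sigma\|^{(2\ell)} + O(r^{\gamma + \epsilon'}).
\end{aligned} \tag{3.4}$$

Choosing $E = r^{-\epsilon / 2(\gamma + \epsilon)}$, and $\ell$ sufficiently large, we can certainly find $\epsilon' > 0$ such that $\|\psi(\sigma) \lceil \mathfrak{A}(r)\| = O(r^{\gamma + \epsilon'})$. Also, it is clear that $\|\phi \lceil \Sigma(E)\| = O(E^\gamma)$. Now consider the map $\hat{\psi} := \psi - \psi(\sigma)\phi$. We show that it is also an approximating map (II) of order $\gamma$, but of lower rank than $\psi$. First we compute for $r \le r_1$, $E \ge 1$:

$$\begin{aligned}
\|\hat{\psi} \lceil \Sigma(E), \mathfrak{A}(r)\| &\le \|\psi \lceil \Sigma(E), \mathfrak{A}(r)\| + \|\psi(\sigma) \lceil \mathfrak{A}(r)\|\, \|\phi \lceil \Sigma(E)\| \\
&\le c\,(1 + Er)^\gamma + O(r^{\gamma + \epsilon'})\, O(E^\gamma) \le c'\,(1 + Er)^\gamma
\end{aligned} \tag{3.5}$$

with some $c' > 0$. This is the first of the desired estimates. The second estimate follows similarly, for $Er \le r_1$:

$$\begin{aligned}
\|(\Xi - \hat{\psi}) \lceil \Sigma(E), \mathfrak{A}(r)\| &\le \|(\Xi - \psi) \lceil \Sigma(E), \mathfrak{A}(r)\| + \|\psi(\sigma) \lceil \mathfrak{A}(r)\|\, \|\phi \lceil \Sigma(E)\| \\
&\le c\,(Er)^{\gamma + \epsilon} + O(r^{\gamma + \epsilon'})\, O(E^\gamma) \le c'\,(Er)^{\gamma + \epsilon}.
\end{aligned} \tag{3.6}$$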
(We have assumed $\epsilon' \ge \epsilon$ here.) So $\hat{\psi}$ is an approximating map (II) of order $\gamma$. Also, it is clear that $\operatorname{img} \hat{\psi}^* \subset \operatorname{img} \psi^*$, and thus $\ker \psi \subset \ker \hat{\psi}$. We know that $\sigma \notin \ker \psi$, since $\sigma(\phi) = 1$. On the other hand, $\hat{\psi}(\sigma) = \psi(\sigma) - \psi(\sigma)\sigma(\phi) = 0$. Thus $\ker \hat{\psi}$ is strictly larger than $\ker \psi$, implying $\operatorname{rank} \hat{\psi} < \operatorname{rank} \psi$. If $\operatorname{rank} \psi$ was minimal, this is not possible; thus we must have $\operatorname{img} \psi^* = \Phi_\gamma$.

We will formulate another phase space condition which will be of technical importance for us, and generalizes the above inasmuch as the finite-rank maps $\psi$ are allowed to depend on $r$ in a controlled way.

Phase Space Condition III. For every fixed $\gamma \ge 0$ there exist $c, \epsilon, r_1 > 0$, a closed subspace $K \subset C^\infty(\Sigma)$ of finite codimension, and for each $r \le r_1$ a map $\psi_r : C^\infty(\Sigma) \to \mathfrak{A}(r)^*$, such that $K \subset \ker \psi_r$ and

$$\|\psi_r \lceil \Sigma(E)\| \le c\,(1 + Er)^\gamma \quad \text{for } E \ge 1,\ r \le r_1,$$
$$\|(\Xi \lceil \mathfrak{A}(r) - \psi_r) \lceil \Sigma(E)\| \le c\,(Er)^{\gamma + \epsilon} \quad \text{for } E \ge 1,\ Er \le r_1.$$

These maps $\psi_r$ will be called approximating maps (III) of order $\gamma$. We now show how these phase space conditions are interrelated. First, we deduce from Condition III a version relating to the energy norms $\|\cdot\|^{(\ell)}$ rather than to a sharp cutoff.
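Lemma 3.2. Let $\{\psi_r\}$ be a set of approximating maps (III) of order $\gamma$. Then, we can find $\ell > 0$ and $c' > 0$ such that

$$\|\psi_r\|^{(\ell)} \le c' \quad \text{for all } r \le r_1, \qquad r^{-\gamma}\, \|(\Xi \lceil \mathfrak{A}(r) - \psi_r)\|^{(\ell)} \to 0 \quad \text{as } r \to 0.$$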
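Proof. The first part follows by expressing $\psi_r$ in a basis and applying Lemma 2.6. For the second part, set $\varphi_r = \Xi \lceil \mathfrak{A}(r) - \psi_r$, and let $\ell, \ell' > 0$ be sufficiently large in the following. In the expression

$$\|\varphi_r\|^{(\ell + \ell')} = \|\varphi_r(1 \cdot R^{\ell + \ell'}\, \cdot\, R^{\ell + \ell'} \cdot 1)\|, \tag{3.7}$$

we replace the identities shown as $1$ with $(1 - P(E)) + P(E)$. A brief calculation shows that

$$\|\varphi_r\|^{(\ell + \ell')} \le 2\, \|\varphi_r\|^{(\ell)} (1 + E)^{-\ell'} + \|\varphi_r \lceil \Sigma(E)\| \le 2c'\,(1 + E)^{-\ell'} + c\,(Er)^{\gamma + \epsilon} \tag{3.8}$$

for $E \ge 1$, $Er \le r_1$. Now setting $E = r^{-\eta}$ with sufficiently small positive $\eta$, and choosing $\ell'$ large enough, it is obvious that

$$r^{-\gamma}\, \|\varphi_r\|^{(\ell + \ell')} \xrightarrow[r \to 0]{} 0; \tag{3.9}$$

which gives the desired estimate after a redefinition of $\ell$.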
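The choice of $E$ in the last step can be made concrete. Reading the bound (3.8) with $E = r^{-\eta}$, the first term decays like $r^{\eta\ell'}$ and the second like $r^{(1-\eta)(\gamma+\epsilon)}$, so it suffices to have $\eta < \epsilon/(\gamma+\epsilon)$ and $\eta\ell' > \gamma$. The Python sketch below evaluates $r^{-\gamma}$ times the right-hand side of (3.8) for one such choice. The specific values of $\eta$ and $\ell'$, the default constants, and the function name are illustrative assumptions rather than the authors' choices.

```python
def scaled_bound(r, gamma, eps, c=1.0, c_prime=1.0):
    """Evaluate r**(-gamma) * (2*c_prime*(1+E)**(-ell_prime) + c*(E*r)**(gamma+eps)).

    E = r**(-eta) with eta = eps / (2*(gamma+eps)), so that
    (1-eta)*(gamma+eps) = gamma + eps/2 > gamma, and
    ell_prime = 2*gamma/eta + 1, so that eta*ell_prime = 2*gamma + eta > gamma.
    Both choices are illustrative; any eta, ell_prime with these two
    inequalities would serve equally well.
    """
    eta = eps / (2.0 * (gamma + eps))
    ell_prime = 2.0 * gamma / eta + 1.0
    energy = r ** (-eta)
    bound = 2.0 * c_prime * (1.0 + energy) ** (-ell_prime) + c * (energy * r) ** (gamma + eps)
    return bound / r ** gamma


if __name__ == "__main__":
    for r in (1e-2, 1e-4, 1e-6):
        print(r, scaled_bound(r, gamma=1.0, eps=0.5))
```

With these choices the printed values decrease as $r$ decreases, in line with (3.9); the rate is governed by the slower of the two exponents, here $\epsilon/2$.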
The relations between the different conditions are now:

Proposition 3.3. The following implications hold between the phase space conditions: II $\Longrightarrow$ III $\Longrightarrow$ I.

Proof. The implication II $\Longrightarrow$ III follows at once by defining $\psi_r := \psi \lceil \mathfrak{A}(r)$ and $K := \ker \psi$. We prove III $\Longrightarrow$ I, setting out from the estimates in Lemma 3.2. Let $\gamma \ge 0$ be given, and consider the corresponding $K$ and $\psi_r$. Define

$$K^\perp := \{ \phi \in C^\infty(\Sigma)^* \mid \phi \lceil K = 0 \}. \tag{3.10}$$
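Since $K$ is of finite codimension, $K^\perp$ is finite dimensional. Furthermore $K \subset \ker \psi_r$ implies $\operatorname{img} \psi_r^* \subset K^\perp$ for all $r \le r_1$. Now let $p = \sum_j \sigma_j(\,\cdot\,)\, \phi_j$ be a projector onto $K^\perp$, i.e. a linear map $p : C^\infty(\Sigma)^* \to C^\infty(\Sigma)^*$ such that $p^2 = p$ and $\operatorname{img} p = K^\perp$; and let $p_* : C^\infty(\Sigma) \to C^\infty(\Sigma)$ be its predual map, which always exists since $\operatorname{rank} p$ is finite. It is easily seen that $p_*(\sigma) - \sigma \in K$ for all $\sigma \in C^\infty(\Sigma)$, so that $\psi_r \circ p_* = \psi_r$ for all $r \le r_1$. Furthermore if $\ell' > 0$ is so big that $\|\phi_j\|^{(\ell')} < \infty$ for all $j$, it is clear that, for each $\ell > 0$,

$$\|p_*\|^{(\ell')} = \sup_{\sigma \in C^\infty(\Sigma)} \frac{\|p_*(\sigma)\|}{\|\sigma\|^{(\ell')}} \le \sum_j \|\phi_j\|^{(\ell')}\, \|\sigma_j\|, \tag{3.11}$$

$$\|p_*\|^{(\ell', \ell)} = \sup_{\sigma \in C^\infty(\Sigma)} \frac{\|p_*(\sigma)\|^{(\ell)}}{\|\sigma\|^{(\ell')}} \le \sum_j \|\phi_j\|^{(\ell')}\, \|\sigma_j\|^{(\ell)}. \tag{3.12}$$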
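Therefore if we define $\psi := \Xi \circ p_*$ we have $\|\psi\|^{(\ell')} = \|p_*\|^{(\ell')} < \infty$, and if $\ell' \ge \ell$,

$$\begin{aligned}
\|(\psi - \Xi) \lceil \mathfrak{A}(r)\|^{(\ell')} &\le \|\Xi \circ p_* \lceil \mathfrak{A}(r) - \psi_r\|^{(\ell')} + \|\psi_r - \Xi \lceil \mathfrak{A}(r)\|^{(\ell')} \\
&\le \|(\Xi \lceil \mathfrak{A}(r) - \psi_r) \circ p_*\|^{(\ell')} + \|\psi_r - \Xi \lceil \mathfrak{A}(r)\|^{(\ell')} \\
&\le \|\Xi \lceil \mathfrak{A}(r) - \psi_r\|^{(\ell)}\, \|p_*\|^{(\ell', \ell)} + \|\psi_r - \Xi \lceil \mathfrak{A}(r)\|^{(\ell)} = o(r^\gamma),
\end{aligned} \tag{3.13}$$

which implies Condition I.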
3.2. Phase space behavior of the limit theory. We will now investigate the phase space properties of the scaling limit theory. In what follows, we shall assume that the original theory $\mathfrak{A}$ fulfills Phase Space Condition II, as introduced above. The goal of this section is to show that, if the mean $m$ is multiplicative, the limit theory $\mathfrak{A}_0$ fulfills at least the somewhat weaker Condition I, which however still allows for a full description of pointlike fields.

We will keep $\gamma > 0$ and $\psi$, an approximating map (II) of order $\gamma$, fixed for this section. Our task is to construct a map $\psi_0 : C^\infty(\Sigma_0) \to \Sigma_0$ which fulfills the properties required in Condition I. Heuristically, one would like to define $\psi_0$ as a scaling limit of the map $\psi$, such that

$$\psi_0(\pi_0^* \underline{\sigma})(\pi_0 \underline{A}) = \lim_{\lambda \to 0} \psi(\underline{\sigma}_\lambda)(\underline{A}_\lambda). \tag{3.14}$$
Of course, this limit does not exist in general, and we need to replace it with the mean $m$. But even then, trying to use Eq. (3.14) as a definition of $\psi_0$, it is not clear why this would be well-defined in $\underline{\sigma}$ and $\underline{A}$. Specifically, it is not clear whether the right-hand side, considered as a functional in $\underline{A}$, is in the folium of the representation $\pi_0$.

For technical reasons, we will in fact use the dual map $\psi^* : \mathfrak{B}(\mathcal{H}) \to C^\infty(\Sigma)^*$ to define our phase space map in the limit, using the techniques developed in Sect. 2.3. For $\underline{A} \in \underline{\mathfrak{A}}(r_1)$, consider $\psi^*(\underline{A})$, i.e. the function $\lambda \mapsto \psi^*(\underline{A}_\lambda)$. With a proper choice of $\psi$, these functions are elements of our space $\underline{\Phi}$.

Lemma 3.4. Let Phase Space Condition II be fulfilled. For $\gamma \ge 0$, we can choose an approximating map (II) $\psi$ of order $\gamma$ and of minimal rank such that, for every $\underline{A} \in \underline{\mathfrak{A}}(r_1)$, one has $\psi^*(\underline{A}) \in \underline{\Phi}$. Further, $\|\psi^*(\underline{A})\|^{(\ell)} \le c\, \|\underline{A}\|$ for suitable $\ell$, $c$.

Proof. Let $\hat{\psi}$ be any approximating map (II) of order $\gamma$ and of minimal rank. Phase Space Condition II tells us that for suitable $\hat{c}$ and $\hat{r}_1$,

$$\|\hat{\psi}^*(\underline{A}_\lambda) \lceil \Sigma(E/\lambda)\| \le \hat{c}\,(1 + Er)^\gamma\, \|\underline{A}\| \quad \text{for } \underline{A} \in \underline{\mathfrak{A}}(r),\ r \le \hat{r}_1,\ E \ge 1,\ 0 < \lambda \le 1. \tag{3.15}$$

By application of Lemma 2.6 with respect to $\lambda H$, this implies $\|\hat{\psi}^*(\underline{A})\|^{(\ell)} \le c'\, \|\underline{A}\|$ for sufficiently large $c'$, $\ell$. Now let $h$ be a positive test function of compact support on the Lorentz group, with $\|h\|_1 = 1$, and consider the map $\psi$, defined by

$$\psi(\sigma) = \int d\nu(\Lambda)\, h(\Lambda)\, \alpha_\Lambda \hat{\psi}(\alpha_\Lambda^{-1} \sigma), \tag{3.16}$$

where the weak integral is well defined thanks to the fact that the restriction of $\alpha_\Lambda$ to $\Phi_\gamma$ is a finite-dimensional representation, and therefore is continuous in $\Lambda$. Then $\psi$ is as well an approximating map (II) of order $\gamma$, with suitable constants $c > \hat{c}$, $r_1 < \hat{r}_1$, as is easily seen. It fulfills $\|\psi^*(\underline{A})\|^{(\ell)} \le c'\, \|\underline{A}\|$ for sufficiently large $c'$. Also, it has the same rank as $\hat{\psi}$, since $\operatorname{img} \hat{\psi}^* = \Phi_\gamma$ is stable under Lorentz transforms. Now let $\underline{A} \in \underline{\mathfrak{A}}(r_1)$. We prove that $g \mapsto \underline{\alpha}_g \psi^*(\underline{A})$ is continuous in $\|\cdot\|^{(\ell)}$ for large $\ell$.
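For translations, this is trivially fulfilled, and for dilations it follows from $\underline{\alpha}_\mu \psi^*(\underline{A}) = \psi^*(\underline{\alpha}_\mu \underline{A})$ and from the continuity properties of $\underline{A}$. So let $\Lambda$ be in the Lorentz group. Similar to Eq. (2.50), we obtain

$$\begin{aligned}
(\underline{\alpha}_\Lambda \psi^*(\underline{A}) - \psi^*(\underline{A}))_\lambda &= \int d\nu(\Lambda')\, \big( h(\Lambda^{-1}\Lambda') - h(\Lambda') \big)\, \alpha_{\Lambda'} \hat{\psi}^* \alpha_{\Lambda'}^{-1} (\underline{\alpha}_\Lambda \underline{A})_\lambda \\
&\quad + \int d\nu(\Lambda')\, h(\Lambda')\, \alpha_{\Lambda'} \hat{\psi}^* \alpha_{\Lambda'}^{-1} (\underline{\alpha}_\Lambda \underline{A} - \underline{A})_\lambda
\end{aligned} \tag{3.17}$$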
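in the sense of matrix elements. Since $\Lambda \mapsto \underline{\alpha}_\Lambda \underline{A}$ is norm continuous, $h$ is smooth, and $\hat{\psi}$ fulfills the bounds in Eq. (3.15), the above expression vanishes uniformly in $\lambda$ in some norm $\|\cdot\|^{(\ell)}$ as $\Lambda \to \mathrm{id}$. So $\psi^*(\underline{A}) \in \underline{\Phi}$, and $\psi$ has all the required properties.

Using the map $\psi$ constructed in the above lemma, we now define our approximation map in the limit theory by

$$\psi_0^* : \underline{\mathfrak{A}}(r_1) \to C^\infty(\Sigma_0)^*, \quad \psi_0^*(\underline{A}) := \pi_0(\psi^*(\underline{A})). \tag{3.18}$$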
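Proposition 2.9 and Lemma 3.4 yield the estimate

$$\|\psi_0^*(\underline{A})\|^{(2\ell + 1)} \le c\, \|\underline{A}\| \quad \text{for } \underline{A} \in \underline{\mathfrak{A}}(r),\ r \le r_1. \tag{3.19}$$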
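This estimate also shows that the predual map $\psi_0 : C^\infty(\Sigma_0) \to \underline{\mathfrak{A}}(r_1)^*$ exists, fulfilling $\psi_0(\sigma_0)(\underline{A}) = \sigma_0(\psi_0^*(\underline{A}))$. Spelling this out explicitly for $\sigma_0 = \pi_0^* \underline{\sigma}$, we obtain

$$\psi_0(\pi_0^* \underline{\sigma})(\underline{A}) = m(\psi(\underline{\sigma}_\lambda)(\underline{A}_\lambda)), \tag{3.20}$$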
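which resembles the heuristic formula in Eq. (3.14). In addition to the estimate (3.19), we also obtain from Condition II and Proposition 2.9 that for normalized $\underline{A} \in \underline{\mathfrak{A}}(r)$,

$$\|(\psi_0^*(\underline{A}) - \pi_0(\underline{A})) \lceil \Sigma_0(E - 0)\| \le c\, \|\underline{A}\|\, (Er)^{\gamma + \epsilon}, \tag{3.21}$$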
supposing that $E \ge 1$, $Er \le r_1$. Since the right-hand side is continuous in $E$, this amounts to

$$\|(\psi_0 - \pi_0^* \Xi_0) \lceil \Sigma_0(E), \underline{\mathfrak{A}}(r)\| \le c\,(Er)^{\gamma + \epsilon}. \tag{3.22}$$
So most parts of Phase Space Condition I are fulfilled in the limit theory. However, a crucial point needs to be investigated: whether $\psi_0$ is of finite rank. We will actually show this in the case of our pure limit states.

Proposition 3.5. If the mean $m$ is multiplicative, then $\operatorname{rank} \psi_0 \le \operatorname{rank} \psi$.

Proof. Let $n := \operatorname{rank} \psi$. It suffices to prove the following: For given $\underline{A}_0, \ldots, \underline{A}_n \in \underline{\mathfrak{A}}(r_1)$, there are constants $(c_0, \ldots, c_n) \in \mathbb{C}^{n+1} \setminus \{0\}$ such that

$$\psi_0^* \Big( \sum_{j=0}^n c_j \underline{A}_j \Big) = 0; \tag{3.23}$$

for this shows that $\dim \operatorname{img} \psi_0 \le n$. Let such $\underline{A}_j$ be given. Since $\psi$ is of rank $n$, it is certainly possible to choose, for any $0 < \lambda \le 1$, numbers $c_{0,\lambda}, \ldots, c_{n,\lambda} \in \mathbb{C}$ such that

$$\psi^* \Big( \sum_{j=0}^n c_{j,\lambda} \underline{A}_{j,\lambda} \Big) = 0. \tag{3.24}$$
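Here not all of the $c_{j,\lambda}$ vanish, and so we can choose them to be on the unit sphere in $\mathbb{C}^{n+1}$. Then the functions $\lambda \mapsto c_{j,\lambda}$ are bounded, and we can define $c_j := m(c_{j,\lambda})$. We observe that for any $\underline{\sigma} \in \underline{\Sigma}$,

$$\pi_0^* \underline{\sigma} \Big( \psi_0^* \Big( \sum_j c_j \underline{A}_j \Big) \Big) = \sum_j m(c_{j,\lambda})\, m(\psi(\underline{\sigma}_\lambda)(\underline{A}_{j,\lambda})) = m \Big( \psi(\underline{\sigma}_\lambda) \Big( \sum_j c_{j,\lambda} \underline{A}_{j,\lambda} \Big) \Big) = 0, \tag{3.25}$$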
where we have used multiplicativity of the mean. Since we can extend this equation from $\pi_0^* \underline{\Sigma}$ to $C^\infty(\Sigma_0)$, this establishes Eq. (3.23). As a last point, not all of the $c_j$ vanish, since $\sum_j \bar{c}_j c_j = \sum_j m(\bar{c}_{j,\lambda} c_{j,\lambda}) = 1$, again using the multiplicative mean. This proves $\operatorname{rank} \psi_0 \le n$.

We note that the same will in general not be true if the mean is not multiplicative, for example for invariant means. In this case, the image of $\psi_0^*$ will in general be infinite-dimensional, containing in particular all operators in $\pi_0(Z(\underline{\mathfrak{A}}))$.

The remaining problem for establishing Phase Space Condition I is now the target space of $\psi_0$: Its image points are not normal functionals with respect to $\pi_0$. This does not directly affect our computations here, but is crucial in the analysis of associated Wightman fields [Bo2]. We solve this problem by taking the normal part of those functionals with respect to $\pi_0$, and showing that this is just as well suited for our approximation.

The notion of the normal part of a functional on a $C^*$ algebra with respect to a specific state needs explanation, since we use it in a slightly nonstandard way; it is treated in detail in Appendix A. Here we note only that the normal part depends both on the state and the algebra, and is not compatible with restriction to subalgebras. We obtain from Theorem A.1 for each $r \le r_1$ a linear map $N[\underline{\mathfrak{A}}(r), \omega_0] : \underline{\mathfrak{A}}(r)^* \to \mathfrak{A}_0(r)_*$ of norm 1. Now we define for each $r \le r_1$ a map $\psi_{0,r} : C^\infty(\Sigma_0) \to \mathfrak{A}_0(r)_*$ by

$$\psi_{0,r} := N[\underline{\mathfrak{A}}(r), \omega_0] \circ \rho_r \circ \psi_0, \tag{3.26}$$
where $\rho_r$ is the restriction map $\underline{\mathfrak{A}}(r_1)^* \to \underline{\mathfrak{A}}(r)^*$.

Proposition 3.6. The maps $\psi_{0,r}$ fulfill the two estimates requested in Phase Space Condition III. Moreover, if $m$ is multiplicative, we have $\operatorname{rank} \psi_{0,r} \le \operatorname{rank} \psi$.

Proof. The first estimate of Condition III follows from the corresponding estimate for $\psi_0$, see Eq. (3.19), and the fact that $\|N[\underline{\mathfrak{A}}(r), \omega_0]\| = 1$ by Theorem A.1 (iii). For the second estimate, we remark that, with $\rho_{0,r}$ being the restriction to $\mathfrak{A}_0(r)$,

$$\psi_{0,r} - \rho_{0,r} \Xi_0 = N[\underline{\mathfrak{A}}(r), \omega_0]\, \rho_r \psi_0 - N[\underline{\mathfrak{A}}(r), \omega_0]\, \pi_0^* \rho_{0,r} \Xi_0 = N[\underline{\mathfrak{A}}(r), \omega_0]\, \rho_r \big( \psi_0 - \pi_0^* \Xi_0 \big), \tag{3.27}$$

cf. Theorem A.1 (i). We can now prove the second estimate of Condition III from Eq. (3.22). By expressing $\psi_0$ in a basis, it is also clear that composition with $N[\underline{\mathfrak{A}}(r), \omega_0]$ does not increase the rank of the map; hence $\operatorname{rank} \psi_{0,r} \le \operatorname{rank} \psi$ for multiplicative $m$ by Proposition 3.5.
Setting $K := \ker \psi_0 \subset \ker \psi_{0,r}$, the limit theory then fulfills Phase Space Condition III by virtue of the maps $\psi_{0,r}$. Due to Proposition 3.3, this implies Phase Space Condition I for the limit theory. The dimensions of the field spaces $\Phi_\gamma$ do not increase when passing to the limit, since by our construction, the field space $\Phi_{0,\gamma}$ of the limit theory is contained in $\mathrm{img}\,\psi_0^*$.

Theorem 3.7. If the original net $A$ fulfills Phase Space Condition II, and the mean $m$ is multiplicative, then the scaling limit net $A_0$ fulfills Phase Space Condition I. For the size of the field content, we have $\dim \Phi_{0,\gamma} \le \dim \Phi_\gamma$.

This establishes all the consequences of the phase space condition in the limit theory, including the existence of operator product expansions. We shall see this more explicitly in Sec. 4.

3.3. Renormalized pointlike fields. We can now describe the renormalization limit of pointlike fields in our context. Heuristically, as noted in the introduction, renormalized point fields should appear as images of operator sequences $A \in A$, which already bear the “correct” renormalization, under the finite-rank map $\psi^*$. We shall now investigate this in detail.

In the following, let $\gamma > 0$ and a corresponding approximating map (II) $\psi$ of order $\gamma$ be fixed, where we choose $\psi$ as described in Lemma 3.4. For $A \in A(r)$, $r \le r_1$, also kept fixed for the moment, we set $\phi := \psi^*(A)$. Lemma 3.4 guarantees that $\phi \in \Phi$, i.e. $\phi$ is a “correctly renormalized” sequence. By Proposition 2.9, we know that a well-defined scaling limit $\pi_0(\phi) \in C^\infty(\Sigma_0)$ exists in the limit theory. A priori, however, we do not know anything about localization properties of $\pi_0(\phi)$. For describing these, we will establish a uniform approximation of $\phi$ with bounded operators.

Theorem 3.8. Let $\phi \in \Phi$ such that $\phi_\lambda \in \Phi_{FH}$ for all $\lambda$. There exist operators $A_r \in A(r)$, $0 < r \le 1$, and constants $k, \ell > 0$ such that in the limit $r \to 0$:
(i) $\|A_r\| = O(r^{-k})$,
(ii) $\|A_r - \phi\|^{(\ell)} = O(r)$,
(iii) $\|\pi_0(A_r) - \pi_0(\phi)\|^{(\ell)} = O(r)$.
Proof. We use methods similar to [Bo2, Lemma 3.5]. Choose a positive test function $f \in S(\mathbb{R}^{s+1})$ with $\|f\|_1 = 1$, and with support in the double cone $O_{r=1/2}$. Further, set $f_r(x) = r^{-(s+1)} f(x/r)$, which is then a “delta sequence” as $r \to 0$. For any fixed $r$ and $\lambda$, we know that $(\alpha[f_r]\phi)_\lambda = \int dx\, f_r(x)\,(\alpha_x \phi)_\lambda$ is a closable operator [F-He]; let $V_{r,\lambda} D_{r,\lambda}$ be the polar decomposition of its closure, with $V_{r,\lambda}$ being a partial isometry, and $D_{r,\lambda} \ge 0$. We know from [F-He] that both $V_{r,\lambda}$ and the spectral projectors of $D_{r,\lambda}$ are contained in $A(\lambda r/2)$. Let $\ell$ be sufficiently large such that $\|\phi\|^{(\ell)} < \infty$, and set for $\epsilon > 0$:
$$B_{r,\epsilon,\lambda} = \epsilon^{-1} V_{r,\lambda} \sin(\epsilon D_{r,\lambda}) \in A(\lambda r/2). \tag{3.28}$$
This $B_{r,\epsilon}$ is certainly an element of $B$, but not necessarily of $A$. Using the inequality for real numbers,
$$(x - \epsilon^{-1} \sin(\epsilon x))^2 \le \epsilon^2 x^4 \quad \text{for } x \ge 0,\ \epsilon > 0, \tag{3.29}$$
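we can use corresponding operator inequalities to establish the following estimate:
$$\|(B_{r,\epsilon,\lambda} - (\alpha[f_r]\phi)_\lambda) R_\lambda^{4\ell}\|^2 = \|(D_{r,\lambda} - \epsilon^{-1} \sin(\epsilon D_{r,\lambda})) R_\lambda^{4\ell}\|^2 \le \epsilon^2 \|(\alpha[f_r]\phi)^*_\lambda (\alpha[f_r]\phi)_\lambda R_\lambda^{4\ell}\|^2. \tag{3.30}$$
Now employing the commutation relation in Eq. (2.45), and using estimates of the type $\|\alpha[f]\phi\|^{(\ell)} \le \|f\|_1 \cdot \mathrm{const.}$, we can obtain the following uniform estimate in $\lambda$:
$$\|(\alpha[f_r]\phi)^*_\lambda (\alpha[f_r]\phi)_\lambda R_\lambda^{4\ell}\| \le c\, r^{-4\ell}, \tag{3.31}$$
where the constant $c$ depends on the details of the function $f$, but not on $r$ or $\lambda$. Combined with Eq. (3.30), this yields
$$\|B_{r,\epsilon} - \alpha[f_r]\phi\|^{(4\ell)} \le c\, \epsilon\, r^{-4\ell}. \tag{3.32}$$
Using the spectral properties of the translation group, it is also easy to verify that
$$\|\phi - \alpha[f_r]\phi\|^{(\ell+1)} = O(r). \tag{3.33}$$
Now setting $\epsilon = r^{4\ell+1}$, and then redefining $\ell$, we have obtained operators $B_r \in B$, with $B_{r,\lambda} \in A(\lambda r/2)$, and $k, \ell > 0$, such that
$$\|B_r\| = O(r^{-k}), \quad \|B_r - \phi\|^{(\ell)} = O(r). \tag{3.34}$$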
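The scalar inequality (3.29), on which the operator estimate (3.30) rests, can be checked numerically. The following is only a sanity check on a finite grid, not a proof; the function names `gap` and `check_grid` and the grid ranges are illustrative choices and do not appear in the original argument.

```python
import math

def gap(x: float, eps: float) -> float:
    """Return eps^2 * x^4 - (x - sin(eps*x)/eps)^2; Eq. (3.29) asserts this is >= 0."""
    lhs = (x - math.sin(eps * x) / eps) ** 2
    rhs = eps ** 2 * x ** 4
    return rhs - lhs

def check_grid(xs, epsilons, tol=1e-12):
    """True if the inequality holds, up to floating-point rounding, at every grid point."""
    return all(
        gap(x, e) >= -tol * max(1.0, e ** 2 * x ** 4)
        for x in xs
        for e in epsilons
    )

# Illustrative grid: 0 <= x <= 50 and eps between 1e-4 and 1e2.
xs = [0.01 * n for n in range(0, 5001)]
epsilons = [10.0 ** k for k in range(-4, 3)]
print(check_grid(xs, epsilons))
```

Writing $y = \epsilon x$, the inequality reduces to $|y - \sin y| \le y^2$ for $y \ge 0$, which follows from $y - \sin y \le y^3/6$ for small $y$ and from $y - \sin y \le y + 1$ for large $y$.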
In general, however, we cannot show that $B_r \in A(r)$. In order to remedy this problem, we proceed in two steps by regularizing first with respect to Poincaré transformations and then with respect to dilations. To that end, choose a family $(h_q^P)_{q>0}$ of positive test functions on $P_+^\uparrow$, with compact support shrinking to the identity as $q \to 0$, and converging to the delta function at $(x, \Lambda) = \mathrm{id}$. We then set $C_r := \alpha[h_q^P] B_r = \int d\nu(x, \Lambda)\, h_q^P(x, \Lambda)\, \alpha_{x,\Lambda} B_r$. If $q$ is sufficiently small for $r$, we obtain $C_{r,\lambda} \in A(3\lambda r/4)$, and that $(x, \Lambda) \mapsto \alpha_{x,\Lambda}(C_r)$ is norm continuous. Also, it is clear that $\|C_r\| = O(r^{-k})$, regardless of our choice of $q(r)$. If $q$ is small enough and taking $\ell > 1$, we have, by spectral analysis of $\alpha_\Lambda(R_\lambda) R_\lambda^{-1}$,
$$\|\alpha[h_q^P] B_r - \alpha[h_q^P] \phi\|^{(\ell)} \le c\, \|B_r - \phi\|^{(\ell)} \tag{3.35}$$
for some constant $c > 0$. Taking into account the continuity in some $\|\cdot\|^{(\ell)}$-norm of $\phi$ under Poincaré transformations, and choosing $q(r)$ small enough, we can obtain from Eq. (3.34) that
$$\|C_r - \phi\|^{(\ell)} = O(r). \tag{3.36}$$
Let now $1 < a < 4/3$, and let $h_a^D$ be a positive, continuous function of compact support in $(1/a, a)$, with $\int h_a^D(\mu)\, d\mu/\mu = 1$ and $\|h_a^D\|_\infty \log a \le 2$. (The value of $a$ will be specified later, dependent on $r$.) Recalling Lemma 2.3, set $A_r := \delta[h_a^D] C_r \in A(r)$. Clearly $\|A_r\| = O(r^{-k})$. Further, define $\phi_a \in \Phi$ by
$$\sigma(\phi_{a,\lambda}) = K[h_a^D]\big(\mu \mapsto \sigma(\phi_\mu)\big)(\lambda) = \int \frac{d\mu}{\mu}\, h_a^D(\mu)\, \sigma(\phi_{\mu\lambda}). \tag{3.37}$$
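(The equality with the integral follows from continuity properties of $\alpha_\mu \phi$, and it is easily checked that in fact $\phi_a \in \Phi$.) We now consider the estimate
$$\|A_r - \phi\|^{(\ell)} \le \|A_r - \phi_a\|^{(\ell)} + \|\phi_a - \phi\|^{(\ell)}. \tag{3.38}$$
For the first term on the right-hand side, we have
$$\sigma(A_{r,\lambda} - \phi_{a,\lambda}) = K[h_a^D]\big(\mu \mapsto \sigma(C_{r,\mu} - \phi_\mu)\big)(\lambda) \quad \text{for all } \sigma \in C^\infty(\Sigma)^*, \tag{3.39}$$
which yields the estimate
$$\|A_r - \phi_a\|^{(\ell)} \le 2 (\log a)\, \|h_a^D\|_\infty\, \|C_r - \phi\|^{(\ell)} \sup_{\lambda > 0,\ \mu \in (1/a, a)} \|R_\lambda / R_{\mu\lambda}\|^{2\ell} = O(r), \tag{3.40}$$
regardless of our choice of $a$. The second term in Eq. (3.38) can be written in terms of integrals; that gives
$$\|\phi_a - \phi\|^{(\ell)} \le \|h_a^D\|_1 \sup_{\mu \in (1/a, a)} \|\alpha_\mu \phi - \phi\|^{(\ell)}. \tag{3.41}$$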
Now if $a(r)$ is chosen sufficiently close to 1, we can certainly achieve that this bound vanishes like $O(r)$, due to the continuity of $\mu \mapsto \alpha_\mu \phi$. Inserting into Eq. (3.38), we obtain $\|A_r - \phi\|^{(\ell)} = O(r)$ as proposed, which establishes part (ii) of the theorem. Part (i) was already clear. For part (iii), we only need to invoke Proposition 2.9.

The above theorem shows in particular that $\pi_0 \phi$ can be approximated with bounded operators in the limit theory, with their localization region shrinking to the origin. Applying the results of [F-He], we can state:

Corollary 3.9. Let $\phi \in \Phi$ such that $\phi_\lambda \in \Phi_{FH}$ for all $\lambda$. Then $\pi_0 \phi$ is an element of $\Phi_{FH,0}$, the field content of the limit theory $A_0$.

This applies in particular to $\phi = \psi^*(A)$, so $\pi_0 \phi = \psi_0(A) \in \Phi_{FH,0}$. This relation holds true even in the case where the rank of $\psi_0$ is not finite. For multiplicative means $m$, we further know by our results in Sect. 3.2 that every local field in the limit theory can be obtained in this way; “no new fields appear in the limit”.

We now explain how our results are related to the usual formalism of renormalized fields. As above, we consider $\phi = \psi^*(A)$ with a fixed $A \in A(r_1)$. We write the finite rank map $\psi$ in a basis, $\psi = \sum_j \sigma_j \phi_j$, with local fields $\phi_j$ associated with the original theory $A$. Now we have
$$(\alpha_x \phi)_\lambda = \alpha_{\lambda x} \psi^*(A_\lambda) = \sum_j \sigma_j(A_\lambda)\, \phi_j(\lambda x) = \sum_j Z_{j,\lambda}\, \phi_j(\lambda x), \tag{3.42}$$
defining the “renormalization factors” $Z_{j,\lambda} := \sigma_j(A_\lambda)$. By our above results, $\phi_0 := \pi_0 \phi = \pi_0 \psi^*(A)$ is a local field in the limit theory, for which we have the formula
$$\phi_0(x) = \pi_0 \alpha_x \phi = \text{“}\lim_\lambda\text{”} \sum_j Z_{j,\lambda}\, \phi_j(\lambda x). \tag{3.43}$$
The “limit” on the right-hand side needs to be read as an application of the mean $m$ to the appropriate expectation values, i.e.
$$(\pi_0(B)\Omega_0 | \phi_0(x) | \pi_0(B')\Omega_0) = m\Big( \sum_j Z_{j,\lambda}\, (B_\lambda \Omega | \phi_j(\lambda x) | B'_\lambda \Omega) \Big) \quad \text{for } B, B' \in \bigcup_E \tilde{A}(E). \tag{3.44}$$
Symmetry transformations are compatible with this limit by Eq. (2.43), so the symmetry group at finite scales converges to the symmetry group in the limit. In the case of an invariant mean $m$, not only the Poincaré transformations but also the dilations can be extended in this way to the limit theory, and hence to the fields. We obtain
$$U_0(\mu) \phi_0 U_0(\mu)^* = \pi_0 \alpha_\mu \phi, \quad \text{where } (\alpha_\mu \phi)_\lambda = \phi_{\mu\lambda} = \sum_j Z_{j,\mu\lambda}\, \phi_j. \tag{3.45}$$
So a shifting of the renormalization factors corresponds to a unitary transformation of the fields in the limit theory. We may interpret $\alpha_\mu$ as an action of the renormalization group on the theory, in the sense of Gell-Mann and Low [G-L]. Thus the renormalization group $\alpha_\mu$ induces the dilation symmetry $\alpha_{0,\mu}$ in the scaling limit. Note however that the field spaces will in general not be finite dimensional in the limit if the mean $m$ is not multiplicative; so the representation $\mu \mapsto \mathrm{ad}\, U_0(\mu) \lceil \Phi_\gamma$ is not finite dimensional.

To understand this in more detail in an example, let us consider the case where the limit theory factorizes as a tensor product like in Eq. (2.19), such as in the case of a free field [BDM]. Setting $Z_0 := \pi_0(Z(A))''$, a commutative algebra, we would have
$$A_0(O) \cong Z_0 \bar\otimes \hat{A}(O), \quad U_0(g) \cong U_0(g) \lceil H_Z \otimes \hat{U}(g), \tag{3.46}$$
where $\hat{A}$ is the theory associated with a pure limit state. Here $U_0(g) \lceil H_Z = 1$ for all Poincaré transformations. The field content of the limit theory is characterized [F-He] by
$$\phi_0 \in \Phi_{FH,0} \;\Leftrightarrow\; R_0^\ell \phi_0 R_0^\ell \in \bigcap_O \overline{R_0^\ell A_0(O) R_0^\ell} \cong \bigcap_O \overline{Z_0 \bar\otimes \hat{R}^\ell \hat{A}(O) \hat{R}^\ell} \quad \text{for some } \ell > 0. \tag{3.47}$$
The intersection runs over all neighborhoods $O$ of the origin, and the bar denotes weak closure. $\Phi_{FH,0}$ in particular includes the finitely generated modules $Z_0 \hat\Phi_\gamma$, where $\hat\Phi_\gamma$ are the field spaces corresponding to the theory $\hat{A}$. For a pure limit state, we have $Z_0 = \mathbb{C}1$, so that $\Phi_{FH,0}$ coincides with $\hat\Phi_{FH}$, the field content of $\hat{A}$.

Now choose a special type of element in $\Phi_{FH,0}$, of the form $\phi_0 = 1 \otimes \hat\phi$ with $\hat\phi \in \hat\Phi_{FH}$. Suppose that $\phi \in \Phi$ exists with $\pi_0 \phi = \phi_0$ (we note that this is actually the case for a free field). For this special choice, Eq. (3.45) reads
$$U_0(\mu)(1 \otimes \hat\phi) U_0(\mu)^* = 1 \otimes \hat{U}(\mu) \hat\phi\, \hat{U}(\mu)^* = \text{“lim”} \sum_j Z_{j,\mu\lambda}\, \phi_j. \tag{3.48}$$
Here the scaling transformations actually act like $\hat{U} \lceil \hat\Phi_\gamma$, a finite dimensional representation of the dilation group; these are well classified [Boe, Ch. V, §9]. For more general elements of $\Phi_{FH,0}$, the nontrivial action of dilations on the center has to be taken into account, as per Eqs. (3.45) and (3.46).
4. Operator Product Expansions

A critical point in the analysis of the scaling limit is the behavior of the interaction with the changing scales. In Lagrangian quantum field theory, this is usually formulated in terms of the renormalization group flow: A change in scale is compensated by a change in the Lagrangian, i.e. by modifying the coupling constants of the theory. Here, we do not assume that the theory under discussion is generated by a Lagrangian, and the concept of coupling constants is unavailable in our model-independent context. Rather, a change of interaction shows up in the algebraic relations of the observables, which are different at each scale $\lambda$. Relating to pointlike quantum fields, their (singular) algebraic structure is described by the operator product expansion [W-Z]. It is the behavior of this expansion at small scales which reflects the “structure constants” of our “improper algebra” of quantum fields.

We know that, as a consequence of the phase space condition, an operator product expansion (OPE) exists for the theory at finite scales [Bo3]. In this section, we will investigate how this carries over to the limit theory. We assume throughout this section that the original theory $A$ fulfills Phase Space Condition II.

We will focus here on the OPE for the product of two fields, understood in the sense of distributions. Recall from [Bo3] that the OPE at finite scales is roughly given by
$$\phi(f)\phi'(f') \approx p_\gamma(\phi(f)\phi'(f')), \tag{4.1}$$
where $\phi, \phi' \in \Phi_{\gamma'}$ are pointlike fields, $\gamma$ is large enough for $\gamma'$, and $p_\gamma$ is a projector onto $\Phi_\gamma$. The approximation is valid in the limit where the support of the test functions $f$ and $f'$ shrinks to the origin.

In the following, let $A, A' \in A(r_1)$ be fixed, as well as $\gamma > 0$, and let $\psi$ be an approximating map of order $\gamma$, as in Lemma 3.4. We set $\phi := \psi^*(A)$, $\phi' := \psi^*(A')$. Further, let $f, f' \in S(\mathbb{R}^{s+1})$ be fixed test functions with support in $O_{r=1}$. We consider for $d > 0$,
$$f_d^{[\prime]} := d^{-(s+1)} f^{[\prime]}(x/d); \tag{4.2}$$
our short distance limit will then be $d \to 0$. We wish to analyze the product
$$\Pi := (\alpha[f_d]\phi) \cdot (\alpha[f'_d]\phi') \in \Phi. \tag{4.3}$$
See Eq. (2.44) for the notation. We note that each $(\alpha[f_d]\phi)_\lambda$ can be extended [F-He] to an unbounded operator on the invariant domain $C^\infty(H) = \cap_{\ell > 0} R^\ell H$, so the product is well-defined; and energy bounds and continuity properties with respect to $\alpha_g$ can be obtained using Eq. (2.45), so that in fact $\Pi \in \Phi$.

Our task is to obtain a product expansion for $\Pi$, uniform at all scales. To that end, we choose $A_r$, $A'_r$ as approximating sequences for $\phi$, $\phi'$ by Theorem 3.8. We set $B_r := (\alpha[f_d]A_r) \cdot (\alpha[f'_d]A'_r) \in A(r + d)$. This sequence is supposed to approximate $\Pi$ as $r \to 0$. In fact, we show:

Lemma 4.1. There are constants $c, \ell > 0$ such that
$$\|\Pi - B_r\|^{(\ell)} \le c\, r\, d^{-\ell} \quad \text{for all } d > 0 \text{ and } r \le 1. \tag{4.4}$$
Proof. We write the difference $\Pi - B_r$ in terms of the individual fields:
$$\Pi - B_r = \alpha[f_d](\phi - A_r)\; \alpha[f'_d]\phi' + \alpha[f_d]A_r\; \alpha[f'_d](\phi' - A'_r). \tag{4.5}$$
We shall derive estimates only for the first summand, the second is treated in an analogous way. Using the commutation relation in Eq. (2.45) multiple times, we obtain a relation of the type R 2 α[ f d ](φ − Ar ) (α[ f d ]φ )R 2
n n = c j R j α[∂0 j f d ](φ − Ar ) R j α[∂0 j f d ](φ ) R j , (4.6) j
with certain constants c j , where we can achieve that j ≥ , j ≥ 2, j ≥ , n j ≤ , n j ≤ . If now is sufficiently large (as in Theorem 3.8), we can apply the following estimates: n
n
α[∂0 j f d ](φ − Ar )() ≤ ∂0 j f d 1 O(r ) ≤ d − O(r ), n α[∂0 j f d ](φ )()
≤
n ∂0 j f d 1 O(1)
(4.7)
≤ d − O(1).
(4.8)
Applying this to Eq. (4.6), and using a similar bound for φ and φ exchanged, yields the proposed estimates after a redefinition of . Now let > and γ > 0 be fixed; their value will be specified later. We choose a uniform projector p onto Φγ according to Proposition 2.10 (for details see there), where p φ( ) ≤ φ( ) · const. We note that p may depend on and γ , but the estimate referred to does not, apart from a multiplicative constant. By Lemma 4.1 above, we know
− B r ( ) = d − O(r ), p( − B r )( ) = d − O(r ).
(4.9)
We now choose an approximating map (II) ψ of order γ . Then pψ ∗ = ψ ∗ , and therefore
B r − p B r ( ) ≤ B r − ψ ∗ (B r )( ) + p(B r − ψ ∗ (B r ))( )
≤ B r − ψ ∗ (B r )( ) · const.
(4.10)
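With similar arguments as in Eq. (3.8), we can obtain an estimate of the form
$$\|B_r - \psi^*(B_r)\|^{(\ell')} \le \|B_r\| \big( (E(r + d))^\gamma + (1 + E)^{-\ell'/2} \big) \cdot \mathrm{const}, \tag{4.11}$$
where we can choose $E$ dependent on $d$, and we have supposed $\ell' \ge 2\gamma + 2$ and $E(r + d) \le r_1$. Now, in summary, this yields
$$\|\Pi - p\Pi\|^{(\ell')} = d^{-\ell} O(r) + O(r^{-k}) \big( (E(r + d))^\gamma + (1 + E)^{-\ell'/2} \big). \tag{4.12}$$
For given $\beta > 0$, we now choose $r = d^{\ell+\beta+1}$, $E = r_1 d^{-1/2}$, $\gamma = (2k + 2)(\ell + \beta + 1)$, and $\ell' = 2\gamma + 2$. With this choice, the three terms on the right-hand side of Eq. (4.12) are of order $d^{\beta+1}$, $d^{\ell+\beta+1}$ and $d^{\ell+\beta+3/2}$ respectively for small $d$, and we obtain
$$\|\Pi - p\Pi\|^{(\ell')} = o(d^\beta). \tag{4.13}$$
We summarize our result as follows: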
Theorem 4.2. Let $\Pi$ be defined as in Eq. (4.3). For every $\beta > 0$, there exist $\gamma > 0$ and $\ell > 0$ such that, with $p$ being a uniform projector onto $\Phi_\gamma$ as in Proposition 2.10,
$$d^{-\beta} \|\Pi - p\Pi\|^{(\ell)} \to 0 \quad \text{as } d \to 0.$$
This constitutes a uniform OPE at all scales. We now transfer these estimates to the limit theory. The OPE terms in the limit are supposed to be $\pi_0 p\Pi$, i.e. the limit of those at finite scales, and they should approximate $\pi_0 \Pi$, the limit of the product. First, we show that $\pi_0$ is in fact compatible with the product structure.

Lemma 4.3. For any $f, f' \in S(\mathbb{R}^{s+1})$ and any $d > 0$, we have:
$$\pi_0 \Pi = \alpha_0[f_d](\pi_0 \phi) \cdot \alpha_0[f'_d](\pi_0 \phi').$$
Proof. We certainly know that an analogous relation holds between bounded operators:
$$\pi_0 B_r = \alpha_0[f_d](\pi_0 A_r) \cdot \alpha_0[f'_d](\pi_0 A'_r). \tag{4.14}$$
Considering $B_r$ as an element of $\Phi$, we obtain by Proposition 2.9 and Lemma 4.1,
$$\|\pi_0(B_r - \Pi)\|^{(2\ell+1)} \le \|B_r - \Pi\|^{(\ell)} \cdot \mathrm{const} \to 0 \quad \text{as } r \to 0, \tag{4.15}$$
where $\ell$ is large enough, and $d$ is kept fixed. Also, by Theorem 3.8 and Eq. (2.43), we know that for large $\ell$,
$$\|\alpha_0[f_d]\pi_0(A_r) - \alpha_0[f_d]\pi_0(\phi)\|^{(\ell)} \to 0 \quad \text{as } r \to 0. \tag{4.16}$$
The same holds for $A'$, $\phi'$ in place of $A$, $\phi$. We can now use techniques as in the proof of Lemma 4.1 to show that, for large $\ell$,
$$\|\pi_0(B_r) - \alpha_0[f_d]\pi_0(\phi)\; \alpha_0[f'_d]\pi_0(\phi')\|^{(\ell)} \to 0 \quad \text{as } r \to 0. \tag{4.17}$$
Combined with Eq. (4.15), this yields the proposed result.

Now applying Proposition 2.9 to the result of Theorem 4.2, we can summarize our results as follows:

Corollary 4.4. For every $\beta > 0$, there exist $\gamma > 0$, $\ell > 0$, and a uniform projector $p$ onto $\Phi_\gamma$ such that
$$d^{-\beta} \|\pi_0 \Pi - \pi_0 p\Pi\|^{(\ell)} \to 0 \quad \text{as } d \to 0.$$
Here we have $\pi_0 \Pi = \alpha_0[f_d](\pi_0 \phi) \cdot \alpha_0[f'_d](\pi_0 \phi')$. Further, $\pi_0(p\Pi) \in \Phi_{FH,0}$ at any fixed $d$.
The last part follows by applying Corollary 3.9 to $p\Pi$. We may interpret these results as follows: The product of fields at fixed scales converges to the corresponding product of fields in the limit theory; and the OPE terms at fixed scales converge to OPE terms in the limit theory. If $m$ is multiplicative, then we obtain finitely many independent OPE terms, in the sense that the multilinear map $(A, A', f, f') \mapsto \pi_0 p\Pi = \pi_0 p(\alpha[f]\psi^*(A)\, \alpha[f']\psi^*(A'))$ has a finite-dimensional image if we keep the approximation map $\psi$ fixed. If $m$ is not multiplicative, the OPE may be degenerate in the sense discussed in Sect. 3.3.

In order to understand the role of the renormalization factors in the OPE, let us again translate our results to the usual notation in physics. From Corollary 4.4, the OPE terms in the limit theory are given by $\pi_0 p\Pi$. We write each $p_\lambda$ in a basis: $p_\lambda = \sum_j \sigma_{j,\lambda}(\,\cdot\,)\phi_j$, where we can choose the fields $\phi_j$ independent of $\lambda$, but the functionals $\sigma_{j,\lambda}$ will generally depend on $\lambda$. Writing the distributions as formal integration kernels (as usual in physics), this gives
$$(p\Pi)_\lambda = \big(p((\alpha_x \phi)(\alpha_{x'} \phi'))\big)_\lambda = \sum_j \sigma_{j,\lambda}\big((\alpha_x \phi)_\lambda (\alpha_{x'} \phi')_\lambda\big)\, \phi_j. \tag{4.18}$$
So the expressions $c_{j,\lambda}(x, x') := \sigma_{j,\lambda}((\alpha_x \phi)_\lambda (\alpha_{x'} \phi')_\lambda)$ can be interpreted as OPE coefficients at scale $\lambda$. For further comparison with perturbation theory, let us choose $\phi^{(m)}_\lambda = \sum_n Z^{(m)}_{n,\lambda} \phi_n$ (see Eq. (3.42)) such that at scale 1, we have $Z^{(m)}_{n,1} = \delta_{mn}$. Let us assume that the matrix $Z^{(m)}_{n,\lambda}$ is invertible at each fixed $\lambda$. The product expansion for the product $\Pi^{(m,m')}$ with $\phi = \phi^{(m)}$ and $\phi' = \phi^{(m')}$ then reads
$$(p\Pi^{(m,m')})_\lambda = \sum_j \underbrace{\sum_{k,n,n'} Z^{(m)}_{n,\lambda} Z^{(m')}_{n',\lambda} (Z^{-1})^{(k)}_{j,\lambda}\, \sigma_{k,\lambda}\big(\phi_n(\lambda x)\phi_{n'}(\lambda x')\big)}_{=:\, c^{(m,m')}_{j,\lambda}(x, x')}\; \phi^{(j)}_\lambda. \tag{4.19}$$
If we can assume here that the functionals $\sigma_{k,\lambda}$ can be chosen independent of $\lambda$, then this gives the formula well known from perturbation theory (cf. Eq. (98) in [Ho]):
$$c^{(m,m')}_{j,\lambda}(x, x') = \sum_{k,n,n'} Z^{(m)}_{n,\lambda} Z^{(m')}_{n',\lambda} (Z^{-1})^{(k)}_{j,\lambda}\, c^{(n,n')}_{k,1}(\lambda x, \lambda x'). \tag{4.20}$$
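The index bookkeeping in the scaling law (4.20) can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions that are not part of the original text: the points $x, x'$ are replaced by real numbers, the field basis is finite, and the scale-1 coefficients are given as ordinary functions. The names `invert` and `ope_coefficient` are hypothetical. The matrix `z[m][n]` stands for $Z^{(m)}_{n,\lambda}$ at one fixed scale, and `z_inv[k][j]` for $(Z^{-1})^{(k)}_{j,\lambda}$.

```python
from typing import Callable, List

Matrix = List[List[float]]
# coeff1[k][n][n2] is a function (x, x2) -> float, standing in for c^{(n,n')}_{k,1}.
Coeff = List[List[List[Callable[[float, float], float]]]]

def invert(z: Matrix) -> Matrix:
    """Gauss-Jordan inverse with partial pivoting, for small square matrices."""
    n = len(z)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(z)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix of renormalization factors is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ope_coefficient(z: Matrix, z_inv: Matrix, coeff1: Coeff, lam: float,
                    m: int, m2: int, j: int, x: float, x2: float) -> float:
    """Right-hand side of Eq. (4.20): sum over k, n, n' at scale lam."""
    size = len(z)
    total = 0.0
    for k in range(size):
        for n in range(size):
            for n2 in range(size):
                total += (z[m][n] * z[m2][n2] * z_inv[k][j]
                          * coeff1[k][n][n2](lam * x, lam * x2))
    return total

# One-field toy example: Z = 2 at scale lam = 0.5, scale-1 coefficient 1/(x - x')^2.
z = [[2.0]]
coeff1 = [[[lambda x, x2: 1.0 / (x - x2) ** 2]]]
print(ope_coefficient(z, invert(z), coeff1, 0.5, 0, 0, 0, 1.0, 3.0))
```

For a single field the formula collapses to $c_\lambda(x, x') = Z_\lambda\, c_1(\lambda x, \lambda x')$, since two factors of $Z_\lambda$ come from the product and one factor of $Z_\lambda^{-1}$ from re-expressing the result in the renormalized basis.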
Whether the assumption of $\lambda$-independent functionals $\sigma_j$ is justified remains open, however. In fact, if the basis functionals $\sigma_{j,\lambda}$ are chosen energy-bounded, which is always possible at finite scales, one would rather expect that this energy bound needs to be properly rescaled as $\lambda \to 0$. However, in the context of perturbation theory, it may be justified to assume that $\sigma_{j,\lambda}$ is independent of $\lambda$ up to terms of higher order.

The symmetry group $G$ of the scaling algebra acts on the uniform operator product expansions in a natural way. Namely, let $g = (\mu, 0, \Lambda)$ be a dilation and/or Lorentz transform, but with no translation part. By Theorem 4.2, the OPE is given by
$$d^{-\beta} \|\Pi - p\Pi\|^{(\ell)} \to 0 \quad \text{as } d \to 0. \tag{4.21}$$
Now note that, if $p$ is a uniform projector as described in Proposition 2.10, then $\alpha_g^{-1} p\, \alpha_g$ is also a projector with the properties given in that proposition. So we also obtain
$$d^{-\beta} \|\Pi - \alpha_g^{-1} p\, \alpha_g \Pi\|^{(\ell)} \to 0 \quad \text{as } d \to 0. \tag{4.22}$$
Combining Eqs. (4.21) and (4.22), and observing that we can apply $\alpha_g$ within the norm without changing the limit, we obtain
$$d^{-\beta} \|\alpha_g p\Pi - p\, \alpha_g \Pi\|^{(\ell)} \to 0 \quad \text{as } d \to 0. \tag{4.23}$$
In other words, the OPE terms $p\, \alpha_g \Pi$ for the transformed product $\alpha_g \Pi$ are the same as the $\alpha_g$-transformed OPE terms $p\Pi$ for $\Pi$, up to terms of higher order in $d$. The same holds then in the limit theory, by application of $\pi_0$ to all terms. As above, we can use a basis representation of $p$ and $\phi$, $\phi'$ in order to express the relation in Eq. (4.23) in terms of renormalization factors. It should be noted that, in a perturbative context, symmetry properties of the OPE are one of its key features that is exploited for applications; see e.g. [B+].

The situation is simpler if we consider only products evaluated in the vacuum state, i.e. if we compute the renormalization limits of Wightman functions. We shall only consider the case of a two-point function here, with $\phi' = \phi^*$; other n-point functions can be handled in a similar way. Evaluating the results of Lemma 4.3 in the vacuum (for $d = 1$), we have for $\phi_0 := \pi_0 \phi$,
$$(\Omega_0 | (\alpha_0[f]\phi_0)(\alpha_0[f']\phi_0^*) | \Omega_0) = m\big(\omega((\alpha_f \phi)_\lambda (\alpha_{f'} \phi)^*_\lambda)\big). \tag{4.24}$$
Defining the usual two-point functions $W_{jk}(x, x') = (\Omega | \phi_j(x) \phi_k(x') | \Omega)$ in the original theory, and $W_0(x, x') = (\Omega_0 | \phi_0(x) \phi_0^*(x') | \Omega_0)$ in the limit theory, we obtain:
$$W_0(x, x') = m\Big( \sum_{j,k} Z_{j,\lambda} \bar{Z}_{k,\lambda}\, W_{jk}(\lambda x, \lambda x') \Big). \tag{4.25}$$
So the Wightman functions “converge” (in the sense of means) to their expected limits. To illustrate the consequences, let us consider an invariant mean $m$, and let us assume the following simplified situation:
(i) We only deal with one renormalization factor, i.e. $\phi_\lambda = Z_\lambda \phi$ with a fixed $\phi \in C^\infty(\Sigma)^*$. (We then have only one Wightman function $W_{11} = W$ at finite scales.)
(ii) The factor $|Z_\lambda|$ is strictly positive and monotonically decreasing as $\lambda \to 0$.
(iii) $|Z_\lambda|^2 W(\lambda x, \lambda x')$ converges to $W_0(x, x')$ in the topology of $S'$ as $\lambda \to 0$ (not only in the sense of means).
(iv) $W_0$ is not the zero distribution.
Then, for any $\mu \ge 1$, the function $\lambda \mapsto Z_{\lambda/\mu}/Z_\lambda$ is bounded; we set $h(\mu) := m(|Z_{\lambda/\mu}/Z_\lambda|^2)$. We know that, in the sense of distributions,
$$W_0(\mu x, \mu x') = m(|Z_\lambda|^2 W(\mu\lambda x, \mu\lambda x')) = m(|Z_{\lambda/\mu}|^2 W(\lambda x, \lambda x')) = h(\mu)\, W_0(x, x'). \tag{4.26}$$
Here we have used the invariance of the mean, and the fact that the mean “factorizes” since $|Z_\lambda|^2 W(\lambda x, \lambda x')$ is convergent by assumption. It is clear from the above that $h(1) = 1$ and $h(\mu\mu') = h(\mu) h(\mu')$. Also, $h$ is continuous since $\mu \mapsto W_0(\mu x, \mu x')$ is continuous in $S'$, and since $W_0 \ne 0$. As is well known, this implies $h(\mu) = \mu^{-a}$ for some $a \ge 0$. This reproduces a result from [F-Ha] in our context. Note that it is not implied that $|Z_\lambda|^2 = \lambda^a$; rather, $Z_\lambda$ might also differ from this by a slowly varying factor, such as $|Z_\lambda|^2 = \lambda^a (\log \lambda)^b$.
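The closing remark on slowly varying factors can be illustrated numerically. The sketch below takes the example $|Z_\lambda|^2 = \lambda^a |\log \lambda|^b$ for $0 < \lambda < 1$ and evaluates the ratio $|Z_{\lambda/\mu}/Z_\lambda|^2$ whose mean defines $h(\mu)$. To avoid floating-point underflow at very small scales it works with $t = -\log \lambda$; the function names and the sample values of $a$, $b$, $\mu$ are illustrative assumptions, not taken from the text.

```python
import math

def log_z_squared(t: float, a: float, b: float) -> float:
    """log |Z_lambda|^2 for |Z_lambda|^2 = lambda^a * |log lambda|^b, with t = -log(lambda) > 0."""
    return -a * t + b * math.log(t)

def ratio(t: float, mu: float, a: float, b: float) -> float:
    """|Z_{lambda/mu} / Z_lambda|^2 at lambda = exp(-t); note -log(lambda/mu) = t + log(mu)."""
    return math.exp(log_z_squared(t + math.log(mu), a, b) - log_z_squared(t, a, b))

a, b, mu = 1.5, 2.0, 3.0
for t in (10.0, 1e3, 1e6):
    # The last column is the pure power law mu^(-a) that the ratio approaches as t grows.
    print(t, ratio(t, mu, a, b), mu ** (-a))
```

In closed form the ratio equals $\mu^{-a} (1 + \log \mu / t)^b$, which tends to $\mu^{-a}$ as $\lambda \to 0$, although $|Z_\lambda|^2 / \lambda^a = |\log \lambda|^b$ itself is unbounded. The logarithmic factor therefore leaves no trace in $h(\mu)$.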
We have confined our attention here to the case of a product of two fields, in the sense of distributions. Many generalizations of this setting are certainly within reach. First, the analogue of Corollary 4.4 should hold for products of an arbitrary finite number of fields, and for their linear combinations. One can also allow more general short distance limits than a simple scaling of the test functions f d , at the price of increased technical effort. Moreover, like shown for the theory at fixed scales in [Bo3], it should be possible to obtain more detailed results on the OPE at spacelike distances, where the OPE coefficients exist as analytic functions rather than only as distributions. This would be particularly interesting for obtaining estimates on the two-point function and its limit, which might lead to a criterion for asymptotic freedom, since massless free theories can be characterized by estimates on their two-point function [P,D,Ba]. These extensions go beyond the scope of the current paper however; we hope to return to these questions elsewhere. 5. Conclusions The renormalization group methods in quantum field theory have found their main applications in the study of short distance properties of non-abelian gauge theories, such as quantum chromodynamics (QCD), which are expected to exhibit interesting features like confinement and asymptotic freedom. Since no rigorously constructed version of QCD is available to date, it is not clear if the results presented here are directly applicable to this case; but some heuristic comments are anyway in order. According to our results in Theorem 3.7, “no new fields appear in the scaling limit”. In view of the common expectations about the confining dynamics of QCD, it may seem that this would imply that our phase space conditions are not general enough to encompass such a theory. 
However, this conclusion is not justified, since we are restricting attention here to observable fields only, and do not directly deal with charged fields. This is also not necessary, since the gauge group and the field algebras⁷ of unobservable objects can be constructed from the algebras of observables by means of charge analysis [D-R]. More precisely, the following diagram of theories holds:

$A \to F$ (charge analysis)
$A \to A_0$ (scaling limit)
$A_0 \to F^{(0)}$ (charge analysis)
$F \to F_0$ (scaling limit)
$F^{(0)} \supseteq F_0$

Here $F$ is the field algebra at finite scales, in the sense of Doplicher-Roberts, $F_0$ is its scaling limit, and $F^{(0)}$ is the field algebra constructed from the scaling limit of observables, $A_0$. The inclusion in the bottom line of the diagram is a strict one in the case of confinement [Bu].⁸

⁷ In order to avoid confusion in terminology, let us note that the field algebras $F(O)$ in the sense of Doplicher-Roberts are supposed to contain charge-carrying objects, but still bounded operators associated with an open region $O$, not (unbounded) pointlike localized fields.
⁸ A general discussion of the relations between the superselection structures of $A$ and $A_0$ and the corresponding Doplicher-Roberts field nets, leading to an intrinsic notion of charge confinement, has been performed in [Bu,DMV,D-M], to which we refer the interested reader for further details.

In the case of non-abelian gauge theories, it is generally expected that in covariant gauges, charged fields of the theory at finite scales, such as the field strength tensor $F^a_{\mu\nu}(x)$ and the quark fields $Q^a(x)$ (here $a$ denotes the SU(3) color index), are non-local when restricted to the “physical” Hilbert space of the states satisfying the gauge condition (see e.g. [St]). In particular these fields are not observable, even if some of their functions, such as e.g. $F^a_{\mu\nu}(x) F_a^{\mu\nu}(x)$ in the sense of some normal product, may be. On the other hand, assuming that the traditional scenario of asymptotic freedom holds beyond perturbation theory, in the scaling limit we expect that the corresponding fields $F^a_{0,\mu\nu}(x)$, $Q^a_0(x)$ are free local massless fields. Still they are not observable, as they transform nontrivially under color SU(3), which is now a true (global) symmetry in the ultraviolet limit. Charged fields such as $F^a_{0,\mu\nu}(x)$ and $Q^a_0(x)$ can then be regarded as being associated with $F^{(0)}$, but not with $F_0$, as they should not be the limit of pointlike fields associated to $F$. It should also be kept in mind that it is in general necessary, in order to perform the superselection analysis in the scaling limit, to pass from $A_0$ to its dual net $A_0^d$, and it is thus possible that new fields appear there. In view of these facts, our result about the non-increase of the number of observable fields when passing to the scaling limit seems to be in line with the general picture of confinement in QCD.

The approach to the renormalization of quantum fields that we presented here is more general than the traditional one, including also cases where the scaling limit is not unique. Also, we have shown that the very existence of renormalization factors, which is an ansatz in the traditional approach, is actually a consequence of the general properties of quantum field theory. 
In our context it is always possible, independent of the model under consideration, to form finite linear combinations of the fields associated to the finite scale theory, with suitable, scale dependent coefficients, in such a way that the resulting field has a well-defined limit as a field associated to the scaling limit theory. It should be stressed that in our approach no requirement is made about the convergence of n-point functions at small scales, so that our results are applicable also to the situations in which no proper fixed points of the renormalization group exist. Such theories are those with a “degenerate scaling limit” according to the classification of [B-V1]. The short distance behavior of these theories is described by a whole family of scaling limits, distinguished by the choice of a mean along which the limit is performed. We also have considered the behavior of operator product expansions under scaling. We have shown that it is possible to obtain a uniform expansion at all scales, which converges to the expansion of the product of the limit fields, and whose coefficients satisfy, at least in special cases, the scaling law which is customary in perturbation theory, cf. Eq. (4.20). Our analysis uses as a basic input the Phase Space Condition II stated in Sect. 3, allowing the identification of the pointlike fields associated to the given net of local algebras. The validity of such condition has been verified in models with a finite number of free fields, massive or massless, in s ≥ 3 spatial dimensions [Bo1]. In view of these facts, it seems reasonable to expect that this criterion is verified also in more general field theoretical models, possibly interacting, in particular in asymptotically free theories, since their short-distance behavior should not differ significantly from that in free models. 
In order to find a counterpart of the action of the Gell-Mann and Low renormalization group in our framework, we had to generalize the notion of scaling limit given in [B-V1], introducing a class of dilation invariant but not pure scaling limit states. The study of the structure of the scaling limit theory corresponding to such states is in progress [BDM]. As mentioned in Sect. 2, it is possible to show that, for the theory of a massive free scalar field, such scaling limit theory is a tensor product of the algebras of the massless free field with a model-independent abelian factor, which corresponds to the restriction of the scaling limit representation to the center of the scaling algebra. In general, scaling limits
with the described tensor product structure fall into the class of theories with unique vacuum structure, in the terminology of [B-V1]. The precise conditions under which such a tensor product structure occurs are currently under investigation. However, there are certainly other models which do not exhibit a simple tensor product structure in the limit. In this context, it seems interesting to investigate the model proposed in [B-V2], i.e. an infinite tensor product of free fields with masses 2n m, n ∈ Z. Although this model obviously violates our phase space conditions, so that the analysis of the scaling behavior of its field content is out of reach with the methods employed here, its scaling limit can nevertheless be considered from the algebraic point of view, and it could give an interesting example of a dilation covariant theory which contains components of massive free fields. Finally, also in view of the discussion above about possible applications of the present analysis to physically interesting models, it would be worthwhile to extend the results presented here to treat the renormalization of charge carrying pointlike fields, associated with the Doplicher-Roberts field net. A. The Normal Part of a Functional For our investigation, we need the concept of the normal part of a functional ρ in the dual of some C ∗ algebra A. That is, we want to extract from ρ that part which is in the folium of some given state ω. The techniques used in this context can be found in [K-R, Ch. 10.1]; we will however repeat some part of the construction here, since we need some properties specific to our setup. Theorem A.1. Let A be a C ∗ algebra, ω a state on A, and πω the associated GNS representation. Denote by Σω the space of ultraweakly continuous functionals on πω (A) . There exists a linear map Nω : A∗ → Σω with the following properties.9 (i) Nω ◦ πω∗ Σω = idΣω . (ii) If ρ ∈ A∗ , ρ ≥ 0, then Nω (ρ) ≥ 0 and ρ ≥ πω∗ Nω (ρ). (iii) Nω = 1. 
$N_\omega$ is uniquely determined by properties (i) and (ii). We will call $N_\omega(\rho)$ the normal part of $\rho$ with respect to $\omega$. It depends on the algebra $A$ and is usually not compatible with the restriction to subalgebras. Therefore, we will label the normal part also as $N[A, \omega](\rho)$, where the reference to the base algebra $A$ is particularly important.

Proof. We first show uniqueness. Let $N_\omega$, $\tilde N_\omega$ be two maps which fulfill (i) and (ii). Set $Q = \pi_\omega^* N_\omega$, $\tilde Q = \pi_\omega^* \tilde N_\omega$. From (i), one easily sees that $\tilde Q Q = Q$. Now let $\rho \in A^*$, $\rho \ge 0$. Due to (ii), one has $\rho \ge Q\rho$, thus $\rho - Q\rho \ge 0$. Applying $\tilde Q$ to this positive functional, and observing that $\tilde Q$ preserves positivity due to (ii), we have
$$\tilde Q(\rho - Q\rho) \ge 0 \;\Rightarrow\; \tilde Q\rho - \tilde Q Q\rho \ge 0 \;\Rightarrow\; \tilde Q\rho \ge Q\rho. \tag{A.1}$$
By symmetry, however, we likewise obtain $Q\rho \ge \tilde Q\rho$, thus $Q\rho = \tilde Q\rho$. Since $\pi_\omega^*$ is clearly injective, it follows that $N_\omega \rho = \tilde N_\omega \rho$ for all positive $\rho$. By taking linear combinations, the same holds for all $\rho$.

9 With $\pi$ being a representation of a $C^*$-algebra, we denote by $\pi^*$ the “pullback” action on the dual spaces: $\pi^* : \pi(A)^* \to A^*$, $\rho \mapsto \rho \circ \pi$.
Now for existence: Let $\pi_u : A \to B(H_u)$ be the universal representation of $A$. For each $\rho \in A^*$, there is a unique ultraweakly continuous $\rho_u \in (\pi_u(A)'')_*$ such that $\rho_u \circ \pi_u = \rho$ [K-R, p. 721]. Here the map $\rho \mapsto \rho_u$ is linear, isometric, and preserves positivity. Now according to [K-R, Theorem 10.1.12], there exists an orthogonal projection $P$ in the center of $\pi_u(A)''$ and an ultraweakly bi-continuous isomorphism $\alpha : P\pi_u(A)'' \to \pi_\omega(A)''$ such that $\pi_\omega = \alpha \circ \mu_P \circ \pi_u$, where $\mu_P : \pi_u(A)'' \to \pi_u(A)''$ is the multiplication with $P$. We set
$$N_\omega(\rho) := \rho_u \circ \alpha^{-1}. \tag{A.2}$$
This expression is ultraweakly continuous on $\pi_\omega(A)''$; hence $N_\omega(\rho) \in \Sigma_\omega$. Note that $\alpha^{-1}$ is norm preserving as an isomorphism, and $\|\rho_u\| = \|\rho\|$; thus $\|N_\omega\| \le 1$. In fact, evaluating $N_\omega$ on $\omega$ yields $\|N_\omega\| = 1$, so we obtain (iii). For $\sigma \in \Sigma_\omega$, and $\rho := \sigma \circ \pi_\omega$, one has
$$\rho_u \circ \pi_u = \sigma \circ \pi_\omega = \sigma \circ \alpha \circ \mu_P \circ \pi_u; \tag{A.3}$$
thus $\rho_u = \sigma \circ \alpha \circ \mu_P$. It follows that
$$N_\omega(\sigma \circ \pi_\omega) = \rho_u \circ \alpha^{-1} = \sigma \circ \alpha \circ \mu_P \circ \alpha^{-1} = \sigma, \tag{A.4}$$
for $\mu_P$ acts as identity on the image of $\alpha^{-1}$. This proves (i). Now let $\rho \in A^*$, $\rho \ge 0$, thus also $\rho_u \ge 0$. Since the isomorphism $\alpha^{-1}$ preserves positivity, one has $N_\omega(\rho) \ge 0$. Further, note that
$$\pi_\omega^* N_\omega(\rho) = \rho_u \circ \alpha^{-1} \circ \pi_\omega = \rho_u \circ \mu_P \circ \pi_u. \tag{A.5}$$
It follows that
$$\rho - \pi_\omega^* N_\omega(\rho) = \rho_u \circ \pi_u - \rho_u \circ \mu_P \circ \pi_u = \rho_u \circ (1 - \mu_P) \circ \pi_u. \tag{A.6}$$
Using $P^* = P$, it is easy to see that $(1 - \mu_P)$ preserves positivity, and so does $\pi_u$ as a representation. Thus $\rho - \pi_\omega^* N_\omega(\rho) \ge 0$. This proves (ii).

Acknowledgements. The authors would like to thank K. Fredenhagen, G. Morchio and F. Strocchi for discussions on the subject, and D. Buchholz for information on his previous work on the topic. A substantial part of this work was done during stays of the authors at the Erwin Schrödinger Institute, Vienna. H.B. wishes to thank the Universities of Rome “La Sapienza” and “Tor Vergata” for their hospitality. He would also like to thank the University of Florida for an invitation.
References

[Ba] Baumann, K.: On the two-point functions of interacting Wightman fields. J. Math. Phys. 27, 828–831 (1986)
[B+] Bernard, C., Duncan, A., LoSecco, J., Weinberg, S.: Exact spectral-function sum rules. Phys. Rev. D12, 792–804 (1975)
[BDM] Bostelmann, H., D’Antoni, C., Morsella, G.: Work in progress
[Boe] Boerner, H.: Representations of Groups. Amsterdam: North-Holland, 1963
[Bo1] Bostelmann, H.: Lokale Algebren und Operatorprodukte am Punkt. Thesis, Universität Göttingen (2000). Available online at http://webdoc.sub.gwdg.de/diss/2000/bostelmann/
[Bo2] Bostelmann, H.: Phase space properties and the short distance structure in quantum field theory. J. Math. Phys. 46, 052301 (2005)
[Bo3] Bostelmann, H.: Operator product expansions as a consequence of phase space properties. J. Math. Phys. 46, 082304 (2005)
[Bu] Buchholz, D.: Quarks, gluons, colour: facts or fiction?. Nucl. Phys. B 469, 333 (1996)
[B-V1] Buchholz, D., Verch, R.: Scaling algebras and renormalization group in algebraic quantum field theory. Rev. Math. Phys. 7, 1195–1239 (1995)
[B-V2] Buchholz, D., Verch, R.: Scaling algebras and renormalization group in algebraic quantum field theory. II. Instructive examples. Rev. Math. Phys. 10, 775–800 (1998)
[D] Dell’Antonio, G.F.: On dilation invariance and the Wilson expansion. Nuovo Cimento 12A, 756–762 (1972)
[D-M] D’Antoni, C., Morsella, G.: Scaling algebras and superselection sectors: study of a class of models. Rev. Math. Phys. 18, 565 (2006)
[DMV] D’Antoni, C., Morsella, G., Verch, R.: Scaling algebras for charged fields and short-distance analysis for localizable and topological charges. Ann. Henri Poincare 5, 809–870 (2004)
[Do] Douglas, R.G.: On the measure-theoretic character of an invariant mean. Proc. Amer. Math. Soc. 16, 30–36 (1965)
[D-R] Doplicher, S., Roberts, J.E.: Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 51–107 (1990)
[Du-S] Dunford, N., Schwartz, J.T.: Linear Operators, Part I: General theory. New York: Interscience, 1958
[D-Su] Driessler, W., Summers, S.J.: Central decomposition of Poincaré-invariant nets of local field algebras and absence of spontaneous breaking of the Lorentz group. Ann. Inst. H. Poincaré Phys. Theor. 43, 147–166 (1985)
[Dy] Dybalski, W.: A sharpened nuclearity condition and the uniqueness of the vacuum in QFT. To appear in Commun. Math. Phys. 2008, doi:10.1007/s00220-008-0514-5
[F-He] Fredenhagen, K., Hertel, J.: Local algebras of observables and pointlike localized fields. Commun. Math. Phys. 80, 555–561 (1981)
[F-Ha] Fredenhagen, K., Haag, R.: Generally covariant quantum field theory and scaling limits. Commun. Math. Phys. 108, 91–115 (1987)
[FRS] Fredenhagen, K., Rehren, K.-H., Seiler, E.: Quantum field theory: where we are. Lect. Notes Phys. 721, 61–87 (2007)
[G-L] Gell-Mann, M., Low, F.E.: Quantum electrodynamics at small distances. Phys. Rev. 95, 1300–1312 (1954)
[Ha] Haag, R.: Local Quantum Physics. 2nd edition, Berlin: Springer, 1996
[Ho] Hollands, S.: The operator product expansion for perturbative quantum field theory in curved spacetime. Commun. Math. Phys 273, 1–36 (2007)
[K-R] Kadison, R.V., Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras, Volume II: Advanced Theory. Orlando: Academic Press, 1997
[L] Lewis, D.R.: An upper bound for the projection constant. Proc. Amer. Math. Soc. 103, 1157–1160 (1988)
[M] Mitchell, T.: Fixed points and multiplicative left invariant means. Trans. Amer. Math. Soc. 122, 195–202 (1966)
[P] Pohlmeyer, K.: The Jost-Schroer theorem for zero-mass fields. Commun. Math. Phys. 12, 204–211 (1969)
[S] Schaflitzel, R.: Direct integrals of unitary equivalent representations of nonseparable C*-algebras. J. Funct. Anal. 111, 62–75 (1963)
[Sa] Saks, S.: Theory of the Integral. Warsaw: Lwau/New York: Stechert, 1937
[St] Strocchi, F.: Selected Topics on the General Properties of Quantum Field Theory. Singapore: World Scientific, 1993
[W] Wilson, K.G.: Non-lagrangian models of current algebra. Phys. Rev. 179, 1499–1512 (1969)
[W-Z] Wilson, K.G., Zimmermann, W.: Operator product expansions and composite field operators in the general framework of quantum field theory. Commun. Math. Phys. 24, 87–106 (1972)
[Ws] Wils, W.: Direct integrals of Hilbert spaces I. Math. Scand. 26, 73–88 (1970)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 799–824 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0674-3
Communications in
Mathematical Physics
Schramm–Loewner Equations Driven by Symmetric Stable Processes Zhen-Qing Chen , Steffen Rohde Department of Mathematics, University of Washington, Seattle, WA 98195, USA. E-mail:
[email protected];
[email protected] Received: 13 August 2007 / Accepted: 29 August 2008 Published online: 19 November 2008 – © Springer-Verlag 2008
Abstract: We consider shape, size and regularity of the hulls $K_t$ of the chordal Schramm–Loewner evolution driven by a symmetric α-stable process. We obtain derivative estimates, show that the domains $\mathbb{H}\setminus K_t$ are Hölder domains, prove that $K_t$ has Hausdorff dimension 1, and show that the trace is right-continuous with left limits almost surely.

1. Introduction and Results

The Loewner differential equation (LE for short)
$$\partial_t g_t(z) = \frac{2}{g_t(z) - W_t}, \qquad g_0(z) = z \tag{1.1}$$
takes as input a real-valued function $W_t$ ($t \ge 0$) and produces an increasing family of sets $(K_t)_{t\ge 0}$ such that $g_t$ is the (suitably normalized) conformal map from $\mathbb{H}\setminus K_t$ onto the upper halfplane $\mathbb{H}$. See Sect. 3. The Schramm–Loewner Evolution $SLE_\kappa$ is the random process $K_t$ (or $g_t$) when $W_t = B_{\kappa t}$, where $B_t$ is Brownian motion. See [18] and the references therein. The spectacular success of $SLE_\kappa$ in describing scaling limits of lattice models and in resolving numerous questions from probability and mathematical physics motivates the study of the Loewner equation driven by other stochastic processes. Roughly speaking, if the driving function is sufficiently continuous, then LE produces a continuous curve $\gamma(t) \in \mathbb{H}$ defined by $g_t(\gamma(t)) = W_t$. This so-called trace generates the hull in the sense that $K_t = \gamma[0, t]$ (if $\gamma$ is not a simple curve, one has to add the filled-in loops). If $W$ has a discontinuity at time $t$, then $\gamma$ has a discontinuity too and the trace grows a “branch”. In fact, if $W$ is piecewise constant, then $K$ is a union of analytic curves (and the $n$-th of these curves is a geodesic for the hyperbolic metric in the half plane minus the previous $n - 1$ curves). Thus tree-like sets $K$ can be described by LE with a discontinuous driving term.

Research supported in part by NSF Grant DMS-0600206.
Research supported in part by NSF Grants DMS-0501726 and DMS-0244408.

In the mathematical physics literature, the LE driven by the symmetric α-stable process $S_t$ (plus Brownian motion) first appeared in [14]. A mathematically rigorous treatment of some elementary properties is in [8]. Another motivation for studying random families of conformal maps comes from a circle of problems known in the complex analysis literature as Brennan’s conjecture, see [2] or [13]. The problem is to maximize
$$\beta_f(p) = \limsup_{r\to 1} \frac{\log \int_0^{2\pi} |f'(re^{it})|^p\,dt}{|\log(1 - r)|} \tag{1.2}$$
over all bounded conformal maps $f$ of the unit disc. While it is conjectured that $\beta(p) := \sup_f \beta_f(p) = p^2/4$ for $-2 \le p \le 2$, there is no proof of either $\beta(p) \le p^2/4$ or $\beta(p) \ge p^2/4$, for any nontrivial value of $p$. The lower bound just requires one example $f$, but there are no candidates for extreme domains. From work of Carleson, Jones, Makarov and others it is known that extremals can be found amongst domains with selfsimilar boundary, and that extremal boundaries can be approximated by “dendrites”. Whereas it is difficult to compute the above integral means for individual functions $f$, it could be easier to estimate the expected value
$$E\left[\int_0^{2\pi} |f'(re^{it})|^p\,dt\right]$$
because in a rotationally invariant family this amounts to computing $E[|f'(r)|^p]$. The computations in [17] showed that Brownian SLE does not produce examples close to extremal. At the 2001/02 Mittag-Leffler program “Probability and Conformal Mappings”, Nikolai Makarov and the second author tried to find stochastic processes that produced large integral means, and recognized that it would be interesting to study LE driven by the symmetric stable processes. The second author would like to thank Nick for these stimulating conversations. In 2003, Daniel Meyer (then graduate student at University of Washington, Seattle) performed computer experiments that suggested a nontrivial and perhaps even close to extremal integral means spectrum for the stable LE.

In this paper, we will consider LE driven by the symmetric α-stable process $W_t = S_t$, see Sect. 2 for the definition of symmetric stable processes and some of the basic properties. As $W_t$ satisfies a scaling relation different from Brownian scaling, stable LE does not exhibit scale invariance, see [14, Sect. 2.3]. Thus it is no surprise that rescaling the hulls by capacity leads to deterministic sets. Indeed, we show in Sect. 3 that for $0 < \alpha < 2$, as $s \to 0$, the rescaled hulls $\frac{1}{s} K_{s^2}$ converge to the vertical line segment $[0, 2i]$ (in the Hausdorff metric) in probability. On the other hand, for all $\varepsilon > 0$,
$$\lim_{s\to\infty} P\left[\tfrac{1}{s} K_{s^2} \cap \{y > \varepsilon\} \ne \emptyset\right] = 0.$$
We will then consider continuity and metric properties of the hulls by analyzing the backward flow
$$\partial_t f_t(z) = -\frac{2}{f_t(z) - W_t}, \qquad f_0(z) = z. \tag{1.3}$$
For each fixed $t > 0$, this random conformal map $f_t(z)$ of $\mathbb{H}$ has the same distribution as $g_t^{-1}(z - W_t) + W_t$ and thus $K_t$ has the same distribution as $\mathbb{H}\setminus f_t(\mathbb{H}) - W_t$. However as a family of maps, $\{f_t(\cdot), t \ge 0\}$ does not have the same distribution as $\{g_t^{-1}(\cdot), t \ge 0\}$ (see the discussion at the beginning of Sect. 4). Write
$$f_t(z) - W_t = X_t + iY_t, \qquad t \ge 0.$$
It is easy to see that $Y_t$ is increasing in $t \ge 0$. We prove in Sect. 4 that for $z = x + iy \in \mathbb{H}$ with $y < 1$, $Y$ reaches height 1 almost surely when $\alpha \in [1, 2)$, whereas $Y$ fails to reach height 1 with positive probability when $\alpha \in (0, 1)$. Below are some computer simulations for SLE driven by Cauchy ($\alpha = 1$) stable processes, with $t = 0.1$, $1$, $10$ and $t = 100$ respectively.
Fig. 1.1. (α = 1 and t = 0.1)
Fig. 1.2. (α = 1 and t = 1)
Fig. 1.3. (α = 1 and t = 10)
Fig. 1.4. (α = 1 and t = 100)
As in the study of SLE in [17], a key role in understanding $K_t$ is therefore played by the derivative expectation $E[|f_t'(z)|^p]$. However in contrast with Brownian motion, the infinitesimal generator of the symmetric α-stable process $S$ on $\mathbb{R}$ is the fractional Laplacian $\Delta^{\alpha/2}$, which is not very amenable to calculations. Many nice smooth functions such as polynomials of order 2 and beyond are not in its domain. For this technical reason, we use the truncated symmetric standard α-stable process $\widehat S$ instead, which is the symmetric α-stable process $S$ with jumps of size larger than 1 removed. Any $C^2$-smooth function on $\mathbb{R}$ is in the domain of the infinitesimal generator of $\widehat S$. Note that for the symmetric α-stable process $S$, jumps of size larger than 1 arrive according to a Poisson process. So there are only a finite number of jumps of size larger than 1 in any given time interval. For any $\kappa > 0$, information on SLE driven by $S = \{S_t, t \ge 0\}$ can be easily deduced from SLE driven by $\{S_{\kappa t}, t \ge 0\}$ (see Lemma 3.1 below), which in turn can be recovered from SLE driven by $\{\widehat S_{\kappa t}, t \ge 0\}$ (see Lemma 5.3 below).

Our main estimate here is Theorem 4.4. For $\kappa > 0$, let $W_t = S_{\kappa t}$ and write
$$f_t(z) - W_t = X_t + iY_t, \qquad t \ge 0.$$
After a time change $\gamma_u := \inf\{t \ge 0 : Y_t \ge Y_0 e^u\}$ and $\tilde f_u(z) := f_{\gamma_u}(z)$ we show in Sect. 4 that for every $0 < p < 2$ and $\delta > 0$ there is $\kappa > 0$ such that for $W_t = \widehat S_{\kappa t}$ and every $0 < y < 1$,
$$E\left[|\tilde f'_{-\log y}(z)|^p ;\ \gamma_{-\log y} < \infty\right] \le C_{p,\delta}\, y^{-\delta}. \tag{1.4}$$
This is strong evidence for trivial integral means,
$$\beta(p) = 0 \quad \text{a.s.}, \tag{1.5}$$
for all $0 < p < 2$ and all $\kappa > 0$, and also for the (non-truncated) stable process. However, without additional work, (1.4) only yields (1.5) for the “bulk” of the stable SLE (precisely, the conformal maps $f_t$ restricted to those points $z$ for which $f_t(z)$ reaches a definite height $h > 0$, cf. Theorem 6.2). Also, our estimates say nothing about $\beta(p)$
for negative values of $p$. Finally, it is possible that the stable SLE is (at least close to) extremal at some “intermediate scale”, in the sense that the quotient in (1.2) could be close to extremal at some value $r < 1$. This would be consistent with the aforementioned computer experiments, and then the trivial spectrum may be another manifestation of the trivial scaling limits in Proposition 3.2. We hope to return to some of these questions in a future publication.

We apply the above derivative estimates to prove in Sect. 5 that for every $T > 0$, the maps of the backward flow $f_t(z)$ of (1.3) driven by $W_t = S_{\kappa t}$ with small $\kappa$ are uniformly γ-Hölder continuous on every bounded set $A \subset \mathbb{H}$ for $t \in [0, T]$ with γ close to 1/6. The Hölder exponents are certainly not optimal (we believe that the correct exponent is 1/2 for all $\alpha \in (0, 2)$). Nevertheless, this establishes enough regularity to prove that the box counting (and hence the Hausdorff) dimension of the hull $K_t$ (of SLE (1.1) driven either by $W_t = S_{\kappa t}$ or by $W_t = \widehat S_{\kappa t}$ for every $\kappa > 0$) is 1 a.s. It also implies that the backward flow $f_t$ of (1.3) driven either by $W_t = S_{\kappa t}$ or by $W_t = \widehat S_{\kappa t}$ for every $\kappa > 0$ is locally uniformly Hölder continuous in $\mathbb{H}$ a.s. In particular, this implies that for each $t > 0$, the domain $\mathbb{H}\setminus K_t$ is a Hölder domain almost surely.

Finally, as another application of the Hölder continuity of the maps of the backward flow $f_t(z)$ of (1.3), we prove that the trace is right-continuous with left limits (RCLL for short): Let $\{g_t, t \ge 0\}$ be SLE (1.1) driven either by $W_t = S_{\kappa t}$ or by $W_t = \widehat S_{\kappa t}$. We show in Theorem 7.1 that for every $\alpha \in (0, 2)$ and $\kappa > 0$, almost surely, for each $t > 0$ the limit
$$\gamma(t) = \lim_{z \to W_t;\ z \in \mathbb{H}} g_t^{-1}(z)$$
exists, the function $t \mapsto \gamma(t)$ is RCLL, and $K_t = \gamma[0, t]$. This is achieved by first showing that with probability one, the maps $\{g_t^{-1}, 0 \le t \le T\}$ are equicontinuous on $\mathbb{H}$ for every $T > 0$.

Independently from and parallel to this paper, Qing-Yang Guan [9] has recently investigated the continuity properties of the trace of the Loewner equation driven by $W_t = B_{\kappa t} + S_{\theta t}$ for $\kappa \ge 0$, $\theta \ge 0$, and $S$ the symmetric α-stable process with $0 < \alpha < 2$ (he informed us that the assumption $\kappa > 0$ in his manuscript is not needed). Thus the main result of [9] contains our Theorem 7.1 as the special case $\kappa = 0$. Whereas his proof of the RCLL property is an adaptation of the continuity proof from [17], we employ a different simpler method that takes advantage of the tree structure of the hulls (which works only for $\kappa \le 4$), and is of independent interest.

2. Definition and Basic Properties of Symmetric α-Stable Process

A random variable $X$ is symmetric α-stable if its characteristic function is
$$E[e^{i\theta X}] = e^{-c|\theta|^\alpha}. \tag{2.1}$$
For $\alpha = 2$, this is the normal distribution. It is nontrivial, but not hard, to show that such $X$ exists if and only if $0 < \alpha \le 2$ (see for instance [6, Sect. 6.5]). Write $X \sim S(\alpha, c)$ where α is called the index and $c^{1/\alpha}$ is the scale. If $X_i \sim S(\alpha, c_i)$ are independent, then (2.1) immediately gives
$$X_1 + X_2 \sim S(\alpha, c_1 + c_2) \quad\text{and}\quad aX \sim S(\alpha, c|a|^\alpha).$$
The symmetric α-stable process $S = \{S_t, t \ge 0\}$ (or α-stable Lévy motion) is a Lévy process (meaning $S$ is right continuous with left limits, and has stationary independent increments) whose increments $S_t - S_r$ are distributed according to the α-stable law: $S_t - S_r \sim S(\alpha, t - r)$ for $0 \le r \le t$. Notice that the stable process is self-similar: for every $c > 0$,
$$\{S_{ct} - S_0 ;\ t \ge 0\} \doteq \{c^{1/\alpha}(S_t - S_0) ;\ t \ge 0\}, \tag{2.2}$$
where $\doteq$ denotes equality in distribution. This is the analog of the classical Brownian scaling for Brownian motion. The transition density function can be obtained from the characteristic function by the inverse Fourier transform:
$$p(t, x, y) = p(t, x - y) = P_x(S_t \in [y, y + dy])/dy = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i(x-y)\theta}\, e^{-t|\theta|^\alpha}\,d\theta.$$
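The definition (2.1) and the scaling rule for $S(\alpha, c)$ can be checked empirically. The following is a minimal sketch, not part of the paper: it assumes the Chambers–Mallows–Stuck representation of a symmetric stable variable, and the names `sample_symmetric_stable` and `empirical_cf` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def sample_symmetric_stable(alpha, c=1.0, rng=random):
    """Draw X ~ S(alpha, c), i.e. E[exp(i*theta*X)] = exp(-c*|theta|**alpha).

    Assumption: Chambers-Mallows-Stuck formula for the symmetric case,
    valid for 0 < alpha <= 2.
    """
    u = (rng.random() - 0.5) * math.pi      # uniform on [-pi/2, pi/2)
    e = rng.expovariate(1.0)                # standard exponential
    if alpha == 1.0:
        x = math.tan(u)                     # Cauchy case
    else:
        x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
             * (math.cos((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))
    # Multiplying by the scale c**(1/alpha) turns S(alpha, 1) into S(alpha, c).
    return c ** (1.0 / alpha) * x


def empirical_cf(samples, theta):
    """Monte Carlo estimate of E[cos(theta*X)], the (real) characteristic function."""
    return sum(math.cos(theta * x) for x in samples) / len(samples)
```

By (2.2), a sample of $S_{ct} - S_0$ can be produced either as `sample_symmetric_stable(alpha, c * t)` or as `c ** (1 / alpha)` times a sample with parameter `t`; both have characteristic function $e^{-ct|\theta|^\alpha}$, which `empirical_cf` approximates up to Monte Carlo error.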
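Explicit formulas for $p$ exist only in a few special cases (for $\alpha = 2$ we have the normal distribution $p(t, x) = \exp(-x^2/4t)/\sqrt{4\pi t}$, which has variance $2t$ under the normalization (2.1), and for $\alpha = 1$ the Cauchy distribution $p(t, x) = \frac{t}{\pi(t^2 + x^2)}$). However we have the following estimate (see, for example, [4]):
$$p(t, x, y) \asymp t^{-\frac{1}{\alpha}} \wedge \frac{t}{|x - y|^{1+\alpha}} = t^{-\frac{1}{\alpha}}\left(1 \wedge \frac{t^{\frac{1}{\alpha}}}{|x - y|}\right)^{1+\alpha}, \qquad t > 0,\ x, y \in \mathbb{R}. \tag{2.3}$$
Here $\asymp$ means that the ratio of the two sides is bounded between two positive constants, and for $a, b \in \mathbb{R}$, $a \wedge b := \min\{a, b\}$ and $a \vee b := \max\{a, b\}$. Therefore
$$P(|S_t - S_0| \ge x) \asymp 1 \wedge \frac{t}{|x|^\alpha}. \tag{2.4}$$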
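For $\alpha = 1$ the Fourier inversion formula and the two-sided estimate (2.3) can both be checked against the explicit Cauchy density. The sketch below is illustrative only: `stable_density` uses a plain trapezoidal rule with a hypothetical cut-off `theta_max`, which must be large enough that $e^{-t\,\theta_{\max}^{\alpha}}$ is negligible (so it is not suitable for very small $t$ as written).

```python
import math


def stable_density(t, x, alpha, theta_max=60.0, n=60000):
    """Approximate p(t, x) = (1/pi) * int_0^inf cos(x*theta) * exp(-t*theta**alpha) dtheta.

    Trapezoidal rule on [0, theta_max]; theta_max and n are ad hoc choices.
    """
    h = theta_max / n
    total = 0.5  # half weight at theta = 0, where the integrand equals 1
    for k in range(1, n):
        th = k * h
        total += math.cos(x * th) * math.exp(-t * th ** alpha)
    total += 0.5 * math.cos(x * theta_max) * math.exp(-t * theta_max ** alpha)
    return total * h / math.pi


def cauchy_density(t, x):
    """Explicit transition density for alpha = 1."""
    return t / (math.pi * (t * t + x * x))


def scale_bound(t, x, alpha):
    """Right-hand side of (2.3): t**(-1/alpha) ^ t/|x|**(1+alpha)."""
    if x == 0:
        return t ** (-1.0 / alpha)
    return min(t ** (-1.0 / alpha), t / abs(x) ** (1.0 + alpha))
```

For $\alpha = 1$ one can verify by hand that `cauchy_density(t, x) / scale_bound(t, x, 1.0)` always lies between $1/(2\pi)$ and $1/\pi$, which is exactly the kind of two-sided comparison that (2.3) asserts.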
We thus see that for $\alpha < 2$, $S_t$ has infinite variance, and for $\alpha \le 1$, $S_t$ is not even integrable.

Lemma 2.1. For $t > 0$ and $x > 0$,
$$P\left(\max_{0 \le r \le t} S_r > x\right) \le 2P(S_t > x) \asymp 1 \wedge \frac{t}{|x|^\alpha}.$$

Proof. Let $T = \inf\{s : S_s > x\}$. Then
$$P(T \le t) = P(T \le t, S_t \le S_T) + P(T \le t, S_t > S_T) \le 2P(T \le t, S_t \ge S_T) \le 2P(S_t > x).$$

For a Borel measurable function $f$ on $\mathbb{R}$, we define the fractional Laplacian $\Delta^{\alpha/2} f = -(-\Delta)^{\alpha/2} f$ at $x \in \mathbb{R}$ as follows:
$$\Delta^{\alpha/2} f(x) := c_\alpha \lim_{\varepsilon \downarrow 0} \int_{\{|h| > \varepsilon\}} \frac{f(x + h) - f(x)}{|h|^{1+\alpha}}\,dh,$$
whenever the limit exists. It is easy to see that for every $f \in C_b^2(\mathbb{R})$ and every $\delta > 0$,
$$\Delta^{\alpha/2} f(x) = c_\alpha \int_{\mathbb{R}} \frac{f(x + h) - f(x) - f'(x)\,h\,1_{\{|h| \le \delta\}}}{|h|^{1+\alpha}}\,dh,$$
which is well-defined and is in fact a bounded continuous function in $x$. By Ito’s formula
$$t \mapsto f(S_t) - \int_0^t \Delta^{\alpha/2} f(S_r)\,dr \tag{2.5}$$
is a martingale for every $f \in C_b^2(\mathbb{R})$ (see, e.g., the proof of Proposition 4.1 in [1] for details). Let $A$ be the Feller generator (that is, the infinitesimal generator in the space $C_b(\mathbb{R})$ of bounded continuous functions equipped with the supremum norm $\|\cdot\|_\infty$) of the symmetric α-stable process $S$. Then the above implies that for $f \in C_b^2(\mathbb{R})$,
$$A f(x) = \lim_{t \to 0} \frac{E[f(x + S_t)] - f(x)}{t} = \Delta^{\alpha/2} f(x). \tag{2.6}$$
That is, $C_b^2(\mathbb{R}) \subset D(A)$ and for $f \in C_b^2(\mathbb{R})$, $A f = \Delta^{\alpha/2} f$.

Let $D \subset \mathbb{R}$ be a domain and let $f$ be defined on all of $\mathbb{R}$ and continuous in $D$. Then $f$ is harmonic in $D$ with respect to $S$ if $f$ has the mean value property
$$f(x) = E_x[f(S_{\tau_{B(x,r)}})]$$
for all balls $B(x, r)$ with closure in $D$, where $\tau_{B(x,r)} := \inf\{t \ge 0 : S_t \notin B(x, r)\}$. Then the ball can be replaced by any open $D_1$ with $\overline{D_1} \subset D$, see [5, Theorem 2.2]. The function
$$u(x) = \begin{cases} |x|^{\alpha - 1} & \text{if } \alpha \ne 1, \\ \log|x| & \text{if } \alpha = 1 \end{cases}$$
is harmonic in $\mathbb{R}\setminus\{0\}$ as is shown in [14] (for $\alpha \ne 1$ this follows from the harmonicity of the Kelvin transform $|x|^{\alpha-1} h(1/x)$ of the constant function $h \equiv 1$, cf. [11]). This can be used to obtain a quick proof of the recurrence resp. transience of the stable process for $\alpha > 1$ resp. $\alpha < 1$: From Ito’s formula, $\{u(S_t), t \in [0, T_0)\}$ is a non-negative local martingale, where $T_0 = \inf\{t \ge 0 : S_t = 0\}$, and by Fatou’s Lemma it is a supermartingale. For $0 \le r < S_0 = x < R$, we therefore have
$$u(x) \ge E_x u(S_{\tau_r \wedge \tau_R}) = E_x\left[u(S_{\tau_R}) 1_{\{\tau_R < \tau_r\}}\right] + E_x\left[u(S_{\tau_r}) 1_{\{\tau_r < \tau_R\}}\right]. \tag{2.7}$$
For $\alpha > 1$ we get
$$P_x(\tau_R < \tau_r) \le \frac{u(x)}{u(R)} \to 0 \quad \text{as } R \to \infty,$$
proving recurrence. Whereas for $\alpha < 1$ we have $P_x(\tau_r < \tau_R) \le \frac{u(x)}{u(r)}$ and so after letting $R \to \infty$, we get
$$P_x(\tau_r < \infty) \le \frac{u(x)}{u(r)} < 1,$$
giving the transience of $S$. Moreover, if we let $r \downarrow 0$ in the last formula, we have for $\alpha < 1$ and every $x \ne 0$,
$$P_x(\sigma_{\{0\}} < \infty) = 0.$$
Here $\sigma_{\{0\}} = \inf\{t > 0 : S_t = 0 \text{ or } S_{t-} = 0\}$. In other words, almost surely neither $S_t$ nor $S_{t-}$ will visit 0.
3. The Loewner Equation

3.1. Deterministic equation. Let $W_t$ be a real-valued function that is right continuous with left limits, RCLL for short. For each initial point $z \in \mathbb{C}\setminus\{0\}$, the Loewner differential equation
$$\partial_t g_t(z) = \frac{2}{g_t(z) - W_t}, \qquad g_0(z) = z \tag{3.1}$$
has a unique solution up to a time $0 < T_z \le \infty$, where $g_t(z) = W_{t-}$ or $g_t(z) = W_t$. More precisely, let
$$T_z = \sup\left\{t : \inf_{s \in [0,t]} |g_s(z) - W_s| > 0\right\},$$
then the initial value problem (3.1) has a unique solution on $[0, T_z)$ and if $T_z < \infty$ then either $\liminf_{t \to T_z-} |g_t(z) - W_t| = 0$, or $\liminf_{t \to T_z-} |g_t(z) - W_t| > 0$ and $g_{T_z}(z) = W_{T_z}$ (in this case, $W$ jumps at time $T_z$). The subset
$$K_t = \{z \in \overline{\mathbb{H}} : T_z \le t\}$$
is a compact subset of the closed upper half plane $\overline{\mathbb{H}}$ and is called the hull of LE (3.1). It is well-known that the map $z \mapsto g_t(z)$ is a conformal map (i.e. analytic and one-to-one) from $\mathbb{H}\setminus K_t$ onto $\mathbb{H}$, with Laurent series $g_t(z) = z + \frac{2t}{z} + O(\frac{1}{z^2})$ near $\infty$. From the uniqueness of normalized conformal maps it follows that $K_t \cap \mathbb{H} \ne \emptyset$ is strictly increasing in $t$. Writing $g_t(z) = x_t + i y_t$ and taking real and imaginary parts in (3.1), the Loewner equation reads
$$\partial_t x_t = 2\,\frac{x_t - W_t}{(x_t - W_t)^2 + y_t^2}, \qquad \partial_t y_t = -2\,\frac{y_t}{(x_t - W_t)^2 + y_t^2}. \tag{3.2}$$
It is easy to see that when $W \equiv 0$, $g_t(z) = \sqrt{z^2 + 4t}$ and $K_t = \gamma[0, t]$, where $\gamma(t) = 2i\sqrt{t}$.
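The explicit solution for $W \equiv 0$ suggests a simple way to integrate (3.1) for a piecewise-constant driving function: on each interval where $W$ equals a constant $w$, the quantity $(g_t - w)^2$ grows by $4\,dt$. The following is a minimal sketch under that assumption; `upper_sqrt`, `forward_step` and `forward_flow` are hypothetical names, and the scheme is only an illustration, not the authors' simulation code.

```python
import cmath


def upper_sqrt(v):
    """Square root of the complex number v chosen with non-negative imaginary part."""
    r = cmath.sqrt(v)
    return -r if r.imag < 0 else r


def forward_step(g, w, dt):
    """Exact solution of dg/dt = 2/(g - w) over a step dt with constant driver w."""
    h = g - w
    return w + upper_sqrt(h * h + 4.0 * dt)


def forward_flow(z, driver, dt):
    """Apply forward_step for each driver value in turn, starting from g_0(z) = z.

    A point that has been absorbed into the hull ends up on the real axis
    (imaginary part 0) in this sketch.
    """
    g = complex(z)
    for w in driver:
        g = forward_step(g, w, dt)
    return g
```

With `driver = [0.0] * n` the composition reproduces $g_t(z) = \sqrt{z^2 + 4t}$ at $t = n\,dt$ up to rounding, and a point $iy$ on the imaginary axis ends on the real axis exactly when $y \le 2\sqrt{t}$, in line with $\gamma(t) = 2i\sqrt{t}$.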
3.2. LE driven by stable processes. As $t \to \infty$, the diameter of the hulls $K_t$ tends to infinity. In fact,
$$\frac{1}{C}\sqrt{t} \le \operatorname{diam} K_t \le C\left(\sqrt{t} + \sup_{0 \le r \le s \le t} |W_r - W_s|\right) \tag{3.3}$$
for some universal $C$ and all $t$ (cf. [12, Lemma 4.13] for the upper bound and [12, (3.9)] for the lower bound). What do the hulls of LE driven by a stable process (that is, $W$ in (3.1) is a symmetric α-stable process) look like if we scale them back down as $t \to \infty$, or scale them up when $t \to 0$? We will see that neither the “conformally natural” nor the “metrically natural” way of rescaling the hulls leads to any interesting sets: If we scale them so as to have (halfplane-) capacity one or so that the diameter is one, then the hulls converge to a vertical line segment as $t \to 0$ and to the empty set as $t \to \infty$. To make this precise, let $c > 0$. The solution to (3.1) with
$$\tilde W_t = \frac{1}{c} W_{c^2 t} \tag{3.4}$$
is given by the function $\tilde g_t(z) = \frac{1}{c} g_{c^2 t}(cz)$. It follows that the hulls are related by
$$\tilde K_t = \frac{1}{c} K_{c^2 t}. \tag{3.5}$$
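The scaling relation between (3.4) and the rescaled map $\tilde g_t(z) = \frac{1}{c} g_{c^2 t}(cz)$ can be verified numerically for a piecewise-constant driver. The sketch below assumes the same exact one-step update as for a constant driving function; all function names are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def upper_sqrt(v):
    """Square root of v with non-negative imaginary part."""
    r = cmath.sqrt(v)
    return -r if r.imag < 0 else r


def forward_flow(z, driver, dt):
    """Solve dg/dt = 2/(g - W_t) for a driver that is constant on steps of length dt."""
    g = complex(z)
    for w in driver:
        h = g - w
        g = w + upper_sqrt(h * h + 4.0 * dt)
    return g


def both_sides_of_scaling(z, driver, dt, c):
    """Return (g~_T(z), g_{c^2 T}(c z) / c) for T = len(driver) * dt.

    `driver` holds the values of W on consecutive steps of length c*c*dt,
    so the rescaled driver W~_t = W_{c^2 t} / c is constant on steps of length dt.
    """
    rescaled_driver = [w / c for w in driver]
    left = forward_flow(z, rescaled_driver, dt)
    right = forward_flow(c * z, driver, c * c * dt) / c
    return left, right
```

The two returned values agree up to rounding because each exact step satisfies the same identity: dividing $w + \sqrt{(cz - w)^2 + 4c^2\,dt}$ by $c$ gives $w/c + \sqrt{(z - w/c)^2 + 4\,dt}$.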
If $W_t$ is a Brownian motion with variance $\kappa$, then $\frac{1}{c} W(c^2 t)$ has the same distribution, which translates to the important and useful scaling invariance of the SLE hulls. If $W_t$ is α-stable then $\frac{1}{c} W(c^2 t)$ is α-stable too but the scale is different: From (2.2) it follows that
$$\frac{1}{c} W_{c^2 t} \doteq c^{\frac{2}{\alpha} - 1} W_t. \tag{3.6}$$
Let $\{g_t(z), t \ge 0\}$ be the SLE driven by $W = S$, the symmetric standard α-stable process on $\mathbb{R}$, with hulls $\{K_t, t \ge 0\}$. For $c > 0$, define $\tilde g_t(z) := c^{-1} g_{c^2 t}(cz)$. Then
$$\partial_t \tilde g_t(z) = \frac{2}{\tilde g_t(z) - c^{-1} S_{c^2 t}} \qquad \text{with } \tilde g_0(z) = z.$$
So $\{\tilde g_t(z), t \ge 0\}$ is SLE with hulls $\{\tilde K_t = c^{-1} K_{c^2 t}, t \ge 0\}$, driven by the symmetric α-stable process $\{c^{-1} S_{c^2 t}, t \ge 0\} \doteq \{S_{c^{2-\alpha} t}, t \ge 0\}$ running at a different speed. We record this as a lemma for future reference.

Lemma 3.1. Let $\{K_t, t \ge 0\}$ be the hulls of SLE driven by $W = S$, the symmetric standard α-stable process. Then for every $c > 0$, $\{c^{-1} K_{c^2 t}, t \ge 0\}$ has the same distribution as the hulls of SLE driven by $\{W_t = S_{c^{2-\alpha} t}, t \ge 0\}$.

Hence the geometric information on hulls of SLE driven by $W_t = S_t$ and by $W_t = S_{\kappa t}$ can be deduced one from the other. From (3.6), it is not difficult to prove:

Proposition 3.2. Let $0 < \alpha < 2$ and $\{K_t, t \ge 0\}$ be the hulls of SLE driven by $W = S$. As $s \to 0$, the rescaled hulls $\frac{1}{s} K_{s^2}$ converge to the vertical line segment $[0, 2i]$ (in the Hausdorff metric) in probability. On the other hand, for all $\varepsilon > 0$,
$$\lim_{s \to \infty} P\left[\tfrac{1}{s} K_{s^2} \cap \{y > \varepsilon\} \ne \emptyset\right] = 0.$$
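The proof uses the following simple result for deterministic hulls:

Lemma 3.3. (a) If $W_t \in [a, b]$ for all $t \in [0, T]$, then $K_T \subset [a, b] \times \mathbb{R}$.
(b) Let $0 < \varepsilon < 1$ and $r > 1$. If $I \subset \mathbb{R}$ is an interval of length $\sqrt{T}$ and $10I$ the concentric interval of size $10\sqrt{T}$, and if
$$\int_0^T 1_{\{W_t \in 10I\}}\,dt \le \varepsilon T,$$
then
$$K_T \cap \left(I \times [4\sqrt{\varepsilon T}, \infty)\right) = \emptyset.$$

Proof. (a) If $z = x + iy \in \mathbb{H}$ with $x < a$ (resp. $x > b$), then $\partial_t x_t(z) < 0$ (resp. $> 0$) and hence $|g_t(z) - W_t|$ is bounded from below by $|x - a|$ (resp. $|x - b|$).
(b) By means of Brownian scaling (3.4) and (3.5), we may assume $T = 1$ and $I = [-1/2, 1/2]$. Fix $z_0 \in I \times [4\sqrt{\varepsilon}, \infty)$ and write $g_t(z_0) = x_t + i y_t$. We may assume $y_0 < 2$, else trivially $z_0 \notin K_1$ (only the hull of the constant function $W_t \equiv 0$ reaches height 2). Let $T_1 \le 1$ be the maximal time such that $x_t + i y_t \in [-2, 2] \times [2\sqrt{\varepsilon}, \infty)$ for all $t \in [0, T_1]$. We will show $T_1 = 1$ and hence $z_0 \notin K_1$, proving the lemma. Up to $T_1$, from (3.2) we have
$$|x_t - x_0| \le \int_0^{T_1} \frac{2|x_t - W_t|}{(x_t - W_t)^2 + y_t^2}\,dt \le \int_{\{W_t \in 10I\}} 2\,\frac{1}{2y_t}\,dt + \int_{\{W_t \notin 10I\}} 2\,\frac{1}{|x_t - W_t|}\,dt \le \frac{\varepsilon}{2\sqrt{\varepsilon}} + \frac{2}{3} < \frac{4}{3}. \tag{3.7}$$
Thus $x_t$ does not reach the boundary of $[-2, 2]$. Similarly,
$$|y_{T_1} - y_0| = \int_0^{T_1} \frac{2 y_t}{(x_t - W_t)^2 + y_t^2}\,dt \le \int_{\{W_t \in 10I\}} \frac{2}{y_t}\,dt + \int_{\{W_t \notin 10I\}} \frac{2 y_t}{(x_t - W_t)^2}\,dt \le \sqrt{\varepsilon} + \frac{2 y_0}{9} < \frac{y_0}{2}, \tag{3.8}$$
where we used $y_t \ge 2\sqrt{\varepsilon}$ on $[0, T_1]$, $y_t \le y_0$, $|x_t - W_t| \ge 3$ whenever $W_t \notin 10I$, and $\sqrt{\varepsilon} \le y_0/4$. Therefore $y_t$ does not reach $2\sqrt{\varepsilon}$. Hence $T_1 = 1$ and the lemma is proved.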
Proof of Proposition 3.2. From (3.6) we have that the rescaled hull has the same distribution as the time 1 hull of the map $t \mapsto s^{\frac{2}{\alpha} - 1} W_t$. By Lemma 2.1, the supremum of this function tends to zero in probability, and from Lemma 3.3 (a) it follows that the width of the hull tends to zero in probability. Since the halfplane-capacity is 1, the height has to converge to 2 and the hull converges to the segment $[0, 2i]$ as $s \to 0$. By Lemma 3.3 (b), the second claim is equivalent to saying that the maximal amount of time that $W_t$ spends in an interval $[x, x + \delta]$ tends to zero in probability as $\delta \to 0$.
3.3. Stable LE on R. When $z$ is a non-zero real number, $Z_t := g_t(z)$ of (3.1) is real-valued. We will call the real-valued equation
$$\partial_t Z_t = \frac{2}{Z_t - W_t} \tag{3.9}$$
the forward Loewner equation on $\mathbb{R}$ driven by $W_t$, and
$$\partial_t Z_t = -\frac{2}{Z_t - W_t} \tag{3.10}$$
the backward Loewner equation on $\mathbb{R}$. The latter corresponds to the backward flow $f_t(z)$ of (1.3) with $z \in \mathbb{R}\setminus\{0\}$. If $W$ is the symmetric α-stable process on $\mathbb{R}$, then the generator of $X_t = Z_t - W_t$ in the forward, resp. backward, equation is
$$A^{\pm} = \pm\frac{2}{x}\frac{d}{dx} - (-\Delta)^{\alpha/2}.$$
For the $(-\Delta)^{\alpha/2}$-harmonic function $u(x) = |x|^{\alpha-1}$ ($\alpha \ne 1$) we have
$$A^{\pm} u = \pm 2(\alpha - 1)|x|^{\alpha - 3}.$$
Thus $u$ is superharmonic for $A^+$ if $\alpha < 1$ and for $A^-$ if $\alpha > 1$. With the above reasoning (2.7) we obtain

Proposition 3.4. For $\alpha > 1$, $X_t$ is recurrent in the backward LE on $\mathbb{R}$, whereas for $\alpha < 1$, $X_t$ is transient in the forward LE on $\mathbb{R}$ and almost surely, neither $X_t$ nor $X_{t-}$ visits 0.

Notice that in SLE driven by Brownian motion, which corresponds to the case $\alpha = 2$, the question of recurrence versus transience of $X_t$ in the forward LE is rather subtle: If $B_t$ is Brownian motion and $W_t = \sqrt{\kappa} B_t$ with $\kappa \le 4$, we have transience whereas for $\kappa > 4$ we have recurrence.

We will now prove a partial converse to Lemma 3.3(a) about the deterministic forward LE (3.1) in $\mathbb{H}$. If the trajectory of a point $x_0$ on the real line stays away by $\varepsilon$ from the singularity, then the disc of radius $\varepsilon$ at this point does not meet the singularity and therefore is disjoint from the hull. More generally, if the real part of some trajectory stays away from the singularity by $\varepsilon$, then the $\varepsilon$-disc around this point is disjoint from the hull. Let $g_t(z)$ be the solution to the deterministic LE (3.1) with $z \in \overline{\mathbb{H}}\setminus\{0\}$ and define $X_t^z = g_t(z) - W_t$. When $z \in \mathbb{R}\setminus\{0\}$, $Z_t = g_t(z)$ solves the forward LE (3.9) on $\mathbb{R}$ discussed at the beginning of the section and $X_t^z = Z_t - W_t$ is real-valued. When $z \in \mathbb{H}$, $X_t^z$ is complex valued. We will use $B(z, r)$ to denote the ball in $\mathbb{R}^2 = \mathbb{C}$ centered at $z$ with radius $r$.

Lemma 3.5. If $|\operatorname{Re} X_t^{z_0}| \ge \varepsilon$ for some $z_0 \in \overline{\mathbb{H}}$ and all $0 \le t \le T$, then $|\operatorname{Re} X_t^z| > 0$ for all $z \in B(z_0, \varepsilon) \cap \overline{\mathbb{H}}$ and all $0 \le t \le T$. In particular, $B(z_0, \varepsilon) \cap K_T = \emptyset$.
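Proof. From (3.1) we have
$$\partial_t (X_t^z - X_t^{z_0}) = \frac{2}{X_t^z} - \frac{2}{X_t^{z_0}} = 2\,\frac{X_t^{z_0} - X_t^z}{X_t^z X_t^{z_0}}$$
and so
$$\partial_t |X_t^z - X_t^{z_0}|^2 = 2\operatorname{Re}\left[\partial_t (X_t^z - X_t^{z_0})\,\overline{(X_t^z - X_t^{z_0})}\right] = -4\,\frac{|X_t^z - X_t^{z_0}|^2}{|X_t^z X_t^{z_0}|^2}\operatorname{Re}(X_t^z X_t^{z_0}). \tag{3.11}$$
It follows that $|X_t^z - X_t^{z_0}|$ is decreasing because $\operatorname{Re}(X_t^z X_t^{z_0}) > 0$ as long as $|\operatorname{Re} X_t^{z_0}| \ge \varepsilon$ and $|\operatorname{Re}(X_t^{z_0} - X_t^z)| < \varepsilon$. Since $|\operatorname{Re} X_t^{z_0}| \ge \varepsilon$ for every $0 \le t \le T$, we have for every $z \in B(z_0, \varepsilon) \cap \overline{\mathbb{H}}$,
$$|X_t^z - X_t^{z_0}| \le |z - z_0| < \varepsilon \qquad \text{for every } 0 \le t \le T,$$
and so $|\operatorname{Re} X_t^z| > 0$ for every $0 \le t \le T$.

Now suppose $x \in \mathbb{R}\setminus\{0\}$ and $g_t(x)$ is the solution to the LE (3.1) driven by a symmetric α-stable process $W$ on $\mathbb{R}$ with $\alpha < 1$. As mentioned previously, $g_t(x)$ is the solution to the forward LE on $\mathbb{R}$. Proposition 3.4 tells us that for $X_t^x = g_t(x) - W_t$,
$$r := \inf_{t \ge 0} |\operatorname{Re} X_t^x| = \inf_{t \ge 0} |X_t^x| > 0 \qquad \text{a.s.}$$
We then have by Lemma 3.5
$$B(x, r) \cap \bigcup_{t > 0} K_t = \emptyset \qquad \text{a.s.}$$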
4. Derivative Estimates

We would like to estimate the derivative of $h_t = g_t^{-1}$. Because $h_t$ satisfies the PDE $\partial_t h_t(z) = -2\,\partial_z h_t(z)/(z - W_t)$ rather than an ODE, it is usually easier to work with the time $t$ map $f_t$ of the backward Loewner equation (1.3). The connection is as follows: If $g_t$ is the solution to (3.1) driven by a function $W_t$ ($0 \le t \le T$), and if $f_s$ is the solution to (3.10) driven by $\widetilde W_s = W_{T-s}$, then $f_T = g_T^{-1}$. But generally $f_t \ne g_t^{-1}$ for $t < T$. Because for the symmetric stable process, $s \mapsto W_{T-s} - W_T$ has the same distribution as $W_s$, it follows that for each fixed $T > 0$, the random conformal map $f_T(z)$ of $\mathbb{H}$ has the same distribution as $g_T^{-1}(z - W_T) + W_T$ (but the family of maps $\{f_t(\cdot), t \ge 0\}$ does not have the same distribution as $\{g_t^{-1}(\cdot - W_t) + W_t, t \ge 0\}$).

For the remainder of this section, we consider the time $t$ map $f_t$ of the backward Loewner equation (1.3). Let $(X_t, Y_t) := Z_t - W_t$, where $Z_t = f_t(z)$. Then by (3.10),
$$d(X_t + iY_t) = \frac{-2}{X_t + iY_t}\,dt - dW_t = \frac{-2X_t + 2iY_t}{X_t^2 + Y_t^2}\,dt - dW_t.$$
Hence
$$dX_t = -\frac{2X_t}{X_t^2 + Y_t^2}\,dt - dW_t \qquad\text{and}\qquad dY_t = \frac{2Y_t}{X_t^2 + Y_t^2}\,dt. \tag{4.1}$$
In particular, we have $d\ln Y_t = \frac{2}{X_t^2 + Y_t^2}\,dt$ and so
$$Y_t = Y_0 \exp\left(\int_0^t \frac{2}{X_s^2 + Y_s^2}\,ds\right).$$
We record a simple lemma for later use. Let $\varphi_t(z) = \sqrt{z^2 - 4t}$ be the solution to the backward LE (1.3) driven by the constant function W ≡ 0.

Lemma 4.1. For every $Z_0 = X_0 + iY_0$ with $Y_0 \in (0, 1]$,
\[ Y_t \le \operatorname{Im} \varphi_t(iY_0) \le \sqrt{1 + 4t} \quad \text{for every } t > 0. \]

Proof. From (4.1) we have $dY_t \le 2\,dt/Y_t$ with equality if and only if $X_t \equiv 0$ and therefore $W_t \equiv 0$. Thus $d(Y_t^2) \le 4\,dt$ and integration gives $Y_t^2 \le Y_0^2 + 4t$ with equality if and only if $W_t \equiv 0$.

For u > 0, define
\[ \gamma_u = \inf\{t > 0 : Y_t \ge Y_0 e^u\} = \inf\Big\{ t > 0 : \int_0^t \frac{2}{X_s^2 + Y_s^2}\,ds \ge u \Big\}. \tag{4.2} \]
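The height bound of Lemma 4.1 can be checked on a discretised path. This is a minimal sketch, assuming an explicit Euler step for (4.1) in which $Y^2$ is updated instead of $Y$, so that the discrete increment of $Y^2$ never exceeds $4\,dt$. The function name `backward_le_height` and the deterministic toy increments standing in for $dW_t$ are hypothetical.

```python
import math


def backward_le_height(x0, y0, driver_increments, dt):
    """Euler scheme for (4.1), tracking Y^2 so that d(Y^2) <= 4 dt holds stepwise."""
    x, ysq = x0, y0 * y0
    for dw in driver_increments:
        denom = x * x + ysq
        x_new = x - 2.0 * x / denom * dt - dw   # dX = -2X/(X^2+Y^2) dt - dW
        ysq = ysq + 4.0 * ysq / denom * dt      # d(Y^2) = 4Y^2/(X^2+Y^2) dt
        x = x_new
    return x, math.sqrt(ysq)


n_steps, T = 4000, 1.0
dt = T / n_steps

# Hypothetical deterministic increments standing in for dW_t.
toy_increments = [0.3 * math.sin(7.0 * k) * math.sqrt(dt) for k in range(n_steps)]
_, y_driven = backward_le_height(0.2, 0.5, toy_increments, dt)

# Zero driver started on the imaginary axis: the equality case W = 0, X = 0.
_, y_free = backward_le_height(0.0, 0.5, [0.0] * n_steps, dt)
```

With the zero driver the scheme reproduces $Y_T^2 = Y_0^2 + 4T$ up to rounding, and any other driver gives a smaller height.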
Theorem 4.2. Let $W_t = S_t$ be a standard symmetric α-stable process on R (that is, $W_t \sim S(\alpha, t)$). Then for every $z = x + iy \in H$ and u > 0, $P_z(\gamma_u < \infty) = 1$ when α ∈ [1, 2) and $P_z(\gamma_u = \infty) > 0$ when α ∈ (0, 1).

Proof. Define $u_0 := \inf\{u : \gamma_u = \infty\}$, and $(\tilde X_u, \tilde Y_u) := (X_{\gamma_u}, Y_{\gamma_u})$ for $u < u_0$. Clearly $\tilde Y_u = Y_0 e^u$. Note that under $P_z$, $(X_0, Y_0) = (x, y)$, so for $u < u_0$, $\tilde Y_u = y e^u$ and
\[ \tilde X_u = X_{\gamma_u} = x - \int_0^{\gamma_u} \frac{2X_s}{X_s^2 + Y_s^2}\,ds - W_{\gamma_u} = x - \int_0^u \tilde X_s\,ds - W_{\gamma_u}. \tag{4.3} \]
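By [15, Theorem 3.1], there is a symmetric α-stable process Z on R such that
\[ W_{\gamma_u} = \int_0^u \Big( \frac{\tilde X_{r-}^2 + y^2 e^{2r}}{2} \Big)^{1/\alpha} dZ_r \quad \text{on } [0, u_0). \]
Thus $\tilde X$ satisfies the following SDE:
\[ d\tilde X_u = -\tilde X_u\,du - \Big( \frac{\tilde X_{u-}^2 + y^2 e^{2u}}{2} \Big)^{1/\alpha} dZ_u \quad \text{on } [0, u_0) \text{ with } \tilde X_0 = x, \tag{4.4} \]
where Z is a symmetric α-stable process on R. We can rewrite (4.4) as
\[ d(e^u \tilde X_u) = -e^{(1-2/\alpha)u} \Big( \frac{(e^u \tilde X_{u-})^2 + y^2 e^{4u}}{2} \Big)^{1/\alpha} dZ_u. \tag{4.5} \]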
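The passage from (4.4) to (4.5) is a one-line computation, spelled out here as a sketch. It writes $\tilde X_u := X_{\gamma_u}$ for the time-changed process (the tilde is a notational assumption) and uses that $e^u$ is continuous and of finite variation, so the product rule carries no covariation term.

```latex
\begin{align*}
d\big(e^{u}\tilde X_u\big)
  &= e^{u}\tilde X_u\,du + e^{u}\,d\tilde X_u \\
  &= e^{u}\tilde X_u\,du - e^{u}\tilde X_u\,du
     - e^{u}\Big(\frac{\tilde X_{u-}^{2} + y^{2}e^{2u}}{2}\Big)^{1/\alpha} dZ_u \\
  &= -\,e^{u}\,e^{-2u/\alpha}
     \Big(\frac{e^{2u}\tilde X_{u-}^{2} + y^{2}e^{4u}}{2}\Big)^{1/\alpha} dZ_u \\
  &= -\,e^{(1-2/\alpha)u}
     \Big(\frac{(e^{u}\tilde X_{u-})^{2} + y^{2}e^{4u}}{2}\Big)^{1/\alpha} dZ_u .
\end{align*}
```

The third line only multiplies and divides the bracket by $e^{2u}$, which is where the prefactor $e^{-2u/\alpha}$ comes from.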
By [7, Lemma 5.2 and Theorem 5.3], the above SDE for Ut := et X t has a unique weak solution. Moreover [7, Theorems 5.4 and 5.6] tell us that the solution has non-explosion if and only if α ∈ [1, 2) (see also [16] for the case of α ∈ (1, 2)). It follows that SDE (4.4) has a unique weak solution X that has infinite lifetime if and only if α ∈ [1, 2). u ) in law. So we have for α ∈ [1, 2), Note that the process (X , yeu ) extends ( Xu, Y u 0 = ∞ a.s., in other words, for any t > 0, the original height process Y can reach level yet with probability 1. When α ∈ (0, 1), [7, Theorem 5.6] tells us that the unique weak solution to the SDE (4.5) for Ut := et X t explodes a.s., moreover its explosion time u 0 is bounded by a random variable that has strictly positive probability density on (0, ε) for some ε > 0. The latter implies that Pz (γt = ∞) > 0 for every t > 0. This proves the lemma. As we mentioned in the Introduction, many smooth functions such as polynomials of order 2 and higher are not α/2 -differentiable. For this reason, we need to look at truncated symmetric stable processes. Let ! St := St − (Sr − Sr − )1{|Sr −Sr − |>1} , t ≥ 0. 0 0 and p < 2. Then there are constants C1 , C2 > 0 depending on p and α only such that α/2 f (x)| ≤ C1 (x 2 + a 2 )( p−α)/2 + C2 . | When p = α, the right-hand side is to be interpreted as log(1/(x 2 + a 2 )).
(4.8)
Proof. The proof is similar to that of Lemma 2.9 in [8]. Assume first that |x| ≤ a. Then w := x/a ∈ [−1, 1]. When 0 < a < 1/2, we have " " " " # $ p/2 " " 2 2 2 2 p/2 (x + h) + a − (x + a ) " " α/2 f (x)| = " lim cα | dh " 1+α "ε→0 " |h| " " {ε 0 we can choose κ > 0 small so that Cβ,α κ < δ. Taking λ = β − δ, we have 2 2
+ β x − y ϕ(t, x, y) ≤ −(β − δ)ϕ(t, x, y) = ∂ ϕ(t, x, y) L x 2 + y2 ∂t
for x ∈ R and 0 < y ≤ 1. Thus by Ito’s formula (cf. [10]), for each fixed 0 < t ≤ − log y, 2 2 u ) := (X γu , Yγu ) and q(x, y) := β x 2−y2 , with ( Xu, Y x +y ⎛ s ⎞ s ) exp ⎝ q( u )du ⎠ 1{γs 0. For 0 < ρ < 1 and ε > 0 there is κ = κ(ρ, ) > 0 such that for z = x + i y with −R < x < R and 0 < y < 1, there is a constant C > 0 depending on T, α, ε, R and ρ so that
\[ P\Big( \max_{0\le t\le T} |f_t'(z)| \ge y^{\rho-1} \Big) \le C\,y^{2-6\rho-\varepsilon}. \]
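Proof. Fix $0 \le t \le T$, $z = x + iy$ and write $f_t(z) - W_t = X_t + iY_t$, $\tilde f_u(z) - W_{\tau_u} = \tilde X_u + i\tilde Y_u$. Notice $y = Y_0$. Recall by (4.11),
\[ \partial_u \log |\tilde f_u'(z)| = \frac{\tilde X_u^2 - \tilde Y_u^2}{\tilde X_u^2 + \tilde Y_u^2}, \]
so that
\[ |f_t'(z)| = \exp\Big( \int_0^{\log (Y_t/Y_0)} \frac{\tilde X_u^2 - \tilde Y_u^2}{\tilde X_u^2 + \tilde Y_u^2}\,du \Big). \]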
Let
\[ q(u) := \frac{\tilde X_u^2 - \tilde Y_u^2}{\tilde X_u^2 + \tilde Y_u^2}. \]
Since $|q(u)| \le 1$, if $Y_t < y^\rho$, it follows that $|f_t'(z)| < y^{\rho-1}$. On $\{1 \ge Y_t \ge y^\rho\}$, since $q(u) \le 1$,
\[ |f_t'(z)| = \exp\Big( \int_0^{\log (y^\rho/y)} q(u)\,du + \int_{\log (y^\rho/y)}^{\log (Y_t/y)} q(u)\,du \Big) \le |\tilde f'_{(\rho-1)\log y}(z)|\,\frac{Y_t}{y^\rho} \le |\tilde f'_{(\rho-1)\log y}(z)|\,y^{-\rho}, \]
while on $\{Y_t > 1\}$ we have by Lemma 4.1,
\[ |f_t'(z)| = \exp\Big( \int_0^{\log (1/Y_0)} q(u)\,du + \int_{\log (1/Y_0)}^{\log (Y_t/Y_0)} q(u)\,du \Big) \le |\tilde f'_{-\log y}(z)|\,Y_t \le \sqrt{1 + 4t}\,|\tilde f'_{-\log y}(z)|. \]
It follows that, on $\{\max_{0\le t\le T} |f_t'(z)| \ge y^{\rho-1}\} \subset \{Y_T \ge y^\rho\}$,
√ −ρ max | f t (z)| ≤ | f (ρ−1) 1{γ(ρ−1) log y 0 there is κ > 0 such that with Wt = Sκt , for every bounded set A ⊂ H and every T > 0, a.s. all f t , 0 ≤ t ≤ T , are Hölder continuous with exponent 1/6 − ε on A when 0 < α < 2: 1
\[ |f_t(z) - f_t(z')| \le C\,|z - z'|^{\frac16 - \varepsilon} \]
for all $z, z' \in A$ with a random constant $C = C(A, \alpha, T, \varepsilon)$.

Proof. Let R > 0 and b > 0 be such that $A \subset [-R, R] \times (0, b]$. It suffices to show that
\[ \max_{0\le t\le T} |f_t'(x + iy)| \le C\,y^{-\frac56 - \varepsilon} \]
for all $-R < x < R$ and all $0 < y \le b$. By Koebe distortion, it is enough to show this for dyadic points $z_{j,n} = (j + i)2^{-n}$, where $n \ge 0$ and $-R2^n \le j \le R2^n$. For every ε > 0 there is ρ > 1/6 − ε such that the exponent 2 − 6ρ − ε in Lemma 5.1 is strictly larger than 1. Hence
\[ \sum_{n=0}^{\infty} \sum_{j=-R2^n}^{R2^n} P\Big( \max_{0\le t\le T} |f_t'(z_{j,n})| > 2^{n(\frac56 + \varepsilon)} \Big) < \infty, \]
and the theorem follows as Theorem 5.2 in [17].
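The summability claimed in the last display can be spelled out as follows. This is a sketch, assuming Lemma 5.1 is applied at each dyadic point with $y = 2^{-n}$ for $n \ge 1$ (the level $n = 0$ contributes only finitely many terms, each at most 1) and writing C for the constant of that lemma. Since $\rho > 1/6 - \varepsilon$ gives $1 - \rho < 5/6 + \varepsilon$, each event in the sum is contained in the event $\{\max_t |f_t'(z_{j,n})| \ge 2^{n(1-\rho)}\}$ controlled by the lemma.

```latex
\begin{align*}
\sum_{n\ge 1}\ \sum_{j=-R2^{n}}^{R2^{n}}
   P\Big(\max_{0\le t\le T}|f_t'(z_{j,n})| \ge 2^{n(1-\rho)}\Big)
 &\le \sum_{n\ge 1} \big(2R\,2^{n}+1\big)\, C\, 2^{-n(2-6\rho-\varepsilon)} \\
 &\le C\,(2R+1) \sum_{n\ge 1} 2^{-n(1-6\rho-\varepsilon)} \;<\; \infty ,
\end{align*}
```

The geometric series converges exactly when $1 - 6\rho - \varepsilon > 0$, that is, when the exponent $2 - 6\rho - \varepsilon$ of Lemma 5.1 exceeds 1. The Borel–Cantelli lemma then bounds the derivative at all but finitely many dyadic points.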
It immediately follows that for each fixed t, the map $f_t(z)$ extends continuously to $\overline{H}$ a.s.

In order to pass from SLE driven by a truncated stable process $\hat S_{\kappa t}$ to SLE driven by $S_{\kappa t}$, let us recall the following relation between $S_t$ and $\hat S_t$. Note that a symmetric α-stable process has Lévy measure $c_\alpha |h|^{-1-\alpha}\,dh$. The jumps $\{(S_{\kappa t} - S_{\kappa t-})1_{\{|S_{\kappa t} - S_{\kappa t-}| > 1\}}, t \ge 0\}$ of size larger than 1 form a Poisson point process with intensity measure $c_\alpha \kappa |h|^{-1-\alpha} 1_{\{|h|>1\}}\,dt\,dh$. The process
\[ S_{\kappa t} - \sum_{r \le t} (S_{\kappa r} - S_{\kappa r-})1_{\{|S_{\kappa r} - S_{\kappa r-}| > 1\}}, \quad t \ge 0, \]
has the same distribution as $\{\hat S_{\kappa t}, t \ge 0\}$. Define $T_0 = 0$ and let
\[ T_k := \inf\{t > T_{k-1} : |S_{\kappa t} - S_{\kappa t-}| > 1\} \quad \text{for } k \ge 1 \]
be the k-th jumping time of $S_{\kappa t}$ of size larger than 1. Then $\{T_k - T_{k-1}, k \ge 1\}$ is a sequence of i.i.d. exponential random variables with parameter λκ. Moreover, the processes
\[ \{S_{\kappa(t + T_{k-1})} - S_{\kappa T_{k-1}},\ t \in [0, T_k - T_{k-1})\}, \quad k \ge 1, \]
are i.i.d., which are independent copies of $\hat S_{\kappa t}$ killed at an independent exponential random time $T_1$.

All this tells us that $S_{\kappa t}$ can be constructed as follows. Let $T_0 = 0$ and $\{T_k - T_{k-1}, k \ge 1\}$ be an i.i.d. sequence of exponential random variables with parameter λκ. Let $\{\hat S^k_{\kappa t}, t \ge 0\}$, $k \ge 1$, be a sequence of independent copies of $\{\hat S_{\kappa t}, t \ge 0\}$. Let $\{\xi_k, k \ge 1\}$ be an i.i.d. sequence of random variables with density function proportional to $1_{\{|h|>1\}} |h|^{-1-\alpha}$. These $\{T_k, k \ge 1\}$, $\{\hat S^k_{\kappa t}, k \ge 1\}$ and $\{\xi_k, k \ge 1\}$ are all independent. For t > 0, let n be the largest integer so that $T_n \le t$. Define
\[ X_t := \sum_{k=1}^{n-1} \Big( \hat S^k_{\kappa(T_k - T_{k-1})} + \xi_k \Big) + \hat S^n_{\kappa(t - T_n)}. \tag{5.1} \]
Then $\{X_t, t \ge 0\}$ has the same distribution as $\{S_{\kappa t}, t \ge 0\}$. From this, we immediately have the following.
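The two random ingredients of this construction that are not truncated stable paths, namely the large jumps $\xi_k$ and the jump times $T_k$, are easy to sample. The sketch below is illustrative only: the function names are hypothetical, the rate is passed in as a plain parameter (the text calls it λκ), and simulating the truncated process itself is not attempted. The jump sizes are drawn by inverse transform, using that a density proportional to $1_{\{|h|>1\}}|h|^{-1-\alpha}$ has tail $P(|\xi| > x) = x^{-\alpha}$ for $x \ge 1$.

```python
import random


def sample_large_jump(alpha, rng):
    """Draw xi with density proportional to 1_{|h|>1} |h|^(-1-alpha)."""
    u = 1.0 - rng.random()            # uniform on (0, 1]
    magnitude = u ** (-1.0 / alpha)   # P(|xi| > x) = x^(-alpha) for x >= 1
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * magnitude


def large_jump_times(rate, horizon, rng):
    """Jump times T_1 < T_2 < ... <= horizon with i.i.d. Exp(rate) gaps."""
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(0)
alpha = 1.5
jumps = [sample_large_jump(alpha, rng) for _ in range(100000)]
times = large_jump_times(rate=2.0, horizon=10.0, rng=rng)
```

Between consecutive times in `times` one would run an independent copy of the truncated process and add one of the sampled jumps at each $T_k$, as in (5.1).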
Lemma 5.3. For κ > 0, let $\{T_k, k \ge 1\}$, $\{\hat S^k_{\kappa t}, k \ge 1\}$ and $\{\xi_k, k \ge 1\}$ be as in the last paragraph, which are all independent, and let X be defined by (5.1). Let $\{f^{(k)}_t, t \ge 0\}$ be SLE driven by $\hat S^k_{\kappa t}$. For t > 0, let n be the largest integer so that $T_n \le t$. Define
\[ f_t(z) := \Big( f^{(n)}_{t - T_n}(\cdot - X_{T_n}) + X_{T_n} \Big) \circ \cdots \circ \Big( f^{(2)}_{T_2 - T_1}(\cdot - X_{T_1}) + X_{T_1} \Big) \circ f^{(1)}_{T_1}(z). \]
Then { f t (z), t ≥ 0} has the same distribution as the SLE driven by Wt = Sκt . Because compositions of Hölder continuous maps are Hölder, from Theorem 5.2, Lemma 3.1 and Lemma 5.3 we obtain the following Corollary 5.4. For every 0 < α < 2, κ > 0, and Wt = Sκt , for every bounded set A ⊂ H and every t > 0, a.s. f t is Hölder continuous on A. The same holds for Wt = St . 6. Hausdorff Dimension We will now show that the hulls have Hausdorff dimension 1 almost surely. The situation is similar to [17], Sect. 8.2: Because f t is Hölder continuous, the (box counting) dimension can be estimated by the convergence exponent of the Whitney decomposition of H\K t , which in turn is controlled by the growth of the derivative f t towards the boundary R of H. For a Borel set K ⊂ R2 , we use dim H K to denote its Hausdorff dimension. Theorem 6.1. For each 0 < α < 2, κ > 0, and Wt = Sκt (or Wt = Sκt ) dim H K t = 1 for all t ≥ 0, almost surely. Since K t has empty interior by [8], K t ∪ R = ∂(H\K t ). Because gt−1 has the same distribution as f t (for fixed t), it thus suffices to show that the boundary of Ht = f t (H) has dimension 1 a.s. Denote by N (ε) = N (ε, A) the minimal number of disks of radius ε needed to cover a set A ⊂ C. The following is an analog of the upper bound for the dimension of the outer SLE boundary, Theorem 8.6 in [17]. Theorem 6.2. For each 0 < α < 2 and 1 < a < 2, there is κ > 0 such that with Wt = Sκt , for all T > 0, h > 0 and R > 0, a.s. we have lim εa max N (ε, f t [−R, R] ∩ {y > h}) = 0.
ε→0
0≤t≤T
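The statement of Theorem 6.2 says that the covering numbers grow more slowly than $\varepsilon^{-a}$ for every a > 1, which is how an upper bound of 1 on the box-counting dimension is expressed. The toy sketch below illustrates that scaling on a straight segment. It is not an SLE computation, and it counts squares of an ε-grid instead of minimal disk coverings, on the assumption that the two counts are comparable up to a constant factor. The function name `grid_cover_count` is hypothetical.

```python
def grid_cover_count(points, eps):
    """Number of eps-grid squares hit by the points (a proxy for N(eps))."""
    return len({(int(x // eps), int(y // eps)) for x, y in points})


# Unit segment on the real axis, sampled at dyadic points.
segment = [(k / 2 ** 16, 0.0) for k in range(2 ** 16 + 1)]

levels = list(range(2, 9))
counts = [grid_cover_count(segment, 2.0 ** (-n)) for n in levels]

# eps^a * N(eps) for a = 1.5, which should decrease to 0 for a one-dimensional set.
a = 1.5
scaled = [(2.0 ** (-n)) ** a * c for n, c in zip(levels, counts)]
```

For the segment the count at scale $2^{-n}$ is $2^n + 1$, so `scaled` decays like $2^{-n/2}$.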
Proof. As in [17], Sect. 8.2, consider a Whitney decomposition of $H_t$ (that is, a covering of $H_t$ by essentially disjoint closed squares $Q \subset H_t$ with sides parallel to the coordinate axes such that the side length d(Q) is comparable to the distance of Q from the boundary of $H_t$, and such that d(Q) is an integer power of 2). Denote by $\mathcal{W}_t$ the collection of those squares Q for which $Q \cap f_t([-R, R] \times (0, \infty)) \cap \{y > h\} \ne \emptyset$, and let
\[ S(a) = \max_{0\le t\le T} \sum_{Q \in \mathcal{W}_t} d(Q)^a \le \infty. \]
Then the proof of Theorem 8.6 in [17] (the last displayed formula) shows that, for each 0 ≤ t ≤ T, # $ N 2−n , f t [−R, R] ∩ {y > 2 h} ≤ C(ω) 2(n+O(log n)) a S(a). The factor C(ω) comes from the Hölder norm of f t and is random, but does not depend on t or n. The theorem follows at once if we show S(a) < ∞ a.s. To this end, we will show E[S(a)] < ∞ for a > 1, in analogy with the upper bound in Theorem 8.3 of [17]. By the Koebe distortion theorem, again writing z j,n = ( j + i)2−n , the quantity R2 ∞ ! ! n
S(a) = max
0≤t≤T
1{Im
−n a f t (z j,n )>h} | f t (z j,n ) 2 |
n=0 j=−R2n
is comparable to S(a) (see (8.2) and Lemma 8.4 of [17] for the details), in particular S(a) ≤ C S(a) for some universal C. For 0 ≤ t ≤ T, Lemma 4.1 yields 1{Im
f log(2 n h) (z j,n )|1{γlog (2n h) h} | f t (z j,n )| ≤ 1{γlog (h/y j,n ) 0 be as in Theorem 6.2. Let f t be driven by Sκt , and factor f t according to Lemma 5.3. Then by Theorem 6.2, the hulls of the factors of f t have Hausdorff dimension ≤ a, and thus dim H ∂ f t (H) ≤ a by Lemma 6.3. Letting a tend to 1, we see that the hulls driven by Wt = Sκt have Hausdorff dimensional at most 1. Because the boundary of the simply connected domain H\K t is connected, and because K t ∩ H = ∅, we have dim H K t ≥ 1 and conclude dim H K t = 1 for every t > 0. By the scaling Lemma 3.1, the hulls driven by Wt = St have the same dimension as the hulls of Sκt . Finally, the hulls of Sκt for an arbitrary (not neccessarily small) κ can be recovered from the hulls of Sκt , and therefore have dimension 1, by removing the jumps of Wκt , similar to Lemma 5.3.
7. Trace Continuity

The purpose of this section is to prove the following

Theorem 7.1. Fix α ∈ (0, 2) and κ > 0. Let $W_t = S_{\kappa t}$ or $W_t = \hat S_{\kappa t}$. Then almost surely, for each t > 0 the limit
\[ \gamma(t) = \lim_{z \to W_t;\ z \in H} g_t^{-1}(z) \]
exists, the function t → γ (t) is RCLL, and K t = γ [0, t]. From Theorem 5.2 we know that for each fixed T, f T (z) extends continuously to H a.s. Because the hulls K T have the same law as H\( f T (H) − WT ), they are locally connected a.s. In general, this does not imply that the subsets K t ⊂ K T for t < T are locally connected too (for instance, it is possible that K t is not locally connected at some time t0 , but that due to “swallowing” K t0 is contained in the interior of K t for some t1 > t0 , and that the boundary of K t1 is smooth). Nor does the equicontinuity √ −1 of f t (z) √ generally imply equicontinuity of gt (z). For instance, if Wt = c 1 − t with c = 2 3, then f t (z) is equicontinuous (H\ f t (H) is a halfdisc of radius proportional to √ t) whereas gt−1 is not (K t is an arc of a semicircle up to time 1 when K t is a semidisc). Because of the tree structure of the hulls, our situation is better: Proposition 7.2. Let Wt = Sκt or Wt = Sκt . For each 0 < α < 2 and each T > 0, a.s. each of the maps gt−1 , 0 ≤ t ≤ T , has a continuous extension to H (which we again denote gt−1 ). Moreover, the maps {gt−1 , 0 ≤ t ≤ T }, are equicontinuous on H. We postpone the proof until the end of this section and continue with the Proof of Theorem 7.1. Fix α ∈ (0, 2) and T > 0, and let t ≤ T. Because gt−1 has a continuous extension to H by Proposition 7.2, γ (t) = lim z→Wt ;z∈H gt−1 (z) exists, and γ (t) = gt−1 (Wt ). The equicontinuity of gt−1 , together with the pointwise continuity of t → gt−1 (z), easily implies the continuity of (t, z) → gt−1 (z) on [0, T ] × H. It follows immediately that γ is RCLL. To prove K t = γ [0, t], first let z = γ (t) = gt−1 (Wt ). Clearly Tz ≤ t, so that z ∈ K t (see Sect. 3.1 for the notation). Because K t is closed, we have K t ⊃ γ [0, t]. Conversely, if w ∈ K t , then Tw ≤ t. Then either lim inf |gs (w) − Ws | = 0, and the continuity of s→Tw −
(s, z) → gs−1 (z) implies lim inf |gs−1 (gs (w))−gs−1 (Ws )| = 0, which yields w ∈ γ [0, t]. s→Tw −
Or lim inf |gs (w) − Ws | > 0 and gTw (w) = WTw , which means w = γ (Tw ) ∈ γ [0, t]. s→Tw −
It follows that K t ⊂ γ [0, t].
In order to prove Proposition 7.2, we need a variant of a theorem of Warschawski [19] about the modulus of continuity of conformal maps of the disc. Roughly speaking, after suitable normalization the modulus only depends on the “roughness” of the boundary of the domain as measured by the size of bottlenecks. Let G ⊂ C be a simply connected domain and a ∈ G be a marked point (in [19], a = 0 whereas here we will have p = ∞). A crosscut of G is a simple arc {σ (t), 0 ≤ t ≤ 1} that lies in G except for the endpoints σ (0), σ (1) ∈ ∂G. Every crosscut separates G into two connected components.
If a ∉ σ, denote by G(σ) the component that does not contain p in its closure. Following Warschawski, define
\[ \eta_G(\delta) = \sup_{\operatorname{diam} \sigma \le \delta} \operatorname{diam} G(\sigma). \]
Thus ηG (δ) → 0 as δ → 0 if and only if ∂G is locally connected. Now assume that G = H\K and that f : H → G is the hydrodynamically normalized conformal map, f (z) = z + a/z + O(1/z 2 ) near ∞. Denote ω f (r ) = sup | f (z) − f (z )| : z, z ∈ H with |z − z | ≤ r the modulus of continuity of f. The following is Theorem I of Warschawski [19], except for the different normalization. His proof carries over with only minor modifications. Theorem 7.3. For each R > 0 and each function η(δ) with η(0+) = 0 there is a function ω(r ) with ω(0+) = 0 such that the following holds: If K ⊂ {|z| < R}, and if ηG (δ) ≤ η(δ) for all δ, then ω f (r ) ≤ ω(r ) for all r > 0. Proof of Proposition 7.2. Fix T > 0. Because f T (z) extends continuously to H a.s. by Corollary 5.4, and because the hulls K T have the same law as H\ f T (H) − WT , we have ηG T (0+) = 0 a.s., where G T = H\K T . By Theorem 1.3 (i) in [8], we know that K T and hence K t , 0 ≤ t ≤ T, does not have interior points. Hence every crosscut σ of H\K t can be decomposed into crosscuts σ j of H\K T such that (H\K t )(σ ) ⊂ ∪ j (H\K T )(σ j ). It follows that ηG t (δ) ≤ δ + 2ηG T (δ), for all t ≤ T. Now Proposition 7.2 follows from Theorem 7.3. Acknowledgement. We would like to thank the referees for their careful reading of the manuscript, and for their valuable comments.
References

1. Bass, R.F., Chen, Z.-Q.: Systems of equations driven by stable processes. Probab. Theory Relat. Fields 134, 175–214 (2006)
2. Bertilsson, D.: On Brennan’s conjecture in conformal mapping, Ph.D. thesis, Stockholm, 1999
3. Bertoin, J.: Lévy Processes. Cambridge: Cambridge Univ. Press, 1996
4. Chen, Z.-Q., Kumagai, T.: Heat kernel estimates for stable-like processes on d-sets. Stochastic Process Appl. 108, 27–62 (2003)
5. Chen, Z.-Q., Song, R.: Martin boundary and integral representation for harmonic functions of symmetric stable processes. J. Funct. Anal. 159, 267–294 (1998)
6. Chung, K.L.: A Course in Probability Theory. Third Edition, London-New York: Academic Press, 2001
7. Engelbert, H.-J., Kurenok, V.P.: On one-dimensional stochastic equations driven by symmetric stable processes. In: Stochastic Processes and Related Topics (Siegmundsburg, 2000). Stochastics Monogr. 12, London: Taylor & Francis, 2002, pp. 81–109
8. Guan, Q.-Y., Winkel, M.: SLE and α-SLE driven by Lévy processes. Ann. Probab. 36(4), 1221–1266 (2008)
9. Guan, Q.-Y.: Cadlag curves of SLE driven by Levy processes. http://arXiv.org/abs/0705.2321v1 [math.PR], 2007
10. He, S.W., Wang, J.G., Yan, J.A.: Semimartingale Theory and Stochastic Calculus. Beijing-New York: Science Press, 1992
11. Landkof, N.S.: Foundations of Modern Potential Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1972
12. Lawler, G.F.: Conformally Invariant Processes in the Plane. Providence, RI: Amer. Math. Soc. 2005
13. Pommerenke, C.: Boundary behaviour of conformal maps. Berlin-Heidelberg: Springer Verlag, 1992
14. Rushkin, I., Oikonomou, P., Kadanoff, L.P., Gruzberg, I.A.: Stochastic Loewner evolution driven by Lévy processes. J. Stat. Mech. P01001 (2006)
15. Rosiński, J., Woyczyński, W.A.: On Ito stochastic integration with respect to p-stable motion: inner clock, integrability of sample paths, double and multiple integrals. Ann. Probab. 14, 271–286 (1986)
16. Pragarauskas, H., Zanzotto, P.A.: On one-dimensional stochastic differential equations driven by stable processes. Lithuanian Math. J. 40, 277–295 (2000)
17. Rohde, S., Schramm, O.: Basic properties of SLE. Ann. Math. 161, 879–920 (2005)
18. Schramm, O.: Conformally invariant scaling limits: an overview and a collection of problems. In: Proceedings of ICM Madrid 2006, vol. 1, Zürich: European Math. Soc., 2007, pp. 513–543
19. Warschawski, S.: On the Degree of Variation in Conformal Mapping of Variable Regions. Trans. Amer. Math. Soc. 69, 335–356 (1950)

Communicated by M. Aizenman
Commun. Math. Phys. 285, 825–871 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0679-y
Communications in
Mathematical Physics
On the Localized Phase of a Copolymer in an Emulsion: Supercritical Percolation Regime F. den Hollander1,2 , N. Pétrélis2 1 Mathematical Institute, Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands 2 EURANDOM, P.O. Box 513, 5600 MB Eindhoven, The Netherlands.
E-mail:
[email protected] Received: 15 October 2007 / Accepted: 18 July 2008 Published online: 19 November 2008 – © The Author(s) 2008. This article is published with open access at Springerlink.com
Abstract: In this paper we study a two-dimensional directed self-avoiding walk model of a random copolymer in a random emulsion. The copolymer is a random concatenation of monomers of two types, A and B, each occurring with density 1/2. The emulsion is a random mixture of liquids of two types, A and B, organised in large square blocks occurring with density p and 1 − p, respectively, where p ∈ (0, 1). The copolymer in the emulsion has an energy that is minus α times the number of AA-matches minus β times the number of BB-matches, where without loss of generality the interaction parameters can be taken from the cone {(α, β) ∈ R² : α ≥ |β|}. To make the model mathematically tractable, we assume that the copolymer is directed and can only enter and exit a pair of neighbouring blocks at diagonally opposite corners. In [7], a variational expression was derived for the quenched free energy per monomer in the limit as the length n of the copolymer tends to infinity and the blocks in the emulsion have size $L_n$ such that $L_n \to \infty$ and $L_n/n \to 0$. Under this restriction, the free energy is self-averaging with respect to both types of randomness. It was found that in the supercritical percolation regime $p \ge p_c$, with $p_c$ the critical probability for directed bond percolation on the square lattice, the free energy has a phase transition along a curve in the cone that is independent of p. At this critical curve, there is a transition from a phase where the copolymer is fully delocalized into the A-blocks to a phase where it is partially localized near the AB-interface. In the present paper we prove three theorems that complete the analysis of the phase diagram: (1) the critical curve is strictly increasing; (2) the phase transition is second order; (3) the free energy is infinitely differentiable throughout the partially localized phase. In the subcritical percolation regime $p < p_c$, the phase diagram is much more complex. This regime will be treated in a forthcoming paper.

1.
Introduction and Main Results 1.1. Background. The problem considered in this paper is the localization transition of a random copolymer near a random interface. Suppose that we have two immiscible
liquids, say, oil and water, and a copolymer chain consisting of two types of monomer, say, hydrophobic and hydrophilic. Suppose that it is energetically favourable for monomers of one type to be in one liquid and for monomers of the other type to be in the other liquid. At high temperatures the copolymer will delocalize into one of the liquids in order to maximise its entropy, while at low temperatures energetic effects will dominate and the copolymer will localize close to the interface between the two liquids, because in this way it is able to place more than half of its monomers in their preferred liquid. In the limit as the copolymer becomes long, we may expect a phase transition. In the literature most attention has focussed on models with a single flat infinite interface or an infinite array of parallel flat infinite interfaces. Relevant references can be found in the monograph by Giacomin [4] and in the theses by Caravenna [3] and Pétrélis [9]. In the present paper we continue the analysis of a model introduced in den Hollander and Whittington [7], where the interface has a random shape. In particular, the situation was considered in which the square lattice is divided into large blocks, and each block is independently labelled A (oil) or B (water) with probability p and 1 − p, respectively, i.e., the interface has a percolation type structure. This is a primitive model of an emulsion, consisting of oil droplets dispersed in water (see Fig. 1). The copolymer consists of an i.i.d. random concatenation of monomers of type A (hydrophobic) and B (hydrophilic). It is energetically favourable for monomers of type A to be in the A-blocks and for monomers of type B to be in the B-blocks. 
Under the restriction that the copolymer is directed and can only enter and exit a pair of neighbouring blocks at diagonally opposite corners, it was shown that there are phase transitions between phases where the copolymer is fully delocalized away from the interface and phases where it is partially localized near the interface. Let pc ≈ 0.64 be the critical probability for directed bond percolation on the square lattice. It turns out that the phase diagram does not depend on p when p ≥ pc , while it does depend on p when p < pc . In the present paper we focus on the supercritical percolation regime, i.e., p ≥ pc . Our paper is organised as follows. In the rest of Sect. 1 we recall the definition of the model, state the relevant results from [7], and formulate three theorems for the supercritical percolation regime. These theorems are proved in Sects. 3, 4 and 5, respectively. Section 2 recalls the key variational formula for the free energy, as well as some basic facts about block pair free energies and path entropies needed along the way.
Fig. 1. An undirected copolymer in an emulsion
1.2. The model. Each positive integer is randomly labelled A or B, with probability 1/2 each, independently for different integers. The resulting labelling is denoted by
\[ \omega = \{\omega_i : i \in N\} \in \{A, B\}^{N} \tag{1.2.1} \]
and represents the randomness of the copolymer, with A denoting a hydrophobic monomer and B a hydrophilic monomer. Fix p ∈ (0, 1) and $L_n \in N$. Partition $R^2$ into square blocks of size $L_n$:
\[ R^2 = \bigcup_{x \in Z^2} \Lambda_{L_n}(x), \qquad \Lambda_{L_n}(x) = xL_n + (0, L_n]^2. \tag{1.2.2} \]
Each block is randomly labelled A or B, with probability p, respectively, 1 − p, independently for different blocks. The resulting labelling is denoted by
\[ \Omega = \{\Omega(x) : x \in Z^2\} \in \{A, B\}^{Z^2} \tag{1.2.3} \]
and represents the randomness of the emulsion, with A denoting oil and B denoting water. Let

• $W_n$ = the set of n-step directed self-avoiding paths starting at the origin and being allowed to move upwards, downwards and to the right.
• $W_{n,L_n}$ = the subset of $W_n$ consisting of those paths that enter blocks at a corner, exit blocks at one of the two corners diagonally opposite the one where it entered, and in between stay confined to the two blocks that are seen upon entering (see Fig. 2).

The corner restriction, which is unphysical, is put in to make the model mathematically tractable. We will see that, despite this restriction, the model has physically relevant behaviour. Given ω, Ω and n, with each path $\pi \in W_{n,L_n}$ we associate an energy given by the Hamiltonian
\[ H^{\omega,\Omega}_{n,L_n}(\pi) = -\sum_{i=1}^{n} \Big( \alpha\, 1\big\{\omega_i = \Omega^{L_n}_{(\pi_{i-1},\pi_i)} = A\big\} + \beta\, 1\big\{\omega_i = \Omega^{L_n}_{(\pi_{i-1},\pi_i)} = B\big\} \Big), \tag{1.2.4} \]
Fig. 2. A directed self-avoiding path crossing blocks of oil and water diagonally. The light-shaded blocks are oil, the dark-shaded blocks are water. Each block is L n lattice spacings wide in both directions. The path carries hydrophobic and hydrophilic monomers on the lattice scale, which are not indicated
where $(\pi_{i-1}, \pi_i)$ denotes the i-th step of the path and $\Omega^{L_n}_{(\pi_{i-1},\pi_i)}$ denotes the label of the block this step lies in. What this Hamiltonian does is count the number of AA-matches and BB-matches and assign them energy −α and −β, respectively, where α, β ∈ R. (Note that the interaction is assigned to bonds rather than to sites: we identify the monomers with the steps of the path.) As we will recall in Sect. 2.1, without loss of generality we may restrict the interaction parameters to the cone
\[ \text{CONE} = \{(\alpha, \beta) \in R^2 : \alpha \ge |\beta|\}. \tag{1.2.5} \]
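As a concrete reading of the Hamiltonian (1.2.4), the following sketch evaluates the energy of one lattice path. Several choices here are assumptions made for illustration and are not specified in the text: the block of a step is decided by the midpoint of the step, blocks are indexed with the half-open convention $xL + (0, L]^2$ of (1.2.2), the emulsion is a finite dictionary of labels, and the names `block_index` and `hamiltonian` are hypothetical.

```python
import math


def block_index(point, block_size):
    """Index x of the block x*L + (0, L]^2 that contains the point."""
    px, py = point
    return (math.ceil(px / block_size) - 1, math.ceil(py / block_size) - 1)


def hamiltonian(path, omega, block_label, block_size, alpha, beta):
    """Energy -(alpha * #AA-matches + beta * #BB-matches) of a path.

    path        : list of lattice sites pi_0, ..., pi_n
    omega       : monomer labels, omega[i - 1] belongs to the i-th step
    block_label : dict mapping block indices to "A" or "B"
    """
    energy = 0.0
    for i in range(1, len(path)):
        (x0, y0), (x1, y1) = path[i - 1], path[i]
        midpoint = ((x0 + x1) / 2.0, (y0 + y1) / 2.0)   # assumption: midpoint rule
        label = block_label[block_index(midpoint, block_size)]
        monomer = omega[i - 1]
        if monomer == label == "A":
            energy -= alpha
        elif monomer == label == "B":
            energy -= beta
    return energy


# Toy configuration: two blocks of size 2, three steps, one AA- and two BB-matches.
toy_labels = {(0, 0): "A", (1, 0): "B"}
toy_path = [(1, 1), (2, 1), (3, 1), (3, 2)]
toy_energy = hamiltonian(toy_path, "ABB", toy_labels, 2, alpha=1.0, beta=0.5)
```

The toy path is chosen only so that every step midpoint falls unambiguously in a labelled block. It does not start at the origin and does not respect the corner restriction of $W_{n,L_n}$.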
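Given ω, Ω and n, we define the quenched free energy per step as
\[ f^{\omega,\Omega}_{n,L_n} = \frac{1}{n} \log Z^{\omega,\Omega}_{n,L_n}, \qquad Z^{\omega,\Omega}_{n,L_n} = \sum_{\pi \in W_{n,L_n}} \exp\big[ -H^{\omega,\Omega}_{n,L_n}(\pi) \big]. \tag{1.2.6} \]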
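For very small n the sum in (1.2.6) can be evaluated by brute force. The sketch below is a deliberately simplified toy: it enumerates the unrestricted set $W_n$ (steps up, down and right, where self-avoidance amounts to never placing an up-step next to a down-step) and ignores the corner restriction that defines $W_{n,L_n}$. The energy is supplied as an arbitrary function of the step sequence. All names are hypothetical.

```python
import math
from itertools import product


def directed_paths(n):
    """All n-step sequences over U, D, R with no U immediately next to a D."""
    paths = []
    for steps in product("UDR", repeat=n):
        if all({a, b} != {"U", "D"} for a, b in zip(steps, steps[1:])):
            paths.append(steps)
    return paths


def free_energy(n, energy):
    """(1/n) log sum_pi exp(-H(pi)) over W_n, computed with a log-sum-exp shift."""
    log_terms = [-energy(steps) for steps in directed_paths(n)]
    m = max(log_terms)
    return (m + math.log(sum(math.exp(t - m) for t in log_terms))) / n


path_counts = [len(directed_paths(n)) for n in range(1, 5)]
entropy_only = free_energy(3, lambda steps: 0.0)   # alpha = beta = 0
```

With zero energy the free energy reduces to $\frac1n \log |W_n|$. The cost of the enumeration grows exponentially in n, so this is useful only as a sanity check on small cases.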
We are interested in the limit n → ∞ subject to the restriction
\[ L_n \to \infty \quad\text{and}\quad \frac{1}{n} L_n \to 0. \tag{1.2.7} \]
This is a coarse-graining limit where the path spends a long time in each single block yet visits many blocks. In this limit, there is a separation between a copolymer scale and an emulsion scale. In [7], Theorem 1.3.1, it was shown that
\[ \lim_{n\to\infty} f^{\omega,\Omega}_{n,L_n} = f = f(\alpha, \beta; p) \tag{1.2.8} \]
exists ω, Ω-a.s. and in mean, is finite and non-random, and can be expressed as a variational problem involving the free energies of the copolymer in each of the four block pairs it may encounter and the frequencies at which the copolymer visits each of these block pairs on the coarse-grained block scale. This variational problem, which is recalled in Sect. 2.1, will be the starting point of our analysis.

1.3. Phase diagram for $p \ge p_c$. In the supercritical regime the oil blocks percolate, and so the coarse-grained path can choose between moving into the oil or running along the interface between the oil and the water (see Fig. 3). We begin by recalling from den Hollander and Whittington [7] the two main theorems for the supercritical percolation regime (see Fig. 4).

Theorem 1.3.1. ([7], Theorem 1.4.1). Let $p \ge p_c$. Then $(\alpha, \beta) \mapsto f(\alpha, \beta; p)$ is non-analytic along the curve in CONE separating the two regions
\[ \begin{aligned} D &= \text{delocalized phase} = \big\{ (\alpha, \beta) \in \text{CONE} : f(\alpha, \beta; p) = \tfrac12 \alpha + \kappa \big\}, \\ L &= \text{localized phase} = \big\{ (\alpha, \beta) \in \text{CONE} : f(\alpha, \beta; p) > \tfrac12 \alpha + \kappa \big\}. \end{aligned} \tag{1.3.1} \]
Here, $\kappa = \lim_{n\to\infty} \frac{1}{n} \log |W_{n,L_n}| = \frac12 \log 5$ is the entropy per step of the walk subject to (1.2.7).
Fig. 3. Two possible strategies when the oil percolates
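Theorem 1.3.2. ([7], Theorem 1.4.3). Let $p \ge p_c$.

(i) For every α ≥ 0 there exists a $\beta_c(\alpha) \in [0, \alpha]$ such that the copolymer is
\[ \begin{aligned} &\text{delocalized if } -\alpha \le \beta \le \beta_c(\alpha), \\ &\text{localized if } \beta_c(\alpha) < \beta \le \alpha. \end{aligned} \tag{1.3.2} \]
(ii) $\alpha \mapsto \beta_c(\alpha)$ is independent of p, continuous, non-decreasing and concave on [0, ∞). There exist $\alpha^* \in (0, \infty)$ and $\beta^* \in [\alpha^*, \infty)$ such that
\[ \begin{aligned} \beta_c(\alpha) &= \alpha \quad \text{if } \alpha \le \alpha^*, \\ \beta_c(\alpha) &< \alpha \quad \text{if } \alpha > \alpha^*, \end{aligned} \tag{1.3.3} \]
and
\[ \lim_{\alpha \downarrow \alpha^*} \frac{\alpha - \beta_c(\alpha)}{\alpha - \alpha^*} \in [0, 1), \qquad \lim_{\alpha \to \infty} \beta_c(\alpha) = \beta^*. \tag{1.3.4} \]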
The intuition behind Theorem 1.3.1 is as follows (see Fig. 3). Suppose that $p > p_c$. Then the A-blocks percolate. Therefore the copolymer has the option of moving to the infinite cluster of A-blocks and staying inside that infinite cluster forever, thus seeing only AA-blocks. In doing so, it loses an entropy of at most $O(n/L_n) = o(n)$ (on the coarse-grained scale), it gains an energy $\frac12 \alpha n + o(n)$ (on the lattice scale, because only half of its monomers are matched), and it gains an entropy $\kappa n + o(n)$ (on the lattice scale, because it crosses blocks diagonally). Alternatively, the path has the option of running along the boundary of the infinite cluster (at least part of the time), during which it sees AB-blocks and (when β ≥ 0) gains more energy by matching more than half of its monomers. Consequently,
\[ f(\alpha, \beta; p) \ge \tfrac12 \alpha + \kappa. \tag{1.3.5} \]
The boundary between the two regimes in (1.3.1) corresponds to the crossover from full delocalization into the A-blocks to partial localization near the AB-interfaces. The critical curve does not depend on p as long as $p > p_c$. Because $p \mapsto f(\alpha, \beta; p)$ is continuous (see Theorem 2.1.1(iii) in Sect. 2.1), the same critical curve occurs at $p = p_c$.
The proof of Theorem 1.3.2 relies on a representation of D and L in terms of the single interface (!) free energy (see Proposition 2.3.4 in Sect. 2.3). This representation, which is key to the analysis of the critical curve, expresses the fact that localization occurs for the emulsion free energy only when the single interface free energy is sufficiently deep inside its localized phase. This gap is needed to compensate for the loss of entropy associated with running along the interface and crossing at a steeper angle.

The intuition behind Theorem 1.3.2 is as follows (see Fig. 4). Pick a point (α, β) inside D. Then the copolymer spends almost all of its time deep inside the A-blocks. Increase β while keeping α fixed. Then there will be a larger energetic advantage for the copolymer to move some of its monomers from the A-blocks to the B-blocks by crossing the interface inside the AB-block pairs. There is some entropy loss associated with doing so, but if β is large enough, then the energetic advantage will dominate, so that AB-localization sets in. The value at which this happens depends on α and is strictly positive. Since the entropy loss is finite, for α large enough the energy-entropy competition plays out not only below the diagonal, but also below a horizontal asymptote. On the other hand, for α small enough the loss of entropy dominates the energetic advantage, which is why the critical curve has a piece that lies on the diagonal. The larger the value of α the larger the value of β where AB-localization sets in. This explains why the critical curve is non-decreasing. At the critical curve the single interface free energy is already inside its localized phase. This explains why the critical curve has a slope discontinuity at α*.

1.4. Main results. In the present paper we prove three theorems, which complete the analysis of the phase diagram in Fig. 4.

Theorem 1.4.1. Let p ≥ pc. Then α → βc(α) is strictly increasing on [0, ∞).

Theorem 1.4.2. Let p ≥ pc. Then for every α ∈ (α*, ∞) there exist 0 < C1 < C2 < ∞ and δ0 > 0 (depending on p and α) such that
\[
C_1\delta^2 \le f(\alpha,\beta_c(\alpha)+\delta;p) - f(\alpha,\beta_c(\alpha);p) \le C_2\delta^2 \qquad \forall\,\delta\in(0,\delta_0]. \tag{1.4.1}
\]

Fig. 4. Qualitative picture of α → βc(α) for p ≥ pc
Localized Phase of a Copolymer in an Emulsion
Theorem 1.4.3. Let p ≥ pc. Then, under Assumption 5.2.2, (α, β) → f(α, β; p) is infinitely differentiable throughout L.

Assumption 5.2.2 states that a certain intermediate single-interface free energy has a finite curvature. We believe this assumption to be true, but have not managed to prove it. See the end of Sect. 5.2, in particular, Remark 5.3.3, for a motivation and for a way to weaken it.

Theorem 1.4.1 implies that the critical curve never reaches the horizontal asymptote, which in turn implies that α* < β* and that the slope in (1.3.4) is > 0. Theorem 1.4.2 shows that the phase transition is second order off the diagonal. (In contrast, we know that the phase transition is first order on the diagonal. Indeed, the free energy equals ½α + ½ log 5 on and below the diagonal segment between (0, 0) and (α*, α*), and equals ½β + ½ log 5 on and above this segment, as is evident from interchanging α and β.) Theorem 1.4.3 tells us that the critical curve is the only location in CONE where a phase transition of finite order occurs.

Theorems 1.4.1, 1.4.2 and 1.4.3 are proved in Sects. 3, 4 and 5, respectively. Their proofs rely on perturbation arguments, in combination with exponential tightness of the excursions away from the interface inside the localized phase.

The analogues of Theorems 1.4.2 and 1.4.3 for the single flat infinite interface were derived in Giacomin and Toninelli [5,6]. For that model the phase transition is shown to be at least of second order, i.e., only the quadratic upper bound is proved. Numerical simulation indicates that the transition may well be of higher order. The mechanisms behind the phase transition in the two models are different.
While for the single interface model the copolymer makes long excursions away from the interface and dips below the interface during a fraction of time that is at most of order δ², in our emulsion model the copolymer runs along the interface during a fraction of time that is of order δ, and in doing so stays close to the interface. Moreover, because near the critical curve for the emulsion model the single interface model is already inside its localized phase, there is a variation of order δ in the single interface free energy. Thus, the δ² in the emulsion model is the product of two factors δ, one coming from the time spent running along the interface and one coming from the variation of the constituent single interface free energy away from its critical curve. See Sect. 4 for more details.

In the proof of Theorem 1.4.3 we use some of the ingredients of the proof in Giacomin and Toninelli [6] of the analogous result for the single interface model. However, in the emulsion model there is an extra complication, namely, the speed per step to move one unit of space forward may vary (because steps are up, down and to the right), while in the single interface model this is fixed at one (because steps are up-right and down-right). We need to control the infinite differentiability with respect to this speed variable. This is done by considering the Fenchel-Legendre transform of the free energy, in which the dual of the speed variable enters into the Hamiltonian rather than in the set of paths. Moreover, since the block pair free energies and the total free energy are both given by variational problems, we need to show uniqueness of maximisers and prove non-degeneracy of the Jacobian matrix at these maximisers in order to be able to apply implicit function theorems. See Sect. 5 for more details.

1.5. Discussion. The corner restriction imposed through the set $W_{n,L_n}$ in Sect. 1.2 is unphysical.
However, without this restriction the model would be very hard to analyze, and would have a degree of difficulty comparable to that of the directed polymer in random environment, for which no detailed phase diagram has yet been derived. If the copolymer is allowed to exit a pair of blocks also at the corner to the right of the
entrance corner, then this adds an extra critical curve to the phase diagram, namely, the critical curve of the single linear interface. Our critical curve still persists, because the copolymer has to cross AB-blocks diagonally every now and then in order to reach the most favorable block environment. The order of the phase transition at our critical curve is unaffected. The order of the extra critical curve would be the same as for the single linear interface, i.e., second order or higher.

2. Preparations

In Sects. 2.1–2.3 we recall a few key facts from den Hollander and Whittington [7] that will be crucial for the proofs. Section 2.1 gives the variational formula for the free energy, Sect. 2.2 states two elementary lemmas about path entropies, while Sect. 2.3 states two lemmas for the block pair free energies and a proposition characterising the localized phase of the emulsion free energy in terms of the single interface free energy. Section 2.4 states a lemma about the tail behaviour of the single interface free energy and the block pair free energies, showing that long paths wash out the effect of entropy.

2.1. Variational formula for the free energy. To formulate the key variational formula for the free energy that serves as our starting point, we need three ingredients.

I. For L ∈ N and a ≥ 2 (with aL integer), let $W_{aL,L}$ denote the set of aL-step directed self-avoiding paths starting at (0, 0), ending at (L, L), and in between not leaving the two adjacent blocks of size L labelled (0, 0) and (−1, 0) (see Fig. 5). For k, l ∈ {A, B}, let
\[
\psi^{\omega}_{kl}(aL,L) = \frac{1}{aL}\log Z^{\omega}_{aL,L}, \qquad
Z^{\omega}_{aL,L} = \sum_{\pi\in W_{aL,L}} \exp\big[-H^{\omega,\Omega}_{aL,L}(\pi)\big]
\quad\text{when } \Omega(0,0)=k \text{ and } \Omega(0,-1)=l, \tag{2.1.1}
\]
denote the free energy per step in a kl-block when the number of steps inside the block is a times the size of the block. Let
\[
\lim_{L\to\infty}\psi^{\omega}_{kl}(aL,L) = \psi_{kl}(a) = \psi_{kl}(\alpha,\beta;a). \tag{2.1.2}
\]
Note here that k labels the type of the block that is diagonally crossed, while l labels the type of the block that appears as its neighbour at the starting corner (see Fig. 5). We will recall in Sect. 2.3 that the limit exists ω-a.s. and in mean, and is non-random. Both $\psi_{AA}$ and $\psi_{BB}$ take on a simple form, whereas $\psi_{AB}$ and $\psi_{BA}$ do not.

II. Let W denote the class of all coarse-grained paths $\Pi = \{\Pi_j : j \in N\}$ that step diagonally from corner to corner (see Fig. 4, where each dashed line with arrow denotes a single step of Π). For n ∈ N, Π ∈ W and k, l ∈ {A, B}, let
\[
\rho^{\Omega}_{kl}(\Pi,n) = \frac{1}{n}\sum_{j=1}^{n} 1\Big\{(\Pi_{j-1},\Pi_j) \text{ diagonally crosses a } k\text{-block in } \Omega \text{ that has an } l\text{-block in } \Omega \text{ appearing as its neighbour at the starting corner}\Big\}. \tag{2.1.3}
\]
Fig. 5. Two neighbouring blocks. The dashed line with arrow indicates that the coarse-grained path makes a step diagonally upwards. The path enters at (0, 0), exits at (L , L), and in between stays confined to the two blocks
Abbreviate
\[
\rho^{\Omega}(\Pi,n) = \big(\rho^{\Omega}_{kl}(\Pi,n)\big)_{k,l\in\{A,B\}}, \tag{2.1.4}
\]
which is a 2 × 2 matrix with non-negative elements that sum up to 1. Let $R^{\Omega}(\Pi)$ denote the set of all limit points of the sequence $\{\rho^{\Omega}(\Pi,n) : n \in N\}$, and put
\[
R^{\Omega} = \text{the closure of the set } \bigcup_{\Pi\in W} R^{\Omega}(\Pi). \tag{2.1.5}
\]
Clearly, $R^{\Omega}$ exists for all Ω. Moreover, since Ω has a trivial sigma-field at infinity (i.e., all events not depending on finitely many coordinates of Ω have probability 0 or 1) and $R^{\Omega}$ is measurable with respect to this sigma-field, we have
\[
R^{\Omega} = R(p) \qquad \Omega\text{-a.s.} \tag{2.1.6}
\]
for some non-random closed set R(p). This set, which depends on the parameter p controlling Ω, is the set of all possible limit points of the frequencies at which the four pairs of adjacent blocks can be seen along an infinite coarse-grained path. The elements of R(p) are matrices
\[
\begin{pmatrix} \rho_{AA} & \rho_{AB} \\ \rho_{BA} & \rho_{BB} \end{pmatrix} \tag{2.1.7}
\]
whose elements are non-negative and sum up to 1. In [7], Proposition 3.2.1, it was shown that p → R(p) is continuous in the Hausdorff metric and that, for p ≥ pc, R(p) contains matrices of the form
\[
M_{\gamma} = \begin{pmatrix} 1-\gamma & \gamma \\ 0 & 0 \end{pmatrix} \qquad \text{for } \gamma \in C \subset (0,1) \text{ closed}. \tag{2.1.8}
\]

III. Let A be the set of 2 × 2 matrices whose elements are ≥ 2. The elements of these matrices are used to record the average number of steps made by the path inside the four block pairs divided by the block size.

With I–III in hand, we can state the variational formula for the free energy. Define
\[
V : \big((\rho_{kl}),(a_{kl})\big) \in R(p)\times A \;\mapsto\; \frac{\sum_{kl}\rho_{kl}\,a_{kl}\,\psi_{kl}(a_{kl})}{\sum_{kl}\rho_{kl}\,a_{kl}}. \tag{2.1.9}
\]
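The quotient in (2.1.9) is a weighted average of the block pair free energies, with weights $\rho_{kl}a_{kl}$. The following minimal sketch evaluates it at a matrix of the form (2.1.8), under loudly stated assumptions: the closed form for κ(a, 1) is the one recorded in Lemma 2.2.1(iii), the AA- and BB-block free energies are those of (2.3.1), and, purely for illustration, the AB- and BA-block free energies are replaced by their interface-free counterparts. The names `kappa_diag`, `psi_straight`, `V` and the numerical values of α, β, γ are hypothetical.

```python
import math

def kappa_diag(a: float) -> float:
    """kappa(a, 1) from the closed form in Lemma 2.2.1(iii); requires a >= 2."""
    tail = 0.0 if a == 2.0 else (a - 2.0) * math.log(a - 2.0)
    return (math.log(2.0) + 0.5 * (a * math.log(a) - tail)) / a

def psi_straight(energy: float, a: float) -> float:
    """Block free energy of the form (2.3.1): half the energy parameter plus kappa(a, 1)."""
    return 0.5 * energy + kappa_diag(a)

def V(rho: dict, a: dict, psi: dict) -> float:
    """Quotient in (2.1.9): average of psi_kl(a_kl) with weights rho_kl * a_kl."""
    num = sum(rho[kl] * a[kl] * psi[kl](a[kl]) for kl in rho)
    den = sum(rho[kl] * a[kl] for kl in rho)
    return num / den

# Hypothetical parameter values, chosen only for the illustration.
alpha, beta, gamma = 1.0, 0.3, 0.4
rho = {"AA": 1.0 - gamma, "AB": gamma, "BA": 0.0, "BB": 0.0}  # M_gamma of (2.1.8)
a = {kl: 2.5 for kl in rho}                                   # a* = 5/2 in every block pair
psi = {
    "AA": lambda x: psi_straight(alpha, x),
    "AB": lambda x: psi_straight(alpha, x),  # assumption: the interface is ignored
    "BA": lambda x: psi_straight(beta, x),   # assumption: the interface is ignored
    "BB": lambda x: psi_straight(beta, x),
}
value = V(rho, a, psi)
```

Under these assumptions `value` coincides with ½α + ½ log 5, the free energy per step of a copolymer that only crosses A-blocks diagonally.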
Theorem 2.1.1. ([7], Theorem 1.3.1).
(i) For all (α, β) ∈ R² and p ∈ (0, 1),
\[
\lim_{n\to\infty} f^{\omega,\Omega}_{n,L_n} = f = f(\alpha,\beta;p) \tag{2.1.10}
\]
exists ω,Ω-a.s. and in mean, is finite and non-random, and is given by
\[
f = \sup_{(\rho_{kl})\in R(p)}\ \sup_{(a_{kl})\in A} V\big((\rho_{kl}),(a_{kl})\big). \tag{2.1.11}
\]
(ii) (α, β) → f(α, β; p) is convex on R² for all p ∈ (0, 1).
(iii) p → f(α, β; p) is continuous on (0, 1) for all (α, β) ∈ R².
(iv) For all (α, β) ∈ R² and p ∈ (0, 1),
\[
f(\alpha,\beta;p) = f(\beta,\alpha;1-p), \qquad
f(\alpha,\beta;p) = \tfrac12(\alpha+\beta) + f(-\beta,-\alpha;p). \tag{2.1.12}
\]

Part (iv) is the reason why without loss of generality we may restrict the parameters to the cone in (1.2.5). The behaviour of f as a function of (α, β) is different for p ≥ pc and p < pc (recall that pc is the critical probability for directed bond percolation on the square lattice). The reason is that the coarse-grained paths Π, which determine the set R(p), sample Ω just like paths in directed bond percolation on the square lattice rotated by 45 degrees sample the percolation configuration (see Fig. 6).
2.2. Path entropies. The two lemmas in this section identify the path entropies associated with crossing a block and running along an interface. They are based on straightforward computations and are crucial for the analysis of the model.
Fig. 6. Π sampling Ω. The dashed lines with arrows indicate the steps of Π. The block pairs encountered in this example are BB, AA, BA and AB
Let
\[
\mathrm{DOM} = \{(a,b) : a \ge 1+b,\ b \ge 0\}. \tag{2.2.1}
\]
For (a, b) ∈ DOM, let $N_L(a,b)$ denote the number of aL-step self-avoiding directed paths from (0, 0) to (bL, L) whose vertical displacement stays within (−L, L] (aL and bL are integer). Let
\[
\kappa(a,b) = \lim_{L\to\infty}\frac{1}{aL}\log N_L(a,b). \tag{2.2.2}
\]

Lemma 2.2.1. ([7], Lemma 2.1.1).
(i) κ(a, b) exists and is finite for all (a, b) ∈ DOM.
(ii) (a, b) → aκ(a, b) is continuous and strictly concave on DOM and analytic on the interior of DOM.
(iii) For all a ≥ 2,
\[
a\kappa(a,1) = \log 2 + \tfrac12\big[a\log a - (a-2)\log(a-2)\big]. \tag{2.2.3}
\]
(iv) $\sup_{a\ge 2}\kappa(a,1) = \kappa(a^*,1) = \tfrac12\log 5$ with unique maximiser $a^* = \tfrac52$.
(v) $(\tfrac{\partial}{\partial a}\kappa)(a^*,1) = 0$ and $a^*(\tfrac{\partial}{\partial b}\kappa)(a^*,1) = \tfrac12\log\tfrac95$.
(vi) $(\tfrac{\partial^2}{\partial a^2}\kappa)(a^*,1) = -\tfrac{8}{25}$, $(\tfrac{\partial^2}{\partial b^2}\kappa)(a^*,1) = -\tfrac{262}{225}$ and $(\tfrac{\partial^2}{\partial a\,\partial b}\kappa)(a^*,1) = -\tfrac{2}{25}\log\tfrac95 + \tfrac{44}{75}$.
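The a-dependent statements of Lemma 2.2.1 can be checked numerically from the closed form (2.2.3) alone. The sketch below is only a sanity check, not part of the proof: it evaluates κ(a, 1), locates the maximiser on a grid, and approximates the second a-derivative at a* = 5/2 by a central difference. The function names, the grid and the step size h are hypothetical choices; the b-derivatives in (v) and (vi) are not covered because (2.2.3) fixes b = 1.

```python
import math

A_STAR = 2.5  # maximiser a* = 5/2 claimed in Lemma 2.2.1(iv)

def a_kappa(a: float) -> float:
    """a * kappa(a, 1) as given by (2.2.3); requires a >= 2."""
    tail = 0.0 if a == 2.0 else (a - 2.0) * math.log(a - 2.0)
    return math.log(2.0) + 0.5 * (a * math.log(a) - tail)

def kappa(a: float) -> float:
    """kappa(a, 1) for a >= 2."""
    return a_kappa(a) / a

def second_derivative(f, x: float, h: float = 1e-4) -> float:
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Grid search for the maximiser on [2, 10] with mesh 0.001.
grid = [2.0 + 0.001 * i for i in range(8001)]
a_max = max(grid, key=kappa)

kappa_at_star = kappa(A_STAR)                        # compare with (1/2) log 5
curvature_at_star = second_derivative(kappa, A_STAR)  # compare with -8/25
```

The same closed form also shows the decay of κ(a, 1) for large a: evaluating `kappa` at increasingly large arguments gives values of order (log a)/a.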
Part (vi), which was not stated in [7], follows from a direct computation via [7], Eqs. (2.1.5), (2.1.8) and (2.1.9).

For µ ≥ 1, let $\hat N_L(\mu)$ denote the number of µL-step self-avoiding paths from (0, 0) to (L, 0) with no restriction on the vertical displacement (µL is integer). Let
\[
\hat\kappa(\mu) = \lim_{L\to\infty}\frac{1}{\mu L}\log \hat N_L(\mu). \tag{2.2.4}
\]

Lemma 2.2.2. ([7], Lemma 2.1.2).
(i) $\hat\kappa(\mu)$ exists and is finite for all µ ≥ 1.
(ii) $\mu \to \mu\hat\kappa(\mu)$ is continuous and strictly concave on [1, ∞) and analytic on (1, ∞).
(iii) $\hat\kappa(1) = 0$ and $\mu\hat\kappa(\mu) \sim \log\mu$ as µ → ∞.
(iv) $\sup_{\mu\ge 1}\mu\big[\hat\kappa(\mu) - \tfrac12\log 5\big] < \tfrac12\log\tfrac95$.
2.3. Free energies per pair of blocks. In this section we identify the block pair free energies. In [7], Proposition 2.2.1, we showed that ω-a.s. and in mean,
\[
\psi_{AA}(a) = \tfrac12\alpha + \kappa(a,1) \qquad\text{and}\qquad \psi_{BB}(a) = \tfrac12\beta + \kappa(a,1). \tag{2.3.1}
\]
Both are easy expressions, because A A-blocks and B B-blocks have no interface. To compute ψ AB (a) and ψ B A (a), we first consider the free energy per step when the path moves in the vicinity of a single linear interface I separating a liquid A in the upper halfplane from a liquid B in the lower halfplane including the interface itself. To
that end, for c ≥ b > 0, let $W_{cL,bL}$ denote the set of cL-step directed self-avoiding paths starting at (0, 0) and ending at (bL, 0). Define
\[
\psi^{\omega,I}_{L}(c,b) = \frac{1}{cL}\log Z^{\omega,I}_{cL,bL} \tag{2.3.2}
\]
with
\[
Z^{\omega,I}_{cL,bL} = \sum_{\pi\in W_{cL,bL}} \exp\big[-H^{\omega,I}_{cL}(\pi)\big], \qquad
H^{\omega,I}_{cL}(\pi) = -\sum_{i=1}^{cL}\Big(\alpha\,1\{\omega_i = A,\ (\pi_{i-1},\pi_i) > 0\} + \beta\,1\{\omega_i = B,\ (\pi_{i-1},\pi_i) \le 0\}\Big), \tag{2.3.3}
\]
where $(\pi_{i-1},\pi_i) > 0$ means that the i-th step lies in the upper halfplane and $(\pi_{i-1},\pi_i) \le 0$ means that the i-th step lies in the lower halfplane or in the interface (see Fig. 7). For a ∈ [2, ∞), let
\[
\mathrm{DOM}(a) = \{(c,b)\in R^2 : 0 \le b \le 1,\ c \ge b,\ a-c \ge 2-b\}. \tag{2.3.4}
\]
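For very short chains the partition sum in (2.3.3) can be evaluated by brute force, which may help to fix the conventions. The sketch below rests on assumptions that are not spelled out at this point of the text: paths are words over the steps up, down and right (as mentioned in Sect. 1.4), self-avoidance is enforced by forbidding an up-step adjacent to a down-step, and a step is counted as lying in the upper halfplane when its midpoint is strictly above the interface. All function names are hypothetical, and the enumeration is exponential in the number of steps, so it is meant for toy sizes only.

```python
import math
from itertools import product

def directed_paths(n_steps: int, n_right: int):
    """Words over {U, D, R} with n_right R's that return to height 0 and never reverse."""
    for word in product("UDR", repeat=n_steps):
        if word.count("R") != n_right or word.count("U") != word.count("D"):
            continue
        if any({x, y} == {"U", "D"} for x, y in zip(word, word[1:])):
            continue  # an immediate reversal would revisit a site
        yield word

def hamiltonian(word, omega: str, alpha: float, beta: float) -> float:
    """H from (2.3.3): A-monomers are rewarded above the interface, B-monomers on or below it."""
    height, h = 0, 0.0
    for step, label in zip(word, omega):
        new_height = height + {"U": 1, "D": -1, "R": 0}[step]
        above = height + new_height > 0  # midpoint of the step strictly above the interface
        if label == "A" and above:
            h -= alpha
        elif label == "B" and not above:
            h -= beta
        height = new_height
    return h

def partition_function(omega: str, n_right: int, alpha: float, beta: float) -> float:
    """Z of (2.3.3) for the monomer sequence omega and n_right horizontal steps."""
    return sum(math.exp(-hamiltonian(w, omega, alpha, beta))
               for w in directed_paths(len(omega), n_right))

def free_energy_per_step(omega: str, n_right: int, alpha: float, beta: float) -> float:
    """Finite-size analogue of (2.3.2)."""
    return math.log(partition_function(omega, n_right, alpha, beta)) / len(omega)
```

For instance, with three monomers and one horizontal step only the two words up-right-down and down-right-up survive, so for the sequence `"AAA"` the partition sum reduces to $e^{3\alpha} + 1$.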
Lemma 2.3.1. ([7], Lemma 2.2.1). For all (α, β) ∈ R² and c ≥ b > 0,
\[
\lim_{L\to\infty}\psi^{\omega,I}_{L}(c,b) = \phi^{I}(c/b) = \phi^{I}(\alpha,\beta;c/b) \tag{2.3.5}
\]
exists ω-a.s. and in mean, and is non-random.

Lemma 2.3.2. ([7], Lemma 2.2.2). For all (α, β) ∈ R² and a ≥ 2,
\[
a\psi_{AB}(a) = a\psi_{AB}(\alpha,\beta;a) = \sup_{(c,b)\in \mathrm{DOM}(a)}\Big\{c\,\phi^{I}(c/b) + (a-c)\big[\tfrac12\alpha + \kappa(a-c,1-b)\big]\Big\}. \tag{2.3.6}
\]
Lemma 2.3.3. ([7], Lemma 2.2.3). Let k, l ∈ {A, B}. (i) For all (α, β) ∈ R2 , a → aψkl (α, β; a) is continuous and concave on [2, ∞). (ii) For all a ∈ [2, ∞), α → ψkl (α, β; a) and β → ψkl (α, β; a) are continuous and non-decreasing on R.
Fig. 7. Illustration of (2.3.2–2.3.3) for c = µ and b = 1
The idea behind Lemma 2.3.2 is that the copolymer follows the AB-interface over a distance bL during cL steps and then wanders away from the AB-interface to the diagonally opposite corner over a distance (1 − b)L during (a − c)L steps. The optimal strategy is obtained by maximising over b and c (see Fig. 8). A similar expression holds for $\psi_{BA}$.

The key result behind the analysis of the critical curve in Fig. 4 is the following proposition, whose proof relies on Lemmas 2.3.1–2.3.3.

Proposition 2.3.4. ([7], Proposition 2.3.1) Let p ≥ pc. Then (α, β) ∈ L if and only if
\[
\sup_{\mu\ge 1}\mu\big[\phi^{I}(\alpha,\beta;\mu) - \tfrac12\alpha - \tfrac12\log 5\big] > \tfrac12\log\tfrac95. \tag{2.3.7}
\]

Note that ½α + ½ log 5 is the free energy per step when the copolymer diagonally crosses an A-block. What Proposition 2.3.4 says is that for the copolymer in the emulsion to localize, the excess free energy of the copolymer along the interface must be sufficiently large to compensate for the loss of entropy of the copolymer coming from the fact that it must diagonally cross the block at a steeper angle (see Fig. 8). We have
\[
\phi^{I}(\mu) \ge \tfrac12\alpha + \hat\kappa(\mu)\quad \forall\,\mu > 1, \qquad
\phi^{I}(\mu) \le \alpha + \hat\kappa(\mu)\quad \forall\,\mu \ge 1, \tag{2.3.8}
\]
where $\hat\kappa(\mu)$ is the entropy defined in (2.2.4). The upper bound and the gap in Lemma 2.2.2(iv) are responsible for the linear piece of the critical curve in Fig. 4. In analogy with Lemma 2.2.2, we further note that, for all (α, β) ∈ R², $\mu \to \mu\phi^{I}(\mu)$ is finite and concave on [1, ∞), and hence is continuous on (1, ∞). In the definition of $\phi^{I}$ the interface belongs to solvent B (see (2.3.3)), so that $\phi^{I}(1) = \tfrac12\beta$. Finally, by mimicking the proof of Lemma 2.4.1(i) below, we can show that $\lim_{\mu\downarrow 1}\phi^{I}(\mu) = \tfrac12\alpha$.

2.4. Tail behaviour of free energies for long paths. In this section we show that long paths wash out the effect of entropy. This will be needed later for compactification arguments.
Fig. 8. Two possible strategies inside an AB-block: The path can either move straight across or move along the interface for a while and then move across. Both strategies correspond to a coarse-grained step diagonally upwards as in Fig. 6
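Let $P^{\omega,I}_{\mu L}$ denote the law of the copolymer of length µL in the single interface model with the energy shifted by −α/2, i.e.,
\[
P^{\omega,I}_{\mu L}(\pi) = \frac{1}{Z^{\omega,I}_{\mu L,L}}\exp\big[-H^{\omega,I}_{\mu L}(\pi)\big], \qquad \pi\in W_{\mu L,L}, \tag{2.4.1}
\]
with
\[
H^{\omega,I}_{\mu L}(\pi) = -\sum_{i=1}^{\mu L}\big(-\alpha\,1\{\omega_i = A\} + \beta\,1\{\omega_i = B\}\big)\,1\{(\pi_{i-1},\pi_i)\le 0\}. \tag{2.4.2}
\]
Let
\[
\phi^{I}(\mu) = \phi^{I}(\alpha,\beta;\mu) = \lim_{L\to\infty}\phi^{\omega,I}_{\mu L}\quad \omega\text{-a.s.}
\qquad\text{with}\qquad
\phi^{\omega,I}_{\mu L} = \phi^{\omega,I}_{\mu L}(\alpha,\beta) = \frac{1}{\mu L}\log Z^{\omega,I}_{\mu L,L} \tag{2.4.3}
\]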
(compare with (2.3.3)). Henceforth we adopt this shift, but we retain the same notation. The reader must keep this in mind throughout the sequel!

Lemma 2.4.1. For any β0 > 0,
(i) $\lim_{\mu\to\infty}\phi^{I}(\alpha,\beta;\mu) = 0$,
(ii) $\lim_{a\to\infty}\psi_{AB}(\alpha,\beta;a) = 0$,
uniformly in α ≥ β and β ≤ β0.

Proof. (i) Recall the definition of $W_{\mu L,L}$ in Sect. 2.3. Abbreviate $\chi_i = 1\{\omega_i = B\} - 1\{\omega_i = A\}$. Because α ≥ β and β ≤ β0, we have
\[
\begin{aligned}
\phi^{I}(\alpha,\beta;\mu) &\le \lim_{L\to\infty}\frac{1}{\mu L}\log\sum_{\pi\in W_{\mu L,L}}\exp\Big[\beta\sum_{i=1}^{\mu L}\chi_i\,1\{(\pi_{i-1},\pi_i)\le 0\}\Big]\\
&\le \hat\kappa(\mu) + \beta_0\limsup_{L\to\infty}\frac{1}{\mu L}\max_{\pi\in W_{\mu L,L}}\Big\{\sum_{i=1}^{\mu L}\chi_i\,1\{(\pi_{i-1},\pi_i)\le 0\}\Big\}.
\end{aligned}
\tag{2.4.4}
\]
We know from Lemma 2.2.2(iii) that $\lim_{\mu\to\infty}\hat\kappa(\mu) = 0$. Therefore it suffices to show that for every ε > 0 there exists a µ0(ε) ≥ 2 such that
\[
\limsup_{L\to\infty}\frac{1}{\mu L}\max_{\pi\in W_{\mu L,L}}\Big\{\sum_{i=1}^{\mu L}\chi_i\,1\{(\pi_{i-1},\pi_i)\le 0\}\Big\}\le\varepsilon\quad\omega\text{-a.s.}\qquad\forall\,\mu\ge\mu_0(\varepsilon). \tag{2.4.5}
\]
The random variables $\chi_i$ are i.i.d. ±1 with probability ½. Let $I_j$ be the set of indices i in the jth excursion of π on or below the interface. Then $\sum_{i=1}^{\mu L}\chi_i\,1\{(\pi_{i-1},\pi_i)\le 0\} = \sum_j\sum_{i\in I_j}\chi_i$. Let $F_{\mu,L}$ denote the family of all possible sequences $I = (I_j)$ as
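π runs over the set $W_{\mu L,L}$, and write $|I| = \sum_j |I_j|$. For 0 < ε ≤ 1, consider the quantity
\[
p_{\mu,L,\varepsilon} = P\Big(\exists\, I\in F_{\mu,L} : \sum_j\sum_{i\in I_j}\chi_i \ge \varepsilon\mu L\Big), \tag{2.4.6}
\]
where P denotes the probability law of ω. By the exponential Markov inequality, there exists a C > 0 such that
\[
P\Big(\sum_{i=1}^{N}\chi_i \ge \varepsilon R N\Big) \le e^{-C\varepsilon^2 R N}\qquad \forall\, N, R\ge 1,\ \forall\, 0<\varepsilon\le 1. \tag{2.4.7}
\]
Since |I| ≤ µL for all $I\in F_{\mu,L}$, we can apply (2.4.7) with N = |I| and R = µL/|I| to estimate
\[
p_{\mu,L,\varepsilon} \le \sum_{I\in F_{\mu,L}} P\Big(\sum_j\sum_{i\in I_j}\chi_i \ge \varepsilon\,\frac{\mu L}{|I|}\,|I|\Big) \le |F_{\mu,L}|\,e^{-C\varepsilon^2\mu L}. \tag{2.4.8}
\]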
Since
\[
|F_{\mu,L}| \le \binom{\mu L}{L}^{2} = \exp\big[C(\mu)L + o(L)\big]\qquad\text{as } L\to\infty, \tag{2.4.9}
\]
with C(µ) ∼ log µ as µ → ∞, there exists a C′ > 0 such that, for µ ≥ 2 and L large enough, $|F_{\mu,L}| \le \exp[LC'\log\mu]$ and hence $p_{\mu,L,\varepsilon} \le \exp[L(C'\log\mu - C\varepsilon^2\mu)]$. Thus, there exists a µ0(ε) ≥ 2 such that for µ ≥ µ0(ε),
\[
\sum_{L=1}^{\infty} p_{\mu,L,\varepsilon} < \infty. \tag{2.4.10}
\]
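The bound (2.4.7) can be compared with the exact binomial tail for small N. The sketch below is only an illustration: the text asserts the existence of some C > 0, and the value C = 1/2 used here is an assumption, taken from Hoeffding's inequality for fair ±1 signs. The function names and the parameter grid are hypothetical.

```python
import math
from fractions import Fraction

def upper_tail(n: int, threshold: Fraction) -> float:
    """Exact P(chi_1 + ... + chi_n >= threshold) for i.i.d. fair signs chi_i in {-1, +1}."""
    # With k plus-signs the sum equals 2k - n, so we need k >= (n + threshold) / 2.
    k_min = max(0, math.ceil((n + threshold) / 2))
    favourable = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n

def hoeffding_bound(n: int, r: int, eps: Fraction, c: float = 0.5) -> float:
    """Right-hand side of (2.4.7) with the assumed constant C = c."""
    return math.exp(-c * float(eps) ** 2 * r * n)

def bound_holds_on_grid() -> bool:
    """Compare the exact tail with the bound for a few values of N, R and epsilon."""
    for n in (1, 5, 20, 100):
        for r in (1, 2, 5):
            for eps in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
                if upper_tail(n, eps * r * n) > hoeffding_bound(n, r, eps):
                    return False
    return True
```

Exact rational thresholds are used so that the cut-off index `k_min` is not affected by floating-point rounding. When εR > 1 the event is empty and the tail is zero.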
The Borel-Cantelli lemma now allows us to assert that, ω-a.s. for µ ≥ µ0(ε) and L large enough, the inequality $\sum_j\sum_{i\in I_j}\chi_i \le \varepsilon\mu L$ holds uniformly in $I\in F_{\mu,L}$. Hence (2.4.5) is true indeed.

(ii) This follows from a similar argument. The counterpart of Eq. (2.4.4) is (recall (2.2.1–2.2.2))
\[
\psi_{AB}(\alpha,\beta;a) \le \kappa(a,1) + \beta_0\limsup_{L\to\infty}\frac{1}{aL}\max_{\pi\in N_L(a,1)}\Big\{\sum_{i=1}^{aL}\chi_i\,1\{(\pi_{i-1},\pi_i)\le 0\}\Big\}. \tag{2.4.11}
\]
Lemma 2.2.1(iii) implies that κ(a, 1) → 0 as a → ∞, while the proof that ω-a.s. the second term in the r.h.s. of (2.4.11) tends to 0 is the same as in (i).

3. Proof of Theorem 1.4.1

In Sect. 3.1 we derive a proposition stating that the excursions away from the interface are exponentially tight in the localized phase. In Sect. 3.2 we use this proposition to prove Theorem 1.4.1.
3.1. Tightness of excursions. We will call the triple (α, β, µ) ∈ CONE × [1, ∞) weakly localized if (recall Proposition 2.3.4 and (2.4.1–2.4.3)) α ∈ (α*, ∞) and
\[
\sup_{\nu\ge 1}\nu\big[\phi^{I}(\alpha,\beta;\nu) - \varrho\big] = \mu\big[\phi^{I}(\alpha,\beta;\mu) - \varrho\big] \ge \varsigma \tag{3.1.1}
\]
with
\[
\varrho = \tfrac12\log 5 \qquad\text{and}\qquad \varsigma = \tfrac12\log\tfrac95. \tag{3.1.2}
\]
Let $l_{\mu L}$ denote the number of strictly positive excursions in $\pi\in W_{\mu L,L}$. For $k = 1,\dots,l_{\mu L}$, let $\tau_k$ denote the length of the kth such excursion in π.

Proposition 3.1.1. Let (α, β, µ) be a weakly localized triple. Then for every C > 0 there exists an M0 = M0(C) such that for M ≥ M0,
\[
\lim_{L\to\infty} E\Big(P^{\omega,I}_{\mu L}\Big(\sum_{k=1}^{l_{\mu L}}\tau_k\,1\{\tau_k\ge M\}\ge C\mu L\Big)\Big) = 0. \tag{3.1.3}
\]
Proof. Along the way we need the following concentration inequality for the free energy of the single interface. Let $\phi^{\omega,I}_{\mu L} = (1/\mu L)\log Z^{\omega,I}_{\mu L,L}$ (recall (2.3.3)).

Lemma 3.1.2. There exist C1, C2 > 0 such that for all ε > 0, (α, β, µ) ∈ CONE × [1, ∞) and L ∈ N,
\[
P\Big(\big|\phi^{\omega,I}_{\mu L}(\alpha,\beta) - E\big(\phi^{\omega,I}_{\mu L}(\alpha,\beta)\big)\big| \ge \varepsilon\Big) \le C_1\exp\big[-\varepsilon^2\mu L/C_2(\alpha+\beta)\big]. \tag{3.1.4}
\]
Proof. See Giacomin and Toninelli [6]. The argument for their single interface model readily extends to our single interface model.

Step 1. Throughout the proof, (α, β, µ) is a weakly localized triple and C ∈ (0, 1). Fix M. For $\pi\in W_{\mu L,L}$, we let $K_L = K_L(\pi) = \{k\in\{1,\dots,l_{\mu L}\} : \tau_k\ge M\}$. We also define
\[
\widetilde W_L = \Big\{\pi\in W_{\mu L,L} : \sum_{k\in K_L}\tau_k \ge C\mu L\Big\}, \tag{3.1.5}
\]
\[
Q_L = \{C\mu L,\dots,\mu L\}\times\{1,\dots,L\}\times\{1,\dots,\mu L/M\}. \tag{3.1.6}
\]
Note that $\widetilde W_L$ is the union of the events $(A_{s,r,t})_{(s,r,t)\in Q_L}$ with
\[
A_{s,r,t} = \Big\{\sum_{k\in K_L}\tau_k = s\Big\}\cap\Big\{\sum_{k\in K_L}\tau_k/\mu_k = r\Big\}\cap\{|K_L| = t\}, \tag{3.1.7}
\]
where $\mu_k$ is the number of steps divided by the number of horizontal steps in the kth strictly positive excursion. Let $v = (v^1_k, v^2_k)_{k\in K_L}$ denote the starting points and ending points of the successive positive excursions of length ≥ M. If $V_L$ denotes all
possible values of v, then $A_{s,r,t}$ is the union of the events $(A^v_{s,r,t})_{v\in V_L}$. We will estimate $E(P^{\omega,I}_{\mu L}(A^v_{s,r,t}))$.

Step 2. We want to bound from above the quantity
\[
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big) = E\Big(\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)}\,e^{-\mu L\phi^{\omega,I}_{\mu L}}\Big). \tag{3.1.8}
\]
To that end, we concatenate the excursions of π in $[v^2_{k-1}, v^1_k]$, k ∈ {1, …, t}, as follows. Since these excursions start and end at the interface, either with a horizontal step or with a vertical step up, we concatenate them by adding a strictly positive excursion of 3 steps between them. The latter has no effect on the Hamiltonian. We also concatenate the strictly positive excursions in $[v^1_k, v^2_k]$, k ∈ {1, …, t}, by adding 1 horizontal step between them. Thus, if we abbreviate $S_1 = \mu L - s + 3t$ and $S_2 = L - r + t$, and if we denote by $\omega_v$ the concatenation of the $\omega_i$ in $[v^2_{k-1}, v^1_k]$, k ∈ {1, …, t}, then we have
\[
\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)} \le \Big(\sum_{\pi\in W_{S_1,S_2}} e^{-H^{\omega_v,I}_{S_1}(\pi)}\Big)\,K(s+t,r+t), \tag{3.1.9}
\]
where K(a, b) is the number of strictly positive excursions of length a that make b horizontal steps. A standard superadditivity argument gives
\[
K(s+t,r+t) \le e^{(s+t)\hat\kappa\left(\frac{s+t}{r+t}\right)} \tag{3.1.10}
\]
with $\hat\kappa$ the entropy function defined in (2.2.4). Put $\hat\mu = S_1/S_2$. Then with (3.1.10) we can rewrite (3.1.9) as
\[
\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)} \le e^{S_1\phi^{\omega_v,I}_{\hat\mu S_2}}\,e^{(s+t)\hat\kappa\left(\frac{s+t}{r+t}\right)}. \tag{3.1.11}
\]
At this stage, two cases need to be distinguished. Fix η > 0.

Case S1 ≥ ηL. Let
\[
A_1 = \big\{\phi^{\omega,I}_{\mu L} \le E\big(\phi^{\omega,I}_{\mu L}\big) - \varepsilon\big\}, \qquad
A_2 = \big\{\phi^{\omega_v,I}_{\hat\mu S_2} \ge E\big(\phi^{\omega_v,I}_{\hat\mu S_2}\big) + \varepsilon\big\}. \tag{3.1.12}
\]
Since $\mu L \ge \hat\mu S_2 = S_1 \ge \eta L$, Lemma 3.1.2 gives the large deviation inequality
\[
\max\{P(A_1), P(A_2)\} \le C_1\exp\big[-\varepsilon^2\eta L/C_2(\alpha+\beta)\big]. \tag{3.1.13}
\]
By superadditivity, we have $E(\phi^{\omega_v,I}_{\hat\mu S_2}) \le \sup_{L\in N}E(\phi^{\omega_v,I}_{\hat\mu L}) = \phi^{I}(\hat\mu)$. Moreover, for L large enough, we have $E(\phi^{\omega,I}_{\mu L}) \ge \phi^{I}(\mu) - \varepsilon$. Hence, it follows from (3.1.11–3.1.13) that
\[
\begin{aligned}
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big)
&= E\Big(\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)}\,e^{-\mu L\phi^{\omega,I}_{\mu L}}\Big)\\
&\le P(A_1) + P(A_2) + E\Big(1_{A_1^c\cap A_2^c}\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)}\,e^{-\mu L\phi^{\omega,I}_{\mu L}}\Big)\\
&\le 2C_1 e^{-\varepsilon^2\eta L/C_2(\alpha+\beta)} + e^{S_1(\phi^{I}(\hat\mu)+\varepsilon)}\,e^{-\mu L(\phi^{I}(\mu)-2\varepsilon)}\,e^{(s+t)\hat\kappa\left(\frac{s+t}{r+t}\right)}.
\end{aligned}
\tag{3.1.14}
\]
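Case S1 ≤ ηL. Note that, for (α, β) ∈ CONE, the trivial inequality $\phi^{\omega,I}_{\mu L} \le \alpha + \hat\kappa(\mu)$ (compare with (2.3.8)) and Lemma 2.2.2(iii) are sufficient to assert that there exists an $R_\alpha > 0$ such that $\phi^{\omega,I}_{\mu L} \le R_\alpha$ for all µ ≥ 1, L ∈ N and ω. Therefore also $\phi^{\omega_v,I}_{\hat\mu S_2} \le R_\alpha$ for all $\hat\mu \ge 1$, $S_2\in N$ and $\omega_v$, and so it follows from (3.1.11–3.1.13) that
\[
\begin{aligned}
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big)
&= E\Big(\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)}\,e^{-\mu L\phi^{\omega,I}_{\mu L}}\Big)\\
&\le P(A_1) + E\Big(1_{A_1^c}\sum_{\pi\in A^v_{s,r,t}} e^{-H^{\omega,I}_{\mu L}(\pi)}\,e^{-\mu L\phi^{\omega,I}_{\mu L}}\Big)\\
&\le C_1 e^{-\varepsilon^2\mu L/C_2(\alpha+\beta)} + e^{S_1R_\alpha}\,e^{-\mu L(\phi^{I}(\mu)-2\varepsilon)}\,e^{(s+t)\hat\kappa\left(\frac{s+t}{r+t}\right)}.
\end{aligned}
\tag{3.1.15}
\]

Step 3. To bound the quantity $S_1\phi^{I}(\hat\mu) = S_1\phi^{I}(S_1/S_2)$ in (3.1.14), we define $x = s/\mu L$ and $\tilde\mu = s/r$. Then $S_1 = \mu L(1-x) + 3t$ and $S_2 = L(1 - x\mu/\tilde\mu) + t$. Since (α, β, µ) is a weakly localized triple (recall (3.1.1)), we have $S_1\phi^{I}(S_1/S_2) \le \mu S_2\phi^{I}(\mu) + \varrho(S_1 - \mu S_2)$, with ϱ given in (3.1.2). This can be further estimated by
\[
S_1\phi^{I}(S_1/S_2) \le \mu L\phi^{I}(\mu) - \varrho x\mu L + x\frac{\mu^2L}{\tilde\mu}\big[\varrho - \phi^{I}(\mu)\big] + t\big[\mu\phi^{I}(\mu) + (3-\mu)\varrho\big] \tag{3.1.16}
\]
\[
\le \mu L\phi^{I}(\mu) - \tfrac56\varrho x\mu L, \tag{3.1.17}
\]
where we use that $\varrho - \phi^{I}(\mu) \le 0$, t ≤ µL/M, and M is large enough (by assumption). Next, let µ0 be such that $\hat\kappa(\nu) \le \varrho/2$ for all ν ≥ µ0/2 (which is possible by Lemma 2.2.2(iii)).

Case $\tilde\mu \ge \mu_0$. Since s ≥ CµL and t ≤ µL/M, if $\tilde\mu \ge \mu_0$, then $(s+t)/(r+t) \ge \tilde\mu/(1+t/r) \ge \mu_0/2$. Since s + t ≤ xµL + µL/M, it follows from (3.1.17) that for M large enough,
\[
S_1\phi^{I}(S_1/S_2) + (s+t)\hat\kappa\Big(\frac{s+t}{r+t}\Big) \le \mu L\phi^{I}(\mu) - \tfrac16\varrho x\mu L. \tag{3.1.18}
\]

Case $\tilde\mu \le \mu_0$. For $\tilde\mu < \mu_0$, we first note that, by Lemma 2.2.2(iv) and (3.1.1), there exists a z > 0 such that
\[
\sup_{y\ge 1} y\big[\hat\kappa(y) - \varrho\big] = \mu\big(\phi^{I}(\mu) - \varrho\big) - z. \tag{3.1.19}
\]
Therefore, picking y = (s + t)/(r + t) in (3.1.19), we get
\[
\begin{aligned}
(s+t)\hat\kappa\Big(\frac{s+t}{r+t}\Big)
&\le \mu(r+t)\phi^{I}(\mu) + \varrho\big[(s+t) - \mu(r+t)\big] - z(r+t)\\
&\le \mu r\phi^{I}(\mu) + \varrho(s-\mu r) - zr + \frac{C'L}{M}\\
&= x\frac{\mu^2L}{\tilde\mu}\phi^{I}(\mu) + \varrho x\mu L\Big(1 - \frac{\mu}{\tilde\mu}\Big) - z\frac{x\mu L}{\tilde\mu} + \frac{C'L}{M},
\end{aligned}
\tag{3.1.20}
\]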
where C′ = C′(µ) > 0 and the second line uses t ≤ µL/M. Summing (3.1.16) and (3.1.20), we obtain that for M large enough,
\[
S_1\phi^{I}(S_1/S_2) + (s+t)\hat\kappa\Big(\frac{s+t}{r+t}\Big) \le \mu L\phi^{I}(\mu) - z\frac{x\mu L}{\tilde\mu} + \frac{C'L}{M}. \tag{3.1.21}
\]
Since x ≥ C and $\tilde\mu \le \mu_0$, we can choose M large enough such that the r.h.s. of (3.1.21) is bounded from above by $\mu L\phi^{I}(\mu) - \frac{zC}{2\mu_0}\mu L$. Setting $C_3 = \inf\{zC/2\mu_0,\ \varrho C/6\}$, we obtain that the r.h.s. of (3.1.18) and (3.1.21) are both bounded from above by $\mu L\phi^{I}(\mu) - C_3\mu L$.

Step 4. In the case S1 ≥ ηL, (3.1.14) becomes
\[
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big) \le 2C_1 e^{-\varepsilon^2\eta L/C_2(\alpha+\beta)} + e^{\mu L(-C_3+3\varepsilon)}, \tag{3.1.22}
\]
while in the case S1 ≤ ηL we choose $\eta \le C_3/2R_\alpha$, and (3.1.15) becomes
\[
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big) \le C_1 e^{-\varepsilon^2\mu L/C_2(\alpha+\beta)} + e^{\mu L(-\frac12C_3+2\varepsilon)}. \tag{3.1.23}
\]
Thus, there are C4, C5 > 0 such that, for ε small enough,
\[
E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big) \le C_4 e^{-C_5\mu L}. \tag{3.1.24}
\]
Therefore it remains to estimate the number of possible values of (s, r, t) and v. Since (s, r, t) ∈ {1, …, µL}³, there are at most (µL)³ such triples. At fixed t, choosing v amounts to choosing t starting points and t ending points for the excursions, which can be done in at most $\binom{\mu L}{2t} \le \binom{\mu L}{2\mu L/M}$ ways when M ≥ 4. By Stirling's formula there exists a C′ > 0 such that for all M ≥ 4 and L ∈ N,
\[
\binom{\mu L}{2\mu L/M} \le C'\sqrt{\mu L}\,e^{d(M)\mu L}
\qquad\text{with}\qquad
d(M) = -\frac{2}{M}\log\frac{2}{M} - \Big(1-\frac{2}{M}\Big)\log\Big(1-\frac{2}{M}\Big). \tag{3.1.25}
\]
Since $\lim_{M\to\infty}d(M) = 0$, we have d(M) ≤ C5/2 for some C5 > 0 and M large enough. Therefore
\[
\sum_{(s,r,t)\in Q_L}\sum_{v} E\big(P^{\omega,I}_{\mu L}(A^v_{s,r,t})\big) \le C_4C'(\mu L)^{7/2}e^{-C_5\mu L/2}. \tag{3.1.26}
\]
Since the l.h.s. equals the expectation in (3.1.3), we have completed the proof.
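The exponent d(M) in (3.1.25) is the entropy of a Bernoulli variable with parameter 2/M. The following sketch checks, for a few sizes, the elementary bound log C(n, k) ≤ n d(M) with k = 2n/M, which is the Stirling estimate of (3.1.25) without its polynomial prefactor. It also evaluates d(M) to illustrate that d(M) decreases to 0. The function names and the sample values of n and M are hypothetical, and M is assumed to divide 2n.

```python
import math

def d(M: float) -> float:
    """Entropy exponent d(M) of (3.1.25); requires M > 2."""
    q = 2.0 / M
    return -q * math.log(q) - (1.0 - q) * math.log(1.0 - q)

def log_binomial(n: int, k: int) -> float:
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return math.log(math.comb(n, k))

def entropy_bound_holds(n: int, M: int) -> bool:
    """Check log C(n, 2n/M) <= d(M) * n, assuming M divides 2n."""
    k = 2 * n // M
    return log_binomial(n, k) <= d(M) * n

# d(M) on a few values of M >= 4, to observe the decay towards 0.
decay_table = [(M, d(M)) for M in (4, 8, 16, 64, 256, 1000)]
```

In the proof the role of n is played by µL, so the bound is used with M fixed and L → ∞.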
3.2. Proof of Theorem 1.4.1. The proof uses Lemma 2.2.1 and Proposition 3.1.1.

Step 1. From Theorem 1.3.2(ii) we know that α → βc(α) is non-decreasing and converges to a finite limit β* as α → ∞. Equation (2.3.7), which gives a criterion for the localization of the copolymer at AB-interfaces, implies that
\[
\sup_{\mu\ge 1}\mu\big[\phi^{I}(\alpha,\beta_c(\alpha);\mu) - \varrho\big] = \varsigma \qquad \forall\,\alpha\ge 0 \tag{3.2.1}
\]
with ϱ, ς defined in (3.1.2) (recall the energy shift made in (2.4.1–2.4.3)). Lemma 2.4.1 asserts that $\phi^{I}(\alpha,\beta_c(\alpha);\mu)$ tends to zero as µ → ∞, uniformly in α ≥ 0. Since
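$\phi^{I}(\alpha,\beta_c(\alpha);1) = 0$ for all α > 0 (the path lies in the interface), it follows that the supremum in (3.2.1) is attained at some $\mu_\alpha > 1$. Therefore, if we can prove that
\[
\phi^{I}(\alpha',\beta_c(\alpha);\mu_\alpha) > \phi^{I}(\alpha,\beta_c(\alpha);\mu_\alpha) \qquad \forall\,\alpha' < \alpha, \tag{3.2.2}
\]
then
\[
\sup_{\mu\ge 1}\mu\big[\phi^{I}(\alpha',\beta_c(\alpha);\mu) - \varrho\big] \ge \mu_\alpha\big[\phi^{I}(\alpha',\beta_c(\alpha);\mu_\alpha) - \varrho\big] > \mu_\alpha\big[\phi^{I}(\alpha,\beta_c(\alpha);\mu_\alpha) - \varrho\big] = \varsigma, \tag{3.2.3}
\]
and hence $\beta_c(\alpha) > \beta_c(\alpha')$.

Step 2. Let α′ < α and
\[
\begin{aligned}
D &= \phi^{I}(\alpha',\beta_c(\alpha);\mu_\alpha) - \phi^{I}(\alpha,\beta_c(\alpha);\mu_\alpha)\\
&= \lim_{L\to\infty}\frac{1}{\mu_\alpha L}\Big[\log\sum_{\pi\in W_{\mu_\alpha L,L}} e^{-H^{\omega,I}_{\mu_\alpha L}(\alpha',\beta_c(\alpha);\pi)} - \log\sum_{\pi\in W_{\mu_\alpha L,L}} e^{-H^{\omega,I}_{\mu_\alpha L}(\alpha,\beta_c(\alpha);\pi)}\Big]\\
&= \lim_{L\to\infty}\frac{1}{\mu_\alpha L}\log E^{\omega,I}_{\mu_\alpha L}\Big(\exp\Big[(\alpha-\alpha')\sum_{i=1}^{\mu_\alpha L}1\{\omega_i = A,\ (\pi_{i-1},\pi_i)\le 0\}\Big]\Big),
\end{aligned}
\tag{3.2.4}
\]
where the expectation is w.r.t. the law of the copolymer with parameters α and βc(α), which are both suppressed from the notation. For ε > 0, let $A_{\varepsilon,L} = \{\pi : \sum_{i=1}^{\mu_\alpha L}1\{\omega_i = A,\ (\pi_{i-1},\pi_i)\le 0\} \ge \varepsilon\mu_\alpha L\}$. Then we may estimate
\[
D \ge \limsup_{L\to\infty}\frac{1}{\mu_\alpha L}\log\Big(e^{(\alpha-\alpha')\varepsilon\mu_\alpha L}\,P^{\omega,I}_{\mu_\alpha L}(A_{\varepsilon,L}) + P^{\omega,I}_{\mu_\alpha L}([A_{\varepsilon,L}]^c)\Big). \tag{3.2.5}
\]
We will prove that, for ε small enough, there is a subsequence $(L_m)_{m\in N}$ such that $\lim_{m\to\infty}P^{\omega,I}_{\mu_\alpha L_m}([A_{\varepsilon,L_m}]^c) = 0$ ω-a.s. This will imply that D ≥ (α − α′)ε and complete the proof.

Step 3. We recall that $l_{\mu_\alpha L}$ denotes the number of strictly positive excursions in $\pi\in W_{\mu_\alpha L,L}$. By Proposition 3.1.1, ω-a.s., $P^{\omega,I}_{\mu_\alpha L}\big(\sum_{k=1}^{l_{\mu_\alpha L}}\tau_k\,1\{\tau_k\ge M\}\ge C\mu_\alpha L\big)$ tends to zero as L → ∞ along a subsequence. Moreover, ω-a.s., $\sum_{i=1}^{\mu_\alpha L}1\{\omega_i = A\} \ge \tfrac12\mu_\alpha L - C\mu_\alpha L$ for L large enough. Thus, putting $s = \tfrac12 - 2C - \varepsilon$, for L large enough we have the inclusion
\[
[A_{\varepsilon,L}]^c \subset \Big\{\sum_{k=1}^{l_{\mu_\alpha L}}\tau_k\,1\{\tau_k\ge M\}\ge C\mu_\alpha L\Big\}\cup\Big\{\Big\{\sum_{i=1}^{\mu_\alpha L}1\{\omega_i = A\}\,1\{i_M = 1\}\ge s\mu_\alpha L\Big\}\cap[A_{\varepsilon,L}]^c\Big\}, \tag{3.2.6}
\]
where $i_M$ is the indicator of the event that the i-th step lies in a strictly positive excursion of length ≤ M.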
From now on we fix C = 1/8 and ε ≤ 1/8, implying that s ≥ 1/8. We also fix M such that Proposition 3.1.1 holds for C = 1/8. The proof will be completed once we show that
\[
\lim_{L\to\infty}P^{\omega,I}_{\mu_\alpha L}(B_{\varepsilon,L}) = 0 \qquad \omega\text{-a.s.}, \tag{3.2.7}
\]
where
\[
B_{\varepsilon,L} = \Big\{\pi : \sum_{i=1}^{\mu_\alpha L}1\{\omega_i = A\}\,1\{i_M = 1\}\ge s\mu_\alpha L\Big\}\cap[A_{\varepsilon,L}]^c. \tag{3.2.8}
\]
Each path of $B_{\varepsilon,L}$ puts at least $s\mu_\alpha L$ monomers labelled by A in strictly positive excursions of length ≤ M and at most $\varepsilon\mu_\alpha L$ monomers labelled by A in non-positive excursions.

Step 4. For $\pi\in B_{\varepsilon,L}$, let $E_L(\pi)$ label the excursions of π that are strictly positive, have length ≤ M and contain at least 1 monomer labelled by A. Abbreviate $r_L(\pi) = |E_L(\pi)| \ge s\mu_\alpha L/M$. Partition $E_L(\pi)$ into two parts:
– $E^1_L(\pi)$: those excursions whose preceding and subsequent non-positive excursions do not contain an A.
– $E^2_L(\pi)$: those excursions whose preceding and/or subsequent non-positive excursions contain an A.
The total number of non-positive excursions containing an A is bounded from above by $\varepsilon\mu_\alpha L$. Since a non-positive excursion can be at most once preceding and once subsequent, we have $|E^1_L(\pi)| \ge (s/M - 2\varepsilon)\mu_\alpha L$. We will discard the excursions in $E^2_L(\pi)$. Moreover, to avoid overlap, we will keep from $E^1_L(\pi)$ only half of the excursions. Call the remainder $\tilde E^1_L(\pi)$, and abbreviate $\tilde r_L(\pi) = |\tilde E^1_L(\pi)|$. Then $\tilde r_L(\pi) \ge r\mu_\alpha L$ with r = s/2M − ε.

Next, for $\pi\in B_{\varepsilon,L}$, let χ(π) denote the partition of $\{1,\dots,\mu_\alpha L\}$ into $2\tilde r_L(\pi)+1$ intervals, i.e., $(I_t)_{t=0}^{2\tilde r_L}$ with $I_{2(j-1)+1}$, $j\in\{1,2,\dots,\tilde r_L\}$, the interval occupied by the jth excursion of $\tilde E^1_L(\pi)$ and its preceding and subsequent non-positive excursions. The partition χ(π) also contains $2\tilde r_L+1$ integers $(i_t)_{t=0}^{2\tilde r_L}$ with $i_t$, $t\in\{0,1,\dots,2\tilde r_L\}$, the number of horizontal steps the path π makes in $I_t$. Let $K^{\omega}_L$ be the set of possible outcomes of χ(π) as π runs over $B_{\varepsilon,L}$. For $\chi\in K^{\omega}_L$, let t(χ) denote the family of possible paths over the even intervals $I_0, I_2, \dots, I_{2\tilde r(\chi)}$. The paths of t(χ) do not put more than $\varepsilon\mu_\alpha L$ monomers of type A on or below the interface, put exactly one excursion of type 1 in each interval $I_{2j}$, $j\in\{1,\dots,2\tilde r(\chi)\}$, no excursion of type 1 in $I_0$ and at most one excursion in $I_{2\tilde r(\chi)}$. For j ∈ {1, …
, $\tilde r(\chi)$}, let $t_j(\chi)$ be the set of paths on $I_{2j-1}$ that make $i_{2j-1}$ horizontal steps, perform exactly one excursion of type 1, and have their preceding and subsequent non-positive excursions without an A. Then we have the formula
\[
P^{\omega,I}_{\mu_\alpha L}(B_{\varepsilon,L}) = \frac{\sum_{\chi\in K^{\omega}_L}\sum_{\pi\in t(\chi)} e^{-H^{\omega,I}(\pi)}\prod_{j=1}^{\tilde r(\chi)}\sum_{\pi_j\in t_j(\chi)} e^{-H^{\omega,I}(\pi_j)}}{\sum_{\pi\in W_{\mu_\alpha L,L}} e^{-H^{\omega,I}(\pi)}}. \tag{3.2.9}
\]

Step 5. For $j\in\{1,\dots,\tilde r(\chi)\}$, let $s_j(\chi)$ be the set of non-positive excursions of $|I_{2j-1}|$ steps of which $i_{2j-1}$ are horizontal. Then we may estimate
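\[
P^{\omega,I}_{\mu_\alpha L}(B_{\varepsilon,L}) \le \varepsilon\mu_\alpha L\binom{\mu_\alpha L}{\varepsilon\mu_\alpha L}\times
\frac{\sum_{\chi\in K^{\omega}_L}\sum_{\pi\in t(\chi)} e^{-H^{\omega,I}(\pi)}\prod_{j=1}^{\tilde r(\chi)}\Big[\sum_{\pi_j\in t_j(\chi)} e^{-H^{\omega,I}(\pi_j)}\Big]}
{\sum_{\chi\in K^{\omega}_L}\sum_{\pi\in t(\chi)} e^{-H^{\omega,I}(\pi)}\prod_{j=1}^{\tilde r(\chi)}\Big[\sum_{\pi_j\in t_j(\chi)} e^{-H^{\omega,I}(\pi_j)} + \sum_{\pi_j\in s_j(\chi)} e^{-H^{\omega,I}(\pi_j)}\Big]}. \tag{3.2.10}
\]
Here, the prefactor comes from the fact that a path with more than one non-positive excursion containing an A may be associated with more than one family (χ, t(χ)) in the sum in the denominator of (3.2.9). However, a path t(χ) cannot have more than $\varepsilon\mu_\alpha L$ excursions of such type. Since the number of excursions is bounded from above by $\mu_\alpha L$, we can assert that each path can appear at most $\varepsilon\mu_\alpha L\binom{\mu_\alpha L}{\varepsilon\mu_\alpha L}$ times in the denominator.

At this stage it suffices to show that there exists a C > 0, depending only on α, α′ and M, such that for all $\chi\in K^{\omega}_L$ and $j\in\{1,\dots,\tilde r(\chi)\}$,
\[
\sum_{\pi_j\in s_j(\chi)} e^{-H^{\omega,I}(\pi_j)} \ge C\sum_{\pi_j\in t_j(\chi)} e^{-H^{\omega,I}(\pi_j)}. \tag{3.2.11}
\]
Indeed, since $\tilde r(\chi) \ge r\mu_\alpha L$ this yields, via (3.2.10),
\[
P^{\omega,I}_{\mu_\alpha L}(B_{\varepsilon,L}) \le \varepsilon\mu_\alpha L\binom{\mu_\alpha L}{\varepsilon\mu_\alpha L}(1+C)^{-r\mu_\alpha L}. \tag{3.2.12}
\]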
For $\varepsilon$ small enough the r.h.s. of (3.2.12) tends to zero as $L \to \infty$ because $C > 0$, implying (3.2.7) as desired.

Step 6. To prove (3.2.11), we note that, since the paths of $s_j(\chi)$ stay in the lower halfplane, their Hamiltonian is a constant, namely, $H^{\omega,I}(s_j(\chi)) = \sum_{i\in I_j} (\alpha 1\{\omega_i = A\} - \beta 1\{\omega_i = B\})$ (recall (2.4.2)). A path of $t_j(\chi)$ puts at most $M$ steps of $I_j$ in the upper halfplane, and so $\pi_j \in t_j(\chi)$ implies $H^{\omega,I}(\pi_j) \ge H^{\omega,I}(s_j(\chi)) - \alpha M$. It therefore remains to compare the cardinalities of $s_j(\chi)$ and $t_j(\chi)$. The number of strictly positive excursions of length $\le M$ is some integer, denoted by $h(M)$. Moreover, on $I_j$ the possible starting points of the excursion of type 1 are at most $M$. Indeed, the excursion has to contain all the $\omega_i$ of $I_j$ that are equal to $A$, and hence it must start less than $M$ steps to the left of the leftmost $i \in I_j$ such that $\omega_i = A$. Thus, we have at most $M h(M)$ possible excursions of type 1 in $I_j$ (if we take into account their starting point).

Next, we note that by fixing the starting point and the shape of the excursions of type 1, we can create an injection from $t_j(\chi)$ to $s_j(\chi)$ as follows (see Fig. 9). If $2r$ is the number of vertical steps in the fixed excursion of type 1, then we associate with each path of $t_j(\chi)$ a path of $s_j(\chi)$ that begins with $r$ vertical steps down before performing the preceding non-positive excursion, next makes $s$ horizontal steps, where $s$ is the number of horizontal steps in the excursion of type 1, next performs the subsequent non-positive excursion, and afterwards returns to the interface with $r$ vertical steps. We conclude that $|s_j(\chi)| \ge |t_j(\chi)|/M h(M)$, which allows us to estimate
\[
\sum_{\pi_j\in s_j(\chi)} e^{-H^{\omega,I}(\pi_j)} = |s_j(\chi)|\, e^{-H^{\omega,I}(s_j(\chi))} \ge \frac{|t_j(\chi)|}{M h(M)}\, e^{-H^{\omega,I}(s_j(\chi))} \ge C \sum_{\pi_j\in t_j(\chi)} e^{-H^{\omega,I}(\pi_j)} \tag{3.2.13}
\]
with $C = e^{-\alpha M}/M h(M)$, proving (3.2.11).
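The decay of the right-hand side of (3.2.12) can be checked numerically. The following is a minimal sketch, not part of the proof: `n` stands for $\mu_\alpha L$, and the values of `eps`, `r` and `C` are illustrative assumptions rather than constants from the paper. Per step, the logarithm of the bound approaches the binary entropy of $\varepsilon$ minus $r\log(1+C)$, which is negative once $\varepsilon$ is small, since $r = s/2M - \varepsilon$ stays bounded away from zero while the entropy vanishes.

```python
import math

def log_bound(n, eps, r, C):
    """Log of eps*n * binom(n, eps*n) * (1+C)**(-r*n), with n in the role of mu_alpha*L.

    eps, r and C are illustrative inputs (assumptions), not constants from the paper.
    """
    k = int(eps * n)  # stands for eps*mu_alpha*L, rounded down
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    # max(k, 1) only guards against log(0) when eps*n < 1
    return math.log(max(k, 1)) + log_binom - r * n * math.log1p(C)

def entropy(eps):
    """Binary entropy in nats: the exponential growth rate of binom(n, eps*n)."""
    return -eps * math.log(eps) - (1 - eps) * math.log(1 - eps)

eps, r, C = 0.001, 0.05, 0.5
rate = entropy(eps) - r * math.log1p(C)  # limit of log_bound(n, ...)/n as n grows
for n in (10**3, 10**4, 10**5):
    print(n, log_bound(n, eps, r, C) / n, rate)
```

With these toy values the limiting rate is negative, so the bound decays exponentially in `n`.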
Fig. 9. Injection from t j (χ ) to s j (χ ). Here, (b1 , b2 ) and (d1 , d2 ) label the endpoints of the preceding and subsequent non-positive excursions
4. Proof of Theorem 1.4.2

Section 4.1 states two propositions providing the lower, respectively, upper bound for $f$ near the critical curve. These two propositions are proved in Sects. 4.3 and 4.4, respectively, and together yield Theorem 1.4.2. Section 4.2 contains several lemmas about the maximisers of the variational problem for $\psi_{AB}$, which are needed in the proofs.

4.1. Lower and upper bounds on the free energy. Recall (2.4.2). Fix $p \ge p_c$, $\alpha \in (\alpha^*, \infty)$ and $\delta_0 > 0$ small enough (depending on $p$ and $\alpha$). Abbreviate $I_0 = (0, \delta_0] \cap (0, \alpha - \beta_c(\alpha)]$, and for $\delta \in I_0$ define
\[
\psi_{kl}(a,\delta) = \psi_{kl}(\alpha, \beta_c(\alpha)+\delta; a), \quad a \ge 2, \qquad \phi^I(\mu,\delta) = \phi^I(\alpha, \beta_c(\alpha)+\delta; \mu), \quad \mu \ge 1, \tag{4.1.1}
\]
and
\[
T_\alpha(\delta) = f(\alpha, \beta_c(\alpha)+\delta; p) - f(\alpha, \beta_c(\alpha); p). \tag{4.1.2}
\]

Proposition 4.1.1. There exists a $C_1 > 0$ such that
\[
T_\alpha(\delta) \ge C_1 \delta^2 \qquad \forall\, \delta \in I_0. \tag{4.1.3}
\]

Proposition 4.1.2. There exists a $C_2 < \infty$ such that
\[
T_\alpha(\delta) \le C_2 \delta^2 \qquad \forall\, \delta \in I_0. \tag{4.1.4}
\]
4.2. Maximisers of the block pair free energy. Lemmas 4.2.1–4.2.6 below are elementary assertions about the existence and the limiting behaviour of the maximisers in the variational expression for $\psi_{AB}$ in (2.3.6). These lemmas will be needed in the proof of Propositions 4.1.1–4.1.2 in Sects. 4.3–4.4.

Step 1. We first show that $a \mapsto \psi_{AB}(a,\delta)$ has a maximiser for $\delta$ small enough.

Lemma 4.2.1. For every $\delta_0 > 0$ there exists an $a_0 > 2$ such that, for every $\alpha > \alpha^*$ and $\delta \in I_0(\alpha)$, there exists an $a_\alpha(\delta) \in (2, a_0]$ satisfying
\[
\sup_{a\ge 2} \psi_{AB}(a,\delta) = \psi_{AB}(a_\alpha(\delta), \delta). \tag{4.2.1}
\]
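Proof. Recall (4.1.1). In Lemma 2.4.1 we showed that, for every $\beta_0 > 0$, $\psi_{AB}(a, \alpha, \beta)$ tends to zero as $a \to \infty$ uniformly in $\alpha \ge \beta$ and $\beta \le \beta_0$. Since $\beta_c(\alpha) \le \beta^*$ for all $\alpha \ge 0$, there exists an $a_0 > 2$ such that $\psi_{AB}(a,\delta) < \kappa(a^*, 1)$ for all $a \ge a_0$, $\alpha > \alpha^*$ and $\delta \in I_0(\alpha)$. By [7], Theorem 1.4.2, we have $\sup_{a\ge 2} \psi_{AB}(a,\delta) > \kappa(a^*, 1)$ for all $\delta > 0$ and $\alpha > \alpha^*$. This implies
\[
\sup_{a\ge 2} \psi_{AB}(a,\delta) = \sup_{2\le a\le a_0} \psi_{AB}(a,\delta) \qquad \forall\, \alpha > \alpha^*,\ \delta \in I_0(\alpha). \tag{4.2.2}
\]
For $\delta$ fixed, $a \mapsto \psi_{AB}(a,\delta)$ is continuous on $[2,\infty)$ and $\psi_{AB}(2,\delta) = 0$. Therefore there exists an $a_\alpha(\delta) \in (2, a_0]$ such that the l.h.s. of (4.2.2) is equal to $\psi_{AB}(a_\alpha(\delta), \delta)$.

Step 2. Let
\[
Q^{\alpha}_{\delta,\mu_0} = \{(c,\mu) : 0 \le c \le \mu,\ \mu \ge \mu_0,\ a_\alpha(\delta) - c \ge 2 - c/\mu\} \tag{4.2.3}
\]
and
\[
H(c, a, \mu, \delta) = \frac{1}{a}\Bigl[c\,\phi^I(\mu,\delta) + (a-c)\,\kappa(a-c, 1-c/\mu)\Bigr]. \tag{4.2.4}
\]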
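Then, by Lemma 2.2.1(ii), we can assert that there exists a unique pair $(c_\alpha(\delta), \mu_\alpha(\delta)) \in Q^{\alpha}_{\delta,1}$ satisfying $\psi_{AB}(a_\alpha(\delta), \delta) = H(c_\alpha(\delta), a_\alpha(\delta), \mu_\alpha(\delta), \delta)$.

Lemma 4.2.2. For every $\delta_0 > 0$ there exists a $\mu_0 > 1$ such that $(c_\alpha(\delta), \mu_\alpha(\delta)) \in Q^{\alpha}_{\delta,1} \setminus Q^{\alpha}_{\delta,\mu_0}$ for all $\alpha > \alpha^*$ and $\delta \in I_0(\alpha)$.

Proof. Prior to (4.2.2) we noted that $\psi_{AB}(a_\alpha(\delta), \delta) > \kappa(a^*, 1)$. We will show that there exists a $\mu_0 > 1$ such that $H(c, a_\alpha(\delta), \mu, \delta) \le \kappa(a^*, 1)$ for all $\alpha > \alpha^*$, $\delta \in I_0(\alpha)$ and $(c,\mu) \in Q^{\alpha}_{\delta,\mu_0}$. This goes as follows. In Lemma 2.4.1(i) we showed that $\phi^I(\mu,\delta)$ tends to zero as $\mu \to \infty$, uniformly in $\alpha > \alpha^*$ and $\delta \in I_0(\alpha)$. Therefore there exists a $\mu_0 > 1$ such that $\phi^I(\mu,\delta) < \tfrac12 \kappa(a^*, 1)$ for all $\mu \ge \mu_0$, $\alpha > \alpha^*$ and $\delta \in I_0(\alpha)$.

Lemma 4.2.3. There exists an $M > 0$, depending on $a_0$, such that $\kappa(a,b) \le \kappa(a^*, 1) + M(1-b)$ for all $(a,b) \in \mathrm{DOM}$ (recall (2.2.1)) satisfying $a \le a_0$ and $\tfrac12 \le b \le 1$.

Proof. This is easily proved via Lemma 2.2.1(ii), which says that $(a,b) \mapsto \kappa(a,b)$ is analytic on the interior of DOM, and the equality $\kappa(a, a-1) = 0$ for all $a \ge 2$.

We now choose $\mu_0$ large enough so that $\mu_0 > 2a_0$ and $M a_0/\mu_0 \le \tfrac12 \kappa(a^*, 1)$. Thus, for $(c,\mu) \in Q^{\alpha}_{\delta,\mu_0}$ we have $c/\mu \le a_0/\mu_0 \le \tfrac12$, which entails $\tfrac12 \le 1 - c/\mu \le 1$. Therefore, $(a_\alpha(\delta) - c, 1 - c/\mu)$ satisfies the assumptions of Lemma 4.2.3 and
\[
H(c, a_\alpha(\delta), \mu, \delta) \le \frac{1}{a_\alpha(\delta)}\Bigl[c\,\tfrac12 \kappa(a^*,1) + (a_\alpha(\delta) - c)\bigl(\kappa(a^*,1) + Mc/\mu\bigr)\Bigr] \le \kappa(a^*,1) + \frac{1}{a_\alpha(\delta)}\, c\,\Bigl[M a_0/\mu - \tfrac12 \kappa(a^*,1)\Bigr] \le \kappa(a^*,1). \tag{4.2.5}
\]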
Step 3. We next show that $a \mapsto \psi_{AB}(a, 0)$ has a unique maximiser.

Lemma 4.2.4. For every $\alpha \ge \alpha^*$, $\sup_{a\ge 2} \psi_{AB}(a,0) = \kappa(a^*,1)$ and is achieved uniquely at $a = a^*$. Consequently, for $\alpha \ge \alpha^*$ and $\beta = \beta_c(\alpha)$, the supremum in (2.3.6) is achieved uniquely at $c = 0$.

Proof. Since $(\alpha, \beta_c(\alpha)) \in \mathcal{L}$, [7], Theorem 1.4.2, tells us that $\sup_{a\ge 2} \psi_{AB}(a,0) \le \kappa(a^*,1)$. Moreover, $\psi_{AB}(a^*,0) \ge \kappa(a^*,1)$, and therefore
\[
\sup_{a\ge 2} \psi_{AB}(a,0) = \kappa(a^*,1) = \psi_{AB}(a^*,0). \tag{4.2.6}
\]
Now, pick $a \ge 2$ such that $\psi_{AB}(a,0) = \kappa(a^*,1)$ and recall that $\mathrm{DOM}(a)$ in (2.3.4) is the domain of the variational problem for $\psi_{AB}(a,0)$. We argue by contradiction. Suppose that there exist $c, b > 0$ such that $(c,b) \in \mathrm{DOM}(a)$ and
\[
\psi_{AB}(a,0) = \kappa(a^*,1) = \frac{1}{a}\Bigl[c\,\phi^I(c/b, 0) + (a-c)\,\kappa(a-c, 1-b)\Bigr]. \tag{4.2.7}
\]
Then
\[
\frac{1}{a}\Bigl\{(c/b)\bigl[\phi^I(c/b,0) - \kappa(a^*,1)\bigr] - (a/b - c/b)\bigl[\kappa(a^*,1) - \kappa(a-c, 1-b)\bigr]\Bigr\} = 0. \tag{4.2.8}
\]
However, $(c/b)[\phi^I(c/b,0) - \kappa(a^*,1)] \le \varsigma$ by Proposition 2.3.4. Moreover, by [7], Eq. (2.3.3), we have
\[
g(\nu) = \nu\Bigl[\kappa(a^*,1) - \sup_{2/(\nu+1)\le b\le 1} \kappa(b\nu, 1-b)\Bigr] > \varsigma \qquad \forall\, \nu \ge 1. \tag{4.2.9}
\]
Pick $\nu = (a-c)/b$ to make the l.h.s. of (4.2.8) strictly negative. Then the equality in (4.2.8) cannot occur with $b > 0$ and $c > 0$. Consequently, the only way to obtain (4.2.8) is to take $c = 0$ and $a = a^*$.

Step 4. Fix $\alpha > \alpha^*$ and $\delta_0 > 0$. For $\delta \in I_0(\alpha)$, the quantity $a_\alpha(\delta)$ may not be unique, which is why from now on we take its minimum value. We next prove that $(a_\alpha(\delta), c_\alpha(\delta))$ tends to $(a^*, 0)$ as $\delta \downarrow 0$. In what follows, $(\delta_n)_{n\in\mathbb{N}}$ is a sequence in $I_0(\alpha)$ such that $\lim_{n\to\infty} \delta_n = 0$.

Lemma 4.2.5. Let $(a_n)_{n\in\mathbb{N}}$ and $(\mu_n)_{n\in\mathbb{N}}$ be such that $\lim_{n\to\infty} a_n = a \ge 2$ and $\lim_{n\to\infty} \mu_n = \mu \ge 1$. Then $\lim_{n\to\infty} \psi_{AB}(a_n, \delta_n) = \psi_{AB}(a, 0)$ and $\lim_{n\to\infty} \phi^I(\mu_n, \delta_n) = \phi^I(\mu, 0)$.

Proof. A simple computation gives that $\psi_{AB}(a,\delta) - \psi_{AB}(a,0) \le \delta$ for all $a \ge 2$ (recall (4.1.1)). By the triangle inequality, this allows us to write
\[
|\psi_{AB}(a_n,\delta_n) - \psi_{AB}(a,0)| \le |\psi_{AB}(a_n,\delta_n) - \psi_{AB}(a_n,0)| + |\psi_{AB}(a_n,0) - \psi_{AB}(a,0)| \le \delta_n + |\psi_{AB}(a_n,0) - \psi_{AB}(a,0)|. \tag{4.2.10}
\]
Since $a \mapsto \psi_{AB}(a,0)$ is continuous (recall Lemma 2.3.3(i)), the r.h.s. of (4.2.10) tends to zero as $n \to \infty$. This yields the claim for $\psi_{AB}$. The same proof gives the claim for $\phi^I$.

Step 5. Finally, we obtain the convergence of $a_\alpha(\delta)$ and $c_\alpha(\delta)$ as $\delta \downarrow 0$.
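Lemma 4.2.6. (i) $\lim_{\delta\downarrow 0} a_\alpha(\delta) = a^*$. (ii) $\lim_{\delta\downarrow 0} c_\alpha(\delta) = 0$.

Proof.
(i) The family $(a_\alpha(\delta))_{\delta\in I_0(\alpha)}$ is bounded. We show that the only possible limit of its subsequences is $a^*$. Assume that $a_\alpha(\delta_n) \to a_\infty$ as $n \to \infty$, with $a_\infty \in [2, a_0]$. Since $\delta \mapsto \psi_{AB}(a_\alpha(\delta), \delta)$ is non-decreasing, we get
\[
\psi_{AB}(a_\alpha(\delta_n), \delta_n) - \psi_{AB}(a^*, 0) \ge 0. \tag{4.2.11}
\]
Lemma 4.2.5 tells us that the l.h.s. of (4.2.11) tends to $\psi_{AB}(a_\infty, 0) - \psi_{AB}(a^*, 0)$ as $n \to \infty$. Thus, $\psi_{AB}(a_\infty, 0) \ge \psi_{AB}(a^*, 0)$ and, since $a^*$ is the unique maximiser of $\psi_{AB}(a, 0)$ (by Lemma 4.2.4), we obtain that $a_\infty = a^*$. This implies that $a_\alpha(\delta)$ tends to $a^*$ as $\delta \downarrow 0$.
(ii) The family $(c_\alpha(\delta))_{\delta\in I_0}$ is bounded, because $c_\alpha(\delta) \le a_\alpha(\delta) - 1 \le a_0 - 1$ for every $\delta \in I_0$. Assume that $c_\alpha(\delta_n) \to c_\infty$ as $n \to \infty$. Since $a_\alpha(\delta_n) \to a^*$, we necessarily have $c_\infty \le a^* - 1$. Moreover, $(\mu_\alpha(\delta_n))_{n\in\mathbb{N}}$ is bounded above by $\mu_0$ (by Lemma 4.2.2). Therefore, we can pick a subsequence satisfying $\mu_\alpha(\delta_n) \to \mu_\infty$ as $n \to \infty$. We now recall (4.2.4) and write
\[
\psi_{AB}(a_\alpha(\delta_n), \delta_n) = \frac{1}{a_\alpha(\delta_n)}\, c_\alpha(\delta_n)\, \phi^I(\mu_\alpha(\delta_n), \delta_n) + \frac{1}{a_\alpha(\delta_n)} \bigl[a_\alpha(\delta_n) - c_\alpha(\delta_n)\bigr]\, \kappa\bigl(a_\alpha(\delta_n) - c_\alpha(\delta_n),\, 1 - c_\alpha(\delta_n)/\mu_\alpha(\delta_n)\bigr). \tag{4.2.12}
\]
Let $n \to \infty$. Then Lemma 4.2.5 tells us that
\[
\psi_{AB}(a^*, 0) = \frac{1}{a^*}\Bigl[c_\infty\, \phi^I(\mu_\infty, 0) + (a^* - c_\infty)\, \kappa\bigl(a^* - c_\infty,\, 1 - c_\infty/\mu_\infty\bigr)\Bigr]. \tag{4.2.13}
\]
Therefore Lemma 4.2.4 gives that $c_\infty = 0$ and consequently $c_\alpha(\delta)$ tends to 0 as $\delta \downarrow 0$.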
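4.3. Proof of Proposition 4.1.1.

Proof. Along the way we need the following. Let $\partial\phi^I/\partial\beta^+$ and $\partial\phi^I/\partial\beta^-$ denote the right- and left-derivative of $\phi^I$, respectively.

Lemma 4.3.1. For all $\mu \ge 1$ and $\alpha, \beta \ge 0$ such that $\phi^I(\alpha,\beta;\mu) > \hat\kappa(\mu)$,
\[
\frac{\partial\phi^I}{\partial\beta^+}(\alpha,\beta;\mu) \ge \frac{\partial\phi^I}{\partial\beta^-}(\alpha,\beta;\mu) > 0. \tag{4.3.1}
\]

Proof. Use that $\phi^I(\alpha,\beta;\mu)$ is convex in $\beta$ and that $\phi^I(\alpha,\beta;\mu) \ge \phi^I(\alpha,0;\mu) = \hat\kappa(\mu)$ for all $\beta \ge 0$.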
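What Lemma 4.3.1 says is that the localized phase of $\phi^I(\alpha,\beta;\mu)$ for fixed $\mu$ corresponds to pairs $(\alpha,\beta)$ satisfying $\phi^I(\alpha,\beta;\mu) > \hat\kappa(\mu)$.

Step 1. Recall (2.1.8) and pick a $\gamma \in (0,1)$ for which $M_\gamma \in R(p)$. By picking $a_{AA} = a_{AB} = a^* = \tfrac52$ and $(\rho_{kl}) = M_\gamma$ in (2.1.11), and noting that $\psi_{AA}(a^*) = f(\alpha, \beta_c(\alpha); p) = \kappa(a^*,1)$, we get
\[
T_\alpha(\delta) \ge \gamma\bigl[\psi_{AB}(a^*,\delta) - \kappa(a^*,1)\bigr]. \tag{4.3.2}
\]
Since $\mu \mapsto \phi^I(\mu,0)$ is continuous and $\phi^I(1,0) = 0$, Proposition 2.3.4 allows us to choose a $\mu_\alpha \ge 1$ that is a solution of the equation $\phi^I(\mu,0) = \kappa(a^*,1) + (1/\mu)\varsigma$ (recall (3.1.2)). Pick $C \in (0,1)$ and, in the variational formula for $\psi_{AB}(a^*,\delta)$ in Lemma 2.3.2, pick $c = C\delta$ and $c/b = \mu_\alpha$, to obtain the lower bound
\[
T_\alpha(\delta) \ge \frac{\gamma}{a^*}\Bigl[C\delta\,\phi^I(\mu_\alpha,\delta) + (a^* - C\delta)\,\kappa\bigl(a^* - C\delta,\, 1 - C\delta/\mu_\alpha\bigr) - a^*\kappa(a^*,1)\Bigr]. \tag{4.3.3}
\]
Use Lemma 2.2.1(iv–vi) to Taylor expand
\[
\kappa\bigl(a^* - C\delta,\, 1 - C\delta/\mu_\alpha\bigr) = \kappa(a^*,1) - (\varsigma/a^*)\, C\delta/\mu_\alpha + B_\alpha C^2\delta^2 + \zeta(C\delta, C\delta/\mu_\alpha)\, C^2\delta^2\bigl(1 + 1/\mu_\alpha^2\bigr), \qquad \delta \downarrow 0, \tag{4.3.4}
\]
for some $B_\alpha \in \mathbb{R}$ and $\zeta$ a function on $\mathbb{R}^2$ tending to zero at $(0,0)$. Since $\beta_c(\alpha) \le \beta^*$ for $\alpha \ge \alpha^*$, Lemma 2.4.1 tells us that $\phi^I(\alpha, \beta_c(\alpha); \mu)$ tends to 0 as $\mu \to \infty$ uniformly in $\alpha \ge \alpha^*$. Consequently, $\mu_\alpha$ is bounded uniformly in $\alpha \ge \alpha^*$, and therefore so is $B_\alpha$. By inserting (4.3.4) into (4.3.3), we obtain that there exist $M \in \mathbb{R}$ and $\delta_0 > 0$ such that
\[
T_\alpha(\delta) \ge \frac{\gamma}{a^*}\Bigl[C\delta\bigl(\phi^I(\mu_\alpha,\delta) - \phi^I(\mu_\alpha,0)\bigr) + M a^* C^2\delta^2\Bigr] \qquad \forall\, \alpha > \alpha^*,\ \delta \in I_0(\alpha). \tag{4.3.5}
\]
Since, by Lemma 2.2.2(iv) and Proposition 2.3.4, $\phi^I(\mu_\alpha, 0) > \hat\kappa(\mu_\alpha)$, Lemma 4.3.1 gives that $(\alpha, \beta_c(\alpha))$ lies in the localized phase of $(\alpha',\beta') \mapsto \phi^I(\mu_\alpha, \alpha', \beta')$. Therefore
\[
\phi^I(\mu_\alpha,\delta) - \phi^I(\mu_\alpha,0) \ge C_\alpha\delta \quad\text{with}\quad C_\alpha = \frac{\partial\phi^I}{\partial\beta^+}(\alpha, \beta_c(\alpha); \mu_\alpha) \in (0,1]. \tag{4.3.6}
\]
Hence (4.3.5) becomes
\[
T_\alpha(\delta) \ge \frac{\gamma}{a^*}\bigl(C C_\alpha + M a^* C^2\bigr)\delta^2 \qquad \forall\, \alpha > \alpha^*,\ \delta \in I_0(\alpha). \tag{4.3.7}
\]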
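The coefficient of $\delta^2$ in (4.3.7) is positive as soon as $C$ is small compared with $C_\alpha/(|M| a^*)$. The sketch below illustrates this arithmetic only; the numerical values of `C_alpha`, `M` and `gamma` are hypothetical, and `pick_C` is a helper introduced here for illustration, not a construction from the paper.

```python
def quadratic_coefficient(C, C_alpha, M, gamma, a_star=2.5):
    """Coefficient of delta**2 on the right-hand side of (4.3.7)."""
    return gamma / a_star * (C * C_alpha + M * a_star * C**2)

def pick_C(C_alpha, M, a_star=2.5):
    """Return some C in (0, 1) with M * a_star * C > -C_alpha / 2 (hypothetical helper)."""
    if M >= 0:
        return 0.5  # any C in (0, 1) works when M is non-negative
    return min(0.5, C_alpha / (4 * abs(M) * a_star))

# Hypothetical values, for illustration only.
C_alpha, M, gamma = 0.3, -4.0, 0.2
C = pick_C(C_alpha, M)
C1 = gamma * C * C_alpha / (2 * 2.5)  # the constant gamma*C*C_alpha/(2 a*)
print(C, quadratic_coefficient(C, C_alpha, M, gamma), C1)
```

With such a choice the quadratic coefficient is at least `C1`, which is strictly positive whenever `C_alpha` is.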
Now pick $C$ small enough so that $M a^* C > -\tfrac12 C_\alpha$, to get the inequality in (4.1.3) with $C_1 = \frac{\gamma}{2a^*} C C_\alpha$.

Step 2. To complete the proof of Proposition 4.1.1 it suffices to show that $C_\alpha$ can be bounded from below by a strictly positive constant. The latter is done as follows. Suppose that there exists a sequence $(\alpha_n)_{n\in\mathbb{N}}$ in $(\alpha^*, \infty]$ such that $\lim_{n\to\infty} C_{\alpha_n} = 0$. By considering a subsequence of $(\alpha_n)_{n\in\mathbb{N}}$, we may assume that $\alpha_n$ and $\mu_{\alpha_n}$ converge, respectively, to $\alpha_\infty \in [\alpha^*, \infty]$ and $\mu_\infty$. Moreover, as proved in Lemma 4.2.5,
\[
\lim_{n\to\infty} \phi^I(\alpha_n, \beta, \mu_{\alpha_n}) = \phi^I(\alpha_\infty, \beta, \mu_\infty) \qquad \forall\, \beta > 0, \tag{4.3.8}
\]
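and $\beta \mapsto \phi^I(\alpha_n, \beta; \mu_{\alpha_n})$ is convex for every $n \in \mathbb{N}$. Consequently,
\[
\frac{\partial\phi^I}{\partial\beta^-}(\alpha_\infty, \beta_c(\alpha_\infty); \mu_\infty) \le \limsup_{n\to\infty} \frac{\partial\phi^I}{\partial\beta^+}(\alpha_n, \beta_c(\alpha_n); \mu_{\alpha_n}) = \limsup_{n\to\infty} C_{\alpha_n} = 0 \tag{4.3.9}
\]
and
\[
\phi^I(\alpha_\infty, \beta; \mu_\infty) = \kappa(a^*,1) + \frac{1}{\mu_\infty}\varsigma > \hat\kappa(\mu_\infty). \tag{4.3.10}
\]
But (4.3.9) yields $\frac{\partial\phi^I}{\partial\beta^-}(\alpha_\infty, \beta_c(\alpha_\infty); \mu_\infty) \le 0$, which contradicts the statement in Lemma 4.3.1, because of (4.3.10).

4.4. Proof of Proposition 4.1.2.

Step 1. Since $\psi_{AB} \ge \psi_{kl}$ for all $kl \in \{A,B\}^2$, we can write
\[
f(\alpha, \beta_c(\alpha)+\delta; p) - f(\alpha, \beta_c(\alpha); p) \le \psi_{AB}(a_\alpha(\delta), \delta) - \kappa(a^*,1). \tag{4.4.1}
\]
Because of Lemma 4.2.4 we also have
\[
f(\alpha, \beta_c(\alpha)+\delta; p) - f(\alpha, \beta_c(\alpha); p) \le \psi_{AB}(a_\alpha(\delta), \delta) - \psi_{AB}(a_\alpha(\delta), 0). \tag{4.4.2}
\]
Since
\[
\psi_{AB}(a_\alpha(\delta), \delta) - \psi_{AB}(a_\alpha(\delta), 0) \le \frac{1}{a_\alpha(\delta)}\, c_\alpha(\delta)\bigl[\phi^I(\mu_\alpha(\delta), \alpha, \beta_c(\alpha)+\delta) - \phi^I(\mu_\alpha(\delta), \alpha, \beta_c(\alpha))\bigr] \tag{4.4.3}
\]
and, for $\delta$ fixed, $\beta \mapsto \phi^I(\alpha, \beta; \mu_\alpha(\delta))$ is convex with slope bounded by 1, we obtain
\[
\psi_{AB}(a_\alpha(\delta), \delta) - \psi_{AB}(a_\alpha(\delta), 0) \le \frac{1}{a_0}\Bigl[\frac{\partial}{\partial\beta}\phi^I(\alpha, \beta_c(\alpha)+\delta; \mu_\alpha(\delta))\Bigr] c_\alpha(\delta)\,\delta \le \frac{1}{a_0}\, c_\alpha(\delta)\,\delta. \tag{4.4.4}
\]

Step 2. The proof of (4.1.4) is now completed by the following.

Lemma 4.4.1. For every $\alpha > \alpha^*$ there exist $C_\alpha < \infty$ and $\delta_0 > 0$ such that $c_\alpha(\delta) \le C_\alpha\delta$ for all $\delta \in I_0(\alpha)$.

Proof. Recall the statement of Lemma 4.2.2, i.e., for every $\delta \in I_0(\alpha)$ there exists a $\mu_\alpha(\delta) \in [1, \mu_0]$ such that
\[
\psi_{AB}(a_\alpha(\delta), \delta) = \sup_{c \le \min\{a_\alpha(\delta)-1,\ \mu_\alpha(\delta)(a_\alpha(\delta)-2)/(\mu_\alpha(\delta)-1)\}} H(c, a_\alpha(\delta), \mu_\alpha(\delta), \delta) \tag{4.4.5}
\]
with
\[
H(c, a_\alpha(\delta), \mu_\alpha(\delta), \delta) = \frac{1}{a_\alpha(\delta)}\Bigl[c\,\phi^I(\mu_\alpha(\delta), \delta) + (a_\alpha(\delta) - c)\,\kappa\bigl(a_\alpha(\delta) - c,\, 1 - c/\mu_\alpha(\delta)\bigr)\Bigr]. \tag{4.4.6}
\]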
We proved in Lemma 4.2.6 that the supremum is attained in a point $c_\alpha(\delta) > 0$ that tends to zero as $\delta \downarrow 0$. Since $H$ is differentiable w.r.t. its first variable, we have
\[
\frac{\partial H}{\partial 1}(c_\alpha(\delta), a_\alpha(\delta), \mu_\alpha(\delta), \delta) = 0. \tag{4.4.7}
\]
Moreover, since $H$ is also differentiable w.r.t. its second variable, and since the maximum of $\psi_{AB}(a,\delta)$ over $a \in [2,\infty)$ is attained in $a_\alpha(\delta)$, we have
\[
\frac{\partial H}{\partial 2}(c_\alpha(\delta), a_\alpha(\delta), \mu_\alpha(\delta), \delta) = 0. \tag{4.4.8}
\]
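In what follows, we consider three functions $(\delta \mapsto \xi_{i,\alpha}(\delta))_{i=1,2,3}$ that tend to zero as $\delta \downarrow 0$. Since $a_\alpha(\delta)$ tends to $a^*$ by Lemma 4.2.6(i), we use the notation $a_\alpha(\delta) = a^* + \hat a_\alpha(\delta)$. For simplicity, when we do not indicate the point at which a derivative is taken, this point is $(a^*, 1)$ by default. Computing the derivative in (4.4.7) from (4.4.6), we obtain a relation between $c_\alpha(\delta)$ and $a_\alpha(\delta)$. We may simplify this relation by using a first order Taylor expansion of the quantities
\[
\kappa\bigl(a_\alpha(\delta),\, 1 - c_\alpha(\delta)/\mu_\alpha(\delta)\bigr), \qquad \frac{\partial\kappa}{\partial 2}\bigl(a_\alpha(\delta),\, 1 - c_\alpha(\delta)/\mu_\alpha(\delta)\bigr), \qquad \frac{\partial\kappa}{\partial 2}\bigl(a_\alpha(\delta),\, 1 - c_\alpha(\delta)/\mu_\alpha(\delta)\bigr), \tag{4.4.9}
\]
in the neighbourhood of $(a^*, 1)$. This gives, after some straightforward but tedious computations,
\[
\phi^I(\mu_\alpha(\delta), \delta) - \kappa(a^*,1) - \frac{5}{2\mu_\alpha(\delta)}\frac{\partial\kappa}{\partial 2} + c_\alpha(\delta)\, A_{\alpha,\delta} + \hat a_\alpha(\delta)\, B_{\alpha,\delta} + \xi_{1,\alpha}(\delta)\bigl(|c_\alpha(\delta)| + |\hat a_\alpha(\delta)|\bigr) = 0 \tag{4.4.10}
\]
with
\[
A_{\alpha,\delta} = \frac{1}{\mu_\alpha(\delta)}\Bigl[2\frac{\partial\kappa}{\partial 2} + 5\frac{\partial^2\kappa}{\partial 1\partial 2} + \frac{5}{2\mu_\alpha(\delta)}\frac{\partial^2\kappa}{\partial 2^2} + \frac{5\mu_\alpha(\delta)}{2}\frac{\partial^2\kappa}{\partial 1^2}\Bigr], \qquad
B_{\alpha,\delta} = -\frac{1}{\mu_\alpha(\delta)}\Bigl[\frac{\partial\kappa}{\partial 2} + \frac52\frac{\partial^2\kappa}{\partial 1\partial 2} + \frac{5\mu_\alpha(\delta)}{2}\frac{\partial^2\kappa}{\partial 1^2}\Bigr]. \tag{4.4.11}
\]
The same type of computation applied to (4.4.8) gives
\[
\hat a_\alpha(\delta) + \xi_{2,\alpha}(\delta)\,\hat a_\alpha(\delta) = c_\alpha(\delta)\, C_{\alpha,\delta} + \xi_{3,\alpha}(\delta)\, c_\alpha(\delta) \tag{4.4.12}
\]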
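with
\[
C_{\alpha,\delta} = -\Bigl(\frac25\Bigr)^2 \frac{\kappa(a^*,1) - \phi^I(\mu_\alpha(\delta), \delta)}{\partial^2\kappa/\partial 1^2} + 1 + \frac{\partial^2\kappa/\partial 1\partial 2}{\mu_\alpha(\delta)\,\partial^2\kappa/\partial 1^2}. \tag{4.4.13}
\]
Recalling that $c_\alpha(\delta)$ and $\hat a_\alpha(\delta)$ tend to zero as $\delta \downarrow 0$ (by Lemma 4.2.6), we obtain from (4.4.12) that $\hat a_\alpha(\delta) \in [(C_{\alpha,\delta} - \varepsilon)c_\alpha(\delta), (C_{\alpha,\delta} + \varepsilon)c_\alpha(\delta)]$ for all $\varepsilon > 0$ and $\delta$ small enough. From this last inclusion and (4.4.10), we get that there exists a $\delta_1 > 0$ such that, for all $\varepsilon > 0$ and $\delta \le \delta_1$,
\[
\phi^I(\mu_\alpha(\delta), \delta) - \kappa(a^*,1) - \frac{5}{2\mu_\alpha(\delta)}\frac{\partial\kappa}{\partial 2} + c_\alpha(\delta)\bigl[A_{\alpha,\delta} + B_{\alpha,\delta}\, C_{\alpha,\delta} + \varepsilon\bigr] \ge 0. \tag{4.4.14}
\]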
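Abbreviate
\[
\Theta(\delta) = \phi^I(\mu_\alpha(\delta), \delta) - \kappa(a^*,1) - \frac{5}{2\mu_\alpha(\delta)}\frac{\partial\kappa}{\partial 2}. \tag{4.4.15}
\]
Since $(\alpha, \beta_c(\alpha))$ lies in the delocalized region, Proposition 2.3.4 tells us that $\phi^I(\mu_\alpha(\delta), 0) \le \kappa(a^*,1) + \frac{5}{2\mu_\alpha(\delta)}\frac{\partial\kappa}{\partial 2}$. Therefore we can write
\[
\Theta(\delta) \le \phi^I(\mu_\alpha(\delta), \delta) - \phi^I(\mu_\alpha(\delta), 0). \tag{4.4.16}
\]
A simple computation gives that $\phi^I(\mu,\delta) - \phi^I(\mu,0) \le \delta$ for all $\mu \ge 1$ (recall (4.1.1)). Hence $\Theta(\delta) \le \delta$. From (4.4.11) and (4.4.13), we have
\[
A_{\alpha,\delta} + B_{\alpha,\delta}\, C_{\alpha,\delta} = \frac{A}{\mu_\alpha(\delta)^2} + \Theta(\delta)\Bigl(\frac{B}{\mu_\alpha(\delta)} - \frac25\Bigr) \tag{4.4.17}
\]
with
\[
A = \frac{1}{\partial^2\kappa/\partial 1^2}\Bigl[\frac52\frac{\partial^2\kappa}{\partial 1^2}\frac{\partial^2\kappa}{\partial 2^2} - \frac25\Bigl(\frac{\partial\kappa}{\partial 2}\Bigr)^2 - 2\frac{\partial\kappa}{\partial 2}\frac{\partial^2\kappa}{\partial 1\partial 2} - \frac52\Bigl(\frac{\partial^2\kappa}{\partial 1\partial 2}\Bigr)^2\Bigr]
\quad\text{and}\quad
B = \frac{1}{\partial^2\kappa/\partial 1^2}\Bigl[-\Bigl(\frac25\Bigr)^2\frac{\partial\kappa}{\partial 2} - \frac25\frac{\partial^2\kappa}{\partial 1\partial 2}\Bigr]. \tag{4.4.18}
\]
By inserting the values of the derivatives given in Lemma 2.2.1(v–vi), we find that $A < 0$. Thus, recalling that $1 \le \mu_\alpha(\delta) \le \mu_0$ for all $\delta \in I_0(\alpha)$ (by Lemma 4.2.2), we obtain from (4.4.17) that
\[
A_{\alpha,\delta} + B_{\alpha,\delta}\, C_{\alpha,\delta} \le \frac{A}{\mu_0^2} + \Theta(\delta)\Bigl(|B| + \frac25\Bigr). \tag{4.4.19}
\]
Since $\Theta(\delta) \le \delta$, we can now assert that there exists a $\delta_2 > 0$ such that $0 < \delta \le \delta_2$ implies $A_{\alpha,\delta} + B_{\alpha,\delta}\, C_{\alpha,\delta} \le 3A/2\mu_0^2$. Therefore (4.4.14) becomes $\delta + c_\alpha(\delta)\, 3A/2\mu_0^2 \ge 0$ and, consequently, for $\delta_0 = \min\{\delta_1, \delta_2\}$ there exists a $C_\alpha > 0$ such that for all $\delta \in I_0(\alpha)$,
\[
c_\alpha(\delta) \le C_\alpha\delta. \tag{4.4.20}
\]
This completes the proof of Lemma 4.4.1.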
5. Proof of Theorem 1.4.3

In Sect. 5.1 we study a variation of the single linear interface model in which the variable $\mu$ is replaced by a dual variable $\lambda$, which enters into the Hamiltonian rather than in the set of paths. We show that the free energy for this dual model is smooth. In Sect. 5.2 we show that the dual free energy has a non-zero curvature. In Sects. 5.3 and 5.4 we use this to prove that $\phi^I$ and $\psi_{AB}$ are smooth on their localized phases and have a non-zero curvature too. The latter in turn are used in Sect. 5.5 to prove the smoothness of $f$ on $\mathcal{L}$. Key ingredients in the proofs are the implicit function theorem, the exponential tightness of the excursions in the localized phases, and the uniqueness of the maximisers in the variational formulas for $\phi^I$, $\psi_{AB}$ and $f$.
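5.1. Fenchel-Legendre transform of $\phi^I$. We begin by defining the dual of the single interface model. Let $W_L$ be the set of $L$-step directed self-avoiding paths that start at $(0,0)$ and end at $(x,0)$ for some $x \in \{1,\dots,L\}$. For $\pi \in W_L$, let $h(\pi)$ be the number of horizontal steps in $\pi$. For $\lambda \ge 0$, define (recall (2.4.2))
\[
U_L^{\omega,I}(\alpha,\beta;\lambda) = \sum_{\pi\in W_L} e^{-\lambda h(\pi) - H_L^{\omega,I}(\pi)}, \qquad u^I(\alpha,\beta;\lambda) = \lim_{L\to\infty} \frac{1}{L}\log U_L^{\omega,I}(\alpha,\beta;\lambda) \quad \omega\text{-a.s.} \tag{5.1.1}
\]
and
\[
\tilde\kappa(\lambda) = \lim_{L\to\infty} \frac{1}{L}\log \sum_{\pi\in W_L} e^{-\lambda h(\pi)}. \tag{5.1.2}
\]
The convergence $\omega$-a.s. and in mean of the limit defining $u^I(\alpha,\beta;\lambda)$, and the fact that this limit is $\omega$-a.s. constant, follow from the subadditive ergodic theorem (Kingman [8]). Set
\[
\mathcal{L}_u = \bigl\{(\alpha,\beta,\lambda) \in \mathrm{CONE} \times [0,\infty) : u^I(\alpha,\beta;\lambda) > \tilde\kappa(\lambda)\bigr\}, \tag{5.1.3}
\]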
i.e., the region where the dual of the single linear interface model is localized.

Proposition 5.1.1. The function $(\alpha,\beta,\lambda) \mapsto u^I(\alpha,\beta;\lambda)$ is infinitely differentiable on $\mathcal{L}_u$.

Proof. The proof is similar to that of the infinite differentiability of the free energy for the single interface model, proved in Giacomin and Toninelli [6]. Therefore, we only sketch the main steps in the proof and refer to [6] for further details.

Step 1. The claim follows from the Arzela-Ascoli theorem as soon as we prove that for all $(\alpha_0,\beta_0,\lambda_0) \in \mathcal{L}_u$ there exists a neighborhood $V \subset \mathcal{L}_u$ of $(\alpha_0,\beta_0,\lambda_0)$ such that for all $k \in \mathbb{N}$, the $k$th derivative of $L^{-1} E(\log U_L^{\omega,I}(\alpha,\beta;\lambda))$ w.r.t. any of the parameters $\alpha, \beta, \lambda$ is bounded uniformly in $L$ and $(\alpha,\beta,\lambda) \in V$, where $E$ denotes expectation w.r.t. $\omega$. For $a, b \in \mathbb{N}$ with $a < b$, let $\mathcal{H}_{a,b}$ be the set of bounded functions that are measurable w.r.t. the $\sigma$-algebra $\sigma(\pi_j : j \in \{a,\dots,b\})$. As explained in [6], the conditions of the Arzela-Ascoli theorem are satisfied once we show that for all $(\alpha_0,\beta_0,\lambda_0) \in \mathcal{L}_u$ there exist $C_1, C_2 > 0$ and $V \subset \mathcal{L}_u$ such that, for all $a_1, b_1, a_2, b_2 \in \mathbb{N}$ with $a_1 < b_1 < a_2 < b_2 \le L$ and $(f_1, f_2) \in \mathcal{H}_{a_1,b_1} \times \mathcal{H}_{a_2,b_2}$ and $(\alpha,\beta,\lambda) \in V$, the following inequality holds:
\[
E\Bigl[\bigl|E_L^{\omega,I}(f_1 f_2) - E_L^{\omega,I}(f_1)\,E_L^{\omega,I}(f_2)\bigr|\Bigr] \le C_1 \|f_1\|_\infty \|f_2\|_\infty\, e^{-C_2(a_2 - b_1)}. \tag{5.1.4}
\]
Here, $E_L^{\omega,I}$ is expectation w.r.t. the law of the $L$-step copolymer at fixed $\omega$ given by (recall (5.1.1))
\[
P_L^{\omega,I}(\pi) = \frac{1}{U_L^{\omega,I}}\, e^{-\lambda h(\pi) - H_L^{\omega,I}(\pi)}. \tag{5.1.5}
\]
Next, the correlation inequality in (5.1.4) will follow once we show that there exist $C_1, C_2 > 0$ and $V \subset \mathcal{L}_u$ (depending on $\alpha_0, \beta_0, \lambda_0$) such that, for all $a, b, L \in \mathbb{N}$ with $a \le b \le L$, we have
\[
E\bigl([P_L^{\omega,I}]^{\otimes 2}(B_{a,b})\bigr) \le C_1 e^{-C_2(b-a)}, \tag{5.1.6}
\]
where $[P_L^{\omega,I}]^{\otimes 2}$ is the joint law of two independent copies of the $L$-step copolymer at fixed $\omega$, and
\[
B_{a,b} = \{(\pi_1, \pi_2) : \nexists\, j \in \{a,\dots,b\} \text{ such that the } j\text{th steps of } \pi_1 \text{ and } \pi_2 \text{ are the same and occur at the same height}\}. \tag{5.1.7}
\]
Indeed, on $[B_{a,b}]^c$ the two paths can be coupled as soon as they make the common step. An example of a pair of paths $(\pi_1, \pi_2)$ not in $B_{a,b}$ is displayed in Fig. 10.

Step 2. For $i = 1, 2$ and $M \in \mathbb{N}$, let $l_{i,M}$ be the number of excursions of $\pi_i$ (either strictly positive or non-positive) that are included in $\{a,\dots,b\}$ and are smaller than or equal to $M$. Let
\[
E_M(\pi_i) = \{(b_1^i, e_1^i), \dots, (b^i_{l_{i,M}}, e^i_{l_{i,M}})\}, \tag{5.1.8}
\]
where $(b_j^i, e_j^i)$ denote the end-steps of the $j$th excursion. Put $\tau_j^i = e_j^i - b_j^i + 1$, and for $\gamma \in (0,1)$ let
\[
A_{i,\gamma,M} = \Bigl\{\pi_i : \sum_{j=1}^{l_{i,M}} \tau_j^i \ge \gamma(b-a)\Bigr\}. \tag{5.1.9}
\]

Lemma 5.1.2. (i) For all $\gamma_0 \in (0,1)$ and $(\alpha_0,\beta_0,\lambda_0) \in \mathcal{L}_u$ there exist $M \in \mathbb{N}$, an open neighborhood $V$ of $(\alpha_0,\beta_0,\lambda_0)$ in $\mathcal{L}_u$ and $C_1, C_2 > 0$ such that, for $L \ge b \ge a$ and $(\alpha,\beta,\lambda) \in V$,
\[
E\bigl(P_L^{\omega,I}(A_{i,\gamma_0,M})\bigr) \ge 1 - C_1 e^{-C_2(b-a)}, \qquad i = 1, 2.
\]
(ii) For all $T_0 \in \mathbb{N}$ and $(\alpha_0,\beta_0,\lambda_0) \in \mathcal{L}_u$ there exist $\gamma \in (0,1)$, an open neighborhood $V$ of $(\alpha_0,\beta_0,\lambda_0)$ in $\mathcal{L}_u$ and $C_1, C_2 > 0$ such that, for all $L \ge b \ge a$ and $(\alpha,\beta,\lambda) \in V$,
\[
E\bigl(P_L^{\omega,I}(A_{i,\gamma,T_0})\bigr) \le C_1 e^{-C_2(b-a)}, \qquad i = 1, 2.
\]
Fig. 10. A pair of paths (π1 , π2 ) whose jth steps are the same and occur at the same height
Proof.
(i) This part gives the exponential tightness of the excursions of the copolymer in the localized phase. Compared to Proposition 3.1.1, both the model and the statement are different. However, the same tools can be used and for this reason we only give a sketch of the proof. By the definition of $[A_{i,\gamma_0,M}]^c$, there are two cases.
Case 1. The sum of the lengths of the strictly positive excursions larger than $M$ in $\{a,\dots,b\}$ is $\ge \gamma\frac{b-a}{2}$.
Case 2. The sum of the lengths of the non-positive excursions larger than $M$ in $\{a,\dots,b\}$ is $\ge \gamma\frac{b-a}{2}$.
In Case 1, by concatenating the strictly positive excursions larger than $M$ in $\{a,\dots,b\}$, we can bound the total entropy carried by these excursions (i.e., the logarithm of their total cardinality) from above by the entropy of a large single positive excursion whose length is equal to the sum of the lengths of the excursions larger than $M$, which is at least $\gamma\frac{b-a}{2}$. This provides an upper bound for the analogue of the sum in (3.1.14). Next, the gain in the free energy obtained by replacing the large single positive excursion by a path with the same endpoints but no positivity constraint is, for $b-a$ large enough, of order $\exp[C_2(b-a)]$, with $C_2 = \frac{\gamma}{2}[u(\lambda) - \tilde\kappa(\lambda)]$. This provides a lower bound for the normalizing partition sum in (3.1.14). By choosing a small enough open neighborhood $V$ of $(\alpha_0,\beta_0,\lambda_0)$ in $\mathcal{L}_u$, we get that there exists a $c > 0$ such that, for all $(\alpha,\beta,\lambda) \in V$, we have $u(\alpha,\beta;\lambda) - \tilde\kappa(\lambda) \ge c$. Thus, $c\frac{\gamma}{2}$ is a lower bound for $C_2$, uniform in $V$. In Case 2, a similar argument applies.
(ii) Again we only sketch the proof. We partition $\{a,\dots,b\}$ into $\frac{b-a}{R}$ blocks of size $R$. A block is called "good" if it carries only monomers of type $A$. By the law of large numbers, there exists a $c_R > 0$ such that approximately $c_R(b-a)$ of the blocks are good. We can therefore choose $\gamma$ close enough to 1 such that, on $A_{1,\gamma,T}$, at least $\frac{c_R}{2}(b-a)$ of the good blocks are covered only by excursions smaller than $T$. Such blocks are called "good $T$-blocks". Consequently, more than $\frac{R}{T}$ excursions are required to cover a good $T$-block and so at least $\frac{R}{T}$ steps in each good $T$-block are below the interface. Thus, after relaxing the condition $A_{1,\gamma,T}$ in the normalizing partition sum, we can replace on each good $T$-block the excursions smaller than $T$ by a large strictly positive excursion. This does not decrease the entropy, but increases the energy by at least $\beta\frac{R}{T}$ on each good $T$-block. Summed up these energy increases are of order $\frac{c_R}{2}(b-a)\beta\frac{R}{T}$.
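Step 3. Let $D = A_{1,\frac34,M} \cap A_{2,\frac34,M}$ and $T_M = \{E_M(\pi_1) : \pi_1 \in A_{1,\frac34,M}\}$. For $i = 1, 2$ and $E_M \in T_M$, let $J^i(E_M) = \{\pi_i : E_M(\pi_i) = E_M\}$. Then Lemma 5.1.2 applied at $\gamma_0 = \frac34$ implies that there exists $M \in \mathbb{N}$, an open neighborhood $V$ of $(\alpha_0,\beta_0,\lambda_0)$ in $\mathcal{L}_u$ and $C_1, C_2 > 0$ such that for $L \ge b$ and $(\alpha,\beta,\lambda) \in V$ we have $[P_L^{\omega,I}]^{\otimes 2}(D^c) \le 2C_1 e^{-C_2(b-a)}$, so that it remains to estimate $[P_L^{\omega,I}]^{\otimes 2}(B_{a,b} \cap D)$:
\[
[P_L^{\omega,I}]^{\otimes 2}(B_{a,b} \cap D) = \sum_{E_M^1, E_M^2 \in T_M} [P_L^{\omega,I}]^{\otimes 2}\bigl(B_{a,b} \cap \{J^1(E_M^1) \times J^2(E_M^2)\}\bigr) = \sum_{E_M^1, E_M^2 \in T_M} E_L^{\omega,I}\Bigl(1_{\{\pi_2 \in J^2(E_M^2)\}}\, P_L^{\omega,I}\bigl(B_{a,b} \cap \{\pi_1 \in J^1(E_M^1)\} \mid \pi_2\bigr)\Bigr). \tag{5.1.10}
\]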
Next, set $\tilde i = 2$ if $i = 1$ and vice versa, and define
\[
R^i(E_M^1, E_M^2) = \bigl\{j \in \{1,\dots,l_{i,M}\} : b_k^{\tilde i} \text{ or } e_k^{\tilde i} \in \{b_j^i, \dots, e_j^i\} \text{ for some } k \in \{1,\dots,l_{\tilde i,M}\}\bigr\}. \tag{5.1.11}
\]
By the definition of $A_{i,\frac34,M}$ in (5.1.9), for any $E_M^1, E_M^2 \in T_M$ there are at least $\frac14(b-a)$ steps in $\{a,\dots,b\}$ belonging to excursions smaller than $M$, in both $\pi_1$ and $\pi_2$. Therefore we can choose a $C > 0$ small enough such that, for all $E_M^1, E_M^2 \in T_M$, either $|R^{\tilde i}(E_M^1, E_M^2)| \ge C(b-a)/M$ or $|R^i(E_M^1, E_M^2)| \ge C(b-a)/M$. Without loss of generality, we may assume that $|R^1(E_M^1, E_M^2)| \ge C(b-a)/M$. Because of the condition imposed by $B_{a,b}$, for all $j \in R^1(E_M^1, E_M^2)$ the excursion of $\pi_1$ on $\{b_j^1,\dots,e_j^1\}$ has some prohibited parts. Indeed, $\pi_2$ starts or ends an excursion inside $\{b_j^1,\dots,e_j^1\}$, which restricts the possible excursions of $\pi_1$, because $\pi_1$ cannot make the same step as $\pi_2$ at the same height. Moreover, there is only a finite number of possibilities to make an excursion smaller than $M$ and so, for all $j \in R^1(E_M^1, E_M^2)$, relaxing the condition $B_{a,b}$ on $\{b_j^1,\dots,e_j^1\}$ amounts to increasing the probability in (5.1.10) by a factor $Q > 1$ depending only on $M$, i.e.,
\[
P_L^{\omega,I}\bigl(B_{a,b} \cap \{\pi_1 \in J^1(E_M^1)\} \mid \pi_2\bigr) \le Q^{-|R^1(E_M^1, E_M^2)|}\, P_L^{\omega,I}\bigl(\{\pi_1 \in J^1(E_M^1)\}\bigr). \tag{5.1.12}
\]
Therefore, since $|R^1(E_M^1, E_M^2)| \ge C(b-a)/M$, (5.1.10) becomes
\[
[P^{\omega,I}]^{\otimes 2}(B_{a,b} \cap D) \le e^{-C\frac{b-a}{M}\log Q}, \tag{5.1.13}
\]
which proves (5.1.6) and completes the proof of Proposition 5.1.1.
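The following proposition provides the link between $u^I$ and $\phi^I$.

Proposition 5.1.3. For $\lambda \ge 0$,
\[
u^I(\lambda) = \sup_{\rho\in(0,1]} \{-\lambda\rho + \phi^I(1/\rho)\}. \tag{5.1.14}
\]

Proof. For $\rho \in (0,1]$, let $W_L(\rho) = \{\pi \in W_L : h(\pi) = \rho L\}$ and
\[
U_L^{\omega,I}(\lambda,\rho) = \sum_{\pi\in W_L(\rho)} e^{-\lambda h(\pi) - H_L^{\omega,I}(\pi)}. \tag{5.1.15}
\]
By restricting the sum defining $U_L^{\omega,I}(\lambda)$ in (5.1.1) to the set $W_L(\rho)$, we obtain $u^I(\lambda) \ge \lim_{L\to\infty} E[L^{-1}\log U_L^{\omega,I}(\lambda,\rho)] = -\lambda\rho + \phi^I(1/\rho)$. Therefore, optimising over $\rho$, we get $u^I(\lambda) \ge \sup_{\rho\in(0,1]} \{-\lambda\rho + \phi^I(1/\rho)\}$. To prove the reverse inequality, we note that an analogue of the concentration inequality (3.1.4) gives that there exists a $C > 0$ such that, for all $L \in \mathbb{N}$, $\rho \in (0,1]$ and $\varepsilon > 0$,
\[
P\Bigl(\frac{1}{L}\log U_L^{\omega,I}(\lambda,\rho) \ge E\Bigl[\frac{1}{L}\log U_L^{\omega,I}(\lambda,\rho)\Bigr] + \varepsilon\Bigr) \le C \exp\bigl[-\varepsilon^2 L/C(\alpha+\beta)^2\bigr]. \tag{5.1.16}
\]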
Next, we define the event
\[
J(L) = \Bigl\{\exists\, j \in \{1,\dots,L\} : \frac{1}{L}\log U_L^{\omega,I}(\lambda, j/L) \ge E\Bigl[\frac{1}{L}\log U_L^{\omega,I}(\lambda, j/L)\Bigr] + \varepsilon\Bigr\}, \tag{5.1.17}
\]
and abbreviate $E(L) = E[L^{-1}\log U_L^{\omega,I}(\lambda)]$. Then we can write
\[
E(L) \le E\Bigl(\frac{1}{L}\log U_L^{\omega,I}(\lambda)\, 1_{J(L)}\Bigr) + E\Bigl(\frac{1}{L}\log\Bigl(\sum_{j=1}^{L} U_L^{\omega,I}(\lambda, j/L)\Bigr) 1_{[J(L)]^c}\Bigr). \tag{5.1.18}
\]
Trivially, the quantity $L^{-1}\log U_L^{\omega,I}(\lambda)$ can be bounded from above by $\alpha + \tilde\kappa(0)$ (recall (5.1.2)), uniformly in $L$ and $\omega$. Therefore, with the help of the inequality in (5.1.16), we see that the first term in the r.h.s. of (5.1.18) is bounded from above by $(\alpha + \tilde\kappa(0))\, C L \exp[-\varepsilon^2 L/(C(\alpha+\beta)^2)]$, which tends to zero as $L \to \infty$. Moreover, for every $j \in \{1,\dots,L\}$, a standard subadditivity argument gives that $E(L^{-1}\log U_L^{\omega,I}(\lambda, j/L)) \le -\lambda j/L + \phi^I(L/j)$. Therefore, on the event $[J(L)]^c$, we have that $L^{-1}\log U_L^{\omega,I}(\lambda, j/L) \le -\lambda j/L + \phi^I(L/j) + \varepsilon$ for all $j \in \{1,\dots,L\}$. Thus, the second term in the r.h.s. of (5.1.18) is bounded from above by $(\log L)/L + \max_{\rho\in(0,1]} \{-\lambda\rho + \phi^I(1/\rho)\} + \varepsilon$. Letting $L \to \infty$ and $\varepsilon \downarrow 0$, we obtain $\lim_{L\to\infty} E(L) \le \max_{\rho\in(0,1]} \{-\lambda\rho + \phi^I(1/\rho)\}$, which is the reverse inequality we were after.

Since $\rho \mapsto \phi^I(1/\rho)$ is continuous and concave, we can apply the Fenchel-Legendre duality lemma (see Dembo and Zeitouni [2], Lemma 4.5.8), to obtain
\[
\phi^I(\mu) = \inf_{\lambda\ge 0} \{\lambda/\mu + u^I(\lambda)\}, \qquad \mu \ge 1. \tag{5.1.19}
\]
In the same spirit we have
\[
\tilde\kappa(\lambda) = \sup_{\rho\in(0,1]} \{-\lambda\rho + \hat\kappa(1/\rho)\}, \quad \lambda \ge 0, \qquad \hat\kappa(\mu) = \inf_{\lambda\ge 0} \{\lambda/\mu + \tilde\kappa(\lambda)\}, \quad \mu \ge 1. \tag{5.1.20}
\]
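The duality between (5.1.14) and (5.1.19) can be illustrated on a grid. The following is a minimal sketch under stated assumptions: the function `g` is a toy concave, increasing stand-in for $\rho \mapsto \phi^I(1/\rho)$ and is not the free energy of the model, and the grids `RHOS` and `LAMBDAS` are arbitrary discretisations. The code forms the discretised supremum in (5.1.14) and then checks that the discretised infimum in (5.1.19) recovers `g(1/mu)`.

```python
import math

def g(rho):
    """Toy concave, increasing stand-in for rho -> phi^I(1/rho); not the model's free energy."""
    return math.sqrt(rho)

RHOS = [i / 2000 for i in range(1, 2001)]   # grid on (0, 1]
LAMBDAS = [i / 100 for i in range(0, 401)]  # grid on [0, 4]

def u(lam):
    """Discretised version of (5.1.14): sup over rho of -lam*rho + g(rho)."""
    return max(-lam * rho + g(rho) for rho in RHOS)

U = [u(lam) for lam in LAMBDAS]  # tabulate u once on the lambda grid

def phi(mu):
    """Discretised version of (5.1.19): inf over lam >= 0 of lam/mu + u(lam)."""
    return min(lam / mu + u_lam for lam, u_lam in zip(LAMBDAS, U))

for mu in (1.0, 2.0, 4.0):
    print(mu, phi(mu), g(1 / mu))
```

For this toy choice the two printed columns agree up to the discretisation error. The $\lambda$-grid is truncated at 4, which is enough for the values of `mu` used here.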
5.2. Positive and finite curvature of $u^I$. In Propositions 5.1.1–5.1.3 we found that $u^I$ is smooth and is the Fenchel-Legendre transform of $\phi^I$. In Sect. 5.3 we will exploit these properties to obtain information on $\phi^I$. To prepare for this, we first need to show the following. It is immediate from (5.1.1) that $\lambda \mapsto u^I(\alpha,\beta;\lambda)$ is convex. Lemma 5.2.1 and Assumption 5.2.2 below state that it has a strictly positive and finite curvature. To ease the notation, we suppress $\alpha, \beta$ from some of the expressions.

Lemma 5.2.1. For all $(\alpha,\beta,\lambda) \in \mathcal{L}_u$, $\partial^2 u^I(\alpha,\beta;\lambda)/\partial\lambda^2 > 0$.
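Proof. It suffices to prove that for all $(\alpha,\beta,\lambda_0) \in \mathcal{L}_u$ there exist $C, \varepsilon > 0$ such that, for all $\lambda \in I_\varepsilon(\lambda_0) = [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ and $L \ge 1$,
\[
E\Bigl([E_L^{\omega,I}]^{\otimes 2}\bigl([h(\pi_1) - h(\pi_2)]^2\bigr)\Bigr) \ge C L, \tag{5.2.1}
\]
where $E_L^{\omega,I}$ is the expectation w.r.t. the law in (5.1.5), and $\lambda$ is suppressed from the notation.

Step 1. By Lemma 5.1.2(ii), we can assert that for all $T_0 \in \mathbb{N}$ there exist $z_0 \in (0,1)$ and $L_0 \in \mathbb{N}$ such that, for all $L \ge L_0$ and $\lambda \in I_\varepsilon(\lambda_0)$,
\[
E\Bigl(P_L^{\omega,I}\Bigl(\sum_{k=1}^{l_L} \tau_k 1_{\{\tau_k > T_0\}} \ge z_0 L\Bigr)\Bigr) \ge \frac34, \tag{5.2.2}
\]
where $\tau_k$ is the length of the $k$th excursion. Similarly, by Lemma 5.1.2(i), there exists $M_0 \in \mathbb{N}$ with $M_0 > T_0$ and $L_1 \in \mathbb{N}$ such that, for all $L \ge L_1$ and $\lambda \in I_\varepsilon(\lambda_0)$,
\[
E\Bigl(P_L^{\omega,I}\Bigl(\sum_{k=1}^{l_L} \tau_k 1_{\{\tau_k \le M_0\}} \ge \Bigl(1 - \frac{z_0}{2}\Bigr) L\Bigr)\Bigr) \ge \frac34. \tag{5.2.3}
\]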
Let $(j,\sigma) \in \{T_0+1,\dots,M_0\} \times \{-1,+1\}$ and $L \ge L_2 = \max\{L_0, L_1\}$. Define [...] $> 0$, which will complete the proof of (5.2.1). For given $\pi$, we let
\[
T(\pi) = \{(T_1, T_1', \sigma_1), \dots, (T_{l_L}, T_{l_L}', \sigma_{l_L})\} \tag{5.2.7}
\]
denote the starting points, ending points and signs of the $l_L$ excursions of $\pi$ between 0 and $L$. For $r \in \mathbb{N}$, we set
\[
Z_r^L = \{T(\pi) : \pi \in B_L,\ l_L = r\}, \tag{5.2.8}
\]
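and we denote by $\mathcal{E}(T,\sigma)$ the set of excursions of length $T$ and sign $\sigma$. Furthermore, we write $(\varepsilon_1,\dots,\varepsilon_r) \sim T$ as shorthand notation for $(\varepsilon_1,\dots,\varepsilon_r) \in \mathcal{E}(T_1' - T_1, \sigma_1) \times \cdots \times \mathcal{E}(T_r' - T_r, \sigma_r)$. With this notation, we can write the quantity in (5.2.6) as
\[
H_L = \sum_{r,\tilde r} \sum_{T\in Z_r^L} \sum_{\tilde T\in Z_{\tilde r}^L} E\Bigl[\frac{1}{(Z_L^\omega)^2}\Bigl(\prod_{s=1}^{r} Z^\omega_{T,s} \prod_{\tilde s=1}^{\tilde r} Z^\omega_{\tilde T,\tilde s}\Bigr)\Bigr] R^L_{r,T,\tilde r,\tilde T}, \tag{5.2.9}
\]
with $Z_L^\omega$ the total partition sum,
\[
R^L_{r,T,\tilde r,\tilde T} = \sum_{(\varepsilon_1,\dots,\varepsilon_r)\sim T}\ \sum_{(\tilde\varepsilon_1,\dots,\tilde\varepsilon_{\tilde r})\sim \tilde T}\ \prod_{s=1}^{r} \frac{e^{-\lambda h(\varepsilon_s)}}{Z_{T,s}} \prod_{\tilde s=1}^{\tilde r} \frac{e^{-\lambda h(\tilde\varepsilon_{\tilde s})}}{Z_{\tilde T,\tilde s}} \Bigl[\sum_{s=1}^{r} h(\varepsilon_s) - \sum_{\tilde s=1}^{\tilde r} h(\tilde\varepsilon_{\tilde s})\Bigr]^2 \tag{5.2.10}
\]
and (recall (2.3.3))
\[
Z^\omega_{T,s} = \sum_{\varepsilon_s\in\mathcal{E}(T,s)} e^{-\lambda h(\varepsilon_s) - H^{\omega,I}(\varepsilon_s)}, \qquad Z_{T,s} = \sum_{\varepsilon_s\in\mathcal{E}(T,s)} e^{-\lambda h(\varepsilon_s)}. \tag{5.2.11}
\]
Note that $R^L_{r,T,\tilde r,\tilde T}$ does not depend on $\omega$.

Step 3. Putting
\[
X_s = h(\varepsilon_s), \qquad \tilde X_{\tilde s} = h(\tilde\varepsilon_{\tilde s}), \qquad t_0 = z_0/4M_0(M_0 - T_0), \tag{5.2.12}
\]
we note that in $R^L_{r,T,\tilde r,\tilde T}$ the random variables
\[
(X_1, \dots, X_r, \tilde X_1, \dots, \tilde X_{\tilde r}) \tag{5.2.13}
\]
are independent, and that the law of $X_s$ depends on $(T_s' - T_s, \sigma_s)$. Since $(T, \tilde T) \in Z_r^L \times Z_{\tilde r}^L$, there are at least $t_0 L$ excursions of length $j_L$ and sign $\sigma_L$ in $T$ and $\tilde T$. Let $(s_1,\dots,s_{t_0 L})$ and $(\tilde s_1,\dots,\tilde s_{t_0 L})$ be the indices of the $t_0 L$ first such excursions in $T$ and $\tilde T$, put
\[
Y^L_{r,T,\tilde r,\tilde T} = \sum_{s\in\{1,\dots,r\}\setminus\{s_1,\dots,s_{t_0 L}\}} X_s - \sum_{\tilde s\in\{1,\dots,\tilde r\}\setminus\{\tilde s_1,\dots,\tilde s_{t_0 L}\}} \tilde X_{\tilde s}, \tag{5.2.14}
\]
and write (5.2.10) as
\[
R^L_{r,T,\tilde r,\tilde T} = E_{T,\tilde T}\Bigl(\Bigl[\sum_{k=1}^{t_0 L} W_k + Y^L_{r,T,\tilde r,\tilde T}\Bigr]^2\Bigr), \tag{5.2.15}
\]
where $W_k = X_{s_k} - \tilde X_{\tilde s_k}$ and $E_{T,\tilde T}$ denotes expectation w.r.t. the law of (5.2.13). Clearly, $W = (W_k)_{k\in\{1,\dots,t_0 L\}}$ are i.i.d., symmetric and bounded random variables. Denote their variance by $v_L$. We can choose $T_0$ large enough so that the $W_k$ are not constant. Moreover,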
F. den Hollander, N. Pétrélis
since the $W_k$ have only a finite number of laws, there exists an $a > 0$ such that $v_L > a$ for all $\lambda \in I_\varepsilon(\lambda_0)$ and $L \ge L_2$.

Step 4. At this stage, we may assume without loss of generality that $P_{T,\tilde T}(Y^L_{r,T,\tilde r,\tilde T} \ge 0) \ge \tfrac12$. Then (5.2.15) gives
\[ R^L_{r,T,\tilde r,\tilde T} \ge P_{T,\tilde T}\big(Y^L_{r,T,\tilde r,\tilde T} \ge 0\big)\, \tfrac12\, E_{T,\tilde T}\Bigg( \Big( \sum_{k=1}^{t_0 L} W_k \Big)^2 \Bigg) \ge \tfrac14\, E_{(j_L,\sigma_L)}\Bigg( \Big( \sum_{k=1}^{t_0 L} W_k \Big)^2 \Bigg), \tag{5.2.16} \]
where $E_{(j_L,\sigma_L)}$ is expectation w.r.t. the law of $W$. Since the $W_k$ take only values smaller than $2M_0$, their third moments are bounded by some finite $N$ uniformly in $\lambda \in I_\varepsilon(\lambda_0)$ and $(j,\sigma)$. Therefore we can apply the Berry-Esseen theorem and, writing $\xi(u) = P(N(0,1) \le u)$, $u \in \mathbb{R}$, with $N(0,1)$ a standard normal random variable, can assert that, for all $u \in \mathbb{R}$, $\lambda \in I_\varepsilon(\lambda_0)$ and $(j,\sigma)$,
\[ \Bigg| P_{(j,\sigma)}\Bigg( \sum_{k=1}^{t_0 L} W_k \le u \sqrt{t_0 L v_L} \Bigg) - \xi(u) \Bigg| \le \frac{3N}{a^{3/2}\sqrt{t_0 L}}, \tag{5.2.17} \]
where $P_{(j,\sigma)}$ is the law of $W$ when $(j_L,\sigma_L) = (j,\sigma)$. Taking the restriction of the r.h.s. of (5.2.16) to the event $K = \{ \sum_{k=1}^{t_0 L} W_k / \sqrt{t_0 L v_L} \in [1,2] \}$, we obtain
\[ R^L_{r,T,\tilde r,\tilde T} \ge \frac{v_L t_0 L}{4}\, P_{(j,\sigma)}(K) \ge \frac{a t_0 L}{4} \Bigg[ \xi(2) - \xi(1) - \frac{6N}{a^{3/2}\sqrt{t_0 L}} \Bigg], \tag{5.2.18} \]
which implies that $R^L_{r,T,\tilde r,\tilde T} \ge t_0' L$ for $L$ large enough and some $t_0' > 0$. Recalling (5.2.9), we can now estimate
\[ H_L \ge t_0' L\, \mathbb{E}\big( [P_L^{\omega,I}]^{\otimes 2}(B_L) \big) \ge t_0' L / 4(M_0 - T_0), \tag{5.2.19} \]
which yields (5.2.1) with $C = t_0'/4(M_0 - T_0)$.
The following assumption will be needed in Sects. 5.3–5.5.

Assumption 5.2.2. For all $(\alpha,\beta) \in \mathrm{CONE}$ and $\lambda > 0$ there exist $C(\lambda) > 0$ and $\delta_0 > 0$ such that, for all $\delta \in (0,\delta_0]$,
\[ u^I(\lambda-\delta) + u^I(\lambda+\delta) - 2u^I(\lambda) \le C(\lambda)\delta^2. \tag{5.2.20} \]
Although we are not able to prove this assumption, we believe it to be true for the following reason. First, as a consequence of Proposition 5.1.1, we have that, for all $(\alpha,\beta) \in \mathrm{CONE}$, $\lambda \mapsto u(\alpha,\beta;\lambda)$ is infinitely differentiable on the set $\{\lambda \in [0,\infty) : u(\alpha,\beta;\lambda) > \tilde\kappa(\lambda)\}$. Since $\lambda \mapsto \tilde\kappa(\lambda)$ is infinitely differentiable on $[0,\infty)$, this implies that $\lambda \mapsto u(\alpha,\beta;\lambda)$ is infinitely differentiable on the interior of the set $\{\lambda \in [0,\infty) : u(\alpha,\beta;\lambda) = \tilde\kappa(\lambda)\}$. Thus, the assumption only concerns the values of $\lambda$ located at the boundary of the latter. For these values, proving the assumption amounts to proving the reverse of inequality (5.2.1), i.e., showing that the variance of the number of horizontal steps made by the polymer of length $L$ is of order $L$, which we may reasonably expect to be true. In Remark 5.3.3 we give a weaker alternative to Assumption 5.2.2.
5.3. Smoothness of $\phi^I$ in its localized phase. Having collected in Sects. 5.1–5.2 some key properties of the dual free energy $u^I$, we are now ready to look at what these imply for $\phi^I$. We begin by showing that $\phi^I$ is strictly concave.

Lemma 5.3.1. Let
\[ D(\delta) = \tfrac12 \phi^I\Big(\frac{1}{\rho_0+\delta}\Big) + \tfrac12 \phi^I\Big(\frac{1}{\rho_0-\delta}\Big) - \phi^I\Big(\frac{1}{\rho_0}\Big). \tag{5.3.1} \]
Then, for all $(\alpha,\beta) \in \mathrm{CONE}$ and $\rho_0 \in (0,1)$ there exist $C > 0$ and $\delta_0 > 0$ such that, for all $\delta \in (0,\delta_0]$,
\[ D(\delta) \le -C\delta^2. \tag{5.3.2} \]
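This inequality implies the strict concavity of $\rho \mapsto \phi^I(1/\rho)$ on $(0,1]$.

Proof. Lemma 5.2.1 states the strict convexity of $\lambda \mapsto u^I(\lambda)$, which implies the uniqueness of the maximiser in the variational formula (5.1.19), i.e., there exists a unique $\lambda_0 = \lambda_0(\rho) \ge 0$ such that $\phi^I(1/\rho_0) = \lambda_0\rho_0 + u^I(\lambda_0)$. Let $x > 0$. By picking $\lambda = \lambda_0 - x\delta$ in (5.1.19) with $\mu = 1/(\rho_0+\delta)$, and $\lambda = \lambda_0 + x\delta$ in (5.1.19) with $\mu = 1/(\rho_0-\delta)$, we obtain
\[ \begin{aligned} D(\delta) &\le \tfrac12\big[(\lambda_0 - x\delta)(\rho_0+\delta) + u^I(\lambda_0 - x\delta)\big] + \tfrac12\big[(\lambda_0 + x\delta)(\rho_0-\delta) + u^I(\lambda_0 + x\delta)\big] - \lambda_0\rho_0 - u^I(\lambda_0) \\ &= -x\delta^2 + \tfrac12\big[u^I(\lambda_0 - x\delta) + u^I(\lambda_0 + x\delta) - 2u^I(\lambda_0)\big]. \end{aligned} \tag{5.3.3} \]
Picking $x = 1/(2C(\lambda_0))$, with $C(\lambda_0)$ the constant in Assumption 5.2.2, we see that (5.3.3) implies, for $0 < \delta < 2C(\lambda_0)\delta_0$,
\[ D(\delta) \le -x\delta^2 + C(\lambda_0)x^2\delta^2 = -\delta^2/(4C(\lambda_0)), \tag{5.3.4} \]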
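which proves (5.3.2). To prove the claim made below (5.3.2), pick $0 < u < v \le 1$ and consider (5.3.1) at the point $\rho_0 = (u+v)/2$. Then, by (5.3.1–5.3.2), there exists a $0 < \delta < (v-u)/2$ such that
\[ \frac{\phi^I\big(\frac{1}{\rho_0+\delta}\big) - \phi^I\big(\frac{1}{\rho_0}\big)}{\delta} < \frac{\phi^I\big(\frac{1}{\rho_0}\big) - \phi^I\big(\frac{1}{\rho_0-\delta}\big)}{\delta}. \tag{5.3.5} \]
Since $v > \rho_0+\delta > \rho_0-\delta > u$, it follows that
\[ \frac{\partial^-}{\partial\rho}\Big[\phi^I\Big(\frac1\rho\Big)\Big]\Big|_{\rho=v} \le \text{l.h.s. (5.3.5)} < \text{r.h.s. (5.3.5)} \le \frac{\partial^+}{\partial\rho}\Big[\phi^I\Big(\frac1\rho\Big)\Big]\Big|_{\rho=u}, \tag{5.3.6} \]
with $-$ and $+$ denoting the left- and the right-derivative.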
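For readers checking the constants in the proof of Lemma 5.3.1, the arithmetic behind (5.3.3) and (5.3.4) can be written out as below. This is only a worked restatement of those two displays, with $C$ abbreviating $C(\lambda_0)$; no new assumption is used.

```latex
% C abbreviates C(\lambda_0) from Assumption 5.2.2.
\begin{align*}
\tfrac12(\lambda_0 - x\delta)(\rho_0+\delta) + \tfrac12(\lambda_0 + x\delta)(\rho_0-\delta) - \lambda_0\rho_0
  &= \tfrac12\big[2\lambda_0\rho_0 - 2x\delta^2\big] - \lambda_0\rho_0 = -x\delta^2,\\
\tfrac12\big[u^I(\lambda_0 - x\delta) + u^I(\lambda_0 + x\delta) - 2u^I(\lambda_0)\big]
  &\le \tfrac12\, C\, (x\delta)^2 \le C x^2 \delta^2
  && \text{(Assumption 5.2.2 with step $x\delta \le \delta_0$)},\\
\big(-x\delta^2 + C x^2\delta^2\big)\Big|_{x = 1/(2C)}
  &= -\frac{\delta^2}{2C} + \frac{\delta^2}{4C} = -\frac{\delta^2}{4C}.
\end{align*}
```

The restriction $0 < \delta < 2C(\lambda_0)\delta_0$ is exactly the condition $x\delta < \delta_0$ needed to apply Assumption 5.2.2 with step size $x\delta$.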
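We are now ready to prove that $\phi^I$ is smooth. Let
\[ \mathcal{L}_\phi = \big\{ (\alpha,\beta,\mu) \in \mathrm{CONE} \times [1,\infty) : \phi^I(\alpha,\beta;\mu) > \hat\kappa(\mu) \big\}, \tag{5.3.7} \]
i.e., the region where the single linear interface model is localized.

Proposition 5.3.2. $(\alpha,\beta,\mu) \mapsto \phi^I(\alpha,\beta;\mu)$ is infinitely differentiable on $\mathcal{L}_\phi$.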
Proof. Let $(\alpha,\beta,\mu) \in \mathcal{L}_\phi$. Lemma 5.2.1 states the strict convexity of $\lambda \mapsto u^I(\lambda)$ on $\{\lambda : u^I(\lambda) > \tilde\kappa(\lambda)\}$ and it can be shown that $\lambda \mapsto \tilde\kappa(\lambda)$ is strictly convex on $[0,\infty)$. This entails that $\lambda \mapsto u^I(\lambda)$ is strictly convex on $[0,\infty)$. Therefore, the variational formula in (5.1.19) attains its maximum at a unique point $\lambda(\mu) \ge 0$, so that the variational formula in (5.1.14) allows us to write
\[ \phi^I(\mu) = \lambda(\mu)/\mu + \sup_{\rho \in (0,1]} \{ -\lambda(\mu)\rho + \phi^I(1/\rho) \}, \tag{5.3.8} \]
after which the strict concavity of $\rho \mapsto \phi^I(1/\rho)$ (recall Lemma 5.3.1) implies that this supremum is attained uniquely at $\rho = 1/\mu$. Since $\phi^I(\rho) \ge \hat\kappa(\rho)$ for all $\rho$, and $\phi^I(\mu) > \hat\kappa(\mu)$, the variational formula in (5.1.20) allows us to write $u^I(\lambda(\mu)) > \tilde\kappa(\lambda(\mu))$, and therefore $(\alpha,\beta,\lambda(\mu)) \in \mathcal{L}_u$. Next, let
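\[ \mathcal{S} = \big\{ (\alpha,\beta,\mu,\lambda) \in \mathrm{CONE} \times [1,\infty) \times [0,\infty) : (\alpha,\beta,\mu) \in \mathcal{L}_\phi,\ (\alpha,\beta,\lambda) \in \mathcal{L}_u \big\}, \tag{5.3.9} \]
and define $\Upsilon_1$ as
\[ \Upsilon_1 : (\alpha,\beta,\mu,\lambda) \in \mathcal{S} \mapsto \frac{\partial(\lambda/\mu + u^I(\lambda))}{\partial\lambda}. \tag{5.3.10} \]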
We want to apply the implicit function theorem in Bredon [1], Chapter II, Theorem 1.5, to $\Upsilon_1$. This requires checking three properties:
(i) $\Upsilon_1$ is infinitely differentiable on $\mathcal{S}$.
(ii) For all $(\alpha,\beta,\mu) \in \mathcal{L}_\phi$, $\lambda(\mu)$ is the unique $\lambda \in [0,\infty)$ such that $(\alpha,\beta,\lambda) \in \mathcal{L}_u$ and $\Upsilon_1(\alpha,\beta,\mu,\lambda(\mu)) = 0$.
(iii) For all $(\alpha,\beta,\mu) \in \mathcal{L}_\phi$, $\frac{\partial\Upsilon_1}{\partial\lambda}(\alpha,\beta,\mu,\lambda(\mu)) \ne 0$.
Property (i) holds because $u^I$ is infinitely differentiable on $\mathcal{L}_u$ (by Proposition 5.1.1). Property (ii) holds because $\lambda \mapsto u^I(\lambda)$ is strictly convex (by Lemma 5.2.1). Moreover, Lemma 5.2.1 gives that
\[ \frac{\partial\Upsilon_1}{\partial\lambda}(\alpha,\beta,\mu,\lambda(\mu)) = \frac{\partial^2 u^I}{\partial\lambda^2}(\alpha,\beta,\lambda(\mu)) > 0, \tag{5.3.11} \]
so property (iii) holds too. We can therefore indeed use the implicit function theorem, obtaining that $(\alpha,\beta,\mu) \mapsto \lambda(\mu)$ and $(\alpha,\beta,\mu) \mapsto \phi^I(\alpha,\beta;\mu)$ are infinitely differentiable on $\mathcal{L}_\phi$.

Remark 5.3.3. Assumption 5.2.2 can be weakened. Namely, instead of assuming finite curvature of $\lambda \mapsto u(\alpha,\beta;\lambda)$, we may assume strict concavity of $\mu \mapsto \mu\phi^I(\mu)$ (which is already known to be concave). This strict concavity (which is implied by Assumption 5.2.2, Lemma 5.3.1 and (5.4.1)) is sufficient to guarantee, in the proof of Proposition 5.3.2, that $\lambda(\mu)$ in (5.3.8) is unique and satisfies $(\alpha,\beta,\lambda(\mu)) \in \mathcal{L}_u$. The latter in turn is enough to carry out the rest of the proof.
5.4. Smoothness of $\psi_{AB}$ in its localized phase. In this section we transport the properties of $\phi^I$ obtained in Sect. 5.3 to $\psi_{AB}$. We begin with some elementary observations. Fix $(\alpha,\beta) \in \mathrm{CONE}$ and recall (2.3.4). By Lemma 5.3.1 and Lemma 2.2.1(ii), for all $a \ge 2$, $(c,b) \mapsto c\phi^I(c/b)$ and $(c,b) \mapsto (a-c)\kappa(a-c,1-b)$ are strictly concave on $\mathrm{DOM}(a)$. Consequently, for all $a \ge 2$, the supremum of the variational formula in (2.3.6) is attained at a unique pair $(c,b) \in \mathrm{DOM}(a)$ (use that $\mathrm{DOM}(a)$ is a convex set). Next, note that Lemma 5.3.1 and Proposition 5.3.2 imply that for all $(\alpha,\beta,\rho_0) \in \mathcal{L}_\phi$ there exists a $C > 0$ such that
\[ \frac{\partial^2}{\partial\rho^2}\big[\rho\phi^I(\rho)\big](\rho_0) = \frac{1}{\rho_0^3}\, \frac{\partial^2}{\partial\rho^2}\big[\phi^I(1/\rho)\big]\Big(\frac{1}{\rho_0}\Big) \le -C. \tag{5.4.1} \]
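The equality in (5.4.1) is a chain-rule identity. A short derivation is sketched below; the auxiliary functions $g$ and $h$ are names introduced here for illustration only, and $\phi^I$ is assumed twice differentiable at the relevant point.

```latex
% Auxiliary notation (assumed for this sketch): h(r) = \phi^I(1/r), g(\rho) = \rho\,\phi^I(\rho).
% Then \phi^I(\rho) = h(1/\rho), so g(\rho) = \rho\, h(1/\rho).
\begin{align*}
g'(\rho)  &= h(1/\rho) - \frac{1}{\rho}\, h'(1/\rho),\\
g''(\rho) &= -\frac{1}{\rho^2}\, h'(1/\rho) + \frac{1}{\rho^2}\, h'(1/\rho) + \frac{1}{\rho^3}\, h''(1/\rho)
           = \frac{1}{\rho^3}\, h''(1/\rho).
\end{align*}
```

Evaluating at $\rho = \rho_0$ gives the equality in (5.4.1), so the curvature bound for $\rho \mapsto \phi^I(1/\rho)$ at the point $1/\rho_0$ transfers to $\rho \mapsto \rho\phi^I(\rho)$ at $\rho_0$, up to the positive factor $1/\rho_0^3$.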
Let Lψ = {(α, β, a) ∈ CONE × [2, ∞) : ψ AB (α, β; a) > },
(5.4.2)
i.e., the region where $\psi_{AB}$ is localized. Our main result in this section is the following.

Proposition 5.4.1. $(\alpha,\beta,a) \mapsto \psi_{AB}(\alpha,\beta;a)$ is infinitely differentiable on $\mathcal{L}_\psi$.

Proof. Define
\[ \mathcal{L}_{\alpha,\beta,a} = \{ (c,b) \in \mathrm{DOM}(a) : \phi^I(\alpha,\beta;c/b) > \hat\kappa(c/b) \}. \tag{5.4.3} \]
As noted above, the variational formula in (2.3.6) attains its maximum at a unique pair $(c(\alpha,\beta;a), b(\alpha,\beta;a)) \in \mathrm{DOM}(a)$. We write $(c(a),b(a))$, suppressing $(\alpha,\beta)$ from the notation. Since $(\alpha,\beta) \in \mathcal{L}$ (recall (1.3.1)), Lemma 2.2.2 (iv) and Proposition 2.3.4 imply that $(c(a),b(a)) \in \mathcal{L}_{\alpha,\beta,a}$. Let
\[ F(c,b) = c\,\phi^I(c/b), \qquad \tilde F(c,b) = (a-c)\,\kappa(a-c,1-b), \tag{5.4.4} \]
and denote by $\{F_c, F_b, F_{cc}, F_{cb}, F_{bb}\}$ the partial derivatives of order 1 and 2 of $F$ with respect to the variables $c$ and $b$ (and similarly for $\tilde F$). By the strict concavity of $(c,b) \mapsto F(c,b) + \tilde F(c,b)$ in $\mathrm{DOM}(a)$, we know that $(c(a),b(a))$ is also the unique pair in $\mathcal{L}_{\alpha,\beta,a}$ at which $F_c + \tilde F_c = 0$ and $F_b + \tilde F_b = 0$. We need to show that $(c(a),b(a))$ is infinitely differentiable w.r.t. $(\alpha,\beta,a)$. To that aim we again use the implicit function theorem. Define
\[ \mathcal{R} = \{ (\alpha,\beta,a,c,b) : (\alpha,\beta,a) \in \mathcal{L}_\psi,\ (c,b) \in \mathcal{L}_{\alpha,\beta,a} \} \tag{5.4.5} \]
and
\[ \Upsilon_2 : (\alpha,\beta,a,c,b) \in \mathcal{R} \mapsto (F_c + \tilde F_c,\ F_b + \tilde F_b). \tag{5.4.6} \]
Let $J_2$ be the Jacobian determinant of $\Upsilon_2$ as a function of $(c,b)$. Applying the implicit function theorem to $\Upsilon_2$ requires checking three properties:
(i) $\Upsilon_2$ is infinitely differentiable on $\mathcal{R}$.
(ii) For all $(\alpha,\beta,a) \in \mathcal{L}_\psi$, $(c(a),b(a))$ is the only pair in $\mathcal{L}_{\alpha,\beta,a}$ satisfying $\Upsilon_2 = 0$.
(iii) For all $(\alpha,\beta,a) \in \mathcal{L}_\psi$, $J_2 \ne 0$ in $(c(a),b(a))$.
As explained below (5.4.4), property (ii) holds. Proposition 5.3.2 and Lemma 2.2.2 (ii) show that also property (i) holds. Computing the Jacobian determinant $J_2$, we get
\[ J_2 = (F_{cc} + \tilde F_{cc})(F_{bb} + \tilde F_{bb}) - (F_{cb} + \tilde F_{cb})^2. \tag{5.4.7} \]
Since $F_{cc}F_{bb} - F_{cb}^2 = 0$, $F_{bb} = \mu^2 F_{cc}$ and $F_{cb} = -\mu F_{cc}$ with $\mu = c/b$, (5.4.7) becomes
\[ J_2 = \tilde F_{cc}\tilde F_{bb} - \tilde F_{cb}^2 + F_{cc}\big[\tilde F_{bb} + 2\mu\tilde F_{cb} + \mu^2\tilde F_{cc}\big]. \tag{5.4.8} \]
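The relations between the second derivatives of $F$ used to pass from (5.4.7) to (5.4.8) can be checked directly. The sketch below assumes $F(c,b) = c\,\phi^I(c/b)$ as in (5.4.4) and writes $\mu = c/b$; the auxiliary function $g(\mu) = \mu\phi^I(\mu)$ is introduced here only for illustration. With this convention one finds $F_{cb} = -\mu F_{cc}$, which is the sign that produces the cross term $+2\mu\tilde F_{cb}$ in (5.4.8).

```latex
% Assumed notation for this sketch: \mu = c/b, g(\mu) = \mu\,\phi^I(\mu), so that F(c,b) = b\,g(c/b).
\begin{align*}
F_c &= g'(\mu), & F_b &= g(\mu) - \mu\, g'(\mu),\\
F_{cc} &= \frac{1}{b}\, g''(\mu), & F_{cb} &= -\frac{\mu}{b}\, g''(\mu) = -\mu F_{cc}, &
F_{bb} &= \frac{\mu^2}{b}\, g''(\mu) = \mu^2 F_{cc},
\end{align*}
so that $F_{cc}F_{bb} - F_{cb}^2 = 0$ and
\begin{align*}
J_2 &= \underbrace{F_{cc}F_{bb} - F_{cb}^2}_{=0}
      + \tilde F_{cc}\tilde F_{bb} - \tilde F_{cb}^2
      + F_{cc}\tilde F_{bb} + F_{bb}\tilde F_{cc} - 2F_{cb}\tilde F_{cb}\\
    &= \tilde F_{cc}\tilde F_{bb} - \tilde F_{cb}^2
      + F_{cc}\big[\tilde F_{bb} + 2\mu\tilde F_{cb} + \mu^2\tilde F_{cc}\big].
\end{align*}
```

The vanishing determinant reflects that $F$ is homogeneous of degree one in $(c,b)$, so its Hessian is degenerate along the direction $(c,b)$; the strictness of $J_2 > 0$ therefore has to come from $\tilde F$.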
By the concavity of $c \mapsto F(c,b)$ and $c \mapsto \tilde F(c,b)$, we have $F_{cc} \le 0$ and $\tilde F_{cc} \le 0$. Moreover, by the concavity of $(c,b) \mapsto \tilde F(c,b)$, its Hessian matrix necessarily has two non-positive eigenvalues. Therefore, the determinant of this matrix is non-negative, i.e., $\tilde F_{cc}\tilde F_{bb} - \tilde F_{cb}^2 \ge 0$. This, together with the inequality $\tilde F_{cc} \le 0$, implies that $\mu \mapsto \tilde F_{bb} + 2\mu\tilde F_{cb} + \mu^2\tilde F_{cc}$ is non-positive on $\mathbb{R}$. Hence $J_2 \ge 0$.

Lemma 5.4.2. $\tilde F_{cc}\tilde F_{bb} - \tilde F_{cb}^2 > 0$.
Proof. The strict inequality can be checked with MAPLE. In [7], an explicit variational formula is given for the entropy function in (2.2.2), which is easily implemented.

It follows from Lemma 5.4.2 that $J_2 > 0$, which proves property (iii). We know from Lemma 2.2.1 (ii) and Proposition 5.3.2 that $\tilde F$ and $F$ are infinitely differentiable on $\mathrm{DOM}(a)$ for all $a \in [2,\infty)$. Hence, the claim indeed follows from the implicit function theorem.

We close this section with the following observations needed in Sect. 5.5.

Lemma 5.4.3. Fix $(\alpha,\beta) \in \mathrm{CONE}$.
(i) For all $k,l \in \{A,B\}$, $a \mapsto a\psi_{kl}(a)$ is strictly concave on $[2,\infty)$.
(ii) For all $k,l \in \{A,B\}$ with $kl \ne BB$, $\lim_{a\to\infty} a\psi_{kl}(a) = \infty$.
(iii) For all $k,l \in \{A,B\}$, $\lim_{a\to\infty} \partial[a\psi_{kl}(a)]/\partial a \le 0$.

Proof.
(i) This is a straightforward consequence of the observations made at the beginning of this section, together with the strict concavity of $\mu \mapsto \mu\phi^I(\mu)$ proved in Lemma 5.3.1.
(ii) Because $\psi_{AB} \ge \psi_{AA}$, it suffices to consider $kl \in \{AA, BA\}$. For $kl = AA$, the claim is immediate from Lemma 2.2.1(iii) and (2.3.1). For $kl = BA$, we use the fact that $\phi^I(\mu) \ge \hat\kappa(\mu)$ (recall (2.3.8)) in combination with the variational formula of Lemma 2.3.2 with $c = a - \tfrac32$ and $b = \tfrac12$. This gives
\[ a\psi_{BA}(a) \ge \tfrac12(2a-3)\,\hat\kappa(2a-3) + \tfrac32\big[\kappa(\tfrac32,\tfrac12) + \tfrac12(\beta-\alpha)\big], \tag{5.4.9} \]
which yields the claim because $\mu\hat\kappa(\mu) \sim \log\mu$ as $\mu \to \infty$ by Lemma 2.2.2(iii).
(iii) Since, for all $k,l \in \{A,B\}$, $\psi_{AB} \ge \psi_{kl}$ and $a \mapsto a\psi_{kl}(a)$ is concave, it suffices to prove that $\limsup_{a\to\infty} \psi_{AB}(a) \le 0$. The latter is immediate from the variational formula in (2.3.6) and the fact that $\lim_{a\to\infty}\phi^I(a) = 0$ (Lemma 4.2.6(i)) and $\lim_{a\to\infty}\kappa(a,1) = 0$ ((2.2.3)).
5.5. Smoothness of $f$ on $\mathcal{L}$. We begin by proving the uniqueness of the maximisers in the variational formula in (2.1.11). For $(\alpha,\beta) \in \mathrm{CONE}$, $p \in (0,1)$ and $(\rho_{kl}) \in \mathcal{R}(p)$, let (recall (2.1.9))
\[ \begin{aligned} f_{(\rho_{kl})} &= \sup_{(a_{kl}) \in \mathcal{A}} V((\rho_{kl}),(a_{kl})), \\ O_{(\rho_{kl})} &= \{ kl \in \{A,B\}^2 : \rho_{kl} > 0 \}, \\ \mathcal{R}^f(p) &= \{ (\rho_{kl}) \in \mathcal{R}(p) : f = f_{(\rho_{kl})} \}, \\ \mathcal{P}(p) &= \bigcup_{(\rho_{kl}) \in \mathcal{R}^f(p)} O_{(\rho_{kl})}. \end{aligned} \tag{5.5.1} \]

Proposition 5.5.1. (i) For every $(\alpha,\beta) \in \mathrm{CONE}$, $p \in (0,1)$ and $\rho = (\rho_{kl}) \in \mathcal{R}(p)$, there exists a unique family $a^\rho = (a^\rho_{kl})_{kl \in O_\rho} \in \mathcal{A}$ satisfying
\[ f_\rho = \frac{\sum_{kl \in O_\rho} \rho_{kl}\, a^\rho_{kl}\, \psi_{kl}(a^\rho_{kl})}{\sum_{kl \in O_\rho} \rho_{kl}\, a^\rho_{kl}} = V(\rho,a^\rho). \tag{5.5.2} \]
(ii) For every $(\alpha,\beta) \in \mathrm{CONE}$ and $p \in (0,1)$, $\mathcal{R}^f(p) \ne \emptyset$, and there exists a unique family $(a^*_{kl})_{(k,l) \in \mathcal{P}(p)}$ such that $a^\rho_{kl} = a^*_{kl}$ for all $\rho \in \mathcal{R}^f(p)$ and $kl \in O_\rho$.

Proof. Recall Theorem 2.1.1.
(i) The case $\rho_{BB} = 1$ is trivial. In that case we have $f_\rho = \sup_{a_{BB} \ge 2} \psi_{BB}(a_{BB}) = \psi_{BB}(a^*) = \tfrac12\beta\, +$ (by Lemma 2.2.1(iv)), and so $a^\rho_{BB} = a^* = \tfrac52$. Therefore assume that $\rho_{BB} < 1$. Then at least one pair $k_1l_1 \in \{AA, AB, BA\}$ satisfies $\rho_{k_1l_1} > 0$, and since $\lim_{u\to\infty} u\psi_{k_1l_1}(u) = \infty$ by Lemma 5.4.3 (ii), we have $f_\rho > 0$. The latter is needed in what follows. To prove existence of $a^\rho$, for $R > 0$ let
\[ f_{\rho,R} = \sup_{a \in [2,R]^{O_\rho}} V(\rho,a). \tag{5.5.3} \]
We prove that for $R$ large enough the supremum in (5.5.2) is attained in $[2,R]^{O_\rho}$, i.e., $f_\rho = f_{\rho,R}$. Indeed, for $a \in \mathcal{A}$, $\rho \in \mathcal{R}(p)$ and $k_2l_2 \in \{A,B\}^2$ we have (recall (2.1.9))
\[ \frac{\partial V}{\partial a_{k_2l_2}}(\rho,a) = \frac{\rho_{k_2l_2}}{\sum_{kl} \rho_{kl} a_{kl}} \Big[ \frac{\partial[u\psi_{k_2l_2}(u)]}{\partial u}\Big|_{u = a_{k_2l_2}} - V(\rho,a) \Big]. \tag{5.5.4} \]
Moreover, for every $kl \in \{A,B\}^2$, $u \mapsto u\psi_{kl}(u)$ is strictly concave and $u \mapsto \partial[u\psi_{kl}(u)]/\partial u$ is strictly decreasing (by Lemma 5.4.3(i)) and converges to a limit $\le 0$ as $u \to \infty$ (by Lemma 5.4.3(iii)). Pick $R > 0$ large enough so that $\partial[u\psi_{kl}(u)]/\partial u \le f_\rho/2$ for all $u \ge R$ and $kl \in \{A,B\}^2$. We will show that $f_\rho > f_{\rho,R}$ implies that $V(\rho,a) \le \max\{f_\rho/2, f_{\rho,R}\}$ for all $a \in \mathcal{A} \setminus [2,R]^{O_\rho}$, and this will provide a contradiction. To achieve the latter, assume that $AA \in O_\rho$ and consider, for instance, $a \in \mathcal{A}$ such that $a_{AA} > R$ and $a_{kl} \le R$ for $kl \in O_\rho \setminus \{AA\}$. Fix $x \ge R$ and denote by $a^x$ the element given by $a^x_{AA} = x$ and $a^x_{kl} = a_{kl}$, $kl \in O_\rho \setminus \{AA\}$. Since $a^R \in [2,R]^{O_\rho}$, we have $V(\rho,a^R) \le f_{\rho,R} < f_\rho$ and
\[ V(\rho,a^x) - V(\rho,a^R) = \int_R^x \frac{\partial V}{\partial a_{AA}}(\rho,a^u)\, du. \tag{5.5.5} \]
Since, by (5.5.4), the sign of $(\partial V/\partial a_{AA})(\rho,a^u)$ is equal to the sign of $\partial[u\psi_{AA}(u)]/\partial u - V(\rho,a^u)$, it follows that $V(\rho,a^x)$ decreases with $x$ whenever $V(\rho,a^x) \ge f_\rho/2$. Since $V(\rho,a^R) < f_\rho$, we therefore have $V(\rho,a^x) \le \max\{f_\rho/2, f_{\rho,R}\}$ for all $x \ge R$ and, consequently, $V(\rho,a) \le \max\{f_\rho/2, f_{\rho,R}\}$. Therefore the supremum of (5.5.2) is attained in $[2,R]^{O_\rho}$. The uniqueness of $a^\rho$ realising $f_\rho = V(\rho,a^\rho)$ follows from (5.5.4), because for each $kl \in O_\rho$ we must have $(\partial V/\partial a_{kl})(\rho,a^\rho) = 0$. This means that for each $kl \in O_\rho$ we must have
\[ \frac{\partial[u\psi_{kl}(u)]}{\partial u}\Big|_{u = a^\rho_{kl}} = V(\rho,a^\rho) = \sup_{a \in \mathcal{A}} V(\rho,a), \tag{5.5.6} \]
and, since $u \mapsto u\psi_{kl}(u)$ is strictly concave (by Lemma 5.4.3(i)), there is only one such $a_{kl}$ for each $kl \in O_\rho$.
(ii) As shown in [7], Proposition 3.2.1, $\rho \mapsto f_\rho$ is continuous on $\mathcal{R}(p)$. Therefore, the compactness of $\mathcal{R}(p)$ entails $\mathcal{R}^f(p) \ne \emptyset$. Consider $\rho_1, \rho_2 \in \mathcal{R}^f(p)$ and $kl \in O_{\rho_1} \cap O_{\rho_2}$. Then (5.5.4) also gives
\[ \frac{\partial[u\psi_{kl}(u)]}{\partial u}\Big|_{u = a^{\rho_1}_{kl}} = f = \frac{\partial[u\psi_{kl}(u)]}{\partial u}\Big|_{u = a^{\rho_2}_{kl}}, \tag{5.5.7} \]
which, by the strict concavity of $u \mapsto u\psi_{kl}(u)$, implies that $a^{\rho_1}_{kl} = a^{\rho_2}_{kl}$.
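Formula (5.5.4), on which both the existence and the uniqueness parts of the proof of Proposition 5.5.1 rest, is an instance of the quotient rule. A sketch follows, assuming (consistently with (5.5.2)) that $V(\rho,a)$ is the ratio written below; the shorthand $g_{kl}$, $N$ and $D$ is introduced here for illustration only.

```latex
% Assumed shorthand: g_{kl}(u) = u\,\psi_{kl}(u),
%   N(a) = \sum_{kl} \rho_{kl}\, g_{kl}(a_{kl}),   D(a) = \sum_{kl} \rho_{kl}\, a_{kl},   V = N/D.
\begin{align*}
\frac{\partial V}{\partial a_{k_2l_2}}
  &= \frac{\rho_{k_2l_2}\, g_{k_2l_2}'(a_{k_2l_2})\, D - N\, \rho_{k_2l_2}}{D^2}
   = \frac{\rho_{k_2l_2}}{D}\Big[ g_{k_2l_2}'(a_{k_2l_2}) - V(\rho,a) \Big].
\end{align*}
```

In particular, at an interior maximiser every active coordinate satisfies $g_{kl}'(a_{kl}) = V(\rho,a)$, which is (5.5.6); strict monotonicity of $g_{kl}'$ then pins down $a_{kl}$ uniquely.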
We are now ready to prove the smoothness of $f$ on $\mathcal{L}$. Because of the inequalities $\psi_{AA} \ge \psi_{BB}$ and $\psi_{AB} \ge \psi_{BA}$, the concavity of $a \mapsto a\psi_{AA}(a)$ and $a \mapsto a\psi_{AB}(a)$ implies that the variational problem in (2.1.11) reduces to the matrices $\{M_\gamma, \gamma \in \mathcal{C}\}$, with $M_\gamma$ the matrix and $\mathcal{C}$ the set defined in (2.1.8). Write $V(\gamma, a_{AB}, a_{AA})$ for the quantity $V(M_\gamma, (a_{AB}, a_{AA}, 0, 0))$ defined in (2.1.9), put $\gamma^* = \max \mathcal{C}$ and let $(x^*(\alpha,\beta), y^*(\alpha,\beta))$ be the unique maximisers $(a^*_{AB}, a^*_{AA})$ defined in Proposition 5.5.1. By differentiating the quantity $V(\gamma, x^*, y^*)$ with respect to $\gamma$, we easily get that $\mathcal{R}^f(p)$ contains only the matrix $M_{\gamma^*}$. Thus, we have the equality
\[ f(\alpha,\beta) = V(\gamma^*, x^*, y^*) = \frac{\gamma^* x^* \psi_{AB}(x^*) + (1-\gamma^*)\, y^* \kappa(y^*,1)}{\gamma^* x^* + (1-\gamma^*)\, y^*}. \tag{5.5.8} \]
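Since $(\alpha,\beta) \in \mathcal{L}$, we have $\psi_{AB}(x^*) >$ and therefore $(\alpha,\beta,x^*) \in \mathcal{L}_\psi$. To show that $f$ is infinitely differentiable on $\mathcal{L}$, we once more use the implicit function theorem. For that we define
\[ \mathcal{N} = \{ (\alpha,\beta,x,y) : (\alpha,\beta) \in \mathcal{L},\ (\alpha,\beta,x) \in \mathcal{L}_\psi,\ y > 2 \} \tag{5.5.9} \]
and
\[ \Upsilon_3 : (\alpha,\beta,x,y) \in \mathcal{N} \mapsto \Big( \frac{\partial V}{\partial x}(\gamma^*,x,y),\ \frac{\partial V}{\partial y}(\gamma^*,x,y) \Big). \tag{5.5.10} \]
Let $J_3$ be the Jacobian determinant of $\Upsilon_3$ as a function of $(x,y)$. To apply the implicit function theorem we must check three properties:
(i) $\Upsilon_3$ is infinitely differentiable on $\mathcal{N}$.
(ii) For all $(\alpha,\beta) \in \mathcal{L}$, $(x^*,y^*)$ is the only pair in $[2,\infty)^2$ satisfying $(\alpha,\beta,x,y) \in \mathcal{N}$ and $\Upsilon_3(\alpha,\beta,x,y) = 0$.
(iii) For all $(\alpha,\beta) \in \mathcal{L}$, $J_3 \ne 0$ in $(\alpha,\beta,x^*,y^*)$.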
It follows from Lemma 2.2.1(ii), Proposition 5.4.1 and (5.5.8) that properties (i) and (ii) hold. To get property (iii), abbreviate $x\psi_{AB}(x) = \psi(x)$, $y\kappa(y,1) = \kappa(y)$. From Lemma 2.2.1(ii) and Proposition 5.4.1, we know that $\psi$ and $\kappa$ are infinitely differentiable. By (5.5.10),
\[ J_3 = \frac{\partial^2 V}{\partial x^2}\, \frac{\partial^2 V}{\partial y^2} - \Big( \frac{\partial^2 V}{\partial x\, \partial y} \Big)^2. \tag{5.5.11} \]
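The determinant (5.5.11) simplifies considerably at a critical point of $V$. The following sketch uses the abbreviations $\psi(x) = x\psi_{AB}(x)$ and $\kappa(y) = y\kappa(y,1)$ from the text, writes primes for derivatives of these one-variable functions, and assumes that $V(\gamma,x,y)$ has the ratio form appearing in (5.5.8); the shorthand $D$ and the explicit value of the prefactor are illustrative and not taken from the original.

```latex
% Assumed shorthand: D = \gamma x + (1-\gamma) y, so that
%   V(\gamma,x,y) = [\gamma\,\psi(x) + (1-\gamma)\,\kappa(y)] / D.
\begin{align*}
V_x &= \frac{\gamma}{D}\big[\psi'(x) - V\big], &
V_y &= \frac{1-\gamma}{D}\big[\kappa'(y) - V\big].
\end{align*}
At a point where $V_x = V_y = 0$ one has $\psi'(x) = V = \kappa'(y)$, and differentiating once more,
\begin{align*}
V_{xx} &= \frac{\gamma}{D}\,\psi''(x), &
V_{xy} &= 0, &
V_{yy} &= \frac{1-\gamma}{D}\,\kappa''(y),\\
J_3 &= V_{xx}V_{yy} - V_{xy}^2 = \frac{\gamma(1-\gamma)}{D^2}\,\psi''(x)\,\kappa''(y).
\end{align*}
```

Under these assumptions the prefactor $\gamma(1-\gamma)/D^2$ is strictly positive whenever $0 < \gamma < 1$, so the sign of $J_3$ at the maximiser is the sign of the product of the two second derivatives.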
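Taking into account that $(\partial V/\partial x)(x^*,y^*) = (\partial V/\partial y)(x^*,y^*) = 0$, we deduce from (5.5.8) that $\psi'(x^*) = \kappa'(y^*)$ and $J_3 = c^*\psi''(x^*)\kappa''(y^*)$, where $c^* > 0$ is a constant depending on $(x^*,y^*)$. We already know from Lemma 2.2.1(iii) that $\kappa''(y^*) < 0$.

Lemma 5.5.2. $\psi''(x^*) < 0$.

Proof. For $x > 2$ satisfying $(\alpha,\beta,x) \in \mathcal{L}_\psi$, we will show that $(x\psi_{AB}(x))'' < 0$. For this it suffices to show that there exists a $C > 0$ such that, for $\delta$ small enough,
\[ T(\delta) = \tfrac12\big[(x+\delta)\psi_{AB}(x+\delta) + (x-\delta)\psi_{AB}(x-\delta) - 2x\psi_{AB}(x)\big] \le -C\delta^2. \tag{5.5.12} \]
Set $x_{-\delta} = x - \delta$ and $x_\delta = x + \delta$, and let $(e_{-\delta}, b_{-\delta})$ and $(e_\delta, b_\delta)$ be the unique maximisers of (2.3.6) at $x_{-\delta}$ and $x_\delta$. Pick $(c,b) = (\tfrac12(e_{-\delta}+e_\delta), \tfrac12(b_{-\delta}+b_\delta))$ in (2.3.6). Since $x = \tfrac12(x_{-\delta}+x_\delta)$, we obtain $T(\delta) \le V_1(\delta) + V_2(\delta)$ with
\[ \begin{aligned} V_1(\delta) &= \tfrac12\Big[ e_{-\delta}\,\phi^I\Big(\frac{e_{-\delta}}{b_{-\delta}}\Big) + e_\delta\,\phi^I\Big(\frac{e_\delta}{b_\delta}\Big) - (e_{-\delta}+e_\delta)\,\phi^I\Big(\frac{e_{-\delta}+e_\delta}{b_{-\delta}+b_\delta}\Big) \Big], \\ V_2(\delta) &= \tfrac12\Big[ (x_{-\delta}-e_{-\delta})\,\kappa(x_{-\delta}-e_{-\delta}, 1-b_{-\delta}) + (x_\delta-e_\delta)\,\kappa(x_\delta-e_\delta, 1-b_\delta) \\ &\qquad - (x_{-\delta}+x_\delta-e_{-\delta}-e_\delta)\,\kappa\big(\tfrac12(x_{-\delta}+x_\delta-e_{-\delta}-e_\delta),\ 1-\tfrac12(b_{-\delta}+b_\delta)\big) \Big]. \end{aligned} \tag{5.5.13} \]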
Lemma 5.5.3. The determinant of the Jacobian matrix of $(a,b) \mapsto a\kappa(a,b)$ is strictly positive everywhere on DOM.

Proof. The non-negativity of the Jacobian determinant is a consequence of the concavity of $(a,b) \mapsto a\kappa(a,b)$ (recall Lemma 2.2.1(ii)). The strict positivity can be checked with MAPLE via the explicit expression for $\kappa(a,b)$ given in den Hollander and Whittington [7].

Since $(a,b) \mapsto a\kappa(a,b)$ is concave and twice differentiable, Lemma 5.5.3 allows us to assert that on DOM the Jacobian matrix of $(a,b) \mapsto a\kappa(a,b)$ has two strictly negative eigenvalues. The second derivatives of $\kappa$ are continuous. Moreover, the uniqueness of $(e_{-\delta}, b_{-\delta})$ and $(e_\delta, b_\delta)$ implies their continuity in $\delta$, and so there exists a $\widetilde C > 0$ such that, for $\delta$ small enough,
\[ V_2(\delta) \le -\widetilde C\Big[ \big((x_{-\delta}-x_\delta) - (e_{-\delta}-e_\delta)\big)^2 + (b_{-\delta}-b_\delta)^2 \Big]. \tag{5.5.14} \]
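In what follows, we set $Y(\tfrac{e}{b}) = (\partial^2/\partial\mu^2)[\mu\phi^I(\mu)](\tfrac{e}{b})$. To bound $V_1(\delta)$ from above, we compute the Jacobian matrix of $(e,b) \mapsto e\phi^I(e/b)$:
\[ \frac{1}{b}\, Y\Big(\frac{e}{b}\Big) \begin{pmatrix} 1 & -\frac{e}{b} \\ -\frac{e}{b} & \frac{e^2}{b^2} \end{pmatrix}. \tag{5.5.15} \]
Thus, if for $t \in [0,1]$ and $u \in [0,t]$ we set $e_{u,t} = \frac{e_{-\delta}+e_\delta}{2} + t(u-\tfrac12)(e_{-\delta}-e_\delta)$ and $b_{u,t} = \frac{b_{-\delta}+b_\delta}{2} + t(u-\tfrac12)(b_{-\delta}-b_\delta)$, then a Taylor expansion gives us
\[ V_1(\delta) = \tfrac14 \int_0^1 dt\, t \int_0^t du\ \frac{1}{b_{u,t}}\, Y\Big(\frac{e_{u,t}}{b_{u,t}}\Big) \Big[ (e_{-\delta}-e_\delta) - \frac{e_{u,t}}{b_{u,t}}(b_{-\delta}-b_\delta) \Big]^2. \tag{5.5.16} \]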
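As explained in the proof of Proposition 5.4.1, the fact that $(\alpha,\beta,x) \in \mathcal{L}_\psi$ implies $(e_0,b_0) \in \mathcal{L}_{\alpha,\beta,x}$ and therefore $(\alpha,\beta,\tfrac{e_0}{b_0}) \in \mathcal{L}_\phi$. Moreover, $\mathcal{L}_\phi$ is an open subset of $\mathrm{CONE} \times [1,\infty)$ and $(e_\delta,b_\delta)$ is continuous in $\delta$, so that for $\delta$ small enough, $t \in [0,1]$ and $u \in [0,t]$, we have $(\alpha,\beta,\tfrac{e_{u,t}}{b_{u,t}}) \in \mathcal{L}_\phi$. This implies, by Lemma 5.3.1 and by the continuity of the second derivative of $\phi^I$ on $\mathcal{L}_\phi$, that there exists a $\widehat C > 0$ such that, for $\delta$ small enough, $\tfrac{1}{b_{u,t}} Y(\tfrac{e_{u,t}}{b_{u,t}}) \le -\widehat C$. At this stage, we need to consider the following three cases:

Case 1. $|b_{-\delta}-b_\delta| \ge \tfrac{b_0}{e_0}\tfrac{\delta}{4}$. Then, (5.5.14) gives $V_2(\delta) \le -\tfrac{\widetilde C b_0^2}{4^2 e_0^2}\,\delta^2$.

Case 2. $|e_{-\delta}-e_\delta| \le \delta$. Then, since $x_\delta - x_{-\delta} = 2\delta$, (5.5.14) gives $V_2(\delta) \le -\widetilde C\delta^2$.

Case 3. $|e_{-\delta}-e_\delta| > \delta$ and $|b_{-\delta}-b_\delta| < \tfrac{b_0}{e_0}\tfrac{\delta}{4}$. By continuity of $e_\delta$ and $b_\delta$, $\tfrac{e_{u,t}}{b_{u,t}} \le \tfrac{2e_0}{b_0}$ for $\delta$ small enough and therefore
\[ \Big| (e_{-\delta}-e_\delta) - \frac{e_{u,t}}{b_{u,t}}(b_{-\delta}-b_\delta) \Big| \ge |e_{-\delta}-e_\delta| - \frac{2e_0}{b_0}\,|b_{-\delta}-b_\delta| \ge \delta - \frac{2e_0}{b_0}\,\frac{b_0}{e_0}\,\frac{\delta}{4} = \frac{\delta}{2}. \tag{5.5.17} \]
Thus, (5.5.16) and (5.5.17) give $V_1(\delta) \le -\tfrac{\widehat C}{48}\,\delta^2$.

We conclude by setting $C = \min\big\{ \tfrac{\widetilde C b_0^2}{4^2 e_0^2},\ \widetilde C,\ \tfrac{\widehat C}{48} \big\}$, so that Cases 1, 2 and 3 give $T(\delta) \le -C\delta^2$ for $\delta$ small enough, which proves (5.5.12).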
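The constant $1/48$ in Case 3 comes from the weights in the integral representation (5.5.16). The computation can be sketched as follows, writing $\widehat C$ for the positive constant that bounds $\tfrac{1}{b_{u,t}} Y(\tfrac{e_{u,t}}{b_{u,t}})$ from above by $-\widehat C$ (notation chosen here for illustration) and using the lower bound $\delta/2$ from (5.5.17) for the bracket.

```latex
% Sketch: inserting the two pointwise bounds into (5.5.16).
\begin{align*}
V_1(\delta)
  &\le \tfrac14 \int_0^1 dt\, t \int_0^t du\ \big(-\widehat C\big)\Big(\frac{\delta}{2}\Big)^2
   = -\frac{\widehat C\,\delta^2}{16} \int_0^1 t \cdot t\, dt
   = -\frac{\widehat C\,\delta^2}{16}\cdot\frac13
   = -\frac{\widehat C}{48}\,\delta^2 .
\end{align*}
```

In Cases 1 and 2 the roles are reversed: there the bound comes from $V_2(\delta)$ through (5.5.14), while $V_1(\delta) \le 0$ by concavity, so in every case $T(\delta) \le V_1(\delta) + V_2(\delta)$ is bounded by a negative multiple of $\delta^2$.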
Lemma 5.5.2 implies that J3 > 0. Hence, the implicit function theorem can indeed be applied to (5.5.8), and it follows that f is infinitely differentiable on L. Acknowledgement. NP is supported by a postdoctoral fellowship from the Netherlands Organization for Scientific Research (grant 613.000.438). FdH and NP are grateful to the Pacific Institute for the Mathematical Sciences and the Mathematics Department of the University of British Columbia, Vancouver, Canada, for hospitality: FdH from January to August 2006, NP from mid-March to mid-April 2006 when the work in this paper started. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Bredon, G.E.: Topology and Geometry. New York: Springer, 1993
2. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd ed., Applications of Mathematics 38, New York: Springer, 1998
3. Caravenna, F.: Random Walk Models and Probabilistic Techniques for Inhomogeneous Polymer Chains, PhD Thesis, 21 October 2005, University of Milano-Bicocca, Italy, and University of Paris 7, France
4. Giacomin, G.: Random Polymer Models. London: Imperial College Press, 2007
5. Giacomin, G., Toninelli, F.L.: Smoothing effect of quenched disorder on polymer depinning transitions. Commun. Math. Phys. 266, 1–16 (2006)
6. Giacomin, G., Toninelli, F.L.: The localized phase of disordered copolymers with adsorption. ALEA 1, 149–180 (2006)
7. den Hollander, F., Whittington, S.G.: Localization transition for a copolymer in an emulsion. Theor. Prob. Appl. 51, 193–240 (2006)
8. Kingman, J.F.C.: Subadditive ergodic theory. Ann. Probab. 6, 883–909 (1973)
9. Pétrélis, N.: Localisation d’un Polymère en Interaction avec une Interface. PhD thesis, 2 February 2006, University of Rouen, France

Communicated by F. Toninelli
Commun. Math. Phys. 285, 873–900 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0650-y
Communications in
Mathematical Physics
Uniqueness Results for Ill-Posed Characteristic Problems in Curved Space-Times
Alexandru D. Ionescu1, Sergiu Klainerman2
1 Department of Mathematics, University of Wisconsin – Madison, Madison, WI 53706, USA. E-mail: [email protected]
2 Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 1 November 2007 / Accepted: 3 July 2008 Published online: 6 November 2008 – © Springer-Verlag 2008
Abstract: We prove two uniqueness theorems concerning linear wave equations; the first theorem is in Minkowski space-times, while the second is in the domain of outer communication of a Kerr black hole. Both theorems concern ill-posed Cauchy problems on bifurcate, characteristic hypersurfaces. In the case of the Kerr space-time, the hypersurface is precisely the event horizon of the black hole. The uniqueness theorem in this case, based on two Carleman estimates, is intimately connected to our strategy to prove uniqueness of the Kerr black holes among smooth, stationary solutions of the Einstein-vacuum equations, as formulated in [14].
Contents
1. Introduction
1.1 A model problem in Minkowski spaces
1.2 The main theorem in the Kerr spaces
2. Unique Continuation and Conditional Carleman Inequalities
2.1 General considerations
2.2 A conditional Carleman inequality of sufficient generality
3. Proof of Theorem 1.2
3.1 The first Carleman inequality in Kerr spaces
3.2 The second Carleman inequality in Kerr spaces
3.3 Vanishing of the tensor S
4. Proof of Theorem 1.1
Appendix A. Explicit Computations in the Kerr Spaces
References

The first author was supported in part by an NSF grant and a Packard Fellowship. The second author was supported by NSF grant DMS 0070696.
A. D. Ionescu, S. Klainerman
1. Introduction
The goal of the paper is to prove two uniqueness results for the Cauchy problem in the exterior of a bifurcate characteristic surface. In the simplest case of the wave equation in Minkowski space $\mathbb{R}^{1+d}$,
\[ \Box\phi = 0, \qquad \Box = -\partial_t^2 + \sum_{i=1}^{d} \partial_i^2, \]
the problem is to find solutions in the exterior domain $E_a = \{(t,x) : |x| > |t| + a\}$, $a \ge 0$, with prescribed data on the boundary $H_a = \{(t,x) : |x| = |t| + a\}$. The problem is known to be ill-posed, that is,
(1) Solutions may not exist for smooth, non-analytic, initial conditions.
(2) There is no continuous dependence on the data.
The situation is similar to the better known case of the Cauchy problem prescribed on a time-like characteristic hypersurface, such as $x^d = 0$. The Cauchy–Kowalewski theorem allows one to solve the problem for analytic initial data, but solutions may not exist in the smooth case. It is known in fact that smooth solutions cannot be prescribed freely, since certain necessary compatibilities may be violated. Though existence fails, one can often prove uniqueness. A general result due to Holmgren, improved by F. John [8], shows that the non-characteristic initial value problem for linear equations with analytic coefficients is locally unique in the class of smooth solutions, see [9]. The case of equations with smooth coefficients is considerably more complicated. An important counterexample to uniqueness was provided by P. Cohen [5], see also [12] and [1] for more general families of examples. Thus, in the case of the Cauchy problem for a time-like hypersurface (such as $x^d = 0$), even a zero order, smooth, perturbation of the wave operator can cause uniqueness to fail. We note also that there cannot be, in general (unless one considers solutions with suitable decay at infinity such as discussed in [16]), unique continuation across characteristic hyperplanes, see the counterexample and the discussion in [13, Theorem 8.6.7]. On the other hand, there exist conditions which can guarantee uniqueness, most importantly those of Hörmander [13, Chap. 28]. See also [19,21] and the references therein for uniqueness results under partial analyticity assumptions.
These results prove uniqueness for a large class of problems which include, in particular, the Cauchy problem on an arbitrary, noncharacteristic, time-like hypersurface for the wave equation $\Box_g\phi = 0$, corresponding to a time independent Lorentz metric of the form $-g_{00}(x)\,dt^2 + g_{ij}(x)\,dx^i dx^j$ with $g_{00} > 0$ and $(g_{ij})_{i,j=1}^{d}$ positive definite. The method of proof for these and other modern unique continuation results is based on Carleman type estimates. The case of ill-posed problems for bifurcate characteristic hypersurfaces, i.e. surfaces composed of two characteristic hypersurfaces which intersect transversally, seems to have been first studied by Friedlander1 [6], by using a variation of Holmgren’s method of proof. The same problem for equations with smooth coefficients seems not to have been specifically considered in the literature. Yet it is precisely this case which seems to be of considerable importance in General Relativity, particularly for the problem of uniqueness of stationary, smooth solutions of the Einstein field equations, see the discussion in [14]. Indeed, it turns out that remarkable simplifications occur for the geometry of
1 In [7] he also considers a similar, ill-posed, characteristic problem at infinity, concerning uniqueness of solutions with identical radiation fields.
Uniqueness Results for Ill-Posed Characteristic Problems
bifurcate horizons for general, stationary, asymptotically flat black hole solutions of the Einstein-vacuum equations, verifying reasonable regularity assumptions. For such regular black hole space-times, Hawking has shown, see [10], that there must exist an additional Killing vector-field defined on the event horizon, tangent to the generators of the horizon. In the case when the space-time is real analytic one can extend this additional Killing vector-field to the entire exterior region, and deduce that the space-time must be not only stationary but also axially symmetric. A satisfactory uniqueness result (due to Carter [3] and Robinson [20]) is known for stationary solutions which have this additional symmetry. However, in the smooth, non-analytic case, the problem of extending Hawking’s Killing vector-field from the horizon to the exterior region leads to an ill-posed characteristic problem. This appears to be the key obstruction to proving the analogue of Hawking’s uniqueness theorem in the class of smooth, non-analytic space-times. Motivated by this latter problem, to avoid the analyticity assumption we are proposing a completely different approach2 based on the following ingredients:
(1) The Kerr space-times can be locally characterized, among stationary solutions, by the vanishing of a four covariant tensor-field, called the Mars-Simon tensor $S$, introduced in [17].
(2) The Mars-Simon tensor-field $S$ verifies a covariant system of wave equations of the form (see also the first equation in (1.6)),
\[ \Box_g S = A \cdot DS + B \cdot S. \tag{1.1} \]
Moreover, since $g$ is stationary, we know that there exists a globally defined Killing vector-field $\xi$, which is time-like at space-like infinity. Thus it is easy to verify that the Lie derivative of $S$ with respect to $\xi$ vanishes identically,
\[ \mathcal{L}_\xi S = 0. \tag{1.2} \]
(3) One can show that the tensor-field $S$ vanishes identically on the bifurcate horizon $H$ of the stationary metric $g$. We show this by making an assumption (automatically satisfied on a Kerr metric) concerning the vanishing of a complex scalar on the bifurcate sphere of the horizon.
(4) Using a first Carleman estimate for the covariant wave Eq. (1.1) we show that $S$ vanishes in a neighborhood of the bifurcate sphere. This step does not require condition (1.2), indeed it is a result that applies to a general equation of type (1.1) in a neighborhood of a regular bifurcate characteristic hypersurface, for a general Lorentz metric $g$.
(5) To extend the vanishing of $S$ to the entire domain of outer communication we need a more sophisticated Carleman estimate which depends in an essential fashion, among other considerations, on Eq. (1.2).
In this paper we prove, see Theorem 1.2, a global uniqueness result for tensor-field solutions to covariant equations of the form (1.1) and (1.2) on the domain of outer communication of a Kerr background, which vanish on the event horizon. The condition (1.2) relative to the stationary Killing vector-field $\xi$, which is important to prove a global result, is justified by the fact that the problem of uniqueness of Kerr is restricted, naturally, to stationary solutions of the Einstein vacuum equations (see the discussion in [14]). We also discuss a simple model problem, see Theorem 1.1, concerning scalar linear wave equations in the exterior domain $E = E_1$ of the Minkowski space-time with prescribed data on the characteristic boundary $H = H_1$.
2 See the longer discussion in [14].
A. D. Ionescu, S. Klainerman
1.1. A model problem in Minkowski spaces. Assume $d \geq 1$ and let $(M = \mathbb{R} \times \mathbb{R}^d, m)$ denote the usual Minkowski space of dimension $d + 1$. We define the subsets of $M$
$$E = \{(t, x) \in M : |x| > |t| + 1\} \qquad (1.3)$$
and
$$H = \delta(E) = \{(t, x) \in M : |x| = |t| + 1\}. \qquad (1.4)$$
Let $\overline{E} = E \cup H$. Our first theorem concerns a uniqueness property of solutions of wave equations on $\overline{E}$.

Theorem 1.1. Assume $\phi \in C^2(M)$, $A, B^l \in C^0(M)$, $l = 0, \dots, d$, and
$$\square \phi = A \cdot \phi + \sum_{l=0}^{d} B^l \cdot \partial_l \phi \quad \text{on } E. \qquad (1.5)$$
Assume that $\phi \equiv 0$ on $H$. Then $\phi \equiv 0$ on $\overline{E}$.

Theorem 1.1 extends easily to diagonal systems of scalar equations. We remark that in Theorem 1.1 we do not assume any global bounds on the coefficients $A$ and $B^l$. Also, we make no assumption on the vanishing of the derivatives of $\phi$ on $H$, which is somewhat surprising given that $\square$ is a second order operator. This is possible because of the special bifurcate characteristic structure of the surface $H$. The proof of Theorem 1.1, which is given in Sect. 4, follows from a standard Carleman inequality with a suitably defined pseudo-convex weight. However, the simple statement of Theorem 1.1 appears to be new. We include it here mostly as a model result to illustrate, in a very simple case, the connection between bifurcate characteristic horizons and unique continuation properties of solutions of wave equations.

1.2. The main theorem in the Kerr spaces. Let $(K^4, g)$ denote the maximally extended Kerr spacetime of mass $m$ and angular momentum $ma$ (see the Appendix for some details and explicit formulas). We assume $m > 0$ and $a \in [0, m)$. Let $E^4$ denote a domain of outer communication of $K^4$, and $H = \delta(E^4)$ the corresponding event horizon. Let $M^4$ denote an open neighborhood of $E^4 \cup H$ in $K^4$, and let $\xi$ denote a Killing vector field on $E^4$ which is timelike at the spacelike infinity in $E^4$. Let $\mathbf{T}(M^4)$ denote the space of smooth vector-fields on $M^4$, and let $\mathbf{T}^r_s(M^4)$, $r, s \in \mathbb{Z}_+$, denote the space of complex-valued tensor-fields of type $(r, s)$ on $M^4$. Our main theorem concerns a uniqueness property of certain solutions of covariant wave equations on $E^4$.

Theorem 1.2. Assume $k \in \mathbb{Z}_+$, $S \in \mathbf{T}^0_k(M^4)$, $A \in \mathbf{T}^k_k(M^4)$, $B \in \mathbf{T}^{k+1}_k(M^4)$, $C \in \mathbf{T}^k_k(M^4)$, and
$$\begin{cases} \square_g S_{\alpha_1 \dots \alpha_k} = S_{\beta_1 \dots \beta_k} A^{\beta_1 \dots \beta_k}{}_{\alpha_1 \dots \alpha_k} + D_{\beta_{k+1}} S_{\beta_1 \dots \beta_k} B^{\beta_1 \dots \beta_{k+1}}{}_{\alpha_1 \dots \alpha_k}; \\ \mathcal{L}_\xi S_{\alpha_1 \dots \alpha_k} = S_{\beta_1 \dots \beta_k} C^{\beta_1 \dots \beta_k}{}_{\alpha_1 \dots \alpha_k}, \end{cases} \qquad (1.6)$$
in $E^4$. Assume in addition that $S \equiv 0$ on $H$. Then, $S \equiv 0$ on $E^4 \cup H$.
Uniqueness Results for Ill-Posed Characteristic Problems
2. Unique Continuation and Conditional Carleman Inequalities

2.1. General considerations. Our proof of Theorem 1.2 is based on a global unique continuation strategy. We say that a linear differential operator $L$, in a domain $\Omega \subset \mathbb{R}^d$, satisfies the unique continuation property with respect to a smooth, oriented hypersurface $\Sigma \subset \Omega$, if any smooth solution of $L\phi = 0$ which vanishes on one side of $\Sigma$ must in fact vanish in a small neighborhood of $\Sigma$. Such a property depends, of course, on the interplay between the properties of the operator $L$ and the hypersurface $\Sigma$. A classical result of Hörmander, see for example Chap. 28 in [13], provides sufficient conditions for a scalar linear equation which guarantee that the unique continuation property holds. In the particular case of the scalar wave equation, $\square_g \phi = 0$, and a smooth surface $\Sigma$ defined by the equation $h = 0$, $\nabla h \neq 0$, Hörmander’s pseudo-convexity condition takes the form
$$D^2 h(X, X) < 0 \quad \text{if} \quad g(X, X) = g(X, Dh) = 0 \qquad (2.1)$$
at all points on the surface $\Sigma$, where we assume that $\phi$ is known to vanish on the side of $\Sigma$ corresponding to $h < 0$. In our situation, we plan to apply the general philosophy of unique continuation to the covariant wave equation (see the first equation in (1.6)),
$$\square_g S = A * S + B * DS. \qquad (2.2)$$
We know that $S$ vanishes on the horizon $H$ and we would like to prove, by unique continuation, that $S$ vanishes in the entire domain of outer communication. In implementing such a strategy one encounters the following difficulties:

(1) The horizon $H = H^+ \cup H^-$ is characteristic and not smooth in a neighborhood of the bifurcate sphere.

(2) Even though one can show that an appropriate variant of Hörmander’s pseudo-convexity condition holds true along the horizon, in a neighborhood of the bifurcate sphere, such a condition may fail to be true slightly away from the horizon, within the ergosphere region of the stationary space-time where $\xi$ is space-like.

Problem (1) can be dealt with by exploiting the fact that the horizon is a bifurcate characteristic hypersurface, which, in particular, is sufficient to allow us to prove that higher order derivatives of $S$ vanish on the horizon. Problem (2) is more serious, in the case when $a$ is not small compared to $m$, because of the existence of null geodesics trapped within the ergoregion $m + \sqrt{m^2 - a^2} \leq r \leq m + \sqrt{m^2 - a^2 \cos^2\theta}$. Indeed surfaces of the form $r = m(r^2 - a^2)^{1/2}$, which intersect the ergoregion for $a$ sufficiently close to $m$, are known to contain such null geodesics, see [4]. One can show that the presence of trapped null geodesics invalidates Hörmander’s pseudo-convexity condition. Thus, even in the case of the scalar wave equation $\square_g \phi = 0$ in such a Kerr metric, one cannot guarantee, by a classical unique continuation argument (in the absence of additional conditions), that $\phi$ vanishes beyond a small neighborhood of the horizon. In order to overcome this main difficulty we need to exploit the second identity in (1.6), namely
$$\mathcal{L}_\xi S = C * S. \qquad (2.3)$$
Observe that (2.3) can, in principle, transform (2.2) into a much simpler elliptic problem, in any domain which lies strictly outside the ergoregion (where $\xi$ is strictly time-like).
Unfortunately this possible strategy is not available to us when $a$ is not small compared to $m$, since, as we have remarked above, we cannot hope to extend the vanishing of $S$, by a simple analogue of Hörmander’s pseudo-convexity condition, beyond the first trapped null geodesics. Our solution is to extend Hörmander’s classical pseudo-convexity condition (2.1) to one which takes into account both Eqs. (2.2) and (2.3) simultaneously. These considerations lead to the following qualitative, $\xi$-conditional, pseudo-convexity condition,
$$\begin{cases} \xi(h) = 0; \\ D^2 h(X, X) < 0 \quad \text{if} \quad g(X, X) = g(X, Dh) = g(\xi, X) = 0. \end{cases} \qquad (2.4)$$
We will show that this condition can be verified in all Kerr spaces $a \in [0, m)$, for the simple function $h = r$, where $r$ is one of the Boyer–Lindquist coordinates. Thus (2.4) is a good substitute for the more general condition (2.1). The fact that the two geometric identities (2.2) and (2.3) cooperate exactly in the right way, via (2.4), thus allowing us to compensate for both the failure of condition (2.1) as well as the failure of the vector field $\xi$ to be time-like in the ergoregion, seems to us to be a very remarkable property of the Kerr spaces. In the next subsection we give a quantitative version of the condition and state a Carleman estimate of sufficient generality to cover all our needs.
2.2. A conditional Carleman inequality of sufficient generality. Unique continuation properties are often proved using Carleman inequalities. In this subsection we state a sufficiently general Carleman inequality, Proposition 2.3, under a quantitative conditional pseudo-convexity assumption. This general Carleman inequality is used to show first that $S$ vanishes in a small neighborhood of the bifurcate sphere $S_0$ in $E^4$, using only the first identity in (1.6), and then to prove that $S$ vanishes in the entire exterior domain using both identities in (1.6). The two applications are genuinely different, since, in particular, the horizon is a bifurcate surface which is not smooth and the weights needed in this case have to be “singular” in an appropriate sense. In order to be able to cover both applications and prove unique continuation in a quantitative sense, we work with a more technical notion of conditional pseudo-convexity than (2.4), see Definition 2.1 below.

Let $B_r = \{x \in \mathbb{R}^4 : |x| < r\}$ denote the standard open ball in $\mathbb{R}^4$. Assume that $(M, g)$ is a smooth Lorentzian manifold of dimension 4, $x_0 \in M$, and $\Phi^{x_0} : B_1 \to B_1(x_0)$ is a coordinate chart. For simplicity of notation, let $B_r(x_0) = \Phi^{x_0}(B_r)$, $r \in (0, 1]$. For any smooth function $\phi : B \to \mathbb{C}$, where $B \subseteq B_1(x_0)$ is an open set, and $j = 0, 1, \dots$ let
$$|D^j \phi(x)| = \sum_{\alpha_1, \dots, \alpha_j = 1}^{4} |\partial_{\alpha_1} \dots \partial_{\alpha_j} \phi(x)|. \qquad (2.5)$$
Let $g_{\alpha\beta} = g(\partial_\alpha, \partial_\beta)$ and assume that $V = V^\alpha \partial_\alpha$ is a vector-field on $B_1(x_0)$. We assume that
$$\sup_{x \in B_1(x_0)} \sum_{j=0}^{6} \sum_{\alpha, \beta = 1}^{4} \left( |D^j g_{\alpha\beta}| + |D^j g^{\alpha\beta}| + |D^j V^\beta| \right) \leq A_0. \qquad (2.6)$$
In our applications $V = 0$ or $V = \xi$.
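Definition 2.1. A family of weights $h_\epsilon : B_{\epsilon^{10}}(x_0) \to \mathbb{R}_+$, $\epsilon \in (0, \epsilon_1)$, $\epsilon_1 \leq A_0^{-1}$, will be called $V$-conditional pseudo-convex if for any $\epsilon \in (0, \epsilon_1)$,
$$h_\epsilon(x_0) = \epsilon, \qquad \sup_{x \in B_{\epsilon^{10}}(x_0)} \sum_{j=1}^{4} \epsilon^j |D^j h_\epsilon(x)| \leq \epsilon/\epsilon_1, \qquad |V(h_\epsilon)(x_0)| \leq \epsilon^{10}, \qquad (2.7)$$
$$D^\alpha h_\epsilon(x_0) D^\beta h_\epsilon(x_0) (D_\alpha h_\epsilon D_\beta h_\epsilon - \epsilon D_\alpha D_\beta h_\epsilon)(x_0) \geq \epsilon_1^2, \qquad (2.8)$$
and there is $\mu \in [-\epsilon_1^{-1}, \epsilon_1^{-1}]$ such that for all vectors $X = X^\alpha \partial_\alpha \in T_{x_0}(M)$,
$$\epsilon_1^2 [(X^1)^2 + (X^2)^2 + (X^3)^2 + (X^4)^2] \leq X^\alpha X^\beta (\mu g_{\alpha\beta} - D_\alpha D_\beta h_\epsilon)(x_0) + \epsilon^{-2} (|X^\alpha V_\alpha(x_0)|^2 + |X^\alpha D_\alpha h_\epsilon(x_0)|^2). \qquad (2.9)$$
A function $e_\epsilon : B_{\epsilon^{10}}(x_0) \to \mathbb{R}$ will be called a negligible perturbation if
$$\sup_{x \in B_{\epsilon^{10}}(x_0)} |D^j e_\epsilon(x)| \leq \epsilon^{10} \quad \text{for } j = 0, \dots, 4. \qquad (2.10)$$

Remark 2.2. One can see that the technical conditions (2.7), (2.8), and (2.9) are related to the qualitative condition (2.4), at least when $h_\epsilon = h + \epsilon$ for some smooth function $h$. The assumption $|V(h_\epsilon)(x_0)| \leq \epsilon^{10}$ is a quantitative version of $V(h) = 0$. The assumption (2.9) is a quantitative version of the inequality in the second line of (2.4), in view of the large factor $\epsilon^{-2}$ on the terms $|X^\alpha V_\alpha(x_0)|^2$ and $|X^\alpha D_\alpha h_\epsilon(x_0)|^2$, and the freedom to choose $\mu$ in a large range. The assumption (2.8) is a quantitative version of the condition $\nabla h \neq 0$ (assuming that (2.9) already holds).

It is important that the Carleman estimates we prove are stable under small perturbations of the weight, in order to be able to use them to prove unique continuation. We quantify this stability in (2.10). We observe that if $\{h_\epsilon\}_{\epsilon \in (0, \epsilon_1)}$ is a $V$-conditional pseudo-convex family, and $e_\epsilon$ is a negligible perturbation for any $\epsilon \in (0, \epsilon_1]$, then $h_\epsilon + e_\epsilon \in [\epsilon/2, 2\epsilon]$ in $B_{\epsilon^{10}}(x_0)$.

The pseudo-convexity conditions of Definition 2.1 are probably not as general as possible, but are suitable for our applications both in Proposition 3.2, with “singular” weights $h_\epsilon$ and $V = 0$, and Proposition 3.3, with “smooth” weights $h_\epsilon$ and $V = \xi$.

Proposition 2.3. Assume $\epsilon_1 \leq A_0^{-1}$, $\{h_\epsilon\}_{\epsilon \in (0, \epsilon_1)}$ is a $V$-conditional pseudo-convex family, and $e_\epsilon$ is a negligible perturbation for any $\epsilon \in (0, \epsilon_1]$, see Definition 2.1. Then there is $\epsilon \in (0, \epsilon_1)$ sufficiently small and $\tilde{C}_\epsilon$ sufficiently large such that for any $\lambda \geq \tilde{C}_\epsilon$ and any $\phi \in C_0^\infty(B_{\epsilon^{10}}(x_0))$,
$$\lambda \|e^{-\lambda f_\epsilon} \phi\|_{L^2} + \|e^{-\lambda f_\epsilon} |D^1 \phi|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \|e^{-\lambda f_\epsilon} \square_g \phi\|_{L^2} + \epsilon^{-6} \|e^{-\lambda f_\epsilon} V(\phi)\|_{L^2}, \qquad (2.11)$$
where $f_\epsilon = \ln(h_\epsilon + e_\epsilon)$.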
As mentioned earlier, many Carleman estimates such as (2.11) are known, for the particular case $V = 0$. Optimal proofs are usually based on some version of the Fefferman-Phong inequality, as in [13, Chap. 28]. A self-contained, elementary proof of Proposition 2.3, using only simple integration by parts arguments, is given in [14, Sect. 3] (see also Proposition 4.1 in Sect. 4 for a similar proof in a simpler case). We also note that it is useful to be able to track quantitatively the size of the support of the functions for which Carleman estimates can be applied; in our notation, the value of $\epsilon$ for which (2.11) holds depends only on the parameter $\epsilon_1$.

3. Proof of Theorem 1.2

3.1. The first Carleman inequality in Kerr spaces. The horizon $H$ decomposes as $H = H^+ \cup H^-$, where $H^+$ is the boundary of the black hole region and $H^-$ is the boundary of the white hole region. Let $S_0 = H^+ \cap H^-$ denote the bifurcate sphere. In this section we prove a Carleman estimate for functions supported in a small neighborhood of the bifurcate sphere $S_0$. We first construct two suitable defining functions for the surfaces $H^+$ and $H^-$.

Lemma 3.1. There is an open set $O \subseteq M^4$, $S_0 \subseteq O$, and smooth functions $u, v : O \to \mathbb{R}$ with the following properties:

(a) We have
$$\begin{cases} E^4 \cap O = \{x \in O : u(x) > 0 \text{ and } v(x) > 0\}; \\ H^+ \cap O = \{x \in O : u(x) = 0\}; \\ H^- \cap O = \{x \in O : v(x) = 0\}. \end{cases}$$
In addition, the set $\{x \in O : u(x), v(x) \in [0, 1/2]\}$ is compact.

(b) With $L_3 = g^{\alpha\beta} \partial_\alpha(u) \partial_\beta$, $L_4 = g^{\alpha\beta} \partial_\alpha(v) \partial_\beta \in \mathbf{T}(O)$,
$$\begin{cases} g(L_3, L_3) = 0 \text{ on } H^+ \cap O; \\ g(L_4, L_4) = 0 \text{ on } H^- \cap O; \\ g(L_3, L_4) > 0 \text{ on } S_0. \end{cases} \qquad (3.1)$$

(c) For any smooth function $\phi : O \to \mathbb{R}$ with the property that $\phi \equiv 0$ on $H^+ \cap O$, there is a smooth function $\phi' : O \to \mathbb{R}$ such that
$$\phi = \phi' \cdot u \quad \text{on } O \cap E^4.$$
Also, for any smooth function $\phi : O \to \mathbb{R}$ with the property that $\phi \equiv 0$ on $H^- \cap O$, there is a smooth function $\phi' : O \to \mathbb{R}$ such that
$$\phi = \phi' \cdot v \quad \text{on } O \cap E^4.$$
Proof of Lemma 3.1. A more precise construction of global optical functions $u, v$ is given in [18]. In our problem we do not need this global construction; for simplicity we construct the functions $u, v$ explicitly, using the Kruskal coordinates of the Kerr space-times. In standard Boyer-Lindquist coordinates $(r, t, \theta, \phi) \in (r_+, \infty) \times \mathbb{R} \times (0, \pi) \times \mathbb{S}^1$, $r_\pm = m \pm (m^2 - a^2)^{1/2}$ (see the Appendix), the Kerr metric on the dense open subset $\tilde{E}^4$ of $E^4$ is
$$ds^2 = -\frac{\rho^2 \Delta}{\Sigma^2} (dt)^2 + \frac{\Sigma^2 (\sin\theta)^2}{\rho^2} (d\phi - \omega\, dt)^2 + \frac{\rho^2}{\Delta} (dr)^2 + \rho^2 (d\theta)^2, \qquad (3.2)$$
where
$$\begin{cases} \Delta = r^2 + a^2 - 2mr; \\ \rho^2 = r^2 + a^2 (\cos\theta)^2; \\ \Sigma^2 = (r^2 + a^2)\rho^2 + 2mra^2 (\sin\theta)^2 = (r^2 + a^2)^2 - a^2 (\sin\theta)^2 \Delta; \\ \omega = \dfrac{2amr}{\Sigma^2}. \end{cases} \qquad (3.3)$$
We define the function $r_* : (r_+, \infty) \to \mathbb{R}$,
$$r_* = \int \frac{r^2 + a^2}{r^2 + a^2 - 2mr}\, dr = r + \frac{2mr_+}{r_+ - r_-} \ln(r - r_+) - \frac{2mr_-}{r_+ - r_-} \ln(r - r_-). \qquad (3.4)$$
With $c_0 = \frac{2mr_+}{r_+ - r_-}$, we make the changes of variables
$$r_* = c_0 (\ln u + \ln v) \quad \text{and} \quad t = c_0 (\ln u - \ln v), \qquad (3.5)$$
where $(u, v) \in (0, \infty)^2$, so
$$dr_* = c_0 (u^{-1} du + v^{-1} dv); \qquad dt = c_0 (u^{-1} du - v^{-1} dv). \qquad (3.6)$$
We observe also that $\omega(r_+, \theta) = a/(2mr_+)$. We make the change of variables
$$\phi = \phi_* + \frac{a}{2mr_+} t = \phi_* + \frac{ac_0}{2mr_+} (\ln u - \ln v), \qquad (3.7)$$
with
$$d\phi = d\phi_* + \frac{ac_0}{2mr_+} (u^{-1} du - v^{-1} dv). \qquad (3.8)$$
In the new coordinates $(u, v, \theta, \phi_*) \in (0, \infty) \times (0, \infty) \times (0, \pi) \times \mathbb{S}^1$ the Kerr metric (3.2) becomes
$$ds^2 = -\frac{c_0^2 \rho^2 \Delta^2}{u^2 v^2} \cdot \frac{a^2 (\sin\theta)^2}{\Sigma^2 (r^2 + a^2)^2} [v^2 (du)^2 + u^2 (dv)^2] + \frac{2 c_0^2 \rho^2 \Delta}{uv} \left( \frac{1}{\Sigma^2} + \frac{1}{(r^2 + a^2)^2} \right) du\, dv + \frac{\Sigma^2 (\sin\theta)^2}{\rho^2} \left( d\phi_* - \frac{\tilde\omega c_0}{uv} (v\, du - u\, dv) \right)^2 + \rho^2 (d\theta)^2, \qquad (3.9)$$
where $\tilde\omega = \omega - a/(2mr_+)$.
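The algebraic facts used in this change of variables can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names (`horizons`, `kerr_quantities`, `r_star`, `kruskal_uv`) are hypothetical helpers introduced here for illustration, and the formulas are transcribed from (3.3), (3.4) and (3.5) under the assumption $0 \leq a < m$ and $r > r_+$.

```python
import math


def horizons(m, a):
    """Return (r_plus, r_minus) = m +/- sqrt(m^2 - a^2), assuming 0 <= a < m."""
    root = math.sqrt(m * m - a * a)
    return m + root, m - root


def kerr_quantities(m, a, r, theta):
    """Return (Delta, rho^2, Sigma^2, omega) as defined in (3.3)."""
    sin2 = math.sin(theta) ** 2
    delta = r * r + a * a - 2.0 * m * r
    rho2 = r * r + a * a * math.cos(theta) ** 2
    sigma2 = (r * r + a * a) * rho2 + 2.0 * m * r * a * a * sin2
    omega = 2.0 * a * m * r / sigma2
    return delta, rho2, sigma2, omega


def r_star(m, a, r):
    """Closed form of r_* from (3.4), valid for r > r_plus."""
    r_plus, r_minus = horizons(m, a)
    gap = r_plus - r_minus
    return (r
            + 2.0 * m * r_plus / gap * math.log(r - r_plus)
            - 2.0 * m * r_minus / gap * math.log(r - r_minus))


def kruskal_uv(m, a, r, t):
    """Invert (3.5): r_* = c0 (ln u + ln v), t = c0 (ln u - ln v)."""
    r_plus, r_minus = horizons(m, a)
    c0 = 2.0 * m * r_plus / (r_plus - r_minus)
    rs = r_star(m, a, r)
    u = math.exp((rs + t) / (2.0 * c0))
    v = math.exp((rs - t) / (2.0 * c0))
    return u, v, c0
```

With these helpers one can confirm, for sample parameter values, that the two expressions for $\Sigma^2$ in (3.3) agree, that $\omega(r_+, \theta) = a/(2mr_+)$, that $dr_*/dr = (r^2 + a^2)/\Delta$, and that $uv = e^{r_*/c_0}$.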
We restrict to the region
$$\tilde{O} = \{(u, v, \theta, \phi_*) \in (-c_1, 1)^2 \times (0, \pi) \times \mathbb{S}^1\},$$
for some constant $c_1 > 0$ sufficiently small. We examine the coefficients that appear in the Kerr metric (3.9). Since $e^{r_*/c_0} = uv$ and $r_- < r_+$ (since $a \in [0, m)$), it follows from (3.4) that $r$ is a smooth function of $uv$ in $\tilde{O}$. Moreover $\Delta/(uv) = (r - r_-)(r - r_+)/(uv)$ and $\tilde\omega/(uv)$ are smooth functions of $uv$ in $\tilde{O}$. Thus the Kerr metric (3.9) is smooth in $\tilde{O}$ and we identify $\tilde{O}$ with the corresponding open subset of the Kerr space. We let $O$ be any open neighborhood of $S_0$ contained in the closure of $\tilde{O}$ in $M^4$ (by adding in the points corresponding to $\theta \in \{0, \pi\}$). It is easy to see that the coordinate functions $u, v : O \to (-c_1, 1)$ verify the conclusions of the lemma.

Assume now that $x_0 \in S_0$, $B_r = \{x \in \mathbb{R}^4 : |x| < r\}$, and $\Phi^{x_0} : B_1 \to O$, $\Phi^{x_0}(0) = x_0$, is a smooth coordinate chart around $x_0$. In view of (3.1),
$$\delta_0 = \inf_{S_0} g(L_3, L_4) > 0. \qquad (3.10)$$
It follows from (3.1) that there is $\epsilon_0 \in (0, 1/2]$ such that
$$g(L_3, L_4) > \delta_0/2 \quad \text{and} \quad |g(L_3, L_3)| + |g(L_4, L_4)| < \delta_0/100 \quad \text{on } B_{\epsilon_0}(x_0), \qquad (3.11)$$
where $B_r(x_0) = \Phi^{x_0}(B_r)$. Thus we can fix smooth vector fields $L_1, L_2 \in \mathbf{T}(B_{\epsilon_0}(x_0))$ such that
$$g(L_1, L_1) = g(L_2, L_2) = 1; \qquad g(L_1, L_2) = g(L_1, L_3) = g(L_2, L_3) = g(L_1, L_4) = g(L_2, L_4) = 0. \qquad (3.12)$$
We define also the smooth function $N^{x_0} : B_1(x_0) \to [0, \infty)$, $N^{x_0}(x) = |(\Phi^{x_0})^{-1}(x)|^2$. The main result in this section is the following Carleman estimate:

Proposition 3.2. There is $\epsilon \in (0, \epsilon_0)$ sufficiently small and $\tilde{C}_\epsilon$ sufficiently large such that for any $\lambda \geq \tilde{C}_\epsilon$ and any $\phi \in C_0^\infty(B_{\epsilon^{10}}(x_0))$
$$\lambda \|e^{-\lambda f_\epsilon} \phi\|_{L^2} + \|e^{-\lambda f_\epsilon} |D^1 \phi|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \|e^{-\lambda f_\epsilon} \square_g \phi\|_{L^2}, \qquad (3.13)$$
where
$$f_\epsilon = \ln[\epsilon^{-1}(u + \epsilon)(v + \epsilon) + \epsilon^{12} N^{x_0}]. \qquad (3.14)$$

Proof of Proposition 3.2. We apply Proposition 2.3 with $V = 0$. It is clear that $\epsilon^{12} N^{x_0}$ is a negligible perturbation, in the sense of (2.10), for $\epsilon$ sufficiently small. It remains to prove that there is $\epsilon_1 > 0$ such that the family of weights $\{h_\epsilon\}_{\epsilon \in (0, \epsilon_1)}$,
$$h_\epsilon = \epsilon^{-1}(u + \epsilon)(v + \epsilon), \qquad (3.15)$$
satisfies conditions (2.7), (2.8) and (2.9). Let $\tilde{C}$ denote constants $\geq 1$ that may depend only on the predefined geometric quantities $\epsilon_0$, $\delta_0$, and a uniform bound in $B_{\epsilon_0}(x_0)$ of $|D^j g_{\alpha\beta}|$, $|D^j g^{\alpha\beta}|$, $|D^j u|$, $|D^j v|$, $j = 0, \dots, 6$. Since $u(x_0) = v(x_0) = 0$, the definition (3.15) shows easily that condition (2.7) is satisfied, provided that $\epsilon_1 \leq \tilde{C}^{-1}$.
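Relative to the frame $L_1, L_2, L_3, L_4$ the metric $g$ takes the form
$$\begin{cases} g_{ab} = \delta_{ab}, \quad g_{a3} = g_{a4} = 0, \quad a, b = 1, 2, \\ g_{33} = g_3, \quad g_{44} = g_4, \quad g_{34} = \Omega, \end{cases} \qquad (3.16)$$
in $B_{\epsilon_0}(x_0)$, where $g_3 = g(L_3, L_3)$, $g_4 = g(L_4, L_4)$, $\Omega = g(L_3, L_4)$. Also, for the inverse metric,
$$\begin{cases} g^{ab} = \delta^{ab}, \quad g^{a3} = g^{a4} = 0, \quad a, b = 1, 2, \\ g^{33} = g^3, \quad g^{44} = g^4, \quad g^{34} = \Omega', \end{cases} \qquad (3.17)$$
where $g^3 = -g_4/(\Omega^2 - g_3 g_4)$, $g^4 = -g_3/(\Omega^2 - g_3 g_4)$, $\Omega' = \Omega/(\Omega^2 - g_3 g_4)$. Recall that $\Omega \geq \delta_0/2$ in $B_{\epsilon_0}(x_0)$, see (3.11), $g_3 = 0$ on $H^+ \cap B_{\epsilon_0}(x_0)$, $g_4 = 0$ on $H^- \cap B_{\epsilon_0}(x_0)$, see (3.1). Thus, using Lemma 3.1 (c),
$$|g_3| \leq \tilde{C} u \quad \text{and} \quad |g_4| \leq \tilde{C} v \quad \text{in } B_{\epsilon_0}(x_0). \qquad (3.18)$$
We denote by $O(1)$ any quantity with absolute value bounded by a constant $\tilde{C}$ as before. In view of the definitions of $u, v, L_1, L_2, L_3, L_4$ we have
$$L_1(u) = L_2(u) = L_1(v) = L_2(v) = 0, \quad L_3(u) = g_3, \quad L_4(v) = g_4, \quad L_4(u) = L_3(v) = \Omega. \qquad (3.19)$$
Thus
$$\begin{cases} L_4(h_\epsilon) = \epsilon^{-1} \Omega (v + \epsilon) + \epsilon^{-1}(u + \epsilon) g_4, \\ L_3(h_\epsilon) = \epsilon^{-1} \Omega (u + \epsilon) + \epsilon^{-1}(v + \epsilon) g_3, \\ L_1(h_\epsilon) = L_2(h_\epsilon) = 0, \end{cases} \qquad (3.20)$$
and, using (3.18), (3.19), and (3.20), in $B_{\epsilon^{10}}(x_0)$,
$$\begin{cases} (D^2 h_\epsilon)_{34} = (D^2 h_\epsilon)_{43} = \epsilon^{-1} \Omega^2 + O(1), \\ (D^2 h_\epsilon)_{33} = O(1), \quad (D^2 h_\epsilon)_{44} = O(1), \quad (D^2 h_\epsilon)_{ab} = O(1), \quad a, b = 1, 2, \\ (D^2 h_\epsilon)_{3a} = O(1), \quad (D^2 h_\epsilon)_{4a} = O(1), \quad a = 1, 2. \end{cases} \qquad (3.21)$$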
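The inverse-metric formulas (3.17) can be checked against (3.16) by exact arithmetic. The sketch below is only an illustration: the names `frame_metric`, `frame_inverse`, `matmul` and `grad_norm_sq` are hypothetical, and `omega` stands for the frame coefficient $g(L_3, L_4)$ (not the Kerr function $\omega$ of (3.3)). It also evaluates $D^\alpha h D_\alpha h$ for a covector with components $(0, 0, c, c)$, which is the shape of $Dh_\epsilon(x_0)$ according to (3.20) when $g_3 = g_4 = 0$.

```python
from fractions import Fraction


def frame_metric(g3, g4, omega):
    """Matrix of g in the frame (L1, L2, L3, L4), following (3.16)."""
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, g3, omega],
        [0, 0, omega, g4],
    ]


def frame_inverse(g3, g4, omega):
    """Candidate inverse matrix, following the formulas after (3.17)."""
    det = omega * omega - g3 * g4
    up3 = -g4 / det          # g^{33}
    up4 = -g3 / det          # g^{44}
    up_omega = omega / det   # g^{34}
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, up3, up_omega],
        [0, 0, up_omega, up4],
    ]


def matmul(a, b):
    """Plain 4x4 matrix product, kept exact when entries are Fractions."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def grad_norm_sq(inverse, covector):
    """g^{alpha beta} h_alpha h_beta for a covector given in the frame."""
    return sum(inverse[i][j] * covector[i] * covector[j]
               for i in range(4) for j in range(4))
```

For instance, with $g_3 = g_4 = 0$ and a covector $(0, 0, \Omega, \Omega)$ the last helper returns $2\Omega$, which is the value that enters the computation of condition (2.8).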
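Using (3.17), (3.20), (3.21), and $g_3(x_0) = g_4(x_0) = 0$ we compute
$$D^\alpha h_\epsilon(x_0) D^\beta h_\epsilon(x_0) (D_\alpha h_\epsilon D_\beta h_\epsilon - \epsilon D_\alpha D_\beta h_\epsilon)(x_0) = 2\Omega^2 + \epsilon O(1) \geq \delta_0^2$$
if $\epsilon_1$ is sufficiently small. Thus condition (2.8) is satisfied provided $\epsilon_1 \leq \tilde{C}^{-1}$.

Assume now $Y = Y^\alpha L_\alpha$ is a vector in $T_{x_0}(M^4)$. We fix $\mu = \epsilon_1^{-1/2}$ and compute, using (3.20), (3.21), and $g_3(x_0) = g_4(x_0) = 0$,
$$\begin{aligned} Y^\alpha Y^\beta (\mu g_{\alpha\beta} - D_\alpha D_\beta h_\epsilon)(x_0) + \epsilon^{-2} |Y^\alpha D_\alpha h_\epsilon|^2 &= \mu ((Y^1)^2 + (Y^2)^2 + 2\Omega Y^3 Y^4) - 2\epsilon^{-1} \Omega^2 Y^3 Y^4 + \epsilon^{-2} \Omega^2 (Y^3 + Y^4)^2 + O(1) \sum_{\alpha=1}^{4} (Y^\alpha)^2 \\ &\geq (\mu/2)[(Y^1)^2 + (Y^2)^2] + \Omega^2 (\epsilon^{-1}/2)[(Y^3)^2 + (Y^4)^2] \\ &\geq (Y^1)^2 + (Y^2)^2 + (Y^3)^2 + (Y^4)^2 \end{aligned}$$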
if $\epsilon_1$ is sufficiently small. We notice now that we can write $Y = X^\alpha \partial_\alpha$ in the coordinate frame $\partial_1, \partial_2, \partial_3, \partial_4$, and $|X^\alpha| \leq \tilde{C}(|Y^1| + |Y^2| + |Y^3| + |Y^4|)$ for $\alpha = 1, 2, 3, 4$. Thus condition (2.9) is satisfied provided $\epsilon_1 \leq \tilde{C}^{-1}$, which completes the proof of the proposition.
3.2. The second Carleman inequality in Kerr spaces. In this section we prove a Carleman estimate for functions supported in small open sets in $E^4$. Assume that $x_0 \in E^4$ and $\Phi^{x_0} : B_1 \to E^4$, $\Phi^{x_0}(0) = x_0$, is a smooth coordinate chart around $x_0$. We define the smooth function $N^{x_0} : B_1(x_0) \to [0, \infty)$, $N^{x_0}(x) = |(\Phi^{x_0})^{-1}(x)|^2$ as before. We use the notation in the Appendix. The coordinate function $r : \tilde{E}^4 \to (r_+, \infty)$ extends to a smooth function $r : E^4 \to (r_+, \infty)$. The main result in this subsection is the following Carleman estimate:

Proposition 3.3. There is $\epsilon \in (0, 1/2]$ sufficiently small and $\tilde{C}_\epsilon$ sufficiently large such that for any $\lambda \geq \tilde{C}_\epsilon$ and any $\phi \in C_0^\infty(B_{\epsilon^{10}}(x_0))$,
$$\lambda \|e^{-\lambda f_\epsilon} \phi\|_{L^2} + \|e^{-\lambda f_\epsilon} |D^1 \phi|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \|e^{-\lambda f_\epsilon} \square_g \phi\|_{L^2} + \epsilon^{-6} \|e^{-\lambda f_\epsilon} \xi(\phi)\|_{L^2}, \qquad (3.22)$$
where, with $r_0 = r(x_0)$,
$$f_\epsilon = \ln[r - r_0 + \epsilon + \epsilon^{12} N^{x_0}]. \qquad (3.23)$$

Proof of Proposition 3.3. As in the proof of Proposition 3.2, we will use the notation $\tilde{C}$ to denote various constants in $[1, \infty)$ that may depend only on the chart $\Phi^{x_0}$ and the position of $x_0$ in $E^4$ (i.e. on $(r(x_0) - r_+)^{-1} + (r(x_0) - r_+)$), and $O(1)$ to denote quantities bounded in absolute value by a constant $\tilde{C}$. It is important to keep in mind that $r(x_0) > r_+$, i.e. $x_0 \in E^4$. We apply Proposition 2.3 with $V = \xi$. It suffices to prove that there is $\epsilon_1 > 0$ such that the family of weights $\{h_\epsilon\}_{\epsilon \in (0, \epsilon_1)}$,
$$h_\epsilon = r - r_0 + \epsilon, \qquad (3.24)$$
satisfies conditions (2.7), (2.8), and (2.9). Condition (2.7) is clear if $\epsilon_1$ is sufficiently small, since $\xi(h_\epsilon) = 0$. To prove conditions (2.8) and (2.9), with the notation in Sect. A, we work in the orthonormal frame $e_0, e_1, e_2, e_3$ defined in (A.7). We have
$$D_0(h_\epsilon) = D_1(h_\epsilon) = D_3(h_\epsilon) = 0, \qquad D_2(h_\epsilon) = (\Delta/\rho^2)^{1/2}. \qquad (3.25)$$
Using the table (A.16), we have −D0 D0 h −D0 D1 h −D1 D1 h −D2 D2 h −D2 D3 h −D3 D3 h D0 D2 h
Y
r r −m − 2 = 2 + ρ ρ2
ma sin θ =− 2 · √ (2r Y − 2 ) ρ ρ 2 2
Y r =− 2 − 2 ρ 2 ρ
r r −m = 2 − ρ ρ2
√ 2
a sin θ cos θ =− ρ4
r =− 4 ρ = D0 D3 h = D1 D2 h = D1 D3 h = 0.
(3.26)
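It follows that
$$D^\alpha h_\epsilon(x_0) D^\beta h_\epsilon(x_0) (D_\alpha h_\epsilon D_\beta h_\epsilon - \epsilon D_\alpha D_\beta h_\epsilon)(x_0) = \Delta^2/\rho^4 + \epsilon O(1),$$
which verifies condition (2.8) if $\epsilon_1$ is sufficiently small. To verify condition (2.9) we fix
$$\mu = \frac{3 \Delta r}{2\rho^4} \qquad (3.27)$$
and use the formula (compare with (A.9) and (A.4))
$$\xi = \frac{\rho \sqrt{\Delta}}{\Sigma} e_0 - \frac{2amr \sin\theta}{\rho \Sigma} e_1. \qquad (3.28)$$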
Assume X = Y 0 e0 + Y 1 e1 + Y 2 e2 + Y 3 e3 is a vector expressed in the frame eα . We compute Y α Y β (µgαβ − Dα Dβ h )(x0 ) + −2 (|Y α ξα (x0 )|2 + |Y α Dα h (x0 )|2 ) = (Y 0 )2 (−µ − D0 D0 h ) + (Y 1 )2 (µ − D1 D1 h ) + 2Y 0 Y 1 (−D0 D1 h ) +(Y 2 )2 (µ − D2 D2 h ) + (Y 3 )2 (µ − D3 D3 h ) + 2Y 2 Y 3 (−D2 D3 h ) √ 2 Y 0 + 2amr (sin θ )Y 1 )2
(Y 2 )2 −2 (ρ + + −2 . (3.29) 2 2 ρ ρ4 √ Let Z = ρ 2 Y 0 + 2amr (sin θ )Y 1 , thus Y0 =
Z − 2amr (sin θ )Y 1 = αY 1 + β Z . √ ρ2
Using also µ − D3 D3 h = ( r )/(2ρ 4 ), the right-hand side of (3.29) becomes (Y 2 )2 ( −2 ρ −4 + µ − D2 D2 h ) + (Y 3 )2 ( rρ −4 )/2 − 2Y 2 Y 3 · D2 D3 h +Z 2 [ −2 ρ −2 −2 + β 2 (−µ − D0 D0 h )] +(Y 1 )2 [α 2 (−µ − D0 D0 h ) − 2αD0 D1 h + µ − D1 D1 h ] +2Y 1 Z [αβ(−µ − D0 D0 h ) − βD0 D1 h ]. (3.30) It is clear that the first line of the expression above is bounded from below by −1 ( −2 (Y 2 )2 + (Y 3 )2 ) C −1 . The main term we need to bound from below if is sufficiently small, since ≥ C 1 2 is the coefficient of (Y ) in (3.30). We use the table (3.26) and the definitions of α and µ; after several simplifications this term is equal to 5 r
Y 4a 2 m 2 r (sin θ )2 − + 2ρ 4 ρ22 ρ6
r2 r Y mr − a 2 − 2+ 2+ . 2ρ
In view of (A.14) and (A.15) this is bounded from below by $(\Delta r)/(2\rho^4)$. Thus the sum of the last three lines of (3.30) is bounded from below by
$$\tilde{C}^{-1}(\epsilon^{-2} Z^2 + (Y^1)^2)$$
if $\epsilon$ is sufficiently small. It follows that
$$Y^\alpha Y^\beta (\mu g_{\alpha\beta} - D_\alpha D_\beta h_\epsilon)(x_0) + \epsilon^{-2}(|Y^\alpha \xi_\alpha(x_0)|^2 + |Y^\alpha D_\alpha h_\epsilon(x_0)|^2) \geq \tilde{C}^{-1}[(Y^0)^2 + (Y^1)^2 + \epsilon^{-2}(Y^2)^2 + (Y^3)^2]$$
if $\epsilon$ is sufficiently small. The condition (2.9) is verified, which completes the proof of the proposition.

3.3. Vanishing of the tensor S. In this subsection we prove Theorem 1.2. Arguments showing how to use Carleman inequalities to prove uniqueness are standard. We provide all the details here for the sake of completeness. Some care is needed at the first step, in Lemma 3.4 below, since we do not assume that derivatives of the tensor $S$ vanish on the horizon. We show first that the tensor $S$ vanishes in a neighborhood of the bifurcate sphere $S_0$ in $E^4$.

Lemma 3.4. With the notation in Theorem 1.2, there is an open set $O' \subseteq M^4$, $S_0 \subseteq O'$, such that $S \equiv 0$ in $O' \cap E^4$.

Proof of Lemma 3.4. We use the functions $u, v$ defined in Lemma 3.1 and the Carleman estimate in Proposition 3.2. Since $S_0$ is compact, it suffices to prove that for every point $x_0 \in S_0$ there is a neighborhood $O_{x_0}$ of $x_0$ such that $S \equiv 0$ in $E^4 \cap O_{x_0}$. As in Proposition 3.2, assume $\Phi^{x_0} : B_1 \to O$, $\Phi^{x_0}(0) = x_0$, is a smooth coordinate chart
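around $x_0$. With the notation in Proposition 3.2, there are constants $\epsilon \in (0, \epsilon_0)$ and $\tilde{C}_\epsilon \geq 1$ such that, for any $\lambda \geq \tilde{C}_\epsilon$ and any $\phi \in C_0^\infty(B_{\epsilon^{10}}(x_0))$,
$$\lambda \|e^{-\lambda f_\epsilon} \phi\|_{L^2} + \|e^{-\lambda f_\epsilon} |D^1 \phi|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \|e^{-\lambda f_\epsilon} \square_g \phi\|_{L^2}, \qquad (3.31)$$
where
$$f_\epsilon = \ln[\epsilon^{-1}(u + \epsilon)(v + \epsilon) + \epsilon^{12} N^{x_0}]. \qquad (3.32)$$
The constant $\epsilon$ will remain fixed in this proof, and we assume implicitly that it is sufficiently small, as discussed in Proposition 3.2. We will show that
$$S \equiv 0 \quad \text{in } B_{\epsilon^{40}}(x_0) \cap E^4. \qquad (3.33)$$
For $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$ we define, using the coordinate chart $\Phi^{x_0}$,
$$\phi_{(j_1 \dots j_k)} = S(\partial_{j_1}, \dots, \partial_{j_k}). \qquad (3.34)$$
If $k = 0$ we define $\phi = S$ in $B_1(x_0)$. The functions $\phi_{(j_1 \dots j_k)} : B_1(x_0) \to \mathbb{C}$ are smooth. Let $\eta : \mathbb{R} \to [0, 1]$ denote a smooth function supported in $[1/2, \infty)$ and equal to 1 in $[3/4, \infty)$. With $u, v$ as in Lemma 3.1, for $\delta \in (0, 1]$ we define
$$\phi^{\delta,\epsilon}_{(j_1 \dots j_k)} = \phi_{(j_1 \dots j_k)} \cdot \mathbf{1}_{E^4} \cdot \eta(uv/\delta) \cdot \left(1 - \eta(N^{x_0}/\epsilon^{20})\right) = \phi_{(j_1 \dots j_k)} \cdot \tilde\eta_{\delta,\epsilon}. \qquad (3.35)$$
Clearly, $\phi^{\delta,\epsilon}_{(j_1 \dots j_k)} \in C_0^\infty(B_{\epsilon^{10}}(x_0))$. We would like to apply the inequality (3.31) to the functions $\phi^{\delta,\epsilon}_{(j_1 \dots j_k)}$, and then let $\delta \to 0$ and $\lambda \to \infty$ (in this order). Using the definition (3.35), we have
$$\square_g \phi^{\delta,\epsilon}_{(j_1 \dots j_k)} = \tilde\eta_{\delta,\epsilon} \cdot \square_g \phi_{(j_1 \dots j_k)} + 2 D_\alpha \phi_{(j_1 \dots j_k)} \cdot D^\alpha \tilde\eta_{\delta,\epsilon} + \phi_{(j_1 \dots j_k)} \cdot \square_g \tilde\eta_{\delta,\epsilon}.$$
Using the Carleman inequality (3.31), for any $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$ we have
$$\begin{aligned} \lambda \cdot \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_{\delta,\epsilon} \phi_{(j_1 \dots j_k)}\|_{L^2} &+ \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_{\delta,\epsilon} |D^1 \phi_{(j_1 \dots j_k)}|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \cdot \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_{\delta,\epsilon} \square_g \phi_{(j_1 \dots j_k)}\|_{L^2} \\ &+ \tilde{C}_\epsilon \left[ \|e^{-\lambda f_\epsilon} \cdot D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_{\delta,\epsilon}\|_{L^2} + \|e^{-\lambda f_\epsilon} \cdot \phi_{(j_1 \dots j_k)} (|\square_g \tilde\eta_{\delta,\epsilon}| + |D^1 \tilde\eta_{\delta,\epsilon}|)\|_{L^2} \right], \end{aligned} \qquad (3.36)$$
for any $\lambda \geq \tilde{C}_\epsilon$. We estimate now $|\square_g \phi_{(j_1 \dots j_k)}|$. Using the first identity in (1.6) and (3.34), in $B_{\epsilon^{10}}(x_0)$ we estimate pointwise
$$|\square_g \phi_{(j_1 \dots j_k)}| \leq \tilde{C}_{A,B} \sum_{l_1, \dots, l_k} \left( |D^1 \phi_{(l_1 \dots l_k)}| + |\phi_{(l_1 \dots l_k)}| \right), \qquad (3.37)$$
for some constant $\tilde{C}_{A,B}$ that depends only on the tensors $A$ and $B$. We add up the inequalities (3.36) over $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$. The key observation is that, in view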
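of (3.37), the first term in the right-hand side can be absorbed into the left-hand side for $\lambda$ sufficiently large. Thus, for any $\lambda \geq \tilde{C}_{A,B}$ and $\delta \in (0, 1]$,
$$\lambda \sum_{j_1, \dots, j_k} \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_{\delta,\epsilon} \phi_{(j_1 \dots j_k)}\|_{L^2} \leq \tilde{C}_\epsilon \sum_{j_1, \dots, j_k} \left[ \|e^{-\lambda f_\epsilon} \cdot D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_{\delta,\epsilon}\|_{L^2} + \|e^{-\lambda f_\epsilon} \cdot \phi_{(j_1 \dots j_k)} (|\square_g \tilde\eta_{\delta,\epsilon}| + |D^1 \tilde\eta_{\delta,\epsilon}|)\|_{L^2} \right]. \qquad (3.38)$$
We would like to let $\delta \to 0$ in (3.38). For this, we observe first that the functions $D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_{\delta,\epsilon}$ and $(|\square_g \tilde\eta_{\delta,\epsilon}| + |D^1 \tilde\eta_{\delta,\epsilon}|)$ vanish outside the set $A_\delta \cup \tilde{B}_\epsilon$, where
$$A_\delta = \{x \in B_{\epsilon^{10}}(x_0) \cap E^4 : uv \in (\delta/2, \delta)\}; \qquad \tilde{B}_\epsilon = \{x \in B_{\epsilon^{10}}(x_0) \cap E^4 : N^{x_0} \in (\epsilon^{20}/2, \epsilon^{20})\}.$$
In addition, since $\phi_{(j_1 \dots j_k)} = 0$ on $H$ (using the hypothesis of Theorem 1.2), it follows from Lemma 3.1 (c) that there are smooth functions $\phi'_{(j_1 \dots j_k)} : O \to \mathbb{C}$ such that
$$\phi_{(j_1 \dots j_k)} \left(1 - \eta(N^{x_0})\right) = uv \cdot \phi'_{(j_1 \dots j_k)} \quad \text{in } O \cap E^4. \qquad (3.39)$$
We show now that
$$|\square_g \tilde\eta_{\delta,\epsilon}| + |D^1 \tilde\eta_{\delta,\epsilon}| \leq \tilde{C}_\epsilon (\mathbf{1}_{\tilde{B}_\epsilon} + (1/\delta) \mathbf{1}_{A_\delta}). \qquad (3.40)$$
The inequality for $|D^1 \tilde\eta_{\delta,\epsilon}|$ follows directly from the definition (3.35). Also, using again the definition,
$$|D^\alpha D_\alpha \tilde\eta_{\delta,\epsilon}| \leq |D^\alpha D_\alpha (\mathbf{1}_{E^4} \cdot \eta(uv/\delta))| \cdot \left(1 - \eta(N^{x_0}/\epsilon^{20})\right) + \tilde{C}_\epsilon (\mathbf{1}_{\tilde{B}_\epsilon} + (1/\delta) \mathbf{1}_{A_\delta}).$$
Thus, for (3.40), it suffices to prove that
$$\mathbf{1}_{E^4} \cdot |D^\alpha D_\alpha (\eta(uv/\delta))| \leq \tilde{C}/\delta \cdot \mathbf{1}_{A_\delta}. \qquad (3.41)$$
Since $u, v, \eta$ are smooth functions, for (3.41) it suffices to prove that
$$\delta^{-2} |D^\alpha(uv) D_\alpha(uv)| \leq \tilde{C}/\delta \quad \text{in } A_\delta. \qquad (3.42)$$
Since $uv \in [\delta/2, \delta]$ in $A_\delta$, it suffices to prove that
$$u^2 |D^\alpha v D_\alpha v| + v^2 |D^\alpha u D_\alpha u| \leq \tilde{C} \delta \quad \text{in } A_\delta.$$
For this we use the frame $L_1, L_2, L_3, L_4$ as in the proof of Proposition 3.2. The bound follows from (3.19), (3.18), and (3.17). We show now that
$$|D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_{\delta,\epsilon}| \leq \tilde{C}_{\phi'} (\mathbf{1}_{\tilde{B}_\epsilon} + \mathbf{1}_{A_\delta}), \qquad (3.43)$$
where the constant $\tilde{C}_{\phi'}$ depends on the smooth functions $\phi'_{(j_1 \dots j_k)}$ defined in (3.39). Using the formula (3.39) (which becomes $\phi_{(j_1 \dots j_k)} = uv \cdot \phi'_{(j_1 \dots j_k)}$ in $A_\delta \cup \tilde{B}_\epsilon$), this follows easily from (3.42).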
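It follows from (3.39), (3.40), and (3.43) that
$$|D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_{\delta,\epsilon}| + |\phi_{(j_1 \dots j_k)}| (|\square_g \tilde\eta_{\delta,\epsilon}| + |D^1 \tilde\eta_{\delta,\epsilon}|) \leq \tilde{C}_{\phi'} (\mathbf{1}_{\tilde{B}_\epsilon} + \mathbf{1}_{A_\delta}).$$
Since $\lim_{\delta \to 0} \|\mathbf{1}_{A_\delta}\|_{L^2} = 0$, we can let $\delta \to 0$ in (3.38) to conclude that
$$\lambda \sum_{j_1, \dots, j_k} \|e^{-\lambda f_\epsilon} \cdot \mathbf{1}_{B_{\epsilon^{10}/2}(x_0) \cap E^4} \cdot \phi_{(j_1 \dots j_k)}\|_{L^2} \leq \tilde{C}_{\phi'} \sum_{j_1, \dots, j_k} \|e^{-\lambda f_\epsilon} \cdot \mathbf{1}_{\tilde{B}_\epsilon}\|_{L^2} \qquad (3.44)$$
for any $\lambda \geq \tilde{C}_{A,B}$. Finally, using the definition (3.32), we observe that
$$\inf_{B_{\epsilon^{40}}(x_0) \cap E^4} e^{-\lambda f_\epsilon} \geq e^{-\lambda \ln(\epsilon + \epsilon^{32}/2)} \geq \sup_{\tilde{B}_\epsilon} e^{-\lambda f_\epsilon}.$$
It follows from (3.44) that
$$\lambda \sum_{j_1, \dots, j_k} \|\mathbf{1}_{B_{\epsilon^{40}}(x_0) \cap E^4} \cdot \phi_{(j_1 \dots j_k)}\|_{L^2} \leq \tilde{C}_{\phi'} \sum_{j_1, \dots, j_k} \|\mathbf{1}_{\tilde{B}_\epsilon}\|_{L^2}$$
for any $\lambda \geq \tilde{C}_{A,B}$. We let now $\lambda \to \infty$. The identity (3.33) follows.

We show now that the tensor $S$ vanishes in an open neighborhood of the horizon $H$ in $E^4$. For any $R > r_+$ let $E^4_R = \{x \in E^4 : r(x) \in (r_+, R)\}$, where $r : E^4 \to (r_+, \infty)$ is the smooth function used in Proposition 3.3.

Lemma 3.5. With the notation in Theorem 1.2, there is $R > r_+$ such that $S \equiv 0$ in $E^4_R$.

Proof of Lemma 3.5. It follows from Lemma 3.1 (a) and Lemma 3.4 that there is $\epsilon_1 > 0$ such that
$$S \equiv 0 \text{ in the set } \{x \in E^4 \cap O : u(x) < \epsilon_1 \text{ and } v(x) < \epsilon_1\}. \qquad (3.45)$$
It suffices to prove that $S \equiv 0$ in $E^4_R \cap \tilde{E}^4$, where $\tilde{E}^4$ is the dense open subset of $E^4$ defined in Sect. A. In view of (3.45) and the definition of the functions $u, v$ in the proof of Lemma 3.1, there is $\epsilon_2 > 0$ such that
$$S \equiv 0 \text{ in the set } \{x = (r, t, \theta, \phi) \in \tilde{E}^4 : t = 0 \text{ and } r < r_+ + \epsilon_2\}. \qquad (3.46)$$
We use the Boyer-Lindquist coordinate chart (see Appendix A) to define $\tilde\partial_1 = \partial_r$, $\tilde\partial_2 = \partial_t$, $\tilde\partial_3 = \partial_\theta$, $\tilde\partial_4 = \partial_\phi$ and $\tilde\phi_{(j_1 \dots j_k)} = S(\tilde\partial_{j_1}, \dots, \tilde\partial_{j_k})$. The second identity in (1.6) gives, for any $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$,
$$\partial_t(\tilde\phi_{(j_1 \dots j_k)}) = \sum_{l_1, \dots, l_k} \tilde\phi_{(l_1 \dots l_k)} C^{l_1 \dots l_k}{}_{j_1 \dots j_k}. \qquad (3.47)$$
In view of (3.46),
$$\tilde\phi_{(j_1 \dots j_k)}(r, 0, \theta, \phi) = 0 \quad \text{if } r < r_+ + \epsilon_2.$$
Since $C$ is a smooth tensor in $E^4$, it follows that $\tilde\phi_{(j_1 \dots j_k)}(r, t, \theta, \phi) = 0$ if $r < r_+ + \epsilon_2$, which completes the proof of the lemma.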
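One way to spell out the last step of the proof of Lemma 3.5 is a Gronwall argument along the $t$-lines. The following is a sketch under our own notation: the energy $\Psi$ and the bound $M_T$ are introduced here for illustration and do not appear in the text. Fix $(r, \theta, \phi)$ with $r < r_+ + \epsilon_2$ and $T > 0$, and let $M_T$ be the supremum of the components $|C^{l_1 \dots l_k}{}_{j_1 \dots j_k}|$ over $|t| \leq T$ at this $(r, \theta, \phi)$, which is finite since $C$ is smooth.

```latex
\Psi(t) := \sum_{j_1,\dots,j_k} \bigl|\tilde\phi_{(j_1\dots j_k)}(r,t,\theta,\phi)\bigr|^2,
\qquad \Psi(0) = 0 \quad \text{by (3.46)}.

% Differentiate and use the transport identity (3.47):
\partial_t \Psi
  = 2\,\mathrm{Re} \sum_{j_1,\dots,j_k} \overline{\tilde\phi_{(j_1\dots j_k)}}
    \sum_{l_1,\dots,l_k} \tilde\phi_{(l_1\dots l_k)}\, C^{l_1\dots l_k}{}_{j_1\dots j_k},

% Cauchy--Schwarz over the 4^k multi-indices:
|\partial_t \Psi(t)|
  \le 2 M_T \Bigl(\sum_{j_1,\dots,j_k} |\tilde\phi_{(j_1\dots j_k)}|\Bigr)^2
  \le 2 \cdot 4^k M_T\, \Psi(t), \qquad |t| \le T.

% Gronwall's inequality:
\Psi(t) \le \Psi(0)\, e^{2 \cdot 4^k M_T |t|} = 0, \qquad |t| \le T.
```

Since $T$ is arbitrary, $\tilde\phi_{(j_1 \dots j_k)}$ vanishes along the whole $t$-line through each point of the initial slice in (3.46).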
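We prove now that $S \equiv 0$ in $E^4$, which completes the proof of the theorem. In view of Lemma 3.5, it suffices to prove the following:

Lemma 3.6. With the notation in Theorem 1.2, assume that
$$S \equiv 0 \text{ in } E^4_{R_0} \qquad (3.48)$$
for some $R_0 > r_+$. Then there is $R_1 > R_0$ such that $S \equiv 0$ in $E^4_{R_1}$.

Proof of Lemma 3.6. Assume that $x_0 \in E^4$ and $r(x_0) = R_0$. We show first that there is a neighborhood $O_{x_0}$ of $x_0$ such that
$$S \equiv 0 \text{ in } O_{x_0}. \qquad (3.49)$$
This is similar to the proof of Lemma 3.4, using the Carleman estimate in Proposition 3.3 instead of the Carleman estimate in Proposition 3.2. Assume $\Phi^{x_0} : B_1 \to E^4$, $\Phi^{x_0}(0) = x_0$, is a smooth coordinate chart around $x_0$. With the notation in Proposition 3.3, there is $\epsilon \in (0, 1/2]$ sufficiently small and $\tilde{C}_\epsilon$ sufficiently large such that
$$\lambda \|e^{-\lambda f_\epsilon} \phi\|_{L^2} + \|e^{-\lambda f_\epsilon} |D^1 \phi|\|_{L^2} \leq \tilde{C}_\epsilon \lambda^{-1/2} \|e^{-\lambda f_\epsilon} \square_g \phi\|_{L^2} + \epsilon^{-6} \|e^{-\lambda f_\epsilon} \xi(\phi)\|_{L^2}, \qquad (3.50)$$
for any $\lambda \geq \tilde{C}_\epsilon$ and any $\phi \in C_0^\infty(B_{\epsilon^{10}}(x_0))$, where
$$f_\epsilon = \ln[r - R_0 + \epsilon + \epsilon^{12} N^{x_0}]. \qquad (3.51)$$
The constant $\epsilon$ will remain fixed in this proof, and sufficiently small in the sense of Proposition 3.3. We will show that
$$S \equiv 0 \text{ in } B_{\epsilon^{40}}(x_0). \qquad (3.52)$$
For $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$ we define, using the coordinate chart $\Phi^{x_0}$, $\phi_{(j_1 \dots j_k)} = S(\partial_{j_1}, \dots, \partial_{j_k})$. If $k = 0$ we simply define $\phi = S$ in $B_1(x_0)$. The functions $\phi_{(j_1 \dots j_k)} : B_1(x_0) \to \mathbb{C}$ are smooth. Let $\eta : \mathbb{R} \to [0, 1]$ denote a smooth function supported in $[1/2, \infty)$ and equal to 1 in $[3/4, \infty)$, as before. We define
$$\phi^\epsilon_{(j_1 \dots j_k)} = \phi_{(j_1 \dots j_k)} \cdot \left(1 - \eta(N^{x_0}/\epsilon^{20})\right) = \phi_{(j_1 \dots j_k)} \cdot \tilde\eta_\epsilon.$$
Clearly, $\phi^\epsilon_{(j_1 \dots j_k)} \in C_0^\infty(B_{\epsilon^{10}}(x_0))$ and
$$\begin{cases} \square_g \phi^\epsilon_{(j_1 \dots j_k)} = \tilde\eta_\epsilon \cdot \square_g \phi_{(j_1 \dots j_k)} + 2 D_\alpha \phi_{(j_1 \dots j_k)} \cdot D^\alpha \tilde\eta_\epsilon + \phi_{(j_1 \dots j_k)} \cdot \square_g \tilde\eta_\epsilon; \\ \xi(\phi^\epsilon_{(j_1 \dots j_k)}) = \tilde\eta_\epsilon \cdot \xi(\phi_{(j_1 \dots j_k)}) + \phi_{(j_1 \dots j_k)} \cdot \xi(\tilde\eta_\epsilon). \end{cases}$$
Using the Carleman inequality (3.50), for any $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$ we have
$$\begin{aligned} \lambda \cdot \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_\epsilon \phi_{(j_1 \dots j_k)}\|_{L^2} &+ \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_\epsilon |D^1 \phi_{(j_1 \dots j_k)}|\|_{L^2} \\ &\leq \tilde{C}_\epsilon \lambda^{-1/2} \cdot \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_\epsilon \square_g \phi_{(j_1 \dots j_k)}\|_{L^2} + \tilde{C}_\epsilon \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_\epsilon \xi(\phi_{(j_1 \dots j_k)})\|_{L^2} \\ &+ \tilde{C}_\epsilon \left[ \|e^{-\lambda f_\epsilon} \cdot D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_\epsilon\|_{L^2} + \|e^{-\lambda f_\epsilon} \cdot \phi_{(j_1 \dots j_k)} (|\square_g \tilde\eta_\epsilon| + |D^1 \tilde\eta_\epsilon|)\|_{L^2} \right], \end{aligned} \qquad (3.53)$$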
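for any $\lambda \geq \tilde{C}_\epsilon$. Using the identities in (1.6), in $B_{\epsilon^{10}}(x_0)$ we estimate pointwise
$$\begin{cases} |\square_g \phi_{(j_1 \dots j_k)}| \leq \tilde{C}_{A,B,C} \sum_{l_1, \dots, l_k} \left( |D^1 \phi_{(l_1 \dots l_k)}| + |\phi_{(l_1 \dots l_k)}| \right); \\ |\xi(\phi_{(j_1 \dots j_k)})| \leq \tilde{C}_{A,B,C} \sum_{l_1, \dots, l_k} |\phi_{(l_1 \dots l_k)}|, \end{cases} \qquad (3.54)$$
for some constant $\tilde{C}_{A,B,C}$ that depends only on the constants $\tilde{C}$ and the tensors $A$, $B$, $C$. We add up the inequalities (3.53) over $(j_1, \dots, j_k) \in \{1, 2, 3, 4\}^k$. The key observation is that, in view of (3.54), the first two terms in the right-hand side can be absorbed into the left-hand side for $\lambda$ sufficiently large. Thus, for any $\lambda \geq \tilde{C}_{A,B,C}$,
$$\lambda \sum_{j_1, \dots, j_k} \|e^{-\lambda f_\epsilon} \cdot \tilde\eta_\epsilon \phi_{(j_1 \dots j_k)}\|_{L^2} \leq \tilde{C}_\epsilon \sum_{j_1, \dots, j_k} \left[ \|e^{-\lambda f_\epsilon} \cdot D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_\epsilon\|_{L^2} + \|e^{-\lambda f_\epsilon} \cdot \phi_{(j_1 \dots j_k)} (|\square_g \tilde\eta_\epsilon| + |D^1 \tilde\eta_\epsilon|)\|_{L^2} \right]. \qquad (3.55)$$
Using the hypothesis (3.48) and the definition of the function $\tilde\eta_\epsilon$, we have
$$|D_\alpha \phi_{(j_1 \dots j_k)} D^\alpha \tilde\eta_\epsilon| + |\phi_{(j_1 \dots j_k)}| (|\square_g \tilde\eta_\epsilon| + |D^1 \tilde\eta_\epsilon|) \leq \tilde{C}_\phi \cdot \mathbf{1}_{\{x \in B_{\epsilon^{10}}(x_0) :\ r \geq R_0 \text{ and } N^{x_0} > \epsilon^{20}/2\}},$$
for some constant $\tilde{C}_\phi$ that depends on the smooth functions $\phi_{(j_1 \dots j_k)}$. Using the definition (3.51), we observe also that
$$\inf_{B_{\epsilon^{40}}(x_0)} e^{-\lambda f_\epsilon} \geq e^{-\lambda \ln(\epsilon + \epsilon^{32}/2)} \geq \sup_{\{x \in B_{\epsilon^{10}}(x_0) :\ r \geq R_0 \text{ and } N^{x_0} > \epsilon^{20}/2\}} e^{-\lambda f_\epsilon}.$$
The identity (3.52) follows by letting $\lambda \to \infty$ in (3.55).

The set $\{x \in E^4 : t(x) = 0 \text{ and } r(x) = R_0\}$ is compact, where $t : E^4 \to \mathbb{R}$ is a smooth function which agrees with the coordinate function $t$ in the Boyer-Lindquist coordinates. It follows from (3.49) that there is $\epsilon_3 > 0$ such that
$$S \equiv 0 \text{ in the set } \{x \in E^4 : t(x) = 0 \text{ and } r(x) < R_0 + \epsilon_3\}. \qquad (3.56)$$
We define the vectors $\tilde\partial_1 = \partial_r$, $\tilde\partial_2 = \partial_t$, $\tilde\partial_3 = \partial_\theta$, $\tilde\partial_4 = \partial_\phi \in \mathbf{T}(\tilde{E}^4)$ and the functions $\tilde\phi_{(j_1 \dots j_k)} = S(\tilde\partial_{j_1}, \dots, \tilde\partial_{j_k})$ as in the proof of Lemma 3.5. It follows from the identity (3.47) and (3.56) that
$$\tilde\phi_{(j_1 \dots j_k)}(r, t, \theta, \phi) = 0 \quad \text{if } r < R_0 + \epsilon_3,$$
which completes the proof of the lemma.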
4. Proof of Theorem 1.1

In this section we prove Theorem 1.1. We define the smooth optical functions $u, v : \widetilde{E} \to (-1/2, \infty)$,

$$u(t, x) = |x| - 1 - t; \qquad v(t, x) = |x| - 1 + t, \tag{4.1}$$

where $\widetilde{E} = \{(t, x) \in M : |x| > |t| + 1/2\}$. Notice that $E = \{(t, x) \in \widetilde{E} : u > 0 \text{ and } v > 0\}$. For $R \in [1, \infty)$ we define the relatively compact open set

$$E_R = \{(t, x) \in E : (u + 1/2)(v + 1/2) < R\}. \tag{4.2}$$

Proposition 4.1. Assume $R \ge 1$. Then there is $\lambda(R) \gg 1$ such that for any $\phi \in C_0^2(E_R)$ and $\lambda \ge \lambda(R)$,

$$\lambda \|e^{-\lambda f} \phi\|_{L^2} + \|e^{-\lambda f} |D\phi|\|_{L^2} \le C_R \lambda^{-1/2} \|e^{-\lambda f} \Box\phi\|_{L^2}, \tag{4.3}$$

where

$$f = \log(u + 1/2) + \log(v + 1/2) = \log\big[(|x| - 1/2)^2 - t^2\big] \tag{4.4}$$

and $|D\phi| = \big(\sum_{\mu=0}^{d} |\partial_\mu \phi|^2\big)^{1/2}$.
The Carleman inequality in Proposition 4.1 suffices to prove Theorem 1.1, by an argument similar to the one given in Lemma 3.4 (which exploits implicitly the bifurcate characteristic geometry of H, using a cutoff function of the form $\eta(uv/\delta)$, to compensate for the fact that we do not assume vanishing of the derivatives of $\phi$ on H). Proposition 4.1 can be obtained as a direct consequence of Hörmander's general pseudo-convexity condition (2.1). For the convenience of the reader, we provide below a self-contained elementary proof of Proposition 4.1, in which we verify implicitly a similar pseudo-convexity condition in our simple case and show how it implies the Carleman inequality.

Proof of Proposition 4.1. The constants $C \ge 1$ in this proof may depend on $R$ and $d$. We may assume that $\phi \in C_0^\infty(E_R)$ is real-valued. Since all partial derivatives of $f$ are bounded in $E_R$, for (4.3) it suffices to prove that, for $\lambda \ge \lambda(R)$,

$$\lambda \|e^{-\lambda f} \phi\|_{L^2} + \||D(e^{-\lambda f} \phi)|\|_{L^2} \le C\lambda^{-1/2} \|e^{-\lambda f} \Box\phi\|_{L^2}. \tag{4.5}$$

To prove estimate (4.5) we start by setting $\phi = e^{\lambda f}\psi$ with $f = f(u, v)$ as above. Observe that

$$e^{-\lambda f}\Box(e^{\lambda f}\psi) = \Box\psi + \lambda(2D^\beta f D_\beta \psi + \Box f\, \psi) + \lambda^2 (D^\beta f D_\beta f)\psi.$$

Thus estimate (4.5) follows from

$$\lambda \|\psi\|_{L^2} + C^{-1}\||D\psi|\|_{L^2} \le C\lambda^{-1/2} \|L\psi + \lambda(\Box f)\psi\|_{L^2}, \tag{4.6}$$

where

$$L\psi = \Box\psi + 2\lambda W\psi + \lambda^2 G\psi, \qquad W = D^\alpha f D_\alpha, \qquad G = D^\beta f D_\beta f.$$

Since $\Box f$ is bounded on $E_R$, i.e. $|\Box f| \le C$, it suffices in fact to show that

$$\lambda \|\psi\|_{L^2} + C^{-1}\||D\psi|\|_{L^2} \le C\lambda^{-1/2} \|L\psi\|_{L^2}. \tag{4.7}$$
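For the reader's convenience, the conjugation identity behind (4.6) can be checked by a routine product-rule computation. The lines below are only a sketch of that check, assuming $\Box = D^\beta D_\beta$ on Minkowski space as in the text; no notation beyond that of the proof is introduced.

```latex
\begin{align*}
D_\beta\big(e^{\lambda f}\psi\big)
  &= e^{\lambda f}\big(D_\beta\psi + \lambda\, D_\beta f\,\psi\big),\\
\Box\big(e^{\lambda f}\psi\big)
  &= D^\beta\Big[e^{\lambda f}\big(D_\beta\psi + \lambda\, D_\beta f\,\psi\big)\Big]\\
  &= e^{\lambda f}\Big[\lambda\, D^\beta f\,\big(D_\beta\psi + \lambda\, D_\beta f\,\psi\big)
     + \Box\psi + \lambda\,\Box f\,\psi + \lambda\, D_\beta f\, D^\beta\psi\Big]\\
  &= e^{\lambda f}\Big[\Box\psi + \lambda\big(2 D^\beta f D_\beta\psi + \Box f\,\psi\big)
     + \lambda^2 \big(D^\beta f D_\beta f\big)\psi\Big].
\end{align*}
```

In the notation of the proof this reads $e^{-\lambda f}\Box(e^{\lambda f}\psi) = L\psi + \lambda(\Box f)\psi$, which is the quantity on the right-hand side of (4.6).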
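We shall establish in fact a lower bound for an integral of the form

$$E = \langle L\psi, 2\lambda(W - w)\psi \rangle = 2\lambda \int_{E_R} L\psi\, (W(\psi) - w\psi), \tag{4.8}$$

where $w$ is a smooth function on $E_R$ we will choose below. In fact we will choose $w$ such that we can establish the lower bound

$$E \ge C^{-1}\big(\lambda \|D\psi\|_{L^2}^2 + \lambda^3 \|\psi\|_{L^2}^2\big) + \lambda^2 \|(W - w)\psi\|_{L^2}^2. \tag{4.9}$$

Since $E \le \|L\psi\|_{L^2}^2 + \lambda^2 \|(W - w)\psi\|_{L^2}^2$, (4.7) easily follows from (4.9). Now, writing $L\psi = \Box\psi + \lambda^2 G\psi + \lambda(W\psi + w\psi) + \lambda(W\psi - w\psi)$,

$$\begin{aligned}
E &= 2\lambda \langle L\psi, (W - w)\psi \rangle = 2\lambda^2 \|(W - w)\psi\|_{L^2}^2 + 2\lambda^2 \|W\psi\|_{L^2}^2 - 2\lambda^2 \|w\psi\|_{L^2}^2 + E_1 + E_2, \\
E_1 &= \lambda \langle \Box\psi, (2W - 2w)\psi \rangle, \\
E_2 &= \lambda^3 \langle G\psi, (2W - 2w)\psi \rangle.
\end{aligned} \tag{4.10}$$

Thus, for bounded $w$ and for $\lambda$ sufficiently large, (4.9) is an immediate consequence of

$$2\lambda^2 \|W\psi\|_{L^2}^2 + E_1 + E_2 \ge C^{-1}\big(\lambda \|D\psi\|_{L^2}^2 + \lambda^3 \|\psi\|_{L^2}^2\big). \tag{4.11}$$

To evaluate $E_1$ and $E_2$ we make use of the following simple lemma.

Lemma 4.2. Let $Q_{\alpha\beta} = D_\alpha \psi D_\beta \psi - \frac{1}{2} m_{\alpha\beta} (D^\mu \psi D_\mu \psi)$ denote the energy-momentum tensor of the wave operator $\Box = m^{\alpha\beta} D_\alpha D_\beta$. Then

$$\Box\psi \cdot (2W\psi - 2w\psi) = D^\alpha (2W^\beta Q_{\alpha\beta} - 2w\psi \cdot D_\alpha \psi + D_\alpha w \cdot \psi^2) - Q_{\alpha\beta}(D^\alpha W^\beta + D^\beta W^\alpha) + 2w D^\alpha \psi \cdot D_\alpha \psi - \Box w \cdot \psi^2,$$

and

$$G\psi \cdot (2W\psi - 2w\psi) = D^\alpha (\psi^2 G \cdot W_\alpha) - \psi^2 (2wG + W(G) + G \cdot D^\alpha W_\alpha).$$

Since $\psi \in C_0^\infty(E_R)$ we integrate by parts to conclude that

$$E_1 + E_2 = \lambda \int_{E_R} \big(2w D^\alpha \psi \cdot D_\alpha \psi - 2D^\alpha W^\beta \cdot Q_{\alpha\beta}\big) + \lambda^3 \int_{E_R} \psi^2 (-2wG - W(G) - G \cdot D^\alpha W_\alpha) - \lambda \int_{E_R} \psi^2\, \Box w. \tag{4.12}$$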
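The first identity of Lemma 4.2 is a standard divergence computation. As a sketch, assuming only the definitions of $Q_{\alpha\beta}$, $W$ and $w$ given in the lemma, the three ingredients are:

```latex
\begin{align*}
D^\alpha Q_{\alpha\beta}
  &= \Box\psi\, D_\beta\psi + D^\alpha\psi\, D_\alpha D_\beta\psi
     - \tfrac12 D_\beta\big(D^\mu\psi\, D_\mu\psi\big)
   = \Box\psi\, D_\beta\psi,\\
D^\alpha\big(2W^\beta Q_{\alpha\beta}\big)
  &= 2\,\Box\psi\, W\psi + Q_{\alpha\beta}\big(D^\alpha W^\beta + D^\beta W^\alpha\big),\\
D^\alpha\big(-2w\psi\, D_\alpha\psi + D_\alpha w\,\psi^2\big)
  &= -2w\psi\,\Box\psi - 2w\, D^\alpha\psi\, D_\alpha\psi + \Box w\,\psi^2 .
\end{align*}
```

Adding the last two lines and solving for $\Box\psi\,(2W\psi - 2w\psi)$ gives the stated formula; the second line uses the symmetry of $Q_{\alpha\beta}$.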
To prove (4.11) we are reduced to proving pointwise bounds for the first two integrands in (4.12). More precisely, dividing by $\lambda$ and $\lambda^3$ respectively, it suffices to prove that the pointwise bounds

$$C^{-1} |D\psi|^2 \le \lambda |W(\psi)|^2 + (w D^\alpha \psi \cdot D_\alpha \psi - D^\alpha W^\beta \cdot Q_{\alpha\beta}), \tag{4.13}$$

and

$$C^{-1} \le -2wG - W(G) - G \cdot D^\alpha W_\alpha, \tag{4.14}$$

hold on $E_R$, for $\lambda$ sufficiently large. Recall that $W^\alpha = D^\alpha f$ and $G = D^\alpha f D_\alpha f$. Observe that

$$w D^\alpha \psi \cdot D_\alpha \psi - D^\alpha W^\beta \cdot Q_{\alpha\beta} = (D_\alpha \psi \cdot D_\beta \psi)\big[(w + \Box f/2) m^{\alpha\beta} - D^\alpha D^\beta f\big]$$

and

$$-2wG - W(G) - G \cdot D^\alpha W_\alpha = -G(2w + \Box f) - 2D^\alpha f D^\beta f \cdot D_\alpha D_\beta f.$$

Thus, with $w' = w + \Box f/2 \in C^\infty(E_R)$ (still to be chosen), the inequalities (4.13) and (4.14) are equivalent to the pointwise inequalities

$$C^{-1} |D\psi|^2 \le \lambda |D^\alpha f \cdot D_\alpha \psi|^2 + (D_\alpha \psi \cdot D_\beta \psi)(w' m^{\alpha\beta} - D^\alpha D^\beta f), \tag{4.15}$$

and

$$C^{-1} \le -w' (D^\alpha f D_\alpha f) - D^\alpha f D^\beta f \cdot D_\alpha D_\beta f \tag{4.16}$$

on $E_R$, for $\lambda$ sufficiently large. Let $h = e^f$ or, in view of (4.4), $h = (|x| - 1/2)^2 - t^2$. In terms of $h$, making use of the inequality $h \ge 1/4$ on $E_R$, the inequalities (4.15) and (4.16) are equivalent to

$$C^{-1} |D\psi|^2 \le \lambda |D^\alpha h \cdot D_\alpha \psi|^2 + (D_\alpha \psi \cdot D_\beta \psi)(w' m^{\alpha\beta} - h^{-1} D^\alpha D^\beta h), \tag{4.17}$$

and

$$C^{-1} \le D^\alpha h D^\beta h (h^{-2} D_\alpha h D_\beta h - h^{-1} D_\alpha D_\beta h) - w' D^\alpha h D_\alpha h, \tag{4.18}$$

provided that $\lambda$ is sufficiently large. To summarize, we need to find $w' \in C^\infty(E_R)$ such that the inequalities (4.17) and (4.18) hold in $E_R$, for all $\lambda$ sufficiently large. We shall see below that our function $h$, strictly positive and smooth on $E_R$, verifies the equation

$$D^\alpha h D_\alpha h = 4h. \tag{4.19}$$

We infer by differentiation that $D_\alpha D_\beta h\, D^\beta h = 2D_\alpha h$ and therefore $D_\alpha D_\beta h\, D^\alpha h D^\beta h = 8h$. Therefore the right-hand side of (4.18) is equal to $8 - 4hw'$ and thus inequality (4.18) is equivalent to $hw' \le 2 - C^{-1}$ in $E_R$, which is clearly satisfied if

$$w' = h^{-1}(2 - A_0 |x|^{-1}) \quad \text{for some constant } A_0 > 0. \tag{4.20}$$
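The algebra leading to $8 - 4hw'$ can be made explicit. The following is a sketch of that computation, writing $w'$ for the modified weight $w + \Box f/2$ and using only $h = (|x| - 1/2)^2 - t^2$ and the Minkowski metric with signature $(-,+,\ldots,+)$:

```latex
\begin{align*}
D_0 h &= -2t, \qquad D_j h = 2\,(|x|-1/2)\,\frac{x_j}{|x|},\\
D^\alpha h\, D_\alpha h &= -(2t)^2 + 4\,(|x|-1/2)^2 = 4h,\\
D_\alpha D_\beta h\, D^\beta h &= \tfrac12\, D_\alpha\big(D^\beta h\, D_\beta h\big) = 2\, D_\alpha h,
\qquad
D_\alpha D_\beta h\, D^\alpha h\, D^\beta h = 2\, D^\alpha h\, D_\alpha h = 8h,\\
D^\alpha h D^\beta h\big(h^{-2} D_\alpha h D_\beta h - h^{-1} D_\alpha D_\beta h\big) - w'\, D^\alpha h D_\alpha h
  &= h^{-2}(4h)^2 - h^{-1}(8h) - 4h\,w' = 8 - 4h\,w'.
\end{align*}
```

With the choice $w' = h^{-1}(2 - A_0|x|^{-1})$ this quantity equals $4A_0|x|^{-1}$, which is bounded below by a positive constant because $|x|$ is bounded on $E_R$.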
On the other hand, setting $Y^\alpha = D^\alpha \psi$ and $H_{\alpha\beta} = D_\alpha D_\beta h$, $\alpha, \beta = 0, \ldots, d$, and observing that $H_{0i} = 0$ for $i = 1, \ldots, d$, we infer that the right-hand side of (4.17) is equal to

$$\begin{aligned}
E &:= \lambda (D_\alpha h\, Y^\alpha)^2 + w'\big(-(Y^0)^2 + |Y|^2\big) - h^{-1}\big(H_{00}(Y^0)^2 + H_{ij} Y^i Y^j\big) \\
&= (Y^0)^2\big(-w' - h^{-1} H_{00}\big) + |Y|^2\big(w' - h^{-1} H_{ij} \hat{Y}^i \hat{Y}^j\big) + \lambda (D_0 h\, Y^0 + D_i h\, Y^i)^2,
\end{aligned}$$

where $|Y|^2 = \sum_{i=1}^d (Y^i)^2$ and $\hat{Y}^i = |Y|^{-1} Y^i$. Since $h = (|x| - 1/2)^2 - t^2$, we have $|h| + |h^{-1}| + |x| + (|x| - 1/2)^{-1} \le C$ in $E_R$. We compute

$$D_0 h = -2t, \qquad D_j h = (2 - |x|^{-1}) x_j \quad \text{for } j = 1, \ldots, d, \tag{4.21}$$

and

$$H_{00} = D_0 D_0 h = -2, \qquad H_{ij} = D_i D_j h = (2 - |x|^{-1})\delta_{ij} + x_i x_j |x|^{-3} \quad \text{for } i, j = 1, \ldots, d. \tag{4.22}$$

Thus we easily check that (4.19) is indeed verified. Setting $Z = Y \cdot \hat{x}$, with $\hat{x}_i = x_i/|x|$, the expression for $E$ becomes

$$\begin{aligned}
E &= (Y^0)^2 h^{-1}(2 - hw') + |Y|^2 h^{-1}\big(hw' - (2 - |x|^{-1})\big) - h^{-1}|x|^{-1} Z^2 + \lambda\big(-2tY^0 + (2|x| - 1)Z\big)^2 \\
&= h^{-1} A_0 |x|^{-1} (Y^0)^2 + h^{-1}(1 - A_0)|x|^{-1}|Y|^2 - h^{-1}|x|^{-1} Z^2 + \lambda\big(-2tY^0 + (2|x| - 1)Z\big)^2.
\end{aligned}$$

To derive the bound

$$E \ge C^{-1}\big((Y^0)^2 + |Y|^2\big), \tag{4.23}$$

from which (4.17) follows, we rely on the following simple lemma.

Lemma 4.3. Given $\delta > 0$ there exists $\lambda$ sufficiently large (depending on $R$ and $\delta$) such that the following inequality holds:

$$\lambda\big((2|x| - 1)Z - 2tY^0\big)^2 + h^{-1} A_0 |x|^{-1} (Y^0)^2 - h^{-1}|x|^{-1} Z^2 \ge (Y^0)^2 h^{-1}|x|^{-1}\Big(A_0 - \frac{t^2}{(|x| - 1/2)^2} - \delta\Big). \tag{4.24}$$

In view of the lemma the bound (4.23) follows by choosing $A_0 = 1 - C_0^{-1}$ and $\delta = C_0^{-1}$, for $C_0$ sufficiently large depending on $R$. This completes the proof of the proposition. We give below the proof of Lemma 4.3.
Proof. Inequality (4.24) is equivalent to

$$\lambda (2|x| - 1)^2 \Big(Z - \frac{t}{|x| - 1/2} Y^0\Big)^2 + h^{-1}|x|^{-1}\Big(\Big(\frac{t}{|x| - 1/2} Y^0\Big)^2 - Z^2\Big) + \delta h^{-1}|x|^{-1}(Y^0)^2 \ge 0.$$

Setting $X = \frac{t}{|x| - 1/2} Y^0 - Z$ we can rewrite the above inequality in the form

$$\lambda (2|x| - 1)^2 X^2 + h^{-1}|x|^{-1} X\Big(-X + 2\frac{t}{|x| - 1/2} Y^0\Big) + \delta h^{-1}|x|^{-1}(Y^0)^2 \ge 0$$

or, equivalently,

$$X^2\big(\lambda (2|x| - 1)^2 - h^{-1}|x|^{-1}\big) + 2h^{-1}|x|^{-1}\frac{t}{|x| - 1/2} X Y^0 + \delta h^{-1}|x|^{-1}(Y^0)^2 \ge 0,$$

which clearly holds for $(t, x)$ in $E_R$ and all $X, Y^0$ in $\mathbb{R}$ provided that $\lambda$ is sufficiently large.
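The final step is the positivity of a quadratic form in $(X, Y^0)$. The sketch below spells out one sufficient (not optimal) choice of $\lambda$; the abbreviations $a$, $b$, $c$ are introduced here only for illustration and do not appear in the original argument. They are the coefficients obtained by expanding the display that defines the inequality in terms of $X$.

```latex
\begin{align*}
& a X^2 + 2 b\, X Y^0 + c\,(Y^0)^2 \ge 0 \ \text{ for all } X, Y^0 \in \mathbb{R}
  \iff a \ge 0,\ c \ge 0,\ ac \ge b^2, \\
& a = \lambda(2|x|-1)^2 - h^{-1}|x|^{-1}, \qquad
  b = h^{-1}|x|^{-1}\,\frac{t}{|x|-1/2}, \qquad
  c = \delta\, h^{-1}|x|^{-1}, \\
& ac \ge b^2 \iff
  \lambda\,(2|x|-1)^2 \ \ge\ h^{-1}|x|^{-1}\Big(1 + \frac{t^2}{\delta\,(|x|-1/2)^2}\Big).
\end{align*}
```

On $E_R$ one has $|x| > |t| + 1$, hence $2|x| - 1 > 1$, $|x|^{-1} < 1$ and $t^2 < (|x| - 1/2)^2$, while $h \ge 1/4$ gives $h^{-1} \le 4$. Under these bounds the right-hand side is at most $4(1 + \delta^{-1})$, so any $\lambda \ge 4(1 + \delta^{-1})$ suffices.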
Appendix A. Explicit Computations in the Kerr Spaces

We consider the exterior region $E^4$ of the Kerr spacetime of mass $m$ and angular momentum $ma$, $a \in [0, m)$. Following [4, Chap. 6], in the standard Boyer-Lindquist coordinates $(r, t, \theta, \phi) \in (r_+, \infty) \times \mathbb{R} \times (0, \pi) \times S^1$, $r_\pm = m \pm (m^2 - a^2)^{1/2}$, the Kerr metric on a dense open subset $\widetilde{E}^4$ of $E^4$ is

$$ds^2 = -\frac{\rho^2 \Delta}{\Sigma^2}(dt)^2 + \frac{\Sigma^2 (\sin\theta)^2}{\rho^2}\Big(d\phi - \frac{2amr}{\Sigma^2}dt\Big)^2 + \frac{\rho^2}{\Delta}(dr)^2 + \rho^2 (d\theta)^2, \tag{A.1}$$

where

$$\begin{cases}
\Delta = r^2 + a^2 - 2mr; \\
\rho^2 = r^2 + a^2 (\cos\theta)^2; \\
\Sigma^2 = (r^2 + a^2)\rho^2 + 2mra^2 (\sin\theta)^2 = (r^2 + a^2)^2 - a^2 (\sin\theta)^2 \Delta.
\end{cases} \tag{A.2}$$
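The two expressions for $\Sigma^2$ in (A.2) agree by elementary algebra. As a quick check, using only the definitions of $\Delta$ and $\rho^2$ above:

```latex
\begin{align*}
(r^2+a^2)\rho^2 + 2mra^2(\sin\theta)^2
  &= (r^2+a^2)\big(r^2 + a^2 - a^2(\sin\theta)^2\big) + 2mra^2(\sin\theta)^2 \\
  &= (r^2+a^2)^2 - a^2(\sin\theta)^2\big(r^2 + a^2 - 2mr\big) \\
  &= (r^2+a^2)^2 - a^2(\sin\theta)^2\,\Delta .
\end{align*}
```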
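This metric is of the form

$$ds^2 = -e^{2\nu}(dt)^2 + e^{2\psi}(d\phi - \omega\, dt)^2 + e^{2\mu_2}(dr)^2 + e^{2\mu_3}(d\theta)^2, \tag{A.3}$$

where

$$\begin{aligned}
e^{2\nu} &= \frac{\rho^2 \Delta}{\Sigma^2} \quad\text{and}\quad \nu = \frac{1}{2}[\ln(\rho^2) + \ln\Delta - \ln(\Sigma^2)]; \\
e^{2\psi} &= \frac{\Sigma^2 (\sin\theta)^2}{\rho^2} \quad\text{and}\quad \psi = \frac{1}{2}[\ln(\Sigma^2) + 2\ln(\sin\theta) - \ln(\rho^2)]; \\
\omega &= \frac{2amr}{\Sigma^2}; \\
e^{2\mu_2} &= \frac{\rho^2}{\Delta} \quad\text{and}\quad \mu_2 = \frac{1}{2}[\ln(\rho^2) - \ln\Delta]; \\
e^{2\mu_3} &= \rho^2 \quad\text{and}\quad \mu_3 = \frac{1}{2}\ln(\rho^2).
\end{aligned} \tag{A.4}$$

We compute

$$\begin{aligned}
\partial_r \mu_2 &= \frac{r}{\rho^2} - \frac{r - m}{\Delta} \quad\text{and}\quad \partial_\theta \mu_2 = \frac{-a^2 \sin\theta\cos\theta}{\rho^2}; \\
\partial_r \mu_3 &= \frac{r}{\rho^2} \quad\text{and}\quad \partial_\theta \mu_3 = \frac{-a^2 \sin\theta\cos\theta}{\rho^2};
\end{aligned} \tag{A.5}$$

and

$$\begin{aligned}
\partial_r \omega &= -\frac{2am}{\Sigma^4}\big[(3r^2 - a^2)(r^2 + a^2) - a^2 (\sin\theta)^2 (r^2 - a^2)\big]; \\
\partial_\theta \omega &= \frac{4a^3 mr \Delta \sin\theta\cos\theta}{\Sigma^4}; \\
\partial_r \nu &= \frac{r}{\rho^2} + \frac{r - m}{\Delta} - \frac{2r(r^2 + a^2) - a^2 (\sin\theta)^2 (r - m)}{\Sigma^2}; \\
\partial_\theta \nu &= a^2 \sin\theta\cos\theta \Big(\frac{\Delta}{\Sigma^2} - \frac{1}{\rho^2}\Big); \\
\partial_r \psi &= \frac{2r(r^2 + a^2) - a^2 (\sin\theta)^2 (r - m)}{\Sigma^2} - \frac{r}{\rho^2}; \\
\partial_\theta \psi &= -a^2 \sin\theta\cos\theta \Big(\frac{\Delta}{\Sigma^2} - \frac{1}{\rho^2}\Big) + \frac{\cos\theta}{\sin\theta}.
\end{aligned} \tag{A.6}$$

We fix the frame

$$e_0 = e^{-\nu}(\partial_t + \omega\partial_\phi), \quad e_1 = e^{-\psi}\partial_\phi, \quad e_2 = e^{-\mu_2}\partial_r, \quad e_3 = e^{-\mu_3}\partial_\theta. \tag{A.7}$$

Clearly, $(g_{\alpha\beta}) = (g^{\alpha\beta}) = \mathrm{diag}(-1, 1, 1, 1)$, where $g_{\alpha\beta} = g(e_\alpha, e_\beta)$, $\alpha, \beta = 0, 1, 2, 3$. The dual basis of 1-forms is

$$\eta^0 = e^{\nu} dt, \quad \eta^1 = e^{\psi}(d\phi - \omega\, dt), \quad \eta^2 = e^{\mu_2} dr, \quad \eta^3 = e^{\mu_3} d\theta. \tag{A.8}$$

Also

$$\xi = \partial_t = e^{\nu} \cdot e_0 - e^{\psi}\omega \cdot e_1. \tag{A.9}$$

We compute now the covariant derivatives $D_{e_i} e_j$, $i, j = 0, 1, 2, 3$. We use the formula

$$g(Z, D_Y X) = \frac{1}{2}\big(X(g(Y, Z)) + Y(g(Z, X)) - Z(g(X, Y)) - g([X, Z], Y) - g([Y, Z], X) - g([X, Y], Z)\big), \tag{A.10}$$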
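for any vector fields $X, Y, Z$. We have

$$\begin{aligned}
[e_0, e_1] &= 0; \\
[e_0, e_2] &= e^{-\mu_2}\partial_r \nu \cdot e_0 - e^{\psi - \mu_2 - \nu}\partial_r \omega \cdot e_1; \\
[e_0, e_3] &= e^{-\mu_3}\partial_\theta \nu \cdot e_0 - e^{\psi - \mu_3 - \nu}\partial_\theta \omega \cdot e_1; \\
[e_1, e_2] &= e^{-\mu_2}\partial_r \psi \cdot e_1; \\
[e_1, e_3] &= e^{-\mu_3}\partial_\theta \psi \cdot e_1; \\
[e_2, e_3] &= e^{-\mu_3}\partial_\theta \mu_2 \cdot e_2 - e^{-\mu_2}\partial_r \mu_3 \cdot e_3.
\end{aligned} \tag{A.11}$$

With $[e_i, e_j] = C^k_{ij} e_k$, $C^k_{ij} + C^k_{ji} = 0$, it follows from (A.10) that

$$D_{e_j} e_i = -\frac{1}{2}\sum_{k=0}^{3}\big(g_{jj} g_{kk} C^j_{ik} + g_{ii} g_{kk} C^i_{jk} + C^k_{ij}\big) e_k. \tag{A.12}$$

Using the table (A.11), this gives

$$\begin{aligned}
D_{e_0} e_0 &= C^0_{02} e_2 + C^0_{03} e_3; & D_{e_1} e_0 &= -\tfrac{1}{2} C^1_{02} e_2 - \tfrac{1}{2} C^1_{03} e_3; \\
D_{e_2} e_0 &= -\tfrac{1}{2} C^1_{02} e_1; & D_{e_3} e_0 &= -\tfrac{1}{2} C^1_{03} e_1; \\
D_{e_0} e_1 &= -\tfrac{1}{2} C^1_{02} e_2 - \tfrac{1}{2} C^1_{03} e_3; & D_{e_1} e_1 &= (-1) C^1_{12} e_2 + (-1) C^1_{13} e_3; \\
D_{e_2} e_1 &= -\tfrac{1}{2} C^1_{02} e_0; & D_{e_3} e_1 &= -\tfrac{1}{2} C^1_{03} e_0; \\
D_{e_0} e_2 &= C^0_{02} e_0 + \tfrac{1}{2} C^1_{02} e_1; & D_{e_1} e_2 &= -\tfrac{1}{2} C^1_{02} e_0 + C^1_{12} e_1; \\
D_{e_2} e_2 &= -C^2_{23} e_3; & D_{e_3} e_2 &= -C^3_{23} e_3; \\
D_{e_0} e_3 &= C^0_{03} e_0 + \tfrac{1}{2} C^1_{03} e_1; & D_{e_1} e_3 &= -\tfrac{1}{2} C^1_{03} e_0 + C^1_{13} e_1; \\
D_{e_2} e_3 &= C^2_{23} e_2; & D_{e_3} e_3 &= C^3_{23} e_2.
\end{aligned} \tag{A.13}$$

Let $Y = 2r(r^2 + a^2) - a^2 (\sin\theta)^2 (r - m)$, and observe that

$$(3r^2 - a^2)(r^2 + a^2) - a^2 (\sin\theta)^2 (r^2 - a^2) = 2rY - \Sigma^2 > 0, \tag{A.14}$$

and

$$2r\Sigma^2 > \rho^2 Y. \tag{A.15}$$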
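We compute now the Hessian $D^2 r$. More generally, for a function $f$ that depends only on $r$ (i.e. $e_0(f) = e_1(f) = e_3(f) = 0$), using (A.11) and (A.13), and the formula $D_\alpha D_\beta f = D_\beta D_\alpha f = e_\alpha(e_\beta(f)) - D_{e_\alpha} e_\beta(f)$, we compute

$$\begin{aligned}
D_0 D_0 f &= -C^0_{02} e^{-\mu_2}\partial_r f = -\frac{\Delta}{\rho^2}\Big(\frac{r}{\rho^2} + \frac{r - m}{\Delta} - \frac{Y}{\Sigma^2}\Big)\partial_r f, \\
D_0 D_1 f &= \frac{1}{2} C^1_{02} e^{-\mu_2}\partial_r f = \frac{\Delta}{\rho^2} \cdot \frac{ma\sin\theta}{\rho^2 \Sigma^2 \sqrt{\Delta}}(2rY - \Sigma^2)\partial_r f, \\
D_1 D_1 f &= C^1_{12} e^{-\mu_2}\partial_r f = \frac{\Delta}{\rho^2}\Big(\frac{Y}{\Sigma^2} - \frac{r}{\rho^2}\Big)\partial_r f, \\
D_2 D_2 f &= e^{-\mu_2}\partial_r (e^{-\mu_2}\partial_r f) = \frac{\Delta}{\rho^2}\partial_r^2 f - \frac{\Delta}{\rho^2}\Big(\frac{r}{\rho^2} - \frac{r - m}{\Delta}\Big)\partial_r f, \\
D_2 D_3 f &= -C^2_{23} e^{-\mu_2}\partial_r f = \frac{\sqrt{\Delta}\, a^2 \sin\theta\cos\theta}{\rho^4}\partial_r f, \\
D_3 D_3 f &= -C^3_{23} e^{-\mu_2}\partial_r f = \frac{r\Delta}{\rho^4}\partial_r f, \\
D_0 D_2 f &= D_0 D_3 f = D_1 D_2 f = D_1 D_3 f = 0.
\end{aligned} \tag{A.16}$$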
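As a sanity check of the Hessian formulas for $f = r$, one can specialize to the Schwarzschild case $a = 0$. This special case is added here only for illustration and is not part of the original appendix; it uses $\rho^2 = r^2$, $\Sigma^2 = r^4$, $Y = 2r^3$ and $\Delta = r^2 - 2mr$.

```latex
\begin{align*}
D_0 D_0 r &= -\frac{\Delta}{r^2}\Big(\frac{1}{r} + \frac{r-m}{\Delta} - \frac{2}{r}\Big)
           = -\frac{1}{r^2}\Big(r - m - \frac{\Delta}{r}\Big) = -\frac{m}{r^2},\\
D_1 D_1 r &= \frac{\Delta}{r^2}\Big(\frac{2}{r} - \frac{1}{r}\Big) = \frac{\Delta}{r^3},
\qquad
D_3 D_3 r = \frac{r\Delta}{r^4} = \frac{\Delta}{r^3},\\
D_2 D_2 r &= -\frac{\Delta}{r^2}\Big(\frac{1}{r} - \frac{r-m}{\Delta}\Big)
           = -\frac{1}{r^2}\Big(\frac{\Delta}{r} - r + m\Big) = \frac{m}{r^2},\\
\Box_g r &= -D_0 D_0 r + D_1 D_1 r + D_2 D_2 r + D_3 D_3 r
          = \frac{2m}{r^2} + \frac{2(r-2m)}{r^2} = \frac{2(r-m)}{r^2}.
\end{align*}
```

The trace agrees with the direct computation $\Box_g r = r^{-2}\partial_r\big(r^2(1 - 2m/r)\big) = 2(r - m)/r^2$ in Schwarzschild coordinates, and the off-diagonal components $D_0 D_1 r$ and $D_2 D_3 r$ vanish when $a = 0$, as expected.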
Acknowledgements. We would like to thank A. Rendall for bringing to our attention the work of Friedlander, [6,7].
References

1. Alinhac, S., Baouendi, M.S.: A non-uniqueness result for operators of principal type. Math. Z. 220, 561–568 (1995)
2. Carleman, T.: Sur un probleme d'unicite pour les systemes d'equations aux derivees partielles a deux variables independantes. Ark. Mat., Astr. Fys. 26 (1939)
3. Carter, B.: An axisymmetric black hole has only two degrees of freedom. Phys. Rev. Lett. 26, 331–333 (1971)
4. Chandrasekhar, S.: The mathematical theory of black holes. International Series of Monographs on Physics, 69, Oxford Science Publications, New York: The Clarendon Press/Oxford University Press, 1983
5. Cohen, P.: The non-uniqueness of the Cauchy problem. ONR Technical Report, 93, Stanford Univ., 1960
6. Friedlander, F.G.: On an improperly posed characteristic initial value problem. J. Math. Mech. 16, 907–915 (1967)
7. Friedlander, F.G.: An inverse problem for radiation fields. Proc. London Math. Soc. 27, 551–576 (1973)
8. John, F.: On linear partial differential equations with analytic coefficients. Unique Continuation of Data. Comm. Pure Appl. Math. 2, 209–253 (1949)
9. John, F.: Partial Differential Equations. Fourth edition, Berlin-Heidelberg-New York: Springer-Verlag, 1991
10. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge Univ. Press, 1973
11. Hörmander, L.: On the uniqueness of the Cauchy problem under partial analyticity assumptions. In: Geometrical optics and related topics (Cortona, 1996), Progr. Nonlinear Differential Equations Appl., 32, Boston, MA: Birkhäuser Boston, 1997, pp. 179–219
12. Hörmander, L.: Non-uniqueness for the Cauchy problem. Lect. Notes in Math., 459, Berlin-Heidelberg-New York: Springer-Verlag, 1975, pp. 36–72
13. Hörmander, L.: The analysis of linear partial differential operators. Berlin: Springer-Verlag, 1985
14. Ionescu, A., Klainerman, S.: On the uniqueness of smooth, stationary black holes in vacuum. http://arXiv.org/abs/0711.0040v2 [gr-qc], 2008
15. Isakov, V.: Carleman type estimates in an anisotropic case and applications. J. Diff. Eqs. 105, 217–238 (1993)
16. Kenig, C.E., Ruiz, A., Sogge, C.D.: Uniform Sobolev inequalities and unique continuation for second order constant coefficient differential operators. Duke Math. J. 55, 329–347 (1987)
17. Mars, M.: A spacetime characterization of the Kerr metric. Class. Quant. Grav. 16, 2507–2523 (1999)
18. Pretorius, F., Israel, W.: Quasi-spherical light cones of the Kerr geometry. Class. Quant. Grav. 15, 2289–2301 (1998)
19. Robbiano, L., Zuily, C.: Uniqueness in the Cauchy problem for operators with partially holomorphic coefficients. Invent. Math. 131, 493–539 (1998)
20. Robinson, D.C.: Uniqueness of the Kerr black hole. Phys. Rev. Lett. 34, 905–906 (1975)
21. Tataru, D.: Unique continuation for operators with partially analytic coefficients. J. Math. Pures Appl. 78, 505–521 (1999)

Communicated by P. Constantin
Commun. Math. Phys. 285, 901–923 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0651-x
Communications in
Mathematical Physics
Stochastic Porous Media Equations and Self-Organized Criticality

Viorel Barbu1, Giuseppe Da Prato2, Michael Röckner3,4

1 Institute of Mathematics "Octav Mayer", Romanian Academy, Iasi 700506, Romania
2 Scuola Normale Superiore di Pisa, Piazza dei Cavalieri, 7, 56126 Pisa, Italy. E-mail: [email protected]
3 Faculty of Mathematics, University of Bielefeld, D-33501 Bielefeld, Germany
4 Department of Mathematics and Statistics, Purdue University, 150 N. University Street, West Lafayette, IN 47907-2067, USA
Received: 1 November 2007 / Accepted: 3 July 2008 Published online: 25 October 2008 – © Springer-Verlag 2008
Abstract: The existence and uniqueness of nonnegative strong solutions for stochastic porous media equations with noncoercive monotone diffusivity function and Wiener forcing term is proven. The finite time extinction of solutions with high probability is also proven in 1-D. The results are relevant for self-organized criticality behavior of stochastic nonlinear diffusion equations with critical states.

1. Introduction

The phenomenon of self-organized criticality is widely studied in Physics from different perspectives. (We refer to [1,2,8–10,13–19,23] for various studies.) Roughly speaking it is the property of systems to have a critical point as attractor and to reach spontaneously a critical state. In [2] Bantay and Janosi beautifully explained that the continuum limit of the sand pile model of Bak-Tang-Wiesenfeld in [1] ("BTW model"), which was based on a cellular automaton algorithm, can be interpreted as a solution of an anomalous (singular) diffusion equation of the type

$$dX(t) = \Delta\big(H(X(t) - x_c)\big)\,dt, \tag{1.1}$$

where $H$ is the Heaviside function and $x_c$ is the critical value. In [13] (see also [14]) Diaz-Guilera pointed out that for this and a similar model due to Zhang [24] given by

$$dX(t) = \Delta\big((X(t) - x_c) H(X(t) - x_c)\big)\,dt, \tag{1.2}$$

it is more realistic to consider Eqs. (1.1) and (1.2) perturbed by (an additive) noise to model a random amount of energy put into the system varying all over the underlying domain. The resulting equations are then stochastic partial differential equations (SPDE) of evolution type, however, with very singular (non-continuous) coefficients which mathematically can only be treated as multi-valued functions.
The purpose of this paper is to analyze equations of this type within the framework of multi-valued stochastic evolution equations with (1.1) and (1.2) as the underlying motivating examples. To the best of our knowledge this is the first time this is done in the presence of a stochastic force and in such generality in a mathematically strict way.

Let us introduce our framework. Let $\mathcal{O}$ be an open bounded domain of $\mathbb{R}^d$, $d = 1, 2, 3$, with smooth boundary $\partial\mathcal{O}$. We shall study here the nonlinear stochastic diffusion equation with linear multiplicative noise,

$$\begin{cases}
dX(t) - \Delta\Psi(X(t))\,dt \ni \sigma(X(t))\,dW(t), & \text{in } (0, \infty) \times \mathcal{O}, \\
\Psi(X(t)) \ni 0, & \text{on } (0, \infty) \times \partial\mathcal{O}, \\
X(0, x) = x & \text{on } \mathcal{O},
\end{cases} \tag{1.3}$$

where $x$ is an initial datum and $\Psi : \mathbb{R} \to 2^{\mathbb{R}}$ is a maximal monotone (possibly multivalued) graph with polynomial growth and random forcing term

$$\sigma(X)\,dW = \sum_{k=1}^{\infty} \mu_k X\, d\beta_k\, e_k, \quad t \ge 0,$$

which is linear in $X$. Here $\{e_k\}$ is an orthonormal basis in $L^2(\mathcal{O})$, $\{\mu_k\}$ is a sequence of positive numbers and $\{\beta_k\}$ a sequence of independent standard Brownian motions on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, P)$. We note that the linear operator $\sigma(X)$ is defined by

$$\sigma(X)h = \sum_{k=1}^{\infty} \mu_k X \langle h, e_k \rangle_2\, e_k, \quad \forall\, h \in L^2(\mathcal{O}),$$

where $\langle \cdot, \cdot \rangle_2$ is the scalar product in $L^2(\mathcal{O})$.

Apart from the self-organized criticality phenomena mentioned above, Eq. (1.3) models the dynamics of flows in porous media and more generally the phase transition (including melting and solidification processes) in the presence of a random forcing term $\sigma(X)\,dW$. Existence for stochastic equations of the form (1.3) with additive and multiplicative noise was studied in [6] under the main assumption that $\Psi$ is monotonically increasing, continuous and such that

$$\begin{cases}
\Psi(0) = 0, \quad \Psi'(r) \le \alpha_1 |r|^{m-1} + \alpha_2, & \forall\, r \in \mathbb{R}, \\[2pt]
\displaystyle\int_0^r \Psi(s)\,ds \ge \alpha_3 |r|^{m+1} + \alpha_4, & \forall\, r \in \mathbb{R},
\end{cases} \tag{1.4}$$

where $\alpha_1 \ge 0$, $\alpha_3 > 0$, $\alpha_2, \alpha_4 \ge 0$ and $m \ge 1$. (See also [7] and [22] for general growth conditions on $\Psi$.) Here we shall study Eq. (1.3) under the following assumptions.

Hypothesis 1.1. (i) $\Psi$ is a maximal monotone multivalued function from $\mathbb{R}$ into $\mathbb{R}$ such that $0 \in \Psi(0)$.
(ii) There exist $C > 0$ and $m \ge 1$ such that

$$\sup\{|\theta| : \theta \in \Psi(r)\} \le C(1 + |r|^m), \quad \forall\, r \in \mathbb{R}.$$

(iii) The sequence $\{\mu_k\}$ is such that

$$\sum_{k=1}^{\infty} \mu_k^2 \lambda_k^2 < +\infty,$$

where $\lambda_k$ are the eigenvalues of the Laplace operator $-\Delta$ in $\mathcal{O}$ with Dirichlet boundary conditions.

We recall that the domain of $\Delta$ is $H^2(\mathcal{O}) \cap H_0^1(\mathcal{O})$. A multivalued function $\Psi : \mathbb{R} \to 2^{\mathbb{R}}$ is said to be maximal monotone if it is monotone, i.e.,

$$(v_1 - v_2)(u_1 - u_2) \ge 0, \quad \forall\, v_i \in \Psi(u_i),\ u_i \in \mathbb{R},\ i = 1, 2,$$

and the range $R(I + \Psi)$ of $I + \Psi$ is all of $\mathbb{R}$. Standard examples of maximal monotone functions (or graphs) are continuous and increasing functions, the subdifferential of the indicator function $I_K$ of a closed interval $K$ of the form $[a, b]$ or $(-\infty, b]$, $[0, +\infty)$, i.e.

$$I_K(r) = \begin{cases} 0, & \text{if } r \in K, \\ +\infty, & \text{if } r \notin K, \end{cases}$$

or for $-\infty = a_0 < a_1 < \cdots < a_{N+1} = \infty$ and for $0 \le i \le N - 1$,

$$\Psi(r) = \begin{cases} \varphi_i(r), & \text{for } a_i < r < a_{i+1}, \\ [\varphi_i(a_{i+1} - 0), \varphi_{i+1}(a_{i+1} + 0)], & \text{for } r = a_{i+1}, \end{cases}$$

where $\{\varphi_i\}_{i=1}^{N}$ are monotonically non-decreasing continuous functions on $(a_i, a_{i+1})$ and such that $\lim_{r \to a_{i+1}} \varphi_i(r) \le \lim_{r \to a_{i+1}} \varphi_{i+1}(r)$. Of course, any linear combination of maximal monotone graphs is maximal monotone. It should be noticed also that the subdifferential $\partial j : \mathbb{R} \to 2^{\mathbb{R}}$ of a lower semicontinuous convex function $j : \mathbb{R} \to (-\infty, +\infty]$, i.e.,

$$\partial j(r) = \{\eta \in \mathbb{R} : j(r) \le \eta(r - \bar{r}) + j(\bar{r}),\ \forall\, \bar{r} \in \mathbb{R}\},$$

is maximal monotone and conversely every maximal monotone function $\Psi$ is of the form $\partial j$, where $j$ is a lower semicontinuous convex function on $\mathbb{R}$. Since for $x \in H^{-1}(\mathcal{O})$,

$$|x e_k|_{-1}^2 \le C_1 |e_k|_{H^2(\mathcal{O})}^2 |x|_{-1}^2 \le C_1 \lambda_k^2 |x|_{-1}^2, \tag{1.5}$$

and hence

$$\|\sigma(x)\|^2_{L_2(L^2(\mathcal{O}), H^{-1}(\mathcal{O}))} = \sum_{k=1}^{\infty} \mu_k^2 |x e_k|_{-1}^2 \le C_1 \sum_{k=1}^{\infty} \mu_k^2 \lambda_k^2 |x|_{-1}^2, \tag{1.6}$$

it follows by (iii) that $\sigma(x) \in L_2(L^2(\mathcal{O}), H^{-1}(\mathcal{O}))$ (the space of all Hilbert-Schmidt operators from $L^2(\mathcal{O})$ into $H^{-1}(\mathcal{O})$) and that it is Lipschitz continuous from $H^{-1}(\mathcal{O})$ into $L_2(L^2(\mathcal{O}), H^{-1}(\mathcal{O}))$. Under these assumptions we shall prove that if $x \in L^p(\mathcal{O})$,
$p \ge \max\{2m, 4\}$, then there is a unique strong solution to Eq. (1.3) which is nonnegative if so is the initial data $x$. With respect to the situation considered in [5–7], in the present case one does not assume that the range of $\Psi$ is all of $\mathbb{R}$. This general setting, motivated by the diffusion models mentioned above, requires, however, a different treatment of existence.

It should be mentioned that several other physical problems with free boundary and with phase transition can be put into this functional setting. For instance if

$$\Psi(x) = \begin{cases} \alpha_1 (x - a), & \text{for } x < a, \\ [0, \rho], & \text{for } x = a, \\ \alpha_2 (x - a) + \rho, & \text{for } x > a, \end{cases} \tag{1.7}$$

with $a, \rho, \alpha_1, \alpha_2 \in (0, +\infty)$, then (1.3) models the phase transition in porous media or in heat conduction (Stefan problem). If $\Psi(x) = \rho\, \mathrm{sign}\, x$, where $\rho > 0$ and

$$\mathrm{sign}\, x = \begin{cases} \dfrac{x}{|x|}, & \text{if } x \ne 0, \\[4pt] [-1, 1], & \text{if } x = 0, \end{cases} \tag{1.8}$$

then (1.3) reduces to the nonlinear singular diffusion equation

$$dX(t) - \rho\, \mathrm{div}\,(\delta(X(t)) \nabla X(t))\,dt = \sigma(X(t))\,dW(t),$$

where $\delta$ is the Dirac measure concentrated at the origin. We already mentioned the Heaviside step function

$$H(x) = \begin{cases} 0, & \text{if } x < 0, \\ [0, 1], & \text{if } x = 0, \\ 1, & \text{if } x > 0. \end{cases}$$

Furthermore, $\Psi(x) = |x|^\alpha\, \mathrm{sign}\, x$ with $0 < \alpha \le 1$ also satisfies Hypothesis 1.1. Typical examples considered in the literature are $\Psi(r) = (r - x_c)^\alpha$, where $\alpha < 1$ and the key result is that the density $X(t)$ of the system converges to the critical value. In the same category fall the stochastically perturbed versions of Eqs. (1.1) and (1.2), that is e.g. in the first case the highly singular diffusion equation

$$dX(t) - \Delta(H + \lambda)(X(t) - x_c)\,dt = \sigma(X(t) - x_c)\,dW(t), \tag{1.9}$$
where $\lambda \ge 0$. This is a diffusion problem with free boundary driven by a random forcing term proportional to $X(t) - x_c$, where $x_c$ is the critical density and $X(t)$ is the density at the moment $t$. Taking into account the numerical simulation in 1-D (see [2]), one might expect that the time evolution of the system displays self-organized criticality, i.e. the supercritical region $\{X(t) > x_c\}$ is absorbed asymptotically in time by the critical one $\{X(t) = x_c\}$. A few of the previous works (see e.g. [11]) on self-organized criticality in singular diffusion equations based on numerical tests drew attention to the failure of the self-organized behavior in the presence of random fluctuations (white noise perturbation). Here we shall prove, however, for systems of the form (1.7)–(1.9) that the self-organized criticality takes place with high probability under appropriate assumptions on the parameters and more precisely that the supercritical region "vanishes" into the critical one in finite time with high probability, at least if $\mu_k = 0$ for all $k \ge N + 1$ for
some $N \in \mathbb{N}$. We emphasize that this is in particular true when the noise is zero. In this case one gets an explicit bound for the time when this happens (cf. Remark 4.4 below).

The plan of this paper is the following. The main results are presented in Sect. 2 and are proven in Sect. 3. In Sect. 4 we prove a finite time extinction type result for solutions to (1.3) which displays a self-organized criticality behavior.

The following notations will be used. $L^p(\mathcal{O})$, $p \ge 1$, is the usual space of $p$-integrable functions with norm denoted by $|\cdot|_p$. The scalar product in $L^2(\mathcal{O})$ and the duality induced by the pivot space $L^2(\mathcal{O})$ will be denoted by $\langle \cdot, \cdot \rangle_2$. $H^k(\mathcal{O}) \subset L^2(\mathcal{O})$, $k = 1, 2$, are the standard Sobolev spaces on $\mathcal{O}$, while $H_0^1(\mathcal{O})$ is the subspace of $H^1(\mathcal{O})$ with zero trace on the boundary. For $p, q \in [1, +\infty]$ by $L^q_W((0, T); L^p(\Omega; H))$ ($H$ a Hilbert space) we shall denote the space of all $q$-integrable processes $u : [0, T] \to L^p(\Omega; H)$ which are adapted to the filtration $\{\mathcal{F}_t\}_{t \ge 0}$. By $C_W([0, T]; L^2(\Omega; H))$ we shall denote the space of all $H$-valued adapted processes which are mean square continuous. $L(H)$ denotes the space of bounded linear operators equipped with the usual norm.

In the following by $H$ we shall denote the distribution space $H = H^{-1}(\mathcal{O}) = (H_0^1(\mathcal{O}))'$ endowed with the scalar product and norm defined by

$$\langle u, v \rangle = \int_{\mathcal{O}} A^{-1} u(\xi)\, v(\xi)\,d\xi, \qquad |u|_{-1} = \langle u, u \rangle^{1/2},$$

where $A = -\Delta$ with $D(A) = H^2(\mathcal{O}) \cap H_0^1(\mathcal{O})$. In terms of $A$ Eq. (1.3) can be formally rewritten as

$$\begin{cases} dX(t) + A\Psi(X(t))\,dt \ni \sigma(X(t))\,dW(t), \\ X(0, x) = x. \end{cases} \tag{1.10}$$

Its exact meaning will be made precise later (see Definition 2.1 below). It should be recalled, however, that the operator $x \mapsto A\Psi(x)$ with the domain $\{x \in L^1(\mathcal{O}) \cap H^{-1}(\mathcal{O}) : \text{there is } \eta \in H_0^1(\mathcal{O}),\ \eta \in \Psi(x) \text{ a.e. in } \mathcal{O}\}$ is maximal monotone in $H := H^{-1}(\mathcal{O})$ (see e.g. [3]) and so the distribution space $H$ offers the natural functional setting for the porous media equation (1.3) or its abstract form (1.10). However, the general existence theory of infinite dimensional stochastic equations in Hilbert space with nonlinear maximal monotone operators (see [12,21]) is not applicable in the present case and so a direct approach must be used. Finally, in this paper we use the same letter $C$ for several different positive constants arising in chains of estimates.

2. Existence, Uniqueness and Positivity

Definition 2.1. Let $x \in H$. An $H$-valued continuous $\mathcal{F}_t$-adapted process $X = X(t, x)$ is called a solution to (1.3) (equivalently (1.10)) on $[0, T]$ if

$$X \in L^p(\Omega \times (0, T) \times \mathcal{O}) \cap L^2(0, T; L^2(\Omega, H)), \quad p \ge m,$$
and there exists $\eta \in L^{p/m}(\Omega \times (0, T) \times \mathcal{O})$ such that $P$-a.s.

$$\langle X(t, x), e_j \rangle_2 = \langle x, e_j \rangle_2 + \int_0^t \int_{\mathcal{O}} \eta(s, \xi)\, \Delta e_j(\xi)\,d\xi\,ds + \sum_{k=1}^{\infty} \mu_k \int_0^t \langle X(s, x) e_k, e_j \rangle_2\,d\beta_k(s), \quad \forall\, j \in \mathbb{N},\ t \in [0, T], \tag{2.1}$$

$$\eta \in \Psi(X) \quad \text{a.e. in } \Omega \times (0, T) \times \mathcal{O}. \tag{2.2}$$

Below for simplicity we often write $X(t)$ instead of $X(t, x)$. From the stochastic point of view the solution $X$ given by Definition 2.1 is a strong one, but from the PDE point of view it is a solution in the sense of distributions since the boundary condition $\Psi(X) \ni 0$ on $\partial\mathcal{O}$ is satisfied in a weak sense only. Theorem 2.2 below is the main existence result.

Theorem 2.2. Assume that $d = 1, 2, 3$ and that Hypothesis 1.1 holds. Then for each $x \in L^p(\mathcal{O})$, $p \ge \max\{2m, 4\}$, there is a unique solution $X \in L^\infty_W(0, T; L^p(\Omega; \mathcal{O}))$ to (1.3). Moreover, if $x$ is nonnegative a.e. in $\mathcal{O}$ then $P$-a.s.

$$X(t, x)(\xi) \ge 0, \quad \text{for a.e. } (t, \xi) \in (0, \infty) \times \mathcal{O}.$$

As mentioned earlier, Theorem 2.2 was proven in [6] for a differentiable $\Psi$ satisfying conditions (1.4) and for $p \ge \max\{m + 1, 4\}$. It should be said, however, that in contrast with what happens for coercive functions $\Psi$ arising in [6], here it seems no longer possible to extend the existence result to all $x \in H^{-1}(\mathcal{O})$, $x \ge 0$.

3. Proof of Theorem 2.2

We shall consider the approximating equation

$$dX_\lambda(t) + A(\Psi_\lambda(X_\lambda(t)) + \lambda X_\lambda(t))\,dt = \sigma(X_\lambda(t))\,dW(t), \qquad X_\lambda(0, x) = x, \tag{3.1}$$

where $\lambda > 0$ and

$$\Psi_\lambda(x) = \frac{1}{\lambda}\big(x - (1 + \lambda\Psi)^{-1}(x)\big) \in \Psi\big((1 + \lambda\Psi)^{-1}(x)\big)$$

is the Yosida approximation of $\Psi$. We recall that $\Psi_\lambda$ is Lipschitzian and monotonically increasing and so $x \mapsto \Psi_\lambda(x) + \lambda x$ is strictly monotonically increasing and bounded by $C_1(1 + |x|^m)$ and $(\Psi_\lambda(x) + \lambda x)x \ge \lambda |x|^2$ for all $x \in \mathbb{R}$. By [6, Theorem 2.2] (applied with $m = 1$), for each $x \in H^{-1}(\mathcal{O})$ Eq. (3.1) has a unique solution

$$X_\lambda \in L^2(\Omega \times (0, T) \times \mathcal{O}) \cap L^2_W(\Omega, C([0, T]; H))$$

in the sense of Definition 2.1. Here as usual $C([0, T]; H)$ is equipped with the supremum norm. Moreover, (see e.g. [21, Theorem 4.2.5]) the following Itô formula holds

$$E|X_\lambda(t)|_{-1}^2 + 2E\int_0^t \int_{\mathcal{O}} (\Psi_\lambda(X_\lambda(s)) + \lambda X_\lambda(s)) X_\lambda(s)\,d\xi\,ds = |x|_{-1}^2 + \sum_{k=1}^{\infty} \mu_k^2\, E\int_0^t |X_\lambda(s) e_k|_{-1}^2\,ds. \tag{3.2}$$
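We note that since

$$|X_\lambda e_k|_{-1} \le C|e_k|_{H^2(\mathcal{O})} |X_\lambda|_{-1} \le C\lambda_k |X_\lambda|_{-1}$$

(cf. (1.5)) we have by Hypothesis 1.1(iii) (cf. (1.6))

$$\sum_{k=1}^{\infty} \mu_k^2\, E\int_0^t |X_\lambda(s) e_k|_{-1}^2\,ds \le C E\int_0^t |X_\lambda(s)|_{-1}^2\,ds. \tag{3.3}$$

Lemma 3.1. There exists a constant $C > 0$ such that for all $p \ge 2$ and all $x \in L^p(\mathcal{O})$,

$$\operatorname*{ess\,sup}_{t \in [0, T]} E|X_\lambda(t, x)|_p^p \le \exp\Big(C\, \frac{p - 1}{2}\Big) |x|_p^p, \quad \forall\, \lambda > 0. \tag{3.4}$$

Proof. We know from [6, Lemma 3.4] (with $m = 1$) that as $\varepsilon \to 0$,

$$\begin{aligned}
& X_\lambda^\varepsilon \to X_\lambda \ \text{strongly in } L^\infty_W(0, T; L^2(\Omega; H)), \\
& X_\lambda^\varepsilon \to X_\lambda \ \text{in the weak}^* \text{ topology in } L^\infty_W(0, T; L^p(\Omega; L^p(\mathcal{O}))),
\end{aligned} \tag{3.5}$$

where $X_\lambda^\varepsilon$ is the solution to the approximating equation

$$dX_\lambda^\varepsilon(t) + (A_\lambda)_\varepsilon X_\lambda^\varepsilon(t)\,dt = \sigma(X_\lambda^\varepsilon(t))\,dW(t), \quad t \ge 0, \qquad X_\lambda^\varepsilon(0) = x, \tag{3.6}$$

where

$$A_\lambda x = A(\Psi_\lambda(x) + \lambda x) = -\Delta(\Psi_\lambda(x) + \lambda x), \qquad D(A_\lambda) = \{x \in H \cap L^1(\mathcal{O}) : \Psi_\lambda(x) + \lambda x \in H_0^1(\mathcal{O})\},$$

and $(A_\lambda)_\varepsilon$ is the Yosida approximation of $A_\lambda$,

$$(A_\lambda)_\varepsilon = \frac{1}{\varepsilon}\big(I - (I + \varepsilon A_\lambda)^{-1}\big), \quad \varepsilon > 0.$$

Furthermore, by [6, Lemma 3.2] we have that $X_\lambda^\varepsilon \in L^2(\Omega; C([0, T]; L^2(\mathcal{O})))$. As a matter of fact the results of [6] were proven for smooth nonlinear functions $\Psi$ while $\Psi_\lambda$ is only Lipschitz, but the extension to Lipschitzian functions $\Psi$ satisfying (1.4) is immediate. In fact, one might take a smoother approximation of $\Psi$, for instance the mollifier $\Psi_\lambda * \rho_\lambda$ ($\rho_\lambda(r) = \frac{1}{\lambda}\rho(r/\lambda)$, $\rho \in C_0^\infty(\mathbb{R})$, $\rho \ge 0$, $\int \rho\,dr = 1$) which still remains monotonically increasing and has all properties of $\Psi_\lambda$.

Next we apply Itô's formula to (3.6) for the function $\varphi(x) = \frac{1}{p}|x|_p^p$. More precisely, we first apply Itô's formula to $\varphi_\gamma(x) = \frac{1}{p}|(1 + \gamma A)^{-1} x|_p^p$, $\gamma > 0$, and then we let $\gamma \to 0$. We have (for details see the proof in [6, Lemma 3.5]),

$$\begin{aligned}
E\varphi(X_\lambda^\varepsilon(t)) &+ E\int_0^t \big\langle (A_\lambda)_\varepsilon X_\lambda^\varepsilon(s), |X_\lambda^\varepsilon(s)|^{p-2} X_\lambda^\varepsilon(s) \big\rangle_2\,ds \\
&= \varphi(x) + \frac{p - 1}{2} \sum_{k=1}^{\infty} \mu_k^2\, E\int_0^t \int_{\mathcal{O}} |X_\lambda^\varepsilon(s)|^{p-2} |X_\lambda^\varepsilon(s) e_k|^2\,d\xi\,ds \\
&\le \varphi(x) + \frac{p - 1}{2}\, C E\int_0^t \int_{\mathcal{O}} |X_\lambda^\varepsilon(s)|^p\,d\xi\,ds,
\end{aligned} \tag{3.7}$$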
since by Sobolev embedding |ek |∞ ≤ Cλk for all k ∈ N. If Yλε is the solution to the equation Yλε − ε(λ (Yλε ) + λYλε ) = X λε , λ (Yλε ) + λYλε ∈ H01 (O), then (see [6, (3.25)]) |Yλε | p ≤ |X λε | p and therefore (Aλ )ε X λε , |X λε | p−2 X λε 2 =
1 X λε − Yλε , |X λε | p−2 X λε 2 ≥ 0. ε
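Then by (3.7) it follows, via Gronwall's lemma, that
$$E|X_\lambda^\varepsilon(t)|_p^p \le |x|_p^p \exp\Bigl(C\,\frac{p-1}{2}\Bigr),$$
where $C$ is independent of $x$, $\lambda$ and $t$. Now one obtains (3.4) by letting $\varepsilon$ tend to 0 and taking into account (3.5).

From now on let us assume that $p \ge \max\{4, 2m\}$ and $x \in L^p(\mathcal{O})$. From Lemma 3.1 it follows that for a subsequence $\{\lambda\} \to 0$ we have
$$\begin{cases}
X_\lambda \to X & \text{weakly in } L^p(\Omega \times (0,T) \times \mathcal{O}),\\
& \text{and weak}^* \text{ in } L^\infty(0,T; L^p(\Omega; L^p(\mathcal{O}))),\\
\Psi_\lambda(X_\lambda) \to \eta & \text{weakly in } L^{p/m}(\Omega \times (0,T) \times \mathcal{O}),
\end{cases} \tag{3.8}$$
in particular weakly in $L^2(\Omega \times (0,T) \times \mathcal{O})$, because by Hypothesis 1.1(ii), $|\Psi_\lambda(x)| \le |\Psi^0(x)| \le C(1 + |x|^m)$, $\forall\, x \in \mathbb{R}$. ($\Psi^0$ is the minimal section of $\Psi$.) By (3.4) we have for $\lambda \to 0$,
$$\lambda X_\lambda \to 0 \quad \text{strongly in } L^p(\Omega \times (0,T) \times \mathcal{O}). \tag{3.9}$$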
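Clearly $X$ and $\eta$ are adapted processes. On the other hand, we have
$$d(X_\lambda(t) - X_\mu(t)) - \Delta\bigl(\Psi_\lambda(X_\lambda(t)) - \Psi_\mu(X_\mu(t)) + \lambda X_\lambda(t) - \mu X_\mu(t)\bigr)dt = \bigl(\sigma(X_\lambda(t)) - \sigma(X_\mu(t))\bigr)dW(t),$$
and therefore, once again applying Itô's formula (cf. (3.2)), we obtain for $\alpha > 0$, $t \in [0,T]$,
$$\begin{aligned}
&\frac12 |X_\lambda(t) - X_\mu(t)|_{-1}^2 e^{-\alpha t} \\
&\quad + \int_0^t\!\!\int_{\mathcal{O}} \Bigl[\bigl(\Psi_\lambda(X_\lambda(s)) - \Psi_\mu(X_\mu(s))\bigr)\bigl(\lambda\Psi_\lambda(X_\lambda(s)) - \mu\Psi_\mu(X_\mu(s))\bigr) \\
&\qquad\qquad + \bigl(\lambda X_\lambda(s) - \mu X_\mu(s)\bigr)\bigl(X_\lambda(s) - X_\mu(s)\bigr)\Bigr] e^{-\alpha s}\, d\xi\, ds \\
&\quad \le \Bigl(\frac{C}{2}\sum_{k=1}^\infty \mu_k^2\lambda_k^2 - \alpha\Bigr)\int_0^t |X_\lambda(s) - X_\mu(s)|_{-1}^2 e^{-\alpha s}\, ds + M_{\lambda,\mu}(t), \quad \forall\, \lambda, \mu > 0,
\end{aligned} \tag{3.10}$$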
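where
$$M_{\lambda,\mu}(t) := \int_0^t e^{-\alpha s}\bigl\langle X_\lambda(s) - X_\mu(s),\, \sigma(X_\lambda(s) - X_\mu(s))\,dW(s)\bigr\rangle_2$$
is a real-valued local martingale. To derive (3.10) we used that $x = \lambda\Psi_\lambda(x) + (1 + \lambda\Psi)^{-1}(x)$, and thus for all $x, y \in \mathbb{R}$,
$$\begin{aligned}
(\Psi_\lambda(x) - \Psi_\mu(y))(x - y) &= [\Psi_\lambda(x) - \Psi_\mu(y)]\,[(1 + \lambda\Psi)^{-1}(x) - (1 + \mu\Psi)^{-1}(y)] \\
&\quad + [\Psi_\lambda(x) - \Psi_\mu(y)]\,[\lambda\Psi_\lambda(x) - \mu\Psi_\mu(y)],
\end{aligned}$$
and that the first summand on the right-hand side is nonnegative because $\Psi$ is monotonically increasing and $\Psi_\lambda(x) \in \Psi((1 + \lambda\Psi)^{-1}(x))$. Hence for $\alpha > 0$ large enough we obtain for all $\lambda, \mu \in (0,1)$ and $t \in [0,T]$,
$$\begin{aligned}
\frac12 |X_\lambda(t) - X_\mu(t)|_{-1}^2 e^{-\alpha t} \le{}& C\max\{\lambda,\mu\}\int_0^t\!\!\int_{\mathcal{O}} \bigl(|\Psi_\lambda(X_\lambda(s))|^2 + |X_\lambda(s)|^2 + |\Psi_\mu(X_\mu(s))|^2 + |X_\mu(s)|^2\bigr) e^{-\alpha s}\, d\xi\, ds \\
& + M_{\lambda,\mu}(t).
\end{aligned} \tag{3.11}$$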
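The decomposition used to derive (3.10) can be checked numerically in a scalar toy case. The following is a minimal sketch, assuming the hypothetical graph $\Psi(r) = \rho\,\mathrm{sign}\, r + \delta r$ with illustrative values of $\rho$ and $\delta$ (the argument above only requires Hypothesis 1.1); the function names `resolvent`, `yosida` and `split_terms` are ours and not part of the paper.

```python
import math

def resolvent(x, lam, rho=1.0, delta=0.5):
    """(1 + lam*Psi)^{-1}(x) for the assumed graph Psi(r) = rho*sign(r) + delta*r.

    Solves y + lam*(rho*s + delta*y) = x with s in sign(y), sign(0) = [-1, 1].
    """
    if abs(x) <= lam * rho:
        return 0.0
    return (x - lam * rho * math.copysign(1.0, x)) / (1.0 + lam * delta)

def yosida(x, lam, rho=1.0, delta=0.5):
    """Yosida approximation Psi_lam(x) = (x - (1 + lam*Psi)^{-1}(x)) / lam."""
    return (x - resolvent(x, lam, rho, delta)) / lam

def split_terms(x, y, lam, mu):
    """Return the two summands of (Psi_lam(x) - Psi_mu(y)) * (x - y)."""
    a = yosida(x, lam) - yosida(y, mu)
    first = a * (resolvent(x, lam) - resolvent(y, mu))          # monotonicity term
    second = a * (lam * yosida(x, lam) - mu * yosida(y, mu))    # remainder term
    return first, second

if __name__ == "__main__":
    for x, y in [(-2.0, 0.3), (0.05, 1.5), (0.0, -0.3)]:
        first, second = split_terms(x, y, lam=0.1, mu=0.7)
        print(x, y, first, second)
```

In this sketch the first summand is nonnegative for every pair of inputs, which is the property that allows it to be dropped in the estimate leading to (3.11).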
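Hence by the Burkholder-Davis-Gundy inequality (for $p = 1$) we get for all $\lambda, \mu \in (0,1)$, $r \in [0,T]$,
$$\begin{aligned}
\frac12 E\sup_{t\in[0,r]} |X_\lambda(t) - X_\mu(t)|_{-1}^2 e^{-\alpha t} \le{}& C\max\{\lambda,\mu\}\, E\int_0^r\!\!\int_{\mathcal{O}} \bigl(|\Psi_\lambda(X_\lambda(s))|^2 + |X_\lambda(s)|^2 + |\Psi_\mu(X_\mu(s))|^2 + |X_\mu(s)|^2\bigr) e^{-\alpha s}\, d\xi\, ds \\
& + C E\Bigl(\int_0^r |X_\lambda(s) - X_\mu(s)|_{-1}^4 e^{-2\alpha s}\, ds\Bigr)^{1/2}.
\end{aligned} \tag{3.12}$$
But
$$\begin{aligned}
E\Bigl(\int_0^r |X_\lambda(s) - X_\mu(s)|_{-1}^4 e^{-2\alpha s}\, ds\Bigr)^{1/2}
&\le E\Bigl[\sup_{s\in[0,r]} |X_\lambda(s) - X_\mu(s)|_{-1} e^{-\frac{\alpha}{2}s}\Bigl(\int_0^r |X_\lambda(s) - X_\mu(s)|_{-1}^2 e^{-\alpha s}\, ds\Bigr)^{1/2}\Bigr] \\
&\le \frac14 E\sup_{s\in[0,r]} |X_\lambda(s) - X_\mu(s)|_{-1}^2 e^{-\alpha s} + C E\int_0^r |X_\lambda(s) - X_\mu(s)|_{-1}^2 e^{-\alpha s}\, ds.
\end{aligned} \tag{3.13}$$
Taking into account that by Hypothesis 1.1(ii), $|\Psi_\lambda(X_\lambda)| \le C(1 + |X_\lambda|^m)$, $\forall\, \lambda > 0$,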
and that by (3.4) $\{X_\lambda\}$ is bounded in $L^p(\Omega \times (0,T) \times \mathcal{O})$ for $p \ge \max\{4, 2m\}$, we infer by (3.12), (3.13) and Gronwall's lemma that $\{X_\lambda\}$ is a Cauchy net in $L^2(\Omega; C([0,T]; H))$. Hence for $\lambda \to 0$,
$$X_\lambda \to X \quad \text{in } L^2(\Omega; C([0,T]; H)). \tag{3.14}$$
In order to complete the proof of the existence part of Theorem 2.2 it suffices to show that
$$\eta(\omega, t, \xi) \in \Psi(X(\omega, t, \xi)) \quad \text{a.e. in } \Omega \times (0,T) \times \mathcal{O}. \tag{3.15}$$
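Since the operator
$$L^p(\Omega \times (0,T) \times \mathcal{O}) \to L^{p/m}(\Omega \times (0,T) \times \mathcal{O}) \subset L^{\frac{p}{p-1}}(\Omega \times (0,T) \times \mathcal{O}), \quad X \mapsto \Psi(X),$$
in the duality pair
$$\bigl(L^p(\Omega \times (0,T) \times \mathcal{O}),\ L^{p'}(\Omega \times (0,T) \times \mathcal{O}) = L^{\frac{p}{p-1}}(\Omega \times (0,T) \times \mathcal{O})\bigr),$$
is maximal monotone, it suffices to show that (see e.g. [3])
$$\liminf_{\lambda\to 0} E\int_0^T\!\!\int_{\mathcal{O}} \Psi_\lambda(X_\lambda) X_\lambda\, d\xi\, dt \le E\int_0^T\!\!\int_{\mathcal{O}} \eta X\, d\xi\, dt. \tag{3.16}$$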
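To prove (3.16) we first note that by (3.2) we have
$$\liminf_{\lambda\to 0} E\int_0^t\!\!\int_{\mathcal{O}} \Psi_\lambda(X_\lambda) X_\lambda\, d\xi\, ds + \frac12 E|X(t)|_{-1}^2 = \frac12 |x|_{-1}^2 + \frac12\sum_{k=1}^\infty \mu_k^2\, E\int_0^t |X(s)e_k|_{-1}^2\, ds, \tag{3.17}$$
because by (1.5), $|(X_\lambda - X)e_k|_{-1} \le C\lambda_k |X_\lambda - X|_{-1}$ and so by Hypothesis 1.1(iii),
$$\lim_{\lambda\to 0}\sum_{k=1}^\infty \mu_k^2\, E\int_0^t |X_\lambda(s)e_k|_{-1}^2\, ds = \sum_{k=1}^\infty \mu_k^2\, E\int_0^t |X(s)e_k|_{-1}^2\, ds.$$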
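Next letting $\lambda$ tend to zero in (3.1) and using (3.8) we see that $P$-a.s., for all $t \in [0,T]$,
$$\langle X(t), e_j\rangle_2 = \langle x, e_j\rangle_2 + \int_0^t \langle \eta(s), \Delta e_j\rangle_2\, ds + \sum_{k=1}^\infty \mu_k\int_0^t \langle X(s)e_k, e_j\rangle_2\, d\beta_k(s). \tag{3.18}$$
Note that by continuity the $P$-zero set does not depend on $t \in [0,T]$, since
$$\sum_{k=1}^\infty \mu_k\int_0^t \langle X(s)e_k, e_j\rangle_2\, d\beta_k(s) = \int_0^t \langle e_j, \sigma(X(s))\,dW(s)\rangle_2.$$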
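In order to get (3.18) we have used the fact that by (3.14) we have
$$\begin{aligned}
E\Bigl|\int_0^t \langle X_\lambda(s)e_k, e_j\rangle_2\, d\beta_k(s) - \int_0^t \langle X(s)e_k, e_j\rangle_2\, d\beta_k(s)\Bigr|^2
&= E\int_0^t \langle (X_\lambda(s) - X(s))e_k, e_j\rangle_2^2\, ds \\
&\le C\lambda_j^2\lambda_k^2\, T\, |X_\lambda - X|^2_{L^2(\Omega; C([0,T];H))},
\end{aligned}$$
and therefore
$$\lim_{\lambda\to 0}\sum_{k=1}^\infty \mu_k\int_0^t \langle X_\lambda(s)e_k, e_j\rangle_2\, d\beta_k(s) = \sum_{k=1}^\infty \mu_k\int_0^t \langle X(s)e_k, e_j\rangle_2\, d\beta_k(s).$$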
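Therefore (3.18) follows and this yields, via Itô's formula (applied to $\langle X(t), e_j\rangle_2^2$, $t \in [0,T]$) and summation over $j$, that
$$\frac12 E|X(t)|_{-1}^2 + E\int_0^t\!\!\int_{\mathcal{O}} \eta X\, d\xi\, ds = \frac12 E|x|_{-1}^2 + \frac12\sum_{k=1}^\infty \mu_k^2\, E\int_0^t |X(s)e_k|_{-1}^2\, ds, \quad \forall\, t \in [0,T]. \tag{3.19}$$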
Comparing (3.17) and (3.19) we get (3.16). Hence $X$ is a solution to (1.3) as claimed.

To prove uniqueness we take two solutions $X^{(1)}$ and $X^{(2)}$ with corresponding $\eta^{(1)}$ and $\eta^{(2)}$. Repeating the argument above we obtain
$$\begin{aligned}
&\frac12 E|X^{(1)}(t) - X^{(2)}(t)|_{-1}^2 + E\int_0^t\!\!\int_{\mathcal{O}} \bigl(\eta^{(1)}(s) - \eta^{(2)}(s)\bigr)\bigl(X^{(1)}(s) - X^{(2)}(s)\bigr)\, d\xi\, ds \\
&\quad = \frac12\sum_{k=1}^\infty \mu_k^2\, E\int_0^t |(X^{(1)}(s) - X^{(2)}(s))e_k|_{-1}^2\, ds, \quad \forall\, t \in [0,T].
\end{aligned}$$
Since the second term on the left is nonnegative (because $\Psi$ is monotone), by (1.5) and Hypothesis 1.1(iii) this implies $X^{(1)} = X^{(2)}$ by Gronwall's lemma. Finally, if $x \ge 0$ a.e. in $\mathcal{O}$ we know by [6, Theorem 2.2] that $X_\lambda \ge 0$ $P$-a.s. and so by (3.14) it follows that $X \ge 0$ a.e. in $\Omega \times (0,T) \times \mathcal{O}$ as desired. This completes the proof of Theorem 2.2.

Remark 3.2. Theorem 2.2 extends to any dimension $d \ge 1$ if one modifies condition (iii) in Hypothesis 1.1 as in [6, Condition 4.1], i.e., one assumes
$$\sum_{k=1}^\infty \mu_k^2\bigl(|e_k|_\infty + \lambda_k |e_k|_{L^{\frac{4d}{d+6}}(\mathcal{O})}\bigr)^2 < +\infty.$$
Remark 3.3. The existence part of Theorem 2.2 remains true for stochastic porous media equations with additive noise, i.e.
$$dX - \Delta\Psi(X)\,dt = Q\,dW(t),$$
where $\Psi$ satisfies Hypothesis 1.1 and
$$Q\,dW(t) = \sum_{k=1}^\infty \mu_k e_k\, d\beta_k(t) \quad \text{with} \quad \sum_{k=1}^\infty \lambda_k^{-1}\mu_k^2 < +\infty.$$
The proof is exactly the same and so it will be omitted.

Proposition 3.4. Let $X_\lambda$, $\lambda \in (0,1)$, be as above, $x \in L^4(\mathcal{O})$. Assume that $\Psi$ satisfies Hypothesis 1.1 with $m = 1$ and for some $\delta > 0$,
$$(\tilde{x} - \tilde{y})(x - y) \ge \delta(x - y)^2, \quad \forall\, (x, \tilde{x}), (y, \tilde{y}) \in \Psi. \tag{3.20}$$
Then $X_\lambda, X \in L_W^2(0,T; L^2(\Omega, H_0^1(\mathcal{O})))$ and
$$\lim_{\lambda\to 0} E|X_\lambda - X|^2_{L^2(0,T; L^2(\mathcal{O}))} = 0. \tag{3.21}$$
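Proof. A simple calculation reveals that
$$(\Psi_\lambda(x) - \Psi_\lambda(y))(x - y) \ge \frac{\delta}{2}|x - y|^2, \quad \forall\, x, y \in \mathbb{R},$$
for $\lambda$ sufficiently small. Then $\tilde{\Psi}_\lambda$ defined by $\tilde{\Psi}_\lambda(r) := \Psi_\lambda(r) - \frac{\delta}{2} r$, $r \in \mathbb{R}$, is increasing and so by Itô's formula we have
$$E|X_\lambda(t)|_2^2 + \frac{\delta}{2} E\int_0^t |X_\lambda(s)|^2_{H_0^1(\mathcal{O})}\, ds \le C. \tag{3.22}$$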
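As a matter of fact, we shall apply Itô's formula not directly to Eq. (3.1) but to Eq. (3.6) (cf. the proof of Lemma 3.1 to obtain (3.7)). Thus we get
$$\frac12 E|X_\lambda^\varepsilon(t)|_2^2 + E\int_0^t \langle (A_\lambda)_\varepsilon X_\lambda^\varepsilon(s), X_\lambda^\varepsilon(s)\rangle_2\, ds \le \frac12 |x|_2^2 + C E\int_0^t |X_\lambda^\varepsilon(s)|_2^2\, ds.$$
Next we have
$$\langle (A_\lambda)_\varepsilon X_\lambda^\varepsilon, X_\lambda^\varepsilon\rangle_2 = \langle A_\lambda(1 + \varepsilon A_\lambda)^{-1} X_\lambda^\varepsilon, (1 + \varepsilon A_\lambda)^{-1} X_\lambda^\varepsilon\rangle_2 + \varepsilon |(A_\lambda)_\varepsilon X_\lambda^\varepsilon|_2^2.$$
Taking into account that $A_\lambda = -\Delta(\Psi_\lambda + \lambda I)$ and that $r \mapsto \Psi_\lambda(r) - \delta r/2$ is monotonically increasing we get
$$\langle (A_\lambda)_\varepsilon X_\lambda^\varepsilon, X_\lambda^\varepsilon\rangle_2 \ge \frac{\delta}{2}\int_{\mathcal{O}} |\nabla(1 + \varepsilon A_\lambda)^{-1} X_\lambda^\varepsilon|^2\, d\xi + \varepsilon |(A_\lambda)_\varepsilon X_\lambda^\varepsilon|_2^2.$$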
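Hence
$$E\int_0^t |(1 + \varepsilon A_\lambda)^{-1} X_\lambda^\varepsilon(s)|^2_{H_0^1(\mathcal{O})}\, ds \le C$$
and letting $\varepsilon \to 0$ we get (3.22) and the first assertion (taking also into account (3.5)). To prove the second part we note that
$$d(X_\lambda - X_\mu) - \Delta\Bigl[\tilde{\Psi}_\lambda(X_\lambda) - \tilde{\Psi}_\mu(X_\mu) + \lambda X_\lambda - \mu X_\mu + \frac{\delta}{2}(X_\lambda - X_\mu)\Bigr]dt = (\sigma(X_\lambda) - \sigma(X_\mu))\,dW.$$
Hence exactly the same arguments used to derive (3.11) lead to
$$\begin{aligned}
&\frac12 |X_\lambda(t) - X_\mu(t)|_{-1}^2 e^{-\alpha t} + \frac{\delta}{2}\int_0^t |X_\lambda(s) - X_\mu(s)|_2^2 e^{-\alpha s}\, ds \\
&\quad \le C\max\{\lambda,\mu\}\int_0^t \bigl(|\Psi_\lambda(X_\lambda(s))|_2^2 + |\Psi_\mu(X_\mu(s))|_2^2 + |X_\lambda(s)|_2^2 + |X_\mu(s)|_2^2\bigr) e^{-\alpha s}\, ds + M_{\lambda,\mu}(t),
\end{aligned}$$
for $\alpha$ large enough and $\lambda, \mu \in (0,1)$, $t \in [0,T]$. Since $m = 1$, we have $|\Psi_\lambda(x)| \le C(1 + |x|)$ for all $x \in \mathbb{R}$, $\lambda \in (0,1)$, hence taking the expectation we get
$$\frac{\delta}{2} E\int_0^t |X_\lambda(s) - X_\mu(s)|_2^2\, ds \le C\max\{\lambda,\mu\}\, E\int_0^t \bigl(|X_\lambda(s)|_2^2 + |X_\mu(s)|_2^2\bigr)\, ds.$$
By Lemma 3.1 with $p = 2$ and (3.8) this implies (3.21).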
Besides Hypothesis 1.1, we shall now assume the following:

(iv) $\Psi(r) = \rho\,\mathrm{sign}\, r + \tilde{\Psi}(r)$, for $r \in \mathbb{R}$, where $\rho > 0$, $\tilde{\Psi} : \mathbb{R} \to \mathbb{R}$ is Lipschitz, $\tilde{\Psi} \in C^1(\mathbb{R} \setminus \{0\})$ and for some $\delta > 0$ it satisfies $\tilde{\Psi}'(r) \ge \delta$ for all $r \in \mathbb{R} \setminus \{0\}$.

Here the signum is defined by (1.8). Below we shall use an approximation to $\Psi$ which is slightly different from $\Psi_\lambda$ defined before. Namely, below we consider
$$\Psi_\lambda(r) := \rho\,(\mathrm{sign})_\lambda(r) + \tilde{\Psi}(r) + \lambda r, \quad r \in \mathbb{R},$$
where $(\mathrm{sign})_\lambda$ is the Yosida approximation of the sign, i.e.
$$(\mathrm{sign})_\lambda(r) := \begin{cases} 1 & \text{if } r > \lambda,\\ r/\lambda & \text{if } r \in [-\lambda, \lambda],\\ -1 & \text{if } r < -\lambda. \end{cases}$$
We shall use the symbol $\Psi_\lambda$ also for this approximation and denote also by $X_\lambda$ the corresponding solution of (3.1). This approximation in the special case of condition (iv) is much more convenient. We emphasize that all previous results remain true for this modified approximation. The proofs are the same and some parts even simplify. We therefore shall use all previous results for $\Psi_\lambda$ and $X_\lambda$ as above without further notice. The following technical result will be used in Sect. 4 (cf. Lemma 4.1) in a crucial way.
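The modified approximation is simple enough to write down directly. The following Python sketch is illustrative only: the choice $\tilde{\Psi}(r) = r/2$ and the parameter values are assumptions made for the example, and the names `sign_yosida` and `psi_lambda` are ours.

```python
def sign_yosida(r, lam):
    """Yosida approximation (sign)_lam of the sign graph: clipped linear ramp."""
    if r > lam:
        return 1.0
    if r < -lam:
        return -1.0
    return r / lam

def psi_lambda(r, lam, rho=1.0, psi_tilde=lambda s: 0.5 * s):
    """Psi_lam(r) = rho*(sign)_lam(r) + psi_tilde(r) + lam*r.

    psi_tilde is an assumed Lipschitz, increasing function standing in for
    the paper's unspecified tilde-Psi.
    """
    return rho * sign_yosida(r, lam) + psi_tilde(r) + lam * r

if __name__ == "__main__":
    lam = 0.5
    for r in (-2.0, -0.25, 0.0, 0.25, 2.0):
        print(r, sign_yosida(r, lam), psi_lambda(r, lam))
```

On $(-\lambda, \lambda)$ the ramp has slope $1/\lambda$ and outside it is constant, which is the source of the factor $1/\lambda$ in the gradient formulas used later in the section.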
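Proposition 3.5. The solutions $X_\lambda$ to (3.1) and $X$ to (1.3) satisfy all conditions of Proposition 3.4 and in addition
$$E\int_0^T\!\!\int_{\mathcal{O}} |\nabla(\mathrm{sign})_\lambda(X_\lambda)|^2\, d\xi\, dt \le C, \quad \forall\, \lambda > 0,$$
and consequently $\eta \in L_W^2(0,T; L^2(\Omega; H_0^1(\mathcal{O})))$.

Proof. We set
$$g_\lambda(r) := \int_0^r (\mathrm{sign})_\lambda(s)\, ds, \quad r \in \mathbb{R},$$
and choose $\varphi_\lambda \in C^2(\mathbb{R})$ such that

(i) $\varphi_\lambda(0) = 0$.
(ii) $\varphi_\lambda'(r) = r/\lambda$ for $|r| \le \lambda$, $\varphi_\lambda'(r) = 1 + \lambda$ for $r \ge 2\lambda$, $\varphi_\lambda'(r) = -1 - \lambda$ for $r \le -2\lambda$.
(iii) $0 \le \varphi_\lambda''(r) \le C/\lambda$ for all $r \in \mathbb{R}$.

It is easily seen that such a function exists and can be constructed simply by smoothing the function $(\mathrm{sign})_\lambda$. Let us denote the resulting function by $f_\lambda$. Then define
$$\varphi_\lambda(r) := \int_0^r f_\lambda(s)\, ds, \quad r \in \mathbb{R}.$$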
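As mentioned above, the arguments of the previous proofs extend to the present situation in order to prove that $\{X_\lambda\}$ is convergent to the solution $X$ to (1.3). Now we shall apply Itô's formula to Eq. (3.1) (or, more exactly, to (3.6) and then let $\varepsilon \to 0$ as in the proof of Proposition 3.4) with $\Psi_\lambda$ defined as above and to the function $\int_{\mathcal{O}} \varphi_\lambda(X_\lambda)\, d\xi$. Arguing as in the proof of Lemma 3.1 to obtain (3.7), we get (recall that $X_\lambda(t) \in H_0^1(\mathcal{O})$),
$$\begin{aligned}
&E\int_{\mathcal{O}} \varphi_\lambda(X_\lambda(t))\, d\xi - E\int_0^t \bigl\langle \Delta\bigl((\mathrm{sign})_\lambda(X_\lambda(s)) + \tilde{\Psi}(X_\lambda(s))\bigr),\, \varphi_\lambda'(X_\lambda(s))\bigr\rangle_2\, ds \\
&\quad \le \int_{\mathcal{O}} \varphi_\lambda(x)\, d\xi + C\sum_{k=1}^\infty \mu_k^2\, E\int_0^t\!\!\int_{\mathcal{O}} \varphi_\lambda''(X_\lambda(s))\, |X_\lambda(s)e_k|^2\, d\xi\, ds \\
&\quad \le \int_{\mathcal{O}} \varphi_\lambda(x)\, d\xi + 4\lambda C\sum_{k=1}^\infty \mu_k^2\lambda_k^2\, E\int_0^t\!\!\int_{\mathcal{O}} 1_\lambda(s,\xi)\, |e_k|^2\, d\xi\, ds,
\end{aligned}$$
where $1_\lambda$ is the characteristic function of the set $\{(s,\xi) : 0 \le |X_\lambda(s,\xi)| \le 2\lambda\}$. Concerning the first line we note that, since $\varphi_\lambda'$ and $\tilde{\Psi}$ are monotonically increasing while, as seen earlier, $X_\lambda(t) \in H_0^1(\mathcal{O})$, we have by the Green formula that
$$\langle \Delta\tilde{\Psi}(X_\lambda), \varphi_\lambda'(X_\lambda)\rangle_2 = -\int_{\mathcal{O}} \tilde{\Psi}'(X_\lambda)\,\varphi_\lambda''(X_\lambda)\, |\nabla X_\lambda|^2\, d\xi \le 0.$$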
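This yields
$$E\int_0^T\!\!\int_{\mathcal{O}} \langle \nabla(\mathrm{sign})_\lambda(X_\lambda), \nabla\varphi_\lambda'(X_\lambda)\rangle_2\, d\xi\, ds \le C, \quad \forall\, \lambda \in (0,1).$$
Taking into account that
$$-\langle \Delta(\mathrm{sign})_\lambda(X_\lambda), \varphi_\lambda'(X_\lambda)\rangle_2 = \langle \nabla(\mathrm{sign})_\lambda(X_\lambda), \nabla\varphi_\lambda'(X_\lambda)\rangle_2 \ge 0 \quad \text{a.e.}$$
and that $\nabla\varphi_\lambda'(X_\lambda) = \frac{1}{\lambda}\nabla X_\lambda$ on $\{(s,\xi) : |X_\lambda(s,\xi)| < \lambda\}$, we get
$$E\int_0^T\!\!\int_{\mathcal{O}} |\nabla(\mathrm{sign})_\lambda(X_\lambda)|^2\, d\xi\, ds \le C, \quad \forall\, \lambda \in (0,1),$$
because $\nabla(\mathrm{sign})_\lambda(X_\lambda) = \frac{1}{\lambda}\nabla X_\lambda$ if $|X_\lambda| < \lambda$ and $\nabla(\mathrm{sign})_\lambda(X_\lambda) = 0$ if $|X_\lambda| \ge \lambda$. Then we get the desired estimate and since also by (3.22),
$$E\int_0^T\!\!\int_{\mathcal{O}} |\nabla\tilde{\Psi}(X_\lambda)|^2\, d\xi\, ds \le C, \quad \forall\, \lambda \in (0,1),$$
and $(\mathrm{sign})_\lambda(X_\lambda) + \tilde{\Psi}(X_\lambda) \to \eta$ weakly in $L^2(\Omega \times (0,T) \times \mathcal{O})$ as $\lambda \to 0$, we infer that $\eta \in L_W^2(0,T; L^2(\Omega; H_0^1(\mathcal{O})))$ as claimed.

4. Extinction in Finite Time and Self-Organized Criticality

In this section we shall prove a finite extinction property for solutions of (1.3) in 1-D for a special density dependent diffusion coefficient function $\Psi$. However, Lemma 4.1 below can be proved without restriction on dimension. So, for the moment we remain in our general framework. For simplicity we choose the Wiener process
$$W(t) = \sum_{k=1}^N \mu_k e_k \beta_k(t), \quad t \ge 0, \tag{4.1}$$
where $N \in \mathbb{N}$. Besides Hypothesis 1.1, we shall assume Hypothesis (iv) (following the proof of Prop. 3.4), i.e.

(iv) $\Psi(r) = \rho\,\mathrm{sign}\, r + \tilde{\Psi}(r)$, for $r \in \mathbb{R}$, where $\rho > 0$, $\tilde{\Psi} : \mathbb{R} \to \mathbb{R}$ is Lipschitzian, $\tilde{\Psi} \in C^1(\mathbb{R} \setminus \{0\})$ and for some $\delta > 0$ it satisfies $\tilde{\Psi}'(r) \ge \delta$ for all $r \in \mathbb{R} \setminus \{0\}$.

Here the signum is defined by (1.8). Now let $\tau$ be the stopping time
$$\tau = \inf\{t \ge 0 : |X(t,x)|_{-1} = 0\},$$
where $X(t,x)$, $t \ge 0$, is the solution to (1.3) given by Theorem 2.2 for $x \in L^p(\mathcal{O})$, $p \ge \max\{4, 2m\}$.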
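Lemma 4.1. Under assumptions (i)-(iv) we have
$$X(t,x) = 0 \quad \text{for } t \ge \tau, \ P\text{-a.s.}$$

Proof. Set $A = -\Delta$, $D(A) = H^2(\mathcal{O}) \cap H_0^1(\mathcal{O})$. Define $\mu : [0,T] \times \Omega \to C_b^2(\mathcal{O}; \mathbb{R})$ by
$$\mu(t) := -\sum_{k=1}^N \mu_k e_k \beta_k(t), \quad t \in [0,T],$$
and $\tilde{\mu} : [0,T] \to C_b^2(\mathcal{O}; \mathbb{R})$ by
$$\tilde{\mu} := \sum_{k=1}^N \mu_k^2 e_k^2.$$
Define $Y(t) = e^{\mu(t)} X(t)$, $t \ge 0$. Let $D(A)$ be equipped with the graph norm of $A$ and let $D(A)'$ be its dual space, hence
$$D(A) \subset H_0^1(\mathcal{O}) \subset L^2(\mathcal{O}) \subset H^{-1}(\mathcal{O}) \subset D(A)'. \tag{4.2}$$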
It is easy to see that for all $\omega \in \Omega$, $t \in [0,T]$ the function $e^{\mu(t,\omega)}$ is a multiplier both in $D(A)$ and in $H$, hence $e^{\mu(t,\omega)} z \in D(A)'$ is well defined for all $z \in L^2(\mathcal{O})$ and $Y(t) \in H$.

Claim. We have
$$Y(t) = x + \int_0^t e^{\mu(s)} \Delta\eta(s)\, ds - \frac12\int_0^t \tilde{\mu}\, Y(s)\, ds, \quad t \in [0,T], \tag{4.3}$$
where the first integral on the right-hand side is a Bochner integral in $D(A)'$, the second by (3.8) is one in $L^p(\mathcal{O}) \subset L^2(\mathcal{O})$. In particular a posteriori the first integral is in $H$, continuous in $H$ as a function of $t \in [0,T]$, $P$-a.s.

Proof of the Claim. Let $\varphi \in D(A)$. As before we shall use $\langle\cdot,\cdot\rangle_2$ also for the extended dualizations with pivot space $L^2(\mathcal{O})$ as the ones in (4.2). Then for $t \in [0,T]$,
$$\langle \varphi, e^{\mu(t)} X(t)\rangle_2 = \sum_{j=1}^\infty \langle e_j, e^{\mu(t)}\varphi\rangle_2\, \langle e_j, X(t)\rangle_2.$$
Furthermore, we have by Itô's formula for all $\xi \in \mathcal{O}$,
$$e^{\mu(t,\xi)} = 1 + \int_0^t e^{\mu(s,\xi)}\, d\mu(s,\xi) + \frac12\int_0^t e^{\mu(s,\xi)}\, \tilde{\mu}(\xi)\, ds.$$
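Now fix $j \in \mathbb{N}$. Then by the stochastic Fubini Theorem
$$\langle e_j, e^{\mu(t)}\varphi\rangle_2 = \langle e_j, \varphi\rangle_2 - \sum_{k=1}^N \mu_k\int_0^t \langle e_j, e_k e^{\mu(s)}\varphi\rangle_2\, d\beta_k(s) + \frac12\int_0^t \langle e_j, \tilde{\mu}\, e^{\mu(s)}\varphi\rangle_2\, ds, \quad t \in [0,T].$$
By Itô's product rule and (3.18) we hence obtain
$$\begin{aligned}
\langle e_j, e^{\mu(t)}\varphi\rangle_2\, \langle e_j, X(t)\rangle_2
={}& \langle e_j, \varphi\rangle_2\, \langle e_j, x\rangle_2 + \int_0^t \langle e_j, e^{\mu(s)}\varphi\rangle_2\, \langle e_j, \Delta\eta(s)\rangle_2\, ds \\
&+ \sum_{k=1}^N \mu_k\int_0^t \langle e_j, e^{\mu(s)}\varphi\rangle_2\, \langle e_j, X(s)e_k\rangle_2\, d\beta_k(s) \\
&+ \frac12\int_0^t \langle e_j, X(s)\rangle_2\, \langle e_j, \tilde{\mu}\, e^{\mu(s)}\varphi\rangle_2\, ds \\
&- \sum_{k=1}^N \mu_k\int_0^t \langle e_j, X(s)\rangle_2\, \langle e_j, e_k e^{\mu(s)}\varphi\rangle_2\, d\beta_k(s) \\
&- \sum_{k=1}^N \mu_k^2\int_0^t \langle e_j, e_k e^{\mu(s)}\varphi\rangle_2\, \langle e_j, X(s)e_k\rangle_2\, ds.
\end{aligned}$$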
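After summing over $j \in \mathbb{N}$ the two stochastic terms cancel and the claim follows since $\varphi \in D(A)$ was arbitrary.

Below we work for $P$-a.s. $\omega \in \Omega$, $\omega$ fixed. Hence all constants $C$ appearing below may depend on $\omega$. Consider the solution $X_\lambda \in L_W^2(0,T; L^2(\Omega, H_0^1(\mathcal{O})))$ to Eq. (3.1). By Proposition 3.4 we have
$$\lim_{\lambda\to 0} E|X_\lambda - X|^2_{L^2(0,T; L^2(\mathcal{O}))} = 0$$
and $\Psi_\lambda(X_\lambda) \in L_W^2(0,T; L^2(\Omega, H_0^1(\mathcal{O})))$ because $\Psi_\lambda$ is Lipschitz. On the other hand, we have as in (4.3) for $Y_\lambda = e^{\mu} X_\lambda$,
$$\frac{d}{dt} Y_\lambda(t) = e^{\mu(t)}\Delta\eta_\lambda(t) - \frac12\tilde{\mu}\, Y_\lambda(t), \quad \forall\, t \ge 0, \tag{4.4}$$
where $\eta_\lambda(t) = \Psi_\lambda(X_\lambda(t)) \in H_0^1(\mathcal{O})$.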
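It follows by (3.21) that
$$\lim_{\lambda\to 0} E|Y_\lambda - Y|^2_{L^2(0,T; L^2(\mathcal{O}))} = 0, \tag{4.5}$$
and therefore for some sequence $\lambda_n \to 0$,
$$\lim_{n\to\infty} |Y_{\lambda_n} - Y|_{L^2(0,T; L^2(\mathcal{O}))} = 0 \quad \text{a.e. on } \Omega. \tag{4.6}$$
Below we simply write $\lambda$ instead of $\lambda_n$. Next we have by (4.4) that
$$\Bigl\langle \frac{dY_\lambda(t)}{dt}, Y_\lambda(t)\Bigr\rangle_2 = \langle \eta_\lambda(t), \Delta(e^{\mu(t)} Y_\lambda(t))\rangle_2 - \frac12\langle \tilde{\mu}\, Y_\lambda(t), Y_\lambda(t)\rangle_2 \quad \text{a.e. } t \in [0,T]. \tag{4.7}$$
Also we have (for simplicity we take $\rho = 1$)
$$\begin{aligned}
\langle \eta_\lambda(t), \Delta(e^{\mu(t)} Y_\lambda(t))\rangle_2
&= \bigl\langle (\mathrm{sign})_\lambda(e^{-\mu(t)} Y_\lambda(t)) + \tilde{\Psi}(e^{-\mu(t)} Y_\lambda(t)),\, \Delta(e^{\mu(t)} Y_\lambda(t))\bigr\rangle_2 \\
&= -\int_{\mathcal{O}} \bigl(\nabla(\mathrm{sign})_\lambda(e^{-\mu(t)} Y_\lambda(t)),\, \nabla(e^{\mu(t)} Y_\lambda(t))\bigr)\, d\xi \\
&\quad - \int_{\mathcal{O}} \tilde{\Psi}'(e^{-\mu(t)} Y_\lambda(t))\bigl(\nabla(e^{-\mu(t)} Y_\lambda(t)),\, \nabla(e^{\mu(t)} Y_\lambda(t))\bigr)\, d\xi \\
&= -\frac{1}{\lambda}\int_{\mathcal{O}} \bigl(|\nabla Y_\lambda(t)|^2 - |Y_\lambda(t)|^2 |\nabla\mu(t)|^2\bigr)\, 1_\lambda(t,\xi)\, d\xi \\
&\quad - \int_{\mathcal{O}} \tilde{\Psi}'(e^{-\mu(t)} Y_\lambda(t))\bigl(|\nabla Y_\lambda(t)|^2 - |Y_\lambda(t)|^2 |\nabla\mu(t)|^2\bigr)\, d\xi,
\end{aligned}$$
because for $y \in H_0^1(\mathcal{O})$,
$$\nabla(\mathrm{sign})_\lambda(y) = \begin{cases} 0 & \text{on } \{y \notin (-\lambda, \lambda)\},\\ \frac{1}{\lambda}\nabla y & \text{on } \{y \in (-\lambda, \lambda)\}. \end{cases}$$
(Here $1_\lambda$ is the characteristic function of $\{(\xi,t) \in \mathcal{O} \times [0,T] : |e^{-\mu(t,\xi)} Y_\lambda(t,\xi)| < \lambda\}$, $\tilde{\Psi}' \ge \delta$ and $\tilde{\Psi}' \in L^\infty(\mathbb{R})$, and $(\cdot,\cdot)$ is the euclidean scalar product in $\mathbb{R}^n$.) Since $\mu \in C([0,T] \times \mathcal{O})$ this yields
$$\langle \eta_\lambda(t), \Delta(e^{\mu(t)} Y_\lambda(t))\rangle_2 \le C\bigl(|Y_\lambda(t)|_2^2 + \lambda\bigr). \tag{4.8}$$
Hence (4.7) and Gronwall's lemma imply
$$|Y_\lambda(t)|_2^2 \le e^{C(t-s)} |Y_\lambda(s)|_2^2 + C\lambda T \quad \text{a.e. } t > s.$$
Now taking into account (4.6) and letting $\lambda \to 0$ we get
$$|Y(t)|_2^2 \le e^{C(t-s)} |Y(s)|_2^2 \quad \text{a.e. } t > s. \tag{4.9}$$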
If Y (·) is L 2 (O)-continuous then (4.9) holds for all s, t ∈ [0, T ], t ≥ s. Taking in (4.9) s = τ ∧ T we get Y (t) = 0 for all t ≥ τ ∧ T and since T > 0 was arbitrary for all
$t \ge \tau$ as claimed. So, we have to prove that $Y$ is $L^2(\mathcal{O})$-continuous on $[0,T]$. For this we recall that by Proposition 3.5 we have
$$e^{\mu}\eta \in L^2(0,T; H_0^1(\mathcal{O})), \quad P\text{-a.s.} \tag{4.10}$$
Then by Eq. (4.3) we have $\frac{dY}{dt} \in L^2(0,T; H^{-1}(\mathcal{O}))$ and so, since $Y \in L^2(0,T; H_0^1(\mathcal{O}))$ $P$-a.s. by Proposition 3.4, by a well known interpolation result (see e.g. [3]), we conclude that $Y \in C([0,T]; L^2(\mathcal{O}))$. This concludes the proof of Lemma 4.1.
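For proving our extinction result we need $\mathcal{O} \subset \mathbb{R}$, i.e. $d = 1$. To be more specific let $\mathcal{O} = (0,\pi)$. Then $e_k(\xi) = \sqrt{\frac{2}{\pi}}\sin k\xi$, $\xi \in [0,\pi]$, $\lambda_k = k^2$ and $L^1(0,\pi) \subset H$ continuously, so
$$\gamma = \inf\Bigl\{\frac{|x|_{L^1}}{|x|_{-1}} : x \in L^1(0,\pi)\Bigr\} > 0. \tag{4.11}$$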
Theorem 4.2. Let $x \in L^p(0,\pi)$, $p \ge \max\{2m, 4\}$, be such that $|x|_{-1} < C_N^{-1}\rho\gamma$, where
$$C_N := \frac{\pi}{4}\sum_{k=1}^N (1 + k)^2 \mu_k^2. \tag{4.12}$$
Then, for each $n \in \mathbb{N}$,
$$P(\tau \le n) \ge 1 - \frac{|x|_{-1}}{\rho\gamma}\Bigl(\int_0^n e^{-C_N s}\, ds\Bigr)^{-1}, \tag{4.13}$$
where by Lemma 4.1 we have $\tau(\omega) = \sup\{t \ge 0 : |X(t,x)|_{-1} > 0\}$.

Proof. By condition (iv) we see that
$$r\Psi(r) \ge \rho|r|, \quad \forall\, r \in \mathbb{R}. \tag{4.14}$$
Consider the solution $X_\lambda \in L_W^2(0,T; L^2(\Omega; H_0^1(0,\pi)))$ to Eq. (3.1). Then by first applying Krylov-Rozovskii's Itô formula (cf. [20, Theorem I.3.1] or e.g. [21, Theorem 4.2.5]) and then the classical Itô formula to the real valued semi-martingale $|X_\lambda(t)|_{-1}^2$, $t \in [0,T]$, and the function $\varphi_\varepsilon(r) = (r + \varepsilon^2)^{1/2}$, $r \in \mathbb{R}$,
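we find
$$\begin{aligned}
&d\varphi_\varepsilon(|X_\lambda(t)|_{-1}^2) + (|X_\lambda(t)|_{-1}^2 + \varepsilon^2)^{-1/2}\langle X_\lambda(t), \Psi_\lambda(X_\lambda(t))\rangle_2\, dt \\
&\quad = \frac12\sum_{k=1}^N \mu_k^2\, \frac{|X_\lambda(t)e_k|_{-1}^2\,(|X_\lambda(t)|_{-1}^2 + \varepsilon^2) - |\langle X_\lambda(t)e_k, X_\lambda(t)\rangle_{-1}|^2}{(|X_\lambda(t)|_{-1}^2 + \varepsilon^2)^{3/2}}\, dt \\
&\qquad + 2\langle \sigma(X_\lambda(t))\,dW(t),\, \varphi_\varepsilon'(|X_\lambda(t)|_{-1}^2) X_\lambda(t)\rangle_{-1} \\
&\quad \le \frac12\sum_{k=1}^N \mu_k^2\, \frac{|X_\lambda(t)e_k|_{-1}^2}{(|X_\lambda(t)|_{-1}^2 + \varepsilon^2)^{1/2}}\, dt + 2\langle \sigma(X_\lambda(t))\,dW(t),\, \varphi_\varepsilon'(|X_\lambda(t)|_{-1}^2) X_\lambda(t)\rangle_{-1} \\
&\quad \le C_N\, \frac{|X_\lambda(t)|_{-1}^2}{(|X_\lambda(t)|_{-1}^2 + \varepsilon^2)^{1/2}}\, dt + 2\langle \sigma(X_\lambda(t))\,dW(t),\, \varphi_\varepsilon'(|X_\lambda(t)|_{-1}^2) X_\lambda(t)\rangle_{-1}.
\end{aligned} \tag{4.15}$$
Here $C_N$ is given by (4.12) and
$$\sigma(X_\lambda(t))\,dW(t) = \sum_{k=1}^N \mu_k X_\lambda(t) e_k\, d\beta_k(t).$$
Integrating over $t$ and letting $\lambda \to 0$ we see that the right-hand side of (4.15) converges to the right-hand side of (4.16) below. But by (3.8), (3.12), (3.13) and by Proposition 3.4 the same is true for the left-hand side with limit
$$\varphi_\varepsilon(|X(t)|_{-1}^2) - \varphi_\varepsilon(|x|_{-1}^2) + \int_0^t\!\!\int_{\mathcal{O}} \frac{X(s)\,\eta(s)}{(|X(s)|_{-1}^2 + \varepsilon^2)^{1/2}}\, d\xi\, ds.$$
Taking into account (2.2) and (4.14) we altogether obtain
$$d\varphi_\varepsilon(|X(t)|_{-1}^2) + \rho\, \frac{|X(t)|_{L^1(0,\pi)}}{(|X(t)|_{-1}^2 + \varepsilon^2)^{1/2}}\, dt \le C_N\, \frac{|X(t)|_{-1}^2}{(|X(t)|_{-1}^2 + \varepsilon^2)^{1/2}}\, dt + 2\langle \sigma(X(t))\,dW(t),\, \varphi_\varepsilon'(|X(t)|_{-1}^2) X(t)\rangle_{-1}.$$
Consequently by Lemma 4.1 for all $t \ge 0$,
$$\begin{aligned}
&\varphi_\varepsilon(|X(t)|_{-1}^2) + \gamma\rho\int_0^{t\wedge\tau} \frac{|X(s)|_{-1}}{(|X(s)|_{-1}^2 + \varepsilon^2)^{1/2}}\, ds \\
&\quad \le \varphi_\varepsilon(|x|_{-1}^2) + C_N\int_0^{t\wedge\tau} \frac{|X(s)|_{-1}^2}{(|X(s)|_{-1}^2 + \varepsilon^2)^{1/2}}\, ds + 2\int_0^{t\wedge\tau} \langle \sigma(X(s))\,dW(s),\, \varphi_\varepsilon'(|X(s)|_{-1}^2) X(s)\rangle_{-1}, \quad P\text{-a.s.},
\end{aligned} \tag{4.16}$$
where $\gamma$ is defined by (4.11).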
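Clearly, we have
$$\lim_{\varepsilon\to 0}\int_0^{t\wedge\tau} \frac{|X(s)|_{-1}}{(|X(s)|_{-1}^2 + \varepsilon^2)^{1/2}}\, ds = t \wedge \tau, \quad P\text{-a.s.}$$
Now, letting $\varepsilon$ tend to zero we get
$$|X(t)|_{-1} + \gamma\rho\,(t \wedge \tau) \le |x|_{-1} + C_N\int_0^t |X(s)|_{-1}\, ds + \int_0^t 1_{[0,\tau]}(s)\,\langle \sigma(X(s))\,dW(s),\, X(s)|X(s)|_{-1}^{-1}\rangle_{-1}, \quad P\text{-a.s.} \tag{4.17}$$
Hence by a standard comparison result
$$|X(t)|_{-1} + \rho\gamma\int_0^t e^{C_N(t-s)}\, 1_{[0,\tau]}(s)\, ds \le e^{C_N t}|x|_{-1} + \int_0^t e^{C_N(t-s)}\, 1_{[0,\tau]}(s)\,\langle \sigma(X(s))\,dW(s),\, X(s)|X(s)|_{-1}^{-1}\rangle_{-1}.$$
Taking the expectation and multiplying by $(\rho\gamma)^{-1} e^{-C_N t}$, we obtain
$$\int_0^t e^{-C_N s}\, P(\tau > s)\, ds \le \frac{|x|_{-1}}{\rho\gamma}.$$
Writing $P(\tau > s) = 1 - P(\tau \le s)$ we deduce that
$$P(\tau \le t) \ge 1 - \frac{|x|_{-1}}{\rho\gamma}\Bigl(\int_0^t e^{-C_N s}\, ds\Bigr)^{-1}$$
and (4.13) follows.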
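To get a feeling for the size of the bound (4.13), it can be evaluated for sample data. The following Python sketch is only an illustration: the values of $|x|_{-1}$, $\rho$, $\gamma$ and the noise coefficients $\mu_k$ are made up (in particular $\gamma$ from (4.11) is not computed here but passed in as an assumed number), and the function names are ours.

```python
import math

def c_n(mu):
    """C_N = (pi/4) * sum_{k=1}^N (1 + k)^2 * mu_k^2, with mu = [mu_1, ..., mu_N]."""
    return math.pi / 4.0 * sum((1 + k) ** 2 * m * m for k, m in enumerate(mu, start=1))

def extinction_lower_bound(x_norm, rho, gamma, mu, n):
    """Right-hand side of (4.13): lower bound for P(tau <= n).

    x_norm stands for |x|_{-1}. The result can be negative for small n,
    in which case the bound carries no information.
    """
    c = c_n(mu)
    integral = (1.0 - math.exp(-c * n)) / c if c > 0 else float(n)
    return 1.0 - x_norm / (rho * gamma * integral)

def limit_bound(x_norm, rho, gamma, mu):
    """Limit of the bound as n -> infinity: 1 - C_N * |x|_{-1} / (rho * gamma)."""
    return 1.0 - c_n(mu) * x_norm / (rho * gamma)

if __name__ == "__main__":
    mu = [0.1]                       # assumed noise coefficients (N = 1)
    for n in (1, 10, 100):
        print(n, extinction_lower_bound(0.5, 1.0, 1.0, mu, n))
    print("limit:", limit_bound(0.5, 1.0, 1.0, mu))
```

The bound increases with $n$ and approaches $1 - C_N|x|_{-1}/(\rho\gamma)$, which is positive exactly under the smallness assumption $|x|_{-1} < C_N^{-1}\rho\gamma$ of Theorem 4.2.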
In particular Theorem 4.2 applies to self-organized criticality stochastic models (1.9),
$$\begin{cases}
dX(t) - \Delta\bigl(\rho\,\mathrm{sign}\,(X(t) - x_c) + \tilde{\Psi}(X(t) - x_c)\bigr)dt \ni \sigma(X(t) - x_c)\displaystyle\sum_{k=1}^N \mu_k e_k\, d\beta_k, \quad t \ge 0,\\
\rho\,\mathrm{sign}\,(X(t) - x_c) + \tilde{\Psi}(X(t) - x_c) \ni 0 \quad \text{on } \partial[0,\pi],\\
X(0,x) = x.
\end{cases} \tag{4.18}$$
Here the function $\tilde{\Psi}$ is as in assumption (iv) and $x_c \in \mathbb{R}$.
Corollary 4.3. Assume that $|x - x_c|_{-1} < \rho\gamma C_N^{-1}$, where $C_N$ is as in (4.12) and $\gamma$ as in (4.11). Then for each $n \in \mathbb{N}$,
$$P(\tau_c \le n) \ge 1 - \frac{|x - x_c|_{-1}}{\rho\gamma}\Bigl(\int_0^n e^{-C_N s}\, ds\Bigr)^{-1}, \tag{4.19}$$
where
$$\tau_c = \inf\{t \ge 0 : |X(t) - x_c|_{-1} = 0\} = \sup\{t \ge 0 : |X(t) - x_c|_{-1} > 0\},$$
and $X = X(t,x)$ is the solution to (4.18) in the sense of Definition 2.1. We note that Eq. (1.9) reduces to (4.18) by shifting the Heaviside function with $x_c$.

Remark 4.4. One must notice that if $x > x_c$, i.e. if the initial state is in the supercritical region, then by the positivity result in Theorem 2.2 we have $X(t) \ge x_c$, $P$-a.s. for all $t \ge 0$. This means that the state remains in the supercritical-critical region for all time. However, by (4.19), if $C_N\frac{|x|_{-1}}{\rho\gamma}$ is small, it reaches the critical state $x_c$ with high probability in a finite time, i.e. the supercritical-critical region is completely absorbed by the critical one in a finite time. In contrast, if $C_N\frac{|x|_{-1}}{\rho\gamma}$ is not small, i.e., if the magnitude of the random fluctuations induced by the noise is large compared with the initial state $x$, then the above conclusion might fail because the random perturbations can push the density $X(t)$ over the singularity $x_c$. So, in general we cannot expect $\tau_c < \infty$, $P$-a.s. However, by (4.19) we see that
$$P(\tau_c < \infty) = \lim_{n\to\infty} P(\tau_c \le n) \ge 1 - \frac{C_N\, |x - x_c|_{-1}}{\rho\gamma}.$$
Acknowledgement. We would like to thank Philippe Blanchard for introducing us to these models of self-organized criticality and the relevant literature. This work has been supported in part by the CEEX Project 05 of Romanian Minister of Research, the DFG -International Graduate School “Stochastics and Real World Models”, the SFB-701 and the BiBoS-Research Center, the research programme “Equazioni di Kolmogorov” from the Italian “Ministero della Ricerca Scientifica e Tecnologica” and “FCT, POCTI-219, FEDER”. Most of the work was done during very pleasant visits of the first and second author to the University of Bielefeld and the first and the third author to the SNS in Pisa.

Note added in proof. Employing a supermartingale argument it is possible to prove Lemma 4.1 without the assumption that $N$ in (4.1) is finite. Then also Theorem 4.2 holds for $N = \infty$. In addition, Lemma 4.1 also holds without assuming in (iv) that $\delta > 0$, but rather only that $\tilde{\Psi}'(r) \ge 0$ for all $r \in \mathbb{R} \setminus \{0\}$. Details on this will be included in a forthcoming paper.
References

1. Bak, P., Tang, C., Wiesenfeld, K.: Phys. Rev. Lett. 59, 381–384 (1987); Phys. Rev. A 38, 364–375 (1988)
2. Bantay, P., Janosi, M.: Self organization and anomalous diffusions. Physica A 185, 11–189 (1992)
3. Barbu, V.: Nonlinear semigroups and differential equations in Banach spaces. Leiden: Noordhoff International Publishing, 1976
4. Barbu, V., Bogachev, V.I., Da Prato, G., Röckner, M.: Weak solution to the stochastic porous medium equations: the degenerate case. J. Funct. Anal. 235(2), 430–448 (2006)
5. Barbu, V., Da Prato, G.: The two phase stochastic Stefan problem. Probab. Theory Relat. Fields 124, 544–560 (2002)
6. Barbu, V., Da Prato, G., Röckner, M.: Existence and uniqueness of nonnegative solutions to the stochastic porous media equation. Indiana Univ. Math. J. 57, 187–212 (2008)
7. Barbu, V., Da Prato, G., Röckner, M.: Existence of strong solutions for stochastic porous media equation under general monotonicity conditions. Ann. Prob., to appear
8. Cafiero, R., Loreto, V., Pietronero, L., Vespignani, A., Zapperi, S.: Local Rigidity and Self-Organized Criticality for Avalanches. Europhys. Lett. (EPL) 29(2), 111–116 (1995)
9. Carlson, J.M., Chayes, J.T., Grannan, E.R., Swindle, G.H.: Self-organized criticality in sandpiles: nature of the critical phenomenon. Phys. Rev. A (3) 42, 2467–2470 (1990)
10. Carlson, J.M., Chayes, J.T., Grannan, E.R., Swindle, G.H.: Self-organized criticality and singular diffusion. Phys. Rev. Lett. 65(20), 2547–2550 (1990)
11. Carlson, J.M., Swindle, G.H.: Self-organized criticality: Sand Piles, singularities and scaling. Proc. Nat. Acad. Sci. USA 92, 6710–6719 (1995)
12. Da Prato, G., Zabczyk, J.: Ergodicity for infinite dimensional systems. London Mathematical Society Lecture Notes, n.229, Cambridge: Cambridge University, 1996
13. Díaz-Guilera, A.: Dynamic Renormalization Group Approach to Self-Organized Critical Phenomena. Europhys. Lett. (EPL) 26(3), 177–182 (1994)
14. Giacometti, A., Díaz-Guilera, A.: Dynamical properties of the Zhang model of self-organized criticality. Phys. Rev. E 58(1), 247–253 (1998)
15. Grinstein, G., Lee, D.H., Sachdev, S.: Conservation laws, anisotropy, and self-organized criticality in noisy nonequilibrium systems. Phys. Rev. Lett. 64, 1927–1930 (1990)
16. Hentschel, H.G.E., Family, F.: Scaling in open dissipative systems. Phys. Rev. Lett. 66, 1982–1985 (1991)
17. Hwa, T., Kardar, M.: Dissipative transport in open systems: An investigation of self-organized criticality. Phys. Rev. Lett. 62(16), 1813–1816 (1989)
18. Janosi, I.M., Kertesz, J.: Self-organized criticality with and without conservation. Physica (Amsterdam) A 200, 179–188 (1993)
19. Jensen, H.J.: Self-organized criticality. Cambridge: Cambridge University Press, 1988
20. Krylov, N.V., Rozovskii, B.L.: Stochastic evolution equations. Translated from Itogi Naukii Tekhniki, Seriya Sovremennye Problemy Matematiki 14, 71–146, (1979) New York: Plenum Publishing Corp., 1981
21. Prevot, C., Röckner, M.: A concise course on stochastic partial differential equations. Lecture Notes in Mathematics, Berlin-Heidelberg-New York: Springer, 2007
22. Ren, J., Röckner, M., Wang, F.-Y.: Stochastic generalized porous media and fast diffusions equations. J. Diff. Eq. 238, 118–156 (2007)
23. Turcotte, D.L.: Self-organized criticality. Rep. Prog. in Phys. 621, 1377–1429 (1999)
24. Zhang, Y.: Scaling theory of self-organized criticality. Phys. Rev. Lett. 63, 470–473 (1989)

Communicated by P. Constantin
Commun. Math. Phys. 285, 925–955 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0657-4
Communications in
Mathematical Physics
Extensions and Degenerations of Spectral Triples Erik Christensen1 , Cristina Ivan2 1 Department of Mathematics, University of Copenhagen, Universitetsparken 5,
DK 2100 Copenhagen, Denmark. E-mail:
[email protected] 2 Department of Mathematics, Leibniz University of Hannover, Welfengarten 1, D-30167 Hannover,
Germany. E-mail:
[email protected] Received: 5 November 2007 / Accepted: 15 July 2008 Published online: 12 November 2008 – © Springer-Verlag 2008
Abstract: For a unital C*-algebra A, which is equipped with a spectral triple (A, H, D) and an extension T of A by the compacts, we construct a two parameter family of spectral triples $(A_t, K, D_{\alpha,\beta})$ associated to T. Using Rieffel's notion of quantum Gromov-Hausdorff distance between compact quantum metric spaces it is possible to define a metric on this family of spectral triples, and we show that the distance between a pair of spectral triples varies continuously with respect to the parameters. It turns out that a spectral triple associated to the unitarization of the algebra of compact operators is obtained under the limit, in this metric, for $(\alpha, 1) \to (0, 1)$, while the basic spectral triple (A, H, D) is obtained from this family under a sort of dual limiting process for $(1, \beta) \to (1, 0)$. We show that our constructions will provide families of spectral triples for the unitarized compacts and for the Podleś sphere. In the case of the compacts we investigate to which extent our proposed spectral triple satisfies Connes' 7 axioms for noncommutative geometry, [8].

Introduction

The so called Toeplitz algebra, say T, may be obtained in a number of different ways. The simplest description of it is possibly as the C*-algebra on the Hilbert space $\ell^2(\mathbb{N})$ generated by the unilateral shift. A more profound description which relates to analysis can be obtained via the algebra $C := C(\mathbb{T})$ of continuous functions on the unit circle. A function $f$ in this algebra is represented as a multiplication operator $M_f$ on the Hilbert space $H := L^2(\mathbb{T})$ of square integrable functions. This space has a subspace $H_+$, which consists of those functions in $H$ that have an analytic extension to the interior of the unit disk.
Let P+ denote the orthogonal projection of H onto H+ , then the compression to H+ of a multiplication operator M f for a continuous function f on T becomes the Toeplitz operator T f := P+ M f |H+ , and these operators form a subspace in the Toeplitz algebra such that the Toeplitz algebra becomes the direct sum of {T f | f ∈ C} and the algebra
of compact operators on $H_+$. In this way the Toeplitz algebra becomes an extension of $C$ by the compact operators. The mapping $C \ni f \to T_f$ relates to the differentiable structure on the circle in the way that for the ordinary differentiation on the circle with respect to arc length, i.e. $D := \frac{1}{i}\frac{d}{d\theta}$, we know that the space $H_+$ is the closed linear span of the eigenvectors corresponding to non-negative eigenvalues for $D$, so there is a strong connection between the differentiable structure on the circle and the operator theoretical construction called extension, of $C$ by the compacts. In this article we will study this process from a more general point of view. Our study is based on Connes' notion of a spectral triple which is a way of expressing a differentiable structure in the world of non-commutative *-algebras, [7].

Definition 0.1. Let A be a unital C*-algebra, H a Hilbert space which carries a faithful unital representation π of A and D an unbounded self-adjoint operator on H. For a dense self-adjoint subalgebra A of A the set (A, H, D) is called a spectral triple associated to A if
(i) For all a in A the commutator [D, π(a)] is bounded and densely defined.
(ii) The operator $(I + D^2)^{-1}$ is compact.

In this article our starting point is a spectral triple associated to a C*-algebra A and we want to study some of the possibilities for constructing spectral triples associated to an extension of A by the algebra of compact operators on an infinite dimensional Hilbert space. Our fundamental example of a spectral triple is the one coming from the unit circle, as described above. This particular example was investigated by Connes and Moscovici in [9], where they constructed a spectral triple associated to the Toeplitz algebra for each natural number $n$ in the following way. Let $S$ denote the unilateral shift on $H_+$, i.e. for the canonical basis for $H_+$ we have $Se_k = e_{k+1}$, and let $D_+ = -i\frac{d}{d\theta}\big|_{H_+}$, which means that $D_+$ is the positive self-adjoint operator on $H_+$ which satisfies $D_+e_k = ke_k$. Then the Hilbert space $K$ of the spectral triple for the Toeplitz algebra is defined as $K = H_+ \oplus H_+$, and the Dirac operator $D_n$ is defined via the matrix form
$$D_n := \begin{pmatrix} 0 & D_+^{1/2} S^n \\ (D_+^{1/2} S^n)^* & 0 \end{pmatrix}.$$
In the construction we present in this paper we look at a spectral triple (A, H, D) associated to a C*-algebra A and an orthogonal projection $P$ onto a subspace of $H$ such that $P$ commutes with $D$, and for each operator $a$ from A the commutator $[P, a]$ is compact. All of this set-up is analogous to the classical spectral triple for the circle algebra, but there is, in general, no counterpart to the unilateral shift. This means that we have to modify the construction by Connes and Moscovici in order to construct a spectral triple associated to the C*-algebra generated by the operators $\{Pa|_{PH} : a \in A\}$ and the compact operators on $PH$. A C*-algebra obtained this way is called an extension of A by the compacts, and one of the problems we try to solve in this article is to find ways to extend a spectral triple associated to a C*-algebra to a spectral triple for an extension of that algebra by the compacts. A more general question of this sort has been studied by Chakraborty in [6]. In that paper he studies compact quantum metric spaces as introduced by Rieffel [28], and he investigates the possibilities to generate the structure of a compact quantum metric space associated to an extension of a C*-algebra which is associated to the given compact quantum metric space. In Chakraborty's article he studies a short exact sequence of C*-algebras of the type
$$0 \to K \otimes A \to A_1 \to A_2 \to 0,$$
for which the last homomorphism has a positive splitting $\sigma : A_2 \to A_1$, and he shows that if there is a compact quantum metric space associated to both A and $A_2$, then there exist several compact quantum metric spaces associated to the C*-algebra $A_1$. This set up is more general than ours from the point of view of possible extensions, but our concern is spectral triples rather than the construction of compact quantum metric spaces. Chakraborty offers several applications of his construction to known examples, such as the Podleś sphere. We show that our construction can also be applied to generate spectral triples for this example and also for the algebra of compact operators on a separable Hilbert space. We will, through the entire article, suppose that (A, H, D) is a spectral triple associated to a unital C*-algebra A, which is a subalgebra of $B(H)$. As in the book [16] Def. 2.7.7 and Chap. 5, we will study extensions of Toeplitz type. This means that we are interested in an orthogonal projection $P$ in $B(H)$ which commutes with A modulo the compact operators. One can then define a C*-algebra $B$ on $PH$ as the C*-algebra generated by the space of operators $\{Pa|_{PH} : a \in A\}$ in $B(PH)$. For each operator $a$ in A we have that $(I - P)a|_{PH}$ is compact, so operators of the form $Pa^*(I - P)a|_{PH}$ are compact and, unless $P$ commutes with A, the algebra $B$ will contain non-trivial compact operators. We will let $C(PH)$ denote the algebra of compact operators on $PH$. In the classical case of the Toeplitz algebra for the circle we actually have $C(PH) = B \cap C(PH)$. In any case, independently of what the C*-algebra $B \cap C(PH)$ might be, we will consider the C*-algebra T which is defined as the sum $T := B + C(PH)$. Let $Q(PH)$ denote the Calkin algebra $B(PH)/C(PH)$ and let $\kappa$ denote the quotient homomorphism, then we can define a homomorphism $\varphi$ of A into $Q(PH)$ by $\varphi(a) := \kappa(Pa|_{PH})$.
The extensions we will consider are those obtained via the construction described above which also have the property that the homomorphism ϕ is faithful on A. We may then define a homomorphism of $\mathcal{T}$ onto A as $\varphi^{-1} \circ \kappa$, and we will say that a projection P in B(H) which satisfies all the properties discussed here is of Toeplitz type. We will not study all such projections, but restrict our investigations to projections of Toeplitz type such that P commutes with the Dirac operator D and satisfies the following regularity property:

$$\forall a \in A:\quad [PD, a] \text{ is bounded and densely defined.} \tag{1}$$
Under these assumptions we will say that the quadruple ((A, H, D), P) is of Toeplitz type. We would like to remind the reader that for any infinite-dimensional spectral triple such as (A, H, D), the spectral projection $P_+$ for D corresponding to the interval $[0, \infty[$ is a natural candidate for P. This follows from the well known fact that the symmetry $2P_+ - I$ generates a bounded Fredholm module [3]. In the case $P = P_+$ the regularity condition amounts to the assumption that for any a from A, the commutator [|D|, a] is bounded, too. We can now describe the general construction which we will study here. For a quadruple ((A, H, D), P) of Toeplitz type we define the C*-algebra $\mathcal{T}$ as above, a Hilbert space $K := PH \oplus H$ and a representation π of $\mathcal{T}$ on K given by the matrix form

$$\pi(t) := \begin{pmatrix} t & 0 \\ 0 & \varphi^{-1}(\kappa(t)) \end{pmatrix}.$$

The Hilbert space $K = PH \oplus H$ decomposes as $K = PH \oplus PH \oplus (I - P)H$ and we can see that the first two summands are exactly analogous to the ones appearing
E. Christensen, C. Ivan
in the Connes-Moscovici construction. We have tried to follow their idea, but our analysis indicates that we can not in general give up the information which is encoded in the $(I - P)H$ part of the structures, so we will consider $K = PH \oplus H$ rather than $PH \oplus PH$. On the other hand this opens the possibility to play on the two parts with different weights, as the introduction of parameters in our proposal for a Dirac operator shows. Since D is supposed to commute with P, the regularity conditions imposed make it possible to define a family of Dirac operators on K in the following way. For positive reals α, β such that αβ ≤ 1 we define an unbounded self-adjoint operator $D_{\alpha,\beta}$ on K via its matrix:

$$D_{\alpha,\beta} := \begin{pmatrix} 0 & \beta D|_{PH} & 0 \\ \beta D|_{PH} & \frac{1}{\alpha} D|_{PH} & 0 \\ 0 & 0 & \frac{1}{\alpha} D|_{(I-P)H} \end{pmatrix}.$$

The reason for having the parameter α appearing in the form 1/α is mainly aesthetic. For instance, the formula giving a distance estimate between the non-commutative spaces obtained for two pairs of parameters, say (α, β) and (γ, δ), becomes by Theorem 3.9
β αβ γ δ max , − 1 + 1 − diamα,β , γ δ αβ δ and in this formula the product αβ fits in as a parameter. The reason why the parameters are supposed to satisfy the inequality αβ ≤ 1 comes originally from the classical Toeplitz case, where it is quite easy to analyze the situation in detail. It turns out that for this example and a pair of parameters (α, β) for which αβ > 1 all the aspects of the non-commutative space associated to these parameters is already contained in the space given by the parameters (1/β, β). The general case does not work exactly in the same way, but Remark 1.10, explains why we think nothing essential is lost, if we just stick to the region in the parameter space where αβ ≤ 1. Still another argument for the choice of parameter space is that it turns out that the limiting process α → 0 behaves uniformly nicely on the set of parameters where β ≥ β0 . For the convergence (α, 1) → (0, 1) we find that it induces a convergence with respect to the quantum Gromov-Hausdorff metric of the compact quantum metric spaces associated to T for the parameters (α, 1) onto that of the unitarized algebra of compact operators C(P H ). Our intuitive description of this phenomenon is as follows. We think that the spectral triple acts like a microscope where we have a fixed screen to watch, but we are allowed to change the magnification. The parameter α is a measure of the actual size, say in meters, of the objects we can watch on our screen, and then 1/α is the magnification factor. When α decreases to zero, we lose the sight of the big picture and can only see tiny details of very small things. In the end – when α = 0 − our mathematical construction can only see the compacts, and they are considered to be the infinitesimals in Connes’ dictionary [8]. The precise mathematical content of this story is contained in Theorem 4.6. 
The limit (1, β) → (1, 0) is quite easy to understand if you take a look at the definition of $D_{\alpha,\beta}$ just above, and you can see that you get the basic spectral triple (A, H, D) back, but now in a degenerated representation. This is not a limit with respect to the quantum Gromov-Hausdorff metric on the associated non-commutative spaces, but rather a sort of degenerated limit where the compacts, i.e. the infinitesimals, become invisible. We can provide a simple 2-dimensional model of the picture we try to present. Look at the unit square $[0, 1]^2$ in $\mathbb{R}^2$ and equip it with the metric

$$d_{\alpha,\beta}((x, y), (s, t)) := \alpha|x - s| + \frac{1}{\beta}|y - t|,$$

then the limit (α, 1) → (0, 1) gives the unit interval {(0, y) | 0 ≤ y ≤ 1} with its standard metric as the limit in the Gromov-Hausdorff metric. For (1, β) → (1, 0) there is no limit of this sort, but we get, pointwise, a degenerate metric $d_{1,0}$ on the unit square as a limit. This degenerate metric is given by the formula

$$d_{1,0}((x, y), (s, t)) = \begin{cases} |x - s| & \text{if } t = y, \\ \infty & \text{if } t \neq y. \end{cases} \tag{2}$$

In the first version of the article we considered this limiting process as a sort of deformation, but we have been told by several people that we do not deform a product, so the wording is wrong. By looking into the literature on metric spaces associated to Riemannian structures we have found that the limiting processes we are watching may be considered as degenerations of metric spaces, so the title has been changed accordingly. Usually this sort of degeneration of Riemannian structures is studied under some assumptions on boundedness of curvature during the process [4,5,15], but this last aspect of degeneration of metric structures does not apply to our results, at least for the time being. On the other hand we still think that the limiting process (1, β) → (1, 0), in the case where the algebra A is commutative, offers a way of describing a passage from a non-commutative compact metric space into a commutative compact metric space. Equation (2) indicates that a better description of the degeneration occurring while β → 0 might be a passage from a non-commutative space to an infinite collection of disjoint identical copies of the same commutative space. After we had posted the first preprint version of this article on the arXiv, we were informed by Hanfeng Li that our constructions can work in the settings of David Kerr's [17], his own [22], and the one from their joint work [18], where the quantum Gromov-Hausdorff metric is extended in a way which is based on the operator space structure of the given algebra. The state spaces of the tensor products of the given algebra by the algebras of n × n complex matrices are introduced into the definition of a compact quantum space in order to describe certain aspects of order in more detail. We have chosen not to expand the present paper and hope that these results of Li's will some day find a suitable place to be presented.
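The two limits in the 2-dimensional model can be explored numerically. The following is only an illustrative sketch in Python; the function names `d` and `d_degenerate` and the sample points are assumptions made for this example and do not come from the article. It evaluates the model metric $\alpha|x - s| + \frac{1}{\beta}|y - t|$ and the pointwise limit from formula (2).

```python
import math

def d(alpha, beta, p, q):
    """Model metric on the unit square: alpha*|x - s| + |y - t| / beta."""
    (x, y), (s, t) = p, q
    return alpha * abs(x - s) + abs(y - t) / beta

def d_degenerate(p, q):
    """Pointwise limit of d(1, beta, ., .) as beta -> 0, as in formula (2)."""
    (x, y), (s, t) = p, q
    return abs(x - s) if y == t else math.inf

# Hypothetical sample points: p and q share a y-coordinate, r does not.
p, q, r = (0.2, 0.5), (0.9, 0.5), (0.9, 0.7)

# alpha -> 0 with beta = 1: the x-direction collapses, only |y - t| survives.
for alpha in (1.0, 0.1, 0.001):
    print("alpha =", alpha, " d(p, r) =", d(alpha, 1.0, p, r))

# beta -> 0 with alpha = 1: points on different horizontal lines drift apart.
for beta in (1.0, 0.1, 0.001):
    print("beta =", beta, " d(p, q) =", d(1.0, beta, p, q),
          " d(p, r) =", d(1.0, beta, p, r))
```

In the second loop the distance between p and q stays fixed while the distance between p and r grows without bound, which is the "disjoint identical copies" picture described above.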
Near the end we give a couple of examples and show that our method creates an abundance of spectral triples for the unitarized compact operators on a separable infinite dimensional Hilbert space. Then we show that the method, when applied to the unit circle and the classical differential operator $\frac{1}{i}\frac{d}{d\theta}$, gives a spectral triple associated to the classical Toeplitz C*-algebra. Based on this spectral triple we can then, by a slight modification of our method, obtain a spectral triple for the Podleś sphere. Unfortunately we can see no relations between our constructions and the ones presented in [10] and [11]. At the very end we present some small comments on the relations between the constructions in this article and the concepts of even and odd spectral triples and analytic K-homology as described by Higson and Roe in their book [16]. It may be that further assumptions or conditions on the starting spectral triple may be used to give a basis for a more detailed study of such relations. In this paper we have been focusing on the quite general degeneration aspects of the extended spectral triples. We are most thankful to the referee, who pointed out some problems in the first version of the article and also suggested several possibilities for improvements. Among the questions he asked is the question of how the spectral triples constructed here relate to the 7 axioms for non-commutative geometry which Connes lists in the article [8]. To answer this question we have studied it for our examples involving the
Toeplitz algebra and the unitarized compacts. The Toeplitz case seems not promising at all with respect to this investigation, so we have not included any comments on this aspect for the Toeplitz algebra. For the compacts we do check all the axioms, and we show that we can meet some, whereas for others we can not decide, but for the so-called reality axiom we get one of the signs wrong.

1. A Family of Spectral Triples Associated to an Extension

We will keep a C*-algebra A with an associated spectral triple (A, H, D) fixed during the whole article and moreover suppose that A is a concrete C*-algebra acting on the Hilbert space H. As stated in the Introduction, we will assume that we have a projection P in B(H) of Toeplitz type and study an extension $\mathcal{T}$ of A by the algebra C(PH) of compact operators on PH. It should be remarked that we do not assume that the algebra of compact operators C(PH) is contained in the C*-algebra B generated by operators of the form $Pa|_{PH}$, and in particular we can also study the situation where P commutes with A. We will collect the definitions from the Introduction in a formal definition.

Definition 1.1. Let A be a unital C*-algebra on a Hilbert space H and let (A, H, D) be a spectral triple associated to A. A projection P in B(H) is said to be of Toeplitz type for (A, H, D) if
(i) The projection P commutes with D.
(ii) The projection P commutes modulo the compacts with A.
(iii) The homomorphism ϕ of A to the Calkin algebra Q(PH), defined by $\varphi(a) := Pa|_{PH} + C(PH)$, is faithful.
For such a triple and a projection P of Toeplitz type we define the Toeplitz extension $\mathcal{T}$ of A by C(PH) as the C*-algebra generated by $\{Pa|_{PH} \mid a \in A\} \cup C(PH)$.

At the end of the paper we construct an example which will give a spectral triple for the Podleś sphere.
That example is based on a slight variation of the construction presented in this paper, and it suggests that it might be possible to study extensions of A by a sub C*-algebra of C(P H ) instead. A generalization of our construction to cover cases like this seems possible, but also quite demanding with respect to extra details, so we have chosen only to consider extensions by all of C(P H ), and then just present the other point of view in connection with the example for the Podles’ sphere. Given a projection P of Toeplitz type for (A, H, D), we assume that P commutes with D and by this we mean that P commutes with all the spectral projections of D. From this it follows that the unitary S := P − (I − P) also will commute with D and S will map the domain of definition for D onto itself, ( [24], Prop. 5.3.18.) Hence the domain of definition for D splits into a direct sum of its intersections with P H and (I − P)H respectively. We will need that the commutators from the spectral triple respect this decomposition too, and this is the basis for the following definition. In the classical case where P = P+ this means that we will not only demand that commutators [D, a] are bounded and densely defined for a in A, but we want both [D, a] and [|D|, a] to be bounded and defined on a common dense domain. Definition 1.2. A quadruple ((A, H, D), P) where P is a projection of Toeplitz type for (A, H, D), is said to be of Toeplitz type if:
(i) For any a in A, the commutators $[PD, a]$ and $[(I - P)D, a]$ are bounded and densely defined, and their common domain of definition contains two subspaces, $\mathrm{dom}([D, a]) \cap PH$ and $\mathrm{dom}([D, a]) \cap (I - P)H$, which are dense in PH and (I − P)H respectively.
(ii) The operator $D_P := D|_{PH}$ has trivial kernel.

The properties in the definition above seem natural in the setting for a classical Toeplitz algebra, except for the last one. On the other hand that one does not really matter. Namely, let N denote the orthogonal projection onto the kernel of D; then N is of finite rank, and since it is a spectral projection for D, it commutes with P and we can replace P by P − PN without disturbing any properties of the extension we are studying. The first condition has been imposed in order to be able to look at commutators of the form $[PD, a]|_{PH}$ and their relatives with restrictions to (I − P)H and/or PD replaced by (I − P)D. The conditions are made such that the lemma below holds. To keep the notational problems at a minimum we introduce the conventions that $H_p := PH$, $H_q := (I - P)H$, $P_p := P$, $P_q := (I - P)$, $D_p := DP$, $D_q := D(I - P)$.

Lemma 1.3. For any a in A and any combination of the symbols s, t, r in the set {p, q},

$$\text{the closure of } \big(P_s[D, a]|_{H_t}\big) = P_s \,\big(\text{the closure of } [D, a]\big)\,|_{H_t}, \tag{3}$$

$$\text{the closure of } \big(P_s[D_r, a]|_{H_t}\big) = P_s \,\big(\text{the closure of } [D_r, a]\big)\,|_{H_t}. \tag{4}$$
Proof. We will not prove all these statements but restrict ourselves to the relation (3) in the situation where s = p and t = q. The closure of the commutator [D, a] is bounded and we will denote its closure by δ(a). It is immediate that as operators we have the inclusion P[D, a]|(I − P)H ⊆ Pδ(a)|(I − P)H, and in order to show the statement of the lemma it is sufficient to show that P[D, a]|(I − P)H is densely defined, but this is fulfilled by the condition (i) in Definition 1.2. We now claim that we can perform exactly the same computations with respect to any other combination of the symbols { p, q}, and then obtain the lemma. The effect of the lemma is that we may decompose the commutator [D, a] into its matrix parts with respect to the decomposition of H = H p ⊕ Hq , such that each of the 4 matrix entries of the closure is the closure of the corresponding operator-theoretical matrix entry. From this follows the lemma just below: Lemma 1.4. For any a in A the operators D Pp a Pq and D Pq a Pp are bounded and everywhere defined. Proof. We remind you that a product of operators of the form C B where C is closed and B is bounded is automatically closed. The result then follows from Definition 1.2, (i). We will now define various maps and a spectral triple associated to T . Before we give the definition we would like to mention that its first item is legal, due to a general result on ideals in C*-algebras that we recall here ([12], Cor. 1.5.6).
Proposition 1.5. Suppose that I is a two sided closed ideal of a C*-algebra A, and that B is a sub C*-algebra of A. Then B + I is a C*-algebra and the canonical map $B/(B \cap I) \to (B + I)/I$ is a *-isomorphism.

Let now κ denote the quotient mapping of B(PH) onto the Calkin algebra Q(PH); then the proposition above has the following corollary as a consequence.

Corollary 1.6. For any quadruple ((A, H, D), P) of Toeplitz type with associated Toeplitz extension $\mathcal{T}$, the images $\kappa(\mathcal{T})$ and $\varphi(A)$ in the Calkin algebra Q(PH) agree and $\kappa(\mathcal{T})$ is isomorphic to A.

Definition 1.7. Let ((A, H, D), P) be a quadruple of Toeplitz type associated to a C*-algebra A. For the induced Toeplitz extension $\mathcal{T}$ of A we define:
(i) A representation $\rho : \mathcal{T} \to B(H)$ by $\rho(t) := \varphi^{-1}(\kappa(t))$.
(ii) A completely positive unital and injective mapping $T : A \to \mathcal{T}$ by $T(a) := Pa|_{PH}$.
(iii) A projection of $\mathcal{T}$ onto C(PH) given by $t \mapsto t - T(\rho(t))$.
(iv) For any x in B(H) and any combination of the symbols s, r ∈ {p, q} we define $x_{sr}$ in $B(H_r, H_s)$ by $x_{sr} := P_s x|_{H_r}$.

It should be noted that for an a from A, we have $T(a) = a_{pp}$. Given a situation as above, we will then define a representation π of $\mathcal{T}$ on a Hilbert space K and a family of unbounded self-adjoint operators $D_{\alpha,\beta}$ on K, but it is not immediate that we will get spectral triples this way, so we start by defining the ingredients separately and study some of their properties.

Definition 1.8. Let ((A, H, D), P) be a quadruple of Toeplitz type associated to the C*-algebra A and let $\mathcal{T}$ denote the induced Toeplitz algebra on the space PH. To this quadruple is associated:
(i) A dense self-adjoint subalgebra $A_c$ of C(PH) defined by $A_c := \{k \in C(PH) \mid D_p k \text{ and } k D_p \text{ are bounded}\}$.
(ii) A dense self-adjoint subalgebra $A_t$ of $\mathcal{T}$ defined by $A_t := \{T(a) + k \mid a \in A,\ k \in A_c\}$.
(iii) A Hilbert space K defined as the sum $K := PH \oplus H = H_p \oplus H_p \oplus H_q$.
(iv) A representation π of $\mathcal{T}$ on K defined by

$$\forall t \in \mathcal{T}:\quad \pi(t) := \begin{pmatrix} t & 0 \\ 0 & \rho(t) \end{pmatrix}.$$

(v) For positive reals α, β a self-adjoint operator $D_{\alpha,\beta}$ is defined on K via its matrix, which, with respect to the decomposition $K = H_p \oplus H_p \oplus H_q$, is given by

$$D_{\alpha,\beta} := \begin{pmatrix} 0 & \beta D_p & 0 \\ \beta D_p & \frac{1}{\alpha} D_p & 0 \\ 0 & 0 & \frac{1}{\alpha} D_q \end{pmatrix}.$$
It may not be obvious that the linear space $A_t$ is an algebra, but it follows from Lemma 1.4. We will show that for each pair (α, β) we will get a spectral triple for the Toeplitz extension $\mathcal{T}$ of A, induced by the projection P. This will be an odd spectral triple and it is possible, via a standard trick, to obtain an even triple instead. But from the point of view we are studying here, namely the variation of the compact quantum metric spaces with respect to the parameters α and β, we do not get any changes if the investigation is performed with the odd spectral triple described above or an even one. The properties of a quadruple of Toeplitz type now come into play and help us to split a commutator $[D_{\alpha,\beta}, \pi(t)]$ for a t in $A_t$ into its matrix parts.

Lemma 1.9. For a in A and k in $A_c$ and positive reals α, β with αβ ≤ 1 the commutator $[D_{\alpha,\beta}, \pi(T(a) + k)]$ is bounded. For each matrix part of the closure of this commutator, with respect to the decomposition $K = PH \oplus PH \oplus (I - P)H$, the element is the closure of the corresponding matrix part of the algebraic commutator.

Proof. We will do the computations where they are defined purely algebraically, then show that each matrix entry is bounded and densely defined and then conclude that the closure of the commutator is the closure of the operator composed of the matrix entries. The reason why this is possible is the regularity assumptions and Lemma 1.3:

$$[D_{\alpha,\beta}, \pi(T(a) + k)] = \left[ \begin{pmatrix} 0 & \beta D_p & 0 \\ \beta D_p & \frac{1}{\alpha} D_p & 0 \\ 0 & 0 & \frac{1}{\alpha} D_q \end{pmatrix}, \begin{pmatrix} T(a) + k & 0 & 0 \\ 0 & T(a) & a_{pq} \\ 0 & a_{qp} & a_{qq} \end{pmatrix} \right]$$

$$= \beta \begin{pmatrix} 0 & [D_p, T(a)] - k D_p & D_p a_{pq} \\ [D_p, T(a)] + D_p k & 0 & 0 \\ -a_{qp} D_p & 0 & 0 \end{pmatrix} + \frac{1}{\alpha} \begin{pmatrix} 0 & 0 \\ 0 & [D, a] \end{pmatrix}. \tag{5}$$

The lemma follows.
Remark 1.10. The idea in the setup of the commutator $[D_{\alpha,\beta}, \pi(T(a) + k)]$ is that it shall reflect both of the given commutators $[D_p, k]$ and $[D, a]$ in such a way that a variation of the parameters α and β will reveal information on each of these parts separately. Since $[D_p, T(a)] = P[D, a]|_{PH}$, we get $\|[D_p, T(a)]\| \le \|[D, a]\|$ and then for αβ ≤ 1,

$$\beta \|[D_p, T(a)]\| \le \frac{1}{\alpha} \|[D_p, T(a)]\| \le \frac{1}{\alpha} \|[D, a]\|.$$

Hence for αβ ≤ 1 the term $\beta[D_p, T(a)]$ will not be of significance, and then we see from (5) that in this case we will have

$$\frac{1}{\alpha} \|[D, a]\| \le 1 \quad \text{and} \quad \beta \max\{\|D_p k\|, \|k D_p\|\} \le 2 \quad \text{if} \quad \|[D_{\alpha,\beta}, \pi(T(a) + k)]\| \le 1.$$
We will then impose the condition αβ ≤ 1 in all of our future statements, and this also fits nicely with the results of Theorem 3.9 which indicate that the product αβ is a relevant parameter.

Proposition 1.11. For any pair of positive real numbers α, β such that αβ ≤ 1 the tuple $(A_t, K, D_{\alpha,\beta})$ is a spectral triple associated to the C*-algebra $\mathcal{T}$. This extended spectral triple is s-summable if and only if the given one is s-summable.

Proof. Having Lemma 1.9 we just have to prove that each $D_{\alpha,\beta}$ has compact resolvents, but that follows immediately from the definition of $D_{\alpha,\beta}$. Since P commutes with the spectral projections for D, each eigenspace $H_{\lambda_i}$ for D decomposes as an orthogonal sum $PH_{\lambda_i} \oplus (I - P)H_{\lambda_i}$ and we can find an orthonormal basis for PH, say $(e_i)$, consisting of eigenvectors for $D_p$, plus an orthonormal basis for (I − P)H, say $(f_j)$, consisting of eigenvectors for $D_q$. If $e_i$ is an eigenvector corresponding to the eigenvalue $\lambda_i$, the operator $D_{\alpha,\beta}$ will have an invariant 2-dimensional subspace of the form $\{(ze_i, we_i, 0) \mid z, w \in \mathbb{C}\}$ in the decomposition of K. The eigenvalues of $D_{\alpha,\beta}$ on this space are determined by the 2 × 2 matrix

$$M(\alpha, \beta) := \begin{pmatrix} 0 & \beta \\ \beta & \frac{1}{\alpha} \end{pmatrix}$$

such that the eigenvalues become $\lambda_i$ times the eigenvalues of M(α, β). For an eigenvector $f_j$ for $D_q$ corresponding to an eigenvalue $\mu_j$ this vector becomes an eigenvector for $D_{\alpha,\beta}$ corresponding to the eigenvalue $\mu_j/\alpha$. Let now s denote a positive real and we see that we get the equality below

$$\mathrm{Tr}(|D_{\alpha,\beta}|^{-s}) = \mathrm{Tr}(|M(\alpha, \beta)|^{-s})\, \mathrm{Tr}(|D_p|^{-s}) + \alpha^{s}\, \mathrm{Tr}(|D_q|^{-s})$$

and the proposition follows.
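The eigenvalue bookkeeping in this proof can be illustrated on a finite toy model. The sketch below is an assumption-laden illustration and not part of the article: it takes $D_p$ and $D_q$ to be diagonal with finitely many non-zero eigenvalues (the lists `lam_p` and `mu_q` are made up), builds the spectrum of $D_{\alpha,\beta}$ as described in the proof, and compares the two sides of the trace identity. The helper names are hypothetical.

```python
import math

def eigenvalues_M(alpha, beta):
    """Eigenvalues of M(alpha, beta) = [[0, beta], [beta, 1/alpha]]."""
    trace = 1.0 / alpha
    det = -beta * beta  # never zero for beta > 0, so M is invertible
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace - root) / 2.0, (trace + root) / 2.0

def spectrum(alpha, beta, lam_p, mu_q):
    """Eigenvalues of D_{alpha,beta} for diagonal toy operators D_p and D_q."""
    m_minus, m_plus = eigenvalues_M(alpha, beta)
    values = []
    for lam in lam_p:                      # each e_i gives a 2-dim invariant subspace
        values.extend([lam * m_minus, lam * m_plus])
    values.extend(mu / alpha for mu in mu_q)  # each f_j is an eigenvector
    return values

def trace_power(values, s):
    """Sum of |v|^(-s) over a finite collection of non-zero values."""
    return sum(abs(v) ** (-s) for v in values)

# Hypothetical data with alpha * beta <= 1 and non-zero eigenvalues.
alpha, beta, s = 0.5, 1.5, 2.0
lam_p = [1.0, 2.0, 3.0]
mu_q = [-1.0, -2.0]

lhs = trace_power(spectrum(alpha, beta, lam_p, mu_q), s)
rhs = (trace_power(eigenvalues_M(alpha, beta), s) * trace_power(lam_p, s)
       + alpha ** s * trace_power(mu_q, s))
print(lhs, rhs)
```

This only checks the arithmetic of the identity in finite dimensions; the summability statement itself concerns the convergence of these sums for the actual operators.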
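While we are at such matrix computations we remind you that for positive real numbers α, β, γ, δ, Hilbert spaces L, M, N and bounded operators v ∈ B(M, L), x ∈ B(L, M), y ∈ B(M), z ∈ B(N), we can obtain the identities below with respect to some operator matrices on the Hilbert sum L ⊕ M ⊕ N:

$$\begin{pmatrix} 0 & \beta v & 0 \\ \beta x & \frac{1}{\alpha} y & 0 \\ 0 & 0 & \frac{1}{\alpha} z \end{pmatrix} = \tag{6}$$

$$\begin{pmatrix} \sqrt{\frac{\alpha}{\gamma}}\left(\frac{\beta}{\delta}\right) I & 0 & 0 \\ 0 & \sqrt{\frac{\gamma}{\alpha}}\, I & 0 \\ 0 & 0 & \sqrt{\frac{\gamma}{\alpha}}\, I \end{pmatrix} \begin{pmatrix} 0 & \delta v & 0 \\ \delta x & \frac{1}{\gamma} y & 0 \\ 0 & 0 & \frac{1}{\gamma} z \end{pmatrix} \begin{pmatrix} \sqrt{\frac{\alpha}{\gamma}}\left(\frac{\beta}{\delta}\right) I & 0 & 0 \\ 0 & \sqrt{\frac{\gamma}{\alpha}}\, I & 0 \\ 0 & 0 & \sqrt{\frac{\gamma}{\alpha}}\, I \end{pmatrix}, \tag{7}$$

and by (5) we can conclude as stated in the following lemma.

Lemma 1.12. Let a be in A, k in $A_c$ and α, β, γ, δ positive real numbers such that αβ ≤ 1 and γδ ≤ 1. For t := T(a) + k:

$$\|[D_{\alpha,\beta}, \pi(t)]\| \le \max\left\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\right\} \|[D_{\gamma,\delta}, \pi(t)]\|.$$

We will use this result heavily in the computations to come.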
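The rescaling behind identities (6) and (7) can be checked with scalar blocks. The sketch below is an illustration under stated assumptions: the blocks v, x, y, z are replaced by real numbers, and the diagonal factors $\sqrt{\alpha/\gamma}\,(\beta/\delta)$ and $\sqrt{\gamma/\alpha}$ are the ones forced by matching the entries $\beta v$ and $\frac{1}{\alpha} y$; all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]

def model(weight, inv_weight, v, x, y, z):
    """[[0, weight*v, 0], [weight*x, y/inv_weight, 0], [0, 0, z/inv_weight]]."""
    return [[0.0, weight * v, 0.0],
            [weight * x, y / inv_weight, 0.0],
            [0.0, 0.0, z / inv_weight]]

def rescaling(alpha, beta, gamma, delta):
    """Diagonal factor that turns the (gamma, delta) matrix into the (alpha, beta) one."""
    outer = math.sqrt(alpha / gamma) * beta / delta
    inner = math.sqrt(gamma / alpha)
    return [[outer, 0.0, 0.0], [0.0, inner, 0.0], [0.0, 0.0, inner]]

def lipschitz_constant(alpha, beta, gamma, delta):
    """Square of the norm of the diagonal factor: max(gamma/alpha, alpha*beta^2/(gamma*delta^2))."""
    return max(gamma / alpha, alpha * beta ** 2 / (gamma * delta ** 2))

# Hypothetical parameters with alpha*beta <= 1 and gamma*delta <= 1.
alpha, beta, gamma, delta = 0.5, 1.2, 0.8, 0.9
v, x, y, z = 2.0, -3.0, 5.0, 7.0

s = rescaling(alpha, beta, gamma, delta)
left = model(beta, alpha, v, x, y, z)
right = matmul(matmul(s, model(delta, gamma, v, x, y, z)), s)
print(left)
print(right)
print(lipschitz_constant(alpha, beta, gamma, delta))
```

Since the middle matrix is multiplied on both sides by the same diagonal factor, its norm can grow by at most the square of the largest diagonal entry, which is where the maximum in the norm estimate comes from.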
2. The Family of Compact Quantum Metric Spaces $(A_{\mathcal{T}}, L_{\alpha,\beta})$

For a spectral triple (A, H, D) associated to a unital C*-algebra A, Connes has shown that it is possible to define a metric on the state space S(A) of A by the following formula:

$$\forall \varphi, \psi \in S(A):\quad \mathrm{dist}_A(\varphi, \psi) := \sup\{|(\varphi - \psi)(a)| \ \mid\ \|[D, a]\| \le 1\}. \tag{8}$$
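Formula (8) can be evaluated by hand in the simplest finite example, the two-point space. The sketch below is an illustration that is not taken from the article: it assumes the algebra of diagonal 2 × 2 matrices acting on $\mathbb{C}^2$, the off-diagonal operator D with entry m, and the two pure states that read off the diagonal entries. All names are hypothetical. Because the self-adjoint elements modulo scalars form a one-dimensional space here, the supremum is attained by rescaling a single test element.

```python
import math

def commutator(d, a):
    """[d, a] = d*a - a*d for 2x2 real matrices given as lists of rows."""
    prod = lambda p, q: [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
                         for i in range(2)]
    da, ad = prod(d, a), prod(a, d)
    return [[da[i][j] - ad[i][j] for j in range(2)] for i in range(2)]

def op_norm_2x2(m):
    """Operator norm: square root of the largest eigenvalue of m^T m."""
    g11 = m[0][0] ** 2 + m[1][0] ** 2
    g22 = m[0][1] ** 2 + m[1][1] ** 2
    g12 = m[0][0] * m[0][1] + m[1][0] * m[1][1]
    tr, det = g11 + g22, g11 * g22 - g12 ** 2
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0)

def two_point_distance(m):
    """Distance between the two pure states for D = [[0, m], [m, 0]], m != 0."""
    dirac = [[0.0, m], [m, 0.0]]
    a = [[1.0, 0.0], [0.0, 0.0]]             # test element separating the two points
    seminorm = op_norm_2x2(commutator(dirac, a))
    # Rescale a so that ||[D, a]|| = 1; the states differ on it by 1 / seminorm.
    return abs(a[0][0] - a[1][1]) / seminorm

print(two_point_distance(4.0))
```

In this toy model a large off-diagonal entry in D corresponds to two points that are close together.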
A metric defined in this generality is allowed to be infinite, but here we are mostly interested in spectral triples which have the extra property that the metric defined above is an ordinary metric, which also is a metric for the w*-topology on the state space. This aspect of non-commutative geometry has been studied in several articles by Marc Rieffel; see [26] and the references there. Rieffel has generalized this set up to what he calls compact quantum metric spaces. Here the algebra A of the spectral triple is replaced by an order unit space and the Dirac operator is not directly present, but replaced by a seminorm L on A. In the case where a spectral triple is present the seminorm is given by $A \ni a \mapsto L(a) := \|[D, a]\|$. Our investigation will not be so general here, since we will only study degenerations of spectral triples as constructed in the previous section. On the other hand we will base our results on Rieffel's memoir [27], and we will use the language from that memoir to quantify the impact of the changes of the parameters α and β. We will now recall some definitions and results from that memoir.

Definition 2.1. An order-unit space is a real partially ordered vector space, A, with a distinguished element e (the order unit) which satisfies:
(i) (Order unit property) For each a ∈ A there is an r ∈ R such that a ≤ re.
(ii) (Archimedean property) If a ∈ A and if a ≤ re for all r ∈ R with r > 0, then a ≤ 0.
The norm on an order-unit space is given by $\|a\| = \inf\{r \in \mathbb{R} : -re \le a \le re\}$. Any order-unit space can be realized as a real linear subspace of the vector space of self-adjoint bounded operators on a Hilbert space in such a way that the order unit is the unit operator I.

Definition 2.2. Let (A, e) be an order-unit space with dual A*. The state space S(A) is defined to be the collection of all states µ of A, i.e. µ ∈ A* such that $\mu(e) = 1 = \|\mu\|$.
Consider now a seminorm L on the order-unit space (A, e) having its null-space equal to the scalar multiples of the order unit. Then, for µ, ν ∈ S(A) one can define a metric, ρ L , on S(A) by ρ L (µ, ν) := sup{|µ(a) − ν(a)| | L(a) ≤ 1}. In the absence of further assumptions, ρ L (µ, ν) may be infinite. It is most often true that the ρ-topology on S(A) is finer than the weak*-topology. Definition 2.3. Let (A, e) be an order-unit space. A Lip-norm on A is a seminorm L on A with the following properties: (i) For a ∈ A we have L(a) = 0 if and only if a ∈ Re. (ii) The topology on S(A) from the metric ρ L is the weak*-topology.
Definition 2.4. A compact quantum metric space is a pair (A, L) consisting of an order-unit space A with a Lip-norm L defined on it.

In our context we have four C*-algebras: C(PH), A, $\mathcal{T}$, and the unitarization of the compacts, which we define by $\mathcal{C} := \widetilde{C(PH)} := C(PH) + \mathbb{C} I_{PH}$. We will now define the order unit spaces and associated Lip-norms which we will study.

Definition 2.5. The order unit spaces $A_C$, $A_A$, $A_{\mathcal{T}}$ are defined by:

$$A_C := \{k + \lambda I \mid \lambda \in \mathbb{R},\ k = k^* \in A_c\} \quad \text{(see Definition 1.8)}, \tag{9}$$

$$A_A := \{a \in A \mid a = a^*\}, \tag{10}$$

$$A_{\mathcal{T}} := T(A_A) + A_C = \{t \in A_t \mid t = t^*\}. \tag{11}$$

Definition 2.6. The seminorms $L_C$, $L_A$ and $L_{\alpha,\beta}$ on $A_C$, $A_A$ and $A_{\mathcal{T}}$ are defined by

$$\forall k \in A_C \cap C(PH)\ \forall \lambda \in \mathbb{R}:\quad L_C(k + \lambda I) := \|D_p k\|, \tag{12}$$

$$\forall a \in A_A:\quad L_A(a) := \|[D, a]\|, \tag{13}$$

$$\forall t \in A_{\mathcal{T}}:\quad L_{\alpha,\beta}(t) := \|[D_{\alpha,\beta}, \pi(t)]\|. \tag{14}$$

The corresponding Minkowski sets or unit balls are defined by

Definition 2.7.

$$U_C := \{x \in A_C \mid L_C(x) \le 1\}, \tag{15}$$

$$U_A := \{a \in A_A \mid L_A(a) \le 1\}, \tag{16}$$

$$U_{\alpha,\beta} := \{t \in A_{\mathcal{T}} \mid L_{\alpha,\beta}(t) \le 1\}. \tag{17}$$

The associated metrics are given by

Definition 2.8.

$$\forall f, g \in S(C(PH)):\quad \mathrm{dist}_C(f, g) := \sup\{|f(k) - g(k)| \ \mid\ k \in U_C\}, \tag{18}$$

$$\forall \mu, \nu \in S(A):\quad \mathrm{dist}_A(\mu, \nu) := \sup\{|\mu(a) - \nu(a)| \ \mid\ a \in U_A\}, \tag{19}$$

$$\forall \varphi, \psi \in S(\mathcal{T}):\quad \mathrm{dist}_{\alpha,\beta}(\varphi, \psi) := \sup\{|\varphi(t) - \psi(t)| \ \mid\ t \in U_{\alpha,\beta}\}. \tag{20}$$
The diameters of these spaces, which may take the value ∞, are denoted diamC , diamA and diamα,β respectively. Let X denote any of the 3 subscripts C, A, (α, β), then it follows from [28] that the metric dist X defined above is a metric for the w*-topology on the corresponding state space, if and only if the set U X is separating for the state space, and the image of U X in the quotient space A X /(RI ) is relatively norm compact (Aα,β := AT ). We start by showing that L C is always a Lip-norm and then we show that any L α,β is a Lip-norm if L A is so. Proposition 2.9. The seminorm L C is a Lip-norm.
Proof. Let us first prove that $U_C$ will separate the state space of C(PH). To this end we remind you that the spectrum of $D_p$ is discrete and $D_p$ has a compact inverse (on PH), since it has a trivial kernel. Let then $h := D_p^{-1}$ and it follows that the set $\{hyh \mid y \in B(PH) \text{ and } y = y^*\}$ is contained in $A_C \cap C(PH)$. Since h is self-adjoint with trivial kernel (in B(PH)) this set is norm dense in the self-adjoint part of C(PH), and it follows that the set $\{hyh \mid y \in B(PH) \text{ and } y = y^*\}$ separates the states and that $U_C$ will too. Let us define $U_C^\circ := \{k \in A_C \cap C(PH) \mid \|D_p k\| \le 1\}$, and we will prove that this set is norm compact. From here it will follow directly that $U_C/(\mathbb{R}I) = (U_C^\circ + \mathbb{R}I)/(\mathbb{R}I)$ is norm compact. To see that $U_C^\circ$ is norm compact we will let ε denote a positive real and recall that h is compact, so there exists a finite dimensional spectral projection E for h such that $\|h(I - E)\| < \varepsilon$. For a k in $U_C^\circ$ we then have $\|k(I - E)\| = \|k D_p h(I - E)\| \le \varepsilon \|k D_p\| \le \varepsilon$, and since k is self-adjoint $\|(I - E)k\| \le \varepsilon$. Then for the set $U_C^\circ$ we get $U_C^\circ = U_C^\circ(I - E) + (I - E)U_C^\circ E + E U_C^\circ E$, where each operator in either of the first two summands is of norm at most ε and the set $E U_C^\circ E$ is the unit ball for some norm on the finite dimensional space B(EH). It then follows that $U_C^\circ$ is relatively norm compact. To see that it is norm closed, we consider a sequence $(k_n)$ of elements from $U_C^\circ$ which converges in norm to a compact self-adjoint operator k. For any spectral projection E of $D_p$ corresponding to a bounded interval of the real numbers we have that the sequence $(D_p E k_n)$ is norm convergent with limit $D_p E k$, and we see that $\|D_p E k\| \le 1$, and therefore k belongs to $U_C^\circ$.

We will end this section by showing that each of the seminorms $L_{\alpha,\beta}$ is a Lip-norm if the seminorm $L_A$ is a Lip-norm. This leads to a detailed study, in the next section, of the two parameter family of compact quantum metric spaces $(A_{\mathcal{T}}, L_{\alpha,\beta})$.
On the other hand we already now need some estimates on the relations between the various seminorms in order to prove that $\mathrm{dist}_{\alpha,\beta}$ generates the w*-topology if $\mathrm{dist}_A$ does so.

Lemma 2.10. Let α, β be positive reals such that αβ ≤ 1, k a compact operator in $A_C$ and a an operator in $A_A$; then t = k + T(a) is in $A_{\mathcal{T}}$ and

$$L_A(a) \le \alpha L_{\alpha,\beta}(t), \tag{21}$$

$$L_C(k) \le \frac{1 + \alpha\beta}{\beta} L_{\alpha,\beta}(t). \tag{22}$$

Proof. The first inequality follows directly from (5) and properties of norms of matrices. For the second inequality we use again (5), the result in the first inequality and the triangle inequality to obtain

$$\beta L_C(k) \le L_{\alpha,\beta}(t) + \beta \|[D_p, T(a)]\| \le L_{\alpha,\beta}(t) + \beta L_A(a) \le (1 + \alpha\beta) L_{\alpha,\beta}(t),$$

and the lemma follows.
Proposition 2.11. If L A is a Lip-norm, then for each pair of positive reals (α, β) such that αβ ≤ 1 the seminorm L α,β is a Lip-norm.
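Proof. To see that $U_{\alpha,\beta}/(\mathbb{R}I)$ is relatively norm compact we turn back to Lemma 2.10, which implies that for a t = T(a) + k in $U_{\alpha,\beta}$ we have that $a \in \alpha U_A$ and $k \in \frac{1 + \alpha\beta}{\beta} U_C$, so

$$U_{\alpha,\beta} \subseteq \alpha T(U_A) + \frac{1 + \alpha\beta}{\beta} U_C \tag{23}$$

and

$$U_{\alpha,\beta}/(\mathbb{R}I) \subseteq \alpha T(U_A/(\mathbb{R}I)) + \frac{1 + \alpha\beta}{\beta} U_C/(\mathbb{R}I).$$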
Since $\mathrm{dist}_A$ generates a metric for the w*-topology on S(A), the set $U_A/(\mathbb{R}I)$ is relatively norm compact in $A_A/(\mathbb{R}I)$, and from the proof of Proposition 2.9 we know that $U_C/(\mathbb{R}I)$ is a norm compact subset of $A_C/(\mathbb{R}I)$, so we find that $U_{\alpha,\beta}/(\mathbb{R}I)$ is a relatively norm compact subset of $A_{\mathcal{T}}/(\mathbb{R}I)$.

In the recent article [26] Rieffel studies Lip-norms which satisfy some extra conditions, which he needs in order to show certain results on convergence in the space of compact quantum metric spaces, equipped with the quantum Gromov-Hausdorff metric. The new seminorms are called C*-seminorms and it seems most likely that the seminorms we study may possess most of the properties which a C*-seminorm is required to have. We will not recall all of the definitions from [26], but just recall that one of the properties is that such a seminorm is demanded to be lower semicontinuous. In our context this means that the set $\{t \in A_{\mathcal{T}} \mid L_{\alpha,\beta}(t) \le 1\}$ is norm closed. It seems quite unlikely to be the case here, since we have imposed some regularity conditions in Definition 1.2 on the set $A_A$. This means that already the seminorm $L_A$ will probably not in general be lower semicontinuous. On the other hand we might extend such a seminorm to a larger subalgebra of $\mathcal{T}$ and in this way obtain a lower semicontinuous seminorm, but then it seems difficult for an operator t in the extended domain for $L_{\alpha,\beta}$ to control the behavior of the matrix parts of the commutators of the form $[D_{\alpha,\beta}, \pi(t)]$. We are very thankful to Hanfeng Li, who has shown us how it is possible to prove that the seminorm $L_{\alpha,\beta}$ has the two other properties of a C*-seminorm, named spectral stability and strongly Leibniz, provided the original seminorm $L_A$ has these properties. On the other hand it seems that the regularity conditions we have imposed may be in conflict with the possibility for $L_A$ to be spectrally stable.
Mainly inspired by the classical case we have the impression that difficulties of this type may be avoided if we restrict our construction to the special case where the domains of definition for the seminorms, $A_A$ and $A_{\mathcal{T}}$, are only the smooth elements as defined by Connes in his smoothness axiom of [8]. We will present and discuss this axiom in Sect. 6.

3. The Compact Quantum Metric Spaces Associated to $\mathcal{T}$

In this section we will suppose that the seminorm $L_A$ is a Lip-norm, and then by Proposition 2.11 all the tuples $(A_{\mathcal{T}}, L_{\alpha,\beta})$ are compact quantum metric spaces. This means that the metric spaces

$$\{(S(\mathcal{T}), \mathrm{dist}_{\alpha,\beta}) \mid 0 < \alpha\beta \le 1\}$$
are equipped with the w*-topology and hence they are ordinary compact metric spaces. It seems natural to compare these metric spaces by obtaining Lipschitz estimates between any pair of metrics. Based on Lemma 1.12 we can quite easily obtain such results, which we present just below. The spaces we are studying are not only compact metric spaces but also compact quantum metric spaces, and Rieffel has in the memoir [27] developed a distance concept for such spaces called the quantum Gromov-Hausdorff distance. This last concept of distance is based on the Hausdorff metric on the closed subsets of a compact metric space. Gromov has extended this idea and introduced a distance function defined on pairs of compact metric spaces, and finally Rieffel [27] has extended Gromov's ideas to cover the case of compact quantum metric spaces. We will return to this definition shortly, but first we will treat the Lipschitz estimates between a pair of metrics dist_{α,β} and dist_{γ,δ} on S(T).

Proposition 3.1. For any positive reals α, β, γ, δ such that αβ ≤ 1, γδ ≤ 1 and any t in A_T,

\[ \min\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, L_{\gamma,\delta}(t) \;\le\; L_{\alpha,\beta}(t) \;\le\; \max\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, L_{\gamma,\delta}(t). \]

Proof. We will only prove the right inequality, since the left then follows by symmetry. As usual we have a decomposition of t in A_T as the sum T(a) + k with a in A_A and k in A_C ∩ C(PH). When going back to the definition in (14) we get L_{α,β}(t) = ‖[D_{α,β}, π(t)]‖, and from the results of Lemma 1.12 we then get

\[ L_{\alpha,\beta}(t) \;\le\; \max\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, L_{\gamma,\delta}(t). \]
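The results of Proposition 3.1 may be applied to the metrics dist_{α,β}, and we obtain the following theorem.

Theorem 3.2. Let α, β, γ, δ be positive reals such that αβ ≤ 1 and γδ ≤ 1. Then the metrics dist_{α,β}(·, ·) and dist_{γ,δ}(·, ·) on S(T) are Lipschitz equivalent and satisfy the following inequalities:

\[ \forall \varphi, \psi \in S(T) : \quad \min\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, \mathrm{dist}_{\alpha,\beta}(\varphi,\psi) \;\le\; \mathrm{dist}_{\gamma,\delta}(\varphi,\psi) \;\le\; \max\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, \mathrm{dist}_{\alpha,\beta}(\varphi,\psi). \]

Proof. Proposition 3.1 shows that with the notation from Definition 2.7 we get

\[ \min\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, U_{\alpha,\beta} \;\subseteq\; U_{\gamma,\delta} \;\subseteq\; \max\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}\, U_{\alpha,\beta}. \]

The theorem then follows from the definition given in (20).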
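As a small numerical illustration of the constants appearing in Proposition 3.1 and Theorem 3.2, the following sketch evaluates the two candidates γ/α and αβ²/(γδ²) for given parameters. The function name `lipschitz_constants` is our own and purely illustrative; the code only evaluates the closed formulas and does not compute any seminorm.

```python
def lipschitz_constants(alpha, beta, gamma, delta):
    """Return (s, r), the min and max of gamma/alpha and
    alpha*beta**2 / (gamma*delta**2), as in Proposition 3.1."""
    if min(alpha, beta, gamma, delta) <= 0:
        raise ValueError("all parameters must be positive")
    if alpha * beta > 1 or gamma * delta > 1:
        raise ValueError("need alpha*beta <= 1 and gamma*delta <= 1")
    c1 = gamma / alpha
    c2 = alpha * beta**2 / (gamma * delta**2)
    return min(c1, c2), max(c1, c2)


# Compare the parameters (alpha, beta) = (1, 1) with (gamma, delta) = (0.5, 0.25).
s, r = lipschitz_constants(1.0, 1.0, 0.5, 0.25)
print(s, r)        # 0.5 32.0
print(s * r)       # 16.0, which equals (beta/delta)**2
```

The product of the two constants is always (β/δ)², independently of α and γ, and for identical parameter pairs both constants equal 1.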
We have now seen that any two metrics in this two-parameter family of metrics on S(T) are Lipschitz equivalent, and it follows that we can deduce estimates of the distance, with respect to a quantum Gromov-Hausdorff metric, between the compact quantum metric spaces (A_T, L_{α,β}) and (A_T, L_{γ,δ}). We shall first review, briefly, the Gromov-Hausdorff distance for compact metric spaces and Rieffel's quantum distance for compact quantum metric spaces. We use as references [22] and [27].

For any closed subset Y of a metric space (X, ρ) and r > 0, we denote

\[ N_r^{\rho}(Y) := \{x \in X : \exists y \in Y \text{ with } \rho(x, y) \le r\}. \]

Let S denote the class of all non-empty closed subsets of X. The formula

\[ \forall Y, Z \in S : \quad \mathrm{dist}_H^{\rho}(Y, Z) := \inf\{r : Y \subseteq N_r^{\rho}(Z) \text{ and } Z \subseteq N_r^{\rho}(Y)\} \]

defines a metric (called the Hausdorff metric) on S. One can also use the notation dist_H^X(Y, Z) when there is no confusion about the metric on X. Gromov generalized the Hausdorff distance to a distance between any two compact metric spaces X, Y as follows:

\[ \mathrm{dist}_{GH}(X, Y) := \inf\{\mathrm{dist}_H^{Z}(h_X(X), h_Y(Y)) \mid h_X : X \to Z,\ h_Y : Y \to Z \text{ are isometric embeddings into some compact metric space } Z\}. \]

One can reduce the space Z above to be the disjoint union X ⊔ Y, and we shall denote by D(X, Y) the set of all distances ρ on X ⊔ Y for which the inclusions X, Y → X ⊔ Y are isometric embeddings. It is then true that

\[ \mathrm{dist}_{GH}(X, Y) = \inf\{\mathrm{dist}_H^{\rho}(X, Y) : \rho \in D(X, Y)\}. \]

Let A be an order-unit space. By a quotient (B, π) of A, we mean an order-unit space B and a surjective linear positive map π : A → B preserving the order-unit. Via the dual map π* : B* → A*, one may identify S(B) with a closed convex subset of S(A). This gives a bijection between isomorphism classes of quotients of A and closed convex subsets of S(A). If L is a Lip-norm on A, then the quotient seminorm L_B on B, defined by L_B(b) := inf{L(a) : π(a) = b}, is a Lip-norm on B, and π*|S(B) : S(B) → S(A) is an isometry for the corresponding metrics ρ_L and ρ_{L_B}.

Let (A, L_A) and (B, L_B) be compact quantum metric spaces. The direct sum A ⊕ B has naturally the structure of an order unit space with order unit (e_A, e_B). We will let M(L_A, L_B) denote the set of all Lip-norms L on A ⊕ B that induce L_A and L_B under the natural quotient maps A ⊕ B → A and A ⊕ B → B. For an element L in M(L_A, L_B) with the associated metric ρ_L on S(A ⊕ B), it is then possible to consider both of the compact metric spaces (S(A), ρ_{L_A}) and (S(B), ρ_{L_B}) as compact subsets of the compact metric space (S(A ⊕ B), ρ_L), and one can then compute the usual Hausdorff distance between them. This distance is denoted dist_H^{ρ_L}(S(A), S(B)). We can then define a metric on compact quantum metric spaces as follows.
Definition 3.3. Let (A, L_A) and (B, L_B) be compact quantum metric spaces. Then the quantum Gromov-Hausdorff distance between them is denoted dist_q(A, B) and it is defined by

\[ \mathrm{dist}_q((A, L_A), (B, L_B)) := \inf\{\mathrm{dist}_H^{\rho_L}(S(A), S(B)) \mid L \in M(L_A, L_B)\}. \]

Li gave in [22] the following description of the quantum Gromov-Hausdorff distance.

Proposition 3.4. Let (A, L_A) and (B, L_B) be compact quantum metric spaces. Then we have

\[ \mathrm{dist}_q((A, L_A), (B, L_B)) = \inf\{\mathrm{dist}_H^{V}(h_A(S(A)), h_B(S(B))) : h_A, h_B \text{ are affine isometric embeddings of } S(A), S(B) \text{ into some real normed space } V\}. \]
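All the distances reviewed above are built on the Hausdorff distance between closed subsets of one metric space. For finite subsets the infimum in its definition is attained and can be computed directly, as in the following minimal sketch; the names `hausdorff_distance` and `euclid` are our own, and the example lives on the real line rather than on a state space.

```python
def hausdorff_distance(Y, Z, rho):
    """Hausdorff distance between finite non-empty sets Y and Z
    with respect to the metric rho."""
    if not Y or not Z:
        raise ValueError("both sets must be non-empty")
    # Smallest r such that Y lies in the closed r-neighbourhood of Z ...
    y_to_z = max(min(rho(y, z) for z in Z) for y in Y)
    # ... and smallest r such that Z lies in the closed r-neighbourhood of Y.
    z_to_y = max(min(rho(y, z) for y in Y) for z in Z)
    return max(y_to_z, z_to_y)


def euclid(x, y):
    """Usual distance on the real line."""
    return abs(x - y)


Y = [0.0, 1.0]
Z = [0.0, 3.0]
print(hausdorff_distance(Y, Z, euclid))  # 2.0
```

Here every point of Y is within distance 1 of Z, but the point 3 of Z is at distance 2 from Y, so the Hausdorff distance is 2.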
Proposition 3.4 tells us that the quantum Gromov-Hausdorff distance between two compact quantum metric spaces (A, L_A) and (B, L_B) will always be greater than or equal to the Gromov-Hausdorff distance between the compact metric spaces (S(A), ρ_{L_A}) and (S(B), ρ_{L_B}). Besides Rieffel and Li, there are by now several mathematicians who have published articles on convergence and estimates of distances between compact quantum metric spaces and even incorporated the extra structure coming from the theory of operator spaces into their research [17,21,30], and we have found this very stimulating for the present work.

We will now use the results of Proposition 3.1 to compute estimates for the distance between a pair (A_T, L_{α,β}) and (A_T, L_{γ,δ}) of compact quantum metric spaces. Our construction is based on Rieffel's concept called a bridge, but we could not get his concept to fit exactly into our frame, so we have modified it a bit and incorporated the idea of a bridge into the proof of the following proposition. On the other hand, our situation is much simpler than the general situation considered by Rieffel, since the order unit space is kept fixed as A_T.

Proposition 3.5. Let A be an order unit space and let L_1 and L_2 be two Lip-norms on A for which there exist positive real numbers s < r such that

\[ \forall a \in A : \quad s L_2(a) \le L_1(a) \le r L_2(a). \]

Define L_3 := (1/√(rs)) L_1, let dist_3, dist_2 be the metrics induced by L_3, L_2 and let diam_3, diam_2 denote the diameters of the compact metric spaces (S(A), dist_3), (S(A), dist_2). Then

\[ \mathrm{dist}_q((A, L_3), (A, L_2)) \le \Big(\sqrt{r/s} - 1\Big) \min\{\mathrm{diam}_3, \mathrm{diam}_2\}. \]

Proof. We first fix an arbitrary base point, which in this case means a state σ on A, and then we let M denote an arbitrary positive real. Later in the argument we will let M increase without bound, so you may think of M as a big positive real. We will let R denote the positive real which is defined by

\[ R := \frac{\sqrt{s}}{\sqrt{r} - \sqrt{s}}, \]

and we can then define a seminorm L on A ⊕ A by
\[ \forall a, b \in A : \quad L(a, b) := \max\{L_3(a),\ L_2(b),\ R L_3(a - b),\ R L_2(a - b),\ M|\sigma(a - b)|\}. \]

Since L is defined as a maximum over seminorms, it follows that L is a seminorm on A ⊕ A. If L(a, b) = 0 then, since L_3 and L_2 are Lip-norms, we see that a = αI and b = βI for some real numbers α, β, and finally σ(a − b) = 0 implies that α = β, so (a, b) = α(I, I) and the first condition for L being a Lip-norm is established. We will of course also show that L belongs to M(L_3, L_2), and we will address the question of whether L induces L_3 and L_2 on the summands first. Let us start by looking at the first summand and L_3. We then define the following sets:

U_L := {(a, b) ∈ A ⊕ A | L(a, b) ≤ 1},
U_{L|A} := {a ∈ A | ∃b ∈ A : (a, b) ∈ U_L},
U_2 := {b ∈ A | L_2(b) ≤ 1},
U_3 := {a ∈ A | L_3(a) ≤ 1}.

In order to prove that L induces L_3 it is sufficient to prove that U_{L|A} = U_3, so we will do that. By definition L(a, b) ≥ L_3(a), so for any pair (a, b) ∈ U_L we have a ∈ U_3, and then U_{L|A} ⊆ U_3. To establish the opposite inclusion we choose an a ∈ U_3 and construct a suitable b such that (a, b) is in U_L. It is a matter of checking to show that the element b in A defined by b := √(s/r) a + (1 − √(s/r)) σ(a)I will do. The situation for the second summand is very similar, and it turns out that for any b in A such that L_2(b) ≤ 1 we can define a in A by a := √(s/r) b + (1 − √(s/r)) σ(b)I, and then L(a, b) ≤ 1. The seminorm L is defined on all of A ⊕ A, so the set U_L will be separating for the states on A ⊕ A. We then just have to prove that the set U_L/(ℝ(I, I)) is relatively norm compact in the quotient space (A ⊕ A)/(ℝ(I, I)). Let σ̃ denote the state on A ⊕ A given by σ̃(a, b) := σ(a); then it is standard to deduce that U_L/(ℝ(I, I)) is relatively norm compact if and only if the set U_σ := {(a, b) ∈ U_L | σ̃(a, b) = 0} is relatively norm compact in A ⊕ A.
This implies that we may define two relatively norm compact sets in A by U_{(3,σ)} := {a ∈ U_3 | σ(a) = 0} and U_{(2,σ,M)} := {b ∈ U_2 | |σ(b)| ≤ 1/M}. For these sets we find that U_σ ⊆ U_{(3,σ)} ⊕ U_{(2,σ,M)}, so the metric ρ_L generates the w*-topology on S(A ⊕ A). We can now use this metric to get an upper estimate for the quantum Gromov-Hausdorff distance, and we find that for any state φ on A,

\[ \begin{aligned} \rho_L((\varphi, 0), (0, \varphi)) &= \sup\{|\varphi(a - b)| \mid (a, b) \in U_L\} \\ &\le \sup\{|(\varphi - \sigma)(a - b)| \mid (a, b) \in U_L\} + \frac{1}{M} \\ &\le \min\Big\{\frac{\mathrm{diam}_3}{R}, \frac{\mathrm{diam}_2}{R}\Big\} + \frac{1}{M}, \quad \text{since } (a - b) \in \tfrac{1}{R} U_3 \cap \tfrac{1}{R} U_2. \end{aligned} \]

By letting M grow we conclude that

\[ \mathrm{dist}_q((A, L_3), (A, L_2)) \le \Big(\sqrt{r/s} - 1\Big) \min\{\mathrm{diam}_3, \mathrm{diam}_2\}, \]

and the proposition follows.
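The "matter of checking" in the proof above can be tested numerically in a toy model. The sketch below is only an illustration under our own assumptions: A is the space of real functions on a three-point set, L_1 and L_2 are the Lipschitz seminorms of two hypothetical metrics d1 and d2 on that set, σ is evaluation at the first point, and all function names are ours. It verifies that for a with L_3(a) = 1 the element b := √(s/r) a + (1 − √(s/r)) σ(a)I satisfies L(a, b) ≤ 1.

```python
import itertools
import math
import random


def lip(a, d):
    """Lipschitz seminorm max |a_i - a_j| / d[i][j] on a finite metric space."""
    return max(abs(a[i] - a[j]) / d[i][j]
               for i, j in itertools.combinations(range(len(a)), 2))


# Two hypothetical metrics on three points (only entries with i < j are read).
d1 = [[0, 1.0, 2.0], [1.0, 0, 1.5], [2.0, 1.5, 0]]
d2 = [[0, 2.0, 2.5], [2.0, 0, 3.0], [2.5, 3.0, 0]]

# With L1 = lip(., d1) and L2 = lip(., d2) one has s*L2 <= L1 <= r*L2 for
ratios = [d2[i][j] / d1[i][j] for i, j in itertools.combinations(range(3), 2)]
s, r = min(ratios), max(ratios)            # here s = 1.25 and r = 2.0

c = math.sqrt(s / r)
R = math.sqrt(s) / (math.sqrt(r) - math.sqrt(s))


def L3(a):
    return lip(a, d1) / math.sqrt(r * s)


def L2(a):
    return lip(a, d2)


def sigma(a):
    return a[0]                            # base-point state: evaluation at point 0


def bridge(a, b, M):
    """The seminorm L(a, b) from the proof of Proposition 3.5."""
    diff = [x - y for x, y in zip(a, b)]
    return max(L3(a), L2(b), R * L3(diff), R * L2(diff), M * abs(sigma(diff)))


def partner(a):
    """b := sqrt(s/r) a + (1 - sqrt(s/r)) sigma(a) I."""
    return [c * x + (1 - c) * sigma(a) for x in a]


random.seed(0)
worst = 0.0
for _ in range(1000):
    a = [random.uniform(-1, 1) for _ in range(3)]
    scale = L3(a)
    a = [x / scale for x in a]             # normalise so that L3(a) = 1
    worst = max(worst, bridge(a, partner(a), M=1000.0))
print(worst)                               # approximately 1
```

The check also reflects the identity R(1 − √(s/r)) = √(s/r), which is what keeps the two terms involving a − b below 1.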
Remark 3.6. In connection with the proposition above it may be relevant to note that the diameters diam_2, diam_3 relate in a reciprocal way as the corresponding seminorms, so we have √(s/r) · diam_3 ≤ diam_2 ≤ √(r/s) · diam_3.

The special case where the seminorms L_1 and L_2 are proportional is taken out as a corollary.

Corollary 3.7. Let A be an order unit space with a Lip-norm L. For any positive real t:

\[ \mathrm{dist}_q((A, L), (A, tL)) \le |1 - 1/t|\, \mathrm{diam}_{(A,L)}. \]

Proof. Suppose t > 1; then for the Lip-norm N := t²L we have L ≤ N ≤ t²L. The proposition then applies with s = 1 and r = t², so for tL = (1/√(sr)) N we get, by Remark 3.6 and the use of the min option in Proposition 3.5,

\[ \mathrm{dist}_q((A, L), (A, tL)) \le \Big(\sqrt{r/s} - 1\Big)\mathrm{diam}_{tL} = (t - 1)\,\mathrm{diam}_{tL} = (1 - t^{-1})\,\mathrm{diam}_L. \]

For t < 1 and N = t²L we get t²L ≤ N ≤ L and then dist_q((A, L), (A, tL)) ≤ (t^{-1} − 1) diam_L.

The corollary above suggests that it could be interesting to see what will happen for t increasing to infinity, so we will include such a result.

Proposition 3.8. Let (A, L) be an order unit space with a Lip-norm, and let (ℝ, 0) be the one point order unit space with Lip-norm equal to 0. For any positive real t we have the estimate

\[ \mathrm{dist}_q((A, tL), (\mathbb{R}, 0)) \le \frac{\mathrm{diam}_{(A,L)}}{t}. \]

Proof. We choose and fix a state σ on A and let M denote a big positive real. We can then define a seminorm L̂_t on A ⊕ ℝ by

\[ \hat{L}_t(a, s) := \max\{tL(a),\ M|\sigma(a) - s|\}. \]

It is easy to check that L̂_t induces the seminorm tL on A and the zero seminorm on ℝ. The order unit space ℝ has exactly one state, which we denote by ψ. For a state φ on A we can estimate as follows:

\[ \begin{aligned} \mathrm{dist}_{\hat{L}_t}((\varphi, 0), (0, \psi)) &= \sup\{|\varphi(a) - s| \mid \hat{L}_t(a, s) \le 1\} \\ &\le \sup\{|\varphi(a) - \sigma(a)| \mid tL(a) \le 1\} + \sup\{|\sigma(a) - s| \mid |\sigma(a) - s| \le 1/M\} \\ &\le \frac{\mathrm{diam}_{(A,L)}}{t} + \frac{1}{M}. \end{aligned} \]

The proposition follows.
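The two bounds of Corollary 3.7 and Proposition 3.8 can be combined through the triangle inequality for dist_q. The following sketch, with function names of our own choosing, simply evaluates the two right-hand sides; it illustrates that for t ≥ 1 their sum is always diam_{(A,L)}, so the detour through (A, tL) gives the same bound on dist_q((A, L), (ℝ, 0)) as Proposition 3.8 with t = 1.

```python
def scaling_bound(t, diam):
    """Right-hand side of Corollary 3.7 for dist_q((A, L), (A, tL))."""
    if t <= 0:
        raise ValueError("t must be positive")
    return abs(1 - 1 / t) * diam


def point_bound(t, diam):
    """Right-hand side of Proposition 3.8 for dist_q((A, tL), (R, 0))."""
    if t <= 0:
        raise ValueError("t must be positive")
    return diam / t


diam = 2.0  # hypothetical value of diam_(A,L)
for t in (1, 2, 4, 8):
    total = scaling_bound(t, diam) + point_bound(t, diam)
    print(t, scaling_bound(t, diam), point_bound(t, diam), total)
```

For each listed t the last column equals 2.0, the chosen diameter.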
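We can then combine some of the results just obtained with Proposition 3.1 to obtain estimates on the variation of the compact quantum metric spaces (A_T, L_{α,β}). In this connection we will let diam_{α,β} denote the diameter of this space.

Theorem 3.9. If α, β, γ, δ are positive reals such that αβ ≤ 1 and γδ ≤ 1, then

\[ \mathrm{dist}_q\big((A_T, L_{\alpha,\beta}), (A_T, L_{\gamma,\delta})\big) \le \Big(\max\Big\{\frac{\alpha\beta}{\gamma\delta}, \frac{\gamma\delta}{\alpha\beta}\Big\} - 1 + \Big|1 - \frac{\beta}{\delta}\Big|\Big)\, \mathrm{diam}_{\alpha,\beta}. \]

Proof. Inspired by Proposition 3.1 we define

\[ r := \max\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}, \qquad s := \min\Big\{\frac{\gamma}{\alpha}, \frac{\alpha\beta^2}{\gamma\delta^2}\Big\}; \]

then we get

\[ \forall t \in A_T : \quad s L_{\gamma,\delta}(t) \le L_{\alpha,\beta}(t) \le r L_{\gamma,\delta}(t). \]

In the notation from Proposition 3.5,

\[ \sqrt{\frac{r}{s}} = \max\Big\{\frac{\alpha\beta}{\gamma\delta}, \frac{\gamma\delta}{\alpha\beta}\Big\} \quad \text{and} \quad \frac{1}{\sqrt{rs}} = \frac{\delta}{\beta}, \]

so we have the estimate

\[ \mathrm{dist}_q\big((A_T, L_{\gamma,\delta}), (A_T, (\delta/\beta) L_{\alpha,\beta})\big) \le \Big(\max\Big\{\frac{\alpha\beta}{\gamma\delta}, \frac{\gamma\delta}{\alpha\beta}\Big\} - 1\Big)\, \mathrm{diam}_{\alpha,\beta}. \]

We can then use Corollary 3.7 and the triangle inequality to get

\[ \mathrm{dist}_q\big((A_T, L_{\gamma,\delta}), (A_T, L_{\alpha,\beta})\big) \le \Big(\max\Big\{\frac{\alpha\beta}{\gamma\delta}, \frac{\gamma\delta}{\alpha\beta}\Big\} - 1 + \Big|1 - \frac{\beta}{\delta}\Big|\Big)\, \mathrm{diam}_{\alpha,\beta}, \]

and the theorem follows.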
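To get a feeling for the estimate in Theorem 3.9, one may evaluate its right-hand side for concrete parameters. The sketch below does only that; the function name and the numerical value used for diam_{α,β} are assumptions of ours, since the diameter depends on the underlying spectral triple.

```python
def theorem_3_9_bound(alpha, beta, gamma, delta, diam_ab):
    """Right-hand side of Theorem 3.9, given a value diam_ab for diam_{alpha,beta}."""
    if min(alpha, beta, gamma, delta) <= 0:
        raise ValueError("all parameters must be positive")
    if alpha * beta > 1 or gamma * delta > 1:
        raise ValueError("need alpha*beta <= 1 and gamma*delta <= 1")
    ratio = (alpha * beta) / (gamma * delta)
    return (max(ratio, 1 / ratio) - 1 + abs(1 - beta / delta)) * diam_ab


diam_ab = 2.0  # hypothetical diameter
# Identical parameters: the bound vanishes.
print(theorem_3_9_bound(0.5, 1.0, 0.5, 1.0, diam_ab))   # 0.0
# Two points on the same hyperbola alpha*beta = gamma*delta = 0.5:
print(theorem_3_9_bound(0.5, 1.0, 0.25, 2.0, diam_ab))  # 1.0
```

When αβ = γδ the first term vanishes and only |1 − β/δ| diam_{α,β} remains, which is the bound of Corollary 3.7 with t = δ/β.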
4. On Limits of (A_T, L_{α,β})

In this section we will keep the set-up from the last section, so we can continue our investigation of the family of compact quantum metric spaces (A_T, L_{α,β}) and study the limiting processes α = 1, β → 0 and α → 0, β = 1. There are limits in both cases, but they are of a different nature. In the first case the expression L_{1,0} has an obvious meaning and it follows from (5) that this will be a seminorm on A_T. This seminorm will be degenerate because its kernel will contain all of C(PH), but on the other hand you can obtain the seminorm L_A directly from L_{1,0}, so we recover all the ingredients of the original spectral triple via this limit process. For the family (α, 1) with α decreasing from 1 to 0 there is no limit on the level of seminorms, since α appears in the expression for L_{α,β} in the negative power 1/α, but this does not affect the convergence of the corresponding compact quantum metric spaces, since we prove that the spaces (A_T, L_{α,1})
converge to (A_C, L_C) in the quantum Gromov-Hausdorff metric for α → 0. We have thought of possible interpretations of this result and do offer some remarks concerning the connection to physics in the text below, but we are not trained physicists, so we are reluctant to make too many comments in this direction.

The proofs of the results are based on some structural results on the dual space of a unital C*-algebra. Let the dual space of T be denoted T*, and we will then define two subspaces 𝒩 and 𝒮 of T* by

𝒩 := {φ ∈ T* | ‖φ|C(PH)‖ = ‖φ‖},
𝒮 := {φ ∈ T* | φ|C(PH) = 0}.

Here the letters 𝒩 and 𝒮 are chosen because they refer to the terms normal and singular functionals on B(H). A priori it is not at all clear that 𝒩 is a subspace, and we will not prove it here, but recall some results of Effros [14] which are presented just below. For details we refer to Dixmier's book [13] Prop. 2.11.7.

Proposition 4.1. With the notation described above, there exist positive contractive linear projection operators N : T* → 𝒩 and S : T* → 𝒮 such that for any φ in T*

N(φ) + S(φ) = φ, ‖N(φ)‖ + ‖S(φ)‖ = ‖φ‖.

It is easy to identify 𝒩 with the dual space of C(PH) simply by restricting a functional in 𝒩 to C(PH). The identification the other way goes via the fact that B(PH) is the second dual of C(PH), so the canonical embedding of C(PH)* into C(PH)*** induces an embedding, say ι_C, of C(PH)* onto 𝒩. The space 𝒮 may be identified with A* in the following way. The identification is made via the homomorphism ρ : T → A, which was defined in Definition 1.7. Any functional µ in A* may be mapped into 𝒮 by the composition µ ◦ ρ. Since the kernel of ρ is C(PH), it follows that this will be an isometric and order isomorphic mapping of A* onto 𝒮, and we will denote this embedding ι_A. As an immediate corollary of these identifications we get the following result.

Corollary 4.2. For any state φ in T* there exists a unique pair of states f in S(C(PH)) and µ in S(A) and a real α in [0, 1] such that φ = (1 − α)ι_C(f) + αι_A(µ).

These structures have been studied and generalized in [1,2], and in the language of compact convex sets one would say that the two convex sets ι_C(S(C(PH))) and ι_A(S(A)) form a pair of split faces of S(T).

The discussion on how the dual space of C(PH) fits into the dual of T can also be applied to the situation when C(PH) is considered as a subalgebra of C̃ = C̃(PH) = C(PH) + ℂI. In this case we will fix a state σ from the space 𝒮 of singular functionals on T and use this state as a basis vector for the one-dimensional singular space associated to the decomposition C̃(PH)* = 𝒩 ⊕ ℂσ. In the general study of the variation of the metrics on S(T) we will use σ as a base point in the w*-compact space S(T).

The limit of (A_T, L_{1,β}) as β → 0. This limit is very easy to understand from the point of view of compact quantum metric spaces. It is simply an affine deformation at the level of seminorms, as can be seen immediately from Definitions 1.8 and 2.6. We
will then extend that definition to cover the pair (1, 0) too, and let L_{(1,0)} denote the corresponding seminorm. We can also still define the unit ball or Minkowski set U_{(1,0)} for this seminorm by the definitions given at (17), and it follows that A_C ∩ C(PH) is contained in U_{(1,0)}. It is then easy to prove the following result.

Theorem 4.3. For any a in A_A and k in A_C: L_{1,β}(T(a) + k) → L_{1,0}(T(a) + k) = L_A(a) for β → 0. For states φ, ψ on T with φ = ι_C(f) + ι_A(µ), ψ = ι_C(g) + ι_A(ν) the distance formula applied to the seminorm L_{1,0} gives

\[ \mathrm{dist}_{1,0}(\varphi, \psi) = \begin{cases} 0 & \text{if } \varphi = \psi, \\ \infty & \text{if } f \ne g, \\ \|\mu\|\, \mathrm{dist}_A(\mu/\|\mu\|, \nu/\|\mu\|) & \text{if } f = g \text{ and } \mu \ne \nu. \end{cases} \tag{24} \]

Proof. Since the kernel of L_{(1,0)} contains all of A_C, it follows from the distance formula (8) that dist_{(1,0)}(φ, ψ) = ∞ if f ≠ g. If φ ≠ ψ and f = g then ‖f‖ = ‖g‖ < 1, so ‖µ‖ = ‖ν‖ = 1 − ‖f‖ ≠ 0. Again the distance formula and (5) give right away that

\[ \mathrm{dist}_{(1,0)}(\varphi, \psi) = \sup\{|(\varphi - \psi)(t)| \mid L_{(1,0)}(t) \le 1\} = \sup\{|(\mu - \nu)(a)| \mid L_A(a) \le 1\} = \|\mu\|\, \mathrm{dist}_A(\mu/\|\mu\|, \nu/\|\mu\|). \]

We cannot prove that the metric distances dist_{1,β}(φ, ψ) converge to dist_{(1,0)}(φ, ψ) for β → 0 when the latter is finite, unless we have a trivial extension, but in the cases where the distance is infinite, i.e. when the normal parts, f and g, of the states are different, we can always give an estimate of the speed of divergence.

Proposition 4.4. Let 0 < β ≤ 1 be real and φ, ψ states on T with decompositions φ = ι_C(f) + ι_A(µ), ψ = ι_C(g) + ι_A(ν). If f ≠ g then there exists a positive real γ such that

∀β ∈ (0, 1] : dist_{(1,β)}(φ, ψ) ≥ γ/β.

Proof. We will establish a set theoretical inclusion from which the statement is easy to deduce:

\[ \frac{1}{\beta}\big(U_C \cap C(PH)\big) \subseteq U_{1,\beta}. \tag{25} \]

This inclusion follows from the definitions presented in (15)–(17) and the computations which lead to (5). We can then see that the proposition follows when we define γ by γ := sup{|(f − g)(k)| | k ∈ U_C ∩ C(PH)}.
Suppose A is commutative and represents some classical system and T models a quantization of A; then for a couple of states on T, such as φ and ψ, we could look at f, g as their quantum parts and µ, ν as the classical parts. Then it appears that the limit for dist_{(1,β)}(φ, ψ) exists and gives the classical metric, scaled to the size of the classical parts, if and only if their quantum parts are identical. Another attempt to make an interpretation is that the inequality in the proposition above implies that in a space where β is small, the quantum parts are far apart; but we do not want to press this any further right now.

The limit of (A_T, L_{α,1}) as α → 0. We realized very early on that the family of compact quantum spaces (A_T, L_{α,1}) converges pointwise as concrete metric spaces towards (C(PH), L_C) when α decreases to 0, but it took rather long to see that this convergence actually also works with respect to the quantum Gromov-Hausdorff metric. Before we prove this result we need a simple estimate.

Lemma 4.5. For any positive functional f in the dual space C(PH)*:

sup{|f(k)| | k ∈ U_C ∩ C(PH)} ≤ ‖f‖ diam_C.

Proof. Let ε > 0 and choose x in U_C ∩ C(PH) such that |f(x)| ≥ sup{|f(y)| | y ∈ U_C ∩ C(PH)} − ε/2. Since x is compact and PH is of infinite dimension we can find a positive functional g in C(PH)* such that ‖g‖ = ‖f‖ and |g(x)| ≤ ε/2. Hence ‖f‖ diam_C ≥ |(f − g)(x)| ≥ sup{|f(y)| | y ∈ U_C ∩ C(PH)} − ε.

Theorem 4.6. For α, β positive reals such that αβ ≤ 1:

\[ \mathrm{dist}_q\big((A_T, L_{\alpha,\beta}), (A_C, \beta L_C)\big) \le \alpha\,(\mathrm{diam}_A + \mathrm{diam}_C). \]

Proof. We will define a seminorm L on A_T ⊕ A_C which induces the given seminorms on each summand. Let σ be a state on T which vanishes on C(PH) and let M be a big positive real number. We can then define the seminorm L by

\[ \forall a \in A_A\ \forall k, h \in A_C \cap C(PH)\ \forall s \in \mathbb{R} : \quad L((T(a) + k, h + sI)) := \max\Big\{L_{\alpha,\beta}(T(a) + k),\ \beta L_C(h),\ \frac{1}{\alpha} L_A(a),\ \frac{1}{\alpha} L_C(k - h),\ M|\sigma(T(a) - sI)|\Big\}. \]

Let us show that the seminorm induced by L on A_T is L_{α,β}. By definition we always have L((T(a) + k, h + sI)) ≥ L_{α,β}(T(a) + k), so it is enough to prove that for a given t = T(a) + k with a in A_A and k in A_C ∩ C(PH) we can find an h in A_C ∩ C(PH) and an s in ℝ such that L((T(a) + k, h + sI)) = L_{α,β}(T(a) + k). We will prove that h := (1 + αβ)^{-1} k and s := σ(T(a)) will work. To this end we may without loss of generality assume that L_{α,β}(T(a) + k) = 1, and then by (21) it follows that L_A(a) ≤ α, and by (22) we find that L_C(k) ≤ (1 + αβ)/β. From here it is easy to prove that L(T(a) + k, h + sI) = 1. For the seminorm induced by L on A_C we also get by definition that L((T(a) + k, h + sI)) ≥ βL_C(h). Let then an h + sI be given in A_C and define a := sI, k := h; then it is again a matter of computation to show that L((T(a) + k, h + sI)) = βL_C(h). We will then show dist_q((A_T, L_{α,β}), (A_C, βL_C)) ≤ α(diam_A + diam_C) by showing that for each positive ε and any state φ on T there exists a state ψ on C̃(PH) such
that for the metric ρ_L on the state space of A_T ⊕ A_C we have ρ_L((φ, 0), (0, ψ)) ≤ α(diam_A + diam_C) + ε, and vice versa. For a state φ on T we can write φ = ι_C(f) + ι_A(µ) for positive functionals f on C(PH) and µ on A. Let f̂ denote the extension, with the same norm, of f to C̃(PH); then the functional ψ is defined as f̂ + ‖µ‖σ on C̃(PH) and we get the following string of inequalities:

\[ \begin{aligned} \rho_L((\varphi, 0), (0, \psi)) &= \sup\{|\varphi(T(a) + k) - \psi(h + sI)| \mid L((T(a) + k, h + sI)) \le 1\} \\ &\le \sup\{|\varphi(T(a)) - \sigma(T(a))| \mid L_A(a) \le \alpha\} \\ &\quad + \sup\{|\sigma(T(a)) - s| \mid |\sigma(T(a)) - s| \le 1/M\} \\ &\quad + \sup\{|f(k - h)| \mid L_C(k - h) \le \alpha\}, \end{aligned} \]

which by Lemma 4.5 is

\[ \le \alpha(\mathrm{diam}_A + \mathrm{diam}_C) + \frac{1}{M}. \]

Given a state ψ on C̃(PH) we can write ψ = f̂ + (1 − ‖f‖)σ for a positive functional f on C(PH) of norm at most 1. Then the state φ is defined as ι_C(f) + (1 − ‖f‖)σ on T, and we get as above

\[ \begin{aligned} \rho_L((\varphi, 0), (0, \psi)) &= \sup\{|\varphi(T(a) + k) - \psi(h + sI)| \mid L((T(a) + k, h + sI)) \le 1\} \\ &\le \sup\{|\varphi(T(a)) - \sigma(T(a))| \mid L_A(a) \le \alpha\} \\ &\quad + \sup\{|\sigma(T(a)) - s| \mid |\sigma(T(a)) - s| \le 1/M\} \\ &\quad + \sup\{|f(k - h)| \mid L_C(k - h) \le \alpha\} \\ &\le \alpha(\mathrm{diam}_A + \mathrm{diam}_C) + \frac{1}{M}, \end{aligned} \]

and the theorem follows.
The inequalities just above show that when α → 0 the system seems to forget how it was created, and only the very basic structure of the quantum infinitesimals modelled by C(PH) is left visible.

5. A Quantum Metric on the Set of Parameters P := {(α, β) ∈ ℝ² | α ≥ 0, β > 0, αβ ≤ 1} ∪ {(0, ∞)}

The quantum Gromov-Hausdorff metric on our two-parameter family of compact quantum metric spaces naturally defines a metric on the parameter space, say P° := {(α, β) ∈ ℝ²₊ | αβ ≤ 1}, and we want to get an impression of the sort of metric space we can obtain this way. We have not made a very detailed study of this, but we show that some balls in this metric are unbounded with respect to the Euclidean distance in ℝ². We also show, the other way around, that some sets which are bounded with respect to the Euclidean metric are unbounded with respect to the quantum metric. Based on the results in Theorem 4.6 we realized that it is reasonable to extend the parameter space to the space P, defined below:

P := {(α, β) | α ≥ 0, β > 0, αβ ≤ 1} ∪ {(0, ∞)},
for 0 < β < ∞ : (A_{0,β}, L_{0,β}) := (A_C, βL_C),
(A_{0,∞}, L_{0,∞}) := (ℝ, 0).
The points we have added are also compact quantum metric spaces, and it turns out that they fit in very well with respect to the quantum Gromov-Hausdorff metric.

Proposition 5.1. Let β_0 > 0; then the subset P_{β_0} := {(α, β) ∈ P | β ≥ β_0} is compact with respect to the metric inherited from the quantum Gromov-Hausdorff distance.

Proof. Fix a positive ε and define β_1 := max{β_0, 2(diam_A + diam_C)/ε}. For any pair (α, β) in P with β ≥ β_1 we get α ≤ β_1^{-1}, and by Theorem 4.6 dist_q((α, β), (0, β)) ≤ α(diam_A + diam_C) ≤ (diam_A + diam_C)/β_1. By Proposition 3.8 dist_q((0, β), (0, ∞)) ≤ diam_C/β ≤ diam_C/β_1, hence it follows that for (α, β) in P with β ≥ β_1 this point is in the ball of radius ε with centre in (0, ∞). We are then left with the set {(α, β) ∈ P_{β_0} | β_0 ≤ β ≤ β_1}, and we will divide this set into two sets dependent on a positive real δ which we define by

\[ \delta := \min\Big\{\frac{\varepsilon}{3(1 + \mathrm{diam}_A + \mathrm{diam}_C)}, \frac{1}{\beta_1}\Big\}, \]

and the sets become

X := {(α, β) | 0 ≤ α ≤ δ and β_0 ≤ β ≤ β_1},
Y := {(α, β) | δ ≤ α and β_0 ≤ β ≤ β_1 and αβ ≤ 1}.

The results from Theorem 3.9 show that the usual Euclidean metric and the metric dist_q generate the same topology on the subset Y, so this set is compact. For the set X we can look at the subset Z which we define by Z := {(δ, β) | β_0 ≤ β ≤ β_1}. Since Z is also a subset of Y, it is compact for the quantum metric dist_q, by the result above, and we can find a finite number of points {(δ, β_i) | i ∈ J} in Z such that any point in Z is within distance δ from a point of the form (δ, β_i). For any point (α, β) in X, we get from Theorem 4.6 that dist_q((α, β), (0, β)) ≤ ε/3, and for suitable β_i we get

dist_q((α, β), (δ, β_i)) ≤ dist_q((α, β), (0, β)) + dist_q((0, β), (δ, β)) + dist_q((δ, β), (δ, β_i)) ≤ ε,

and the proposition follows.
We will then look at the subsets of P such that α ≥ α_0. Here the situation is quite the opposite, since these sets will be unbounded with respect to the quantum metric on P. To see this we fix a positive γ ≤ 1 and we will study the behavior of the metric along the hyperbola H_γ := {(α, β) ∈ ℝ²₊ | αβ = γ}. We see that the seminorms corresponding to the points on H_γ are all proportional, and for any positive real s we see from Definition 2.6 that L_{(γ/s),s} = s L_{γ,1}, so the space is well understood along each of these curves. In particular, for the diameters we have diam_{(γ/s),s} = (diam_{γ,1})/s, so for s ≤ 1 and s decreasing to 0, we immediately get the following estimate.
Proposition 5.2. For positive reals γ, s such that 0 < γ ≤ 1 and 0 < s ≤ 1:

dist_q((A_T, L_{γ,1}), (A_T, L_{(γ/s),s})) ≥ (1/2)(s^{-1} − 1) diam_{γ,1}.

Proof. Let ε > 0 and let δ := dist_q((A_T, L_{γ,1}), (A_T, L_{(γ/s),s})). For a pair of states, say φ, ψ on A_T, such that dist_{(γ/s),s}(φ, ψ) ≥ (diam_{γ,1})/s − ε/3, we can find approximating states with respect to (A_T, L_{γ,1}), say µ and ν on A_T, such that

(diam_{γ,1})/s − ε/3 ≤ dist_{(γ/s),s}(φ, ψ) ≤ 2δ + 2ε/3 + dist_{γ,1}(µ, ν) ≤ 2δ + 2ε/3 + diam_{γ,1},

and the proposition follows.
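Since L_{(γ/s),s} = s L_{γ,1}, Corollary 3.7 with t = s gives an upper bound of the same shape as the lower bound of Proposition 5.2. The sketch below tabulates both for a hypothetical value of diam_{γ,1}; the function names are ours and the code only evaluates the two closed expressions, which differ by a factor of 2.

```python
def lower_bound(s, diam):
    """Proposition 5.2: (1/2)(1/s - 1) diam, for 0 < s <= 1."""
    if not 0 < s <= 1:
        raise ValueError("need 0 < s <= 1")
    return 0.5 * (1 / s - 1) * diam


def upper_bound(s, diam):
    """Corollary 3.7 with t = s: |1 - 1/s| diam."""
    if s <= 0:
        raise ValueError("s must be positive")
    return abs(1 - 1 / s) * diam


diam = 1.0  # hypothetical value of diam_{gamma,1}
for s in (1.0, 0.5, 0.25, 0.125):
    print(s, lower_bound(s, diam), upper_bound(s, diam))
```

Both columns grow without bound as s decreases to 0, which is the unboundedness along the hyperbola H_γ discussed above.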
For the vertical intervals {(α, β) | 0 < β ≤ 1/α} we get that they are all unbounded with respect to this new metric. This follows easily from Proposition 4.4, and we will state it formally in the following proposition.

Proposition 5.3. For a fixed α_0 > 0 there exists a positive γ such that

∀β ∈ ]0, 1/α_0] : diam_{α_0,β} ≥ γ/β.

6. Applications to the Compacts and an Investigation of Connes' 7 Axioms for this Spectral Triple

Right after Definition 1.1 of the Toeplitz extension of a C*-algebra A, we remarked that it is debatable if the generalized Toeplitz algebra should be defined as the C*-algebra generated by PA|PH alone or, as we have chosen, the one generated by this set plus the compacts. The difference is a trivial extension, but for the algebra C̃(H) it is a rather crucial difference, when this algebra is considered to be a trivial extension of the one dimensional C*-algebra ℂI. Our first example here shows that our construction offers a variety of spectral triples for the unitarized compacts. On the other hand, for the Podleś sphere our construction gives an algebra which has more compacts than the universal C*-algebra for the Podleś sphere has. If we had just used the C*-algebra generated by PA|PH, we would have obtained the right algebra here.

Example 6.1. Let H be a separable infinite dimensional Hilbert space and let A := ℂI be the unital C*-algebra generated by the unit I on H. Let D be an unbounded self-adjoint invertible operator on H with compact inverse and let the projection P := I. We now have a spectral triple (A, H, D) and a quadruple ((A, H, D), P) of Toeplitz type. Our construction will then give a C*-algebra T := C̃(H), a Hilbert space K := H ⊕ H, and a representation π of T on K by

\[ \forall k + \gamma I \in C(H) + \mathbb{C}I : \quad \pi(k + \gamma I) := \begin{pmatrix} k + \gamma I & 0 \\ 0 & \gamma I \end{pmatrix}. \]

The Dirac operator then becomes

\[ D_{\alpha,\beta} := \begin{pmatrix} 0 & \beta D \\ \beta D & \frac{1}{\alpha} D \end{pmatrix}. \]
You may notice that the part (1/α)D has no effect for this spectral triple, and this leads to the following proposition, which will yield many more spectral triples associated to C̃(H).

Proposition 6.2. Based on the notation in the example above, let T be an unbounded densely defined and closed operator on H. If |T| is invertible with compact resolvent, then for

\[ D := \begin{pmatrix} 0 & T^* \\ T & 0 \end{pmatrix}, \]

the set (|T|^{-1} C(H) |T|^{-1} + ℂI, K, D) is a spectral triple associated to C̃(H).

In the article [8] Connes lists 7 axioms for Non-Commutative Geometry. In the present case of a spectral triple associated to the unitarized compacts C̃(H) the dimension must be 0, so the dimension is even, and for the Dirac operator D the growth of the eigenvalues must be such that for any positive real s the operator (I + D²)^{-s/2} is of trace class. There should also be a grading γ and a conjugate linear operator J which relate in certain ways. We can provide candidates for these ingredients, which seem natural to us, but they will not fulfill all of Connes' axioms. We will therefore present the candidates for D, γ, J, check each of the axioms and show what sort of problems we are facing. We keep the notation from above in this section and define

Definition 6.3.
(D) Let T be a self-adjoint unbounded operator with trivial kernel, such that for any positive real s the operator |T|^{-s} is of trace class; then the Dirac operator D is defined on H ⊕ H by

\[ D := \begin{pmatrix} 0 & T \\ T & 0 \end{pmatrix}. \]

(γ) The obvious choice for this unitary seems to be the unitary on H ⊕ H given by

\[ \gamma := \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}. \]

(J) It is not so obvious what to choose here, since our setup is not the same as the one Connes clearly has in mind. In [8] Connes obtains the J operation from a standard representation of a self-adjoint algebra of bounded operators. This is not what we have here for C̃(H), but we have anyway a candidate for J which seems reasonable. First we define j : H → H by choosing an orthonormal basis (ξ_n) for H consisting of eigenvectors for T. Then we define j on λξ_n as λ̄ξ_n and extend this to a conjugate linear isometry of H onto H. The choice for J is then given by

\[ J := \begin{pmatrix} 0 & j \\ j & 0 \end{pmatrix}. \]

Remark that

\[ J\pi(a^* + \bar{\lambda} I)J = \begin{pmatrix} \lambda I & 0 \\ 0 & j a^* j + \lambda I \end{pmatrix}, \]

so for any operators a + λI and b + µI the operators π(a + λI) and Jπ(b* + µ̄I)J do commute.
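The algebraic relations between the candidates D, γ and J of Definition 6.3 can be tested on a finite-dimensional truncation. The sketch below is such a test under our own assumptions: H is replaced by ℂ⁴, T is a diagonal matrix with hypothetical nonzero real eigenvalues, j is complex conjugation of the coordinates in the eigenbasis, and a vector in H ⊕ H is stored as a pair of lists. The trace class condition has no content in finite dimensions, so only the commutation relations are checked.

```python
import random

n = 4
T = [1.0, -2.0, 3.5, -0.5]   # hypothetical nonzero real eigenvalues of T


def D(v):
    """D = [[0, T], [T, 0]] acting on v = (x, y)."""
    x, y = v
    return ([t * c for t, c in zip(T, y)], [t * c for t, c in zip(T, x)])


def gamma(v):
    """gamma = [[I, 0], [0, -I]]."""
    x, y = v
    return (list(x), [-c for c in y])


def J(v):
    """J = [[0, j], [j, 0]] with j the conjugation in the eigenbasis of T."""
    x, y = v
    return ([c.conjugate() for c in y], [c.conjugate() for c in x])


def neg(v):
    return tuple([-c for c in part] for part in v)


def close(u, v, tol=1e-12):
    return all(abs(p - q) <= tol
               for a, b in zip(u, v) for p, q in zip(a, b))


random.seed(1)


def rand_vec():
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]


v = (rand_vec(), rand_vec())

checks = {
    "J^2 = I": close(J(J(v)), v),
    "J D = D J": close(J(D(v)), D(J(v))),
    "J gamma = -gamma J": close(J(gamma(v)), neg(gamma(J(v)))),
    "D gamma = -gamma D": close(D(gamma(v)), neg(gamma(D(v)))),
}
print(checks)
```

All four relations hold on the truncation; in particular J anticommutes with γ for this choice of J.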
We will then look at the 7 axioms taken from [8] one by one, but first we will define the algebra A of smooth elements as the operators a ∈ C̃(H) such that, for δ(x) := [|D|, x] and any natural number m, both π(a) and [D, π(a)] are in the domain of δ^m. Inside this algebra A we have a norm dense subalgebra of operators of finite rank which we denote A_0. This algebra is defined via the orthonormal basis (ξ_n), from above, consisting of eigenvectors for T. The algebra A_0 is then the linear span of the matrix units a_{ij} := ⟨·, ξ_j⟩ξ_i.

(1) The operator D^{-1} is an infinitesimal of infinite order. This is fulfilled by the demand that |T|^{-s} is of trace class for any positive real s.
(2) For any pair of elements a, b ∈ A : [[D, π(a)], Jπ(b*)J] = 0. This demand cannot be met, but we have, as in Dabrowski's paper [12], ∀a, b ∈ A : [[D, π(a)], Jπ(b*)J] ∈ C(H ⊕ H).
(3) Smoothness. The smoothness axiom is fulfilled by the definition of the algebra A.
(4, 5, 6) We cannot show that the spectral triple we investigate fulfills any of these 3 axioms. We only have the grading γ and it seems to be uniquely determined by its basic properties.
(7) We have an operator J such that [π(a), Jπ(b*)J] = 0, but γ and J do not fit with the reality properties of the table in [8] p. 162. We get J² = I, JD = DJ, Jγ = −γJ. For n = 0 the first two identities are as in the table, but the last one should have been Jγ = γJ.

We will now turn to the Podleś sphere, C(S²_{qc}), [25] and base the presentation here on Sect. 4 of that paper and on Chakraborty's description, [6], of a concrete faithful representation for this algebra. Chakraborty's purpose was to some extent the same as ours, since he wanted to create a Lip-norm for the Podleś sphere, based on the fact that this C*-algebra is an extension of the classical Toeplitz algebra by the compacts.
In the latter presentation the parameter µ from Podleś' paper is replaced by the letter q, which now seems to be standard, so we will use this notation. For $c, q$ reals such that $c > 0$ and $0 < |q| < 1$ the Podleś sphere, $C(S^2_{qc})$, is the universal C*-algebra generated by 2 operators $A$ and $B$ which satisfy the following relations:
$$A = A^*, \quad BA = q^2AB, \quad B^*B = A - A^2 + cI, \quad BB^* = q^2A - q^4I + cI.$$
Let $\mathcal{T}(\mathbb{T})$ denote the classical Toeplitz algebra for the unit circle and let $\rho : \mathcal{T}(\mathbb{T}) \to C(\mathbb{T})$ denote the canonical surjective homomorphism. All the algebras $C(S^2_{qc})$ turn out to be isomorphic [29] and can be described by
$$C(S^2_{qc}) = \{(x, y) \in \mathcal{T}(\mathbb{T}) \oplus \mathcal{T}(\mathbb{T}) \mid \rho(x) = \rho(y)\}.$$
So we can see that $C(S^2_{qc})$ is an extension of the Toeplitz algebra by the compacts. Let us consider the standard spectral triple associated to the C*-algebra A of continuous functions on the unit circle. For the algebra A we can take the functions whose Fourier coefficients form rapidly decreasing sequences and the Dirac operator is $\frac{1}{i}\frac{d}{d\theta}$. As the Hilbert space H we take $L^2(\mathbb{T})$ and the projection P is the projection onto $H_+$, the closed linear span of the eigenfunctions $e^{in\theta}$, $n > 0$, corresponding to the positive eigenvalues. This is not the usual definition, where the constant function I usually is assumed to be in $H_+$. This will not change the construction qualitatively but it will have the nice consequence that the restriction of D to $H_+$ is invertible with a compact inverse. Finally
Extensions and Degenerations of Spectral Triples
we let $H_-$ denote the orthogonal complement of $H_+$ in H. The quadruple ((A, H, D), P) is then of Toeplitz type and Definition 1.8 gives a spectral triple $(A_t, K, D_{\alpha,\beta})$ for the ordinary Toeplitz algebra. Recall that the Hilbert space K is given as $H_+ \oplus H_+ \oplus H_-$, so we can define a projection Q of K onto the first two summands and it follows from Definition 1.8,(v), that $D_{\alpha,\beta}$ commutes with Q. By checking the same definition's point (iv) it can be seen that for any t in $\mathcal{T}(\mathbb{T})$ the commutator $[\pi(t), Q]$ is compact since it is nothing but the embedding of the operator $[\rho(t), P]$ into B(K). In order to show that we now have a quadruple of Toeplitz type in the set $((A_t, K, D_{\alpha,\beta}), Q)$, we then, according to Definition 1.2 only have to prove that $D_{(\alpha,\beta)Q} := D_{\alpha,\beta}|QK$ has trivial kernel, but this follows easily from the description of $D_{\alpha,\beta}$ given in Definition 1.8 point (v) and the fact that $D_P$ is assumed to be injective. As mentioned above we will not consider the extended algebra as defined in Definition 1.1. Instead we will define TT as the C*-algebra on $H_+ \oplus H_+$ generated by $Q\pi(t)|QK$. It is not difficult to see that this C*-algebra is exactly the one which above is described as the algebra $C(S^2_{qc})$. Our construction can give a family of spectral triples associated to $C(S^2_{qc}) + C(H_+ \oplus H_+)$, and let $(A_{tt}, K_{tt}, D_{tt})$ denote such a set. Then we are left with the problem to realize how the algebra $A_{tt}$ relates to the sum $C(S^2_{qc}) + C(H_+ \oplus H_+)$. To deal with this question we first remark that by Definition 1.8 point (ii) any element in $A_{tt}$ is a sum of an element related to a differentiable symbol and a differentiable compact. Consequently we only have to see how a differentiable compact, say C in $C(H_+ \oplus H_+)$, behaves with respect to the splitting as a sum of a diagonal operator and an off diagonal operator,
$$C = \begin{pmatrix} u & v \\ x & y \end{pmatrix} = \begin{pmatrix} u & 0 \\ 0 & y \end{pmatrix} + \begin{pmatrix} 0 & v \\ x & 0 \end{pmatrix}.$$
By Definition 1.8 point (i) the matrix above is a differentiable compact if and only if both of the products $D_{(\alpha,\beta)Q}C$ and $CD_{(\alpha,\beta)Q}$ are bounded and densely defined. In matrix forms these products are as seen below:
$$D_{(\alpha,\beta)Q}C = \begin{pmatrix} \beta D_P x & \beta D_P y \\ \beta D_P u + \frac{1}{\alpha} D_P x & \beta D_P v + \frac{1}{\alpha} D_P y \end{pmatrix}, \qquad CD_{(\alpha,\beta)Q} = \begin{pmatrix} \beta v D_P & \beta u D_P + \frac{1}{\alpha} v D_P \\ \beta y D_P & \beta x D_P + \frac{1}{\alpha} y D_P \end{pmatrix}.$$
From here it follows that the products $D_{(\alpha,\beta)Q}C$ and $CD_{(\alpha,\beta)Q}$ are bounded and densely defined if and only if all of the operators u, v, x, y belong to the algebra $A_c$, as defined in Definition 1.8 point (i). This has the immediate consequence that if we replace $A_{tt}$ by $A_{ttd}$ which we define by
$$A_{ttd} := \left\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \;\middle|\; \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \in A_{tt} \right\},$$
then $(A_{ttd}, K_{tt}, D_{tt})$ is a spectral triple associated to the universal C*-algebra for the Podleś sphere.
Remark 6.4. We have been asked by the referee, if there are some connections between our example of a spectral triple for the Podleś sphere and the ones obtained in [10] and [11]. We have, but in vain, tried to answer this question, and it is our impression that there is no simple connection.
7. Odd and Even Extensions in Analytic K-Homology
Our constructions in this paper produce odd spectral triples and this seems not to be the right setup for algebras containing the compact operators. It is quite easy to produce an even spectral triple from an odd one by doubling the representation and introduce the Dirac operator $\hat{D}$ on the Hilbert space $K \oplus K$ which is given by
$$\hat{D} = \begin{pmatrix} 0 & D \\ D & 0 \end{pmatrix}.$$
The grading γ is then given on $K \oplus K$ by $\gamma(\xi, \eta) := (\xi, -\eta)$. We could have performed all our computations in this setting, but it would not give any new insights with respect to the metric properties we have been investigating in this article, so we have not pursued a presentation this way. Extensions of unital C*-algebras by the compacts as we do it here is described in Higson and Roe's book [16] Chap. 5. So according to that description an extension of the sort we are looking at corresponds to an element in the reduced analytic K-homology of the unital C*-algebra A. But our construction is only designed for projections P in the commutant of D.
References
1. Alfsen, E.M.: Compact convex sets and boundary integrals. Berlin-Heidelberg, New York: Springer-Verlag, 1971
2. Alfsen, E.M., Shultz, F.W.: State spaces of operator algebras. Basic theory, orientations and C*-products. Basel-Boston: Birkhäuser, 2001
3. Baaj, S., Julg, P.: Théorie bivariante de Kasparov et opérateurs non bornés dans les C*-modules hilbertiens. C.R. Acad. Sci. Paris, Serie I 296, 875–878 (1983)
4. Belgradek, I.: Degenerations of Riemannian manifolds. http://arXiv.org/list/mathDG/0701723, 2007
5. Burago, D., Burago, Y., Ivanov, S.: A course in metric geometry, Graduate Studies in Mathematics, 33. Providence, RI: Amer. Math. Soc. 2001
6. Chakraborty, P.S.: From C*-algebra extensions to CQMS, SUq(2), Podleś sphere and other examples. http://arXiv.org/list/math/02100155v1, 2002
7. Connes, A.: Non Commutative Geometry. San Diego: Academic Press, 1994
8. Connes, A.: Gravity coupled with matter and the foundation of non commutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
9. Connes, A., Moscovici, H.: Transgression and the Chern character of finite-dimensional K-cycles. Commun. Math. Phys. 155, 103–122 (1993)
10. Dąbrowski, L., D'Andrea, F., Landi, G., Wagner, E.: Dirac operators on all Podleś quantum spheres. J. Noncommut. Geom. 1, 213–239 (2007)
11. Dąbrowski, L., Sitarz, A.: Dirac operator on the standard Podleś quantum sphere. In: Noncommutative geometry and quantum groups (Warsaw, 2001), Banach Center Publ., 61, Warsaw: Polish Acad. Sci., 2003, pp. 49–58
12. Davidson, K.R.: C*-algebras by example. Fields Institute Monographs, 6, Providence, RI: Amer. Math. Soc., 1996
13. Dixmier, J.: Les C*-algèbres et leurs représentations. Paris: Gauthier-Villars, 1964
14. Effros, E.G.: Order ideals in a C*-algebra and its dual. Duke Math. J. 30, 391–412 (1963)
15. Fukaya, K.: Metric Riemannian Geometry. Handbook of differential geometry, Vol. II, Amsterdam: Elsevier/North-Holland, 2006, pp. 189–313
16. Higson, N., Roe, J.: Analytic K-Homology. Oxford: Oxford University Press, 2000
17. Kerr, D.: Matricial quantum Gromov-Hausdorff distance. J. Funct. Anal. 205, 132–167 (2003)
18. Kerr, D., Li, H.: On Gromov-Hausdorff convergence for operator metric spaces. arXiv:mathOA/0411157 v4, 2007, J. Op. Theory, to appear
19. Klimek, S., Lesniewski, A.: A two-parameter quantum deformation of the unit disc. J. Funct. Anal. 115, 1–23 (1993)
20. Klimek, S., Lesniewski, A.: Quantum Riemann surfaces. I. The unit disc. Commun. Math. Phys. 146, 103–122 (1992)
21. Latremoliere, F.: Approximation of quantum tori by finite quantum tori for the quantum Gromov-Hausdorff distance. J. Funct. Anal. 223, 336–395 (2005)
22. Li, H.: Order-unit quantum Gromov-Hausdorff distance. J. Funct. Anal. 231, 312–360 (2006)
23. Nagy, G.: On the Haar measure of the quantum SU(N) group. Commun. Math. Phys. 153, 217–228 (1993)
24. Pedersen, G.K.: Analysis now, Graduate Texts in Mathematics, 118, New York: Springer-Verlag, 1989
25. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
26. Rieffel, M.A.: Leibniz seminorms for "matrix algebras converge to the sphere". http://arXiv.org/abs/mathOA/0707.3229v1[math.OA], 2007
27. Rieffel, M.A.: Gromov–Hausdorff distance for quantum metric spaces. Mem. Amer. Math. Soc. 168(796), 1–65 (2004)
28. Rieffel, M.A.: Metrics on state spaces. Doc. Math. 4, 559–600 (1999)
29. Sheu, A.J.L.: Quantization of the Poisson SU(2) and its Poisson homogeneous space - the 2-sphere. With an appendix by Jiang-Hua Lu and Alan Weinstein. Commun. Math. Phys. 135, 217–232 (1991)
30. Wu, W.: Quantized Gromov-Hausdorff distance. J. Funct. Anal. 238, 58–98 (2006)
Communicated by A. Connes
Commun. Math. Phys. 285, 957–974 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0648-5
Communications in
Mathematical Physics
On Universality for Orthogonal Ensembles of Random Matrices M. Shcherbina Mathematical Division of the Institute for Low Temperature Physics, Kharkov, Ukraine. E-mail:
[email protected];
[email protected] Received: 7 November 2007 / Accepted: 29 June 2008 Published online: 23 October 2008 – © Springer-Verlag 2008
Abstract: We prove universality of local eigenvalue statistics in the bulk of the spectrum for orthogonal invariant matrix models with real analytic potentials with one interval limiting spectrum. Our starting point is the Tracy-Widom formula for the matrix reproducing kernel. The key idea of the proof is to represent the differentiation operator matrix written in the basis of orthogonal polynomials as a product of a positive Toeplitz matrix and a two diagonal skew symmetric Toeplitz matrix.
1. Introduction and Main Results
In this paper we consider ensembles of n × n real symmetric (or Hermitian) matrices M with the probability distribution
$$P_n(M)\,dM = Z_{n,\beta}^{-1}\exp\left\{-\frac{n\beta}{2}\,\mathrm{Tr}\,V(M)\right\}dM, \qquad (1)$$
where $Z_{n,\beta}$ is a normalization constant, $V : \mathbb{R} \to \mathbb{R}_+$ is a Hölder function satisfying the condition
$$|V(\lambda)| \ge 2(1+\epsilon)\log(1+|\lambda|). \qquad (2)$$
A positive parameter β here assumes the values β = 1 (in the case of real symmetric matrices) or β = 2 (in the Hermitian case), and dM means the Lebesgue measure on the algebraically independent entries of M. Ensembles of random matrices (1) in the real symmetric case are usually called orthogonal, and in the Hermitian case - unitary ensembles. This terminology reflects the fact that the density of (1) is invariant with respect to orthogonal or unitary transformations of matrices M. The joint eigenvalue distribution corresponding to (1) has the form (see [13])
$$p_{n,\beta}(\lambda_1, \dots, \lambda_n) = Q_{n,\beta}^{-1}\prod_{i=1}^{n} e^{-n\beta V(\lambda_i)/2}$$
1≤ j 0.
(14)
C2. The support σ of IDS of the ensemble consists of a single interval: σ = [−2, 2].
C3. DOS ρ(λ) is strictly positive in the internal points λ ∈ (−2, 2) and $\rho(\lambda) \sim |\lambda \mp 2|^{1/2}$, as $\lambda \sim \pm 2$.
C4. The function
$$u(\lambda) = 2\int \log|\mu - \lambda|\,\rho(\mu)\,d\mu - V(\lambda) \qquad (15)$$
achieves its maximum if and only if λ ∈ σ.
Consider a semi-infinite Jacobi matrix $J^{(n)}$, generated by the recursion relations for orthogonal polynomials (8)
$$J_l^{(n)}\psi_{l+1}^{(n)}(\lambda) + q_l^{(n)}\psi_l^{(n)}(\lambda) + J_{l-1}^{(n)}\psi_{l-1}^{(n)}(\lambda) = \lambda\psi_l^{(n)}(\lambda), \quad J_{-1}^{(n)} = 0, \quad l = 0, \dots \qquad (16)$$
It is known (see [2]) that under conditions C1–C4, $q_l^{(n)} = 0$ and there exists some fixed γ such that uniformly in $k : |k| \le 2n^{1/2}$,
$$\left|J_{n+k}^{(n)} - 1 - \frac{k}{n}\gamma\right| \le C\,\frac{|k|^2 + n^{2/3}}{n^2}. \qquad (17)$$
Remark 1. The convergence $J_{n+k}^{(n)} \to 1$ $(n \to \infty)$ without uniform bounds for the remainder terms was shown in [1] under much weaker conditions (V(λ) is a Hölder function in some neighborhood of the limiting spectrum). Note also (see [2]) that under conditions C1–C4 the limiting density of states (DOS) ρ has the form $\rho(\lambda) = \frac{1}{2\pi}P(\lambda)\sqrt{4-\lambda^2}\,\mathbf{1}_{|\lambda|<2}$ […] $> 0$, $\lambda \in [-2, 2]$. (20)
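An important role below belongs to the following two operators:
$$\mathcal{P}_{j,k} = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(2\cos y)\,e^{i(j-k)y}\,dy = \frac{1}{2\pi i}\oint_{|\zeta|=1} P(\zeta + \zeta^{-1})\,\zeta^{j-k-1}\,d\zeta, \qquad (21)$$
and
$$\mathcal{R} = \mathcal{P}^{-1}, \quad \mathcal{R}_{j,k} = \mathcal{R}_{j-k} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{e^{i(j-k)x}\,dx}{P(2\cos x)} = \frac{1}{2\pi i}\oint_{|\zeta|=1}\frac{\zeta^{-1}\zeta^{j-k}\,d\zeta}{P(\zeta + \zeta^{-1})}. \qquad (22)$$
It is important that
$$\delta_1 \le \mathcal{R} \le \delta_2, \quad \delta_1 = \inf_{\sigma} P^{-1}(\lambda), \quad \delta_2 = \sup_{\sigma} P^{-1}(\lambda). \qquad (23)$$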
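As a numerical illustration of (21)–(23), the following minimal sketch computes the Toeplitz entries of the two operators by an equispaced quadrature on the circle and checks that the second is the convolution inverse of the first. The polynomial P(λ) = λ² + 1 is a hypothetical stand-in (the actual P is determined by the potential V), and all function names are illustrative only.

```python
import math

def P(lam):
    # Hypothetical symbol polynomial, chosen only because it is positive on [-2, 2].
    return lam * lam + 1.0

def toeplitz_entry(symbol, m, n_nodes=512):
    # (1/2pi) * integral over one period of e^{imx} * symbol(2 cos x) dx, rectangle rule.
    # symbol(2 cos x) is even in x, so only the cosine part of e^{imx} contributes.
    h = 2.0 * math.pi / n_nodes
    total = 0.0
    for t in range(n_nodes):
        x = t * h
        total += math.cos(m * x) * symbol(2.0 * math.cos(x))
    return total / n_nodes

def P_entry(m):
    # Entry of the Toeplitz operator in (21) with j - k = m.
    return toeplitz_entry(P, m)

def R_entry(m):
    # Entry of the Toeplitz operator in (22) with j - k = m.
    return toeplitz_entry(lambda lam: 1.0 / P(lam), m)

def product_entry(m, band=2):
    # Entry (0, m) of the product of the two bi-infinite Toeplitz operators,
    # written as a convolution; P_entry(l) vanishes for |l| > band since deg P = band.
    return sum(P_entry(l) * R_entry(m - l) for l in range(-band, band + 1))

# Bounds of (23) for the hypothetical P, evaluated on a grid of [-2, 2].
grid = [-2.0 + 4.0 * i / 1000 for i in range(1001)]
delta_1 = 1.0 / max(P(lam) for lam in grid)
delta_2 = 1.0 / min(P(lam) for lam in grid)

print([round(P_entry(m), 12) for m in range(4)])
print(round(product_entry(0), 12), round(product_entry(3), 12))
print(delta_1 <= R_entry(0) <= delta_2)
```

The diagonal entry of the inverse operator is a Rayleigh quotient with a basis vector, which is why it has to lie between the two bounds of (23).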
Remark also that if we denote by $J^*$ an infinite Jacobi matrix with constant coefficients
$$J^* = \{J^*_{j,k}\}_{j,k=-\infty}^{\infty}, \quad J^*_{j,k} = \delta_{j+1,k} + \delta_{j-1,k}, \qquad (24)$$
then the spectral theorem yields that $\mathcal{P} = P(J^*)$, $\mathcal{R} = P^{-1}(J^*)$.
The main result of the paper is
Theorem 1. Consider the orthogonal ensemble of random matrices defined by (1)–(3) with V satisfying Conditions C1-C4 and even n. Then for $\lambda_0$ in the bulk ($\rho(\lambda_0) \ne 0$) there exist weak limits of the scaled correlation functions (4) and these limits are given in terms of the universal matrix kernel
$$K^{(0)}_{\infty,1}(s_1, s_2) = \lim_{n\to\infty}\frac{1}{n\rho(\lambda_0)}K_{n,1}(\lambda_0 + s_1/n\rho(\lambda_0), \lambda_0 + s_2/n\rho(\lambda_0)),$$
where $K_{n,1}(\lambda, \mu)$ is defined by (10)–(11), and
$$K^{(0)}_{\infty,1}(s_1, s_2) = \begin{pmatrix} K^{(0)}_{\infty,2}(s_1 - s_2) & \dfrac{\partial}{\partial s_1}K^{(0)}_{\infty,2}(s_1 - s_2) \\ \displaystyle\int_0^{s_1 - s_2} K^{(0)}_{\infty,2}(t)\,dt - \epsilon(s_1 - s_2) & K^{(0)}_{\infty,2}(s_1 - s_2) \end{pmatrix}, \qquad (25)$$
with $K^{(0)}_{\infty,2}(s_1 - s_2)$ of the form (9).
The proof of the theorem is based on the following result
Theorem 2. Under conditions of Theorem 1 for even n the matrix $(\mathcal{M}^{(0,n)})^{-1}$ defined in (12) is bounded uniformly in n, i.e. $\|(\mathcal{M}^{(0,n)})^{-1}\| \le C$, where C is independent of n and ||.|| is a standard norm for n × n matrices.
The main idea of the proof of Theorem 2 is to consider the matrix $\mathcal{V}^{(0,\infty)}_{j,k}$ of the operator $n^{-1}\frac{d}{dx}$ in the basis $\{\psi_k^{(n)}\}_{k=0}^{\infty}$ and to prove that (see Lemma 1)
$$\mathcal{V}^{(0,\infty)}_{j,k} = (\mathcal{P}\mathcal{D})_{j,k} + o(1),$$
for $|j - n|, |k - n|$ […] $> 0$,
$$\sup_{|\lambda| \le L}|\psi_k^{(n,L)}(\lambda) - \psi_k^{(n)}(\lambda)| \le e^{-nC}, \quad |\psi_k^{(n)}(\pm L)| \le e^{-nC}, \qquad (27)$$
with some absolute C. Therefore from the very beginning we can take all integrals in (4), (8), (13) and (12) over the interval [−L, L]. Besides, observe that since V is an analytic function in $\Omega[d_1, d_2]$ (see (14)), for any $m \in \mathbb{N}$ there exists a polynomial $V_m$ of the (2m)th degree such that
$$|V_m(z)| \le C_0, \quad |V(z) - V_m(z)| \le e^{-Cm}, \quad z \in \Omega[d_1/2, d_2/2]. \qquad (28)$$
Here and everywhere below we denote by $C, C_0, C_1, \dots$ positive n, m-independent constants (different in different formulas). Take
$$m = [\log^2 n] \qquad (29)$$
and consider the system of polynomials $\{p_k^{(n,L,m)}\}_{k=0}^{\infty}$ orthogonal in the interval [−L, L] with respect to the weight $e^{-nV_m(\lambda)}$. Set $\psi_k^{(n,L,m)} = p_k^{(n,L,m)}e^{-nV_m/2}$ and construct $\mathcal{M}_m^{(0,n)}$ by (12) with $\psi_k^{(n,L,m)}$. Then for any $k \le n + 2n^{1/2}$ and uniformly in λ ∈ [−L, L],
$$|\psi_k^{(n,L)}(\lambda) - \psi_k^{(n,L,m)}(\lambda)| \le e^{-C\log^2 n}, \quad |\epsilon\psi_k^{(n,L)}(\lambda) - \epsilon\psi_k^{(n,L,m)}(\lambda)| \le e^{-C\log^2 n}, \quad \|\mathcal{M}^{(0,n)} - \mathcal{M}_m^{(0,n)}\| \le e^{-C\log^2 n}. \qquad (30)$$
The proof of the first bound here is identical to the proof of (27) (see [15]). The second bound follows from the first one because the operator $\epsilon : L_2[-L, L] \to C[-L, L]$ is bounded by L. The last bound in (30) follows from the first one and the inequality valid for the norm of an arbitrary matrix A,
$$\|A\|^2 \le \max_i\sum_j|A_{i,j}| \cdot \max_j\sum_i|A_{i,j}|. \qquad (31)$$
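Inequality (31) can be checked on a small example. The sketch below is only an illustration: the 2 × 2 matrix is arbitrary, the helper names are hypothetical, and the spectral norm is estimated by power iteration, which returns a Rayleigh quotient and therefore never overshoots the true value beyond rounding.

```python
import math

def max_abs_row_sum(a):
    # max over i of sum over j of |A_ij|
    return max(sum(abs(v) for v in row) for row in a)

def max_abs_col_sum(a):
    # max over j of sum over i of |A_ij|
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def spectral_norm_squared(a, iterations=200):
    # Power iteration on A^T A; the returned Rayleigh quotient converges to ||A||^2
    # for a starting vector that is not orthogonal to the top eigenvector.
    rows, cols = len(a), len(a[0])
    v = [1.0] * cols
    for _ in range(iterations):
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        v = [x / norm_w for x in w]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return sum(x * x for x in av)

A = [[1.0, 2.0], [3.0, 4.0]]  # arbitrary test matrix
lhs = spectral_norm_squared(A)
rhs = max_abs_row_sum(A) * max_abs_col_sum(A)
print(lhs, rhs, lhs <= rhs)
```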
Remark also that if for arbitrary matrices A, B $\|A^{-1}\| \le C$ and $\|A - B\| \le qC^{-1}$ with some 0 < q < 1, then we can write $B = A(I - A^{-1}(A - B))$. Since $\|A^{-1}(A - B)\| \le q < 1$, $\|(I - A^{-1}(A - B))^{-1}\| \le (1 - q)^{-1}$ (see any textbook on linear algebra). Thus B is invertible and $\|B^{-1}\| \le C(1 - q)^{-1}$. Moreover, $\|A^{-1} - B^{-1}\| \le q(1 - q)^{-1}C^2$. Due to this simple observation and (30), we obtain that if $\|(\mathcal{M}_m^{(0,n)})^{-1}\| \le C_1$, then
$$\|(\mathcal{M}^{(0,n)})^{-1}\| \le C_1(1 - C_1e^{-C\log^2 n})^{-1} \le 2C_1$$
and
$$\|(\mathcal{M}^{(0,n)})^{-1} - (\mathcal{M}_m^{(0,n)})^{-1}\| \le C_2e^{-C\log^2 n}.$$
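The perturbation argument above can be traced on a concrete pair of matrices. This is a minimal sketch under stated assumptions: the two 2 × 2 matrices are arbitrary, and the maximum absolute row sum is used as the norm, which is legitimate here only because the argument needs nothing beyond submultiplicativity. The last check uses the product bound that follows from the identity $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}$.

```python
def inf_norm(m):
    # Maximum absolute row sum; a submultiplicative matrix norm.
    return max(sum(abs(v) for v in row) for row in m)

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def subtract(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [0.0, 2.0]]   # arbitrary invertible matrix
B = [[2.1, 0.9], [0.2, 2.2]]   # arbitrary small perturbation of A

C = inf_norm(inverse_2x2(A))             # plays the role of the bound on ||A^{-1}||
q = inf_norm(subtract(A, B)) * C         # so that ||A - B|| = q / C
bound_on_inverse = C / (1.0 - q)         # claimed bound on ||B^{-1}||
norm_b_inverse = inf_norm(inverse_2x2(B))
difference = inf_norm(subtract(inverse_2x2(A), inverse_2x2(B)))
product_bound = C * inf_norm(subtract(A, B)) * bound_on_inverse

print(q, norm_b_inverse, bound_on_inverse, difference, product_bound)
```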
Using this bound combined with the first and the second bound of (30) we can compare each term of the kernel $S_{n,m}(\lambda, \mu)$ constructed by formula (11) with new orthogonal polynomials $\{p_k^{(n,L,m)}\}_{k=0}^{\infty}$ with the corresponding term of $S_n(\lambda, \mu)$. Then, since by the result of [14]
$$|\psi_k^{(n)}(\lambda)|^2 \le K_{n,2}(\lambda, \lambda) \le nC, \quad \lambda \in [-L, L],$$
and by the Schwarz inequality
$$|\epsilon\psi_k^{(n)}(\lambda)| \le (2L)^{1/2}\|\psi_k^{(n)}\|_2 \le (2L)^{1/2}, \quad \lambda \in [-L, L], \qquad (32)$$
where $\|.\|_2$ is a standard norm in $L_2[-L, L]$, we obtain that uniformly in λ, µ ∈ [−L, L],
$$|S_{n,m}(\lambda, \mu) - S_n(\lambda, \mu)| \le Cn^4e^{-C\log^2 n} \le e^{-C\log^2 n}. \qquad (33)$$
Therefore below we will study $\mathcal{M}_m^{(0,n)}$ and $S_{n,m}(\lambda, \mu)$ instead of $\mathcal{M}^{(0,n)}$ and $S_n(\lambda, \mu)$. To simplify notations we omit the indexes m, L, but keep the dependence on m in the estimates.
Let us set our main notations. We denote by $H = l_2(-\infty, \infty)$ the Hilbert space of all infinite sequences $\{x_i\}_{i=-\infty}^{\infty}$ with the standard scalar product (., .) and a norm ||.||. Let also $\{e_i\}_{i=-\infty}^{\infty}$ be a standard basis in H, and $I^{(n_1,n_2)}$ with $-\infty \le n_1 < n_2 \le \infty$ be an orthogonal projection operator defined as
$$I^{(n_1,n_2)}e_i = \begin{cases} e_i, & n_1 \le i < n_2, \\ 0, & \text{otherwise.} \end{cases} \qquad (34)$$
For any infinite or semi-infinite matrix $A = \{A_{i,j}\}$ we denote by
$$A^{(n_1,n_2)} = I^{(n_1,n_2)}AI^{(n_1,n_2)}, \quad (A^{(n_1,n_2)})^{-1} = I^{(n_1,n_2)}\left(I - I^{(n_1,n_2)} + A^{(n_1,n_2)}\right)^{-1}I^{(n_1,n_2)}, \qquad (35)$$
so that $(A^{(n_1,n_2)})^{-1}$ is a block operator which is inverse to $A^{(n_1,n_2)}$ in the space $I^{(n_1,n_2)}H$ and zero on the $(I - I^{(n_1,n_2)})H$. We denote also by $(., .)_2$ a standard scalar product in $L_2[-L, L]$. Set $\mathcal{V}^{(0,\infty)} = \{\mathcal{V}_{j,l}\}_{j,l=0}^{\infty}$, where
$$\mathcal{V}_{j,l} = \mathrm{sign}(l - j)(\psi_j^{(n)}, V'\psi_l^{(n)})_2 = \begin{cases} \frac{2}{n}(\psi_j^{(n)}, (\psi_l^{(n)})')_2, & j > l, \\ \frac{2}{n}(\psi_j^{(n)}, (\psi_l^{(n)})')_2 + O(e^{-C\log^2 n}), & j \le l. \end{cases} \qquad (36)$$
Here $O(e^{-C\log^2 n})$ appears because of the integration by parts and bounds (27), (30). Since $(\psi_k^{(n)})' = q_ke^{-nV/2}$, where $q_k$ is a polynomial of the (k + 2m − 1)th degree, its Fourier expansion in the basis $\{\psi_k^{(n)}\}_{k=0}^{\infty}$ contains not more than (k + 2m − 1) terms and for |j − k| > 2m − 1 the jth coefficient is $O(e^{-C\log^2 n})$. Therefore for $k \le n + 2n^{1/2}$,
$$n^{-1}(\psi_k^{(n)})' = \frac{1}{2}\sum_j\mathcal{V}_{j,k}\psi_j^{(n)} + O_2(e^{-C\log^2 n}). \qquad (37)$$
Here and below we write $\phi(\lambda) = O_2(\varepsilon_n)$, if $\|\phi\|_2 \le C\varepsilon_n$. The above relation implies
$$\frac{1}{2}\epsilon\Big(\sum_j\mathcal{V}^{(0,\infty)}_{j,k}\psi_j^{(n)}\Big) = n^{-1}\psi_k^{(n)} + O_2(e^{-C\log^2 n}). \qquad (38)$$
Hence, by (12), for $0 \le j, k \le n + 2n^{1/2}$,
$$\frac{1}{2}\left(\mathcal{M}^{(0,\infty)}\mathcal{V}^{(0,\infty)}\right)_{j,k} = \delta_{j,k} + O(e^{-C\log^2 n}). \qquad (39)$$
Thus,
$$\frac{1}{2}\mathcal{M}^{(0,n)}\mathcal{V}^{(0,n)} = I^{(0,n)} - \mu^{(0,n)}\nu^{(0,n)} + E^{(0,n)}, \quad \|E^{(0,n)}\| = O(e^{-C\log^2 n}), \qquad (40)$$
where $\nu^{(0,n)}$ is a matrix with entries equal to zero except the block (2m − 1) × (2m − 1) in the right bottom corner and $\mu^{(0,n)}$ in (40) has (n − 2m + 1) first columns equal to zero and the last (2m − 1) ones of the form
$$\mu^{(0,n)}_{l,n-2m-1+k} = \mathcal{M}_{l,n-1+k}, \quad k = 1, \dots, 2m - 1, \quad l = 0, \dots, n - 1.$$
The relation (40) was obtained in [8]. If we transpose the matrices in (40) we get
$$\frac{1}{2}\left(\mathcal{V}^{(0,n)}\mathcal{M}^{(0,n)}\right)_{j,k} = \delta_{j,k} - \sum_{l=1}^{2m-1}f_k^{(l)}\delta_{n-l,j} + E^{(0,n)T}_{j,k}, \qquad (41)$$
where $f^{(1)}, \dots, f^{(2m-1)} \in H^{(0,n)}$ are some vectors, whose form is not important for us. The idea of the proof is to show that for $|j - n| \le N := 4[\log^2 n]$,
$$\mathcal{M}_{k,j-1} - \mathcal{M}_{k,j+1} = \mathcal{M}_{j+1,k} - \mathcal{M}_{j-1,k} = 2\mathcal{R}_{k-j} + \varepsilon_{j,k}, \quad \sum_{k=0}^{\infty}|\varepsilon_{j,k}|^2 \le Cm^2N^2n^{-1}, \qquad (42)$$
where $\mathcal{R}_k$ is defined by (22).
Remark 2. To prove Theorems 1, 2 we need not to know $\mathcal{M}_{j,k}$, but (42) allows to find the limiting expressions for $\mathcal{M}_{j,k}$ up to some additive constant. Indeed, if we know, e.g., $\mathcal{M}_{n-1,n}$, we can find $\mathcal{M}_{n+2j,n+1+2k}$, going step by step from the point (n − 1, n) to (n + 2j + 1, n + 2k). Then, using the symmetry $\mathcal{M}_{j,k} = -\mathcal{M}_{k,j}$ we obtain $\mathcal{M}_{n+2j,n+2k+1}$. Hence, since $\mathcal{M}_{j,k} = 0$ for even j − k because of the evenness, we find in such a way all $\mathcal{M}_{j,k}$ with $|j - n|, |k - n| \le [2\log^2 n]$. Thus, if we denote $C(n) = \mathcal{M}_{n-1,n} - \mathcal{M}_2$ (see (44) for the definition of $\mathcal{M}_2$), then for odd j − k we have
$$\mathcal{M}_{j,k} = \mathcal{M}^*_{j,k} + \varepsilon_{j,k}, \quad |\varepsilon_{2j-1,2k}| \le C_*Nmn^{-1/2}(1 + |j - n| + |k - n|),$$
$$\mathcal{M}^*_{j,k} = \mathcal{M}_{k-j+1} - \frac{1}{2}(1 + (-1)^j)\mathcal{M}_{-\infty} - (-1)^jC(n), \qquad (43)$$
where for odd k, $\mathcal{M}_k = 0$ and for even k,
$$\mathcal{M}_k = (1 + (-1)^k)\sum_{j=k}^{\infty}\mathcal{R}_j = P^{-1}(2) - \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\sin(k - 1)x\,dx}{P(2\cos x)\sin x}, \quad \mathcal{M}_{-\infty} = 2\sum_{j=-\infty}^{\infty}\mathcal{R}_j = 2P^{-1}(2), \qquad (44)$$
and P is defined in (19). It is possible to show also that C(n) → 0, as n → ∞ (see [20]). For the case $V(\lambda) = \lambda^{2p} + o(1)$ expressions for $\mathcal{M}_{j,k}$ were found in [8].
Let us assume that we know (42) and obtain the assertion of Theorem 2. Define
$$\mathcal{Q}^{(0,n)}_{j,k} = \begin{cases} \frac{1}{2}\mathcal{V}^{(0,n)}_{j,k}, & \text{for } 0 \le j \le n - 2m,\ 0 \le k < n, \\ \left((\mathcal{R}^{(-\infty,n)})^{-1}\mathcal{D}^{(-\infty,n)}\right)_{j,k}, & \text{for } n - 2m < j < n,\ 0 \le k < n. \end{cases} \qquad (45)$$
Remark that since $(\mathcal{R})^{-1}_{j,k} = \mathcal{P}_{j,k} = 0$ for |j − k| > 2m − 2, the standard linear algebra yields that $(\mathcal{R}^{(0,n)})^{-1}$ possesses the same property, i.e.,
$$(\mathcal{R}^{(-\infty,n)})^{-1}_{j,k} = 0, \text{ for } |j - k| > 2m - 2 \ \Rightarrow\ \mathcal{Q}^{(0,n)}_{j,k} = 0, \text{ for } |j - k| > 2m - 1. \qquad (46)$$
It follows from (41) that for 0 ≤ j ≤ n − 2m, 0 ≤ k < n,
$$(\mathcal{Q}^{(0,n)}\mathcal{M}^{(0,n)})_{j,k} = \delta_{j,k} + O(e^{-C\log^2 n}). \qquad (47)$$
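For n − 2m < j ≤ n, 0 ≤ k < n, using (42) and (46), we get
$$(\mathcal{Q}^{(0,n)}\mathcal{M}^{(0,n)})_{j,k} = \sum_{l=n-4m}^{n-1}(\mathcal{R}^{(-\infty,n)})^{-1}_{j,l}(\mathcal{D}^{(-\infty,n)}\mathcal{M}^{(0,n)})_{l,k}$$
$$= \sum_{l=n-4m}^{n-1}(\mathcal{R}^{(-\infty,n)})^{-1}_{j,l}(\mathcal{R}^{(-\infty,n)})_{l,k} + \sum_{l=n-4m}^{n-1}(\mathcal{R}^{(-\infty,n)})^{-1}_{j,l}\varepsilon_{l,k} - (\mathcal{R}^{(-\infty,n)})^{-1}_{j,n-1}\mathcal{M}_{n,k}$$
$$= \delta_{j,k} - (\mathcal{R}^{(-\infty,n)})^{-1}_{j,n-1}\mathcal{M}_{n,k} + r_{j,k}, \qquad (48)$$
where $\varepsilon_{l,k}$ are defined in (42). According to (42),
$$\sum_{k=0}^{n-1}|r_{j,k}|^2 = \sum_{k=0}^{n-1}\Big|\sum_{l=n-4m}^{n-1}(\mathcal{R}^{(-\infty,n)})^{-1}_{j,l}\varepsilon_{l,k}\Big|^2 \le CN^2m^3n^{-1}.$$
Hence,
$$\sum_{j=n-2m}^{n}\sum_{k=0}^{n-1}|r_{j,k}|^2 \le CN^2m^4n^{-1}, \qquad (49)$$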
and we obtain from (47)–(49),
$$\mathcal{Q}^{(0,n)}\mathcal{M}^{(0,n)} = I^{(0,n)} - \Pi + E^{(0,n)}, \quad \|E^{(0,n)}\| \le CNm^2n^{-1/2}, \qquad (50)$$
where
$$\Pi x = \frac{1}{2}(x, \mu^1)a, \quad \text{with } \mu^1_k = \mathcal{M}_{n,k}, \quad a = (\mathcal{R}^{(-\infty,n)})^{-1}e_{n-1}. \qquad (51)$$
Note that by (46) $a \in I^{(n-2m,n)}H$.
Now let x be an eigenvector of $\mathcal{M}^{(0,n)}$, corresponding to the eigenvalue $i\varepsilon_0$ (||x|| = 1). Since by definition (45), $\|\mathcal{Q}^{(0,n)}\| \le \max\{\|\mathcal{V}^{(0,n)}\|, 2\|(\mathcal{R}^{(0,n)})^{-1}\|\} \le C_V$, we have from (50)
$$x = (x, \mu^1)a + y, \quad y = i\varepsilon_0\mathcal{Q}^{(0,n)}x - E^{(0,n)}x, \quad \|y\| \le 2C_V|\varepsilon_0|. \qquad (52)$$
But since $\mathcal{M}^{(0,n)}$ is a skew symmetric matrix of even dimensionality with real entries, if $i\varepsilon_0$ is its eigenvalue, $-i\varepsilon_0$ is its eigenvalue too (if $\varepsilon_0 = 0$, then this eigenvalue has multiplicity at least 2). Thus, there exists an eigenvector $x^{(1)}$ ($\|x^{(1)}\| = 1$) such that $\mathcal{M}^{(0,n)}x^{(1)} = -i\varepsilon_0x^{(1)}$ and (50) implies
$$x^{(1)} = (x^{(1)}, \mu^1)a + y^{(1)}, \quad \|y^{(1)}\| \le 2C_V|\varepsilon_0|. \qquad (53)$$
Now it is easy to see that for $|\varepsilon_0| \le C_V^*$ with some $C_V^*$, depending only on $C_V$, relations (52) and (53) contradict the condition $(x, x^{(1)}) = 0$ valid for any eigenvectors of $\mathcal{M}^{(0,n)}$, corresponding to different eigenvalues. Thus, we conclude that $\|(\mathcal{M}^{(0,n)})^{-1}\| \le |C_V^*|^{-1}$. Since Π from (50) satisfies the relation
$$\Pi^2 = \lambda\Pi, \quad \text{with } \lambda = (\mu^1, a), \qquad (54)$$
(50) and the bound $\|(\mathcal{M}^{(0,n)})^{-1}\| \le |C_V^*|^{-1}$ imply $|1 - \lambda| \ge C_V^*/2$. Thus,
$$(\mathcal{M}^{(0,n)})^{-1} = \mathcal{Q}^{(0,n)} + (1 - \lambda)^{-1}\Pi\mathcal{Q}^{(0,n)} + E_1^{(0,n)}, \quad \|E_1^{(0,n)}\| \le CNm^{3/2}n^{-1/2}. \qquad (55)$$
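To finish the proof of Theorem 2 we are left to prove (42). Define $\mathcal{V}^* = \{\mathcal{V}^*_{j,l}\}_{j,l=-\infty}^{\infty}$ with $\mathcal{V}^*_{j,l} = \mathrm{sign}(l - j)V'(J^*)_{j,l}$, where $J^*$ is defined in (24). Then by the spectral theorem
$$\mathcal{V}^*_{j,l} = \mathcal{V}^*_{j-l} = \frac{\mathrm{sign}(l - j)}{2\pi}\int_{-\pi}^{\pi}dx\,V'(2\cos x)\,e^{i(j-l)x}. \qquad (56)$$
The key point in the proof of (42) is the lemma:
Lemma 1. Under conditions of Theorem 1,
$$\mathcal{V}^* = \mathcal{P}\mathcal{D} = \mathcal{D}\mathcal{P}, \qquad (57)$$
where $\mathcal{P}$ and $\mathcal{D}$ are defined in (21) and (26) respectively. Moreover, taking $N = 4[\log^2 n]$, we have
$$\mathcal{V}_{j,k} = (\mathcal{D}\mathcal{P})_{j,k} + \tilde{\varepsilon}_{j,k}, \quad |k - n| \le 2N, \quad |j - n| \le 2N + 2m, \qquad (58)$$
where $\tilde{\varepsilon}_{j,k} = 0$, if |j − k| > 2m − 1 and
$$|\tilde{\varepsilon}_{j,k}| \le CNmn^{-1}, \quad \text{if } |j - k| \le 2m - 1. \qquad (59)$$
We will use also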
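Proposition 1. For any $j : |j - n| < 4N$,
$$\|\epsilon\psi_j^{(n)}\|_2^2 \le Cn^{-1}. \qquad (60)$$
Since ε is a bounded operator in $L_2[-2 - d/2, 2 + d/2]$, by (38), (58) and (38), we have for |k − n| < 2N,
$$\frac{1}{2}\sum_j\mathcal{P}^{(0,\infty)}_{j,k}\,\epsilon\big(\psi^{(n)}_{j-1} - \psi^{(n)}_{j+1}\big) = n^{-1}\psi_k^{(n)} + r_k, \qquad (61)$$
where $O_2(.)$ is defined in (37) and for |k − n| < 2N we have by (60) and (59),
$$r_k := \frac{1}{2}\sum_{|j-k| \le 2m-1}\tilde{\varepsilon}_{j,k}\,\epsilon\psi_j^{(n)} + O_2(e^{-C\log^2 n}) = O_2(Nmn^{-3/2}). \qquad (62)$$
Let us extend (61) to all 0 ≤ k < ∞, choosing $r_k$ for |k − n| ≥ 2N in such a way to obtain for these k identical equalities:
$$r_k := \sum_{j>0}\mathcal{P}^{(0,\infty)}_{j,k}\,\epsilon\big(\psi^{(n)}_{j-1} - \psi^{(n)}_{j+1}\big) - \frac{1}{n}\psi_k^{(n)} = O_2(1). \qquad (63)$$
Applying $(\mathcal{P}^{(0,\infty)})^{-1}$ to both sides of (61), we get
$$\frac{1}{2}\epsilon\big(\psi^{(n)}_{j-1} - \psi^{(n)}_{j+1}\big) = n^{-1}\sum_{k>0}(\mathcal{P}^{(0,\infty)})^{-1}_{k,j}\psi_k^{(n)} + \sum_{k>0}(\mathcal{P}^{(0,\infty)})^{-1}_{k,j}r_k = \Sigma_{1j} + \Sigma_{2j}. \qquad (64)$$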
Now we need some facts from the theory of Jacobi matrices.
Proposition 2. Let J be a Jacobi matrix with entries $|J_{j,j+1}| \le 1 + d_1/4$ and Q be a bounded analytic function ($|Q(z)| \le C_0$) in $\Omega[d_1/2, d_2/2]$. Then:
(i) for any j, k,
$$|Q(J)_{j,k}| \le Ce^{-d|j-k|}; \qquad (65)$$
(ii) if $\tilde{J}$ is another Jacobi matrix, satisfying the same conditions, then for any $j, k \in (n_1, n_2)$,
$$|Q(J)_{j,k} - Q(\tilde{J})_{j,k}| \le Ce^{-d|j-k|}\sup_{i \in [n_1,n_2)}|J_{i,i+1} - \tilde{J}_{i,i+1}| + C\big(e^{-d(|n_1-j|+|n_1-k|)} + e^{-d(|n_2-j|+|n_2-k|)}\big); \qquad (66)$$
(iii) if Q(λ) > δ > 0 for $\lambda \in [-2 - d_1/2, 2 + d_1/2]$, then for $j, k \in (n_1, n_2)$
$$|(Q(J)^{(n_1,n_2)})^{-1}_{j,k} - Q^{-1}(J)_{j,k}| \le C\min\big\{e^{-d|n_1-j|} + e^{-d|n_2-j|},\ e^{-d|n_1-k|} + e^{-d|n_2-k|}\big\}, \qquad (67)$$
where C and d depend only on $d_1$, $d_2$, $C_0$ and δ.
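The exponential decay in assertion (i) can be seen explicitly in the simplest case. The sketch below is an illustration only: it assumes the constant-coefficient matrix $J^*$ of (24) and the hypothetical function Q(λ) = 1/(λ − 2.5), for which the entries of $Q(J^*)$ are Fourier coefficients of Q(2 cos x) and equal $-2^{-|j-k|}/1.5$, so that d = log 2 works.

```python
import math

def function_of_free_jacobi(q_func, m, n_nodes=256):
    # Entry Q(J*)_{j,k} with j - k = m, where J* has ones on both off-diagonals:
    # by the spectral theorem it equals (1/2pi) * integral of e^{imx} Q(2 cos x) dx,
    # evaluated here by the rectangle rule (Q(2 cos x) is even in x).
    h = 2.0 * math.pi / n_nodes
    return sum(math.cos(m * t * h) * q_func(2.0 * math.cos(t * h))
               for t in range(n_nodes)) / n_nodes

Z0 = 2.5  # hypothetical point outside the spectrum [-2, 2]

def Q(lam):
    return 1.0 / (lam - Z0)

entries = [function_of_free_jacobi(Q, m) for m in range(12)]
ratios = [entries[m + 1] / entries[m] for m in range(11)]
print(entries[:4])
print(ratios[:4])
```

Moving the hypothetical pole closer to the spectrum makes the ratio of consecutive entries approach 1, which is consistent with the constants in (65) depending on the domain of analyticity.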
The proof of Proposition 2 is given at the end of Sect. 3. Using (67) and (65) to estimate $(\mathcal{P}^{(0,\infty)})^{-1}_{j,k}$ and (62)–(63) to estimate $r_k$, it is easy to obtain that for the two sums $\Sigma_{1j}$, $\Sigma_{2j}$ in (64), uniformly in |j − n| ≤ N,
$$\|\Sigma_{2j}\|_2 \le C\sup_{|k-j| \le N}\|r_k\|_2 + Ce^{-cN}\sup_k\|r_k\|_2 \le CNmn^{-3/2}.$$
Besides, it follows from (67) that uniformly in |j − n| ≤ N,
$$\Sigma_{1j} - n^{-1}\sum_{k>0}\mathcal{P}^{-1}_{j,k}\psi_k^{(n)}(\lambda) = O_2(e^{-cn}).$$
Hence, we obtain
$$\epsilon\psi^{(n)}_{j-1} - \epsilon\psi^{(n)}_{j+1} = 2n^{-1}\sum_{k>0}\mathcal{R}_{j,k}\psi_k^{(n)} + O_2(Nmn^{-3/2}). \qquad (68)$$
Multiplying the relation by $n\psi_k^{(n)}$, we get (42).
Corollary 1. Under conditions of Theorem 1,
$$(\mathcal{M}^{(0,n)})^{-1}_{j,k} = \mathcal{Q}^{(0,n)}_{j,k} + \frac{1}{2}a_jb_k + O(n^{-1/2}\log^6 n), \qquad (69)$$
where $\mathcal{Q}^{(0,n)}$, a are defined by (45) and (51) respectively, and
$$b_k = ((\mathcal{R}^{(-\infty,n)})^{-1}r^*)_k, \quad r^*_{n-i} = \mathcal{R}_i \qquad (70)$$
with $\mathcal{R}_i$ defined by (22).
Proof of Corollary 1. Using (55) we have for any $x \in I^{(n-2m,n)}H$, ||x|| ≤ 1,
$$2(\mathcal{M}^{(0,n)})^{-1}x = ((\mathcal{R}^{(-\infty,n)})^{-1}\mathcal{D}^{(-\infty,n)})x + (\nu, x)a + O(n^{-1/2}\log^6 n), \qquad (71)$$
where a is defined by (51), ν is some unknown vector and we write $x = O(\varepsilon_n)$, if $\|x\| = O(\varepsilon_n)$. Making transposition of both sides of the last equation (recall that $\mathcal{M}^{(0,n)T} = -\mathcal{M}^{(0,n)}$ and $\mathcal{D}^{(-\infty,n)T} = -\mathcal{D}^{(-\infty,n)}$), we get for any $x \in I^{(n-2m,n)}H$,
$$-2(\mathcal{M}^{(0,n)})^{-1}x = -(\mathcal{D}^{(-\infty,n)}(\mathcal{R}^{(-\infty,n)})^{-1})x + (x, a)\nu + O(n^{-1/2}\log^6 n). \qquad (72)$$
Subtracting (71) from (72) we have
$$[(\mathcal{R}^{(-\infty,n)})^{-1}, \mathcal{D}^{(-\infty,n)}]x = -(a, x)\nu - (\nu, x)a + O(n^{-1/2}\log^6 n), \qquad (73)$$
where the symbol [., .] means the commutator. On the other hand, it is easy to see that
$$[\mathcal{D}^{(-\infty,n)}, \mathcal{R}^{(-\infty,n)}]x = (x, r^*)e_{n-1} + (x, e_{n-1})r^*,$$
where $r^*$ is defined in (70). Hence,
$$[(\mathcal{R}^{(-\infty,n)})^{-1}, \mathcal{D}^{(-\infty,n)}]x = -(x, a)b - (x, b)a,$$
with a, b defined in (51) and (70). Using the last relation and (73), we obtain that for any $x \in I^{(n-2m,n)}H$,
$$(x, a)b + (x, b)a = (a, x)\nu + (\nu, x)a + O(n^{-1/2}\log^6 n). \qquad (74)$$
Taking an arbitrary x such that (a, x) = (b, x) = 0, we get that $\nu = \lambda_1a + \lambda_2b + O_2(n^{-1/2}\log^6 n)$. Using this expression in (74), we obtain $\lambda_1 = O(n^{-1/2}\log^6 n)$, $\lambda_2 = 1 - O(n^{-1/2}\log^6 n)$. These relations combined with (71) prove (69).
Proof of Theorem 1. Substituting (69) in (11) and using (38), we obtain
$$S_n(\lambda, \mu) = K_{n,2}(\lambda, \mu) + nr_n(\lambda, \mu), \qquad (75)$$
where $K_{n,2}(\lambda, \mu)$ is defined by (6) and
$$r_n(\lambda, \mu) = \sum_{j,k=-2m+1}^{2m-1}r_{j,k}\,\psi^{(n)}_{n-j}(\lambda)\,(\epsilon\psi^{(n)}_{n-k})(\mu), \quad |r_{j,k}| \le C. \qquad (76)$$
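According to the result of [23], to prove the weak convergence of all correlation functions it is enough to prove the weak convergence of cluster functions, which have the form
$$R_n(s_1, \dots, s_k) = \frac{\mathrm{Tr}\,K_{n,1}\big(\lambda_0 + \frac{s_1}{n\rho(\lambda_0)}, \lambda_0 + \frac{s_2}{n\rho(\lambda_0)}\big)\cdots K_{n,1}\big(\lambda_0 + \frac{s_k}{n\rho(\lambda_0)}, \lambda_0 + \frac{s_1}{n\rho(\lambda_0)}\big)}{(n\rho(\lambda_0))^k}, \qquad (77)$$
where the matrix kernel $K_{n,1}(\lambda, \mu)$ has the form (10) with
$$Sd_n(\lambda, \mu) = -n^{-1}\frac{\partial}{\partial\mu}S_n(\lambda, \mu), \quad IS_n(\lambda, \mu) = n\int\epsilon(\lambda - \lambda')S_n(\lambda', \mu)\,d\lambda'.$$
Define similarly
$$Kd_{n,2}(\lambda, \mu) = n^{-1}\frac{\partial}{\partial\mu}K_{n,2}(\lambda, \mu), \quad IK_{n,2}(\lambda, \mu) = n\int\epsilon(\lambda - \lambda')K_{n,2}(\lambda', \mu)\,d\lambda',$$
$$rd_n(\lambda, \mu) = -n^{-1}\frac{\partial}{\partial\mu}r_n(\lambda, \mu), \quad Ir_n(\lambda, \mu) = n\int\epsilon(\lambda - \lambda')r_n(\lambda', \mu)\,d\lambda'. \qquad (78)$$
Lemma 2. Under conditions of Theorem 1 uniformly in $|k - n| \le 2[\log^2 n]$ and λ ∈ [−2 + δ, 2 − δ],
$$|\epsilon\psi_k^{(n)}(\lambda)| \le C\big(n^{-1} + (1 - (-1)^k)n^{-1/2}\big). \qquad (79)$$
Moreover, for any compact $K \subset \mathbb{R}$ uniformly in $s_1, s_2 \in K$,
$$\left|\left(\frac{\partial}{\partial s_1} + \frac{\partial}{\partial s_2}\right)K_{n,2}(\lambda_0 + s_1/n, \lambda_0 + s_2/n)\right| \le C, \qquad (80)$$
$$n^{-1}IK_{n,2}(\lambda_0 + s_1/(n\rho(\lambda_0)), \lambda_0 + s_2/(n\rho(\lambda_0))) \to K^{(0)}_{\infty,2}(s_1 - s_2). \qquad (81)$$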
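Since in (75)–(76) $r_{j,k} = 0$, if both j, k are odd, the bounds (79) and relations (75)–(76) yield that uniformly in $s_1, s_2 \in K$,
$$\left|\frac{1}{n}K_{n,1}\Big(\lambda_0 + \frac{s_1}{n\rho(\lambda_0)}, \lambda_0 + \frac{s_2}{n\rho(\lambda_0)}\Big) - \frac{1}{n}\tilde{K}_{n,1}\Big(\lambda_0 + \frac{s_1}{n\rho(\lambda_0)}, \lambda_0 + \frac{s_2}{n\rho(\lambda_0)}\Big)\right| \le o(1), \qquad (82)$$
where
$$\tilde{K}_{n,1}(\lambda, \mu) = \begin{pmatrix} K_{n,2}(\lambda, \mu) & Kd_{n,2}(\lambda, \mu) \\ IK_{n,2}(\lambda, \mu) - \epsilon(\lambda - \mu) & K_{n,2}(\mu, \lambda) \end{pmatrix}.$$
Hence, we can replace $K_{n,1}$ by $\tilde{K}_{n,1}$ in (77). Then, using integration by parts and (80), we obtain that the integral
$$I(a, b) = \int_a^b\cdots\int_a^bR_n(s_1, \dots, s_k)\,ds_1\dots ds_k$$
can be represented as a finite sum of the terms:
$$T(a, b; k_1, \dots, k_p; l_1, \dots, l_q) = \int_a^b\cdots\int_a^bds_1\dots ds_k\,F_1(s_1, s_2)\dots F_k(s_k, s_1)\Big(\delta(s_{k_1} - a) - \delta(s_{k_1} - b) + \cdots + \delta(s_{k_p} - a) - \delta(s_{k_p} - b)\Big),$$
where
$$F_i(s, s') = \frac{1}{n\rho(\lambda_0)}\begin{cases} IK_{n,2}\big(\lambda_0 + \frac{s}{n\rho(\lambda_0)}, \lambda_0 + \frac{s'}{n\rho(\lambda_0)}\big) - \epsilon(s_1 - s_2), & i = l_1, \dots, l_q, \\ K_{n,2}\big(\lambda_0 + \frac{s}{n\rho(\lambda_0)}, \lambda_0 + \frac{s'}{n\rho(\lambda_0)}\big), & \text{otherwise.} \end{cases} \qquad (83)$$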
Using the result of [14] and (81) we can take the limit n → ∞ in each of these terms. Theorem 1 is proved.
3. Auxiliary Results
Proof of Lemma 1. According to the standard theory of Toeplitz matrices,
$$\mathcal{V}^*_{k,j} = \mathcal{V}^*_{k-j} = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{i(k-j)x}\,\widetilde{V}(x)\,dx,$$
where
$$\widetilde{V}(x) = 2\sum_{k=1}^{\infty}V_k\sin kx, \quad V_k = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{ikx}V'(2\cos x)\,dx, \qquad (84)$$
and to prove (57) it is enough to prove that
$$\widetilde{V}(x) = 2\sin x \cdot P(2\cos x). \qquad (85)$$
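Replacing in (19) $z \to 2\cos x$, $2\cos y \to (\zeta + \zeta^{-1})$, $dy \to (i\zeta)^{-1}d\zeta$ and using the Cauchy theorem, we get
$$P(2\cos x) = \frac{1}{2\pi i}\oint_{|\zeta|=1+\delta}\frac{V'(\zeta + \zeta^{-1}) - V'(2\cos x)}{\zeta + \zeta^{-1} - 2\cos x}\,\zeta^{-1}\,d\zeta = \frac{1}{2\pi i}\sum_k\oint_{|\zeta|=1+\delta}\frac{V_k(\zeta^k + \zeta^{-k})\,d\zeta}{(\zeta - e^{ix})(\zeta - e^{-ix})}$$
$$= \frac{1}{2\pi i}\sum_k\oint_{|\zeta|=1+\delta}\frac{V_k\zeta^k\,d\zeta}{(\zeta - e^{ix})(\zeta - e^{-ix})} = \sum_kV_k\frac{\sin kx}{\sin x} = \frac{\widetilde{V}(x)}{2\sin x}.$$
Since $V'$ is a polynomial of (2m − 1)th degree, $V'(J)_{j,k} = 0$ for |j − k| ≥ 2m and
$$|V'(J^{(n)})_{j,k} - V'(J^*)_{j,k}| \le Cm\max_{|l-k| \le 2m}\{|J_l^{(n)} - 1|\}.$$
This bound implies (59).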
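Identity (85) can be checked numerically on a toy case. The sketch below rests on stated assumptions: the derivative of the potential is taken to be the hypothetical V′(λ) = λ³ (not normalized to satisfy C1–C4), and P is taken to be the circle average of the divided difference of V′, which is what the contour integral in the proof reduces to on |ζ| = 1. Function names are illustrative only.

```python
import math

def v_prime(lam):
    # Hypothetical derivative of the potential, used only for this check.
    return lam ** 3

def divided_difference(z, w):
    # (V'(z) - V'(w)) / (z - w) for V'(lam) = lam^3, written without division.
    return z * z + z * w + w * w

def circle_average(g, n_nodes=64):
    # (1/2pi) * integral of g over [-pi, pi] by the rectangle rule; exact up to
    # rounding for trigonometric polynomials of degree below n_nodes.
    h = 2.0 * math.pi / n_nodes
    return sum(g(-math.pi + t * h) for t in range(n_nodes)) / n_nodes

def fourier_coefficient(k):
    # V_k of (84); V'(2 cos x) is even in x, so e^{ikx} can be replaced by cos(kx).
    return circle_average(lambda x: math.cos(k * x) * v_prime(2.0 * math.cos(x)))

def P(z):
    # Circle average of the divided difference of V' (assumed form of P).
    return circle_average(lambda y: divided_difference(z, 2.0 * math.cos(y)))

def v_tilde(x, k_max=8):
    # Left-hand side of (85), truncated at k_max (V_k = 0 for k > 3 here).
    return 2.0 * sum(fourier_coefficient(k) * math.sin(k * x) for k in range(1, k_max + 1))

for x in (0.3, 1.1, 2.5):
    print(v_tilde(x), 2.0 * math.sin(x) * P(2.0 * math.cos(x)))
```

For this choice the only nonzero coefficients are V_1 = 3 and V_3 = 1, and P(z) = z² + 2, so both sides of (85) equal 6 sin x + 2 sin 3x.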
Proof of Proposition 1. To prove (60) we use the result of [7], according to which for any δ > 0 and any λ ∈ (−2 + δ, 2 − δ) and $|k| \le 16[\log^2 n]$,
$$\psi_{n+k}(\lambda) = \frac{2 + \varepsilon_{n+k}}{\sqrt{2\pi}\,|4 - \lambda^2|^{1/4}}\cos\Big(n\pi\int_\lambda^2\rho(\mu)\,d\mu + k\gamma(\lambda) + \theta(\lambda) + o(1)\Big) + O(n^{-1}),$$
(86) where εn+k → 0 does not depend on λ, ρ(λ) is the limiting IDS, and γ (λ), θ (λ) are smooth functions in (−2, 2). Moreover, it follows from the result of [7] that there exists δ > 0 such that for |λ ∓ 2| ≤ δ ψn+k (λ) = n 1/6 (B1 + O(k/n))Ai n 2/3 ± (λ ∓ 2) + kγ±(1) (λ)/n (87) (2) +n −1/6 (B2 + O(k/n))Ai n 2/3 ± (λ ∓ 2) + kγ± /n + O(n −1 ), (1)
where Ai is the Airy function, B1 , B2 are some uniformly bounded constants, + , γ+ , (2) (1) (2) γ+ are some functions analytic in (2 − δ, 2 + δ), − , γ− , γ− are some functions analytic in (−2 − δ, −2 + δ) and + (2) = 0, − (−2) = 0. Integrating (86) and (87), we get |ψn+k (λ)| ≤ Cn −1/2 , which implies (60).
Proof of Lemma 2. Integrating the first line of (86) between 0 and λ, we get
$$|\epsilon\psi^{(n)}_{n-k}(\lambda) - \epsilon\psi^{(n)}_{n-k}(0)| \le Cn^{-1}. \qquad (88)$$
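Then, using the fact that (εf)(0) = 0 for even f, we get (79) for even k (recall, that n is even). For odd k the above inequality implies
$$\|\epsilon\psi^{(n)}_{n-k}\|_2 \ge C|\epsilon\psi^{(n)}_{n-k}(0)| + Cm^2n^{-1}.$$
Combining the above bounds with (88), we get (79). Inequality (80) follows from the result [14] (see Lemma 7), according to which
$$\left|\left(\frac{\partial}{\partial s_1} + \frac{\partial}{\partial s_2}\right)K_{n,2}(\lambda_0 + s_1/n, \lambda_0 + s_2/n)\right| \le C\Big(n^{-1}|s_1 - s_2|^2 + |\psi_n^{(n)}(\lambda_0 + s_1/n)|^2 + |\psi_{n-1}^{(n)}(\lambda_0 + s_1/n)|^2 + |\psi_n^{(n)}(\lambda_0 + s_2/n)|^2 + |\psi_{n-1}^{(n)}(\lambda_0 + s_2/n)|^2\Big).$$
Since by (86) $\psi_n^{(n)}$, $\psi_{n-1}^{(n)}$ are uniformly bounded in each compact $K \subset (-2, 2)$, we obtain (80). To prove (81) we use the Christoffel-Darboux formula, which gives us
$$n^{-1}IK_{n,2}(\lambda, \mu) = \int_{|\lambda'-\lambda_0| \ge \delta}d\lambda'\,\epsilon(\lambda - \lambda')\frac{\psi_n^{(n)}(\lambda')\psi_{n-1}^{(n)}(\mu) - \psi_{n-1}^{(n)}(\lambda')\psi_n^{(n)}(\mu)}{\lambda' - \mu} + \int_{|\lambda'-\lambda_0| \le \delta}\epsilon(\lambda - \lambda')\frac{\psi_n^{(n)}(\lambda')\psi_{n-1}^{(n)}(\mu) - \psi_{n-1}^{(n)}(\lambda')\psi_n^{(n)}(\mu)}{\lambda' - \mu}\,d\lambda' = I_1 + I_2. \qquad (89)$$
Integrating by parts (we use again that $\psi_k^{(n)} = (\epsilon\psi_k^{(n)})'$) and taking into account (88), we get
$$|I_1| \le C\delta^{-1}n^{-1/2} + \delta^{-2}\int_{-L}^{L}\big(|\epsilon\psi_n^{(n)}(\lambda')| + |\epsilon\psi_{n-1}^{(n)}(\lambda')|\big)\,d\lambda' \le C\delta^{-1}n^{-1/2} + C\delta^{-2}\big(\|\epsilon\psi_{n-1}^{(n)}\|_2 + \|\epsilon\psi_n^{(n)}\|_2\big) = O(n^{-1/2}).$$
To find $I_2$ observe that (86) yields for λ, µ ∈ (−2 + ε, 2 − ε),
$$n^{-1}K_{n,2}(\lambda, \mu) = R(\lambda)\frac{\sin\big(n\pi\int_\mu^\lambda\rho(\lambda')\,d\lambda'\big)}{n(\lambda - \mu)}\big(1 + (\lambda - \mu)\phi_1(\lambda, \mu)\big) + n^{-1}\cos\Big(n\pi\int_\mu^\lambda\rho(\lambda')\,d\lambda'\Big)\phi_2(\lambda, \mu) + n^{-1}\cos\Big(n\pi\Big(\int_2^\lambda + \int_2^\mu\Big)\rho(\lambda')\,d\lambda'\Big)\phi_3(\lambda, \mu),$$
where R and $\phi_1$, $\phi_2$, $\phi_3$ are smooth functions of λ. Hence, using the Riemann-Lebesgue theorem to estimate integrals with $\phi_i(\lambda, \mu)$, we obtain
$$I_2 = \int_{-n\delta}^{n\delta}ds'\,\epsilon(s_1 - s')R(\lambda_0 + s'/n)\frac{\sin\big(n\pi\int_{\lambda_0+s_2/n}^{\lambda_0+s'/n}\rho(\lambda')\,d\lambda'\big)}{s' - s_2} + o(1).$$
Now we split here the integration domain in two parts: |s′| ≤ A and |s′| ≥ A and take the limits n → ∞ and then A → ∞. Relation (81) follows.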
Proof of Proposition 2. Assertion (i) follows from the spectral theorem, according to which
1 Q(J ) j,k = R j,k (z)Q −1 (z)dz, (90) 2πi d(z)=d
and the bound, valid for the resolvent R(z) = (J − z)−1 of any Jacobi matrix J , satisfying conditions of the proposition (see [17]) |R j,k | ≤
C −Cd(z)| j−k| e , d(z) = dist {z, [−2 − d1 /2, 2 + d1 /2]}. d(z)
(91)
To prove assertion (ii) consider J (n 1 , n 2 ) = J (n 1 ,n 2 ) + J (−∞,n 1 ) + J (n 2 ,∞) , J(n 1 , n 2 ) = J(n 1 ,n 2 ) + J(−∞,n 1 ) + J(n 2 ,∞) and denote R (1) (z) = (J (n 1 , n 2 ) − z)−1 ,
R (2) (z) = (J(n 1 , n 2 ) − z)−1 ,
= (J − z)−1 . R(z)
It is evident that for n 1 ≤ j, k ≤ n 2 and z ∈ [−2, 2], (1)
R j,k (z) = (J (n 1 ,n 2 ) − z)−1 j,k ,
(2)
R j,k (z) = (J(n 1 ,n 2 ) − z)−1 j,k .
Then, using the resolvent identity H1−1 − H2−1 = H1−1 (H2 − H1 )H2−1
(92)
and (91), we get (1)
(2)
|R j,k (z) − R j,k (z)| ≤ C
sup
i∈[n 1 ,n 2 )
e |Ji,i+1 − Ji,i+1 |
−d(z)| j−k|/2
d 2 (z)
.
On the other hand, by (92) and (91), we obtain (1) (1) (1) (1) |R j,k − R (1) j,k | ≤ |R j,n 1 +1 Rn 1 ,k | + |R j,n 1 Rn 1 +1,k | + |R j,n 2 Rn 2 −1,k | + |R j,n 2 −1 Rn 2 ,k |
≤
C (e−d(z)(|n 1 − j|+|n 1 −k|) + e−d(z)(|n 2 − j|+|n 2 −k|) ). d 2 (z) (2)
j,k − R |. Then (91) and (90) yield (66). A Similar bound is valid for | R j,k
To prove assertion (iii) observe that x j = (Q(J )(n 1 ,n 2 ) )−1 j,k is the solution of the infinite linear system:
Q(J )i, j x j = δi,k , i ∈ [n 1 , n 2 ), Q(J )i, j x j = ri := Q(J )i, j (Q(J )(n 1 ,n 2 ) )−1 j,k , i ∈ [n 1 , n 2 ).
Hence, −1 (Q(J )(n 1 ,n 2 ) )−1 j,k = Q (J ) j,k +
i∈[n 1 ,n 2 )
Now, using assertion (i), we obtain (67).
Q −1 (J ) j,i ri .
Acknowledgements. The author thanks P. Deift, T. Kriecherbauer and L. Pastur for fruitful discussions. The author also thanks the anonymous referee for a suggestion which helped to simplify the proof of Theorem 2. The author acknowledges the INTAS Research Network 03-51-6637 for financial support.
References
1. Albeverio, S., Pastur, L., Shcherbina, M.: On Asymptotic Properties of the Jacobi Matrix Coefficients. Matem. Fiz. Analiz, Geom. 4, 263–277 (1997)
2. Albeverio, S., Pastur, L., Shcherbina, M.: On the 1/n expansion for some unitary invariant ensembles of random matrices. Commun. Math. Phys. 224, 271–305 (2001)
3. Boutet de Monvel, A., Pastur, L., Shcherbina, M.: On the statistical mechanics approach in the random matrix theory. Integrated density of states. J. Stat. Phys. 79, 585–611 (1995)
4. Claeys, T., Kuijlaars, A.B.J.: Universality of the double scaling limit in random matrix models. Comm. Pure Appl. Math. 59, 1573–1603 (2006)
5. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
6. Deift, P., Kriecherbauer, T., McLaughlin, K., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Commun. Pure Appl. Math. 52, 1335–1425 (1999)
7. Deift, P., Kriecherbauer, T., McLaughlin, K., Venakides, S., Zhou, X.: Strong asymptotics of orthogonal polynomials with respect to exponential weights. Commun. Pure Appl. Math. 52, 1491–1552 (1999)
8. Deift, P., Gioev, D.: Universality in random matrix theory for orthogonal and symplectic ensembles. Int. Math. Res. Papers, Article ID rpm004, 116 pages (2007)
9. Deift, P., Gioev, D.: Universality at the edge of the spectrum for unitary, orthogonal, and symplectic ensembles of random matrices. Comm. Pure Appl. Math. 60, 867–910 (2007)
10. Deift, P., Gioev, D., Kriecherbauer, T., Vanlessen, M.: Universality for orthogonal and symplectic Laguerre-type ensembles. J. Stat. Phys. 129, 949–1053 (2007)
11. Dyson, F.J.: A Class of Matrix Ensembles. J. Math. Phys. 13, 90–107 (1972)
12. Johansson, K.: On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91, 151–204 (1998)
13. Mehta, M.L.: Random Matrices. New York: Academic Press, 1991
14. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
15. Pastur, L., Shcherbina, M.: On the edge universality of the local eigenvalue statistics of matrix models. Matem. Fiz. Analiz, Geom. 10(N3), 335–365 (2003)
16. Pastur, L., Shcherbina, M.: Bulk universality and related properties of Hermitian matrix models. J. Stat. Phys. 130, 205–250 (2007)
17. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, vol. IV. New York: Academic Press, 1978
18. Saff, E., Totik, V.: Logarithmic Potentials with External Fields. Berlin: Springer-Verlag, 1997
19. Shcherbina, M.: Double scaling limit for matrix models with non analytic potentials. J. Math. Phys. 49, 033501–033535 (2008)
20. Shcherbina, M.: On Universality for Orthogonal Ensembles of Random Matrices. http://arXiv.org/abs/math-ph/0701046, 2007
21. Stojanovic, A.: Universality in orthogonal and symplectic invariant matrix models with quartic potentials. Math. Phys. Anal. Geom. 3, 339–373 (2002)
22. Stojanovic, A.: Universalité pour des modéles orthogonale ou symplectiqua et a potentiel quartic. Math. Phys. Anal. Geom. Preprint Bibos 02-07-98; available at http://www.physik.oni-bielefeld.de/bibos/preprints/02-07-08.pdf, 1988
23. Tracy, C.A., Widom, H.: Correlation functions, cluster functions, and spacing distributions for random matrices. J. Stat. Phys. 92, 809–835 (1998)
24. Widom, H.: On the relations between orthogonal, symplectic and unitary matrix models. J. Stat. Phys. 94, 347–363 (1999)

Communicated by A. Kupiainen
Commun. Math. Phys. 285, 975–990 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0460-2
Communications in
Mathematical Physics
Inertial Manifolds for a Smoluchowski Equation on the Unit Sphere Jesenko Vukadinovic Department of Mathematics, CUNY-College of Staten Island, 1S-208, 2800 Victory Boulevard, Staten Island, NY 10314, USA. E-mail:
[email protected] Received: 19 November 2007 / Accepted: 24 December 2007 Published online: 13 March 2008 – © Springer-Verlag 2008
Abstract: The existence of inertial manifolds for a Smoluchowski equation—a nonlinear Fokker-Planck equation on the unit sphere which arises in modeling of colloidal suspensions—is investigated. A nonlinear and nonlocal transformation is used to eliminate the gradient from the nonlinear term. 1. Introduction Although intrinsically infinite-dimensional, many dissipative parabolic systems exhibit long-term dynamics with properties typical of finite-dimensional dynamical systems. The global attractor, often considered the central object in the study of long-term behavior of dynamical systems, appears to be inadequate in capturing this finite-dimensionality, even when its Hausdorff dimension is finite. This is mainly due to two facts. Firstly, the global attractor can be a very complicated set, not necessarily a manifold; the question whether the dynamics on it can be described by a system of ODEs is yet to be resolved in the literature. Secondly, although all solutions approach this set, they do so at arbitrary rates, algebraic or exponential, and, consequently, the dynamics outside the attractor is not tracked very well on the attractor itself. When they exist, inertial manifolds emerge as most adequate objects to capture the finite-dimensionality of a dissipative parabolic PDE. Introduced by Foias et al. in [15], they are defined to remedy the shortcomings of the global attractor just described: they should be finite dimensional positive-invariant Lipschitz manifolds which attract all solutions exponentially, and on which the solutions of the underlying PDE are recoverable from solutions of a system of ODEs, termed ‘inertial form’. The existence of an inertial manifold does not merely have a theoretical value, but a practical one as well: using a system of ODEs instead of a system of PDEs facilitates computations and numerical analysis. One of the most notable examples of parabolic PDEs which possess inertial manifolds is the Kuramoto-Sivashinsky equation [14]. 
However, the existence of inertial manifolds remains to date unattainable for the vast majority of physically relevant
dissipative PDEs; chief amongst them is the 2D Navier-Stokes system, which possesses a finite-dimensional attractor, for which, however, the existence of inertial manifolds is still open. The main reason for this lies in the fact that most methods (except in some very special cases) require the system at hand to satisfy a very restrictive spectral-gap condition; this condition is especially restrictive when first-order derivatives are present in the nonlinear term, as is the case for the Navier-Stokes equations, and many other physically relevant systems. The Smoluchowski equation describes the temporal evolution of the probability distribution function ψ for directions of rod-like particles in a suspension. In its simplest form, the equation has the form of a Fokker-Planck equation ∂t ψ = ψ + ∇ · (ψ∇V ). It is, however, phrased on the unit sphere, and the gradient, the divergence and the Laplacian are correspondingly modified. The unknown function V stands for a meanfield potential resulting from the excluded volume effects due to steric forces between molecules. Unlike the Fokker-Planck equation, the equation is quadratically nonlinear due to the dependence of V on the probability distribution ψ, and it is nonlocal, since this dependence is nonlocal. In that respect, the equation is similar to the Keller-Segel model of chemotaxis, but unlike the Keller-Segel equations, there is no blow-up, and the existence and the dissipativity of solutions are well established. In this paper, we shall use a particular type of mean-field potential due to Maier and Saupe [22], which can be thought of as the projection of the probability distribution function on the second eigenspace of the Laplace-Beltrami operator multiplied by a constant. The Smoluchowski equation was first proposed in the works of Doi [10] and Hess [17] as a dynamical model for nematic liquid crystalline polymers. 
The model takes into account both the micro-micro interaction (modeled through the mean field potential) and the macro-micro interaction (modeled by an advection term that couples the equation to fluid equations). In this paper, we study a version of the equation in which the interaction with the ambient flow is neglected. In this case, the equation is a gradient system with the free energy as the Lyapunov functional. It is dissipative in a Gevrey class of analytic functions, and it possesses a finite-dimensional global attractor consisting of the steady-states and their unstable manifolds. Historically, the Smoluchowski equation was preceded by a variational model for colloidal suspensions due to Onsager [23]. Onsager calculated the free energy functional and derived the Euler-Lagrange equation for the steady-states. The mean-field potential used in his work was different, and the Maier-Saupe potential is a truncation of this potential. However, it is widely accepted that the Maier-Saupe potential affords sufficient degrees of freedom to capture the dynamics of the micro-micro interaction. In a recent development, the bifurcation diagram for the Onsager equation (and therefore also Smoluchowski equation) with the Maier-Saupe potential was confirmed rigorously (see [5,6,8,13,18,19]). The equation undergoes two bifurcations. At a lower potential intensity, the equation undergoes a saddle-node bifurcation, in which a prolate nematic branch of steady-states (probability distribution concentrates to one direction) and an oblate nematic branch of steady-states (probability distribution concentrates uniformly to the equator) emerge. At a higher potential intensity, the equation undergoes a transcritical bifurcation: the oblate branch intersects with the isotropic state, and there is a transfer of stability. 
The existence of a finite-dimensional attractor and the bifurcation diagram just described suggest finite-dimensionality of the dynamics, and the question of the existence of inertial manifolds emerges. However, just like for the 2D Navier-Stokes system,
the Smoluchowski equation has a gradient in the nonlinear term, and the equation in its usual form does not satisfy the spectral-gap condition. This difficulty is circumvented in this paper by a nonlinear nonlocal transformation which eliminates the gradient from the nonlinearity. Let us also remark that the dynamics of the Smoluchowski equation becomes much more complex when we allow for interaction with the ambient flow. Even a passive interaction with a shear flow leads to complicated and peculiar dynamical behavior. This is due to the fact that the fluid introduces a non-variational element to the dynamics, even though the nonlinearity remains unchanged. The equation ceases to be a gradient system, and the attractor becomes a much more complicated object: in addition to flow-aligning (steady-states), different time-periodic solution regimes and chaos were confirmed numerically. The method developed here can be modified to prove the existence of inertial manifolds for this dynamically more interesting case, as well as for other equations of a similar structure.

2. Preliminaries

2.1. Smoluchowski equation on a unit sphere. We consider the Smoluchowski equation on the unit sphere ($m \in S^2$)
$$\partial_t \psi = \Delta_m \psi + \nabla_m \cdot (\psi \nabla_m V), \tag{2.1}$$
where $\nabla_m = m \times \partial_m$ stands for the gradient operator on the unit sphere, and $\Delta_m = \nabla_m^2$ stands for the Laplace-Beltrami operator. We also write the equation in the functional form
$$\partial_t \psi + A\psi = B(\psi, V), \tag{2.2}$$
where the linear operator $A$ and the bilinear operator $B$ are defined as $A = -\Delta_m$ and $B(\psi, \xi) = \nabla_m \cdot (\psi \nabla_m \xi)$, respectively. The unknown $\psi$ is interpreted as the probability distribution function for the orientations of rigid rod-like molecules in a suspension. The simplest quantity representing its anisotropy is the orientational order-parameter tensor, which is calculated as the traceless equivalent of the second moment tensor:
$$\mathbf{S}[\psi(t)] = \langle mm - I/3 \rangle_{\psi(t)} = \int_{S^2} [mm - I/3]\, \psi(m, t)\, dm.$$
The scalar order parameter
$$S[\psi] = \left(\tfrac{3}{2}\, \mathbf{S}[\psi] : \mathbf{S}[\psi]\right)^{1/2} \in [0, 1]$$
represents the degree of molecular alignment. For the isotropic phase $\bar\psi = 1/4\pi$, $S[\bar\psi] = 0$, and for the perfect alignment $S[\psi] = 1$. The unknown $V$ is a mean-field intermolecular interaction potential resulting from the excluded volume effects due to steric forces between molecules. In this paper, we utilize the Maier-Saupe potential given by
$$V(m, t) = -b\,(mm - I/3) : \mathbf{S}[\psi(t)], \tag{2.3}$$
where the parameter $b > 0$ represents the potential intensity. Due to the nonlocal dependence of $V$ on the probability distribution function $\psi$, the Smoluchowski equation is nonlinear (quadratic) and nonlocal. Note that
$$A(mm - I/3) = \lambda_2 (mm - I/3),$$
and therefore also $AV = \lambda_2 V$, where $\lambda_2 = 6$ is the second smallest positive eigenvalue of $A$. More specifically,
$$V(m, t) = -b\, \frac{8\pi}{15}\, P_2 \psi,$$
where $P_2$ is the projection on the spectral eigenspace of the operator $A$ corresponding to the eigenvalue $\lambda_2 = 6$. Regarding the existence, uniqueness and regularity of solutions of (2.1) with (2.3), it is easy to prove the following theorem (see [5,6]).

Theorem 1. Let $\psi_0 > 0$ be a continuous function on $S^2$ such that $\int_{S^2} \psi_0 = 1$. A unique smooth solution $\psi(t) = S(t)\psi_0$ of (2.1) and (2.3) with initial datum $\psi(0) = \psi_0$ exists for all nonnegative times and remains positive and normalized:
$$\int_{S^2} \psi(m, t)\, dm = 1.$$
The Smoluchowski equation preserves certain symmetries. Symmetry with respect to the origin, reflecting the fact that we do not distinguish between orientations $m$ and $-m$, is preserved. Also, symmetry with respect to any plane passing through the origin is preserved. This allows us to choose a coordinate system so that the symmetry with respect to any of the coordinate planes is preserved. We will refer to functions with these symmetries simply as "symmetric". We express $\psi$ in terms of the local coordinates: $\psi(\theta, \varphi) = \psi(m(\theta, \varphi))$, where $m_1(\theta, \varphi) = \sin\theta \cos\varphi$, $m_2(\theta, \varphi) = \sin\theta \sin\varphi$, and $m_3(\theta, \varphi) = \cos\theta$. Then, in terms of the local coordinates,
$$\nabla_m \psi = \left(\partial_\theta \psi,\ \frac{1}{\sin\theta}\, \partial_\varphi \psi\right)$$
and
$$\Delta_m \psi = \frac{1}{\sin\theta}\, \partial_\theta (\sin\theta\, \partial_\theta \psi) + \frac{1}{\sin^2\theta}\, \partial_\varphi^2 \psi.$$
Due to the choice of the coordinate system, the order parameter tensor is a diagonal trace-free matrix
$$\mathbf{S}[\psi] = \begin{bmatrix} \langle m_1^2 - 1/3 \rangle_\psi & 0 & 0 \\ 0 & \langle m_2^2 - 1/3 \rangle_\psi & 0 \\ 0 & 0 & \langle m_3^2 - 1/3 \rangle_\psi \end{bmatrix}.$$
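This enables us to rewrite the potential in the following way:
$$\begin{aligned}
V(m, t) &= -b \left[ \langle m_1^2 - 1/3 \rangle_\psi (m_1^2 - 1/3) + \langle m_2^2 - 1/3 \rangle_\psi (m_2^2 - 1/3) + \langle m_3^2 - 1/3 \rangle_\psi (m_3^2 - 1/3) \right] \\
&= -\frac{b}{2} \left[ \langle m_1^2 - m_2^2 \rangle_\psi (m_1^2 - m_2^2) + 3 \langle m_3^2 - 1/3 \rangle_\psi (m_3^2 - 1/3) \right] \\
&= -\frac{b}{2} \left[ \langle w_1 \rangle_\psi w_1 + \langle w_2 \rangle_\psi w_2 \right] \\
&= -\frac{b}{2}\, \langle w \rangle_\psi \cdot w,
\end{aligned}$$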
where $w_1 = m_1^2 - m_2^2 = \sin^2\theta \cos 2\varphi$, $w_2 = \sqrt{3}(m_3^2 - 1/3) = \sqrt{3}(\cos^2\theta - 1/3)$, and $w = (w_1, w_2)$. Notice that $\|w_1\|_2^2 = \|w_2\|_2^2 = \frac{16\pi}{15}$. Multiplying (2.1) by $m_i^2 - 1/3$ and integrating by parts yields the equation for temporal evolution of the order parameter tensor
$$\partial_t S_{ii}[\psi] + 6 S_{ii}[\psi] = 4b \left( S_{ii}[\psi]\, \langle m_i^2 \rangle_\psi - \sum_{k=1}^{3} S_{kk}[\psi]\, \langle m_i^2 m_k^2 \rangle_\psi \right). \tag{2.4}$$
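The normalization of $w_1$ and $w_2$ stated above can be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule in the local coordinates $(\theta, \varphi)$; the helper name `sphere_integral` and the grid sizes are illustrative choices, not part of the paper.

```python
import math

def sphere_integral(f, n_theta=200, n_phi=400):
    """Midpoint-rule approximation of the surface integral of f(theta, phi) over S^2."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        ring = 0.0
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            ring += f(theta, phi)
        # sin(theta) is the surface-area weight in spherical coordinates
        total += ring * math.sin(theta)
    return total * d_theta * d_phi

def w1(theta, phi):
    # w1 = m1^2 - m2^2 = sin^2(theta) cos(2 phi)
    return math.sin(theta) ** 2 * math.cos(2.0 * phi)

def w2(theta, phi):
    # w2 = sqrt(3) (m3^2 - 1/3)
    return math.sqrt(3.0) * (math.cos(theta) ** 2 - 1.0 / 3.0)

norm1 = sphere_integral(lambda t, p: w1(t, p) ** 2)     # squared L2 norm of w1
norm2 = sphere_integral(lambda t, p: w2(t, p) ** 2)     # squared L2 norm of w2
cross = sphere_integral(lambda t, p: w1(t, p) * w2(t, p))  # L2 inner product
```

Both squared norms should agree with $16\pi/15$ up to quadrature error, and the inner product of $w_1$ and $w_2$ should vanish, which is what makes $\langle w \rangle_\psi \cdot w$ a convenient representation of the potential.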
In particular, this equation yields the equation for the temporal evolution of the mean-field potential
$$\partial_t V + 6V = G[\psi], \tag{2.5}$$
where $G[\psi] : S^2 \to \mathbb{R}$ depends Lipschitz-continuously on second and fourth moment tensors of $\psi$ only.

2.2. Dissipativity and the global attractor. Here we review some basic properties of the spherical harmonics. Let $P_k$ denote the Legendre polynomial of degree $k$. For $k = 0, 1, 2, \ldots$ and $j = 0, \pm 1, \pm 2, \ldots, \pm k$ we define
$$Y_k^j(\theta, \varphi) = C_k^j\, e^{ij\varphi}\, P_k^j(\cos\theta),$$
where
$$C_k^j = \left( \frac{2k+1}{4\pi}\, \frac{(k - |j|)!}{(k + |j|)!} \right)^{1/2}, \qquad P_k^j(x) = (1 - x^2)^{j/2}\, \frac{d^j P_k}{dx^j}(x), \quad j = 0, 1, 2, \ldots, k,$$
and $P_k^j = P_k^{-j}$, $j = -1, -2, \ldots, -k$. Each $Y_k^j$ is an eigenvector of $A = -\Delta_m$ corresponding to the eigenvalue $\lambda_k = k^2 + k$: $A Y_k^j = \lambda_k Y_k^j$. Moreover, the set $\{Y_k^j : k = 0, 1, 2, \ldots;\ j = 0, \pm 1, \pm 2, \ldots, \pm k\}$ forms an orthonormal basis in $L^2(S^2)$; in particular, for each $\psi \in L^2(S^2)$, there is a representation
$$\psi = \sum_{k=0}^{\infty} \sum_{j=-k}^{k} \psi_k^j\, Y_k^j,$$
where
$$\psi_k^j = \int_{S^2} \psi\, Y_k^{-j}\, dm.$$
We define
$$P_k \psi = \sum_{j=-k}^{k} \psi_k^j\, Y_k^j.$$
Let $H = \{\psi \in L^2(S^2) : \psi \text{ normalized and symmetric}\}$. For $\psi \in H$, the symmetry implies $\psi_k^{-j} = \bar\psi_k^j = \psi_k^j$, $\psi_{2k+1}^j = 0$, and the normalization yields
$$\psi_0^0 = \frac{1}{\sqrt{4\pi}}$$
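and
$$|\psi_k^j| \le \int_{S^2} \psi\, |Y_k^{-j}|\, dm \le \sqrt{\frac{2k+1}{4\pi}}.$$
We will make use of the fact that $\|P_2 \psi\|_\infty \le \frac{5}{4\pi}$. For the scalar order parameter we now have
$$S[\psi]^2(t) = -\frac{3}{2b} \int_{S^2} V(m, t)\, \psi(m, t)\, dm = \frac{4\pi}{5}\, \|P_2 \psi\|_2^2 = \frac{3}{4} \left( \langle w_1 \rangle_\psi^2 + \langle w_2 \rangle_\psi^2 \right).$$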
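The two expressions for the scalar order parameter, one through the tensor $\mathbf{S}[\psi]$ and one through $\langle w \rangle_\psi$, can be compared on a concrete density. The sketch below assumes a hypothetical symmetric test density proportional to $\exp(0.8 w_1 + 0.5 w_2)$ and a midpoint quadrature; the function names are illustrative only.

```python
import math

def w_vec(m):
    """The pair w = (w1, w2) evaluated at a unit vector m = (m1, m2, m3)."""
    return (m[0] ** 2 - m[1] ** 2, math.sqrt(3.0) * (m[2] ** 2 - 1.0 / 3.0))

def moments(density, n_theta=120, n_phi=240):
    """Return (S, w_mean): order-parameter tensor and <w> for an unnormalized density."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    mass = 0.0
    second = [[0.0] * 3 for _ in range(3)]
    w_sum = [0.0, 0.0]
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            m = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            # The constant factor d_theta * d_phi cancels in all normalized averages.
            weight = density(m) * math.sin(theta)
            mass += weight
            for a in range(3):
                for b in range(3):
                    second[a][b] += m[a] * m[b] * weight
            w = w_vec(m)
            w_sum[0] += w[0] * weight
            w_sum[1] += w[1] * weight
    S = [[second[a][b] / mass - (1.0 / 3.0 if a == b else 0.0) for b in range(3)]
         for a in range(3)]
    return S, (w_sum[0] / mass, w_sum[1] / mass)

def density(m):
    # Hypothetical test density; it is invariant under m_i -> -m_i for each i.
    w1, w2 = w_vec(m)
    return math.exp(0.8 * w1 + 0.5 * w2)

S, w_mean = moments(density)
# Scalar order parameter squared, computed as (3/2) S : S ...
s2_from_tensor = 1.5 * sum(S[a][b] ** 2 for a in range(3) for b in range(3))
# ... and as (3/4) (<w1>^2 + <w2>^2).
s2_from_w = 0.75 * (w_mean[0] ** 2 + w_mean[1] ** 2)
```

For a symmetric density the off-diagonal entries of `S` vanish and the two values coincide, so the pair $\langle w \rangle_\psi \in \mathbb{R}^2$ carries all the information contained in the order parameter tensor.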
After multiplying (2.1) by $-V(m, t)$ and integrating by parts, we obtain
$$\frac{d}{2dt}\, S[\psi]^2 + 6 S[\psi]^2 = \frac{3}{2b} \int_{S^2} |\nabla_m V|^2\, \psi\, dm. \tag{2.6}$$
On the other hand, by multiplying (2.4) by $S_{ii}[\psi]$ and summing, one obtains
$$\frac{d}{2dt}\, S[\psi]^2 + \left( 6 - \frac{4b}{5} \right) S[\psi]^2 = b \sum_{i=1}^{3} \sum_{k=1}^{3} S_{ii}[\psi]\, S_{kk}[\psi]\, T_{ik}[\psi],$$
where $T_{ij}[\psi] = \langle T_{ij}(m) \rangle_\psi$, and $(T_{ij}(m)) \in P_2(L^2(S^2)) \oplus P_4(L^2(S^2))$ is symmetric. If $\psi(t) \to \bar\psi = 1/4\pi$ as $t \to \infty$, then $T[\psi(t)] \to O$ as $t \to \infty$, so if $b < 15/2$, then $S[\psi(t)] \to 0$ exponentially as $t \to \infty$, and if $b > 15/2$, then $S[\psi(t)] \equiv 0$ or $S[\psi(t)] \to \infty$ as $t \to \infty$, which cannot be the case. Note that if $S[\psi(0)] \equiv 0$, then $S[\psi(t)] \equiv 0$ ($V \equiv 0$), and the Smoluchowski equation reduces to the heat equation in this case. We again have $\psi(t) \to \bar\psi$ as $t \to \infty$, exponentially.

In the following we define Gevrey classes of functions. For $a > 1$, we define on the set of eigenvalues of $A$ the function $f_a(\lambda_k) = a^{2k}$. We define a spectral operator $F_a = f_a(A)$ by
$$F_a \psi = \sum_{k=2}^{\infty} f_a(\lambda_k)\, P_k \psi.$$
Let $(\cdot, \cdot)_a = (\cdot, F_a(\cdot))_{L^2(S^2)}$, $\|\cdot\|_a = (\cdot, \cdot)_a^{1/2}$, and $H_a = \{\psi \in H : \|\psi\|_a < \infty\}$. From [7], we adopt a lemma concerning the nonlinear term in the equation:

Lemma 1. For $\psi, \chi, \xi \in D(A)$,
$$(B(\psi, \xi), \chi)_{L^2(S^2)} = \frac{1}{2} \int_{S^2} [\xi \chi A\psi - \psi \xi A\chi - \psi \chi A\xi]\, dm. \tag{2.7}$$
With a minor modification, we also adopt the following

Lemma 2. There exist an absolute constant $C_1 > 0$ and a constant $C_2 = C_2(b) > 0$ depending on $b$ only, such that, if $1 \le a^4 \le 1 + (4 C_1 b)^{-1}$, then for any $\psi \in H \cap D(A F_a(A))$ and $V$ computed through (2.3) the inequality
$$|(B(\psi, V), \psi)_a| \le C_2^2 + \frac{1}{2}\, ((A - \lambda_2)\psi, \psi)_a$$
holds.
This enables us to prove the following theorem on the existence of absorbing cones in Gevrey classes:

Theorem 2. Let $\psi_0 \in H_a$ and $\psi(t)$ be the unique solution of (2.2) corresponding to that initial datum, and $S[\psi(t)]$ the scalar order parameter. Let the number $a$ be such that $1 < a^4 \le 1 + (4 C_1 b)^{-1}$. Then
$$\frac{\|\psi(t)\|_a^2}{S[\psi(t)]^2} \le 2 C_2^2 + e^{-t}\, \frac{\|\psi_0\|_a^2}{S[\psi(0)]^2}, \quad t \ge 0. \tag{2.8}$$
In particular, the cone $\{\psi \in H_a : \|\psi\|_a \le 2 C_2 S[\psi]\}$ is absorbing and invariant.

Proof. Multiplying Eq. (2.2) by $F_a \psi$ and integrating over $S^2$ one obtains
$$\frac{d}{2dt}\, (\psi, F_a \psi) + (A\psi, F_a \psi) = (B(\psi, V), F_a \psi).$$
Let $\tilde\psi := \psi / S[\psi]$ and $\tilde V := V / S[\psi]$. In view of (2.6),
$$\frac{d}{2dt}\, (\tilde\psi, \tilde\psi)_a + ((A - \lambda_2)\tilde\psi, \tilde\psi)_a \le (B(\tilde\psi, \tilde V), F_a \tilde\psi)\, S[\psi],$$
and therefore in view of Lemma 2,
$$\frac{1}{2}\, \frac{d}{dt}\, (\tilde\psi, \tilde\psi)_a + \frac{1}{2}\, ((A - \lambda_2)\tilde\psi, \tilde\psi)_a \le C_2^2,$$
and so
$$\frac{d}{dt}\, (\tilde\psi, \tilde\psi)_a + (\tilde\psi, \tilde\psi)_a \le 2 C_2^2,$$
and the statement follows.
Following [7], we can modify the previous theorem to prove the dissipativity in Gevrey classes even for the initial data $\psi_0 \in L^2(S^2)$, which, in turn, implies the dissipativity in $L^2(S^2)$. By $B_{\rho(b)} = \{\psi \in H : \|\psi\|_2 < \rho(b)\}$ let us denote an absorbing invariant ball for (3.11). We choose $\rho : \mathbb{R}^+ \to \mathbb{R}^+$ to be continuous and increasing. Another consequence of the theorem is the existence of a finite dimensional global attractor $\mathcal{A}$ whose structure is somewhat simple due to the fact that Eq. (2.1) is a gradient system. For dissipative gradient systems, the dynamical behavior is characterized by a global attractor which is formed by the steady-states and their unstable manifolds, and all solutions approach a steady-state as $t \to \infty$. Denoting $u = \psi \exp(V/2)$, the Lyapunov functional for the Smoluchowski equation is given by
$$F(t) = \int_{S^2} \psi(m, t) \log u(m, t)\, dm,$$
which satisfies the equation
$$\frac{dF}{dt} = -\int_{S^2} |\nabla_m (V + \log \psi)|^2\, \psi\, dm.$$
The right-hand side of the energy dissipation equation yields the equation for the steady-states for the Smoluchowski equation, which coincides with the Onsager equation which preceded the Hess-Doi theory. The equation reads $V + \log \psi = \text{const.}$, or, by denoting $r = (b/4) \langle w \rangle_\psi$ and solving for $\psi$,
$$\psi(m) = \frac{\exp(2 r \cdot w(m))}{\int_{S^2} \exp(2 r \cdot w(m))\, dm}. \tag{2.9}$$
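Equation (2.9) together with $r = (b/4) \langle w \rangle_\psi$ is a self-consistency condition on $r \in \mathbb{R}^2$. As a rough illustration, the sketch below evaluates the defect $r - (b/4) \langle w \rangle_\psi$ for a trial $r$ by midpoint quadrature; the names `steady_state_mean_w` and `residual` are hypothetical, and no attempt is made here to locate the bifurcation values.

```python
import math

def w_vec(theta, phi):
    """w = (w1, w2) in the local coordinates (theta, phi)."""
    return (math.sin(theta) ** 2 * math.cos(2.0 * phi),
            math.sqrt(3.0) * (math.cos(theta) ** 2 - 1.0 / 3.0))

def steady_state_mean_w(r, n_theta=120, n_phi=240):
    """For psi proportional to exp(2 r.w), return (Z, <w>_psi) with Z the normalizer."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    z = 0.0
    acc = [0.0, 0.0]
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            w = w_vec(theta, phi)
            weight = math.exp(2.0 * (r[0] * w[0] + r[1] * w[1])) * math.sin(theta)
            z += weight
            acc[0] += w[0] * weight
            acc[1] += w[1] * weight
    return z * d_theta * d_phi, (acc[0] / z, acc[1] / z)

def residual(r, b):
    """Defect r - (b/4) <w>_psi; it vanishes exactly when r yields a steady state."""
    _, mean_w = steady_state_mean_w(r)
    return (r[0] - 0.25 * b * mean_w[0], r[1] - 0.25 * b * mean_w[1])

z_iso, mean_iso = steady_state_mean_w((0.0, 0.0))  # isotropic state: Z close to 4*pi
defect_iso = residual((0.0, 0.0), 7.0)             # close to (0, 0) for every b
defect_axis = residual((0.0, 0.5), 7.0)            # trial state axisymmetric about m3
```

The choice $r = o$ reproduces the isotropic state for every $b$, and for a trial $r = (0, r_2)$ the first component of the defect vanishes by symmetry, so that in this axisymmetric case the condition reduces to a scalar equation in $r_2$.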
This transcendental matrix equation is now well understood (see [5,6,8,13,18]). The isotropic steady state $\bar\psi = 1/4\pi$ persists for all potential intensities $b > 0$. All anisotropic steady-states are axisymmetric with the coordinate axes as axes of symmetry. There are two critical intensities at which bifurcations occur: a saddle-node bifurcation at $b = b^* = 6.7314863965$ in which three oblate and three prolate anisotropic steady-states emerge (one for each coordinate axis as the axis of symmetry), and a transcritical bifurcation at $b = b^{**} = 15/2$ in which the oblate branch intersects with the isotropic steady-state. If $b < b^{**}$, then $\psi = \bar\psi$ is asymptotically stable. In view of the fact that $\psi(t) \to \bar\psi$ as $t \to \infty$ implies $S[\psi(t)] \to 0$ as $t \to \infty$, exponentially, and in view of (2.8), $\psi(t) \to \bar\psi$ as $t \to \infty$, exponentially. If $b > b^{**}$, $\bar\psi$ becomes a saddle with the set $\{\psi : S[\psi] = 0\}$ as the basin of attraction. All anisotropic steady-states have the form (2.9) for some $r \in \mathbb{R}^2$. By $r^*(b)$ we denote the least $|r|$.

2.3. Inertial manifolds. In this section, we recall the definition of inertial manifolds and a theorem on their existence. Consider an evolution equation on a Hilbert space $H$ endowed with the inner product $(\cdot, \cdot)$ and the norm $|\cdot|$, of the form
$$\partial_t u + Au = N(u), \tag{2.10}$$
where $A$ is a positive self-adjoint linear operator with compact inverse, and $N : H \to H$ is a locally Lipschitz function. Recall that, since $A^{-1}$ is compact, there exists a complete set of eigenfunctions $w_k$ for $A$, $A w_k = \lambda_k w_k$. We arrange the eigenvalues in a nondecreasing sequence $\lambda_k \le \lambda_{k+1}$, $k = 1, 2, \ldots$. It is a well-known fact that $\lambda_k \to \infty$ as $k \to \infty$. We also define the projection operators
$$P_n u = \sum_{k=1}^{n} (u, w_k)\, w_k$$
and $Q_n = I - P_n$, and the cone-like sets
$$C_l^n = \{w \in H : |Q_n w| \le l\, |P_n w|\}.$$
Definition 1. An inertial manifold $\mathcal{M}$ is a finite-dimensional Lipschitz manifold which is positively invariant, i.e.
$$S(t)\mathcal{M} \subset \mathcal{M}, \quad t \ge 0,$$
and exponentially attracts all orbits of the flow uniformly on any bounded set $U \subset H$ of initial data, i.e.
$$\operatorname{dist}(S(t)u_0, \mathcal{M}) \le C_U e^{-\mu t}, \quad u_0 \in U,\ t \ge 0.$$
The inertial manifold is said to be asymptotically complete, if, for any solution $u(t)$, there exists $v_0 \in \mathcal{M}$ such that $|u(t) - S(t)v_0| \to 0$, $t \to \infty$, exponentially.

There are several methods for proving the existence of inertial manifolds. The vast majority of them require some kind of Lipschitz continuity of the nonlinearity $N$ and make use of a very restrictive spectral gap property of the linear operator $A$. These two conditions yield the strong squeezing property, which, in turn, yields the existence of an inertial manifold. The inertial manifold is obtained as a graph of a Lipschitz mapping. We shall use the following

Theorem 3. Suppose that the nonlinearity in (2.10) satisfies the following three conditions:
• It has compact support in $H$, i.e. $\operatorname{supp}(N) \subset B_\rho = \{u \in H : |u| \le \rho\}$.
• It is bounded, i.e. $|N(u)| \le C_0$ for $u \in H$.
• It is globally Lipschitz continuous, i.e. $|N(u_1) - N(u_2)| \le C |u_1 - u_2|$ for $u_1, u_2 \in H$.
Suppose that the eigenvalues of $A$ satisfy the spectral gap condition, i.e. $\lambda_{n+1} - \lambda_n > 4C$, for some $n \in \mathbb{N}$. Then the strong squeezing property holds, i.e.
• If $u_1(0) - u_2(0) \in C_l^n$, then $u_1(t) - u_2(t) \in C_l^n$ for $t \ge 0$.
• If $u_1(t_0) - u_2(t_0) \notin C_l^n$ for some $t_0 \ge 0$, then $|Q_n(u_1(t) - u_2(t))| \le |Q_n(u_1(0) - u_2(0))|\, e^{-\mu t}$ for some $\mu > 0$ and for $0 \le t \le t_0$.
The strong squeezing property implies the existence of an asymptotically complete inertial manifold which is obtained as the graph of a Lipschitz function $\Phi : P_n H \to Q_n H$, i.e. $\mathcal{M} = G[\Phi] = \{p + \Phi(p) : p \in P_n H\}$ with $|\Phi(p_1) - \Phi(p_2)| \le l\, |p_1 - p_2|$. Restricting (2.10) to $\mathcal{M}$ yields the ordinary differential equation for $p = P_n u$,
$$\frac{dp}{dt} + Ap = P_n N(p + \Phi(p)),$$
termed the inertial form. Different proofs are available in the literature (e.g. [3,24,25]). Let us remark here that the above result is not the strongest possible.
It is possible to ease the Lipschitz condition to allow for nonlinearities that contain first-order derivatives of u, resulting, however, in an even more restrictive spectral gap condition which is not satisfied by the Smoluchowski equation. The main idea of this paper is to eliminate the gradient from the nonlinearity of (2.1) through the transformation u = ψ exp(V /2), and then to apply Theorem 3.
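To make the spectral gap condition of Theorem 3 concrete for $A = -\Delta_m$, recall that $\lambda_k = k^2 + k$ and that only even $k$ occur for symmetric functions, so consecutive distinct eigenvalues differ by $\lambda_{k+2} - \lambda_k = 4k + 6$. The following sketch finds the first even level at which the gap exceeds $4C$; the Lipschitz constants passed in are hypothetical inputs, since the constant of the actual nonlinearity is not computed here.

```python
def eigenvalue(k):
    """Eigenvalue lambda_k = k^2 + k of A = -Laplace-Beltrami on the unit sphere."""
    return k * k + k

def first_even_level_with_gap(lipschitz_constant):
    """Smallest even k with lambda_{k+2} - lambda_k > 4 * lipschitz_constant."""
    k = 0
    # The gap 4k + 6 grows without bound, so the loop always terminates.
    while eigenvalue(k + 2) - eigenvalue(k) <= 4.0 * lipschitz_constant:
        k += 2
    return k

level_small = first_even_level_with_gap(1.0)   # hypothetical Lipschitz constant C = 1
level_large = first_even_level_with_gap(10.0)  # hypothetical Lipschitz constant C = 10
```

Because the gaps grow linearly in $k$, a gap of any prescribed size is eventually available once the nonlinearity is globally Lipschitz as a map from $H$ to $H$, which is the purpose of removing the gradient from the nonlinear term.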
3. Existence of Inertial Manifolds for the Smoluchowski Equation

3.1. Transformed equation. We would like to transform the Smoluchowski equation in a manner that would eliminate the gradient from the nonlinear term. It can be easily verified that functions $\psi$ and $V$ satisfy (2.1) if and only if $u = \psi \exp(V/2)$ and $V$ satisfy the equation
$$\partial_t u = \Delta_m u + \left( \frac{1}{2}\, \partial_t V + \Delta_m V - \frac{1}{2}\, |\nabla_m V|^2 \right) u.$$
However, due to the dependence of $V$ on $\psi$, this is not a closed equation in $u$, and it turns out that it is not possible to express $V$ as a function of $u$. We can circumvent this difficulty by first performing the following transformation:
$$\xi = \Lambda(\psi) = \psi - 2 P_2 \psi + d = \psi + cV + d,$$
where $c = \frac{15}{4b\pi}$ and $d = \frac{5}{2\pi}$. Notice that $\psi > 0$ implies $\xi > 0$ and $\int_{S^2} \psi = 1$ implies $\int_{S^2} \xi = 11$. It can be easily seen that $\psi$ satisfies the Smoluchowski equation if and only if $\xi$ satisfies
$$\partial_t \xi = \Delta_m \xi + \nabla_m \cdot (\xi \nabla_m V) + c\, (\partial_t V - \Delta_m V - |\nabla_m V|^2 - V \Delta_m V) - d\, \Delta_m V, \tag{3.11}$$
where $V = \frac{b}{2}\, w \cdot \langle w \rangle_\xi = -\frac{b}{2}\, w \cdot \langle w \rangle_\psi$. Similarly as above, this is equivalent to $u = \xi \exp(V/2)$ satisfying the equation
$$\partial_t u = \Delta_m u + e^{V/2} \left[ \left( \frac{1}{2}\, \partial_t V + \Delta_m V - \frac{1}{2}\, |\nabla_m V|^2 \right) \xi + c\, (\partial_t V - \Delta_m V - |\nabla_m V|^2 - V \Delta_m V) - d\, \Delta_m V \right]. \tag{3.12}$$
Equation (2.5) for the evolution of $V$ and the fact that $\Delta_m V = -6V$ allow us to rewrite this equation in the form
$$\partial_t u = \Delta_m u + F(\xi), \tag{3.13}$$
where $F(\xi) : S^2 \to \mathbb{R}$ depends Lipschitz-continuously on $m$, $\xi$ and the second and fourth moments of $\xi$. Our next goal is to express $V$ and therefore also $\xi$ as a function of $u$ in order to view (3.13) as a closed semilinear parabolic equation in $u$. To this end, we develop the following framework. If $u \in L^1(S^2)$, we define the transform $\tilde u \in C^\infty(\mathbb{R}^2)$,
$$\tilde u(x) = \int_{S^2} u(m)\, e^{-x \cdot w(m)}\, dm.$$
For $a \in \mathbb{R}^2$ we define $\mu_a u(m) := u(m)\, e^{-a \cdot w(m)} \in L^1(S^2)$, and so
$$\widetilde{\mu_a u}(x) = \int_{S^2} u(m)\, e^{-(x+a) \cdot w(m)}\, dm =: \tau_a \tilde u(x).$$
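The transform of $u$ and its behavior under the multiplier $\mu_a$ can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the test function `u` is a hypothetical positive symmetric function, and the integral over $S^2$ is approximated by a midpoint rule in $(\theta, \varphi)$.

```python
import math

def w_vec(theta, phi):
    """w = (w1, w2) in the local coordinates (theta, phi)."""
    return (math.sin(theta) ** 2 * math.cos(2.0 * phi),
            math.sqrt(3.0) * (math.cos(theta) ** 2 - 1.0 / 3.0))

def transform(u, x, n_theta=80, n_phi=160):
    """Midpoint-rule estimate of the integral of u(theta, phi) * exp(-x.w) over S^2."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            w = w_vec(theta, phi)
            total += u(theta, phi) * math.exp(-(x[0] * w[0] + x[1] * w[1])) * math.sin(theta)
    return total * d_theta * d_phi

def shifted(u, a):
    """Return the function mu_a u = u * exp(-a.w)."""
    def mu_a_u(theta, phi):
        w = w_vec(theta, phi)
        return u(theta, phi) * math.exp(-(a[0] * w[0] + a[1] * w[1]))
    return mu_a_u

def u(theta, phi):
    # Hypothetical positive symmetric test function.
    return 1.0 + 0.5 * math.cos(theta) ** 2

a, x, y = (0.3, -0.2), (0.1, 0.4), (-0.6, 0.2)
lhs = transform(shifted(u, a), x)                     # transform of mu_a u at x
rhs = transform(u, (x[0] + a[0], x[1] + a[1]))        # transform of u at x + a
midpoint = transform(u, ((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0))
chord = 0.5 * (transform(u, x) + transform(u, y))
```

The values `lhs` and `rhs` agree up to rounding, which is the translation property of the transform, and `midpoint` does not exceed `chord`, in line with the convexity of the transform for positive $u$.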
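We define the function spaces
$$X = \left\{ \xi \in L^2(S^2; \mathbb{R}^+) : \xi \text{ symmetric},\ \langle w \rangle_\xi \ne o,\ \int_{S^2} \xi = 11 \right\},$$
where $o = (0, 0)$, and
$$\mathcal{H} = \left\{ u \in L^2(S^2; \mathbb{R}^+) : u \text{ symmetric} \right\}.$$
Also, let
$$\mathcal{X} = \left\{ u \in \mathcal{H} : \int_{S^2} u > 11,\ \int_{S^2} \mu_a u < 11 \text{ for some } a \in \mathbb{R}^2 \right\},$$
which is an open subset of $\mathcal{H}$. For $u \in \mathcal{X}$ we have
$$\nabla \tilde u(x) = -\int_{S^2} \mu_x u(m)\, w(m)\, dm, \qquad \nabla\nabla \tilde u(x) = \int_{S^2} \mu_x u(m)\, (w(m) w(m))\, dm.$$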
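$\nabla\nabla \tilde u(x)$ is positive definite, since by Cauchy-Schwarz,
$$\det(\nabla\nabla \tilde u) = \langle w_1^2 \rangle \langle w_2^2 \rangle - \langle w_1 w_2 \rangle^2 > 0,$$
where $\langle f \rangle = \int_{S^2} f(m)\, \mu_x u(m)\, dm$, so $\tilde u$ is concave up. Since $u \in \mathcal{X}$, the level set
$$\Omega(u) = \{x \in \mathbb{R}^2 \mid \tilde u(x) \le 11\}$$
is a nonempty convex set, and the point $o \notin \Omega(u)$. Thus, there exists a unique point $r \in \partial\Omega(u)$ so that $|r| = \operatorname{dist}(\Omega(u), o)$. Note that $r$ is the unique point on $\partial\Omega(u)$ for which there exists $b > 0$ such that
$$\nabla \tilde u(r) = -\frac{4}{b}\, r.$$
We now define the mappings
$$\begin{aligned}
R &: \mathcal{X} \to \mathbb{R}^2, & u &\mapsto r, \\
\Xi &: \mathcal{X} \to X, & u &\mapsto \xi = \mu_{R(u)} u = u\, e^{-R(u) \cdot w}, \\
Y &: \mathcal{X} \to \mathbb{R}^2, & u &\mapsto -(\nabla \tilde u)(R(u)) = \int_{S^2} u(m)\, e^{-R(u) \cdot w(m)}\, w(m)\, dm = \langle w \rangle_{\Xi(u)}, \\
B &: \mathcal{X} \to \mathbb{R}^+, & u &\mapsto b = 4 |R(u)| / |Y(u)|.
\end{aligned}$$
Note the inequality $|R(u)| \le B(u)/4$. We will need the following:

Lemma 3. $R$, $\Xi$, $Y$, and $B$ are continuous functions on $\mathcal{X}$.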
Proof. We prove the continuity of $R$, and the continuities of $\Xi$, $Y$ and $B$ follow. To prove the statement by contradiction, we choose a sequence $(v_n)_{n \in \mathbb{N}}$ in $\mathcal{X}$ and $u \in \mathcal{X}$ such that $v_n \to u$ in $L^2(S^2)$. This obviously implies $\tilde v_n \to \tilde u$ and $\nabla \tilde v_n \to \nabla \tilde u$ in the sup norm. Let $r = R(u)$, $s_n = R(v_n)$, and suppose $s_n \not\to r$ as $n \to \infty$. Let $b_n = 4 |R(v_n)| / |Y(v_n)| = 4 |s_n| / |\nabla \tilde v_n(s_n)|$. One can easily observe that the sequence $(s_n)$ is bounded. Therefore, without loss of generality, we can assume that $s_n \to s \ne r$ as $n \to \infty$. Because of the convergence in the sup norm, $\tilde v_n(s_n) \to \tilde u(s)$ and $\nabla \tilde v_n(s_n) \to \nabla \tilde u(s)$. Therefore, $\tilde v_n(s_n) = 11$ implies $\tilde u(s) = 11$, and $b_n \to b := 4 |s| / |\nabla \tilde u(s)|$, so $\nabla \tilde u(s) = -\frac{4}{b}\, s$. This is a contradiction to $s \ne r$.

Corollary 1. Let $b > 0$ be fixed, and let $\mathcal{X}_b = B^{-1}\{b\}$. The function $\Xi_b = \Xi|_{\mathcal{X}_b} : \mathcal{X}_b \to X$ is a homeomorphism, and its inverse is given by
$$\Xi_b^{-1}(\xi) = \xi\, e^{(b/4) \langle w \rangle_\xi \cdot w}.$$
As already mentioned, $\xi(t) = \Lambda(\psi(t))$ is a solution of (3.11) for some $b > 0$, if and only if $u(t) = \xi(t)\, e^{V(t)/2} = \xi(t)\, e^{(b/4) \langle w \rangle_{\xi(t)} \cdot w} = \Xi_b^{-1}(\xi(t))$ satisfies (3.13). Denoting by
$$r(t) = \frac{b}{4}\, \langle w \rangle_{\xi(t)},$$
we have $\xi(t) = u(t) \exp(-r(t) \cdot w)$. Then, $\int_{S^2} \xi(m, t)\, dm = 11$ implies $\tilde u(r(t)) = 11$, and multiplying by $w$ and integrating over $S^2$, we obtain $\nabla \tilde u(r(t)) = -\frac{4}{b}\, r(t)$. Using the framework developed earlier, we conclude that $r(t) = R(u(t))$, $\xi(t) = \Xi(u(t))$ and $b = B(u(t))$, $t \ge 0$. Therefore, $u(t)$ satisfies the closed equation
$$u_t = \Delta_m u + F(\Xi(u)). \tag{3.14}$$
On the other hand, if $u(t) \in \mathcal{X}$, $t \ge 0$, is a solution to (3.14) for an initial datum $u_0 = \Xi_b^{-1}(\Lambda(\psi_0))$ for some $\psi_0 \in H$, it is immediate that $\psi(t) = \Lambda^{-1}(\Xi(u(t)))$ satisfies (2.1), and $b = B(u(t))$ is preserved by the solution operator.
3.2. Prepared equation. In order to apply the classical theory, we need the nonlinear term in (3.14) to satisfy a Lipschitz condition. Since this is not necessarily true, we have to modify Eq. (3.14) in a way that preserves its long-term behavior. This is done by modifying the nonlinear term outside of an absorbing set. The equation obtained in this way is referred to as the prepared equation. In order to see how we are to define the prepared equation, let us examine the Lipschitz continuity of the above defined functions. Lemma 4. The functions R|Xb , |Xb , Y |Xb , and F ◦ |Xb are Lipschitz continuous. In particular, b : Xb → X is a Lipschitz homeomorphism.
Proof. We prove the Lipschitz continuity of R, and the others follow. Let u, v ∈ Xb , and let r = R(u), s = R(v). The mean-value theorem implies the existence of θ1 ∈ [0, 1] and θ2 ∈ [0, 1] so that, with the convexity of u and v , we have 4 u (r) · (s − r) = − r(s − r) u (s) − u (r) = ∇ u (r + θ1 (s − r)) · (s − r) ≥ ∇ b and 4 v (s) · (r − s) = − s(r − s). v (r) − v (s) = ∇ v (s + θ2 (r − s)) · (r − s) ≥ ∇ b Adding both equations yields 4 (u(m) − v(m))(e−s·w(m) − e−r·w(m) ) dm ≥ |s − r|2 , 2 b S and therefore there exists C3 = C3 (b) so that 4 |s − r|2 ≤ C3 u − v2 |r − s|, b bC3 4 u − v2 . Let us now choose κ : IR+ → IR+ , continuous and increasing, such that the ball Bκ(b) = {u ∈ H : u2 < κ(b)} satisfies Bκ(b) ⊃ −1 b ( (Bρ(b) )). We will need the following
and so |R(u) − R(v)| ≤
Lemma 5. Let b1 > 0 and r1 > 0. Let B = Bκ(b1 ) ∩ B −1 (0, b1 ) ∩ R −1 {r ∈ IR2 : r1 < |r|}. Then R|B , |B , Y |B , B|B and F ◦ |B are Lipschitz continuous. Proof. Let u, v ∈ B, and let r = R(u), s = R(v). As before, we have u (s) − u (r) ≥
4 r(r − s) B(u)
v (r) − v (s) ≥
4 s(s − r). B(v)
and
Since r(r − s) + s(s − r) = |r − s|2 ≥ 0, we distinguish the following cases: Case 1. r(r − s) ≥ 0 and s(s − r) ≥ 0. In this case, similarly as in the previous lemma, we have 4 (u(m) − v(m))(e−s·w(m) − e−r·w(m) ) dm ≥ |s − r|2 , b2 S2 and so |R(u) − R(v)| ≤ b1 C34(b1 ) u − v2 . Case 2. r(r − s) < 0 and s ∈ (u). In this case, u (s) − u (r) > 0 >
4 r(r − s) B(v)
and v (r) − v (s) ≥
4 s(s − r), B(v)
and one arrives at the same conclusion as in the previous case.
Case 3. r(r − s) < 0 and s ∈ (u). Since s ∈ (u), there exists s ∈ ∂(u) ∩ [o, s]. Let v = µs−s v, and so v = τs−s v . Thus, v (s ) = v (s) = 11, so R(v ) = s follows. Another easy observation is that r(r − s ) ≤ 0, and so u (s ) − u (r) = 0 ≥
4 r(r − s ) B(v)
and 4 4 s (s − r) ≥ s (s − r), B(v ) B(v) and we conclude again |r − s | ≤ b1 C34(b1 ) u − v 2 . On the other hand, since s and s are collinear, v (r) − v (s ) ≥
v (s ) − v (s) ≥
4 4 4r1 |s − s |. s(s − s ) = |s||s − s | ≥ B(v) B(v) b2
1) v − u2 . The desired follows with the Since v (s) = u (s ), we have |s − s | ≤ b1 C4r3 (b 1 estimate 2 v − v 2 = (v(m) − v (m)) dm = v(m)2 (1 − e(s −s)·w(m) )2 dm 2 2 S2 S [(v)(m)]2 (es·w(m) − es ·w(m) )2 dm ≤ C42 |s − s |2 (v)22 =
S2
≤ C52 v − u2 . where the constants C4 and C5 depend on r1 and b1 only. Case 4. s(s − r) < 0. The inequalities for this case follow in an analogous fashion to the previous two cases. Lemma 6. Let 0 < b < b1 and r1 > r ∗ (b), and let B be defined as in the previous lemma. Let B = B ∪ (Xb ∩ Bκ(b) ). Then the functions R|B , |B , Y |B , B|B and F ◦ |B are Lipschitz continuous. Proof. Let us partition B into three regions: B1 = Xb ∩ Bκ(b) ∩ R −1 (0, r1 /2], B2 = Xb ∩ Bκ(b) ∩ R −1 (r1 /2, r1 ), and B. By the previous two lemmas, the functions are Lipschitz continuous on any of these three regions, as well as on sets B1 ∪ B2 and B2 ∪ B. The Lipschitz continuity on B1 ∪ B follows since both of these sets are bounded in L 2 and dist(R(B1 ), R(B)) > r1 /2. This implies the Lipschitz continuity on B . We can now define the prepared equation: F[(u)], if u ∈ B N P (u) = . 0, if u ∈ H\B2κ(b1 ) This is clearly a Lipschitz function on B ∪ (H\B2κ(b1 ) ). Denote by C > 0 its Lipschitz constant. Following [27], a Lipschitz-continuous function defined on a subset of a Hilbert space can be extended to a Lipschitz continuous function defined on the entire Hilbert space, even preserving the Lipschitz constant C > 0. Without changing the notation, let us by N P : H → IR denote such an extension. The prepared equation reads now ∂t u + Au = N P (u).
3.3. Main theorem. We are now in a position to prove the existence of the inertial manifold of the Smoluchowski equation (2.1).
Theorem 4. Let 0 < b ≠ 15/2. The Smoluchowski equation on the unit sphere with the Maier-Saupe potential possesses an asymptotically complete inertial manifold Mb .
Proof. The positivity of A and the Lipschitz continuity of N P ensure that the prepared equation generates a strongly continuous semigroup S P (t). The fact that N P vanishes outside of B2κ(b1 ) suffices to prove that the prepared equation is dissipative, and that it possesses a finite-dimensional global attractor A P . Also, by construction, −1 b ( (A)) ⊂ A P . There exists n ∈ IN such that λn+1 − λn = 2n + 2 > 4C, and the spectral gap condition is satisfied for the prepared equation. Theorem 3 applies, and we infer the existence of an asymptotically complete inertial manifold M P ⊃ A P for the prepared equation given as a graph of a Lipschitz function P : M P = G[ P ] = { p + P ( p) : p ∈ Pn H}. We now define Mb = Bρ(b) ∩ −1 ((M P )). Since b : Xb → X is a Lipschitz homeomorphism, it is immediate that Mb is a finite dimensional Lipschitz manifold. It is positively invariant under S(t), since both Bρ(b) and −1 (b (M P )) are positively invariant. It remains to prove that Mb is exponentially attracting and asymptotically complete. Let ψ0 ∈ H and ψ(t) = S(t)ψ0 . Let u(t) = −1 b ( (ψ(t))), t ≥ 0. Since the convergence to the isotropic steady-state ψ¯ = 1/4π is always exponential if b ≠ 15/2, we can assume without loss of generality that ψ(t) → ψ∗ as t → ∞ for an anisotropic state ψ∗ . Let u∗ = −1 b ( (ψ∗ )), so u(t) → u∗ as t → ∞. On the other hand, since M P is exponentially attracting and asymptotically complete, there exists v0 ∈ M P so that for v P (t) = S P (t)v0 we have ‖u(t) − v P (t)‖2 → 0, as t → ∞, exponentially. Thus, v P (t) → u∗ as t → ∞ as well. Since u∗ ∈ B, there exists T > 0 so that v P (t) ∈ B for t ≥ T .
However, since N P |B = N |B , B(v P (t)) = B(u∗ ) = b for t ≥ T . Therefore, σ (t) := −1 ((v P (t))) ∈ (M P ), t ≥ T is a solution of (2.1). For some T′ ≥ T we have σ (t) ∈ Bρ(b) , t ≥ T′ , and therefore σ (t) ∈ Mb , t ≥ T′ . Since b is Lipschitz continuous, ‖ψ(t) − σ (t)‖2 → 0, as t → ∞, exponentially. This concludes the proof.
Remark 1. In the preceding proof, the fact that the Smoluchowski equation is a gradient system was utilized. The issue of how the inertial manifold of a prepared equation relates to the inertial manifold of the original equation in general is addressed in [14].
Acknowledgements. I would like to thank Peter Constantin and Edriss Titi for introducing me to the problem of existence of inertial manifolds for the Smoluchowski equation, as well as for many helpful discussions. This work was supported in part by the NSF grant DMS-0733126.
References
1. Chow, S.-N., Lu, K., Sell, G.R.: Smoothness of inertial manifolds. J. Math. Anal. Appl. 169, 283–312 (1992)
2. Constantin, P.: Smoluchowski Navier-Stokes systems. Contemp. Math. 429, 85–109 (2007)
3. Constantin, P., Foias, C., Nicolaenko, B., Temam, R.: Integral and inertial manifolds for dissipative partial differential equations. Applied Math. Sciences 70, New York: Springer-Verlag, 1989
4. Constantin, P., Foias, C., Nicolaenko, B., Temam, R.: Spectral barriers and inertial manifolds for dissipative partial differential equations. J. Dynam. Diff. Eq. 1, 45–73 (1988)
5. Constantin, P., Kevrekidis, I., Titi, E.S.: Remarks on a Smoluchowski equation. Dis. Cont. Dynam. Syst. 11(1), 101–112 (2004)
6. Constantin, P., Kevrekidis, I., Titi, E.S.: Asymptotic states of a Smoluchowski equation. Arch. Rat. Mech. Anal. 174, 365–384 (2004)
7. Constantin, P., Titi, E.S., Vukadinovic, J.: Dissipativity and Gevrey regularity of a Smoluchowski equation. Indiana Univ. Math. J. 54(44), 949–970 (2005)
8. Constantin, P., Vukadinovic, J.: Note on the number of steady states for a 2D Smoluchowski equation. Nonlinearity 18, 441–443 (2005)
9. de Gennes, P.G., Prost, J.: The physics of liquid crystals. Oxford: Oxford University Press, 1993
10. Doi, M.: Molecular dynamics and rheological properties of concentrated solutions of rodlike polymers in isotropic and liquid crystalline phases. J. Polym. Sci., Polym. Phys. Ed. 19, 229–243 (1981)
11. Doi, M., Edwards, S.F.: The theory of polymer dynamics. London-New York: Oxford University Press (Clarendon), 1986
12. Faraoni, F., Grosso, M., Crescitelli, S., Maffetone, P.L.: The rigid rod model for nematic polymers: An analysis of the shear flow problem. J. Rheol. 43(3), 829–843 (1999)
13. Fatkullin, I., Slastikov, V.: Critical points of the Onsager functional on a sphere. Nonlinearity 18, 2565–2580 (2005)
14. Foias, C., Nicolaenko, B., Sell, G.R., Temam, R.: Inertial manifolds for the Kuramoto-Sivashinsky equation and an estimate of their lowest dimension. J. Math. Pures Appl. 67, 197–226 (1988)
15. Foias, C., Sell, G.R., Temam, R.: Variétés inertielles des équations différentielles dissipatives. C. R. Acad. Sci. Paris I 301, 285–288 (1985)
16. Foias, C., Sell, G.R., Temam, R.: Inertial manifolds for nonlinear evolutionary equations. J. Diff. Eq. 73, 309–353 (1988)
17. Hess, S.Z.: Fokker-Planck-equation approach to flow alignment in liquid crystals. Z. Naturforsch. A 31 A, 1034–1037 (1976)
18. Liu, H., Zhang, H., Zhang, P.: Axial symmetry and classification of stationary solutions of Doi-Onsager equation on the sphere with Maier-Saupe potential. Comm. Math. Sci. 3(2), 201–218 (2005)
19. Luo, C., Zhang, H., Zhang, P.: The structure of equilibrium solutions of one dimensional Doi equation. Nonlinearity 18, 379–389 (2005)
20. Maffettone, P.L., Crescitelli, S.: Bifurcation analysis of a molecular model for nematic polymers in shear flows. J. Non-Newtonian Fluid Mech. 59, 73–91 (1995)
21. Mallet-Paret, J., Sell, G.R.: Inertial manifolds for reaction diffusion equations in higher space dimensions. J. Amer. Math. Soc. 1, 805–866 (1988)
22. Maier, W., Saupe, A.: Eine einfache molekular-statistische Theorie der nematischen kristallinflüssigen Phase, Teil I. Z. Naturforsch. A 14 A, 882–889 (1959)
23. Onsager, L.: The effects of shape on the interaction of colloidal particles. Ann. N. Y. Acad. Sci 51, 627–659 (1949)
24. Robinson, J.C.: Inertial manifolds and the cone condition. Dyn. Sys. Appl. 2, 311–330 (1993)
25. Robinson, J.C.: A concise proof of the geometric construction of inertial manifolds. Phys. Lett. A 200, 415–417 (1995)
26. Vukadinovic, J.: Inertial manifolds for a Smoluchowski equation on a circle. Available at http://www.math.csi.cuny.edu/~vukadino/papers/pub7.pdf, 2007
27. Wells, J.H., Williams, L.R.: Embeddings and extensions in analysis. Ergebnisse der Mathematik und ihrer Grenzgebiete. New York-Heidelberg: Springer Verlag, 1975
Communicated by P. Constantin
Commun. Math. Phys. 285, 991–1004 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0562-x
W-Algebra W(2, 2) and the Vertex Operator Algebra L(1/2, 0) ⊗ L(1/2, 0)
Wei Zhang, Chongying Dong
Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail:
[email protected] Received: 27 November 2007 / Accepted: 28 February 2008 Published online: 27 June 2008 – © Springer-Verlag 2008
Abstract: In this paper the W-algebra W(2, 2) and its representation theory are studied. It is proved that a simple vertex operator algebra generated by two weight 2 vectors is either a vertex operator algebra associated to an irreducible highest weight W(2, 2)-module or a tensor product of two simple Virasoro vertex operator algebras. Furthermore, we show that any rational, C_2-cofinite and simple vertex operator algebra whose weight 1 subspace is zero, weight 2 subspace is 2-dimensional and with central charge c = 1 is isomorphic to L(1/2, 0) ⊗ L(1/2, 0).
1. Introduction
Motivated partially by the problem of classification of rational vertex operator algebras with central charge c = 1 and by the Frenkel-Lepowsky-Meurman uniqueness conjecture on the moonshine vertex operator algebra V♮ [FLM], we give a characterization of the vertex operator algebra L(1/2, 0) ⊗ L(1/2, 0) in terms of the central charge and the dimensions of the weight 1 and weight 2 subspaces in this paper. Here L(1/2, 0) is the vertex operator algebra associated to the irreducible highest weight module for the Virasoro algebra with central charge 1/2, which is the smallest central charge among the discrete unitary series for the Virasoro algebra.
The classification of rational conformal field theories with c = 1 at character level has been achieved in the physics literature under the assumption that the sum of the square of the norm of the irreducible characters is a modular function over the full modular group [K]. But the classification of rational vertex operator algebras with c = 1 remains open. If a vertex operator algebra V = ⊕_{n≥0} V_n with dim V_0 = 1 is rational and C_2-cofinite, then V_1 is a reductive Lie algebra and its rank is less than or equal to the effective central charge c̃ [DM1].
Also, the vertex operator subalgebra generated by V_1 is a tensor product of vertex operator algebras associated to integrable highest weight modules for affine Kac-Moody algebras and the lattice vertex operator algebra [DM2]. In the case that c = c̃ = 1, we can classify the vertex operator algebras with dim V_1 ≠ 0. Since V_1 is a reductive Lie algebra whose rank is less than or equal to 1, we immediately see that V_1 is either 1-dimensional or 3-dimensional. As a result, V is isomorphic to a vertex operator algebra associated to a rank 1 lattice. So one can assume that V_1 = 0. There are two cases: dim V_2 > 1 and dim V_2 = 1. L(1/2, 0) ⊗ L(1/2, 0) is the only known such vertex operator algebra whose weight two subspace is not one-dimensional. So a characterization of L(1/2, 0) ⊗ L(1/2, 0) can be regarded as a part of a program of classification of rational vertex operator algebras with c = 1.
Supported by NSF grants and a research grant from the Committee on Research, UC Santa Cruz.
The vertex operator algebra L(1/2, 0) ⊗ L(1/2, 0) plays an important role in the study of the moonshine vertex operator algebra V♮. The moonshine vertex operator algebra V♮, which is fundamental in shaping the field of vertex operator algebras, was constructed as a bosonic orbifold theory based on the Leech lattice [FLM]. The discovery of the existence of L(1/2, 0)^{⊗48} inside the moonshine vertex operator algebra V♮ [DMZ] opens a different way to study V♮. This leads to the theory of code and framed vertex operator algebras [M2,DGH]. This discovery is also essential in a proof that V♮ is holomorphic [D], a new construction of V♮ [M3], proofs of weak versions of the Frenkel-Lepowsky-Meurman uniqueness conjecture on V♮ [DGL,LY] and a study of V♮ in terms of conformal nets [KL]. There is no doubt that a characterization of L(1/2, 0) ⊗ L(1/2, 0) will be very helpful in the study of the structure of V♮ and the Frenkel-Lepowsky-Meurman uniqueness conjecture.
W(2, 2) and its highest weight modules enter the picture naturally during our discussion of L(1/2, 0) ⊗ L(1/2, 0). The W-algebra W(2, 2) is an extension of the Virasoro algebra and also has a very good highest weight module theory (see Sect. 2).
Its highest weight modules produce a new class of vertex operator algebras. In contrast to the Virasoro algebra case, the vertex operator algebras in this class are always irrational. From this point of view, this class of vertex operator algebras is not interesting. W(2, 2) and the associated vertex operator algebras are also closely related to the classification of the simple vertex operator algebras with two generators. It is well known that each homogeneous subspace V_n of a vertex operator algebra V = ⊕_{n∈Z} V_n is some kind of algebra under the product u · v = u_{n−1} v for u, v ∈ V_n, where u_{n−1} is the component operator of Y(u, z) = Σ_{m∈Z} u_m z^{−m−1}. If a vertex operator algebra V = ⊕_{n≥0} V_n with dim V_0 = 1 is rational and C_2-cofinite, then V_1 and the vertex operator subalgebra generated by V_1 are well understood [DM1]. So it is natural to turn our attention to V_2. This is still a very hard problem even with V_1 = 0. A simple vertex operator algebra V satisfying V_1 = 0 is said to be of the moonshine type. The V_2 in this case is a commutative nonassociative algebra. The simple vertex operator algebras of the moonshine type with dim V_2 = 2 and generated by V_2 are also classified in this paper. There are two families of such algebras. One family consists of the tensor products of two vertex operator algebras associated to the irreducible highest weight modules for the Virasoro algebra, and the other family consists of the vertex operator algebras associated to the highest weight modules for the W-algebra W(2, 2).
The paper is organized as follows. We define and study the W-algebra W(2, 2) in Sect. 2. In particular we use the bilinear form on Verma modules V(c, h_1, h_2) to determine the irreducible quotient modules L(c, h_1, h_2) for W(2, 2) for most c and h_i. In Sect. 3 we classify the simple vertex operator algebras of the moonshine type generated by two weight 2 vectors.
Section 4 is devoted to the characterization of rational vertex operator algebra L(1/2, 0) ⊗ L(1/2, 0). The main idea is to use the modular invariance of the graded characters of the irreducible modules [Z] to control the growth of the graded dimensions of the vertex operator algebra.
2. W-Algebra W(2, 2)
The W-algebra W(2, 2) considered in this paper is an infinite dimensional Lie algebra with basis L_m, W_m, C for m ∈ Z and Lie brackets
[L_m, L_n] = (m − n) L_{m+n} + \frac{m^3 − m}{12} \delta_{m+n,0} C,   (2.1)
[L_m, W_n] = (m − n) W_{m+n} + \frac{m^3 − m}{12} \delta_{m+n,0} C,   (2.2)
[W_m, W_n] = 0   (2.3)
for m, n ∈ Z, where C is a central element. Since the adjoint action of L_0 is semisimple with integral eigenvalues, W(2, 2) is a Z-graded algebra and has the triangular decomposition
W(2, 2) = W_+ ⊕ W_{(0)} ⊕ W_−,
where W_{(n)} = ℂL_{−n} ⊕ ℂW_{−n} for n ≠ 0, W_{(0)} = ℂL_0 ⊕ ℂW_0 ⊕ ℂC, W_+ = ⊕_{n≥1} W_{(n)} and W_− = ⊕_{n≥1} W_{(−n)}. In this section we study the highest weight modules for this algebra and the corresponding vertex operator algebras.
Let c, h_1, h_2 ∈ ℂ, and denote by V(c, h_1, h_2) the Verma module for W(2, 2) with central charge c and highest weight (h_1, h_2). Then V(c, h_1, h_2) = U(W(2, 2))/I_{c,h_1,h_2}, where I_{c,h_1,h_2} is the left ideal of the universal enveloping algebra U(W(2, 2)) generated by L_m, W_m, C − c, L_0 − h_1 and W_0 − h_2 for positive m. The module V(c, h_1, h_2) can also be realized as an induced module, as in the case of the Virasoro algebra. By the PBW theorem, V(c, h_1, h_2) has basis
{W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1 | m_1 ≥ ··· ≥ m_s ≥ 1, n_1 ≥ ··· ≥ n_t ≥ 1},
where 1 = 1 + I_{c,h_1,h_2}. Then V(c, h_1, h_2) is graded by the L_0-eigenvalues:
V(c, h_1, h_2) = ⊕_{n≥0} V(c, h_1, h_2)_{n+h_1},
where V(c, h_1, h_2)_{n+h_1} = {v ∈ V(c, h_1, h_2) | L_0 v = (n + h_1)v} is spanned by W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1 with m_1 + ··· + m_s + n_1 + ··· + n_t = n. Note that V(c, h_1, h_2)_{h_1} = ℂ1.
A highest weight W(2, 2)-module is a quotient module of the Verma module with the same central charge and highest weight. It is standard that V(c, h_1, h_2) has a unique maximal submodule J(c, h_1, h_2), so that L(c, h_1, h_2) = V(c, h_1, h_2)/J(c, h_1, h_2) is an irreducible highest weight module.
As in the case of the Virasoro algebra, there is an anti-involution α for W(2, 2) defined by α(L_n) = L_{−n}, α(W_n) = W_{−n}, α(C) = C. The map α can be extended to an anti-involution of U(W(2, 2)). So we get a symmetric bilinear form (,) on V(c, h_1, h_2) by
(A1, B1) 1 = P_{h_1}(α(A)B1)   (2.4)
for A, B ∈ U(W(2, 2)), where P_{h_1} is the projection from V(c, h_1, h_2) to V(c, h_1, h_2)_{h_1}. Then the bilinear form is invariant in the sense:
(L_m u, v) = (u, L_{−m} v), (W_m u, v) = (u, W_{−m} v), (1, 1) = 1   (2.5)
for u, v ∈ V(c, h_1, h_2) and m ∈ Z. Moreover, the radical of this bilinear form is exactly the maximal submodule J(c, h_1, h_2). Let X be a proper submodule of V(c, h_1, h_2). Then X is a submodule of J(c, h_1, h_2) and the bilinear form (,) on V(c, h_1, h_2) induces an invariant symmetric bilinear form (,) on the quotient module V(c, h_1, h_2)/X.
As in the classical case we need to answer the basic question: What is J(c, h_1, h_2)? We first consider the case (c, h_1, h_2) = (c, 0, 0). Clearly, L(0, 0, 0) = ℂ. So we now assume that c ≠ 0. Note that U(W(2, 2))L_{−1}1 + U(W(2, 2))W_{−1}1 is a proper submodule of V(c, 0, 0).
Theorem 2.1. If c ≠ 0, then J(c, 0, 0) = U(W(2, 2))L_{−1}1 + U(W(2, 2))W_{−1}1 and L(c, 0, 0) has a basis
{W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1 | m_1 ≥ ··· ≥ m_s > 1, n_1 ≥ ··· ≥ n_t > 1},   (2.6)
where 1 is the canonical highest weight vector of L(c, 0, 0).
Proof. Set V̄(c, 0, 0) = V(c, 0, 0)/(U(W(2, 2))L_{−1}1 + U(W(2, 2))W_{−1}1) and let S be the set consisting of the vectors given by (2.6), with 1 being the canonical highest weight vector of V̄(c, 0, 0). Then S forms a basis of V̄(c, 0, 0) by the PBW theorem. For n ≥ 0 we set S_n = S ∩ V̄(c, 0, 0)_n. We prove the irreducibility of V̄(c, 0, 0) by showing that the Gram matrix of (,) with respect to the basis S_n of V̄(c, 0, 0)_n is nondegenerate for all n ≥ 0. For short we set
u(m_1, ..., m_s; n_1, ..., n_t) = W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1
with m_1 ≥ ··· ≥ m_s > 1, n_1 ≥ ··· ≥ n_t > 1. Let P = {(m_1, ..., m_s) | s ≥ 1, m_1 ≥ ··· ≥ m_s > 1}, which is the set of partitions with no part equal to 1. We define a total order on P so that (m_1, ..., m_s) > (n_1, ..., n_t) if there exists 1 ≤ k ≤ s such that m_i = n_i for i < k and m_k > n_k. For n ≥ 0 we define a total order on S_n as follows: u(m_1, ..., m_s; n_1, ..., n_t) > u(k_1, ..., k_p; l_1, ..., l_q) if
(a) Σ m_i < Σ k_j (if there is no m term, Σ m_i is understood to be 0, and similarly for Σ k_j), or
(b) Σ m_i = Σ k_j and (m_1, ..., m_s) > (k_1, ..., k_p), or
(c) Σ m_i = Σ k_j, (m_1, ..., m_s) = (k_1, ..., k_p) and (n_1, ..., n_t) < (l_1, ..., l_q).
For example, S_6 is ordered in the following way from the largest to the smallest:
L_{−2}^3 1, L_{−3}^2 1, L_{−4}L_{−2} 1, L_{−6} 1, W_{−2}L_{−2}^2 1, W_{−2}L_{−4} 1, W_{−3}L_{−3} 1, W_{−4}L_{−2} 1, W_{−2}^2 L_{−2} 1, W_{−6} 1, W_{−4}W_{−2} 1, W_{−3}^2 1, W_{−2}^3 1.
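The ordering on S_n can be checked mechanically. The following Python sketch enumerates S_n and sorts it from the largest element to the smallest according to rules (a), (b), (c). The function names `partitions_no_ones` and `ordered_basis` are illustrative choices and do not come from the paper; a basis vector W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1 is encoded, by assumption, as the pair of tuples ((m_1, ..., m_s), (n_1, ..., n_t)).

```python
def partitions_no_ones(n, max_part=None):
    """Yield the partitions of n with all parts >= 2, as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    # The largest part runs from min(n, max_part) down to 2.
    for first in range(min(n, max_part), 1, -1):
        for rest in partitions_no_ones(n - first, first):
            yield (first,) + rest


def ordered_basis(n):
    """Return S_n as (W-partition, L-partition) pairs, largest element first."""
    basis = [
        (w, l)
        for k in range(n + 1)
        for w in partitions_no_ones(k)
        for l in partitions_no_ones(n - k)
    ]
    # Largest first means: smaller sum of W-indices first (rule a), then the
    # larger W-partition first (rule b), then the smaller L-partition first
    # (rule c). Negating the W-parts turns "larger first" into an ascending key.
    basis.sort(key=lambda u: (sum(u[0]), tuple(-p for p in u[0]), u[1]))
    return basis


if __name__ == "__main__":
    for w, l in ordered_basis(6):
        print("W" + str(w), "L" + str(l))
```

For n = 6 this produces 13 pairs, beginning with ((), (2, 2, 2)), that is L_{−2}^3 1, and ending with ((2, 2, 2), ()), that is W_{−2}^3 1.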
Observe that if (m_1, ..., m_s) ∈ P and m ≥ m_1 then
L_m W_{−m_1} ··· W_{−m_s} 1 = \frac{m^3 − m}{12} c \frac{\partial}{\partial W_{−m}} W_{−m_1} ··· W_{−m_s} 1.
Using (2.4) and (2.5) we immediately see that
(L_{−n_1} ··· L_{−n_t} 1, W_{−m_1} ··· W_{−m_s} 1) 1 = L_{n_t} ··· L_{n_1} W_{−m_1} ··· W_{−m_s} 1 = 0
if (m_1, ..., m_s) < (n_1, ..., n_t). Since c ≠ 0,
(L_{−m_1} ··· L_{−m_s} 1, W_{−m_1} ··· W_{−m_s} 1) 1 = L_{m_s} ··· L_{m_1} W_{−m_1} ··· W_{−m_s} 1 ≠ 0.
Let u = W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1, v = W_{−k_1} ··· W_{−k_p} L_{−l_1} ··· L_{−l_q} 1 ∈ S_n. Again using (2.4) and (2.5) we have
(u, v) 1 = L_{n_t} ··· L_{n_1} W_{−k_1} ··· W_{−k_p} W_{m_s} ··· W_{m_1} L_{−l_1} ··· L_{−l_q} 1 = L_{l_q} ··· L_{l_1} W_{−m_1} ··· W_{−m_s} W_{k_p} ··· W_{k_1} L_{−n_1} ··· L_{−n_t} 1,
where we have used the fact that the W_m commute with each other (2.3). It is clear that (u, v) = 0 if Σ m_i > Σ l_j or Σ k_i > Σ n_j. If Σ m_i = Σ l_j and Σ n_i = Σ k_j we have
(u, v) = (W_{−m_1} ··· W_{−m_s} 1, L_{−l_1} ··· L_{−l_q} 1)(W_{−k_1} ··· W_{−k_p} 1, L_{−n_1} ··· L_{−n_t} 1).
This implies that (u, v) = 0 if either (l_1, ..., l_q) > (m_1, ..., m_s) or (n_1, ..., n_t) > (k_1, ..., k_p).
Now let s be the cardinality of S_n. We can label the vectors in S_n by u_1, ..., u_s such that u_i > u_j if i > j. Set A = (a_{ij}), where a_{ij} = (u_{s+1−i}, u_j) for i, j = 1, ..., s. Note that if u_i = W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1, then u_{s+1−i} = W_{−n_1} ··· W_{−n_t} L_{−m_1} ··· L_{−m_s} 1. It immediately follows that a_{ii} = (u_{s+1−i}, u_i) ≠ 0 for all i. We prove next that a_{ij} = 0 if i > j. Let u_i = W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1 and u_j = W_{−k_1} ··· W_{−k_p} L_{−l_1} ··· L_{−l_q} 1. So
u_{s+1−i} = u(n_1, ..., n_t; m_1, ..., m_s), u_j = u(k_1, ..., k_p; l_1, ..., l_q).
Since u_i > u_j, either (a) Σ m_i < Σ k_j, or (b) Σ m_i = Σ k_j and (m_1, ..., m_s) > (k_1, ..., k_p), or (c) Σ m_i = Σ k_j, (m_1, ..., m_s) = (k_1, ..., k_p) and (n_1, ..., n_t) < (l_1, ..., l_q). From the discussion above, it is clear that (u_{s+1−i}, u_j) = 0 in all cases. That is, a_{ij} = 0 if i > j. As a result, the Gram matrix A is an upper triangular matrix with every entry on the diagonal being nonzero.
This shows that V̄(c, 0, 0) is irreducible and L(c, 0, 0) = V̄(c, 0, 0).
Remark 2.2. Although W(2, 2) is an extension of the Virasoro algebra, the representation theory for W(2, 2) is different from that for the Virasoro algebra in a fundamental way. For W(2, 2), the structure of L(c, 0, 0) for c ≠ 0 is uniform and simple. But for the Virasoro algebra, the situation is totally different. Let L(c, h) be the irreducible highest weight module for the Virasoro algebra with central charge c and highest weight h. In the case c = c_{s,t} = 1 − 6(s − t)^2/(st), where s, t are two coprime positive integers with 1 < s < t, we have L(c, 0) ≠ V̄(c, 0), where V̄(c, 0) = V(c, 0)/U(Vir)L_{−1}v and v is a nonzero highest weight vector of the Verma module V(c, 0) (see [FF]). The structure of L(c_{s,t}, 0) is much more complicated. On the other hand, from the point of view of vertex operator algebras, L(c_{s,t}, 0) is a rational vertex operator algebra for all c_{s,t}, but L(c, 0) is not if c ≠ c_{s,t} (see [FZ] and [W]).
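As a small worked illustration of the formula for c_{s,t} in Remark 2.2, the sketch below evaluates it in exact rational arithmetic. The function name `minimal_model_charge` is an illustrative choice, not notation from the paper; the input check simply mirrors the stated hypotheses (coprime integers with 1 < s < t).

```python
from fractions import Fraction
from math import gcd


def minimal_model_charge(s, t):
    """Return c_{s,t} = 1 - 6 (s - t)^2 / (s t) as an exact fraction."""
    if not (1 < s < t) or gcd(s, t) != 1:
        raise ValueError("need coprime integers with 1 < s < t")
    return 1 - Fraction(6 * (s - t) ** 2, s * t)


if __name__ == "__main__":
    for s, t in [(2, 3), (3, 4), (4, 5)]:
        print(s, t, minimal_model_charge(s, t))
```

In particular (s, t) = (3, 4) gives c = 1/2, the central charge of the algebra L(1/2, 0) that the paper characterizes.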
Next we discuss the vertex operator algebras associated to the highest weight modules for W(2, 2). Let 1 be the canonical highest weight vector of V(c, 0, 0). By the axioms of a vertex operator algebra we must quotient out the submodule generated by L_{−1}1. From the commutator relation (2.2) we know that the W_n should be the component operators of a weight two vector. That is, there is a weight two vector x such that
Y(x, z) = Σ_{n∈Z} x_n z^{−n−1} = Σ_{n∈Z} W_n z^{−n−2}.
Moreover, we must quotient out the submodule generated by W_{−1}1, since L_{−1}W_0 1 = [L_{−1}, W_0]1 + W_0 L_{−1}1 = −W_{−1}1 + W_0 L_{−1}1 by (2.2). A W(2, 2)-module M is restricted if for any w ∈ M, L_m w = W_m w = 0 if m is sufficiently large. Recall the notions of weak module, admissible module and ordinary module from [DLM1].
Theorem 2.3. Assume that c ≠ 0. Then
(1) There is a unique vertex operator algebra structure on L(c, 0, 0) with the vacuum vector 1 and the Virasoro element ω = L_{−2}1. Moreover, L(c, 0, 0) is generated by ω and x = W_{−2}1 with Y(ω, z) = Σ_{n∈Z} L_n z^{−n−2} and Y(x, z) = Σ_{n∈Z} W_n z^{−n−2}.
(2) If M is a restricted W(2, 2)-module with central charge c, then M is a weak L(c, 0, 0)-module with Y_M(ω, z) = Σ_{n∈Z} L_n z^{−n−2} and Y_M(x, z) = Σ_{n∈Z} W_n z^{−n−2}. In particular, any quotient module of V(c, h_1, h_2) is an ordinary module for L(c, 0, 0).
(3) Any irreducible admissible L(c, 0, 0)-module is ordinary.
(4) {L(c, h_1, h_2) | h_i ∈ ℂ} gives a complete list of irreducible L(c, 0, 0)-modules up to isomorphism.
Proof. (1) and (2) are fairly standard, following from the local system theory (see [L2,LL]). (3) and (4) follow from the fact that any irreducible admissible module for L(c, 0, 0) is an irreducible highest weight module for W(2, 2).
We now turn our attention to the Verma module V(c, h_1, h_2) in general. As in general highest weight module theory, we want to know when V(c, h_1, h_2) = L(c, h_1, h_2), that is, when V(c, h_1, h_2) is irreducible.
Theorem 2.4. The Verma module V(c, h_1, h_2) is irreducible if and only if \frac{m^2 − 1}{12} c + 2h_2 ≠ 0 for any nonzero integer m.
Proof. The proof is similar to that of Theorem 2.1. Note that V(c, h_1, h_2) = ⊕_{n≥0} V(c, h_1, h_2)_{h_1+n}. By the PBW theorem, V(c, h_1, h_2)_{h_1+n} has a basis S_n consisting of the vectors W_{−m_1} ··· W_{−m_s} L_{−n_1} ··· L_{−n_t} 1, where m_1 ≥ ··· ≥ m_s > 0, n_1 ≥ ··· ≥ n_t > 0, Σ m_i + Σ n_j = n. We also define a total order on S_n as before. Let S_n = {u_1, ..., u_s} with u_i < u_j if i < j. Set A_n = (a_{ij}), where a_{ij} = (u_{s+1−i}, u_j). Then V(c, h_1, h_2) = L(c, h_1, h_2) if and only if det A_n ≠ 0 for all n > 0. Note that if m ≥ m_1 ≥ ··· ≥ m_s > 0,
L_m W_{−m_1} ··· W_{−m_s} 1 = \left( \frac{m^3 − m}{12} c + 2m h_2 \right) \frac{\partial}{\partial W_{−m}} W_{−m_1} ··· W_{−m_s} 1.
If \frac{m^2 − 1}{12} c + 2h_2 ≠ 0 for all 0 ≠ m ∈ Z, we see immediately that the same argument used in the proof of Theorem 2.1 works for V(c, h_1, h_2). That is, A_n is an upper triangular matrix with every entry on the diagonal being nonzero for all n ≥ 0, and V(c, h_1, h_2) is irreducible in this case. If \frac{m^2 − 1}{12} c + 2h_2 = 0 for some 0 < m, then A_m is still an upper triangular matrix and one of the entries on the diagonal is (L_{−m}1, W_{−m}1), which is the coefficient of 1 in L_m W_{−m}1 = 0. As a result, det A_m = 0. The proof is complete.
It is certainly interesting to determine J(c, h_1, h_2) if \frac{m^2 − 1}{12} c + 2h_2 = 0 for some nonzero integer m. But this problem has nothing to do with the characterization of L(1/2, 0) ⊗ L(1/2, 0) in this paper, and we will not go further in this direction.
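The criterion of Theorem 2.4 is easy to test for rational parameters. The sketch below computes the diagonal Gram entry (L_{−m}1, W_{−m}1) = \frac{m^3 − m}{12} c + 2m h_2 and lists the positive levels m, up to a chosen bound, at which it vanishes. The function names and the search bound `max_level` are assumptions made for illustration only; since the condition depends on m only through m^2, restricting to positive m loses nothing, but the finite bound means an empty result does not by itself prove irreducibility.

```python
from fractions import Fraction


def diagonal_entry(m, c, h2):
    """Coefficient of 1 in L_m W_{-m} 1, i.e. (m^3 - m)/12 * c + 2 m h2."""
    return Fraction(m**3 - m, 12) * c + 2 * m * h2


def reducible_levels(c, h2, max_level):
    """Levels 1 <= m <= max_level at which (m^2 - 1)/12 * c + 2 h2 vanishes."""
    return [m for m in range(1, max_level + 1) if diagonal_entry(m, c, h2) == 0]


if __name__ == "__main__":
    # h2 = 0: the entry vanishes at m = 1, matching the submodule generated by W_{-1} 1.
    print(reducible_levels(Fraction(1, 2), 0, 20))
    # c = 24, h2 = -3: the entry equals 2 m (m^2 - 4), which vanishes at m = 2.
    print(reducible_levels(24, -3, 20))
```

For a complete answer one could instead solve m^2 = 1 − 24 h_2/c directly and test whether the right-hand side is the square of a nonzero integer.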
3. Vertex Operator Algebras of the Moonshine Type
Motivated by the moonshine vertex operator algebra V♮ [FLM], we say that a vertex operator algebra V = ⊕_{n∈Z} V_n is of the moonshine type if V_1 = 0. In this section we classify the simple vertex operator algebras V of the moonshine type such that V is generated by V_2 and V_2 is 2-dimensional. Note that V_0 = ℂ1 is 1-dimensional for a vertex operator algebra V of the moonshine type and V_n = 0 if n < 0 by Lemma 7.1 of [DGL]. Since V_1 = 0 and V_0 is 1-dimensional, there is a unique symmetric, nondegenerate invariant bilinear form (,) on V such that (1, 1) = 1 (see [L1]). Then for any u, v, w ∈ V,
(Y(u, z)v, w) = (v, Y(e^{L(1)z}(−z^{−2})^{L(0)}u, z^{−1})w)
and (u, v)1 = Res_z z^{−1} Y(e^{L(1)z}(−z^{−2})^{L(0)}u, z^{−1})v. In particular, the restriction of the form to each homogeneous subspace V_n is nondegenerate and (u_{n+1}v, w) = (v, u_{−n+1}w) for all u, v ∈ V_2 and w ∈ V. The V_2 is a commutative and associative algebra with the product ab = a_1 b for a, b ∈ V_2 and the identity ω/2 (cf. [FLM]). The V_2 is called the Griess algebra of V. Note that for a, b ∈ V_2 we have (a, b)1 = a_3 b. Moreover, the form on V_2 is associative. That is, (ab, c) = (a, bc) for a, b, c ∈ V_2.
Theorem 3.1. Let V be a simple vertex operator algebra of the moonshine type with central charge c ≠ 0 such that V is generated by V_2 and V_2 is 2-dimensional. Then V is isomorphic to L(c_1, 0) ⊗ L(c_2, 0) for some nonzero complex numbers c_1, c_2 such that c_1 + c_2 = c if V_2 is semisimple, and isomorphic to L(c, 0, 0) if V_2 is not semisimple.
Proof. Assume that V_2 is a 2-dimensional semisimple commutative associative algebra with the identity ω/2. Then ω/2 is a sum of two primitive idempotents ω^1/2 and ω^2/2. It follows from [M1] that ω^1 and ω^2 are Virasoro vectors. Let
Y(ω^i, z) = Σ_{n∈Z} L^i(n) z^{−n−2}
for i = 1, 2. Then
[L^i(m), L^i(n)] = (m − n) L^i(m + n) + \frac{m^3 − m}{12} \delta_{m+n,0} c_i
for all m, n ∈ Z, where c_i ∈ ℂ is the central charge of ω^i. Since \frac{ω^i}{2} \frac{ω^j}{2} = \delta_{i,j} \frac{ω^i}{2}, we see that (ω^1)_3 ω^2 = (ω^1, ω^2)1 = 0 by using the invariance property of the bilinear form. This implies that
[L^1(m), L^2(n)] = 0
for all m, n ∈ Z and c_1 + c_2 = c. Then V = ⟨ω^1⟩ ⊗ ⟨ω^2⟩, where ⟨ω^i⟩ is the vertex operator subalgebra of V generated by ω^i (with a different Virasoro vector). Note that ⟨ω^i⟩ is a quotient of V̄(c_i, 0). Since V is simple we immediately have that ⟨ω^i⟩ is isomorphic to L(c_i, 0). As a result, V is isomorphic to L(c_1, 0) ⊗ L(c_2, 0) in this case.
It remains to deal with the case that V_2 is not semisimple. In this case the Jacobson radical J of V_2 is 1-dimensional. Assume that J = ℂx. Then x^2 = 0 and (x, x) = (ω/2, x^2) = 0. Using the skew symmetry Y(x, z)x = e^{L(−1)z} Y(x, −z)x we see that
x_0 x = −x_0 x + L(−1)x_1 x = −x_0 x + L(−1)x^2 = −x_0 x.
This implies x_0 x = 0. As a consequence, we see that the component operators x_n of Y(x, z) commute with each other. That is, [x_m, x_n] = 0 for all m, n ∈ Z. Note that (ω, ω)1 = L(2)ω = \frac{c}{2} 1. Since the form (,) on V_2 is nondegenerate, we may choose x so that (ω, x) = c/2. Set W(m) = x_{m+1} for m ∈ Z. Then we have the following commutator formula
[L(m), W(n)] = (m − n) W(m + n) + \frac{m^3 − m}{12} \delta_{m,−n} c.
This exactly says that the operators $L(m)$, $W(m)$, $c$ generate a copy of $W(2, 2)$ and $V$ is an irreducible highest weight module for $W(2, 2)$. Hence $V$ is isomorphic to $L(c, 0, 0)$, as desired.

Remark 3.2. Theorem 3.1 is the main reason we introduce and study the Lie algebra $W(2, 2)$ and its highest weight modules. The vertex operator algebra $L(c, 0, 0)$ will be used in the next section when we characterize the rational vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$.

4. Characterization of L(1/2, 0) ⊗ L(1/2, 0)

In this section we give a characterization for the vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$. We first recall some basic facts about a rational vertex operator algebra following [DLM1]. A vertex operator algebra $V$ is called rational if any admissible module is completely reducible. It is proved in [DLM1] (also see [Z]) that if $V$ is rational then there are only finitely many irreducible admissible modules $M^1, \ldots, M^k$ up to isomorphism such that
$$M^i = \oplus_{n\geq 0} M^i_{\lambda_i+n},$$
where $\lambda_i \in \mathbb{Q}$, $M^i_{\lambda_i} \neq 0$ and each $M^i_{\lambda_i+n}$ is finite dimensional (see [AM] and [DLM2]). Let $\lambda_{\min}$ be the minimum of the $\lambda_i$'s. The effective central charge $\tilde c$ is defined as $c - 24\lambda_{\min}$. A vertex operator algebra is called $C_2$-cofinite if $C_2(V)$ has finite codimension, where $C_2(V) = \langle u_{-2}v \mid u, v \in V\rangle$.

Let $f(z) = q^{\lambda}\sum_{n\geq 0} a_n q^n$ be either a formal power series in $z$ or a complex function. We say that the coefficients of $f(z)$ satisfy the polynomial growth condition if there exist positive numbers $A$ and $\alpha$ such that $|a_n| \leq A n^{\alpha}$ for all $n$.

For each $M^i$ we define the q-character of $M^i$ by
$$\mathrm{ch}_q M^i = q^{-c/24}\sum_{n\geq 0}(\dim M^i_{\lambda_i+n})\, q^{n+\lambda_i}.$$
Then $\mathrm{ch}_q M^i$ converges to a holomorphic function on the upper half plane if $V$ is $C_2$-cofinite [Z]. Using the modular invariance result from [Z] and results on vector valued modular forms from [KM] we have (see [DM1])

Lemma 4.1. Let $V$ be rational and $C_2$-cofinite. For each $i$, the coefficients of $\eta(q)^{\tilde c}\,\mathrm{ch}_q M^i$ satisfy the polynomial growth condition, where
$$\eta(q) = q^{1/24}\prod_{n\geq 1}(1 - q^n).$$
We also need some basic facts about the highest weight modules for the Virasoro algebra (see [FF,FQS,GKO,FZ,W]).

Proposition 4.2. Let $c$ be a complex number.
(1) $\bar V(c, 0)$ is a vertex operator algebra and $L(c, 0)$ is a simple vertex operator algebra.
(2) If $c \neq c_{s,t} = 1 - 6(s - t)^2/st$ for all coprime positive integers $s, t$ with $1 < s < t$, then $\bar V(c, 0) = L(c, 0)$, and $L(c, 0)$ is not rational. In this case, the q-character of $L(c, 0)$ is equal to $\frac{q^{-c/24}}{\prod_{n>1}(1 - q^n)}$ and the coefficients grow faster than any polynomial.
(3) If $c = c_{s,t}$ for some $s, t$, then $\bar V(c, 0) \neq L(c, 0)$, and $L(c, 0)$ is rational.

From now on we assume that $V$ is a rational and $C_2$-cofinite vertex operator algebra of the moonshine type such that $c = \tilde c = 1$ and $\dim V_2 = 2$. We have already mentioned in Sect. 3 that $V_2$ is a commutative associative algebra with identity $\omega/2$.

Lemma 4.3. The algebra $V_2$ is a semisimple associative algebra. That is, $V_2$ is a direct sum of two ideals isomorphic to $\mathbb{C}$.

Proof. Suppose that $V_2$ is not semisimple. Recall from the proof of Theorem 3.1 that the Jacobson radical $J = \mathbb{C}x$ is one-dimensional. We assume that $(\omega, x) = 1$. Then the component operators $W(n)$ of $Y(x, z) = \sum_{n\in\mathbb{Z}} W(n) z^{-n-2}$ and the component operators $L(n)$ of $Y(\omega, z)$ generate a copy of the W-algebra $W(2, 2)$ with central charge 1. Let $U$ be the vertex operator subalgebra of $V$ generated by $V_2$. Then $U$ is a highest weight $W(2, 2)$-module with highest weight vector $\mathbf{1}$ such that $W_n$ acts as $W(n)$ and $L_n$ acts as $L(n)$ for all $n \in \mathbb{Z}$. Since $L(-1)\mathbf{1} = W(-1)\mathbf{1} = 0$, we see that $U$ is a quotient of $\bar V(c, 0, 0)$. From Theorem 2.1, $\bar V(c, 0, 0) = L(c, 0, 0)$ is irreducible and $U$ is isomorphic to $L(1, 0, 0)$. Furthermore,
$$\mathrm{ch}_q U = \frac{q^{-1/24}}{\prod_{n>1}(1 - q^n)^2}.$$
Note that $\mathrm{ch}_q U \leq \mathrm{ch}_q V$, that is, the coefficients of $\mathrm{ch}_q U$ are less than or equal to the corresponding coefficients of $\mathrm{ch}_q V$. Note that if $|q| < 1$, $\mathrm{ch}_q U$ and $\mathrm{ch}_q V$ are convergent. So as functions we also have $\mathrm{ch}_q U \leq \mathrm{ch}_q V$ for $q \in (0, 1)$. Then $\eta(q)\mathrm{ch}_q U \leq \eta(q)\mathrm{ch}_q V$ as functions for $q \in (0, 1)$ since $\eta(q)$ is positive. By Lemma 4.1, the coefficients of $\eta(q)\mathrm{ch}_q V$ satisfy the polynomial growth condition. On the other hand, the coefficients of
$$\eta(q)\mathrm{ch}_q U = \frac{1 - q}{\prod_{n>1}(1 - q^n)}$$
grow faster than any polynomial in $n$. Thus $\eta(q)\mathrm{ch}_q U$ should be much bigger than $\eta(q)\mathrm{ch}_q V$ as $q$ goes close to 1. This is a contradiction.

Again from the proof of Theorem 3.1, we can write $\omega = \omega_1 + \omega_2$ so that $\omega_1/2$ and $\omega_2/2$ are the primitive idempotents. The vectors $\omega_1$ and $\omega_2$ are Virasoro vectors with central charges $c_1$ and $c_2$ such that $c_1 + c_2 = 1$. Let $L^i(n)$ be as in Sect. 3. Then we have two commuting Virasoro algebras:
$$[L^i(m), L^j(n)] = \delta_{i,j}\left((m - n)L^i(m + n) + \frac{m^3 - m}{12}\delta_{m+n,0}\, c_i\right)$$
for $m, n \in \mathbb{Z}$ and $i, j = 1, 2$. As before we denote by $U$ the vertex operator subalgebra of $V$ generated by $V_2$. Then $U = \langle\omega_1\rangle \otimes \langle\omega_2\rangle$, where $\langle\omega_i\rangle$ is the vertex operator subalgebra of $V$ generated by $\omega_i$ (with a different Virasoro vector). Then $\langle\omega_i\rangle$ is a quotient of $\bar V(c_i, 0)$.

Lemma 4.4. If $c \neq 0$, then the coefficients of $\mathrm{ch}_q L(c, 0)$ do not satisfy the polynomial growth condition.

Proof. If $c \neq c_{s,t}$ for any coprime integers $1 < s < t$, then $\mathrm{ch}_q L(c, 0) = \frac{q^{-c/24}}{\prod_{n>1}(1 - q^n)}$ by Proposition 4.2 and the result is clear. We now assume that $c = c_{s,t}$ for some $s, t$. Suppose that the coefficients of
$$\mathrm{ch}_q L(c_{s,t}, 0) = q^{-c/24}\sum_{n\geq 0} a_n q^n$$
satisfy the polynomial growth condition. Then there exist a positive integer $A$ and $\alpha$ such that $a_n \leq A n^{\alpha}$ for all $n \geq 0$. Let $m$ be a positive integer such that $m \geq \alpha$. Then
$$\frac{1}{(1 - q)^{m+1}} = \sum_{n\geq 0}\binom{-m-1}{n}(-1)^n q^n,$$
where
$$\binom{-m-1}{n} = \frac{(-m-1)(-m-2)\cdots(-m-n)}{n!} = (-1)^n\binom{m+n}{m}.$$
Thus
$$\frac{1}{(1 - q)^{m+1}} = \sum_{n\geq 0}\binom{m+n}{m} q^n.$$
Since $\binom{m+n}{m}$ is greater than $\frac{n^m}{m!}$ we see that, as formal power series,
$$q^{c/24}\mathrm{ch}_q L(c_{s,t}, 0) \leq m!\,A\,\frac{1}{(1 - q)^{m+1}}.$$
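The binomial expansion used in this step can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper name `inverse_power_coeffs` and the sample values `m = 4`, `n_terms = 12` are illustrative assumptions. It expands $1/(1-q)^{m+1}$ by repeated partial sums and compares the coefficients with $\binom{m+n}{m}$ and with the lower bound $n^m/m!$.

```python
from math import comb, factorial

def inverse_power_coeffs(m, n_terms):
    """Coefficients of 1/(1-q)^(m+1) up to q^(n_terms-1).

    Multiplying a power series by 1/(1-q) replaces its coefficients
    by their partial sums, so we start from the series 1 and take
    partial sums m+1 times.
    """
    coeffs = [1] + [0] * (n_terms - 1)
    for _ in range(m + 1):
        running = 0
        for k in range(n_terms):
            running += coeffs[k]
            coeffs[k] = running
    return coeffs

# Hypothetical sample values chosen only for illustration.
m, n_terms = 4, 12
coeffs = inverse_power_coeffs(m, n_terms)
binomials = [comb(m + n, m) for n in range(n_terms)]

# Integer form of the bound n^m / m! < C(m+n, m).
lower_bound_holds = all(
    n**m < factorial(m) * comb(m + n, m) for n in range(n_terms)
)
```

With these sample values the two coefficient lists agree term by term, which is the identity the proof relies on.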
We next prove that there exists a positive integer $k$ such that $k c_{s,t} \neq c_{s_1,t_1}$ for any coprime integers $1 < s_1 < t_1$. To see this we need to examine the equation
$$1 - \frac{6(s_1 - t_1)^2}{s_1 t_1} = k\left(1 - \frac{6(s - t)^2}{st}\right),$$
which is equivalent to
$$st(13 s_1 t_1 - 6 s_1^2 - 6 t_1^2) = s_1 t_1 k(13 st - 6 s^2 - 6 t^2).$$
Then both $s_1$ and $t_1$ are factors of $6st$. So there are only finitely many pairs $s_1, t_1$ that can satisfy this equation. Since these candidate pairs do not depend on $k$ and each pair determines at most one value of $k$, such a $k$ exists.

Consider the vertex operator algebra $L(c, 0)^{\otimes k}$, which contains the vertex operator subalgebra $\bar V(kc, 0) = L(kc, 0)$ as $kc \neq c_{s_1,t_1}$ for any $s_1, t_1$. So
$$q^{kc/24}\mathrm{ch}_q L(kc, 0) \leq q^{kc/24}\,\mathrm{ch}_q\!\left(L(c, 0)^{\otimes k}\right) = q^{kc/24}\left(\mathrm{ch}_q L(c, 0)\right)^k \leq (m!A)^k\frac{1}{(1 - q)^{(m+1)k}}$$
and the coefficients of $q^{kc/24}\mathrm{ch}_q L(kc, 0)$ satisfy the polynomial growth condition. On the other hand we know from Proposition 4.2 that
$$q^{kc/24}\mathrm{ch}_q L(kc, 0) = \frac{1}{\prod_{n>1}(1 - q^n)},$$
whose coefficients satisfy the exponential growth condition. This is a contradiction. The proof is complete.

Lemma 4.5. Let $\omega_i$ and $c_i$ be as before. Then $c_i = c_{s_i,t_i}$ for some coprime integers $1 < s_i < t_i$ and $\langle\omega_i\rangle$ is isomorphic to $L(c_{s_i,t_i}, 0)$ for $i = 1, 2$.

Proof. Recall that $U$ is the vertex operator subalgebra of $V$ generated by $V_2$. First we note that as formal power series, $\mathrm{ch}_q U \leq \mathrm{ch}_q V$. Let $U^i = \langle\omega_i\rangle$. Then $U = U^1 \otimes U^2$ and $\mathrm{ch}_q U^1\,\mathrm{ch}_q U^2 \leq \mathrm{ch}_q V$. Since $\mathrm{ch}_q U^i \geq \mathrm{ch}_q L(c_i, 0)$ for $i = 1, 2$ we have
$$\eta(q)\mathrm{ch}_q L(c_1, 0)\mathrm{ch}_q L(c_2, 0) \leq \eta(q)\mathrm{ch}_q U^1\mathrm{ch}_q U^2 \leq \eta(q)\mathrm{ch}_q V$$
as functions for $q \in (0, 1)$.

Assume that $\mathrm{ch}_q U^1 = \frac{q^{-c_1/24}}{\prod_{n>1}(1 - q^n)}$. Then
$$\eta(q)\mathrm{ch}_q U \geq \eta(q)\frac{q^{-c_1/24}}{\prod_{n>1}(1 - q^n)}\mathrm{ch}_q L(c_2, 0)$$
as functions for $q \in (0, 1)$. That is, $\eta(q)\mathrm{ch}_q U \geq q^{c_2/24}(1 - q)\mathrm{ch}_q L(c_2, 0)$. From the proof of Lemma 4.4 we see that if the coefficients of $(1 - q)\mathrm{ch}_q L(c_2, 0)$ satisfy the polynomial growth condition, so do the coefficients of $\mathrm{ch}_q L(c_2, 0)$. But this is impossible by Lemma 4.4. Thus the coefficients of $(1 - q)\mathrm{ch}_q L(c_2, 0)$ do not satisfy the polynomial growth condition. On the other hand, $q^{c_2/24}(1 - q)\mathrm{ch}_q L(c_2, 0) \leq \eta(q)\mathrm{ch}_q V$ as functions for $q \in (0, 1)$ and the coefficients of $\eta(q)\mathrm{ch}_q V$ satisfy the polynomial growth condition. This is a contradiction. By Proposition 4.2 we see immediately that $c_i = c_{s_i,t_i}$ for some $s_i, t_i$ and $\langle\omega_i\rangle$ is isomorphic to $L(c_{s_i,t_i}, 0)$ for $i = 1, 2$.
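The growth claim behind Lemmas 4.4 and 4.5 concerns the coefficients of $1/\prod_{n>1}(1-q^n)$, which count partitions with no part equal to 1. The sketch below is an illustration only (the function name, the cut-off of 201 terms and the comparison exponent 5 are assumptions made for this example): it tabulates these coefficients and compares them with a fixed power of $n$.

```python
def no_ones_partition_counts(n_terms):
    """Coefficients of 1/prod_{n>1}(1-q^n) up to q^(n_terms-1).

    counts[k] is the number of partitions of k into parts >= 2,
    built by admitting one part size at a time.
    """
    counts = [1] + [0] * (n_terms - 1)
    for part in range(2, n_terms):
        for total in range(part, n_terms):
            counts[total] += counts[total - part]
    return counts

counts = no_ones_partition_counts(201)

# Ratio of the coefficient to n^5 at a few sample degrees; a sequence of
# polynomial growth with exponent <= 5 would keep this ratio bounded.
ratios = {n: counts[n] / n**5 for n in (50, 100, 200)}
```

The ratios increase rapidly over the sampled range, which is consistent with (but of course does not prove) the faster-than-polynomial growth stated in Proposition 4.2.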
Lemma 4.6. Let $c_i = c_{s_i,t_i}$ as in Lemma 4.5. Then both $c_1$ and $c_2$ are 1/2.

Proof. We need to solve the equation
$$1 - \frac{6(s_1 - t_1)^2}{s_1 t_1} + 1 - \frac{6(s_2 - t_2)^2}{s_2 t_2} = 1$$
for two pairs of coprime integers $1 < s_i < t_i$. That is,
$$\frac{s_1}{t_1} + \frac{t_1}{s_1} + \frac{s_2}{t_2} + \frac{t_2}{s_2} = \frac{25}{6}.$$
Let $x = \frac{s_1}{t_1}$ and $y = \frac{s_2}{t_2}$. Then the equation becomes
$$x + \frac{1}{x} + y + \frac{1}{y} = \frac{25}{6}.$$
The following argument using the elliptic curve is due to N. Elkies and we thank him and A. Ryba for communicating the solution to us.

The equation $x + \frac{1}{x} + y + \frac{1}{y} = \frac{25}{6}$ gives an elliptic curve. Multiply the equation by $6xy$ to get
$$E : 6xy^2 + 6x^2 y + 6x + 6y = 25xy.$$
Putting one of the Weierstrass points at infinity yields the curve $Y^2 + XY = X^3 - 1070X + 7812$, which has rank 0 over $\mathbb{Q}$. So every rational point of $E$ is a torsion point, and $E/\mathbb{Q}$ has at most 16 torsion points. Note that the curve has 8 obvious symmetries, generated by the involutions taking $(x, y)$ to $(1/x, y)$, $(x, 1/y)$, and $(y, x)$. Here are the rational points of $E$: four from $(\frac{3}{4}, \frac{3}{4})$, four from $(1, \frac{3}{2})$, four from $(-1, 6)$ and four from infinity. Since we assume that $1 < s_i < t_i$ and $s_i, t_i$ are coprime, we immediately see that the only solution of interest to us is $(\frac{3}{4}, \frac{3}{4})$. That is, $c_i = \frac{1}{2}$ for $i = 1, 2$.

Here is a characterization of $L(1/2, 0) \otimes L(1/2, 0)$.

Theorem 4.7. If $V$ is a simple, rational and $C_2$-cofinite vertex operator algebra of the moonshine type such that $c = \tilde c = 1$ and $\dim V_2 = 2$, then $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

Proof. By Lemmas 4.5 and 4.6, the vertex operator subalgebra $U$ of $V$ generated by $V_2$ is isomorphic to $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$, which is rational and has 9 inequivalent irreducible modules $L(\frac{1}{2}, h_1) \otimes L(\frac{1}{2}, h_2)$ for $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$ (see [DMZ,W]). Thus $V$ is a direct sum of irreducible $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$-modules. Note that $h_1 + h_2 \in \mathbb{Z}$ if and only if $h_1 = h_2 = 0$ or $h_1 = h_2 = \frac{1}{2}$. So only $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ and $L(\frac{1}{2}, \frac{1}{2}) \otimes L(\frac{1}{2}, \frac{1}{2})$ can possibly occur in $V$ as $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$-modules. Since $\dim V_0 = 1$ and $V_1 = 0$, we immediately see that $V$ is isomorphic to $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$.
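The Diophantine problem solved in Lemma 4.6 can also be explored by a bounded brute-force search. This is only a sanity check under an assumed search bound (the bound 60 and the helper names are choices made for this sketch); it does not replace the elliptic curve argument, which covers all pairs at once.

```python
from fractions import Fraction
from math import gcd

def central_charge(s, t):
    """c_{s,t} = 1 - 6(s-t)^2/(st), kept as an exact rational."""
    return 1 - Fraction(6 * (s - t) ** 2, s * t)

def solutions(bound):
    """Unordered pairs of coprime (s, t), 1 < s < t <= bound,
    whose central charges add up to 1."""
    charges = {
        (s, t): central_charge(s, t)
        for t in range(3, bound + 1)
        for s in range(2, t)
        if gcd(s, t) == 1
    }
    # Index the pairs by central charge so each lookup of 1 - c is cheap.
    by_value = {}
    for pair, c in charges.items():
        by_value.setdefault(c, []).append(pair)
    found = set()
    for pair, c in charges.items():
        for other in by_value.get(1 - c, []):
            found.add(tuple(sorted((pair, other))))
    return sorted(found)

# Hypothetical search bound; the lemma itself is not restricted to it.
found = solutions(60)
```

Within this range the only pair returned is $(s_1, t_1) = (s_2, t_2) = (3, 4)$, matching $c_1 = c_2 = 1/2$.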
We certainly believe that Theorem 4.7 is false if we do not assume $c = \tilde c$. One can construct a counterexample involving the permutation orbifolds [BDM] modulo the following rational orbifold theory conjecture: If $V$ is a rational vertex operator algebra and $A$ is a finite automorphism group of $V$ then the fixed point vertex operator subalgebra $V^A$ is rational. Let $U = L(c_{3,5}, 0)^{\otimes 5}$ and $W = L(1/2, 0)^{\otimes 8}$. Then both $U$ and $W$ are rational vertex operator algebras with central charges $-3$ and $4$ respectively. Let $G$ be the cyclic group generated by the permutation $(1, 2, 3, 4, 5)$ and $H$ the cyclic group generated by $(1, 2, 3, 4, 5, 6, 7, 8)$. Then $G$ and $H$ act in the obvious way on $U$ and $W$ as automorphisms. The tensor product $U^G \otimes W^H$ is a counterexample.

We end this paper with the following conjecture which strengthens Theorem 4.7.

Conjecture 4.8. If $V$ is a simple, rational and $C_2$-cofinite vertex operator algebra of the moonshine type with $c = \tilde c = 1$ and $\dim V_2 > 1$, then $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

We have already mentioned that Theorem 4.7 is false without assuming $c = \tilde c$. This implies that Conjecture 4.8 is false without assuming $c = \tilde c$. Here we give a counterexample to the conjecture without using the rational orbifold theory conjecture. Let $U = L(c_{3,5}, 0)^{\otimes 5}$ and $W = L(1/2, 0)^{\otimes 8}$ as in the counterexample before the conjecture. Then $V = U \otimes W$ is a rational, $C_2$-cofinite vertex operator algebra of the moonshine type with $c = 1$, $\tilde c = 7$ (cf. [DM1]). It is clear that $\dim V_2 = 13$, and $V$ is not isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.
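The numbers quoted for this counterexample can be reproduced with exact rational arithmetic. The sketch below rests on one assumption that is not stated in the text: the standard minimal-model fact that the smallest conformal weight of an irreducible $L(c_{s,t}, 0)$-module is $(1 - (t - s)^2)/(4st)$. The function names are illustrative.

```python
from fractions import Fraction

def central_charge(s, t):
    """c_{s,t} = 1 - 6(s-t)^2/(st)."""
    return 1 - Fraction(6 * (s - t) ** 2, s * t)

def effective_central_charge(s, t):
    """c - 24*lambda_min for L(c_{s,t}, 0).

    Assumption (standard for Virasoro minimal models, not taken from
    the text): lambda_min = (1 - (t-s)^2) / (4 s t).
    """
    lambda_min = Fraction(1 - (t - s) ** 2, 4 * s * t)
    return central_charge(s, t) - 24 * lambda_min

# U = L(c_{3,5}, 0)^{tensor 5} and W = L(1/2, 0)^{tensor 8}; central charges
# and effective central charges add over tensor products.
c_U = 5 * central_charge(3, 5)
c_W = 8 * central_charge(3, 4)
c_total = c_U + c_W
c_eff_total = 5 * effective_central_charge(3, 5) + 8 * effective_central_charge(3, 4)

# Each tensor factor has vanishing weight-1 space and contributes one
# Virasoro vector to the weight-2 space of V = U tensor W.
dim_V2 = 5 + 8
```

Under this assumption the computation gives central charges $-3$ and $4$, total central charge $1$, effective central charge $7$ and $\dim V_2 = 13$, in agreement with the values stated above.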
It is essentially proved in [K] that if $V$ is a rational vertex operator algebra such that $\sum_i |\chi_i(q)|^2$ is modular invariant, where $\chi_i(q)$ are the q-characters of the irreducible $V$-modules, then the q-character of $V$ is equal to the character of one of the following vertex operator algebras: $V_L$, $V_L^+$ and $V_{\mathbb{Z}\alpha}^G$, where $L$ is any positive definite even lattice of rank 1, $V_L^+$ is the fixed points of the automorphism of $V_L$ lifted from the $-1$ isometry of $L$, $\mathbb{Z}\alpha$ is the root lattice of type $A_1$ such that $(\alpha, \alpha) = 2$, and $G$ is a finite subgroup of $SO(3)$ isomorphic to $A_4$, $S_4$ or $A_5$. It is widely believed that $V_L$, $V_L^+$ and $V_{\mathbb{Z}\alpha}^G$ should give a complete list of simple and rational vertex operator algebras with $c = \tilde c = 1$. It is clear from the construction that if $V$ is one of these vertex operator algebras of the moonshine type then $\dim V_2 = 2$. This should be very strong evidence for Conjecture 4.8. We remark that the assumption in [K] that $\sum_i |\chi_i(q)|^2$ is modular invariant is still an open problem in mathematics.

References

[AM] Anderson, G., Moore, G.: Rationality in conformal field theory. Commun. Math. Phys. 117, 441–450 (1988)
[BDM] Dong, C., Barron, K., Mason, G.: Twisted sectors for tensor product voas associated to permutation groups. Commun. Math. Phys. 227, 349–384 (2002)
[D] Dong, C.: Representations of the moonshine module vertex operator algebra. Contemp. Math. 175, 27–36 (1994)
[DGH] Dong, C., Griess, R. Jr., Hoehn, G.: Framed vertex operator algebras, codes and the moonshine module. Commun. Math. Phys. 193, 407–448 (1998)
[DGL] Dong, C., Griess, R. Jr., Lam, C.: Uniqueness results of the moonshine vertex operator algebra. Amer. J. Math. 129, 583–609 (2007)
[DLM1] Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DLM2] Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000)
[DM1] Dong, C., Mason, G.: Rational vertex operator algebras and the effective central charge. International Math. Research Notices 56, 2989–3008 (2004)
[DM2] Dong, C., Mason, G.: Integrability of C2-cofinite vertex operator algebras. International Math. Research Notices 2006, Article ID 80468, 15 pages (2006)
[DMZ] Dong, C., Mason, G., Zhu, Y.: Discrete series of the Virasoro algebra and the moonshine module. Proc. Symp. Pure. Math., American Math. Soc. 56(II), 295–316 (1994)
[FF] Feigin, B., Fuchs, D.: Verma Modules over the Virasoro Algebra. Lect. Notes in Math. 1060, Berlin-Heidelberg-New York: Springer, pp. 230–245 (1984)
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math., Vol. 134, New York: Academic Press, 1988
[FZ] Frenkel, I.B., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[FQS] Friedan, D., Qiu, Z., Shenker, S.: Details of the non-unitarity proof for highest weight representations of Virasoro algebra. Commun. Math. Phys. 107, 535–542 (1986)
[GKO] Goddard, P., Kent, A., Olive, D.: Unitary representations of the Virasoro algebra and super-Virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986)
[KL] Kawahigashi, Y., Longo, R.: Local conformal nets arising from framed vertex operator algebras. Adv. Math. 206, 729–751 (2006)
[K] Kiritsis, E.: Proof of the completeness of the classification of rational conformal field theories with c = 1. Phys. Lett. B 217, 427–430 (1989)
[KM] Knopp, M., Mason, G.: On vector-valued modular forms and their Fourier coefficients. Acta Arith. 110, 117–124 (2003)
[LY] Lam, C., Yamauchi, H.: A characterization of the moonshine vertex operator algebra by means of Virasoro frame. http://arXiv.org/list/math/0609718, 2006
[L1] Li, H.: Symmetric invariant bilinear forms on vertex operator algebras. J. Pure and Appl. Math. 96, 279–297 (1994)
[L2] Li, H.: Local system of vertex operators, vertex superalgebras and modules. Pure and Appl. Math. 109, 143–195 (1996)
[LL] Lepowsky, J., Li, H.: Introduction to Vertex Operator Algebras, and Their Representations. Progress in Math. Vol. 227, Birkhäuser, Boston, 2004
[M1] Miyamoto, M.: Griess algebras and conformal vectors in vertex operator algebras. J. Algebra 179, 523–548 (1996a)
[M2] Miyamoto, M.: Binary codes and vertex operator (super)algebras. J. Algebra 181, 207–222 (1996b)
[M3] Miyamoto, M.: A new construction of the moonshine vertex operator algebra over the real number field. Ann. of Math. 159, 535–596 (2004)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. International Math. Research Notices 71, 197–211 (1993)
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 1005–1031 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0655-6
Communications in
Mathematical Physics
Symmetric Chern-Simons-Higgs Vortices Robin Ming Chen, Daniel Spirn School of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA. E-mail:
[email protected];
[email protected] Received: 30 November 2007 / Accepted: 16 July 2008 Published online: 22 October 2008 – © Springer-Verlag 2008
Abstract: We prove the existence of radially symmetric vortices of the static non-self-dual Chern-Simons-Higgs equations with and without magnetic field in dimension 2. The vortex profiles are shown to be monotonically increasing and bounded. For a given vorticity $n$, when there is no magnetic field we prove that the $n$-vortices are stable for $n = 0, \pm 1$.

1. Introduction

The Chern-Simons-Higgs (CSH) theory generally refers to a wide category of field-theoretic models in (2 + 1) dimensional Minkowski space that contain a Chern-Simons term in their action densities, see [2,8,9,20]. These models have applications to several important problems in condensed matter physics such as high-temperature superconductivity and quantum and fractional Hall effect ([2,20]). CSH theory is one of the simplest known anyonic models, i.e. a model that allows for quantized statistics of fractional values.

Define the Minkowski spacetime metric tensor $g = \mathrm{diag}(1, -1, -1)$. Then, in normalized units, the Lagrangian density of the CSH theory is written ([8,9])
$$\mathcal{L}_{csh} = D_\alpha u\,\overline{D^\alpha u} + \frac{\mu}{4}\epsilon^{\alpha\beta\gamma}A_\alpha F_{\beta\gamma} - \lambda^2|u|^2\left(1 - |u|^2\right)^2, \tag{1.1}$$
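where $A = -iA_\alpha\,dx^\alpha$ with $A_\alpha : \mathbb{R}^{1,2} \to \mathbb{R}$ for $\alpha = 0, 1, 2$ is the gauge potential with covariant derivative $D_A = d - iA$. The corresponding curvature $F_A = -\frac{1}{2}F_{\beta\gamma}\,dx^\beta \wedge dx^\gamma$ with $F_{\beta\gamma} = \partial_\beta A_\gamma - \partial_\gamma A_\beta$ defines the gauge field, and $u : \mathbb{R}^{1,2} \to \mathbb{C}$ is the Higgs scalar with $D_\alpha u = \partial_\alpha u - iA_\alpha u$, $\alpha = 0, 1, 2$. Furthermore, the antisymmetric Levi-Civita tensor $\epsilon^{\alpha\beta\gamma}$ is fixed by setting $\epsilon^{012} = 1$ and $\mu, \varepsilon > 0$ are the Chern-Simons coupling parameters. Here $\epsilon^{\alpha\beta\gamma}A_\alpha F_{\beta\gamma}$ is the Chern-Simons term. The Euler-Lagrange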
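equations of (1.1) are
$$D_\alpha D^\alpha u + \lambda^2 u\left(|u|^2 - 1\right)\left(3|u|^2 - 1\right) = 0, \tag{1.2}$$
$$\frac{\mu}{4}\epsilon^{\alpha\beta\gamma}A_\alpha F_{\beta\gamma} + J^\alpha = 0, \tag{1.3}$$
where $J^\alpha = (iu, D^\alpha u)$ is the matter current. Since $\alpha = 0$ refers to the time coordinate, we replace $D_0$ by $\partial_\Phi = \partial_t - i\Phi$ (with $\Phi = A_0$) and replace $D_\alpha$ by $\nabla_A = \nabla - iA$ when $\alpha = 1, 2$, where $A = (A_1, A_2)$. Here $(\Phi, A)$ is the field potential. The curvature tensor is defined by
$$F = \begin{pmatrix} 0 & -E_1 & -E_2 \\ E_1 & 0 & -h \\ E_2 & h & 0 \end{pmatrix}, \tag{1.4}$$
where $h = \mathrm{curl}A$ and $E_\alpha = \partial_t A_\alpha - \partial_\alpha\Phi$ are the induced magnetic and electric fields, respectively. We also use the standard current definition
$$J^0 = (iu, \partial_\Phi u) = q, \qquad J^\alpha = (iu, \nabla_{A_\alpha}u) = j_{A_\alpha}$$
for $\alpha = 1, 2$, which are the charge and supercurrent, respectively. Hence we get the set of CSH equations as
$$\partial_\Phi^2 u = \nabla_A^2 u + \lambda^2 u\left(|u|^2 - 1\right)\left(3|u|^2 - 1\right), \tag{1.5}$$
$$q = -\frac{\mu}{2}\mathrm{curl}A, \tag{1.6}$$
$$j_A = \frac{\mu}{2}(E \times e_3). \tag{1.7}$$
Well-posedness for the initial value problem for Eqs. (1.5)-(1.7) can be found in [3] and [4].

We look for static solutions. Setting $\partial_t u = 0$, Eqs. (1.5)-(1.7) become
$$-\Phi^2 u = \nabla_A^2 u + \lambda^2 u\left(|u|^2 - 1\right)\left(3|u|^2 - 1\right),$$
$$|u|^2\Phi = \frac{\mu}{2}\mathrm{curl}A,$$
$$j_A(u) = \frac{\mu}{2}(\nabla\Phi \times e_3).$$
Removing the electric field potential $\Phi$, we are left with a system of coupled elliptic PDEs
$$-\frac{\mu^2}{4}\frac{|\mathrm{curl}A|^2}{|u|^4}u = \nabla_A^2 u + \lambda^2 u\left(|u|^2 - 1\right)\left(3|u|^2 - 1\right), \tag{1.8}$$
$$0 = -\frac{\mu^2}{4}\mathrm{curl}\left(\frac{\mathrm{curl}A}{|u|^2}\right) + j_A(u). \tag{1.9}$$
The above static equations can be viewed as the Euler-Lagrange equations of the following Chern-Simons-Higgs energy:
$$G_{csh}(u, A) = \frac{1}{2}\int_{\mathbb{R}^2}|\nabla_A u|^2 + \frac{\mu^2}{4}\frac{|\mathrm{curl}A|^2}{|u|^2} + \lambda^2|u|^2\left(1 - |u|^2\right)^2 dx. \tag{1.10}$$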
When there is no magnetic field the Chern-Simons-Higgs energy becomes
$$E_{csh}(u) = \frac{1}{2}\int_{\mathbb{R}^2}|\nabla u|^2 + \lambda^2|u|^2(1 - |u|^2)^2\,dx, \tag{1.11}$$
with the associated Euler-Lagrange equation
$$-\Delta u + \lambda^2 u(1 - |u|^2)(1 - 3|u|^2) = 0. \tag{1.12}$$
Due to the form of the potential in (1.10) and (1.11), locally minimizing configurations should satisfy
$$|u| \to 1 \text{ as } |x| \to \infty, \quad\text{or}\quad |u| \to 0 \text{ as } |x| \to \infty.$$
We will only consider the first case, which leads to the definition of the topological degree, $\deg(u)$, of such a configuration:
$$\deg(u) = \deg\left(\left.\frac{u}{|u|}\right|_{|x|=R} : S^1 \to S^1\right)$$
for $R$ sufficiently large. The degree is related to the phenomenon of flux quantization. Indeed, an application of Stokes' theorem to (1.7) and using (1.6) shows that a locally minimizing energy configuration satisfies
$$\deg(u) = \frac{1}{2\pi}\int_{\mathbb{R}^2}\mathrm{curl}A\,dx = -\frac{1}{\mu\pi}\int_{\mathbb{R}^2}q\,dx$$
so long as there is good decay at infinity. Therefore, topological vortices in CSH theory carry both a quantized magnetic field and electrostatic charge.

1.1. Prior results. When $\mu = \frac{1}{\lambda}$, minimizers of the CSH energy satisfy a simpler system of first order PDEs. This self-dual mechanism was discovered by Hong-Kim-Pak [8] and Jackiw-Weinberg [9] and has been the subject of rich mathematical development. The resulting equations can allow for multivortex configurations and are similar to the Jaffe-Taubes self-dual Ginzburg-Landau theory. We point to Caffarelli-Yang [2] and Tarantello [18] for important results on the existence of such multivortex configurations in the self-dual regime.

However, once the self-dual regime is left, the theory is underdeveloped. Immediately there are difficulties in understanding the term $\frac{\mu^2}{4}\frac{|\mathrm{curl}A|^2}{|u|^2}$ in the energy, since $u$ vanishes at least once whenever $\deg(u) \neq 0$. In this paper we initiate a study of the CSH energies (1.10) and (1.11) on the plane outside of the self-dual regime. We study, in particular, radially symmetric fields of the form
$$u^{(n)} = f^{(n)}(r)e^{in\theta}, \tag{1.13}$$
$$A^{(n)} = n\frac{a_n(r)}{r}x^{\perp}, \tag{1.14}$$
where $(r, \theta)$ are polar coordinates, $x^{\perp} = (-x_2, x_1)^T$, $n$ is an integer which corresponds to the degree of $u$, and $f^{(n)}, a_n : [0, \infty) \to \mathbb{R}$. We note that Han [7] studied radially symmetric one-vortex solutions in the self-dual regime $\mu = \frac{1}{\lambda}$.
There are some similarities to the rigorous study of planar vortex minimizers of the Ginzburg-Landau energy
$$G_{gl}(u, A) = \frac{1}{2}\int_{\mathbb{R}^2}|\nabla_A u|^2 + |\mathrm{curl}A|^2 + \lambda^2\left(1 - |u|^2\right)^2, \tag{1.15}$$
which originated with the work of Plohr [13] and Berger-Chen [1]. The study of the stability of (1.15) was initiated by Guo [5] and completely characterized by Gustafson-Sigal [6]. When the magnetic field is not present, the Ginzburg-Landau energy simplifies to
$$E_{gl}(u) = \frac{1}{2}\int_{\mathbb{R}^2}|\nabla u|^2 + \lambda^2\left(1 - |u|^2\right)^2. \tag{1.16}$$
Ovchinnikov-Sigal [12] established the existence and examined the stability of symmetric, planar minimizers of (1.16). Our CSH existence proofs rely on the existence results of [1,12], as we look for minimizers in a constraint class of functions with finite Ginzburg-Landau energy.

1.2. Main Results. We consider the existence problem of CSH $n$-vortex solutions, in the cases when $A \equiv 0$ and $A \not\equiv 0$, as well as the stability problem of the $n$-vortex in the absence of the magnetic field potential $A$. One major difficulty with the existence problem is that there are trivial global minimizers. Therefore, we need to employ an unusual constraint to force a minimizing sequence $f^{(n)} \to 1$ as $r \to \infty$. The main results of this paper are the following:

Theorem 1.1. For any $n$, there is a radially symmetric vortex solution $u^{(n)} = f^{(n)}e^{in\theta}$ of degree $n$ to Eq. (1.12). Moreover, $f^{(n)}$ minimizes the renormalized energy functional $E^{ren}_{csh}$ given in (3.1) over a certain admissible set defined in (5.2). For $n = 0, \pm 1$, $u^{(n)}$ are local minima of $E^{ren}_{csh}$.

The proof of the existence part of Theorem 1.1 is similar in spirit to the methods developed in [12] for Ginzburg-Landau energy. The primary difficulty is the existence of a trivial global minimizer. In order to establish a nontrivial local minimizer, we examine a minimizing sequence of a renormalized CSH energy in a constraint class of functions with finite renormalized Ginzburg-Landau energy.
The renormalized CSH energy functional yields the same Euler-Lagrange equation (1.12). To show coercivity of our minimizing sequence, we need to control the size of the set in which $|u| \leq \frac{1}{4}$. This is done via a covering argument, similar to methods developed for Allen-Cahn by Modica-Mortola [11], Ginzburg-Landau by Sandier [15], and Chern-Simons-Higgs by Kurzke-Spirn [10]. We want to point out that the potential term $\lambda^2|u|^2(1 - |u|^2)^2$ in the energy functional (1.11) prevents us from getting partial convexity of the renormalized energy functional, unlike the partial convexity found by Ovchinnikov-Sigal [12] for the reduced Ginzburg-Landau energy. Therefore, we are unable to prove uniqueness of the $n$-vortex solutions.

The second part of Theorem 1.1 concerns the stability property of the $n$-vortices for $n = 0, \pm 1$. When $n = 0$ it follows from the definition that a strict absolute minimum is given by $u^{(0)} \equiv z$ for any $z \in \mathbb{C}$ with $|z| = 1$. The proof for $n = \pm 1$ uses a block decomposition of the linearized operator for the energy functional which is similar to the argument in [12]. However, because the potential term in the energy functional does not imply partial convexity, the Hessian of the energy might induce some zero modes
other than the ones due to the symmetry breaking. We are able to show that the possible extra zero mode is at most one-dimensional when $n = \pm 1$ and the vortices $u^{(\pm 1)}$ are still minimizing the energy along that direction. This, in turn, implies stability.

We now turn to the full CSH energy (1.10). Our primary result is:

Theorem 1.2. For any $n$, there are radially symmetric field solutions of the form (1.13), (1.14) to Eqs. (1.8) and (1.9). In particular, the radial functions $(f^{(n)}, a_n)$ minimize the radial energy functional (1.10) and $1 - f^{(n)}(r),\ 1 - a_n(r) \to 0$ as $r \to \infty$.

The proof of Theorem 1.2 also relies on the results for the Ginzburg-Landau energy. We choose to minimize the energy functional over a constraint set suggested by Ginzburg-Landau vortices, see Berger-Chen [1]. The difficulty now comes from the term $\frac{\mu^2}{4}\frac{|\mathrm{curl}A|^2}{|u|^2}$ in the energy functional (1.10). We show the pointwise convergence of that term by recovering $A$ (in particular $a_n(r)$) from the induced magnetic field term $\frac{\mu^2}{4}\frac{|\mathrm{curl}A|^2}{|u|^2}$. Then a combination of weak lower semi-continuity and Fatou's Lemma gives the existence of minimizers, which are also solutions to Eqs. (1.8) and (1.9).

We further investigate the basic properties of the minimizers of the energy functional (1.10). It is straightforward to establish regularity of the vortex profile; on the other hand establishing monotonicity and/or maximum principles turns out to be tricky. In the end, though, we are able to prove the following

Theorem 1.3. For any $n$, the radial functions $(f^{(n)}, a_n)$ obtained in Theorem 1.2 are $C^\infty$ on $(0, \infty)$ and have the following properties (for $n \neq 0$):
(1) $0 < f^{(n)} < 1$ on $(0, \infty)$,
(2) $0 < a_n \leq 1$ on $(0, \infty)$,
(3) $a_n' \geq 0$, $(f^{(n)})' > 0$.

The maximum principle and monotonicity for $a_n(r)$ can be established by a truncation argument, similar to the method used by Berger-Chen [1] to establish monotonicity of the planar, symmetric Ginzburg-Landau equations. The proof of the monotonicity for $f^{(n)}$ cannot be attacked in the same way due to the nonconventional structure of the CSH energy. Truncation of $f$ does not work effectively, due to the offsetting behavior of the terms $\frac{n^2 f^2(1 - a)^2}{r^2}$ and $\frac{\mu^2}{4}\frac{n^2(a')^2}{r^2 f^2}$ in the energy. Furthermore, the elliptic equation for $f$:
$$\frac{1}{2}\Delta_r f^2 - \lambda^2 f^2(3f^2 - 1)\left(f^2 - 1\right) = (f')^2 + \frac{n^2}{r^2}(1 - a)^2 f^2 - \frac{\mu^2}{4}\frac{n^2(a')^2}{r^2 f^2}$$
does not have a definite sign on the right-hand side, hence no simple application of the maximum principle. On the other hand, we use the first and second variations of the energy, along with the Euler-Lagrange equations, to prove that $(f^{(n)})'(r) > 0$. The bounds on $f^{(n)}$ follow.

1.3. Discussion. One quantity that we have difficulty describing is the induced magnetic field, $h = \mathrm{curl}A = (n/r)a_n'(r)$ for $r \neq 0$. From Theorem 1.3 we know that $h(r) \geq 0$ and that $h \to 0$ as $r \to \infty$. Furthermore, from the Euler-Lagrange equation for $(f^{(n)}, a_n)$ and the regularity result one has that $h \to 0$ as $r \to 0$. Since there is a quantized amount of magnetic field, we can conclude that $h$ is roughly of annular shape in the plane. Although we are unable to determine much explicit behavior of $h$, we nonetheless assert
Conjecture 1.4. The magnetic field profile $h(r)$ has exactly one local maximum for any positive $\mu$ and $\lambda$.

Another issue which turns out to be difficult to analyze at this moment is the instability of vortices with large degree when $A \equiv 0$. The potential term in the energy indicates that the 0 state may also be preferable. Numerically this in turn gives rise to the existence of a sharp transition layer of $O(1)$ thickness, at a distance of $O(n^2)$ from the origin. So far no sharp analytical results can be obtained on the behavior of the transition layer, which seems necessary to excite an unstable mode. We offer

Conjecture 1.5. When $|n| \geq 2$, the $n$-vortices $u^{(n)}$ obtained in Theorem 1.1 are saddle points of the renormalized energy, hence unstable.

It is natural to study the stability of the full CSH energy (1.10) as was done by Gustafson-Sigal [6] for the Ginzburg-Landau energy (1.15). We note that the Hessian of the CSH energy (1.10) is significantly more complicated than the Hessian of the Ginzburg-Landau energy (1.15).

The rest of this paper is organized as follows. Sections 2–6 treat the case when $A \equiv 0$. In Sect. 2 we compute the linearized operator of Eq. (1.12) and identify the zero modes of that operator due to symmetry breaking. We renormalize the energy functional (1.11) in Sect. 3 and then consider minimizing the renormalized energy. In Sect. 4 we establish a certain covering property of the Ginzburg-Landau energy which controls the set on which the amplitude of the solution is small, which enables us to choose a constraint set for the minimization problem. We provide an existence result for the $n$-vortex in Sect. 5. In Sect. 6 we make a block decomposition of the linearized operator and give a spectral characterization of the operator, which provides the stability of the $n$-vortex for $n = 0, \pm 1$. In Sect. 7 we prove the existence of the $n$-vortex of the full Eqs. (1.8) and (1.9), when $A \not\equiv 0$. In Sect. 8 we give some basic properties of those solutions.

2. Symmetry Breaking

A central feature of the static Chern-Simons-Higgs energy functional $G_{csh}$ (and the CSH equations) is its infinite-dimensional symmetry group. Specifically, $G_{csh}$ is invariant under $U(1)$ gauge transformations
$$u \to e^{i\gamma}u, \tag{2.1}$$
$$A \to A + \nabla\gamma \tag{2.2}$$
for any smooth $\gamma : \mathbb{R}^2 \to \mathbb{R}$. In addition, $G_{csh}$ is invariant under coordinate translations and rotation transformations. The same holds for $E_{csh}$. The following theorem from [12] is crucial in our analysis.

Theorem 2.1 (Ovchinnikov-Sigal [12]). Let $u_0$ be a solution to the abstract equation $F(u) = 0$ breaking a one parameter subgroup $g(s)$ of the symmetry group of this equation. Let $T$ be the generator of $g(s)$. Then $DF(u_0)Tu_0 = 0$, where $DF(u_0)$ is the linearized operator around $u_0$.

When the magnetic field $A \equiv 0$ the Chern-Simons-Higgs energy reduces to (1.11). We let $L_u$ be the linearized operator around $u$, i.e.
$$\lim_{\varepsilon,\delta\to 0}\partial_\varepsilon\partial_\delta E_{csh}(u + \varepsilon\xi + \delta\eta) = \langle L_u(\xi), \eta\rangle = \mathrm{Re}\int_{\mathbb{R}^2}\bar\eta\,L_u(\xi)\,dx.$$
Symmetric Chern-Simons-Higgs Vortices
1011
A simple computation gives
\[ L_u(\xi) = \left[-\Delta + \lambda^2\left(9|u|^4 - 8|u|^2 + 1\right)\right]\xi + \lambda^2\left(6|u|^2 - 4\right)u^2\bar\xi. \tag{2.3} \]
In the radially symmetric case, we are looking for solutions of the form (1.13). An immediate consequence of Theorem 2.1 is:

Corollary 2.2. The functions $u^{(n)}_{x_1}$, $u^{(n)}_{x_2}$ and $iu^{(n)}$ solve the linearized equation
\[ L_{u^{(n)}}(\xi) = 0, \]
where $L_u(\xi)$ is given in (2.3).

We will also need the following lemma later:

Lemma 2.3. We have
\[ u^{(n)}_{x_1} = \frac12\left((f^{(n)})' - \frac{n}{r}f^{(n)}\right)e^{i(n+1)\theta} + \frac12\left((f^{(n)})' + \frac{n}{r}f^{(n)}\right)e^{i(n-1)\theta}, \tag{2.4} \]
\[ u^{(n)}_{x_2} = -\frac{i}{2}\left((f^{(n)})' - \frac{n}{r}f^{(n)}\right)e^{i(n+1)\theta} + \frac{i}{2}\left((f^{(n)})' + \frac{n}{r}f^{(n)}\right)e^{i(n-1)\theta}. \tag{2.5} \]
A proof of this lemma can be found in [12].
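The identities of Lemma 2.3 hold for any differentiable radial profile, so they can be sanity-checked numerically. The following is a minimal sketch, assuming a hypothetical profile $f(r) = \tanh r$ (chosen only for illustration; it is not the vortex profile $f^{(n)}$), that compares the right-hand sides of (2.4) and (2.5) with central finite differences of $u = f(r)e^{in\theta}$.

```python
import cmath
import math


def profile(r):
    # Hypothetical radial profile (assumption): f(0) = 0 and f -> 1 at infinity.
    return math.tanh(r)


def profile_prime(r):
    # Exact derivative of tanh.
    return 1.0 / math.cosh(r) ** 2


def u(x1, x2, n):
    # u = f(r) e^{i n theta} in Cartesian coordinates.
    r = math.hypot(x1, x2)
    theta = math.atan2(x2, x1)
    return profile(r) * cmath.exp(1j * n * theta)


def lemma_2_3(x1, x2, n):
    # Right-hand sides of (2.4) and (2.5) at the point (x1, x2).
    r = math.hypot(x1, x2)
    theta = math.atan2(x2, x1)
    minus = profile_prime(r) - n * profile(r) / r
    plus = profile_prime(r) + n * profile(r) / r
    e_up = cmath.exp(1j * (n + 1) * theta)
    e_down = cmath.exp(1j * (n - 1) * theta)
    ux1 = 0.5 * minus * e_up + 0.5 * plus * e_down
    ux2 = -0.5j * minus * e_up + 0.5j * plus * e_down
    return ux1, ux2


def finite_difference(x1, x2, n, h=1e-5):
    # Central differences for the two partial derivatives of u.
    d1 = (u(x1 + h, x2, n) - u(x1 - h, x2, n)) / (2 * h)
    d2 = (u(x1, x2 + h, n) - u(x1, x2 - h, n)) / (2 * h)
    return d1, d2


def max_error(n, points=((0.7, 0.4), (-1.2, 0.5), (0.3, -2.0))):
    # Largest discrepancy between the formulas and the finite differences.
    err = 0.0
    for x1, x2 in points:
        exact = lemma_2_3(x1, x2, n)
        approx = finite_difference(x1, x2, n)
        err = max(err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
    return err
```

Calling `max_error(n)` for a few integer degrees returns only the finite-difference discretization error, which is consistent with the two Fourier modes $n \pm 1$ appearing in the lemma.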
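3. Renormalized Energy Functional

Lemma 3.1. If $u \in C^1(\mathbb{R}^2)$ is such that $|u| \to 1$ as $|x| \to \infty$ and $\deg(u) \neq 0$, then $E_{csh} = \infty$, where $E_{csh}$ is given in (1.11).

Proof. Take $u = f e^{i\varphi}$ with $f = |u|$. Then $|\nabla u|^2 = f^2|\nabla\varphi|^2 + |\nabla f|^2 \ge f^2|\nabla\varphi|^2$. Because $f \to 1$ at $\infty$, there exists $R$ sufficiently large such that $|f|^2 > 1/2$ for all $|x| \ge R$. Hence
\[ \int_{\mathbb{R}^2} |\nabla u|^2\, dx \ge \frac12\int_{|x|\ge R} |\nabla\varphi|^2\, dx = \frac12\int_R^\infty\!\!\int_0^{2\pi} r\,|\nabla\varphi|^2\, d\theta\, dr. \]
We also have that for $r \ge R$,
\[ 2\pi\,|\deg(u)| = \left|\int_{|x|=r} d(\arg u)\right| = \left|\int_{|x|=r} d\varphi\right| \le \int_{|x|=r} |d\varphi| = \int_0^{2\pi} r\,|\nabla\varphi|\, d\theta \le r\sqrt{2\pi}\left(\int_0^{2\pi} |\nabla\varphi|^2\, d\theta\right)^{1/2}. \]
Therefore
\[ \int_0^{2\pi} |\nabla\varphi|^2\, d\theta \ge \frac{2\pi(\deg(u))^2}{r^2}, \]
and then
\[ 2E_{csh} \ge \int_{\mathbb{R}^2} |\nabla u|^2\, dx \ge \pi(\deg(u))^2\int_R^\infty \frac{1}{r^2}\, r\, dr = \infty. \]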
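The divergence in the last line of the proof is logarithmic: truncating the radial integral at an outer radius $R'$ gives $\pi(\deg u)^2\ln(R'/R)$. As an illustrative sketch (the function name and the quadrature parameters are assumptions made here, not part of the paper), the truncated lower bound can be evaluated with a midpoint rule and compared with the closed form.

```python
import math


def angular_energy_lower_bound(degree, r_inner, r_outer, steps=200_000):
    # Midpoint rule for pi * degree^2 * int_{r_inner}^{r_outer} (1 / r^2) r dr,
    # i.e. the truncated version of the lower bound in the proof of Lemma 3.1.
    h = (r_outer - r_inner) / steps
    total = 0.0
    for j in range(steps):
        r = r_inner + (j + 0.5) * h
        total += (1.0 / r ** 2) * r * h
    return math.pi * degree ** 2 * total


def closed_form(degree, r_inner, r_outer):
    # Exact value pi * degree^2 * ln(r_outer / r_inner).
    return math.pi * degree ** 2 * math.log(r_outer / r_inner)
```

Each tenfold increase of the outer radius adds the same amount $\pi(\deg u)^2\ln 10$ to the bound, which is why subtracting $(\deg u)^2\chi/r^2$ in the renormalized energy below removes exactly the divergent part.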
1012
R. M. Chen, D. Spirn
We renormalize the energy as follows. Let $\chi(x) \in C^\infty(\mathbb{R}^2)$ be such that $0 \le \chi(x) \le 1$, $\chi(x) = 1$ for $|x| \ge 2$, and $\chi(x) = 0$ for $|x| \le 1$. Define the renormalized CSH energy functional to be
\[ E^{ren}_{csh}(u) = \frac12\int_{\mathbb{R}^2}\left[|\nabla u|^2 - \frac{(\deg(u))^2}{r^2}\chi + \lambda^2|u|^2(1-|u|^2)^2\right] dx, \tag{3.1} \]
where $r = |x|$. Then the renormalized energy functional has the same Euler-Lagrange equation (1.12).

4. Further Properties about Ginzburg-Landau Energy

Consider the renormalized Ginzburg-Landau energy functional as in [12]
\[ E^{ren}_{gl}(u) = \frac12\int_{\mathbb{R}^2}\left[|\nabla u|^2 - \frac{(\deg u)^2}{r^2}\chi + \lambda^2(1-|u|^2)^2\right] dx. \tag{4.1} \]
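From [12] we know that for any $n$ there exists a unique radially symmetric vortex $u^{(n)} = f^{(n)}e^{in\theta}$ such that $f^{(n)}$ minimizes
\[ E^{ren}_{gl}(f) = \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{n^2}{r^2}(f^2 - \chi) + \lambda^2(1-f^2)^2\right] dx \tag{4.2} \]
among all real $f$ such that $E^{ren}_{gl}(f) < \infty$. Denote
\[ K^{(n)} = E^{ren}_{gl}(f^{(n)}). \]
We also have the following estimate, via the Cauchy-Schwarz inequality:
\[
\begin{aligned}
E^{ren}_{gl}(f) &= \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{n^2}{r^2}(f^2-\chi) + \lambda^2(1-f^2)^2\right] dx \\
&\ge \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{\lambda^2}{2}(1-f^2)^2\right] dx - \frac12\int_{1\le r\le 2}\frac{n^2}{r^2}\, dx + \frac12\int_{r\ge 2}\left[\frac{n^2}{r^2}(f^2-1) + \frac{\lambda^2}{2}(1-f^2)^2\right] dx \\
&\ge \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{\lambda^2}{2}(1-f^2)^2\right] dx - \frac12\int_{1\le r\le 2}\frac{n^2}{r^2}\, dx - \int_{2\le r}\frac{n^4}{\lambda^2 r^4}\, dx \\
&= \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{\lambda^2}{2}(1-f^2)^2\right] dx - \left(n^2\pi\ln 2 + \frac{n^4\pi}{4\lambda^2}\right) \\
&\ge \frac{\lambda}{\sqrt2}\int_{\mathbb{R}^2}|\nabla f|\,|1-f^2|\, dx - \left(n^2\pi\ln 2 + \frac{n^4\pi}{4\lambda^2}\right).
\end{aligned}
\]
Let $N^{(n)} = n^2\pi\ln 2 + \dfrac{n^4\pi}{4\lambda^2}$, then
\[ \frac{\lambda}{\sqrt2}\int_{\mathbb{R}^2}|\nabla f|\,|1-f^2|\, dx \le \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{\lambda^2}{2}(1-f^2)^2\right] dx \le E^{ren}_{gl}(f) + N^{(n)}. \tag{4.3} \]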
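For any open set $\Omega$ we define
\[ \mathcal{H}^1_\infty(\Omega) = \inf\Big\{\sum_j 2r_j : \Omega \subset \bigcup_j B_{r_j}(x_j)\Big\}, \tag{4.4} \]
then $\mathcal{H}^1_\infty(\Omega) \le \mathcal{H}^1(\partial\Omega)$, as noted in [15]. We can see that
\[ t \mapsto \mathcal{H}^1_\infty(\{x : f(x) \le t\}) \]
is an increasing function. Suppose there exists some large $R$ such that $f \ge 1/2$ on $\partial B_R$. Hence
\[
\begin{aligned}
E^{ren}_{gl}(f) + N^{(n)} &\ge \frac{\lambda}{\sqrt2}\int_{\mathbb{R}^2}|\nabla f|\,|1-f^2|\, dx \\
&= \frac{\lambda}{\sqrt2}\int_0^\infty |1-t^2|\,\mathcal{H}^1\big(f^{-1}(t)\big)\, dt \\
&\ge \frac{\lambda}{\sqrt2}\int_{1/4}^{1/2} |1-t^2|\,\mathcal{H}^1\big(f^{-1}(t)\big)\, dt \\
&\ge \frac{\lambda}{\sqrt2}\int_{1/4}^{1/2} |1-t^2|\,\mathcal{H}^1_\infty\big(\{x\in B_R : f(x)\le t\}\big)\, dt \\
&\ge \frac{\lambda}{\sqrt2}\,\mathcal{H}^1_\infty\big(\{x\in B_R : f(x)\le \tfrac14\}\big)\int_{1/4}^{1/2}(1-t^2)\, dt \\
&= \frac{41\lambda}{192\sqrt2}\,\mathcal{H}^1_\infty\big(\{x\in B_R : f(x)\le \tfrac14\}\big) \\
&\ge \frac{\lambda}{10}\,\mathcal{H}^1_\infty\big(\{x\in B_R : f(x)\le \tfrac14\}\big),
\end{aligned}
\]
where we need the assumption that $f \ge 1/2$ on $\partial B_R$ between the third and fourth lines. Therefore $\{x \in B_R : f(x) \le 1/4\} \subset \bigcup_j B_{r_j}(x_j)$ with
\[ \sum_j r_j \le \frac{5}{\lambda}\left(E^{ren}_{gl}(f) + N^{(n)}\right). \tag{4.5} \]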
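The numerical constants in the chain leading to (4.5) can be checked with exact rational arithmetic. The short sketch below (variable names are chosen here for illustration) evaluates $\int_{1/4}^{1/2}(1-t^2)\,dt$ exactly and confirms that the resulting prefactor $41/(192\sqrt2)$ exceeds $1/10$, which is the step that yields the factor $5/\lambda$ for the sum of the radii.

```python
from fractions import Fraction
import math


def antiderivative(t):
    # Antiderivative of 1 - t^2.
    return t - t ** 3 / 3


# Exact value of the integral of (1 - t^2) over [1/4, 1/2].
integral = antiderivative(Fraction(1, 2)) - antiderivative(Fraction(1, 4))

# Prefactor multiplying lambda * H^1_infty before it is rounded down to 1/10.
prefactor = float(integral) / math.sqrt(2)
```

Since the covering is by balls of diameter $2r_j$, a bound of $10/\lambda$ on the Hausdorff content translates into the bound $5/\lambda$ on $\sum_j r_j$ stated in (4.5).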
Note that estimate (4.5) is independent of $R$.

5. Existence of Radially Symmetric Vortices when $A \equiv 0$

Let $u = f e^{in\theta}$ with $f$ real. Then $|\nabla u|^2 = n^2 f^2|\nabla\theta|^2 + |\nabla f|^2$. Hence
\[ E^{ren}_{csh}(u) = \frac12\int_{\mathbb{R}^2}\left[|\nabla f|^2 + \frac{n^2}{r^2}(f^2-\chi) + \lambda^2 f^2(1-f^2)^2\right] dx \equiv E(f). \tag{5.1} \]
Take a positive number $M$ and consider the set
\[ A^{(n)}_M = \{ f \text{ real} \mid E^{ren}_{gl}(f) < K^{(n)} + M,\ E(f) < \infty \}. \tag{5.2} \]
Recall that $E^{ren}_{gl}(f)$ is defined in (4.2). Therefore $A^{(n)}_M$ is not empty, and from the continuity of the functionals $E^{ren}_{gl}(f)$ and $E(f)$ we know that $A^{(n)}_M$ is open. Now we consider minimizing $E$ over the constraint set $A^{(n)}_M$. The main result of this section is the following
Theorem 5.1. For any given $n$, there is an $M$ such that the functional $E(f)$ has a minimizer $f^{(n)}$ on $A^{(n)}_M$. Such a minimizer $f^{(n)}$ is radially symmetric, $0 \le f^{(n)} \le 1$, and $u^{(n)} = f^{(n)}e^{in\theta}$ is an $n$-vortex, i.e. a solution to Eq. (1.12) of degree $n$. If $|n| \ge 1$, then $f^{(n)}$ is monotonically increasing.

Proof. We first show that $E(f) > -\infty$ on $A^{(n)}_M$. We provide two ways.

Method 1. Let $f \in A^{(n)}_M$. Taking $\bar f = |f| \wedge 1$ we have
\[ E^{ren}_{gl}(\bar f) \le E^{ren}_{gl}(f), \qquad E(\bar f) \le E(f), \]
which shows $\bar f \in A^{(n)}_M$. Therefore it suffices to consider $0 \le f \le 1$. Let $v = \sqrt{\overline{f^2}}$, where $\bar p(r) = \frac{1}{2\pi}\int_0^{2\pi} p(r,\theta)\, d\theta$. Then
\[ v^2 = \overline{f^2}, \qquad v^4 \le \overline{f^4}, \qquad |\nabla_r v|^2 \le \overline{|\nabla_r f|^2}. \]
Hence
\[ E^{ren}_{gl}(v) \le E^{ren}_{gl}(f), \qquad E(v) \le E(f), \]
which shows that it suffices to consider $f$ being radially symmetric. From (4.3) we know that $1 - f \in H^1(\mathbb{R}^2)$. For any $s > 0$,
\[ \int_{1/s}^{s} |f'|\, dr \le \left(\int_{1/s}^{s}\frac1r\, dr\right)^{1/2}\left(\int_{1/s}^{s} (f')^2\, r\, dr\right)^{1/2} \le \left(\int_{1/s}^{s}\frac1r\, dr\right)^{1/2}\left(\int_{\mathbb{R}^2} (f')^2\, dx\right)^{1/2} < \infty, \]
so $f' \in L^1[1/s, s]$, which implies $f$ is absolutely continuous on $[1/s, s]$. Therefore we get $f(r) \in C(0,\infty)$.

Due to the definition of $A^{(n)}_M$ and (4.3) we know that $|f| \to 1$, so we can take $R$ large enough so that $f \ge 1/2$ for $|x| \ge R$. For $f \ge 1/4$ we have
\[ \frac{n^2}{r^2}(f^2-1) + \lambda^2 f^2(1-f^2)^2 \ge -\frac{n^4}{4\lambda^2 f^2 r^4} \ge -\frac{4n^4}{\lambda^2 r^4}. \]
Therefore we have n2 2 2 2 2 2 dx ( f − 1) + λ f (1 − f ) 2 r ≥1 r 2 n 1 ≥ ( f 2 − 1) 2 r ≥1, f ≥1/4 r 2 1 n2 2 2 2 2 2 + λ f (1 − f ) d x + ( f − 1) d x 2 r ≥1, f −∞, λ λ
1 E( f ) ≥ 2
(5.3)
where we have used (4.5) in getting the fourth inequality.

Method 2. We use an averaging method by Struwe [17]. From (4.3), (5.2) and $f \ge 0$ we know that there exists some $C$ sufficiently large such that
\[ C \ge \int_{R\le|x|\le 2R}\left[|\nabla f|^2 + (1-f^2)^2\right] dx \ge \int_{R\le|x|\le 2R}\left[|\nabla f|^2 + (1-f)^2\right] dx \ge R\inf_{r\in[R,2R]}\int_{\partial B_r}\left[|\nabla f|^2 + (1-f)^2\right] d\omega. \]
Hence there exists $r_* \in [R, 2R]$ such that
\[ \int_{\partial B_{r_*}}\left[|\nabla f|^2 + (1-f)^2\right] d\omega \le \frac{2C}{R}, \]
which shows
\[ \|1-f\|_{H^1(\partial B_{r_*})} \le \sqrt{\frac{2C}{R}}. \]
Since we have the $H^1$ bound of $1 - f$ on the circle, we may apply Morrey's inequality to get
\[ \|1-f\|_{C^{0,1/2}(\partial B_{r_*})} \le \sqrt{\frac{2C}{R}}. \]
Suppose there is some point $r_*e^{i\theta_*} \in \partial B_{r_*}$ such that $f(r_*e^{i\theta_*}) \le 1/2$, then
\[
\begin{aligned}
|1 - f(r_*e^{i\theta})| &\ge |1 - f(r_*e^{i\theta_*})| - |f(r_*e^{i\theta_*}) - f(r_*e^{i\theta})| \\
&\ge \frac12 - |r_*e^{i\theta_*} - r_*e^{i\theta}|^{1/2}\sqrt{\frac{2C}{R}} \\
&\ge \frac12 - \sqrt{\frac{2Cr_*}{R}}\,|\theta_*-\theta|^{1/2} \\
&\ge \frac12 - 2\sqrt{C}\,|\theta_*-\theta| > \frac14
\end{aligned}
\]
if we consider $|\theta_*-\theta| < 1/(8\sqrt{C})$. Therefore
\[ \frac{2C}{R} \ge \int_{\partial B_{r_*}\cap\{|\theta_*-\theta| < 1/(8\sqrt{C})\}} |1-f|^2\, r_*\, d\theta > \frac{1}{16}\cdot\frac{r_*}{8\sqrt{C}} \ge \frac{R}{128\sqrt{C}}, \]
which is a contradiction if we choose $R$ to be sufficiently large. Therefore, for any large $R$, there is some $r_* \in [R, 2R]$ such that $f \ge 1/2$ on $\partial B_{r_*}$, and (4.5) also holds for such $r_*$. We also notice that the estimate (4.5) is independent of $R$. The rest follows from the same argument as for (5.3).

Hence we have shown that $E(f)$ is bounded from below on $A^{(n)}_M$. We take a minimizing sequence $f_m \in A^{(n)}_M$ such that
\[ \lim_{m\to\infty} E(f_m) = \inf_{u\in A^{(n)}_M} E(u). \]
Without loss of generality we may assume $0 \le f_m \le 1$. Otherwise we consider $\bar f_m = |f_m| \wedge 1$. Since
\[ E(\bar f_m) \le E(f_m), \qquad E^{ren}_{gl}(\bar f_m) \le E^{ren}_{gl}(f_m), \]
$\{\bar f_m\}$ would also be a minimizing sequence in $A^{(n)}_M$. Let $g_m = 1 - f_m$, then $0 \le g_m \le 1$. We have
\[
\begin{aligned}
K^{(n)} + M > E^{ren}_{gl}(f_m) &= \frac12\int_{\mathbb{R}^2}\left[|\nabla g_m|^2 + \frac{n^2}{r^2}(f_m^2-\chi) + \lambda^2 g_m^2(1+f_m)^2\right] dx \\
&\ge \frac12\int_{\mathbb{R}^2}\left[|\nabla g_m|^2 - \frac{2n^2}{r^2}\chi_{|x|\ge1}\, g_m + \lambda^2 g_m^2\right] dx \\
&\ge \frac12\int_{\mathbb{R}^2}\left[|\nabla g_m|^2 + \frac{\lambda^2}{2}g_m^2\right] dx - \int_{\mathbb{R}^2}\frac{n^4}{\lambda^2 r^4}\chi_{|x|\ge1}\, dx.
\end{aligned}
\]
Hence
\[ \int_{\mathbb{R}^2}\left[|\nabla g_m|^2 + g_m^2\right] dx \le C, \]
for some fixed $C < \infty$. Therefore we have up to a subsequence that
\[ g_m \rightharpoonup g_0 \ \text{weakly in } H^1, \qquad g_m \to g_0 \ \text{a.e. in } \mathbb{R}^2. \]
Let $f_0 = 1 - g_0$, then
\[ E(f_0) \le \liminf_{m\to\infty} E(f_m). \]
On the other hand since $0 \le f_m \le 1$ we have that
\[ E(f_m) \le E^{ren}_{gl}(f_m) < K^{(n)} + M. \]
Therefore $f_0 \in A^{(n)}_M$, and
\[ E(f_0) = \inf_{u\in A^{(n)}_M} E(u). \]
Next we show that $f_0$ is radially symmetric, using the same method as in [12]. Let $v = \sqrt{\overline{f_0^2}}$, where $\bar p(r) \equiv \frac{1}{2\pi}\int_0^{2\pi} p(r,\theta)\, d\theta$. Then
\[ v^2 = \overline{f_0^2}, \qquad v^4 \le \overline{f_0^4}, \qquad |\nabla_r v|^2 \le \overline{|\nabla_r f_0|^2}. \]
Hence
\[ E(v) \le E(f_0), \quad\text{and}\quad E^{ren}_{gl}(v) \le E^{ren}_{gl}(f_0), \]
and the equality holds only if $f_0$ is radially symmetric. Thus $f_0$ must be radially symmetric.

Now we show that there exists an $M$ such that $f_0$ is an interior minimizer. Let $f_0^k$ be the minimizer of $E(f)$ over $A^{(n)}_{M+1/k}$ for $k = 1, 2, \ldots$. If $f_0^k$ is not an interior minimizer then by the previous argument we know $0 \le f_0^k \le 1$ and
\[ E(f_0^k) = \inf\{E(u) : u \in A^{(n)}_{M+1/k}\} \le E(f_0^{k-1}), \qquad E^{ren}_{gl}(f_0^k) = K^{(n)} + M + 1/k. \tag{5.4} \]
Therefore all $E^{ren}_{gl}(f_0^k)$ are uniformly bounded by $K^{(n)} + M + 1$. From (4.3), $0 \le f_0^k \le 1$, and the fact that $\int_{\mathbb{R}^2}(1-f_0^k)^2\, dx \le \int_{\mathbb{R}^2}(1-(f_0^k)^2)^2\, dx$ we have that $\|1-f_0^k\|_{H^1}$ is uniformly bounded. Hence $1 - f_0^k(r) \rightharpoonup 1 - f_0(r)$ in $H^1$. Since the $f_0^k$ are radial, from [19], radial $H^1$ functions on $\mathbb{R}^2$ decay like $r^{-1/2}$. We know further that $f_0^k(r) \to f_0(r)$ a.e. Thus applying the weak lower semicontinuity of norms, Fatou's lemma and (5.4) we get
\[ E(f_0) \le \liminf_{k\to\infty} E(f_0^k) = E(f_0^1), \qquad E^{ren}_{gl}(f_0) \le \liminf_{k\to\infty} E^{ren}_{gl}(f_0^k) = K^{(n)} + M. \]
In this way we see that $f_0$ is an interior minimizer of $A^{(n)}_{M+1/k}$ for any $k$.

Now we prove that $f_0' > 0$. Since $f_0$ minimizes $E(f)$, $E'(f_0) = 0$, $E''(f_0) \ge 0$. We have
\[ E'(f) = -\Delta_r f + \frac{n^2}{r^2}f + \lambda^2 f(1-f^2)(1-3f^2), \tag{5.5} \]
\[ E''(f) = -\Delta_r + \frac{n^2}{r^2} + \lambda^2(15f^4 - 12f^2 + 1). \tag{5.6} \]
Differentiating $E'(f_0)$ with respect to $r$ we obtain
\[ \left(E''(f_0) + \frac{1}{r^2}\right) f_0' = \frac{2n^2}{r^3}f_0. \]
Since $f_0 \ge 0$ and $f_0 \not\equiv 0$, applying the maximum principle (see [16], Theorem B.4) we obtain that $f_0' > 0$ when $|n| \ge 1$. Hence we obtain further that $f_0(r) > 0$ for $r > 0$.
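The zeroth-order term of $E''$ in (5.6) is the derivative in $f$ of the nonlinearity in (5.5). The following sketch (function names and the sample value of $\lambda$ are assumptions made for illustration) checks this relation by central differences on a grid in $[0,1]$.

```python
def potential_gradient(f, lam):
    # Nonlinear term of E'(f) in (5.5): lambda^2 f (1 - f^2)(1 - 3 f^2).
    return lam ** 2 * f * (1 - f ** 2) * (1 - 3 * f ** 2)


def potential_hessian(f, lam):
    # Zeroth-order term of E''(f) in (5.6): lambda^2 (15 f^4 - 12 f^2 + 1).
    return lam ** 2 * (15 * f ** 4 - 12 * f ** 2 + 1)


def max_mismatch(lam=1.3, h=1e-5):
    # Compare a central difference of the gradient term with the Hessian term.
    worst = 0.0
    for j in range(101):
        f = j / 100
        fd = (potential_gradient(f + h, lam) - potential_gradient(f - h, lam)) / (2 * h)
        worst = max(worst, abs(fd - potential_hessian(f, lam)))
    return worst
```

The same expansion $f(1-f^2)(1-3f^2) = f - 4f^3 + 3f^5$ is what produces the coefficients $15$, $12$ and $1$ used later in the spectral analysis of Sect. 6.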
Lastly, since $f_0$ is radially symmetric, we have $\nabla f_0\cdot\nabla\theta = 0$. Hence
\[ \Delta(f_0e^{in\theta}) = \left(\Delta f_0 - \frac{n^2}{r^2}f_0\right)e^{in\theta}, \]
and therefore together with (5.5) we obtain that $f_0e^{in\theta}$ satisfies the equation
\[ -\Delta u + \lambda^2 u(1-|u|^2)(1-3|u|^2) = 0. \]
Since $\deg(f_0e^{in\theta}) = n$, we have completed the proof of Theorem 5.1.

Remark 5.2. When $n = 0$, it follows from the definition that a strict absolute minimum is given by $u^{(0)} \equiv z$ for any $z \in \mathbb{C}$ with $|z| = 1$.

6. Stability

We follow the argument in [12] to consider the stability of the critical points of the renormalized Chern-Simons-Higgs energy functional, $E^{ren}_{csh}(u)$. It is discussed in [12] that this question is related to the spectral property of the Hessian of the energy functional, $\mathrm{Hess}\,E^{ren}_{csh}(u)$. We compute $\mathrm{Hess}\,E^{ren}_{csh}(u)$ to be
\[ \mathrm{Hess}\,E^{ren}_{csh}(u) = \begin{pmatrix} -\Delta + \lambda^2(9|u|^4 - 8|u|^2 + 1) & 2\lambda^2(3|u|^2-2)u^2 \\ 2\lambda^2(3|u|^2-2)\bar u^2 & -\Delta + \lambda^2(9|u|^4 - 8|u|^2 + 1) \end{pmatrix}. \tag{6.1} \]
Denote
\[ \vec\xi = \begin{pmatrix} \xi \\ \bar\xi \end{pmatrix}, \tag{6.2} \]
then we also have
\[ \mathrm{Hess}\,E^{ren}_{csh}(u)\,\vec\xi = \overrightarrow{L_u(\xi)}, \tag{6.3} \]
where $L_u(\xi)$ is the linearized operator given in (2.3). Using the same notation as in [12], we denote $\mathrm{Sym\,Null\,Hess}\,E^{ren}_{csh}(u)$ to be the maximal null space of $\mathrm{Hess}\,E^{ren}_{csh}(u)$ due to symmetry breaking. Then it is known that

(i) $\mathrm{Hess}\,E^{ren}_{csh}(u) \ge 0$ and $\mathrm{Null\,Hess}\,E^{ren}_{csh}(u) = \mathrm{Sym\,Null\,Hess}\,E^{ren}_{csh}(u)$ $\Rightarrow$ $u$ is a local minimum of $E^{ren}_{csh}(u)$,
(ii) $\mathrm{Hess}\,E^{ren}_{csh}(u)$ has a negative eigenvalue $\Rightarrow$ $u$ is a saddle point of $E^{ren}_{csh}(u)$.

The main stability result of this section is

Theorem 6.1. $u^{(n)}$ are local minima of $E^{ren}_{csh}(u)$ for $n = 0, \pm1$.

Theorem 6.1 says that the $n$-vortices are stable for $|n| = 0, 1$. From Remark 5.2 we know that when $n = 0$, $f^{(0)} \equiv 1$ and the corresponding 0-vortex $u^{(0)}$ is an absolute minimum. Hence we only argue the case when $n = \pm1$. From the previous argument we know that we need to understand the spectrum of $\mathrm{Hess}\,E^{ren}_{csh}(u)$. Without loss of generality we assume $n = 1$. The case $n = -1$ can be treated the same way by observing that $u^{(1)} = \overline{u^{(-1)}}$.
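We begin with an elementary harmonic analysis of the linearized operator $L_{u^{(n)}}$, which is closely related to $\mathrm{Hess}\,E^{ren}_{csh}(u)$ (see (6.3)). Consider a function $\xi(r,\theta)$ in polar coordinates and expand it in the Fourier series in $\theta$,
\[ \xi(r,\theta) = \sum_{k=-\infty}^{\infty} \xi_k(r)e^{ik\theta}, \]
where the Fourier coefficients are given by
\[ \xi_k(r) = \frac{1}{2\pi}\int_0^{2\pi} \xi(r,\theta)e^{-ik\theta}\, d\theta. \]
Consider the map of measurable functions $\xi : \mathbb{R}^2 \to \mathbb{C}$ into measurable functions
\[ \hat\xi = \bigoplus_{k\ge n}\begin{pmatrix} \xi_k \\ \bar\xi_{2n-k} \end{pmatrix}. \]
If the $\vec\xi$'s are endowed with the inner product
\[ \langle\vec\xi, \vec\eta\rangle = 2\,\mathrm{Re}\int_{\mathbb{R}^2} \bar\xi\,\eta\, dx, \]
where $\vec\xi$ is given in (6.2), then this map is unitary, provided the $\hat\xi$'s are endowed with the inner product
\[ \langle\hat\xi, \hat\eta\rangle = \mathrm{Re}\,\langle\xi_n, \eta_n\rangle + \mathrm{Re}\sum_{k>n}\left\langle \begin{pmatrix}\xi_k \\ \bar\xi_{2n-k}\end{pmatrix}, \begin{pmatrix}\eta_k \\ \bar\eta_{2n-k}\end{pmatrix}\right\rangle. \]
Define the real linear operator $\hat L_{u^{(n)}}$ on functions $\hat\xi$ by
\[ \hat L_{u^{(n)}}(\hat\xi) = \widehat{L_{u^{(n)}}(\xi)}. \tag{6.4} \]
We first give the characterization of $\hat L_{u^{(n)}}$.

Lemma 6.2. The operator $\hat L_{u^{(n)}}$ is block diagonal of the form
\[ \hat L_{u^{(n)}}(\hat\xi) = \bigoplus_{k\ge n} L^k_{u^{(n)}}\begin{pmatrix}\xi_k \\ \bar\xi_{2n-k}\end{pmatrix}, \tag{6.5} \]
where $L^k_{u^{(n)}}$ are given by
\[ L^k_{u^{(n)}} = \begin{pmatrix} -\Delta_r + \dfrac{k^2}{r^2} + \lambda^2(9|u^{(n)}|^4 - 8|u^{(n)}|^2 + 1) & 2\lambda^2|u^{(n)}|^2(3|u^{(n)}|^2 - 2) \\[2mm] 2\lambda^2|u^{(n)}|^2(3|u^{(n)}|^2 - 2) & -\Delta_r + \dfrac{(2n-k)^2}{r^2} + \lambda^2(9|u^{(n)}|^4 - 8|u^{(n)}|^2 + 1) \end{pmatrix}, \tag{6.6} \]
where $\Delta_r f = \frac1r\partial_r(r\partial_r f)$.

Proof. First it is easily checked that
\[ \left(L_{u^{(n)}}(\xi)\right)_k = \left[-\Delta_r + \frac{k^2}{r^2} + \lambda^2(9|u^{(n)}|^4 - 8|u^{(n)}|^2 + 1)\right]\xi_k + \lambda^2(6|u^{(n)}|^2 - 4)\,|u^{(n)}|^2\,\bar\xi_{2n-k}. \tag{6.7} \]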
Since $\Delta = \Delta_r + r^{-2}\partial_\theta^2$, we have
\[ (-\Delta\xi)_k = -\Delta_r\xi_k + \frac{k^2}{r^2}\xi_k. \]
Moreover we have
\[ \left((u^{(n)})^2\bar\xi\right)_k = |u^{(n)}|^2(2\pi)^{-1/2}\int_0^{2\pi} e^{i2n\theta}\,\bar\xi\, e^{-ik\theta}\, d\theta = |u^{(n)}|^2(2\pi)^{-1/2}\,\overline{\int_0^{2\pi} \xi\, e^{-i(2n-k)\theta}\, d\theta} = |u^{(n)}|^2\,\bar\xi_{2n-k}. \]
Therefore (6.7) implies
\[ \begin{pmatrix} (L_{u^{(n)}}\xi)_k \\ \overline{(L_{u^{(n)}}\xi)_{2n-k}} \end{pmatrix} = L^k_{u^{(n)}}\begin{pmatrix}\xi_k \\ \bar\xi_{2n-k}\end{pmatrix}, \]
which, due to (6.4), yields (6.5).

Lemma 6.3. For $n \ge 1$, we have the following characterization of the linear operators:
(1) $L^n_{u^{(n)}} \ge 0$ and if 0 is an eigenvalue, then it is non-degenerate.
(2) $L^{n+1}_{u^{(n)}} \ge 0$ and 0 is its non-degenerate eigenvalue.
(3) $L^k_{u^{(n)}} \ge 0$ for $k \ge 3n$ and 0 is not an eigenvalue.
(4) The continuous spectrum $\mathrm{cont\,spec}\,L^k_{u^{(n)}} = [0, \infty)$, for any $k$.
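The mode coupling $k \leftrightarrow 2n-k$ in Lemma 6.2 comes from the identity $\big(e^{2in\theta}\bar\xi\big)_k = \overline{\xi_{2n-k}}$ used in its proof, which does not depend on the normalization chosen for the Fourier coefficients. A minimal numerical sketch, assuming a hypothetical trigonometric polynomial as test function (the coefficients below are arbitrary and not taken from the paper):

```python
import cmath
import math


def fourier_coefficient(g, k, samples=256):
    # (1 / 2pi) * int_0^{2pi} g(theta) e^{-i k theta} dtheta by the rectangle rule,
    # which is exact for trigonometric polynomials of degree below `samples`.
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        total += g(theta) * cmath.exp(-1j * k * theta)
    return total / samples


def xi(theta):
    # Hypothetical test function with Fourier modes 1, -2 and 3.
    return ((1 + 2j) * cmath.exp(1j * theta)
            + (0.5 - 1j) * cmath.exp(-2j * theta)
            + 3 * cmath.exp(3j * theta))


def mismatch(n, k):
    # | (e^{2 i n theta} conj(xi))_k - conj(xi_{2n-k}) |
    lhs = fourier_coefficient(lambda t: cmath.exp(2j * n * t) * xi(t).conjugate(), k)
    rhs = fourier_coefficient(xi, 2 * n - k).conjugate()
    return abs(lhs - rhs)
```

For every pair $(n, k)$ the mismatch is at the level of rounding error, which illustrates why the $k$-th and $(2n-k)$-th modes decouple from all others.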
Proof. (1) Due to the breaking of the gauge symmetry we have $L_{u^{(n)}}(iu^{(n)}) = 0$. After separating out the angular variable we obtain
\[ \left[-\Delta_r + \frac{n^2}{r^2} + \lambda^2(3f_n^4 - 4f_n^2 + 1)\right] f_n = 0, \]
where $f_n = |u^{(n)}|$. We can rewrite the equation as
\[ \left[-\Delta_r + \frac{n^2}{r^2} + 4\lambda^2(1-f_n^2) - 3\lambda^2(1-f_n^4)\right] f_n = 0. \]
Since $f_n > 0$, bounded and $\notin L^2(\mathbb{R}^+, r\,dr)$, we can apply Theorem B.1 in [12] to conclude that the operator $-\Delta_r + \frac{n^2}{r^2} + \lambda^2(3f_n^4 - 4f_n^2 + 1)$ is non-negative, 0 is not its eigenvalue, and any solution $g$ to the equation
\[ \left[-\Delta_r + \frac{n^2}{r^2} + \lambda^2(3f_n^4 - 4f_n^2 + 1)\right] g = 0 \]
is of the form $g = cf_n$, where $c$ is some constant.

Also since $f_n$ minimizes $E(f)$ we have
\[ E''(f_n) = -\Delta_r + \frac{n^2}{r^2} + \lambda^2(15f_n^4 - 12f_n^2 + 1) \ge 0. \]
Suppose $E''(f_n)$ has an eigenvalue 0, and the corresponding eigenfunction is $\psi_0(r)$. Then we can write
\[ 0 = E''(f_n)\psi_0 = \left[-\Delta_r + \frac{n^2}{r^2} + 4\lambda^2(3f_n^4 - 3f_n^2 + 1) - 3\lambda^2(1-f_n^4)\right]\psi_0, \]
where we see that $4\lambda^2(3f_n^4 - 3f_n^2 + 1) \ge \lambda^2 > 0$. We now follow the idea from [12]. Let
\[ L_0 = -\Delta_r + \frac{n^2}{r^2} + 4\lambda^2(3f_n^4 - 3f_n^2 + 1), \qquad V = 3\lambda^2(1-f_n^4) > 0. \]
Then $V = O(r^{-2})$ as $r \to \infty$. Let $R_0(\alpha) = (L_0 - \alpha)^{-1}$ with $\alpha \le \lambda^2$ and consider the Birman-Schwinger-type operator function
\[ K(\alpha) = \sqrt{V}\,R_0(\alpha)\,\sqrt{V}. \]
Then we have
• If $(E''(f_n) - \alpha)\psi = 0$, then $K(\alpha)\varphi = \varphi$ with $\varphi = \sqrt{V}\psi$. If $\psi \in L^2(\mathbb{R}^2)$ then $\varphi \in L^2(\mathbb{R}^2)$.
• If $K(\alpha)\varphi = \varphi$, then $(E''(f_n) - \alpha)\psi = 0$ with $\psi = R_0(\alpha)\sqrt{V}\varphi$. If $\varphi \in L^2(\mathbb{R}^2)$ then for $\alpha < \lambda^2$, $|\psi| \le Ce^{-r\sqrt{\lambda^2-\alpha}}$.

We also have the following result

Lemma 6.4 (Ovchinnikov-Sigal [12]). $K(\alpha)$ with $\alpha \le \lambda^2$ is positivity improving, i.e. $K(\alpha)\varphi > 0$ (modulo a set of zero measure) whenever $\varphi \ge 0$, and
(1) $\alpha$ is an eigenvalue of $E''(f_n)$ iff 1 is an eigenvalue of $K(\alpha)$.
(2) $\alpha$ is the lowest eigenvalue of $E''(f_n)$ iff 1 is the largest eigenvalue of $K(\alpha)$.

By the assumption that $E''(f_n)$ has an eigenvalue 0 and the fact that $E''(f_n) \ge 0$ we know that 0 is its lowest eigenvalue. Hence from the above lemma, 1 is the largest eigenvalue of $K(0)$. Therefore we have (see [14], Theorem XIII.43) that the eigenfunction of $K(0)$, $\varphi_0$, is positive and the eigenvalue 1 is non-degenerate. Thus from Lemma 6.4, 0 is a non-degenerate eigenvalue of $E''(f_n)$ and $\psi_0$ can be taken to be $\psi_0 = L_0^{-1}\sqrt{V}\varphi_0 > 0$, and $\psi_0$ decays exponentially fast.

On the other hand when $k = n$, we have $R\,L^n_{u^{(n)}}R^T = L$, where
\[ R = \frac{1}{\sqrt2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad L = \begin{pmatrix} -\Delta_r + \dfrac{n^2}{r^2} + \lambda^2(3f_n^4 - 4f_n^2 + 1) & 0 \\[2mm] 0 & -\Delta_r + \dfrac{n^2}{r^2} + \lambda^2(15f_n^4 - 12f_n^2 + 1) \end{pmatrix}. \]
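The conjugation $R\,L^n_{u^{(n)}}R^T = L$ only involves the zeroth-order parts of the block, because $-\Delta_r$ enters as a multiple of the identity and commutes with $R$. The sketch below (helper names and sample parameter values are assumptions for illustration) checks the diagonalization pointwise for the multiplication parts of (6.6) with $k = n$.

```python
import math


def block_n(f, lam, n, r):
    # Multiplication part of L^n in (6.6), with |u| = f and the -Delta_r term omitted.
    diagonal = n ** 2 / r ** 2 + lam ** 2 * (9 * f ** 4 - 8 * f ** 2 + 1)
    off_diagonal = 2 * lam ** 2 * f ** 2 * (3 * f ** 2 - 2)
    return [[diagonal, off_diagonal], [off_diagonal, diagonal]]


def matmul(a, b):
    # Product of two 2x2 matrices given as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conjugate_by_r(m):
    # R m R^T with R = (1 / sqrt 2) [[1, -1], [1, 1]].
    s = 1 / math.sqrt(2)
    rot = [[s, -s], [s, s]]
    rot_t = [[s, s], [-s, s]]
    return matmul(matmul(rot, m), rot_t)


def expected_diagonal(f, lam, n, r):
    # Multiplication parts of the two diagonal entries of L.
    return (n ** 2 / r ** 2 + lam ** 2 * (3 * f ** 4 - 4 * f ** 2 + 1),
            n ** 2 / r ** 2 + lam ** 2 * (15 * f ** 4 - 12 * f ** 2 + 1))
```

The first diagonal entry is the operator annihilating $f_n$ (gauge zero mode) and the second is $E''(f_n)$, which is how the two sign conditions established above combine into $L^n_{u^{(n)}} \ge 0$.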
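Therefore $L^n_{u^{(n)}} \ge 0$ and if 0 is an eigenvalue then it is non-degenerate, which proves (1).

(2) We perform a similar argument as in (1), but instead of the zero mode due to breaking the gauge symmetry we use the zero mode due to breaking the translation symmetry. Such a zero mode is $\nabla u^{(n)}$. Due to Lemma 2.3 and since $n - 1 = 2n - k$ for $k = n + 1$, $\widehat{\partial_{x_j}u^{(n)}}$ contains only the $k = n + 1$ block, $(n+1, n-1)$:
\[ \widehat{\partial_{x_1}u^{(n)}} = \bigoplus_{k\ge n} g^{(n)}\delta_{k,n+1}, \qquad \widehat{\partial_{x_2}u^{(n)}} = \bigoplus_{k\ge n} \left(-ig^{(n)}\right)\delta_{k,n+1}, \]
where
\[ g^{(n)} = \frac12\begin{pmatrix} (f^{(n)})' - \dfrac{n}{r}f^{(n)} \\[2mm] (f^{(n)})' + \dfrac{n}{r}f^{(n)} \end{pmatrix}. \]
Hence
\[ \hat L_{u^{(n)}}\big(\widehat{\partial_{x_1}u^{(n)}}\big) = 0 = \bigoplus_{k\ge n} L^{n+1}_{u^{(n)}}g^{(n)}\delta_{k,n+1}, \]
and therefore
\[ L^{n+1}_{u^{(n)}}g^{(n)} = 0. \tag{6.8} \]
The zero mode $\widehat{\partial_{x_2}u^{(n)}}$ leads to the same equation. Notice that
\[ Rg^{(n)} = \begin{pmatrix} \dfrac{n}{r}f^{(n)} \\[2mm] (f^{(n)})' \end{pmatrix}. \]
Since $f^{(n)} > 0$ and $(f^{(n)})' > 0$ for $r > 0$, $Rg^{(n)}$ has positive entries. Hence (6.8) together with Appendix B in [12] shows that 0 is the lowest eigenvalue of $L^{n+1}_{u^{(n)}}$ and is non-degenerate. This and statement (4) imply statement (2).

(3) We have
\[ L^k_{u^{(n)}} - L^{n+1}_{u^{(n)}} = \begin{pmatrix} \dfrac{k^2 - (n+1)^2}{r^2} & 0 \\[2mm] 0 & \dfrac{(k-2n)^2 - (n-1)^2}{r^2} \end{pmatrix} > 0, \]
for $k \ge 3n$. From (2) we obtain (3).

(4) As $|x| \to \infty$, we have
\[ L^k_{u^{(n)}} \to \begin{pmatrix} -\Delta + 2\lambda^2 & 2\lambda^2 \\ 2\lambda^2 & -\Delta + 2\lambda^2 \end{pmatrix} =: L_0. \]
We know that
\[ \mathrm{cont\,spec}\,L^k_{u^{(n)}} = \mathrm{spec}\,L_0. \tag{6.9} \]
Using the transformation matrix $R$ to diagonalize $L_0$ as
\[ R\,L_0R^T = \begin{pmatrix} -\Delta & 0 \\ 0 & -\Delta + 4\lambda^2 \end{pmatrix}, \]
we see that $\mathrm{spec}\,L_0 = [0,\infty) \cup [4\lambda^2, \infty) = [0, \infty)$, which together with (6.9) yields (4).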
Proof of Theorem 6.1. When $n = 1$, due to Lemma 6.2 and Lemma 6.3, $\hat L_{u^{(1)}} \ge 0$, i.e. $\mathrm{Hess}\,E^{ren}_{csh}(u^{(1)}) \ge 0$ with zero modes determined either completely by the symmetry breaking or by symmetry breaking and an extra mode $\psi_0e^{i\theta}$. In the first case we obviously know that $u^{(1)}$ is a local minimum.

If $(\psi_0e^{i\theta}, \psi_0e^{-i\theta})^T \in \mathrm{Null\,Hess}\,E^{ren}_{csh}(u^{(1)})$, then we know that except for the direction generated by $\psi_0e^{i\theta}$, $u^{(1)}$ minimizes $E^{ren}_{csh}$ locally. Along this direction we compute the second variation of the renormalized energy to be
\[ \lim_{\varepsilon\to0}\partial_\varepsilon^2E^{ren}_{csh}(u^{(1)} + \varepsilon\psi) = \mathrm{Re}\int_{\mathbb{R}^2}\bar\psi\,L_{u^{(1)}}(\psi)\, dx, \tag{6.10} \]
where $\psi = c\psi_0e^{i\theta}$ for some $c \in \mathbb{C}$ and $L_u$ is given in (2.3). If $c = a + ib$, where $a, b \in \mathbb{R}$, further computation gives
\[
\begin{aligned}
\mathrm{Re}\int_{\mathbb{R}^2}\bar\psi\,L_{u^{(1)}}(\psi)\, dx &= \mathrm{Re}\int_{\mathbb{R}^2}\left\{|c|^2\psi_0\cdot\left[-\Delta_r + \frac{n^2}{r^2} + \lambda^2(9f_1^4 - 8f_1^2 + 1)\right]\psi_0 + \bar c^2\lambda^2(6f_1^2 - 4)f_1^2\psi_0^2\right\} dx \\
&= a^2\left\langle E''(f_1)\psi_0, \psi_0\right\rangle_{L^2} + b^2\left\langle\left[-\Delta_r + \frac{n^2}{r^2} + \lambda^2(3f_1^4 - 4f_1^2 + 1)\right]\psi_0, \psi_0\right\rangle_{L^2} \\
&= b^2\left\langle\left[-\Delta_r + \frac{n^2}{r^2} + \lambda^2(3f_1^4 - 4f_1^2 + 1)\right]\psi_0, \psi_0\right\rangle_{L^2} > 0
\end{aligned}
\]
if $b \neq 0$ (from Lemma 6.3). Hence
\[ E^{ren}_{csh}(u^{(1)} + c\psi_0e^{i\theta}) > E^{ren}_{csh}(u^{(1)}) \]
for $|c|$ sufficiently small. Therefore we only need to check the case when $c \in \mathbb{R}$. In this case, from (5.1) we know that
\[ E^{ren}_{csh}(u^{(1)} + \psi) = E^{ren}_{csh}((f_1 + c\psi_0)e^{i\theta}) = E(f_1 + c\psi_0). \]
Hence we can apply Theorem 5.1 to obtain that $u^{(1)}$ is also a local minimizer along the direction $\psi_0e^{i\theta}$. This completes the proof of Theorem 6.1.

7. Existence of Radially Symmetric Vortices when $A \not\equiv 0$

When the magnetic field $A \not\equiv 0$, the CSH energy functional and the Euler-Lagrange equations become more complicated. We look for minimizers of the CSH energy (1.10) among all symmetric vortices of the form (1.13), (1.14), with $f^{(n)}, a_n \to 1$ as $r \to \infty$. (This means we are looking for topological symmetric vortices.) Let
\[ u = f(r)e^{in\theta}, \qquad A = n\,\frac{a(r)}{r}\,x^\perp. \tag{7.1} \]
Then the CSH energy functional takes the following radial form:
\[ G^r_{csh}(f, a) = \frac12\int_{\mathbb{R}^2}\left[|f'|^2 + \frac{n^2}{r^2}(1-a)^2f^2 + \frac{\mu^2n^2}{4r^2}\left(\frac{a'}{f}\right)^2 + \lambda^2f^2\left(1-f^2\right)^2\right] dx. \tag{7.2} \]
Our argument is also based on the results for Ginzburg-Landau vortices. Following [1], we define the spaces

$C_f$ = the set of real-valued radially symmetric functions $f(|x|)$ defined on $\mathbb{R}^2$ such that $f \ge 0$ a.e. and $1 - f \in H^1(\mathbb{R}^2)$.

$C_a$ = the set of real-valued radially symmetric functions $a(|x|)$ defined on $\mathbb{R}^2$ such that $a/r \in L^2_{loc}(\mathbb{R}^2)$ and $a'/r \in L^2(\mathbb{R}^2)$, where the derivative $a'$ is in the distributional sense.

Recall the following results for Ginzburg-Landau equations.

Lemma 7.1 (Berger-Chen [1]). The $C_a$ and $C_f$ spaces satisfy the following:
(1) $C_a$ with the inner product which induces the norm $\|a\|_{C_a} = \|a'/r\|_{L^2(\mathbb{R}^2)}$ is a Hilbert space.
(2) For $f \in C_f$, $f(r) \in C(0, \infty)$.
(3) For $a \in C_a$, $a(r) \in C[0, \infty)$, $a(0) = 0$, $a(r) = \int_0^r a'(s)\, ds$, and
\[ \sup_{r\in(0,\infty)} |a/r| \le \|a\|_{C_a}. \]
(4) If $f \in C_f$, $a \in C_a$, and $G^r_{gl}(f, a) < \infty$, where $G^r_{gl}(f, a)$ is the radial Ginzburg-Landau energy given by
\[ G^r_{gl}(f, a) = \frac12\int_{\mathbb{R}^2}\left[|f'|^2 + \frac{n^2}{r^2}(1-a)^2f^2 + \frac{n^2(a')^2}{r^2} + \lambda^2\left(1-f^2\right)^2\right] dx, \tag{7.3} \]
then $f \in C[0, \infty)$ and $f(0) = 0$.

Theorem 7.2 (Berger-Chen [1]). For any integer $n$ and $\lambda$, there is a solution $(u^{(n)}, A^{(n)})$ to the Ginzburg-Landau equation which is of the form (1.13), (1.14). In particular, $(f^{(n)}, a_n) \in C_f \times C_a$ minimizes the radial Ginzburg-Landau energy $G^r_{gl}(f, a)$ defined in (7.3).

First we note that $G^r_{csh}(f, a) \ge 0$ and $G^r_{csh}(1, 1) = 0$. Therefore $u \equiv 1$ and $A \equiv 1$ give a trivial solution to the CSH equations and minimize $G^r_{csh}(f, a)$ without restricting $G^r_{csh}(f, a)$ by any vortex number $n \neq 0$. On the other hand, if we take $(u, A)$ of the form (7.1) with $f, a \to 1$ at $\infty$ and if $m = \inf_{C_f\times C_a}G^r_{csh}(f, a)$ is attained at $(f_0, a_0)$, then $m > 0$. Otherwise $\int_{\mathbb{R}^2}f_0^2(1-f_0^2)^2\, dx = \int_{\mathbb{R}^2}(1-a_0)^2f_0^2/r^2\, dx = 0$. From Lemma 7.1 we obtain the continuity of $f_0$ and $a_0$. Therefore we know from the integral identities that $f_0 \equiv 1$ and $a_0 \equiv 1$, which contradicts that $a_0/r \in L^2_{loc}(\mathbb{R}^2)$. Similarly we know that $\inf_{C_f\times C_a}G^r_{gl}(f, a) > 0$.

Let $m_0 = \inf_{C_f\times C_a}G^r_{gl}(f, a) > 0$. From Theorem 7.2 we know that $m_0$ is attained in $C_f \times C_a$. For any $M > 0$, let
\[ B_M = \{(f, a) \in C_f \times C_a : G^r_{gl}(f, a) < m_0 + M\}. \tag{7.4} \]
The main result of this section is the following

Theorem 7.3. There is an $M$ such that the infimum of $G^r_{csh}$ over $B_M$ is attained and is positive.
Proof. Since $G^r_{csh} \ge 0$, we can take a minimizing sequence $(f_m, a_m) \in B_M$. Therefore we have
(i) $G^r_{csh}(f_m, a_m) < K$ for some $K < \infty$, and
(ii) $G^r_{gl}(f_m, a_m) < m_0 + M$.

From (ii) we know that
\[ K > \int_{\mathbb{R}^2}\left[|f_m'|^2 + \lambda^2(1-f_m^2)^2\right] dx = \int_{\mathbb{R}^2}\left[|f_m'|^2 + \lambda^2(1-f_m)^2(1+f_m)^2\right] dx \ge \min\{1, \lambda^2\}\int_{\mathbb{R}^2}\left[|f_m'|^2 + (1-f_m)^2\right] dx = \min\{1, \lambda^2\}\,\|1-f_m\|^2_{H^1(\mathbb{R}^2)}, \]
\[ K > \int_{\mathbb{R}^2}\frac{n^2(a_m')^2}{r^2}\, dx = n^2\|a_m\|^2_{C_a}. \]
Therefore $1 - f_m$ is bounded in $H^1(\mathbb{R}^2)$ and $a_m$ is bounded in $C_a$. Hence we may extract a subsequence, still denoted $(f_m, a_m)$, such that
\[ 1 - f_m \rightharpoonup 1 - f_0 \ \text{weakly in } H^1(\mathbb{R}^2), \qquad a_m \rightharpoonup a_0 \ \text{weakly in } C_a, \]
with $(f_0, a_0) \in C_f \times C_a$. Thus the Rellich-Kondrachov embedding theorem implies strong convergence in $L^p_{loc}(\mathbb{R}^2)$. From [19] we know that radial $H^1$ functions in $\mathbb{R}^2$ have good decay properties, like $r^{-1/2}$. Hence it is easy to see that $f_m(r) \to f_0(r)$, $a_m(r) \to a_0(r)$ a.e. Using Fatou's Lemma and the weak lower semicontinuity of the $L^2$-norm of $f_m'$ we obtain that
\[ \int_{\mathbb{R}^2}\left[|f_0'|^2 + \frac{n^2}{r^2}(1-a_0)^2f_0^2 + \lambda^2f_0^2\left(1-f_0^2\right)^2\right] dx \le \liminf_{m\to\infty}\int_{\mathbb{R}^2}\left[|f_m'|^2 + \frac{n^2}{r^2}(1-a_m)^2f_m^2 + \lambda^2f_m^2\left(1-f_m^2\right)^2\right] dx. \tag{7.5} \]
It is also easy to see from the Ginzburg-Landau energy form (7.3) that
\[ G^r_{gl}(f_0, a_0) \le \liminf_{m\to\infty}G^r_{gl}(f_m, a_m) \le m_0 + M. \]
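Hence $(f_0, a_0) \in B_M$. From (i) we know that
\[ \frac{a_m'/f_m}{r} \in L^2(\mathbb{R}^2). \]
Let $g_m' = a_m'/f_m$. Then since $g_m'/r \in L^2(\mathbb{R}^2)$,
\[ \int_0^r |g_m'|\, ds \le \left(\int_0^r s\, ds\right)^{1/2}\left(\int_0^r \frac1s|g_m'|^2\, ds\right)^{1/2} \le \frac{r}{\sqrt2}\left(\int_{\mathbb{R}^2}\frac{1}{|x|^2}|g_m'|^2\, dx\right)^{1/2} < \frac{r}{\sqrt2}\cdot\frac{2\sqrt{2K}}{\mu n} < \infty. \]
Therefore $g_m' \in L^1_{loc}(\mathbb{R}^+)$, hence
\[ g_m(r) - g_m(0) = \int_0^r g_m'(s)\, ds. \]
It is also easy to see that $\frac1r(g_m(r) - g_m(0)) \in L^2_{loc}(\mathbb{R}^2)$. Let $h_m = g_m(r) - g_m(0)$, then $h_m \in C_a$ and $\|h_m\|_{C_a} < K$. So there is a subsequence, still denoted $h_m$, such that $h_m \rightharpoonup h_0$ weakly in $C_a$ and $h_m(r) \to h_0(r)$ a.e.

For any $r > 0$, since $a_m \in C_a$, we know that $a_m' \in L^1_{loc}(\mathbb{R}^+)$. Thus
\[ a_m(r) = a_m(l) + \int_l^r a_m'(s)\, ds, \]
for $0 < l < r$. Plugging in $a_m' = f_mh_m'$ and performing integration by parts we obtain
\[ a_m(r) = a_m(l) + f_mh_m\Big|_l^r - \int_l^r f_m'h_m\, ds. \]
Letting $m \to \infty$ we get
\[ a_m(r) - a_m(l) \to a_0(r) - a_0(l), \qquad f_mh_m\Big|_l^r \to f_0h_0\Big|_l^r, \]
\[
\begin{aligned}
\left|\int_l^r f_m'h_m\, ds - \int_l^r f_0'h_0\, ds\right| &= \left|\int_l^r f_m'(h_m - h_0)\, ds + \int_l^r (f_m' - f_0')h_0\, ds\right| \\
&\le \|h_m - h_0\|_{L^\infty[l,r]}\left(\int_l^r \frac1s\, ds\right)^{1/2}\left(\int_{\mathbb{R}^2}|f_m'|^2\, dx\right)^{1/2} + \left|\int_l^r (f_m' - f_0')h_0\, ds\right| \\
&\to 0 \quad\text{as } m \to \infty.
\end{aligned}
\]
Therefore
\[ a_0(r) = a_0(l) + f_0h_0\Big|_l^r - \int_l^r f_0'h_0\, ds, \]
which in turn gives that $a_0' = f_0h_0'$. Thus
\[ \frac{a_m'/f_m}{r} \rightharpoonup \frac{a_0'/f_0}{r} \quad\text{weakly in } L^2(\mathbb{R}^2). \]
Hence the weak lower semicontinuity of the $L^2$-norm implies that
\[ \int_{\mathbb{R}^2}\frac{\mu^2n^2}{4r^2}\left(\frac{a_0'}{f_0}\right)^2 dx \le \liminf_{m\to\infty}\int_{\mathbb{R}^2}\frac{\mu^2n^2}{4r^2}\left(\frac{a_m'}{f_m}\right)^2 dx. \tag{7.6} \]
Combining (7.5) and (7.6) we obtain that
\[ G^r_{csh}(f_0, a_0) \le \liminf_{m\to\infty}G^r_{csh}(f_m, a_m). \]
Thus we have
\[ G^r_{csh}(f_0, a_0) = \inf_{B_M}G^r_{csh}(f, a). \tag{7.7} \]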
Next we show that there is some $M$ such that $(f_0, a_0)$ is an interior minimizer. The argument is similar to the one in Theorem 5.1. Let $(f_0^k, a_0^k)$ be minimizers of $G^r_{csh}$ over $B_{M+1/k}$ for $k = 1, 2, \ldots$. If they are not interior minimizers, then
\[ G^r_{csh}(f_0^k, a_0^k) = \inf_{B_{M+1/k}}G^r_{csh} \le G^r_{csh}(f_0^{k-1}, a_0^{k-1}), \qquad G^r_{gl}(f_0^k, a_0^k) = m_0 + M + 1/k. \tag{7.8} \]
Hence $G^r_{gl}(f_0^k, a_0^k)$ is uniformly bounded and then, as discussed before, we have
\[ 1 - f_0^k(r) \rightharpoonup 1 - f_0(r) \ \text{in } H^1, \qquad a_0^k(r) \rightharpoonup a_0(r) \ \text{in } C_a, \]
\[ f_0^k(r) \to f_0(r), \quad a_0^k(r) \to a_0(r) \ \text{a.e.}, \qquad \frac{(a_0^k)'/f_0^k}{r} \rightharpoonup \frac{a_0'/f_0}{r} \ \text{in } L^2. \]
Therefore applying lower semicontinuity of norms, Fatou's lemma and (7.8) we obtain
\[ G^r_{csh}(f_0, a_0) \le \liminf_{k\to\infty}G^r_{csh}(f_0^k, a_0^k) = G^r_{csh}(f_0^1, a_0^1), \]
\[ G^r_{gl}(f_0, a_0) \le \liminf_{k\to\infty}G^r_{gl}(f_0^k, a_0^k) = m_0 + M. \]
Thus $(f_0, a_0)$ is an interior minimizer of $B_{M+1/k}$ for any $k$. The positivity of the infimum is given in the argument before the theorem.

It is easy to check that the minimizing solution $(f_0, a_0)$ obtained in Theorem 7.3 solves the following equations on $\mathbb{R}^2\setminus\{0\}$:
\[ \Delta_rf = \frac{n^2}{r^2}(1-a)^2f - \frac{\mu^2n^2}{4}\frac{(a')^2}{r^2f^3} + \lambda^2(1-f^2)(1-3f^2)f, \tag{7.9} \]
\[ -\frac{\mu^2}{4}\left(\frac{a'}{rf^2}\right)' = \frac{f^2(1-a)}{r}, \tag{7.10} \]
where $\Delta_r = \frac1r\partial_r(r\partial_r)$ is the radial Laplacian.

8. Basic Properties of Symmetric Vortices

In Sect. 5 we showed the existence of symmetric vortices in the absence of the magnetic field, and discovered some properties of the vortices. In this section we will develop some basic properties of the symmetric vortices when $A \not\equiv 0$.

1. Regularity.

Theorem 8.1. Suppose $(u^{(n)}, A^{(n)})$ is of the form (1.13), (1.14), where $(f^{(n)}(r), a_n(r))$ is obtained from Theorem 7.3. Then $f^{(n)}(r), a_n(r) \in C^\infty(0, \infty)$, hence $(u^{(n)}, A^{(n)})$ is $C^2$ on $\mathbb{R}^2\setminus\{0\}$.

Proof. For simplicity, we omit the superscript and subscript $n$ in $(f^{(n)}, a_n)$ in the proof of Theorem 8.1. We know from the previous section that $(f, a)$ satisfy Eqs. (7.9) and (7.10) on $\mathbb{R}^2\setminus\{0\}$. From Lemma 7.1 we know that $f, a \in C[0, \infty)$, $f(0) = a(0) = 0$. Moreover, since $f, a \to 1$ as $r \to \infty$, we have $f, a \in L^\infty$.

From (7.10) we have for any $l > 0$,
\[
\begin{aligned}
\int_{1/l}^l\left|\left(\frac{a'}{rf^2}\right)'\right| dr &= \frac{4}{\mu^2}\int_{1/l}^l\frac{f^2|1-a|}{r}\, dr \\
&\le \frac{4}{\mu^2}\left(\int_{1/l}^l\frac{f^2}{r}\, dr\right)^{1/2}\left(\int_0^l\frac{f^2(1-a)^2}{r^2}\, r\, dr\right)^{1/2} \\
&\le \frac{4}{\mu^2}\|f\|_{L^\infty}\left(\int_{1/l}^l\frac1r\, dr\right)^{1/2}\left[2G^r_{csh}(f, a)\right]^{1/2} < \infty.
\end{aligned}
\]
Thus $\left(\frac{a'}{rf^2}\right)' \in L^1[1/l, l]$ for any $l > 0$, which implies that $\frac{a'}{rf^2} \in C(0, \infty)$. Hence $a \in C^1(0, \infty)$. Using standard elliptic theory on Eq. (7.9) we obtain $f \in C^2(0, \infty)$. A standard iterative bootstrap argument shows that $f, a \in C^\infty(0, \infty)$. Hence $f, a \in C^2(\mathbb{R}^2\setminus\{0\})$, which implies that $u^{(n)}, A^{(n)} \in C^2(\mathbb{R}^2\setminus\{0\})$.

2. Maximum Principle.

Theorem 8.2. For any $M > 0$, if $(f, a)$ minimizes $G^r_{csh}$ over $B_M$, then
(i) $0 < a(r) \le 1$, for $r \in (0, \infty)$;
(ii) $a'(r) \ge 0$;
(iii) $f'(r) > 0$;
(iv) $0 < f(r) < 1$, for $r \in (0, \infty)$.
Proof. Since $(f, a)$ minimizes $G^r_{csh}$ over $B_M$, we know that $(f, a)$ also solves Eqs. (7.9) and (7.10) on $\mathbb{R}^2\setminus\{0\}$.

(i) First we use a truncation argument to show that $0 \le a(r) \le 1$. Suppose not. Then the set $D_a = \{r \in (0, \infty) : a(r) < 0 \text{ or } a(r) > 1\}$ is not empty. We define a truncated function $\bar a(r)$ by
\[ \bar a(r) = \begin{cases} 0 & \text{if } a(r) < 0, \\ 1 & \text{if } a(r) > 1, \\ a(r) & \text{otherwise.} \end{cases} \tag{8.1} \]
Since $a(r) \in C^2(0, \infty)$ and $D_a$ is not empty, we know
\[ \int_{\mathbb{R}^2}\left[\frac{n^2}{r^2}(1-\bar a)^2f^2 + \frac{\mu^2n^2}{4r^2}\left(\frac{\bar a'}{f}\right)^2\right] dx < \int_{\mathbb{R}^2}\left[\frac{n^2}{r^2}(1-a)^2f^2 + \frac{\mu^2n^2}{4r^2}\left(\frac{a'}{f}\right)^2\right] dx, \]
\[ \int_{\mathbb{R}^2}\left[\frac{n^2}{r^2}(1-\bar a)^2f^2 + \frac{n^2}{r^2}(\bar a')^2\right] dx < \int_{\mathbb{R}^2}\left[\frac{n^2}{r^2}(1-a)^2f^2 + \frac{n^2}{r^2}(a')^2\right] dx. \]
So $(f, \bar a) \in B_M$ and $G^r_{csh}(f, \bar a) < G^r_{csh}(f, a) = \inf_{B_M}G^r_{csh}$, a contradiction. Therefore $0 \le a(r) \le 1$. We then make use of the second Euler-Lagrange Eq. (7.10), which can also be written as
\[ \frac{1}{rf^2}a'' + \left(\frac{1}{rf^2}\right)'a' = -\frac{4}{\mu^2}\frac{f^2(1-a)}{r} \le 0. \tag{8.2} \]
Hence the maximum principle implies that either $a(r) > 0$ on $(0, \infty)$ or $a(r) \equiv 0$.

If $a(r) \equiv 0$ then from the energy we know
\[ \int_{\mathbb{R}^2}\frac{n^2}{r^2}f^2\, dx < \infty. \]
From the fact that $(f, a) \in B_M$ we also have
\[ \int_{\mathbb{R}^2}(1-f^2)^2\, dx \le \frac{2}{\lambda^2}G^r_{gl}(f, a) < \infty. \]
Therefore
\[ \int_{r\ge1}\frac{n^2(1-f^2)}{r^2}\, dx \le \left(\int_{r\ge1}\frac{n^4}{r^4}\, dx\right)^{1/2}\left(\int_{\mathbb{R}^2}(1-f^2)^2\, dx\right)^{1/2} < \infty, \]
which implies that
\[ \int_{r\ge1}\frac{n^2}{r^2}\, dx < \infty, \]
a contradiction. Hence $a(r) > 0$ on $(0, \infty)$.

(ii) Suppose $a(r)$ is not nondecreasing. Then there are $r_1 < r_2 \in [0, \infty]$ such that $a(r_1) > a(r_2)$. Now let
\[ \bar a(r) = \begin{cases} a(r) & \text{for } r \in [0, r_1], \\ \max\{a(r), a(r_1)\} & \text{for } r \in [r_1, \infty]. \end{cases} \tag{8.3} \]
Then the distributional derivative of $\bar a$ is equal to the classical derivative a.e. and $\bar a \in C_a$. Since $f = 0$ forces $a'$ to be zero and hence $a$ equals a constant, we know that $f \not\equiv 0$ in $[r_1, r_2]$. Therefore
\[ \int_{\mathbb{R}^2}\frac{f^2(1-\bar a)^2}{r^2}\, dx < \int_{\mathbb{R}^2}\frac{f^2(1-a)^2}{r^2}\, dx. \]
By the continuity of $f, a$ and the fact that $|\bar a'| \le |a'|$ we know that $(f, \bar a) \in B_M$ and $G^r_{csh}(f, \bar a) < G^r_{csh}(f, a)$, which contradicts the minimality of $G^r_{csh}(f, a)$. Hence $a'(r) \ge 0$. Moreover, when $a(r)$ is between 0 and 1, from (iv) $0 < f(r) < 1$ for $r > 0$, we know that the right-hand side of (8.2) is negative. Therefore by a maximum principle we get that $a'(r) > 0$ when $0 < a(r) < 1$.

(iii) Since $(f, a)$ minimizes $G^r_{csh}$ over $B_M$, we have that
\[ (G^r_{csh})'(f, a) = 0, \qquad (G^r_{csh})''(f, a) \ge 0. \]
An explicit computation gives
\[ (G^r_{csh})'(f, a) = \begin{pmatrix} -\Delta_rf + \dfrac{n^2}{r^2}(1-a)^2f - \dfrac{\mu^2n^2}{4r^2}\dfrac{(a')^2}{f^3} + \lambda^2f(1-f^2)(1-3f^2) \\[2mm] -\dfrac{\mu^2n^2}{4r}\left(\dfrac{a'}{rf^2}\right)' - \dfrac{n^2}{r^2}(1-a)f^2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{8.4} \]
and
\[ (G^r_{csh})''(f, a) = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix} \ge 0, \tag{8.5} \]
where n2 3µ2 n 2 (a )2 2 (1 − a) + + λ2 (15 f 4 − 12 f 2 + 1), r2 4r 2 f4 2n 2 µ2 n 2 a = G 21 = − 2 (1 − a) f + , r 2r rf3 n2 2 µ2 n 2 1 2 1 2 =− 2 ∂ − ( + )∂ r + 2 f . r 2 2 3 4r f rf f r
G 11 = − r + G 12 G 22
Differentiating Eq. (8.4) with respect to r and using (8.5) we obtain & ' f d 0 + R, (G rcsh ) ( f, a) = (G rcsh ) ( f, a) + Q = a 0 dr where
⎛ Q=⎝ " R=
− µ rn
0
a 3 rf 2 n2 3µ2 n 2 −µ 4 4r f 2 r3 f 3
2 2
1 r2
µ2 n 2 ∂ 2r 3 f r 2
−
(8.6)
⎞ ⎠, #
3µ2 n 2 (a )2 2r 2 f4 2n 2 2 (1 − a) f r3
− 2n (1 − a)2 f − r3
.
The first-row equation in (8.6) is ( ) 1 µ2 n 2 a 2n 2 3µ2 n 2 (a )2 2 G 11 + 2 f + G 12 − = (1 − a) f + . (8.7) a r r rf3 r3 2r 2 f4 On the other hand we have ( ( ) ) µ2 n 2 a 2n 2 µ2 n 2 a G 12 − a = − 2 (1 − a) f − a. r rf3 r 2r rf3 From the second-row equation in (8.4) we know that 2n 2 µ2 n 2 a − 2 (1 − a) f = . r 2r f rf2 Hence the previous expression becomes ( ( ) ) µ2 n 2 a µ2 n 2 a µ2 n 2 a µ2 n 2 a = − = f . a a G 12 − r rf3 2r f rf2 2r rf3 2r 2 f 4 Therefore (8.7) becomes 1 µ2 n 2 a 2n 2 3µ2 n 2 (a )2 2 G 11 + 2 + f = (1 − a) f + . r 2r 2 f 4 r3 2r 2 f4
(8.8)
Since (G rcsh ) ( f, a) ≥ 0, we know G 11 ≥ 0. From part (i), (ii) and the fact that f ≥ 0,
f ≡ 0, the right-hand side of (8.8) is nonnegative and µ2rn2 af 4 ≥ 0. Therefore using the maximum principle (see [16], Theorem B.4) we know that f > 0. (iv) Combining the regularity result and (iii) we conclude that 0 < f < 1. 2 2
Symmetric Chern-Simons-Higgs Vortices
Acknowledgements. The authors would like to thank Yisong Yang for valuable discussions on the project. D. Spirn was supported in part by NSF grants DMS–0510121 and DMS–0707714.
References
1. Berger, M.S., Chen, Y.Y.: Symmetric vortices for the Ginzburg-Landau equations of superconductivity and the nonlinear desingularization phenomenon. J. Funct. Anal. 82, 259–295 (1989)
2. Caffarelli, L., Yang, Y.: Vortex condensation in the Chern-Simons-Higgs model: An existence theorem. Commun. Math. Phys. 168, 321–336 (1995)
3. Chae, D., Chae, M.: The global existence in the Cauchy problem of the Maxwell-Chern-Simons-Higgs system. J. Math. Phys. 43, 5470–5482 (2002)
4. Chae, D., Choe, K.: Global existence in the Cauchy problem of the relativistic Chern-Simons-Higgs theory. Nonlinearity 15, 747–758 (2002)
5. Guo, Y.: Instability of symmetric vortices with large charge and coupling constant. Comm. Pure Appl. Math. 49, 1051–1080 (1996)
6. Gustafson, S., Sigal, I.M.: The stability of magnetic vortices. Commun. Math. Phys. 212, 257–275 (2000)
7. Han, J.: Radial symmetry of topological one-vortex solutions in the Maxwell-Chern-Simons-Higgs model. Comm. Korean Math. Soc. 19(2), 283–291 (2004)
8. Hong, J., Kim, Y., Pac, P.-Y.: Multivortex solutions of the Abelian Chern-Simons-Higgs vortices. Phys. Rev. Lett. 64, 2230–2233 (1990)
9. Jackiw, R., Weinberg, E.J.: Self-dual Chern-Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990)
10. Kurzke, M., Spirn, D.: Gamma-limit of the nonself-dual Chern-Simons-Higgs energy. J. Funct. Anal. 255(3), 535–588 (2008)
11. Modica, L., Mortola, S.: Il limite nella Γ-convergenza di una famiglia di funzionali ellittici. Boll. Un. Mat. Ital. 14-A, 526–529 (1977)
12. Ovchinnikov, Y.N., Sigal, I.M.: Ginzburg-Landau equation I. Static vortices. Partial Differential Equations and Their Applications 12, 199–220 (1997)
13. Plohr, B.J.: Unpublished thesis, Princeton University, 1980
14. Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV. New York: Academic Press, 1978
15. Sandier, E.: Lower bounds for the energy of unit vector fields and applications. J. Funct. Anal. 152(2), 379–403 (1998)
16. Struwe, M.: Variational methods. Berlin-Heidelberg-New York: Springer-Verlag, 1990
17. Struwe, M.: On the asymptotic behavior of minimizers of the Ginzburg-Landau model in 2 dimensions. Differential Integral Equations 7(5-6), 1613–1624 (1994)
18. Tarantello, G.: Multiple condensate solutions for the Chern-Simons-Higgs theory. J. Math. Phys. 37, 3769–3796 (1996)
19. Strauss, W.A.: Existence of solitary waves in higher dimensions. Commun. Math. Phys. 55(2), 149–162 (1977)
20. Yang, Y.: Solitons in field theory and nonlinear analysis. Springer Monographs in Mathematics. New York: Springer-Verlag, 2001

Communicated by I. M. Sigal
Commun. Math. Phys. 285, 1033–1063 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0615-1
Communications in
Mathematical Physics
Poisson Sigma Model on the Sphere
Francesco Bonechi (1), Maxim Zabzine (2)
(1) I.N.F.N. and Dipartimento di Fisica, Via G. Sansone 1, 50019 Sesto Fiorentino, Firenze, Italy
(2) Department of Theoretical Physics, Uppsala University, Box 803, SE-751 08 Uppsala, Sweden.
E-mail: [email protected]
Received: 3 December 2007 / Accepted: 6 May 2008
Published online: 15 October 2008 – © Springer-Verlag 2008
Abstract: We evaluate the path integral of the Poisson sigma model on the sphere and study the correlators of quantum observables. We argue that for the path integral to be well-defined the corresponding Poisson structure should be unimodular. The construction of the finite dimensional BV theory is presented and we argue that it is responsible for the leading semiclassical contribution. For a (twisted) generalized Kähler manifold we discuss the gauge fixed action for the Poisson sigma model. Using the localization we prove that for the holomorphic Poisson structure the semiclassical result for the correlators is indeed the full quantum result.
1. Introduction

The Poisson sigma model (PSM), introduced in [24,43], is a topological two-dimensional field theory with target a Poisson manifold $M$, whose Poisson tensor we will denote by $\alpha$ throughout. Recently PSM has attracted a lot of attention due to its role in the deformation quantization [6]. In particular the star product is given by a semiclassical expansion of the path integral of the PSM over the disk. In the present paper we study the PSM defined over the sphere.

Let us start with a brief reminder of PSM. Take $\Sigma$ to be a two-dimensional oriented compact manifold without boundary. The starting point is the classical action functional $S$ defined on the space of vector bundle morphisms $\hat X : T\Sigma \to T^*M$ from the tangent bundle $T\Sigma$ to the cotangent bundle $T^*M$ of the Poisson manifold $M$. Such a map $\hat X$ is given by its base map $X : \Sigma \to M$ and the linear map $\eta$ between fibers, which may also be regarded as a section in $\Gamma(\Sigma, \mathrm{Hom}(T\Sigma, X^*(T^*M)))$. The pairing $\langle\,,\rangle$ between the cotangent and tangent space at each point of $M$ induces a pairing between the differential forms on $\Sigma$ with values in the pull-backs $X^*(T^*M)$ and $X^*(TM)$ respectively. It is defined as pairing of the values and the exterior product of differential forms. Then the
action functional $S$ of the theory is
\[
S(X,\eta) = \int_\Sigma \langle \eta, dX \rangle + \frac{1}{2} \langle \eta, (\alpha \circ X)\eta \rangle . \tag{1.1}
\]
Here $\eta$ and $dX$ are viewed as one-forms on $\Sigma$ with the values in the pull-back of the cotangent and tangent bundles of $M$ correspondingly. Thus, in local coordinates, we can rewrite the action (1.1) as follows:
\[
S(X,\eta) = \int_\Sigma \eta_\mu \wedge dX^\mu + \frac{1}{2}\, \alpha^{\mu\nu}(X)\, \eta_\mu \wedge \eta_\nu . \tag{1.2}
\]
The variation of the action gives rise to the following equations of motion:
\[
d\eta_\rho + \frac{1}{2} (\partial_\rho \alpha^{\mu\nu})\, \eta_\mu \wedge \eta_\nu = 0, \qquad dX^\mu + \alpha^{\mu\nu} \eta_\nu = 0 . \tag{1.3}
\]
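As a quick illustration of (1.2) and (1.3), one may specialize to a linear Poisson structure, i.e. the dual of a Lie algebra. This special case is not worked out in the text; the structure constants $f^{\mu\nu}{}_{\rho}$ are notation introduced here only for the example, and the sketch assumes $\Sigma$ is closed so that the total derivative integrates to zero.

```latex
% Assumption (not from the text): M = g^*, with linear Poisson tensor
%   \alpha^{\mu\nu}(X) = f^{\mu\nu}{}_{\rho} X^{\rho},
% where f^{\mu\nu}{}_{\rho} are the structure constants of the Lie algebra g.
\begin{align*}
S(X,\eta)
  &= \int_\Sigma \eta_\mu \wedge dX^\mu
     + \tfrac12 f^{\mu\nu}{}_{\rho} X^\rho\, \eta_\mu \wedge \eta_\nu \\
  &= \int_\Sigma X^\rho \Big( d\eta_\rho
     + \tfrac12 f^{\mu\nu}{}_{\rho}\, \eta_\mu \wedge \eta_\nu \Big)
     - \int_\Sigma d\big( X^\mu \eta_\mu \big) .
\end{align*}
% The equations of motion (1.3) become
\begin{align*}
d\eta_\rho + \tfrac12 f^{\mu\nu}{}_{\rho}\, \eta_\mu \wedge \eta_\nu &= 0
  && \text{(the connection $\eta$ is flat)}, \\
dX^\mu + f^{\mu\nu}{}_{\rho} X^\rho\, \eta_\nu &= 0
  && \text{($X$ is covariantly constant)} .
\end{align*}
```

In this case $X$ plays the role of a Lagrange multiplier imposing flatness of $\eta$, which is the familiar form of two-dimensional BF theory.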
In covariant language these equations are equivalent to the statement that the bundle morphism $\hat X$ is a Lie algebroid morphism from $T\Sigma$ (with standard Lie algebroid structure) to $T^*M$ (with Lie algebroid structure canonically induced by the Poisson structure). The action (1.2) is invariant under the infinitesimal gauge transformations
\[
\delta_\beta X^\mu = \alpha^{\mu\nu} \beta_\nu, \qquad \delta_\beta \eta_\mu = -d\beta_\mu - (\partial_\mu \alpha^{\nu\rho})\, \eta_\nu \beta_\rho, \tag{1.4}
\]
which form a closed algebra only on-shell (i.e., modulo the equations of motion (1.3)). In order to quantize the PSM we have to resort to the Batalin–Vilkovisky (BV) formalism [3], which we will review later.

In what follows we will concentrate mainly on the case when the world-sheet $\Sigma$ is the two-sphere $S^2$. Our goal is to calculate the leading term of the PSM correlators on $S^2$. We will argue that the notion of unimodularity appears naturally in the construction of the correlators. Indeed our construction is very similar to the one presented in [41] and is a generalization of the correlators for A- and B-models (see [23] for a review). This is not surprising, since the notion of generalized Calabi-Yau manifold given in [20] is a complex version of the notion of unimodularity of a Lie algebroid. In particular the unimodularity of the Poisson manifold is a real analog of the generalized Calabi-Yau condition. Previously, in a different context, the path integral for PSM and related models was also discussed in [4,19,32].

In the second part of the paper we consider a particular gauge fixing which involves a choice of an (almost) complex structure. The whole setup is realized on (twisted) generalized Kähler manifolds. For these gauge fixed models there exists a residual BRST symmetry which allows us to use localization. Thus we are able to produce examples where the leading term is the full answer for the quantum theory.

The paper is organized as follows. In Sect. 2 we review basic concepts of the BV formalism. Section 3 is devoted to an overview of the BV treatment of PSM. In particular we discuss the classical observables. In Sect. 4 we consider the truncation of the full BV theory to a finite dimensional BV theory which is responsible for the leading semiclassical contribution in the correlators. We discuss this finite dimensional BV theory in detail. In this context the unimodularity of the Poisson manifold arises naturally from the quantum master equation. In Sect. 5 the specific gauge fixing is discussed. Indeed the geometrical set-up we are using is the same as for the N = 2 supersymmetric PSM [5]. We work out the details of the gauge fixing, discuss the residual BRST transformations of the gauge fixed action and present the calculations of the correlators for the gauge fixed model. Finally Sect. 6 summarizes the results and discusses open issues.
In addition we have Appendices A and B where the relevant mathematical material is collected. The material presented there is not entirely original and furthermore we could not find appropriate references with all material. Many of the results presented in the Appendices are scattered throughout the literature. Moreover we would like to link two different languages used by different communities. In particular the notion of generalized Calabi-Yau manifold introduced by Hitchin [20] is related to the notion of unimodularity for a complex Lie algebroid. Throughout the paper we use the language of graded manifolds which are supermanifolds with a $\mathbb{Z}$-refinement of $\mathbb{Z}_2$-grading, e.g. see [42] for the review.

2. Review of BV Formalism

In this section we briefly review the relevant concepts within the general BV framework. For further details the reader may consult the following reviews [8,13,18].

Definition 1. A graded algebra $A$ with an odd bracket $\{\,,\}$ is called an odd Poisson algebra (Gerstenhaber algebra) if the bracket satisfies
\begin{align*}
\{f, g\} &= -(-1)^{(|f|+1)(|g|+1)} \{g, f\}, \\
\{f, \{g, h\}\} &= \{\{f, g\}, h\} + (-1)^{(|f|+1)(|g|+1)} \{g, \{f, h\}\}, \\
\{f, gh\} &= \{f, g\} h + (-1)^{(|f|+1)|g|} g \{f, h\} .
\end{align*}
Quite often such an odd Poisson bracket is called either a Gerstenhaber bracket or an antibracket.

Definition 2. A Gerstenhaber algebra $(A, \{\,,\})$ together with an odd $\mathbb{R}$-linear map $\Delta : A \to A$, which squares to zero, $\Delta^2 = 0$, and generates the bracket $\{\,,\}$ as
\[
\{f, g\} = (-1)^{|f|} \Delta(fg) + (-1)^{|f|+1} (\Delta f) g - f (\Delta g),
\]
is called a BV-algebra. $\Delta$ is called an odd Laplace operator (odd Laplacian).

The canonical example of BV algebra is given by the space of functions on $W \oplus \Pi W^*$, where $W$ is a superspace, $W^*$ is its dual and $\Pi$ stands for the reversed parity functor. $W \oplus \Pi W^*$ is equipped with an odd non-degenerate pairing. Let $y^a$ be the coordinates on $W$ (the fields) and $y_a^+$ be the corresponding coordinates on $\Pi W^*$ (the antifields). We denote the parity of $y^a$ as $(-1)^{|y^a|}$ and that of $y_a^+$ as $(-1)^{|y_a^+|} = (-1)^{|y^a|+1}$. Then the odd Laplacian is defined as follows:
\[
\Delta = (-1)^{|y^a|} \frac{\partial}{\partial y_a^+} \frac{\partial}{\partial y^a} . \tag{2.5}
\]
It generates the canonical antibracket on $C^\infty(W \oplus \Pi W^*)$,
\[
\{f, g\} = (-1)^{|y^a|} \frac{\overleftarrow{\partial} f}{\partial y_a^+} \frac{\overrightarrow{\partial} g}{\partial y^a} + (-1)^{|y^a|} \frac{\overleftarrow{\partial} f}{\partial y^a} \frac{\overrightarrow{\partial} g}{\partial y_a^+}, \tag{2.6}
\]
where we use the notation $\overrightarrow{\partial}_v f = \partial_v f$ and $\overleftarrow{\partial}_v f = (-1)^{|v||f|} \partial_v f$. Indeed the bracket (2.6) is non-degenerate and defines the canonical odd symplectic structure on $W \oplus \Pi W^*$. A Lagrangian submanifold $L \subset W \oplus \Pi W^*$ is an isotropic supermanifold of maximal dimension. The volume form $dy^1 \ldots dy^n\, dy_1^+ \ldots dy_n^+$ induces a well defined volume form on $L$. Thus the integral
\[
\int_L f, \qquad f \in C^\infty(W \oplus \Pi W^*), \tag{2.7}
\]
is defined for any $L$. The following is the main theorem of BV-formalism.

Theorem 3. If $\Delta f = 0$, then $\int_L f$ depends only on the homology class of $L$. Moreover $\int_L \Delta f = 0$ for any Lagrangian $L$.

The canonical example $W \oplus \Pi W^*$ can be generalized to the cotangent bundle of any graded manifold $\mathcal{M}$ [44]. As a cotangent bundle, $T^*[-1]\mathcal{M}$ is naturally equipped with an odd Poisson bracket that makes $C^\infty(T^*[-1]\mathcal{M})$ a Gerstenhaber algebra according to Definition 1. The idea is that locally one can map $T^*[-1]\mathcal{M}$ to $W \oplus \Pi W^*$, define the bracket on coordinates with (2.6) and then glue the patches in a consistent manner. Now in order to define the odd Laplacian we need an integration over $T^*[-1]\mathcal{M}$. Namely, the choice of a volume form $v$ on $\mathcal{M}$ produces the corresponding volume form $\mu_v$ on $T^*[-1]\mathcal{M}$. The divergence operator is defined as a map from the vector fields on $T^*[-1]\mathcal{M}$ to $C^\infty(T^*[-1]\mathcal{M})$ through the following integral relation
\[
\int_{T^*[-1]\mathcal{M}} X(f)\, \mu_v = - \int_{T^*[-1]\mathcal{M}} \mathrm{div}_{\mu_v} X \; f\, \mu_v, \qquad \forall f \in C^\infty(T^*[-1]\mathcal{M}), \tag{2.8}
\]
with $X$ being a vector field. As one can easily check, for any function $f$ and vector field $X$ the divergence satisfies
\[
\mathrm{div}_{\mu_v}(f X) = f\, \mathrm{div}_{\mu_v}(X) + (-1)^{|f||X|} X(f) . \tag{2.9}
\]
Now the odd Laplacian of $f \in C^\infty(T^*[-1]\mathcal{M})$ is defined through the divergence of the corresponding Hamiltonian vector field as
\[
\Delta_v f = \frac{(-1)^{|f|}}{2}\, \mathrm{div}_{\mu_v} X_f, \qquad \{f, g\} = X_f(g) . \tag{2.10}
\]
Indeed one can check that thanks to (2.9) $\Delta_v$ generates the bracket and $\Delta_v^2 = 0$. Thus $C^\infty(T^*[-1]\mathcal{M})$ is a BV-algebra according to Definition 2, see [29] for the explicit calculations. If the volume form is written in terms of an even density $\rho_v$ as $\mu_v = \rho_v\, dy^1 \cdots dy^n\, dy_1^+ \cdots dy_n^+$, then the Laplacian can be written as
\[
\Delta_v = (-1)^{|y^a|} \frac{\partial}{\partial y_a^+} \frac{\partial}{\partial y^a} + \frac{1}{2} \{\log \rho_v, -\} . \tag{2.11}
\]
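As a minimal consistency check of Definition 2 against the canonical Laplacian (2.5), one can work out the simplest case of a single even field $y$ with its odd antifield $y^+$. This example is an illustration added here, and it assumes left derivatives for the odd variable.

```latex
% One even coordinate y (|y| = 0) and its odd antifield y^+ (|y^+| = 1).
% By (2.5) the Laplacian is \Delta = \partial_{y^+} \partial_{y} (left derivatives assumed).
\begin{align*}
\Delta y &= 0, \qquad \Delta y^+ = 0, \qquad
\Delta(y\, y^+) = \frac{\partial}{\partial y^+}\, y^+ = 1, \\
\{y, y^+\} &= (-1)^{|y|} \Delta(y\, y^+) + (-1)^{|y|+1} (\Delta y)\, y^+ - y\, (\Delta y^+) = 1, \\
\{y^+, y\} &= (-1)^{|y^+|} \Delta(y^+ y) + (-1)^{|y^+|+1} (\Delta y^+)\, y - y^+ (\Delta y) = -1 .
\end{align*}
% Consistency with the first identity of Definition 1:
%   \{y^+, y\} = -(-1)^{(1+1)(0+1)} \{y, y^+\} = -\{y, y^+\} .
```

The two values agree with the graded antisymmetry required in Definition 1, so in this case the bracket generated by $\Delta$ is indeed an antibracket pairing the field with its antifield.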
There exists a canonical way (up to a sign) of restricting a volume form $\mu_v$ on $T^*[-1]\mathcal{M}$ to a volume form on a Lagrangian submanifold $L$. We denote such restriction as $\sqrt{\mu_v}$ and consider the integrals of the form
\[
\int_L \sqrt{\mu_v}\, f, \qquad f \in C^\infty(T^*[-1]\mathcal{M}) . \tag{2.12}
\]
Thus Theorem 3 will remain true for the general case. In particular we are interested in the situation when the integrands in (2.12) are of the form
\[
\int_L \sqrt{\mu_v}\, \Psi\, e^{S} \equiv \langle \Psi \rangle, \tag{2.13}
\]
where we assume naturally that $\Delta_v(\Psi e^{S}) = 0$. If $\Psi = 1$ then we get the following relation:
\[
\Delta_v e^{S} = 0 \iff \Delta_v S + \frac{1}{2} \{S, S\} = 0, \tag{2.14}
\]
which is known as the quantum master equation. In the general case we have
\[
\Delta_v(\Psi e^{S}) = 0 \iff \Delta_{(v,S)} \Psi = \Delta_v \Psi + \{S, \Psi\} = 0, \tag{2.15}
\]
where we refer to $\Delta_{(v,S)}$ as the quantum Laplacian. In the derivation of (2.15) we have used the quantum master equation (2.14). A function $S$ that satisfies the quantum master equation is called a quantum BV action and $\Psi$ satisfying (2.15) is a quantum observable. Indeed the quantum observables are elements of the cohomology $H(\Delta_{(v,S)})$; by the above construction it is clear that $S$ defines the isomorphism
\[
H^\bullet(\Delta_v) \approx H^\bullet(\Delta_{(v,S)}) . \tag{2.16}
\]
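The equivalence (2.14) follows from Definitions 1 and 2 alone. The short derivation below is a sketch added for the reader, assuming $S$ is even and writing $\Delta$ for $\Delta_v$; it is not taken from the text.

```latex
% Assumption: S is even. Only Definitions 1 and 2 are used.
\begin{align*}
\Delta(fg) &= (\Delta f)\, g + (-1)^{|f|} f\, \Delta g + (-1)^{|f|} \{f, g\}
  && \text{(Definition 2, rearranged)} \\
\{S, S^k\} &= k\, S^{k-1} \{S, S\}
  && \text{(Leibniz rule of Definition 1)} \\
\Delta(S^k) &= k\, S^{k-1} \Delta S + \tfrac{k(k-1)}{2}\, S^{k-2} \{S, S\}
  && \text{(induction on $k$)} \\
\Delta e^{S} &= \sum_{k \ge 0} \frac{1}{k!}\, \Delta(S^k)
  = \Big( \Delta S + \tfrac12 \{S, S\} \Big) e^{S} .
\end{align*}
```

Hence $\Delta e^{S} = 0$ exactly when $\Delta S + \tfrac12\{S,S\} = 0$, which is the statement (2.14).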
If we change $S$ to $S/\hbar$, we see that in the classical limit ($\hbar \to 0$) $S$ must satisfy the classical master equation $\{S, S\} = 0$ and the classical observables $\Psi$ are such that $\delta_{BV} \Psi \equiv \{S, \Psi\} = 0$. Due to the classical master equation the vector field $\delta_{BV}$ squares to zero and defines the cohomology $H(\delta_{BV})$ of classical observables.

If $\mathcal{M}$ is a finite dimensional manifold then everything is well-defined. However in field theory one deals with $\mathcal{M}$ being infinite dimensional. In fact, $\mathcal{M}$ is usually the space of the physical fields, ghosts and Lagrange multipliers, which is infinite dimensional. We extend this set of fields by adding antifields such that together they form $T^*[-1]\mathcal{M}$, where an odd Poisson bracket is well-defined on a large enough class of functions, as described above. However there is no well-defined measure on $\mathcal{M}$ and thus there is no well-defined odd Laplace operator. In the physics literature, the naive Laplacian of the form (2.5) is used. Moreover the field theory suffers from the problems with renormalization which can be resolved within the perturbative setup.

3. BV Formalism for PSM

The quantization of PSM requires the machinery of the BV formalism. In this section we set the notation and give background information on the BV treatment of PSM. We mainly review the relevant results from [6] and [7]. Furthermore we discuss the classical observables.
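3.1. BV action. The PSM action (1.2) has gauge symmetries which do not close off-shell. Therefore one should resort to the BV formalism. We may organize the fields, ghosts and antifields into superfields $(\mathbf{X}, \boldsymbol{\eta})$ which correspond to the components of the supermap $T[1]\Sigma \to T^*[1]M$. Introducing the local coordinates on $\Sigma$ and $M$ the superfields read as
\begin{align*}
\mathbf{X}^\mu &= X^\mu + \theta^\alpha \eta_\alpha^{+\mu} - \frac{1}{2} \theta^\alpha \theta^\beta \beta_{\alpha\beta}^{+\mu}, \\
\boldsymbol{\eta}_\mu &= \beta_\mu + \theta^\alpha \eta_{\alpha\mu} + \frac{1}{2} \theta^\alpha \theta^\beta X^+_{\alpha\beta\mu},
\end{align*}
with $\theta$ being the odd coordinate on $T\Sigma$; $\alpha, \beta$ are labels for local coordinates on $\Sigma$ and $\mu$ are labels for local coordinates on $M$. In the expansion $\beta$ is a ghost with the ghost number 1, while $\eta^+$, $\beta^+$ and $X^+$ are antifields of ghost number $-1$, $-2$ and $-1$ respectively. The full BV action reads
\[
S_{BV} = \int_{T[1]\Sigma} d^2\theta\, d^2u \left( \boldsymbol{\eta}_\mu D \mathbf{X}^\mu + \frac{1}{2} \alpha^{\mu\nu}(\mathbf{X})\, \boldsymbol{\eta}_\mu \boldsymbol{\eta}_\nu \right), \tag{3.17}
\]
where $D = \theta^\alpha \partial_\alpha$. An elegant way to derive this action is to use the AKSZ formalism [1] as done in [7]. On $T^*[-1]\mathcal{M}$ the odd symplectic structure is
\[
\omega = \int_\Sigma \delta X \wedge \delta X^+ + \delta\eta \wedge \delta\eta^+ + \delta\beta \wedge \delta\beta^+, \tag{3.18}
\]
where $\mathcal{M}$ is an infinite dimensional manifold corresponding to the fields $(X, \eta, \beta)$. The action (3.17) satisfies both classical and naive quantum master equations [6]. The corresponding BRST operator $\delta_{BV}$ acts on the superfields as follows:
\begin{align}
\delta_{BV} \mathbf{X}^\mu &= D\mathbf{X}^\mu + \alpha^{\mu\nu}(\mathbf{X})\, \boldsymbol{\eta}_\nu, \tag{3.19} \\
\delta_{BV} \boldsymbol{\eta}_\mu &= D\boldsymbol{\eta}_\mu + \frac{1}{2} \partial_\mu \alpha^{\nu\rho}(\mathbf{X})\, \boldsymbol{\eta}_\nu \boldsymbol{\eta}_\rho . \tag{3.20}
\end{align}
In components the BV action (3.17) has the form
\begin{align}
S_{BV} = \int_\Sigma \; & \eta_\mu \wedge dX^\mu + \frac{1}{2} \alpha^{\mu\nu}(X)\, \eta_\mu \wedge \eta_\nu + X_\mu^+ \alpha^{\mu\nu}(X) \beta_\nu - \eta^{+\mu} \wedge \big( d\beta_\mu + \partial_\mu \alpha^{\rho\nu}(X)\, \eta_\rho \beta_\nu \big) \nonumber \\
& - \frac{1}{2} \beta^{+\mu} \partial_\mu \alpha^{\rho\nu}(X)\, \beta_\rho \beta_\nu - \frac{1}{4} \eta^{+\mu} \wedge \eta^{+\nu}\, \partial_\mu \partial_\nu \alpha^{\rho\sigma}(X)\, \beta_\rho \beta_\sigma . \tag{3.21}
\end{align}
The component version of the BV transformations (3.19)–(3.20) is
\begin{align}
\delta_{BV} X^\mu &= \alpha^{\mu\nu}(X) \beta_\nu, \tag{3.22} \\
\delta_{BV} \eta^{+\mu} &= -dX^\mu - \alpha^{\mu\nu}(X) \eta_\nu - \partial_\nu \alpha^{\mu\rho}(X)\, \eta^{+\nu} \beta_\rho, \tag{3.23} \\
\delta_{BV} \beta^{+\mu} &= -d\eta^{+\mu} - \alpha^{\mu\nu}(X) X_\nu^+ + \frac{1}{2} \partial_\nu \partial_\rho \alpha^{\mu\sigma}(X)\, \eta^{+\nu} \wedge \eta^{+\rho} \beta_\sigma \nonumber \\
& \quad + \partial_\rho \alpha^{\mu\nu}(X)\, \eta^{+\rho} \wedge \eta_\nu + \partial_\rho \alpha^{\mu\nu}(X)\, \beta^{+\rho} \beta_\nu, \tag{3.24}
\end{align}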
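\begin{align}
\delta_{BV} \beta_\mu &= \frac{1}{2} \partial_\mu \alpha^{\nu\rho}(X)\, \beta_\nu \beta_\rho, \tag{3.25} \\
\delta_{BV} \eta_\mu &= -d\beta_\mu - \partial_\mu \alpha^{\nu\rho}(X)\, \eta_\nu \beta_\rho - \frac{1}{2} \partial_\mu \partial_\nu \alpha^{\rho\sigma}(X)\, \eta^{+\nu} \beta_\rho \beta_\sigma, \tag{3.26}
\end{align}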
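\begin{align}
\delta_{BV} X_\mu^+ = \; & d\eta_\mu + \partial_\mu \alpha^{\nu\rho}(X)\, X_\nu^+ \beta_\rho - \partial_\mu \partial_\nu \alpha^{\rho\sigma}(X)\, \eta^{+\nu} \wedge \eta_\rho \beta_\sigma + \frac{1}{2} \partial_\mu \alpha^{\nu\rho}(X)\, \eta_\nu \wedge \eta_\rho \nonumber \\
& - \frac{1}{4} \partial_\mu \partial_\nu \partial_\rho \alpha^{\sigma\tau}(X)\, \eta^{+\nu} \wedge \eta^{+\rho} \beta_\sigma \beta_\tau - \frac{1}{2} \partial_\mu \partial_\nu \alpha^{\rho\sigma}(X)\, \beta^{+\nu} \beta_\rho \beta_\sigma . \tag{3.27}
\end{align}

3.2. Classical observables. Next we consider the classical observables for PSM. By an observable we mean a BRST invariant operator which is not BRST exact. Let us take the antisymmetric multivector field $w \in \Gamma(\wedge^p TM)$ and construct the superfield $w^{\mu_1 \ldots \mu_p}(\mathbf{X})\, \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p}$. Using (3.19)–(3.20) we calculate the BRST transformation of this superfield
\[
\delta_{BV} \big( w^{\mu_1 \ldots \mu_p} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p} \big) = D \big( w^{\mu_1 \ldots \mu_p} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p} \big) - \frac{1}{2} ([\alpha, w]_s)^{\mu_0 \mu_1 \ldots \mu_p}\, \boldsymbol{\eta}_{\mu_0} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p} . \tag{3.28}
\]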
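The last term on the right-hand side vanishes if $d_{LP} w = [\alpha, w]_s = 0$. Moreover we do not want the superfield $w^{\mu_1 \ldots \mu_p} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p}$ to be BRST exact. Thus we have to take $w$ to be an element in the Lichnerowicz-Poisson cohomology $H^\bullet_{LP}(M)$. Now assuming $[w] \in H^\bullet_{LP}(M)$, we can interpret (3.28) in components. The superfield has the expansion
\[
w^{\mu_1 \ldots \mu_p} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p} = O_0^{p} + \theta^\alpha (O_1^{p-1})_\alpha + \frac{1}{2} \theta^\alpha \theta^\beta (O_2^{p-2})_{\alpha\beta},
\]
on which the BRST differential $\delta_{BV}$ acts as
\[
\delta_{BV} \big( w^{\mu_1 \ldots \mu_p} \boldsymbol{\eta}_{\mu_1} \ldots \boldsymbol{\eta}_{\mu_p} \big) = \delta_{BV} O_0^{p} - \theta^\alpha \delta_{BV} (O_1^{p-1})_\alpha + \frac{1}{2} \theta^\alpha \theta^\beta \delta_{BV} (O_2^{p-2})_{\alpha\beta} .
\]
The operator $D = \theta^\alpha \partial_\alpha$ acts on the component fields as the de Rham differential. Thus for $[w] \in H^\bullet_{LP}(M)$ the condition (3.28) implies the descent equations for the components
\[
\delta_{BV} O_0^{p} = 0, \qquad \delta_{BV} O_1^{p-1} = -d O_0^{p}, \qquad \delta_{BV} O_2^{p-2} = d O_1^{p-1} . \tag{3.29}
\]
More explicitly for a nontrivial element $[w] \in H^p_{LP}(M)$ we can formally define the cocycles
\begin{align}
O_0^{p}(w) &= w^{\mu_1 \ldots \mu_p} \beta_{\mu_1} \ldots \beta_{\mu_p}, \tag{3.30} \\
O_1^{p-1}(w) &= \partial_\rho w^{\mu_1 \ldots \mu_p}\, \eta^{+\rho} \beta_{\mu_1} \ldots \beta_{\mu_p} + p\, w^{\mu_1 \mu_2 \ldots \mu_p}\, \eta_{\mu_1} \beta_{\mu_2} \ldots \beta_{\mu_p}, \tag{3.31} \\
O_2^{p-2}(w) &= -\frac{1}{2} \partial_\rho \partial_\sigma w^{\mu_1 \ldots \mu_p}\, \eta^{+\rho} \wedge \eta^{+\sigma} \beta_{\mu_1} \ldots \beta_{\mu_p} - \partial_\rho w^{\mu_1 \ldots \mu_p}\, \beta^{+\rho} \beta_{\mu_1} \ldots \beta_{\mu_p} \nonumber \\
& \quad - p\, \partial_\rho w^{\mu_1 \ldots \mu_p}\, \eta^{+\rho} \wedge \eta_{\mu_1} \beta_{\mu_2} \ldots \beta_{\mu_p} + p\, w^{\mu_1 \ldots \mu_p}\, X^+_{\mu_1} \beta_{\mu_2} \ldots \beta_{\mu_p} \nonumber \\
& \quad + p(p-1)\, w^{\mu_1 \ldots \mu_p}\, \eta_{\mu_1} \wedge \eta_{\mu_2} \beta_{\mu_3} \ldots \beta_{\mu_p}, \tag{3.32}
\end{align}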
where in $O_i^{p-i}(w)$ the upper index stands for the ghost number and the lower index for the degree of the differential form on $\Sigma$. The $O_i^{p-i}(w)$ satisfy (3.29) and thus $O_0^{p}(w)$ are BRST-invariant local observables labeled by the elements of the Lichnerowicz-Poisson cohomology $H^\bullet_{LP}(M)$. From $O_i^{p-i}(w)$ with $i > 0$ we can construct BRST-invariant non-local observables as integrals
\[
W(w, c_i) = \int_{c_i} O_i^{p-i}(w), \tag{3.33}
\]
where $c_i$ is an $i$-cycle on $\Sigma$. These observables depend only on the homology class of $c_i$. The antibracket $\{\,,\}$ of two non-local observables
\[
\{ W(w, \Sigma), W(\lambda, \Sigma) \} = -W([w, \lambda]_s, \Sigma) \tag{3.34}
\]
gets mapped into the Schouten bracket between the multivector fields [6].
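The simplest instance of these observables is the degree-zero case $p = 0$, where $w$ is a function on $M$. The following check is an illustration added here; the symbol $C$ for the function is introduced only for this example, and it uses nothing beyond (3.22) and (3.30).

```latex
% p = 0: w = C(X), a function on M. By (3.30), O_0^0(C) = C(X), and by (3.22)
\begin{equation*}
\delta_{BV}\, C(X) = \partial_\mu C\; \delta_{BV} X^\mu
                   = \partial_\mu C\, \alpha^{\mu\nu}(X)\, \beta_\nu .
\end{equation*}
% This vanishes for all \beta iff \alpha^{\mu\nu} \partial_\mu C = 0,
% i.e. iff C Poisson-commutes with every function on M.
```

Functions of this type are the Casimir functions of the Poisson structure, which span $H^0_{LP}(M)$, in agreement with the general statement that local observables are labeled by Lichnerowicz-Poisson cohomology.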
3.3. General comments on the path integral. The main task is to calculate the correlation functions of observables which can be represented as the path integral expression
\[
\langle W(w_1, c_{i_1}) \ldots W(w_n, c_{i_n}) \rangle = \int_L D\mathbf{X}\, D\boldsymbol{\eta}\; W(w_1, c_{i_1}) \ldots W(w_n, c_{i_n})\, e^{\frac{i}{\hbar} S_{BV}} . \tag{3.35}
\]
For this integral to make sense at least perturbatively we have to integrate not over the whole functional space but over the “Lagrangian” submanifold $L$. The choice of $L$ is called the gauge fixing and it is typically generated by a gauge fixing fermion $\Psi$. The path integral (3.35) is invariant under the deformations of the Lagrangian submanifold $L$. However due to the absence of any well-defined measure on the space of fields we cannot treat this integral non-perturbatively. Despite this difficulty we can address and even sometimes solve it completely from the different direction, namely by reducing to an appropriate finite dimensional problem. We would expect that the correlator (3.35) has a well-defined expansion in non-negative powers of $\hbar$. In particular there will be a leading term in this expansion which we can evaluate by consistent reduction of the full theory to a finite dimensional BV theory for which all objects can be defined. This reduction will produce the leading terms in the correlators. Indeed for some models these terms correspond to a full quantum result. In Sect. 4 we will consider the finite dimensional BV theory responsible for leading terms in the correlators on $S^2$. In Sect. 5 we present the details for a concrete choice of $L$. The gauge fixed theory will have residual BRST symmetry which allows us to localize the infinite dimensional integrals to finite dimensional ones.

4. The Reduced BV Theory

In this section we consider a consistent truncation of the infinite dimensional BV theory to a finite dimensional one, that computes the contribution of constant configurations. We conjecture that this reduced BV theory controls the leading contribution into the path integral in the limit $\hbar \to 0$.
This procedure can be considered as a reduction of BV-manifolds and for a Riemann surface $\Sigma_g$ of genus $g$ the truncation can be organized in the following fashion. We define the submanifold $C$ of the whole space of fields by requiring that all fields are closed forms
\[
dX = 0, \quad d\beta = 0, \quad d\eta = 0, \quad d\eta^+ = 0, \quad dX^+ = 0, \quad d\beta^+ = 0 . \tag{4.36}
\]
These equations define a set of first class constraints (the conditions $dX^+ = d\beta^+ = 0$ are redundant since $X^+$ and $\beta^+$ are the top forms), i.e. $C$ is a coisotropic submanifold. The gauge transformations generated by the constraints (4.36) shift the field by an exact form. Therefore the reduced BV space is obtained by going to the cohomology of $\Sigma_g$. The reduced variables are then defined by the integration of the fields over all cycles of $\Sigma_g$. Thus zero-forms $X$ and $\beta$ are constants, and we use the same symbols to indicate the reduced coordinates. For one-forms we choose the basis $\{c_a\}$ in $H_1(\Sigma_g, \mathbb{R}) = H^1(\Sigma_g, \mathbb{R})$ and introduce the reduced coordinates
\[
\eta_a = \int_{c_a} \eta, \qquad \eta_a^+ = \int_{c_a} \eta^+ .
\]
While two-forms $X^+$ and $\beta^+$ are integrated over the whole $\Sigma_g$ and give
\[
X^+ = \int_{\Sigma_g} X^+, \qquad \beta^+ = \int_{\Sigma_g} \beta^+ .
\]
All the BV structure goes to the quotient and defines a finite dimensional BV manifold. The space $H^1(\Sigma_g, \mathbb{R})$ is symplectic with the structure $\omega^{ab}$. Therefore on the reduced finite dimensional manifold, the odd symplectic structure (3.18) reads
\[
\omega = dX^\mu\, dX_\mu^+ + \omega^{ab}\, d\eta_a\, d\eta_b^+ + d\beta_\mu\, d\beta^{+\mu} . \tag{4.37}
\]
Moreover, the BV action $S_{BV}$ defined in (3.21) when restricted to $C$ depends only on the reduced variables, i.e. it is a pull-back of a function on the reduced manifold. We use the same notation $S_{BV}$ for it. However we are interested in the zero genus case, and we leave for future investigations the case of genus $g > 0$. In this situation the corresponding finite dimensional BV manifold is $\mathcal{F} = T^*[-1]T^*[1]M$, where the odd symplectic structure is written in the coordinates $z = (X^\mu, \beta_\mu, X_\mu^+, \beta^{+\mu})$ as
\[
\omega = dX^\mu\, dX_\mu^+ + d\beta_\mu\, d\beta^{+\mu} . \tag{4.38}
\]
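The degree of the coordinates is the one induced from the corresponding fields. Under a coordinate change $\tilde X^i(X^\mu)$, the new coordinates $\tilde z = (\tilde X^i, \tilde\beta_i, \tilde X_i^+, \tilde\beta^{+i})$ are
\[
\tilde\beta_i = T_i^{\mu} \beta_\mu, \qquad \tilde\beta^{+i} = T^{i}_{\mu} \beta^{+\mu}, \qquad \tilde X_i^+ = X_\mu^+ T_i^{\mu} - \beta^{+\mu} \beta_\nu\, \frac{\partial T_j^{\nu}}{\partial \tilde X^i}\, (T^{-1})^{j}_{\mu}, \tag{4.39}
\]
where $T_i^{\mu} = \partial X^\mu / \partial \tilde X^i$. The BV action (3.21) becomes
\[
S_{BV} = X_\mu^+ \alpha^{\mu\nu}(X) \beta_\nu - \frac{1}{2} \beta^{+\mu} \partial_\mu \alpha^{\rho\nu}(X)\, \beta_\rho \beta_\nu, \tag{4.40}
\]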
which obviously satisfies the classical master equation. In the following discussion we will analyze this finite dimensional BV theory and claim that it gives the leading contribution to PSM correlators. Later, using a particular gauge fixing, we will confirm this statement. In addition to the BV reduction described above we can provide a different heuristic argument in support of our construction. The action (4.40) can be understood as a leading term in the effective BV theory with the “constant” maps as IR degrees of freedom. The reader may consult [31,40] for an explanation of effective actions within the BV framework.
4.1. Integration on finite dimensional BV manifold. We start by defining the integration over $\mathcal{F} = T^*[-1]T^*[1]M$. This will allow us to define an odd Laplacian which is necessary for a proper BV description, according to the lines outlined in Sect. 2. Integration on $\mathcal{F}$ can be defined by putting together Berezinian integration in the odd directions of $X_\mu^+$ and $\beta_\mu$ and fiberwise integration in the even directions of $\beta^{+\mu}$. Let us choose a volume form $\Omega = \Omega_{\mu_1 \cdots \mu_n}\, dX^{\mu_1} \cdots dX^{\mu_n} = \rho_\Omega\, dX^1 \cdots dX^n$ on $M$. We introduce the volume form $\mu_\Omega = \rho_\Omega^4\, Dz$, where $Dz = dX^1 \cdots d\beta_1 \cdots dX_1^+ \cdots d\beta^{+1} \cdots$ is the coordinate volume form. Since under the change of coordinates (4.39) the coordinate volume form transforms as
\[
D\tilde z = \mathrm{Ber}\Big( \frac{\partial \tilde z}{\partial z} \Big) Dz, \qquad \mathrm{Ber} \begin{pmatrix} I_{00} & I_{01} \\ I_{10} & I_{11} \end{pmatrix} = \frac{\det(I_{00} - I_{01} I_{11}^{-1} I_{10})}{\det I_{11}},
\]
it is simple to check that $\mu_\Omega$ is well defined. By applying (2.11), we get
\[
\Delta_\Omega = \frac{\partial}{\partial X_\mu^+} \frac{\partial}{\partial X^\mu} - \frac{\partial}{\partial \beta^{+\mu}} \frac{\partial}{\partial \beta_\mu} + 2 \{\log \rho_\Omega, -\} .
\]
The restriction to $\mathcal{F}$ of local and the non-local observables (3.32) associated to multivector fields defines the corresponding observables on the reduced manifold $\mathcal{F}$. Namely, to $w \in \Gamma(\wedge^p TM)$ we associate the local observable
\[
O_0^{p}(w) = w^{\mu_1 \cdots \mu_p} \beta_{\mu_1} \cdots \beta_{\mu_p}, \tag{4.41}
\]
and the non-local one
\[
O_2^{p-2}(w) = -\partial_\rho w^{\mu_1 \cdots \mu_p}\, \beta^{+\rho} \beta_{\mu_1} \cdots \beta_{\mu_p} + p\, w^{\mu_1 \cdots \mu_p}\, X^+_{\mu_1} \beta_{\mu_2} \cdots \beta_{\mu_p} . \tag{4.42}
\]
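To see how the odd Laplacian of this subsection acts in practice, one can apply it to the reduced action (4.40). The computation below is a sketch added here under explicit assumptions: coordinates on $M$ are chosen so that $\rho_\Omega = 1$, and left derivatives are used for the odd variables $X_\mu^+$ and $\beta_\mu$.

```latex
% Assumptions: \rho_\Omega = 1 in the chosen coordinates; left derivatives for odd variables;
% X^+_\mu and \beta_\mu are odd, \beta^{+\mu} is even.
\begin{align*}
\frac{\partial}{\partial X^+_\mu} \frac{\partial}{\partial X^\mu}
  \Big( X^+_\sigma\, \alpha^{\sigma\nu} \beta_\nu \Big)
  &= \partial_\mu \alpha^{\mu\nu}\, \beta_\nu, \\
- \frac{\partial}{\partial \beta^{+\mu}} \frac{\partial}{\partial \beta_\mu}
  \Big( -\tfrac12\, \beta^{+\sigma} \partial_\sigma \alpha^{\rho\nu}\, \beta_\rho \beta_\nu \Big)
  &= \frac{\partial}{\partial \beta^{+\mu}}
     \Big( \beta^{+\sigma} \partial_\sigma \alpha^{\mu\nu}\, \beta_\nu \Big)
   = \partial_\mu \alpha^{\mu\nu}\, \beta_\nu, \\
\Delta_\Omega S_{BV} &= 2\, \partial_\mu \alpha^{\mu\nu}\, \beta_\nu .
\end{align*}
% For a linear Poisson structure \alpha^{\mu\nu} = f^{\mu\nu}{}_{\rho} X^\rho
% (hypothetical structure constants f, used only in this example) the condition
% \Delta_\Omega S_{BV} = 0 reads f^{\mu\nu}{}_{\mu} = 0.
```

In such coordinates the condition $\Delta_\Omega S_{BV} = 0$ therefore says that $\alpha$ is divergence-free. In the linear example this is the vanishing of the trace of the adjoint representation, i.e. unimodularity of the Lie algebra, which gives a concrete instance of the unimodularity condition discussed in this subsection.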
It is straightforward to check that they are covariant under the transformation of coordinates (4.39). The antibracket defined by the odd symplectic structure (4.37) between local and non-local observables can be expressed in terms of the Schouten bracket; let $w \in \Gamma(\wedge^p TM)$, $\lambda \in \Gamma(\wedge^\ell TM)$, then we have that
\begin{align}
\{ O_2^{p-2}(w), O_0^{\ell}(\lambda) \} &= -O_0^{p+\ell-1}([w, \lambda]_s), \nonumber \\
\{ O_2^{p-2}(w), O_2^{\ell-2}(\lambda) \} &= -O_2^{p+\ell-3}([w, \lambda]_s), \tag{4.43}
\end{align}
in analogy with (3.34). The odd Laplacian acts on this observable as follows:
\[
\Delta_\Omega O_2^{p-2}(w) = -2 (D_\Omega(w))^{\mu_1 \cdots \mu_{p-1}} \beta_{\mu_1} \cdots \beta_{\mu_{p-1}} = -2\, O_0^{p-1}(D_\Omega(w)), \tag{4.44}
\]
where $D_\Omega$ is the divergence associated to the volume form $\Omega$ defined in Appendix A. The BV-differential also descends to the reduced manifold as $\delta_{BV}(F) = \{S_{BV}, F\}$, for any $F \in C^\infty(\mathcal{F})$. The action $S_{BV} = \frac{1}{2} O_2^0(\alpha)$ defined in (4.40) satisfies the quantum master equation (2.14) if the following holds:
\[
\Delta_\Omega S_{BV} + \frac{1}{2} \{S_{BV}, S_{BV}\} = 0 \iff D_\Omega \alpha = 0, \quad [\alpha, \alpha]_s = 0, \tag{4.45}
\]
where $[\,,]_s$ is the Schouten bracket on multivector fields, see Appendix A for the definitions. Thus the classical and quantum master equations have to be satisfied simultaneously. The geometrical meaning of the quantum master equation is clear: the volume form $\Omega$ must be invariant under the flow of the Hamiltonian vector fields of $\alpha$. The existence of such a volume form is equivalent to the unimodularity of the Poisson tensor, see the discussion in Appendix A. More generally, we may say that the action (4.40) is the zero order in $\hbar$ of the solution of the quantum master equation if and only if $\alpha$ is Poisson and unimodular.¹ If $\Omega$ is not an invariant form then the unimodularity of $\alpha$ implies
\[
D_\Omega \alpha = -d_{LP} f, \tag{4.46}
\]
for some function $f(X)$. This would correspond to the addition of $2 f(X)$ to $S_{BV}$. Equivalently this amounts to the redefinition of $\Omega$ by $e^{f}$. In what follows we set $\hbar = 1$. By applying formulas (4.43), we see that for any $w \in \Gamma(\wedge^\bullet TM)$ we have
\[
\Delta_{(\Omega,\alpha)} O_0^{p}(w) = 0 \iff d_{LP}(w) = 0, \tag{4.47}
\]
and thus the local observable associated to $w$ is a quantum observable iff $d_{LP} w = 0$. The non-local observable $O_2^{p-2}(w)$ will be quantum if the following holds:
\[
\Delta_\Omega \big( O_2^{p-2}(w)\, e^{S_{BV}} \big) = 0 \iff \Delta_{(\Omega,\alpha)} \big( O_2^{p-2}(w) \big) = 0 \iff D_\Omega w = 0, \quad d_{LP} w = 0 . \tag{4.48}
\]
Moreover, by applying (4.43) we see that local and nonlocal observables form a subcomplex of the quantum Laplacian $\Delta_{(\Omega,\alpha)} = \Delta_\Omega + \delta_{BV}$. See the next subsection for the discussion of these observables. Finally we can evaluate the path integral. We have to choose a Lagrangian submanifold $L$ and the most obvious choice is $L = \{X^+ = 0, \beta^+ = 0\}$. In order to compensate the odd integration we have to insert into the path integral the local observables
\[
\int_L O_0^{p_1}(w_1) \cdots O_0^{p_k}(w_k)\, e^{S_{BV}} = \mathrm{tr}_\Omega(w_1 \wedge \ldots \wedge w_k), \tag{4.49}
\]
where the trace map is defined in Appendix B. This expression is non-zero only if $p_1 + \cdots + p_k = d$. With this choice of Lagrangian submanifold, the nonlocal observables are identically zero.

¹ Within the general BV framework it can be shown that the modular class corresponds to the first obstruction to the existence of a quantum master action [36].

We conclude that in the present finite dimensional BV-theory the action (4.40) satisfies the quantum master equation if the Poisson tensor $\alpha$ is unimodular. This is equivalent to the requirement that there exists a trace map $\mathrm{tr}_\Omega$ satisfying two properties in Theorem 9 of Appendix A. In Appendix A we present the mathematical discussion of these properties. Below we present a “physical” derivation of those identities. The first property of $\mathrm{tr}_\Omega$ from Theorem 9 is a consequence of the quantum master equation for $S_{BV}$ (i.e., the unimodularity of the Poisson structure $\alpha$). Namely we have the following chain of relations
\begin{align*}
\mathrm{tr}_\Omega \big( d_{LP}(w) \wedge \lambda - (-1)^{|w|+1} w \wedge d_{LP}(\lambda) \big) &= \mathrm{tr}_\Omega \big( d_{LP}(w \wedge \lambda) \big) \\
&= -2 \int_L \{ e^{S_{BV}}, O_0^{|w|+|\lambda|}(w \wedge \lambda) \} = -2 \int_L \Delta_\Omega \big( e^{S_{BV}} O_0^{|w|+|\lambda|}(w \wedge \lambda) \big) = 0 .
\end{align*}
This property implies that the trace map $\mathrm{tr}_\Omega$ descends to the Lichnerowicz–Poisson cohomology $H^\bullet_{LP}(M)$. The second property in Theorem 9 is a simple consequence of the fundamental BV Theorem 3. To be specific, for the multivector fields $w, \lambda$ we have the following relations:
\begin{align*}
\mathrm{tr}_\Omega \big( D_\Omega(w) \wedge \lambda - (-1)^{|w|} w \wedge D_\Omega(\lambda) \big) &= \int_L O_0^{|w|-1}(D_\Omega w)\, O_0^{|\lambda|}(\lambda) - (-1)^{|w|} O_0^{|w|}(w)\, O_0^{|\lambda|-1}(D_\Omega \lambda) \\
&= -2 \int_L \Delta_\Omega \big( O_2^{|w|-2}(w)\, O_0^{|\lambda|}(\lambda) - O_0^{|w|}(w)\, O_2^{|\lambda|-2}(\lambda) \big) = 0,
\end{align*}
where (4.45) and (4.48) have been used. This property implies that the trace descends to the cohomology of $D_\Omega$. The cohomology of $D_\Omega$ on the multivectors $H^\bullet(D_\Omega)$ is isomorphic to the de Rham cohomology $H^\bullet_{dR}(M)$. In the present context it is worthwhile to mention another interesting property of the trace map $\mathrm{tr}_\Omega$ on multivector fields. For the unimodular Poisson structure $\alpha$ there is the following relation:
\[
e^{-\alpha} D_\Omega\, e^{\alpha} = d_{LP} + D_\Omega, \tag{4.50}
\]
where $e^{\alpha}$ acts on the multivector field $w$ as
\[
e^{\alpha} w = w + \alpha \wedge w + \frac{1}{2} \alpha \wedge \alpha \wedge w + \cdots,
\]
and $D_\Omega e^{\alpha} = 0$ is used. The relation (4.50) implies the isomorphism of cohomologies, $H^\bullet(d_{LP} + D_\Omega) \approx H^\bullet_{dR}(M)$. Moreover the trace map $\mathrm{tr}_\Omega$ descends to the cohomology $H^\bullet(d_{LP} + D_\Omega)$.

4.2. Maurer-Cartan equation and formal Frobenius manifolds. In this subsection we comment on the relation between the BV setting described above and the construction of Frobenius manifolds from BV-manifolds which appeared previously in mathematical works, in particular in the papers by Barannikov and Kontsevich [2] and by Manin [38,39]. Our observations have preliminary and speculative character. We plan to come back to this subject elsewhere. The BV theory discussed in the previous section can be deformed by adding to the solution (4.40) of the quantum master equation any observable of ghost number 0.
Take $w(t) \in \Gamma(\wedge^2 TM[[t]])$ with t being a formal parameter of degree zero such that w = w(0). Consider the deformed action
$$S_{BV}(t) = S_{BV} + \frac{t}{2}\, O_2^0(w(t)). \tag{4.51}$$
Obviously, $S_{BV}(t)$ satisfies the quantum master equation if and only if α + t w(t) is a unimodular Poisson structure with the invariant volume form Ω. This is equivalent to the Maurer–Cartan equation for w(t),
$$d_{LP} w(t) + \frac{t}{2}\,[w(t), w(t)]_s = 0, \qquad D_\Omega w(t) = 0. \tag{4.52}$$
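As a quick consistency check of (4.52), one can expand the Poisson and unimodularity conditions for the deformed bivector directly. The sketch below is ours rather than the authors'; it assumes only that the Schouten bracket of two bivectors is symmetric and that α is Poisson and unimodular with respect to Ω.

```latex
\begin{align*}
[\alpha + t w,\, \alpha + t w]_s
  &= [\alpha,\alpha]_s + 2t\,[\alpha, w]_s + t^2\,[w,w]_s
   = 2t\Big( d_{LP} w + \tfrac{t}{2}\,[w,w]_s \Big), \\
D_\Omega(\alpha + t w)
  &= D_\Omega\alpha + t\, D_\Omega w = t\, D_\Omega w .
\end{align*}
```

Since t is a formal parameter, both right-hand sides vanish as formal power series exactly when the two equations in (4.52) hold.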
At the infinitesimal level this means $d_{LP} w = D_\Omega w = 0$ and thus $O_2^0(w)$ is a quantum non-local observable. However it is natural to allow the volume form to vary and use the argument presented around Eq. (4.46). Therefore we can describe the infinitesimal deformations as follows:
$$d_{LP} w = 0, \qquad D_\Omega w + d_{LP} f = 0, \tag{4.53}$$
with $w + f \in \Gamma(\wedge^2 TM \oplus \wedge^0 TM)$, where w corresponds to the deformations of the unimodular Poisson structure and f to the deformations of the volume form. Eqs. (4.53) can be equivalently rewritten as follows:
$$(d_{LP} + D_\Omega)(w + f) = e^{-\alpha} D_\Omega\, e^{\alpha}(w + f) = 0, \tag{4.54}$$
where we assume that Ω is an invariant volume form for α. In BV theory the deformation will be trivial if it is in the image of the quantum Laplacian $\Delta_{(\Omega,\alpha)}$. However the question is to understand the geometrical description of these trivial BV deformations. For example, the diffeomorphisms give a trivial deformation of the BV theory. Namely for $w = L_\xi\alpha = d_{LP}(\xi)$ and $f = D_\Omega\xi$ with $\xi \in \Gamma(TM)$ the deformation is trivial,
$$\frac{1}{2}\, O_2^0(w) + 2\, O_0^0(f) = -\Delta_{(\Omega,\alpha)}\, O_2^{-1}(\xi).$$
However the formula (4.54) suggests that the deformation is trivial if
$$w + f = (d_{LP} + D_\Omega)\xi = e^{-\alpha} D_\Omega (e^{\alpha}\xi), \tag{4.55}$$
with $\xi \in \Gamma(\wedge^\bullet TM)$, not just a vector field. One has to show that the corresponding deformations of the BV theory are trivial. Unfortunately we are unable to do it in all generality. Nevertheless we give some plausible arguments in its favor and analyze the problem in special cases. The linear space of deformations defined by the condition (4.54) modulo the identification (4.55) would be interpreted as the tangent space to some kind of moduli space of unimodular Poisson structures (if such a space exists). The crucial point motivated by the BV consideration is that the Poisson tensors may be equivalent even if they are not diffeomorphic. Indeed the equivalence relation (4.55) looks very natural in terms of the pure spinor description (see Appendix B for the details). The unimodular Poisson structure can be described in terms of the closed pure spinor $\rho = e^{\alpha}\Omega$. The deformation of the pure spinor would be given by
$$\delta\rho = (w + f)\cdot\rho,$$
where the finite deformation is $e^{\alpha+w} e^{f}\,\Omega$. The property (4.54) implies that d(δρ) = 0. If the deformation satisfies (4.55) then δρ = (w + f) · ρ = −d(ξ · ρ), where we used Theorem 13 in Appendix B. Thus we look at the deformations of the closed pure spinor modulo exact ones, which correspond to the subspace of the de Rham cohomology group, namely
$$\{[(w + f)\cdot\rho] \in H^\bullet_{dR}(M),\ (w + f) \in \Gamma(\wedge^2 TM \oplus \wedge^0 TM)\},$$
where we deal with the alternative grading of the differential forms, see Appendix B. Following standard terminology, we refer to the corresponding space of deformations of the BV theory modulo the trivial ones as the geometric moduli space.

Let us get back to the BV theory. More generally we want to understand the subspace of the cohomology of the quantum Laplacian spanned by non-local observables,
$$H_{nonloc}(\Delta_{(\Omega,\alpha)}) = \{[O_2(w)] \in H(\Delta_{(\Omega,\alpha)}),\ w \in \Gamma(\wedge^\bullet TM)\}.$$
In particular we want to understand if it is finite dimensional and moreover related to the de Rham cohomology $H_{dR}(M) \approx H(D_\Omega) \approx H(d_{LP} + D_\Omega)$. We are unable to answer this question in all generality. However we can analyze two special cases which give a positive answer.

Let us discuss first the case of the trivial Poisson structure, α = 0. In this case a quantum non-local observable $O_2^{p-2}(w)$ corresponds to the multivector field $w \in \Gamma(\wedge^p TM)$ such that $D_\Omega w = 0$. Then we can show that $O_2^{p-2}(D_\Omega\nu)$, $D_\Omega\nu \in \Gamma(\wedge^p TM)$, is trivial. In fact it is always possible to write $\nu = \sum_i f_i D_\Omega\lambda_i$, for some $f_i \in C^\infty(M)$ and $\lambda_i \in \Gamma(\wedge^{p+2} TM)$. This is obviously equivalent to saying that the de Rham differential finitely generates the module of forms. Then using the basic properties of the antibracket we arrive at
$$O_2^{p-2}(D_\Omega\nu) = \sum_i O_2^{p-2}([f_i, D_\Omega\lambda_i]_s) = -\sum_i \{O_2^{-2}(f_i), O_2^{p-1}(D_\Omega\lambda_i)\} = -\Delta_\Omega\Big(\sum_i O_2^{-2}(f_i)\, O_2^{p-1}(D_\Omega\lambda_i)\Big). \tag{4.56}$$
Therefore the correspondence $w \to O_2^{p-2}(w)$ defines a surjection from $H(D_\Omega)$ to $H_{nonloc}(\Delta_\Omega)$. Thus the corresponding geometrical moduli space is finite dimensional.

Next consider the case of a non-trivial Poisson structure α such that the two differentials $(d_{LP}, D_\Omega)$ satisfy the $\partial\bar\partial$-lemma, i.e.
$$\mathrm{Im}\, d_{LP} D_\Omega = \mathrm{Im}\, d_{LP} \cap \mathrm{Ker}\, D_\Omega = \mathrm{Ker}\, d_{LP} \cap \mathrm{Im}\, D_\Omega. \tag{4.57}$$
The condition (4.57) is satisfied for a large class of symplectic manifolds obeying the strong Lefschetz property (see [39]). However the $\partial\bar\partial$-lemma does not hold for a generic Poisson manifold since $H_{LP}(M)$ is infinite dimensional. One of the consequences of the $\partial\bar\partial$-lemma is the isomorphism of the cohomologies, $H_{LP}(M) \approx H_{dR}(M)$. The extreme example of the failure of this lemma is the trivial Poisson structure. Consider $w \in \Gamma(\wedge^p TM)$ which defines a trivial class in $(d_{LP} + D_\Omega)$-cohomology, i.e.
$$w = d_{LP}\xi_{p-1} + D_\Omega\xi_{p+1}, \qquad 0 = d_{LP}\xi_{k-1} + D_\Omega\xi_{k+1} \quad \text{for } k \neq p.$$
After a straightforward calculation we arrive at the following relation:
$$O_2^{p-2}(w) = -2\,\Delta_{(\Omega,\alpha)}\big(O_2(\xi_{p-1}) + 4\, O_0(\xi_{p-3})\big) + O_2(D_\Omega\xi_{p+1}).$$
Since $D_\Omega\xi_{p+1} \in \mathrm{Im}\, D_\Omega \cap \mathrm{Ker}\, d_{LP} = \mathrm{Im}\, D_\Omega d_{LP}$, there exists $\nu_p$ such that $D_\Omega\xi_{p+1} = D_\Omega d_{LP}\nu_p$ and $O_2(D_\Omega\xi_{p+1}) = 2\,\Delta_{(\Omega,\alpha)}\, O_2(D_\Omega\nu_p)$. Thus we conclude that also in this case the correspondence $w \to O_2^{p-2}(w)$ defines a surjective map from the finite dimensional space $H^p_{dR}(M,\alpha)$ to $H_{nonloc}(\Delta_{(\Omega,\alpha)})$, where $H^p_{dR}(M,\alpha)$ is defined as follows:
$$H^p_{dR}(M,\alpha) = \{[w\cdot\rho] \in H^\bullet_{dR}(M),\ w \in \Gamma(\wedge^p TM)\}.$$
Motivated by these two examples we conjecture that the space $H_{nonloc}(\Delta_{(\Omega,\alpha)})$ is finite dimensional. Thus in general the action $S_{BV}$ can be deformed for arbitrary ghost number, mimicking the construction of Frobenius manifolds of [2] and [38]. Let $\{w_k \in \Gamma(\wedge^{p_k} TM)\}$ define a basis $\{O_2^{p_k-2}(w_k)\}$ of $H_{nonloc}(\Delta_{(\Omega,\alpha)})$. We introduce the formal variables $\{t_k\}$ of degree $2 - p_k$ and extend the full BV machinery to $F \otimes \mathbb{R}[[t_k]]$. Clearly $S(t) = S_{BV} + \sum_k t_k\, O_2^{p_k-2}(w_k)$ solves the quantum master equation at the infinitesimal level. Interpreting $H_{nonloc}(\Delta_{(\Omega,\alpha)})$ as the tangent space of the extended moduli space, the main problem is to find a finite deformation, i.e. a solution of the Maurer-Cartan equation
$$\delta_{BV} S(t) + \frac{1}{2}\{S(t), S(t)\} = 0. \tag{4.58}$$
In [2,38,39] the solution of such an equation is discussed within the BV setup. The main difference with the setup in [2,38,39] is the requirement of the $\partial\bar\partial$-lemma, which we want to avoid because it excludes the non-symplectic cases. Is it possible to solve the Maurer-Cartan equation (4.58) in this context? The $\partial\bar\partial$-lemma provides the isomorphism between the spaces of the classical and quantum observables, while for the generic unimodular Poisson manifold the space of classical observables is infinite dimensional and the space of quantum observables is expected to be finite dimensional.

5. Gauge Fixing

In this section we perform the gauge fixing by choosing an appropriate Lagrangian submanifold. In particular we use a complex structure for the gauge fixing.

5.1. Geometrical setup. Let us start from the description of the relevant geometric setup. It turns out to be very convenient to consider the N = 2 supersymmetric PSM [5]. The existence of the extended supersymmetry for PSM requires a generalized complex structure
$$\mathcal{J} = \begin{pmatrix} J & P \\ L & -J^t \end{pmatrix}, \tag{5.59}$$
such that $[R, \mathcal{J}] = 0$, where
$$R = \begin{pmatrix} 1_d & \alpha \\ 0 & -1_d \end{pmatrix}. \tag{5.60}$$
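To see where the constraints on the blocks come from, one can multiply out the matrices. The sketch below is our own unpacking; it reads (5.59) and (5.60) as the block matrices $\mathcal{J} = \begin{pmatrix} J & P \\ L & -J^t \end{pmatrix}$ and $R = \begin{pmatrix} 1_d & \alpha \\ 0 & -1_d \end{pmatrix}$, and it uses the standard requirement $\mathcal{J}^2 = -1$ for a generalized complex structure.

```latex
\begin{align*}
R\mathcal{J} &= \begin{pmatrix} J + \alpha L & P - \alpha J^t \\ -L & J^t \end{pmatrix},
\qquad
\mathcal{J}R = \begin{pmatrix} J & J\alpha - P \\ L & L\alpha + J^t \end{pmatrix}, \\
[R,\mathcal{J}] = 0 \;&\Longleftrightarrow\; L = 0, \qquad P = \tfrac{1}{2}\big(J\alpha + \alpha J^t\big), \\
\mathcal{J}^2 = -1,\ L = 0 \;&\Longrightarrow\; J^2 = -1 .
\end{align*}
```

The lower-left blocks force L = 0, the upper-right blocks then fix P in terms of α and J, and the remaining blocks are satisfied identically.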
These conditions can be worked out completely. To be specific, L = 0, J is a complex structure and moreover the (2, 0) + (0, 2) part of α,
$$P = \frac{1}{2}\,(J\alpha + \alpha J^t), \tag{5.61}$$
is a holomorphic Poisson structure. If we switch to complex coordinates with the labels $(i, \bar{i})$ then the (2, 0)-part $\alpha^{ij}$ is a holomorphic Poisson structure if the following holds:
$$\partial_{\bar{k}}\alpha^{ij} = 0, \qquad \alpha^{il}\partial_l\alpha^{jk} + \alpha^{jl}\partial_l\alpha^{ki} + \alpha^{kl}\partial_l\alpha^{ij} = 0. \tag{5.62}$$
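For a concrete instance of (5.62), consider the following illustration, which is hypothetical and not taken from the text: $M = \mathbb{C}^2$ with holomorphic coordinates $(z^1, z^2)$ and a bivector with a single independent (2, 0) component f.

```latex
\begin{align*}
\alpha^{12} = -\alpha^{21} &= f(z^1, z^2), \qquad \partial_{\bar{k}} f = 0, \\
\alpha^{il}\partial_l\alpha^{jk} + \alpha^{jl}\partial_l\alpha^{ki} + \alpha^{kl}\partial_l\alpha^{ij} &= 0
\qquad \text{for all } i, j, k \in \{1, 2\}.
\end{align*}
```

The second line holds automatically: the left-hand side is totally antisymmetric in (i, j, k), and with only two holomorphic directions at least two of the indices must coincide. Under these assumptions every holomorphic function f defines a holomorphic Poisson structure on $\mathbb{C}^2$.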
Indeed the geometrical setup we will use can be summarized as follows: a Poisson manifold (M, α, J) with a complex structure J such that the (2, 0)-part of α is holomorphic. The fact that the (2, 0)-part is itself Poisson follows from this. It may look at first as if the geometry we just described is somewhat exotic. However that is not the case and this Poisson geometry is always realized on (twisted) generalized Kähler manifolds [37,15,21]. The (twisted) generalized Kähler manifold can be characterized as a bihermitian geometry $(g, J_+, J_-)$, where $J_\pm$ are two complex structures and g is a metric which is hermitian with respect to both complex structures. In addition there are certain integrability conditions on the two-forms $gJ_\pm$. The (twisted) generalized Kähler manifold has two real Poisson structures $\pi_\pm = (J_+ \pm J_-)g^{-1}$ [37]. Moreover their (2, 0)-part with respect to $J_+$ (or $J_-$) is a holomorphic Poisson structure with respect to $J_+$ (or $J_-$) [21].

5.2. Gauge fixed action. Let us assume that the Poisson manifold (M, α) admits a complex structure J such that the (2, 0)-part of α is a holomorphic Poisson structure and the world-sheet Σ is equipped with a complex structure. We will concentrate our attention on the case of the two-sphere, where the complex structure is unique. Introducing the complex coordinates on M and Σ we define the following Lagrangian submanifold in the space of (anti)fields:
$$\eta_{z i} = 0, \qquad \eta_{\bar{z}\bar{i}} = 0, \qquad \eta^{+i}_{z} = 0, \qquad \eta^{+\bar{i}}_{\bar{z}} = 0, \qquad X^{+} = 0, \qquad \beta^{+} = 0, \tag{5.63}$$
where $(i, \bar{i})$ stand for the complex coordinates on M and $(z, \bar{z})$ are the complex coordinates on Σ. The odd symplectic structure (3.18) is zero on (5.63). Equivalently we could write the conditions (5.63) using the projectors constructed out of J and the complex structure on Σ, in the same fashion as in [47]. Indeed we do not need to assume that J is integrable; it is enough for J to be an almost complex structure. However in what follows we are in the geometrical setup described in the previous subsection. In this case many calculations simplify drastically. Assuming the gauge (5.63) the gauge fixed action is
$$S_{GF} = i\int d^2\sigma\,\Big(\eta_{z\bar{i}}\,\partial_{\bar{z}}X^{\bar{i}} - \eta_{\bar{z}i}\,\partial_{z}X^{i} + \alpha^{\bar{i}j}\,\eta_{z\bar{i}}\,\eta_{\bar{z}j} + \eta^{+i}_{\bar{z}}\big(\partial_z\beta_i + \partial_i\alpha^{\bar{l}s}\,\eta_{z\bar{l}}\,\beta_s\big) - \eta^{+\bar{i}}_{z}\big(\partial_{\bar{z}}\beta_{\bar{i}} + \partial_{\bar{i}}\alpha^{l\bar{s}}\,\eta_{\bar{z}l}\,\beta_{\bar{s}}\big) - \partial_{\bar{i}}\partial_j\alpha^{k\bar{l}}\,\eta^{+\bar{i}}_{z}\,\eta^{+j}_{\bar{z}}\,\beta_k\beta_{\bar{l}}\Big), \tag{5.64}$$
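which is just the action (3.21) restricted to (5.63). The action (5.64) is invariant under the following BRST transformations:
\begin{align}
\delta X^{i} &= \alpha^{ij}\beta_j + \alpha^{i\bar{j}}\beta_{\bar{j}}, \tag{5.65}\\
\delta X^{\bar{i}} &= \alpha^{\bar{i}\bar{j}}\beta_{\bar{j}} + \alpha^{\bar{i}j}\beta_j, \tag{5.66}\\
\delta\eta^{+i}_{\bar{z}} &= -\partial_{\bar{z}}X^{i} - \alpha^{ij}\eta_{\bar{z}j} - \partial_k\alpha^{i\bar{j}}\,\eta^{+k}_{\bar{z}}\beta_{\bar{j}} - \partial_k\alpha^{ij}\,\eta^{+k}_{\bar{z}}\beta_j, \tag{5.67}\\
\delta\eta^{+\bar{i}}_{z} &= -\partial_{z}X^{\bar{i}} - \alpha^{\bar{i}\bar{j}}\eta_{z\bar{j}} - \partial_{\bar{k}}\alpha^{\bar{i}j}\,\eta^{+\bar{k}}_{z}\beta_j - \partial_{\bar{k}}\alpha^{\bar{i}\bar{j}}\,\eta^{+\bar{k}}_{z}\beta_{\bar{j}}, \tag{5.68}\\
\delta\beta_i &= \partial_i\alpha^{k\bar{j}}\beta_k\beta_{\bar{j}} + \frac{1}{2}\,\partial_i\alpha^{kj}\beta_k\beta_j, \tag{5.69}\\
\delta\beta_{\bar{i}} &= \partial_{\bar{i}}\alpha^{\bar{k}j}\beta_{\bar{k}}\beta_j + \frac{1}{2}\,\partial_{\bar{i}}\alpha^{\bar{k}\bar{j}}\beta_{\bar{k}}\beta_{\bar{j}}, \tag{5.70}\\
\delta\eta_{z\bar{i}} &= -\partial_z\beta_{\bar{i}} - \partial_{\bar{i}}\alpha^{\bar{k}l}\,\eta_{z\bar{k}}\beta_l - \partial_{\bar{i}}\alpha^{\bar{k}\bar{l}}\,\eta_{z\bar{k}}\beta_{\bar{l}} - \partial_{\bar{i}}\partial_{\bar{s}}\alpha^{k\bar{l}}\,\eta^{+\bar{s}}_{z}\beta_k\beta_{\bar{l}} - \frac{1}{2}\,\partial_{\bar{i}}\partial_{\bar{s}}\alpha^{\bar{k}\bar{l}}\,\eta^{+\bar{s}}_{z}\beta_{\bar{k}}\beta_{\bar{l}}, \tag{5.71}\\
\delta\eta_{\bar{z}i} &= -\partial_{\bar{z}}\beta_i - \partial_i\alpha^{k\bar{l}}\,\eta_{\bar{z}k}\beta_{\bar{l}} - \partial_i\alpha^{kl}\,\eta_{\bar{z}k}\beta_l - \partial_i\partial_s\alpha^{\bar{k}l}\,\eta^{+s}_{\bar{z}}\beta_{\bar{k}}\beta_l - \frac{1}{2}\,\partial_i\partial_s\alpha^{kl}\,\eta^{+s}_{\bar{z}}\beta_k\beta_l, \tag{5.72}
\end{align}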
which are nilpotent only on-shell. The existence of such residual BRST symmetry within BV formalism is discussed in [18,1]. Next using the gauge fixed action (5.64) we can calculate the path integral explicitly on the sphere. In particular let us perform the one-loop calculation around the constant map. We take a classical solution η = 0 and X = x0 with x0 being a constant and the rest of the fields are zero. Consider the fluctuations around this configuration X = x0 + X f , η = 0 + η f , β = 0 + β f , η+ = 0 + η+f ,
(5.73)
where naturally by η and η⁺ we understand only the non-vanishing components $(\eta_{\bar{z}i}, \eta_{z\bar{i}})$ and $(\eta^{+i}_{\bar{z}}, \eta^{+\bar{i}}_{z})$, respectively. We take the expansion (5.73) and plug it into the gauge fixed action (5.64) while keeping only terms up to quadratic order in the fluctuations. The bosonic part of the resulting action can be written schematically as
$$\frac{1}{2}\begin{pmatrix} X & \eta \end{pmatrix}\begin{pmatrix} 0 & D \\ -D & A \end{pmatrix}\begin{pmatrix} X \\ \eta \end{pmatrix}, \tag{5.74}$$
where A is a part composed from the Poisson tensor α and D is a first order differential operator
$$D = \begin{pmatrix} \partial_z & 0 \\ 0 & -\partial_{\bar{z}} \end{pmatrix}.$$
The fermionic part of the corresponding action is written as
$$\eta^{t} D \beta, \tag{5.75}$$
with the same D. We can easily perform the gaussian integral over the bosonic (5.74) and the fermionic (5.75) parts. The integration produces the ratio of determinants of D, which is exactly 1. Thus the result of this gaussian integration is just one. However the integration over the zero modes of D will remain. The fields η and η⁺ do not have any zero modes since there are no (anti)holomorphic 1-forms on the sphere, while β has constant zero modes and X does as well. These zero modes give an integration over the finite dimensional graded manifold T*[1]M, which is defined by choosing a volume form Ω on M. In order to compensate the odd integration we have to insert the local observables into the path integral. Thus the final result for the correlators of local observables is
$$\langle O_0^{p_1}(w_1) \cdots O_0^{p_k}(w_k)\rangle = \mathrm{tr}(w_1 \wedge \cdots \wedge w_k), \tag{5.76}$$
where the trace map tr is defined in the Appendix and the correlator agrees with (4.49). Since the number of zero modes for β corresponds to the dimensionality of M we have that the correlator (5.76) is non-zero only if p1 + · · · pk = d. Moreover if we require that the correlator is invariant under the BRST symmetry (5.65)–(5.72) then the Poisson tensor α should be unimodular and is the corresponding invariant volume form. To prove this we need to remember how BRST symmetry (5.65)–(5.72) acts on the local observables and the theorem 8 from the Appendix A. Notice that as far as the fields X and β concern the action of BV symmetry (3.22)–(3.27) and the BRST symmetry (5.65)–(5.72) is the same. Since the local observables are constructed from X and β only we can apply the discussion of Subsect. 3.2 to the analysis of BRST invariant observables in the present setup. We conclude that the present calculation is in complete agreement with our previous analysis within the finite dimensional BV framework. Although the unimodularity of α is argued completely differently, now through the BRST invariance of the zeromode measure. The answer (5.76) is just the leading contribution into the full quantum correlator. Finally we comment when the geometry required for the present gauge fixing is compatible with the unimodularity. Indeed for a generalized Calabi-Yau manifold the corresponding Poisson structure is always unimodular [16]. Thus as a possible example, we may consider the generalized Kähler geometry where one of the generalized complex structures satisfies a generalized Calabi-Yau condition. Actually the gauge fixing can be performed for a generalized Calabi-Yau manifold by itself with the use of an almost generalized complex structure. However we have to stress that unimodularity of Poisson structure is a real condition and indeed much weaker than the generalized Calabi-Yau condition. 5.3. Relation to A-model. 
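If we assume that $\alpha^{ij} = 0$ and α is invertible, then we are on a Kähler manifold where $\omega = \alpha^{-1}$ is the Kähler form and $g = -\omega J$ is the hermitian metric. Due to the fact that α is invertible we can perform the integration over $\eta_{z\bar{i}}$ and $\eta_{\bar{z}i}$ in the path integral with the gauge fixed action (5.64). Introducing the following notation:
$$\psi^{i} = -i g^{i\bar{j}}\beta_{\bar{j}}, \qquad \psi^{\bar{i}} = i g^{\bar{i}j}\beta_j, \qquad \psi^{i}_{\bar{z}} = -i\eta^{+i}_{\bar{z}}, \qquad \psi^{\bar{i}}_{z} = -i\eta^{+\bar{i}}_{z}, \tag{5.77}$$
the result of the integration over η is
$$S_A = \int d^2\sigma\,\Big(\partial_{\bar{z}}X^{\bar{i}}\, g_{\bar{i}j}\,\partial_z X^{j} + i\psi^{\bar{i}}_{z}\, g_{\bar{i}j}\,\nabla_{\bar{z}}\psi^{j} + i\psi^{i}_{\bar{z}}\, g_{i\bar{j}}\,\nabla_{z}\psi^{\bar{j}} - R_{p\bar{i}j\bar{n}}\,\psi^{j}_{\bar{z}}\psi^{\bar{i}}_{z}\psi^{p}\psi^{\bar{n}}\Big), \tag{5.78}$$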
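where we adopted the following notation:
$$\nabla_{\bar{z}}\psi^{k} = \partial_{\bar{z}}\psi^{k} + \Gamma^{k}_{nl}\,\partial_{\bar{z}}X^{n}\psi^{l}, \qquad \nabla_{z}\psi^{\bar{k}} = \partial_{z}\psi^{\bar{k}} + \Gamma^{\bar{k}}_{\bar{n}\bar{l}}\,\partial_{z}X^{\bar{n}}\psi^{\bar{l}}, \tag{5.79}$$
with Γ being the Levi–Civita connection and R the corresponding Riemann tensor. The first term in the action (5.78) can be rewritten as
$$\partial_{\bar{z}}X^{\bar{i}}\, g_{\bar{i}j}\,\partial_z X^{j} = \frac{1}{2}\sqrt{h}\,h^{\alpha\beta}\,\partial_\alpha X^{\bar{i}}\, g_{\bar{i}j}\,\partial_\beta X^{j} + \frac{1}{2}\,\epsilon^{\alpha\beta}\,\partial_\alpha X^{\bar{i}}\,(i g_{\bar{i}j})\,\partial_\beta X^{j}, \tag{5.80}$$
where the last term is topological, the pull-back of the Kähler form ω. The BRST transformations (5.65)–(5.72) become
\begin{align}
&\delta X^{i} = \psi^{i}, \qquad \delta X^{\bar{i}} = \psi^{\bar{i}}, \qquad \delta\psi^{i} = 0, \qquad \delta\psi^{\bar{i}} = 0, \tag{5.81}\\
&\delta\psi^{+i}_{\bar{z}} = i\partial_{\bar{z}}X^{i} + \Gamma^{i}_{lk}\,\psi^{k}_{\bar{z}}\psi^{l}, \qquad \delta\psi^{+\bar{i}}_{z} = i\partial_{z}X^{\bar{i}} + \Gamma^{\bar{i}}_{\bar{l}\bar{k}}\,\psi^{\bar{k}}_{z}\psi^{\bar{l}}. \tag{5.82}
\end{align}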
The action (5.78) with the BRST transformations (5.82) corresponds to the topological sigma model [47] on the Kähler manifold which corresponds to the A-twist of N = (2, 2) supersymmetric sigma model [48]. Previously the BV treatment of the A-model has been discussed in [1]. Here we presented the improved analysis of the relation between the BV-formulation of PSM and the A-model. Any symplectic manifold with symplectic structure ω is unimodular with the volume form given by = ωd/2 . Moreover there exists a natural isomorphism between the Lichnerowicz–Poisson cohomology and the de Rham cohomology, HL• P (M) ≈ Hd R (M) which is provided by the symplectic structure ω. Therefore the observable corresponding to a multivector field can be mapped into the observable corresponding to the differential form through the identification (5.77). Thus the correlator (5.76) can be rewritten as tr (w1 ∧ . . . ∧ wk ) = (w1 ) ∧ · · · (wk ), (5.83) M
where wl is a differential form corresponding to a multivector field wl constructed through the map : ∧• T M → ∧• T ∗ M defined by the symplectic structure ω. Indeed the correlator (5.83) is the standard one for the A-model and can be interpreted as the intersection number of the Poincaré dual cycles to wl . In the full quantum theory the correlator (5.83) gets corrections from the holomorphic maps on which the theory is localized. These instanton corrections are related to the Gromov–Witten invariants. This is well-developed subject, see [23] for a review.
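5.4. Zero Poisson structure. As a next example we consider the case of zero Poisson structure, α = 0. In this case the gauge fixed action (5.64) is of the form
$$S_{GF} = i\int d^2\sigma\,\Big(\eta_{z\bar{i}}\,\partial_{\bar{z}}X^{\bar{i}} - \eta_{\bar{z}i}\,\partial_{z}X^{i} + \eta^{+i}_{\bar{z}}\,\partial_z\beta_i - \eta^{+\bar{i}}_{z}\,\partial_{\bar{z}}\beta_{\bar{i}}\Big), \tag{5.84}$$
while the BRST transformations (5.65)–(5.72) become
\begin{align}
&\delta X^{i} = 0, \qquad \delta X^{\bar{i}} = 0, \qquad \delta\eta^{+i}_{\bar{z}} = -\partial_{\bar{z}}X^{i}, \qquad \delta\eta^{+\bar{i}}_{z} = -\partial_{z}X^{\bar{i}}, \tag{5.85}\\
&\delta\beta_i = 0, \qquad \delta\beta_{\bar{i}} = 0, \qquad \delta\eta_{z\bar{i}} = -\partial_z\beta_{\bar{i}}, \qquad \delta\eta_{\bar{z}i} = -\partial_{\bar{z}}\beta_i. \tag{5.86}
\end{align}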
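Because (5.85)–(5.86) send every field either to zero or to a derivative of a field that is itself δ-closed, the square of δ can be checked by inspection. A short sketch of that check, using only the transformations just listed:

```latex
\begin{align*}
\delta^2\eta^{+i}_{\bar{z}} &= -\partial_{\bar{z}}(\delta X^{i}) = 0, &
\delta^2\eta^{+\bar{i}}_{z} &= -\partial_{z}(\delta X^{\bar{i}}) = 0, \\
\delta^2\eta_{z\bar{i}} &= -\partial_{z}(\delta\beta_{\bar{i}}) = 0, &
\delta^2\eta_{\bar{z}i} &= -\partial_{\bar{z}}(\delta\beta_{i}) = 0,
\end{align*}
```

while δ² vanishes trivially on X and β. No equations of motion enter at any step.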
Now these transformations are nilpotent off-shell. The action (5.84) is reminiscent of the action obtained through the infinite volume limit of the A-model [14]. However our BRST symmetry differs from the one discussed in [14] and thus these are different theories. The action (5.84) with the symmetries (5.85)–(5.86) has also appeared in a different context in [52] as a specific gauge fixed version of the “Hitchin sigma model” [51]. Next we argue that the correlator (5.76) is the full quantum answer for the PSM with α = 0. We can use the BRST symmetry (5.85)–(5.86) to localize the theory on the holomorphic maps, $\partial_{\bar{z}}X^{i} = 0$. Namely we can add to the action (5.84) the BRST exact term
$$-t\,\delta\int d^2\sigma\,\Big(\eta^{+\bar{i}}_{z}\, g_{\bar{i}j}\,\partial_{\bar{z}}X^{j} + \eta^{+i}_{\bar{z}}\, g_{i\bar{j}}\,\partial_{z}X^{\bar{j}}\Big) = t\int d^2\sigma\,\Big(\partial_{z}X^{\bar{i}}\, g_{\bar{i}j}\,\partial_{\bar{z}}X^{j} + \partial_{\bar{z}}X^{i}\, g_{i\bar{j}}\,\partial_{z}X^{\bar{j}}\Big), \tag{5.87}$$
where t is any real number and this exact term is positive definite. The addition of this exact term to the action cannot change the theory and the result is independent of the parameter t. By sending t to infinity the dominant contribution to the path integral will come from the holomorphic maps, $\partial_{\bar{z}}X^{i} = 0$ and $\partial_{z}X^{\bar{i}} = 0$. Moreover we can perform the integration over η, which imposes the conditions $\partial_{\bar{z}}X^{\bar{i}} = 0$ and $\partial_{z}X^{i} = 0$; together with the BRST argument these imply that only the constant maps contribute to the path integral. Thus in the evaluation of the path integral on the sphere with the insertion of local observables the only remaining integration is the integration over M and the corresponding zero modes of β. On the sphere there will be no zero modes for η and η⁺. Thus we have proven that for the PSM with zero Poisson structure the leading result (5.76) for the correlators of local observables is indeed exact. Actually this should not be a surprise since the Poisson tensor controls the ħ-corrections. In the general action (3.17) the fields can be rescaled in such a way that ħ appears in front of α only.
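The form of the exact term (5.87) can be traced back to (5.85) in two steps. The sketch below is our own; it assumes that g is a positive-definite hermitian metric, that t > 0, and that $X^{\bar{j}}$ is the complex conjugate of $X^{j}$. Since δX = 0, the variation acts only on η⁺ and leaves g(X) untouched.

```latex
\begin{align*}
-t\,\delta\!\int d^2\sigma\,\Big(\eta^{+\bar{i}}_{z}\, g_{\bar{i}j}\,\partial_{\bar{z}}X^{j}
      + \eta^{+i}_{\bar{z}}\, g_{i\bar{j}}\,\partial_{z}X^{\bar{j}}\Big)
 &= t\!\int d^2\sigma\,\Big(\partial_{z}X^{\bar{i}}\, g_{\bar{i}j}\,\partial_{\bar{z}}X^{j}
      + \partial_{\bar{z}}X^{i}\, g_{i\bar{j}}\,\partial_{z}X^{\bar{j}}\Big) \\
 &= 2t\!\int d^2\sigma\; g_{i\bar{j}}\;\partial_{\bar{z}}X^{i}\;\overline{\partial_{\bar{z}}X^{j}} \;\ge\; 0 .
\end{align*}
```

Under these assumptions the integrand vanishes only where $\partial_{\bar{z}}X^{i} = 0$, which is why the large-t limit concentrates the path integral on holomorphic maps.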
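5.5. Holomorphic Poisson structure. Another interesting case is when there exists a complex structure J such that α is a holomorphic Poisson structure. In other words the (1, 1)-part of α vanishes and thus the gauge fixed action (5.64) is independent of α. The gauge fixed action for the holomorphic Poisson structure is the same as (5.84) for the zero Poisson structure. However the Poisson structure enters into the BRST transformations. For the case of holomorphic Poisson structure the transformations (5.65)–(5.72) become
\begin{align}
\delta X^{i} &= \alpha^{ij}\beta_j, \tag{5.88}\\
\delta X^{\bar{i}} &= \alpha^{\bar{i}\bar{j}}\beta_{\bar{j}}, \tag{5.89}\\
\delta\eta^{+i}_{\bar{z}} &= -\partial_{\bar{z}}X^{i} - \alpha^{ij}\eta_{\bar{z}j} - \partial_k\alpha^{ij}\,\eta^{+k}_{\bar{z}}\beta_j, \tag{5.90}\\
\delta\eta^{+\bar{i}}_{z} &= -\partial_{z}X^{\bar{i}} - \alpha^{\bar{i}\bar{j}}\eta_{z\bar{j}} - \partial_{\bar{k}}\alpha^{\bar{i}\bar{j}}\,\eta^{+\bar{k}}_{z}\beta_{\bar{j}}, \tag{5.91}\\
\delta\beta_i &= \frac{1}{2}\,\partial_i\alpha^{kj}\beta_k\beta_j, \tag{5.92}\\
\delta\beta_{\bar{i}} &= \frac{1}{2}\,\partial_{\bar{i}}\alpha^{\bar{k}\bar{j}}\beta_{\bar{k}}\beta_{\bar{j}}, \tag{5.93}
\end{align}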
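\begin{align}
\delta\eta_{z\bar{i}} &= -\partial_z\beta_{\bar{i}} - \partial_{\bar{i}}\alpha^{\bar{k}\bar{l}}\,\eta_{z\bar{k}}\beta_{\bar{l}} - \frac{1}{2}\,\partial_{\bar{i}}\partial_{\bar{s}}\alpha^{\bar{k}\bar{l}}\,\eta^{+\bar{s}}_{z}\beta_{\bar{k}}\beta_{\bar{l}}, \tag{5.94}\\
\delta\eta_{\bar{z}i} &= -\partial_{\bar{z}}\beta_i - \partial_i\alpha^{kl}\,\eta_{\bar{z}k}\beta_l - \frac{1}{2}\,\partial_i\partial_s\alpha^{kl}\,\eta^{+s}_{\bar{z}}\beta_k\beta_l. \tag{5.95}
\end{align}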
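The role of the Jacobi identity in these transformations can be seen already on $X^{i}$. The following sketch is ours and not part of the original derivation; it assumes that δ is an odd derivation, that the β are anticommuting, and that α is holomorphic so that only $\delta X^{k}$ contributes to $\delta\alpha^{ij}$.

```latex
\begin{align*}
\delta^2 X^{i} &= \delta\big(\alpha^{ij}\beta_j\big)
   = \partial_k\alpha^{ij}\,(\delta X^{k})\,\beta_j + \alpha^{ij}\,\delta\beta_j \\
 &= \alpha^{kl}\,\partial_k\alpha^{ij}\,\beta_l\beta_j
   + \tfrac{1}{2}\,\alpha^{ij}\,\partial_j\alpha^{kl}\,\beta_k\beta_l \\
 &= \tfrac{1}{2}\big(\alpha^{im}\partial_m\alpha^{lj} + \alpha^{lm}\partial_m\alpha^{ji}
   + \alpha^{jm}\partial_m\alpha^{il}\big)\beta_l\beta_j = 0 .
\end{align*}
```

The last line is the Jacobi identity in (5.62) contracted with $\beta_l\beta_j$, so under these assumptions $\delta^2 X^{i}$ vanishes without using the equations of motion.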
These transformations are nilpotent δ 2 = 0 off-shell and the action (5.84) is invariant under them. Indeed there is not a single BRST transformation but a whole family. In the transformations (5.88)–(5.95) we can put a complex parameter t ∈ C in front of all ¯¯ terms containing α i j and correspondingly t¯ in front of terms with α i j . This would define a complex family of the BRST transformations δt which are nilpotent δt2 = 0 off-shell and the action (5.84) is invariant under δt . We can repeat the argument from the previous subsection. Using the localization with respect to δt for any t (including zero) and the integration over η we arrive at the conclusion that the path integral is localized on the constant maps. Thus again the correlator (5.76) of local observables is a full quantum result. The example of holomorphic Poisson structure is provided by the hyperKähler manifold which admits a holomorphic symplectic structure with respect to the appropriate complex structure. Therefore the A-model on the hyperKähler manifold can be localized to constant maps and the semi-classical result is exact. However our results are applicable for the wide class of Poisson holomorphic manifold, e.g. the Del Pezzo surfaces, the Poisson Fano varieties, CP 2 , etc. These examples have attracted a lot attention recently, especially in the context of generalized complex geometry (see [33,16] for the general discussion and examples [22,17]). One may observe that the PSM for a holomorphic Poisson manifold has striking similarities with the B-model [41] defined for the following generalized complex structure: J α , (5.96) 0 −J t where α = α (2,0) + α (0,2) is the real part of a holomorphic Poisson structure. However to define the B-model we need a closed pure spinor ρ = eα
(2,0)
,
where is a closed holomorphic volume form. Indeed this condition gives the holomorphic analog of unimodularity. However for the PSM discussed above we need a real version of unimodularity of α which is a weaker condition on a real volume form. Thus the unimodular deformations of holomorphic Poisson structure cannot be mapped to the corresponding deformations of generalized Calabi-Yau structure corresponding to (5.96). Therefore for a given geometrical setup the B-model and PSM are two different models, with different moduli dependence. 6. Conclusions In this work we have attempted to study the Poisson sigma model beyond the perturbative expansion. The main lesson is that the quantum theory requires the corresponding Poisson tensor α to be unimodular. We argued this additional property of α in different ways. In the BV framework the unimodularity is related to the quantum master equation, which requires additional care in its definition. Moreover for the specific gauge fixing
we obtained the unimodularity as from the requirement of the BRST invariance of the zero mode measure. Alternatively one can provide a different heuristic argument2 for the unimodularity of the Poisson tensor coming from the perturbative analysis as in [6]. In the perturbative expansion all integrals are absolutely convergent except those containing tadpole diagrams. One may try to regularize the tadpoles by point-splitting using the vector field with no zeros on . However such a vector does not exist on S 2 and thus the tadpoles should be dealt with differently. Since the tadpoles correspond to the bidifferential operators involving the divergence of the Poisson tensor then the unimodularity is the way to eliminate them. The unimodulary of the Poisson tensor reformulated in terms of pure spinors allows us to treat the PSM exactly in the same fashion as A- and B-models [23] together with their generalized complex relatives [25,26,34,41]. Indeed the Poisson structure defines a real analog of the generalized complex structure and the unimodulary of α is a real analog of the generalized Calabi-Yau condition. We believe that it is important that all these models can be treated uniformly and there is an intricate interrelation between all these models. There are several open questions we would like to address in the future, in particular the generalization of the construction of Frobenius manifolds from [2] and [38] for the ¯ case when the ∂ ∂-lemma fails, as in a generic Poisson case. Also we plan to use further the localization for PSM along the lines presented in Sect. 5. There is an indication that the Gromov-Witten story can be generalized for PSM defined over the generalized Kähler manifold. Furthermore it would be interesting to develop the present analysis for PSM for the higher genus surfaces. Acknowledgement. We are grateful to Alberto Cattaneo, Gil Cavalcanti, Andrei Losev, Vasily Pestun, Gabriele Vezzosi and Roberto Zucchini for the discussions. 
We thank Alberto Cattaneo, Yvette Kosmann-Schwarzbach and Vasily Pestun for reading and commenting on the manuscript. We thank the referee for the comments and suggestions. We thank the Erwin Schrödinger International Institute for Mathematical Physics for hospitality. M.Z. thanks INFN Sezione di Firenze and Università di Firenze where part of this work was carried out. The research of M.Z. was supported by VR-grant 621-2004-3177.
A. The Multivector Calculus Throughout the Appendices A and B we consider mainly the case of the compact manifold M. This condition can be relaxed if we require the appropriate integrals to be defined and integration by parts should work without any boundary contributions. In this Appendix we review the relevant structures on the multivector fields (∧• T M) over a smooth manifold M. For further details the reader may consult the textbook by Vaisman [45]. The Lie bracket on the vector fields can be extended to a bracket on the multivectors. This bracket is called the Schouten bracket. In local coordinates the multivector fields P and Q are written as P = P µ1 ...µ p ∂µ1 ∧ . . . ∧ ∂µ p ,
Q = Q µ1 ...µq ∂µ1 ∧ . . . ∧ ∂µq ,
2 We thank Alberto Cattaneo for sharing this argument with us. Also see [12] for the related discussion and another interesting work [9] on the relation between the deformation quantization and unimodularity.
and their Schouten bracket is defined by the following expression:³
$$[P, Q]_s = \Big(p\, P^{\mu_1\ldots\mu_{p-1}\rho}\,\partial_\rho Q^{\mu_p\ldots\mu_{q+p-1}} - q\,\partial_\rho P^{\mu_1\ldots\mu_p}\, Q^{\rho\mu_{p+1}\ldots\mu_{q+p-1}}\Big)\,\partial_{\mu_1}\wedge\cdots\wedge\partial_{\mu_{q+p-1}}. \tag{A.1}$$
The algebra $(\Gamma(\wedge^\bullet TM), \wedge, [\,,]_s)$ is a Gerstenhaber algebra (see Definition 1). If we further specify a volume form Ω on M and a closed one-form λ then we can introduce an operator $D_{\Omega,\lambda}$,
$$D_{\Omega,\lambda} P = \mathrm{div}_\Omega P + i_\lambda P,$$
where $\mathrm{div}_\Omega$ is the divergence operator defined by Ω and $i_\lambda$ is the contraction with the one-form λ. In local coordinates with the volume form written as $\Omega = \rho\, dx^1\wedge\cdots\wedge dx^d$ the divergence operator is
$$(\mathrm{div}_\Omega P)^{\mu_2\ldots\mu_p} = -p\,\frac{1}{\rho}\,\partial_{\mu_1}\big(\rho\, P^{\mu_1\mu_2\ldots\mu_p}\big).$$
Equivalently, in coordinate free notation, the divergence can be written as $\mathrm{div}_\Omega P = -{*}^{-1} d\, {*} P$, where ${*}P = i_P\Omega$ provides a map from $\Gamma(\wedge^p TM)$ to differential forms and d is the de Rham differential. Assuming that dλ = 0 we have $(D_{\Omega,\lambda})^2 P = 0$ and moreover
$$[P, Q]_s = (-1)^p D_{\Omega,\lambda}(P\wedge Q) + (-1)^{p+1}(D_{\Omega,\lambda}P)\wedge Q - P\wedge D_{\Omega,\lambda}Q. \tag{A.2}$$
Indeed $D_{\Omega,\lambda}$ is the most general operator which generates the Schouten bracket [49]. Therefore the algebra $(\Gamma(\wedge^\bullet TM), \wedge, [\,,]_s, D_{\Omega,\lambda})$ is a BV algebra (see Definition 2).

Definition 4. The bivector $\alpha \in \Gamma(\wedge^2 TM)$ is called a Poisson structure if it satisfies $[\alpha, \alpha]_s = 0$. The manifold with such α is called a Poisson manifold. The Poisson structure defines a Lichnerowicz–Poisson differential $d_{LP}$ on multivector fields,
$$d_{LP} P \equiv [\alpha, P]_s, \qquad P \in \Gamma(\wedge^\bullet TM).$$
The corresponding cohomology HL• P (M) is called the Lichnerowicz–Poisson cohomology group. We assume that M is orientable and thus we can choose a volume form . Then we can study how the Hamiltonian vector fields X f = α(d f ), f ∈ C ∞ (M) act on . In particular there exists a vector field φ such that L X f = φ ( f ). φ is named the modular vector field. Indeed the vector field φ defines a class [φ ] ∈ HL1 P (M). This class is independent of , 1 L X f (e g ) = φ + d L P g ( f )e g , 2 and [φ ] is called the Poisson modular class. 3 Our definition differs by the overall factor (−1) p−1 compared to the one in [45].
Definition 5. A Poisson manifold (M, α) is called unimodular [46] if [φ ] = 0. In other words there exists such that L X f = 0 for any Hamiltonian vector field X f . We refer to such as an invariant volume form. For a Poisson manifold (M, α) we can introduce a (Koszul-)Brylinski differential δ B on the differential forms • (M) δ B = i α d − di α , where i α is a contraction with a Poisson tensor α and d is a de Rham differential [30]. Theorem 6. A Poisson manifold (M, α) is unimodular if and only if there exists a volume form such that δ B = 0 or alternatively D α = 0. Proof. We use notation D ≡ D,0 . The proof of the theorem follows straightforwardly from the relation δ B = −i φ . This relation arises from the definition of the modular vector field φ given above and the following identities: d(i X f ) = −d f ∧ δ B ,
φ ( f ) = d f ∧ i φ .
Moreover using the definition of D the modular vector field can also be defined using the divergence operator with respect to as D α = −φ . For more details and the related discussion the reader may consult [28,46]. Thus we refer to an unimodular Poisson manifold as a triple (M, α, ), where is a volume form which is closed under the Brylinski differential. Definition 7. For a manifold M with a volume form we define a trace map over the multivector fields tr : (∧top T M) → R as follows: tr (P) =
∧ i P . M
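Definition 5 and Theorem 6 can be made concrete for linear Poisson structures, where everything reduces to structure constants. The sketch below is an illustration of ours, not part of the paper: it takes $\alpha^{ij}(x) = \sum_k c^{ij}{}_k\, x_k$ on the dual of a Lie algebra, uses the constant volume form $dx^1\wedge\cdots\wedge dx^d$, and computes the components $\sum_i \partial_i\alpha^{ij} = \sum_i c^{ij}{}_i$, which are proportional to $D_\Omega\alpha$. The function names `so3`, `aff1`, `jacobi_holds` and `modular_vector` are hypothetical helpers. A vanishing result shows that this particular volume form is invariant; for linear Poisson structures it is a known fact that a non-vanishing result rules out any invariant volume form.

```python
from itertools import product


def so3():
    """Structure constants c[i][j][k] of so(3): [e_i, e_j] = eps_ijk e_k."""
    c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        c[i][j][k] = 1
        c[j][i][k] = -1
    return c


def aff1():
    """Two-dimensional non-abelian Lie algebra: [e_0, e_1] = e_1."""
    c = [[[0] * 2 for _ in range(2)] for _ in range(2)]
    c[0][1][1] = 1
    c[1][0][1] = -1
    return c


def jacobi_holds(c):
    """Check the Jacobi identity, i.e. that alpha^{ij} = c[i][j][k] x_k is Poisson."""
    d = len(c)
    for i, j, k, m in product(range(d), repeat=4):
        total = sum(
            c[i][j][l] * c[l][k][m]
            + c[j][k][l] * c[l][i][m]
            + c[k][i][l] * c[l][j][m]
            for l in range(d)
        )
        if total != 0:
            return False
    return True


def modular_vector(c):
    """Components sum_i d/dx_i alpha^{ij} = sum_i c[i][j][i] (constant volume form)."""
    d = len(c)
    return [sum(c[i][j][i] for i in range(d)) for j in range(d)]


if __name__ == "__main__":
    for name, c in [("so(3)", so3()), ("aff(1)", aff1())]:
        print(name, "Poisson:", jacobi_holds(c), "divergence:", modular_vector(c))
```

With these inputs the divergence vanishes for so(3), so the constant volume form is invariant there, while for the two-dimensional algebra it does not, which matches the expectation that the corresponding linear Poisson structure is not unimodular.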
Theorem 8. For a Poisson manifold (M, α) with a trace map tr the relation tr (d L P P ∧ Q) = (−1) p+1 tr (P ∧ d L P Q) is satisfied if and only if (M, α) is an unimodular and is an invariant volume form. Proof. To prove this statement we use the formulas from Vaisman’s textbook [45]. The relation in the theorem is equivalent to the following statement: ∧ i (d L P W ) = 0, W ∈ (∧d−1 T M). M
For this to hold it would be enough to show that ∧ i (d L P W ) is an exact d-form. Using the Lichnerowicz definition of the Schouten bracket (see the formula (1.16) in [45]) we rewrite ∧ i (d L P W ) = − ∧ i W δ B + (−1)d−1 ∧ δ B (i W ).
Assuming that the one-form i W = f dg and using the properties of the Brylinski differential, we recast the two terms in the above expression as follows: − ∧ i W δ B = (−1)d−1 f L X g , (−1)d−1 ∧ δ B ( f dg) = (−1)d {g, f } = (−1)d L X g ( f ) + (−1)d−1 f L X g . To derive the first relation we have used δ B = −i φ . If we require that the above forms are exact for any g and f then the manifold should be unimodular and is an invariant volume form. Since any one form can be written as the sum of the terms like f dg we can extend our proof for a generic situation. We can summarize the relevant properties of an unimodular Poisson manifold in the following theorem. Theorem 9. If (M, α, ) is a unimodular Poisson manifold then ((∧• T M), ∧, [, ]s , D , d L P ) is a graded differential BV algebra such that D d L P + d L P D = 0. Moreover there exists a trace map tr such that tr (d L P P ∧ Q) = (−1) p+1 tr (P ∧ d L P Q), tr (D P ∧ Q) = (−1) p tr (P ∧ D Q). Proof. The first part of the theorem has been discussed in [28,49]. We have explained most of the statements already. The relation between d L P and D is derived as follows: D d L P P = D (D (α ∧ P) − α ∧ D P) = −D (α ∧ D P) = −d L P D P, where we use the unimodularity, D α = 0. The property of the trace with respect to the divergence operator D is valid for any manifold with a volume form and is just a simple consequence of the Stokes theorem for the differential forms. B. Poisson Geometry and Pure Spinors In this Appendix we reformulate the previous Appendix in a different language. This allows us to put the whole formalism into the wider context which is related to generalized geometry on the sum T M ⊕ T ∗ M ≡ T ⊕ T ∗ of the tangent and contangent bundles. Below we review very briefly the notions of generalized complex structure, generalized Calabi-Yau condition and their real analogs. For more details we refer the reader to the reviews [15,16,50]. 
The sum of the tangent and cotangent bundles $T \oplus T^*$ has a natural $O(d, d)$ structure given by the natural pairing
\[ \langle v + \xi, s + \lambda \rangle = \frac{1}{2}\,(i_v \lambda + i_s \xi), \]
where we adopt the notation $(v + \xi), (s + \lambda) \in \Gamma(T \oplus T^*)$. We are interested in a real (complex) Dirac structure, which is defined as a maximally isotropic subbundle of $T \oplus T^*$ (or $(T \oplus T^*) \otimes \mathbb{C}$) that is involutive with respect to the Courant bracket. A Dirac structure is an example of a Lie algebroid, with the bracket obtained by restricting the Courant bracket. In particular we are interested in the case when
the tangent plus cotangent bundle (or its complexification) can be decomposed as a sum of two real (complex) Dirac structures
\[ T \oplus T^* = L \oplus L^*, \qquad (T \oplus T^*) \otimes \mathbb{C} = L \oplus L^*. \]
This decomposition gives us a real (complex) bialgebroid. Furthermore there is the structure of a differential Gerstenhaber algebra [27,35] $(\Gamma(\wedge^\bullet L^*), \wedge, \{\,,\}, d_L)$, where $\{\,,\}$ is the extension of the Lie bracket from $L^*$ to $\wedge^\bullet L^*$ and $d_L$ is the Lie algebroid differential. In the complex case it is natural to impose an extra condition, namely that the dual space $L^*$ is the complex conjugate of $L$. Thus the corresponding bialgebroid is
\[ (T \oplus T^*) \otimes \mathbb{C} = L \oplus \bar{L}. \]
This special case corresponds to the notion of generalized complex structure [15,20].

Alternatively the Dirac structures can be described by means of pure spinor lines. We define the action of a section $(v + \xi) \in \Gamma(TM \oplus T^*M)$ on a differential form $\rho \in \Gamma(\wedge^\bullet T^*M)$,
\[ (v + \xi) \cdot \rho \equiv i_v \rho + \xi \wedge \rho, \]
which corresponds to the action of $Cl(T \oplus T^*)$ on $\wedge^\bullet T^*$. Thus the differential forms form a natural representation of $Cl(T \oplus T^*)$. Consider the Dirac structure $L$ and define a subbundle $U_0$ of $\wedge^\bullet T^*$ as follows:
\[ L = \{ (v + \xi) \in \Gamma(T \oplus T^*),\ (v + \xi) \cdot U_0 = 0 \}. \]
We refer to $U_0$ as a pure spinor line. The Dirac structure $L$ induces the alternative grading on the differential forms
\[ \wedge^\bullet T^* = \bigoplus_{k=0}^{\dim M} U_k, \qquad U_k = (\wedge^k L^*) \cdot U_0, \]
where $\cdot$ stands for the extension of the $Cl(T \oplus T^*)$ action to $\wedge^\bullet T^*$. The property that $L$ is involutive under the Courant bracket is equivalent to the following:
\[ d(\Gamma(U_0)) \subset \Gamma(U_1), \]
where $d$ is the de Rham differential. Indeed we can define a Dirac structure through the subbundle $U_0$ of $\wedge^\bullet T^*$ with the above properties. With respect to the alternative grading we can decompose the de Rham differential as follows:
\[ d = \bar\partial + \partial, \qquad \Gamma(U_{k-1}) \xleftarrow{\ \partial\ } \Gamma(U_k) \xrightarrow{\ \bar\partial\ } \Gamma(U_{k+1}), \]
such that $\partial^2 = 0$ and $\bar\partial^2 = 0$. We borrow the notation from generalized complex geometry, and in the present context the bar over $\partial$ does not mean complex conjugation. From now on we assume that the bundle $U_0$ is trivial and there exists a global section, a pure spinor form $\rho$ which defines $L$ completely. The integrability of $L$ is equivalent to the statement
\[ d\rho = (v + \xi) \cdot \rho, \]
for some section $(v + \xi) \in \Gamma(L^*)$. For a given $L$ the pure spinor $\rho$ is not uniquely defined: for any $f \in C^\infty(M)$ the form $e^f \rho$ is also a pure spinor. Thus there is a cohomology class $[(v + \xi)] \in H^1(d_L)$, which is proportional to the modular class of the Lie algebroid [11]. Thus we arrive at the following theorem.
Theorem 10. The Dirac structure $L$ admits a description in terms of a closed pure spinor if and only if the corresponding bundle $U_0$ is trivial and the Lie algebroid $L$ is unimodular.

Since $U_0$ is a line bundle, its triviality is analyzed differently in the real and complex cases. For instance, in the complex case we have to require a trivial first Chern class, $c_1(U_0) = 0$. In the generalized complex case $(T \oplus T^*) \otimes \mathbb{C} = L \oplus \bar{L}$ the ability to describe $L$ in terms of a closed pure spinor corresponds to the generalized Calabi-Yau condition, the notion introduced by Hitchin [20]. Thus the generalized Calabi-Yau condition is equivalent to two requirements, $c_1(U_0) = 0$ and the unimodularity of the Lie algebroid $L$.

From now on we assume that $L$ admits a description in terms of a closed pure spinor $\rho$. For $A \in \Gamma(\wedge^\bullet L^*)$ and a closed pure spinor $\rho$ there are the following relations:
\[ (d_L A) \cdot \rho = \bar\partial(A \cdot \rho), \qquad (D A) \cdot \rho = \partial(A \cdot \rho), \]
where the last relation can be regarded as the definition of the operator $D$ such that $D^2 = 0$. Indeed $D$ generates the bracket $\{\,,\}$ on $\wedge^\bullet L^*$. Therefore one can show that $(\Gamma(\wedge^\bullet L^*), \wedge, \{\,,\}, D, d_L)$ is a differential BV-algebra [49,26,34]. In addition the closed pure spinor provides the isomorphisms of the cohomologies, $H^\bullet(d_L) \approx H^\bullet(\bar\partial)$ and $H^\bullet(D) \approx H^\bullet(\partial)$. There exists an invariant form on spinors which, in the present context, corresponds to the Mukai pairing of the differential forms
\[ (\rho, \phi) = \sum_j (-1)^j \big( \rho_{2j} \wedge \phi_{n-2j} + \rho_{2j+1} \wedge \phi_{n-2j-1} \big), \]
where $n = \dim M$ and the forms are decomposed by the standard degree, $\rho = \sum \rho_i$, $\phi = \sum \phi_i$. We can introduce the trace map as
\[ \mathrm{tr}_\rho(A) = \int_M (\rho, A \cdot \rho), \qquad A \in \Gamma(\wedge^n L^*). \]
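The Clifford action $(v + \xi) \cdot \rho = i_v \rho + \xi \wedge \rho$ and the natural pairing used in this appendix can be checked against each other on a small example. The following is a minimal sketch, not taken from the paper: it represents forms on a flat $d$-dimensional space with constant coefficients as dictionaries, and verifies the Clifford relation $A \cdot B \cdot \rho + B \cdot A \cdot \rho = 2\langle A, B\rangle \rho$. All function names (`interior`, `wedge`, `act`, `pairing`, and the random helpers) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
import random

# A form is a dict: sorted index tuple (i1 < ... < ik) -> coefficient of
# e^{i1} ^ ... ^ e^{ik}.  A section of T + T* is a pair (v, xi) of lists.

def interior(v, form):
    """Contraction i_v(form) with a constant vector v."""
    out = {}
    for idx, c in form.items():
        for pos, i in enumerate(idx):
            key = idx[:pos] + idx[pos + 1:]
            out[key] = out.get(key, 0) + (-1) ** pos * v[i] * c
    return out

def wedge(xi, form):
    """Left exterior multiplication xi ^ form by a constant one-form xi."""
    out = {}
    for idx, c in form.items():
        for a, xa in enumerate(xi):
            if a in idx:
                continue
            pos = sum(1 for i in idx if i < a)  # swaps needed to sort e^a in
            key = idx[:pos] + (a,) + idx[pos:]
            out[key] = out.get(key, 0) + (-1) ** pos * xa * c
    return out

def add(f, g):
    """Sum of two forms, with zero coefficients dropped."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def scale(coef, form):
    """Scalar multiple of a form, with zero coefficients dropped."""
    return {key: coef * c for key, c in form.items() if coef * c != 0}

def act(section, form):
    """Clifford action (v + xi) . form = i_v form + xi ^ form."""
    v, xi = section
    return add(interior(v, form), wedge(xi, form))

def pairing(a, b):
    """Natural pairing <v + xi, s + lam> = (i_v lam + i_s xi) / 2."""
    (v, xi), (s, lam) = a, b
    total = sum(x * y for x, y in zip(v, lam)) + sum(x * y for x, y in zip(s, xi))
    return Fraction(total, 2)

def random_form(d, rng):
    """Random inhomogeneous form with small integer coefficients."""
    return {idx: rng.randint(-3, 3)
            for k in range(d + 1) for idx in combinations(range(d), k)}

def random_section(d, rng):
    """Random section (v, xi) with small integer components."""
    return ([rng.randint(-3, 3) for _ in range(d)],
            [rng.randint(-3, 3) for _ in range(d)])

rng = random.Random(0)
rho = random_form(4, rng)
A, B = random_section(4, rng), random_section(4, rng)
anticommutator = add(act(A, act(B, rho)), act(B, act(A, rho)))
print(anticommutator == scale(2 * pairing(A, B), rho))
```

Exact integer and `Fraction` arithmetic is used so that the comparison is an equality test rather than a tolerance check.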
We can summarize these observations in the following theorem:

Theorem 11. For a Lie bialgebroid $T \oplus T^* = L \oplus L^*$ with $L$ being a Dirac structure described by the closed pure spinor $\rho$, $(\Gamma(\wedge^\bullet L^*), \wedge, \{\,,\}, D, d_L)$ is a differential BV-algebra and there exists a trace map with the following properties:
\[ \mathrm{tr}_\rho(d_L A \wedge B) = (-1)^{|A|+1}\, \mathrm{tr}_\rho(A \wedge d_L B), \qquad \mathrm{tr}_\rho(D A \wedge B) = (-1)^{|A|}\, \mathrm{tr}_\rho(A \wedge D B), \]
where $A, B$ are sections of $\wedge^\bullet L^*$.

Proof. The proof of this theorem is straightforward and the different elements of the proof are scattered in the literature, see [49,26,34]. Let us sketch the main idea behind the proof. For any differential form $\rho \in \Gamma(\wedge^\bullet T^*)$ and any sections $A, B \in \Gamma(T \oplus T^*)$, there is the following identity:
\[ A \cdot B \cdot d\rho = d(A \cdot B \cdot \rho) + B \cdot d(A \cdot \rho) - A \cdot d(B \cdot \rho) + [A, B]_c \cdot \rho - d\langle A, B \rangle \wedge \rho, \]
where $[\,,]_c$ is the Courant bracket and $\langle\,,\rangle$ is the natural pairing on $T \oplus T^*$. If we have a Lie bialgebroid $T \oplus T^* = L \oplus L^*$ with $L$ being a Dirac structure described by the closed pure spinor $\rho$ then the above formula implies
\[ d(A \cdot B \cdot \rho) + B \cdot d(A \cdot \rho) - A \cdot d(B \cdot \rho) + \{A, B\} \cdot \rho = 0, \]
where now $A, B \in \Gamma(L^*)$ and $\{\,,\}$ is the Lie bracket on $L^*$, which is the restriction of the Courant bracket to $L^*$. This formula can be extended to the general case when $A, B$ are sections of definite degree in $\Gamma(\wedge^\bullet L^*)$. This extension together with the definition
\[ (d_L + D) A \cdot \rho = d(A \cdot \rho), \qquad \wedge^k L^* \xrightarrow{\ d_L\ } \wedge^{k+1} L^*, \qquad \wedge^k L^* \xrightarrow{\ D\ } \wedge^{k-1} L^*, \]
allows us to recover that $D$ generates the bracket on $\Gamma(\wedge^\bullet L^*)$ and moreover that $\Gamma(\wedge^\bullet L^*)$ is a differential BV algebra. The properties of the trace map can also be proven easily using the above properties.

Using this language we now recast the previous definitions in Poisson geometry. Let us start from the following theorem.

Theorem 12. The manifold $M$ is a unimodular Poisson manifold if and only if there exists a closed pure spinor of the form
\[ \rho = e^{\alpha}\Omega = \Omega + i_\alpha \Omega + \frac{1}{2}\, i_\alpha^2 \Omega + \cdots, \]
where $\alpha$ is a bivector and $\Omega$ is a volume form.

Proof. If we have a unimodular Poisson manifold $(M, \alpha, \Omega)$ then we can construct a pure spinor $\rho = e^{\alpha}\Omega$ which satisfies
\[ d\rho = \delta_B \Omega + \frac{1}{2}\, \delta_B (i_\alpha \Omega) + \cdots = 0, \]
since $\delta_B \Omega = 0$ and $\delta_B i_\alpha = i_\alpha \delta_B$. In the opposite direction we can start from a closed pure spinor $\rho = e^{\alpha}\Omega$ which defines the following maximally isotropic subbundle of $T \oplus T^*$:
\[ L = e^{\alpha}(T^*) = \{ i_\xi \alpha + \xi : \xi \in \Gamma(T^*) \}. \]
Since $\rho$ is closed, $L$ is a Dirac structure and thus $\alpha$ is a Poisson structure. Moreover the volume form $\Omega$ is then an invariant volume form with respect to the unimodular Poisson structure $\alpha$.

Thus the Poisson structure on $M$ gives the real Lie bialgebroid $T \oplus T^* = e^{\alpha}(T^*) \oplus T$. If the Poisson structure is unimodular then there exists a closed pure spinor $\rho = e^{\alpha}\Omega$ and $\Gamma(\wedge^\bullet T)$ is a differential BV algebra. Indeed the trace map $\mathrm{tr}_\Omega$ defined in the previous appendix coincides with the one defined here, $\mathrm{tr}_\rho$, since only the top form part contributes in $\rho$. On a unimodular Poisson manifold $(M, \alpha, \Omega)$ with the pure spinor $\rho = e^{\alpha}\Omega$ we can calculate the differentials $\partial$ and $\bar\partial$ associated with the alternative grading on the differential forms
\[ \wedge^\bullet T^* = \bigoplus_{k=0}^{\dim M} (\wedge^k T) \cdot e^{\alpha}\Omega. \]
Indeed in this case we have $\bar\partial = -\delta_B$ and $\partial = d + \delta_B$, see the following theorem:
Theorem 13. For a unimodular Poisson manifold $(M, \alpha, \Omega)$ with the closed pure spinor $\rho = e^{\alpha}\Omega$, the following relations hold:
\[ (D_\Omega P) \cdot \rho = -(d + \delta_B)(P \cdot \rho), \qquad (d_{LP} P) \cdot \rho = \delta_B (P \cdot \rho). \]

Proof. Let us start from the proof of the first relation. If $\alpha = 0$ then this is just the definition of $D_\Omega$ given in the previous appendix. In the general case $\alpha \neq 0$ a simple calculation produces the following formula [10]:
\[ d + \delta_B = e^{\alpha}\, d\, e^{-\alpha}, \]
which together with the definition of $D_\Omega$ gives the desired relation. Next we prove the second relation in the theorem. Using the fact that $D_\Omega$ generates the Schouten bracket and the manifold is unimodular, $D_\Omega \alpha = 0$, we get
\[ (d_{LP} P) \cdot \rho = \big(D_\Omega(\alpha \wedge P) - \alpha \wedge D_\Omega P\big) \cdot \rho = -(d + \delta_B)(i_\alpha i_P \rho) + i_\alpha (d + \delta_B)(i_P \rho) = \delta_B (i_P \rho), \]
where we used the previously proved relation and the property $i_\alpha \delta_B = \delta_B i_\alpha$.

This theorem implies the isomorphism of certain cohomologies. For any Poisson manifold $(M, \alpha)$ there are the following isomorphisms:
\[ H^\bullet_{dR}(M) \approx H^\bullet(D_\Omega) \approx H^\bullet(d + \delta_B), \]
while for a unimodular Poisson manifold in addition we have
\[ H^\bullet_{LP}(M) \approx H^\bullet(\delta_B). \]
References

1. Alexandrov, M., Kontsevich, M., Schwartz, A., Zaboronsky, O.: The Geometry of the master equation and topological quantum field theory. Int. J. Mod. Phys. A 12, 1405 (1997)
2. Barannikov, S., Kontsevich, M.: Frobenius manifolds and formality of Lie algebras of polyvector fields. Internat. Math. Res. Notices 4, 201 (1998)
3. Batalin, I.A., Vilkovisky, G.A.: Relativistic S matrix of dynamical systems with Boson and Fermion constraints. Phys. Lett. B 69, 309 (1977)
4. Bergamin, L., Grumiller, D., Kummer, W., Vassilevich, D.V.: Classical and quantum integrability of 2D dilaton gravities in Euclidean space. Class. Quant. Grav. 22, 1361 (2005)
5. Calvo, I.: Supersymmetric WZ-Poisson sigma model and twisted generalized complex geometry. Lett. Math. Phys. 77, 53 (2006)
6. Cattaneo, A.S., Felder, G.: A path integral approach to the Kontsevich quantization formula. Commun. Math. Phys. 212, 591 (2000)
7. Cattaneo, A.S., Felder, G.: On the AKSZ formulation of the Poisson sigma model. Lett. Math. Phys. 56, 163 (2001)
8. Cattaneo, A.S.: From Topological Field Theory to Deformation Quantization and Reduction. Proceedings of ICM 2006, Vol. III, Zurich: European Mathematical Society, 2006, pp. 339–365
9. Dolgushev, V.: The Van den Bergh duality and the modular symmetry of a Poisson variety. http://arXiv.org/list/math/0612288, 2006
10. Evens, S., Lu, J.-H.: Poisson harmonic forms, Kostant harmonic forms, and the S^1-equivariant cohomology of K/T. Adv. Math. 142, 171 (1999)
11. Evens, S., Lu, J.-H., Weinstein, A.: Transverse measures, the modular class, and a cohomology pairing for Lie algebroids. Quart. J. Math. Oxford Ser. 2. 50(200), 417–436 (1999)
12. Felder, G., Shoikhet, B.: Deformation quantization with traces. Lett. Math. Phys. 53(1), 75–86 (2000)
13. Fiorenza, D.: An introduction to the Batalin-Vilkovisky formalism. Comptes Rendus des Rencontres Mathematiques de Glanon, Edition 2003
14. Frenkel, E., Losev, A.: Mirror symmetry in two steps: A-I-B. Commun. Math. Phys. 269, 39 (2007)
15. Gualtieri, M.: Generalized Complex Geometry. Oxford University, DPhil thesis, 2004, available at http://arXiv.org/list/math.DG/0401221, 2004
16. Gualtieri, M.: Generalized Complex Geometry. http://arXiv.org/list/math.DG/0703298, 2007
17. Gualtieri, M.: Branes on Poisson varieties. http://arXiv.org/abs/0710.2719v1[math.DG], 2007
18. Henneaux, M., Teitelboim, C.: Quantization of Gauge Systems. Princeton Series in Physics, Princeton, NJ: Princeton Univ. Press, 1992
19. Hirshfeld, A.C., Schwarzweller, T.: The partition function of the linear Poisson-sigma model on arbitrary surfaces. http://arXiv.org/list/hep-th/0112086, 2001
20. Hitchin, N.: Generalized Calabi-Yau manifolds. Quart. J. Math. Oxford Ser. 54, 281 (2003)
21. Hitchin, N.: Instantons, Poisson structures and generalized Kaehler geometry. Commun. Math. Phys. 265, 131 (2006)
22. Hitchin, N.: Bihermitian metrics on Del Pezzo surfaces. http://arXiv.org/list/math.DG/0608213, 2006
23. Hori, K., Katz, S., Klemm, A., Pandharipande, R., Thomas, R., Vafa, C., Vakil, R., Zaslow, E.: Mirror symmetry. Providence, RI: Amer. Math. Soc., 2003
24. Ikeda, N.: Two-dimensional gravity and nonlinear gauge theory. Annals Phys. 235, 435 (1994)
25. Kapustin, A.: Topological strings on noncommutative manifolds. Int. J. Geom. Meth. Mod. Phys. 1, 49 (2004)
26. Kapustin, A., Li, Y.: Topological sigma-models with H-flux and twisted generalized complex manifolds. http://arXiv.org/list/hep-th/0407249, 2004
27. Kosmann-Schwarzbach, Y.: Exact Gerstenhaber algebras and Lie bialgebroids. Acta Appl. Math. 41, 153–165 (1995)
28. Kosmann-Schwarzbach, Y.: Modular vector fields and Batalin-Vilkovisky algebras. In: Poisson geometry (Warsaw, 1998), Banach Center Publ. 51, Warsaw: Polish Acad. Sci., 2000, pp. 109–129
29. Kosmann-Schwarzbach, Y., Monterde, J.: Divergence operators and odd Poisson brackets. Ann. Inst. Fourier (Grenoble) 52(2), 419–456 (2002)
30. Koszul, J.-L.: Crochet de Schouten-Nijenhuis et cohomologie. In: The mathematical heritage of Élie Cartan (Lyon, 1984), Astérisque 1985, Numero Hors Serie, 257–271 (1985)
31. Krotov, D., Losev, A.: Quantum field theory as effective BV theory from Chern-Simons. http://arXiv.org/list/hep-th/0603201, 2006
32. Kummer, W., Liebl, H., Vassilevich, D.V.: Exact path integral quantization of generic 2-D dilaton gravity. Nucl. Phys. B 493, 491 (1997)
33. Laurent-Gengoux, C., Stienon, M., Xu, P.: Holomorphic Poisson Structures and Groupoids. http://arXiv.org/list/0707.4253v4[math.DG], 2007, to appear in Intl. Math. Res. Notices
34. Li, Y.: On deformations of generalized complex structures: The generalized Calabi-Yau case. http://arXiv.org/list/hep-th/0508030, 2005
35. Lu, Z.J., Weinstein, A., Xu, P.: Manin Triples for Lie Bialgebroids. J. Diff. Geom. 45, 547 (1997)
36. Lyakhovich, S.L., Sharapov, A.A.: Characteristic classes of gauge systems. Nucl. Phys. B 703, 419 (2004)
37. Lyakhovich, S., Zabzine, M.: Poisson geometry of sigma models with extended supersymmetry. Phys. Lett. B 548, 243 (2002)
38. Manin, Yu.I.: Three constructions of Frobenius manifolds: a comparative study. In: Surveys in differential geometry, Surv. Differ. Geom., VII, Somerville, MA: Int. Press, 2000, pp. 497–554
39. Manin, Yu.I.: Frobenius manifolds, quantum cohomology, and moduli spaces. American Mathematical Society Colloquium Publications 47, Providence, RI: American Mathematical Society, 1999
40. Mnev, P.: Notes on simplicial BF theory. http://arXiv.org/list/hep-th/0610326, 2006
41. Pestun, V.: Topological strings in generalized complex space. Adv. Theor. Math. Phys. 11, 399 (2007)
42. Roytenberg, D.: On the structure of graded symplectic supermanifolds and Courant algebroids. In: Quantization, Poisson Brackets and Beyond, Theodore Voronov (ed.), Contemp. Math., Vol. 315, Providence, RI: Amer. Math. Soc., 2002
43. Schaller, P., Strobl, T.: Poisson structure induced (topological) field theories. Mod. Phys. Lett. A 9, 3129 (1994)
44. Schwarz, A.S.: Geometry of Batalin-Vilkovisky quantization. Commun. Math. Phys. 155, 249 (1993)
45. Vaisman, I.: Lectures on the geometry of Poisson manifolds. Progress in Mathematics, 118. Basel: Birkhäuser Verlag, 1994
46. Weinstein, A.: The modular automorphism group of Poisson manifolds. J. Geom. Phys. 23, 379 (1997)
47. Witten, E.: Topological Sigma Models. Commun. Math. Phys. 118, 411 (1988)
48. Witten, E.: Mirror manifolds and topological field theory. http://arXiv.org/list/hep-th/9112056, 1991
49. Xu, P.: Gerstenhaber algebras and BV-algebras in Poisson geometry. Commun. Math. Phys. 200, 545 (1999)
50. Zabzine, M.: Lectures on generalized complex geometry and supersymmetry. Archivum Mathematicum (Supplement) 42, 119–146 (2006)
51. Zucchini, R.: A sigma model field theoretic realization of Hitchin’s generalized complex geometry. JHEP 0411, 045 (2004)
52. Zucchini, R.: A topological sigma model of biKaehler geometry. JHEP 0601, 041 (2006)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 285, 1065–1086 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0618-y
Communications in
Mathematical Physics
Optimal Concentration for SU(1, 1) Coherent State Transforms and An Analogue of the Lieb-Wehrl Conjecture for SU(1, 1)
Jogia Bandyopadhyay
Department of Physics, Georgia Institute of Technology, Atlanta, GA 30332, USA. E-mail:
[email protected] Received: 5 December 2007 / Accepted: 13 May 2008 Published online: 16 September 2008 – © Springer-Verlag 2008
Abstract: We derive a lower bound for the Wehrl entropy in the setting of $SU(1,1)$. For asymptotically high values of the quantum number $k$, this bound coincides with the analogue of the Lieb-Wehrl conjecture for $SU(1,1)$ coherent states. The bound on the entropy is proved via a sharp norm bound. The norm bound is deduced by using an interesting identity for Fisher information of $SU(1,1)$ coherent state transforms on the hyperbolic plane $\mathbb{H}^2$ and a new family of sharp Sobolev inequalities on $\mathbb{H}^2$. To prove the sharpness of our Sobolev inequality, we need to first prove a uniqueness theorem for solutions of a semi-linear Poisson equation (which is actually the Euler-Lagrange equation for the variational problem associated with our sharp Sobolev inequality) on $\mathbb{H}^2$. Uniqueness theorems proved for similar semi-linear equations in the past do not apply here and the new features of our proof are of independent interest, as are some of the consequences we derive from the new family of Sobolev inequalities.

1. Introduction

Let $M$ be a Riemannian manifold with volume element $dM$. For a probability density $\rho$ on $M$, that is, a non-negative measurable function on $M$ with $\int_M \rho\, dM = 1$, its entropy, if it exists, is defined as:
\[ S(\rho) = -\int_M \rho \ln \rho \, dM. \tag{1.1} \]
Thus defined, the entropy of a density $\rho$ can be thought of as a measure of its “concentration”. If some part of the mass of $\rho$ is very nearly concentrated in a multiple of a Dirac mass, then $S(\rho)$ may be very negative. We shall be mainly interested in the case in which $M$ is the phase space of some classical system, so that, in particular, $M$ is a symplectic manifold. In that case, we shall refer to $\rho$ as a classical density, and $S(\rho)$ as its classical entropy. The uncertainty principle limits the extent of possible concentration in phase space: for instance, it prevents both the momentum variables $p$ and the configuration variables $q$ in a canonical phase space from taking on well-defined values at the same time. A quantum mechanical density $\rho^Q$ is a non-negative operator on the Hilbert space $\mathcal{H}$, which is the state space of the quantum system, having unit trace. Then the quantum entropy (or von Neumann entropy) of $\rho^Q$ is defined by
\[ S^Q(\rho^Q) = -\mathrm{Tr}\, \rho^Q \ln \rho^Q. \tag{1.2} \]
Since all of the eigenvalues of $\rho^Q$ lie in the interval $[0, 1]$, it is clear that
\[ S^Q(\rho^Q) \geq 0. \tag{1.3} \]

Work partially supported by U.S. National Science Foundation grant DMS 06-00037.
There is a natural way to make the correspondence between quantum states and classical probability densities on phase space, which goes back to Schrödinger. It is based on the coherent state transform, which is an isometry $L$ from the quantum state space $\mathcal{H}$ into $L^2(M)$, the Hilbert space of square integrable functions on the classical phase space $M$. Since it is an isometry, if $\psi$ is any unit vector in $\mathcal{H}$, $\rho_\psi = |L\psi|^2$ is a probability density on $M$. Wehrl [Weh] proposed defining the classical entropy of a quantum state $\psi$ in this way (note the corresponding density matrix has rank one, and hence the von Neumann entropy would be zero, for a “pure state”). The Wehrl entropy is defined in terms of the coherent states for the quantum system and is bounded below by the quantum entropy. It has several physically desirable features such as monotonicity, strong subadditivity, and of course, positivity (see [Weh,Lie]). Wehrl identified the class of probability densities arising through the coherent state transform as the class of quantum mechanically significant probability densities on $M$, and conjectured that corresponding to (1.3), there should be a lower bound on $S(|L\psi|^2)$ as $\psi$ ranges over the unit sphere in $\mathcal{H}$. Specifically, if $\mathcal{H}$ is $L^2(\mathbb{R}, dx)$, so that the classical phase space is $\mathbb{R}^2$ with its usual symplectic and Riemannian structure, Wehrl conjectured that the lower bound on $S(|L\psi|^2)$ is attained when $\psi$ is a minimal uncertainty state $\psi_{\min}$, also known as a Glauber coherent state. That is:
\[ \inf_{\|\psi\|_{\mathcal{H}} = 1} S(|L\psi|^2) = S(|L\psi_{\min}|^2). \tag{1.4} \]
This was proved by Lieb [Lie]. Lieb generalized the Wehrl conjecture to the $SU(2)$ coherent states, for which the corresponding classical phase space is $S^2$, the two-dimensional sphere. The analogues of the Glauber coherent states in this case are the Bloch coherent states generated by least weight vectors in the various unitary representations of $SU(2)$, indexed by the half-integer quantum number $j$, and Lieb conjectured the analogue of (1.4) for the $SU(2)$ coherent state transform. Although Lieb’s conjecture for $SU(2)$ is still open, it has attracted the attention of a number of researchers, and much progress has been made. The bound is trivial for $j = 1/2$, in which case every state is a Bloch coherent state, but is already non-trivial for $j = 1$. Schupp [Sch] proved the conjecture for $j = 1$ and $j = 3/2$. Later Bodmann [Bod] proved a result which may be seen as complementary to Schupp’s result; he deduced a lower bound for the Wehrl entropy of $SU(2)$ coherent states, for which the high spin asymptotics coincided with Lieb’s conjecture up to, but not including, terms of first and higher orders in the inverse of the spin quantum number $j$.
Bodmann did this by proving a sharp $L^p$ bound on the range of the coherent state transform. This led to a proof of an analogue of Lieb’s conjecture for certain Rényi entropies (cf. [Gnu]): for any $p > 1$ and any classical density $\rho$, define
\[ S_p(\rho) = -\frac{1}{p-1} \ln \|\rho\|_p, \tag{1.5} \]
where $\|\rho\|_p$ is the $L^p$ norm of $\rho$. Then it is easy to see that, if it exists,
\[ \lim_{p \to 1} S_p(\rho) = S(\rho). \]
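The limit $p \to 1$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $M = \mathbb{R}$ with a standard Gaussian density (a choice made here, not in the paper) and uses the normalization $S_p(\rho) = \ln\|\rho\|_p / (1 - p)$, which is the sign convention under which the limit recovers $S(\rho)$. The helper names are hypothetical.

```python
import math

def gaussian(x, sigma=1.0):
    """Density of a centred Gaussian with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a=-12.0, b=12.0, n=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def entropy(rho):
    """S(rho) = -int rho ln rho."""
    def integrand(x):
        r = rho(x)
        return -r * math.log(r) if r > 0 else 0.0
    return integrate(integrand)

def renyi(rho, p):
    """S_p(rho) = ln ||rho||_p / (1 - p), assumed normalization, p > 1."""
    norm_p = integrate(lambda x: rho(x) ** p) ** (1.0 / p)
    return math.log(norm_p) / (1.0 - p)

s = entropy(gaussian)
for p in (1.1, 1.01, 1.001, 1.0001):
    print(p, renyi(gaussian, p), s)
```

For the Gaussian the exact entropy is $\tfrac12 \ln(2\pi e)$, so the printed values can be compared against that constant as well.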
Bodmann derived his bound on Rényi entropies from a Sobolev type inequality and a Fisher information identity, which is another type of concentration bound on the range of the coherent state transform. The Fisher information $I(\rho)$ of a probability density $\rho$ on $M$ is defined by
\[ I(\rho) = \int_M |\nabla \ln \rho|^2 \rho \, dM = 4 \int_M |\nabla \sqrt{\rho}|^2 \, dM. \]
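The two expressions for the Fisher information agree because $\nabla \sqrt{\rho} = \tfrac12 \sqrt{\rho}\, \nabla \ln \rho$. As a quick sanity check, the following sketch evaluates both forms on $M = \mathbb{R}$ for a Gaussian density, for which $I(\rho) = 1/\sigma^2$. The Gaussian example, the finite-difference derivative and all helper names are assumptions made here for illustration.

```python
import math

def gaussian(sigma):
    """Return the density of a centred Gaussian with standard deviation sigma."""
    norm = sigma * math.sqrt(2 * math.pi)
    return lambda x: math.exp(-x * x / (2 * sigma ** 2)) / norm

def integrate(f, a=-12.0, b=12.0, n=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def fisher_log_form(rho):
    """I(rho) = int |d/dx ln rho|^2 rho dx."""
    log_rho = lambda y: math.log(rho(y))
    return integrate(lambda x: derivative(log_rho, x) ** 2 * rho(x))

def fisher_sqrt_form(rho):
    """I(rho) = 4 int |d/dx sqrt(rho)|^2 dx."""
    sqrt_rho = lambda y: math.sqrt(rho(y))
    return 4 * integrate(lambda x: derivative(sqrt_rho, x) ** 2)

for sigma in (0.5, 1.0):
    rho = gaussian(sigma)
    print(sigma, fisher_log_form(rho), fisher_sqrt_form(rho), 1 / sigma ** 2)
```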
For the Glauber coherent state transform, Carlen [Car] had proved that all classical densities on $\mathbb{R}^2$ arising through the coherent state transform had the same finite value of the Fisher information. He then used that together with the logarithmic Sobolev inequality (cf. [Gro1]) to give a new proof of Wehrl’s conjecture, and to show that the lower bound in (1.4) is attained only for Glauber coherent states. Bodmann proved an analogue of Carlen’s result for Fisher information, and used this, together with a sharp Sobolev inequality, instead of the sharp logarithmic Sobolev inequality, to obtain his Rényi information bounds.

In this paper, we investigate the analogue of the Lieb-Wehrl conjecture for $SU(1,1)$. The representations of $SU(1,1)$ belonging to a discrete series are labeled by a half-integer $k$, the relevant quantum number in this context. The classical phase space is $\mathbb{H}^2$, the hyperbolic plane (cf. [Per]). It is natural to conjecture that, here too, the coherent states generated by the least-weight vector of the representation provide a lower bound on the entropy. We prove that this is indeed asymptotically true, in the semi-classical limit. To obtain these results, we prove a number of theorems concerning analysis in $\mathbb{H}^2$ that are of independent interest. Specifically, we prove a new sharp Sobolev inequality, and a sharpened energy–entropy inequality in $\mathbb{H}^2$. The Sobolev inequality is
\[ \frac{4}{kq(kq-2)} \int |\nabla |f|^{q/2}|^2 \, d\nu + \|f\|_q^q \;\geq\; \frac{kq-1}{kq-2}\, \frac{2k-1}{kq-1} \left( \frac{kp-1}{2k-1} \right)^{q/p} \|f\|_p^q, \]
where $p = q + 1/k$, $q \geq 2$, $kq > 2$ and the measure $d\nu$ is a constant times the standard measure on $\mathbb{H}^2$, obtained from the Poincaré metric; we determine all of the cases of equality. To prove the sharpness of our Sobolev inequality we need to prove and use a uniqueness result for radial solutions of a semi-linear Poisson equation on the hyperbolic plane. The nature of this equation on $\mathbb{H}^2$ is substantially different from that of similar equations which have been investigated in the past. The methods developed here may well be useful for other uniqueness problems. We then prove the following Fisher information identity:
\[ \int |\nabla |L\psi|^{q/2}|^2 \, d\nu = \frac{1}{4}\, kq \int |L\psi|^q \, d\nu, \]
where $q$ is a positive number such that $kq > 2$.
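As a hedged illustration of how the Sobolev inequality behaves, the sketch below evaluates both sides for the radial family $f_a(\zeta) = (1 - |\zeta|^2)^a$ on the unit disk model. The family $f_a$ is chosen here for illustration and is not from the paper. The assumed conventions are the measure $d\nu = \frac{2k-1}{\pi}(1-|\zeta|^2)^{-2} d^2\zeta$ of Sect. 2, the gradient taken in the Poincaré metric $4|d\zeta|^2/(1-|\zeta|^2)^2$, and the constant in the form $\frac{kq-1}{kq-2}\,\frac{2k-1}{kq-1}\big(\frac{kp-1}{2k-1}\big)^{q/p}$. For $a = k$, the modulus of the coherent state transform of the least-weight vector, the two sides should coincide.

```python
import math

def midpoint(f, n=100000):
    """Midpoint rule for the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def norm_power(k, a, s):
    """int |f_a|^s dnu, using d^2 zeta = 2 pi r dr."""
    return 2 * (2 * k - 1) * midpoint(lambda r: (1 - r * r) ** (a * s - 2) * r)

def gradient_term(k, a, q):
    """int |grad |f_a|^(q/2)|^2 dnu.

    For radial g, |grad g|^2 = (1 - r^2)^2 g'(r)^2 / 4 in the Poincare metric,
    so the integral reduces to (2k - 1)/2 * int_0^1 g'(r)^2 r dr.
    """
    e = a * q / 2
    dg = lambda r: -2 * e * r * (1 - r * r) ** (e - 1)
    return 0.5 * (2 * k - 1) * midpoint(lambda r: dg(r) ** 2 * r)

def sobolev_sides(k, q, a):
    """Return (lhs, rhs) of the Sobolev inequality for f_a, with p = q + 1/k."""
    p = q + 1.0 / k
    lhs = 4.0 / (k * q * (k * q - 2)) * gradient_term(k, a, q) + norm_power(k, a, q)
    const = ((k * q - 1) / (k * q - 2) * (2 * k - 1) / (k * q - 1)
             * ((k * p - 1) / (2 * k - 1)) ** (q / p))
    rhs = const * norm_power(k, a, p) ** (q / p)
    return lhs, rhs

for a in (0.8, 1.0, 2.0):
    print(a, sobolev_sides(1.0, 3.0, a))
```

With $k = 1$ and $q = 3$, the value $a = 1$ is the equality case, while the other two exponents give strict inequality.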
The sharp Sobolev inequality and the Fisher information identity allow us to prove an $L^p$ norm estimate à la Bodmann. This norm estimate is used to deduce a lower bound for the Wehrl entropy of coherent state transforms via a convexity argument, and the result is:
\[ S(|L\psi(\zeta)|^2) \geq 2k \ln\left( 1 + \frac{1}{2k-1} \right). \]
It is seen that for high values of the quantum number $k$ (this gives us the semi-classical limit), this lower bound coincides with the analogue of the Lieb-Wehrl conjecture, up to but not including terms of first and higher order in $k^{-1}$.

The methods used to bound the entropy also serve to produce a new, sharpened entropy-energy inequality for functions on $\mathbb{H}^2$. An entropy-energy inequality is an inequality of the form
\[ -S(\rho) \leq \Phi_M(I(\rho)), \tag{1.6} \]
for some function $\Phi$. For a given Riemannian manifold $M$, the entropy–energy problem is to determine the least function $\Phi_M : \mathbb{R}^+ \to \mathbb{R}$ for which (1.6) is true. There has been much investigation of entropy-energy inequalities for various Riemannian manifolds (see [Bec1,Bec2,Heb,Rot] for example). Though there has been significant progress, many questions are still open. In the case of $\mathbb{H}^2$, Beckner proved [Bec2] that the entropy–energy inequality for $\mathbb{H}^2$ holds with the same $\Phi$ as in $\mathbb{R}^2$. That is, $\Phi_{\mathbb{H}^2}(t) \leq \Phi_{\mathbb{R}^2}(t)$, for all $t$. This result is asymptotically sharp in the sense that
\[ \lim_{t \to 0} \frac{\Phi_{\mathbb{H}^2}(t)}{\Phi_{\mathbb{R}^2}(t)} = 1; \]
however, $\Phi_{\mathbb{H}^2}(t) < \Phi_{\mathbb{R}^2}(t)$. We shall give a sharpened estimate on $\Phi_{\mathbb{H}^2}(t)$, which adds to Gross’s program of logarithmic Sobolev inequalities and improved hypercontractivity for complex geometry [Gro2].

The paper is organized as follows: in Sect. 2 we give a short description of a discrete representation of $SU(1,1)$ and define the associated coherent states and coherent state transform. Given any quantum state $\psi$, we denote its coherent state transform by $L\psi(\zeta)$, where the complex number $\zeta$ is used to label the coherent states. We show that these coherent state transforms are actually probability amplitudes on the hyperbolic plane. We also state the analogue of the Lieb-Wehrl conjecture in this setting. Section 3 contains the proof of the lower bound for the Wehrl entropy for $SU(1,1)$, and the results leading up to it. Here we prove a Fisher information identity for the coherent state transforms, and the sharp Sobolev inequality. The proof of the latter result uses the uniqueness result that is postponed to the final section. Section 4 contains the sharpened entropy–energy inequality for $\mathbb{H}^2$, and finally Sect. 5, the longest one, contains our uniqueness proof.

The problem of proving an analogue of the Lieb-Wehrl conjecture in the $SU(1,1)$ setting was suggested to me by my advisor, Prof. Eric Carlen. I am greatly indebted to him for introducing me to this beautiful problem and for the many valuable remarks and discussions without which this work would not have been possible. I would also like to thank the referee for helpful comments and suggestions.
2. Representation of the Group SU(1, 1) and the Construction of Coherent States

The group $SU(1,1)$ consists of unimodular $2 \times 2$ complex matrices which leave the Hermitian form $|z_1|^2 - |z_2|^2$ invariant:
\[ g = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}, \qquad |\alpha|^2 - |\beta|^2 = 1. \]
A particular representation of $SU(1,1)$ is labeled by a single number $k$ [Per]. For the discrete series this number takes on discrete half-integral values, $k = 1/2, 1, 3/2, \ldots$ (cf. [Bar,Per]). Let us call a particular representation $T^k(g)$. We consider a realization of $T^k(g)$ in the space $\mathcal{G}_k$ of functions $f(z)$ which are analytic inside the unit circle and have finite $L^2$-norm with respect to the invariant density $\frac{2k-1}{\pi}(1 - |z|^2)^{2k-2}\, d^2z$ [Bar], i.e.:
\[ \frac{2k-1}{\pi} \int_D |f(z)|^2 (1 - |z|^2)^{2k-2}\, d^2z < \infty, \qquad D = \{ z : |z| < 1 \}. \]
The group action on $\mathcal{G}_k$ in the multiplier representation $T^k(g)$ is given by [Bar]:
\[ T^k(g) f(z) = (\beta z + \bar\alpha)^{-2k} f(z_g), \qquad z_g = \frac{\alpha z + \bar\beta}{\beta z + \bar\alpha}. \]
The operators $T^k(g)$ with the group action defined as above furnish a unitary and irreducible representation of $SU(1,1)$ [Bar]. The generators of the group act as first order differential operators and the representation space is spanned by monomials in $z$. We denote these basis vectors by $|k, k+m\rangle$, where $m$ is a nonnegative integer (cf. [Per]). To construct the coherent states, let us choose the least-weight vector $|k, k\rangle$ in $\mathcal{G}_k$. The stationary subgroup for this state is the subgroup $H$ of diagonal matrices of the form
\[ h = \begin{pmatrix} e^{i\varphi/2} & 0 \\ 0 & e^{-i\varphi/2} \end{pmatrix}. \]
The factor space $G/H$ is realized as the unit disk $\{ \zeta : |\zeta| < 1 \}$, or equivalently, as the hyperbolic plane $\mathbb{H}^2 = \{ n : |n|^2 = n_0^2 - n_1^2 - n_2^2 = 1,\ n_0 > 0 \}$ (cf. [Per]). An element of $G/H$ determines a hyperbolic rotation and the corresponding operator is $T^k(g_n)$. The coherent states are expressed, in terms of the standard orthonormal basis vectors, as follows (cf. [Per]):
\[ T^k(g_n)|k, k\rangle = (1 - |\zeta|^2)^k \sum_{m=0}^{\infty} \left( \frac{\Gamma(m + 2k)}{m!\, \Gamma(2k)} \right)^{1/2} \zeta^m\, |k, k+m\rangle. \]
In what follows, we shall denote the coherent state corresponding to a particular $\zeta$ by $|\zeta\rangle$. If we now choose any arbitrary normalized vector $|\psi\rangle = \sum_{m=0}^{\infty} a_m |k, k+m\rangle$, then we can define its coherent state transform $L\psi(\zeta)$ via the following inner product:
\[ L\psi(\zeta) = \langle \psi | \zeta \rangle = (1 - |\zeta|^2)^k \sum_{m=0}^{\infty} \left( \frac{\Gamma(m + 2k)}{m!\, \Gamma(2k)} \right)^{1/2} \bar{a}_m \zeta^m. \tag{2.1} \]
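That the coherent states are unit vectors follows from the binomial series $\sum_m \frac{\Gamma(m+2k)}{m!\,\Gamma(2k)} x^m = (1-x)^{-2k}$ with $x = |\zeta|^2$. The short sketch below checks this numerically by truncating the series. The function name and the truncation length are illustrative choices, not part of the paper.

```python
import math

def coherent_norm_sq(k, zeta_abs, terms=4000):
    """Truncated series for <zeta|zeta> in the basis |k, k+m>."""
    x = zeta_abs ** 2
    total = 0.0
    for m in range(terms):
        # log of Gamma(m + 2k) / (m! Gamma(2k)), computed stably via lgamma
        log_c = math.lgamma(m + 2 * k) - math.lgamma(m + 1) - math.lgamma(2 * k)
        total += math.exp(log_c) * x ** m
    return (1 - x) ** (2 * k) * total

for k in (0.5, 1.0, 2.5):
    for zeta_abs in (0.0, 0.3, 0.9):
        print(k, zeta_abs, coherent_norm_sq(k, zeta_abs))
```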
Evidently $L\psi(\zeta)$ is a function on the unit disk and so the coherent state transform maps unit vectors in our representation space $\mathcal{G}_k$ into functions on the unit disk, which vanish at the boundary of the disk. This mapping becomes an isometry if we equip the unit disk with the $L^2$-metric corresponding to the measure
\[ d\nu(\zeta) = \frac{2k-1}{\pi}\, \frac{1}{(1 - |\zeta|^2)^2}\, d^2\zeta. \]
Note that $d\nu(\zeta)$ is just $\frac{2k-1}{4\pi}$ times the standard measure on the unit disk, that is, the measure $d\mu(\zeta) = \frac{4}{(1 - |\zeta|^2)^2}\, d^2\zeta$, obtained from the Poincaré metric on the disk. With inner product defined in the usual way with respect to the measure $d\nu(\zeta)$, the space of the coherent state transforms described above is a Hilbert space [Bar]. We call this space $\mathcal{F}_k$. The transform $L$ is thus an analogue of the Bargmann-Segal transform for the Glauber coherent states based on the Heisenberg group. Since $|\psi\rangle$ is a unit vector in our representation space $\mathcal{G}_k$, its coherent state transform $L\psi(\zeta)$ is a probability amplitude on the unit disk. Thus, $\mathcal{F}_k$ is a space of probability amplitudes on the unit disk. We can calculate the Wehrl entropy $S(|L\psi(\zeta)|^2)$ associated with the coherent state transform $L\psi(\zeta)$. If the unit vector $|\psi\rangle$ happens to be a coherent state itself, we find that:
\[ S(|L\psi(\zeta)|^2) = \frac{2k}{2k-1}. \]
The analogue of the Lieb-Wehrl conjecture for $SU(1,1)$ coherent states would then be:

Conjecture 2.1. For all $L\psi(\zeta) \in \mathcal{F}_k$, the Wehrl entropy is bounded below by:
\[ S(|L\psi(\zeta)|^2) \geq \frac{2k}{2k-1}. \tag{2.2} \]
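The value $2k/(2k-1)$ can be reproduced numerically for the least-weight state, whose density is $|L\psi(\zeta)|^2 = (1 - |\zeta|^2)^{2k}$, and compared with the lower bound $2k \ln\!\big(1 + \frac{1}{2k-1}\big)$ quoted in the Introduction. The following is a minimal sketch assuming the measure $d\nu$ defined above and a plain midpoint rule; the function names are hypothetical.

```python
import math

def midpoint(f, n=200000):
    """Midpoint rule for the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def coherent_state_entropy(k):
    """Wehrl entropy -int rho ln rho dnu for rho = (1 - |zeta|^2)^(2k)."""
    def integrand(r):
        w = 1.0 - r * r
        # rho ln rho = 2k w^(2k) ln w; dnu = (2k-1)/pi * w^(-2) * 2 pi r dr
        return -2 * k * math.log(w) * w ** (2 * k - 2) * 2 * (2 * k - 1) * r
    return midpoint(integrand)

def conjectured_minimum(k):
    """Right-hand side of (2.2)."""
    return 2 * k / (2 * k - 1)

def lower_bound(k):
    """Lower bound 2k ln(1 + 1/(2k - 1)) from the Introduction."""
    return 2 * k * math.log(1 + 1 / (2 * k - 1))

for k in (1.0, 1.5, 3.0, 50.0):
    print(k, coherent_state_entropy(k), conjectured_minimum(k), lower_bound(k))
```

The gap between the last two columns shrinks as $k$ grows, in line with the statement that the bound matches the conjecture to leading order in the semi-classical limit.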
3. The Entropy Bound and Related Results

In this section we first present a useful Fisher information identity for functions in $\mathcal{F}_k$, that relates the $q$-norm (for all positive $q$ such that $kq > 2$) of a function to the $L^2$-norm of the associated gradient. We then prove a sharp Sobolev inequality for functions in a larger function space $\mathcal{H}$, defined to be the space of bounded non-constant functions $f \in W^{1,2}(D)$ on the unit disk which vanish at the boundary; the norms here are computed with respect to the measure $d\nu(\zeta)$. Next, we prove a sharp norm estimate for functions in $\mathcal{F}_k$ (note that $\mathcal{F}_k$ is a subspace of $\mathcal{H}$) by converting the gradient norm of $|f|^{q/2}$ that appears in our sharp Sobolev inequality into the $L^q$-norm of the function $f$, via the Fisher information identity. This sharp norm estimate is then used to derive a lower bound on the entropy of functions in $\mathcal{F}_k$. The variational problem associated with our sharp Sobolev inequality in the function space $\mathcal{H}$ naturally leads us to an Euler-Lagrange equation which is actually a semi-linear Poisson equation on the unit disk. We reduce the Euler-Lagrange equation to an ordinary differential equation by using radially symmetric decreasing rearrangements of functions. To prove the sharpness of the Sobolev inequality we need to prove that the ground state solution, that is to say, the solution that decays to zero at the boundary of the disk, is unique. Since the proof is somewhat involved, we present a detailed analysis of the Euler-Lagrange equation and relevant results in Sect. 5.

3.1. A Fisher information identity. The Fisher information of a probability density function is a measure of its concentration. In this subsection we prove a Fisher information identity for functions in $\mathcal{F}_k$.

Theorem 3.1. For $L\psi(\zeta)$ in $\mathcal{F}_k$ the following identity holds:
\[ \int |\nabla |L\psi(\zeta)|^{q/2}|^2 \, d\nu(\zeta) = \frac{1}{4}\, kq \int |L\psi(\zeta)|^q \, d\nu(\zeta), \]
where $q$ is a positive number such that $kq > 2$.
Lower Bound for Wehrl Entrophy in SU (1, 1) and the Lieb-Wehrl Conjecture
1071
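Proof. Using the expression (2.1) for the coherent state transforms in $F_k$, we can write:
\[ |L\psi(\zeta)|^{q/2} = (1-|\zeta|^2)^{kq/2}\left|\sum_{m=0}^{\infty} \bar a_m \left(\frac{\Gamma(m+2k)}{m!\,\Gamma(2k)}\right)^{1/2} \zeta^m\right|^{q/2} = (1-|\zeta|^2)^{kq/2}\,|\Phi(\zeta)|^{q/2}, \]
where $\Phi(\zeta)$ is holomorphic in $\zeta$. Thus $\Phi(\zeta)$ satisfies the Cauchy-Riemann equations on the unit disk/hyperbolic plane. Let us do our computations in terms of the radial variable $\tau$ and the angular variable $\phi$ on the two-dimensional hyperbolic plane. The gradient is then given by:
\[ \nabla = \left(\frac{\partial}{\partial\tau},\ \frac{1}{\sinh\tau}\frac{\partial}{\partial\phi}\right). \]
A brief computation yields the following Cauchy-Riemann equations for an analytic function $\Phi = u + iv$ on the hyperbolic plane:
\[ \frac{\partial u}{\partial\tau} = \frac{1}{\sinh\tau}\frac{\partial v}{\partial\phi}, \qquad \frac{\partial u}{\partial\phi} = -\sinh\tau\,\frac{\partial v}{\partial\tau}. \]
Using these two equations we obtain the following:
\[ \nabla u\cdot\nabla v = \frac{\partial u}{\partial\tau}\frac{\partial v}{\partial\tau} + \frac{1}{\sinh^2\tau}\frac{\partial u}{\partial\phi}\frac{\partial v}{\partial\phi} = 0, \qquad |\nabla u|^2 = |\nabla v|^2. \]
We now compute some results for the non-holomorphic pre-factor $(1-|\zeta|^2)^{kq/2}$ in the expression for the coherent state transforms:
\[ \nabla (1-|\zeta|^2)^{kq/2} = \left(\frac{\partial}{\partial\tau},\ \frac{1}{\sinh\tau}\frac{\partial}{\partial\phi}\right)\left(1-\tanh^2\frac{\tau}{2}\right)^{kq/2} = \left(-\frac{kq}{2}\tanh\frac{\tau}{2}\,\mathrm{sech}^{kq}\frac{\tau}{2},\ 0\right), \]
and,
\[ \Delta (1-|\zeta|^2)^{kq/2} = \left(\frac{\partial^2}{\partial\tau^2} + \coth\tau\,\frac{\partial}{\partial\tau}\right)\left(1-\tanh^2\frac{\tau}{2}\right)^{kq/2} = \left(\frac{kq}{2}\right)^2\tanh^2\frac{\tau}{2}\,\mathrm{sech}^{kq}\frac{\tau}{2} - \frac{kq}{2}\,\mathrm{sech}^{kq}\frac{\tau}{2}. \]
As for $|\Phi|^{q/2}$, the Cauchy-Riemann equations for $\Phi$ guarantee that: $\Delta|\Phi|^{q} = 4\,|\nabla|\Phi|^{q/2}|^2$. Thus:
\[
\begin{aligned}
|\nabla|L\psi(\zeta)|^{q/2}|^2 &= (1-|\zeta|^2)^{kq}\,|\nabla|\Phi|^{q/2}|^2 + |\nabla(1-|\zeta|^2)^{kq/2}|^2\,|\Phi|^{q} + 2(1-|\zeta|^2)^{kq/2}\,\nabla(1-|\zeta|^2)^{kq/2}\cdot|\Phi|^{q/2}\,\nabla|\Phi|^{q/2} \\
&= (1-|\zeta|^2)^{kq}\,|\nabla|\Phi|^{q/2}|^2 + \frac{1}{4}|\Phi|^{q}(1-|\zeta|^2)^{-kq}\,|\nabla(1-|\zeta|^2)^{kq}|^2 + \frac{1}{2}\nabla(1-|\zeta|^2)^{kq}\cdot\nabla|\Phi|^{q} \\
&= (1-|\zeta|^2)^{kq}\,|\nabla|\Phi|^{q/2}|^2 + \frac{1}{4}|\Phi|^{q}\left[\Delta(1-|\zeta|^2)^{kq} + kq\,(1-|\zeta|^2)^{kq}\right] \\
&\quad + \frac{1}{2}\left[\nabla\cdot\left((1-|\zeta|^2)^{kq}\,\nabla|\Phi|^{q}\right) - (1-|\zeta|^2)^{kq}\,\Delta|\Phi|^{q}\right].
\end{aligned}
\]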
1072
J. Bandyopadhyay
We notice that the divergence term, when integrated with respect to the invariant measure $d\nu(\zeta)$, yields a vanishing surface integral for $kq > 2$, by a limiting argument as in the proof of [Car, Theorem 1]. Also,
\[ \frac{1}{4}\int |\Phi|^{q}\,\Delta(1-|\zeta|^2)^{kq}\, d\nu(\zeta) = \frac{1}{4}\int (1-|\zeta|^2)^{kq}\,\Delta|\Phi|^{q}\, d\nu(\zeta). \]
Putting these all together we finally arrive at:
\[ \int |\nabla|L\psi(\zeta)|^{q/2}|^2\, d\nu(\zeta) = \int (1-|\zeta|^2)^{kq}\left[|\nabla|\Phi|^{q/2}|^2 - \frac{1}{4}\Delta|\Phi|^{q}\right] d\nu(\zeta) + \frac{1}{4}\,kq\int |\Phi|^{q}(1-|\zeta|^2)^{kq}\, d\nu(\zeta). \]
The first term on the right hand side of the equation above vanishes due to analyticity of $\Phi$, as we have already shown, yielding the following identity:
\[ \int |\nabla|L\psi(\zeta)|^{q/2}|^2\, d\nu(\zeta) = \frac{1}{4}\,kq\int |L\psi(\zeta)|^{q}\, d\nu(\zeta). \]
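As a sanity check on the identity just proved, the following sketch evaluates both sides numerically for the simplest element of $F_k$, the coherent state centred at the origin, $|L\psi(\zeta)| = (1-|\zeta|^2)^k$. It assumes the normalization $d\nu = \frac{2k-1}{4\pi}\,d\mu$ quoted in Sect. 4 and uses the substitution $t = 1-|\zeta|^2$, under which a radial integral against $d\nu$ becomes $(2k-1)\int_0^1 g(t)\,t^{-2}\,dt$. The function names are illustrative and not part of the paper.

```python
def integrate(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def fisher_sides(k, q):
    """Both sides of the Fisher identity for f = (1 - |zeta|^2)^k.

    Assumption: t = 1 - |zeta|^2, so that for radial g the integral of g
    against dnu equals (2k - 1) * int_0^1 g(t) t^(-2) dt.
    In this variable |grad f^(q/2)|^2 = (N^2 / 4) (1 - t) t^N with N = kq.
    """
    N = k * q
    c = 2 * k - 1
    lhs = c * integrate(lambda t: (N * N / 4) * (1 - t) * t ** (N - 2), 0.0, 1.0)
    rhs = (N / 4) * c * integrate(lambda t: t ** (N - 2), 0.0, 1.0)
    return lhs, rhs


if __name__ == "__main__":
    for k, q in [(2, 2), (1.5, 3), (3, 2.5)]:
        print(k, q, fisher_sides(k, q))
```

Both entries of each printed pair should agree to quadrature accuracy whenever $kq > 2$. This checks one state only and is not a substitute for the proof.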
3.2. A sharp Sobolev inequality and a norm estimate. We now prove a sharp Sobolev inequality for functions in $H$.

Theorem 3.2. For all functions in $H$ the following inequality holds:
\[ \|f\|_q^q + \frac{4}{kq(kq-2)}\int |\nabla|f|^{q/2}|^2\, d\nu(\zeta) \ \ge\ \frac{kq-1}{kq-2}\,\frac{2k-1}{kq-1}\left(\frac{kp-1}{2k-1}\right)^{q/p}\|f\|_p^q, \tag{3.1} \]
where $p = q + 1/k$, $q \ge 2$, $kq > 2$ and the norms are computed with respect to the measure $d\nu(\zeta)$; equality is obtained if and only if the function f comes from a coherent state, i.e., $|f(z)| = A\,|1-\zeta z|^{-2k}(1-|\zeta|^2)^k$, for some $\zeta$ in the open disk and some constant A.

Proof. Proving Theorem 3.2 is equivalent to showing that the infimum of the functional
\[ I[f] = \frac{\|f\|_q^q + \frac{4}{kq(kq-2)}\int |\nabla|f|^{q/2}|^2\, d\nu(\zeta)}{\|f\|_p^q} \]
is
\[ \frac{kq-1}{kq-2}\,\frac{2k-1}{kq-1}\left(\frac{kp-1}{2k-1}\right)^{q/p}. \]
Since we are in the function space $H$, the existence of the infimum is obvious. Let us take a minimizing sequence $\{f_n\}$. We can now perform a radially symmetric decreasing rearrangement [Bae], since the gradient norm can only decrease under such a rearrangement while the other norms in the functional stay constant. So each function in the minimizing sequence is replaced by its decreasing rearrangement. Functions in the new sequence $\{f_n^*\}$ thus obtained also have bounded norms and gradient norms. The sequence being monotone and bounded, we can use Helly's principle to obtain a convergent subsequence. Since the functions are in $W^{1,2}$,
the convergence is in the s-norm, for all finite s, by the Rellich-Kondrashov theorem. We thus need to show that in a class of radially symmetric solutions the minimizer is unique. The minimizer satisfies the following Euler-Lagrange equation for our optimization problem:
\[ \Delta u + kq(kq-2)\left[\gamma\,u^{1+\frac{2}{kq}} - u\right] = 0, \tag{3.2} \]
where $u = |f|^{q/2}$, $\Delta$ is the Laplacian on the hyperbolic plane (or, equivalently, the unit disk) and $\gamma > 0$ is fixed by choosing the p-norm of the function f. It is readily seen that this Euler-Lagrange equation is solved by the coherent state: $f = A(1-|\zeta|^2)^k$, where A is a constant determined by fixing the p-norm. Since we are dealing with radial functions only, (3.2) is equivalent to an ordinary differential equation. We now refer to Sect. 5, where we prove in detail that there is only one solution of this ODE, in the space of radially symmetric functions on the unit disk, which decays to zero at the boundary of the disk (or, equivalently, decays to zero as the radial coordinate on the hyperbolic plane tends to infinity). On the basis of this uniqueness result we can conclude that the coherent state $f = A(1-|\zeta|^2)^k$ is indeed the unique solution and hence furnishes the infimum.
This sharp Sobolev inequality, coupled with our Fisher information identity, trivially yields the following corollary:

Corollary 3.3. For all functions in $F_k$ the following inequality holds:
\[ \|f\|_q^q \ \ge\ \frac{2k-1}{kq-1}\left(\frac{kp-1}{2k-1}\right)^{q/p}\|f\|_p^q, \tag{3.3} \]
where $q \ge 2$; equality is obtained if and only if the function f is a coherent state.

Proof. The Fisher information identity for functions in $F_k$ tells us:
\[ \int |\nabla|f|^{q/2}|^2\, d\nu(\zeta) = \frac{1}{4}\,kq\int |f|^{q}\, d\nu(\zeta). \]
We can thus rewrite the left hand side of (3.1) as:
\[ \|f\|_q^q + \frac{4}{kq(kq-2)}\int |\nabla|f|^{q/2}|^2\, d\nu(\zeta) = \frac{kq-1}{kq-2}\,\|f\|_q^q. \]
So now our sharp Sobolev inequality yields the following norm estimate for functions in $F_k$:
\[ \|f\|_q^q \ \ge\ \frac{2k-1}{kq-1}\left(\frac{kp-1}{2k-1}\right)^{q/p}\|f\|_p^q. \]
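The equality statement in Corollary 3.3 can be illustrated with closed-form norms. The sketch below is a hedged check, not the paper's own computation: it assumes $d\nu = \frac{2k-1}{4\pi}d\mu$, which gives $\int (1-|\zeta|^2)^{ks}\,d\nu = \frac{2k-1}{ks-1}$ for the coherent state at the origin, and it assumes the Gamma-function weights of the expansion (2.1), which make $\sqrt{2k}\,|\zeta|\,(1-|\zeta|^2)^k$ the modulus of a normalized non-coherent element of $F_k$ (only $a_1$ nonzero). All function names are illustrative.

```python
import math


def coherent_norm_pow(k, s):
    """int |f|^s dnu for f = (1 - |zeta|^2)^k; equals (2k-1)/(ks-1) for ks > 1."""
    return (2 * k - 1) / (k * s - 1)


def excited_norm_pow(k, s):
    """int |f|^s dnu for f = sqrt(2k) |zeta| (1 - |zeta|^2)^k.

    With t = 1 - |zeta|^2 this is (2k-1) (2k)^(s/2) B(s/2 + 1, ks - 1),
    where B is the Euler beta function.
    """
    beta = math.gamma(s / 2 + 1) * math.gamma(k * s - 1) / math.gamma(s / 2 + k * s)
    return (2 * k - 1) * (2 * k) ** (s / 2) * beta


def corollary_sides(norm_pow, k, q):
    """Return (lhs, rhs) of inequality (3.3) for a state with the given norms."""
    p = q + 1 / k
    const = (2 * k - 1) / (k * q - 1) * ((k * p - 1) / (2 * k - 1)) ** (q / p)
    return norm_pow(k, q), const * norm_pow(k, p) ** (q / p)


if __name__ == "__main__":
    print("coherent:", corollary_sides(coherent_norm_pow, 2, 2))
    print("excited: ", corollary_sides(excited_norm_pow, 2, 2))
```

For the coherent state the two sides coincide, while for the second state the left side is strictly larger, in line with the equality condition.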
3.3. A lower bound for the Wehrl entropy of functions in $F_k$. We now derive a lower bound for the entropy of functions in $F_k$.

Theorem 3.4. The Wehrl entropy associated with $L\psi(\zeta) \in F_k$ has a lower bound given by:
\[ S(|L\psi(\zeta)|^2) \ \ge\ 2k\ln\left(1 + \frac{1}{2k-1}\right). \tag{3.4} \]

Proof. Let us define, for any function f, $\varphi(p) = \ln\|f\|_p^p = \ln\int |f|^p\, d\nu(\zeta)$. Then, we have:
\[ S(|f|^2) = -2\int |f|^2\ln|f|\, d\nu(\zeta) = -2\varphi'(2), \]
if $\|f\|_2 = 1$. By logarithmic convexity of the p-norm:
\[ -2\varphi'(2) \ \ge\ -2k\,\varphi\!\left(2 + \frac{1}{k}\right). \]
If we now set $q = 2$, $p = 2 + \frac{1}{k}$ in Corollary 3.3, we have:
\[ \|L\psi(\zeta)\|_{2+\frac{1}{k}}^{2+\frac{1}{k}} \ \le\ \frac{2k-1}{2k}, \]
since $\|L\psi(\zeta)\|_2^2 = 1$, by definition. This implies, in $F_k$:
\[ \varphi\!\left(2 + \frac{1}{k}\right) \ \le\ \ln\frac{2k-1}{2k}. \]
Thus:
\[ -2\varphi'(2) \ \ge\ -2k\,\varphi\!\left(2 + \frac{1}{k}\right) \ \ge\ 2k\ln\frac{2k}{2k-1}, \]
or,
\[ S(|L\psi(\zeta)|^2) \ \ge\ 2k\ln\left(1 + \frac{1}{2k-1}\right). \]
A comparison between (2.2) and (3.4) shows that the estimate obtained above has the conjectured high-spin asymptotics up to, but not including, first and higher order terms in $k^{-1}$, because
\[ 2k\ln\left(1 + \frac{1}{2k-1}\right) = 2k\left[\frac{1}{2k-1} - \frac{1}{2}\,\frac{1}{(2k-1)^2} + \cdots\right]. \]
In fact this is completely analogous to the lower bound Bodmann [Bod] obtained for coherent state transforms on the sphere $S^2$.
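To see the size of the gap between the conjectured minimum (2.2) and the proven bound (3.4), the sketch below computes the Wehrl entropy of the coherent state $(1-|\zeta|^2)^k$ by quadrature and compares it with both expressions. It assumes $d\nu = \frac{2k-1}{4\pi}d\mu$ and the substitution $t = 1-|\zeta|^2$; the function names are illustrative, and the quadrature is only meant for $k > 1$.

```python
import math


def coherent_entropy(k, n=20000):
    """Wehrl entropy -int |f|^2 ln|f|^2 dnu of f = (1 - |zeta|^2)^k (k > 1).

    In the variable t = 1 - |zeta|^2 this is
    -2k (2k - 1) int_0^1 t^(2k-2) ln t dt, evaluated with Simpson's rule.
    """
    def g(t):
        # t^(2k-2) ln t tends to 0 as t -> 0 when k > 1
        return 0.0 if t == 0.0 else t ** (2 * k - 2) * math.log(t)

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return -2 * k * (2 * k - 1) * total * h / 3


def conjectured_minimum(k):
    """Right-hand side of (2.2)."""
    return 2 * k / (2 * k - 1)


def proven_bound(k):
    """Right-hand side of (3.4)."""
    return 2 * k * math.log1p(1 / (2 * k - 1))


if __name__ == "__main__":
    for k in (2, 5, 50):
        print(k, coherent_entropy(k), conjectured_minimum(k), proven_bound(k))
```

The quadrature reproduces $2k/(2k-1)$ for the coherent state, and the difference between the two closed-form expressions decays like $1/(4k)$, which is the first-order discrepancy mentioned in the text.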
4. Entropy-Energy Inequalities on the Hyperbolic Plane $\mathbb{H}^2$

We say a Riemannian manifold M with measure dM admits a logarithmic Sobolev inequality with constant C if:
\[ \int_M |f|^2\ln|f|^2\, dM \ \le\ C\int_M |\nabla f|^2\, dM \quad \text{for all } f \text{ such that } \int_M |f|^2\, dM = 1. \tag{4.1} \]
Since the Fisher information associated with a function is often regarded as an "energy", one can say that logarithmic Sobolev inequalities give a bound on the entropy of a function f in terms of its energy
\[ E(f) = \int_M |\nabla|f||^2\, dM. \]
Even if C is the best possible constant in (4.1), this is only one of a whole family of sharp inequalities, and in many applications use of the whole family leads to more incisive results. To obtain this family of inequalities, one must determine, for each $A > 0$, the least value of B for which
\[ \int_M |f|^2\ln|f|^2\, dM \ \le\ A\int_M |\nabla f|^2\, dM + B \quad \text{for all } f \text{ such that } \int_M |f|^2\, dM = 1 \tag{4.2} \]
is true. Call this optimal choice $B(A)$. If one then defines an increasing concave function $\Phi$ through
\[ \Phi(t) = \inf_{A>0}\,\{At + B(A)\}, \]
one has
\[ \int_M |f|^2\ln|f|^2\, dM \ \le\ \Phi(E(f)) \]
for all f with $\int_M |f|^2\, dM = 1$.
Conversely, given the optimal function $\Phi(t)$, $B(A)$ can be recovered: it is just the y-intercept of the tangent line to $y = \Phi(t)$ at the value of t for which $\Phi'(t) = A$. Thus, determining an optimal entropy-energy inequality is essentially equivalent to solving an "AB" type problem in the sense of Hebey [Heb]: Obviously, if (4.2) holds for some A (that is, if, given some A, one can find a constant B such that (4.2) is valid), then it holds for all $A' \ge A$. Similarly, if (4.2) is valid for some B, it remains valid for all $B' \ge B$. Thus, it is natural to ask: what is the smallest constant A (or B) for which one can find a constant B (respectively, A) such that inequality (4.2) holds? In fact, these questions arise naturally whenever one has a Sobolev-type inequality on a Riemannian manifold [Heb]. The smallest A for which (4.2) holds is called the first best constant while the smallest such B is called the second best constant with respect to the inequality (4.2). Given any Sobolev-type inequality on some Riemannian manifold, Hebey associated two parallel research programs with the notion of best constants. The A-part of the program gives priority to the first best constant while the B-part is concerned with the second best constant. On $\mathbb{R}^2$, the optimal entropy–energy function $\Phi_{\mathbb{R}^2}(t)$ is given by
\[ \Phi_{\mathbb{R}^2}(t) = \ln\left(\frac{1}{\pi e}\,t\right). \]
Thus:
\[ \int_{\mathbb{R}^2} |f|^2\ln|f|^2 \ \le\ \ln\left(\frac{1}{\pi e}\,E(f)\right). \]
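The passage between the "AB" form (4.2) and the entropy–energy function can be made concrete for $\mathbb{R}^2$. In the sketch below the intercept $B(A) = -\ln A - \ln\pi - 2$ is not quoted from the paper: it is what the tangent-line recipe gives for $\ln(t/(\pi e))$ at the point $t = 1/A$. The code then checks numerically that the infimum of $At + B(A)$ over a grid of slopes A reproduces $\ln(t/(\pi e))$. All names are illustrative.

```python
import math


def phi_R2(t):
    """Optimal entropy-energy function on R^2: ln(t / (pi e))."""
    return math.log(t / (math.pi * math.e))


def B_R2(A):
    """Intercept of the tangent line to phi_R2 with slope A (tangency at t = 1/A)."""
    return -math.log(A) - math.log(math.pi) - 2.0


def phi_from_AB(t, steps=8000):
    """Approximate inf over A > 0 of A t + B(A) on a logarithmic grid of slopes."""
    best = float("inf")
    for i in range(steps + 1):
        A = 10.0 ** (-4.0 + 8.0 * i / steps)  # slopes from 1e-4 to 1e4
        best = min(best, A * t + B_R2(A))
    return best


if __name__ == "__main__":
    for t in (0.5, 3.0, 40.0):
        print(t, phi_R2(t), phi_from_AB(t))
```

Each tangent line lies above the concave function, so the grid minimum can only overshoot, and it does so by an amount controlled by the grid spacing.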
Equality is achieved when f is an isotropic Gaussian function. For an appropriate choice of the variance of the Gaussian, the energy E(f) can take any value, so this inequality is sharp for all values of E(f). In the case of $\mathbb{H}^2$, Beckner proved [Bec2] that the entropy has the same bound as in $\mathbb{R}^2$, i.e.,
\[ \int_{\mathbb{H}^2} |f|^2\ln|f|^2 \ \le\ \ln\left(\frac{1}{\pi e}\,E(f)\right). \]
In other words, $\Phi_{\mathbb{H}^2} \le \Phi_{\mathbb{R}^2}$. This result is asymptotically sharp for small t as explained in the introduction. However, the inequality is actually strict, and significantly so, for large t. Here we prove an improved bound: For $t > 0$, define $\Psi(t)$ by
\[ \Psi(t) = \inf_{k\in\mathbb{N}}\ \frac{1}{2}\ln\left[\left(\frac{2k-2}{2k-1}\right)^{2k+1}\left(\frac{2k-1}{2k}\right)^{2k}\frac{2k-1}{4\pi}\left(1 + \frac{1}{k(k-1)}\,t\right)^{2k+1}\right]. \]
Notice that this is an infimum over a family of increasing, concave functions. As such, it is increasing and concave. While we cannot explicitly evaluate the infimum that defines $\Psi(t)$, we have the following result:

Theorem 4.1. For all $t > 0$, $\Phi_{\mathbb{H}^2} \le \Psi(t) < \Phi_{\mathbb{R}^2}$.

Proof. We start from the sharp Sobolev inequality proved in Theorem 3.2, re-written in terms of the standard measure derived from the Poincaré metric. Recall that the measures $d\mu$ and $d\nu$ are related via: $d\nu = \frac{2k-1}{4\pi}\,d\mu$. If we rescale f in inequality (3.1) so as to make it $L^2$-normalized in the measure $d\mu$ and rewrite the inequality with respect to $d\mu$, we get:
\[ \int f^p\, d\mu \ \le\ \left(\frac{kq-2}{kq-1}\right)^{p/q}\frac{2k-1}{kp-1}\left(\frac{kq-1}{2k-1}\right)^{p/q}\left(\frac{2k-1}{4\pi}\right)^{(p-q)/q}\left[\int f^q\, d\mu + \frac{4}{kq(kq-2)}\int |\nabla f^{q/2}|^2\, d\mu\right]^{p/q}. \]
Putting $q = 2$, $p = 2 + 1/k$, and using the logarithmic convexity of the p-norm as in the proof of Theorem 3.4, we obtain the following estimate:
\[ \int f^2\ln f\, d\mu \ \le\ \frac{1}{2}\ln\left[\left(\frac{2k-2}{2k-1}\right)^{2k+1}\left(\frac{2k-1}{2k}\right)^{2k}\frac{2k-1}{4\pi}\left(1 + \frac{1}{k(k-1)}\int |\nabla f|^2\, d\mu\right)^{2k+1}\right]. \tag{4.3} \]
Since this holds for every k, we get an entropy–energy inequality by taking the infimum over k, and this amounts to the inequality $\Phi_{\mathbb{H}^2} \le \Psi(t)$. It remains to show that $\Psi(t) < \Phi_{\mathbb{R}^2}$. We shall do this using the equivalent A–B form of the inequality. To make the tangent line computation and subsequent comparison with $\Phi_{\mathbb{R}^2}$, and hence Beckner's estimate, we note that (4.3) implies:
\[ \int f^2\ln f^2\, d\mu \ \le\ 2k\ln\frac{k-1}{k} + \ln\frac{k-1}{2\pi} + \frac{2k+1}{k(k-1)}\int |\nabla f|^2\, d\mu. \tag{4.4} \]
Now Beckner's inequality [Bec2] on the upper half plane is:
\[ \int |f|^2\ln|f|\, d\mu \ \le\ \frac{1}{2}\ln\left(\frac{1}{\pi e}\int |\nabla|f||^2\, d\mu\right). \tag{4.5} \]
Since the logarithm is a concave function of its argument, $\frac{\ln x - \ln x_0}{x - x_0} < \frac{1}{x_0}$, where $x > x_0$. If we put $x = \int |\nabla f|^2\, d\mu$ in (4.5), we obtain the following inequality:
\[ \int f^2\ln f^2\, d\mu \ \le\ \frac{1}{x_0}\int |\nabla f|^2\, d\mu + \ln x_0 - \ln\pi - 2. \tag{4.6} \]
Inequalities (4.4) and (4.6) have the form $\int f^2\ln f^2\, d\mu \le C + \Lambda\int |\nabla f|^2\, d\mu$. We would like to see how the values for the intercept C compare for a given value of the slope $\Lambda$. Let $C_{x_0}$ and $C_k$ denote the intercepts for the inequalities parametrized by $x_0$ and k respectively. Now, to make the comparison let us put $\Lambda = \frac{1}{x_0} = \frac{2k+1}{k(k-1)}$. Then, for this value of $x_0$ we have $\ln x_0 = \ln(k-1) - \ln 2 - \ln\left(1 + \frac{1}{2k}\right)$, so that:
\[ C_{x_0} = \ln x_0 - \ln\pi - 2 = -\left[\frac{1}{2k} - \frac{1}{2}\left(\frac{1}{2k}\right)^2 + \cdots\right] + \ln(k-1) - \ln 2\pi - 2. \]
On the other hand:
\[ C_k = 2k\ln\frac{k-1}{k} + \ln\frac{k-1}{2\pi} = \ln(k-1) - \ln 2\pi - 2 - \frac{1}{k} - \frac{2}{3}\left(\frac{1}{k}\right)^2 - \frac{1}{2}\left(\frac{1}{k}\right)^3 - \cdots. \]
Thus, for $x_0 = \frac{k(k-1)}{2k+1}$, we have: $C_{x_0} - C_k = \frac{1}{2k} + \frac{19}{24}\,\frac{1}{k^2} + \cdots$, which is positive. This means that the logarithmic Sobolev inequality (4.4) actually gives an improvement on Beckner's inequality (4.6) as regards the second best constant and $\Psi(t) < \Phi_{\mathbb{R}^2}$. Another way to see the extent to which $\Psi$ is a better estimate of $\Phi_{\mathbb{H}^2}$ than $\Phi_{\mathbb{R}^2}$ is to use them both to estimate the entropy of our coherent state transforms, since for these,
\[ E(f) = \frac{k}{2} > \frac{k(k-1)}{2k+1}. \]
Inserting the value $E(f) = \frac{k}{2}$ into $\Phi_{\mathbb{R}^2}$ we obtain, using Beckner's estimate with respect to the measure $d\nu(\zeta)$:
\[ -\int |f|^2\ln|f|^2\, d\nu \ \ge\ 1 - \ln\frac{2k}{2k-1}, \]
while inserting this value into $\Psi$ (with respect to measure $d\nu(\zeta)$) yields the marginally better bound (3.4).
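The comparison carried out in this section can be tabulated directly from the closed-form expressions in the text. The sketch below evaluates the Beckner-type bound $1 - \ln\frac{2k}{2k-1}$, the bound (3.4) and the coherent-state value (2.2), and also the two intercepts of (4.4) and (4.6) at the common slope $\frac{2k+1}{k(k-1)}$. It only restates formulas that appear above; the function names are illustrative.

```python
import math


def beckner_bound(k):
    """Entropy lower bound from Beckner's estimate with E(f) = k/2."""
    return 1.0 - math.log(2 * k / (2 * k - 1))


def improved_bound(k):
    """Entropy lower bound (3.4)."""
    return 2 * k * math.log1p(1 / (2 * k - 1))


def coherent_value(k):
    """Entropy of a coherent state, the conjectured minimum (2.2)."""
    return 2 * k / (2 * k - 1)


def intercept_gap(k):
    """C_{x0} - C_k at the common slope 1/x0 = (2k+1)/(k(k-1)), for k > 1."""
    x0 = k * (k - 1) / (2 * k + 1)
    c_x0 = math.log(x0) - math.log(math.pi) - 2.0
    c_k = 2 * k * math.log((k - 1) / k) + math.log((k - 1) / (2 * math.pi))
    return c_x0 - c_k


if __name__ == "__main__":
    for k in (2, 5, 50):
        print(k, beckner_bound(k), improved_bound(k), coherent_value(k), intercept_gap(k))
```

For every k tried the three entropy values are ordered as the text describes, and the intercept gap is positive with leading behaviour $1/(2k)$.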
5. The Uniqueness Theorem

In this section we study (3.2) written in terms of the radial hyperbolic coordinate. Similar equations in $\mathbb{R}^n$ have been investigated in the past (cf. [Pel1,Pel2,McLe] and [Kwo]). Our case is significantly different and here we adapt the methods described in [Kwo] to the hyperbolic setting. We investigate the question of uniqueness of the ground state solution of the equation
\[ u'' + \coth\tau\, u' + f(u) = 0, \tag{5.1} \]
where $\tau \in (0,\infty)$ on the two-dimensional hyperbolic plane. The function $f(u)$ is given by: $f(u) = \tilde a\,u^{1+\frac{2}{kq}} - \tilde b\,u$, where $\tilde b = kq(kq-2)$ and $\tilde a = \gamma\,kq(kq-2)$. The boundary conditions on the solutions of interest are: $\lim_{\tau\to\infty} u(\tau) = 0$ and $u'(0) = 0$. There exist three points $\xi_0$, $\xi_1$ and $\xi_2$ in $(0,\infty)$ such that:
\[ \int_{u=0}^{\xi_0} f(u)\,du = 0; \qquad \int_{u=0}^{v} f(u)\,du < 0 \ \text{for } v < \xi_0 \quad\text{and}\quad \int_{u=0}^{v} f(u)\,du > 0 \ \text{for } v > \xi_0, \]
\[ f(\xi_1) = 0; \qquad f(u) < 0 \ \text{if } u < \xi_1 \quad\text{and}\quad f(u) > 0 \ \text{if } u > \xi_1, \]
\[ f'(\xi_2) = 0; \qquad f'(u) < 0 \ \text{if } u < \xi_2 \quad\text{and}\quad f'(u) > 0 \ \text{if } u > \xi_2. \]
Following [McLe] and [Kwo], let us consider u as a function of the initial value $\alpha$ and $\tau$, and study, instead of the boundary value problem mentioned above, the following initial value problem:
\[ u'' + \coth\tau\, u' + f(u) = 0, \qquad u(0) = \alpha > 0, \qquad u'(0) = 0. \tag{5.2} \]
We first divide the set of solutions into three mutually disjoint subsets, namely:
1. Solutions that have a zero at some finite $\tau$. We call the corresponding set of initial values N. We denote the finite zero as $b(\alpha)$.
2. Positive solutions that satisfy $\lim_{\tau\to\infty} u(\tau) = 0$. We call the set of initial values G in this case.
3. Solutions that remain positive and do not belong to case 2. We let P denote the set of initial values for such solutions.
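The trichotomy above can be explored numerically by shooting on the initial value problem (5.2). The sketch below is an illustration under stated assumptions, not the method of the paper: it integrates (5.2) with a classical fourth-order Runge-Kutta step, starts slightly away from $\tau = 0$ using the expansion $u \approx \alpha - f(\alpha)\tau^2/4$ (obtained by inserting a quadratic ansatz into the equation), and labels an initial value "N" if u reaches zero and "P" if u' becomes positive while u is still positive, which rules out G and N because such solutions are monotone. The ground state in G sits at the boundary between the two labels and is never hit exactly. The parameter values $kq = 4$, $\gamma = 1$ and all names are hypothetical choices.

```python
import math


def make_f(n, gamma):
    """Nonlinearity f(u) = a u^(1 + 2/n) - b u with b = n(n-2), a = gamma b, n = kq."""
    b = n * (n - 2)
    a = gamma * b

    def f(u):
        u = max(u, 0.0)  # guard: a fractional power of a negative float is complex
        return a * u ** (1 + 2 / n) - b * u

    return f


def shoot(alpha, n=4.0, gamma=1.0, h=1e-3, tau_max=50.0):
    """Integrate (5.2) from u(0) = alpha and classify the initial value."""
    f = make_f(n, gamma)

    def rhs(tau, u, du):
        # first-order system: (u, u')' = (u', -coth(tau) u' - f(u))
        return du, -du / math.tanh(tau) - f(u)

    tau = h  # start off the coordinate singularity at tau = 0
    u = alpha - f(alpha) * tau ** 2 / 4
    du = -f(alpha) * tau / 2
    while tau < tau_max:
        k1u, k1d = rhs(tau, u, du)
        k2u, k2d = rhs(tau + h / 2, u + h / 2 * k1u, du + h / 2 * k1d)
        k3u, k3d = rhs(tau + h / 2, u + h / 2 * k2u, du + h / 2 * k2d)
        k4u, k4d = rhs(tau + h, u + h * k3u, du + h * k3d)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        du += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        tau += h
        if u <= 0.0:
            return "N", tau
        if du > 0.0:
            return "P", tau
    return "undecided", tau


if __name__ == "__main__":
    for alpha in (0.5, 1.2, 2.0, 4.0, 8.0):
        print(alpha, shoot(alpha))
```

For these parameters $\xi_1 = 1$ and $\xi_0 = (5/4)^2$, so the first two sample values lie in $(0, \xi_0]$, where only the label "P" can occur. Scanning larger $\alpha$ and bisecting between a "P" value and an "N" value would bracket the ground state initial value.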
Fig. 1. The function f(u)
For a particular solution $u \in G \cup N$, we let $\tau_1$ denote the zero of $f(u)$, that is to say, $u(\tau_1) = \xi_1$ (it is possible to define this point uniquely because, as we shall show momentarily, solutions $u \in G \cup N$ are monotone). Our subsequent results rely heavily on Sturm's comparison theorem (as mentioned in [Lemma 1, [Kwo]] and also in Chapter X, p. 229 of [Inc]) and a few important corollaries that we state below. Consider two second order differential equations:
\[ U''(x) + f(x)U'(x) + g(x)U(x) = 0, \qquad x \in (a,b), \tag{5.3} \]
\[ V''(x) + f(x)V'(x) + G(x)V(x) = 0, \qquad x \in (a,b). \tag{5.4} \]
Suppose that (5.3) has solutions that do not vanish in a neighborhood of the point b. Then the largest neighborhood of b, $(c,b)$, on which there exists a solution of (5.3) without any zero, is called the disconjugacy interval of (5.3). Sturm's theorem implies that no non-trivial solution can have more than one zero in $(c,b)$. A corollary (Lemma 6, [Kwo]) of Sturm's theorem is: if $(c,\infty)$ is the disconjugacy interval of (5.3), as defined above, then every solution of (5.3) with a zero in $(c,\infty)$ is unbounded. We also have another very useful corollary (Lemma 3, [Kwo]) of Sturm's theorem: if Equations (5.3) and (5.4) satisfy the comparison condition $G(x) \ge g(x)$, U is not identically equal to V in any neighborhood of b and there exists a solution V of (5.4) with a largest zero at $\rho \in (a,b)$, then the disconjugacy interval of (5.3) is a strict superset of $(\rho,b)$.

We are now ready to state and prove our results. But first let us briefly outline our strategy in a few steps, since the proof of uniqueness is rather involved:
1. The first two lemmas state well-known facts about the structure of the sets N, P and G. As we increase $\alpha$ from 0 we first have solutions in P. Since the arguments are entirely analogous to those used for the Euclidean case in [Kwo], we refer to the relevant lemmas in [Kwo] instead of reiterating the proofs.
2. Next we study the variation w of a solution $u \in G \cup N$ with respect to its initial value. The proof of uniqueness depends crucially on the properties of w. If, for $\alpha \in G$, $\lim_{\tau\to\infty} w(\alpha,\tau) = -\infty$, then a right neighborhood of $\alpha$ belongs to N. Also, if $\alpha \in N$ and $w(\alpha, b(\alpha)) < 0$, then a right neighborhood of $\alpha$ belongs to N as well. Suppose these hypotheses are indeed true. As we continuously increase $\alpha$, we shall first have solutions in P. The right boundary point will belong to G.
A right neighborhood of the corresponding $\alpha$ will be in N. Then, if for all $\alpha \in N$, $w(\alpha, b(\alpha)) < 0$, we would continue to remain in N as we increase $\alpha$ further. Thus the proof of uniqueness of the ground state will be complete. Hence we just need to prove that for $\alpha \in G$, $\lim_{\tau\to\infty} w(\alpha,\tau) = -\infty$, while for $\alpha \in N$, $w(\alpha, b(\alpha)) < 0$. In fact, if we can prove that w has only one zero for initial values in $G \cup N$ and w is unbounded for initial values in G, uniqueness will be guaranteed. Initial values satisfying these two conditions are called strictly admissible.
3. To prove that w can have no more than one zero and that it is unbounded, we construct a comparison function v for w. The zero of w is then shown to belong to the disconjugacy interval of the differential equation satisfied by w, which in turn implies unboundedness of w. The idea of constructing a comparison function like this was used in [Kwo] to prove uniqueness of positive solutions of a semi-linear Poisson equation in a bounded or unbounded annular region in $\mathbb{R}^n$, for $n > 1$. It is in this crucial step, right after Lemma 5.5 in this paper, that our proof of uniqueness differs from that of [Kwo]. This happens because we are dealing with a semi-linear Poisson equation on the hyperbolic plane $\mathbb{H}^2$. The difference in geometry manifests itself in the form of the comparison function and, more importantly, in the subsequent analysis. Proofs of Lemma 5.6 through Lemma 5.8 are thus specific to the hyperbolic case. As we go along we point out these differences in detail.

The main result of this section is:

Theorem 5.1. The initial value $\alpha \in G \cup N$ is strictly admissible.

Let us construct an "energy" function corresponding to (5.2):
\[ E(\tau) = \frac{u'^2(\tau)}{2} + \frac{\tilde a\,u^{2+\frac{2}{kq}}}{2+\frac{2}{kq}} - \frac{\tilde b\,u^2}{2}. \]
It is readily seen that $E'(\tau) = -\coth\tau\; u'^2(\tau) \le 0$. Thus E is a non-increasing function of $\tau$.

Lemma 5.2. The set $(0, \xi_0]$ of initial values belongs to the set P [Lemma 8, [Kwo]].

For solutions in N, the function E decreases to a positive constant while for solutions in G, $E(\infty) = 0$. This fact leads us to the following lemma:

Lemma 5.3. If $u \in G \cup N$, then $u'(\tau) < 0$ in $(0, b(\alpha))$ (if $u \in N$) or $(0,\infty)$ (if $u \in G$) [Lemma 11, [Kwo]].

The fact that the sets N and P are open subsets of $(0,\infty)$ [Lemma 13, [Kwo]; Lemma 1.1, [Ber]] is crucial but easy to observe. We concern ourselves only with solutions that are either in G or in N. Let us define: $w = w(\tau,\alpha) = \left.\frac{\partial u}{\partial\alpha}\right|_{\tau,\alpha}$. We study the function w for such solutions. First of all let us note that $w = 0$ means two nearby solutions (i.e. solutions having nearby initial values) can intersect. Evidently w satisfies the following equation (the derivatives are taken with respect to $\tau$):
\[ w'' + \coth\tau\, w' + f'(u)\,w = 0, \qquad w(0) = 1, \qquad w'(0) = 0. \tag{5.5} \]
Lemma 5.4. For $u \in G \cup N$, w has to change sign before $\xi_1$ [Lemma 17, [Kwo]].

Following Kwong, we call the initial value $\alpha \in G$ strictly admissible if the corresponding $w(\alpha,\tau)$ has only one zero in $(0,\infty)$ and $\lim_{\tau\to\infty} w(\alpha,\tau) = -\infty$. We call the initial value $\alpha \in N$ strictly admissible if the corresponding $w(\alpha,\tau)$ has only one zero in $(0,\infty)$ and $w(\alpha, b(\alpha)) < 0$. It is easy to see that if for a particular $\alpha \in N$, $w(b(\alpha)) = \frac{\partial u}{\partial\alpha}(b(\alpha),\alpha) < 0$, then in a right neighborhood of $\alpha$, $b(\alpha)$ is a strictly decreasing function of $\alpha$ and thus that neighborhood belongs to N.

Lemma 5.5. If for $\alpha \in G$, $\lim_{\tau\to\infty} w(\alpha,\tau) = -\infty$, in particular if $w(\alpha,\tau)$ is strictly admissible, then there exists a right neighborhood of $\alpha$ that belongs to N [Lemma 19, [Kwo]].

We now need to prove that every initial value $\alpha \in G \cup N$ is strictly admissible. The strategy is to construct a comparison function $v(\tau)$ (to be compared with w), which has the following properties:
1. $v(\tau)$ has only one zero in $(0,\infty)$, and
2. $v(\tau)$ is a strict Sturm majorant of $w(\alpha,\tau)$ in both $(0,\rho)$ and $(\rho,\infty)$, where $\rho$ is the first zero of $w(\alpha,\tau)$.

If we are able to construct such a function, then by property (2) the zero of v occurs before that of w and by property (1) w cannot have another zero in $(0, b(\alpha))$. Here $b(\alpha)$ is the zero of the solution $u \in G \cup N$. If $u \in G$ then $b(\alpha)$ is to be interpreted as the point $\tau = \infty$. If $b(\alpha)$ is finite then of course the corresponding u is in N and $w(\alpha, b(\alpha)) < 0$, i.e., $\alpha$ is strictly admissible. On the other hand if $b(\alpha) = \infty$, w has a zero in the disconjugacy interval of v, and hence in the disconjugacy interval of the differential equation satisfied by w itself. This happens because, w being a strict Sturm minorant of v in $(0,\infty)$, the disconjugacy interval of (5.3) is bigger than that of the differential equation satisfied by v. This means w is unbounded. Hence the corresponding $\alpha$ is strictly admissible.

It is helpful to first construct an auxiliary function $\theta(\tau)$ and then use it to deduce that v has the necessary properties described above. In the Euclidean case [Kwo], the auxiliary function $\theta(r)$ is given by: $\theta(r) = -\frac{r\,u'(r)}{u(r)}$. For the hyperbolic case we define the auxiliary function for all solutions $u \in G \cup N$ as:
\[ \theta(\tau) = \frac{-\sinh\tau\; u'(\tau)}{u(\tau)}. \tag{5.6} \]
The auxiliary functions and the comparison functions in the Euclidean and hyperbolic cases have different forms but similar properties. Thus lemmas that follow are basically hyperbolic analogues of lemmas proved by Kwong in the Euclidean case. The function θ (τ ) is obviously continuous in (0, ∞) for u ∈ G; for u ∈ N θ (τ ) is continuous in (0, b(α)), where b(α) is the zero of u(α). Lemma 5.6. For solutions u ∈ G ∪ N , θ (0) = 0 and limτ −→b(α) θ (τ ) = ∞. If u ∈ N , b(α) is interpreted to be the zero of u and if u ∈ G, b(α) = ∞.
Proof. The first claim is easy to verify since for all $u \in G \cup N$, $u'(0) = 0$; since $u'(\tau) < 0$, $\theta(\tau) > 0$ in $(0,\infty)$. For $u \in N$, $u(b(\alpha)) = 0$ and the second assertion of the lemma automatically follows. Let us consider the case: $u \in G$. Let $R = -\frac{u'}{u}$. Then $R \ge 0$ and
\[ R' = -\frac{u''}{u} + \frac{u'^2}{u^2} = R^2 - R\coth\tau + \frac{f(u)}{u}. \]
Now we know that $\lim_{\tau\to\infty}\frac{f(u)}{u} = -\tilde b$. We assert that for large values of $\tau$ we would always have: $R(\tau) > \sqrt{\tilde b/2}$. If not, then $R(\tau) \le \sqrt{\tilde b/2}$ for some $\tau$. Then:
\[ R'(\tau) = R^2 - \coth\tau\, R + \frac{f(u)}{u} < R^2 + \frac{f(u)}{u} \le -\frac{\tilde b}{2}. \]
Thus $R'$ will remain strictly and hugely negative, eventually causing R to change sign. Thus $-\frac{u'(\tau)}{u(\tau)} > \sqrt{\tilde b/2}$ for large values of $\tau$. This in turn means $\lim_{\tau\to\infty}\theta(\tau) = \infty$.

We next define the comparison function $v_\beta(\tau) = \sinh\tau\, u' + \beta u$ (in the Euclidean case it is defined as $v_\beta(r) = r\,u'(r) + \beta u(r)$). It is readily seen that $v_\beta(\tau) = (<, >)\ 0$ if and only if $\theta$ intersects (is above, is below) the straight line $y(\tau) = \beta$. Also, $v_\beta(\tau)$ is tangent to the $\tau$-axis at some point $\hat\tau$ if and only if $\theta(\tau)$ is tangent to the straight line $y(\tau) = \beta$ at $\hat\tau$. The function $v_\beta(\tau)$ satisfies the following differential equation:
\[ v'' + \coth\tau\, v' + f'(u)\,v = \Phi(\tau) = \beta\left(u f'(u) - f(u)\right) - 2\cosh\tau\, f(u), \qquad v(0) > 0, \qquad v'(0) = 0. \tag{5.7} \]
Now,
\[ \Phi = \beta\left(u f'(u) - f(u)\right) - 2\cosh\tau\, f(u) = \frac{2}{kq}\,\beta\,\tilde a\,u^{1+\frac{2}{kq}} - 2\cosh\tau\, f(u). \]
It is not really obvious that one can choose a $\beta$ such that $\Phi$ has only one zero and the position of that zero has a continuous dependence on $\beta$. However our next lemma proves that this can indeed be achieved.

Lemma 5.7. There exists some $\bar\beta$ such that for $0 < \beta < \bar\beta$ the function $\Phi(u,\tau)$ has only one zero, say at $\tau = \sigma$ in $(0,\infty)$, such that:
\[ \Phi(u,\tau) < 0 \ \text{for } \tau < \sigma, \qquad \Phi(u,\tau) > 0 \ \text{for } \tau > \sigma. \]
The point $\sigma$ is a continuous monotone function of $\beta$.
Proof. First, we note that $\Phi(\tau) > 0$ in $[\tau_1,\infty)$ by definition; so its zeros must be concentrated in $(0,\tau_1)$. At a zero of the function $\Phi$ we have:
\[ \frac{2}{kq}\,\beta\,\tilde a\,u^{1+\frac{2}{kq}} = 2\cosh\tau\, f(u). \]
Thus at $\Phi = 0$ we have:
\[
\begin{aligned}
\Phi' &= \frac{2}{kq}\,\beta\,\tilde a\left(1+\frac{2}{kq}\right)u^{\frac{2}{kq}}\,u' - 2\sinh\tau\, f(u) - 2\cosh\tau\, f'(u)\,u' \\
&= \frac{2u'\cosh\tau}{u}\left[\left(1+\frac{2}{kq}\right)f(u) - u f'(u)\right] - 2\sinh\tau\, f(u).
\end{aligned}
\]
So, if at $\Phi = 0$, $\Phi' > 0$, then:
\[ \frac{2u'\cosh\tau}{u}\left[\left(1+\frac{2}{kq}\right)f(u) - u f'(u)\right] > 2\sinh\tau\, f(u), \quad\text{or,}\quad -\frac{2\tilde b}{kq}\,u' > \tanh\tau\, f(u), \]
which in turn implies
\[ \frac{2\tilde b}{kq}\,(-\sinh\tau\, u') > \sinh\tau\tanh\tau\, f(u). \tag{5.8} \]
Similarly if $\Phi' < 0$ at $\Phi = 0$, then
\[ \frac{2\tilde b}{kq}\,(-\sinh\tau\, u') < \sinh\tau\tanh\tau\, f(u). \tag{5.9} \]
Now the differential equation (5.2) satisfied by u can be rewritten as $(-\sinh\tau\, u')' = \sinh\tau\, f(u)$. If at the first zero of the function $\Phi(\tau)$, $\Phi'(\tau) > 0$, then inequality (5.8) holds at that point and we also know that the left hand side of the inequality is positive and increasing at the rate $\frac{2\tilde b}{kq}(-\sinh\tau\, u')' = \frac{2\tilde b}{kq}\sinh\tau\, f(u)$. As for the right hand side, we have, in the interval $(0,\tau_1)$:
\[ (\sinh\tau\tanh\tau\, f(u))' = \sinh\tau\, f(u) + \sinh\tau\,\mathrm{sech}^2\tau\, f(u) + \sinh\tau\tanh\tau\, f'(u)\,u' < 2\sinh\tau\, f(u). \]
The inequality above holds because $f'(u) > 0$ in $(0,\tau_1)$ and $u' < 0$. Since in our case $\frac{2\tilde b}{kq} = 2(kq-2)$ and k is chosen so that $kq > 1$, it turns out that $\frac{2\tilde b}{kq}\sinh\tau\, f(u) > 2\sinh\tau\, f(u)$. This in turn implies that the left hand side of (5.8) increases more rapidly than the right hand side. So if inequality (5.8) holds at some point in $(0,\tau_1)$ then it prevails at all subsequent points in this interval. We can thus conclude that if $\Phi(0) < 0$, then $\Phi(\tau)$ can have only one zero in $(0,\tau_1)$.
Now for a particular solution having initial value $\alpha$, $\Phi(\tau = 0) = \beta\left(\alpha f'(\alpha) - f(\alpha)\right) - 2f(\alpha)$. Putting in the specific form of $f(u)$ we obtain the condition that $\Phi(\tau)$ has a negative initial value:
\[ \beta < kq\left(1 - \frac{\tilde b}{\tilde a\,\alpha^{2/kq}}\right). \]
We let $\bar\beta$ denote the upper limit set on $\beta$ by the condition above. Then for $\beta \in (0,\bar\beta)$, the function $\Phi(\tau)$ has a negative initial value and consequently only one zero in $(0,\tau_1)$. We denote that zero by $\sigma$. Let us now find out how $\sigma$ depends on $\beta$. We have:
\[ \frac{\tilde a\,\beta}{kq} = \cosh\sigma\,\frac{f(u(\sigma))}{u(\sigma)^{1+2/kq}}. \]
Evidently then $\beta$ depends continuously on $\sigma$. Also:
\[ \beta'(\sigma) = \frac{kq}{\tilde a}\,u^{-1-2/kq}\left[\frac{2\tilde b}{kq}\,u'(\sigma)\cosh\sigma + f(u)\sinh\sigma\right]. \]
Now for $\beta \in (0,\bar\beta)$, (5.8) holds at $\sigma$, as proved before. Thus $\frac{2\tilde b}{kq}\,u'(\sigma)\cosh\sigma + f(u)\sinh\sigma < 0$, and hence $\beta'(\sigma) < 0$ for all $\beta$ in this range. This means there exists a continuous inverse function in a neighborhood of $\beta(\sigma)$. Thus $\sigma$ depends continuously on $\beta$. In fact $\sigma$ is a decreasing function of $\beta$. When $\beta = 0$ the only zero of $\Phi(\tau)$ is at $\tau_1$. As we increase $\beta$ the zero shifts continuously to the left.

Let $\rho_\beta$ be the first zero of $v_\beta(\tau)$ (we do not yet know how many zeros v can have). Then for $\beta = 0$, $\rho = 0$. As we increase $\beta$, $\rho_\beta$ moves to the right. In order to prove that we can control $\beta$ such that $\rho_\beta$ and $\sigma_\beta$ can be made to coincide, we need to show that $\rho_\beta$ continuously depends on $\beta$. We first show that actually, given any $\beta$, $v_\beta(\tau)$ can have only one zero and then prove the continuous dependence of that zero on the parameter $\beta$.

Lemma 5.8. The function $v_\beta(\tau)$ has only one zero in $(0,\infty)$.

Proof. In the interval $[0,\tau_1]$, $(-\sinh\tau\, u'(\tau))' = f(u)\sinh\tau \ge 0$. Thus $(-\sinh\tau\, u'(\tau))$ is non-decreasing in $[0,\tau_1]$. Since $u(\tau)$ is decreasing, $\theta(\tau) = \frac{-\sinh\tau\, u'(\tau)}{u(\tau)}$ is non-decreasing in $[0,\tau_1]$. Thus for any $\beta$ it can intersect the straight line $y(\tau) = \beta$ no more than once in this interval and the corresponding $v_\beta(\tau)$ can have at most one zero. Since $\lim_{\tau\to\infty}\theta(\tau) = \infty$, if $\theta(\tau)$ is not non-decreasing in the entire interval $(\tau_1,\infty)$, then it has to have local minima. Suppose the lowest of all such minima occurs at $\omega$ and has height $\beta_0$. Then in $(\omega,\infty)$, $v_{\beta_0}(\tau)$ is negative and has a double zero at $\omega$. Also $v_{\beta_0}(\tau)$ satisfies the following differential inequality in $(\omega,\infty)$:
\[ v'' + \coth\tau\, v' + f'(u)\,v \ge 0. \]
But this is impossible (since, if v satisfies the second-order differential equation above, then it cannot have a double zero; cf. Lemma 5, [Kwo]). Thus we conclude that $\theta(\tau)$ is non-decreasing in $(0,\infty)$, which in turn implies that for any value of $\beta$, $v_\beta(\tau)$ can have only one zero in $(0,\infty)$.

To prove that one can choose $\beta$ such that $\rho_\beta = \sigma_\beta$ it is sufficient to show that $\rho_\beta$ as a function of $\beta$ does not have any discontinuity in $(0,\tau_1)$. Since $v_\beta$ has a zero at $\rho_\beta$ if and only if $\theta$ intersects the straight line $y(\tau) = \beta$ at $\tau = \rho_\beta$, we just need to show $\theta'(\rho_\beta) \ne 0$. As shown in the preceding lemma, $\theta'(\tau) > 0$ in $(0,\tau_1)$. As we increase $\beta$, the height of the horizontal straight line $y(\tau) = \beta$ increases. This results in a continuous shift of the point of intersection $\rho_\beta$ to the right. Thus we can conclude that in $(0,\tau_1)$ $\rho_\beta$ is a continuous increasing function of $\beta$. For $\beta = 0$, $\rho = 0$ and $\sigma = \tau_1$. When we increase $\beta$, $\rho_\beta$ moves continuously to the right even as $\sigma_\beta$ shifts continuously to the left until it is at the origin $\tau = 0$ for $\beta = \bar\beta$, as shown before. It follows that there exists a $\beta_0 \in (0,\bar\beta)$ for which we would have $\rho_{\beta_0} = \sigma_{\beta_0}$. Let us then fix the parameter $\beta$ by choosing that value $\beta_0$. We are now in a position to prove Theorem 5.1.

Proof. Let us use $v_{\beta_0}(\tau)$ as a comparison function for $w(\tau)$. The differential equations to be compared are
\[ w'' + \coth\tau\, w' + f'(u)\,w = 0, \qquad w(0) = 1, \qquad w'(0) = 0, \]
and
\[ v'' + \coth\tau\, v' + \left[f'(u) - \frac{\Phi(\tau)}{v}\right]v = 0, \qquad v(0) > 0, \qquad v'(0) = 0. \]
Since in $(0,\rho)$, $\Phi < 0$ and $v > 0$, the coefficient of v is larger than that of w. Thus v is a strict Sturm majorant of w and its zero $\rho$ occurs before the first zero of w, say c. But at c, $\Phi > 0$ and $v < 0$, thus the coefficient of v is still larger than that of w. Moreover, since $w(c) = 0$, $\frac{w'(c)}{w(c)} = +\infty$ and $\frac{w'(c)}{w(c)} > \frac{v'(c)}{v(c)}$. Thus v again is a strict Sturm majorant of w. But v does not have a zero in $[c,\infty)$. Then w cannot have a zero in this interval either. So if $u \in N$ then $w(b(\alpha)) < 0$ and $\alpha$ is strictly admissible. Let us consider the case $u \in G$ now. Evidently, c belongs to the disconjugacy interval of (5.7). Since v is a strict Sturm majorant of w in $(0,\infty)$, the disconjugacy interval of (5.5) is a superset of the disconjugacy interval of (5.7). Thus w has a zero in the disconjugacy interval of the differential equation it satisfies. Hence it must be unbounded. Thus for $u \in G \cup N$ the corresponding initial value is strictly admissible (and this ensures uniqueness of the corresponding solution, as shown before).
References

[Bae] Baernstein, A., Taylor, B.A.: Spherical rearrangement, subharmonic functions and *-functions in n-space. Duke Math. J. 43(2), 245–268 (1976)
[Bar] Bargmann, V.: Irreducible unitary representations of the Lorentz group. Ann. Math. 48(3), 568–640 (1947)
[Bec1] Beckner, W.: Sharp inequalities and geometric manifolds. J. Fourier Anal. Appl. 3, Special Issue, 825–836 (1997)
[Bec2] Beckner, W.: Geometric asymptotics and the logarithmic Sobolev inequality. Forum Math. 11, 105–137 (1999)
[Ber] Berestycki, H., Lions, P.L., Peletier, L.A.: An ODE approach to the existence of positive solutions for semilinear problems in $\mathbb{R}^n$. Indiana U. Math. J. 30, 141–167 (1981)
[Bod] Bodmann, B.G.: A lower bound for the Wehrl entropy of quantum spin with sharp high-spin asymptotics. Commun. Math. Phys. 250, 287–300 (2004)
[Car] Carlen, E.A.: Some integral identities and inequalities for entire functions and their application to the coherent state transform. J. Funct. Anal. 97(1), 231–249 (1991)
[Gro1] Gross, L.: Logarithmic Sobolev inequalities. Amer. J. Math. 97, 1061–1083 (1975)
[Gro2] Gross, L.: Hypercontractivity over complex manifolds. Acta Math. 182(2), 159–206 (1999)
[Gnu] Gnutzmann, S., Zyczkowski, K.: Renyi-Wehrl entropies as measures of localization in phase space. J. Phys. A: Math. Gen. 34, 10123–10139 (2001)
[Heb] Hebey, E.: Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities. CIMS Lecture Notes, New York: Courant Institute of Mathematical Sciences, 1999
[Inc] Ince, E.L.: Ordinary Differential Equations. New York: Dover Publications, 1944
[Kwo] Kwong, M.K.: Uniqueness of positive radial solutions of $\Delta u - u + u^p = 0$ in $\mathbb{R}^n$. Arch. Rat. Mech. Anal. 105, 243–266 (1989)
[Lie] Lieb, E.H.: Proof of an entropy conjecture of Wehrl. Commun. Math. Phys. 62, 35–41 (1978)
[McLe] McLeod, K., Serrin, J.: Uniqueness of positive radial solutions of $\Delta u + f(u) = 0$ in $\mathbb{R}^n$. Arch. Rat. Mech. Anal. 99, 115–145 (1987)
[Pel1] Peletier, L.A., Serrin, J.: Uniqueness of positive solutions of semilinear equations in $\mathbb{R}^n$. Arch. Rat. Mech. Anal. 81, 181–197 (1983)
[Pel2] Peletier, L.A., Serrin, J.: Uniqueness of non-negative solutions of semilinear equations in $\mathbb{R}^n$. J. Diff. Eqs. 61, 380–397 (1986)
[Per] Perelomov, A.: Generalized Coherent States and Their Applications. Texts and Monographs in Physics, Berlin-Heidelberg-New York: Springer-Verlag, 1986
[Rot] Rothaus, O.: Diffusion on compact Riemannian manifolds and logarithmic Sobolev inequalities. J. Funct. Anal. 42, 358–367 (1981)
[Sch] Schupp, P.: On Lieb's conjecture for the Wehrl entropy of Bloch coherent states. Commun. Math. Phys. 207(2), 481–493 (1999)
[Weh] Wehrl, A.: On the relation between classical and quantum-mechanical entropy. Rep. Math. Phys. 16(3), 353–358 (1979)
Communicated by M.B. Ruskai
Commun. Math. Phys. 285, 1087–1107 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0543-0
Birkhoff Coordinates for the Focusing NLS Equation

T. Kappeler1, P. Lohrmann1, P. Topalov2, N. T. Zung3

1 Institut für Mathematik, Universität Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland. E-mail: [email protected]; [email protected]
2 Department of Mathematics, Northeastern University, Boston, MA 02115, USA. E-mail: [email protected]
3 Institut de Mathématiques de Toulouse, Université Paul Sabatier, 118 Rte de Narbonne, 31062 Toulouse, France. E-mail: [email protected]

Received: 6 December 2007 / Accepted: 24 February 2008
Published online: 4 July 2008 – © Springer-Verlag 2008
Abstract: In this paper we construct Birkhoff coordinates for the focusing nonlinear Schrödinger equation near the zero solution.

1. Introduction

Consider the focusing nonlinear Schrödinger equation (fNLS)
\[ i\partial_t \psi = -\partial_x^2 \psi - 2|\psi|^2 \psi \tag{1.1} \]
with periodic boundary conditions, i.e. $\psi(x+1,t) = \psi(x,t)$ for $x, t \in \mathbb{R}$. The fNLS equation (1.1) is integrable and admits a Lax-pair formalism – see [14]. It can be written in Hamiltonian form as follows. Let $L^2 := L^2(\mathbb{T},\mathbb{C})$ denote the Hilbert space of $L^2$-integrable complex-valued functions on the circle $\mathbb{T} := \mathbb{R}/\mathbb{Z}$ and let $\mathcal{L}^2 := L^2 \times L^2$. For $C^1$-functionals $F$ and $G$ introduce the Poisson bracket
\[ \{F,G\}(\varphi) = i\int_0^1 \left(\partial_{\varphi_1}F\,\partial_{\varphi_2}G - \partial_{\varphi_2}F\,\partial_{\varphi_1}G\right) dx, \tag{1.2} \]
where $\varphi = (\varphi_1,\varphi_2)$ and $\partial_{\varphi_i}F$ denotes the $L^2$-gradient of $F$ with respect to $\varphi_i$, $i = 1, 2$. The Hamiltonian system with Hamiltonian
\[ \mathcal{H} \equiv \mathcal{H}(\varphi) := \int_0^1 \left(\partial_x\varphi_1\,\partial_x\varphi_2 + \varphi_1^2\varphi_2^2\right) dx \tag{1.3} \]
is then given by
\[ \partial_t(\varphi_1,\varphi_2) = i\left(-\partial_{\varphi_2}\mathcal{H},\ \partial_{\varphi_1}\mathcal{H}\right). \tag{1.4} \]
Supported in part by the Swiss National Science Foundation, and the European Community through the FP6 Marie Curie RTN ENIGMA (MRTN-CT-2004-5652). Supported by the Swiss National Science Foundation.
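Equation (1.1) is obtained by restricting (1.4) to the invariant subspace
\[ i\mathcal{L}^2_{\mathbb{R}} := \{(\varphi_1,\varphi_2) \in \mathcal{L}^2 \mid \varphi_1 = -\bar\varphi_2\}. \]
With $(\varphi_1,\varphi_2) = (\psi,-\bar\psi)$ one has
\[ \partial_t\psi = i\,\partial_{\bar\psi}\mathcal{H}_f = i\partial_x^2\psi + 2i|\psi|^2\psi, \tag{1.5} \]
where
\[ \mathcal{H}_f(\psi) = \int_0^1 \left(-\partial_x\psi\,\partial_x\bar\psi + \psi^2\bar\psi^2\right) dx. \tag{1.6} \]
When restricting (1.4) to the invariant subspace
\[ \mathcal{L}^2_{\mathbb{R}} := \{(\varphi_1,\varphi_2) \in \mathcal{L}^2 \mid \varphi_1 = \bar\varphi_2\} \]
of $\mathcal{L}^2$ one obtains the defocusing nonlinear Schrödinger equation (dNLS). With $(\varphi_1,\varphi_2) = (\psi,\bar\psi)$ one has
\[ \partial_t\psi = -i\,\partial_{\bar\psi}\mathcal{H}_d = i\partial_x^2\psi - 2i|\psi|^2\psi, \tag{1.7} \]
where
\[ \mathcal{H}_d(\psi) = \int_0^1 \left(\partial_x\psi\,\partial_x\bar\psi + \psi^2\bar\psi^2\right) dx. \]
Equation (1.4) admits the Lax pair representation
\[ \partial_t L(\varphi) = [A(\varphi), L(\varphi)], \tag{1.8} \]
where $\varphi = (\varphi_1,\varphi_2)$, $L = L(\varphi)$ is the Zakharov-Shabat operator (ZS operator)
\[ L(\varphi) := i\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\partial_x + \begin{pmatrix} 0 & \varphi_1 \\ \varphi_2 & 0 \end{pmatrix} \tag{1.9} \]
and
\[ A(\varphi) := i\begin{pmatrix} -2\partial_x^2 + \varphi_1\varphi_2 & -\partial_x\varphi_1 - 2\varphi_1\partial_x \\ \partial_x\varphi_2 + 2\varphi_2\partial_x & 2\partial_x^2 - \varphi_1\varphi_2 \end{pmatrix}. \]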
Birkhoff normal form. The theory of normal forms of integrable (or near integrable) systems aims at representing such systems in coordinates which are particularly suited to integrate them as well as to study their (Hamiltonian) perturbations. The simplest case is arguably the normal form of such systems near an isolated equilibrium solution. It goes back to Birkhoff and is usually referred to as Birkhoff normal form. Assume that the origin 0 of $\mathbb{R}^n \times \mathbb{R}^n$ is an isolated equilibrium of some Hamiltonian system with real analytic Hamiltonian $H$ and standard symplectic structure. This means that 0 is an isolated singular point of the corresponding Hamiltonian vector field $X_H$. For simplicity, we assume that $H$ admits an expansion of the form
\[ H = \frac{1}{2}\sum_{i=1}^n \lambda_i (q_i^2 + p_i^2) + \cdots, \]
where $z = (q,p)$ denotes a point near $0 \in \mathbb{R}^n \times \mathbb{R}^n$ and the dots stand for terms of higher order in $z$. The real numbers $\lambda_1, \ldots, \lambda_n$ are referred to as the frequencies of the linearized system. They are said to be nonresonant up to order $m$ if
\[ \sum_{i=1}^n k_i\lambda_i \neq 0 \quad \text{whenever} \quad 1 \le \sum_{i=1}^n |k_i| \le m, \]
where $k_1, \ldots, k_n$ are arbitrary integers and $m \ge 1$. They are nonresonant if they are nonresonant up to any finite order. A Hamiltonian $H$ is in Birkhoff normal form up to order $m$ if it is of the form $H = N_2 + N_4 + \cdots + N_m + \cdots$, where the $N_k$, $2 \le k \le m$, are homogeneous polynomials of order $k$, which are actually functions of $q_i^2 + p_i^2$, $1 \le i \le n$, and where $\ldots$ stands for terms of order strictly greater than $m$. If this holds for any $m$, the Hamiltonian is said to be in Birkhoff normal form.

Birkhoff showed that if the frequencies $\lambda_1, \ldots, \lambda_n$ are nonresonant up to order $m \ge 3$, then there exists an analytic canonical transformation $\Phi = \mathrm{id} + \cdots$ near 0 such that $H \circ \Phi = N_2 + N_4 + \cdots + N_m + \cdots$ is in Birkhoff normal form up to order $m$. If the frequencies $\lambda_1, \ldots, \lambda_n$ are nonresonant up to any order, then this normalization process can be carried to any order. The resulting symplectic transformation, however, is in general no longer convergent in any neighborhood of the origin and can only be given the meaning of a formal power series. If some canonical transformation into Birkhoff normal form were convergent, then the resulting Hamiltonian would be integrable in a neighborhood of the origin, the integrals in involution being $q_1^2 + p_1^2, \ldots, q_n^2 + p_n^2$. It turns out that a certain converse is true. If a Hamiltonian with a nonresonant elliptic equilibrium admits $n$ functionally independent integrals in involution, then the formal transformation into Birkhoff normal form is convergent, hence the Hamiltonian itself is integrable. Such a result was proven by Vey [13] and then improved by Ito [7] and Zung [15]. Note that the normalizing transformation is typically only defined in a neighborhood of the elliptic equilibrium. In case the transformation is defined on all of phase space, one refers to the Birkhoff coordinates as global Birkhoff coordinates.

In the last decade, normal form theory has been extended to Hamiltonian PDEs.
In particular, Birkhoff normal forms of finite order have been studied for Hamiltonian PDEs and applied to obtain results on long time asymptotics for solutions near an equilibrium – see e.g. [2] and references therein. As in Hamiltonian systems of finite dimension, in the case of integrable PDEs one expects stronger results to hold. First results in this direction were obtained for the KdV equation and the defocusing nonlinear Schrödinger equation – see [8], respectively, [6].

Denote by $H^N \equiv H^N(\mathbb{T},\mathbb{C})$ the Sobolev space of complex valued functions on the circle $\mathbb{T}$,
\[ H^N(\mathbb{T},\mathbb{C}) := \Big\{\psi(x) = \sum_{k\in\mathbb{Z}} e^{2\pi i k x}\,\hat\psi(k) : \|\psi\|_N < \infty\Big\}, \]
where for $N \ge 0$,
\[ \|\psi\|_N := \Big(\sum_{k\in\mathbb{Z}} (1+|k|)^{2N}\,|\hat\psi(k)|^2\Big)^{1/2}, \]
and $\hat\psi(k) := \int_0^1 \psi(x)\,e^{-2\pi i k x}\,dx$, $k \in \mathbb{Z}$, denote the Fourier coefficients of $\psi$. Further let $\ell^2_{\mathbb{C}^2}$ be the Hilbert space
\[ \ell^2_{\mathbb{C}^2} = \ell^2(\mathbb{Z},\mathbb{C}) \times \ell^2(\mathbb{Z},\mathbb{C}), \qquad (x,y) = (x_k,y_k)_{k\in\mathbb{Z}}. \]
We endow $\ell^2_{\mathbb{C}^2}$ with the standard Poisson bracket for which $\{x_k,y_k\} = -\{y_k,x_k\} = 1$ for any $k \in \mathbb{Z}$ whereas all other brackets between coordinate functions vanish. It induces the standard Poisson brackets on the real subspaces
\[ \ell^2_{\mathbb{R}^2} := \ell^2(\mathbb{Z},\mathbb{R}) \times \ell^2(\mathbb{Z},\mathbb{R}) \quad \text{and} \quad i\ell^2_{\mathbb{R}^2} := \ell^2(\mathbb{Z},i\mathbb{R}) \times \ell^2(\mathbb{Z},i\mathbb{R}). \]
More generally, for any $N \ge 0$, introduce
\[ \ell^2_N \equiv \ell^2_N(\mathbb{Z},\mathbb{C}) := \{x = (x_j)_{j\in\mathbb{Z}} \mid x \in \ell^2(\mathbb{Z},\mathbb{C}),\ \|x\|_N < \infty\}, \]
where
\[ \|x\|_N := \Big(\sum_{j\in\mathbb{Z}} (1+|j|)^{2N}\,|x_j|^2\Big)^{1/2} < \infty. \]
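As a small illustration of the weighted norm just defined, the following sketch evaluates $\|x\|_N$ for a finitely supported sequence. The function name `weighted_l2_norm` and the sample sequence are hypothetical and chosen only for this example.

```python
import math

def weighted_l2_norm(x, N):
    """Return ||x||_N = (sum_j (1 + |j|)^(2N) |x_j|^2)^(1/2).

    `x` is a dict mapping an integer index j to the entry x_j;
    indices that are absent are treated as zero (finite support).
    """
    return math.sqrt(sum((1 + abs(j)) ** (2 * N) * abs(xj) ** 2
                         for j, xj in x.items()))

# Hypothetical sample sequence supported on j = -2, 0, 1.
x = {-2: 1j, 0: 3.0, 1: 4.0}
norm0 = weighted_l2_norm(x, 0)  # plain l^2 norm
norm1 = weighted_l2_norm(x, 1)  # entries far from j = 0 are weighted more heavily
```

Since the weights $(1+|j|)^{2N}$ are at least 1 and nondecreasing in $N$, the norms increase with $N$, which matches the inclusions $\ell^2_{N'} \subseteq \ell^2_N$ for $N' \ge N$.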
The main result of this paper is the following

Theorem 1.1. There exist a neighborhood $W_f$ of $0 \in i\mathcal{L}^2_{\mathbb{R}}$, a neighborhood $U_f$ of $0 \in i\ell^2_{\mathbb{R}^2}$, and a map
\[ \Phi_f : W_f \to U_f \]
such that
(i) $\Phi_f$ is 1 – 1, onto, bi-analytic and preserves the Poisson bracket.
(ii) The coordinates $(x_k,y_k)_{k\in\mathbb{Z}} = \Phi_f(\varphi)$ are Birkhoff coordinates for the focusing NLS equation, i.e. for $\varphi \in i\mathcal{L}^2_{\mathbb{R}} \cap (H^1 \times H^1)$, the Hamiltonian $\mathcal{H}_f \circ \Phi_f^{-1}$ depends only on the action variables $I_k = \frac{1}{2}(x_k^2 + y_k^2)$, $k \in \mathbb{Z}$.
(iii) For any $N \ge 0$, $\Phi_f$ maps $W_f \cap (H^N \times H^N)$ diffeomorphically onto $U_f \cap (\ell^2_N \times \ell^2_N)$.

Remark 1.1. Statement (iii) of Theorem 1.1 remains valid if the Sobolev space $H^N$ is replaced by the weighted Sobolev space $H^\omega$ with subexponential weight $\omega$ and, correspondingly, the sequence space $\ell^2_N$ by the weighted sequence space $\ell^2_\omega$ – see [9].

Theorem 1.1 can be used to obtain a KAM-result for the focusing NLS equation of the type obtained in [5] for the defocusing NLS equation. In the case of fNLS it is valid in a neighborhood of 0 of the invariant subspace of $i\mathcal{L}^2_{\mathbb{R}} \cap (H^N \times H^N)$ consisting of odd potentials. It improves on the KAM theorem established in [10] and can be proved in the same way as the corresponding result in [5].

To prove Theorem 1.1 we use that the defocusing NLS equation admits global Birkhoff coordinates. More precisely, in [6] it is shown that there exists a real analytic canonical map $\Phi : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2}$ which associates to a potential $\varphi$ in $\mathcal{L}^2_{\mathbb{R}}$ its Birkhoff coordinates $(x_k(\varphi), y_k(\varphi))_{k\in\mathbb{Z}}$. The map $\Phi$ extends analytically to a map $W \to \ell^2_{\mathbb{C}^2}$, defined on an open neighborhood $W$ of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$. In order to provide Birkhoff coordinates on a neighborhood of 0 for the focusing NLS-equation, we will show that there exists a neighborhood of 0 in $i\mathcal{L}^2_{\mathbb{R}} \cap W$ so that the restriction of $\Phi$ to this neighborhood has all the properties listed in Theorem 1.1. The main point consists in verifying that
\[ \Phi(i\mathcal{L}^2_{\mathbb{R}} \cap W) \subset i\ell^2_{\mathbb{R}^2}. \]
2. Set-up

In this section we introduce some more notations, recall several results needed in the sequel and establish some auxiliary results.

2.1. Spectral properties of $L(\varphi)$ and its discriminant. For $\varphi = (\varphi_1,\varphi_2) \in \mathcal{L}^2$, consider the ZS operator $L(\varphi)$, defined by (1.9). For any $\lambda \in \mathbb{C}$, let $M = M(x,\lambda,\varphi)$ denote the fundamental $2 \times 2$ matrix of $L(\varphi)$,
\[ L(\varphi)M = \lambda M, \]
satisfying the initial condition $M(0,\lambda,\varphi) = \mathrm{Id}_{2\times 2}$. The entries of $M$ are denoted by $M_{ij}$, $1 \le i,j \le 2$.

Periodic spectrum. Denote by $\mathrm{Spec}_{per}(\varphi)$ the spectrum of the operator $L = L(\varphi)$ with domain
\[ \mathrm{dom}_{per}(L) := \{F \in H^1_{loc} \times H^1_{loc} \mid F(1) = \pm F(0)\}. \]
This spectrum coincides with the spectrum of the operator $L(\varphi)$ considered on $[0,2]$ with periodic boundary conditions. The following proposition is well known – see e.g. [6], Prop. I.6.

Proposition 2.1. For any $\varphi \in \mathcal{L}^2$, the set of periodic eigenvalues of $L(\varphi)$ (listed with multiplicities) consists of a sequence of pairs $(\lambda_k^-(\varphi), \lambda_k^+(\varphi))$, $\lambda_k^\pm(\varphi) \in \mathbb{C}$, satisfying
\[ \lambda_k^\pm(\varphi) = k\pi + \ell^2(k) \]
locally uniformly in $\varphi$, i.e. $(\lambda_k^\pm(\varphi) - k\pi)_{k\in\mathbb{Z}} \in \ell^2(\mathbb{Z},\mathbb{C})$ and the sequences are locally uniformly bounded.

We say that two complex numbers $a, b$ are lexicographically ordered, $a \preceq b$, if $[\mathrm{Re}(a) < \mathrm{Re}(b)]$ or $[\mathrm{Re}(a) = \mathrm{Re}(b)$ and $\mathrm{Im}(a) \le \mathrm{Im}(b)]$.

Proposition 2.2. (i) For $\varphi = (\psi,\bar\psi) \in \mathcal{L}^2_{\mathbb{R}}$, the periodic eigenvalues $(\lambda_k^\pm(\varphi))_{k\in\mathbb{Z}}$ are real. Moreover, they can be listed (with multiplicities) in such a way that
\[ \cdots \le \lambda_{k-1}^+ < \lambda_k^- \le \lambda_k^+ < \lambda_{k+1}^- \le \cdots. \tag{2.1} \]
(ii) For potentials $\varphi = (\psi,-\bar\psi) \in i\mathcal{L}^2_{\mathbb{R}}$ the periodic eigenvalues $(\lambda_k^\pm(\varphi))_{k\in\mathbb{Z}}$ can be listed (with multiplicities) in such a way that $\mathrm{Im}(\lambda_k^+) \ge 0$ $\forall k \in \mathbb{Z}$, and $(\lambda_k^+(\varphi))_{k\in\mathbb{Z}}$ is lexicographically ordered. In addition, for any $k \in \mathbb{Z}$, $\lambda_k^-$ is given by $\lambda_k^- = \overline{\lambda_k^+}$.

Proof. (i) For $\varphi \in \mathcal{L}^2_{\mathbb{R}}$, the operator $L(\varphi)$ with periodic boundary conditions is selfadjoint, hence its spectrum is real. The sequence of inequalities (2.1) follows from [6], formula (I.20). (ii) The claimed statement follows from Proposition 2.1 and the fact that if $F = (F_1,F_2)$ is a periodic eigenfunction with eigenvalue $\lambda$, then $\check F := (-\bar F_2, \bar F_1)$ is a periodic eigenfunction with eigenvalue $\bar\lambda$.
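Dirichlet spectrum. For $\varphi \in \mathcal{L}^2$, denote by $\mathrm{Spec}_{dir}(\varphi)$ the Dirichlet spectrum of the operator $L(\varphi)$, i.e. the spectrum of $L(\varphi)$ considered with domain
\[ \mathrm{dom}_{dir}(L) := \{F = (F_1,F_2) \in H^1([0,1],\mathbb{C})^2 \mid F_1(0) = F_2(0),\ F_1(1) = F_2(1)\}. \tag{2.2} \]
Note that the Dirichlet spectrum is discrete. The following results are well known – see e.g. [6] Prop. I.9, formula I.22.

Proposition 2.3. (i) For $\varphi \in \mathcal{L}^2$ the Dirichlet eigenvalues $(\mu_k(\varphi))_{k\in\mathbb{Z}}$ can be listed (with multiplicities) in such a way that they are lexicographically ordered and satisfy the asymptotic estimates $\mu_k(\varphi) = k\pi + \ell^2(k)$, locally uniformly in $\varphi$. (ii) For $\varphi \in \mathcal{L}^2_{\mathbb{R}}$, the Dirichlet eigenvalues are real and satisfy
\[ \lambda_k^-(\varphi) \le \mu_k(\varphi) \le \lambda_k^+(\varphi). \]

Discriminant. Let $\Delta(\lambda,\varphi) := M_{11}(1,\lambda,\varphi) + M_{22}(1,\lambda,\varphi)$ be the trace of the fundamental matrix $M$ evaluated at $x = 1$. It is well known that $\Delta(\lambda,\varphi)$ is an entire function on $\mathbb{C} \times \mathcal{L}^2$ (cf. [6], Lemma I.1). Denote by $\dot\Delta$ the partial derivative of $\Delta(\lambda,\varphi)$ with respect to $\lambda$. The following properties of $\Delta(\lambda,\varphi)$ are well known – see e.g. [11] or [6], Sect. I.2, Lemma I.19, Lemma I.20, and Lemma I.22.

Proposition 2.4.
(i) For any $\varphi \in \mathcal{L}^2$ and any $\lambda \in \mathbb{C}$,
\[ \Delta^2(\lambda,\varphi) - 4 = -4\,(\lambda_0^+(\varphi) - \lambda)(\lambda_0^-(\varphi) - \lambda) \prod_{k\neq 0} \frac{(\lambda_k^+(\varphi) - \lambda)(\lambda_k^-(\varphi) - \lambda)}{k^2\pi^2}. \]
(ii) For any $\varphi \in \mathcal{L}^2$, the $\lambda$-derivative $\dot\Delta$ of $\Delta(\lambda,\varphi)$ has countably many roots. They can be listed (with multiplicities) in such a way that they are lexicographically ordered and satisfy the asymptotic estimates $\dot\lambda_k = k\pi + \ell^2(k)$, locally uniformly in $\varphi$. For any $\varphi \in \mathcal{L}^2$, $\dot\Delta(\lambda,\varphi)$ admits the following product representation:
\[ \dot\Delta(\lambda,\varphi) = 2(\dot\lambda_0 - \lambda) \prod_{k\neq 0} \frac{\dot\lambda_k - \lambda}{k\pi}. \]
(iii) For any $\varphi \in i\mathcal{L}^2_{\mathbb{R}}$ and $\lambda \in \mathbb{C}$,
\[ \overline{\Delta(\bar\lambda,\varphi)} = \Delta(\lambda,\varphi) \quad \text{and} \quad \overline{\dot\Delta(\bar\lambda,\varphi)} = \dot\Delta(\lambda,\varphi). \]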
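The product representations of Proposition 2.4 can be checked numerically in the simplest case, the zero potential, where $\Delta(\lambda,0) = 2\cos\lambda$ and $\lambda_k^\pm = \dot\lambda_k = k\pi$. The sketch below is only an illustration under that assumption; the function names and the truncation index `K` are hypothetical, and the infinite products are truncated symmetrically to $|k| \le K$.

```python
import cmath
import math

def discriminant_free(lam):
    """Discriminant at the zero potential: Delta(lambda, 0) = 2 cos(lambda)."""
    return 2 * cmath.cos(lam)

def paired_factor(lam, k):
    """Product of the factors with indices k and -k in item (ii), with lambda-dot_k = k*pi."""
    kp = k * math.pi
    return ((kp - lam) / kp) * ((-kp - lam) / (-kp))

def product_delta_dot(lam, K):
    """Truncated product of item (ii): 2 (0 - lam) * prod_{0 < |k| <= K} (k*pi - lam) / (k*pi)."""
    value = 2 * (0 - lam)
    for k in range(1, K + 1):
        value *= paired_factor(lam, k)
    return value

def product_delta_sq_minus_4(lam, K):
    """Truncated product of item (i) with lambda_k^+ = lambda_k^- = k*pi."""
    value = -4 * (0 - lam) ** 2
    for k in range(1, K + 1):
        value *= paired_factor(lam, k) ** 2
    return value

lam = complex(0.7, 0.3)   # arbitrary sample point
K = 5000                  # truncation index (an assumption of this sketch)
err_dot = abs(product_delta_dot(lam, K) - (-2 * cmath.sin(lam)))
err_sq = abs(product_delta_sq_minus_4(lam, K) - (discriminant_free(lam) ** 2 - 4))
# Symmetry of item (iii), which for 2 cos(lambda) holds for trivial reasons.
err_sym = abs(discriminant_free(lam.conjugate()).conjugate() - discriminant_free(lam))
```

The truncation error decays only like $1/K$, so the comparison is meaningful up to a modest tolerance rather than to machine precision.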
Proof. The first two items are proved in [6], Sect. I.6. The third item is well known – see for example [1]. For the convenience of the reader we repeat the proof here. Let $F = F(x,\lambda,\varphi)$ be the solution of
\[ L(\varphi)F = \lambda F \tag{2.3} \]
such that $F|_{x=0} = (1,0)$. Then $F_i(x,\lambda,\varphi) = M_{i1}$ for $i = 1, 2$. A straightforward computation shows that
\[ \check F(x,\lambda,\varphi) = \left(-\overline{F_2(x,\bar\lambda,\varphi)},\ \overline{F_1(x,\bar\lambda,\varphi)}\right) \]
is a solution of (2.3) with $\check F|_{x=0} = (0,1)$. Hence, $\Delta(\lambda,\varphi) = F_1(1,\lambda,\varphi) + \overline{F_1(1,\bar\lambda,\varphi)}$. The latter equality proves the statement.

Spectral properties of potentials in $i\mathcal{L}^2_{\mathbb{R}}$ near 0. Potentials $\varphi \in i\mathcal{L}^2_{\mathbb{R}}$ near the origin have additional spectral properties. To describe them let $(D_k)_{k\in\mathbb{Z}}$ denote the sequence of disks in $\mathbb{C}$ with center $k\pi$ and radius $\pi/4$.

Proposition 2.5. There exists a neighborhood $W$ of 0 in $\mathcal{L}^2$, such that, for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$ and $k \in \mathbb{Z}$, the following properties hold:
(i) $\mathrm{Spec}_{per}(L(\varphi)) \cap D_k = \{\lambda_k^-, \lambda_k^+\}$;
(ii) $\mathrm{Crit}(\Delta(\cdot,\varphi)) \cap D_k = \{\dot\lambda_k\}$;
(iii) $\mathrm{Spec}_{dir}(L(\varphi)) \cap D_k = \{\mu_k\}$;
(iv) $\dot\lambda_k \in \mathbb{R}$, and $\Delta(\lambda_k^\pm(\varphi),\varphi) = 2(-1)^k$.

Proof. The existence of a neighborhood $W$ of 0 in $\mathcal{L}^2$ so that any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$ satisfies items (i)–(iii) follows from the fact that for $\varphi = (0,0)$,
\[ \lambda_k^- = \lambda_k^+ = \dot\lambda_k = \mu_k = k\pi \quad \forall k \in \mathbb{Z}, \]
together with Proposition 2.1, Proposition 2.3 (i) and Proposition 2.4 (ii). By Proposition 2.4 (iii) the critical points $\dot\lambda_k$ of $\Delta$ are either real or they occur in complex conjugate pairs. By item (ii) they cannot occur in complex conjugate pairs. Hence they must be real. Further, by a deformation argument, one sees that $\Delta(\lambda_k^\pm(\varphi),\varphi) = 2(-1)^k$ and item (iv) is proved as well.

2.2. Branches of the square root. We need to consider different branches of the square root.

Canonical branch. We denote by $\sqrt[+]{z}$ (or simply by $\sqrt{z}$) the principal branch of the square root defined on $\mathbb{C} \setminus \{x \in \mathbb{R} \mid x \le 0\}$ by $\sqrt[+]{1} = 1$. Given $a, b \in \mathbb{C}$ with $a \neq b$ and $a \preceq b$, we denote by $\sqrt[s]{(a-z)(b-z)}$ the standard branch of the square root, defined on $\mathbb{C} \setminus [a,b]$ and determined by
\[ \sqrt[s]{(a-z)(b-z)}\,\Big|_{z = b + (b-a)} = -\sqrt[+]{2}\,(b-a), \tag{2.4} \]
where $[a,b]$ denotes the interval $\{ta + (1-t)b \mid 0 \le t \le 1\}$ in $\mathbb{C}$. Using the product representation of $\Delta^2(\lambda,\varphi) - 4$ (cf. Proposition 2.4 (i)), we now define, for $\lambda \in \mathbb{C} \setminus (\cup_{k\in\mathbb{Z}} [\lambda_k^-, \lambda_k^+])$ and $\varphi \in \mathcal{L}^2$, the canonical square root $\sqrt[c]{\Delta^2(\lambda,\varphi) - 4}$ by
\[ \sqrt[c]{\Delta^2(\lambda,\varphi) - 4} := 2i\,\sqrt[s]{(\lambda_0^-(\varphi) - \lambda)(\lambda_0^+(\varphi) - \lambda)} \prod_{k\neq 0} \frac{\sqrt[s]{(\lambda_k^-(\varphi) - \lambda)(\lambda_k^+(\varphi) - \lambda)}}{k\pi}. \tag{2.5} \]
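One easily sees that for any $\varphi \in \mathcal{L}^2_{\mathbb{R}}$ and $\lambda \in [\lambda_k^-(\varphi), \lambda_k^+(\varphi)] \subset \mathbb{R}$,
\[ \pm(-1)^k\,\sqrt[c]{\Delta^2(\lambda \pm io,\varphi) - 4} > 0, \tag{2.6} \]
where $o$ denotes a real positive infinitesimal increment.

$\gamma$-branch. Recall that for any $k \in \mathbb{Z}$ we denote by $D_k$ the disk in $\mathbb{C}$ with center $k\pi$ and radius $\pi/4$.

Proposition 2.6. There exists a neighborhood $W$ of 0 in $\mathcal{L}^2$ so that for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, the following properties hold: For any $k \in \mathbb{Z}$ there exists a smooth arc $\gamma_k \subset D_k$ from $\lambda_k^-(\varphi)$ to $\lambda_k^+(\varphi)$ such that
(i) $\Delta(\lambda,\varphi) \in \mathbb{R}$ for any $\lambda \in \cup_{k\in\mathbb{Z}} \gamma_k$;
(ii) the orthogonal projection of $\gamma_k$ to the imaginary axis is a diffeomorphism onto its image;
(iii) $\bar\gamma_k = \gamma_k$;
(iv) $\dot\lambda_k \in \gamma_k \cap \mathbb{R}$;
(v) $\Delta^2(\lambda,\varphi) - 4 < 0$ for any $\lambda \in \cup_{k\in\mathbb{Z}} (\gamma_k \setminus \{\lambda_k^+, \lambda_k^-\})$.

Remark. For related results for non-selfadjoint Hill's operators see also [12].

Proof. For any $\lambda \in \mathbb{C}$, write $\lambda = u + iv$ with $u, v \in \mathbb{R}$ and let $\Delta = \Delta_1 + i\Delta_2$, where $\Delta_1(u,v;\varphi) := \mathrm{Re}(\Delta(u+iv,\varphi))$ and $\Delta_2(u,v;\varphi) := \mathrm{Im}(\Delta(u+iv,\varphi))$. For any given $\varphi \in i\mathcal{L}^2_{\mathbb{R}}$ we want to study the zero level set of $\Delta_2(\lambda,\varphi) \equiv \Delta_2(u,v;\varphi)$ in $\mathbb{C}$. To this end, consider the function
\[ F(u,v;\varphi) := \Delta_2(u,v;\varphi)/v. \tag{2.7} \]
By Proposition 2.4 (iii), $\Delta(\lambda,\varphi)$ is real-valued on $\mathbb{R} \times i\mathcal{L}^2_{\mathbb{R}}$. Hence, $\Delta_2(u,v;\varphi) = 0$ for $\lambda \in \mathbb{R}$, and thus, for any $\varphi \in i\mathcal{L}^2_{\mathbb{R}}$,
\[ F(u,v;\varphi) = \int_0^1 (\partial_v\Delta_2)(u,vt;\varphi)\,dt. \tag{2.8} \]
As $\Delta(\lambda,\varphi)$ is an analytic function on $\mathbb{C} \times \mathcal{L}^2$, $F$ is a real analytic function on $\mathbb{R} \times \mathbb{R} \times i\mathcal{L}^2_{\mathbb{R}}$, hence has an analytic extension to a neighborhood of $\mathbb{R} \times \mathbb{R} \times i\mathcal{L}^2_{\mathbb{R}}$ in $\mathbb{C} \times \mathbb{C} \times \mathcal{L}^2$ which we again denote by $F$. Note that for any given $\varphi \in i\mathcal{L}^2_{\mathbb{R}}$, the functions $F(\cdot,\cdot;\varphi)$ and $\Delta_2(\cdot,\cdot;\varphi)$ have the same zeroes in $\mathbb{R} \times (\mathbb{R} \setminus \{0\})$, hence it suffices to study the zero level sets of $F$. To this end consider the following map:
\[ \mathcal{F} : B^\infty \times (-1,1) \times i\mathcal{L}^2_{\mathbb{R}} \to \ell^\infty \equiv \ell^\infty(\mathbb{Z},\mathbb{R}) \tag{2.9} \]
defined by $\mathcal{F} = (\mathcal{F}_k)_{k\in\mathbb{Z}}$ with
\[ \mathcal{F}_k(u,v;\varphi) := F(k\pi + u_k, v;\varphi), \qquad u := (u_k)_{k\in\mathbb{Z}}. \tag{2.10} \]
Here $B^\infty := \{u \in \ell^\infty \mid \|u\|_\infty < 1\}$.¹ It follows from (2.8), Cauchy's inequality (see e.g. Lemma A.2 in [8]) and Lemma I.2 in [6] that $\mathcal{F} : B^\infty \times (-1,1) \times i\mathcal{L}^2_{\mathbb{R}} \to \ell^\infty$ extends to a locally bounded function $\mathcal{F}_{\mathbb{C}} : V_{\mathbb{C}} \to \ell^\infty(\mathbb{Z},\mathbb{C})$, where $V_{\mathbb{C}}$ is an open neighborhood of $B^\infty \times (-1,1) \times i\mathcal{L}^2_{\mathbb{R}}$ in $(B^\infty \times (-1,1) \times i\mathcal{L}^2_{\mathbb{R}}) \otimes \mathbb{C}$. As for any $k \in \mathbb{Z}$ the component $\mathcal{F}_k$ is analytic on $V_{\mathbb{C}}$ (cf. (2.8)) we conclude from Theorem A.3 in [8] that $\mathcal{F}$ is real analytic on $B^\infty \times (-1,1) \times i\mathcal{L}^2_{\mathbb{R}}$.

¹ More generally, for any $\delta > 0$ let $B^\infty_\delta := \{u \in \ell^\infty \mid \|u\|_\infty < \delta\}$.

Note that $\Delta(\lambda,0) = 2\cos\lambda$ and $\Delta_2(u,v;0) = -2\sin u \sinh v$. Hence,
\[ \mathcal{F}|_{u=0,v=0,\varphi=0} = (-2\sin k\pi)_{k\in\mathbb{Z}} \equiv 0 \quad \text{and} \quad \frac{\partial\mathcal{F}}{\partial u}\Big|_{u=0,v=0,\varphi=0} = 2\,\mathrm{diag}\big((-1)^{k+1}\big)_{k\in\mathbb{Z}}. \]
By the implicit function theorem there exist an open neighborhood $W_1$ of $\varphi = 0$ in $i\mathcal{L}^2_{\mathbb{R}}$, $\varepsilon > 0$, and a real analytic function
\[ G : (-\varepsilon,\varepsilon) \times W_1 \to \ell^\infty, \qquad G = (g_k)_{k\in\mathbb{Z}}, \]
such that for any $v \in (-\varepsilon,\varepsilon)$ and any $\varphi \in W_1$, $\mathcal{F}(G(v,\varphi),v;\varphi) = 0$. Moreover, there exists $\delta > 0$ such that the map
\[ (-\varepsilon,\varepsilon) \times W_1 \to B^\infty \times (-\varepsilon,\varepsilon) \times W_1, \qquad (v,\varphi) \mapsto (G(v,\varphi),v,\varphi), \]
parametrizes the zero level set of $\mathcal{F}$ in $B^\infty_\delta \times (-\varepsilon,\varepsilon) \times W_1$. In particular, for any $\varphi \in W_1$ and any $k \in \mathbb{Z}$, the intersection of the zero level set of $F$ with
\[ D_k^\varepsilon := \{\lambda \in \mathbb{C} \mid |\mathrm{Re}(\lambda) - k\pi| < \delta,\ |\mathrm{Im}(\lambda)| < \varepsilon\} \]
is parametrized by
\[ z_k : (-\varepsilon,\varepsilon) \to D_k^\varepsilon, \qquad v \mapsto k\pi + g_k(v,\varphi) + iv. \]
Let $\tilde\gamma_k := \mathrm{Image}(z_k) \subseteq D_k^\varepsilon$. By definition (2.7) of $F$, $\tilde\gamma_k \setminus \mathbb{R}$ coincides with the intersection of the zero level set of $\Delta_2$ with $D_k^\varepsilon \setminus \mathbb{R}$. As $\Delta(\lambda,\varphi)$ is real for $\lambda \in \mathbb{R}$, we see that the intersection of the zero level set of $\Delta_2$ with $D_k^\varepsilon$ coincides with $Z_k := \tilde\gamma_k \cup (D_k^\varepsilon \cap \mathbb{R}) \subseteq \mathbb{C}$. Hence, for any $\varphi \in W_1$ and any $k \in \mathbb{Z}$, any complex number $\lambda \in D_k^\varepsilon$ satisfies
\[ \Delta(\lambda,\varphi) \in \mathbb{R} \iff \lambda \in Z_k. \tag{2.11} \]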
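At the zero potential the objects in this proof are explicit, which allows a quick numerical sanity check of the starting point of the implicit function argument. The sketch below assumes only $\Delta(\lambda,0) = 2\cos\lambda$ as stated above; the helper names `delta2` and `F0` are hypothetical. It checks that $\Delta_2(u,v;0) = -2\sin u \sinh v$, that $\partial_u F(k\pi,0;0) = 2(-1)^{k+1}$, and that $\Delta(\cdot,0)$ is real on the vertical lines $\mathrm{Re}\,\lambda = k\pi$, so that $g_k \equiv 0$ for $\varphi = 0$.

```python
import cmath
import math

def delta2(u, v):
    """Imaginary part of Delta(u + i v, 0) = 2 cos(u + i v)."""
    return (2 * cmath.cos(complex(u, v))).imag

def F0(u, v):
    """F(u, v; 0) = Delta_2(u, v; 0) / v, extended by its limit -2 sin(u) at v = 0."""
    if v == 0.0:
        return -2 * math.sin(u)
    return delta2(u, v) / v

# Closed form of the imaginary part at a few sample points.
closed_form_errors = [
    abs(delta2(u, v) - (-2 * math.sin(u) * math.sinh(v)))
    for u in (-1.3, 0.2, 2.5) for v in (-0.4, 0.1, 0.9)
]

# Central finite difference for the u-derivative of F at (k*pi, 0).
h = 1e-6
du_F = {k: (F0(k * math.pi + h, 0.0) - F0(k * math.pi - h, 0.0)) / (2 * h)
        for k in range(-3, 4)}

# Delta(., 0) is real on the lines Re(lambda) = k*pi.
imag_on_lines = [abs(delta2(k * math.pi, v))
                 for k in range(-3, 4) for v in (-0.5, 0.25, 0.75)]
```

The nonvanishing diagonal entries $2(-1)^{k+1}$ are what make the implicit function theorem applicable, uniformly in $k$.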
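Recall that at $\varphi = 0$, $\lambda_k^\pm = \dot\lambda_k = k\pi$ and $\Delta(\lambda_k^\pm) = 2(-1)^k$. Hence, by Proposition 2.1 and Proposition 2.4 (ii) there exists an open neighborhood $W$ of $\varphi = 0$ in $\mathcal{L}^2$ such that $W \cap i\mathcal{L}^2_{\mathbb{R}} \subseteq W_1$ and for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$ and any $k \in \mathbb{Z}$,
\[ \lambda_k^\pm(\varphi),\ \dot\lambda_k(\varphi) \in D_k^\varepsilon \quad \text{and} \quad \Delta(\lambda_k^\pm(\varphi),\varphi) = 2(-1)^k. \]
Using that $\Delta(\lambda_k^\pm(\varphi),\varphi) = 2(-1)^k$ as well as the symmetry $\overline{\Delta(\bar\lambda,\varphi)} = \Delta(\lambda,\varphi)$ one sees that for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$ and any $k \in \mathbb{Z}$,
\[ \lambda_k^\pm(\varphi),\ \dot\lambda_k(\varphi) \in Z_k. \]
Now one easily sees that for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, $\gamma_k := \tilde\gamma_k \cap \{\lambda \in \mathbb{C} \mid |\Delta(\lambda)| \le 2\}$ has the claimed properties.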
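Let $W$ be a neighborhood of 0 in $\mathcal{L}^2$ as in Proposition 2.6. For $\varphi$ in $W \cap i\mathcal{L}^2_{\mathbb{R}}$ we now define the following modification $\sqrt[\gamma_k]{(\lambda_k^-(\varphi) - \lambda)(\lambda_k^+(\varphi) - \lambda)}$ of the standard branch of the square root defined by (2.4): first define it for $\lambda \in \mathbb{C} \setminus D_k$ by
\[ \sqrt[\gamma_k]{(\lambda_k^-(\varphi) - \lambda)(\lambda_k^+(\varphi) - \lambda)} := \sqrt[s]{(\lambda_k^-(\varphi) - \lambda)(\lambda_k^+(\varphi) - \lambda)}, \tag{2.12} \]
and then extend it by analyticity to $\mathbb{C} \setminus \gamma_k$. The $\gamma$-root of $\Delta^2(\lambda,\varphi) - 4$ in $\mathbb{C} \setminus \cup_{k\in\mathbb{Z}} \gamma_k$ is defined by
\[ \sqrt[\gamma]{\Delta^2(\lambda,\varphi) - 4} := 2i\,\sqrt[\gamma_0]{(\lambda_0^-(\varphi) - \lambda)(\lambda_0^+(\varphi) - \lambda)} \prod_{k\neq 0} \frac{\sqrt[\gamma_k]{(\lambda_k^-(\varphi) - \lambda)(\lambda_k^+(\varphi) - \lambda)}}{k\pi}. \tag{2.13} \]
Similarly as for the canonical root of $\Delta^2(\lambda,\varphi) - 4$ for $\varphi \in \mathcal{L}^2_{\mathbb{R}}$, one verifies that for any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, $k \in \mathbb{Z}$ and $\lambda \in \gamma_k$, we have
\[ \pm(-1)^k\,i\,\sqrt[\gamma]{\Delta^2(\lambda \pm o,\varphi) - 4} > 0, \tag{2.14} \]
where $o$ denotes a real positive infinitesimal increment.

2.3. Action variables for dNLS and their analytic extensions. Let $\varphi \in \mathcal{L}^2_{\mathbb{R}}$ be a potential of real type. Following [6], Sect. III.1, we associate to $\varphi$ the $k$th action variable
\[ I_k(\varphi) := \frac{1}{\pi}\int_{\Gamma_k} \lambda\,\frac{\dot\Delta(\lambda,\varphi)}{\sqrt[c]{\Delta^2(\lambda,\varphi) - 4}}\,d\lambda, \tag{2.15} \]
where $\Gamma_k$ is a counterclockwise oriented contour in $\mathbb{C}$ around the interval $[\lambda_k^-(\varphi), \lambda_k^+(\varphi)]$. The $\Gamma_k$ are chosen so small that together with their interiors they do not intersect each other. Alternatively, $I_k$ can be written as
\[ I_k(\varphi) = \frac{1}{\pi}\int_{\Gamma_k} \log\Big((-1)^k\big(\Delta(\lambda,\varphi) - \sqrt[c]{\Delta^2(\lambda,\varphi) - 4}\big)\Big)\,d\lambda. \tag{2.16} \]
By [6], Theorem III.2 and [6], Prop. III.21, we have the following results:

Proposition 2.7. There exists a neighborhood $W$ of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$ such that for any $k \in \mathbb{Z}$, the action variable $I_k$ analytically extends to potentials $\varphi \in W$ and
(i) (2.15)–(2.16) hold on $W$,
(ii) $\{I_j, I_k\} = 0$ for any $j, k \in \mathbb{Z}$.

Proof. By Theorem III.2 in [6], $I_k$ and $I_j$ are real analytic functions on $\mathcal{L}^2_{\mathbb{R}}$. Hence by Proposition III.24 in [6], $\{I_k, I_j\}$ is real analytic as well and $\{I_k, I_j\}|_{\mathcal{L}^2_{\mathbb{R}}} = 0$. This shows that $\{I_k, I_j\} = 0$ in some neighborhood of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$.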
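2.4. Angle variables for dNLS and their analytic extensions. Let $\varphi \in \mathcal{L}^2_{\mathbb{R}}$ and denote by $\Sigma(\varphi)$ the curve
\[ \Sigma(\varphi) = \{(\lambda,z) : z^2 = \Delta^2(\varphi,\lambda) - 4\} \subset \mathbb{C}^2. \]
In view of definition (2.2), for any Dirichlet eigenvalue $\mu_k$ of $L(\varphi)$ one has
\[ (M_{11} + M_{12})|_{1,\mu_k} = (M_{21} + M_{22})|_{1,\mu_k}. \tag{2.17} \]
Using (2.17) and the Wronskian identity $\det M(1,\lambda) = 1$, it follows that
\[ \Delta^2(\mu_k,\varphi) - 4 = (M_{21} + M_{12})^2|_{1,\mu_k}. \]
The latter identity allows us to choose a sign of the root $\sqrt[*]{\Delta^2(\mu_k,\varphi) - 4}$,
\[ \sqrt[*]{\Delta^2(\mu_k,\varphi) - 4} := (M_{21} + M_{12})|_{1,\mu_k}, \]
and hence the point $\mu_k^*$ on $\Sigma(\varphi)$,
\[ \mu_k^* = \Big(\mu_k,\ \sqrt[*]{\Delta^2(\mu_k,\varphi) - 4}\Big) := \big(\mu_k,\ (M_{21} + M_{12})|_{1,\mu_k}\big). \]
We refer to $\mu_k^*$ as a Dirichlet divisor. Following [6], Sect. III.3, we can associate to $\varphi \in \mathcal{L}^2_{\mathbb{R}}$, for any $k \in \mathbb{Z}$ with $\lambda_k^- < \lambda_k^+$, the $k$th angle variable $\theta_k(\varphi)$, defined by the following path integral on $\Sigma(\varphi)$:
\[ \theta_k(\varphi) := \sum_{j\in\mathbb{Z}} \int_{\lambda_j^-}^{\mu_j^*} \frac{\chi_k(\lambda)}{\sqrt{\Delta^2(\lambda) - 4}}\,d\lambda \mod 2\pi, \tag{2.18} \]
where $\chi_n(\lambda) \equiv \chi_n(\lambda,\varphi)$, $n \in \mathbb{Z}$, is a family of analytic functions on $\mathbb{C} \times \mathcal{L}^2_{\mathbb{R}}$ uniquely determined by the normalization conditions
\[ \frac{1}{2\pi}\int_{\Gamma_j} \frac{\chi_n(\lambda)}{\sqrt[c]{\Delta^2(\lambda) - 4}}\,d\lambda = \delta_{jn} \quad \forall j, n \in \mathbb{Z}. \tag{2.19} \]
Each angle variable is real-analytic modulo $2\pi$ on the (dense) domain $\mathcal{L}^2_{\mathbb{R}} \setminus \mathcal{D}_k$, where
\[ \mathcal{D}_k := \{\varphi \in \mathcal{L}^2 \mid \lambda_k^-(\varphi) = \lambda_k^+(\varphi)\}. \tag{2.20} \]
In fact, the right-hand side of (2.18), when taken modulo $\pi$, analytically extends to $W \setminus \mathcal{D}_k$, where $W$ is a (sufficiently small) neighborhood of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$ which is independent of $k$ (cf. Theorem III.10 in [6]). By Theorem III.10, Proposition III.24, and Proposition III.25 in [6], the following results hold.

Proposition 2.8. There exists a neighborhood $W$ of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$ so that for any $k \in \mathbb{Z}$, $\chi_k$ extends analytically to $\mathbb{C} \times W$ and $\theta_k$, when taken modulo $\pi$, analytically extends to $W \setminus \mathcal{D}_k$, satisfying the following properties:
(i) relations (2.18) and (2.19) hold for any $k, n, j \in \mathbb{Z}$;
(ii) $\{I_j, \theta_k\} = \delta_{jk}$ on $W \setminus \mathcal{D}_k$ for any $j, k \in \mathbb{Z}$;
(iii) $\{\theta_j, \theta_k\} = 0$ on $W \setminus (\mathcal{D}_k \cup \mathcal{D}_j)$, for any $k, j \in \mathbb{Z}$.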
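The algebraic identity behind the definition of the Dirichlet divisor in Sect. 2.4, namely that (2.17) together with $\det M = 1$ forces $\Delta^2 - 4 = (M_{21} + M_{12})^2$, can be checked on arbitrary numbers. The sketch below does not solve the ZS system: it assumes two freely chosen complex entries and builds a hypothetical matrix satisfying both constraints; the function name is an illustration only.

```python
import cmath

def matrix_with_dirichlet_relation(m12, m21):
    """Return (m11, m12, m21, m22) with det = 1 and m11 + m12 = m21 + m22.

    The relation (2.17) forces m22 = m11 + d with d = m12 - m21, and
    det = m11 * (m11 + d) - m12 * m21 = 1 is then a quadratic equation for m11.
    """
    d = m12 - m21
    m11 = (-d + cmath.sqrt(d * d + 4 * (1 + m12 * m21))) / 2
    m22 = m11 + d
    return m11, m12, m21, m22

# Arbitrary sample entries (not derived from any potential).
m11, m12, m21, m22 = matrix_with_dirichlet_relation(0.3 + 0.2j, -1.1 + 0.5j)

det_error = abs(m11 * m22 - m12 * m21 - 1)
relation_error = abs((m11 + m12) - (m21 + m22))
trace = m11 + m22
identity_error = abs((trace ** 2 - 4) - (m21 + m12) ** 2)
```

The underlying computation is $\Delta^2 - 4 = (M_{11} - M_{22})^2 + 4M_{12}M_{21}$ by $\det M = 1$, and $M_{11} - M_{22} = M_{21} - M_{12}$ by (2.17).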
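2.5. Birkhoff coordinates for dNLS and their analytic extensions. In [6], Chapt. III, it is shown that the map
\[ \Phi : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2}, \qquad \varphi \mapsto \Phi(\varphi) = (x_k(\varphi), y_k(\varphi))_{k\in\mathbb{Z}}, \]
given by
\[ (x_k(\varphi), y_k(\varphi)) = \begin{cases} \sqrt{2I_k(\varphi)}\,(\cos\theta_k(\varphi), \sin\theta_k(\varphi)) & \text{if } \varphi \in \mathcal{L}^2_{\mathbb{R}} \setminus \mathcal{D}_k, \\ (0,0) & \text{if } \varphi \in \mathcal{L}^2_{\mathbb{R}} \cap \mathcal{D}_k, \end{cases} \]
defines global Birkhoff coordinates. More precisely, the following theorem holds.

Theorem 2.1. The map $\Phi : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2}$ is a diffeomorphism with the following properties:
(i) $\Phi$ is bi-analytic and preserves the Poisson bracket.
(ii) The coordinates $(x_k,y_k)_{k\in\mathbb{Z}} = \Phi(\varphi)$ are Birkhoff coordinates for the defocusing NLS equation (and its hierarchy), i.e. for $\varphi \in \mathcal{L}^2_{\mathbb{R}} \cap (H^1 \times H^1)$, the push forward $\mathcal{H}_d \circ \Phi^{-1}$ of the dNLS-Hamiltonian $\mathcal{H}_d$ depends only on the action variables $I_k = \frac{1}{2}(x_k^2 + y_k^2)$, $k \in \mathbb{Z}$.
(iii) The differential at 0, $d_0\Phi : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2}$, is the Fourier transform (cf. [6], Prop. III.20). More precisely, for any $f = (f_1,f_2) \in \mathcal{L}^2_{\mathbb{R}}$, the image $(\xi,\eta) := d_0\Phi(f_1,f_2)$ is given by
\[ (\xi_k,\eta_k) = -\left(\frac{\hat f_1(-k) + \hat f_2(k)}{\sqrt 2},\ i\,\frac{\hat f_1(-k) - \hat f_2(k)}{\sqrt 2}\right) \tag{2.21} \]
or
\[ (\xi_k,\eta_k) = -\big(\sqrt 2\,\mathrm{Re}\,\hat f_2(k),\ \sqrt 2\,\mathrm{Im}\,\hat f_2(k)\big). \tag{2.22} \]
(iv) For any $N \ge 1$, $\Phi$ maps $\mathcal{L}^2_{\mathbb{R}} \cap (H^N \times H^N)$ diffeomorphically onto $\ell^2_{\mathbb{R}^2} \cap (\ell^2_N \times \ell^2_N)$.

By Theorem 2.1, the map $\Phi : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2}$ extends to an analytic map on a neighborhood of $\mathcal{L}^2_{\mathbb{R}}$ in $\mathcal{L}^2$ with values in $\ell^2_{\mathbb{C}^2}$:

Proposition 2.9. There exists a neighborhood $W$ of 0 in $\mathcal{L}^2$, and a neighborhood $U$ of 0 in $\ell^2_{\mathbb{C}^2}$ such that $\Phi$ analytically extends to a map $W \to U$, which we again denote by $\Phi$, satisfying the following properties:
(i) $\Phi$ is 1 – 1, onto, bi-analytic and preserves the Poisson bracket.
(ii) The push forward $\mathcal{H} \circ \Phi^{-1}$ of the Hamiltonian (1.3), restricted to $U \cap (\ell^2_1 \times \ell^2_1)$, depends only on the action variables $I_k = \frac{1}{2}(x_k^2 + y_k^2)$, $k \in \mathbb{Z}$.
(iii) The differential at 0, $d_0\Phi : \mathcal{L}^2 \to \ell^2_{\mathbb{C}^2}$, is the Fourier transform and is given by the formula (2.21) for arbitrary elements $(f_1,f_2) \in \mathcal{L}^2$.
(iv) For any $N \ge 0$, the restriction of $\Phi$ to $W \cap (H^N \times H^N)$ is a diffeomorphism $W \cap (H^N \times H^N) \to U \cap (\ell^2_N \times \ell^2_N)$.

Proof. By Theorem 2.1,
\[ (d_0\Phi)|_{\mathcal{L}^2_{\mathbb{R}}} = d_0(\Phi|_{\mathcal{L}^2_{\mathbb{R}}}) : \mathcal{L}^2_{\mathbb{R}} \to \ell^2_{\mathbb{R}^2} \]
is a linear $\mathbb{R}$-isomorphism given by formula (2.21). As $\Phi$ is real analytic it then follows that $d_0\Phi : \mathcal{L}^2 \to \ell^2_{\mathbb{C}^2}$ is a $\mathbb{C}$-linear isomorphism given by formula (2.21). The claimed statements then follow from the inverse function theorem and Theorem 2.1.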
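That (2.21) and (2.22) agree on potentials of real type, where $f_1 = \bar f_2$ and hence $\hat f_1(-k) = \overline{\hat f_2(k)}$, can be confirmed on a trigonometric polynomial. The sketch below works directly with Fourier coefficients; the name `d0_phi` and the sample coefficients are hypothetical.

```python
import math

def d0_phi(f1_hat, f2_hat, k):
    """(xi_k, eta_k) according to formula (2.21).

    f1_hat and f2_hat are callables returning the Fourier coefficient
    of f_1, respectively f_2, at an integer index.
    """
    a = f1_hat(-k)
    b = f2_hat(k)
    return (-(a + b) / math.sqrt(2), -1j * (a - b) / math.sqrt(2))

# Sample Fourier coefficients of f_2 (finitely many nonzero, chosen arbitrarily).
c2 = {0: 1 + 2j, 2: -0.5j, -3: complex(0.25)}

def f2_hat(k):
    return c2.get(k, 0j)

def f1_hat(j):
    # Real type: f_1 is the complex conjugate of f_2, so f1_hat(j) = conj(f2_hat(-j)).
    return f2_hat(-j).conjugate()

# Largest deviation between (2.21) and (2.22) over a range of indices.
deviation = 0.0
for k in range(-4, 5):
    xi, eta = d0_phi(f1_hat, f2_hat, k)
    xi_22 = -math.sqrt(2) * f2_hat(k).real
    eta_22 = -math.sqrt(2) * f2_hat(k).imag
    deviation = max(deviation, abs(xi - xi_22), abs(eta - eta_22))
```

In particular, $\xi_k$ and $\eta_k$ come out real on $\mathcal{L}^2_{\mathbb{R}}$, consistent with $d_0\Phi$ taking values in $\ell^2_{\mathbb{R}^2}$.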
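3. Actions

In this section we want to show that the action variables for $\varphi$ in a neighborhood of 0 in $i\mathcal{L}^2_{\mathbb{R}}$ are real valued. Let $W$ be a neighborhood of 0 in $\mathcal{L}^2$ such that Proposition 2.5, Proposition 2.6, and Proposition 2.9 hold. The main result of this section is the following one.

Proposition 3.1. For any $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, the action variables (2.15) are real valued.

Proof. We have to show that for any $k \in \mathbb{Z}$, $\overline{I_k} = I_k$. By (2.15) and Proposition 2.7 (i),
\[ I_k = \frac{1}{\pi}\int_{\Gamma_k} \lambda\,\frac{\dot\Delta(\lambda)}{\sqrt[c]{\Delta^2(\lambda) - 4}}\,d\lambda, \tag{3.1} \]
where we chose $\Gamma_k$ to be the (counterclockwise oriented) circle in $\mathbb{C}$ of center $k\pi$ and radius $\pi/4$. Then
\[ \overline{I_k} = \frac{1}{\pi}\int_{\Gamma_k} \bar\lambda\,\frac{\overline{\dot\Delta(\lambda)}}{\overline{\sqrt[c]{\Delta^2(\lambda) - 4}}}\,\overline{d\lambda}. \tag{3.2} \]
As $\lambda_k^- = \overline{\lambda_k^+}$ by Proposition 2.2, it follows from the definition of the standard branch of the square root (cf. Sect. 2.2) that
\[ \overline{\sqrt[s]{(\lambda_k^- - \lambda)(\lambda_k^+ - \lambda)}} = \sqrt[s]{(\lambda_k^- - \bar\lambda)(\lambda_k^+ - \bar\lambda)}, \]
and thus by (2.5),
\[ \overline{\sqrt[c]{\Delta^2(\lambda) - 4}} = -\sqrt[c]{\Delta^2(\bar\lambda) - 4}. \]
When combined with Proposition 2.4 (iii), formula (3.2) becomes
\[ \overline{I_k} = \frac{1}{\pi}\int_{\Gamma_k} \bar\lambda\,\frac{\dot\Delta(\bar\lambda)}{-\sqrt[c]{\Delta^2(\bar\lambda) - 4}}\,\overline{d\lambda}. \]
Parametrize $\Gamma_k$ by $\lambda(t) = k\pi + \frac{\pi}{4}e^{it}$ with $0 \le t \le 2\pi$. Then $\overline{\lambda(t)} = \lambda(-t)$ and $\overline{d\lambda} = -\frac{i\pi}{4}e^{-it}\,dt$, and thus
\[ \overline{I_k} = \frac{1}{\pi}\int_0^{2\pi} \lambda(-t)\,\frac{\dot\Delta(\lambda(-t))}{\sqrt[c]{\Delta^2(\lambda(-t)) - 4}}\;i\,\frac{\pi}{4}\,e^{i(-t)}\,dt = \frac{1}{\pi}\int_0^{2\pi} \lambda(s)\,\frac{\dot\Delta(\lambda(s))}{\sqrt[c]{\Delta^2(\lambda(s)) - 4}}\;i\,\frac{\pi}{4}\,e^{is}\,ds = I_k, \]
where for the latter identity we used again (3.1).

In fact, one can show that for $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, the action variables are nonpositive.

Proposition 3.2. Let $W$ be the neighborhood of 0 in $\mathcal{L}^2$ as in Proposition 3.1. Then for any $k \in \mathbb{Z}$ and $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R}}$, we have $I_k \le 0$.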
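Proof. It follows from Proposition 2.7 (i) that for any $k \in \mathbb{Z}$,
\[ I_k(\varphi) = \frac{1}{\pi}\int_{\Gamma_k} \log\Big((-1)^k\big(\Delta(\lambda,\varphi) - \sqrt[c]{\Delta^2(\lambda,\varphi) - 4}\big)\Big)\,d\lambda. \tag{3.3} \]
If $\lambda_k^- = \lambda_k^+$ then (3.3) shows that $I_k(\varphi) = 0$. So for the rest of the proof we assume that $\lambda_k^- \neq \lambda_k^+$. By (2.5), (2.12), and (2.13) one has $\sqrt[c]{\Delta^2(\lambda) - 4} = \sqrt[\gamma]{\Delta^2(\lambda) - 4}$ for $\lambda \in \cup_{k\in\mathbb{Z}} \Gamma_k$ and by (2.14), for any $k \in \mathbb{Z}$, $\lambda \in \gamma_k \setminus \{\lambda_k^-, \lambda_k^+\}$, and $\epsilon \in \{-1,+1\}$,
\[ \epsilon\,(-1)^k\,i\,\sqrt[\gamma]{\Delta^2(\lambda + \epsilon o) - 4} > 0. \tag{3.4} \]
Hence, by Proposition 2.6 (v),
\[ \epsilon\,(-1)^k\,i\,\sqrt[\gamma]{\Delta^2(\lambda + \epsilon o) - 4} = \sqrt[+]{4 - \Delta^2(\lambda)}, \tag{3.5} \]
where $o$ denotes a real positive infinitesimal increment. In addition, it follows from (3.4) that for any $\lambda \in \cup_{k\in\mathbb{Z}} (\gamma_k \setminus \{\lambda_k^-, \lambda_k^+\})$ the imaginary part of $\sqrt[\gamma]{\Delta^2(\lambda \pm o) - 4}$ does not vanish. Hence, the sign of this imaginary part remains constant. As a consequence, for $\lambda \in \gamma_k \setminus \{\lambda_k^+, \lambda_k^-\}$, the principal branch of the logarithm
\[ \log\Big((-1)^k\big(\Delta(\lambda) - \sqrt[\gamma]{\Delta^2(\lambda \pm o) - 4}\big)\Big) \]
is well defined. By shrinking the contour $\Gamma_k$ to $\gamma_k$, and assuming that $\gamma_k$ is oriented, issuing from $\lambda_k^-$ and ending at $\lambda_k^+$, we can write
\[ I_k(\varphi) = \frac{1}{\pi}\int_{\gamma_k} \log\Big((-1)^k\big(\Delta(\lambda) - \sqrt[\gamma]{\Delta^2(\lambda + o) - 4}\big)\Big)\,d\lambda - \frac{1}{\pi}\int_{\gamma_k} \log\Big((-1)^k\big(\Delta(\lambda) - \sqrt[\gamma]{\Delta^2(\lambda - o) - 4}\big)\Big)\,d\lambda. \]
As by (3.5), for any $\epsilon \in \{-1,+1\}$,
\[ (-1)^k\,\sqrt[\gamma]{\Delta^2(\lambda + \epsilon o) - 4} = -\epsilon\,i\,\sqrt[+]{4 - \Delta^2(\lambda)}, \]
it then follows that
\[ I_k(\varphi) = \frac{1}{\pi}\int_{\gamma_k} \log\Big((-1)^k\Delta(\lambda) + i\sqrt[+]{4 - \Delta^2(\lambda)}\Big)\,d\lambda - \frac{1}{\pi}\int_{\gamma_k} \log\Big((-1)^k\Delta(\lambda) - i\sqrt[+]{4 - \Delta^2(\lambda)}\Big)\,d\lambda. \tag{3.6} \]
Using that for $\lambda \in \gamma_k$,
\[ \Big|(-1)^k\Delta(\lambda) + i\sqrt[+]{4 - \Delta^2(\lambda)}\Big| = \Big|(-1)^k\Delta(\lambda) - i\sqrt[+]{4 - \Delta^2(\lambda)}\Big|, \]
one sees that
\[ \mathrm{Re}\,\log\Big((-1)^k\Delta(\lambda) + i\sqrt[+]{4 - \Delta^2(\lambda)}\Big) = \mathrm{Re}\,\log\Big((-1)^k\Delta(\lambda) - i\sqrt[+]{4 - \Delta^2(\lambda)}\Big). \]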
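Moreover, as $\Delta(\lambda)$ is real valued and $-2 \le \Delta(\lambda) \le 2$ for $\lambda \in \gamma_k$, one has
\[ \mathrm{Im}\,\log\Big((-1)^k\Delta(\lambda) + i\sqrt[+]{4 - \Delta^2(\lambda)}\Big) = -\mathrm{Im}\,\log\Big((-1)^k\Delta(\lambda) - i\sqrt[+]{4 - \Delta^2(\lambda)}\Big). \]
Hence (3.6) leads to the identity
\[ I_k(\varphi) = \frac{2}{\pi}\,i\int_{\gamma_k} \mathrm{Im}\,\log\Big((-1)^k\Delta(\lambda) + i\sqrt[+]{4 - \Delta^2(\lambda)}\Big)\,d\lambda. \]
To evaluate the latter integral, parametrize the path $\gamma_k$ by the imaginary part. By Proposition 2.6 (ii) this is possible, i.e. there exists a $C^1$-curve $t \mapsto a(t)$ so that
\[ \lambda(t) = a(t) + ti, \qquad |t| \le \mathrm{Im}\,\lambda_k^+. \]
Then, with $\dot a(t) = \frac{d}{dt}a(t)$, $d\lambda = (\dot a + i)\,dt$. As the action variables are real valued by Proposition 3.1, we get
\[ I_k(\varphi) = -\frac{2}{\pi}\int_{\mathrm{Im}\,\lambda_k^-}^{\mathrm{Im}\,\lambda_k^+} \mathrm{Im}\,\log\Big((-1)^k\Delta(\lambda(t)) + i\sqrt[+]{4 - \Delta^2(\lambda(t))}\Big)\,dt. \]
Since for any $|t| < \mathrm{Im}\,\lambda_k^+$,
\[ \mathrm{Im}\Big((-1)^k\Delta(\lambda(t)) + i\sqrt[+]{4 - \Delta^2(\lambda(t))}\Big) = \sqrt[+]{4 - \Delta^2(\lambda(t))} > 0, \]
one concludes that
\[ \mathrm{Im}\,\log\Big((-1)^k\Delta(\lambda(t)) + i\sqrt[+]{4 - \Delta^2(\lambda(t))}\Big) \in (0,\pi). \]
Thus we have shown that $I_k(\varphi) < 0$ for any $k \in \mathbb{Z}$ with $\lambda_k^- \neq \lambda_k^+$.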
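The elementary facts about the logarithm used in this proof can be illustrated numerically: for a real value $-2 < \Delta < 2$ the two numbers $(-1)^k\Delta \pm i\sqrt[+]{4 - \Delta^2}$ are complex conjugates of modulus 2, so their principal logarithms have equal real parts, opposite imaginary parts, and the imaginary part for the upper sign lies in $(0,\pi)$. In the sketch below the sample values of $\Delta$ are arbitrary and the helper name `log_argument` is hypothetical.

```python
import cmath
import math

def log_argument(delta, k, sign):
    """(-1)^k * Delta + sign * i * sqrt(4 - Delta^2) for a real value -2 <= Delta <= 2."""
    return (-1) ** k * delta + sign * 1j * math.sqrt(4 - delta * delta)

checks = []
for k in (0, 1):
    for delta in (-1.9, -0.5, 0.0, 1.2, 1.99):
        upper = cmath.log(log_argument(delta, k, +1))  # principal branch
        lower = cmath.log(log_argument(delta, k, -1))
        checks.append({
            "modulus": abs(log_argument(delta, k, +1)),
            "re_diff": abs(upper.real - lower.real),
            "im_sum": abs(upper.imag + lower.imag),
            "im_upper": upper.imag,
        })
```

Since the integrand in the final formula for $I_k$ takes values in $(0,\pi)$ and carries an overall minus sign, the integral is strictly negative whenever the interval of integration is nondegenerate.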
For ϕ ∈ W ∩ iL2R , Proposition 3.2 can be used to obtain a formula for the Birkhoff coordinates (xk , yk )k∈Z provided by Proposition 2.9. It follows from the construction in [6], III.4, that for any ϕ ∈ W \Dk , xk =
√
2 ξk
√ λ+k − λ− λ+ − λ− k k cos θk and yk = 2 ξk k sin θk , 2 2
where θk is defined by formula (2.18) and where 2 ξk := + 4Ik /(λ+k − λ− k )
(3.7)
(3.8)
is a real analytic non-vanishing function defined on W (cf. Theorem III.3 in [6]). On W \Dk , the angle variable θk is analytic modulo π . When taken modulo 2π , θk might not be continuous. In fact, continuous deformations of ϕ in W \Dk could lead to dis− + + continuities of λ− k and λk due to the imposed lexicographic ordering λk λk , and hence to an increment ±π on the right-hand side of (2.18). It follows from Proposition 2.2 (ii) and Proposition 2.5 (i) that for continuous deformations of ϕ in the smaller
1102
T. Kappeler, P. Lohrmann, P. Topalov, N. T. Zung
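set $(W \cap i\mathcal{L}^2_{\mathbb{R}})\setminus D_k$, the eigenvalues $\lambda_k^- = \overline{\lambda_k^+}$ and $\lambda_k^+$ change continuously. Hence, for $\varphi \in (W \cap i\mathcal{L}^2_{\mathbb{R}})\setminus D_k$ the angle variable
\[ \theta_k(\varphi) = \sum_{j\in\mathbb{Z}} \int_{\lambda_j^-}^{\mu_j^*} \frac{\chi_k(\lambda,\varphi)}{\sqrt{\Delta^2(\lambda,\varphi) - 4}}\,d\lambda \mod 2\pi \tag{3.9} \]
is an analytic function on $(W \cap i\mathcal{L}^2_{\mathbb{R}})\setminus D_k$ and by (3.7), (3.8), and Proposition 3.2 we get that
\[ x_k = i\sqrt[+]{-2I_k}\,\cos\theta_k \quad\text{and}\quad y_k = i\sqrt[+]{-2I_k}\,\sin\theta_k \tag{3.10} \]
for any $\varphi \in i\mathcal{L}^2_{\mathbb{R}}\setminus D_k$.

4. Even Potentials

To prove Theorem 1.1, the notion of even potentials will play an important role. In this section, we assume that $W$ is a neighborhood of $0$ in $\mathcal{L}^2$, chosen in such a way that Propositions 2.6, 2.7, 2.8, 2.9 and Propositions 3.1, 3.2 hold. Denote by $U$ the image of $W$ by the bi-analytic map $\Phi$ of Proposition 2.9, $U = \Phi(W)$.

Definition 4.1. An element $\varphi = (\varphi_1, \varphi_2)$ in $\mathcal{L}^2$ is said to be even if $\varphi_2(x) = \varphi_1(1-x)$ for a.e. $x \in \mathbb{R}$.

Note that $\varphi = (\psi, \bar\psi) \in \mathcal{L}^2_{\mathbb{R}}$ is even iff $\psi(x) = \overline{\psi(1-x)}$ a.e., whereas $\varphi = (\psi, -\bar\psi) \in i\mathcal{L}^2_{\mathbb{R}}$ is even iff $\psi(x) = -\overline{\psi(1-x)}$ a.e. Denote by $\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$ [$\mathcal{L}^2_{\mathrm{even}}$] the set of even potentials in $\mathcal{L}^2_{\mathbb{R}}$ [$\mathcal{L}^2$]. Then $i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$ is the set of even potentials in $i\mathcal{L}^2_{\mathbb{R}}$.

Definition 4.2. An element $(x, y) = (x_k, y_k)_{k\in\mathbb{Z}}$ in $l^2_{\mathbb{C}^2}$ is said to be even iff $y_k = 0$ for any $k \in \mathbb{Z}$.

Denote by $l^2_{\mathbb{R}^2,\mathrm{even}}$ the even elements of $l^2_{\mathbb{R}^2}$. Then $il^2_{\mathbb{R}^2,\mathrm{even}}$ is the set of even elements of $il^2_{\mathbb{R}^2}$.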
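Lemma 4.1. $d_0\Phi|_{i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}} : i\mathcal{L}^2_{\mathbb{R},\mathrm{even}} \to il^2_{\mathbb{R}^2,\mathrm{even}}$ is an $\mathbb{R}$-linear isomorphism.

Proof. The claimed statement follows easily from formula (2.21) of Theorem 2.1.

Next we want to show that $\Phi(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}) \subseteq il^2_{\mathbb{R}^2,\mathrm{even}}$. For this we first need to establish a few auxiliary results.

Lemma 4.2. For any $k \in \mathbb{Z}$, $(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}})\setminus D_k$ is dense in $W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$.

Proof. First note that by formula (2.15), $W \cap D_k$ is contained in the zero set of the action variable $I_k$, i.e. $W \cap D_k \subseteq \{\varphi \in W \mid I_k(\varphi) = 0\}$. Assume that the claimed statement does not hold. Then there exist $k \in \mathbb{Z}$ and a non-empty, open set $U \subseteq W$ so that $I_k$ vanishes on $U \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$. Note that $\mathcal{L}^2_{\mathrm{even}} = (i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}) \otimes \mathbb{C}$ and recall that $I_k$ is real-valued on $W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$. It then follows from $I_k|_{U \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}} \equiv 0$ that $I_k \equiv 0$ on a non-empty connected component of $W \cap \mathcal{L}^2_{\mathrm{even}}$, contradicting Theorem 2.1.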
Birkhoff Coordinates for the Focusing NLS Equation
Lemma 4.3. For any $\varphi \in W \cap \mathcal{L}^2_{\mathrm{even}}$, $\mu_k(\varphi) \in \{\lambda_k^+(\varphi), \lambda_k^-(\varphi)\}$ $\forall k \in \mathbb{Z}$.

Proof. Let $\varphi = (\varphi_1, \varphi_2) \in \mathcal{L}^2 \cap W$ be an even potential. Let $F = (F_1, F_2)$ be a Dirichlet eigenfunction associated to the $k$th Dirichlet eigenvalue $\mu_k(\varphi)$, i.e.
\[ \begin{aligned} i\partial_x F_1 + \varphi_1 F_2 &= \mu_k F_1 \\ -i\partial_x F_2 + \varphi_2 F_1 &= \mu_k F_2 \end{aligned} \tag{4.1} \]
and $F_1(0) = F_2(0)$, $F_1(1) = F_2(1)$. Let $\tilde F(x) := (F_2(1-x), F_1(1-x))$. Note that $\tilde F$ satisfies the same boundary conditions as $F$. To see that $\tilde F$ is a solution of (4.1), interchange the two equations in (4.1) and evaluate them at $1-x$. As $(\partial_x F_j)(1-x) = -\partial_x(F_j(1-x))$, one gets
\[ \begin{aligned} i\partial_x(F_2(1-x)) + \varphi_2(1-x)F_1(1-x) &= \mu_k F_2(1-x) \\ -i\partial_x(F_1(1-x)) + \varphi_1(1-x)F_2(1-x) &= \mu_k F_1(1-x). \end{aligned} \]
Using the assumption that $\varphi$ is even, one then concludes that
\[ \begin{aligned} i\partial_x \tilde F_1(x) + \varphi_1(x)\tilde F_2(x) &= \mu_k \tilde F_1(x) \\ -i\partial_x \tilde F_2(x) + \varphi_2(x)\tilde F_1(x) &= \mu_k \tilde F_2(x). \end{aligned} \]
Hence $\tilde F$ is an eigenfunction for the Dirichlet eigenvalue $\mu_k$. We now distinguish between two cases. If $\tilde F = F$, then $(F_1(0), F_2(0)) = (F_2(1), F_1(1))$. Since $F$ satisfies the Dirichlet boundary conditions $F_1(0) = F_2(0)$ and $F_1(1) = F_2(1)$, it satisfies periodic boundary conditions as well. If $\tilde F \neq F$, then $F - \tilde F$ is a non-trivial solution of the system (4.1) which satisfies anti-periodic boundary conditions, i.e. $(F - \tilde F)(1) = -(F - \tilde F)(0)$. In other words we have shown that $\mu_k(\varphi) \in \{\lambda_j^\pm(\varphi),\ j \in \mathbb{Z}\}$. Lemma 4.3 then follows from Proposition 2.5 (i) and (iii).

Lemma 4.4. $\Phi(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}) \subseteq il^2_{\mathbb{R}^2,\mathrm{even}}$.
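Proof. Let $\varphi \in W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$. By Proposition 2.8 (i), for any $k \in \mathbb{Z}$ with $\lambda_k^+(\varphi) \neq \lambda_k^-(\varphi)$, the angle variable $\theta_k(\varphi)$ is well defined by (2.18) and the normalizing condition (2.19),
\[ \int_{\Gamma_j} \frac{\chi_k(\lambda)}{\sqrt[c]{\Delta^2(\lambda) - 4}}\,d\lambda = 2\pi\delta_{jk}, \]
is valid. Shrink the contour $\Gamma_j$ to the arc $\gamma_j$, given by Proposition 2.6, to get, in view of formula (2.19) and Proposition 2.8 (i),
\[ \int_{\gamma_j} \frac{\chi_k(\lambda)}{\sqrt{\Delta^2(\lambda) - 4}}\,d\lambda \in \pm\pi\delta_{jk}. \]
By Lemma 4.3, $\mu_k \in \{\lambda_k^-(\varphi), \lambda_k^+(\varphi)\}$ for any $k \in \mathbb{Z}$. Hence for any $k \in \mathbb{Z}$ with $\lambda_k^+(\varphi) \neq \lambda_k^-(\varphi)$,
\[ \theta_k(\varphi) = \sum_{j\in\mathbb{Z}} \int_{\lambda_j^-}^{\mu_j^*} \frac{\chi_k(\lambda)}{\sqrt{\Delta^2(\lambda) - 4}}\,d\lambda \in \{0, \pi\} \mod 2\pi. \]
By formula (3.10) for $(x, y) = \Phi(\varphi)$ it then follows that for such $k$'s, $x_k(\varphi) = i\sqrt[+]{-2I_k(\varphi)}\cos\theta_k(\varphi) = \pm i\sqrt[+]{-2I_k(\varphi)}$ and $y_k(\varphi) = i\sqrt[+]{-2I_k(\varphi)}\sin\theta_k(\varphi) = 0$ on $(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}})\setminus D_k$. It then follows by Proposition 3.2 that $x_k \in i\mathbb{R}$. By Lemma 4.2, $(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}})\setminus D_k$ is dense in $W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$. By the continuity of $x_k$ and $y_k$ it then follows that $x_k \in i\mathbb{R}$ and $y_k = 0$ on $W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}$. This shows that
\[ \Phi(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}) \subseteq il^2_{\mathbb{R}^2,\mathrm{even}} \]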
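as claimed.

Proposition 4.1. By shrinking $W$ and $U$ if necessary, it follows that
\[ \Phi|_{W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}} : W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}} \to U \cap il^2_{\mathbb{R}^2,\mathrm{even}} \]
is a diffeomorphism.

Proof. In view of Lemma 4.1 and Lemma 4.4 the claimed statement follows from the inverse function theorem.

5. The Real Symplectic Subspace $i\mathcal{L}^2_{\mathbb{R}}$

Recall that we have introduced the real subspace $\mathcal{L}^2_{\mathbb{R}}$ of $\mathcal{L}^2 = L^2(\mathbb{T}, \mathbb{C}) \times L^2(\mathbb{T}, \mathbb{C})$ given by
\[ \mathcal{L}^2_{\mathbb{R}} = \{\varphi = (\psi, \bar\psi) \mid \psi \in L^2(\mathbb{T}, \mathbb{C})\}. \]
Note that $i\mathcal{L}^2_{\mathbb{R}}$ is a real subspace of $\mathcal{L}^2$ as well and for any $\varphi = (\varphi_1, \varphi_2) \in \mathcal{L}^2$ one has
\[ \varphi \in i\mathcal{L}^2_{\mathbb{R}} \quad\text{iff}\quad \varphi_2 = -\bar\varphi_1. \tag{5.1} \]
The subspace $i\mathcal{L}^2_{\mathbb{R}}$ can be identified with $L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})$ in a natural way. To this end introduce the $\mathbb{C}$-linear isomorphism
\[ T : L^2(\mathbb{T}, \mathbb{C}) \times L^2(\mathbb{T}, \mathbb{C}) \to \mathcal{L}^2, \qquad (\psi_1, \psi_2) \mapsto (\varphi_1, \varphi_2) = \frac{1}{\sqrt{2}}(\psi_1 + i\psi_2,\ -\psi_1 + i\psi_2). \]
In a straightforward way one shows the following lemma.

Lemma 5.1. (i) $i\mathcal{L}^2_{\mathbb{R}}$ is the image by $T$ of the real subspace $L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})$ of $L^2(\mathbb{T}, \mathbb{C}) \times L^2(\mathbb{T}, \mathbb{C})$.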
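(ii) $T$ is canonical when its domain of definition is endowed with the canonical Poisson structure
\[ \{F, G\}_0(\psi_1, \psi_2) = \int_0^1 \left(\partial_{\psi_1}F\,\partial_{\psi_2}G - \partial_{\psi_2}F\,\partial_{\psi_1}G\right) dx \]
and the target of $T$ with the Poisson bracket introduced in Sect. 1.

Now consider an analytic functional $F : W \to \mathbb{C}$ defined in a neighborhood $W$ of $0$ in $\mathcal{L}^2$.

Lemma 5.2. If $F|_{i\mathcal{L}^2_{\mathbb{R}}}$ is real valued then the Hamiltonian vector field $(-i\partial_{\varphi_2}F, i\partial_{\varphi_1}F)$ is tangent to $i\mathcal{L}^2_{\mathbb{R}}$.

Proof. Consider the pull back $F \circ T$ of $F$. Then $F \circ T$ is an analytic functional on $W' := T^{-1}(W)$ whose restriction to $\left(L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})\right) \cap W'$ is real valued. This implies that the Hamiltonian vector field $(-\partial_{\psi_2}(F \circ T), \partial_{\psi_1}(F \circ T))$ takes values in $L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})$ on $W' \cap \left(L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})\right)$. As $T$ is canonical,
\[ T\left(-\partial_{\psi_2}(F \circ T),\ \partial_{\psi_1}(F \circ T)\right) = (-i\partial_{\varphi_2}F,\ i\partial_{\varphi_1}F), \]
and this vector field is tangent to $i\mathcal{L}^2_{\mathbb{R}}$ on $W \cap i\mathcal{L}^2_{\mathbb{R}}$ as $T$ maps $L^2(\mathbb{T}, \mathbb{R}) \times L^2(\mathbb{T}, \mathbb{R})$ to $i\mathcal{L}^2_{\mathbb{R}}$.

6. Proof of Theorem 1.1

The idea of our proof can be best explained in terms of the Birkhoff coordinates $(x_k, y_k)_{k\in\mathbb{Z}}$. We consider the sequence of Hamiltonian vector fields $X^{(k)}(x, y) := ((-y_k, x_k)\delta_{kl})_{l\in\mathbb{Z}}$ on $il^2_{\mathbb{R}^2}$ with Hamiltonian $I_k = \frac{1}{2}(x_k^2 + y_k^2)$ and study their integral curves. For any $k \in \mathbb{Z}$, the solution $\left(x_l^{(k)}(t), y_l^{(k)}(t)\right)_{l\in\mathbb{Z}}$ of the initial value problem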
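\[ (\dot x_l, \dot y_l) = (-y_k, x_k)\,\delta_{kl} \quad \forall\, l \in \mathbb{Z}, \tag{6.1} \]
\[ (x_l(0), y_l(0)) = (\xi_l, \eta_l) \in i\mathbb{R}^2 \quad \forall\, l \in \mathbb{Z}, \tag{6.2} \]
is given by
\[ \left(x_l^{(k)}(t), y_l^{(k)}(t)\right) = \begin{cases} (\xi_l, \eta_l) & \forall\, l \neq k \\ (\xi_k\cos t - \eta_k\sin t,\ \xi_k\sin t + \eta_k\cos t) & l = k. \end{cases} \]
Clearly, it exists for all time and evolves in $il^2_{\mathbb{R}^2}$. Actually, it evolves in $\mathrm{Iso}(\xi, \eta) \cap il^2_{\mathbb{R}^2}$, where for $(x, y) = (x_k, y_k)_{k\in\mathbb{Z}}$ in $l^2_{\mathbb{C}^2}$ we denote by $\mathrm{Iso}(x, y)$ the set of sequences
\[ \mathrm{Iso}(x, y) := \left\{(x'_k, y'_k)_{k\in\mathbb{Z}} \in l^2_{\mathbb{C}^2} \ \middle|\ {x'_k}^2 + {y'_k}^2 = x_k^2 + y_k^2 \ \ \forall k \in \mathbb{Z}\right\}. \]
We want to show that any given point in $\mathrm{Iso}(x, y) \cap il^2_{\mathbb{R}^2}$ can be reached from any other point in $\mathrm{Iso}(x, y) \cap il^2_{\mathbb{R}^2}$ by concatenating integral curves of the above vector fields. First we follow the integral curve of $X^{(0)}$ which starts at the point $(\xi, 0)$, where $\xi = (\xi_l)_{l\in\mathbb{Z}} \in il^2(\mathbb{Z}, \mathbb{R})$ is given by
\[ \xi_l = i\sqrt[+]{|x_l|^2 + |y_l|^2} \quad \forall\, l \in \mathbb{Z}, \tag{6.3} \]
until we reach the point $(\xi^{(0)}, \eta^{(0)})$ where
\[ (\xi_l^{(0)}, \eta_l^{(0)}) = \begin{cases} (\xi_l, 0) & \text{if } l \neq 0 \\ (x_0, y_0) & \text{if } l = 0. \end{cases} \]
Then we continue on the integral curve of $X^{(1)}$ until we reach $(\xi^{(1)}, \eta^{(1)})$ where
\[ (\xi_l^{(1)}, \eta_l^{(1)}) = \begin{cases} (\xi_l^{(0)}, \eta_l^{(0)}) & \text{if } l \neq 1 \\ (x_1, y_1) & \text{if } l = 1. \end{cases} \]
Next we continue on the integral curve of $X^{(-1)}$ until we have reached $(\xi^{(-1)}, \eta^{(-1)})$ where
\[ (\xi_l^{(-1)}, \eta_l^{(-1)}) = \begin{cases} (\xi_l^{(1)}, \eta_l^{(1)}) & \text{if } l \neq -1 \\ (x_{-1}, y_{-1}) & \text{if } l = -1. \end{cases} \]
In this way we construct a sequence of points in $\mathrm{Iso}(x, y) \cap il^2_{\mathbb{R}^2}$,
\[ (\xi, 0),\ (\xi^{(0)}, \eta^{(0)}),\ (\xi^{(1)}, \eta^{(1)}),\ (\xi^{(-1)}, \eta^{(-1)}),\ \ldots. \tag{6.4} \]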
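The construction leading to (6.4) can be illustrated on a finite truncation. The following is a minimal sketch, not part of the original argument: the function names `reach_target` and `invariants` are hypothetical, only finitely many coordinates are kept, and the coordinates are visited in list order rather than in the order $0, 1, -1, \ldots$ used above. Each step applies the rotation solving (6.1), (6.2) to one coordinate pair, with the rotation angle chosen so that the pair lands on the target.

```python
import math


def reach_target(target):
    """Connect (xi, 0) to `target` by coordinate-wise rotations.

    `target` is a list of pairs (x_l, y_l) with purely imaginary entries,
    a finite stand-in for a point of i l^2_{R^2}.
    Returns the list of intermediate points, starting with (xi, 0).
    """
    # Starting point (xi, 0) with xi_l = i * sqrt(|x_l|^2 + |y_l|^2), cf. (6.3).
    point = [(1j * math.sqrt(abs(x) ** 2 + abs(y) ** 2), 0j) for x, y in target]
    path = [list(point)]
    for l, (x, y) in enumerate(target):
        xi, eta = point[l]
        # Write x = i*a, y = i*b with a, b real; rotating (i*s, 0) by the
        # angle t = atan2(b, a) gives (i*s*cos t, i*s*sin t) = (i*a, i*b).
        t = math.atan2(y.imag, x.imag)
        point[l] = (
            xi * math.cos(t) - eta * math.sin(t),
            xi * math.sin(t) + eta * math.cos(t),
        )
        path.append(list(point))
    return path


def invariants(point):
    """The quantities x_l^2 + y_l^2 that define Iso(x, y)."""
    return [x * x + y * y for x, y in point]


if __name__ == "__main__":
    target = [(2j, -1j), (0.5j, 0.25j), (-3j, 0j)]
    for stage in reach_target(target):
        print(stage, invariants(stage))
```

Every intermediate point has the same values of $x_l^2 + y_l^2$ as the target, which is the finite-dimensional counterpart of staying inside $\mathrm{Iso}(x, y)$.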
It is easy to see that this sequence converges to the point $(x, y)$. In order to prove Theorem 1.1 we apply $\Phi^{-1}$ to such sequences of points and use Proposition 4.1 and Lemma 5.2 to conclude that their images are in $i\mathcal{L}^2_{\mathbb{R}}$.

Proof of Theorem 1.1. By Proposition 2.9 there exist an open neighborhood $W$ of $0$ in $\mathcal{L}^2$, an open neighborhood $U$ of $0$ in $l^2_{\mathbb{C}^2}$, and a diffeomorphism $\Phi : W \to U$ so that $\Phi(W \cap i\mathcal{L}^2_{\mathbb{R}}) = U \cap il^2_{\mathbb{R}^2}$. By Proposition 4.1 we can assume that
\[ \Phi(W \cap i\mathcal{L}^2_{\mathbb{R},\mathrm{even}}) = U \cap il^2_{\mathbb{R}^2,\mathrm{even}}. \]
Without loss of generality we may assume that $U$ is a ball. In a first step we want to prove that
\[ \Phi^{-1}(U \cap il^2_{\mathbb{R}^2}) \subseteq W \cap i\mathcal{L}^2_{\mathbb{R}}. \]
Let $(x, y)$ be an arbitrary point in $U \cap il^2_{\mathbb{R}^2}$. As $U$ is assumed to be a ball it follows that $(\xi, 0)$, defined by (6.3), is also in $U$, hence
\[ (\xi, 0) \in U \cap il^2_{\mathbb{R}^2,\mathrm{even}}. \]
By Proposition 4.1 it follows that $\zeta := \Phi^{-1}(\xi, 0)$ is in $W \cap i\mathcal{L}^2_{\mathbb{R}}$. As $\Phi$ is canonical, the pull backs of the vector fields $X^{(k)}$ by $\Phi$ are again Hamiltonian vector fields. They are given by
\[ Y^{(k)} = i(-\partial_{\varphi_2}I_k,\ \partial_{\varphi_1}I_k) \qquad (k \in \mathbb{Z}). \]
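We recall that the $I_k$ are analytic functionals on $W$ which are real valued on $i\mathcal{L}^2_{\mathbb{R}}$. Hence by Lemma 5.2, the vector fields $Y^{(k)}$ when restricted to $W \cap i\mathcal{L}^2_{\mathbb{R}}$ are tangent to $i\mathcal{L}^2_{\mathbb{R}}$. It then follows that the sequence $\zeta^{(k)} := \Phi^{-1}(\xi^{(k)}, \eta^{(k)})$ is in $i\mathcal{L}^2_{\mathbb{R}}$ where $(\xi^{(k)}, \eta^{(k)})$ is given by (6.4). As $i\mathcal{L}^2_{\mathbb{R}}$ is closed in $\mathcal{L}^2$ and $\Phi^{-1}$ is continuous one concludes that
\[ \lim_{k\to\infty} \zeta^{(k)} = \lim_{k\to\infty} \Phi^{-1}(\xi^{(k)}, \eta^{(k)}) = \Phi^{-1}(x, y) \]
is an element in $i\mathcal{L}^2_{\mathbb{R}}$. This shows that
\[ \Phi^{-1}(U \cap il^2_{\mathbb{R}^2}) \subseteq W \cap i\mathcal{L}^2_{\mathbb{R}}. \tag{6.5} \]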
By Proposition 2.9 (iii), the differential of $\Phi$ at $0$, $d_0\Phi : \mathcal{L}^2 \to l^2_{\mathbb{C}^2}$, is a $\mathbb{C}$-linear isomorphism. By applying the inverse function theorem and using (6.5) once more one then concludes that there exist a neighborhood $U_f \subseteq U \cap il^2_{\mathbb{R}^2}$ of $0$ in $il^2_{\mathbb{R}^2}$ and a neighborhood $W_f \subseteq W \cap i\mathcal{L}^2_{\mathbb{R}}$ of $0$ in $i\mathcal{L}^2_{\mathbb{R}}$ so that
\[ \Phi : W_f \to U_f \]
is a diffeomorphism. The properties of $\Phi_f := \Phi|_{W_f}$, stated in items (i)-(iii) of Theorem 1.1, now follow from the corresponding properties of the Birkhoff map $\Phi : W \to U$ (Proposition 2.9, items (i), (ii), and (iv)) in a straightforward way.

References
1. Ablowitz, M., Ma, Y.: The periodic cubic Schrödinger equation. Studies Appl. Math. 65, 113–158 (1981)
2. Bambusi, D., Grébert, B.: Birkhoff normal form for PDE's with tame modulus. Duke Math. J. 135, 507–567 (2005)
3. Grébert, B., Guillot, J.C.: Gaps of one-dimensional periodic AKNS systems. Forum Math. 5, 459–504 (1993)
4. Grébert, B., Kappeler, T.: Symmetries of the Nonlinear Schrödinger equation. Bull. Soc. Math. France 130(4), 603–618 (2002)
5. Grébert, B., Kappeler, T.: Perturbations of the defocusing nonlinear Schrödinger equation. Milan J. Math. 71, 141–174 (2003)
6. Grébert, B., Kappeler, T., Pöschel, J.: Normal form theory for the nonlinear Schrödinger equation. Preliminary version available at http://www.math.sciences.univ-nantes.fr/~grebert/publication.html, 2002
7. Ito, H.: Convergent normal forms for integrable systems. Comment. Math. Helv. 64, 412–461 (1989)
8. Kappeler, T., Pöschel, J.: KdV & KAM. Ergeb. der Math. und ihrer Grenzgeb., Berlin-Heidelberg-New York: Springer Verlag, 2003
9. Kappeler, T., Serier, F., Topalov, P.: On the characterisation of the smoothness of skew-adjoint potentials in periodic Dirac operators. Preprint
10. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. Math. 143, 149–179 (1996)
11. Li, Y., McLaughlin, D.W.: Morse and Melnikov functions for NLS Pde's. Commun. Math. Phys. 162, 175–214 (1994)
12. Tkachenko, V.A.: Spectra of non-selfadjoint Hill's operators and a class of Riemann surfaces. Ann. Math. 143, 181–231 (1996)
13. Vey, J.: Sur certains systemes dynamiques separables. Amer. J. Math. 100, 591–614 (1978)
14. Zakharov, V.E., Shabat, A.B.: A scheme for integrating nonlinear equations of mathematical physics by the method of the inverse scattering problem I. Funct. Anal. Appl. 8, 226–235 (1974)
15. Zung, N.T.: Convergence versus integrability in Birkhoff normal forms. Ann. Math. 161, 141–156 (2005)

Communicated by G. Gallavotti
Commun. Math. Phys. 285, 1109–1128 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0541-2
Communications in
Mathematical Physics
An Expansion for Polynomials Orthogonal Over an Analytic Jordan Curve Erwin Miña-Díaz Indiana-Purdue University Fort Wayne, Department of Mathematical Sciences, 2101 E. Coliseum Blvd, Fort Wayne, IN 46805-1499, USA. E-mail:
[email protected] Received: 10 December 2007 / Accepted: 3 March 2008 Published online: 25 June 2008 – © Springer-Verlag 2008
Dedicated to Prof. Guillermo López Lagomasino on the occasion of his 60th birthday

Abstract: We consider polynomials that are orthogonal over an analytic Jordan curve $L$ with respect to a positive analytic weight, and show that each such polynomial of sufficiently large degree can be expanded in a series of certain integral transforms that converges uniformly in the whole complex plane. This expansion yields, in particular and simultaneously, Szegő's classical strong asymptotic formula and a new integral representation for the polynomials inside $L$. We further exploit such a representation to derive finer asymptotic results for weights having finitely many singularities (all of algebraic type) on a thin neighborhood of the orthogonality curve. Our results are a generalization of those previously obtained in [7] for the case of $L$ being the unit circle.

1. Introduction and Statements of the Results

The study of polynomials orthogonal over a closed rectifiable curve of the complex plane was initiated by Szegő in [20], and later continued by Szegő himself and such authors as Smirnov, Keldysh, Lavrentiev, Korovkin, Suetin and Geronimus (see [17] for references and an overview of the developments until 1964). Polynomials orthogonal over several arcs and curves have also been studied, for instance (and without being exhaustive), by Akhiezer [1,2], Widom [21], Aptekarev [3], Peherstorfer and coauthors [9–13], and for an orthogonality measure with finitely many point masses outside the curve/arc, by Kaliaguine [5,6]. Among the central questions that are often investigated figure the asymptotic behavior of the orthogonal polynomials and the distribution and location of their zeros.
In this regard, the case of a closed curve has the peculiarity (not observed in that of an open arc) that the interior of its polynomial convex hull is non-empty,$^1$ giving more freedom of distribution to the zeros of the polynomials, and consequently, making the behavior of the polynomials themselves less clear. The results of the present work clarify this question to a substantial extent for a single closed curve under analyticity conditions.

$^1$ A well-known result by Widom [22] asserts that the zeros must accumulate, in the limit, on the polynomial convex hull of the support of the orthogonality measure.

Let $L_1$ be an analytic Jordan curve in the complex plane $\mathbb{C}$ and let $h(z)$ be an analytic function in a neighborhood of $L_1$ such that $h(z) > 0$ for all $z \in L_1$. Using the Gram-Schmidt orthogonalization process, we can form a unique sequence $\{p_n(z)\}_{n=0}^{\infty}$ of orthonormal polynomials over $L_1$ with respect to $h(z)$, i.e., satisfying
\[ p_n(z) = \gamma_n z^n + \text{lower degree terms}, \quad \gamma_n > 0, \quad n \ge 0, \tag{1} \]
\[ \frac{1}{2\pi}\oint_{L_1} p_n(z)\,\overline{p_m(z)}\,h(z)\,|dz| = \begin{cases} 0, & n \neq m, \\ 1, & n = m. \end{cases} \tag{2} \]
In what follows, we are concerned with the asymptotic behavior of these polynomials as their degree $n$ becomes large. With this generality, essentially the only known result is Szegő's strong asymptotic formula
\[ p_n(z) = \sqrt{\phi'(z)}\,[\phi(z)]^n\,[\Delta_e(z; h) + o(1)]. \tag{3} \]
Here $\phi$ is the conformal map of the exterior $\Omega_1$ of $L_1$ onto the exterior of the unit circle satisfying $\phi(\infty) = \infty$, $\phi'(\infty) > 0$, $\Delta_e(z; h)$ is the so-called exterior Szegő function for the weight $h$, and (3) holds locally uniformly as $n \to \infty$ on any open set $\Omega_\rho \supset \overline{\Omega}_1$ that is conformally mapped by $\phi$ onto the exterior of a circle about the origin of radius $\rho < 1$, and is such that $\Delta_e(z; h)$ is analytic on $\Omega_\rho$ (see the next subsection for details). For $h(z) \equiv 1$, this formula was established by Szegő in his paper [20] of 1921, while for an arbitrary positive analytic weight, it first appears in Chap. XVI of the first edition of his book [19] of 1939.

So far as this writer can learn, progress in understanding the asymptotic behavior of $p_n(z)$ at the remaining points of the complex plane, that is, for $z \in \mathbb{C}\setminus\Omega_\rho$, has only been made in the specific case of $L_1$ being the unit circle, the strongest result having been obtained recently in [7]. Here the authors use the Riemann-Hilbert approach for the asymptotic analysis of orthogonal polynomials to derive, for each $p_n(z)$ of sufficiently large degree, a series expansion in terms of certain recursively generated Cauchy transforms. This important result yields at once Szegő's asymptotic formula and an integral representation for $p_n(z)$ inside the unit circle, from which it is possible to distill the precise behavior of the polynomials under additional assumptions on the first singularities encountered by the exterior Szegő function. This has been done in [7] for finitely many polar singularities, as well as for two examples of an isolated essential singularity. Earlier related works (e.g., [18]) are briefly described in the introduction of [7].
In the present paper we extend the expansion of [7] to an arbitrary analytic curve. Our proof is not based on the Riemann-Hilbert method; it is rather direct and in some sense natural, which we believe will lead to applications to other systems of orthogonal polynomials. From the dominant term of the expansion we derive precise asymptotic formulas for $p_n(z)$ in a case where the exterior Szegő function has finitely many algebraic singularities in a thin neighborhood of $L_1$. We state our results in Subsects. 1.2 and 1.3 below, followed by their proofs in Sect. 2.

1.1. Preliminaries. In this subsection we introduce some notation to be used throughout, as well as the concepts involved in the asymptotic behavior of $p_n$. In particular, we discuss the Szegő functions associated with the weight $h$. For a deeper discussion of these functions we refer the reader to Chap. X of [19].
Given $r \ge 0$, we set
\[ \mathbb{T}_r := \{w : |w| = r\}, \quad \mathbb{E}_r := \{w : r < |w| \le \infty\}, \quad \mathbb{D}_r := \{w : |w| < r\}. \]
If $K$ is a set and $f(z)$ a function defined on $K$, then $\overline{K}$ and $\partial K$ denote, respectively, the closure and the boundary of $K$, and $f(K) := \{f(z) : z \in K\}$.

Szegő functions. Let $f(t)$ be an analytic function defined on a neighborhood of the unit circle $\mathbb{T}_1$ such that $f(t) > 0$ for all $t \in \mathbb{T}_1$. The function
\[ w \mapsto \exp\left(\frac{1}{4\pi}\int_{\mathbb{T}_1} \log f(t)\,\frac{t + w}{t - w}\,|dt|\right) = \exp\left(\frac{1}{4\pi i}\oint_{\mathbb{T}_1} \frac{\log f(t)}{t}\cdot\frac{t + w}{t - w}\,dt\right), \quad w \in \mathbb{C}\setminus\mathbb{T}_1, \tag{4} \]
is analytic on $\mathbb{C}\setminus\mathbb{T}_1$. Its restriction to $\mathbb{D}_1$ is called the interior Szegő function for $f$, and we denote it by $D_i(w; f)$. It is uniquely determined by the properties
(a) $D_i(w; f)$ has an analytic continuation from $\mathbb{D}_1$ to some neighborhood of $\overline{\mathbb{D}}_1$, $D_i(w; f) \neq 0$ for all $w \in \overline{\mathbb{D}}_1$ and $D_i(0; f) > 0$;
(b) $|D_i(w; f)|^2 = f(w)$ for all $w \in \mathbb{T}_1$.

Property (a) easily follows by noticing that $\log f(t)$ is analytic in a neighborhood of $\mathbb{T}_1$, and therefore, the analytic continuation of $D_i(w; f)$ to $\overline{\mathbb{D}}_1$ is given by the expression to the right of the $=$ sign in (4) if integration is taken over a circle about the origin of radius slightly larger than 1. Property (b) is a consequence of the fact that $|D_i(w; f)|^2$ is the exponential of the Poisson integral of $\log f(t)$ (see, e.g., [14, Thm. 11.8]).

The restriction of the function in (4) to $\mathbb{E}_1$ is called the exterior Szegő function for $f$ and we denote it by $D_e(w; f)$. Notice that
\[ D_e(w; f) = \frac{1}{\overline{D_i(1/\overline{w}; f)}}, \quad w \in \mathbb{E}_1, \]
so that $D_e(w; f)$ is uniquely determined by the properties
(a$'$) $D_e(w; f)$ has an analytic continuation from $\mathbb{E}_1$ to some neighborhood of $\overline{\mathbb{E}}_1$; $D_e(w; f) \neq 0$ for all $w \in \overline{\mathbb{E}}_1$ and $D_e(\infty; f) > 0$;
(b$'$) $|D_e(w; f)|^{-2} = f(w)$ for all $w \in \mathbb{T}_1$.

These considerations can be generalized to an arbitrary analytic Jordan curve $L_1$ as follows. Let $\Omega_1$ be the exterior of $L_1$, that is, the unbounded component of $\overline{\mathbb{C}}\setminus L_1$, and let $\psi = \psi(w)$ be the unique conformal map of $\mathbb{E}_1$ onto $\Omega_1$ satisfying $\psi(\infty) = \infty$, $\psi'(\infty) > 0$. Let $\hat\rho \ge 0$ be the smallest number such that $\psi$ has an analytic and univalent continuation from $\mathbb{E}_1$ to $\mathbb{E}_{\hat\rho}$. Because $L_1$ is analytic, $\hat\rho < 1$. For every $r$ with $\hat\rho \le r < \infty$, set
\[ \Omega_r := \psi(\mathbb{E}_r), \quad L_r := \partial\Omega_r, \quad G_r := \mathbb{C}\setminus\overline{\Omega}_r, \]
and let $\phi(z) : \Omega_{\hat\rho} \to \mathbb{E}_{\hat\rho}$
be the inverse function of $\psi$. Observe that for every $r > \hat\rho$, $L_r$ is an analytic Jordan curve.

Then, given a weight function $h(z)$ that is positive and analytic on $L_1$, the exterior Szegő function $\Delta_e(z; h)$ for $h$ is defined as
\[ \Delta_e(z; h) := D_e(\phi(z); h \circ \psi) = \exp\left(\frac{1}{4\pi i}\oint_{L_1} \log h(\zeta)\,\frac{1 + \overline{\phi(\zeta)}\,\phi(z)}{\phi(\zeta) - \phi(z)}\,\phi'(\zeta)\,d\zeta\right), \quad z \in \Omega_1, \tag{5} \]
which is uniquely determined by the properties
(i$'$) $\Delta_e(z; h)$ is analytic and never zero on $\overline{\Omega}_1$, $\Delta_e(\infty; h) > 0$;
(ii$'$) $|\Delta_e(z; h)|^{-2} = h(z)$ for all $z \in L_1$.

Let $\rho$ be the smallest number larger than or equal to $\hat\rho$ such that $\Delta_e(z; h)$ is analytic on $\Omega_\rho$. By property (i$'$), $\hat\rho \le \rho < 1$.

Similarly, any conformal map $\chi(w)$ of the unit disk onto the interior domain $G_1$ of $L_1$ has an analytic and univalent continuation to some neighborhood of $\overline{\mathbb{D}}_1$. Denoting its inverse by $\varphi(z) : G_1 \to \mathbb{D}_1$, an interior Szegő function $\Delta_i(z; h)$ for $h$ is defined as
\[ \Delta_i(z; h) := D_i(\varphi(z); h \circ \chi) = \exp\left(\frac{1}{4\pi i}\oint_{L_1} \log h(\zeta)\,\frac{1 + \overline{\varphi(\zeta)}\,\varphi(z)}{\varphi(\zeta) - \varphi(z)}\,\varphi'(\zeta)\,d\zeta\right), \quad z \in G_1, \tag{6} \]
which satisfies the properties
(i) $\Delta_i(z; h)$ is analytic and never zero on $\overline{G}_1$;
(ii) $|\Delta_i(z; h)|^2 = h(z)$ for all $z \in L_1$.
Moreover,
(iii) $\Delta_i(z; h)^{-1}$ has an analytic continuation from $G_1$ to $G_{1/\rho}$.

To see why (iii) is true, consider the Schwarz function $S(z)$ of the curve $L_1$ (see, e.g., [4]), which is analytic and univalent in a neighborhood of $L_1$, and is uniquely determined by the property that $S(z) = \overline{z}$ for $z \in L_1$. Indeed, $S(z)$ is well-defined all over the band $\Omega_{\hat\rho} \cap G_{1/\hat\rho}$, where it can be expressed in terms of the exterior conformal maps as
\[ S(z) = \overline{\psi\left(1/\overline{\phi(z)}\right)}, \quad z \in \Omega_{\hat\rho} \cap G_{1/\hat\rho}. \]
Let
\[ z^* := \overline{S(z)}, \quad z \in \Omega_{\hat\rho} \cap G_{1/\hat\rho}, \]
be the so-called Schwarz reflection of $z$ about $L_1$. Then, the analytic continuation of $\Delta_i(z; h)^{-1}$ from $G_1$ to $G_{1/\rho}$ is given by
\[ \frac{1}{\Delta_i(z; h)} := \Delta_e(z; h)\,\overline{\Delta_e(z^*; h)}\,\overline{\Delta_i(z^*; h)}, \quad z \in G_{1/\rho}\setminus G_1. \]
Although the interior Szegő functions as defined by (6) depend on the choice of $\varphi$, any two of them differ at most by a multiplicative constant of modulus 1. Hereafter we shall assume that one such $\Delta_i(z; h)$ has been fixed.
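For the unit circle, definition (4) and the relation between the interior and exterior Szegő functions can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `szego`, the sample weight $f(t) = |1 - ct|^2$ with $|c| < 1$, and the use of the trapezoid rule on equally spaced nodes are choices made here, not taken from the text. For this particular weight, evaluating (4) in closed form gives $D_i(w; f) = 1 - cw$, which is what the quadrature is compared against.

```python
import cmath
import math


def szego(w, f, n=2000):
    """Evaluate the function in (4) at a point w with |w| != 1.

    The integral over the unit circle is approximated by the trapezoid
    rule with n equally spaced nodes; f must be positive on the circle.
    """
    total = 0j
    for j in range(n):
        t = cmath.exp(2j * math.pi * j / n)
        total += math.log(f(t)) * (t + w) / (t - w)
    # (1 / 4 pi) * (2 pi / n) * sum  =  sum / (2 n)
    return cmath.exp(total / (2 * n))


if __name__ == "__main__":
    c = 0.5

    def weight(t):
        # Sample weight, positive on the unit circle (an assumption of this sketch).
        return abs(1 - c * t) ** 2

    w = 0.3 + 0.2j
    d_i = szego(w, weight)                      # interior value, |w| < 1
    d_e = szego(1 / w.conjugate(), weight)      # exterior value at 1 / conj(w)
    print(d_i, 1 - c * w)
    print(d_e, 1 / d_i.conjugate())
```

The second comparison illustrates the identity $D_e(w; f) = 1/\overline{D_i(1/\overline{w}; f)}$ at the reflected point.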
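The kernel $W(\zeta, z)$. Let $\varphi$ denote, as above, a conformal map of $G_1$ onto $\mathbb{D}_1$, and define the meromorphic kernel
\[ W(\zeta, z) := \frac{\sqrt{\varphi'(z)}\,\sqrt{\varphi'(\zeta)}}{\varphi(\zeta) - \varphi(z)}, \quad \zeta, z \in G_1. \tag{7} \]
That this kernel does not depend on the choice of $\varphi$ can be easily verified from the fact that any other conformal map $\varphi_1$ of $G_1$ onto $\mathbb{D}_1$ is related to $\varphi$ through a Möbius transformation, that is,
\[ \varphi(z) = e^{i\theta}\,\frac{\varphi_1(z) - \varphi_1(z_0)}{1 - \overline{\varphi_1(z_0)}\,\varphi_1(z)}, \qquad e^{i\theta} = \varphi'(z_0)\left(1 - |\varphi_1(z_0)|^2\right)/\varphi_1'(z_0), \]
where $z_0$ is that point of $G_1$ mapped by $\varphi$ onto $0$. Moreover, if we choose, as we may, a conformal map $\varphi$ that does not vanish on $\Omega_{\hat\rho} \cap G_1$, then this $\varphi$ has an analytic and univalent continuation from $G_1$ to $G_{1/\hat\rho}$ given by
\[ \varphi(z) = \frac{1}{\overline{\varphi(z^*)}}, \quad z \in G_{1/\hat\rho}\setminus G_1, \]
so that $W(\zeta, z)$ can be extended as a function
\[ W(\zeta, z) : G_{1/\hat\rho} \times G_{1/\hat\rho} \to \overline{\mathbb{C}} \]
in such a way that for every fixed $z \in G_{1/\hat\rho}$, $W(\zeta, z)$ is analytic in the variable $\zeta$ on $G_{1/\hat\rho}\setminus\{z\}$, with a simple pole at $z$ of residue 1.

We finish this subsection by noticing that positive analytic weights $h(z)$ over $L_1$ are easy to generate because they are precisely those of the form $h(z) = V(z)\overline{V(z^*)}$, $z \in L_1$, with $V(z)$ a zero-free analytic function in a neighborhood of $L_1$.

1.2. The expansions. Hereafter we will suppress $h$ and simply write $\Delta_e(z)$ for $\Delta_e(z; h)$ and $\Delta_i(z)$ for $\Delta_i(z; h)$. Fix a number $r$ such that $\rho < r < 1$, and for each integer $n \ge 0$, let us recursively define the following sequence of functions: $f_n^{(0)}(z) := 1$, $z \in \mathbb{C}$, and for all $k \ge 0$,
\[ f_n^{(2k+1)}(z) := -\frac{1}{2\pi i}\oint_{L_r} f_n^{(2k)}(\zeta)\,\Delta_e(\zeta)\,\Delta_i(\zeta)\,W(\zeta, z)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^n\,d\zeta, \quad z \in G_{1/\rho}\setminus L_r, \]
\[ f_n^{(2k+2)}(z) := \frac{1}{2\pi i}\oint_{L_{1/r}} \frac{f_n^{(2k+1)}(\zeta)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^{-n}}{\Delta_e(\zeta)\,\Delta_i(\zeta)\,[\phi(\zeta) - \phi(z)]}\,d\zeta, \quad z \in \Omega_\rho\setminus L_{1/r}. \]
Let
\[ \Lambda_r := \max_{\zeta\in L_r}\left|\frac{\Delta_e(\zeta)\,\Delta_i(\zeta)}{\sqrt{\phi'(\zeta)}}\right|, \qquad \Lambda_r' := \max_{\zeta\in L_{1/r}}\left|\sqrt{\phi'(\zeta)}\,\Delta_e(\zeta)\,\Delta_i(\zeta)\right|^{-1}, \]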
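and
\[ M_r := \max_{(\zeta, z)\in L_r\times L_{1/r}} |W(\zeta, z)| < \infty, \]
so that obviously (verify it by mathematical induction), for all $k \ge 0$,
\[ \left|f_n^{(2k+1)}(z)\right| \le \Lambda_r\,r^{n+1}\left(\frac{\Lambda_r\Lambda_r' M_r\,r^{2n}}{1/r - r}\right)^{k}\max_{\zeta\in L_r}|W(\zeta, z)|, \quad z \in G_{1/\rho}\setminus L_r, \tag{8} \]
\[ \left|f_n^{(2k+2)}(z)\right| \le \frac{\Lambda_r\Lambda_r' M_r\,r^{2n}}{\left|1/r - |\phi(z)|\right|}\left(\frac{\Lambda_r\Lambda_r' M_r\,r^{2n}}{1/r - r}\right)^{k}, \quad z \in \Omega_\rho\setminus L_{1/r}. \tag{9} \]
It follows that the two series
\[ f_n^{(0)}(z) + f_n^{(2)}(z) + f_n^{(4)}(z) + \cdots + f_n^{(2k)}(z) + \cdots, \quad z \in \Omega_\rho\setminus L_{1/r}, \]
\[ f_n^{(1)}(z) + f_n^{(3)}(z) + \cdots + f_n^{(2k+1)}(z) + \cdots, \quad z \in G_{1/\rho}\setminus L_r, \]
converge absolutely and locally uniformly in their respective regions of definition, provided $n$ is so large that
\[ \frac{\Lambda_r\Lambda_r' M_r\,r^{2n}}{1/r - r} < 1. \tag{10} \]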
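Condition (10) only involves the three constants and $r$, so the threshold degree can be computed directly once those constants are known. The helper below is a hypothetical illustration (the function names and the sample numbers are not from the text); it uses the fact that the left-hand side of (10) decreases in $n$ because $0 < r < 1$, and it also evaluates the geometric majorant $1/(1-q)$ that (9) gives for the even-indexed series on $L_r$, where $q$ is the left-hand side of (10).

```python
def ratio(lam, lam_prime, m, r, n):
    """Left-hand side of (10) for the constants Lambda_r, Lambda_r', M_r."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must satisfy 0 < r < 1")
    return lam * lam_prime * m * r ** (2 * n) / (1.0 / r - r)


def smallest_admissible_n(lam, lam_prime, m, r):
    """Smallest n >= 0 for which (10) holds; (10) then holds for all larger n."""
    n = 0
    while ratio(lam, lam_prime, m, r, n) >= 1.0:
        n += 1
    return n


def even_series_majorant(lam, lam_prime, m, r, n):
    """Bound 1 / (1 - q) for the sum of |f_n^(2k)| on L_r, valid when q < 1."""
    q = ratio(lam, lam_prime, m, r, n)
    if q >= 1.0:
        raise ValueError("condition (10) fails for this n")
    return 1.0 / (1.0 - q)


if __name__ == "__main__":
    # Sample constants, chosen only to exercise the functions.
    n0 = smallest_admissible_n(2.0, 3.0, 5.0, 0.5)
    print(n0, ratio(2.0, 3.0, 5.0, 0.5, n0), even_series_majorant(2.0, 3.0, 5.0, 0.5, n0))
```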
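Let $P_n(z)$ be the $n$th monic orthogonal polynomial, that is, $P_n(z) = \gamma_n^{-1}p_n(z)$, $n \ge 0$, where $p_n$ satisfies (1) and (2).

Theorem 1. For every $n$ satisfying (10), we have
\[ \frac{\Delta_e(\infty)P_n(z)}{[\phi'(\infty)]^{-n-1/2}} = \begin{cases} \Delta_e(z)\sqrt{\phi'(z)}\,[\phi(z)]^n \displaystyle\sum_{k=0}^{\infty} f_n^{(2k)}(z), & z \in \Omega_{1/r}, \\[2mm] \Delta_e(z)\sqrt{\phi'(z)}\,[\phi(z)]^n \displaystyle\sum_{k=0}^{\infty} f_n^{(2k)}(z) - \frac{1}{\Delta_i(z)}\sum_{k=0}^{\infty} f_n^{(2k+1)}(z), & z \in \Omega_r \cap G_{1/r}, \\[2mm] -\dfrac{1}{\Delta_i(z)}\displaystyle\sum_{k=0}^{\infty} f_n^{(2k+1)}(z), & z \in G_r, \end{cases} \tag{11} \]
and
\[ \gamma_n^{-2} = \Delta_e(\infty)^{-2}\,[\phi'(\infty)]^{-2n-1}\left(1 + \sum_{k=0}^{\infty}\frac{1}{2\pi i}\oint_{L_{1/r}} \frac{f_n^{(2k+1)}(\zeta)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^{-n-1}}{\Delta_e(\zeta)\,\Delta_i(\zeta)}\,d\zeta\right). \tag{12} \]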
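Let us now consider the following slightly different sequence of integral transforms. For each fixed integer $n \ge 0$, set $g_n^{(0)}(z) := 1$, $z \in \mathbb{C}$, and for all $k \ge 0$,
\[ g_n^{(2k+1)}(z) := -\frac{1}{2\pi i}\oint_{L_r} g_n^{(2k)}(\zeta)\,\Delta_e(\zeta)\,\Delta_i(\zeta)\,W(\zeta, z)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^n\,d\zeta, \quad z \in G_{1/\rho}\setminus L_r, \]
\[ g_n^{(2k+2)}(z) := \frac{\phi(z)}{2\pi i}\oint_{L_{1/r}} \frac{g_n^{(2k+1)}(\zeta)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^{-n-1}}{\Delta_e(\zeta)\,\Delta_i(\zeta)\,[\phi(\zeta) - \phi(z)]}\,d\zeta, \quad z \in \Omega_\rho\setminus L_{1/r}. \]
Then,
\[ \left|g_n^{(2k+1)}(z)\right| \le \Lambda_r\,r^{n+1}\left(\frac{\Lambda_r\Lambda_r' M_r\,r^{2n+2}}{1/r - r}\right)^{k}\max_{\zeta\in L_r}|W(\zeta, z)|, \quad z \in G_{1/\rho}\setminus L_r, \ k \ge 0, \]
\[ \left|g_n^{(2k+2)}(z)\right| \le \frac{|\phi(z)|\,\Lambda_r\Lambda_r' M_r\,r^{2n+1}}{\left|1/r - |\phi(z)|\right|}\left(\frac{\Lambda_r\Lambda_r' M_r\,r^{2n+2}}{1/r - r}\right)^{k}, \quad z \in \Omega_\rho\setminus L_{1/r}, \ k \ge 0, \]
and the following theorem holds true:

Theorem 2. For every $n$ satisfying (10), we have
\[ \frac{\gamma_n^2 P_n(z)}{\Delta_e(\infty)\,[\phi'(\infty)]^{n+1/2}} = \begin{cases} \Delta_e(z)\sqrt{\phi'(z)}\,[\phi(z)]^n \displaystyle\sum_{k=0}^{\infty} g_n^{(2k)}(z), & z \in \Omega_{1/r}, \\[2mm] \Delta_e(z)\sqrt{\phi'(z)}\,[\phi(z)]^n \displaystyle\sum_{k=0}^{\infty} g_n^{(2k)}(z) - \frac{1}{\Delta_i(z)}\sum_{k=0}^{\infty} g_n^{(2k+1)}(z), & z \in \Omega_r \cap G_{1/r}, \\[2mm] -\dfrac{1}{\Delta_i(z)}\displaystyle\sum_{k=0}^{\infty} g_n^{(2k+1)}(z), & z \in G_r. \end{cases} \]
In particular,
\[ \gamma_n^2 = \Delta_e(\infty)^2\,[\phi'(\infty)]^{2n+1}\sum_{k=0}^{\infty} g_n^{(2k)}(\infty). \tag{13} \]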
Remark 1. These expansions have been previously obtained in [7] for $L_1 = \mathbb{T}_1$. They are the outcome of applying the steepest descent method of Deift and Zhou for the asymptotic analysis of a matrix Riemann-Hilbert problem solved by the orthogonal polynomials and closely related functions. Theorem 1 extends Theorem 1 of [7], while relation (13) extends Theorem 2 of [7]. Our proof of Theorem 1 (and similarly, that of Theorem 2) is direct: call $H_n(z)$ the function in the right-hand side of (11) and observe that the three expressions that define it in the corresponding components of $\mathbb{C}\setminus(L_r \cup L_{1/r})$ are redundant, in the sense that they are analytic continuations of each other. Thus, $H_n(z)$ is an entire function with a pole of order $n$ at $\infty$, therefore it is a polynomial. Proving that it is orthogonal to all powers $z^m$, $0 \le m < n$, is also straightforward.
Corollary 1. Let $\tau$ be such that $\rho < \tau < 1/\rho$. Then for every $r > \rho$,
\[ \gamma_n = \Delta_e(\infty)\,[\phi'(\infty)]^{n+1/2}\left[1 + O\left(r^{2n}\right)\right], \tag{14} \]
\[ P_n(z) = \frac{\sqrt{\phi'(z)}\,[\phi(z)]^n}{\Delta_e(\infty)\,[\phi'(\infty)]^{n+1/2}}\left[\Delta_e(z) + O\left(\frac{r^n}{\tau^n}\right)\right], \quad z \in \Omega_\tau. \tag{15} \]
If $\rho < \tau < 1$, then
\[ \frac{\Delta_e(\infty)P_n(z)}{[\phi'(\infty)]^{-n-1/2}} = \frac{\Delta_i(z)^{-1}}{2\pi i}\oint_{L_1} \Delta_e(\zeta)\,\Delta_i(\zeta)\,W(\zeta, z)\,\sqrt{\phi'(\zeta)}\,[\phi(\zeta)]^n\,d\zeta + O\left(\tau^n r^{2n}\right), \quad z \in G_\tau. \tag{16} \]
Equalities (15) and (16) hold uniformly as $n \to \infty$. Of course, (14) and (15) are equivalent to
\[ p_n(z) = \sqrt{\phi'(z)}\,[\phi(z)]^n\left[\Delta_e(z) + O\left(\frac{r^n}{\tau^n}\right)\right], \quad z \in \Omega_\tau. \tag{17} \]
Formula (17) is due to Szegő [19, Thm. 16.5], though he established it with a less precise estimate for the rate of decay of the error term. For $\rho < \tau \le 1$, the estimate in (17) was already obtained by Suetin in [17] (see formula (2.16) therein).

Let $Z$ be the set of accumulation points of the zeros of the $P_n$'s, i.e., $Z$ consists of those points $t \in \mathbb{C}$ such that every neighborhood of $t$ contains zeros of infinitely many polynomials $P_n$.

Corollary 2. For every $\tau > \rho$ there is a number $N_\tau$ such that for all $n \ge N_\tau$, $P_n(z)$ has exactly as many zeros on $\Omega_\tau$ as $\Delta_e(z)$ (counting multiplicities), and $\Omega_\rho \cap Z = \{z \in \Omega_\rho : \Delta_e(z) = 0\}$.

1.3. Positive weights with algebraic singularities near $L_1$. We can derive from (16) finer asymptotic formulas for $P_n$ if we know more about the singularities of both the exterior Szegő function and the map $\psi$. For instance, if $h(z) \equiv 1$, then $\Delta_e(z) \equiv 1$, $\Delta_i(z) \equiv 1$, and the behavior of $P_n$ inside $L_1$ only depends on geometric considerations and can be determined with great precision, for instance, when $\partial\Omega_\rho$ is a piecewise analytic curve, in which case the map $\psi$ has finitely many singularities on the circle $\mathbb{T}_\rho$, having an asymptotic expansion about each of them. We will not pursue the analysis of this case here as it is very similar to the one already carried out in [8] for polynomials orthogonal over the interior of an analytic curve with respect to area measure. Instead, we shall concentrate on a case where the behavior of $P_n$ is only influenced by the singularities of $\Delta_e(z)$, which are finitely many, all lying on the band $G_1 \cap \Omega_{\hat\rho}$ and of algebraic type.

Let $a_1, a_2, \ldots, a_s$ be $s \ge 1$ distinct complex numbers all lying on a curve $L_\rho$ with $\hat\rho < \rho < 1$. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_s$ be such that $\lambda_k \in \mathbb{R}\setminus\{0, -1, -2, \ldots\}$ for all $1 \le k \le s$, and let $u$ be the number of subindexes $k$ for which $\lambda_k = \lambda_1$, so that
\[ \lambda_1 = \lambda_2 = \cdots = \lambda_u > \lambda_{u+1} \ge \cdots \ge \lambda_s \quad (1 \le u \le s). \]
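Fig. 1. Illustration of 9 singularities, $a_6$, $a_7$ and $a_8$ are poles

Consider a weight of the form
\[ h(z) := |\omega(z)|^{-2}\prod_{k=1}^{s} |z - a_k|^{2\lambda_k}, \quad z \in L_1, \tag{18} \]
where $\omega(z)$ is an analytic function on $\Omega_\sigma$ for some $\hat\rho < \sigma < \rho$, positive at $\infty$ and never zero on $\overline{\Omega}_1 \cup \{a_1, a_2, \ldots, a_s\}$. Let the numbers $\rho_k$, $\sigma_k$, $\Theta_k$ ($1 \le k \le s$) be defined from the $a_k$'s by the relations
\[ \rho_k := \phi(a_k) = \rho e^{i\Theta_k}, \quad \sigma_k := \sigma e^{i\Theta_k}, \quad 0 \le \Theta_k < 2\pi, \quad 1 \le k \le s. \tag{19} \]
Let $[\sigma_k, \rho_k]$ be the segment joining $\sigma_k$ and $\rho_k$, and define (see Fig. 1)
\[ \Gamma_\sigma := \mathbb{T}_\sigma \cup \{\rho_k : \lambda_k \in \mathbb{N}\} \cup \left(\bigcup_{\lambda_k \notin \mathbb{N}} [\sigma_k, \rho_k]\right), \qquad \Sigma_\sigma := \{z \in \Omega_\sigma : \phi(z) \notin \Gamma_\sigma\}, \tag{20} \]
so that the exterior Szegő function for the weight $h$ in (18) is analytic on $\Sigma_\sigma$ with $a_1, a_2, \ldots, a_s$ being its only singularities on $L_\rho$, since, indeed,
\[ \Delta_e(z) = \omega(z)\prod_{k=1}^{s}\left(\frac{\phi(z)}{z - a_k}\right)^{\lambda_k}, \quad z \in \Sigma_\sigma, \tag{21} \]
with the branches of the $\lambda_k$-power functions chosen so that $[\phi'(\infty)]^{\lambda_k} > 0$. An interior Szegő function $\Delta_i(z)$ for $h$ is given by
\[ \Delta_i(z) = \Delta_i\left(z; |\omega|^{-2}\right)\prod_{k=1}^{s}\left(\frac{(z - a_k)\left(1 - \overline{\varphi(a_k)}\,\varphi(z)\right)}{\varphi(z) - \varphi(a_k)}\right)^{\lambda_k}, \quad z \in G_1. \]
In what follows,
\[ \alpha_k := [\phi'(a_k)]^{\lambda_k - 1/2}\lim_{z\to a_k}\left(\frac{\phi(z)}{z - a_k}\right)^{-\lambda_k}\Delta_e(z), \quad 1 \le k \le s, \tag{22} \]
and $\binom{a}{b}$ stands for the generalized binomial coefficient, i.e.,
\[ \binom{a}{b} := \frac{\Gamma(a + 1)}{\Gamma(b + 1)\,\Gamma(a - b + 1)}, \]
where $\Gamma$ denotes the Euler gamma function.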
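The generalized binomial coefficient just defined is easy to evaluate numerically, which helps to get a feeling for the size of factors such as $\binom{n}{\lambda}$ with non-integer $\lambda$. The sketch below is an illustration only: the function name `gen_binom` and the restriction to arguments for which all three gamma arguments are positive are assumptions of this example. It works with logarithms of the gamma function to avoid overflow for large $n$, and compares the result with the standard large-$n$ behavior $\binom{n}{\lambda} \approx n^{\lambda}/\Gamma(\lambda + 1)$, a general fact about gamma-function ratios rather than a statement from the text.

```python
import math


def gen_binom(a, b):
    """Generalized binomial coefficient Gamma(a+1) / (Gamma(b+1) Gamma(a-b+1)).

    This sketch only treats the case where a + 1, b + 1 and a - b + 1 are
    all positive, so that log-gamma values can be used safely.
    """
    if min(a + 1, b + 1, a - b + 1) <= 0:
        raise ValueError("arguments outside the range handled by this sketch")
    return math.exp(math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1))


if __name__ == "__main__":
    lam = 0.5
    for n in (10, 1000, 10 ** 6):
        exact = gen_binom(n, lam)
        approx = n ** lam / math.gamma(lam + 1)
        print(n, exact, approx, exact / approx)
```

For integer arguments the function reduces to the usual binomial coefficient, up to rounding.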
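Theorem 3. (a) For all $z \in G_1\setminus\partial\Sigma_\sigma$,
\[ \frac{\Delta_e(\infty)P_n(z)}{[\phi'(\infty)]^{-n-1/2}} = \begin{cases} \Delta_e(z)\sqrt{\phi'(z)}\,[\phi(z)]^n, & z \in G_1 \cap \Sigma_\sigma, \\ 0, & z \in G_\sigma, \end{cases} \; + \; \frac{\binom{n}{\lambda_1 - 1}}{\Delta_i(z)}\left[\sum_{k=1}^{u} \alpha_k\,\Delta_i(a_k)\,W(a_k, z)\,[\phi(a_k)]^{n+1} + \rho^n\epsilon_n(z)\right], \tag{23} \]
where $\epsilon_n(z)$ converges locally uniformly to $0$ on $G_1\setminus\partial\Sigma_\sigma$.
(b) For every $1 \le j \le s$,
\[ \frac{\Delta_e(\infty)P_n(a_j)}{[\phi'(\infty)]^{-n-1/2}} = \binom{n}{\lambda_j}\alpha_j\,\phi'(a_j)\,\phi(a_j)^n + \frac{\binom{n}{\lambda_1 - 1}}{\Delta_i(a_j)}\sum_{1\le k\le u,\ k\neq j} \alpha_k\,\Delta_i(a_k)\,W(a_k, a_j)\,[\phi(a_k)]^{n+1} + o\left(n^{\max\{\lambda_j,\,\lambda_1 - 1\}}\rho^n\right) \tag{24} \]
as $n \to \infty$.

Several remarks are in order.

Remark 2. The proof of Theorem 3 yields the following estimates for the rate of decay of the functions $\epsilon_n(z)$ in (23). If $\lambda_1 = 1$ and $u = s$, then for every compact set $K \subset G_1\setminus\partial\Sigma_\sigma$, there is $0 < \delta < 1$ such that $\epsilon_n(z) = O(\delta^n)$ uniformly on $K$ as $n \to \infty$. Otherwise,
\[ \epsilon_n(z) = \begin{cases} O\left(n^{\lambda_{u+1} - \lambda_1}\right), & \text{if } \lambda_1 = 1,\ u < s, \\ O\left(n^{-1}\right), & \text{if } \lambda_1 \neq 1,\ u = s, \\ O\left(n^{-\min\{1,\ \lambda_1 - \lambda_{u+1}\}}\right), & \text{if } \lambda_1 \neq 1,\ u < s, \end{cases} \]
locally uniformly on $G_1\setminus\partial\Sigma_\sigma$ as $n \to \infty$. Likewise, a better and generally exact estimate for the $o$-error term in (24) can be easily obtained from the proof of (24), though a somewhat tedious case comparison is required.

Remark 3. Many fine results on the location and distribution of the zeros of the polynomials $P_n(z)$ follow from Theorem 3(a). For instance:
(a) For every $\hat\rho < \eta < \rho$ there is a number $N_\eta$ such that for all $n \ge N_\eta$, $P_n(z)$ has at most $u - 1$ zeros on $G_\eta$, counting multiplicities.
(b) $Z \cap G_\rho$ consists of those points $t \in G_\rho$ satisfying an equation of the form
\[ \sum_{k=1}^{u} \alpha_k\,\Delta_i(a_k)\,W(a_k, t)\,e^{i\theta_k} = 0, \]
with angles $\theta_1, \ldots, \theta_u$ for which it is possible to find a subsequence $\{n_j\}_{j\ge 1} \subset \mathbb{N}$ such that
\[ e^{i\theta_k} = \lim_{j\to\infty} e^{i(n_j + 1)\Theta_k}, \quad 1 \le k \le u. \tag{25} \]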
(c) For each $n \ge 1$, let $\mu_n$ be the normalized counting measure of the zeros $z_{n,1}, z_{n,2}, \dots, z_{n,n}$ of $P_n$, that is, $\mu_n := n^{-1} \sum_{k=1}^{n} \delta_{z_{n,k}}$, where $\delta_z$ denotes the Dirac unit point mass at $z$. Let $\mu_{L_\rho}$ be the equilibrium measure of the compact set $L_\rho$, whose value at any given Borel set $B \subset L_\rho$ is
\[
\mu_{L_\rho}(B) := \frac{1}{2\pi\rho} \int_{\phi(B)} |dt|. \tag{26}
\]
Then, the sequence $\{\mu_n\}_{n \ge 1}$ converges in the weak*-topology to $\mu_{L_\rho}$, i.e., for every continuous function $f$ defined on $\mathbb{C}$, $\lim_{n \to \infty} \int f\, d\mu_n = \int f\, d\mu_{L_\rho}$.

(d) $L_\rho \subset Z$.

Moreover, a result similar to Theorem 4 of [7] (see also Theorems 11.1 and 11.2 of [16]) on the separation and speed of convergence to $L_\rho$ of the zeros of $P_n$ can also be obtained from (23).

Statements (a) and (b) follow straightforwardly from (23) and the Hurwitz theorem. Which $u$-tuples $\{\theta_1, \dots, \theta_u\}$ satisfy (25) depends on the specific values of the angles $\Theta_k$ and can be characterized as was done in [18, Thm. 5], [7, Prop. 3] (or, if more details are needed, see [8, Sect. 2.2]). Statement (d) is a clear consequence of (c), given that, by definition (26), the support of $\mu_{L_\rho}$ is $L_\rho$. The proof of (c) is based on standard arguments of logarithmic potential theory: by (15), Corollary 2 and statement (a) above, any measure $\mu$ that is the weak*-limit of some subsequence $\{\mu_{n_j}\}_{j \ge 1}$ is supported on $L_\rho$ and satisfies
\[
\int \log \frac{1}{|z - t|}\, d\mu(t) = \lim_{j \to \infty} \int \log \frac{1}{|z - t|}\, d\mu_{n_j}(t) = \lim_{j \to \infty} n_j^{-1} \log |P_{n_j}(z)|^{-1} = \log |\phi'(\infty)/\phi(z)|, \qquad z \in \Omega_\rho.
\]
On the other hand, it is not difficult to verify from (26) that
\[
\int \log \frac{1}{|z - t|}\, d\mu_{L_\rho}(t) = \log |\phi'(\infty)/\phi(z)|, \qquad z \in \Omega_\rho,
\]
i.e., the logarithmic potential of $\mu$ coincides outside $L_\rho$ with that of $\mu_{L_\rho}$, which, by a well-known theorem of Carleson [15, Thm. 4.13], implies that $\mu = \mu_{L_\rho}$.

Remark 4. Values of $\lambda_k \in \{0, -1, -2, \dots\}$ are purposely excluded because their corresponding factors $(z - a_k)^{\lambda_k}$ would not create a singularity (but a zero) for $\Delta_e(z)$ at $a_k$, and therefore, these factors may be simply regarded as being part of the function $\omega(z)$. We also note that among the weights defined by (18) are those of the form
\[
h(z) := |\omega_1(z)|^{-2} \prod_{k=1}^{s} |z - a_k|^{2\lambda_k}, \qquad \{a_1, a_2, \dots, a_s\} \subset L_\rho \cup L_{1/\rho},
\]
with $\omega_1(z)$ an analytic function on $\Omega_\rho$ that is never zero on $\Omega_1 \cup \{a_k \in L_\rho\} \cup \{a_k^* : a_k \in L_{1/\rho}\}$ (here $a_k^*$ denotes the Schwarz reflection of $a_k$ about $L_1$). For in such a case, $h(z)$ can also be written in the form (18) as follows:
\[
h(z) = |\omega(z)|^{-2} \Biggl( \prod_{k \,:\, a_k \in L_\rho} |z - a_k|^{2\lambda_k} \Biggr) \Biggl( \prod_{k \,:\, a_k \in L_{1/\rho}} |z - a_k^*|^{2\lambda_k} \Biggr), \qquad z \in L_1,
\]
with
ω(z) = ω1 (z)
|φ(ak )|
λk
k : ak ∈L 1/ρ
φ(z) − φ(ak∗ ) z − ak · z − ak∗ φ(z) − φ(ak )
λk
.
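2. Proofs

Proof of Theorem 1. Let us denote by $H_n(z)$ the right-hand side of (11), which is originally defined on $\mathbb{C} \setminus (L_r \cup L_{1/r})$, and let us prove that $H_n(z)$ is indeed an entire function. Let
\[
H_n^+(z) := \Delta_e(z) \sqrt{\phi'(z)}\, [\phi(z)]^n \left( 1 + \sum_{k=0}^{\infty} \frac{1}{2\pi i} \int_{L_1} \frac{f_n^{(2k+1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta) [\phi(\zeta) - \phi(z)]} \right), \qquad z \in \Omega_1, \tag{27}
\]
which, in view of (8), is well-defined and analytic on $\Omega_1$. This function provides the analytic continuation of $H_n|_{\Omega_{1/r}}$ to $\Omega_1$, which follows from the very definition of $H_n|_{\Omega_{1/r}}$, given that for all $k \ge 0$ and $z \in \Omega_{1/r}$ (and by deforming $L_{1/r}$ into $L_1$),
\[
f_n^{(2k+2)}(z) := \frac{1}{2\pi i} \int_{L_{1/r}} \frac{f_n^{(2k+1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta) [\phi(\zeta) - \phi(z)]} = \frac{1}{2\pi i} \int_{L_1} \frac{f_n^{(2k+1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta) [\phi(\zeta) - \phi(z)]}.
\]
Moreover, by the residue theorem (deforming $L_1$ back into $L_{1/r}$ in (27)), we find that for every $z \in \Omega_1 \cap G_{1/r}$,
\[
H_n^+(z) = \Delta_e(z) \sqrt{\phi'(z)}\, [\phi(z)]^n \left( 1 + \sum_{k=0}^{\infty} \frac{1}{2\pi i} \int_{L_{1/r}} \frac{f_n^{(2k+1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta) [\phi(\zeta) - \phi(z)]} \right) - \frac{1}{\Delta_i(z)} \sum_{k=0}^{\infty} f_n^{(2k+1)}(z),
\]
that is, the analytic continuation $H_n^+$ of $H_n|_{\Omega_{1/r}}$ to $\Omega_1$ coincides for values of $z \in \Omega_1 \cap G_{1/r}$ with $H_n|_{\Omega_r \cap G_{1/r}}$ as defined by (11). Similarly,
\[
H_n^-(z) := \frac{1}{\Delta_i(z)} \sum_{k=0}^{\infty} \frac{1}{2\pi i} \int_{L_1} f_n^{(2k)}(\zeta) \Delta_e(\zeta) \Delta_i(\zeta) W(\zeta, z) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^n\, d\zeta, \qquad z \in G_1,
\]
provides the analytic continuation of $H_n|_{G_r}$ to $G_1$, which for values of $z \in \Omega_r \cap G_1$ coincides precisely with $H_n|_{\Omega_r \cap G_{1/r}}$.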
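Thus, $H_n(z)$ is an entire function and
\[
\lim_{z \to \infty} \frac{H_n(z)}{z^n} = \Delta_e(\infty) [\phi'(\infty)]^{n+1/2} \left( 1 + \sum_{k=1}^{\infty} f_n^{(2k)}(\infty) \right) = \Delta_e(\infty) [\phi'(\infty)]^{n+1/2}.
\]
By Liouville's theorem, $H_n(z)$ is a polynomial of exact degree $n$, whose leading coefficient is $\Delta_e(\infty) [\phi'(\infty)]^{n+1/2}$. Now, from the definition of $H_n|_{\Omega_r \cap G_{1/r}}$ we have
\[
\frac{1}{2\pi} \int_{L_1} H_n(z)\, \overline{z^m}\, h(z)\, |dz| = \sum_{k=0}^{\infty} \frac{1}{2\pi} \int_{L_1} \sqrt{\phi'(z)}\, [\phi(z)]^n\, \overline{z^m}\, f_n^{(2k)}(z) \Delta_e(z) h(z)\, |dz| - \sum_{k=0}^{\infty} \frac{1}{2\pi} \int_{L_1} f_n^{(2k+1)}(z)\, \overline{z^m}\, \Delta_i(z)^{-1} h(z)\, |dz|,
\]
so that Theorem 1 will follow at once if we show that with
\[
A_{n,k} := \frac{1}{2\pi i} \int_{L_{1/r}} \frac{f_n^{(2k-1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n-1}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta)},
\]
we have
\[
\frac{1}{2\pi} \int_{L_1} \sqrt{\phi'(z)}\, [\phi(z)]^n\, \overline{z^m}\, f_n^{(2k)}(z) \Delta_e(z) h(z)\, |dz| =
\begin{cases}
0, & 0 \le m < n,\ k \ge 0,\\
\Delta_e(\infty)^{-1} [\phi'(\infty)]^{-n-1/2}, & m = n,\ k = 0,\\
\Delta_e(\infty)^{-1} [\phi'(\infty)]^{-n-1/2} A_{n,k}, & m = n,\ k \ge 1,
\end{cases} \tag{28}
\]
and
\[
\frac{1}{2\pi} \int_{L_1} f_n^{(2k+1)}(z)\, \overline{z^m}\, \Delta_i(z)^{-1} h(z)\, |dz| = 0, \qquad n, m \ge 0,\ k \ge 0. \tag{29}
\]
First, we obtain by making the change of variables $z = \psi(w)$ that for all $0 \le m \le n$,
\[
\frac{1}{2\pi} \int_{L_1} \sqrt{\phi'(z)}\, [\phi(z)]^n\, \overline{z^m}\, f_n^{(2k)}(z) \Delta_e(z) h(z)\, |dz| = \frac{1}{2\pi} \int_{T_1} \overline{\left( \frac{\sqrt{\psi'(w)}\, [\psi(w)]^m}{\Delta_e(\psi(w))} \right)}\, w^n f_n^{(2k)}(\psi(w))\, |dw|, \tag{30}
\]
where we have used that for $z \in L_1$, $h(z) = |\Delta_e(z)|^{-2}$. Now, the function $\sqrt{\psi'(w)}\, [\psi(w)]^m / \Delta_e(\psi(w))$ is analytic on $E_1 \setminus \{\infty\}$ with a pole of order $m$ at $\infty$, so that from its Laurent expansion at infinity we obtain that for certain coefficients $a_j$ (that depend on $n$ and $m$),
\[
\overline{\left( \frac{\sqrt{\psi'(w)}\, [\psi(w)]^m}{\Delta_e(\psi(w))} \right)}\, w^n = \Delta_e(\infty)^{-1} [\phi'(\infty)]^{-m-1/2}\, w^{n-m} \left( 1 + \sum_{j=1}^{\infty} a_j w^j \right), \qquad w \in T_1. \tag{31}
\]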
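On the other hand, from the definition of $f_n^{(2k)}(z)$ for $k \ge 1$ we see that
\[
f_n^{(2k)}(\psi(w)) = \frac{1}{2\pi i} \int_{L_{1/r}} \frac{f_n^{(2k-1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta) [\phi(\zeta) - w]} = \frac{1}{2\pi i} \int_{T_{1/r}} \frac{f_n^{(2k-1)}(\psi(t))\, t^{-n} \sqrt{\psi'(t)}\, dt}{\Delta_e(\psi(t)) \Delta_i(\psi(t)) (t - w)}, \qquad w \in E_\rho \setminus T_{1/r},
\]
is indeed analytic in all of $\mathbb{C} \setminus T_{1/r}$, and we obtain from its Taylor expansion about 0 that for certain coefficients $b_j$ (that depend on $n$ and $k$),
\[
f_n^{(2k)}(\psi(w)) = \frac{1}{2\pi i} \int_{L_{1/r}} \frac{f_n^{(2k-1)}(\zeta) \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^{-n-1}\, d\zeta}{\Delta_e(\zeta) \Delta_i(\zeta)} + \sum_{j=1}^{\infty} b_j w^j, \qquad w \in T_1. \tag{32}
\]
Taking into account that $f_n^{(0)}(z) \equiv 1$, we then get (28) by combining (30), (31) and (32).

Similarly, if $\varphi$ is a conformal map of $G_1$ onto $D_1$ and $\chi(w)$ is its inverse, we have
\[
\frac{1}{2\pi} \int_{L_1} f_n^{(2k+1)}(z)\, \overline{z^m}\, \Delta_i(z)^{-1} h(z)\, |dz| = \frac{1}{2\pi} \int_{T_1} f_n^{(2k+1)}(\chi(w)) \sqrt{\chi'(w)}\; \overline{\sqrt{\chi'(w)}\, [\chi(w)]^m \Delta_i(\chi(w))}\; |dw|, \tag{33}
\]
where we have used that for $z \in L_1$, $h(z) = |\Delta_i(z)|^2$. On the one hand, $\sqrt{\chi'(w)}\, [\chi(w)]^m \Delta_i(\chi(w))$ is analytic on $D_1$, and from its Taylor expansion we obtain
\[
\overline{\sqrt{\chi'(w)}\, [\chi(w)]^m \Delta_i(\chi(w))} = \sum_{j=0}^{\infty} c_j w^{-j}, \qquad w \in T_1. \tag{34}
\]
On the other hand, from the definition of $f_n^{(2k+1)}(z)$ and (7) we have that
\[
f_n^{(2k+1)}(\chi(w)) \sqrt{\chi'(w)} = -\frac{1}{2\pi i} \int_{L_r} \frac{f_n^{(2k)}(\zeta) \Delta_e(\zeta) \Delta_i(\zeta) \sqrt{\varphi'(\zeta)} \sqrt{\phi'(\zeta)}\, [\phi(\zeta)]^n\, d\zeta}{\varphi(\zeta) - w}
\]
is analytic on $\mathbb{C} \setminus \varphi(L_r) \supset E_1$ and vanishes at $\infty$, so that its Laurent expansion at $\infty$ restricted to $T_1$ is of the form
\[
f_n^{(2k+1)}(\chi(w)) \sqrt{\chi'(w)} = \sum_{j=1}^{\infty} d_j w^{-j}, \qquad w \in T_1. \tag{35}
\]
Thus, (29) follows by inserting (34) and (35) in (33).

Proof of Theorem 2. Proceed just as in the proof of Theorem 1 above.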
Proof of Corollary 1. Let τ and r be such that ρ < τ < 1/ρ, ρ < r < min τ, τ −1 , so that L τ ⊂ G 1/r ∩ Ωr . By inequalities (8) and (9), we see that ∞
Λ2 Λ Mr max(ζ,z)∈L r ×L τ |W (ζ, z)|
(2k+1)
(z) ≤ r 3n+1 r r , z ∈ Lτ ,
fn 1/r − r − Λr Λr Mr r 2n
k=1 ∞
(2k+2)
(z) ≤ r 2n
fn
k=0
(1/r − r )Λr Λr Mr , z ∈ Lτ , (1/r − τ )(1/r − r − Λr Λr Mr r 2n )
and we obtain from Theorem 1 that ∆e (∞)Pn (z) (z) [φ(z)]n − ∆ (z)−1 f (1) (z) + O τ n r 2n , z ∈ L . = ∆ (z) φ e i τ n [φ (∞)]−n−1/2 (36) Given that, again by (8), f n(1) (z) = O (r n ) uniformly in z ∈ L τ as n → ∞, we get from (36) that (15) holds uniformly in z ∈ L τ , and by the maximum modulus principle for analytic functions, it also holds on Ω τ . If now τ < 1, then from the definition of f n(1) (z) and the residue theorem we obtain that for all z ∈ L τ , ∆i (z)−1 ∆e (ζ )∆i (ζ )W (ζ, z) φ (ζ ) [φ(ζ )]n dζ 2πi Lr = ∆e (z) φ (z) [φ(z)]n ∆i (z)−1 − ∆e (ζ )∆i (ζ )W (ζ, z) φ (ζ ) [φ(ζ )]n dζ, 2πi L1
f n(1) (z) = −
which together with (36) yields that (16) holds for z ∈ L τ , and again by the maximum modulus principle for analytic functions, it also holds on G τ . Equality (14) follows, for instance, from (12) and (8), or from (13). Proof of Corollary 2. This follows from (15) by an application of Hurwitz theorem. Proof of Theorem 3. We first prove a proposition that will help the proof of Theorem 3 to go through smoothly. The following notation will be used. For each δ > 0 and t ∈ C, Dδ (t) := {w : |w − t| < δ}, Tδ (t) := {w : |w − t| = δ} = ∂ Dδ (t). Let 0 < σ < ρ be given numbers, and define δ := ρ − σ . Suppose that v(t, z) is a function of two complex variables that is analytic in the variable t on the closed disk D2δ (ρ) for each z ∈ E (E certain set), and that sup |v(t, z)| : (t, z) ∈ D2δ (ρ) × E < ∞,
so that, by the Cauchy integral formula, we also have that for every integer p ≥ 0, there is a constant 0 < M p < ∞ such that |∂ p v(t, z)/∂t p | ≤ M p , (t, z) ∈ Dδ (ρ) × E. For β ∈ R\ {0, −1, −2, . . .}, let the function (t −ρ)−β be defined for t ∈ C\(−∞, ρ] according to the branch of the argument −π < arg(t − ρ) < π, t ∈ C\(−∞, ρ], and let −β
(t − ρ)− := −β
(t − ρ)+ :=
lim
(z − ρ)−β , t ∈ C\(−∞, ρ],
lim
(z − ρ)−β , t ∈ C\(−∞, ρ],
z→t, z>0 z→t, z 1, O n β −1 uniformly in z ∈ E as n → ∞.
Next, consider a β that is not an integer. Let β¯ be the smallest nonnegative integer not less than β. Consecutive integrations by parts over Tδ (ρ) yield I =
β¯ '
(−1) j−1 1 ∂ j−1 & j−β −β+ j n
v(t, z)t (σ − ρ)− − (σ − ρ)+ (j
j−1 2πi ∂t (l − β) t=σ j=1
+ 2πi
(β¯
l=1
¯ (−1)β l=1 (l − β)
¯
Tδ (ρ)
(t − ρ)β−β
¯ ' ∂β & v(t, z)t n dt. ¯ ∂t β
(39)
We can now deform Tδ (ρ) into the two-sided segment [σ, ρ] without altering the value of this last integral, and so obtain from (39) and (38) that " ρ ∂ β¯ &v(t, z)t n ' 1 ¯ β¯ β−β I = (t − ρ)+ − (t − ρ)− dt (β¯ ∂t β¯ 2πi l=1 (β − l) σ ¯
+
β
O n j−1 σ n
j=1
=
n−β+1 ¯ sin(π(β − β))n!ρ (β¯ π Γ (n − β¯ + 1) l=1 (β − l) ¯ +O n β−1 σ n .
"
1 σ/ρ
¯ ¯ (1 − x)β−β x n−β v(ρx, z) + O n −1 d x (40)
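On the one hand, there is some constant $M_1 > 0$ such that
\[
|v(t, z) - v(\rho, z)| \le \left| \int_{[\rho, t]} \frac{\partial v(w, z)}{\partial w}\, dw \right| \le M_1 |t - \rho|, \qquad (t, z) \in D_\delta(\rho) \times E. \tag{41}
\]
On the other hand, for every integer $n \ge 0$ and real $\alpha > -1$, we have
\[
\int_{\sigma/\rho}^{1} (1 - x)^\alpha x^n\, dx = \int_{0}^{1} (1 - x)^\alpha x^n\, dx - \int_{0}^{\sigma/\rho} (1 - x)^\alpha x^n\, dx = \frac{\Gamma(\alpha + 1) \Gamma(n + 1)}{\Gamma(n + \alpha + 2)} + O\bigl( n^{-1} (\sigma/\rho)^n \bigr) = \frac{\Gamma(\alpha + 1)(1 + o(1))}{n^{\alpha + 1}} \qquad (n \to \infty),
\]
so that by (41),
\[
\begin{aligned}
\int_{\sigma/\rho}^{1} (1 - x)^{\bar\beta - \beta} x^{n - \bar\beta} \bigl[ v(\rho x, z) + O(n^{-1}) \bigr]\, dx
&= v(\rho, z) \int_{\sigma/\rho}^{1} (1 - x)^{\bar\beta - \beta} x^{n - \bar\beta}\, dx + n^{-1} \int_{\sigma/\rho}^{1} O\bigl( (1 - x)^{\bar\beta - \beta} x^{n - \bar\beta} \bigr)\, dx \\
&\quad + \int_{\sigma/\rho}^{1} O\bigl( (1 - x)^{1 + \bar\beta - \beta} x^{n - \bar\beta} \bigr)\, dx \\
&= \frac{\Gamma(\bar\beta - \beta + 1) \Gamma(n - \bar\beta + 1)\, v(\rho, z)}{\Gamma(n - \beta + 2)} + O\bigl( n^{-\bar\beta + \beta - 2} \bigr) \qquad (n \to \infty).
\end{aligned} \tag{42}
\]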
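The Beta-integral evaluation and its leading-order behaviour used above can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the proof: the function names, the sample value $\alpha = 1/2$ and the step count are arbitrary choices, and a plain midpoint rule stands in for the exact integral over $[0,1]$.

```python
import math

def beta_integral_exact(alpha: float, n: int) -> float:
    """Gamma(alpha+1) * Gamma(n+1) / Gamma(n+alpha+2), computed via log-gamma."""
    return math.exp(math.lgamma(alpha + 1) + math.lgamma(n + 1)
                    - math.lgamma(n + alpha + 2))

def beta_integral_midpoint(alpha: float, n: int, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of (1-x)**alpha * x**n over [0, 1]."""
    h = 1.0 / steps
    return h * sum((1 - (i + 0.5) * h) ** alpha * ((i + 0.5) * h) ** n
                   for i in range(steps))

alpha = 0.5
# Quadrature versus the closed form for a moderate n.
print(beta_integral_midpoint(alpha, 20), beta_integral_exact(alpha, 20))

# Leading-order behaviour Gamma(alpha+1) / n**(alpha+1): the ratio tends to 1.
n = 2000
print(beta_integral_exact(alpha, n) / (math.gamma(alpha + 1) / n ** (alpha + 1)))
```

The part of the integral over $[0, \sigma/\rho]$ is exponentially small in $n$, which is why it only contributes the $O(n^{-1}(\sigma/\rho)^n)$ term and is ignored in this check.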
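Then, from (40) and (42), and taking into account that
\[
\Gamma(\beta - \bar\beta)\, \Gamma(\bar\beta - \beta + 1) = \frac{\pi}{\sin(\pi(\beta - \bar\beta))}, \qquad \Gamma(\beta - \bar\beta) \prod_{l=1}^{\bar\beta} (\beta - l) = \Gamma(\beta),
\]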
we obtain that (37) also holds if β is not an integer. Having proven the proposition above, it is now easy to prove Theorem 3. Let us start by fixing a number σ with ρ < σ < ρ and such that ω(z) is analytic on Ω σ . Let the corresponding points ρk , σk (1 ≤ k ≤ s) and the sets Γσ and Σσ be defined as in (19)–(20). Let E ⊂ G 1 ∩ Σσ and F ⊂ G σ be compact sets, and let ρ < τ < 1 be such that E ⊂ G τ . Then, according to (16), we have that for all z ∈ G τ ⊃ E ∪ F ∪ {a1 , . . . , as }, ∆e (∞)Pn (z) ∆i (z)−1 = ∆e (ψ(t))∆i (ψ(w))W (ψ(w), z) ψ (w) w n dw −n−1/2 [φ (∞)] 2πi T1 n 3n/2 . (43) +O τ ρ Choose σ such that σ < σ < ρ, and if δ := ρ − σ , then the closed disks D2δ (ρk ), 1 ≤ k ≤ s, are pairwise disjoint, and D2δ (ρk ) ⊂ D1 ∩ Eσ \{φ(z) : z ∈ E}, 1 ≤ k ≤ s. Define σ eiΘk , 1 ≤ k ≤ s, σk := and the positively oriented contour C σk ] ∪ ∪sk=1 Tδ (ρk ) , σ := Tσ ∪ ∪λk ∈N [σk , σk ] is viewed as having two sides. where each segment [σk , Since ω(z) is analytic on Ω σ and ∆e (ψ(w)) = ω(ψ(w))
s k=1
w w − ρk
λk s k=1
w − ρk ψ(w) − ψ(ρk )
λk
,
(44)
w ∈ Eσ \Γσ , we have that for all z ∈ E ∪ F ∪ {a1 , a2 , . . . , as }, the function (in the variable w) F(w, z) := ∆e (ψ(w))∆i (ψ(w))W (ψ(w), z) ψ (w) (45) is analytic on {w : σ < |w| < 1}\Γσ (with the exception of the point φ(z) in case z ∈ E, where it has a simple pole) with continuous boundary values on T1 ∪ Γσ \{ρ1 , ρ2 , . . . , ρs } when viewing each segment
[σk , ρk ) as having two sides. Consequently, by deforming in (43) T1 into C σ (and applying the residue theorem in case z ∈ E) we obtain that √ ∆e (∞)Pn (z) ∆e (z) φ (z)[φ(z)]n , z ∈ E, − 0, z ∈ F ∪ {a1 , a2 , . . . , as }, [φ (∞)]−n−1/2 −1 ∆i (z) F(w, z)w n dw + O τ n ρ 3n/2 = 2πi C σ (46) s n 1 −1 n = ∆i (z) F(w, z)w dw + O σ 2πi Tδ (ρk ) k=1 + O τ n ρ 3n/2 , z ∈ E ∪ F ∪ {a1 , . . . , as }. If we now specify − π < arg(t − ρ) < π , t ∈ C\(−∞, ρ],
(47)
−π < arg(t) < π , t ∈ D2δ (ρ),
(48)
we see from (44), (45) and (7) that for every 1 ≤ k ≤ s, σ , ρ], F(teiΘk , z) = e−iλk Θk (t − ρ)−λk Fk (t, z), t ∈ Dδ (ρ)\[ ) * z ∈ E ∪ F ∪ a j : j = k , where Fk (t, z) ) is analytic* (as a function of t) on the closed disk D2δ (ρ) for every z ∈ E ∪ F ∪ ρ j : j = k , and −λk φ(z) λk ∆e (z) . (49) Fk (ρ, z) = ρk [φ (ak )]λk −1/2 ∆i (ak )W (ak , z) lim z→ak z − ak * ) Hence we get from the proposition proven above that for every z ∈ E ∪ F ∪ a j : j = k , 1 ei(n−λk +1)Θk F(w, z)w n dw = (t − ρ)−λk t n Fk (t, z)dt 2πi Tδ (ρk ) 2πi Tδ (ρ) n (50) Fk (ρ, z)ρkn−λk +1 = λk − 1 0, if λk = 1, + O n λk −2 ρ n , if λk = 1. Thus, part (a) of Theorem 3 follows from (46), (49) and (50). Part (b) follows similarly. The function W (ψ(w), a j ) is analytic in {w : σ < |w| < 1, w = ρ j }, with a simple pole at ρ j of residue φ (a j ), and with the specifications (47) and (48), we have σ , ρ], F(teiΘ j , a j ) = e−i(λ j +1)Θ j (t − ρ)−λ j −1 U j (t), t ∈ Dδ (ρ)\[ where U j (t) is analytic on D2δ (ρ) and U j (ρ) =
λ ρ j j [φ (a j )]λ j +1/2 ∆i (a j ) lim z→a
j
φ(z) z − aj
−λ j
∆e (z) ,
(51)
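so that
\[
\frac{1}{2\pi i} \int_{T_\delta(\rho_j)} F(w, a_j)\, w^n\, dw = \frac{e^{i(n - \lambda_j)\Theta_j}}{2\pi i} \int_{T_\delta(\rho)} (t - \rho)^{-\lambda_j - 1}\, t^n\, U_j(t)\, dt = \binom{n}{\lambda_j} U_j(\rho)\, \rho_j^{\,n - \lambda_j} + O\bigl( n^{\lambda_j - 1} \rho^n \bigr), \tag{52}
\]
and part (b) of the theorem follows by combining (46), (50), (51) and (52).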
References

1. Akhiezer, N.I.: Orthogonal polynomials on several intervals. Soviet Math. Dokl. 1, 989–992 (1960)
2. Akhiezer, N.I., Tomchuk, Yu. Ya.: On the theory of orthogonal polynomials over several intervals (in Russian). Dokl. Akad. Nauk SSSR 138, 743–746 (1961)
3. Aptekarev, A.I.: Asymptotic properties of polynomials orthogonal on a system of contours and periodic motions of Toda chains. Mat. Sb. 125, 231–258 (1984)
4. Davis, P.J.: The Schwarz function and its applications. The Carus Mathematical Monographs Vol. 17. Washington, DC: The Mathematical Association of America, 1974
5. Kaliaguine, V.A.: On asymptotics of L_p extremal polynomials on a complex curve (0 < p < ∞). J. Approx. Theory 74, 226–236 (1993)
6. Kaliaguine, V.A.: A note on the asymptotics of orthogonal polynomials on a complex arc: the case of a measure with a discrete part. J. Approx. Theory 80, 138–145 (1995)
7. Martínez-Finkelshtein, A., McLaughlin, K.T.-R., Saff, E.B.: Szegő orthogonal polynomials with respect to an analytic weight: canonical representation and strong asymptotics. Constr. Approx. 24, 319–363 (2006)
8. Miña-Díaz, E.: An asymptotic integral representation for Carleman orthogonal polynomials. Int. Math. Res. Not. vol. 2008, rnn065, 38pp. (2008) doi:10.1093/imrn/rnn065
9. Peherstorfer, F.: On Bernstein-Szegő orthogonal polynomials on several intervals. SIAM J. Math. Anal. 21, 461–482 (1990)
10. Peherstorfer, F.: Zeros of polynomials orthogonal on several intervals. Int. Math. Res. Not. 7, 361–385 (2003)
11. Peherstorfer, F., Yuditskii, P.: Asymptotic behavior of polynomials orthonormal on a homogeneous set. J. Anal. Math. 89, 113–154 (2003)
12. Peherstorfer, F.: On the zeros of orthogonal polynomials: the elliptic case. Constr. Approx. 20, 377–397 (2004)
13. Lukashov, A.L., Peherstorfer, F.: Zeros of polynomials orthogonal on two arcs of the unit circle. J. Approx. Theory 132, 42–71 (2005)
14. Rudin, W.: Real and complex analysis. 3rd ed., New York, McGraw-Hill, 1986
15. Saff, E.B., Totik, V.: Logarithmic potentials with external fields. Berlin: Springer-Verlag, 1997
16. Simon, B.: Fine structure of the zeros of orthogonal polynomials, I. A Tale of Two Pictures. ETNA 25, 328–368 (2006)
17. Suetin, P.K.: Fundamental properties of polynomials orthogonal on a contour. Russ. Math. Surv. 21, 35–83 (1966)
18. Szabados, J.: On some problems connected with polynomials orthogonal on the complex unit circle. Act. Math. Scien. Hung. 33, 197–210 (1979)
19. Szegő, G.: Orthogonal Polynomials. Amer. Math. Soc. Colloq. Publ. Vol. 23, Providence, RI, Amer. Math. Soc., 4th ed., 1975
20. Szegő, G.: Über orthogonale Polynome, die zu einer gegebenen Kurve der komplexen Ebene gehören. Math. Z. 9, 218–270 (1921)
21. Widom, H.: Extremal polynomials associated with a system of curves in the complex plane. Adv. Math. 3, 127–232 (1969)
22. Widom, H.: Polynomials associated with measures in the complex plane. J. Math. Mech. 16, 997–1013 (1967)

Communicated by B. Simon
Commun. Math. Phys. 285, 1129–1163 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0653-8
Communications in
Mathematical Physics
Cosmological Horizons and Reconstruction of Quantum Field Theories

Claudio Dappiaggi (1), Valter Moretti (2,3,4), Nicola Pinamonti (1)

1 II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, D-22761 Hamburg, Germany
2 Istituto Nazionale di Fisica Nucleare-Gruppo Collegato, Trento, Italy
3 Dipartimento di Matematica, Università di Trento, via Sommarive 14, I-38050 Povo (TN), Italy
4 Istituto Nazionale di Alta Matematica “F.Severi”– GNFM, via Madonna del Diano, 50019 Sesto Fiorentino, Italy

Received: 12 December 2007 / Accepted: 17 July 2008
Published online: 29 October 2008 – © Springer-Verlag 2008
Dedicated to Professor Klaus Fredenhagen on the occasion of his 60th birthday.

Abstract: As a starting point, we state some relevant geometrical properties enjoyed by the cosmological horizon of a certain class of Friedmann-Robertson-Walker backgrounds. Those properties are generalised to a larger class of expanding spacetimes $M$ admitting a geodesically complete cosmological horizon $\Im^-$ common to all co-moving observers. This structure is later exploited in order to recast, in a cosmological background, some recent results for a linear scalar quantum field theory in spacetimes asymptotically flat at null infinity. Under suitable hypotheses on $M$, encompassing both the cosmological de Sitter background and a large class of other FRW spacetimes, the algebra of observables for a Klein-Gordon field is mapped into a subalgebra of the algebra of observables $W(\Im^-)$ constructed on the cosmological horizon. There is exactly one pure quasifree state $\lambda$ on $W(\Im^-)$ which fulfills a suitable energy-positivity condition with respect to a generator related to the cosmological time displacements. Furthermore $\lambda$ induces a preferred physically meaningful quantum state $\lambda_M$ for the quantum theory in the bulk. If $M$ admits a timelike Killing generator preserving $\Im^-$, then the associated self-adjoint generator in the GNS representation of $\lambda_M$ has positive spectrum (i.e., energy). Moreover $\lambda_M$ turns out to be invariant under every symmetry of the bulk metric which preserves the cosmological horizon. In the case of an expanding de Sitter spacetime, $\lambda_M$ coincides with the Euclidean (Bunch-Davies) vacuum state, hence being Hadamard in this case. Remarks on the validity of the Hadamard property for $\lambda_M$ in more general spacetimes are presented.

Contents
1. Introduction
   1.1 Notation, mathematical conventions
   1.2 Outline of the paper
2. Cosmological Horizons and Asymptotically Flatness
   2.1 Friedmann-Robertson-Walker spacetimes and cosmological horizons
   2.2 FRW metrics with κ = 0 and associated geometric structure
3. Expanding Universes with Cosmological Horizon and its Group
   3.1 Expanding universes with cosmological horizon $\Im^-$
   3.2 The horizon symmetry group $SG_{\Im^-}$
4. Preferred States Induced by the Cosmological Horizon
   4.1 QFT in the bulk
   4.2 Bosonic QFT on $\Im^-$ and $SG_{\Im^-}$-invariant states
   4.3 Interplay of QFT in $M$ and QFT on $\Im^-$
   4.4 The preferred invariant state $\lambda_M$
   4.5 Testing the construction for the de Sitter case and for other FRW metrics
5. Conclusions and Open Issues
A. Proof of Some Technical Results
1. Introduction

In the framework of quantum field theory over curved backgrounds, the past few years have witnessed a growing number of new and important formal results. In many cases their origin can be traced back to a non-trivial interplay between a field theory living on a Lorentzian background, say $M$, and a suitable counterpart constructed over a co-dimension one submanifold of $M$, often chosen as the conformal boundary of the spacetime. Usually thought of as a realization of the so-called holographic principle, this line of research has provided its most remarkable results in the framework of (asymptotically) AdS backgrounds. As a matter of fact, concepts such as Maldacena's conjecture [AGM00], in a string framework, or Rehren's duality (see [DR02] and references therein), in the algebraic quantum field theory setting, nowadays appear almost ubiquitously in the theoretical high-energy physics literature. More recently a similar philosophy has also been adopted to deal with a rather different scenario, namely asymptotically flat spacetimes, where it is future null infinity $\Im^+ \sim \mathbb{R} \times \mathbb{S}^2$, i.e., the conformal boundary, which plays the role of the above-mentioned co-dimension one submanifold [DMP06,Mo06,Mo07,Da07]. Although all these results are compelling, one should also actively seek connections to those theoretical models which are nowadays testable and, in this respect, cosmology is a rather natural playground. In this realm, one of the most widely known theories is inflation where, as in other models, the pivotal role is played by a single scalar field living on an (almost) de Sitter background.

Although, within this framework, most of the results are mainly, though not only, at a classical level, it is to a certain extent mandatory to look for a deep-rooted analysis of the full-fledged underlying quantum field theory in order to achieve a firmer understanding of the model under analysis. To this end, the first, but not very appealing, option is to perform a case-by-case analysis of the quantum structure of all the models nowadays available. In our opinion a more attractive possibility is to look for some means allowing us to draw general conclusions or to point out universal features, independently of the chosen model or background. With this philosophy in mind, a natural "first step" would be to try to implement the previously discussed bulk-to-boundary correspondence, which appears to encode, almost by construction, all the sought criteria of universality in the case of a large class of cosmological models.
As a starting point, let us assume the Cosmological Principle, which leads the underlying background to be endowed with the widely used Friedmann-Robertson-Walker (FRW) metrics. A direct inspection of the geometric properties of these spacetimes shows that, in most of the physically relevant cases, such as de Sitter to quote just one example, a natural submanifold exists which, at first glance, appears to be a good candidate for the preferred co-dimension one hypersurface: the cosmological (future or past) horizon as defined by Rindler [Ri06]. More precisely, in this paper we shall consider the cosmological past horizon $\Im^-$, common to all the co-moving observers, in order to deal with expanding universes. The first of the main aims of this manuscript is indeed to discuss some non-trivial geometric features of the cosmological horizon $\Im^-$. In particular, under some technical restrictions on the analytic form of the expansion factor in the FRW metric with flat spatial section, the horizon has a universal structure and, hence, it represents the natural setting in which to stage a bulk-to-boundary correspondence. An expanding universe admits a preferred future-oriented timelike vector field $X$ defining the worldlines of co-moving observers, whose common expanding rest-frames are the 3-surfaces orthogonal to $X$. In FRW metrics $X$ is a conformal Killing field which becomes tangent to the cosmological horizon and, in the class of FRW metrics we consider, it singles out complete null geodesics on $\Im^-$. This picture will be generalised to expanding spacetimes $M$ equipped with a geodesically complete cosmological horizon $\Im^-$ and an asymptotic conformal Killing field $X$, generally different from FRW spacetimes.

The leading role of $X$ in such a construction is strengthened by its intertwining relation with the conformal factor, which is a primary condition to take into account if one wants to study the structure of the symmetry group of the horizon (actually a subgroup of the huge full isometry group of the horizon viewed as a semi-Riemannian manifold). We also address this issue and find that such a group is an infinite-dimensional group $SG_{\Im^-}$ which has the structure of an iterated semidirect product, i.e., it is $SO(3) \ltimes \bigl( C^\infty(\mathbb{S}^2) \ltimes C^\infty(\mathbb{S}^2) \bigr)$, where $SO(3)$ is the special orthogonal group with a three-dimensional Lie algebra, whereas $C^\infty(\mathbb{S}^2)$ stands for the set of smooth functions over $\mathbb{S}^2$ thought of as an Abelian group under addition. The geometric interpretation of $SG_{\Im^-}$ is tied to the following result. The subgroup of isometries of the spacetime which preserves the cosmological horizon structure is injectively mapped to a subgroup of $SG_{\Im^-}$ which, hence, encodes some of the possible symmetries of the spacetime. However, it must be remarked that $SG_{\Im^-}$ is universal in the sense that it does not depend on the particular spacetime $M$ in the class under consideration. As a result we find that, under suitable hypotheses on $M$, valid in particular for certain FRW spacetimes which are asymptotically de Sitter, the algebra of observables $W(M)$ of a Klein-Gordon field in $M$ is one-to-one (isometrically) mapped to a subalgebra of the algebra of observables $W(\Im^-)$ naturally constructed on the cosmological horizon. In this sense the information of the quantum theory in the bulk $M$ is encoded in the quantum theory defined on the boundary $\Im^-$. It turns out that there is exactly one pure quasifree state $\lambda$ on $W(\Im^-)$ which fulfills a certain energy-positivity condition with respect to some generators of $SG_{\Im^-}$.

The relevant generators are those which can be interpreted as limit values on $\Im^-$ of timelike Killing vectors of $M$, whenever one fixes a spacetime $M$ admitting $\Im^-$ as the cosmological horizon. However, exactly as the geometric structure of $\Im^-$, $\lambda$ is universal in the sense that it does not depend on the particular spacetime $M$ in the class under consideration. The GNS-Fock representation of $\lambda$ singles out a unitary irreducible representation of $SG_{\Im^-}$. Fixing an expanding spacetime $M$ with complete cosmological horizon, $\lambda$ induces a preferred quantum state $\lambda_M$ for the quantum theory
in $M$ and it enjoys remarkable properties. It turns out to be invariant under all those isometries of $M$ (if any) that preserve the cosmological horizon structure. If $M$ admits a timelike Killing generator preserving $\Im^-$, the associated self-adjoint generator in the GNS representation of $\lambda_M$ has positive spectrum, i.e., energy. Finally, if $M$ is the expanding de Sitter spacetime, $\lambda_M$ coincides with the Euclidean (Bunch-Davies) vacuum state, so that it is Hadamard in that case at least. Actually, the Hadamard property seems to be valid in general, but that issue will be investigated elsewhere. As a final technical remark we would like to report that, in the derivation of many results reported here, we have been guided by similar analyses previously performed in the case of asymptotically flat spacetimes, using null infinity as a co-dimension one submanifold. However, to follow the subsequent discussion there is no need to be familiar with the tricky notion of asymptotically flat spacetime.

1.1. Notation, mathematical conventions. Throughout the paper $\mathbb{R}^+ := [0, +\infty)$, $\mathbb{N} := \{0, 1, 2, \dots\}$. For smooth manifolds $M$, $N$, $C^\infty(M; N)$ (omitting $N$ whenever $N = \mathbb{R}$) is the space of smooth functions $f : M \to N$. $C_0^\infty(M; N) \subset C^\infty(M; N)$ is the subspace of compactly supported functions. If $\chi : M \to N$ is a diffeomorphism, $\chi^*$ is the natural extension to tensor bundles (counter-, co-variant and mixed) from $M$ to $N$ (Appendix C in [Wa84]). A spacetime $(M, g)$ is a Hausdorff, second-countable, smooth, four-dimensional connected manifold $M$, whose smooth metric has signature $-+++$. We shall also assume that a spacetime is oriented and time oriented. We adopt the definitions of causal structures of Chap. 8 in [Wa84]. If $S \subset M \cap \widehat{M}$, $(M, g)$ and $(\widehat{M}, \widehat{g})$ being spacetimes, $J^\pm(S; M)$ ($I^\pm(S; M)$) and $J^\pm(S; \widehat{M})$ ($I^\pm(S; \widehat{M})$) indicate the causal (chronological) sets associated to $S$ and respectively referred to the spacetime $M$ or $\widehat{M}$.

1.2. Outline of the paper. In Sect.
2 we introduce and discuss the geometric set-up of the backgrounds we are going to take into account throughout this paper. In particular, we find the analytic conditions on the expansion factor under which a Friedmann-Robertson-Walker (FRW) spacetime can be smoothly extended to a larger spacetime that encompasses the cosmological horizon. In Sect. 3 we provide a generalisation of the results of Sect. 2 and we study their implications. Furthermore we introduce and discuss the structure of the horizon symmetry group, showing its interplay with the possible isometries of the bulk metric. In Sect. 4 we study the structure of bulk scalar QFT and of the associated Weyl algebra and its horizon counterpart. Furthermore we discuss the existence of a preferred algebraic state invariant under the full symmetry group, which enjoys some uniqueness/energy-positivity properties. Subsections 4.3 and 4.4 are devoted to the development of the interplay between the bulk and the boundary theory; particular emphasis is given to the selection of a natural preferred bulk state and to the analysis of its properties. Since all these conclusions are based upon some a priori assumptions on the behaviour of the solutions in the bulk of the Klein-Gordon equation with a generic coupling to curvature, we shall devote Sect. 4.5 to testing these requirements. Finally, in Sect. 5, we draw some conclusions and we provide some hints on future research perspectives.

2. Cosmological Horizons and Asymptotically Flatness

2.1. Friedmann-Robertson-Walker spacetimes and cosmological horizons. A homogeneous and isotropic universe can be locally described by a smooth spacetime, in the following indicated by $(M, g_{FRW})$, where $M$ is a smooth Lorentzian manifold equipped with the following Friedmann-Robertson-Walker (FRW) metric
Cosmological Horizons and Reconstruction of Quantum Field Theories
\[ g_{FRW} = -dt \otimes dt + a(t)^2 \left[ \frac{1}{1 - \kappa r^2}\, dr \otimes dr + r^2\, d\mathbb{S}^2(\theta, \varphi) \right]. \tag{1} \]
Above, $d\mathbb{S}^2(\theta, \varphi) = d\theta \otimes d\theta + \sin^2\theta\, d\varphi \otimes d\varphi$ is the standard metric on the unit 2-sphere and, up to normalisation, κ can take the values −1, 0, 1 corresponding respectively to hyperbolic, flat and closed spaces. The coordinate t ranges in some open interval I. Here a(t) is a smooth function of t with constant sign (since g is nondegenerate). Henceforth we shall assume that a(t) > 0 when t ∈ I. We also suppose that the field $\partial_t$ individuates the time orientation of the spacetime. Physically speaking, and in the universe observed nowadays, the sections of M at fixed t are the isotropic and homogeneous 3-spaces containing the matter of the universe, the world lines describing the histories of those particles of matter being integral curves of $\partial_t$. In this picture, the cosmic time t is the proper time measured at rest with each of these particles, whereas the scale a(t) measures the size of the observed cosmic expansion as a function of t. The metric (1) may enjoy two physically important features. Consider a co-moving observer pictured by an integral line γ = γ(t), t ∈ I, of the field $\partial_t$ and focus on $J^-(\gamma)$. If $J^-(\gamma)$ does not cover the whole spacetime M, the observer γ cannot receive physical information from some events of M during his/her history: causal future-directed signals starting from $M \setminus J^-(\gamma)$ cannot reach any point on γ. In other words, and adopting the terminology of [Ri06], a cosmological horizon takes place for γ and it is the null 3-hypersurface $\partial J^-(\gamma)$. Conversely, whenever $J^+(\gamma)$ does not cover the whole spacetime M, physical information sent by the observer γ during his/her history is prevented from getting to some events of M: causal future-directed signals starting from γ do not reach any point in $M \setminus J^+(\gamma)$. In this case, exploiting again the terminology of [Ri06], a cosmological past horizon exists for γ. It is the null 3-hypersurface $\partial J^+(\gamma)$. 
As is well known, a sufficient condition for the appearance of cosmological horizons can be obtained from the following analysis. One rearranges the metric (1) into the form
\[ g_{FRW} = a^2(\tau) \left[ -d\tau \otimes d\tau + \frac{1}{1 - \kappa r^2}\, dr \otimes dr + r^2\, d\mathbb{S}^2(\theta, \varphi) \right] \doteq a^2(\tau)\, g(\tau, r, \theta, \varphi), \tag{2} \]
where
\[ \tau(t) = d + \int a^{-1}(t)\, dt \tag{3} \]
is the conformal cosmological time, $d \in \mathbb{R}$ being any fixed constant. By construction τ = τ(t) is a diffeomorphism from I to some open, possibly infinite, interval $(\alpha, \beta) \ni \tau$. Notice that $\partial_\tau$ is a conformal Killing vector field whose integral lines coincide, up to the parametrisation, with the integral lines of $\partial_t$, and that $(M, g_{FRW})$ is globally hyperbolic. As causal structures are preserved under conformal rescaling of the metric, a straightforward analysis based on the shape of g in (2) establishes that $J^-(\gamma)$ does not cover the whole spacetime M whenever β < +∞. In that case a cosmological event horizon takes place for γ. Similarly $J^+(\gamma)$ does not cover the whole spacetime M whenever α > −∞. In that case a cosmological past horizon takes place for γ. In both cases the horizons $\partial J^-(\gamma)$ and $\partial J^+(\gamma)$ are null 3-hypersurfaces diffeomorphic to $\mathbb{R} \times \mathbb{S}^2$, made of null geodesics of $g_{FRW}$. One may think of these surfaces as the limit light-cones emanating from γ(t), respectively towards the past or towards the future, as t tends to
C. Dappiaggi, V. Moretti, N. Pinamonti
sup I or inf I respectively. The tips of the cones generally get lost in the limit procedure: in realistic models α and β correspond, when they are finite, to a big bang or a big crunch respectively. As a general comment, we stress that the cosmological horizons introduced above generally depend on the fixed observer γ.
Remark 2.1. The finiteness of the bounds α and β for the range of the conformal cosmological time τ is a sufficient condition for the existence of the cosmological horizons, but it is by no means necessary. Indeed it may happen, and this is the case of de Sitter spacetime, that there is a cosmological horizon arbitrarily close to M, but outside M. This happens when the spacetime M and its metric can be extended beyond its original region M to a larger spacetime $(\widehat{M}, \widehat{g})$ so that it happens that $\Im^+ = \partial J^-(M; \widehat{M}) = \partial M$ and $\Im^- = \partial J^+(M; \widehat{M}) = \partial M$. Hence the cosmological horizon $\Im^+$ or $\Im^-$ coincides with the boundary ∂M and, by construction, it does not depend on the considered observer γ (an integral curve of the field $\partial_t$) evolving in M. Referring in particular to a conformally static region M (equipped with the metric (1) for κ = 0) embedded in the complete de Sitter spacetime $\widehat{M}$, ∂M turns out to be a null surface with the topology of $\mathbb{R} \times \mathbb{S}^2$. In the following we shall focus on this type of cosmological horizons.
2.2. FRW metrics with κ = 0 and associated geometric structure. Here, we would like to pinpoint some geometrical properties enjoyed by a subclass of the FRW spacetimes that will be used later in order to get the main results presented in this paper. To this end we consider here the spacetime $(M, g_{FRW})$, where $M \simeq (\alpha, \beta) \times \mathbb{R}^3$ and the metric $g_{FRW}$ is as in (2), but with κ = 0. Furthermore we shall restrict our attention to the case where the factor a(τ) in (2) has the following form
\[ a(\tau) = \frac{\gamma}{\tau} + O\left(\frac{1}{\tau^2}\right), \qquad \frac{da(\tau)}{d\tau} = -\frac{\gamma}{\tau^2} + O\left(\frac{1}{\tau^3}\right), \tag{4} \]
for either (α, β) := (−∞, 0) and γ < 0, or (α, β) := (0, +∞) and γ > 0. 
The above asymptotic values are meant to be taken as τ → −∞ or τ → +∞ respectively. The first issue we are going to discuss is the extension of the spacetime $(M, g_{FRW})$ to a larger spacetime $(\widehat{M}, \widehat{g})$ that encompasses $\Im^+$ and/or $\Im^-$. To this end, if we introduce the new coordinates $U = \tan^{-1}(\tau + r)$ and $V = \tan^{-1}(\tau - r)$, ranging in subsets of $\mathbb{R}$ individuated by τ ∈ (α, β) and r ∈ (0, +∞), (2) can be written as:
\[ g_{FRW} = \frac{a^2(\tau(U, V))}{\cos^2 U \cos^2 V} \left[ -\frac{1}{2}\, dU \otimes dV - \frac{1}{2}\, dV \otimes dU + \frac{\sin^2(U - V)}{4}\, d\mathbb{S}^2(\theta, \varphi) \right]. \tag{5} \]
The metric obtained by cancelling the overall factor $a^2(\tau(U, V))/(\cos^2 U \cos^2 V)$ is well-behaved and smooth for $U, V \in \mathbb{R}$ away from the axis U = V. This is nothing but the apparent singularity appearing for r = 0 in the original metric (2). Consider $\mathbb{R}^2$ equipped with null coordinates U, V with respect to the standard Minkowskian metric on $\mathbb{R}^2$ and assume that every point is a 2-sphere with radius |sin(U − V)|/2 (hence the spheres for U = V are degenerate). Then, let us focus on the following segments in $\mathbb{R}^2$:
a: V = U with U ∈ (−π/2, π/2),
b: U = π/2 with V ∈ (−π/2, π/2),
c: V = −π/2 with U ∈ (−π/2, π/2).
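The form (5) can be checked by a short computation. The following is a worked sketch for κ = 0 that uses only the definitions of U and V given above; it is meant as a reading aid, not as part of the original argument.

```latex
\begin{align*}
\tau + r &= \tan U, \qquad \tau - r = \tan V,
\qquad d(\tau + r) = \frac{dU}{\cos^2 U}, \qquad d(\tau - r) = \frac{dV}{\cos^2 V}, \\
-d\tau \otimes d\tau + dr \otimes dr
 &= -\tfrac{1}{2}\bigl[ d(\tau + r) \otimes d(\tau - r) + d(\tau - r) \otimes d(\tau + r) \bigr]
  = -\frac{dU \otimes dV + dV \otimes dU}{2 \cos^2 U \cos^2 V}, \\
r &= \tfrac{1}{2}(\tan U - \tan V) = \frac{\sin(U - V)}{2 \cos U \cos V}
 \quad\Longrightarrow\quad
 r^2 = \frac{\sin^2(U - V)}{4 \cos^2 U \cos^2 V}.
\end{align*}
```

Inserting these identities into (2) with κ = 0 and collecting the common factor $1/(\cos^2 U \cos^2 V)$ gives the bracket in (5); in particular the sphere radius |sin(U − V)|/2 is just r rescaled by cos U cos V.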
[Figure: triangle in the (U, V)-plane with sides a, b, c; the axes are marked at π/2 and −π/2.]
Fig. 1. The interior of the triangle represents the original FRW background seen as an open subset of Einstein's static universe. Each point in the (U, V)-plane represents a 2-sphere and, furthermore, the segments b and c are respectively $\Im^+$ and $\Im^-$
The original spacetime M is realized as a suitable subset of the union of the segment a, i.e., r = 0, and the interior of the triangle abc, i.e., r > 0, as in Fig. 1. In this picture it is natural to assume that the null endless segments b and c, representing null 3-hypersurfaces diffeomorphic to $\mathbb{R} \times \mathbb{S}^2$, individuate respectively $\Im^+$ and $\Im^-$, provided that β = +∞ in the first case and/or α = −∞ in the second case, where (α, β) is the domain of τ. Otherwise the points of M cannot get closer and closer to all the points of those segments. Therefore we are committed to assume α = −∞ and/or β = +∞ and we stick with this assumption in the following discussion. Summarising, we wish to extend $g_{FRW}$ smoothly to a region larger than the open triangle abc joined with a, and including at least one of the endless segments b and c. In the case a(τ) is of the form (4), the function $a^2(\tau(U, V))/(\cos^2 U \cos^2 V)$ is smooth in neighbourhoods of the open segments b and c only if γ ≠ 0, and in particular it does not vanish on b and c, making $\widehat{g}$ nondegenerate thereon. However, a bad singularity appears as soon as U = −V, that is τ = 0. Therefore either:
(α, β) = (0, +∞), and in this case $M \simeq (r \geq 0, \tau \in (0, +\infty))$ coincides with the upper half of the triangle abc, and it may be extended to a larger spacetime $(\widehat{M}, \widehat{g})$ by adding a neighbourhood of the endless segment b viewed as $\Im^+$; or
(α, β) = (−∞, 0), and in this case $M \simeq (r \geq 0, \tau \in (-\infty, 0))$ coincides with the lower half of the triangle abc, and it may be extended to a larger spacetime $(\widehat{M}, \widehat{g})$ by adding a neighbourhood of the endless segment c viewed as $\Im^-$.
In both cases the line U = −V does not belong to M and to its extension, and the metric $\widehat{g}$ coincides with the right-hand side of (5). 
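As a numerical illustration of the conformal time (3) and of the form (4), the sketch below treats expanding de Sitter spacetime in cosmic time, assuming the scale factor a(t) = exp(Ht). The Hubble rate `H`, the function names and the quadrature routine are illustrative choices that do not appear in the text. With the integration constant d fixed so that τ → 0 as t → +∞, one finds τ(t) = −exp(−Ht)/H, hence τ ranges in (−∞, 0) and a = γ/τ with γ = −1/H < 0 and no correction terms.

```python
import math

H = 0.7  # hypothetical Hubble rate, chosen only for illustration


def a_of_t(t):
    """Scale factor of expanding de Sitter spacetime in cosmic time t (assumed form)."""
    return math.exp(H * t)


def tau_exact(t):
    """Closed-form conformal time, with d in (3) fixed so that tau -> 0 as t -> +infinity."""
    return -math.exp(-H * t) / H


def tau_numeric(t, t_ref=0.0, steps=20000):
    """tau(t_ref) plus a trapezoidal approximation of the integral of dt'/a(t') from t_ref to t."""
    h = (t - t_ref) / steps
    total = 0.5 * (1.0 / a_of_t(t_ref) + 1.0 / a_of_t(t))
    for i in range(1, steps):
        total += 1.0 / a_of_t(t_ref + i * h)
    return tau_exact(t_ref) + total * h


gamma = -1.0 / H  # negative, matching the case (alpha, beta) = (-infinity, 0) of (4)

t = 2.0
print(tau_numeric(t), tau_exact(t))     # quadrature versus closed form
print(a_of_t(t), gamma / tau_exact(t))  # a(t) versus gamma / tau(t)
```

Since τ stays negative and only approaches 0 from below, this example falls in the second of the two cases listed above, where the horizon added to M is $\Im^-$.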
The function a(τ) and its interplay with the vector field $\partial_\tau$ when approaching the cosmological horizon will play a distinguished role in our construction; for this reason let us enumerate below some of its properties that we are going to generalise in the next section. To this end, notice that a(τ) is smooth in $\widehat{M}$ and vanishes exactly either on $\Im^+ = \partial J^-(M; \widehat{M})$ or on $\Im^- = \partial J^+(M; \widehat{M})$, depending on the considered values for the interval (α, β) and for γ as discussed below formula (4). On the other hand, by direct inspection
\[ da|_{\Im^+} = -2\gamma\, dU, \qquad da|_{\Im^-} = -2\gamma\, dV, \tag{6} \]
and hence da does not vanish either on $\Im^+$ or on $\Im^-$, provided γ ≠ 0. By direct inspection one finds that, restricting either to $\Im^+$ or $\Im^-$, the metric $\widehat{g}$ takes the following distinguished form called Bondi form:
\[ \widehat{g}|_{\Im^\pm} = \gamma^2 \left[ -d\ell \otimes da - da \otimes d\ell + d\mathbb{S}^2(\theta, \varphi) \right], \]
where, with $\Im^\pm$, it is implicitly assumed that one must choose either $\Im^+$ or $\Im^-$ and where, for arbitrarily fixed constants $k_+$, $k_-$,
\[ \ell(U) = -\gamma^{-1} \tan U + k_- \quad \text{on } \Im^-, \qquad \ell(V) = -\gamma^{-1} \tan V + k_+ \quad \text{on } \Im^+, \]
hence $\ell \in \mathbb{R}$ turns out to be the parameter of the integral lines of $n \doteq \nabla a$. Consider then the vector field $\partial_\tau$; it is an easy task to check that it is a conformal Killing vector for $\widehat{g}$ in M with conformal Killing equation $\mathcal{L}_{\partial_\tau} \widehat{g} = -2\partial_\tau(\ln a)\, \widehat{g}$, where the right-hand side vanishes approaching either $\Im^+$ or $\Im^-$. Furthermore, $\partial_\tau$ tends to become tangent to either $\Im^+$ or $\Im^-$ approaching it and it coincides with $-\gamma \widehat{\nabla}^b a$ thereon, as can be directly seen from the form of $\widehat{g}|_{\Im^\pm}$.
3. Expanding Universes with Cosmological Horizon and its Group
3.1. Expanding universes with cosmological horizon $\Im^-$. The previous discussion remarked that in an expanding FRW spacetime the scale factor a and its interplay with the conformal Killing field $\partial_\tau$ play a distinguished role when approaching the cosmological horizon. A reader interested in asymptotically flat spacetimes could have noticed that many of the above mentioned geometrical properties are shared by the structure of null infinity. In that realm, in [DMP06,Mo06,Mo07], it was shown that, when dealing with quantum field theory issues, a key role is played by a certain symmetry subgroup of diffeomorphisms defined on $\Im^+$, the so-called BMS group, which has the most notable property of embodying the isometries of the bulk spacetime [Ge771,AX78] through a suitable geometric correspondence of generators. In the following we first generalise the result presented in Sect. 2.2 and then we shall construct the counterpart of the BMS group for the found class of spacetimes and the particular form of cosmological horizons.
Definition 3.1. A globally hyperbolic spacetime (M, g) equipped with a positive smooth function $\Omega : M \to \mathbb{R}^+$, a future-oriented timelike vector X defined on M, and a constant γ ≠ 0, will be called an expanding universe with (geodesically complete) cosmological (past) horizon when the following facts hold:
1. Existence and causal properties of horizon. 
(M, g) can be isometrically embedded as the interior of a submanifold-with-boundary of a larger spacetime $(\widehat{M}, \widehat{g})$, the boundary $\Im^- := \partial M$ verifying $\Im^- \cap J^+(M; \widehat{M}) = \emptyset$.
2. Data interplay 1). Ω extends to a smooth function on $\widehat{M}$ such that (i) $\Omega|_{\Im^-} = 0$ and (ii) dΩ ≠ 0 everywhere on $\Im^-$.
3. Data interplay 2). X is a conformal Killing vector for $\widehat{g}$ in a neighbourhood of $\Im^-$ in M, with
\[ \mathcal{L}_X(\widehat{g}) = -2X(\ln \Omega)\, \widehat{g}, \tag{7} \]
where (i) X(ln Ω) → 0 approaching $\Im^-$ and (ii) X does not tend everywhere to the zero vector approaching $\Im^-$.
4. Global Bondi-form of the metric on $\Im^-$ and geodesic completeness. (i) $\Im^-$ is diffeomorphic to $\mathbb{R} \times \mathbb{S}^2$, (ii) the metric $\widehat{g}|_{\Im^-}$ takes the Bondi form globally up to the constant factor $\gamma^2 > 0$:
\[ \widehat{g}|_{\Im^-} = \gamma^2 \left[ -d\ell \otimes d\Omega - d\Omega \otimes d\ell + d\mathbb{S}^2(\theta, \phi) \right], \quad \ell \in \mathbb{R},\ (\theta, \phi) \in \mathbb{S}^2,\ \Omega = 0, \tag{8} \]
$d\mathbb{S}^2$ being the standard metric on the unit 2-sphere. Hence $\Im^-$ is a null 3-submanifold, and (iii) the curves $\mathbb{R} \ni \ell \mapsto (\ell, \theta, \phi)$ are complete null $\widehat{g}$-geodesics.
The manifold $\Im^-$ is called the cosmological (past) horizon of M. The integral parameter of X is called the conformal cosmological time. There is a completely analogous definition of contracting universe referring to the existence of $\Im^+$ in the future instead of $\Im^-$.
Remark 3.1. (1) In view of condition 3, the vector X is a Killing vector of the metric $g_0 := \Omega^{-2} g$ in a neighbourhood of $\Im^-$ in M. In such a neighbourhood, one can think of $\Omega^2$ as an expansion scale evolving with rate $X(\Omega^2)$ referred to the conformal cosmological time. (2) $\Im^- \cap J^+(M; \widehat{M}) = \emptyset$ entails $M = I^+(M; \widehat{M})$ and $\Im^- = \partial M = \partial I^+(M; \widehat{M}) = \partial J^+(M; \widehat{M})$, so that $\Im^-$ has the proper interpretation as a past cosmological horizon in common for all the observers in (M, g) evolving along the integral lines of X. (3) It is worth stressing that the spacetimes considered in the given definition are neither homogeneous nor isotropic in general; hence we can deal with a larger class of manifolds than simply the FRW spacetimes.
Similarly to the particular case examined previously, also in the general case pictured by Definition 3.1, the conformal Killing vector field X becomes tangent to $\Im^-$ and it coincides with $\partial_\ell$ up to a nonnegative factor, which now may depend on angular variables, as we go on to establish. The proof of the following proposition is in the Appendix.
Proposition 3.1. If (M, g, Ω, X, γ) is an expanding universe with cosmological horizon, the following holds. (a) X extends smoothly to a unique smooth vector field $\widetilde{X}$ on $\Im^-$, which may vanish on a closed subset of $\Im^-$ with empty interior at most. Then $\widetilde{X}$ fulfills the $\widehat{g}$-Killing equation on $\Im^-$. (b) $\widetilde{X}$ has the form $f \partial_\ell$, where, referring to the representation $\Im^- \equiv \mathbb{R} \times \mathbb{S}^2$, f depends only on the variables $\mathbb{S}^2$ and, furthermore, it is smooth and nonnegative. 
Since, for the FRW spacetimes, the function f = f(θ, φ) appearing in $\widetilde{X} = f \partial_\ell$ takes the constant value 1, the presence of a nontrivial function f is related to the failure of isotropy for the more general spacetimes considered in Definition 3.1.
3.2. The horizon symmetry group $SG_{\Im^-}$. In the forthcoming discussion we shall make use several times of the following technical fact. In the representation $\Im^- \equiv \mathbb{R} \times \mathbb{S}^2 \ni (\ell, s)$, the null $\widehat{g}$-geodesic segments imbedded in $\Im^-$ are all of the curves
\[ J \ni \ell \mapsto (\alpha \ell + \beta, s), \quad \text{for constants } \alpha \neq 0,\ \beta \in \mathbb{R},\ s \in \mathbb{S}^2, \text{ and some interval } J \subset \mathbb{R}. \tag{9} \]
In this section, in the hypotheses of Definition 3.1, we select a subgroup $SG_{\Im^-}$ of physically relevant isometries of $\Im^-$. We shall see in Proposition 3.3 that, as a matter of fact, $SG_{\Im^-}$ contains the isometries generated by Killing vectors obtained as a limit towards $\Im^-$ of (all possible) Killing vectors of (M, g), when these vectors tend to become tangent to $\Im^-$. As a preliminary proposition, it holds:
Proposition 3.2. If (M, g, Ω, X, γ) is an expanding universe with cosmological horizon and Y is a Killing vector field of (M, g), Y can be extended to a smooth vector field $\widehat{Y}$ defined on $\widehat{M}$ and (a) $\mathcal{L}_{\widehat{Y}} \widehat{g} = 0$ on $M \cup \Im^-$; (b) $\widetilde{Y} := \widehat{Y}|_{\Im^-}$ is uniquely determined by Y, and it is tangent to $\Im^-$ if and only if g(Y, X) vanishes approaching $\Im^-$ from M. Restricting to the linear space of the Killing fields Y on (M, g) such that g(Y, X) → 0 approaching $\Im^-$, the following further facts hold. (c) If $\widetilde{Y}$ vanishes in some $A \subset \Im^-$ and A ≠ ∅ is open with respect to the topology of $\Im^-$, then Y = 0 everywhere in M, and likewise $\widehat{Y} = 0$ in $M \cup \Im^-$. (d) The linear map $Y \mapsto \widetilde{Y}$ is injective, i.e. Killing vectors of (M, g) are represented on $\Im^-$ faithfully.
The proof of the proposition above is given in the Appendix. The statements (a) and (b) of Proposition 3.2 establish that the Killing vectors Y in M with g(Y, X) → 0 approaching $\Im^-$ extend to Killing vectors of $(\Im^-, h)$, h being the degenerate metric on $\Im^-$ induced by $\widehat{g}$. Since the vector fields $\widetilde{Y}$ tangent to $\Im^-$ admit $\Im^-$ as invariant manifold, we can define
Definition 3.2. If (M, g, Ω, X, γ) is an expanding universe with cosmological horizon, a Killing vector field of (M, g), Y, is said to preserve $\Im^-$ if g(Y, X) → 0 approaching $\Im^-$. Similarly, the Killing isometries of the (local) one-parameter group generated by Y are said to preserve $\Im^-$. 
In the rest of this section we shall consider the one-parameter group of isometries of $(\Im^-, h)$ generated by such Killing vectors $\widetilde{Y}$. These isometries amount to a little part of the huge group of isometries of $(\Im^-, h)$. For instance, referring to the representation $(\ell, s) \in \mathbb{R} \times \mathbb{S}^2 \equiv \Im^-$, for every smooth diffeomorphism $f : \mathbb{R} \to \mathbb{R}$, the transformation $\ell \mapsto f(\ell)$, $s \mapsto s$ is an isometry of $(\Im^-, h)$. However only diffeomorphisms of the form $f(\ell) = a\ell + b$ with a ≠ 0 can be isometries generated by the restriction $\widetilde{Y}$ to $\Im^-$ of extensions $\widehat{Y}$ of Killing fields Y of (M, g) as in Proposition 3.2. This is because those isometries are restrictions of isometries of the manifold-with-boundary $(M \cup \Im^-, \widehat{g}|_{M \cup \Im^-})$, and thus they preserve the null $\widehat{g}$-geodesics in $\Im^-$. These geodesics have the form (9). The requirement that, for all constants $a, b \in \mathbb{R}$, a ≠ 0, there must be constants $a', b' \in \mathbb{R}$, a' ≠ 0 such that $f(a\ell + b) = a'\ell + b'$ for all $\ell$ varying in a fixed nonempty interval J, is fulfilled only if f is an affine transformation as stated above. We relax now the constraints on the above transformations allowing them also
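to be dependent on the angular coordinates. Hence we aim to study the class $G_{\Im^-}$ of diffeomorphisms
\[ F : \Im^- \to \Im^-, \quad \ell \mapsto \ell' := f(\ell, s), \quad s \mapsto s' := g(\ell, s) \quad \text{with } \ell \in \mathbb{R} \text{ and } s \in \mathbb{S}^2, \tag{10} \]
such that: (i) they are isometries of the degenerate metric h induced by $\widehat{g}|_{\Im^-}$ (8) and (ii) they may be restrictions to $\Im^-$ of isometries of $\widehat{g}$ in $M \cup \Im^-$. Assume that $F \in G_{\Im^-}$. The curve $\gamma_s : \mathbb{R} \ni \ell \mapsto \gamma_s(\ell) \equiv (\ell, s)$ (with $s \in \mathbb{S}^2$ arbitrarily fixed) is a null geodesic forming $\Im^-$, therefore $\mathbb{R} \ni \ell \mapsto F(\gamma_s(\ell))$ has to be, first of all, a null curve. In other words
\[ \widehat{g}|_{\Im^-}\left( \frac{\partial f}{\partial \ell}\frac{\partial}{\partial \ell} + \frac{\partial g}{\partial \ell}\frac{\partial}{\partial \theta} + \frac{\partial g}{\partial \ell}\frac{\partial}{\partial \phi},\ \frac{\partial f}{\partial \ell}\frac{\partial}{\partial \ell} + \frac{\partial g}{\partial \ell}\frac{\partial}{\partial \theta} + \frac{\partial g}{\partial \ell}\frac{\partial}{\partial \phi} \right) = 0. \]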
Using (8) and arbitrariness of s ≡ (θ, φ), it implies that g does not depend on $\ell$, since the standard metric on the unit sphere is strictly positive definite. The map g has to be an isometry of $\mathbb{S}^2$ equipped with its standard metric. In other words g ∈ O(3). Moreover, $\mathbb{R} \ni \ell \mapsto F(\gamma_s(\ell)) = (f(\ell, s), g(s))$ has to be a null geodesic which belongs to $\Im^-$. As a consequence of (2) in Remark 3.1, $(f(\ell, s), g(s)) = (c(s)\ell + b(s), g(s))$ for some fixed numbers $c(s), b(s) \in \mathbb{R}$ with c(s) > 0, and for every $\ell \in \mathbb{R}$. Summarising, it must be $g(\ell, s) = R(s)$ for all $\ell$, s and $f(\ell, s) = c(s)\ell + b(s)$ for all $\ell$, s, for some R ∈ O(3), $c, b \in C^\infty(\mathbb{S}^2)$ with c(s) ≠ 0. It is obvious that, conversely, every such diffeomorphism fulfills (i) and (ii).
Remark 3.2. (1) By direct inspection one sees that the class $G_{\Im^-}$ of all diffeomorphisms F as above is a group with respect to the composition of diffeomorphisms. (2) Only transformations $F \in G_{\Im^-}$ associated with R lying in the component of O(3) connected to the identity, i.e., SO(3), belong to a one-parameter group of isometries induced by Killing vectors in M.
From now on we shall restrict ourselves to the subgroup of $G_{\Im^-}$ whose elements are constructed using elements of SO(3), and each element of the one-parameter group of diffeomorphisms generated by a vector field Z will be denoted by exp{tZ}, with $t \in \mathbb{R}$.
Definition 3.3. The horizon symmetry group $SG_{\Im^-}$ is the group (with respect to the composition of functions) of all diffeomorphisms of $\mathbb{R} \times \mathbb{S}^2$,
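\[ F_{(a,b,R)} : \mathbb{R} \times \mathbb{S}^2 \ni (\ell, s) \mapsto \left( e^{a(s)} \ell + b(s),\ R(s) \right) \in \mathbb{R} \times \mathbb{S}^2 \quad \text{with } \ell \in \mathbb{R} \text{ and } s \in \mathbb{S}^2, \tag{11} \]
where $a, b \in C^\infty(\mathbb{S}^2)$ are arbitrary smooth functions and R ∈ SO(3). The horizon Lie algebra $\mathfrak{g}_{\Im^-}$ is the infinite-dimensional Lie algebra of smooth vector fields on $\mathbb{R} \times \mathbb{S}^2$ generated by the fields $S_1, S_2, S_3, \beta \partial_\ell, \ell \alpha \partial_\ell$, for all $\alpha, \beta \in C^\infty(\mathbb{S}^2)$. $S_1, S_2, S_3$ indicate the three smooth vector fields on the unit sphere $\mathbb{S}^2$ generating rotations about the orthogonal axes, respectively, x, y and z.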
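To make the group structure of the maps (11) concrete, here is a minimal Python sketch. It assumes points of the sphere are stored as unit 3-vectors and rotations as 3×3 matrices; the helper names (`apply`, `make_map`, `compose`) and the sample functions a, b are hypothetical and chosen only for illustration. Substituting one map of the form (11) into another shows that the composite is again of that form, with data a₂ + a₁∘R₂, exp(a₁∘R₂) b₂ + b₁∘R₂ and rotation R₁R₂; the script checks this at one sample point.

```python
import math


def apply(R, s):
    """Apply a 3x3 matrix R (nested lists) to a point s of the unit sphere (3-tuple)."""
    return tuple(sum(R[i][j] * s[j] for j in range(3)) for i in range(3))


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def make_map(a, b, R):
    """Return the map (l, s) -> (exp(a(s)) * l + b(s), R s), i.e. a map of the form (11)."""
    def F(l, s):
        return math.exp(a(s)) * l + b(s), apply(R, s)
    return F


def compose(first, second):
    """Data (a, b, R) of the composite 'apply second, then first'."""
    a1, b1, R1 = first
    a2, b2, R2 = second
    a = lambda s: a2(s) + a1(apply(R2, s))
    b = lambda s: math.exp(a1(apply(R2, s))) * b2(s) + b1(apply(R2, s))
    return a, b, matmul(R1, R2)


# Hypothetical sample data: smooth functions on the sphere written in Cartesian components.
g1 = (lambda s: 0.3 * s[2], lambda s: s[0] * s[1], rot_z(0.4))
g2 = (lambda s: -0.5 * s[0], lambda s: 1.0 + s[2] ** 2, rot_x(1.1))

point = (0.37, (1 / 3, 2 / 3, 2 / 3))
lhs = make_map(*g1)(*make_map(*g2)(*point))  # apply the second map, then the first
rhs = make_map(*compose(g1, g2))(*point)     # single map built from the composed data
print(lhs)
print(rhs)
```

The code orders the data as (a, b, R) for convenience, whereas the text writes the abstract triple as (R, a, b); the two printed results should agree up to rounding.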
It is worth noticing that $SG_{\Im^-}$ depends on the geometric structure of $\Im^-$ but not on the attached spacetime (M, g), which, in principle, might not even admit any Killing vector preserving $\Im^-$. In this sense $SG_{\Im^-}$ is a universal object for the whole class of expanding spacetimes with cosmological horizon. $SG_{\Im^-}$ may be seen as an abstract group defined on the set $SO(3) \times C^\infty(\mathbb{S}^2) \times C^\infty(\mathbb{S}^2)$, without reference to any expanding spacetime with cosmological horizon (M, g). Adopting this point of view, if we indicate $F_{(a,b,R)}$ by the abstract triple (R, a, b), the composition between elements in $SG_{\Im^-}$ reads
\[ (R, a, b)(R', a', b') = \left( R R',\ a' + a \circ R',\ e^{a \circ R'} b' + b \circ R' \right), \tag{12} \]
for any $(R, a, b), (R', a', b') \in SO(3) \times C^\infty(\mathbb{S}^2) \times C^\infty(\mathbb{S}^2)$, where ∘ denotes the usual composition of functions. The relationship between $SG_{\Im^-}$ and $\mathfrak{g}_{\Im^-}$ is clarified in the following proposition.
Proposition 3.3. Referring to Definition 3.3, the following facts hold. (a) Each vector field $Z \in \mathfrak{g}_{\Im^-}$ is complete and the generated global one-parameter group of diffeomorphisms of $\mathbb{R} \times \mathbb{S}^2$, $\{\exp\{tZ\}\}_{t \in \mathbb{R}}$, is a subgroup of $SG_{\Im^-}$. (b) For every $F \in SG_{\Im^-}$ there are $Z_1, Z_2 \in \mathfrak{g}_{\Im^-}$ (with, possibly, $Z_1 = Z_2$) such that $F = \exp\{t_1 Z_1\} \exp\{t_2 Z_2\}$ for some real numbers $t_1, t_2$.
The proof of this proposition is in the Appendix. Furthermore, we have the following important result which finally makes explicit the interplay between Killing vectors Y in M preserving $\Im^-$, the group $SG_{\Im^-}$ and the Lie algebra $\mathfrak{g}_{\Im^-}$.
Theorem 3.1. Let (M, g, Ω, X, γ) be an expanding universe with cosmological horizon and Y a Killing vector field of (M, g) preserving $\Im^-$. The following holds. (a) The restriction $\widetilde{Y}$ to $\Im^-$ of the unique smooth extension $\widehat{Y}$ of Y (see Prop. 3.2) belongs to $\mathfrak{g}_{\Im^-}$. (b) $\{\exp\{t\widetilde{Y}\}\}_{t \in \mathbb{R}}$ is a subgroup of $SG_{\Im^-}$.
The proof of this theorem is in the Appendix. As an example consider the expanding universe M with cosmological horizon associated with the metric $g_{FRW}$ (2) with κ = 0 and a as in (4). In this case $X := \partial_\tau$ and there are many Killing vectors Y of $(M, g_{FRW})$ satisfying $g_{FRW}(Y, X) \to 0$ approaching $\Im^-$. The most trivial ones are all of the Killing vectors of the surfaces at τ = constant with respect to the induced metric. We have here a Lie algebra generated by 6 independent Killing vectors Y associated respectively with space translations and with space rotations. In this case $g_{FRW}(Y, X) = 0$, so that the field $\widetilde{Y}$ associated with each such Killing vector Y belongs to $\mathfrak{g}_{\Im^-}$. This is not the whole story in the sharp case a(τ) = γ/τ with γ < 0, which corresponds to the expanding de Sitter spacetime. 
Indeed, in this case, there is another Killing vector B of $g_{FRW}$ fulfilling $g_{FRW}(B, X) \to 0$ approaching $\Im^-$. It is $B := \tau \partial_\tau + r \partial_r$. B, extended to $M \cup \Im^-$, gives rise to the structure of a bifurcate Killing horizon [KW91]. A last technical result, proved in the Appendix and useful in the forthcoming discussion, is
Proposition 3.4. Let (M, g, Ω, X, γ) be an expanding universe with cosmological horizon and Y a smooth vector field of (M, g) which tends to the smooth field $\widetilde{Y} \in \mathfrak{g}_{\Im^-}$ pointwisely. If there is an open set $A \subset \widehat{M}$ with $A \supset \Im^-$, where $Y|_{A \cap M}$ is timelike and future directed, then, everywhere on $\Im^-$,
\[ \widetilde{Y}(\ell, s) = f(s)\, \partial_\ell, \quad \text{for some } f \in C^\infty(\mathbb{S}^2), \text{ with } f(s) \geq 0 \text{ on } \mathbb{S}^2. \tag{13} \]
4. Preferred States Induced by the Cosmological Horizon
In this section (M, g, Ω, X, γ) is an expanding universe with cosmological horizon. Since (M, g) is globally hyperbolic by definition, one can study properties of quantum fields propagating therein, following the algebraic approach in the form presented in [KW91,Wa94].
4.1. QFT in the bulk. Consider real linear bosonic QFT in (M, g) based on the symplectic space $(\mathcal{S}(M), \sigma_M)$, where $\mathcal{S}(M)$ is the space of real smooth solutions ϕ of Pϕ = 0 that are compactly supported on Cauchy surfaces, where P is the Klein-Gordon operator
\[ P = \Box + \xi R + m^2 \tag{14} \]
with $\Box = -\nabla_a \nabla^a$, m > 0 and $\xi \in \mathbb{R}$ constants. The nondegenerate, Cauchy-surface independent, symplectic form $\sigma_M$ is:
\[ \sigma_M(\varphi_1, \varphi_2) := \int_S \left( \varphi_2 \nabla_N \varphi_1 - \varphi_1 \nabla_N \varphi_2 \right) d\mu_g^{(S)} \quad \forall \varphi_1, \varphi_2 \in \mathcal{S}(M), \tag{15} \]
S being any Cauchy surface of M with normal unit future-directed vector N and 3-volume measure $d\mu_g^{(S)}$ induced by g. As is well known [BR021,BR022], it is possible to associate canonically to any symplectic space, for instance $(\mathcal{S}(M), \sigma_M)$, a Weyl $C^*$-algebra, $\mathcal{W}(M)$ in this case. This is unique up to (isometric) ∗-isomorphisms, and its generators $W_M(\varphi) \neq 0$, $\varphi \in \mathcal{S}(M)$, satisfy the Weyl commutation relations (from now on we employ conventions as in [Wa94])
\[ W_M(-\varphi) = W_M(\varphi)^*, \qquad W_M(\varphi) W_M(\varphi') = e^{i \sigma_M(\varphi, \varphi')/2}\, W_M(\varphi + \varphi'). \tag{16} \]
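$\mathcal{W}(M)$ represents the basic set of quantum observables associated with the bosonic field φ propagating in the bulk spacetime (M, g). The main goal of this section is to prove that the geometric structures on (M, g, Ω, X, γ) pick out a very remarkable algebraic state ω on $\mathcal{W}(M)$, which, among other properties, turns out to be invariant under the natural action of every Killing isometry of (M, g) which preserves $\Im^-$. This happens provided that a certain algebraic interplay exists between QFT in M and QFT on $\Im^-$.
4.2. Bosonic QFT on $\Im^-$ and $SG_{\Im^-}$-invariant states. Referring to $\Im^- \equiv \mathbb{R} \times \mathbb{S}^2$, consider
\[ \mathcal{S}(\Im^-) := \left\{ \psi \in C^\infty(\mathbb{R} \times \mathbb{S}^2) \ \middle|\ \psi, \partial_\ell \psi \in L^2(\mathbb{R} \times \mathbb{S}^2, d\ell \wedge \mathbb{S}^2(\theta, \phi)) \right\}, \tag{17} \]
$\mathbb{S}^2(\theta, \phi)$ being the standard volume form of the unit 2-sphere, and the nondegenerate symplectic form σ
\[ \sigma(\psi_1, \psi_2) := \int_{\mathbb{R} \times \mathbb{S}^2} \left( \psi_2 \frac{\partial \psi_1}{\partial \ell} - \psi_1 \frac{\partial \psi_2}{\partial \ell} \right) d\ell \wedge \mathbb{S}^2(\theta, \phi) \quad \forall \psi_1, \psi_2 \in \mathcal{S}(\Im^-). \tag{18} \]
As in the previous section, we associate to $(\mathcal{S}(\Im^-), \sigma)$ the $C^*$-algebra $\mathcal{W}(\Im^-)$ whose generators $W(\psi) \neq 0$ satisfy the Weyl commutation relations (16).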
Remark 4.1. Exploiting the given definitions, it is straightforwardly proved that $(\mathcal{S}(\Im^-), \sigma)$ is invariant under the pull-back action of $SG_{\Im^-}$. In other words (i) $\psi \circ g \in \mathcal{S}(\Im^-)$ if $\psi \in \mathcal{S}(\Im^-)$ and also (ii) $\sigma(\psi_1 \circ g, \psi_2 \circ g) = \sigma(\psi_1, \psi_2)$ for all $g \in SG_{\Im^-}$ and $\psi_1, \psi_2 \in \mathcal{S}(\Im^-)$. As a well known consequence [BR022,BGP96], $SG_{\Im^-}$ induces a ∗-automorphism $G_{\Im^-}$-representation $\alpha : \mathcal{W}(\Im^-) \to \mathcal{W}(\Im^-)$, uniquely individuated by linearity and continuity by the requirement
\[ \alpha_g(W(\psi)) := W(\psi \circ g^{-1}), \quad \psi \in \mathcal{S}(\Im^-) \text{ and } g \in G_{\Im^-}. \tag{19} \]
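Statement (ii) of Remark 4.1 can be illustrated numerically for the ℓ-dependence alone. The sketch below fixes one angle, so that the group element acts as ℓ ↦ exp(a) ℓ + b with constant a and b, and compares the ℓ-integral in (18) before and after the pull-back. The test profiles, the values of a and b, and the quadrature routine are hypothetical choices made only for this illustration; the rotation part of the group action, which relies on the invariance of the volume form of the sphere, is not tested here.

```python
import math


def sigma(f1, df1, f2, df2, lo=-40.0, hi=40.0, n=40000):
    """Trapezoidal approximation of the l-integral of f2 * f1' - f1 * f2' at a fixed angle."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        l = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (f2(l) * df1(l) - f1(l) * df2(l))
    return total * h


def pull_back(f, df, a, b):
    """Profile l -> f(exp(a) * l + b) together with its l-derivative (chain rule)."""
    c = math.exp(a)
    return (lambda l: f(c * l + b)), (lambda l: c * df(c * l + b))


# Hypothetical rapidly decaying test profiles with their derivatives.
psi1 = lambda l: math.exp(-l * l)
dpsi1 = lambda l: -2.0 * l * math.exp(-l * l)
psi2 = lambda l: l * math.exp(-0.5 * l * l)
dpsi2 = lambda l: (1.0 - l * l) * math.exp(-0.5 * l * l)

a, b = 0.8, -1.3  # illustrative values of a(s) and b(s) at the chosen angle
before = sigma(psi1, dpsi1, psi2, dpsi2)
after = sigma(*pull_back(psi1, dpsi1, a, b), *pull_back(psi2, dpsi2, a, b))
print(before, after)
```

The invariance comes from a cancellation: the derivative of the pulled-back profile picks up a factor exp(a), which the change of variable in dℓ removes. For these particular profiles the integrand is −(1 + ℓ²) exp(−3ℓ²/2), so both values should be close to −(4/3)√(2π/3).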
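Since we are interested in physical properties which are $SG_{\Im^-}$-invariant, we face the issue of the existence of $\alpha_g$-invariant algebraic states on $\mathcal{W}(\Im^-)$ with $g \in SG_{\Im^-}$. We adopt here the definition of quasifree state given in [KW91], and also adopted in [DMP06,Mo06,Mo07]. Consider the quasifree state λ on $\mathcal{W}(\Im^-)$ unambiguously defined as follows: if $\psi, \psi' \in \mathcal{S}(\Im^-)$, then
\[ \lambda(W(\psi)) = e^{-\mu(\psi, \psi)/2}, \qquad \mu(\psi, \psi') := \mathrm{Re} \int_{\mathbb{R} \times \mathbb{S}^2} 2k\, \Theta(k)\, \overline{\widehat{\psi}(k, \theta, \phi)}\, \widehat{\psi'}(k, \theta, \phi)\, dk \wedge \mathbb{S}^2(\theta, \phi), \tag{20} \]
the bar denoting the complex conjugation, Θ(k) := 0 for k < 0 and Θ(k) := 1 for k ≥ 0; here we have used the $\ell$-Fourier-Plancherel transform $\widehat{\psi}$ of ψ:
\[ \widehat{\psi}(k, \theta, \phi) := \int_{\mathbb{R}} \frac{e^{ik\ell}}{\sqrt{2\pi}}\, \psi(\ell, \theta, \phi)\, d\ell, \quad (k, \theta, \phi) \in \mathbb{R} \times \mathbb{S}^2. \tag{21} \]
The constraint
\[ |\sigma(\psi, \psi')|^2 \leq 4\, \mu(\psi, \psi)\, \mu(\psi', \psi'), \quad \text{for every } \psi, \psi' \in \mathcal{S}(\Im^-), \tag{22} \]
which must hold for every quasifree state (see Appendix A in [Mo06]), is fulfilled by the scalar product μ, as the reader can verify by direct inspection exploiting both (21) and the definition of σ. Consider the GNS representation of λ, $(\mathfrak{H}, \Pi, \Upsilon)$. Since λ is quasifree, $\mathfrak{H}$ is a bosonic Fock space $\mathfrak{F}_+(\mathcal{H})$ with cyclic vector ϒ given by the Fock vacuum and 1-particle Hilbert space $\mathcal{H}$ obtained as the Hilbert completion of the complex space generated by the "positive-frequency parts" $\Theta \widehat{\psi} =: K_\mu \psi$ of every wavefunction $\psi \in \mathcal{S}(\Im^-)$, with the scalar product ⟨·, ·⟩ individuated by μ, as stated in (ii) of Lemma A1 in the Appendix A of [Mo06]. In our case
\[ \langle K_\mu \psi, K_\mu \psi' \rangle = \int_{\mathbb{R} \times \mathbb{S}^2} 2k\, \Theta(k)\, \overline{\widehat{\psi}(k, \theta, \phi)}\, \widehat{\psi'}(k, \theta, \phi)\, dk \wedge \mathbb{S}^2(\theta, \phi). \tag{23} \]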
The map $K_\mu : \mathcal{S}(\Im^-) \to \mathcal{H}$ is $\mathbb{R}$-linear and has a dense complexified range. A state similar to λ, and denoted by the same symbol, has been defined on $\Im^+ \simeq \mathbb{R} \times \mathbb{S}^2$ in [DMP06,Mo06,Mo07]¹ and, barring minor adaptations, it has exactly the form (20). Therefore, we can make use of Theorem 2.12 in [DMP06]; we know that λ is pure. Furthermore the one-particle space $\mathcal{H}$ of its GNS representation is isomorphic to the separable Hilbert space $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\, dk \wedge \mathbb{S}^2)$.
¹ In [DMP06,Mo06] a different, but unitarily equivalent, Hilbert space representation was used referring to the measure dk instead of 2k dk. Features of Fourier-Plancherel theory on $\mathbb{R} \times \mathbb{S}^2$ were discussed in the Appendix C of [Mo07].
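The state λ enjoys further remarkable properties in reference to the group $SG_{\Im^-}$. Particularly, $(\mathfrak{H}, \Pi, \Upsilon)$ being its GNS triple, λ turns out to be invariant under the ∗-automorphism representation (19) for all $g \in SG_{\Im^-}$. In other words $\lambda(\alpha_g(A))$ turns out to be equal to λ(A) for all $A \in \mathcal{W}(\Im^-)$ and for all $g \in SG_{\Im^-}$, as can be realized from the straightforward extension to the whole algebra of the following unitary action V of $SG_{\Im^-}$ on the one-particle Hilbert space $\mathcal{H}$:
\[ \left( V_{(R,a,b)} \varphi \right)(k, s) := e^{a(R^{-1}(s))}\, e^{-ik\, b(R^{-1}(s))}\, \varphi\left( e^{a(R^{-1}(s))} k,\ R^{-1}(s) \right) \quad \text{for all } \varphi \in \mathcal{H}, \tag{24} \]
with $g = (R, a, b) \in SG_{\Im^-}$ and s = (θ, φ). Furthermore, by standard manipulation, one can realize that the unique unitary representation $U : SG_{\Im^-} \ni g \mapsto U_g$ that implements α in $\mathfrak{H}$ while leaving ϒ invariant preserves $\mathcal{H}$, and it is unambiguously determined by $U{\restriction}_{\mathcal{H}}$. U has the following tensorialised form:
\[ U = I \oplus U{\restriction}_{\mathcal{H}} \oplus \left( U{\restriction}_{\mathcal{H}} \otimes U{\restriction}_{\mathcal{H}} \right) \oplus \left( U{\restriction}_{\mathcal{H}} \otimes U{\restriction}_{\mathcal{H}} \otimes U{\restriction}_{\mathcal{H}} \right) \oplus \cdots. \tag{25} \]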
Finally the restriction of U to the one-particle Hilbert space $\mathcal{H}$ is an irreducible representation. A second important result concerns the positive-energy/uniqueness properties of λ. In Minkowski QFT positivity of energy is a stability requirement and in general spacetimes the notion of energy is associated to that of a Killing time. This interpretation can be extended to this case too, namely to the theory on $\Im^-$. The positive-energy requirement is fulfilled for the "asymptotic" notion of time associated to the limit values $\widetilde{Y} \in \mathfrak{g}_{\Im^-}$ towards $\Im^-$ of a timelike future-directed vector field Y in M. Notice that Y may not be a Killing vector outside $\Im^-$; it is enough that $Y \to \widetilde{Y} \in \mathfrak{g}_{\Im^-}$. This includes the case Y = X in particular, due to Proposition 3.1. In the following, $\{\exp\{tZ\}\}_{t \in \mathbb{R}}$ is the one-parameter subgroup of $G_{\Im^-}$ generated by any $Z \in \mathfrak{g}_{\Im^-}$ and $\{\alpha_t^{(Z)}\}_{t \in \mathbb{R}}$ is the associated one-parameter group of ∗-automorphisms of $\mathcal{W}(\Im^-)$ (19).
Proposition 4.1. Consider an expanding universe with cosmological horizon (M, g, X, Ω, γ), the quasifree, pure, $SG_{\Im^-}$-invariant state λ on $\mathcal{W}(\Im^-)$ defined in (20) and a timelike future-directed vector field Y in M such that $Y \to \widetilde{Y} \in \mathfrak{g}_{\Im^-}$ pointwisely approaching $\Im^-$ (Y = X in particular, in view of Proposition 3.1). The following holds.
(a) The unitary group $\{U_t^{(\widetilde{Y})}\}_{t \in \mathbb{R}}$ which implements $\alpha^{(\widetilde{Y})}$ leaving fixed the cyclic GNS vector in the GNS representation of λ is strongly continuous with a nonnegative self-adjoint generator $H^{(\widetilde{Y})} = -i\, \text{s-}\frac{d}{dt} U_t^{(\widetilde{Y})}\big|_{t=0}$.
(b) The restriction of $H^{(\widetilde{Y})}$ to the one-particle space has no zero modes if and only if $\widetilde{Y}$ vanishes on a zero-measure subset of $\Im^-$.
Proof. From Proposition 3.4 one has that $\widetilde{Y}(\ell, s) = f(s)\, \partial_\ell$ for some nonnegative smooth function $f : \mathbb{S}^2 \to \mathbb{R}$. Therefore $\exp\{t\widetilde{Y}\}$ amounts to the displacement $(\ell, s) \mapsto (\ell + f(s)t, s)$. As a consequence of the previous discussion, the one-parameter group $\alpha^{(\widetilde{Y})}$ is unitarily represented by $\{U_t^{(\widetilde{Y})}\}_{t \in \mathbb{R}}$. $U_t^{(\widetilde{Y})}$ is the tensorialisation (as in (25)) of the (representation of the) unitary group in the one-particle space $V_t^{(\widetilde{Y})} : \mathcal{H} \to \mathcal{H}$, with
\[ \left( V_t^{(\widetilde{Y})} \phi \right)(k, s) = e^{itk f(s)}\, \phi(k, s) = \left( e^{it h^{(\widetilde{Y})}} \phi \right)(k, s), \quad \text{for all } \phi \in \mathcal{H}. \]
C. Dappiaggi, V. Moretti, N. Pinamonti
From standard theorems of operator theory one obtains that $\mathbb{R} \ni t \mapsto V^{(\widetilde{Y})}_t$ is strongly continuous in the one-particle space $\mathcal{H} = L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$; its self-adjoint generator $h^{(\widetilde{Y})}$, given by $(h^{(\widetilde{Y})}\phi)(k, s) = k f(s)\phi(k, s)$, is defined on the dense domain $D(h^{(\widetilde{Y})})$ made of the elements $\phi$ of the Hilbert space $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$ such that the right-hand side belongs to $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$. It is then evident that, since $f \geq 0$, for every $\phi \in D(h^{(\widetilde{Y})})$,
$$\langle \phi, h^{(\widetilde{Y})}\phi \rangle = \int_0^{+\infty} 2k\,dk \int_{\mathbb{S}^2} \epsilon_{\mathbb{S}^2}(s)\,|\phi(k, s)|^2\, k f(s) \geq 0, \tag{26}$$
and thus $\sigma(h^{(\widetilde{Y})}) \subset [0, +\infty)$. Passing to the whole Fock space by (25), the result remains unchanged for the whole generator $H^{(\widetilde{Y})} = 0 \oplus h^{(\widetilde{Y})} \oplus (I \otimes h^{(\widetilde{Y})} + h^{(\widetilde{Y})} \otimes I) \oplus \cdots$ using standard properties of generators. The last statement is a trivial consequence of (26) using $\widetilde{Y} = f\,\partial_\ell$.
The result applies for $\widetilde{Y} = \partial_\ell$ in particular, since it is always possible to view $\partial_\ell$ as the limit value of some timelike vector field of $M$. For expanding universes with cosmological horizon as described in Sect. 2.2, if $X := -\gamma\,\partial_\tau$, then $X \to \partial_\ell$ while approaching $\Im^-$. In this case the energy-positivity property applies for $\widetilde{X}$ and there are no zero modes. This is not the whole story, since the positive-energy property for $\partial_\ell$ determines $\lambda$ completely.
Theorem 4.1. Consider the state $\lambda$ defined in (20) and its GNS representation. The following holds.
(a) The state $\lambda$ is the unique pure quasifree state on $\mathcal{W}(\Im^-)$ satisfying both: (i) it is invariant under $\alpha^{(\partial_\ell)}$, (ii) the unitary group which implements $\alpha^{(\partial_\ell)}$ leaving fixed the cyclic GNS vector is strongly continuous with nonnegative self-adjoint generator (energy positivity condition).
(b) Each folium of states on $\mathcal{W}(\Im^-)$ contains at most one pure $\alpha^{(\partial_\ell)}$-invariant state.
Proof. The proofs of (a) and (b), though rather technical, are identical to those of the corresponding statements in Theorem 3.1 of [Mo06], where, in the cited proof, $F$ refers to a Bondi frame. This holds since the self-adjoint generator of the unitary group $t \mapsto U_t$, implementing $\{\alpha^{(\partial_\ell)}_t\}_{t\in\mathbb{R}}$ and leaving $\Upsilon$ invariant, is the tensorialisation of the positive self-adjoint generator $H$ acting on the one-particle space $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$ as $(H\psi)(k, \theta, \phi) = k\,\psi(k, \theta, \phi)$. Note that $H$ is defined on the dense domain of the elements of the Hilbert space $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$ such that the right-hand side is still in $L^2(\mathbb{R}^+ \times \mathbb{S}^2; 2k\,dk \wedge \epsilon_{\mathbb{S}^2})$. Hence $\sigma(H) = \sigma_c(H) = [0, +\infty)$.
The action of the one-parameter subgroup $\mathbb{R} \ni t \mapsto g^{(\partial_\ell)}(t)$ of $G_{\Im^-}$ on fields defined on $\Im^-$ coincides exactly with the one-parameter subgroup of the BMS group on fields defined on $\Im^+$. Furthermore, both the unitary representations of $SG_{\Im^-}$ and of the BMS group are identical when restricted to those subgroups.
4.3. Interplay of QFT in $M$ and QFT on $\Im^-$. While in the previous section we have shown that there exists a preferred quasifree $SG_{\Im^-}$-invariant pure state $\lambda$ enjoying some uniqueness properties, we now ask whether it is possible to induce a state $\lambda_M$ on the algebra
Cosmological Horizons and Reconstruction of Quantum Field Theories
of field observables in the bulk starting from $\lambda$. If this is the case, we would expect $\lambda_M$ to fulfil some invariance properties with respect to the possible isometries generated by Killing vectors which preserve $\Im^-$. To this end, we first concentrate on algebraic properties, establishing the existence of a nice interplay between $\mathcal{W}(\Im^-)$ and $\mathcal{W}(M)$ under suitable hypotheses on the symplectic forms considered. That interplay will be used to define $\lambda_M$ in the next subsection. The symplectic form $\sigma_M$ on $\mathcal{S}(M)$ defined in (15) can be equivalently rewritten as the integral of a 3-form,
$$\sigma_M(\varphi_1, \varphi_2) := \int_S \chi(\varphi_1, \varphi_2) = \frac{1}{6}\int_S \left(\varphi_1 \nabla^\mu \varphi_2 - \varphi_2 \nabla^\mu \varphi_1\right) \sqrt{-g}\; \epsilon_{\mu\alpha\beta\gamma}\, dx^\alpha \wedge dx^\beta \wedge dx^\gamma, \tag{27}$$
where $\epsilon_{\mu\alpha\beta\gamma}$ is the totally antisymmetric Levi-Civita symbol, $S$ is a future oriented Cauchy surface and the second equality holds in any local coordinate patch. Notice that, even though $S$ is moved back in the past and it seems to tend to coincide with $\Im^-$, this is not necessarily the case, since $\Im^-$ and Cauchy surfaces in $M$ may have different topologies. In particular, information could get lost through the time-like past infinity $i^-$, the tip of the cone representing $\Im^-$. That point does not belong to $\widehat{M}$ in our hypotheses. However one may expect that, in certain cases at least, assuming that each $\varphi_i$ extends to $\Gamma\varphi_i \in \mathcal{S}(\Im^-)$ smoothly, it holds
$$\sigma_M(\varphi_1, \varphi_2) = \int_{\Im^-} \chi(\Gamma\varphi_1, \Gamma\varphi_2). \tag{28}$$
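Now, by direct inspection one verifies that, for $\psi_1, \psi_2 \in \mathcal{S}(\Im^-)$,
$$\int_{\Im^-} \chi(\psi_1, \psi_2) = \gamma^2 \int_{\mathbb{R}\times\mathbb{S}^2} \left(\psi_2 \frac{\partial \psi_1}{\partial \ell} - \psi_1 \frac{\partial \psi_2}{\partial \ell}\right) d\ell \wedge \epsilon_{\mathbb{S}^2}(\theta, \phi), \tag{29}$$
where $\gamma$ is the last constant in $(M, g, \Omega, X, \gamma)$. Following this way one is led to expect that
$$\sigma_M(\varphi_1, \varphi_2) = \sigma(\gamma\Gamma\varphi_1, \gamma\Gamma\varphi_2). \tag{30}$$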
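The passage from (28) and (29) to (30) can be written out in one line. The step below is a sketch under an assumption: that the horizon symplectic form $\sigma$ of (18), which is not reproduced in this section, is the integral on the right-hand side of (29) without the prefactor $\gamma^2$. Here $\Gamma\varphi$ denotes the smooth extension of $\varphi$ to $\Im^-$.

$$\sigma_M(\varphi_1, \varphi_2) = \int_{\Im^-} \chi(\Gamma\varphi_1, \Gamma\varphi_2) = \gamma^2\, \sigma(\Gamma\varphi_1, \Gamma\varphi_2) = \sigma(\gamma\Gamma\varphi_1, \gamma\Gamma\varphi_2),$$

where the first equality is the expected identity (28), the second is (29) read with the assumed form of $\sigma$, and the third uses only the bilinearity of $\sigma$. The nontrivial content therefore sits entirely in (28).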
Notice that this result is by no means trivial and it might not hold, since it strictly depends on the behaviour of the solutions of Klein-Gordon equations across $\Im^-$. Here we investigate the consequences of (30) under the hypothesis that such an identity holds true. The existence of $\Gamma : \mathcal{S}(M) \to \mathcal{S}(\Im^-)$ fulfilling (30) implies the existence of an isometric $*$-homomorphism $\imath : \mathcal{W}(M) \to \mathcal{W}(\Im^-)$. In this way the field observables of the bulk are mapped into observables of the theory on $\Im^-$. Moreover, the state $\lambda$ on $\Im^-$ induces a preferred state $\lambda_M$ on $\mathcal{W}(M)$ via pull-back. This state enjoys interesting invariance properties with respect to the symmetries of $(M, g)$ which preserve $\Im^-$, as well as a positivity property with respect to timelike Killing vectors of $M$ which preserve $\Im^-$.
Theorem 4.2. Consider an expanding universe with cosmological horizon $(M, g, X, \Omega, \gamma)$ and suppose that every $\varphi \in \mathcal{S}(M)$ extends smoothly to some $\Gamma\varphi \in \mathcal{S}(\Im^-)$ in such a way that (30) holds true: $\sigma_M(\varphi_1, \varphi_2) = \sigma(\gamma\Gamma\varphi_1, \gamma\Gamma\varphi_2)$, for every $\varphi_1, \varphi_2 \in \mathcal{S}(M)$.
In these hypotheses, there is an (isometric) $*$-homomorphism $\imath : \mathcal{W}(M) \to \mathcal{W}(\Im^-)$ that identifies the Weyl $C^*$-algebra of the bulk $M$ with a sub $C^*$-algebra of the boundary $\Im^-$; it is completely determined by the requirement:
$$\imath(W_M(\varphi)) := W(\gamma\Gamma\varphi), \quad \text{for all } \varphi \in \mathcal{S}(M). \tag{31}$$
Proof. Notice that the linear map $\gamma\Gamma : \mathcal{S}(M) \to \mathcal{S}(\Im^-)$ has to be injective due to nondegenerateness of $\sigma_M$ and (30). Consider the sub Weyl $C^*$-algebra $\mathcal{A}_M$ of $\mathcal{W}(\Im^-)$ generated by the elements $W(\gamma\Gamma\varphi)$ with $\varphi \in \mathcal{S}(M)$. Since Weyl $C^*$-algebras are determined up to (isometric) $*$-algebra isomorphisms, $\mathcal{A}_M$ is nothing but the Weyl $C^*$-algebra associated with the symplectic space $(\gamma\Gamma(\mathcal{S}(M)), \sigma)$ and the map $\gamma\Gamma : \mathcal{S}(M) \to \gamma\Gamma(\mathcal{S}(M))$ is an isomorphism of symplectic spaces. Under these hypotheses [BR022], there is a unique (isometric) $*$-isomorphism $\imath : \mathcal{W}(M) \to \mathcal{A}_M \subset \mathcal{W}(\Im^-)$ completely determined by (31).
4.4. The preferred invariant state $\lambda_M$. We proceed to show that, in the hypotheses of Theorem 4.2, a preferred state $\lambda_M$ on $\mathcal{W}(M)$ is induced by $\lambda$. That state enjoys very remarkable physical properties.
If $Y$ is a complete Killing vector of $(M, g)$, the associated one-parameter group of $g$-isometries, $\{\exp\{tY\}\}_{t\in\mathbb{R}}$, preserves $\sigma_M$ under pull-back action. Hence, as discussed in [BR022,BGP96], there is a unique isometric $*$-isomorphism $\beta^{(Y)}_t : \mathcal{W}(M) \to \mathcal{W}(M)$ induced by
$$\beta^{(Y)}_t(W_M(\varphi)) := W_M(\varphi \circ \exp\{-tY\}), \quad \text{for every } \varphi \in \mathcal{S}(M).$$
In the following we shall call $\beta^{(Y)} := \{\beta^{(Y)}_t\}_{t\in\mathbb{R}}$ the natural $*$-isomorphism action of $\{\exp\{tY\}\}_{t\in\mathbb{R}}$ on $\mathcal{W}(M)$. Similarly, every $Z \in \mathfrak{g}_{\Im^-}$ has a natural action $\alpha^{(Z)}$ on $\mathcal{W}(\Im^-)$ in terms of isometric $*$-isomorphisms, obtained by requiring
$$\alpha^{(Z)}_t(W(\psi)) := W(\psi \circ \exp\{-tZ\}), \quad \text{for every } \psi \in \mathcal{S}(\Im^-),$$
since the pull-back action of $\{\exp\{tZ\}\}_{t\in\mathbb{R}}$, generated by $Z$ on fields of $\mathcal{S}(\Im^-)$, preserves $\sigma$.
To stress a further important point, let us consider an expanding universe with cosmological horizon $(M, g, X, \Omega, \gamma)$ and let us suppose that every $\varphi \in \mathcal{S}(M)$ extends smoothly to some $\Gamma\varphi \in \mathcal{S}(\Im^-)$ in such a way that (30) holds true. In this case there is a uniquely defined smooth function $\widehat{\varphi}$ defined on $M \cup \Im^-$, that reduces to $\varphi$ in $M$ and to $\Gamma\varphi$ on $\Im^-$. If $Y$ is a complete Killing vector of $(M, g)$ preserving $\Im^-$, the one-parameter group generated by its unique extension $\widehat{Y}$ to $M \cup \Im^-$ (Proposition 3.2 and Theorem 3.1) acts on $\widehat{\varphi}$ globally. Taking the relevant restrictions of scalar fields and Killing vector fields we obtain:
$$\Gamma(\varphi) \circ \exp\{t\widetilde{Y}\} = \Gamma(\varphi \circ \exp\{tY\}), \tag{32}$$
where, as usual, $\widetilde{Y} := \widehat{Y}|_{\Im^-}$. As a straightforward consequence it holds
$$\imath\big(\beta^{(Y)}_t(a)\big) = \alpha^{(\widetilde{Y})}_t(\imath(a)), \quad \text{for all } a \in \mathcal{W}(M) \text{ and } t \in \mathbb{R}. \tag{33}$$
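For the reader's convenience, the intertwining relation (33) can be checked directly on Weyl generators. The chain below is only a sketch of that verification; it uses the definition (31) of the homomorphism (written $\imath$ here), the definitions of the natural actions $\beta^{(Y)}_t$ and $\alpha^{(\widetilde{Y})}_t$, and (32) with $t$ replaced by $-t$, with $\Gamma\varphi$ the extension of $\varphi$ to $\Im^-$ and $\widetilde{Y}$ the restriction to $\Im^-$ of the extended Killing field:

$$\imath\big(\beta^{(Y)}_t(W_M(\varphi))\big) = \imath\big(W_M(\varphi \circ \exp\{-tY\})\big) = W\big(\gamma\,\Gamma(\varphi \circ \exp\{-tY\})\big) = W\big((\gamma\Gamma\varphi) \circ \exp\{-t\widetilde{Y}\}\big) = \alpha^{(\widetilde{Y})}_t\big(W(\gamma\Gamma\varphi)\big) = \alpha^{(\widetilde{Y})}_t\big(\imath(W_M(\varphi))\big).$$

Since all the maps involved are isometric $*$-homomorphisms and the generators $W_M(\varphi)$ span a dense $*$-subalgebra of $\mathcal{W}(M)$, the identity extends to every $a \in \mathcal{W}(M)$ by linearity and continuity.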
Theorem 4.3. Consider an expanding universe with cosmological horizon $(M, g, X, \Omega, \gamma)$ fulfilling the hypotheses of Theorem 4.2. Let $\lambda_M : \mathcal{W}(M) \to \mathbb{C}$ be the state induced by $\lambda$ defined in (20) through the isometric $*$-homomorphism $\imath$ (31):
$$\lambda_M(a) := \lambda(\imath(a)), \quad \text{for all } a \in \mathcal{W}(M). \tag{34}$$
$\lambda_M$ enjoys the following properties:
(a) Whenever $(M, g)$ admits some complete Killing vector field $Y$ preserving $\Im^-$, then, letting $\beta^{(Y)}$ be the natural action on $\mathcal{W}(M)$, $\lambda_M$ is invariant under $\beta^{(Y)}$ and the unitary one-parameter group $\{U^{(Y)}_t\}_{t\in\mathbb{R}}$, which implements $\beta^{(Y)}$ in the GNS representation of $\lambda_M$ leaving fixed the cyclic vector, is strongly continuous.
(b) If $Y$ above is everywhere timelike and future-directed in $M$, then (i) the one-parameter group $\{U^{(Y)}_t\}_{t\in\mathbb{R}}$ has positive self-adjoint generator, (ii) that generator has no zero-modes in the one-particle subspace, if $\widetilde{Y} = 0$ on a zero-measure subset of $\Im^-$.
The proof is in the Appendix.
Remark 4.2. As noticed before Proposition 4.1, positivity of energy is a stability requirement. Statement (b) of the theorem assures that, in the presence of a timelike Killing vector with respect to which the notion of energy is defined, if it preserves $\Im^-$, the condition of energy positivity holds true. If such a timelike Killing vector is absent, then Proposition 4.1 assures nonetheless the validity of a positive-energy condition, particularly with respect to the conformal Killing vector $X$.
4.5. Testing the construction for the de Sitter case and for other FRW metrics. We proceed to show that the hypotheses of Theorem 4.2 are valid when $(M, g, X, \Omega, \gamma)$ is in the class of the FRW metrics considered in Sect. 2.2, so that the preferred state $\lambda_M$ exists for those spacetimes. That class includes the expanding region of de Sitter spacetime (see [BMG94,BM96] for a related analysis in the framework of Wightman's axioms). We shall verify, in this last case, that the preferred state $\lambda_M$ is nothing but the well-known de Sitter Euclidean vacuum or Bunch-Davies state, $\omega_E$ [SS76,BD78,Al85]. Let us start with the de Sitter scenario. The expanding de Sitter region is
$$M \simeq (-\infty, 0) \times \mathbb{R}^3, \qquad g = a^2(\tau)\left[-d\tau \otimes d\tau + dr \otimes dr + r^2\, d\mathbb{S}^2(\theta, \phi)\right], \tag{35}$$
where $\tau \in (-\infty, 0)$ and where $r, \theta, \phi$ are standard spherical coordinates on $\mathbb{R}^3$, whereas $a(\tau) = \gamma/\tau$ for some constant $\gamma < 0$, so that $R = 12/\gamma^2$. A class of, generally complex, solutions $\Phi_{\mathbf{k}}$, $\mathbf{k} \in \mathbb{R}^3$, of (14) is$^2$
$$\Phi_{\mathbf{k}}(\tau, \mathbf{x}) := \frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{(2\pi)^{3/2}}\, \frac{\chi_{\mathbf{k}}(\tau)}{a(\tau)}, \tag{36}$$
2 The form of the modes as presented in [BD78,BD82] is different both since in [SS76,BD78] the contracting region of de Sitter spacetime was considered and due to the absence of the overall exponential exp −iπ ν/2, which would affect the final results and the normalisation (38) for ν imaginary, but not the final form of the two-point function.
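where, according to [SS76],
$$\chi_{\mathbf{k}}(\tau) := \frac{1}{2}\sqrt{-\pi\tau}\; e^{i\pi\nu/2}\; \overline{H^{(2)}_\nu(-k\tau)}, \quad \text{where } \nu := \sqrt{\frac{9}{4} - 12\left(m^2 R^{-1} + \xi\right)}, \tag{37}$$
$k := |\mathbf{k}|$ and $H^{(2)}_\nu$ is the Hankel function of the second kind. The sign in front of the square root in the definition of $\nu$ (which may be imaginary) does not affect the right-hand side of (37) and it could be fixed arbitrarily (either for $\nu$ real or imaginary). With these choices one finds the time-independent normalisation
$$\frac{d\overline{\chi_{\mathbf{k}}(\tau)}}{d\tau}\,\chi_{\mathbf{k}}(\tau) - \overline{\chi_{\mathbf{k}}(\tau)}\,\frac{d\chi_{\mathbf{k}}(\tau)}{d\tau} = i, \quad \text{for all } \tau \in (-\infty, 0). \tag{38}$$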
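The normalisation (38) can be tested numerically in the one case where the Hankel function is elementary, $\nu = 1/2$ (conformal coupling, $m = 0$, $\xi = 1/6$), using $H^{(2)}_{1/2}(z) = i\sqrt{2/(\pi z)}\,e^{-iz}$. The sketch below rests on an assumption that should be kept in mind: it takes the mode function in (37) to contain the complex conjugate of $H^{(2)}_\nu(-k\tau)$, so that for $\nu = 1/2$ it reduces to $e^{-ik\tau}/\sqrt{2k}$ up to a constant phase. The helper names `nu`, `hankel2_half`, `chi` and `wronskian` are hypothetical, and the derivative is a plain central difference.

```python
import cmath
import math

def nu(m2_over_R: float, xi: float) -> complex:
    """nu = sqrt(9/4 - 12 (m^2/R + xi)); the complex root allows imaginary nu."""
    return cmath.sqrt(9.0 / 4.0 - 12.0 * (m2_over_R + xi))

def hankel2_half(z: float) -> complex:
    """Closed form of the Hankel function H^(2)_{1/2}(z) for z > 0."""
    return 1j * math.sqrt(2.0 / (math.pi * z)) * cmath.exp(-1j * z)

def chi(k: float, tau: float) -> complex:
    """Mode function for nu = 1/2, assuming the conjugated Hankel function."""
    phase = cmath.exp(1j * math.pi * 0.5 / 2.0)  # exp(i pi nu / 2) at nu = 1/2
    return 0.5 * math.sqrt(-math.pi * tau) * phase * hankel2_half(-k * tau).conjugate()

def wronskian(k: float, tau: float, h: float = 1e-5) -> complex:
    """conj(chi)' * chi - conj(chi) * chi', with a central-difference derivative."""
    dchi = (chi(k, tau + h) - chi(k, tau - h)) / (2.0 * h)
    c = chi(k, tau)
    # tau is real, so the derivative of the conjugate is the conjugate of dchi
    return dchi.conjugate() * c - c.conjugate() * dchi
```

The same helper `nu` also makes the thresholds quoted in the text easy to inspect: $|\mathrm{Re}\,\nu| < 3/2$ corresponds to $m^2 + \xi R > 0$, and $|\mathrm{Re}\,\nu| < 1$ to $m^2 + \xi R > \frac{5}{48}R$.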
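Let us now show how $\omega_E$ is defined. To this end, take any $\varphi \in \mathcal{S}(M)$ and a Cauchy surface $\Sigma_\tau$ in $(M, g)$ at fixed $\tau$. Define
$$\widetilde{\varphi}(\mathbf{k}) := -i \int_{\mathbb{R}^3} \left[\frac{\partial \overline{\Phi_{\mathbf{k}}(\tau, \mathbf{x})}}{\partial \tau}\, \varphi(\tau, \mathbf{x}) - \overline{\Phi_{\mathbf{k}}(\tau, \mathbf{x})}\, \frac{\partial \varphi(\tau, \mathbf{x})}{\partial \tau}\right] a(\tau)^2\, d\mathbf{x}, \tag{39}$$
where, by direct inspection, the right-hand side of (39) does not depend on the choice of $\Sigma_\tau$. Furthermore, $H^{(2)}_\nu(z)$ decays as $z^{-1/2}$ for $|z| \to \infty$, $\widetilde{\varphi} \in C^\infty(\mathbb{R}^3 \setminus \{0\})$ and it vanishes for $|\mathbf{k}| \to \infty$ faster than every power $|\mathbf{k}|^{-n}$, $n \in \mathbb{N}$. From the known behaviour of the functions $H^{(2)}_\nu(z)$ in a neighbourhood of $z = 0$ [GR95], one sees both that the leading divergence as $k \to 0$ due to the functions $\chi_{\mathbf{k}}$ is of order $|\mathbf{k}|^{-|\mathrm{Re}\,\nu|}$ and that $|\widetilde{\varphi}|^2$, as well as $|\widetilde{\varphi}|$, is integrable with respect to $d\mathbf{k}$ whenever $|\mathrm{Re}\,\nu| < 3/2$ or, equivalently, $m^2 + \xi R > 0$. Once one constructs $\widetilde{\varphi}$ out of (39), then $\varphi$ is
$$\varphi(\tau, \mathbf{x}) = \int_{\mathbb{R}^3} \left[\Phi_{\mathbf{k}}(\tau, \mathbf{x})\, \widetilde{\varphi}(\mathbf{k}) + \overline{\Phi_{\mathbf{k}}(\tau, \mathbf{x})}\; \overline{\widetilde{\varphi}(\mathbf{k})}\right] d\mathbf{k}. \tag{40}$$
This follows from (36), (38), (39), and from the properties of the Fourier transform for functions in $C_0^\infty(\mathbb{R}^3)$. Since $\widetilde{\varphi} \in L^2(\mathbb{R}^3; d\mathbf{k}) \cap L^1(\mathbb{R}^3; d\mathbf{k})$ when $m^2 + \xi R > 0$ and $\varphi \in \mathcal{S}(M)$, then
$$-2\,\mathrm{Im} \int_{\mathbb{R}^3} \overline{\widetilde{\varphi}_1(\mathbf{k})}\, \widetilde{\varphi}_2(\mathbf{k})\, d\mathbf{k} = \int_{\mathbb{R}^3} \left(\varphi_2 \partial_\tau \varphi_1 - \varphi_1 \partial_\tau \varphi_2\right) a^2(\tau)\, d\mathbf{x} =: \sigma_M(\varphi_1, \varphi_2) \quad \forall \varphi_1, \varphi_2 \in \mathcal{S}(M). \tag{41}$$
The (restriction to $M$ of the) Euclidean vacuum in de Sitter space is nothing but the quasifree state $\omega_E$ on $\mathcal{W}(M)$ completely identified by
$$\omega_E(W_M(\varphi)) = \exp\left(-\frac{1}{2}\int_{\mathbb{R}^3} \overline{\widetilde{\varphi}(\mathbf{k})}\, \widetilde{\varphi}(\mathbf{k})\, d\mathbf{k}\right), \quad \text{for every } \varphi \in \mathcal{S}(M). \tag{42}$$
Notice that the constraint (22) is automatically fulfilled in view of (41).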
Remark 4.3. The maximally extended de Sitter spacetime can be realized by gluing together two isometric spacetimes (one expanding and the other contracting, when moving towards the future) on the common cosmological horizon. The obtained spacetime is maximally symmetric and admits $SO(1, 4)$ as a group of isometries. The state $\omega_E$ extends to a globally defined state on the whole de Sitter spacetime [Al85] and such a state is $O(1, 4)$-invariant, hence it is invariant also under symmetries which do not preserve the horizon.
We have the following result whose proof is in the Appendix.
Theorem 4.4. Consider the expanding universe $(M, g, X, \Omega, \gamma)$ given by (35) with $a(\tau) = \gamma/\tau$. Consider a quantum scalar Klein-Gordon field propagating in $(M, g)$ with $m^2 + \xi R > 0$. Then, (a) if $m^2 + \xi R > \frac{5}{48}R$ (see also Remark 4.4), every $\varphi \in \mathcal{S}(M)$ extends smoothly to some $\Gamma\varphi \in \mathcal{S}(\Im^-)$, (30) holds true and (b) $\lambda_M$ on $\mathcal{W}(M)$ coincides with the restriction to $M$ of $\omega_E$.
Remark 4.4. The requirement $m^2 + \xi R > \frac{5}{48}R$, i.e. $|\mathrm{Re}\,\nu| < 1$, is used to assure that $\Gamma\varphi \in \mathcal{S}(\Im^-)$ if $\varphi \in \mathcal{S}(M)$. Actually the requirement can be dropped, preserving only $m^2 + \xi R > 0$, if we change the definition (15) of $\mathcal{S}(\Im^-)$, namely
$$\mathcal{S}(\Im^-) := \left\{ \psi \in C^\infty(\mathbb{R} \times \mathbb{S}^2) \;\middle|\; \int_{\mathbb{R}\times\mathbb{S}^2} |\widehat{\psi}(k, \theta, \phi)|^2\, |k|\, dk \wedge \epsilon_{\mathbb{S}^2}(\theta, \phi) < +\infty \right\},$$
where $\widehat{\psi}$ indicates the Fourier-Plancherel transform of the Schwartz distribution $\psi$ (as discussed in Appendix C of [Mo07]). Then the symplectic form on $\Im^-$ could be defined by Fourier transforming along the $\mathbb{R}$-direction (18). In this way, the identity (18) would hold true in a weaker limit sense, employing a suitable regularisation of $\psi_1$ and/or $\psi_2$ by means of sequences of smooth compactly supported functions. Then the construction of $\lambda$ on $\mathcal{W}(\Im^-)$ and of its GNS triple, as well as the uniqueness/positive energy theorems, would closely resemble our previous analysis.
To conclude we have the last promised theorem, whose proof is in the Appendix: the hypotheses of Theorem 4.2 are fulfilled, and thus $\lambda_M$ is defined, for FRW metrics as described in Sect. 2.2 with $a(\tau)$ as in (4), provided the mass $m$ of the Klein-Gordon field and/or the constant $\xi$ are large enough.
Theorem 4.5. Consider a quantum scalar Klein-Gordon field $\varphi$, satisfying (14) and propagating in an expanding universe $(M, g, X, \Omega, \gamma)$. Consider $a(\tau)$ as in (4) and with $\ddot{a}(\tau) = 2\gamma/\tau^3 + O(1/\tau^4)$ in such a way that $R = 12/\gamma^2 + O(1/\tau)$. Then, if
$$M \simeq (-\infty, 0) \times \mathbb{R}^3, \qquad g = a^2(\tau)\left[-d\tau \otimes d\tau + dr \otimes dr + r^2\, d\mathbb{S}^2(\theta, \phi)\right],$$
with $\tau \in (-\infty, 0)$ and $r, \theta, \phi$ standard spherical coordinates on $\mathbb{R}^3$, $X = \partial_\tau$ and $\Omega = a(\tau) = \gamma/\tau + O(1/\tau^2)$, whenever $m^2\gamma^2 + 12\xi > 2$, every $\varphi \in \mathcal{S}(M)$ extends smoothly to some $\Gamma\varphi \in \mathcal{S}(\Im^-)$ and (30) holds true.
Remark 4.5. (1) Theorem 4.5 is also valid relaxing the hypothesis to the case $\xi = 1/6$ and $m = 0$. In this case the proof is similar to that of the case studied in [DMP06,Mo06].
(2) The validity of the Hadamard property for the states $\lambda_M$ will be investigated in a forthcoming paper. However, a first scrutiny shows that it does hold for the states $\lambda_M$ considered in Theorem 4.5 provided the two-point function of such a state is a distribution of $\mathcal{D}'(M \times M)$. The proof is similar to the one in [Mo07]. The distributional requirement is fulfilled if the functions $\Gamma\varphi$, $\varphi \in \mathcal{S}(M)$, satisfy a suitable decay property as $\ell \to -\infty$.
5. Conclusions and Open Issues
In this manuscript, we were able to prove that, imposing some suitable constraints on the expansion factor $a(t)$, the FRW background can be extended to a larger spacetime which encompasses a cosmological horizon. Such structure is later generalised in Definition 3.1, where we introduce a novel notion of an expanding universe $(M, g)$ with geodesically complete cosmological past horizon $\Im^-$. It is worth stressing that, in the set of backgrounds we are taking into account, besides the conformal factor $\Omega$, a relevant role is played by a future oriented timelike vector $X$ which is a conformal Killing vector for the metric $g$. As a byproduct of these geometric properties, we were able to construct explicitly the structure of the subgroup $SG_{\Im^-}$ of the isometry group of $\Im^-$, i.e., the iterated semidirect product $SO(3) \ltimes \left(C^\infty(\mathbb{S}^2) \ltimes C^\infty(\mathbb{S}^2)\right)$. Such a result suggests that one could hope to readapt in this framework some of the properties of a scalar quantum field theory as discussed in [DMP06,Mo06,Mo07,Ho00]. In fact, using only the universal structure of $\Im^-$, we were able to select, for the theory on the horizon, a preferred state $\lambda$ which is quasi-free and pure. $\lambda$ is the unique state which, besides the previous properties, is also invariant under the action of the horizon symmetry group; actually, uniqueness for pure quasifree states on $\mathcal{W}(\Im^-)$ holds under the sole hypothesis of invariance with respect to the one-parameter group generated by $\partial_\ell$, and a more general uniqueness property is valid as discussed in Theorem 4.1.
Moreover, for any future oriented timelike vector field $Y$ in the bulk such that it projects on the horizon to $\widetilde{Y}$, i.e. a generator of the Lie algebra of $SG_{\Im^-}$, the unitary group of operators implementing the action of $\widetilde{Y}$ on the GNS representation of $\lambda$ is strongly continuous with a nonnegative self-adjoint generator. Finally, the one-particle space in the GNS representation of the state $\lambda$ turns out to be an irreducible representation of the group of horizon symmetries $SG_{\Im^-}$. In Sect. 4, we considered a generic massive scalar Klein-Gordon equation with an arbitrary coupling to curvature. Under the assumption that each solution of such an equation for compactly supported initial data projects on the horizon to a rapidly decreasing smooth function, say $\psi$, and that such a projection preserves a suitable symplectic form, we were able to draw some interesting conclusions. As a first step, the projection map between classical fields extends also at the level of Weyl algebras, namely we can embed the bulk Weyl $C^*$-algebra as a $C^*$-subalgebra of the horizon counterpart. Furthermore, such an embedding between Weyl algebras can be exploited in order to pull back $\lambda$ to a bulk state $\lambda_M$ which is still quasi-free and invariant under the action of any bulk isometry which preserves the cosmological horizon. Furthermore, whenever the Killing vector is everywhere future oriented and timelike, the one-parameter group of unitary operators implementing such an action has a positive self-adjoint generator. As previously mentioned, these results hold true under certain hypotheses which we tested in Sect. 4.5, where we studied the behaviour of solutions of the Klein-Gordon equation of motion with an arbitrary coupling to curvature both in the de Sitter and
in the FRW background. Our analysis shows (see Theorem 4.5) that the hypotheses made at the beginning of Sect. 4 hold true at least whenever certain conditions on the relevant parameters in the equation of motion are satisfied. In the de Sitter case $\lambda_M$ coincides with the well-known Euclidean Bunch-Davies vacuum. We feel safe to claim that the analysis we performed proves that the investigation of a quantum field theory in a suitable cosmological background by means of a horizon counterpart is a viable option. Hence, as a future perspective, one would hope as a first step to extend the domain of applicability of Theorem 4.5, and later to further discuss the properties of the bulk state. In particular our long-term aim is to prove both that $\lambda_M$ is pure and that it is Hadamard, so that it can be used in renormalisation procedures, especially for the stress-energy tensor [Wa94,Mo03,HW05]. Furthermore we should also investigate possible relations with the adiabatic states often exploited in the study of field theories on FRW backgrounds [JS02,LR90,Ol07,Pa69]. Concerning the validity of the Hadamard property, it holds true for $\lambda_M$ when $M$ is de Sitter spacetime, since in this case $\lambda_M$ is the Euclidean vacuum. However, a first scrutiny shows that it does hold for all the states $\lambda_M$ considered in Theorem 4.5 provided the two-point function of such a state is a distribution of $\mathcal{D}'(M \times M)$. The proof is almost the same as that performed in [Mo07]. Last but not least, it would be interesting to extend our results to interacting fields. From a physical perspective this would be the most appealing scenario since, as mentioned in the Introduction, nowadays cosmological models are often based upon a single scalar field whose dynamics is governed by a non-trivial potential. It could also be worth investigating possible applications of our results to the description of dark matter. Since dark matter is weakly interacting, it is feasible to model it, at least in a first approximation, as a free quantum scalar field on a curved background. Although here we do not address all the above-mentioned topics, we believe that this manuscript could be a first step in this direction and we hope to discuss many if not all of these points in a forthcoming manuscript.
Acknowledgement. The work of C.D. is supported by the von Humboldt Foundation and that of N.P. has been supported by the German DFG Research Program SFB 676. We would like to thank K. Fredenhagen and R. Brunetti for useful discussions.
A. Proof of Some Technical Results Proof of Proposition 3.1. (a) If there were a smooth extension of X to M it would be unique by continuity, moreover, by continuity again, it would define a Killing vector for g when restricting to the surface − , because the right-hand side of (7) vanishes there. Coordinates We, in fact, will prove the existence of a smooth extension to the whole M. of − = ∂ M. Using the whole ( , , θ, φ) are defined in a neighbourhood U ⊂ M class of smooth curves γ : t → ( 0 , t, θ0 , φ0 ), where ( 0 , θ0 , φ0 ) ∈ R × S2 are fixed arbitrarily, and the transport equations [Ge772,Hal04], a , ab + 1 a a gab γ˙ a ∇ ϕ , γ˙ a ∇ ϕ = γ˙ a K X b = γ˙ a F 2
bcad [b bc = γ˙ a R a F γ˙ a ∇ gc]a , Xd + K
d(a F b = γ˙ a b) d a K d γ˙ a ∇ ϕ L ab + 2 R L ab + Xd∇
(43)
ab − 1 (where L ab := R g := 6 gab R) we can “transport” X , Fab = ∇a X b − ∇b X a , ϕ 1 − a g ), and K a := ∇a ϕ beyond in U . The transported fields X , F, ϕ , and K 2 L X ( are nothing but the solutions of the first order differential equations (43), with initial conditions given by the known fields X , F, ϕ, K evaluated on a fixed smooth surface = ( , θ, φ) completely included in M ∩ U . In M, X coincides with X itself (and coincides with F itself and so on), since every conformal Killing vector field fulfills F transport equations (43) [Ge772,Hal04] and the uniqueness theorem holds for solutions of ordinary differential equations. Outside M one gets a smooth field X anyway, due to the joint dependence of the solution of differential equations from the initial data (assigned on a smooth surface as well). Obviously the constructed field X does not need to fulfill conformal Killing equations outside M. In this way we have constructed a smooth extension X of X on the open set M ∪U enclosing − ; the further extension to M −1 X () is now trivial, using standard smoothing technology. By continuity, L = g X must hold on − . This means that the right-hand side smoothly extends there (to zero by hypotheses). In particular, since = 0 on − , X () = 0 on − . That is X − , − d = 0, and thus X− is tangent to as wanted. The set on − of the points where X vanishes is closed since X is continuous. To conclude, we wish to prove that X − cannot vanish on every (nonempty) open set A ⊂ − (otherwise it vanishes everywhere on − , but this case is not allowed by definition of X ). Assume that there is such an A where X A = 0, take p ∈ A and fix any other point q ∈ − , such that there is a g -geodesics, γ ⊂ − , joining p and q. We assume here that γ is either a space-like geodesics on S2 or a null-like geodesic at constant angular variables. We want to prove that X (q) = 0 when X A = 0. 
a If X A = 0, all the derivatives ∇ X b vanish, in A, when a = , that is referring to directions tangent to − . However, on − it holds L g = 0, by hypotheses. Writing X down these equations explicitly, one finds that X = 0 on A implies ∇ X b = 0 if b = . However ∇ X − = 0 holds since both X = X () and X ()/ = X / ab = 0. Notice that ϕ = 0 in A, since it is vanishes on − . We have found that, in A, F −1 proportional to the limit of X () approaching − which vanishes by hypotheses. a = 0 when a = , in A, that is K a = 0 for a = at most, This also entails that K ( p) for the considered field in A. Let k denote the value K X with X A = 0. Let us finally focus on the differential equations (43 ) referred to in the mentioned geodesic [0, 1] t → γ (t). We argue that a solution, and thus the unique solution, for initial data ab (0) = 0, (0) := k is ab (t) = 0, at p, X (0) = 0, F ϕ (0) = 0, K X (t) = 0, F ϕ (t) = 0, (t), for all t ∈ [0, 1], where the last function uniquely satisfies γ˙ a ∇ b = 0 with a K K (0) := k. To prove it notice that, inserting these functions in (43), the equations reduce K to a = 0 , γ˙ a K b − γ˙ b K a = 0 , γ˙ a ∇ b = 0. a K γ˙ a K
(44)
The first two equations are certainly fulfilled at t = 0 by hypotheses, the third one (0) := k. However also the first two determines K uniquely with the initial condition K equations are fulfilled on this solution in view of the fact that they are fulfilled at t = 0 a γ˙ b = 0, since we are dealing with a geodesic. We have found that, in and that γ˙ a ∇ particular, X vanishes at q as wanted, since X (1) = 0. With the same procedure, moving p and q about the original positions, we find that X vanishes in a open set Aq which enlarges A and it includes q. Iterating the procedure, we can enlarge Aq in order to include any third point q ∈ − , joined to q by means of a second geodesics, so that X vanishes at q too. In view of the form (8) of the metric on − , for every couple of
points p, q ∈ − , there is always a sequence of three consecutive geodesics, of the two above-mentioned types, joining p and q . Therefore X vanishes everywhere on − . (b) In a neighbourhood of − , referring to coordinates , , θ, φ, one has X = f ∂ + f ∂ + f θ ∂θ + f φ ∂φ . Approaching − (i.e. as = 0) one gets (1) f = 0, since X becomes tangent to − . However one also finds (2) ∂ f − = 0 as a consequence of ( f − f − )/ = −1 X () → 0 approaching − . Since X − is tangent to the null surface − and it is the limit of a timelike vector, we also know that, at the points where it does not vanish, it must be light-like and future directed. Since X− = f ∂ + f θ ∂θ + f φ ∂φ , the θ φ requirement g( X, X )− = 0 implies that (3) f = f = 0 everywhere on − , in view of the Bondi form of the metric on − . Therefore (4) X− = f (0, , θ, φ)∂ . Using the Bondi form of the metric again, the requirement (L g )− = 0 produces immediately X the constraints ∂ f − = 0 in view of (1), (2), (3), and (4), so that X − = f (θ, φ)∂ . Since X − cannot vanish in any open set on − , f cannot vanish in any open set on S2 . Since f is smooth and thus continuous, the set f −1 (0) must be closed. Since, with our sign convention for the Bondi metric, both X and ∂ are future oriented, f cannot be negative. Proof of Proposition 3.2. We start from the proofs of (a) and (b). If there were a smooth extension of Y to M = M ∪ − it would be unique by continuity and it would satisfy LY g = 0 up to − by continuity again. Therefore it is sufficient to establish the existence to get the most relevant part of (a) and (b). The proof is of a smooth extension to M essentially the same as done in the proof of Proposition 3.1, concerning the existence of the extension of the field X . Now, Y is a proper conformal Killing field so that the transport equations (44) [Ge772,Hal04] reduce to bcad γ˙ a Y b = γ˙ a F ab and γ˙ a ∇ bc = R d . a Y a F γ˙ a ∇
(45)
The procedure is exactly as that in the proof of Proposition 3.1 and, in this way, one of Y on M and in particular on − . The condition that obtains a smooth extension Y − Y is tangent to is Y , d = 0 everywhere on − . However g sb ∂b = (∂ )s and X → f ∂ approaching − , for some nonnegative function f ∈ C ∞ (S2 ), as , d f = lim→− g(Y , X ). If the limit vanishes shown in Proposition 3.1. Therefore Y , d = 0 on the points ( , s) ∈ R × S2 , where f (s) = 0. This approaching − , Y , d = 0 on R × B. Let happens on an open nonempty set B ⊂ S2 . Therefore Y 2 ( 0 , s0 ) ∈ R × B. Since S \B has no interior (see Proposition 3.1), there is a sequence , d( , s) implies R × B ( 0 , sn ) → ( 0 , s0 ) as n → ∞. Continuity of ( , s) → Y , d = 0 in R × (S2 \B) and, thus, everywhere. Conversely, if Y is tangent to − , Y − then Y , d = 0 on , and hence lim→− g(Y , X ) = Y , d f = 0. − To conclude, we prove the last statements: (c) and (d). Since the map Y → Y is linear by construction, (d) is a trivial consequence of (c). Let us prove (c). If the considered space is made of the zero vector only, the proof of (c) is trivial. Assume that − = 0 on a it is not the case. To prove (c), it is sufficient to prove that the identity Y set A ⊂ − which is nonempty and open with respect to the topology of − , entails = 0 in M ∪ − by continuity). Let us show it. Consider Y = 0 in M (and thus Y any fixed point p ∈ M and a smooth path γ from some q ∈ A to p (it exists because M is connected and − = ∂ M). In view of the first order transport equations (45), ( p) = 0 when both Y (q) and F ab (q) vanish. Let us show that it is the case. Y ( p) = Y
− = 0 on A as above. Using coordinates ( , , θ, φ) about − , one has Suppose that Y b that ∂a Y A = 0 if a = . On the other hand, the condition LY gab = 0 computed on A = 0 and ∂a Y b A = 0 if a = , yields ∂ Y b A = 0, so that A, taking into account Y b b b c ∇a Y A = ∂ Y A +ac Y A = 0. Therefore Fab A = 0 which concludes the proof. Proof of Proposition 3.3. (a) If (s 1 , s 2 ) are (local) coordinates of a point s ∈ S2 , fix α, β ∈ C ∞ (S2 ) and real constants r1 , r2 , r3 . We wish to study the integral lines t → ( (t), s(t)) ∈ R × S2 of the field Z ( , s) := (α(s) + β(s))∂ + 3k=1 rk Ski ∂s i on R × S2 , with initial condition ( 0 , s0 ). By construction, the components referred to the sphere do not depend on and thus, the corresponding equations can be integrated separately. Since 3k=1 rk Ski ∂s i is smooth and S2 is compact, the integral lines t → s(t|s0 ) (here and henceforth |s0 denotes the initial condition at t = 0) must be smooth and complete (i.e. defined for t ∈ (−∞, +∞)), in view of well-known theorems of differential equations on manifolds. Then assume that the smooth function R t → s(t|s0 ) is known (computed as above). The remaining differential equation reads d = α(s(t|s0 )) + β(s(t|s0 )) . dt It can be integrated and the right-hand side is defined for the values of t where the full integral converge:
\[
\ell(t|s_0,\ell_0) = e^{\int_0^t dt_1\,\alpha(s(t_1|s_0))}\,\ell_0
+ e^{\int_0^t dt_1\,\alpha(s(t_1|s_0))}\int_0^t dt_1\,\beta(s(t_1|s_0))\,e^{-\int_0^{t_1} dt_2\,\alpha(s(t_2|s_0))}.
\]
(46) It is apparent that the parameter t ranges in the whole real axis due to smoothness of R t → α(s(t|s0 )) and R t → β(s(t|s0 )), and that R t → (t|s0 , t0 ) is smooth as well. We have established that the integral lines of Z are complete and thus, in view of known theorems, the one-parameter group of diffeomorphisms generated by Z is global. Since s = s(t) must necessarily describe a rotation of S O(3), about the
axis (r1 , r2 , r3 )/ r12 + r22 + r32 with angle t r12 + r22 + r32 , of the point on S2 initially individuated by s0 and, taking (46) into account, it is evident that each diffeomorphism R × S2 ( 0 , s0 ) → ( (t|s0 , t0 ), s(t|s0 )) ∈ R × S2 ,
for every fixed t ∈ R, has the form (11) and, thus, it belongs to SG − . (b) A fixed (a, b, R) ∈ SG − can be decomposed as (R, a, b) = (I, a ◦ R −1 , b ◦ R −1 ) (R, 0, 0). Looking at (46), (R, 0, 0) is an element of the one-parameter group generated by 3 k=1 n k Sk , where (n 1 , n 2 , n 3 ) are the Cartesian components of the rotation axis of R; conversely the transformation (I, a◦ R −1 , b ◦ R −1 ) can be written as exp{1Z }, −1 −1 where Z = a R (s) ∂ + b R (s) ∂ . Proof of Theorem 3.1. Consider the local one-parameter group of diffeomorphisms gene in a sufficiently small neighbourhood (in M) of a point q ∈ − and for rated by Y
Cosmological Horizons and Reconstruction of Quantum Field Theories
t ∈ (−, ) with > 0 sufficiently small. In local coordinates over − , ( , s 1 , s 2 ) ∈ (a, b) × A, such a set of transformations can be represented by
\[
\ell \to \ell_t := f(\ell, s^1, s^2, t), \qquad (s^1, s^2) \to (s^1_t, s^2_t) := g(\ell, s^1, s^2, t), \qquad \text{with } (\ell, s^1, s^2) \in (a,b)\times A. \tag{47}
\]
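The affine dependence on ℓ of the flows considered here, as in formula (46), can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original argument: the functions `alpha` and `beta`, the angular speed `OMEGA`, and the restriction of s(t) to a rotation about a fixed axis (so that only the azimuthal angle advances linearly) are all hypothetical choices made for illustration. The sketch integrates dℓ/dt = α(s(t))ℓ + β(s(t)) with a classical Runge-Kutta scheme and compares the result with the closed form (46) evaluated by trapezoidal quadrature.

```python
import math

OMEGA = 0.7  # assumed angular speed of the rotation s(t) (illustrative)

def alpha(phi):
    """Hypothetical smooth function alpha restricted to a circle of latitude."""
    return 0.3 * math.cos(phi)

def beta(phi):
    """Hypothetical smooth function beta restricted to a circle of latitude."""
    return 1.0 + 0.5 * math.sin(phi)

def s_of_t(t, phi0):
    """Azimuth of the rotated point: rotation about a fixed axis (assumption)."""
    return phi0 + OMEGA * t

def rk4_flow(l0, phi0, t_end, steps=2000):
    """Integrate dl/dt = alpha(s(t)) l + beta(s(t)) with classical RK4."""
    h = t_end / steps
    l, t = l0, 0.0

    def rhs(time, value):
        p = s_of_t(time, phi0)
        return alpha(p) * value + beta(p)

    for _ in range(steps):
        k1 = rhs(t, l)
        k2 = rhs(t + h / 2, l + h * k1 / 2)
        k3 = rhs(t + h / 2, l + h * k2 / 2)
        k4 = rhs(t + h, l + h * k3)
        l += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return l

def closed_form(l0, phi0, t_end, steps=2000):
    """Evaluate (46): l(t) = e^{A(t)} (l0 + int_0^t beta e^{-A}), A(t) = int_0^t alpha."""
    h = t_end / steps
    big_a = 0.0      # running value of int_0^t alpha
    integral = 0.0   # running value of int_0^t beta e^{-A}
    for n in range(steps):
        t0, t1 = n * h, (n + 1) * h
        a0 = alpha(s_of_t(t0, phi0))
        a1 = alpha(s_of_t(t1, phi0))
        next_a = big_a + h * (a0 + a1) / 2
        g0 = beta(s_of_t(t0, phi0)) * math.exp(-big_a)
        g1 = beta(s_of_t(t1, phi0)) * math.exp(-next_a)
        integral += h * (g0 + g1) / 2
        big_a = next_a
    return math.exp(big_a) * (l0 + integral)

numeric = rk4_flow(1.5, 0.2, 3.0)
formula = closed_form(1.5, 0.2, 3.0)
# Affinity in the initial datum: equal increments of l0 give equal increments of l(t).
f0, f1, f2 = (rk4_flow(x, 0.2, 3.0) for x in (0.0, 1.0, 2.0))
print(numeric, formula, (f2 - f1) - (f1 - f0))
```

In this toy setting the two evaluations agree up to quadrature error, and the flow is affine in the initial value of ℓ, which is the structure c ℓ + b used for the transformations of the horizon.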
Using the same argument as the one used to characterise the group $SG_{\Im^-}$ (after Definition 3.2), one finds that $g(\ell, s^1, s^2, t) = R_t(s)$ for all ℓ, s, and $f(\ell, s^1, s^2, t) = c(s^1, s^2, t)\,\ell + b(s^1, s^2, t)$ for all ℓ, s, for some $R_t \in O(3)$ depending smoothly on t, where c, b are jointly smooth real functions. The requirement that $t \to R_t$ is a (local) one-parameter subgroup of SO(3) implies that $\frac{dR_t}{dt}\big|_{t=0} = \sum_{k=1}^{3} r_k S_k(s^1, s^2)$. Similarly, $\frac{df_t}{dt}\big|_{t=0} = \frac{\partial c(s^1,s^2,t)}{\partial t}\big|_{t=0}\,\ell + \frac{\partial b(s^1,s^2,t)}{\partial t}\big|_{t=0}$. We have found that, in local coordinates,
\[
\tilde{Y}\big|_{\Im^-} = \sum_{k=1}^{3} r_k S_k(s^1, s^2) + \frac{\partial c(s^1, s^2, t)}{\partial t}\Big|_{t=0}\,\ell\,\partial_\ell + \frac{\partial b(s^1, s^2, t)}{\partial t}\Big|_{t=0}\,\partial_\ell,
\]
and thus, about q, $\tilde{Y}|_{\Im^-}$ takes the form of the vectors in $\mathfrak{g}_{\Im^-}$. Since this holds true in a neighbourhood of each point on $\Im^-$, we have that $\tilde{Y}|_{\Im^-} \in \mathfrak{g}_{\Im^-}$. To conclude, (b) is an immediate consequence of (a) and of the last part of (a) in Proof of Proposition 3.3.
Proof of Proposition 3.4. Since $\tilde{Y} \in \mathfrak{g}_{\Im^-}$, in principle it has the form
\[
\tilde{Y}(\ell, s) = \sum_{i=1}^{3} c_i S_i(s) + \big(f(s) + \ell\, g(s)\big)\,\partial_\ell .
\]
, is tangent to − it must Since g (Y, Y ) < 0 about − and its limit toward − , namely Y satisfy g (Y , Y ) = 0 by continuity (no timelike tangent vectors can be tangent to a null 3 ci Si (s) = 0 on − . surface). Using the form (8) of g one sees that it must be: i=1 Using the explicit form of S1 , S2 , S3 referring to the base ∂φ , ∂θ of T S2 , one sees that this is equivalent to claim that, everywhere on the sphere, (c1 sin φ − c2 cos φ) = 0, c1 cot θ cos φ + c2 cot θ sin φ + c3 = 0. As a consequence c1 = c2 = c3 = 0. Therefore, everywhere on − , = ( f (s) + g(s))∂ , Y is the limit of a causal future-directed vector. for some functions f, g ∈ C ∞ (S2 ). Y Therefore, it has either to vanish or to be directed as ∂ at every point of − . Since g(s) may take every arbitrarily large, positive or negative, value (notice that g is bounded, it being smooth on a compact set), it must be g(s) = 0 and f (s) ≥ 0. Proof of Theorem 4.3. As before, from now on, (F+ (H), , ϒ) is the GNS triple of λ. First of all we notice that λ M is in fact a well-defined state on W(M) since ı is a ∗-homomorphism. λ M is quasifree associated with a real scalar product µ M : S(M) × S(M) → R defined as µ M (ϕ, ϕ ) := µ(γ ϕ, γ ϕ ). From this fact, it follows that the GNS triple of λ M can be constructed as (F+ (H M ), A M , ϒ), where A M ⊂ W(− ) is the sub C ∗ -algebra isomorphic to W(M) in view of Theorem 4.2, H M is the Hilbert
subspace of H given by the closure of the space of complex linear combinations of K µ ((ϕ)), for every ϕ ∈ S(M) and, thus, F+ (H M ) is a Fock subspace of F+ (H). In particular, the canonical R-linear map K µ M : S(M) → H M is nothing but K µ M = K µ ◦ γ . (a) By construction, using the definition of λ M , taking advantage of (33) as well as of the invariance property of λ under the action of SG − , if a ∈ W(M), one has
\[
\lambda_M\big(\beta^{(Y)}_t(a)\big) = \lambda\big(\imath(\beta^{(Y)}_t(a))\big) = \lambda\big(\alpha^{(\tilde{Y})}_t(\imath(a))\big) = \lambda(\imath(a)) = \lambda_M(a).
\]
This proves the first part of (a). To conclude the proof of (a), let $V^{(\tilde{Y})}_t : H \to H$ be the one-parameter group of unitaries that implements $\alpha^{(\tilde{Y})}_t$ in the one-particle space H for λ. From $K_{\mu_M} = K_\mu \circ \gamma\Gamma$, (33) and the construction of V one has:
\[
V^{(\tilde{Y})}_t K_{\mu_M}\varphi = V^{(\tilde{Y})}_t K_\mu\,\gamma\Gamma(\varphi) = K_\mu\big(\gamma\Gamma(\varphi \circ \exp\{-tY\})\big) = K_{\mu_M}\big(\varphi \circ \exp\{-tY\}\big).
\]
We have found that, for every ϕ ∈ S(M), $V^{(\tilde{Y})}_t K_{\mu_M}\varphi = K_{\mu_M}(\varphi \circ \exp\{-tY\})$, hence $V^{(\tilde{Y})}_t$ leaves the one-particle space of $\lambda_M$, $H_M$, invariant and $V^{(\tilde{Y})}_t{\restriction}_{H_M}$ implements $\beta^{(Y)}_t$ in $H_M$. As a consequence of the structure of the GNS triple of $\lambda_M$, if $U^{(\tilde{Y})}_t$ implements $\alpha^{(\tilde{Y})}_t$ unitarily in the Fock space $F_+(H)$ leaving ϒ invariant, it also leaves invariant the structure of the GNS-Fock space of $\lambda_M$ and, therein, $U^{(\tilde{Y})}_t{\restriction}_{F_+(H_M)}$ implements $\beta^{(Y)}_t$ unitarily in $F_+(H_M)$ leaving the cyclic vector invariant. In other words
\[
U^{(Y)}_t = U^{(\tilde{Y})}_t{\restriction}_{F_+(H_M)} .
\]
Notice that $\mathbb{R} \ni t \to U^{(\tilde{Y})}_t{\restriction}_{F_+(H_M)}$ is strongly continuous since $\mathbb{R} \ni t \to U^{(\tilde{Y})}_t$ is such. Moreover the self-adjoint generator of $U^{(\tilde{Y})}_t{\restriction}_{F_+(H_M)}$ is obtained by restricting that of $U^{(\tilde{Y})}_t$ to $F_+(H_M)$. If the generator of $U^{(\tilde{Y})}_t$ is positive, its restriction has to be so. In the considered case, the generator of $U^{(\tilde{Y})}_t$ is positive since Y is timelike and future directed and thus we can apply (a) of Proposition 4.1. The same argument shows that the self-adjoint generator of $V^{(\tilde{Y})}_t{\restriction}_{H_M}$ has no zero modes if that of $V^{(\tilde{Y})}_t$ has no zero modes. This last fact happens if $\tilde{Y}$ vanishes on a zero-measure subset of $\Im^-$, due to (b) of Proposition 4.1.
Proof of Theorem 4.4. (a) Consider a wavefunction ϕ ∈ S(M). It satisfies ϕ = E f , where E : C0∞ (M) → S(M) is the causal propagator and f is some real smooth and compactly supported function in M. Since the maximally extended de Sitter spacetime M is globally hyperbolic and M ⊂ M , – so that C0∞ (M) ⊂ C0∞ (M ) – one can focus on the wavefunction ϕ := E f , where E is the causal propagator in M . By construction ϕ M = ϕ, so that ϕ is a smooth extension of ϕ. Since − ⊂ M , all that implies that ϕ extends to − smoothly (and uniquely) and this extension is lim→− ϕ = ϕ − . In this way, an R-linear map : S(M) ϕ → ϕ − ∈ C0∞ (− ) is defined. To conclude (a), it is enough to prove both that Ran ⊂ S(− ) and that preserves the symplectic forms. (2) Let us prove them. Bearing in mind the previously discussed behaviour of Hν (z) for large z (with |ar gz| ≤ π − ), making use of (36) and (37), the identity (40) can be
recast as π +∞ e−i 4 ϕ(τ, x) = 2 (θ, φ) dkkei(kr cos λx (θ,φ)−kτ ) γ 4π 3/2 S2 S 0 √ 1 × τ+O k ϕ (k, θ, φ) + c.c. , k
(48)
where λx (θ, φ) ∈ [0, π ] is the angle between x and k. The iterated integrations make √ sense and can be interchanged (via Fubini-Tonelli theorem) since both k ϕ (k, θ, φ) and √ ( k ϕ (k, θ, φ) are integrable in the measure dk. They are smooth everywhere but for k = 0, they vanish rapidly at large |k| and, for k = 0, ϕ ∝ 1/|k|−Re|ν| if m 2 +ξ R > 0 for − ν. Now, calling τ = (u + v)/2 and r = (u − v)/2, arises as the limit v → −∞. The contribution due to the factor of O k1 vanishes due to the Riemann-Lebesgue lemma: (ϕ) (u, θx , φx )
π
e−i 4 = lim s→+∞ γ 4π 3/2
+∞
dk 0
S2 (θ, φ)
S2
ks i ks [cos λx (θ,φ)+1] −iuk √ e k ϕ (k, θ, φ) + c.c. e 2 2
That limit can be computed using integration by parts exactly as in Appendix A2 of [DMP06]. In detail, one rotates the axes so that the axis z coincides with x and, thinking of ϕ as a function of k, c, φ, where c := cos θ ∈ [−1, 1], one re-arranges the expression above as π +∞ 2π −ie−i 4 dk dφ (ϕ) (u, θx , φx ) = lim s→+∞ γ 4π 3/2 0 0 1 ∂ i ks [c+1] −iuk √ e 2 e k ϕ (k, c, φ) + c.c., × dc −1 ∂c where θx = 0 in our case. The right-hand side can be expanded using integration by parts and only the contribution for c = −1 (that is θ = −π , i.e. k/|k| = −x/|x|) survives, the others vanish as s → +∞, due to Riemann-Lebesgue’s lemma (interchanging various integrations using the Fubini-Tonelli theorem and finally taking advantage of dominated convergence theorem). The integration over φ produces a trivial factor 2π since the dependence from φ of the involved functions disappears as θ = 0, π . The final result reads, using the initial generic choice for the axes x, y, z: π +∞ √ i2π e−i 4 dk e−iuk k ϕ (k, η(θx , φx )) + c.c., (ϕ) (u, θx , φx ) = 3/2 γ 4π 0 η : S2 → S2 denoting the parity inversion S2 n → −n ∈ S2 . Dropping the index x, and viewing θ, φ as the standard coordinates on − , the obtained result can be re-written as π +∞ e−i k k k e−i 4 ϕ , η(θ, φ) + c.c., (49) dk √ (γ ϕ) ( , θ, φ) = i (−γ ) 0 2(−γ ) (−γ ) 2π where we have passed to the standard Bondi coordinates on − , i.e. , θ, φ with 5 u = −γ . In our hypotheses on ϕ and ν, most notably m 2 + ξ R > 48 R, the functions k k 2 + 2 ϕ (k, η(θ, φ)) and k 2 ϕ (k, η(θ, φ)) belong also to L (R × S ; dk ∧ S2 (θ, φ)). 2
This implies that both the functions ϕ, ∂ ϕ belong to L 2 (R × S2 ; d ∧ S2 ). In this way we have found that Ran ⊂ S(− ). Actually we have obtained much more: by means of both (21) and the Fourier transformed expression of σ , (49) implies that σ (γ ϕ, γ ϕ )
k k k = −2I m (−γ ) dk ∧ S2 2k ϕ , η(θ, φ) ϕ , η(θ, φ) 2(−γ ) (−γ ) (−γ ) R+ ×S2 k 2 dk ∧ S2 ϕ (k, θ, φ)ϕ (k, θ, φ) = −2I m dk ϕ (k)ϕ (k) = −2I m −2
R+ ×S2
R3
= σ M (ϕ, ϕ ), where in the last step we exploited (41). Hence γ preserves the symplectic form as requested. (b) Exactly as in the last step of the proof of (a), since the functions k2 ϕ (k, η(θ, φ)) and k k2 ϕ (k, η(θ, φ)) are also in L 2 (R+ × S2 ; dk ∧ S2 (θ, φ)), (23) and (49) imply: k k µ(K λ γ ϕ, K λ γ ϕ) = (−γ ) dk ∧ S2 2k ϕ , η(θ, φ) ϕ 2(−γ ) (−γ ) R+ ×S2 k , η(θ, φ) × (−γ ) k 2 dk ∧ S2 ϕ (k, θ, φ) ϕ (k, θ, φ) = dk ϕ (k) ϕ (k). = −2
R+ ×S2
R3
Therefore, for every ϕ ∈ S(M), in view of (42), 1
λ M (W M (ϕ)) := λ(W (γ ϕ)) = e−µ(K λ γ ϕ,K λ γ ϕ)/2 = e− 2 = ω E (W M (ϕ)) , and this concludes the proof.
R3
ϕ (k) ϕ (k) dk
Proof of Theorem 4.5. Here, weexploit the same notation, i.e. x, k, as in the proof of Theorem 4.4. In particular ν := 49 − (m 2 γ 2 + 12ξ ), so that ν ≥ 0 when 49 − (m 2 γ 2 + 12ξ ) ≥ 0 in the following. However the sign of ν could be fixed arbitrarily (and this applies for imaginary ν, in particular), since the functions we shall employ are invariant under ν → −ν. As a first step, we notice that if ϕ ∈ S(M), it extends to − smoothly so that ϕ := lim→− ϕ ∈ C ∞ (− ) does exist. This is because, as found in Sect. 2.2, the spacetime (M, g) extends to a larger spacetime equipped with a metric g obtained by multiplying the metric of the closed static Einstein universe with a strictly positive smooth factor. Since the closed static Einstein universe is globally hyperbolic and global hyperbolicity does not depend on nonsingular conformal rescaling of the metric, (M, g) itself is included in a globally hyperbolic spacetime. With the same argument used for de Sitter spacetime in the proof of Theorem 4.4, one has that every ϕ ∈ S(M) extends to − smoothly. We have now to show that Ran ⊂ S(− ) and that preserves the symplectic forms.
First of all, analogously to what done in the de Sitter case, we determine a class of modes k (τ, x) that will be useful in decomposing the solutions of the Klein-Gordon equation in order to take the limit of wavefunctions towards − , k (τ, x) :=
$\dfrac{e^{i\mathbf{k}\cdot\mathbf{x}}}{(2\pi)^{3/2}}\,\dfrac{\rho_k(\tau)}{a(\tau)}$, (50)
where, taking the exponential factor into account, the Klein-Gordon equation reduces to the following equation for the functions $(-\infty, 0) \ni \tau \to \rho_k(\tau)$:
\[
\frac{d^2\rho_k(\tau)}{d\tau^2} + \big(V_0(k,\tau) + V(\tau)\big)\rho_k(\tau) = 0, \quad \text{with } V_0(k,\tau) := k^2 + \left(\frac{\gamma}{\tau}\right)^2\left[m^2 + \left(\xi - \frac{1}{6}\right)\frac{12}{\gamma^2}\right], \quad V(\tau) = O(1/\tau^3). \tag{51}
\]
Comparing with the Klein-Gordon equation, one sees that $V_0(k,\tau) + V(\tau) = k^2 + a(\tau)^2[m^2 + (\xi - 1/6)R(\tau)]$, where $V_0$ is nothing but the contribution of the pure de Sitter metric and V is a perturbation. If we dropped the perturbation V(τ), the functions $\rho_k$ would reduce to the functions $\chi_k$, and the modes (50) would reduce to the modes used to construct $\omega_E$ beforehand. Notice that the curvature of the spacetime does not coincide with $12/\gamma^2$ as in de Sitter spacetime, but it reads $R(\tau) = 12/\gamma^2 + O(1/\tau)$, and $a(\tau) = \gamma/\tau + O(1/\tau^2)$. It follows that the added potential satisfies $V(\tau) = O(1/\tau^3)$, as stated above. A formal solution of (51) is obtained in terms of the series:
\[
\rho_k(\tau) = \chi_k(\tau) + \sum_{n=1}^{+\infty} (-1)^n \int_{-\infty}^{\tau} dt_1 \int_{-\infty}^{t_1} dt_2 \cdots \int_{-\infty}^{t_{n-1}} dt_n\, S_k(\tau,t_1) S_k(t_1,t_2) \cdots S_k(t_{n-1},t_n)\, V(t_1) V(t_2) \cdots V(t_n)\, \chi_k(t_n), \tag{52}
\]
where
\[
S_k(t,t') := -i\left(\overline{\chi_k(t)}\,\chi_k(t') - \overline{\chi_k(t')}\,\chi_k(t)\right), \qquad t, t' \in (-\infty, 0), \tag{53}
\]
satisfying, in view of antisymmetry and (38), $S_k(t,t) = 0$ and
\[
\frac{\partial S_k(t,t')}{\partial t}\bigg|_{t'=t} = 1. \tag{54}
\]
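The structure of the series (52) can be illustrated numerically in a toy setting. The sketch below is only an illustration under explicit assumptions and is not the construction used in the proof: the reference modes are taken to be $e^{-ik\tau}/\sqrt{2k}$ (so that the kernel (53), with the complex conjugation placed on the first factor of each product, becomes $\sin(k(t-t'))/k$), the perturbation is $V(t) = 0.5/(-t)^3$, and the lower integration limit $-\infty$ is truncated at $\tau = -20$. All function names are hypothetical. The partial sums of (52) are generated by iterating $\rho \mapsto \chi - \int S\,V\,\rho$ on a uniform grid.

```python
import cmath
import math

def chi(k, t):
    """Toy reference mode e^{-ikt}/sqrt(2k); an assumption standing in for chi_k."""
    return cmath.exp(-1j * k * t) / math.sqrt(2.0 * k)

def kernel(k, t, tp):
    """S_k(t,t') = -i (conj(chi(t)) chi(t') - conj(chi(t')) chi(t)); real for these modes."""
    value = -1j * (chi(k, t).conjugate() * chi(k, tp)
                   - chi(k, tp).conjugate() * chi(k, t))
    return value.real

def picard(k, potential, grid, n_iter):
    """Partial sums of the series: rho_{n+1} = chi - int S V rho_n (trapezoidal rule)."""
    h = grid[1] - grid[0]
    n = len(grid)
    chi_vals = [chi(k, t) for t in grid]
    v = [potential(t) for t in grid]
    s = [[kernel(k, grid[i], grid[j]) for j in range(i + 1)] for i in range(n)]
    rho = list(chi_vals)
    steps = []  # sup-norm distance between consecutive partial sums
    for _ in range(n_iter):
        new = []
        for i in range(n):
            acc = 0j
            for j in range(i + 1):
                w = 0.5 if j == 0 or j == i else 1.0
                acc += w * s[i][j] * v[j] * rho[j]
            new.append(chi_vals[i] - h * acc)
        steps.append(max(abs(a - b) for a, b in zip(new, rho)))
        rho = new
    return rho, steps

K = 1.0
H_STEP = 0.05
GRID = [-20.0 + H_STEP * i for i in range(381)]  # tau in [-20, -1]
rho, steps = picard(K, lambda t: 0.5 / (-t) ** 3, GRID, n_iter=12)

# Wronskian-type combination conj(rho') rho - conj(rho) rho' at an interior grid point.
i0 = 200
drho = (rho[i0 + 1] - rho[i0 - 1]) / (2 * H_STEP)
wronskian = drho.conjugate() * rho[i0] - rho[i0].conjugate() * drho
print(steps[0], steps[-1], wronskian)
```

In this toy model the successive partial sums stabilise quickly, and the combination $\overline{\rho_k'}\rho_k - \overline{\rho_k}\rho_k'$ stays close to $i$, the value it has for the unperturbed modes, up to discretisation error.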
By direct inspection and making use of (54), one sees that the right-hand side of (52) defines a solution of (51) if one is allowed to interchange the τ -derivative operator – up to the second order – with the sign of sum. This is always possible when the series itself and the series of the derivatives of first and second order converge τ -uniformly in a neighbourhood of every fixed τ ∈ (−∞, 0). Actually the locally τ -uniform convergence of the series of derivatives of second order directly follows from the uniform convergence of those of zero and first order, when one refers to the solutions χk and the solutions (2) Sk . Using the expression (37) of the modes χk , expanding Hν in terms of Bessel functions J±ν [GR95] and, finally, exploiting standard integral representations valid for
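Reν > −1/2 (formula 5 in 8.411 in [GR95]) of $J_\nu$, one achieves the following bounds for Reν < 1/2 (that is $m^2\gamma^2 + 12\xi > 2$), for τ < −1, and for some constant $C_\nu \geq 0$:
\[
|\chi_k(\tau)| \leq C_\nu\, (-\tau)^{\mathrm{Re}\,\nu + 1/2}\left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right), \qquad
\left|\frac{d\chi_k(\tau)}{d\tau}\right| \leq C_\nu\, (-\tau)^{\mathrm{Re}\,\nu + 1/2}\left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right)(1 + k), \tag{55}
\]
where $k = |\mathbf{k}|$. Furthermore, for the same reasons it is possible to obtain the following (non-optimal) k-uniform bound for Reν < 1/2, for $t_2 \leq t_1 < -1$, and for some other constant $C'_\nu \geq 0$,
\[
|S_k(t_1, t_2)| \leq C'_\nu\, (t_1 t_2)^{\mathrm{Re}\,\nu + 1/2}. \tag{56}
\]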
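Now fix any T < −1 and consider τ ∈ (−∞, T], so that $|V(\tau)| \leq K_T/(-\tau)^3$, for some constant $K_T \geq 0$. From (55), one sees with a few trivial computations that the series in the right-hand side of (52) and that of the τ-derivatives are τ-uniformly dominated, respectively, by
\[
\left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right) S_{\nu,T}, \qquad \left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right)(1 + k)\, S_{\nu,T}, \tag{57}
\]
where $S_{\nu,T}$ is the following convergent series of positive constants:
\[
S_{\nu,T} := C_\nu \sum_{n=1}^{+\infty} \frac{1}{n!} \left(\frac{2 C'_\nu K_T}{1 - 2\mathrm{Re}\,\nu}\right)^{n} \frac{1}{\left((-T)^{1 - 2\mathrm{Re}\,\nu}\right)^{n - 1/2}}. \tag{58}
\]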
Summarising, we can conclude that (52) defines a solution of (51) and that, the same equation entails the solution to be smooth. As a straightforward consequence we also have the following τ -uniform bound valid on (−∞, T ]:
\[
|\rho_k(\tau) - \chi_k(\tau)| \leq \left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right) S_{\nu,T}, \qquad
\left|\frac{d\rho_k(\tau)}{d\tau} - \frac{d\chi_k(\tau)}{d\tau}\right| \leq 2\left(k^{\mathrm{Re}\,\nu} + k^{-\mathrm{Re}\,\nu}\right)(1 + k)\, S_{\nu,T}. \tag{59}
\]
This implies that, at fixed τ, the measurable (since limit of measurable functions) functions $\mathbb{R}^3 \ni \mathbf{k} \to \rho_k(\tau)$ and $\mathbb{R}^3 \ni \mathbf{k} \to \frac{d\rho_k(\tau)}{d\tau}$ do not grow, for large |k|, faster than $|\mathbf{k}|^{\mathrm{Re}\,\nu}$ and $|\mathbf{k}|^{1 + \mathrm{Re}\,\nu}$ respectively. Moreover, their divergence at k = 0 cannot be worse than that of $\mathbb{R}^3 \ni \mathbf{k} \to \chi_k(\tau)$ and $\mathbb{R}^3 \ni \mathbf{k} \to \frac{d\chi_k(\tau)}{d\tau}$, that is $k^{-|\mathrm{Re}\,\nu|}$. Finally, notice that each term in the series in the right-hand side of (52) and in the analogous series for $d\rho_k/d\tau$ vanishes as τ → −∞ by construction. In view of the fact that, τ-uniformly, the series in (57) dominate both the series in the right-hand side of (52) and the series of τ-derivatives, we are allowed to interchange the operations of limit with that of the sum, obtaining
\[
\lim_{\tau \to -\infty} \big(\rho_k(\tau) - \chi_k(\tau)\big) = 0 \quad \text{and} \quad \lim_{\tau \to -\infty} \left(\frac{d\rho_k(\tau)}{d\tau} - \frac{d\chi_k(\tau)}{d\tau}\right) = 0. \tag{60}
\]
This result has a first important consequence. Using Eq. (51), one sees that the function $\tau \to \overline{\frac{d\rho_k(\tau)}{d\tau}}\,\rho_k(\tau) - \overline{\rho_k(\tau)}\,\frac{d\rho_k(\tau)}{d\tau}$ is actually a constant. The value of this constant can be computed by taking the limit as τ → −∞, making use of (38), (60) and taking into account the fact that, for k fixed, $\frac{d\rho_k(\tau)}{d\tau}$ and $\rho_k(\tau)$ are bounded on (−∞, T] (notice that
these functions have no limit for τ → −∞), as one can show employing the asymptotic behaviour of $H^{(2)}_\nu(z)$ for large values of the argument z. In this way one finds
\[
\overline{\frac{d\rho_k(\tau)}{d\tau}}\,\rho_k(\tau) - \overline{\rho_k(\tau)}\,\frac{d\rho_k(\tau)}{d\tau} = i. \tag{61}
\]
Now, to analyse the behaviour of ϕ, we can follow the same way as that followed in de Sitter space. Take any (real by definition) ϕ ∈ S(M) and fix a Cauchy surface τ in (M, g) individuated by the points in M with the fixed value of τ ; eventually define ∂ϕ(τ, x) ∂k (τ, x) ϕ(τ, x) − k (τ, x) (62) ϕ (k) := −i a(τ )2 dx . 3 ∂τ ∂τ R The right-hand side of (62) does not depend on the choice of τ , as it follows from direct inspection, exploiting (51). Remembering that ϕ ∈ S(M), so that its Cauchy data are real, smooth and compactly supported, we have that their Fourier transform are of Schwartz class. Afterwards, exploiting the fact that both the measurable functions k (τ ) grows at most as a polynomial with degree R3 k → ρk (τ ) and R3 k → dρdτ two for large |k|, and that their divergence at k = 0 is at most of order k −|Reν| with Reν < 1/2, we find that ϕ ∈ C ∞ (R3 \{0}) and it vanishes for |k| → ∞ faster than −n every power |k| , n = 1, 2, . . .. In particular ϕ ∈ L 2 (R3 ; dk) ∩ L 1 (R3 ; dk). Once one knows ϕ by (62), the associated ϕ can be constructed out of a decomposition in terms of modes k : k (τ, x) ϕ (k) + k (τ, x) ϕ (k) dk . (63) ϕ(τ, x) = R3
This is a trivial consequence of (62), (50), (61), and of the standard properties for the Fourier transform of smooth compactly supported functions on R3 . Eventually, per direct computation, one verifies that, if ϕ1 , ϕ2 ∈ S(M), − 2I m ϕ1 (k) ϕ2 (k)dk = (ϕ2 ∂τ ϕ1 − ϕ1 ∂τ ϕ2 ) a 2 (τ )dx =: σ M (ϕ1 , ϕ2 ). R3
R3
(64) We are now in position to draw some conclusions. Indeed, if ϕ ∈ S(M), p ∈ − and (τq , xq ) are the coordinates of q ∈ M, we can write down eik·xq ρk (τq ) − χk (τq ) ϕ (k) (ϕ) ( p) = lim dk q→ p R3 (2π )3/2 eik·xq χk (τ ) ϕ (k) + c.c. (65) + lim dk q→ p R3 (2π )3/2 As q → p ∈ − , τq → −∞ so that ρk (τq ) − χk (τq ) → 0 due to (60). Moreover, since (57) is valid, we have the τ -uniform bound
ik·x
e Reν −Reν
≤ Sν,T
|k| | ϕ (k)|, (τ ) − χ (τ )) ϕ (k) + |k| (ρ k k
(2π )3/2
(2π )3/2 where the right hand side is integrable because Reν < 1/2, ϕ ∈ L 1 (R3 ; dk) ∩ L 2 (R3 ; dk) and it vanishes faster than any power for |k| → +∞. Lebesgue’s dominated
convergence theorem implies that the former limit in (65) vanishes. The remaining limit has been computed in the proof of (a) in Theorem 4.4. The final result reads as follows: if (ℓ, θ, φ) are Bondi coordinates of $p \in \Im^-$ and $\eta : S^2 \to S^2$ is the inversion n → −n on the sphere,
\[
(\gamma\Gamma\varphi)(\ell, \theta, \phi) = i\,\frac{e^{-i\pi/4}}{(-\gamma)} \int_0^{+\infty} dk\, \frac{e^{-ik\ell}}{\sqrt{2\pi}}\, \sqrt{\frac{k}{2(-\gamma)}}\;\widetilde{\varphi}\!\left(\frac{k}{(-\gamma)}, \eta(\theta, \phi)\right) + \text{c.c.} \tag{66}
\]
From this point on the proof carries on up to the conclusions exactly as in the proof of (a) in Theorem 4.4, since (41) holds also in our generalised case, as (64) shows.

References

[AGM00] Aharony, O., Gubser, S.S., Maldacena, J.M., Ooguri, H., Oz, Y.: Large N field theories, string theory and gravity. Phys. Rept. 323, 183 (2000)
[Al85] Allen, B.: Vacuum states in de Sitter space. Phys. Rev. D 32, 3136 (1985)
[Ar99] Araki, H.: Mathematical Theory of Quantum Fields. Oxford: Oxford University Press, 1999
[AX78] Ashtekar, A., Xanthopoulos, B.C.: Isometries compatible with asymptotic flatness at null infinity: a complete description. J. Math. Phys. 19, 2216 (1978)
[BGP96] Bär, C., Ginoux, N., Pfäffle, F.: Wave equations on Lorentzian manifolds and quantization, ESI Lectures in Mathematics and Physics. Zürich: European Mathematical Society Publishing House, 2007
[BD82] Birrel, N.D., Davies, P.C.W.: Quantum Field Theory in Curved Space. Cambridge: Cambridge University Press, 1982
[BMG94] Bros, J., Moschella, U., Gazeau, J.P.: Quantum field theory in the de Sitter universe. Phys. Rev. Lett. 73, 1746 (1994)
[BM96] Bros, J., Moschella, U.: Two-point functions and quantum fields in de Sitter universe. Rev. Math. Phys. 8, 327 (1996)
[BD78] Bunch, T.S., Davies, P.C.W.: Quantum fields theory in de Sitter space: renormalization by point-splitting. Proc. R. Soc. Lond. A 360, 117 (1978)
[BR021] Bratteli, O., Robinson, D.W.: Operator algebras and quantum statistical mechanics. Vol. 1: C* and W* algebras, symmetry groups, decomposition of states. 2nd edition, Berlin-Heidelberg-New York: Springer-Verlag, 2002
[BR022] Bratteli, O., Robinson, D.W.: Operator algebras and quantum statistical mechanics. Vol. 2: Equilibrium states. Models in quantum statistical mechanics. 2nd edition, Berlin-Heidelberg-New York: Springer, 2002
[DMP06] Dappiaggi, C., Moretti, V., Pinamonti, N.: Rigorous steps towards holography in asymptotically flat spacetimes. Rev. Math. Phys. 18, 349 (2006)
[Da07] Dappiaggi, C.: Projecting massive scalar fields to null infinity. Ann. Henri Poinc. 9, 35 (2008)
[Di80] Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219 (1980)
[DR02] Duetsch, M., Rehren, K.H.: Generalized free fields and the AdS-CFT correspondence. Annales Henri Poincare 4, 613 (2003)
[GR95] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. Fifth Edition, London-New York: Academic Press, 1995
[Ge771] Geroch, R.: In: Esposito, P., Witten, L. eds., Asymptotic Structure of Spacetime. London: Plenum, 1977
[Ge772] Geroch, R.: Limits of spacetimes. Commun. Math. Phys. 13, 180–193 (1969)
[Haa92] Haag, R.: Local quantum physics: Fields, particles, algebras. Second Revised and Enlarged Edition, Berlin-Heidelberg-New York: Springer, 1992
[Hal04] Hall, G.S.: Symmetries and Curvature Structure in General Relativity. River Edge, NJ: World Scientific Publishing, 2004
[Ho00] Hollands, S.: Aspects of quantum field theory in curved spacetimes. Ph.D. Thesis, University of York (2000), advisor B.S. Kay, unpublished
[HW05] Hollands, S., Wald, R.M.: Conservation of the stress tensor in interacting quantum field theory in curved spacetimes. Rev. Math. Phys. 17, 227 (2005)
[Is04] Islam, J.N.: An introduction to mathematical cosmology. Cambridge: Cambridge Univ. Press, 2004
[JS02] Junker, W., Schrohe, E.: Adiabatic vacuum states on general spacetime manifolds: definition, construction, and physical properties. Annales Poincare Phys. Theor. 3, 1113 (2002)
[KW91] Kay, B.S., Wald, R.M.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon. Phys. Rept. 207, 49 (1991)
[Le53] Leray, J.: Hyperbolic Differential Equations, Unpublished. Lecture Notes, Princeton, 1953
[Li96] Linde, A.: Particle Physics and Inflationary Cosmology. London: Harwood Academic Publishers, 1996
[LR90] Lüders, C., Roberts, J.E.: Local quasiequivalence and adiabatic vacuum states. Commun. Math. Phys. 134, 29 (1990)
[Mo06] Moretti, V.: Uniqueness theorem for BMS-invariant states of scalar QFT on the null boundary of asymptotically flat spacetimes and bulk-boundary observable. Commun. Math. Phys. 268, 726 (2006)
[Mo07] Moretti, V.: Quantum out-states holographically induced by asymptotic flatness: invariance under spacetime symmetries, energy positivity and Hadamard property. Commun. Math. Phys. 279, 31 (2008)
[Mo03] Moretti, V.: Comments on the stress energy tensor operator in curved space-time. Commun. Math. Phys. 232, 189 (2003)
[Pi05] Pinamonti, N.: De Sitter quantum scalar field and horizon holography. http://arXiv.org/list/hep-th/0505179, 2005
[Ol07] Olbermann, H.: States of low energy on Robertson-Walker spacetimes. Class. Quantum. Grav. 24, 5011 (2007)
[Pa69] Parker, L.: Quantized fields and particle creation in expanding universes. 1. Phys. Rev. 183, 1057 (1969)
[Ri06] Rindler, W.: Relativity. Special, General and Cosmological. Second Edition, Oxford: Oxford University Press, 2006
[SS76] Schomblond, C., Spindel, P.: Conditions d’unicité pour le propagateur (1) (x, y) du champ scalaire dans l’univers de de Sitter. Ann. Inst. Henri Poincaré 25, 67 (1976)
[Wa84] Wald, R.M.: General Relativity. Chicago: University of Chicago Press, 1984
[Wa94] Wald, R.M.: Quantum field theory in curved space-time and black hole thermodynamics. Chicago: The University of Chicago Press, 1994
Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 1165–1182 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0459-8
Communications in
Mathematical Physics
How to Remove the Boundary in CFT – An Operator Algebraic Procedure Roberto Longo1 , Karl-Henning Rehren2 1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica 1,
I-00133 Roma, Italy. E-mail:
[email protected] 2 Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1,
D-37077 Göttingen, Germany. E-mail:
[email protected] Received: 13 December 2007 / Accepted: 18 January 2008 Published online: 18 March 2008 – © The Author(s) 2008
Dedicated to Klaus Fredenhagen on the occasion of his 60th birthday

Abstract: The relation between two-dimensional conformal quantum field theories with and without a timelike boundary is explored.

1. Introduction

In [18], the authors have formulated boundary conformal field theory (BCFT) in real time (Lorentzian signature) in the algebraic framework of quantum field theory. BCFT is a local Möbius covariant QFT B+ on the two-dimensional Minkowski halfspace M+ (given by x > 0), which contains a (given) local chiral subtheory A, e.g., the stress-energy tensor. The reward of this approach was the surprisingly simple formula (1.2) below, expressing the von Neumann algebras of local observables B+(O) in a double cone O ⊂ M+ in terms of an (in general nonlocal) chiral conformal net B of localized algebras associated with intervals along the boundary (the time axis x = 0). The net B is Möbius covariant and contains the local chiral observables A:

A(I) ⊂ B(I) (1.1)

for each interval I ⊂ R. The reduction to a single chiral net is responsible for a kinematical simplification, explaining, e.g., Cardy’s observation [3] that in BCFT, bulk n-point correlation functions are linear combinations of chiral 2n-point conformal blocks. The algebra B+(O) is a relative commutant of B(K) within B(L),

B+(O) = B(K)′ ∩ B(L), (1.2)
where K ⊂ L are a pair of open intervals on the boundary R such that the disconnected complement L \ K = I ∪ J is the set of advanced and retarded times t ± x associated with points in (t, x) ∈ O (see Fig. 1). Although the chiral net B is not necessarily local,
[Fig. 1. Intervals on the boundary and double cones in the halfspace. Along the boundary, L = (a, d), K = (b, c), I = (c, d), J = (a, b); the double cone is O = I × J.]
the intersections (1.2) do commute with each other when two double cones are spacelike separated. The main result in [18] is that every BCFT is contained in a maximal (Haag dual) BCFT of the form (1.2). This leads to a somewhat paradoxical conclusion: on the one hand, each local bulk observable is defined as a (special) observable from a chiral CFT. Thus, superficially, the “degrees of freedom” of a BCFT are not more than those of a chiral CFT, containing only a single chiral component of the stress-energy tensor (Virasoro algebra). One might argue that such a “reduction of degrees of freedom” is a characteristic feature of QFT with a boundary. But this point of view cannot be maintained, because on the other hand, it was shown in [18] that the resulting BCFT B+ is locally equivalent to another CFT B2D on the full two-dimensional (2D) Minkowski spacetime, which has all the degrees of freedom of a 2D QFT, and in particular contains a full 2D stress-energy tensor (two commuting copies of the Virasoro algebra). Even in the simplest case, when the chiral net B on the boundary coincides with A (sometimes known as “the Cardy case”), the associated bulk QFT contains apart from the full 2D stress-energy tensor more (“nonchiral”) local fields that factorize into chiral fields with braid group statistics. Locally, also the BCFT contains the same fields. This paradoxical situation is not a contradiction; it rather shows that “counting degrees of freedom” of a QFT is an elusive task. Trivially, there is no obstruction against a proper inclusion of the form B(H) ⊗ B(H) ⊂ B(H) if H is an infinite-dimensional Hilbert space. But “counting degrees of freedoms”, e.g. by entropy arguments, requires the specification of the Hamiltonian. The BCFT shares the Hamiltonian and ground state (vacuum) of the chiral CFT, while the associated 2D CFT has a different Hamiltonian and a different ground state. 
Thus, with respect to different Hamiltonians, the spacetime dimension (measured through some power law behaviour of the entropy) may assume different values (1 or 2, in the present case). Looking at the issue from a different perspective, we may start from a vacuum representation of the Virasoro algebra. The latter integrates to a unitary projective representation of the diffeomorphism group of the circle Diff (S 1 ), which contains the diffeomorphism group of an interval Diff (I ) as a subgroup. For two open intervals with disjoint closures, there is a canonical identification between Diff (I ∪ J ) and Diff (I ) × Diff (J ). In terms of the stress-energy tensor T , this amounts to an isomorphism between exp i T ( f + g) and exp i T ( f ) ⊗ exp i T (g), when f and g have disjoint support. It would be hard to see this local isomorphism directly in terms of the Virasoro algebra.
The mathematical theorem underlying these facts is the well-known Split Property [6], which can be derived in local QFT in any dimension under a suitable phase space assumption. In chiral local CFT, a sufficient assumption is the existence of the conformal character Tr exp −β L 0 . In the algebraic framework, the chiral observables of a BCFT (e.g., the stress-energy tensor) localized in a double cone O are operators belonging to the von Neumann algebra A+ (O) = A(I )∨ A(J ), where I and J are two open intervals of the time axis (“advanced and retarded times”) such that t + x ∈ I , t − x ∈ J for (t, x) ∈ O (this justifies the notation O = I × J ), and A(I ) are the von Neumann algebras generated by the unitary exponentials of chiral fields smeared within I . In contrast, the chiral observables in a 2D CFT are operators in the algebra A2D (O) = A L (I )⊗ A R (J ) where I and J are regarded as two open intervals of the lightcone axes, and A R (I ) and A L (J ) are generated by left and right chiral fields. Our present association between BCFT and 2D CFT applies to the case when A L (I ) = A R (I ) = A(I ), i.e., the left chiral observables A L (I ) ⊗ 1 are isomorphic with the right chiral observables 1 ⊗ A R (I ), and both are isomorphic with the chiral observables A(I ) of the BCFT. Let H0 denote the vacuum Hilbert space for the chiral CFT described by the algebras A(I ). The split property states that if I and J are two intervals with disjoint closures, there is a canonical unitary V : H0 → H0 ⊗ H0 implementing an isomorphism V (A(I ) ∨ A(J )) V ∗ = A(I ) ⊗ A(J ).
(1.3)
The split isomorphism does not preserve the vacuum vector, i.e., the canonical “split vector” Ξ = V ∗ (Ω ⊗Ω) is an excited state in H0 . By construction, the split state (Ξ, ·Ξ ) on A(I ) ∨ A(J ) has the property that its expectation values for either subalgebra A(I ) or A(J ) coincide with those in the vacuum state, but the correlations between observables a1 ∈ A(I ) and a2 ∈ A(J ) are suppressed: (Ξ , a1 a2 Ξ ) = (Ξ , a1 Ξ ) (Ξ , a2 Ξ ) = (Ω , a1 Ω) (Ω , a2 Ω).
(1.4)
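The factorization (1.4) follows in a few lines from (1.3). The sketch below spells this out under the assumption (suggested by the word "canonical", but not stated explicitly above) that $V$ maps $a_1 \in A(I)$ to $a_1 \otimes 1$ and $a_2 \in A(J)$ to $1 \otimes a_2$.

```latex
% Assumption: V a_1 V^* = a_1 \otimes 1 and V a_2 V^* = 1 \otimes a_2.
\begin{align*}
(\Xi,\ a_1 a_2\, \Xi)
  &= \big(V^*(\Omega\otimes\Omega),\ a_1 a_2\, V^*(\Omega\otimes\Omega)\big) \\
  &= \big(\Omega\otimes\Omega,\ V a_1 a_2 V^*\, (\Omega\otimes\Omega)\big) \\
  &= \big(\Omega\otimes\Omega,\ (a_1\otimes a_2)\,(\Omega\otimes\Omega)\big)
   = (\Omega, a_1\Omega)\,(\Omega, a_2\Omega).
\end{align*}
```

Setting $a_2 = 1$ (or $a_1 = 1$) in the same computation gives the statement that the split state restricts to the vacuum state on each subalgebra.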
The split isomorphism depends on the pair of intervals I and J . It trivially restricts to algebras associated with subintervals, but it does not, in general, extend to larger intervals. When the intervals touch or overlap, a split state and the split isomorphism cease to exist. While the split isomorphism is well known, we discuss in this paper its extension to “non-chiral” local observables, which do not belong to A(I ) ∨ A(J ) in the BCFT, and to A(I ) ⊗ A(J ) in the 2D CFT. As a concrete demonstration for the resolution of the above “paradox”, we present two simple but nontrivial models where the algebraic relations outlined can be easily translated into the field-theoretic setting, i.e., we characterize the local algebras of the various QFTs in terms of generating local Wightman fields. Let us translate (1.2) into the field-theoretic language. The intervals I and J shrink to the points t ± x when O = I × J shrinks to a point (t, x). Thus, we have to approximate a field Φ(t, x) of the BCFT by observables in A(L) (where the interval L approximates (t − x, t + x) from the outside), that commute with all fields localized in the interval K (which approximates (t − x, t + x) from the inside). This will be done in Sect. 2. A crucial point here is that generating the local algebra A(L) involves “non-pointwise” operations, e.g., typical observables may be exponentials of smeared field operators, so that an element of the relative commutant is not necessarily localized in the disconnected set L \ K = I ∪ J .
R. Longo, K.-H. Rehren
A second, somewhat puzzling feature of the algebraic treatment of BCFT is the fact that the description of the local algebras B+ (O) in terms of the chiral boundary net (Eq. (1.2)) is much simpler than that of the local algebras B2D (O) of the associated 2D conformal QFT without a boundary. The latter are (rather clumsily) defined as Jones extensions of the tensor products A(I ) ⊗ A(J ) in terms of a Q-system constructed from the chiral extension A ⊂ B with the help of α-induction [20]. One purpose of this work is to present a more direct construction of the 2D CFT without boundary from the BCFT. The obvious idea is to take a limit as the boundary is “shifted to infinity”. But we shall do more, and establish the covariant local isomorphism between the subnets O → B+ (O) and O → B2D (O) as O ⊂ O0 , i.e., the restriction of the AQFTs to any fixed double cone O0 within the halfspace x > 0, at finite distance from the boundary. The main problem here is, of course, the enhancement of the conformal symmetry, i.e., the reconstruction of the unitary positive-energy representation of the two-dimensional conformal group Möb × Möb from that of the chiral conformal group Möb. This is done by a “lift” of the chiral Möbius covariance of the local chiral net A, using the split property which allows to “embed” the 2D chiral algebra A(I ) ⊗ A(J ) into a local BCFT algebra B+ (O). This will be done in Sect. 3. The point is that only a single local algebra of the BCFT is needed for this reconstruction of the 2D conformal group and the full 2D CFT. In Sect. 4, we show that the 2D CFT can also be obtained through a limit where the boundary is “shifted to the left”, or equivalently, the BCFT observables are “shifted to the right”. The translations in the spatial direction “away from the boundary” do not belong to the chiral Möbius group of the BCFT. But they are at our disposal by the previous lifting of the 2D Möbius group into the BCFT. 
Therefore, we can study the behavior of correlation functions in the limit of “removing the boundary”. As we shift the boundary, the retarded and advanced times are shifted apart from each other. The convergence of the vacuum correlations of the BCFT to the vacuum correlations of the 2D CFT is therefore a consequence of the cluster behavior of vacuum correlations of the chiral CFT A. We add three appendices containing some related observations.

2. Models

The purpose of this section is to illustrate the construction (1.2) in a field-theoretic setting. It is convenient to assume the trivial chiral extension B = A since even in this case the construction (1.2) is nontrivial, i.e., non-chiral local BCFT fields that factorize into nonlocal chiral fields can be constructed from local chiral fields only. We exhibit local BCFT fields in a region $O = I \times J \subset M_+$ as “neutral” chiral operators, that behave like products of “charged” chiral operators localized in I and J in the limit of large distance from the boundary. The limit of pointlike localization is also discussed, and reproduces familiar vertex operators. Consider the free U(1) current $j$ with commutator $[j(x), j(y)] = 2\pi i\,\delta'(x-y)$ and charge operator $Q = (2\pi)^{-1}\int j(x)\,dx$. The unitary Weyl operators $W(f) = e^{ij(f)}$ for real test functions $f$ satisfy the Weyl relation
$$W(f)\,W(g) = e^{-i\pi\sigma(f,g)}\cdot W(f+g) = e^{-2\pi i\sigma(f,g)}\cdot W(g)\,W(f) \tag{2.1}$$
and have the vacuum expectation value
$$\omega(W(f)) = e^{-i\pi\sigma(f_-,f_+)} = e^{-\frac12\int_{\mathbb{R}_+} k\,dk\,|\hat f(k)|^2}, \tag{2.2}$$
where the symplectic form is
$$\sigma(f,g) = \frac12\int_{\mathbb{R}} dx\,\big(f(x)\,g'(x) - f'(x)\,g(x)\big) = \frac{1}{2\pi i}\int_{\mathbb{R}} k\,dk\,\hat f(-k)\,\hat g(k), \tag{2.3}$$
and $f_+$ ($f_-$) correspond to the restrictions to positive (negative) values of $k$ of the Fourier transform $\hat f(k) = \int_{\mathbb{R}} dx\,e^{ikx} f(x)$. With these conventions, $W(f)\Omega$ is a state with charge density $-f'(x)$. The vacuum correlations of Weyl operators are
$$\omega(W(f_1)\cdots W(f_n)) = \exp\Big(-i\pi\Big[\sum_i \sigma(f_i^-, f_i^+) + 2\sum_{i<j}\sigma(f_i^-, f_j^+)\Big]\Big). \tag{2.4}$$
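For readers who want to see where (2.4) comes from, the following sketch derives it from the Weyl relation (2.1) and the expectation value (2.2). It assumes that $\sigma$ is bilinear and antisymmetric and that $\sigma(f^\pm, g^\pm) = 0$, both of which can be read off from the Fourier form in (2.3). The shorthand $F = \sum_i f_i$ is introduced here only for illustration.

```latex
% Shorthand (not in the original text): F = f_1 + ... + f_n.
\begin{align*}
W(f_1)\cdots W(f_n)
  &= e^{-i\pi\sum_{i<j}\sigma(f_i,f_j)}\; W(F)
  && \text{(iterating (2.1))} \\
\omega(W(F))
  &= e^{-i\pi\,\sigma(F_-,F_+)}
   = e^{-i\pi\sum_{i,j}\sigma(f_i^-,f_j^+)}
  && \text{(by (2.2))} \\
\sigma(f_i,f_j)
  &= \sigma(f_i^-,f_j^+) + \sigma(f_i^+,f_j^-)
   = \sigma(f_i^-,f_j^+) - \sigma(f_j^-,f_i^+) \\
\sum_{i<j}\sigma(f_i,f_j) + \sum_{i,j}\sigma(f_i^-,f_j^+)
  &= \sum_i \sigma(f_i^-,f_i^+) + 2\sum_{i<j}\sigma(f_i^-,f_j^+).
\end{align*}
```

The last line is the bracket in the exponent of (2.4): the antisymmetric pieces $\sigma(f_j^-, f_i^+)$ with $i<j$ cancel between the two sums.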
The Weyl operators $W(f)$ with $\mathrm{supp}\,f \subset I$ generate the local von Neumann algebras of the chiral net $I \to A(I)$. We fix a double cone $O = I \times J \subset M_+$. Let $K \subset L$ be the open intervals such that $L \setminus \bar K = I \cup J$, as before. If $f$ is a test function that vanishes outside $L$ and is constant in $K$, then $W(f)$ belongs to $A(L)$ and commutes with $A(K)$ by (2.1) and (2.3), hence
$$W(f) \in B_+(O) = A(K)' \cap A(L). \tag{2.5}$$
These are examples of operators that belong to $B_+(O)$ but (if $f|_K \neq 0$) not to $A_+(O) = A(I) \vee A(J)$. Weyl operators can also be defined for smooth functions $f$ such that $f'$ has compact support, and the relation (2.1) holds. Then $q = f(-\infty) - f(\infty)$ is called the charge. However, $i\sigma(f_-, f_+)$ diverges, and the vacuum expectation value (2.2) vanishes unless $q = 0$ (see below). This implies that correlation functions (2.4) of charged Weyl operators vanish whenever the total charge is non-zero (charge conservation), while the IR divergences in each term in the exponent of (2.4) cancel for neutral correlations. The neutral Weyl operators (2.5) in $B_+(O)$ are (up to a phase factor) products of charged Weyl operators with charge densities localized in $J$ and in $I$. In the limit of sharp step functions $G_u(x) = q\cdot\theta(x-u)$ (requiring a regularization [4]), the regularized Weyl operators $W(G_u)$ become the well-known vertex operators of charge $-q$ and scaling dimension $\frac12 q^2$ [21], which are formally written as
$$V_{-q}(u) = {:}\exp\Big(iq\int_u^\infty j(y)\,dy\Big){:}. \tag{2.6}$$
Fig. 2. A test function $f = G - H$ such that $W(f)$ belongs to $B_+(O)$, but not to $A_+(O)$. $G$ and $H$ are smooth step functions, $\mathrm{supp}\,G' \subset J$, $\mathrm{supp}\,H' \subset I$. (Plot labels: $f(x) = G(x) - H(x)$; charge $-q$ in $J$, charge $+q$ in $I$.)

Thus, as $O$ shrinks to a point $(t,x) \in M_+$, and $I$ and $J$ shrink to the points $t+x$ and $t-x$, the (regularised) Weyl operators $W(G_{t-x} - G_{t+x})$ behave as
$$\Phi_q(t,x) = V_q(t+x)\,V_{-q}(t-x). \tag{2.7}$$
The correlation functions of vertex operators are computed from (2.4), giving
$$\langle\,\ldots\cdot V_{q_i}(u_i)\cdot\ldots\,\rangle = \lim_{\varepsilon\searrow 0}\ \prod_{i<j}\Big(\frac{-i}{u_i - u_j - i\varepsilon}\Big)^{-q_i q_j} \tag{2.8}$$
if $\sum_i q_i = 0$, and $= 0$ otherwise, from which the well-known anyonic commutation relations can be read off. It is then easily seen that $\Phi_{q_1}(t_1,x_1)$ commutes with $\Phi_{q_2}(t_2,x_2)$ when either $t_1+x_1 > t_2+x_2 > t_2-x_2 > t_1-x_1$ or when $t_2+x_2 > t_1+x_1 > t_1-x_1 > t_2-x_2$, because in these cases the anyonic phase factors cancel. It also commutes with $j(t_2 \pm x_2)$ if $t_2 \pm x_2 \neq t_1 \pm x_1$. These are precisely the requirements for locality of the fields $\Phi_q(t,x)$ among each other, and relative to the conserved current
$$j_0(t,x) = j(t+x) + j(t-x), \qquad j_1(t,x) = j(t+x) - j(t-x) \tag{2.9}$$
defined for $x > 0$, i.e., $\Phi_q$ and $j^\mu$ are local fields on the halfspace $M_+$. The correlation functions of $n$ fields $\Phi_{q_i}(t_i,x_i)$ are correlations of $2n$ vertex operators ($2n$-point conformal blocks).

After this digression to pointlike fields, let us resume the study of the correlation functions (2.4) of the smooth Weyl operators $W(f_i) \in B_+(O)$, and their behavior as $O$ is shifted away from the boundary. We choose $n$ test functions of the form
$$f_i = G_i - H_i, \tag{2.10}$$
where $G_i$, $H_i$ are smooth step functions with values $0$ at $-\infty$ and $q_i$ at $+\infty$, such that $G_i' = g_i$ is supported in $J$ and $H_i' = h_i$ is supported in $I$ (see Fig. 2). The neutral states $W(f_i)\Omega$ carry the charge $q_i$ in $I$ and the charge $-q_i$ in $J$. The neutrality condition for each Weyl operator $W(f_i)$ can be written
$$\int_{\mathbb{R}} dx\,g_i(x) - \int_{\mathbb{R}} dx\,h_i(x) = 0 \quad\Leftrightarrow\quad \hat g_i(0) - \hat h_i(0) = 0. \tag{2.11}$$
The exponent in (2.4) is a linear combination of terms of the form (using $\hat f_i = i(\hat g_i - \hat h_i)/k$)
$$2\pi i\,\sigma(f_i^-, f_j^+) = \int dx\,\big(g_i(x) - h_i(x)\big)\int dy\,\big(g_j(y) - h_j(y)\big)\int_{\mathbb{R}_+}\frac{dk}{k}\,e^{-ik(x-y)}, \tag{2.12}$$
which are IR finite because of (2.11). The separate contributions from $g_i$ and $h_i$, however, are IR divergent. Therefore, we first regularize at $k = 0$ by the subtraction $e^{-ik(x-y)} \to e^{-ik(x-y)} - e^{-k/\mu}$ ($\mu > 0$ arbitrary), which does not change the result because of (2.11), and then compute the contributions from $g$ and $h$ separately. We are interested in the behavior of the correlation function (2.4) as $O$ is shifted away from the boundary. This means that the functions $g_i$ are shifted by a distance $a$ to the left, and $h_i$ are shifted by the same distance to the right. The g-g contributions and the h-h contributions to $\sigma(f_i^-, f_j^+)$ are obviously invariant under this shift, while in the mixed h-g contributions $x - y$ is replaced by $x - y + 2a$:
$$2\pi i\,\sigma_{h_i,g_j}(a) := -\int_I dx\,h_i(x)\int_J dy\,g_j(y)\int_{\mathbb{R}_+}\frac{dk}{k}\Big(e^{-ik(x-y+2a)} - e^{-k/\mu}\Big), \tag{2.13}$$
and similarly for the g-h contributions. The last integrand can be split into two parts:
$$\big(e^{-ik(x-y+2a)} - 1\big)\,e^{-k/\mu} + e^{-ik(x-y+2a)}\,\big(1 - e^{-k/\mu}\big) \tag{2.14}$$
so that the first contribution to the momentum integral equals
$$-\log\big(1 + i\mu(x-y+2a)\big) \tag{2.15}$$
while the second (distributional) contribution is of order $O(a^{-1})$ in the limit of large $a$. Because the remaining integrals have compact support, we obtain, as $a \to \infty$,
$$2\pi i\,\sigma_{h_i,g_j}(a) = q_i q_j\cdot\log(2ia\mu) + O(a^{-1}). \tag{2.16}$$
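The closed form (2.15) is an instance of the Frullani integral $\int_0^\infty \frac{dk}{k}\,(e^{-\alpha k} - e^{-\beta k}) = \log(\beta/\alpha)$ for $\mathrm{Re}\,\alpha, \mathrm{Re}\,\beta > 0$. A short derivation, with the abbreviation $s = x - y + 2a$ introduced here only for readability, may help in checking the sign and the large-$a$ behavior:

```latex
% Abbreviation (not in the original text): s = x - y + 2a.
\begin{align*}
\int_{\mathbb{R}_+}\frac{dk}{k}\,\big(e^{-iks}-1\big)\,e^{-k/\mu}
  &= \int_{\mathbb{R}_+}\frac{dk}{k}\,\Big(e^{-k(1/\mu+is)} - e^{-k/\mu}\Big) \\
  &= \log\frac{1/\mu}{1/\mu+is}
   = -\log(1+i\mu s), \\
-\log(1+i\mu s)
  &= -\log(2ia\mu) - \log\Big(1+\frac{x-y}{2a}+\frac{1}{2ia\mu}\Big)
   = -\log(2ia\mu) + O(a^{-1}).
\end{align*}
```

Integrating this against $-h_i(x)\,g_j(y)$, with $\int h_i = q_i$ and $\int g_j = q_j$ and with $x$, $y$ ranging over compact sets, reproduces the leading term $q_i q_j \log(2ia\mu)$ in (2.16).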
Together with the g-h contributions $q_i q_j\cdot\log(-2ia\mu)$, these terms in the exponent of (2.4) accumulate to the factor
$$\prod_i (2a\mu)^{-q_i^2}\ \prod_{i<j}(2a\mu)^{-2q_i q_j} = (2a\mu)^{-q^2}, \tag{2.17}$$
where $q = \sum_i q_i$ is the total charge within $I$. Thus (2.4) vanishes in the limit $a \to \infty$ if $q \neq 0$, enforcing “chiral charge conservation” in the limit. If $q = 0$, the mixed contributions give 1, and the remaining g-g and h-h contributions yield
$$\lim_{a\to\infty}\omega\big(W(f_1)\cdots W(f_n)\big) = \omega\big(W(G_1)\cdots W(G_n)\big)\cdot\omega\big(W(-H_1)\cdots W(-H_n)\big) \tag{2.18}$$
involving charged Weyl operators. These expressions are well-defined (and independent of $\mu$) because $\sum_i G_i$ and $\sum_i H_i$ are neutral precisely due to $q = 0$. The factorization of the vacuum correlations in the limit $a \to \infty$ is the desired feature we wanted to illustrate by this example. In the limit, $W(f_i)$ have the same correlations as $W(-H_i) \otimes W(G_i)$, which are charged observables of the associated 2D CFT. Notice that in the limit of sharp test functions (see above), one obtains
$$V_q(t+x) \otimes V_{-q}(t-x), \tag{2.19}$$
which are local fields in the entire two-dimensional Minkowski spacetime $M^2$.

Remark. The above construction can be generalized to the SU(2) current algebra. The Frenkel-Kac representation of SU(2) currents at level 1 is given by $j^3 \equiv j$ and $j^\pm(x) = j^1(x) \pm i j^2(x) = V_{\pm\sqrt2}(x)$. Then $V_q(x)\cdot V_{-q}(y)$ commutes with $V_{q'}(w)$ at $w \neq x, y$ provided $qq' \in \mathbb{Z}$. Hence the field
$$\Phi_{\frac12\sqrt2}(t,x) = V_{\frac12\sqrt2}(t+x)\cdot V_{-\frac12\sqrt2}(t-x) \tag{2.20}$$
is local (as before) and relatively local w.r.t. the conserved currents $j^a$ ($a = 1, 2, 3$)
$$j_0^a(t,x) = j^a(t+x) + j^a(t-x), \qquad j_1^a(t,x) = j^a(t+x) - j^a(t-x). \tag{2.21}$$
$\Phi_{\frac12\sqrt2}(t,x)$ is a neutral combination of charged primary fields of dimension $\frac14$, transforming in the spin-$\frac12$ representation of SU(2), localized at $t+x$ and $t-x$. The description of this model in terms of smooth Weyl operators is rather straightforward, see e.g., [2]: Weyl operators with integer multiples of the charge $\sqrt2$ belong to $A(I)$, while operators with half-integer multiples of the charge $\sqrt2$ in $I$ and in $J$ belong to $A(K)' \cap A(L)$.
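As a quick consistency check of the numbers in this Remark, one can combine the scaling dimension $\frac12 q^2$ quoted for vertex operators earlier in this section with the locality condition $qq' \in \mathbb{Z}$. The following lines are only a worked instance of those two formulas for $q = \frac12\sqrt2$ and $q' = \pm\sqrt2$:

```latex
\begin{align*}
q = \tfrac12\sqrt2:&\quad \tfrac12 q^2 = \tfrac12\cdot\tfrac12 = \tfrac14
  && \text{(dimension of } V_{\pm\frac12\sqrt2}\text{)} \\
q' = \pm\sqrt2:&\quad \tfrac12 q'^2 = \tfrac12\cdot 2 = 1
  && \text{(dimension of the currents } j^\pm\text{)} \\
&\quad qq' = \tfrac12\sqrt2\cdot(\pm\sqrt2) = \pm 1 \in \mathbb{Z}
  && \text{(relative locality w.r.t. } j^\pm\text{)}
\end{align*}
```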
The mechanism of “charge separation” described here for obtaining elements of $B_+(O)$ that do not belong to $A_+(O)$ is very general [18], although in general it cannot be formulated in terms of Weyl operators. In Sect. 4 we shall show that also the factorization behavior far away from the boundary is a general feature, which allows one to recover the 2D CFT from the BCFT.

3. Reconstruction of the 2D Symmetry

We work in this section with a fixed “chiral extension” $A \subset B$. Here, $A$ is a Haag dual Möbius covariant local net $\mathbb{R} \supset I \to A(I)$ of von Neumann algebras on its vacuum Hilbert space $H_0$, satisfying the split property and having finitely many irreducible DHR sectors of finite dimension (these properties together are called “complete rationality” [14]; in the case of diffeomorphism covariant nets, Haag duality = strong additivity is a consequence of the other properties [19]. The fact that the U(1) Weyl algebra in Sect. 2 is not completely rational indicates that the results to be reported in this section hold also in more general situations). $B$ is a Möbius covariant net $\mathbb{R} \supset I \to B(I)$ on its vacuum Hilbert space $H_0^B$ such that for each $I$ the inclusion $A(I) \subset B(I)$ holds and is an irreducible subfactor, which has automatically finite Jones index [13] equal to the statistical dimension of the (reducible) representation of $A$ on $H_0^B$ [17]. The net $B$ may be non-local, but is required to be relatively local w.r.t. $A$. If only $A$ is specified, the irreducible chiral extensions $B$ of $A$ can be classified in terms of Q-systems of $A$ [17]. The complete classification has been computed for $A$ the Virasoro nets with central charge $c < 1$ (and implicitly also for the SU(2) current algebras) in [15]. With $A \subset B$ one can associate a boundary CFT $B_+$ on the halfspace $M_+$ and a two-dimensional CFT $B_{2D}$ on Minkowski spacetime $M^2$. To describe the former, we introduce a convenient notation (see Fig. 1).
For any quadruple of real numbers $a < b < c < d$ we define $I = (c,d)$, $J = (a,b)$, $K = (b,c)$, $L = (a,d)$, and $O = \{(t,x) : t+x \in I,\ t-x \in J\} \subset M_+$. Every double cone $O \subset M_+$ is of this form and determines $I, J, K, L$, and similarly every pair of open intervals $J < I$ (“I is to the right = future of J”) determines $K$, $L$, and $O = I \times J$. Then the BCFT associated with $A \subset B$ is the net (1.2), i.e., $O \to B_+(O) = B(K)' \cap B(L)$. We have shown in [18] that $B_+(O)$ contains $A_+(O) = A(I) \vee A(J)$ as a subfactor with finite index, $B_+$ is local and Haag dual on $M_+$, every Haag dual BCFT with chiral observables $A$ arises in this way (namely the chiral extension $B$ can be recovered from the BCFT), and every non-Haag-dual local BCFT net is intermediate between $A_+$ and $B_+$. If $B = A$, $B_+(O)$ equals the four-interval subfactor $A(E) \subset A(E')'$ on the circle [14] ($E = I \cup J$). The 2D CFT $B_{2D}$ associated with $A \subset B$ has been constructed in [20]. Its local algebras are extensions (with finite Jones index) of the tensor products $A(I) \otimes A(J)$, specified in terms of a Q-system constructed from the chiral extension $A \subset B$ with the help of α-induction. We know from [18] that $B_+$ and $B_{2D}$ are locally isomorphic, i.e., for each $O \subset M_+$ there is an isomorphism $\varphi_O : B_+(O) \to B_{2D}(O)$ such that
$$\varphi_O(B_+(O_1)) = B_{2D}(O_1) \quad \text{for all } O_1 \subset O. \tag{3.1}$$
However, the Hilbert space and the vacuum state for the two theories are very different.
In this section, we wish to understand the relation between these two nets, by giving an alternative construction of the 2D CFT directly from the BCFT. The crucial point is the construction of the enhanced Möbius symmetry of the 2D CFT, and its ground state (the 2D vacuum) which is different from the BCFT vacuum.

We first construct the Hilbert space $H_{2D}$ for the 2D CFT. We choose a fixed reference double cone $O_0 = I_0 \times J_0 \subset M_+$. The subfactor $A_+(O_0) = A(I_0) \vee A(J_0) \subset B_+(O_0) = B(K_0)' \cap B(L_0)$ is irreducible with finite index [18], and hence has a unique conditional expectation $\mu : B_+(O_0) \to A_+(O_0)$, which is automatically normal and faithful. Let $\Xi \in H_0$ be the canonical split vector for $A(I_0) \vee A(J_0)$ as in (1.4). The split state $\xi = (\Xi, \cdot\,\Xi)$ on $A_+(O_0)$ extends to the state $\hat\xi = \xi \circ \mu$ on $B_+(O_0)$. Let $\hat H$, $\hat\Xi$ and $\hat\pi$ denote the GNS Hilbert space, GNS vector and GNS representation for $(B_+(O_0), \hat\xi)$. We also write $|b\rangle$ for $\hat\pi(b)\hat\Xi$.

Let us analyze the structure of $\hat H$. The structure of $B_+(O_0)$ has been described in [18]. By complete rationality, $A$ has finitely many irreducible superselection sectors [14]. Choose for each irreducible sector of $A$ a representative DHR endomorphism [7] $\sigma$ localized in $I_0$, and a representative $\tau$ localized in $J_0$. (For the vacuum sector, $\sigma = \tau = \mathrm{id}$. $\bar\sigma$ and $\bar\tau$ are the representatives of the conjugate sector.) Then the elements of $B_+(O_0)$ are (weak limits of) sums of operators of the form $\iota(a_1 a_2)\cdot\psi$ where $\iota$ is the injection $A \to B$, $a_1 \in A(I_0)$, $a_2 \in A(J_0)$, and $\psi \in B(L_0)$ generalize the Weyl operators $W(f)$ (2.10) of Sect. 2: they are (for each pair $\sigma, \tau$) “charged” intertwiners in $\mathrm{Hom}(\iota, \iota\sigma\bar\tau) \cap B(K_0)'$. We may express these intersections in a different way: Let $\alpha_\rho^\pm$ denote the endomorphisms of $B$ extending the DHR endomorphisms $\rho$ of $A$ by “α-induction” [17], where $\alpha_\rho^+$ ($\alpha_\rho^-$) acts trivially on $b \in B$ localized to the right = future (left = past) of the interval where $\rho$ is localized. Thus $\alpha_\sigma^-\alpha_{\bar\tau}^+$ acts trivially on $B(K_0)$, because $J_0 < K_0 < I_0$. Hence
$$\mathrm{Hom}(\iota, \iota\sigma\bar\tau) \cap B(K_0)' = \mathrm{Hom}\big(\mathrm{id}_B, \alpha_\sigma^-\alpha_{\bar\tau}^+\big). \tag{3.2}$$
(For an alternative characterization of the charged intertwiners by means of an eigenvalue condition, see App. B.) If $O_1 \subset M_+$ is another double cone in the halfspace, the algebra $B_+(O_1)$ is generated by $A(I_1) \vee A(J_1)$ and charged intertwiners
$$\psi_1 = \iota(u \times \bar u)\cdot\psi \in \mathrm{Hom}\big(\mathrm{id}_B, \alpha_{\sigma_1}^-\alpha_{\bar\tau_1}^+\big) \tag{3.3}$$
with unitary charge transporters $u \in \mathrm{Hom}(\sigma, \sigma_1)$ and $\bar u \in \mathrm{Hom}(\bar\tau, \bar\tau_1)$, where $\sigma_1$ is localized in $I_1$ and $\bar\tau_1$ is localized in $J_1$. E.g., if $B = A$ (the “Cardy case”), the charged intertwiners (generalizing the Weyl operators $W(f)$ in (2.10) of Sect. 2) are of the form $\psi \in \mathrm{Hom}(\mathrm{id}, \sigma\bar\tau)$. This implies that $\tau$ and $\sigma$ are representatives of the same sector. Thus, the charges of BCFT fields are in 1:1 correspondence with the DHR sectors of $A$. In the general case, when $\psi$ and $\psi'$ are two charged intertwiners, $\mu(\psi\psi'^*)$ is an intertwiner $\in \mathrm{Hom}(\sigma'\bar\tau', \sigma\bar\tau) \cap (A(I_0) \vee A(J_0))$. This space is zero unless $\sigma = \sigma'$ and $\tau = \tau'$, and $\mathrm{Hom}(\sigma\bar\tau, \sigma\bar\tau) \cap (A(I_0) \vee A(J_0)) = \mathbb{C}\cdot 1$ [16]. Therefore, we may choose (for each pair $\sigma, \tau$) a basis of charged intertwiners $\psi$ which is orthonormal w.r.t. the inner product $\mu(\psi\psi'^*)$.

Lemma 1. The subspaces $\hat H_\psi$ of $\hat H$ spanned by $|\psi^*\cdot\iota(A(I_0) \vee A(J_0))\rangle$ are mutually orthogonal. Each subspace $\hat H_\psi$ factorizes as a representation of $A_+(O_0)$ according to
$$\hat H_\psi \cong H_\sigma \otimes H_{\bar\tau}, \tag{3.4}$$
where $H_\sigma$ and $H_{\bar\tau}$ carry the representations $\sigma$ and $\bar\tau$ of $A(I_0)$ and $A(J_0)$, respectively.
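Proof. The computation of matrix elements in a dense set of vectors
$$\begin{aligned}
\langle\psi^*\cdot\iota(a_1' a_2')|\ \hat\pi(\iota(a_1 a_2))\ |\psi^*\cdot\iota(a_1'' a_2'')\rangle
 &= \big(\Xi\,,\ {a_1'}^{*}{a_2'}^{*}\,\mu\big(\psi\,\iota(a_1 a_2)\,\psi^*\big)\,a_1'' a_2''\,\Xi\big) \\
 &= \big(\Xi\,,\ {a_1'}^{*}{a_2'}^{*}\,\sigma\bar\tau(a_1 a_2)\,a_1'' a_2''\,\Xi\big) \\
 &= \big(a_1'\Omega\,,\ \sigma(a_1)\,a_1''\Omega\big)\cdot\big(a_2'\Omega\,,\ \bar\tau(a_2)\,a_2''\Omega\big)
\end{aligned} \tag{3.5}$$
proves the claim.

We may therefore identify the vectors $|\psi^*\iota(a_1 a_2)\rangle$ with $a_1\Omega \otimes a_2\Omega \in H_\sigma \otimes H_{\bar\tau}$ in the representation $\sigma \otimes \bar\tau$ under the split isomorphism, such that in particular, the GNS vector $\hat\Xi = |1\rangle \in \hat H$ corresponds to the 2D vacuum vector $\Omega \otimes \Omega \in H_0 \otimes H_0$. We write the extended Hilbert space $\hat H$ in the form
$$\hat H \equiv H_{2D} \cong \bigoplus_{\sigma,\tau} Z_{\sigma,\tau}\, H_\sigma \otimes H_{\bar\tau} \tag{3.6}$$
(the “2D Hilbert space”). The nonnegative integer multiplicities are
$$Z_{\sigma,\tau} = \dim\mathrm{Hom}(\alpha_\tau^+, \alpha_\sigma^-) \tag{3.7}$$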
by the above characterization (3.2) of the spaces of charged intertwiners. The chiral factorization (3.6) of the GNS construction from the extended state $\xi \circ \mu$ may be viewed as the remnant of the original “splitting behavior” of the split vector $\Xi$. As shown in [18] by comparison of the Q-system, the local subfactor $\hat\pi(A_+(O_0)) \subset \hat\pi(B_+(O_0))$ on $\hat H$ is isomorphic to $A(I_0) \otimes A(J_0) \subset B_{2D}(O_0)$ constructed in [20]. We may therefore consistently denote also the former by $A_{2D}(O_0) \subset B_{2D}(O_0)$.

Next, we construct the action of the 2D Möbius group on $H_{2D}$, by a “lift” of the Möbius transformations of the chiral net $A$, using the split isomorphism and the conditional expectation $\mu$. The action of Möb × Möb on $H_{2D}$ will then be used to define $B_{2D}(O)$ as the images of the reference algebra $B_{2D}(O_0)$ under a 2D Möbius transformation $g = (g_1, g_2)$ taking $O_0$ to $O$. The 2D Möbius group Möb × Möb is unitarily represented in the vacuum Hilbert space $H_0$ of the chiral net $A$ by $U_+U_-$, the preimage of $U_0 \otimes U_0$ on $H_0 \otimes H_0$ under the split isomorphism. (See App. A for how $U_+$ and $U_-$ can be obtained by modular theory directly on the boundary Hilbert space.) We need to lift $U_+U_-$ to $H_{2D}$. Let $\Sigma_I \subset$ Möb denote the connected semigroup taking the interval $I$ into itself, generated by the one-parameter subgroup preserving $I$ and two one-parameter semigroups fixing either of its endpoints. Then $\Sigma = \Sigma_{I_0} \times \Sigma_{J_0} \subset$ Möb × Möb is the connected semigroup taking the reference double cone $O_0$ into itself. For $g = (g_1, g_2) \in \Sigma$, the adjoint action of $U_+(g_1)U_-(g_2)$ on $a_1 \in A(I_0)$, $a_2 \in A(J_0)$ is given by the independent (= product) action of the chiral Möbius transformations given by geometric automorphisms $\alpha_g$ of the chiral net $A$:
$$\alpha_{g_1}^+\alpha_{g_2}^-(a_1\cdot a_2) = \alpha_{g_1}(a_1)\cdot\alpha_{g_2}(a_2). \tag{3.8}$$
We extend these endomorphisms of $A_+(O_0)$ to endomorphisms of $B_+(O_0)$ by
$$\beta_{g_1}^+\beta_{g_2}^-\big(\iota(a_1 a_2)\cdot\psi\big) := \iota\big(\alpha_{g_1}(a_1)\,\alpha_{g_2}(a_2)\big)\cdot\iota\big(z_\sigma(g_1)\,z_{\bar\tau}(g_2)\big)\cdot\psi. \tag{3.9}$$
Here $z_\rho(g) \in \mathrm{Hom}(\rho, \alpha_g\rho\alpha_g^{-1})$ are the unitary cocycles $z_\rho(g) = U_0(g)U_\rho(g)^* \in A$ [10,16], where $U_0$ and $U_\rho$ are the representations of the Möbius group in the vacuum representation and in the DHR representation $\rho$.
Proposition 1.
(i) The maps $\beta_{g_1}^+\beta_{g_2}^-$ defined by (3.9) for $g \in \Sigma$ are homomorphisms from $B_+(O_0)$ onto $B_+(gO_0) \subset B_+(O_0)$.
(ii) For $O_1 \subset O_0$ we have $\beta_{g_1}^+\beta_{g_2}^-(B_+(O_1)) = B_+(gO_1)$, i.e., $\beta_{g_1}^+\beta_{g_2}^-$ “act geometrically inside $B_+(O_0)$”.
(iii) $\beta_{g_1}^+\beta_{g_2}^-$ respect the group composition law within the semigroup $\Sigma$.
(iv) The conditional expectation $\mu$ intertwines $\beta_{g_1}^+\beta_{g_2}^-$ with $\alpha_{g_1}^+\alpha_{g_2}^-$.

Proof.
(i) The homomorphism property follows from the composition and conjugation laws of charged intertwiners [18] and the intertwining and localization properties of the operators and endomorphisms involved. The statement about the range is just a special case of (ii).
(ii) It is sufficient to show that a charged intertwiner $\psi_1 \in B_+(O_1)$ is mapped to a charged intertwiner in $B_+(gO_1)$. By virtue of (3.3), we compute
$$\beta_{g_1}^+\beta_{g_2}^-(\psi_1) = \iota\big(\alpha_{g_1}(u)\,z_\sigma(g_1)\,\alpha_{g_2}(\bar u)\,z_{\bar\tau}(g_2)\big)\cdot\psi. \tag{3.10}$$
Then the claim follows, because $\alpha_{g_1}(u)\,z_\sigma(g_1) \in \mathrm{Hom}(\sigma, \alpha_{g_1}\sigma_1\alpha_{g_1}^{-1})$, and $\alpha_{g_1}\sigma_1\alpha_{g_1}^{-1}$ is localized in $g_1 I_1$, and similarly $\alpha_{g_2}\bar\tau_1\alpha_{g_2}^{-1}$ is localized in $g_2 J_1$.
(iii) The group composition law follows from the cocycle properties [10,16] of $z_\rho$.
(iv) The intertwining property of $\mu$ is due to the fact that $\mu$ annihilates all charged intertwiners except the neutral one ($\sigma = \bar\tau = \mathrm{id}$).

Next, we adapt a well-known lemma about the implementation of (groups of) automorphisms to the case of (semigroups of) endomorphisms.

Lemma 2. Let $M$ be a von Neumann algebra on a Hilbert space $H$ with a cyclic and separating vector $\Psi$. Let $\beta$ be an endomorphism of $M$, preserving the state $(\Psi, \cdot\,\Psi)$. Then the closure of the map $m\Psi \mapsto \beta(m)\Psi$ is an isometry $U_\beta$. If $\Psi$ is cyclic also for $\beta(M)$, then $U_\beta$ is unitary. For two endomorphisms $\beta$, $\beta'$ with the same properties, such that $\Psi$ is cyclic for $\beta(M)$, one has $U_{\beta\beta'} = U_\beta U_{\beta'}$.

Proof. That $U_\beta$ is an isometry is an obvious consequence of the invariance of the state. Since $\beta(M)\Psi$ is a dense subset, $U_\beta$ is surjective, hence unitary. For the last statement it is sufficient to notice that $U_\beta$ is densely defined on $\beta(M)\Psi$.

We apply the lemma to the endomorphisms $\beta_{g_1}^+\beta_{g_2}^-$ of $B_+(O_0)$. Using (iv) of Prop. 1, we see that $\beta_{g_1}^+\beta_{g_2}^-$ leave the GNS state $(\hat\Xi, \cdot\,\hat\Xi)$ invariant because the split state $(\Xi, \cdot\,\Xi)$ on $A_+(O_0)$ is invariant under $\alpha_{g_1}^+\alpha_{g_2}^-$. The vector $\hat\Xi$ is cyclic and separating for each $\hat\pi(B_+(O_1))$ ($O_1 \subset O_0$) because $\mu$ is faithful and $\Xi$ is cyclic and separating for each $A_+(O_1)$, which in turn follows by the split isomorphism because $\Omega$ is cyclic and separating for $A(I_1)$ and for $A(J_1)$. Thus, Lemma 2 applies:

Corollary 1. The homomorphisms $\beta_{g_1}^+\beta_{g_2}^-$ induce unitary operators on $\hat H = H_{2D}$, which satisfy the group composition law within the semigroup $\Sigma$. Together with the inverse unitary operators, they generate a covering representation $\hat U(g_1, g_2) = \hat U_+(g_1)\hat U_-(g_2)$ of Möb × Möb on $H_{2D}$.

The last statement is due to the fact that $\Sigma$ and its inverse generate Möb × Möb, and the group law within $\Sigma$ secures the commutation relations of the Lie algebra.
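The isometry claim and the composition law in Lemma 2 rest on two short computations, spelled out here as a sketch. They use only that $\beta$ is a *-endomorphism of $M$ (so that $\beta(m)^*\beta(m) = \beta(m^*m)$) and that the state $(\Psi, \cdot\,\Psi)$ is invariant under $\beta$. The symbol $\beta'$ stands for the second endomorphism in the lemma.

```latex
\begin{align*}
\|\beta(m)\Psi\|^2
  &= \big(\Psi,\ \beta(m)^*\beta(m)\,\Psi\big)
   = \big(\Psi,\ \beta(m^*m)\,\Psi\big) \\
  &= \big(\Psi,\ m^*m\,\Psi\big)
   = \|m\Psi\|^2 , \qquad m\in M, \\
U_\beta U_{\beta'}\,m\Psi
  &= U_\beta\,\beta'(m)\Psi
   = \beta(\beta'(m))\Psi
   = U_{\beta\beta'}\,m\Psi .
\end{align*}
```

The first identity also shows that $m\Psi = 0$ implies $\beta(m)\Psi = 0$, so the map $m\Psi \mapsto \beta(m)\Psi$ is well defined on the dense subspace $M\Psi$ before taking the closure.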
By construction, for $g = (g_1, g_2) \in \Sigma$, $\hat U(g_1, g_2)$ on the subspace $\hat H_\psi$ is equivalent to $U_\sigma(g_1) \otimes U_{\bar\tau}(g_2)$ on $H_\sigma \otimes H_{\bar\tau}$ under the isomorphism (3.4). By (ii) of Prop. 1, the adjoint action of $\hat U(g_1, g_2)$ takes $B_+(O_1)$ to $B_+(gO_1)$ for $O_1 \subset O_0$. By constructing $\hat U_+\hat U_-$, we have thus furnished the local subnet $O_0 \supset O_1 \to B_+(O_1)$ of the BCFT with a covariant “two-dimensional re-interpretation”. In the representation $\hat\pi$ on $\hat H = H_{2D}$, this is precisely the local isomorphism $\varphi_{O_0}$ referred to in (3.1). The present discussion shows that $\varphi_{O_0}$ intertwines the global 2D Möbius covariance with a “hidden” symmetry of the BCFT, which is induced by the extended split state $\hat\xi$ and acts locally geometrically.

We now define for arbitrary double cones $O \subset M^2$ the associated local algebras of the 2D conformal net on $H_{2D}$ by varying $g = (g_1, g_2) \in$ Möb × Möb in the connected neighborhood of unity for which $gO_0 \subset M^2$, and putting
$$B_{2D}(O) := \hat U(g_1, g_2)\,B_{2D}(O_0)\,\hat U(g_1, g_2)^* \quad \text{if } O = gO_0 \subset M^2. \tag{3.11}$$
For $O \subset O_0$, this coincides with $\hat\pi\big(\beta_{g_1}^+\beta_{g_2}^-(B_+(O_0))\big) = \hat\pi(B_+(O))$ by virtue of (ii) of Prop. 1. Notice that $B_{2D}(gO_0)$ is uniquely defined as long as $O = gO_0 \subset M^2$ because in this case any two $g$ with the same image $gO_0$ differ by an element of $\Sigma$, while it requires the passage to a covering space when $M^2$ is conformally completed.

Theorem 1. The net of von Neumann algebras $O \to B_{2D}(O)$ defined by (3.11) is covariant, isotonous, and local.

Proof. The covariance is by construction. Isotony and locality of the 2D net follow from the geometric action inside $O_0$, (ii) of Prop. 1, and the fact that every pair of double cones in $M^2$ such that either $O_1 \subset O_2$ or $O_1 \subset O_2'$ can be moved inside $O_0$ by a Möbius transformation, where we know (from the boundary CFT) that isotony and locality hold.

Corollary 2. The extension $A_{2D} \subset B_{2D}$ is isomorphic to the extension constructed in [20].

Proof. Since the local subfactor $A_{2D}(O_0) \subset B_{2D}(O_0)$ constructed in [20] is isomorphic to $A_+(O_0) \subset B_+(O_0)$, and the isomorphism intertwines the representations of the 2D Möbius group, the global isomorphism follows.

We have associated with the BCFT a 2D local CFT that is locally isomorphic. The association is intrinsic in the sense that it requires only the subnet $O_0 \supset O_1 \to B_+(O_1)$ together with the covariance of the DHR sectors of the underlying chiral CFT $A$. It should be noticed that the construction is, up to unitary equivalence, independent of the choice of the reference double cone $O_0 \subset M_+$. The reason is essentially that the charge structure of $B(K)' \cap B(L)$ exhibited by the multiplicities $Z_{\sigma,\tau}$ in (3.6) is independent of the pair $K \subset L$. We conclude this section with an observation concerning diffeomorphism covariance:

Proposition 2. If $A \subset B$ is a chiral extension of a diffeomorphism covariant chiral net $A$, then the (possibly non-local) chiral net $B$, the BCFT net $B_+$ defined by (1.2), and the 2D net $B_{2D}$ associated with $B_+$ by Thm. 1 are also diffeomorphism covariant.
Proof. The chiral net $A$ is diffeomorphism covariant if for a diffeomorphism $\gamma$ of $S^1$ there is a unitary operator $w_\gamma$ on $H_0$ such that $w_\gamma A(I) w_\gamma^* = A(\gamma I)$. Haag duality of $A$ implies that if $\gamma$ is localized in an interval $I$ (i.e., acts trivially on the complement), then $w_\gamma$ is an observable in $A(I)$. For a chiral extension $A \subset B$ we claim that if $\gamma$ is localized in $I_0$, then for $I_1 \subset I_0$ one has $\iota(w_\gamma)B(I_1)\iota(w_\gamma^*) = B(\gamma I_1)$, i.e., $\iota(w_\gamma)$ implement the local diffeomorphisms. Namely, $B(I_1)$ is generated by $\iota(A(I_1))$ and $v_1 = \iota(u)\cdot v$, where $v \in B(I_0)$ is the canonical charged intertwiner $v \in \mathrm{Hom}(\iota, \iota\theta)$ for the canonical DHR endomorphism $\theta$ localized in $I_0$ [17] (see also App. B), and $\theta_1$ is an equivalent DHR endomorphism localized in $I_1$. We find
$$\iota(w_\gamma)\,v_1\,\iota(w_\gamma^*) = \iota\big(w_\gamma\,u\,\theta(w_\gamma^*)\big)\cdot v. \tag{3.12}$$
Now, $w_\gamma u\theta(w_\gamma^*) \in \mathrm{Hom}(\theta, \gamma\theta_1\gamma^{-1})$, and $\gamma\theta_1\gamma^{-1}$ is localized in $\gamma I_1$. This proves the claim. The diffeomorphism covariance of the chiral net $B$ follows because the diffeomorphisms localized in $I_0$ together with the Möbius group generate the diffeomorphism group of $S^1$. The arguments for the boundary CFT and for the 2D CFT are very similar: we first show that for diffeomorphisms $\gamma = \gamma_1\gamma_2$ where $\gamma_1$ is localized in $I_0$ and $\gamma_2$ localized in $J_0$, the adjoint action with $\iota(w_{\gamma_1}w_{\gamma_2})$ takes $B_+(O_1)$ to $B_+(\gamma O_1)$ if $O_1 \subset O_0$. Again, it is sufficient to verify the action on the charged intertwiners (3.3) of $B_+(O_1)$:
$$\iota(w_{\gamma_1}w_{\gamma_2})\cdot\psi_1\cdot\iota(w_{\gamma_1}w_{\gamma_2})^* = \iota\big((w_{\gamma_1}u\,\sigma(w_{\gamma_1}^*))\,(w_{\gamma_2}\bar u\,\bar\tau(w_{\gamma_2}^*))\big)\cdot\psi, \tag{3.13}$$
where $w_{\gamma_1}u\,\sigma(w_{\gamma_1}^*) \in \mathrm{Hom}(\sigma, \gamma_1\sigma_1\gamma_1^{-1})$ and $w_{\gamma_2}\bar u\,\bar\tau(w_{\gamma_2}^*) \in \mathrm{Hom}(\bar\tau, \gamma_2\bar\tau_1\gamma_2^{-1})$, and $\gamma_1\sigma_1\gamma_1^{-1}$ is localized in $\gamma_1 I_1$ and $\gamma_2\bar\tau_1\gamma_2^{-1}$ is localized in $\gamma_2 J_1$. Hence (3.13) is a charged intertwiner of $B_+(\gamma O_1)$. This proves the claim. Then the diffeomorphism covariance of $B_+$ and $B_{2D}$ follows because the diffeomorphisms localized in $O_0$ together with the Möbius group generate all diffeomorphisms.

4. Cluster Limit

Let $b_1, \ldots, b_n \in B_+(O)$ be BCFT observables localized within any fixed double cone $O = I \times J \subset M_+$. We wish to consider the behavior of a vacuum correlation
$$\big(\Omega\,,\ \beta_x(b_1\cdots b_n)\,\Omega\big), \tag{4.1}$$
where $\beta_x = \beta_x^+\beta_{-x}^-$ is the one-parameter semigroup of “right shifts” ($x > 0$, away from the boundary), that take $I$ to $I + x$ and $J$ to $J - x$, represented as homomorphisms from $B_+(O)$ to $B_+((I + x) \times (J - x))$, see (3.9). In Sect. 3 (with $O$ as the fixed reference double cone) we have given the re-interpretation of $b_i$ in the GNS representation $\hat\pi$ of the state $\xi \circ \mu$ as observables of the associated 2D CFT, with the 2D vacuum $\Omega_{2D}$ given by the GNS vector. We shall show:

Theorem 2. Let each $b_i \in B_+(O)$ ($i = 1, \ldots, n$) be of the form $\iota(a_1^{(i)}a_2^{(i)})\cdot\psi^{(i)}$ with charged intertwiners $\psi^{(i)}$ and $a_1^{(i)} \in A(I)$ and $a_2^{(i)} \in A(J)$. As $x$ goes to $+\infty$, the BCFT vacuum correlations (4.1) converge to the 2D vacuum correlations
$$\big(\Omega_{2D}\,,\ \hat\pi(b_1\cdots b_n)\,\Omega_{2D}\big) = \xi\circ\mu(b_1\cdots b_n). \tag{4.2}$$
Proof. We compute the limit and the 2D vacuum expectation value separately. Using the decomposition of products ψ1 ψ2 into finite sums of operators of the form ι(T1 T2 ) · ψ [18], where Ti are intertwiners between DHR endomorphisms of A, we see that the product b1 · · · bn is a finite sum of operators of the same form ι(a1 a2 ) · ψ. For the present purpose, it is more convenient to write the charged intertwiners as ψ = t · ι(¯r ), where r ∈ Hom(id, τ τ¯ ) ⊂ A(J ) and t ∈ Hom(ατ+ , ασ− ) ⊂ Hom(ιτ, ισ ) (Frobenius reciprocity). Then, because a2 = σ (a2 ), we get ι(a2 ) · ψ = t · ι(τ (a2 )¯r ). Hence, the product b1 . . . bn is a finite sum of operators of the form ι(a1 ) · t · ι(a2 ).
(4.3)
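Thus, the above vacuum correlation function is a finite sum of expectation values
$$\begin{aligned} F(x) &= (\Omega, \beta_x(\iota(a_1) \cdot t \cdot \iota(a_2))\, \Omega) \\ &= (\Omega, \iota(\alpha_x(a_1) z_\sigma(x)) \cdot t \cdot \iota(z_\tau(-x)^* \alpha_{-x}(a_2))\, \Omega) \\ &= (\Omega, \alpha_x(a_1) z_\sigma(x) \cdot \varepsilon(t) \cdot z_\tau(-x)^* \alpha_{-x}(a_2)\, \Omega). \end{aligned} \tag{4.4}$$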
Here, ε is the global conditional expectation B → A, which preserves the vacuum state [17]. In particular, ε(t) ∈ Hom(τ, σ). Therefore, the expression vanishes identically unless σ and τ belong to the same sector. In the latter case, we express the cocycles as $z_\rho(g) = U_0(g) U_\rho(g)^*$, and $\alpha_g = \mathrm{Ad}_{U_0(g)}$, giving
$$F(x) = (\Omega, a_1 U_\sigma(x)^* \cdot \varepsilon(t) \cdot U_\tau(x)^* a_2\, \Omega) = (\Omega, a_1 \cdot U_\sigma(-2x) \cdot \varepsilon(t) a_2\, \Omega), \tag{4.5}$$
because the intertwiners between DHR endomorphisms also intertwine the representations of the Möbius group [10]. By the spectrum condition, F(x) has a bounded analytic continuation to the lower complex halfplane. $U_\sigma(-z)$ weakly converges in every direction $z = r e^{i\varphi}$ (−π < ϕ < 0, r → ∞) to the projection onto the zero eigenspace of the generator, and the latter projection is nonzero only if σ = id is the vacuum representation; in this case t = ε(t) = 1. Thus, F(z) converges in these directions to the vacuum expectation value
$$\delta_{\sigma,0}\, \delta_{\tau,0}\, (\Omega, a_1 \Omega) \cdot (\Omega, a_2 \Omega). \tag{4.6}$$
Next, we consider
$$\overline{F(x)} = (\Omega, \beta_x(\iota(a_2^*) \cdot t^* \cdot \iota(a_1^*))\, \Omega). \tag{4.7}$$
Let $r_\sigma \in \mathrm{Hom}(\mathrm{id}, \bar\sigma\sigma) \subset A(I)$ and $r_\tau \in \mathrm{Hom}(\mathrm{id}, \bar\tau\tau) \subset A(J)$. Then we can write $t^* = \iota(r_\sigma^*) \cdot \bar t \cdot \iota(r_\tau)$, where $\bar t \in \mathrm{Hom}(\alpha^+_{\bar\tau}, \alpha^-_{\bar\sigma}) \subset \mathrm{Hom}(\iota\bar\tau, \iota\bar\sigma)$. Using the locality properties of $a_1 \in A(I)$, $a_2 \in A(J)$, we can rewrite
$$\overline{F(x)} = (\Omega, \beta_x(\iota(r_\sigma^* \bar\sigma(a_1^*)) \cdot \bar t \cdot \iota(\bar\tau(a_2^*) r_\tau))\, \Omega). \tag{4.8}$$
This expression can be computed in the same way as F(x) before, giving
$$\overline{F(x)} = (\Omega, r_\sigma^* \bar\sigma(a_1^*) \cdot U_{\bar\sigma}(-2x) \cdot \varepsilon(\bar t)\, \bar\tau(a_2^*) r_\tau\, \Omega). \tag{4.9}$$
Thus F(x) also has a bounded analytic continuation to the upper complex halfplane, and converges to the same limit (4.6) also in the directions $z = r e^{i\varphi}$ (0 < ϕ < π, r → ∞). From this, we may conclude the cluster limit
$$\lim_{x \to \infty} (\Omega, \beta_x(\iota(a_1) \cdot t \cdot \iota(a_2))\, \Omega) = \delta_{\sigma,0}\, \delta_{\tau,0}\, (\Omega, a_1 \Omega) \cdot (\Omega, a_2 \Omega). \tag{4.10}$$
On the other hand, we now compute (4.2) and show that it coincides with the factorizing cluster limit of (4.1). For each contribution of the form (4.3), we have
$$(\Omega_{2D}, \hat\pi(\iota(a_1) \cdot t \cdot \iota(a_2))\, \Omega_{2D}) = \xi \circ \mu(\iota(a_1) \cdot t \cdot \iota(a_2)) = \xi(a_1 \cdot \mu(t) \cdot a_2). \tag{4.11}$$
But µ(t) ∈ A(I) ∨ A(J) is an intertwiner in Hom(σ, τ) which vanishes unless σ = id and τ = id both belong to the vacuum sector. In the latter case, t = µ(t) = 1. Thus,
$$\langle \hat\Xi |\, \hat\pi(\iota(a_1) \cdot t \cdot \iota(a_2))\, | \hat\Xi \rangle = \delta_{\sigma,0}\, \delta_{\tau,0}\, \xi(a_1 a_2) = \delta_{\sigma,0}\, \delta_{\tau,0}\, (\Omega, a_1 \Omega) \cdot (\Omega, a_2 \Omega). \tag{4.12}$$
This coincides with the cluster limit (4.10) “far away from the boundary”.
Recall that a1 and a2 in (4.3) were obtained by multiplying b1 · · · bn and successively decomposing the products of the charged intertwiners. Thus, the vacuum expectation values (Ω, ai Ω) in (4.12) are precisely the chiral conformal blocks of the corresponding 2D correlation functions. A variant of the conformal cluster theorem [8] should also give a quantitative estimate for the rate of the convergence, depending on the charges of the operators involved through the corresponding spectrum of L 0 . 5. Conclusion We have studied the passage from a local conformal quantum field theory defined on the halfspace x > 0 of two-dimensional Minkowski spacetime (boundary CFT, BCFT) to an associated local conformal quantum field defined on the full Minkowski spacetime (2D CFT). There are essentially two ways: the first is to consider BCFT vacuum correlations of observables localized far away from the boundary. In the limit of infinite distance, these correlations factorize into chiral correlations (conformal blocks) of charged fields. We have traced this effect back to the cluster property of the underlying local chiral subtheory. The second method exploits the split property, i.e., the existence of states of the underlying local chiral CFT in which correlations between observables in two fixed intervals at a finite distance are suppressed. With the help of the split property one can algebraically identify a fixed local algebra of the BCFT with a fixed local algebra of the 2D CFT, and one can generate a unitary representation of the 2D Möbius group in the GNS Hilbert space of a suitable “extended split state” of this algebra. Its ground state, the 2D vacuum, is different from the BCFT vacuum. Then, by acting with the 2D Möbius group, one can obtain all local algebras of the 2D CFT in the same Hilbert space. The converse question: can one consistently “add” a boundary in any 2D CFT (without affecting the algebraic structure away from the boundary), is not addressed here. 
However, there arises a necessary condition from the discussion in App. C: the 2D partition function should be either modular invariant, or at least it should be intermediate between the vacuum partition function and some modular invariant partition function. We hope to return to this problem, and find also a sufficient condition. Acknowledgement. KHR thanks the Dipartimento di Matematica of the Università di Roma “Tor Vergata” for hospitality and financial support, and M. Weiner and I. Runkel for discussions related to the subject. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
A. Modular Construction of Möb × Möb in the Split State In [12] it was shown that a unitary representation of the Möbius group Möb is generated by the modular groups of a “halfsided modular triple”, i.e., three von Neumann algebras Ai (i = 0, 1, 2) with a joint cyclic and separating vector Ψ such that if σti is the modular group for (Ai , Ψ ), then σti (Ai+1 ) ⊂ Ai+1 for t 0. (Here, i + 1 is understood mod 3.) Specifically, when I is an open interval and I1 , I2 are the subintervals obtained by removing an interior point from I , the three algebras A1 = A(I1 ), A2 = A(I2 ), A3 = A(I ) in a local chiral CFT together with the vacuum vector Ω define a halfsided modular triple. This means that the entire local net can be recovered from these data. We want to show here how this construction can be applied to construct a unitary representation of the 2D Möbius group Möb × Möb from six suitable algebras in the split state Ξ associated with a pair of intervals I and J , see (1.4). Let I1 , I2 arise from I by removing a point, and similarly J1 , J2 from J . Tensoring by 1, the two halfsided modular triples A(I ) ⊗ 1 , A(I1 ) ⊗ 1 , A(I2 ) ⊗ 1 , (A.1) 1 ⊗ A(J ) , 1 ⊗ A(J1 ) , 1 ⊗ A(J2 ) in the state Ω ⊗ Ω generate U0 ⊗ U0 . Under the split isomorphism, these triples turn into A(I ) ∩ N , A(I1 ), A(I2 ) , (A.2) A(J ) ∩ N , A(J1 ), A(J2 ) in the split state Ξ , where N is the canonical intermediate type I factor between A(I ) and A(J ) . Ξ is cyclic and separating for these algebras in the subspaces N Ξ and N Ξ , respectively. The latter halfsided modular triples thus generate the two commuting representations U+ , U− of Möb directly in H0 . B. Charged Intertwiners in BCFT The charged intertwiners ψ for a given chiral extension A ⊂ B, that together with A+ (O) generate B+ (O), are elements of the finite-dimensional spaces Hom(ι, ισ τ¯ ) ∩B(K ) . In [18, Eq. (5.12)] a linear condition on ϕ = ι¯(ψ) ∈ Hom(θ, θ σ τ¯ ) was given which guarantees that ϕ commutes with ι¯(B(K )). 
Here $\bar\iota : B \to A$ is a homomorphism conjugate to the injection ι : A → B, such that $\gamma = \iota\bar\iota$ on B(K) is a canonical endomorphism for A(K) ⊂ B(K) and $\theta = \bar\iota\iota$ is the dual canonical endomorphism, which is a DHR endomorphism of A localized in K [17]. Unfortunately, the condition displayed in [18] does not take into account that ϕ belongs to $\bar\iota(B(L))$ (i.e., is in the range of $\bar\iota$). We want to reformulate this condition so that it is equivalent to ψ belonging to B+ (O) = B(K ) ∩ B(L). We first notice that every element of B(K ) is of the form ψ = ι(y) v, where $v \in \mathrm{Hom}(\mathrm{id}_B, \gamma) \subset B(K)$ is the canonical isometry intertwining γ. Then $\psi \in \mathrm{Hom}(\iota, \iota\sigma\bar\tau)$ if and only if $y \in \mathrm{Hom}(\theta, \sigma\bar\tau) \subset A(L)$. This already secures that ψ ∈ B(L), and since θ is localized in K, ψ commutes with ι(A(K )). Hence it commutes with B(K ) iff it also commutes with $v \in \mathrm{Hom}(\mathrm{id}_B, \gamma)$. This is equivalent to the relation
$$y\, x \overset{!}{=} \theta(y)\, x \equiv \sigma(\varepsilon_{\theta,\bar\tau})\, \varepsilon^*_{\sigma,\theta}\, \theta(y)\, x, \tag{B.1}$$
where $x = \bar\iota(v) \in \mathrm{Hom}(\theta, \theta^2)$. The statistics operators ε are trivial [9] due to the localizations of σ in I, $\bar\tau$ in J, and θ in K, but we have displayed them in order to make the condition covariant under unitary deformations of $\bar\iota$ and $v \in \mathrm{Hom}(\mathrm{id}_B, \iota\bar\iota)$, possibly changing the localization of θ and leading to nontrivial statistics operators. The condition (B.1) can be equivalently written as the eigenvalue equation
$$\Pi(y) := \lambda^{1/2} \cdot (1_\sigma \times r^* \times 1_{\bar\tau}) \circ (\varepsilon^*_{\sigma,\theta} \times \varepsilon^*_{\theta,\bar\tau}) \circ (1_\theta \times y \times 1_\theta) \circ x_2 \overset{!}{=} y. \tag{B.2}$$
Here $r = x \circ w \in \mathrm{Hom}(\mathrm{id}_A, \theta^2)$, where $w \in \mathrm{Hom}(\mathrm{id}_A, \theta) \subset A(K)$ is the dual canonical isometry (such that (γ, v, ι(w)) form a Q-system); $x_2 = (1_\theta \times x) \circ x = (x \times 1_\theta) \circ x \in \mathrm{Hom}(\theta, \theta^3)$, and λ ≥ 1 is the index [B : A]. ◦ and × are the concatenation and the monoidal product in the tensor category of DHR endomorphisms of A. The map Π defined by (B.2) is a linear map $\Pi : \mathrm{Hom}(\theta, \sigma\bar\tau) \to \mathrm{Hom}(\theta, \sigma\bar\tau)$. Equation (B.2) obviously follows from (B.1) by left multiplication with $\sigma(r^*)$ and right multiplication with x. To see that (B.2) implies (B.1), one may insert (B.2) into both sides of (B.1) and repeatedly use the relations of the dual Q-system (θ, w, x) to get equality. We thank I. Runkel who has pointed out to us that Π is in fact a projection. Hence the charged intertwiners ψ are precisely given by ι(y) · v, where y is in the range of Π. The multiplicities $Z_{\sigma,\tau}$ in (3.7) equal the dimension of the range of these projections (for each pair σ, τ).

C. Haag Duality and Modular Invariance

If A is completely rational, the C* tensor category defined by its DHR superselection sectors is modular [14], i.e., the unitary S and T matrices defined by the statistics [10] generate a representation of the group SL(2, Z). By the Verlinde formula [22], these matrices also describe the modular transformation behavior of chiral partition functions (“characters”).
By [1], the matrix Z given by (3.7) is a modular invariant (it commutes with S and T), hence the partition function of the 2D CFT $B_{2D}$ on $H_{2D}$ is invariant under modular transformations. We want to point out an interesting relation of this fact to Haag duality of the associated BCFT. As mentioned before, every BCFT defined by (1.2) is automatically Haag dual, and any non Haag dual BCFT $B'_+$ with the same chiral observables is intermediate between $A_+$ and $B_+$ [18]. Therefore, the charged intertwiners $\psi \in B'_+$ constitute linear subspaces of the spaces of charged intertwiners in $B_+$. Let the dimensions of these spaces be $Z'_{\sigma,\tau} \le Z_{\sigma,\tau}$, and at least one of them $< Z_{\sigma,\tau}$ (i.e., $B'_+$ is strictly contained in $B_+$). Then the matrix $Z'$ cannot be a modular invariant by the following simple argument: consider the 00 component of $S^* Z' S$. Because each $S_{i0}$ is positive,
$$(S^* Z' S)_{00} = \sum_{ij} Z'_{ij}\, S^*_{0i}\, S_{0j} \tag{C.1}$$
is strictly smaller than $(S^* Z S)_{00} = Z_{00} = 1$. If $Z'$ were modular invariant, we would conclude $Z'_{00} < 1$, which is impossible. We notice that the construction of a 2D CFT associated to a BCFT described in Sect. 3 takes an intermediate BCFT $A_+ \subset B'_+ \subset B_+$ to an intermediate 2D CFT $A_{2D} \subset B'_{2D} \subset B_{2D}$. Its Hilbert space is of the form (3.6) with Z replaced by $Z'$. Hence, we conclude that the partition function of the associated 2D CFT is modular invariant if and only if the BCFT is Haag dual.
References
1. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
2. Buchholz, D., Mack, G., Todorov, I.T.: The current algebra on the circle as a germ of local field theories. Nucl. Phys. B (Proc. Suppl.) 5B, 20–56 (1988)
3. Cardy, J.: Conformal invariance and surface critical behavior. Nucl. Phys. B 240, 514–532 (1984)
4. Carey, A.L., Ruijsenaars, S.N.M., Wright, J.D.: The massless Thirring model: Positivity of Klaiber’s N-point functions. Commun. Math. Phys. 99, 347–364 (1985)
5. D’Antoni, C., Doplicher, S., Fredenhagen, K., Longo, R.: Convergence of local charges and continuity properties of W∗ inclusions. Commun. Math. Phys. 110, 325–348 (1987); [Erratum ibid. 116, 175–176 (1988)]
6. Doplicher, S., Longo, R.: Standard and split inclusions of von Neumann algebras. Invent. Math. 75, 493–536 (1984)
7. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics, 1+2. Commun. Math. Phys. 23, 199–230 (1971); Commun. Math. Phys. 35, 49–85 (1974)
8. Fredenhagen, K., Jörß, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansions. Commun. Math. Phys. 176, 541–554 (1996)
9. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics, I. Commun. Math. Phys. 125, 201–226 (1989)
10. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics, II. Rev. Math. Phys. SI1, Special issue, 113–157 (1992)
11. Frenkel, I.B., Kac, V.: Basic representation of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
12. Guido, D., Longo, R., Wiesbrock, H.-W.: Extensions of conformal nets and superselection structures. Commun. Math. Phys. 192, 217–244 (1998)
13. Kawahigashi, Y., Longo, R.: Classification of local conformal nets. Case c < 1. Ann. Math. 160, 493–522 (2004)
14. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
15. Kawahigashi, Y., Longo, R., Pennig, U., Rehren, K.-H.: The classification of non-local chiral CFT with c < 1. Commun. Math. Phys. 271, 375–385 (2007)
16. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237, 7–30 (2003)
17. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
18. Longo, R., Rehren, K.-H.: Local fields in boundary conformal QFT. Rev. Math. Phys. 16, 909–960 (2004)
19. Longo, R., Xu, F.: Topological sectors and a dichotomy in conformal field theory. Commun. Math. Phys. 251, 321–364 (2004)
20. Rehren, K.-H.: Canonical tensor product subfactors. Commun. Math. Phys. 211, 395–406 (2000)
21. Schroer, B., Swieca, J.A., Völkel, A.H.: Global operator expansions in conformally invariant relativistic quantum field theory. Phys. Rev. D11, 1509–1520 (1975)
22. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theories. Nucl. Phys. B300, 360–376 (1988)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 285, 1183–1205 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0544-z
On the Second-Order Correlation Function of the Characteristic Polynomial of a Hermitian Wigner Matrix
F. Götze, H. Kösters
Fakultät für Mathematik, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany.
E-mail: [email protected]; [email protected]
Received: 19 December 2007 / Accepted: 28 February 2008 Published online: 1 July 2008 – © Springer-Verlag 2008
Abstract: We consider the asymptotics of the second-order correlation function of the characteristic polynomial of a random matrix. We show that the known result for a random matrix from the Gaussian Unitary Ensemble essentially continues to hold for a general Hermitian Wigner matrix. Our proofs rely on an explicit formula for the exponential generating function of the second-order correlation function of the characteristic polynomial. Furthermore, we show that the second-order correlation function of the characteristic polynomial is closely related to that of the permanental polynomial.
Supported by CRC 701 “Spectral Structures and Topological Methods in Mathematics”.

1. Introduction

The characteristic polynomials of random matrices have attracted considerable interest in the last years, a major reason being the striking similarities between the (asymptotic) moments of the characteristic polynomial of a random matrix from the Circular Unitary Ensemble (CUE) and the (asymptotic) moments of the value distribution of the Riemann zeta function along its critical line (see Keating and Snaith [KS]). These findings have inspired several authors to investigate the moments and correlation functions of the characteristic polynomial also for other random matrix ensembles (see e.g. Brézin and Hikami [BH1,BH2], Mehta and Normand [MN], Fyodorov and Strahov [FS], Strahov and Fyodorov [SF], Baik, Deift and Strahov [BDS], Borodin and Strahov [BS]). In this paper, we consider the second-order moment and correlation function of the characteristic polynomial of a general (Hermitian) Wigner matrix: Let Q be a probability distribution on the real line such that
$$\int x\, Q(dx) = 0, \qquad a := \int x^2\, Q(dx) = 1/2, \qquad b := \int x^4\, Q(dx) < \infty, \tag{1.1}$$
and let $(X_{ii}/\sqrt{2})_{i \in \mathbb{N}}$, $(X^{\mathrm{Re}}_{ij})_{i<j,\ i,j \in \mathbb{N}}$ and $(X^{\mathrm{Im}}_{ij})_{i<j,\ i,j \in \mathbb{N}}$ be independent families of independent random variables with distribution Q on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Also, let $X_{ij} := X^{\mathrm{Re}}_{ij} + i X^{\mathrm{Im}}_{ij}$ and $X_{ji} := X^{\mathrm{Re}}_{ij} - i X^{\mathrm{Im}}_{ij}$ for i < j, i, j ∈ N. Then, for any N ∈ N, the (Hermitian) Wigner matrix of size N × N is given by $X_N = (X_{ij})_{1 \le i,j \le N}$, and the second-order correlation function of the characteristic polynomial is given by
$$f(N; \mu, \nu) := \mathbb{E}(\det(X_N - \mu I_N) \cdot \det(X_N - \nu I_N)), \tag{1.2}$$
where µ, ν are real numbers and $I_N$ denotes the identity matrix of size N × N. We are interested in the asymptotics of the values $f(N; \mu_N, \nu_N)$ as N → ∞, where $\mu_N, \nu_N$ depend on N in some suitable fashion. In the special case where Q is the Gaussian distribution with mean 0 and variance 1/2, the distribution of the random matrix $X_N$ is the so-called Gaussian Unitary Ensemble (GUE). (See e.g. Forrester [Fo] or Mehta [Me] and the references cited therein. However, let it be noted that these authors use the variance 1/4 instead of 1/2, so that we have to do some rescalings when using their results.) A remarkable feature of the GUE is that the joint distribution of the eigenvalues of the random matrix $X_N$ is known explicitly: It is given by
$$P_N(d\lambda_1, \dots, d\lambda_N) = Z_N^{-1} \cdot \prod_{1 \le i < j \le N} (\lambda_j - \lambda_i)^2 \cdot \prod_{i=1}^{N} e^{-\lambda_i^2/2}\ \lambda\!\!\lambda_N(d\lambda_1, \dots, d\lambda_N),$$
where $\lambda\!\!\lambda_N$ denotes the N-dimensional Lebesgue measure on $\mathbb{R}^N$ and $Z_N$ denotes the normalizing factor $Z_N = (2\pi)^{N/2} \cdot \prod_{k=1}^{N} k!$. Thus, the correlation function of the characteristic polynomial can be written as
$$f_{GUE}(N; \mu, \nu) := \int_{\mathbb{R}^N} \prod_{i=1}^{N} (\lambda_i - \mu) \prod_{i=1}^{N} (\lambda_i - \nu)\ P_N(d\lambda_1, \dots, d\lambda_N),$$
from which it follows (see e.g. the proof of Proposition 4.3 in Forrester [Fo]) that
$$f_{GUE}(N; \mu, \nu) = \frac{\langle p_N, p_N \rangle}{e^{-(\mu^2 + \nu^2)/4}} \cdot K_{N+1}(\mu, \nu).$$
Here, the scalar product $\langle \cdot, \cdot \rangle$ is given by $\langle \varphi, \psi \rangle := \int_{-\infty}^{+\infty} \varphi(x)\, \psi(x)\, e^{-x^2/2}\, dx$, the $p_k$ are the monic orthogonal polynomials associated with this scalar product (i.e., up to scaling, the Hermite polynomials), and the kernel $K_N$ is given by
$$K_N(x, y) := e^{-(x^2 + y^2)/4} \sum_{k=1}^{N} \frac{p_{k-1}(x)\, p_{k-1}(y)}{\langle p_{k-1}, p_{k-1} \rangle}.$$
Using this representation, it is possible to obtain asymptotic approximations of the values $f_{GUE}(N; \mu_N, \nu_N)$ from the corresponding asymptotics of the Hermite polynomials (see e.g. Sect. 8.22 in Szegö [Sz]). It turns out that
$$\lim_{N \to \infty} \sqrt{\frac{\pi}{2N}} \cdot \frac{1}{N!} \cdot f_{GUE}\Big(N; \frac{\pi\mu}{\sqrt{N}}, \frac{\pi\nu}{\sqrt{N}}\Big) = \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)}$$
(see e.g. the proof of Proposition 4.14 in Forrester [Fo]) and, for ξ ∈ (−2, +2),
$$\lim_{N \to \infty} \sqrt{\frac{1}{2\pi N}} \cdot \frac{1}{N!} \cdot e^{-N\xi^2/2} \cdot f_{GUE}(N; \sqrt{N}\xi, \sqrt{N}\xi) = \frac{1}{2\pi} \sqrt{4 - \xi^2}$$
(see e.g. the derivation of the semi-circle law in Chapter 4.3 in Forrester [Fo]). More generally, it is known that, for ξ ∈ (−2, +2),
$$\lim_{N \to \infty} \sqrt{\frac{1}{2\pi N}} \cdot \frac{1}{N!} \cdot e^{-N\xi^2/2} \cdot f_{GUE}\Big(N; \sqrt{N}\xi + \frac{\mu}{\sqrt{N}\varrho(\xi)}, \sqrt{N}\xi + \frac{\nu}{\sqrt{N}\varrho(\xi)}\Big) = e^{\xi(\mu+\nu)/2\varrho(\xi)} \cdot \varrho(\xi) \cdot \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)}$$
(see e.g. Sect. 2.1 in Strahov and Fyodorov [SF]), where $\varrho(\xi) := \frac{1}{2\pi} \sqrt{4 - \xi^2}$ denotes the density of the semi-circle law. Note that this formula includes the preceding two formulas as special cases. Even more, it turns out that a similar result holds for the correlation function (of any even order 2, 4, 6, . . .) of the characteristic polynomial of a random matrix from the larger class of unitary-invariant ensembles (see e.g. Sect. 2.1 in Strahov and Fyodorov [SF]). In this respect, it is interesting to note that the emergence of the sine kernel is “universal” in that it is independent of the particular choice of the potential function of the unitary-invariant ensemble. In contrast to that, most of the other factors in the above result for the GUE have to be replaced by potential-specific factors. It is well-known that the GUE is a special case not only of a unitary-invariant ensemble but also of a (Hermitian) Wigner ensemble as described at the beginning of this section. The purpose of this paper is to show that the above result for the GUE can also be generalized in this direction. More precisely, our main result is as follows:
Theorem 1.1. Let Q be a probability distribution on the real line satisfying (1.1), let f be defined as in (1.2), let ξ ∈ (−2, +2), and let µ, ν ∈ R. Then we have
$$\lim_{N \to \infty} \sqrt{\frac{1}{2\pi N}} \cdot \frac{1}{N!} \cdot e^{-N\xi^2/2} \cdot f\Big(N; \sqrt{N}\xi + \frac{\mu}{\sqrt{N}\varrho(\xi)}, \sqrt{N}\xi + \frac{\nu}{\sqrt{N}\varrho(\xi)}\Big) = \exp\big(b - \tfrac{3}{4}\big) \cdot e^{\xi(\mu+\nu)/2\varrho(\xi)} \cdot \varrho(\xi) \cdot \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)},$$
where $\varrho(\xi) := \frac{1}{2\pi} \sqrt{4 - \xi^2}$ and sin 0/0 := 1.

Specifically for the Gaussian distribution with mean 0 and variance 1/2, we have b = 3/4, so that we re-obtain the above result for the GUE. Furthermore, we see that for general Wigner matrices, the appropriately rescaled correlation function of the characteristic polynomial asymptotically factorizes into the universal sine kernel, a universal factor involving the density of the semi-circle law, and a non-universal factor depending only on the fourth moment b, or the fourth cumulant b − 3/4, of the underlying distribution Q. In particular, it follows immediately that if we normalize the correlation function of the characteristic polynomial by means of its second moment, we obtain the following universality result:
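Corollary 1.2. Under the assumptions of Theorem 1.1, we have
$$\lim_{N \to \infty} \frac{\mathbb{E}\big(D_N(\xi, \mu)\, D_N(\xi, \nu)\big)}{\sqrt{\mathbb{E} D_N(\xi, \mu)^2}\ \sqrt{\mathbb{E} D_N(\xi, \nu)^2}} = \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)},$$
where $D_N(\xi, \eta) := \det\Big(X_N - \big(\sqrt{N}\xi + \frac{\eta}{\sqrt{N}\varrho(\xi)}\big) I_N\Big)$.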
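Moreover, it can be shown that the asymptotics remain unchanged if we replace the correlation function f(N; µ, ν) by the “true” correlation (in the sense of probability) of the characteristic polynomial,
$$\tilde f(N; \mu, \nu) := \mathbb{E}\big((\det(X_N - \mu I_N) - \mathbb{E} \det(X_N - \mu I_N)) \cdot (\det(X_N - \nu I_N) - \mathbb{E} \det(X_N - \nu I_N))\big). \tag{1.3}$$
We then have the following result:

Proposition 1.3. Let Q be a probability distribution on the real line satisfying (1.1), let $\tilde f$ be defined as in (1.3), let ξ ∈ (−2, +2), and let µ, ν ∈ R. Then we have
$$\lim_{N \to \infty} \sqrt{\frac{1}{2\pi N}} \cdot \frac{1}{N!} \cdot e^{-N\xi^2/2} \cdot \tilde f\Big(N; \sqrt{N}\xi + \frac{\mu}{\sqrt{N}\varrho(\xi)}, \sqrt{N}\xi + \frac{\nu}{\sqrt{N}\varrho(\xi)}\Big) = \exp\big(b - \tfrac{3}{4}\big) \cdot e^{\xi(\mu+\nu)/2\varrho(\xi)} \cdot \varrho(\xi) \cdot \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)},$$
where $\varrho(\xi) := \frac{1}{2\pi} \sqrt{4 - \xi^2}$ and sin 0/0 := 1.

Similarly as before, normalizing the correlation of the characteristic polynomial by means of its variance leads to the following universality result for the correlation coefficient:

Corollary 1.4. Under the assumptions of Proposition 1.3, we have
$$\lim_{N \to \infty} \frac{\mathbb{E}\big(\tilde D_N(\xi, \mu)\, \tilde D_N(\xi, \nu)\big)}{\sqrt{\mathbb{E} \tilde D_N(\xi, \mu)^2}\ \sqrt{\mathbb{E} \tilde D_N(\xi, \nu)^2}} = \frac{\sin \pi(\mu - \nu)}{\pi(\mu - \nu)},$$
where
$$\tilde D_N(\xi, \eta) := \det\Big(X_N - \big(\sqrt{N}\xi + \tfrac{\eta}{\sqrt{N}\varrho(\xi)}\big) I_N\Big) - \mathbb{E} \det\Big(X_N - \big(\sqrt{N}\xi + \tfrac{\eta}{\sqrt{N}\varrho(\xi)}\big) I_N\Big).$$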
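The GUE limit quoted earlier in this section (the case ξ = 0, where the shifts are πµ/√N and πν/√N) can be explored numerically. The following is only a sketch under stated assumptions: it uses the standard fact that the monic polynomials for the weight e^{−x²/2} are the probabilists' Hermite polynomials He_k with ⟨He_k, He_k⟩ = √(2π) k!, so that the kernel representation reduces to f_GUE(N; µ, ν)/N! = Σ_{k=0}^{N} He_k(µ) He_k(ν)/k!. The function names are hypothetical and not taken from the paper.

```python
import math


def gue_correlation_over_factorial(n, mu, nu):
    """Return sum_{k=0}^{n} He_k(mu) He_k(nu) / k!  (assumed equal to f_GUE(n; mu, nu) / n!)."""
    # Work with h_k(x) = He_k(x) / sqrt(k!) to avoid overflow; the three-term
    # recurrence He_{k+1} = x He_k - k He_{k-1} becomes
    # h_{k+1}(x) = (x h_k(x) - sqrt(k) h_{k-1}(x)) / sqrt(k + 1).
    hx_prev, hx = 0.0, 1.0
    hy_prev, hy = 0.0, 1.0
    total = hx * hy  # k = 0 term
    for k in range(n):
        hx_prev, hx = hx, (mu * hx - math.sqrt(k) * hx_prev) / math.sqrt(k + 1)
        hy_prev, hy = hy, (nu * hy - math.sqrt(k) * hy_prev) / math.sqrt(k + 1)
        total += hx * hy
    return total


def rescaled_gue(n, mu, nu):
    """Left-hand side of the xi = 0 limit: sqrt(pi/(2n)) * f_GUE(n; pi mu/sqrt n, pi nu/sqrt n) / n!."""
    scale = math.pi / math.sqrt(n)
    return math.sqrt(math.pi / (2 * n)) * gue_correlation_over_factorial(n, scale * mu, scale * nu)


def sine_kernel(mu, nu):
    """Right-hand side sin(pi (mu - nu)) / (pi (mu - nu)), with the convention sin 0 / 0 := 1."""
    d = math.pi * (mu - nu)
    return 1.0 if d == 0 else math.sin(d) / d


if __name__ == "__main__":
    # Compare the finite-N value with the claimed limit for a few matrix sizes.
    for n in (100, 400, 1600):
        print(n, rescaled_gue(n, 0.5, 0.0), sine_kernel(0.5, 0.0))
```

Working with the normalized quantities h_k keeps all intermediate values of moderate size, which is why no factorials appear explicitly in the loop.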
The proofs of the above-mentioned results on the correlation function of the characteristic polynomial of a random matrix from the GUE (or another unitary-invariant ensemble) heavily depend on the special structure of the joint distribution of the eigenvalues. However, such a structure seems not to be available for general Wigner matrices. Instead, we start from recursive equations for the correlation function of the characteristic polynomial (as well as some closely related correlation functions), derive an explicit expression for the associated exponential generating function and deduce all our asymptotic results from this expression. For the determinant of a real symmetric Wigner matrix, a similar analysis was carried out by Zhurbenko [Zh]. The crucial step in our analysis is to obtain an expression in closed form for the exponential generating function of the correlation function. Unfortunately, this approach has
proven successful so far only for the second-order correlation function of the characteristic polynomial, which explains why we do not have any results for the higher-order correlation functions of the characteristic polynomial. Note however that for any distribution Q with the first 2k moments identical to the Gaussian moments, the correlation function of order k of the characteristic polynomial must be the same as that for the GUE. Our approach also has some interesting implications for permanental polynomials of (Hermitian) Wigner matrices (see Sect. 5 for the definition). The analysis of permanental polynomials of random matrices from various well-known ensembles was recently initiated by Fyodorov [Fy], who established some striking similarities between permanental polynomials and characteristic polynomials. In particular, he proved that in the special case of the GUE, the second-order correlation functions of the permanental polynomial and the characteristic polynomial almost coincide, up to a simple transformation of the shift parameters. It follows from our approach that the same is true for arbitrary (Hermitian) Wigner matrices (see Proposition 5.4). This paper is organized as follows. In Sect. 2, we start with the analysis of the recursive equations for the correlation function of the characteristic polynomial and derive the explicit expression for its exponential generating function. Sections 3 and 4 are devoted to the proofs of Theorem 1.1 and Proposition 1.3, respectively. In Sect. 5, we discuss the relationship between the second-order correlation function of the characteristic polynomial and that of the permanental polynomial. Throughout this paper, K denotes an absolute constant which may change from one occurrence to the next. 2. Generating Functions To simplify the notation, we adopt the following conventions: The determinant of the “empty” (i.e., 0 × 0) matrix is taken to be 1. 
If A is an n × n matrix and z is a real or complex number, we set $A - z := A - z I_n$, where $I_n$ denotes the n × n identity matrix. Furthermore, if A is an n × n matrix and $i_1, \dots, i_m$ and $j_1, \dots, j_m$ are families of pairwise different indices from the set {1, . . . , n}, we write $A^{[i_1, \dots, i_m : j_1, \dots, j_m]}$ for the (n − m) × (n − m)-matrix obtained from A by removing the rows indexed by $i_1, \dots, i_m$ and the columns indexed by $j_1, \dots, j_m$. Thus, for any n × n matrix $A = (a_{ij})_{1 \le i,j \le n}$ (n ≥ 1), we have the identity
$$\det(A) = \sum_{i,j=1}^{n-1} (-1)^{i+j-1}\, a_{i,n}\, a_{n,j} \det(A^{[n,i:n,j]}) + a_{n,n} \det(A^{[n:n]}), \tag{2.1}$$
as follows by expanding the determinant about the last row and the last column. (For n = 1, note that the big sum vanishes.) Recall that we write $X_N$ for the random matrix $(X_{ij})_{1 \le i,j \le N}$, where the $X_{ij}$ are the random variables introduced below (1.1). We will analyze the function
$$f(N; \mu, \nu) := \mathbb{E}\big(\det(X_N - \mu) \cdot \det(X_N - \nu)\big) \qquad (N \ge 0).$$
To this purpose, we will also need the auxiliary functions
$$\begin{aligned} f_{11}(N; \mu, \nu) &:= \mathbb{E}\big(\det((X_N - \mu)^{[1:1]}) \cdot \det((X_N - \nu)^{[2:2]})\big) && (N \ge 2), \\ f^{\chi}_{11}(N; \mu, \nu) &:= \mathbb{E}\big(\det((X_N - \mu)^{[1:2]}) \cdot \det((X_N - \nu)^{[2:1]})\big) && (N \ge 2), \\ f_{10}(N; \mu, \nu) &:= \mathbb{E}\big(\det(X_{N-1} - \mu) \cdot \det(X_N - \nu)\big) && (N \ge 1), \\ f_{01}(N; \mu, \nu) &:= \mathbb{E}\big(\det(X_N - \mu) \cdot \det(X_{N-1} - \nu)\big) && (N \ge 1). \end{aligned}$$
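Since the expansion (2.1) about the last row and the last column is used repeatedly below, it may help to see it checked on a concrete matrix. The following is a minimal sketch with exact integer arithmetic; the helper names `minor`, `det` and `bordered_expansion` are hypothetical, and the determinant is computed by naive Laplace expansion, which is only sensible for small matrices.

```python
def minor(a, rows, cols):
    """Matrix obtained from a by deleting the given (0-based) rows and columns."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c not in cols]
            for r in range(n) if r not in rows]


def det(a):
    """Determinant by Laplace expansion along the first row; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det(minor(a, {0}, {j})) for j in range(len(a)))


def bordered_expansion(a):
    """Right-hand side of (2.1): expansion about the last row and the last column."""
    last = len(a) - 1
    total = a[last][last] * det(minor(a, {last}, {last}))
    for i in range(last):
        for j in range(last):
            # With 0-based indices, (-1)^(i+j-1) in (2.1) becomes (-1)^(i+j+1).
            sign = (-1) ** (i + j + 1)
            total += sign * a[i][last] * a[last][j] * det(minor(a, {last, i}, {last, j}))
    return total


if __name__ == "__main__":
    example = [[2, 1, 0, 3],
               [1, -1, 4, 2],
               [0, 5, 1, -2],
               [3, 2, -1, 1]]
    print(det(example) == bordered_expansion(example))
```

For n = 1 the double loop is empty, which mirrors the remark that the big sum vanishes in that case.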
Since µ and ν can be regarded as constants for the purposes of this section, we will only write f(N) instead of f(N; µ, ν) in the sequel, etc. We have the following recursive equations:

Lemma 2.1.
$$f(0) = 1,$$
$$\begin{aligned} f(N) = {} & (1 + \mu\nu)\, f(N-1) + (2b + \tfrac{1}{2})(N-1)\, f(N-2) \\ & + (N-1)(N-2)\, f_{11}(N-1) + (N-1)(N-2)\, f^{\chi}_{11}(N-1) \\ & + \nu (N-1)\, f_{10}(N-1) + \mu (N-1)\, f_{01}(N-1) \qquad (N \ge 1), \end{aligned} \tag{2.2}$$
$$\begin{aligned} f_{11}(N) = {} & \mu\nu\, f(N-2) + (N-2)\, f(N-3) + (N-2)(N-3)\, f_{11}(N-2) \\ & + \nu (N-2)\, f_{10}(N-2) + \mu (N-2)\, f_{01}(N-2) \qquad (N \ge 2), \end{aligned} \tag{2.3}$$
$$f^{\chi}_{11}(N) = f(N-2) + (N-2)\, f(N-3) + (N-2)(N-3)\, f^{\chi}_{11}(N-2) \qquad (N \ge 2), \tag{2.4}$$
$$f_{10}(N) = -(N-1)\, f_{01}(N-1) - \nu\, f(N-1) \qquad (N \ge 1), \tag{2.5}$$
$$f_{01}(N) = -(N-1)\, f_{10}(N-1) - \mu\, f(N-1) \qquad (N \ge 1). \tag{2.6}$$
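For the sake of clarity, note that these recursive equations may contain some terms which have not been defined (such as $f_{11}(N-1)$ for N = 1), but this is not a problem since these terms occur in combination with the factor zero only.

Proof. We begin with the proof of (2.2). For N = 0, the result is clear. For N ≥ 1, we expand the determinants of the matrices $(X_N - \mu)$ and $(X_N - \nu)$ as in (2.1) and use the independence of the random variables $X_{ij} = \overline{X_{ji}}$ (i ≤ j), to the effect that
$$\begin{aligned} f(N) = {} & \sum_{i,j=1}^{N-1} \sum_{k,l=1}^{N-1} (-1)^{i+j+k+l}\, \mathbb{E}\big(X_{i,N} X_{N,j} X_{k,N} X_{N,l}\big) \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big) \\ & + \sum_{i,j=1}^{N-1} (-1)^{i+j+1}\, \mathbb{E}\big(X_{i,N} X_{N,j}\big) \cdot \mathbb{E}\big(X_{N,N} - \nu\big) \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det(X_{N-1} - \nu)\big) \\ & + \sum_{k,l=1}^{N-1} (-1)^{k+l+1}\, \mathbb{E}\big(X_{k,N} X_{N,l}\big) \cdot \mathbb{E}\big(X_{N,N} - \mu\big) \cdot \mathbb{E}\big(\det(X_{N-1} - \mu) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big) \\ & + \mathbb{E}\big((X_{N,N} - \mu)(X_{N,N} - \nu)\big) \cdot \mathbb{E}\big(\det(X_{N-1} - \mu) \cdot \det(X_{N-1} - \nu)\big). \end{aligned}$$
Since the (complex-valued) random variables $X_{ij} = \overline{X_{ji}}$ (i ≤ j) are independent with $\mathbb{E}(X_{ij}) = 0$ (i ≤ j) and $\mathbb{E}(X_{ij}^2) = 0$ (i < j), several of the expectations vanish, and the sum reduces to
$$\begin{aligned} f(N) = {} & (\mathbb{E} X_{N,N}^2 + \mu\nu) \cdot \mathbb{E}\big(\det(X_{N-1} - \mu) \cdot \det(X_{N-1} - \nu)\big) \\ & + \sum_{i=j=k=l} \mathbb{E}|X_{i,N}|^4 \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big) \\ & + \sum_{i=j \ne k=l} \mathbb{E}|X_{i,N}|^2 \cdot \mathbb{E}|X_{k,N}|^2 \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big) \\ & + \sum_{i=l \ne j=k} \mathbb{E}|X_{i,N}|^2 \cdot \mathbb{E}|X_{k,N}|^2 \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big) \\ & + \nu \sum_{i=j} \mathbb{E}|X_{i,N}|^2 \cdot \mathbb{E}\big(\det((X_{N-1} - \mu)^{[i:j]}) \cdot \det(X_{N-1} - \nu)\big) \\ & + \mu \sum_{k=l} \mathbb{E}|X_{k,N}|^2 \cdot \mathbb{E}\big(\det(X_{N-1} - \mu) \cdot \det((X_{N-1} - \nu)^{[k:l]})\big). \end{aligned}$$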
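Equation (2.2) now follows by noting that $\mathbb{E} X_{N,N}^2 = 1$, $\mathbb{E}|X_{i,N}|^2 = 1$, $\mathbb{E}|X_{i,N}|^4 = 2b + \tfrac{1}{2}$, and by using symmetry. To prove (2.3), we apply the analogue of (2.1) for the first row and the first column to the matrices $(X_N - \mu)^{[1:1]}$ and $(X_N - \nu)^{[2:2]}$. Using similar arguments as in the proof of (2.2) afterwards, we obtain
$$\begin{aligned} f_{11}(N) = {} & \sum_{i,j=3}^{N} \sum_{k,l=3}^{N} (-1)^{i+j+k+l}\, \mathbb{E}\big(X_{i,2} X_{2,j}\big) \cdot \mathbb{E}\big(X_{k,1} X_{1,l}\big) \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2,i:1,2,j]}) \cdot \det((X_N - \nu)^{[1,2,k:1,2,l]})\big) \\ & + \sum_{i,j=3}^{N} (-1)^{i+j+1}\, \mathbb{E}\big(X_{i,2} X_{2,j}\big) \cdot \mathbb{E}\big(X_{1,1} - \nu\big) \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2,i:1,2,j]}) \cdot \det((X_N - \nu)^{[1,2:1,2]})\big) \\ & + \sum_{k,l=3}^{N} (-1)^{k+l+1}\, \mathbb{E}\big(X_{k,1} X_{1,l}\big) \cdot \mathbb{E}\big(X_{2,2} - \mu\big) \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2:1,2]}) \cdot \det((X_N - \nu)^{[1,2,k:1,2,l]})\big) \\ & + \mathbb{E}\big(X_{2,2} - \mu\big) \cdot \mathbb{E}\big(X_{1,1} - \nu\big) \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2:1,2]}) \cdot \det((X_N - \nu)^{[1,2:1,2]})\big) \\ = {} & \mu\nu \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2:1,2]}) \cdot \det((X_N - \nu)^{[1,2:1,2]})\big) \\ & + \sum_{i=j=k=l} \mathbb{E}|X_{i,2}|^2 \cdot \mathbb{E}|X_{k,1}|^2 \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2,i:1,2,j]}) \cdot \det((X_N - \nu)^{[1,2,k:1,2,l]})\big) \\ & + \sum_{i=j \ne k=l} \mathbb{E}|X_{i,2}|^2 \cdot \mathbb{E}|X_{k,1}|^2 \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2,i:1,2,j]}) \cdot \det((X_N - \nu)^{[1,2,k:1,2,l]})\big) \\ & + \nu \sum_{i=j} \mathbb{E}|X_{i,2}|^2 \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2,i:1,2,j]}) \cdot \det((X_N - \nu)^{[1,2:1,2]})\big) \\ & + \mu \sum_{k=l} \mathbb{E}|X_{k,1}|^2 \cdot \mathbb{E}\big(\det((X_N - \mu)^{[1,2:1,2]}) \cdot \det((X_N - \nu)^{[1,2,k:1,2,l]})\big), \end{aligned}$$
and hence (2.3). The proof of (2.4) is similar to that of (2.3):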
and hence (2.3). The proof of (2.4) is similar to that of (2.3): χ
f 11 (N ) =
N N
(−1)i+ j+k+l E X i,1 X 1,l
i, j=3 k,l=3
· E X 2, j X k,2 · E det(X N − µ)[1,2,i:1,2, j] · det(X N − ν)[1,2,k:1,2,l] N
+
(−1)i+ j+1 E X i,1 X 2, j
i, j=3
· E X 1,2 · E det(X N − µ)[1,2,i:1,2, j] · det(X N − ν)[1,2:1,2] N
+
(−1)k+l+1 E X k,2 X 1,l
k,l=3
· E X 2,1 · E det(X N − µ)[1,2:1,2] · det(X N − ν)[1,2,k:1,2,l]
+ E X 2,1 X 1,2 · E det(X N − µ)[1,2:1,2] · det(X N − ν)[1,2:1,2]
= E det(X N − µ)[1,2:1,2] · det(X N − ν)[1,2:1,2] + E|X i,1 |2 · E|X k,2 |2 i=l=k= j
· E det(X N − µ)[1,2,i:1,2, j] · det(X N − ν)[1,2,k:1,2,l] E|X i,1 |2 · E|X k,2 |2 + i=l =k= j
· E det(X N − µ)[1,2,i:1,2, j] · det(X N − ν)[1,2,k:1,2,l] . For the proof of (2.5), we expand the determinant of the matrix (X N − ν) as in (2.1) and use similar arguments as above to obtain f 10 (N ) =
N −1 k,l=1
(−1)k+l+1 E X k,N X N ,l · E det(X N −1 − µ) · det(X N −1 − ν)[k:l]
+ E X N ,N − ν · E (det(X N −1 − µ) · det(X N −1 − ν))
Characteristic Polynomial of a Hermitian Wigner Matrix
=−
1191
E|X k,N |2 · E det(X N −1 − µ) · det(X N −1 − ν)[k:l]
k=l
− ν · E (det(X N −1 − µ) · det(X N −1 − ν)) , and hence (2.5). The proof of (2.6) is completely analogous to that of (2.5).
The interesting (although apparently rather special) phenomenon is that the preceding recursions can be combined into a single recursion involving only the values $f(N)$. To shorten the notation, we put

$$
c(N) := \frac{f(N)}{N!} \quad (N \ge 0)
\qquad\text{and}\qquad
s(N) := \sum_{\substack{k=0,\dots,N \\ k \text{ even}}} c(N-k) \quad (N \ge 0) .
$$

We then have the following result:

Lemma 2.2. The values $c(N)$ satisfy the recursive equation

$$
c(0) = 1 , \tag{2.7}
$$

$$
\begin{aligned}
N c(N) &= c(N-1) + N \cdot c(N-2) \\
&\quad + \mu\nu \cdot \big(s(N-1) + s(N-3)\big) - (\mu^2+\nu^2) \cdot s(N-2) \\
&\quad + (2b - \tfrac32) \cdot \big(c(N-2) - c(N-4)\big) \qquad (N \ge 1) ,
\end{aligned} \tag{2.8}
$$

where all terms $c(\,\cdot\,)$ and $s(\,\cdot\,)$ with a negative argument are taken to be zero.

Proof. It is immediate from Lemma 2.1 that

$$
f(N) = f_{11}(N+1) + f_{11}^{\chi}(N+1) + (2b - \tfrac32)(N-1) f(N-2)
$$

for all $N \ge 1$. Using this relation for $N-2$ instead of $N$ (where now $N \ge 3$), we can substitute $f_{11}(N-1) + f_{11}^{\chi}(N-1)$ on the right-hand side of (2.2) to obtain

$$
\begin{aligned}
f(N) &= (1+\mu\nu) f(N-1) + (2b+\tfrac12)(N-1) f(N-2) \\
&\quad + (N-1)(N-2) \big( f(N-2) - (2b-\tfrac32)(N-3) f(N-4) \big) \\
&\quad + \nu(N-1) f_{10}(N-1) + \mu(N-1) f_{01}(N-1) \\
&= (1+\mu\nu) f(N-1) + N(N-1) f(N-2) + \nu(N-1) f_{10}(N-1) + \mu(N-1) f_{01}(N-1) \\
&\quad + (2b-\tfrac32) \cdot \big( (N-1) f(N-2) - (N-1)(N-2)(N-3) f(N-4) \big)
\end{aligned}
$$

for all $N \ge 3$. Dividing by $(N-1)!$, it follows that

$$
\begin{aligned}
N c(N) &= (1+\mu\nu) \cdot c(N-1) + N c(N-2) + \nu \cdot \frac{f_{10}(N-1)}{(N-2)!} + \mu \cdot \frac{f_{01}(N-1)}{(N-2)!} \\
&\quad + (2b-\tfrac32) \cdot \big( c(N-2) - c(N-4) \big)
\end{aligned}
$$

for all $N \ge 3$. (For $N = 3$, note that the second term in the large bracket vanishes.) A straightforward induction using (2.5) and (2.6) shows that

$$
\frac{f_{10}(N-1)}{(N-2)!} = -\nu s(N-2) + \mu s(N-3) , \qquad
\frac{f_{01}(N-1)}{(N-2)!} = -\mu s(N-2) + \nu s(N-3) ,
$$

for all $N \ge 3$, which yields the assertion for $N \ge 3$. For $N < 3$, the assertion is verified by direct calculation, also making use of Lemma 2.1:

$$
\begin{aligned}
c(0) &= f(0) = 1 , \\
1\,c(1) &= f(1) = (1+\mu\nu) f(0) = (1+\mu\nu) c(0) = c(0) + \mu\nu s(0) , \\
2\,c(2) &= f(2) = (1+\mu\nu) f(1) + (2b+\tfrac12) f(0) + \nu(-\nu f(0)) + \mu(-\mu f(0)) \\
&= (1+\mu\nu) f(1) + (2b+\tfrac12) f(0) - (\mu^2+\nu^2) f(0) \\
&= (1+\mu\nu) c(1) + (2b+\tfrac12) c(0) - (\mu^2+\nu^2) c(0) \\
&= c(1) + 2c(0) + \mu\nu c(1) - (\mu^2+\nu^2) c(0) + (2b-\tfrac32) c(0) \\
&= c(1) + 2c(0) + \mu\nu s(1) - (\mu^2+\nu^2) s(0) + (2b-\tfrac32) c(0) .
\end{aligned}
$$
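The reduction from the coupled system to the single recursion (2.8) can be cross-checked in exact arithmetic. The following is a minimal sketch, not part of the original argument: it assumes the recursions (2.2)-(2.6) in the form used in the proof above (with the same convention that terms with negative arguments vanish), and the helper names `at`, `c_from_lemma_2_1` and `c_from_recursion_2_8` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def at(seq, n):
    """Return seq[n], treating negative arguments as zero (the convention of Lemma 2.2)."""
    return seq[n] if n >= 0 else Fraction(0)


def c_from_lemma_2_1(n_max, mu, nu, b):
    """c(N) = f(N)/N!, with f(N) obtained from the coupled recursions (2.2)-(2.6)."""
    f, f11, f11x, f10, f01 = [], [], [], [], []
    for n in range(n_max + 1):
        if n == 0:
            fn = Fraction(1)
        else:  # recursion (2.2)
            fn = ((1 + mu * nu) * at(f, n - 1)
                  + (2 * b + Fraction(1, 2)) * (n - 1) * at(f, n - 2)
                  + (n - 1) * (n - 2) * (at(f11, n - 1) + at(f11x, n - 1))
                  + nu * (n - 1) * at(f10, n - 1)
                  + mu * (n - 1) * at(f01, n - 1))
        f.append(fn)
        # (2.5) and (2.6), valid for n >= 1; the n = 0 slots are only placeholders.
        new_f10 = -(n - 1) * at(f01, n - 1) - nu * at(f, n - 1) if n >= 1 else Fraction(0)
        new_f01 = -(n - 1) * at(f10, n - 1) - mu * at(f, n - 1) if n >= 1 else Fraction(0)
        f10.append(new_f10)
        f01.append(new_f01)
        if n >= 2:  # recursions (2.3) and (2.4)
            f11.append(mu * nu * at(f, n - 2) + (n - 2) * at(f, n - 3)
                       + (n - 2) * (n - 3) * at(f11, n - 2)
                       + nu * (n - 2) * at(f10, n - 2)
                       + mu * (n - 2) * at(f01, n - 2))
            f11x.append(at(f, n - 2) + (n - 2) * at(f, n - 3)
                        + (n - 2) * (n - 3) * at(f11x, n - 2))
        else:
            f11.append(Fraction(0))
            f11x.append(Fraction(0))
    return [f[n] / factorial(n) for n in range(n_max + 1)]


def c_from_recursion_2_8(n_max, mu, nu, b):
    """c(N) computed directly from (2.7) and (2.8)."""
    c, s = [Fraction(1)], [Fraction(1)]  # c(0) = s(0) = 1
    for n in range(1, n_max + 1):
        rhs = (at(c, n - 1) + n * at(c, n - 2)
               + mu * nu * (at(s, n - 1) + at(s, n - 3))
               - (mu ** 2 + nu ** 2) * at(s, n - 2)
               + (2 * b - Fraction(3, 2)) * (at(c, n - 2) - at(c, n - 4)))
        c.append(rhs / n)
        s.append(c[n] + at(s, n - 2))  # s(N) = c(N) + s(N - 2)
    return c
```

Because the arithmetic uses `Fraction`, agreement of the two lists is an exact identity for the chosen parameters rather than a floating-point coincidence.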
We can now determine the exponential generating function of the sequence $(f(N))_{N \ge 0}$:

Lemma 2.3. The exponential generating function $F(x) := \sum_{N=0}^{\infty} f(N)\, x^N / N!$ of the sequence $(f(N))_{N \ge 0}$ is given by

$$
F(x) = \frac{\exp\Big( \mu\nu \cdot \frac{x}{1-x^2} - \frac12 (\mu^2+\nu^2) \cdot \frac{x^2}{1-x^2} + b^* x^2 \Big)}{(1-x)^{3/2} \cdot (1+x)^{1/2}} ,
$$

where $b^* := b - \tfrac34$.

Proof. It is straightforward to obtain $F(x)$ starting from (2.7) and (2.8) and using the basic properties of generating functions. For the sake of completeness, we provide a detailed proof.

To begin with, recall that $f(N)/N! = c(N)$. Multiplying (2.8) by $x^{N-1}$, summing over $N$ and recalling our convention concerning negative arguments, we have

$$
\begin{aligned}
\sum_{N=1}^{\infty} N c(N) x^{N-1} &= \sum_{N=1}^{\infty} c(N-1) x^{N-1} + \sum_{N=2}^{\infty} N c(N-2) x^{N-1} \\
&\quad + \mu\nu \Big( \sum_{N=1}^{\infty} s(N-1) x^{N-1} + \sum_{N=3}^{\infty} s(N-3) x^{N-1} \Big) - (\mu^2+\nu^2) \sum_{N=2}^{\infty} s(N-2) x^{N-1} \\
&\quad + 2b^* \Big( \sum_{N=2}^{\infty} c(N-2) x^{N-1} - \sum_{N=4}^{\infty} c(N-4) x^{N-1} \Big) ,
\end{aligned}
$$

whence

$$
F'(x) = F(x) + \big( 2x F(x) + x^2 F'(x) \big) + \mu\nu \, \frac{1+x^2}{1-x^2} F(x) - (\mu^2+\nu^2) \frac{x}{1-x^2} F(x) + 2b^* \big( x F(x) - x^3 F(x) \big) .
$$

We therefore obtain the differential equation

$$
F'(x) = \Big( \frac{1+2x}{1-x^2} + \mu\nu \, \frac{1+x^2}{(1-x^2)^2} - (\mu^2+\nu^2) \frac{x}{(1-x^2)^2} + 2b^* x \Big) F(x) ,
$$

which has the solution

$$
F(x) = \frac{F_0}{(1-x)^{3/2} \cdot (1+x)^{1/2}} \exp\Big( \mu\nu \, \frac{x}{1-x^2} - \tfrac12 (\mu^2+\nu^2) \frac{1}{1-x^2} + b^* x^2 \Big) .
$$

Here, $F_0$ denotes a multiplicative constant which is determined by (2.7): $F_0 = \exp\big( \tfrac12 (\mu^2+\nu^2) \big)$. We therefore obtain

$$
F(x) = \frac{1}{(1-x)^{3/2} \cdot (1+x)^{1/2}} \exp\Big( \mu\nu \, \frac{x}{1-x^2} - \tfrac12 (\mu^2+\nu^2) \frac{x^2}{1-x^2} + b^* x^2 \Big) ,
$$

which completes the proof.
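As a numerical sanity check of Lemma 2.3, one can compare the closed form with a truncated power series whose coefficients come from the recursion (2.8). This is a hedged sketch in floating-point arithmetic, assuming $|x| < 1$ so that the series converges; the function names and the truncation order `n_max` are illustrative choices, not taken from the paper.

```python
import math


def c_sequence(n_max, mu, nu, b):
    """Coefficients c(N) = f(N)/N! for N = 0..n_max from (2.7) and (2.8)."""
    def at(seq, n):
        # Terms with a negative argument are taken to be zero.
        return seq[n] if n >= 0 else 0.0

    c, s = [1.0], [1.0]
    for n in range(1, n_max + 1):
        rhs = (at(c, n - 1) + n * at(c, n - 2)
               + mu * nu * (at(s, n - 1) + at(s, n - 3))
               - (mu * mu + nu * nu) * at(s, n - 2)
               + (2.0 * b - 1.5) * (at(c, n - 2) - at(c, n - 4)))
        c.append(rhs / n)
        s.append(c[n] + at(s, n - 2))
    return c


def generating_function_closed_form(x, mu, nu, b):
    """F(x) as stated in Lemma 2.3, with b* = b - 3/4."""
    b_star = b - 0.75
    exponent = (mu * nu * x / (1.0 - x * x)
                - 0.5 * (mu * mu + nu * nu) * x * x / (1.0 - x * x)
                + b_star * x * x)
    return math.exp(exponent) / ((1.0 - x) ** 1.5 * (1.0 + x) ** 0.5)


def generating_function_series(x, mu, nu, b, n_max=200):
    """Truncated series sum_{N <= n_max} c(N) x^N, meaningful for |x| < 1."""
    return sum(cn * x ** n for n, cn in enumerate(c_sequence(n_max, mu, nu, b)))
```

For moderate parameters and $|x|$ well inside the unit disc, the two evaluations should agree to near machine precision, since the neglected tail is of order $|x|^{n_{\max}}$.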
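3. The Proof of Theorem 1.1

To prove Theorem 1.1, we will establish the following slightly more general result:

Proposition 3.1. Let $Q$ be a probability distribution on the real line satisfying (1.1), let $f$ be defined as in (1.2), let $(\xi_N)_{N \in \mathbb{N}}$ be a sequence of real numbers such that $\lim_{N\to\infty} \xi_N / \sqrt{N} = \xi$ for some $\xi \in (-2,+2)$, and let $\eta \in \mathbb{C}$. Then we have

$$
\lim_{N\to\infty} \sqrt{\frac{2\pi}{N}} \cdot \frac{1}{N!} \cdot \exp(-\xi_N^2/2) \cdot f\Big( N ;\, \xi_N + \frac{\eta}{\sqrt{N}} ,\, \xi_N - \frac{\eta}{\sqrt{N}} \Big)
= \exp\big( b - \tfrac34 \big) \cdot \sqrt{4-\xi^2} \cdot \frac{\sin\big( \sqrt{4-\xi^2} \cdot \eta \big)}{\sqrt{4-\xi^2} \cdot \eta} ,
$$

where $\sin 0 / 0 := 1$.

It is easy to deduce Theorem 1.1 from Proposition 3.1:

Proof of Theorem 1.1. Taking

$$
\xi_N := \sqrt{N}\, \xi + \frac{\pi(\mu+\nu)}{\sqrt{N} \cdot \sqrt{4-\xi^2}}
\qquad\text{and}\qquad
\eta := \frac{\pi(\mu-\nu)}{\sqrt{4-\xi^2}}
$$

in Proposition 3.1, we have

$$
\begin{aligned}
\lim_{N\to\infty} \sqrt{\frac{2\pi}{N}} \cdot \frac{1}{N!} &\cdot \exp\Big( -N\xi^2/2 - \pi\xi(\mu+\nu) / \sqrt{4-\xi^2} \Big) \cdot f\Big( N ;\, \sqrt{N}\,\xi + \frac{2\pi\mu}{\sqrt{N}\sqrt{4-\xi^2}} ,\, \sqrt{N}\,\xi + \frac{2\pi\nu}{\sqrt{N}\sqrt{4-\xi^2}} \Big) \\
&= \exp\big( b - \tfrac34 \big) \cdot \sqrt{4-\xi^2} \cdot \frac{\sin \pi(\mu-\nu)}{\pi(\mu-\nu)} .
\end{aligned}
$$

Multiplying by $\frac{1}{2\pi} \exp\big( \pi\xi(\mu+\nu) / \sqrt{4-\xi^2} \big)$ yields Theorem 1.1.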
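Proposition 3.1 can be probed numerically in the simplest case $\xi_N = 0$ (so $\xi = 0$ and the factor $\exp(-\xi_N^2/2)$ equals 1). The sketch below is only an illustration under that assumption: it evaluates $f(N;\mu,\nu)/N!$ through the recursion (2.8) with $\mu = \eta/\sqrt{N}$, $\nu = -\eta/\sqrt{N}$ and compares the rescaled value with the predicted limit. The function names are hypothetical, and the agreement at finite $N$ is only approximate.

```python
import math


def scaled_correlation_at_zero(n, eta, b):
    """sqrt(2*pi/N) * f(N; eta/sqrt(N), -eta/sqrt(N)) / N!, i.e. the case xi_N = 0."""
    mu = eta / math.sqrt(n)
    nu = -eta / math.sqrt(n)

    def at(seq, k):
        # Terms with a negative argument are taken to be zero.
        return seq[k] if k >= 0 else 0.0

    c, s = [1.0], [1.0]  # c(0) = s(0) = 1
    for m in range(1, n + 1):
        rhs = (at(c, m - 1) + m * at(c, m - 2)
               + mu * nu * (at(s, m - 1) + at(s, m - 3))
               - (mu * mu + nu * nu) * at(s, m - 2)
               + (2.0 * b - 1.5) * (at(c, m - 2) - at(c, m - 4)))
        c.append(rhs / m)
        s.append(c[m] + at(s, m - 2))
    return math.sqrt(2.0 * math.pi / n) * c[n]


def predicted_limit(eta, b, xi=0.0):
    """Right-hand side of Proposition 3.1 for real eta, with sin(0)/0 := 1."""
    root = math.sqrt(4.0 - xi * xi)
    sinc = 1.0 if eta == 0 else math.sin(root * eta) / (root * eta)
    return math.exp(b - 0.75) * root * sinc
```

For example, with `b = 0.75` (the Gaussian value of the fourth moment) and `eta = 1.0`, the rescaled value at `n = 2000` should lie close to $\sin 2 \approx 0.909$. Larger $|\xi_N|$ would require more care, since the recursion then involves strong cancellations in floating-point arithmetic.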
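It therefore remains to prove Proposition 3.1:

Proof of Proposition 3.1. By Lemma 2.3, we have

$$
\sum_{N=0}^{\infty} \frac{f(N;\mu,\nu)}{N!}\, z^N = \frac{\exp\Big( \mu\nu \cdot \frac{z}{1-z^2} - \frac12 (\mu^2+\nu^2) \cdot \frac{z^2}{1-z^2} + b^* z^2 \Big)}{(1-z)^{3/2} \cdot (1+z)^{1/2}} .
$$

Thus, by Cauchy's formula, we have the integral representation

$$
\frac{f(N;\mu,\nu)}{N!} = \frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\Big( \mu\nu \cdot \frac{z}{1-z^2} - \frac12 (\mu^2+\nu^2) \cdot \frac{z^2}{1-z^2} + b^* z^2 \Big)}{(1-z)^{3/2} \cdot (1+z)^{1/2}} \, \frac{dz}{z^{N+1}} , \tag{3.1}
$$

where $\gamma \equiv \gamma_N$ denotes the counterclockwise circle of radius $R \equiv R_N = 1 - 1/N$ around the origin. (We may and do assume that $N \ge 2$ for the rest of the proof.) Setting $\mu = \xi_N + \eta/\sqrt{N}$ and $\nu = \xi_N - \eta/\sqrt{N}$, we have

$$
\begin{aligned}
\exp\Big( \mu\nu \cdot \frac{z}{1-z^2} &- \tfrac12 (\mu^2+\nu^2) \cdot \frac{z^2}{1-z^2} + b^* z^2 \Big) \\
&= \exp\Big( (\xi_N^2 - \eta^2/N) \cdot \frac{z}{1-z^2} - (\xi_N^2 + \eta^2/N) \cdot \frac{z^2}{1-z^2} + b^* z^2 \Big) \\
&= \exp\Big( \xi_N^2 \cdot \frac{z}{1+z} - (\eta^2/N) \cdot \frac{z}{1-z} + b^* z^2 \Big) \\
&= \exp\big( \tfrac12 \xi_N^2 + \eta^2/N \big) \cdot \exp\Big( -\tfrac12 \xi_N^2 \cdot \frac{1-z}{1+z} - (\eta^2/N) \cdot \frac{1}{1-z} + b^* z^2 \Big) .
\end{aligned}
$$

We therefore obtain

$$
\frac{1}{N!} \cdot f\Big( N ;\, \xi_N + \frac{\eta}{\sqrt{N}} ,\, \xi_N - \frac{\eta}{\sqrt{N}} \Big)
= \exp\big( \tfrac12 \xi_N^2 + \eta^2/N \big) \cdot \frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\Big( -\tfrac12 \xi_N^2 \cdot \frac{1-z}{1+z} - (\eta^2/N) \cdot \frac{1}{1-z} + b^* z^2 \Big)}{(1-z)^{3/2} \cdot (1+z)^{1/2}} \, \frac{dz}{z^{N+1}} . \tag{3.2}
$$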
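The idea is that the main contribution to the integral in (3.2) comes from a small neighborhood of $z = 1$, where the function

$$
h(z) := \frac{\exp\Big( -\tfrac12 \xi_N^2 \cdot \frac{1-z}{1+z} - (\eta^2/N) \cdot \frac{1}{1-z} + b^* z^2 \Big)}{(1-z)^{3/2} \cdot (1+z)^{1/2}}
$$

can be well approximated by the simpler function

$$
h_0(z) := \frac{\exp(b^*)}{\sqrt{2}} \cdot \frac{\exp\Big( -\tfrac14 \xi_N^2 \cdot (1-z) - (\eta^2/N) \cdot \frac{1}{1-z} \Big)}{(1-z)^{3/2}} .
$$

We therefore rewrite the integral in (3.2) as

$$
\frac{1}{2\pi i} \oint_{\gamma} h(z)\, \frac{dz}{z^{N+1}} = I_1 + I_2 + I_3 - I_4 \tag{3.3}
$$

with

$$
I_1 := \frac{1}{2\pi i} \oint_{\gamma} h_0(z)\, \frac{dz}{z^{N+1}} , \tag{3.4}
$$

$$
I_2 := \frac{1}{2\pi} \int_{1/\sqrt{N}}^{2\pi - 1/\sqrt{N}} h(Re^{it})\, \frac{dt}{(Re^{it})^N} , \tag{3.5}
$$

$$
I_3 := \frac{1}{2\pi} \int_{-1/\sqrt{N}}^{+1/\sqrt{N}} \big( h(Re^{it}) - h_0(Re^{it}) \big)\, \frac{dt}{(Re^{it})^N} , \tag{3.6}
$$

$$
I_4 := \frac{1}{2\pi} \int_{1/\sqrt{N}}^{2\pi - 1/\sqrt{N}} h_0(Re^{it})\, \frac{dt}{(Re^{it})^N} . \tag{3.7}
$$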
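We will show that the integral $I_1$ is the asymptotically dominant term. First of all, note that since $\xi_N \in \mathbb{R}$, we have

$$
\Big| \exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big) \Big| = \exp\big( -\tfrac14 \xi_N^2 \cdot \operatorname{Re}(1-z) \big) \le 1 \tag{3.8}
$$

for any $z \in \mathbb{C}$ with $\operatorname{Re}(z) \le 1$. Plugging in the series expansion

$$
\exp\Big( -(\eta^2/N) \cdot \frac{1}{1-z} \Big) = \sum_{l=0}^{\infty} \frac{(-1)^l \eta^{2l}}{l!\, N^l} \cdot \frac{1}{(1-z)^l}
$$

and using uniform convergence on the contour $\gamma$ (for fixed $N \ge 2$ and $\eta \in \mathbb{C}$), we obtain

$$
\frac{I_1}{\sqrt{N}} = \frac{\exp(b^*)}{\sqrt{2}} \sum_{l=0}^{\infty} \frac{(-1)^l \eta^{2l}}{l!\, N^{l+1/2}} \cdot \frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big)}{(1-z)^{l+3/2}} \, \frac{dz}{z^{N+1}} . \tag{3.9}
$$

We will show that for each $l = 0, 1, 2, 3, \dots$,

$$
\lim_{N\to\infty} \frac{(-1)^l \eta^{2l}}{l!\, N^{l+1/2}} \cdot \frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big)}{(1-z)^{l+3/2}} \, \frac{dz}{z^{N+1}}
= \frac{1}{\sqrt{\pi}} \cdot \frac{(-1)^l \eta^{2l}}{(2l+1)!} \cdot (4-\xi^2)^{l+1/2} . \tag{3.10}
$$

To begin with,

$$
\frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big)}{(1-z)^{l+3/2}} \, \frac{dz}{z^{N+1}}
= \frac{1}{2\pi i} \int_{(1-1/N) - i\infty}^{(1-1/N) + i\infty} \frac{\exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big)}{(1-z)^{l+3/2}} \, \frac{dz}{z^{N+1}} . \tag{3.11}
$$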
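In fact, for any $R' > 1$, we can replace the contour $\gamma$ by the contour $\delta$ which consists of the line segment between the points $R - i\sqrt{(R')^2 - R^2}$ and $R + i\sqrt{(R')^2 - R^2}$, and the arc of radius $R'$ around the origin to the left of this line segment (see Fig. 1). Now, it is easy to see that the integral along this arc is bounded above by

$$
\frac{1}{2\pi} \cdot 2\pi R' \cdot \frac{1}{(R'-1)^{l+3/2}} \cdot \frac{1}{(R')^{N+1}} ,
$$

and therefore tends to zero as $R' \to \infty$, whence (3.11).

Fig. 1. The contour $\delta$

Next, performing a change of variables, we find that the right-hand side in (3.11) is equal to

$$
N^{l+1/2} \cdot \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{\exp\big( -\tfrac14 (\xi_N^2/N) \cdot (1-iu) \big)}{(1-iu)^{l+3/2}\, \big( 1 - \frac{1-iu}{N} \big)^{N+1}} \, du .
$$

Since $\lim_{N\to\infty} \xi_N/\sqrt{N} = \xi$, it follows by the dominated convergence theorem that

$$
\lim_{N\to\infty} \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{\exp\big( -\tfrac14 (\xi_N^2/N) \cdot (1-iu) \big)}{(1-iu)^{l+3/2}\, \big( 1 - \frac{1-iu}{N} \big)^{N+1}} \, du
= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{\exp\big( (1 - \tfrac14 \xi^2) \cdot (1-iu) \big)}{(1-iu)^{l+3/2}} \, du ,
$$

which is equal to

$$
\frac{(1 - \tfrac14 \xi^2)^{l+1/2}}{\Gamma(l+3/2)} = \frac{1}{\sqrt{\pi}} \cdot \frac{l!}{(2l+1)!} \cdot (4-\xi^2)^{l+1/2}
$$

by Laplace inversion (see e.g. Chap. 24 in Doetsch [Do]) and the functional equation of the Gamma function. This proves (3.10).

Let $\varepsilon > 0$ denote a constant such that $\cos t \le 1 - \varepsilon^2 t^2$ for $-\pi \le t \le +\pi$. Then, for any $\alpha > 1$, we have the estimate

$$
\begin{aligned}
\int_{-\pi}^{+\pi} \frac{1}{|1 - Re^{it}|^{\alpha}} \, dt
&= \int_{-\pi}^{+\pi} \frac{1}{(1 + R^2 - 2R\cos t)^{\alpha/2}} \, dt \\
&\le \int_{-\pi}^{+\pi} \frac{1}{((1-R)^2 + \varepsilon^2 t^2)^{\alpha/2}} \, dt \\
&= N^{\alpha} \int_{-\pi}^{+\pi} \frac{1}{(1 + N^2 \varepsilon^2 t^2)^{\alpha/2}} \, dt \\
&= K N^{\alpha-1} \int_{-N\varepsilon\pi}^{+N\varepsilon\pi} \frac{1}{(1+u^2)^{\alpha/2}} \, du \\
&\le K N^{\alpha-1} \Big( 1 + \int_{1}^{\infty} \frac{1}{u^{\alpha}} \, du \Big) \\
&\le K N^{\alpha-1} \Big( 1 + \frac{1}{\alpha-1} \Big) ,
\end{aligned} \tag{3.12}
$$

where $K$ denotes some absolute constant which may change from line to line. We therefore obtain the bound

$$
\begin{aligned}
\sum_{l=0}^{\infty} \bigg| \frac{(-1)^l \eta^{2l}}{l!\, N^{l+1/2}} \cdot \frac{1}{2\pi i} \oint_{\gamma} \frac{\exp\big( -\tfrac14 \xi_N^2 \cdot (1-z) \big)}{(1-z)^{l+3/2}} \, \frac{dz}{z^{N+1}} \bigg|
&\le \sum_{l=0}^{\infty} \frac{|\eta|^{2l}}{l!} \cdot \frac{1}{N^{l+1/2}} \cdot \frac{1}{2\pi} \int_{-\pi}^{+\pi} \frac{1}{|1 - Re^{it}|^{l+3/2}} \, \frac{dt}{|Re^{it}|^N} \\
&\le K \cdot \sum_{l=0}^{\infty} \frac{|\eta|^{2l}}{l!} \cdot \Big( 1 + \frac{1}{l+1/2} \Big) < \infty ,
\end{aligned}
$$

uniformly in $N \ge 2$. Thus, the term-by-term convergence established in (3.10) entails the convergence of the complete series in (3.9), and we obtain

$$
\begin{aligned}
\lim_{N\to\infty} \frac{I_1}{\sqrt{N}}
&= \frac{1}{\sqrt{2\pi}} \cdot \exp\big( b - \tfrac34 \big) \cdot \sqrt{4-\xi^2} \cdot \sum_{l=0}^{\infty} \frac{(-1)^l \big( \sqrt{4-\xi^2} \cdot \eta \big)^{2l}}{(2l+1)!} \\
&= \frac{1}{\sqrt{2\pi}} \cdot \exp\big( b - \tfrac34 \big) \cdot \sqrt{4-\xi^2} \cdot \frac{\sin\big( \sqrt{4-\xi^2} \cdot \eta \big)}{\sqrt{4-\xi^2} \cdot \eta} .
\end{aligned}
$$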
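Hence, in view of (3.2) and (3.3), the proof of Proposition 3.1 will be complete once we have shown that the integrals $I_2$, $I_3$, $I_4$ are asymptotically negligible in the sense that they are of order $o(\sqrt{N})$.

For the integral $I_2$, we use the estimates ($R = 1 - 1/N$, $t \in \mathbb{R}$)

$$
\bigg| \exp\Big( -\tfrac12 \xi_N^2 \cdot \frac{1 - Re^{it}}{1 + Re^{it}} \Big) \bigg|
= \exp\Big( -\tfrac12 \xi_N^2 \cdot \operatorname{Re} \frac{1 - Re^{it}}{1 + Re^{it}} \Big)
= \exp\Big( -\tfrac12 \xi_N^2 \cdot \frac{1 - R^2}{1 + R^2 + 2R\cos t} \Big) \le 1 , \tag{3.13}
$$

$$
\bigg| \exp\Big( -(\eta^2/N) \cdot \frac{1}{1 - Re^{it}} \Big) \bigg|
\le \exp\Big( (|\eta|^2/N) \cdot \frac{1}{|1 - Re^{it}|} \Big)
\le \exp\Big( (|\eta|^2/N) \cdot \frac{1}{1-R} \Big) = \exp(|\eta|^2) , \tag{3.14}
$$

$$
\Big| \exp\big( b^* (Re^{it})^2 \big) \Big| \le \exp\big( |b^*| \, |Re^{it}|^2 \big) \le \exp(|b^*|) , \tag{3.15}
$$

$$
|1 + Re^{it}| \ge 1 + R\cos t \ge 1 \quad \text{for } \cos t \ge 0 , \tag{3.16}
$$

$$
|1 - Re^{it}| \ge 1 - R\cos t \ge 1 \quad \text{for } \cos t \le 0 . \tag{3.17}
$$

Using these estimates, it follows that

$$
\begin{aligned}
|I_2| &\le \frac{1}{2\pi} \int_{1/\sqrt{N}}^{\pi/2} \frac{\exp(|\eta|^2 + |b^*|)}{|1 - Re^{it}|^{3/2}} \, \frac{dt}{R^N}
+ \frac{1}{2\pi} \int_{\pi/2}^{3\pi/2} \frac{\exp(|\eta|^2 + |b^*|)}{|1 + Re^{it}|^{1/2}} \, \frac{dt}{R^N} \\
&\quad + \frac{1}{2\pi} \int_{3\pi/2}^{2\pi - 1/\sqrt{N}} \frac{\exp(|\eta|^2 + |b^*|)}{|1 - Re^{it}|^{3/2}} \, \frac{dt}{R^N} .
\end{aligned}
$$

Similarly as in (3.12), we have the estimates

$$
\int_{1/\sqrt{N}}^{\pi/2} \frac{1}{|1 - Re^{it}|^{3/2}} \, dt \le K N^{1/2} \int_{\sqrt{N}\varepsilon}^{\infty} \frac{1}{u^{3/2}} \, du \le K N^{1/4} ,
$$

$$
\int_{\pi/2}^{3\pi/2} \frac{1}{|1 + Re^{it}|^{1/2}} \, dt \le K N^{-1/2} \Big( 1 + \int_{1}^{N\varepsilon\pi/2} \frac{1}{u^{1/2}} \, du \Big) \le K ,
$$

$$
\int_{3\pi/2}^{2\pi - 1/\sqrt{N}} \frac{1}{|1 - Re^{it}|^{3/2}} \, dt \le K N^{1/2} \int_{\sqrt{N}\varepsilon}^{\infty} \frac{1}{u^{3/2}} \, du \le K N^{1/4} .
$$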
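(Recall our convention that the constant $K$ may change from one occurrence to the next.) It follows that

$$
|I_2| \le K \exp(|\eta|^2 + |b^*|)\, N^{1/4} = o(\sqrt{N}) .
$$

For the integral $I_3$, we write

$$
h(z) - h_0(z) = \frac{\exp\Big( -\tfrac14 \xi_N^2 \cdot (1-z) - (\eta^2/N) \cdot \frac{1}{1-z} \Big)}{(1-z)^{3/2}} \cdot \big( \tilde h(z) - \tilde h(1) \big) ,
$$

where

$$
\tilde h(z) = \frac{\exp\Big( -\tfrac14 \xi_N^2 \cdot \frac{(1-z)^2}{1+z} + b^* z^2 \Big)}{(1+z)^{1/2}} ,
$$

so that

$$
\tilde h'(z) = \bigg( \frac{\tfrac12 \xi_N^2 \cdot \frac{1-z}{1+z} + \tfrac14 \xi_N^2 \cdot \frac{(1-z)^2}{(1+z)^2} + 2b^* z}{(1+z)^{1/2}} - \frac{1/2}{(1+z)^{3/2}} \bigg) \cdot \exp\Big( -\tfrac14 \xi_N^2 \cdot \frac{(1-z)^2}{1+z} + b^* z^2 \Big) .
$$

Let

$$
Z := \big\{ z \in \mathbb{C} \;\big|\; z = r e^{i\varphi} ,\; 1 - 1/N \le r \le 1 ,\; |\varphi| \le 1/\sqrt{N} \big\} ,
$$

and note that for $z \in Z$, we have the estimate

$$
\operatorname{Re}\Big( -\frac{(1-z)^2}{1+z} \Big) \le 4/N . \tag{3.18}
$$

Indeed, with $z = r e^{i\varphi}$, a simple calculation yields

$$
\operatorname{Re}\Big( -\frac{(1-z)^2}{1+z} \Big) = \frac{-1 + r\cos\varphi + 2r^2 - r^2\cos 2\varphi - r^3\cos\varphi}{1 + r^2 + 2r\cos\varphi} ,
$$

where the denominator is clearly larger than 1 and the numerator is bounded above by

$$
r\cos\varphi - r^3\cos\varphi + r^2 - r^2\cos 2\varphi \le r(1-r^2)\cos\varphi + r^2 - r^2(1 - 2\varphi^2) \le 4/N .
$$

Since for $z = Re^{it}$ with $|t| \le 1/\sqrt{N}$, the line segment between the points $z$ and 1 is contained in the set $Z$, it follows that

$$
\begin{aligned}
\big| \tilde h(z) - \tilde h(1) \big|
&\le |z-1| \sup_{\alpha \in [0;1]} \big| \tilde h'((1-\alpha) z + \alpha) \big|
\le |z-1| \sup_{\zeta \in Z} \big| \tilde h'(\zeta) \big| \\
&\le K |z-1| \sup_{\zeta \in Z} \Big( \xi_N^2 |1-\zeta| + |b^*| + 1 \Big) \exp\Big( \tfrac14 \xi_N^2 \cdot \operatorname{Re}\Big( -\frac{(1-\zeta)^2}{1+\zeta} \Big) + |b^*| \Big) \\
&\le K |z-1| \Big( \xi_N^2/\sqrt{N} + |b^*| + 1 \Big) \exp\big( \xi_N^2/N + |b^*| \big) \\
&\le K(b^*, \xi^*) \sqrt{N}\, |z-1| ,
\end{aligned}
$$

where the last step uses that $\lim_{N\to\infty} \xi_N/\sqrt{N} = \xi$, and $K(b^*, \xi^*)$ denotes some constant depending only on $b^*$ and $\xi^* := (\xi_N)_{N \in \mathbb{N}}$. Using (3.8), (3.14) as well as a similar estimate as in (3.12), we therefore obtain

$$
\begin{aligned}
|I_3| &\le \frac{1}{2\pi} \int_{-1/\sqrt{N}}^{+1/\sqrt{N}} \frac{\exp(|\eta|^2)}{|1 - Re^{it}|^{3/2}} \cdot \big| \tilde h(Re^{it}) - \tilde h(1) \big| \, \frac{dt}{R^N} \\
&\le K(b^*, \xi^*, \eta) \sqrt{N} \int_{-1/\sqrt{N}}^{+1/\sqrt{N}} \frac{1}{|1 - Re^{it}|^{1/2}} \, dt \\
&\le K(b^*, \xi^*, \eta) \Big( 1 + \int_{1}^{\sqrt{N}\varepsilon} \frac{1}{u^{1/2}} \, du \Big) \\
&\le K(b^*, \xi^*, \eta)\, N^{1/4} = o(\sqrt{N}) ,
\end{aligned}
$$

where $K(b^*, \xi^*, \eta)$ denotes some constant which depends only on $b^*$, $\xi^*$, and $\eta$ (and which may change from line to line as usual).

For the integral $I_4$, we can use (3.8), (3.14) as well as a similar estimate as in (3.12) to obtain

$$
\begin{aligned}
|I_4| &\le \frac{1}{2\pi} \int_{1/\sqrt{N}}^{2\pi - 1/\sqrt{N}} \frac{\exp(|\eta|^2 + |b^*|)}{|1 - Re^{it}|^{3/2}} \, \frac{dt}{R^N} \\
&\le K \exp(|\eta|^2 + |b^*|) \bigg( \int_{1/\sqrt{N}}^{\pi} \frac{1}{|1 - Re^{it}|^{3/2}} \, dt + \int_{\pi}^{2\pi - 1/\sqrt{N}} \frac{1}{|1 - Re^{it}|^{3/2}} \, dt \bigg) \\
&\le K \exp(|\eta|^2 + |b^*|) \bigg( N^{1/2} \int_{\sqrt{N}\varepsilon}^{\infty} \frac{1}{u^{3/2}} \, du + N^{1/2} \int_{\sqrt{N}\varepsilon}^{\infty} \frac{1}{u^{3/2}} \, du \bigg) \\
&\le K \exp(|\eta|^2 + |b^*|)\, N^{1/4} = o(\sqrt{N}) .
\end{aligned}
$$

This completes the proof of Proposition 3.1.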
4. The Proof of Proposition 1.3

Let $f(N;\mu,\nu)$ and $\tilde f(N;\mu,\nu)$ be defined as in (1.2) and (1.3), respectively, and note that

$$
\begin{aligned}
\tilde f(N;\mu,\nu) &= \mathbb{E}\Big( \big( \det(X_N-\mu) - \mathbb{E}\det(X_N-\mu) \big) \cdot \big( \det(X_N-\nu) - \mathbb{E}\det(X_N-\nu) \big) \Big) \\
&= \mathbb{E}\big( \det(X_N-\mu) \cdot \det(X_N-\nu) \big) - \mathbb{E}\det(X_N-\mu) \cdot \mathbb{E}\det(X_N-\nu) \\
&= f(N;\mu,\nu) - g(N;\mu)\, g(N;\nu) ,
\end{aligned} \tag{4.1}
$$

where $g(N;\mu) := \mathbb{E}\det(X_N-\mu)$ for any $\mu \in \mathbb{R}$. We will deduce Proposition 1.3 from Theorem 1.1 by showing that $f(N;\mu,\nu)$ is asymptotically much larger than $g(N;\mu)\, g(N;\nu)$. To this end, we need some more information about the values $g(N;\mu)$. Similarly as in Sect. 2, we have the following recursive equation:

Lemma 4.1.

$$
g(0;\mu) = 1 , \qquad g(N;\mu) = -\mu\, g(N-1;\mu) - (N-1)\, g(N-2;\mu) \quad (N \ge 1) .
$$

Proof. For $N = 0$, the claim follows from our convention that the determinant of the empty matrix is 1. For $N \ge 1$, we expand the determinant of the matrix $(X_N-\mu)$ as in (2.1) and use independence and symmetry to get

$$
\begin{aligned}
g(N;\mu) &= \sum_{i,j=1}^{N-1} (-1)^{i+j+1}\, \mathbb{E}\big( X_{i,N} X_{N,j} \big) \cdot \mathbb{E}\big( \det(X_{N-1}-\mu)^{[i:j]} \big) + \mathbb{E}\big( X_{N,N}-\mu \big) \cdot \mathbb{E}\big( \det(X_{N-1}-\mu) \big) \\
&= \sum_{i=j} (-1)\, \mathbb{E}|X_{i,N}|^2 \cdot \mathbb{E}\big( \det(X_{N-1}-\mu)^{[i:i]} \big) + \mathbb{E}\big( X_{N,N}-\mu \big) \cdot \mathbb{E}\big( \det(X_{N-1}-\mu) \big) \\
&= -(N-1)\, g(N-2;\mu) - \mu\, g(N-1;\mu) .
\end{aligned}
$$

This completes the proof of Lemma 4.1.
It follows from Lemma 4.1 that the polynomials $g(N;\mu)$ coincide, up to scaling, with the Hermite polynomials $H_N(x)$ (see e.g. Sect. 5.5 in Szegö [Sz]), which satisfy the recursive equation

$$
H_0(x) = 1 , \qquad H_N(x) = 2x\, H_{N-1}(x) - 2(N-1)\, H_{N-2}(x) \quad (N \ge 1) .
$$

(Specifically for the GUE, this is well-known, see e.g. Chap. 4 in Forrester [Fo].) The precise relationship is as follows:

Lemma 4.2. For any $N = 0, 1, 2, 3, \dots$,

$$
g(N;\mu) = (-1)^N\, 2^{-N/2}\, H_N(\mu/\sqrt{2}) .
$$

Proof. This follows from the recursive equations for $g(N;\mu)$ and $H_N(x)$ by a straightforward induction on $N$.
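The induction behind Lemma 4.2 can be illustrated numerically by running both three-term recursions side by side. The sketch below is a plain floating-point check for small $N$; the function names are hypothetical and it is not meant as a substitute for the proof.

```python
import math


def g_values(n_max, mu):
    """g(N; mu) for N = 0..n_max from the recursion of Lemma 4.1."""
    g = [1.0]
    for n in range(1, n_max + 1):
        previous_but_one = g[n - 2] if n >= 2 else 0.0  # g(-1) is taken to be zero
        g.append(-mu * g[n - 1] - (n - 1) * previous_but_one)
    return g


def hermite_values(n_max, x):
    """H_N(x) for N = 0..n_max from H_N = 2x H_{N-1} - 2(N-1) H_{N-2}."""
    h = [1.0]
    for n in range(1, n_max + 1):
        previous_but_one = h[n - 2] if n >= 2 else 0.0
        h.append(2.0 * x * h[n - 1] - 2.0 * (n - 1) * previous_but_one)
    return h


def g_via_hermite(n_max, mu):
    """Right-hand side of Lemma 4.2: (-1)^N 2^(-N/2) H_N(mu / sqrt(2))."""
    h = hermite_values(n_max, mu / math.sqrt(2.0))
    return [(-1) ** n * 2.0 ** (-n / 2.0) * h[n] for n in range(n_max + 1)]
```

For instance, both `g_values(2, mu)[2]` and `g_via_hermite(2, mu)[2]` reduce to $\mu^2 - 1$, the expected determinant of the shifted $2 \times 2$ matrix.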
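Due to Lemma 4.2, it is easy to obtain the asymptotics of the values $g\big(N; \sqrt{N}\xi + \mu/(\sqrt{N}\varrho(\xi))\big)$ from the corresponding asymptotics of the Hermite polynomials (see e.g. Sect. 8.22 in Szegö [Sz]). For our purposes, the following estimate will be sufficient:

Lemma 4.3. For $\xi \in (-2,+2)$, $\mu \in \mathbb{R}$ fixed,

$$
\bigg| e^{-N\xi^2/4}\, g\Big( N ;\, \sqrt{N}\xi + \frac{\mu}{\sqrt{N}\varrho(\xi)} \Big) \bigg| \le K(\xi,\mu)\, N^{-1/4}\, N!^{1/2} ,
$$

where $K(\xi,\mu)$ is some constant depending only on $\xi$ and $\mu$.

Proof. By Theorem 8.22.9 (a) in Szegö [Sz], we have, for $x = \sqrt{2N+1}\, \cos\varphi$,

$$
e^{-x^2/2}\, H_N(x) = 2^{(N/2)+(1/4)}\, (N!)^{1/2}\, (N\pi)^{-1/4}\, (\sin\varphi)^{-1/2} \cdot \Big( \sin\big( ((2N+1)/4) \cdot (\sin 2\varphi - 2\varphi) + 3\pi/4 \big) + O(N^{-1}) \Big) ,
$$

where the $O$-bound holds uniformly in $\varphi \in [\varepsilon, \pi - \varepsilon]$, for any $\varepsilon > 0$. Combining this result with Lemma 4.2, we obtain, for $N$ sufficiently large,

$$
\begin{aligned}
e^{-(\sqrt{N}\xi + \mu/(\sqrt{N}\varrho(\xi)))^2/4}\; & g\big( N ;\, \sqrt{N}\xi + \mu/(\sqrt{N}\varrho(\xi)) \big) \\
&= (-1)^N\, 2^{1/4}\, (N!)^{1/2}\, (N\pi)^{-1/4}\, (\sin\varphi)^{-1/2} \cdot \Big( \sin\big( ((2N+1)/4) \cdot (\sin 2\varphi - 2\varphi) + 3\pi/4 \big) + O(N^{-1}) \Big) ,
\end{aligned}
$$

where

$$
\varphi \equiv \varphi_N := \arccos\Big( \big( \sqrt{N}\xi + \mu/(\sqrt{N}\varrho(\xi)) \big) \big/ \sqrt{4N+2} \Big)
$$

is contained in an interval of the form $[\varepsilon, \pi - \varepsilon]$ with $\varepsilon > 0$. From this, Lemma 4.3 easily follows.

After these preparations we can turn to the proof of Proposition 1.3:

Proof of Proposition 1.3. By Eq. (4.1) and Lemma 4.3, the difference between the left-hand sides in Theorem 1.1 and Proposition 1.3 is bounded by

$$
\begin{aligned}
\sqrt{\frac{1}{2\pi N}} \cdot \frac{1}{N!} \cdot e^{-N\xi^2/2} \cdot \bigg| g\Big( N ;\, \sqrt{N}\xi + \frac{\mu}{\sqrt{N}\varrho(\xi)} \Big) \bigg| \cdot \bigg| g\Big( N ;\, \sqrt{N}\xi + \frac{\nu}{\sqrt{N}\varrho(\xi)} \Big) \bigg|
&\le K(\xi,\mu,\nu)\, N^{-1/2}\, N!^{-1}\, \big( N^{-1/4}\, N!^{1/2} \big)^2 \\
&= K(\xi,\mu,\nu)\, N^{-1} ,
\end{aligned}
$$

where $K(\xi,\mu,\nu)$ is some constant depending only on $\xi$, $\mu$ and $\nu$. Thus, Proposition 1.3 follows from Theorem 1.1.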
5. The Permanental Polynomial

In this section, we discuss the implications of our approach for permanental polynomials of Hermitian Wigner matrices. For any $n \times n$ matrix $A = (a_{ij})_{i,j=1,\dots,n}$, the permanent is defined by

$$
\operatorname{per}(A) := \sum_{\pi \in S_n} \prod_{j=1}^{n} a_{j,\pi(j)} ,
$$

where the sum is taken over the set $S_n$ of permutations of the set $\{1,\dots,n\}$. Similarly as in Sect. 2, we adopt the convention that the permanent of the "empty" (i.e., $0 \times 0$) matrix is taken to be 1. The definition of the permanent is obviously analogous to that of the determinant, except that the sign factors are absent. It is easy to see that this analogy extends to row and column expansions. Thus, in analogy to (2.1), the permanent satisfies the identity

$$
\operatorname{per}(A) = \sum_{i,j=1}^{n-1} a_{i,n}\, a_{n,j}\, \operatorname{per}\big( A^{[n,i:n,j]} \big) + a_{n,n}\, \operatorname{per}\big( A^{[n:n]} \big) . \tag{5.1}
$$
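The expansion (5.1) is easy to test against the definition on small matrices. The following brute-force sketch assumes that $A^{[n,i:n,j]}$ denotes the matrix $A$ with rows $n, i$ and columns $n, j$ removed; the function names are hypothetical, and the enumeration over all permutations is only practical for small $n$.

```python
from itertools import permutations
from math import prod


def permanent(a):
    """Permanent from the definition; the empty (0 x 0) matrix has permanent 1."""
    n = len(a)
    return sum(prod(a[j][p[j]] for j in range(n)) for p in permutations(range(n)))


def minor(a, rows, cols):
    """Copy of the square matrix a with the given row and column indices removed."""
    size = len(a)
    return [[a[r][c] for c in range(size) if c not in cols]
            for r in range(size) if r not in rows]


def permanent_by_last_row_and_column(a):
    """Right-hand side of (5.1) for an n x n matrix with n >= 1 (0-based indices)."""
    last = len(a) - 1
    total = a[last][last] * permanent(minor(a, {last}, {last}))
    for i in range(last):
        for j in range(last):
            total += a[i][last] * a[last][j] * permanent(minor(a, {last, i}, {last, j}))
    return total
```

For the matrix `[[1, 2], [3, 4]]` both functions return 10: the diagonal term contributes $4 \cdot 1$ and the single off-diagonal pair contributes $2 \cdot 3$ times the permanent of the empty matrix.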
The permanental polynomial of the matrix $A$ is defined by $\operatorname{per}(A - z)$, where $A - z$ is defined as in Sect. 2, and the second-order correlation function of the permanental polynomial is defined by

$$
\hat f(N;\mu,\nu) := \mathbb{E}\big( \operatorname{per}(X_N-\mu) \cdot \operatorname{per}(X_N-\nu) \big) \qquad (N \ge 0) ,
$$

where $\mu$ and $\nu$ are complex numbers. We will also need the auxiliary functions $\hat f_{11}(N;\mu,\nu)$, $\hat f_{11}^{\chi}(N;\mu,\nu)$, $\hat f_{10}(N;\mu,\nu)$, $\hat f_{01}(N;\mu,\nu)$, which are defined in the same way as the auxiliary functions from Section 2, but with the determinant replaced by the permanent. Starting from the identity (5.1), it is easy to see that all our results from Sect. 2 carry over to the correlation function of the permanental polynomial, provided that one adjusts the signs appropriately. Since the proofs are virtually the same, we confine ourselves to stating the results.

Lemma 5.1.

$$
\hat f(0) = 1 ,
$$

$$
\begin{aligned}
\hat f(N) &= (1+\mu\nu)\, \hat f(N-1) + (2b+\tfrac12)(N-1)\, \hat f(N-2) + (N-1)(N-2)\, \hat f_{11}(N-1) \\
&\quad + (N-1)(N-2)\, \hat f_{11}^{\chi}(N-1) - \nu(N-1)\, \hat f_{10}(N-1) - \mu(N-1)\, \hat f_{01}(N-1) \qquad (N \ge 1) ,
\end{aligned} \tag{5.2}
$$

$$
\begin{aligned}
\hat f_{11}(N) &= \mu\nu\, \hat f(N-2) + (N-2)\, \hat f(N-3) + (N-2)(N-3)\, \hat f_{11}(N-2) \\
&\quad - \nu(N-2)\, \hat f_{10}(N-2) - \mu(N-2)\, \hat f_{01}(N-2) \qquad (N \ge 2) ,
\end{aligned} \tag{5.3}
$$

$$
\hat f_{11}^{\chi}(N) = \hat f(N-2) + (N-2)\, \hat f(N-3) + (N-2)(N-3)\, \hat f_{11}^{\chi}(N-2) \qquad (N \ge 2) , \tag{5.4}
$$

$$
\hat f_{10}(N) = (N-1)\, \hat f_{01}(N-1) - \nu\, \hat f(N-1) \qquad (N \ge 1) , \tag{5.5}
$$

$$
\hat f_{01}(N) = (N-1)\, \hat f_{10}(N-1) - \mu\, \hat f(N-1) \qquad (N \ge 1) . \tag{5.6}
$$

Let

$$
\hat c(N) := \frac{\hat f(N)}{N!} \quad (N \ge 0)
\qquad\text{and}\qquad
\hat s(N) := \sum_{\substack{k=0,\dots,N \\ k \text{ even}}} \hat c(N-k) \quad (N \ge 0) .
$$

Lemma 5.2. The values $\hat c(N)$ satisfy the recursive equation

$$
\hat c(0) = 1 , \tag{5.7}
$$

$$
\begin{aligned}
N \hat c(N) &= \hat c(N-1) + N \cdot \hat c(N-2) \\
&\quad + \mu\nu \cdot \big( \hat s(N-1) + \hat s(N-3) \big) + (\mu^2+\nu^2) \cdot \hat s(N-2) \\
&\quad + (2b - \tfrac32) \cdot \big( \hat c(N-2) - \hat c(N-4) \big) \qquad (N \ge 1) ,
\end{aligned} \tag{5.8}
$$

where all terms $\hat c(\,\cdot\,)$ and $\hat s(\,\cdot\,)$ with a negative argument are taken to be zero.

Lemma 5.3. The exponential generating function $\hat F(x) := \sum_{N=0}^{\infty} \hat f(N)\, x^N / N!$ of the sequence $(\hat f(N))_{N \ge 0}$ is given by

$$
\hat F(x) = \frac{\exp\Big( \mu\nu \cdot \frac{x}{1-x^2} + \frac12 (\mu^2+\nu^2) \cdot \frac{x^2}{1-x^2} + b^* x^2 \Big)}{(1-x)^{3/2} \cdot (1+x)^{1/2}} ,
$$

where $b^* := b - \tfrac34$.

Note that Lemmas 5.2 and 5.3 are completely analogous to Lemmas 2.2 and 2.3, except that the sign associated with the factor $(\mu^2+\nu^2)$ is different. Clearly, this can also be effectuated by making the replacements $\mu \to -i\mu$ and $\nu \to +i\nu$ in Lemmas 2.2 and 2.3. This implies that the second-order correlation functions of the permanental polynomial and of the characteristic polynomial are related as follows:

Proposition 5.4. Under the moment conditions (1.1), we have

$$
\mathbb{E}\big( \operatorname{per}(X_N-\mu)\, \operatorname{per}(X_N-\nu) \big) = \mathbb{E}\big( \det(X_N + i\mu)\, \det(X_N - i\nu) \big) \tag{5.9}
$$

for all $N \ge 0$ and all $\mu, \nu \in \mathbb{C}$.

Proposition 5.4 generalizes a result by Fyodorov [Fy], who obtained Eq. (5.9) by a completely different approach for GUE random matrices (see Eq. (1.15) in Fyodorov [Fy]). For the latter Fyodorov [Fy] conjectured that, as the matrix size tends to infinity, the suitably rescaled roots of the permanental polynomial concentrate around the imaginary axis in the complex plane, with a semi-circular density profile.

Acknowledgements. We thank Mikhail Gordin for bringing the connection between the correlation function of the characteristic polynomial of the GUE and the sine kernel to our attention. Furthermore, we are indebted to an anonymous referee for pointing out the implications of our approach for permanental polynomials. His detailed suggestions have led to Sect. 5 of this paper.
References

[BDS] Baik, J., Deift, P., Strahov, E.: Products and ratios of characteristic polynomials of random hermitian matrices. J. Math. Phys. 44, 3657–3670 (2003)
[BS] Borodin, A., Strahov, E.: Averages of characteristic polynomials in random matrix theory. Comm. Pure Appl. Math. 59, 161–253 (2006)
[BH1] Brézin, E., Hikami, S.: Characteristic polynomials of random matrices. Commun. Math. Phys. 214, 111–135 (2000)
[BH2] Brézin, E., Hikami, S.: Characteristic polynomials of real symmetric random matrices. Commun. Math. Phys. 223, 363–382 (2001)
[Do] Doetsch, G.: Einführung in Theorie und Anwendung der Laplace-Transformation, 2nd edition. Basel: Birkhäuser Verlag, 1970
[Fo] Forrester, P.J.: Log Gases and Random Matrices. Book in progress, www.ms.unimelb.edu.au/~matpjf/matpjf.html, since 2005
[Fy] Fyodorov, Y.V.: On permanental polynomials of certain random matrices. Int. Math. Res. Not., 2006, Article ID 61570 (2006)
[FS] Fyodorov, Y.V., Strahov, E.: An exact formula for general spectral correlation function of random hermitian matrices. J. Phys. A: Math. Gen. 36, 3202–3213 (2003)
[KS] Keating, J.P., Snaith, N.C.: Random matrix theory and ζ(1/2 + it). Commun. Math. Phys. 214, 57–89 (2000)
[Me] Mehta, M.L.: Random Matrices, 3rd edition. Pure and Applied Mathematics, Vol. 142. Amsterdam: Elsevier, 2004
[MN] Mehta, M.L., Normand, J.-M.: Moments of the characteristic polynomial in the three ensembles of random matrices. J. Phys. A 34, 4627–4639 (2001)
[SF] Strahov, E., Fyodorov, Y.V.: Universal results for correlations of characteristic polynomials: Riemann-Hilbert approach. Commun. Math. Phys. 241, 343–382 (2003)
[Sz] Szegö, G.: Orthogonal Polynomials, 3rd edition. American Mathematical Society Colloquium Publications, Vol. XXIII. Providence, RI: Amer. Math. Soc., 1967
[Zh] Zhurbenko, I.G.: Certain moments of random determinants. Theory Prob. Appl. 13, 682–686 (1968)
Communicated by H. Spohn