Commun. Math. Phys. 186, 1-59 (1997)
Communications in
Mathematical Physics
© Springer-Verlag 1997
Meanders and the Temperley-Lieb Algebra
P. Di Francesco, O. Golinelli, E. Guitter*
Service de Physique Théorique, C.E.A. Saclay, F-91191 Gif sur Yvette Cedex, France
Received: 25 February 1996 / Accepted: 12 August 1996
Abstract: The statistics of meanders is studied in connection with the Temperley-Lieb algebra. Each (multi-component) meander corresponds to a pair of reduced elements of the algebra. The assignment of a weight q per connected component of meander translates into a bilinear form on the algebra, with a Gram matrix encoding the fine structure of meander numbers. Here, we calculate the associated Gram determinant as a function of q, and make use of the orthogonalization process to derive alternative expressions for meander numbers as sums over correlated random walks.
1. Introduction

The meander problem is one of those fundamental combinatorial problems with a simple formulation that resist repeated attempts to solve them. The problem is to count the number $M_n$ of meanders of order n, i.e. of inequivalent configurations of a closed non-self-intersecting loop crossing an infinite line through 2n points. The infinite line may be viewed as a river flowing from east to west, and the loop as a closed circuit crossing this river through 2n bridges. Two configurations are considered as equivalent if they are smooth deformations of one another. Apparently, the meander problem dates back to the work of Poincaré on differential geometry. Since then, it has arisen in various domains such as mathematics, physics, computer science [1] and fine arts [2]. In the late 80's, Arnold revived this problem in relation with Hilbert's 16th problem, concerning the enumeration of ovals of planar algebraic curves [3]. Meanders also emerged in the classification of 3-manifolds [4]. More recently, random matrix model techniques, borrowed from quantum field theory, were applied to this problem [5, 6]. As such, the meander problem seems to belong to the same class as large N QCD [7].

* E-mail: philippe,golinel,[email protected]
In a previous paper [6], we made our first incursion into the meander problem, in trying to solve the compact folding problem of a polymer chain. Considering indeed a long closed polymer chain of say 2n identical monomers, we ask the question of counting the inequivalent ways of folding the whole chain onto itself, forbidding interpenetration of monomers. By compact folding, we mean that all the monomers are packed on top of each other. Accordingly, folding is a simple realization of objects with self-avoiding constraints. The reader may bear in mind the simple image of the folding of a closed strip of 2n stamps, with all stamps piled up on top of each other [8, 9].
Fig. 1. A compactly folded polymer (a) with 2n = 6 monomers, and the associated meander (b), obtained by drawing a line (river) horizontally through the monomers. Each monomer becomes a bridge, and each hinge a segment of road between two bridges.

The equivalence between this folding problem and the meander problem may be seen as follows. As illustrated in Fig. 1, drawing a line (river) across the 2n constituents (bridges) of the folded polymer, and pulling them apart, produces a meander of order n. The folding of a closed polymer chain and the meander problem are therefore completely identical. By analogy, we were led to define the meander counterpart of the folding problem of an open polymer chain: the semi-meanders. The latter are defined in the same way as meanders, except that the river is now semi-infinite, i.e. it has a source, around which the semi-meander is allowed to wind freely. We denote by $\bar{M}_n$ the number of semi-meanders of order n, namely with n bridges.

In this paper, we reconsider the meander and semi-meander problems in the framework of the Temperley-Lieb algebra [10]. This is based on a one-to-one correspondence between (multicomponent) semi-meanders and reduced elements of the Temperley-Lieb algebra. Similarly, (multicomponent) meanders are associated to pairs of such elements. More precisely, the Temperley-Lieb algebra is endowed with a bilinear structure out of which a Gram matrix can be constructed. In our language, the bilinear form associates to each pair of elements of the algebra a weight $q^c$, where c denotes the number of connected components of the associated meander. In particular, the Gram matrix, as a polynomial of q, encodes all the relevant information about meander and semi-meander numbers. Here we obtain as a main result an exact compact expression for the determinant of the Gram matrix, referred to as the meander determinant.
Far from solving the question of enumerating meanders, this gives however some partial information on the problem, and produces an exact solution to a meander-flavored issue. This result is summarized in Eq. (5.6), and proved by explicit orthogonalization of the Gram matrix. In a second step, we make use of the precise form of the change of basis in the orthogonalization process to derive various expressions for the semi-meander (Eq. (6.62)) and meander
(Eq. (6.63)) numbers as statistical sums over paths, with an interpretation as Solid On Solid (SOS) model partition functions.

The paper is organized as follows. We start in Sect. 2 by giving basic definitions of (multi-component) meander (Eq. (2.1)) and semi-meander (Eq. (2.3)) numbers and associated polynomials in which a weight q is assigned to each connected component. The relation between (semi-)meanders and both arch configurations and walk diagrams is then discussed, and known results for $q = \pm 1$ are reviewed (Eqs. (2.6)-(2.8)). Various conjectured and/or numerical asymptotic behaviors for large n are given (Eqs. (2.11)-(2.18)). In Sect. 3, we introduce the Temperley-Lieb algebra $TL_n(q)$, and discuss its relation with walk diagrams and arch configurations, in one-to-one correspondence with reduced elements of the algebra. These reduced elements form a natural basis (basis 1) of $TL_n(q)$. The contact with meanders is made through the introduction of a trace and a bilinear form on $TL_n(q)$ (Eqs. (3.11) and (3.14)). When evaluated on pairs of reduced elements (of the basis 1), this form generates the Gram matrix (Eq. (3.15)), which encodes the fine structure of meander numbers. In Sect. 4, we make a change from basis 1 to a new basis 2, in which the Gram matrix is diagonal. This allows for the calculation of the Gram determinant as a function of q (Eq. (5.6)), and the identification of its zeros (Eq. (5.10)) and their multiplicities (Eq. (5.23)). These results, together with a complete combinatorial proof, are detailed in Sect. 5. The matrix for the change of basis $1 \to 2$ is studied in great detail in Sect. 6, where it is shown to obey a simple recursion relation (Eq. (6.29)). This equation is explicitly solved, in the form of matrix elements between two walk diagrams, factorized into a selection rule (with value 0 or 1, see Eq. (6.38)) multiplied by some weight, with a local dependence on the heights of the walk diagrams (Eq. (6.43)).
This leads to expressions for the meander and semi-meander polynomials as sums over selected walk diagrams (Eqs. (6.62) and (6.63)). Analogous formulas are derived within the framework of SOS models (Eq. (6.90)), leading to various conjectures as to the asymptotic form of the meander and semi-meander polynomials for $q \geq 2$. Section 7 is devoted to a refinement of the meander determinant for semi-meanders with fixed number of windings around the source of the river (Eq. (7.5)). A few concluding remarks are gathered in Sect. 8. Some technical ingredients are detailed in Appendices A, B and C.
2. Definitions
2.1. Meanders. A meander of order n is a planar configuration of a closed non-self-intersecting loop (road) crossing an infinite oriented line (river flowing from east to west) through 2n points (bridges). We denote by $M_n$ the number of topologically inequivalent meanders of order n. We extend the definition to a set of k roads (i.e., a meander with k possibly interlocking connected components). The number of meanders with k connected components is denoted by $M_n^{(k)}$. Note that necessarily $1 \leq k \leq n$. These numbers are summarized in the meander polynomial

$$ m_n(q) = \sum_{k=1}^{n} M_n^{(k)}\, q^k. \tag{2.1} $$

The various meanders corresponding to n = 2 are depicted in Fig. 2. They correspond to the polynomial

$$ m_2(q) = 2q + 2q^2. \tag{2.2} $$
Fig. 2. The four meanders of order n = 2, i.e. with 2n = 4 bridges. The first two have k = 1 connected component, the other two have k = 2 connected components.

The numbers $M_n^{(k)}$ are listed in [6] for $1 \leq k \leq n \leq 12$.
2.2. Semi-meanders. A semi-meander of order n is a planar configuration of a closed non-self-intersecting loop (road) crossing a semi-infinite line (river with a source) through n points (bridges). Note that, in a semi-meander, the road may wind around the source of the river. We denote by $\bar{M}_n$ the number of topologically inequivalent semi-meanders of order n, and by $\bar{M}_n^{(k)}$ the number of semi-meanders with k connected components, $1 \leq k \leq n$. We also have the semi-meander polynomial

$$ \bar{m}_n(q) = \sum_{k=1}^{n} \bar{M}_n^{(k)}\, q^k. \tag{2.3} $$

The various semi-meanders corresponding to n = 3 are depicted in Fig. 3. They correspond to the polynomial

$$ \bar{m}_3(q) = 2q + 2q^2 + q^3. \tag{2.4} $$

The numbers $\bar{M}_n^{(k)}$ are listed in [6] for $1 \leq k \leq n \leq 14$.

Fig. 3. The five semi-meanders of order n = 3, arranged according to their numbers k = 1, 2, 3 of connected components.
2.3. Arch configurations and (semi) meanders. A multicomponent meander may be viewed as the superimposition of two (top and bottom) arch configurations of order n, corresponding respectively to the configurations of the road on both sides of the river, as shown in Fig. 4. An arch configuration is simply a configuration of n planar non-intersecting arches (lying, say, above the river) linking the 2n bridges by pairs. The number of arch configurations of order n is given by the Catalan number

$$ c_n = \frac{(2n)!}{(n+1)!\, n!}. \tag{2.5} $$

The set of arch configurations of order n is denoted by $A_n$. As an immediate consequence, since arbitrary multicomponent meanders are obtained by superimposing arbitrary top and bottom arch configurations, we have

$$ m_n(1) = (c_n)^2. \tag{2.6} $$

Fig. 4. Any meander is obtained as the superimposition of a top (a) and a bottom (b) arch configuration of the same order (n = 5 here). An arch configuration is a planar pairing of the 2n bridges through n non-intersecting arches lying above the river (by convention, we represent the lower configuration b reflected with respect to the river).

Fig. 5. Any semi-meander may be viewed as a particular meander by opening the semi-infinite river as indicated by the arrows. This doubles the number of bridges in the resulting meander, hence the order is conserved (n = 5 here). By construction, the lower arch configuration of the meander is always a rainbow arch configuration of the same order. The number of connected components (k = 3 here) is conserved in the transformation.
As for semi-meanders, upon opening the semi-infinite river and doubling the bridges (cf. Fig. 5), they can also be viewed as the superimposition of a top arch configuration of order n, and of a particular bottom "rainbow" arch configuration (namely that linking the i-th bridge to the (2n + 1 - i)-th one, i = 1, 2, ..., n). Therefore arbitrary multicomponent semi-meanders may be obtained by superimposing an arbitrary arch configuration with a rainbow of order n, leading to

$$ \bar{m}_n(1) = c_n. \tag{2.7} $$

In ref. [6], we have also proved the following results

$$ m_n(-1) = \begin{cases} 0 & \text{if } n = 2p \\ -(c_p)^2 & \text{if } n = 2p+1 \end{cases}, \qquad \bar{m}_n(-1) = \begin{cases} 0 & \text{if } n = 2p \\ -c_p & \text{if } n = 2p+1 \end{cases}. \tag{2.8} $$

Note that the one-component meander and semi-meander numbers are recovered in the $q \to 0$ limit of respectively $m_n(q)/q$ and $\bar{m}_n(q)/q$.
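The superimposition picture behind (2.2), (2.4) and (2.6)-(2.8) can be checked by brute force for small n. The following Python sketch is only an illustration under stated assumptions, not the method of the paper: arch configurations are enumerated recursively, the loops of a top/bottom pair are counted by following the arches, and the function names (`arch_configurations`, `components`, `meander_polynomial`, `semi_meander_polynomial`, `evaluate`) are hypothetical choices made for this example.

```python
from itertools import product


def arch_configurations(n):
    """Return all arch configurations of order n, each as a dict bridge -> partner."""
    def build(points):
        # points: increasing list of bridges that still have to be paired
        if not points:
            return [{}]
        result = []
        first = points[0]
        # the partner of the first bridge must enclose an even number of bridges
        for k in range(1, len(points), 2):
            for inner in build(points[1:k]):
                for outer in build(points[k + 1:]):
                    pairing = {first: points[k], points[k]: first}
                    pairing.update(inner)
                    pairing.update(outer)
                    result.append(pairing)
        return result
    return build(list(range(1, 2 * n + 1)))


def components(top, bottom):
    """Count the closed loops formed by a top and a bottom arch configuration."""
    seen, loops = set(), 0
    for start in top:
        if start in seen:
            continue
        loops += 1
        bridge = start
        while bridge not in seen:
            seen.add(bridge)
            other = top[bridge]      # follow the arch above the river
            seen.add(other)
            bridge = bottom[other]   # return along the arch below the river
    return loops


def meander_polynomial(n):
    """Coefficients [M_n^(1), ..., M_n^(n)] of m_n(q)."""
    archs = arch_configurations(n)
    coeffs = [0] * n
    for top, bottom in product(archs, repeat=2):
        coeffs[components(top, bottom) - 1] += 1
    return coeffs


def semi_meander_polynomial(n):
    """Coefficients of the semi-meander polynomial (bottom configuration = rainbow)."""
    rainbow = {i: 2 * n + 1 - i for i in range(1, 2 * n + 1)}
    coeffs = [0] * n
    for top in arch_configurations(n):
        coeffs[components(top, rainbow) - 1] += 1
    return coeffs


def evaluate(coeffs, q):
    """Evaluate sum_k coeffs[k-1] * q**k."""
    return sum(c * q ** (k + 1) for k, c in enumerate(coeffs))
```

With these definitions, `meander_polynomial(2)` returns `[2, 2]` and `semi_meander_polynomial(3)` returns `[2, 2, 1]`, in agreement with (2.2) and (2.4). The cost grows like $(c_n)^2$, so the sketch is only practical for small orders.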
2.4. Walk diagrams. An arch configuration of order n may be viewed as a closed random walk of 2n steps on a semi-infinite line, or equivalently its two-dimensional extent, which we call a walk diagram, defined as follows$^1$. Let us first label the segments of river between consecutive bridges, namely the segment i lies between the i-th and the (i + 1)-th bridge, for i = 1, 2, ..., 2n - 1. Let us also label by 0 and 2n the semi-infinite portions of river respectively to the left of the first bridge and to the right of the last one. To each portion of river i, we attach a height $g_i$ equal to the number of arches passing at the vertical of i. The nonnegative integers $g_i$ satisfy the following conditions

$$ g_0 = g_{2n} = 0, \qquad g_{i+1} - g_i \in \{\pm 1\}, \quad i = 0, 1, ..., 2n-1. \tag{2.9} $$

The diagram formed by the broken line joining the successive points $(i, g_i)$, i = 0, 1, ..., 2n, is the walk diagram corresponding to the initial arch configuration. This diagram represents the two-dimensional extent of a walk of 2n steps on the semi-infinite line $g \geq 0$ starting and ending at its origin.
Fig. 6. A walk diagram of 18 steps, and the corresponding arch configuration. Each dot corresponds to a segment of river. The height on the walk diagram is given by the number of arches intersected by the vertical dotted line.
Conversely, any walk diagram of 2n steps, characterized by integer heights $g_i \geq 0$, i = 0, ..., 2n, satisfying (2.9), corresponds to a unique arch configuration of order n. To construct the arch configuration corresponding to a walk diagram, notice that, going from left to right along the river, whenever $g_i - g_{i-1} = 1$, a new arch originates from the bridge i, whereas when $g_i - g_{i-1} = -1$, an arch terminates at the bridge i. We denote by $W_n$ the set of walk diagrams of 2n steps. We have the identification

$$ W_n \equiv A_n. \tag{2.10} $$
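The identification (2.10) amounts to a pair of mutually inverse maps, which can be sketched as follows. This is a minimal illustration, assuming that an arch configuration is stored as a list of pairs (i, j) of bridge labels with i < j; the function names are hypothetical.

```python
def heights_from_arches(arches, n):
    """Heights [g_0, ..., g_2n] of the walk diagram of an arch configuration.

    arches is a collection of pairs (i, j) with i < j over the bridges 1..2n.
    """
    openings = {i for i, _ in arches}
    heights = [0]
    for bridge in range(1, 2 * n + 1):
        # an arch opening at this bridge raises the height, a closing one lowers it
        heights.append(heights[-1] + (1 if bridge in openings else -1))
    return heights


def arches_from_heights(heights):
    """Inverse map: recover the arches from heights satisfying (2.9)."""
    stack, arches = [], []
    for i in range(1, len(heights)):
        if heights[i] - heights[i - 1] == 1:
            stack.append(i)                  # a new arch originates at bridge i
        else:
            arches.append((stack.pop(), i))  # the innermost open arch ends at bridge i
    return sorted(arches)
```

The stack encodes planarity: an arch that terminates at bridge i is necessarily the most recently opened one, otherwise two arches would cross.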
In this paper, we will alternately use the arch configuration and walk diagram pictures.

$^1$ The walk diagrams are usually referred to as Dyck paths in the combinatorial literature.

2.5. Asymptotics. Earlier numerical work [9, 5, 6] suggests that the (one-component) meander and semi-meander numbers behave in the large n limit as respectively

$$ M_n \sim \frac{R^n}{n^{\alpha}}, \qquad \bar{M}_n \sim \frac{\bar{R}^n}{n^{\gamma}}, \tag{2.11} $$

with

$$ \bar{R} \simeq 3.5..., \qquad R = \bar{R}^2, \qquad \alpha = 7/2, \qquad \gamma = 2. \tag{2.12} $$
The values of the exponents $\alpha$ and $\gamma$ are conjectured to be exact. The relation $R = \bar{R}^2$ is a consequence of the polymer folding interpretation [6]: the entropy per monomer is the same for the open and closed polymer folding problems. Note however that the configuration exponents $\alpha$ and $\gamma$ depend on the boundary conditions (open or closed).

A natural quantity of interest for the study of semi-meanders is the winding, namely the number of times the road winds around the source of the river in the river/road picture of the semi-meander. In the arch configuration picture, the winding of a semi-meander is the number of arches of the upper configuration passing at the vertical of the middle point; representing the upper arch configuration as a walk diagram a, the winding of the semi-meander is simply $g_n^a$. Denoting by c(a) the number of connected components of the superimposition of the arch configuration a and of a rainbow configuration of order n, the average winding in semi-meanders of order n reads

$$ w_n(q) = \frac{\sum_{a \in W_n} g_n^a\, q^{c(a)}}{\sum_{a \in W_n} q^{c(a)}} \underset{n \to \infty}{\sim} n^{\nu(q)}, \tag{2.13} $$
where we have identified a winding exponent $\nu(q) \in [0, 1]$. In this paper, we give strong analytical evidence that $\nu(q) = 1$ for all $q > 2$. For $0 < q < 2$, numerical work seems to indicate that 1/2 [...] 2, where $\nu(q) = 1$. Indeed, it is easy to see that, for large q, $R(q)/\bar{R}(q)^2 \sim 4/q \to 0$, as $m_n(q) \sim c_n q^n \sim (4q)^n$ and $\bar{m}_{2n}(q) \sim q^{2n}$, hence $R(q) \sim 4q$ and $\bar{R}(q) \sim q$, whereas

$$ \alpha(\infty) = \frac{3}{2}, \qquad \gamma(\infty) = 0. \tag{2.18} $$
3. Temperley-Lieb Algebra and Meanders

3.1. The Temperley-Lieb algebra and arch configurations. The Temperley-Lieb algebra of order n and parameter q, denoted by $TL_n(q)$, is defined through its n generators $1, e_1, e_2, ..., e_{n-1}$ subject to the relations

$$ \begin{aligned} &\text{(i)} & e_i^2 &= q\, e_i, & i &= 1, 2, ..., n-1, \\ &\text{(ii)} & [e_i, e_j] &= 0 & &\text{if } |i - j| > 1, \\ &\text{(iii)} & e_i\, e_{i \pm 1}\, e_i &= e_i, & i &= 1, 2, ..., n-1. \end{aligned} \tag{3.1} $$
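This definition becomes clear in the "braid" pictorial representation, where the generators act on n parallel strings as follows:

[Eq. (3.2), diagrams not reproduced: the unit 1 is drawn as n parallel strings; the generator $e_i$ is drawn with the strings i and i+1 replaced by two arcs, one joining their left ends and one joining their right ends, all other strings going straight through.]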
and a product of elements is represented by the juxtaposition of the corresponding braid diagrams. The relation (ii) expresses the locality of the e's, namely that the e's commute whenever they involve distant strings. The relations (i) and (iii) read respectively

$$ \text{(i)} \quad e_i^2 = q\, e_i, \qquad\qquad \text{(iii)} \quad e_i\, e_{i+1}\, e_i = e_i, \tag{3.3} $$

each being drawn in the original as an equality between braid diagrams on the strings i, i+1 (and i+2 for (iii)) [diagrams not reproduced].
In the relation (i), the loop has been erased and replaced by the weight q. The relation (iii) is simply obtained by stretching the (i + 2)-th string.

3.2. The basis 1. The algebra $TL_n(q)$ is built out of arbitrary products of generators $e_i$. Up to numerical factors depending on q, any such product can be reduced by using the relations (i)-(iii). The algebra $TL_n(q)$, as a real vector space, is therefore naturally endowed with the basis formed by all the distinct reduced elements of the algebra. This basis will be referred to as basis 1 in the following (as opposed to the basis 2, defined in Sect. 4 below). For illustration, the reduced elements of $TL_3(q)$ read

$$ 1, \qquad e_1, \qquad e_2, \qquad e_1 e_2, \qquad e_2 e_1, \tag{3.4} $$

each drawn in the original as a braid diagram on three strings [diagrams not reproduced].

Fig. 7. The transformation of a reduced element of $TL_9(q)$ into an arch configuration of order 9. The reduced element reads $e_3 e_4 e_2 e_5 e_3 e_1 e_6 e_4 e_2$.
Let us now show that the reduced elements of $TL_n(q)$ are in one-to-one correspondence with arch configurations of order n. This is most clearly seen by considering the braid pictorial representation of a reduced element. Such a diagram has no internal loop (by virtue of (i)), and all its strings are stretched (using (iii)). As shown in Fig. 7, one can construct a unique arch configuration of order n by deforming the diagram so as to bring the 2n ends of the strings on a line. This deformation is invertible, and we conclude that, as a vector space, $TL_n(q)$ has dimension

$$ \dim(TL_n(q)) = c_n. \tag{3.5} $$
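The braid picture can be turned into a small diagram calculus that checks the relations (i)-(iii) and the dimension count (3.5) for small n. The sketch below is an illustration under assumptions of its own: a reduced element is stored as a perfect matching of the 2n string ends, labelled `('L', i)` and `('R', i)`, and a product is returned as a pair (number of erased loops, reduced diagram). None of these names or conventions come from the paper.

```python
def identity(n):
    """Unit of TL_n: string i joins the left end i to the right end i."""
    diagram = {}
    for i in range(1, n + 1):
        diagram[('L', i)] = ('R', i)
        diagram[('R', i)] = ('L', i)
    return diagram


def generator(n, i):
    """Generator e_i: arcs join the ends i and i+1 on each side."""
    diagram = identity(n)
    for side in 'LR':
        diagram[(side, i)] = (side, i + 1)
        diagram[(side, i + 1)] = (side, i)
    return diagram


def multiply(a, b, n):
    """Return (loops, c) such that a * b = q**loops * c, with a drawn to the left of b."""
    def relabel(diagram, left, right):
        names = {'L': left, 'R': right}
        return {(names[s], i): (names[t], j) for (s, i), (t, j) in diagram.items()}

    in_a = relabel(a, 'L', 'M')  # a joins outer left ends and middle points
    in_b = relabel(b, 'M', 'R')  # b joins middle points and outer right ends
    result, visited = {}, set()
    outer = [(side, i) for side in 'LR' for i in range(1, n + 1)]
    for start in outer:
        if start in result:
            continue
        point, use_a = start, start[0] == 'L'
        while True:
            point = (in_a if use_a else in_b)[point]
            if point[0] != 'M':
                break                # reached another outer end
            visited.add(point)
            use_a = not use_a        # cross the junction into the other diagram
        result[start], result[point] = point, start
    loops = 0
    for i in range(1, n + 1):
        point = ('M', i)
        if point in visited:
            continue
        loops += 1                   # an unvisited middle point lies on a closed loop
        while point not in visited:
            visited.add(point)
            point = in_a[point]
            visited.add(point)
            point = in_b[point]
    return loops, result


def reduced_elements(n):
    """All reduced elements, generated from 1 by left multiplication with the e_i."""
    start = identity(n)
    seen = {frozenset(start.items())}
    frontier = [start]
    while frontier:
        x = frontier.pop()
        for i in range(1, n):
            _, y = multiply(generator(n, i), x, n)
            key = frozenset(y.items())
            if key not in seen:
                seen.add(key)
                frontier.append(y)
    return seen
```

Since every word in the generators is obtained from the unit by successive left multiplications, the closure computed by `reduced_elements` exhausts the reduced elements, and its size can be compared with the Catalan number $c_n$.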
The basis 1 is best expressed in the language of walk diagrams. The walk diagrams of 2n steps are arranged according to their middle height $g_n = h$, where $h = n - 2p$, $0 \leq p \leq n/2$. For each value of h, the basic reduced element

$$ f_h^{(n)} = e_1 e_3 e_5 \cdots e_{2p-1}, \qquad f_n^{(n)} = 1 \tag{3.6} $$

corresponds to the lowest walk diagram $\mathcal{W}_h^{(n)}$ with middle height h, namely

[Eq. (3.7), diagram not reproduced: the walk diagram $\mathcal{W}_h^{(n)}$, drawn over the abscissas 0, 2, 4, ..., 2p, ..., n, ..., 2(n-p), ..., 2n]

with

$$ \begin{aligned} &g_0 = g_2 = \cdots = g_{2p} = 0, \qquad g_1 = g_3 = \cdots = g_{2p-1} = 1, \\ &g_{2p+j} = j, \quad j = 1, 2, ..., h, \qquad g_{2n-j} = g_j, \quad j = 0, 1, 2, ..., n. \end{aligned} \tag{3.8} $$
Fig. 8. An example of allowed left multiplication by $e_i$. The initial walk diagram must have a minimum at the vertical of the point i. This operation adds a box to the walk diagram at the vertical of the point i < n.
It is then easy to see that any reduced element corresponding to a walk diagram with middle height $g_n = h$ is obtained by repeated appropriate multiplications to the left or to the right of $f_h^{(n)}$ by e's. The walk diagrams of middle height h are constructed uniquely by adding "boxes" to the diagram $\mathcal{W}_h^{(n)}$. As illustrated in Fig. 8, adding a box to a diagram $\mathcal{W}$ at the vertical of the point i is allowed only if i is a minimum of $\mathcal{W}$, namely $g_{i+1} = g_{i-1} = g_i + 1$, in which case the new diagram, with the box added, has $g_i \to g_i + 2$. For the associated basis 1 elements, this addition of a box corresponds to the left (resp. right) multiplication by $e_i$ (resp. $e_{2n-i}$) when i < n (resp. i > n). This does not affect the middle height $g_n = h$. For illustration, we list the elements of the basis 1 for $TL_3(q)$ together with the corresponding walk diagram (the middle height $g_3$ takes only the values 1 (in 4 diagrams) and 3 (in 1 diagram))

$$ e_1 = f_1^{(3)}, \qquad e_2 e_1 = e_2 f_1^{(3)}, \qquad e_1 e_2 = f_1^{(3)} e_2, \qquad e_2 = e_2 f_1^{(3)} e_2, \qquad 1 = f_3^{(3)}, \tag{3.9} $$

each paired in the original with its walk diagram [diagrams not reproduced].
To avoid later confusion (with the basis 2), we will denote by $(a)_1$ the basis 1 element$^3$ corresponding to the walk diagram (or arch configuration) $a \in W_n$ ($\equiv A_n$).
3.3. Scalar product and meanders. The standard scalar product on $TL_n(q)$ is defined as follows. First one introduces a trace over $TL_n(q)$. From the relation (i) of (3.1), we see that in any element e of $TL_n(q)$ each closed loop may be erased and replaced by a prefactor q. Taking the trace of a basis 1 element e corresponds to identifying the left and right ends of each string as in Fig. 9, and assigning an analogous factor to each closed loop, which results in a factor

$$ \mathrm{Tr}(e) = q^{c(e)}, \tag{3.10} $$

where c(e) is the number of connected components of the closure of e. The definition of the trace is extended to any linear combination of basis elements by linearity. Note that, with this definition, the trace is cyclic, namely Tr(ef) = Tr(fe). In the arch configuration picture, c(e) is easily identified as the number of connected components of the semi-meander obtained by superimposing the arch configuration a corresponding to e and the rainbow of order n: indeed, the rainbow connects the i-th bridge to the (2n + 1 - i)-th, which exactly corresponds to the above identification of string ends. In particular, this allows us to identify the semi-meander polynomial (2.3) as

$$ \bar{m}_n(q) = \sum_{e \,\in\, \text{basis 1}} q^{c(e)} = \sum_{a \in W_n} \mathrm{Tr}\big((a)_1\big). \tag{3.11} $$

Fig. 9. The trace of an element $e \in TL_6(q)$ is obtained by identifying the left and right ends of its strings (dashed lines). In the arch configuration picture, this amounts to closing the upper configuration by a rainbow of order 6. The corresponding semi-meander has 3 connected components, hence $\mathrm{Tr}(e) = q^3$.

$^3$ This notation will become clear when we introduce the basis 2. Indeed, the basis 2 elements will be indexed by the same walk diagrams ($(a)_2$), but will represent different combinations of products of e's, hence $(a)_2 \neq (a)_1$ in general.
We also define the transposition on $TL_n(q)$, by its action on the generators $e_i^t = e_i$, and the relation $(ef)^t = f^t e^t$ for any $e, f \in TL_n(q)$. The definition extends to real linear combinations by $(\lambda e + \mu f)^t = \lambda e^t + \mu f^t$. In the arch configuration picture, this corresponds to the reflection $i \to (2n + 1 - i)$ of the bridges. In the walk diagram picture, this is the reflection $i \to (2n - i)$.

For any two elements e and $f \in TL_n(q)$, the scalar product is defined as

$$ (e, f) = \mathrm{Tr}(e f^t). \tag{3.12} $$

This has a simple interpretation in terms of meanders. We have indeed, for e and f in the basis 1,

$$ (e, f) = q^{c(e,f)} = q^{c(a,b)}, \tag{3.13} $$

where c(e, f) = c(a, b) is the number of connected components of the meander obtained by superimposing the a and b arch configurations corresponding respectively to e and f (see Fig. 10 for an example). This allows us to identify the meander polynomial as

$$ m_n(q) = \sum_{a,b \in A_n} q^{c(a,b)} = \sum_{a,b \in W_n} \big((a)_1, (b)_1\big). \tag{3.14} $$

Note that (e, 1) = Tr(e), hence the semi-meander expression (3.11) corresponds to taking $(b)_1 = 1$ in the above and summing over $a \in W_n$ only. This agrees with the above-mentioned fact that the semi-meanders are particular meanders, namely with lower arch configuration fixed to be a rainbow. Indeed, the unit $1 \in TL_n(q)$ corresponds in the arch configuration picture to the rainbow of order n, $(r_n)_1 = 1$.

Fig. 10. The scalar product (e, f) is obtained by first multiplying e with $f^t$, and then identifying the left and right ends of the strings (by the dashed lines). Here we have $(e, f) = q^3$. The corresponding meander is obtained by superimposition of the upper arch configuration a corresponding to e and lower arch configuration b corresponding to f (the transposition of f is crucial to recover b as lower arch configuration). Here the meander has c(a, b) = c(e, f) = 3 connected components.
3.4. Gram matrix. The Gram matrix $\mathcal{G}_n(q)$ of the basis 1 of $TL_n(q)$ is the $c_n \times c_n$ symmetric matrix with entries equal to the scalar products of the basis elements, namely

$$ \big[\mathcal{G}_n(q)\big]_{a,b} = \big((a)_1, (b)_1\big) = q^{c(a,b)} \qquad \forall\, a, b \in A_n \equiv W_n. \tag{3.15} $$

For instance, $\mathcal{G}_3(q)$ reads, in the basis 1 (3.9):

$$ \mathcal{G}_3(q) = \begin{pmatrix} q^3 & q^2 & q^2 & q & q^2 \\ q^2 & q^3 & q & q^2 & q \\ q^2 & q & q^3 & q^2 & q \\ q & q^2 & q^2 & q^3 & q^2 \\ q^2 & q & q & q^2 & q^3 \end{pmatrix}. \tag{3.16} $$

The meander and semi-meander polynomials are easily expressed in terms of the Gram matrix. Arranging the elements of basis 1 by growing middle height of the walk diagrams (in particular, the unit 1 is the last element), and defining the $c_n$-dimensional vectors

$$ \vec{u} = (1, 1, 1, ..., 1), \qquad \vec{v} = (0, 0, ..., 0, 1), \tag{3.17} $$

we have

$$ m_n(q) = \vec{u} \cdot \mathcal{G}_n(q)\, \vec{u}, \qquad \bar{m}_n(q) = \vec{v} \cdot \mathcal{G}_n(q)\, \vec{u}, \tag{3.18} $$

where $\vec{x} \cdot \vec{y}$ denotes the ordinary Euclidean scalar product of $\mathbb{R}^{c_n}$. Moreover, we also have

$$ m_n(q^2) = \mathrm{tr}\big(\mathcal{G}_n(q)^2\big). \tag{3.19} $$

The Gram matrix $\mathcal{G}_n(q)$ therefore contains all the information we need about meanders. The remainder of the paper is devoted to a thorough study of this matrix and of the consequences for meanders.
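The relations between the Gram matrix and the two polynomials can be illustrated numerically. The sketch below is an assumption-laden example rather than the construction used in the paper: it lists the walk diagrams by brute force, sorts them by growing middle height (so that the rainbow, i.e. the unit, comes last), converts each to an arch configuration, and counts the connected components of each top/bottom pair with a union-find. All function names are hypothetical.

```python
from itertools import product


def walk_diagrams(n):
    """All height sequences (g_0, ..., g_2n) obeying (2.9), by growing middle height."""
    walks = []
    for steps in product((1, -1), repeat=2 * n):
        heights = [0]
        for step in steps:
            heights.append(heights[-1] + step)
            if heights[-1] < 0:
                break
        else:
            if heights[-1] == 0:
                walks.append(tuple(heights))
    return sorted(walks, key=lambda g: g[n])


def pairing(heights):
    """Arch configuration of a walk diagram, as a dict bridge -> partner."""
    stack, partner = [], {}
    for i in range(1, len(heights)):
        if heights[i] > heights[i - 1]:
            stack.append(i)
        else:
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def components(top, bottom):
    """Number of loops of the superimposition, via union-find on the bridges."""
    parent = {i: i for i in top}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for arches in (top, bottom):
        for i, j in arches.items():
            parent[find(i)] = find(j)
    return len({find(i) for i in top})


def gram_exponents(n):
    """Matrix of the exponents c(a, b), so that the (a, b) Gram entry is q**c(a, b)."""
    archs = [pairing(g) for g in walk_diagrams(n)]
    return [[components(a, b) for b in archs] for a in archs]
```

Summing `q**c` over all entries of `gram_exponents(n)` gives $m_n(q)$, summing over the last row gives $\bar{m}_n(q)$, and summing `q**(2*c)` gives the trace of the squared Gram matrix, as in (3.18) and (3.19). The ordering inside a given middle height is whatever the enumeration produces, so the rows need not appear in the order of (3.9).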
4. The Basis 2
The multiplication of elements of the basis 1 involves many reductions, and therefore is quite complicated. In this section, we describe another basis for $TL_n(q)$, which we refer to as basis 2, in which the products of basis elements are trivialized, namely the product of any two basis 2 elements is either 0 or equal to another basis element. This second basis, described in detail in [11], will be instrumental in writing alternative expressions of the meander and semi-meander polynomials.

4.1. Definition of the basis 2. We need a few preliminary definitions. The Chebishev polynomials of the second kind are defined by the initial data $U_0(x) = 1$ and $U_1(x) = x$ and the recursion relation

$$ U_{n+1}(x) = x\, U_n(x) - U_{n-1}(x), \tag{4.1} $$

or equivalently by

$$ U_n\!\left(z + \frac{1}{z}\right) = \frac{z^{n+1} - z^{-n-1}}{z - z^{-1}}. \tag{4.2} $$

We also introduce the fractions

$$ \mu_n = \frac{U_{n-1}(q)}{U_n(q)}, \tag{4.3} $$

subject to the recursion relation

$$ \frac{1}{\mu_{n+1}} = \frac{1}{\mu_1} - \mu_n. \tag{4.4} $$
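The definitions (4.1)-(4.4) are easy to check with exact rational arithmetic. The following sketch assumes that q is given as an integer or a `Fraction` that is not a zero of the Chebishev polynomial in the denominator; the function names `chebyshev_u` and `mu` are hypothetical. It uses $1/\mu_1 = U_1(q)/U_0(q) = q$, so that (4.4) reads $1/\mu_{n+1} = q - \mu_n$.

```python
from fractions import Fraction


def chebyshev_u(n, x):
    """U_n(x) from U_0 = 1, U_1 = x and U_{k+1} = x U_k - U_{k-1}, as in (4.1)."""
    previous, current = 1, x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, x * current - previous
    return current


def mu(n, q):
    """The fraction mu_n = U_{n-1}(q) / U_n(q) of (4.3), for n >= 1.

    Raises ZeroDivisionError when q is a zero of U_n.
    """
    return Fraction(chebyshev_u(n - 1, q), chebyshev_u(n, q))
```

For instance, at q = 3 the first polynomials take the values 1, 3, 8, 21, 55, and `mu(2, 3)` equals 3/8.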
To describe the basis 2, we use a walk diagram picture analogous to that for basis 1. Each basis element will be attached to a walk diagram of 2n steps. As in the case of basis 1, we start from the definition of the fundamental element $\varphi_h^{(n)}$, corresponding to $\mathcal{W}_h^{(n)}$, the lowest walk diagram with middle height $g_n = h = n - 2p$ (3.7), namely

$$ \varphi_h^{(n)} = (\mu_1)^p\, e_1 e_3 \cdots e_{2p-1}\, E_h(e_{2p+1}, e_{2p+2}, ..., e_{n-1}), \tag{4.5} $$

where the elements $E_h$ are defined recursively by

$$ \begin{aligned} E_0 &= E_1 = 1, \\ E_{h+1}(e_i, e_{i+1}, ..., e_{i+h-1}) &= E_h(e_i, e_{i+1}, ..., e_{i+h-2})\, (1 - \mu_h\, e_{i+h-1})\, E_h(e_i, e_{i+1}, ..., e_{i+h-2}). \end{aligned} \tag{4.6} $$

For instance, we have

$$ \begin{aligned} E_2(e_i) &= 1 - \mu_1 e_i, \\ E_3(e_i, e_{i+1}) &= (1 - \mu_1 e_i)(1 - \mu_2 e_{i+1})(1 - \mu_1 e_i) = 1 - \mu_2 (e_i + e_{i+1}) + \mu_1 \mu_2 (e_i e_{i+1} + e_{i+1} e_i). \end{aligned} \tag{4.7} $$
Note that $E_h$ is a projector$^4$ ($E_h^2 = E_h$), and that the normalization factor in (4.5) ensures that $\varphi_h^{(n)}$ is a projector too. In a second step, we construct the other basis elements corresponding to walk diagrams with middle height h. The latter are obtained by repeated left and right additions of boxes on the basic diagram $\mathcal{W}_h^{(n)}$. To define the corresponding basis 2 elements, it is sufficient to give the multiplication rule corresponding to a box addition (see Fig. 8). The rule reads as follows. If a box is added on a minimum ($g_{i+1} = g_{i-1} = g_i + 1$) of the walk diagram at the vertical of the point i < n (resp. 2n - i > n), the corresponding basis element is multiplied to the left (resp. right) by the quantity

$$ \sqrt{\frac{\mu_{g+2}}{\mu_{g+1}}}\; \big(e_i - \mu_{g+1}\big), \tag{4.8} $$

where g denotes the height of the minimum on which the box is added. Applying these rules in the case of $TL_3(q)$, we find the following basis 2 elements, each labelled here by the heights $(g_0, ..., g_6)$ of its walk diagram (drawn in the original):

$$ \begin{aligned} (0,1,0,1,0,1,0):&\quad \varphi_1^{(3)} = \mu_1 e_1, \\ (0,1,2,1,0,1,0):&\quad \sqrt{\tfrac{\mu_2}{\mu_1}}\,(e_2 - \mu_1)\, \mu_1 e_1 = \sqrt{\mu_1 \mu_2}\,(e_2 e_1 - \mu_1 e_1), \\ (0,1,0,1,2,1,0):&\quad \mu_1 e_1\, \sqrt{\tfrac{\mu_2}{\mu_1}}\,(e_2 - \mu_1) = \sqrt{\mu_1 \mu_2}\,(e_1 e_2 - \mu_1 e_1), \\ (0,1,2,1,2,1,0):&\quad \mu_2 \big(e_2 - \mu_1 (e_1 e_2 + e_2 e_1) + \mu_1^2 e_1\big), \\ (0,1,2,3,2,1,0):&\quad \varphi_3^{(3)} = E_3(e_1, e_2) = 1 - \mu_2 (e_1 + e_2) + \mu_1 \mu_2 (e_1 e_2 + e_2 e_1). \end{aligned} \tag{4.9} $$

$^4$ This is easily proved by recursion on h, by simultaneously proving that $E_h^2 = E_h$ and $\big(E_h(e_i, ..., e_{i+h-2})\, e_{i+h-1}\big)^2 = \mu_h^{-1}\, E_h(e_i, ..., e_{i+h-2})\, e_{i+h-1}$.
4.2. Properties of the basis 2. The construction of the basis 2 basic elements $\varphi_h^{(n)}$ is entirely dictated by the requirement that

$$ e_j\, E_h(e_i, e_{i+1}, ..., e_{i+h-2}) = 0 \qquad \text{for } j = i, i+1, ..., i+h-2. \tag{4.10} $$

These relations were indeed used in [11] as a defining property for the $E_h$'s. The multiplication rule (4.8) ensures that whenever the multiplication by $e_i$ acts on a slope of the corresponding walk diagram (i.e., when $g_{i+1} + g_{i-1} - 2 g_i = 0$), the result vanishes. In other words,

$$ e_i\, (a)_2 = 0 \qquad \text{whenever } g_{i+1} + g_{i-1} - 2 g_i = 0. \tag{4.11} $$

These rules are also responsible for the following main property of the basis 2 elements. To write it explicitly, we need a more detailed notation for the walk diagrams of middle height $g_n = h$, and the associated basis 2 elements. Such a diagram will be denoted a = lr, where l (resp. r) denotes the left (resp. right) half of the walk diagram, with i = 0, 1, ..., n (resp. i = 2n, 2n - 1, ..., n), namely

$$ l = \{(i, g_i)\}, \qquad r = \{(i, g_{2n-i})\} \tag{4.12} $$

for i = 0, 1, 2, ..., n. Note that l is read from left to right on a and that r is read from right to left. Moreover,

$$ (lr)^t = (rl). \tag{4.13} $$

Both half-walks start at height $g_0 = g_{2n} = 0$ and end at height $g_n = h$. To avoid confusion, we will denote the corresponding basis 1, 2 elements by $(lr)_1$, $(lr)_2$ respectively. The main property satisfied by the basis 2 elements reads, for any elements $(a)_2$, $(a')_2$ of the basis 2, a = lr and a' = l'r':

$$ (lr)_2\, (l'r')_2 = \delta_{r,l'}\, (lr')_2. \tag{4.14} $$

From this relation, we learn that all the self-transposed elements (i.e., with $(a)_2 = (a)_2^t$), namely those attached to symmetric walk diagrams (i.e., with l = r), are projectors. In particular, we recover the fact that $\varphi_h^{(n)} = (\mathcal{W}_h^{(n)})_2$ is a projector. As we shall see in the next section, the relation (4.14) implies also that the basis 2 is orthogonal with respect to the scalar product (3.12).
5. The Meander Determinant
5.1. The Gram matrix for basis 2. Thanks to the main property (4.14), the Gram matrix $\Gamma_n(q)$ of the basis 2 elements takes a particularly simple diagonal form. Its $c_n \times c_n$ entries read

$$ \big[\Gamma_n(q)\big]_{a,a'} = \big((a)_2, (a')_2\big). \tag{5.1} $$

Let us compute the scalar product

$$ \begin{aligned} \big((a)_2, (a')_2\big) &= \mathrm{Tr}\big((lr)_2\, (l'r')_2^t\big) = \mathrm{Tr}\big((lr)_2\, (r'l')_2\big) = \delta_{r,r'}\, \mathrm{Tr}\big((ll')_2\big) \\ &= \mathrm{Tr}\big((l'r')_2^t\, (lr)_2\big) = \mathrm{Tr}\big((r'l')_2\, (lr)_2\big) = \delta_{l,l'}\, \mathrm{Tr}\big((r'r)_2\big) \\ &= \delta_{a,a'}\, \mathrm{Tr}\big((a)_2\, (a)_2^t\big) \end{aligned} \tag{5.2} $$

by direct application of (4.14) and use of the cyclicity of the trace and of (4.13). Hence the matrix $\Gamma_n(q)$ is diagonal. Moreover

$$ \mathrm{Tr}\big((a)_2\, (a)_2^t\big) = \mathrm{Tr}\big((rr)_2\big) = \mathrm{Tr}\big((ll)_2\big) \tag{5.3} $$

holds for any r, l, and does not depend on the half-path r of final height $g_n = h$. It may be evaluated on the left half-path $\rho_h$ corresponding to the walk diagram $\mathcal{W}_h^{(n)}$ of (3.7). A simple calculation shows that

$$ \mathrm{Tr}\big((\rho_h \rho_h)_2\big) = \mathrm{Tr}\big(\varphi_h^{(n)}\big) = U_h(q), \tag{5.4} $$

where U denotes the Chebishev polynomial (4.1). Hence $\Gamma_n(q)$ is simply the diagonal matrix with the $c_n$ entries

$$ \big[\Gamma_n(q)\big]_{a,a} = U_{g_n^a}(q), \tag{5.5} $$

where $g_n^a$ denotes the middle height of the walk diagram a. We conclude that the basis 2 is orthogonal with respect to the scalar product ( , ).
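For n = 3 the statement that the diagonal entry only depends on the middle height can be checked by hand or with a few lines of code. The sketch below is a hedged illustration: it takes as input the traces of the five reduced elements of $TL_3(q)$, namely $\mathrm{Tr}(1) = q^3$, $\mathrm{Tr}(e_1) = \mathrm{Tr}(e_2) = q^2$, $\mathrm{Tr}(e_1 e_2) = \mathrm{Tr}(e_2 e_1) = q$ (values worked out for this example from $\mathrm{Tr}(e) = q^{c(e)}$), and the three self-transposed basis 2 elements written as linear combinations of reduced elements, with $\mu_1 = 1/q$ and $\mu_2 = q/(q^2 - 1)$. The dictionary encoding and the function names are assumptions of the sketch.

```python
from fractions import Fraction


def basis1_traces(q):
    """Traces q**c(e) of the five reduced elements of TL_3(q)."""
    return {'1': q ** 3, 'e1': q ** 2, 'e2': q ** 2, 'e1e2': q, 'e2e1': q}


def trace(element, q):
    """Trace of a linear combination {word: coefficient} of basis 1 elements."""
    traces = basis1_traces(q)
    return sum(coefficient * traces[word] for word, coefficient in element.items())


def symmetric_basis2_elements(q):
    """The three self-transposed basis 2 elements of TL_3(q), expanded on basis 1."""
    q = Fraction(q)
    mu1 = 1 / q
    mu2 = q / (q * q - 1)
    phi_1 = {'e1': mu1}                                  # middle height 1
    boxed = {'e2': mu2, 'e1e2': -mu1 * mu2,              # middle height 1,
             'e2e1': -mu1 * mu2, 'e1': mu1 ** 2 * mu2}   # one box on each side
    phi_3 = {'1': Fraction(1), 'e1': -mu2, 'e2': -mu2,   # middle height 3
             'e1e2': mu1 * mu2, 'e2e1': mu1 * mu2}
    return phi_1, boxed, phi_3
```

Both elements of middle height 1 have trace $U_1(q) = q$, while the element of middle height 3 has trace $U_3(q) = q^3 - 2q$, which is the content of (5.3)-(5.5) in this small case.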
5.2. Main result. This remarkable property of the basis 2 will enable us to compute the determinant $D_n(q)$ of the Gram matrix $\mathcal{G}_n(q)$ for the basis 1, also referred to as meander determinant. The result reads$^5$

$$ D_n(q) = \det\big(\mathcal{G}_n(q)\big) = \prod_{i=1}^{n} U_i(q)^{a_{n,i}}, \qquad a_{n,i} = \binom{2n}{n-i} - 2 \binom{2n}{n-i-1} + \binom{2n}{n-i-2}, \tag{5.6} $$

where $U_i(q)$ are the Chebishev polynomials (4.1), and we use the convention that $\binom{2n}{j} = 0$ if j < 0. For instance, the determinant of the matrix $\mathcal{G}_3(q)$ (3.16) reads

$$ D_3(q) = U_1(q)^4\, U_2(q)^4\, U_3(q) = q^5 (q^2 - 1)^4 (q^2 - 2). \tag{5.7} $$
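The example (5.7) can be confirmed by an exact numerical evaluation. The sketch below relies on two assumptions that should be read as such: the exponents are taken to be $a_{n,i} = \binom{2n}{n-i} - 2\binom{2n}{n-i-1} + \binom{2n}{n-i-2}$ (with vanishing binomials for negative lower index), which reproduces the exponents 4, 4, 1 of (5.7), and the table of exponents c(a, b) of the 5 x 5 Gram matrix for n = 3 is entered by hand in the basis ordering $e_1, e_2 e_1, e_1 e_2, e_2, 1$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def chebyshev_u(n, x):
    """U_n(x) from the recursion U_{k+1} = x U_k - U_{k-1}, U_0 = 1, U_1 = x."""
    previous, current = 1, x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, x * current - previous
    return current


def multiplicity(n, i):
    """Assumed exponent a_{n,i} of U_i(q), with binomial(2n, j) = 0 for j < 0."""
    def binomial(j):
        return comb(2 * n, j) if j >= 0 else 0
    return binomial(n - i) - 2 * binomial(n - i - 1) + binomial(n - i - 2)


def meander_determinant(n, q):
    """Product of the U_i(q)**a_{n,i}, evaluated at a number q."""
    value = 1
    for i in range(1, n + 1):
        value *= chebyshev_u(i, q) ** multiplicity(n, i)
    return value


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(rows), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return det


# Exponents c(a, b) for n = 3, entered by hand (assumed ordering: e1, e2 e1, e1 e2, e2, 1)
G3_EXPONENTS = [
    [3, 2, 2, 1, 2],
    [2, 3, 1, 2, 1],
    [2, 1, 3, 2, 1],
    [1, 2, 2, 3, 2],
    [2, 1, 1, 2, 3],
]


def gram_3(q):
    """The Gram matrix for n = 3 evaluated at a number q."""
    return [[q ** c for c in row] for row in G3_EXPONENTS]
```

Comparing `determinant(gram_3(q))` with `meander_determinant(3, q)` at a few integer values of q gives a quick consistency check between the matrix and the product formula; it is of course not a proof.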
5 Ref. [4] presents a recursive algorithm for computing this determinant, which relies on direct manipulations of lines and columns of $\mathcal{G}_n$. The main result of [4] is the identification of the zeros of $D_n(q)$. Here we also give their multiplicities.

As a nontrivial check, let us first compute the degree of $D_n(q)$ as a polynomial in $q$:

$$\deg\big(D_n(q)\big) = \sum_{i=1}^{n} i\, a_{n,i} = \binom{2n}{n-1} = n\, c_n, \qquad (5.8)$$
which is in agreement with the definition of the Gram matrix $\mathcal{G}_n$: the term with highest degree in the expansion of the determinant comes from the product of the diagonal elements of $\mathcal{G}_n$, namely

$$\prod_{a \in W_n} q^{c(a,a)} = \prod_{a \in W_n} q^{n} = q^{n c_n}, \qquad (5.9)$$

as all the meanders with identical top and bottom arch configurations have the maximal number $n$ of connected components.
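For small $n$ the main result can also be checked by brute force: build the Gram matrix with entries $q^{c(a,b)}$ by explicitly counting the loops of each meander, and compare its determinant with the product formula. The sketch below is only an illustration under stated assumptions: arch configurations are encoded as Dyck words, the Chebyshev recursion $U_{k+1} = qU_k - U_{k-1}$ with $U_0 = 1$, $U_1 = q$ is assumed for (4.1), and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def dyck_words(n):
    """All sequences of n up-steps (+1) and n down-steps (-1) with partial sums >= 0."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map (0-based bridges): each up-step is linked to its matching down-step."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def components(upper, lower):
    """Number of closed loops obtained by superimposing two arch configurations."""
    seen, count = set(), 0
    for start in range(len(upper)):
        if start in seen:
            continue
        count += 1
        j = start
        while j not in seen:  # follow upper arch, then lower arch, until the loop closes
            seen.add(j)
            j = upper[j]
            seen.add(j)
            j = lower[j]
    return count


def gram_matrix(n, q):
    """Matrix with entries q^{c(a,b)} over all arch configurations of order n."""
    configs = [arches(w) for w in dyck_words(n)]
    return [[q ** components(a, b) for b in configs] for a in configs]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def product_formula(n, q):
    """prod_i U_i(q)^{a_{n,i}}, with the ASSUMED recursion U_{k+1} = q U_k - U_{k-1}."""
    binom = lambda m, j: comb(m, j) if 0 <= j <= m else 0
    U = [Fraction(1), Fraction(q)]
    while len(U) <= n:
        U.append(q * U[-1] - U[-2])
    result = Fraction(1)
    for i in range(1, n + 1):
        a = binom(2 * n, n - i) - 2 * binom(2 * n, n - i - 1) + binom(2 * n, n - i - 2)
        result *= U[i] ** a
    return result
```

Comparing `determinant(gram_matrix(n, q))` with `product_formula(n, q)` at a few integer values of $q$ gives an independent check for $n \le 4$; the matrix size $c_n$ grows too fast for this naive approach to go much further.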
5.3. The zeros of the meander determinant and their multiplicities. Before going into the proof of the formula (5.6), let us describe a few consequences of this result. The zeros $z_{k,l}$ of the polynomial $D_n(q)$ are those of the $U_k(q)$, for $k = 1, ..., n$, namely, using (4.2),

$$z_{k,l} = 2\cos\Big(\pi \frac{l}{k+1}\Big), \qquad 1 \le l \le k \le n, \qquad (5.10)$$

hence we may rewrite

$$D_n(q) = \prod_{k=1}^{n} \prod_{l=1}^{k} (q - z_{k,l})^{a_{n,k}}. \qquad (5.11)$$
... $\ell_i \ge 0$, for $i = 0, 1, ..., n$. The walk diagrams of middle height $h = n - 2p$ are simply obtained by taking arbitrary left and right halves of final height $h$, hence their number is $(b_{n,n-2p})^2$. The number $b_{n,n-2p}$ is obtained by subtracting from $\binom{n}{p}$, the total number of unconstrained walks with $\ell_0 = 0$ and $\ell_n = h$, the number of those which touch the line $\ell = -1$, namely $\binom{n}{p-1}$. Indeed, by a simple reflection (mirror image) with respect to the line $\ell = -1$ of the portion of walk between its origin and the first encounter with $\ell = -1$, we get a one-to-one mapping with unconstrained walks such that $\ell_0 = -2$ and $\ell_n = h$; the number of such walks is $\binom{n}{p-1}$. Hence we have

$$b_{n,n-2p} = \binom{n}{p} - \binom{n}{p-1}. \qquad (5.52)$$
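The reflection argument above can be illustrated by comparing the closed form $\binom{n}{p} - \binom{n}{p-1}$ with a direct enumeration of half-walks. This is a small sketch for illustration only; the function names are not from the text, and the enumeration is exponential in $n$, so it is meant for small $n$.

```python
from itertools import product
from math import comb


def b_reflection(n, h):
    """Half-walks of n steps from height 0 to height h, never below 0 (reflection principle)."""
    if h < 0 or h > n or (n - h) % 2:
        return 0
    p = (n - h) // 2
    return comb(n, p) - (comb(n, p - 1) if p >= 1 else 0)


def b_bruteforce(n, h):
    """Same count by enumerating all 2^n sequences of +1/-1 steps."""
    count = 0
    for steps in product((1, -1), repeat=n):
        height, ok = 0, True
        for s in steps:
            height += s
            if height < 0:
                ok = False
                break
        if ok and height == h:
            count += 1
    return count


def number_of_walk_diagrams(n):
    """Sum over middle heights h of (b_{n,h})^2: left and right halves are independent."""
    return sum(b_reflection(n, h) ** 2 for h in range(n + 1))
```

Summing $(b_{n,h})^2$ over all middle heights $h$ must give the total number $c_n$ of walk diagrams, i.e. the Catalan number, which provides a second consistency check.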
The normalization $\mathcal{N}_n(q)$. To get $D_n(q)$ from (5.50), we are left with the calculation of $\det[\mathcal{N}_n(q)]$. The diagonal entries of $\mathcal{N}_n(q)$ are computed as follows. For the diagram $\mathcal{W}_h^{(n)}$, the entry reduces to the global normalization of the vector $\varphi_h^{(n)}$, namely

$$[\mathcal{N}_n(q)]_{\mathcal{W}_h^{(n)},\mathcal{W}_h^{(n)}} = (\mu_1)^p. \qquad (5.53)$$

The entries corresponding to other walk diagrams of middle height $h$ are simply the product of this factor by the product over all the box additions to $\mathcal{W}_h^{(n)}$ of the normalization factors $\sqrt{\mu_{\ell_i+2}/\mu_{\ell_i+1}}$ which enter the multiplication rule (4.8). In other words

$$[\mathcal{N}_n(q)]_{a,a} = (\mu_1)^p \prod_{\text{box additions } i} \sqrt{\frac{\mu_{\ell_i+2}}{\mu_{\ell_i+1}}}. \qquad (5.54)$$

Fig. 12. The left and right strip decomposition of a diagram of middle height $h$. The strip lengths are given by the numbers $\ell_i^a$ corresponding to the maxima $i$ of $a$, $i \neq n$.
To make this formula more explicit, let us arrange the box additions needed to generate $a$ from $\mathcal{W}_h^{(n)}$ into $p$ left and $p$ right strips of consecutive boxes, oriented respectively to the right and left as indicated in Fig. 12. This is called the strip decomposition of $a$. Each strip ends at a local maximum of $a$, namely at the vertical of a point $i$ with $\ell_{i+1} = \ell_{i-1} = \ell_i - 1$. The length of the corresponding strip is defined to be $\ell_i$ (there are actually $\ell_i - 1$ boxes in a strip of length $\ell_i$; a strip with no box has indeed $\ell_i = 1$). The expression (5.54) becomes

$$[\mathcal{N}_n(q)]_{a,a} = (\mu_1)^p \prod_{\text{strips}} \sqrt{\frac{\mu_\ell}{\mu_1}}, \qquad (5.55)$$

where $\ell$ denotes the length of each strip. As there are $p$ left and $p$ right strips, the factors $\mu_1$ cancel out, and we are left with

$$[\mathcal{N}_n(q)]_{a,a} = \prod_{\text{strips}} \sqrt{\mu_\ell}. \qquad (5.56)$$

Hence the prefactor in (5.50) reads

$$\det[\mathcal{N}_n(q)]^2 = \prod_{a \in W_n}\ \prod_{\text{strips of } a} \mu_\ell = \prod_{i=1}^{n} (\mu_i)^{s_{n,i}}, \qquad (5.57)$$
where $s_{n,i}$ denotes the total number of strips of length $i$ in the strip decompositions of all the walk diagrams of $W_n$, or equivalently the number of distinct diagrams of $W_n$ with a marked top of strip of length $i$. Using the relation $U_i = 1/(\mu_1 \mu_2 \cdots \mu_i)$, we can rewrite $\det[\Gamma_n(q)]$ (5.51) as

$$\det[\Gamma_n(q)] = \prod_{i=1}^{n} (\mu_i)^{-h_{n,i}}, \qquad (5.58)$$

where

$$h_{n,i} = \sum_{\substack{k \ge i \\ k = n \bmod 2}} (b_{n,k})^2 \qquad (5.59)$$

(the number of walk diagrams of $W_n$ with middle height $\ge i$).
Fig. 14. The cases (2)(b)($\alpha$) and ($\beta$). We indicate the migration of the marked dot in the ($\alpha$) case. In the ($\beta$) case, the diagram $w'$ has a middle height $\ge i$.

Case 2. When the walk has a descending slope at $j$, the marked point $(j, \ell_j^{w'})$ corresponds to the top end of a (left) strip in the strip decomposition of $w'$ only if $j < n$. Therefore we have the three subcases
(2)(a): If the marked point has $j < n$, $w'$ is a diagram with marked top of (left) strip.

(2)(b): If $j > n$, we move the marked point to the left along a line of fixed height $\ell = \ell_j^{w'}$, until we reach a top of (right) strip (see Fig. 14 ($\alpha$)). Of course, one may reach the middle of the diagram before crossing any top of strip (see Fig. 14 ($\beta$)). This leads to two more possibilities:

(2)(b)($\alpha$): The line of constant height $\ell = \ell_j^{w'}$ crosses an ascending slope of $w'$ at $j'$ ($\ell_{j'-1} = \ell_{j'} - 1 = \ell_{j'+1} - 2$), such that $n < j' < j$. Taking for $j'$ the largest such integer, we move the mark from $j$ to $j'$, and end up with a diagram $w' \in W_n$ with a marked top of (right) strip.

(2)(b)($\beta$): The line of constant height $\ell = \ell_j^{w'}$ does not cross any ascending slope of $w'$ between $n$ and $j$. The diagram $w' \in W_n$ has therefore a middle height $\ge i$. More precisely, we have either possibility:
(2)(b)($\beta$)(i): The middle height is $> i$.
(2)(b)($\beta$)(ii): The middle height is $= i$.

(2)(c): The marked point is at $j = n$. The diagram $w' \in W_n$ has middle height $i$ (hence enters the category of walk diagrams of $W_n$ with middle height $\ge i$).

This exhausts all the diagrams with marked top of strips, according to whether
- the top is a maximum (1)(a)
- the top is on a left descending slope (2)(a)
- the top is on a right ascending slope (2)(b)($\alpha$)

and all the diagrams with middle height $\ge i$, according to whether
- the middle height is $> i$ (2)(b)($\beta$)(i)
- the middle height is $= i$ and is a maximum (1)(b)
- the middle height is $= i$ and is either an ascending slope or a minimum (2)(b)($\beta$)(ii)
- the middle height is $= i$ and is a descending slope (2)(c).
The inverse map. Conversely, any walk diagram $w' \in W_n$ with a marked top of strip at height $i$ can be mapped onto a walk $w$ of $2n$ steps with $\ell_0^w = 0$, $\ell_{2n}^w = 2i$ and $\ell_k^w \ge 0$ for all $k$ as follows. The marked top of strip $(j, \ell_j^{w'} = i)$ can be either (i) a maximum, (ii) a descending slope in the left half of $w'$ ($j < n$) or (iii) an ascending slope in the right part of $w'$ ($j > n$). In the cases (i) and (ii), the marked point separates the walk $w'$ into a left piece $a$ (with $j$ steps and $\ell_0^a = 0$, $\ell_j^a = i$, $\ell_k^a \ge 0$ for all $k$), and a right piece $b$ (with $(2n - j)$ steps and $\ell_0^b = i$, $\ell_{2n-j}^b = 0$, $\ell_k^b \ge 0$ for all $k$). The diagram $w$ is built by the inverse of the reflection-translation of Fig. 13, namely by first reflecting $b \to b^t$, and then by translating it and gluing its right end to the left end of $a$. After this transformation, the gluing point, now at position $(2n - j)$, has an ascending slope on $w$ at height $i$, and is the largest such point. In the case (iii), the marked point is first moved to the right until the first crossing of the line of constant height $\ell = i$ with a descending slope is reached: such a point always exists, because the height $\ell_{2n}^{w'} = 0$ must be eventually reached. One then applies the above inverse reflection-translation to this new marked diagram. This produces again a walk $w$ where the gluing point is the largest point on $w$ with ascending slope and height $i$. Finally, any walk diagram $w' \in W_n$ with middle height $> i$ may first be marked as follows. Mark the first crossing $j > n$ between the line of constant height $\ell = i$ and the walk $w'$ at a descending slope. Then apply the above inverse reflection-translation. In all cases, we have associated a walk $w$ to each diagram $w'$ with either a marked end of strip of height $i$ or a middle height $\ge i$. This concludes the proof of (5.61).

Conclusion. Equation (5.60) implies that
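$$D_n(q) = \prod_{i=1}^{n} (\mu_i)^{-b_{2n,2i}} \qquad (5.62)$$

or, reexpressed in terms of $U_i$ through $\mu_i = U_{i-1}/U_i$,

$$D_n(q) = \prod_{i=1}^{n} \big[U_i(q)\big]^{(b_{2n,2i} - b_{2n,2i+2})}, \qquad (5.63)$$

which takes the desired form (5.6) with

$$a_{n,i} = b_{2n,2i} - b_{2n,2i+2} = \binom{2n}{n-i} - 2\binom{2n}{n-i-1} + \binom{2n}{n-i-2}. \qquad (5.64)$$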
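6. Effective Meander Theory

In this section, we study various properties of the matrix $\mathcal{P}_n(q)$ and its inverse, in relation with the meander and semi-meander polynomials through (3.18). Indeed, rewriting (5.49) as

$$\mathcal{G}_n(q) = \big(\mathcal{P}_n(q)^t\big)^{-1}\, \Gamma_n(q)\, \big(\mathcal{P}_n(q)\big)^{-1}, \qquad (6.1)$$

the relations (3.18) become

$$\begin{aligned} \bar m_n(q) &= \vec v \cdot \mathcal{G}_n(q)\, \vec u = \big(\mathcal{P}_n(q)^{-1} \vec v\big) \cdot \Gamma_n(q)\, \mathcal{P}_n(q)^{-1} \vec u, \\ m_n(q) &= \vec u \cdot \mathcal{G}_n(q)\, \vec u = \big(\mathcal{P}_n(q)^{-1} \vec u\big) \cdot \Gamma_n(q)\, \mathcal{P}_n(q)^{-1} \vec u, \end{aligned} \qquad (6.2)$$

where the vectors $\vec u$ and $\vec v$ are defined in (3.17).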
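6.1. The matrix $\mathcal{P}_n(q)^{-1}$. By definition, the matrix $\mathcal{P}_n(q)^{-1}$ describes the change of basis 2 $\to$ 1, through

$$(a)_1 = \sum_{b \in W_n} [\mathcal{P}_n(q)^{-1}]_{b,a}\, (b)_2. \qquad (6.3)$$

Multiplying both sides to the right by $(c)_2^t$, for some $c \in W_n$, and taking the trace, we get

$$\mathrm{Tr}\big((a)_1 (c)_2^t\big) = \sum_{b \in W_n} [\mathcal{P}_n(q)^{-1}]_{b,a}\, \mathrm{Tr}\big((b)_2 (c)_2^t\big) = [\mathcal{P}_n(q)^{-1}]_{c,a}\, \mathrm{Tr}\big((c)_2 (c)_2^t\big), \qquad (6.4)$$

where we have used the orthogonality of the basis 2 elements. According to (5.3), (5.4), we have $\mathrm{Tr}((c)_2 (c)_2^t) = U_{\ell_n^c}(q)$, where $\ell_n^c$ is the middle height of the diagram $c$, and we finally get

$$[\mathcal{P}_n(q)^{-1}]_{c,a} = \frac{\mathrm{Tr}\big((a)_1 (c)_2^t\big)}{\mathrm{Tr}\big((c)_2 (c)_2^t\big)} = \frac{1}{U_{\ell_n^c}(q)}\, \mathrm{Tr}\big((a)_1 (c)_2^t\big). \qquad (6.5)$$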
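6.2. Properties of $\mathcal{P}_n(q)^{-1}$. The formula (6.5) can be used to derive many properties of the matrix $\mathcal{P}_n(q)^{-1}$. Let us take $a = \mathcal{W}_n^{(n)}$ (i.e., $(a)_1 = f_n^{(n)} = 1$) in (6.5). This yields

$$[\mathcal{P}_n(q)^{-1}]_{c,\mathcal{W}_n^{(n)}} = \frac{\mathrm{Tr}\big((c)_2^t\big)}{U_{\ell_n^c}(q)}. \qquad (6.6)$$

Writing $c^t = lr$ as a juxtaposition of a left and right half-walk, and using (4.14), we compute

$$\mathrm{Tr}(lr)_2 = \mathrm{Tr}\big((lr)_2 (rr)_2\big) = \mathrm{Tr}\big((rr)_2 (lr)_2\big) = \delta_{l,r}\, \mathrm{Tr}(rr)_2. \qquad (6.7)$$

Hence the trace of $(c^t)_2$ vanishes, unless $c^t$ is a symmetric diagram, i.e. with $l = r$, in which case the trace takes the value (5.3), (5.4)

$$\mathrm{Tr}(c^t)_2 = \mathrm{Tr}(rr)_2 = \mathrm{Tr}\big(\rho_{\ell_n^c} \rho_{\ell_n^c}\big)_2 = U_{\ell_n^c}(q). \qquad (6.8)$$

Putting (6.6) and (6.8) together, we simply find that

$$[\mathcal{P}_n(q)^{-1}]_{c,\mathcal{W}_n^{(n)}} = \begin{cases} 1 & \text{if } c \text{ is symmetric} \\ 0 & \text{otherwise} \end{cases} \;\equiv\; \delta_{c,\text{symmetric}}. \qquad (6.9)$$

With the definition (3.17) of the vector $\vec v$, this translates into

$$\mathcal{P}_n(q)^{-1}\, \vec v = \vec s, \qquad (6.10)$$

where the vector $\vec s$ has the entries

$$s_a = \delta_{a,\text{symmetric}}. \qquad (6.11)$$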
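Comparing with (3.17), this permits us to rewrite the semi-meander polynomial as

$$\bar m_n(q) = \vec s \cdot \Gamma_n(q)\, [\mathcal{P}_n(q)]^{-1} \vec u = \sum_{\substack{a,b \in W_n \\ a\ \text{symmetric}}} [\mathcal{P}_n(q)^{-1}]_{a,b}\, U_{\ell_n^a}(q), \qquad (6.12)$$

whereas the meander polynomial reads

$$m_n(q) = \mathcal{P}_n(q)^{-1} \vec u \cdot \Gamma_n(q)\, \mathcal{P}_n(q)^{-1} \vec u = \sum_{a \in W_n} \Big( \sum_{b \in W_n} [\mathcal{P}_n(q)^{-1}]_{a,b} \Big)^2 U_{\ell_n^a}(q). \qquad (6.13)$$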
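Another interesting particular case of formula (6.5) is obtained by taking $c = \mathcal{W}_{\epsilon_n}^{(n)}$, where $\epsilon_n$ is the smallest possible middle height in $W_n$, namely $\epsilon_n = (1 - (-1)^n)/2 = \delta_{n,\text{odd}}$. Its heights read $\ell_{2i} = 0$ and $\ell_{2i-1} = 1$, for all $i$. This diagram is the smallest of all the diagrams in $W_n$, in the sense that it is included in all of them. It corresponds to the first entry of the bases 1 and 2, hence to the vector

$$\vec w = (1, 0, 0, \ldots, 0). \qquad (6.14)$$

The corresponding basis 1 and 2 elements read respectively $f_{\epsilon_n}^{(n)}$ and $\varphi_{\epsilon_n}^{(n)}$. By the definitions (4.5) and (3.6) taken at $h = \epsilon_n$ (in which case $E_{\epsilon_n} = E_0$ or $E_1$, hence $E_{\epsilon_n} = 1$), we find the following relation between them:

$$\varphi_{\epsilon_n}^{(n)} = (\mu_1)^{[n/2]}\, f_{\epsilon_n}^{(n)}, \qquad (6.15)$$

or equivalently

$$\big(\mathcal{W}_{\epsilon_n}^{(n)}\big)_2 = (\mu_1)^{[n/2]}\, \big(\mathcal{W}_{\epsilon_n}^{(n)}\big)_1, \qquad (6.16)$$

where we have identified $\epsilon_n = n - 2p$, hence $p = [n/2]$. For the choice $c = \mathcal{W}_{\epsilon_n}^{(n)}$, (6.5) reads

$$\begin{aligned} [\mathcal{P}_n(q)^{-1}]_{\mathcal{W}_{\epsilon_n}^{(n)},a} &= \frac{\mathrm{Tr}\big((a)_1 (\mathcal{W}_{\epsilon_n}^{(n)})_2\big)}{U_{\epsilon_n}(q)} \\ &= (\mu_1)^{[n/2]}\, \frac{\mathrm{Tr}\big((a)_1 (\mathcal{W}_{\epsilon_n}^{(n)})_1\big)}{U_{\epsilon_n}(q)} \\ &= (\mu_1)^{[(n+1)/2]}\, [\mathcal{G}_n(q)]_{a,\mathcal{W}_{\epsilon_n}^{(n)}} \\ &= (\mu_1)^{[(n+1)/2] - c(a,\mathcal{W}_{\epsilon_n}^{(n)})}. \end{aligned} \qquad (6.17)$$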
In the second line, we have used the relation (6.16), whereas in the third line, we have used the fact that $U_{\epsilon_n}(q) = q^{\epsilon_n} = (\mu_1)^{-\epsilon_n}$ and that $\epsilon_n + [n/2] = [(n+1)/2]$. The last expression uses the definition of the Gram matrix (3.15): the quantity $c(a, \mathcal{W}_{\epsilon_n}^{(n)})$ is, in the arch configuration picture, the number of connected components of the meander obtained by superimposing the upper configuration $a$ and the lower configuration $b = \mathcal{W}_{\epsilon_n}^{(n)}$, made of a sequence of $n$ consecutive single arches, linking the bridges $(2i-1)$ and $(2i)$, for $i = 1, 2, ..., n$. The (meander) polynomial corresponding to the closings of $\mathcal{W}_{\epsilon_n}^{(n)}$ was computed in [6] and reads$^6$

$$i_n(q) = \vec w \cdot \mathcal{G}_n(q)\, \vec u = \sum_{a \in W_n} q^{c(a,\mathcal{W}_{\epsilon_n}^{(n)})} = \sum_{k=1}^{n} \frac{1}{n} \binom{n}{k} \binom{n}{k-1}\, q^k \qquad (6.18)$$

6 In ref. [6], it has been shown that the number of closings of $\mathcal{W}_{\epsilon_n}^{(n)}$ with $k$ connected components is identical to that of arch configurations of order $n$ with $k$ interior arches (i.e., arches linking two neighboring bridges $i$ and $(i+1)$). In turn, this is nothing but the number of walk diagrams in $W_n$ with exactly $k$ maxima (the notion of interior arch in an arch configuration is equivalent to that of a maximum in the corresponding walk diagram). This number is $\binom{n}{k}\binom{n}{k-1}/n$.
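with the vectors $\vec u$ and $\vec w$ defined respectively in (3.17) and (6.14). Note that the polynomial $i_n(q)$ is reciprocal, i.e. $q^{n+1}\, i_n(1/q) = i_n(q)$, since its nonvanishing coefficients run from $q$ to $q^n$ and are symmetric under $k \to n+1-k$. Hence, from (6.17), we get a sum rule for the first line of the matrix $\mathcal{P}_n(q)^{-1}$:

$$\sum_{a \in W_n} [\mathcal{P}_n(q)^{-1}]_{\mathcal{W}_{\epsilon_n}^{(n)},a} = (\mu_1)^{[(n+1)/2]}\, i_n\Big(\frac{1}{\mu_1}\Big) = \frac{i_n(\mu_1)}{(\mu_1)^{[n/2]+1}} \qquad (6.19)$$

by using the reciprocality of $i_n(q)$.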
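The closing polynomial $i_n(q)$ of the configuration made of $n$ single arches can be tabulated directly for small $n$ and compared with the numbers $\binom{n}{k}\binom{n}{k-1}/n$ quoted from [6]. The following sketch is only an illustration: arch configurations are encoded as Dyck words, and the helper names are not taken from the paper.

```python
from math import comb


def dyck_words(n):
    """All sequences of n up-steps (+1) and n down-steps (-1) with partial sums >= 0."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map (0-based bridges) of the arch configuration encoded by a Dyck word."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def components(upper, lower):
    """Number of closed loops obtained by superimposing two arch configurations."""
    seen, count = set(), 0
    for start in range(len(upper)):
        if start in seen:
            continue
        count += 1
        j = start
        while j not in seen:
            seen.add(j)
            j = upper[j]
            seen.add(j)
            j = lower[j]
    return count


def closing_polynomial(n):
    """coeffs[k] = number of configurations a whose closing by n single arches has k loops."""
    single = arches((1, -1) * n)  # bridges (2i-1, 2i) linked, for i = 1..n
    coeffs = [0] * (n + 1)
    for word in dyck_words(n):
        coeffs[components(arches(word), single)] += 1
    return coeffs


def narayana(n, k):
    """The number binom(n,k) * binom(n,k-1) / n appearing in (6.18)."""
    return comb(n, k) * comb(n, k - 1) // n
```

The coefficient list returned by `closing_polynomial(n)` is symmetric under $k \to n + 1 - k$, which is the reciprocity property used to obtain the sum rule for the first line of $\mathcal{P}_n(q)^{-1}$.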
6.3. Recursion relation for the matrix $\mathcal{Q}_n(q)^{-1}$. The matrix $\mathcal{Q}_n(q)$ is constructed in a similar way as $\mathcal{P}_n(q)$, as the matrix of a redefinition of basis 1, except that all the normalization factors are dropped, namely the prefactor $(\mu_1)^p$ in the definition (4.5) of $\varphi_h^{(n)}$ is dropped, as well as the prefactor $\sqrt{\mu_{\ell_i+2}/\mu_{\ell_i+1}}$ in the multiplication rule (4.8). This results in a diagonal of 1's for $\mathcal{Q}_n(q)$. $\mathcal{Q}_n(q)$ is the matrix of change of basis 1 to the unnormalized basis 2 (denoted by basis 2'), with elements $(a)_{2'} = (a)_2/\mathcal{N}_{a,a}$. Let us now derive recursion relations for constructing the inverse matrix $\mathcal{Q}_n(q)^{-1}$. This matrix sends the unnormalized basis 2' into the basis 1, according to the identity

$$(b)_1 = \sum_{a \in W_n} [\mathcal{Q}_n(q)^{-1}]_{a,b}\, (a)_{2'}. \qquad (6.20)$$

Recall that the basis 1 elements are constructed by box additions (Fig. 8) on the basic elements $f_h^{(n)}$, each box addition corresponding to the multiplication by some $e_i$.
Fig. 15. The three possibilities for the multiplication $e_i (a)_{2'}$, represented as a box addition at the vertical of the point $i$ on a diagram $a \in W_n$. The latter may be above (i) a slope of $a$, (ii) a maximum of $a$ or (iii) a minimum of $a$.
Let us study the consequences of a left box addition on $b$, at a minimum $i < n$ of $b$. Let us denote by $b + \diamond$ the resulting diagram. Multiplying accordingly (6.20) to the left by $e_i$, we find a recursion relation for the matrix elements of $\mathcal{Q}_n(q)^{-1}$. Indeed

$$(b + \diamond)_1 = \sum_{a \in W_n} [\mathcal{Q}_n(q)^{-1}]_{a,b+\diamond}\, (a)_{2'} = e_i\, (b)_1 = \sum_{a \in W_n} [\mathcal{Q}_n(q)^{-1}]_{a,b}\; e_i (a)_{2'} \qquad (6.21)$$

gives a relation between $[\mathcal{Q}_n(q)^{-1}]_{a,b+\diamond}$ and elements of the form $[\mathcal{Q}_n(q)^{-1}]_{a',b}$ by identifying the coefficients of the basis 2' elements. Three situations may occur for $e_i (a)_{2'}$, as depicted in Fig. 15.
(i) The box addition is performed on a slope of $a$ ($\ell_{i+1}^a + \ell_{i-1}^a = 2\ell_i^a$). Due to the vanishing property (4.11), we find that the resulting element vanishes, namely

$$e_i\, (a)_{2'} = 0. \qquad (6.22)$$

(ii) The box addition is performed on a maximum of $a$ ($\ell_{i+1}^a = \ell_{i-1}^a = \ell_i^a - 1$). For $(a)_{2'}$, this maximum is itself the result of an (unnormalized) box addition with the rules of basis 2, hence a factor $(e_i - \mu_k)$, where $k = \ell_i^a - 1$, according to (4.8). The multiplication by $e_i$ results in

$$e_i (e_i - \mu_k) = \Big(\frac{1}{\mu_1} - \mu_k\Big) e_i = \frac{1}{\mu_{k+1}} \times (e_i - \mu_k) + \frac{\mu_k}{\mu_{k+1}} \times 1, \qquad (6.23)$$

where we have used the recursion relation (4.4) for the $\mu$'s. The first term in (6.23) restores the box of $(a)_{2'}$, while in the second term the box is removed, yielding $(a - \diamond)_{2'}$, where $a - \diamond$ denotes the walk diagram $a$ with the box below the maximum removed. Hence

$$e_i\, (a)_{2'} = \frac{1}{\mu_{k+1}} \big( (a)_{2'} + \mu_k\, (a - \diamond)_{2'} \big) \qquad (6.24)$$

with $k = \ell_i^a - 1$.

(iii) The box addition is performed on a minimum of $a$ ($\ell_{i+1}^a = \ell_{i-1}^a = \ell_i^a + 1$). We are left with the multiplication of $(a)_{2'}$ by

$$e_i = (e_i - \mu_k) + \mu_k \times 1, \qquad (6.25)$$

where $k = \ell_i^a + 1$. Hence

$$e_i\, (a)_{2'} = (a + \diamond)_{2'} + \mu_k\, (a)_{2'}. \qquad (6.26)$$
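The algebra behind the maximum case can be checked numerically. The sketch below assumes, as the text indicates, that $\mu_k = U_{k-1}/U_k$, that (4.4) reads $1/\mu_{k+1} = q - \mu_k$, and that $e_i^2 = q\, e_i$ with $q = 1/\mu_1$; it also assumes the Chebyshev recursion $U_{k+1} = qU_k - U_{k-1}$. Since $e_i$ then has eigenvalues $q$ and $0$, the operator identity for $e_i(e_i - \mu_k)$ reduces to two scalar identities. The function names are illustrative only.

```python
from fractions import Fraction


def chebyshev_values(m, q):
    """[U_0(q), ..., U_m(q)], ASSUMING U_0 = 1, U_1 = q, U_{k+1} = q U_k - U_{k-1}."""
    values = [Fraction(1), Fraction(q)]
    while len(values) <= m:
        values.append(q * values[-1] - values[-2])
    return values[: m + 1]


def mu(k, q):
    """mu_k = U_{k-1}(q) / U_k(q), for k >= 1."""
    U = chebyshev_values(k, q)
    return U[k - 1] / U[k]


def check_box_identity(k, q):
    """Check e(e - mu_k) = (e - mu_k)/mu_{k+1} + mu_k/mu_{k+1} on the eigenvalues of e."""
    q = Fraction(q)
    mk, mk1 = mu(k, q), mu(k + 1, q)
    recursion_ok = (1 / mk1 == q - mk)  # relation (4.4), as assumed above
    split_ok = all(
        x * (x - mk) == (x - mk) / mk1 + mk / mk1
        for x in (q, Fraction(0))       # eigenvalues of e when e^2 = q e
    )
    return recursion_ok and split_ok
```

Exact rational arithmetic is used so that the check does not depend on floating-point tolerances; any $q$ for which the $U_k(q)$ involved do not vanish (for instance any $q \ge 2$) can be used.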
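Substituting (6.22), (6.24), (6.26) in (6.21), we get

$$\sum_{a \in W_n} [\mathcal{Q}_n(q)^{-1}]_{a,b+\diamond}\, (a)_{2'} = \sum_{a \in W_n} [\mathcal{Q}_n(q)^{-1}]_{a,b} \Big\{ \delta_{a,\max(i)}\, \frac{1}{\mu_{\ell_i^a}} \big( (a)_{2'} + \mu_{\ell_i^a - 1}\, (a - \diamond)_{2'} \big) + \delta_{a,\min(i)} \big( (a + \diamond)_{2'} + \mu_{\ell_i^a + 1}\, (a)_{2'} \big) \Big\}, \qquad (6.27)$$

where we use the notation

$$\delta_{a,\max(i)} = \begin{cases} 1 & \text{if } \ell_{i+1}^a = \ell_{i-1}^a = \ell_i^a - 1 \\ 0 & \text{otherwise} \end{cases}, \qquad \delta_{a,\min(i)} = \begin{cases} 1 & \text{if } \ell_{i+1}^a = \ell_{i-1}^a = \ell_i^a + 1 \\ 0 & \text{otherwise} \end{cases}. \qquad (6.28)$$

The identification of coefficients of $(a)_{2'}$ yields the relation

$$[\mathcal{Q}_n(q)^{-1}]_{a,b+\diamond} = \delta_{a,\max(i)} \Big( \frac{1}{\mu_{\ell_i^a}} [\mathcal{Q}_n(q)^{-1}]_{a,b} + [\mathcal{Q}_n(q)^{-1}]_{a-\diamond,b} \Big) + \delta_{a,\min(i)} \Big( \mu_{\ell_i^a+1} [\mathcal{Q}_n(q)^{-1}]_{a,b} + \frac{\mu_{\ell_i^a+1}}{\mu_{\ell_i^a+2}} [\mathcal{Q}_n(q)^{-1}]_{a+\diamond,b} \Big), \qquad (6.29)$$

where we have used

$$\delta_{a,\max(i)} = \delta_{a-\diamond,\min(i)}, \qquad \delta_{a,\min(i)} = \delta_{a+\diamond,\max(i)}, \qquad \ell_i^{a \pm \diamond} = \ell_i^a \pm 2. \qquad (6.30)$$

Together with the initial condition

$$[\mathcal{Q}_n(q)^{-1}]_{a,\mathcal{W}_{\epsilon_n}^{(n)}} = \delta_{a,\mathcal{W}_{\epsilon_n}^{(n)}}, \qquad (6.31)$$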
equation (6.29) is an actual recursion relation, yielding all the entries of $\mathcal{Q}^{-1}$, column by column starting from the left. A first remark is in order: the entries of $\mathcal{Q}_n(q)^{-1}$ satisfy the property

$$[\mathcal{Q}_n(q)^{-1}]_{a,b} \neq 0 \ \Rightarrow\ a \subset b, \qquad (6.32)$$

easily proved by recursion using (6.29). This last condition has been previously derived for the entries of $\mathcal{P}(q)$ (cf. (5.45)), but holds as well for the inverse matrix. Note that (6.29) also implies that

$$[\mathcal{Q}_n(q)^{-1}]_{a,a} = 1, \qquad (6.33)$$

in agreement with the normalization of $\mathcal{Q}$.
6.4. The matrix $\mathcal{Q}_n(q)^{-1}$. The recursion relation (6.29) will be solved in two steps. The idea is to treat separately the question of finding when $[\mathcal{Q}_n(q)^{-1}]_{a,b}$ vanishes or not, and that of determining its precise value when it does not vanish. This suggests separating the matrix element $[\mathcal{Q}_n(q)^{-1}]_{a,b}$ into a product

$$[\mathcal{Q}_n(q)^{-1}]_{a,b} = w_{a,b}\, f_{a,b}, \qquad (6.34)$$

where $f_{a,b}$ is subject to the recursion relation

$$f_{a,b+\diamond} = \delta_{a,\max(i)} \big( f_{a,b} + f_{a-\diamond,b} \big) + \delta_{a,\min(i)} \big( f_{a,b} + f_{a+\diamond,b} \big) \qquad (6.35)$$

and

$$f_{a,\mathcal{W}_{\epsilon_n}^{(n)}} = \delta_{a,\mathcal{W}_{\epsilon_n}^{(n)}}. \qquad (6.36)$$
Solving for f. From (6.35), (6.36), it is clear that the $f$'s are nonnegative integers. In fact, the $f$'s may only take the values 0 or 1, and act as selection rules on the couples of diagrams $a \subset b$. To describe the solution of (6.35), (6.36) we need one more definition. We will need a mixed representation of a couple $a \subset b$ of walk diagrams in $W_n$, namely $a \in W_n$ is represented as a walk diagram, but $b \in A_n \equiv W_n$ is represented as an arch configuration of order $n$. The diagram $b$ is therefore represented by the permutation $\sigma_b$ of the bridges, with $\sigma_b^2 = 1$, describing the arches (namely $\sigma_b(i) = j$ iff the bridges $i$ and $j$ are linked by an arch). The diagram $a \subset b$ is said to be $b$-symmetric iff it satisfies

$$\ell^a_{\sigma_b(i)} - \ell^a_{\sigma_b(i)-1} = -\big(\ell^a_i - \ell^a_{i-1}\big) \quad \text{for all } i = 1, ..., 2n. \qquad (6.37)$$
In other words, we may represent on the same figure the arch configuration $b$ and the walk diagram $a$, as illustrated in Fig. 16. Each bridge $i$ of $b$ sits at the vertical of the link $(i-1, i)$ of $a$. Then $a$ is $b$-symmetric iff the links of $a$ are pairwise symmetrical under the pairs of bridges linked by an arch on $b$. In particular, if $a$ is $b$-symmetric, then, below an interior arch of $b$ (i.e., an arch linking two consecutive bridges $i$, $(i+1)$), $a$ must have a maximum or a minimum (the only two left-right symmetrical link configurations around $i$). Note also that a diagram $a$ is symmetric iff it is $\mathcal{W}_n^{(n)}$-symmetric, and that the diagram $\mathcal{W}_{\epsilon_n}^{(n)}$ is $b$-symmetric for all $b \in W_n$.

Fig. 16. An example of walks $a \subset b$, where $a$ is $b$-symmetric. $b$ is represented in the arch configuration picture, and $a$ in the walk diagram picture. The dotted lines continuing the arches of $b$ indicate the links of $a$ which have to be symmetrical: the two links connected to the same arch must be mirror images of each other.

With this definition, the solution of the recursion relation (6.35), (6.36) reads
$$f_{a,b} = \begin{cases} 1 & \text{if } a \text{ is } b\text{-symmetric} \\ 0 & \text{otherwise.} \end{cases} \qquad (6.38)$$
Hence, in (6.34), $f$ selects the couples of diagrams $a \subset b$ such that $a$ is $b$-symmetric$^7$. With $f_{a,b}$ as in (6.38) let us now check (6.35), (6.36). The relation (6.36) amounts to the fact that $a$ is $a$-symmetric. Indeed, an arch of $a$ always starts (say, at the bridge $i$) above an ascending link of $a$ ($\ell_i^a = \ell_{i-1}^a + 1$) and ends (say, at the bridge $j = \sigma_a(i)$) over a descending link of $a$ ($\ell_j^a - \ell_{j-1}^a = -1$); these two links are therefore symmetrical. To check (6.35), let us consider a diagram $a \subset b + \diamond$, which is $(b+\diamond)$-symmetric. Noting that $b + \diamond$ has an interior arch linking the bridges $i$ and $(i+1)$ (this is equivalent to a maximum above $i$ on the corresponding walk diagram), by virtue of the abovementioned property, the $(b+\diamond)$-symmetric diagram $a$ must have either a maximum or a minimum above $i$. These two possibilities correspond to the two lines of (6.35). To complete the check of (6.35), we must prove that in either case one and only one of the two diagrams $a$ and $a \pm \diamond$ is $b$-symmetric (then (6.35) simply reads $1 = 1$). More precisely, the box addition $b \to b + \diamond$ is interpreted in the arch configuration picture as the bridge move illustrated in Fig. 17. Before the box addition, $b$ has a minimum at the vertical of $i$. This means that an arch (starting, say, at the bridge $i_1 < i$) ends at the bridge $i_2 = i$, and that another starts from the bridge $i_3 = (i+1)$ (and ends, say, at the bridge $i_4 > (i+1)$). The bridge move of Fig. 17 replaces these two arches by an arch connecting the bridges $i_1$ and $i_4$, and an interior arch connecting $i_2$ and $i_3$. The creation of an interior arch corresponds to that of a maximum (the top of the box) on $b$. Let us denote by $A$, $B$, $C$, $A'$ (like in Fig. 17) the regions of $b$ lying respectively to the left of $i_1$, between $i_1$ and $i_2$, between $i_3$ and $i_4$ and to the right of $i_4$. Note that the regions $A$ and $A'$ may be connected to each other by arches passing above the $(i_1, i_2)$ and $(i_3, i_4)$ arches, but $B$ and $C$ are only connected to themselves.

7 Note, with the above definition, that $f_{a,b} \neq 0 \Rightarrow a \subset b$. Indeed, if $f_{a,b} \neq 0$, $a$ cannot cross $b$, otherwise one would have $\ell_i^a = \ell_i^b$ and $\ell_{i+1}^a = \ell_i^a + 1$, $\ell_{i+1}^b = \ell_i^b - 1$, for some $i$. Take the smallest such $i$; this means that an arch of $b$ ends at the bridge $(i+1)$. Let $(i'+1) < (i+1)$ be the bridge where it starts; then by $b$-symmetry, we must have $\ell_{i'+1}^a = \ell_{i'+1}^b$ and $\ell_{i'}^a = \ell_{i'+1}^a + 1$, $\ell_{i'}^b = \ell_{i'+1}^b - 1$, so that $a$ lies above $b$ at $i' < i$, which contradicts the fact that $i$ is the first crossing between $a$ and $b$.

Fig. 17. The bridge move $b \to b + \diamond$ on the corresponding arch configurations. $b$ has a minimum at $i = i_2$, hence an arch ends at the bridge $i = i_2$, and another starts at the bridge $i_3 = (i+1)$. In $b + \diamond$, this minimum has been changed into a maximum, hence the bridges $(i_1, i_4)$ and $(i_2, i_3)$ are connected. All the other parts $A$, $B$, $C$ and $A'$ of $b$ are unchanged.
Fig. 18. Example of a walk $a$, which is $(b+\diamond)$-symmetric. The two possibilities (i) $\sigma_1 = \sigma_2 = 1$ and (ii) $\sigma_1 = -\sigma_2 = -1$ are represented. In both cases, one and only one of the two diagrams $a$ and $a - \sigma_2 \diamond$ is $b$-symmetric.
Let us consider a walk diagram $a$ which is $(b+\diamond)$-symmetric (cf. Fig. 18). The portions $\alpha$, $\beta$, $\gamma$, $\alpha'$ of the walk $a$ lying respectively below $A$, $B$, $C$, $A'$ satisfy the following properties: $\beta$ is $B$-symmetric, $\gamma$ is $C$-symmetric, and $\alpha\alpha'$ is $AA'$-symmetric$^8$. All these portions of $a$ remain untouched in $a \pm \diamond$. Only the two links $(i_2 - 1, i_2)$ and $(i_3 - 1, i_3)$ of $a$ will be affected. The $(b+\diamond)$-symmetry of $a$ implies that

$$\begin{aligned} (\ell^a_{i_1} - \ell^a_{i_1-1}) &= (\ell^a_{i_4-1} - \ell^a_{i_4}) \equiv \sigma_1 = \pm 1, \\ (\ell^a_{i_2} - \ell^a_{i_2-1}) &= (\ell^a_{i_3-1} - \ell^a_{i_3}) \equiv \sigma_2 = \pm 1, \end{aligned} \qquad (6.39)$$

as the bridges $(i_1, i_4)$ and $(i_2, i_3)$ are connected in $b + \diamond$. Two situations may now occur, according to the relative values of $\sigma_1$ and $\sigma_2$.

(i) $\sigma_1 = \sigma_2$: $a$ is not $b$-symmetric, because the links $(i_1 - 1, i_1)$ and $(i_2 - 1, i_2)$ of $a$ are not symmetrical (the same holds for the links $(i_3 - 1, i_3)$ and $(i_4 - 1, i_4)$). On the contrary, $a - \sigma_2 \diamond$ is $b$-symmetric, because both links $(i_2 - 1, i_2)$ and $(i_3 - 1, i_3)$ are flipped by the box addition/subtraction. This is illustrated in Fig. 18-(i).

(ii) $\sigma_1 = -\sigma_2$: $a$ is $b$-symmetric, but $a - \sigma_2 \diamond$ is not, as the situation of the previous case is reversed. This is illustrated in Fig. 18-(ii).

Hence, we have shown that, when $a$ is $(b+\diamond)$-symmetric, one and only one of the two diagrams $a$ and $a - \sigma_2 \diamond$ appearing on the rhs of (6.35) is $b$-symmetric. This completes the check of the recursion relation (6.35) (which reduces in both cases $\sigma_2 = \pm 1$ to $1 = 1$). Equation (6.38) is the unique solution to (6.35), (6.36).

8 Here we extend slightly the notion of respective symmetry to walks $c \subset d$, with initial and final heights not necessarily equal to 0, by still imposing the condition (6.37).
Fig. 19. A particular folding of the walk diagram $b \in W_n$, leading to an $a \in W_n$, such that $a$ is $b$-symmetric. The solid horizontal lines represent the unfolded folding lines, while the horizontal dashed lines represent the lines along which $b$ is effectively folded (lines number 3, 5, 6). The total number of folding lines is $n$, the order of the diagrams ($n = 6$ here).

In addition to their defining recursion relation, the $f$'s satisfy a number of interesting properties, which will prove crucial in the study of meander and semi-meander polynomials. Among the many interpretations of the condition $f_{a,b} = 1$, the set of $a$'s such that $f_{a,b} = 1$ for a given $b \in W_n$ may be obtained as shown in Fig. 19. First represent $b$ as a walk diagram of $2n$ steps. Then draw horizontal lines joining the couples of points of the form $(i, \ell_i^b)$, $(j, \ell_j^b = \ell_i^b)$, $j > i$, corresponding to the beginning and end of all arches of $b$ (the arch starts at the bridge $(i+1)$ and ends at the bridge $j$). It is easy to see that there are exactly $n$ such lines. The set of admissible $a$'s is simply obtained by folding the path $b$ arbitrarily along these lines (see Fig. 19). Indeed, the folding operation preserves the $b$-symmetry of $a$, by simply reversing all the quantities $(\ell_{i+1}^a - \ell_i^a)$ along the folding line. If no additional constraint was imposed on the $a$'s, we would get $2^n$ possible foldings for each diagram $b$. However, $a$ is further constrained to have nonnegative heights, which reduces this number, but we expect it to still behave as $2^n$ for most $b$'s, in the large $n$ limit.

Conversely, here is an algorithm to generate, for fixed $a \in W_n$, all the walks $b \in W_n$ such that $f_{a,b} = 1$. The path $b = a$ is always admissible. Let us represent it by the sequence of signs $t_i(a) = \ell_i^a - \ell_{i-1}^a$, $i = 1, 2, ..., 2n$, and consider the modified sequence

$$\sigma_i(a) = (-1)^{i-1}\, t_i(a) = (-1)^{i-1} \big(\ell_i^a - \ell_{i-1}^a\big). \qquad (6.40)$$

Interpreting these indices $i$ as bridge numbers (from 1 to $2n$), the set of $b$'s such that $f_{a,b} = 1$ is simply the set of arch configurations linking these $2n$ bridges, such that each arch connects two bridges with the same value of the sign $\sigma_i(a)$. An example is displayed in Fig. 20. The number of admissible $b$'s for fixed $a$ seems to depend strongly on $a$.
Fig. 20. For fixed $a$, the $b$'s such that $f_{a,b} = 1$ are the arch configurations connecting bridges with the same value of $\sigma_i(a) = (-1)^{i-1} t_i(a)$, where $t_i(a) = \ell_i^a - \ell_{i-1}^a$, for $i = 1, 2, ..., 2n$. Here $n = 6$, and we have represented one of the admissible $b$'s.
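The two descriptions of $b$-symmetry, the link condition (6.37) and the equal-sign rule on $\sigma_i(a)$, can be compared by brute force for small $n$. The following sketch is an illustration under stated assumptions: walk diagrams are encoded as tuples of steps $t_i = \pm 1$ (Dyck words), bridges are numbered from 0 in the code (so the sign $(-1)^{i-1}$ becomes `(-1) ** i`), and all function names are hypothetical.

```python
def dyck_words(n):
    """All step sequences (+1/-1) of walk diagrams of order n (heights >= 0, ending at 0)."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map sigma_b (0-based bridges) of the arch configuration of a walk diagram."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def is_b_symmetric(a, b):
    """Condition (6.37): the two links of a under each arch of b carry opposite steps."""
    sigma = arches(b)
    return all(a[sigma[i]] == -a[i] for i in range(len(a)))


def same_sign_rule(a, b):
    """Equal-sign rule: every arch of b links two bridges with equal (-1)^(i-1) t_i(a)."""
    signs = [(-1) ** i * t for i, t in enumerate(a)]  # i is 0-based here
    sigma = arches(b)
    return all(signs[i] == signs[sigma[i]] for i in range(len(a)))


def admissible_b(a):
    """All walk diagrams b such that a is b-symmetric (brute-force enumeration)."""
    n = len(a) // 2
    return [b for b in dyck_words(n) if is_b_symmetric(a, b)]


def heights(steps):
    """Heights l_0, ..., l_2n of a walk diagram."""
    out = [0]
    for t in steps:
        out.append(out[-1] + t)
    return out
```

For example, `admissible_b((1, -1) * 3)` returns every diagram of order 3, since the smallest diagram is $b$-symmetric for all $b$, whereas the largest diagram admits only $b = a$. The enumeration is exponential and only meant for small orders.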
Let us finally mention the following sum rule, proved in detail in Appendix C:

$$\sum_{a,b \in W_n} f_{a,b} = \frac{3 \cdot 2^{n-1}\, (2n)!}{n!\,(n+2)!} = \frac{3 \cdot 2^{n-1}}{n+2}\, c_n, \qquad (6.41)$$

expressing the total number of couples $(a,b) \in W_n \times W_n$, where $a$ is $b$-symmetric. By Stirling's formula, we see that

$$\sum_{a,b \in W_n} f_{a,b} \sim \frac{3}{2\sqrt{\pi}}\, \frac{8^n}{n^{5/2}}. \qquad (6.42)$$
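The sum rule can be checked by direct enumeration for the first few orders. This is only a brute-force sketch (the proof is the one given in Appendix C); walk diagrams are encoded as Dyck words of steps $\pm 1$ and the function names are illustrative.

```python
from math import factorial


def dyck_words(n):
    """All step sequences (+1/-1) of walk diagrams of order n."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map (0-based bridges) of the arch configuration of a walk diagram."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def f(a, b_partner):
    """f_{a,b} = 1 if a is b-symmetric (opposite steps under each arch of b), else 0."""
    return int(all(a[b_partner[i]] == -a[i] for i in range(len(a))))


def total_symmetric_pairs(n):
    """Left-hand side of (6.41): number of couples (a, b) with a b-symmetric."""
    words = list(dyck_words(n))
    partners = [arches(b) for b in words]
    return sum(f(a, p) for a in words for p in partners)


def closed_form(n):
    """Right-hand side of (6.41): 3 * 2^(n-1) * (2n)! / (n! (n+2)!)."""
    numerator = 3 * 2 ** (n - 1) * factorial(2 * n)
    denominator = factorial(n) * factorial(n + 2)
    assert numerator % denominator == 0
    return numerator // denominator
```

The number of pairs grows like $c_n^2 \sim 16^n$, so this check is practical only up to $n \approx 6$; it is meant as a sanity test of the formula, not as a way to study the asymptotics.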
The leading behavior $8^n$ agrees with the expectation that the number of admissible $a$'s for fixed $b$ behaves like $2^n$ for most $b$'s (whose number is of the order of $4^n$).

Solving for w. To complete the solution of (6.29), we have to compute the weight $w_{a,b} = [\mathcal{Q}_n(q)^{-1}]_{a,b}$ when $a$ is $b$-symmetric. The form of $w_{a,b}$ is entirely dictated by the coefficients of the recursion relation (6.29). The result reads
Wa, b
"~
1
=
a
(6.43)
i=1
w(k, g, m)
b
(w(ea 1,e a
I~e+l (Izel.ze-(~'~'-~r>)
(6.58)
i=O 2n--I
H
=
i--O
1 1
a
a
8
8
("('+':+':+,>/')~(-"'"-'~'-",+,- ~>](~,+,-,> lo~..~+,= +,,. max
max
b
~b
a
ga
f,~,bf~,b' Ue~ (q)
a,b,b' cWn max max b ~b I ~b ! a ~a x ~ 1, = o 2r~ -- ] [~(e~+l-e, >-~,+,.>-( ~b,+,-, >](~.+,.>~og.,,:+~r+,+,,.
(6.63)
Note that the semi-meander expression (6.62) may be viewed as (6.63) in which $b'$ is fixed to be $\mathcal{W}_n^{(n)} \equiv r_n$, the walk diagram corresponding to the rainbow arch configuration of order $n$, which restricts the sum to symmetric walk diagrams $a$. The expressions (6.62), (6.63) should permit a detailed asymptotic study of the semi-meander and meander polynomials for large $n$.
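6.7. Connected components in meanders. For any $b \in A_n \equiv W_n$, let $\vec v_b$ be the vector with entries $(\vec v_b)_a = \delta_{a,b}$. The matrix elements of $\mathcal{G}_n(q)$ can be expressed as

$$[\mathcal{G}_n(q)]_{b,b'} = \vec v_b \cdot \mathcal{G}_n(q)\, \vec v_{b'} = \big(\mathcal{P}_n(q)^{-1} \vec v_b\big) \cdot \Gamma_n(q)\, \mathcal{P}_n(q)^{-1} \vec v_{b'} = q^{c(b,b')}, \qquad (6.64)$$

where $c(b,b')$ (Eq. (3.13)) is the number of connected components of the meander obtained by superimposing the arch configurations $b$ and $b'$. Hence we can write a refined version of (6.63) for fixed $b$ and $b' \in A_n$:

$$q^{c(b,b')} = \sum_{a \in W_n} f_{a,b}\, f_{a,b'}\, U_{\ell_n^a}(q)\, \exp\Big\{ \frac14 \sum_{i=0}^{2n-1} \big[ 2(\ell^{\max}_{i+1} - \ell^{\max}_i) - (\ell^b_{i+1} - \ell^b_i) - (\ell^{b'}_{i+1} - \ell^{b'}_i) \big] (\ell^a_{i+1} - \ell^a_i)\, \log \mu_{(\ell^a_{i+1} + \ell^a_i + 1)/2} \Big\}. \qquad (6.65)$$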
Note that the highly non-local quantity $c(b,b')$ is expressed as a sum of local weights. However, the non-locality reemerges in a weaker form through the selection factors $f$, which induce mutually non-local constraints on the walks summed over. This formula gives an interesting expression for $c(b,b')$ in the limit of large $q$. Indeed, we have, for $q \to \infty$,

$$U_\ell(q) \sim q^\ell, \qquad \mu_\ell \sim \frac{1}{q}, \qquad (6.66)$$

hence (6.65) becomes

$$q^{c(b,b')} \sim \sum_{a \in W_n} f_{a,b}\, f_{a,b'}\; q^{\frac14 \sum_{i=0}^{2n-1} [(\ell^b_{i+1} - \ell^b_i) + (\ell^{b'}_{i+1} - \ell^{b'}_i)] (\ell^a_{i+1} - \ell^a_i)}, \qquad (6.67)$$

where we see that the contributions of the $\ell^{\max}$'s and that of the Chebyshev polynomial have cancelled each other, thanks to the identity

$$\ell_n^a - \frac12 \sum_{i=0}^{2n-1} (\ell^{\max}_{i+1} - \ell^{\max}_i)(\ell^a_{i+1} - \ell^a_i) = 0. \qquad (6.68)$$

For large $q$'s, the sum in the rhs of (6.67) is dominated by some $a \in W_n$ for which the exponent of $q$ is maximal. Such a maximum is unique, as the coefficient of $q^{c(b,b')}$ is 1. This yields the following formula for the number of connected components $c(b,b')$:
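$$c(b,b') = \frac14 \max_{\substack{a \in W_n \\ a\ b\text{- and } b'\text{-symmetric}}} \ \sum_{i=0}^{2n-1} \big[ (\ell^b_{i+1} - \ell^b_i) + (\ell^{b'}_{i+1} - \ell^{b'}_i) \big] (\ell^a_{i+1} - \ell^a_i). \qquad (6.69)$$

A particular case corresponding to semi-meanders consists in taking $b' = \mathcal{W}_n^{(n)} = r_n$, the rainbow configuration of order $n$. Using (6.68), we find

$$c(b) = c\big(b, \mathcal{W}_n^{(n)}\big) = \frac14 \max_{\substack{a \in W_n \\ a\ \text{symmetric and } b\text{-symmetric}}} \Big\{ 2\ell_n^a + \sum_{i=0}^{2n-1} (\ell^b_{i+1} - \ell^b_i)(\ell^a_{i+1} - \ell^a_i) \Big\}. \qquad (6.70)$$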
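The max formula for the number of connected components lends itself to a direct numerical comparison with an explicit loop count. The sketch below assumes the reading of the formula as one quarter of the maximum, over walks $a$ that are both $b$- and $b'$-symmetric, of $\sum_i (t_i(b) + t_i(b'))\, t_i(a)$, where $t_i$ denotes the steps $\pm 1$; the encoding by Dyck words and the function names are illustrative choices, not the paper's.

```python
from fractions import Fraction


def dyck_words(n):
    """All step sequences (+1/-1) of walk diagrams of order n."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map (0-based bridges) of the arch configuration of a walk diagram."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def is_symmetric_under(a, b):
    """True if the walk a is b-symmetric: opposite steps under each arch of b."""
    sigma = arches(b)
    return all(a[sigma[i]] == -a[i] for i in range(len(a)))


def components_by_loops(b, bp):
    """c(b, b') by following the loops of the meander made of b (top) and b' (bottom)."""
    upper, lower = arches(b), arches(bp)
    seen, count = set(), 0
    for start in range(len(b)):
        if start in seen:
            continue
        count += 1
        j = start
        while j not in seen:
            seen.add(j)
            j = upper[j]
            seen.add(j)
            j = lower[j]
    return count


def components_by_max(b, bp):
    """c(b, b') from the max formula over walks a that are b- and b'-symmetric."""
    n = len(b) // 2
    best = max(
        sum((tb + tbp) * ta for ta, tb, tbp in zip(a, b, bp))
        for a in dyck_words(n)
        if is_symmetric_under(a, b) and is_symmetric_under(a, bp)
    )
    return Fraction(best, 4)
```

The set of admissible $a$'s is never empty, because the smallest diagram (steps $+,-,+,-,\ldots$) is symmetric with respect to every arch configuration, so the maximum is always well defined.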
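Another interesting consequence of the expression (6.65) is obtained if we take $b = b'$, in which case $c(b,b) = n$. It takes the form of a sum rule for $f_{a,b}$, namely, for any $b \in W_n$,

$$q^n = \sum_{a \in W_n} f_{a,b}\, U_{\ell_n^a}(q)\, \exp\Big\{ \frac12 \sum_{i=0}^{2n-1} \big[ (\ell^{\max}_{i+1} - \ell^{\max}_i) - (\ell^b_{i+1} - \ell^b_i) \big] (\ell^a_{i+1} - \ell^a_i)\, \log \mu_{(\ell^a_{i+1} + \ell^a_i + 1)/2} \Big\}. \qquad (6.71)$$

In particular, for $b = \mathcal{W}_n^{(n)}$, hence $\ell_i^b = \ell_i^{\max}$ for all $i$, we find, with $f_{a,\mathcal{W}_n^{(n)}} = \delta_{a,\text{symmetric}}$:

$$q^n = \sum_{\substack{a \in W_n \\ a\ \text{symmetric}}} U_{\ell_n^a}(q) = \sum_{p=0}^{[n/2]} b_{n,n-2p}\, U_{n-2p}(q), \qquad (6.72)$$

which is easily proved by recursion on $n$ (the coefficient $b_{n,n-2p}$, computed in (5.52), is indeed the number of symmetric diagrams with middle height $h = n - 2p$).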
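The sum rule for $b = b'$ can be tested with exact rational arithmetic, because each link then contributes an integer power of some $\mu_k$: the exponent $\frac12[t_i(\max) - t_i(b)]\, t_i(a)$ is $0$ or $\pm 1$. The sketch below assumes this reading of the sum rule, with $\mu_k = U_{k-1}/U_k$, $k = (\ell^a_{i+1} + \ell^a_i + 1)/2$, and the Chebyshev recursion $U_{k+1} = qU_k - U_{k-1}$; the function names are illustrative.

```python
from fractions import Fraction


def dyck_words(n):
    """All step sequences (+1/-1) of walk diagrams of order n."""
    def rec(prefix, ups, downs):
        if ups == n and downs == n:
            yield tuple(prefix)
            return
        if ups < n:
            yield from rec(prefix + [1], ups + 1, downs)
        if downs < ups:
            yield from rec(prefix + [-1], ups, downs + 1)
    yield from rec([], 0, 0)


def arches(steps):
    """Partner map (0-based bridges) of the arch configuration of a walk diagram."""
    stack, partner = [], [0] * len(steps)
    for j, t in enumerate(steps):
        if t == 1:
            stack.append(j)
        else:
            i = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def is_symmetric_under(a, b):
    """True if the walk a is b-symmetric."""
    sigma = arches(b)
    return all(a[sigma[i]] == -a[i] for i in range(len(a)))


def sum_rule_rhs(b, q):
    """Sum over b-symmetric walks a of U_{l_n^a}(q) times the product of mu factors."""
    n = len(b) // 2
    q = Fraction(q)
    U = [Fraction(1), q]                 # ASSUMED: U_0 = 1, U_1 = q, U_{k+1} = q U_k - U_{k-1}
    while len(U) <= n:
        U.append(q * U[-1] - U[-2])
    rainbow = (1,) * n + (-1,) * n       # steps of the maximal (rainbow) diagram
    total = Fraction(0)
    for a in dyck_words(n):
        if not is_symmetric_under(a, b):
            continue
        heights = [0]
        for t in a:
            heights.append(heights[-1] + t)
        term = U[heights[n]]             # Chebyshev factor at the middle height of a
        for i in range(2 * n):
            exponent = (rainbow[i] - b[i]) * a[i] // 2   # equals 0, +1 or -1
            k = (heights[i] + heights[i + 1] + 1) // 2
            term *= (U[k - 1] / U[k]) ** exponent        # mu_k ** exponent
        total += term
    return total
```

For the rainbow diagram all exponents vanish and the sum reduces to $\sum_p b_{n,n-2p}\, U_{n-2p}(q)$; for every other $b$ the $\mu$ factors are needed to recover $q^n$. Values $q \ge 2$ keep all $U_k(q)$ nonzero.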
6.8. Asymptotics for $q > 2$. In this section, we use the expressions (6.62), (6.63) to derive asymptotic formulas for the semi-meander and meander polynomials for large $n$. Such formulas can only be inferred when all the terms in the sums (6.62), (6.63) over walk diagrams are positive. This is the case for all $q > 2$, for which $U_m(q) > 0$ and $\mu_m > 0$ for all $m$.

q = 2. As a preliminary exercise, let us start by taking the limit $q \to 2$ of the sum rule (6.71). Due to the definition (4.2), we have

$$U_\ell(2) = (\ell + 1), \qquad \mu_\ell(2) = \frac{\ell}{\ell + 1}, \qquad (6.73)$$

therefore, when $q \to 2$, (6.71) becomes
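$$2^n = \sum_{a \in W_n} f_{a,b}\, (\ell_n^a + 1)\, \exp\Big\{ \frac12 \sum_{i=0}^{2n-1} \big[ (\ell^{\max}_{i+1} - \ell^{\max}_i) - (\ell^b_{i+1} - \ell^b_i) \big] (\ell^a_{i+1} - \ell^a_i)\, \log \frac{\ell^a_{i+1} + \ell^a_i + 1}{\ell^a_{i+1} + \ell^a_i + 3} \Big\}. \qquad (6.74)$$

Note that, summing (6.74) over $b \in W_n$, we get the result

$$\sum_{a,b \in W_n} f_{a,b}\, (\ell_n^a + 1)\, \exp\Big\{ \frac12 \sum_{i=0}^{2n-1} \big[ (\ell^{\max}_{i+1} - \ell^{\max}_i) - (\ell^b_{i+1} - \ell^b_i) \big] (\ell^a_{i+1} - \ell^a_i)\, \log \frac{\ell^a_{i+1} + \ell^a_i + 1}{\ell^a_{i+1} + \ell^a_i + 3} \Big\} = 2^n c_n, \qquad (6.75)$$

which behaves, for large $n$, like

$$\frac{8^n}{n^{3/2}} \sim n \sum_{a,b \in W_n} f_{a,b} \qquad (6.76)$$

by making use of the asymptotics (6.42). Comparing (6.75) and (6.76), we are led to the following scaling hypothesis for the values of $\ell^b$ and $\ell^a$ dominating the sum (6.75):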
$$\ell^a_i \sim n^{\nu}\,g_a(x), \qquad \ell^b_i \sim n^{\nu}\,g_b(x), \tag{6.77}$$
where $x = i/n$ and $\nu \in [0,1]$ is an exponent characterizing the average height of the walk diagrams $a$, $b$. For this hypothesis to be compatible with (6.76), we must necessarily have $\nu = 1$, in which case the exponential in (6.75) tends to a constant (see footnote 9: the sum over $i$ is of order $n$, but the logarithm is of order $1/n$), and the factor $(\ell^a_n+1)$ tends to const. $\times\, n$, which yields (6.76). This is an example of the use of a scaling hypothesis on the $\ell$'s dominating the sum (6.75), leading to large $n$ asymptotics. Analogously, if we make the same scaling hypothesis (6.77), with $\nu = 1$, on the $\ell$'s dominating the sums (6.62), (6.63), for $q = 2$, we find the asymptotic relations, valid for large $n$,
$$\bar m_n(2) \sim n \sum_{\substack{a,b\in W_n\\ a\ \text{symmetric}}} f_{a,b}, \qquad m_n(2) \sim n \sum_{a,b,b'\in W_n} f_{a,b}\,f_{a,b'}. \tag{6.78}$$
This expresses the asymptotics of the meander and semi-meander polynomials at $q = 2$ in terms of $f_{a,b}$ only. In going from (6.76) to (6.78), we have assumed that configurations of the same order of magnitude dominate both sums. In fact, we have made a scaling hypothesis on the matrix elements of $P_n^{-1}(q=2)$ and $\Gamma_n(q=2)$, namely that the configurations with
$$[P_n^{-1}(2)]_{a,b} \sim f_{a,b}, \qquad [\Gamma_n(2)]_{a,a} = (\ell^a_n+1) \sim n^{\nu} \tag{6.79}$$
dominate the three sums
$$\mathrm{Tr}\big(\mathcal{G}_n(2)\big) \sim n^{\nu}\sum_{a,b\in W_n} f_{a,b}, \qquad \vec v\cdot\mathcal{G}_n(2)\,\vec u \sim n^{\nu}\sum_{\substack{a,b\in W_n\\ a\ \text{symmetric}}} f_{a,b}, \qquad \vec u\cdot\mathcal{G}_n(2)\,\vec u \sim n^{\nu}\sum_{a,b,b'\in W_n} f_{a,b}\,f_{a,b'} \tag{6.80}$$
9 To see why, note that for large n and g's the sum in the exponential may be approximated by 2n-- 1
2-
max _ i
_ (ei+1
i+l - gi e ~ +[
i---O
i=0
where we have performed a discrete integration by parts. Hence the exponential of this sum is equivalent to Hi
min. of b (ca + 1)
( g ~ + l ) • 1-Ii . . . . . fb (~a +1) ~ const. The products extend respectively over the i's which are minima and maxima of the walk b and as there is always one more maximum than minima, the above ratio is exactly balanced, hence is of order 1 for large e~'s.
with the same value of $\nu = 1$. Let us stress, however, that the scaling hypothesis (6.79) leads to a wrong result for the meander determinant $D_n(2)$ for large $n$. Indeed, from (6.79), we would conclude that
$$D_n(2) \sim \prod_{a\in W_n} f_{a,a}^2\,n^{\nu} \sim n^{\nu c_n}, \tag{6.81}$$
whereas, from the exact result (5.6) for $D_n(2)$, we extract the large $n$ asymptotics
$$\log D_n(2) = \sum_{j=1}^{n} a_{n,j}\,\log(j+1) \sim \sqrt{\pi n}\;c_n \tag{6.82}$$
by the standard saddle point technique (note that we find exactly twice the previous result (5.19) for the large $n$ asymptotics of $\log\det D_n(0)$). The correct asymptotics (6.82) contradict (6.81). This simply means that the configurations of $a \in W_n$ dominating the meander determinant are very different from those dominating the trace of the Gram matrix or the (semi-)meander polynomial.

$q > 2$. We start again from the sum rule (6.71), with $q = e^{\theta}+e^{-\theta}$, $\theta > 0$. We again make the hypothesis that, when summed over $b \in W_n$, the sum (6.71) is dominated by large $\ell$'s for large $n$. Noting that
$$U_m(e^{\theta}+e^{-\theta}) \sim \frac{e^{m\theta}}{1-e^{-2\theta}}, \qquad \mu_m \sim e^{-\theta} \tag{6.83}$$
for large $m$, this gives the asymptotic formula
$$\begin{aligned} c_n\,(e^{\theta}+e^{-\theta})^n &\sim \sum_{a,b\in W_n} f_{a,b}\,\frac{e^{\theta\ell^a_n}}{1-e^{-2\theta}}\;e^{-\frac{\theta}{2}\sum_{i=0}^{2n-1}[(\ell^{\max}_{i+1}-\ell^{\max}_i)-(\ell^b_{i+1}-\ell^b_i)](\ell^a_{i+1}-\ell^a_i)} \\ &= \frac{1}{1-e^{-2\theta}}\sum_{a,b\in W_n} f_{a,b}\;e^{\frac{\theta}{2}\sum_{i=0}^{2n-1}(\ell^b_{i+1}-\ell^b_i)(\ell^a_{i+1}-\ell^a_i)} \\ &\sim \frac{4^n}{n^{3/2}}\,(e^{\theta}+e^{-\theta})^n \end{aligned} \tag{6.84}$$
where we have used (6.68). This gives an asymptotic sum rule involving the $f_{a,b}$'s and $q$. Assuming that the same scaling hypothesis holds for the sums (6.62), (6.63), we find the following asymptotic formulas
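$$\begin{aligned} \bar m_n(e^{\theta}+e^{-\theta}) &\sim \sum_{\substack{a,b\in W_n\\ a\ \text{symmetric}}} f_{a,b}\;e^{\frac{\theta}{2}\big[\ell^a_n+\frac{1}{2}\sum_{i=0}^{2n-1}(\ell^b_{i+1}-\ell^b_i)(\ell^a_{i+1}-\ell^a_i)\big]} \\ m_n(e^{\theta}+e^{-\theta}) &\sim \sum_{a,b,b'\in W_n} f_{a,b}\,f_{a,b'}\;e^{\frac{\theta}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \end{aligned} \tag{6.85}$$
where we have dropped the prefactor $1/(1-e^{-2\theta})$, subleading for $\theta > 0$. Indeed, the limits $\theta \to 0$ and $n \to \infty$ do not commute, hence (6.85) is only valid for $\theta > 0$. On the other hand, in the limit $\theta \to \infty$, we recover the large $q$ asymptotics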
$$\bar m_n(q) \sim q^n \sim e^{n\theta}, \qquad m_n(q) \sim c_n\,q^n \sim \frac{(4e^{\theta})^n}{n^{3/2}} \tag{6.86}$$
by using the two formulas (6.70), (6.69). As before, we can test the scaling hypothesis used above against the large $n$ asymptotics of the meander determinant for $q > 2$. This hypothesis amounts to writing
$$[P_n^{-1}(e^{\theta}+e^{-\theta})]_{a,b} \sim f_{a,b}\;e^{\frac{\theta}{4}\big[-2\ell^a_n+\sum_{i=0}^{2n-1}(\ell^b_{i+1}-\ell^b_i)(\ell^a_{i+1}-\ell^a_i)\big]}, \qquad [\Gamma_n(e^{\theta}+e^{-\theta})]_{a,a} \sim e^{\theta\ell^a_n}. \tag{6.87}$$
The corresponding large $n$ estimate of the meander determinant reads
$$D_n(e^{\theta}+e^{-\theta}) \sim \prod_{a\in W_n} f_{a,a}^2\,e^{n\theta} = e^{n c_n\theta}, \tag{6.88}$$
whereas the exact formula (5.6) leads to the asymptotics
$$\log D_n(e^{\theta}+e^{-\theta}) = \sum_{j=1}^{n} a_{n,j}\,\log\frac{\sinh (j+1)\theta}{\sinh\theta} \sim n\,c_n\,\theta \tag{6.89}$$
by the standard saddle point method. The agreement between the two estimates (6.88), (6.89) is an a posteriori confirmation that the scaling hypothesis (6.87) holds for a very large class of properties of the Gram matrix $\mathcal{G}_n(q)$, for $q > 2$ and large $n$. Finally, in view of the assumed $q = 2$ value $\nu(2) = 1$, and the exact $q \to \infty$ value $\nu(\infty) = 1$ (the semi-meander polynomial (6.86) is indeed dominated by the single diagram $b = \mathcal{W}_n^{(n)}$, with winding $\ell^b_n = n \sim n^{\nu(\infty)}$), it is reasonable to infer that $\nu(q)$ is identically equal to 1 for all $q \geq 2$.
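The two exact-formula asymptotics quoted above, (6.82) and (6.89), can be probed numerically. The sketch below assumes the exponents $a_{n,j} = \binom{2n}{n-j} - 2\binom{2n}{n-j-1} + \binom{2n}{n-j-2}$ of (5.6) and simply forms the ratios of $\log D_n$ to the claimed leading terms; the function names are illustrative, and the ratios approach 1 only slowly, with corrections of relative order $1/\sqrt{n}$ at $q = 2$.

```python
from fractions import Fraction
from math import comb, log, sinh, sqrt, pi

def binom(n, k):
    """Binomial coefficient, zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def a(n, j):
    """Exponent a_{n,j} of U_j in the meander determinant, as in (5.6)."""
    return binom(2*n, n-j) - 2*binom(2*n, n-j-1) + binom(2*n, n-j-2)

def catalan(n):
    return comb(2*n, n) // (n + 1)

def log_det_over_cn(n, log_U):
    """(1/c_n) * sum_j a_{n,j} log U_j, with log U_j supplied by the caller."""
    cn = catalan(n)
    return sum(float(Fraction(a(n, j), cn)) * log_U(j) for j in range(1, n + 1))

def ratio_q2(n):
    """log D_n(2) divided by sqrt(pi n) c_n, cf. (6.82)."""
    return log_det_over_cn(n, lambda j: log(j + 1)) / sqrt(pi * n)

def ratio_theta(n, theta):
    """log D_n(e^theta + e^-theta) divided by n c_n theta, cf. (6.89)."""
    log_U = lambda j: log(sinh((j + 1) * theta) / sinh(theta))
    return log_det_over_cn(n, log_U) / (n * theta)

for n in (50, 100, 200, 400):
    print(n, ratio_q2(n), ratio_theta(n, 1.0))
```

Dividing each integer exponent by $c_n$ as an exact fraction before converting to floating point avoids overflow for the large binomials involved.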
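6.9. Meander and semi-meander polynomials as SOS partition functions. The asymptotic formulas (6.85) are to be compared with the following exact formulas
$$\begin{aligned} \bar m_n(e^{\theta}+e^{-\theta}) &= \sum_{\substack{a\in P_n,\ b\in W_n\\ a\ \text{symmetric}}} f_{a,b}\;e^{\frac{\theta}{2}\big[\ell^a_n+\frac{1}{2}\sum_{i=0}^{2n-1}(\ell^b_{i+1}-\ell^b_i)(\ell^a_{i+1}-\ell^a_i)\big]} \\ m_n(e^{\theta}+e^{-\theta}) &= \sum_{\substack{a\in P_n\\ b,b'\in W_n}} f_{a,b}\,f_{a,b'}\;e^{\frac{\theta}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \end{aligned} \tag{6.90}$$
where $a$ now runs over the set $P_n$ of all closed paths of $2n$ steps (with $\ell^a_0 = \ell^a_{2n} = 0$), not subject to the constraint $\ell^a_i \geq 0$. The relations (6.90) may indeed be obtained as consequences of the following alternative formula for $q^{c(b,b')}$, $b, b' \in W_n$ (to be compared with (6.65))
$$(e^{\theta}+e^{-\theta})^{c(b,b')} = \sum_{a\in P_n} f_{a,b}\,f_{a,b'}\;e^{\frac{\theta}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \tag{6.91}$$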
Let us now prove (6.91). On the one hand, as $a$ is both $b$- and $b'$-symmetric, the values of $t_i(a) = (\ell^a_{i+1}-\ell^a_i)$ are fixed, up to an overall sign, along each connected component of the meander $(b,b')$, and alternate on successive bridges along the connected component.

Fig. 22. The four possible local environments of the $(i+1)$th bridge together with the corresponding value $s_i(b,b') = \pm 1, 0$

On the other hand, the quantity $s_i(b,b') = [(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)]/2$ may only take the three values $-1$, $0$ and $+1$, corresponding to the four possibilities of local environment of the $(i+1)$th bridge of the meander $(b,b')$, depicted in Fig. 22. Along any connected component of $(b,b')$, the variable $s_i(b,b')$ alternates as long as it remains nonzero, and discarding all the zeros leaves us with an alternating sign.

Fig. 23. An oriented connected component $K$ with 10 bridges. Starting from bridge 1, the sequence of visited bridges is 1, 8, 9, 10, 3, 4, 7, 6, 5, 2.
For illustration, with the connected component depicted in Fig. 23, this gives the sequence, starting from the bridge 1

bridge i           1   8   9   10  3   4   7   6   5   2
t_i(a)             +   -   +   -   +   -   +   -   +   -
s_i(b,b')          +   0   0   -   +   0   -   0   0   0
t_i(a) s_i(b,b')   +   0   0   +   +   0   -   0   0   0
turn               R           R   R       L
where we also indicated the type of turn (right = R, left = L) taken on the corresponding bridge. The global sign $t_i(a)s_i(b,b')$ is thus constant between two zeros and is reversed through each zero. Since a zero indicates a transition from turning left to right and vice versa along the meander, the quantity
$$\frac{1}{2}\sum_{i\ \text{along}\ K}(\ell^a_{i+1}-\ell^a_i)\big[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)\big] = \sum_{i\ \text{along}\ K} t_i(a)\,s_i(b,b'), \tag{6.92}$$
summed along any connected component $K$ of the meander $(b,b')$, is simply equal, up to a sign, to the total number of right turns minus that of left turns $(n_R-n_L)$, taken on the bridges along $K$. As on any closed loop we have $(n_R-n_L) = \pm 2$, we compute
$$f(K) = \sum_{\substack{t_i(a)=\pm 1\\ i\ \text{along}\ K}} f_{a,b}\,f_{a,b'}\;e^{\frac{\theta}{2}\sum_{i\ \text{along}\ K} t_i(a)s_i(b,b')} = \sum_{\epsilon=\pm 1} e^{\theta\epsilon(n_R-n_L)/2} = e^{\theta}+e^{-\theta}, \tag{6.93}$$
where the sum over $\epsilon = \pm 1$ corresponds to the only overall sign ambiguity left on the $t_i(a)$ after taking into account the $b$- and $b'$-symmetry of $a$ on $K$. The final result (6.91) is simply the product over all the connected components $K$ of $(b,b')$ of the weight $f(K)$ above, which completes the proof of the result. More generally, the above analysis can be carried over to $q = z+1/z$, for any complex number $z$, resulting in
$$(z+1/z)^{c(b,b')} = \sum_{a\in P_n} f_{a,b}\,f_{a,b'}\;z^{\frac{1}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \tag{6.94}$$
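The unrestricted SOS formula (6.94) lends itself to a brute-force check for small $n$. The sketch below is written under two stated assumptions: a walk is encoded by its steps $\pm 1$, and "$a$ is $b$-symmetric" ($f_{a,b} = 1$) is read as "the steps of $a$ at the two ends of every arch of $b$ are opposite", which is the reading used in Appendix C. Both sides are compared as Laurent polynomials in $z$; all function names are illustrative.

```python
from itertools import product
from math import comb

def closed_walks(n, restricted):
    """Step sequences (+1/-1) of length 2n returning to height 0.
    restricted=True keeps heights >= 0 (the set W_n); False gives P_n."""
    out = []
    for steps in product((1, -1), repeat=2 * n):
        h, ok = 0, True
        for s in steps:
            h += s
            if restricted and h < 0:
                ok = False
                break
        if ok and h == 0:
            out.append(steps)
    return out

def arches(b):
    """Arch pairs (i, j) of the arch configuration encoded by the walk b in W_n."""
    stack, pairs = [], []
    for i, s in enumerate(b):
        if s == 1:
            stack.append(i)
        else:
            pairs.append((stack.pop(), i))
    return pairs

def is_symmetric(a, b):
    """Assumed reading of f_{a,b} = 1: opposite steps of a on each arch of b."""
    return all(a[i] == -a[j] for i, j in arches(b))

def components(b, bp):
    """Number c(b, b') of connected components of the meander (b, b')."""
    parent = list(range(len(b)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in arches(b) + arches(bp):
        parent[find(i)] = find(j)
    return len({find(x) for x in range(len(b))})

def sos_laurent(b, bp, P):
    """Right-hand side of (6.94) as a dict {exponent of z: coefficient}."""
    poly = {}
    for a in P:
        if is_symmetric(a, b) and is_symmetric(a, bp):
            four_e = sum((sb + sbp) * sa for sb, sbp, sa in zip(b, bp, a))
            assert four_e % 4 == 0          # the exponent is an integer
            poly[four_e // 4] = poly.get(four_e // 4, 0) + 1
    return poly

def q_power(c):
    """(z + 1/z)^c as a dict {exponent of z: coefficient}."""
    return {c - 2 * k: comb(c, k) for k in range(c + 1)}

n = 3
W, P = closed_walks(n, True), closed_walks(n, False)
for b in W:
    for bp in W:
        assert sos_laurent(b, bp, P) == q_power(components(b, bp))
```

Comparing Laurent polynomials coefficient by coefficient tests the identity for every complex $z$ at once, rather than at a few sample values.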
This yields the following general expressions for semi-meander and meander polynomials at $q = z+1/z$ for arbitrary complex $z$
$$\begin{aligned} \bar m_n(z+1/z) &= \sum_{\substack{a\in P_n,\ b\in W_n\\ a\ \text{symmetric}}} f_{a,b}\;z^{\frac{1}{2}\big[\ell^a_n+\frac{1}{2}\sum_{i=0}^{2n-1}(\ell^b_{i+1}-\ell^b_i)(\ell^a_{i+1}-\ell^a_i)\big]} \\ m_n(z+1/z) &= \sum_{\substack{a\in P_n\\ b,b'\in W_n}} f_{a,b}\,f_{a,b'}\;z^{\frac{1}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \end{aligned} \tag{6.95}$$

Fig. 24. An example of SOS configuration attached to a meander. We display the value of the height $\ell$. Note that it is entirely dictated by the choices of orientation of the connected components of the meander, and the fact that $\ell = 0$ at infinity
This analysis suggests interpreting the quantity $q^{c(b,b')}$ as the Boltzmann weight of a particular configuration, formed by the meander $(b,b')$, of a suitably defined SOS model. Indeed, the $b$- and $b'$-symmetry of $a \in P_n$ implies that the variable $\ell^a$ takes identical values on all segments of river which can be connected to each other without crossing any arch of $b$ or $b'$. Therefore, the variable $\ell$ may be considered as a height variable in the plane, constant on each connected component delimited by one or several roads, undergoing a jump discontinuity of $\pm 1$ across each road (see Fig. 24 for an example), and continuous across the river. In particular, $\ell = 0$ at infinity, due to the boundary condition $\ell_0 = \ell_{2n} = 0$. Such a height configuration induces a unique orientation of the various connected components of $(b,b')$, by taking the convention that $\ell \to \ell+1$ (resp. $\ell \to \ell-1$) across a road pointing to the right (resp. left). Conversely, a choice of orientation of the connected components of $(b,b')$ specifies uniquely the height configuration, by further demanding that $\ell = 0$ at infinity. The Boltzmann weight
$$z^{\frac{1}{4}\sum_{i=0}^{2n-1}[(\ell^b_{i+1}-\ell^b_i)+(\ell^{b'}_{i+1}-\ell^{b'}_i)](\ell^a_{i+1}-\ell^a_i)} \tag{6.96}$$
corresponds to attaching to each bridge of $(b,b')$ one of the following Boltzmann weights

[pictures of the oriented local environments of a bridge, carrying the weights $z^{1/2}$, $1$ or $z^{-1/2}$] (6.97)

according to the local environment of the bridge, and taking the product over all the bridge weights. Again, summing over the two orientations of each connected component $K$ of $(b,b')$ results in a total weight per connected component
$$\sum_{\epsilon=\pm 1} z^{\epsilon(n_R-n_L)/2} = z+1/z = q, \tag{6.98}$$
where $n_R$ (resp. $n_L$) is the number of right (resp. left) turns of the road on the bridges of $K$, and $\epsilon = \pm 1$ accounts for the global orientation of $K$. In the language of SOS models, the expression (6.65) corresponds to a Restricted SOS version, in which the height variable is further restricted to be non-negative (in particular the configuration of Fig. 24 is ruled out). As a first element of comparison with the results of the previous section, if we write (6.95) at $z = 1$, hence $q = 2$, we see that
$$\bar m_n(2) = \sum_{\substack{a\in P_n,\ b\in W_n\\ a\ \text{symmetric}}} f_{a,b}, \qquad m_n(2) = \sum_{\substack{a\in P_n\\ b,b'\in W_n}} f_{a,b}\,f_{a,b'}, \tag{6.99}$$
to be compared with the asymptotic estimates (6.78): this gives a relation between sums over $P_n$ and over $W_n$, involving the same combinations of $f$. Note that the same type of relation links the cardinals of the two sets over which $a$ is summed, namely
$$\mathrm{card}(P_n) = \binom{2n}{n} = (n+1)\,c_n = (n+1)\,\mathrm{card}(W_n), \tag{6.100}$$
and also, using (6.41),
$$\sum_{a\in P_n,\ b\in W_n} f_{a,b} = 2^n c_n = \frac{2}{3}(n+2)\sum_{a,b\in W_n} f_{a,b}. \tag{6.101}$$
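The counting relations (6.100) and (6.101) can be confirmed by direct enumeration for small $n$. This is a minimal sketch under the same assumption as before on the meaning of $f_{a,b}$ (opposite steps of $a$ at the two ends of each arch of $b$); `sum_rule_counts` is an illustrative helper, not a quantity defined in the text.

```python
from itertools import product
from math import comb

def closed_walks(n, restricted):
    """Step sequences (+1/-1) of length 2n returning to 0; restricted keeps heights >= 0."""
    out = []
    for steps in product((1, -1), repeat=2 * n):
        h, ok = 0, True
        for s in steps:
            h += s
            if restricted and h < 0:
                ok = False
                break
        if ok and h == 0:
            out.append(steps)
    return out

def arches(b):
    """Arch pairs of the arch configuration encoded by the walk b in W_n."""
    stack, pairs = [], []
    for i, s in enumerate(b):
        if s == 1:
            stack.append(i)
        else:
            pairs.append((stack.pop(), i))
    return pairs

def sum_rule_counts(n):
    """Return (card W_n, card P_n, sum over P_n x W_n of f, sum over W_n x W_n of f)."""
    W, P = closed_walks(n, True), closed_walks(n, False)
    arch_lists = [arches(b) for b in W]
    f = lambda a, pairs: all(a[i] == -a[j] for i, j in pairs)
    unrestricted = sum(f(a, pairs) for a in P for pairs in arch_lists)
    restricted = sum(f(a, pairs) for a in W for pairs in arch_lists)
    return len(W), len(P), unrestricted, restricted

for n in range(1, 6):
    cn = comb(2 * n, n) // (n + 1)
    card_W, card_P, unrestricted, restricted = sum_rule_counts(n)
    assert card_W == cn and card_P == (n + 1) * cn        # (6.100)
    assert unrestricted == 2 ** n * cn                    # first equality in (6.101)
    assert 3 * unrestricted == 2 * (n + 2) * restricted   # second equality in (6.101)
```

The unrestricted sum equals $2^n c_n$ for a simple reason visible in the code: for each $b$, a $b$-symmetric closed path is fixed by one free sign per arch.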
The reader could wonder in what respect the restricted expressions (6.62), (6.63) of the previous section are really different from the simple SOS expressions (6.90) obtained above. Actually, the considerations of the previous section on the heights $\ell$ dominating the expressions (6.62), (6.63) for the meander and semi-meander polynomials, eventually leading to an exponent $\nu = 1$ for $q = 2$, could not be carried over here, because of the lack of an explicit prefactor proportional to $(\ell+1)$. Hence, in some sense, the formulas (6.62), (6.63) (at least for $q = 2$) give us access to more precise details on the path formulation. More generally, it is interesting to compare the $q > 2$ formulas (6.90) and (6.85). We see that these are identical, except for the range of summation over $a$ ($W_n$ in (6.85) and $P_n$ in (6.90)). We conclude that the restriction condition $\ell^a_i \geq 0$ in (6.85) is not important in the large $n$ limit, for $q > 2$.
7. Generalization: the Semi-meander Determinant

In this section, we consider a possible generalization of the meander determinant to semi-meanders in the following way.

Fig. 25. Any semi-meander may be viewed as the superimposition of an upper and a lower open arch configurations. Here the initial semi-meander has winding 3. The two open arch configurations on the right have $h = 3$ open arches. To recover the initial semi-meander, these open arches must be connected two by two, from the right to the left (the arches number 5, 4, 1 of the upper configuration are respectively connected to the arches number 5, 4, 3 of the lower configuration).

Going back to the original river/road formulation of semi-meanders, we see in Fig. 25 that any given semi-meander, with winding number $h$, is obtained as the superimposition of two (upper and lower) open arch configurations of order $n$, with $h$ open arches. By this, we mean that $h$ semi-infinite vertical roads originate from $h$ of the $n$ bridges, the others being connected by pairs through $(n-h)/2$ nonintersecting arches (the winding $h$ always has the same parity as the order $n$ in the semi-meanders). The semi-meander is rebuilt in a unique way by connecting the upper and lower open arches from the right to the left. In particular, only open arch configurations with the same number of open arches may be superimposed to yield a semi-meander. Let $A_n^{(h)}$ denote the set of open arch configurations of order $n$ with $h$ open arches. It is a simple exercise to show that
$$\mathrm{card}(A_n^{(h)}) = b_{n,h} = \binom{n}{\frac{n-h}{2}} - \binom{n}{\frac{n-h}{2}-1}. \tag{7.1}$$
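The count (7.1) can be confirmed by enumerating half-walks directly. A minimal sketch follows, assuming the half-walk description of open arch configurations ($n$ steps $\pm 1$ starting at height 0, heights non-negative, final height $h$); the function names are illustrative.

```python
from itertools import product
from math import comb

def count_half_walks(n, h):
    """Brute-force count of walks of n steps (+1/-1) from height 0,
    staying at height >= 0 and ending at height h."""
    count = 0
    for steps in product((1, -1), repeat=n):
        height = 0
        for s in steps:
            height += s
            if height < 0:
                break
        else:
            if height == h:
                count += 1
    return count

def b(n, h):
    """Closed form (7.1); requires h = n mod 2 and 0 <= h <= n."""
    k = (n - h) // 2
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)

for n in range(1, 11):
    for h in range(n % 2, n + 1, 2):
        assert count_half_walks(n, h) == b(n, h)
```

For example, `b(4, 2)` gives the three open arch configurations of order 4 with two open arches.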
Indeed, the open arch configurations of order $n$ with $h$ open arches are in one-to-one correspondence with the half-walk diagrams of $n$ steps, with final height $h$, namely with $\ell_0 = 0$, $\ell_i \geq 0$ and $\ell_n = h$. Let $W_n^{(h)} \equiv A_n^{(h)}$ denote the set of half-walks of order $n$ with final height $h$. The number of such half-walks has been derived in Eq. (5.52) above. We now define the semi-meander determinant of order $n$ and winding $h$ as the determinant $D_n^{(h)}(q)$ of the matrix $\mathcal{G}_n^{(h)}(q)$ with entries
$$[\mathcal{G}_n^{(h)}(q)]_{l,l'} = q^{c(l,l')}, \qquad l, l' \in W_n^{(h)} \equiv A_n^{(h)}, \tag{7.2}$$
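where $c(l,l')$ denotes the number of connected components of the semi-meander obtained by superimposing the open arch configurations $l$ and $l'$ and connecting their $h$ open arches. For illustration, we list below the matrices corresponding to $n = 4$, $h = 0, 2, 4$,
$$\mathcal{G}_4^{(0)}(q) = \begin{pmatrix} q^2 & q \\ q & q^2 \end{pmatrix}, \qquad \mathcal{G}_4^{(2)}(q) = \begin{pmatrix} q^3 & q^2 & q \\ q^2 & q^3 & q^2 \\ q & q^2 & q^3 \end{pmatrix}, \qquad \mathcal{G}_4^{(4)}(q) = q^4 \tag{7.3}$$
with the following ordering of open arch configurations

[pictures of the open arch configurations of order 4: two for $h = 0$, three for $h = 2$, one for $h = 4$] (7.4)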
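Note also that $\mathcal{G}_{2n}^{(0)}(q) = \mathcal{G}_{2n-1}^{(1)}(q) = \mathcal{G}_n(q)$, hence the formula (5.6) applies to the winding zero and one cases. More generally, we conjecture that
$$D_n^{(h)}(q) = \det \mathcal{G}_n^{(h)}(q) = \prod_{j=1}^{\frac{n-h}{2}+1} U_j(q)^{\alpha^{(h)}_{n,j}}, \tag{7.5}$$
where the numbers $\alpha^{(h)}_{n,j}$ read, in terms of the $a_{n,j}$ of (5.6),
$$\begin{aligned} \alpha^{(2h)}_{2n,j} &= a_{n,j+h} + 2h\,a_{n,j+h-1} \\ \alpha^{(2h+1)}_{2n-1,j} &= a_{n,j+h} + 2h\,(a_{n-1,j+h} + a_{n-1,j+h-1}) \end{aligned} \tag{7.6}$$
We checked the validity of this conjecture up to $n = 9$. For instance, for $n = 8, 9$, we have
$$\begin{aligned} n = 8:\quad & D_8^{(0)} = U_1^{8}\,U_2^{13}\,U_3^{6}\,U_4, \quad D_8^{(2)} = U_1^{29}\,U_2^{32}\,U_3^{13}\,U_4^{2}, \quad D_8^{(4)} = U_1^{58}\,U_2^{25}\,U_3^{4}, \\ & D_8^{(6)} = U_1^{37}\,U_2^{6}, \quad D_8^{(8)} = U_1^{8}, \\ n = 9:\quad & D_9^{(1)} = U_1^{15}\,U_2^{40}\,U_3^{26}\,U_4^{8}\,U_5, \quad D_9^{(3)} = U_1^{82}\,U_2^{64}\,U_3^{22}\,U_4^{3}, \quad D_9^{(5)} = U_1^{102}\,U_2^{36}\,U_3^{5}, \\ & D_9^{(7)} = U_1^{50}\,U_2^{7}, \quad D_9^{(9)} = U_1^{9}, \end{aligned} \tag{7.7}$$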
in agreement with (7.5), (7.6). We have performed various checks on the numbers $\alpha^{(h)}_{n,j}$ of (7.6). In particular, the term of highest degree of $D_n^{(h)}(q)$, as a polynomial of $q$, is given by the product of the diagonal terms in $\mathcal{G}_n^{(h)}(q)$, namely
$$q^{\deg(D_n^{(h)})} = \prod_{l\in W_n^{(h)}} q^{\frac{n+h}{2}}, \tag{7.8}$$
hence
$$\deg(D_n^{(h)}) = \frac{n+h}{2}\,b_{n,h}. \tag{7.9}$$
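For small orders, both the conjectured product (7.5), (7.6) and the degree formula (7.9) can be tested against a brute-force Gram matrix. The sketch below is only an illustration under stated assumptions: open arch configurations are generated from half-walks, the $k$-th open arch of the upper configuration is joined to the $k$-th open arch of the lower one, and the determinant is evaluated exactly at the integer value $q = 3$ (an arbitrary choice). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binom(n, k):
    return comb(n, k) if 0 <= k <= n else 0

def a(n, j):
    """Exponents a_{n,j} of (5.6)."""
    return binom(2*n, n-j) - 2*binom(2*n, n-j-1) + binom(2*n, n-j-2)

def alpha(order, winding, j):
    """Exponent of U_j in the semi-meander determinant, as read from (7.6)."""
    if order % 2 == 0:
        n, h = order // 2, winding // 2
        return a(n, j + h) + 2*h*a(n, j + h - 1)
    n, h = (order + 1) // 2, (winding - 1) // 2
    return a(n, j + h) + 2*h*(a(n - 1, j + h) + a(n - 1, j + h - 1))

def open_arch_configs(order, winding):
    """List of (arch pairs, open bridge positions) for each half-walk."""
    configs = []
    for steps in product((1, -1), repeat=order):
        stack, pairs, ok = [], [], True
        for i, s in enumerate(steps):
            if s == 1:
                stack.append(i)
            elif stack:
                pairs.append((stack.pop(), i))
            else:
                ok = False
                break
        if ok and len(stack) == winding:
            configs.append((pairs, stack))
    return configs

def components(upper, lower, order):
    """Connected components after joining open arches pairwise."""
    parent = list(range(order))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in upper[0] + lower[0] + list(zip(upper[1], lower[1])):
        parent[find(i)] = find(j)
    return len({find(x) for x in range(order)})

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size, result = len(m), Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return result

def U(j, q):
    """Chebyshev value U_j(q) with U_0 = 1, U_1 = q."""
    prev, cur = 1, q
    for _ in range(j - 1):
        prev, cur = cur, q * cur - prev
    return cur if j >= 1 else 1

def check(order, winding, q=3):
    """Return (determinant matches (7.5), degree matches (7.9))."""
    configs = open_arch_configs(order, winding)
    gram = [[q ** components(l, lp, order) for lp in configs] for l in configs]
    js = range(1, (order - winding) // 2 + 2)
    predicted = Fraction(1)
    for j in js:
        predicted *= Fraction(U(j, q)) ** alpha(order, winding, j)
    degree = sum(j * alpha(order, winding, j) for j in js)
    return det(gram) == predicted, degree == (order + winding) // 2 * len(configs)

for order in range(1, 7):
    for winding in range(order % 2, order + 1, 2):
        assert check(order, winding) == (True, True)
```

Evaluating at a single integer $q$ is a spot check rather than a proof of the polynomial identity; repeating the call with other values of `q` strengthens it.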
This can actually be derived from (7.6). We expect that (7.5), (7.6) can be proved by diagonalizing the matrix $\mathcal{G}_n^{(h)}(q)$. This matrix has again a simple interpretation as the Gram matrix of a certain subspace of $TL_n(q)$, generated by some particular basis 1 elements. Inspired by the one-to-one correspondence between walk diagrams of order $n$ and the elements of the basis 1, we attach to any half-walk $l$ of $n$ steps and final height $h$ in $W_n^{(h)}$ the basis 1 element $(a)_1$ corresponding to the walk diagram $a = lr \in W_n$, where we have completed the half-walk $l$ with a particular choice of right half-walk $r$ of final height $h$, namely with $\ell^r_i = [1-(-1)^i]/2$, $i = 0, 1, \ldots, n-h$, and $\ell^r_i = i+h-n$ for $i = n-h+1, n-h+2, \ldots, n$. This corresponds to only retaining basis 1 elements which are obtained by acting on $f^{(n)}_h$ (defined in (3.6)) through left multiplications by $e_i$. In this new basis, the scalar product between two elements reads
$$(lr, l'r) = \mathrm{Tr}\big((lr)_1\,(l'r)_1^{t}\big) = q^{c(lr,l'r)} = q^{\frac{n-h}{2}}\,q^{c(l,l')}, \tag{7.10}$$
which coincides with (7.2) up to an overall prefactor of $q^{(n-h)/2}$ due to the addition of $(n-h)/2$ trivial loops to the semi-meander $(l,l')$. A proof of (7.5), (7.6) should follow the lines of that of (5.6), by writing a change of basis which diagonalizes the Gram matrix (7.2). Note also that, as in the meander case, the formula (7.5), (7.6) gives the multiplicities of the zeros of $D_n^{(h)}(q)$. Finally, the product over all the possible windings of the semi-meander determinants takes the simple form
[gn(q) =
~I
D~)(q) = ~ I UJ (q)~n'j
h=O r ~ - h=O rood 2
(7.11)
j=l
where
\ n -- j / ~2n--l,j
=
\ n -- j -- 1
\n-j/
\n-j~
(7.12)
n-j-1
as a direct consequence of (7.6), with fln,j = Z h ~ (h) Equation (7.11) may be viewed as the semi-meander counterpart of (5.6). The semi-meander gram matrix (7.2) also gives access to refined properties of the semi-meanders. Indeed, we may compute = Tr n
=Z
29/~k)(h) qSk,
(7.13)
k=l
where $\bar M_n^{(k)}(h)$ denotes the total number of semi-meanders of order $n$ with winding $h$ and $k$ connected components. An asymptotic study of these numbers should be made possible by the explicit diagonalization of $\mathcal{G}_n^{(h)}(q)$.
8. Conclusion
In this paper, we have extensively studied the representation of the meander and semimeander enumeration problems within the framework of the Temperley-Lieb algebra TLn(q). This representation is induced by the existence of a map between the reduced elements of TLn(q) and the arch configurations of order n used to build meanders and semi-meanders. Moreover, we have seen that the standard trace over TLn(q) provides a tool for counting the number of connected components of meandric objects. The first result of this paper is a direct computation of the meander determinant (5.6), interpreted as the Gram determinant of the basis of reduced elements of TLn(q), and the exact study of its zeros (5.11) and associated multiplicities (5.23)-(5.24). Beyond the meander determinant, we have been able to rewrite the change of basis diagonalizing the Gram matrix in terms of local height variables defining a restricted SOS model (see (6.65)). We also derived an unrestricted SOS model interpretation (see (6.94)) of the Gram matrix elements. These lead to various expressions for the meander and semi-meander polynomials, as weighted sums over discrete paths (walk diagrams). It is tempting to try to approximate these sums by continuous path integrals, in the limit of large number of bridges. In the case q > 2, where all the SOS Boltzmann weights are positive, this path integral might even be dominated by a simple subset of configurations, obtained for instance through a saddle point approximation. A generalization of this approach to the semi-meanders with fixed winding (number of times the roads wind around the source of the river) should be possible, in view of the conjectured form (7.5) for the corresponding (fixed winding) semi-meander determinants. A proof of (7.5) should be at hand, by a simple adaptation of the proof of (5.6) presented here. This will be addressed elsewhere.
Appendix A. Proof of the formula (5.23) for the multiplicities of the zeros of the meander determinant
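In order to prove (5.23), we note that
$$\delta_{j+1,\,0\ \mathrm{mod}\ (k+1)} = \frac{1}{k+1}\sum_{m=0}^{k}(\omega_{k+1})^{m(j+1)}, \tag{A.1}$$
where $\omega_{k+1} = e^{2i\pi/(k+1)}$, and rewrite
$$\begin{aligned} d_n(z_{k,l}) &= \frac{1}{k+1}\sum_{m=0}^{k}\sum_{j=1}^{n}(\omega_{k+1})^{m(j+1)}\,a_{n,j} \\ &= \frac{1}{k+1}\sum_{m=0}^{k}\sum_{j=1}^{n}\binom{2n}{n-j}\Big[(\omega_{k+1})^{m(j+1)} - 2(\omega_{k+1})^{mj} + (\omega_{k+1})^{m(j-1)}\Big] - \binom{2n}{n-1} \\ &= -\binom{2n}{n-1} - \frac{1}{k+1}\sum_{m=0}^{k}\Big(2\sin\frac{\pi m}{k+1}\Big)^2\sum_{j=1}^{n}\binom{2n}{n-j}(\omega_{k+1})^{mj} \\ &= -\binom{2n}{n-1} - \frac{1}{2(k+1)}\sum_{m=0}^{k}\Big(2\sin\frac{\pi m}{k+1}\Big)^2\Big[\Big(\sqrt{\omega_{k+1}^{m}}+\frac{1}{\sqrt{\omega_{k+1}^{m}}}\Big)^{2n} - \binom{2n}{n}\Big] \\ &= c_n - \frac{1}{2(k+1)}\sum_{m=1}^{k}\Big(2\sin\frac{\pi m}{k+1}\Big)^2\Big(2\cos\frac{\pi m}{k+1}\Big)^{2n} \end{aligned} \tag{A.2}$$
which is equivalent to (5.23). In the second line of (A.2), we have performed two discrete integrations by parts, which have produced the boundary term $\binom{2n}{n-1}$. In the fourth line of (A.2), we have used the reality of $d_n(z_{k,l})$ to express the sum over $j$ as
$$\sum_{j=1}^{n}\binom{2n}{n-j}\,\frac{\omega^{j}+\omega^{-j}}{2} = \frac{1}{2}\Big[\Big(\sqrt{\omega}+\frac{1}{\sqrt{\omega}}\Big)^{2n} - \binom{2n}{n}\Big]. \tag{A.3}$$
In the last line of (A.2), we have used the sum rule
$$\frac{1}{2(k+1)}\sum_{m=0}^{k}\Big(2\sin\frac{\pi m}{k+1}\Big)^2 = 1 \tag{A.4}$$
and recombined $\binom{2n}{n} - \binom{2n}{n-1} = c_n$.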
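The chain (A.2) can be checked numerically by comparing its first and last lines. The following sketch assumes, as in (A.1), that the multiplicity is the sum of the exponents $a_{n,j}$ over those $j$ with $j+1$ divisible by $k+1$; the function names are illustrative.

```python
from math import comb, cos, sin, pi

def binom(n, k):
    return comb(n, k) if 0 <= k <= n else 0

def a(n, j):
    """Exponents a_{n,j} of (5.6)."""
    return binom(2*n, n-j) - 2*binom(2*n, n-j-1) + binom(2*n, n-j-2)

def catalan(n):
    return comb(2*n, n) // (n + 1)

def multiplicity_direct(n, k):
    """First line of (A.2): sum of a_{n,j} over j with (k+1) dividing (j+1)."""
    return sum(a(n, j) for j in range(1, n + 1) if (j + 1) % (k + 1) == 0)

def multiplicity_closed_form(n, k):
    """Last line of (A.2), evaluated in floating point."""
    s = sum((2 * sin(pi * m / (k + 1))) ** 2 * (2 * cos(pi * m / (k + 1))) ** (2 * n)
            for m in range(1, k + 1))
    return catalan(n) - s / (2 * (k + 1))

for n in range(1, 11):
    for k in range(1, n + 1):
        assert abs(multiplicity_closed_form(n, k) - multiplicity_direct(n, k)) < 1e-6
```

For instance, with $n = 4$ and $k = 3$ both expressions give 6, the exponent $a_{4,3}$ of $U_3$ in $D_4(q)$.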
Appendix B. The Gram matrix at $q = \sqrt{2}$

Let us illustrate the conjecture (5.39) in the case $k = 3$, $l = 1$, namely $q = z_{3,1} = \sqrt{2}$. For $n = 3, 4$ we have the following identities relating the last line of $\mathcal{G}_n(\sqrt{2})$ to those corresponding to diagrams of maximal height 2

[diagrammatic identities] (B.1)

where each line vector is represented by its labeling diagram. In turn, the labeling diagram represents a basis 1 element for $TL_n(q = \sqrt{2})$. Equations (B.1) translate into the fact that the element
$$E_3(e_1,e_2) = 1 - \sqrt{2}\,(e_1+e_2) + (e_2 e_1 + e_1 e_2) \tag{B.2}$$
is orthogonal (with respect to the scalar product (3.12)) to all the elements of respectively $TL_3(\sqrt{2})$ and $TL_4(\sqrt{2})$. This is a direct consequence of the following identities:
$$\begin{aligned} e_1\,E_3(e_1,e_2) &= e_2\,E_3(e_1,e_2) = 0, \\ \mathrm{Tr}\big(1\;E_3(e_1,e_2)\big) &= \eta\,U_3(\sqrt{2}) = 0, \\ \mathrm{Tr}\big(e_3\,E_3(e_1,e_2)\big) &= \sqrt{2}\,U_3(\sqrt{2}) = 0, \end{aligned} \tag{B.3}$$
where the first and second lines are valid in both $TL_3(\sqrt{2})$ ($\eta = 1$) and $TL_4(\sqrt{2})$ ($\eta = \sqrt{2}$), and the third line holds only in $TL_4(\sqrt{2})$. More generally, the element (B.2) is orthogonal to all the elements of $TL_n(\sqrt{2})$ for any $n \geq 5$, as the $e_i$ commute with $E_3(e_1,e_2)$ for $i \geq 4$. For $n \geq 5$ however, all the linear combinations we get involve diagrams with some heights $\geq 3$. For instance, for $n = 5$, the first combination reads
[diagram] $= \sqrt{2}\,\big($[diagram] $+$ [diagram]$\big) - \big($[diagram] $+$ [diagram]$\big)$ (B.4)

Fig. 26. The enhancement transformation of a walk diagram. The walk diagram $a = AB \in W_n$ is enhanced at the point marked by a dot, by simply inserting a maximum at this point. Here $A = l$ and $B = r^t$, as the marked point lies in the middle of the diagram. The enhanced diagram belongs to $W_{n+1}$.

Note that going from $TL_4$ to $TL_5$ (as well as going from $TL_3$ to $TL_4$) amounts simply to enhancing the middle part of the diagrams, as depicted in Fig. 26, which results in a middle height $\ell_4 = 2 \to \ell_5 = 3$ for the four diagrams on the r.h.s. of (B.4). To reexpress the combination (B.4) in terms of diagrams of $W_{5,2}$, we note that the four diagrams appearing in the r.h.s. of (B.4) contain a middle sequence of heights of the form $(\ell_3 = 1, \ell_4 = 2, \ell_5 = 3, \ell_6 = 2, \ell_7 = 1)$, as the result of two successive enhancements. Using the first line of (B.1), we may rewrite this central part as a linear combination of four diagrams with central height $\leq 2$, which results in the four combinations
[four diagrammatic identities, each of the form: diagram $= \sqrt{2}\,($diagram $+$ diagram$) - ($diagram $+$ diagram$)$] (B.5)

which, upon substitution into (B.4), yield the desired expression of the last line of $\mathcal{G}_5(\sqrt{2})$ as a linear combination of the $2^{5-1} = 16$ lines corresponding to the elements of $W_{5,2}$. Note that all these diagrams have middle height 1. For general $n$, we have the following recursive algorithm to generate the desired linear combination expressing the last line of $\mathcal{G}_n(\sqrt{2})$ in terms of the lines $a \in W_{n,2}$, denoted by $K_n = \sum_{a\in W_{n,2}} \lambda_a\,(a)$. The combinations $K_3$, $K_4$ and $K_5$ have been constructed above. Suppose we have constructed $K_n$. Two situations may occur for $K_{n+1}$.
(i) If $n = 2p-1$, the combination $K_{2p}$ is simply obtained by enhancing (see Fig. 26) the middle of all the diagrams of $W_{2p-1,2}$ appearing in $K_{2p-1}$, and keeping the coefficients of the combination fixed. But as the middle heights always satisfy $\ell_n = n$ mod 2, for all $n$, the diagrams of $W_{2p-1,2}$ have all middle height $\ell_{2p-1} = 1$. Therefore, the combination $K_{2p}$ only contains elements of $W_{2p,2}$, with middle height $\ell_{2p} = 2$.

(ii) If $n = 2p$, the combination $K_{2p+1}$ is obtained in two steps. First enhance the middle of all the diagrams in $K_{2p}$ to get another linear combination $L_{2p+1}$. According to the previous discussion, the enhanced diagrams in $L_{2p+1}$ have all middle height equal to 3. But they actually arise from the diagrams appearing in $K_{2p-1}$, after two successive enhancements. This means that they all contain a middle sequence of heights of the form $(\ell_{n-1} = 1, \ell_n = 2, \ell_{n+1} = 3, \ell_{n+2} = 2, \ell_{n+3} = 1)$. The second step consists in using the first line of (B.1) to reexpress this middle piece as a linear combination of diagrams with middle height $1 < 2$. This yields $K_{2p+1}$ after substitution in $L_{2p+1}$.

By carefully following the above algorithm, we find the following compact expression for the linear combination $K_{2p+1}$:
$$\big(\mathcal{W}_{2p+1}^{(2p+1)}\big)_1 = K_{2p+1} = \sum_{j=0}^{p}(-1)^j(\sqrt{2})^{p-j}\sum_{a\in I_j}(a)_1, \tag{B.6}$$
where the sets $I_j \subset W_{2p+1,2}$ are constructed recursively as follows. $I_0$ is the set of symmetric diagrams of $W_{2p+1,2}$. $I_k$ is the set of diagrams of $W_{2p+1,2}$ which may be obtained from diagrams in $I_{k-1}$ by one box addition, and which are not already elements of some $I_{k-l}$, $l \geq 1$. One can easily show that $\mathrm{card}(I_j) = 2^p\binom{p}{j}$. The reader will easily check (B.6) for $n = 3, 4, 5$, with the previous expressions (B.1), (B.4), (B.5). The expression for $K_{2p+2}$ is easily obtained by enhancing $K_{2p+1}$ (case (i) above). This leads to the relation (5.39) linking the semi-meander polynomial of degree $(2p+1)$ at $q = \sqrt{2}$ to the polynomials (5.40) corresponding to the closures of all $a \in W_{2p+1,2}$, at the same value of $q$:
$$\bar m_{2p+1}(\sqrt{2}) = \sum_{j=0}^{p}(-1)^j(\sqrt{2})^{p-j}\sum_{a\in I_j}\bar m(a,\sqrt{2}). \tag{B.7}$$
This proves the conjectured relation (5.39) in the case $k = 3$, $l = 1$. Note also that changing $\sqrt{2} \to -\sqrt{2}$ in (B.7) gives an analogous relation in the case $k = 3$, $l = 2$. More generally, the element $\varphi^{(n)} = E_n(e_1, \ldots, e_{n-1})$ (4.5) is orthogonal to all the elements of $TL_n(q = 2\cos\pi/(n+1))$, as a consequence of the identities
$$e_i\,\varphi^{(n)} = 0 \quad \text{for } i = 1, 2, \ldots, n-1, \qquad \mathrm{Tr}\,\varphi^{(n)} = U_n\Big(q = 2\cos\frac{\pi}{n+1}\Big) = 0. \tag{B.8}$$
This permits us to express the last line of $\mathcal{G}_n(q = 2\cos\pi/(n+1))$ (corresponding to the diagram $\mathcal{W}_n^{(n)}$, or equivalently to the element $(\mathcal{W}_n^{(n)})_1 = 1$) as a linear combination of the $(c_n-1)$ other lines, corresponding to diagrams with heights $\leq (n-1)$, and middle height $(n-2)$. This implies in particular that $r_n(2\cos\pi/(n+1)) \leq c_n-1$, and agrees with the conjectured relation (5.36), which reads here
$$d_n\Big(2\cos\frac{\pi}{n+1}\Big) = 1, \qquad r_n\Big(2\cos\frac{\pi}{n+1}\Big) = c_n-1. \tag{B.9}$$
For $m > n$, $E_n(e_1, \ldots, e_{n-1})$ remains orthogonal to all the elements of $TL_m(2\cos\pi/(n+1))$. This results in an expression of the last line of $\mathcal{G}_m(2\cos\pi/(n+1))$ as a linear combination of the $(c_n-1)$ repeated ($m-n$ times) enhancements of the elements of $W_{n,n-1}$, which belong to $W_{m,m-1}$. For $m = n+1$, the elements of the enhanced linear combination still lie in $W_{m,n-1}$ as only the middle heights have been affected, and changed from $(n-2)$ to $(n-1)$. Hence all the linear combinations corresponding to $m = kn+1$ are the trivial enhancements of the linear combination at $m = kn$. In all the other cases, many reductions must be applied to the diagrams to eventually get a linear combination of elements of $W_{m,n-1}$ only. We will not discuss the details of this mechanism here.
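The simplest instance of these linear relations, the $n = 3$ case of (B.6) at $q = \sqrt{2}$, can be verified on the brute-force Gram matrix. The sketch below assumes that the line of the Gram matrix labeled by a walk diagram $a$ has entries $q^{c(a,b)}$, that $I_0$ consists of the left-right symmetric diagrams of maximal height at most 2, and that $I_1$ consists of the remaining diagrams of maximal height at most 2; variable names are illustrative.

```python
from itertools import product
from math import sqrt, isclose

def dyck_walks(n):
    """Walk diagrams of W_n encoded as step sequences (+1/-1)."""
    out = []
    for steps in product((1, -1), repeat=2 * n):
        h, ok = 0, True
        for s in steps:
            h += s
            if h < 0:
                ok = False
                break
        if ok and h == 0:
            out.append(steps)
    return out

def arches(b):
    stack, pairs = [], []
    for i, s in enumerate(b):
        if s == 1:
            stack.append(i)
        else:
            pairs.append((stack.pop(), i))
    return pairs

def components(b, bp):
    """Number of connected components of the meander (b, b')."""
    parent = list(range(len(b)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in arches(b) + arches(bp):
        parent[find(i)] = find(j)
    return len({find(x) for x in range(len(b))})

def max_height(a):
    h, best = 0, 0
    for s in a:
        h += s
        best = max(best, h)
    return best

n, q = 3, sqrt(2)
W = dyck_walks(n)
row = lambda a: [q ** components(a, b) for b in W]       # one line of the Gram matrix
top = tuple([1] * n + [-1] * n)                          # diagram of maximal height n
mirror = lambda a: all(a[i] == -a[2 * n - 1 - i] for i in range(n))
low = [a for a in W if max_height(a) <= 2]
I0 = [a for a in low if mirror(a)]
I1 = [a for a in low if not mirror(a)]

# Last line = sqrt(2) * (lines of I0) - (lines of I1), column by column.
combo = [sqrt(2) * sum(row(a)[k] for a in I0) - sum(row(a)[k] for a in I1)
         for k in range(len(W))]
relation_holds = all(isclose(x, y, abs_tol=1e-9) for x, y in zip(combo, row(top)))
assert relation_holds
```

The same check at a generic value of `q` fails, which reflects the fact that the relation is tied to the zero $q = \sqrt{2}$ of $U_3$.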
Appendix C. Proof of the sum rule (6.41)

Fig. 27. The recursion for $\eta^{(k)}_{n+1}$. The diagram $b \in W_{n+1}$ is represented as an arch configuration, and we have represented its leftmost arch, separating its interior piece $b_1 \in W_j$ from its exterior piece $b_2 \in W_{n-j}$. The $a$'s $\in P^{(-k)}_{n+1}$ which are $b$-symmetric are of either form depicted. In the first case, $\ell^a_1 = \ell^a_{2j+1} = 1$. The piece $a_1$ of $a$ between these two points is $b_1$-symmetric, and has its restriction condition lowered by 1: $a_1 \in P^{(-k-1)}_j$ (the dashed line represents the $\ell = 0$ line in the $a_i$'s). There are $\eta^{(k+1)}_j$ such couples $(a_1, b_1)$. In the second case, $\ell^a_1 = \ell^a_{2j+1} = -1$. $a_1$ is $b_1$-symmetric, but now its restriction condition is raised by 1: $a_1 \in P^{(-k+1)}_j$. There are $\eta^{(k-1)}_j$ such couples $(a_1, b_1)$. The piece $a_2$ is $b_2$-symmetric and has its restriction condition unchanged in both cases: $a_2 \in P^{(-k)}_{n-j}$. There are $\eta^{(k)}_{n-j}$ couples $(a_2, b_2)$.

We wish to establish the following result
$$\sum_{a,b\in W_n} f_{a,b} = 2^n c_n - \frac{1}{8}\,2^{n+1}c_{n+1} \tag{C.1}$$
valid for $n \geq 1$ (we set the number on the l.h.s. of (C.1) to be 1 when $n = 0$). By a simple rearrangement of factorials, this is readily seen to be equivalent to (6.41). Our strategy will be the following. First we write a system of recursion relations linking the numbers (C.1) to other numbers, to be defined below. We proceed and show that this set completely determines all the numbers, provided we take some suitable boundary conditions. Finally, we solve the system explicitly, and extract back the exact value (C.1). As in Sect. 6.9, we denote by $P_n$ the set of unrestricted walks $a$, such that $\ell^a_0 = \ell^a_{2n} = 0$, without the positivity constraint on the $\ell^a$'s. Let $P_n^{(-k)}$ denote the set of walks $a \in P_n$ whose (possibly negative) heights are bounded from below by $-k$, $k$ a given nonnegative integer,
$$P_n^{(-k)} = \{a \in P_n\ \text{s.t.}\ \ell^a_0 = \ell^a_{2n} = 0\ \text{and}\ \ell^a_i \geq -k,\ \forall i\}. \tag{C.2}$$
In particular, $P_n^{(0)} = W_n$. Note also that if $k \geq n$, the above restriction amounts to no restriction at all, hence $P_n^{(-k)} = P_n$. We define $\eta^{(k)}_n$ to be the total number of couples $(a,b)$, $a \in P_n^{(-k)}$ and $b \in W_n$, such that $a$ is $b$-symmetric,
$$\eta^{(k)}_n = \sum_{a\in P_n^{(-k)},\ b\in W_n} f_{a,b}, \tag{C.3}$$
and $E_k$ the generating function
$$E_k(x) = \sum_{n=0}^{\infty}\eta^{(k)}_n\,x^n. \tag{C.4}$$
Again, whenever $k \geq n$, we simply have
$$\eta^{(k)}_n = \sum_{a\in P_n,\ b\in W_n} f_{a,b} = 2^n c_n \tag{C.5}$$
as shown in (6.101). The desired result (C.1) amounts to writing that
$$E_0(x) = C(2x) - \frac{1}{8x}\big(C(2x) - 1 - 2x\big), \tag{C.6}$$
where $C(x)$ denotes the generating function (5.35) of the Catalan numbers (the subtractions in the second term are ad hoc to yield the initial value $\eta^{(0)}_0 = 1$). Let us now derive a system of recursion relations for the numbers $\eta^{(k)}_n$. Let us count the pairs of walk diagrams $(a \in W_{n+1}, b \in W_{n+1})$ such that $a$ is $b$-symmetric. Representing $b$ in the arch configuration picture as in Fig. 27, let us concentrate on its leftmost arch, connecting the first bridge (1) to, say, the bridge $(2j+2)$ (the bridge number must clearly be even). This arch isolates its interior, corresponding to the bridges $2, 3, \ldots, (2j+1)$, from its exterior, corresponding to the bridges $(2j+3), \ldots, (2n+2)$: these two sets of bridges cannot be connected to each other. Let us now count the $a$'s which are $b$-symmetric, and consider an $a \in W_{n+1}$ such that $f_{a,b} = 1$. The part $a_1$ of $a$ corresponding to the interior $b_1$ of the leftmost arch of $b$ is symmetric w.r.t. this piece of $b$. The same holds for the part $a_2$ of $a$ corresponding to the exterior $b_2$ of this arch, which may be simply seen as a walk with $2(n-j)$ steps, i.e. an element of $W_{n-j}$. In addition, we also have $\ell^a_1 - \ell^a_0 = 1 = \ell^a_{2j+1} - \ell^a_{2j+2}$ by symmetry w.r.t. the leftmost arch of $b$, which implies that $\ell^a_1 = \ell^a_{2j+1} = 1$, while $\ell^a_i \geq 0$ for $i = 1, 2, \ldots, 2j+1$. Therefore, by a trivial translation of the heights and bridge numbers $\ell^{a'}_i = \ell^a_{i+1} - 1$, the part of $a$ corresponding to the interior of the arch may be seen as a walk of $2j$ steps with $\ell^{a'}_0 = \ell^{a'}_{2j} = 0$, but with the constraint
that $\ell^{a'}_i \geq -1$ for $i = 0, 1, \ldots, 2j$, hence as an element of $P_j^{(-1)}$. Conversely, we may build any $a$ which is $b$-symmetric by the juxtaposition of a walk in $P_j^{(-1)}$ and one in $W_{n-j}$, with the respective conditions that they are $b$-symmetric w.r.t. the corresponding portions of $b$, and elevating the interior portion by shifting the $\ell^a$'s of $P_j^{(-1)}$ by $+1$, and adding $\ell^a_0 = \ell^a_{2j+2} = 0$. This is summarized in the following recursion relation:
$$\eta^{(0)}_{n+1} = \sum_{j=0}^{n} \eta^{(1)}_j\, \eta^{(0)}_{n-j}. \qquad (C.7)$$
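As a numerical sanity check, the recursion (C.7), in its general form (C.8) with the boundary conditions (C.9), can be iterated directly. The following is only a minimal sketch: the function names `eta`, `catalan` and `e0_series` are illustrative and not part of the paper, and the comparison series uses the expression $E_0 = 1 + y - y^2$ with $y = \sum_{n \geq 1} 2^{n-1} c_n x^n$ quoted in this appendix.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def eta(n, k):
    """eta_n^(k) from the recursion (C.8) with boundary conditions (C.9)."""
    if k < 0:
        return 0  # eta_n^(-1) = 0 for all n >= 0
    if n == 0:
        return 1  # eta_0^(k) = 1 for all k >= 0
    m = n - 1
    return sum((eta(j, k + 1) + eta(j, k - 1)) * eta(m - j, k)
               for j in range(m + 1))


def catalan(n):
    """Catalan number c_n."""
    return comb(2 * n, n) // (n + 1)


def e0_series(order):
    """Coefficients in x of 1 + y - y^2, with y = sum_{n>=1} 2^(n-1) c_n x^n."""
    y = [0] + [2 ** (n - 1) * catalan(n) for n in range(1, order + 1)]
    y2 = [sum(y[i] * y[n - i] for i in range(n + 1)) for n in range(order + 1)]
    return [(1 if n == 0 else 0) + y[n] - y2[n] for n in range(order + 1)]
```

For $k \geq n$ the values should reduce to $2^n c_n$ as in (C.5), and the $k = 0$ values should reproduce the coefficients of $E_0(x)$.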
More generally, the same reasoning applies to $\eta^{(k)}_{n+1}$, with the result (see Fig. 27)
$$\eta^{(k)}_{n+1} = \sum_{j=0}^{n} \left(\eta^{(k+1)}_j + \eta^{(k-1)}_j\right) \eta^{(k)}_{n-j}, \qquad (C.8)$$
where two situations may now occur for the part of $a$ corresponding to the interior of the arch: either $\ell^a_1 = \ell^a_{2j+1} = 1$, in which case the restriction condition on $a$ is lowered by 1 (term $\eta_j^{(k+1)}$), or $\ell^a_1 = \ell^a_{2j+1} = -1$, which may occur as soon as $k \geq 1$, in which case the restriction condition is raised by 1 (term $\eta_j^{(k-1)}$). The exterior part of $a$ is unaffected and keeps the restriction condition at level $-k$ (term $\eta^{(k)}_{n-j}$). We may take (C.8) as the generic recursion relation, also valid for $k = 0$, provided we define $\eta^{(-1)}_n = 0$ for all $n \geq 0$. In addition to this boundary condition, we set $\eta^{(k)}_0 = 1$ for all $k \geq 0$ (there is exactly one walk diagram of 0 steps, with $\ell_0 = 0$, whatever the restriction $k$). The recursion relations (C.8) together with the boundary conditions
$$\eta^{(k)}_0 = 1, \qquad \eta^{(-1)}_n = 0 \qquad (C.9)$$
determine all the numbers $\eta_n^{(k)}$ completely. Indeed, (C.8) expresses $\eta_{n+1}$ in terms of $\eta_j$, $j \leq n$, hence by repeated applications, we may express all the numbers $\eta^{(k)}_n$ in terms of the collection of numbers $\eta^{(k)}_0$. This establishes the uniqueness of the solution to (C.8), (C.9), provided it exists. To show the existence, we next exhibit the solution explicitly. It is best expressed through the generating functions $E_k(x)$ of (C.4), written in terms of the variable
$$y = \frac{C(2x) - 1}{2} = \sum_{n=1}^{\infty} 2^{n-1} c_n\, x^n, \qquad (C.10)$$
easily invertible as
$$x = \frac{y}{(2y+1)^2} \qquad (C.11)$$
by use of (5.35). The general solution reads
$$E_{2k}(x) = 2y + 1 - \frac{y+1}{U_k(1/y)\,U_{k+1}(1/y)},$$
$$E_{2k+1}(x) = 2y + 1 - \frac{(2y+1)(y+1)}{y\left(U_k(1/y) + U_{k+1}(1/y)\right)\left(U_{k+1}(1/y) + U_{k+2}(1/y)\right)}, \qquad (C.12)$$
where $U_k(z)$ denote the Chebyshev polynomials (4.2). Note in particular that for $k = 0$, we recover $E_0(x) = 1 + 2y - y(y+1) = 1 + y - y^2$, which yields the desired result (C.6), and therefore proves (C.1). The first few generating functions read
$$E_0(x) = 1 + y - y^2, \qquad E_1(x) = \frac{(1-y)(2y+1)^2}{1 + y - y^2},$$
$$E_2(x) = \frac{1 + y - 2y^2 - y^3}{1 - y}, \qquad E_3(x) = \frac{(1 - 2y^2)(2y+1)^2}{(1 + y - y^2)(1 + y - 2y^2 - y^3)}. \qquad (C.13)$$
Note also that the expressions (C.12) make it clear that the $E_k(x)$ converge uniformly towards $2y + 1 = C(2x)$ when $k \to \infty$, for small enough $x$ (indeed, when expanded at small $y$, (C.12) reads $E_k(x) = 2y + 1 + O(y^{k+1}) \to 2y + 1$ when $k \to \infty$). This is not surprising, as letting $k$ tend to infinity amounts to progressively removing the constraints on the counted paths, whose numbers tend to $2^n c_n$ (they are actually exactly equal to this for all $n \leq k$), and $2y + 1 = C(2x)$ is precisely the generating function for unconstrained paths. To prove (C.12), let us rephrase the recursion relations (C.8) in terms of generating functions. We have
$$E_k(x) - 1 = x\, E_k(x)\left(E_{k+1}(x) + E_{k-1}(x)\right), \qquad (C.14)$$
where we have used the boundary condition $\eta_0^{(k)} = 1 \Rightarrow E_k(0) = 1$. The remainder of (C.9) implies that
$$E_{-1}(x) = 0. \qquad (C.15)$$
It is now a straightforward but tedious exercise to check that (C.14) is satisfied by (C.12). For odd $k = 2p + 1$, we have
$$1 - x\left(E_{2p+2}(x) + E_{2p}(x)\right) = 1 - \frac{2y}{2y+1} + \frac{y(y+1)}{(2y+1)^2}\, \frac{U_{p+2}(1/y) + U_p(1/y)}{U_p(1/y)\,U_{p+1}(1/y)\,U_{p+2}(1/y)}$$
$$= \frac{1}{2y+1} + \frac{y+1}{(2y+1)^2\, U_p(1/y)\,U_{p+2}(1/y)}$$
$$= \frac{(2y+1)\,U_p(1/y)\,U_{p+2}(1/y) + y + 1}{(2y+1)^2\, U_p(1/y)\,U_{p+2}(1/y)}, \qquad (C.16)$$
where, in the second line, we have used the recursion relation (4.1). On the other hand, we compute
$$\frac{1}{E_{2p+1}(x)} = \frac{y\,(U_p + U_{p+1})(U_{p+1} + U_{p+2})}{(2y+1)\left(y\,(U_p + U_{p+1})(U_{p+1} + U_{p+2}) - y - 1\right)}. \qquad (C.17)$$
Using the multiplication rule
$$U_k(t)\, U_m(t) = \sum_{\substack{j = |m-k| \\ j \equiv m+k \bmod 2}}^{m+k} U_j(t), \qquad (C.18)$$
easily proved by recursion, and implying in particular that $U_{p+1}^2 = U_p U_{p+2} + 1$, we reexpress
$$(U_p(t) + U_{p+1}(t))(U_{p+1}(t) + U_{p+2}(t)) = U_{p+1}(U_p + U_{p+2}) + U_{p+1}^2 + U_p U_{p+2}$$
$$= (t+1)\,U_{p+1}^2 + U_p U_{p+2}$$
$$= (t+1)(U_p U_{p+2} + 1) + U_p U_{p+2} = (t+2)\,U_p(t)\,U_{p+2}(t) + t + 1 \qquad (C.19)$$
by various applications of (C.18). Substituting this into (C.17), with $t = 1/y$, this gives exactly (C.16), thus proving (C.14) for $k = 2p + 1$. For even $k = 2p$, we have
$$1 - x\left(E_{2p+1}(x) + E_{2p-1}(x)\right) = \frac{1}{2y+1} + \frac{y+1}{2y+1}\, \frac{U_{p-1} + U_p + U_{p+1} + U_{p+2}}{(U_{p-1} + U_p)(U_p + U_{p+1})(U_{p+1} + U_{p+2})}$$
$$= \frac{1}{2y+1} + \frac{y+1}{y\,(2y+1)(U_{p-1} + U_p)(U_{p+1} + U_{p+2})}$$
$$= \frac{(U_{p-1} + U_p)(U_{p+1} + U_{p+2}) + (y+1)/y}{(2y+1)(U_{p-1} + U_p)(U_{p+1} + U_{p+2})}. \qquad (C.20)$$
We then compute
$$(U_{p-1}(t) + U_p(t))(U_{p+1}(t) + U_{p+2}(t)) + t + 1 = (U_{p-1}U_{p+1} + U_p U_{p+2}) + U_{p-1}U_{p+2} + U_p U_{p+1} + t + 1$$
$$= (t\,U_p U_{p+1} - 1) + (U_p U_{p+1} - U_1) + U_p U_{p+1} + t + 1$$
$$= (t+2)\,U_p(t)\,U_{p+1}(t). \qquad (C.21)$$
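The identities (C.19) and (C.21), and the claim that the closed form (C.12) solves (C.14), can be spot-checked with exact rational arithmetic. The sketch below is illustrative only: it assumes the normalisation $U_0 = 1$, $U_1 = t$, $U_{k+1} = t\,U_k - U_{k-1}$ (the one consistent with $E_0 = 1 + y - y^2$; equations (4.1)-(4.2) themselves are not reproduced here), and the helper names `cheb`, `E` and `check` are hypothetical.

```python
from fractions import Fraction


def cheb(k, t):
    """U_k(t) with U_{-1} = 0, U_0 = 1 and U_{k+1} = t*U_k - U_{k-1} (assumed normalisation)."""
    if k < 0:
        return Fraction(0)
    prev, cur = Fraction(0), Fraction(1)  # U_{-1}, U_0
    for _ in range(k):
        prev, cur = cur, t * cur - prev
    return cur


def E(k, y):
    """Closed form (C.12) at a rational y; E_{-1} = 0 as in (C.15)."""
    if k == -1:
        return Fraction(0)
    t = 1 / y
    p, odd = divmod(k, 2)
    if not odd:
        return 2 * y + 1 - (y + 1) / (cheb(p, t) * cheb(p + 1, t))
    a = cheb(p, t) + cheb(p + 1, t)
    b = cheb(p + 1, t) + cheb(p + 2, t)
    return 2 * y + 1 - (2 * y + 1) * (y + 1) / (y * a * b)


def check(y, kmax=8):
    """Check (C.19), (C.21) at t = 1/y and the functional relation (C.14) for k < kmax."""
    t = 1 / y
    x = y / (2 * y + 1) ** 2  # (C.11)
    for p in range(kmax):
        um, u0, u1, u2 = (cheb(p + s, t) for s in (-1, 0, 1, 2))
        assert (u0 + u1) * (u1 + u2) == (t + 2) * u0 * u2 + t + 1          # (C.19)
        assert (um + u0) * (u1 + u2) + t + 1 == (t + 2) * u0 * u1          # (C.21)
        assert E(p, y) - 1 == x * E(p, y) * (E(p + 1, y) + E(p - 1, y))    # (C.14)
    return True
```

A value such as `y = Fraction(1, 7)` keeps all Chebyshev values positive, so no denominator vanishes.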
Finally, we write
$$\frac{1}{E_{2p}(x)} = \frac{U_p(1/y)\,U_{p+1}(1/y)}{(2y+1)\,U_p(1/y)\,U_{p+1}(1/y) - y - 1}, \qquad (C.22)$$
which, upon the substitution of (C.21), with $t = 1/y$, is equal to (C.20). This completes the proof of (C.14) for $k = 2p$.

Acknowledgement. We thank A. Zvonkin for bringing Ref. [4] to our knowledge, R. Balian for helpful discussions, S. Legendre for interesting historical remarks and J.-B. Zuber for a careful reading of the manuscript.
References
1. Hoffman, K., Mehlhorn, K., Rosenstiehl, P., Tarjan, R.: Sorting Jordan sequences in linear time using level-linked search trees. Information and Control 68, 170-184 (1986)
2. Phillips, A.: La topologia dei labirinti. In: M. Emmer, ed. L'occhio di Horus: Itinerario nell'immaginario matematico, Istituto della Enciclopedia Italiana, Roma, 1989, pp. 57-67
3. Arnold, V.: The branched covering of $CP^2 \to S^4$, hyperbolicity and projective topology. Siberian Math. Jour. 29, 717-726 (1988)
4. Ko, K.H., Smolinsky, L.: A combinatorial matrix in 3-manifold theory. Pacific J. Math. 149, 319-336 (1991)
5. Lando, S., Zvonkin, A.: Plane and Projective Meanders. Theor. Comp. Science 117, 227-241 (1993) and Meanders. Selecta Math. Sov. 11, 117-144 (1992)
6. Di Francesco, P., Golinelli, O., Guitter, E.: Meander, folding and arch statistics. J. Math. and Computer Modelling 144, (1996)
7. Makeenko, Y.: Strings, Matrix Models and Meanders. Proceedings of the 29th Inter. Ahrenshoop Symp., Germany (1995)
8. Touchard, J.: Contributions à l'étude du problème des timbres poste. Canad. J. Math. 2, 385-398 (1950)
9. Lunnon, W.: A map-folding problem. Math. of Computation 22, 193-199 (1968)
10. Temperley, H., Lieb, E.: Relations between the percolation and coloring problem and other graph-theoretical problems associated with regular planar lattices: some exact results for the percolation problem. Proc. Roy. Soc. A322, 251-280 (1971)
11. Martin, P.: Potts models and related problems in statistical mechanics. Singapore: World Scientific, 1991

Communicated by G. Felder
Commun. Math. Phys. 186, 61-85 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Unitarity of Rational N = 2 Superconformal Theories
W. Eholzer, M. R. Gaberdiel
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge, CB3 9EW, UK. E-mail:
[email protected],
[email protected] Received: 12 February 1996/ Accepted: 15 August 1996
Abstract: We demonstrate that all rational models of the N = 2 super Virasoro algebra are unitary. Our arguments are based on three different methods: we determine Zhu's algebra $A(\mathcal{H}_0)$ (for which we give a physically motivated derivation) explicitly for certain theories, we analyse the modular properties of some of the vacuum characters, and we use the coset realisation of the algebra in terms of su(2) and two free fermions. Some of our arguments generalise to the Kazama-Suzuki models indicating that all rational N = 2 supersymmetric models might be unitary.
1. Introduction

Among the various conformal field theories, the supersymmetric field theories play a special rôle as they are important for the construction of realistic string theories which involve fermions. There exist different classes of superconformal field theories which are parametrised by N, the number of fermionic (Grassmann) variables of the underlying space. For realistic string theories with N = 1 space-time supersymmetry, the worldsheet conformal field theory is believed to require N = 2 supersymmetry. In contrast to the N = 1 super Virasoro algebra which is rather similar to the non-supersymmetric (N = 0) algebra, the N = 2 algebra seems to be structurally different. For example the Neveu-Schwarz and Ramond sectors of the N = 2 algebra are connected by the spectral flow [39], and the embedding structure of its Verma modules is much more complicated [9, 10]. In this paper another special feature of the N = 2 superconformal field theory is analysed in detail: the property that all rational theories are unitary. Here we call a theory rational if it has only finitely many irreducible highest weight representations, and if the highest weight space of each of them is finite dimensional. We shall use three different methods to analyse this problem which we briefly describe in turn.
It was shown by Zhu [42] that a theory is rational in this sense if a certain quotient $A(\mathcal{H}_0)$ of the vacuum representation $\mathcal{H}_0$ is finite-dimensional. This space also forms an associative algebra, and the irreducible representations of this algebra, the so-called Zhu algebra, are in one-to-one correspondence with the irreducible representations of the meromorphic conformal field theory $\mathcal{H}_0$. For the case of the N = 2 superconformal theory, the algebra always has the structure of a finitely generated quotient of a polynomial algebra in two variables, and this implies that $A(\mathcal{H}_0)$ is finite dimensional for every rational theory. In this paper we give a physically motivated definition for $A(\mathcal{H}_0)$. We then show, using the embedding diagrams of the vacuum representations of the N = 2 algebra [9, 10], that $A(\mathcal{H}_0)$ is infinite dimensional for a certain class of non-unitary theories, thereby proving that these theories are not rational. In addition, we also calculate $A(\mathcal{H}_0)$ explicitly for a few special values of the central charge. We find that $A(\mathcal{H}_0)$ is indeed infinite dimensional for the non-unitary cases we consider (and finite dimensional in the unitary cases). In order to be able to determine the dimension of $A(\mathcal{H}_0)$ for arbitrary central charge one would need to know all vacuum Verma module embedding diagrams and explicit formulae for certain singular vectors. The embedding diagrams are known [9, 10], but sufficiently simple explicit formulae for the singular vectors do not exist so far in general.

It was also shown by Zhu [42] that the space of torus amplitudes (which is invariant under the modular group) is finite dimensional for a rational superconformal field theory.¹ For such theories, this implies in particular that the orbit of the vacuum character under the modular group is a finite dimensional vector space. If this is not the case, on the other hand, the theory cannot be rational.
We determine the vacuum characters using the embedding diagrams, and analyse the action of the modular group on them. We then show that the relevant space is infinite dimensional for $c > 3$, and for the class of non-unitary theories with $c < 3$ which was already analysed by the previous method. As a non-trivial check, we also show that this space is finite dimensional in the unitary minimal cases, where $c < 3$. The only cases which remain can be analysed using the coset realisation of the N = 2 super Virasoro algebra (see e.g. [30])
$$\frac{\widehat{su(2)}_k \oplus (\mathrm{Fer})^2}{\widehat{u(1)}}, \qquad (1)$$
which is known to preserve unitarity [20]. Because of this property, a non-unitary (rational) N = 2 theory must correspond to non-integer level for the $\widehat{su(2)}_k$. The only remaining cases correspond to admissible level $k \notin \mathbb{N}$ for which $\widehat{su(2)}_k$ always has at least one (admissible) representation whose highest weight space is infinite dimensional. Following a simple counting argument due to Ahn et al. [1] we then show that this gives rise to infinitely many inequivalent representations of the N = 2 theory, thus proving that the theory cannot be rational. The reasoning should be contrasted with the situation for N = 0 and N = 1, where the corresponding counting argument does not work: there the admissible representations of $\widehat{su(2)}_k$ give rise to the non-unitary minimal models [27, 31]. The last method is well amenable to generalisation. Apart from some mathematical subtleties which we discuss, it can also be applied to the large class of Kazama-Suzuki
1 Here we use again that, for the case of the N = 2 theory, $A(\mathcal{H}_0)$ is finite dimensional for every rational theory.
models, and we therefore formulate it in this setup. As this class already provides most of the known N = 2 models, our arguments seem to indicate that actually all rational N = 2 superconformal field theories might be unitary. All three methods rely to varying degrees on the (conjectured) embedding diagrams for the vacuum representations of the N = 2 super Virasoro algebra which we shall discuss in some detail. For example we use the embedding diagrams to obtain a formula for the vacuum character, whose modular properties we analyse. In the calculation of $A(\mathcal{H}_0)$ we conclude from the embedding diagrams that there exist no further relations, and finally, we check that the coset (1) actually realises the N = 2 algebra by comparing the coset vacuum character with the one obtained from the embedding diagrams.

The paper is organised as follows. In Sect. 2 we fix our notations and describe the embedding diagrams for the vacuum Verma modules of the N = 2 super Virasoro algebra following Dörrzapf [9, 10]. In Sect. 3 we give a physically motivated derivation for $A(\mathcal{H}_0)$, and calculate it for certain cases. In Sect. 4, we use the coset realisation of the algebra (1) to analyse the theories which correspond to an admissible level. Furthermore, we indicate how the arguments generalise to the Kazama-Suzuki N = 2 superconformal theories. Finally, we remark in Sect. 5 how the construction can be even further generalised and give some prospective remarks. In Appendix A we derive the vacuum character using the embedding diagrams of Sect. 2 as well as the coset realisation. In Appendix B, we analyse the modular properties of these characters, thereby showing that certain classes cannot be rational.
2. Preliminaries and Embedding Diagrams

Let us first fix some notations and conventions. Throughout this paper we will consider the Neveu-Schwarz sector of the N = 2 super Virasoro algebra, which is the infinite dimensional Lie super algebra with basis $L_n, T_n, G^\pm_r, C$ ($n, r + \frac12 \in \mathbb{Z}$) and (anti)commutation relations given by
$$[L_m, L_n] = (m - n)L_{m+n} + \frac{C}{12}(m^3 - m)\,\delta_{m+n,0},$$
$$[L_m, G^\pm_r] = \left(\tfrac12 m - r\right) G^\pm_{m+r},$$
$$[L_m, T_n] = -n\, T_{m+n},$$
$$[T_m, T_n] = \frac{C}{3}\, m\, \delta_{m+n,0},$$
$$[T_m, G^\pm_r] = \pm G^\pm_{m+r},$$
$$\{G^+_r, G^-_s\} = 2L_{r+s} + (r - s)T_{r+s} + \frac{C}{3}\left(r^2 - \tfrac14\right)\delta_{r+s,0},$$
$$[L_m, C] = [T_n, C] = [G^\pm_r, C] = 0,$$
$$\{G^+_r, G^+_s\} = \{G^-_r, G^-_s\} = 0$$
for all $m, n \in \mathbb{Z}$ and $r, s \in \mathbb{Z} + \frac12$. We denote the Verma module generated from a highest weight state $|h, q, c\rangle$ with $L_0$ eigenvalue $h$, $T_0$ eigenvalue $q$ and central charge $C = c\,\mathbb{1}$ by $\mathcal{V}_{h,q,c}$. An element of a highest weight representation of the N = 2 super Virasoro algebra which is not proportional to the highest weight itself will be called a 'singular vector' if it is annihilated
by all positive modes $L_n, T_n, G^\pm_r$ ($n, r + \frac12 \in \mathbb{N} := \{1, 2, \ldots\}$) and is an eigenvector of $L_0$ and $T_0$. A singular vector is called uncharged if its $T_0$ eigenvalue is equal to the $T_0$ eigenvalue of the highest weight state and charged otherwise. The character $\chi_{\mathcal{V}}$ of a highest weight representation $\mathcal{V}$ is defined by
$$\chi_{\mathcal{V}}(q) := q^{h - c/24} \sum_{n \in \frac12 \mathbb{N}_0} \dim(V_n)\, q^n,$$
where $V_n$ is the $L_0$ eigenspace with eigenvalue $h + n$, $c$ is the central charge and $h$ the conformal dimension of $\mathcal{V}$. The character of the Verma module $\mathcal{V}_{h,q,c}$ is for example given by
$$\chi_{\mathcal{V}_{h,q,c}} = q^{h - c/24} \prod_{n=1}^{\infty} \frac{(1 + q^{n - \frac12})^2}{(1 - q^n)^2}$$
and is called the 'generic' character. We call a meromorphic conformal field theory (MCFT) [19] rational if it possesses only finitely many irreducible MCFT representations², and if the highest weight space of each of them is finite dimensional. We should stress that this definition of rationality differs from the definition used in the mathematical literature, where it is not assumed that the highest weight spaces are finite dimensional, but where in addition all representations are required to be completely reducible. It was shown by Dong et al. [12] that the mathematical definition of rationality implies the one used in this paper. On the other hand the converse is not true as there exists a counterexample [17]. For bosonic rational theories it has been shown by Zhu [42] that the space of torus amplitudes which is invariant under the natural action of the modular group is finite dimensional³. The generalisation to the fermionic case has been studied in [24, Satz 1.4.6]. If all representations are completely reducible, the space of torus amplitudes is generated by the (finitely many) characters of the irreducible representations. In this case, the central charge and the conformal dimensions of the highest weight states are all rational numbers [2, 40]. We parametrise the central charge $c$ as
$$c(p, p') = 3\left(1 - \frac{2p'}{p}\right),$$
where $p$ and $p'$ will be chosen positive for $c < 3$. The well-known series of unitary minimal models then corresponds to the central charges being given as $c(p, 1)$, where $p > 2$ [5, 7]. Finally, let us denote by $\chi_{p,p'}$ the vacuum character of the model with central charge $c(p, p')$, i.e. the character of the irreducible quotient of $\mathcal{V}_{0,0,c(p,p')}$. One of the main points realised in [8, 9] is that there can be up to two linearly independent uncharged singular vectors at the same level. Indeed, this happens for example for the Verma modules related to the unitary minimal models of the N = 2 super Virasoro algebra⁴.
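To make the parametrisation and the generic character concrete, here is a small illustrative sketch. It assumes that the fermionic factor of the generic character is $(1 + q^{n-1/2})^2$, i.e. that the modes $G^\pm_{-r}$ with $r = \frac12, \frac32, \ldots$ act freely on the Verma module; the function names are hypothetical, and the expansion variable is $t = q^{1/2}$ so that all exponents are integers.

```python
from fractions import Fraction


def central_charge(p, p_prime):
    """c(p, p') = 3 (1 - 2 p'/p), kept as an exact fraction."""
    return 3 * (1 - Fraction(2 * p_prime, p))


def generic_character_coeffs(N):
    """Coefficients of prod_{n>=1} (1 + t^(2n-1))^2 / (1 - t^(2n))^2 up to t^N, t = q^(1/2).

    The prefactor q^(h - c/24) is omitted; entry d counts states at level d/2.
    """
    coeffs = [1] + [0] * N
    for _ in range(2):                    # two fermionic towers, G^+ and G^-
        for r in range(1, N + 1, 2):      # multiply by (1 + t^r), r odd
            for d in range(N, r - 1, -1):
                coeffs[d] += coeffs[d - r]
    for _ in range(2):                    # two bosonic towers, L and T
        for s in range(2, N + 1, 2):      # divide by (1 - t^s), s even
            for d in range(s, N + 1):
                coeffs[d] += coeffs[d - s]
    return coeffs
```

For instance, `central_charge(3, 1)` gives the first unitary minimal value $c = 1$, and the first coefficients of the generic character count the states $G^\pm_{-1/2}$ at level $\frac12$ and $L_{-1}$, $T_{-1}$, $G^+_{-1/2}G^-_{-1/2}$ at level 1.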
In [9] a complete list of all embedding diagrams of the N = 2 super Virasoro algebra has been conjectured.⁵

2 A MCFT representation is a representation which is compatible with the vacuum representation, i.e. the null-fields of the MCFT act trivially. We do not assume that the graded components of a MCFT representation are finite dimensional.
3 Actually, Zhu showed that this is true if $A(\mathcal{H}_0)$ is finite dimensional. In the case under consideration, $A(\mathcal{H}_0)$ is a finitely generated quotient of the polynomial algebra in two variables (as we shall show in Sect. 3), and thus, the theory is rational if and only if $A(\mathcal{H}_0)$ is finite dimensional.
4 The embedding diagrams conjectured in [11, 32, 34] for the unitary case are not correct.
5 However some of them are still not correct [10].
In contrast to the case of the Virasoro algebra it is not directly clear how to define embedding diagrams for the Verma modules of the N = 2 super Virasoro algebra. This is due to the fermionic nature of the N = 2 algebra: suppose that there is a singular vector $\psi_{n,p} = \theta_{n,p}|h, q, c\rangle$ of energy $h + n$ and charge $q + p$ in $\mathcal{V}_{h,q,c}$ and that $\psi'_{n',p'} = \theta'_{n',p'}|h + n, q + p, c\rangle$ is singular in $\mathcal{V}_{h+n,q+p,c}$. Then $\theta'_{n',p'}\theta_{n,p}|h, q, c\rangle$ might be identically zero in $\mathcal{V}_{h,q,c}$. Our definition of embedding diagrams of Verma modules of the N = 2 super Virasoro algebra follows [9] and includes only those Verma modules which are actually embedded in the original Verma module. To be more specific, the embedding diagram of a Verma module of the N = 2 super Virasoro algebra shows the highest weight vector and all nontrivial singular vectors contained in it up to proportionality. These vectors are connected by a line to a singular vector if there exists an operator mapping the singular vector of lower level (or the highest weight vector) onto the singular vector of higher level. As in the case of the embedding diagrams of the Virasoro algebra we shall omit lines between two vectors if these vectors are already indirectly connected. We also want to include in the embedding diagrams information about the type of the singular vectors. To this end we use the following notation: the highest weight vector is denoted by a square and the singular vectors by circles. These circles are filled for singular vectors corresponding to Kac-determinant formula vanishings and unfilled for descendent singular vectors (for the explicit form of the Kac-determinant see [5, Eq. (6)]). Furthermore, uncharged singular vectors which have no singular descendents of positive or negative charge, respectively, are denoted by surrounding triangles pointing to the left or right, respectively. (These singular vectors are of type $\Delta(1,0)$ or $\Delta(0,1)$ in the notation of [9].)
It has been shown in [9] that all singular vectors in $\mathcal{V}_{h,q,c}$ have charge 0 or $\pm 1$. Therefore we indicate the charge of a singular vector relative to the highest weight vector by drawing the uncharged vectors vertically underneath the highest weight vector, the $-1$ charged vectors in a strip to the left of the highest weight vector and finally the $+1$ charged singular vectors in a strip to the right of the highest weight vector. Let us now consider all embedding diagrams of the Verma modules $\mathcal{V}_{0,0,c(p,p')}$ with $p, p' > 0$. For these values of the central charge there exist three types of embedding diagrams corresponding to $p = 1, p' \notin \mathbb{Q}$ or $p = 1, p' \in \mathbb{N}$ or $2 < p \in \mathbb{N}, p' \in \mathbb{N}, (p, p') = 1$, whose respective embedding diagrams are shown in Fig. 2.1, Fig. 2.2 and Fig. 2.3 [10]⁶.

[Figure: embedding diagram, with columns labelled by the charge q = -1, 0, +1]
Fig. 2.1. Embedding diagram for $\mathcal{V}_{0,0,c(1,p')}$ with $p' \notin \mathbb{Q}$

The Verma modules $\mathcal{V}_{0,0,c(1,p')}$ with $p' \notin \mathbb{Q}$ contain only two singular vectors, namely $G^\pm_{-\frac12}\Omega$, where $\Omega = |0, 0, c\rangle$ is the vacuum vector (cf. Fig. 2.1). The Verma modules
6 Note that in [9] the embedding diagram of type I I I ~ A +B - , I I 1 ~ A - B + (corresponding in our notation to $\mathcal{V}_{0,0,c(p,p')}$ with coprime $p, p' \in \mathbb{N}$ and $p, p' > 2$) is not correct. The correct embedding diagram is the same as the embedding diagram for the case I I I ~ A +- B + , I I I ~ A + B + - (corresponding in our notation to $\mathcal{V}_{0,0,c(p,1)}$ with 2 3 which we will need in the appendix (for details see [9]). For $c = 3$ there are infinitely many singular vectors which are all embedded in the two generic singular vectors $G^\pm_{-\frac12}\Omega$. For $c > 3$ all embedding diagrams terminate, i.e. there are only finitely many singular vectors contained in the vacuum Verma modules.
3. Zhu's Algebra

In this section we shall first give a physically motivated derivation of Zhu's algebra; we shall then use this formulation to determine $A(\mathcal{H}_0)$ for a certain class of theories of the N = 2 super Virasoro algebra, and for some special cases. A conceptually interesting way to determine all irreducible representations of a (bosonic) meromorphic conformal field theory is the method introduced by Zhu [42], whereby one associates an associative algebra, usually denoted by $A(\mathcal{H}_0)$, to the vacuum representation $\mathcal{H}_0$ of a conformal field theory. It was shown by Zhu that the irreducible representations of this associative algebra are in one-to-one correspondence with the irreducible representations of the meromorphic field theory $\mathcal{H}_0$. To define this algebra, a certain product structure was introduced by means of some rather complicated formulae, and it was not clear how this construction could be understood from the more traditional point of view of conformal field theory. Here we shall give a different derivation for this algebra, from which it will be immediate that all representations of $\mathcal{H}_0$ have to be representations of $A(\mathcal{H}_0)$ (this derivation follows in spirit [14] and [41]); to show the converse direction, an argument similar to the one given in [42] would be sufficient. Another virtue of our derivation is that the Neveu-Schwarz fermionic case (which has by now been independently worked out by Kac and Wang in [29]) can essentially be treated on the same footing. To fix notation, let us denote the modes of a holomorphic field $S(z)$ of conformal weight $h$ by
$$S(z) = \sum_{l \in \mathbb{Z}} S_{-l}\, z^{l-h}. \qquad (2)$$
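Given two representations of the chiral symmetry algebra $\mathcal{A}$, $\mathcal{H}_1$ and $\mathcal{H}_2$, and two points $z_1, z_2 \in \mathbb{C}$ in the complex plane, the fusion tensor product can be defined by the following construction [15]. First we consider the product space $(\mathcal{H}_1 \otimes \mathcal{H}_2)$ on which two different actions of the chiral algebra are given by the two comultiplication formulae [16]
$$\Delta_{z_1,z_2}(S_n) = \tilde\Delta_{z_1,z_2}(S_n) = \sum_{m=1-h}^{n} \binom{n+h-1}{m+h-1} z_1^{n-m}\,(S_m \otimes \mathbb{1}) + \epsilon_1 \sum_{l=1-h}^{n} \binom{n+h-1}{l+h-1} z_2^{n-l}\,(\mathbb{1} \otimes S_l), \qquad (3)$$
$$\Delta_{z_1,z_2}(S_{-n}) = \sum_{m=1-h}^{\infty} \binom{n+m-1}{n-h} (-1)^{m+h-1} z_1^{-(n+m)}\,(S_m \otimes \mathbb{1}) + \epsilon_1 \sum_{l=n}^{\infty} \binom{l-h}{n-h} (-z_2)^{l-n}\,(\mathbb{1} \otimes S_{-l}), \qquad (4)$$
$$\tilde\Delta_{z_1,z_2}(S_{-n}) = \sum_{m=n}^{\infty} \binom{m-h}{n-h} (-z_1)^{m-n}\,(S_{-m} \otimes \mathbb{1}) + \epsilon_1 \sum_{l=1-h}^{\infty} \binom{n+l-1}{n-h} (-1)^{l+h-1} z_2^{-(n+l)}\,(\mathbb{1} \otimes S_l), \qquad (5)$$
where in (3) we have $n \geq 1 - h$, in (4,5) $n \geq h$, and $\epsilon_1$ is $\mp 1$ according to whether the left-hand vector in the tensor product and the field $S$ are both fermionic or not.⁷ The fusion tensor product is then defined as the quotient of the product space by all relations which come from the equality of $\Delta_{z_1,z_2}$ and $\tilde\Delta_{z_1,z_2}$,
$$(\mathcal{H}_1 \otimes \mathcal{H}_2)_f := (\mathcal{H}_1 \otimes \mathcal{H}_2)/(\Delta_{z_1,z_2} - \tilde\Delta_{z_1,z_2}). \qquad (6)$$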
It has been shown for a number of examples that this definition reproduces the known restrictions for the fusion rules [15, 16]. To analyse the possible representations of the meromorphic field theory $\mathcal{H}_0$, let us consider the fusion product of a given representation $\mathcal{H}$ at $z_2 = 0$ with the vacuum representation $\mathcal{H}_0$ at $z_1 = z$. We shall be interested in the quotient of the fusion product by all states of the form
$$\Delta_{z,0}(\mathcal{A}_-)\,(\mathcal{H}_0 \otimes \mathcal{H})_f, \qquad (7)$$
where $\mathcal{A}_-$ is the algebra generated by all negative modes. (In the conventional approach to fusion in terms of 3-point functions, all such states vanish if there is a highest weight vector at infinity.) Using the comultiplication $\Delta_{z,0}$, it is clear that we can identify this quotient space with a certain subspace of
$$(\mathcal{H}_0 \otimes \mathcal{H})_f / \Delta_{z,0}(\mathcal{A}_-)\,(\mathcal{H}_0 \otimes \mathcal{H})_f \subset (\mathcal{H}_0 \otimes \mathcal{H}^{(0)}), \qquad (8)$$
where $\mathcal{H}^{(0)}$ is the highest weight space of the representation $\mathcal{H}$. The idea is now to analyse this quotient space for the universal highest weight representation $\mathcal{H} = \mathcal{H}_{univ}$, i.e. to use no property of $\psi \in \mathcal{H}^{(0)}_{univ}$ other than that it is a highest weight state. We can then identify this quotient space with a certain quotient of the vacuum representation $\mathcal{H}_0$, thus defining $A(\mathcal{H}_0)$,
$$(A(\mathcal{H}_0) \otimes \mathcal{H}^{(0)}_{univ}) = (\mathcal{H}_0 \otimes \mathcal{H}_{univ})_f / \Delta_{z,0}(\mathcal{A}_-)\,(\mathcal{H}_0 \otimes \mathcal{H}_{univ})_f. \qquad (9)$$
In order to do this analysis without using any information about $\psi$, we have to find a formula for
$$(\mathbb{1} \otimes S_0)(\mathcal{H}_0 \otimes \psi) \mod \Delta_{z,0}(\mathcal{A}_-)(\mathcal{H}_0 \otimes \mathcal{H}_{univ}), \qquad (10)$$
in terms of modes acting on the left-hand factor in the tensor product, where $S$ is any bosonic field, i.e. $S$ has integral conformal dimension $h$.

7 The second formula differs from the one given in [16] by a different $\epsilon$ factor. There the two comultiplication formulae were evaluated on different branches; this is corrected here.
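The crucial ingredient we shall be using is the observation
$$\tilde\Delta_{0,-z}(S_{-h}) = \Delta_{z,0}\left(e^{zL_{-1}} S_{-h}\, e^{-zL_{-1}}\right) \in \Delta_{z,0}(\mathcal{A}_-). \qquad (11)$$
Hence we have (for $h \geq 2$)
$$0 \approx \tilde\Delta_{0,-z}(S_{-h}) + \sum_{l=1-h}^{-1} z^{-(h+l)}\, \Delta_{z,0}(S_l)$$
$$= (S_{-h} \otimes \mathbb{1}) - \sum_{l=1-h}^{\infty} z^{-(h+l)}\,(\mathbb{1} \otimes S_l) + \sum_{l=1-h}^{-1} z^{-(h+l)} \left\{ (\mathbb{1} \otimes S_l) + \sum_{m=1-h}^{l} \binom{l+h-1}{m+h-1} z^{l-m}\,(S_m \otimes \mathbb{1}) \right\}$$
$$= (S_{-h} \otimes \mathbb{1}) - z^{-h}\,(\mathbb{1} \otimes S_0) + (\mathbb{1} \otimes \mathcal{A}_+) + \sum_{l=1-h}^{-1} \sum_{m=1-h}^{l} z^{-(h+m)} \binom{l+h-1}{m+h-1} (S_m \otimes \mathbb{1}),$$
where $\mathcal{A}_+$ is the algebra generated by the positive modes, and $\approx$ denotes equality up to terms in the quotient. Evaluated on $(\mathcal{H}_0 \otimes \psi)$, where $\psi$ is a highest weight, we then have
$$(\mathbb{1} \otimes S_0) \approx z^h\,(S_{-h} \otimes \mathbb{1}) + \sum_{l=1-h}^{-1} \sum_{m=1-h}^{l} z^{-m} \binom{l+h-1}{m+h-1} (S_m \otimes \mathbb{1}). \qquad (12)$$
In particular, we can use this result to obtain a formula for the action of $\Delta_{z,0}(S_0)$ on $(\phi \otimes \psi)$ modulo vectors in the quotient, where $\phi \in A(\mathcal{H}_0)$. (It is clear that $\Delta_{z,0}(S_0)$ is well-defined on the quotient.) We calculate
$$\Delta_{z,0}(S_0) \approx z^h\,(S_{-h} \otimes \mathbb{1}) + (S_0 \otimes \mathbb{1}) + \sum_{m=1-h}^{-1} \left\{ \binom{h-1}{m+h-1} + \sum_{l=m}^{-1} \binom{l+h-1}{m+h-1} \right\} z^{-m}\,(S_m \otimes \mathbb{1}) = \sum_{m=0}^{h} \binom{h}{m} z^m\,(S_{-m} \otimes \mathbb{1}),$$
where we have used the identity
$$\sum_{k=0}^{l} \binom{a+k}{k} = \binom{a+l+1}{l} \qquad (13)$$
(see e.g. [21, p. 174]) to rewrite the sum in curly brackets. This reproduces precisely the product formula of Zhu for $z = 1$ [42],
$$S * \phi = \sum_{m=0}^{h} \binom{h}{m} S_{-m}\, \phi. \qquad (14)$$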
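The step from (12) to the product formula (14) rests on the binomial identity (13): the coefficient of $z^{-m}(S_m \otimes \mathbb{1})$, for $1 - h \leq m \leq -1$, collapses to $\binom{h}{-m}$. A minimal sketch of this check, with hypothetical helper names and assuming integer $h \geq 2$, reads:

```python
from math import comb


def hockey_stick_holds(a, l):
    """Identity (13): sum_{k=0}^{l} C(a+k, k) == C(a+l+1, l)."""
    return sum(comb(a + k, k) for k in range(l + 1)) == comb(a + l + 1, l)


def zero_mode_coeff(h, m):
    """Coefficient of z^(-m) (S_m x 1), 1-h <= m <= -1, in Delta_{z,0}(S_0) after using (12).

    First term: direct contribution of the comultiplication (3) at n = 0;
    second term: contribution of (1 x S_0) rewritten through (12).
    """
    direct = comb(h - 1, m + h - 1)
    from_12 = sum(comb(l + h - 1, m + h - 1) for l in range(m, 0))
    return direct + from_12
```

Comparing `zero_mode_coeff(h, m)` with `comb(h, -m)` for a range of weights reproduces the coefficients of Zhu's product at $z = 1$.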
If $S$ is the field corresponding to a state in the subspace of the vacuum representation $\mathcal{H}_0$ by which we quotient to obtain $A(\mathcal{H}_0)$, then its zero mode vanishes by definition on all highest weight states. This implies that the product structure defined by the action of $\Delta_{z,0}(S_0)$ gives rise to a well-defined product on $A(\mathcal{H}_0)$. We have now achieved our first goal, namely to express the zero modes of the holomorphic fields on a highest weight state in terms of modes acting in the vacuum representation, modulo terms which vanish if there is a highest weight vector at infinity. In the next step we want to derive the relations by which the vacuum representation has to be divided in order to give $A(\mathcal{H}_0)$. In particular, we shall see that we can express all states of the form $(S_{-n} \otimes \mathbb{1})(\mathcal{H}_0 \otimes \psi)$ with $n > h$ by corresponding states with $n \leq h$. (Again this can be done without using any property of $\psi$ other than that it is a highest weight vector.) We shall do the calculation for the bosonic case first, i.e. for $h \in \mathbb{N}$; we shall explain later what modifications arise in the fermionic case. As before we have
$$0 \approx \tilde\Delta_{0,-z}(S_{-n}) + \sum_{l=1-h}^{-1} \binom{n+l-1}{n-h} (-1)^{h-n} z^{-(n+l)}\, \Delta_{z,0}(S_l)$$
$$= (S_{-n} \otimes \mathbb{1}) - \binom{n-1}{n-h} (-1)^{h-n} z^{-n}\,(\mathbb{1} \otimes S_0) + (\mathbb{1} \otimes \mathcal{A}_+) + (-1)^{h-n} \sum_{l=1-h}^{-1} \binom{n+l-1}{n-h} \sum_{m=1-h}^{l} \binom{l+h-1}{m+h-1} z^{-(n+m)}\,(S_m \otimes \mathbb{1}).$$
Using (12) we can rewrite the $(\mathbb{1} \otimes S_0)$ term, and find after a short calculation
$$(S_{-n} \otimes \mathbb{1}) \approx \binom{n-1}{h-1} (-1)^{h-n} \left\{ z^{h-n}\,(S_{-h} \otimes \mathbb{1}) + \sum_{m=1-h}^{-1} z^{-(m+n)}\, C_m\,(S_m \otimes \mathbb{1}) \right\}, \qquad (15)$$
where
$$C_m = \sum_{l=m}^{-1} \binom{l+h-1}{m+h-1} \left( 1 - \frac{(n+l-1)!\,(h-1)!}{(n-1)!\,(l+h-1)!} \right). \qquad (16)$$
For completeness we should also give the result for $h = 1$, where the analysis simplifies to
$$(T_{-n} \otimes \mathbb{1}) \approx -(-z)^{-n}\,(\mathbb{1} \otimes T_0) \approx (-z)^{-n+1}\,(T_{-1} \otimes \mathbb{1}). \qquad (17)$$
Taking $n = h + 1$, we note that (15) and (17) become
$$0 \approx \sum_{m=1}^{h+1} \binom{h}{h+1-m} z^{m-h-1}\,(S_{-m} \otimes \mathbb{1}), \qquad (18)$$
which, for $z = 1$, just reproduces the formula of Zhu [42]. Here we have used
$$\sum_{l=1}^{m} l \binom{h-l-1}{h-1-m} = \binom{h}{h+1-m} \qquad (19)$$
(see e.g. [21, p. 176]).
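The passage from (15) and (16) to the relation (18) can be checked numerically for small integer weights. The sketch below is an illustration under the assumption that $C_m$ is read as $\sum_{l=m}^{-1} \binom{l+h-1}{m+h-1}\left(1 - \frac{(n+l-1)!\,(h-1)!}{(n-1)!\,(l+h-1)!}\right)$; the names `c_coeff`, `relation_coeffs` and `identity_19_holds` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def c_coeff(m, n, h):
    """C_m of (16) for integer weight h >= 2, n >= h and 1-h <= m <= -1."""
    total = Fraction(0)
    for l in range(m, 0):
        ratio = Fraction(factorial(n + l - 1) * factorial(h - 1),
                         factorial(n - 1) * factorial(l + h - 1))
        total += comb(l + h - 1, m + h - 1) * (1 - ratio)
    return total


def relation_coeffs(h):
    """Coefficients of z^(m-h-1) (S_{-m} x 1), m = 1..h+1, obtained from (15) at n = h+1."""
    n = h + 1
    pref = comb(n - 1, h - 1)  # equals h; the sign (-1)^(h-n) = -1 moves the bracket to the left-hand side
    coeffs = {h + 1: Fraction(1), h: Fraction(pref)}
    for m in range(1, h):
        coeffs[m] = pref * c_coeff(-m, n, h)
    return coeffs


def identity_19_holds(h, m):
    """Identity (19): sum_{l=1}^{m} l C(h-l-1, h-1-m) == C(h, h+1-m), for 1 <= m <= h-1."""
    return sum(l * comb(h - l - 1, h - 1 - m) for l in range(1, m + 1)) == comb(h, h + 1 - m)
```

For each weight, `relation_coeffs(h)` should coincide with the binomial coefficients $\binom{h}{h+1-m}$ appearing in (18).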
From our definition of A(7-/0) it is clear that every highest weight representation of gives rise to a representation of A(7-(0) with respect to the product structure induced by A z,0(S0). As our space is at most as large as the space of Zhu, it is then clear that our definition has to agree with the one of Zhu. The fermionic case is slightly simpler, as there are no zero modes, and thus there is no relation corresponding to ( 1 2 ) . W e therefore only have to calculate for n _> h, 7%
'/'
( S _ n @ ]l) ------AO,_z(S_n)--gll=_ -
(--1)Z+h-I(--Z)-(n+Z) (11| Sz)
_! =
el(-1)h-n
n-
h
l=l--h For
l =
1 -
h,
999
, - 3, 1 we have furthermore 1
, , (11 | & ) = A z,o(&) -
z
l)z
rn=l--h Hence we find for n _> h, l
(s_n | n)
2
(-1) h-n" E
(Sin | n),
(21)
m=l --h
where
_!
Din=
l=m
[,m+h-1
n-h
"
Using (13) this formula simplifies for n = h to h-~
m=O which, for z = 1, just reproduces the formula of Kac and Wang [29]. By the same reasoning as before, it is then clear that our definition agrees with the one of Kac and Wang. In the case of the N = 2 algebra, the only fermionic fields are G + of conformal weight h = 3/2. Because of (21), all negative modes of G_im with ra >_ 3/2 can be eliminated in the quotient. On the other hand, G+-m vanishes on the vacuum for m = 89 and thus all G • modes can be removed. Furthermore, using (15) and (17), all (negative) modes of L and T can be eliminated, except for T_l and L - 2 , L - 1 . On the other hand, L-1 vanishes on the vacuum, and thus can be removed by commuting it through to the right. The space A(7-/0) is therefore a certain quotient space of the space generated by L_2 and T-1. We can then equally well describe A(7-/0) as a quotient space of the space generated by h = L - 2 + L _ I and q = T_l; this formulation has the advantage that the two generators commute, and that they can be directly identified with the eigenvalues of the highest weight with respect to L0 and To (by (12)), since
$$(\mathbb{1} \otimes L_0) \cong (h \otimes \mathbb{1}), \qquad (\mathbb{1} \otimes T_0) \cong (q \otimes \mathbb{1}). \tag{24}$$
Thus $A(\mathcal{H}_0)$ is a quotient space of the space of polynomials in $h$ and $q$. Generically, this space is infinite dimensional, and to obtain some restrictions, we have to use singular vectors in the vacuum representation. It is clear that bosonic descendents of singular vectors do not give new information for the quotient, as we can always replace negative modes of $L$ and $T$ by $h$, $q$, $L_{-1}$ and some non-negative modes. Apart from the $L_{-1}$ contributions, these give only restrictions which contain, as a factor, restrictions from the original singular vector. The $L_{-1}$ contributions simply correspond to an infinitesimal shift in the insertion point $z$, and thus do not give new restrictions either. For fermionic descendents, a similar argument implies that the only descendents of potential interest are
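$$G^{+}_{-\frac12}\mathcal{N}, \qquad G^{-}_{-\frac12}\mathcal{N}, \qquad G^{+}_{-\frac12}G^{-}_{-\frac12}\mathcal{N}, \tag{25}$$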
where $\mathcal{N}$ is a singular vector. It is clear that all three are trivial for the first generic singular vectors of the vacuum representation, $G^{\pm}_{-\frac12}\Omega = 0$, but in general, they need not be trivial. It follows from the embedding structure for the cases corresponding to Fig. 2.1 and Fig. 2.2 that all singular vectors are descendents of the generic singular vectors of the vacuum representation. We can thus conclude that $A(\mathcal{H}_0)$ is isomorphic to a polynomial ring in two independent variables, and thus, in particular, infinite dimensional. This shows that the corresponding theories are not rational.

In the case of the diagram of Fig. 2.3, there exists an additional independent bosonic singular vector $\mathcal{N}$.$^{8}$ The relations of the generic singular vectors have already been taken into account in the above derivation, and $A(\mathcal{H}_0)$ is therefore only finite dimensional (and the corresponding theory rational) if $\mathcal{N}$ gives rise to two independent relations. For a bosonic singular vector $\mathcal{N}$, only the third descendent in (25) can contribute, as the other two have odd fermion number and thus are equivalent to zero in the quotient. We conclude from this that the potential rational models all have a non-trivial bosonic singular vector in the vacuum representation. Whether the model is actually rational depends then on whether the $G^{+}_{-\frac12}G^{-}_{-\frac12}$ descendent gives an independent relation or not.

We have calculated the relations coming from $\mathcal{N}$ and the $G^{+}_{-\frac12}G^{-}_{-\frac12}$ descendent for a few examples explicitly. The first bosonic singular vector is given in each case as
$$\begin{aligned}
\mathcal{N}_{c=1} ={}& \left(2L_{-2} - 3T_{-1}T_{-1}\right)\Omega, \\
\mathcal{N}_{c=3/2} ={}& \left(10T_{-3} - 3L_{-3} + 3G^{+}_{-3/2}G^{-}_{-3/2} - 12L_{-2}T_{-1} + 8T_{-1}T_{-1}T_{-1}\right)\Omega, \\
\mathcal{N}_{c=-6} ={}& \left(-10T_{-3} - 6L_{-3} + 6G^{+}_{-3/2}G^{-}_{-3/2} + 6L_{-2}T_{-1} + T_{-1}T_{-1}T_{-1}\right)\Omega, \\
\mathcal{N}_{c=-1} ={}& \big(42T_{-4} + 24L_{-4} + 27T_{-2}T_{-2} - 84T_{-3}T_{-1} - 6G^{+}_{-3/2}G^{-}_{-5/2} + 6G^{+}_{-5/2}G^{-}_{-3/2} - 32L_{-2}L_{-2} \\
& - 36L_{-3}T_{-1} + 36T_{-1}G^{+}_{-3/2}G^{-}_{-3/2} + 12L_{-2}T_{-1}T_{-1} + 9T_{-1}T_{-1}T_{-1}T_{-1}\big)\Omega, \\
\mathcal{N}_{c=-12} ={}& \big(-240L_{-5} + 360G^{+}_{-3/2}G^{-}_{-7/2} + 120G^{+}_{-5/2}G^{-}_{-5/2} + 840L_{-2}T_{-3} + 360G^{+}_{-7/2}G^{-}_{-3/2} \\
& + 600L_{-3}L_{-2} + 120L_{-3}T_{-2} + 180L_{-4}T_{-1} - 60T_{-1}G^{+}_{-3/2}G^{-}_{-5/2} + 60T_{-1}G^{+}_{-5/2}G^{-}_{-3/2} \\
& - 600L_{-2}G^{+}_{-3/2}G^{-}_{-3/2} - 300L_{-2}L_{-2}T_{-1} + 60L_{-3}T_{-1}T_{-1} - 60T_{-4}T_{-1} + 30T_{-2}T_{-2}T_{-1} \\
& - 60L_{-2}T_{-1}T_{-1}T_{-1} - 3T_{-1}T_{-1}T_{-1}T_{-1}T_{-1} + 180T_{-3}T_{-1}T_{-1} - 60T_{-1}T_{-1}G^{+}_{-3/2}G^{-}_{-3/2} - 1992T_{-5}\big)\Omega.
\end{aligned}$$

$^{8}$ To avoid confusion, we should point out that $L_{-1}\Omega$ is a descendent of the two generic singular vectors. We should also note that we implicitly assume here that the vacuum representation does not possess any subsingular vectors which might give additional relations.
The singular vector and its descendent give rise to the polynomial relations (in $h$ and $q$) $p_1$ and $p_2$, respectively. The algebra $A(\mathcal{H}_0)$ is then given by
$$A(\mathcal{H}_0) = \mathbb{C}[h, q] / \langle p_1(h, q), p_2(h, q) \rangle.$$
Our results for the five cases above are contained in Table 3.1.

Table 3.1. Polynomials determining $A(\mathcal{H}_0)$ for certain values of $c$

c       p_1(h, q)                               p_2(h, q)
1       2h - 3q^2                               (-2h + q)(1 + 3q)
3/2     q(1 - 12h + 8q^2)                       (2h - q)(1 - 2h + 5q + 8q^2)
-6      q(2 + 6h + q^2)                         (2h - q)(2 + 6h + q^2)
-1      (-4h + 3q^2)(1 + 8h + 3q^2)             (2h - q)(2 + 3q)(1 + 8h + 3q^2)
-12     q(4 + 10h + q^2)(6 + 10h + q^2)         (-2h + q)(4 + 10h + q^2)(6 + 10h + q^2)
We note that for the unitary models $c = 1$ and $c = 3/2$, $A(\mathcal{H}_0)$ is finite-dimensional, as the two relations are independent. In the other three cases, however, the two relations contain a common factor, and thus $A(\mathcal{H}_0)$ is infinite dimensional.
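As a quick consistency check on Table 3.1 (not part of the original argument), one can evaluate Eq. (28) at the integer levels $k = 1$ ($c = 1$) and $k = 2$ ($c = 3/2$) and confirm that every resulting pair $(h, q)$ is a common zero of $p_1$ and $p_2$. The sketch below assumes that at integer level the labels run over $j = 0, \frac12, \ldots, \frac k2$ and $m = -j, \ldots, j$; the names `weights` and `table` are illustrative only.

```python
from fractions import Fraction as F

def weights(k):
    """(h, q) from Eq. (28) at integer level k.

    Assumption: j = 0, 1/2, ..., k/2 and m = -j, -j+1, ..., j.
    """
    out = set()
    for two_j in range(k + 1):
        j = F(two_j, 2)
        for two_m in range(-two_j, two_j + 1, 2):
            m = F(two_m, 2)
            h = (j * (j + 1) - m * m) / (k + 2)
            q = 2 * m / (k + 2)
            out.add((h, q))
    return out

# Polynomials p_1, p_2 of Table 3.1, keyed by the level k (c = 3k/(k+2)).
table = {
    1: (lambda h, q: 2 * h - 3 * q**2,
        lambda h, q: (-2 * h + q) * (1 + 3 * q)),
    2: (lambda h, q: q * (1 - 12 * h + 8 * q**2),
        lambda h, q: (2 * h - q) * (1 - 2 * h + 5 * q + 8 * q**2)),
}

for k, (p1, p2) in table.items():
    for h, q in sorted(weights(k)):
        # Every highest weight of the unitary model solves both relations.
        assert p1(h, q) == 0 and p2(h, q) == 0
```

Exact rational arithmetic is used so that the vanishing of the polynomials is tested without rounding error.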
4. The Coset Argument
We have shown in the last section that the N = 2 super Virasoro algebra is not rational for certain non-unitary cases. In this section we will analyse the remaining cases with $c < 3$. The analysis for $c \geq 3$, using the modular properties of the vacuum character, is contained in Appendix B.

We shall use the coset realisation (1) of the N = 2 super Virasoro algebra to show that certain admissible representations of $\widehat{su}(2)_k$ give rise to infinitely many irreducible representations in the non-unitary cases. The basic idea of this argument is due to Ahn et al. [1]. We shall present the argument in the more general setting of the Kazama-Suzuki models as the counting argument generalises. Recall that the Kazama-Suzuki models can be constructed from hermitian symmetric spaces [30]. More precisely, it has been shown in ref. [30] that if $G/H$ is a hermitian symmetric space, the coset
$$\frac{\hat{\mathfrak{g}}_k \oplus (\mathcal{F}er)^{2n}}{\hat{\mathfrak{h}}_k}, \tag{26}$$
where $n = \mathrm{rank}(G) = \mathrm{rank}(H)$, contains the N = 2 super Virasoro algebra, and the explicit form of the N = 2 super Virasoro generators $T$, $G^{\pm}$, $L$ in terms of the $2n$ free fermions and the $\hat{\mathfrak{g}}_k$ currents has been given. A complete list of hermitian symmetric spaces can, for example, be found in ref. [30, Table 1]. Note that for all hermitian symmetric spaces $\mathrm{rank}(G) = \mathrm{rank}(H)$, $\mathfrak{g}$ is simple and $\mathfrak{h}$ is of the form $\mathfrak{h} = u(1) \oplus \mathfrak{h}_1$, where $\mathfrak{h}_1$ is semisimple. The case of the N = 2 super Virasoro algebra corresponds to $\mathfrak{g} = su(2)$ and $\mathfrak{h} = u(1)$.
Before proceeding, we should note that it is in general rather difficult to determine the actual coset algebra of a given coset. In particular, even if the coset algebra is correctly identified for generic level (which is usually a tractable problem, for example by comparing generic characters), it is a priori not clear that this identification remains correct at arbitrary level. However, for the case of $\mathfrak{g} = su(2)$, $\mathfrak{h} = u(1)$, the character calculations of Appendix A show that the coset is indeed the N = 2 super Virasoro algebra for arbitrary level $k$.$^{9}$

Using the explicit form of the $u(1)$ current $T$ and the Virasoro field $L$ in the coset [30, Eq. (4.5)] it is easy to obtain a formula for the eigenvalues $q$ and $h$ of $T_0$ and $L_0$, respectively, acting on the subspace of the $\hat{\mathfrak{g}}_k$ highest weight space with $\mathfrak{h}$-weight $\lambda$ of a $\hat{\mathfrak{g}}_k \oplus \mathcal{F}er^{2n}$ highest weight representation,
$$h = \frac{C_2(\mathfrak{g})}{2(k+g)} - \frac{(\lambda, \lambda + 2\rho_{\mathfrak{h}})}{2(k+g)}, \qquad q = \frac{2}{k+g}\,(\rho_{\mathfrak{g}} - \rho_{\mathfrak{h}}, \lambda). \tag{27}$$
Here $C_2(\mathfrak{g})$ denotes the second order Casimir of the $\mathfrak{g}$ representation on the $\hat{\mathfrak{g}}_k$ highest weight space, $g$ the dual Coxeter number of $\mathfrak{g}$, and $\rho_{\mathfrak{g}}$ and $\rho_{\mathfrak{h}}$ are half the sum of the positive roots of $\mathfrak{g}$ and $\mathfrak{h}$, respectively. For the case of $\mathfrak{g} = su(2)$, the formulae become
$$h = \frac{j(j+1)}{k+2} - \frac{m^2}{k+2}, \qquad q = \frac{2m}{k+2}, \tag{28}$$
where $j$ and $m$ label the spin and the magnetic quantum number of the corresponding $su(2)$ representation. If for admissible $k \notin \mathbb{N}$ the admissible representations of $\widehat{su}(2)_k$ are MCFT representations, it follows directly from these formulae that there are infinitely many highest weight states, as was already observed by Ahn et al. [1]. (For more details see below.) This implies then directly that the corresponding theory is not rational. It has now been shown that the admissible representations of $\widehat{su}(2)$ are indeed MCFT representations [13, Corollary 2.11].

For general $\mathfrak{g}$, the corresponding result is not yet known, but we believe it to be true. Assuming this for the general case, the argument can be generalised as follows: we note that $\mathfrak{h}$ has one simple root less than $\mathfrak{g}$, which we denote by $\alpha$, and that the Dynkin index of $\alpha$ is 1. For an admissible but non-integer level $k$ there always exists an admissible representation of $\hat{\mathfrak{g}}_k$ whose Dynkin label corresponding to the fundamental weight dual to $\alpha$ is fractional ([28, Theorem 2.1 (c)], see also [35, p. 236]). This representation has in particular an infinite dimensional highest weight space. Let $\Lambda$ be the $\mathfrak{g}$-weight of the highest weight vector $v_\Lambda$ of this representation, and denote by $\lambda_0$ the $\mathfrak{h}$-weight of $v_\Lambda$. Furthermore, let $E_{-\alpha}$ be the step operator corresponding to $-\alpha$. Then the vectors $E_{-\alpha}^{n} v_\Lambda$ are highest weight vectors of $\mathfrak{h}$, and their $\mathfrak{h}$-weights are given as $\lambda_n = \lambda_0 + n\mu$, where $\mu$ is a non-zero $\mathfrak{h}$-weight and $n + 1 \in \mathbb{N}$. This implies that the expression $(\lambda_n, \lambda_n + 2\rho_{\mathfrak{h}})$ is unbounded for $n \to \infty$ and hence that there are infinitely many different values for the conformal weight $h$ in (27). Thus the coset (26) has infinitely many inequivalent representations.

$^{9}$ This argument relies on the conjectured embedding diagrams of the N = 2 algebra.

To relate the arguments for $\mathfrak{g} = su(2)$ to the results of Sect. 3, let us parametrise the admissible level as $k = p/p' - 2$, where $p$, $p'$ are coprime positive integers and $p \geq 2$. The admissible representations are given by the $\widehat{su}(2)_k$ weights [27, p. 4958]
$$\Lambda_{n,l} = (k - n + l(k+2))\Lambda_0 + (n - l(k+2))\Lambda_1, \tag{29}$$
where $n$ and $l$ run through $n = 0, \ldots, p-2$, $l = 0, \ldots, p'-1$, and $\Lambda_0$, $\Lambda_1$ are the fundamental weights of $\widehat{su}(2)$. We note that the spin $j$ of the $su(2)$ representation on the highest weight space of the $\widehat{su}(2)_k$ representation corresponding to $\Lambda_{n,l}$ is given by $j = \frac12 (n - l(k+2))$. In particular, for $k \in \mathbb{N}$, which corresponds to the unitary case, the spin $j$ is always half-integral. If $k \notin \mathbb{N}$, the admissible representations corresponding to the weights $\Lambda_{n,l}$ with $l \neq 0$ have an infinite dimensional highest weight space as $2j + 1 \notin \mathbb{N}$. These representations give rise to infinitely many MCFT representations of the coset algebra, thus showing that it cannot be rational. Indeed, Eq. (28) implies that
$$h + \frac{k+2}{4}\, q^2 = \frac{j(j+1)}{k+2}, \tag{30}$$
where $j$ is the spin of the $su(2)$ representation. This equation gives precisely the common factors in Table 3.1 for $c = -6$, $-1$ and $-12$, i.e. $k + 2 = p/p' = \frac23$, $\frac32$ and $\frac25$, where the spin $j$ corresponds to the admissible representations of $\widehat{su}(2)_k$ with infinite dimensional highest weight space, which are given by the weights $\Lambda_{n,l}$ with $l \neq 0$. The additional discrete representations of $A(\mathcal{H}_0)$ correspond to the $\widehat{su}(2)_k$ representations with the weights $\Lambda_{n,0}$ ($n = 0, \ldots, p-2$): for example in the second case $c = -1$, $(h, q) = (\frac13, \pm\frac23)$ comes from $n = 1$ and $(h, q) = (0, 0)$ from $n = 0$.

We know that all irreducible representations satisfying the conditions given by the polynomials in Table 3.1 are MCFT representations, so in particular there exists a continuum of MCFT representations in the non-unitary cases. The above argument, however, only shows that those representations satisfying (28) can be obtained from the coset construction. Furthermore, it is clear that only countably many representations of the coset MCFT can be constructed from the admissible $\widehat{su}(2)_k$ representations. It therefore seems that the remaining representations cannot be constructed using the coset realisation.

Finally, let us mention that our findings are in perfect agreement with the results obtained in ref. [4]. The authors of loc. cit. have investigated the representation theory of several exceptional N = 2 super W-algebras from a completely different point of view. The only rational models they found were unitary and even contained in the unitary minimal series of the N = 2 super Virasoro algebra.
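The match between Eq. (30) and the common factors of Table 3.1 can be verified with exact arithmetic. The following is a minimal sketch, assuming the parametrisation $k + 2 = p/p'$, $c = 3(1 - 2p'/p)$ and the ranges of $n$ and $l$ stated for Eq. (29); the function names are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction as F

def central_charge(p, pp):
    """c = 3k/(k+2) with k + 2 = p/p', i.e. c = 3(1 - 2p'/p)."""
    return 3 * (1 - F(2 * pp, p))

def common_factor_values(p, pp):
    """Right-hand sides j(j+1)/(k+2) of Eq. (30) for the weights (29) with l != 0.

    Returns (k + 2, set of values); each value v corresponds to the factor
    h + (k+2)/4 * q^2 - v of the polynomials in Table 3.1.
    """
    kp2 = F(p, pp)
    values = set()
    for n in range(0, p - 1):        # n = 0, ..., p - 2
        for l in range(1, pp):       # l = 1, ..., p' - 1
            j = (n - l * kp2) / 2    # spin on the highest weight space
            values.add(j * (j + 1) / kp2)
    return kp2, values

for p, pp in [(2, 3), (3, 2), (2, 5)]:
    print(central_charge(p, pp), common_factor_values(p, pp))
```

For $(p, p') = (2, 3)$ the single value $-\frac13$ with $\frac{k+2}{4} = \frac16$ gives $h + \frac16 q^2 + \frac13 = 0$, i.e. the factor $2 + 6h + q^2$; the cases $(3, 2)$ and $(2, 5)$ reproduce $1 + 8h + 3q^2$ and the pair $4 + 10h + q^2$, $6 + 10h + q^2$ in the same way.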
5. Conclusion

In this paper we have analysed systematically the question whether the rational theories of the N = 2 superconformal algebra are always unitary. We have used three independent arguments to exclude the existence of rational non-unitary theories. Where possible we have checked that the different methods lead to consistent conclusions.

One of the methods is based on the coset realisation of the N = 2 algebra, and we have already indicated in section 4 how this argument can be generalised to the Kazama-Suzuki models. Apart from the (aforementioned) problem that the admissible representations are not yet known to be MCFT representations in general, this argument does not exclude that the theories corresponding to non-admissible affine theories are rational. However, we expect that there should be 'fewer' singular vectors than in the admissible case, and thus that the corresponding theories should also not be rational. It should be possible to settle both of these problems as soon as the representation theory of the Kac-Moody algebras at non-integer level is understood in detail.

The coset argument is even more general, as it does not involve the fermions. Indeed, ignoring the fermions where applicable, it can be applied to cosets of the form
$$\frac{\hat{\mathfrak{g}}^{(1)}_{k_1} \oplus \cdots \oplus \hat{\mathfrak{g}}^{(n)}_{k_n}}{\hat{\mathfrak{h}}^{(1)}_{l_1} \oplus \cdots \oplus \hat{\mathfrak{h}}^{(m)}_{l_m}}, \tag{31}$$
where the $\hat{\mathfrak{g}}^{(i)}$ and $\hat{\mathfrak{h}}^{(j)}$ are simple affine Kac-Moody algebras at admissible level, and the sum of the numbers of simple roots of the $\hat{\mathfrak{g}}^{(i)}_{k_i}$ with $k_i \notin \mathbb{N}$ is bigger than the corresponding sum for the denominator. For the arguments to work for general (admissible) $k_i$, we also have to assume that one of the $\hat{\mathfrak{g}}^{(i)}$ with $k_i \notin \mathbb{N}$ contains a simple root of Dynkin index 1 which is not contained in the denominator. In particular, the arguments apply to the diagonal coset
$$\frac{\hat{\mathfrak{g}}_{k_1} \oplus \hat{\mathfrak{g}}_{k_2}}{\hat{\mathfrak{g}}_{k_1+k_2}},$$
indicating that they only give rise to rational models if at least one of the levels $k_1$ or $k_2$ is a positive integer (cf. the conjecture in ref. [3, p. 2421]). This is for example the case for the coset realisation of the (non-unitary) minimal models of the Virasoro algebra where one has $\mathfrak{g} = su(2)$ and $k_2 = 1$. There are certain purely bosonic coset MCFTs which are of the general form (31), e.g. those corresponding to the unifying W-algebras associated to the unitary series of the Casimir W-algebras $\mathcal{WA}_n$ or $\mathcal{WD}_n$ [3, Table 7]. Our arguments confirm in these cases the conjecture of ref. [3, p. 2422] that all rational models of the unifying W-algebras are also rational models of the Casimir W-algebras (which are, in these cases, all unitary).

Let us close by mentioning some open problems. It would be interesting to know under which conditions all representations of a coset MCFT $\hat{\mathfrak{g}}/\hat{\mathfrak{h}}$ can be obtained from MCFT representations of $\hat{\mathfrak{g}}$; as we have seen in Sect. 4, this is not the case for certain non-rational N = 2 models, where there exists a continuum of representations. In the same spirit it would be interesting to describe all representations of the N = 2 super Virasoro algebra that can be obtained from the admissible $\widehat{su}(2)_k$ representations for given $k$ and to investigate whether they define quasi-rational theories in the sense of [36]. It would also be important to have a more general criterion for determining whether a coset MCFT is rational or not. Finally, in order to complete the arguments for the general case, it would be necessary to have a better understanding of the representation theory of affine Kac-Moody algebras, in particular at admissible level.
A. Calculation of Vacuum Characters

In this appendix we want to calculate the vacuum characters of the N = 2 super Virasoro algebra from the embedding diagrams of Sect. 2 and from the coset realisation described in Sect. 4.

Let us first calculate the vacuum characters of the N = 2 super Virasoro algebra in the cases corresponding to the embedding diagrams in Figs. 2.1-2.3. In order to be able to determine the characters from the embedding diagrams we have to assume that there do not exist subsingular vectors in the vacuum representation, i.e. vectors which are not singular in the vacuum Verma module but become singular in the quotient of the vacuum Verma module by its maximal proper submodule.$^{10}$ We also want to assume that the character of a submodule of a Verma module generated from a level $n$ charged singular vector is given by $q^{n}/(1 + q^{n-n'})$ times the generic character, where $n' < n$ is the level of the uncharged singular vector (or highest weight vector) connected by a line to the charged singular vector in the embedding diagram. This means for example that the character of the submodule generated from $G^{+}_{-\frac12}\Omega$ or $G^{-}_{-\frac12}\Omega$ is just given by
$$\frac{q^{\frac12}}{1 + q^{\frac12}} \prod_{n=1}^{\infty} \frac{(1 + q^{n-\frac12})^2}{(1 - q^n)^2},$$
which is obvious in this case.

In the case of the embedding diagrams shown in Figs. 2.1 and 2.2 all singular vectors are embedded in the two submodules generated from $G^{\pm}_{-\frac12}\Omega$. Moreover, the embedding diagram implies that the intersection of the two submodules generated by $G^{+}_{-\frac12}\Omega$ and $G^{-}_{-\frac12}\Omega$ is trivial. Therefore, the vacuum character of the N = 2 super Virasoro algebra with $c(1, p') = 3(1 - 2p')$ ($p' \notin \mathbb{Q}$ or $p' \in \mathbb{N}$) is given by
$$\chi_{1,p'}(q) = q^{-c(1,p')/24} \prod_{n=1}^{\infty} \frac{(1 + q^{n-\frac12})^2}{(1 - q^n)^2} \left(1 - \frac{2 q^{\frac12}}{1 + q^{\frac12}}\right). \tag{32}$$
The case corresponding to the embedding diagram in Fig. 2.3 is more interesting. Here we have to subtract and add successively the characters of the modules generated by the corresponding singular vectors. Using the two assumptions above we obtain that the vacuum character of the N = 2 super Virasoro algebra with $c(p, p') = 3(1 - \frac{2p'}{p})$; $p, p' \in \mathbb{N}$; $(p, p') = 1$; $p \geq 2$ is given by
$$\begin{aligned}
\chi_{p,p'}(q) = q^{-c(p,p')/24} \prod_{n=1}^{\infty} \frac{(1 + q^{n-\frac12})^2}{(1 - q^n)^2} \times \Bigg( & 1 - \sum_{n=0}^{\infty} q^{p'(n+1)(p(n+1)-1)} - 2 \sum_{n=0}^{\infty} \frac{q^{pn(p'n+1)+p'n+\frac12}}{1 + q^{pn+\frac12}} \\
& + \sum_{n=1}^{\infty} q^{p'n(pn+1)} + 2 \sum_{n=1}^{\infty} \frac{q^{pn(p'n+1)-p'n-\frac12}}{1 + q^{pn-\frac12}} \Bigg).
\end{aligned}$$
In particular, for $p' = 1$ the above formula gives the vacuum character of the unitary minimal model with central charge $c = 3(1 - \frac2p)$.$^{11}$ Finally, note that the last expression for the vacuum character $\chi_{p,p'}$ can be rewritten as
$$\chi_{p,p'}(q) = q^{-c(p,p')/24} \prod_{n=1}^{\infty} \frac{(1 + q^{n-\frac12})^2}{(1 - q^n)^2} \sum_{n \in \mathbb{Z}} q^{p'n(pn+1)}\, \frac{1 - q^{pn+\frac12}}{1 + q^{pn+\frac12}}. \tag{33}$$
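The equivalence of the expanded embedding-diagram sum and the compact form (33) can be checked order by order in $x = q^{1/2}$. The sketch below is a hedged illustration, not the authors' computation: it assumes that the bracket of the expanded formula is $1 - \sum_{n\geq0} q^{p'(n+1)(p(n+1)-1)} - 2\sum_{n\geq0} q^{pn(p'n+1)+p'n+\frac12}/(1+q^{pn+\frac12}) + \sum_{n\geq1} q^{p'n(pn+1)} + 2\sum_{n\geq1} q^{pn(p'n+1)-p'n-\frac12}/(1+q^{pn-\frac12})$, and all function names are made up for the example.

```python
def geometric(d, N):
    """Coefficients of 1/(1 + x^d) up to x^(N-1), for d >= 1."""
    s = [0] * N
    for r in range(N):
        if r * d >= N:
            break
        s[r * d] = (-1) ** r
    return s

def add_shifted(target, series, shift, coeff):
    """target += coeff * x^shift * series, truncated to len(target)."""
    for i, c in enumerate(series):
        if c and i + shift < len(target):
            target[i + shift] += coeff * c

def bracket_33(p, pp, N):
    """sum_{n in Z} q^{p'n(pn+1)} (1 - q^{pn+1/2}) / (1 + q^{pn+1/2}) as a series in x."""
    total = [0] * N
    for n in range(-N, N + 1):
        e = 2 * pp * n * (p * n + 1)           # exponent of x = q^(1/2)
        if e >= N:
            continue
        a = 2 * p * n + 1                      # x-exponent of q^{pn+1/2}
        g = geometric(abs(a), N)
        sign = 1 if a > 0 else -1              # (1-x^a)/(1+x^a) is odd in a
        add_shifted(total, g, e, sign)
        add_shifted(total, g, e + abs(a), -sign)
    return total

def bracket_expanded(p, pp, N):
    """The assumed expanded bracket of the embedding-diagram formula."""
    total = [0] * N
    total[0] = 1
    for n in range(0, N + 1):
        e1 = 2 * pp * (n + 1) * (p * (n + 1) - 1)
        if e1 < N:
            total[e1] -= 1
        e2 = 2 * (p * n * (pp * n + 1) + pp * n) + 1
        add_shifted(total, geometric(2 * p * n + 1, N), e2, -2)
        if n >= 1:
            e3 = 2 * pp * n * (p * n + 1)
            if e3 < N:
                total[e3] += 1
            e4 = 2 * (p * n * (pp * n + 1) - pp * n) - 1
            add_shifted(total, geometric(2 * p * n - 1, N), e4, 2)
    return total

def prefactor(N):
    """prod_{n>=1} (1 + x^(2n-1))^2 / (1 - x^(2n))^2, truncated at x^(N-1)."""
    s = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(2):
            d = 2 * n - 1
            for i in range(N - 1, d - 1, -1):  # multiply by (1 + x^d)
                s[i] += s[i - d]
            d = 2 * n
            for i in range(d, N):              # divide by (1 - x^d)
                s[i] += s[i - d]
    return s

def mul(a, b):
    """Truncated product of two series of equal length."""
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out
```

For instance, `bracket_33(3, 2, 40) == bracket_expanded(3, 2, 40)` compares the two forms for $c = -1$, and `mul(prefactor(40), bracket_33(2, 1, 40))` gives the $q$-expansion of the $c = 0$ character up to the overall power of $q$.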
In the second part of this appendix we use the coset realisation of the N = 2 super Virasoro algebra (1) for the calculation of the vacuum characters $\chi_{p,p'}$. Recall that the central charge of the N = 2 algebra is given as $c = \frac{3k}{k+2}$ ($k \neq 0, -2$), and that the vacuum representation is given by the space of all uncharged $\widehat{u}(1)$ highest weight states in the vacuum representation of $\widehat{su}(2)_k \oplus \mathcal{F}er^{2}$. The character $\chi_{p,p'}$ is therefore the $\widehat{u}(1)$ uncharged part of $\chi^{\widehat{su}(2)}_k \chi^{\mathcal{F}er^2}$ divided by the $\widehat{u}(1)$ character,
$$\chi_{p,p'}(q) = \frac{1}{\chi^{\widehat{u}(1)}(q)}\, \mathrm{Res}_z \left( \frac{1}{z}\, \chi^{\widehat{su}(2)}_k(q, z)\, \chi^{\mathcal{F}er^2}(q, z) \right), \tag{34}$$
where $\chi^{\widehat{u}(1)}(q) = 1/\eta(q) = q^{-\frac{1}{24}}/\hat\eta(q)$ and $\hat\eta(q) = q^{-\frac{1}{24}} \eta(q) = \prod_{n=1}^{\infty} (1 - q^n)$. (Here we have used the $\widehat{su}(2)$ characters $\chi^{\widehat{su}(2)}_k(q, z) = \mathrm{tr}\, q^{L_0 - \frac{c}{24}} z^{2J^3_0}$, which also take the zero mode of $J^3$ into account.)

$^{10}$ In [18] certain representations of the N = 2 superconformal algebra have been found which possess subsingular vectors. However, these representations are rather special and do not include the vacuum representation.

$^{11}$ Although the multiplicities in the embedding diagrams of ref. [11, 34, 32] are not correct, the authors of loc. cit. obtained the correct characters in the unitary case.

There exist two types of embedding diagrams for $\widehat{su}(2)_k$ vacuum representations with level $k > -2$ (corresponding to $c < 3$) [33, Lemma 4.1]: either the level $k$ can be written as $k = p/p' - 2$, where $p, p' \in \mathbb{N}$, $(p, p') = 1$ and $p \geq 2$, or the vacuum representation is generic, i.e. all singular vectors are descendents of the level zero singular vector. In the former case the representation is admissible and the vacuum character is given by [27]
$$\chi^{\widehat{su}(2)}_k(q, z) = \frac{\Theta_{p',pp'}(\tau, z^{1/p'}) - \Theta_{-p',pp'}(\tau, z^{1/p'})}{\Theta_{1,2}(\tau, z) - \Theta_{-1,2}(\tau, z)}, \tag{35}$$
where
$$\Theta_{\lambda,k}(\tau, z) := \sum_{n \in \mathbb{Z} + \frac{\lambda}{2k}} q^{kn^2} z^{2kn}.$$
(Note that our $z$ corresponds to $q^{z/2}$ in [27].) In the latter case the vacuum character is generic, and is given by
$$\chi^{\widehat{su}(2)}_k(q, z) = \frac{q^{-\frac{c}{24}}}{\prod_{n=1}^{\infty} (1 - q^n)(1 - q^n z^2)(1 - q^n z^{-2})}.$$
Let us first consider the generic (i.e. not admissible) case with $c < 3$. To evaluate the above residue (34) we want to use the following expression for the fermionic character:
$$\chi^{\mathcal{F}er^2}(q, z) = q^{-\frac{1}{24}} \prod_{n \geq 1} (1 + q^{n-\frac12} z^2)(1 + q^{n-\frac12} z^{-2}) = \frac{1}{\eta(q)} \sum_{m \in \mathbb{Z}} q^{\frac12 m^2} z^{2m}, \tag{36}$$
which follows from the product formula for the $\vartheta_3$ function (see e.g. [23, p. 164])
$$\vartheta_3(\tau/2, z) = \sum_{n \in \mathbb{Z}} q^{\frac12 n^2} z^{2n} = \prod_{n=1}^{\infty} (1 - q^n)(1 + z^2 q^{n-1/2})(1 + z^{-2} q^{n-1/2}). \tag{37}$$
Furthermore, we shall also use the following identity for the denominator of the $\widehat{su}(2)$ character (see e.g. [26, p. 262 Eq. (5.26)]), written in terms of $\hat\eta(q) = \prod_{n=1}^{\infty}(1 - q^n)$,
$$\frac{1}{\prod_{n=1}^{\infty} (1 - q^n z^2)(1 - q^n z^{-2})} = -\frac{z\,(z - z^{-1})}{\hat\eta(q)^2} \sum_{l \in \mathbb{Z}} \phi_l(q)\, z^{2l}, \tag{38}$$
where
$$\phi_l(q) = \sum_{r=0}^{\infty} (-1)^r q^{lr + \frac12 r(r+1)}.$$
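An important property of $\phi_l(q)$ is that $\phi_{-l}(q) = q^{l} \phi_l(q)$. The vacuum character of the N = 2 model is then (up to the $q^{-c(1,p')/24}$ term)
$$-\mathrm{Res}_z \left( \frac{1}{\hat\eta(q)^3} \sum_{m,l \in \mathbb{Z}} q^{\frac12 m^2} \phi_l(q)\, z^{2m} z^{2l} (z - z^{-1}) \right),$$
where $p' \in \mathbb{N}$ or $p' \notin \mathbb{Q}$. The evaluation of the residue gives
$$\begin{aligned}
\frac{1}{\hat\eta(q)^3} \left( \sum_{l \in \mathbb{Z}} q^{\frac12 l^2} \phi_l(q) - \sum_{l \in \mathbb{Z}} q^{\frac12 (l+1)^2} \phi_l(q) \right)
&= \frac{1}{\hat\eta(q)^3} \sum_{l \in \mathbb{Z}} q^{\frac12 l^2} \phi_l(q)\, (1 - q^{\frac12}) \\
&= \frac{1}{\hat\eta(q)^3}\, (1 - q^{\frac12}) \sum_{l \in \mathbb{Z}} \sum_{r=0}^{\infty} (-1)^r q^{\frac12 (l^2 + 2lr + r^2 + r)} \\
&= \frac{1}{\hat\eta(q)^3}\, (1 - q^{\frac12}) \sum_{l \in \mathbb{Z}} \sum_{r=0}^{\infty} (-1)^r q^{\frac12 r}\, q^{\frac12 \tilde{l}^2},
\end{aligned}$$
where $\tilde{l} = l + r$, and where the symmetry of $\phi_l$ has been used in the first step. We can now do the sums over $\tilde{l}$ and $r$, and obtain the N = 2 vacuum character
$$\chi_{1,p'}(q) = q^{-\frac{c(1,p')}{24}} \prod_{n=1}^{\infty} \frac{(1 + q^{n-1/2})^2}{(1 - q^n)^2} \left( 1 - \frac{2 q^{\frac12}}{1 + q^{\frac12}} \right),$$
where we have used that $\frac{1 - q^{1/2}}{1 + q^{1/2}} = 1 - \frac{2 q^{1/2}}{1 + q^{1/2}}$. This is indeed the generic N = 2 vacuum character, where the only null vectors are $G^{\pm}_{-\frac12}\Omega$ (cf. Eq. (32)).

Finally, consider the case where the $\widehat{su}(2)_k$ vacuum character is admissible, i.e. $k = p/p' - 2$ with $p, p' \in \mathbb{N}$, $(p, p') = 1$ and $p \geq 2$. In this case we find, using (35) and the well-known denominator formula,
$$\frac{\chi^{\widehat{su}(2)}_k(q, z)}{\chi^{\widehat{u}(1)}(q)} = \frac{q^{-\frac{c(p,p') - 1}{24}}}{\prod_{n=1}^{\infty} (1 - q^n z^2)(1 - q^n z^{-2})} \sum_{n \in \mathbb{Z}} q^{np'(1+pn)}\, \frac{z^{2pn+1} - z^{-2pn-1}}{z - z^{-1}}.$$
Using (34), the N = 2 vacuum character is then (up to the $q^{-c(p,p')/24}$ term which we suppress for the moment) the residue
$$-\mathrm{Res}_z \left( \frac{1}{\hat\eta(q)^3} \sum_{n,m,l \in \mathbb{Z}} q^{np'(1+pn)} q^{\frac12 m^2} \phi_l(q)\, z^{2l} z^{2m} \left( z^{2pn+1} - z^{-2pn-1} \right) \right) = (*).$$
The $z$-dependent part is $\mathrm{Res}_z \left( z^{2l+2m+2pn+1} - z^{2l+2m-2pn-1} \right)$, whose residue is easily obtained. By expressing $m$ in terms of $n$ and $l$, the sum then becomes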
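$$\begin{aligned}
(*) &= \frac{1}{\hat\eta(q)^3} \sum_{n,l \in \mathbb{Z}} \left[ q^{np'(1+pn)} q^{\frac12 (l-pn)^2} \phi_l(q) - q^{np'(1+pn)} q^{\frac12 (l+pn+1)^2} \phi_l(q) \right] \\
&= \frac{1}{\hat\eta(q)^3} \sum_{n,l \in \mathbb{Z}} \left[ q^{np'(1+pn)} q^{\frac12 (l^2 + p^2n^2 - 2lpn)} \phi_l(q) - q^{np'(1+pn)} q^{\frac12 (l^2 + p^2n^2 + 1 + 2l + 2pn + 2pnl)} \phi_l(q) \right] \\
&= \frac{1}{\hat\eta(q)^3} \sum_{n,l \in \mathbb{Z}} q^{np'(1+pn)} q^{\frac12 (l^2 + p^2n^2 - 2lpn)} \left( \phi_l(q) - q^{\frac12 (1 - 2l + 2pn)} \phi_{-l}(q) \right) \\
&= \frac{1}{\hat\eta(q)^3} \sum_{n,l \in \mathbb{Z}} q^{np'(1+pn)} q^{\frac12 (l^2 + p^2n^2 - 2lpn)} \left( 1 - q^{pn+\frac12} \right) \phi_l(q),
\end{aligned}$$
where we have replaced $l$ by $-l$ in the second sum of the penultimate line, and used the previously mentioned symmetry of $\phi_l$ in the last equation. Next we use the explicit expression for $\phi_l$ to obtain
$$(*) = \frac{1}{\hat\eta(q)^3} \sum_{n \in \mathbb{Z}} q^{np'(1+pn)} \left( 1 - q^{pn+\frac12} \right) \sum_{l \in \mathbb{Z}} \sum_{r=0}^{\infty} (-1)^r q^{\frac12 (r^2 + r + l^2 + p^2n^2 + 2lr - 2lpn)}.$$
The last exponent of $q$ can be rewritten as
$$\frac{r^2 + r + l^2 + p^2n^2 + 2lr - 2lpn}{2} = \frac{\tilde{l}^2 + r + 2rpn}{2},$$
where $\tilde{l} = l - pn + r$. We then replace the sum over $l$ by a sum over $\tilde{l}$, which gives $\sum_{\tilde{l} \in \mathbb{Z}} q^{\frac12 \tilde{l}^2} = \hat\eta(q) \prod_{m=1}^{\infty} (1 + q^{m-\frac12})^2$. The sum over $r$ is the geometric series $\sum_{r=0}^{\infty} (-1)^r q^{\frac12 r(1+2pn)} = 1/(1 + q^{pn+\frac12})$, and we thus arrive at (compare for example [38, 25, 1])
$$\chi_{p,p'}(q) = q^{-\frac{c(p,p')}{24}} \prod_{n=1}^{\infty} \frac{(1 + q^{n-\frac12})^2}{(1 - q^n)^2} \sum_{n \in \mathbb{Z}} q^{np'(1+pn)}\, \frac{1 - q^{pn+\frac12}}{1 + q^{pn+\frac12}}.$$
This expression equals the one derived from the embedding diagram (33).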
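Both coset computations in this appendix rest on the symmetry $\phi_{-l}(q) = q^{l}\phi_l(q)$. It can be checked term by term from the series definition; the following sketch assumes $\phi_l(q) = \sum_{r \geq 0} (-1)^r q^{lr + \frac12 r(r+1)}$ and stores a truncated series as a dictionary from exponents to coefficients, so that the negative exponents which appear (and cancel) for $l < 0$ are handled explicitly. The function name `phi` is illustrative.

```python
def phi(l, M):
    """Non-zero coefficients {exponent: coeff} of phi_l(q) with exponent <= M.

    Assumes phi_l(q) = sum_{r >= 0} (-1)^r q^(l*r + r*(r+1)/2).
    """
    coeffs = {}
    r = 0
    while True:
        e = l * r + r * (r + 1) // 2
        # The exponent increases with r once r > -l, so it is safe to stop
        # as soon as it exceeds M in that range.
        if e > M and r > -l:
            break
        if e <= M:
            coeffs[e] = coeffs.get(e, 0) + (-1) ** r
        r += 1
    return {e: c for e, c in coeffs.items() if c}

# phi_{-l}(q) = q^l * phi_l(q): compare all coefficients up to q^30.
for l in range(1, 7):
    shifted = {e + l: c for e, c in phi(l, 30 - l).items()}
    assert phi(-l, 30) == shifted
```

For $l < 0$ the first $2|l|$ terms of the sum cancel in pairs, which is why the negative powers of $q$ drop out of the dictionary.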
B. Modular Properties of the Vacuum Characters

As already mentioned in Sect. 2, it was shown in [42] that for bosonic rational conformal field theories the space of torus amplitudes which is invariant under the natural action of the modular group is finite dimensional. We expect therefore that the dimension of the space spanned by the functions $\chi|_A(\tau) = \chi(A\tau)$ ($A \in SL(2, \mathbb{Z})$), where $\chi$ is the vacuum character, is finite if and only if the N = 2 super Virasoro algebra is rational for the corresponding value of $c$. We shall show, using the following two lemmas, that the dimension of this space is infinite for $c \geq 3$ and for the non-unitary models corresponding to the embedding diagrams in Figs. 2.1 and 2.2, i.e. $c = c(1, p')$ with $p = 1$, $p' \notin \mathbb{Q}$ or $p = 1$, $p' \in \mathbb{N}$. On the other hand, it is finite for the unitary models with $c = c(p, 1)$ and $p' = 1$, $p \geq 3$. (For $p' = 1$, $p = 2$ the dimension is clearly 1 since $\chi_{2,1} = 1$.)

Lemma B.1. For $k = p - 2 \in \mathbb{N}$ the vacuum characters $\chi_k(\tau) := \chi_{p,1}(q)$ ($q = e^{2\pi i \tau}$) of the N = 2 super Virasoro algebra are modular functions on $\Gamma(24k(k+2))$. More explicitly, they are given by
$$\chi_k(\tau) = \frac{1}{\eta^3(\tau)} \sum_{m \bmod 2k} \Theta_{L, (\frac{1}{2(k+2)}, \frac{m}{2k})}(\tau)\; \theta_{m(k+2), k(k+2)}(\tau/2),$$
where the $\theta_{\lambda,k} = \sum_{n \in \mathbb{Z}} q^{\frac{(2kn + \lambda)^2}{4k}}$ are Riemann-Jacobi theta functions and the $\Theta_{L,\mu}$ are Hecke indefinite modular forms (of weight one) associated to the lattice $L = \mathbb{Z} \oplus \mathbb{Z}$ and the quadratic form $Q(\gamma) = 2(k+2)\gamma_1^2 - 2k\gamma_2^2$.
Proof. We first recall the definition of a Hecke indefinite modular form (see [22] or [26, pp. 254] for more details). Let $L \subset \mathbb{R}^2$ be a lattice of rank two and $Q: L \to 2\mathbb{Z}$ an indefinite quadratic form such that $Q(x) = 0$, $x \in L$ implies $x = 0$. Denote by $L^{\sharp}$ the lattice dual to $L$, $L^{\sharp} = \{x \in \mathbb{R}^2 \mid B(x, y) \in \mathbb{Z} \text{ for } y \in L\}$, where $B(\gamma, \gamma') = \frac12 \left( Q(\gamma + \gamma') - Q(\gamma) - Q(\gamma') \right)$ is the bilinear form associated to $Q$. Let $G_0$ be the subgroup of the identity component of the orthogonal group of $(B, \mathbb{R}^2)$ which preserves $L$ and fixes all elements of $L^{\sharp}/L$. Fix a factorisation $Q(\gamma) = l_1(\gamma) l_2(\gamma)$, where $l_1$ and $l_2$ are real linear, and set $\mathrm{sign}(\gamma) = \mathrm{sign}(l_1(\gamma))$. Then
$$\Theta_{L,\mu}(\tau) := \sum_{\substack{\gamma \in L + \mu \\ B(\gamma,\gamma) > 0 \\ \gamma \bmod G_0}} \mathrm{sign}(\gamma)\, q^{Q(\gamma)/2}$$
is called a Hecke indefinite modular form associated to $\mu$ and $L$. It is a modular form of weight one on $\Gamma(N)$, where $N \in \mathbb{N}$ satisfies $N Q(\gamma) \in 2\mathbb{Z}$ for all $\gamma \in L^{\sharp}$.

The case we are interested in has been studied in [26, pp. 256]. We have $L = \mathbb{Z} \oplus \mathbb{Z}$ and $Q(\gamma) = 2(k+2)\gamma_1^2 - 2k\gamma_2^2 = l_1(\gamma) l_2(\gamma)$, where $l_1(\gamma) = \sqrt{2(k+2)}\,\gamma_1 + \sqrt{2k}\,\gamma_2$ and $l_2(\gamma) = \sqrt{2(k+2)}\,\gamma_1 - \sqrt{2k}\,\gamma_2$, so that $Q(\gamma) = 0$ for $\gamma \in L$ implies $\gamma = 0$. Then $B$ is given by
$$B(\gamma, \gamma') = 2(k+2)\gamma_1\gamma_1' - 2k\gamma_2\gamma_2',$$
implying that $L^{\sharp}$ equals $\frac{1}{2(k+2)}\mathbb{Z} \oplus \frac{1}{2k}\mathbb{Z}$. We observe that $A$, given by
$$A(\gamma_1, \gamma_2) = \left( (k+1)\gamma_1 + k\gamma_2,\; (k+2)\gamma_1 + (k+1)\gamma_2 \right),$$
satisfies $Q(\gamma) = Q(A\gamma)$, and that the group generated by $A$ is the identity component of the orthogonal group of $(B, \mathbb{R}^2)$ which leaves $L$ invariant. Furthermore, $A^2$ generates $G_0$. Hence the functions $\Theta_{L, (\frac{1}{2(k+2)}, \frac{m}{2k})}$ ($m \in \mathbb{Z}$) are modular forms of weight one on $\Gamma(4k(k+2))$. Moreover, one can show that the Hecke indefinite modular forms $\Theta_{L, (\frac{1}{2(k+2)}, \frac{m}{2k})}$ are given by [26, p. 258]
~Lr l ~__~(T) = ~x 2 ( k + 2 } ~ 2 k ]
(
~
_
s> ,n>_O s
0,n>0
1 ~2 kr
m~2
s 0 and $k \in \mathbb{Z}$, the space spanned by the functions $f|_{k,A}$ ($A \in \Gamma(N)$) is infinite dimensional if $f$ is not constant. Here $f|_{k,A}$ is defined as $f|_{k,A}(\tau) = (c\tau + d)^{-k} f(A\tau)$, where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $A\tau = \frac{a\tau + b}{c\tau + d}$.
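Proof. Assume that the space spanned by the $f|_{k,A}$ ($A \in SL(2, \mathbb{Z})$) is $n$ dimensional, where $n < \infty$, and that $f$ is not constant. Let $\tilde{q}$ be $q^{1/s}$, where $s$ is the denominator of $\alpha$, i.e. $\tilde{q} = e^{2\pi i \tau / s}$, and let $\tilde{P}$, $\tilde{Q}$ be the polynomials given by
$$\begin{aligned}
\tilde{P}(\tilde{q}) &= q^{\alpha} P(q), & \tilde{Q}(\tilde{q}) &= Q(q) & &\text{for } \alpha \geq 0, \\
\tilde{P}(\tilde{q}) &= P(q), & \tilde{Q}(\tilde{q}) &= q^{-\alpha} Q(q) & &\text{for } \alpha < 0.
\end{aligned}$$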
Then there exist $n$ matrices $A_i$ ($i = 1, \ldots, n$) such that the functions $f|_{k,A_i}$ ($i = 1, \ldots, n$) are linearly dependent over $\mathbb{C}$. (Without loss of generality we can assume that the $A_i^{-1}A_j$ are not of the form $\begin{pmatrix} * & * \\ 0 & * \end{pmatrix}$, as we are interested in a basis over $\mathbb{C}$.) Hence the polynomials $\tilde{P}(\tilde{q}_j) \prod_{i \neq j} \tilde{Q}(\tilde{q}_i)$ ($j = 1, \ldots, n$) with $\tilde{q}_i = e^{2\pi i A_i \tau / s}$ are linearly dependent over $\mathbb{C}[\tau]$, and thus the $\tilde{q}_j$ are algebraically dependent over $\mathbb{C}[\tau]$. Applying $A_1^{-1}$ to $\tau$ we can assume that $A_1$ is the identity. Looking at the asymptotic behaviour of the $\tilde{q}_i$ for $\tau \to i\infty$ we observe that there cannot be a term containing $\tilde{q}_1$. By induction on $n$ we find that the $\tilde{q}_i$ are algebraically independent. This gives the desired contradiction. □

The proof of Lemma B.2 is due to J. Nekovar [37]. The last lemma proves that the space spanned by the functions $\chi_{1,p'}(A\tau)$ ($A \in \Gamma(48)$) is infinite dimensional, since the function $\frac{\eta(\tau)^4}{\eta((\tau+1)/2)^2}\, \chi_{1,p'}(\tau)$ satisfies the assumptions of the lemma and $\frac{\eta((\tau+1)/2)^2}{\eta(\tau)^4}$ is invariant under the $|_{-1,A}$ action for $A \in \Gamma(48)$. Therefore, the space spanned by the functions $\chi_{1,p'}(A\tau)$ ($A \in SL(2, \mathbb{Z})$) is infinite dimensional. We also expect that the dimension of the corresponding space is infinite for $c = c(p, p')$ with coprime integers $p', p \geq 2$. Finally, the embedding diagrams of the vacuum Verma modules for $c > 3$ (cf. the end of Sect. 2) imply that the corresponding vacuum characters are given by the product of the generic Verma module character and a rational function of $q^{\frac12}$. We can therefore again apply Lemma B.2 to conclude that the space obtained from the $SL(2, \mathbb{Z})$ action on such a vacuum character is infinite dimensional. This shows that all theories with $c \geq 3$ are not rational.

Acknowledgement. We would like to thank M. Dörrzapf, H. Kausch, P. Goddard, A. Kent, J. Nekovar and G. Watts for discussions, and A. Honecker and M. Rösgen for comments on a draft version of this paper. We are grateful to G. Watts for pointing out an error in an earlier version. We also thank the referee for pointing out the relevance of ref. [1]. W. E. is supported by the EPSRC, and M. R. G. is supported by a Research Fellowship of Jesus College, Cambridge. We also acknowledge partial support from PPARC and EPSRC, grant GR/J73322.
References

1. Ahn, C., Chung, S., Tye, S.-H.: New Parafermion, su(2) Coset and N = 2 Superconformal Field Theories. Nucl. Phys. B365, 191-240 (1991)
2. Anderson, G., Moore, G.: Rationality in Conformal Field Theory. Commun. Math. Phys. 117, 441-450 (1988)
3. Blumenhagen, R., Eholzer, W., Honecker, A., Hornfeck, K., Hübel, R.: Coset Realization of Unifying W-Algebras. Int. J. Mod. Phys. A10, 2367-2430 (1995)
4. Blumenhagen, R., Hübel, R.: A Note on Representations of N = 2 SW-Algebras. Mod. Phys. Lett. A9, 3193-3204 (1994)
5. Boucher, W., Friedan, D., Kent, A.: Determinant Formulae and Unitarity for the N = 2 Superconformal Algebras in Two Dimensions or Exact Results on String Compactification. Phys. Lett. B172, 316-322 (1986)
6. Bouwknegt, P., Schoutens, K.: W-Symmetry in Conformal Field Theory. Phys. Rep. 223, 183-276 (1993)
7. Di Vecchia, P., Petersen, J.L., Yu, M., Zheng, H.B.: Explicit Construction of the N = 2 Superconformal Algebra. Phys. Lett. B174, 280-284 (1986)
8. Dörrzapf, M.: Analytic Expressions for Singular Vectors of the N = 2 Superconformal Algebra. Commun. Math. Phys. 180, 195-232 (1996)
9. Dörrzapf, M.: Superconformal Field Theories and their Representations. Cambridge: PhD thesis (1995)
10. Dörrzapf, M.: Embedding Diagrams for Verma Modules of the N = 2 Superconformal Algebra. (in preparation)
11. Dobrev, V.K.: Characters of Unitarizable Highest Weight Modules over the N = 2 Superconformal Algebras. Phys. Lett. B186, 43-51 (1987)
12. Dong, C., Li, H., Mason, G.: Twisted Representations of Vertex Operator Algebras. Preprint, q-alg/9509025
13. Dong, C., Li, H., Mason, G.: Vertex Operator Algebras Associated to Admissible Representations of $\widehat{sl}_2$. Commun. Math. Phys. 184, 65-93 (1997)
14. Feigin, B.L., Fuchs, D.B.: Cohomology of some nilpotent subalgebras of the Virasoro and Kac-Moody Lie algebras. J. Geom. Phys. 5, 209-235 (1988)
15. Gaberdiel, M.R.: Fusion in Conformal Field Theory as the Tensor Product of the Symmetry Algebra. Int. J. Mod. Phys. A9, 4619-4636 (1993)
16. Gaberdiel, M.R.: Fusion Rules of Chiral Algebras. Nucl. Phys. B417, 130-150 (1994)
17. Gaberdiel, M.R., Kausch, H.G.: A Rational Logarithmic Conformal Field Theory. Phys. Lett. B386, 131-137 (1996)
18. Gato-Rivera, B., Rosado, J.I.: New Interpretation for the Determinant Formulae of the N = 2 Superconformal Algebras. Preprint IMAFF-96-38, hep-th/9602166
19. Goddard, P.: Meromorphic Conformal Field Theory. In: V.G. Kac (ed.), "Infinite Dimensional Lie Algebras and Lie Groups", Proceedings of the CIRM-Luminy Conference 1988, Singapore: World Scientific (1989)
20. Goddard, P., Kent, A., Olive, D.: Unitary Representations of the Virasoro and Super-Virasoro Algebras. Commun. Math. Phys. 103, 105-119 (1986)
21. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics. New York: Addison-Wesley (1992)
22. Hecke, E.: Über einen Zusammenhang zwischen elliptischen Modulfunktionen und indefiniten quadratischen Formen. Nachrichten der K. Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse 1925, pp. 35-44
23. Hille, E.: Analytic Function Theory II. London: Blaisdell Publishing Company (1962)
24. Höhn, G.: Selbstduale Vertexoperator-Superalgebren und das Babymonster. Bonn: PhD thesis (1995)
25. Huitu, K., Nemeschansky, D., Yankielowicz, S.: N = 2 Supersymmetry, Coset Models and Characters. Phys. Lett. B246, 105-113 (1990)
26. Kac, V.G., Peterson, D.H.: Infinite-Dimensional Lie Algebras, Theta Functions and Modular Forms. Adv. in Math. 53, 125-264 (1984)
27. Kac, V.G., Wakimoto, M.: Modular Invariant Representations of Infinite-Dimensional Lie Algebras and Superalgebras. Proc. Natl. Acad. Sci. USA 85, 4956-4960 (1988)
28. Kac, V.G., Wakimoto, M.: Classification of Modular Invariant Representations of Affine Algebras. Adv. Series Math. Phys. 7, 138-177 (1989)
29. Kac, V.G., Wang, W.: Vertex Operator Superalgebras and Their Representations. Contemp. Math. 175, 161-191 (1994), (hep-th/9312065)
30. Kazama, Y., Suzuki, H.: New N = 2 Superconformal Field Theories and Superstring Compactification. Nucl. Phys. B321, 232-268 (1989)
31. Kent, A.: Infinite Dimensional Algebras and the Conformal Bootstrap. Cambridge: PhD thesis (1986)
32. Kiritsis, E.B.: Character Formulae and Structure of the Representations of the N = 1, N = 2 Superconformal Algebras. Int. J. Mod. Phys. A3, 1871-1906 (1988)
33. Malikov, F.G.: Verma Modules over Kac-Moody Algebras of Rank 2. Leningrad Math. J. 2, 269-286 (1991)
34. Matsuo, Y.: Character Formula of c < 1 Unitary Representation of N = 2 Superconformal Algebra. Prog. Theor. Phys. 77, 793-797 (1987)
35. Mathieu, P., Walton, M.A.: Fractional Level Kac-Moody Algebras and Non-Unitary Coset Conformal Theories. Prog. Theor. Phys. 102, 229-254 (1990)
36. Nahm, W.: Quasi-Rational Fusion Products. Int. J. Mod. Phys. B8, 3693-3702 (1994)
37. Nekovar, J.: Private communication
38. Ravanini, F., Yang, S.-K.: Modular Invariance in N = 2 Superconformal Field Theories. Phys. Lett. B195, 202-208 (1987)
39. Schwimmer, A., Seiberg, N.: Comments on the N = 2, 3, 4 Superconformal Algebras in Two Dimensions. Phys. Lett. B184, 191-196 (1987)
40. Vafa, C.: Toward Classification of Conformal Theories. Phys. Lett. B206, 421-426 (1988)
41. Watts, G.M.T.: Fusion in the W3 Algebra. Commun. Math. Phys. 171, 87-98 (1995)
42. Zhu, Y.: Vertex Operator Algebras, Elliptic Functions, and Modular Forms. PhD thesis, Yale University (1990); Modular Invariance of Characters of Vertex Operator Algebras. Journal AMS 9, 237-302 (1996)

Communicated by R.H. Dijkgraaf
Commun. Math. Phys. 186, 87-94 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Representations of Lie Superalgebras and Generalized Boson-Fermion Equivalence in Quantum Stochastic Calculus*
T.M.W. Eyre**, R.L. Hudson
Mathematics Department, University of Nottingham, University Park, Nottingham NG7 2RD, Great Britain. E-mails: [email protected]; [email protected]
Received: 6 August 1996/Accepted: 15 September 1996
Abstract: The boson-fermion equivalence scheme of [5] can be generalized to N dimensions with r boson and N − r fermion creation and annihilation fields. The same stochastic integral prescription replaces the $N^2$ generalized number processes $\Lambda^i_j$, which form representations of the Lie algebra $gl(N)$, by processes forming representations of the Lie superalgebra $gl(N, r)$.
1. Introduction
The integrator processes of N-dimensional quantum stochastic calculus are conveniently denoted [2] as $\Lambda^\alpha_\beta$, $\alpha, \beta = 0, 1, \ldots, N$. The processes $\Lambda^\alpha_\beta$ consist of operators $\Lambda^\alpha_\beta(t)$, $t \in \mathbb{R}_+$, acting in the (boson) Fock space $\Gamma(L^2(\mathbb{R}_+; \mathbb{C}^N))$ whose matrix elements between exponential vectors are given by
$$\langle e(f), \Lambda^\alpha_\beta(t)\, e(g) \rangle = \int \cdots$$
[…]
Optimal Heat Kernel Estimates for Schrödinger Operators
Remark 2. Instead of using the results of [L90] one might approach Theorem 1.1 à la Gross [G76], i.e., try to reduce the problem to an integral over infinitesimal time steps and estimate the semigroup over these time steps by employing logarithmic Sobolev inequalities. It is clear that one would get some smoothing estimates, but can one obtain them in the sharp form? That there are some obstructions to reaching this goal by this method can be seen as follows. Equations (1.5) and (1.7) determine the Gaussian function $u_0$ that yields the norm of the magnetic heat kernel as an operator from $L^p$ to $L^q$. With this Gaussian we may write
$$C_0(t; p, q) = \frac{\|u_0(t)\|_q}{\|u_0\|_p} = \frac{\|u_0(s)\|_r}{\|u_0\|_p}\, \frac{\|u_0(t)\|_q}{\|u_0(s)\|_r},$$
for some r and s with $p < r \le q$, $0 < s < t$. Hence it is obvious that $C_0(t; p, q)$ […] $B_0 > 0$ is continuous. Then the magnetic heat kernel satisfies the bound
$$|e^{-tH}(x, y)| \le \frac{B_0}{4\pi \sinh(\cdots)}\, e^{\cdots}$$
[…] $\epsilon \ge 0$ be arbitrary. We approximate $G_t$ by the operator $G_t^\epsilon$ with kernel
$$G_t^\epsilon(x, y) = e^{-\epsilon x^2}\, G(t, x, y),$$
which for $\epsilon > 0$ is a non-degenerate, centered Gaussian kernel. According to [L90], Theorem 3.4, there is a unique (up to a multiplicative constant) centered Gaussian function $u^\epsilon$ which yields the maximum of $\|G_t^\epsilon u\|_q / \|u\|_p$ over all $u \in L^p(\mathbb{R}^2)$. The function $u^\epsilon$ is of the form $u^\epsilon(x) = \exp(-x \cdot J_\epsilon x)$, where $J_\epsilon$ is a (possibly complex-valued) matrix which is symmetric with respect to the scalar product in $\mathbb{R}^2$ and has a strictly positive real part. But since the integral operator $G_t^\epsilon$ commutes with rotations, the unique maximizer $u^\epsilon$ must also be rotationally invariant. Hence $J_\epsilon = \alpha_\epsilon \mathbf{1} + i\beta_\epsilon \mathbf{1}$, where $\alpha_\epsilon > 0$ and $\beta_\epsilon$ is real. Since the integrals of Gaussian functions can be evaluated explicitly, we can evaluate the quotient $\|G_t^\epsilon u\|_q / \|u\|_p$ for $u(x) = \exp[-(\alpha + i\beta)x^2]$ and maximize this expression over all $\alpha$ and $\beta$. The maximum is obtained for $(\alpha, \beta) = (\alpha_\epsilon, \beta_\epsilon)$, with $\alpha_\epsilon > 0$ and $\beta_\epsilon = 0$. Since $\exp(-\epsilon x^2) \le 1$ we find for any Gaussian function u
$$\|G_t^\epsilon u\|_q \le \|G_t u\|_q,$$
and hence
$$C_\epsilon = \frac{\|G_t^\epsilon u^\epsilon\|_q}{\|u^\epsilon\|_p}$$
[…] 0, we assume during the following calculation, without loss of generality, that at the time $\tau = s > 0$ the solution is normalized such that $\|u(s)\|_{r(s)} = 1$. For the derivative at $\tau = s$ we obtain therefore
$$\frac{d}{ds} \ln \|u(s)\|_{r(s)} = \frac{\dot{r}(s)}{r(s)^2} \int |u(s, x)|^{r(s)} \ln |u(s, x)|^{r(s)}\, d^2x + \frac{1}{2} \int |u(s, x)|^{(r(s)-2)}\, \frac{d}{ds} |u(s, x)|^2\, d^2x. \qquad (3.4)$$
The integrals have to be taken over $\mathbb{R}^2$. The formal computation can be easily justified by an approximation argument. For simplicity, the arguments s and x in the integrand on the right side will be omitted from now on.
Using (3.1) we obtain, after a partial integration,
$$\frac{1}{2} \int |u|^{(r-2)}\, \frac{d}{ds} |u|^2 = -\frac{1}{2}(r - 2) \int |u|^{(r-2)} (\nabla |u|)^2 - \frac{1}{2} \int |u|^{(r-2)}\, |(\nabla + iA) u|^2. \qquad (3.5)$$
The integration by parts can be justified as follows. Since $u \in D(H)$, by the Leinfelder-Simader Theorem [LS81] we can pick a sequence $u_n \in C_0^\infty(\mathbb{R}^2)$ such that $u_n \to u$, $H u_n \to H u$ in $L^2$. Inspecting the proof of the Leinfelder-Simader Theorem one sees that the sequence $u_n$ can be chosen to converge to u in $L^1$ and to have a uniform bound on the $L^\infty$ norm. Thus $u_n$ converges to u in $L^p$ for all $1 < p < \infty$. In particular, all the following computations can be justified in the same fashion and we can assume without restriction that $u \in C_0^\infty(\mathbb{R}^2)$. If we set $u = f + ig$ then $|u| = \sqrt{f^2 + g^2}$. We find
$$|(\nabla + iA) u|^2 = (\nabla |u|)^2 + |A + \nabla S|^2\, |u|^2 = X^2 + Y^2, \qquad (3.6)$$
where we have introduced two real vector fields X and Y over $\mathbb{R}^2$,
$$X = \nabla^\perp |u| = \left( \frac{\partial}{\partial x_2}, -\frac{\partial}{\partial x_1} \right) |u|, \qquad Y = (A + \nabla S)\, |u|.$$
Here the symbol $\nabla S$ denotes the expression
$$\frac{f \nabla g - g \nabla f}{f^2 + g^2},$$
which is defined wherever $f^2 + g^2 > 0$. For any $c > 0$ we may estimate
$$c^2 X^2 + Y^2 \ge 2c\, X \cdot Y \qquad (3.7)$$
with equality if and only if $cX = Y$. If $B = B_0$ is a constant magnetic field we can choose a Gaussian function
$$u = N \exp\left\{ -\frac{B_0}{4c}\, x^2 \right\} \qquad (3.8)$$
such that equality holds everywhere in (3.7), because (with A as in Eq. (1.2))
$$cX = c\, \nabla^\perp u = \frac{B_0}{2}(-x_2, x_1)\, u = A u = Y.$$
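The equality case can be checked numerically at a sample point. The following is only a minimal sketch under stated assumptions: the rotated gradient is taken to be $\nabla^\perp = (\partial/\partial x_2, -\partial/\partial x_1)$, the gauge is $A = \frac{B_0}{2}(-x_2, x_1)$, and the function name, the parameter values and the finite-difference step are illustrative choices that do not come from the text.

```python
import math


def gaussian_equality_check(b0=1.3, c=0.7, norm=1.0, point=(0.4, -0.9), h=1e-6):
    """Return (cX, Y, gap) at `point` for u = norm * exp(-b0 * |x|^2 / (4c)).

    gap is c^2 X^2 + Y^2 - 2c X.Y, which should vanish in the equality case.
    """
    def u(x1, x2):
        return norm * math.exp(-b0 * (x1 * x1 + x2 * x2) / (4.0 * c))

    x1, x2 = point
    # Central finite differences for the partial derivatives of u.
    du_dx1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2.0 * h)
    du_dx2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2.0 * h)
    # Assumed convention for the rotated gradient: (d/dx2, -d/dx1).
    X = (du_dx2, -du_dx1)
    cX = (c * X[0], c * X[1])
    # u is real and positive, so grad S = 0 and Y = A u with the assumed gauge.
    value = u(x1, x2)
    Y = (0.5 * b0 * (-x2) * value, 0.5 * b0 * x1 * value)
    gap = (cX[0] ** 2 + cX[1] ** 2) + (Y[0] ** 2 + Y[1] ** 2) \
        - 2.0 * (cX[0] * Y[0] + cX[1] * Y[1])
    return cX, Y, gap
```

With the opposite sign convention for $\nabla^\perp$ the check would instead give $cX = -Y$, so the convention above should be read as an assumption fixed by the requirement $cX = Au$.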
Let us now insert (3.6) and (3.7) into (3.5). Since $r - 1 > 0$ we can add and subtract a positive constant $c^2$ with $0 < c < \sqrt{r - 1}$ in order to obtain
$$\frac{1}{2} \int |u|^{(r-2)}\, \frac{d}{ds} |u|^2 = -\frac{1}{2}(r - 1 - c^2) \int |u|^{(r-2)} (\nabla |u|)^2 - \frac{1}{2} \int |u|^{(r-2)} c^2 (\nabla |u|)^2 - \frac{1}{2} \int |u|^{(r-2)} (A + \nabla S)^2 |u|^2 \le -\frac{1}{2}(r - 1 - c^2) \int |u| \cdots$$
[…] 2 vanish and thus $E_3 = E_4 = \cdots = E_\infty$. Therefore, we will never have to calculate higher than the third terms in the spectral sequence. We will give more details on the application of the method of spectral sequences to our case in Appendix A.
3. Chiral Extended Complex
Before attempting a calculation of the cohomology of the full extended complex let us consider its chiral version. This is a warm-up problem which, nevertheless, captures the major features. We replace the Fock space $\mathcal{F}_p(\alpha, \bar{\alpha})$ by its chiral version, $\mathcal{F}_p(\alpha)$, which is generated from the vacuum by the left-moving modes $\alpha_n$ only. Repeating the arguments of the previous section we conclude that the chiral version of $\widehat{Q}$ is given by
$$\widehat{Q} = 1 \otimes Q - i\, \partial_\mu \otimes \sum_n c_n \alpha^\mu_{-n} - \frac{1}{2}\, \Box \otimes c_0. \qquad (3.1)$$
We will calculate the cohomology of the chiral extended complex $\widehat{V}_p$ for three different cases: the case $p^2 \neq 0$, which describes the massive spectrum; the case $p^2 = 0$, the massless one; and the case $p \equiv 0$, which besides the particular states from the massless spectrum describes a number of discrete states.
3.1. Massive states. Let us start the calculation of the cohomology of the chiral extended complex $\widehat{V}_p$ by considering the case of $p^2 \neq 0$. For this case the cohomology of the BRST complex is non-zero only for ghost number one and ghost number two. The cohomology contains the same number of ghost number one and two states, which can be written in terms of dimension one primary matter states. Let $|v, p\rangle \in V_p$ be a dimension one primary state with no ghost excitations; then the following states,
$$c_0 c_1 |v, p\rangle \quad \text{and} \quad c_1 |v, p\rangle, \qquad (3.1.1)$$
represent nontrivial cohomology classes and, moreover, each cohomology class has a representative of this kind (see ref. [10]). We will calculate the cohomology of the extended complex in two steps. First, we extend the BRST complex by adding polynomials of one variable $\tilde{x} = (p \cdot x)$. The resulting space,
$$\widetilde{V}_p = \mathbb{C}[\tilde{x}] \otimes V_p, \qquad (3.1.2)$$
is a subcomplex of $\widehat{V}_p$ and we define its cohomology as $H(\widetilde{Q}, \widetilde{V}_p)$, where $\widetilde{Q}$ is the restriction of $\widehat{Q}$ to $\widetilde{V}_p$. Calculation of $\mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p)$ is the objective of the first step. Second, we obtain the full extended space as a tensor product of $\widetilde{V}_p$ with the polynomials of the transverse variables,
$$\widehat{V}_p = \mathbb{C}[\tilde{x}^1, \ldots, \tilde{x}^{D-1}] \otimes \widetilde{V}_p, \qquad (3.1.3)$$
where
$$\tilde{x}^i = x^i - \frac{p^i\, (p \cdot x)}{p^2}. \qquad (3.1.4)$$
A. Astashkevich, A. Belopolsky
Fig. 1. Anatomy of a double complex (horizontal axis r, vertical axis s; the diagonals are labelled by the ghost number, Gh# = 0, ..., 4)
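Using $\mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p)$ found in the first step, we will calculate $\mathrm{Gr}H(\widehat{Q}, \widehat{V}_p)$. Let us calculate $\mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p)$. Besides the ghost number, the complex $\widetilde{V}_p$ has an additional grading, the x-degree. According to these two gradings we can write $\widetilde{V}_p$ as a double sum
$$\widetilde{V}_p = \bigoplus_{r,s} \widetilde{E}_0^{r,s}(p), \qquad (3.1.5)$$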
where $\widetilde{E}_0^{r,s} = \tilde{x}^{-r} \otimes V_p^{(r+s)}$ is the space of ghost number r + s states with −r factors of $\tilde{x}$. Note that in our notation $r \le 0$. It will be convenient to represent a double graded complex like $\widetilde{V}_p$ graphically by a lattice (see Fig. 1) where each cell represents a space $\widetilde{E}_0^{r,s}$, columns represent the spaces with definite x-degrees and the diagonals represent the spaces with definite ghost numbers. The action of $\widetilde{Q}$ on $\widetilde{V}_p$ can be easily derived from the general formula (3.1). Any vector from $\widetilde{E}_0^{-k,s}$ can be represented as $\tilde{x}^k \otimes |v, p\rangle$, where $|v, p\rangle \in V_p^{(s-k)}$ is a vector from the BRST complex $V_p$ with ghost number s − k. Applying $\widetilde{Q}$ to this state we obtain
$$\widetilde{Q}\, \tilde{x}^k \otimes |v, p\rangle = \tilde{x}^k \otimes Q |v, p\rangle - ik\, \tilde{x}^{k-1} \otimes \sum_n c_n (p \cdot \alpha_{-n}) |v, p\rangle - \frac{k(k-1)}{2}\, p^2\, \tilde{x}^{k-2} \otimes c_0 |v, p\rangle. \qquad (3.1.6)$$
According to Eq. (3.1.6) we decompose $\widetilde{Q}$ into a sum of operators with a definite x-degree
String Center of Mass Operator and BRST Cohomology
Fig. 2. Spectral sequence for $\widetilde{V}_p$ (left panel: $\widetilde{E}_1$; right panel: $\widetilde{E}_2 = \widetilde{E}_\infty$; axes r and s)
$$\widetilde{Q} = \partial_0 + \partial_1 + \partial_2, \qquad (3.1.7)$$
where each $\partial_n$ reduces the x-degree, or increases r, by n (see Fig. 1). Now we start building the spectral sequence of the complex $(\widetilde{V}_p, \widetilde{Q})$. For a short review of the method see Appendix A. The first step, the calculation of $\widetilde{E}_1^{r,s} = \mathrm{Gr}_r H^{r+s}(\partial_0, \widetilde{V}_p)$, reduces to the calculation of the cohomology of the BRST complex. Indeed, according to Eq. (3.1.6), $\partial_0 = 1 \otimes Q$, and therefore
$$\widetilde{E}_1^{r,s} = \tilde{x}^{-r} \otimes H^{(r+s)}(Q, V_p). \qquad (3.1.8)$$
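The x-degree bookkeeping behind Eq. (3.1.6) rests on two elementary identities for powers of $\tilde{x} = p \cdot x$: $\partial_\mu \tilde{x}^k = k\, p_\mu\, \tilde{x}^{k-1}$ and $\Box\, \tilde{x}^k = k(k-1)\, p^2\, \tilde{x}^{k-2}$. The sketch below verifies them on explicit polynomials. It is an illustration only: polynomials are stored as dictionaries from exponent tuples to coefficients, the metric signature `ETA` is an assumption (three dimensions, signature (−, +, +)), and all function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

ETA = (-1, 1, 1)  # assumed diagonal metric, D = 3, for illustration only


def linear_form(p_lower):
    """Polynomial p_mu x^mu built from lower-index components p_mu."""
    dim = len(p_lower)
    return {tuple(1 if nu == mu else 0 for nu in range(dim)): Fraction(c)
            for mu, c in enumerate(p_lower) if c != 0}


def multiply(a, b):
    out = defaultdict(Fraction)
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(i + j for i, j in zip(ea, eb))] += ca * cb
    return {e: c for e, c in out.items() if c != 0}


def power(poly, k, dim):
    result = {tuple([0] * dim): Fraction(1)}
    for _ in range(k):
        result = multiply(result, poly)
    return result


def scale(poly, factor):
    return {e: c * factor for e, c in poly.items() if c * factor != 0}


def derivative(poly, mu):
    """Partial derivative with respect to x^mu."""
    out = defaultdict(Fraction)
    for exps, c in poly.items():
        if exps[mu] > 0:
            lowered = exps[:mu] + (exps[mu] - 1,) + exps[mu + 1:]
            out[lowered] += c * exps[mu]
    return {e: c for e, c in out.items() if c != 0}


def box(poly):
    """d'Alembertian eta^{mu nu} d_mu d_nu for the diagonal metric ETA."""
    out = defaultdict(Fraction)
    for mu, sign in enumerate(ETA):
        for e, c in derivative(derivative(poly, mu), mu).items():
            out[e] += sign * c
    return {e: c for e, c in out.items() if c != 0}


def p_squared(p_lower):
    return sum(sign * c * c for sign, c in zip(ETA, p_lower))
```

For a light-like p the second identity gives $\Box\, \tilde{x}^k = 0$, so the $\partial_2$ term drops out on powers of $p \cdot x$.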
As we mentioned above, the BRST complex has nontrivial cohomology only at ghost numbers one and two. Thus the space $\widetilde{E}_1 = \bigoplus \widetilde{E}_1^{r,s}$ looks as shown in Fig. 2 (left), where shaded cells correspond to non-zero spaces. The differential $d_1$ is induced on $\widetilde{E}_1$ by $\partial_1$, and acts from $\widetilde{E}_1^{r,s}$ to $\widetilde{E}_1^{r+1,s}$ as shown in Fig. 2 (left). Since there are no states below ghost number one and above ghost number two, the cohomology of $d_1$ at ghost number one is given by its kernel:
$$\widetilde{E}_2^{r,1-r} = \ker d_1, \qquad (3.1.9)$$
and at ghost number two by the quotient of $\widetilde{E}_1^{r,2-r}$ by the image of $d_1$:
$$\widetilde{E}_2^{r,2-r} = \widetilde{E}_1^{r,2-r} / \operatorname{Im} d_1. \qquad (3.1.10)$$
We are going to show that $d_1$ establishes an isomorphism of the corresponding spaces and, therefore, the only non-empty component of $\widetilde{E}_2$ is $\widetilde{E}_2^{0,1} \simeq H^1(Q, V_p)$ as shown in Fig. 2 (right). Consider an operator $B_0 = \tilde{x} \otimes b_0$. This operator is well defined on $\widetilde{E}_1$, i.e., it maps cohomology classes to cohomology classes. On the other hand, its anticommutator with $d_1$ is given by
$$\{d_1, B_0\} = p^2\, \hat{k}, \qquad (3.1.11)$$
where $\hat{k}$ is the x-degree operator. The last equation shows that if $p^2 \neq 0$, nontrivial cohomology of $d_1$ may exist only in the k = 0 subspace of $\widetilde{E}_1$. Moreover, if we apply $\{d_1, B_0\} = d_1 B_0 + B_0 d_1$ to ghost number one states only the second term will survive because there are no ghost number zero states in $\widetilde{E}_1$. Thus we conclude that up to a diagonal matrix $B_0$ is an inverse operator to $d_1$. Since $\widetilde{E}_1^{r,1-r}$ and $\widetilde{E}_1^{r+1,1-r}$ have the same dimension and $d_1$ is invertible, it is an isomorphism between $\widetilde{E}_1^{r,1-r}$ and $\widetilde{E}_1^{r+1,1-r}$ for any $r < 0$. As shown in Fig. 2 (right), $\widetilde{E}_2$ contains only one non-empty component. This means that the second differential $d_2$ and all higher ones are necessarily zero and the spectral sequence collapses at $\widetilde{E}_2 = \widetilde{E}_\infty$. Therefore, we conclude that
$$\mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p) = H^1(Q, V_p). \qquad (3.1.12)$$
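The second step in our program is trivial because the spectral sequence $\{\widehat{E}_n\}$ of the full complex
$$\widehat{V}_p = \mathbb{C}[\tilde{x}^1, \ldots, \tilde{x}^{D-1}] \otimes \widetilde{V}_p \qquad (3.1.13)$$
stabilizes at $\widehat{E}_1$ and
$$\widehat{E}_1 = \widehat{E}_\infty = \mathbb{C}[\tilde{x}^1, \ldots, \tilde{x}^{D-1}] \otimes \mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p). \qquad (3.1.14)$$
This happens simply because, according to Eq. (3.1.12), $\mathrm{Gr}H(\widetilde{Q}, \widetilde{V}_p)$, and thus $\widehat{E}_1$, contains only ghost number one states and therefore $d_1$ and all higher differentials must vanish. Combining Eqs. (3.1.14) and (3.1.12) we obtain
$$\mathrm{Gr}H(\widehat{Q}, \widehat{V}_p) = \mathbb{C}[\tilde{x}^1, \ldots, \tilde{x}^{D-1}] \otimes H^1(Q, V_p). \qquad (3.1.15)$$
This completes our analysis of the cohomology of the chiral extended complex for $p^2 \neq 0$.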
3.2. Massless states. The analysis presented above cannot be applied on the light-cone, $p^2 = 0$. We could, in principle, repeat all the arguments using $\xi \cdot x$ instead of $\tilde{x}$, where $\xi$ is some vector for which $\xi \cdot p \neq 0$, to build $\widetilde{V}$, and this would work everywhere except at the origin of the momentum space, $p = 0$. Yet it is instructive to make a covariant calculation in this case. Since there is no covariant way to choose a vector $\xi$ we cannot apply our two step program. Instead we will start from scratch and build a spectral sequence for the whole module
$$\widehat{V}_p = \mathbb{C}[x^0, \ldots, x^{D-1}] \otimes V_p, \qquad (3.2.1)$$
graded by the total x-degree. According to Eq. (3.1), we can decompose $\widehat{Q}$ into a sum of operators of definite x-degree
$$\widehat{Q} = \partial_0 + \partial_1 + \partial_2, \qquad (3.2.2)$$
where
$$\partial_0 = 1 \otimes Q, \qquad \partial_1 = -i\, \partial_\mu \otimes \sum_n c_n \alpha^\mu_{-n}, \qquad \partial_2 = -\frac{1}{2}\, \Box \otimes c_0. \qquad (3.2.3)$$
Table 1. Chiral BRST cohomology at p = 0

Ghost #    Representatives                            Dimension
3          $c_1 c_0 c_{-1} |0\rangle$                 1
2          $c_0 c_1 \alpha^\mu_{-1} |0\rangle$        D
1          $c_1 \alpha^\mu_{-1} |0\rangle$            D
0          $|0\rangle$                                1

The first step is to find the cohomology of $\partial_0$, which is just the tensor product of the BRST cohomology $H(Q, V_p)$ with the space of polynomials
$$E_1 = H(\partial_0, \widehat{V}_p) = \mathbb{C}[x^0, \ldots, x^{D-1}] \otimes H(Q, V_p). \qquad (3.2.4)$$
Multiplying the representatives of $H(Q, V_p)$ by arbitrary polynomials in x we obtain the following representatives of $E_1$ cohomology classes
$$P_\mu(x) \otimes c_1 \alpha^\mu_{-1} |p\rangle, \qquad Q_\mu(x) \otimes c_0 c_1 \alpha^\mu_{-1} |p\rangle, \qquad (3.2.5)$$
where $P_\mu(x)$ and $Q_\mu(x)$ are polynomials in x that satisfy the transversality condition, $p^\mu Q_\mu(x) = p^\mu P_\mu(x) = 0$, and are not proportional to $p_\mu$. These transversality conditions come from the same conditions on BRST cohomology classes at $p^2 = 0$. The first differential acts non-trivially from ghost number one to ghost number two states according to the following formula
$$d_1: \quad P_\mu \to Q_\mu = -i\, p^\nu \partial_\nu P_\mu. \qquad (3.2.6)$$
It is easy to check that the map (3.2.6) is surjective and therefore $E_2^{r,s} = 0$ for r + s = 2. As expected the cohomology of the massless complex has a similar structure to that of the massive one. There are no cohomology states with ghost number two and there is an infinite tower of ghost number one states with different x-degree.
3.3. Cohomology of the zero momentum chiral complex. The zero momentum complex is exceptional. Already in the BRST cohomology we encounter additional "discrete" states at exotic ghost numbers (see refs. [10, 11]). The cohomology is one dimensional at ghost numbers zero and three and D-dimensional at ghost numbers one and two. Explicit representatives for these classes can be written as given in Table 1. Let us denote the direct sum of spaces $E_n^{r,s}$ with the same ghost number m = r + s by $E_n^{(m)}$:
$$E_n^{(m)} = \bigoplus_r E_n^{r,m-r}. \qquad (3.3.1)$$
[…]
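Table 4. Decomposition of the complex (5.2.2) into irreducible representations for $n \ge 2$

$V^{(0)}_{n+2}$   $V^{(1)}_{n+1}$            $V^{(2)}_{n}$                                   $V^{(3)}_{n-2}$                                 $V^{(4)}_{n-3}$            $V^{(5)}_{n-4}$
$H_{n+2}$         $2H_{n+2}$                 $H_{n+2}$
$H_n$             $4H_n + 2V_n$              $5H_n + S_n + A_n + 2V_n$                       $H_n$
$H_{n-2}$         $4H_{n-2} + 2V_{n-2}$      $6H_{n-2} + S_{n-2} + A_{n-2} + 4V_{n-2}$       $5H_{n-2} + S_{n-2} + A_{n-2} + 2V_{n-2}$       $2H_{n-2}$
$H_{n-4}$         $4H_{n-4} + 2V_{n-4}$      $6H_{n-4} + S_{n-4} + A_{n-4} + 4V_{n-4}$       $6H_{n-4} + S_{n-4} + A_{n-4} + 4V_{n-4}$       $2H_{n-4} + 2V_{n-4}$      $H_{n-4}$
$H_{n-6}$         $4H_{n-6} + 2V_{n-6}$      $6H_{n-6} + S_{n-6} + A_{n-6} + 4V_{n-6}$       $6H_{n-6} + S_{n-6} + A_{n-6} + 4V_{n-6}$       $2H_{n-6} + 2V_{n-6}$      $H_{n-6}$

$$0 \to V^{(0)}_{n+2} \xrightarrow{d_1} V^{(1)}_{n+1} \xrightarrow{d_1} V^{(2)}_{n} \xrightarrow{d_2} V^{(3)}_{n-2} \xrightarrow{d_1} V^{(4)}_{n-3} \xrightarrow{d_1} V^{(5)}_{n-4} \to 0, \qquad (5.2.2)$$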
which calculate $\mathrm{Gr}H^s(\widehat{V}_0, \widehat{Q})$. By definition $V_n^{(0)}$ and $V_n^{(5)}$ are the spaces of homogeneous polynomials of degree n. These spaces are reducible under the Lorentz group because the subspaces of the polynomials of the form $(x^\mu x_\mu)^k h_{n-2k}$ are invariant under SO(D − 1, 1). Furthermore, if the $h_{n-2k}$ are harmonic, $\Box h_{n-2k} = 0$, these subspaces form irreducible representations of SO(D − 1, 1). We will denote these irreducible representations by $H_n$. These representations can be alternatively described by Young tableaux as
$H_n$ = [Young tableau: a single row of n boxes], (5.2.3)
Now we can write the decomposition of $V_n^{(0)}$ or $V_n^{(5)}$ into irreducible representations as
$$V_n^{(0)} = V_n^{(5)} = H_n + H_{n-2} + H_{n-4} + \cdots. \qquad (5.2.4)$$
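The decomposition (5.2.4) can be cross-checked by counting dimensions. The sketch below is an illustration, not part of the original argument: it assumes the standard fact that $\Box$ maps homogeneous polynomials of degree n onto those of degree n − 2, so that the harmonic ones have dimension $\dim P_n - \dim P_{n-2}$, and the function names are hypothetical.

```python
from math import comb


def dim_homogeneous(n, D):
    """Dimension of the homogeneous polynomials of degree n in D variables."""
    if n < 0:
        return 0
    return comb(n + D - 1, D - 1)


def dim_harmonic(n, D):
    """Dimension of H_n, assuming Box: P_n -> P_{n-2} is surjective."""
    return dim_homogeneous(n, D) - dim_homogeneous(n - 2, D)


def dim_from_decomposition(n, D):
    """Dimension of H_n + H_{n-2} + H_{n-4} + ... as in (5.2.4)."""
    return sum(dim_harmonic(n - 2 * k, D) for k in range(n // 2 + 1))
```

The sum telescopes, so `dim_from_decomposition(n, D)` reproduces `dim_homogeneous(n, D)` for every n and D; for instance $H_2$ in D = 4 has dimension 10 − 1 = 9, the symmetric traceless tensors.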
At the ghost number one and four spaces we find another kind of irreducible representation:
$V_n$ = [Young tableau, labelled n], (5.2.5)
and, finally, in the decompositions of $V^{(2)}$ and $V^{(3)}$ we will encounter
$A_n$ = [Young tableau, labelled n] and $S_n$ = [Young tableau, labelled n]. (5.2.6)
Suppose $n \ge 2$. Table 4 shows the decomposition of the whole complex (5.2.2) into irreducible representations. From the series of lemmas presented in Appendix B, we know that the complex (5.2.2) has cohomology only in $V_n^{(2)}$. Using Table 4 we conclude that
$$\mathrm{Gr}_{-n} H^2(\widehat{V}_0, \widehat{Q}) = H_n + A_n + S_n, \qquad (5.2.7)$$
which we obtain by "subtracting" the odd columns from column two and "adding" the even columns. A more detailed analysis shows that if we choose the representatives of
$\mathrm{Gr}_{-n} H^2(\widehat{V}_0, \widehat{Q})$ so that they belong to $H_n$, $A_n$, or $S_n$, they will also represent cohomology classes of Q, with no lower x-degree corrections required. Two exceptional cases, n = 0 and n = 1, have to be treated separately. The decompositions of the complex (5.2.2) into irreducible representations for these cases are presented in Table 5.

Table 5. Decomposition of the complex (5.2.2) into irreducible representations for n = 0 (left) and n = 1 (right)

n = 0:
$V^{(0)}_2$    $V^{(1)}_1$        $V^{(2)}_0$
$H_2$          $2H_2$             $H_2$
$H_0$          $2H_0 + 2V_0$      $H_0 + V_0$

n = 1:
$V^{(0)}_3$    $V^{(1)}_2$        $V^{(2)}_1$
$H_3$          $2H_3$             $H_3$
$H_1$          $4H_1 + 2V_1$      $4H_1 + A_1 + 2V_1$

Using the results of Appendix B we can infer from Table 5 (left) that
$$\mathrm{Gr}_{-1} H^1(\widehat{Q}, \widehat{V}_0) = V_0 \quad \text{and} \quad \mathrm{Gr}_0 H^2(\widehat{Q}, \widehat{V}_0) = H_0, \qquad (5.2.8)$$
and from Table 5 (right) that
$$\mathrm{Gr}_{-1} H^2(\widehat{Q}, \widehat{V}_0) = H_1 + A_1; \qquad (5.2.9)$$
and again, if we pick the representatives of $\mathrm{Gr}H^s$ from the irreducible representations, they will be annihilated by Q and therefore represent the cohomology classes without lower x-degree corrections. For this case this can be easily checked by explicit calculation (see Appendix B). It is tempting to interpret the irreducible representations $H_n$, $S_n$, and $A_n$ as the dilaton, graviton, and antisymmetric tensor. If we do so, it is not quite clear why we have infinitely many irreducible representations for each field, and not just one. We speculate that these representations are related by infinitesimal shifts (the translational part of the Poincaré algebra), which act on the spaces of polynomials by differentiation with respect to $x^\mu$.
Appendix A. Spectral Sequence
In this section we review some basic facts about a particular type of spectral sequence which we use in our analysis of the extended complex. This is not intended to be a complete introduction to the method. Our only goal is to introduce the spaces $E_0$, $E_1$ and $E_2$ equipped with differentials $d_0$, $d_1$ and $d_2$ acting on them. We will prove that these differentials have zero square and provide some motivation for why their cohomologies are related to GrH. For a more detailed analysis of the first three terms of a spectral sequence, the reader is referred to the book by Dubrovin, Fomenko and Novikov [13]. A general introduction to spectral sequences from the physicist's point of view and further references can be found in refs. [14, 15]. Let (C, d) be a complex with an additional grading $C = \bigoplus_r C_r$ such that the differential d can be written as
$$d = \partial_0 + \partial_1 + \partial_2, \qquad (A.1)$$
where $\partial_n$ maps $C_r$ to $C_{r+n}$. Since d mixes vectors from different gradings we cannot define a grading on the cohomology H(d, C), but we can still define a decreasing filtration. By the filtration of an element $x \in H(d, C)$ we will mean the smallest (negative) integer r such that x is representable by a cocycle
$$\bar{x} = x_r + x_{r+1} + \cdots, \qquad (A.2)$$
where $x_r \in C_r$. We will denote the space of such vectors by $F_r H(d, C)$. Using the filtration we can define a graded space associated to the cohomology H(d, C),
$$\mathrm{Gr}H = \bigoplus_s \mathrm{Gr}_s H, \quad \text{where} \quad \mathrm{Gr}_s H = F_s H / F_{s+1} H. \qquad (A.3)$$
The investigation of the spaces $\mathrm{Gr}_s H^n$ is carried out using the method of "successive approximations" based on what is called the "spectral sequence". The idea is to construct a sequence of complexes $(E_n, d_n)$ such that $E_{n+1} = H(d_n, E_n)$ which converges to GrH,
$$\mathrm{Gr}_s H^n = E_\infty^{s,n-s}. \qquad (A.4)$$
The differentials $d_n$ act on the spaces $E_n^{r,s}$ as follows:
$$d_n: \quad E_n^{r,s} \to E_n^{r+n,s-n+1}. \qquad (A.5)$$
For a complete description of the spectral sequence and the proof of the theorem which states that the spectral sequence converges to GrH we refer to [13, 16]. Let us describe the first few terms of the spectral sequence. Suppose $\bar{x}$ given in Eq. (A.2) represents a cohomology class x in $F_r H(d, C)$. Applying $d = \partial_0 + \partial_1 + \partial_2$ to $\bar{x}$ we obtain
$$d\bar{x} = \partial_0 x_r + (\partial_1 x_r + \partial_0 x_{r+1}) + (\partial_2 x_r + \partial_1 x_{r+1} + \partial_0 x_{r+2}) + (\partial_2 x_{r+1} + \partial_1 x_{r+2} + \partial_0 x_{r+3}) + \cdots, \qquad (A.6)$$
where we enclosed in parentheses the terms from the same $C_r$ space. It follows that
$$\partial_0 x_r = 0, \qquad \partial_1 x_r = -\partial_0 x_{r+1}, \qquad \partial_2 x_r = -\partial_1 x_{r+1} - \partial_0 x_{r+2}, \ \ldots, \qquad (A.7)$$
from which we conclude that $x_r$ is a $\partial_0$ cocycle and a $\partial_1$ cocycle modulo the image of $\partial_0$. This suggests that the first approximation in the spectral sequence should be $E_1 = H(\partial_0, C)$ and $d_0 = \partial_0$; the second approximation is $E_2 = H(d_1, E_1)$, where $d_1$ is induced on $E_1 = H(\partial_0, C)$ by $\partial_1$. Using the second equation of (A.7) we can formally find $x_{r+1}$ in terms of $x_r$ as $x_{r+1} = -\partial_0^{-1} \partial_1 x_r$ and rewrite the last equation of (A.7) as
$$(\partial_2 - \partial_1 \partial_0^{-1} \partial_1)\, x_r = -\partial_0 x_{r+2}. \qquad (A.8)$$
The last formula suggests that $d_2$ is induced on $E_2 = H(d_1, E_1)$ by $\partial_2 - \partial_1 \partial_0^{-1} \partial_1$ and the third approximation is $E_3 = H(d_2, E_2)$. Let us show that these differentials are well defined and square to zero. For $d_0$ the first is obvious since it acts on the same space as $\partial_0$, and $\partial_0^2 = 0$ follows from $d^2 = 0$, which is equivalent to
$$\partial_0^2 = \{\partial_0, \partial_1\} = \partial_1^2 + \{\partial_0, \partial_2\} = \partial_2^2 = 0. \qquad (A.9)$$
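The relations among the $\partial_n$ come from sorting the terms of $d^2$ by how far they shift the grading: since $\partial_n$ maps $C_r$ to $C_{r+n}$, the product $\partial_i \partial_j$ shifts r by i + j, and pieces with different shifts land in different $C_r$ and must vanish separately. The sketch below only enumerates this bookkeeping; the function name is a hypothetical helper.

```python
from itertools import product


def square_terms_by_shift(num_parts=3):
    """Group the ordered products d_i d_j in (d_0 + ... + d_{num_parts-1})^2
    by the total grading shift i + j."""
    groups = {}
    for i, j in product(range(num_parts), repeat=2):
        groups.setdefault(i + j, []).append((i, j))
    return groups
```

For three pieces the groups are: shift 0, $\partial_0^2$; shift 1, $\{\partial_0, \partial_1\}$; shift 2, $\partial_1^2 + \{\partial_0, \partial_2\}$; shift 3, $\{\partial_1, \partial_2\}$; shift 4, $\partial_2^2$. The shift 3 component, $\{\partial_1, \partial_2\} = 0$, is the relation needed when $\partial_1$ is moved past $\partial_2$ in the treatment of $d_2$.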
In order to show that $d_1$ is well defined we have to show that $\partial_1$ maps $\partial_0$-closed vectors to $\partial_0$-closed vectors and $\partial_0$-trivial to $\partial_0$-trivial. This easily follows from the anticommutation relation $\{\partial_0, \partial_1\} = 0$. Let us prove that $d_1^2 = 0$. Suppose $x \in E_1$ can be represented by a cocycle $\bar{x} \in C$, $\partial_0 \bar{x} = 0$. Then applying $\partial_1^2 = -\{\partial_0, \partial_2\}$ to $\bar{x}$ we obtain a trivial cocycle $\partial_1^2 \bar{x} = -\partial_0 \partial_2 \bar{x}$. In cohomology this implies that $d_1^2 x = 0$. Before we consider the differential $d_2$, let us describe the space $E_2$ on which it acts in greater detail. By definition $E_2 = H(d_1, E_1)$, but $E_1$ in turn is the cohomology of the original complex with respect to $\partial_0$. Therefore, in order to find a $d_1$ cocycle we should
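start with a $\partial_0$ cocycle $\bar{x}$ and require that its image under $\partial_1$ is $\partial_0$ exact. Two cocycles $\bar{x}$ and $\bar{x}'$ represent the same $d_1$ cohomology class if $\bar{x} - \bar{x}' \in \operatorname{Im} \partial_0 + \partial_1 \ker \partial_0$. Let $\bar{x}$ be a $d_1$ cocycle, which means that
$$\partial_0 \bar{x} = 0 \quad \text{and} \quad \partial_1 \bar{x} = \partial_0 \bar{y}. \qquad (A.10)$$
We define $d_2 x$ as
$$d_2 \bar{x} = \partial_2 \bar{x} - \partial_1 \bar{y} = (\partial_2 - \partial_1 \partial_0^{-1} \partial_1)\, \bar{x}. \qquad (A.11)$$
Let us show that the result is again a $d_1$ cocycle. Indeed, using the properties of $\partial_n$ listed in Eq. (A.9) we obtain
$$\partial_0 d_2 \bar{x} = \partial_0 \partial_2 \bar{x} - \partial_0 \partial_1 \bar{y} = \partial_0 \partial_2 \bar{x} + \partial_1 \partial_0 \bar{y} = \{\partial_0, \partial_2\} \bar{x} + \partial_1^2 \bar{x} = 0$$
and
$$\partial_1 d_2 \bar{x} = \partial_1 \partial_2 \bar{x} - \partial_1^2 \bar{y} = \partial_0 \partial_2 \bar{y}.$$
Similarly we can prove that if $\bar{x}$ and $\bar{x}'$ belong to the same $d_1$ cohomology class then their $d_2$ images belong to the same class as well. This will finally establish the correctness of the definition of $d_2$ as an operator on $E_2$. In conclusion let us show that $d_2^2 = 0$. With $\bar{x}$ as above we have
$$d_2^2 x = \partial_2 (\partial_2 \bar{x} - \partial_1 \bar{y}) - \partial_1 \partial_0^{-1} \partial_1 (\partial_2 \bar{x} - \partial_1 \bar{y})$$
$$= \partial_1 \partial_2 \bar{y} + \partial_1 \partial_0^{-1} \partial_2 \partial_1 \bar{x} - \partial_1 \partial_0^{-1} \partial_0 \partial_2 \bar{y} - \partial_1 \partial_0^{-1} \partial_2 \partial_0 \bar{y}$$
$$= \partial_1 \partial_2 \bar{y} - \partial_1 \partial_2 \bar{y} = 0,$$
which completes our analysis of $d_2$.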
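Appendix B. Three Lemmas
In this appendix we will calculate the cohomology of the following complex
$$0 \to V^{(0)} \xrightarrow{d^{(0)}} V^{(1)} \xrightarrow{d^{(1)}} V^{(2)} \xrightarrow{d^{(2)}} V^{(3)} \xrightarrow{d^{(3)}} V^{(4)} \xrightarrow{d^{(4)}} V^{(5)} \to 0, \qquad (B.1)$$
where $V^{(5)} \simeq V^{(0)} \simeq \mathbb{C}[x^0, \ldots, x^{D-1}]$, $V^{(4)} \simeq V^{(1)} \simeq \mathbb{C}^{2D}[x^0, \ldots, x^{D-1}]$ and $V^{(3)} \simeq V^{(2)} \simeq \mathbb{C}^{D^2+1}[x^0, \ldots, x^{D-1}]$. Following the notations of Sect. 4 we represent the elements of $V^{(0)}$ and $V^{(5)}$ as $(R^{[0]})$ and $(R^{[5]})$, elements of $V^{(2)}$ and $V^{(3)}$ as $(Q^{[2]}_{\mu\nu}, R^{[2]})$ and $(Q^{[3]}_{\mu\nu}, R^{[3]})$, and elements of $V^{(1)}$ and $V^{(4)}$ by $(P_\mu, \bar{P}_\mu)$. The differentials $d^{(n)}$ act as follows:
$$d^{(0)}: \ (R^{[0]}) \to (\partial_\mu R^{[0]}, \partial_\mu R^{[0]}),$$
$$d^{(1)}: \ (P^{[1]}_\mu, \bar{P}^{[1]}_\mu) \to (\partial_\mu P^{[1]}_\nu - \partial_\nu \bar{P}^{[1]}_\mu, \ \partial^\mu P^{[1]}_\mu - \partial^\mu \bar{P}^{[1]}_\mu),$$
$$d^{(2)}: \ (Q^{[2]}_{\mu\nu}, R^{[2]}) \to (Q^{[3]}_{\mu\nu}, R^{[3]}),$$
$$d^{(3)}: \ (Q^{[3]}_{\mu\nu}, R^{[3]}) \to (\partial^\nu Q^{[3]}_{\mu\nu} + \partial_\mu R^{[3]}, \ \partial^\nu Q^{[3]}_{\nu\mu} + \partial_\mu R^{[3]}),$$
$$d^{(4)}: \ (P^{[4]}_\mu, \bar{P}^{[4]}_\mu) \to (\partial^\mu P^{[4]}_\mu - \partial^\mu \bar{P}^{[4]}_\mu), \qquad (B.2)$$
where
$$Q^{[3]}_{\mu\nu} = \Box Q^{[2]}_{\mu\nu} - \partial^\lambda \partial_\mu Q^{[2]}_{\lambda\nu} - \partial^\lambda \partial_\nu Q^{[2]}_{\mu\lambda} + 2 \partial_\mu \partial_\nu R^{[2]}, \qquad (B.3)$$
$$R^{[3]} = -\partial^\lambda \partial^\rho Q^{[2]}_{\lambda\rho} + 2\, \Box R^{[2]}. \qquad (B.4)$$
It is obvious that $d^{(4)}$ is surjective and the kernel of $d^{(0)}$ contains only constant polynomials. Thus we conclude that $H^0 = \mathbb{C}$ and $H^5 = 0$. The other cohomology spaces are described by the following lemmas.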
Lemma B.1. $H^1$ is finite dimensional and $\dim H^1 = \frac{D(D+1)}{2}$. $H^1$ cohomology classes can be represented by polynomials of degree no bigger than one.

Proof. According to (B.2) $H^1$ is a quotient of the space S of solutions to the system of first order differential equations
$$\partial_\mu P_\nu = \partial_\nu \bar{P}_\mu, \qquad \partial^\mu P_\mu = \partial^\mu \bar{P}_\mu \qquad (B.5)$$
by the space T of trivial solutions $P_\mu = \bar{P}_\mu = \partial_\mu R$. Note that both S and T naturally decompose into a direct sum of the spaces of homogeneous polynomials and so does the quotient
$$S = \bigoplus_n S^n, \qquad T = \bigoplus_n T^n, \qquad H^1 = S/T = \bigoplus_n S^n / T^n. \qquad (B.6)$$
We want to prove that $S^n = T^n$ for n > 1. Let $P_\mu$ and $\bar{P}_\mu$ be homogeneous polynomials of degree n > 1 that satisfy Eq. (B.5). First, it is obvious that $P_\mu = 0$ for every $\mu$ requires $\bar{P}_\mu = 0$. Indeed, if $P_\mu = 0$ for every $\mu$ then according to the first equation in Eq. (B.5) $\partial_\nu \bar{P}_\mu = 0$ for every $\mu$ and $\nu$, and since by assumption $\deg \bar{P}_\mu > 1$ this means $\bar{P}_\mu = 0$. Second, using the first equation of (B.5) twice we obtain
$$\partial_\alpha \partial_\nu P_\mu = \partial_\alpha \partial_\mu \bar{P}_\nu = \partial_\mu \partial_\alpha \bar{P}_\nu = \partial_\mu \partial_\nu P_\alpha. \qquad (B.7)$$
Therefore for any $\alpha$, $\mu$ and $\nu$,
$$\partial_\nu (\partial_\alpha P_\mu - \partial_\mu P_\alpha) = 0. \qquad (B.8)$$
And since $\deg(\partial_\alpha P_\mu - \partial_\mu P_\alpha) = n - 1 > 0$ we conclude that $\partial_\alpha P_\mu - \partial_\mu P_\alpha = 0$ and there exists R such that $P_\mu = \partial_\mu R$. Subtracting a trivial solution $(\partial_\mu R, \partial_\mu R)$ from $(P_\mu, \bar{P}_\mu)$ we get another solution $(P'_\mu = 0, \ \bar{P}'_\mu = \bar{P}_\mu - \partial_\mu R)$. According to our first observation $P'_\mu = 0$ requires $\bar{P}'_\mu = 0$ and thus $P_\mu = \bar{P}_\mu = \partial_\mu R$. It is easy to see that there are exactly $\frac{D(D-1)}{2}$ non-trivial solutions of degree one and D non-trivial constant solutions which can be written as
$$P_\mu = -\bar{P}_\mu = \xi_{[\mu\nu]} x^\nu \quad \text{and} \quad P_\mu = -\bar{P}_\mu = \text{const}, \qquad (B.9)$$
where $\xi_{[\mu\nu]}$ is an antisymmetric tensor. □

Lemma B.2. $H^3 = 0$.
Proof. This is the most difficult lemma in this work. We have to show that any solution of the system
$$\partial^\nu Q^{[3]}_{\mu\nu} + \partial_\mu R^{[3]} = 0, \qquad \partial^\nu Q^{[3]}_{\nu\mu} + \partial_\mu R^{[3]} = 0 \qquad (B.10)$$
can be represented in the form
$$\delta Q^{[3]}_{\mu\nu} = \Box Q^{[2]}_{\mu\nu} - \partial^\lambda \partial_\mu Q^{[2]}_{\lambda\nu} - \partial^\lambda \partial_\nu Q^{[2]}_{\mu\lambda} + 2 \partial_\mu \partial_\nu R^{[2]}, \qquad \delta R^{[3]} = -\partial^\lambda \partial^\rho Q^{[2]}_{\lambda\rho} + 2\, \Box R^{[2]}. \qquad (B.11)$$
We will start from an arbitrary solution of the system (B.10) and will modify it step by step by adding trivial solutions of the form (B.11) in order to get zero. We can use the same arguments as in Lemma B.1 to consider only homogeneous polynomials of some degree m. Suppose $m \ge 1$. We will describe an iterative procedure which will allow us to modify $(Q^{[3]}_{\mu\nu}, R^{[3]})$ so that $Q^{[3]}_{\mu\nu}$ will depend only on one variable, say $x_0$, and its only nonzero components will be $Q^{[3]}_{\mu 0}$ and $Q^{[3]}_{0\nu}$. If this is the case, the cocycle condition (Eq. (B.10)) tells us that
$$\partial_\mu R = \partial_0 Q_{\mu 0} = \partial_0 Q_{0\mu}, \qquad (B.12)$$
moreover, $Q_{\mu 0} = C_\mu x_0^m$ and $Q_{0\mu} = \bar{C}_\mu x_0^m$. Furthermore, using Eq. (B.12) we conclude that $C_\mu = \bar{C}_\mu$. Integrating Eq. (B.12) we obtain
$$R = \sum_{i=1}^{D-1} m\, C_i\, x_i\, x_0^{m-1} + C_0\, x_0^m. \qquad (B.13)$$
One can check that such a solution $(Q^{[3]}, R^{[3]})$ can be written in the form (B.11) with $Q^{[2]} = 0$ and
$$R^{[2]} = \frac{1}{2} \left( \sum_{i=1}^{D-1} \frac{C_i}{m+1}\, x_i\, x_0^{m+1} + \frac{C_0}{(m+1)(m+2)}\, x_0^{m+2} \right).$$
Now let us describe the procedure which reduces any solution to the above-mentioned form. Our first objective is to get rid of the $x_i$ dependence for i = 1, ..., D − 1. Let us pick i. The following four step algorithm will make $(Q_{\mu\nu}, R)$ independent of $x_i$. We will see that when we apply the procedure to $(Q_{\mu\nu}, R)$ which does not depend on some other $x_k$ it will not introduce $x_k$ dependence in the output. This observation will allow us to apply the algorithm D − 1 times and make $(Q_{\mu\nu}, R)$ depend only on $x_0$.
Step 1. Let us introduce some notation. For a polynomial P we will denote the minimal degree of $x_i$ among all the monomials in P by $n_i(P)$. For a zero polynomial we formally set $n_i(0) = +\infty$. Given a matrix of polynomials $Q_{\mu\nu}$, let
$$N_i(Q_{\mu\nu}) = \min_{\mu, \nu \neq i} n_i(Q_{\mu\nu}). \qquad (B.14)$$
Since the $Q^{[3]}_{\mu\nu}$ are homogeneous polynomials of degree m we can write them as
$$Q^{[3]}_{\mu\nu} = \sum_{m_0 + \cdots + m_{D-1} = m} C_{m_0 \cdots m_{D-1}, \mu\nu}\; x_0^{m_0} \cdots x_{D-1}^{m_{D-1}}. \qquad (B.15)$$
A. Astashkevich, A. Belopolsky
Let us show that it is possible to add a trivial solution to (Q~,., R) and increase Ni(Qu~) by one. Indeed, let m0
~mo".mD_l,tzu
(0[2] ---"~ #~'
Z-
X0
mz+2
"''32 i
~/2D
...37D_
1
I
i ~ - i 7 ~{~rt/-7----~
mo+'.'+mD- imm
0
for #, v g i, otherwise,
and R [21 = 0. It is easy to see that Nitr3[3] ~r~[3l >-- Ni(Q[3]~) ~'~v + Otg~v) # + 1, where 6Q[3~ comes from the trivial solution generated by ( 0 [21 , R i21 ) according to (B. 1 1). Repeating this procedure, we will increase N~(tr -.--.[3] at least by one every time. Since .,~,~ O [31 are homogeneous polynomials of degree m, Ni is either less than m + 1 or equal +oc. Therefore after a finite number of steps we will make Ni = +oc which means that all Q~] are zero for # ~ i and v : / i .
Step 2. Since ",(c)I3] "V ,//,/J R[31) is a solution to the system (B.10) we can write OiQ[~3)
=
f~[31 O . R [3] = Oi~dip,
# ~t i,
[31 = Oi R[31 = 0 v Q i[3] 0 v Qvi ..
Suppose
$$R^{[3]} = \sum_{m_0+\cdots+m_{D-1}=m} D_{m_0 \cdots m_{D-1}}\, x_0^{m_0} \cdots x_{D-1}^{m_{D-1}},$$
then we choose $R^{[2]} = 0$,
$$Q^{[2]}_{ii} = \sum_{m_0+\cdots+m_{D-1}=m} \frac{D_{m_0 \cdots m_{D-1}}}{(m_i+1)(m_i+2)}\, x_0^{m_0} \cdots x_i^{m_i+2} \cdots x_{D-1}^{m_{D-1}},$$
and $Q^{[2]}_{\mu\nu} = 0$ for all the other $\mu$ and $\nu$. It is easy to see that $\tilde{Q}^{[3]}_{\mu\nu} = Q^{[3]}_{\mu\nu} + \delta Q^{[3]}_{\mu\nu}$ have the following properties:
- all $\tilde{Q}^{[3]}_{\mu\nu}$ except $\tilde{Q}^{[3]}_{ii}$ do not depend on $x_i$;
- $\tilde{Q}^{[3]}_{\mu\nu} = 0$ for all $\mu \neq i$ and $\nu \neq i$.
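The normalization $1/((m_i+1)(m_i+2))$ and the shifted exponent $m_i + 2$ used in Steps 1 and 2 can be motivated by a one-line computation. The sketch below assumes only that the trivial solution generated through (B.11) involves two $x_i$-derivatives of $Q^{[2]}$; the precise form of (B.11), including signs, is not reproduced here.

```latex
% Two x_i-derivatives of the shifted monomial return the original monomial,
% which is why the factor (m_i+1)(m_i+2) appears in the denominator.
\[
  \partial_i^2 \left( \frac{x_i^{m_i+2}}{(m_i+1)(m_i+2)} \right)
  = \partial_i \left( \frac{x_i^{m_i+1}}{m_i+1} \right)
  = x_i^{m_i}.
\]
% Derivatives with respect to x_k, k \neq i, leave the power x_i^{m_i+2}
% untouched, so such terms have strictly higher degree in x_i.
```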
Step 3. Suppose $(Q^{[3]}_{\mu\nu}, R^{[3]})$ is of the form we obtained at the end of Step 2. Since $\partial_i Q^{[3]}_{\mu i} = 0$ for $\mu \neq i$, then $\partial_\mu R^{[3]} = 0$ for $\mu \neq i$ and therefore $R^{[3]}$ depends only on $x_i$. Thus there exists $R^{[2]}$ such that $R^{[3]} = -2\,\partial^2 R^{[2]}$ and $R^{[2]}$ depends only on $x_i$. Adding a trivial solution generated by $(0, R^{[2]})$ we can make $R^{[3]} = 0$.
Step 4. Using $R^{[3]} = 0$ we can rewrite the system (B.10) as follows:
$$\partial^i Q^{[3]}_{ii} = -\sum_{\mu \neq i} \partial^\mu Q^{[3]}_{\mu i}.$$
Therefore, $Q^{[3]}_{ii} = Q^{[3]}_{ii,0} + x_i Q^{[3]}_{ii,1}$, where $Q^{[3]}_{ii,0}$ and $Q^{[3]}_{ii,1}$ do not depend on $x_i$. For every $Q^{[3]}_{ii,1}$ we can find a polynomial $P$ which depends on the same set of variables and $\Box P = -Q^{[3]}_{ii,1}$. Choose $Q^{[2]}_{ii} = x_i P$, $R^{[2]} = 0$ and $Q^{[2]}_{\mu\nu} = 0$ for $(\mu, \nu) \neq (i, i)$. Adding the corresponding trivial solution we achieve that $Q^{[3]}_{\mu\nu}$ and $R^{[3]}$ do not depend on $x_i$.

Repeating this program $D-1$ times for each value of $i = 1, \dots, D-1$, we make $(Q^{[3]}_{\mu\nu}, R^{[3]})$ depend only on $x_0$. Now we can repeat the first step once again with $i = 0$ and make $Q^{[3]}_{kj} = 0$ for $k, j = 1, \dots, D-1$. We have already proven that such a solution is trivial.

Recall that in the very beginning of our analysis we have made an assumption that the polynomials have non-zero degree ($m \geq 1$). Therefore we have to consider this last case separately. If the polynomials $Q^{[3]}_{\mu\nu}$ and $R^{[3]}$ are constant, they trivially satisfy the system (B.10). To show that any such constant solution can be represented in the form (B.11), it is sufficient to take $Q^{[2]}_{\mu\nu} = q_{\mu\nu,0}\,(x_0)^2/2 + q_{\mu\nu,1}\,(x_1)^2/2$, which generates $(Q^{[3]}_{\mu\nu}, R^{[3]})$ if $q_{\mu\nu,0}$ and $q_{\mu\nu,1}$ are chosen so that
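$$\begin{aligned}
q_{00,1} - q_{00,0} &= Q^{[3]}_{00}, \qquad q_{11,0} - q_{11,1} = Q^{[3]}_{11}, \\
q_{00,1} + q_{00,0} + q_{11,1} + q_{11,0} &= R^{[3]}, \\
q_{0\nu,1} &= Q^{[3]}_{0\nu} \quad \text{for } \nu \neq 0, \qquad q_{\mu 0,1} = Q^{[3]}_{\mu 0} \quad \text{for } \mu \neq 0, \\
q_{1\nu,0} &= Q^{[3]}_{1\nu} \quad \text{for } \nu \neq 1, \qquad q_{\mu 1,0} = Q^{[3]}_{\mu 1} \quad \text{for } \mu \neq 1, \\
q_{\mu\nu,0} + q_{\mu\nu,1} &= Q^{[3]}_{\mu\nu} \quad \text{for } \mu, \nu > 1.
\end{aligned}$$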
This completes the proof of Lemma B.2. □

Lemma B.3. $H^4 = 0$.
Proof. It is almost obvious that the image of $d^{(3)}$ covers the whole kernel of $d^{(4)}$ in $V^{(4)}$ because the space $V^{(3)}$ is much bigger than $V^{(4)}$ at every degree. Indeed we will show that it is sufficient to consider a subspace of $V^{(3)}$ spanned by the zeroth row and the zeroth column of the matrix $Q_{\mu\nu}$. Loosely speaking the row will cover $P_\mu$ and the column will cover $\bar{P}_\mu$. Suppose $(P_\mu, \bar{P}_\mu) \in \ker d^{(4)}$ or equivalently
$$d^{(4)}(P_\mu, \bar{P}_\mu) = (\partial^\mu P_\mu - \partial^\mu \bar{P}_\mu) = 0. \tag{B.16}$$
We want to show that subtracting vectors of the form $d^{(3)}(Q_{\mu\nu}, 0)$ from $(P_\mu, \bar{P}_\mu)$ we can reduce it to zero. First of all we can easily get rid of the spatial components $P_i$ and $\bar{P}_i$ for $i = 1, \dots, D-1$ using $Q_{\mu\nu}$ which has the only non-zero components given by
$$Q_{0i} = \int P_i\, dx^0 \qquad \text{and} \qquad Q_{i0} = \int \bar{P}_i\, dx^0. \tag{B.17}$$
This will reduce $P_\mu$ and $\bar{P}_\mu$ to the form $P_\mu = a\,\delta_{0,\mu}$ and $\bar{P}_\mu = g\,\delta_{0,\mu}$. Furthermore, according to Eq. (B.16), the polynomials $a$ and $g$ have the same derivative with respect to $x^0$. Thus varying, say, $C_1(x^1, \dots, x^{D-1})$ in Eq. (B.17) we can achieve that $a = g$. Finally, if we have a vector given by $P_\mu = \bar{P}_\mu = a\,\delta_{0,\mu}$ we can use $Q_{00}$ to reduce it to zero. □
Acknowledgement. We would like to thank Barton Zwiebach for reading the manuscript and giving many valuable suggestions. This work much benefited from discussions with Alexander Gorohovsky and Barton Zwiebach. One of us (A.B.) would also like to acknowledge conversations with Jeffrey Goldstone, Kenneth Johnson and Paul Mende.
References
1. Belopolsky, A. and Zwiebach, B.: Who changes the coupling constant? Nucl. Phys. B472, 109-138 (1996). e-Print Archive: hep-th/9511077
2. Green, M.B., Schwarz, J.H. and Witten, E.: Superstring theory. Cambridge: Cambridge University Press, 1987
3. Friedan, D., Martinec, E. and Shenker, S.: Conformal invariance, supersymmetry and string theory. Nucl. Phys. B271, 93 (1986)
4. Nelson, P.: Covariant insertion of general vertex operators. Phys. Rev. Lett. 62, 993 (1989)
5. Distler, J. and Nelson, P.: Topological couplings and contact terms in 2-D field theory. Commun. Math. Phys. 138, 273-290 (1991)
6. Zwiebach, B.: Closed string field theory: Quantum action and the B-V master equation. Nucl. Phys. B390, 33-152 (1993). e-Print Archive: hep-th/9206084
7. Becchi, C.M., Collina, R. and Imbimbo, C.: On the semirelative condition for closed (topological) strings. Phys. Lett. B322, 79-83 (1994). e-Print Archive: hep-th/9311097
8. Feigin, B.: The semi-infinite homology of Kac-Moody and Virasoro Lie algebras. Russian Math. Surveys 39, 155-156 (1984)
9. Fuks, D.B.: Cohomology of infinite-dimensional Lie algebras. New York: Consultants Bureau, 1986
10. Frenkel, I.B., Garland, H. and Zuckerman, G.J.: Semi-infinite cohomology and string theory. Proc. Nat. Acad. Sci. USA 83, 8442-8446 (1986)
11. Distler, J. and Nelson, P.: New discrete states of strings near a black hole. Nucl. Phys. B374, 123-155 (1992)
12. Witten, E. and Zwiebach, B.: Algebraic structures and differential geometry in 2d string theory. Nucl. Phys. B377, 55-112 (1992). e-Print Archive: hep-th/9201056
13. Dubrovin, B.A., Fomenko, A.T. and Novikov, S.P.: Modern Geometry - Methods and Applications. Part III. Berlin-Heidelberg-New York: Springer-Verlag, 1984
14. Figueroa-O'Farrill, J.M. and Kimura, T.: The BRST cohomology of the NSR string: Vanishing and "No-Ghost" theorems. Commun. Math. Phys. 124, 105-132 (1989)
15. Dixon, J.A.: Calculation of BRS cohomology with spectral sequences. Commun. Math. Phys. 139, 495-526 (1991)
16. McCleary, J.: User's Guide to Spectral Sequences. Berkeley, CA: Publish or Perish, Inc., 1985

Communicated by R.H. Dijkgraaf
Commun. Math. Phys. 186, 137-165 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997

Characteristic Cohomology of p-Form Gauge Theories
Marc Henneaux 1,2, Bernard Knaepen 1,*, Christiane Schomblond 1
1 Faculté des Sciences, Université Libre de Bruxelles, Campus Plaine C.P. 231, B-1050 Bruxelles, Belgium
2 Centro de Estudios Científicos de Santiago, Casilla 16443, Santiago 9, Chile
Received: 4 July 1996 / Accepted: 8 October 1996
Abstract: The characteristic cohomology $H^k_{char}(d)$ for an arbitrary set of free p-form gauge fields is explicitly worked out in all form degrees $k < n - 1$, where $n$ is the spacetime dimension. It is shown that this cohomology is finite-dimensional and completely generated by the forms dual to the field strengths. The gauge invariant characteristic cohomology is also computed. The results are extended to interacting p-form gauge theories with gauge invariant interactions. Implications for the BRST cohomology are mentioned.
1. Introduction

The characteristic cohomology [1] plays a central role in the analysis of any local field theory. The easiest way to define this cohomology, which is contained in the so-called Vinogradov C-spectral sequence [2, 3, 4], is to start with the familiar notion of conserved current. Consider a dynamical theory with field variables $\phi^i$ ($i = 1, \dots, M$) and Lagrangian $\mathcal{L}(\phi^i, \partial_\mu \phi^i, \dots, \partial_{\mu_1 \dots \mu_k} \phi^i)$. The field equations read
$$\mathcal{L}_i = 0, \tag{1.1}$$
with
$$\mathcal{L}_i = \frac{\delta \mathcal{L}}{\delta \phi^i} = \frac{\partial \mathcal{L}}{\partial \phi^i} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi^i)} \right) + \cdots + (-1)^k \partial_{\mu_1 \dots \mu_k} \left( \frac{\partial \mathcal{L}}{\partial(\partial_{\mu_1 \dots \mu_k} \phi^i)} \right). \tag{1.2}$$
A (local) conserved current $j^\mu$ is a vector-density which involves the fields and their derivatives up to some finite order and which is conserved modulo the field equations, i.e., which fulfills
$$\partial_\mu j^\mu \approx 0. \tag{1.3}$$
* Aspirant du Fonds National de la Recherche Scientifique (Belgium)
Here and in the sequel, $\approx$ means "equal when the equations of motion hold" or, as one also says, equal "on-shell". Thus, (1.3) is equivalent to
$$\partial_\mu j^\mu = \lambda^i \mathcal{L}_i + \lambda^{i\mu} \partial_\mu \mathcal{L}_i + \cdots + \lambda^{i\mu_1 \dots \mu_s} \partial_{\mu_1 \dots \mu_s} \mathcal{L}_i \tag{1.4}$$
for some $\lambda^{i\mu_1 \dots \mu_j}$, $j = 0, \dots, s$. A conserved current is said to be trivial if it can be written as
$$j^\mu \approx \partial_\nu S^{\mu\nu} \tag{1.5}$$
for some local antisymmetric tensor density $S^{\mu\nu} = -S^{\nu\mu}$. The terminology does not mean that trivial currents are devoid of physical interest, but rather, that they are easy to construct and that they are trivially conserved. Two conserved currents are said to be equivalent if they differ by a trivial one. The characteristic cohomology in degree $n - 1$ is defined to be the quotient space of equivalence classes of conserved currents. One assigns the degree $n - 1$ because Eqs. (1.3) and (1.5) can be rewritten as $d\omega \approx 0$ and $\omega \approx d\psi$ in terms of the $(n-1)$-form $\omega$ and $(n-2)$-form $\psi$ respectively dual to $j^\mu$ and $S^{\mu\nu}$.

One defines the characteristic cohomology in degree $k$ ($k < n$) along exactly the same lines, by simply considering other values of the form degree. So, one says that a local $k$-form $\omega$ is a cocycle of the characteristic cohomology in degree $k$ if it is weakly closed,
$$d\omega \approx 0 \qquad \text{("cocycle condition")} \tag{1.6}$$
and that it is a coboundary if it is weakly exact,
$$\omega \approx d\psi \qquad \text{("coboundary condition")} \tag{1.7}$$
just as it is done for $k = n - 1$. For instance, the characteristic cohomology in form degree $n - 2$ is defined, in dual notations, as the quotient space of equivalence classes of weakly conserved antisymmetric tensors,
$$\partial_\mu S^{\mu\nu} \approx 0, \qquad S^{\mu\nu} = S^{[\mu\nu]}, \tag{1.8}$$
where two such tensors are regarded as equivalent iff
$$S^{\mu\nu} - S'^{\mu\nu} \approx \partial_\rho R^{\rho\mu\nu}, \qquad R^{\rho\mu\nu} = R^{[\rho\mu\nu]}. \tag{1.9}$$
We shall denote the characteristic cohomological groups by $H^k_{char}(d)$.

Higher order conservation laws involving antisymmetric tensors of degree 2 or higher are quite interesting in their own right. In particular, conservation laws of the form (1.8), involving an antisymmetric tensor $S^{\mu\nu}$, have attracted a great deal of interest in the past [5] as well as recently [6, 7] in the context of the mechanism of "charge without charge" of Wheeler [8]. But the characteristic cohomology is also important for another reason: it appears as an auxiliary cohomology in the calculation of the local BRST cohomology [9]. This local BRST cohomology, in turn, is quite useful in the determination of the structure of the counterterms [10, 11] and the anomalies [12] in the quantum theory. It also plays a central role classically, in constraining the form of the consistent deformations of the action [13]. It is by establishing vanishing theorems for the characteristic cohomology that the problem of consistent deformations and of candidate anomalies has been completely solved in the cases of Yang-Mills gauge theories and of gravity [14, 6]. For this reason, it is an important question to determine the characteristic cohomological groups for any given theory.
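As a simple illustration of the conservation law (1.8), one may take free Maxwell theory with field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. This is a sketch under the assumption of the standard free Maxwell Lagrangian; the symbols $A_\mu$, $F_{\mu\nu}$ and the Hodge star $\star$ are used only for this example.

```latex
% Free Maxwell theory: the field strength itself is a weakly conserved
% antisymmetric tensor, i.e. a cocycle in form degree n-2.
\[
  S^{\mu\nu} = F^{\mu\nu}, \qquad
  \partial_\mu F^{\mu\nu} \approx 0
  \quad \text{(the field equations themselves)},
\]
% In form language, with \star F the (n-2)-form dual to F:
\[
  d(\star F) \approx 0 .
\]
```

Here the weak equality holds simply because the left-hand side of the first relation is the Maxwell equation, so it vanishes on-shell by definition.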
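The purpose of this paper is to carry out this task for a system of free antisymmetric tensor fields $B^a_{\mu_1 \dots \mu_{p_a}}$, $a = 1, \dots, N$, with Lagrangian
$$\mathcal{L} = \sum_a \left( \frac{-1}{2(p_a+1)!}\, H^a_{\mu_1 \dots \mu_{p_a+1}} H^{a\,\mu_1 \dots \mu_{p_a+1}} \right), \tag{1.10}$$
where the $H^a$'s are the "field strengths" or "curvatures",
$$H^a = \frac{1}{(p_a+1)!}\, H^a_{\mu_1 \dots \mu_{p_a+1}}\, dx^{\mu_1} \cdots dx^{\mu_{p_a+1}} = dB^a, \tag{1.11}$$
$$B^a = \frac{1}{p_a!}\, B^a_{\mu_1 \dots \mu_{p_a}}\, dx^{\mu_1} \cdots dx^{\mu_{p_a}}. \tag{1.12}$$
The equations of motion, obtained by varying the fields $B^a_{\mu_1 \dots \mu_{p_a}}$, are given by
$$\partial_\rho H^{a\,\rho\mu_1 \dots \mu_{p_a}} = 0. \tag{1.13}$$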
We consider simultaneously antisymmetric tensors of different degrees, but we assume $n > p_a + 1$ for each $a$ so that the fields $B^a_{\mu_1 \dots \mu_{p_a}}$ all carry local degrees of freedom. Modifications of the Lagrangian by gauge invariant interactions are treated at the end of the paper. We give complete results for the characteristic cohomology in degree $< n - 1$, that is, we determine all the solutions to the equation $\partial_\mu S^{\mu\mu_1 \dots \mu_s} \approx 0$ with $s > 0$. Although we do not solve the characteristic cohomology in degree $n - 1$, we comment on the gauge invariance properties of the conserved currents and provide an infinite number of them, generalizing earlier results of the Maxwell case [15, 16, 17] (footnote 1).

Footnote 1: The determination of all the conserved currents is of course also an interesting question, but it is not systematically pursued here for two reasons. First, the characteristic cohomology $H^{n-1}_{char}(d)$ is infinite-dimensional for the free theories considered here and does not appear to be completely known even in the Maxwell case in an arbitrary number of dimensions. By contrast, the cohomological groups $H^k_{char}(d)$, $k < n - 1$, are all finite-dimensional and can be explicitly computed. Second, the group $H^{n-1}_{char}(d)$ plays no role in the analysis of the consistent interactions of antisymmetric tensor fields of degree $> 1$, as well as in the analysis of candidate anomalies if the antisymmetric tensor fields all have degree $> 2$ [18].

The results of this paper will be used in [18] to compute the BRST cohomology of free, antisymmetric tensor fields. This is a necessary step not only for determining the possible consistent interactions that can be added to the free Lagrangian, but also for analyzing completely the BRST cohomology in the interacting case. Our results have already been used and partly announced in [19] to show the uniqueness of the Freedman-Townsend deformation of the gauge symmetries of a system of antisymmetric tensors of degree 2 in four dimensions.

Antisymmetric tensor fields - or, as one also says, p-form gauge fields - have been much studied in the past [20, 21, 22, 23, 24] and are crucial ingredients of string theory and of various supergravity models [25]. The main feature of theories involving p-form gauge fields is that their gauge symmetries are reducible. More precisely, in the present case, the Lagrangian (1.10) is invariant under the gauge transformations
$$B^a \to B'^a = B^a + d\Lambda^a, \tag{1.14}$$
where $\Lambda^a$ are arbitrary $(p_a - 1)$-forms. Now, if $\Lambda^a = d\epsilon^a$, then the variation of $B^a$ vanishes identically. Thus, the gauge parameters $\Lambda^a$ do not all provide independent gauge symmetries: the gauge transformations (1.14) are reducible. In the same way, if $\epsilon^a$ is equal to $d\mu^a$, then it yields a vanishing $\Lambda^a$. There is "reducibility of reducibility" unless $\epsilon^a$ is a zero form. If $\epsilon^a$ is not a zero form, the process keeps going until one reaches 0-forms. For the theory with Lagrangian (1.10), there are thus $p_M - 1$ stages of reducibility of the gauge transformations ($\Lambda^a$ is a $(p_a - 1)$-form), where $p_M$ is the degree of the form of highest degree occurring in (1.10) [26, 27, 28, 29]. One says that the theory is a reducible gauge theory of reducibility order $p_M - 1$.

General vanishing theorems have been established in [1, 2, 3, 9] showing that the characteristic cohomology of reducible theories of reducibility order $p - 1$ vanishes in form degree strictly smaller than $n - p - 1$. Accordingly, in the case of p-form gauge theories, there can be a priori non-vanishing characteristic cohomology only in form degree $n - p_M - 1$, $n - p_M$, etc., up to form degree $n - 1$ (conserved currents). In the 1-form case, these are the best vanishing theorems one can prove, since a set of free gauge fields $A^a_\mu$ has characteristic cohomology both in form degree $n - 1$ and $n - 2$ [9]. Representatives of the cohomology classes in form degree $n - 2$ are given by the duals to the field strengths, which are indeed closed on-shell due to Maxwell equations.

Our main result is that the general vanishing theorems of [1, 2, 3, 9] can be considerably strengthened when $p > 1$. For instance, if there is a single p-form gauge field and if $n - p - 1$ is odd, there is only one non-vanishing group of the characteristic cohomology in degree $< n - 1$. This is $H^{n-p-1}_{char}(d)$, which is one-dimensional. All the other groups $H^k_{char}(d)$ of the characteristic cohomology with $n - p - 1 < k < n - 1$ are zero, even though the general theorems of [1, 2, 3, 9] leave open the possibility that they do not vanish. As we shall show in [18], it is the presence of these additional zeros that gives p-form gauge fields and gauge transformations their strong rigidity.
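To make the degree counting concrete, here is a small worked case, chosen purely for illustration: a single 2-form $B$ in $n = 4$ dimensions (so $p = p_M = 2$). The numbers follow from the statements above; nothing beyond them is assumed.

```latex
% Single 2-form in n = 4: gauge structure and allowed form degrees.
\[
  B \to B + d\Lambda \ (\Lambda \text{ a 1-form}), \qquad
  \Lambda \to \Lambda + d\epsilon \ (\epsilon \text{ a 0-form}),
\]
% so there is p_M - 1 = 1 stage of reducibility.
% General vanishing theorems: H^k_{char}(d) = 0 for k < n - p - 1 = 1,
% leaving k = 1, 2, 3 a priori open.
\[
  n - p - 1 = 1 \ \text{(odd)}
  \;\Longrightarrow\;
  H^1_{char}(d) \ \text{one-dimensional}, \qquad
  H^2_{char}(d) = 0 .
\]
% k = 3 = n - 1 (conserved currents) is not covered by this statement.
```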
Besides the standard characteristic cohomology, one may consider the invariant characteristic cohomology, in which the local forms $\omega$ and $\psi$ occurring in (1.6) and (1.7) are required to be invariant under the gauge transformations (1.14). We also completely determine in this paper the invariant characteristic cohomology in form degree $< n - 1$.

Our method for computing the characteristic cohomology is based on the reformulation performed in [9] of the characteristic cohomology in form degree $k$ in terms of the cohomology $H^n_{n-k}(\delta|d)$ of the Koszul-Tate differential $\delta$ modulo the spacetime exterior derivative $d$. Here, $n$ is the form degree and $n - k$ is the antighost number. This approach is strongly motivated by the BRST construction and appears to be particularly attractive and powerful.

Our paper is organized as follows. In the next section, we formulate precisely our main results, which are (i) that the characteristic cohomology $H^k_{char}(d)$ with $k < n - 1$ is generated (in the exterior product) by the exterior forms $\bar{H}^a$ dual to the field strengths $H^a$; these are forms of degree $n - p_a - 1$; and (ii) that the invariant characteristic cohomology $H^{k,inv}_{char}(d)$ with $k < n - 1$ is generated (again in the exterior product) by the exterior forms $H^a$ and $\bar{H}^a$. We then review, in Sects. 3 and 4, the definition and properties of the Koszul-Tate complex. Section 5 is of a more technical nature and relates the characteristic cohomology to the cohomology of the differential $\delta + d$, where $\delta$ is the Koszul-Tate differential. Section 6 analyses the gauge invariance properties of $\delta$-boundaries modulo $d$. In Sect. 7, we determine the characteristic cohomology for a single p-form gauge field. The results are then extended to an arbitrary system of p-form gauge fields in Sect. 8. The invariant cohomology is analyzed in Sect. 9. Section 10 discusses in detail the cohomological groups $H^*(\delta|d)$, which play a key role in the calculation of the local BRST cohomological groups $H^*(s|d)$. In Sect. 11, we show that the existence of representatives expressible in terms of the $\bar{H}^a$'s does not extend to the characteristic cohomology in form degree $n - 1$, by exhibiting an infinite number of (inequivalent) conserved currents which are not of that form. We show next in Sect. 12 that the results on the free characteristic cohomology in degree $< n - 1$ can be generalized straightforwardly if one adds to the free Lagrangian (1.10) gauge invariant interaction terms that involve the fields $B^a_{\mu_1 \dots \mu_{p_a}}$ and their derivatives only through the gauge invariant field strength components and their derivatives (which are in general the only consistent interactions that one can add). We conclude in Sect. 13 by summarizing our results and indicating future lines of research.

We assume throughout this paper that spacetime is the n-dimensional Minkowski space, so that the indices in (1.10) are raised with the inverse $\eta^{\mu\nu}$ of the flat Minkowski metric $\eta_{\mu\nu}$. However, because of their geometrical character, our results generalize straightforwardly to curved backgrounds.
2. Results

2.1. Characteristic cohomology. The equations of motion (1.13) can be rewritten as
$$d\bar{H}^a \approx 0 \tag{2.1}$$
in terms of the $(n - p_a - 1)$-forms $\bar{H}^a$ dual to the field strengths. It then follows that any polynomial in the $\bar{H}^a$'s is closed on-shell and thus defines a cocycle of the characteristic cohomology. The remarkable feature is that these polynomials are not only inequivalent in cohomology, but also completely exhaust the characteristic cohomology in form degree strictly smaller than $n - 1$. Indeed, one has:

Theorem 2.1. Let $\bar{\mathcal{H}}$ be the algebra generated by the $\bar{H}^a$'s and let $\mathcal{V}$ be the subspace containing the polynomials in the $\bar{H}^a$'s with no term of form degree exceeding $n - 2$. The subspace $\mathcal{V}$ is isomorphic to the characteristic cohomology in form degree $< n - 1$.
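Two small cases, chosen here only as illustrations of Theorem 2.1 for a single p-form, show how the subspace $\mathcal{V}$ is assembled. They use nothing beyond the form degree $n - p - 1$ of $\bar{H}$ and the graded commutativity of exterior forms.

```latex
% Case 1: n = 6, p = 3. \bar{H} has even degree n - p - 1 = 2,
% so its powers do not vanish identically.
\[
  \mathcal{V} = \mathrm{span}\{\, 1,\ \bar{H},\ \bar{H}^2 \,\},
  \qquad \deg \bar{H}^2 = 4 = n - 2, \quad
  \deg \bar{H}^3 = 6 > n - 2 .
\]
% Case 2: n = 4, p = 2. \bar{H} has odd degree 1, hence \bar{H}^2 = 0.
\[
  \mathcal{V} = \mathrm{span}\{\, 1,\ \bar{H} \,\} .
\]
```

In the first case the cubic term is excluded because its form degree exceeds $n - 2$; in the second the quadratic term vanishes by anticommutativity.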
We stress again that the theorem does not hold in degree $n - 1$ because there exist conserved currents not expressible in terms of the $\bar{H}^a$'s. Since the form degree is limited by the spacetime dimension $n$, and since $\bar{H}^a$ has form degree $n - p_a - 1$, which is strictly positive (as explained in the introduction, we assume $n - p_a - 1 > 0$ for each $a$), the algebra $\bar{\mathcal{H}}$ is finite-dimensional. In that algebra, the $\bar{H}^a$ with even $n - p_a - 1$ commute with all the other generators, while the $\bar{H}^a$ with odd $n - p_a - 1$ are anticommuting objects.

2.2. Invariant characteristic cohomology. While the cocycles of Theorem 2.1 are all gauge invariant, there exist coboundaries of the characteristic cohomology that are gauge invariant, i.e., that involve only the field strength components and their derivatives, but which cannot, nevertheless, be written as coboundaries of gauge invariant local forms, even weakly. Examples are given by the field strengths $H^a = dB^a$ themselves. For this reason, the invariant characteristic cohomology and the characteristic cohomology do not coincide. We shall denote by $\mathcal{H}$ the finite-dimensional algebra generated by the $(p_a + 1)$-forms $H^a$, and by $\mathcal{J}$ the finite-dimensional algebra generated by the field strengths $H^a$ and their duals $\bar{H}^a$. One has
Theorem 2.2. Let $\mathcal{W}$ be the subspace of $\mathcal{J}$ containing the polynomials in the $H^a$'s and the $\bar{H}^a$'s with no term of form degree exceeding $n - 2$. The subspace $\mathcal{W}$ is isomorphic to the invariant characteristic cohomology in form degree $< n - 1$.

Our paper is devoted to proving these theorems.

2.3. Cohomologies in the algebra of x-independent forms. The previous theorems hold as they are formulated in the algebra of local forms that are allowed to have an explicit x-dependence. The explicit x-dependence enables one to remove the constant k-forms ($k > 0$) from the cohomology, since these are exact,
$$c_{i_1 i_2 \dots i_k}\, dx^{i_1} dx^{i_2} \cdots dx^{i_k} = d\left(c_{i_1 i_2 \dots i_k}\, x^{i_1} dx^{i_2} \cdots dx^{i_k}\right).$$
If one restricts one's attention to the algebra of local forms with no explicit dependence on the spacetime coordinates, then one must replace in the above theorems the polynomials in the curvatures and their duals with coefficients that are numbers by the polynomials in the curvatures and their duals with coefficients that are constant exterior forms.

Note that the constant exterior forms can alternatively be gotten rid of without introducing an explicit x-dependence, by imposing Lorentz invariance (there is no Lorentz-invariant constant k-form for $0 < k < n$).
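The exactness of constant forms in the x-dependent algebra can be seen in the simplest case $k = 2$. The following is only an illustrative check, with $c_{12}$ a constant chosen for the example.

```latex
% A constant 2-form is d of an explicitly x-dependent 1-form.
\[
  d\left( c_{12}\, x^{1}\, dx^{2} \right)
  = c_{12}\, dx^{1}\, dx^{2},
\]
% since d x^1 = dx^1, d(dx^2) = 0 and c_{12} is constant.
% The 1-form c_{12} x^1 dx^2 depends explicitly on x, so it is not
% available in the algebra of x-independent forms.
```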
3. Koszul-Tate Complex

The definition of the cocycles of the characteristic cohomology $H^k_{char}(d)$ involves "weak" equations holding only on-shell. It is convenient to replace them by "strong" equations holding everywhere in field space, and not just when the equations of motion are satisfied. The reason is that the coefficients of the equations of motion in the conservation laws are not arbitrary, but are subject to restrictions whose analysis yields useful insight on the conservation laws themselves. From this point of view, Eq. (1.4) involving the coefficients $\lambda^{i\mu_1 \dots \mu_j}$ is a more interesting starting point than Eq. (1.3).

One useful way to replace weak equations by strong equations is to introduce the Koszul-Tate resolution associated with the equations of motion (1.13). The details of the construction of the Koszul-Tate differential $\delta$ can be found in [30]. Because the present theory is reducible, we must introduce the following set of BV-antifields [31]:
$$B^{*a\,\mu_1 \dots \mu_{p_a}},\ B^{*a\,\mu_1 \dots \mu_{p_a-1}},\ \dots,\ B^{*a\,\mu},\ B^{*a}. \tag{3.1}$$
The Grassmann parity and the antighost number of the antifields $B^{*a\,\mu_1 \dots \mu_{p_a}}$ associated with the fields $B^a_{\mu_1 \dots \mu_{p_a}}$ are equal to 1. The Grassmann parity and the antighost number of the other antifields are determined according to the following rule. As one moves from one term to the next one to its right in (3.1), the Grassmann parity changes and the antighost number increases by one unit. Therefore the parity and the antighost number of a given antifield $B^{*a\,\mu_1 \dots \mu_{p_a-j}}$ are respectively $j + 1$ modulo 2 and $j + 1$.

The Koszul-Tate differential acts in the algebra $\mathcal{P}$ of local exterior forms. By definition, a local exterior form $\omega$ reads
$$\omega = \sum \omega_{\mu_1 \dots \mu_j}\, dx^{\mu_1} \cdots dx^{\mu_j}, \tag{3.2}$$
where the coefficients $\omega_{\mu_1 \dots \mu_j}$ are smooth functions of the coordinates $x^\mu$, the fields $B^a_{\mu_1 \dots \mu_{p_a}}$, the antifields (3.1), and their derivatives up to a finite order. Although this is
not strictly necessary, we shall actually assume polynomiality in the fields, the antifields and their derivatives, as this is the situation encountered in field theory. The Koszul-Tate differential is defined by its action on the fields and the antifields as follows:
$$\delta B^a_{\mu_1 \dots \mu_{p_a}} = 0, \tag{3.3}$$
$$\delta B^{*a\,\mu_1 \dots \mu_{p_a}} = \partial_\rho H^{a\,\rho\mu_1 \dots \mu_{p_a}}, \tag{3.4}$$
$$\delta B^{*a\,\mu_1 \dots \mu_{p_a-1}} = \partial_\rho B^{*a\,\rho\mu_1 \dots \mu_{p_a-1}}, \tag{3.5}$$
$$\vdots$$
$$\delta B^{*a\,\mu} = \partial_\rho B^{*a\,\rho\mu}, \tag{3.6}$$
$$\delta B^{*a} = \partial_\rho B^{*a\,\rho}. \tag{3.7}$$
Furthermore we have
$$\delta x^\mu = 0, \qquad \delta(dx^\mu) = 0. \tag{3.8}$$
The action of $\delta$ is extended to an arbitrary element in $\mathcal{P}$ by using the rule
$$\delta \partial_\mu = \partial_\mu \delta, \tag{3.9}$$
and the fact that $\delta$ is an odd derivation which we take here to act from the left,
$$\delta(ab) = (\delta a) b + (-)^{\epsilon_a} a\, \delta b. \tag{3.10}$$
In (3.10), $\epsilon_a$ is the Grassmann parity of the (homogeneous) element $a$. These rules make $\delta$ a differential and one has the following important property [32, 33, 30, 34]:

Theorem 3.1. $H_i(\delta) = 0$ for $i > 0$, where $i$ is the antighost number, i.e., the cohomology of $\delta$ is empty in antighost number strictly greater than zero.

One can also show that in degree zero, the cohomology of $\delta$ is the algebra of "on-shell functions" [32, 33, 30, 34]. Thus, the Koszul-Tate complex provides a resolution of that algebra. For the reader unaware of the BRST developments, one may view this property as the motivation for the definitions (3.3) through (3.7). One has a similar theorem for the cohomology of the exterior derivative $d$ (for which we also take a left action, $d(ab) = (da) b + (-)^{\epsilon_a} a\, db$):

Theorem 3.2. The cohomology of $d$ in the algebra of local forms is given by
$$H^0(d) \simeq \mathbb{R}, \tag{3.11}$$
$$H^k(d) = 0 \quad \text{for } k \neq 0,\ k \neq n, \tag{3.12}$$
$$H^n(d) \simeq \text{space of equivalence classes of local forms}, \tag{3.13}$$
where $k$ is the form degree and $n$ the spacetime dimension. In (3.13), two local forms are said to be equivalent if and only if they have identical Euler-Lagrange derivatives with respect to all the fields and the antifields.
Proof. This theorem is known as the algebraic Poincaré Lemma. For various proofs, see [2, 35, 36, 37]. □

It should be mentioned that the theorem holds as such because we allow for an explicit x-dependence of the local exterior forms (3.2). If the local forms had no explicit x-dependence, then (3.12) would have to be amended as
$$H^k(d) \simeq \{\text{constant forms}\} \quad \text{for } k \neq 0,\ k \neq n, \tag{3.14}$$
where the constant forms are by definition the local exterior forms (3.2) with constant coefficients. We shall denote in the sequel the algebra of constant forms by $\Lambda^*$ and the subspace of constant forms of degree $k$ by $\Lambda^k$. The following formulation of the Poincaré lemma is also useful.

Theorem 3.3. Let $a$ be a local, closed k-form ($k < n$) that vanishes when the fields and the antifields are set equal to zero. Then, $a$ is d-exact.

Proof. The condition that $a$ vanishes when the fields and the antifields are set equal to zero eliminates the constants. This form of the Poincaré lemma holds in both the algebras of x-dependent and x-independent local exterior forms. □
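As a consistency check on the definitions (3.3) through (3.7), one can verify directly that $\delta^2 = 0$ on the first two kinds of antifields. The computation below is a sketch that uses only (3.4), (3.5), the commutation rule (3.9) and the total antisymmetry of $H^a$ and of the antifields in their spacetime indices.

```latex
% delta^2 on the antifield of antighost number 2:
\[
  \delta^2 B^{*a\,\mu_1 \dots \mu_{p_a-1}}
  = \partial_\rho\, \delta B^{*a\,\rho\mu_1 \dots \mu_{p_a-1}}
  = \partial_\rho \partial_\sigma H^{a\,\sigma\rho\mu_1 \dots \mu_{p_a-1}}
  = 0,
\]
% because \partial_\rho \partial_\sigma is symmetric in (\rho, \sigma)
% while H^a is antisymmetric. The same argument applies one step further:
\[
  \delta^2 B^{*a\,\mu_1 \dots \mu_{p_a-2}}
  = \partial_\rho \partial_\sigma B^{*a\,\sigma\rho\mu_1 \dots \mu_{p_a-2}}
  = 0 .
\]
```

On $B^{*a\,\mu_1 \dots \mu_{p_a}}$ itself, $\delta^2$ vanishes because $\delta$ annihilates the fields by (3.3), hence also $H^a$.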
4. Characteristic Cohomology and Koszul-Tate Complex

Our analysis of the characteristic cohomology relies upon the isomorphism established in [9] between $H^*_{char}(d)$ and the cohomology $H^*_*(\delta|d)$ of $\delta$ modulo $d$. The cohomology $H^k_i(\delta|d)$ in form degree $k$ and antighost number $i$ is obtained by solving in the algebra of local exterior forms the equation
$$\delta a^k_i + d b^{k-1}_{i-1} = 0, \tag{4.1}$$
and by identifying solutions which differ by $\delta$-exact and $d$-exact terms, i.e.,
$$a^k_i \sim a'^k_i = a^k_i + \delta n^k_{i+1} + d m^{k-1}_i. \tag{4.2}$$
One has

Theorem 4.1.
$$H^k_{char}(d) \simeq H^n_{n-k}(\delta|d), \quad 0 < k < n, \tag{4.3}$$
$$\frac{H^0_{char}(d)}{\mathbb{R}} \simeq H^n_n(\delta|d), \tag{4.4}$$
$$0 \simeq H^n_{n+k}(\delta|d), \quad k > 0. \tag{4.5}$$

Proof. Although the proof is standard and can be found in [37, 9], we shall repeat it explicitly here because it involves ingredients which will be needed below. Let $\alpha$ be a class of $H^k_{char}(d)$ ($k < n$) and let $a^k_0$ be a representative of $\alpha$, $\alpha = [a^k_0]$. One has
$$\delta a^{k+1}_1 + d a^k_0 = 0 \tag{4.6}$$
for some $a^{k+1}_1$ since any antifield-independent form that is zero on-shell can be written as the $\delta$ of something. By acting with $d$ on this equation, one finds that $d a^{k+1}_1$ is $\delta$-closed and thus, by Theorem 3.1, that it is $\delta$-exact, $\delta a^{k+2}_2 + d a^{k+1}_1 = 0$ for some $a^{k+2}_2$. One can repeat the procedure until one reaches degree $n$, the last term $a^n_{n-k}$ fulfilling
$$\delta a^n_{n-k} + d a^{n-1}_{n-k-1} = 0, \tag{4.7}$$
and, of course, $d a^n_{n-k} = 0$ (it is an n-form). For future reference we collect all the terms appearing in this tower of equations as
$$a^k = a^n_{n-k} + a^{n-1}_{n-k-1} + \cdots + a^{k+1}_1 + a^k_0. \tag{4.8}$$
Equation (4.7) shows that $a^n_{n-k}$ is a cocycle of the cohomology of $\delta$ modulo $d$, in form degree $n$ and antighost number $n - k$. Now, given the cohomological class $\alpha$ of $H^k_{char}(d)$, it is easy to see, using again Theorem 3.1, that the corresponding element $a^n_{n-k}$ is well-defined in $H^n_{n-k}(\delta|d)$. Consequently, the above procedure defines a non-ambiguous map $m$ from $H^k_{char}(d)$ to $H^n_{n-k}(\delta|d)$.

This map is surjective. Indeed, let $a^n_{n-k}$ be a cocycle of $H^n_{n-k}(\delta|d)$. By acting with $d$ on Eq. (4.7) and using the second form of the Poincaré lemma (Theorem 3.3), one finds that $a^{n-1}_{n-k-1}$ is also $\delta$-closed modulo $d$. Repeating the procedure all the way down to antighost number zero, one sees that there exists a cocycle $a^k_0$ of the characteristic cohomology such that $m([a^k_0]) = [a^n_{n-k}]$.

The map $m$ is not quite injective, however, because of the constants. Assume that $a^k_0$ is mapped on zero. This means that the corresponding $a^n_{n-k}$ is trivial in $H^n_{n-k}(\delta|d)$, i.e., $a^n_{n-k} = \delta b^n_{n-k+1} + d b^{n-1}_{n-k}$. Using the Poincaré lemma (in the second form) one then finds successively that $a^{n-1}_{n-k-1}$, ..., up to $a^{k+1}_1$ are all trivial. The last term $a^k_0$ fulfills $d a^k_0 + \delta d b^k_1 = 0$ and thus, by the Poincaré lemma (Theorem 3.2), $a^k_0 = \delta b^k_1 + d b^{k-1}_0 + c^k$. In the algebra of x-dependent local forms, the constant k-form $c^k$ is present only if $k = 0$. This establishes (4.3) and (4.4). That $H^n_m(\delta|d)$ vanishes for $m > n$ is proved in [34]. □

The proof of the theorem shows also that (4.3) holds as such because one allows for an explicit x-dependence of the local forms. Otherwise, one must take into account the constant forms $c^k$ which appear in the analysis of injectivity and which are no longer exact even when $k > 0$, so that (4.3) becomes
$$\frac{H^k_{char}(d)}{\Lambda^k} \simeq H^n_{n-k}(\delta|d), \tag{4.9}$$
while (4.4) and (4.5) remain unchanged.
5. Characteristic Cohomology and Cohomology of $\Delta = \delta + d$

It is convenient to rewrite the Koszul-Tate differential in form notations. Denoting the duals with an overline to avoid confusion with the antifield *-notation, and redefining the antifields by appropriate multiplicative constants, one finds that Eqs. (3.4) through (3.7) become simply
$$\delta \bar{B}^{*a}_1 + d\bar{H}^a = 0, \qquad \delta \bar{B}^{*a}_2 + d\bar{B}^{*a}_1 = 0, \qquad \dots, \qquad \delta \bar{B}^{*a}_{p_a+1} + d\bar{B}^{*a}_{p_a} = 0. \tag{5.1}$$
--$a ~ " - E l * a / y . | . I[~p ~-I j TheformB j dual to the antisymmetrlc tensor aenslty D "" ~ - (j = 1, . . . , p a + l ) has (i) form degree equal to n - pa - 1 + j ; and (ii) antighost number equal to j. Since B*~U~...t,,~+,-j has Grassmann parity j and since the product of (n - Pa - 1 + j ) dx's has Grassmann parity n - p~ - 1 + j , each B ; a has same Grassmann parity n - Pa - 1
(modulo 2), irrespective o f j . This is the same parity as that of the n - p~ - 1-form H " dual to the field strengths. Equation (5.1) can be rewritten as A/7/a = 0
(5.2)
A=~+d
(5.3)
with and
pa+l ~
(5.4)
j=l
The parity of the exterior form $\tilde{H}^a$ is equal to $n - p_a - 1$. The regrouping of physical fields with ghost-like variables is quite standard in BRST theory [38]. Expressions similar (but not identical) to (5.4) have appeared in the analysis of the Freedman-Townsend model and of string field theory [39, 40], as well as in the context of topological models [41, 42]. Note that for a one-form, expression (5.4) reduces to Eq. (9.8) of [14]. Quite generally, it should be noted that the dual $\overline{H}^a$ to the field strength $H^a$ is the term of lowest form degree in $\tilde{H}^a$. It is also the term of lowest antighost number, namely, zero. At the other end, the term of highest form degree in $\tilde{H}^a$ is $\overline{B}^{*a}_{p_a+1}$, which has form degree $n$ and antighost number $p_a + 1$. If we call the difference between the form degree and the antighost number the "$\Delta$-degree", all the terms present in the expansion of $\tilde{H}^a$ have the same $\Delta$-degree, namely $n - p_a - 1$. The differential $\Delta = \delta + d$ enables one to reformulate the characteristic cohomology as the cohomology of $\Delta$. Indeed one has
Theorem 5.1. The cohomology of $\Delta$ is isomorphic to the characteristic cohomology,
$$H^k(\Delta) \simeq H^k_{char}(d), \qquad 0 \le k \le n, \tag{5.5}$$
where $k$ in $H^k(\Delta)$ is the $\Delta$-degree, and in $H^k_{char}(d)$ is the form degree.
Proof. Let $a^k_0$ ($k < n$) be a cocycle of the characteristic cohomology. Construct $a^k$ as in the proof of Theorem 4.1, formula (4.8). The form $a^k$ is easily seen to be a cocycle of $\Delta$, $\Delta a^k = 0$, and furthermore, to be uniquely defined in cohomology given the class of $a^k_0$. We leave it to the reader to check that the map so defined is both injective and surjective. This proves the theorem for $k < n$. For $k = n$, the isomorphism of $H^n(\Delta)$ and $H^n_{char}(d)$ is even more direct ($da^n_0 = 0$ is equivalent to $\Delta a^n_0 = 0$ and $a^n_0 = db^{n-1}_0 + \delta b^n_1$ is equivalent to $a^n_0 = \Delta(b^{n-1}_0 + b^n_1)$). □
Our discussion has also established the following useful rule: the term of lowest form degree in a $\Delta$-cocycle $a$ is a cocycle of the characteristic cohomology. Its form degree is equal to the $\Delta$-degree $k$ of $a$. For $a = \tilde{H}^a$, this reproduces the rule discussed above Theorem 5.1. Similarly, the term of highest form degree in $a$ has always form degree equal to $n$ if $a$ is not a $\Delta$-coboundary (up to a constant), and defines an element of $H^n_{n-k}(\delta|d)$.
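The degree bookkeeping behind the $\Delta$-degree can be checked mechanically. The following is only an illustrative sketch: the function name `htilde_components` and the tuple layout are assumptions introduced here, not notation from the text. It lists the pieces $\overline{H}$ and $\overline{B}^*_j$ of $\tilde{H}$ for a single $p$-form in $n$ dimensions, using the form degrees and antighost numbers stated above, and computes their $\Delta$-degree and Grassmann parity.

```python
def htilde_components(n, p):
    """Tabulate the pieces of H-tilde for one p-form in n dimensions.

    Returns tuples (name, form_degree, antighost, delta_degree, parity).
    Hypothetical helper for illustration only.
    """
    if p < 1 or n <= p + 1:
        raise ValueError("need p >= 1 and n > p + 1")
    # Hbar has form degree n - p - 1 and antighost number 0;
    # Bbar*_j has form degree n - p - 1 + j and antighost number j.
    rows = [("Hbar", n - p - 1, 0)]
    for j in range(1, p + 2):
        rows.append((f"Bbar*_{j}", n - p - 1 + j, j))
    table = []
    for name, form_degree, antighost in rows:
        # Delta-degree = form degree minus antighost number.
        delta_degree = form_degree - antighost
        # Parity = parity of the dx's plus parity of the coefficient
        # (the coefficient of Bbar*_j has Grassmann parity j).
        parity = (form_degree + antighost) % 2
        table.append((name, form_degree, antighost, delta_degree, parity))
    return table


if __name__ == "__main__":
    for row in htilde_components(5, 1):
        print(row)
```

Every row should show the same $\Delta$-degree $n - p - 1$ and the same parity $(n - p - 1) \bmod 2$, and the last row should have form degree $n$.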
Characteristic Cohomologyof p-FormGauge Theories
147
Because $\Delta$ is a derivation, its cocycles form an algebra. Therefore, any polynomial in the $\tilde{H}^a$ is also a $\Delta$-cocycle. Since the form degree is limited by the spacetime dimension $n$, and since the term $\overline{H}^a$ with minimum form degree in $\tilde{H}^a$ has form degree $n - p_a - 1$, which is strictly positive, the algebra generated by the $\tilde{H}^a$ is finite-dimensional. We shall show below that these $\Delta$-cocycles are not exact and that any cocycle of form degree $< n - 1$ is a polynomial in the $\tilde{H}^a$ modulo trivial terms. According to the isomorphism expressed by Theorem 5.1, this is equivalent to proving Theorem 2.1.
Remarks. (i) The $\Delta$-cocycle associated with a conserved current contains only two terms,
$$a = a^n_1 + a^{n-1}_0, \tag{5.6}$$
where $a^{n-1}_0$ is the dual to the conserved current in question. The product of such a $\Delta$-cocycle with a $\Delta$-cocycle of $\Delta$-degree $k$ has $\Delta$-degree $n - 1 + k$ and therefore vanishes unless $k = 0$ or $1$.
(ii) It will be useful below to introduce another degree $N$ as follows. One assigns $N$-degree 0 to the undifferentiated fields and $N$-degree 1 to all the antifields irrespective of their antighost number. One then extends the $N$-degree to the differentiated variables according to the rule $N(\partial_\mu \Phi) = N(\Phi) + 1$. Thus, $N$ counts the number of derivatives and of antifields. Explicitly,
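$$N = \sum_a N_a \tag{5.7}$$
with
$$N_a = \sum_J \left[ |J| \sum_i \partial_J B^a_i \frac{\partial}{\partial(\partial_J B^a_i)} + (|J| + 1) \sum_\alpha \partial_J \phi^{*a}_\alpha \frac{\partial}{\partial(\partial_J \phi^{*a}_\alpha)} \right], \tag{5.8}$$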
where (i) the sum over $J$ is a sum over all possible derivatives including the zeroth order one; (ii) $|J|$ is the differential order of the derivative $\partial_J$ (i.e., $|J| = k$ for $\partial_{\mu_1 \ldots \mu_k}$); (iii) the sum over $i$ stands for the sum over the independent components of $B^a$; and (iv) the sum over $\alpha$ is a sum over the independent components of all the antifields appearing in the tower associated with $B^a$ (but there is no sum over the $p$-form species $a$ in (5.8)). The differential $\delta$ increases $N$ by one unit. The differentials $d$ and $\Delta$ have in addition an inhomogeneous piece not changing the $N$-degree, namely $dx^\mu (\partial^{explicit}/\partial x^\mu)$, where $\partial^{explicit}/\partial x^\mu$ sees only the explicit $x^\mu$-dependence. The forms $\tilde{H}^a$ have $N$-degree equal to one.
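The counting rule for the $N$-degree (one unit per derivative, plus one unit per antifield) is simple enough to state as code. The sketch below is an illustration under assumed conventions: a monomial is encoded as a list of `(kind, n_derivatives)` pairs, and both this encoding and the name `n_degree` are hypothetical.

```python
def n_degree(monomial):
    """N-degree of a monomial in the fields, antifields and their derivatives.

    `monomial` is a list of (kind, n_derivatives) pairs, where kind is
    "field" or "antifield". Illustrative encoding, not from the text.
    """
    total = 0
    for kind, n_derivatives in monomial:
        if kind not in ("field", "antifield"):
            raise ValueError(f"unknown kind: {kind!r}")
        if n_derivatives < 0:
            raise ValueError("number of derivatives must be non-negative")
        # Each derivative raises N by one; an antifield starts at N = 1.
        total += n_derivatives + (1 if kind == "antifield" else 0)
    return total


if __name__ == "__main__":
    print(n_degree([("field", 1)]))      # a field strength component
    print(n_degree([("antifield", 0)]))  # an undifferentiated antifield
```

With this encoding a field strength component (one derivative of a field) and an undifferentiated antifield both have $N$-degree one, in line with the statement that the forms $\tilde{H}^a$ have $N$-degree equal to one.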
6. Acyclicity and Gauge Invariance
6.1. Preliminary results. Under the gauge transformations (1.14) of the $p$-form gauge fields, the field strengths and their derivatives are gauge invariant. These are the only invariant objects that can be formed out of the "potentials" $B^a_{\mu_1 \ldots \mu_{p_a}}$ and their derivatives. We shall denote by $\mathcal{I}_{Small}$ the algebra of local exterior forms with coefficients $\omega_{\mu_1 \ldots \mu_J}$ that depend only on the field strength components and their derivatives (and possibly $x^\mu$). The algebras $\mathcal{H}$, $\overline{\mathcal{H}}$ and $\mathcal{J}$ respectively generated by the $(p_a + 1)$-forms $H^a$, the $(n - p_a - 1)$-forms $\overline{H}^a$ and ($H^a$, $\overline{H}^a$) are subalgebras of $\mathcal{I}_{Small}$. Since the field equations are gauge invariant and since $d$ maps $\mathcal{I}_{Small}$ on $\mathcal{I}_{Small}$, one can consider the cohomological problem (1.6), (1.7) in the algebra $\mathcal{I}_{Small}$. This defines the invariant characteristic cohomology $H^{*,inv}_{char}(d)$.
It is natural to decree that the antifields and their derivatives are also invariant. This can be more fully justified within the BRST context, using the property that the gauge transformations are abelian, but here, it can simply be taken as a useful, consistent postulate. With these conventions, the differentials $\delta$, $d$ and $\Delta$ map the algebra $\mathcal{I}$ of invariant polynomials in the field strength components, the antifield components and their derivatives on itself. Clearly, $\mathcal{I}_{Small} \subset \mathcal{I}$. The invariant cohomologies $H^{*,inv}(\Delta)$ and $H^{n,inv}_j(\delta|d)$ are defined by considering only local exterior forms that belong to $\mathcal{I}$. In order to analyze the invariant characteristic cohomology and to prove the nontriviality of the cocycles listed in Theorem 2.1, we shall need some preliminary results on the invariant cohomologies of the Koszul-Tate differential $\delta$ and of $d$. The variables generating the algebra $\mathcal{P}$ of local forms are, together with $x^\mu$ and $dx^\mu$,
$$B^a_{\mu_1 \ldots \mu_{p_a}},\ \partial_\rho B^a_{\mu_1 \ldots \mu_{p_a}},\ \ldots,\ B^{*a\,\mu_1 \ldots \mu_{p_a - m}},\ \partial_\rho B^{*a\,\mu_1 \ldots \mu_{p_a - m}},\ \ldots,\ B^{*a},\ \partial_\rho B^{*a},\ \ldots$$
These variables can be conveniently split into two subsets. The first subset of generators will be collectively denoted by the letter $\chi$. They are given by the field strengths ($H^a_{\mu_1 \ldots \mu_{p_a+1}}$) and their derivatives, the antifields and their derivatives. The field strengths and their derivatives are not independent, since they are constrained by the identity $dH^a = 0$ and its differential consequences, but this is not a difficulty for the considerations of this section. The $\chi$'s are invariant under the gauge transformations and they generate the algebra $\mathcal{I}$ of invariant polynomials. In order to generate the full algebra $\mathcal{P}$ we need to add to the $\chi$'s some extra variables that will be collectively denoted $\Psi$. The $\Psi$'s contain the field components $B^a_{\mu_1 \ldots \mu_{p_a}}$ and their appropriate derivatives not present in the $\chi$'s. The explicit form of the $\Psi$'s is not needed here. All we need to know is that the $\Psi$'s are algebraically independent from the $\chi$'s and that, in conjunction with the $\chi$'s, they generate $\mathcal{P}$.
Theorem 6.1. Let $a$ be a polynomial in the $\chi$: $a = a(\chi)$. If $a = \delta b$, then we can choose $b$ such that $b = b(\chi)$. In particular,
$$H^{inv}_j(\delta) \simeq 0 \quad \text{for } j > 0. \tag{6.1}$$
Proof. We can decompose $b$ into two parts: $b = \bar{b} + \bar{\bar{b}}$, with $\bar{b} = \bar{b}(\chi) = b(\Psi = 0)$ and $\bar{\bar{b}} = \sum_m R_m(\chi) S_m(\Psi)$, where $S_m(\Psi)$ contains at least one $\Psi$. Because $\delta\Psi = 0$, we have $\delta(\bar{b} + \bar{\bar{b}}) = \delta\bar{b}(\chi) + \sum_m \delta R_m(\chi) S_m(\Psi)$. Furthermore if $M = M(\chi)$, then $\delta M(\chi) = (\delta M)(\chi)$. We thus get
$$a(\chi) = (\delta\bar{b})(\chi) + \sum_m (\delta R_m)(\chi) S_m(\Psi).$$
The above equation has to be satisfied for all the values of the $\Psi$'s and in particular for $\Psi = 0$. This means that $a(\chi) = (\delta\bar{b})(\chi) = \delta\bar{b}(\chi)$. □
Theorem 6.2. Let $\mathcal{H}^k$ be the subspace of form degree $k$ of the finite dimensional algebra $\mathcal{H}$ of polynomials in the curvature $(p_a + 1)$-forms $H^a$, $\mathcal{H} = \oplus_k \mathcal{H}^k$. One has
$$H^{k,inv}_j(d) = 0, \qquad k < n,\ j > 0 \tag{6.2}$$
and
$$H^{k,inv}_0(d) = \mathcal{H}^k, \qquad k < n. \tag{6.3}$$
Thus, in particular, if $a = a(\chi)$ with $da = 0$, antighost $a > 0$ and deg $a < n$, then $a = db$ with $b = b(\chi)$. And if $a$ has antighost number zero, then $a = P(H^a) + db$, where $P(H^a)$ is a polynomial in the curvature forms and where $b \in \mathcal{I}_{Small}$.
Proof. The theorem has been proved in [36, 43] for 1-forms. It can be extended straightforwardly to the case of $p$-forms of odd degree. The even degree case is slightly different because the curvature $(p + 1)$-forms $H^a$ are then anticommuting. It is fully treated in Appendix A. If the local forms are not taken to be explicitly $x$-dependent, Eq. (6.3) must be replaced by
$$H^{k,inv}_0(d) = (\Lambda \otimes \mathcal{H})^k. \tag{6.4}$$
□
6.2. Gauge invariant $\delta$-boundaries modulo $d$. We assume in this section that the antisymmetric tensors $B^a_{\mu_1 \ldots \mu_p}$ have all the same degree $p$. This covers, in particular, the case of a single $p$-form.
Theorem 6.3. (Valid when the $B^a$'s have all the same form degree $p$). Let $a^n_q = a^n_q(\chi) \in \mathcal{I}$ be an invariant local $n$-form of antighost number $q > 0$. If $a^n_q$ is $\delta$-exact modulo $d$, $a^n_q = \delta\mu^n_{q+1} + d\mu^{n-1}_q$, then one can assume that $\mu^n_{q+1}$ and $\mu^{n-1}_q$ only depend on the $\chi$'s, i.e., are invariant ($\mu^n_{q+1}$ and $\mu^{n-1}_q \in \mathcal{I}$).
Proof. The proof goes along exactly the same lines as the proof of a similar statement made in [14] (Theorem 6.1) for 1-form gauge fields. Accordingly, it will not be repeated here (see footnote 2). □
Remark. The theorem does not hold if the forms have various form degrees (see Theorem 10.1 below).
7. Characteristic Cohomology for a Single p-Form Gauge Field
Our strategy for computing the characteristic cohomology is as follows. First, we compute $H^n_*(\delta|d)$ (cocycle condition, coboundary condition) for a single $p$-form. We then use the isomorphism theorems to infer $H^*_{char}(d)$. Finally, we solve the case of a system involving an arbitrary (but finite) number of $p$-forms of various form degrees.
7.1. General theorems. Before we compute $H^n_*(\delta|d)$ for a single abelian $p$-form gauge field $B_{\mu_1 \ldots \mu_p}$, we will recall some general results which will be needed in the sequel. These theorems hold for an arbitrary linear theory of reducibility order $p - 1$.
Theorem 7.1. For a linear gauge theory of reducibility order $p - 1$, one has
$$H^n_j(\delta|d) = 0, \qquad j > p + 1. \tag{7.1}$$
2 We shall just mention a minor point that has been overlooked in the proof of Theorem 6.1 of [14], namely, that when p = 1 in Eq. (6.4) of [14], the form Z ~ need not vanish (in the notations of [14]). However, this does not invalidate the fact that one can replace Z ~, X~, etc. by invariant polynomials, as the recurrence used in the proof of [14] and the absence of invariant cohomology for d in form degree one indicate. This is just what is needed for establishing the theorem.
Proof. See [9], Theorem 9.1. See also [1, 2, 3]. □
Theorem 7.1 is particularly useful because it limits the number of potentially nonvanishing cohomologies. The calculation of the characteristic cohomology is further simplified by the following theorem:
Theorem 7.2. Any solution of $\delta a + \partial_\rho b^\rho = 0$ that is at least bilinear in the antifields is necessarily trivial.
Proof. See [9], Theorem 11.2. □
Both theorems hold whether the local forms are assumed to have an explicit x-dependence or not.
7.2. Cocycles of $H^n_{p+1}(\delta|d)$. We have just seen that the first potentially non-vanishing cohomological group is $H^n_{p+1}(\delta|d)$. We show in this section that this group is one-dimensional and provide explicit representatives. We systematically use the dual notations involving divergences of antisymmetric tensor densities.
Theorem 7.3. $H^n_{p+1}(\delta|d)$ is one-dimensional. One can take as representatives of the cohomological classes $a = kB^*$, where $B^*$ is the last antifield, of antighost number $p + 1$, and where $k$ is a number.
Proof. Any polynomial of antighost number $p + 1$ can be written $a = fB^* + f^\rho \partial_\rho B^* + \ldots + \mu$, where $f$ does not involve the antifields and where $\mu$ is at least bilinear in the antifields. By adding a divergence to $a$, one can remove the derivatives of $B^*$, i.e., one can assume $f^\rho = f^{\rho\sigma} = \ldots = 0$. The cocycle condition $\delta a + \partial_\rho b^\rho = 0$ implies then $-\partial_\rho f\, B^{*\rho} + \delta\mu + \partial_\rho(b^\rho + fB^{*\rho}) = 0$. By taking the Euler-Lagrange derivative of this equation with respect to $B^{*\rho}$, one gets
$$-\partial_\rho f + \delta\left((-1)^p \frac{\delta^L \mu}{\delta B^{*\rho}}\right) = 0. \tag{7.2}$$
This shows that $f$ is a cocycle of the characteristic cohomology in degree zero since $\delta$(anything of antighost number one) $\approx 0$. Furthermore, if $f$ is trivial in $H^0_{char}(d)$, then $a$ can be redefined so as to be at least bilinear in the antifields and thus is also trivial in the cohomology of $\delta$ modulo $d$. Now, the isomorphism of $H^0_{char}(d)/\mathbb{R}$ with $H^n_n(\delta|d)$ implies $f = k + \delta g$ with $k$ a constant ($H^n_n(\delta|d) = 0$ because $n > p + 1$). As we pointed out, the second term can be removed by adding a trivial term, so we may assume $f = k$. Writing $a = kB^* + \mu$, we see that $\mu$ has to be a solution of $\delta\mu + \partial_\rho b'^\rho = 0$ by itself and is therefore trivial by Theorem 7.2. So $H^n_{p+1}(\delta|d)$ can indeed be represented by $a = kB^*$. In form notations, this is just the $n$-form $k\overline{B}^*_{p+1}$. Note that the calculations are true both in the $x$-dependent and $x$-independent cases.
To complete the proof of the theorem, it remains to show that the cocycles $a = kB^*$, which belong to the invariant algebra $\mathcal{I}$ and which contain the undifferentiated antifields, are non-trivial. If they were trivial, one would have, according to Theorem 6.3, that $\overline{B}^*_{p+1} = \delta u + dv$ for some $u$, $v$ also in $\mathcal{I}$. But this is impossible, because both $\delta$ and $d$ bring in one derivative of the invariant generators $\chi$ while $\overline{B}^*_{p+1}$ does not contain derivatives of $\chi$. [This derivative counting argument is direct if $u$ and $v$ do not involve explicitly the spacetime coordinates $x^\mu$. If they do, one must expand $u$, $v$ and the equation $\overline{B}^*_{p+1} = \delta u + dv$ according to the number of derivatives of the fields in order to reach
the conclusion. Explicitly, one sets $u = u_0 + \ldots + u_k$, $v = v_0 + \ldots + v_k$, where $k$ counts the number of derivatives of the $H_{\mu_1 \ldots \mu_{p+1}}$ and of the antifields. The condition $\overline{B}^*_{p+1} = \delta u + dv$ implies in degree $k + 1$ in the derivatives that $\delta u_k + d'v_k = 0$, where $d'$ does not differentiate with respect to the explicit dependence on $x^\mu$. This relation implies in turn that $u_k$ is $\delta$-trivial modulo $d'$ since there is no cohomology in antighost number $p + 2$. Thus, one can remove $u_k$ by adding trivial terms. Repeating the argument for $u_{k-1}$, and then for $u_{k-2}$, etc., leads to the desired conclusion.] □
7.3. Cocycles of $H^n_i(\delta|d)$ with $i \le p$. We now solve the cocycle condition for the remaining degrees. First we prove
Theorem 7.4. Let $K$ be the greatest integer such that $n - K(n - p - 1) > 1$. The cohomological groups $H^n_j(\delta|d)$ ($j > 1$) vanish unless $j = n - k(n - p - 1)$, $k = 1, 2, \ldots, K$. Furthermore, for those values of $j$, $H^n_j(\delta|d)$ is at most one-dimensional.
Proof. We already know that $H^n_j(\delta|d)$ is zero for $j > p + 1$ and that $H^n_{p+1}(\delta|d)$ is one-dimensional. Assume thus that the theorem has been proved for all $j$'s strictly greater than $J < p + 1$ and let us extend it to $J$. In a manner analogous to what we did in the proof of Theorem 7.3, we can assume that the cocycles of $H^n_J(\delta|d)$ take the form
$$f_{\mu_1 \ldots \mu_{p+1-J}} B^{*\mu_1 \ldots \mu_{p+1-J}} + \mu, \tag{7.3}$$
where $f_{\mu_1 \ldots \mu_{p+1-J}}$ does not involve the antifields and defines an element of $H^{p+1-J}_{char}(d)$. Furthermore, if $f_{\mu_1 \ldots \mu_{p+1-J}}$ is trivial, then the cocycle (7.3) is also trivial. Now, using the isomorphism $H^{p+1-J}_{char}(d) \simeq H^n_{n-p-1+J}(\delta|d)$ ($p + 1 - J > 0$), we see that $f$ is trivial unless $j' = n - p - 1 + J$, which is strictly greater than $J$, is of the form $j' = n - k(n - p - 1)$. In this case, $H^n_{j'}$ is at most one-dimensional. Since $J = j' - (n - p - 1) = n - (k + 1)(n - p - 1)$ is of the required form, the property extends to $J$. This proves the theorem. □
Because we explicitly used the isomorphism $H^{p+1-J}_{char}(d) \simeq H^n_{n-p-1+J}(\delta|d)$, which holds only if the local forms are allowed to involve explicitly the coordinates $x^\mu$, the theorem must be amended for $x$-independent local forms. This will be done in Sect. 7.5.
Theorem 7.4 goes beyond the vanishing theorems of [1, 2, 3, 9] since it sets further cohomological groups equal to zero, in antighost number smaller than $p + 1$. This is done by viewing the cohomological group $H^n_i(\delta|d)$ as a subset of $H^n_{n-p-1+i}(\delta|d)$ at a higher value of the antighost number, through the form (7.3) of the cocycle and the isomorphism between $H^{p+1-i}_{char}(d)$ and $H^n_{n-p-1+i}(\delta|d)$. In that manner, the known zeros at values of the ghost number greater than $p + 1$ are "propagated" down to values of the ghost number smaller than $p + 1$.
To proceed with the analysis, we have to consider two cases: (i) Case I: $n - p - 1$ is even. (ii) Case II: $n - p - 1$ is odd.
We start with the simplest case, namely, Case I. In that case, $\tilde{H}$ is a commuting object and we can consider its various powers $(\tilde{H})^k$, $k = 1, 2, \ldots, K$ with $K$ as in Theorem 7.4. These powers have $\Delta$-degree $k(n - p - 1)$. By Theorem 5.1, the term of form degree $n$ in $(\tilde{H})^k$ defines a cocycle of $H^n_{n-k(n-p-1)}(\delta|d)$, which is non-trivial as the same invariance argument used in the previous subsection indicates.
Thus, $H^n_{n-k(n-p-1)}(\delta|d)$, which we know is at most one-dimensional, is actually exactly one-dimensional and one may take
as representative the term of form degree $n$ in $(\tilde{H})^k$. This settles the case when $n - p - 1$ is even.
In the case when $n - p - 1$ is odd, $\tilde{H}$ is an anticommuting object and its powers $(\tilde{H})^k$, $k > 0$, all vanish unless $k = 1$. We want to show that $H^n_{n-k(n-p-1)}(\delta|d)$ similarly vanishes unless $k = 1$. To that end, it is enough to prove that $H^n_{n-2(n-p-1)}(\delta|d) = H^n_{2p+2-n}(\delta|d) = 0$, as the proof of Theorem 7.4 indicates (we assume, as before, that $2p + 2 - n > 1$ since we only investigate here the cohomological groups $H^n_i(\delta|d)$ with $i > 1$). Now, as we have seen, the most general cocycle in $H^n_{2p+2-n}(\delta|d)$ may be assumed to take the form $a = f_{\mu_{p+2} \ldots \mu_n} B^{*\mu_{p+2} \ldots \mu_n} + \mu$, where $\mu$ is at least quadratic in the antifields and where $f_{\mu_{p+2} \ldots \mu_n}$ does not involve the antifields and defines an element of $H^{n-p-1}_{char}(d)$. But $H^{n-p-1}_{char}(d) \simeq H^n_{p+1}(\delta|d)$ is one-dimensional and one may take as representative of $H^{n-p-1}_{char}(d)$ the dual $k\varepsilon_{\mu_1 \ldots \mu_n} H^{\mu_1 \ldots \mu_{p+1}}$ of the field strength. This means that $a$ is necessarily of the form
$$a = k\varepsilon_{\mu_1 \ldots \mu_n} H^{\mu_1 \ldots \mu_{p+1}} B^{*\mu_{p+2} \ldots \mu_n} + \mu.$$
[...] (if $2p + 1 - n = 1$, [...]; see below). One can then successively eliminate $B^*_{(2p-n)}$, $B^*_{(2p-n-1)}$, etc. from $\mu$, so that the question ultimately boils down to: is
$$k\varepsilon_{\mu_1 \ldots \mu_{2j}} B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}\, \delta B^{*\mu_{j+1} \ldots \mu_{2j}}_{(p+1-j)}$$
($n$ even $= 2j$) or
$$k\varepsilon_{\mu_1 \ldots \mu_{2j+1}} B^{*\mu_1 \ldots \mu_{j+1}}_{(p-j)}\, \delta B^{*\mu_{j+2} \ldots \mu_{2j+1}}_{(p+1-j)}$$
($n$ odd $= 2j + 1$) $\delta$-exact modulo $d$, i.e., of the form $\delta\nu + \partial_\rho c^\rho$, where $\nu$ does not involve the antifields $B^*_s$ for $s > p + 1 - j$ ($n$ even) or $s > p - j$ ($n$ odd)? That the answer to this question is negative unless $k = 0$ and $a$ accordingly trivial, which is the desired result, is easily seen by trying to construct explicitly $\nu$. We treat for definiteness the case $n$ even ($n = 2j$). One has
$$\nu = \lambda_{\mu_1 \ldots \mu_{2j}} B^{*\mu_1 \ldots \mu_j}_{(p+1-j)} B^{*\mu_{j+1} \ldots \mu_{2j}}_{(p+1-j)},$$
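where $\lambda_{\mu_1 \ldots \mu_{2j}}$ is antisymmetric (respectively symmetric) for the exchange of $(\mu_1 \ldots \mu_j)$ with $(\mu_{j+1} \ldots \mu_{2j})$ if $j$ is even (respectively odd) (the $j$-form $\overline{B}^*_{(p+1-j)}$ is odd by assumption and this can happen only if the components $B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}$ are odd for $j$ even, or even for $j$ odd). From the equation
$$k\varepsilon_{\mu_1 \ldots \mu_{2j}} B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}\, \delta B^{*\mu_{j+1} \ldots \mu_{2j}}_{(p+1-j)} = \delta\nu + \partial_\rho c^\rho, \tag{7.6}$$
one gets
$$k\varepsilon_{\mu_1 \ldots \mu_{2j}} B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}\, \partial_\rho B^{*\rho\mu_{j+1} \ldots \mu_{2j}}_{(p-j)} = 2\lambda_{\mu_1 \ldots \mu_{2j}} B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}\, \partial_\rho B^{*\rho\mu_{j+1} \ldots \mu_{2j}}_{(p-j)} + \partial_\rho c'^\rho. \tag{7.7}$$
Taking the Euler-Lagrange derivative of this equation with respect to $B^{*\mu_1 \ldots \mu_j}_{(p+1-j)}$ yields next
$$\left(k\varepsilon_{\mu_1 \ldots \mu_{2j}} - 2\lambda_{\mu_1 \ldots \mu_{2j}}\right) \partial_\rho B^{*\rho\mu_{j+1} \ldots \mu_{2j}}_{(p-j)} = 0,$$
which implies $k\varepsilon_{\mu_1 \ldots \mu_{2j}} = 2\lambda_{\mu_1 \ldots \mu_{2j}}$. This contradicts the symmetry properties of $\lambda_{\mu_1 \ldots \mu_{2j}}$, unless $k = 0$, as we wanted to prove.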
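The outcome of this subsection for a single $p$-form (allowing explicit $x$-dependence) amounts to a short piece of arithmetic on $n$ and $p$. The sketch below only illustrates that counting; the function name `nonvanishing_antighost_numbers` is an assumption made here. It returns the antighost numbers $j > 1$ for which $H^n_j(\delta|d)$ is non-vanishing: the values $j = n - k(n - p - 1)$ of Theorem 7.4 when $n - p - 1$ is even, and only $j = p + 1$ when $n - p - 1$ is odd.

```python
def nonvanishing_antighost_numbers(n, p):
    """Antighost numbers j > 1 with non-vanishing H^n_j(delta|d), single p-form.

    Hypothetical helper summarising Theorem 7.4 together with the
    even/odd analysis of n - p - 1.
    """
    step = n - p - 1
    if p < 1 or step < 1:
        raise ValueError("need p >= 1 and n > p + 1")
    if step % 2 == 1:
        # H-tilde anticommutes: only its first power survives.
        return [p + 1]
    result = []
    k = 1
    # K is the greatest integer with n - K (n - p - 1) > 1.
    while n - k * step > 1:
        result.append(n - k * step)
        k += 1
    return result


if __name__ == "__main__":
    print(nonvanishing_antighost_numbers(10, 7))
    print(nonvanishing_antighost_numbers(4, 2))
```

For instance, with $n = 10$ and $p = 7$ one has $n - p - 1 = 2$, so the loop collects $j = 8, 6, 4, 2$; with $n = 4$ and $p = 2$ the step is odd and only $j = p + 1 = 3$ remains.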
7.4. Characteristic Cohomology. By means of the isomorphism theorem of Section 4, our results on $H^n_*(\delta|d)$ can be translated in terms of the characteristic cohomology as follows:
(i) If $n - p - 1$ is odd, the only non-vanishing group of the characteristic cohomology in form degree $< n - 1$ is $H^{n-p-1}_{char}(d)$, which is one-dimensional. All the other groups vanish. One may take as representatives for $H^{n-p-1}_{char}(d)$ the cocycles $k\overline{H}$. Similarly, the only non-vanishing group $H^j(\Delta)$ with $j < n - 1$ is $H^{n-p-1}(\Delta)$ with representatives $k\tilde{H}$, and the only non-vanishing group $H^n_i(\delta|d)$ with $i > 1$ is $H^n_{p+1}(\delta|d)$ with representatives $k\overline{B}^*_{p+1}$.
(ii) If $n - p - 1$ is even, there is further cohomology. The degrees in which there is non-trivial cohomology are multiples of $n - p - 1$ (considering again values of the form degree strictly smaller than $n - 1$). Thus, there is characteristic cohomology only in degrees $n - p - 1$, $2(n - p - 1)$, $3(n - p - 1)$, etc. The corresponding groups are one-dimensional and one may take as representatives $k\overline{H}$, $k(\overline{H})^2$, $k(\overline{H})^3$, etc. There is also non-vanishing $\Delta$-cohomology for the same values of the $\Delta$-degree, with representative cocycles given by $k\tilde{H}$, $k(\tilde{H})^2$, $k(\tilde{H})^3$, etc. By expanding these cocycles according to the form degree and keeping the terms of form degree $n$, one gets representatives for the only non-vanishing groups $H^n_i(\delta|d)$ (with $i > 1$), which are respectively $H^n_{p+1}(\delta|d)$, $H^n_{p+1-(n-p-1)}(\delta|d)$, $H^n_{p+1-2(n-p-1)}(\delta|d)$, etc.
An immediate consequence of our analysis is the following useful theorem:
Theorem 7.5. If the polynomial $P^k(H)$ of form degree $k < n$ in the curvature $(p + 1)$-form $H$ is $\delta$-exact modulo $d$ in the invariant algebra $\mathcal{I}$, then $P^k(H) = 0$.
Proof. The theorem is straightforward in the algebra of $x$-independent local forms, as a direct derivative counting argument shows. To prove it when an explicit $x$-dependence is allowed, one proceeds as follows. If $P^k(H) = \delta a^k_1 + da^{k-1}_0$ where $a^k_1$ and $a^{k-1}_0 \in \mathcal{I}$, then $da^k_1 + \delta a^{k+1}_2 = 0$ for some invariant $a^{k+1}_2$. Using the results on the cohomology of $\delta$ modulo $d$ that we have just established, this implies that $a^k_1$ differs from the component of form degree $k$ and antighost number 1 of a polynomial $Q(\tilde{H})$ by a term of the form $\delta\rho + d\sigma$,
where $\rho$ and $\sigma$ are both invariant. But then, $\delta a^k_1$ has the form $d([Q(\tilde{H})]^{k-1}_0 + \delta\sigma)$, which implies $P^k(H) = d(-[Q(\tilde{H})]^{k-1}_0 - \delta\sigma + a^{k-1}_0)$, i.e., $P^k(H) = d$(invariant). According to the theorem on the invariant cohomology of $d$, this can occur only if $P^k(H) = 0$. □
7.5. Characteristic cohomology in the algebra of x-independent local forms. Let us denote $(\tilde{H})^m$ by $P_m$ ($m = 0, \ldots, K$). We have just shown (i) that the most general cocycles of the $\Delta$-cohomology are given, up to trivial terms, by the linear combinations $\lambda_m P_m$ with $\lambda_m$ real or complex numbers; and (ii) that if $\lambda_m P_m$ is $\Delta$-exact, then the $\lambda_m$ are all zero. In establishing these results, we allowed for an explicit $x$-dependence of the local forms (see comments after the proof of Theorem 7.4). How are our results affected if we work exclusively with local forms with no explicit $x$-dependence?
In the above analysis, it is in calculating the cocycles that arise in antighost number $< p + 1$ that we used the $x$-dependence of the local forms, through the isomorphism $H^{p+1-j}_{char}(d) \simeq H^n_{n-p-1+j}(\delta|d)$. If the local exterior forms are not allowed to depend explicitly on $x$, one must take the constant $k$-forms ($k > 0$) into account. The derivation goes otherwise unchanged and one finds that the cohomology of $\Delta$ in the space of $x$-independent local forms is given by the polynomials in the $P_m$ with coefficients $\lambda_m$ that are constant forms, $\lambda_m = \lambda_m(dx)$. In addition, if $\lambda_m P_m$ is $\Delta$-exact, then $\lambda_m P_m = 0$ for each $m$. One cannot infer from this equation that $\lambda_m$ vanishes, because it is an exterior form. One can simply assert that the components of $\lambda_m$ of form degree $n - m(n - p - 1)$ or lower are zero (when multiplied by $P_m$, the other components of $\lambda_m$ yield forms of degree $> n$ that identically vanish, no matter what these other components are).
It will be also useful in the sequel to know the cohomology of $\Delta'$, where $\Delta'$ is the part of $\Delta$ that acts only on the fields and antifields, and not on the explicit $x$-dependence. One has $\Delta = \Delta' + d_x$, where $d_x \equiv dx^\mu\, \partial^{explicit}/\partial x^\mu$ sees only the explicit $x$-dependence. By the above result, the cohomology of $\Delta'$ is clearly given by the polynomials in the $P_m$ with coefficients $\lambda_m$ that are now arbitrary spacetime forms, $\lambda_m = \lambda_m(x, dx)$.
8. Characteristic Cohomology in the General Case
To compute the cohomology $H^n_i(\delta|d)$ for an arbitrary set of $p$-forms, one proceeds along the lines of the Künneth theorem. Let us illustrate explicitly the procedure for two fields $B^1_{\mu_1 \ldots \mu_{p_1}}$ and $B^2_{\mu_1 \ldots \mu_{p_2}}$. One may split the differential $\Delta$ as a sum of terms with definite $N_a$-degrees,
$$\Delta = \Delta_1 + \Delta_2 + d_x \tag{8.1}$$
(see (5.8)). In (8.1), $d_x$ leaves both $N_1$ and $N_2$ unchanged. By contrast, $\Delta_1$ increases $N_1$ by one unit without changing $N_2$, while $\Delta_2$ increases $N_2$ by one unit without changing $N_1$. The differential $\Delta_1$ acts only on the fields $B^1$ and its associated antifields ("fields and antifields of the first set"), whereas the differential $\Delta_2$ acts only on the fields $B^2$ and its associated antifields ("fields and antifields of the second set"). Note that $\Delta_1 + \Delta_2 = \Delta'$.
Let $a$ be a cocycle of $\Delta$ with $\Delta$-degree $< n - 1$. Expand $a$ according to the $N_1$-degree,
$$a = a_0 + a_1 + a_2 + \ldots + a_m, \qquad N_1(a_j) = j. \tag{8.2}$$
The equation $\Delta a = 0$ implies $\Delta_1 a_m = 0$ for the term $a_m$ of highest $N_1$-degree. Our analysis of the $\Delta_1$-cohomology for a single $p$-form yields then $a_m = c_m(\tilde{H}^1)^k + \Delta_1$(something), where $c_m$ involves only the fields and antifields of the second set, as
well as $dx^\mu$ and possibly $x^\mu$. There can be no conserved current in $a_m$ since we assume the $\Delta$-degree of $a$, and thus of each $a_j$, to be strictly smaller than $n - 1$. Now, the exact term in $a_m$ can be absorbed by adding to $a_m$ a $\Delta$-exact term, through a redefinition of $a_{m-1}$. Once this is done, one finds that the next equation for $a_m$ and $a_{m-1}$ following from $\Delta a = 0$ reads
$$[(\Delta_2 + d_x)c_m](\tilde{H}^1)^k + \Delta_1 a_{m-1} = 0. \tag{8.3}$$
But we have seen that $\lambda_m(\tilde{H}^1)^k$ cannot be exact unless it is zero, and thus this last equation implies both
$$[(\Delta_2 + d_x)c_m](\tilde{H}^1)^k = 0 \tag{8.4}$$
and
$$\Delta_1 a_{m-1} = 0. \tag{8.5}$$
Since $(\tilde{H}^1)^k$ has independent form components in degrees $k(n - p - 1)$, $k(n - p - 1) + 1$ up to degree $n$, we infer from (8.4) that the form components of $(\Delta_2 + d_x)c_m$ of degrees 0 up to degree $n - k(n - p - 1)$ are zero. If we expand $c_m$ itself according to the form degree, $c_m = \sum_i c^i_m$, this gives the equations
$$\delta c^i_m + dc^{i-1}_m = 0, \qquad i = 1, \ldots, n - k(n - p - 1), \tag{8.6}$$
and
$$\delta c^0_m = 0. \tag{8.7}$$
Our analysis of the relationship between the $\Delta$-cohomology and the cohomology of $\delta$ modulo $d$ indicates then that one can redefine the terms of form degree $> n - k(n - p - 1)$ of $c_m$ in such a way that $\Delta c_m = 0$. This does not affect the product $c_m(\tilde{H}^1)^k$. We shall assume that the (irrelevant) higher order terms in $c_m$ have been chosen in that manner. With that choice, $c_m$ is given, up to trivial terms that can be reabsorbed, by $\lambda_m(\tilde{H}^2)^l$, with $\lambda_m$ a number, so that $a_m = \lambda_m(\tilde{H}^2)^l(\tilde{H}^1)^k$ is a $\Delta$-cocycle by itself. One next repeats successively the analysis for $a_{m-1}$, $a_{m-2}$ to reach the desired conclusion that $a$ may indeed be assumed to be a polynomial in the $\tilde{H}^a$'s, as claimed above.
The non-triviality of the polynomials in the $\tilde{H}^a$'s is also easy to prove. If $P(\tilde{H}) = \Delta\rho$, with $\rho = \rho_0 + \rho_1 + \ldots + \rho_m$, $N_1(\rho_k) = k$, then one gets at $N_1$-degree $m + 1$ the condition $(P(\tilde{H}))_{m+1} = \Delta_1\rho_m$, which implies $(P(\tilde{H}))_{m+1} = 0$ and $\Delta_1\rho_m = 0$, since no polynomial in $\tilde{H}^1$ is $\Delta_1$-trivial, except zero. It follows that $\rho_m = u(\tilde{H}^1)^m$ up to trivial terms that play no role, where $u$ is a function of the variables of the second set as well as of $x^\mu$ and $dx^\mu$. The equation of order $m$ implies then $(P(\tilde{H}))_m = ((\Delta_2 + d_x)u)(\tilde{H}^1)^m + \Delta_1\rho_{m-1}$. The non-triviality of the polynomials in $\tilde{H}^1$ in $\Delta_1$-cohomology yields next $\Delta_1\rho_{m-1} = 0$ and $(P(\tilde{H}))_m = ((\Delta_2 + d_x)u)(\tilde{H}^1)^m$. Since the coefficient of $(\tilde{H}^1)^m$ in $(P(\tilde{H}))_m$ is a polynomial in $\tilde{H}^2$, which cannot be $(\Delta_2 + d_x)$-exact, one gets in fact $(P(\tilde{H}))_m = 0$ and $(\Delta_2 + d_x)u = 0$. It follows that $\rho_m$ fulfills $\Delta\rho_m = 0$ and can be dropped. The analysis goes on in the same way at the lower values of the $N_1$-degree, until one reaches the desired conclusion that the exact polynomial $P(\tilde{H})$ indeed vanishes. In view of the isomorphism between the characteristic cohomology and $H^*(\Delta)$, this completes the proof of Theorem 2.1 in the case of two $p$-forms.
The case of more p-forms is treated similarly and left to the reader.
9. Invariant Characteristic Cohomology
9.1. Isomorphism theorems for the invariant cohomologies. To compute the invariant characteristic cohomology, we proceed as follows. First, we establish isomorphism theorems between $H^{k,inv}_{char}(d)$, $H^{n,inv}_{n-k}(\delta|d)$ and $H^{k,inv}(\Delta)$. Then, we compute $H^{k,inv}(\Delta)$ for a single $p$-form. Finally, we extend the calculation to an arbitrary system of $p$-forms.
Theorem 9.1.
$$\frac{H^{k,inv}_{char}(d)}{\mathcal{H}^k} \simeq H^{n,inv}_{n-k}(\delta|d), \qquad 0 \le k < n, \tag{9.1}$$
$$0 \simeq H^{n,inv}_{n+k}(\delta|d), \qquad k > 0. \tag{9.2}$$
Theorem 9.2. The invariant cohomology of $\Delta$ is isomorphic to the invariant characteristic cohomology,
$$H^{k,inv}(\Delta) \simeq H^{k,inv}_{char}(d), \qquad 0 \le k \le n. \tag{9.3}$$
Proof. First we prove (9.1). To that end we observe that the map $m$ introduced in the demonstration of Theorem 4.1 maps $H^{k,inv}_{char}(d)$ on $H^{n,inv}_{n-k}(\delta|d)$. Indeed, in the expansion (4.8) for $a$, all the terms can be assumed to be invariant on account of Theorem 6.1. The surjectivity of $m$ is also direct, provided that the polynomials in the curvature $P(H)$ are not trivial in $H^*(\delta|d)$, which is certainly the case if there is a single $p$-form (Theorem 7.5). We shall thus use Theorem 9.1 first only in the case of a single $p$-form. We shall then prove that Theorem 7.5 extends to an arbitrary system of forms of various form degrees, so that the proof of Theorem 9.1 will be completed.
To compute the kernel of $m$, consider an element $a^k_0 \in \mathcal{I}$ such that the corresponding $a^n_{n-k}$ is trivial in $H^{n,inv}_{n-k}(\delta|d)$. Then, again as in the proof of Theorem 4.1, one finds that all the terms in the expansion (4.8) are trivial, except perhaps $a^k_0$, which fulfills $da^k_0 + \delta db^k_1 = 0$, where $b^k_1 \in \mathcal{I}$ is the $k$-form appearing in the equation expressing the triviality of $a^{k+1}_1$, $a^{k+1}_1 = db^k_1 + \delta b^{k+1}_2$. This implies $d(a^k_0 - \delta b^k_1) = 0$, and thus, by Theorem 6.2, $a^k_0 = P + db^{k-1}_0 + \delta b^k_1$ with $P \in \mathcal{H}^k$ and $b^{k-1}_0 \in \mathcal{I}$. This proves (9.1), since $P$ is not trivial in $H^*(\delta|d)$ (Theorem 7.5). [Again, we are entitled to use this fact only for a single $p$-form until we have proved the non-triviality of $P$ in the general case.]
The proof of (9.2) is a direct consequence of Theorem 6.1 and parallels step by step the proof of a similar statement demonstrated for 1-forms in [14] (Lemma 6.1). It will not be repeated here. Finally, the proof of Theorem 9.2 amounts to observing that the map $m'$ that sends $[a^k_0]$ on $[a]$ (Eq. (4.8)) is indeed well defined in cohomology, and is injective as well as surjective (independently of whether $P(H)$ is trivial in the invariant cohomology of $\delta$ modulo $d$). Note that if the forms do not depend explicitly on $x$, one must replace (9.1) by
$$\frac{H^{k,inv}_{char}(d)}{(\Lambda \otimes \mathcal{H})^k} \simeq H^{n,inv}_{n-k}(\delta|d). \tag{9.4}$$
□
9.2. Case of a single p-form gauge field. Theorem 6.3 enables one to compute also the invariant characteristic cohomology for a single $p$-form gauge field. Indeed, this theorem implies that $H^{n,inv}_{n-k}(\delta|d)$ and $H^n_{n-k}(\delta|d)$ actually coincide since the cocycles of
$H^n_{n-k}(\delta|d)$ are invariant and the coboundary conditions are equivalent. The isomorphism of Theorem 9.1 shows then that the invariant characteristic cohomology for a single $p$-form gauge field in form degree $< n - 1$ is isomorphic to the subspace of form degree $< n - 1$ of the direct sum $\mathcal{H} \oplus \overline{\mathcal{H}}$. Since the product $H \wedge \overline{H}$ has form degree $n$, which exceeds $n - 1$, this is the same as the subspace $W$ of Theorem 2.2. The invariant characteristic cohomology in form degree $k < n - 1$ is thus given by $(\mathcal{H} \otimes \overline{\mathcal{H}})^k$, i.e., by the invariant polynomials in the curvature $H$ and its dual $\overline{H}$ with form degree $< n - 1$. Similarly, by the isomorphism of Theorem 9.2, the invariant cohomology $H^{k,inv}(\Delta)$ of $\Delta$ is given by the polynomials in $\tilde{H}$ and $H$ with $\Delta$-degree smaller than $n - 1$.
9.3. Invariant cohomology of $\Delta$ in the general case. The invariant $\Delta$-cohomology for an arbitrary system of $p$-form gauge fields follows again from a straightforward application of the Künneth formula and is thus given by the polynomials in the $\tilde{H}^a$'s and $H^a$'s with $\Delta$-degree smaller than $n - 1$. The explicit proof of this statement works as in the non-invariant case (for that matter, it is actually more convenient to use as degrees not $N_1$ and $N_2$, but rather, degrees counting the number of derivatives of the invariant variables $\chi$'s. These degrees have the advantage that the cohomology is entirely in degree zero). In particular, none of the polynomials in the $\tilde{H}^a$'s and $H^a$'s is trivial.
The isomorphism of Theorem 9.2 implies next that the invariant characteristic cohomology $H^{k,inv}_{char}(d)$ ($k < n - 1$) is given by the polynomials in the curvatures $H^a$ and their duals $\overline{H}^a$, restricted to form degree smaller than $n - 1$. Among these, those that involve the curvatures $H^a$ are weakly exact, but not invariantly so. The property of Theorem 7.5 thus extends as announced to an arbitrary system of dynamical gauge forms of various form degrees.
Because the forms have now different form degrees, one may have elements in $H^{k,inv}_{char}(d)$ ($k < n - 1$) that involve both the curvatures and their duals. For instance, if $B^1$ is a 2-form and $B^2$ is a 4-form, the cocycle $H^1 \wedge \overline{H}^2$ is an $(n - 2)$-form. It is trivial in $H^k_{char}(d)$, but not in $H^{k,inv}_{char}(d)$.
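The form-degree count behind the example $H^1 \wedge \overline{H}^2$ can be reproduced for any collection of form degrees. The following sketch is purely illustrative; the function name `mixed_cocycle_degrees` and the dictionary encoding of the $p$-form species are assumptions made here. It lists the products $H^a \wedge \overline{H}^b$ of one curvature and one dual whose form degree $(p_a + 1) + (n - p_b - 1)$ stays below $n - 1$.

```python
def mixed_cocycle_degrees(n, degrees):
    """Products H^a ^ Hbar^b with form degree strictly below n - 1.

    `degrees` maps a label to the form degree p_a of the gauge field B^a.
    Returns a list of (a, b, form_degree). Hypothetical helper.
    """
    for label, p in degrees.items():
        if p < 1 or n <= p + 1:
            raise ValueError(f"need 1 <= p < n - 1 for {label!r}")
    found = []
    for a, p_a in degrees.items():
        for b, p_b in degrees.items():
            # H^a is a (p_a + 1)-form, Hbar^b is an (n - p_b - 1)-form.
            form_degree = (p_a + 1) + (n - p_b - 1)
            if form_degree < n - 1:
                found.append((a, b, form_degree))
    return found


if __name__ == "__main__":
    print(mixed_cocycle_degrees(10, {"B1": 2, "B2": 4}))
```

With a 2-form and a 4-form in, say, $n = 10$ dimensions, the only such product is $H^1 \wedge \overline{H}^2$, of form degree $n - 2 = 8$. When all fields have the same form degree the list is empty, since the product then has form degree $n$.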
10. Invariant Cohomology of $\delta$ mod $d$

The easiest way to work out explicitly $H^{n,inv}_{n-k}(\delta|d)$ in the general case is to use the above isomorphism theorems, which we are now entitled to do. Thus, one starts from $H^{k,inv}(\Delta)$ and one works out the component of form degree n in the associated cocycles. Because one has elements in $H^{k,inv}(\Delta)$ that involve simultaneously both the curvature and its $\Delta$-invariant dual $\tilde{H}$, the property that $H^{n,inv}_{n-k}(\delta|d)$ and $H^{n}_{n-k}(\delta|d)$ coincide may no longer hold. In the previous example, one would find that $H^1 \wedge \overline{B}^{*2}_2$, which has antighost number two, is a $\delta$-cocycle modulo d, but it cannot be written invariantly so. An important case where the isomorphism $H^{n,inv}_{n-k}(\delta|d) \simeq H^{n}_{n-k}(\delta|d)$ (k > 1) does hold, however, is when the forms all have the same degree. To write down the generalization of Theorem 6.3 in the case of p-forms of different degrees, let $P(H^a, \tilde{H}^a)$ be a polynomial in the curvature $(p_a + 1)$-forms $H^a$ and their $\Delta$-invariant duals $\tilde{H}^a$. One has $\Delta P = 0$. We shall be interested in polynomials of $\Delta$-degree < n that are of degree > 0 in both $H^a$ and $\tilde{H}^a$. The condition that P be of degree > 0 in $H^a$ implies that it is trivial (but not invariantly so), while the condition that it be of degree > 0 in $\tilde{H}^a$ guarantees that when expanded according to the antighost number, P has non-vanishing components of antighost number > 0,
\[ P = \sum_{j=k}^{n} [P]^{j}_{j-k}. \tag{10.1} \]

From $\Delta P = 0$, one has $\delta [P]^{n}_{n-k} + d[P]^{n-1}_{n-k-1} = 0$. There is no polynomial in $H^a$ and $\tilde{H}^a$ with the required properties if all the antisymmetric tensors $B^a_{\mu_1 \ldots \mu_{p_a}}$ have the same form degree ($p_a = p$ for all a's) since the product $H^a \tilde{H}^b$ has necessarily $\Delta$-degree n. When there are tensors of different form degrees, one can construct, however, polynomials P with the given features. The analysis of the previous subsection implies straightforwardly:

Theorem 10.1. Let $a^n_q = a^n_q(\chi) \in \mathcal{I}$ be an invariant local n-form of antighost number q > 0. If $a^n_q$ is $\delta$-exact modulo d, $a^n_q = \delta\mu^n_{q+1} + d\mu^{n-1}_q$, then one has
\[ a^n_q = [P]^n_q + \delta\mu'^{\,n}_{q+1} + d\mu'^{\,n-1}_q \tag{10.2} \]
for some polynomial $P(H^a, \tilde{H}^a)$ of degree at least one in $H^a$ and at least one in $\tilde{H}^a$, and where $\mu'^{\,n}_{q+1}$ and $\mu'^{\,n-1}_q$ can be assumed to depend only on the $\chi$'s, i.e., to be invariant. In particular, if all the p-form gauge fields have the same form degree, $[P]^n_q$ is absent and one has
\[ a^n_q = \delta\mu'^{\,n}_{q+1} + d\mu'^{\,n-1}_q, \tag{10.3} \]
where one can assume that $\mu'^{\,n}_{q+1}$ and $\mu'^{\,n-1}_q$ are invariant ($\mu'^{\,n}_{q+1}$ and $\mu'^{\,n-1}_q \in \mathcal{I}$).
11. Remarks on Conserved Currents
That the characteristic cohomology is finite-dimensional and entirely generated by the duals $\overline{H}^a$ to the field strengths holds only in form degree k < n - 1. This property is not true in form degree equal to n - 1, where there are conserved currents that cannot be expressed in terms of the forms $\overline{H}^a$, even up to trivial terms. An infinite number of conserved currents that cannot be expressed in terms of the forms $\overline{H}^a$ are given by
\[ T^{\mu\nu}{}_{\alpha_1\ldots\alpha_s\beta_1\ldots\beta_r} = -\frac{1}{2}\Big( \frac{1}{p!}\, H^{\mu\rho_1\ldots\rho_p}{}_{,\alpha_1\ldots\alpha_s}\, H^{\nu}{}_{\rho_1\ldots\rho_p,\beta_1\ldots\beta_r} + \frac{1}{(n-p-2)!}\, H^{*\mu}{}_{\rho_2\ldots\rho_{n-p-1},\alpha_1\ldots\alpha_s}\, H^{*\nu\rho_2\ldots\rho_{n-p-1}}{}_{,\beta_1\ldots\beta_r} \Big). \tag{11.1} \]
These quantities are easily checked to be conserved,
\[ T^{\mu\nu}{}_{\alpha_1\ldots\alpha_s\beta_1\ldots\beta_r,\mu} = 0, \tag{11.2} \]
and generalize the conserved currents given in [15, 16, 17] for free electromagnetism. They are symmetric for the exchange of $\mu$ and $\nu$ and are duality invariant in the critical dimension n = 2p + 2, where the field strength and its dual have the same form degree p + 1. In this critical dimension, there are further conserved currents which generalize the "zilches",
\[ Z^{\mu\nu}{}_{\alpha_1\ldots\alpha_r\beta_1\ldots\beta_s} = H^{\mu\sigma_1\ldots\sigma_p}{}_{,\alpha_1\ldots\alpha_r}\, H^{*\nu}{}_{\sigma_1\ldots\sigma_p,\beta_1\ldots\beta_s} - H^{*\mu\sigma_1\ldots\sigma_p}{}_{,\alpha_1\ldots\alpha_r}\, H^{\nu}{}_{\sigma_1\ldots\sigma_p,\beta_1\ldots\beta_s}. \tag{11.3} \]
Let us prove that the conserved currents (11.1) which contain an even total number of derivatives are not trivial in the space of x-independent local forms. To avoid cumbersome notations we will only look at the currents with no $\beta$ indices. One may reexpress (11.1) in terms of the field strengths as $T^{\mu\nu}{}_{\alpha_1\ldots\alpha_m}$
1
_
2p!
( H n`~, . . . . p,al...OZm HUal
1 - - H +rf*~2(p+ 1)!
~ .... ~§
"
,~ + H~*o~...O~, H ~' ~ ~1 . . . .
H ~' "''~
,,~, . . . . . )
'~ . . . . .
(11.4)
If one takes the divergence of this expression one gets
\[ T^{\mu\nu}{}_{\alpha_1\ldots\alpha_m,\mu} = \delta K^{\nu}{}_{\alpha_1\ldots\alpha_m}, \tag{11.5} \]
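where $K^{\nu}{}_{\alpha_1\ldots\alpha_m}$ differs from $k\,H^{\nu}{}_{\sigma_1\ldots\sigma_p,\alpha_1\ldots\alpha_m} B^{*\sigma_1\ldots\sigma_p}$ by a divergence. It is easy to see that $T^{\mu\nu}{}_{\alpha_1\ldots\alpha_m}$ is trivial if and only if $H^{\nu}{}_{\sigma_1\ldots\sigma_p,\alpha_1\ldots\alpha_m} B^{*\sigma_1\ldots\sigma_p}$ is trivial. So the question is: can we write
\[ H^{\nu}{}_{\sigma_1\ldots\sigma_p,\alpha_1\ldots\alpha_m} B^{*\sigma_1\ldots\sigma_p} = \delta M^{\nu}{}_{\alpha_1\ldots\alpha_m} + \partial_\rho N^{\rho\nu}{}_{\alpha_1\ldots\alpha_m} \tag{11.6} \]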
for some $M^{\nu}{}_{\alpha_1\ldots\alpha_m}$ and $N^{\rho\nu}{}_{\alpha_1\ldots\alpha_m}$? Without loss of generality, one can assume that M and N have the Lorentz transformation properties indicated by their indices (the parts of M and N transforming in other representations would cancel by themselves). Moreover, by Theorem 6.3, one can also assume that M and N are gauge invariant, i.e., belong to $\mathcal{I}$. If one takes into account all the symmetries of the left-hand side and uses the identity dH = 0, the problem reduces to the determination of the constant c in
\[ H^{\nu}{}_{\sigma_1\ldots\sigma_p,\alpha_1\ldots\alpha_m} B^{*\sigma_1\ldots\sigma_p} = \delta\big( c\, H^{\nu}{}_{\sigma_1\ldots\sigma_{p-1}(\alpha_1,\alpha_2\ldots\alpha_m)} B^{*\sigma_1\ldots\sigma_{p-1}} \big) + \partial_\rho N^{\rho\nu}{}_{\alpha_1\ldots\alpha_m} + \text{terms that vanish on-shell}. \tag{11.7} \]
If one takes the Euler-Lagrange derivative of this equation with respect to $B^{*\sigma_1\ldots\sigma_p}$ one gets
\[ H^{\nu}{}_{\sigma_1\ldots\sigma_p,\alpha_1\ldots\alpha_m} \approx (-)^{p+1} c\, H^{\nu}{}_{[\sigma_1\ldots\sigma_{p-1}|(\alpha_1,\alpha_2\ldots\alpha_m)|\sigma_p]}, \tag{11.8} \]
where the right-hand side is symmetrized in $\alpha_1 \ldots \alpha_m$ and antisymmetrized in $\sigma_1 \ldots \sigma_p$. The symmetry properties of the two sides of this equation are not compatible unless c = 0. This proves that $T^{\mu\nu}{}_{\alpha_1\ldots\alpha_m}$ (with m even) is not trivial in the algebra of x-independent local forms. It then follows, by a mere derivative-counting argument, that the $T^{\mu\nu}{}_{\alpha_1\ldots\alpha_m}$ define independent cohomological classes and cannot be expressed as polynomials in the undifferentiated duals $\overline{H}$ to the field strengths with coefficients that are constant forms.

The fact that the conserved currents are not always expressible in terms of the forms $\overline{H}^a$ makes the validity of this property for higher order conservation laws more striking. In that respect, it should be indicated that the computation of the characteristic cohomology in the algebra generated by the $\overline{H}^a$ is clearly a trivial question. The non-trivial issue is to demonstrate that this computation does not miss other cohomological classes in degree k < n - 1.

Finally, we point out that the conserved currents can all be redefined so as to be strictly gauge-invariant, apart from a few of them whose complete list can be systematically determined for each given system of p-forms. This point will be fully established in [18], and extends to higher degree antisymmetric tensors a property established in [44] for one-forms (see also [45] in this context).
12. Introduction of Gauge Invariant Interactions

The analysis of the characteristic cohomology proceeds in the same fashion if one adds to the Lagrangian (1.10) interactions that involve higher dimensionality gauge invariant terms. As we shall show in [18], these are in general the only consistent interactions. These interactions may increase the derivative order of the field equations. The resulting theories should be regarded as effective theories and can be handled through a systematic perturbation expansion [46]. The new equations of motion read
\[ \partial_\mu \mathcal{L}^{a\,\mu\mu_1\ldots\mu_{p_a}} = 0, \tag{12.1} \]
where $\mathcal{L}^{a\,\mu\mu_1\ldots\mu_{p_a}}$ are the Euler-Lagrange derivatives of the Lagrangian with respect to the field strengths (by gauge invariance, $\mathcal{L}$ involves only the field strength components and their derivatives). These equations can be rewritten as
\[ d\overline{\mathcal{L}}^a \approx 0, \tag{12.2} \]
where $\overline{\mathcal{L}}^a$ is the $(n - p_a - 1)$-form dual to the Euler-Lagrange derivatives. The Euler-Lagrange equations obey the same Noether identities as in the free case, so that the Koszul-Tate differential takes the same form, with $\overline{H}^a$ replaced everywhere by $\overline{\mathcal{L}}^a$. It then follows that
\[ \tilde{\mathcal{L}}^a = \overline{\mathcal{L}}^a + \sum_{j=1}^{p+1} \overline{B}^{*a}_j \tag{12.3} \]
fulfills
\[ \Delta\tilde{\mathcal{L}}^a = 0. \tag{12.4} \]
This implies, in turn, that any polynomial in the $\tilde{\mathcal{L}}^a$ is $\Delta$-closed. It is also clear that any polynomial in the $\overline{\mathcal{L}}^a$ is weakly d-closed. By making the regularity assumptions on the higher order terms in the Lagrangian explained in [9], one easily verifies that these are the only cocycles in form degree < n - 1, and that they are non-trivial. The characteristic cohomology of the free theory therefore possesses some amount of "robustness", since it survives deformations. By contrast, the infinite number of non-trivial conserved currents is not expected to survive interactions (even gauge-invariant ones).

[In certain dimensions, one may add Chern-Simons terms to the Lagrangian. These interactions are not strictly gauge invariant, but only gauge-invariant up to a surface term. The equations of motion still take the form d(something) $\approx$ 0, but now, that "something" is not gauge invariant. Accordingly, with such interactions, some of the cocycles of the characteristic cohomology are no longer gauge invariant. These cocycles are removed from the invariant cohomology, but the discussion proceeds otherwise almost unchanged and is left to the reader.]
13. Summary of Results and Conclusions

In this paper, we have completely worked out the characteristic cohomology $H^k_{char}(d)$ in form degree k < n - 1 for an arbitrary collection of free, antisymmetric tensor theories. We have shown in particular that the cohomological groups $H^k_{char}(d)$ are finite-dimensional and take a simple form, in sharp contrast with $H^{n-1}_{char}(d)$, which is infinite-dimensional and appears to be quite complex. Thus, even though one is dealing with free theories, which have an infinite number of conserved local currents, the existence of higher degree local conservation laws is quite constrained. For instance, in ten dimensions, there is one and only one (non-trivial) higher degree conservation law for a single 2-, 3-, 4-, 6-, or 8-form gauge field, in respective form degrees 7, 6, 5, 3 and 1. It is $d\overline{H} \approx 0$. For a 5-form, there are two higher degree conservation laws, namely $d\overline{H} \approx 0$ and $d(\overline{H})^2 \approx 0$, in form degrees 4 and 8. For a 7-form, there are four higher degree conservation laws, namely $d\overline{H} \approx 0$, $d(\overline{H})^2 \approx 0$, $d(\overline{H})^3 \approx 0$ and $d(\overline{H})^4 \approx 0$, in form degrees 2, 4, 6 and 8.

Our results provide at the same time the complete list of the isomorphic groups $H^k(\Delta)$, as well as of $H^n_{n-k}(\delta|d)$. We have also worked out the invariant characteristic cohomology, which is central in the investigation of the BRST cohomology since it controls the antifield dependence of BRST cohomological classes [14].

An interesting feature of the characteristic cohomology in form degree < n - 1 is its "robustness" to the introduction of gauge invariant interactions, in contrast to the conserved currents.

As we pointed out in the introduction, the characteristic cohomology is interesting for its own sake since it provides higher degree local conservation laws. But it is also useful in the analysis of the BRST cohomology. The consequences of our study will be fully investigated in a forthcoming paper [18], where consistent interactions and anomalies will be studied (see [47] for the 2-form case in this context). In particular, it will be pointed out how rigid the gauge symmetries are. We will also apply our results to compute the BRST cohomology of the coupled Yang-Mills-two-form system, where the field strength of the 2-form is modified by the addition of the Chern-Simons 3-form of the Yang-Mills field [48]. This computation will use both the present results and the analysis of [50, 36, 49, 14].

Acknowledgement. M.H. is grateful to LPTHE (Universités Paris VI and Paris VII) for kind hospitality. This work has been partly supported by research funds from F.N.R.S. and a research contract with the Commission of the European Community.
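The ten-dimensional counting quoted in Sect. 13 can be reproduced with a short script. This is only an illustrative sketch, not part of the original argument. The function name is hypothetical, and the sketch assumes a single p-form whose dual field strength has form degree n - p - 1, that the square of an odd-degree form vanishes, and that only form degrees k < n - 1 are counted.

```python
from typing import List


def conservation_law_degrees(n: int, p: int) -> List[int]:
    """Form degrees k < n - 1 of the non-vanishing powers of the dual field strength.

    Assumption (labelled as such): a single p-form gauge field in n dimensions,
    whose dual field strength is an (n - p - 1)-form.
    """
    dual_degree = n - p - 1
    if dual_degree <= 0:
        return []
    if dual_degree % 2 == 1:
        # An odd-degree form anticommutes with itself, so only the first power survives.
        return [dual_degree] if dual_degree < n - 1 else []
    # Even-degree forms commute: keep every power whose form degree stays below n - 1.
    return list(range(dual_degree, n - 1, dual_degree))


# Degrees for the single p-form fields listed in the text, in n = 10.
ten_dim = {p: conservation_law_degrees(10, p) for p in (2, 3, 4, 5, 6, 7, 8)}
```

With these assumptions, `ten_dim[5]` and `ten_dim[7]` list the two and four conservation laws mentioned above, and the remaining entries each contain a single degree.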
A. Proof of Theorem 6.2
To prove Theorem 6.2, it is convenient to follow the lines of the BRST formalism. In that approach, gauge invariance is controlled by the so-called longitudinal exterior derivative operator $\gamma$, which acts on the fields and further variables called ghosts. The construction of $\gamma$ can be found in [31, 30]. For simplicity, we consider throughout this appendix the case of a single p-form; the general case is covered by means of the Künneth formula. The important point here is the reducibility of the gauge transformations. Because of this, we need to introduce p ghost fields:
\[ C_{\mu_1\ldots\mu_{p-1}}, \ldots, C_{\mu_1\ldots\mu_{p-j}}, \ldots, C. \tag{A.1} \]
These ghosts carry a degree called the pure ghost number. The pure ghost number of $C_{\mu_1\ldots\mu_{p-1}}$ is equal to 1 and increases by one unit up to p as one moves from the left to the right of (A.1). The action of $\gamma$ on the fields and the ghosts is given by
\[ \gamma B = dC_1, \tag{A.2} \]
\[ \gamma C_1 = dC_2, \tag{A.3} \]
\[ \vdots \]
\[ \gamma C_{p-1} = dC_p, \tag{A.4} \]
\[ \gamma C_p = 0, \tag{A.5} \]
\[ \gamma(\text{antifield}) = 0. \tag{A.6} \]
In the above equations, $C_j$ is the (p - j)-form whose components are $C_{\mu_1\ldots\mu_{p-j}}$. For p even, $C_p$ is a commuting object. One extends $\gamma$ such that it is a differential that acts from the left and anticommutes with d. The motivation behind the above definition is essentially contained in the following theorem:

Theorem A.1. The cohomology of $\gamma$ is given by
\[ H(\gamma) = \mathcal{I} \otimes \mathcal{C}_p, \tag{A.7} \]
where $\mathcal{C}_p$ is the algebra generated by the last, undifferentiated ghost $C_p$. In particular, in antighost and pure ghost numbers equal to zero, one can take as representatives of the cohomological class the gauge invariant functions, i.e., the functions which depend solely on the field strengths and their derivatives. [It is in that sense that the differential $\gamma$ incorporates gauge invariance.]

The proof of this theorem follows the lines given in [43], by redefining the generators of the algebra so that $\gamma$ takes the standard form $\gamma x_i = y_i$, $\gamma y_i = 0$, $\gamma z_\alpha = 0$ in terms of the new generators $x_i$, $y_i$, $z_\alpha$. The paired variables $x_i$, $y_i$ disappear from the cohomology, which is entirely generated by the unpaired variables $z_\alpha$. In the present case, one easily convinces oneself that the generators of $\mathcal{I} \otimes \mathcal{C}_p$ are precisely of the $z_\alpha$-type, while the other generators come in pairs. The derivatives of the last ghost $C_p$ are paired with the symmetrized derivatives of the next-to-last ghost $C_{(p-1)\mu}$; the other derivatives of the next-to-last ghost $C_{(p-1)\mu}$, which may be expressed as derivatives of the "curvatures" $\partial_\mu C_{(p-1)\nu} - \partial_\nu C_{(p-1)\mu}$, are paired with the derivatives of the previous ghost $\partial_{\alpha_1\ldots\alpha_k} C_{(p-2)\mu\nu}$ involving a symmetrization, say on $\alpha_1$ and $\mu$, etc. The details present no difficulty and are left to the reader.

According to the theorem, any solution of the equation $\gamma a = 0$ can be written
\[ a = \sum_l \alpha_l(\chi)\, C^l + \gamma b. \tag{A.8} \]
Furthermore, if a is $\gamma$-exact, then one has $\alpha_l = 0$ since the various powers of C are linearly independent. The previous theorem holds independently of whether p is even or odd.

We now assume that p is even, so that the curvature (p + 1)-form H is anticommuting and the last ghost $C_p$ is commuting, and prove Theorem 6.2 in that case (the case when p is odd parallels the 1-form case and so need not be treated here). Assume that $da^k_0 = 0$ with $a^k_0$ a polynomial in the field strengths and their derivatives. By the Poincaré lemma we have $a^k_0 = da^{k-1}_0$, but there is no guarantee that $a^{k-1}_0$ is also in $\mathcal{I}_{small}$. Acting with $\gamma$ on this equation we get, again using the Poincaré lemma, $\gamma a^{k-1}_0 + da^{k-2}_1 = 0$. One can thus construct a tower of equations which take the form,
\[ a^k_0 = da^{k-1}_0, \tag{A.9} \]
\[ \gamma a^{k-1}_0 + da^{k-2}_1 = 0, \tag{A.10} \]
\[ \vdots \]
\[ \gamma a^{k-1-q}_q + da^{k-2-q}_{q+1} = 0, \tag{A.11} \]
\[ \gamma a^{k-2-q}_{q+1} = 0. \tag{A.12} \]
Let r = k - 2 - q and q + 1 = m. If m = pl then the last equation of the tower implies
\[ a^r_m = C_p^l P + \gamma a^r_{m-1}, \tag{A.13} \]
with $P \in \mathcal{I}_{small}$. If $m \neq pl$ then we simply have $a^r_m = \gamma a^r_{m-1}$. In that case, an allowed redefinition of the tower allows one to suppose that the tower stops earlier with $\gamma a^{r'}_{m'} = 0$ and $m' = pl$. An allowed redefinition of the tower simply adds to $a^k_0$ a term of the form $db^{k-1}_0$, where $b^{k-1}_0$ is gauge invariant. So from now on, we shall assume that indeed m = pl. If we substitute (A.13) in (A.11) we get
\[ \gamma\big( a^{r+1}_{m-1} + l\, C_{p-1} C_p^{l-1} P \big) + C_p^l\, dP = 0 \tag{A.14} \]
(the trivial term $\gamma a^r_{m-1}$ is absorbed in an allowed redefinition of the tower). Since the action of d is well defined in $\mathcal{I}_{small}$, this implies dP = 0. The form degree of P is strictly smaller than the form degree of $a^k_0$, so let us make the recurrence hypothesis that the theorem holds for P. Because we treat the case p even, H is odd and $P = c'H + c + dQ$, where c and c' are constants and $Q \in \mathcal{I}_{small}$. We thus have
\[ a^r_m = c\, C_p^l + c'\, C_p^l H + dQ\, C_p^l. \tag{A.15} \]
The last two terms in (A.15) are trivial. For the first one we have
\[ dQ\, C_p^l = d(Q C_p^l) - \gamma(Q\, l\, C_{p-1} C_p^{l-1}). \tag{A.16} \]
Then we note that,
-
1
l+--~ (d(
E
Ci, Ci2""Ci~.,)
(A.17)
..+il+l=pl
i l +. 0_ 1 then c = O. We can write,
a pl o =c
~ il+...+il=Pl
Ci, C~:... Ci,.
0 < i I < p , . . . , 0 < i , + 1 _ 2, we use Taylor's theorem to get m-1
/=1
11
dcr(1 - a)m-Z((x - y). O)m (V(y + cr(x - y)) - V(y)) }.
+ ~
(4.8)
Then we have by (2.3) and (2.4) IEo[wl(t, x, Y)]] --- ]Wl( t, x, Y)I m--1
1=i
lal=l+l
- y)m fro I d c r ( 1 - if)m-2( E
+~]x
)O'V(y + ff(x - y)) - O'V(y)12) 1/2 }
t(,l=~ m
_< ~ t { E a * /~2 c - - x
] ~ n c x--ylm+~}. y.~V(y) 1-16 + a(m-l)
(4.9)
1=2
To proceed note here that for a > O,
tae -t/2 Then we have from (4.9) by (4. I0)
_ O.
(4.10)
T. Ichinose, S. Takanobu
t
Idt,(t, x, Y)I = lEo[wl(t,x,y)]le-~(v(z)+V(Y))
2. We have from (4.5), Iw(t, x,
_
2, completing the proof of Theorem 3.1.
5. Proof of Theorem 3.2
5.1. The case m = 0. We obtain directly from (3.21b) Id(t,x,y;A) I
1, and the condition (2.3) holds with b = 1. We can use for d2(t, x, y; A) and d3(t, x, y; A) the results of Theorem 3.1, but have to make a new treatment for dl(t, x, y; A). In fact, we have
KacOperator
189
Id2(t,x,y;A)l
< ~_~Eo[l~(t,x, -
"'J"] e
Y)I
~
,
(5.1)
j=2
Ida(t,x,y',A) l < l e o [Llw(t,x,y)lm§
dO(1 - O)~
So we can give for $d_2(t, x, y; A)$ and $d_3(t, x, y; A)$ the same bounds as $d_2(t, x, y)$ and $d_3(t, x, y)$ in (3.13c), (3.12c) and (3.13d). Now we come to $d_1(t, x, y; A)$, which is rewritten based on the decomposition (4.6a) of $w(t, x, y)$ as
\[ d_1(t, x, y; A) = \sum_{l=1}^{4} d_{1l}(t, x, y; A), \tag{5.3a} \]
=
-Eo[e-i~(t'x'Y)wl(t,x, y)]e -~(v(x)+y(y)), 1 < l < 4,
(5.3b)
with $w_l(t, x, y)$, $1 \le l \le 4$, in (4.6bcde), where $d_{13}(t, x, y; A)$ is absent if m = 1. Among $d_{1l}(t, x, y; A)$, $1 \le l \le 4$, we have to treat $d_{12}(t, x, y; A)$ anew, but can use for the others the previous results in Sect. 4. In fact, we have
[dlz(t,x,y;A)[
$\kappa > 0$ is the Chern-Simons coupling parameter, the symbol tr refers to the trace in the matrix representation of $\mathcal{G}$, the Minkowski spacetime metric is defined by $g_{\mu\nu} = \mathrm{diag}\{-1, 1, 1\}$, which is used to raise or lower indices, and the potential energy density of the Higgs field is given by the special formula
\[ V(\phi, \phi^\dagger) = \frac{1}{4\kappa^2}\, \mathrm{tr}\Big\{ \big( [[\phi, \phi^\dagger], \phi] - v^2\phi \big)^\dagger \big( [[\phi, \phi^\dagger], \phi] - v^2\phi \big) \Big\} \]
in order to write the energy density of the model as a sum of squares and a divergence term, and v > 0 is a constant which measures either the scale of the broken symmetry or the subcritical temperature of the system. In the static case, this special property of the energy density implies an explicit energy lower bound, expressed in terms of magnetic or electric charges, which may be attained by the solutions of the following self-dual Chern-Simons equations:
\[ D_-\phi = 0, \qquad F_{+-} = \frac{1}{\kappa^2}\big[ v^2\phi - [[\phi, \phi^\dagger], \phi],\, \phi^\dagger \big], \tag{1} \]
where $D_- = D_1 - iD_2$ and $F_{+-} = \partial_-A_+ - \partial_+A_- + [A_-, A_+]$ with $A_\pm = A_1 \pm iA_2$ and $\partial_\pm = \partial_1 \pm i\partial_2$. Of course the solutions of (1) also solve the Euler-Lagrange equations of $\mathcal{L}$. For the above general system, the only known exact solutions so far in the literature are the zero-energy (trivial) solutions determined by the algebraic equation
Y. Yang
The purpose of the present paper is to establish an existence theorem for a large class of nontrivial solutions for the original self-dual Chern-Simons system (1), commonly called topological solitons or multivortices. Since ~ is semi-simple, it is most convenient to use the Cartan-Weyl basis or the root space decomposition to express the generators of G or ~ for which the mutually commuting set, generating the Cartan subalgebra of ~, is denoted by {H~}l 0 so that
\[ |L^\tau R U - (PL)^{-1}\mathbf{1}|^2 \ge c_0\, |U - \mathbf{1}|^2, \tag{26} \]
which is a stronger version of (25). In fact, by (22), we can rewrite the left-hand side of (26) as $|L^\tau R(U - \mathbf{1})|^2$. Thus (26) follows from the fact that $L^\tau R$ is nonsingular.

We are ready to use functional analysis to prove the existence of a minimizer of the energy (24) in the standard Sobolev spaces. For this purpose, let $W^{k,p}$ denote the space of scalar or vector-valued functions with distributional derivatives up to order k which are all lying in $L^p(\mathbb{R}^2)$. The norm of $L^p$ will be written $\|\cdot\|_p$. Consider the optimization problem
\[ \min\{ I(w) \mid w \in W^{1,2} \}. \tag{27} \]
Because the matrix P L is nonsingular, we may switch to the variable v from w via (17) back and forth according to convenience. Using (26) and the variable v, we can find suitable constants cl, c2, c3 > 0 such that
Relativistic Non-Abelian Chern-Simons Equations
\[ I(w) \ge c_1 \sum_{a=1}^{r} \big( \|\nabla v_a\|_2^2 + \|e^{u_a^0 + v_a} - 1\|_2^2 \big) - c_2 \sum_{a=1}^{r} \|\, |h|\, v_a \|_1 - c_3. \tag{28} \]
It is easily checked that
\[ |e^s - 1| \ge \frac{|s|}{1 + |s|}, \quad \forall s \in \mathbb{R}. \]
o
/R
lie ~'o+~o - 111~ >
l U0a + V a [0 2
~ (l+luO+vOl)2 (29)
-> 2
~ (1 + I~~ + Iv~l) 2 -
~ (1 + luOl)2"
Note that since (14) implies $u_a^0 = O(|x|^{-2})$ for large |x|, the last integral on the right-hand side of (29) is convergent. We recall the standard embedding inequality in two dimensions:
\[ \|v\|_4^2 \le \sqrt{2}\, \|v\|_2 \|\nabla v\|_2, \quad v \in W^{1,2}. \tag{30} \]
Thus, in view of (30), we have for any meaningful insertion u0 the upper bound
Ilvll4 = < 2
ivl)(1 + luol + I~1)1~1 2 (1 +
_< C(1 +
'v12
luol + Ivl)2
~ (lu~ + Iv12 + Iv'4)
Ilvll~ + Ilvll~llW@
s
IvF ~ (1 + luol + Ivl)2
1 4+C( ivl 2 4 < ~llvll2 [~2 (l+luol+lvl) 2] +llVvll8+
1).
Here and in the sequel, C denotes a generic positive constant. Thus we get
(
Ilvll~ _< c 1 + IlVvlt~ +
/. 2 (1 + luol + Ivl)~)
"
(31)
On the other hand, for any v E W 1'2, we have in view of (30) that
II Ihlvll,
< Ilhl14/311vl14
< ellvll2 + CIIvvll2 +
c
_< ~(llvl12 + IlVvll~) + ~. Eo
(32)
Now insert (29), (31) and (32) into (28) with $u^0 = u_a^0$, $v = v_a$ (a = 1, 2, ..., r). We arrive at the following coercive inequality:
\[ I(w) \ge c_1 \|v\|_{W^{1,2}} - c_2 \ge c_3 \|w\|_{W^{1,2}} - c_4, \tag{33} \]
where the constants c's above are all positive because of the fact that (16) defines an invertible transformation $w \mapsto v$ from $W^{1,2}$ to itself.

It is easily seen that (24) is finite everywhere on $W^{1,2}$. In fact, similar to (26), we have
\[ |L^\tau R U - (PL)^{-1}\mathbf{1}|^2 \le c_1 |U - \mathbf{1}|^2 \le 2c_1 \sum_{a=1}^{r} \big( |e^{u_a^0}(e^{v_a} - 1)|^2 + |e^{u_a^0} - 1|^2 \big) \le c_2 \sum_{a=1}^{r} \big( |e^{v_a} - 1|^2 + |e^{u_a^0} - 1|^2 \big) \]
for some constants $c_1, c_2 > 0$. Using the MacLaurin series
\[ (e^f - 1)^2 = f^2 + \sum_{k=3}^{\infty} \frac{2^k - 2}{k!} f^k \]
and (37) in the next section as in [44], we may find a bound $C_w > 0$ depending on $w \in W^{1,2}$ so that $\int |L^\tau R U - (PL)^{-1}\mathbf{1}|^2 \le C_w$, which proves the finiteness of (24) on $W^{1,2}$.

In view of (33) we see that the functional (24) is bounded from below on $W^{1,2}$. Set
\[ \eta_0 = \inf\{ I(w) \mid w \in W^{1,2} \}, \]
and let $\{w^{(n)}\}$ be a sequence in $W^{1,2}$ satisfying $I(w^{(n)}) \to \eta_0$ as $n \to \infty$. The inequality (33) says that $\{w^{(n)}\}$ is bounded in $W^{1,2}$. Without loss of generality, we can assume that $\{w^{(n)}\}$ weakly converges to an element $w \in W^{1,2}$. We now show that w is a solution to the problem (27).

Of course, the finiteness of I(w) implies that for any $\varepsilon > 0$, there is a bounded domain $\Omega$ so that the truncated energy (24) over $\Omega$ (in other words the integral in (24) is now taken over $\Omega$ instead of $\mathbb{R}^2$), $I_\Omega(w)$, satisfies $I_\Omega(w) > I(w) - \varepsilon$. Recall the Trudinger-Moser inequality of the form
\[ \int_\Omega e^f \le C_1 e^{C_2 \|f\|^2_{W^{1,2}(\Omega)}}, \quad f \in W^{1,2}(\Omega), \]
and the compact embedding $W^{1,2}(\Omega) \to L^p(\Omega)$ (p > 2). Thus the structure of $I_\Omega$ implies that $I_\Omega$ is weakly lower semi-continuous over $W^{1,2}(\Omega)$. Consequently,
\[ \lim_{n \to \infty} I_\Omega(w^{(n)}) \ge I_\Omega(w), \]
where, without loss of generality, we have assumed the convergence of the sequence of numbers $\{I_\Omega(w^{(n)})\}$ because otherwise we can always focus on a convergent subsequence. Besides, we may also assume that the bounded domain $\Omega$ is so chosen that
\[ \Big| \int_{\mathbb{R}^2 - \Omega} \big( h \cdot w^{(n)} + h \cdot (PL)^{-1} v^0 \big) \Big| < \varepsilon, \quad \forall n. \]
Hence, we obtain
\[ I(w^{(n)}) \ge I_\Omega(w^{(n)}) + \int_{\mathbb{R}^2 - \Omega} \big( h \cdot w^{(n)} + h \cdot (PL)^{-1} v^0 \big) \ge I_\Omega(w^{(n)}) - \varepsilon. \]
Letting $n \to \infty$ in the above, we have $\eta_0 \ge I_\Omega(w) - \varepsilon > I(w) - 2\varepsilon$. Since $\varepsilon > 0$ can be arbitrarily small, we conclude that $I(w) \le \eta_0$. This proves that w solves (27). Thus w is a weak solution of (19). Using the elliptic regularity theory, we see that w is a smooth solution. Therefore a solution of (20) is obtained. We now turn to the study of the behavior of the solution at infinity.
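The finiteness argument relies on MacLaurin expansions of powers of $e^f - 1$, such as the series for $(e^f - 1)^2$ quoted above. As a hedged illustration (the helper names are hypothetical and not taken from the paper), the coefficients can be generated from the binomial expansion of $(e^f - 1)^n$ and compared both with the closed forms and with a direct numerical evaluation:

```python
from fractions import Fraction
from math import comb, exp, factorial


def maclaurin_coeff(n: int, k: int) -> Fraction:
    """Coefficient of f**k in the MacLaurin series of (e**f - 1)**n.

    The binomial theorem writes (e**f - 1)**n as a signed sum of exponentials
    e**(j*f), and the k-th Taylor coefficient of e**(j*f) is j**k / k!.
    """
    total = sum((-1) ** (n - j) * comb(n, j) * j ** k for j in range(n + 1))
    return Fraction(total, factorial(k))


def partial_sum(n: int, f: float, terms: int = 40) -> float:
    """Truncated series, for a numerical comparison with (e**f - 1)**n."""
    return sum(float(maclaurin_coeff(n, k)) * f ** k for k in range(terms))


# Closed form for n = 2: the coefficient of f**k is (2**k - 2) / k! for k >= 3.
square_ok = all(
    maclaurin_coeff(2, k) == Fraction(2 ** k - 2, factorial(k)) for k in range(3, 15)
)
# Closed form for n = 4: (4**k - 4*(3**k + 1) + 6*2**k) / k! for k >= 4.
fourth_ok = all(
    maclaurin_coeff(4, k) == Fraction(4 ** k - 4 * (3 ** k + 1) + 6 * 2 ** k, factorial(k))
    for k in range(4, 15)
)
```

The same function also shows that the series for $(e^f - 1)^n$ starts at order $f^n$, since all lower coefficients cancel.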
5. Asymptotic Behavior We first rewrite (20), after taking the shift v ~-~ v ~ + v, in the form Av = A(KRU - KR)(KRU
-
Since the matrix K R is invertible and K R 1
IKRU
- KRI
0 is a suitable constant. Using (35) in (34), we see that there are constants C 1 , C2 > 0 SO that
]the right-hand side of (34) I < c, ~ ( e ~'~
-
le ~'~176 - I I + Ig[. (36)
1) 2 + c2
a=1
a=l
The existence proof carried out in the last section already showed $e^{u_a^0 + v_a} - 1 \in L^2$ (a = 1, 2, ..., r). We need to derive now $e^{u_a^0 + v_a} - 1 \in L^4$ by using $v_a \in W^{1,2}$ (a = 1, 2, ..., r), also established earlier in the existence proof. Denote by $u^0$ and v any pair among $u_1^0, \ldots, u_r^0$ and $v_1, v_2, \ldots, v_r$, respectively. We proceed as follows. Since $u^0 < 0$, we have
\[ |e^{u^0 + v} - 1| \le |e^v - 1| + |e^{u^0} - 1|. \]
So it suffices to verify that $e^v - 1 \in L^4$ for $v \in W^{1,2}$. For this purpose, we recall the following embedding inequality in two dimensions:
II/1[,-
2.
(37)
We then use the MacLaurin series
\[ (e^v - 1)^4 = \sum_{k=4}^{\infty} \frac{4^k - 4(3^k + 1) + 6 \cdot 2^k}{k!}\, v^k \]
and (37) to obtain a formal upper estimate, k--2
Ile~ - 1114
2) which are characterized by the boundary condition $\lim_{|x| \to \infty} e^{u_a(x)} = 0$, a = 1, 2, ..., r. Besides, the existence of doubly periodic solutions of (7) is also an important open problem. It is not clear whether the methods in [4, 41], for the simplest case r = 1, may be applied to (7) for r > 2.

Technically, we have seen that the Cholesky decomposition is a powerful tool in the variational formulation of a system of nonlinear partial differential equations. It is expected that such a tool will be useful for the study of other problems involving systems of nonlinear equations. In fact we have succeeded in obtaining a complete resolution for the general system
\[ \Delta u_a = \sum_{b=1}^{r} K_{ab}\big( e^{u_b} - r_b \big) + 4\pi \sum_{j=1}^{N_a} \delta_{p_{aj}}, \qquad \lim_{|x| \to \infty} e^{u_a(x)} = r_a, \quad a = 1, 2, \ldots, r, \tag{50} \]
where $K = (K_{ab})$ is a positive definite symmetric matrix and $r_a > 0$ (a = 1, 2, ..., r) are constants. For (50), the case r = 1 is the self-dual vortex equation in superconductivity [21], the case r = 2 arises in the extended electroweak model of Bimonte and Lozano containing two Higgs doublets which lie in the fundamental representation of the gauge group [2, 43], and the general case, r = 1, 2, ..., is derived by Schroers as the governing equations for gauged linear sigma model solitons [37]. In [44] existence and uniqueness results for arbitrary r are presented for the system defined either on a closed 2-surface or the full plane $\mathbb{R}^2$.

Acknowledgement. I would like to thank Gerald Dunne for helpful correspondence, whose talk at the AMS special session on gauge field theory held at the Courant Institute in April 1996 initiated this work.
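The concluding remarks single out the Cholesky decomposition of a positive definite symmetric matrix such as K in (50). The following is a minimal textbook sketch of that factorization, $K = LL^\tau$ with L lower triangular. It is not the paper's own construction: the function name is hypothetical, and the 2 x 2 sample matrix is an illustrative assumption, chosen only because it is symmetric and positive definite.

```python
import math


def cholesky(matrix):
    """Return the lower triangular factor L with matrix = L * L^T.

    Raises ValueError if a non-positive pivot shows that the matrix
    is not positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


# Illustrative positive definite symmetric matrix (an assumption, not from the text).
K = [[2.0, -1.0], [-1.0, 2.0]]
L = cholesky(K)

# Rebuild K from its factor to confirm the decomposition.
rebuilt = [[sum(L[i][k] * L[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
```

Because the pivots are checked as the factor is built, the routine doubles as a test of positive definiteness.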
References

1. Abrikosov, A.A.: On the magnetic properties of superconductors of the second group. Sov. Phys. JETP 5, 1174-1182 (1957)
2. Bimonte, G., Lozano, G.: Vortex solutions in two-Higgs-doublet systems. Phys. Lett. B326, 270-275 (1994)
3. Bogomol'nyi, E.: The stability of classical solutions. Sov. J. Nucl. Phys. 24, 449-454 (1976)
4. Caffarelli, L.A., Yang, Y.: Vortex condensation in the Chern-Simons Higgs model: An existence theorem. Commun. Math. Phys. 168, 321-336 (1995)
5. Chen, X., Hastings, S., McLeod, J.B., Yang, Y.: A nonlinear elliptic equation arising from gauge field theory and cosmology. Proc. R. Soc. Lond. 446, 453-478 (1994)
6. de Vega, H.J., Schaposnik, F.: Electrically charged vortices in non-Abelian gauge theories with Chern-Simons term. Phys. Rev. Lett. 56, 2564-2566 (1986)
7. Dunne, G.: Self-Dual Chern-Simons Theories. Lecture Notes in Physics, vol. m36, Berlin: Springer, 1995
8. Dunne, G.: Mass degeneracies in self-dual models. Phys. Lett. B345, 452-457 (1995)
9. Dunne, G., Jackiw, R., Pi, S.-Y., Trugenberger, C.: Self-dual Chern-Simons solitons and two-dimensional nonlinear equation. Phys. Rev. D43, 1332-1345 (1991)
10. Fröhlich, J., Marchetti, P.: Quantum field theory of anyons. Lett. Math. Phys. 16, 347-358 (1988)
11. Fröhlich, J., Marchetti, P.: Quantum field theory of vortices and anyons. Commun. Math. Phys. 121, 177-223 (1989)
12. Ganoulis, N., Goddard, P., Olive, D.: Self-dual monopoles and Toda molecules. Nucl. Phys. B205, 601-636 (1982)
13. Ginzburg, V.L., Landau, L.D.: On the theory of superconductivity. In: Collected Papers of L. D. Landau, edited by D. Ter Haar, New York: Pergamon, 1965, pp. 546-568
14. Golub, G.H., Ortega, J.M.: Scientific Computing and Differential Equations. San Diego: Academic, 1992
15. Greiner, W., Müller, B.: Quantum Mechanics - Symmetries. 2nd ed., Berlin-New York: Springer, 1994
16. Hong, J., Kim, Y., Pac, P.-Y.: Multivortex solutions of the Abelian Chern-Simons-Higgs theory. Phys. Rev. Lett. 64, 2330-2333 (1990)
17. Humphreys, J.E.: Introduction to Lie Algebras and Representation Theory. New York and Heidelberg: Springer, 1972
18. Jackiw, R., Pi, S.-Y.: Soliton solutions to the gauged nonlinear Schrödinger equation on the plane. Phys. Rev. Lett. 64, 2969-2972 (1990)
19. Jackiw, R., Pi, S.-Y.: Classical and quantum nonrelativistic Chern-Simons theory. Phys. Rev. D42, 3500-3513 (1990)
20. Jackiw, R., Weinberg, E.J.: Self-dual Chern-Simons vortices. Phys. Rev. Lett. 64, 2334-2337 (1990)
21. Jaffe, A., Taubes, C.H.: Vortices and Monopoles. Boston: Birkhäuser, 1980
22. Julia, B., Zee, A.: Poles with both magnetic and electric charges in non-Abelian gauge theory. Phys. Rev. D11, 2227-2232 (1975)
23. Kao, H.C., Lee, K.: Self-dual SU(3) Chern-Simons Higgs systems. Phys. Rev. D50, 6626-6632 (1994)
24. Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195-338 (1979)
25. Kumar, C., Khare, A.: Charged vortex of finite energy in non-Abelian gauge theories with Chern-Simons term. Phys. Lett. B178, 395-399 (1986)
26. Lee, K.: Relativistic nonabelian self-dual Chern-Simons systems. Phys. Lett. B255, 381-384 (1991)
27. Lee, K.: Self-dual nonabelian Chern-Simons solitons. Phys. Rev. Lett. 66, 553-555 (1991)
28. Leznov, A.N.: On the complete integrability of a nonlinear system of partial differential equations in two-dimensional space. Theoret. Math. Phys. 42, 225-229 (1980)
29. Leznov, A.N., Saveliev, M.V.: Representation of zero curvature for the system of nonlinear partial differential equations $x_{\alpha,z\bar{z}} = \exp(kx)_\alpha$ and its integrability. Lett. Math. Phys. 3, 489-494 (1979)
30. Leznov, A.N., Saveliev, M.V.: Representation theory and integration of nonlinear spherically symmetric equations to gauge theories. Commun. Math. Phys. 74, 111-118 (1980)
31. Mansfield, P.: Solutions of Toda systems. Nucl. Phys. B208, 277-300 (1982)
32. Mikhailov, A., Olshanetsky, M., Perelomov, A.: Two-dimensional generalized Toda lattice. Commun. Math. Phys. 79, 473-488 (1981)
33. Nielsen, H., Olesen, P.: Vortex-line models for dual strings. Nucl. Phys. B61, 45-61 (1973)
34. Olive, D., Turok, N.: The symmetries of Dynkin diagrams and the reduction of Toda field equations. Nucl. Phys. B215, 470-494 (1983)
35. Paul, S., Khare, A.: Charged vortices in an Abelian Higgs model with Chern-Simons term. Phys. Lett. B174, 420-422 (1986)
36. Paul, S., Khare, A.: Charged vortex of finite energy in nonabelian gauge theories with Chern-Simons term. Phys. Lett. B178, 395-399 (1986)
37. Schroers, B.: The spectrum of Bogomol'nyi solitons in gauged linear sigma models. Nucl. Phys. B475, 440-468 (1996)
38. Spruck, J., Yang, Y.: Topological solutions in the self-dual Chern-Simons theory: Existence and approximation. Ann. Inst. Henri Poincaré - Anal. non linéaire 12, 75-97 (1995)
39. Spruck, J., Yang, Y.: The existence of non-topological solitons in the self-dual Chern-Simons theory. Commun. Math. Phys. 149, 361-376 (1992)
40. Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis. Berlin-New York: Springer, 1983
41. Tarantello, G.: Multiple condensate solutions for the Chern-Simons-Higgs theory. J. Math. Phys. 37, 3769-3796 (1996)
42. Wang, R.: The existence of Chern-Simons vortices. Commun. Math. Phys. 137, 587-597 (1991)
43. Yang, Y.: Topological solitons in the Weinberg-Salam theory. Physica D 101, 55-94 (1997)
44. Yang, Y.: On a system of nonlinear elliptic equations arising in theoretical physics. Preprint, 1996

Communicated by T. Miwa
Commun. Math. Phys. 186, 219-231 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
A New N = 6 Superconformal Algebra
Shun-Jen Cheng 1,*, Victor G. Kac 2,**
1 Department of Mathematics, National Cheng-Kung University, Tainan, Taiwan
2 Department of Mathematics, MIT, Cambridge, MA 02139, USA
Received: 3 October 1996 / Accepted: 12 December 1996
Abstract: In this paper we construct a new $N = 6$ superconformal algebra which extends the Virasoro algebra by the $SO_6$ current algebra, by 6 odd primary fields of conformal weight $\frac32$ and by 10 odd primary fields of conformal weight $\frac12$. The commutation relations of this algebra, which we will refer to as $CK_6$, are represented by short distance operator product expansions (OPE). We construct $CK_6$ as a subalgebra of the $SO(6)$ superconformal algebra $K_6$, thus giving it a natural representation as first order differential operators on the circle with $N = 6$ extended symmetry. We show that $CK_6$ has no nontrivial central extensions.
1. Introduction
Superconformal algebras have been playing a fundamental role in string theory and conformal field theory. A rigorous mathematical definition of a superconformal algebra is as follows [3]. This is a simple Lie superalgebra $g$ over $\mathbb{C}$ spanned by the modes of a finite family $F$ of pairwise local fields such that (i) $F$ contains the Virasoro field $L(z) = \sum_{n\in\mathbb{Z}} L_n z^{-n-2}$, (ii) the coefficients of the OPE between the fields from $F$ are linear combinations of the fields from $F$ and their derivatives. One can show ([3] Corollary 4.7) that (ii) follows from the property that $L(z)$ is an energy-momentum field, i.e. $[L_{-1}, \phi(z)] = \partial\phi(z)$, $\phi(z)\in F$. In the study of representations of a superconformal algebra $g$ it is important to consider central extensions of $g$. This amounts to considering $\mathbb{C}$-valued 2-cocycles on $g$. Hence we can (and will) ignore central terms in the study of superconformal algebras and consider the calculation of 2-cocycles as a separate problem.
* Partially supported by NSC grant 85-2121-M-006-019 of the ROC.
** Partially supported by NSF grant DMS-9622870.
Recall the OPE for $L(z)$:

$$L(z)L(w) \sim \frac{\partial L(w)}{z-w} + \frac{2L(w)}{(z-w)^2}.$$
In particular the Virasoro algebra is a superconformal algebra with trivial odd part. Recall that a field $\phi$ is called primary of conformal weight $\Delta\in\mathbb{C}$ if

$$L(z)\phi(w) \sim \frac{\partial\phi(w)}{z-w} + \frac{\Delta\phi(w)}{(z-w)^2}.$$
Note that if all fields from $F$ are primary, then $L(z)$ is an energy-momentum field. A primary field of conformal weight 1 is called a current. A current subalgebra is a collection of currents $\{\phi_i\}$ such that $\phi_i(z)\phi_j(w) \sim \sum_k c_{ij}^k\phi_k(w)/(z-w)$. (For a detailed discussion of all these notions see e.g. [3].)
One may find a conjecture on the classification of superconformal algebras in [3] (cf. [5]). This conjecture states that the algebra $CK_6$, constructed in the present paper, is the only new superconformal algebra. Towards the proof of this conjecture the following result has been established.
Theorem 1.1. ([3, 4]) Let $g$ be a superconformal algebra such that its even part $g_{\bar 0}$ is spanned by $L(z)$ and currents and its odd part is spanned by fields of conformal weights $\frac12$ and $\frac32$. Then $g$ is isomorphic to one of the following superconformal algebras:
(i) $K_N$ for $0 \le N \le 3$ (see [1, 2, 5] or Sect. 2) spanned by $2^N$ fields,
(ii) the $N = 4$ superconformal algebra (see [1, 5] or Sect. 3) spanned by 8 fields,
(iii) $W(1,2)$ (see [2, 5] or Sect. 2) spanned by 12 fields,
(iv) $CK_6$ (see Sect. 3) spanned by 32 fields.
At this point one should mention the paper by Ramond and Schwarz [6], which gives a classification of algebras satisfying the conditions of Theorem 1.1 and some additional conditions. The additional conditions exclude the algebras $W(1,2)$ and $CK_6$, so that the answer given in [6] is correct, but the proof is not quite correct. In the present paper we give an explicit construction of the superconformal algebra $CK_6$ and prove that it has no non-trivial central extensions (in contrast to all other superconformal algebras appearing in the list of Theorem 1.1). The absence of central extensions seems to be disappointing, as it prevents the construction of simple chiral algebras based on $CK_6$. On the other hand this feature seems to be useful for string theory, as it automatically excludes anomalies.
2. Preliminaries
In this section we will set the notation to be used throughout this paper. We denote by $\mathbb{C}[t, t^{-1}]$ the ring of Laurent polynomials in the indeterminate $t$ over the complex numbers $\mathbb{C}$. Let $\xi_1, \xi_2, \dots, \xi_N$ be $N$ odd indeterminates, and let $\Lambda(N)$ be the Grassmann superalgebra in these indeterminates. We let $\Lambda(1,N)$ stand for the (super)commutative associative superalgebra $\Lambda(N)\otimes\mathbb{C}[t, t^{-1}]$. Let $\partial$ be the even differentiation with respect to $t$ and let $\frac{\partial}{\partial\xi_i}$ be the odd differentiation with respect to $\xi_i$, for $i = 1, 2, \dots, N$. The Lie superalgebra of derivations of $\Lambda(1,N)$ is denoted by $W(1,N)$. It consists of elements of the form ([2, 5]):
$$D = a_0\partial + \sum_{i=1}^{N} a_i\frac{\partial}{\partial\xi_i}, \qquad a_0, a_1, \dots, a_N \in \Lambda(1,N).$$
The algebras $K_N$ ([2, 5]). Consider the differential form $\Omega = dt - \sum_{i=1}^{N}\xi_i\,d\xi_i$. The Lie superalgebra $K_N$ is the subalgebra of $W(1,N)$ given by

$$K_N := \{D\in W(1,N) \mid D\Omega = P\Omega \text{ for some } P\in\Lambda(1,N)\}.$$

This Lie superalgebra is isomorphic to the algebra known in physics as the $SO(N)$ superconformal algebra [1]. Equivalently $K_N$ is the subalgebra of $W(1,N)$ consisting of elements of the form

$$D^u := 2u\,\partial + (-1)^{\deg u}\sum_{i=1}^{N} D_i(u)D_i,$$

where $D_i = \xi_i\partial + \frac{\partial}{\partial\xi_i}$ and $u\in\Lambda(1,N)$. From this we see that $K_N$ can be identified with $\Lambda(1,N)$, where the Lie bracket in $K_N$ is identified with the contact bracket on $\Lambda(1,N)$, which is given by ([2, 5]):

$$[u,v] = \Big(2u - \sum_{i=1}^{N}\xi_i\frac{\partial u}{\partial\xi_i}\Big)\frac{\partial v}{\partial t} - \frac{\partial u}{\partial t}\Big(2v - \sum_{i=1}^{N}\xi_i\frac{\partial v}{\partial\xi_i}\Big) + (-1)^{\deg u}\sum_{i=1}^{N}\frac{\partial u}{\partial\xi_i}\frac{\partial v}{\partial\xi_i}.$$
Next we write down the commutation relations of $K_N$ in field notation. Let $I := \{i_1, \dots, i_k\}$ be an ordered subset of the set $\{1, \dots, N\}$ of positive integers from 1 to $N$. We let $|I| = k$ and $\xi_I := \xi_{i_1}\wedge\dots\wedge\xi_{i_k}\in\Lambda(N)$, and define the field

$$\xi_I(z) := \sum_{n\in\mathbb{Z}}(\xi_I\otimes t^n)\,z^{-n-1}.$$
Proposition 2.1. The fields $\xi_I(z)$ are pairwise local, and the operator product expansion of the fields $\xi_I(z)$ and $\xi_J(w)$, where $I$ and $J$ are ordered subsets of $\{1, \dots, N\}$, is given by

$$\xi_I(z)\xi_J(w) \sim (|I|+|J|-4)\frac{(\xi_I\wedge\xi_J)(w)}{(z-w)^2} + (|I|-2)\frac{\partial(\xi_I\wedge\xi_J)(w)}{z-w} + (-1)^{|I|}\sum_{i=1}^{N}\frac{\big(\frac{\partial\xi_I}{\partial\xi_i}\wedge\frac{\partial\xi_J}{\partial\xi_i}\big)(w)}{z-w}.$$

In particular, $K_N$ is a superconformal algebra with the Virasoro field $-\frac12\xi_\emptyset(z) = -\frac12\sum_{n\in\mathbb{Z}}(1\otimes t^n)z^{-n-1}$, and the field $\xi_I(z)$ is primary (with respect to this Virasoro field) of conformal weight $2-\frac{|I|}{2}$.
Proof. The proof of the proposition is straightforward using the definition of the contact bracket above. □
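The consistency between the contact bracket and the pole coefficients of Proposition 2.1 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it compares the coefficient of $(\xi_I\wedge\xi_J)\,t^{m+n-1}$ obtained from the first two terms of the contact bracket of $\xi_I t^m$ and $\xi_J t^n$ with the one obtained from the double and simple poles of the OPE, using the standard mode formula $[a_{(m)}, b_{(n)}] = \sum_j \binom{m}{j}(c_j)_{(m+n-j)}$ and $(\partial x)_{(k)} = -k\,x_{(k-1)}$. The function names are hypothetical and the term with the odd derivatives is not covered.

```python
def bracket_coeff(deg_i, deg_j, m, n):
    """Coefficient of (xi_I ^ xi_J) t^(m+n-1) in the contact bracket
    [xi_I t^m, xi_J t^n], read off from the first two terms of the bracket.
    deg_i = |I|, deg_j = |J| (xi_i d/dxi_i summed over i acts as the degree)."""
    return (2 - deg_i) * n - (2 - deg_j) * m


def ope_coeff(deg_i, deg_j, m, n):
    """The same coefficient recovered from the poles of Proposition 2.1.

    Assumes the standard mode formula: a double pole c(w)/(z-w)^2 contributes
    m * c_(m+n-1), and a simple pole d(w)/(z-w) contributes d_(m+n), where
    d = beta * (dx) has modes -beta * (m+n) * x_(m+n-1).
    """
    double_pole = deg_i + deg_j - 4  # coefficient of (xi_I ^ xi_J)(w)/(z-w)^2
    simple_pole = deg_i - 2          # coefficient of d(xi_I ^ xi_J)(w)/(z-w)
    return m * double_pole - (m + n) * simple_pole


def check(max_degree=6, mode_range=5):
    """Compare both expressions for all degrees and a window of modes."""
    return all(
        bracket_coeff(p, q, m, n) == ope_coeff(p, q, m, n)
        for p in range(max_degree + 1)
        for q in range(max_degree + 1)
        for m in range(-mode_range, mode_range + 1)
        for n in range(-mode_range, mode_range + 1)
    )
```

Calling `check()` compares the two expressions over all degrees up to 6 and modes between $-5$ and $5$; agreement there reflects the identity $m(|I|+|J|-4) - (m+n)(|I|-2) = (2-|I|)n - (2-|J|)m$.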
3. The Algebras $K_6$ and $CK_6$
The goal of this section is to construct a new $N = 6$ superconformal algebra as a subalgebra of $K_6$. In order to do so we will first prove a few simple lemmas that will facilitate the computations in this section significantly. First let us recall the notion of the Hodge dual. Let $I = \{i_1, i_2, \dots, i_k\}$ be a subset of $\{1, 2, \dots, N\}$. Let $I^* = \{j_1, j_2, \dots, j_{N-k}\}$ be the complement of $I$ in $\{1, 2, \dots, N\}$. For $\xi_I = \xi_{i_1}\wedge\xi_{i_2}\wedge\dots\wedge\xi_{i_k}$ we define the Hodge dual of $\xi_I$ to be $\xi_I^* = \pm\xi_{j_1}\wedge\xi_{j_2}\wedge\dots\wedge\xi_{j_{N-k}}$ such that $\xi_I\wedge\xi_I^* = \xi_1\wedge\xi_2\wedge\dots\wedge\xi_N$. The Hodge dual $\xi_I^*(z)$ of the field $\xi_I(z)$ is defined in an obvious way. In what follows small letters $i$, $j$, $k$, etc. will denote elements in $\{1, 2, \dots, N\}$, while capital letters $I$, $J$ will denote subsets of $\{1, 2, \dots, N\}$.
Lemma 3.1.

$$\sum_{t=1}^{N}\frac{\partial\xi_i}{\partial\xi_t}\wedge\frac{\partial\xi_J^*}{\partial\xi_t} = (-1)^{|J|}(\xi_i\wedge\xi_J)^*.$$
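Proof. If $i\in J$, then the right-hand side and the left-hand side are both 0. Hence we can assume that $i\notin J$. In this case there exists some $J'$ such that $\xi_J^* = \xi_i\wedge\xi_{J'}$. We have $\xi_J\wedge\xi_J^* = \xi_J\wedge\xi_i\wedge\xi_{J'} = \xi_{12\cdots N}$, which implies that $(-1)^{|J|}\xi_i\wedge\xi_J\wedge\xi_{J'} = \xi_{12\cdots N}$. Hence $(\xi_i\wedge\xi_J)^* = (-1)^{|J|}\xi_{J'}$. On the other hand $\sum_{t=1}^{N}\frac{\partial\xi_i}{\partial\xi_t}\wedge\frac{\partial\xi_J^*}{\partial\xi_t} = \frac{\partial\xi_i}{\partial\xi_i}\wedge\frac{\partial\xi_J^*}{\partial\xi_i} = \xi_{J'}$. □
The next two lemmas can be proved similarly and will be stated without proofs.
Lemma 3.2.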
N
t=l
t=l
( Z O~ij O~J).
O~ij
.
O~j
Lemma 3.3. For N = 6 we have
@ A @r. = - ~
60~*j -~
A
O~;lm o~
t=l
Lemma 3.4. For $N = 6$ we have the following equality of singular parts of the OPEs:

$$\xi_{ijk}^*(z)\,\xi_{lmn}^*(w) \sim \xi_{ijk}(z)\,\xi_{lmn}(w).$$

Proof. If $|\{i,j,k\}\cap\{l,m,n\}| \ge 2$, then the right-hand side and the left-hand side are both equal to 0. So we can assume that either $\{i,j,k\}\cap\{l,m,n\} = \emptyset$ or $\{i,j,k\}\cap\{l,m,n\} = \{i\}$ with $i = l$. Let us first consider the case $\{i,j,k\}\cap\{l,m,n\} = \emptyset$. We can assume that $\xi_{ijk}\wedge\xi_{lmn} = \xi_{123456}$. The left-hand side equals

$$\frac{2(\xi_{lmn}\wedge(-\xi_{ijk}))(w)}{(z-w)^2} + \frac{\partial(\xi_{lmn}\wedge(-\xi_{ijk}))(w)}{z-w},$$

while the right-hand side is

$$\frac{2(\xi_{ijk}\wedge\xi_{lmn})(w)}{(z-w)^2} + \frac{\partial(\xi_{ijk}\wedge\xi_{lmn})(w)}{z-w}.$$

Clearly both sides are equal. Now suppose that $\{i,j,k\}\cap\{l,m,n\} = \{i\}$, $i = l$. Let $\{i,j,k,m,n,s\} = \{1,2,3,4,5,6\}$. We have $\xi_{ijk}^* = \epsilon\,\xi_{mn}\wedge\xi_s$, where $\epsilon = \pm1$. It follows that $\xi_{imn}^* = \epsilon\,\xi_{jks}$. The left-hand side then equals $-\frac{(\xi_{mn}\wedge\xi_{jk})(w)}{z-w}$, while the right-hand side equals $-\frac{(\xi_{jk}\wedge\xi_{mn})(w)}{z-w}$. □
As an immediate consequence we have
Corollary 3.1. For $N = 6$ we have:

$$\xi_{ijk}^*(z)\,\xi_{lmn}(w) \sim -\xi_{ijk}(z)\,\xi_{lmn}^*(w).$$

Thus the fields $\xi_{ijk}(z) + \alpha\,\xi_{ijk}^*(z)$ form a (super)commutative subalgebra of $K_6$, where $\alpha^2 = -1$.
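The sign bookkeeping behind Lemma 3.4 and Corollary 3.1 rests on the fact that, for $N = 6$, taking the Hodge dual of a cubic monomial twice gives back minus the monomial. A minimal sketch of this check is given below; the helper names are hypothetical, index sets are passed as increasing tuples, and the sign of $\xi_I^*$ is computed as the sign of the permutation $(I, I^*)$, which is what the normalization $\xi_I\wedge\xi_I^* = \xi_1\wedge\dots\wedge\xi_N$ amounts to.

```python
from itertools import combinations


def perm_sign(seq):
    """Sign of the permutation that sorts a sequence of distinct integers."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def hodge_dual(indices, n):
    """Return (sign, complement) with xi_I ^ (sign * xi_complement) = xi_1 ^ ... ^ xi_n."""
    complement = tuple(k for k in range(1, n + 1) if k not in indices)
    return perm_sign(tuple(indices) + complement), complement


def double_dual_sign(indices, n):
    """Sign s with (xi_I^*)^* = s * xi_I, for an increasing tuple `indices`."""
    s1, complement = hodge_dual(indices, n)
    s2, _ = hodge_dual(complement, n)
    return s1 * s2


if __name__ == "__main__":
    # Collect the double-dual signs of all cubic monomials for N = 6.
    print({double_dual_sign(I, 6) for I in combinations(range(1, 7), 3)})
```

For $N = 6$ and $|I| = 3$ the double dual carries the sign $(-1)^{3\cdot 3} = -1$, which is why Corollary 3.1 follows from Lemma 3.4 by replacing $\xi_{lmn}$ with its dual.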
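Lemma 3.5.

$$\xi_k\wedge\xi_{ij}^* = -\sum_{t=1}^{N}\frac{\partial\xi_{ij}}{\partial\xi_t}\wedge\frac{\partial\xi_k^*}{\partial\xi_t}.$$

Proof. Both sides are equal to 0 unless $k = i$ or $k = j$. This is clear for the left-hand side. For the right-hand side note that if $k\ne i$ and $k\ne j$, then $\xi_k^*$ contains both $\xi_i$ and $\xi_j$, whereas $\frac{\partial\xi_{ij}}{\partial\xi_t}$, when non-zero, is $\pm\xi_i$ or $\pm\xi_j$ with $t\in\{i,j\}$, so each term of the sum vanishes. Assume that $k = i$. The left-hand side in this case equals $\xi_i\wedge\xi_{ij}^*$. Now $\xi_{ij}\wedge\xi_{ij}^* = \xi_{12\cdots N}$. Hence $\xi_i^* = \xi_j\wedge\xi_{ij}^*$. So the right-hand side equals $-\frac{\partial\xi_{ij}}{\partial\xi_j}\wedge\frac{\partial\xi_i^*}{\partial\xi_j} = \xi_i\wedge\xi_{ij}^*$. □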
Now we are in a position to write down some of the OPEs of K6 in a form from which it is much easier to deduce that it contains an N = 6 superconformal subalgebra. The proofs of these formulas are simple applications of the previous lemmas. We have
~ ~ij(z) . ~k(w) ~
~i(Z)" ~j(W) ~
~i(z)"
(z -
w) 2 + - - ' z - w
2~ij(w)
O~j(w)
5ij~O(w)
(Z -- W ) 2
Z -- W
Z -- W
~jkz(w) ~
0~t(w)
~(zo),
Z--W
Z--W
~jk~.~(w) + ~ ~ j ( z ) . ~klm(W) ~ ~_--~--~ ~i*j(z) " ~ k l m ( w ) ~ij* /x ~ k ( w ) ~i*j(z) " ~ k ( w ) ~
(z -
. ~;dw)
~
(Z -- W) 2
Z -- W
'
Z--W
+ _~ i ~_( w )
w) 2
25ik~jl~;(W) ~j(z)
( o~~" A ~ ) ( w )
~:~
~
'
z -
w
+ ~, t=l
+
20((i*j A (k)(W) Z--W
~",~o~t A ~o~t ~,"* (- w ) Z--W
~3(z). ~;(w) ~
(~)*(w)
~ 26~(w)
~(z).~(w)
Z--W
~i3(w) ~i~o~(w) ~ - - 7
(Z -- W) 2
~i(z)-G(w)~
Z -- W
Z -- W
~;(w). Z--W
Theorem 3.1. For each complex number $\alpha$ such that $\alpha^2 = -1$ the following fields form an $N = 6$ superconformal subalgebra of $K_6$:
(i) $L(z) = -\frac12\xi_\emptyset(z) + \frac{\alpha}{2}\partial^3\xi_\emptyset^*(z)$ is the Virasoro field,
(ii) $\xi_{ij}(z) + \alpha\,\partial\xi_{ij}^*(z)$ are currents,
(iii) $\xi_i(z) - \alpha\,\partial^2\xi_i^*(z)$ are odd primary fields of conformal weight $\frac32$,
(iv) $\xi_{ijk}(z) + \alpha\,\xi_{ijk}^*(z)$ are odd primary fields of conformal weight $\frac12$.
The operator product expansions of this algebra are as follows (we omit the ones involving the Virasoro field as usual):

$$(\xi_{ijk}(z) + \alpha\,\xi_{ijk}^*(z))\,(\xi_{lmn}(w) + \alpha\,\xi_{lmn}^*(w)) \sim 0,$$
Et6=l ( - ~ t )(W)z--w A ~176
( ~ i j ( z ) + O~O~i~(Z)) " ( ~ k l ( w ) + O~O~l(W)) ~'~
Z--~I)
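$$(\xi_i(z) - \alpha\,\partial^2\xi_i^*(z))\,(\xi_j(w) - \alpha\,\partial^2\xi_j^*(w)) \sim -\frac{2(\xi_{ij}(w) + \alpha\,\partial\xi_{ij}^*(w))}{(z-w)^2} - \frac{\partial(\xi_{ij}(w) + \alpha\,\partial\xi_{ij}^*(w))}{z-w} + \frac{2\delta_{ij}L(w)}{z-w},$$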
6
(~q(z) + aO~i~(z)) . (~klm(w) + a ~ ; m ( W ) ) ~ ~
[0_~ A ~)(W)
" 04, z --
t=l 6 0 [ 0 04 ~. ` A ~)*(W) + O~) Z--W t=l
(~ij (z) + ~O~,*j (z)) . ((k(w) - aO 2 ~ (w)) o~q tw~ - a O 2 ( ~ 04k ~
(Z -- W) 2
)*(w)
Z--W
(~i(z) - aO2~*(z)) 9 (~r
Z--W
+ a~kl(w)) ~
a(~i~ k; (w) + aO~i~ k; (w)) Z--W
9
Proof. The fact that the fields of conformal weight $\frac12$ form a commutative subalgebra was already mentioned in Corollary 3.1. The calculations of the other OPEs are straightforward, although somewhat tedious. However those OPEs of $K_6$ written down explicitly above and the lemmas greatly simplify the task. □
We denote the algebra constructed in Theorem 3.1 by $CK_6$. Actually, Theorem 3.1 gives two different embeddings of $CK_6$ into $K_6$. It is immediate from the construction that $CK_6$ is a superconformal algebra whose current subalgebra is the $SO_6$ current algebra, and that the odd part of $CK_6$ is spanned by six primary fields of conformal weight $\frac32$ and ten primary fields of conformal weight $\frac12$, on which $SO_6$ acts as in the vector representation and as in one of two irreducible components of the exterior cube of the vector representation, respectively.
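The field counts and conformal weights quoted above can be tabulated directly from Proposition 2.1 and Theorem 3.1. The following sketch is only an illustrative tally, with hypothetical helper names: $K_N$ has one field per subset $I$, of weight $2 - |I|/2$, and $CK_6$ keeps one field per subset of size at most 2 plus half of the cubic fields.

```python
from fractions import Fraction
from math import comb


def kn_field_count(n):
    """Number of fields xi_I(z) spanning K_N: one for each subset I of {1..n}."""
    return sum(comb(n, k) for k in range(n + 1))


def kn_weight(k):
    """Conformal weight 2 - |I|/2 of xi_I(z) for |I| = k (Proposition 2.1)."""
    return Fraction(2) - Fraction(k, 2)


def ck6_field_counts():
    """Fields of CK_6 grouped by conformal weight, following Theorem 3.1."""
    return {
        kn_weight(0): 1,                # the Virasoro field
        kn_weight(2): comb(6, 2),       # currents xi_ij + alpha d(xi_ij^*)
        kn_weight(1): comb(6, 1),       # odd fields of weight 3/2
        kn_weight(3): comb(6, 3) // 2,  # odd fields xi_ijk + alpha xi_ijk^*
    }
```

Summing the values of `ck6_field_counts()` gives $1 + 15 + 6 + 10 = 32$, the number of fields listed for $CK_6$ in Theorem 1.1, while `kn_field_count(N)` returns $2^N$.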
Remark. Incidentally the construction of $CK_6$ inside $K_6$ can be imitated to obtain the $N = 4$ superconformal algebra inside $K_4$. Let $\beta^2 = 1$. Explicitly the $N = 4$ superconformal algebra can be realized inside $K_4$ as follows (cf. [1]):
(i) $L(z) = -\frac12\xi_\emptyset(z) - \frac{\beta}{2}\partial^2\xi_\emptyset^*(z)$ is a Virasoro field.
(ii) $\xi_{ij}(z) - \beta\,\xi_{ij}^*(z)$ are currents.
(iii) $\xi_i(z) + \beta\,\partial\xi_i^*(z)$ are odd primary fields of conformal weight $\frac32$.
For completeness' sake let us also include the OPEs:

$$(\xi_i(z) + \beta\,\partial\xi_i^*(z))\,(\xi_j(w) + \beta\,\partial\xi_j^*(w)) \sim -\frac{2(\xi_{ij}(w) - \beta\,\xi_{ij}^*(w))}{(z-w)^2} + \frac{-\partial(\xi_{ij}(w) - \beta\,\xi_{ij}^*(w)) + 2\delta_{ij}L(w)}{z-w},$$
4 taOj A ~
t=l
(O(*j
_ s(O(*j
~)*(w)
Z--W
t=l 4
~(~
~ktt),(W)
Z--W
( ~ j ( z ) - fl~**j(z)) . (~k(w) + f l O ~ ( w ) )
~
o~k . . . . z - , o ~ , , o~;~ (w) + ~_~,o~_% , w _ f l o~k p u t o~ ) ( )
These OPEs can be easily calculated using Lemma 3.1, Lemma 3.2 and the fact that for $N = 4$ one has $\xi_i\wedge\xi_j^* = -\xi_i^*\wedge\xi_j$.
Remark. If we consider the subalgebra $g$ of $CK_6$ consisting of polynomial vector fields on $\mathbb{C}$ (i.e. generated by non-negative modes of the fields of $CK_6$), we obtain a simple primitive Lie superalgebra (see [2] for definition). Let $t$ have degree 2 and let $\xi_i$ have degree 1. This induces a $\mathbb{Z}$-gradation $g = \oplus_j g_j$ with $\dim g_{-2} = 1$, $\dim g_{-1} = 6$. Furthermore, $g_0 = so_6\oplus\mathbb{C}\,t\frac{\partial}{\partial t}\cong GL(4)$ and $g_1$ as an $SO_6$-module is isomorphic to $S^2(sl_4)\oplus so_6$, on which $t\frac{\partial}{\partial t}$ acts as the scalar $-1$. This $\mathbb{Z}$-graded superalgebra should be added to the list of Theorem 10 from [2].
Another Realization of $CK_6$.
In the remaining part of this section we give another realization of $K_6$ and hence of $CK_6$ using $4\times4$ matrices. Although we will not use this realization, we believe that it is worth presenting it here, as it might be useful in the future. As before let us start by analyzing the algebra $K_6$. The quadratic fields of $K_6$ (i.e. $\xi_I(z)$ such that $|I| = 2$) form an $SO_6$ current algebra over $\mathbb{C}$. We will identify $SO_6$ with $SL_4$. In particular $SL_4$ can be represented by traceless $4\times4$ matrices over $\mathbb{C}$. Hence we can identify the zero modes of the quadratic fields (i.e. the linear span of $\xi_I\otimes t^0$) with $SL_4$. The fixed modes of the linear fields of $K_6$ (i.e. $\xi_i(z)$, $i = 1, \dots, 6$) form the standard representation $so_6$ of this Lie algebra $SO_6$, and it is isomorphic to $\Lambda^2(sl_4)$ when considered as an $SL_4$-module, where $sl_4$ here denotes its standard representation. Thus the linear fields of $K_6$ can be represented by quadratic terms of $\Lambda(4)$, on which a natural Hodge involution acts. Let us denote this Hodge involution by $\omega$. It should not be confused with the Hodge dual in $K_6$ described above. Now the quadratic terms of $\Lambda(4)$ can be represented by skewsymmetric $4\times4$ matrices, hence $\omega$ acts on $4\times4$ skewsymmetric matrices. The action is as follows: Given a skewsymmetric $4\times4$ matrix $B$ of the form
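$$B = \begin{pmatrix} 0 & a & b & c\\ -a & 0 & d & e\\ -b & -d & 0 & f\\ -c & -e & -f & 0 \end{pmatrix}$$

we have

$$\omega(B) = \begin{pmatrix} 0 & f & -e & d\\ -f & 0 & c & -b\\ e & -c & 0 & a\\ -d & b & -a & 0 \end{pmatrix}.$$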
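The matrix form of $\omega$ displayed above can be checked against the usual Levi-Civita expression for the Hodge star on 2-forms in dimension 4, $\omega(B)_{ij} = \frac12\sum_{k,l}\epsilon_{ijkl}B_{kl}$. The sketch below does this with plain nested lists; the function names are hypothetical, and the Levi-Civita formula is used here as an assumed equivalent description rather than one taken from the text.

```python
def skew(a, b, c, d, e, f):
    """Skewsymmetric 4x4 matrix with upper-triangular entries a, b, c, d, e, f."""
    return [
        [0, a, b, c],
        [-a, 0, d, e],
        [-b, -d, 0, f],
        [-c, -e, -f, 0],
    ]


def omega(B):
    """Hodge involution on skewsymmetric 4x4 matrices, as displayed above."""
    a, b, c = B[0][1], B[0][2], B[0][3]
    d, e, f = B[1][2], B[1][3], B[2][3]
    return skew(f, -e, d, c, -b, a)


def levi_civita(idx):
    """Levi-Civita symbol: 0 on repeated indices, else the permutation sign."""
    if len(set(idx)) < len(idx):
        return 0
    inversions = sum(
        1
        for p in range(len(idx))
        for q in range(p + 1, len(idx))
        if idx[p] > idx[q]
    )
    return -1 if inversions % 2 else 1


def omega_via_epsilon(B):
    """(1/2) sum_{k,l} eps_{ijkl} B_{kl}, written as a sum over k < l."""
    return [
        [
            sum(
                levi_civita((i, j, k, l)) * B[k][l]
                for k in range(4)
                for l in range(k + 1, 4)
            )
            for j in range(4)
        ]
        for i in range(4)
    ]
```

Both descriptions agree on any skewsymmetric input, and applying `omega` twice returns the original matrix, consistent with $\omega$ being an involution.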
The action of $SL_4$ on these skewsymmetric matrices is given by $JG_- + G_-J^T$ (which is skewsymmetric), where here $J\in SL_4$ and $G_-$ is a $4\times4$ skewsymmetric matrix representing an element of $\Lambda^2(sl_4)$. The fixed modes of the cubic fields of $K_6$ do not form an irreducible representation of $SL_4$. Rather they consist of the representation $S^2(sl_4)$ and its dual $S^2(sl_4)^*$. We will represent both $S^2(sl_4)$ and $S^2(sl_4)^*$ by symmetric $4\times4$ matrices with the corresponding $SL_4$-actions being $JF_+ + F_+J^T$ and $-J^TF_{++} - F_{++}J$, respectively, where $F_+$ and $F_{++}$ are both symmetric $4\times4$ matrices. The quartic fields form the adjoint representation of $SL_4$ and will also be represented by traceless $4\times4$ matrices with the obvious $SL_4$-action. The quintic fields form a representation isomorphic to the representation formed by the linear fields and will also be represented by $4\times4$ skewsymmetric matrices with the same action. Finally the highest degree field is the trivial representation, while the lowest degree field is the Virasoro field. With this in mind it is not surprising that we have the following realization of $K_6$:
Even fields. $L(z)$ is a Virasoro field. $J(z)$ are currents, where $J$ is a $4\times4$ matrix with $\operatorname{tr}J = 0$. $\tilde J(z)$ are primary fields of conformal weight 0, where $\tilde J$ is a $4\times4$ matrix with $\operatorname{tr}\tilde J = 0$. $A(z)$ is a primary field of conformal weight $-1$.
Odd fields. $G_-(z)$ are primary fields of conformal weight $\frac32$, where $G_-$ is a skewsymmetric $4\times4$ matrix. $F_+(z)$ are primary fields of conformal weight $\frac12$, where $F_+$ is a $4\times4$ symmetric matrix. $F_{++}(z)$ are primary fields of conformal weight $\frac12$, where $F_{++}$ is a $4\times4$ symmetric matrix. $G_{--}(z)$ are primary fields of conformal weight $-\frac12$, where $G_{--}$ is a skewsymmetric $4\times4$ matrix.
For an arbitrary $4\times4$ matrix $A$ we define $A_0 := A - \frac14(\operatorname{tr}A)\cdot I$, where $I$ is the identity matrix. The fields of $K_6$ satisfy the following operator product expansions ($a$, $b$, $\alpha$, $\beta$, $i$, $j$ from now on are always indices):

$$J^a(z)J^b(w) \sim \frac{[J^a, J^b](w)}{z-w},$$
1 (JG--G-JT)+(w)+(JT~~176
J(z) . G_(w) ~
~
.
(z_w)Z
.+ ( J G - + G _ j T ) _ (w}
(z-w)
G ~_(z)" G ~_(w) ~
tr(G~- w(G~-))L(w) + O(G~w(G~_))o(w) _ (G ~_w(G~))o(w) (z - w)
G_(z).
J(z) .
r'+(w)~--
r+(w)~
(JF++
1 ( r +~(G_))(w)
I o ( r +~(G_))(w)
2
4
(z - w)
F+JT)+(w) (z - w)
Y(z)- r + ( w ) ~ ](z) . G_(w)
(z - w) 2
(z - w)
1 (JF+F+JT)__(w) 2 (z - w) 2 '
(Jr+-
r.2T)__(w). (z - w)
O ( J G _ +G _ ,l T ) _ _ (w) , 1 ( J G _ +O _ j T ) _ _ (w)
~ ~
- - "1" -~
(z_,,,)~
( J G - - - G - J T )+(w) "t- ( J T t~ G - )--t~ G - )J)++(w) , (z-w) (z-w)
J'~(z). Jb(w) ~ [ J % Jb](w) 1 tr(J'~jb)A(w) (z--w) +2 (z--w) 2 ' J(z).
G__(w)
G,~(z) . G ~ _ ( w ) ~ -
,.~
(JG__ + G_-JT)__(w) (z - w)
(O~J(a~--))0(~) --
(z-w)
z
1 tx(G~-w(Ge ))OA(w) +
4
z-w- -
1 tr(G~_oa(G o _ _ ) ) A ( w ) 2 (z-w) 2
G _ ( z ) . A(w),,~
G__(w)
- -
Z--W
,
N
Fi+(z). FJ++(w)"o
t (r'+r'+§
~ u ( r + r {+)A(~v)
1 tr(F + i F {+)OA(w)
2
4
8
z-w
(z-w) z
z-w
'
r ~(z). r ~(w) ~ o. Now we can describe the subalgebra CK6 of K6 as follows: 1
= L(z) - ~03A(z),
~_(z) = G_(z) + ~OZG__(z), 1
-
J(z) = J(z) + -~OJ(z), +(z) = r §
To be more precise it is the following algebra:
Even fields. $\hat L(z)$ is a Virasoro field. $\hat J(z)$ are currents, where $J$ is a $4\times4$ matrix with $\operatorname{tr}J = 0$.
Odd fields. $\hat G_-(z)$ are primary fields of conformal weight $\frac32$, where $G_-$ is a skewsymmetric $4\times4$ matrix. $\hat F_+(z)$ are primary fields of conformal weight $\frac12$, where $F_+$ is a $4\times4$ symmetric matrix.
The fields of $CK_6$ satisfy the following OPEs:

$$\hat J^a(z)\hat J^b(w) \sim \frac{[\hat J^a, \hat J^b](w)}{z-w},$$
J(z) 9G_(w) ~ ( J ~ - + G - J'T)-(w) + ( J ~ - -- G - J'T)+(w) (z - w)
(z -
w) 2
'
Y(z). ? +(~)~ (y~§ + P+~)§ (z - w)
O~_(z). ~_(w) ~
tr(&-~(~-))~(~) + ~ ( ~ ( ~ _ ) ) o ( ~ ) (z - w)
_ (&_~(~_))o(~) (z -
w) 2
'
G_(z). ~ +(~) ~ I ( P § 2
(z - w)
Ai
r + ( z ) . P~(w) ~ o.
As before $\omega(G_-)$ denotes the Hodge dual of the matrix $G_-$. All OPEs can be checked using those OPEs of $K_6$ given above.
Remark. The currents form, as for $K_6$, the $SL_4$ loop algebra. Their zero modes therefore form an $SL_4$. The fixed modes of the fields of weight $\frac32$ form the module $\Lambda^2(sl_4)$ and the fixed modes of the fields of weight $\frac12$ form the module $S^2(sl_4)$.
4. Central Extensions of $CK_6$
In this section we will prove the following theorem:
Theorem 4.1. $CK_6$ has no nontrivial central extensions.
Let $C$ be a central (even) element of the algebra $CK_6$. The OPEs of the mutually local fields of $CK_6$ are equivalent to the commutation relations of $CK_6$, viewed as an infinite
dimensionalLie superalgebra with basis {Jna, G~, F k, Ln},wheren E Z a n d k E 89 Here a, a, i are used as indices of bases. We will write down the commutation relations of a central extension of CK6. The coefficients fCab,lZac~,c ~3 7"~,iPai,Jg~z, A~Z, r below are of course the structure constants of CK6 and can be calculated explicitly from the OPE given in Sect. 3. But since we will not need to know them explicitly, we will not do it here. g(., .) of course is the 2-cocycle of this central extension, while Cab is an invariant symmetric bilinear form on SL4. Using the same argument as in [5] we see that any central extension of CK6 has the following form: [Zn' Zm]
= (n -
m ) Z n + m 4- ~ n ' - - m ( n 3
-- n ) C ,
12
= f~bJ~+m + ne~b~,-mC, [Jna, G~n] ='f~,~a~~zn+m +n~-iaaFn+m, 7a
[
9 Aj
r m] = + 2Ac~z(n -
t a n , e~m] : ~ ~c~
~ i
~ i
r
m)Ja+m + g(O~, G~)C, ^i
a ~a
[G,~, F ml = ~oiJ~+m + g(G~, F m)C, n,
m
)c.
(Here we have used the convention of summing over repeated indices.) Let $\hat J$ denote the linear span of the zero modes of a basis of $\hat J^a$ over $\mathbb{C}$. Furthermore we let $G_k$ denote the linear span of the $k$th modes of a basis of $\hat G^\alpha$, and $G_{-k}$ denote the linear span of the $-k$th modes of a basis of $\hat G^\alpha$, for a fixed $k$. Finally let $F_k$ denote the linear span of the $k$th modes of a basis of $\hat F^i$, and $F_{-k}$ denote the linear span of the $-k$th modes of a basis of $\hat F^i$, for a fixed $k$. $\hat J$ is the adjoint representation of $SL_4$ (denoted by $\operatorname{ad}SL_4$), while $G_k$ and $G_{-k}$ both are the $SL_4$-module $\Lambda^2(sl_4)$. Similarly $F_k$ and $F_{-k}$ are both the representation $S^2(sl_4)$. The central element $C$ can be viewed as the trivial representation of $SL_4$. With this in mind we see from the Jacobi identity that the existence of a central extension implies the existence of the following $SL_4$-module homomorphisms:

$$\Lambda^2(sl_4)\otimes\Lambda^2(sl_4)\to\mathbb{C},\quad \operatorname{ad}SL_4\otimes\operatorname{ad}SL_4\to\mathbb{C},\quad \Lambda^2(sl_4)\otimes S^2(sl_4)\to\mathbb{C},\quad S^2(sl_4)\otimes S^2(sl_4)\to\mathbb{C}.$$

Since $\Lambda^2(sl_4)$ and $S^2(sl_4)$ are not contragredient and $S^2(sl_4)$ is not self-contragredient, we obtain immediately that $g(\hat F^i_n, \hat F^j_m) = g(\hat G^\alpha_n, \hat F^i_m) = 0$.
Lemma 4.1. e is trivial.
Proof Consider the Jacobi identity Ai
= [[d~, G~ ], F 811 + [G~, [J~, ~ ~]]. We collect the central term and use the fact that g( F n, F m) = g(G~, ~o~ F m) = 0 to get (summation over repeated indices): b
n@c~iCab~n,-r-s = O. The equation is valid for all n, r, s. So we get b
~)~is
=0.
Suppose that e is nontrivial. Then, since e is an invariant symmetric bilinear form o n S L 4 , it is nondegenerate and we can assume that the set {J~, j b , . . . } is chosen so that they form an orthonormal basis. Since ~ is a non-zero homomorphism from A 2(sl4)@$2(814) onto adSL4, we can find c~ and i such that ~ is not zero for each fixed a. Thus c ~ = 0 and so ~ is trivial, which is a contradiction. [] Lemma 4.2. 9(G~, G ~ ) = 0.
Proof Let [G~, G ~ I = c ~ Ln+m ~ " ( n - m) J~+,~ "i', +9(G,~, ~ Gm)C. ^~ +2A~;~ As weight spaces GL4 = SL4 @ Lo it follows that 9(Gn, ~ G~) 4 0 only when a = - / 3 and n = - m . Let C a and G# belong to oppositeffeight spaces with respect to SL4. We can find an in the Cartan of SL4 such that [H, G '~] =~0. Consider the Jacobi identity of
We collect the central term and use Lemma 4.1 to get 9(G~+r, G ] ) = g(G ^ r , G~+8), A;~ for all r, s, n. Choose s = - n - r, we obtain g(d~_~, G~) = 9 ( d ~ , d ~ ) , for all r, s. Now we pick s = r and we get 9(G~, 0~-~) = 9(d~-~, d~), which implies that -
=Y~,
s~
-s)~
Vr~8.
Hence g(G~, G~_~) is a constant function of r. Now we use the Jacobi identity
and compare the central terms of both sides. The left-hand side, if non-zero, must be a function in $n$ (in fact a polynomial in $n$ with zero constant term). The right-hand side, however, is a constant, if non-zero. This implies that the Virasoro subalgebra has trivial center and also $g(\hat G^\alpha_n, \hat G^\beta_m) = 0$. □
This completes the proof of Theorem 4.1.
References
1. Ademollo, M. et al.: Dual strings with U(1) colour symmetry. Nucl. Phys. B111, 77-111 (1976)
2. Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8-96 (1977)
3. Kac, V.G.: Vertex algebras for beginners. University Lecture Notes, vol. 9. Providence: AMS, 1996
4. Kac, V.G.: Superconformal algebras and transitive group actions on quadrics. Commun. Math. Phys. 186, 233-252 (1997)
5. Kac, V.G., van de Leur, J.W.: On classification of superconformal algebras. In: S.J. Gates et al. (eds.) Strings 88, Singapore: World Scientific, 1989, pp. 77-106
6. Ramond, P., Schwarz, J.H.: Classification of dual model gauge algebras. Phys. Lett. 64B, no. 1, 75-77 (1976)
Communicated by G. Felder
Commun. Math. Phys. 186, 233-252 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Superconformal Algebras and Transitive Group Actions on Quadrics
Victor G. Kac*
Department of Mathematics, MIT, Cambridge, MA 02139, USA
Received: 16 October 1996 / Accepted: 12 December 1996
To Ernest Borisovich Vinberg on his 60th birthday
Abstract: A classification of "physical" superconformal algebras is given. The list consists of seven algebras: the Virasoro algebra, the Neveu-Schwarz algebra, the $N = 2$, 3 and 4 algebras, the superalgebra of all vector fields on the $N = 2$ supercircle, and a new algebra $CK_6$ constructed in [3]. The proof relies heavily on the classification of all connected subgroups of $SO_N(\mathbb{C})$ which act transitively on the quadric $(v, v) = 1$.
0. Introduction
Superconformal algebras have been playing a fundamental role in superstring theory and conformal field theory. It is an important and challenging problem to give their complete classification. My conjecture is that the examples given in Sect. 2 of this paper exhaust all possibilities (cf. [7] and [6]). The main theorem of the present paper (Theorem 4.1), which gives a classification of "physical" superconformal algebras, is the first step towards the proof of this conjecture 1.
A superconformal algebra is a Lie superalgebra $g$ over $\mathbb{C}$ spanned by the coefficients of a finite family $F$ of pairwise local formal distributions (fields) such that the following three properties hold:
1) $F$ contains a Virasoro formal distribution $L(z) = \sum_{n\in\mathbb{Z}} L_n z^{-n-2}$ (recall that $[L(z), L(w)] = \partial_w L(w)\delta(z-w) + 2L(w)\partial_w\delta(z-w)$);
2) the coefficients of the delta-function expansion of the commutators of formal distributions from $F$ are from $R := \mathbb{C}[\partial]F$;
3) $g$ contains no non-trivial ideals spanned by coefficients of formal distributions from a $\mathbb{C}[\partial]$-submodule of $R$.
* Partially supported by NSF grant DMS-9622870
1 Added in proof. This conjecture has been proved.
Recall that a formal distribution $\phi$ with coefficients in $g$ is called primary of conformal weight $\Delta$ if $[L(z), \phi(w)] = \partial_w\phi(w)\delta(z-w) + \Delta\phi(w)\partial_w\delta(z-w)$. A superconformal algebra $(g, F)$ is called physical if the set of even (resp. odd) formal distributions from $F$ consists of $L$ and primary distributions of conformal weight 1 (resp. primary distributions of conformal weight $\frac32$ and $\frac12$). Theorem 4.1 (see Sect. 4) states that there are seven physical superconformal algebras, namely, the $N = 0$ (Virasoro), $N = 1$ (Neveu-Schwarz), $N = 2$, 3 and 4 superconformal algebras, the superalgebra $W_2$ of all vector fields on the $N = 2$ supercircle, and a new superalgebra $CK_6$ constructed in [3]. The rank of $R$ over $\mathbb{C}[\partial]$ (i.e. the minimal number of formal distributions generating $g$) is 1, 2, 4, 8, 8, 12 and 32 respectively.
The proof is based on the analysis of the conformal superalgebra $R = \mathbb{C}[\partial]F$ associated to $(g, F)$. Apart from the $\mathbb{C}[\partial]$-module structure, $R$ carries a $\mathbb{C}$-bilinear product $a(w)_{(n)}b(w)$ for each $n\in\mathbb{Z}_+$, which is simply the coefficient of $\partial_w^n\delta(z-w)/n!$ in the commutator $[a(z), b(w)]$. The Lie superalgebra axioms are translated into some complicated identities for $R$ ([6], Sect. 2.7). These identities show, in particular, that the 0th product defines a Lie algebra structure over $\mathbb{C}$ on the space $\mathfrak{a}$ of all primary distributions from $R$ of conformal weight 1 and a faithful representation of $\mathfrak{a}$ on the space $V$ of all odd primary distributions from $R$ of conformal weight $\frac32$ with a non-degenerate invariant symmetric bilinear form $(\cdot,\cdot)$.
The basic observation of the paper is that the complex connected algebraic group $A$ corresponding to the Lie algebra $\mathfrak{a}$ acts transitively on the quadric $(v, v) = 1$, $v\in V$. Let $\mathfrak{a}_u$ denote the stabilizer of a point $u$ on this quadric. If $F$ contains no odd primary distributions of conformal weight $\frac12$, then $\mathfrak{a}_u = 0$ ([6], Sect. 5.9), which immediately leads to the conclusion that the only possibilities for $\dim V$ are 0, 1, 2 and 4. These cases correspond to the $N = 0$, 1, 2 and 4 superconformal algebras respectively.
Denote by $E$ the space of all odd primary distributions in $R$ of conformal weight $\frac12$. Another important observation is that the $\mathfrak{a}$-module $E$ when restricted to $\mathfrak{a}_u$ is isomorphic to the $\mathfrak{a}_u$-module $\operatorname{ad}\mathfrak{a}_u$. In Sect. 3 I classify all connected algebraic subgroups of the group $SO_N(\mathbb{C})$ which act transitively on the quadric (Theorem 3.1). The above restriction on $\mathfrak{a}_u$ rules out most of the cases if $\mathfrak{a}_u\ne0$. I avoid doing this inspection by using the numerical restriction (4.8), which leaves only six cases, of which three are ruled out by the above condition on $\mathfrak{a}_u$, and the remaining three cases correspond to the $N = 3$ superconformal algebra, to $W_2$ and to the new superconformal algebra $CK_6$ constructed in [3]. This is, in fact, how $CK_6$ was predicted (in the spring of 1996)!
Ramond and Schwarz [13] gave a classification of Lie superalgebras satisfying the conditions of Theorem 4.1 and some additional conditions. The additional conditions exclude the algebras $W_2$ and $CK_6$, so that the answer given in [13] is correct. However the argumentation of [13] is incomplete and is not quite correct (like the "proof" of simplicity of the current algebra on p. 76). On the other hand, the restrictions given by Lemma 4.2b and Lemma 4.3 are similar to their observations.
This paper is dedicated to my teacher, E.B. Vinberg, who taught me invariant theory (among many other things), which turned out to be indispensable for this paper. I am grateful to him and to A.L. Onishchik for consultations on transitive group actions.
1. Preliminaries on Lie Superalgebras of Formal Distributions
Let g = g~ + gT be a Lie superalgebra over C. A formal distribution (usually called a field by physicists) with coefficients in g is a generating series of the form :
Superconformal Algebras and Transitive Group Actions on Quadrics
a(z) = E
235
a('~)z-n-l'
n~Z
where a(~) E 9 have the same parity for all n E Z and z is an indeterminate. The formal distribution a(z) is called even (resp. odd) if all a(n) are even (resp. odd). Two formal distributions a(z) and b(z) are called (mutually) localiffor some N E Z+ one has: (z - w)N[a(z), b(w)] = 0. (1.1) Introducing the formal delta-function ~
nEZ
we may write a condition equivalent to (1.1) : N-1
[a(z), b(w)] = E (a(j)b) (w)o~(~5(z - w)/j! j=0
(1.2)
for some formal distributions (a@)b)(w) ([5],Theorem 2.3), which are uniquely determinedand given by the formula
(a(j)b) (w) = Res(z - w) j [a(z), b(w)].
(1.])
Z
Formula (1.3) defines a C-bilinear product a(j)b for each j E Z+ on the space of all formal distributions with coefficients in 9Note also that the space (over C) of all formal distributions with coefficients in 9 is a (left) module over C [Oz], where the action of Oz is defined in the obvious way:
so that Oza(z) = ~ ( 0 a ) ( ~ ) z - n - l , where (Oa)(~) = -na(~-l). The Lie superalgebra g is called a formal distributions Lie superalgebra if there exists a family F of pairwise local formal distributions whose coefficients span g. In such a case we say that the family F spans 9. We denote by Fg (resp. FT) the set of all even (resp. odd) distributions from F, so that F 6 (resp. FT) spans gg (rasp. ~tT)-We often will write (9, F ) to emphasize the dependence of F. The simplest example of a formal distributions Lie superalgebra is the current superalgebra g associated to a Lie superalgebra ~: 0 = C [t,t -1] | It is spanned by the following family of pairwise local formal distributions (a E 3) :
$$a(z) = \sum_{n \in \mathbb{Z}} (t^n \otimes a)\, z^{-n-1}.$$

Indeed, it is immediate to check that $[a(z), b(w)] = [a, b](w)\,\delta(z - w)$.
The most fundamental example for us is the (centerless) Virasoro algebra, the Lie algebra with the basis $L_n$ ($n \in \mathbb{Z}$) and commutation relations

$$[L_m, L_n] = (m - n) L_{m+n}.$$

It is spanned by the local formal distribution $L(z) = \sum_{n \in \mathbb{Z}} L_n z^{-n-2}$, since one has:

$$[L(z), L(w)] = \partial_w L(w)\,\delta(z - w) + 2 L(w)\,\partial_w \delta(z - w). \qquad (1.4)$$
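The passage from (1.4) back to the relations $[L_m, L_n] = (m-n)L_{m+n}$ can be traced through modes. The following Python sketch is only a sanity check under a stated assumption: it uses the standard mode form of (1.2), $[a_{(p)}, b_{(r)}] = \sum_j \binom{p}{j} (a_{(j)}b)_{(p+r-j)}$, which is not written out in the text, together with $(\partial a)_{(n)} = -n a_{(n-1)}$. The function name is hypothetical.

```python
def virasoro_mode_bracket(m, n):
    """Return (coeff, k) such that [L_m, L_n] = coeff * L_k, starting from (1.4).

    With a(z) = sum a_(n) z^{-n-1} and L(z) = sum L_n z^{-n-2} one has
    L_(k) = L_{k-1}.  Relation (1.4) says L_(0)L = dL and L_(1)L = 2L.
    Assumed (standard) mode form of (1.2):
        [a_(p), b_(r)] = sum_j binom(p, j) (a_(j) b)_(p + r - j).
    """
    p, r = m + 1, n + 1            # L_m = L_(m+1), L_n = L_(n+1)
    # j = 0 term: (dL)_(p+r) = -(p + r) L_(p+r-1)
    coeff = -(p + r)
    # j = 1 term: binom(p, 1) * (2L)_(p+r-1) = 2p L_(p+r-1)
    coeff += 2 * p
    k = (p + r - 1) - 1            # L_(p+r-1) = L_{p+r-2} = L_{m+n}
    return coeff, k
```

For every pair of integers the function returns the coefficient $m - n$ and the index $m + n$, in agreement with the commutation relations above.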
A formal distribution $L(z)$ with coefficients in $g$ satisfying (1.4) is called a Virasoro distribution. Note that $\mathbb{C}[\partial]F$ is a $\mathbb{C}[\partial]$-submodule of the space of formal distributions which still consists of pairwise local formal distributions. The formal distributions Lie superalgebra $g$ spanned by $F$ is called simple if $g$ contains no non-trivial ideals spanned by a submodule of $\mathbb{C}[\partial]F$. The Virasoro algebra has no ideals at all, but the current algebra always has "evaluation" ideals. Nevertheless the current algebra $\tilde g$ associated to a Lie superalgebra $\mathfrak{g}$ is simple in the above sense iff $\mathfrak{g}$ is simple in the usual sense.

I conjecture that the only simple formal distributions Lie algebras with a finite family $F$ are either the quotients of current algebras associated to simple finite-dimensional Lie algebras and their twisted analogues, or the Virasoro algebra. A solution of this conjecture under the assumption of existence of a gradation follows from a very difficult theorem of Mathieu [8].

The list of simple formal distributions Lie superalgebras $(g, F)$ with a finite family $F$ is much larger. Apart from the quotients of current superalgebras associated to simple finite-dimensional Lie superalgebras, it includes three series: $W_N$ ($N \geq 0$), $S_N$ ($N \geq 2$) and $K_N$ ($N \geq 0$) (see [7] and Sect. 2 of the present paper), and the exceptional superalgebra $CK_6$ (see [3]). One can also change modings, i.e. consider twisted superalgebras (see [7]). (For $S_N$ there is a 1-parameter family of modings; for all other examples of Sect. 2 there are two modings.) The conjecture is that this is a complete list (cf. [6], [7]).

Definition 1.1. We say that a formal distributions Lie superalgebra $(g, F)$ is finite if $F$ is finite and the module $\mathbb{C}[\partial]F$ is closed under all products (1.3).

The Virasoro algebra and the quotients of current superalgebras associated to finite-dimensional Lie superalgebras are finite. All the superalgebras described in Sect. 2 are finite. The finiteness condition provides the choice of the "non-twisted" moding.
Example. Let $\mathfrak{g}$ be a finite-dimensional Lie algebra and let $\mathfrak{g} = \mathfrak{g}_0 + \mathfrak{g}_1$ be a $\mathbb{Z}/2\mathbb{Z}$-gradation. Then $g := \mathbb{C}[t, t^{-1}] \otimes \mathfrak{g}_0 + t^{1/2}\,\mathbb{C}[t, t^{-1}] \otimes \mathfrak{g}_1$ is a subalgebra of the Lie algebra $\mathbb{C}[t^{1/2}, t^{-1/2}] \otimes \mathfrak{g}$. This is a "twisted" current algebra. It is spanned by pairwise local formal distributions

$$a(z) = \sum_{n \in \mathbb{Z}} (t^n \otimes a)\, z^{-n-1} \quad \text{for } a \in \mathfrak{g}_0$$

and

$$a(z) = \sum_{n \in \mathbb{Z}} (t^{n+1/2} \otimes a)\, z^{-n-1} \quad \text{for } a \in \mathfrak{g}_1.$$

However, $[a(z), b(w)] = w\,[a, b](w)\,\delta(z - w)$ if $a, b \in \mathfrak{g}_1$. Hence $g$ is not a finite Lie algebra of formal distributions, at least if $\mathfrak{g} = [\mathfrak{g}, \mathfrak{g}]$.

Finite formal distributions Lie superalgebras can be studied via conformal superalgebras introduced in [6].

Definition 1.2. A conformal superalgebra $R$ is a left $\mathbb{Z}/2\mathbb{Z}$-graded $\mathbb{C}[\partial]$-module $R = R_{\bar 0} \oplus R_{\bar 1}$ with a $\mathbb{C}$-bilinear product $a_{(n)} b$ for each $n \in \mathbb{Z}_+$ such that the following axioms hold ($a, b, c \in R$; $m, n \in \mathbb{Z}_+$):

(C0) $a_{(n)} b = 0$ for $n \gg 0$,

(C1) $(\partial a)_{(n)} b = -n\, a_{(n-1)} b$,

(C2) $a_{(n)} b = (-1)^{p(a)p(b)} \sum_{j=0}^{\infty} (-1)^{j+n+1} (\partial^j / j!)\, b_{(n+j)} a$,

(C3) $a_{(m)} (b_{(n)} c) = \sum_{j=0}^{m} \binom{m}{j} (a_{(j)} b)_{(m+n-j)} c + (-1)^{p(a)p(b)}\, b_{(n)} (a_{(m)} c)$.
It is shown in [6], Sec. 2.7 that if $(g, F)$ is a formal distributions Lie superalgebra such that $R(g, F) := \mathbb{C}[\partial]F$ is closed under all products (1.3), then $R(g, F)$ is a conformal superalgebra with respect to these products. Note that, by definition, $(g, F)$ is finite iff $R(g, F)$ is a finitely generated $\mathbb{C}[\partial]$-module. Conversely, if $R = \bigoplus_{i \in I} \mathbb{C}[\partial] a^i$ is a conformal superalgebra that is free as a $\mathbb{C}[\partial]$-module, we may associate to $R$ a formal distributions Lie superalgebra $g(R)$ with the basis $a^i_{(m)}$ ($i \in I$, $m \in \mathbb{Z}$) and $F = \{a^i(z) = \sum_n a^i_{(n)} z^{-n-1}\}_{i \in I}$ with the bracket (cf. (1.2)):

$$[a^i(z), a^j(w)] = \sum_{k \in \mathbb{Z}_+} \left(a^i_{(k)} a^j\right)(w)\, \partial_w^k \delta(z - w)/k!,$$

so that $R(g(R), F) = R$.

Proposition 1.1. Every finite formal distributions Lie superalgebra with trivial center is a quotient of a Lie superalgebra $g(R)$, where $R$ is a conformal superalgebra finitely and freely generated as a $\mathbb{C}[\partial]$-module and with trivial center, by an ideal that does not contain all $a_{(n)}$, $n \in \mathbb{Z}$, for a non-zero element $a \in R$. For such Lie superalgebras the $\mathbb{C}[\partial]$-module $R(g, F)$ is free.

Proof. Follows from the above discussion and [6], Proposition 2.7, which states that a finitely generated $\mathbb{C}[\partial]$-module conformal superalgebra with a trivial center is a free $\mathbb{C}[\partial]$-module. □

All notions concerning formal distributions Lie superalgebras are of course automatically translated into the language of conformal superalgebras [6]. The following simple observation is extremely useful.

Lemma 1.1. Let $R$ be a conformal superalgebra. Then, with respect to the 0-th product, $\partial R$ is a 2-sided ideal of $R$ and $R/\partial R$ is a Lie superalgebra over $\mathbb{C}$. Moreover, the 0-th product defines a structure of a left $R/\partial R$-module on $R$ (over $\mathbb{C}[\partial]$).

Proof. See e.g. [6], Corollary 2.3c. □
The simplest examples of conformal superalgebras are the current conformal superalgebras $R(\tilde g) = \mathbb{C}[\partial] \otimes \mathfrak{g}$ with the products defined by

$$a_{(0)} b = [a, b], \quad a_{(j)} b = 0 \text{ for } j > 0, \quad a, b \in \mathfrak{g},$$

and the Virasoro conformal algebra $V = \mathbb{C}[\partial]L$ with the products (cf. (1.4)):

$$L_{(0)} L = \partial L, \quad L_{(1)} L = 2L, \quad L_{(j)} L = 0 \text{ for } j > 1.$$

More complicated (finite rank) conformal superalgebras are associated to (finite) Lie superalgebras of formal distributions constructed in Sect. 2. I conjecture [6] that all simple conformal superalgebras which are finitely generated $\mathbb{C}[\partial]$-modules are either current conformal superalgebras associated to finite-dimensional simple Lie superalgebras (classified in [5]) or conformal superalgebras associated to one of the superconformal algebras listed in Sect. 2. A proof of this conjecture is in sight now.² The non-"super" case has been recently settled.

Let $g$ be a Lie superalgebra and let $L(z)$ be its Virasoro distribution. One says that a formal distribution $a(z)$ with coefficients in $g$ is an eigendistribution of conformal weight $\Delta_a \in \mathbb{C}$ with respect to $L(z)$ if $L(z)$ and $a(z)$ are mutually local and

$$[L(z), a(w)] = \partial_w a(w)\,\delta(z - w) + \Delta_a\, a(w)\,\partial_w \delta(z - w) + \cdots. \qquad (1.5)$$
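The products of the Virasoro conformal algebra $V = \mathbb{C}[\partial]L$ can be tested against the skew-symmetry axiom (C2). The Python sketch below is a minimal check for $a = b = L$ only; elements of $\mathbb{C}[\partial]L$ are stored as coefficient lists in $\partial$, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

# An element (c0 + c1*d + c2*d^2 + ...) L of C[d]L is stored as [c0, c1, c2, ...].

def product_LL(n):
    """n-th product L_(n)L in the Virasoro conformal algebra."""
    if n == 0:
        return [Fraction(0), Fraction(1)]   # dL
    if n == 1:
        return [Fraction(2)]                # 2L
    return [Fraction(0)]                    # zero for n > 1

def apply_d_power(elem, j):
    """Apply d^j / j! to an element of C[d]L."""
    return [Fraction(0)] * j + [c / factorial(j) for c in elem]

def add(x, y):
    """Add two elements of C[d]L."""
    size = max(len(x), len(y))
    x = x + [Fraction(0)] * (size - len(x))
    y = y + [Fraction(0)] * (size - len(y))
    return [p + q for p, q in zip(x, y)]

def strip(elem):
    """Drop trailing zero coefficients, keeping at least one entry."""
    out = list(elem)
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def skew_rhs(n, cutoff=4):
    """Right-hand side of (C2) for a = b = L (L is even, so the parity sign is +1)."""
    total = [Fraction(0)]
    for j in range(cutoff):                 # L_(n+j)L = 0 once n + j > 1
        term = apply_d_power(product_LL(n + j), j)
        sign = (-1) ** (j + n + 1)
        total = add(total, [sign * c for c in term])
    return strip(total)
```

For $n = 0$ the sum gives $-\partial L + \partial(2L) = \partial L$, and for $n = 1$ it gives $2L$, so (C2) reproduces the products it started from.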
The formal eigendistribution $a(z)$ is called a primary distribution with respect to $L(z)$ if the coefficients of $\partial_w^j \delta(z - w)$ with $j > 1$ in (1.5) vanish. This is equivalent to the commutation relations:

$$[L_m, a(z)] = z^m \left(z \partial_z + (m + 1)\Delta_a\right) a(z). \qquad (1.6)$$

(Equality (1.5) is equivalent to (1.6) for $m = 0$ and $-1$.) A primary distribution of conformal weight 1 is usually called a current. The following proposition is a straightforward but useful bookkeeping device (see e.g. [6], Lemma 5.9).
Proposition 1.2. Let $a(z)$ and $b(z)$ be eigendistributions (with respect to $L(z)$) of conformal weights $\Delta_a$ and $\Delta_b$ respectively. Then $\partial_z a(z)$ is an eigendistribution of conformal weight $\Delta_a + 1$, and $(a_{(j)} b)(z)$ is an eigendistribution of conformal weight $\Delta_a + \Delta_b - j - 1$.

Let $(g, F)$ be a formal distributions Lie superalgebra. A formal distribution $L(z) \in F$ is called an energy-momentum distribution if $L(z)$ is a Virasoro distribution and all $a(z) \in F$ are its eigendistributions.

Definition 1.3. A formal distributions Lie superalgebra $(g, F)$ is called a superconformal algebra if it is simple, finite and $F$ contains a Virasoro distribution $L$.

The most important examples of superconformal algebras $g$ that appear in conformal field theory have the property that $L$ is an energy-momentum distribution and that $\Delta_a = 1$ if $a \in F_{\bar 0}$ is different from the energy-momentum distribution, and $\Delta_a = \frac32$ or $\frac12$ if $a \in F_{\bar 1}$. We then say that $(g, F)$ is a physical superconformal algebra. It follows from Proposition 1.2 that for a physical superconformal algebra all formal distributions from $F$ are primary and all currents from $F$ span a current Lie algebra.

In the study of representations of a superconformal algebra $(g, F)$ it is important to consider its central extensions. By definition, this amounts to adding to the right-hand side of the bracket $[a(z), b(w)]$ in (1.2) a linear combination of terms of the form $\alpha_j(a, b)\,\partial_w^j \delta(z - w)/j!$, where $\alpha_j(a, b) \in \mathbb{C}$ is a $\mathbb{C}$-valued $\mathbb{C}$-bilinear form defined for each $j \in \mathbb{Z}_+$ on $R = \mathbb{C}[\partial]F$. The usual properties of a 2-cocycle of a Lie superalgebra are then equivalent to the following identities for all $m, n \in \mathbb{Z}_+$ ([6], Sec. 2.7):

$$\alpha_n(\partial a, b) = -n\,\alpha_{n-1}(a, b), \qquad \alpha_n(a, b) = (-1)^{n+1+p(a)p(b)}\,\alpha_n(b, a),$$

$$\alpha_m\left(a, b_{(n)} c\right) = \sum_{j=0}^{m} \binom{m}{j} \alpha_{m+n-j}\left(a_{(j)} b, c\right) + (-1)^{p(a)p(b)}\,\alpha_n\left(b, a_{(m)} c\right). \qquad (1.7)$$

As usual, the trivial cocycle $\alpha_n(a, b) = f(a_{(n)} b)$, where $f : R \to \mathbb{C}$ is a $\mathbb{C}$-linear map, defines a trivial central extension of $g$ (isomorphic to the direct sum of $g$ and $\mathbb{C}$). Two cocycles that differ by a trivial cocycle are called equivalent; they define isomorphic central extensions. Following [7], [3] we list in Sect. 2 all, up to equivalence, central extensions of all superconformal algebras discussed there.

² Added in proof. This conjecture has been proved
2. Examples of Superconformal Algebras

Denote by $\Lambda(1, N)$ the tensor product of the algebra of Laurent polynomials $\mathbb{C}[t, t^{-1}]$ over $\mathbb{C}$ in the indeterminate $t$ and the Grassmann algebra $\Lambda(N)$ over $\mathbb{C}$ in the indeterminates $\xi_1, \ldots, \xi_N$. This is an associative superalgebra with the parity

$$p(t) = \bar 0, \quad p(\xi_i) = \bar 1, \quad i = 1, \ldots, N.$$

The Lie superalgebra of all derivations of the superalgebra $\Lambda(1, N)$ is denoted by $W_N$. This is a simple Lie superalgebra [5, 7]. Introducing the even derivation $\partial_0$ and odd derivations $\partial_i$ for $i = 1, \ldots, N$ by

$$\partial_0 = \frac{\partial}{\partial t}, \quad \partial_i = \frac{\partial}{\partial \xi_i},$$

we can write every element of $W_N$ in the form of a linear differential operator

$$D = \sum_{i=0}^{N} P_i \partial_i, \quad \text{where } P_i \in \Lambda(1, N). \qquad (2.1)$$

Given an element $A \in \Lambda(N)$, we introduce for each $j = 0, 1, \ldots, N$ the following formal distributions with coefficients in $W_N$:

$$A^j(z) = \sum_{n \in \mathbb{Z}} \left(t^n A \partial_j\right) z^{-n-1}.$$

Note that $W_N$ is spanned by $(N + 1)2^N$ formal distributions $A^j(z)$, linearly independent over $\mathbb{C}[\partial_z]$, where $A$ runs over all monomials in $\Lambda(N)$ and $j$ over the set $0, 1, \ldots, N$. Of course, $W_0$ is the Virasoro algebra. The commutation relations of the Lie superalgebra $W_N$ can be written in a compact form by making use of the formal delta-function as follows.
Proposition 2.1. For arbitrary $A$ and $B \in \Lambda(N)$ one has:

(a) $[A^i(z), B^j(w)] = \left((A\,\partial_i B)^j(w) + (-1)^{p(A)} ((\partial_j A) B)^i(w)\right) \delta(z - w)$ if $i, j = 1, \ldots, N$.

(b) $[A^i(z), B^0(w)] = (A\,\partial_i B)^0(w)\,\delta(z - w) - (-1)^{p(B)} (AB)^i(w)\,\partial_w \delta(z - w)$ if $i = 1, \ldots, N$.

(c) $[A^0(z), B^0(w)] = -\partial_w (AB)^0(w)\,\delta(z - w) - 2 (AB)^0(w)\,\partial_w \delta(z - w)$.

Proof is straightforward by using (1.2) and (1.3). □

Here are some important special cases of Proposition 2.1:

$$[-1^0(z), A^j(w)] = \partial_w A^j(w)\,\delta(z - w) + (1 + \delta_{j0})\, A^j(w)\,\partial_w \delta(z - w), \qquad (2.2)$$

$$[(\xi_i)^i(z), A^j(w)] = (\delta_{iA} - \delta_{ij})\, A^j(w)\,\delta(z - w) \quad \text{if } j = 1, \ldots, N, \qquad (2.3)$$

$$[(\xi_i)^i(z), A^0(w)] = \delta_{iA}\, A^0(w)\,\delta(z - w) - (-1)^{p(A)} (\xi_i A)^i(w)\,\partial_w \delta(z - w). \qquad (2.4)$$

In the last two formulas we assume that $A$ is a monomial $\xi_{i_1} \cdots \xi_{i_s}$ and we let $\delta_{iA} = 1$ if $\partial_i A \neq 0$, and $\delta_{iA} = 0$ if $\partial_i A = 0$.
Let now $\lambda = (\lambda_1, \ldots, \lambda_N) \in \mathbb{C}^N$ and consider the following formal distribution:

$$L^\lambda(z) = -1^0(z) + \sum_{i=1}^{N} \lambda_i\, \partial_z (\xi_i)^i(z). \qquad (2.5)$$

Proposition 2.2. (a) For each $\lambda$, the formal distribution $L^\lambda(z)$ is an energy-momentum formal distribution.

(b) Let $A = \xi_{i_1} \cdots \xi_{i_s}$ be a monomial. Then the formal distribution $A^j(z)$, where $j = 1, \ldots, N$, is primary with respect to $L^\lambda(z)$ with conformal weight

$$\Delta = 1 + \lambda_j - \sum_k \lambda_{i_k}.$$

(c) Let $A = \xi_{i_1} \cdots \xi_{i_s}$ be a monomial. Then the formal distribution $A^0(z)$ has conformal weight

$$\Delta = 2 - \sum_k \lambda_{i_k}$$

with respect to $L^\lambda(z)$, but is not primary, the extra term in $[L^\lambda(z), A^0(w)]$ being

$$(-1)^{p(A)} \sum_i \lambda_i\, (\xi_i A)^i(w)\,\partial_w^2 \delta(z - w).$$

If $\Delta \neq 1$, then the corrected formal distribution

$$A^0(z) + \frac{(-1)^{p(A)}}{1 - \Delta}\, \partial_z \sum_i \lambda_i\, (\xi_i A)^i(z)$$

is primary (of conformal weight $\Delta$).

Proof. Follows from (2.2)-(2.4) and the usual observations. First, if $L(z)$ is a Virasoro formal distribution, and $J(z)$ is even primary of conformal weight 1 with respect to $L(z)$ and $[J(z), J(w)] = 0$, then $L(z) + \lambda\,\partial_z J(z)$ is again a Virasoro formal distribution. Second, if $[L(z), a(w)] = \partial_w a(w)\,\delta(z - w) + \Delta\, a(w)\,\partial_w \delta(z - w) + b(w)\,\partial_w^2 \delta(z - w)$, then $a(w) - \frac{1}{\Delta - 1}\,\partial_w b(w)$ is primary of conformal weight $\Delta$ with respect to $L(z)$. □
The following corollary is immediate by Proposition 2.2.

Corollary 2.1. (a) The Lie superalgebra $W_N$ is a superconformal algebra for all $N \geq 0$. It is physical with respect to $L^\lambda(z)$ iff $\lambda = (\frac12, \ldots, \frac12)$ and $N \leq 2$. (b) The even part of $W_1$ is spanned by $L^{(\frac12)}(z)$ and a current $(\xi_1)^1(z)$. The odd part of $W_1$ is spanned by two primary formal distributions of conformal weight $\frac32$: $1^1(z)$ and $(\xi_1)^0(z)$. (This is the N = 2 superconformal algebra.) (c) The even part of $W_2$ is spanned by $L^{(\frac12, \frac12)}(z)$ and five currents:
(~)J (z) for i , j = 1, 2 and (~1~2)~ (z). The odd part of W2 is spanned by four primary formal distributions of conformal weight 3. li(z)for i = 1,2 and (~i)~ (z) for i-- 1,2, 1. and two primary formal distribution of conformal weight ~.
(~1~2)i (Z) for i = 1,2. Remark 2.1. It is shown in [7] that any central extension of WN is trivial unless N 2 [5, 7]. The odd part of Sl is an ideal9 It is straightforward to check the next proposition. Proposition
2.3. (a) $L^\lambda(z)$ lies in $S_N$ iff $\sum_{i=1}^{N} \lambda_i = 1$.

(b) The following formal distributions span $S_N$ ($A \in \Lambda(N)$):

$$A_{ij}(z) := (\partial_i A)^j(z) + (\partial_j A)^i(z), \quad i, j = 1, \ldots, N,$$

$$A_{0j}(z) := (\partial_j A)^0(z) - (-1)^{p(A)}\, \partial_z A^j(z), \quad j = 1, \ldots, N.$$

One can choose $N 2^N$ of them, linearly independent over $\mathbb{C}[\partial_z]$, that still span $S_N$.

(c) If $A = \xi_{i_1} \cdots \xi_{i_s}$ is a monomial, then the formal distributions $A_{ij}(z)$, $i, j = 1, \ldots, N$, are primary with respect to $L^\lambda(z)$ of conformal weight $1 + \lambda_i + \lambda_j - \sum_k \lambda_{i_k}$.

(d) If $A = \xi_{i_1} \cdots \xi_{i_s}$ is a monomial, then $A_{0j}(z)$ with respect to $L^\lambda(z)$ has conformal weight $2 + \lambda_j - \sum_k \lambda_{i_k}$.

Corollary 2.2. (a) The Lie superalgebra $S_N$ is a superconformal algebra iff $N \geq 2$. It is physical with respect to $L^\lambda(z)$ iff $N = 2$ and $\lambda_1 = \lambda_2 = \frac12$. (b) The even part of $S_2$ is spanned by $L(z) := L^{(\frac12, \frac12)}(z)$ and three currents:

$$(\xi_1)^2(z), \quad (\xi_2)^1(z), \quad (\xi_1)^1(z) - (\xi_2)^2(z).$$

The odd part of $S_2$ is spanned by four primary formal distributions of conformal weight $\frac32$:
li(z), i = 1,2, (~1) 0 (Z) +
0 z (~1~2) 2 (Z), ~20(Z) -- Oz (~1~2) 2 (Z).
(This is the N = 4 superconformal algebra.) Remark 2.2. According to [7], any central extension of SN with N > 2 is trivial. The
only (up to equivalence) central extension of $2 looks as follows: a3(L,L)=c/2,
a,
o~1 ((~,), _ (~,)2 (r
((~1)2 , (~2)1) _ (~,)2)
=c/6,
= e/6, ]
a2 ((1) > r (~0~ + 0 ((~i~,)' + (~i~2)z)) -- - e / 3 , i = 1,2.
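Consider the differential form

$$\omega = dt - \sum_{i=1}^{N} \xi_i\, d\xi_i$$

and the following subalgebra of $W_N$:

$$K_N := \{D \in W_N \mid D\omega = P\omega \text{ for some } P \in \Lambda(1, N)\},$$

introduced independently in [1] and [5] (see [7]). This Lie superalgebra consists of linear differential operators of the form ($f \in \Lambda(1, N)$):

$$D_f := f\,\partial_0 + \frac12 (-1)^{p(f)} \sum_{i=1}^{N} D_i(f)\, D_i,$$

where we let $D_i = \xi_i \partial_0 + \partial_i$, $i = 1, \ldots, N$. (Note that $D_f\,\omega = (\partial_0 f)\,\omega$.) It easily follows [5, 7] that $K_N$ is a simple Lie superalgebra for $N \geq 0$, unless $N = 4$, in which case we have $K_4 = [K_4, K_4] \oplus \mathbb{C} D_{t^{-1}\xi_1\xi_2\xi_3\xi_4}$ and $[K_4, K_4]$ is simple. Using the vector space isomorphism $\Lambda(1, N) \to K_N$ given by $f \mapsto D_f$, we identify $K_N$ with $\Lambda(1, N)$. Then the bracket of differential operators on $K_N$ gets identified with the following bracket on $\Lambda(1, N)$ [5], [7]:

$$[f, g] = \left(f - \frac12 \sum_{i=1}^{N} \xi_i\, \partial_i f\right) \partial_0 g - \partial_0 f \left(g - \frac12 \sum_{i=1}^{N} \xi_i\, \partial_i g\right) + (-1)^{p(f)}\, \frac12 \sum_{i=1}^{N} \partial_i f\, \partial_i g. \qquad (2.8)$$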
Given $A \in \Lambda(N)$, define the following formal distribution with coefficients in $\Lambda(1, N) = K_N$:

$$A(z) = \sum_{n \in \mathbb{Z}} \left(t^n A\right) z^{-n-1}.$$

Note that $K_N$ is spanned by $2^N$ formal distributions $A(z)$, linearly independent over $\mathbb{C}[\partial_z]$, where $A$ runs over all monomials in $\Lambda(N)$.

Proposition 2.4. (a) $L^\lambda(z)$ lies in $K_N$ iff $\lambda_1 = \cdots = \lambda_N = \frac12$. One also has: $L^{(\frac12, \ldots, \frac12)}(z) = -1(z)$.

(b) For arbitrary monomials $A = \xi_{i_1} \cdots \xi_{i_r}$ and $B = \xi_{j_1} \cdots \xi_{j_s}$ one has:

$$[A(z), B(w)] = \left(\left(\frac{r}{2} - 1\right) \partial_w (AB)(w) + (-1)^r\, \frac12 \sum_{i=1}^{N} (\partial_i A\, \partial_i B)(w)\right) \delta(z - w) + \left(\frac{r + s}{2} - 2\right) AB(w)\, \partial_w \delta(z - w).$$

In particular, $A(z)$ is a primary formal distribution of conformal weight $2 - \frac{r}{2}$ with respect to the energy-momentum distribution $-1(z)$.

Proof. Straightforward as that of Proposition 2.1, using (2.8). □
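The coefficients in Proposition 2.4(b) can be compared with the bracket (2.8) on the elements $f = t^m A$, $g = t^n B$, where $A$ and $B$ are monomials of degrees $r$ and $s$. The Python sketch below checks only the coefficient of $t^{m+n-1}AB$ (the term with $\partial_i A\,\partial_i B$ is the same on both sides). It assumes the Euler relation $\sum_i \xi_i \partial_i A = rA$ and the standard mode form of (1.2), $[a_{(m)}, b_{(n)}] = (a_{(0)}b)_{(m+n)} + m\,(a_{(1)}b)_{(m+n-1)}$ when only the 0-th and 1-st products are non-zero; function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def coeff_from_bracket(r, s, m, n):
    """Coefficient of t^(m+n-1) AB in [t^m A, t^n B], read off from (2.8).

    Uses sum_i xi_i d_i A = r A for a monomial A of degree r.
    """
    # (f - 1/2 sum xi_i d_i f) d_0 g  -  d_0 f (g - 1/2 sum xi_i d_i g)
    return (1 - HALF * r) * n - m * (1 - HALF * s)

def coeff_from_proposition(r, s, m, n):
    """The same coefficient obtained from Proposition 2.4(b) via modes."""
    # 0-th product: (r/2 - 1) d(AB), and (d c)_(k) = -k c_(k-1)
    zeroth = (HALF * r - 1) * (-(m + n))
    # 1-st product: ((r + s)/2 - 2) AB, entering with the factor binom(m, 1) = m
    first = m * (HALF * (r + s) - 2)
    return zeroth + first
```

Both functions return $(1 - \frac{r}{2})\,n - (1 - \frac{s}{2})\,m$, so the two formulas agree on this term for all degrees and modes.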
Corollary 2.3. The Lie superalgebra $K_N$ is a superconformal algebra for all $N \geq 0$. It is physical with respect to $L(z) = -1(z)$ iff $N \leq 3$.

Remark 2.3. It is proved in [7] that any central extension of $K_N$ is trivial unless $N \leq 4$. For $K_N$ with $N \leq 3$ the only non-trivial (up to equivalence) central extension looks as follows:

$$\alpha_0(\xi_i\xi_j\xi_k, \xi_i\xi_j\xi_k) = c/12 \ (i \neq j \neq k), \quad \alpha_1(\xi_i\xi_j, \xi_i\xi_j) = c/12 \ (i \neq j), \quad \alpha_2(\xi_i, \xi_i) = c/6, \quad \alpha_3(1, 1) = c/2.$$

It follows from [7] that, up to equivalence, the only "field-theoretic" central extension of $K_4$ looks as follows, where we let $\nu = \xi_1\xi_2\xi_3\xi_4$:

$$\alpha_1(1, \nu) = c, \quad \alpha_1(\xi_i, \partial_i \nu) = c, \quad \alpha_1(\xi_i\xi_j, \partial_i\partial_j \nu) = c,$$

the Virasoro central charge being 0.
Remark 2.4. One has the following isomorphisms between superconformal algebras: $W_0 \simeq K_0$ (Virasoro), $W_1 \simeq K_2$ (N = 2 superconformal algebra).

Finally, the superconformal algebra $CK_6$ is constructed as a subalgebra of $K_6$ [3] as follows. Let $\nu = \xi_1\xi_2\xi_3\xi_4\xi_5\xi_6$ and for a monomial $A = \xi_{i_1}\xi_{i_2}\cdots$ denote by $A^*$ the Hodge dual monomial: $A^* = \partial_{i_1}\partial_{i_2}\cdots\nu$. Then $CK_6$ is spanned by the energy-momentum distribution $-1(z) + \sqrt{-1}\,\partial_z^3 \nu(z)$, by 15 currents $\xi_i\xi_j(z) + \sqrt{-1}\,\partial_z (\xi_i\xi_j)^*(z)$, by 6 primary distributions of conformal weight $\frac32$: $\xi_i(z) - \sqrt{-1}\,\partial_z^2 \xi_i^*(z)$, and by 10 primary distributions of conformal weight $\frac12$: $\xi_i\xi_j\xi_k(z) + \sqrt{-1}\,(\xi_i\xi_j\xi_k)^*(z)$. Any central extension of $CK_6$ is trivial [3].
3. Transitive Group Actions on Quadrics

Let $SO_N$ be the complex orthogonal group, i.e. the group of all unimodular linear transformations of the N-dimensional complex vector space $V$ preserving a non-degenerate invariant bilinear form $(\cdot\,, \cdot)$ on $V$. Then $SO_N$ acts transitively on the complex quadric

$$Q(V) = \{v \in V \mid (v, v) = 1\}.$$

Taking the group $GL_N$ of complex invertible matrices acting linearly on the N-dimensional complex vector space $V_1$ and defining on the 2N-dimensional vector space $V = V_1 \oplus V_1^*$ the symmetric bilinear form $(u + u^*, v + v^*) = (u, v^*) + (v, u^*)$ (where $(\,,)$ is the bilinear pairing between $V_1$ and $V_1^*$), we obtain an embedding of $GL_N$ in $SO_{2N}$. The subgroup $SL_N$ of this $GL_N$ still acts transitively on the complex quadric. The subgroup $SL_N$ is contained in a unique up to conjugacy maximal proper closed algebraic subgroup of $SO_{2N}$; this subgroup has the form GLN ~ 1), $a = \dim_{\mathbb{C}} \mathfrak{a}$, $f = \dim_{\mathbb{C}} F$. We have by Lemma 4.1a, Lemma 4.2 and Lemma 4.3:

$$a + 1 = N + f \text{ is divisible by } 2^{[\frac12 (N-1)]}. \qquad (4.8)$$
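Furthermore, let $u \in V$ be such that $(u, u) = 1$, and let $u^\perp$ be the orthogonal complement to $\mathbb{C}u$ in $V$. Consider the following basis of $R$ over $\mathbb{C}[\partial]$: $L$, basis of $\mathfrak{a}$, $u$, basis of $u^\perp$, basis of $F$. The matrix $M_u$ of $u$, viewed as an element of $R/\partial R$ acting on $R$ by 0-th product, looks in this basis as follows (we use that $u$ has conformal weight 3/2 and formulas (4.1), (4.3), (4.5)):

$$M_u = \begin{pmatrix} 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & \alpha\partial & \lambda\partial & \beta \\ \partial/2 & \gamma & 0 & 0 & 0 \\ 0 & \nu & 0 & 0 & 0 \\ 0 & \mu\partial & 0 & 0 & 0 \end{pmatrix}.$$

Here $\alpha$, $\beta$, $\gamma$, $\lambda$, $\mu$ and $\nu$ are matrices over $\mathbb{C}$ of sizes $a \times 1$, $a \times f$, $1 \times a$, $a \times (N - 1)$, $f \times a$ and $(N - 1) \times a$ respectively. But due to Lemma 4.2a, the square of the matrix $M_u$ is $\partial I$. This is equivalent to $\alpha = 0$, $\gamma = 0$ and the following relations:

$$\nu\lambda = I_{N-1}, \qquad (4.9a)$$

$$\mu\beta = I_f, \qquad (4.9b)$$

$$\nu\beta = 0, \quad \mu\lambda = 0, \qquad (4.9c)$$

$$\lambda\nu + \beta\mu = I_a. \qquad (4.9d)$$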
It follows from (4.9a) that $\mathrm{rank}\,\nu \geq N - 1$. This implies

$$\pi(\mathfrak{a})u \text{ has codimension} \leq 1 \text{ in } V. \qquad (4.10)$$

Let $A$ be the connected linear algebraic subgroup of $SO(V)$ whose Lie algebra is $\pi(\mathfrak{a})$ (recall that the bilinear form $(\cdot\,, \cdot)$ on $V$ is $\mathfrak{a}$-invariant). It follows from (4.10) that $A \cdot u$ is an open orbit on the quadric $(u, u) = 1$. Since this holds for any point of this quadric, we conclude that

$$A \text{ acts transitively on the quadric } (u, u) = 1. \qquad (4.11)$$
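Consider the following diagram of maps (we denote the map by the same letter as the corresponding matrix):

$$\mathfrak{a} \xrightarrow{\ \mu\ } F, \qquad \mathfrak{a} \xrightarrow{\ \nu\ } u^\perp, \qquad F \xrightarrow{\ \beta\ } \mathfrak{a}, \qquad u^\perp \xrightarrow{\ \lambda\ } \mathfrak{a}.$$

Since $\mathrm{rank}\,\nu \geq N - 1$ and $\mathrm{rank}\,\mu \geq f$ (due to (4.9b)), we conclude that the maps $\nu$ and $\mu$ are surjective. It follows from (4.9c) that $\beta(F)$ is annihilated by $\nu$. Finally the relation (4.9d) implies that both maps $\beta$ and $\lambda$ are injective and their images have zero intersection. Since all four maps $\mu$, $\nu$, $\lambda$ and $\beta$ are $\mathfrak{a}_u$-module homomorphisms, we conclude that the $\mathfrak{a}_u$-module $\mathfrak{a}$ (with respect to ad) is isomorphic to the $\mathfrak{a}_u$-module $F \oplus u^\perp$. But the $\mathfrak{a}_u$-modules $\mathfrak{a}/\mathfrak{a}_u$ and $u^\perp$ are isomorphic. It follows that the $\mathfrak{a}_u$-modules $F$ and $\mathrm{ad}\,\mathfrak{a}_u$ are isomorphic. We thus obtain the following condition:

$$\text{the } \mathfrak{a}\text{-module } F \text{ restricted to } \mathfrak{a}_u \text{ is isomorphic to } \mathrm{ad}\,\mathfrak{a}_u. \qquad (4.12)$$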
Now we are in a position to prove Theorem 4.1. First, due to condition (4.11), $\mathfrak{a}$ is a Lie subalgebra of $so(V)$ corresponding to one of the subgroups listed by Theorem 3.1. If $N = 2n + 1$ is odd, we see that there are two possibilities: $\mathfrak{a} = so_N$ or $\mathfrak{a} = \mathrm{Lie}\,G_2 \subset so_7$. The second case is ruled out immediately ($14 + 1$ is not divisible by 8). In the first case condition (4.8) gives: $n(2n + 1) + 1$ is divisible by $2^n$, which eliminates all cases except for (a) and (b) below:

(a) $N = 1$, $\mathfrak{a} = 0$, $f = 0$;

(b) $N = 3$, $\mathfrak{a} = so_3$, $f = 1$.

Let now $N = 2n$ be even. First of all, the exceptional cases 5) and 6) of Theorem 3.1 are ruled out by (4.8). If $\mathfrak{a} = so_N$ (case 1 of Theorem 3.1), condition (4.8) gives: $n(2n - 1) + 1$ is divisible by $2^{n-1}$, which permits only the following cases:

(c) $N = 2$, $\mathfrak{a} = so_2$, $f = 0$;

(d) $N = 6$, $\mathfrak{a} = so_6$, $f = 10$.
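The elimination for $\mathfrak{a} = so_N$ is elementary arithmetic and can be replayed by machine. The Python sketch below assumes that (4.8) reads "$a + 1 = N + f$ is divisible by $2^{[(N-1)/2]}$", which matches the exponents $n$ (for $N = 2n + 1$) and $n - 1$ (for $N = 2n$) used above; it treats only the case $\mathfrak{a} = so_N$, and the function name is hypothetical.

```python
def so_cases(max_n=40):
    """List (N, a, f) for which a = dim so_N passes the divisibility test (4.8)."""
    hits = []
    for N in range(1, max_n + 1):
        a = N * (N - 1) // 2                  # dim so_N
        if (a + 1) % 2 ** ((N - 1) // 2) == 0:
            hits.append((N, a, a + 1 - N))    # f = a + 1 - N, from (4.8)
    return hits
```

For $N \leq 40$ the only survivors are $N = 1, 2, 3, 6$ with $f = 0, 0, 1, 10$, i.e. exactly the cases (a)-(d). For larger $N$ the quantity $a + 1$ grows quadratically while the power of 2 grows exponentially, so no further solutions can appear.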
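In case 2) of Theorem 3.1 there are the following 4 possibilities for $\mathfrak{a} \subset so_{2n}$, $n > 1$ (see (3.1)): $\mathfrak{a} = gl_n$, $\mathfrak{a} = sl_n \ltimes \mathfrak{u}_n$, $\mathfrak{a} = gl_n \ltimes \mathfrak{u}_n$, $\mathfrak{a} = sl_n$, where $\dim \mathfrak{u}_n = \frac12 n(n - 1)$. Hence we have respectively:

$$a = n^2, \quad a = \tfrac12 n(3n - 1) - 1, \quad a = \tfrac12 n(3n - 1), \quad a = n^2 - 1.$$

Due to (4.8), $a + 1$ should be divisible by $2^{n-1}$. This permits only the following cases:

(e) $N = 4$, $\mathfrak{a} = sl_2$, $f = 0$;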
(f) N = 4, a = gl2 ~ 1. Let $q \in \mathbb{C}$ be a primitive $l$-th root of unity. Let $(i)_q = \frac{q^i - q^{-i}}{q - q^{-1}}$. Let $u$ be an associative algebra over $\mathbb{C}$ with generators $E, F, K, K^{-1}$ and relations: $KK^{-1} = K^{-1}K = 1$;
Decomposition of the Adjoint Representation of the Small Quantum sl2
$KEK^{-1} = q^2 E$, $KFK^{-1} = q^{-2} F$;

$$EF - FE = \frac{K - K^{-1}}{q - q^{-1}};$$

$E^l = F^l = 0$, $K^l = 1$.

The algebra $u$ is finite-dimensional and $\dim u = l^3$. Let $\omega$ be an automorphism of the algebra $u$ given on generators by the formulas: $\omega(E) = F$, $\omega(F) = E$, $\omega(K) = K^{-1}$. The algebra $u$ is a Hopf algebra with respect to the coproduct $\Delta$, antipode $S$ and counit $\varepsilon$ given by the formulas: $\Delta(E) = E \otimes 1 + K \otimes E$, $\Delta(F) = F \otimes K^{-1} + 1 \otimes F$, $\Delta(K) = K \otimes K$; $S(E) = -K^{-1}E$, $S(F) = -FK$, $S(K) = K^{-1}$; $\varepsilon(E) = \varepsilon(F) = 0$, $\varepsilon(K) = 1$.

2.2. Let $C$ be a category of finite-dimensional $\mathbb{Z}$-graded $u$-modules $V = \bigoplus_{i \in \mathbb{Z}} V^i$ such that the following conditions hold: (a) $E$ is an operator of degree 2, i.e. $E$ acts from $V^i$ to $V^{i+2}$; (b) $F$ is an operator of degree $-2$; (c) $K$ acts on $V^i$ by multiplication by $q^i$. The morphisms in the category $C$ are morphisms of $u$-modules compatible with the $\mathbb{Z}$-grading.

2.3. We introduce the duality $D$ on the category $C$. If $V \in C$, then $D(V)$ is $V^*$ as a vector space. The action of $x \in u$ on $D(V)$ is given by the formula $(xf)(v) = f(\omega S(x)v)$, where $f \in D(V)$, $v \in V$, and $S$ is the antipode.

2.4. Let us define the adjoint representation $ad \in C$ (see e.g. [LM]). Let $x$ be an element of $u$. The adjoint action of the generators is given by the following formulas: $ad(E)x = Ex - KxK^{-1}E = K[K^{-1}E, x]$, $ad(F)x = FxK - xFK = [F, x]K$, $ad(K)x = KxK^{-1}$.

2.5. Now we introduce the $\mathbb{Z}$-grading on the adjoint representation. We put $\deg(E) = 2$, $\deg(F) = -2$, $\deg(K) = 0$ and $\deg(ab) = \deg(a) + \deg(b)$ for any $a, b \in u$ such that $\deg(a)$, $\deg(b)$ are defined. Note that all the weights of $ad$ are even integers in the interval $[2 - 2l, \ldots, 2l - 2]$.

2.6. It is known (see e.g. [LM] or [BFS]) that $D(ad) \simeq ad$.
V. Ostrik
3. u-Modules

3.1. It is easy to check that an element

$$X = EF + \frac{q^{-1}K + qK^{-1}}{(q - q^{-1})^2} = FE + \frac{qK + q^{-1}K^{-1}}{(q - q^{-1})^2} \qquad (1)$$

lies in the center of the algebra $u$ (see e.g. [Ke]). The element $X$ is called a Casimir element. It satisfies the following equation of degree $l$ (see loc. cit.):

$$P(X) := \prod_{j \in \mathbb{Z}/l\mathbb{Z}} (X - b_j) = 0, \qquad (2)$$

where $b_j = \frac{q^{j+1} + q^{-j-1}}{(q - q^{-1})^2}$ ($q^l = 1$, so $q^j$ is well defined). In particular $b_j = b_{j'}$ if $j + j' = l - 2$. The root $b_{-1} = \frac{2}{(q - q^{-1})^2}$ of $P$ has multiplicity 1, and the remaining roots $b_j$ have multiplicity 2.
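The multiplicities of the roots $b_j$ can be observed numerically. The Python sketch below assumes that $l$ is odd (as the indices $(l-3)/2$ used later suggest) and takes $q = e^{2\pi i k/l}$ with $k$ coprime to $l$; the function names and the tolerance are illustrative choices, not part of the text.

```python
import cmath

def casimir_roots(l, k=1):
    """Return [b_0, ..., b_{l-1}] for q = exp(2*pi*i*k/l); index l-1 represents j = -1."""
    q = cmath.exp(2j * cmath.pi * k / l)
    denom = (q - 1 / q) ** 2
    return [(q ** (j + 1) + q ** (-(j + 1))) / denom for j in range(l)]

def distinct_with_multiplicity(values, tol=1e-9):
    """Group numerically equal values; returns a list of [representative, count]."""
    groups = []
    for v in values:
        for g in groups:
            if abs(g[0] - v) < tol:
                g[1] += 1
                break
        else:
            groups.append([v, 1])
    return groups
```

Grouping the values returned by `casimir_roots(l)` gives one value of multiplicity 1, namely $b_{-1} = 2/(q - q^{-1})^2$, and $(l-1)/2$ values of multiplicity 2, which is the pairing $j \leftrightarrow j' = l - 2 - j$.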
3.2. Let $j \in \mathbb{Z}/l\mathbb{Z}$. Let $C_j$ be a full subcategory of $C$ such that $(X - b_j)$ acts nilpotently on objects of $C_j$. In what follows we will identify $C_j$ and $C_{j'}$ if $j + j' = l - 2$.

3.3. Let us fix a maximal subset $H'$ of $\mathbb{Z}/l\mathbb{Z}$ with the following property: if $\{j, j'\}$ is a two-element subset of $H'$, then $j + j' \neq l - 2$. We have $H' = \{-1\} \cup H$, where $H = H' - \{-1\}$.

3.4. For any $j \in H$ we define the integers $J$, $J'$ by the following properties: (1) $0 \leq J < J' < l$; (2) $J + J' = l - 2$; (3) $(J - j)(J' - j) \equiv 0 \pmod l$.

3.5. The category $C$ is a direct sum of subcategories $C_j$ where $j$ runs through $H'$.

3.6. We denote by $u^\pm \subset u$ the subalgebra generated by $K, E$ (resp. $K, F$). For $\lambda \in \mathbb{Z}$ denote by $\mathbb{C}^\pm_\lambda$ the one-dimensional $\mathbb{Z}$-graded $u^\pm$-module of weight $\lambda$ such that $K$ acts as $q^\lambda$ and $E$ (resp. $F$) acts as zero on it. We denote by $M^\pm(\lambda)$ the $\mathbb{Z}$-graded $u$-module $u \otimes_{u^\pm} \mathbb{C}^\pm_\lambda$. The modules $M^\pm(\lambda)$ are called Verma modules. Let $M^\pm(\lambda) \ni v^\pm(\lambda) := 1 \otimes 1$. Let $V \in C$, $v \in V^\lambda$ and $E \cdot v = 0$ (resp. $F v = 0$). Then $v$ is called an upper singular (resp. a lower singular) vector in $V$, and there exists a unique morphism $\varphi : M^\pm(\lambda) \to V$ such that $\varphi(v^\pm(\lambda)) = v$.

3.7. For each $\lambda \in \mathbb{Z}$ there is a unique up to isomorphism simple module $L(\lambda) \in C$ with highest weight $\lambda$. The modules $L(\lambda_1)$ and $L(\lambda_2)$ are isomorphic iff $\lambda_1 \equiv \lambda_2 \pmod l$. The indecomposable projective cover of $L(\lambda)$ will be denoted by $P(\lambda)$. We have $D(L(\lambda)) \simeq L(\lambda)$ and $D(P(\lambda)) \simeq P(\lambda)$. In particular $P(\lambda)$ is injective; $\dim \mathrm{Hom}(L(\lambda), P(\lambda)) = 1$.

3.8. The set of isomorphism classes of simple objects in the category $C_{-1}$ is $\{L(\lambda), \lambda \equiv -1 \pmod l\}$. As $u$-modules without grading all the $L(\lambda)$ are isomorphic to one and
the same $u$-module St (Steinberg module). It has dimension $l$. The category $C_{-1}$ is semisimple. In particular $P(\lambda) = L(\lambda)$.

3.9. Let $j \in H$. The set of isomorphism classes of simple modules in $C_j$ is $\{L(\lambda), \lambda \equiv J \pmod l \text{ or } \lambda \equiv J' \pmod l\}$. The modules $L(\lambda)$, $\lambda \equiv J \pmod l$ (resp. $\lambda \equiv J' \pmod l$) are isomorphic as $u$-modules. Their dimension is $J + 1$ (resp. $J' + 1$). The projective module $P(\lambda)$ admits a filtration $P(\lambda) \supset W(\lambda) \supset L(\lambda) \supset 0$ such that $P(\lambda)/W(\lambda) \simeq L(\lambda)$, $W(\lambda)/L(\lambda) \simeq L(\lambda') \oplus L(\lambda'')$, where $\lambda' \neq \lambda''$, $\lambda' \equiv \lambda'' \equiv -2 - \lambda \pmod l$, $|\lambda - \lambda'| < 2l > |\lambda - \lambda''|$. In particular $\dim P(\lambda) = 2l$.

3.10. It is easy to see from 2, 3 and 3 that all the simple subquotients of the adjoint representation have the type $L(\lambda)$, where $\lambda$ is an even integer from the interval $[1 - l, \ldots, 2l - 2]$. Hence a projective module $P(\lambda)$ can be a subquotient of $ad$ only if $\lambda \in 2\mathbb{Z} \cap [0, \ldots, l - 1]$. In particular each subcategory $C_j$ contains only one isomorphism class of such projectives.

3.11. Lemma 3.1. Suppose $V \in C$ is indecomposable and the action of the Casimir element $X$ on $V$ is not semisimple. Then there exists $\lambda \not\equiv -1 \pmod l$ such that $V \simeq P(\lambda)$.

Proof. Casimir acts nonsemisimply on the regular representation (see (2)). It follows that the action on projective modules $P(\lambda)$, $\lambda \not\equiv -1 \pmod l$, is not semisimple. It is easy to see that the space of eigenvectors of Casimir in $P(\lambda)$ is $W(\lambda)$. Let $W \subset V$ be a maximal submodule of $V$ such that $X$ acts on $W$ semisimply. Choose $0 \neq \varphi \in \mathrm{Hom}(V, L)$, where $L = L(\lambda)$ for some $\lambda \in \mathbb{Z}$, such that $\mathrm{Ker}\,\varphi$ contains $W$. We have a morphism $\psi \in \mathrm{Hom}(P, V)$, where $P = P(\lambda)$, such that the diagram is commutative:

$$P \xrightarrow{\ \psi\ } V \xrightarrow{\ \varphi\ } L \to 0, \quad \text{with } \varphi \circ \psi \text{ the canonical surjection } P \to L.$$

If $\mathrm{Ker}\,\psi \neq 0$, then $X$ acts on $\mathrm{Im}\,\psi$ semisimply. Therefore $W$ is not maximal. We have a contradiction. If $\mathrm{Ker}\,\psi = 0$, then we have an injection $P \hookrightarrow V$. From 3.7 it follows that $P$ is a direct summand of $V$. The proof is complete.

4. The Blocks of Adjoint Representation

In what follows we will always identify $S_j$ and $S_{j'}$, where $j + j' = l - 2$ and $S$ is an object, map, etc. Also we will identify $S_J$ and $S_j$ if $J \in \mathbb{Z}$, $J \equiv j \pmod l$.

4.1. The regular action of the Casimir $X$ (by multiplication) is an endomorphism of the adjoint representation. This gives a decomposition of the adjoint representation into blocks $ad = \bigoplus_{j \in H'} ad_j$, where $(X - b_j)$ acts nilpotently on $ad_j$. Let $pr_j$ denote a projection onto $ad_j$.

4.2. Let $j \in H$. Let $M_j = \mathrm{Ker}(X - b_j)$, $N_j = ad_j \cap \mathrm{Im}(X - b_j)$. Each $ad_j$ admits a filtration $ad_j \supset M_j \supset N_j \supset 0$. The rest of this section is a computation
of associated graded factors of this filtration. Evidently $ad_j/M_j \simeq N_j$. It remains to compute $N_j$, $M_j/N_j$. It is convenient to put $N_{-1} = ad_{-1}$.

4.3. Recall (see 2) that $ad^0 \subset ad$ denotes the zero weight space. Let $ad^0_j = ad_j \cap ad^0$ (for all $j \in H'$), $N^0_j = N_j \cap ad^0$, $M^0_j = M_j \cap ad^0$ (for $j \in H$) and $N^0_{-1} = ad^0_{-1}$. We will compute the action of $ad(X)$ on $ad^0_j$. We start with a computation of the action of $ad(X)$ on $N^0_j$.

4.4. Lemma 4.1. We have

(a) $\dim ad_j = 2l^2$ if $j \in H$ and $\dim ad_{-1} = l^2$;

(b) if $j \in H$ then $\dim N_j = \dim ad_j/M_j = (J + 1)^2 + (J' + 1)^2$ and $\dim M_j/N_j = 4(J + 1)(J' + 1)$;

(c) $\dim ad^{2m} = l(l - |m|)$ for all $m \in \mathbb{Z}$ such that $|m| < l$;

(d) $\dim ad^0_j = 2l$ for $j \in H$ and $\dim N^0_j = l$ for all $j \in H'$;

(e) $\dim ad^{2m}_j \geq 2(l - |m|)$ if $j \in H$ and $|m| \geq l - 1 - J$.

Proof. (a), (b), (c), (d) are trivial. Let us prove (e). Suppose $m > 0$. It is easy to see from consideration of the $u$-action on the Verma modules $M^+(J)$ and $M^+(J')$ that $E^m$ acts nontrivially at least on $2(l - m)$ weights. By standard arguments with a Vandermonde determinant we obtain that the desired dimension is at least $2(l - m)$. The proof for $m < 0$ is similar.
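The counts in parts (a), (b) and (c) of Lemma 4.1 are tied together by $\dim u = l^3$ and by the filtration $ad_j \supset M_j \supset N_j \supset 0$ with $ad_j/M_j \simeq N_j$. The Python sketch below checks this bookkeeping for odd $l$; it parametrizes $H$ by $J = 0, \ldots, (l-3)/2$ with $J' = l - 2 - J$, and the function name is hypothetical.

```python
def block_dimension_checks(l):
    """Return True if the counts of Lemma 4.1 (a), (b), (c) are mutually consistent."""
    ok = True
    total = l * l                                   # dim ad_{-1} = l^2
    for J in range((l - 1) // 2):                   # one value of J per j in H
        Jp = l - 2 - J
        n_j = (J + 1) ** 2 + (Jp + 1) ** 2          # dim N_j = dim ad_j / M_j
        mid = 4 * (J + 1) * (Jp + 1)                # dim M_j / N_j
        ok = ok and (2 * n_j + mid == 2 * l * l)    # the three layers fill ad_j
        total += 2 * l * l
    ok = ok and (total == l ** 3)                   # blocks add up to dim u
    by_weight = sum(l * (l - abs(m)) for m in range(-l + 1, l))
    return ok and (by_weight == l ** 3)             # weight spaces add up to dim u
```

The identity behind the first check is $2\left((J+1)^2 + (J'+1)^2\right) + 4(J+1)(J'+1) = 2(J + J' + 2)^2 = 2l^2$.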
4.5. The subspace $ad^0 \subset u$ is a subalgebra of $u$. It is generated as an algebra by $K$ and $X$ (see [Ke]). Moreover $ad^0$ is a free module over the subalgebra generated by $K$ (see loc. cit.). In particular we have $M^0_j = N^0_j$. We have

$$ad(X)K^i = \frac{q^{2i-1} + q^{1-2i}}{(q - q^{-1})^2}\, K^i - (q^i - q^{-i})^2\, XK^{i+1} + (i)_q (i + 1)_q\, K^{i+2}. \qquad (3)$$

4.6. For $j \in H$ the elements $(X - b_j)\,pr_j K^i$, $i = 1, \ldots, l$ (resp. $pr_{-1} K^i$, $i = 1, \ldots, l$ for $j = -1$) form a basis of $N^0_j$. In this basis $ad(X)$ acts as a lower-triangular matrix:

$$A(j) = \begin{pmatrix} b_0 & 0 & 0 & \cdots & 0 \\ -(q - q^{-1})^2 b_j & b_2 & 0 & \cdots & 0 \\ (1)_q (2)_q & -(q^2 - q^{-2})^2 b_j & b_4 & \cdots & 0 \\ 0 & (2)_q (3)_q & -(q^3 - q^{-3})^2 b_j & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & b_0 \end{pmatrix} \qquad (4)$$
4.6.1. Remark 1. The vectors $pr_j K^i$, $i = 1, \ldots, l$; $(X - b_j)\,pr_j K^i$, $i = l + 1, \ldots, 2l$ form a basis of $ad^0_j$ ($j \in H$).

4.7. Let $k \in 2\mathbb{Z} \cap [0, \ldots, l - 1]$. The eigenvalues of this matrix are $b_k$ (with multiplicity 2 if $k \neq l - 1$ and multiplicity 1 if $k = l - 1$). Let $k \neq l - 1$. It is obvious that there exist 1 or 2 eigenvectors of $A(j)$ corresponding to the eigenvalue $b_k$. We have 2 eigenvectors iff the determinant $d(j, k)$ of the matrix (see Lemma 6.1)

$$D(j, k) = \begin{pmatrix} -(q^{k/2+1} - q^{-k/2-1})^2 b_j & b_{k+2} - b_k & 0 & \cdots \\ (k/2 + 1)_q (k/2 + 2)_q & -(q^{k/2+2} - q^{-k/2-2})^2 b_j & b_{k+4} - b_k & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{pmatrix} \qquad (5)$$

is equal to zero. It is easy to see that this determinant is a polynomial in $b_j^2$ of degree $\frac{l-1-k}{2}$. Since $b_j^2 = b_{j'}^2 \Rightarrow b_j = b_{j'}$, the polynomial $d(j, k)$ vanishes for at most $\frac{l-1-k}{2}$ values of $j \in H'$.

4.8. In this section we prove the following proposition:

Proposition 4.1. For any $j \in H$ we have the following decomposition:

$$N_j \simeq \bigoplus_{i=0}^{J} \bigl(L(2i) \oplus L(2i)\bigr) \oplus \bigoplus_{i=J+1}^{\frac{l-3}{2}} P(2i) \oplus L(l - 1).$$
Proof. The proof proceeds by induction: we start from $j = \frac{l-3}{2}$, then proceed to $j = \frac{l-5}{2}$, etc. It follows from 4 that for any $j \in H'$ the module $N_j$ contains as subquotients $L(0), L(2), \ldots, L(l-3)$ with multiplicities 2 and $L(l-1)$ with multiplicity 1 (since only these modules have nontrivial zero weight space).

Lemma 4.2. Let $j + 1 = \frac{l-1}{2}$. Then $N_j \simeq L(0) \oplus L(0) \oplus \ldots \oplus L(l-3) \oplus L(l-1)$.

Proof. Let us compute the dimensions. We have $\dim N_j = \left(\frac{l-1}{2}\right)^2 + \left(\frac{l+1}{2}\right)^2 = \frac{l^2+1}{2}$ (see Lemma 4(b)). On the other hand, by the above we have
\[
\dim N_j \ge 2\dim L(0) + 2\dim L(2) + \ldots + 2\dim L(l-3) + \dim L(l-1) = 2\cdot 1 + 2\cdot 3 + \ldots + 2\cdot(l-2) + l = \frac{l^2+1}{2}.
\]
It follows that in this case $N_j$ is a direct sum of $L(0), L(2), \ldots, L(l-3)$ with multiplicities 2 and $L(l-1)$ with multiplicity 1 (since $\mathrm{Ext}^1(L(\lambda), L(\mu)) = 0$ for all $\lambda, \mu \in \{0, 2, \ldots, l-1\}$).

The lemma implies that all eigenvalues of $A(\frac{l-3}{2})$ are semisimple. It follows from 4 that the eigenvalue $b_{l-3}$ is not semisimple for all the rest of $j$. Therefore for $j \ne \frac{l-3}{2}$ the corresponding $N_j$ contains the projective submodule $P(l-3)$ (see 3).

Now let $j + 1 = \frac{l-3}{2}$. Then $\dim N_j = \frac{l^2+9}{2}$. On the other hand $\dim N_j \ge 2\cdot 1 + \ldots + 2\cdot(l-4) + 2l + l = \frac{l^2+9}{2}$. So in this case $N_j$ is a direct sum of $L(0), L(2), \ldots, L(l-5)$ with multiplicities 2, $L(l-1)$ with multiplicity 1, and $P(l-3)$. As above it follows that all the rest of $N_j$ contain projective submodules $P(l-5)$ and $P(l-3)$, etc. The proposition is proved.
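The two dimension counts in the proof above are elementary sums over odd numbers, and they can be checked mechanically. The following is only a sketch under assumptions read off from the sums in the text: $l$ is odd, $\dim L(\lambda) = \lambda + 1$ for $0 \le \lambda \le l-1$, and $\dim P(l-3) = 2l$. The function names are hypothetical.

```python
def simple_dim(weight):
    # Assumed dimension of the simple module L(weight): weight + 1.
    return weight + 1


def bound_first_step(l):
    # 2*dim L(0) + 2*dim L(2) + ... + 2*dim L(l-3) + dim L(l-1)
    doubled = sum(2 * simple_dim(k) for k in range(0, l - 2, 2))
    return doubled + simple_dim(l - 1)


def bound_second_step(l):
    # 2*dim L(0) + ... + 2*dim L(l-5) + dim P(l-3) + dim L(l-1),
    # with dim P(l-3) = 2l taken from the sum written in the proof.
    doubled = sum(2 * simple_dim(k) for k in range(0, l - 4, 2))
    return doubled + 2 * l + simple_dim(l - 1)


for l in range(5, 41, 2):
    # Compare with (l^2 + 1)/2 and (l^2 + 9)/2, cleared of denominators.
    assert 2 * bound_first_step(l) == l * l + 1
    assert 2 * bound_second_step(l) == l * l + 9
```

The check covers only the arithmetic; the representation-theoretic input (which subquotients occur, and the vanishing of $\mathrm{Ext}^1$) is taken from the text.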
V. Ostrik
4.8.1. Corollary 4.1. We have:
\[
\mathrm{ad}_{-1} = N_{-1} = \bigoplus_{i=0}^{\frac{l-1}{2}} P(2i).
\]
Proof. It follows from the proof of Proposition 4.1 that all the eigenvalues of the matrix $A(-1)$, except for $b_{l-1}$, are not semisimple. Hence the result follows from Lemma 3.1 and computation of dimensions.

4.9. Corollary 4.1 gives a decomposition of $\mathrm{ad}_{-1}$. So in what follows we will assume that $j \ne -1$, i.e. $j \in H$.

Lemma 4.3. The module $M_j/N_j$ is a direct sum of modules $L(\lambda)$ with multiplicity 2, where $\lambda$ is even and satisfies one of the following conditions: (i) $\lambda$ lies in the interval $[2(l - J + 1), \ldots, 2l - 2]$; (ii) $\lambda$ lies in the interval $[-2J - 2, \ldots, -2]$.

Proof. Recall that $M_j^0 = N_j^0$. Hence $M_j/N_j$ contains only subquotients $L(\lambda)$, where either $\lambda > l$ or $\lambda < 0$. By Lemma 4.1 (e), Proposition 4.1, and Corollary 4.1, we have $\dim \mathrm{ad}_j^{2m} \ge 2(l - |m|)$ if $j \in H$ and $\dim \mathrm{ad}_{-1}^{2m} \ge l - |m|$ for any $m \ge \frac{l+1}{2}$. It follows from Lemma 4.1 (c) that $\dim \mathrm{ad}_j^{2m} = 2(l - |m|)$ for any $j \in H$ and $m \ge \frac{l+1}{2}$. Hence for any $m \ge \frac{l+1}{2}$ we have exactly two upper (resp. lower) singular vectors of weight $2m$ (resp. $-2m$). It follows that $\mathrm{ad}_j$ has two simple subquotients with highest weight $2m$ and two simple subquotients with lowest weight $-2m$ for any $m \ge \frac{l+1}{2}$. Thus $\mathrm{ad}_j$ has the following subquotients: $L(2i)$, where $i \in [0, \ldots, \frac{l-3}{2}]$, with multiplicities 4; $L(l-1)$ with multiplicity 2; $L(-2-2i)$ and $L(2l-2-2i)$, where $i \in [0, \ldots, \frac{l-3}{2}]$, with multiplicities 2. The computation of dimensions shows that these modules are all the subquotients of $\mathrm{ad}_j$. It follows from Proposition 4.1 that $[N_j : L(\lambda)] = 1$ if $\lambda = -2-2i$ and $\lambda = 2l-2-2i$, where $i \in [J+1, \ldots, \frac{l-3}{2}]$. Since $\mathrm{ad}_j/M_j \simeq N_j$, any simple subquotient of $M_j/N_j$ is of the type $L(\lambda)$, where $\lambda \in [2(l - J + 1), \ldots, 2l - 2] \cup [-2J - 2, \ldots, -2]$ is even. But for any $\lambda, \mu$ satisfying such conditions we have $\mathrm{Ext}^1(L(\lambda), L(\mu)) = 0$. The result follows.

4.9.1. Remark. It follows from the proof of Lemma 4.3 that $\dim \mathrm{ad}_j^{2m} = 2(l - |m|)$ for any $j \in H$ and $m \in \mathbb{Z}$, $|m| < l$.

4.10. Let $k$ be an even integer and $k \in [0, l-1]$.
Let adj (k) be a summand corresponding to the subcategory Ck in adj . Let Mj (k) = Mj ∩ adj (k) and Nj (k) = Nj ∩ adj (k). Let us summarize the results of the present section. 4.10.1. ad−1 (k) = P (k).
4.10.2. If $j \in H$ then
(a) $\mathrm{ad}_j(l-1) = L(l-1) \oplus L(l-1)$;
(b) if $k \ge 2J + 2$ then $\mathrm{ad}_j(k) = P(k) \oplus P(k)$;
(c) if $k \le 2J$ then $\mathrm{ad}_j(k)$ admits a filtration $\mathrm{ad}_j(k) \supset M_j(k) \supset N_j(k) \supset 0$ with the following associated graded factors: $N_j(k) \simeq L(k) \oplus L(k)$; $M_j(k)/N_j(k) \simeq L(2l-2-k) \oplus L(2l-2-k) \oplus L(-2-k) \oplus L(-2-k)$; $\mathrm{ad}_j(k)/M_j(k) \simeq L(k) \oplus L(k)$.

5. The Proof of the Main Theorem

5.1. Let us find the multiplicities of projective submodules in $\mathrm{ad}_j$. Recall (see Remark in 4.6.1) that the vectors $\mathrm{pr}_j K^i$, $i = 1, \ldots, l$; $(X - b_j)\,\mathrm{pr}_j K^i$, $i = l+1, \ldots, 2l$ form a basis of $\mathrm{ad}_j^0$. In this basis $\mathrm{ad}(X)$ acts as a block matrix
\[
A'(j) = \begin{pmatrix} A(j) & 0 \\ B & A(j) \end{pmatrix},
\]
where $A(j)$ is the matrix (4) and $B$ is the matrix (see (3))
\[
B = \begin{pmatrix}
0 & 0 & 0 & \ldots \\
-(q - q^{-1})^2 & 0 & 0 & \ldots \\
0 & -(q^2 - q^{-2})^2 & 0 & \ldots \\
\ldots & \ldots & \ldots & \ldots
\end{pmatrix}.
\]
5.2. Let $b_k$ be a nonsemisimple eigenvalue of $A(j)$. Then a summand corresponding to the subcategory $C_k$ in $\mathrm{ad}_j$ is a sum of 2 copies of the projective module $P(k)$.

5.3. Let $b_k$ ($k \ne -1$) be a semisimple eigenvalue of $A(j)$, i.e. $k \le 2J$.

5.3.1. Lemma 5.1. In this case $\mathrm{ad}_j$ contains the projective module from the category $C_k$.

Proof. It is enough to prove that the matrix $A'(j)$ has exactly 3 eigenvectors with eigenvalue $b_k$ or, equivalently, that the matrix $A'(j) - b_k$ has corank 3. Let us denote by $\tilde{A}'(j,k)$ the matrix $A'(j) - b_k$ with the $i$th and $(i+l)$th columns divided by the positive numbers $-(q^i - q^{-i})^2$ for any $i \in [1, \ldots, l-1]$. In order to apply Lemma 6.2 let us put $A' = \tilde{A}'(j,k)$. Then the corresponding matrix $D$ in the notations of Lemma 6.2 is the matrix $D(j,k)$ with columns divided by some positive numbers. In order to check the semisimplicity of the matrix $D$, put $q = \exp(\pi i \frac{l+1}{l})$. Then the conditions of Lemma 6.3 are satisfied. Indeed, the off-diagonal entries of $D(j,k)$ are either $(t)_q (t+1)_q$ or $b_{k+2t} - b_k = (t)_q (t+k+1)_q$. In both cases this entry is $(t_1)_q (t_2)_q$ where $t_1, t_2 \in [1, \ldots, l-1]$ and one of $t_1, t_2$ is even and the other is odd. But if $q = \exp(\pi i \frac{l+1}{l})$ and $t \in [1, \ldots, l-1]$ then $(t)_q > 0 \Leftrightarrow t$ is odd. Finally note that the entries of $D$ are the entries of $D(j,k)$ divided by some positive numbers. The lemma is proved.
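The sign claim used at the end of the proof can be illustrated numerically. The sketch below assumes the usual quantum integer $(t)_q = (q^t - q^{-t})/(q - q^{-1})$, which is not spelled out in this part of the text, and takes $l$ odd; the helper names are hypothetical.

```python
import cmath
import math


def root(l):
    # q = exp(pi * i * (l + 1) / l), the specialization used in the proof.
    return cmath.exp(1j * math.pi * (l + 1) / l)


def q_number(t, l):
    # Assumed definition: (t)_q = (q^t - q^-t) / (q - q^-1).
    q = root(l)
    value = (q ** t - q ** (-t)) / (q - 1 / q)
    return value.real  # the imaginary part vanishes up to rounding


def column_scale(i, l):
    # The number -(q^i - q^-i)^2 by which the i-th column is divided.
    q = root(l)
    return (-(q ** i - q ** (-i)) ** 2).real


for l in (3, 5, 7, 11, 13):
    for t in range(1, l):
        # (t)_q is positive exactly when t is odd.
        assert (q_number(t, l) > 0) == (t % 2 == 1)
        # The column scaling factors are positive.
        assert column_scale(t, l) > 0
    for t in range(1, l - 1):
        # A product of an even and an odd quantum integer is negative.
        assert q_number(t, l) * q_number(t + 1, l) < 0
```

With this $q$ one has $(t)_q = (-1)^{t+1} \sin(\pi t/l)/\sin(\pi/l)$, which is why the sign alternates with the parity of $t$.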
5.3.2. The above lemma implies that $\mathrm{ad}_j(k)$ is a sum of a projective module and some module $Y(k,j)$. It follows from 4.9.1 that $Y(k,j)$ admits a filtration of length 3 with the following associated graded factors: $L(k)$; $L(-2-k) \oplus L(2l-2-k)$; $L(k)$.

5.3.3. For any $0 \le s < l$ the element $\mathrm{pr}_j (K^{-1}E)^s$ is an upper singular vector in $\mathrm{ad}_j$. Let us prove that if $s \le \frac{l-1}{2}$ then $(X - b_j)\,\mathrm{pr}_j (K^{-1}E)^s \ne 0$. Consider the action of this element on $P(J')$. Then $(X - b_j)$ is a surjection onto $L(J') \subset P(J')$ and the desired result follows from the fact that $\dim L(J') \ge \frac{l+1}{2}$. Similarly the vector $\mathrm{pr}_j F^s$ is a lower singular vector in $\mathrm{ad}_j$, and $\mathrm{pr}_j F^s \notin M_j$ if $s \le \frac{l-1}{2}$. For $s = \frac{k}{2}$ we obtain that $\mathrm{ad}_j$ contains a submodule $L(k)$ which does not lie in $M_j$. Indeed, the submodules generated by $\mathrm{pr}_j (K^{-1}E)^s$ and $\mathrm{pr}_j F^s$ coincide since $\mathrm{ad}_j(k)/M_j(k) \simeq L(k) \oplus L(k)$ and $\mathrm{ad}_j(k) \supset P(k) \not\subset M_j(k)$.

5.3.4. It follows that $Y(k,j) = Z(k,j) \oplus L(k)$, where $Z(k,j)$ contains a submodule $L(k)$. Hence $\mathrm{ad}(k) = \bigoplus_{j \in H'} \mathrm{ad}_j(k)$ is a direct sum of a few copies of $P(k)$ and $Y(k) = \bigoplus_{j \in H'} Y(k,j)$, where all the subquotients $L(k)$ of $Y(k)$ are submodules of $Y(k)$. Now from the autoduality (see 2.6) we see that all the subquotients $L(k)$ are direct summands of $Y(k)$. Thus $Y(k)$ is a direct sum of its simple subquotients. Hence in the case 4.10.2 (c) (i.e. if $k \le 2J$) we have that $\mathrm{ad}_j(k) = P(k) \oplus L(k) \oplus L(k) \oplus L(2l-2-k) \oplus L(-2-k)$.
This completes the proof of the Main Theorem.
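6. Three Matrix Lemmas

The results of this section were used in the previous sections.

6.1. Let $A$ be an $r \times r$ lower-triangular matrix:
\[
A = \begin{pmatrix}
\alpha_1 & 0 & 0 & \ldots & 0 \\
\beta_1 & \alpha_2 & 0 & \ldots & 0 \\
\gamma_1 & \beta_2 & \alpha_3 & \ldots & 0 \\
0 & \gamma_2 & \beta_3 & \ldots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & \beta_{r-1} & \alpha_r
\end{pmatrix}. \tag{6}
\]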
Lemma 6.1. Let $\alpha_i = \alpha_j = \alpha$ for some $i < j$ and $\alpha_k \ne \alpha$ for $k \ne i, j$. The matrix $A$ has 2 different eigenvectors with eigenvalue $\alpha$ iff the determinant of the $(j-i) \times (j-i)$ matrix
\[
D = \begin{pmatrix}
\beta_i & \alpha_{i+1} - \alpha & 0 & \ldots & 0 \\
\gamma_i & \beta_{i+1} & \alpha_{i+2} - \alpha & \ldots & 0 \\
0 & \gamma_{i+1} & \beta_{i+2} & \ldots & 0 \\
0 & 0 & \gamma_{i+2} & \ldots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & \gamma_{j-2} & \beta_{j-1}
\end{pmatrix} \tag{7}
\]
vanishes.

Proof. Clear.
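6.2. Suppose $\alpha_i = \alpha_j$ iff $i + j = r + 1$. Let $A'$ be the following matrix
\[
A' = \begin{pmatrix} A & 0 \\ B' & A \end{pmatrix},
\]
where
\[
B' = \begin{pmatrix}
0 & 0 & 0 & \ldots \\
1 & 0 & 0 & \ldots \\
0 & 1 & 0 & \ldots \\
\ldots & \ldots & \ldots & \ldots
\end{pmatrix}.
\]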
Lemma 6.2. Suppose the matrix $A$ above has 2 eigenvectors with eigenvalue $\alpha_i$. Suppose the matrix $D$ is semisimple of corank 1. Then the matrix $A' - \alpha_i$ has corank 3.

Proof. It suffices to consider the case $i = 1$, $j = r$, $\alpha_1 = \alpha_r = 0$. Deleting two rows and columns consisting of zeros we obtain a matrix
\[
D' = \begin{pmatrix} D & 0 \\ E & D \end{pmatrix},
\]
where $E$ is a unit matrix. We have to prove that the corank of $D'$ is equal to 1. Let $\mathrm{Im}(D)$ (resp. $\mathrm{Im}(D')$) denote the linear space generated by the columns of $D$ (resp. $D'$). Let $\mathrm{pr} : \mathrm{Im}(D') \to \mathbb{C}^{r-1}$ be the map forgetting the last $r - 1$ coordinates. Then $\mathrm{pr}(\mathrm{Im}(D')) = \mathrm{Im}(D)$ has dimension $r - 2$. Let us prove that $\mathrm{Ker}(\mathrm{pr})$ has dimension $r - 1$. Indeed, $\mathrm{Ker}(\mathrm{pr}) \supset \mathrm{Im}(D)$ and $\mathrm{Ker}(\mathrm{pr})$ contains the kernel of the operator $D$. Since $D$ is semisimple, $\mathrm{Im}(D) \not\supset \mathrm{Ker}(D)$. The result follows.
6.3 . Lemma 6.3. Suppose D is a real matrix, and all the off-diagonal entries are negative. Then D is semisimple. Proof. It is easy to see that conjugating matrix D by some diagonal matrix we can obtain a symmetric matrix.
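The conjugation mentioned in the proof of Lemma 6.3 can be made explicit in the situation where the lemma is applied. The sketch below assumes, as for the matrix (7), that $D$ is tridiagonal with negative entries next to the diagonal and zeros elsewhere; this tridiagonal shape is an assumption of the example, and the function names are hypothetical.

```python
import math


def symmetrizing_scale(D):
    # Diagonal entries s_0, s_1, ... with
    # (s_{i+1} / s_i)^2 = D[i+1][i] / D[i][i+1],
    # which is positive because both entries are negative.
    s = [1.0]
    for i in range(len(D) - 1):
        upper, lower = D[i][i + 1], D[i + 1][i]
        s.append(s[i] * math.sqrt(lower / upper))
    return s


def conjugate(D, s):
    # Entries of S^{-1} D S for S = diag(s): D[i][j] * s[j] / s[i].
    n = len(D)
    return [[D[i][j] * s[j] / s[i] for j in range(n)] for i in range(n)]


# Illustrative real tridiagonal matrix with negative off-diagonal entries.
D = [
    [2.0, -1.0, 0.0],
    [-4.0, 3.0, -2.0],
    [0.0, -8.0, 5.0],
]
S = symmetrizing_scale(D)
sym = conjugate(D, S)
for i in range(3):
    for j in range(3):
        assert abs(sym[i][j] - sym[j][i]) < 1e-12
```

A real symmetric matrix is diagonalizable, so $D$, being conjugate to one, is semisimple; this is the content of the one-line proof.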
References
[BFS] Bezrukavnikov, R., Finkelberg, M., Schechtman, V.: Localization of u-modules. V. The modular structure on the category FS. Preprint, to appear
[Ke] Kerler, T.: Mapping class group actions on quantum doubles. Comm. Math. Phys. 168, 353–388 (1995)
[Lu] Lusztig, G.: Finite-dimensional Hopf algebras arising from quantized universal enveloping algebras. J. of AMS 3, 257–296 (1990)
[LM] Lyubashenko, V., Majid, S.: Braided groups and quantum Fourier transform. Preprint DAMTP/91-26
Communicated by G. Felder
Commun. Math. Phys. 186, 265 – 293 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
The Brjuno Functions and Their Regularity Properties
S. Marmi (1), P. Moussa (2), J.-C. Yoccoz (3)
(1) Dipartimento di Matematica “U. Dini”, Università di Firenze, 50134 Firenze, Italy
(2) CEA, Service de Physique Théorique, CE-Saclay, F-91191 Gif-sur-Yvette Cedex, France
(3) Université de Paris-Sud, Mathématiques, Bât. 425, 91405 Orsay, France
Received: 21 March 1995 / Accepted: 8 August 1996
Abstract: We show that various possible versions of the Brjuno function, based on different kinds of continued fraction developments, are all equivalent and we study their regularity ($L^p$, BMO and Hölder) properties, through a systematic analysis of the functional equation which they fulfill.
0. Introduction When an irrational rotation is analytically perturbed, it is a natural question to ask whether or not there exists a neighborhood of the fixed point where the dynamics looks like the unperturbed case. More precisely, does there exist a local holomorphic coordinate for which the perturbed transformation is expressed as an ordinary rotation? When the rotation number satisfies the Brjuno condition [Br], such a coordinate exists in a domain called a Siegel disk. The Brjuno function tells more: it gives an estimate of minus the logarithm of the size of the Siegel disks as a function of the rotation number [Yo]. A similar result also holds in the simpler case of certain complex area–preserving maps [Ma, Da]. In [Yo], the relation between the logarithm of the size of the Siegel disks and the Brjuno function is established through a geometrical renormalisation argument, which uses both a change of variable, combined with the replacement of the original map with a suitably chosen first return map. If the original map has rotation number x (where 0 < x < 1), the replaced map has rotation number 1/x mod 1. The argument gives then the functional equation fulfilled by the Brjuno function, but only establishes inequalities between the Brjuno function and the logarithm of the radius of the Siegel disk (see [Pe] for a review). In this work, we will analyse the consequences of an additional assumption, which remains to be proven, that is we assume that the logarithm of the size of the Siegel disk fulfills the same equation as the logarithm of the Brjuno function, up to a remainder term
having sufficient regularity conditions. We shall discover that a natural assumption is that this remainder be continuous and satisfy a Hölder condition of exponent 1/2. We first observe that regularity assumptions are quite common and usually implicitly made on remainder terms by physicists in their approach of renormalisation group type applied to dynamical systems. The assumption is also supported here by previous numerical calculations [Ma], which display the following a priori surprising result: the difference between the Brjuno function and the suitable measure of regularity adapted to the particular dynamical system is not only bounded, as anticipated from Yoccoz’s result, but moreover continuous. This observation holds in several cases: 1) holomorphic maps with indifferent fixed points, where the Brjuno function is compared to some measure of the logarithm of the size of the Siegel disk (or Herman rings), as a function of the rotation number, 2) complex area preserving maps (semistandard map, modulated singular maps) where the Brjuno function is compared to the logarithm of the critical function. It is remarkable also that the plot of the ratio of each of these functions (size of disks and rings, critical functions) to the exponential of minus the Brjuno function displays the same general features: boundedness, continuity, and universal shape [Ma, see for instance Figs. 6, 13, and 16]. One of the purposes of this work is to explain that these numerical features become natural outcomes of the above mentioned regularity hypothesis of the renormalisation equation.

In Sect. 1, we first analyze the relation between the Brjuno function and the various kinds of continued fractions. This requires a review of the properties of these various continued fraction expansions. A Brjuno function is associated in Sect. 2 to each kind of continued fraction expansion, so that we get a one parameter family of functions $B_\alpha(x)$, and the Brjuno condition on $x$ is equivalent to $B_\alpha(x)$ bounded, for any fixed $\alpha$. We shall see later that only two cases appear to be relevant to our purpose, that is the usual Gauss case $\alpha = 1$ and the nearest integer case $\alpha = 1/2$ which is used in [Yo]. We establish the functional equation fulfilled by each Brjuno function, and show that its solution requires the inversion of an operator $T^{(\alpha)}$ (see Eqs. (2.5) and (2.7) below; usually, the superscript $\alpha$ is omitted in $T^{(\alpha)}$, when the value of $\alpha$ is clearly fixed). We estimate for any $\alpha$ the value of the spectral radius of the operator $T$ (and of some of its generalisations) for all $L^p$ norms. This shows that the Brjuno functions belong to all $L^p$ spaces for finite $p$.

In Sect. 3, we fix $\alpha = 1/2$, and we estimate the norm of $T$ (and of some of its generalisations) in the BMO-space (BMO meaning Bounded Mean Oscillation; see a short review on BMO norms in the Appendix); in this case, the Brjuno function $B_{1/2}$ is obtained as the action of $(1 - T)^{-1}$ on a logarithmic function which belongs to the BMO-space. Since the spectral radius of $T$ in this space is shown to be smaller than one, the Brjuno function is also in this space. Noticing that the adjoint of $T$ is nothing else than the generalized Ruelle-Frobenius-Perron operator associated to the dynamical system which generates the continued fraction, the identification of the space adapted to $T$ seems to us promising for the study of dynamical properties.

Another nice feature of the case $\alpha = 1/2$ is that $T$ also acts on the space of continuous functions. We study in Sect. 4 this action with respect to Hölder continuity properties. We show that regular perturbations (at least $C^{1/2}$) of the logarithmic term do modify the solution only by a $C^{1/2}$ contribution, so that the most singular part remains unchanged. This is where the above regularity assumption for the remainder of the renormalisation equation can be used: if these remainder perturbative terms to the renormalisation equation have regularity $C^\beta$ ($\beta > 1/2$), then the most singular part of minus the logarithm of the size of the stability domains as a function of the rotation number is universally (that is modulo a $C^{1/2}$ function) described by the Brjuno function. One could infer from these results that
the Brjuno function for the $\alpha = 1/2$ case is the best candidate for applications; actually, we also found that the difference $B_1 - B_{1/2}$ is Hölder-1/2 continuous. Therefore both $B_1$ and $B_{1/2}$ are suitable. This is somewhat surprising, since the operator $T^{(1)}$ is not defined in the space of continuous functions (indeed it is not even defined in the BMO-space). However a specific character of $B_{1/2}$ is that it is an even function. As an example, we find it natural to set up the following conjecture: The difference between the Brjuno function $B_{1/2}$ (equivalently $B_1$), and minus the logarithm of the conformal radius of the Siegel disk of the quadratic polynomial $z \exp(2i\pi x) + z^2$ is a 1/2-Hölder continuous function of the real variable $x$. Actually, Douady-Hubbard’s theory of quadratic-like maps [DH] suggests that this could even be true also for perturbations of the quadratic maps. Also we would like to replace the radius in the above conjecture by the critical function of holomorphic area preserving maps, of the kind considered in [Ma, Da].

A further motivation for the present study, and particularly the BMO-space results, is the problem of building a complex analytic extension of the Brjuno function. This will be the subject of a subsequent paper.

One of us (S. M.) wishes to thank the Italian CNR for financial support, and A. Berretti and S. Isola for useful discussions. Part of this work was made during visits of the second author (P. M.), who thanks the Department of Mathematics ‘U. Dini’ of the University of Florence, the Italian Institutes INFN and INFM for hospitality and/or financial support. Support from the EC contract ERBCHRXCT94-0460 for the project “Stability and universality in classical mechanics” is also acknowledged.
1. On a Family of Continued Fraction Transformations

Let $\alpha \in [1/2, 1]$ and let $x \in \mathbb{R}$. We define
\[
[x]_\alpha = \min\{p \in \mathbb{Z} \mid x < \alpha + p\}, \tag{1.1}
\]
that is $[x]_\alpha = p$ iff $\alpha - 1 + p \le x < \alpha + p$.

[...] $q_n > 0$;
ii) $p_n > 0$ when $x > 0$ and $p_n < 0$ when $x < 0$;
iii) $|q_n x - p_n| = \dfrac{1}{q_{n+1} + \varepsilon_{n+1} q_n x_{n+1}}$, so that $\dfrac{1}{1+\alpha} < \beta_n q_{n+1} < \dfrac{1}{\alpha}$;
iv) if $\alpha > g$, $\beta_n \le \alpha g^n$;
v) if $\alpha \le g$, $\beta_n \le \alpha \gamma^n$.

Proof. One gets parts (i) and (ii) by recursion using (1.17); in fact it is obvious only when $\alpha = 1$. When $\alpha \ne 1$, one could alternatively use Lemma 1.8 below. Part (iii) is easily obtained from (1.18).

The proof of (iv) is also easy: either $x_k \le g$ for all $k = 0, \ldots, n$, or $x_k > g$ for some $k$. Then $x_{k+1} = x_k^{-1} - 1$ and $x_{k+1} < g^{-1} - 1 = g$, thus $x_k x_{k+1} = 1 - x_k < 1 - g = g^2$. In the sequence $\beta_n = x_0 \cdots x_n$ one then isolates the pairs $x_k, x_{k+1}$ such that $x_k > g$ (since for each pair $x_k x_{k+1} < g^2$). The other terms in $\beta_n$ are all smaller than or equal to $g$, except for $x_n < \alpha$.

The proof of (v) is more complicated: the result is obvious if among $x_0, x_1, \ldots, x_n$ there are fewer than two taking values greater than $\gamma$. Otherwise, let $x_k$ and $x_{k+p+1}$ be two successive values greater than $\gamma$. Note that $x_k > \gamma$ implies $x_{k+1} = |2 - x_k^{-1}| < \gamma$, therefore we must have $p \ge 1$. Now statement (v) is an immediate consequence of the following assertion which we will then prove:

if $x_k > \gamma$, $p \ge 1$, $x_{k+1}, \ldots, x_{k+p} < \gamma$ and $x_{k+p+1} > \gamma$, then $\prod_{i=k}^{k+p} x_i < \gamma^{p+1}$.

We divide the proof into some different cases.
(1) If $\gamma < x_k \le 1/2$, then $x_{k+1} = x_k^{-1} - 2$ and $x_k x_{k+1} = 1 - 2x_k < 1 - 2\gamma = \gamma^2$, and the assertion holds. By the way this closes the proof in the case $\alpha = 1/2$.

(2) Let $x_k > 1/2$, thus $x_{k+1} = 2 - x_k^{-1} < g^2$. Now observe that the image of $[1/3, g^2]$ is $[0, g^2]$. If $x_{k+1} \le 1/3$ we let $m = 1$, and if $x_{k+1} > 1/3$ we let $m \ge 2$ be such that
\[
x_{k+1}, \ldots, x_{k+m-1} \in (1/3, g^2], \qquad x_{k+m} \in [0, 1/3].
\]
Note that $p \ge m \ge 1$.

(2.1) If $m \ge 4$, then $x_k x_{k+1} \cdots x_{k+m} \le \frac{1}{3} g^{2m-1}$, since $x_k < g$, $x_{k+1}, \ldots, x_{k+m-1} \le g^2$ and $x_{k+m} \le 1/3$. A numerical exercise shows that for $m \ge 4$, $\frac{1}{3} g^{2m-1} < \gamma^{m+1}$, and the assertion follows. We must now consider the three cases left: $m = 1$, $m = 2$ and $m = 3$.

(2.2) $m = 1$. If $x_k < 1 - \gamma$, then $x_k x_{k+1} = 2x_k - 1 < \gamma^2$; otherwise $x_k \ge 1 - \gamma$, and $x_{k+1} \ge 1 - \sqrt{2}/2$ so that $2/7 < x_{k+1} \le 1/3$ and $x_{k+2} = x_{k+1}^{-1} - 3$, $x_{k+1} x_{k+2} = 1 - 3x_{k+1}$ and finally $x_k x_{k+1} x_{k+2} = 3 - 5x_k < 5\gamma - 2 = \gamma^3$, which shows the assertion.

(2.3) $m = 2$, then $x_{k+2} = 3 - x_{k+1}^{-1}$ and $x_k x_{k+1} x_{k+2} = 5x_k - 3$. If this is smaller than $\gamma^3$, then the assertion follows. If not, which is equivalent to assuming $x_k > \sqrt{2} - 4/5$, then $x_{k+1} > (48 - 25\sqrt{2})/34$ and $x_{k+2} > 0.3111\ldots$, so that $x_{k+2} \in [2/7, 1/3]$. Thus $x_{k+3} = x_{k+2}^{-1} - 3$ and $x_k x_{k+1} x_{k+2} x_{k+3} = 8 - 13x_k < 8 - 13(\sqrt{2} - 4/5)$. A numerical exercise shows that this last number is smaller than $\gamma^4$, which completes the assertion in this case.

(2.4) $m = 3$, then assume $x_k x_{k+1} x_{k+2} x_{k+3} = 13x_k - 8 > \gamma^4$. This would be equivalent to $x_k > (25 - 12\sqrt{2})/13$. However, from the definition of $m$, one gets $x_{k+3} \le 1/3$ which implies $x_k \le 21/34$, and the two inequalities on $x_k$ are contradictory. Thus $x_k x_{k+1} x_{k+2} x_{k+3} \le \gamma^4$, and the proof of the assertion is completed.
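The "numerical exercises" invoked in cases (2.1) to (2.4) are quick to reproduce. The sketch below assumes $g = (\sqrt{5}-1)/2$ and $\gamma = \sqrt{2}-1$; these values are inferred from the relations $1 - g = g^2$ and $1 - 2\gamma = \gamma^2$ used in the proof, since the definitions themselves are not part of this passage.

```python
import math

# Assumed values, inferred from 1 - g = g^2 and 1 - 2*gamma = gamma^2.
g = (math.sqrt(5) - 1) / 2
gamma = math.sqrt(2) - 1

assert abs((1 - g) - g ** 2) < 1e-12
assert abs((1 - 2 * gamma) - gamma ** 2) < 1e-12

# Case (2.1): g^(2m-1) / 3 < gamma^(m+1) holds for m >= 4 ...
for m in range(4, 40):
    assert g ** (2 * m - 1) / 3 < gamma ** (m + 1)
# ... and fails for m = 3, which is why m = 1, 2, 3 are treated separately.
assert not g ** 5 / 3 < gamma ** 4

# Case (2.2): 5*gamma - 2 equals gamma^3.
assert abs((5 * gamma - 2) - gamma ** 3) < 1e-12

# Case (2.3): the threshold sqrt(2) - 4/5 and the final comparison.
threshold = math.sqrt(2) - 4 / 5
assert abs((5 * threshold - 3) - gamma ** 3) < 1e-12
assert 8 - 13 * threshold < gamma ** 4

# Case (2.4): x_k > (25 - 12*sqrt(2))/13 contradicts x_k <= 21/34.
lower = (25 - 12 * math.sqrt(2)) / 13
assert lower > 21 / 34
```

The margin in case (2.4) is small, about $2 \times 10^{-6}$, but well above double-precision rounding error.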
Remark 1.5. From (iii) and (iv) one gets, if $\alpha > g$, $q_n \ge \dfrac{1}{\alpha(1+\alpha)} G^{n-1}$, and similarly, from (iii) and (v), if $\alpha \le g$, $q_n \ge \dfrac{1}{\alpha(1+\alpha)} \Gamma^{n-1}$.

Remark 1.6. From (iii) one gets
\[
\frac{1}{2 q_n q_{n+1}} \le \frac{1}{q_n (q_n + q_{n+1})} \le \frac{1}{q_n (\alpha q_n + q_{n+1})} < \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}} \tag{1.23}
\]
if $\varepsilon_{n+1} = +1$, whereas
\[
\frac{1}{q_n q_{n+1}} < \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n (q_{n+1} - (1-\alpha) q_n)} < \frac{1}{\alpha q_n^2} \tag{1.24}
\]
if $\varepsilon_{n+1} = -1$. Note also that assertions (iv) and (v) remain valid for $x \in \mathbb{Q}$, with the convention that $\beta_n = 0$ as soon as one of the $x_k$, $k \le n$, vanishes (in which case the $x_k$ with larger order are undefined).

Remark 1.7. Using the estimates of Remark 1.5, $q_k \ge \max(1, G^{k-1}/2)$, and the elementary inequality $\log q_k \le (2/e)\, q_k^{1/2}$, there exist two positive constants $c_1$ and $c_2$ such that
\[
\sum_{k=0}^{\infty} \frac{\log q_k}{q_k} \le c_1 = \frac{2}{e} \left( 3 + \frac{\sqrt{2}}{G - \sqrt{G}} \right) = 5.214\ldots, \tag{1.25}
\]
\[
\sum_{k=0}^{\infty} \frac{\log 2}{q_k} \le c_2 = 5 \log 2 = 3.465\ldots, \tag{1.26}
\]
for all $\alpha \in [1/2, 1]$ and for all $x \in (0, \alpha)$.

Let $P_n/Q_n$ denote the $n$th convergent to $x$ according to the standard continued fraction expansion (i.e. obtained by the iteration of the Gauss map $A_1$). Following the method of [Bo], we shall now relate the $n$th convergents of the $\alpha$-continued fractions to $P_n/Q_n$. In fact the following result is obtained through a repeated use of the identity:
\[
A - \frac{1}{B + x} = A - 1 + \cfrac{1}{1 + \cfrac{1}{B - 1 + x}}.
\]
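Both the constants of Remark 1.7 and the identity above lend themselves to a quick check. The sketch below assumes $G = (1+\sqrt{5})/2$ (the inverse of $g$, inferred from the printed value $5.214\ldots$); the names `lhs` and `rhs` are hypothetical. Exact rational arithmetic is used for the identity so that no rounding is involved.

```python
import math
from fractions import Fraction

# Assumed value of G: the golden mean, i.e. the inverse of g.
G = (1 + math.sqrt(5)) / 2

# Constants of Remark 1.7 as written in (1.25) and (1.26).
c1 = (2 / math.e) * (3 + math.sqrt(2) / (G - math.sqrt(G)))
c2 = 5 * math.log(2)
assert 5.214 <= c1 < 5.215
assert 3.465 <= c2 < 3.466


def lhs(A, B, x):
    # A - 1/(B + x)
    return A - 1 / (B + x)


def rhs(A, B, x):
    # A - 1 + 1/(1 + 1/(B - 1 + x))
    return A - 1 + 1 / (1 + 1 / (B - 1 + x))


# The identity holds exactly for rational test values with B - 1 + x != 0.
for A in range(1, 6):
    for B in range(2, 6):
        for x in (Fraction(1, 3), Fraction(2, 7), Fraction(5, 11)):
            assert lhs(A, B, x) == rhs(A, B, x)
```

The identity rewrites a step with a minus sign as two steps with plus signs, which is how one passes from an $\alpha$-expansion to the Gauss expansion.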
Lemma 1.8. For fixed $x \in \mathbb{R} \setminus \mathbb{Q}$, let $k^\alpha : \mathbb{N} \to \mathbb{N}$ be the arithmetic function inductively defined by $k^\alpha(-1) = -1$ and
\[
k^\alpha(n+1) = \begin{cases} k^\alpha(n) + 1 & \text{if } \varepsilon_{n+1} = +1, \\ k^\alpha(n) + 2 & \text{if } \varepsilon_{n+1} = -1, \end{cases}
\]
where $\varepsilon_{n+1}$ is defined as in (1.10) and (1.12). Then $k^\alpha$ is strictly increasing and for all $n \in \mathbb{N}$
\[
\frac{p_n}{q_n} = \frac{P_{k^\alpha(n)}}{Q_{k^\alpha(n)}}.
\]
Moreover, when $k^\alpha(n+1) = k^\alpha(n) + 2$, we have for the denominators of the convergents of Gauss' continued fraction $Q_{k^\alpha(n+1)} = Q_{k^\alpha(n)+2} = Q_{k^\alpha(n)+1} + Q_{k^\alpha(n)}$.

In the remainder of this Section, we collect a few technical facts concerning the nearest integer continued fraction which will be systematically used in the various proofs of Sects. 3 and 4. The reader who is mainly interested in the results can skip the following lemmas.

Let $A = A_{1/2}$. We say that $x$ and $x'$ belong to the same branch of $A^n$ when $x_k$ and $x'_k$ belong to the same branch of $A$ for $0 \le k \le n-1$. Then the coefficients $a_k$, $\varepsilon_k$ of the expansion of $x$ and the coefficients $a'_k$, $\varepsilon'_k$ of the expansion of $x'$ coincide for $0 \le k \le n$. We now define an integer $n(x, x')$ which represents the number of iteration steps needed to separate the orbits of $x$ and $x'$.

Definition 1.9. Let $x$, $x'$ be two distinct irrationals in $(0, 1/2)$. The splitting order $n(x, x')$ is the greatest integer $m$ such that $x$, $x'$ belong to the same branch or to two adjacent branches of $A^m$. We define also the integer $\delta(x, x')$ such that $m = n(x, x') - \delta(x, x')$ is the greatest integer such that $x$, $x'$ belong to the same branch of $A^m$.

We shall see in the sequel that $\delta(x, x')$ is equal to 0, 1, or 2. Indeed there are four possible situations, provided we also include the cases where $x$ and $x'$ are permuted (for brevity we will write $n$ and $\delta$ for $n(x, x')$ and $\delta(x, x')$):

(A) $x$ and $x'$ belong to the same branch of $A^n$. Then $\delta = 0$.

(B) $x$ and $x'$ belong to the same branch of $A^{n-1}$, and there exists $k \ge 3$ such that
\[
\frac{2}{2k+1} < x_{n-1} < \frac{1}{k} < x'_{n-1} < \frac{2}{2k-1}. \tag{1.27}
\]
Since $x$ and $x'$ belong to the same branch of $A^{n-1}$ and to adjacent branches of $A^n$, in this case $\delta = 1$.

(C) $x$ and $x'$ belong to the same branch of $A^{n-2}$ and there exists $k \ge 3$ such that
\[
\frac{5}{5k-2} < x_{n-2} < \frac{2}{2k-1} < x'_{n-2} < \frac{5}{5k-3}. \tag{1.28}
\]
In this case, both $x_n$ and $x'_n$ belong to $[2/5, 1/2]$. $x$ and $x'$ belong to adjacent branches of $A^{n-1}$ as well as of $A^n$, and $\delta = 2$.
(D) $x$ and $x'$ belong to the same branch of $A^{n-1}$ and there exists $k \ge 3$ such that
\[
\frac{1}{k} < x_{n-1} < \frac{2}{2k-1} < x'_{n-1} < \frac{1}{k-1}. \tag{1.29}
\]
In this case $\delta = 1$, as in case (B) above, but one must add the condition that one of the numbers $x_n$, $x'_n$ (at least) does not belong to $[2/5, 1/2]$; otherwise, one is in case (C).

From the above definitions, for all $l \le n - \delta$ one has
\[
a_l = a'_l, \quad \varepsilon_l = \varepsilon'_l, \quad p_l = p'_l, \quad q_l = q'_l, \quad |\beta_l(x) - \beta_l(x')| = q_l |x - x'|, \quad |x - x'| = |x_l - x'_l|\, \beta_{l-1}(x) \beta_{l-1}(x'). \tag{1.30}
\]
In the case (B) one has
\[
a_n = a'_n = k, \quad p_n = p'_n, \quad q_n = q'_n, \quad \varepsilon_n = +1, \quad \varepsilon'_n = -1. \tag{1.31}
\]
Let
\[
x'' = \frac{p_n}{q_n} \in (x, x'). \tag{1.32}
\]
Then one has $x''_{n-1} = k^{-1}$, $\beta_{n-1}(x'') = q_n^{-1}$, $x''_n = 0$ and
\[
|x - x''| = q_n^{-1} \beta_{n-1}(x) x_n = q_n^{-1} \beta_n(x), \qquad |x' - x''| = q_n^{-1} \beta_{n-1}(x') x'_n = q_n^{-1} \beta_n(x'). \tag{1.33}
\]
In the case (C) one has
\[
\begin{aligned}
& a_{n-1} = k, \quad \varepsilon_{n-1} = -1, \quad a_n = 2, \quad \varepsilon_n = +1, \\
& a'_{n-1} = k - 1, \quad \varepsilon'_{n-1} = +1, \quad a'_n = 2, \quad \varepsilon'_n = +1, \\
& q_{n-1} = q'_{n-1} + q_{n-2}, \quad q_n = q'_n = q_{n-1} + q'_{n-1}, \\
& p_{n-1} = p'_{n-1} + p_{n-2}, \quad p_n = p'_n = p_{n-1} + p'_{n-1}.
\end{aligned} \tag{1.34}
\]
Let
\[
x'' = \frac{p_n}{q_n} = \frac{p_{n-1} + p'_{n-1}}{q_{n-1} + q'_{n-1}} \in (x, x'). \tag{1.35}
\]
Then one has $x''_{n-2} = 2/(2k-1)$, $\beta_{n-2}(x'') = 2q_n^{-1}$, $x''_{n-1} = 1/2$, $\beta_{n-1}(x'') = q_n^{-1}$, $x''_n = 0$ and
\[
|x - x''| = q_n^{-1} \beta_{n-1}(x) x_n = q_n^{-1} \beta_n(x), \qquad |x' - x''| = q_n^{-1} \beta_{n-1}(x') x'_n = q_n^{-1} \beta_n(x'). \tag{1.36}
\]
In the case (D) one has
\[
\begin{aligned}
& a_n = k, \quad \varepsilon_n = -1, \quad q_n = q'_n + q_{n-1}, \\
& a'_n = k - 1, \quad \varepsilon'_n = +1, \quad p_n = p'_n + p_{n-1}.
\end{aligned} \tag{1.37}
\]
Let
\[
x'' = \frac{p_n + p'_n}{q_n + q'_n} \in (x, x'). \tag{1.38}
\]
Then one has $x''_{n-1} = 2/(2k-1)$, $x''_n = 1/2$, $\beta_{n-1}(x'') = 2(q_n + q'_n)^{-1}$ and
\[
|x - x''| = 2(q_n + q'_n)^{-1} \beta_{n-1}(x) \left| \tfrac{1}{2} - x_n \right|, \qquad |x' - x''| = 2(q_n + q'_n)^{-1} \beta_{n-1}(x') \left| \tfrac{1}{2} - x'_n \right|. \tag{1.39}
\]
We recall that in the case (D) one also has
\[
\max\left( \left| \tfrac{1}{2} - x_n \right|, \left| \tfrac{1}{2} - x'_n \right| \right) \ge \frac{1}{10}. \tag{1.40}
\]
We now give two lemmas which relate the separation between two numbers and their splitting orders.

Lemma 1.10. There exists a positive constant $c_3$ independent of $x$ and $x'$, such that for all $l < n = n(x, x')$, we have
\[
c_3^{-1} \beta_l(x') < \beta_l(x) < c_3 \beta_l(x'). \tag{1.41}
\]
Indeed one can take $c_3 = 9/2$.

Proof. For $l < n - \delta$, this is just a consequence of (1.30) and Proposition 1.4 (iii), and the constant obtained in this case is 3. When $\delta = 1$, one gets from (1.27) or (1.29) $2/3 \le x_{n-1}/x'_{n-1} \le 3/2$, which leads to a constant $9/2$. When $\delta = 2$, one gets from (1.28) $12/13 \le x_{n-2}/x'_{n-2} \le 13/12$, and $4/5 \le x_{n-1}/x'_{n-1} \le 5/4$, so that we also get for the constant $65/16 < 9/2$.

Lemma 1.11. There exists a positive constant $c_4 > 0$ such that for all $x, x' \in (0, 1/2)$, and $n \ge n(x, x')$, one has
\[
\max(\beta_n(x), \beta_n(x')) \le c_4 |x - x'|^{1/2}. \tag{1.42}
\]
Indeed one can take $c_4 = 9\sqrt{15}/2 = 17.42\ldots$.

Proof. In the case (D) one has $|x - x'| = |x - x''| + |x'' - x'|$, so that
\[
|x - x'| \ge \frac{1}{5} (q_n + q'_n)^{-1} \inf(\beta_{n-1}(x), \beta_{n-1}(x')) \ge \frac{1}{15} q_n^{-2} \ge \frac{1}{60} \beta_{n-1}^2(x), \tag{1.43}
\]
since $q_n = q'_n + q_{n-1} > q'_n$, and $(2/3) q_n^{-1} \le \beta_{n-1}(x) \le 2 q_n^{-1}$. The previous lemma then shows that $|x - x'| \ge (c_3^{-2}/60)\, \beta_{n-1}^2(x')$. Since $\beta_n \le (1/2) \beta_{n-1}$, the constant $c_4$ is at most equal to $c_3 \sqrt{15} = 9\sqrt{15}/2$.

In the cases (B) and (C) one has
\[
|x - x'| = q_n^{-1} (\beta_n(x) + \beta_n(x')) \ge q_n^{-1} \max(\beta_n(x), \beta_n(x')). \tag{1.44}
\]
However, $q_n^{-1} \ge (1/2) \beta_{n-1}(x) \ge (1/2c_3) \beta_{n-1}(x') = (1/9) \beta_{n-1}(x')$, and therefore we get $c_4 \ge 3/\sqrt{2}$.

Finally in the case (A) one has $|x_n^{-1} - x_n'^{-1}| \ge 1$ since $x$, $x'$ do not belong to two adjacent branches of $A^{n+1}$, from which it follows that $|x_n - x'_n| \ge |x_n| |x'_n|$. Suppose $x_n > x'_n$ (the other case resulting by symmetry); then if $x'_n \ge x_n/2$, we get $|x_n - x'_n| \ge |x_n|^2/2$, and if $x'_n < x_n/2$, we get $|x_n - x'_n| = x_n (1 - x'_n/x_n) > x_n/2 > x_n^2/2$. Therefore
\[
|x_n - x'_n| \ge \frac{1}{2} \left[ \max(x_n^2, x_n'^2) \right] \tag{1.45}
\]
and
\[
|x - x'| = \beta_{n-1}(x) \beta_{n-1}(x') |x_n - x'_n| \ge \frac{\beta_{n-1}(x) \beta_{n-1}(x')}{2} \left[ \max(x_n^2, x_n'^2) \right],
\]
and using the previous lemma,
\[
|x - x'| \ge \frac{\max(\beta_{n-1}^2(x), \beta_{n-1}^2(x'))}{2 c_3} \left[ \max(x_n^2, x_n'^2) \right], \tag{1.46}
\]
so that one gets $c_4 \ge 3$ in this case.
Lemma 1.12. Let $J$ be the interval of definition of one single branch of $A^m$, and $|J|$ its length. One of its end-points is equal to $p_m/q_m$. We have
\[
\frac{1}{3 q_m^2} \le |J| \le \frac{1}{q_m^2},
\]
and for $x \in J$,
\[
\frac{1}{4} q_m^2 \le \left| \frac{dA^m(x)}{dx} \right| \le \frac{9}{4} q_m^2, \tag{1.47}
\]
so that
\[
\frac{1}{12} \le |J| \left| \frac{dA^m(x)}{dx} \right| \le \frac{9}{4}. \tag{1.48}
\]
Proof. The end-points of $J$ are obtained by setting $x_m = 0$ or $x_m = \pm 1/2$ in (1.18), that is $p_m/q_m$ and $(2p_m \pm p_{m-1})/(2q_m \pm q_{m-1})$ respectively. Therefore we have $|J|^{-1} = q_m (2q_m \pm q_{m-1})$. On the other hand, one gets $dx/dx_m$ from (1.18), so that its inverse is $|dA^m/dx| = (q_m \pm q_{m-1} x_m)^2$, and the lemma follows easily.

Lemma 1.13. Let $J$ and $J'$ be the intervals of definition of two adjacent branches of $A^m$, with respective lengths $|J|$ and $|J'|$. Then there exists a constant $c_5$ (which can be taken $c_5 = 12$), such that
\[
c_5^{-1} \le \frac{|J|}{|J'|} \le c_5. \tag{1.49}
\]
Proof. Let $x$ be the common end-point. If $A^m(x) = 0$, then one has $x = p_m/q_m$, and the other end-points are $(2p_m \pm p_{m-1})/(2q_m \pm q_{m-1})$. Therefore $q_m = q'_m$, and $|J|/|J'| = (2q_m \pm q_{m-1})/(2q_m \pm q_{m-1}) \le 3$. If $A^m(x) = 1/2$, then one of the two intervals has the form $[p_m/q_m, (2p_m \pm p_{m-1})/(2q_m \pm q_{m-1})]$ and the same holds for the other, but with $p_m$ and $q_m$ replaced by $p'_m$ and $q'_m$ respectively. For the same reasons as in case (C) above (see Eq. (1.34)), we have $p_m = p'_m \pm p_{m-1}$, $q_m = q'_m \pm q_{m-1}$, $p_{m-1} = p'_{m-1}$, $q_{m-1} = q'_{m-1}$, so that $q_m/q'_m \le 3$, and the length ratio $|J|/|J'| = q_n (2q_n \pm q_{n-1}) / q'_n (2q'_n \pm q'_{n-1}) \le 3 q_n^2 / q_n'^2 \le 12$. Note that in both cases we have
\[
\frac{1}{3} \le \frac{q_m}{q'_m} \le 3. \tag{1.50}
\]
2. The Brjuno Functions Following Yoccoz [Yo] we define a (generalized) Brjuno function: Definition 2.1. The α-Brjuno function Bα : R \ Q → R is defined by the formula Bα (x) = −
∞ X
βi−1 (x) log xi ,
(2.1)
i=0
where the xn follow x0 = x by repeated iterations of Aα , as defined in (1.10) and (1.11), and the βn ’s are given by (1.21). We have posed β−1 = 1. Remark 2.2. It is useful to extend the above definition x ∈ Q, by setting Bα (x) = +∞, or exp(−Bα (x)) = 0. The Brjuno function defined in [Yo] corresponds to B1/2 , the one defined by the nearest integer continued fraction map A1/2 . Proposition 2.3. Given α ∈ [1/2, 1] one has (i) Bα (x) = Bα (x + 1) for all x ∈ R; (ii) for all x ∈ (0, α)
1 ; Bα (x) = − log x + xBα x
(2.2)
(iii) if x ∈ [α − 1, 0) then Bα (−x) = Bα (x); (iv) there exists a constant C1 > 0 (independent of α) such that for all x ∈ R \ Q one has ∞ X log qj+1 Bα (x) − (2.3) ≤ C1 , qj j=0 where {qj }j≥0 denotes the sequence of the denominators of the convergents to x of the α-continued fraction expansion. Proof. Given x ∈ R \ Q, the sequences (xi )i≥0 and (βi )i≥0 associated to x and x + 1 are the same, which proves (i). The same is true for x and −x if x ∈ (α − 1, 0), which proves (iii). If x ∈ (0, α), let y = 1/x and denote by yi , ai (y), βi (y), and xi , ai (x), βi (x) the sequences (1.11) and (1.21) associated to y and to x respectively. From (1.9) and (1.10) it follows that x0 = x, a0 (y) = a1 (x), y0 = x1 and by induction for all n ≥ 0, yn = xn+1 and βn (y) = (βn+1 (x))/x. Thus Bα (y) = −
∞ X
βi−1 (y) log yi = − log y0 −
i=0 ∞
∞ X 1 βi (x) log xi+1 x i=1
1X 1 =− βi−1 (x) log xi = [Bα (x) + log x] , x x i=1
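which proves (ii). To prove (iv) we first remark that (1.21) implies
$$q_i \beta_{i-1} + \varepsilon_i q_{i-1} \beta_i = 1$$
for all $i \ge 0$. Then
$$\begin{aligned}
-B_\alpha(x) + \sum_{i=0}^{\infty} \frac{\log q_{i+1}}{q_i}
&= \sum_{i=0}^{\infty} \beta_{i-1} \log \frac{\beta_i}{\beta_{i-1}} + \sum_{i=0}^{\infty} \left( \beta_{i-1} + \varepsilon_i \frac{q_{i-1}}{q_i} \beta_i \right) \log q_{i+1} \\
&= \sum_{i=0}^{\infty} \beta_{i-1} \log (\beta_i q_{i+1}) - \sum_{i=0}^{\infty} \beta_{i-1} \log \beta_{i-1} + \sum_{i=0}^{\infty} \varepsilon_i \frac{q_{i-1}}{q_i} \beta_i \log q_{i+1},
\end{aligned}$$
but by (1.21), Proposition 1.4 (iii), and the estimates of Remark 1.7,
$$\begin{aligned}
\left| \sum_{i=0}^{\infty} \beta_{i-1} \log (\beta_i q_{i+1}) \right| &\le 2 \sum_{i=0}^{\infty} \frac{\log 2}{q_i} \le 2 c_2, \\
\left| \sum_{i=0}^{\infty} \beta_{i-1} \log \beta_{i-1} \right| &\le 2 \sum_{i=0}^{\infty} \frac{\log 2 + \log q_i}{q_i} \le 2 (c_1 + c_2), \\
\left| \sum_{i=0}^{\infty} \varepsilon_i \frac{q_{i-1}}{q_i} \beta_i \log q_{i+1} \right| &\le 2 \sum_{i=0}^{\infty} \frac{\log q_{i+1}}{q_{i+1}} \le 2 c_1,
\end{aligned}$$
from which it follows that
$$\left| B_\alpha(x) - \sum_{i=0}^{\infty} \frac{\log q_{i+1}}{q_i} \right| \le C_1 = 4 (c_1 + c_2).$$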
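As a numerical illustration of the functional equation (2.2), the following sketch evaluates a truncated version of the series defining $B_\alpha$ in high-precision decimal arithmetic. It assumes the conventions of Sect. 1, which are not reproduced in this part of the text: $x_0 = |x - [x - \alpha + 1]|$, $x_{n+1} = A_\alpha(x_n) = |1/x_n - [1/x_n - \alpha + 1]|$, $\beta_n = x_0 \cdots x_n$ with $\beta_{-1} = 1$, and $B_\alpha(x) = -\sum_{n \ge 0} \beta_{n-1} \log x_n$. The names `fold` and `brjuno`, the truncation order and the working precision are illustrative choices, not part of the paper.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR

getcontext().prec = 120  # working precision in digits (a generous, arbitrary choice)

def fold(x, alpha):
    """Return |x - [x - alpha + 1]|, where [.] is the integer part (floor)."""
    n = (x - alpha + 1).to_integral_value(rounding=ROUND_FLOOR)
    return abs(x - n)

def brjuno(x, alpha=Decimal("0.5"), terms=60):
    """Truncated series -sum_{n < terms} beta_{n-1} * log(x_n) (assumed definition)."""
    xn = fold(Decimal(x), alpha)   # x_0
    beta = Decimal(1)              # beta_{-1} = 1
    total = Decimal(0)
    for _ in range(terms):
        if xn == 0:                # rational x: B_alpha(x) = +infinity (Remark 2.2)
            return Decimal("Infinity")
        total -= beta * xn.ln()
        beta *= xn                 # beta_n = beta_{n-1} * x_n
        xn = fold(1 / xn, alpha)   # x_{n+1} = A_alpha(x_n)
    return total

x = Decimal(2).sqrt() / 4              # an irrational point of (0, 1/2)
lhs = brjuno(x)                        # B_{1/2}(x)
rhs = -x.ln() + x * brjuno(1 / x)      # right-hand side of (2.2)
residual = abs(lhs - rhs)              # only a truncation remainder is expected
```

Because both sides are truncated at the same order, `residual` reduces to a single far term of the series, so it should be negligible compared with the working precision.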
By means of Lemma 1.8 one can prove the following

Proposition 2.4. There exists a positive constant $C_2 > 0$ such that for all $\alpha \in [1/2, 1]$ and for all $x \in \mathbb{R} \setminus \mathbb{Q}$ one has
$$\left| B_\alpha(x) - \sum_{j=0}^{\infty} \frac{\log Q_{j+1}}{Q_j} \right| \le C_2. \tag{2.4}$$

Proof. Thanks to (iv), Proposition 2.3, it suffices to compare $\sum_{j=0}^{\infty} (1/q_j) \log q_{j+1}$ with $\sum_{j=0}^{\infty} (1/Q_j) \log Q_{j+1}$. By Lemma 2.3, one has $q_j = Q_{k(j)}$ for all $j$, where for brevity we write $k(j)$ for $k_\alpha(j)$. Thus
$$\sum_{j=0}^{\infty} \frac{\log q_{j+1}}{q_j} = \sum_{k(j+1) = k(j)+1} \frac{\log Q_{k(j+1)}}{Q_{k(j)}} + \sum_{k(j+1) = k(j)+2} \frac{\log Q_{k(j+1)}}{Q_{k(j)}}.$$
Using the fact that $Q_{k(j+1)} = Q_{k(j)+2} = Q_{k(j)+1} + Q_{k(j)}$ we have
$$\frac{\log Q_{k+2}}{Q_k} = \frac{\log (Q_{k+1} + Q_k)}{Q_k} = \frac{\log Q_{k+1}}{Q_k} + \frac{\log (1 + Q_k / Q_{k+1})}{Q_k},$$
but
$$0 \le \frac{\log (1 + Q_k / Q_{k+1})}{Q_k} \le \frac{\log 2}{Q_k}.$$
By applying the estimates of Remark 1.7 one gets the result:
$$\left| \sum_{j=0}^{\infty} \frac{\log q_{j+1}}{q_j} - \sum_{j=0}^{\infty} \frac{\log Q_{j+1}}{Q_j} \right| \le 2 c_2 + c_1,$$
and one gets $C_2 = 6 c_2 + 5 c_1$.
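The sums $\sum_j (\log Q_{j+1})/Q_j$ appearing in (2.4) are easy to explore numerically from a list of partial quotients. The sketch below is only an illustration: it assumes the usual recursion $Q_{k+1} = a_{k+1} Q_k + Q_{k-1}$ for the denominators of the convergents of the Gauss continued fraction, with the indexing $Q_{-1} = 0$, $Q_0 = 1$ (the paper's own indexing is fixed in Sect. 1 and may differ by a shift); the function names are hypothetical.

```python
import math

def convergent_denominators(partial_quotients):
    """Return [Q_0, Q_1, ...] with Q_0 = 1 and Q_{k+1} = a_{k+1} Q_k + Q_{k-1}."""
    q_prev, q = 0, 1              # Q_{-1} = 0, Q_0 = 1 (assumed indexing)
    dens = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        dens.append(q)
    return dens

def brjuno_partial_sum(partial_quotients):
    """Partial sum of sum_j log(Q_{j+1}) / Q_j over the available denominators."""
    dens = convergent_denominators(partial_quotients)
    return sum(math.log(dens[j + 1]) / dens[j] for j in range(len(dens) - 1))

golden = [1] * 40                 # x = (sqrt(5) - 1)/2 = [0; 1, 1, 1, ...]
s40 = brjuno_partial_sum(golden)
s80 = brjuno_partial_sum([1] * 80)
```

For the golden mean the denominators are the Fibonacci numbers and the partial sums stabilise quickly, which is the behaviour one expects for a number satisfying the condition of finiteness of the full series.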
Remark 2.5. The Brjuno numbers [Br] are usually defined by the Brjuno condition
$$\sum_{i=0}^{\infty} \frac{\log Q_{i+1}}{Q_i} < +\infty.$$
Proposition 2.4 shows that the $\alpha$-Brjuno functions $B_\alpha$ are finite at $x$ if and only if $x$ is a Brjuno number, and that any two generalized Brjuno functions differ by an $L^\infty$ function. On the other hand, the advantage of the functions $B_\alpha$ over the Brjuno condition is that they satisfy a nice functional equation (2.2) under the action of the modular group $SL(2, \mathbb{Z})$. Another important characterization of the generalized Brjuno functions comes from their "uniqueness", which is an immediate consequence of Theorem 2.6 below.

For fixed $1/2 \le \alpha \le 1$, let us consider the operator
$$(T_\nu f)(x) = x^\nu f\!\left(\frac{1}{x}\right), \tag{2.5}$$
if $x \in (0, \alpha)$, where $\nu \ge 0$, defined for the moment on measurable functions on $\mathbb{R}$ which satisfy
$$f(x) = f(x+1) \ \text{for almost every } x \in \mathbb{R}, \qquad f(-x) = f(x) \ \text{for a.e. } x \in (0, 1 - \alpha). \tag{2.6}$$
It is understood that the function $T_\nu f$ is completed outside $(0, \alpha)$ by imposing on $T_\nu f$ the same parity and periodicity conditions which are expressed for $f$ in (2.6). As noticed in the Introduction, one should write $T_\nu^{(\alpha)}$ instead of $T_\nu$; however, we will omit the $\alpha$ dependence for brevity, since the value of $\alpha$ is usually clear from the context. The functional equation for the $\alpha$-Brjuno function can be written in the form
$$[(1 - T_1) B_\alpha](x) = -\log x, \tag{2.7}$$
for all $x \in (0, \alpha)$, complemented with the periodicity and symmetry conditions (2.6). This suggests studying the operator $T_\nu$ on the Banach spaces
$$X_{\alpha,p} = \{ f : \mathbb{R} \to \mathbb{R} \mid f \text{ verifies (2.6)},\ f \in L^p((0, \alpha), dm_\alpha(x)) \} \tag{2.8}$$
endowed with the norm of $L^p((0, \alpha), dm_\alpha(x))$, where
$$dm_\alpha(x) = c_\alpha \rho_\alpha(x)\, dx \tag{2.9}$$
is the invariant measure defined in Sect. 1, so that
$$\|f\|_{\alpha,p} = \left( \int_0^\alpha |f(x)|^p \, dm_\alpha(x) \right)^{1/p}, \tag{2.10}$$
as $\alpha$ varies in $(1/2, 1)$ and $p \in [1, \infty]$. Note that if $p < p'$ one has the obvious inclusion $X_{\alpha,p'} \subset X_{\alpha,p}$, and let
$$X_\alpha = \bigcap_{p \ge 1} X_{\alpha,p}. \tag{2.11}$$
If $(1 - T_1)$ is invertible in the considered space, then (2.7) has a unique solution for $B_\alpha$, provided that the right hand side also belongs to the space, which is easy to check. The invertibility property is given by the following theorem, which states in particular that the spectral radius of $T_1$ is strictly smaller than 1.

Theorem 2.6. $T_\nu$ is a bounded linear operator from $X_{\alpha,p}$ into itself for all $\nu > 0$, for all $\alpha \in [1/2, 1]$ and for all $p \in [1, \infty]$. Indeed its spectral radius on $X_{\alpha,p}$ satisfies
$$r(T_\nu) \le \begin{cases} g^\nu & \text{if } \alpha > g, \\ \gamma^\nu & \text{if } \alpha \le g. \end{cases} \tag{2.12}$$

Proof. It is a simple calculation. We observe first that
$$(T_\nu^n f)(x) = (\beta_{n-1}(x))^\nu f(x_n) = (\beta_{n-1}(x))^\nu (f \circ A_\alpha^n)(x), \tag{2.13}$$
therefore
\begin{align}
\int_0^\alpha |T_\nu^n f(x)|^p \, dm_\alpha(x) &= \int_0^\alpha (\beta_{n-1}(x))^{\nu p} |f(A_\alpha^n(x))|^p \, dm_\alpha(x) \tag{2.14} \\
&\le [\alpha \gamma_\alpha^{n-1}]^{\nu p} \int_0^\alpha |f(x)|^p \, dm_\alpha(x), \tag{2.15}
\end{align}
where we have used Proposition 1.4 (iv) and (v) (therefore $\gamma_\alpha = g$ if $\alpha > g$, $\gamma_\alpha = \gamma$ if $\alpha \le g$) and the invariance of the measure $dm_\alpha(x)$ w.r.t. $A_\alpha$. From (2.14) it immediately follows that
$$\|T_\nu^n f\|_{\alpha,p} \le [\alpha \gamma_\alpha^{n-1}]^\nu \|f\|_{\alpha,p}, \tag{2.16}$$
and one gets (2.12) by taking the $1/n$-th root of both sides.
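The last step of the proof of Theorem 2.6 can be spelled out as follows. The only ingredient not stated in the text is the standard formula $r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}$ for the spectral radius of a bounded operator; the operator norm below is the one on $X_{\alpha,p}$.

```latex
\[
  r(T_\nu) \;=\; \lim_{n\to\infty} \|T_\nu^n\|^{1/n}
  \;\le\; \lim_{n\to\infty} \bigl[\alpha\,\gamma_\alpha^{\,n-1}\bigr]^{\nu/n}
  \;=\; \lim_{n\to\infty} \alpha^{\nu/n}\,\gamma_\alpha^{\,\nu(n-1)/n}
  \;=\; \gamma_\alpha^{\,\nu}.
\]
```

The prefactor $\alpha^{\nu/n}$ tends to 1, so only the geometric rate $\gamma_\alpha$ survives, which gives the two cases of (2.12).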
The use of the invariant measure in (2.8) makes the evaluation of the spectral radius remarkably simple. We of course get the same result if we replace the measure in (2.8) by the Lebesgue measure which is equivalent (see Remark 1.2). In particular, the above theorem implies that the spectral radius is also bounded by (2.12) for the operator Tν in the spaces Lp (T), naturally introduced using the periodicity property. It is more difficult to tell whether Tν is itself contracting (see [MMY] for some results in this direction), we only mention here that in the case α = 1/2, T1 is a contraction for all Lebesgue Lp -norms on [0, 1/2].
3. The Brjuno Function and the BMO Space

In the previous section, it has been shown that the Brjuno functions $B_\alpha$ belong to $L^p(\mathbb{T})$ and therefore to the intersection $\bigcap_{p=1}^{\infty} L^p(\mathbb{T})$. The purpose of this section is to show a stronger result: the Brjuno functions $B_\alpha$ belong to $\mathrm{BMO}(\mathbb{T})$. We recall the definition and the main properties of BMO spaces in Appendix A. In fact, we already know that all Brjuno functions $B_\alpha$ differ by $L^\infty$ functions, and since $L^\infty \subset \mathrm{BMO}$, it will be enough to prove that $B_\alpha$ is in $\mathrm{BMO}(\mathbb{T})$ for a fixed value of $\alpha$. In this section we fix $\alpha = 1/2$ and denote $A_{1/2}$ and $B_{1/2}$ simply by $A$ and $B$ respectively. We will also write
$$dm(x) = dm_{1/2}(x) = \frac{1}{\log G} \left( \frac{1}{G + x} + \frac{1}{G + 1 - x} \right) dx. \tag{3.1}$$
On the interval $I$, we now define the mean value $f_I$ of $f$
$$f_I = \frac{1}{m_I} \int_I f(x)\, dm(x), \qquad m_I = \int_I dm(x), \tag{3.2}$$
and its quadratic oscillation $O_I(f)$
$$O_I(f) = \left( \frac{1}{m_I} \int_I \bigl( f(x) - f_I \bigr)^2 dm(x) \right)^{1/2} = \left( \frac{1}{2 m_I^2} \int_{I \times I} \bigl( f(s) - f(t) \bigr)^2 dm(s)\, dm(t) \right)^{1/2}. \tag{3.3}$$
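The second equality in (3.3) is the usual identity between a variance and a mean squared pairwise difference. A minimal sketch, in which the measure $dm$ on $I$ is replaced by a hypothetical finite set of weighted sample points (so that integrals become weighted sums), checks it in exact rational arithmetic; the function names and the sample data are illustrative only.

```python
from fractions import Fraction

def mean_form(values, weights):
    """(1/m) * sum_i w_i (f_i - f_I)^2, with f_I the weighted mean and m the total weight."""
    m = sum(weights)
    f_mean = sum(w * f for f, w in zip(values, weights)) / m
    return sum(w * (f - f_mean) ** 2 for f, w in zip(values, weights)) / m

def pair_form(values, weights):
    """(1/(2 m^2)) * sum_{i,j} w_i w_j (f_i - f_j)^2."""
    m = sum(weights)
    total = sum(wi * wj * (fi - fj) ** 2
                for fi, wi in zip(values, weights)
                for fj, wj in zip(values, weights))
    return total / (2 * m * m)

# Arbitrary sample values f_i and positive weights w_i (discrete stand-in for dm).
values = [Fraction(1, 3), Fraction(-2), Fraction(5, 7), Fraction(4)]
weights = [Fraction(1, 2), Fraction(2), Fraction(1, 5), Fraction(3)]
squared_oscillation_a = mean_form(values, weights)
squared_oscillation_b = pair_form(values, weights)
```

Both quantities are the square of the discrete analogue of $O_I(f)$; the pairwise form is the one used in the estimates below because it does not involve the mean value $f_I$.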
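We now consider the space $X_* \subset X_{1/2} = \bigcap_{p=1}^{\infty} X_{1/2,p}$ defined as:
$$X_* = \{ f \in \mathrm{BMO}(\mathbb{R}) \mid f(x+1) = f(x)\ \forall x \in \mathbb{R},\ f(-x) = f(x)\ \forall x \in [0, 1/2] \}, \tag{3.4}$$
with the norm
$$\|f\|_* = |f|_* + \|f\|_2, \tag{3.5}$$
where $\|f\|_2 = \|f\|_{L^2((0,1/2), dm)}$, and
$$|f|_* = \sup_{I \subset [0,1/2]} O_I(f). \tag{3.6}$$
Therefore we have
$$\|f\|_* = \sup_{I \subset [0,1/2]} \left( \frac{1}{2 m_I^2} \int_{I \times I} \bigl( f(s) - f(t) \bigr)^2 dm(s)\, dm(t) \right)^{1/2} + \left( \int_0^{1/2} (f(x))^2 \, dm(x) \right)^{1/2}. \tag{3.7}$$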
Remark 3.1. Due to the equivalence between the measure $m$ and the Lebesgue measure, one gets an equivalent norm by replacing $m(I)$ by the length $|I|$ in (3.7). We explain in the appendix why the norm we use here is equivalent to the usual BMO norm. It differs from the usual one [Gr,GCRF] in two respects: first, we use here the invariant measure $dm$ instead of the Lebesgue measure; second, we use here an $L^2$-norm definition of BMO instead of the usual $L^1$-norm definition. The equivalence between the $L^1$ and $L^2$ definitions is a corollary of the John-Nirenberg Theorem, which is far from obvious. Furthermore, the BMO-norm on $[0, 1/2]$ is equivalent to the BMO-norm on $\mathbb{T}$ only for even functions, which is the case for $B_{1/2}$.

We shall now prove the following theorem.

Theorem 3.2. The Brjuno function $B = B_{1/2}$ belongs to $X_*$, and therefore to $\mathrm{BMO}(\mathbb{T})$. For $1/2 \le \alpha \le 1$, the functions $B_\alpha$ also belong to $\mathrm{BMO}(\mathbb{T})$.

The proof follows immediately from the following Theorem 3.3, which states in particular that for $\alpha = 1/2$, $1 - T_1$ is invertible in $X_*$. It is easy to show by direct computations that, for $\alpha = 1/2$, the right hand side of (2.7), namely the even and periodic function equal to $-\log x$ on $(0, 1/2]$, is in $X_*$. It follows that $B = B_{1/2}$ is also in $X_*$, as well as all $B_\alpha$ for $1/2 \le \alpha \le 1$. Note that $B \notin L^\infty$ because the logarithmic function is unbounded.

Theorem 3.3. In the case $\alpha = 1/2$, and for all $\nu > 0$, $T_\nu$ is a bounded linear operator from $X_*$ to $X_*$. Indeed, its spectral radius in $X_*$ is at most equal to $\gamma^\nu = (\sqrt{2} - 1)^\nu$.

Proof. In order to prove the theorem, we must estimate $O_I(T_\nu^m f)$ for $m \ge 0$, $I \subset [0, 1/2]$. Let $I = [x, x']$ and let $n = n(x, x')$ be the splitting order of $x$ and $x'$. We divide the proof into three cases.

First case: One has $m > n$. Let $\hat{I}$ denote the union of the domains of the branches of $A^m$ which meet the interval $I$. Then one has $I \subset \hat{I}$, and from (1.49), $1 \le |\hat{I}|/|I| \le (1 + 2 c_5)$, since in this case $I$ contains the full interval of at least one branch of $A^m$. Setting $g = T_\nu^m f$, one gets
$$O_I^2(g) = \frac{1}{m_I} \int_I (g - g_I)^2 \, dm \le \frac{1}{m_I} \int_I g^2 \, dm \le c_0 (1 + 2 c_5) \frac{1}{|\hat{I}|} \int_{\hat{I}} g^2 \, dm, \tag{3.8}$$
where we have used (1.7) and (1.49). For the domain $J$ of any complete branch of $A^m$, we take $A^m(t)$ instead of $t$ as the integration variable. Then, it follows from (1.48) that
$$\int_J (T_\nu^m f)^2 \, dm \le \|\beta_{m-1}\|_{C^0}^{2\nu} \int_J (f \circ A^m)^2 \, dm \le 12 |J| c_0^2 \|\beta_{m-1}\|_{C^0}^{2\nu} \|f\|_2^2, \tag{3.9}$$
where $\|\cdot\|_{C^0}$ denotes the sup-norm on $[0, 1/2]$. Therefore one gets
$$\int_{\hat{I}} (T_\nu^m f)^2 \, dm \le 12 c_0^2 \|\beta_{m-1}\|_{C^0}^{2\nu} \|f\|_2^2 \, |\hat{I}|, \tag{3.10}$$
from which it follows that
$$O_I(T_\nu^m f) \le 2 c_0 \sqrt{3 c_0 (1 + 2 c_5)} \, \|\beta_{m-1}\|_{C^0}^{\nu} \|f\|_2. \tag{3.11}$$
Second case: One has $m \le n - \delta$. Then $I$ is contained in the domain $J$ of a single branch of $A^m$. Let $I_1 = A^m(I) \subset [0, 1/2]$. We have
$$2 m_I^2 O_I^2(T_\nu^m f) = \int_{I \times I} \bigl( T_\nu^m f(s) - T_\nu^m f(t) \bigr)^2 dm(s)\, dm(t) \le 2 (M_1 + M_2), \tag{3.12}$$
where
$$M_1 = \int_{I \times I} \beta_{m-1}^{2\nu}(s) \bigl( f(A^m s) - f(A^m t) \bigr)^2 dm(s)\, dm(t), \tag{3.13}$$
$$M_2 = \int_{I \times I} (f(A^m t))^2 \bigl( \beta_{m-1}^{\nu}(s) - \beta_{m-1}^{\nu}(t) \bigr)^2 dm(s)\, dm(t). \tag{3.14}$$
Now, from the bound (1.47) on $dA^m/dx$, we deduce upper and lower bounds on the ratio $|I_1|/|I|$,
$$\frac{q_m^2}{4} \le \frac{|I_1|}{|I|} \le \frac{9 q_m^2}{4}, \quad \text{and} \quad \left| \frac{dA^m(x)}{dx} \right|^{-1} \le \frac{4}{q_m^2} \le 9 \frac{|I|}{|I_1|} \le 9 c_0^2 \frac{m_I}{m_{I_1}}. \tag{3.15}$$
Taking $A^m(s)$ and $A^m(t)$ as new integration variables, one gets
$$\begin{aligned}
M_1 &\le c_0^4 \|\beta_{m-1}\|_{C^0}^{2\nu} \left( \max_{x \in I_1} \left| \frac{dA^m}{dx} \right|^{-2} \right) \int_{I_1 \times I_1} \bigl( f(s') - f(t') \bigr)^2 dm(s')\, dm(t') \\
&\le 81 c_0^8 \|\beta_{m-1}\|_{C^0}^{2\nu} \, 2 m_I^2 O_{I_1}^2(f) = c_6^2 \|\beta_{m-1}\|_{C^0}^{2\nu} m_I^2 O_{I_1}^2(f),
\end{aligned} \tag{3.16}$$
with $c_6 = 9 \sqrt{2}\, c_0^4$. On the other hand, from (1.30) one gets
$$|\beta_{m-1}(s) - \beta_{m-1}(t)| \le q_{m-1} |I| \le |I|^{1/2}, \tag{3.17}$$
since from (1.47), one has $|I| \le |J| \le q_m^{-2} \le q_{m-1}^{-2}$. Then, using the obvious inequality $|x^\nu - y^\nu| \le \nu \max(x^{\nu-1}, y^{\nu-1}) |x - y|$, and Lemma 1.10, we get
$$\begin{aligned}
M_2 &\le \nu^2 \int_{I \times I} (f(A^m t))^2 \max\bigl( \beta_{m-1}^{2\nu-2}(s), \beta_{m-1}^{2\nu-2}(t) \bigr) \bigl( \beta_{m-1}(s) - \beta_{m-1}(t) \bigr)^2 dm(s)\, dm(t) \\
&\le \nu^2 |I| c_3^2 \int_{I \times I} (f(A^m t))^2 \beta_{m-1}^{2\nu-2}(t) \, dm(s)\, dm(t),
\end{aligned} \tag{3.18}$$
$$M_2 \le \nu^2 |I| m_I c_0 c_3^2 \|\beta_{m-1}\|_{C^0}^{2\nu} \int_I (f(A^m(t)))^2 \beta_{m-1}^{-2}(t) \, dt \le c_7^2 \|\beta_{m-1}\|_{C^0}^{2\nu} m_I^2 \|f\|_2^2. \tag{3.19}$$
We have bounded $\beta_{m-1}$ in the integral using Proposition 1.4 (iii) and taken $A^m(t)$ as a new integration variable using (1.47), so that setting $c_7 = 3 \nu c_0^{3/2} c_3$, one obtains
$$O_I(T_\nu^m f) \le \|\beta_{m-1}\|_{C^0}^{\nu} \bigl[ c_6^2 O_{I_1}^2(f) + c_7^2 \|f\|_2^2 \bigr]^{1/2} \le \|\beta_{m-1}\|_{C^0}^{\nu} \bigl[ c_6 O_{I_1}(f) + c_7 \|f\|_2 \bigr]. \tag{3.20}$$
Third case: One has $n - \delta < m \le n$. Thus one is in one of the cases (B), (C) or (D), discussed above after Definition 1.9: the interval $I$ is contained in the union of two adjacent branches of $A^m$ and the point $x''$ is the common point of the two branches. Let $I^- = [x, x'']$, $I^+ = [x'', x']$; we still define $M_1$ and $M_2$ as in Eqs. (3.13) and (3.14), so that we get (3.12) as a bound of the oscillation on $I$. We first bound $M_2$ as in the previous case, but now from (1.30),
$$\begin{aligned}
|\beta_{m-1}(s) - \beta_{m-1}(t)| &\le |\beta_{m-1}(s) - \beta_{m-1}(x'')| + |\beta_{m-1}(x'') - \beta_{m-1}(t)| \\
&\le q_{m-1} |I^+| + q'_{m-1} |I^-| \\
&\le |I^+|^{1/2} + |I^-|^{1/2} \le \sqrt{2}\, |I|^{1/2},
\end{aligned} \tag{3.21}$$
where the next-to-last inequality is obtained as (3.17) above. Then
$$\int_I f(A^m(t))^2 \beta_{m-1}^{-2}(t) \, dt \le \left( \int_{I^-} + \int_{I^+} \right) f(A^m(t))^2 \beta_{m-1}^{-2}(t) \, dt \le 2 c_7^2 \|f\|_2^2, \tag{3.22}$$
from which it follows that
$$M_2 \le 4 c_7^2 \|\beta_{m-1}\|_{C^0}^{2\nu} m_I^2 \|f\|_2^2. \tag{3.23}$$
On the other hand
$$M_1 \le c_0^2 \|\beta_{m-1}\|_{C^0}^{2\nu} \int_{I \times I} \bigl( f(A^m(s)) - f(A^m(t)) \bigr)^2 ds\, dt. \tag{3.24}$$
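Let $I_1^+ = A^m(I^+)$, $I_1^- = A^m(I^-)$; one has, using (3.15), for $\varepsilon = \pm$,
$$\int_{I^\varepsilon \times I^\varepsilon} \bigl( f(A^m(s)) - f(A^m(t)) \bigr)^2 ds\, dt \le 81 \frac{|I^\varepsilon|^2}{|I_1^\varepsilon|^2} \int_{I_1^\varepsilon \times I_1^\varepsilon} \bigl( f(s') - f(t') \bigr)^2 ds'\, dt' \le 81 c_0^4 \, 2 m_I^2 O_{I_1^\varepsilon}^2(f). \tag{3.25}$$
Finally one gets
$$\int_{I^+ \times I^-} \bigl( f(A^m(s)) - f(A^m(t)) \bigr)^2 ds\, dt \le 81 \frac{|I^+| |I^-|}{|I_1^+| |I_1^-|} \int_{I_1^+ \times I_1^-} \bigl( f(s') - f(t') \bigr)^2 ds'\, dt'. \tag{3.26}$$
Let $I_1 = I_1^+ \cup I_1^-$. From the discussion of cases (B), (C) and (D) made after Definition 1.9, it follows that $I_1^+$ and $I_1^-$ have a common end-point which is either 0 or 1/2. Therefore, either $I_1^+ \subset I_1^-$ or $I_1^- \subset I_1^+$, and then
$$\int_{I_1^+ \times I_1^-} \bigl( f(s') - f(t') \bigr)^2 ds'\, dt' \le c_0^2 \, 2 m_{I_1}^2 O_{I_1}^2(f). \tag{3.27}$$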
However, as read from (3.15), $\dfrac{|I^+|}{|I_1^+|} q_m^2$ and $\dfrac{|I^-|}{|I_1^-|} {q'_m}^2$ both belong to the interval $[4/9, 4]$. Then, from (1.50), one gets
$$\frac{|I^+| |I^-|}{|I_1^+| |I_1^-|} \le c_8^2 \left( \frac{|I|}{|I_1|} \right)^2, \tag{3.28}$$
with $c_8 = 2^2 3^4$, from which it follows that
$$\int_{I^+ \times I^-} \bigl( f(A^m(s)) - f(A^m(t)) \bigr)^2 ds\, dt \le c_9^2 \, 2 m_I^2 O_{I_1}^2(f), \tag{3.29}$$
with $c_9 = 81 c_8^2 c_0^4$. Putting all constants together, one therefore finds $c_{10}$ and $c_{11}$ such that
$$\begin{aligned}
M_1 &\le c_{10} \|\beta_{m-1}\|_{C^0}^{2\nu} m_I^2 \bigl( O_{I_1^+}^2(f) + O_{I_1^-}^2(f) + O_{I_1}^2(f) \bigr), \\
O_I(T_\nu^m f) &\le c_{11} \|\beta_{m-1}\|_{C^0}^{\nu} \bigl( O_{I_1^+}(f) + O_{I_1^-}(f) + O_{I_1}(f) + \|f\|_2 \bigr).
\end{aligned} \tag{3.30}$$
In the three cases we have considered, we get a bound for the oscillation of $T_\nu^m f$ as a linear combination with constant coefficients of the sup of the oscillation of $f$ and the norm $\|f\|_2$, times an overall factor $\|\beta_{m-1}\|_{C^0}^{\nu}$ which contains the only dependence on $m$. Since (2.15) gives the same result for the $L^2$-norm, we easily deduce that, for the norm (3.5), there exists a positive constant $c_{12}$ independent of $m$ and $f$, such that
$$\|T_\nu^m f\|_* \le c_{12} \|\beta_{m-1}\|_{C^0}^{\nu} \|f\|_* \le (\gamma^\nu)^m (c_{12}/2\gamma) \|f\|_*. \tag{3.31}$$
The first consequence is that $T_\nu$ is a bounded linear operator from $X_*$ to $X_*$; the second is that its spectral radius $r(T_\nu) = \lim \|(T_\nu)^m\|^{1/m}$ is bounded by $\gamma^\nu$.

The spectral radius computation is rather general, and does not require a tight adjustment of the constants: more specific results on the norm of $T_\nu$ itself depend on the particular norm taken. See [MMY] for some results of this kind. We observe that the computations of the spectral radius for the $L^p$-norm and the BMO-norm both come from the leading behaviour of $\beta_{m-1}$, as given by Proposition 1.4. The embedding of the BMO space in between the $L^p$ spaces and $L^\infty$ makes this result very natural. We expect interesting consequences will follow for the complex extension of the Brjuno functions.

4. The Brjuno Function and its Hölder Stability Properties

The functional equation (1.29) for the Brjuno function for $\alpha = 1/2$ is
$$[(1 - T_1) B_{1/2}](x) = -\log x, \tag{4.1}$$
for all $x \in (0, 1/2)$, complemented with the condition that $B = B_{1/2}$ is even and periodic. In this section we will suppose that the right hand side of this equation is perturbed by an additional term $f$, which is less singular than the logarithmic function. We want to study the singular properties of the perturbed solution. Since the equation is linear, we only need to consider the action on $f$ of $T_1$ and $(1 - T_1)^{-1}$, which will be conveniently called the Brjuno operator $B$. We will consider even and periodic functions $f$ which are continuous. It is sufficient to know the value of $f$ on $[0, 1/2]$, so we assume $f \in C^0_{[0,1/2]}$. One can check that $T_1 f$ (resp. $T_\nu f$ for $\nu > 0$) is also continuous provided we set $(T_1 f)(0) = 0$ (resp. $(T_\nu f)(0) = 0$). We now need the usual Hölder-type semi-norms for continuous functions.

Definition 4.1. Let $f \in C^0_{[0,1/2]}$. Then we define the Hölder $\eta$-norm as
$$|f|_\eta = \sup_{0 \le x < y \le 1/2} \frac{|f(x) - f(y)|}{|x - y|^\eta},$$
[...]
(2) If $\nu > 2\eta$ (thus $\bar{\eta} = \eta$), $T_\nu$ is a bounded linear operator in $C^\eta$, of spectral radius smaller or equal to $\gamma^{\nu - 2\eta} < 1$. The operator $B_\nu = (1 - T_\nu)^{-1}$ is defined in this space and fulfills $B_\nu = \sum_{m=0}^{\infty} T_\nu^m$.
(3) If $\nu = 2\eta$ (thus $\bar{\eta} = \eta = \nu/2$), there exists a positive constant $c_{14} > 0$ such that
$$\|T_\nu^m\|_{C^{\nu/2}(0,1/2)} \le c_{14}, \quad \text{for all } m \ge 0. \tag{4.4}$$
Proof. Let $x, x' \in [0, 1/2]$, $m \ge 0$, $\eta \in (0, 1]$, $\nu > 0$. We want to estimate $|T_\nu^m f(x) - T_\nu^m f(x')|$ under the assumption that $f \in C^\eta_{[0,1/2]}$. We let $n = n(x, x')$.

First case: $m > n$. From (2.13), one has
\begin{align}
|T_\nu^m f(x) - T_\nu^m f(x')| &\le \|f\|_{C^0} \bigl( \beta_{m-1}^{\nu}(x) + \beta_{m-1}^{\nu}(x') \bigr) \tag{4.5} \\
&\le \|f\|_{C^0} \bigl( \beta_n^{\nu}(x) \beta_{m-n-2}^{\nu}(x_{n+1}) + \beta_n^{\nu}(x') \beta_{m-n-2}^{\nu}(x'_{n+1}) \bigr) \tag{4.6} \\
&\le 2 c_4^{\nu} \|f\|_{C^0} |x - x'|^{\nu/2} \|\beta_{m-n-2}\|_{C^0}^{\nu}, \tag{4.7}
\end{align}
where we have used Lemma 1.11.

Second case: $m \le n - \delta$. One has
$$|T_\nu^m f(x) - T_\nu^m f(x')| \le (\beta_{m-1}(x))^{\nu} |f(A^m(x)) - f(A^m(x'))| + |f(A^m(x'))| \, |\beta_{m-1}^{\nu}(x) - \beta_{m-1}^{\nu}(x')|. \tag{4.8}$$
From (1.47) and Proposition 1.4 (iii), one has
$$|A^m(x) - A^m(x')| \le 9 (\beta_{m-1}(x))^{-2} |x - x'|, \tag{4.9}$$
and, using (1.30) and (1.41),
$$\begin{aligned}
|\beta_{m-1}^{\nu}(x) - \beta_{m-1}^{\nu}(x')| &\le \nu \max\bigl( \beta_{m-1}^{\nu-1}(x), \beta_{m-1}^{\nu-1}(x') \bigr) |\beta_{m-1}(x) - \beta_{m-1}(x')| \\
&\le \nu c_3^{\nu-1} (\beta_{m-1}(x))^{\nu-1} q_{m-1} |x - x'| \\
&\le \nu c_3^{\nu-1} (\beta_{m-1}(x))^{\nu-2} |x' - x|.
\end{aligned} \tag{4.10}$$
Therefore
$$\begin{aligned}
|x - x'|^{-\eta} |T_\nu^m f(x) - T_\nu^m f(x')| &\le (\beta_{m-1}(x))^{\nu - 2\eta} \Bigl[ 9^{\eta} |f|_\eta + \nu c_3^{\nu-1} \|f\|_{C^0} (\beta_{m-1}(x))^{2\eta - 2} |x - x'|^{1-\eta} \Bigr] \\
&\le (\beta_{m-1}(x))^{\nu - 2\eta} \Bigl[ 9^{\eta} |f|_\eta + \nu c_3^{\nu-1} \|f\|_{C^0} (16/9)^{1-\eta} \Bigr] \\
&= K_f (\beta_{m-1}(x))^{\nu - 2\eta},
\end{aligned} \tag{4.11}$$
where we have used $|x - x'| \le |J|$, and (from (1.48)) $\beta_{m-1}(x) \ge (3/4) q_m^{-1} \ge (3/4) |J|^{1/2}$, $J$ being the domain of the branch of $A^m$ which contains $x$ and $x'$.

Third case: One has $n - \delta < m \le n$. One is then led to consider the cases (B), (C) or (D) defined above, after Definition 1.9. One then introduces the intermediate point $x''$ and one gets the same estimates as in the second case. More precisely,
$$\begin{aligned}
|T_\nu^m f(x) - T_\nu^m f(x')| &\le |T_\nu^m f(x) - T_\nu^m f(x'')| + |T_\nu^m f(x') - T_\nu^m f(x'')| \\
&\le 2 K_f \max\bigl( (\beta_{m-1}(x))^{\nu - 2\eta} |x - x''|^{\eta},\ (\beta_{m-1}(x'))^{\nu - 2\eta} |x' - x''|^{\eta} \bigr) \\
&\le 2 K_f c_3^{\nu - 2\eta} (\beta_{m-1}(x))^{\nu - 2\eta} |x - x'|^{\eta},
\end{aligned} \tag{4.12}$$
where we have used (1.41). One can summarize the possible cases, in view of Theorem 4.2, by stating that there always exists a constant $c_{13}$ such that
$$|T_\nu^m f(x) - T_\nu^m f(x')| \le c_{13} \|f\|_\eta \, |\beta_{m-1}(x)|^{\nu - 2\eta} |x - x'|^{\eta}. \tag{4.13}$$
The statements of the theorem follow easily from Eqs. (4.3), (4.7), (4.11), (4.12), and from the obvious inequality for the $C^0$ norm,
$$|T_\nu^m f(x)| \le \|f\|_{C^0} \sup [\beta_{m-1}^{\nu}(x)] \le \frac{1}{2} \gamma^{(m-1)\nu} \|f\|_{C^0}. \tag{4.14}$$
Remark 4.3. It will be useful to notice here that an estimate similar to (4.13) holds for $|\varepsilon_m(x) T_\nu^m f(x) - \varepsilon_m(x') T_\nu^m f(x')|$ provided
$$f(0) = f(1/2) = 0, \tag{4.15}$$
namely
$$|\varepsilon_m(x) T_\nu^m f(x) - \varepsilon_m(x') T_\nu^m f(x')| \le c_{13} \|f\|_\eta \, |\beta_{m-1}(x)|^{\nu - 2\eta} |x - x'|^{\eta}. \tag{4.16}$$
In order to get (4.16), we follow the same argument as in the previous proof, and distinguish three cases. The proof is immediate in the first case where $m > n$, and the bound is obtained in the same way as (4.7). In the second case where $m \le n - \delta$, we also get the result as in (4.8), since we now have $\varepsilon_m(x) = \varepsilon_m(x')$. In the third case, where $n - \delta < m \le n$, one introduces once more the intermediate point $x''$, and one observes that either $x''_m = 0$ or $x''_m = 1/2$, according to the discussion which follows Definition 1.9. It follows that $T_\nu^m f(x'') = \beta_{m-1}^{\nu}(x'') f(x''_m) = 0$, due to condition (4.15). Therefore
$$|\varepsilon_m(x) T_\nu^m f(x) - \varepsilon_m(x') T_\nu^m f(x')| \le |T_\nu^m f(x) - T_\nu^m f(x'')| + |T_\nu^m f(x') - T_\nu^m f(x'')|,$$
and the required estimate is obtained as in (4.12). This completes the proof of (4.16).

When $f \in C^\eta_{[0,1/2]}$, with $\nu < 2\eta$, we easily deduce from the proof of the preceding theorem that $T_\nu f$ is in $C^{\nu/2}$, and that $B_\nu f$ is in $C^\theta$, for any $\theta < \nu/2$. In fact, one has a slightly stronger result for $B_\nu f$.
Theorem 4.4. Let $f \in C^\eta_{[0,1/2]}$.
(1) If $\nu < 2\eta$, then the function $B_\nu f = \sum_{m \ge 0} T_\nu^m f$ is in $C^{\nu/2}$.
(2) If $\nu = 2\eta$, then $B_\nu f$ is in $C^{\eta'}$ for $0 \le \eta' < \nu/2$. Indeed it admits $x^{\nu/2} |\log x|$ as continuity modulus.

Proof. We have, using (4.13) and setting $n = n(x, x')$,
$$\begin{aligned}
|B_\nu f(x) - B_\nu f(x')| &\le \sum_{m=0}^{n} |T_\nu^m f(x) - T_\nu^m f(x')| + \sum_{m=n+1}^{\infty} |T_\nu^m f(x) - T_\nu^m f(x')| \\
&\le c_{13} \|f\|_\eta |x - x'|^{\nu/2} \left( |x - x'|^{\eta - \nu/2} \sum_{m=0}^{n} (\beta_{m-1}(x))^{\nu - 2\eta} + \sum_{m=n+1}^{\infty} \|\beta_{m-n-2}\|_{C^0}^{\nu} \right).
\end{aligned} \tag{4.17}$$
From the bound on the $\beta_n$, one sees immediately that the second series in the previous inequality converges. Furthermore all $x_k$ belong to $(0, 1/2]$, and one has
$$\sum_{m=0}^{n} (\beta_{m-1}(x))^{\nu - 2\eta} \le (\beta_{n-1}(x))^{\nu - 2\eta} \sum_{m=0}^{n} \left( \frac{1}{2^{n-m}} \right)^{2\eta - \nu} \le c_{15} (q_n(x))^{2\eta - \nu},$$
with $c_{15} = (3/2)^{2\eta - \nu} \sum_{m=0}^{n} 2^{-m(2\eta - \nu)}$, which is bounded independently of $n$. Now, if $x$ and $x'$ belong to the same branch of $A^n$, let $K$ be its interval of definition. If they belong to two adjacent branches, let $K$ be the union of their intervals of definition. In both cases one has $|x - x'| \le |K|$ and, using Lemmas 1.12 and 1.13, one easily sees that $|K| q_n^2(x)$ is bounded, above and below, so that finally, one can find $c_{16}$ such that for $\nu < 2\eta$,
$$|B_\nu f(x) - B_\nu f(x')| \le c_{16} \|f\|_\eta |x - x'|^{\nu/2}. \tag{4.18}$$
When $\nu = 2\eta$, we get
$$|B_\nu f(x) - B_\nu f(x')| \le c_{17} (1 + n(x, x')) \|f\|_\eta |x - x'|^{\nu/2}. \tag{4.19}$$
However, $x$ and $x'$ belong to the same branch $J$ of $A^{n-2}$, so that, using (1.48), one gets $|x - x'| \le |J| \le q_{n-2}^{-2}$, and following Remark 1.5, $|x - x'| \le (3/4) \gamma^{n-3}$. We therefore have, up to some constant $c$, $\log |x - x'| \le n \log \gamma + c$, which means that $n$ is bounded: $n \le (\log(|x - x'|^{-1}) + c)/\log(\gamma^{-1})$. Therefore there exists a constant $c_{18}$ such that for $\nu = 2\eta$,
$$|B_\nu f(x) - B_\nu f(x')| \le c_{18} \|f\|_\eta \log(|x - x'|^{-1}) \, |x - x'|^{\nu/2}. \tag{4.20}$$
As a consequence of the above theorem, we observe that if we consider a perturbation of the functional equation (2.7) for the Brjuno function (in the case $\alpha = 1/2$), that is, for $0 < x \le 1/2$,
$$[(1 - T_1) B_f](x) = -\log x + f(x),$$
the function $B_f$ differs from $B$ by a 1/2-Hölder continuous function when $f$ is analytic, or at least 1/2-Hölder continuous. Therefore the 'most singular' part of $B_f$ does not depend on the perturbation when it is sufficiently regular. The present result could provide an explanation for the so-called 'modular smoothing' of critical functions observed in earlier works [BPV,MS].

The above result seems to give a special role to the Brjuno function $B = B_{1/2}$. However, the theorem below displays a somewhat surprising result, namely that the difference $B_1 - B_{1/2}$ is not only bounded, as we already know, but also 1/2-Hölder continuous. We first give a preparatory statement.

Proposition 4.5. Let $B_1^+$ and $B_1^-$ be the even and odd parts of $B_1$. $B_1^+$ and $B_1^-$ are periodic, so that they are determined by their values in $[0, 1/2]$. We have, for $x \in [0, 1/2]$,
$$B_1^-(x) = \frac{x}{2} \log \frac{1 - x}{x}, \tag{4.21}$$
$$B_1^+(x) = x B_1^+\!\left( \frac{1}{x} \right) + g(x) - \log x, \tag{4.22}$$
with, still for $x \in [0, 1/2]$,
$$g(x) = -\frac{x}{2} \log \frac{1 - x}{x} + x B_1^-\!\left( \frac{1}{x} \right). \tag{4.23}$$
Proof. For $x \in [0, 1/2] \cap [\mathbb{R} \setminus \mathbb{Q}]$, we have $B_1(-x) = B_1(1 - x)$, and $1 < (1 - x)^{-1} < 2$, so that $B_1(-x) = -\log(1 - x) + (1 - x) B_1((1 - x)^{-1} - 1)$. But
$$B_1\!\left( \frac{1}{1 - x} - 1 \right) = B_1\!\left( \frac{x}{1 - x} \right) = \log \frac{1 - x}{x} + \frac{x}{1 - x} B_1\!\left( \frac{1}{x} \right),$$
since $0 < x/(1 - x) < 1$. Therefore
$$B_1(-x) = -x \log(1 - x) - (1 - x) \log x + x B_1(x^{-1}).$$
Since we also have
$$B_1(x) = -\log x + x B_1(x^{-1}),$$
we easily get (4.21) by subtraction. By addition, we get
$$B_1^+(x) = x B_1\!\left( \frac{1}{x} \right) - \log x - \frac{x}{2} \log \frac{1 - x}{x} = x B_1\!\left( \frac{1}{x} \right) - \log x - B_1^-(x), \tag{4.24}$$
which leads to (4.22) since we already know $B_1^-$.
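Formula (4.21) lends itself to a quick numerical check. The sketch below assumes that $B_1$ is given by the series $-\sum_{n \ge 0} \beta_{n-1} \log x_n$ built on the Gauss map ($x_0$ the fractional part of $x$, $x_{n+1}$ the fractional part of $1/x_n$, $\beta_n = x_0 \cdots x_n$, $\beta_{-1} = 1$), as in Sect. 1, which is not reproduced here. The name `brjuno_gauss`, the truncation order and the working precision are illustrative choices.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR

getcontext().prec = 120  # generous working precision (digits)

def frac(x):
    """Fractional part x - [x], a number in [0, 1)."""
    return x - x.to_integral_value(rounding=ROUND_FLOOR)

def brjuno_gauss(x, terms=80):
    """Truncated series for B_1(x): -sum_{n < terms} beta_{n-1} * log(x_n)."""
    xn, beta, total = frac(x), Decimal(1), Decimal(0)
    for _ in range(terms):
        if xn == 0:                  # rational input: the series is +infinity
            return Decimal("Infinity")
        total -= beta * xn.ln()
        beta *= xn                   # beta_n = beta_{n-1} * x_n
        xn = frac(1 / xn)            # Gauss map
    return total

x = Decimal(2).sqrt() - 1                             # a point of (0, 1/2)
odd_part = (brjuno_gauss(x) - brjuno_gauss(-x)) / 2   # B_1^-(x), by definition
predicted = (x / 2) * ((1 - x) / x).ln()              # right-hand side of (4.21)
gap = abs(odd_part - predicted)
```

The two truncated series have slightly different remainders, so `gap` is not exactly zero, but it should be of the order of the first neglected terms.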
The odd part of $B_1 - B_{1/2}$ coincides with $B_1^-$, and (4.21) shows that $B_1^-$ is in $C^\eta$ for $0 \le \eta < 1$. The even part of $B_1 - B_{1/2}$, namely
$$\Delta(x) = B_1^+(x) - B_{1/2}(x) \tag{4.25}$$
satisfies $\Delta(x) = g(x) + x \Delta(x^{-1})$. From (4.23) and Theorem 4.2, one deduces that $g \in C^\eta$, for $0 \le \eta < 1/2$. Then Theorem 4.4 tells that $\Delta \in C^\eta$ for $0 \le \eta < 1/2$, and the same holds for the difference $B_1 - B_{1/2}$. It is easy to see that the continuity extends to $x = 0$ and $x = 1/2$, by setting $\Delta(0) = \Delta(1/2) = 0$. In fact, the following theorem gives a stronger result which includes the case $\eta = 1/2$.

Theorem 4.6. The difference $B_1 - B_{1/2}$ may be extended from $\mathbb{R} \setminus \mathbb{Q}$ to $\mathbb{R}$ as a 1/2-Hölder continuous periodic function with period one.
Proof. From (4.24) and (4.25), we get for $0 < x < 1/2$,
$$\Delta(x) = x \Delta(x^{-1}) + x B_1^-(x^{-1}) - B_1^-(x) = x_0 \Delta(x_1) + \varepsilon_1 x_0 B_1^-(x_1) - B_1^-(x_0), \tag{4.26}$$
where we have used the first step of the continued fraction expansion of $x$. Solving (4.26) by iteration leads to the even and periodic functions $\Delta$, $\Delta_1$, $\Delta_2$ defined for $0 < x < 1/2$ as:
\begin{align}
\Delta(x) &= \Delta_1(x) + \Delta_2(x), \tag{4.27} \\
\Delta_1(x) &= -\sum_{n=0}^{\infty} \beta_{n-1}(x) B_1^-(x_n), \tag{4.28} \\
\Delta_2(x) &= \sum_{n=1}^{\infty} \varepsilon_n(x) \beta_{n-1}(x) B_1^-(x_n). \tag{4.29}
\end{align}
We deduce from Theorem 4.4 that $\Delta_1 = -(1 - T_1)^{-1} B_1^-$ is an even periodic function in $C^{1/2}$, since $B_1^-$ is sufficiently regular. We now observe that $B_1^-(0) = B_1^-(1/2) = 0$, therefore the evaluation of $|\Delta_2(x) - \Delta_2(x')|$ is made following the proof of Theorem 4.4, for $\nu = 1$, using (4.16) instead of (4.13). The conclusion is the same as in Theorem 4.4, part (1), that is, $\Delta_2 \in C^{1/2}$ as well. Now, $\Delta$ is the even part of $B_1 - B_{1/2}$, and since the odd part of $B_1 - B_{1/2}$ is nothing but $B_1^-$, we deduce that $B_1 - B_{1/2}$ is 1/2-Hölder continuous.

Appendix: B. M. O. Norms

Let $f \in L^1_{\mathrm{loc}}(\mathbb{R})$. We define the mean value $f_I$ of $f$ on the interval $I$ as:
$$f_I = \frac{1}{|I|} \int_I f \, dx, \tag{A.1}$$
where $|I|$ is the length of the interval $I$. Then we define for any interval $U$,
$$\|f\|_{*,U} = \sup_{I \subset U} \frac{1}{|I|} \int_I |f - f_I| \, dx. \tag{A.2}$$
We then say $f$ belongs to the space $\mathrm{BMO}(U)$ if $\|f\|_{*,U} < \infty$. BMO is an abbreviation for 'bounded mean oscillation'. $\|f\|_{*,U}$ is a seminorm on $\mathrm{BMO}(U)$, since for any constant $c$, we have $\|f + c\|_{*,U} = \|f\|_{*,U}$. In particular $\|f\|_{*,U} = 0$ if $f$ is constant on $U$. This applies to $U = \mathbb{R}$ and leads to the space $\mathrm{BMO}(\mathbb{R})$, abbreviated as BMO, and the seminorm $\|f\|_{*,\mathbb{R}}$ on $\mathrm{BMO}(\mathbb{R})$ will simply be written $\|f\|_*$. In fact this seminorm is a norm on the quotient space of functions in $L^1_{\mathrm{loc}}(\mathbb{R})$ modulo the constant functions. With this norm, the quotient space is complete. We now list some more or less classical results and lemmas [Gr,GCRF].

Proposition A.1. The space $L^\infty(U)$ is a subspace of $\mathrm{BMO}(U)$, and $\|f\|_{*,U} \le \inf_c \|f - c\|_{\infty,U}$.

Proposition A.2. Let $f$ be in $\mathrm{BMO}(U)$ and let $I$ be an interval. Then for any $\lambda > 0$, the Lebesgue measure of the set of points $t \in I$ such that $|f(t) - f_I| > \lambda$ is bounded by $K_1 \exp(-K_2 \lambda / \|f\|_{*,U})$.

This is the John-Nirenberg Theorem, see Garnett [Gr]. The proof there is for $U = \mathbb{R}$, but a careful reading shows that it works for any $U$. The constants $K_1$ and $K_2$ do not depend on $U$, $\lambda$, and $f$. Roughly speaking, this theorem says that where $f$ is unbounded, it behaves at most as a logarithmic function.

Proposition A.3. We have the following "magic reverse Hölder inequality": let $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ and suppose that for some interval $U \subset \mathbb{R}$, the seminorm $\|f\|_{*,U}$ is finite; then for any finite real $p \ge 1$, there exists a constant $A_p$ such that
$$\sup_{I \subset U} \left( \frac{1}{|I|} \int_I |f - f_I|^p \, dx \right)^{1/p} \le A_p \|f\|_{*,U}. \tag{A.3}$$
In fact it is an easy corollary of the John-Nirenberg theorem, see Garnett [Gr]. The constant $A_p$ does not depend on $U$, and may be shown to be smaller than $pC$ with an explicit constant $C$. Note that the inequality does not work in the limit $p \to \infty$. The preceding proposition shows that replacing the $L^1$ norm in the definition of the BMO norm $\|f\|_{*,U}$ by the analogous $L^p$ norm (with $p$ finite) leads to the same BMO space. More precisely, using the usual $L^p$ norm
$$\|f\|_{p,U} = \left( \int_U |f|^p \, dx \right)^{1/p}, \tag{A.4}$$
we define
$$\|f\|_{*,p,U} = \sup_{I \subset U} \left( \frac{1}{|I|} \int_I |f - f_I|^p \, dx \right)^{1/p} = \sup_{I \subset U} |I|^{-1/p} \|f - f_I\|_{p,I}. \tag{A.5}$$
We then have

Proposition A.4. The space $\mathrm{BMO}(U)$ is a subspace of $L^p(U)$ when $U$ is a bounded interval.

Thus, $\mathrm{BMO}(U)$ is a subspace of $\bigcap_{p=1}^{\infty} L^p(U)$, but not a subspace of $L^\infty(U)$. In fact, on $\mathrm{BMO}(U)$, we have a family of equivalent norms, as shown in the following proposition.
Proposition A.5. On $\mathrm{BMO}(U)$, where $U$ is a bounded interval, define for any real $a > 0$ and $b > 0$, and for any integer $p \ge 1$ (finite), the following family of norms
$$N(f, a, b, p) = a \|f\|_{*,p,U} + b \|f\|_{p,U}; \tag{A.6}$$
then these norms are all equivalent for various $a$, $b$ and $p$.

In $\mathrm{BMO}(\mathbb{R})$, we will also define the seminorm $\|f\|_{*,\mathbb{T}}$ as follows
$$\|f\|_{*,\mathbb{T}} = \sup_{|I| \le 1} \frac{1}{|I|} \int_I |f - f_I| \, dx, \tag{A.7}$$
the supremum being taken over intervals $I \subset \mathbb{R}$ with length less than or equal to 1. The seminorm $\|f\|_{*,\mathbb{T}}$ is convenient for periodic functions with period 1. For any $f \in \mathrm{BMO}(\mathbb{R})$ we obviously have: $\|f\|_{*,[-1/2,+1/2]} \le \|f\|_{*,\mathbb{T}} \le \|f\|_{*,\mathbb{R}}$. This observation will be useful if we now consider functions $f \in \mathrm{BMO}(\mathbb{R})$ which are even and periodic with period 1: for such functions, we also have the following result.
Proposition A.6. There exist constants $K_3 > 1$, $K_4 > 1$, $K_5 > 1$ such that for any $f \in \mathrm{BMO}(\mathbb{R})$ which is even and periodic with period 1, we have
a) $\|f\|_{*,\mathbb{R}} \le K_3 \|f\|_{*,\mathbb{T}}$,
b) $\|f\|_{*,\mathbb{T}} \le K_4 \|f\|_{*,[0,1]}$,
c) $\|f\|_{*,[0,1]} \le K_5 \|f\|_{*,[0,1/2]}$.
See [MMY] for a detailed proof. Note that parts b) and c) are not true if the periodic function $f$ is not even. A nontrivial, but immediate consequence of these results is the following corollary.

Corollary A.7. Let $f$ be a function defined in $[0, 1/2]$, which belongs to $\mathrm{BMO}([0, 1/2])$. The function $g$ which is even and periodic with period 1, and which coincides with $f$ on $[0, 1/2]$, is in $\mathrm{BMO}(\mathbb{R})$.

As indicated in Propositions A.5 and A.6, we have a wide choice among possible equivalent norms. The $L^2$-norm is especially useful, due to the following elementary identity:
$$\frac{1}{|I|} \int_I (f - f_I)^2 \, ds = \frac{1}{2 |I|^2} \int_{I \times I} (f(s) - f(t))^2 \, ds\, dt. \tag{A.8}$$
Defining the oscillation of $f$ in $I$ with the $L^2$-norm, namely
$$O_I(f) = \left( \frac{1}{2 |I|^2} \int_{I \times I} (f(s) - f(t))^2 \, ds\, dt \right)^{1/2}, \tag{A.9}$$
the norm $N(f, a, b, 2)$ can be rewritten as
$$N(f, a, b, 2) = a \sup_{I \subset U} O_I(f) + b \|f\|_{2,U}. \tag{A.10}$$
In the above expressions (A.9) and (A.10), replacing the Lebesgue measure on $U$ by any equivalent measure leads to an equivalent norm. We now end this appendix by quoting Fefferman's theorem, which makes the link between analysis on the real line and harmonic or complex extension on the upper half-plane.

Proposition A.8. The space $\mathrm{BMO}(\mathbb{R})$ is the dual space of the Hardy space $H^1$ on $\mathbb{R}$. If $f \in L^1_{\mathrm{loc}}(\mathbb{R})$, then $f \in \mathrm{BMO}(\mathbb{R})$ if and only if there exist a constant $c$, and functions $\varphi$ and $\psi$ in $L^\infty$, such that $f = c + \varphi + H\psi$, where the Hilbert transform $H\psi$ is the harmonic conjugate of $\psi$. Furthermore, $\varphi$ and $\psi$ can be chosen such that $\|\varphi\|_\infty \le C \|f\|_*$ and $\|\psi\|_\infty \le C \|f\|_*$, with $C$ a constant.

References
[Bo] Bosma, W.: Optimal continued fractions. Indag. Math. A90, 353–379 (1987)
[Br] Brjuno, A. D.: Analytical form of differential equations. Trans. Moscow Math. Soc. 25, 131–288 (1971); 26, 199–239 (1972)
[BPV] Buric, N., Percival, I. C., Vivaldi, F.: Critical function and modular smoothing. Nonlinearity 3, 21–37 (1990)
[Da] Davie, A. M.: The critical function for the semistandard map. Nonlinearity 7, 219–229 (1994). The same author has announced similar results for the standard map (private communication)
[DH] Douady, A., Hubbard, J. H.: On the dynamics of polynomial-like mappings. Ann. Scient. Éc. Norm. Sup. 4ème Série 18, 287–343 (1985)
[Ga] Gauss, C. F.: Collected Works. No. X1, Leipzig: Teubner, 1917, p. 372
[Gr] Garnett, J. B.: Bounded Analytic Functions. New York: Academic Press, 1981
[GCRF] Garcia-Cuerva, J., Rubio de Francia, J. L.: Weighted Norm Inequalities and Related Topics. North Holland Mathematical Studies, 116, Amsterdam: North Holland, 1985
[LM] Lasota, A., Mackey, M. C.: Probabilistic properties of deterministic systems. Cambridge: Cambridge University Press, 1985
[Ma] Marmi, S.: Critical functions for complex analytic maps. J. Phys. A: Math. Gen. 23, 3447–3474 (1990)
[MMY] Marmi, S., Moussa, P., Yoccoz, J.-C.: Continued fraction transformations, Brjuno functions, and BMO spaces. Note CEA-N-2788, Feb. 1995. This internal report is in fact a preliminary version of the present work
[MS] Marmi, S., Stark, J.: On the standard map critical function. Nonlinearity 5, 743–761 (1992)
[Me1] Meyer, D. H.: On a ζ-function related to the continued fraction transformation. Bull. Soc. Math. France 104, 195–203 (1976)
[Me2] Meyer, D. H.: On the Thermodynamic Formalism for the Gauss Map. Commun. Math. Phys. 130, 311–333 (1990)
[Me3] Meyer, D. H.: Continued fractions and related transformations. In: Ergodic theory, symbolic dynamics and hyperbolic spaces, T. Bedford, M. Keane, C. Series, editors, Oxford: Oxford University Press, 1991
[Na] Nakada, H.: On the invariant measures and the entropies for continued fraction transformations. Keio Math. Rep. 5, 37–44 (1980)
[Pe] Pérez-Marco, R.: Solution complète au problème de Siegel de linéarisation d'une application holomorphe au voisinage d'un point fixe. Séminaire Bourbaki nr. 753, Astérisque 206, 273–310 (1992)
[Ri] Rieger, G. J.: Mischung und Ergodizität bei Kettenbrüchen nach nächsten Ganzen. J. Reine Angew. Math. 310, 171–181 (1979)
[Yo] Yoccoz, J.-C.: Théorème de Siegel, polynômes quadratiques et nombres de Brjuno. Astérisque 231, 3–88 (1995)

Communicated by M. Herman
Commun. Math. Phys. 186, 295 – 322 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Quantum Deformation of Lattice Gauge Theory
D.V. Boulatov*
International School for Advanced Studies (SISSA/ISAS), via Beirut 2-4, I-34014 Trieste, Italy. E-mail: [email protected]
Received: 29 April 1996 / Accepted: 24 September 1996
Abstract: A quantum deformation of 3-dimensional lattice gauge theory is defined by applying the Reshetikhin-Turaev functor to a Heegaard diagram associated to a given cell complex. In the root-of-unity case, the construction is carried out with a modular Hopf algebra. In the topological (weak-coupling) limit, the gauge theory partition function gives a 3-fold invariant, coinciding in the simplicial case with the Turaev-Viro one. We discuss bounded manifolds as well as links in manifolds. By a dimensional reduction, we obtain a q-deformed gauge theory on Riemann surfaces and find a connection with the algebraic Alekseev-Grosse-Schomerus approach.

* Address after 1 October 1996: Institute for Theoretical and Experimental Physics, B. Cheremushkinskaya 25, 117259 Moscow, Russian Federation

1. Introduction

The lattice regularization of non-abelian gauge theory (LGT) proposed by K. Wilson in 1974 [19] plays a fundamental role as a non-perturbative definition of QCD. The main principle has been to give up the Poincaré invariance of the theory and preserve the local gauge symmetry as more fundamental. In the weak coupling regime, the broken translational and rotational invariance is restored dynamically. If the gauge coupling is strictly equal to 0, the gauge strength tensor vanishes identically and the theory becomes topological: one can take any finite lattice from a given equivalence class without changing the content of the model. The lattice formulation extends in a natural way the set of acceptable gauge groups to all compact groups, while the continuous one is based on the notion of the Lie algebra, thus excluding finite groups, for example. After the theory of quantum groups had appeared as a distinct mathematical subject [10, 21], the natural question arose whether the notion of gauge symmetry could be extended to incorporate quantum groups as well. This problem has not been of academic interest only. As was noticed in Ref. [5], the
Ponzano-Regge model [14] (which coincides with the classical ($q \to 1$) limit of the Turaev-Viro construction [17]) can be represented as LGT defined on lattices dual to simplicial complexes. Then it was natural to assume that the Turaev-Viro invariant could be related to a kind of LGT built on quantum group symmetry (for the sake of brevity, we shall call it qQCD3). This program was carried out in Ref. [6] explicitly (see also [7]). Technical difficulties originating from a complicated structure of the representation ring of $SL_q(2)$ at a root of unity, $q = e^{i\frac{2\pi}{k+2}}$, were avoided in [6] by establishing a direct connection with the ribbon graph invariants of Reshetikhin and Turaev [15]. The gauge invariance is implicit in this formulation (however, it does not mean that the model does not enjoy it). On the other hand, relative simplicity makes this explicit representation very convenient.

The classical ($q \to 1$) limit of the Turaev-Viro construction was discovered long ago by Ponzano and Regge in the framework of Regge calculus [14]. They argued that it can be regarded as a discretization of 3d gravity with the Einstein-Hilbert action. On the other hand, the Turaev-Viro invariant is related to Witten's Chern-Simons invariant [20]. Witten has shown that, with the $ISO(2,1)$ gauge group, the latter is connected with 3d quantum gravity. In the Euclidean regime for a negative cosmological constant, the gauge group becomes isomorphic to $SO(4) = SU(2) \times SU(2)$. It means that qQCD3 possessing $SL_q(2)$ gauge symmetry is interesting from the physical point of view (an exposition of the subject can be found in Ref. [7]).

The structure of the answers for the partition function suggests that, in 2 dimensions, the corresponding q-deformed LGT (qQCD2) is related to the topological $(G/G)_k$ coset model (as was argued in Ref. [8], these two models are in some sense dual to each other). An alternative purely algebraic approach to qQCD2 was put forward in Refs. [1, 3, 4].
The starting point for them was a Poisson bracket on 2d lattice connections proposed by Fock and Rosly [11]. In 2 dimensions there is a cyclic order of links incident to a vertex. Demanding that variables performing gauge rotations at each vertex form a quasi-triangular Hopf algebra, Alekseev, Grosse and Schomerus have deduced an algebra of gauge fields. In contrast to this situation, in 3 dimensions there is a natural cyclic order of faces sharing the same link of a lattice, which suggests that one should start with gauge fields forming a Hopf algebra while gauge transformations are interpreted as changes of the bases these fields act on. As in the quantum case there is no group manifold behind the construction, gauge fields and gauge transformations clearly have different statuses. One could say that this is one more occurrence of the principle “Quantization removes degenerations”. Descending from 3 to 2 dimensions, one finds a model which seems, at first sight, to be different from the AGS one. However, as we shall show, they are locally equivalent.

The outline of the paper is the following. Chapter 2 is devoted to the construction of q-deformed LGT in three dimensions. In Sect. 2.1 we introduce classical LGT and make some general remarks on its quantum deformation. In Sect. 2.2 we collect facts from 3-manifold topology which are used in the sequel. In Sect. 2.3 we introduce the Reshetikhin-Turaev functor in the form adopted for our purposes. In Sect. 2.4 we define the qQCD3 functor and discuss the notion of gauge invariance within our framework. In Sect. 2.5 we introduce the qQCD3 partition function in the case of the $U_q(su(n))$ gauge group. Sect. 2.6 is devoted to the root-of-unity case: $U_q(sl(n, R))$, $q^\ell = 1$. In Sect. 2.7 we prove that the weak-coupling partition function introduced in the previous section is a topological invariant. In Sect. 2.8 we discuss the case of bounded manifolds and briefly outline the introduction of Wilson loop averages in our model.
Chapter 3 is devoted to the 2-dimensional case. Here we derive Verlinde’s formula and discuss a connection between our approach and the AGS algebra. We conclude with a few general remarks.
2. qQCD3
2.1. Formulation of the problem. To introduce lattice gauge theory, one needs a cell decomposition of a manifold (in physicists' usage, a lattice). A gauge field is a map from a set of oriented edges to a compact group: $\ell \mapsto g_\ell \in G$. A change of an orientation corresponds to the conjugation: $g_\ell \to g_\ell^{-1}$. One attaches to every vertex a G-module (usually the regular representation). Gauge transformations rotate bases of the modules independently at each vertex. The gauge field is interpreted as performing a parallel transport between vertices, thus relating bases at adjacent ones. If an oriented link, $k$, connects vertices $v_2$ and $v_1$, the gauge transformation of the group element $g_k$ is
\[ g_k \to h_{v_1}\, g_k\, h_{v_2}^{-1}. \tag{2.1} \]
A holonomy associated with a path $\{L\}$ in the lattice is an ordered product of gauge field elements along $\{L\}$:
\[ h_L = \prod_{k \in L} g_k^{\epsilon_k}, \tag{2.2} \]
where $\epsilon_k = +1$ if the $k$th edge is directed along the path, and $\epsilon_k = -1$ if their directions are opposite. Gauge invariant quantities are those taking values in the set of conjugacy classes of $G$. A trace of the holonomy along a closed loop in any representation of $G$ is an example of such an invariant. The Boltzmann weights are functions of holonomies along boundaries, $\partial f$, of faces, $f$. One of the standard choices is the so-called group heat kernel
\[ W_\beta(h_{\partial f}) = \sum_R d_R\, \chi_R(h_{\partial f})\, e^{-\beta C_R}. \tag{2.3} \]
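The statement that gauge invariant quantities depend only on the conjugacy class of a holonomy can be checked on a toy lattice. The sketch below is only an illustration under stated assumptions: the gauge group is taken to be the finite group $S_3$, the lattice is a single triangle, and the names `EDGES`, `LOOP`, `holonomy` and `gauge_transform` are hypothetical. Edges transform as in Eq. (2.1), and the holonomy of Eq. (2.2) around the closed loop is then conjugated by the gauge element at the base vertex.

```python
import random
from itertools import permutations

S3 = list(permutations(range(3)))  # assumed toy gauge group G = S_3


def mul(p, q):
    """Group product p*q of permutations (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    """Inverse permutation."""
    out = [0] * 3
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def cycle_type(p):
    """Sorted cycle lengths; this labels the conjugacy class in S_3."""
    seen, lengths = set(), []
    for start in range(3):
        n, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            n += 1
        if n:
            lengths.append(n)
    return tuple(sorted(lengths))


# Oriented edges of a triangle, stored as edge -> (v1, v2) in the notation of Eq. (2.1).
EDGES = {"e0": (0, 1), "e1": (1, 2), "e2": (0, 2)}
# Closed loop based at vertex 0; e2 is traversed against its orientation (eps = -1).
LOOP = [("e0", +1), ("e1", +1), ("e2", -1)]


def holonomy(field, path):
    """Ordered product of g_k^{eps_k} along the path, Eq. (2.2)."""
    h = tuple(range(3))
    for edge, eps in path:
        g = field[edge]
        h = mul(h, g if eps == +1 else inv(g))
    return h


def gauge_transform(field, h):
    """g_k -> h_{v1} g_k h_{v2}^{-1} on every edge, Eq. (2.1)."""
    return {e: mul(mul(h[v1], field[e]), inv(h[v2])) for e, (v1, v2) in EDGES.items()}


rng = random.Random(0)
field = {e: rng.choice(S3) for e in EDGES}
gauge = [rng.choice(S3) for _ in range(3)]
before = holonomy(field, LOOP)
after = holonomy(gauge_transform(field, gauge), LOOP)
print(cycle_type(before) == cycle_type(after))
```

For a matrix group the same computation would show that the trace of the holonomy is unchanged; the cycle type plays that role for permutations.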
In Eq. (2.3), $\sum_R$ is the sum over all finite-dimensional irreps of a gauge group $G$; $\chi_R(x)$ is the character of an irrep $R$; $d_R = \chi_R(I)$ is its dimension; $C_R$ is a second Casimir eigenvalue; $\beta$ is a real parameter called a coupling constant. The construction makes sense for compact groups whose unitary finite dimensional irreps span the regular representation. The choice (2.3) ensures that $W_\beta(h_{\partial f})$ becomes the group δ-function in the weak coupling limit, $\beta \to 0$: $W_0(h_{\partial f}) = \delta(h_{\partial f}, I)$. We shall call this limit topological. The partition function is defined as the integral of the product of the Boltzmann weights over all faces:
\[ Z_\beta = \int_G \prod_\ell dg_\ell \prod_f W_\beta\Big( \prod_{k \in \partial f} g_k^{\epsilon_k} \Big), \tag{2.4} \]
where $dg_\ell$ is the Haar measure on the group $G$, and the product $\prod_\ell$ goes over all edges.
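The weak coupling behaviour of the heat kernel (2.3) can be seen numerically in the simplest compact example. The sketch below assumes the abelian group $U(1)$, which is not the case treated in the paper: its irreps are labelled by an integer $n$, with $d_n = 1$, $\chi_n(\theta) = e^{in\theta}$ and Casimir value taken as $C_n = n^2$. The function name `heat_kernel_u1` and the truncation `n_max` are hypothetical.

```python
import math


def heat_kernel_u1(theta, beta, n_max=200):
    """Truncated sum over irreps n of exp(i n theta) exp(-beta n^2) for U(1).

    The terms n and -n are combined into 2 cos(n theta), so the result is real.
    """
    return 1.0 + 2.0 * sum(
        math.cos(n * theta) * math.exp(-beta * n * n) for n in range(1, n_max + 1)
    )


def haar_average(beta, points=64):
    """Average of the kernel over a uniform grid on the circle (Haar measure d(theta)/2 pi)."""
    return sum(heat_kernel_u1(2.0 * math.pi * m / points, beta) for m in range(points)) / points


for beta in (1.0, 0.2, 0.05):
    # The kernel concentrates at the identity (theta = 0) as beta decreases,
    # while its Haar average stays equal to one.
    print(beta, heat_kernel_u1(0.0, beta), heat_kernel_u1(math.pi, beta), haar_average(beta))
```

As $\beta$ decreases, the value at the identity grows and the value away from it is suppressed, while the total weight is fixed by the trivial representation alone. This is the sense in which $W_\beta$ tends to a δ-function.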
If the term “q-deformed” is to mean that gauge variables take values in a quantum group, any presentation of the model should be reducible to a form where the variables are represented in a standard fashion as matrices of non-commutative elements. The simplest and most famous example is $SL_q(2)$, which can be introduced as the set of matrices
\[ g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{2.5} \]
the entries of which obey the commutation relations
\[ \begin{gathered} ba = q\,ab, \quad db = q\,bd, \quad ca = q\,ac, \quad dc = q\,cd, \quad cb = bc, \\ da - ad = (q - q^{-1})\,bc, \qquad ad - q^{-1} bc = 1. \end{gathered} \tag{2.6} \]
The relations (2.6) imply the existence of the R-matrix
\[ R = \begin{pmatrix} q & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & q - q^{-1} & 1 & 0 \\ 0 & 0 & 0 & q \end{pmatrix} \tag{2.7} \]
and the $RTT = TTR$ equation
\[ (g \otimes 1)(1 \otimes g) R = R (1 \otimes g)(g \otimes 1). \tag{2.8} \]
The R-matrix obeys the quantum Yang-Baxter equation
\[ R_{12} R_{13} R_{23} = R_{23} R_{13} R_{12}. \tag{2.9} \]
Indices show at which positions in the tensor cube of representation spaces, $V \otimes V \otimes V$, the R-matrix acts. $SL_q(2)$ has two real forms: $SU_q(2)$, for real $q$, and $SL_q(2, R)$, for $|q| = 1$. The matrices can be multiplied. If entries of both $g$ and $h$ obey Eq. (2.6) and are mutually commutative, the entries of the product $gh$ obey (2.6) as well. Therefore, matrices on different links of a lattice have to co-commute with one another in the tensor product.

The algebra of matrices (2.5) naturally extends to the quasi-triangular Hopf algebra $F_q(SL(2))$ of quantized functions on $SL(2)$. Owing to the famous duality, its basis is provided by the matrix elements of finite-dimensional irreducible representations of the quantized universal enveloping (QUE) algebra $U_q(sl(2))$. Therefore, to construct qQCD, we have at hand co-multiplication, R-matrix, antipode and Clebsch-Gordan coefficients (CGC). The q-deformation of Eq. (2.4) is, roughly speaking, a way to write it down in terms of elements of the Hopf algebra. A priori, it is not unique. The guiding principle here can be to identify any transformation of the construction with some isometry of a base cell complex in a self-consistent way. Then all algebraic manipulations become geometrically meaningful. It is close in spirit to the Reshetikhin-Turaev functor from the category of ribbon tangles to the modular Hopf algebras [15]. Our presentation of qQCD3 is in many respects inspired by their work. To describe it, we need to look at LGT from a point of view slightly more general than the usual one.
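The Yang-Baxter equation (2.9) for the $4 \times 4$ matrix of Eq. (2.7) can be verified by direct multiplication of $8 \times 8$ matrices. The following is a minimal sketch, assuming the matrix has $q - q^{-1}$ in the third row and second column in the basis ordered as (11, 12, 21, 22); the helper names `kron`, `matmul` and `yang_baxter_holds` are hypothetical. Exact rational arithmetic is used, so the check has no rounding error.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [
        [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for row in a
    ]


ID2 = [[1, 0], [0, 1]]
# Flip operator on V (x) V in the basis (11, 12, 21, 22).
FLIP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]


def r_matrix(q):
    """The matrix of Eq. (2.7) for a nonzero parameter q."""
    lam = q - 1 / q
    return [[q, 0, 0, 0], [0, 1, 0, 0], [0, lam, 1, 0], [0, 0, 0, q]]


def yang_baxter_holds(q):
    """Check R12 R13 R23 == R23 R13 R12 on V (x) V (x) V."""
    r = r_matrix(q)
    r12 = kron(r, ID2)
    r23 = kron(ID2, r)
    p23 = kron(ID2, FLIP)
    r13 = matmul(matmul(p23, r12), p23)  # move the second leg of R12 to position 3
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return lhs == rhs


print(yang_baxter_holds(Fraction(3, 2)))
```

The same helpers can be reused to inspect $\check R = \tau \circ R$ by multiplying `FLIP` with `r_matrix(q)`.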
2.2. Topological background. For the reader’s convenience we collect in this section some definitions which we shall use in the sequel. A $k$-cell is a polyhedron homeomorphic to a $k$-dimensional ball. A cell complex is a union of a finite number of cells such that an intersection of any 2 $k$-cells is either empty or a finite number of lower-dimensional cells. A cell complex can be obtained starting with a finite set of points by attaching subsequently cells of higher dimensions, any cell being attached to a finite number of lower-dimensional cells. A union of all cells of dimension $\le n$ is called an $n$-skeleton. A cell complex is a manifold if and only if the neighbourhood of each vertex is a spherical ball. A complex is called simplicial if all cells are simplexes (i.e., points, links, triangles, tetrahedra, etc.). Physicists usually mean by a lattice a cell complex such that an intersection of any two $k$-cells is either empty or consists of only one entire lower-dimensional cell. We adopt this notion. Simplicial complexes are lattices by definition. A dual complex, $\tilde C$, is constructed by putting its $k$-cells into correspondence with cells of $C$ having complementary dimensions, $n - k$.

To introduce LGT, we need a presentation of a cell complex, i.e., an effective way to describe it unambiguously. In the classical case, one needs to know only a 2-skeleton of a complex. From the topological point of view, the construction of LGT described at the beginning of the previous section is reminiscent of the definition of $H^1(C, G)$, the noncommutative first cohomology of $C$ with coefficients in $G$. In the topological limit, all holonomies along contractible loops vanish and gauge fields obey the defining relations of $\pi_1(C)$. Therefore, being properly normalized, the partition function $Z_0$ counts the number of conjugacy classes of injective homomorphisms from $\pi_1(C)$ into a gauge group $G$:
\[ Z_0 = |\mathrm{Hom}(\pi_1(C), G)/G|. \tag{2.10} \]
Of course, it makes sense only if $G$ is finite. If $G$ is a Lie group, one speaks about a moduli space of flat $G$-connections, which is defined as a set of fields modulo gauge transformations:
\[ M_G := \{\mathrm{Hom}(\pi_1(C), G)/G\}. \tag{2.11} \]
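For a finite gauge group the right-hand side of Eq. (2.10) can be evaluated by brute force. The sketch below is an illustration under assumptions that do not come from the paper: the complex is a torus built from one vertex, two edges $a$, $b$ and one face with boundary $a b a^{-1} b^{-1}$, and the gauge group is $S_3$. The function names are hypothetical. Flat configurations are pairs of commuting group elements, and gauge transformations at the single vertex act by simultaneous conjugation.

```python
from itertools import permutations, product

S3 = list(permutations(range(3)))  # assumed finite gauge group


def mul(p, q):
    """Group product p*q of permutations (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def conjugate(h, a):
    """h a h^{-1}."""
    return mul(mul(h, a), inv(h))


def flat_connections_on_torus(group):
    """Pairs (a, b) whose face holonomy a b a^-1 b^-1 is the identity."""
    return [(a, b) for a, b in product(group, repeat=2) if mul(a, b) == mul(b, a)]


def gauge_orbits(pairs, group):
    """Number of orbits under (a, b) -> (h a h^-1, h b h^-1)."""
    remaining = set(pairs)
    count = 0
    while remaining:
        a, b = next(iter(remaining))
        remaining -= {(conjugate(h, a), conjugate(h, b)) for h in group}
        count += 1
    return count


flat = flat_connections_on_torus(S3)
print(len(flat), gauge_orbits(flat, S3))
```

The first number is the count of homomorphisms from the torus group $\mathbb{Z}^2$ to $S_3$, and the second is the count of their classes modulo gauge transformations, i.e. the quantity in Eq. (2.10) for this toy complex.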
It is easy to see that classical topological LGT is completely determined by a homotopy type of a complex. The construction of qQCD3 requires a more precise presentation of a complex. It is known that any oriented 3-manifold can be obtained by gluing up two 3-dimensional handlebodies along their boundaries. This operation is the Heegaard splitting. The minimal genus of the handlebodies is called the Heegaard genus of the manifold.

We can obtain a Heegaard splitting for a given oriented manifold $M$ from its cellular decomposition, $C$, as follows. We take a tubular neighbourhood, $H$, of the 1-skeleton of $C$. The complement of $H$ in $M$, $\tilde H = M \setminus H$, can be regarded as a tubular neighbourhood of the 1-skeleton of the dual complex $\tilde C$. Every 1-cell $\sigma^1_i \in C$ determines a disk $D_i \subset H$ whose detachment destroys a handle of $H$. The boundaries of the disks $\partial D_i \subset \partial H$ give a system of cycles on the boundary, $\partial H$, of the handlebody $H$. We shall call them the α-cycles: $\alpha_i := \partial D_i$. Dual 1-cells $\tilde\sigma^1_j \in \tilde C$ determine analogously a system of $\tilde\alpha$-cycles on the boundary, $\partial \tilde H$, of $\tilde H$. Images of the $\tilde\alpha$-cycles on $\partial H$ produced by a gluing homeomorphism $h$ are called the characteristic curves (or γ-cycles) of the Heegaard diagram and define the manifold $M = H \cup_h \tilde H$ unambiguously.

Let us fix a number of the disks $\{\hat D\} \subset \{D\}$ such that the detachment of them makes the handlebody connected and simply-connected (i.e., $H \setminus \{\hat D\} \cong B^3$). We can put into correspondence a generator $a_i$ of the fundamental group to each disk $D_i \in \{\hat D\}$. Defining relations are read off in an obvious way from a system of the characteristic curves $\{\gamma\}$. That is, if $\gamma_j$ intersects disks $D_{j_1}, D_{j_2}, \ldots, D_{j_k}$ subsequently, then the corresponding relator is $\Gamma_j = a_{j_1}^{\epsilon^j_1} a_{j_2}^{\epsilon^j_2} \cdots a_{j_k}^{\epsilon^j_k}$, where $\epsilon^j_k = \pm 1$ is the intersection number depending on a mutual orientation of $\gamma_j$ and the $k$th disk at the intersection point. This set of relators is of course excessive. A minimal set can be fixed by choosing a number of $\tilde\alpha$-cycles which span a disjoint collection of disks $\{\hat{\tilde D}\}$ in the complementary handlebody $\tilde H = M \setminus H$ such that the detachment of all the disks from the set $\{\hat{\tilde D}\}$ makes $\tilde H$ a 3-ball: $\tilde H \setminus \{\hat{\tilde D}\} \cong B^3$.

One can deform a Heegaard diagram by any 2-dimensional isomorphism of a boundary $\partial H$ which extends to the whole handlebody $H$. A set of generators for such isomorphisms is called in the literature the Suzuki moves (see, e.g., Refs. [12, 9] for an exposition accessible to a physicist). It can be shown that any class of isotopic diffeomorphisms of a genus $g$ surface $M^2_g$ onto itself has a representative which can be constructed as a composition of the Dehn twists, $T^\epsilon_\mu$, where $\mu$ is one of the basic cycles on $M^2_g$ and $\epsilon = \pm 1$.
One detaches from $M^2_g$ a thin neighborhood $U_\mu \cong S^1 \times [0, 1]$ of a cycle $\mu$ and then attaches it back after the full twist $U_\mu \to U_\mu$: $\varphi \times t \to (\varphi + 2\pi\epsilon t) \times t$ (where $t \in [0, 1]$ and $\varphi \in [0, 2\pi]$ parametrizes $S^1$). In the sequel, we shall only need the following fact: all the Suzuki moves are combinations of Dehn twists on loops in $\partial H$ which bound disks $D \subset H$, except for the handle slide defined in the following way. Imagine solid handles attached to a surface of a spherical ball. One drags one end of a handle up, along and down another handle. As a result of this operation, an α-cycle corresponding to the second handle slides around an α-cycle corresponding to the first one. It can be described as a multiplication of loops on $\partial H$ defined in the standard fashion as in the definition of the fundamental group $\pi_1(M^2_g)$. The same operations can be applied to $\tilde H$ as well.

It is a classical result that any two Heegaard diagrams representing the same manifold can be connected by a sequence of the following operations:
1. Dehn twists on loops contractible in $H$ or $\tilde H$. They do not change a presentation of $\pi_1(C)$.
2. Cycle slide, which consists in the multiplication of a cycle by another one: $\gamma_j \to \gamma_j \gamma_k$. It means that a relator $\Gamma_j$ in a presentation of $\pi_1(C)$ is replaced by $\Gamma_j \Gamma_k$. The same operation applied to the α-cycles, $\alpha_j \to \alpha_j \alpha_k$, corresponds to the change of generators of $\pi_1(C)$: $a_j$ is replaced by $a_j a_k$.
3. Stabilization, which consists in adding a new handle to $H$ and extending a gluing diffeomorphism by the identity on its boundary. It means that one adds one characteristic curve and one α-cycle to a Heegaard diagram or, equivalently, a new generator $a_{g+1}$ along with the trivial relation, $\Gamma_{g+1} = a_{g+1} = 1$, to a presentation of $\pi_1(C)$.

It should be noted that an isotopy within a handlebody itself cannot necessarily take place for its embeddings in $R^3$. The obvious obstruction is that the characteristic curves can become linked in $R^3$.

We shall need the operation of the connected sum of two manifolds: $M = M_1 \# M_2$. One constructs $M$ by deleting spherical balls from $M_1$ and $M_2$ and then gluing the manifolds together along the boundaries. Obviously, $M \# S^3 \cong M$, which can be represented as the attachment of a single 3-cell to the spherical boundary of the ball obtained from $M$. This operation introduces an abelian semi-group structure and any 3-fold invariant can be regarded as a representation of this semi-group. A manifold is called simple if it cannot be represented as a connected sum of two nonspherical manifolds. Any compact oriented 3-manifold possesses a unique expansion into a connected sum of simple manifolds.

By performing a Heegaard splitting, a manifold is constructed out of two handlebodies joined by some homeomorphism of their boundaries. Such a homeomorphism can be continued into small neighborhoods of the boundaries. It means that the characteristic curves can submerge a bit into the inside of $H$. If $n_{ij} = (\alpha_i, \gamma_j)$ is an intersection number of two cycles on the boundary, then $\gamma_j$ will have after the deformation the same number as a linking coefficient with $\alpha_i$. The linking coefficient of two loops in $R^3$ is equal, by definition, to the intersection number of the first with a disk spanned by the second. If $\alpha_i$ and $\gamma_j$ are linked, the corresponding 1-cell, $\sigma^1_i$, enters the boundary of the corresponding 2-cell, $\sigma^2_j$: $\sigma^1_i \in \partial\sigma^2_j$. And vice versa for co-boundaries: $\sigma^2_j \in \delta\sigma^1_i$. The boundary of a 2-cell defines a natural cyclic order of 1-cells belonging to it.
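The bookkeeping described in this section (reading a relator off a characteristic curve, a cycle slide, a stabilization, and the algebraic intersection numbers $n_{ij}$) can be written down in a few lines. The sketch below is only an illustration: a word is encoded as a list of pairs (generator index, exponent $\pm 1$), and all function names are hypothetical. The sample data is the familiar genus-one case in which the characteristic curve crosses the single disk $p = 3$ times with the same sign, giving the relator $a_1^3$.

```python
def free_reduce(word):
    """Cancel adjacent pairs a a^-1 and a^-1 a in a word."""
    out = []
    for gen, eps in word:
        if out and out[-1] == (gen, -eps):
            out.pop()
        else:
            out.append((gen, eps))
    return out


def relator(crossings):
    """Relator Gamma_j from the ordered crossings of gamma_j with the disks."""
    return free_reduce(list(crossings))


def cycle_slide(gamma_j, gamma_k):
    """Cycle slide: Gamma_j is replaced by Gamma_j Gamma_k."""
    return free_reduce(list(gamma_j) + list(gamma_k))


def stabilize(relators, n_generators):
    """Add a generator a_{g+1} together with the trivial relator a_{g+1}."""
    return relators + [[(n_generators + 1, +1)]], n_generators + 1


def intersection_numbers(word, n_generators):
    """Algebraic intersection numbers of one curve with disks 1..n_generators."""
    counts = [0] * n_generators
    for gen, eps in word:
        counts[gen - 1] += eps
    return counts


gamma = relator([(1, +1)] * 3)          # the word a_1 a_1 a_1
print(gamma, intersection_numbers(gamma, 1))
print(stabilize([gamma], 1))
```

The intersection numbers are what survives of a relator after abelianization, which is why they carry the boundary information $\sigma^1_i \in \partial\sigma^2_j$ but not the cyclic order of the crossings.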
A peculiarity of the dimension 3 is that 2-cells forming a co-boundary of a 1-cell are naturally ordered as well. It is a cyclic order of dual 1-cells forming a boundary of a dual 2-cell.

2.3. Reshetikhin-Turaev functor. The quantized function (QF) algebra $F_q(SL(n))$ is dual to the QUE algebra $U_q(sl(n))$, therefore the topological basis of $F_q(SL(n))$ is given by the matrix elements of irreducible representations of $U_q(sl(n))$. For example, the $2 \times 2$ matrix realization of $SL_q(2)$ given in Eqs. (2.5) and (2.6) exactly corresponds to the matrix elements of the 2-dimensional irreducible representation of $U_q(sl(2))$. In this paper, we shall deal with real forms of $U_q(sl(n, C))$ with respect to some ∗-structures, whose existence is always assumed.

The discussion in the previous section suggests that we can construct qQCD3 with the help of the Reshetikhin-Turaev functor from the category of colored ribbon tangles, ctang, to the category of representation rings of $U_q(sl(n))$, rep$U_q$. The basic geometric object is a tangle, which was defined in Ref. [15] as “a link of circles and segments in the 3-ball, where it is assumed that ends of segments lie on the boundary of the ball”. One puts into correspondence to every tangle a linear operator, $f$, acting on a tensor product of modules associated with segments (which have therefore to be oriented),
\[ f : V_{i_1} \otimes \ldots \otimes V_{i_n} \to V_{j_1} \otimes \ldots \otimes V_{j_k}, \tag{2.12} \]
or, graphically,
\[ f^{j_1 \ldots j_k}_{i_1 \ldots i_n} \cong \text{[diagram: a coupon $f$ with incoming strands $i_1, \ldots, i_n$ and outgoing strands $j_1, \ldots, j_k$]}, \tag{2.13} \]
where $j_1, \ldots, j_k$ and $i_1, \ldots, i_n$ are some indices numerating the modules. The simplest example is the identity operator represented by a single segment:
\[ \delta_{\alpha,\beta} \cong \text{[diagram: a single segment with ends $\alpha$ and $\beta$]}. \tag{2.14} \]
All modules considered in this paper are assumed to be irreducible. We shall draw linear operators acting on them as small boxes (coupons) with labels inside. The elementary building blocks are the matrix elements
\[ D^i_{\alpha,\beta}(a) \cong \text{[diagram: a coupon $a$ on a strand $i$ with ends $\alpha$ and $\beta$]}, \tag{2.15} \]
where $a$ is an element of $U_q$. The arrows show a direction of the action of an operator. We use the Greek letters to numerate basis vectors of irreducible modules and the Latin ones to numerate the modules. They will often be omitted. $U_q$ possesses several ∗-structures. We shall draw conjugate objects as
\[ D^i_{\alpha,\beta}(a^*) \cong \text{[diagram]} = \text{[diagram]}. \tag{2.16} \]
The last equality takes place for a real form of $U_q$, where the ∗-structure matches basis vectors of a module: α → α. The operators form an algebra $A$. We can translate this property in pictures as
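\[ a, b \in A \to ab \in A \;\cong\; \text{[diagram: coupons $a$ and $b$ on one strand merge into a coupon $ab$]}, \tag{2.17} \]
\[ \exists\, 1 \in A : a1 = 1a = a, \ \forall a \in A \;\cong\; \text{[diagram: coupons $a$, $a1$ and $1a$]}. \tag{2.18} \]
$A$ is a ring, i.e., an Abelian group under some + operation, which we shall understand as a formal sum of pictures with the natural definition of the multiplication by an integer number. To have a bi-algebra structure on $A$, we need a co-multiplication $\Delta : V \to V \otimes V$ and a co-unit $\varepsilon$. We introduce $\Delta$ as
\[ \Delta(a) \cong \text{[diagram: a coupon $\Delta(a)$]} = \sum_i \text{[diagram: coupons $a$ joined through 3-valent vertices with an intermediate strand $i$]}. \tag{2.19} \]
In general, the last equality is simply a convenient pictorial representation and has to be given a precise meaning in every particular case. In Eq. (2.19), the 3-valent vertices,
\[ C^{j_3 \alpha_3}_{j_1 \alpha_1; j_2 \alpha_2} \cong \text{[diagram: strands $j_1, \alpha_1$ and $j_2, \alpha_2$ merging into $j_3, \alpha_3$]}, \qquad C^{j_1 \alpha_1; j_2 \alpha_2}_{j_3 \alpha_3} \cong \text{[diagram: strand $j_3, \alpha_3$ splitting into $j_1, \alpha_1$ and $j_2, \alpha_2$]}, \tag{2.20} \]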
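are the quantum Clebsch-Gordan coefficients, $e^{j_1}_{\alpha_1} \otimes e^{j_2}_{\alpha_2} = \sum_{j_3, \alpha_3} C^{j_3 \alpha_3}_{j_1 \alpha_1; j_2 \alpha_2}\, e^{j_3}_{\alpha_3}$ ($e^j_\alpha$ is a basis of $V_j$). They obey the properties
\[ \sum_i \text{[diagram: strands $j_1$, $j_2$ fused into $i$ and split again]} = \text{[diagram: parallel strands $j_1$, $j_2$]}; \qquad \text{[diagram: strand $j$ split into $j_1$, $j_2$ and fused again]} = \text{[diagram: a single strand $j$]}, \tag{2.21} \]
which simply means that they are elements of a unitary matrix connecting bases in $V$ and $V \otimes V$:
\[ \sum_{i, \beta} C^{i \beta}_{j_1 \alpha_1; j_2 \alpha_2}\, C^{j_1' \alpha_1'; j_2' \alpha_2'}_{i \beta} = \delta_{j_1, j_1'}\, \delta_{j_2, j_2'}\, \delta_{\alpha_1, \alpha_1'}\, \delta_{\alpha_2, \alpha_2'}. \]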
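The unitarity property can be made concrete in the classical ($q \to 1$) limit. The sketch below assumes the ordinary $su(2)$ Clebsch-Gordan coefficients for the product of two spin-1/2 modules, not the q-deformed ones used in the paper, and the names `CG`, `transpose` and `is_orthogonal` are hypothetical. The rows of the matrix are the coupled states and the columns are the product states, so orthonormality of rows and of columns corresponds to the two relations of Eq. (2.21).

```python
import math

S = 1.0 / math.sqrt(2.0)

# Rows: coupled states (j3, m) = (1, +1), (1, 0), (1, -1), (0, 0).
# Columns: product states (+,+), (+,-), (-,+), (-,-) of two spin-1/2 modules.
CG = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, S, S, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, S, -S, 0.0],
]


def transpose(m):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*m)]


def is_orthogonal(m, tol=1e-12):
    """True if the rows of the real matrix m are orthonormal."""
    n = len(m)
    for i in range(n):
        for j in range(n):
            dot = sum(m[i][k] * m[j][k] for k in range(len(m[i])))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


# Rows orthonormal: fusing and splitting again gives the identity on V_j.
# Columns orthonormal: summing over intermediate states gives the identity on V (x) V.
print(is_orthogonal(CG), is_orthogonal(transpose(CG)))
```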
We can check the properties of the co-multiplication graphically,
\[ \Delta(ab) = \Delta(a)\Delta(b) \;\cong\; \text{[diagram]} \tag{2.22} \]
and
\[ \Delta(1) = 1 \otimes 1 \;\cong\; \text{[diagram]} = \text{[diagram]}. \tag{2.23} \]
In these formulas, the sum over intermediate states is assumed. In what follows, we shall often omit the sum sign in pictures. Thus, the co-associativity is coded in the properties of the Clebsch-Gordan coefficients. The co-unit is a homomorphism to an abelian group associated with a field over which $A$ is defined,
\[ \varepsilon(ab) = \varepsilon(a)\varepsilon(b), \qquad \varepsilon(1) = 1. \tag{2.24} \]
We shall connect the co-unit with a projection on the trivial representation of a quantum group, in other words, with the group integration. To have the Hopf algebra structure on $A$, we introduce an antipode map, $S : A \to A$:
\[ S\big(\text{[coupon $a$]}\big) = \text{[diagram]} = \text{[coupon $a^*$]}, \tag{2.25} \]
obeying
\[ \cdot\,(S \otimes \mathrm{id}) \circ \Delta = \cdot\,(\mathrm{id} \otimes S) \circ \Delta = 1 \circ \varepsilon, \tag{2.26} \]
which looks graphically as
\[ \cdot\,(S \otimes \mathrm{id}) \circ \text{[diagram]} = \text{[diagram]} = \varepsilon(a)\, \text{[diagram]}, \tag{2.27} \]
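where $a$ is an arbitrary element of $A$ and $\circ$ means a composition of operations. This property shows that the antipode can serve as a q-analog of the inverse. However, in general, $S^2 \neq 1$. The maps $V \otimes V \to C$ and $C \to V \otimes V$ are constructed with the help of CGC:
\[ \text{[diagram: strand $j$ bent into a cap]} := \sqrt{d_j}\; \text{[diagram: 3-valent vertex fusing $j$, $j$ into $0$]}; \qquad \text{[diagram: strand $j$ bent into a cup]} := \sqrt{d_j}\; \text{[diagram: 3-valent vertex splitting $0$ into $j$, $j$]}, \tag{2.28} \]
where $d_j$ is the quantum dimension of the module $V_j$. These objects become the ordinary δ-functions in the $q \to 1$ limit. The self-consistency requires that
\[ S(ab) = S(a)S(b) \;\cong\; \text{[diagram with coupons $a$, $b$ and $ab$]}. \tag{2.29} \]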
For this property to hold, it is important that
\[ \text{[diagram equation with coupons $a$ and $b$; not recovered]}. \tag{2.30} \]
The other property of the antipode is $\Delta \circ S a = \tau \circ (S \otimes S) \circ \Delta a$. It can be checked graphically:
\[ \Delta \circ \text{[diagram]} = \text{[diagram]} = \tau \circ (S \otimes S)\, \text{[diagram]}, \tag{2.31} \]
where $\tau$ is the flip operator: $V_{j_1} \otimes V_{j_2} \xrightarrow{\ \tau\ } V_{j_2} \otimes V_{j_1}$. Having equipped $A$ with an R-matrix, we obtain a quasi-triangular Hopf algebra $(A, R)$. The R-matrix obeys the Yang-Baxter equation (2.9). In our context, it will be more convenient to consider the $\check R$-matrix
\[ \check R := \tau \circ R, \tag{2.32} \]
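which can be represented graphically as
\[ \check R = \tau \circ \sum_i \alpha_i \otimes \beta_i \;\cong\; \text{[diagram: a crossing of two strands]}. \tag{2.33} \]
It is invertible:
\[ \check R^{-1} = \tau \circ \sum_i \beta_i \otimes S(\alpha_i) \;\cong\; \text{[diagram]} = \text{[diagram]}. \tag{2.34} \]
A Hopf algebra is called triangular if $\check R^2 = 1$. The standard definition of the $\check R$-matrix requires that the following properties hold:
\[ (\Delta \otimes \mathrm{id}) \check R = \check R_{12} \check R_{23} \;\cong\; (\Delta \otimes \mathrm{id})\, \text{[diagram]} = \text{[diagram]}, \tag{2.35} \]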
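\[ (\mathrm{id} \otimes \Delta) \check R = \check R_{12} \check R_{23} \;\cong\; (\mathrm{id} \otimes \Delta)\, \text{[diagram]} = \text{[diagram]}, \tag{2.36} \]
along with the general form of Eq. (2.8),
\[ \check R\, (\Delta a)\, \check R^{-1} = \Delta a, \ \forall a \in A \;\cong\; \text{[diagram]} = \text{[diagram]}. \tag{2.37} \]
These equations are equivalent to the Yang-Baxter one,
\[ \check R_{12} \check R_{23} \check R_{12} = \check R_{23} \check R_{12} \check R_{23} \;\cong\; \text{[diagram]} = \text{[diagram]}. \tag{2.38} \]
The standard relations including the antipode are
\[ (S \otimes \mathrm{id}) \check R = \check R^{-1} \;\cong\; (S \otimes \mathrm{id})\, \text{[diagram]} = \text{[diagram]} = \text{[diagram]}, \tag{2.39} \]
\[ (\mathrm{id} \otimes S) \check R^{-1} = \check R \;\cong\; (\mathrm{id} \otimes S)\, \text{[diagram]} = \text{[diagram]} = \text{[diagram]}, \tag{2.40} \]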
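and those with the co-unit are
\[ (\varepsilon \otimes \mathrm{id}) \check R = (\mathrm{id} \otimes \varepsilon) \check R = 1 \;\cong\; \text{[diagram: a strand $j$ crossing the trivial strand $0$]} = \text{[diagram: a single strand $j$]}. \tag{2.41} \]
The standard way to introduce the ribbon Hopf algebra structure on $A$ is to bring forward the element $u \in A$ defined as [15]
\[ u := \sum_i S(\beta_i)\, \alpha_i \;\cong\; \text{[diagram]}. \tag{2.42} \]
The element $v^2 = u S(u)$ lies in the center of $A$. A ribbon Hopf algebra $U_q = (A, \check R, v)$ is a quasi-triangular Hopf algebra $(A, \check R)$ equipped with a central invertible element $v \in A$,
\[ v := \text{[diagram]}, \qquad v^{-1} := \text{[diagram]}, \tag{2.43} \]
obeying the properties: $\varepsilon(v) = 1$,
\[ u S(u) = v^2 \;\cong\; u \circ \text{[diagram]} = \text{[diagram]} = \text{[diagram]} = \text{[diagram]}, \tag{2.44} \]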
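\[ S(v) = v \;\cong\; \text{[diagram]} = \text{[diagram]}, \tag{2.45} \]
\[ \Delta(v) = (\check R^2)^{-1} (v \otimes v) \;\cong\; \text{[diagram]} = \text{[diagram]} = \text{[diagram]}. \tag{2.46} \]
The element
\[ u v^{-1} \;\cong\; \text{[diagram]} = \text{[diagram]} \tag{2.47} \]
allows for defining the q-trace of an operator,
\[ \mathrm{qtr}(a) := \mathrm{tr}(a\, u v^{-1}) \;\cong\; \text{[diagram: a coupon $a$ on a closed strand]}. \tag{2.48} \]
In the tensor square of spaces it takes the form
\[ \mathrm{tr}[a \otimes b \circ \Delta(u v^{-1})] = \mathrm{qtr}(a)\, \mathrm{qtr}(b) \;\cong\; \text{[diagram with coupons $a$, $b$]} = \text{[diagram with coupons $a$, $b$]}. \tag{2.49} \]
Following Reshetikhin and Turaev, we shall call this operation the closing of a tangle. The quantum dimension of a module, $V_j$, is, by definition, the q-trace of the identity operator:
\[ d_j := \mathrm{qtr}(1_{V_j}) \;\cong\; \text{[diagram: a closed strand $j$]}. \tag{2.50} \]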
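For the rank-one case the quantum dimension has a closed form that is easy to tabulate. The sketch below assumes the common convention for $U_q(sl(2))$ in which the spin-$j$ module has $d_j = [2j+1]_q$ with $[n]_q = (q^n - q^{-n})/(q - q^{-1})$; this normalisation of $q$ is an assumption and may differ from the one used in the paper by a rescaling of the exponent. The function names are hypothetical.

```python
import cmath


def quantum_integer(n, q):
    """[n]_q = (q^n - q^-n) / (q - q^-1); reduces to n in the limit q -> 1."""
    if abs(q - 1) < 1e-12:
        return float(n)
    return (q ** n - q ** (-n)) / (q - 1 / q)


def quantum_dimension(two_j, q):
    """Assumed q-dimension [2j+1]_q of the spin-j module, labelled by the integer 2j."""
    return quantum_integer(two_j + 1, q)


# Real q: the dimensions are deformed but stay positive.
print([quantum_dimension(two_j, 2.0) for two_j in range(4)])

# Classical limit: ordinary dimensions 1, 2, 3, 4.
print([quantum_dimension(two_j, 1.0) for two_j in range(4)])

# Assumed root-of-unity parametrisation q = exp(i pi / r): [r]_q vanishes,
# which is why this case needs the separate treatment announced for Sect. 2.6.
r = 5
q_root = cmath.exp(1j * cmath.pi / r)
print(abs(quantum_integer(r, q_root)))
```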
2.4. Algebra of fields and gauge invariance. In Sect. 2.2 we have described a Heegaard diagram as a handlebody with a given system of α-cycles and characteristic curves on its boundary. Every α-cycle spans a disk $D$ in a handlebody $H$. The disk can be thickened to a plate $P$. In this way we obtain a collection of disjoint plates in $H$ ($P_i \cap P_j = \emptyset$, if $i \neq j$). Each plate corresponds to a 1-cell of a base cell complex $C$ from which the Heegaard diagram has been read off. By detaching the plates, $H$ decomposes into a collection of 3-balls $\{B\}$, each corresponding to a 0-cell of $C$.

Definition 1. We construct the qQCD3 functor in the following way:
1. A gauge variable taking values in a ribbon QUE algebra $U_q$ is put into correspondence to each plate: $P_k \to a_k \in U_q$. The variables attached to different plates are distinct elements of $U_q$, hence their matrix elements are co-commutative.
2. All the characteristic curves are colored with irreducible finite-dimensional representations of $U_q$.
3. If on a boundary of the $k$th plate there are $n_k$ disjoint cuts of the characteristic curves colored with representations $j_1, j_2, \ldots, j_{n_k}$, we construct a gauge field tangle by repeatedly applying the co-multiplication: $F_k = \Delta^{n_k - 1}(a_k) : V_{j_1} \otimes \ldots \otimes V_{j_{n_k}} \to V_{j_1} \otimes \ldots \otimes V_{j_{n_k}}$ or, graphically,
[Diagram: a plate crossed by strands $j_1, j_2, \ldots, j_{n_k}$ is replaced by a coupon $a_k$ on each of the strands.]
One has to respect the cyclic order and mutual orientations of the cuts. A reversal of an orientation of a cut corresponds to the conjugation (with respect to some fixed ∗-structure) of the corresponding matrix element.
4. One puts into correspondence to each ball $B_i \in \{B\}$ carrying a pattern of the characteristic curves on its boundary a vertex tangle by using the Reshetikhin-Turaev functor ctang → rep$U_q$.
5. In the end, the pieces are attached together. To do it, one embeds the handlebody into $R^3$ in such a way that the cuts of the characteristic curves on boundaries of the plates project to distinct points on the $(x, y)$ plane and onto disjoint segments on the $(x, z)$ plane. Then one can use the $(x, z)$ projection of the vertex tangles to complete the construction in terms of elements of a ribbon Hopf algebra as was depicted in the previous section. The result is a functional taking values in $C$.

Remarks.
1) Let us notice that the initial data are colorings and directions of the characteristic curves as well as $U_q$ elements attached to the plates.
2) One can sum over all the colorings with arbitrary weights as in Eq. (2.4). As we are restricted to real forms of $U_q(sl(n))$, the result has to be independent of the directions of the characteristic curves.
3) Modules appearing in different vertex tangles are independent. However, after the assemblage, all pieces are fit together and, permuting matrix elements adjacent via a vertex tangle, one has to deform the tangle, which means some effective non-cocommutativity. We shall dwell on this point later. Now, let us simply illustrate what may happen by the example shown in Fig. 1.
4) Clearly, the construction gives different results for non-isotopic embeddings of $H$ into $R^3$. This lack of self-consistency will disappear after the integration over gauge fields (see the next section).
[Figure 1: a chain of diagram equalities involving coupons a, b and c; diagrams not recovered.]
Fig. 1. An illustration of the non-cocommutativity of fields adjacent via a vertex.
Let us now discuss the issue of gauge invariance within our framework. In the general setting of gauge theory the basic object is a fiber bundle over a base manifold. A gauge field performs a parallel transport of fibers and thus is interpreted as a G-connection in sections of the bundle. Gauge transformations act by automorphisms of the fibers. To make it explicit, one has to choose some G-basis at each point of the base. A quantity is gauge invariant if it is independent of a particular choice of the bases. In LGT a base manifold is replaced by a finite cell complex. Therefore, instead of a fiber bundle, one has a tensor product of G-modules, one for each 0-cell in the complex. In our construction of qQCD3, 0-cells are associated with the vertex tangles. One chooses bases of the $U_q$-modules for each tangle independently and then sandwiches the matrix elements between them. Thus, we can reformulate gauge invariance as the requirement of independence from particular choices of all the bases. As in the quantum case the notion of the group manifold is absent, one cannot translate a change of a frame into a group rotation. Although these changes can be given a matrix form, their status is quite different from that of gauge fields. Therefore, we lose contact with the explicit formula (2.1).
2.5. Partition function. We connect integration with the co-unit. By definition, it is a linear functional projecting onto the trivial representation. For the matrix elements, we have
\[ \int da\, D^j(a) = \delta_{j,0} \;\cong\; \varepsilon\big(\text{[coupon $a$ on a strand $j$]}\big) = \delta_{j,0}\, \text{[diagram]}. \tag{2.51} \]
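The classical counterpart of Eq. (2.51) is the statement that the normalised group average of a matrix element vanishes unless the representation is trivial. The sketch below illustrates this under a simplifying assumption: the group is the cyclic group $Z_n$, whose irreps are one-dimensional and labelled by an integer, so no quantum group structure is involved. The function names are hypothetical.

```python
import cmath


def matrix_element(label, g, n):
    """One-dimensional irrep D^label(g) = exp(2 pi i label g / n) of Z_n."""
    return cmath.exp(2 * cmath.pi * 1j * label * g / n)


def haar_integral(f, n):
    """Normalised sum over the group Z_n, the finite analog of the Haar integral."""
    return sum(f(g) for g in range(n)) / n


n = 6

# Projection onto the trivial representation: the integral equals delta_{label, 0}.
for label in range(n):
    print(label, round(abs(haar_integral(lambda g: matrix_element(label, g, n), n)), 12))


def overlap(j1, j2):
    """Integral of D^{j1}(g) times the conjugate of D^{j2}(g)."""
    return haar_integral(
        lambda g: matrix_element(j1, g, n) * matrix_element(j2, g, n).conjugate(), n
    )


# Orthogonality of matrix elements; all dimensions equal 1 for an abelian group.
print(round(abs(overlap(2, 2)), 12), round(abs(overlap(2, 3)), 12))
```

Shifting the argument, `g -> (g + a) % n`, leaves every such sum unchanged, which is the finite-group form of the invariance of the Haar integral.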
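An integral of an arbitrary product of matrix elements can be reduced to the basic one (2.51) by successively applying the tensor product decomposition with the Clebsch-Gordan coefficients. For example, the orthogonality of matrix elements reads
\[ \varepsilon\bigl(\, a^{j_1}{}^{\beta_1}_{\alpha_1}\; a^{j_2}{}^{\beta_2}_{\alpha_2} \,\bigr) \;=\; \frac{\delta_{j_1,\bar{\jmath}_2}}{d_{j_1}}\; [\text{diagram with legs } j_1\alpha_1,\ j_2\alpha_2 \text{ and } \beta_1,\ \beta_2] , \qquad (2.52) \]
(graphical equation; the intermediate step involving a sum and the diagrams themselves are not reproduced),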
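where dj is the q-dimension of a module Vj. The self-consistency of the definition can be easily checked:
[Graphical equation (2.53), not reproduced: a nested integral of the form ε(a ε(b a) b) is evaluated with the help of (2.52); the prefactors 1/d_j^2 and 1/d_j appear in the successive steps.] (2.53)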
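The main property of the Haar integral is the right/left invariance:
[Graphical equation (2.54), not reproduced: the integral over b of matrix elements of the product abc, with a and c fixed, is reduced through a sum to the integral over b alone, with the prefactor 1/d_j.] (2.54)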
In the general case, the invariance easily follows from the properties of the Clebsch-Gordan coefficients and the antipode. The integral should be used with some caution. For example, the reader has to be aware that
\[ \varepsilon(\,a\,) \;\not\cong\; \varepsilon(\;a\;) \qquad (2.55) \]
(the two sides differ in the diagrams enclosed by ε, which are not reproduced), and the r.h.s. of this formula makes no sense. Otherwise, one could easily arrive at contradictions. Expressions like $\int dx\, dy\, f(x, y)$ are inadequate in the quantum case. To exclude ambiguity, we shall always connect the integration with the linear operator acting on a tensor product of modules constructed with the help of the co-multiplication:
Definition 2.
\[ \int \;\cong\; \varepsilon\circ\Delta^{n} := \varepsilon(\Delta^{n}(\,\cdot\,)) : (V_1 \otimes \ldots \otimes V_{n+1} \longrightarrow V_1 \otimes \ldots \otimes V_{n+1}) \longrightarrow \mathbb{C}. \]
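It can be calculated recursively:
\[ \varepsilon(\,a \cdots a\, a\,) = \sum \varepsilon(\,a \cdots a\,) . \qquad (2.56) \]
(Graphical equation: on the right-hand side two adjacent matrix elements are fused by the Clebsch-Gordan decomposition, so that the number of factors under ε is reduced by one; the diagrams are not reproduced.)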
To complete our construction we need to specify a real form of Uq (sl(n)) with respect to some fixed ∗-structure. We are interested in two cases: (i) Uq (su(n)), which makes sense for real q, and (ii) Uq (sl(n, R)) for |q| = 1. The Hopf ∗-algebra Uq (su(n)) has been well investigated, starting from the pioneering works of Woronowicz and of Vaksman and Soibelman [21]. We need the following facts:
1. There is a one-to-one correspondence between the finite-dimensional irreducible representations of Uq (su(n)) and those of the classical algebra U (su(n)).
2. The representation ring of Uq (su(n)), spanned by matrix elements of finite-dimensional irreducible representations, can be regarded as the q-deformation of the algebra of regular functions on SU (n) (the quantum Peter-Weyl theorem). In particular, there exists a q-analog of the group δ-function.
3. There exists a q-analog of the Haar measure. The matrix elements are orthogonal with respect to it. An explicit representation of them can be given in terms of q-special functions. In this case the group integration is performed with the help of the so-called Jackson integral from the theory of q-special functions.
Now, we are in a position to define the qQCD3 partition function. We shall denote the number of k-cells in a complex as $N_k$.
Definition 3. We take the construction of the qQCD3 functor introduced in the previous section. Then
1. We color the characteristic curves with Uq (su(n)) irreps: $\gamma_i \to j_i$, $i = 1, \ldots, N_2$.
2. To every plate $P_k$ ($k = 1, \ldots, N_1$) we put into correspondence the integral $P_k \longrightarrow \varepsilon(\Delta^{n_k-1}(a_k))$, $a_k \in U_q(su(n))$.
3. By applying Eqs. (2.51), (2.52) and (2.56) we obtain a collection of closed 3-valent ribbon graphs {τ}, the number of which equals the number of 0-cells in the base cell complex. By using the Reshetikhin-Turaev functor, we calculate the quantum invariant**, $J(\tau_k)$, for each connected component $\tau_k$. Let us denote their product as
\[ Z_{j_1 \ldots j_{N_2}} = \prod_{k=1}^{N_0} J(\tau_k). \qquad (2.57) \]
4. The partition function equals the sum over all colorings of the characteristic curves
\[ Z_\beta = \sum_{\{j_1 \ldots j_{N_2}\}} \prod_{k=1}^{N_2} d_{j_k}\, e^{-\beta C_{j_k}}\; Z_{j_1 \ldots j_{N_2}}, \qquad (2.58) \]
where dj is the quantum dimension and Cj is a second Casimir eigenvalue.
Remarks. 1) If q = 1, this definition reduces to the one given in Eqs. (2.3) and (2.4). 2) Zβ is a gauge invariant quantity in the sense described in the previous section. Indeed, as any vertex tangle after the integration gives a closed ribbon 3-valent graph, the choice of a basis attached to it is irrelevant. 3) After the integration, all non-isotopic embeddings of a handlebody H into R3 become equivalent and the consideration can be restricted to isotopies of the handlebody itself. 4) If one considers a cell complex dual to a simplicial one, the ribbon graph invariants J(τk) in Eq. (2.57) coincide with the quantum 6-j symbols in the Racah-Wigner normalization.
2.6. The root of unity case. For applications, the most interesting case is when q is a primitive root of unity: $q^\ell = 1$. Then Uq (sl(n)) possesses the real form Uq (sl(n, R)). This case is technically rather complicated. One has to work with the restricted specialization $U_q^{res}(sl(n))$ of Uq (sl(n)), and the issue of the duality between the QF and QUE algebras becomes quite subtle. Fortunately, one can proceed with the notion of the modular Hopf algebra [15].
** Often called the generalized Jones polynomial.
Definition 4. Consider a ribbon Hopf algebra $(A, \check{R}, v)$ equipped with a distinguished family $\{V_j\}_{j\in S}$ of irreducible A-modules indexed by a finite set S including the trivial representation $V_0$. $(A, \check{R}, v)$ is called a modular Hopf algebra if the following requirements are fulfilled:
1. $\mathrm{qdim}\, V_j \neq 0$, $\forall j \in S$.
2. The set $\{V_j\}_{j\in S}$ is equipped with an involution $j \to j^*$ such that $V_{j^*} = V_j^*$ and $V_j^{**} = V_j$.
3. For any sequence $j_1, \ldots, j_n \in S$,
\[ V_{j_1} \otimes V_{j_2} \otimes \ldots \otimes V_{j_n} = \bigoplus_{i\in S} V_i^{\oplus m_i} \oplus I, \qquad m_i \in \mathbb{N}, \]
as A-modules, and for all A-module endomorphisms f of the ideal I, qtr(f) = 0.
4. Let $s_{ij}$ be the quantum invariant of the Hopf link, the two components of which are colored with irreps i and j ∈ S,
\[ s_{ij} = \mathrm{qtr}\,[\text{Hopf link colored with } i \text{ and } j]; \]
then the matrix $(s_{ij})_{i,j\in S}$ is invertible.
Let us take the row of the inverse matrix $s^{-1}$ corresponding to the trivial representation $V_0$; then
\[ \sum_{j'\in S} (s^{-1})_{0j'}\, s_{j'j} = \delta_{0,j}. \qquad (2.59) \]
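We can consider Eq. (2.59) as an analog of the basic integral (2.51) with the obvious action of the co-multiplication:
[Graphical equation (2.60), not reproduced: the action of ∆ on the diagram representing the integral.] (2.60)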
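Owing to the third condition in the definition of the modular Hopf algebra, we find the following analog of the orthogonality of matrix elements
\[ \sum_{i\in S} (s^{-1})_{0i}\, \mathrm{qtr}\,[\text{loop colored } i \text{ around the strands } j_1, j_2 \text{ of } f] \;=\; \frac{\delta_{j_1,j_2}}{d_{j_1}}\, [\text{diagram with } f] \qquad (2.61) \]
(graphical equation; the diagrams are not reproduced)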
for any endomorphism f : V ⊗ V → V ⊗ V. In these formulas, the q-trace is necessary to project out the ideal I. An example of a modular Hopf algebra has been given by Reshetikhin and Turaev [15] in the sl2 case. In Ref. [18] the notion of a quasi-modular Hopf algebra has been introduced by slightly weakening the irreducibility condition on the modules from $\{V_j\}_{j\in S}$. It still leads to 3-manifold invariants of the Reshetikhin-Turaev type and is therefore sufficient for our purposes as well. Turaev and Wenzl have constructed examples of quasi-modular Hopf algebras associated with Uq (g), $q^\ell = 1$, for all g of the A, B and D types. Thus, we define the qQCD3 partition function at a root of unity in the same way as in the previous section, using the definition of the integral given above. One can describe the quantity $Z_{j_1 \ldots j_{N_2}}$ appearing in Eq. (2.57) as follows. We consider a Heegaard splitting $M = H \cup_h \tilde{H}$. Let us continue the homomorphism h into a small neighborhood of ∂H. In other words, $H \cap \tilde{H} = M_g^2 \times [0, 1]$, $\partial H = M_g^2 \times 1$ and $\partial\tilde{H} = M_g^2 \times 0$; $\{\alpha\} \in \partial H$ and $\{\gamma\} \in \partial\tilde{H}$. Then for any standard embedding of H into R3, the characteristic curves and the α-cycles form a non-trivial link, L. They are colored with two sets of representations $j_1, \ldots, j_{N_2} \in S$ and $i_1, \ldots, i_{N_1} \in S$. By using the Reshetikhin-Turaev functor, we calculate the quantum invariant of the link, $J^{i_1 \ldots i_{N_1}}_{j_1 \ldots j_{N_2}}(L)$, and sum over the colors of the α-cycles with the weights $(s^{-1})_{0i}$:
\[ Z_{j_1 \ldots j_{N_2}} = \sum_{i_1 \ldots i_{N_1} \in S}\; \prod_{k=1}^{N_1} (s^{-1})_{0 i_k}\; J^{i_1 \ldots i_{N_1}}_{j_1 \ldots j_{N_2}}(L). \qquad (2.62) \]
2.7. The topological limit. If in the root of unity case one chooses the Boltzmann weight coefficients equal to $v_j = (s^{-1})_{0j}$, one finds the partition function
\[ Z_0(C) = \sum_{j_1 \ldots j_{N_2} \in S}\; \prod_{k=1}^{N_2} v_{j_k}\, Z_{j_1 \ldots j_{N_2}} = \sum_{j_1 \ldots j_{N_2} \in S}\; \prod_{k=1}^{N_2} v_{j_k} \sum_{i_1 \ldots i_{N_1} \in S}\; \prod_{k=1}^{N_1} v_{i_k}\; J^{i_1 \ldots i_{N_1}}_{j_1 \ldots j_{N_2}}(L), \qquad (2.63) \]
where $N_k$ is the number of k-dimensional cells in a complex C; $J^{i_1 \ldots i_{N_1}}_{j_1 \ldots j_{N_2}}(L)$ is the quantum invariant of a link L given by a Heegaard diagram associated to the complex C. Let us denote $\omega = \sum_{i\in S} v_i d_i$.
Theorem 1. $I(M) = Z_0(C)/\omega^{N_0+N_3-2}$ is a topological invariant of a manifold M represented by a complex C. I(M) is multiplicative with respect to the connected sum:
\[ I(M) = I(M_1)\, I(M_2), \qquad \text{if } M = M_1 \# M_2, \]
and $I(S^3) = 1$.
Proof. The Heegaard splitting associated to a cell complex C having $N_k$ cells in the k-th dimension gives a handlebody, $H_g$, of genus $g = N_1 - N_0 + 1$. Let us fix g independent α-cycles of the Heegaard diagram and take the corresponding integrals (i.e., sums over the i's in Eq. (2.63)) first. By applying the CGC decomposition and then using the orthogonality (2.61), we deform the set of the characteristic curves in the link L into some 3-valent ribbon graph G. Every application of Eq. (2.61) destroys a handle of $H_g$. Therefore, having taken the g integrals, we obtain a spherical ribbon 3-valent graph G plus a collection of $N_1 - g = N_0 - 1$ disjoint unlinked loops corresponding to the rest of the α-cycles. The integrals associated to them give $\omega^{N_0-1}$. Now, we can recover the initial configuration of the characteristic curves*** by restoring the g integrals corresponding to the independent α-cycles. In this way we obtain a cell decomposition of M with only one 0-cell and every 1-cell corresponding to a generator of $\pi_1(M)$. This procedure is the direct analog of fixing an axial gauge in LGT. The Heegaard splitting is obviously symmetric with respect to the Poincaré duality, therefore we can repeat the previous procedure with the roles of the α-cycles and the characteristic curves interchanged. In this way we fix a set of g independent characteristic curves and pick up the factor $\omega^{N_3-1}$. Thus, we finish with some balanced presentation of $\pi_1(M)$. To prove the topological invariance, we have to show that I(M) is not changed [i] by Dehn twists on contractible loops, [ii] by the cycle slide and [iii] by the stabilization.
i) Invariance under the Dehn twists on loops contractible inside $H_g$ is obvious. By taking the g integrals, we cut all handles and always get the same 3-valent ribbon graph G.
ii) Invariance under the cycle slide follows from the analog of Haar measure invariance as illustrated in Fig. 2.
Fig. 2. Invariance under the cycle slide. (Diagram not reproduced.)
iii) The stabilization consists in adding a handle to $H_g$ and extending the gluing homomorphism h by the identity on its boundary. It amounts to adding to L one α-cycle and one characteristic curve forming the Hopf link. Therefore, the integration associated to the new handle attaches the trivial representation to the new characteristic curve, and it is unimportant how it is linked with the other α-cycles.
To show the multiplicative nature of the invariant, let us choose a cell decomposition of $M = M_1 \# M_2$ such that the sphere dividing $M_1$ and $M_2$ consists of only one 0-cell and one 2-cell. Then the corresponding characteristic curve is linked with no α-cycle and the corresponding link L in Eq. (2.63) has two connected components.
*** Or another one equivalent to it.
The normalization $I(S^3) = 1$ follows from the observation that the Hopf link corresponds to a genus 1 Heegaard splitting of the sphere. □
Remarks. 1) The meaning of the choice of the Boltzmann weight coefficients made in Eq. (2.63) is clear. They correspond to the δ-function weights. Therefore, Z0 (M) can be regarded as a generalization of Eq. (2.10). In contrast to the finite group partition function, the q-deformed model is obviously self-dual with respect to the Poincaré duality of 3-folds. 2) Let us consider a simplicial complex $C^{(s)}$. If we take in the expression (2.63) for the partition function $Z_0(C^{(s)})$ all the sums associated to triangles in $C^{(s)}$ prior to the others, then the answer is identical to the definition of the Turaev-Viro invariant. Indeed, for each triangle we find the tangle equivalent to the product of two 3j-symbols:
[Graphical equation (2.64), not reproduced: the sum $\sum_i v_i$ over the color of the loop around three strands colored $j_1, j_2, j_3$ yields, with the prefactor $1/d_{j_3}$, a pair of 3-valent vertices with legs $j_1, j_2, j_3$.] (2.64)
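By closing all the tangles, we obtain a Racah-Wigner 6j-symbol
\[ \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix} = \frac{1}{\sqrt{d_{j_6} d_{j_2} d_{j_5}}}\; [\text{tetrahedral ribbon graph with edges } j_1, \ldots, j_6] \qquad (2.65) \]
(the diagram on the right-hand side is not reproduced)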
inside each tetrahedron of the simplicial complex. The indices $j_1, \ldots, j_{N_1}$ are attached to the 1-simplexes of $C^{(s)}$. Taking the sums over them, we arrive at the Turaev-Viro state sum invariant [17]:
\[ I_0(C^{(s)}) = \omega^{N_1-N_2+2} \sum_{\{j_k \in S\}}\; \prod_{k=1}^{N_1} v_{j_k}\; \prod_{t=1}^{N_3} \begin{Bmatrix} j_{t_1} & j_{t_2} & j_{t_3} \\ j_{t_4} & j_{t_5} & j_{t_6} \end{Bmatrix}, \qquad (2.66) \]
where the 6-tuple $(t_1, \ldots, t_6)$ denotes the six edges of the t-th tetrahedron. Explicit expressions for $(s^{-1})_{ij}$ in the sl2 case are given in Ref. [15]. Thus, Eq. (2.63) can be regarded as a general definition of the Turaev-Viro invariant. 3) The expression for Z0 (M) given in Eq. (2.63) coincides with the Reshetikhin-Turaev construction of 3-fold invariants $I_{RT}$ via the surgery representation [15]. Therefore, Z0 (M) is automatically invariant under the Kirby calculus applied formally to the link L. It means that, given a manifold M, there exists another one, N, such that $I(M) = I_{RT}(N)$. As $I(M) = |I_{RT}(M)|^2$, we conclude that $N \cong M \# \bar{M}$ ($\bar{M}$ is M with the opposite orientation). A simple illustration in the case of lens spaces can be found in Ref. [6].
2.8. Bounded manifolds and links. Every set of disjoint simple closed curves {γ} on a handlebody H determines a bounded 3-manifold M constructed by gluing plates to annular neighborhoods of the curves. It can be shown that every orientable bounded 3-manifold can be obtained in this way. The handlebody in this construction is a tubular neighborhood of a 1-skeleton of M. Therefore we can straightforwardly apply the qQCD3 functor in the bounded case. For it, we [i] fix a system of α-cycles on ∂H; [ii] color curves from {γ} with Uq irreps; [iii] repeat steps 3, 4 and 5 from Definition 1 without any modification. It suggests the following interpretation of our construction. A spine is a 2-dimensional polyhedron which can be embedded in some 3-manifold. Any 3-manifold with a boundary collapses to a spine. Let us delete a ball from every 3-cell of a closed complex C. In such a way we obtain a bounded manifold which collapses to a 2-skeleton K2 of C. If C is dual to a simplicial complex, K2 is called a standard spine. Matveev has introduced two moves which relate all standard spines of the same manifold [13]. It can be easily shown that I(M) from the previous section is invariant under the Matveev moves. The definition of the qQCD3 functor uses an immersion of K2 into R3 and depends on it. It seems to be an intrinsic feature of q-deformed LGT rather than a defect of our presentation. Only gauge invariant singlet quantities (the partition function, for example) are independent of a way K2 is immersed into R3 . One of the advantages of our presentation of qQCD3 is a relative simplicity of introducing Wilson loops in it. In classical LGT, a loop average is defined as
\[ A(L_1, \ldots, L_m) = \frac{1}{Z_\beta} \int_G \prod_{\ell} dg_\ell\; \prod_{f} W_\beta(h_{\partial f})\; \prod_{i=1}^{m} \mathrm{tr}_{V_{j_i}}[h_{L_i}], \qquad (2.67) \]
where {L} are m closed curves embedded into the 1-skeleton $K^1$ of a complex C. We color the i-th curve with a representation $j_i$ of the gauge group G. The holonomy $h_L$ is defined in Eq. (2.2). In the q-deformed case, we have to specify the link formed by the collection of curves in a manifold $M \cong C$. To this end we represent the curves {L} by a set of disjoint ribbon loops on the boundary ∂H′ of a handlebody H′ ⊂ H (as usual, H′ and H are tubular neighborhoods of $K^1$ and H′ lies inside H: H ∩ H′ = H′, H ∪ H′ = H). If this is not possible, then one has to take a finer subdivision of M. In the case of links in R3, it is a standard technical trick to realize a link as a system of disjoint loops on a handlebody embedded into R3, and we simply use it as a definition. We can apply the qQCD3 functor to such a composite handlebody without any additional modification. Loops from {L} enter on an equal footing with the characteristic curves. One can repeat the same argument as in the partition function case to prove that the answer is independent of the embedding of H in R3. However, this does not mean that the q-deformation of Eq. (2.67) gives no non-trivial knot invariant. Let us consider a link in R3. There has to exist a trivial embedding such that the characteristic curves of a Heegaard diagram lying on ∂H are unlinked and contractible in R3 \ H. Therefore the sums over their colors disjoin the α-cycles on ∂H and the link of curves {L} on ∂H′ (in other words, they cut the handles of the complementary handlebody S3 \ H). What remains is exactly the Jones polynomial associated to the link {L}. A comprehensive treatment of quantum invariants of links and 3-valent graphs in 3-manifolds can be found in Ref. [16].
3. qQCD2
We define a qQCD2 functor by applying the qQCD3 one to an embedding of an oriented 2-manifold $M_g^2$ in R3. In the topological limit, we can consider the simplest cell decomposition of $M_g^2$ consisting of a single 2-cell and 2g 1-cells. A tubular neighborhood of its 1-skeleton is a handlebody H of genus 2g. The Heegaard diagram has only one characteristic curve. Each integral destroys a handle of H and contributes a factor $1/d_j$ to the answer. The calculation is reduced to a repeated application of Eq. (2.61) and one easily gets
\[ I(M_g^2) = \omega^{2g-1} Z_0(M_g^2) = \omega^{2g-1} \sum_{j\in S} v_j Z_j = \sum_{j\in S} v_j \Bigl(\frac{d_j}{\omega}\Bigr)^{1-2g}. \qquad (3.1) \]
Let us consider a concrete example of the quantum group Uq (sl(2)) at $q = e^{2\pi i/(k+2)}$. In this case, the set of modules in the definition of the modular Hopf algebra is given by the fusion ring $V_j$ ($j = 0, \tfrac12, 1, \tfrac32, \ldots, \tfrac{k}{2}$) and
\[ d_j = \frac{\sin\frac{(2j+1)\pi}{k+2}}{\sin\frac{\pi}{k+2}}; \qquad \omega = \frac{\sqrt{(k+2)/2}}{\sin\frac{\pi}{k+2}}; \qquad v_j = d_j/\omega. \]
We find
\[ I(M_g^2) = \sum_{j=0,\frac12,\ldots}^{k/2} \Bigl( \frac{2}{k+2}\, \sin^2\frac{(2j+1)\pi}{k+2} \Bigr)^{1-g}. \qquad (3.2) \]
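The sum in Eqs. (3.1) and (3.2) is easy to evaluate numerically. The following is a minimal sketch, not part of the original construction: the function names `sl2_data` and `verlinde_number` are hypothetical, and the spins j = 0, 1/2, ..., k/2 are indexed by the integer n = 2j.

```python
import math

def sl2_data(k):
    """Return (d, omega, v) for U_q(sl(2)) at q = exp(2*pi*i/(k+2)).

    d[n] is the quantum dimension d_j for j = n/2, n = 0, ..., k;
    omega and v[n] = d[n]/omega follow the formulas quoted before Eq. (3.2).
    """
    s0 = math.sin(math.pi / (k + 2))
    d = [math.sin((n + 1) * math.pi / (k + 2)) / s0 for n in range(k + 1)]
    omega = math.sqrt((k + 2) / 2) / s0
    v = [dj / omega for dj in d]
    return d, omega, v

def verlinde_number(k, g):
    """Right-hand side of Eq. (3.1): sum over j of v_j * (d_j/omega)**(1 - 2g)."""
    d, omega, v = sl2_data(k)
    return sum(vj * (dj / omega) ** (1 - 2 * g) for dj, vj in zip(d, v))
```

Up to floating-point rounding, `verlinde_number(k, 0)` gives 1 (the sphere), `verlinde_number(k, 1)` gives k + 1 (the torus), and `verlinde_number(2, 2)` gives 10. The same data also satisfy the relation ω = Σ v_i d_i used in Sect. 2.7.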
These are known as Verlinde's numbers. They are all integers and equal the dimensions of the spaces of conformal blocks in the WZW model on a genus g Riemann surface. If one starts with a more complicated cell decomposition of a Riemann surface, then one simply has to apply the orthogonality relation (2.61) until all handles of H are destroyed.
In two dimensions, local properties of the qQCD2 functor can be formalized in a purely algebraic way. To this end, let us cut from $M_g^2$ a piece which can be projected onto a plane R2. It gives a subdivision (a triangulation, say) of some region of the plane. There is a natural cyclic order of edges incident to a vertex. Following Fock and Rosly [11], one introduces a ciliation at every vertex, i.e., breaks this order. Let us say that an edge $\ell_1$ goes after $\ell_2$ ($\ell_1 > \ell_2$) if the anti-clockwise angle $\varphi(\ell_1)$ between the edge $\ell_1$ and the x-axis is bigger than the angle $\varphi(\ell_2)$ between $\ell_2$ and the x-axis: $\varphi(\ell_1) > \varphi(\ell_2)$. We assume that no edge is parallel to the x-axis, and orient edges in the y-direction, say, by putting an arrow at the end having the bigger y coordinate. Assuming that any two vertices are connected by at most one edge, we can enumerate edges by ordered pairs of vertices, $(i, j) \cong j \nearrow i$. Alekseev, Grosse and Schomerus have introduced the following algebra of gauge fields $U_{(i,j)}$ [3]:
i) If two edges have no common vertices, fields are co-commutative:
\[ \overset{1}{U}_{(i,j)}\, \overset{2}{U}_{(n,m)} = \overset{2}{U}_{(n,m)}\, \overset{1}{U}_{(i,j)}; \]
here i, j, n and m are all distinct.
ii) If two edges share a vertex, then
\[ \overset{1}{U}_{(i,j)}\, \overset{2}{U}_{(k,j)} = \begin{cases} \overset{1}{U}_{(k,j)}\, \overset{2}{U}_{(i,j)}\, \check{R}_{12} & \text{if } (i, j) > (k, j), \\ \overset{1}{U}_{(k,j)}\, \overset{2}{U}_{(i,j)}\, \check{R}^{-1}_{12} & \text{if } (i, j) < (k, j). \end{cases} \]
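We picture these relations as
[Diagrams not reproduced: two graphical identities for the elements a and b attached to the edges (i, j) and (k, j) meeting at the vertex j, one for the case (i, j) > (k, j) and one for the case (i, j) < (k, j).]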
We have drawn in solid lines the Uq-elements figuring in the AGS relations associated with the j-th vertex. All those attached to other vertices are dashed.
iii) Fields attached to the same edge form a quasi-triangular Hopf algebra:
1. $R_{12}\, \overset{1}{U}_{(i,j)}\, \overset{2}{U}_{(i,j)}\, R_{12}^{-1} = \overset{2}{U}_{(i,j)}\, \overset{1}{U}_{(i,j)}$.
2. $\overset{1}{U}_{(j,i)} \cdot \overset{2}{U}_{(i,j)} = 1$. This property is sometimes called the cancellation of backtracking (cf. Eq. (2.27)).
Remarks. 1) The properties [i] and [iii] are obviously in agreement with our definition of qQCD2 (see, e.g., the pictorial illustration in Fig. 1 and the discussion preceding it). The relations [ii] follow from transformations of modules associated to vertex tangles. Of course, once made, such a move has to be compensated somewhere by its reciprocal for the whole construction to remain invariant. 2) As a closed surface can be projected onto R2 only locally, one has to use gluing homomorphisms to assemble a Riemann surface out of flat pieces. These homomorphisms match the gauge field algebra relations on different pieces and have to be added in order to complete the construction. 3) The set of the AGS relations is distinguished by the observation that they generate a lattice Kac-Moody algebra in the sense of Ref. [2]. However, they do not constitute all possible symmetries of the qQCD2 functor.
4. Concluding Remarks
1. The first natural question to ask is whether the results of this paper could be generalized to higher dimensions. The answer is certainly “No”! The reason is that, in dimensions bigger than 3, there is no natural ordering of faces incident to an edge in a complex. This restricts the class of acceptable Hopf algebras to triangular ones. Then the corresponding construction essentially coincides with the classical Wilsonian LGT.
2. It is tempting to interpret the topological invariant considered in this paper as some suitable generalization of Eq. (2.10) and the construction of topological qQCD3 as a generalization of $H^1(C, G)$. Unfortunately, we are able to say nothing constructive about it. However, in the 2-dimensional case, a notion of a quantum moduli space could presumably be formulated [11], which leaves some hope for the future.
3. We conjecture that qQCD3 with the Uq (su(n)) gauge group possesses a continuum limit equivalent to a gauge theory whose action includes both Yang-Mills and Chern-Simons terms. One could introduce a non-zero coupling constant in the root-of-unity case as well, which implies some deformation of Chern-Simons theory. The meaning of this procedure is absolutely unclear to us.
Acknowledgement. I thank V. Turaev for the fruitful discussion. This work was supported by the EEC program “Human Capital and Mobility” under the contract ERBCHBICT9941621.
References
1. Alekseev, A.: St.-Peterburg Math. J. 6, 1 (1994)
2. Alekseev, A., Faddeev, L., Semenov-Tian-Shansky, M.: Commun. Math. Phys. 149, 335 (1992)
3. Alekseev, A., Grosse, H., Schomerus, V.: Commun. Math. Phys. 172, 317 (1995)
4. Buffenoir, E., Roche, P.: Commun. Math. Phys. 170, 669 (1995)
5. Boulatov, D.V.: Mod. Phys. Lett. A7, 1629 (1992)
6. Boulatov, D.V.: Int. J. Mod. Phys. A8, 3139 (1993)
7. Boulatov, D.V.: 3D Gravity and Gauge Theories. In NATO Advanced Studies Institutes Series B, 328, 39 (1995)
8. Boulatov, D.V.: Mod. Phys. Lett. A8, 3491 (1993)
9. Crane, L.: Commun. Math. Phys. 135, 615 (1991)
10. Drinfeld, V.G.: Quantum Groups. In Proc. ICM (1987) 798; Faddeev, L.D., Reshetikhin, N., Takhtajan, L.: Leningrad Math. J. 1, 193 (1990)
11. Fock, V.V., Rosly, A.A.: Poisson structure on moduli of flat connections on Riemann surfaces and r-matrix. Preprint ITEP 72–92 (1992); Teor. mat. Fiz. 95, 228 (1993)
12. Kohno, T.: Topology 31, 203 (1992)
13. Matveev, S.V.: Math. USSR Izvestiya 31, 423 (1988)
14. Ponzano, G., Regge, T.: In: F. Bloch (ed.) Spectroscopic and group theoretical methods in physics, 1968
15. Reshetikhin, N.Yu., Turaev, V.G.: Commun. Math. Phys. 124, 307 (1989); Invent. Math. 103, 547 (1991)
16. Turaev, V.: Publ. Math. IHES 77, 121 (1993)
17. Turaev, V.G., Viro, O.Y.: Topology 31, 865 (1992); Turaev, V.G.: C.R. Acad. Sci. Paris 313, 395 (1991); J. Diff. Geom. 36, 35 (1992)
18. Turaev, V., Wenzl, H.: Int. J. Math. 4, 323 (1993)
19. Wilson, K.: Phys. Rev. D10, 2445 (1974)
20. Witten, E.: Commun. Math. Phys. 121, 351 (1989); Nucl. Phys. B311, 46 (1988/89)
21. Woronowicz, S.L.: Commun. Math. Phys. 111, 613 (1987); Vaksman, L.L., Soibelman, Ya.S.: Func. Anal. Appl. 22, 170 (1988)
Communicated by A. Connes
Commun. Math. Phys. 186, 323 – 379 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Distribution of Overlap Profiles in the One-Dimensional Kac–Hopfield Model*
Anton Bovier(1), Véronique Gayrard(2), Pierre Picco(2)
(1) Weierstraß-Institut für Angewandte Analysis und Stochastik, Mohrenstraße 39, D-10117 Berlin, Germany. E-mail: [email protected]
(2) Centre de Physique Théorique – CNRS, Luminy, Case 907, F-13288 Marseille Cedex 9, France. E-mail: [email protected]; [email protected]
Received: 14 February 1996 / Accepted: 30 September 1996
Abstract: We study a one-dimensional version of the Hopfield model with long, but finite range interactions below the critical temperature. In the thermodynamic limit we obtain large deviation estimates for the distribution of the “local” overlaps, the range of the interaction, $\gamma^{-1}$, being the large parameter. We show in particular that the local overlaps in a typical Gibbs configuration are constant and equal to one of the mean-field equilibrium values on a scale $o(\gamma^{-2})$. We also give estimates on the size of typical “jumps”, i.e. the regions where transitions from one equilibrium value to another take place. Contrary to the situation in the ferromagnetic Kac-model, the structure of the profiles is found to be governed by the quenched disorder rather than by entropy.
* Work partially supported by the Commission of the European Union under contract No. CHRX-CT930411
1. Introduction
Models of statistical mechanics where particles (or spins) interact through potentials $J_\gamma(r) \equiv \gamma^d J(\gamma r)$, $r \in \mathbb{R}^d$, with J some function that either has bounded support or is rapidly decreasing, were introduced by Kac et al. [KUH] in 1963 as links between short-range, microscopic models and mean field theories such as the van der Waals theory of the liquid-gas transition. The main success of these models can be seen in that they explain, through the Lebowitz-Penrose theorem, the origin of the Maxwell rule that has to be invoked in an ad hoc way to overcome the problem of the non-convexity of the thermodynamic functions arising in mean-field theories. Recently, there has been renewed interest in this model in the context of attempting to obtain a precise description of equilibrium configurations [COP] and their temporal evolution [DOPT] in magnetic systems at low temperatures. In [COP] large deviation techniques were used to describe precisely the profiles of local magnetization in a one-dimensional Ising model with Kac potential in infinite volume in the limit γ ↓ 0. It turned out that this apparently simple system exhibits a surprisingly rich structure when considered at appropriate scales, and it appears that the Kac-type models can still offer an interesting test ground for the study of low-temperature phenomena. The purpose of the present paper is to extend such an analysis to a class of models with random interactions.
Spin systems where spins at sites i and j interact through a random coupling $J_{ij}$ whose mean value is zero (or close to zero) are commonly termed spin glasses. The prototype models are the Sherrington-Kirkpatrick model (SK-model) [SK], where the lattice is the completely connected graph on N vertices and the couplings $J_{ij}$ are i.i.d. centered gaussian variables with variance $N^{-1/2}$, and the Edwards-Anderson model [EA], defined on the lattice $\mathbb{Z}^d$ and with $J_{ij}$ i.i.d. centered random variables with variance 1 if i and j are nearest neighbors in the lattice, whereas $J_{ij} \equiv 0$ otherwise. These systems are notoriously difficult to analyse and little is known on a firm basis about their low temperature properties. The situation is somewhat better in the case of the mean-field SK-model, for which there is at least a rather elaborate picture based on the so-called replica method (for a review see [MPV]) which is quite commonly accepted, although almost no results exist that are mathematically rigorous. Exceptions concern the high-temperature phase [ALR, FZ, CN, T1] and some self-averaging properties of the thermodynamic quantities [PS, BGP3]. For short-range models (the Edwards-Anderson model [EA]) the situation is much worse, and there exist conflicting theories on such fundamental questions as the upper and lower critical dimension and the number of low temperature phases, all of which are more or less supported by heuristic arguments (see e.g. [FH, BF, vE, NS]) and the interpretation of numerical simulations on finite systems (for a recent analysis and a critical assessment of the situation see [MPR]).
The difficulties with the SK-model soon prompted the proposal of simplified models for spin-glasses in which the statistics of the random couplings was changed while some of the features were conserved. The Mattis-model [Ma], where $J_{ij} \equiv \epsilon_i \epsilon_j$ with $\epsilon_i$ independent symmetric Bernoulli variables, was realized to be trivially equivalent to a ferromagnet and to lack the essential feature of frustration; Luttinger [Lu] amended this by setting $J_{ij} \equiv \xi_i^1 \xi_j^1 + \xi_i^2 \xi_j^2$, while Figotin and Pastur [FP1, FP2] proposed and analysed a generalization of this interaction with an arbitrary fixed number of summands and more general distributions of the random variables $\xi_i^\mu$. While these models could be solved exactly, they lacked essential features expected for real spin glasses and thus did not become very popular until they were again proposed in a quite different context by Hopfield [Ho] as models for autoassociative memory. Hopfield also considered the number of summands, M, to be a function of the size, N, of the graph (“network”) and observed numerically a drastic change of behaviour of the system as the ratio α ≡ M/N exceeded a certain threshold. This was confirmed by Amit et al. [AGS] through a theoretical analysis using the replica trick. Indeed, the Hopfield model can be seen as a family of models, depending on the growth rate of M(N), that mediates between simple ferromagnets and the SK spin-glass. The Hopfield model offers the advantage of being more amenable to a mathematically rigorous analysis than the SK-model, at least as long as M(N) does not grow too fast with N. By now we have a fairly complete understanding of the structure of the low temperature Gibbs states [BGP1, BGP3, BG4] in the case where $\lim_{N\uparrow\infty} M/N \le \alpha_0$, for $\alpha_0$ sufficiently small.
It is thus interesting to take advantage of this situation in order to get some insight into the relation between finite dimensional spin-glasses and the corresponding mean field models by studying the finite dimensional version of the Hopfield model with a Kac-type interaction. It should be noted that such a model had
Distribution of Overlap Profiles in 1-D Kac–Hopfield Model
325
already been considered by Figotin and Pastur [FP3] in 1982 in the case of bounded M . In a recent paper [BGP2] we have proven the analogue of the classical Lebowitz-Penrose theorem for this model, i.e. we have proven the convergence of the thermodynamic functions to the convex hulls of those of the mean-field model as γ ↓ 0 under the condition that limγ↓0 M (γ)/γ = 0. In the present paper we turn to the more detailed analysis of the Gibbs states of the Kac-Hopfield model and consider, as a first step, the one dimensional case along the lines of [COP]. Let us start by defining our model in a precise way and by fixing our notations. Let (, F , P) be an abstract probability space. Let ξ ≡ {ξiµ }i∈Z,µ∈N be a two-parameter family of independent, identically distributed random variables on this space such that P(ξiµ = 1) = P(ξiµ = −1) = 21 . (The precise form of the distribution of ξiµ is not really essential and far more general distributions can be considered.) We denote by σ a function σ : Z → {−1, 1} and call σi , i ∈ Z the spin at site i. We denote by S the space of all such functions, equipped with the product topology of the discrete topology in {−1, 1}. We choose the function Jγ (i − j) ≡ γJ γ|i − j| , and 1, if |x| ≤ 1/2 J(x) = . (1.1) 0, otherwise (Note that other choices for the function J(x) are possible. They must satisfy the conR ditions J(x) ≥ 0, dxJ(x) = 1, and must decay rapidly to zero on a scale of order unity. For example, the original choice of Kac was J(x) = e−|x| . For us, the choice of the characteristic function is particularly convenient.) The interaction between two spins at sites i and j will be chosen for given ω ∈ , as M (γ) 1 X µ ξi [ω]ξjµ [ω]Jγ (i − j)σi σj . (1.2) − 2 µ=1
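and the formal Hamiltonian will be
$$ H_\gamma[\omega](\sigma) = -\frac{1}{2} \sum_{(i,j)\in\mathbb{Z}\times\mathbb{Z}} \sum_{\mu=1}^{M(\gamma)} \xi_i^\mu[\omega]\, \xi_j^\mu[\omega]\, J_\gamma(i - j)\, \sigma_i \sigma_j. \tag{1.3} $$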
As usual, to make mathematically meaningful statements, we have to consider restrictions of this quantity to finite volumes. We will do this in a particular way which requires some prior discussion. Note that the parameter $\gamma$ introduces a natural length scale $\gamma^{-1}$ into our model, which is the distance over which spins interact directly. We will be interested later in the behaviour of the system on that and larger scales and will refer to it as the macroscopic scale, whereas the sites $i$ of the underlying lattice $\mathbb{Z}$ are referred to as the microscopic scale. In the course of our analysis we will have to introduce two more intermediate, mesoscopic scales, as shall be explained later. We find it convenient to measure distances and to define finite volumes in the macroscopic rather than the microscopic scale, as this allows us to deal with volumes that actually do not change with $\gamma$. Although this will require some slightly unconventional-looking definitions, we are convinced the reader will come to appreciate the advantages of our conventions later on. Let thus $\Lambda = [\lambda_-, \lambda_+] \subset \mathbb{R}$ be an interval on the real line. For points $i \in \mathbb{Z}$ referring to sites on the microscopic scale we will write
$$ i \in \Lambda \quad \text{iff} \quad \lambda_- \le \gamma i \le \lambda_+. \tag{1.4} $$
A. Bovier, V. Gayrard, P. Picco
Note that we will stick very strictly to the convention that the letters $i, j, k$ always refer to microscopic sites. The Hamiltonian corresponding to a volume $\Lambda$ (with free boundary conditions) can then be written as
$$ H_{\gamma,\Lambda}[\omega](\sigma) = -\frac{1}{2} \sum_{(i,j)\in\Lambda\times\Lambda} \sum_{\mu=1}^{M(\gamma)} \xi_i^\mu[\omega]\, \xi_j^\mu[\omega]\, J_\gamma(i - j)\, \sigma_i \sigma_j. \tag{1.5} $$
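To make the conventions (1.1), (1.4) and (1.5) concrete, the following is a minimal sketch that evaluates the free-boundary Hamiltonian by brute force on a small volume. The function names (`kac_kernel`, `sites_in_volume`, `hamiltonian`) and the dictionary-based storage of spins and patterns are illustrative assumptions, not notation from the paper; the double loop is only meant for very small systems.

```python
import math

def kac_kernel(gamma, d):
    """J_gamma(d) = gamma * J(gamma*|d|), with J the indicator of |x| <= 1/2 as in (1.1)."""
    return gamma if gamma * abs(d) <= 0.5 else 0.0

def sites_in_volume(gamma, lam_minus, lam_plus):
    """Microscopic sites i with lam_minus <= gamma*i <= lam_plus, cf. (1.4)."""
    lo = math.floor(lam_minus / gamma) - 1
    hi = math.ceil(lam_plus / gamma) + 1
    return [i for i in range(lo, hi + 1) if lam_minus <= gamma * i <= lam_plus]

def hamiltonian(sigma, xi, gamma, sites):
    """Free-boundary Hamiltonian (1.5).

    sigma: dict site -> +1/-1, xi: dict site -> list of M pattern values (hypothetical layout).
    """
    total = 0.0
    for i in sites:
        for j in sites:
            # (xi_i, xi_j) is the sum over the M patterns of xi_i^mu * xi_j^mu
            overlap = sum(a * b for a, b in zip(xi[i], xi[j]))
            total += overlap * kac_kernel(gamma, i - j) * sigma[i] * sigma[j]
    return -0.5 * total

if __name__ == "__main__":
    gamma = 0.5
    sites = sites_in_volume(gamma, 0.0, 1.5)
    xi = {i: [1] for i in sites}      # a single pattern, all entries +1
    sigma = {i: 1 for i in sites}     # all spins up
    print(sites, hamiltonian(sigma, xi, gamma, sites))
```

Note that the global spin flip $\sigma \to -\sigma$ leaves the value unchanged, as it must for a Hamiltonian that is quadratic in the spins.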
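We shall also write in the same spirit $\mathcal{S}_\Lambda \equiv \times_{i\in\Lambda} \{-1, 1\}$ and denote its elements by $\sigma_\Lambda$. The interaction between the spins in $\Lambda$ and those outside $\Lambda$ will be written as
$$ W_{\gamma,\Lambda}[\omega](\sigma_\Lambda, \sigma_{\Lambda^c}) = -\sum_{i\in\Lambda} \sum_{j\in\Lambda^c} \sum_{\mu=1}^{M(\gamma)} \xi_i^\mu[\omega]\, \xi_j^\mu[\omega]\, J_\gamma(i - j)\, \sigma_i \sigma_j. \tag{1.6} $$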
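The finite volume Gibbs measure for such a volume $\Lambda$ with fixed external configuration $\sigma_{\Lambda^c}$ (the ‘local specification’) is then defined by assigning to each $\sigma_\Lambda \in \mathcal{S}_\Lambda$ the mass
$$ \mathcal{G}^{\sigma_{\Lambda^c}}_{\beta,\gamma,\Lambda}[\omega](\sigma_\Lambda) \equiv \frac{1}{Z^{\sigma_{\Lambda^c}}_{\beta,\gamma,\Lambda}[\omega]}\, e^{-\beta\left[H_{\gamma,\Lambda}[\omega](\sigma_\Lambda) + W_{\gamma,\Lambda}[\omega](\sigma_\Lambda, \sigma_{\Lambda^c})\right]}, \tag{1.7} $$
where $Z^{\sigma_{\Lambda^c}}_{\beta,\gamma,\Lambda}[\omega]$ is a normalizing factor usually called the partition function. We will also denote by
$$ \mathcal{G}_{\beta,\gamma,\Lambda}[\omega](\sigma_\Lambda) \equiv \frac{1}{Z_{\beta,\gamma,\Lambda}[\omega]}\, e^{-\beta H_{\gamma,\Lambda}[\omega](\sigma_\Lambda)} \tag{1.8} $$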
the Gibbs measure with free boundary conditions. It is crucial to keep in mind that we are always interested in taking the infinite volume limit $\Lambda \uparrow \mathbb{R}$ first for fixed $\gamma$ and to study the asymptotic of the result as $\gamma \downarrow 0$ (this is sometimes referred to as the ‘Lebowitz-Penrose limit’).

In [BGP2] we have studied the distribution of the global ‘overlaps’ $m_\Lambda^\mu(\sigma) \equiv \frac{\gamma}{|\Lambda|} \sum_{i\in\Lambda} \xi_i^\mu \sigma_i$ under the Gibbs measure (1.7). Here we are going into more detail in that we want to analyse the distribution of local overlaps. To do this we will actually have to introduce two intermediate mesoscopic length scales, $1 \ll \ell(\gamma) \ll L(\gamma) \ll \gamma^{-1}$. Note that both $\ell(\gamma)$ and $L(\gamma)$ will tend to infinity as $\gamma \downarrow 0$ while $\ell(\gamma)/L(\gamma)$ as well as $\gamma L(\gamma)$ tend to zero. We will assume that $\ell$, $L$ and $\gamma^{-1}$ are integer multiples of each other. Further conditions on these scales will be imposed later. To simplify notations, the dependence on $\gamma$ of $\ell$ and $L$ will not be made explicit in the sequel. We now divide the real line into boxes of length $\gamma\ell$ and $\gamma L$, respectively, with the first box, called 0, being centered at the origin. The boxes of length $\gamma\ell$ will be called $x$, $y$, or $z$, and labelled by the integers. That is, the box $x$ is the interval of length $\gamma\ell$ centered at the point $\gamma\ell x$. No confusion should arise from the fact that we use the symbol $x$ as denoting both the box and its label, since again $x, y, z$ are used exclusively for this type of boxes. In the same way, the letters $r, s, t$ are reserved for the boxes of length $\gamma L$, centered at the points $\gamma L\mathbb{Z}$, and finally we reserve $u, v, w$ for boxes of length one centered at the integers. With these conventions, it makes sense to write e.g. $i \in x$ as shorthand for $\ell x - \ell/2 \le i \le \ell x + \ell/2$, etc.¹

¹ On a technical level we will in fact have to use even more auxiliary intermediate scales, but as in [COP] we will try to keep this under the carpet as far as possible.

In this spirit we define the $M(\gamma)$-dimensional vectors $m_\ell(x, \sigma)$ and $m_L(r, \sigma)$ whose $\mu$th components are
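$$ m_\ell^\mu(x, \sigma) \equiv \frac{1}{\ell} \sum_{i\in x} \xi_i^\mu \sigma_i \tag{1.9} $$
and
$$ m_L^\mu(r, \sigma) \equiv \frac{1}{L} \sum_{i\in r} \xi_i^\mu \sigma_i, \tag{1.10} $$
respectively. Note that we have, for instance, that
$$ m_L^\mu(r, \sigma) = \frac{\ell}{L} \sum_{x\in r} m_\ell^\mu(x, \sigma). \tag{1.11} $$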
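The relation (1.11) between the overlaps on the two mesoscopic scales can be checked on a toy configuration. The sketch below is illustrative only: it assumes half-open blocks $[\text{start}, \text{start}+\text{length})$ of consecutive sites rather than the centered boxes of the text, stores patterns as a list of lists `xi[i][mu]`, and uses hypothetical helper names.

```python
import random

def block_overlap(xi, sigma, start, length):
    """Components mu of (1/length) * sum_{i in block} xi_i^mu * sigma_i, cf. (1.9)-(1.10).

    The block is the half-open range [start, start + length) (an assumption of this sketch).
    """
    M = len(xi[start])
    return [
        sum(xi[i][mu] * sigma[i] for i in range(start, start + length)) / length
        for mu in range(M)
    ]

def coarse_from_fine(xi, sigma, start, ell, L):
    """Right-hand side of (1.11): (ell/L) times the sum of the fine overlaps inside one coarse box."""
    if L % ell != 0:
        raise ValueError("L must be an integer multiple of ell")
    M = len(xi[start])
    acc = [0.0] * M
    for k in range(L // ell):
        fine = block_overlap(xi, sigma, start + k * ell, ell)
        acc = [a + f for a, f in zip(acc, fine)]
    return [ell / L * a for a in acc]

if __name__ == "__main__":
    rng = random.Random(1)
    M, ell, L = 3, 4, 12
    xi = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(L)]
    sigma = [rng.choice((-1, 1)) for _ in range(L)]
    print(block_overlap(xi, sigma, 0, L))
    print(coarse_from_fine(xi, sigma, 0, ell, L))
```

Both printed vectors agree up to floating-point rounding, which is exactly the content of (1.11).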
We will also have to be able to indicate the box on some larger scale containing a specified box on the smaller scale. Here we write simply, e.g., $r(x)$ for the unique box of length $L$ that contains the box $x$ of length $\ell$. Expressions like $x(i)$, $u(y)$ or $s(k)$ have corresponding meanings.

Remark. It is easy to connect our notation to the continuum notation used in [COP]. For instance, (1.9) can be rewritten as
$$ m_\ell^\mu(x, \sigma) = \frac{1}{\gamma\ell}\, \gamma \sum_{i\in x} \xi_i^\mu \sigma_i, \tag{1.12} $$
where $\gamma \sum_{i\in x}$ can be interpreted as a Riemann sum; the same occurs in all other expressions.

The role of the different scales will be the following. We will be interested in the typical profiles of the overlaps on the scale $L$, i.e. the typical $m_L(r, \sigma)$ as a function of $r$; we will control these functions within volumes on the macroscopic scale $\gamma^{-1}$. The smaller mesoscopic scale $\ell$ enters only in an auxiliary way. Namely, we will use a block-spin approximation of the Hamiltonian with blocks of that size. We will see that it is quite crucial to use a much smaller scale for that approximation than the scale on which we want to control the local overlaps. This was noted already in [COP].

We want to study the probability distribution induced by the Gibbs measure on the functions $m_L(r)$ through the map defined by (1.10). The corresponding measure space is for fixed $\gamma$ simply the discrete space $\{-1, -1 + 2/L, \dots, 1 - 2/L, 1\}^{M(\gamma)\times\mathbb{Z}}$, which should be equipped with the product topology. Since this topology is quite non-uniform with respect to $\gamma$ (note that both $L$ and $M$ tend to infinity as $\gamma \downarrow 0$), this is, however, not well adapted to take the limit $\gamma \downarrow 0$. Thus we replace the discrete topology on $\{-1, -1 + 2/L, \dots, 1 - 2/L, 1\}^{M(\gamma)}$ by the Euclidean $\ell_2$-topology (which remains meaningful in the limit), and the product topology corresponding to $\mathbb{Z}$ is replaced by the weak local $L^2$ topology w.r.t. the measure $\gamma L \sum_{r\in\cdot}$; that is to say, a family of profiles $m_L^n(r)$ converges to the profile $m_L(r)$ iff for all finite $R \in \mathbb{R}$, $\gamma L \sum_{r\in[-R,R]} \|m_L^n(r) - m_L(r)\|_2 \downarrow 0$ as $n \uparrow \infty$. While for all finite $\gamma$ this topology is completely equivalent to the product topology of the discrete topology, the point here is that it is meaningful to ask for uniform convergence with respect to the parameter $\gamma$. We will denote this space by $\mathcal{T}_\gamma$, or simply $\mathcal{T}$, and call it the space of profiles (on scale $L$).
Before presenting our results, it may be useful to discuss in a somewhat informal way the heuristic expectations based on the work of [COP] and the results known from [BGP1, BGP3, BG4]. In [COP] it was shown that the typical magnetization profiles are such that almost everywhere, $m_L(r, \sigma)$ is very close to one of the two equilibrium
values of the mean field model, $\pm a(\beta)$; moreover, the profile is essentially constant over macroscopic distances of the order $e^{\gamma^{-1}}$. The distances between jumps are actually independent exponentially distributed random variables. Heuristically, this picture is not too difficult to understand. First, one approximates the Hamiltonian by a block-spin version by replacing the interaction potential by a function that is constant over blocks of length $L$. Ignoring the error term, the resulting model depends on $\sigma$ only through the variables $m_L(r, \sigma)$. In fact, at each block $r$ there is a little mean-field model and these mean field models interact through a ferromagnetic interaction of the form $J_{\gamma L}(r - s)(m_L(r) - m_L(s))^2$. This interaction can only bias a given block to choose between the two possible equilibrium values, but never prevent it from taking on an equilibrium value over a longer interval. Moreover, it tends to align the blocks. To jump from one equilibrium into the other costs in fact an energy of the order of $\gamma^{-1}$, so that the probability that this happens in a given unit interval is of the order $e^{-\gamma^{-1}}$. This explains why the entropy can force this to happen only on distances of the order of the inverse of this value. Finally, the Markovian character of a one-dimensional model leaves only a Poisson distribution as a candidate for the distribution of the jumps. The main difficulty in turning these arguments into rigorous proofs lies in the control of the error terms.

It is crucial for the above picture that there is a complete symmetry between the two equilibrium states of the mean field model. As we have shown in [BGP2], the Kac-Hopfield model can be approximated by a blocked model just the same, and in [BGP1] we have shown that the mean field Hopfield model has its equilibrium states sharply concentrated at the $2M$ points
$$ m^{(\mu,s)} \equiv s\, a(\beta)\, e^\mu, \qquad \mu \in \{1, \dots, M\},\ s \in \{-1, 1\}, \tag{1.13} $$
where $e^\mu$ is the $\mu$th standard unit vector and $a(\beta)$ is the largest solution of $a = \tanh(\beta a)$. Thus we can again expect the overlap profiles to be constant over long distances, close to one of these values. What is different here, however, is that due to the disorder the different equilibrium positions are not entirely equivalent. We have shown in [BGP3] that the fluctuations are only of the order of the square root of the volume, but since they are independent from block to block, they can add up over a long distance and effectively enforce jumps to different equilibrium positions at distances that are much shorter than those between entropic jumps. In fact, within the blocked approximation, it is not hard to estimate that the typical distance over which the profiles remain constant should be of the order $\gamma^{-1}$ on the macroscopic scale (i.e. $\gamma^{-2}$ on the microscopic scale). Using concentration of measure estimates in a form developed by M. Talagrand [T2], we extend these estimates to the full model.

Our main results on the typical profiles can then be summarized (in a slightly informal way) as follows: Assume that $\lim_{\gamma\downarrow 0} \gamma M(\gamma) = 0$. Then there is a scale $L \ll \gamma^{-1}$ such that with $\mathbb{P}$-probability tending to one (as $\gamma \downarrow 0$) the following holds:

(i) In any given macroscopic finite volume, in any configuration that is “typical” with respect to the infinite volume Gibbs measure, for “most” blocks $r$, $m_L(r, \sigma)$ is very close to one of the values $\pm a(\beta) e^\mu$ (we will say that $m_L(u, \sigma)$ is “close to equilibrium”).

(ii) In any macroscopic volume $\Delta$ that is small compared to $\gamma^{-1}$, in a typical configuration, there is at most one connected subset $J$ (called a “jump”) with $|J| \sim \frac{1}{\gamma L}$ on which $m_L$ is not close to equilibrium. Moreover, if such a jump occurs, then there exist $(s_1, \mu_1)$ and $(s_2, \mu_2)$ such that for all $u \in \Delta$ to the left of $J$, $m_L(u, \sigma) \sim s_1 a(\beta) e^{\mu_1}$ and for all $u \in \Delta$ to the right of $J$, $m_L(u, \sigma) \sim s_2 a(\beta) e^{\mu_2}$.
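The equilibrium value $a(\beta)$ entering (1.13) has no closed form, but it is easy to approximate. The following is a minimal sketch, assuming a plain fixed-point iteration started from $a = 1$; the function name and the stopping parameters are illustrative choices, not part of the paper. Since $\tanh(\beta a)$ is increasing and concave for $a \ge 0$, the iterates decrease monotonically to the largest solution; for $\beta \le 1$ that solution is $0$, and convergence becomes slow as $\beta$ approaches $1$.

```python
import math

def mean_field_magnetization(beta, iterations=10_000, tol=1e-15):
    """Approximate the largest solution a(beta) of a = tanh(beta * a).

    Starts from a = 1 and iterates a -> tanh(beta * a); stops when successive
    iterates differ by less than tol or after the given number of iterations.
    """
    a = 1.0
    for _ in range(iterations):
        nxt = math.tanh(beta * a)
        if abs(nxt - a) < tol:
            return nxt
        a = nxt
    return a

if __name__ == "__main__":
    for beta in (0.5, 1.5, 2.0, 4.0):
        print(beta, mean_field_magnetization(beta))
```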
The precise statement of these facts will require more notation and is thus postponed to Sect. 6 where it will be stated as Theorem 6.15. That section contains also the large deviation estimates that are behind these results. We should mention that we have no result that would prove the existence of a “jump” in a sufficiently large region. We discuss this problem in Sect. 7 in some more detail. We also remark that the condition $\lim_{\gamma\downarrow 0} \gamma M(\gamma) = 0$ will be imposed throughout the paper. It could be replaced with $\limsup_{\gamma\downarrow 0} \gamma M(\gamma) \le \alpha_c(\beta)$ for some strictly positive $\alpha_c(\beta)$ for all $\beta > 1$. However, an actual estimate of this constant would be outrageously tedious and does not really appear, in our view, to be worth the trouble.

The remainder of the paper is organized in the following way. The next two sections provide some technical tools that will be needed throughout. Sect. 2 introduces the mesoscopic approximation of the Hamiltonian and corresponding error estimates. Sect. 3 contains large deviation estimates for the standard Hopfield model that are needed to analyse the mesoscopic approximation introduced before. Here we make use of some fundamental results from [BGP2] and [BG3] but present them in a somewhat different form. In Sect. 4 we begin the actual analysis of typical profiles. Here we show that for events that are local, we can express their probabilities in terms of a finite volume measure with random boundary conditions (see Corollary 4.2). In Sect. 5 we derive estimates on the random fluctuations of the free energies corresponding to these measures. In Sect. 6 we make use of these estimates to show that local events can be analysed using the mesoscopic approximation introduced in Sect. 2. This section is divided into three parts. Sect. 6.1 contains an analysis of measures with free boundary condition in macroscopic volumes of order $o(\gamma^{-1})$. It is shown that they are asymptotically concentrated on constant profiles (see Theorem 6.1).
This result is already quite instructive, and technically rather easy. In Sections 6.2 and 6.3 the measures with non-zero boundary conditions are studied. In Sect. 6.2 the case where the boundary conditions are the same on both sides of the box is studied. It is shown that here, too, the profiles are typically constant and take the value favored by the boundary conditions (see Theorem 6.9). In Sect. 6.3 the case with different boundary conditions is treated. Here we show that the typical profile has exactly one “jump” and is constant otherwise (see Theorem 6.14). The results of Sections 4 and 6 are then combined to yield Theorem 6.15, which gives a precise statement of the result announced above. In Sect. 7 we discuss some of the open points of our analysis. In particular we argue that typical profiles are non-constant on a sufficiently large scale and that their precise form is entirely disorder determined (up to the global sign). We also formulate some conjectures for the model in dimensions greater than one. In Appendix A we give a proof of a technical estimate on the minimal energy associated to profiles that contain “jumps” between different equilibrium positions that is needed in Sect. 6.
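2. Block-Spin Approximations

While mean-field models are characterized by the fact that the Hamiltonian is a function of global averages of the spin variables, in Kac-models the Hamiltonian is “close”, but not identical to a function of “local” averages. In this section we make this statement precise by introducing the block version of the Hamiltonian and deriving the necessary estimates on the error terms. We define
$$ E^{\ell}_{\gamma,\Lambda}(m) \equiv -\frac{1}{2}\, \gamma\ell \sum_{(x,y)\in\Lambda\times\Lambda} J_{\gamma\ell}(x - y)\, (m(x), m(y)), \tag{2.1} $$
$$ E^{\ell,L}_{\gamma,\Lambda}(m, \tilde m) \equiv -\gamma\ell L \sum_{x\in\Lambda} \sum_{r\in\Lambda^c} J_\gamma(\ell x - L r)\, (m(x), \tilde m(r)), \tag{2.2} $$
and
$$ \Delta H^{\ell}_{\gamma,\Lambda}(\sigma_\Lambda) \equiv H_{\gamma,\Lambda}(\sigma_\Lambda) - \gamma^{-1} E^{\ell}_{\gamma,\Lambda}(m_\ell(\sigma)), \tag{2.3} $$
$$ \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\Lambda^c}) \equiv W_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\Lambda^c}) - \gamma^{-1} E^{\ell,L}_{\gamma,\Lambda}(m_\ell(\sigma), m_L(\sigma)). \tag{2.4} $$
For our purposes, we only need to consider volumes $\Lambda$ of the form $\Lambda = [\lambda_-, \lambda_+]$ with $|\Lambda| > 1$. For such volumes we set $\partial\Lambda \equiv \partial^-\Lambda \cup \partial^+\Lambda$, $\partial^-\Lambda \equiv [\lambda_- - \frac{1}{2}, \lambda_-)$, and $\partial^+\Lambda \equiv (\lambda_+, \lambda_+ + \frac{1}{2}]$. Thus, obviously, $W_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\Lambda^c}) = W_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial\Lambda})$ and $\Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\Lambda^c}) = \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial\Lambda})$.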
Lemma 2.1. For all δ > 0, i) "
# √ √ |3| γ |1H3 (σ)| ≥ γ`(γ)8 2(log 2 + δ) + 2 2γM (γ) ≤ 16e−δ γ , P sup |3| σ∈S3 (2.5) ii) h `,L P supσ∈S3∪∂ 3 γ|1Wγ, 3 (σ3 , σ∂ 3 )| > (4γL(γ)(log 2 + δ) + γM (γ)) 1 +
` L
21 i
δ
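≤ 8e− γ . (2.6)

Proof. We will give the proof of (ii) only; the proof of (i) is similar and can be found in [BGP2]. Since $|\Lambda| > 1$, the spins inside $\partial^-\Lambda$ do not interact with those inside $\partial^+\Lambda$ and $\Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial\Lambda})$ can be written as
$$ \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial\Lambda}) = \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^-\Lambda}) + \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}), \tag{2.7} $$
where
$$ \Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^\pm\Lambda}) = -\sum_{x\in\Lambda} \sum_{r\in\partial^\pm\Lambda} \sum_{i\in x} \sum_{j\in r} \left[J_\gamma(i - j) - J_\gamma(\ell x - L r)\right] (\xi_i, \xi_j)\, \sigma_i \sigma_j. \tag{2.8} $$
Both terms in (2.7) being treated similarly, we will only consider $\Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda})$. First notice that since
$$ J_\gamma(i - j) - J_\gamma(\ell x - L r) = \gamma \left[ \mathbf{1}_{\{|i-j|\le(2\gamma)^{-1}\}} \mathbf{1}_{\{|\ell x - L r|>(2\gamma)^{-1}\}} - \mathbf{1}_{\{|i-j|>(2\gamma)^{-1}\}} \mathbf{1}_{\{|\ell x - L r|\le(2\gamma)^{-1}\}} \right] \tag{2.9} $$
we can write $\Delta W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}) = \gamma \left[ \Delta^1 W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}) - \Delta^2 W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}) \right]$ with
$$ \Delta^1 W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}) = -\sum_{x\in\Lambda} \sum_{r\in\partial^+\Lambda} \sum_{i\in x} \sum_{j\in r} \mathbf{1}_{\{|i-j|\le(2\gamma)^{-1}\}} \mathbf{1}_{\{|\ell x - L r|>(2\gamma)^{-1}\}} (\xi_i, \xi_j)\, \sigma_i \sigma_j \tag{2.10} $$
and
$$ \Delta^2 W^{\ell,L}_{\gamma,\Lambda}(\sigma_\Lambda, \sigma_{\partial^+\Lambda}) = -\sum_{x\in\Lambda} \sum_{r\in\partial^+\Lambda} \sum_{i\in x} \sum_{j\in r} \mathbf{1}_{\{|i-j|>(2\gamma)^{-1}\}} \mathbf{1}_{\{|\ell x - L r|\le(2\gamma)^{-1}\}} (\xi_i, \xi_j)\, \sigma_i \sigma_j. $$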
(2.11) `,L `,L 2 + + (σ , σ ) and 1 W (σ , σ ) can be treated in the same Again, both terms 11 Wγ, 3 ∂ 3 ∂ 3 3 γ,3 3 way so that we only present an estimate of the former. Using the identity 1I{|i−j|≤(2γ)−1 } 1I{|`x−Lr|>(2γ)−1 } = 1I{|i−j|≤(2γ)−1 } 1I{(2γ)−1 P sup 4 4 σ∈S3∪∂ + 3 µ=1 r∈∂ + 3 (2.15) where the probability in the right-hand side is independent of the chosen spin configuration σ3∪∂ + 3 . For convenience we will choose the configuration whose spins are all one’s. Using the exponential Markov inequality together with the independence, we get # " `,L 2 1 P sup γ |1 Wγ,3 (σ3 , σ∂ + 3 )| > 4 σ∈S3 ∪∂ + 3 " #M Y 1 (2.16) tgγ, (r) (γ −1 +1) −tγ −2 4 inf e Ee 3 . ≤2 "
`,L γ 2 |11 Wγ, 3 (σ3 , σ∂ + )|
t≥0
r∈∂ + 3
1 + Thus we have to estimate the Laplace-transform of gγ, 3 (r) for any r ∈ ∂ 3. We write
Ee
1 tgγ, 3 (r)
X = E exp t ξj1 j∈r
X
X
x∈3: i∈x (2γ)−1 1, we set n τ + = inf{u ≥ v+ : η(u, σ) 6= 0} (4.2) ∞ if no such u exists and τ− =
n
sup{u ≤ v− : η(u, σ) 6= 0} −∞ if no such u exists.
(4.3)
For a given configuration $\sigma$, $\tau^\pm$ indicates the position of the first unit interval to the right, resp. the left, of $V$ where the configuration $\sigma$ is close to equilibrium. Let us introduce the indices $\mu^+, \mu^-, s^+, s^-, w_+, w_-$, where $\mu^\pm \in \{1, \dots, M(\gamma)\}$, $s^\pm \in \{-1, 1\}$ and $w_+ \in [v_+, \infty)$, $w_- \in (-\infty, v_-]$. In the sequel, if not otherwise specified, all sums and unions over these indices run over the above sets. The expressions $(\mu^\pm, s^\pm)$, resp. $(\mu^\pm, s^\pm, w_\pm)$, are abbreviations for $(\mu^+, s^+, \mu^-, s^-)$, resp. $(\mu^+, s^+, w_+, \mu^-, s^-, w_-)$. With these notations we define a partition of the configuration space $\mathcal{S}$ whose atoms are given by
$$ \mathcal{A}(\mu^\pm, s^\pm, w_\pm) \equiv \left\{ \sigma \in \mathcal{S} : \tau^+ = w_+,\ \tau^- = w_-,\ \eta(\tau^+, \sigma) = s^+ e^{\mu^+},\ \eta(\tau^-, \sigma) = s^- e^{\mu^-} \right\} \tag{4.4} $$
and we denote by
$$ \mathcal{S}_R = \bigcup_{\substack{\mu^\pm, s^\pm, w_\pm \\ 0 \le \pm(w_\pm - v_\pm) \le R}} \mathcal{A}(\mu^\pm, s^\pm, w_\pm). \tag{4.5} $$
(4.6)
where A+ (R) ≡ {σ ∈ S : τ + > v+ + R} = {σ ∈ S : ∀v+ ≤w≤v+ +R η(w, σ) = 0}
(4.7)
and A− (R) ≡ σ ∈ S : τ − < v− − R = σ ∈ S : ∀v− −R≤w≤v− η(w, σ) = 0 . (4.8) Before stating the main results of this chapter we need some further notations. For given indices µ± , s± , w± we write 1 ≡ [w− + 21 , w+ − 21 ] and we set o n b ± , s± , w± ) ≡ σ ∈ S : η(w+ , σ) = s+ eµ+ , η(w− , σ) = s− eµ− . A(µ
(4.9)
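We define the Gibbs measure on $\Delta$ with mesoscopic boundary conditions $m^{(\mu^+,s^+)}$ at $w_+$ and $m^{(\mu^-,s^-)}$ at $w_-$ as the measure that assigns, to each $\sigma_\Delta \in \mathcal{S}_\Delta$, the mass
$$ \mathcal{G}^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}[\omega](\sigma_\Delta) = \frac{1}{Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}[\omega]}\, e^{-\beta\left[H_{\gamma,\Delta}[\omega](\sigma_\Delta) + W_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^\pm,s^\pm)})\right]}, \tag{4.10} $$
where $Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}[\omega]$ is the corresponding normalization factor and
$$ W_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^\pm,s^\pm)}) \equiv -\sum_{i\in\Delta} s^- a(\beta)\, \xi_i^{\mu^-} \sigma_i \sum_{j\in\partial^-\Delta} J_\gamma(i - j) - \sum_{i\in\Delta} s^+ a(\beta)\, \xi_i^{\mu^+} \sigma_i \sum_{j\in\partial^+\Delta} J_\gamma(i - j). \tag{4.11} $$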
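Proposition 4.1. Let $F$ be a cylinder event with base contained in $[v_-, v_+]$. Then

i) There exists a positive constant $c$ such that, for all integer $R$, there exists $\Omega_R$ with $\mathbb{P}(\Omega_R) \ge 1 - R e^{-c\gamma^{-1}}$ such that for all $\mu^\pm, s^\pm, w_\pm$ with $v_+ \le w_+ \le v_+ + R$, $v_- - R \le w_- \le v_-$, for all $\omega \in \Omega_R$ and for all $\Lambda \supset [v_- - R, v_+ + R]$,
$$ \mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(F \cap \mathcal{A}(\mu^\pm, s^\pm, w_\pm)\right) \le \mathcal{G}^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}[\omega](F)\; \mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\widehat{\mathcal{A}}(\mu^\pm, s^\pm, w_\pm)\right) e^{8\beta\gamma^{-1}(\zeta + 2\gamma L)} \tag{4.12} $$
and for any $u_+ \ge v_+$, $u_- \le v_-$,
$$ \mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(F \cap \widehat{\mathcal{A}}(\mu^\pm, s^\pm, u_\pm)\right) \ge \mathcal{G}^{(\mu^\pm,s^\pm)}_{\beta,\gamma,[u_-,u_+]}[\omega](F)\; \mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\widehat{\mathcal{A}}(\mu^\pm, s^\pm, u_\pm)\right) e^{-8\beta\gamma^{-1}(\zeta + 2\gamma L)}. \tag{4.13} $$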
i h √ α 1 , 2 a(β) set (ζ) ≡ c1 a(β)2 ζ 2 , where c0 and c1 are the conii) For ζ ∈ c0 a(β)
stants appearing in Proposition 3.2. There exist a positive constant c0 such 0 that for all integer R, there exists R with P(R ) ≥ 1 − γ −1 Re−c Mqand
there exist finite positive constants c2 and c3 such that if ζ(ζ)γL > 2c2 then for all ω ∈ R and 3 ⊃ [v− − R, v+ + R], c ) ≤ exp (−βLRc3 ζ(ζ)) . Gβ,γ,3 [ω](F ∩ SR
M ` ,
(4.14)
Corollary 4.2. Let F be a cylinder event with base contained in [v− , v+ ]. Then there exist a positive constant c0 such that for all integer R, there exists R with 0 M P(R ) ≥ 1 − γ −1 Re−c q and there exist finite positive constants c1 and c2 such that if ζ(ζ)γL > 2c1
then for all ω ∈ R and 3 ⊃ [v− − R, v+ + R],
M ` ,
Gβ,γ,3 [ω](F ) ≤
P
±
µ± ,s±
−R<w− ≤v− v+ ≤w+ δ ≤ K exp −N 2K In particular,
"
P kA(0)k ≥
q 1+
2 M N
#
δ2 . (1 + δ) ≤ K exp −N 2K
(4.24)
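Proof. For the proof of this lemma, see [BG3], Sect. 2. Somewhat weaker estimates were previously obtained in [Ge,ST,BG1,BGP1].

Lemma 4.4. Let $\{X_i(n), i \ge 1\}$ be independent random variables with $X_i(n) \ge 0$, satisfying, for any $z \ge 0$,
$$ \mathbb{P}\left[X_i(n) \ge (1 + z) a_n\right] \le c_n e^{-z b_n}, \tag{4.25} $$
where $a_n, b_n, c_n$ are strictly positive parameters satisfying $b_n \uparrow \infty$ and $(\ln c_n)/b_n \downarrow 0$ as $n \uparrow \infty$. Then,
$$ \mathbb{E}(X_i(n)) \le a_n \left(1 + \frac{\ln c_n}{b_n}\right) \tag{4.26} $$
and, for all $\epsilon > 0$ and $n$ sufficiently large,
$$ \mathbb{P}\left[\frac{1}{K} \sum_{i=1}^{K} X_i(n) \ge (1 + z + \epsilon) a_n\right] \le e^{-z b_n (1 - \eta) K}, \tag{4.27} $$
where $\eta \equiv \eta(\epsilon, b_n, c_n) \downarrow 0$ as $n \uparrow \infty$.

Proof. Setting $Y_i(n) \equiv X_i(n)/a_n$, we have
$$ \mathbb{E}(Y_i(n)) = \mathbb{E} \int_0^\infty \mathbf{1}_{\{y \le Y_i(n)\}}\, dy = \int_0^\infty \mathbb{P}(Y_i(n) \ge y)\, dy. \tag{4.28} $$
Thus, for any $x \ge 0$,
$$ \mathbb{E}(Y_i(n)) \le 1 + x + \int_{1+x}^\infty \mathbb{P}(Y_i(n) \ge y)\, dy. \tag{4.29} $$
Performing the change of variable $y = 1 + z$ and making use of (4.25) yields
$$ \mathbb{E}(Y_i(n)) \le 1 + x + \int_x^\infty c_n e^{-b_n z}\, dz = 1 + x + \frac{c_n}{b_n} e^{-x b_n}. \tag{4.30} $$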
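The optimization over $x$ of the right-hand side of (4.30) can be sanity-checked numerically. The sketch below uses hypothetical function names and arbitrary sample values of $b_n$ and $c_n$ (with $c_n \ge 1$, so that the stationary point is nonnegative); it is an illustration, not part of the proof. At the stationary point $x = (\ln c_n)/b_n$ the bound evaluates to $1 + (1 + \ln c_n)/b_n$, which agrees with the form of (4.26) up to a term $1/b_n$ that vanishes as $b_n \uparrow \infty$.

```python
import math

def mean_bound(x, b, c):
    """Right-hand side of (4.30): 1 + x + (c / b) * exp(-x * b)."""
    return 1.0 + x + (c / b) * math.exp(-x * b)

def optimal_x(b, c):
    """Stationary point of mean_bound in x, namely x = ln(c) / b."""
    return math.log(c) / b

if __name__ == "__main__":
    b, c = 50.0, 7.0                      # sample values, chosen arbitrarily
    x_star = optimal_x(b, c)
    print(x_star, mean_bound(x_star, b, c), 1.0 + (1.0 + math.log(c)) / b)
```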
Now, choosing $x = (\ln c_n)/b_n$ minimizes the r.h.s. of (4.30) and gives (4.26). To prove (4.27) we first use that, by the exponential Markov inequality, for any $t > 0$,
$$ \mathbb{P}\left[\frac{1}{K} \sum_{i=1}^{K} Y_i(n) \ge 1 + z + \epsilon\right] \le e^{-K t (1 + z + \epsilon)} \prod_{i=1}^{K} \mathbb{E}\, e^{t Y_i(n)}. \tag{4.31} $$
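To estimate the Laplace transform of $Y_i(n)$, we write that, using integration by parts,
$$ \mathbb{E}\, e^{t Y_i(n)} = \mathbb{E}\left(1 + \int_0^\infty t e^{t y}\, \mathbf{1}_{\{y \le Y_i(n)\}}\, dy\right) = 1 + \int_0^\infty t e^{t y}\, \mathbb{P}(Y_i(n) \ge y)\, dy \tag{4.32} $$
and, for any $x \ge 0$,
$$ \begin{aligned} \mathbb{E}\, e^{t Y_i(n)} &= 1 + \int_0^{1+x} t e^{t y}\, \mathbb{P}(Y_i(n) \ge y)\, dy + \int_{1+x}^\infty t e^{t y}\, \mathbb{P}(Y_i(n) \ge y)\, dy \\ &\le e^{t(1+x)} + \int_{1+x}^\infty t e^{t y}\, \mathbb{P}(Y_i(n) \ge y)\, dy \\ &\le e^{t(1+x)} + c_n t e^{t} \int_x^\infty e^{-z(b_n - t)}\, dz, \end{aligned} \tag{4.33} $$
where we used (4.25) in the last line after having performed the change of variable $y = 1 + z$. Choosing $t = b_n(1 - \eta)$ for some $0 < \eta \le 1$, we get
$$ \begin{aligned} \mathbb{E}\, e^{t Y_i(n)} &\le e^{b_n(1-\eta)(1+x)} \left[1 + c_n \frac{1-\eta}{\eta}\, e^{-x b_n}\right] \\ &\le \exp\left(b_n(1-\eta)(1+x) + c_n \frac{1-\eta}{\eta}\, e^{-x b_n}\right), \end{aligned} \tag{4.34} $$
and finally, inserting (4.34) in (4.31) yields
$$ \mathbb{P}\left[\frac{1}{K} \sum_{i=1}^{K} Y_i(n) \ge 1 + z + \epsilon\right] \le e^{-z b_n (1-\eta) K} \exp\left(-(1-\eta) K \left[b_n(\epsilon - x) - \frac{c_n}{\eta}\, e^{-x b_n}\right]\right). \tag{4.35} $$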
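For $n$ large enough, choosing $x = \epsilon/2$, one can always choose $\eta \equiv \eta(\epsilon, b_n, c_n)$ such that the last exponential in (4.35) is less than 1 and $\eta(\epsilon, b_n, c_n) \downarrow 0$ as $n \uparrow \infty$.

Lemma 4.5. There exists a positive constant $c$ such that, for all integer $R$, there exists $\Omega_R$ with $\mathbb{P}(\Omega_R) \ge 1 - R\gamma^{-1} e^{-c\gamma^{-1}}$ such that for all $\mu^\pm, s^\pm, w_\pm$ with $v_+ \le w_+ \le v_+ + R$, $v_- - R \le w_- \le v_-$ and $\omega \in \Omega_R$,

(i)
$$ \sup_{\sigma : \eta(w_\pm, \sigma) = s^\pm e^{\mu^\pm}} \left| \gamma^{-1} E^{1,L}_{\gamma,\Delta}[\omega](\sigma_\Delta, m_L(\sigma_{\partial\Delta})) - W_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^\pm,s^\pm)}) \right| \le \zeta\gamma^{-1} \left(1 + \sqrt{2\gamma M(\gamma)}\right) \sqrt{2} \tag{4.36} $$
and (ii)
$$ \sup_\sigma \left| W_{\gamma,\Delta}[\omega](\sigma_\Delta, \sigma_{\partial\Delta}) \right| \le 4\gamma^{-1} \left(1 + \sqrt{M/\ell}\right)^2, \tag{4.37} $$
where $\Delta = [w_- + \frac{1}{2}, w_+ - \frac{1}{2}]$.

Proof. We first prove (i). We set
$$ W_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^\pm,s^\pm)}) = W^+_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^+,s^+)}) + W^-_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^-,s^-)}), \tag{4.38} $$
where
$$ W^-_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^-,s^-)}) \equiv -L \sum_{i\in\Delta} s^- a(\beta)\, \xi_i^{\mu^-} \sigma_i \sum_{r\in\partial^-\Delta} J_\gamma(i - L r) \tag{4.39} $$
and
$$ W^+_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^+,s^+)}) \equiv -L \sum_{i\in\Delta} s^+ a(\beta)\, \xi_i^{\mu^+} \sigma_i \sum_{r\in\partial^+\Delta} J_\gamma(i - L r). \tag{4.40} $$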
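We will consider only the terms corresponding to the interaction with the right part of $\Delta$, the other ones being similar. We have
$$ \begin{aligned} &\left| \gamma^{-1} E^{1,L}_{\gamma,\Delta}[\omega](\sigma_\Delta, m_L(\sigma_{\partial^+\Delta})) - W^+_{\gamma,\Delta}[\omega](\sigma_\Delta, m^{(\mu^+,s^+)}) \right| \mathbf{1}_{\{\sigma\in\widehat{\mathcal{A}}(\mu^\pm,s^\pm,w_\pm)\}} \\ &\quad\le L \left| \sum_{i\in\Delta} \sum_{r\in\partial^+\Delta} J_\gamma(i - L r)\, \sigma_i \left(\xi_i,\ m_L(r, \sigma_{\partial^+\Delta}) - m^{(\mu^+,s^+)}\right) \right| \mathbf{1}_{\{\sigma\in\widehat{\mathcal{A}}(\mu^\pm,s^\pm,w_\pm)\}} \\ &\quad\le L \sum_{r\in\partial^+\Delta} \Big\| \sum_{i\in\Delta} J_\gamma(i - L r)\, \xi_i \sigma_i \Big\|_2\, \left\| m_L(r, \sigma_{\partial^+\Delta}) - m^{(\mu^+,s^+)} \right\|_2 \mathbf{1}_{\{\sigma\in\widehat{\mathcal{A}}(\mu^\pm,s^\pm,w_\pm)\}} \\ &\quad\le \zeta L \sum_{r\in\partial^+\Delta} \Big\| \sum_{i\in\Delta} J_\gamma(i - L r)\, \xi_i \sigma_i \Big\|_2 \equiv T^+(\sigma). \end{aligned} \tag{4.41} $$
$T^-(\sigma)$ is defined in an analogous way. Recalling the definition (4.21) we have
$$ \begin{aligned} T^+(\sigma) &= \zeta L \sum_{r\in\partial^+\Delta} \Bigg( \sum_{i\in[w_+ - 1,\, w_+ - \frac{1}{2}]}\ \sum_{j\in[w_+ - 1,\, w_+ - \frac{1}{2}]} (\xi_i, \xi_j)\, \sigma_i \sigma_j\, J_\gamma(i - L r) J_\gamma(j - L r) \Bigg)^{1/2} \\ &\le \zeta L \sum_{r\in\partial^+\Delta} \Bigg( \gamma^{-1} \|B\| \sum_{i\in[w_+ - 1,\, w_+ - \frac{1}{2}]} \left(\sigma_i J_\gamma(i - L r)\right)^2 \Bigg)^{1/2} \\ &\le \zeta L \sum_{r\in\partial^+\Delta} \|B\|^{1/2} \le \zeta (2\gamma)^{-1} \|B\|^{1/2}, \end{aligned} \tag{4.42} $$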
where we have used in the last inequality that #{r ∈ ∂ + 1} = (2γL)−1 . Thus, by Lemma 4.3, for all > 0, p √ 2 + −1 −1 (4.43) P sup T (σ) ≥ ζ(2γ) (1 + 2γM ) 1 + ≤ 2Kγ exp − 2Kγ σ∈S from which (i) follows. We turn to the proof of (ii). Using (2.4) we have, for all > 0, P supσ∈S |Wγ,1 [ω](σ1 , σ∂1 )| ≥ 42 h i `,` (m` (σ1 ), m` (σ∂1 )) ≥ 22 ≤ P supσ∈S γ −1 Eγ,1
(4.44)
i h `,` (σ1 , σ∂1 ) ≥ 22 . +P supσ∈S 1Wγ,1 Let us consider the first probability in the r.h.s. of (4.44). By definition, X X `,` (m` (σ1 ), m` (σ∂1 )) = γ` Jγ` (x − y)(m` (x, σ1 ), m` (y, σ∂1 )). (4.45) Eγ,1 x∈1 y∈∂1
Now
(m` (x, σ1 ), m` (y, σ∂1 )) ≤ km` (x, σ1 )k2 km` (y, σ∂1 )k2 (4.46) 1
1
≤ kB(x)k 2 kB(y)k 2 ,
where B(x) is the ` × `-matrix B(x) = {B(x)i,j }i∈x,j∈x with B(x)i,j = Thus
1 `
PM
µ µ µ=1 ξi ξj .
`,` Eγ,1 (m` (σ1 ), m` (σ∂1 )) ≤ (γ`)2
P
P x∈1
1
y∈∂1
1
1I{|`x−`y|≤(2γ)−1 } kB(x)k 2 kB(y)k 2
P P 1 1 γ` y∈[w+ − 1 ,w+ +1] kB(y)k 2 ≤ γ` x∈[w+ −1,w+ − 1 ] kB(x)k 2 2
(4.47)
2
P P 1 1 γ` y∈[w− ,w− + 1 ] kB(y)k 2 + γ` x∈[w− + 1 ,w− +1] kB(x)k 2 2
2
≡ T 1 T2 + T3 T4 and,
X 4 `,` 2 P sup Eγ,1 (m` (σ1 ), m` (σ∂1 )) ≥ 2 ≤ P(Tk ≥ ), σ∈S
(4.48)
k=1
where the last equality in (4.47) defines the quantities Tk . All four probabilities on the right-handnside of (4.48) o will be bounded in the same way. Let us consider P(T1 ≥ ). 1 2 are independent random variables. It follows from Note that kB(x)k 1 x∈[w+ −1,w+ − 2 ]
Lemma 4.3 that, for all ˜ > 0,
2 i h p 1 ˜ ` , P kB(x)k 2 > 1 + M/` (1 + ˜) ≤ 4K` exp − K
(4.49)
and by Lemma 4.4, we get that for large enough `, p 1 ˜ . P T1 ≥ (1 + M/`)(1 + ˜) ≤ K exp − 2 2Kγ
Therefore, choosing ≡ 21 (1 +
p
(4.50)
M/`)(1 + ˜) in (4.44), (4.48) yields
i h p `,` (m` (σ1 ), m` (σ∂1 )) ≥ (2γ)−1 (1 + M/`)2 (1 + ˜)2 P supσ∈S γ −1 Eγ,1 ˜ . ≤ 4K exp − 2Kγ
(4.51)
Choosing $\tilde\epsilon = 1$ and using Lemma 2.1 to bound the second term in (4.44) we get (4.37), which concludes the proof of Lemma 4.5.

We are now ready to prove Proposition 4.1.

Proof of Proposition 4.1, Part i). Setting $\Delta^c \equiv \Lambda \setminus \Delta$ and denoting by $\tilde\sigma$ and $\bar\sigma$ independent copies of $\sigma$, some simple manipulations allow us to write
b ± , s± , w± )) Gβ,γ,3 [ω](F ∩ A(µ± , s± , w± )) ≤ Gβ,γ,3 [ω](F ∩ A(µ ± ± 1 −β Hγ,1 [ω](σ1 )+Wγ,1 [ω](σ1 ,m(µ ,s ) ) Eσ1 e = Zβ,γ,3 [ω] ± ± −β Hγ,1c [ω](σ1c )+ Wγ,1 [ω](σ1 ,σ1c )−Wγ,1 [ω](σ1 ,m(µ ,s ) ) × Eσ1c e ×1I{σ∈F ∩A(µ b ± ,s± ,w )} ±
= Eσ1
1 ±
±
(µ ,s ) Zβ,γ,1 [ω]
× Eσ1c Eσ˜ 1
−β Hγ,1 [ω](σ1 )+Wγ,1 [ω](σ1 ,m(µ e
1 Zβ,γ,3 [ω]
± ,s± )
)
e
(4.52)
−β Hγ,1c [ω](σ1c )+Hγ,1 [ω](σ˜ 1 )+Wγ,1 [ω](σ˜ 1 ,σ1c ) ∗
∗∗
−β(W +W ) ×1I{σ∈F ∩A(µ b ± ,s± ,w± )} e µ± ,s± = Eσ1 Gβ,γ,1 [ω](σ1 )1I{σ∈F } Eσ¯ 3 Gβ,γ,3 [ω](σ¯ 3 )1I{σ∈ b ± ,s± ,w± )} ¯ A(µ † †† e−β(W +W ) ,
h i ± ± W ∗ = Wγ,1 [ω](σ1 , σ1c ) − Wγ,1 [ω](σ1 , m(µ ,s ) ) , h i ± ± W ∗∗ = Wγ,1 [ω](σ˜ 1 , m(µ ,s ) ) − Wγ,1 [ω](σ˜ 1 , σ1c ) , h i ± ± W † = Wγ,1 [ω](σ1 , σ¯ 1c ) − Wγ,1 [ω](σ1 , m(µ ,s ) ) , h i ± ± W †† = Wγ,1 [ω](σ¯ 1 , m(µ ,s ) ) − Wγ,1 [ω](σ¯ 1 , σ¯ 1c ) ,
where
b ± , s± , w± ), Now, if σ¯ ∈ A(µ h i ± ± Wγ,1 [ω](σ1 , σ¯ 1c ) − Wγ,1 [ω](σ1 , m(µ ,s ) ) h i ± ± + Wγ,1 [ω](σ¯ 1 , m(µ ,s ) ) − Wγ,1 [ω](σ¯ 1 , σ¯ 1c ) ± ± ≤2 sup Wγ,1 [ω](σ¯ 1 , σ¯ 1c ) − Wγ,1 [ω](σ¯ 1 , m(µ ,s ) ) b ± ,s± ,w± ) σ∈ ¯ A(µ ± ± (4.53) −1 1,L ≤2 sup γ Eγ,1 [ω](σ¯ 1 , mL (σ¯ ∂1 ))−Wγ,1 [ω](σ¯ 1 , m(µ ,s ) ) b ± ,s± ,w± ) σ∈ ¯ A(µ 1,L + 2 sup 1Wγ,1 [ω](σ¯ 1 , σ¯ ∂1 ) . σ∈S ¯
Finally, by Lemma 4.5 and Lemma 2.1, the supremum over $\mu^\pm$, $s^\pm$ and $w_\pm$, with $v_+ \le w_+ \le v_+ + R$ and $v_- - R \le w_- \le v_-$, of the last line of (4.53) is bounded from above by $8\gamma^{-1}(\zeta + 2\gamma L)$ with a $\mathbb{P}_\xi$-probability greater than $1 - 4\gamma^{-1} R \exp(-c\gamma^{-1})$ for some positive constant $c$. Thus from (4.52) and (4.53) follow both (4.12) and (4.13).
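Proof of Proposition 4.1 part ii). Using (4.6) the l.h.s. of (4.14) is bounded from above by $\mathcal{G}_{\beta,\gamma,\Lambda}[\omega](A^+(R)) + \mathcal{G}_{\beta,\gamma,\Lambda}[\omega](A^-(R))$. We estimate the first term, the second one being similar. Since the spin configuration is away from the equilibria on a length $R$, we can decouple the interaction between this part and the rest of the volume $\Lambda$, by making a rough estimate of those interaction terms. The fact that we are out of equilibrium will give terms proportional to $R$ that will be dominant if $R$ is chosen large enough. More precisely, calling $\Delta_R \equiv [v_+, v_+ + R]$, we have, for all fixed $R$,
$$ \begin{aligned} \mathcal{G}_{\beta,\gamma,\Lambda}\left(A^+(R)\right) &= \frac{1}{Z_{\beta,\gamma,\Lambda}}\, \mathbb{E}_{\sigma_\Lambda} \left[ e^{-\beta H_{\gamma,\Lambda\setminus\Delta_R}(\sigma_{\Lambda\setminus\Delta_R})}\, e^{-\beta\left[H_{\gamma,\Delta_R}(\sigma_{\Delta_R}) + W_{\gamma,\Delta_R}(\sigma_{\Delta_R}, \sigma_{\Lambda\setminus\Delta_R})\right]}\, \mathbf{1}_{\{\sigma\in A^+(R)\}} \right] \\ &\le e^{4c\gamma^{-1}}\, \frac{1}{Z_{\beta,\gamma,\Delta_R}}\, \mathbb{E}_{\sigma_{\Delta_R}} \left[ e^{-\beta H_{\gamma,\Delta_R}(\sigma_{\Delta_R})}\, \mathbf{1}_{\{\sigma\in A^+(R)\}} \right] \end{aligned} \tag{4.54} $$
with a $\mathbb{P}_\xi$-probability greater than $1 - 4\gamma^{-1} e^{-c\gamma^{-1}}$ for some positive constant $c$, where we have used Lemma 4.5 to bound the interaction between $\Delta_R$ and $\Lambda \setminus \Delta_R$. To estimate the last term in (4.54), we express it in terms of block spin variables on the scale $\ell$. Using (2.5) we get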
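$$ \mathcal{G}_{\beta,\gamma,\Delta_R}\left(A^+(R)\right) \le e^{2c\gamma^{-1}|\Delta_R|(4\gamma\ell + \gamma M)}\, \frac{\mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))}\, \mathbf{1}_{\{\sigma\in A^+(R)\}}}{\mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))}} \tag{4.55} $$
with a $\mathbb{P}_\xi$-probability greater than $1 - e^{-c\gamma^{-1}|\Delta_R|}$. We derive first a lower bound on the denominator which will be given effectively by restricting the configurations to be in the neighborhood of a constant profile near one of the equilibrium positions $s a(\beta) e^\mu$. We will choose without loss of generality $s = 1$, $\mu = 1$. Recalling the definitions of $B_\rho^{(1,1)}$ and $B_\rho$ from (3.14) and (3.15) we have that, obviously,
$$ \mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))} \ge \mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))} \prod_{x\in\Delta_R} \mathbf{1}_{\{m_\ell(x,\sigma)\in B_\rho^{(1,1)}\}}. \tag{4.56} $$
It can easily be shown that, on the set $\{m_\ell(x, \sigma) \in B_\rho,\ \forall x \in \Delta_R\}$,
$$ -\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma)) \ge \frac{\ell}{2} \sum_{x\in\Delta_R} \left(\|m_\ell(x, \sigma)\|_2^2 - 4\rho^2\right), \tag{4.57} $$
from which (4.56) yields
$$ \begin{aligned} \mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))} &\ge e^{-4\beta\gamma^{-1}|\Delta_R|\rho^2} \prod_{x\in\Delta_R} \mathbb{E}_{\sigma_x}\, e^{\frac{\beta\ell}{2} \|m_\ell(x,\sigma)\|_2^2}\, \mathbf{1}_{\{m_\ell(x,\sigma)\in B_\rho^{(1,1)}\}} \\ &= e^{-4\beta\gamma^{-1}|\Delta_R|\rho^2} \prod_{x\in\Delta_R} Z_{x,\beta,\rho}\left(a(\beta) e^1\right), \end{aligned} \tag{4.58} $$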
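provided that $\rho$ is sufficiently large so that $B_\rho^{(1,1)}$ contains the lowest minimum of $\Phi$ in the neighborhood of $a(\beta) e^1$, which is the case if $\rho \ge c\sqrt{M/\ell}$, for some finite constant $c$, with a $\mathbb{P}_\xi$-probability $\ge 1 - e^{-cM}$. Next we derive an upper bound for the numerator of the ratio in (4.55). Using the inequality $ab \le \frac{1}{2}(a^2 + b^2)$ we get
$$ -\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma)) \le \frac{\ell}{2} \sum_{x\in\Delta_R} \|m_\ell(x, \sigma)\|_2^2, \tag{4.59} $$
whence
$$ \mathbb{E}_{\sigma_{\Delta_R}}\, e^{-\beta\gamma^{-1} E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))}\, \mathbf{1}_{\{\sigma\in A^+(R)\}} \le \mathbb{E}_{\sigma_{\Delta_R}}\, e^{\frac{\beta\ell}{2} \sum_{x\in\Delta_R} \|m_\ell(x,\sigma)\|_2^2}\, \mathbf{1}_{\{\sigma\in A^+(R)\}}. \tag{4.60} $$
Let us now recall that, by definition,
$$ A^+(R) = \left\{ \sigma \in \mathcal{S}\ \Big|\ \forall_{u\in\Delta_R}\, \exists_{r\in u} : \inf_{\mu,s} \|m^{(\mu,s)} - m_L(r, \sigma)\|_2 > \zeta \right\}. \tag{4.61} $$
Using that $m_L(r, \sigma) = \frac{\ell}{L} \sum_{x\in r} m_\ell(x, \sigma)$ we have
$$ \|m^{(\mu,s)} - m_L(r, \sigma)\|_2 \le \frac{\ell}{L} \sum_{x\in r} \|m^{(\mu,s)} - m_\ell(x, \sigma)\|_2 \tag{4.62} $$
so that
$$ A^+(R) \subset \left\{ \sigma \in \mathcal{S}\ \Big|\ \forall_{u\in\Delta_R}\, \exists_{r\in u} : \inf_{\mu,s} \frac{\ell}{L} \sum_{x\in r} \|m^{(\mu,s)} - m_\ell(x, \sigma)\|_2 > \zeta \right\}. \tag{4.63} $$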
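We will use the following fact:

Lemma 4.6. Let $\{X_k, k = 1, 2, \dots, K\}$ be a sequence of real numbers satisfying $0 \le X_k \le c$ for some $c < \infty$. Let $\zeta < c$ and assume that
$$ \frac{1}{K} \sum_{k=1}^{K} X_k > \zeta. \tag{4.64} $$
Then
$$ \left|\{1 \le k \le K : X_k > \delta\zeta\}\right| \ge K\, \frac{\zeta(1 - \delta)}{c - \delta\zeta}. \tag{4.65} $$

Proof. For $0 \le \delta \le 1$, define the set $V_{\delta,\zeta} \equiv \{k \mid X_k \le \delta\zeta\}$ and set $V_{\delta,\zeta}^c \equiv \{1, \dots, K\} \setminus V_{\delta,\zeta}$. Then
$$ \begin{aligned} \frac{1}{K} \sum_{k=1}^{K} X_k &\le \frac{1}{K} \sum_{k\in V_{\delta,\zeta}} X_k + \frac{1}{K} \sum_{k\in V_{\delta,\zeta}^c} X_k \le \frac{1}{K}\, c\, |V_{\delta,\zeta}^c| + \frac{1}{K}\, \delta\zeta\, |V_{\delta,\zeta}| \\ &= \frac{1}{K} (c - \delta\zeta) |V_{\delta,\zeta}^c| + \delta\zeta, \end{aligned} \tag{4.66} $$
which, together with (4.64), implies the bound (4.65).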
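The counting bound of Lemma 4.6 is elementary and can be checked directly on examples. The following sketch uses hypothetical helper names and is only an illustration of the statement, with the restriction $0 \le \delta \le 1$ taken from the proof.

```python
def count_above(xs, threshold):
    """Number of entries strictly larger than the threshold."""
    return sum(1 for x in xs if x > threshold)

def lemma_4_6_lower_bound(K, zeta, delta, c):
    """Right-hand side of (4.65): K * zeta * (1 - delta) / (c - delta * zeta)."""
    return K * zeta * (1.0 - delta) / (c - delta * zeta)

def check_lemma_4_6(xs, zeta, delta, c):
    """Return True if (4.65) holds for xs; raise ValueError if a hypothesis of the lemma fails."""
    K = len(xs)
    if not all(0 <= x <= c for x in xs):
        raise ValueError("need 0 <= X_k <= c")
    if not (zeta < c and 0 <= delta <= 1):
        raise ValueError("need zeta < c and 0 <= delta <= 1")
    if not sum(xs) / K > zeta:
        raise ValueError("hypothesis (4.64) fails")
    return count_above(xs, delta * zeta) >= lemma_4_6_lower_bound(K, zeta, delta, c)

if __name__ == "__main__":
    # mean is 1.025 > zeta = 0.9; two entries exceed delta * zeta = 0.45
    print(check_lemma_4_6([0.0, 2.0, 2.0, 0.1], zeta=0.9, delta=0.5, c=2.0))
```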
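Let us denote by $\mathcal{V}_{\delta,\zeta}(r)$ the set of all subsets $S\subset\{x\in r\}$ with cardinality at least $\frac{L}{\ell}\,\frac{\zeta(1-\delta)}{2-\delta\zeta}$, respectively volume
$$|S| \ge \gamma L\,\frac{\zeta(1-\delta)}{2-\delta\zeta}. \tag{4.67}$$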
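Then, since $\|m^{(\mu,s)} - m_\ell(x,\sigma)\|_2 < 2$, Lemma 4.6 implies that there exists a set $S\in\mathcal{V}_{\delta,\zeta}(r)$ such that for all $x\in S$, $\|m^{(\mu,s)} - m_\ell(x,\sigma)\|_2 > \delta\zeta$. That is to say
$$A^+(R) \subset \left\{\sigma\in\mathcal{S}\ \Big|\ \forall_{u\in\Delta_R}\,\exists_{r\in u}\,\exists_{S\in\mathcal{V}_{\delta,\zeta}(r)} : \forall_{x\in S},\ m_\ell(x,\sigma)\in B^c_{\delta\zeta}\right\}. \tag{4.68}$$
Therefore
$$\begin{aligned}
\mathbb{E}_{\sigma_{\Delta_R}} e^{-\beta\gamma^{-1}E^{\ell}_{\gamma,\Delta_R}(m_\ell(\sigma))}\,\mathbb{1}_{\{\sigma\in A^+(R)\}}
&\le \prod_{u\in\Delta_R}\mathbb{E}_{\sigma_u} e^{\frac{\beta\ell}{2}\sum_{x\in u}\|m_\ell(x,\sigma)\|_2^2}\,\mathbb{1}_{\{\exists_{r\in u}\,\exists_{S\in\mathcal{V}_{\delta,\zeta}(r)}\,:\ \forall_{x\in S},\ m_\ell(x,\sigma)\in B^c_{\delta\zeta}\}}\\
&\le \prod_{u\in\Delta_R}\sum_{r\in u}\sum_{S\in\mathcal{V}_{\delta,\zeta}(r)}\mathbb{E}_{\sigma_u} e^{\frac{\beta\ell}{2}\sum_{x\in u}\|m_\ell(x,\sigma)\|_2^2}\prod_{x\in S}\mathbb{1}_{\{m_\ell(x,\sigma)\in B^c_{\delta\zeta}\}}.
\end{aligned}\tag{4.69}$$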
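Inserting this and (4.58) into (4.55) we have
$$\begin{aligned}
\mathcal{G}_{\beta,\gamma,\Delta_R}[\omega]\left(A^+(R)\right)
&\le e^{\gamma^{-1}|\Delta_R|(16\gamma\ell+4\gamma M+4\beta\rho^2)}\prod_{u\in\Delta_R}\sum_{r\in u}\sum_{S\in\mathcal{V}_{\delta,\zeta}(r)}\ \prod_{x\in u\setminus S}\frac{Z_{x,\beta}}{Z_{x,\beta,\rho}(a(\beta)e^1)}\ \prod_{x\in S}\frac{Z^c_{x,\beta,\delta\zeta}}{Z_{x,\beta,\rho}(a(\beta)e^1)}\\
&\equiv e^{\gamma^{-1}|\Delta_R|(16\gamma\ell+4\gamma M+4\beta\rho^2)}\prod_{u\in\Delta_R}\sum_{r\in u}\sum_{S\in\mathcal{V}_{\delta,\zeta}(r)}T_S^{(1)}T_S^{(2)},
\end{aligned}\tag{4.70}$$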
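where we have defined
$$Z^c_{x,\beta,\delta\zeta} \equiv \mathbb{E}_{\sigma_x} e^{\frac{\beta\ell}{2}\|m_\ell(x,\sigma)\|_2^2}\,\mathbb{1}_{\{m_\ell(x,\sigma)\in B^c_{\delta\zeta}\}}. \tag{4.71}$$
It follows from Proposition 2.3 of [BGP1] that
$$Z_{x,\beta} \le \exp\left(-\beta\ell\left(\phi(a(\beta)) - c\sqrt{\frac{M}{\ell}}\right)\right), \tag{4.72}$$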
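so that using Lemma 3.1 we get that
$$T_S^{(1)} \le \prod_{x\in u\setminus S}\exp\left(+\beta\ell c\sqrt{\frac{M}{\ell}}\right) \le e^{+\beta\gamma^{-1}c\sqrt{M/\ell}}, \tag{4.73}$$
with a $\mathbb{P}_\xi$-probability $\ge 1-(\gamma\ell)^{-1}e^{-cM}$. On the other hand, to bound $Z^c_{x,\beta,\delta\zeta}$, we proceed as in [BG2] and first note that $\|m_\ell(x,\sigma)\|_2^2 \le 2$ for all $\sigma$. Next, we introduce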
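the lattice $W_{\ell,M}$ with spacing $1/\sqrt{\ell}$ in $\mathbb{R}^M$ and we denote by $W_{\ell,M}(2)$ the intersection of this lattice with the ball of radius 2 in $\mathbb{R}^M$. We have
$$|W_{\ell,M}(2)| \le \exp\left(M\ln\frac{2\ell}{M}\right). \tag{4.74}$$
Now, we may cover the ball of radius 2 in $\mathbb{R}^M$ with balls of radii $\hat\rho \equiv \sqrt{M/\ell}$ centered at the points of $W_{\ell,M}(2)$. Supposing that $\delta\zeta > \hat\rho$ this yields
$$\begin{aligned}
Z^c_{x,\beta,\delta\zeta} &\le \sum_{m\in W_{\ell,M}(2)}\mathbb{1}_{\{m\in B^c_{\delta\zeta-\hat\rho}\}}\,Z_{x,\beta,\hat\rho}(m)[\omega]\\
&\le \sum_{m\in W_{\ell,M}(2)}\mathbb{1}_{\{m\in B^c_{\delta\zeta-\hat\rho}\}}\exp\left(-\beta\ell\left(\Phi_{x,\beta}(m)[\omega] - \tfrac12\hat\rho^2\right)\right).
\end{aligned}\tag{4.75}$$
Let us now assume that $\delta\zeta-\hat\rho$ satisfies the hypothesis of Proposition 3.2; then
$$Z^c_{x,\beta,\delta\zeta} \le \exp\left(-\beta\ell\left[\phi(a(\beta)) + \epsilon(\delta\zeta-\hat\rho) - 4(\delta\zeta-\hat\rho)\sqrt{\frac{M}{\ell}} - \frac12\hat\rho^2 - \frac{M}{\beta\ell}\ln\frac{2\ell}{M}\right]\right) \tag{4.76}$$
with a $\mathbb{P}_\xi$-probability $\ge 1-e^{-cM}$, where $\epsilon(\cdot)$ is the function defined in Proposition 4.1. We will assume that $\delta\zeta \gg \sqrt{M/\ell}$. Thus
$$\frac{Z^c_{x,\beta,\delta\zeta}}{Z_{x,\beta,\rho}(a(\beta)e^1)} \le \exp\left(-\beta\ell\left(\epsilon(\delta\zeta-\hat\rho) - c\,\delta\zeta\sqrt{\frac{M}{\ell}}\right)\right) \tag{4.77}$$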
0
with a Pξ -probability ≥ 1 − e−c M . Thus the product TS(1) TS(2) defined in (4.70) is bounded by q M − (ζ)|S| (4.78) TS(1) TS(2) ≤ exp βγ −1 c ` 0
with a Pξ -probability ≥ 1 − (γ`)−1 |S|e−c M . Hence Y X X TS(1) TS(2) u∈1R r∈u S∈Vδ,ζ (r)
≤
Y X
X
u∈1R r∈u S∈Vδ,ζ (r)
q exp −βγ −1 c |S|(ζ) − M `
ln 2 −c ≤ exp −βγ −1 |1R | γLζc(ζ) − γ| ln(γL)| − γL `
q
(4.79)
M `
0
with a Pξ -probability ≥ 1 − (γ)−1 Re−c M , and finally, inserting (4.70) in (4.59) we arrive at Gβ,γ,1R [ω] (A+ (R)) ≤ h q i M 2 + 8γ` + 2ρ exp −βγ −1 R γLcζ(ζ) − c0 ` 0
(4.80)
with a Pξ -probability ≥ 1 − (γ`)−1 Re−c ` , where we have used the fact that M `.
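5. Self Averaging Properties of the Free Energy

In this chapter we study the self averaging properties of the free energy of the Hopfield-Kac model with mesoscopic boundary conditions. We denote the partition function on the volume $\Delta$ with boundary condition $s^-a(\beta)e^{\mu^-}$ on the left of $\Delta$ and $s^+a(\beta)e^{\mu^+}$ on the right of $\Delta$ by
$$Z_\Delta^{(\mu^\pm,s^\pm)} \equiv \mathbb{E}_{\sigma_\Delta}\, e^{-\beta\left[H_{\gamma,\Delta}(\sigma) + W_{\gamma,\Delta,\partial^-\Delta}(\sigma_\Delta|m^{(\mu^-,s^-)}) + W_{\gamma,\Delta,\partial^+\Delta}(\sigma_\Delta|m^{(\mu^+,s^+)})\right]}, \tag{5.1}$$
and the corresponding free energy
$$f_\Delta^{(\mu^\pm,s^\pm)} \equiv f_\Delta = -\frac{\gamma}{\beta|\Delta|}\ln Z_\Delta^{(\mu^\pm,s^\pm)}. \tag{5.2}$$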
To include the case of free boundary conditions, we set $m^{(0,0)} \equiv 0$. We are interested in the behavior of the fluctuations of $f_\Delta^{(\mu^\pm,s^\pm)}$ around its mean value. We will use Theorem 6.6 of Talagrand [T2], which we state for the convenience of the reader. We denote by $MX$ a median of the random variable $X$. Recall that a number $x$ is called a median of a random variable $X$ if both $\mathbb{P}[X\ge x]\ge\frac12$ and $\mathbb{P}[X\le x]\ge\frac12$.

Theorem 5.1. [T2] Consider a real valued function $f$ defined on $[-1,+1]^N$. We assume that, for each real number $a$, the set $\{f\le a\}$ is convex. Consider a convex set $B\subset[-1,+1]^N$, and assume that for all $x, y\in B$, $|f(x)-f(y)|\le k\|x-y\|_2$ for some positive $k$. Let $X$ denote a random vector with i.i.d. components $\{X_i\}_{1\le i\le N}$ taking values in $[-1,+1]$. Then for all $t>0$,
$$\mathbb{P}\left[|f(X)-Mf(X)|\ge t\right] \le 4b + \frac{4}{1-2b}\exp\left(-\frac{t^2}{16k^2}\right), \tag{5.3}$$
where $b\equiv\mathbb{P}[X\notin B]$ and we assume that $b<\frac12$.

The main result of this chapter is the following proposition:

Proposition 5.2. If $\gamma\ell$, $M/\ell$ and $\gamma M$ are small enough, then for all $t>0$, there exists a universal numerical constant $K$ such that
$$\mathbb{P}\left[\left|f_\Delta^{(\mu^\pm,s^\pm)} - \mathbb{E}f_\Delta^{(\mu^\pm,s^\pm)}\right| \ge t + K\left(\sqrt{\gamma^{-1}|\Delta|}\right)^{-1}\right] \le K\exp\left(-\frac{\gamma^{-1}}{8}|\Delta|\left(\sqrt{1+t^2}-1\right)\right). \tag{5.4}$$

Proof. Note first that the set $\{f_\Delta\le a\}$ is convex. This follows from the fact that the Hamiltonian $H_{\gamma,\Delta}$ is a convex function of the variable $\xi$. The main difficulty that remains is to establish that $f_\Delta$ is a Lipschitz function of the independent random variables $\xi$ with a constant $k$ that is small with large probability. To prove the Lipschitz continuity of $f_\Delta$ it is obviously enough to prove the corresponding bounds for $H_{\gamma,\Delta}(\sigma)$ and $W_{\gamma,\Delta,\partial^\pm\Delta}(\sigma_\Delta|m^{(\mu^\pm,s^\pm)})$. Let us first prove that $H_{\gamma,\Delta}(\sigma)$ is Lipschitz in the random variable $\xi$. Let us write $\xi\equiv\xi[\omega]$ and $\hat\xi\equiv\xi[\omega']$. Denoting by $\xi^\mu\sigma$ the coordinatewise product of the two vectors $\xi^\mu$ and $\sigma$, and by $J_\gamma$ the symmetric $\gamma^{-1}|\Delta|\times\gamma^{-1}|\Delta|$ matrix with $i,j$ entries $J_\gamma(i-j)$, we have
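$$\left|H_{\gamma,\Delta}[\omega](\sigma) - H_{\gamma,\Delta}[\omega'](\sigma)\right| = \left|\sum_{\mu=1}^{M}\left(\left[\xi^\mu\sigma-\hat\xi^\mu\sigma\right], J_\gamma\left[\xi^\mu\sigma+\hat\xi^\mu\sigma\right]\right)\right|. \tag{5.5}$$
Since $J_\gamma$ is a symmetric and positive definite matrix, its square root $J_\gamma^{1/2}$ exists. Thus using the Schwarz inequality we may write
$$\left|\sum_{\mu=1}^{M}\left(\left[\xi^\mu\sigma-\hat\xi^\mu\sigma\right], J_\gamma\left[\xi^\mu\sigma+\hat\xi^\mu\sigma\right]\right)\right| \le \sum_{\mu=1}^{M}\left\|J_\gamma^{1/2}\left[\xi^\mu\sigma-\hat\xi^\mu\sigma\right]\right\|_2\left\|J_\gamma^{1/2}\left[\xi^\mu\sigma+\hat\xi^\mu\sigma\right]\right\|_2 \le J^+J^-, \tag{5.6}$$
where
$$J^+ \equiv \left(\sum_{\mu=1}^{M}\left(\left[\xi^\mu\sigma+\hat\xi^\mu\sigma\right], J_\gamma\left[\xi^\mu\sigma+\hat\xi^\mu\sigma\right]\right)\right)^{1/2} \tag{5.7}$$
and
$$J^- \equiv \left(\sum_{\mu=1}^{M}\left(\left[\xi^\mu\sigma-\hat\xi^\mu\sigma\right], J_\gamma\left[\xi^\mu\sigma-\hat\xi^\mu\sigma\right]\right)\right)^{1/2} \le \|\xi-\hat\xi\|_2. \tag{5.8}$$
The last inequality in (5.8) follows since $\|J_\gamma\|\le 1$. On the other hand, by convexity
$$(J^+)^2 \le 2\sum_{\mu=1}^{M}\left(\xi^\mu\sigma, J_\gamma\,\xi^\mu\sigma\right) + 2\sum_{\mu=1}^{M}\left(\hat\xi^\mu\sigma, J_\gamma\,\hat\xi^\mu\sigma\right) = 2H_{\gamma,\Delta}[\omega](\sigma) + 2H_{\gamma,\Delta}[\omega'](\sigma). \tag{5.9}$$
Collecting, we get
$$\left|H_{\gamma,\Delta}[\omega](\sigma) - H_{\gamma,\Delta}[\omega'](\sigma)\right| \le \sqrt{2}\,\|\xi-\hat\xi\|_2\left(H_{\gamma,\Delta}[\omega](\sigma) + H_{\gamma,\Delta}[\omega'](\sigma)\right)^{1/2}. \tag{5.10}$$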
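The chain (5.5) to (5.10) only uses that $J_\gamma$ is symmetric, positive semi-definite and of norm at most one. The following is a minimal numerical sketch of the final bound under exactly these assumptions: the quadratic form below is a toy stand-in for the Hamiltonian as it appears in (5.9), and the random matrix, sizes and function names are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quad_form(J, v):
    """(v, J v) for a square matrix J given as a list of rows."""
    return sum(v[i] * dot(J[i], v) for i in range(len(v)))


def random_psd_contraction(n, rng):
    """Random symmetric positive semi-definite matrix with operator norm <= 1.

    B^T B is positive semi-definite; dividing by its trace bounds every
    eigenvalue by 1, because the trace is the sum of the eigenvalues.
    """
    B = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    G = [[sum(B[k][i] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    trace = sum(G[i][i] for i in range(n))
    return [[G[i][j] / trace for j in range(n)] for i in range(n)]


def energy(J, xi, sigma):
    """Toy quadratic form sum_mu (xi^mu sigma, J xi^mu sigma), as in (5.9)."""
    return sum(quad_form(J, [x * s for x, s in zip(pattern, sigma)]) for pattern in xi)


def lipschitz_gap(J, xi, xi_hat, sigma):
    """Return both sides of (5.10): |H - H_hat| and sqrt(2) ||xi - xi_hat|| sqrt(H + H_hat)."""
    h, h_hat = energy(J, xi, sigma), energy(J, xi_hat, sigma)
    dist = math.sqrt(sum((a - b) ** 2 for p, q in zip(xi, xi_hat) for a, b in zip(p, q)))
    return abs(h - h_hat), math.sqrt(2.0) * dist * math.sqrt(max(0.0, h + h_hat))


def check_bound(n=8, M=3, trials=200, seed=1):
    """Test (5.10) on random +-1 patterns and spins (hypothetical sizes)."""
    rng = random.Random(seed)
    for _ in range(trials):
        J = random_psd_contraction(n, rng)
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        xi = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(M)]
        xi_hat = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(M)]
        lhs, rhs = lipschitz_gap(J, xi, xi_hat, sigma)
        if lhs > rhs + 1e-9:
            return False
    return True
```

The check relies on the identity $(a, Ja) - (b, Jb) = (a-b, J(a+b))$ for symmetric $J$, which is the content of (5.5).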
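This means that, as in [T2], we are in a situation where the Lipschitz norm of $H_{\gamma,\Delta}[\omega](\sigma)$ is not uniformly bounded. However, the estimates of Sect. 2 allow us to give reasonable estimates on the probability distribution of this Lipschitz norm. Recalling (2.5) we have
$$\mathbb{P}\left[\sup_{\sigma\in\mathcal{S}_\Delta}\left|\Delta H_{\gamma,\Delta}(\sigma)\right| \ge \gamma^{-1}|\Delta|\left(16(1+c)\gamma\ell + 4\gamma M\right)\right] \le 16\,e^{-c\gamma^{-1}|\Delta|}. \tag{5.11}$$
Therefore, using (2.1) we get
$$\begin{aligned}
\mathbb{P}\left[\sup_{\sigma\in\mathcal{S}_\Delta}\left|H_{\gamma,\Delta}(\sigma)\right| \ge \gamma^{-1}|\Delta|\left(C + 16(1+c)\gamma\ell + 4\gamma M\right)\right]
\le{}& 16\,e^{-C\gamma^{-1}|\Delta|}\\
&+ \mathbb{P}\left[\sup_{\sigma\in\mathcal{S}_\Delta}\left|\gamma^{-1}E^{\ell}_{\gamma,\Delta}(m_\ell(\sigma))\right| \ge C\gamma^{-1}|\Delta|\right].
\end{aligned}\tag{5.12}$$
To estimate this last probability, we notice that by convexity
$$2\left(m_\ell(x,\sigma), m_\ell(y,\sigma)\right) \le \|m_\ell(x,\sigma)\|_2^2 + \|m_\ell(y,\sigma)\|_2^2. \tag{5.13}$$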
Therefore
P ` (m` (σ))| = 1/2 x,y∈1 Jγ` (x − y)(m` (x, σ), m` (y, σ)) |γ −1 Eγ,1 ≤ `/2
P
(5.14)
2 x∈1 km` (x, σ)k2
Now we have i h P P supσ∈S1 ` x∈1 km` (x, σ)k22 ≥ 2Cγ −1 |1| ≤ 2γ ≤2
−1
|1|
P P ` x∈1 km` (x, σ)k22 ≥ 2Cγ −1 |1|
γ −1 |1|
inf 0≤t c, for −1/2 for all γ small enough then there exists a set g with P[g ] ≥ 1 − Ke−c(g(γ)) some positive constants c and K, such that for all ω ∈ g , i h ± ± (µ± ,s± ) − E ln Z1(µ ,s ) ≤ βγ −1 (g(γ))1/4 . (5.26) ln Z1 Proof. The Corollary follows from Proposition 5.2 by choosing t = γ 1/2 |1|−1/2 (g(γ))−1/4 6. Localization of the Gibbs Measures II: The Block-Scale 6.1. Finite volume, free boundary conditions. Instead of dealing with the measures (µ± ,s± ) [ω] immediately, we will first consider the simpler case of Gibbs measures in Gβ,γ, 3 a finite volume 3 ≡ [v− , v+ ] of order |3| = o(γ −1 ) with free (Dirichlet) boundary conditions. This will be considerably simpler and the result will actually be needed as a basic input in order to deal with the full problem. On the other hand, the result may be seen as interesting in its own right and exhibits, to a large extent, the main relevant features of the model. This may indeed satisfy many readers who may not wish to follow the additional technicalities. With this in mind, we give a more detailed exposition of this case. Our basic result here will be that the free boundary conditions measure in volumes small compared to γ −1 are concentrated on “constant profiles” with very large probability. More precisely, we have
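Theorem 6.1. Assume that $\gamma|\Lambda|\downarrow 0$, $\beta$ large enough ($\beta>1$) and $\gamma M(\gamma)\downarrow 0$. Then we can find $\gamma^{-1}\gg\hat L\gg 1$ and $\hat\zeta\downarrow 0$, such that on a subset $\Omega_\Lambda\subset\Omega$ with $\mathbb{P}(\Omega_\Lambda^c)\le e^{-c\,g^{-1/2}(\gamma)}$, where $g(\gamma)\downarrow 0$ and $\gamma^{-1}g(\gamma)>c$, we have that for all $\omega\in\Omega_\Lambda$,
$$\mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\exists_{u\in\Lambda}\ \eta_{\hat\zeta,\hat L}(u,\sigma) = 0\right) \le e^{-\hat L h(\hat\zeta)} \tag{6.1}$$
and
$$\mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\exists_{u\in\Lambda}\ \eta_{\hat\zeta,\hat L}(u,\sigma) \ne \eta_{\hat\zeta,\hat L}(u+1,\sigma)\right) \le e^{-\hat L h(\hat\zeta)}, \tag{6.2}$$
where $h(\zeta) = c\beta\zeta\epsilon(\zeta)$ and $\epsilon(\zeta)$ is defined in Proposition 4.1.

The proof of this theorem relies on a large deviation type estimate for events that take place on a scale much smaller than the size of $\Lambda$. We will consider events $F$ that are in the cylinder algebra with base $I = [u_-,u_+]\subset\Lambda$, where $|I|\ll 1/(\gamma\ell)$ is very small compared to $\Lambda$, and that in addition are measurable with respect to the sigma-algebra generated by the variables $\{m_\ell(\sigma,x)\}_{x\in I}$. Let us define the functions $U_\Delta^{(\mu^\pm,s^\pm)}$ and $F_{\Delta,\beta,\rho}^{(\mu^\pm,s^\pm)}$ by
$$U_\Delta^{(\mu^\pm,s^\pm)}(m_\ell) \equiv \gamma\ell\sum_{x,y\in\Delta}J_{\gamma\ell}(x-y)\frac{\|m_\ell(x)-m_\ell(y)\|_2^2}{4} + \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\frac{\|m_\ell(x)-m^{(\mu^\pm,s^\pm)}\|_2^2}{2} \tag{6.3}$$
and
$$F_{\Delta,\beta,\rho}^{(\mu^\pm,s^\pm)}(m_\ell) \equiv U_\Delta^{(\mu^\pm,s^\pm)}(m_\ell) + \gamma\ell\sum_{x\in\Delta}f_{x,\beta,\rho}(m_\ell(x)), \tag{6.4}$$
where
$$f_{x,\beta,\rho}(m_\ell(x)) \equiv -\frac{1}{\beta\ell}\ln\mathbb{E}_\sigma\, e^{\frac{\beta\ell}{2}\|m_\ell(\sigma,x)\|_2^2}\,\mathbb{1}_{\{\|m_\ell(\sigma,x)-m_\ell(x)\|_2\le\rho\}}. \tag{6.5}$$
For any $\delta>0$ define the $\delta$-covering $F_\delta$ of $F$ as
$$F_\delta \equiv \left\{\sigma\ \big|\ \exists_{\sigma'\in F} : \forall_{x\in I}\ \|m_\ell(\sigma,x)-m_\ell(\sigma',x)\|_2 < \delta\right\}. \tag{6.6}$$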
With these notations we have the following large deviation estimates:

Theorem 6.2. Let $F$ and $F_\delta$ be as defined above. Assume that $|\Lambda|\le g(\gamma)\gamma^{-1}$, where $g(\gamma)$ satisfies the hypothesis of Corollary 5.3. Then there exist $\ell, L, \zeta, R$, all depending on $\gamma$, and a set $\Omega_\Lambda\subset\Omega$ with $\mathbb{P}[\Omega_\Lambda^c]\le Ke^{-c(g(\gamma))^{-1/2}} + e^{-cR/\gamma}$ such that for all $\omega\in\Omega_\Lambda$,
$$\frac{\gamma}{\beta}\ln\mathcal{G}_{\beta,\gamma,\Lambda}[\omega](F) \le -\inf_{\mu^\pm,s^\pm,\ \pm(w_\pm-u_\pm)\le R}\left[\inf_{m_\ell\in F}F^{(\mu^\pm,s^\pm)}_{[w_-,w_+],\beta,\gamma}(m_\ell) - \inf_{m_\ell}F^{(1,1,1,1)}_{[w_-,w_+],\beta,\gamma}(m_\ell)\right] + \mathrm{er}(\ell,L,M,\zeta,R), \tag{6.7}$$
and for any $\delta>0$, for $\gamma$ small enough
$$\frac{\gamma}{\beta}\ln\mathcal{G}_{\beta,\gamma,\Lambda}[\omega](F_\delta) \ge -\inf_{\mu^\pm,s^\pm,\ \pm(w_\pm-u_\pm)\le R}\left[\inf_{m_\ell\in F}F^{(\mu^\pm,s^\pm)}_{[w_-,w_+],\beta,\gamma}(m_\ell) - \inf_{m_\ell}F^{(1,1,1,1)}_{[w_-,w_+],\beta,\gamma}(m_\ell)\right] - \mathrm{er}(\ell,L,M,\zeta,R), \tag{6.8}$$
where $\mathrm{er}(\ell,L,M,\zeta,R)$ is a function of $\alpha\equiv\gamma M$ that tends to zero as $\alpha\downarrow 0$.

Proof. Relative to the interval $I$ we introduce again the partition of $\mathcal{S}$ from Sect. 4. While we will use again the estimate (4.10), we treat the terms corresponding to $S_R$ somewhat differently. Let us introduce the constrained partition functions
$$Z_{\beta,\gamma,\Lambda}[\omega](F) \equiv \mathcal{G}_{\beta,\gamma,\Lambda}[\omega](F)\,Z_{\beta,\gamma,\Lambda}[\omega]. \tag{6.9}$$
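Just as in Proposition 4.1 we have that
$$\begin{aligned}
Z_{\beta,\gamma,\Lambda}\left(F\cap A(\mu^\pm,s^\pm,w_\pm)\right) \le{}& Z_{\beta,\gamma,\Lambda^-}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right)\\
&\times Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F)\,Z_{\beta,\gamma,\Lambda^+}\left(\{\eta(w_+,\sigma)=s^+e^{\mu^+}\}\right)e^{8\gamma^{-1}(\zeta+2\gamma L)}
\end{aligned}\tag{6.10}$$
and
$$\begin{aligned}
Z_{\beta,\gamma,\Lambda}\left(F\cap A(\mu^\pm,s^\pm,w_\pm)\right) \ge{}& Z_{\beta,\gamma,\Lambda^-}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right)\\
&\times Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F)\,Z_{\beta,\gamma,\Lambda^+}\left(\{\eta(w_+,\sigma)=s^+e^{\mu^+}\}\right)e^{-8\gamma^{-1}(\zeta+2\gamma L)},
\end{aligned}\tag{6.11}$$
where $\Delta = [w_-+\frac12, w_+-\frac12]$ and $\Lambda^\pm$ are the two connected components of the complement of $\Delta$ in $\Lambda$. Using the trivial observation that
$$Z_{\beta,\gamma,\Lambda} \ge Z_{\beta,\gamma,\Lambda}\left(A(\mu^\pm=1,s^\pm=1,w_\pm)\right), \tag{6.12}$$
this combines to
$$\begin{aligned}
\mathcal{G}_{\beta,\gamma,\Lambda}\left(F\cap A(\mu^\pm,s^\pm,w_\pm)\right) \le{}& \frac{Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F)}{Z^{(1,1,1,1)}_{\beta,\gamma,\Delta}}\\
&\times\frac{Z_{\beta,\gamma,\Lambda^-}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right)}{Z_{\beta,\gamma,\Lambda^-}\left(\{\eta(w_-,\sigma)=e^1\}\right)}\,\frac{Z_{\beta,\gamma,\Lambda^+}\left(\{\eta(w_+,\sigma)=s^+e^{\mu^+}\}\right)}{Z_{\beta,\gamma,\Lambda^+}\left(\{\eta(w_+,\sigma)=e^1\}\right)}\\
&\times e^{16\gamma^{-1}(\zeta+2\gamma L)}.
\end{aligned}\tag{6.13}$$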
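The point is now that the ratios of partition functions on $\Lambda^\pm$ are in fact "close" to one. Indeed we have

Lemma 6.3. Let $\Lambda = [w_--\frac12, w_++\frac12]$ with $|\Lambda|\le\gamma^{-1}g(\gamma)$, where $g(\gamma)\downarrow 0$ and $g(\gamma)/\gamma\ge c>0$. Then
$$\left|\ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right) - \ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=e^1\}\right)\right| \le \beta\gamma^{-1}\left((g(\gamma))^{1/4} + 10\zeta + 48\gamma L\right) \tag{6.14}$$
with probability greater than $1-e^{-c\gamma^{-1}} - Ke^{-c(g(\gamma))^{-1/2}}$.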
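Proof. Let us denote by $Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-}$ the partition function with free boundary condition on $\partial^+\Lambda$ and mesoscopic boundary condition $m^{(\mu^-,s^-)}$ at $w_-$ (see the lines following (2.4) and (4.9) for the notation). Introducing a carefully chosen zero and using the triangle inequality, we then see that
$$\begin{aligned}
&\left|\ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right) - \ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=e^1\}\right)\right|\\
&\quad\le \left|\ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right) - \ln Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-} + \ln Z^{(0,0,1,1)}_{\beta,\gamma,\Lambda\setminus w_-} - \ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=e^1\}\right)\right|\\
&\qquad+ \left|\ln Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-} - \mathbb{E}\ln Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-}\right|\\
&\qquad+ \left|\mathbb{E}\ln Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-} - \mathbb{E}\ln Z^{(0,0,1,1)}_{\beta,\gamma,\Lambda\setminus w_-}\right|\\
&\qquad+ \left|\mathbb{E}\ln Z^{(0,0,1,1)}_{\beta,\gamma,\Lambda\setminus w_-} - \ln Z^{(0,0,1,1)}_{\beta,\gamma,\Lambda\setminus w_-}\right|.
\end{aligned}\tag{6.15}$$
The third term on the right-hand side of (6.15) is zero by symmetry, while the second and fourth are bounded, by Corollary 5.3 (see (5.26)), by $\beta\gamma^{-1}(g(\gamma))^{1/4}$ with probability at least $1-e^{-c\gamma^{-1}} - Ke^{-c(g(\gamma))^{-1/2}}$. To bound the first term we proceed as in the proof of Proposition 4.1, Part (i), that is, we use the same decomposition as in (4.4) and (4.53). This gives that
$$\ln Z_{\beta,\gamma,\Lambda}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right) - \ln Z^{(0,0,\mu^-,s^-)}_{\beta,\gamma,\Lambda\setminus w_-} = \ln Z_{w_-,\beta,\gamma}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right) + O\left(4\gamma^{-1}(\zeta+2\gamma L)\right). \tag{6.16}$$
The constrained partition function on the block $w_-$ is easily dealt with. First, we note that by (2.5) with probability greater than $1-\exp(-c\gamma^{-1})$ we can replace the Hamiltonian by its blocked version on scale $L$ at the expense of an error of order $\gamma^{-1}(16\gamma L)$. Then we can repeat the steps (4.56) to (4.58) and use Lemma 3.1 to get that with the same probability,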
ln Zw− ,β,γ ({η(w− , σ) = s− eµ }) (6.17) ≥ −βγ −1 φ(a(β)) + ζ 2 + lnln L2 − βγ −1 (16γL), q provided ζ ≥ 2 M L . Using (4.59) and the large deviation bound (3.3), we also get − ln Zw− ,β,γ ({η(w− , σ) = s− eµ }) ≤ −βγ −1 φ(a(β)) − 21 ζ 2 + βγ −1 (16γL). (6.18) The same bounds hold of course for the term with (s− , µ− ) replaced by (1, 1), so that we get an upper bound 3 2 −1 48γL + 8ζ + ζ (6.19) βγ 2 for the first term on the right of (61.9). Putting all things together, we arrive at the assertion of the lemma.
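Lemma 6.3 asserts that, to leading order, only the first ratio of partition functions is relevant in (6.13). On the other hand, since by Proposition 4.1, Part (ii), we only need to consider the case $|\Delta|\le R$, we can use the block approximation on scale $\ell$ for those, committing an error of order $\beta\gamma^{-1}(R\gamma\ell)$ only. We will make this precise in the next lemma.

Lemma 6.4. For any $(\mu^\pm,s^\pm,w_\pm)$ and $I\subset\Delta\subset\Lambda$ and any $F$ that is measurable with respect to the sigma algebra generated by $\{m_\ell(\sigma,x)\}_{x\in I}$,
$$\frac{\gamma}{\beta}\ln\frac{Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F)}{Z^{(1,1,1,1)}_{\beta,\gamma,\Delta}} \le -\inf_{m_\ell\in F}F^{(\mu^\pm,s^\pm)}_{\Delta,\beta,\rho}(m_\ell) + \inf_{m_\ell}F^{(1,1,1,1)}_{\Delta,\beta,\rho}(m_\ell) + c'\left(|\Delta|\gamma\ell + |\Delta|\gamma M\left|\ln\frac{2\ell}{M}\right| + |\Delta|\frac{M}{\ell}\right) \tag{6.20}$$
and, for all $\delta>0$, for sufficiently small $\gamma$,
$$\frac{\gamma}{\beta}\ln\frac{Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F_\delta)}{Z^{(1,1,1,1)}_{\beta,\gamma,\Delta}} \ge -\inf_{m_\ell\in F}F^{(\mu^\pm,s^\pm)}_{\Delta,\beta,\rho}(m_\ell) + \inf_{m_\ell}F^{(1,1,1,1)}_{\Delta,\beta,\rho}(m_\ell) - c'\left(|\Delta|\gamma\ell + |\Delta|\gamma M\left|\ln\frac{2\ell}{M}\right| + |\Delta|\frac{M}{\ell}\right) \tag{6.21}$$
with probability greater than $1-e^{-c|\Delta|/\gamma}$.

Proof. Using Lemma 2.1, we see that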
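$$Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F) \le \mathbb{E}_\sigma\,\mathbb{1}_{\{m_\ell(\sigma)\in F\}}\,e^{-\beta\gamma^{-1}\left[E^{\ell}_{\gamma,\Delta}(m_\ell(\sigma)) + E^{\ell,L}_{\gamma,\Delta}\left(m_\ell(\sigma_\Delta),\,m^{(\mu^\pm,s^\pm)}\right)\right]}\times e^{\beta\gamma^{-1}40|\Delta|\gamma\ell} \tag{6.22}$$
and
$$Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F) \ge \mathbb{E}_\sigma\,\mathbb{1}_{\{m_\ell(\sigma)\in F\}}\,e^{-\beta\gamma^{-1}\left[E^{\ell}_{\gamma,\Delta}(m_\ell(\sigma)) + E^{\ell,L}_{\gamma,\Delta}\left(m_\ell(\sigma_\Delta),\,m^{(\mu^\pm,s^\pm)}\right)\right]}\times e^{-\beta\gamma^{-1}40|\Delta|\gamma\ell}. \tag{6.23}$$
Now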
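$$\begin{aligned}
&E^{\ell}_{\Delta}(m_\ell(\sigma_\Delta)) + E^{\ell,L}_{\Delta,\partial\Delta}\left(m_\ell(\sigma_\Delta), m^{(\mu^\pm,s^\pm)}\right)\\
&= E^{\ell}_{\Delta}(m_\ell(\sigma_\Delta)) + E^{\ell,L}_{\Delta,\partial\Delta}\left(m_\ell(\sigma_\Delta), m^{(\mu^\pm,s^\pm)}\right) + \gamma\ell\sum_{x\in\Delta}\frac{\|m_\ell(\sigma,x)\|_2^2}{2} + \gamma\ell\sum_{x\in\partial\Delta}\frac{[a(\beta)]^2}{2}\\
&\quad - \gamma\ell\sum_{x\in\Delta}\frac{\|m_\ell(\sigma,x)\|_2^2}{2} - \gamma\ell\sum_{x\in\partial\Delta}\frac{[a(\beta)]^2}{2}\\
&= -\gamma\ell\,\frac12\sum_{(x,y)\in\Delta\times\Delta}J_{\gamma\ell}(x-y)\left(m_\ell(\sigma,x), m_\ell(\sigma,y)\right) - \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\left(m_\ell(x,\sigma), m^{(\mu^\pm,s^\pm)}\right)\\
&\quad + \gamma\ell\sum_{x\in\Delta}\frac12\left(m_\ell(x,\sigma), m_\ell(x,\sigma)\right) + \gamma\ell\sum_{x\in\partial\Delta}\frac12\left(m^{(\mu^\pm,s^\pm)}, m^{(\mu^\pm,s^\pm)}\right)\\
&\quad - \gamma\ell\sum_{x\in\Delta}\frac{\|m_\ell(\sigma,x)\|_2^2}{2} - \gamma\ell\sum_{x\in\partial\Delta}\frac{[a(\beta)]^2}{2}.
\end{aligned}\tag{6.24}$$
On the other hand
$$\begin{aligned}
&\gamma\ell\sum_{x,y\in\Delta}J_{\gamma\ell}(x-y)\frac{\|m_\ell(\sigma,x)-m_\ell(\sigma,y)\|_2^2}{4} + \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\frac{\|m_\ell(\sigma,x)-m^{(\mu^\pm,s^\pm)}\|_2^2}{2}\\
&= -\gamma\ell\sum_{x,y\in\Delta}\frac12 J_{\gamma\ell}(x-y)\left(m_\ell(\sigma,x), m_\ell(\sigma,y)\right) - \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\left(m_\ell(\sigma,x), m^{(\mu^\pm,s^\pm)}\right)\\
&\quad + \gamma\ell\sum_{x,y\in\Delta}\frac12 J_{\gamma\ell}(x-y)\|m_\ell(\sigma,x)\|_2^2 + \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\left(\frac12\|m_\ell(\sigma,x)\|_2^2 + \frac12[a(\beta)]^2\right)\\
&= -\gamma\ell\sum_{x,y\in\Delta}\frac12 J_{\gamma\ell}(x-y)\left(m_\ell(\sigma,x), m_\ell(\sigma,y)\right) - \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}J_{\gamma\ell}(x-y)\left(m_\ell(\sigma,x), m^{(\mu^\pm,s^\pm)}\right)\\
&\quad + \gamma\ell\sum_{x\in\Delta}\frac12\|m_\ell(\sigma,x)\|_2^2 + \gamma\ell\sum_{x\in\Delta,\,y\in\partial\Delta}\frac12 J_{\gamma\ell}(x-y)[a(\beta)]^2.
\end{aligned}\tag{6.25}$$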
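The first equality in (6.25) is the expansion of the squared norms together with the symmetry of the kernel. The following is a minimal numerical sketch of that step: the box-shaped kernel, the lattice sizes and all names are hypothetical, the common prefactor $\gamma\ell$ is dropped, and the boundary vector $b$ plays the role of $m^{(\mu^\pm,s^\pm)}$, so that $\|b\|_2^2$ stands for $[a(\beta)]^2$.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kernel(d, width=3):
    """Hypothetical symmetric interaction kernel, J(d) = J(-d)."""
    return 1.0 / (2 * width + 1) if abs(d) <= width else 0.0


def gradient_form(m, b, interior, boundary):
    """Left-hand side of (6.25), without the common prefactor gamma * ell."""
    bulk = sum(kernel(x - y) * sq_dist(m[x], m[y]) / 4.0
               for x in interior for y in interior)
    edge = sum(kernel(x - y) * sq_dist(m[x], b) / 2.0
               for x in interior for y in boundary)
    return bulk + edge


def expanded_form(m, b, interior, boundary):
    """Middle expression of (6.25): cross terms plus diagonal terms."""
    cross_bulk = -0.5 * sum(kernel(x - y) * dot(m[x], m[y])
                            for x in interior for y in interior)
    cross_edge = -sum(kernel(x - y) * dot(m[x], b)
                      for x in interior for y in boundary)
    diag_bulk = 0.5 * sum(kernel(x - y) * dot(m[x], m[x])
                          for x in interior for y in interior)
    diag_edge = sum(kernel(x - y) * (0.5 * dot(m[x], m[x]) + 0.5 * dot(b, b))
                    for x in interior for y in boundary)
    return cross_bulk + cross_edge + diag_bulk + diag_edge


def random_profile(n=10, dim=4, width=3, seed=0):
    """Random profile m on an interior of n sites with a boundary of `width` sites per side."""
    rng = random.Random(seed)
    interior = list(range(n))
    boundary = list(range(-width, 0)) + list(range(n, n + width))
    m = {x: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for x in interior}
    b = [0.8] + [0.0] * (dim - 1)  # stands for a(beta) e^1 with a hypothetical a(beta) = 0.8
    return m, b, interior, boundary
```

Both functions return the same number up to rounding, which is the content of the first equality in (6.25); the second equality additionally uses the normalization of the kernel over $\Delta\cup\partial\Delta$, which this sketch does not model.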
Comparing (6.24) and (6.25) we find that ± ± `,L ` (m` (σ1 )) + E1,∂1 m` (σ1 ), m(µ ,s ) E1 + γ`
X km` (σ, x)k2 X [a(β)]2 2 + γ` 2 2 x∈1 x∈∂1
X
= γ`
Jγ` (x − y)
x,y∈1
X
+ γ`
Jγ` (x − y)
x∈1,y∈∂1
X
− γ`
x∈1,y∈∂1
≡ U1µ
±
,s±
km` (σ, x) − m` (σ, y)k22 4 km` (σ, x) − m(µ 2
±
,s± ) 2 k2
(6.26)
1 Jγ` (x − y) [a(β)]2 2
(m` (σ1 )) − C(|1|, β),
where C(|1|, β) is an irrelevant σ-independent constant that will drop out of all relevant formulas and may henceforth√be ignored. For suitably chosen ρ we introduce a lattice WM,ρ in RM with spacing ρ/ M . Then for any domain D ⊂ RM , the balls of radius ρ centered at the points q of WM,ρ ∩D cover D. For reasons that should be clear from Sect. 3, we choose ρ = 2
M ` .
With probability greater than 1 − exp(−c`), fx,β,ρ (m` (x)) = ∞
kmk22
> 2, while if the number of lattice points within the ball of radius 2 are bounded 2` . But this implies that by exp M ln M ` ± ± −βγ −1 E1 m` (σ1 ),m(µ ,s ) (m` (σ1 )+E `,L 1,∂ 1 ln Eσ1 1I{m` (σ)∈F } e (6.27) h i (µ± ,s± ) 2` (m` ) − C(|1|, β) + |1| M | ln M | + 2M , ≤ −γ −1 β inf m` ∈F F1,β,ρ ` q and also, if δ > 2
M ` ,
` ± ± −βγ −1 E1 m` (σ1 ),m(µ ,s ) (m` (σ1 )+E `,L 1,∂ 1 ln Eσ1 1I{m` (σ)∈Fδ } e h i (µ± ,s± ) (m` ) − C(|1|, β) − |1|2 M ≥ −γ −1 β inf m` ∈F F1,β,ρ ` .
(6.28)
Treating the denominator in the first line of (6.13) in the same way and putting everything together concludes the proof of the lemma.

An immediate corollary of Lemma 6.4 is

Lemma 6.5. For any $(\mu^\pm,s^\pm,w_\pm)$, $|\Lambda|\le\gamma^{-1}g(\gamma)$ and any $F$ that is measurable with respect to the sigma algebra generated by $\{m_\ell(\sigma,x)\}_{x\in I}$,
$$\begin{aligned}
\frac{\gamma}{\beta}\ln\mathcal{G}_{\beta,\gamma,\Lambda}\left(F\cap\tilde A(\mu^\pm,s^\pm,w_\pm)\right) \le{}& -\inf_{m_\ell\in F}F^{(\mu^\pm,s^\pm)}_{\Delta,\beta,\rho}(m_\ell) + \inf_{m_\ell}F^{(1,1,1,1)}_{\Delta,\beta,\rho}(m_\ell)\\
&+ c'\left(\gamma L + (g(\gamma))^{1/4} + \zeta + |\Delta|\gamma\ell + |\Delta|\gamma M\left|\ln\frac{2\ell}{M}\right| + |\Delta|\frac{M}{\ell}\right)
\end{aligned}\tag{6.29}$$
with probability greater than $1-Ke^{-c(g(\gamma))^{-1/2}} - 2e^{-c/\gamma}$ for some finite positive numerical constants $c, c', K$.

Proof. This is an immediate consequence of (6.13) and Lemmata 6.3 and 6.4.
We are now set to prove the upper bound in Theorem 6.2. Using the notation of Sect. 4 we have that c ) ln Gβ,γ,3 (F ) ≤ ln Gβ,γ,3 (F ∩ SR ) + Gβ,γ,3 (F ∩ SR c Gβ,γ,3 (F ∩ SR ) = ln Gβ,γ,3 (F ∩ SR ) + ln 1 + Gβ,γ,3 (F ∩ SR ) ≤ 4M 2 2R
sup µ± ,s± ,±(w± −u± )≤R
+ ln 1 +
ln Gβ,γ,3 (F ∩ A(µ± , s± , w± ))
exp (−c2 βLRζ(ζ)) Gβ,γ,3 (F ∩ SR )
(6.30)
,
where we used (4.14). We see that the last term can be made irrelevantly small by choosing $R$ sufficiently large. In fact, since we will consider events $F$ whose probability will be at least of order $\exp(-\gamma^{-1}\beta C)$, it will suffice to choose
$$R \gg \frac{1}{\gamma L\zeta\epsilon(\zeta)}. \tag{6.31}$$
On the other hand, in order for the error terms in (6.20) to go to zero, we must assure that (note that $|\Delta| = |I|+2R$ is of order $R$) $R(\gamma\ell + \frac{M}{\ell})$ tends to zero. With $\alpha\equiv\gamma M$, this means
$$R\left(\gamma\ell + \frac{\alpha}{\gamma\ell}\right)\downarrow 0. \tag{6.32}$$
From this we see that $\ell$ should be chosen as $\gamma\ell = \sqrt{\alpha}$, while $R$ must satisfy $R\sqrt{\alpha}\downarrow 0$. (6.31) and (6.32) impose conditions on $L$ and $\zeta$, namely that
$$\frac{\sqrt{\alpha}}{\gamma L\zeta\epsilon(\zeta)}\downarrow 0. \tag{6.33}$$
Of course we also need that $\zeta\downarrow 0$ and $\gamma L\downarrow 0$, but clearly these constraints can be satisfied provided that $\alpha\downarrow 0$ as $\gamma\downarrow 0$. Thus the upper bound of Theorem 6.2 follows. To prove the lower bound, we will actually need to make use of the upper bound. To do so, we need more explicit control of the functional $F$, i.e. we have to use the explicit bounds on $f_{x,\beta,\rho}(m_\ell(x))$ in terms of the function $\Phi$ from Lemma 3.1.

Lemma 6.6. The functional $F$ defined in (6.4) satisfies
$$F^{(\mu^\pm,s^\pm)}_{\Delta,\beta,\rho}(m_\ell) \ge U^{(\mu^\pm,s^\pm)}_\Delta(m_\ell) + \gamma\ell\sum_{x\in\Delta}\Phi_{x,\beta}(m_\ell(x)) - \frac12|\Delta|\rho^2 \tag{6.34}$$
and
$$\inf_{m_\ell}F^{(1,1,1,1)}_{\Delta,\beta,\rho}(m_\ell) \le |\Delta|\phi_\beta(a(\beta)) + |\Delta|\frac{\ln 2}{\ell\beta}, \tag{6.35}$$
where $\phi_\beta(a)\equiv\frac{a^2}{2} - \beta^{-1}\ln\cosh(\beta a)$.

Proof. Equation (6.34) follows straightforwardly from (3.3). To get (6.35), just note that $U$ is non-negative and is equal to zero for any constant $m_\ell$, while from Lemma 3.1 it follows that
$$\inf_{m_\ell(x)}f_{x,\beta,\rho}(m_\ell(x)) \le \inf_{m_\ell(x)}\Phi_{x,\beta}(m_\ell(x)) + \frac{\ln 2}{\ell\beta} \le \Phi_{x,\beta}(m^{(1,1)}) + \frac{\ln 2}{\ell\beta} = \phi_\beta(a(\beta)) + \frac{\ln 2}{\ell\beta}. \tag{6.36}$$
This concludes the derivation of the upper bound. We now turn to the corresponding lower bound. What is needed for this is an upper bound on the partition function that would be comparable to the lower bound (6.12). Now X Eσ e−βH3 (σ3 ) 1I{η(w± ,σ)=s± eµ± } Zβ,γ,3 = (µ± ,s± )
Zβ,γ,3
×P =
X
(µ± ,s± )
Eσ e
−βH3 (σ3 ) Eσ e{η(w ,σ)=s± eµ± } ±
−βH3 (σ3 )
(µ± ,s± )
×
=
1I{η(w± ,σ)=s± eµ± }
Zβ,γ,3 1 − 1I{η(w± ,σ)=0}
(6.37)
e−βH3 (σ3 )
Eσ X Eσ e−βH3 (σ3 ) (µ± ,s± )
−1 ×1I{η(w± ,σ)=s± eµ± } 1 − Gβ,γ,3 {η(w± , σ) = 0} . This is almost the same form as the one we want, except for the last factor. The point is now that we want to use our upper bound from Theorem 6.2 to show that Gβ,γ,3 {η(w± , σ) = 0} is small, e.g. smaller than 1/2, so that this entire factor is negligible on our scale. Remembering our estimate (4.10), one may expect an estimate of the order exp(−c2 βLζ(ζ)), up to the usual errors. Unfortunately, these errors are of order exp(±βγ −1 (ζ + γL)) and thus may offset completely the principal term. A way out of this apparent dilemma is given by our remaining freedom of choice in the parameters ζ and L; that is to say, to obtain the lower bound, we will use a ζˆ and a Lˆ such that first ˆ ζ) ˆ γ −1 ζ + L. This is they still satisfy the requirement (61.223) while second c2 Lˆ ζ(( clearly possible. With this in mind we get Lemma 6.7. With the same probability as in Lemma 6.5,
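$$\frac{\gamma}{\beta}\ln\mathcal{G}_{\beta,\gamma,\Lambda}\left(\{\eta_{\hat\zeta,\hat L}(w_\pm,\sigma)=0\}\right) \le -\gamma\hat L\hat\zeta\,\frac{1-\delta}{2-\delta\hat\zeta}\,\epsilon(\delta\hat\zeta) + c'\left(\gamma L + (g(\gamma))^{1/4} + \zeta + R\gamma\ell + R\gamma M\left|\ln\frac{2\ell}{M}\right| + R\frac{M}{\ell}\right). \tag{6.38}$$

Proof. The proof of this lemma is very similar to the proof of (ii) of Proposition 4.1, except that in addition we use the upper bound of Lemma 6.5 to reduce the error terms. We will skip the details of the proof.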
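Choosing $\hat L$ and $\hat\zeta$ appropriately, we can thus achieve that $\left(1-\mathcal{G}_{\beta,\gamma,\Lambda}\left(\{\eta(w_\pm,\sigma)=0\}\right)\right)^{-1}\le 2$, so that
$$\begin{aligned}
Z_{\beta,\gamma,\Lambda} &\le 2\sum_{(\mu^\pm,s^\pm)}\mathbb{E}_\sigma\,e^{-\beta H_\Lambda(\sigma_\Lambda)}\,\mathbb{1}_{\{\eta(w_\pm,\sigma)=s^\pm e^{\mu^\pm}\}}\\
&\le 2(2M)^2\sup_{\mu^\pm,s^\pm}Z_{\beta,\gamma,\Lambda^-}\left(\{\eta(w_-,\sigma)=s^-e^{\mu^-}\}\right)\\
&\qquad\times Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}\,Z_{\beta,\gamma,\Lambda^+}\left(\{\eta(w_+,\sigma)=s^+e^{\mu^+}\}\right)e^{+8\gamma^{-1}\beta(\hat\zeta+2\gamma\hat L)}
\end{aligned}\tag{6.39}$$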
(we will drop henceforth the distinction between $\hat L$ and $L$ and between $\hat\zeta$ and $\zeta$). The first and third factors in the last line are, by Lemma 6.3, independent of $\mu^\pm, s^\pm$, up to the usual errors. The second partition function is maximal for $(\mu^+,s^+) = (\mu^-,s^-)$ (this will be shown later). Thus with probability greater than $1-e^{-c\gamma^{-1}} - Ke^{-c(g(\gamma))^{-1/2}}$,
$$\mathcal{G}_{\beta,\gamma,\Lambda}\left(F\cap A(\mu^\pm,s^\pm,w_\pm)\right) \ge \frac{Z^{(\mu^\pm,s^\pm)}_{\beta,\gamma,\Delta}(F)}{Z^{(1,1,1,1)}_{\beta,\gamma,\Delta}}\,e^{-c'\beta\gamma^{-1}\left(\zeta+\gamma L+(g(\gamma))^{1/4}\right)} \tag{6.40}$$
for some numerical constants $c, c'$. Using the second assertion of Lemma 6.4 allows us to conclude the proof of Theorem 6.2.

We are now ready to prove Theorem 6.1:

Proof of Theorem 6.1. Notice first that the first assertion (6.1) follows immediately from Lemma 6.7. Just note that
$$\mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\exists_{u\in\Lambda}\ \eta_{\hat\zeta,\hat L}(u,\sigma)=0\right) \le \sum_{u\in\Lambda}\mathcal{G}_{\beta,\gamma,\Lambda}[\omega]\left(\{\eta_{\hat\zeta,\hat L}(u,\sigma)=0\}\right) \le |\Lambda|\,e^{-c\beta\hat L\hat\zeta\epsilon(\hat\zeta)} \tag{6.41}$$
for suitably chosen $\hat L$, $\hat\zeta$. To prove (6.2), note that we need only consider the case where both $\eta(u,\sigma)$ and $\eta(u+1,\sigma)$ are non-zero. This follows then simply from the upper bound of Theorem 6.2 and the lower bound
$$\inf_{\mu^\pm,s^\pm}\ \inf_{m_\ell:\ \eta(u,m_\ell)\ne\eta(u+1,m_\ell)\ne 0}U^{(\mu^\pm,s^\pm)}_\Delta(m_\ell) \ge \frac14\gamma\ell\sum_{x\in u}\sum_{y\in u+1}J_{\gamma\ell}(x-y)\|m_\ell(x)-m_\ell(y)\|_2^2. \tag{6.42}$$
Using convexity, we see that
$$\begin{aligned}
\gamma\ell\sum_{x\in u}\sum_{y\in u+1}J_{\gamma\ell}(x-y)\|m_\ell(x)-m_\ell(y)\|_2^2
&\ge (\gamma\ell)^2\sum_{\substack{r\in u,\ s\in u+1\\ |r-s|\le(\gamma\hat L)^{-1}-2}}\ \sum_{x\in r}\sum_{y\in s}\|m_\ell(x)-m_\ell(y)\|_2^2\\
&\ge (\gamma\hat L)^2\sum_{\substack{r\in u,\ s\in u+1\\ |r-s|\le(\gamma\hat L)^{-1}-2}}\left\|\frac{\ell}{\hat L}\sum_{x\in r}m_\ell(x) - \frac{\ell}{\hat L}\sum_{y\in s}m_\ell(y)\right\|_2^2\\
&= (\gamma\hat L)^2\sum_{\substack{r\in u,\ s\in u+1\\ |r-s|\le(\gamma\hat L)^{-1}-2}}\left\|m_{\hat L}(r)-m_{\hat L}(s)\right\|_2^2.
\end{aligned}\tag{6.43}$$
±
inf µ± ,s± inf m` :η(u,m` )6=η(u+1,m` )6=0 U1(µ ,s ) (m` ) P (a(β))2 − 2a(β)ζˆ ≥ 41 r∈u,s∈u+1 ˆ −1 −2 |r−s|≤(γ L) ˆ 2 (a(β))2 − 2a(β)ζˆ . ≥ 18 (1 − 2γ L) From here the proof of (6.2) is obvious.
(6.44)
This concludes our analysis of the free boundary condition measure in volumes of order $o(\gamma^{-1})$. We have seen that these measures are concentrated on constant profiles on some scale $\hat L\ll\gamma^{-1}$ (microscopic scale). In the next subsection we will analyse the measures with fixed equilibrium boundary conditions.

6.2. Finite volume, fixed symmetric boundary conditions. To proceed in order of increasing difficulty, we consider first the case where the boundary conditions are the same on both sides of the box $\Lambda$. Since these are compatible with one of the preferred constant profiles of the free boundary conditions measures, and since the size of the box $\Lambda$ we consider is so small that by our self-averaging results we know that the random fluctuations do not favour one of the constant values by a factor on the scale $\exp(\beta\gamma^{-1})$, we expect that the optimal profile will be the constant profile compatible with the boundary conditions. Indeed, we will prove

Theorem 6.8. Assume that $|\Lambda|\le g(\gamma)\gamma^{-1}$, where $g(\gamma)$ satisfies the hypothesis of Corollary 5.3. Then there exist $\ell, L, \zeta, R$, all depending on $\gamma$, and a set $\Omega_\Lambda\subset\Omega$ with $\mathbb{P}[\Omega_\Lambda^c]\le Ke^{-c(g(\gamma))^{-1/2}} + e^{-cR/\gamma}$ such that for all $\omega\in\Omega_\Lambda$,
$$\frac{\gamma}{\beta}\ln\mathcal{G}^{(\mu,s,\mu,s)}_{\beta,\gamma,\Lambda}[\omega](F) \le -\inf_{\pm(w_\pm-u_\pm)\le R}\left[\inf_{m_\ell\in F}F^{(\mu,s,\mu,s)}_{[w_-,w_+],\beta,\gamma}(m_\ell) - \inf_{m_\ell}F^{(1,1,1,1)}_{[w_-,w_+],\beta,\gamma}(m_\ell)\right] + \mathrm{er}(\ell,L,M,\zeta,R) \tag{6.45}$$
and for any $\delta>0$, for $\gamma$ small enough
$$\frac{\gamma}{\beta}\ln\mathcal{G}^{(\mu,s,\mu,s)}_{\beta,\gamma,\Lambda}[\omega](F_\delta) \ge -\inf_{\pm(w_\pm-u_\pm)\le R}\left[\inf_{m_\ell\in F}F^{(\mu,s,\mu,s)}_{[w_-,w_+],\beta,\gamma}(m_\ell) - \inf_{m_\ell}F^{(1,1,1,1)}_{[w_-,w_+],\beta,\gamma}(m_\ell)\right] - \mathrm{er}(\ell,L,M,\zeta,R), \tag{6.46}$$
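where $\mathrm{er}(\ell,L,M,\zeta,R)$ is a function of $\alpha\equiv\gamma M$ that tends to zero as $\alpha\downarrow 0$ and where $F_\delta$ is defined in (6.6).

An immediate corollary of Theorem 6.8 is the analog of Theorem 6.1 for the measures $\mathcal{G}^{(\mu,s,\mu,s)}_{\beta,\gamma,\Lambda}[\omega]$:

Theorem 6.9. Assume that $\gamma|\Lambda|\downarrow 0$, $\beta$ large enough ($\beta>1$) and $\gamma M(\gamma)\downarrow 0$. Then we can find $\gamma^{-1}\gg\hat L\gg 1$ and $\hat\zeta\downarrow 0$, such that on a subset $\Omega_\Lambda\subset\Omega$ with $\mathbb{P}(\Omega_\Lambda^c)\le e^{-c\,g^{-1/2}(\gamma)}$, where $g(\gamma)\downarrow 0$ and $\gamma^{-1}g(\gamma)>c$, we have that for all $\omega\in\Omega_\Lambda$
$$\mathcal{G}^{(\mu,s,\mu,s)}_{\beta,\gamma,\Lambda}[\omega]\left(\exists_{u\in\Lambda}\ \eta_{\hat\zeta,\hat L}(u,\sigma)\ne se^{\mu}\right) \le e^{-\hat L h(\hat\zeta)}, \tag{6.47}$$
where $h(\zeta) = c\beta\zeta\epsilon(\zeta)$ and $\epsilon(\zeta)$ is defined in Proposition 4.1.

Remark. Equation (6.47) implies that with $\mathbb{P}$-probability one
$$\lim_{\gamma\downarrow 0}\mathcal{G}^{(\mu,s,\mu,s)}_{\beta,\gamma,\Lambda}[\omega]\left(\forall_{u\in\Lambda}\ \eta_{\hat\zeta,\hat L}(u,\sigma) = se^{\mu}\right) = 1. \tag{6.48}$$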
Proof of Theorem 6.8. Many of the technical steps in this proof are similar to those of the preceeding subsection, and we will stress only the new features here. Let us fix without restriction of generality (µ, s) = (1, 1). We consider again the upper bound first. Proceeding as in (6.1), the first major difference is that (6.13) is replaced by (1,1,1,1) (F ∩ A(µ± , s± , w± )) Gβ,γ, 3 −
≤
−
±
±
+
+
(1,1,µ ,s ) (µ ,s ,1,1) (µ ,s ) Zβ,γ, (F ) Zβ,γ,3+ \w+ 3− \w− Z1,β,γ (1,1,1,1) Zβ,γ, 3− \w−
(1,1,1,1) Z1,β,γ
(1,1,1,1) Zβ,γ, 3+ \w+
ecγ
−1
(ζ+γL)
,
(6.49)
where we have also used (6.16) through (6.18) to replace partition functions with boundary condition on one side and constraint on the other by partition functions with two-sided boundary conditions. While in the free boundary condition case, by symmetry, the ratios of partition functions on 3± were seen to be negligible, we will show here that they favour (µ± , s± ) = (1, 1). To make this precise, define for any box 3 ≡ [λ− , λ+ ] with |3| = o(γ −1 ), (µ, ˜ s,µ,s) ˜ Zβ,γ, 3 (µ, ˜ s,µ,s) ˜ Pβ,γ, ≡ . (6.50) 3 (1,1,1,1) Zβ,γ,3 In the case of symmetric boundary conditions, Corollary 5.3 provides the following estimates: −1 1/4 −1 1/4 (µ,s,µ,s) ≤ ecβγ (g(γ)) . (6.51) e−cβγ (g(γ)) ≤ Pβ,γ, 3 ˜ s,µ,s) ˜ for (µ, ˜ s) ˜ 6= (µ, s). Without loss All we need are thus estimates on the quantity P3(µ, of generality we may consider the case (µ, ˜ s, ˜ µ, s) = (1, 1, 2, 1) only. As shown in the forthcoming lemma, the quantity
P0 ≡
sup
[w− ,w+ ]⊂3∪∂ 3 |w− −w+ | 0, P0 ≥ e− 2 βγ 1
−1 2
a (β) −cβγ −1 (Rγ`+RγM | ln
e
2` M
ln 2 |+R M ` +2R ` )
.
(6.56) q
ii) There exists ζ˜0 > 0 depending on β such that for all ζ˜0 ≥ ζ˜ ≥ 2a(β)
M ` ,
0
with a probability greater than 1 − e−c M , for some constant c0 > 0, √ √ √ −1 ˜ ˜ 2` M −βγ −1 (ζ) 12((a(β))2 −4ζ˜ 2 )−3 (ζ) P0 ≤ e ecβγ (Rγ`+RγM | ln M |+R ` ) . (6.57) We will assume in the sequel that the parameters `, L, M and R satisfy the set of conditions (6.31) to (6.33) from Sect. 6.1. It is then clear that the parameter ζ˜ in part ii) of Lemma 6.11 can always be chosen in such a way that the exponential decrease of the first term in the r.h.s. of (6.57) compensates the increase of the second one. We will postpone the proof of Lemma 6.11 to the end of this subsection.
Proof of Lemma 6.10. Without loss of generality we will, for convenience, consider only sets 3 of the form 3 ≡ [λ− − 21 , λ+ + 21 ], where λ± are assumed to be integers. We start with the proof of the upper bound (6.54). Let us define the set (6.58) B ≡ σ : ∀u∈3 η(u, σ) ∈ {0, e1 , e2 } . We further define 1 1 1 , if such u exists , u1 (σ) ≡ sup u ∈ [λ− − 2 , λ+ + 2 ] | η(u, σ) = e , otherwise λ− − 1 1 2 , if such u exists , u2 (σ) ≡ inf u ∈ (u1 (σ), λ+ + 2 ] | η(u, σ) = e , otherwise λ+ + 1 and we set
B(u1 , u2 ) ≡ {σ ∈ B | u1 (σ) = u1 , u2 (σ) = u2 } .
(6.59) (6.60)
(6.61)
A piece of profile between locations u1 (σ) and u2 (σ) will be called a “jump” between equilibrium (1,1) and (2, 1). For R chosen as in (6.32), we will set moreover [ B(u1 , u2 ) (6.62) C≡ λ− −1≤u1 j1 + j2 + 1
with j1 j2 = 0
(4.59)
gives rise respectively to the $C_2$ conditions:
$$c_2 \ge 2j_1(j_1+1) + 2j_2(j_2+1) + (j_1+j_2)^2 - 4, \qquad c_2 > 3(j_1+j_2+1)(j_1+j_2-1). \tag{4.60}$$
For $j_1j_2\ne 0$ and $d > j_1+j_2+2$, each representation contains spins $s = |j_1-j_2|, \dots, j_1+j_2$, while for $d = j_1+j_2+2$ or $j_1j_2 = 0$ each representation contains only the spin $s = j_1+j_2$.
Acknowledgement. We would like to thank Philippe Caldero for valuable discussions about commutants in Lie algebras. We are also indebted to Raymond Stora for his interest and numerous comments on induced representations of the Poincaré and conformal algebras and Xo Luc Frappat for a careful reading of the manuscript.
References
1. Barbarin, F., Ragoucy, E., Sorba, P.: Nucl. Phys. B442, 425 (1995)
2. Barbarin, F., Ragoucy, E., Sorba, P.: Int. Journ. Mod. Phys. A11, 2835 (1996)
3. de Boer, J., Harmsze, F., Tjin, T.: Phys. Rep. 272, 139 (1996)
4. Feher, L., O'Raifeartaigh, L., Ruelle, P., Tsutsui, I., Wipf, A.: Phys. Rep. 222, 1 (1992) and references therein
5. Leinaas, J.M., Myrheim, J.: Int. Journ. Mod. Phys. A8, 3649 (1993)
6. Mack, G.: Commun. Math. Phys. 55, 1 (1977)
7. Kostant, B.: Lecture Notes in Mathematics 466, Berlin, Heidelberg, New York: Springer 1974, p. 101
8. Evans, J.M., Madsen, J.O.: Phys. Lett. 384, 131 (1996)
9. Moussa, P., Stora, R.: In "Methods in Subnuclear Physics". London: Gordon and Breach, Herceg-Novi Summer School, 1966
10. Mack, G., Todorov, I.: Journ. Math. Phys. 10, 2078 (1969)
11. Rühl, W.: Commun. Math. Phys. 30, 287 (1973)
12. Knapp, A.W., Speh, B.: J. Funct. Anal. 45, 41 (1982)

Communicated by R.H. Dijkgraaf
Commun. Math. Phys. 186, 413 – 449 (1997)
Communications in
Mathematical Physics
© Springer-Verlag 1997

On the Stability of Realistic Three-Body Problems*
Alessandra Celletti¹, Luigi Chierchia²
¹ Dipartimento di Matematica, Università dell'Aquila, 67100 Coppito, L'Aquila, Italy
² Dipartimento di Matematica, Università di Roma Tre, Largo San Murialdo 1, 00146 Roma, Italy
Received: 1 April 1996 / Accepted: 25 October 1996
Abstract: We consider the system Sun–Jupiter–Ceres as an example of a planar, circular, restricted three-body problem and, after substituting the mass ratio of Jupiter/Sun (which is approximately $10^{-3}$) with a parameter $\varepsilon$, we prove the existence of stable quasiperiodic motions with frequencies close to the observed (average) frequencies reported in "The Astronomical Almanac" for $|\varepsilon|\le 10^{-6}$. The proof is "computer-assisted".

Table of Contents
1 Introduction and Theorem 1.1
2 Restricted, Circular, Planar Three-Body Problem
3 A Model from the Solar System
4 A KAM Theorem
4.1 Algebraic Scheme
4.2 Analytic Tools
4.3 KAM Algorithm
4.4 KAM Theorem
5 Proof of Theorem 1.1
5.1 Step 1: Formal ε-expansion and initial approximate solution
5.2 Step 2: Norm bounds relative to the initial approximate solution
5.3 Step 3: Application of the KAM algorithm and of the KAM Theorem
A Some Computer-Assisted Data
References

* This work is part of the research program of the European Network on: "Stability and Universality in Classical Mechanics", # ER–BCHRXCT940460.
A. Celletti, L. Chierchia
1. Introduction and Theorem 1.1

1) Since humankind developed a definite mathematical taste (whatever this means), the “stability” of planetary motions might be considered one of the central questions in mathematics. Nowadays awareness of new phenomena (pollution, just to name one) has drawn the attention of scientists and non-scientists to other types of stability (in other words, extinction of living species will depend more and more upon the chaotic effects of pollution rather than on “the sky falling on our heads”). Nevertheless the stability problem for many-body systems interacting only through gravitation still stands out as one of the more intriguing and rich problems in mathematics. In modern times outstanding contributions came, above all, from H. Poincaré [13], V. I. Arnold [2] and J. Moser [12]. In particular the so-called KAM (Kolmogorov, Arnold, Moser) theory (see [1] and references therein) gave a “positive” answer to the above stability problem in the sense that it proved ([2]) the possibility of the existence of many-body systems (“planetary systems”) whose time evolution may be described by a linear flow on a torus (“quasi-periodic motion,” here synonymous with “stable motion”). The drawback of this beautiful result is that, taking the estimates contained in it seriously, it turns out that the mass ratio of the planets of such a hypothetical planetary system to their star should be more or less comparable to the mass ratio of a proton to the Sun.¹ One may then pose the question of stability of realistic many-body systems. Of course, to give a mathematical content to the word “realistic” is clearly impossible and the best we could do was to get inspiration from our own planetary system. The “simplest non-trivial” three-body problem is the so-called planar, circular, restricted three-body problem (see Sect. 2 for definitions): we considered one of the most popular three-body problems of the Solar system, namely Sun, Jupiter and Ceres (one of the major bodies of the so-called asteroid belt). Following Delaunay [6], we then derived a Hamiltonian model. Clearly, in deriving a model one simplifies Nature² quite a bit. We then took our model seriously (from the mathematical point of view) and, replacing the Jupiter/Sun mass ratio, which is approximately $10^{-3}$, with a “perturbative parameter” ε, we asked for how large values of ε one could find quasi-periodic motions with “frequencies” close to the observed frequencies of the Sun–Jupiter–Ceres system (which may be found in the ephemeris [16]). For such a model we proved stability for $|\varepsilon| \le 10^{-6}$, being therefore away from “reality” by three orders of magnitude. We leave it to the reader to judge if this is realistic or not. We believe, however, that with some more effort one should indeed be able to prove stability up to $\varepsilon = 10^{-3}$ and we regard our result as a first step towards a proof of the mathematical stability of realistic many-body problems.

Our result relies basically on two techniques: (i) a (new) KAM scheme presented in Sect. 4 (for the experts: a KAM result in Hamiltonian setting in the style of Moser, Salamon and Zehnder [15] with emphasis on analytical dependence upon parameters); (ii) computer-assisted (rigorous) estimates, which are needed in order to apply “effectively” the KAM scheme to our three-body problem. It is well known that computers may be used to prove theorems (see, e.g., [7, 11, 10] or think of the famous “four-colour theorem”). We were not really enthusiastic to rely

¹ Mass of proton/mass of Sun $= 1.6724 \cdot 10^{-21} / 8.4078 \cdot 10^{30} \simeq 1.9891 \cdot 10^{-52}$; compare [8]. For some recent applications of Arnold’s result to three-body problems see [14].
² For example, to consider Sun–Jupiter–Ceres as a planar, circular, restricted three-body problem means, in particular, that one is assuming the orbit of Jupiter to be circular: this is a rather crude approximation and more “realistic” models would include the eccentricity of Jupiter, the “secular” effects of Saturn on Jupiter, etc. See Sect. 3 for a (partial) justification of our model.
Stability of Three-Body Problems
on machines to prove our result but couldn’t get away without it. In fact we are pretty sure that with more refined techniques and/or with new ideas one might get better results without the (essentially trivial but) lengthy computations which are the only reason to call in machines.

2) We give now a precise formulation of the main result. Let $(\ell, g) \in \mathbb{T}^2 \equiv \mathbb{R}^2/(2\pi\mathbb{Z}^2)$; let³
$$L_0 \equiv 0.729305, \qquad G_0 \equiv 0.727162, \qquad r_0 \equiv 0.001, \tag{1.1}$$
and let
$$B \equiv \{(L, G) \in \mathbb{C}^2 : |L - L_0| \le r_0,\ |G - G_0| \le r_0\}, \qquad B_0 \equiv B \cap \mathbb{R}^2. \tag{1.2}$$
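On $\mathbb{T}^2 \times B_0$, endowed with the standard symplectic form $d\ell \wedge dL + dg \wedge dG$, consider the one-parameter family of Hamiltonian functions given by
$$H(\ell, g, L, G; \varepsilon) \equiv \Big(\frac{1}{2L^2} - G\Big)^2 + 2\varepsilon \Big(\frac{1}{2L^2} - G\Big) R(\ell, g, L, G) \equiv h(L, G) + \varepsilon f(\ell, g, L, G), \tag{1.3}$$
with the “perturbing function” $R$ defined as
$$R(\ell, g, L, G) \equiv \sum_{\substack{n \in \mathbb{Z}^2 \\ 0 \le |n_1| + |n_2| \le 10}} R_n(L, G) \cos(n_1 \ell + n_2 g), \tag{1.4}$$
where $R_n \equiv R_{n_1 n_2}$ vanishes unless it belongs to the following list:
$$\begin{aligned}
R_{00} &\equiv \frac{L^4}{4}\Big(1 + \frac{9}{16} L^4 + \frac{3}{2} e^2\Big), & R_{10} &\equiv -\frac{L^4 e}{2}\Big(1 + \frac{9}{8} L^4\Big), \\
R_{11} &\equiv \frac{3}{8} L^6 \Big(1 + \frac{5}{8} L^4\Big), & R_{12} &\equiv -\frac{L^4 e}{4}\,(9 + 5 L^4), \\
R_{22} &\equiv \frac{L^4}{4}\Big(3 + \frac{5}{4} L^4\Big), & R_{32} &\equiv \frac{3}{4} L^4 e, \\
R_{33} &\equiv \frac{5}{8} L^6 \Big(1 + \frac{7}{16} L^4\Big), & R_{44} &\equiv \frac{35}{64} L^8, \\
R_{55} &\equiv \frac{63}{128} L^{10},
\end{aligned} \tag{1.5}$$
and the “eccentricity” $e$, which is a function of the “action variables” $(L, G)$, is defined as⁴
$$e \equiv e(L, G) \equiv \sqrt{1 - \frac{G^2}{L^2}}. \tag{1.6}$$
Let α be the golden mean ($\alpha \equiv \frac{\sqrt{5} - 1}{2}$) and let
$$\Omega_- \equiv \frac{5}{2} + \frac{1}{13 + \alpha} = 2.573432\ldots, \qquad \Omega_+ \equiv \frac{5}{2} + \frac{1}{12 + \alpha} = 2.579251\ldots. \tag{1.7}$$

³ All the following numbers, quantities and functions will be physically motivated in Sects. 2 and 3.
⁴ In this paper the letter e will always refer to “eccentricities” and never to the Neper number; the exponential function will be denoted exp(x).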
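As an illustration of (1.3)–(1.5), the following Python sketch evaluates the eccentricity, the nine non-vanishing coefficients $R_{n_1 n_2}$ (as we read them from the list above) and the Hamiltonian $H$ at a given point. The function names are hypothetical, and plain floating point is used, so this is not the rigorous interval computation on which the proof relies.

```python
import math

L0, G0 = 0.729305, 0.727162  # reference actions from (1.1)


def eccentricity(L, G):
    """e(L, G) = sqrt(1 - G^2 / L^2)."""
    return math.sqrt(1.0 - (G / L) ** 2)


def r_coefficients(L, G):
    """Non-vanishing Fourier coefficients R_{n1 n2}(L, G) of the list (1.5)."""
    e = eccentricity(L, G)
    L4, L6 = L ** 4, L ** 6
    return {
        (0, 0): L4 / 4 * (1 + 9 / 16 * L4 + 3 / 2 * e ** 2),
        (1, 0): -L4 * e / 2 * (1 + 9 / 8 * L4),
        (1, 1): 3 / 8 * L6 * (1 + 5 / 8 * L4),
        (1, 2): -L4 * e / 4 * (9 + 5 * L4),
        (2, 2): L4 / 4 * (3 + 5 / 4 * L4),
        (3, 2): 3 / 4 * L4 * e,
        (3, 3): 5 / 8 * L6 * (1 + 7 / 16 * L4),
        (4, 4): 35 / 64 * L ** 8,
        (5, 5): 63 / 128 * L ** 10,
    }


def perturbing_function(l, g, L, G):
    """Trigonometric polynomial R of (1.4)."""
    coeffs = r_coefficients(L, G)
    return sum(c * math.cos(n1 * l + n2 * g) for (n1, n2), c in coeffs.items())


def hamiltonian(l, g, L, G, eps):
    """H = (1/(2L^2) - G)^2 + 2 eps (1/(2L^2) - G) R, as in (1.3)."""
    k = 1.0 / (2.0 * L ** 2) - G
    return k ** 2 + 2.0 * eps * k * perturbing_function(l, g, L, G)
```

At the reference point $(L_0, G_0)$ the function `eccentricity` returns a value close to the observed eccentricity 0.0766 of Ceres, which is a quick consistency check on (1.1).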
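The “observed average frequency” of Ceres is approximately $-\Omega_c \equiv -2.577107$ (see Sect. 3 below) so that $\Omega_- < \Omega_c < \Omega_+$. Let
$$L_\pm \equiv \Omega_\pm^{-1/3}, \qquad G_\pm \equiv L_\pm \sqrt{1 - e_0^2},$$
where $e_0 = 0.0766$ is the “observed” eccentricity of Ceres as found in [16]. Finally, we define the h-frequencies
$$\omega^{(\pm)} \equiv (E_\pm \Omega_\pm,\ E_\pm), \qquad E_\pm \equiv -2\Big(\frac{1}{2L_\pm^2} - G_\pm\Big). \tag{1.8}$$
For later use we point out that $\omega^{(\pm)}$ is a “Diophantine vector” satisfying
$$|\omega^{(\pm)} \cdot n| \ge (\gamma_\pm |n|)^{-1}, \qquad \forall\, n \in \mathbb{Z}^2 \setminus \{0\}, \tag{1.9}$$
with $\gamma_\pm$ given by⁵
$$\gamma_\pm = 2 |E_\pm| (\sqrt{5} + 24 \pm 1). \tag{1.10}$$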
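The numbers entering (1.8) are easy to reproduce. The Python sketch below (hypothetical function names, ordinary floating point) computes the two frequency ratios $5/2 + 1/(13+\alpha)$ and $5/2 + 1/(12+\alpha)$, the corresponding h-frequencies, and checks on a finite range of $n$ the lower bound $|an + m| \ge (\gamma |n|)^{-1}$, $\gamma = q^2(k+\alpha)$, quoted in the text for numbers of the form $a = p/q \pm 1/(k+\alpha)$. A finite numerical check is of course only a plausibility test, not a proof.

```python
import math

alpha = (math.sqrt(5) - 1) / 2          # golden mean
ratio_minus = 5 / 2 + 1 / (13 + alpha)  # 2.573432...
ratio_plus = 5 / 2 + 1 / (12 + alpha)   # 2.579251...
e0 = 0.0766                             # observed eccentricity of Ceres


def h_frequency(ratio):
    """Return (E * ratio, E) with L = ratio^(-1/3), G = L sqrt(1 - e0^2)."""
    L = ratio ** (-1.0 / 3.0)
    G = L * math.sqrt(1.0 - e0 ** 2)
    E = -2.0 * (1.0 / (2.0 * L ** 2) - G)
    return (E * ratio, E)


def diophantine_margin(a, q, k, n_max):
    """Minimum over 1 <= n <= n_max of gamma * n * dist(a*n, Z), gamma = q^2 (k + alpha).

    A value >= 1 means |a*n + m| >= 1/(gamma*n) holds for all tested n and all integers m.
    """
    gamma = q ** 2 * (k + alpha)
    worst = math.inf
    for n in range(1, n_max + 1):
        x = a * n
        dist = abs(x - round(x))  # distance of a*n to the nearest integer
        worst = min(worst, gamma * n * dist)
    return worst
```

For instance, `diophantine_margin(ratio_minus, 2, 13, 2000)` stays above 1 on the tested range, in agreement with the stated bound.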
The result discussed in 1) above can now be formulated as follows.

Theorem 1.1. Let H be as in (1.3)–(1.5); let $B_0$ be as in (1.1), (1.2); let $\omega^{(\pm)}$ be as in (1.8). Then, for all $0 \le |\varepsilon| \le 10^{-6}$ there exist (unique) two-dimensional analytic tori $S_\varepsilon(\omega^{(\pm)}) \subset B_0 \times \mathbb{T}^2$, depending analytically also on the parameter ε (for $|\varepsilon| \le 10^{-6}$), on which the H-flow is (analytically) conjugated to the linear flow $\theta \in \mathbb{T}^2 \to \theta - \omega^{(\pm)} t$.

Remark 1.1. In Sect. 3 we give a physical motivation for having chosen a trigonometric polynomial as perturbing function. We believe, however, that considering perturbations with an infinite number of non-vanishing Fourier coefficients would lead to essentially the same results, only making the proof technically more involved (and more expensive).

3) (On the proof) The proof of Theorem 1.1 is given in Sect. 5 and, as already mentioned, is a “computer-assisted” application of the KAM scheme⁶ of Sect. 4. As is well known, KAM schemes are “Newton algorithms”: they are procedures to iteratively construct solutions for certain nonlinear equations (with “loss of regularity”), starting from some initial “approximate solution,” with a quadratic rate of convergence. Our initial approximate solution is a suitable truncation (actually a “fifth order truncation”) of the so-called “Lindstedt series” (see [1] for generalities), i.e., of the formal ε-power series solution for the invariant torus equation associated to the sought-after quasiperiodic solution. To this initial datum we apply the “KAM algorithm” presented in Sect. 4. The “KAM algorithm” is based on an algebraic scheme which, starting from a given “approximate solution,” produces a new function solving the invariant torus equation up to an error which is “quadratically smaller” than the one produced by the starting approximate solution (Sect. 4.1). This algebraic scheme (which, as already mentioned, is new) is equipped with a set of “accurate” estimates (Sect. 4.3).
The algebraic scheme plus the set of estimates is what we call “the KAM algorithm.” We then work out a criterion (the KAM Theorem of Sect. 4.4) which guarantees the applicability of the KAM algorithm an infinite number of times, yielding a solution of the invariant torus equation. Such a criterion is obtained by simplifying the estimates and getting a unique stronger condition (the “KAM condition”) ensuring the indefinite applicability of the scheme. As we already pointed out in previous papers (see [5] and references therein), in concrete applications it is convenient to iterate the KAM algorithm a few times before trying to apply the KAM theorem. Both the computation of the initial approximate solution and the application of the KAM algorithm are computer-assisted. We remark that the explicit computation of the initial approximate solution is quite different from the “computation” of the new approximate solutions based on the KAM algorithm: the calculation of the truncated Lindstedt series is completely explicit (we compute numbers!) while the construction of the sequence of quadratically better and better approximate solutions is only implicitly described by the KAM algorithm and what we actually compute are bounds on norms relative to such approximate solutions.

⁵ In general, numbers of the form $a = \frac{p}{q} \pm \frac{1}{k + \alpha}$ with p, q, k non-negative integers, q > 0 and k ≥ 2, satisfy $|an + m| \ge (\gamma |n|)^{-1}$ for any $n, m \in \mathbb{Z}$, $n \ne 0$, with $\gamma = q^2 (k + \alpha)$ (see, e.g., [3]).
⁶ See [4] and [5] for general information, references and a different “KAM computer-assisted algorithm”.

4) (On the use of computers) Our proof is “computer-assisted” in the sense that certain formulae (derived below) have been implemented on a computer (a VAX) keeping rigorous control, by means of the so-called “interval arithmetic” (see below), of the numerical errors introduced by the machine. We report in Sects. 5.2 and 5.3 all the computer-aided calculations needed to prove Theorem 1.1. We do not, however, include the computer program, which anybody can write by her/himself.⁷ We are obviously aware of the (philosophical?) problem of proceeding in such a way: it is clear that writing a program in a slightly different way or using different machines might (better: will) produce slightly different outputs, which in our case are intervals of rational numbers (see below).
However, once the type of rigorous method used to control the propagation of numerical errors (here “interval arithmetic”) is clearly settled, we regard the computer implementation as a detail at the same level as, say, the details needed to work out explicitly the estimates of the KAM algorithm of Sects. 4.3 and 4.4. Of course, we shall be happy to send to interested readers the computer programs used in this paper. Let us now briefly discuss interval arithmetic, which is the technical tool we used to control the numerical errors introduced by the machines. Real numbers are represented by computers as sign-exponent-fraction quantities, with the length of the exponent and of the fraction depending on the machine. The result of any elementary operation (addition, subtraction, multiplication and division) is rounded by the computer up to a certain decimal digit. To rigorously implement on a computer a certain sequence of formulae, one first reduces such formulae to a sequence of elementary operations.⁸ The idea of “interval arithmetic” is then to construct an interval (exactly representable on the computer) containing the exact result of an elementary operation and to replace (in the obvious way) algebra on numbers with algebra on intervals. In our FORTRAN 77 programs we define quadruple precision (H-floating) variables, which are allowed to vary in a range between $0.84 \cdot 10^{-4932}$ and $0.59 \cdot 10^{4932}$. The binary structure of a quadruple precision datum consists of 128 bits, with 1 sign bit, 15 bits for the exponent and the remaining bits for the fraction. Two extra hidden guard bits are used to guarantee the result of an elementary operation “up to 1/2 of the last significant bit” ([17]).
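The idea of replacing algebra on numbers with algebra on intervals can be sketched in a few lines of Python. The class below is a hypothetical illustration, not the authors' FORTRAN 77 routines: it works with IEEE double precision floats instead of VAX H-floating numbers, and it assumes (as holds for Python floats) that each elementary operation is correctly rounded to the nearest representable number, so that moving each endpoint outward by one unit in the last place yields an interval containing the exact result. It requires Python 3.9 or later for `math.nextafter`.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi] with floating-point endpoints."""

    def __init__(self, lo, hi=None):
        self.lo = float(lo)
        self.hi = float(lo if hi is None else hi)

    @staticmethod
    def _widen(lo, hi):
        # Move each endpoint outward by one representable number, so that
        # the rounding error of the last operation is always covered.
        return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._widen(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval._widen(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval._widen(min(p), max(p))

    def __truediv__(self, other):
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        q = [self.lo / other.lo, self.lo / other.hi,
             self.hi / other.lo, self.hi / other.hi]
        return Interval._widen(min(q), max(q))

    def contains(self, x):
        """Exact membership test for a rational number x (a Fraction)."""
        return Fraction(self.lo) <= x <= Fraction(self.hi)
```

For example, `Interval(1.0) / Interval(3.0)` is a small interval with distinct endpoints that contains the rational number 1/3, although 1/3 itself is not representable as a float.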
The interval containing the result of an elementary operation is therefore obtained by increasing or decreasing by one bit the last bit of the mantissa, possibly taking care of the propagation of the carry. For further information and for the necessary routines we refer to [4] and [5]. We finally mention that in the Appendix we report a few computer-assisted data with the following twofold aim. On the one hand, the reader reproducing our estimates may check her/his results against ours; on the other hand, the reader who is not going to spend time performing the computations will have an idea of the type of outputs one needs in this paper.

⁷ The (11189-line) computer program is “just” a translation in computer language (FORTRAN 77) of the formulae of Sects. 5.1 and 4.3 after the standard arithmetic (basic operations) is replaced by “interval arithmetic” (the “arithmetic routines” may be found, e.g., at pages 153–158 of [4]).
⁸ Elementary functions (such as roots, exponentials, trigonometric functions, etc.) will be approximated by a finite sequence of elementary operations using Taylor polynomials keeping track of errors; see [5] §8.3.

2. Restricted, Circular, Planar Three-Body Problem

Here we recall the Hamiltonian formulation of the “restricted, circular, planar three-body problem” (for general information see [6] or [1]). Consider first a Keplerian two-body problem made up of two material points (“bodies”) $P_1$ and $P_2$ with masses $m_1$ and $m_2$ and let $P_2$ revolve on a circular orbit around $P_1$. Consider now a third body A moving on the orbital plane of $P_1$ and $P_2$ and subject to the gravitational attraction of $P_1$ and $P_2$. Let the mass $m_A$ of A be much smaller than $m_1$, $m_2$ and assume that the motion of $P_1$ and $P_2$ is not affected by A. The study of the dynamics associated to such a model is known in the literature as the circular, planar, restricted three-body problem. In particular, we shall be interested in phase space regions for which the resulting motion of A is a nearly circular orbit “around” $P_1$. A convenient Hamiltonian formulation of such a three-body problem is based upon the classical “planar Delaunay variables” [6]. Let $\mathbb{T} \equiv \mathbb{R}/(2\pi\mathbb{Z})$ and consider the phase space⁹
$$\mathcal{P} = \{(\lambda, \gamma, \psi) \in \mathbb{T}^3\} \times \{(\Lambda, \Gamma, E) \in \mathbb{R}^3 : \Lambda \ne 0,\ |\Gamma| < |\Lambda|\}$$
endowed with the standard symplectic form $d\lambda \wedge d\Lambda + d\gamma \wedge d\Gamma + d\psi \wedge dE$. Then, the dynamics associated with the circular, planar restricted three-body problem is given by the Hamiltonian flow generated by the Hamiltonian¹⁰
$$H_0(\lambda, \gamma, \psi, \Lambda, \Gamma, E) \equiv \frac{1}{2\Lambda^2} + E + \varepsilon R_0(\lambda, \gamma - \psi, \Lambda, \Gamma), \tag{2.1}$$
where $\varepsilon \equiv m_2/m_1$ and the “perturbation function” $R_0$ is given as follows. Let $\nu \in \mathbb{T}$ (the “eccentric anomaly”) be implicitly defined for |e| < 1 ($e \in \mathbb{R}$) by the relation (“Kepler’s equation”) $\lambda = \nu - e \sin\nu$; let $\varphi \in \mathbb{T}$ (the “true anomaly”) be implicitly defined (again for |e| < 1) by the relation
$$\tan\frac{\varphi - \gamma}{2} = \Big(\frac{1 + e}{1 - e}\Big)^{1/2} \tan\frac{\nu}{2},$$
and define the “orbital radius” r as
$$r \equiv \frac{a (1 - e^2)}{1 + e \cos(\varphi - \gamma)}, \qquad \text{where} \quad a \equiv \Lambda^2.$$
The function $R_0$ in (2.1) is then given by
$$R_0(\lambda, \gamma - \psi, \Lambda, \Gamma) \equiv \frac{1}{\sqrt{1 + r^2 - 2r \cos(\varphi - \psi)}} - r \cos(\varphi - \psi),$$
where e is defined as
$$e \equiv \sqrt{1 - \frac{\Gamma^2}{\Lambda^2}}.$$

⁹ The Delaunay coordinates λ, γ, ψ are often called, respectively, the “mean anomaly,” the “argument of the perihelion” and the “longitude” (of the “planet” $P_2$).
¹⁰ We have chosen the units of measure in such a way that $m_1 + m_2 = 1$ and that the period of $P_2$ is 2π.
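The implicit definitions of the eccentric and true anomalies are straightforward to evaluate numerically. The following Python sketch (hypothetical function names, not part of the original computation) solves Kepler's equation by Newton's method and then evaluates the orbital radius; it uses a two-argument arctangent to pick a continuous branch of the true anomaly, which is a choice of this sketch.

```python
import math


def eccentric_anomaly(lam, e, tol=1e-14, max_iter=50):
    """Solve Kepler's equation lam = nu - e*sin(nu) for nu by Newton's method (|e| < 1)."""
    nu = lam
    for _ in range(max_iter):
        step = (nu - e * math.sin(nu) - lam) / (1.0 - e * math.cos(nu))
        nu -= step
        if abs(step) < tol:
            break
    return nu


def true_anomaly(nu, e):
    """phi - gamma, from tan((phi - gamma)/2) = sqrt((1+e)/(1-e)) * tan(nu/2)."""
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(nu / 2.0),
                            math.sqrt(1.0 - e) * math.cos(nu / 2.0))


def orbital_radius(lam, e, a):
    """r = a (1 - e^2) / (1 + e cos(phi - gamma)) as a function of the mean anomaly."""
    nu = eccentric_anomaly(lam, e)
    f = true_anomaly(nu, e)
    return a * (1.0 - e ** 2) / (1.0 + e * math.cos(f))
```

A useful check is the classical identity $r = a(1 - e\cos\nu)$, which follows from the two relations above and which the sketch reproduces to rounding accuracy.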
We recall that a convenient representation of $R_0$ is obtained by means of Legendre polynomials¹¹: if r < 1 (which will be the case for our specific model) one finds
$$R_0 = 1 + \sum_{j=2}^{\infty} r^j P_j(\cos(\varphi - \psi)).$$
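The Legendre expansion can be checked numerically against the standard generating function identity $(1 + r^2 - 2rx)^{-1/2} = \sum_{j \ge 0} r^j P_j(x)$, valid for $|r| < 1$, $|x| \le 1$: subtracting the $j = 1$ term $rx$ leaves exactly $1 + \sum_{j \ge 2} r^j P_j(x)$. The Python sketch below (hypothetical names; the truncation order `j_max` is an arbitrary choice) builds the polynomials with the usual three-term recurrence.

```python
import math


def legendre(j_max, x):
    """Return [P_0(x), ..., P_{j_max}(x)] via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p = [1.0, x]
    for k in range(1, j_max):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    return p[: j_max + 1]


def r0_series(r, x, j_max=60):
    """Truncated series 1 + sum_{j=2}^{j_max} r^j P_j(x)."""
    p = legendre(j_max, x)
    return 1.0 + sum(r ** j * p[j] for j in range(2, j_max + 1))


def r0_closed(r, x):
    """Closed form 1/sqrt(1 + r^2 - 2 r x) - r x obtained from the generating function."""
    return 1.0 / math.sqrt(1.0 + r * r - 2.0 * r * x) - r * x
```

With r = 0.532 (roughly the semimajor axis of Ceres in units of that of Jupiter) sixty terms already agree with the closed form to rounding accuracy.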
A trivial reduction shows that the dynamics generated by (2.1) may be described by a two-degree-of-freedom Hamiltonian: under the canonical (or “symplectic”) transformation $(\ell, g, \tau) \equiv (\lambda, \gamma - \psi, \psi)$, $(L, G, T) \equiv (\Lambda, \Gamma, \Gamma + E)$, $H_0$ takes the form
$$H_1(\ell, g, L, G) = \frac{1}{2L^2} - G + \varepsilon R_0(\ell, g, L, G), \tag{2.2}$$
having omitted the dummy variable T; the phase space is now $\mathbb{T}^2 \times \{(L, G) \in \mathbb{R}^2 : L \ne 0,\ |G| < |L|\}$.

3. A Model from the Solar System

Let us now focus on the case in which $P_1$ is the Sun, $P_2$ is Jupiter and A is Ceres (one of the largest bodies in the asteroid belt). Notice (again) that regarding Sun–Jupiter–Ceres as a planar, circular, restricted three-body problem contains a lot of physical approximations, which we shall not discuss here. But even accepting these basic approximations, the reader will have certainly noticed that the Hamiltonian in (1.3) is different from the Hamiltonian in (2.2): besides the factor $(\frac{1}{2L^2} - G)$ (and 2ε in place of ε), the main difference is that R is a trigonometric polynomial of degree 10 while $R_0$ contains infinitely many non-vanishing Fourier harmonics. The “selection rule” which led us to the choice of the “physically relevant” Fourier modes is based on the following trivial observation. Among other things, the gravitational effects on Ceres of asteroids and planets and most notably the attraction exerted by Saturn (which, after Jupiter, is the largest planet in the Solar system¹²) have been neglected. Therefore, after having defined a (rough) measure, $G_{Sa}$, of the Ceres–Saturn attraction, we disregard in the Fourier expansion of $R_0$ the terms that do not exceed $G_{Sa}$ in absolute value.

¹¹ $P_0(x) = 1$; $P_1(x) = x$; $P_{k+1}(x) = \frac{(2k+1) P_k(x)\, x - k P_{k-1}(x)}{k+1}$ (for k ≥ 1).
¹² Even though the orbit of Ceres is closer to the orbit of Mars than to the orbit of Saturn, the difference in mass makes the gravitational attraction of Saturn the largest one after that of Jupiter.

In order to define $G_{Sa}$ we first look up a few astronomical data in the ephemeris (see The Astronomical Almanac [16]). In particular we want to define the “reference values” of $L_0$ and $G_0$ for the Sun–Jupiter–Ceres system. Observations of the true motion of Ceres, as found in [16], indicate that Ceres moves on a nearly elliptical orbit of “average eccentricity”
$$e_0 \equiv 0.0766 \tag{3.1}$$
and whose average semimajor axis is approximately 0.532. Hence, the corresponding “average frequency” of Ceres, computed by Kepler’s third law, yields a value of $-\Omega_c \equiv -2.577107$. Since $-\Omega_c \simeq \partial_L H_1|_{\varepsilon=0} = -L^{-3}$ we take as “reference L-value” the quantity
$$L_0 \equiv 0.729305 \simeq \Omega_c^{-1/3}$$
and, since $G_0 = L_0 \sqrt{1 - e_0^2}$, we take as “reference G-value” the quantity
$$G_0 \equiv 0.727162.$$
Such reference values have been taken as center of the analyticity domain for the “action variables”; see (1.1) and (1.2). Notice that with our choice of the analyticity radius $r_0$ one finds that the function e(L, G) satisfies
$$0.019799 < |e(L, G)| < 0.106364, \qquad \forall\, (L, G) \in B.$$
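The reference values can be retraced with a few lines of Python. The sketch below uses hypothetical function names and assumes, as in Sect. 2, units in which the semimajor axis and the orbital frequency of Jupiter are both equal to 1, so that Kepler's third law reads frequency $= a^{-3/2}$.

```python
import math


def kepler_frequency(a):
    """Mean motion a^(-3/2) in units where Jupiter has semimajor axis 1 and frequency 1."""
    return a ** -1.5


def reference_actions(freq, e0):
    """Unperturbed actions: L = freq^(-1/3) and G = L * sqrt(1 - e0^2)."""
    L = freq ** (-1.0 / 3.0)
    G = L * math.sqrt(1.0 - e0 ** 2)
    return L, G
```

Here `kepler_frequency(0.532)` reproduces the quoted frequency 2.577107 to about six digits, while `reference_actions(2.577107, 0.0766)` returns a value of L that agrees with the reference value 0.729305 only to about four digits, consistent with the approximate sign used in the text.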
Let us turn now to the definition of $G_{Sa}$. In general, for planets whose orbits have a larger semimajor axis than that of Ceres, the “secular term” of $H_1$ is given by¹³ ε (≡ mass of the planet/mass of the Sun) times the term $R_{00} \equiv R_{00}(L; e)$ in (1.5). Keeping in mind that, in the integrable limit, $L^2$ is the ratio of the semimajor axis of Ceres to that of the planet (recall that $a \equiv \Lambda^2$), we define (for planets whose orbits have a larger semimajor axis than that of Ceres)
$$G_P \equiv \varepsilon(P) \times R_{00}(L(P); e_0), \tag{3.2}$$
where $e_0$ is the observed “average eccentricity” of Ceres (3.1), ε(P) is the mass ratio of the planet P and of the Sun and $L(P)^2$ is the ratio of the semimajor axis of Ceres to that of the planet P. Looking up the “true” values in [16] one finds
$$G_{Sa} \equiv G_{Saturn} = 6.3778 \cdot 10^{-6}.$$
For comparison purposes we report also the value for Jupiter, which is $G_{Jupiter} = 7.8850 \cdot 10^{-5}$. Neglecting in the expansion of $\varepsilon R_0$ those terms whose size is smaller than¹⁴ $G_{Sa}$, one is led to consider a “three-body problem” governed by the Hamiltonian
$$H_2(\ell, g, L, G; \varepsilon) \equiv \frac{1}{2L^2} - G + \varepsilon R(\ell, g, L, G) \tag{3.3}$$
with R given in (1.4) and (1.5) of Sect. 1. The final modification of $H_2$ which gives the Hamiltonian in (1.3), (1.4) is due to merely technical reasons. As mentioned above, our results are based on computer-assisted KAM theory, and one of the standard hypotheses of KAM theorems is that the unperturbed Hamiltonian (ε = 0) is non-degenerate, i.e. has an invertible Hessian matrix on its domain of analyticity. In the case of (3.3), the unperturbed Hamiltonian is given by
$$h_0(L, G) = \frac{1}{2L^2} - G,$$
whose Hessian matrix is not invertible. There are a few well known methods to overcome this minor problem¹⁵ and it turns out that for our purposes the most convenient one is to follow Poincaré’s trick [13], which consists in replacing the Hamiltonian $H_2$ by its square.¹⁶ Therefore we let $H_3 \equiv (H_2)^2$:
$$H_3(\ell, g, L, G; \varepsilon) = \Big(\frac{1}{2L^2} - G\Big)^2 + 2\varepsilon \Big(\frac{1}{2L^2} - G\Big) R(\ell, g, L, G) + \varepsilon^2 [R(\ell, g, L, G)]^2. \tag{3.4}$$
The Hessian of the unperturbed Hamiltonian ($H_3|_{\varepsilon=0}$) is equal to
$$A \equiv A(L, G) \equiv \begin{pmatrix} \dfrac{5}{L^6} - \dfrac{6G}{L^4} & \dfrac{2}{L^3} \\[2mm] \dfrac{2}{L^3} & 2 \end{pmatrix}, \tag{3.5}$$
and, if $(L, G) \in B$, one has
$$|\det A| = \Big|\frac{12}{L^4}\Big(\frac{1}{2L^2} - G\Big)\Big| \ge \frac{12}{(L_0 + r_0)^4}\Big(\frac{1}{2(L_0 + r_0)^2} - (G_0 + r_0)\Big) \ge 8.830153.$$
To be consistent with the criterion that led us to the Hamiltonian (3.3), we have to omit the term of order $\varepsilon^2$ in (3.4) and this leads us to the Hamiltonian (1.3) introduced in Sect. 1, item 2).

¹³ The “secular term” of $H_1$ is the average over the angular variables ℓ and g of the “perturbation” $\varepsilon R_0$; the computation is immediately checked using, e.g., the above mentioned expansion in terms of Legendre polynomials.
¹⁴ More precisely we omit all the terms such that $\varepsilon |R_n(L_0, G_0)| < G_{Sa}$.
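The Hessian of the squared Hamiltonian and the numerical lower bound on its determinant can be retraced with the following Python sketch. The function names are hypothetical and the computation is done in ordinary floating point, so it only illustrates the rigorous interval estimate.

```python
L0, G0, r0 = 0.729305, 0.727162, 0.001  # values from (1.1)


def unperturbed(L, G):
    """h(L, G) = (1/(2 L^2) - G)^2, the eps = 0 part of (3.4)."""
    return (1.0 / (2.0 * L ** 2) - G) ** 2


def hessian(L, G):
    """Second derivatives of h with respect to (L, G), as in (3.5)."""
    return ((5.0 / L ** 6 - 6.0 * G / L ** 4, 2.0 / L ** 3),
            (2.0 / L ** 3, 2.0))


def det_hessian(L, G):
    (a, b), (c, d) = hessian(L, G)
    return a * d - b * c


def det_closed_form(L, G):
    """det A = (12 / L^4) (1/(2 L^2) - G)."""
    return 12.0 / L ** 4 * (1.0 / (2.0 * L ** 2) - G)


# Value of the closed form at the corner (L0 + r0, G0 + r0) used in the estimate.
corner_value = det_closed_form(L0 + r0, G0 + r0)
```

Evaluating `corner_value` gives a number that agrees with the quoted bound 8.830153 to the displayed digits; deciding the last digit rigorously is precisely what interval arithmetic is for.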
4. A KAM Theorem

Here we prove a KAM result, in the style of [5], which will be the basis of the proof of Theorem 1.1. First we provide a “KAM algorithm” (in the Hamiltonian context), which yields a sequence of quadratically better and better approximations to the conjugacy function of a maximal invariant (Diophantine) torus, and then we formulate a criterion ensuring the applicability of the algorithm an infinite number of times and hence the existence of an invariant torus. Technically, the algorithm, which does not use symplectic transformations (used, instead, in the original works of the masters), may be viewed as a Hamiltonian version of the Lagrangian approach developed in the eighties by Moser, Salamon and Zehnder (see [5] and references therein).

¹⁵ For example, one can replace the non-degeneracy hypothesis with an “iso-energetic non-degeneracy” (see, e.g., [1]), which is satisfied by $h_0$.
¹⁶ Note that the dynamics generated by a Hamiltonian function h = h(q, p) and by $h^2$ coincide up to a time scale: if z(t) = (q(t), p(t)) is an h-motion then z(2Et), with E = h(z(0)), is the corresponding $h^2$-motion.

4.1. Algebraic Scheme. Let us consider a smooth (later real-analytic) Hamiltonian h(x, y), where x varies on the standard N-torus $\mathbb{T}^N \equiv \mathbb{R}^N/(2\pi\mathbb{Z}^N)$ and y varies in some open ball $B^N \subset \mathbb{R}^N$; (x, y) are standard symplectic coordinates.¹⁷ The problem is to construct an invariant N-torus S on which the flow is conjugated to the linear flow $\theta \in \mathbb{T}^N \to \theta + \omega t$ for some “rationally independent” vector¹⁸ $\omega \in \mathbb{R}^N$. The S-embedding function $\theta \in \mathbb{T}^N \to (\theta + u(\theta), v(\theta)) \in \mathbb{T}^N \times B^N$ is immediately seen to satisfy the following quasi-linear, degenerate PDE on $\mathbb{T}^N$:
$$\omega + Du - h_y(\theta + u, v) = 0, \qquad Dv + h_x(\theta + u, v) = 0, \tag{4.1}$$
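where D denotes the derivative in the ω direction:
$$D \equiv \omega \cdot \partial_\theta \equiv \sum_{i=1}^{N} \omega_i \frac{\partial}{\partial \theta_i}, \tag{4.2}$$
and $h_x$, $h_y$ denote the gradients of h with respect to x, y. As usual, we assume that ω is a Diophantine vector, i.e. there exist γ > 0 and a positive integer τ such that
$$|\omega \cdot n| \equiv \Big|\sum_{i=1}^{N} \omega_i n_i\Big| \ge (\gamma |n|^\tau)^{-1}, \qquad \forall\, n \in \mathbb{Z}^N \setminus \{0\}. \tag{4.3}$$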
The starting point of a KAM algorithm is an approximate solution (u, v), which solves (4.1) up to some “error.” In order to formulate a precise result we need some notation and some assumptions. Given a function $u : \mathbb{T}^N \to \mathbb{R}^N$ we denote by $u_\theta$ or by $\partial_\theta u$ its Jacobian matrix
$$(u_\theta)_{ij} \equiv \frac{\partial u_i}{\partial \theta_j};$$
$h_{xy}$ denotes the matrix with entries
$$(h_{xy})_{ij} \equiv \frac{\partial^2 h}{\partial x_i \partial y_j};$$
thus $h_{yx}$ is the transpose of $h_{xy}$, i.e. $h_{yx} = h_{xy}^T$; if A is a square matrix, we denote by $A^\#$ the antisymmetric part of A times two: $A^\# \equiv A - A^T$; finally, if $\theta \in \mathbb{T}^N \to f(\theta) \in \mathbb{R}^s$ (s ≥ 1) is a smooth function with vanishing mean value, i.e.
$$\langle f \rangle \equiv \frac{1}{(2\pi)^N} \int_{\mathbb{T}^N} f(\theta)\, d\theta = 0,$$
(and if D is as in (4.2), (4.3)), we denote by $D^{-1} f$ the unique solution with vanishing mean value of the equation Dg = f; such a solution in Fourier expansion has the form
$$D^{-1} f = \sum_{n \in \mathbb{Z}^N \setminus \{0\}} \frac{f_n}{i\, \omega \cdot n} \exp(i n \cdot \theta),$$
where $f_n$ denote Fourier coefficients and $i = \sqrt{-1}$.

¹⁷ That is, the symplectic structure is given by the standard 2-form $\sum_{i=1}^{N} dx_i \wedge dy_i$.
¹⁸ I.e. if ω · n = 0 for some $n \in \mathbb{Z}^N$, then n must be 0.
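On trigonometric polynomials the operators $D$ and $D^{-1}$ act diagonally on Fourier coefficients, which the following Python sketch makes explicit. The representation of a function as a dictionary mapping integer tuples $n$ to complex coefficients $f_n$, and the function names, are choices of this sketch.

```python
def dot(omega, n):
    """Scalar product omega . n."""
    return sum(w * k for w, k in zip(omega, n))


def directional_derivative(coeffs, omega):
    """D f: the coefficient f_n is multiplied by i (omega . n)."""
    return {n: 1j * dot(omega, n) * c for n, c in coeffs.items()}


def inverse_directional_derivative(coeffs, omega):
    """D^{-1} f for f with vanishing mean value: f_n is divided by i (omega . n)."""
    out = {}
    for n, c in coeffs.items():
        if all(k == 0 for k in n):
            if c != 0:
                raise ValueError("f must have vanishing mean value")
            continue
        out[n] = c / (1j * dot(omega, n))
    return out
```

The divisors $\omega \cdot n$ are the “small divisors” of KAM theory: for a rationally independent ω they never vanish for $n \ne 0$, but they can become arbitrarily small, which is why the Diophantine condition (4.3) is needed to control $D^{-1}$.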
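Assumption 4.1. Let $\theta \in \mathbb{T}^N \to (u, v) \in \mathbb{R}^N \times \mathbb{R}^N$ be a smooth function and let $\mathcal{M}$ and $h^0_{yy}$ be the matrices¹⁹
$$\mathcal{M} \equiv I + u_\theta, \qquad h^0_{yy}(\theta) \equiv h_{yy}\big(\theta + u(\theta), v(\theta)\big). \tag{4.4}$$
Denoting²⁰
$$\mathcal{T} \equiv \mathcal{M}^{-1} h^0_{yy} \mathcal{M}^{-T}, \tag{4.5}$$
we assume that, for any $\theta \in \mathbb{T}^N$, the matrices $\mathcal{M}$, $h^0_{yy}$ and $\langle \mathcal{T} \rangle$ are invertible.

Proposition 4.1. Let h, u, v and ω satisfy, respectively, Assumption 4.1 and (4.3) and define f and g by
$$\omega + Du - h_y(\theta + u, v) = f, \qquad Dv + h_x(\theta + u, v) = g. \tag{4.6}$$
Then, if we define the vector/matrix-valued functions b(θ) and B(θ) by
$$b \equiv v_\theta^T f - \mathcal{M}^T g, \qquad B \equiv \big(\mathcal{M}^T g_\theta - v_\theta^T f_\theta\big)^\#,$$
we have
$$\langle b \rangle = 0, \qquad \langle B \rangle = 0. \tag{4.7}$$
Furthermore, the following equation holds²¹:
$$\omega + Du' - h_y(\theta + u', v') = f', \qquad Dv' + h_x(\theta + u', v') = g', \tag{4.8}$$
where $u'$, $v'$, $f'$, $g'$ are defined at the end of the following list of definitions²²:
$$\begin{aligned}
c_1 &\equiv \langle \mathcal{T} \rangle^{-1} \big( \langle \mathcal{M}^{-1} f \rangle - \langle \mathcal{T} D^{-1} b \rangle \big), \\
b' &\equiv \mathcal{T} \big( D^{-1} b + c_1 \big) - \mathcal{M}^{-1} f, \\
c_2 &\equiv -\langle \mathcal{M} D^{-1} b' \rangle, \\
z &\equiv \mathcal{M} \big( D^{-1} b' + c_2 \big), \\
w &\equiv (h^0_{yy})^{-1} \big( Dz - h^0_{yx} z + f \big), \\
q_1 &\equiv h_x(\theta + u + z, v + w) - h^0_x - h^0_{xx} z - h^0_{xy} w, \\
q_2 &\equiv h_y(\theta + u + z, v + w) - h^0_y - h^0_{yx} z - h^0_{yy} w, \\
q_3 &\equiv f_\theta^T w - g_\theta^T z - \mathcal{M}^T q_1, \\
f' &\equiv -q_2, \\
g' &\equiv \mathcal{M}^{-T} \Big\{ D \Big[ (D^{-1} B) \mathcal{M}^{-1} z + \mathcal{M}^T (h^0_{yy})^{-1} f_\theta \mathcal{M}^{-1} z \Big] - q_3 \Big\}, \\
u' &\equiv u + z, \qquad v' \equiv v + w.
\end{aligned}$$

¹⁹ I is the identity matrix. To be precise we should replace, in (4.4), u with p ∘ u, p being the projection of $\mathbb{R}^N$ onto $\mathbb{T}^N$; however we shall omit, here and in other circumstances, such projection.
²⁰ The superscript −T denotes the transpose of the inverse: $A^{-T} = (A^{-1})^T$.
²¹ Here and in what follows, the prime attached to a function will never denote derivatives but just new functions.
²² As above, if h = h(x, y), $h^0(\theta)$ denotes the function h(θ + u(θ), v(θ)).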
Remark 4.1. (i) It is immediate to check that if we replace f and g by εf and εg then z, w = O(ε) and $f', g' = O(\varepsilon^2)$, i.e. the errors associated to $u'$ and $v'$ are quadratically smaller than the errors associated to u and v. We shall call a couple (u, v) as in (4.6) an approximate solution for (4.1) and the relative couple (f, g) the error function. (ii) Note that the constants $c_1$ and $c_2$ are defined so that the functions $b'$ and z have vanishing mean value.

Proof. To check the first of (4.7), observe that $\partial_\theta h^0 = \mathcal{M}^T h^0_x + v_\theta^T h^0_y$; now, multiply the first of (4.6) by $v_\theta^T$, the second by $-\mathcal{M}^T$, add them together and use integration by parts to get rid of the terms containing θ-derivatives. To check the second of (4.7), take the θ-gradient of (4.6) to obtain
$$D\mathcal{M} = h^0_{yx} \mathcal{M} + h^0_{yy} v_\theta + f_\theta, \qquad Dv_\theta = -h^0_{xx} \mathcal{M} - h^0_{xy} v_\theta + g_\theta. \tag{4.9}$$
Let
$$A \equiv \big(\mathcal{M}^T v_\theta\big)^\#, \tag{4.10}$$
and notice (integration by parts) that
$$\langle A \rangle = 0. \tag{4.11}$$
From (4.9), it follows that the matrix A satisfies the equation
$$DA = B, \tag{4.12}$$
from which the second of (4.7) follows at once. The first of (4.8) follows immediately from the definitions:
$$\begin{aligned}
\omega + Du' - h_y(\theta + u', v') &= \omega + Du + Dz - h_y(\theta + u + z, v + w) \\
&= h_y(\theta + u, v) - h_y(\theta + u + z, v + w) + Dz + f \\
&= -q_2 - h^0_{yx} z - h^0_{yy} w + Dz + f \\
&= f'.
\end{aligned} \tag{4.13}$$
The check of the second of (4.8) is more tricky. First observe that the last identity in (4.13) can be rewritten as
$$Dz = h^0_{yx} z + h^0_{yy} w - f. \tag{4.14}$$
Next, from the definition of z, it follows that
$$D \Big[ \mathcal{M}^T (h^0_{yy})^{-1} \big( \mathcal{M} D(\mathcal{M}^{-1} z) + f \big) \Big] = b. \tag{4.15}$$
Solving for $h^0_{yx}$ in the first of (4.9) and inserting the obtained expression in the definition of w, we get
$$\begin{aligned}
w &= (h^0_{yy})^{-1} \Big[ Dz - \big( (D\mathcal{M}) \mathcal{M}^{-1} - h^0_{yy} v_\theta \mathcal{M}^{-1} - f_\theta \mathcal{M}^{-1} \big) z + f \Big] \\
&= v_\theta \mathcal{M}^{-1} z + (h^0_{yy})^{-1} \Big[ f_\theta \mathcal{M}^{-1} z + \mathcal{M} D(\mathcal{M}^{-1} z) + f \Big].
\end{aligned} \tag{4.16}$$
From the definition of $g'$, (4.15), (4.16), (4.10)–(4.12), it follows that
$$\mathcal{M}^T g' + q_3 + b = D\big(\mathcal{M}^T w - v_\theta^T z\big),$$
i.e. $g' = \mathcal{M}^{-T} \big[ D(\mathcal{M}^T w - v_\theta^T z) - b - q_3 \big]$. From this identity, recalling the definitions of b and $q_3$ and using (4.9) to eliminate $f_\theta$ and $g_\theta$, one obtains
$$g' = \mathcal{M}^{-T} v_\theta^T \big[ h^0_{yx} z + h^0_{yy} w - f \big] + \big[ g + h^0_{xx} z + h^0_{xy} w + q_1 + Dw \big] - \mathcal{M}^{-T} v_\theta^T Dz,$$
which, in view of (4.14), the definition of $q_1$ and the second of (4.6), yields the second of (4.8).

4.2. Analytic Tools. From now on we shall work in the real-analytic category; in this section we review some basic technical facts. We shall consider the Banach space of periodic functions f real-analytic on the torus $\mathbb{T}^N$, admitting (for some prefixed ξ > 0) analytic extension on the closed strip
$$\Delta_\xi \equiv \{\theta \in \mathbb{C}^N : |\mathrm{Im}\, \theta_i| \le \xi,\ \forall\, i = 1, \ldots, N\},$$
equipped with the “Fourier norm”
$$\|f\|_\xi \equiv \sum_{n \in \mathbb{Z}^N} |f_n| \exp(|n| \xi); \tag{4.17}$$
in $\mathbb{C}^N$ (and its subsets $\mathbb{R}^N$ and $\mathbb{Z}^N$) we shall use the 1-norm,
$$|y| \equiv |y|_1 \equiv \sum_{i=1}^{N} |y_i|.$$
If $f : \mathbb{T}^N \to \mathbb{C}^s$ is a vector-valued, real-analytic function, analytic on $\Delta_\xi$, its norm is defined as $\|f\|_\xi \equiv \sum_i \|f_i\|_\xi$, which coincides with (4.17) if $f_n$ denotes the s-vector whose components are given by the Fourier coefficients of the components of f. These definitions are immediately extended to matrix/tensor-valued functions by making use of the standard “operator norm”: e.g. if A(θ) is a matrix-valued periodic function with analytic extension on $\Delta_\xi$, we set
$$\|A\|_\xi \equiv \sup_{c \in \mathbb{C}^N : |c| = 1} \|Ac\|_\xi;$$
or, if $\partial_x^3 f$ is the tensor of order three of the derivatives of a periodic, real-analytic function $f : \Delta_\xi \to \mathbb{C}$, its Fourier norm is given by
$$\|\partial_x^3 f\|_\xi \equiv \sup_{|b| = |c| = 1} \sum_{i=1}^{N} \Big\| \sum_{j,k=1}^{N} \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}\, b_k c_j \Big\|_\xi.$$
Finally, we shall also consider functions (possibly vector/matrix/tensor-valued) h = h(x, y) periodic in x and real-analytic on the closed domain $\Delta_\xi \times \hat{B}_r(y_0)$, where $\hat{B}_r(y_0) \equiv \hat{B}^N_r(y_0)$ denotes the closed complex ball of radius r around $y_0 \in \mathbb{C}^N$. For any such function, which admits the expansion²³ (convergent on $\Delta_\xi \times \hat{B}_r(y_0)$)
$$h(x, y) = \sum_{\substack{n \in \mathbb{Z}^N \\ k \in \mathbb{N}^N}} h_{n,k} \exp(i n \cdot x)\, (y - y_0)^k, \tag{4.18}$$
we set
$$\|h\|_{\xi, r} \equiv \sum_{\substack{n \in \mathbb{Z}^N \\ k \in \mathbb{N}^N}} |h_{n,k}| \exp(|n| \xi)\, r^{|k|}. \tag{4.19}$$
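The elementary properties of interest in the present context are collected in the following

Lemma 4.1. (i) Let $f : \mathbb{T}^N \to \mathbb{R}$ have an analytic extension on $\Delta_\xi$ (for some ξ > 0) and let $\omega \in \mathbb{R}^N$ be a rationally independent vector. Then, for all 0 < δ ≤ ξ, for any $p \in \mathbb{Z}$ and for any $k \in \mathbb{N}$ or any $k \in \mathbb{N}^N$ one has²⁴
$$\|D^{-p} \partial_x^k f\|_{\xi - \delta} \le \|f\|_\xi\, \sigma_{pk}(\delta),$$
where, if k = 0, f is assumed to have vanishing mean value, and
$$\sigma_{pk}(\delta) \equiv \sup_{\{n \in \mathbb{Z}^N \setminus \{0\} : f_n \ne 0\}} \pi_{nk}\, |\omega \cdot n|^{-p} \exp(-\delta |n|), \qquad \pi_{nk} \equiv \begin{cases} |n|^k, & \text{if } k \in \mathbb{N}, \\ |n^k|, & \text{if } k \in \mathbb{N}^N. \end{cases}$$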
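(ii) Let $f, g : \mathbb{T}^N \to \mathbb{R}$ have an analytic extension to $\Delta_\xi$; then $\|f g\|_\xi \le \|f\|_\xi \|g\|_\xi$.

(iii) Let $0 < \xi < \bar\xi$; let $h : \mathbb{T}^N \times \{y \in \mathbb{R}^s : |y - y_0| \le r\} \to \mathbb{R}$ have an analytic extension on $\Delta_{\bar\xi} \times \hat B^s_r(y_0)$, and let $f : \mathbb{T}^N \to \mathbb{R}^N$ and $g : \mathbb{T}^N \to \mathbb{R}^s$ have analytic extensions on $\Delta_\xi$. Assume that $\|f\|_\xi \le \bar\xi - \xi$, $\|g - y_0\|_\xi \le r$. Then, denoting $\phi(\theta) \equiv (\theta + f(\theta), g(\theta))$, one has $\|h \circ \phi\|_\xi \le \|h\|_{\bar\xi,r}$.

Proof. (i) The claim follows immediately by expanding $f$ in Fourier series.

(ii) In the following sums the indices $n, m$ run over $\mathbb{Z}^N$:
$$\|f g\|_\xi \equiv \sum_n |(fg)_n| \exp(|n|\xi) = \sum_n \Big| \sum_m f_m\, g_{n-m} \Big| \exp(|n|\xi) \le \sum_{n,m} \exp(|m|\xi)\, \exp(|n-m|\xi)\, |f_m|\, |g_{n-m}| = \|f\|_\xi \|g\|_\xi .$$

(iii) In the following sums the indices $n, m$ run over $\mathbb{Z}^N$, the index $k$ runs over $\mathbb{N}^s$ and $j$ over $\mathbb{N}$. Using (ii), one gets
$$\begin{aligned}
\|h\circ\phi\|_\xi &= \sum_n \Big| \sum_{m,k,j} h_{m,k}\, \frac{i^j}{j!}\, \big[(m\cdot f)^j (g-y_0)^k\big]_{n-m} \Big| \exp(|n|\xi) \\
&\le \sum_{n,m,k,j} \frac{|h_{m,k}|}{j!}\, \exp(|m|\xi)\, \Big| \big[(m\cdot f)^j (g-y_0)^k\big]_{n-m} \Big| \exp(|n-m|\xi) \\
&= \sum_{m,k,j} \frac{|h_{m,k}|}{j!}\, \big\|(m\cdot f)^j (g-y_0)^k\big\|_\xi\, \exp(|m|\xi) \\
&\le \sum_{m,k,j} \frac{|h_{m,k}|}{j!}\, \|m\cdot f\|_\xi^{\,j}\, \|g-y_0\|_\xi^{\,|k|}\, \exp(|m|\xi) \\
&\le \sum_{m,k,j} \frac{|h_{m,k}|}{j!}\, |m|^j (\bar\xi-\xi)^j\, r^{|k|}\, \exp(|m|\xi) \\
&= \|h\|_{\bar\xi,r} .
\end{aligned}$$

^{23} We use the standard notation: $(y - y_0)^k = \prod_{i=1}^{N} (y_i - y_{0i})^{k_i}$.
^{24} If $k \in \mathbb{N}$, $\partial_x^k f$ denotes the $k$-tensor of the derivatives of $f$; if $k \in \mathbb{N}^N$, $\partial_x^k f = \dfrac{\partial^{|k|} f}{\partial x_1^{k_1} \cdots \partial x_N^{k_N}}$.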
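As an illustration of point (ii) of Lemma 4.1, the following minimal sketch checks the inequality $\|fg\|_\xi \le \|f\|_\xi\|g\|_\xi$ numerically for one-dimensional trigonometric polynomials. The helper names (`fourier_norm`, `multiply`, `random_trig_poly`) and the sample data are illustrative assumptions, not part of the original text.

```python
import math
import random

def fourier_norm(coeffs, xi):
    """Weighted l1 norm sum_n |f_n| exp(|n| xi) of a 1-D trigonometric polynomial."""
    return sum(abs(c) * math.exp(abs(n) * xi) for n, c in coeffs.items())

def multiply(f, g):
    """Fourier coefficients of the product f*g (discrete convolution)."""
    out = {}
    for m, fm in f.items():
        for k, gk in g.items():
            out[m + k] = out.get(m + k, 0j) + fm * gk
    return out

def random_trig_poly(degree, rng):
    """Hypothetical test data: random complex coefficients for |n| <= degree."""
    return {n: complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            for n in range(-degree, degree + 1)}

rng = random.Random(0)
xi = 0.2
f = random_trig_poly(4, rng)
g = random_trig_poly(6, rng)

lhs = fourier_norm(multiply(f, g), xi)          # ||f g||_xi
rhs = fourier_norm(f, xi) * fourier_norm(g, xi)  # ||f||_xi ||g||_xi
print(lhs, rhs, lhs <= rhs)
```

The check relies only on the triangle inequality and on $|n| \le |m| + |n-m|$, which is exactly the step used in the proof above.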
Remark 4.2. (i) Note that the result might be empty if $\omega$ is "too well approximable by rational vectors." If $\omega$ satisfies (4.3) then one checks easily that
$$\sigma_{pk}(\delta) \le \begin{cases} \gamma^{p}\, \delta^{-(|k|+p\tau)}\, (|k|+p\tau)! , & \text{if } p \ge 0 , \\ \Omega^{|p|}\, \delta^{-(|k|+|p|)}\, (|k|+|p|)! , & \text{if } p < 0 , \end{cases} \qquad \Omega \equiv \max_i |\omega_i| . \tag{4.20}$$
(ii) It is easy to check that (i) and (iii) of Lemma 4.1 hold also if $f$, respectively $h$, is vector-valued.
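The factorial bounds in (4.20) rest on the elementary inequality $\sup_{t\ge 0} t^{s} e^{-\delta t} \le s!\,\delta^{-s}$. The sketch below checks it for integer exponents $s$; the function names are illustrative, and the supremum is taken over real $t \ge 0$, which dominates the supremum over the lattice values $|n|$ used in the text.

```python
import math

def sup_weighted_power(s, delta):
    """Exact supremum over t >= 0 of t**s * exp(-delta*t), attained at t = s/delta."""
    if s == 0:
        return 1.0
    t_star = s / delta
    return t_star ** s * math.exp(-s)

def factorial_bound(s, delta):
    """The cruder bound s! * delta**(-s), of the kind appearing in (4.20)."""
    return math.factorial(s) * delta ** (-s)

for s in range(0, 8):
    for delta in (0.05, 0.2, 1.0):
        exact = sup_weighted_power(s, delta)
        bound = factorial_bound(s, delta)
        print(s, delta, exact, bound)
```

The exact supremum equals $(s/(e\delta))^{s}$, so the factorial bound follows from $s^{s} \le e^{s}\, s!$.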
4.3. KAM Algorithm. Here we describe the "KAM algorithm" associated to the scheme of Sect. 4.1, i.e. we equip the algebraic scheme described in Proposition 4.1 with "accurate" and detailed estimates so as to end up with a map $K$ which, to given bounds on norms of the relevant objects relative to a certain approximate solution $(u, v)$, associates corresponding bounds on the new approximation $(u', v')$. More precisely, let us start by making quantitative the hypotheses formulated in Sect. 4.1.

Assumption 4.2. Let $0 < \xi < \bar\xi$, $r$, $E$, $E_{p,q}$ ($p, q \in \mathbb{N}$) be such that $(x, y) \to h(x, y)$ is real analytic on $\Delta_{\bar\xi} \times \hat B^N_r(y_0)$ and
$$\|(h_{yy})^{-1}\|_{\bar\xi,r} \le E , \qquad \|\partial_x^p \partial_y^q h\|_{\bar\xi,r} \le E_{p,q} ; \tag{4.21}$$
assume that the approximate solution $(u, v)$ is real analytic on $\Delta_\xi$ and let $U$, $V$, $M$, $\bar M$, $\tilde V$, $F$, $G$, $\tilde T$ be positive numbers bounding the following norms:
$$\|u\|_\xi \le U , \quad \|v\|_\xi \le V , \quad \|\mathcal{M}\|_\xi \le M , \quad \|\mathcal{M}^{-1}\|_\xi \le \bar M , \quad \|v_\theta\|_\xi \le \tilde V , \quad \|f\|_\xi \le F , \quad \|g\|_\xi \le G , \quad |\langle \mathcal{T} \rangle^{-1}| \le \tilde T , \tag{4.22}$$
where $\mathcal{M}$, $\mathcal{T}$, $f$ and $g$ are, respectively, as in (4.4), (4.5) and (4.6). Finally assume that
$$U \le \bar\xi - \xi , \qquad \rho \equiv \|v - y_0\|_\xi < r . \tag{4.23}$$
Now let $0 < \delta \le \xi$ and define
$$\xi' \equiv \xi - \delta . \tag{4.24}$$
In the rest of this section we shall define the KAM map, i.e., an explicit map
$$K : (U, V, M, \bar M, \tilde V, F, G, \tilde T) \to (U', V', M', \bar M', \tilde V', F', G', \tilde T'), \tag{4.25}$$
where $(U', ..., G', \tilde T')$ are bounds on the norms $\|u'\|_{\xi'}$, ..., $\|g'\|_{\xi'}$ and on the number $|\langle \mathcal{T}' \rangle^{-1}|$, with $u'$, $v'$, $f'$, $g'$ defined in Proposition 4.1, while $\mathcal{M}'$ and $\mathcal{T}'$ are (obviously) defined as
$$\mathcal{M}' \equiv I + u'_\theta \equiv \mathcal{M} + z_\theta , \qquad \mathcal{T}' \equiv \mathcal{M}'^{-1}\, h_{yy}(\theta + u', v')\, \mathcal{M}'^{-T} .$$
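We start now to work out the necessary estimates. By Lemma 4.1 and (4.22), we get
$$\|\mathcal{T}\|_\xi \le T \equiv \bar M^2 E_{0,2} .$$

Remark 4.3. In principle, the bound on $\|\mathcal{T}\|_\xi$ could be improved replacing $E_{0,2}$ by $\|(h^0_{yy})^{-1}\|_\xi$ (without invoking point (iii) of Lemma 4.1); in practice, however, such a norm is difficult to evaluate accurately and one would not get significantly better estimates.

All other estimates are immediately obtained from Lemma 4.1 (and from the definitions given in Proposition 4.1). Here is the complete list:

$\|\mathcal{T}^{-1}\|_\xi \le \bar T \equiv M^2 E$ ;
$\|b\|_\xi \le F_* \equiv \tilde V F + M G$ ;
$\|B\|_{\xi-\frac{\delta}{2}} \le s_1 F_*$ , $\quad s_1 \equiv 2\sigma_{01}\big(\tfrac{\delta}{2}\big)$ ;
$|c_1| \le \tilde T \big( \bar M F + T \sigma_{10}(\xi) F_* \big)$ ;
(note that in the last estimate we have used the fact that the supremum norm $\sup_{\mathbb{T}^N} |\cdot|$ is dominated by the 0-Fourier norm $\|\cdot\|_0$);
$\|A\|_{\xi'} \equiv \|D^{-1} B\|_{\xi'} \le s_2 F_*$ , $\quad s_2 \equiv \sigma_{10}\big(\tfrac{\delta}{2}\big)\, s_1$ ;
$\|b'\|_{\xi-\frac{\delta}{2}} \le B_* \equiv T \sigma_{10}\big(\tfrac{\delta}{2}\big) F_* + T \tilde T \big( \bar M F + T F_* \sigma_{10}(\xi) \big) + \bar M F$ ;
$|c_2| \le M \|D^{-1} b'\|_0 \le M \sigma_{10}\big(\xi - \tfrac{\delta}{2}\big) B_*$ ;
$\|z\|_{\xi'} \le B_* s_3$ , $\quad s_3 \equiv M \sigma_{10}\big(\tfrac{\delta}{2}\big) + M^2 \sigma_{10}\big(\xi - \tfrac{\delta}{2}\big)$ ;
$\|u'\|_{\xi'} \equiv \|u + z\|_{\xi'} \le U + \|z\|_{\xi'} \le U' \equiv U + B_* s_3$ ;
$\|Dz\|_{\xi'} = \|(D\mathcal{M})(D^{-1} b' + c_2) + \mathcal{M} b'\|_{\xi'} \le B_* s_4$ , $\quad s_4 \equiv s_3 \sigma_{-10}(\delta) + M$ ;
$\|z_\theta\|_{\xi'} = \|\mathcal{M}_\theta (D^{-1} b' + c_2) + \mathcal{M} D^{-1} \partial_\theta b'\|_{\xi'} \le B_* s_5$ , $\quad s_5 \equiv \sigma_{01}(\delta) s_3 + M \sigma_{11}\big(\tfrac{\delta}{2}\big)$ ;
$\|Dz_\theta\|_{\xi'} = \|(D\mathcal{M}_\theta)(D^{-1} b' + c_2) + D\mathcal{M}\, D^{-1} \partial_\theta b' + \mathcal{M}_\theta b' + \mathcal{M} \partial_\theta b'\|_{\xi'} \le B_* s_6$ , $\quad s_6 \equiv \sigma_{-11}(\delta) s_3 + M \Big[ \sigma_{-10}(\delta)\, \sigma_{11}\big(\tfrac{\delta}{2}\big) + \sigma_{01}(\delta) + \sigma_{01}\big(\tfrac{\delta}{2}\big) \Big]$ ;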
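$\|w\|_{\xi'} \le E \big( \|Dz\|_{\xi'} + E_{1,1} \|z\|_{\xi'} + F \big) \le E (s_7 B_* + F)$ , $\quad s_7 \equiv s_4 + E_{1,1} s_3$ ;
$\|v'\|_{\xi'} \equiv \|v + w\|_{\xi'} \le V + \|w\|_{\xi'} \le V' \equiv V + E (s_7 B_* + F)$ ;
$\|w_\theta\|_{\xi'} = \|\partial_\theta (h^0_{yy})^{-1} (Dz - h^0_{yx} z + f) + (h^0_{yy})^{-1} (Dz_\theta - \partial_\theta h^0_{yx}\, z - h^0_{yx} z_\theta + f_\theta)\|_{\xi'}$
$\quad \le \sigma_{01}(\delta) E \big( \|Dz\|_{\xi'} + E_{1,1} \|z\|_{\xi'} + F \big) + E \Big[ \|Dz_\theta\|_{\xi'} + \sigma_{01}(\delta) E_{1,1} \|z\|_{\xi'} + E_{1,1} \|z_\theta\|_{\xi'} + \sigma_{01}(\delta) F \Big]$
$\quad \le E B_* s_8 + E F s_9$ , $\quad s_8 \equiv \sigma_{01}(\delta) s_7 + s_6 + \sigma_{01}(\delta) E_{1,1} s_3 + E_{1,1} s_5$ , $\quad s_9 \equiv 2\sigma_{01}(\delta)$ ;
$\|q_1\|_{\xi'} \le \tfrac{1}{2} \|h_{xxx}\|_{\bar\xi,r} \|z\|_{\xi'}^2 + \|h_{xxy}\|_{\bar\xi,r} \|z\|_{\xi'} \|w\|_{\xi'} + \tfrac{1}{2} \|h_{xyy}\|_{\bar\xi,r} \|w\|_{\xi'}^2 \le s_{10} B_*^2 + s_{11} B_* F + s_{12} F^2$ ,
$\quad s_{10} \equiv \tfrac{1}{2} E_{3,0}\, s_3^2 + E E_{2,1}\, s_3 s_7 + \tfrac{1}{2} E_{1,2} E^2 s_7^2$ , $\quad s_{11} \equiv s_3 E_{2,1} E + s_7 E_{1,2} E^2$ , $\quad s_{12} \equiv \tfrac{1}{2} E^2 E_{1,2}$ ;
$\|f'\|_{\xi'} \equiv \|q_2\|_{\xi'} \le \tfrac{1}{2} \|h_{yxx}\|_{\bar\xi,r} \|z\|_{\xi'}^2 + \|h_{yxy}\|_{\bar\xi,r} \|z\|_{\xi'} \|w\|_{\xi'} + \tfrac{1}{2} \|h_{yyy}\|_{\bar\xi,r} \|w\|_{\xi'}^2 \le F' \equiv s'_{10} B_*^2 + s'_{11} B_* F + s'_{12} F^2$ ,
$\quad s'_{10} \equiv \tfrac{1}{2} E_{2,1}\, s_3^2 + E E_{1,2}\, s_3 s_7 + \tfrac{1}{2} E_{0,3} E^2 s_7^2$ , $\quad s'_{11} \equiv s_3 E_{1,2} E + s_7 E_{0,3} E^2$ , $\quad s'_{12} \equiv \tfrac{1}{2} E^2 E_{0,3}$ ;
$\|q_3\|_{\xi'} \le \sigma_{01}(\delta) F \|w\|_{\xi'} + \sigma_{01}(\delta) G \|z\|_{\xi'} + M \|q_1\|_{\xi'} \le s_{13} B_* F + s_{14} B_* G + s_{15} F^2 + s_{16} B_*^2$ ,
$\quad s_{13} \equiv \sigma_{01}(\delta) E s_7 + M s_{11}$ , $\quad s_{14} \equiv \sigma_{01}(\delta) s_3$ , $\quad s_{15} \equiv \sigma_{01}(\delta) E + M s_{12}$ , $\quad s_{16} \equiv M s_{10}$ ;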
h kg 0 kξ0 = kM−T BM−1 z + A(DM−1 )z + AM−1 Dz i +(DMT )(h0yy )−1 fθ M−1 z +D(h0yy )−1 fθ M−1 z + (h0yy )−1 Dfθ M−1 +(h0yy )−1 fθ (DM−1 ) z + (h0yy )−1 fθ M−1 Dz − M−T q3 kξ0 h ≤ M s1 F∗ M B∗ s3 + s2 F∗ σ−10 (2δ)M B∗ s3 + s2 F∗ M B∗ s4 i +σ−10 (2δ)Eσ01 (2δ)F M B∗ s3 +σ−10 (δ)Eσ01 (2δ)F M B∗ s3 + Eσ−11 (2δ)F M B∗ s3 +Eσ01 (2δ)F σ−10 M B∗ s3 + Eσ01 (2δ)F M B∗ s4 +M s13 B∗ F + s14 B∗ G + s15 F 2 + s16 B∗2 ≤ G0 ≡ s17 F∗ B∗ + s18 EF B∗ + s14 M B∗ G + s15 M F 2 + s16 M B∗2 , 2
s17 ≡ M (s1 s3 + σ−10 (2δ)s2 s3 + s2 s4 ) , h s18 ≡ M s3 M σ−10 (2δ)σ01 (2δ) + σ−10 (δ)σ01 (δ) + σ−11 (2δ) i −1 +σ01 (2δ)σ−10 (2δ) + M σ01 (2δ)s4 + M E s13 ; kM0 kξ0 ≡ kM + zθ kξ0 ≤ M + kzθ kξ0 ≤ M 0 ≡ M + B∗ s 5 . 0
To define $\bar M'$ we have to distinguish two cases according to whether $\bar M B_* s_5$ is greater or smaller than 1. In the first case we define $\bar M'$ to be infinite:
$$\bar M B_* s_5 \ge 1 \implies \bar M' \equiv \infty ,$$
while if
$$\bar M B_* s_5 < 1 , \tag{4.26}$$
we proceed as follows:
$$\|\mathcal{M}'^{-1}\|_{\xi'} = \|(I + \mathcal{M}^{-1} z_\theta)^{-1} \mathcal{M}^{-1}\|_{\xi'} \le \bar M (1 - \bar M \|z_\theta\|_{\xi'})^{-1} \le \bar M' \equiv \bar M (1 - \bar M B_* s_5)^{-1} .$$
Next,
$$\|v'_\theta\|_{\xi'} \equiv \|v_\theta + w_\theta\|_{\xi'} \le \tilde V + \|w_\theta\|_{\xi'} \le \tilde V' \equiv \tilde V + E B_* s_8 + E F s_9 .$$
Of course, as is clear from the definition of $\bar M'$, the KAM algorithm will be of some use only if (4.26) is satisfied. To complete the computation of $K$, it remains to bound $|\langle \mathcal{T}' \rangle^{-1}|$, i.e. to compute $\tilde T'$. If $\bar M' = \infty$, we define also $\tilde T' \equiv \infty$; otherwise we proceed as follows. A bit of algebra shows that if we define
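$$\mathcal{C} \equiv (I + \mathcal{M}^{-1} z_\theta)^{-1} - I = \sum_{k=1}^{\infty} (-\mathcal{M}^{-1} z_\theta)^k , \qquad \mathcal{C}' \equiv \mathcal{C} \mathcal{M}^{-1} , \qquad \mathcal{C}'' \equiv h_{yy}(\theta + u', v') - h^0_{yy} ,$$
then we can write $\mathcal{T}'$ as
$$\mathcal{T}' = \mathcal{T} + \mathcal{C}_* ,$$
where
$$\mathcal{C}_* \equiv \mathcal{M}^{-1} h^0_{yy}\, \mathcal{C}'^{T} + \mathcal{M}^{-1} \mathcal{C}'' \mathcal{M}^{-T} + \mathcal{M}^{-1} \mathcal{C}'' \mathcal{C}'^{T} + \mathcal{C}' h^0_{yy} \mathcal{M}^{-T} + \mathcal{C}' h^0_{yy}\, \mathcal{C}'^{T} + \mathcal{C}' \mathcal{C}'' \mathcal{M}^{-T} + \mathcal{C}' \mathcal{C}'' \mathcal{C}'^{T} .$$
Thus, since $\sup_{\mathbb{T}^N} |z_\theta| \le \|z_\theta\|_0$ (and a bound on $\|z_\theta\|_0$ is obtained exactly as above replacing $\delta$ with $\xi$), we obtain
$$\sup_{\mathbb{T}^N} |\mathcal{C}| \le \bar M \|z_\theta\|_0 (1 - \bar M \|z_\theta\|_0)^{-1} \le C \equiv \bar M B'_* s'_5 (1 - \bar M B'_* s'_5)^{-1} ;$$
$$\sup_{\mathbb{T}^N} |\mathcal{C}'| \le C' \equiv C \bar M ;$$
$$\sup_{\mathbb{T}^N} |\mathcal{C}''| \le E_{1,2} \|z\|_0 + E_{0,3} \|w\|_0 \le C'' \equiv E_{1,2} B'_* s'_3 + E_{0,3} E (s'_7 B'_* + F) ;$$
$$\sup_{\mathbb{T}^N} |\mathcal{C}_*| \le C_* \equiv 2 \bar M E_{0,2} C' + \bar M^2 C'' + 2 \bar M C'' C' + C'^2 E_{0,2} + C'^2 C'' , \tag{4.27}$$
where $B'_*$, $s'_5$, $s'_3$, $s'_7$ are defined as above but with $\delta$ replaced by $\xi$. Now, if
$$\tilde T C_* \ge 1 \implies \tilde T' = \infty ,$$
otherwise, if
$$\tilde T C_* < 1 , \tag{4.28}$$
we have
$$|\langle \mathcal{T}' \rangle^{-1}| = |(I + \langle \mathcal{T} \rangle^{-1} \langle \mathcal{C}_* \rangle)^{-1} \langle \mathcal{T} \rangle^{-1}| \le \tilde T' \equiv \tilde T (1 - \tilde T C_*)^{-1} . \tag{4.29}$$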
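The two case distinctions (4.26) and (4.28)–(4.29) follow the same Neumann-series pattern: a bound is amplified by the factor $(1 - \text{bound}\cdot\text{perturbation})^{-1}$ and declared infinite when that product reaches 1. A minimal sketch of this step follows; the function name `amplify` and all numerical values are hypothetical and serve only to exercise both branches.

```python
import math

def amplify(bound, perturbation):
    """Return bound / (1 - bound*perturbation), or +inf when bound*perturbation >= 1.

    Mirrors the pattern of (4.26) and (4.28)-(4.29): the estimate is only
    meaningful while the product stays strictly below 1.
    """
    product = bound * perturbation
    if product >= 1.0:
        return math.inf
    return bound / (1.0 - product)

# Hypothetical numbers, not taken from the paper.
M_bar, B_star, s5 = 1.0001, 1e-6, 50.0
M_bar_new = amplify(M_bar, B_star * s5)   # analogue of the new bound on the inverse matrix

T_tilde, C_star = 0.83, 1e-4
T_tilde_new = amplify(T_tilde, C_star)    # analogue of (4.29)

print(M_bar_new, T_tilde_new, amplify(2.0, 0.5))
```

Returning `math.inf` lets a driver loop detect that the iteration can no longer be continued without raising an exception.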
The computation of the map $K$ is completed.

Remark 4.4. ("KAM algorithm"). If (4.23) is satisfied when $U$, $v$, $\xi$ are replaced by $U'$, $v'$, $\xi'$ then the map $K$ can be re-applied and, iterating when possible, one obtains a sequence $(u^{(j)}, v^{(j)})$ of approximate solutions [with relative error functions $(f^{(j)}, g^{(j)})$] and corresponding norm-bounds $U_j$, $V_j$, ..., $\tilde T_j$. More precisely, fix numbers $\delta_j \searrow 0$ such that $\sum_j \delta_j < \xi$ (where $\delta_0 \equiv \delta$ as above); let, for $j \ge 1$, $\xi_j \equiv \xi_{j-1} - \delta_{j-1}$; call the above "first approximate solution" $(u^{(0)}, v^{(0)})$, $(f^{(0)}, g^{(0)})$ the relative error function and attach to $\mathcal{M}$, $\mathcal{T}$ and to their norm bounds an index 0. Then, given, for $j \ge 0$, $(u^{(j)}, v^{(j)})$ and the relative norm bounds $(U_j, V_j, ..., \tilde T_j)$ we let $(u^{(j+1)}, v^{(j+1)})$ be the approximate solution constructed in^{25} Proposition 4.1 and, if conditions (4.23), (4.26) and (4.28) are satisfied (see, again, footnote 25), then $(U_{j+1}, ..., \tilde T_{j+1}) = K(U_j, ..., \tilde T_j)$ are the norm bounds controlling the new approximate solution $(u^{(j+1)}, v^{(j+1)})$.

4.4. KAM Theorem. Here we prove a KAM theorem based on the KAM algorithm described above.^{26}
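The bookkeeping of the analyticity widths in Remark 4.4 can be sketched as follows. The geometric choice $\delta_j = \delta_0/2^j$ with $\delta_0 = (\xi - \hat\xi)/2$ is an assumption made here for illustration (any sequence with $\sum_j \delta_j < \xi$ fits the remark), and `strip_widths` is a hypothetical helper name.

```python
def strip_widths(xi, xi_hat, steps):
    """Return [(j, xi_j, delta_j)] for the assumed schedule delta_j = delta_0 / 2**j.

    With delta_0 = (xi - xi_hat)/2 the total loss is xi - xi_hat, so that
    xi_j = xi_{j-1} - delta_{j-1} decreases towards xi_hat from above.
    """
    delta0 = (xi - xi_hat) / 2.0
    xi_j = xi
    rows = []
    for j in range(steps):
        delta_j = delta0 / 2 ** j
        rows.append((j, xi_j, delta_j))
        xi_j -= delta_j   # xi_{j+1} = xi_j - delta_j
    return rows

rows = strip_widths(0.2, 0.1, 30)
for j, xi_j, delta_j in rows[:4]:
    print(j, xi_j, delta_j)
```

In a full implementation each row would be paired with one application of the map $K$, stopping as soon as one of the conditions (4.23), (4.26), (4.28) fails.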
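Theorem 4.1. Let $\omega$ satisfy (4.3), let Assumption 4.2 hold, let
$$0 < \hat\xi < \xi , \qquad \hat\delta \equiv \frac{1}{2}\, \frac{\xi - \hat\xi}{\xi} ,$$
let $\eta$ be the following norm on $(f, g)$:
$$\eta \equiv \max\{E E_{0,2} ,\ E_{0,2} \tilde T\}\ \max\{\gamma F ,\ \gamma^2 E_{0,2} \tilde V F ,\ \gamma^2 E_{0,2} G\} ,$$
and define the following parameters related to $\omega$ and to the quantities introduced in Assumption 4.2:
$$\Omega \equiv \max_{1 \le i \le N} |\omega_i| , \qquad \Omega_1 \equiv \max\Big\{\frac{\Omega}{2} ,\ E_{1,1}\Big\} ,$$
$$\Omega_* \equiv \max\{E_{2,1} ,\ E E_{1,2} \Omega_1 ,\ E_{0,3} E^2 \Omega_1^2\} , \qquad H_* \equiv \max\{E_{3,0} ,\ E E_{2,1} \Omega_1 ,\ E_{1,2} E^2 \Omega_1^2\} ,$$
$$H'_* \equiv \max\{H_* ,\ E \Omega_1\} , \qquad H''_* \equiv \max\{H'_* ,\ \tilde V \Omega_*\} ,$$
$$\alpha_* \equiv \max\{E E_{0,2} ,\ E_{0,2} \tilde T\} \cdot \max\{\gamma^2 E_{0,2} H''_* ,\ \gamma \Omega_*\} , \qquad \theta \equiv \max\{(\xi - \hat\xi)^{2\tau} ,\ (\xi - \hat\xi)^{2\tau+2}\} ,$$
$$\alpha_{**} \equiv \theta\, \max\Big\{1 ,\ \frac{E \Omega_1}{r - \rho} ,\ \frac{E \Omega_1}{\tilde V} ,\ \tilde T E_{0,2} ,\ \tilde T E_{1,2} ,\ \tilde T E_{0,3} E \Omega_1\Big\} , \qquad \alpha \equiv \max\{\alpha_* ,\ \alpha_{**}\} .$$
There exists a polynomial $\nu$ in $(\xi, \hat\delta)$ satisfying^{27}
$$\frac{5}{4} \le \nu(\xi, \hat\delta) \le 21 + 88 \max\{\xi , \xi^6\} , \qquad \forall\, \xi > 0 ,\ \forall\, 0 < \hat\delta$$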
0) ; σ10 (ξi ) ≤ γξi−τ τ ! ; σ01 t δ δi i σ10 ≤ 2τ γδi−τ τ ! , σ10 ξi − ≤ σ10 (ξi ) ≤ γξi−τ τ ! ; 2 2 29 See the previous section. Note that, with these definitions, ξ = ξ, U = U ,...,M = M , etc. The “norm0 0 0 parameters” are defined in the previous section where the primed quantities correspond to u(i+1) , v (i+1) while the unprimed ones correspond to u(i) , v (i) . 30 Recall that the primed quantities of Sect. 4.3 correspond here to the index i + 1 while the unprimed ones to the index i, that the index i = 0 corresponds to the “initial” approximate solution, namely to the quantities defined in Assumption 4.2 and, finally, that Ep,q and E are independent of i. 31 The first relation in (4.36) follows from (4.3) letting n = e(i) , where {e(i) } is the standard orthonormal basis of RN ; the second and third relations follow by observing that e(1) is an eigenvector with eigenvalue 1 of MT (θ0 ) and of M−T (θ0 ), if θ0 is a critical point of u1 (θ); the last two relations are obvious.
σ−10
δ i
≤ t δi−1 , (∀ t > 0);
t δ i σ−11 ≤ 2t2 δi−2 , (∀ t > 0) ; t 2
Ti ≤ 4M 0 E0,2 ; s1 ≤ 4δi−1 ;
σ11
T i ≤ 4M02 E ,
δ i
2
≤ γ2τ +1 δi−(τ +1) (τ + 1)! ;
F∗i ≤ 2Ve0 Fi + 2M0 Gi ;
s2 ≤ 22+τ γδi−(τ +1) τ ! ;
h 1 δ τ i i s3 ≤ M0 2τ +1 γδi−τ τ ! + 4M02 γξi−τ τ ! = M02 2τ +1 γδi−τ τ ! +2 M0 2ξi ≤ M02 2τ +1 γδi−τ τ ! 1 + δˆ ≡ M02 2τ +1 γδi−τ τ ! ν1 , ν1 ≡ 1 + δˆ ; δ02 ; 2 ν 1 ; ≤ γM02 2τ +2 δi−(τ +1) (τ + 1)! ν3 , ν3 ≡ 1 + 4 3 ν1 + δ02 ; ≤ (γ)M02 2τ +2 δi−(τ +2) (τ + 1)! ν4 , ν4 ≡ 1 + 2 16 ≤ (γ1 )M02 2τ +1 δi−(τ +1) τ ! ν5 , ν5 ≡ ν2 + δ0 ν1 ; ν 5 ν 1 δ0 + + ν 3 δ0 ; ≤ (γ1 )M02 2τ +2 δi−(τ +2) (τ + 1)! ν6 , ν6 ≡ ν4 + 4 4 ≤ 2δi−1 ;
s4 ≤ (γ)M02 2τ +1 δi−(τ +1) τ ! ν2 , ν2 ≡ ν1 + s5 s6 s7 s8 s9
s10 ≤ γ 2 H∗ M04 22τ +1 δi−2(τ +1) τ !2 ν7 , ν7 ≡ (ν5 + ν1 δ0 )2 ; s11 ≤ γ 2 H∗ M02 2τ +1 δi−(τ +1) τ ! ν8 , ν8 ≡ ν5 + ν1 δ0 ; γ 2 H∗ ; 2 ≤ γ 2 ∗ M04 22τ +1 δi−2(τ +1) τ !2 ν7 ;
s12 ≤ s010
s011 ≤ γ 2 ∗ M02 2τ +1 δi−(τ +1) τ ! ν8 ; γ 2 ∗ ; 2 ≤ γ 2 H∗0 M03 2τ +1 δi−(τ +2) τ ! ν9 , ν9 ≡ ν5 + 2δ0 ν8 ;
s012 ≤ s13
s14 ≤ γM02 2τ +1 δi−(τ +1) τ ! ν1 ;
s15 ≤ γ 2 H∗0 M0 δi−1 ν10 , ν10 ≡ 1 + δ0 ;
s16 ≤ γ 2 H∗ M05 22τ +2 δi−2(τ +1) τ !2 ν7 ;
ν1 ν1 δ02 + ; 2 2 9 ν 9 ν2 + . ≡ ν1 + 8 2 4
s17 ≤ γ(γ)(M0 M 0 )2 22τ +5 δi−2(τ +1) τ !2 ν11 , ν11 ≡ ν2 + s18 ≤ γ 2 E
−1
H∗0 M03 M 0 M0 2τ +3 δi−(τ +2) τ ! ν12 , ν12 2
Let now ηi be as η (defined in the text of Theorem 4.1) but with Fi , Gi in place of F , G. Then, 4
γB∗i ≤ ηi M0 M 0 2τ +4 δi−τ τ ! ν13 , ν13 ≡ 1 + 4δˆ + 4
9 δ0 ; 16
kz (i) kξi+1 ≤ ηi M03 M 0 22τ +5 δi−2τ τ !2 ν14 , ν14 ≡ ν1 ν13 ;
kzθ(i) kξi+1 ≤ ηi M03 M 0 22τ +6 δi−(2τ +1) τ !2 (τ + 1) ν15 , ν15 ≡ ν3 ν13 ; 4
kw(i) kξi+1 ≤ ηi (E1 )M03 M 0 22τ +5 δi−(2τ +1) τ !2 ν16 , ν16 ≡ ν5 ν13 + 4
δ03 ; 27
kwθ(i) kξi+1 ≤ ηi (E1 )M03 M 0 22τ +6 δi−2(τ +1) τ !2 (τ + 1) ν17 4
ν17 ≡ ν6 ν13 +
δ03 ; 28
kf (i+1) kξi+1 ≤ ∗ ηi2 M06 M 0 24τ +9 δi−(4τ +2) τ !4 ν18 8
2 ν18 ≡ ν7 ν13 +
δ03 δ06 ν8 ν13 + 14 ; 6 2 2
kg (i+1) kξi+1 ≤ H∗0 ηi2 M07 M 0 24τ +12 δi−(4τ +2) τ !4 ν19 , 9
δ3 δ05 δ0 δ2 ν11 ν13 + 06 ν12 ν13 + 07 ν1 ν13 + 14 ν10 ; (4.38) 2 2 2 2 (in the last inequality we used also the fact that from EE0,2 ≥ 1 and γ1 ≥ 1 it follows that γ max{Ve F, G} ≤ H∗0 ηi ). We let now 2 ν19 ≡ ν7 ν13 +
1 (4.39) 4 and observe that all the ν 0 s are greater than or equal to 1. Putting together the above ˆ one finds definitions of the various νi ’s and recalling that δ0 = ξ δ, 101 ˆ2 47 ˆ 1161 ˆ3 5 δξ + δ ξ+ δ ξ ν = + 10 δˆ + 33 δˆ2 + 40 δˆ3 + 16 δˆ4 + 4 8 2 8 5257 ˆ2 2 39527 ˆ3 2 50415 ˆ4 2 329 ˆ4 δ ξ + 64 δˆ5 ξ + δ ξ + δ ξ + δ ξ + 2 512 512 256 77319 ˆ3 3 455303 ˆ4 3 6105 ˆ5 3 δ ξ + δ ξ + δ ξ +194 δˆ5 ξ 2 + 64 δˆ6 ξ 2 + 8192 8192 64 1131 ˆ4 4 8415 ˆ5 4 921 ˆ6 4 15447 ˆ5 5 δ ξ + δ ξ + δ ξ + δ ξ +50 δˆ6 ξ 3 + 256 512 64 16384 925 ˆ6 5 1369 ˆ6 6 δ ξ + δ ξ . + 512 16384 We note, for later use, that32 ν18 , ν14 , ν15 , ν16 + δ0 ν14 } . ν ≥ max{ (4.40) 4 Thus, (using the inequalities α ≥ α∗ and ν ≥ max{ ν418 , ν19 }), we find ˆ ≡ ν19 + ν ≡ ν(ξ0 , δ)
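$$\eta_{i+1} \le \kappa \lambda^i \eta_i^2 ,$$
where^{33}
$$\kappa \equiv \alpha\, M_0^7\, \bar M_0^9\, 2^{8\tau+13}\, (\xi - \hat\xi)^{-(4\tau+2)}\, \tau!^4\, \nu , \qquad \lambda \equiv 2^{4\tau+2} .$$
Iterating,^{34} for all $0 \le i \le j + 1$, we get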
^{32} As it is immediate to check after having written down explicitly the definition of $\nu$ (which, recalling that $\delta_0 = \xi_0 \hat\delta$, turns out to be a polynomial of degree 12 in $\xi_0$ and $\hat\delta$ with positive (rational) coefficients) and of the other $\nu_k$'s.
^{33} The reason for having in this formula $\alpha$ in place of $\alpha_*$ will be plain when we shall check the inductive assumptions (4.35) for $j + 1$.
^{34} Recall that $\sum_{k=0}^{i-1} 2^k = 2^i - 1$, $\sum_{k=1}^{i-1} (i - k) 2^{k-1} = 2^i - i - 1$ (and that $\eta_0 = \eta$).
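$$\eta_i \le \frac{(\eta \kappa \lambda)^{2^i}}{\kappa \lambda^{i+1}} ,$$
whence, in view of the "KAM condition" (4.31) (which is now recognized as equivalent to requiring $\eta \kappa \lambda \le 1$),
$$\eta_i \le \frac{1}{\kappa \lambda^{i+1}} . \tag{4.41}$$
This bound allows one to get simple estimates on the norms of $z^{(i)}$ and $w^{(i)}$ for $0 \le i \le j$. Using (4.41) and the facts that
$$\nu \ge \max\{\nu_{14} , \nu_{15}\} \qquad \text{and} \qquad \alpha \ge \max\{(\xi - \hat\xi)^{2\tau+2} ,\ (\xi - \hat\xi)^{2\tau}\} ,$$
from (4.38) we get
$$\max\{\|z^{(i)}\|_{\xi_{i+1}} ,\ \|z^{(i)}_\theta\|_{\xi_{i+1}}\} \le \min\Big\{\delta_i ,\ \frac{1}{\bar M_0}\Big\}\, \frac{1}{2^{15}}\, \frac{1}{8^i} . \tag{4.42}$$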
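The passage from the recursion $\eta_{i+1} \le \kappa\lambda^i\eta_i^2$ to the closed bound $\eta_i \le (\eta\kappa\lambda)^{2^i}/(\kappa\lambda^{i+1})$ can be checked in the extremal case of equality. The sketch below does this in exact rational arithmetic; the values of $\eta_0$, $\kappa$, $\lambda$ are arbitrary test data, not those of the paper.

```python
from fractions import Fraction

def worst_case_sequence(eta0, kappa, lam, steps):
    """Iterate eta_{i+1} = kappa * lam**i * eta_i**2 (the equality case)."""
    etas = [Fraction(eta0)]
    for i in range(steps):
        etas.append(kappa * lam ** i * etas[-1] ** 2)
    return etas

def closed_form(eta0, kappa, lam, i):
    """(eta0*kappa*lam)**(2**i) / (kappa * lam**(i+1))."""
    return (eta0 * kappa * lam) ** (2 ** i) / (kappa * lam ** (i + 1))

# Arbitrary test data with eta0*kappa*lam = 3/25 <= 1.
eta0, kappa, lam = Fraction(1, 100), Fraction(3), Fraction(4)
etas = worst_case_sequence(eta0, kappa, lam, 6)

for i, eta_i in enumerate(etas):
    print(i, float(eta_i), eta_i == closed_form(eta0, kappa, lam, i))
```

When $\eta\kappa\lambda \le 1$ the numerator is at most 1, which gives (4.41) and the super-exponential decay of the error bounds.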
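Analogously, using
$$\nu \ge \nu_{16} , \qquad \alpha \ge E \Omega_1 (\xi - \hat\xi)^{2\tau+1} \max\Big\{\frac{1}{r - \rho} ,\ \frac{1}{\tilde V}\Big\} ,$$
from (4.38) there follows
$$\|w^{(i)}\|_{\xi_{i+1}} \le \min\{r - \rho ,\ \tilde V\}\, \frac{1}{2^{18}}\, \frac{1}{8^i} . \tag{4.43}$$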
To check the inductive hypotheses (4.35) we shall also need simple bounds on the constants Ci , Ci0 , Ci00 , C∗i [recall (4.27) and (4.29)]. Using ν ≥ max{ν15 , ν16 + δ0 ν14 } and the fact that ˆ 2τ +1 max{1 , TeE0,2 , TeE1,2 , TeE0,3 E1 } , α ≥ (ξ − ξ) and that M i B∗i s5 ≤
1 2 M0
1 1 1 1 min{1 , } ≤ 14 , 14 i e 2 8 2 T E0,2
we obtain Ci ≤ min{1 ,
1
}
1
1 1 , 213 8i
TeE0,2 1 1 1 1 } , Ci0 ≤ min{1 , 12 8i e M T E0,2 0 2 2 M0
4
Ci00 ≤ max{E1,2 , E0,3 E1 } ηi M03 M 0 δi−2τ +1 22τ +5 τ !2 (ν16 + δ0 ν14 ) , 1 1 C∗i ≤ Te0−1 8 i . (4.44) 2 8 We are ready to check (4.35) for i + 1: From (4.35) and (4.42) we find ku(i+1) kξi+1 ≤ ku(i) kξi + kz (i) kξi+1 ≤ ξ¯ − ξi + δi = ξ¯ − ξi+1 .
From (4.23) and (4.43), we get
$$\|v^{(i+1)} - y_0\|_{\xi_{i+1}} \le \|v^{(0)} - y_0\|_\xi + \sum_{j=0}^{i} \|w^{(j)}\|_{\xi_{i+1}} \le \rho + \frac{r - \rho}{2^{18}} \sum_{j=0}^{\infty} \frac{1}{8^j} \le r .$$
Finally, using (4.43), (4.42), (4.44) one easily obtains the remaining inductive hypotheses. Observe that from (4.41), (4.42) and (4.43) it follows that the error functions $f^{(i)}$ and $g^{(i)}$ go to zero exponentially fast, while $u^{(i)}$ and $v^{(i)}$ converge (exponentially fast) to real-analytic functions $\tilde u$ and $\tilde v$. This concludes the proof of Theorem 4.1.

Proof (of Theorem 4.2). The reader will have no difficulty in checking that the previous proof goes through word by word so that the claim follows from uniformity in $\mu \in D$.

5. Proof of Theorem 1.1

The proof of Theorem 1.1 will be divided into three steps: (1) construction of the starting approximate solution (Remark 4.1) using $\varepsilon$-expansions; (2) bounds (4.22) on the norms relative to the starting approximate solution with $D \equiv \{\varepsilon \in \mathbb{C} : |\varepsilon| \le \varepsilon_0 \equiv 10^{-6}\}$, see Assumption 4.4; (3) iteration of the map $K$ (4.25) and application of Theorem 4.2.

5.1. Step 1: Formal $\varepsilon$-expansion and initial approximate solution. The Hamiltonian $H$ in (1.3) contains $\varepsilon$ as a parameter: $\varepsilon$ corresponds to the parameter $\mu$ of Theorem 4.2 and $D$ corresponds to the complex ball $\{\varepsilon \in \mathbb{C} : |\varepsilon| \le \varepsilon_0\}$ for some $\varepsilon_0 > 0$ to be determined below.^{35} As was well known to Poincaré, Lindstedt & Co., one may compute formally the $\varepsilon$-expansion of quasi-periodic formal solutions ("Lindstedt series"; see [1]). Our starting approximate solution will be a suitable truncation of such a formal expansion. Here we deduce a few elementary formulae which allow us to compute the formal solution explicitly and recursively. Let $N = 2$, $x \equiv (\ell, g)$, $y \equiv (L, G)$ and let us rewrite explicitly Eq. (4.1) for the Hamiltonian $H$:
$$\begin{aligned}
\omega_1 + D u_1(\theta) &= -2 \Big(\frac{1}{2 v_1^2} - v_2\Big) \frac{1}{v_1^3} + 2\varepsilon \Big(\frac{1}{2 v_1^2} - v_2\Big) R_L(\theta + u, v) - 2\varepsilon\, \frac{1}{v_1^3}\, R(\theta + u, v) , \\
\omega_2 + D u_2(\theta) &= -2 \Big(\frac{1}{2 v_1^2} - v_2\Big) + 2\varepsilon \Big(\frac{1}{2 v_1^2} - v_2\Big) R_G(\theta + u, v) - 2\varepsilon\, R(\theta + u, v) , \\
D v_1(\theta) &= -2\varepsilon \Big(\frac{1}{2 v_1^2} - v_2\Big) R_\ell(\theta + u, v) , \\
D v_2(\theta) &= -2\varepsilon \Big(\frac{1}{2 v_1^2} - v_2\Big) R_g(\theta + u, v) ;
\end{aligned} \tag{5.1}$$
35 Clearly H is an entire function of ε and the restriction on D is needed in order to meet the basic condition (4.31); the choice of the “best value for ε0 ” (i.e., the largest one) has been done simply by “trial and error”.
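here $\omega$ and $D$ are short for $\omega^{(\pm)}$ and for $\omega^{(\pm)} \cdot \partial_\theta$, $\theta \in \mathbb{T}^2$, and recall (1.8), (1.9) and (1.10). It is well known that (5.1) admits a formal solution
$$\tilde u(\theta) \sim \sum_{j=0}^{\infty} \tilde u^{(j)}(\theta)\, \varepsilon^j , \qquad \tilde v(\theta) \sim \sum_{j=0}^{\infty} \tilde v^{(j)}(\theta)\, \varepsilon^j , \tag{5.2}$$
with $\tilde u^{(j)}$ and $\tilde v^{(j)}$ being vector-valued real-analytic functions on $\mathbb{T}^2$:
$$\tilde u^{(j)} \equiv (\tilde u^{(j)}_1 , \tilde u^{(j)}_2) , \qquad \tilde v^{(j)} \equiv (\tilde v^{(j)}_1 , \tilde v^{(j)}_2) .$$
Furthermore, such a formal solution is uniquely determined by requiring that the averages of the $\tilde u^{(j)}$'s vanish:
$$\langle \tilde u^{(j)} \rangle = 0 , \qquad (\forall\, j \ge 0) .$$
In particular this implies that $\tilde u^{(0)} \equiv 0$. Instead (as one checks immediately by inserting (5.2) into (5.1) and looking at the order zero in $\varepsilon$) $\tilde v^{(0)}$ has to be chosen so that
$$\partial_{(L,G)} H \big|_{\varepsilon=0} (\tilde v^{(0)}) = \omega , \qquad \text{i.e.,} \qquad \tilde v^{(0)} \equiv (L_\pm , G_\pm) .$$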
Remark 5.1. The formal solvability of (5.1) implies that the right-hand sides of the last two equations of (5.1) have vanishing mean value^{36} over $\mathbb{T}^2$. The averages of the $\tilde v^{(j)}$'s have then to be chosen so that the first two equations in (5.1) are solvable. This leaves free the averages of the $\tilde u^{(j)}$'s, which, as already said, will be taken to be zero.

As initial approximate solution we take
$$u^{(0)}(\theta; \varepsilon) \equiv \sum_{j=0}^{5} \tilde u^{(j)}(\theta)\, \varepsilon^j , \qquad v^{(0)}(\theta; \varepsilon) \equiv \sum_{j=0}^{5} \tilde v^{(j)}(\theta)\, \varepsilon^j \tag{5.3}$$
(recall that, with the above conventions, $\tilde v^{(0)} \equiv (L_\pm , G_\pm)$, $\tilde u^{(0)} \equiv 0$ and $\langle \tilde u^{(j)} \rangle = 0$). We proceed by writing down the explicit formulae which, implemented on a machine, allow one to compute^{37} the functions $\tilde u^{(j)}$, $\tilde v^{(j)}$ or, more precisely, allow one to compute intervals of real numbers containing the Fourier coefficients of the $(\tilde u^{(j)}, \tilde v^{(j)})$'s. Recalling the explicit form of the function $R$ [see (1.4), (1.5)], one sees that the right-hand sides of (5.1) have the form
$$\sum_{i=1}^{M} r_i\, \varepsilon^{s_i}\, (v_1^{(0)})^{p_i}\, (v_2^{(0)})^{q_i}\, e^{\sigma_i}(v_1^{(0)}, v_2^{(0)})\, c_{n_i}(\theta + u^{(0)}) , \tag{5.4}$$
where: $M < \infty$; $r_i$ are rational numbers; $s_i$, $p_i$, $q_i$, $\sigma_i$ are integers obeying the constraints
$$0 \le s_i \le 1 , \qquad -5 \le p_i \le 10 , \qquad 0 \le q_i \le 1 , \qquad -1 \le \sigma_i \le 1 ;$$
$n_i \in \mathbb{Z}^2$ with $|n_i| \le 10$; finally $c_n(x)$ is either $\cos n \cdot x$ or $\sin n \cdot x$. We shall denote by $[\cdot]_j$ the operator that acts on a (formal) $\varepsilon$-power series, $\sum \varepsilon^k a^{(k)}$, by associating to it the $j$-th coefficient $a^{(j)}$:
$$\Big[ \sum_{k \ge 0} \varepsilon^k a^{(k)} \Big]_j \equiv a^{(j)} .$$

^{36} For any periodic function $f(\theta)$, $\theta \in \mathbb{T}^N$, the integral over $\mathbb{T}^N$ of $Df$ vanishes.
^{37} Notice that since the Hamiltonian $H$ is a trigonometric polynomial in $x$, the functions $(\tilde u^{(j)}, \tilde v^{(j)})$ are also trigonometric polynomials.
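Let $p, q, \sigma \in \mathbb{Z}$ with $q > 0$ and $|\sigma| \le 1$ and let us compute the $j$-th $\varepsilon$-coefficient of, respectively, $a^p b^q e^\sigma(a, b)$ and of $c_n(\theta + \varphi)$, where $a$, $b$ and $\varphi$ are formal $\varepsilon$-power series (with periodic real-analytic coefficients) given by
$$a \sim \sum_{k \ge 0} a_k(\theta)\, \varepsilon^k , \qquad b \sim \sum_{k \ge 0} b_k(\theta)\, \varepsilon^k , \qquad \varphi \sim \sum_{k \ge 0} \varphi^{(k)}(\theta)\, \varepsilon^k \equiv \Big( \sum_{k \ge 0} \varphi^{(k)}_1(\theta)\, \varepsilon^k ,\ \sum_{k \ge 0} \varphi^{(k)}_2(\theta)\, \varepsilon^k \Big) .$$
As above, we denote $y_0 \equiv \tilde v^{(0)}$ and write the expansions of $e(y)$ and of $e^{-1}(y) \equiv 1/e(y)$ as^{38}
$$e(y) \equiv \sum_{h \in \mathbb{N}^2} e_h\, (y - y_0)^h , \qquad e^{-1}(y) \equiv \sum_{h \in \mathbb{N}^2} \tilde e_h\, (y - y_0)^h .$$
Then, for $p \ge 0$, one finds
$$[a^{-p} b^q e^\sigma(a, b)]_j = \sum_{(k,h) \in I_{-p,q}} \binom{p}{k_1} \binom{q}{k_2}\, y_{01}^{k_1 - p}\, y_{02}^{q - k_2}\, e_{\sigma,h}\, a^{[h_1]}_{k_3}\, \tilde a^{[k_1]}_{k_4}\, b^{[h_2 + k_2]}_{k_5} , \tag{5.5}$$
where
$$I_{-p,q} \equiv \{(k, h) \in \mathbb{N}^5 \times \mathbb{N}^2 : 0 \le k_1 \le p ,\ 0 \le k_2 \le q ,\ k_3 \ge h_1 ,\ k_4 \ge k_1 ,\ k_5 \ge k_2 + h_2 ,\ k_3 + k_4 + k_5 = j\} ;$$
$$e_{\sigma,h} \equiv \begin{cases} \tilde e_h , & \text{if } \sigma = -1 , \\ e_h , & \text{if } \sigma = 1 , \\ \delta_{0|h|} , & \text{if } \sigma = 0 ; \end{cases}$$
$c^{[k]}(\varepsilon)$ denotes the $k$-th power of a formal power series $c \sim \sum_{j \ge 0} c_j \varepsilon^j$ and $c^{[k]}_j$ its $j$-th $\varepsilon$-coefficient; finally $\tilde a$ is the power series defined by
$$\tilde a \sim \frac{1}{a} - \frac{1}{a_0} .$$
Analogously, for $p > 0$, one gets
$$[a^{p} b^q e^\sigma(a, b)]_j = \sum_{(k,h) \in I_{p,q}} \binom{p}{k_1} \binom{q}{k_2}\, y_{01}^{p - k_1}\, y_{02}^{q - k_2}\, e_{\sigma,h}\, a^{[h_1 + k_1]}_{k_3}\, b^{[h_2 + k_2]}_{k_4} , \tag{5.6}$$
where:
$$I_{p,q} \equiv \{(k, h) \in \mathbb{N}^4 \times \mathbb{N}^2 : 0 \le k_1 \le p ,\ 0 \le k_2 \le q ,\ k_3 \ge k_1 + h_1 ,\ k_4 \ge k_2 + h_2 ,\ k_3 + k_4 = j\} . \tag{5.7}$$

^{38} We use standard multiindex notation: if $n \in \mathbb{N}^N$, $|n| = \sum_{i=1}^{N} n_i$; $n! = n_1! \cdots n_N!$; if $z \in \mathbb{R}^N$, $\partial_z^n = \dfrac{\partial^{|n|}}{\partial z_1^{n_1} \cdots \partial z_N^{n_N}}$; $z^n = z_1^{n_1} \cdots z_N^{n_N}$.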
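Computing $[c_n(\theta + \varphi)]_j$ is clearly equivalent to evaluating $[\exp(i n \cdot \varphi)]_j$ and, letting
$$E^{(0)}_n \equiv 1 , \qquad E^{(k)}_n(\theta) \equiv \frac{i}{k} \sum_{\ell=1}^{k} \ell\, E^{(k-\ell)}_n(\theta)\ n \cdot \varphi^{(\ell)}(\theta) , \qquad \forall\, n \in \mathbb{Z}^2 ,$$
one gets
$$\big[ \exp(i n \cdot \varphi) \big]_j = E^{(j)}_n(\theta) . \tag{5.8}$$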
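The recursion behind (5.8) is the standard one for the exponential of a power series without constant term: if $E = \exp(g)$ with $g = \sum_{\ell\ge1} g_\ell\,\varepsilon^\ell$, then $k E_k = \sum_{\ell=1}^{k} \ell\, g_\ell E_{k-\ell}$, here with $g_\ell = i\, n\cdot\varphi^{(\ell)}$. The sketch below checks it on scalar data at a fixed $\theta$; the function name and the sample coefficients are hypothetical.

```python
import cmath

def exp_series_coeffs(g, order):
    """Coefficients E_0..E_order of exp(sum_{l>=1} g[l] * eps**l).

    Uses k*E_k = sum_{l=1}^{k} l*g[l]*E_{k-l}, obtained by differentiating
    E = exp(g) with respect to eps. g[0] is ignored (assumed to be zero).
    """
    E = [1 + 0j]
    for k in range(1, order + 1):
        acc = sum(l * g[l] * E[k - l] for l in range(1, k + 1) if l < len(g))
        E.append(acc / k)
    return E

# Hypothetical scalar data: phi[l] stands for n . phi^(l)(theta) at a fixed theta.
phi = [0.0, 0.3, -0.1, 0.05]
g = [1j * p for p in phi]        # exponent i n.phi, order by order
E = exp_series_coeffs(g, 12)

eps = 0.1
partial = sum(E[k] * eps ** k for k in range(len(E)))
exact = cmath.exp(sum(g[l] * eps ** l for l in range(1, len(g))))
print(abs(partial - exact))
```

In the setting of the text the coefficients are trigonometric polynomials in $\theta$ rather than numbers, so the products in the recursion become convolutions of Fourier coefficients.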
Inserting (5.2) in (5.1) one obtains recursive relations of the type
$$D \tilde u^{(k)} = A \tilde v^{(k)} + \Phi^{(k)} , \qquad D \tilde v^{(k)} = \Psi^{(k)} , \tag{5.9}$$
where $A$ is the (constant) matrix given in (3.5) evaluated at $(L, G) \equiv (L_\pm, G_\pm)$ and the vectors $\Phi^{(k)}$ and $\Psi^{(k)}$ depend on $\tilde u^{(0)}$, ..., $\tilde u^{(k-1)}$, $\tilde v^{(0)}$, ..., $\tilde v^{(k-1)}$ (and on $\theta$) and can be explicitly written down by using the remark leading to (5.4) and the expansions (5.5), (5.6) and (5.8). Assume now that $(\tilde u^{(j)}, \tilde v^{(j)})$, for $j = 0, ..., k - 1$, are known and let us determine $(\tilde u^{(k)}, \tilde v^{(k)})$. Inverting the operator $D$ in the second of (5.9) (recalling Remark 5.1) we let
$$\tilde v^{(k)} \equiv \bar v^{(k)} + D^{-1} \Psi^{(k)} ,$$
where $\bar v^{(k)}$ denotes the average of $\tilde v^{(k)}$ and has to be determined so that the equations for $D \tilde u^{(k)}$ have a right-hand side with vanishing mean value, i.e.,
$$\bar v^{(k)} \equiv -A^{-1} \langle \Phi^{(k)} \rangle , \qquad \tilde u^{(k)} \equiv D^{-1} \big( A \tilde v^{(k)} + \Phi^{(k)} \big) .$$
The formulae for the recursive computation of the functions $(\tilde u^{(j)}, \tilde v^{(j)})$ (and hence of our choice of the initial approximate solution) are complete. In Appendix A we report the number of Fourier coefficients of the functions $(\tilde u^{(j)}, \tilde v^{(j)})$ for $j \le 5$ and, as an example, the list of intervals trapping the Fourier coefficients of the first component of $\tilde u^{(1)}$.

5.2. Step 2: Norm bounds relative to the initial approximate solution. Having defined the initial approximate solution as the fifth order truncation (5.3) of the $\varepsilon$-expansion of the formal solution of (5.1), we want now to estimate the relative norm parameters as defined in (4.22). We attach an index 0 to the quantities related to our starting approximate solution; thus the symbols $\xi$, $U$, $V$, ..., $\tilde T$ of Sect. 4.3 correspond here to $\xi_0$, $U_0$, $V_0$, ..., $\tilde T_0$ (see Remark 4.4). Let
$$\varepsilon_0 \equiv 10^{-6} , \qquad \xi_0 \equiv 0.2 ,$$
and recall that the norm in (4.32) contains also a supremum taken over the complex parameter region $D$,
$$D \equiv \{\varepsilon \in \mathbb{C} : |\varepsilon| \le \varepsilon_0\} .$$
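The inversion of $D = \omega\cdot\partial_\theta$ used in Step 1 (Sect. 5.1) acts diagonally on Fourier coefficients, $(D^{-1}f)_n = f_n/(i\,\omega\cdot n)$ for $n \ne 0$, and requires a vanishing mean value. A minimal floating-point sketch follows; the paper works with interval arithmetic instead, and the frequency vector and coefficients below are hypothetical test data.

```python
import math

def invert_D(coeffs, omega):
    """Apply D^{-1} to a zero-mean trigonometric polynomial on T^2.

    coeffs maps n = (n1, n2) to the Fourier coefficient f_n;
    the result has coefficients f_n / (i * omega.n).
    """
    out = {}
    for n, fn in coeffs.items():
        if n == (0, 0):
            if fn != 0:
                raise ValueError("D^{-1} needs a vanishing mean value")
            continue
        divisor = omega[0] * n[0] + omega[1] * n[1]
        if divisor == 0:
            raise ZeroDivisionError("resonant mode: omega.n = 0")
        out[n] = fn / (1j * divisor)
    return out

def apply_D(coeffs, omega):
    """Apply D = omega . d/dtheta, i.e. multiply f_n by i * omega.n."""
    return {n: 1j * (omega[0] * n[0] + omega[1] * n[1]) * fn
            for n, fn in coeffs.items()}

omega = (1.0, math.sqrt(2.0))   # hypothetical rationally independent frequencies
f = {(1, 0): 0.5, (-1, 0): 0.5, (2, -1): 0.25j, (-2, 1): -0.25j}
u = invert_D(f, omega)
back = apply_D(u, omega)
print(max(abs(back[n] - f[n]) for n in f))
```

The small divisors $\omega\cdot n$ appearing here are the ones controlled by the constants $\sigma_{pk}(\delta)$ of Lemma 4.1.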
The evaluation of $U_0$, $V_0$, $M_0$ and $\tilde V_0$ is easily obtained having computed the "explicit" (in the sense of interval arithmetic) form of the approximate solution $(u, v)$. Having $(u, v)$ given as
$$u^{(0)} = \sum_{j=1}^{5} \varepsilon^j \sum \tilde u^{(j)}_n \exp(i n \cdot \theta) , \qquad v^{(0)} = \sum_{j=0}^{5} \varepsilon^j$$
0 δ1 > δ2 ... and δj < ξ. These values have been chosen by “optimizing” (trial and error) the KAM algorithm.
After the second iteration:
$U_2 \le 1.309817593322718 \cdot 10^{-5}$, $V_2 \le 1.45622036006177$, $M_2 \le 1.00002126646582$, $\bar M_2 \le 1.00002126691809$, $\tilde V_2 \le 7.453583426051443 \cdot 10^{-7}$, $F_2 \le 1.209224397282491 \cdot 10^{-31}$, $G_2 \le 2.301233775774239 \cdot 10^{-33}$, $\tilde T_2 \le 0.834604675233862$.

After the third iteration:
$U_3 \le 1.309817593322718 \cdot 10^{-5}$, $V_3 \le 1.45622036006177$, $M_3 \le 1.00002126646582$, $\bar M_3 \le 1.00002126691809$, $\tilde V_3 \le 7.453583426051448 \cdot 10^{-7}$, $F_3 \le 8.855523608162042 \cdot 10^{-44}$, $G_3 \le 3.174732732716713 \cdot 10^{-46}$, $\tilde T_3 \le 0.834604675233862$.

The parameters defined in Theorem 4.2 (recalling Theorem 4.1), up to $\alpha$ and $\eta$, are immediately computed using the above values and one obtains
$$\alpha \le 88530999255.1887 , \qquad \eta \le 1.185668436207269 \cdot 10^{-38} .$$
With such values condition (4.31) is satisfied; in fact we obtained^{46}
$$\eta\, \alpha\, M^7\, \bar M^9\, \xi^{-2(2\tau+1)}\, 2^{16\tau+23}\, \tau!^4 \le 3.584973875295102 \cdot 10^{-8} < 1 ,$$
so that Theorem 1.1 holds.
A. Some Computer-Assisted Data

We first report the trigonometric degrees $\nu_j$, $\nu'_j$ appearing in the Fourier–Taylor expansion of the approximate solution $(u^{(0)}, v^{(0)})$ (obtained as the fifth $\varepsilon$-order truncation of the formal solution)
$$u^{(0)} = \sum_{j=1}^{5} \varepsilon^j \sum \tilde u^{(j)}_n \exp(i n \cdot \theta) ,$$
0 0 and a < b, means [x + δa, x + δb]: for example, 0.6730562643923955199965449172780 + [58, 87] · 10−33 = [0.673056264392395519996544917278058, 0.673056264392395519996544917278087] .
$0.6766043577880158539656402750183 + [17, 48] \cdot 10^{-33}$ ,
$0.3061845611995878973674729226746 + [45, 54] \cdot 10^{-33}$ ,
$1.488798647829940896171271103877 + [83, 90] \cdot 10^{-32}$ ,
$-0.4659443843426364961646113772143 + [22, 11] \cdot 10^{-33}$ ,
$-0.2018899370701671721933153371788 + [70, 62] \cdot 10^{-33}$ .

References

1. Arnold, V. I. (editor): Encyclopaedia of Mathematical Sciences. Dynamical Systems III, Berlin-Heidelberg-New York: Springer-Verlag 3, 1988
2. Arnold, V. I.: Small divisor problems in classical and Celestial Mechanics. Russ. Math. Surv. 18, 85–191 (1963)
3. Celletti, A.: Analysis of resonances in the spin-orbit problem in Celestial Mechanics: The synchronous resonance (Part I). J. of App. Math. and Phys. (ZAMP) 41, 174 (1990)
4. Celletti, A., Chierchia, L.: Construction of Analytic KAM Surfaces and Effective Stability Bounds. Commun. Math. Phys. 118, 119–161 (1988)
5. Celletti, A., Chierchia, L.: A Constructive Theory of Lagrangian Tori and Computer-assisted Applications. In: Dynamics Reported, C.K.R.T. Jones, U. Kirchgraber, H.O. Walther, Managing Editors, Vol. 4 (New Series), Berlin-Heidelberg-New York: Springer-Verlag, 1995
6. Delaunay, C.: Théorie du Mouvement de la Lune. Mémoires de l'Académie des Sciences 1, Tome XXVIII, Paris (1860)
7. Eckmann, J.-P., Wittwer, P.: Computer methods and Borel summability applied to Feigenbaum's equation. Berlin-Heidelberg-New York: Springer Lecture Notes in Physics 227, 1985
8. Hénon, M.: Exploration numérique du problème restreint IV: Masses égales, orbites non périodiques. Bulletin Astronomique 3, no. 1, fasc. 2, 49–66 (1966)
9. John, F.: Partial Differential Equations. Berlin-Heidelberg-New York: Springer-Verlag, 1982
10. Koch, H., Schenkel, A., Wittwer, P.: Computer-Assisted Proofs in Analysis and Programming in Logic: A Case Study. Preprint Université de Genève (1994)
11. Meyer, K.R., Schmidt, D. (editors): Computer Aided Proofs in Analysis. Berlin-Heidelberg-New York: Springer-Verlag, 1991
12. Moser, J.: Stable and random motions in dynamical systems, with special emphasis on Celestial Mechanics. Ann. Math. Studies 77, Princeton, NJ: Princeton University Press, 1973
13. Poincaré, H.: Les Méthodes Nouvelles de la Mécanique Céleste. Paris: Gauthier Villars, 1892
14. Robutel, P.: Stability of the planetary three-body problem. II. KAM Theory and existence of quasi-periodic motions. Celestial Mechanics 62, 219–261 (1995)
15. Salamon, D., Zehnder, E.: KAM theory in configuration space. Comment. Math. Helveticae 64, 84 (1989)
16. (no author listed) The Astronomical Almanac. Washington, D.C.: U.S. Government Printing Office, 1990
17. (no author listed) Vax Architecture handbook. Digital Equipment Corporation, 1981

Communicated by Ya.G. Sinai
Commun. Math. Phys. 186, 451 – 479 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997

An Analogue of the Kac–Wakimoto Formula and Black Hole Conditional Entropy

Roberto Longo*

Dipartimento di Matematica, Università di Roma "Tor Vergata", via della Ricerca Scientifica, I–00133 Roma, Italy, and Centro Linceo Interdisciplinare, Accademia Nazionale dei Lincei, via della Lungara 10, I–00165 Roma, Italy. E-mail: [email protected]

Received: 31 May 1996 / Accepted: 8 November 1996
Abstract: A local formula for the dimension of a superselection sector in Quantum Field Theory is obtained as vacuum expectation value of the exponential of the proper Hamiltonian. In the particular case of a chiral conformal theory, this provides a local analogue of a global formula obtained by Kac and Wakimoto within the context of representations of certain affine Lie algebras. Our formula is model independent and its version in general Quantum Field Theory applies to black hole thermodynamics. The relative free energy between two thermal equilibrium states associated with a black hole turns out to be proportional to the variation of the conditional entropy in different superselection sectors, where the conditional entropy is defined as the Connes–Størmer entropy associated with the DHR localized endomorphism representing the sector. The constant of proportionality is half of the Hawking temperature. As a consequence the relative free energy is quantized proportionally to the logarithm of a rational number; in particular it is equal to a linear function of the logarithm of an integer once the initial state or the final state is taken fixed.

Introduction

In the first part of this paper we shall derive a formula for the dimension of a superselection sector in Conformal Quantum Field Theory. However, as we shall see, the role played by conformal invariance is not essential, and indeed we will subsequently deal with general Quantum Field Theory and apply our formula to the computation of the relative free energy between two thermal equilibrium states of the background system for a black hole. The reader mainly interested in the latter topic may at first read the second part of this introduction and then go directly to Sect. 3.

A local analogue of the Kac–Wakimoto formula. There is a general phenomenon relating the distribution of the Hamiltonian density levels to a dimension, a classical example being given by Weyl's theorem on the asymptotic distribution of the Laplacian eigenvalues on a compact Riemann manifold. A similar occurrence appears in the context of lowest weight representations of certain affine Kac–Moody algebras. If $L_0$ and $L_\rho$ are the conformal Hamiltonians in the vacuum representation and in the representation $\rho$ of such an infinite dimensional Lie algebra, then there exists the limit
$$\lim_{\beta \to 0^+} \frac{\mathrm{Tr}(e^{-\beta L_\rho})}{\mathrm{Tr}(e^{-\beta L_0})} = d(\rho), \tag{0.1}$$
and the thus defined $d(\rho)$ has the formal properties of a dimension [30]. One expects such a formula to hold in more generality in conformal Quantum Field Theory on $S^1$ with $\rho$ a superselection sector and $d(\rho)$ the statistical dimension of $\rho$ (one has to assume at least that $\mathrm{Tr}(e^{-\beta L_0}) < \infty$ or any structural property to guarantee this), but no result in this direction has been so far obtained (cf. [43]). However, if we restrict the vacuum state $\omega$ to the local von Neumann algebra $\mathcal{A}(I)$ associated with an interval $I$, then $\omega$ is faithful by the Reeh–Schlieder theorem and hence, by the Tomita–Takesaki theory, $\omega$ is a Gibbs state with respect to its modular group^{1}. Such a modular group has a geometric meaning and we may interpret it as a local dynamics; in other words the logarithm of the modular operator can be regarded as a local Hamiltonian. One can then argue from [9] that in this local situation the distribution of the energy density levels should be tested in mean, at a specific value of the inverse temperature parameter $\beta$, and hence also a local version of formula (0.1) should not be asymptotic as $\beta \to 0^+$, but evaluated at a specific $\beta$. Let $\Lambda_I(t)$ be the one-parameter group of special conformal transformations of $S^1$ associated with the interval $I$ of $S^1$ (formula (2.1) below) and $K_\rho$ the generator of the corresponding one-parameter unitary group in the representation $\rho$. We shall obtain the formula
$$(e^{-2\pi K_\rho} \xi, \xi) = d(\rho), \tag{0.2}$$
where $\xi$ is any cyclic vector for $\rho(\mathcal{A}(I'))$ such that $(\rho(\cdot)\, \xi, \xi)$ is the vacuum state on $\mathcal{A}(I')$ and $I' = S^1 \setminus I$. In comparison with the formula (0.1), we first note that the right-hand side $d(\rho)$ in (0.2) has not only the properties of a dimension, but it is actually identified with the Doplicher–Haag–Roberts [17] statistical dimension of $\rho$. Moreover the local formula (0.2) holds in full generality, independently of any requirement on the growth of Hamiltonian spectral density.

* Supported in part by MURST and CNR-GNAFA.
Hopefully it might lead to a model independent proof of the formula (0.1), but it has its own interest. As we shall see, its proof makes use of the knowledge of the modular structure of the local von Neumann algebras [7, 21], the description of the conjugate sector in terms of the modular involutions [22, 23], and a result on actions of groups on tensor categories that determines $K_ρ$ as a linear function of the logarithm of the relative modular operator (as in formula (3.8) below). Finally the validity of formula (0.2) goes beyond the context of Conformal Quantum Field Theory. Indeed the same structure is present in the general setting of Quantum Field Theory on Minkowski space [47], provided we consider the local von Neumann algebra associated with a wedge region, because in this case the modular structure has the geometric interpretation described by the Bisognano–Wichmann theorem. We then treat this case, where a further physical interpretation can be given. Mutatis mutandis, formulae in Sect. 2 are also valid in Sect. 3 and vice versa, with the exception of Corollary 3.6. We avoid repetitions and state in each of them the results closer to the spirit of the section.
¹ The reader should be aware of the different meaning of the term modular in refs. [48] and [30].
Quantum numbers for the relative free energy associated with a black hole. As was indicated in [3, 4], a black hole appears from the outside to have some features of a thermodynamical system in equilibrium. In particular Bekenstein suggested the entropy of a black hole to be equal to λA, where A is the surface area of the black hole and λ a constant, a hypothesis related to the Generalised Second Law of Thermodynamics,
$$dS + λ\,dA \ge 0, \qquad (0.3)$$
where, in any process, dA is the increment of A and dS is the increment of the entropy of the outside region. Taking into account Quantum effects and General Relativity, Hawking [27] was led to the conclusion that the black hole has a surface temperature
$$T = \frac{ħ}{kc}\,\frac{a}{2π},$$
where a is the acceleration of a freely falling object at the surface of the black hole². This computation was made in the context of a free Quantum Field Theory on a curved space-time. Sewell [44] noticed that, at least in the case of a spherically symmetric eternal black hole, a model independent derivation of the Hawking temperature is possible, also in analogy with the Unruh effect [51], see also [16], by means of the Bisognano–Wichmann theorem [5] on the Minkowski space-time. One considers the Rindler space-time W as an approximation of the Schwarzschild space-time and realizes W as a wedge region in the Minkowski space-time, say $W = \{x : x_1 > |x_0|\}$. The evolution
$$x_0(t) = a^{-1}\sinh t, \qquad x_1(t) = a^{-1}\cosh t$$
corresponds to an observer moving within W with uniform acceleration a, and his proper time is equal to t/a. W is a natural horizon for this observer, since he cannot send a signal out of W and receive it back. The von Neumann algebra A(W) of the observables on the Minkowski space localized in W is therefore the proper (global) observable algebra for such a mover. The proper time translations for him are thus given by the one-parameter automorphism group $α_{at}$ of A(W) corresponding to the rescaled pure Lorentz transformations leaving W invariant. By the Bisognano–Wichmann theorem $α_{at}$ satisfies the KMS condition at inverse temperature
$$β = \frac{2π}{a}$$
with respect to the (restriction of the Minkowski) vacuum state ω to A(W), i.e. the latter is a thermal equilibrium state [25]. By the Einstein equivalence principle one can identify W with the outside region of the black hole and interpret the thermal outcome as a gravitational (black hole) effect. We refer to [24, 31, 52] for a more complete account of these arguments and further references. We note here that this description has certain restrictions. One is due to the
² $T = \frac{ħc^3}{8πkMG}$ in terms of the mass M of the star and the gravitational constant G. In the following we shall always use natural units so that the Planck constant ħ, the speed of light c and the Boltzmann constant k are all set equal to 1, thus T = a/2π.
appearance of the Minkowskian vacuum tied up with the Poincaré symmetries that do not exist globally on a general curved space-time. Another point concerns the choice of the Rindler space-time, which is a good approximation of the more appropriate Schwarzschild space-time only near the horizon. We shall briefly discuss these aspects in the final comments. Despite its limitations, this viewpoint is strikingly simple and powerful.
Let us now consider a thermodynamical system Σ placed in the asymptotically flat outside region of a black hole B. The Hawking radiation creates a heat bath for Σ and therefore Σ is an open system. Taking into account only observable quantities, Sewell [45] inferred that the right thermodynamical potential for Σ is the Gibbs free energy, rather than the entropy, and rederived the generalised second law (0.3), where the area term represents now a contribution to the mechanical work done by Σ on B. Due to the Hawking effect, we have spontaneous creation of particles, so that the system undergoes a change of quantum numbers. From the Quantum Field Theory point of view, the system goes into a different superselection sector [53]. Following [26, 17] we thus consider a representation ρ of the quasi-local C*-algebra $A = (∪A(O))^{-}$ that is localizable in any space-like cone and has finite statistics. Under general conditions ρ is (and we assume it to be) Poincaré covariant with positive energy-momentum [22]. We may assume that ρ is localized within W and therefore the restriction of ρ to A(W) ∩ A has a normal extension to an endomorphism of its weak closure A(W), that will still be denoted by $ρ|_{A(W)}$. The index-statistics theorem [36, 39] shows that the map $ρ \to ρ|_{A(W)}$ is a faithful full functor of tensor categories, namely all the information on the superselection structure (charge transfers, statistics, ...) is visible within A(W), in particular
$$d(ρ) = \mathrm{Ind}(ρ)^{1/2},$$
where d(ρ) is the DHR statistical dimension, i.e. the order of the parastatistics, and Ind(ρ) is the Jones index of $ρ|_{A(W)}$ (more precisely the minimal index, see [34, 39]). In the sector ρ the proper time evolution is given by a one-parameter automorphism group $α^ρ_{at}$ of A(W) corresponding to the pure Lorentz transformations leaving W invariant,
$$α^ρ_t · ρ = ρ · α_t.$$
As we shall see, $α^ρ_{at}$ admits a unique thermal equilibrium normal state $ϕ_ρ$ at the same inverse Hawking temperature β = 2π/a. If $K_ρ$ is the generator of the unitary implementation of the pure Lorentz transformations in the $x_1$-direction, then
$$H_ρ = aK_ρ$$
is the proper Hamiltonian for our system Σ in the sector ρ.
A similar structure appears in the analysis of the chemical potential [2]. Adding one particle is not a drastic change as our thermodynamical system has essentially infinitely many particles, so we do not obtain an inequivalent representation, namely ρ is normal on A(W). There is however an important difference. The chemical potential labels different equilibrium states at the same temperature for the same dynamics, while we look at the $ϕ_ρ$'s as equilibrium states with respect to their own different dynamics $α^ρ_{at}$, a fact compatible with the General Relativity context where the dynamics should depend on the state as the matter has influence on the metric tensor.
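As a numerical illustration of the quantities recalled above, the following sketch checks that the trajectory $x_0(t) = a^{-1}\sinh t$, $x_1(t) = a^{-1}\cosh t$ stays in W on the hyperbola $x_1^2 - x_0^2 = a^{-2}$, and that the two expressions of the Hawking temperature agree once a is taken to be the Schwarzschild surface gravity $c^4/(4GM)$. The function names, the rounded SI constants and the surface gravity formula are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

# Rounded SI values, assumed here only for the illustration.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m / s
G = 6.67430e-11          # gravitational constant, m^3 / (kg s^2)
K_B = 1.380649e-23       # Boltzmann constant, J / K
SOLAR_MASS = 1.98847e30  # kg


def rindler_point(a, t):
    """Point (x0, x1) of the uniformly accelerated trajectory at parameter t."""
    return math.sinh(t) / a, math.cosh(t) / a


def kms_inverse_temperature(a):
    """Inverse temperature beta = 2*pi/a in natural units."""
    return 2.0 * math.pi / a


def hawking_temperature_from_acceleration(a):
    """T = (hbar / (k c)) * a / (2 pi), with a in m / s^2."""
    return HBAR * a / (2.0 * math.pi * K_B * C)


def hawking_temperature_from_mass(mass):
    """T = hbar c^3 / (8 pi k M G), the expression in terms of the mass."""
    return HBAR * C**3 / (8.0 * math.pi * K_B * mass * G)


def schwarzschild_surface_gravity(mass):
    """Surface gravity c^4 / (4 G M) of a Schwarzschild black hole (assumed)."""
    return C**4 / (4.0 * G * mass)


if __name__ == "__main__":
    x0, x1 = rindler_point(2.0, 0.8)
    print("inside the wedge:", x1 > abs(x0))
    print("x1^2 - x0^2 =", x1**2 - x0**2)
    print("T for one solar mass (K):", hawking_temperature_from_mass(SOLAR_MASS))
```

With these constants the temperature for one solar mass comes out below $10^{-7}$ K, and both temperature functions return the same value when the acceleration is the surface gravity.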
Motivated by the above discussion, we will consider the relative free energy between two thermal equilibrium states for the system Σ associated with the external region of a black hole. Guided by the thermodynamical expression dF = dE − T dS, we express the relative free energy of the states ω and $ϕ_ρ$ by
$$F(ω|ϕ_ρ) = ϕ_ρ(H_ρ) - β^{-1}S(ω|ϕ_ρ),$$
where S is the Araki relative entropy [1] of the two states and $ϕ_ρ(H_ρ)$ is the relative mean (internal) energy. We shall find the relation
$$F(ω|ϕ_ρ) = -\frac{1}{2}β^{-1}S_c(ρ), \qquad (0.4)$$
where $S_c(ρ)$ is the Connes–Størmer [15] conditional entropy of ρ(A(W)) ⊂ A(W). Here we use the fact that, by the Pimsner–Popa theorem [41] and the index-statistics theorem [36], $\frac{1}{2}S_c(ρ)$ equals the logarithm of the statistical dimension d(ρ) of ρ. As the latter takes only integral values by the DHR theorem [17], we conclude that the possible values for the relative free energy are
$$F(ω|ϕ_ρ) = -β^{-1}\log(n), \qquad n = 1, 2, 3, \dots,$$
namely the integer n = d(ρ) here acquires the different meaning of a quantum number labeling the relative free energy levels. The vacuum state ω should play no specific physical role in (0.4) and it is only a convenient reference state in our setting. Indeed we will extend the formula to the case of two arbitrary thermal equilibrium states $ϕ_σ$ and $ϕ_ρ$,
$$F(ϕ_σ|ϕ_ρ) = \frac{1}{2}β^{-1}(S_c(σ) - S_c(ρ)), \qquad (0.5)$$
and therefore $F(ϕ_σ|ϕ_ρ) = β^{-1}\log(n/m)$ is $β^{-1}$ times the logarithm of a rational number n/m, where n depends only on σ and m only on ρ. Finally we observe that formula (0.5) is consistent with the above recalled interpretations of the increment dA of surface area of the black hole [4, 45] and, in a sense, it unifies different points of view: the increment of the conditional entropy of Σ, an information theoretical concept, is indeed proportional to the increment of its free energy, a statistical mechanics concept. We summarize the conceptual scheme in the proof of our result in the following diagram:
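[Diagram: a square relating four quantities. Connes–Størmer Entropy and Jones Index are linked by the Pimsner–Popa Th. (label: exp); Jones Index and DHR dimension by the Index-statistics Th. (label: √·); DHR dimension and Relative Free Energy by Cor. 2.2 (label: $β^{-1}\log$); Connes–Størmer Entropy and Relative Free Energy by Th. 3.4 (label: $\frac{1}{2}β^{-1}$).]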
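The chain of identifications summarized above can be traced numerically. The sketch below is only an illustration under the relations stated in this introduction ($S_c = \log \mathrm{Ind}$, $d = \mathrm{Ind}^{1/2}$, and formulas (0.4), (0.5)); the function names are hypothetical.

```python
import math


def conditional_entropy(index):
    """S_c = log(Ind), the Pimsner-Popa relation for the minimal index."""
    return math.log(index)


def statistical_dimension(index):
    """d = Ind^(1/2), the index-statistics relation."""
    return math.sqrt(index)


def free_energy_from_entropy(beta, index_sigma, index_rho):
    """Formula (0.5): F = (1/2) beta^{-1} (S_c(sigma) - S_c(rho))."""
    return 0.5 * (conditional_entropy(index_sigma) - conditional_entropy(index_rho)) / beta


def free_energy_from_dimensions(beta, dim_sigma, dim_rho):
    """Equivalent form: F = beta^{-1} log(n / m) with n = d(sigma), m = d(rho)."""
    return math.log(dim_sigma / dim_rho) / beta


if __name__ == "__main__":
    beta = 2.0 * math.pi  # inverse temperature for acceleration a = 1
    # Vacuum (index 1) against a sector of index 9, i.e. dimension 3.
    print(free_energy_from_entropy(beta, 1.0, 9.0))
    print(free_energy_from_dimensions(beta, 1.0, statistical_dimension(9.0)))
```

Taking σ to be the vacuum sector (index 1) reproduces (0.4), so both printed values equal $-β^{-1}\log 3$ up to rounding.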
1. Connes Cocycles and Endomorphisms
Let M be a von Neumann algebra and ϕ, ω faithful normal positive linear functionals on M and denote by $σ^ω$ and $σ^ϕ$ their modular groups given by the Tomita–Takesaki theory [48]. The Connes Radon–Nikodym cocycle [12] $u(t) = (Dϕ : Dω)_t$ is the map (t ∈ R → u(t) unitary of M) such that
$$u(t + s) = u(t)σ_t^ω(u(s)) \qquad (1.1)$$
characterized by: for any given x, y ∈ M there exists a complex function F bounded and continuous in $\{0 \le \mathrm{Im}\,z \le 1\}$ and analytic in its interior such that
$$F(t) = ϕ(σ_t^ϕ(y)u(t)x), \qquad F(t + i) = ω(xu(t)σ_t^ω(y)). \qquad (1.2)$$
The relevant property is that u(t) intertwines $σ_t^ω$ and $σ_t^ϕ$, namely
$$σ_t^ϕ = u(t)σ_t^ω(·)u(t)^*. \qquad (1.3)$$
Conversely a continuous unitary $σ^ω$-cocycle u (i.e. (1.1) is valid for u) is the Connes cocycle with respect to a unique faithful normal positive linear functional or semifinite weight ϕ of M. If M is a factor, a continuous unitary $σ^ω$-cocycle u satisfying (1.3) is uniquely determined up to a one-dimensional character of R, hence, in order to check whether $u(t) = (Dϕ : Dω)_t$, one may test Eq. (1.2) in the special case x = y = 1: there must exist a function F bounded and continuous in $\{0 \le \mathrm{Im}\,z \le 1\}$ and analytic in its interior such that
$$F(t) = ϕ(u(t)), \qquad F(t + i) = ω(u(t)). \qquad (1.4)$$
In particular, given the $σ^ω$-cocycle u, the positive functional ϕ such that $u(t) = (Dϕ : Dω)_t$ may be computed by (1.2) as
$$ϕ(x) = \underset{t\to -i}{\mathrm{anal.cont.}}\; ω(xu(t)). \qquad (1.5)$$
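In finite dimensions the objects above can be written down explicitly, which may help to fix ideas. Assume M is the algebra of 2×2 matrices, $ω = \mathrm{Tr}(h_ω\,·)$ and $ϕ = \mathrm{Tr}(h_ϕ\,·)$ with positive invertible density matrices; then $σ_t^ω(x) = h_ω^{it}xh_ω^{-it}$ and $(Dϕ : Dω)_t = h_ϕ^{it}h_ω^{-it}$. This finite-dimensional setting, the particular densities and all function names are assumptions made for illustration only; the sketch checks (1.1), (1.3) and (1.5) numerically.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def density_power(eigenvalues, angle, z):
    """h**z for h = V diag(eigenvalues) V^T, V a rotation by `angle`; z may be complex."""
    c, s = math.cos(angle), math.sin(angle)
    v = [[c, -s], [s, c]]
    d = [[cmath.exp(z * math.log(eigenvalues[0])), 0.0],
         [0.0, cmath.exp(z * math.log(eigenvalues[1]))]]
    return matmul(matmul(v, d), dagger(v))


# Hypothetical densities (positive, trace one), chosen only for the example.
H_OMEGA = ((0.7, 0.3), 0.0)
H_PHI = ((0.6, 0.4), 0.5)


def modular(density, t, x):
    """Modular group x -> h^{it} x h^{-it} of the functional Tr(h .)."""
    return matmul(matmul(density_power(*density, 1j * t), x),
                  density_power(*density, -1j * t))


def cocycle(t):
    """u(t) = h_phi^{it} h_omega^{-it}; a complex t gives the analytic continuation."""
    return matmul(density_power(*H_PHI, 1j * t), density_power(*H_OMEGA, -1j * t))


def functional(density, x):
    """The functional x -> Tr(h x)."""
    return trace(matmul(density_power(*density, 1.0), x))


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    t, s = 0.37, -1.2
    x = [[1.0, 2.0 - 1.0j], [0.5j, -3.0]]
    # Cocycle identity (1.1).
    print(max_diff(cocycle(t + s), matmul(cocycle(t), modular(H_OMEGA, t, cocycle(s)))))
    # Recovering phi from omega as in (1.5), with the continuation evaluated at t = -i.
    print(abs(functional(H_PHI, x) - functional(H_OMEGA, matmul(x, cocycle(-1j)))))
```

Evaluating `cocycle(-1j)` gives $h_ϕh_ω^{-1}$, which is how (1.5) recovers ϕ from ω in this toy case.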
Let now M be an infinite factor and denote by End(M) the set of finite-index (or finite-dimensional) endomorphisms of M. Namely ρ ∈ End(M) if ρ is an endomorphism of M whose intrinsic dimension d(ρ) is finite ([40], see Appendix A). Equivalently ρ(M) is a subfactor of M with finite index in the sense of [34, 32]; indeed, as shown in [37],
$$d(ρ) = \mathrm{Ind}(ρ)^{1/2},$$
where Ind(ρ) denotes the minimal index of ρ(M) ⊂ M. We refer to [39] and references therein for the notions of index theory of our use. We fix a normal faithful state ω of M. Given ρ ∈ End(M) we denote by $Φ_ρ$ the minimal left inverse of ρ and set
$$Ψ_ρ = d(ρ)Φ_ρ, \qquad ψ_ρ = ω · Ψ_ρ$$
so that $ψ_ρ$ is a normal faithful positive functional of M. We define the cocycle of an endomorphism ρ by
$$u(ρ, t) = (Dψ_ρ : Dω)_t \qquad (1.6)$$
(see also [33]). As will be apparent, several of the results in this section are valid (essentially with the same proofs) for endomorphisms of factors with a normal faithful conditional expectation onto the range, but for simplicity we just treat the finite-index case.
Proposition 1.1. Let ρ ∈ End(M) and set $ρ_t = σ_t^ω ρ σ_{-t}^ω$. Then u(ρ, t) satisfies
$$\mathrm{Ad}u(ρ, t) · ρ_t = ρ. \qquad (1.7)$$
In particular, if ρ is irreducible, then u(ρ, t) is characterized by (1.7) up to the multiplication by a one-dimensional character of R.
Proof. The minimal expectation $E_ρ = ρΦ_ρ$ onto ρ(M) leaves $ψ_ρ$ invariant since $ψ_ρρΦ_ρ = ωΨ_ρρΦ_ρ = ωΨ_ρ = ψ_ρ$; thus
$$σ^{ψ_ρ}|_{ρ(M)} = σ^{ψ_ρ|_{ρ(M)}} = σ^{ωρ^{-1}} = ρσ^ωρ^{-1},$$
namely
$$σ_t^{ψ_ρ}ρ(x) = ρσ_t^ω(x), \qquad x ∈ M,$$
and since $σ_t^{ψ_ρ} = \mathrm{Ad}u(ρ, t)σ_t^ω$ we have
$$\mathrm{Ad}u(ρ, t) · σ_t^ω · ρ · σ_{-t}^ω = ρ. \qquad (1.8)$$
Now Eq. (1.8) determines the restriction of Adu(ρ, t) to ρ(M), hence it determines u(ρ, t) up to multiplication by a unitary in $ρ(M)' ∩ M$ and therefore, if ρ is irreducible, it determines u(ρ, t) up to a phase.
As recalled in Appendix A, the elements of End(M) are the objects of a tensor C*-category where the arrows are given by (A.0). The following in this section has a natural interpretation in the setting of tensor C*-categories, although here below we use the explicit formulas of our setting (see [37]). As we shall use the Greek letter σ to denote an element of End(M), the modular group will be always denoted with explicit reference to the functional (e.g. $σ_t^ω$).
Proposition 1.2. Let ρ ∈ End(M) be irreducible and contained in σ ∈ End(M), namely there exists an isometry w in (ρ, σ) (i.e. w ∈ M and wρ(x) = σ(x)w, ∀x ∈ M), then
$$Ψ_ρ = Ψ_σ(w · w^*). \qquad (1.9)$$
If $σ = ⊕_{i=1}^N n_iρ_i$ is an irreducible decomposition of σ and for each i $\{w_k^{(i)}, k = 1, \dots, n_i\}$ is an orthonormal basis of isometries in $(ρ_i, σ)$, then
$$Ψ_σ = \sum_{i=1}^N \sum_{k=1}^{n_i} Ψ_{ρ_i}(w_k^{(i)*} · w_k^{(i)}). \qquad (1.10)$$
Proof. Let $Ψ'_σ$ be defined by the right-hand side of (1.10). Then
$$Ψ'_σ(σ(x)) = \sum_{i=1}^N \sum_{k=1}^{n_i} Ψ_{ρ_i}(w_k^{(i)*}σ(x)w_k^{(i)}) = \sum_{i=1}^N \sum_{k=1}^{n_i} Ψ_{ρ_i}(ρ_i(x)) = \sum_{i=1}^N n_id(ρ_i)x = d(σ)x,$$
thus $d(σ)^{-1}Ψ'_σ$ is a left inverse of σ. As $Ψ'_σ(w_k^{(i)}w_k^{(i)*}) = Ψ_{ρ_i}(1) = d(ρ_i)$, we have that $Ψ'_σ = Ψ_σ$, showing the validity of Eq. (1.10). Formula (1.9) is obtained similarly.
Proposition 1.3. If T is an arrow in (ρ, σ), then
$$Tu(ρ, t) = u(σ, t)σ_t^ω(T). \qquad (1.11)$$
Proof. Let first ρ be an irreducible component of σ and w an isometry in (ρ, σ). Then by (1.9) we have
$$u(ρ, t) = (Dψ_ρ : Dω)_t = (Dψ_σ(w · w^*) : Dω)_t,$$
therefore by the Radon–Nikodym chain rule and using the relation $(Dψ_σ(w^* · w) : Dψ_σ)_t = w^*σ_t^{ψ_σ}(w)$ we have
$$u(ρ, t) = (Dψ_σ(w^* · w) : Dψ_σ)_t(Dψ_σ : Dω)_t = w^*σ_t^{ψ_σ}(w)u(σ, t) = w^*u(σ, t)σ_t^ω(w),$$
and after multiplying on the left by w all members in the above expression, and using the $σ_t^{ψ_σ}$-invariance of $ww^*$, we obtain the special case of (1.11),
$$wu(ρ, t) = u(σ, t)σ_t^ω(w). \qquad (1.12)$$
If now $σ = ⊕_{i=1}^N n_iρ_i$ is an irreducible decomposition of σ and the $w_k^{(i)}$ are as in Proposition 1.2, we have from (1.12) $u(σ, t)w_k^{(i)}w_k^{(i)*} = w_k^{(i)}u(ρ_i, t)σ_t^ω(w_k^{(i)*})$ and summing up over i and k we have
$$u(σ, t) = \sum_{i,k} w_k^{(i)}u(ρ_i, t)σ_t^ω(w_k^{(i)*}). \qquad (1.13)$$
If now T is an arrow in (ρ, σ), we may decompose both σ and ρ into irreducibles so that the ranges of the $w_k^{(i)}$'s of ρ (resp. of σ) are either orthogonal or contained in the range of T (resp. of $T^*$). Then multiplying (1.13) on the left by T (resp. on the right by $σ_t^ω(T)$) will kill the indices corresponding to the orthogonal part and the result is obtained by Eq. (1.12).
Proposition 1.4. u(ρσ, t) = ρ(u(σ, t))u(ρ, t).
Proof. We first note that
$$ρ(u(σ, t)) = (Dψ_{ρσ} : Dψ_ρ)_t. \qquad (1.14)$$
Indeed as both functionals $ψ_{ρσ}$ and $ψ_ρ$ leave invariant the conditional expectation $ρΦ_ρ$ onto ρ(M), their Radon–Nikodym cocycle coincides with the cocycle of their restriction to ρ(M), thus
$$(Dψ_{ρσ} : Dψ_ρ)_t = (Dψ_{ρσ}|_{ρ(M)} : Dψ_ρ|_{ρ(M)})_t = (Dψ_σ · ρ^{-1}|_{ρ(M)} : Dω · ρ^{-1}|_{ρ(M)})_t = ρ((Dψ_σ : Dω)_t) = ρ(u(σ, t)).$$
The proposition then follows by the Radon–Nikodym chain rule for the Connes cocycles.
Propositions 1.3 and 1.4 state that u(ρ, t) is a two-variable cocycle (Appendix A), with respect to the action of R on End(M) given by
$$t \to ρ_t = σ_t^ω · ρ · σ_{-t}^ω, \qquad t \to σ_t^ω(T), \qquad (1.15)$$
where ρ is an object and T is an arrow in End(M). Recall now that each ρ ∈ End(M) has a conjugate object $\bar ρ$, namely there exist $R_ρ ∈ (ι, \bar ρρ)$ and $\bar R_ρ ∈ (ι, ρ\bar ρ)$ standard solutions of Eq. (A.1), i.e.
$$\bar R_ρ^*ρ(R_ρ) = 1, \qquad R_ρ^*\bar ρ(\bar R_ρ) = 1$$
and $\|R_ρ\| = \|\bar R_ρ\|$ $(= \sqrt{d(ρ)})$ is minimal. As explained in Appendix A, given an arrow $T ∈ (ρ_1, ρ_2)$, the conjugate arrow $T^• ∈ (\bar ρ_1, \bar ρ_2)$ is defined by
$$T^• = \bar ρ_2(\bar R_{ρ_1}^*T^*)R_{ρ_2},$$
where $R_{ρ_i}$ and $\bar R_{ρ_i}$ give a standard solution for the conjugate equation defining the conjugate $\bar ρ_i$. Note now that given ρ ∈ End(M), once we choose $\bar ρ$ defined by the $R_ρ$ and $\bar R_ρ$, then $(\bar ρ)_t$ is a conjugate of $ρ_t$ defined by $R_{ρ_t} = σ_t^ω(R_ρ)$, $\bar R_{ρ_t} = σ_t^ω(\bar R_ρ)$, that we may simply denote by $\bar ρ_t$. In the following $u(ρ, t)^•$ is defined by this choice of the R-operators.
Proposition 1.5. $u(\bar ρ, t) = u(ρ, t)^•$.
Proof. By definition
$$u(ρ, t)^• = \bar ρ(\bar R_{ρ_t}^*u(ρ, t)^*)R_ρ$$
$$= \bar ρ(σ_t^ω(\bar R_ρ^*)u(ρ, t)^*)R_ρ$$
$$= \bar ρ(σ_t^ω(\bar R_ρ^*)σ_t^ω(u(ρ, -t)))R_ρ$$
$$= \bar ρ(σ_t^ω(\bar R_ρ^*u(ρ, -t)))R_ρ$$
$$= u(\bar ρ, t)σ_t^ω(\bar ρ(\bar R_ρ^*u(ρ, -t)))u(\bar ρ, t)^*R_ρ$$
$$= u(\bar ρ, t)σ_t^ω(\bar ρ(\bar R_ρ^*u(ρ, -t)))σ_t^ω(u(\bar ρ, -t))R_ρ$$
$$= u(\bar ρ, t)σ_t^ω(\bar ρ(\bar R_ρ^*u(ρ, -t))u(\bar ρ, -t))R_ρ.$$
Thus we have to show that $σ_t^ω(\bar ρ(\bar R_ρ^*u(ρ, -t))u(\bar ρ, -t))R_ρ = 1$ or, by applying $σ_{-t}^ω$, that $\bar ρ(\bar R_ρ^*u(ρ, -t))u(\bar ρ, -t)R_{ρ_{-t}} = 1$. Indeed we have
$$\bar ρ(\bar R_ρ^*u(ρ, -t))u(\bar ρ, -t)R_{ρ_{-t}} = \bar ρ(\bar R_ρ^*)\bar ρ(u(ρ, -t))u(\bar ρ, -t)R_{ρ_{-t}} = \bar ρ(\bar R_ρ^*)u(\bar ρρ, -t)R_{ρ_{-t}} = \bar ρ(\bar R_ρ^*)R_ρ = 1,$$
where $u(\bar ρρ, -t)R_{ρ_{-t}} = R_ρ$ by Proposition 1.3.
Lemma 1.6. If j is an ω-preserving anti-automorphism of M, then $u(ρ, t) = j(u(j · ρ · j^{-1}, -t))$.
Proof. Since j preserves ω, by the KMS condition $jσ_t^ωj^{-1} = σ_{-t}^ω$, thus $j(u(j · ρ · j^{-1}, -t))$ is a $σ^ω$-cocycle and it coincides with u(ρ, t) because it satisfies Eq. (1.5).
Proposition 1.7. Let T be a C*-tensor subcategory with conjugates of End(M) and z a two-variable cocycle for the action R → Aut T given by the modular group $σ^ω$ (Eq. (1.15)). Suppose there is an anti-automorphism j of M such that $j^{-1}ρj$ is a conjugate of ρ and $z(j^{-1}ρj, t) = j(z(ρ, -t))$ for a given ρ ∈ T. Then z(ρ, t) coincides with the Connes cocycle u(ρ, t) defined by (1.5). As a consequence,
$$d(ρ) = \underset{t\to -i}{\mathrm{anal.cont.}}\; ω(z(ρ, t)). \qquad (1.16)$$
Proof. We have z(ρ, t) = µ(ρ, t)u(ρ, t) for some character µ(ρ, t) of R. Since $\bar ρ = j^{-1}ρj$ is a conjugate of ρ we have by Lemma 1.6,
$$z(\bar ρ, t) = j(z(ρ, -t)) = j(µ(ρ, -t)u(ρ, -t)) = µ(ρ, t)j(u(ρ, -t)) = µ(ρ, t)u(\bar ρ, t).$$
On the other hand by Lemma A.3 of Appendix A,
$$z(\bar ρ, t) = z(ρ, t)^• = µ(ρ, t)^•u(ρ, t)^• = \overline{µ(ρ, t)}\,u(\bar ρ, t),$$
thus µ(ρ, t) = 1. Formula (1.16) is thus a consequence of (1.5) in the case x = 1.
Before concluding this section we recall the notions of entropy for later use. If N ⊂ M is an inclusion of $II_1$-factors, the Connes–Størmer (conditional) entropy H(M|N) is defined in [15]. By the Pimsner–Popa Theorem [41],
$$H(M|N) = \log[M : N], \qquad (1.17)$$
where [M : N] is the Jones index of N ⊂ M (if N ⊂ M is irreducible or extremal). H(M|N) is generalized to the type III setting in [14]. If ϕ is a normal faithful state of M one defines
$$H_ϕ(M|N) = \sup_{(ϕ_i)} \sum_i \{S(ϕ|ϕ_i) - S(ϕ|_N \,|\, ϕ_i|_N)\},$$
where $(ϕ_i)$ varies among the sets of finitely many normal positive linear functionals $ϕ_1, ϕ_2, \dots, ϕ_n$ of M such that $\sum_{i=1}^n ϕ_i = ϕ$, and S(·|·) denotes the Araki relative entropy between states, see [6] and Eq. (3.7) below. If E is a normal conditional expectation of M onto N one sets $H_E(M|N) = \sup\{H_ϕ(M|N), ϕ · E = ϕ\}$. If moreover N is a $III_1$-factor, then $H_E(M|N) = H_ϕ(M|N)$ for any normal faithful state ϕ such that ϕ · E = ϕ [28]. $H_E(M|N)$ depends on the choice of a normal conditional expectation E, but we simply write $H(M|N) = H_{E_{\min}}(M|N)$, where $E_{\min}$ is the minimal conditional expectation. The Pimsner–Popa equality (1.17) holds true without restrictions, provided [M : N] denotes the minimum index, see [28]. Finally, if ρ is an endomorphism of M, we consider the conditional entropy of ρ,
$$S_c(ρ) = H(M|ρ(M)). \qquad (1.18)$$
2. The Formula in Conformal Field Theory
2.1. Finite index case. We now consider a precosheaf (net) A of von Neumann algebras associated with a chiral conformal quantum field theory. Namely A is a map I → A(I) from the (proper) intervals of $S^1$ to von Neumann algebras A(I) on a fixed Hilbert space H that satisfies:
Isotony: if $I ⊂ \tilde I$ then $A(I) ⊂ A(\tilde I)$,
Locality: A(I) and $A(I')$ commute elementwise, where $I' = S^1 \setminus I$,
Möbius covariance with positive energy: there exists a unitary representation U of the Möbius group SL(2, R)/{1, −1}, that for convenience we regard as a representation of its universal covering group G, such that $U(g)A(I)U(g)^* = A(gI)$ and the generator of the one-parameter rotation subgroup is positive. We set $α_g(X) = U(g)XU(g)^*$ where X is a local operator, namely X belongs to some A(I),
Existence and uniqueness of the vacuum: there exists a unique (up to a phase) U-invariant unit vector Ω ∈ H, and it is cyclic for the algebra generated by local operators. We denote by ω = (·Ω, Ω) the vacuum state.
For a discussion of these properties and their consequences, see [23]. Let ρ be a covariant positive energy representation of A on a separable Hilbert space $H_ρ$, namely for every proper interval I we have a representation $ρ_I$ of A(I) on $H_ρ$ so that
$$ρ_{\tilde I}|_{A(I)} = ρ_I \quad \text{if } I ⊂ \tilde I$$
and a positive energy representation $U_ρ$ of G on $H_ρ$ such that $ρ_{gI}(α_g(X)) = U_ρ(g)ρ_I(X)U_ρ(g)^*$, X ∈ A(I) (if A is strongly additive the covariance property is automatic [22]). We shall always assume the representations to be covariant with positive energy. Let $Λ_I(t)$ be the special conformal one-parameter group associated with I. If I is the upper semi-circle then
$$Λ_I(t)z = \frac{(z + 1) + e^{-t}(z - 1)}{(z + 1) - e^{-t}(z - 1)}, \qquad (2.1)$$
while if $I_0$ is any other interval $Λ_{I_0}$ is well defined by conjugation of the transformation (2.1) with a conformal transformation g ∈ SL(2, R) such that $gI = I_0$. We denote by
$$K_ρ^I = -i\,\frac{d}{dt}U_ρ(Λ_I(t))\Big|_{t=0}$$
the infinitesimal generator of $U_ρ(Λ_I(t))$.
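The geometric content of formula (2.1) can be checked directly: for each t the map preserves the unit circle, fixes the endpoints ±1 of the upper semi-circle, and the maps compose additively in t. The following sketch, with hypothetical function names, verifies these properties numerically for sample points; it illustrates (2.1) only and makes no statement about the unitary representation.

```python
import cmath
import math


def special_conformal(t, z):
    """The transformation (2.1) associated with the upper semi-circle."""
    w = math.exp(-t)
    return ((z + 1) + w * (z - 1)) / ((z + 1) - w * (z - 1))


def circle_point(theta):
    """Point exp(i * theta) of the unit circle."""
    return cmath.exp(1j * theta)


if __name__ == "__main__":
    z = circle_point(0.7)  # a point of the upper semi-circle
    image = special_conformal(1.3, z)
    print("stays on the circle:", abs(abs(image) - 1.0) < 1e-12)
    print("stays in the upper half:", image.imag > 0)
    composed = special_conformal(0.4, special_conformal(0.9, z))
    print("group law defect:", abs(composed - image))
    print("fixed points:", special_conformal(2.0, 1.0), special_conformal(2.0, -1.0))
```

In the Cayley coordinate w = (z − 1)/(z + 1) the map is simply $w \to e^{-t}w$, which explains the group law and the two fixed points.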
Theorem 2.1. Let ρ be a representation of A, $I ⊂ S^1$ an interval and $ξ ∈ H_ρ$ a cyclic vector for $ρ_{I'}(A(I'))$ such that $ω(X) = (ρ_{I'}(X)ξ, ξ)$ for all $X ∈ A(I')$. Then³
$$(e^{-2πK_ρ^I}ξ, ξ) = d(ρ).$$
A vector ξ as above always exists.
³ We shall use the following convention: if A is a positive selfadjoint operator and η a vector, then $(Aη, η) = \|A^{1/2}η\|^2$ if η belongs to the domain of $A^{1/2}$, and (Aη, η) = +∞ otherwise.
As we are interested in the representation ρ up to unitary equivalence, we may identify $H_ρ$ with H so that, due to Haag duality (see below), ρ becomes an endomorphism of A localized in a given interval I, namely $ρ_{I'}$ acts identically, and $ρ_{\tilde I}(A(\tilde I)) ⊂ A(\tilde I)$ if $I ⊂ \tilde I$, see [23]. Because of the Reeh–Schlieder theorem (see [20]), the vacuum vector Ω is cyclic for any local von Neumann algebra, in particular for $A(I')$, therefore Ω satisfies in this representation the properties required for the vector ξ in the statement of Theorem 2.1, showing its existence. As ρ is covariant, there is a unitary α-cocycle z(ρ, g) such that
$$\mathrm{Ad}z(ρ, g) · α_g · ρ · α_g^{-1} = ρ, \qquad (2.2)$$
where $α_g = \mathrm{Ad}U(g)$. More precisely the equation $z(ρ, g) = U_ρ(g)U(g)^*$ defines z(ρ, g) if g belongs to a neighbourhood of the identity of G, and z(ρ, g) is localized in the sense that it belongs to $A(\tilde I)$ if $\tilde I$ is any interval containing both I and gI [22] (see also [54]). Then z(ρ, g) is defined for arbitrary g ∈ G as an element of the universal algebra C*(A) by the cocycle identity z(ρ, gh) = z(ρ, g)ρ(z(ρ, h)), but we do not need this fact. It is important to note that if ρ is irreducible or a finite direct sum of irreducibles, as is the case of a finite-index ρ, then z is uniquely determined by the formula (2.2) as a localized cocycle, because G has no non-trivial unitary finite-dimensional representation, see [23].
We consider the tensor C*-category $E_I$ whose objects are the (covariant, positive energy) endomorphisms of A with finite index, namely $d(ρ) = d(ρ_I) < ∞$, localized in an interval whose closure is contained in the interior of a given interval I and the arrows (ρ, σ) are the local operators T such that $Tρ_{I_1}(X) = σ_{I_1}(X)T$ for all intervals $I_1$ and local operators $X ∈ A(I_1)$. By [36] (see also [23]) the restriction map $ρ ∈ E_I \to ρ_I ∈ \mathrm{End}A(I)$ is a faithful full functor, therefore we may identify $E_I$ with a tensor C*-subcategory of End A(I), so that d(ρ) is identified with the DHR statistical dimension of ρ. Then z is in a natural sense a local two-variable cocycle for the local action of G on $E_I$ given by $ρ \to α_gρα_g^{-1}$ ($ρ ∈ E_I$) and $T \to α_g(T)$ (T arrow), namely the properties a) and b) defining a two-variable cocycle after Lemma A.3 of Appendix A hold for z(ρ, g), but only if g belongs to a neighbourhood of the identity of G (depending on ρ). For example if $ρ, σ ∈ E_I$ then ρ(z(ρ, g))z(σ, g) exists if g lies in a neighbourhood of the identity of G and satisfies (2.2) for ρσ, hence it agrees with z(ρσ, g) as the α-cocycle property and formula (2.2) determine it. We thus have:
Lemma 2.2. z(ρ, g) is a local two-variable cocycle for the action of G on the tensor C*-category $E_I$.
Proof. This follows by the above discussion and an elementary direct verification of the two-variable cocycle property.
Recall now that each localized endomorphism ρ has a conjugate localized endomorphism given by the formula [22]
$$\bar ρ = j · ρ · j, \qquad (2.3)$$
where j = AdJ is the anti-automorphism of A implemented by the modular conjugation J of $(A(I_1), Ω)$ for any choice of the interval $I_1$. To be definite let I be the upper half-circle and $I_1$ the right half-circle. Due to the geometrical meaning of J, j implements an anti-automorphism of A(I). If g is in the Möbius group, we denote by $g^j$ its conjugate by the anti-automorphism given by the reflection on $S^1$ corresponding to j.
Proposition 2.3. Let ρ be a finite-index endomorphism of A localized in the interval I. With the above notations, we have
$$z(\bar ρ, g) = z(ρ, g)^• = j(z(ρ, g^j))$$
(see Sect. 1 and Appendix A for the definition of the •-mapping).
Proof. The first equality follows from Lemma 2.2 and Lemma A.3. However one may see directly the validity of both equalities by the uniqueness of $z(\bar ρ, g)$ by checking that also $z(ρ, g)^•$ and $j(z(ρ, g^j))$ are local α-cocycles and both satisfy formula (2.2) for $\bar ρ = jρj$.
Now the modular structure of A is computed in [29, 7], in particular we have
$$Δ_I^{it} = U(Λ_I(-2πt)),$$
where $Δ_I$ is the modular operator of (A(I), Ω). An important consequence is Haag duality $A(I') = A(I)'$; moreover the A(I)'s are type $III_1$ factors. The next theorem computes the modular structure of A in the representation ρ, see also Proposition 3.5. We set
$$u_I(ρ, t) = u(ρ_I, t) = (Dω · Ψ_{ρ_I} : Dω|_{A(I)})_t$$
as in (1.6), with ω the vacuum state.
Theorem 2.4. Let ρ be a finite-index endomorphism of A localized in the interval I. Then $u_I(ρ, t) = z(ρ, Λ_I(-2πt))$, t ∈ R.
Proof. By the formula (2.3) for the conjugate sector and Lemma 2.2, the theorem follows immediately by Proposition 1.7.
Corollary 2.5. Let ρ be an irreducible finite-index endomorphism of A localized in the interval I of $S^1$. Then
$$d(ρ) = (e^{-2πK_ρ^I}Ω, Ω) = \|e^{-πK_ρ^I}Ω\|^2. \qquad (2.4)$$
Proof. Since $z(ρ, Λ_I(t)) = U_ρ(Λ_I(t))U(Λ_I(t))^*$, Proposition 1.7 and Theorem 2.4 show that the function $t \to ω(z(ρ, Λ_I(-2πt)))$ extends to a function bounded and continuous in the strip $\{-1 \le \mathrm{Im}\,z \le 0\}$ and analytic in its interior so that
$$d(ρ) = \underset{t\to -i}{\mathrm{anal.cont.}}\; ω(z(ρ, Λ_I(-2πt))) = \underset{t\to -i}{\mathrm{anal.cont.}}\; (U_ρ(Λ_I(-2πt))Ω, Ω).$$
Standard functional analysis arguments then show that the right hand side of the above expression is equal to $(e^{-2πK_ρ^I}Ω, Ω)$.
Proof of Theorem 2.1 in the finite index case. Let $V : H_ρ \to H$ be the unitary given by
$$Vρ(X)ξ = XΩ, \qquad X ∈ A(I')$$
so that $Vρ(X)V^* = X$ if $X ∈ A(I')$, thus $ρ' = Vρ(·)V^*$ is an endomorphism of A localized in I. By Corollary 2.5 we then have
$$d(ρ) = (e^{-2πK_{ρ'}^I}Ω, Ω) = (Ve^{-2πK_ρ^I}V^*Ω, Ω) = (e^{-2πK_ρ^I}ξ, ξ),$$
in case ρ has finite index. We have already commented on the existence of ξ. The infinite index case is discussed here below.
2.2. General case: A criterion for finite index. We now show that formula (2.4) gives a criterion for the finiteness of the index of a sector, namely that Theorem 2.1 holds without restrictions. We start with a general fact.
Proposition 2.6. Let N ⊂ M be an inclusion of factors and assume that there exist a normal faithful conditional expectation E of M onto N and a normal faithful conditional expectation $E'$ of $N'$ onto $M'$. Then $N' ∩ M$ is a discrete type I von Neumann algebra, i.e. a (possibly infinite) direct sum of type I factors. Moreover for each minimal projection p of $N' ∩ M$ the inclusion Np ⊂ pMp has finite index.
Proof. Setting $R = N' ∩ M$, E restricts to a faithful expectation of N ∨ R onto N, hence N ∨ R is canonically isomorphic to the von Neumann tensor product N ⊗ R and we can assume this isomorphism to be spatial (by tensoring N and M by a type III factor, if necessary). On the other hand $E'$ factors through a faithful normal expectation of $N'$ onto $(N ∨ R)' = N' ∩ R'$ by Takesaki’s theorem [49], hence, with the above spatial identification, we have a normal faithful expectation of $N' ⊗ B(H)$ onto $N' ⊗ R'$, that restricts to a normal expectation of C ⊗ B(H) onto $C ⊗ R'$, which implies that R is a type I von Neumann algebra. As R is a direct sum of homogeneous type $I_n$ von Neumann algebras, by considering the reduced inclusion corresponding to an abelian projection of R (fixed by the modular group of the expectation) we are left to prove our statement in the case R is an abelian von Neumann algebra, namely we have to prove that R is totally atomic, for in this case the finiteness of the index of the reduced inclusions corresponding to minimal projections of R would follow by [37, Proposition 4.4].
By decomposing R into its diffuse and atomic part, we may then assume R to be diffuse abelian and reach a contradiction. To this end, for notational convenience, we may identify N with M (see [38]), i.e. we set N = σ(M) for some endomorphism σ of M. We may decompose $σ = \int^{⊕} σ_λ\,dµ(λ)$ into irreducibles and as R is abelian $σ_λ$ is disjoint to $σ_{λ'}$ for almost all pairs $(λ, λ')$, hence $\bar σ_λσ_{λ'}$ does not contain the identity, except for $(λ, λ')$ in a set of product measure 0 and we conclude that $\bar σσ$ does not contain the identity too. By [36] this shows that there exists no normal faithful expectation onto N, contradicting our hypothesis.
Lemma 2.7. Let N ⊂ M be an inclusion of von Neumann algebras on a Hilbert space H, Ω a cyclic separating vector for M and $Δ_M$, $Δ_N$ the corresponding modular operators on H and on $\overline{NΩ}$. Then $\|Δ_M^{1/2}ξ\| = \|Δ_N^{1/2}ξ\|$ for all ξ in the domain of $Δ_N^{1/2}$.
Proof. If x ∈ N we have with usual notations
$$\|Δ_N^{1/2}xΩ\| = \|J_NΔ_N^{1/2}xΩ\| = \|x^*Ω\| = \|J_MΔ_M^{1/2}xΩ\| = \|Δ_M^{1/2}xΩ\|,$$
and as NΩ is a core for $Δ_N^{1/2}$ the equality holds for all the vectors in the domain of $Δ_N^{1/2}$.
Corollary 2.8. Let T(t) and V(s) be two unitary one-parameter groups on a Hilbert space H such that
$$V(s)T(t)V(-s) = T(e^{-2πs}t), \qquad t, s ∈ R \qquad (2.5)$$
and assume $-i\,\frac{d}{dt}T(t)|_{t=0}$ to be positive. Then
$$\|e^{-πD}ξ\| = \|e^{-πD}T(t)ξ\| \qquad (2.6)$$
for all ξ in the domain of $e^{-πD}$ and all t ≥ 0, where D is the generator of V.
Proof. The projection P onto the T-fixed vectors commutes both with T and V and on such vectors Eq. (2.6) trivially holds, hence we may assume that P = 0. With this assumption all non-zero representations of the commutation relation (2.5) are quasi-equivalent by the von Neumann uniqueness theorem (D and the logarithm of the generator of T(t) satisfy the Heisenberg commutation relations), hence we may verify Eq. (2.6) in any given representation. Let B be the conformal net on $R = S^1 \setminus \{-1\}$ given by the current algebra (see e.g. [19]), T and V the translation and dilation unitary groups. Then $e^{-2πD} = Δ_M$, where M = B(0, +∞) and $T(t)e^{-2πD}T(-t) = Δ_N$, where N = B(t, +∞). Lemma 2.7 then applies to the modular operators $Δ_M$ and $Δ_N$ with respect to the vacuum and gives Eq. (2.6).
We shall now denote by $T^I(t)$ the one-parameter unitary group of translations associated with I, namely cutting $S^1$ and identifying it with R so that I is identified with $R_+$, then $T^I(t)$ corresponds to the translations on R. If ρ is an endomorphism of A localized in I, we shall then denote by $T_ρ^I$ the corresponding one-parameter unitary group in the representation ρ.
Corollary 2.9. Let ρ be an endomorphism of A localized in the interval I. Then
$$\|e^{-πK_ρ^I}ξ\| = \|e^{-πK_ρ^I}T_ρ^I(t)ξ\|$$
for all ξ in the domain of $e^{-πK_ρ^I}$ and t ≥ 0.
Proof. Immediate by Corollary 2.8.
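Proposition 2.10. Let $\rho$ be an endomorphism of $\mathcal{A}$ localized in the interval $I$. If $(e^{-2\pi K_\rho^I}\Omega, \Omega) < \infty$, then the formula
\[ \psi_\rho(XY^*) = (e^{-\pi K_\rho^I} X\Omega, e^{-\pi K_\rho^I} Y\Omega) \]
determines a positive normal linear functional $\psi_\rho$ on $\mathcal{A}(I)$ such that
\[ (D\psi_\rho : D\omega|_{\mathcal{A}(I)})_t = z(\rho, \Lambda_I(-2\pi t)). \]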
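Proof. By Connes' theorem [12] there exists a unique normal faithful semifinite weight $\psi_\rho$ on $\mathcal{A}(I)$ such that $(D\psi_\rho : D\omega_I)_t = z(\rho, \Lambda_I(-2\pi t))$, where we shorten notations like $\omega|_{\mathcal{A}(I)} = \omega_I$. Moreover
\[ \Delta(\omega_{I'}|\psi_\rho)^{it} = z(\rho, \Lambda_I(2\pi t))\,\Delta(\omega_{I'}|\omega_I)^{it}, \]
where $\Delta(\cdot|\cdot)$ denotes the Connes spatial derivative [13] (between a weight on a von Neumann algebra and a weight on its commutant), and since by (2.2)
\[ \Delta^{it}(\omega_{I'}|\omega_I) = \Delta_I^{it} = e^{-2\pi i t K^I}, \]
we have
\[ e^{-2\pi K_\rho^I} = \Delta(\omega_{I'}|\psi_\rho). \tag{2.7} \]
By assumption $\Omega$ thus belongs to the domain of $\Delta(\omega_{I'}|\psi_\rho)^{1/2}$ and this implies that $\psi_\rho(1) = \|\Delta(\omega_{I'}|\psi_\rho)^{1/2}\Omega\|^2 < +\infty$, namely $\psi_\rho$ is everywhere defined. The proposition now readily follows by formula (1.4):
\begin{align*}
\psi_\rho(XY^*) &= \underset{t \to -i}{\mathrm{anal.cont.}}\ \omega(Y^* u(\rho, t)\sigma_t^\omega(X)) \\
&= \underset{t \to -i}{\mathrm{anal.cont.}}\ \omega\bigl(Y^* z(\rho, \Lambda_I(-2\pi t))\sigma_t^\omega(X)\bigr) \\
&= \underset{t \to -i}{\mathrm{anal.cont.}}\ (e^{-i2\pi t K_\rho} X\Omega, Y\Omega).
\end{align*}
Alternatively one could use directly the expression given by Proposition 3.5.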
Proposition 2.11. Let $\rho$ and $\psi_\rho$ be as in Proposition 2.10 and identify $I$ with $\mathbb{R}_+$ as above. Then
\[ \psi_\rho(U_\rho(g) X U_\rho(g)^*) = \psi_\rho(X), \qquad X \in \mathcal{A}(I), \tag{2.8} \]
provided $g \in G$ is a dilation or a positive translation associated with $I$. As a consequence there exists a positive linear functional $\tilde\psi_\rho$ on $\mathcal{A}_\zeta = (\cup_\ell \mathcal{A}(\ell, +\infty))^-$, normal on any $\mathcal{A}(\ell, +\infty)$, translation and dilation invariant in the representation $\rho$, defined by $\tilde\psi_\rho(X) = \psi_\rho(T_\rho^I(t) X T_\rho^I(t)^*)$, where $X \in \mathcal{A}(\ell, +\infty)$ and $t + \ell > 0$.

Proof. The second assertion clearly follows from the first one. Formula (2.8) holds if $g$ is a dilation, as the dilations correspond to the modular automorphisms of $\mathcal{A}(I)$ with respect to $\psi_\rho$, due to the construction of $\psi_\rho$. By Proposition 2.10 we thus have to show that
\[ \|e^{-\pi K_\rho^I} T_\rho^I(t) X T_\rho^I(-t)\Omega\| = \|e^{-\pi K_\rho^I} X\Omega\|, \tag{2.9} \]
where $X \in \mathcal{A}(I)$ and $t \geq 0$. Indeed by Corollary 2.9 for all $X \in \mathcal{A}(I)$ and $t \geq 0$ we have
\[ \|e^{-\pi K_\rho^I} T_\rho^I(t) X T_\rho^I(-t)\Omega\| = \|e^{-\pi K_\rho^I} X T_\rho^I(-t)\Omega\| = \|e^{-\pi K_\rho^I} X\xi\|, \tag{2.10} \]
where $\xi = T_\rho^I(-t)\Omega = T_\rho^I(-t)T^I(t)\Omega$. On the other hand $z(\rho, t) = T_\rho^I(t)T^I(-t)$ belongs to $\mathcal{A}(I)$ if $t \geq 0$, as $\rho(T^I(t) \cdot T^I(-t))$ is also localized in $I$. Therefore $(X'\xi, \xi) = (X'\Omega, \Omega)$ if $X' \in \mathcal{A}(I')$.
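Now $\xi$ is cyclic for $\mathcal{A}(I')$ if $t \geq 0$, because $\xi = u\Omega$ with $u = z(\rho, t)$ a unitary in $\mathcal{A}(I)$. By Proposition 2.10, Eq. (2.10) gives
\[ \|e^{-\pi K_\rho^I} T_\rho^I(t) X T_\rho^I(-t)\Omega\|^2 = \|e^{-\pi K_\rho^I} X\xi\|^2 = \|e^{-\pi K_\rho^I} X u\Omega\|^2 = \psi_\rho(X u u^* X^*) = \psi_\rho(XX^*), \]
showing (2.9) as desired.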
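Proposition 2.12. Let $\rho = \oplus_k \rho_k$ be a direct sum of endomorphisms of $\mathcal{A}$ all localized in a given interval $I$. Then
\[ (e^{-2\pi K_\rho^I}\Omega, \Omega) = \sum_k (e^{-2\pi K_{\rho_k}^I}\Omega, \Omega). \]

Proof. Let $v_k$ be a family of isometries of $\mathcal{A}(I)$ such that $\rho = \sum_k v_k \rho_k(\cdot) v_k^*$. Then
\[ e^{-2\pi K_\rho^I} = \sum_k v_k\, e^{-2\pi K_{\rho_k}^I} v_k^*, \]
hence by Proposition 2.10 we have
\[ (e^{-2\pi K_\rho^I}\Omega, \Omega) = \sum_k (e^{-2\pi K_{\rho_k}^I} v_k^*\Omega, v_k^*\Omega) = \sum_k \psi_{\rho_k}(v_k^* v_k) = \sum_k \psi_{\rho_k}(1) = \sum_k d(\rho_k). \]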
Proposition 2.13. Let $\rho$ be as in Proposition 2.10. Then $\rho_I(\mathcal{A}(I))' \cap \mathcal{A}(I)$ is equal to the commutant $\{\cup_{I_0} \rho_{I_0}(\mathcal{A}(I_0))\}'$ of the representation $\rho$.

Proof. The proof is based on the arguments given in the proof of [23, Theorem 2.3], which concerned the case where $\rho$ had a priori finite index. In that context the proof relied on the construction in [23, Corollary 2.5] of a locally normal faithful positive linear functional invariant under dilations and translations in the representation $\rho$. Proposition 2.11 provides us with such a functional $\tilde\psi_\rho$ in our setting; therefore, with obvious modifications, the rest of the proof of [23, Theorem 2.3] is valid here.

Lemma 2.14. Let $\rho$ be an endomorphism of $\mathcal{A}$ localized in the interval $I$. If there exists a normal faithful conditional expectation $E$ of $\mathcal{A}(I)$ onto $\rho(\mathcal{A}(I))$, then $\rho_I$ is a (possibly infinite) direct sum of irreducible endomorphisms of $\mathcal{A}(I)$ with finite index.

Proof. By conformal invariance we may assume that $I$ is the upper semi-circle. We now use the formula $\bar\rho = j \cdot \rho \cdot j$ for the conjugate sector $\bar\rho$, where $j = \mathrm{Ad}J$ with $J$ the modular conjugation associated with the right semi-circle. Due to its geometrical meaning, $j$ is an anti-automorphism of $\mathcal{A}(I)$, so that $j \cdot E \cdot j$ is a normal faithful expectation onto $\bar\rho(\mathcal{A}(I))$. Now the inclusion $\bar\rho(\mathcal{A}(I)) \subset \mathcal{A}(I)$ is dual to the inclusion $\rho(\mathcal{A}(I)) \subset \mathcal{A}(I)$, hence the lemma follows by Proposition 2.6.

Theorem 2.15. Let $\rho$ be an endomorphism of $\mathcal{A}$ localized in the interval $I$. Then $(e^{-2\pi K_\rho^I}\Omega, \Omega) < +\infty$ if and only if $\rho$ has finite index. Therefore the equality $d(\rho) = (e^{-2\pi K_\rho^I}\Omega, \Omega)$ holds regardless of whether $d(\rho)$ is finite or infinite.
Proof. We only have to show that if $(e^{-2\pi K_\rho^I}\Omega, \Omega) < +\infty$ then $\rho$ has finite index. Now in this case Proposition 2.10 gives us a faithful positive normal linear functional on $\mathcal{A}(I)$ whose modular group leaves $\rho(\mathcal{A}(I))$ globally invariant by Proposition 1.1. By the Takesaki theorem [49] we have a normal faithful expectation onto $\rho(\mathcal{A}(I))$, whence by Proposition 2.12 and Lemma 2.14 $\rho$ is a direct sum of irreducible finite index sectors $\rho_k$. As for each $\rho_k$ the formula $(e^{-2\pi K_{\rho_k}^I}\Omega, \Omega) = d(\rho_k)$ holds true by Corollary 2.5, the result follows by the additivity expressed in Proposition 2.12.

Before concluding the section, we mention further applications of the above methods to the analysis of superselection sectors with infinite index [22, Sect. 11], in particular regarding the positivity of the energy in these representations. This matter will be discussed in [55].
3. Hawking Temperature in a Charged State and Conditional Entropy

3.1. General setting and a first expression. Following the discussion made in the introduction, we consider a Quantum Field Theory on the Minkowski space $\mathbb{R}^4$, identify the Rindler space-time $W$ with a wedge region in $\mathbb{R}^4$, and look at $W$ in analogy with the Schwarzschild space-time. For convenience we fix the Lorentz frame so that $W = \{x \in \mathbb{R}^4,\ x_1 > |x_0|\}$, and denote by $\Lambda_W(t)$ the corresponding one-parameter group of pure Lorentz transformations in the $x_1$-direction:
\[ \Lambda_W(t) = \begin{pmatrix} \mathrm{ch}(t) & \mathrm{sh}(t) & 0 & 0 \\ \mathrm{sh}(t) & \mathrm{ch}(t) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
Let $\mathcal{A}(O)$ be the von Neumann algebra on the Hilbert space $\mathcal{H}$ of the observables localized within the region $O$ of the Minkowski space. Let $U$ denote the unitary covariant, positive energy, representation of the Poincaré group $\tilde{P}_+$ on $\mathcal{H}$ and $\Omega$ the vacuum vector. We assume the local algebras to be generated by a Wightman field [47], in order to have the Bisognano–Wichmann theorem that identifies the Tomita–Takesaki modular operator $\Delta$ and the modular conjugation $J$ associated with $(\mathcal{A}(W), \Omega)$:
\[ \Delta_W^{it} = U(\Lambda_W(-2\pi t)), \qquad t \in \mathbb{R}, \]
and $J$ is the PCT anti-unitary composed with the unitary implementation of the change of sign of the $x_2$, $x_3$-coordinates. Therefore $U(\Lambda_W(t))$ implements a one-parameter automorphism group $\alpha_t$ of $\mathcal{A}(W)$ that satisfies the Kubo–Martin–Schwinger equilibrium condition at inverse temperature $\beta = 2\pi$ with respect to the restriction of the vacuum state $\omega = (\,\cdot\,\Omega, \Omega)$ to $\mathcal{A}(W)$⁴; in other words, by restriction to $\mathcal{A}(W)$, the pure ground state $\omega$ becomes faithful (by the Reeh–Schlieder theorem) and thermal for the geodesic evolution on the Rindler space provided by the boost transformations.

⁴ One may start with a modular covariance condition and encode the space-time symmetries intrinsically into the net structure [8].
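As a quick numerical illustration of the boost group introduced above, the following minimal sketch checks the one-parameter group law $\Lambda_W(s)\Lambda_W(t) = \Lambda_W(s+t)$ and that a sample point of the wedge $W = \{x_1 > |x_0|\}$ stays in $W$ under a boost. The function names (`boost`, `in_wedge`, and so on) and the sample point are illustrative choices, not part of the paper.

```python
import math

def boost(t):
    """4x4 matrix of the pure Lorentz transformation Lambda_W(t) in the x1-direction."""
    c, s = math.cosh(t), math.sinh(t)
    return [
        [c, s, 0.0, 0.0],
        [s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, x):
    """Apply a 4x4 matrix to a point x = (x0, x1, x2, x3)."""
    return [sum(m[i][k] * x[k] for k in range(4)) for i in range(4)]

def in_wedge(x):
    """True if x lies in the wedge W = {x1 > |x0|}."""
    return x[1] > abs(x[0])

def max_abs_diff(a, b):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

if __name__ == "__main__":
    # Group law: Lambda_W(0.3) Lambda_W(0.5) should equal Lambda_W(0.8).
    print(max_abs_diff(matmul(boost(0.3), boost(0.5)), boost(0.8)))
    # A sample point of W is mapped into W.
    print(in_wedge(apply(boost(1.7), [0.2, 1.0, 3.0, -4.0])))
```

The wedge is preserved because $x_1 \pm x_0$ are simply rescaled by $e^{\pm t}$ under $\Lambda_W(t)$, so both stay positive.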
As already explained, there is a relation of this setting with the Hawking and the Unruh effects, first noted by Sewell [44]. The space-time $W$ can be identified with the outside region of a black hole. Then the observable algebra for the background system of the black hole is $\mathcal{A}(W)$, the corresponding proper Hamiltonian is
\[ H = aK = -i\frac{d}{dt} U(\Lambda_W(at))\Big|_{t=0}, \]
where $a$ is the surface gravity of the black hole, and the dynamics in the Heisenberg picture is given by $\alpha_{at}(X) = e^{-iHt} X e^{iHt}$, $X \in \mathcal{A}(W)$. Accordingly $\omega|_{\mathcal{A}(W)}$ is a KMS state (i.e. Gibbs state at infinite volume [25]) at inverse Hawking temperature $\beta = \frac{2\pi}{a}$. We refer to [24, 46] for more details and further literature.

We shall consider the black hole as a heat reservoir for its background system and treat the latter as an open system. Because of the particle production due to the Hawking effect, the background system undergoes a change in its quantum numbers, namely the system goes into different superselection sectors, and we shall consider the thermal equilibrium charged state corresponding to a given sector.

We thus consider a superselection sector, namely the unitary equivalence class of a representation $\rho$ of the quasi-local C*-algebra $\mathcal{A}$, the norm closure $(\cup \mathcal{A}(O))^-$ of the union of all local algebras associated to bounded regions $O$. The representation $\rho$ is assumed to be localizable in each space-like cone $S$⁵, namely $\rho$ and the identity (vacuum) representation have unitarily equivalent restrictions to the C*-algebra $(\cup\{\mathcal{A}(O),\ O \subset S',\ O \text{ bounded}\})^-$ generated by the local observables in $S'$. We may then assume $S \subset W$ and, by identifying the representation Hilbert spaces, $\rho$ to act as the identity on $\mathcal{A}(O)$ if $O \subset S'$, namely $\rho$ is a DHR localized endomorphism [17]. By wedge duality (a consequence of the Bisognano–Wichmann theorem) $\mathcal{A}(W') = \mathcal{A}(W)'$, therefore $\rho$ restricts to a normal endomorphism of $\mathcal{A}(W)$, still denoted by $\rho$ (more precisely $\rho$ restricts to $\mathcal{A}(W) \cap \mathcal{A}$ and has a normal extension to $\mathcal{A}(W)$).

We assume that $\rho$ is irreducible and Poincaré covariant with positive energy-momentum, namely there exists a unitary representation $U_\rho$ of the universal covering group $\tilde{P}_+$ of the Poincaré group such that
\[ U_\rho(g)\rho(X)U_\rho(g)^* = \rho(U(g) X U(g)^*), \qquad X \in \mathcal{A},\ g \in \tilde{P}_+. \tag{3.1} \]
The covariance is automatic under general conditions [22]. As shown in [17], the notion of statistics is intrinsically associated with $\rho$; in particular the statistical dimension $d(\rho)$ is defined and turns out to be a positive integer or $+\infty$. We shall assume $d(\rho) < \infty$. By the index-statistics theorem [36], $\mathrm{Ind}(\rho) = d(\rho)^2$, where $\mathrm{Ind}(\rho)$ is the minimal index of $\rho|_{\mathcal{A}(W)}$, so we may equivalently assume that the restriction of $\rho$ to $\mathcal{A}(W)$ has finite index.

⁵ This class exhausts all the translation covariant, positive energy representations with an isolated mass shell [11], but possibly not charges with long range interactions.
The representation $U_\rho$ giving the covariance is uniquely defined by formula (3.1): since $\rho$ is irreducible any other representation would differ from $U_\rho$ by a one-dimensional character of $\tilde{P}_+$, which has to be trivial because $\tilde{P}_+$ has no non-trivial finite-dimensional unitary representation. Now $z(\rho, g) = U_\rho(g)U(g)^*$ is an $\mathrm{Ad}U(g)$-cocycle and is localized; in particular $z(\rho, g)$ belongs to $\mathcal{A}(W)$ if also $\rho \cdot \mathrm{Ad}U(g)$ is localized in $W$. In particular $z(\rho, \Lambda_W(t))$ is an $\alpha_t$-cocycle localized in $W$. As a consequence we have:

Lemma 3.1. $\alpha_t^\rho(X) = U_\rho(\Lambda_W(t)) X U_\rho(\Lambda_W(-t))$, $X \in \mathcal{A}(W)$, defines a one-parameter automorphism group $\alpha_t^\rho$ of $\mathcal{A}(W)$.

Proof. We have
\[ \alpha_t^\rho(\mathcal{A}(W)) = z(\rho, \Lambda_W(t))\,\alpha_t(\mathcal{A}(W))\,z(\rho, \Lambda_W(t))^* = z(\rho, \Lambda_W(t))\,\mathcal{A}(W)\,z(\rho, \Lambda_W(t))^* = \mathcal{A}(W) \]
because $z(\rho, \Lambda_W(t))$ belongs to $\mathcal{A}(W)$.
The one-parameter automorphism group $\alpha_{at}^\rho$ is the dynamics of our system in the sector $\rho$ and the corresponding proper Hamiltonian is given by
\[ H_\rho = aK_\rho = -i\frac{d}{dt} U_\rho(\Lambda_W(at))\Big|_{t=0}. \]
Theorem 2.1, or equivalently formula (2.4), has its version here, by an analogous proof,
\[ d(\rho) = (e^{-\beta H_\rho}\Omega, \Omega)\big|_{\beta = \frac{2\pi}{a}} = (e^{-2\pi K_\rho}\Omega, \Omega), \tag{3.2} \]
where $\Omega$ is the vacuum vector or any other cyclic vector for $\mathcal{A}(W)$ such that $(\,\cdot\,\Omega, \Omega)$ coincides with the vacuum state $\omega$ on $\mathcal{A}(W')$. As in Proposition 2.8 we have a normal faithful state of $\mathcal{A}(W)$ given by
\[ \varphi_\rho(XX^*) = d(\rho)^{-1}\|e^{-\pi K_\rho} X\Omega\|^2, \qquad X \in \mathcal{A}(W). \tag{3.3} \]
Lemma 3.2. If $\rho$ is irreducible then the one-parameter automorphism group $\alpha_t^\rho$ is ergodic on $\mathcal{A}(W)$, namely its fixed points are the scalars.

Proof. The proof is similar to the one given in [23] in the case of a conformal theory. Details will be given elsewhere.

Next we show that the system in the sector $\rho$ admits a thermal equilibrium state at the same Hawking inverse temperature $\beta = 2\pi/a$.

Theorem 3.3. $\alpha_{at}^\rho$ admits a unique normal KMS state $\varphi_\rho$ at inverse temperature $\beta = \frac{2\pi}{a}$. The state $\varphi_\rho$ is given by Eq. (3.3) or equivalently by
\[ \varphi_\rho = \omega\Phi_\rho, \tag{3.4} \]
where $\Phi_\rho$ is the minimal left inverse of $\rho$ on $\mathcal{A}(W)$ and, if $\rho$ is irreducible, $\varphi_\rho$ is the unique normal $\alpha^\rho$-invariant state of $\mathcal{A}(W)$. If $\beta \neq \frac{2\pi}{a}$, no $\alpha_{at}^\rho$-KMS normal state exists.
Proof. The situation is similar to the one discussed in the previous section. Again, relying on formula (3.2) and Proposition 1.7, we see that the Connes cocycle $(D\varphi_\rho : D\omega|_{\mathcal{A}(W)})_t$, where $\varphi_\rho$ is defined by Eq. (3.4), is equal to $d(\rho)^{-it} z(\rho, \Lambda_W(-2\pi t))$, thus the modular group of $\varphi_\rho$ is given by
\[ \sigma_t^{\varphi_\rho} = \alpha_{-2\pi t}^\rho, \]
i.e. $\varphi_\rho$ is $\alpha_{at}^\rho$-KMS at $\beta = 2\pi/a$. The non-existence of normal KMS states at different temperatures is an immediate consequence of the outerness of $\sigma^{\varphi_\rho}$ (cf. [12]), since $\mathcal{A}(W)$ is a type III$_1$ factor (see [35]). The uniqueness of $\varphi_\rho$ as a normal $\alpha^\rho$-invariant state is equivalent to the ergodicity of $\alpha^\rho$, hence a consequence of Lemma 3.2.
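Now a finite volume consideration (see Appendix B) suggests regarding $(e^{-\beta H_\rho}\Omega, \Omega)$ as the ratio of the (here undefined) partition functions $Z_0(\beta)$ of the state $\omega$ and $Z_\rho(\beta)$ of the state $\varphi_\rho$, namely
\[ \log(e^{-\beta H_\rho}\Omega, \Omega) = \log Z_0(\beta) - \log Z_\rho(\beta), \tag{3.5} \]
whence we expect the quantity $F(\omega|\varphi_\rho) = -\beta^{-1}\log(e^{-\beta H_\rho}\Omega, \Omega)$ to represent the increment of the free energy between $\omega$ and $\varphi_\rho$. We shall see the above formula to hold true in a precise sense.

We define the relative free energy $F(\omega|\varphi_\rho)$ between the states $\omega$ and $\varphi_\rho$ by
\[ F(\omega|\varphi_\rho) = \varphi_\rho(H_\rho) - \beta^{-1} S(\omega|\varphi_\rho), \tag{3.6} \]
where $S(\omega|\varphi_\rho) = S(\omega|_{\mathcal{A}(W)}|\varphi_\rho)$ is the Araki relative entropy of the two states on $\mathcal{A}(W)$,
\[ S(\omega|\varphi_\rho) = -(\log\Delta_{\Omega,\xi_\rho}\,\xi_\rho, \xi_\rho); \tag{3.7} \]
here $\Delta_{\Omega,\xi_\rho}$ is Araki's relative modular operator of $\mathcal{A}(W)$ associated with the two cyclic separating vectors $\Omega$ and $\xi_\rho$, where $\xi_\rho$ is any cyclic vector such that $\varphi_\rho = (\,\cdot\,\xi_\rho, \xi_\rho)$ on $\mathcal{A}(W)$. In particular we may assume $\xi_\rho$ to belong to the natural cone associated with $(\mathcal{A}(W), \Omega)$.

The quantity $\varphi_\rho(H_\rho) = (H_\rho\xi_\rho, \xi_\rho)$ in (3.6) represents the relative mean energy between $\omega$ and $\varphi_\rho$; indeed, according to formula (3.5), this has to be given, anticipating Proposition 3.5, by
\begin{align*}
-\frac{d}{d\beta}\log(e^{-\beta H_\rho}\Omega, \Omega) &= \frac{(e^{-\beta H_\rho}H_\rho\Omega, \Omega)}{(e^{-\beta H_\rho}\Omega, \Omega)} = \frac{d(\rho)\,(H_\rho\Delta_{\Omega,\xi_\rho}^{1/2}\Omega, \Delta_{\Omega,\xi_\rho}^{1/2}\Omega)}{(e^{-\beta H_\rho}\Omega, \Omega)} \\
&= (H_\rho J\Delta_{\Omega,\xi_\rho}^{1/2}\Omega, J\Delta_{\Omega,\xi_\rho}^{1/2}\Omega) = (H_\rho\xi_\rho, \xi_\rho) = \varphi_\rho(H_\rho),
\end{align*}
where $J$ is the modular conjugation of both $\Omega$ and $\xi_\rho$ and we have set $\beta = 2\pi/a$ and applied formula (3.2). More directly one may define the relative mean energy by the formal expression $\tilde\varphi_\rho(H_\rho) = \tilde\varphi_\rho(H_\rho - H)$, where $H_\rho - H$ is the relative Hamiltonian, $\tilde\varphi_\rho = (\,\cdot\,\xi_\rho, \xi_\rho)$ and one sets $\tilde\varphi_\rho(H) = 0$, motivated by the fact that $(e^{-iHt}\xi_\rho, \xi_\rho)$ is a real even function (because $\Delta^{it}$ preserves the natural cone) attaining a maximum at $t = 0$.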
The relative entropy $S(\omega|\varphi_\rho)$ is always non-negative, but it may be equal to $+\infty$, as no volume renormalization has been made (cf. Appendix B); also the relative mean energy $\varphi_\rho(H_\rho)$ may be infinite, but we shall show that the relative free energy between $\varphi_\rho$ and $\omega$ is finite, so in particular $\varphi_\rho(H_\rho)$ shall be bounded below. Formula (3.6) will have the obvious rigorous meaning as
\[ F(\omega|\varphi_\rho) = \bigl((H_\rho + \beta^{-1}\log\Delta_{\Omega,\xi_\rho})\,\xi_\rho, \xi_\rho\bigr). \]

Theorem 3.4. The relative free energy between the thermal equilibrium states $\omega$ and $\varphi_\rho$ is proportional to the Connes-Størmer entropy of the sector $\rho$:
\[ F(\omega|\varphi_\rho) = -\frac{1}{2}\beta^{-1} S_c(\rho). \]
Here $S_c(\rho)$ denotes the conditional entropy $H(\mathcal{A}(W)|\rho(\mathcal{A}(W)))$ (see (1.17), (1.18))
\[ S_c(\rho) = \sup_{(\varphi_i)} \sum_{i=1}^n \bigl[ S(\varphi_\rho|\varphi_i) - S(\varphi_\rho \cdot \rho|\varphi_i \cdot \rho) \bigr], \]
where $(\varphi_i)$ varies among the sets of finitely many normal positive linear functionals $\varphi_1, \varphi_2, \ldots, \varphi_n$ of $\mathcal{A}(W)$ such that $\sum_{i=1}^n \varphi_i = \varphi_\rho$. We note the extensive property of $S_c$: $S_c(\rho_1\rho_2) = S_c(\rho_1) + S_c(\rho_2)$ [38].

Proposition 3.5. We have
\[ \beta H_\rho = 2\pi K_\rho = -\log\Delta_{\Omega,\xi_\rho} - \frac{1}{2} S_c(\rho). \tag{3.8} \]
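Proof. By Araki's formula
\[ (D\varphi_\rho : D\omega|_{\mathcal{A}(W)})_t = \Delta_{\Omega,\xi_\rho}^{it}\Delta_\Omega^{-it}, \]
therefore
\[ u_W(\rho, t) = (Dd(\rho)\varphi_\rho : D\omega|_{\mathcal{A}(W)})_t = d(\rho)^{it}\Delta_{\Omega,\xi_\rho}^{it}\Delta_\Omega^{-it}. \]
On the other hand by our formula $u_W(\rho, t) = z(\rho, \Lambda_W(-2\pi t)) = e^{-i2\pi K_\rho t}e^{i2\pi K t}$, therefore
\[ e^{-i2\pi K_\rho t}e^{i2\pi K t} = d(\rho)^{it}\Delta_{\Omega,\xi_\rho}^{it}\Delta_\Omega^{-it}, \]
and as $\Delta_\Omega^{it} = U(\Lambda_W(-2\pi t)) = e^{-i2\pi K t}$ we see that
\[ e^{-i2\pi t K_\rho} = d(\rho)^{it}\Delta_{\Omega,\xi_\rho}^{it}, \]
so the proposition is obtained by differentiating this expression at $t = 0$ and using $\log d(\rho) = \frac{1}{2}S_c(\rho)$. An alternative argument will appear in Lemma 3.10.

Proof of Theorem 3.4. By evaluating on $\tilde\varphi_\rho$ both sides of formula (3.8) we have
\[ \beta(H_\rho\xi_\rho, \xi_\rho) = -(\log\Delta_{\Omega,\xi_\rho}\,\xi_\rho, \xi_\rho) - \log d(\rho). \]
On the other hand $d(\rho)$ is the square root of the minimal index of $\rho|_{\mathcal{A}(W)}$ [36], thus by the Pimsner–Popa equality (1.18) it follows that $\log d(\rho) = \frac{1}{2} S_c(\rho)$, hence proving the theorem.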
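The last step of the proof can be spelled out as follows. This is only a restatement of definitions (3.6) and (3.7) combined with formula (3.8) evaluated on $\xi_\rho$; no assumption beyond those of the text is introduced.

```latex
\begin{align*}
F(\omega|\varphi_\rho)
  &= \varphi_\rho(H_\rho) - \beta^{-1} S(\omega|\varphi_\rho)
     && \text{by (3.6)} \\
  &= \beta^{-1}\bigl[-(\log\Delta_{\Omega,\xi_\rho}\xi_\rho,\xi_\rho) - \log d(\rho)\bigr]
     + \beta^{-1}(\log\Delta_{\Omega,\xi_\rho}\xi_\rho,\xi_\rho)
     && \text{by (3.8) and (3.7)} \\
  &= -\beta^{-1}\log d(\rho)
   = -\tfrac{1}{2}\,\beta^{-1} S_c(\rho)
     && \text{since } \log d(\rho) = \tfrac{1}{2} S_c(\rho).
\end{align*}
```

The two possibly infinite terms, the relative mean energy and the relative entropy, cancel against each other, which is why the relative free energy is finite.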
Corollary 3.6. The possible values of the relative free energy with initial state $\omega$ are
\[ F(\omega|\varphi_\rho) = -\frac{1}{2}\beta^{-1}\log(n), \qquad n = 1, 2, 3, \ldots. \]

Proof. Immediate by the DHR theorem [17] to the effect that the statistical dimension is a positive integer or $+\infty$.

Therefore the integer $n$, expressing the order of the parastatistics in [17], here appears as a quantum number labeling the relative free energy levels.

In low space-time dimensions the quantization of the conditional entropy is less restrictive. By Jones' theorem [34] and the results in [38, 42] we have however restrictions for the possible values of the relative free energy associated with a planar black hole:

Corollary 3.7. In low dimensions the possible values of $e^{-\beta F(\omega|\varphi_\rho)}$ are restricted to $4\cos^2(\frac{\pi}{n})$ in the interval $(0, 4)$. No value in $(4, 5)$ is possible. In the interval $(5, 6)$ only 3 values are possible.

Proof. The first assertion follows from [34], because $e^{-\beta F(\omega|\varphi_\rho)}$ is an index. The rest of the statement is a consequence of the further restrictions on the index values due to the occurrence of the braid group symmetry [38, 42].

3.2. The increment of the free energy between arbitrary thermal equilibrium states. Let $\sigma$ be another endomorphism localized in $W$ and $\xi_\sigma$ the cyclic separating vector for $\mathcal{A}(W)$ such that $(X\xi_\sigma, \xi_\sigma) = \varphi_\sigma(X)$, $X \in \mathcal{A}(W)$, lying in the natural cone associated with $(\mathcal{A}(W), \Omega)$. To extend the definition (3.6) of the relative free energy to the case where the initial state is an arbitrary thermal state $\varphi_\sigma$, we note first that the formal relative Hamiltonian between $\varphi_\sigma$ and $\varphi_\rho$ is $H_\rho - H_\sigma$ and hence the relative mean internal energy should be formally given by
\[ \tilde\varphi_\rho(H_\rho - H_\sigma) = \tilde\varphi_\rho(H_\rho - H_\sigma - H) = \tilde\varphi_\rho(H_\rho + H_{\bar\sigma} - H). \]
Here the conjugate charge given by $\bar\sigma = j \cdot \sigma \cdot j$ (see [22]) is localized in $W'$ and $H_{\bar\sigma} = JH_\sigma J = -H_\sigma$. These premises and the following lemma will motivate definition (3.9) below.

Lemma 3.8. We have
\[ e^{itH_{\rho\bar\sigma}} = e^{itH_\rho}e^{-itH}e^{itH_{\bar\sigma}}. \]

Proof. By the cocycle property
\[ z(\rho\bar\sigma, \Lambda_W(t)) = \rho(z(\bar\sigma, \Lambda_W(t)))\,z(\rho, \Lambda_W(t)) = z(\bar\sigma, \Lambda_W(t))\,z(\rho, \Lambda_W(t)) \]
because $z(\bar\sigma, \Lambda_W(t))$ is localized in $W'$ and $\rho$ acts identically on $\mathcal{A}(W')$, so the lemma is obtained by multiplying on the right by $e^{-itK}$ the above expression.

We thus define the relative free energy between $\varphi_\sigma$ and $\varphi_\rho$ by
\[ F(\varphi_\sigma|\varphi_\rho) = \tilde\varphi_\rho(H_{\rho\bar\sigma}) - \beta^{-1} S(\varphi_\sigma|\varphi_\rho), \tag{3.9} \]
and we give to this expression a rigorous meaning as is done for expression (3.6). Alternatively, extending the considerations in the previous subsection, we could interpret directly $\log(e^{-\beta H_{\rho\bar\sigma}}\xi_\sigma, \xi_\sigma)$ as the increment of the logarithm of the partition function between the states $\varphi_\sigma$ and $\varphi_\rho$, leading to the expression $F(\varphi_\sigma|\varphi_\rho) = -\beta^{-1}\log(e^{-\beta H_{\rho\bar\sigma}}\xi_\sigma, \xi_\sigma)$.
Theorem 3.9. The relative free energy is given by
\[ F(\varphi_\sigma|\varphi_\rho) = \frac{1}{2}\beta^{-1}\bigl(S_c(\sigma) - S_c(\rho)\bigr). \]
As a consequence, since $S_c = 2\log d$, $e^{-\beta F(\varphi_\sigma|\varphi_\rho)}$ is equal to the rational number $\frac{d(\rho)}{d(\sigma)}$.
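Lemma 3.10. Let $\rho$ be as above, $\rho'$ an endomorphism localized in $W'$ and $\varphi_{\rho'} = \omega \cdot \Phi_{\rho'}|_{\mathcal{A}(W')}$, where $\Phi_{\rho'}$ is the minimal left inverse of $\rho'$. Then
\[ \Delta(\varphi_{\rho'}|\varphi_\rho) = \frac{d(\rho)}{d(\rho')}\,e^{-2\pi K_{\rho\rho'}}, \]
where $\Delta(\cdot|\cdot)$ denotes the Connes spatial derivative.

Proof. Setting $\omega' = \omega|_{\mathcal{A}(W')}$ and using [13] one has
\begin{align*}
\Delta(\varphi_{\rho'}|\varphi_\rho)^{it} &= (D\varphi_{\rho'} : D\omega')_t\,\Delta(\omega'|\varphi_\rho)^{it} = d(\rho)^{it}(D\varphi_{\rho'} : D\omega')_t\,e^{-i2\pi t K_\rho} \\
&= d(\rho)^{it}d(\rho')^{-it}\,e^{-i2\pi t K_{\rho'}}e^{i2\pi t K}e^{-i2\pi t K_\rho} \\
&= d(\rho)^{it}d(\rho')^{-it}\,z(\rho', \Lambda_W(-2\pi t))\,z(\rho, \Lambda_W(-2\pi t))\,e^{-i2\pi t K} \\
&= d(\rho)^{it}d(\rho')^{-it}\,\rho(z(\rho', \Lambda_W(-2\pi t)))\,z(\rho, \Lambda_W(-2\pi t))\,e^{-i2\pi t K} \\
&= d(\rho)^{it}d(\rho')^{-it}\,z(\rho\rho', \Lambda_W(-2\pi t))\,e^{-i2\pi t K} \\
&= d(\rho)^{it}d(\rho')^{-it}\,e^{-i2\pi t K_{\rho\rho'}},
\end{align*}
where we have used Proposition 2.5 in our context.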
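Corollary 3.11. We have
\[ \Delta_{\xi_\sigma,\xi_\rho} = \frac{d(\rho)}{d(\sigma)}\,e^{-2\pi K_{\rho\bar\sigma}}. \]

Proof. Immediate by the above discussion and the relation $\Delta_{\xi_\sigma,\xi_\rho} = \Delta(\varphi_\sigma \cdot \mathrm{Ad}J|\varphi_\rho)$, where $\varphi_\sigma \cdot \mathrm{Ad}J$ is the vector state $(\,\cdot\,\xi_\sigma, \xi_\sigma)$ on $\mathcal{A}(W')$.

Proof of Theorem 3.9. By the above Corollary we have
\[ \beta H_{\rho\bar\sigma} = \beta a K_{\rho\bar\sigma} = -\log\Delta_{\xi_\sigma,\xi_\rho} + \frac{1}{2}\bigl(S_c(\sigma) - S_c(\rho)\bigr), \]
and this clearly implies the desired relation.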
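The formulas of Theorems 3.4 and 3.9 are easy to tabulate once the statistical dimensions are known. The following is a minimal sketch that only evaluates the stated formulas, using the relation $\log d(\rho) = \frac{1}{2}S_c(\rho)$ from the proof of Theorem 3.4; the function names and the sample values of $d$ and of the surface gravity $a$ are illustrative assumptions.

```python
import math

def conditional_entropy(d):
    """S_c(rho) = 2 log d(rho), from log d(rho) = S_c(rho) / 2."""
    return 2.0 * math.log(d)

def free_energy_from_vacuum(d_rho, beta):
    """F(omega | phi_rho) = -(1/2) beta^{-1} S_c(rho), as in Theorem 3.4."""
    return -0.5 * conditional_entropy(d_rho) / beta

def relative_free_energy(d_sigma, d_rho, beta):
    """F(phi_sigma | phi_rho) = (1/2) beta^{-1} (S_c(sigma) - S_c(rho)), as in Theorem 3.9."""
    return 0.5 * (conditional_entropy(d_sigma) - conditional_entropy(d_rho)) / beta

if __name__ == "__main__":
    a = 1.0                      # hypothetical surface gravity
    beta = 2.0 * math.pi / a     # inverse Hawking temperature
    for d in (1, 2, 3):
        print(d, free_energy_from_vacuum(d, beta))
    # Taking sigma to be the vacuum sector (d = 1) recovers Theorem 3.4.
    print(relative_free_energy(1, 3, beta) == free_energy_from_vacuum(3, beta))
```

With these formulas the relative free energy is a difference of levels, $F(\varphi_\sigma|\varphi_\rho) = F(\omega|\varphi_\rho) - F(\omega|\varphi_\sigma)$, which the sketch reproduces numerically.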
Appendix A. Tensor Categories and Cocycles

The purpose of this appendix is to shed light on part of the mathematical structure underlying our results. Indeed a good part of our results depend only on the tensor categorical structure provided by the superselection sectors and are therefore visible without a more detailed description of the theory. Let $\mathcal{T}$ be a strict C*-tensor category. We assume $(\iota, \iota) = \mathbb{C}$, where $\iota$ is the identity object and $(\cdot, \cdot)$ denotes the intertwiner space. We refer to [40] for the basic notions used here.
A basic and originating example for $\mathcal{T}$, appearing in [17, 18], is obtained by taking $\mathrm{End}(M)$, $M$ a unital C*-algebra with trivial centre, to be the set of objects, and arrows between objects $\rho$ and $\rho'$ given by
\[ (\rho, \rho') = \{T \in M,\ T\rho(x) = \rho'(x)T,\ \forall x \in M\}, \tag{A.0} \]
while the tensor product is given by the composition of maps $\rho \otimes \rho' = \rho \cdot \rho'$,
\[ T \otimes S = \rho'_2(T)S = S\rho'_1(T), \qquad T \in (\rho_1, \rho_2),\ S \in (\rho'_1, \rho'_2). \]
The reader unfamiliar with abstract tensor categories might at first focus on this particular case.

Given an object $\rho$ of $\mathcal{T}$, an object $\bar\rho$ of $\mathcal{T}$ is said to be a conjugate of $\rho$ if there exist $R_\rho \in (\iota, \bar\rho\rho)$ and $\bar R_\rho \in (\iota, \rho\bar\rho)$ such that
\[ \bar R_\rho^* \otimes 1_\rho \circ 1_\rho \otimes R_\rho = 1_\rho; \qquad R_\rho^* \otimes 1_{\bar\rho} \circ 1_{\bar\rho} \otimes \bar R_\rho = 1_{\bar\rho}. \tag{A.1} \]
We shall assume that each object $\rho$ has a conjugate $\bar\rho$ (this is automatic in $\mathrm{End}(M)$ [37] if $M$ is an infinite factor and $\rho$ has finite index) and shall refer to (A.1) as the conjugate equation for $\rho$ and $\bar\rho$. Equation (A.1) has then a standard solution, namely one can choose multiples of isometries $R_\rho$ and $\bar R_\rho$ in (A.1) so that $\|R_\rho\| = \|\bar R_\rho\| = \sqrt{d(\rho)}$ is minimal. This formula defines $d(\rho)$, the dimension of $\rho$ [40].

Now recall that given an arrow $T \in (\rho, \rho')$, the conjugate arrow $T^\bullet \in (\bar\rho, \bar\rho')$ is defined by
\[ T^\bullet = 1_{\bar\rho'} \otimes \bar R_\rho^* \circ 1_{\bar\rho'} \otimes T^* \otimes 1_{\bar\rho} \circ R_{\rho'} \otimes 1_{\bar\rho} \in (\bar\rho, \bar\rho'), \]
where $\bar R_\rho$ and $R_{\rho'}$ are multiples of isometries in the standard solution for the conjugate equations defining the conjugates $\bar\rho$ and $\bar\rho'$. The mapping $T \mapsto T^\bullet$ is antilinear and enjoys in particular the following properties:
a) $1_\rho^\bullet = 1_{\bar\rho}$,
b) $S^\bullet \circ T^\bullet = (S \circ T)^\bullet$,
c) $T^{\bullet *} = T^{* \bullet}$.

We shall say that $\alpha$ is an (anti-)automorphism of $\mathcal{T}$ if $\alpha$ is an invertible functor of $\mathcal{T}$ with itself, (anti-)linear on the arrows, commuting with the $*$-operation and preserving tensor products. The action of $\alpha$ on the object $\rho$ and on the arrow $T$ will be denoted by $\rho^\alpha$ and $T^\alpha$.

Given an automorphism $\alpha$ of $\mathcal{T}$, a cocycle $u$ with respect to $\alpha$ is a map $\rho \in \mathrm{Obj}(\mathcal{T}) \to u(\rho)$, with $u(\rho)$ a unitary in $(\rho, \rho^\alpha)$, such that
a) $u(\rho \otimes \rho') = u(\rho) \otimes u(\rho')$.
b) If $T \in (\rho, \rho')$, then the following diagram commutes:
\[ \begin{array}{ccc} \rho & \xrightarrow{\ T\ } & \rho' \\ {\scriptstyle u(\rho)}\downarrow & & \downarrow{\scriptstyle u(\rho')} \\ \rho^\alpha & \xrightarrow{\ T^\alpha\ } & \rho'^\alpha \end{array} \tag{A.2} \]
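For readers who want a concrete instance of the conjugate equation (A.1), the following toy sketch uses an assumption that is not part of the paper: the category of finite-dimensional real vector spaces, where an object is $\mathbb{R}^d$, the conjugate is again $\mathbb{R}^d$, and $R = \bar R = \sum_i e_i \otimes e_i$. All function names are hypothetical. The sketch checks that $(\bar R^* \otimes 1) \circ (1 \otimes R)$ acts as the identity and that $\|R\|^2$ equals the dimension $d$.

```python
import math

def standard_solution(d):
    """Coefficients r[i][j] of R = sum_i e_i (x) e_i in R^d (x) R^d (toy model)."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

def norm(r):
    """Hilbert space norm of a tensor given by its coefficient matrix."""
    return math.sqrt(sum(x * x for row in r for x in row))

def zigzag(v):
    """Apply (Rbar^* (x) 1) o (1 (x) R) to a real vector v."""
    d = len(v)
    r = standard_solution(d)  # R, also used as Rbar in this toy model
    # (1 (x) R) v has components t[a][b][c] = v[a] * r[b][c];
    # (Rbar^* (x) 1) contracts the first two slots against Rbar.
    return [sum(r[a][b] * v[a] * r[b][c] for a in range(d) for b in range(d))
            for c in range(d)]

if __name__ == "__main__":
    v = [1.0, -2.0, 0.5]
    print(zigzag(v))                        # the conjugate equation returns v
    print(norm(standard_solution(3)) ** 2)  # norm squared equals the dimension
```

In this toy model the dimension defined through the standard solution reduces to the ordinary vector space dimension; the sectors considered in the paper are of course not of this elementary form.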
Proposition A.1. Let $u$ be a cocycle with respect to $\alpha$ as above. Then $u(\bar\rho) = u(\rho)^\bullet$.
Proof. The proof is obtained similarly as in Proposition 1.5.
Lemma A.2. Let $u$ be a cocycle with respect to $\alpha$ as above and $j$ be an anti-automorphism of $\mathcal{T}$. Then $\rho \to u(\rho^j)^j$ is a cocycle with respect to $j\alpha j^{-1}$.

Proof. The statement is checked by a direct verification.
We now give two uniqueness results that are at the basis of the identifications of the covariance cocycle and the Connes cocycle in this paper.

Lemma A.3. With the notations in Lemma A.2, assume that $j\alpha j^{-1} = \alpha^{-1}$ and $\rho^j$ is a conjugate of $\rho$ for all objects $\rho$. Then the cocycle $u$ with respect to $\alpha$ is unique.

Proof. If $u'$ is a cocycle with respect to $\alpha$ and $\rho$ is irreducible, then $u'(\rho) = \mu(\rho)u(\rho)$ for some phase $\mu(\rho)$. By Proposition A.1 $\mu(\bar\rho) = \overline{\mu(\rho)}$, while by Lemma A.2 $\mu(\bar\rho) = \mu(\rho)$, hence $\mu = 1$ on the irreducibles, thus always because a cocycle is determined by its value on the irreducible objects.

Let now $G$ be a group and $\alpha$ an action of $G$ on $\mathcal{T}$, namely a homomorphism $g \to \alpha_g$ of $G$ into the automorphism group of $\mathcal{T}$ (for simplicity we omit topological assumptions). For any $\rho \in \mathcal{T}$ and $g \in G$, let $u(\rho, g)$ be a unitary in $(\rho^g, \rho)$ (where $\rho^g \equiv \rho^{\alpha_g}$). We shall say that $u$ is a two-variable cocycle if:
a) For any fixed $g \in G$, $u(\cdot, g)^*$ is a cocycle with respect to the automorphism $\alpha_g$.
b) For any fixed $\rho \in \mathcal{T}$, $u(\rho, \cdot)$ is an $\alpha$-cocycle, namely $u(\rho, gh) = u(\rho, g)u(\rho, h)^{\alpha_g}$.

Proposition A.4. Let $u$ be a two-variable cocycle as above. If $G$ is perfect (i.e. $G$ has no non-trivial one-dimensional unitary representation), then $u$ is unique.

Proof. As in the proof of Lemma A.3, if $\rho$ is an irreducible object, a second two-variable cocycle would give rise to a phase $\mu(\rho, g)$ that, for a fixed $\rho$, would be a one-dimensional character of $G$, and thus would have to be trivial.

Appendix B. The Relative Free Energy at Finite Volume

A finite volume computation with canonical distribution may clarify the notion of relative free energy $F$ in (3.6). Let the Hamiltonians of the evolutions $\alpha^{(0)}$ and $\alpha^{(1)}$ be given by positive selfadjoint operators $H_0$ and $H_1$, so that $\alpha^{(k)}$ is implemented by $e^{itH_k}$. The Gibbs state $\omega_\beta^{(k)}$ for $\alpha^{(k)}$ is given by $\omega_\beta^{(k)} = \mathrm{Tr}(\rho_k\,\cdot\,)$ with density matrix
\[ \rho_k = Z_k(\beta)^{-1}e^{-\beta H_k}, \]
where $Z_k(\beta) = \mathrm{Tr}(e^{-\beta H_k})$ is the partition function. Then $F_k = \omega_k(H_k) - \beta^{-1}S(\rho_k) = -\beta^{-1}\log Z_k(\beta)$ is the Helmholtz free energy in the state $\omega^{(k)}$, where the entropy in state $\omega^{(k)}$ is given by
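\[ S(\rho_k) = -\mathrm{Tr}(\rho_k\log\rho_k). \]
Then the relative entropy is given by (see [50])
\begin{align*}
S(\omega_\beta^{(0)}|\omega_\beta^{(1)}) &= -\mathrm{Tr}(\rho_0\log\rho_0 - \rho_0\log\rho_1) = -\omega_\beta^{(0)}(\log\rho_0 - \log\rho_1) \\
&= \beta\,\omega_\beta^{(0)}(H_{\mathrm{rel}}) + \log Z_0(\beta) - \log Z_1(\beta),
\end{align*}
where $H_{\mathrm{rel}} = H_0 - H_1$ is the relative Hamiltonian. The relative free energy is thus given by
\begin{align*}
F(\omega_\beta^{(0)}|\omega_\beta^{(1)}) &= \omega_\beta^{(0)}(H_{\mathrm{rel}}) - \beta^{-1}S(\omega_\beta^{(0)}|\omega_\beta^{(1)}) \\
&= \beta^{-1}\log Z_0(\beta) - \beta^{-1}\log Z_1(\beta) = F_1 - F_0.
\end{align*}
Had we considered a grand canonical distribution on the Fock space, the evolution $\alpha_t^{(k)}$ would have been implemented by $e^{it(H_k - \mu_k N_k)}$, with $\mu_k$ the chemical potential and $N_k$ the number operator, and the above expression for $F_k$ would have been accordingly modified. We note explicitly that
\[ e^{-\beta F(\omega_\beta^{(0)}|\omega_\beta^{(1)})} = \frac{\mathrm{Tr}(e^{-\beta H_1})}{\mathrm{Tr}(e^{-\beta H_0})}, \]
providing evidence for the analogy between formulae (0.1) and (0.2).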
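The finite volume identities can be checked numerically. The following minimal sketch makes a simplifying assumption that is not in the text: $H_0$ and $H_1$ are diagonal, so each is represented by its list of eigenvalues and traces become sums. It verifies that $\omega_k(H_k) - \beta^{-1}S(\rho_k)$ agrees with $-\beta^{-1}\log Z_k(\beta)$ and that $e^{-\beta(F_1 - F_0)}$ equals the ratio $\mathrm{Tr}(e^{-\beta H_1})/\mathrm{Tr}(e^{-\beta H_0})$. The spectra and function names are illustrative.

```python
import math

def partition_function(energies, beta):
    """Z(beta) = Tr exp(-beta H) for a diagonal H with the given eigenvalues."""
    return sum(math.exp(-beta * e) for e in energies)

def gibbs_weights(energies, beta):
    """Eigenvalues of the density matrix rho = exp(-beta H) / Z(beta)."""
    z = partition_function(energies, beta)
    return [math.exp(-beta * e) / z for e in energies]

def helmholtz_from_state(energies, beta):
    """omega(H) - S(rho) / beta, computed directly from the Gibbs weights."""
    p = gibbs_weights(energies, beta)
    mean_energy = sum(pi * e for pi, e in zip(p, energies))
    entropy = -sum(pi * math.log(pi) for pi in p)
    return mean_energy - entropy / beta

def helmholtz_from_z(energies, beta):
    """-log Z(beta) / beta."""
    return -math.log(partition_function(energies, beta)) / beta

if __name__ == "__main__":
    beta = 2.0 * math.pi
    h0 = [0.0, 0.4, 1.1]   # hypothetical spectrum of H_0
    h1 = [0.1, 0.3, 0.9]   # hypothetical spectrum of H_1
    f0, f1 = helmholtz_from_z(h0, beta), helmholtz_from_z(h1, beta)
    print(helmholtz_from_state(h0, beta) - f0)   # should vanish up to rounding
    print(math.exp(-beta * (f1 - f0)),
          partition_function(h1, beta) / partition_function(h0, beta))
```

Non-commuting Hamiltonians would require genuine matrix exponentials and logarithms; the two identities checked here do not depend on that generalization.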
Final Comments

As mentioned, the Rindler space-time is a good approximation of the Schwarzschild space-time only near the horizon. However a version of our results within the context of the Kruskal extension of the Schwarzschild space-time should be possible, as a version of the Bisognano–Wichmann theorem and a model independent derivation of the Hawking temperature has been given in this setting [44].

Another point to comment on is related to the use of the Minkowski vacuum associated with Poincaré symmetries. As is known, there exists no vacuum state for a quantum field theory on a general curved space-time. In such a general theory the relative free energy could however be defined by consistency with the fusion rules of the superselection structure and it seems that our results may be achieved in wider contexts.

Concerning the expression (3.6) for the relative free energy, it would be physically meaningful to derive it by a finite volume approximation, where its expression is given in Appendix B. To this end one should use the split property and the Noether currents, see [10], and this approach might also be useful for the above-discussed extension to more general curved space-times. Moreover the development of such techniques could bring up a model independent derivation of the formula (0.1).

Acknowledgement. It is a pleasure to thank Claudio D'Antoni, Bernard Kay, John E. Roberts and Rainer Verch for various conversations.
References

1. Araki, H.: Relative Hamiltonians for faithful normal states of a von Neumann algebra. Pub. R.I.M.S., Kyoto Univ. 9, 165–209 (1973)
2. Araki, H., Haag, R., Kastler, D., Takesaki, M.: Extensions of KMS states and chemical potential. Commun. Math. Phys. 53, 97–134 (1977)
3. Bardeen, J.M., Carter, B., Hawking, S.W.: The four laws of black hole mechanics. Commun. Math. Phys. 31, 161 (1973)
4. Bekenstein, J.D.: Black holes and entropy. Phys. Rev. D7, 2333 (1973)
5. Bisognano, J., Wichmann, E.: On the duality condition for a Hermitian scalar field. J. Math. Phys. 16, 985–1007 (1975) and J. Math. Phys. 17, 303–321 (1976)
6. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics, II. Berlin-Heidelberg-New York: Springer-Verlag, 1981
7. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
8. Brunetti, R., Guido, D., Longo, R.: Group cohomology, modular theory and spacetime symmetries. Rev. Math. Phys. 7, 57–71 (1995)
9. Buchholz, D., D'Antoni, C., Longo, R.: Nuclear maps and modular structures. I. J. Funct. Anal. 88, 223–250 (1990), II. Commun. Math. Phys. 129, 115–13 (1990)
10. Buchholz, D., Doplicher, S., Longo, R.: On Noether's theorem in quantum field theory. Ann. Phys. 170, 1–17 (1986)
11. Buchholz, D., Fredenhagen, K.: Locality and structure of particle states. Commun. Math. Phys. 84, 1–54 (1982)
12. Connes, A.: Une classification des facteurs de type III. Ann. Sci. Ec. Norm. Sup. 6, 133–252 (1973)
13. Connes, A.: On a spatial theory of von Neumann algebras. J. Funct. Anal. 35, 153–164 (1980)
14. Connes, A.: Entropie de Kolmogoroff-Sinai et mécanique statistique quantique. C. R. Acad. Sci. Paris Sér. I 301, 1–6 (1985)
15. Connes, A., Størmer, E.: Entropy for automorphisms of II$_1$ von Neumann algebras. Acta Math. 134, 288–306 (1975)
16. Davies, P.C.W.: Scalar particle production in Schwarzschild and Rindler metrics. J. Phys. A8, 608 (1975)
17. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics I. Commun. Math. Phys. 23, 199–230 (1971), II. Commun. Math. Phys. 35, 49–85 (1974)
18. Doplicher, S., Roberts, J.E.: Endomorphisms of C*-algebras, crossed products and duality for compact groups. Ann. Math. 170, 75 (1989)
19. Evans, D., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. In press
20. Fredenhagen, K., Jörß, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansion. Commun. Math. Phys. 176, 541 (1996)
21. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
22. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
23. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
24. Haag, R.: Local Quantum Physics. Heidelberg-Berlin: Springer Verlag, 1992
25. Haag, R., Hugenholtz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215 (1967)
26. Haag, R., Kastler, D.: An algebraic approach to Quantum Field Theory. J. Math. Phys. 5, 848–861 (1964)
27. Hawking, S.W.: Particle creation by black holes. Commun. Math. Phys. 43, 199 (1975)
28. Hiai, F.: Minimum index for subfactors and entropy. II. J. Math. Soc. Japan 43, 673–678 (1991)
29. Hislop, P.D., Longo, R.: Modular structure of the von Neumann algebras associated with the free scalar massless field theory. Commun. Math. Phys. 84, 71–85 (1982)
30. Kac, V.G., Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. in Math. 70, 156–236 (1988)
31. Kay, B., Wald, R.: Some recent developments related to the Hawking effect. In: Proc. Int. Conf. Diff. Geom. Meth. in Theor. Phys., Doebner and Hennig (eds.) Singapore: World Scientific, 1987
32. Kosaki, H.: Extension of Jones' theory on index to arbitrary subfactors. J. Funct. Anal. 66, 123–140 (1986)
33. Izumi, M.: Canonical extension of endomorphisms of factors. In: "Subfactors", Proc. Taniguchi Symposium, Araki, Kawahigashi, Kosaki (eds.), Singapore: World Scientific, 1994
Kac–Wakimoto Formula and Black Hole Conditional Entropy
479
34. Jones, V.R.F.: Index for subfactors. Invent. Math. 72, 1–25 (1983) 35. Longo, R.: Algebraic and modular structure of von Neumann algebras of Physics. Proc. Symp. Pure Math. 38, Part 2, 551 (1982) 36. Longo, R.: Index of subfactors and statistics of quantum fields. I . Commun. Math. Phys. 126, 217–247 (1989) 37. Longo, R.: Index of subfactors and statistics of quantum fields. II. Correspondences, braid group statistics and Jones polynomial. Commun. Math. Phys. 130, 285–309 (1990) 38. Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992) 39. Longo, R.: Von Neumann algebras and quantum field theory. Proceedings of the ICM, Z¨urich 1994, Basel, Switzerland: Birkh¨auser Verlag, 1995 40. Longo, R., Roberts, J.E.: A theory of dimension. K-Theory 11, 103–159 (1997) 41. Pimsner, M., Popa, M.: Entropy and index for subfactors. Ann. Sci. Ec. Norm. Sup. 19, 57–106 (1986) 42. Rehren, K.H.: On the range of the index of subfactors. J. Funct. Anal. 134, 183–193 (1995) 43. Schroer, B.: Some useful properties of rotational Gibbs states in chiral conformal QFT. Manuscript, 1995 44. Sewell, G.L.: Quantum fields on manifolds: PCT and gravitationally induced thermal states. Ann. Phys. 141, 201 (1982) 45. Sewell, G.L.: On the generalised second low of thermodynamics. Phys. Lett. 122A, 309–311 (1987) 46. Summers, S., Verch, R.: Modular inclusion, the Hawking temperature and Quantum Field Theory in curved space-time. Lett. Math. Phys. 37, 145–158 (1996) 47. Streater, R.F., Wightman, A.S.: PCT, Spin and Statistics, and all that. Reading, MA: Benjamin, 1964 48. Takesaki, M.: Tomita theory of modular Hilbert algebras. Lect. Notes in Math. 128, New York-HeidelbergBerlin: Springer Verlag, 1970 49. Takesaki, M.: Conditional expectation in von Neumann algebras. J. Funct. Anal. 9, 306–321 (1972) 50. Thirring, W.: A Course in Mathematical Physics. Vol. 4, New York-Heidelberg-Berlin: Springer Verlag, 1981 51. Unruh, W.G.: Notes on black hole evaporation. Phys. Rev. 
D14, 870–892 (1976) 52. Wald, R.M.: Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago: Univ. Chicago Press, 1994 53. Wick, G.C., Wightman, A.S., Wigner, E.P.: The intrinsic parity of elementary particles. Phys. Rev. 88, 101–105 (1952), 101-105 54. Wiesbrock, H.W.: Superselection structure and localized cocycles. Rev. Math. Phys. 7, 127 (1995) 55. Bertozzini, P., Conti, R., Longo, R.: Covariant sectors with infinite dimension and positivity of the energy. - preprint. Communicated by A. Connes
Commun. Math. Phys. 186, 481 – 493 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Some Schrödinger Operators with Power-Decaying Potentials and Pure Point Spectrum
Christian Remling*
Universität Osnabrück, Fachbereich Mathematik/Informatik, D-49069 Osnabrück, Germany. E-mail: [email protected]
Received: 8 November 1996 / Accepted: 8 January 1997
Abstract: We construct (deterministic) potentials $V(x) = O(x^{-c})$ such that the Schrödinger equation $-y'' + Vy = Ey$ on $x \in [0, \infty)$ has dense pure point spectrum in $(0, \infty)$ for almost all boundary conditions at $x = 0$. As a by-product, we also obtain power-decaying potentials for which the spectrum is purely singular continuous on $(0, \infty)$ for all boundary conditions.
1. Introduction
In this paper, we will construct power-decaying potentials $V(x) = O(x^{-c})$ ($c > 0$) such that the Schrödinger equation for positive energies $E = k^2$,
$$-y''(x) + V(x)y(x) = k^2 y(x) \tag{1}$$
on the half-axis $x \in [0, \infty)$ has square integrable solutions for almost all $k > 0$ with respect to Lebesgue measure. By well-known arguments (using spectral averaging, see e.g. [16]), this implies that the corresponding Hamiltonian $H_\alpha = -\frac{d^2}{dx^2} + V(x)$ with boundary condition $y(0)\cos\alpha + y'(0)\sin\alpha = 0$ has dense pure point spectrum in $(0, \infty)$ (i.e. $\sigma_{ess} = [0, \infty)$, $\sigma_c(H_\alpha) = \emptyset$) for almost all $\alpha \in [0, \pi)$. As far as I am aware, this is even the first explicit example of a Schrödinger operator with this spectral type and $V \to 0$.
The existence of such potentials, however, is not a new result: The work on decaying random potentials [8, 14] has shown that, roughly speaking, random potentials decaying as $V_\omega(x) \sim x^{-c}$ with $c < 1/2$ almost surely lead to dense pure point spectrum if the values of the potential at different sites are independent with zero expectation. (In fact, $c = 1/2$ is generally believed to be the borderline between absolutely continuous and singular spectrum in the sense that $V(x) = O(x^{-c})$ with $c > 1/2$ should imply $\sigma_{ac} = [0, \infty)$. Kiselev has recently proved this for $c > 2/3$ [6], but in the region $1/2 < c \le 2/3$, this question is open.) There is, of course, still much point in looking for deterministic examples, since the proofs in the random case strongly depend on averaging procedures and independence assumptions and therefore do not give much insight into the mechanisms with which a fixed potential generates point spectrum.
The first deterministic potentials with pure point spectrum were constructed in [3, 5]. However, these examples rely on tunneling through high barriers and therefore $\limsup V(x) = \infty$. Later, these methods were extended to handle the case of wide barriers (see [19] and also [5] where results of this type are stated without proof), thereby obtaining potentials with $\limsup V < \infty$ and pure point spectrum in $(\liminf V, \limsup V)$ (a typical example is $V(x) = \cos x^\gamma$ ($0 < \gamma < 1$); see also [17]). Although not directly related to this work, I should also mention the construction of potentials with dense point spectrum and $|V(x)| \le f(x)/x$ ($f \to \infty$ arbitrarily slowly) in [10, 15]. However, here one has $\sigma_{ac} = [0, \infty)$.
As an almost immediate consequence of our construction, we will also obtain power-decaying potentials with purely singular continuous spectrum in $(0, \infty)$ for all boundary conditions. Apparently, only one other class of decaying potentials with this spectral behavior is known at present, namely the type of sparse potentials introduced by Pearson in his classic [11], and although the conditions on the barrier separations could be considerably weakened recently [7], the decay of these potentials is still slower than (say) $(\ln x)^{-1/2}$.
The main tools we will use in this paper are the (modified) Prüfer transformation and transfer matrices. In some respects, our general strategy will be similar to the proof of [3, Theorem 2]. For other papers discussing some of the techniques used here, see, for instance, [7, 8, 10, 12].
* Current address (until May 31, 1997): 253-37 Department of Mathematics, California Institute of Technology, Pasadena, CA 91125, USA
This paper is organized as follows. In Sect. 2, we describe the model and state the result we will prove. We state some preliminary observations and explain the general strategy in Sect. 3. The technical details are presented in the following three sections. We conclude with the modification which yields singular continuous spectrum. I would like to thank A. Gordon, B. Simon and T. Wolff for useful discussions, the Deutsche Forschungsgemeinschaft for financial support, and I am grateful for the hospitality of Caltech where this work was done.
2. Description of the Model
Let $L_n \in \mathbb{N}$, $g_n > 0$, $l_n \ge 0$, and let $W(x)$ be a bounded, measurable and periodic function with period $p$. Define
$$a_1 = 0, \qquad b_n = a_n + p\,\frac{L_n}{g_n}, \qquad a_{n+1} = b_n + l_n \quad (n \ge 1).$$
The potential we will study is given by
$$V(x) = \begin{cases} W_{g_n}(x - a_n) & \text{if } x \in (a_n, b_n), \\ 0 & \text{if } x \in (b_n, a_{n+1}), \end{cases}$$
where $W_g$ is the rescaled function $W_g(x) = g^2 W(gx)$.
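The definition above can be turned into a short numerical sketch. The following Python code is only an illustration under stated assumptions: the function names (`step_W`, `breakpoints`, `potential`) are hypothetical, and the profile `step_W` is the particular step function with period $p = 2$ that is fixed only later, in Sect. 4; any bounded periodic $W$ could be passed instead.

```python
def step_W(x):
    """Illustrative periodic profile with period 2: 0 on (0, 1), 1 on (1, 2)."""
    return 1.0 if (x % 2.0) >= 1.0 else 0.0


def breakpoints(L, g, l, p=2.0):
    """Return (a, b) with a_1 = 0, b_n = a_n + p*L_n/g_n, a_{n+1} = b_n + l_n.

    Python lists are 0-based, so a[0] plays the role of a_1.
    """
    a, b = [0.0], []
    for Ln, gn, ln in zip(L, g, l):
        b.append(a[-1] + p * Ln / gn)
        a.append(b[-1] + ln)
    return a, b


def potential(x, L, g, l, W=step_W, p=2.0):
    """V(x) = g_n^2 * W(g_n * (x - a_n)) on (a_n, b_n), and 0 on the free pieces."""
    a, b = breakpoints(L, g, l, p)
    for n in range(len(b)):
        if a[n] < x < b[n]:
            return g[n] ** 2 * W(g[n] * (x - a[n]))
    return 0.0
```

For instance, with `L = [2, 3]`, `g = [1.0, 0.5]`, `l = [0.5, 0.25]` the second block consists of three periods of the profile stretched by a factor 2 and damped by a factor 1/4, which is the mechanism behind the decay of $V$.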
Theorem 2.1. $L_n$, $g_n$, $l_n$ and $W(x)$ can be chosen such that:
1) $V(x) = O(x^{-c})$ for some $c > 0$,
2) For almost all $k > 0$, (1) has an $L^2$-solution.
It follows from $V \to 0$ that $\sigma_{ess} = [0, \infty)$. Moreover, as pointed out above, 2) implies that for almost all boundary conditions $\alpha$, the corresponding half-line operator $H_\alpha$ has no continuous spectrum, hence it must have dense pure point spectrum in $(0, \infty)$. Note also that by recently discovered general principles on the instability of point spectrum [1], there is a dense $G_\delta$ set of boundary conditions for which the spectrum is purely singular continuous in $(0, \infty)$!
3. Preliminary Observations
Write
$$Y(x, k) := \begin{pmatrix} y(x, k) \\ y'(x, k)/k \end{pmatrix},$$
where $y$ is a solution of (1) satisfying an initial condition of the form $y(0, k) = -\sin\alpha$, $y'(0, k) = \cos\alpha$. We will use the modified Prüfer variables $R$, $\varphi$ defined by the equation
$$Y(x, k) = R(x, k) \begin{pmatrix} \sin\varphi(x, k) \\ \cos\varphi(x, k) \end{pmatrix},$$
and by requiring that $\varphi$ be continuous and $R > 0$. The transfer matrix $T(t, s; k)$ is defined by the property that it takes $Y(s, k)$ to $Y(t, k)$, i.e. $Y(t, k) = T(t, s; k)Y(s, k)$. Hence it is given by
$$T(t, s; k) = \begin{pmatrix} u(t, k) & k v(t, k) \\ u'(t, k)/k & v'(t, k) \end{pmatrix}, \tag{2}$$
where $u$, $v$ are the solutions of (1) with $u(s) = v'(s) = 1$, $u'(s) = v(s) = 0$. If the potential is periodic, then obviously $T(t + p, s + p; k) = T(t, s; k)$ (this observation is roughly one half of what is usually called Floquet theory). Abbreviating $T(k) := T_W(p, 0; k)$ (the transfer matrix over one period for the potential $W$), we get the transfer matrix for the potential $V$ described in the preceding section:
Lemma 3.1.
$$T(a_n + m_2 p g_n^{-1}, a_n + m_1 p g_n^{-1}; k) = T(k g_n^{-1})^{m_2 - m_1}, \qquad m_i \in \{0, 1, \dots, L_n\}.$$
Proof. By "Floquet theory", it suffices to prove the assertion for $m_1 = 0$, $m_2 = 1$. Moreover, $T(a_n + p g_n^{-1}, a_n; k) = \hat{T}(p g_n^{-1}, 0; k)$, where $\hat{T}$ is the transfer matrix of $-y'' + W_{g_n}(x)y = k^2 y$.
For a function $y(x)$ and $g > 0$, let $w(t) := y(t/g)$. Then we have: $y$ solves $-\frac{d^2 y}{dx^2} + W_g(x)y = k^2 y$ and satisfies $(y, dy/dx)(0) = (a, b)$ if and only if $w$ solves $-\frac{d^2 w}{dt^2} + W(t)w = (k/g)^2 w$ and satisfies $(w, dw/dt)(0) = (a, g^{-1}b)$. Hence, if $u_g$, $v_g$ denote the solutions of $-y'' + W_g y = k^2 y$ with $u_g(0) = v_g'(0) = 1$, $u_g'(0) = v_g(0) = 0$, then
$$u_g(x, k) = u_1(gx, k/g), \qquad v_g(x, k) = g^{-1} v_1(gx, k/g),$$
$$u_g'(x, k) = g u_1'(gx, k/g), \qquad v_g'(x, k) = v_1'(gx, k/g).$$
Now the assertion follows from (2).
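The scaling identity of Lemma 3.1 can be checked numerically. The sketch below is an illustration only: it assumes a piecewise constant profile ($W = 0$ on $(0,1)$, $W = 1$ on $(1,2)$, $p = 2$, the choice made later in Sect. 4), for which the transfer matrix over each constant piece is elementary. The helper names `segment` and `one_period` are hypothetical.

```python
import cmath

import numpy as np


def segment(k, c, length):
    """Transfer matrix of -y'' + c*y = k^2*y over `length`, acting on (y, y'/k)."""
    kappa = cmath.sqrt(k * k - c)  # purely imaginary below the barrier (k^2 < c)
    if abs(kappa) < 1e-14:  # degenerate case k^2 = c: u = 1, v = x
        return np.array([[1.0, k * length], [0.0, 1.0]])
    co, si = cmath.cos(kappa * length), cmath.sin(kappa * length)
    # All four entries are real even for imaginary kappa; drop the zero imaginary part.
    return np.array([[co, k * si / kappa], [-kappa * si / k, co]]).real


def one_period(k, g=1.0):
    """One period of W_g(x) = g^2 W(g x): free piece of length 1/g, then barrier g^2."""
    return segment(k, g * g, 1.0 / g) @ segment(k, 0.0, 1.0 / g)
```

With these helpers, `one_period(k, g)` and `one_period(k / g, 1.0)` agree up to rounding, which is the statement $\hat{T}(p g^{-1}, 0; k) = T(k/g)$ used in the proof; powers of the matrix then give the transfer matrix over several periods.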
Next, we recall some basic facts about periodic potentials; for further background information on this topic, please consult [2]. Since $\det T(k) = 1$ by constancy of the Wronskian, the eigenvalues of $T$ are determined by its trace $D(k) := \operatorname{tr} T(k) = u(p, k) + v'(p, k)$ ($u$, $v$ solve $-y'' + Wy = k^2 y$ and $u(0) = v'(0) = 1$, $u'(0) = v(0) = 0$). More explicitly, the eigenvalues are $\mu$, $\mu^{-1}$ with
$$\mu(k) = \frac{D(k)}{2} \pm \sqrt{\frac{D(k)^2}{4} - 1}. \tag{3}$$
We choose the sign so that $|\mu| \ge 1$. $\mu$ is real and larger than 1 in magnitude if and only if $|D| > 2$; if $|D| < 2$, then $|\mu| = 1$, $\mu \ne \pm 1$. In either case, the eigenvalues are distinct and hence $T$ is diagonalizable. Let $U(k)$ be a diagonalizing matrix, i.e.
$$U^{-1}(k) T(k) U(k) = \begin{pmatrix} \mu(k) & 0 \\ 0 & \mu^{-1}(k) \end{pmatrix}.$$
This representation leads to a simple estimate on the norm of the transfer matrix. In the following lemma, we use the $l^2$-norm $\|(v_1, v_2)^t\|^2 = |v_1|^2 + |v_2|^2$.
Lemma 3.2. a) If $|D(k)| < 2$, then
$$\|T(k)^m v\| \ge \frac{\|v\|}{\|U(k)\| \, \|U^{-1}(k)\|}.$$
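b) If $|D(k)| > 2$, then there is an angle $\beta(k) \in [0, \pi)$ such that $|\varphi - \beta(k) - n\pi| \ge \epsilon$ $\forall n \in \mathbb{Z}$ implies
$$\|T(k)^m e_\varphi\| \ge \frac{2\epsilon}{\pi} |\mu(k)|^m$$
(where $e_\varphi = (\sin\varphi, \cos\varphi)^t$).
Proof. Obviously,
$$T(k)^m = U(k) \begin{pmatrix} \mu(k)^m & 0 \\ 0 & \mu(k)^{-m} \end{pmatrix} U^{-1}(k). \tag{4}$$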
Hence $\|T^{-m} w\| \le |\mu|^m \|U\| \, \|U^{-1}\| \, \|w\|$, which implies a) (take $w = T^m v$) since $|\mu| = 1$ in this case. In order to prove b), notice that because $T(k)$ is real and has real, distinct eigenvalues if $|D(k)| > 2$, we may take $U$ in the form
$$U = \begin{pmatrix} \sin\alpha & \sin\beta \\ \cos\alpha & \cos\beta \end{pmatrix}$$
with $\alpha, \beta \in [0, \pi)$, $\alpha \ne \beta$. Now an elementary calculation using (4) leads to (writing $\gamma = \alpha - \beta$)
$$\|T^m e_\varphi\|^2 = \sin^{-2}\gamma \left[ \mu^{2m} \sin^2(\varphi - \beta) + \mu^{-2m} \sin^2(\varphi - \alpha) - 2\cos\gamma \sin(\varphi - \beta)\sin(\varphi - \alpha) \right].$$
Viewing the right-hand side as a (quadratic) function of $\sin(\varphi - \alpha)$ and determining the minimum shows $\|T^m e_\varphi\|^2 \ge \mu^{2m} \sin^2(\varphi - \beta)$. This yields the desired estimate, since $|\sin x| \ge 2|x|/\pi$ if $|x| \le \pi/2$.
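The lower bound obtained in the proof of Lemma 3.2b) is easy to test numerically. The following sketch is illustrative only: it builds a hyperbolic matrix from freely chosen angles $\alpha$, $\beta$ and an eigenvalue $\mu > 1$ (these sample values, and the function names, are assumptions and do not come from a specific potential) and compares $\|T^m e_\varphi\|$ with $|\mu|^m |\sin(\varphi - \beta)|$.

```python
import numpy as np


def hyperbolic_T(alpha, beta, mu):
    """Real matrix with eigenvector e_alpha (eigenvalue mu) and e_beta (eigenvalue 1/mu)."""
    U = np.array([[np.sin(alpha), np.sin(beta)],
                  [np.cos(alpha), np.cos(beta)]])
    return U @ np.diag([mu, 1.0 / mu]) @ np.linalg.inv(U)


def e(phi):
    """Unit vector e_phi = (sin phi, cos phi)^t."""
    return np.array([np.sin(phi), np.cos(phi)])


def growth(T, phi, m):
    """Euclidean norm of T^m e_phi."""
    return np.linalg.norm(np.linalg.matrix_power(T, m) @ e(phi))
```

For phases $\varphi$ that stay at distance at least $\epsilon$ (modulo $\pi$) from $\beta$, the value `growth(T, phi, m)` should not fall below `(2 * eps / np.pi) * mu**m`, in line with the lemma; only phases near the contracting direction $\beta$ escape the exponential growth.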
Now we are in a position to explain the general idea of our construction. By Lemma 3.1, the transfer matrix for $x \in (a_n, b_n)$ is given by powers of $T(k/g_n)$. We will take a function $W$ with infinitely many "gaps" (i.e. intervals with $|D| > 2$), and we will choose the $g_n$ such that for every $k$, $k/g_n$ is in some gap for sufficiently many $n$. Then Lemma 3.2b) shows that for those $n$ the solution grows by a factor $|\mu|^{L_n}$ unless the phase $\varphi(a_n)$ is close to $\beta$. If $k/g_n$ is not in a gap, then the solution can be controlled with the aid of Lemma 3.2a). The point of this argument is that the estimate of Lemma 3.2a) is independent of $m$. Therefore, after everything else has been chosen, we can take the $L_n$ sufficiently large to obtain increasing solutions. Of course, one then needs an additional argument to deduce the existence of $L^2$-solutions. To this end, we will use a Ruelle type Theorem (see [13]) recently proved in [9].
We will make no attempt to optimize the exponent $c$ from Theorem 2.1 because it does not seem possible to reach the critical value 1/2. In particular, we will sometimes use rather crude estimates when this does not destroy the power decay.
The following lemma allows us to adjust the phases $\varphi(a_n, k)$ if necessary. For $k > 0$ with $|D(k)| > 2$, define $S_\epsilon(k) = (\beta(k) - \epsilon, \beta(k) + \epsilon)$ with $\beta(k)$ from Lemma 3.2b).
Lemma 3.3. Suppose $V(x)$ has been chosen for $x \le b_n$. Let $M$ be a measurable subset of $\{k \in [k_1, k_2] : |D(k/g_{n+1})| > 2\}$. Then there is an $l_n \in [0, \pi/k_2]$ such that
$$|\{k \in M : \varphi(a_{n+1}, k) \in S_\epsilon(k/g_{n+1})\}| \le \frac{2\epsilon k_2}{\pi k_1} |M|.$$
Of course, the phase $\varphi$ has to be evaluated modulo $\pi$ here.
Proof. First note that since $V = 0$ on $(b_n, a_{n+1})$, we have $\varphi(a_{n+1}, k) = \varphi(b_n, k) + (a_{n+1} - b_n)k = \varphi(k) + l_n k$ (using the abbreviation $\varphi(k) = \varphi(b_n, k)$). Denote the set defined in the lemma by $J(l_n)$ and integrate $|J(l_n)|$ with respect to $dl_n$. Fubini-Tonelli shows
$$\int_0^{\pi/k_2} |J(l_n)| \, dl_n = \int_M dk \int_0^{\pi/k_2} dl_n \, \chi_{S_\epsilon(k/g_{n+1})}(\varphi(k) + k l_n).$$
($\varphi$, $\beta$ are continuous in $k$, so there is certainly no problem with measurability.) $\chi_S$ denotes the characteristic function of the set $S$. The second integral on the right-hand side can be estimated by $k_1^{-1} 2\epsilon$, because $\varphi(k) + lk$ ($0 \le l < \pi/k_2$) takes on every value in $[0, \pi)$ at most once. Hence $\int_0^{\pi/k_2} |J(l_n)| \, dl_n \le k_1^{-1} |M| 2\epsilon$, and therefore the claimed inequality must hold for some $l_n$.
4. Asymptotics
According to the above remarks, we need to study the asymptotics of $D(k)$ as $k \to \infty$. Since the higher order terms in this asymptotic expansion depend in a complicated way on the Fourier coefficients and also on higher moments of $W$ such as $\int_0^p W(x) x e^{in\pi x/p} \, dx$ for a general $W$ (cf. [2]), we now specialize to
$$W(x) = \begin{cases} 0 & (0 < x < 1) \\ 1 & (1 < x < 2) \end{cases}, \qquad p = 2.$$
Then an elementary calculation shows
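$$T(k) = \begin{pmatrix} \cos\omega\cos k - k\omega^{-1}\sin\omega\sin k & \cos\omega\sin k + k\omega^{-1}\sin\omega\cos k \\ -\cos\omega\sin k - \omega k^{-1}\sin\omega\cos k & \cos\omega\cos k - \omega k^{-1}\sin\omega\sin k \end{pmatrix},$$
where $\omega^2 = k^2 - 1$. In particular,
$$D(k) = 2\cos(\omega + k) - \frac{(k - \omega)^2}{\omega k} \sin\omega\sin k.$$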
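The "elementary calculation" can be reproduced as a product of two elementary transfer matrices, one for the free piece $(0, 1)$ and one for the barrier $(1, 2)$. The sketch below is an illustration, restricted to $k > 1$ so that $\omega$ is real; the function names are hypothetical.

```python
import math

import numpy as np


def T_closed(k):
    """Closed form of T(k) as displayed above (requires k > 1)."""
    w = math.sqrt(k * k - 1.0)
    return np.array([
        [math.cos(w) * math.cos(k) - (k / w) * math.sin(w) * math.sin(k),
         math.cos(w) * math.sin(k) + (k / w) * math.sin(w) * math.cos(k)],
        [-math.cos(w) * math.sin(k) - (w / k) * math.sin(w) * math.cos(k),
         math.cos(w) * math.cos(k) - (w / k) * math.sin(w) * math.sin(k)],
    ])


def T_product(k):
    """Barrier piece (W = 1 on (1, 2)) applied after the free piece (W = 0 on (0, 1))."""
    w = math.sqrt(k * k - 1.0)
    free = np.array([[math.cos(k), math.sin(k)],
                     [-math.sin(k), math.cos(k)]])
    barrier = np.array([[math.cos(w), (k / w) * math.sin(w)],
                        [-(w / k) * math.sin(w), math.cos(w)]])
    return barrier @ free


def D_closed(k):
    """Trace formula D(k) = 2 cos(w + k) - (k - w)^2 / (w k) * sin(w) sin(k)."""
    w = math.sqrt(k * k - 1.0)
    return 2.0 * math.cos(w + k) - (k - w) ** 2 / (w * k) * math.sin(w) * math.sin(k)
```

Both constructions give the same matrix, its determinant is 1, and its trace matches `D_closed(k)`. Evaluating `D_closed` at $k = 11\pi/2 + (22\pi)^{-1}$ gives a value of modulus slightly above 2 (a gap), whereas a point such as $k = 16.5$ lies well inside a band.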
The computations in [2, Sect. 4.5] imply that $|D(k)| > 2$ can only hold if $k = n\pi/2 + (2n\pi)^{-1} + \delta$ with $n \in \mathbb{N}$ and $\delta = O(n^{-2})$. For these $k$, a Taylor expansion leads to (we omit this lengthy, but completely elementary computation)
$$D(k) = -2 + 4\delta^2 - \frac{4}{n^4\pi^4} + O(n^{-6}) \tag{5}$$
if $n$ is odd and
$$D(k) = 2 - 4\delta^2 - \frac{8\delta^2}{n^2\pi^2} + \frac{1}{n^6\pi^6} + O(n^{-7}) \tag{6}$$
if $n$ is even. With these formulae, we can now show
Lemma 4.1. Let $k_n = n\pi/2 + (2n\pi)^{-1}$ ($n \in \mathbb{N}$) and
$$\delta_n = \begin{cases} (n\pi)^{-2} & (n \text{ odd}) \\ 2^{-1}(n\pi)^{-3} & (n \text{ even}) \end{cases}.$$
Then, for sufficiently large $n$:
a) If $k_{n-1} + \delta_{n-1} + (n-1)^{-7/2} \le k \le k_n - \delta_n - n^{-7/2}$, then $|D(k)| \le 2 - \pi^{-2} n^{-13/2}$.
b) If $|k - k_n| \le \delta_n - n^{-7/2}$, then $|D(k)| \ge 2 + \pi^{-2} n^{-13/2}$.
Remark. Subsequently, we will only need that $|D| - 2$ can be estimated by some power of $n$ if the distance of $k$ to the band edges is $\ge C n^{-3-\epsilon}$ ($\epsilon > 0$). Thus, for simplicity, we did not distinguish between odd and even $n$ in the statement; in fact, for odd $n$, the exponent in b) is 11/2 rather than 13/2. Note also that both estimates are very crude if $k$ is not close to one of the band edges.
Proof. The lemma is an easy consequence of (5), (6). The details are left to the reader. In order to establish a), recall that $D(k)$ is monotone in intervals where $|D| < 2$ [2, Theorem 2.3.1].
In order to be able to use Lemma 3.2, we also need to know the asymptotics of the diagonalizing transformations $U(k)$:
Lemma 4.2. If $k$ satisfies either of the assumptions of Lemma 4.1, then $U(k)$ can be chosen such that $\|U(k)\| \, \|U^{-1}(k)\| \le C n^{13/2}$.
Proof. Write $T(k) = \begin{pmatrix} t_1 & t_2 \\ t_3 & t_4 \end{pmatrix}$. If $t_2 \ne 0$, then
$$U = \left( t_2 (\mu^{-1} - \mu) \right)^{-1/2} \begin{pmatrix} t_2 & t_2 \\ \mu - t_1 & \mu^{-1} - t_1 \end{pmatrix}$$
is a diagonalizing transformation. Notice that $|\mu - \mu^{-1}|^2 = |4 - D^2| \ge c n^{-13/2}$ because of Lemma 4.1. Since $|t_i|$, $|\mu|$ are bounded from above, we get
$$\|U\|^2 \le C_1 \frac{n^{13/4}}{|t_2|}.$$
Moreover, $\det U = 1$ implies $\|U^{-1}\| = \|U\|$. Similarly, if $t_3 \ne 0$, then
$$U = \left( t_3 (\mu - \mu^{-1}) \right)^{-1/2} \begin{pmatrix} \mu - t_4 & \mu^{-1} - t_4 \\ t_3 & t_3 \end{pmatrix}$$
is also a diagonalizing transformation with $\det U = 1$ and $\|U\|^2 \le C_2 n^{13/4} |t_3|^{-1}$. So it suffices to show that $\max\{|t_2|, |t_3|\} \ge C_3 n^{-13/4}$. In the situation of Lemma 4.1a), the conditions $t_1 t_4 - t_2 t_3 = 1$, $|t_1 + t_4| \le 2 - \pi^{-2} n^{-13/2}$ imply that $|t_2 t_3| \ge C n^{-13/2}$, hence $\max\{|t_2|, |t_3|\} \ge C' n^{-13/4}$, as desired. If the hypothesis of Lemma 4.1b) holds, consider
$$t_2 + t_3 = \left( \frac{k}{\omega} - \frac{\omega}{k} \right) \sin\omega\cos k.$$
A Taylor expansion similar to that leading to (5), (6) shows $t_2 + t_3 = -2(n\pi)^{-3} + O(n^{-4})$, so in particular $\max\{|t_2|, |t_3|\} \ge C n^{-3}$.
5. Choosing the Parameters
Up to now, we have fixed the basic periodic potential $W$. We proceed by choosing the $g_n$. As discussed above, the choice has to be made so that for every $k$, $k/g_n$ is in some gap for sufficiently many $n$.
Lemma 5.1. Let $[k_1, k_2]$ be a subinterval of $(0, \infty)$. Then there are constants $G_0, C > 0$ such that for any $G \in (0, G_0]$ we can find numbers $g_1, \dots, g_N$ with the following properties:
1) $N \le C G^{-3}$.
2) $G/2 \le g_i \le G$ for all $i = 1, \dots, N$.
3) For all $k \in [k_1, k_2]$, there is an $i \in \{1, \dots, N\}$ so that $|k/g_i - k_r| \le \delta_r - r^{-7/2}$ for some $r \in \mathbb{N}$.
Proof. In this proof, we write
$$c_i = k_{2i+1} - \delta_{2i+1} + (2i+1)^{-7/2}, \qquad d_i = k_{2i+1} + \delta_{2i+1} - (2i+1)^{-7/2}.$$
(We work with the gaps corresponding to odd $n$ only, because the gaps with even $n$ are by a factor $\sim n^{-1}$ smaller, see Lemma 4.1.) For $G > 0$, let
$$r_1 = \max\{i \in \mathbb{N} : G c_i \le k_1\}, \qquad r_2 = \min\{i \in \mathbb{N} : G d_i \ge k_2\}.$$
Obviously, $r_i \to \infty$ as $G$ tends to zero. From the definition of $r_i$ (and that of $k_i$, $\delta_i$) one deduces easily that $r_i = k_i (\pi G)^{-1} + O(1)$, where $O(1)$ denotes an expression that remains bounded as $G \to 0$. We take $g_1 = G$ and $g_n = g_{n-1} c_{r_2}/d_{r_2}$ for $n = 2, \dots, m$, where $m$ is the smallest integer with $g_m c_{r_2} \le G d_{r_2 - 1}$. By construction, this covers $I := [G c_{r_2 - 1}, G d_{r_2}]$ in the sense that for all $k \in I$ there is an $i \le m$ such that $k/g_i \in [c_{r_2}, d_{r_2}]$. Moreover,
$$g_i = G \left( 1 - \frac{1}{2\pi^3 r_2^3} + O(r_2^{-4}) \right)^{i-1}, \tag{7}$$
provided that $r_2$ is large enough, i.e. $G$ has to be sufficiently small. In this case, one infers easily from (7) that $m$ can be estimated by $m \le 4\pi^3 r_2^2$. Using this and again (7) shows $g_i \ge G(1 - r_2^{-1}) \ge G/2$ if, again, $G$ is small enough. The intervals $[G c_i, G d_{i+1}]$ with $r_1 \le i \le r_2 - 2$ can now be covered in the same way by appropriately scaled copies of $[c_{i+1}, d_{i+1}]$. Note that the total number of $g_i$'s is $N \le 4\pi^3 \left( (r_1 + 1)^2 + \dots + r_2^2 \right) \le 4\pi^3 r_2^3$. Hence this set of $g_i$'s has the properties stated in the lemma.
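The covering step of the proof can be made concrete for a single pair of neighbouring gaps. The sketch below is an illustration under assumptions: it uses the sample values $G = 0.1$ and $r = 5$ (so the gaps belong to $n = 9$ and $n = 11$), which are not taken from the text, and the function names are hypothetical. It builds the geometric sequence $g_1 = G$, $g_j = g_{j-1} c_r / d_r$ and checks that the scaled copies $[g_j c_r, g_j d_r]$ fill the stretch between $G d_{r-1}$ and $G d_r$.

```python
import math


def k_n(n):
    """Centre of the n-th gap, k_n = n*pi/2 + 1/(2*n*pi)."""
    return n * math.pi / 2 + 1 / (2 * n * math.pi)


def delta_odd(n):
    """Half-width delta_n = (n*pi)^(-2) of the gaps with odd n."""
    return (n * math.pi) ** -2


def c(i):
    """Left end c_i of the shrunken gap with odd index n = 2i + 1."""
    n = 2 * i + 1
    return k_n(n) - delta_odd(n) + n ** -3.5


def d(i):
    """Right end d_i of the shrunken gap with odd index n = 2i + 1."""
    n = 2 * i + 1
    return k_n(n) + delta_odd(n) - n ** -3.5


def scaled_copies(G, r):
    """g_1 = G, g_j = g_{j-1} * c_r / d_r, up to the first g_m with g_m * c_r <= G * d_{r-1}."""
    gs = [G]
    while gs[-1] * c(r) > G * d(r - 1):
        gs.append(gs[-1] * c(r) / d(r))
    return gs


def covered(k, gs, r, tol=1e-12):
    """True if k/g lies in [c_r, d_r] for some g in gs (tol absorbs rounding at the seams)."""
    return any(g * c(r) - tol <= k <= g * d(r) + tol for g in gs)
```

Because $g_j d_r = g_{j-1} c_r$, consecutive copies touch exactly, so no energy between $G d_{r-1}$ and $G d_r$ is missed. For the sample values, the number of copies stays below $4\pi^3 r^2$ and all $g_j$ remain in $[G/2, G]$, consistent with the estimates in the proof.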
In order to treat the whole half-axis $k \in (0, \infty)$, let $G_0(m)$, $C(m)$ be the constants of Lemma 5.1 for $[k_1, k_2] = [m^{-1}, m]$ ($m \in \mathbb{N}$). Given any functions $F, G > 0$ with $F(s) \to \infty$, $G(s) \to 0$, we can find integers $m(s) \to \infty$ such that $m(s) \le F(s)$ and $C(m(s)) \le F(s)$, $G_0(m(s)) \ge G(s)$ for all $s \in \mathbb{N}$ with $m(s) > 1$ (take $m(1) = \dots = m(s_0 - 1) = 1$, then $m(s_0) = 2$, where $s_0$ is sufficiently large so that the inequalities hold, etc.).
We can now choose the whole sequence $(g_n)_{n=1}^\infty$: Let $F(s), G(s) > 0$ be prescribed sequences with $F \to \infty$, $G \to 0$. Determine $m(s)$ as explained in the preceding paragraph, and for $s = 1, 2, \dots$, pick $g_{j(s)+1}, \dots, g_{j(s)+N(s)}$ (where $j(s) = \sum_{t=1}^{s-1} N(t)$) so that these $g$'s and $N = N(s)$ satisfy 1)-3) of Lemma 5.1 for $G = G(s)$ and $[k_1, k_2] = [m(s)^{-1}, m(s)]$. These definitions of $j(s)$, $N(s)$, $g_n$ as well as the meaning of $F(s)$, $G(s)$, $m(s)$ should be kept in mind throughout the rest of this paper. We will also write $s(n)$ for the unique $s$ determined by $j(s) + 1 \le n \le j(s) + N(s)$.
Next, we investigate the set of $k$ for which $k/g_n$ is close to one of the band edges infinitely often (abbreviated as i.o. in the sequel), i.e. for infinitely many $n$. More precisely, consider
$$A_n = \{k > 0 : |k/g_n - k_i - \delta_i| < i^{-7/2} \text{ or } |k/g_n - k_i + \delta_i| < i^{-7/2} \text{ for some } i\}.$$
Fix $[k_1, k_2] \subset (0, \infty)$. If $n$ is sufficiently large, then by construction of the $g_n$ and by Lemma 5.1 2), the condition defining $A_n$ can only hold for $i \ge C_0(k_1)/G(s)$ (writing $s = s(n)$ for brevity). Hence
$$|A_n \cap [k_1, k_2]| \le 4G(s) \sum_{i \ge C_0/G(s)} i^{-7/2} \le C G(s)^{7/2}. \tag{8}$$
Lemma 5.1 1) says that there are at most $F(s)G(s)^{-3}$ indices $n$ with $s(n) = s$. Thus summing (8) over $n$ yields
$$\sum_{n=1}^\infty |A_n \cap [k_1, k_2]| \le C \sum_{s=1}^\infty F(s) G(s)^{1/2}.$$
Now the Borel-Cantelli Lemma implies
Lemma 5.2. Assume $\sum_{s=1}^\infty F(s) G(s)^{1/2} < \infty$. Then $|\{k > 0 : k \in A_n \text{ i.o.}\}| = 0$.
In the same way, we can control the measure of the set where $|D| > 2$ but at the same time the phase is close to $\beta$ (recall the definitions from Sect. 3). For a sequence $\epsilon(s) > 0$, let (again, with $s = s(n)$)
$$B_n = \{k \in [m(s)^{-1}, m(s)] : |k/g_n - k_i| \le \delta_i - i^{-7/2} \text{ for some } i,\ \varphi(a_n, k) \in S_{\epsilon(s)}(k/g_n)\}.$$
Lemma 3.3 says that
$$|B_n| \le \frac{2}{\pi} F(s)^2 |M| \epsilon(s)$$
for appropriate $l_{n-1} \in [0, \pi/m(s)]$. Here,
$$M = \{k \in [m(s)^{-1}, m(s)] : |k/g_n - k_i| \le \delta_i - i^{-7/2} \text{ for some } i\}.$$
In order to estimate $|M|$, proceed as above. This leads to $|B_n| \le C F(s)^3 G(s)^2 \epsilon(s)$, and summing over $n$ proves
Lemma 5.3. Assume $\sum_{s=1}^\infty F(s)^4 G(s)^{-1} \epsilon(s) < \infty$. Then there are $l_n \to 0$ so that $|\{k > 0 : k \in B_n \text{ i.o.}\}| = 0$.
Note the somewhat complicated logic behind this statement: If the summability assumption holds, then for any choice of the $L_n$ (see Sect. 2), we can find $l_n$ with the stated properties. In particular, recall that $l_n$ depends on the values of the potential $V(x)$ for $x \le b_n$. However, this does not cause any difficulties because we may as well think of the parameters $L_n$, $g_n$ as being fixed from the beginning (with the values given below, of course). The presentation given here tries to avoid unmotivated choices.
There will be no further conditions on $F$, $G$, $\epsilon$, therefore we now fix these parameters. We take $F(s) = \ln s$, $G(s) = s^{-3}$, $\epsilon(s) = s^{-5}$ (we remark that powers are by no means the only possible choice), hence $N(s) \le s^9 \ln s$ by Lemma 5.1 1). $L_n$ will also depend on $s$ only, thus we can write $L(s)$.
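That this choice of $F$, $G$, $\epsilon$ meets the summability assumptions of Lemmas 5.2 and 5.3 is a matter of exponent bookkeeping, which the following sketch spells out. It is an illustration only: each sequence of the form $s^a (\ln s)^b$ is encoded as the pair $(a, b)$, and the helper names are hypothetical.

```python
from fractions import Fraction

# A sequence s^a * (ln s)^b is stored as the pair (a, b).
F = (Fraction(0), 1)     # F(s) = ln s
G = (Fraction(-3), 0)    # G(s) = s^(-3)
EPS = (Fraction(-5), 0)  # eps(s) = s^(-5)


def power(x, p):
    """Exponent pair of x(s)^p."""
    return (x[0] * p, x[1] * p)


def times(*xs):
    """Exponent pair of a product of sequences."""
    return (sum(x[0] for x in xs), sum(x[1] for x in xs))


def summable(x):
    """sum over s of s^a (ln s)^b converges iff a < -1 (for fixed b >= 0)."""
    return x[0] < -1


lemma_52 = times(F, power(G, Fraction(1, 2)))    # F * G^(1/2)
lemma_53 = times(power(F, 4), power(G, -1), EPS)  # F^4 * G^(-1) * eps
n_bound = times(F, power(G, -3))                  # F * G^(-3), the bound on N(s)
```

The three pairs come out as $(-3/2, 1)$, $(-2, 4)$ and $(9, 1)$, i.e. the series in Lemma 5.2 behaves like $\sum s^{-3/2} \ln s$, the one in Lemma 5.3 like $\sum s^{-2} \ln^4 s$, and $N(s) \le s^9 \ln s$ as stated.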
6. Conclusion of the Proof
We are now in a position to estimate the solutions of (1). More specifically, we consider $R_{ni} := R(x_{ni}, k)$ at the points $x_{ni} = a_n + 2i/g_n$ ($n \in \mathbb{N}$, $i = 0, 1, \dots, L_n$). Fix a $k$ which is not in the exceptional sets of Lemmas 5.2, 5.3, and let $n_0$ be sufficiently large so that $k \in [m(s)^{-1}, m(s)]$ if $s \ge s(n_0)$. Furthermore, take $n_0$ also greater than the finitely many $n$ for which $k \in A_n \cup B_n$, and let $i_0 \in \{0, 1, \dots, L_{n_0}\}$. For some of the deductions below, it may be necessary to restrict $n_0$ to even larger values; we will not mention this explicitly in the sequel. Finally, write $s_0 = s(n_0)$, $R_0 = R_{n_0 i_0}$.
By combining our previous results, we will obtain the following estimate: If $x_{ni} \ge x_{n_0 i_0}$, then
$$\ln R_{ni} - \ln R_0 \ge K_1 \sum_{t = s_0 + 1}^{s(n) - 1} L(t) t^{-39/4} - K_2 \sum_{t = s_0}^{s(n)} t^9 \ln^2 t. \tag{9}$$
Here, $K_1$, $K_2$ are positive constants which are independent of $n_0, i_0, n, i$. If $x_{ni} < a_{j(s_0+2)+1}$, then the first sum is zero by convention.
In order to establish (9), we argue as follows: Let $m$ be an index with $n_0 \le m \le n$. By Lemma 3.1, we have $T(x_{m,i+j}, x_{mi}; k) = T(k/g_m)^j$. Furthermore, $R$ is constant on $(b_m, a_{m+1}) = (x_{mL_m}, x_{m+1,0})$ since $V = 0$ on these intervals. Because of the definition of the sets $A_m$, one of the two parts of Lemma 4.1 applies to $k/g_m$ (instead of $k$).
In the situation of Lemma 4.1a), we deduce from Lemma 3.2a) and Lemma 4.2 that $\|T(k/g_m)^j v\| \ge C r^{-13/2} \|v\|$, where $r$ is the unique integer for which $k_{r-1} + \delta_{r-1} + (r-1)^{-7/2} \le k/g_m \le k_r - \delta_r - r^{-7/2}$ holds. Since $g_m \ge G(s(m))/2$ (see Lemma 5.1 2)), $r \le C(k) s(m)^3$. In terms of $R$, we thus get
$$\ln R_{m,i+j} - \ln R_{mi} \ge -C \ln s(m), \tag{10}$$
where $C$ is independent of $m, i, j$. In particular, taking $i = 0$, $j = L_m$ we see that
$$\ln R(a_{m+1}) - \ln R(a_m) \ge -C \ln s(m). \tag{11}$$
Summing up the contributions from (10), (11) gives the second term of (9).
Assume now that Lemma 4.1b) applies to $k/g_m$. Observe that in this case $\varphi(a_m, k) \notin S_{\epsilon(s(m))}(k/g_m)$ because of the definition of $B_m$ (compare Lemma 5.3). Moreover, (3) together with Lemma 4.1b) implies that $|\mu(k/g_m)| \ge 1 + \pi^{-1} r^{-13/4}$, where $r$ is determined by $|k/g_m - k_r| \le \delta_r - r^{-7/2}$, hence, estimating $r$ as above, $|\mu(k/g_m)| \ge 1 + C s(m)^{-39/4}$. Now Lemma 3.2b) shows
$$\ln R(a_{m+1}) - \ln R(a_m) \ge C_1 L(s(m)) s(m)^{-39/4} - C_2 \ln s(m),$$
and since for all $t$, there is at least one $m \in \{j(t) + 1, \dots, j(t) + N(t)\}$ for which $k/g_m$ is in a gap (by Lemma 5.1 3)), we also get the first sum of (9).
It remains to treat the contribution to (9) that comes from the beginning of the interval under consideration, namely $\ln R_{n_0, i_0 + j} - \ln R_0$, for the case when $k/g_{n_0}$ is in a gap (i.e. Lemma 4.1b) applies to $k/g_{n_0}$). This term is not included in the above treatment because Lemma 5.3 gives no information on the phase at $x_{n_0 i_0}$. However, Lemma 3.2b) does show that
$$R_{n_0, i_0 + j} \ge \frac{2}{\pi} \epsilon(s_0) |\mu(k/g_{n_0})|^{i_0 + j} R(a_{n_0}).$$
On the other hand, $R_0 \le \|T(k/g_{n_0})^{i_0}\| R(a_{n_0})$ and therefore, taking $\|T^m\| \le |\mu|^m \|U\| \, \|U^{-1}\|$ into account (see the proof of Lemma 3.2),
$$R_{n_0, i_0 + j} \ge \frac{2\epsilon(s_0) |\mu(k/g_{n_0})|^j}{\pi \|U(k/g_{n_0})\| \, \|U(k/g_{n_0})^{-1}\|} R_0.$$
$\|U\| \, \|U^{-1}\|$ can be controlled with the aid of Lemma 4.2. In conclusion, we get again an estimate of the form of (10). Note that $k/g_n$ might also happen to be in a gap ($n$ is from (9)), but in this case it is obvious from Lemma 3.2b) that (10) holds. Thus the proof of (9) is complete.
If we now take, say, $L(s) = [s^{79/4}]$ ($[x]$ is the largest integer $\le x$), then the first sum in (9) dominates the second one, i.e. we get increasing solutions. Simple estimates allow us to rewrite (9) in the following more useful form:
$$\ln R_{ni} - \ln R_0 \ge \begin{cases} -C_1 s_0^9 \ln^2 s_0 & s(n) = s_0, s_0 + 1 \\ C_2 s(n)^{10} & s(n) \ge s_0 + 2 \end{cases}. \tag{12}$$
In order to establish the existence of $L^2$-solutions, we use the following version of [9, Theorem 8.1]. This (slight) reformulation takes into account that we do not have accurate upper bounds for the norm of the transfer matrices.
Lemma 6.1 ([9]). Assume that there is an increasing sequence $x_n \to \infty$ such that:
1) $\|T(x_{n+1}, x_n; k)\| \le C(k)$ for all $n$,
2) There is a vector $v \in \mathbb{R}^2$ with $\sum_n \|T(x_n, x_0; k)v\|^{-2} < \infty$.
Then either
a) There exists another vector $u \ne 0$ such that (writing $\rho_n = \|T(x_n, x_0; k)u\|$, $R_n = \|T(x_n, x_0; k)v\|$)
$$\rho_n^2 \le C_1(k) R_n^{-2} \left( \sum_{i=n}^\infty (R_n/R_i)^2 \right)^2 + R_n^{-2},$$
or
b) $\rho_n/R_n \to \infty$ if $u$ is not a constant multiple of $v$.
Proof. Follow the proof of [9, Theorem 8.1]. If the vector $u_\infty$ constructed there is not a constant multiple of $v$, then a) follows as in [9] (with $u = u_\infty$). If $u_\infty = cv$, then b) follows directly from Eq. (8.2) of [9] and the fact that the vector $u$ from that statement is exactly $u_\infty$.
The hypotheses of this lemma are satisfied for almost every $k$ if we let $x_n$ be the sequence of the points $x_{ni} \ge x_{n_0 i_0}$ defined above: Since $\int_{x_{ni}}^{x_{n,i+1}} |V(x)| \, dx = g_n$, the standard Gronwall estimate [4, Sect. III.1] proves 1) (in fact, it proves much more, namely that $T(x_{n+1}, x_n; k) \to 1$ uniformly on compact $k$ sets). Furthermore, (12) obviously implies 2).
Now let $M$ be the set of those $k^2$ for which conclusion b) of the lemma holds. Clearly, b) in particular implies that no solution of (1) is polynomially bounded (as a function of $x$), hence $M$ is a set of spectral measure zero for all boundary conditions $\alpha$ by Schnol's Theorem (see [18, Proposition 9]). Therefore, using spectral averaging, $|M| = \int_0^\pi \rho_\alpha(M) \, d\alpha = 0$.
Using (12), we can estimate $R_n^{-2}$ as well as $\sum_{i \ge n} (R_n/R_i)^2$. Fixing $n_0$, $i_0$, we immediately get (with positive constants $C_i$, as usual)
$$R_{ni}^2 \ge C_1 e^{C_2 s(n)^{10}}.$$
In order to bound $\sum_{x_{mj} \ge x_{ni}} (R_{ni}/R_{mj})^2$ (of course, the sum is over $m, j$, and $n, i$ are fixed), we also use (12), but now with $R_{ni}$ taking the role of $R_0$. Here it is important that the constants of (12) are independent of $n_0, i_0$. As in the preceding section, we translate the sum over $m, j$ into a sum over $s$ and estimate the number of terms with the aid of Lemma 5.1 1). The result of these by now familiar arguments is
$$\sum_{x_{mj} \ge x_{ni}} (R_{ni}/R_{mj})^2 \le C_3 e^{C_4 s(n)^9 \ln^2 s(n)}.$$
Thus Lemma 6.1a) implies that for almost all $k > 0$, there is a solution $y$ satisfying
$$(y^2 + (y'/k)^2)(x_{ni}) = \rho_{ni}^2 \le C e^{-C' s^{10}}. \tag{13}$$
Because of Gronwall's Lemma (compare the argument above), this estimate also holds for $x \in [x_{ni}, x_{n,i+1}]$ (with a possibly larger $C$). The number of indices $n, i$ corresponding to the same $s$ and the length of the intervals $[x_{ni}, x_{n,i+1}]$ can both be bounded by a power of $s$, hence (13) guarantees that $y \in L^2$.
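The last step can be written out as follows. This is only a sketch of the counting argument, assuming the parameter values fixed above ($N(s) \le s^9 \ln s$, $L(s) \le s^{79/4}$, $g_n \ge G(s)/2 = s^{-3}/2$, $l_n \le \pi$); the exponent 33 and the constant $C''$ are crude and are introduced here for illustration.

```latex
\int_{x_{n_0 i_0}}^{\infty} y(x)^2 \, dx
  \le \sum_{s \ge s_0}
      \underbrace{N(s)\,\bigl(L(s)+1\bigr)}_{\text{number of pairs } (n,i) \text{ with } s(n)=s}
      \;\underbrace{\Bigl(\frac{4}{G(s)} + \pi\Bigr)}_{\text{bound on the length of one piece}}
      \; C e^{-C' s^{10}}
  \le C'' \sum_{s \ge s_0} s^{33} e^{-C' s^{10}} < \infty .
```

Here the free pieces $(b_n, a_{n+1})$, on which $y^2 + (y'/k)^2$ is constant, are counted together with the last period of the preceding block.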
Finally, for $x \ge 0$, let $s$ be the index corresponding to $x$, i.e. $a_{j(s)+1} \le x < a_{j(s+1)+1}$. Then by construction of $V$ (see Sect. 2)
$$x < a_{j(s+1)+1} \le \sum_{t=1}^s N(t) \left( \frac{4L(t)}{G(t)} + \pi \right) \le C s^{33}.$$
The second term in the sum estimates the contribution from the $l_n$. On the other hand, $0 \le V(x) \le s^{-6}$, so indeed $V(x) = O(x^{-6/33}) = O(x^{-2/11})$. This completes the proof of Theorem 2.1.
7. Singular Continuous Spectrum
We modify $V$ as follows. For a growing sequence $s_n \to \infty$, let
$$V_{aux}(x) = \begin{cases} V(x) & (a_{j(s_n)+1} \le x \le b_{j(s_n)+N(s_n)}) \\ 0 & \text{else} \end{cases}.$$
Now the modified potential $\tilde{V}$ is obtained from $V_{aux}$ by readjusting the $l_n$ (more precisely, the subsequence of $l_n$ corresponding to the sequence $s_n$) according to Lemma 5.3. The remark following Lemma 5.3 shows that this may be necessary. Since the transfer matrix for zero potential is unitary, the above arguments also apply to $\tilde{V}$. In particular, an estimate similar to (9) still holds, the sums now being over the subsequence $s_n$. Consequently, one can show along the lines of the preceding section that for almost all $k$,
$$R(b_{j(s_n)+N(s_n)}, k) \ge C_1(k) \exp\left( C_2(k) \sum_{i=1}^n s_i^{10} \right). \tag{14}$$
A version of Schnol's Theorem (namely [18, Proposition 9]) states that if $h \in L^2 \cap L^\infty$, then $h(\cdot)v(\cdot, k^2) \in L^2$ for spectrally almost all $E = k^2$. Here $v$ is a solution of the Schrödinger equation satisfying the boundary condition at $x = 0$. Let $D_n = a_{j(s_{n+1})+1} - b_{j(s_n)+N(s_n)}$ and $h(x) = (n^2 D_n)^{-1/2}$ if $b_{j(s_n)+N(s_n)} \le x \le a_{j(s_{n+1})+1}$, and $h(x) = 0$ otherwise. Then (14) shows that $hv \notin L^2$ for almost all $k$, hence the spectrum is supported by a set of Lebesgue measure zero and therefore $\sigma_{ac} = \emptyset$.
On the other hand, if $D_n$ grows sufficiently fast (which can be achieved by taking a sparse sequence $s_n$), then we can also exclude the existence of $L^2$-solutions. More specifically, choose the parameters by the following inductive procedure: Take $s_1 = 1$ and pick the corresponding $l_n$'s. Next, determine $D_1$ large enough so that for all $k \in [1^{-1}, 1]$ the following holds: If $y$ is any solution with $R(0, k) = 1$, then $D_1 R^2(b_{j(s_1)+N(s_1)}, k) \ge 1$. Pick any $s_2$ so that $a_{j(s_2)+1} \ge b_{j(s_1)+N(s_1)} + D_1$ (for instance, take the smallest number with this property) and adjust the $l_n$'s corresponding to this $s_2$. Again, choose $D_2$ so that $D_2 R^2(b_{j(s_2)+N(s_2)}, k) \ge 1$, but this time for all $k \in [2^{-1}, 2]$, etc. We can also be more explicit; for instance, it is not hard to show (using Gronwall's Lemma again) that the sequence $s_n$ defined by $s_1 = 1$, $s_{n+1} = 2^{(2^{s_n})}$ has the required properties. Finally, note that since the original $V$ satisfies $V(x) = O(x^{-c})$, $\tilde{V}$ is also of order $O(x^{-c})$. We summarize our result as
Theorem 7.1. The potential $\tilde V$ constructed above satisfies:
1) $\tilde V(x) = O(x^{-c})$ $(c > 0)$.
2) For all boundary conditions at $x = 0$, the spectrum is purely singular continuous in $(0, \infty)$.

Note added in proof
The open question mentioned in the introduction has been answered by myself (paper in preparation): If $V(x) = O(x^{-c})$ with $c > 1/2$, then $\sigma_{ac} = [0, \infty)$.

References
1. del Rio, R., Makarov, N., Simon, B.: Operators with Singular Continuous Spectrum, II. Rank One Operators. Commun. Math. Phys. 165, 59–67 (1994)
2. Eastham, M.S.P.: The Spectral Theory of Periodic Differential Equations. London: Scottish Academic Press, 1973
3. Gordon, A.Y., Molchanov, S.A., Tsagani, B.: Spectral Theory of One-Dimensional Schrödinger Operators with Strongly Fluctuating Potentials. Funct. Anal. Appl. 25, 236–238 (1992) and Gordon, A.Y.: Private communication
4. Hartman, P.: Ordinary Differential Equations. Boston: Birkhäuser, 1982
5. Kirsch, W., Molchanov, S.A., Pastur, L.: One-dimensional Schrödinger operators with high potential barriers. Operator Theory: Advances and Applications, Vol. 57, Basel: Birkhäuser Verlag, 1992, pp. 163–170
6. Kiselev, A.: Preservation of the absolutely continuous spectrum of Schrödinger equation under perturbations by slowly decreasing potentials and a.e. convergence of integral operators. Preprint (1996)
7. Kiselev, A., Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators. Commun. Math. Phys. (submitted)
8. Kotani, S., Ushiroya, N.: One-dimensional Schrödinger operators with random decaying potentials. Commun. Math. Phys. 115, 247–266 (1988)
9. Last, Y., Simon, B.: Eigenfunctions, Transfer Matrices, and Absolutely Continuous Spectrum of One-Dimensional Schrödinger Operators. Ann. Math. (submitted)
10. Naboko, S.N.: Dense Point Spectra of Schrödinger and Dirac Operators. Theor. and Math. Phys. 68, 646–653 (1986)
11. Pearson, D.B.: Singular Continuous Measures in Scattering Theory. Commun. Math. Phys. 60, 13–36 (1978)
12. Remling, C.: A probabilistic approach to one-dimensional Schrödinger operators with sparse potentials. Commun. Math. Phys.
13. Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. IHES 50, 275–306 (1979)
14. Simon, B.: Some Jacobi matrices with decaying potentials and dense point spectrum. Commun. Math. Phys. 87, 253–258 (1982)
15. Simon, B.: Some Schrödinger operators with dense point spectrum. Proc. Am. Math. Soc. 125, 203–208 (1997)
16. Simon, B., Wolff, T.: Singular Continuous Spectrum under Rank One Perturbations and Localization for Random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986)
17. Simon, B., Zhu, Y.: The Lyapunov Exponents for Schrödinger Operators with Slowly Oscillating Potentials. J. Funct. Anal. 140, 541–556 (1996)
18. Stolz, G.: Localization for random Schrödinger operators with Poisson potential. Ann. Inst. H. Poincaré 63, 297–314 (1995)
19. Stolz, G.: Localization for Schrödinger operators with effective barriers. J. Funct. Anal. (to appear)

Communicated by B. Simon
Commun. Math. Phys. 186, 495 – 529 (1997)
Existence and Uniqueness Theorems for Massless Fields on a Class of Spacetimes with Closed Timelike Curves
J.L. Friedman$^1$, M.S. Morris$^2$
1 Institute for Theoretical Physics, University of California, Santa Barbara, CA 93106, USA and Department of Physics, University of Wisconsin-Milwaukee, Milwaukee, WI 53201, USA
2 Department of Physics & Astronomy, Butler University, 4600 Sunset Ave., Indianapolis, IN 46208, USA
Received: 28 November 1994/Accepted: 20 May 1996
Abstract: We study the massless scalar field on asymptotically flat spacetimes with closed timelike curves (CTC’s), in which all future-directed CTC’s traverse one end of a handle (wormhole) and emerge from the other end at an earlier time. For a class of static geometries of this type, and for smooth initial data with all derivatives in $L_2$ on $\mathscr{I}^-$, we prove existence of smooth solutions which are regular at null and spatial infinity (have finite energy and finite $L_2$-norm) and have the given initial data on $\mathscr{I}^-$. A restricted uniqueness theorem is obtained, applying to solutions that fall off in time at any fixed spatial position. For a complementary class of spacetimes in which CTC’s are confined to a compact region, we show that when solutions exist they are unique in regions exterior to the CTC’s. (We believe that more stringent uniqueness theorems hold, and that the present limitations are our own.) An extension of these results to Maxwell fields and massless spinor fields is sketched. Finally, we discuss a conjecture whose meaning is essentially that the Cauchy problem for free fields is well defined in the presence of CTC’s whenever the problem is well-posed in a geometric-optics limit. We provide some evidence in support of this conjecture, and we present counterexamples that show that neither existence nor uniqueness is guaranteed under weaker conditions. In particular, both existence and uniqueness can fail in smooth, asymptotically flat spacetimes with a compact nonchronal region.

1. Introduction

Although spacetimes with closed timelike curves occur as solutions to the vacuum Einstein equations, they have been regarded as unphysical, in part because the most familiar examples have no well-defined initial value problem. Morris and Thorne [1], however, introduced a class of wormhole geometries in which, although there are many closed timelike curves, the set of closed timelike and null geodesics has measure zero. On these spacetimes, Morris et al. [2] noted that the evolution of free fields is well defined in the limit of geometrical optics; and this in turn makes
Fig. 1. An orientable 3-manifold $M$ is constructed by identifying points of $\Sigma_I$ and points of $\Sigma_{II}$ that are labeled by the same letter, with subscripts I and II: $P_{II} = T(P_I)$
it seem likely that a multiple scattering series converges to a solution for arbitrary initial data [2–4]. The simplest of the spacetimes they considered are static, with CTC’s present at all times, and we reported an existence and a restricted uniqueness theorem for the massless scalar field on such static time-tunnel spacetimes [4]. The present paper provides details of these latter results and outlines their extension to spinor and vector fields. A final section discusses the Cauchy problem on spacetimes in which CTCs are confined to a compact region. We prove a uniqueness theorem and present a conjecture on the existence of free fields whose data are specified on a spacelike hypersurface (partial Cauchy surface) to the past of any CTCs.

Until the final section, the geometry $N, g_{\alpha\beta}$ we consider is static in the sense that there is a timelike Killing vector $t^\alpha$ that is everywhere locally hypersurface orthogonal.$^1$ The manifold has topology $N = M \times \mathbb{R}$, where $M$ is a plane with a handle (wormhole) attached: $M = \mathbb{R}^3 \# (S^2 \times S^1)$. The metric $g_{\alpha\beta}$ on $N$ is smooth ($C^\infty$), and, for simplicity in treating the asymptotic behavior of the fields, we will assume that outside a compact region $\mathcal{R}$ the geometry is flat, with metric $\eta_{\alpha\beta}$.

2. Preliminaries

A. Foliation of a related spacetime with boundary. One can construct the 3-manifold $M$ from $\mathbb{R}^3$ by removing two balls and identifying their spherical boundaries, $\Sigma_I$ and $T(\Sigma_I)$, by a map $T$, as shown in Fig. 2. In Fig. 2, the map $T$ involves an improper rotation of $\Sigma_I$, and it yields an orientable handle. Maps $T$ which involve a proper rotation of $\Sigma_I$ yield non-orientable handles. For the rest of this paper, we will in fact assume that $T$ is improper and that the handle is consequently orientable. This matters, however, only to our treatment of a two-component Weyl spinor. Were we to allow non-orientable handles as well, everything would be the same, except that we should have to consider 4-component Dirac spinors.
1 Spacetime indices will be lower case Greek, spatial indices lower case Latin, spinor indices upper case Latin. For those familiar with the abstract index notation [5], letters near $a$ (or $\alpha$) in the alphabet will be abstract while those after $i$ ($\iota$) will be concrete, so that $v^\mu$ is the $\mu$th component of the vector $v^\alpha$. Our signature is $-+++$, and Riemann tensor conventions follow [5]. We will use “manifold” as a shorthand for smooth manifold, with or without boundary.

The sphere $\Sigma$ obtained by the identification of $\Sigma_I$ and $T(\Sigma_I)$ will be called the “throat” of the handle. Its location is arbitrary: after removing any sphere $\Sigma$ from the handle of $M$ one is left with a manifold homeomorphic to $\mathbb{R}^3 \backslash (B^3 \# B^3)$, whose boundary is the disjoint union of two spheres. Let $C \cong S^2 \times \mathbb{R}$ be the history of the throat in the spacetime $N$, that is, its orbit under the group of time-translations generated by $t^\alpha$. Then, after removing the cylinder $C$ from the spacetime $N$, one is similarly left with a manifold homeomorphic to $\mathbb{R}^4 \backslash [(B^3 \# B^3) \times \mathbb{R}]$, whose boundary, $\partial(N \backslash C)$, is the disjoint union, $C_I \sqcup C_{II}$, of two timelike cylinders. We will again use the symbol $T$ to denote the map from $C_I$ to $C_{II}$ that relates identified points; its restriction to a single sphere $\Sigma_I \subset C_I$ is the map denoted above by $T$.

For the spacetimes we will consider, identified points of $C_I$ and $C_{II} = T(C_I)$ will be timelike separated, and copies of $M$ in the spacetime cannot be everywhere spacelike. Thus, although $t^\alpha$ is locally hypersurface-orthogonal, no complete hypersurface of $N$ with a single asymptotically flat region is orthogonal to every trajectory of $t^\alpha$. One can, however, foliate $N \backslash C$ by spacelike hypersurfaces $M_t$ orthogonal to $t^\alpha$. Each $M_t$ can be chosen to agree asymptotically with a $t$ = constant surface of the flat metric $\eta_{\alpha\beta}$ and to intersect $C_I$ and $C_{II}$ in isometric spheres $\Sigma_I$ and $\Sigma_{II}$. The sphere $T(\Sigma_I)$ identified with $\Sigma_I$ is a time-translation of $\Sigma_{II}$, obtained by moving $\Sigma_{II}$ a parameter distance $\tau$ along the trajectories of $t^\alpha$ (see Fig. 2). The manifold $M_t$ near $\Sigma_I$ is a smooth extension of $M_{t+\tau}$ near $T(\Sigma_I)$. That is, if $U_I \subset M_t$ and $U_{t+\tau} \subset M_{t+\tau}$ are spacelike neighborhoods of $\Sigma_I$ and $T(\Sigma_I)$, their union $U_I \cup U_{t+\tau}$ is a smooth, spacelike submanifold of $N$. Because the time-translation of $U_{t+\tau}$ to $M_t$ is a neighborhood $U_{II} \subset M_t$ of $\Sigma_{II}$ isometric to $U_{t+\tau}$, by artificially identifying the spheres $\Sigma_I$ and $\Sigma_{II}$ of $M_t$, one obtains a copy of $M$ (a plane with a handle) with a metric that is everywhere smooth and spacelike. Of course this spacelike copy of $M$ is not a submanifold of the spacetime $N$.

Fig. 2. The spacelike hypersurfaces $M_t$ foliate the spacetime whose boundary is the union of two cylinders, $C_I \cup C_{II}$. In the spacetime $N$, the cylindrical boundaries are identified, and $M_{t+\tau}$ is a smooth continuation of $M_t$ across the identified spheres $\Sigma_I$ and $T(\Sigma_I)$

A static metric on $N$ is given by
\[ g_{\alpha\beta} = -e^{-2\nu} t_\alpha t_\beta + h_{\alpha\beta}, \tag{2.1} \]
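where $h_{\alpha\beta} t^\beta = 0$. If the Minkowski coordinate $t$ is extended to $N \backslash C$ by making $M_t$ a $t$ = constant surface, then $t^\alpha \nabla_\alpha t = 1$, $\nabla_\alpha t = -e^{-2\nu} t_\alpha$, and the metric (2.1) can be written on $N \backslash C$ in the form
\[ g_{\alpha\beta} = -e^{2\nu}\, \partial_\alpha t\, \partial_\beta t + h_{\alpha\beta}. \tag{2.2} \]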
It will be convenient to single out a representative hypersurface,
\[ M := M_0. \tag{2.3} \]
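As a consistency check on the two forms of the static metric, the short derivation below substitutes the stated relation for $\nabla_\alpha t$ into (2.2) and recovers (2.1). It is only a sketch: the symbol $\nu$ for the function in the exponent and the index placement are assumptions about the notation.

```latex
% Assumed notation: \nu is the exponent function in (2.1) and (2.2),
% t^\alpha is the static Killing vector, and \partial_\alpha t = \nabla_\alpha t.
\begin{align*}
-e^{2\nu}\,\partial_\alpha t\,\partial_\beta t + h_{\alpha\beta}
  &= -e^{2\nu}\bigl(-e^{-2\nu}t_\alpha\bigr)\bigl(-e^{-2\nu}t_\beta\bigr) + h_{\alpha\beta}
   = -e^{-2\nu}\,t_\alpha t_\beta + h_{\alpha\beta}, \\
t^\alpha\nabla_\alpha t = 1
  &\;\Longrightarrow\; -e^{-2\nu}\,t^\alpha t_\alpha = 1
  \;\Longrightarrow\; t^\alpha t_\alpha = -e^{2\nu}.
\end{align*}
```

The second line shows that, under these assumptions, $e^{\nu}$ is the norm of the Killing vector, so $g_{\alpha\beta}t^\alpha t^\beta = -e^{2\nu}$ in either form.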
We will denote by $h_{ab}$ the corresponding spatial metric on $M$; that is, $h_{ab}$ is the pullback of $h_{\alpha\beta}$ (or $g_{\alpha\beta}$) to $M$.

B. An initial value problem

1. Free massless fields on N. We consider the wave equation
\[ \Box\Phi \equiv \nabla_\alpha \nabla^\alpha \Phi = 0, \tag{2.4} \]
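for a massless scalar field $\Phi$, and the corresponding wave equations for massless vector and spinor fields. A Maxwell field $F_{\alpha\beta}$ can be written in terms of a vector potential $A_\alpha$ satisfying the Lorentz gauge condition,
\[ F_{\alpha\beta} = \nabla_\alpha A_\beta - \nabla_\beta A_\alpha, \qquad \nabla_\alpha A^\alpha = 0. \tag{2.5} \]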
The equation $\nabla_\beta F^{\alpha\beta} = 0$, governing a free Maxwell field, is then equivalent to
\[ \nabla_\beta \nabla^\beta A^\alpha - R^\alpha{}_\beta A^\beta = 0. \tag{2.6} \]
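The equivalence of the free Maxwell equation and (2.6) in Lorentz gauge takes one commutation of covariant derivatives. The sketch below assumes the contracted Ricci identity in the form $[\nabla_\beta, \nabla_\alpha]A^\beta = R_{\alpha\beta}A^\beta$, which is the sign convention under which (2.6) comes out as stated; other curvature conventions flip the sign of the Ricci term.

```latex
% Assumption: [\nabla_\beta, \nabla_\alpha] A^\beta = R_{\alpha\beta} A^\beta.
\begin{align*}
\nabla_\beta F^{\alpha\beta}
  &= \nabla_\beta\nabla^\alpha A^\beta - \nabla_\beta\nabla^\beta A^\alpha \\
  &= \nabla^\alpha\bigl(\nabla_\beta A^\beta\bigr) + R^{\alpha}{}_{\beta}A^\beta
     - \nabla_\beta\nabla^\beta A^\alpha \\
  &= -\bigl(\nabla_\beta\nabla^\beta A^\alpha - R^{\alpha}{}_{\beta}A^\beta\bigr)
  \qquad\text{using } \nabla_\beta A^\beta = 0 .
\end{align*}
```

Hence $\nabla_\beta F^{\alpha\beta} = 0$ holds exactly when the bracket vanishes, which is the statement of (2.6).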
We will adopt the 2-component spinor notation given in Penrose and Rindler [6], with $\nabla_{AA'} = \sigma^\alpha{}_{AA'} \nabla_\alpha$, where $\sigma^\alpha{}_{AA'}$ has components equal to entries of (the usual Pauli spin matrices)$/\sqrt{2}$. The free-field equation for a massless spinor $\psi^A$ is given by
\[ \nabla_{AA'} \psi^A = 0. \tag{2.7} \]
Although $N$ has no spacelike hypersurface which could play the role of a Cauchy surface, one can pose initial data for the massless wave equations at past null infinity, $\mathscr{I}^-$. We will show for the scalar wave equation (and outline a proof for the other fields) that for all data in $H_\infty(\mathscr{I}^-)$, there is a solution to Eq. (2.4). Because the geometry is flat outside a region of fixed spatial size, $\mathscr{I}$ is a copy of the Minkowski space $\mathscr{I}$. Our treatment of the initial value problem will be closely related to a way of expressing a solution of the wave equation in Minkowski space in terms of initial data on $\mathscr{I}^-$.
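2. Initial value problem on $\mathscr{I}^-$ for Minkowski space. In the null chart $(v = t + r, \vec{r})$ for Minkowski space, $\mathscr{I}^-$ has coordinates $(v, \hat r)$. A smooth solution $\Phi_0$ with finite energy to the wave equation (2.4) has as initial data on $\mathscr{I}^-$ the single function [9]
\[ f_0(v, \hat r) = \lim_{r\to\infty} r\,\Phi_0(v, r\hat r). \tag{2.8} \]
It is helpful to write the solution $\Phi_0$ in terms of its positive frequency part, $\Psi_0$:
\[ \Phi_0 = 2\,{\rm Re}\,\Psi_0 = (2\pi)^{-3/2} \int d^3k\, \bigl[ a(k) e^{i(k\cdot x - \omega t)} + a^*(k) e^{-i(k\cdot x - \omega t)} \bigr], \tag{2.9} \]
where
\[ \Psi_0 = (2\pi)^{-3/2} \int d^3k\, a(k) e^{i(k\cdot x - \omega t)}. \tag{2.10} \]
Then the function $\Psi_0$ has initial data on $\mathscr{I}^-$ given by [9]
\[ \lim_{r\to\infty} r\,\Psi_0(v, r\hat r) = \frac{i}{(2\pi)^{1/2}} \int_0^\infty d\omega\, \omega\, a(-\omega\hat r)\, e^{-i\omega v}. \tag{2.11} \]
The corresponding initial data $f_0(v, \hat r)$ for $\Phi_0$ has Fourier transform,
\[ \tilde f_0(\omega, \hat r) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} dv\, f_0(v, \hat r)\, e^{i\omega v}, \tag{2.12} \]
related to $a(k)$ by
\[ \tilde f_0(\omega, \hat r) = i\omega\, a(-\omega\hat r), \qquad \omega \ge 0, \tag{2.13} \]
\[ \tilde f_0(-\omega, \hat r) = \tilde f_0^*(\omega, \hat r). \tag{2.14} \]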
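The reality condition (2.14) is a general property of the transform convention in (2.12) applied to real data, and it can be checked numerically. The following is a minimal sketch, not part of the paper's argument: the function `sample_data`, the cutoff `v_max`, and the grid size are hypothetical choices made only for illustration.

```python
import cmath
import math


def fourier_transform(f, omega, v_max=12.0, n=4001):
    """Trapezoid-rule estimate of (2*pi)**-0.5 * integral f(v) exp(+i*omega*v) dv."""
    dv = 2.0 * v_max / (n - 1)
    total = 0.0 + 0.0j
    for j in range(n):
        v = -v_max + j * dv
        weight = 0.5 if j in (0, n - 1) else 1.0  # end points get half weight
        total += weight * f(v) * cmath.exp(1j * omega * v)
    return total * dv / math.sqrt(2.0 * math.pi)


def sample_data(v):
    """Hypothetical real, rapidly decaying data on a single generator of scri-minus."""
    return math.exp(-0.5 * (v - 1.0) ** 2) * math.cos(3.0 * v)


omega = 2.5
plus = fourier_transform(sample_data, omega)
minus = fourier_transform(sample_data, -omega)

# For real data the transform at -omega is the complex conjugate of the
# transform at +omega, which is the content of (2.14).
symmetry_error = abs(minus - plus.conjugate())
print(symmetry_error)
```

Because of this symmetry, only the values of the transform at $\omega \ge 0$ carry independent information, which is why (2.13) needs to specify $a(k)$ only for non-negative frequency.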
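If we define $L_2(\mathscr{I}^-)$ by the norm,
\[ \|f\|^2_{L_2(\mathscr{I}^-)} = \int dv\, d\Omega\, |f|^2, \tag{2.15} \]
then the $L_2$ norm of $\Psi_0$ on a spacelike hyperplane is equal to the $L_2$ norm of its initial data on $\mathscr{I}^-$:
\[ \lim_{r\to\infty} \int dv\, d\Omega\, |r\Psi_0|^2 = \int dk\, |a|^2 = \int dV\, |\Psi_0|^2. \tag{2.16} \]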
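The $L_2$ norm of $\Phi_0$ depends on hypersurface, but it is bounded by the constant $L_2$ norm of $\Psi_0$. The flux of energy at $\mathscr{I}^-$ is given in terms of the stress tensor,
\[ T_{\alpha\beta} = \nabla_\alpha\Phi\, \nabla_\beta\Phi - \tfrac{1}{2}\, g_{\alpha\beta}\, \nabla_\gamma\Phi\, \nabla^\gamma\Phi, \tag{2.17} \]
by
\begin{align}
\int_{\mathscr{I}^-} dS_\alpha\, T^\alpha{}_\beta\, t^\beta
  &= \lim_{r\to\infty} \int dv\, d\Omega\, [\partial_v (r\Phi_0)]^2 \tag{2.18} \\
  &= \int d\omega\, d\Omega\, \omega^2 |\tilde f_0|^2 \tag{2.19} \\
  &= \int dk\, \omega^2 |a(k)|^2. \tag{2.20}
\end{align}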
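Initial data for vector and spinor fields have a similar character. For a Maxwell field with finite energy on Minkowski space, initial data on $\mathscr{I}^-$ has the form,
\[ \lim_{r\to\infty} r A_\alpha(v, r\hat r) = 2\,{\rm Re}\, \frac{i}{(2\pi)^{1/2}} \int d\omega\, \omega\, a_\alpha(-\omega\hat r)\, e^{-i\omega v}, \tag{2.21} \]
with $a_\alpha(k) k^\alpha = 0 = a_\alpha t^\alpha$. Similarly, data on $\mathscr{I}^-$ for a massless spinor field has the form,
\[ \lim_{r\to\infty} r \psi_A(v, r\hat r) = 2\,{\rm Re}\, \frac{i}{(2\pi)^{1/2}} \int d\omega\, \omega\, a_A(-\omega\hat r)\, e^{-i\omega v}, \tag{2.22} \]
where $a_A(k) k^{AA'} = 0$. As in (2.16), the $L_2$ norm of the initial data for vector and spinor fields is equal in flat space to the $L_2$ norm on a spacelike hyperplane:
\[ \lim_{r\to\infty} \int dv\, d\Omega\, |r A_0|^2 = \int dk\, |a|^2 = \int dV\, |A_0|^2, \tag{2.23} \]
\[ \lim_{r\to\infty} \int dv\, d\Omega\, |r \psi_0|^2 = \int dk\, |a|^2 = \int dV\, |\psi_0|^2. \tag{2.24} \]
From the form of the energy-momentum tensor for vector and spinor fields,
\[ T_{\alpha\beta} = \frac{1}{4\pi} \Bigl( F_{\alpha\gamma} F_\beta{}^\gamma - \frac{1}{4}\, g_{\alpha\beta} F_{\gamma\delta} F^{\gamma\delta} \Bigr), \tag{2.25} \]
\[ T_{\alpha\beta} = i\, \sigma_\alpha{}^{AA'} \sigma_\beta{}^{BB'} \bigl( \psi_{(A} \nabla_{B)A'} \bar\psi_{B'} - \bar\psi_{(A'} \nabla_{B')A} \psi_{B} \bigr), \tag{2.26} \]
the energy flux of the fields $F_{\alpha\beta}$ and $\psi_A$ at $\mathscr{I}^-$ is
\[ \frac{1}{4\pi} \int_{\mathscr{I}^-} dS_\alpha\, F^{\alpha\gamma} F_{\beta\gamma}\, t^\beta = \frac{1}{4\pi} \int dk\, \omega^2\, a_j(k)^* a^j(k), \tag{2.27} \]
and
\[ \int_{\mathscr{I}^-} dS^\alpha\, i\, \sigma_\alpha{}^{AA'} \sigma_\beta{}^{BB'} \bigl( \psi_{(A} \nabla_{B)A'} \bar\psi_{B'} - \bar\psi_{(A'} \nabla_{B')A} \psi_{B} \bigr)\, t^\beta = \int dk\, \omega^2 |a(k)|^2. \tag{2.28} \]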
3. Initial value problem on $\mathscr{I}^-$ for N. In the spacetime $N, g_{\alpha\beta}$ that we are considering, we seek a solution $\Phi$ to the scalar wave equation,
\[ \nabla_\alpha \nabla^\alpha \Phi = 0, \tag{2.29} \]
with initial data $f$ defined as in Eq. (2.8) by
\[ f(v, \hat r) = \lim_{r\to\infty} r\,\Phi(v, r\hat r), \tag{2.30} \]
and we shall assume that $f$ and its derivatives are in $L_2(\mathscr{I}^-)$. Because a neighborhood of $\mathscr{I}^-$ in $N$ is isometric to a neighborhood of $\mathscr{I}^-$ in Minkowski space, the null Minkowski chart $(v = t + r, \vec{r})$ can be regarded as a chart on $N$ near $\mathscr{I}^-$. It will again be convenient to relate $f$ to a function $a(k)$ as in Eqs. (2.12)–(2.14), in this case defining $a(k)$ by
\[ \tilde f(\omega, \hat r) = i\omega\, a(-\omega\hat r), \qquad \omega \ge 0, \tag{2.31} \]
\[ \tilde f(-\omega, \hat r) = \tilde f^*(\omega, \hat r). \tag{2.32} \]
Initial data on $\mathscr{I}^-$ for massless vector and spinor fields on $N$ can similarly be defined as vector and spinor fields $f_\alpha$ and $f_A$ on $\mathscr{I}^-$, related to solutions $A_\alpha$ and $\psi_A$ of Eqs. (2.6) and (2.7) by
\[ f_\alpha = \lim_{r\to\infty} r A_\alpha(v, r\hat r) \tag{2.33} \]
and
\[ f_A = \lim_{r\to\infty} r \psi_A(v, r\hat r). \tag{2.34} \]
Here the flat metric near $\mathscr{I}^-$ is implicitly used to identify the tangent spaces at each point with a single vector space in order to make the limit meaningful; and the spinor space at each point is similarly regarded as a single 2-dimensional complex space. Equivalently, one can simply replace the abstract indices $\alpha$ and $A$ of Eqs. (2.33) and (2.34) by concrete indices along frames that are constant with respect to the flat metric. Using the same single vector space, one can again define Fourier components $\tilde f_\alpha$ by
\[ \tilde f_\alpha(\omega, \hat r) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} dv\, f_\alpha(v, \hat r)\, e^{i\omega v}, \tag{2.35} \]
and
\[ \tilde f_\alpha(-\omega, \hat r) = \tilde f_\alpha^*(\omega, \hat r). \tag{2.36} \]
Corresponding vectors $a_\alpha(k)$ with $k^\alpha a_\alpha(k) = 0$ are related to $\tilde f_\alpha$ by
\[ \tilde f_\alpha(\omega, \hat r) = i\omega\, a_\alpha(-\omega\hat r), \qquad \omega \ge 0. \tag{2.37} \]
Spinors $a_A(k)$ with $k^{AA'} a_A = 0$ are similarly related to $\tilde f_A$ if one replaces each subscript $\alpha$ by $A$ in Eqs. (2.36) and (2.37). Because the spacetime $N$ is not simply connected, one must specify a choice of spinor structure in order to make sense of Eq. (2.7). On the simply connected spacetime $N \backslash C$, the two spinor structures on $N$ correspond to a choice of sign in identifying a spinor at a point $P_I \in C_I$ with a spinor at the corresponding point $P_{II} \in C_{II}$. The choice of spinor structure thus becomes a choice of boundary condition (see Eq. (2.41), below). The two choices give two inequivalent spinor fields on $N$. With $a(k)$, $a_\alpha(k)$, and $a_A$ defined in this way, Eqs. (2.18), (2.27), and (2.28) remain valid as expressions for the energy flux through $\mathscr{I}^-$ of $N$.

C. Boundary conditions. A scalar field $\Phi$ on $N$ satisfies at the cylindrical boundaries of $N \backslash C$ conditions expressing the continuity of $\Phi$ and its normal derivative along a path that traverses the wormhole. Let $P_I$ and $P_{II} = T(P_I)$ be identified points on the cylinders $C_I$ and $C_{II}$. The tangent vector to a path that traverses the wormhole, entering at $P_I$ and leaving at $P_{II}$, points inward at $P_I$ and outward at $P_{II}$. When the cylinders $C_I$ and $C_{II}$ are identified, a unit inward normal to $C_I$ is thus identified with a unit outward normal to $C_{II}$. If we denote by $\hat n_I$ and $\hat n_{II}$ the unit outward normals to $C_I$ and $C_{II}$, the boundary conditions can be written
\[ \Phi(P_{II}) = \Phi(P_I), \tag{2.38a} \]
\[ \hat n_{II} \cdot \nabla\Phi(P_{II}) = -\hat n_I \cdot \nabla\Phi(P_I). \tag{2.38b} \]
The analogous conditions for vector and spinor fields can be stated in terms of the differential map $T_*$ induced by $T$. If $\{\hat e_\mu(P_I)\} = \{\hat e_0, \hat e_1, \hat e_2, \hat n_I\}$ is a right-handed orthonormal frame at $P_I$, then the corresponding right-handed frame at $P_{II}$ (if $T$ were a proper rotation, then the corresponding frame would be left-handed) is
\[ \{\hat e_\mu(P_{II})\} = \{T_*\hat e_0, T_*\hat e_1, T_*\hat e_2, -\hat n_{II}\}. \tag{2.39} \]
The boundary conditions for a vector field can be expressed in terms of its components along the frame $\{\hat e_\mu\}$:
\[ A_\mu(P_{II}) = A_\mu(P_I), \tag{2.40a} \]
\[ \hat n_{II} \cdot \nabla A_\mu(P_{II}) = -\hat n_I \cdot \nabla A_\mu(P_I). \tag{2.40b} \]
A spinor field has components along a spinor frame, an element of the double covering ($\simeq SL(2, \mathbb{C})$) of the space of right-handed orthonormal frames ($\simeq SO(3,1)$) at a point. Two spinor frames correspond to the same orthonormal frame, and the choice of spinor structure is the choice of which spinor frame at $P_{II}$ to identify with a spinor frame at $P_I$. Let $(o^A, \iota^A)$ be a field of spinor frames covering a field of frames $\hat e_\mu$ that satisfies Eq. (2.39) on $N \backslash C$. We can choose as boundary conditions for the corresponding components $\psi_J$ of a spinor field
\[ \psi_J(P_{II}) = \psi_J(P_I), \tag{2.41a} \]
\[ \hat n_{II} \cdot \nabla \psi_J(P_{II}) = -\hat n_I \cdot \nabla \psi_J(P_I). \tag{2.41b} \]
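The opposite spinor structure would be selected by changing the sign of the right-hand side of these equations or, equivalently, by keeping the same sign but choosing a homotopically different frame field $\{\tilde e_\mu\}$.

D. Eigenfunction expansions. Because the geometry is static, we can express a solution as a superposition of functions with harmonic time dependence,
\[ \Phi(t, x) = \int d\omega\, \Phi(\omega, x)\, e^{-i\omega t}, \tag{2.42a} \]
\[ A_\alpha(t, x) = \int d\omega\, A_\alpha(\omega, x)\, e^{-i\omega t}, \tag{2.42b} \]
\[ \psi_A(t, x) = \int d\omega\, \psi_A(\omega, x)\, e^{-i\omega t}. \tag{2.42c} \]
Here $x$ is naturally a point of the manifold of trajectories of $t^\alpha$, but we can identify it with a point of the simply connected spacelike hypersurface $M = M_0$, with spherical boundaries $\Sigma_I$ and $\Sigma_{II}$. Let $(t, x_I)$ and $(t + \tau, x_{II})$ be points of $N \backslash C$ that are identified in $N$. The harmonic components of fields on $N$ can be regarded as fields on $M$ satisfying the boundary conditions,
\[ \Phi(\omega, x_{II}) = e^{i\delta}\, \Phi(\omega, x_I), \tag{2.43a} \]
\[ \hat n_{II} \cdot \nabla\Phi(\omega, x_{II}) = -e^{i\delta}\, \hat n_I \cdot \nabla\Phi(\omega, x_I), \tag{2.43b} \]
\[ A_\mu(\omega, x_{II}) = e^{i\delta} A_\mu(\omega, x_I), \tag{2.44a} \]
\[ \hat n_{II} \cdot \nabla A_\mu(\omega, x_{II}) = -e^{i\delta}\, \hat n_I \cdot \nabla A_\mu(\omega, x_I), \tag{2.44b} \]
\[ \psi_J(\omega, x_{II}) = e^{i\delta} \psi_J(\omega, x_I), \tag{2.45a} \]
\[ \hat n_{II} \cdot \nabla \psi_J(\omega, x_{II}) = -e^{i\delta}\, \hat n_I \cdot \nabla \psi_J(\omega, x_I), \tag{2.45b} \]
with phase $\delta = \omega\tau$.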
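The frequency-dependent phase in the boundary conditions on harmonic components is exactly what a time shift across the identification requires. The sketch below illustrates this for a field built from a few discrete frequencies rather than an integral; the shift `tau`, the frequencies, and the amplitudes are hypothetical values chosen for illustration only.

```python
import cmath

# Hypothetical time shift between identified points (t, x_I) and (t + tau, x_II).
tau = 0.7

# Hypothetical harmonic components at x_I: pairs (omega, amplitude).
modes = [(1.3, 0.5 - 0.2j), (2.1, -0.4 + 0.9j), (3.7, 0.25 + 0.1j)]


def field_at_I(t):
    """Field at x_I as a finite sum of harmonic components exp(-i*omega*t)."""
    return sum(amp * cmath.exp(-1j * w * t) for w, amp in modes)


def field_at_II(t):
    """Field at x_II, with each component multiplied by exp(i*delta), delta = omega*tau."""
    return sum(cmath.exp(1j * w * tau) * amp * cmath.exp(-1j * w * t) for w, amp in modes)


t = 0.37
# The phase condition on every component reproduces the time-shifted matching
# condition field(t + tau, x_II) = field(t, x_I) for the full field.
mismatch = abs(field_at_II(t + tau) - field_at_I(t))
print(mismatch)
```

A single frequency-independent phase could not achieve this matching for more than one mode at a time, which is the source of the difficulty with the spectral theorem discussed in the text.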
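The harmonic components, $\Phi(\omega, x)$, of the scalar field satisfy on $M$ elliptic equations of the form
\[ (\omega^2 + L)\Phi = 0, \tag{2.46} \]
with boundary conditions (2.43), where $L$ can be defined by the action of $\nabla_\alpha \nabla^\alpha$ on time independent fields $f$ on $N$; that is
\[ Lf := e^{2\nu}\, \nabla_\alpha \nabla^\alpha f|_M, \tag{2.47} \]
for fields satisfying $\mathcal{L}_t f = 0$, where $\mathcal{L}_t$ is the Lie derivative along $t^\alpha$. Then
\[ L = e^{\nu} D_a e^{\nu} D^a, \tag{2.48} \]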
where $D_a$ is the covariant derivative of the 3-metric $h_{ab}$ on $M$. We will denote by $L_\delta$ the operator $L$ with boundary conditions (2.43). The analogous operators for spinor and vector fields are discussed in Sect. III C. We show in Lemma 4 below that one can construct solutions $F(\delta, k, x)$, to the wave equation (2.29), having, on the flat geometry outside $\mathcal{R}$, the form of a plane wave plus a purely outgoing wave, corresponding to the scattering of the plane wave by the interior geometry. The existence of solutions for initial data on $\mathscr{I}^-$ then follows if one can show that a spectral decomposition of the form
\[ \Phi(x, t) = \int E(k, x)\, [e^{-i\omega t} a(k) + e^{i\omega t} a^*(k)]\, d^3k \tag{2.49} \]
converges, where $E(k, x) = F(\delta = \omega\tau, k, x)$. The major difficulty lies in the fact that, because the boundary conditions (2.38) involve a time-translation, the corresponding boundary conditions (2.43) depend on the frequency $\omega$. If $\delta$ were independent of frequency, the result would follow from the usual spectral theorem for self-adjoint operators. Here, however, $E(k, x)$ and $E(k', x)$ are, for $|k| \ne |k'|$, eigenfunctions of different operators; they are not orthogonal, and their completeness is not guaranteed by the spectral theorem. The main job of Sect. III is to show that the solution to the scalar wave equation for arbitrary initial data on $\mathscr{I}^-$ can nevertheless be constructed as a spectral integral of the form (2.49). We adopt throughout the common usage in which the term “eigenfunction” refers not to an element of the domain of $L_\delta$, but to a function in a weighted $L_2$ space, whose $L_2$ norm diverges (an example is the function $e^{ikx}$ in $\mathbb{R}^3$).

E. Sobolev spaces. We shall need some standard properties of Sobolev spaces, including the Sobolev embedding and trace theorems. These can be found in Reed and Simon [7]. Denote by $H_k(N)$ the Sobolev space on a manifold $N$ with volume form $dv$, so that for $k$ a positive integer, $H_k(N)$ is the space of functions on $N$ for which the function and its first $k$ derivatives are square integrable. A Hilbert space norm $\|\cdot\|_k$ on the set $H_k(N)$ can be specified by choosing a connection $\nabla$ on $N$ and writing
\[ \|f\|_k = \Bigl[ \sum_{0 \le m \le k} \int_N |\nabla_{\alpha_1} \cdots \nabla_{\alpha_m} f|^2\, dv \Bigr]^{1/2}. \tag{2.50} \]
Spaces $H_{-k}(N)$ can then be defined as the dual spaces to $H_k$:

Definition. For each integer $k \ge 0$, the Hilbert space $H_k(N)$ is the completion of $C_0^\infty(N)$ in the norm (2.50); $H_{-k}(N) = H_k(N)^*$.
Note that for $N = \mathbb{R}^n$ with natural volume element and natural flat connection, $H_k$ for both positive and negative values of $k$ can be defined as the completion of $C_0^\infty$ in the norm
\[ \|f\|_k = \Bigl[ \int_{\mathbb{R}^n} d^n\xi\, |\hat f(\xi)|^2 (1 + |\xi|^2)^k \Bigr]^{1/2}. \tag{2.51} \]
(Equation (2.51) defines $H_k(\mathbb{R}^n)$ for any real $k$; on $N$ one can extend the definition to all real $k$ by quadratic interpolation [8].)

For the spacelike three-manifold $M \equiv M_0$, we will use the volume element $dv = e^{-\nu} dV$, where $dV$ is the proper volume associated with the 3-metric $h_{ab}$. The lapse function, $e^{-\nu}$, appearing in the measure is needed to make the operator $L$ symmetric. Weighted $L_2$ spaces on $M$ will also be needed.

Definition. $L_{2,r}(M)$ is the completion of $C_0^\infty(M)$ in the norm
\[ \|f\|^2_{2,r} = \int_M dV\, e^{-\nu} |f|^2 (1 + |x|^2)^r. \tag{2.52} \]
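The agreement between the derivative form of the Sobolev norm and the Fourier form (2.51) can be illustrated in one dimension for $k = 1$. The sketch below is an illustration only: the test function $e^{-x^2}$, its closed-form unitary Fourier transform $2^{-1/2}e^{-\xi^2/4}$, and the integration cutoffs are choices made here, not taken from the paper.

```python
import math


def trapezoid(g, a, b, n=20001):
    """Composite trapezoid rule for the integral of g over [a, b]."""
    h = (b - a) / (n - 1)
    total = 0.5 * (g(a) + g(b))
    for j in range(1, n - 1):
        total += g(a + j * h)
    return total * h


def f(x):
    """Hypothetical test function exp(-x^2)."""
    return math.exp(-x * x)


def df(x):
    """Derivative of the test function."""
    return -2.0 * x * math.exp(-x * x)


def f_hat(xi):
    """Unitary Fourier transform of exp(-x^2): exp(-xi^2/4) / sqrt(2)."""
    return math.exp(-0.25 * xi * xi) / math.sqrt(2.0)


# Squared H_1 norm from the function and its first derivative.
position_side = trapezoid(lambda x: f(x) ** 2 + df(x) ** 2, -12.0, 12.0)

# Squared H_1 norm from the Fourier weight (1 + xi^2)^k with k = 1.
fourier_side = trapezoid(lambda xi: f_hat(xi) ** 2 * (1.0 + xi * xi), -24.0, 24.0)

# Both integrals can be done in closed form and equal sqrt(2*pi).
exact = math.sqrt(2.0 * math.pi)
print(position_side, fourier_side, exact)
```

The Fourier form has the advantage, noted in the text, that the exponent $k$ need not be a non-negative integer.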
The space $L_{2,r}(\mathbb{R}^3)$ is similarly defined, with $dv$ the natural volume on $\mathbb{R}^3$. One can define on $M$ a smooth global field of frames that agrees on the flat region outside $\mathcal{R}$ with a cartesian frame of $\mathbb{R}^3$. A tensor field will be said to be in any of the spaces defined above if its components with respect to the frame are in the space. Corresponding to a global frame field and a choice of spinor structure is a field of spinor frames on $M$. A spinor field will be said to be in any of the spaces above if its components with respect to the frame are in the space. Finally, it is helpful to define spaces that incorporate the boundary conditions on $\partial M = \Sigma_I \sqcup \Sigma_{II}$.

Definition. $H_1$ is the intersection of $H_1(M)$ with functions satisfying the boundary condition (2.38a). $H_2$ is the intersection of $H_2(M)$ with functions satisfying the boundary conditions (2.38a–2.38b).

The Sobolev trace theorem implies that elements of $H_1(M)$ have well-defined values on $\partial M$, so $H_1$ and $H_2$ are completions in the $H_1$ and $H_2$ norms of $C^\infty$ functions satisfying the boundary conditions specified in the definition.

III. Existence and Uniqueness Theorems

A. Existence theorem for a massless scalar field. Let $R$ be large enough that for $r > R$ the geometry is flat. Define an exterior region $\mathcal{E}$ by $\mathcal{E} = \{p \in N \mid r(p) > R\}$ and an interior region $\mathcal{R} = N \backslash \mathcal{E}$. It will be helpful in what follows to introduce a smoothed step function, $\chi$, that vanishes on $\mathcal{R}$:
\[ \chi(p) = \begin{cases} 0, & \text{if } p \in \mathcal{R} \\ 1, & \text{if } r(p) \ge R + \epsilon, \end{cases} \tag{3.1} \]
for some $\epsilon > 0$.
Definition. A scalar field $\Phi$ on $N$ is asymptotically regular at spatial infinity if ◦ ∈ $H_1(M)$; it is asymptotically regular at null infinity if the limits,
\[ f(v, \hat r) = \lim_{r\to\infty} r\,\Phi(v, r\hat r), \qquad g(u, \hat r) = \lim_{r\to\infty} r\,\Phi(u, r\hat r), \tag{3.2} \]
exist, with $f \in H_1(\mathscr{I}^-)$, $g \in H_1(\mathscr{I}^+)$, where $u$ is the null coordinate $t - r$. That is, $\Phi$ is asymptotically regular if it is an $L_2$-function with finite energy on $N$ and has well-defined data with finite energy on $\mathscr{I}^-$ and $\mathscr{I}^+$.

Our result on existence of solutions to the scalar wave equation has the following form:

Proposition 1. For almost all spacetimes $N, g_{\alpha\beta}$ of the kind described in Sect. II A (for almost all parameters $\tau$), the following existence theorem holds. Let $f$ be initial data on $\mathscr{I}^-$ for which $f$ and all its derivatives are in $L_2(\mathscr{I}^-)$. Then there exists a solution $\Phi$ to the scalar wave equation which is smooth and asymptotically regular at null and spatial infinity and which has $f$ as initial data.

The proof is given as a sequence of lemmas. For boundary conditions (2.43) specified by a fixed phase, $\delta$, we show that the operators $L_\delta$ are self-adjoint on a dense subspace of $L_2$. We follow a method given by Wilcox [9] to obtain explicit eigenfunctions $F(\delta, k, x)$ in a weighted $L_2$ space. For fixed boundary phase $\delta$ the eigenfunctions are complete and orthonormal, but, as mentioned earlier, a solution of Eq. (2.29) is a superposition of eigenfunctions,
\[ E(k, x) = F(\delta = \omega\tau, k, x), \tag{3.3} \]
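that are not orthonormal. For each $\omega$, the function $E(k, x)$ is an eigenfunction of a different operator $L_\delta$ because the boundary condition depends on $\omega$ through the relation $\delta = \omega\tau$.

Lemma 1. The operator $L_\delta$ with boundary conditions (2.43) is self-adjoint on the space $L_2(M)$ with domain $H_2$.

Proof. Recall that for an operator $A$ with domain $D(A) \subset L_2(M)$, a function $f \in L_2(M)$ is in $D(A^\dagger)$ if and only if $\exists h \in L_2(M)$ such that $\langle f | A g \rangle = \langle h | g \rangle$ $\forall g \in D(A)$. To prove that $H_2 \subset D(L_\delta^\dagger)$ is easy: for $f \in H_2$, $L_\delta f \in L_2(M)$. Take $h = L_\delta f$. Then, writing $\Sigma := \Sigma_I \sqcup \Sigma_{II}$, we have
\[ \langle f | L_\delta g \rangle = \langle L_\delta f | g \rangle + \int_{\Sigma_I \sqcup \Sigma_{II}} dS_a\, e^{-\nu}\, \overline{(f e^{i\delta})}\, \overleftrightarrow{D}^a\, (g e^{i\delta}). \tag{3.4} \]
Because $f$ and $g$ satisfy the boundary conditions (2.43), we have
\[ \int_{\Sigma_{II}} dS_a\, e^{-\nu}\, \bar f\, \overleftrightarrow{D}^a g = -\int_{\Sigma_I} dS_a\, e^{-\nu}\, \overline{(f e^{i\delta})}\, \overleftrightarrow{D}^a\, (g e^{i\delta}) = -\int_{\Sigma_I} dS_a\, e^{-\nu}\, \bar f\, \overleftrightarrow{D}^a g. \tag{3.5} \]
Thus $L_\delta$ is symmetric,
\[ \langle f | L_\delta g \rangle = \langle L_\delta f | g \rangle = \langle h | g \rangle, \tag{3.6} \]
and $H_2 \subset D(L_\delta^\dagger)$.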
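The cancellation of boundary terms that makes the operator symmetric has a simple finite-dimensional analogue. The sketch below is a one-dimensional toy model, not the operator of the paper: it builds the second-difference matrix on a periodic grid whose wrap-around carries a phase, so that $u_n = e^{i\delta}u_0$, and checks that the matrix is Hermitian. The grid size and the phase value are hypothetical.

```python
import cmath


def twisted_laplacian(n, delta):
    """Second-difference matrix on n grid points with the twisted condition u[n] = exp(i*delta) * u[0]."""
    a = [[0j] * n for _ in range(n)]
    for j in range(n):
        a[j][j] = -2 + 0j
        if j + 1 < n:
            a[j][j + 1] = 1 + 0j
        if j - 1 >= 0:
            a[j][j - 1] = 1 + 0j
    # Wrap-around neighbours pick up the boundary phase:
    # u[n] = exp(+i*delta) * u[0] and u[-1] = exp(-i*delta) * u[n-1].
    a[n - 1][0] += cmath.exp(1j * delta)
    a[0][n - 1] += cmath.exp(-1j * delta)
    return a


def hermiticity_defect(a):
    """Largest entry of |A - A^dagger|; zero for a Hermitian matrix."""
    n = len(a)
    return max(abs(a[i][j] - a[j][i].conjugate()) for i in range(n) for j in range(n))


defect = hermiticity_defect(twisted_laplacian(8, 0.9))
print(defect)
```

For each fixed phase the matrix is Hermitian, but matrices with different phases are different operators with different eigenvectors, which mirrors the situation described for the eigenfunctions $E(k, x)$.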
To prove that $D(L_\delta^\dagger) \subset H_2$, we proceed as follows. For $f \in D(L_\delta^\dagger)$, there is an $h \in L_2(M)$ with
\[ \langle f | L_\delta g \rangle = \langle h | g \rangle \quad \forall g \in D(L_\delta). \tag{3.7} \]
That is, the equation $L_\delta^\dagger f = h$ is satisfied weakly on $M$. Then, by elliptic regularity, we have $f \in H_2^{\rm loc}(M)$ and $L_\delta f = h$ is satisfied strongly on $M$.

Asymptotic behavior of $f$ follows from the fact that the exterior region $\mathcal{E}$ can be regarded as a subspace of Euclidean 3-space, $E^3$. Let $\chi$ be the smooth step function of Eq. (3.1), and let $\tilde f = \chi f$. Then $\tilde f$ is a function on $E^3$ satisfying $\nabla^2 \tilde f = \tilde h$, with $\tilde h \in L_2(E^3)$. Because $\nabla^2$ is self-adjoint on $E^3$ with domain $H_2(E^3)$, we have $\tilde f \in H_2(E^3)$, implying $f|_{\mathcal{E}} \in H_2(\mathcal{E})$.

To check that $f$ satisfies the boundary conditions, we use the fact, noted in Sect. II (before Eq. (2.1)), that by identifying $\Sigma_I$ and $\Sigma_{II}$, one obtains a spacelike copy $\hat M$ of $M$ with a smooth spatial metric, $\hat g_{ab}$. ($\hat M$ is an artificially constructed spacelike surface with a handle; it is not a submanifold of our spacetime, $N$.) The operator $\hat L$ constructed from $\hat g_{ab}$ is smooth and elliptic. Because the projection, $p : M_t \to \hat M$, is an isometry on the interior of $M_t$, $\hat L$ agrees with $L$ there. Let $U$ be a neighborhood of $\Sigma$ for which $p^{-1}(U) = U_I \sqcup U_{II}$, with $U_I$ and $U_{II}$ disjoint neighborhoods of $\Sigma_I$ and $\Sigma_{II}$. If, as expected, the function $f \circ p^{-1}$ has a phase that changes by $e^{i\delta}$ across $\Sigma$, then we should get a smooth function $\hat f$ on $U$ by requiring
\[ \hat f(P) = \begin{cases} f \circ p^{-1}, & P \in p(U_I) \\ e^{-i\delta} f \circ p^{-1}, & P \in p(U_{II}). \end{cases} \tag{3.8} \]
To see that this is true, first smoothly truncate $f$ so that it vanishes outside of $U_I$ and $U_{II}$, and then define $\hat f$ as above. Let $g$ of (3.7) similarly have support on $U_I \cup U_{II}$, and let $h$ be the function in $L_2(M)$ given by (3.7). Define $\hat g$ and $\hat h$ on $U_I \cup U_{II}$ according to (3.8), with $f$ replaced by $g$ and $h$, respectively. Then $g \in H_2(M) \Rightarrow \hat g \in H_2(\hat M)$, and we have
\[ \langle \hat f | \hat L \hat g \rangle = \langle \hat h | \hat g \rangle, \tag{3.9} \]
$\forall \hat g \in H_2(\hat M)$ with support on $U_I \cup U_{II}$, implying (again by elliptic regularity) that $\hat f$ is smooth across $\Sigma$. Thus any $f \in D(L_\delta^\dagger)$ is a function in $H_2$ satisfying the boundary conditions (2.43), meaning $D(L_\delta^\dagger) \subset H_2$.

We now find a set of eigenfunctions that are complete for data on $\mathscr{I}^-$. We consider solutions $F(\delta, k, x)$ to Eq. (2.46) which, for $r > R$, have the form
\[ F = (2\pi)^{-3/2} e^{ik\cdot x} + \text{outgoing waves}. \tag{3.10} \]
To prove existence of the eigenfunctions $F$, one first rewrites Eq. (2.46) in the inhomogeneous form
\[ (\omega^2 + L_\delta)\varphi = \rho, \tag{3.11} \]
where $\varphi$ is purely outgoing and $\rho$ has compact support. One can do this by using the steplike function $\chi(r)$ of Eq. (3.1), writing
\[ F = (2\pi)^{-3/2} \chi\, e^{ik\cdot x} + F_{\rm out}. \tag{3.12} \]
Then
\[ (\omega^2 + L_\delta) F = (\omega^2 + L_\delta) F_{\rm out} - \rho, \tag{3.13} \]
with
\[ \rho = -(2\pi)^{-3/2} e^{ik\cdot x} (\nabla_\alpha \nabla^\alpha \chi - 2i k^\alpha \nabla_\alpha \chi). \tag{3.14} \]
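The homogeneous equation, $(\omega^2 + L_\delta)F = 0$, is equivalent to the inhomogeneous equation (3.11) with $\varphi = F_{\rm out}$.

Lemma 2. Let $\lambda_k$ be a sequence of complex numbers with positive imaginary part, such that $\lambda_k \to \omega^2$. Consider families $\{\varphi_k, \rho_k\}$ of smooth fields on $M$, where for each $k$, the field $\rho_k$ has support on the region $r \le R + \epsilon$, and $\varphi_k$ is the unique asymptotically regular solution to Eq. (3.11) with $\omega^2$ replaced by $\lambda_k$. If $\rho_k \to \rho$ in $L_2(M)$, then a subsequence $\{\varphi_m\}$ converges in an $H_1$ norm to a smooth outgoing solution $\varphi$ to Eq. (3.11).

Proof. Let $D$ be a real number greater than $R + \epsilon$, $M_D = \{x \in M;\ r \le D\}$. Denote by $\|\cdot\|_{s,D}$ the norm of $H_s(M_D)$. Suppose first that $\|\varphi_k\|_{2,D}$ has a bound $C$ independent of $k$. Since $H_2(M_D) \hookrightarrow H_1(M_D)$ is a compact embedding (by the Sobolev embedding theorem) the set $\{\varphi_k\}$ belongs to a compact set of $H_1(M_D)$. Thus there is a subsequence $\{\varphi_m\}$ that converges in $H_1(M_D)$:
\[ \varphi_m \to \varphi. \tag{3.15} \]
We must show that $\varphi$ satisfies (i) $(\omega^2 + L)\varphi = \rho$ on $M_D$, (ii) the boundary conditions (2.43), and (iii) we must extend $\varphi$ outside $M_D$.

(i): To see that $\varphi$ weakly satisfies (3.11), define $\forall \psi \in C_0^\infty(M_D)$,
\[ Q \equiv -\langle D_a(e^{\nu}\psi) | D^a(e^{\nu}\varphi) \rangle + \omega^2 \langle \psi | \varphi \rangle - \langle \psi | \rho \rangle. \tag{3.16} \]
Add to this $Q$ the quantity $\langle D_a(e^{\nu}\psi) | D^a(e^{\nu}\varphi_m) \rangle - \lambda_m \langle \psi | \varphi_m \rangle + \langle \psi | \rho_m \rangle = 0$. Then
\begin{align}
|Q| &= \bigl| -\langle D_a(e^{\nu}\psi) | D^a(e^{\nu}\varphi) \rangle + \omega^2 \langle \psi | \varphi \rangle - \langle \psi | \rho \rangle + \langle D_a(e^{\nu}\psi) | D^a(e^{\nu}\varphi_m) \rangle - \lambda_m \langle \psi | \varphi_m \rangle + \langle \psi | \rho_m \rangle \bigr| \notag \\
&\le \bigl| \langle D_a(e^{\nu}\psi) | D^a[e^{\nu}(\varphi - \varphi_m)] \rangle \bigr| + \omega^2 \bigl| \langle \psi | \varphi - \varphi_m \rangle \bigr| + |\omega^2 - \lambda_m|\, \bigl| \langle \psi | \varphi_m \rangle \bigr| + \bigl| \langle \psi | \rho - \rho_m \rangle \bigr|. \tag{3.17}
\end{align}
The limit of the right-hand side of (3.17) is zero as $m \to \infty$:
\[ \lim_{m\to\infty} \|\varphi - \varphi_m\|_{1,R} = 0, \tag{3.18a} \]
\[ \lim_{m\to\infty} \|\rho - \rho_m\|_0 = 0, \tag{3.18b} \]
\[ \lim_{m\to\infty} |\omega^2 - \lambda_m| = 0. \tag{3.18c} \]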
Then Q independent of m implies Q = 0. Hence ’ is a weak solution to (!2 + L)’ = in MD , and elliptic regularity implies that it is a strong solution. (ii): Next de ne a neighbourhood U on Mˆ , as in the proof of Lemma 1. And de ne ’ˆ k on U from ’k on MD in exactly the same way that fˆ was de ned in Lemma 1 from f. Then, because ’k is smooth and satis es the boundary conditions
(2.43) on MD , we have that ’ˆ k is smooth on U . Also, because k has support only ˆ ’ˆ k = 0 on U . Now, outside of U (away from the boundaries), ’ˆ k satis es (k + L) ˆ which is continuous. But, a subsequence {’ˆ m } converges in H1 (U ) to a solution ’, by elliptic regularity ’ˆ is smooth in U and therefore ’ satis es both boundary conditions (2.43) ∀ ∈ C ∞ (MD ). (iii): For r ¿ R + , we have m = 0. Because the space is at outside r = R, and ’m is asymptotically regular, we can use the explicit Green function for ∇2 at + !2 to write √ ↔ ei m |x−y| R ; (3.19) dSy ’m (y)@y ’m (x) = |x − y| |y|=R √ where Im m ¿ 0. Then for R + ¡ r ¡ D, ’(x) = lim ’m (x) = m→∞
↔ ei!|x−y| ; dSy ’(y)@y |x − y| |y|=R R
(3.20)
an outgoing wave. De ning ’ by Eq. (3.20) for r ¿ D we obtain an outgoing C ∞ solution to (3.11) as claimed. The construction has so far relied on the assumption that k’k k2; D , was bounded. If not, the sequence ’˜ k = ’k =k’k k2; D has unit norm and a source ˜k = k =k’k k2; D whose norm converges to zero: lim k˜k k = 0 :
(3.21)
k→∞
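This leads to a contradiction. From the previous paragraph, there is a subsequence $\tilde\varphi_m$ converging to an outgoing solution $\tilde\varphi$ of $L\tilde\varphi = 0$. But Rellich's uniqueness theorem (Lemma 3 below) implies $\tilde\varphi = 0$, whence $\lim_{m\to\infty}\|\tilde\varphi_m\|_{0,D} = 0$. From $L\tilde\varphi_m = \tilde\rho_m$, we have $\|\tilde\varphi_m\|_{2,D} \le C\|\tilde\varphi_m\|_{0,D} + \|\tilde\rho_m\|_0$. Then $\lim_{m\to\infty}\|\tilde\varphi_m\|_{2,D} = 0$, contradicting $\|\tilde\varphi_m\|_{2,D} = 1$.

Lemma 3 (Rellich's uniqueness theorem). Let $\varphi \in H_2$ be outgoing at spatial infinity and satisfy $(L + \omega^2)\varphi = 0$. Then $\varphi = 0$.

Proof. In the exterior region $E$, $\varphi$ is a smooth solution to the flat space equation $(\omega^2 + \nabla^2)\varphi = 0$ and can therefore be written as a sum
$$\sum_{lm}\left[\alpha_{lm}\,h^{(1)}_l(\omega r)\,Y_{lm}(\Omega) + \beta_{lm}\,h^{(2)}_l(\omega r)\,Y_{lm}(\Omega)\right], \tag{3.22}$$
converging in $L_2^{\rm loc}$. Here $h^{(1)}_l$ and $h^{(2)}_l = h^{(1)*}_l$ are spherical Hankel functions, satisfying the Wronskian relation,
$$h^{(1)}_l(y)\,\partial_y h^{(2)}_l(y) - h^{(2)}_l(y)\,\partial_y h^{(1)}_l(y) = \frac{2}{iy^2}. \tag{3.23}$$
Using this relation and orthonormality of the spherical harmonics $Y_{lm}$, we have,
$$0 = \int_{r\le R} dV\,\varphi^*(\omega^2 + L)\varphi - \int_{r\le R} dV\,\left[(\omega^2 + L)\varphi^*\right]\varphi = \int_{r=R} dS\,\varphi^*\overleftrightarrow{\partial_r}\varphi$$
$$= \sum_{lm}\left[|\alpha_{lm}|^2\,h^{(2)}_l\overleftrightarrow{\partial_r}h^{(1)}_l + |\beta_{lm}|^2\,h^{(1)}_l\overleftrightarrow{\partial_r}h^{(2)}_l\right]R^2 = \frac{2i}{\omega}\sum_{lm}\left[|\alpha_{lm}|^2 - |\beta_{lm}|^2\right] \tag{3.24}$$
$$\Rightarrow\ \sum_{lm}|\alpha_{lm}|^2 = \sum_{lm}|\beta_{lm}|^2. \tag{3.25}$$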
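The Wronskian relation (3.23) is what fixes the factor $2i/\omega$ in the flux balance (3.24), and it can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it assumes the standard closed forms for $h_0$ and $h_1$ together with the standard recurrences $h_{n+1} = \frac{2n+1}{y}h_n - h_{n-1}$ and $h_l' = \frac{l}{y}h_l - h_{l+1}$.

```python
import cmath


def hankel_pair(l, y, kind):
    """Return (h_l(y), h_{l+1}(y)) for the spherical Hankel function of kind 1 or 2.

    Illustrative helper (not from the paper): starts from the closed forms of
    h_0 and h_1 and applies the upward recurrence h_{n+1} = (2n+1)/y h_n - h_{n-1}.
    """
    s = 1j if kind == 1 else -1j       # h^(1) ~ e^{+iy}, h^(2) ~ e^{-iy}
    e = cmath.exp(s * y)
    a = -s * e / y                     # h_0
    b = -e * (y + s) / y**2            # h_1
    for n in range(1, l + 1):
        a, b = b, (2 * n + 1) / y * b - a
    return a, b


def hankel_and_derivative(l, y, kind):
    """Return (h_l, dh_l/dy) using h_l' = (l/y) h_l - h_{l+1}."""
    h, h_next = hankel_pair(l, y, kind)
    return h, (l / y) * h - h_next


def wronskian(l, y):
    """Left-hand side of Eq. (3.23): h1 * dh2/dy - h2 * dh1/dy."""
    h1, dh1 = hankel_and_derivative(l, y, 1)
    h2, dh2 = hankel_and_derivative(l, y, 2)
    return h1 * dh2 - h2 * dh1


if __name__ == "__main__":
    for l in range(4):
        y = 2.0
        print(l, wronskian(l, y), 2 / (1j * y**2))
```

Multiplying the Wronskian by $\omega R^2$ (from $\partial_r = \omega\,\partial_y$ and the area element on $r = R$) gives the $\mp 2i/\omega$ coefficients that appear in (3.24).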
In other words, ingoing and outgoing uxes are equal. Then ’ outgoing implies lm = 0 ⇒ lm = 0 ⇒ ’ = 0 outside R. Aronszjan’s elliptic continuation theorem [10] then implies ’ = 0 everywhere on M. Lemma 4. There is a unique solution, F(; k; x), to the equation (L + !2 )F = 0, for which F = (2)−3=2 eik · x + outgoingwaves : (3.26) The map given by
L2 (M) → L2 (R3 ) ;
(3.27)
R ˆ f(x) 7→ f(k) = dkF(; k; x)f(x) ;
(3.28)
is unitary. Proof. An immediate consequence of Lemmas 2 and 3 is that there exists a unique outgoing smooth solution ’ to (3.11). Then Eq. (3.12), relating F to Fout , gives us existence and uniqueness of a smooth solution F of the claimed form. Unitarity is implied by the self-adjointness of L for xed and the fact that R dk is a spectral measure. A detailed proof of unitarity, applicable with essentially trivial changes to our case is given R in Chap. 6 of Wilcox [9]. (For example, Neumann condition,” d3 x[f∇2 g + ∇f · ∇g] = 0, is replaced by Rthe “generalized dV [fe Da (e Da g) + Da (e f)Da (e g)] = 0.) 3 RLemma 5. For almost all the following holds: Let a(k) ∈ L2; n (R ), and let (x) = dkF( = !; k; x)a(k). Then (x) ∈ Hn−3=2− (MD ), all ¿ 0.
Proof. Regard M as a subset of R3 and x ∈ M a vector in R3 . Fourier transform F(; k; x), truncating smoothly at x = R: With the smoothed step function of Eq. (3.1), let gy ∈ L2 (M) be given by gy (x) := (2)3=2 eix · y [1 − (x)] ; with norm
3
kgy kL2 (M) = CR 2 ;
(3.29) (3.30)
some C independent of y. From the fact that F, regarded as a map from L2 (M) to L2 (R3 ) is norm-preserving, the function R ˆ k; y) := dV e− F(; k; x)gy (x) (3.31) F(; has norm in k-space ˆ kF(; · ; y)kL2 (R3 ) = CR3=2 ;
∀; y :
(3.32)
ˆ In order to bound the integral of the Lemma, we will bound a norm of F(!; k; y) in -k space, for any nite interval I = [0 ; 1 ] of . It will be convenient to take 1 = m0 , for some integer m. Writing Q(; k; y) =
ˆ k; y)|2 |!n F(; ; (1 + !2 )n
(3.33)
we have R∞ R R∞ ! R1 d! dQ(!; k; y) = d! d!−1 Q(; k; y) 0
I
!0
0
5
2 ∞R P
d
j=0 0
5
2( j+1)= R 0
d!( j + 1)
2j=0
R2
R∞
0
0
d d!
!1 1 + 0 2
1 −1 ! Q 0
!−1 Q :
(3.34)
In the rst inequality, we have used the fact that Q is periodic in . From this relation, we obtain R
"
R
R
dk dQ(!; k; y) = I
dk +
!¡2=0 0
5C +
dk
!¿2=0
R
R2 dk d
!¿2=0
0
5 C 0 + 1 =
R
!¿2=0 0
#
R
R
dQ
I
!1 1 + 0 2
!−1 Q(; k; y)
R2 dk dQ(; k; y) 0
00 3
5 C +C R ;
(3.35)
where the last inequality follows from Eq. (3.32). Integrating over y, we have, R
ddkdy
ˆ k; y)|2 !2n |F(!; ¡∞ 2 n (1 + ! ) (1 + y2 )3=2+
(3.36)
ˆ ⇒ !n F(!; k; y) ∈ L2 (I ) ⊗ L2; −n (R3 ) ⊗ L−3=2− (R3 ) ⇒ ∇n F(!; k; x) ∈ L2 (I ) ⊗ L2; −n (R3 ) ⊗ H−3=2− (MD ) ⇒ F(!; k; x) ∈ L2 (I ) ⊗ L2; −n (R3 ) ⊗ Hn−3=2− (MD ) :
(3.37)
Thus, for almost all , F(!; k; x) ∈ L2; −n (R3 ) ⊗ Hn−3=2− (MD ) R ⇒ dka(k)F(!; k; x) ∈ Hn−3=2− (MD )
(3.38)
for $a(k) \in L_{2,n}(\mathbb{R}^3)$.

Lemma 6. Let $\Psi_{\rm out}$ be the outgoing field in Minkowski space of a smooth, spatially bounded source, $\rho$. Then data for $\Psi_{\rm out}$ on $\mathcal{I}^+$ is well-defined and smooth.

Proof. The lemma follows from the explicit form of the flat space retarded solution,
$$\Psi_{\rm out}(t = u + r, \vec{x}) = \int d^3y\,\frac{\rho(t = u + r - |\vec{x} - \vec{y}|,\ \vec{y})}{|\vec{x} - \vec{y}|}, \tag{3.39}$$
where $r = |\vec{x}|$. Writing $|\vec{x} - \vec{y}| = r - \hat{x}\cdot\vec{y} + O(|\vec{y}|/|\vec{x}|)$, we have
$$|\rho(t = u + r - |\vec{x} - \vec{y}|,\ \vec{y}) - \rho(t = u + \hat{x}\cdot\vec{y},\ \vec{y})| < K\,\frac{|\vec{y}|}{|\vec{x}|}\,\max|\dot\rho|. \tag{3.40}$$
This bound holds for all values of $r$ along the null ray at constant $u, \theta, \phi$, with $\max|\dot\rho|$ the maximum value of $|\dot\rho|$ on the (compact) intersection of the support of $\rho$ with the past light cones from points of the null ray. Then data on $\mathcal{I}^+$ takes the explicit form,
$$\lim_{r\to\infty} r\,\Psi_{\rm out}(t = u + r, \vec{x}) = \int d^3y\,\rho(t = u + \hat{x}\cdot\vec{y},\ \vec{y}). \tag{3.41}$$
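The step from (3.39) to (3.41) rests on the far-field expansion of $|\vec{x} - \vec{y}|$. The sketch below checks that expansion numerically; it is illustrative only, with hypothetical function names. It uses the elementary bound, an observation added here rather than taken from the text, that the remainder $|\vec{x} - \vec{y}| - (r - \hat{x}\cdot\vec{y})$ lies between $0$ and $|\vec{y}|^2/r$ once $r \ge 2|\vec{y}|$, so it vanishes along the null ray as $r \to \infty$.

```python
import math


def far_field_remainder(x, y):
    """Return |x - y| - (r - xhat . y) for 3-vectors x, y given as tuples.

    Illustrative helper: the remainder of the expansion used before Eq. (3.40).
    """
    r = math.sqrt(sum(c * c for c in x))
    xhat_dot_y = sum(a * b for a, b in zip(x, y)) / r
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return dist - (r - xhat_dot_y)


if __name__ == "__main__":
    y = (1.0, 2.0, 2.0)                      # source point, |y| = 3
    for scale in (1.0, 10.0, 100.0):
        x = (30.0 * scale, 40.0 * scale, 0.0)
        r = 50.0 * scale
        # The remainder shrinks roughly like 1/r and stays below |y|^2 / r.
        print(r, far_field_remainder(x, y), 9.0 / r)
```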
Lemma 7. Let be such that the conclusion to Lemma 5 holds. Let a(k) ∈ L2; n (R3 ), all n ∈ Z, and let R (3.42) (t; x) = dka(k)F(!; k; x)e−i!t : Then is smooth on N, and has data on I− given by Eq. (2.8), ˆ = lim r (v; r r)
r→∞
i R∞ d!!a(−!r)e ˆ i!v : (2)1=2 0
(3.43)
That is, if has for each harmonic the same ingoing part as does a solution 0 in Minkowski space, then has the same data on I− that 0 has. Proof. The smoothness of a(k) implies, by Lemma 5, that the integral of Eq. (3.42) de nes (0; · ) ∈ Hn (MD )∀n. Since a(k)e−i!t ∈ L2 (R3 ), we have (t; · ) ∈ Hn (Mt; D ) (∀n; t), and (t; · ) is smooth on each surface Mt , because it is smooth on Mt; D for all D. To see that is smooth on N, note rst that it is smooth on the throat that joins Mt to its extension Mt+ . This follows from the fact that the location of the cylinder C removed from N is arbitrary: if, instead of removing C from N, one removes a dierent, cylinder C 0 , disjoint from C, one obtains the same , because the eigenfunctions on M from which is constructed are unique by Lemma 3. Thus (t; · ) is smooth on each Mt0 and, in particular, on the throat . Because !a(k) is similarly in L2; n (R3 )∀n, @t is smooth on each Mt and on the throat. Finally, is smooth on N, because N is covered by globally hyperbolic subspacetimes (U; g|U ) which have as Cauchy surfaces U ∩ (Mt ∪ ∪ Mt+ ), for some t; and on (U; g|U ), satis es the smooth hyperbolic equation ∇ ∇ = 0 with smooth initial data. We will rst relate data on I+ to a(k) and then reverse the argument, deducing the data on I− from that on I+ . As in Eq. (3.12), we can write F(!; k; x) = (x)F0 (k; x) + Fout (!; k; x) ;
(3.44)
where F0 (k; x) = (2)−3=2 eik · x . Then = (x) 0− + out ;
(3.45)
where R out = dka(k)Fout (!; k; x)e−i!t ;
R 0− = dka(k)F0 (k; x)e−i!t :
(3.46)
The integral de ning out converges in L2 (MD ) to a smooth function, because the integrals de ning and the Minkowski-space solution 0 so converge. and out are smooth on M because they are smooth on MD for all D. We can use the spherical harmonic basis {Ylm } for L2 (S 2 ) to write, for the exterior region E, 1=2 Pl 2 ∗ ˆ i jl (!r)Ylm (k)Ylm (ˆ x) ; F0 (k; x) = lm P x) ; Fout (; k; x) = lm (; k)h(1) l (!r)Ylm (ˆ
(3.47) (3.48)
lm
with the sums converging in L2 (S 2 ). Then P 1 R d!!2 clm (!)h(1) x)e−i!t ; out = √ l (!r)Ylm (ˆ 2 lm where clm (!) =
√ R 2 d k a(k) lm (!; k) :
(3.49)
(3.50)
We can similarly rewrite the convergent integrals for 0 and in E: P 1 R d!!2 2alm (!)jl (!r)Ylm (ˆ x)e−i!t ; 0− = √ 2 lm P 1 R (2) d!!2 [blm (!)h(1) x)e−i!t ; = √ l (!r) + alm (!)hl (!r)]Ylm (ˆ 2 lm where
(3.51)
(3.52)
R ∗ ˆ (k) alm (!) = il d k a(k)Ylm
(3.53)
blm = alm + clm :
(3.54)
and The construction of out from outgoing waves Fout of Eq. (3.12), satisfying the inhomogeneous equation (3.11) expresses out on E as the retarded solution on Minkowski space, (3.41), to out = , with := 0 . If one writes of Eq. (3.41) as a sum of spherical harmonics and uses Eq. (3.39) to relate clm to lm , data (3.41) on I+ for out becomes P 1 R d!! i−(l+1) clm (!)Ylm (ˆ lim r out (t = u + r;x) ˆ =√ x)e−i!u : r→∞ 2 lm
(3.55)
A nite-energy solution on Minkowski space of the form (3.51) has data on I+ given by P 1 R d!! i−(l+1) alm (!)Ylm (ˆ ˆ =√ lim r 0− (u + r;x) x)e−i!u : 2 lm
r→∞
(3.56)
The rst part of the proof of Lemma 3 implies for the solution to L = 0 on E, eigenfunctions of L satisfy, for each !, P P |alm (!)|2 = |blm (!)|2 : (3.57) lm
lm
Thus the data induced by on I satis es the constraints imposed on a(k): b(k) ∈ L2; n (R3 ), all n ∈ Z. As a result, the argument just given, with I− and I+ reversed, implies (3.58) = 0+ + in ; +
with 0+ and in smooth, and given on E by expressions P 1 R d!!2 2blm (!)jl (!r)Ylm (ˆ x)e−i!t ; 0+ = √ 2 lm −1 R d!!2 clm (!)h(2) x)e−i!t ; in = √ l (!r)Ylm (ˆ 2
(3.59) (3.60) (3.61)
converging in L2 (MD ) for each D; and data on I− is well-de ned, with P 1 R lim r (v − r; rˆ d!! i(l+1) (blm + clm )Ylm (ˆ x) = √ x)e−i!v ; 2 lm
r→∞
(3.62)
whence, using Eq. (3.54), we recover (3.43). Lemma 8. Let be a smooth retarded solution on Minkowski space to the massless scalar wave equation whose source has support within a compact spatial region r ¡ R, and suppose that has zero data on I− and data f on I+ with nite energy norm, kfkH1 (I+ ) . Then has nite energy norm k kH1 (H) on a spacelike hyperplane H of Minkowski space. Proof. The proof will follow from conservation of energy, but, because the energy density of a massless scalar eld does not include a term proportional to 2 , we will need to consider both the energy of and of a time integral of to bound k k1 on a spacelike hyperplane. Let H be the surface t = 0. We need only consider the retarded eld of a source that vanishes for t ¿ 0, because the eld on t = 0 depends only on values of for t ¡ 0. Let t be the timelike Killing vector @t , and let J = −T t , with T the energy-momentum tensor of . Let E(u0 ) be the energy of on the part H(u0 ) of H with r ¡ |u0 |: R dS J ; (3.63) E(u0 ) = H(u0 )
with dS along the normal ∇ t. Let Iu0 ; v0 be the part of the past null cone v = v0 lying to the future of t = (u0 + v0 )=2; and let Ju0 ; v0 be the part of the future null cone u = u0 between t = 0 and t = (u0 + v0 )=2. Then H(u0 ), Iu0 ; v0 , and Ju0 ; v0 form the boundary of a source-free region of Minkowski space. Denote by FI (u0 ; v0 ) and FJ (u0 ; v0 ) the ux through Iu0 ; v0 and Ju0 ; v0 , respectively, R R dS J ; FJ = dS J ; (3.64) FI = Iu0 ; v0
Ju0 ; v0
choosing dS along the normals ∇ u and ∇ v. Then ∇ J = 0 implies E(u0 ) + FJ (u0 ; v0 ) = FI (u0 ; v0 ) :
(3.65)
A massless scalar eld, satis es the dominant energy condition: the vector J is future-directed non-spacelike. Consequently E(u0 ); FI (u0 ; v0 ), and FJ (u0 ; v0 ) are all positive, and we have (3.66) E(u0 ) 5 FI (u0 ; v0 ) : The retarded eld of a smooth, spatially compact source has asymptotic behavior (u; x) = f(u;x)=r ˆ + O(r −2 ) ; ∇ v∇ h @
= @u f(u;x)=r ˆ + O(r −2 ) ; = O(r −2 ) ;
(3.67)
where h = + t t is the projection orthogonal to t . Consequently, lim FI (u0 ; v0 ) = FI+ (u0 ) ;
(3.68)
v0 →∞
where FI+ (u0 ) is the ux through the part of I+ (u0 ) lying to the future of u = u0 : FI+ (U ) =
R0 −u0
R du d |@u f(u;x)| ˆ 2:
(3.69)
Since E(u0 ) is independent of v0 , we have for all u0 , E(u0 ) ¡ FI+ (u0 ) . Hence the scalar eld has nite energy on H: E= In order to bound
lim E(u0 ) 5
lim FI+ (u0 ) ¡ kfkH1 (I+ ) :
u0 →−∞
R
H
dV
2
u0 →−∞
(3.70)
, we essentially repeat the argument for Ru ˜ = du0 (u0 ; x) :
(3.71)
0
Equation (3.67) implies that ˜ has data f˜ on I+ , where Ru f˜ = du0 f(u;x) ˆ :
(3.72)
0
Bounding the energy E˜ of E˜ =
R H
=
R
H
dS J˜ = dV
1 2
2
:
on H bounds the L2 norm of
, because
R
1 dV [(@u ˜ )2 + (@u ˜ − @r ˜ )2 + r −2 (@ ˜ )2 + (r sin )−2 (@’ ˜ )2 ] 2 H (3.73)
Because vanishes for u = 0, ˜ satis es the scalar wave equation with source : ˜ Using 1 1 2 2 2 @’ ; (3.74) = −2@u @r + @r + 2 @ + r sin2
and Ru @u ˜ (u; x) = (u; x) = du0 @u0 (u0 ; x) ;
(3.75)
0
we have Ru ˜ = du0
(u0 ; x) = (u; ˜ x) :
(3.76)
0
Then Eqs. (3.64, 3.65, 3.66) hold for the energy uxes F˜I (u0 ; v0 ), and F˜J (u0 ; v0 ), and, in particular, ˜ 0 ) 5 F˜I (u0 ; v0 ) : E(u
(3.77)
However, ˜ is not the retarded solution to the scalar wave equation with source , ˜ so we must check that the asymptotic conditions (3.67) on hold for ˜ . This is easy: A function O(r −n ) has, for r greater than some r0 , the form g(u; x)=r n where, for xed u; r, ˆ g is a bounded function of r. In our case, is smooth, and the corresponding functions g are R u smooth and bounded in a compact domain [0; u] for u (and hence for u; r). ˆ Thus 0 r −n g(u0 )du0 = O(r −n ), and it follows that ˜ satis es (3.67). The analogues of Eqs. (3.69) and (3.70), R0 R R0 R ˜ x)| ˆ 2 = du d |f(u;x)| ˆ 2 ¡∞ ; F˜I+ (u0 ) = du d |@uf(u; u0
(3.78)
u0
and E˜ =
˜ 0) 5 lim E(u
u0 →−∞
lim F˜I+ (u0 ) ¡ kfkL2 (I+ ) ;
u0 →−∞
(3.79)
together with Eq. (3.73), imply k kL2 (H) ¡ ∞. Finally, Eq. (3.70) and the bound on k kL2 (H) implies k kH1 (H) ¡ ∞. Corollary. Under the assumptions of Lemma 7; out of Eq. (3.46) is asymptotically regular at spatial in nity. Proof. As in Lemma 7, out is on E the retarded solution of Eq. (3.39) with smooth spatially bounded source . Then Lemma 8 implies out is regular at spatial in nity. Proof of Proposition 1. The proof is essentially immediate from the lemmas. By Lemmas 6 and 7, with = 2 Re , for almost all ; is a smooth solution on N to the scalar wave equation, with data f on I− . Finally, on E, = 0 + out ; where 0 is asymptotically regular at spatial in nity because it is a solution to the
at-space wave equation with data having nite energy-norm on I− and out is regular by the Corollary to Lemma 8.
B. Restricted uniqueness theorem for a massless scalar eld.Because the system is linear, uniqueness means that the only solution to Eq. (2.29) with zero data at I− is = 0. This is not true in the geometrical optics limit, because closed null geodesics c() can loop through the wormhole and never reach I; and one might worry that there are analogous smooth solutions that vanish in past and future and have zero data on I− . The following restricted uniqueness theorem rules them out: smoothed versions of a looping zero-rest mass particle spread and reach I. Denote by EKt an energy norm of the eld on a compact submanifold Kt ⊂ Mt , where manifolds in the family {Kt } are related by time-translation along the Killing trajectories: R (3.80) EKt = dV e− [|∇|2 + ||2 ] = kkH1 (Kt ) : Kt
Proposition 2. If (t; x) is a smooth solution to ∇ ∇ = 0; having nite energy and zero initial data on I− ; and if limt→±∞ EKt = 0 for any family of compact (time-translation related) Kt ; then = 0. Proof. The result is a corollary of Rellich’s uniqueness theorem for each mode. The requirement that the energy norm on any compact K vanishes as t → ∞ allows one to transform the solution as follows. Denote by HT a smoothed step function, 1; |t| 5 T (3.81) HT (t) = 0; |t| = T + ; and write the Fourier transform of in the form R ˆ (!; x) = lim (2)−1=2 dtHT (t)e−i!t (t; x) : T →∞
We have 0= =
R R
(3.82)
dtHT (t)e−i!t ∇ ∇ (t; x) e− dte−i!t [HT (!2 + L) + 2i!@t HT − @2t HT ](t) ;
whence R R dte−i!t [HT (!2 + L)(t) 2 5 dt(|2!@t HT (t)| + |@2t HT (t)|) 2 :
(3.83) (3.84)
@2t HT
are bounded functions of t with compact support, we can Because @t HT and write R 2 R R dV dtHT e−i!t (!2 + L)(t; x) 5 C max dV |(t; x)|2 ; (3.85) |t|∈[T; T +] K
K
or
R
dtHT e−i!t (!2 + L)(t; x) 2
L2 (Kt )
5 C max kk2L2 (Kt ) ; t∈[T; T +]
(3.86)
where C is a constant independent of T . From our assumption that k|k2L2 (Kt ) → 0 as t → ∞, we have (3.87) lim max kk2L2 (Kt ) = 0 T →∞ t∈[T; T +]
implying
2
R lim dtHT e−i!t (!2 + L)(t; x) L (K ) = 0 : (3.88) 2 t T →∞ R ˆ It follows that for all !, dte−i!t (!2 + L)(t; x) = (!2 + L)(!; x) vanishes for almost all x. Finally, since nite energy and zero initial data imply purely outgoing at spatial in nity, by Rellich’s theorem must vanish as well.
C. Other massless elds.Extending these results to Weyl and Maxwell elds appears straightforward. The statement of the existence theorem is identical, with the scalar eld replaced by a Weyl spinor A and (say) a vector potential for a free Maxwell eld F in a Lorentz gauge with A t = 0. The statement of uniqueness for a Weyl eld is identical to that for a scalar eld; for a Maxwell eld it must be modi ed to exclude the time independent solutions that have nonzero ux threading the handle: If F is a smooth solution to
with
∇ F = 0;
∇[ F ] = 0
R
R∗
F dS = 0 =
S
(3.89)
F dS ;
S
having nite energy and zero initial data on I− , and if limt→±∞ EKt = 0 for any family of compact (time-translation related) Kt , then F = 0. For a static spacetime, the proofs appear to require only minor changes. Recall rst that the wave equations for Weyl and Maxwell elds are conformally invariant [6]: If A and A satisfy the Maxwell and Weyl Equations (2.6) and (2.7), respectively, for the metric g , the elds A and e A satisfy the same equations for the conformally related metric e−2 g . As a result, it suces to look at solutions to the hyperstatic metric of the form, g = −@ t@ t + h :
(3.90)
That is, a static metric of the form, (2.2), is related to a hyperstatic metric of the form (3:90) by a smooth conformal factor e2 ; because e2 has the value 1 in a neighborhood of I, the conformally related solutions have the same data on I− .2 As in the scalar wave case, we can decompose the wave operator for vector and spinor elds in the manner (3.91) ∇ ∇ = −$2t + L ; with L = Da D a , as in Eq. 2.48. Then the harmonic components ; A satisfy (!2 + L) = 0;
(!2 + L) A = 0 :
(3.92)
The operator L is symmetric on L2 (M), de ned for vector and spinor elds on R 0 M by the norm h|i = M dV e− ||2 , with ||2 ≡ a and ||2 ≡ tAA0 A A for vectors and spinors, respectively. The de nition of the Sobolev spaces of Sect. II E are similarly extended automatically to vectors and spinors by contracting vector indices with hab and spinor indices with tAA0 . (In a spinor frame associated with an orthonormal frame for which t is the timelike frame vector, tAA0 has components tII 0 = II 0 . Sobolev spaces of Sect. II E are similarly extended to vector and spinor elds. Finally, the space H2 , is de ned for vectors and spinors as the set of elds in H2 (M) satisfying the boundary conditions (2.40,2.41).) Here is a sketch of the proofs. The existence theorem involves the same set of lemmas. Lemma 1, self-adjointness of L on the spaces H2 of elds in H2 (Mt ) again follows from the symmetry of L , re ecting the fact that boundary conditions imply smoothness of the elds ˆ A and ˆ on Mˆ . 2 We thank the referee for pointing out this way to extend our argument to static from hyperstatic metrics.
Lemma 2 and its proof can be repeated as written with and regarded as vectors or spinors. In the proof of Lemma 3, each of the components of ( A ) with respect to covariantly constant frames (spinor frames) on the exterior region E are scalars satisfying Eqs. (3.22), (3.24), and (3.25). They therefore vanish on E and, by elliptic continuation, vanish on all of M. In Lemma 4, eigenfunctions for vector and spinor elds are de ned by F A = (2)3=2 A eik · x + ’ A ;
(3.93a)
F = (2)3=2 eik · x + ’ ;
(3.93b)
where and A are covariantly constant on E and satisfy k = 0 = kAA0 A . The rst part of Lemma 4, existence and uniqueness of eigenfunctions F ; F A , is again immediate from Lemmas 2 and 3. The second part, unitarity, again appears to be a straightforward extension of Chap. 6 in Wilcox [9], but here there are details we have not checked. The proof of Lemma 5 goes through as written if a in Lemma 5 is interpreted as a vector (spinor) with a covariant index and their product is read as a dot product: R e.g., f(x) = dkF ( = !; k; x)a (k). In Lemma 6, the equations are correct as written for the components of and A , but one must use, in addition, the fact that the elds satisfy the peeling theorem for at space [6] to complete the characterization of their behavior. In Lemma 7, the proof of smoothness can be read as written. The proof of regularity at I− and the recovery of initial data relies on a spherical harmonic decomposition that can be modi ed in a standard way for spinors and vectors. Finally, Lemma 8 and the proof of Propositions 1 and 2, can be read as written, with the change in the statement of Proposition 2 given above.
IV. The Cauchy Problem for More General Spacetimes

The work reported above shows the existence of an unexpected class of spacetimes for which an existence theorem and at least a partial uniqueness theorem can be proved. How broad is the class of spacetimes for which a generalized Cauchy problem is well-defined? Examples of spacetimes with CTCs for which one can prove existence and uniqueness for linear wave equations are not difficult to find, if one allows singularities and does not require that solutions have finite energy [13], and we will display some examples below. Finding examples of nonsingular geometries with CTCs and a well-defined Cauchy problem is more difficult, but the earlier work by Morris et al. [2] is persuasive: Their time-tunnel examples are asymptotically flat spacetimes in which CTCs are confined to a compact region and for which there appears to be a well-defined initial value problem for data on a spacelike hypersurface to the past of the nonchronal region, the set of points through which CTCs pass. In the present section we present a uniqueness result complementary to that of Sect. III B, a conjecture on existence and uniqueness, and examples of spacetimes that do or do not have a well-posed initial value problem.
A. A result on uniqueness for spacetimes with a compact nonchronal region. In Sect. III B we ruled out, for the static spacetimes considered, a lack of uniqueness corresponding to a field forever trapped inside the nonchronal region, a smooth analog of a closed null geodesic. Here we show, for spacetimes with a compact nonchronal region, $A$, that when solutions to the Cauchy problem exist, they are unique outside $A$. Because data is now given on a spacelike surface (instead of $\mathcal{I}^-$), we need no longer restrict consideration to massless wave equations. Initial data on a spacelike hypersurface $M$ for a solution $\Phi$ to the scalar wave equation,
$$(-\nabla_\alpha\nabla^\alpha + m^2)\Phi = 0, \tag{4.1}$$
will mean the pair of functions
$$\phi = \Phi|_M, \qquad \dot\phi = n^\alpha\nabla_\alpha\Phi|_M, \tag{4.2}$$
where $n^\alpha$ is a unit normal to $M$.

Proposition 3. Let $N, g$ be a smooth, asymptotically flat spacetime with regions $F$ and $P$ to the future and past of a compact 4-dimensional submanifold $A$ defined by $F = N\setminus J^-(A)$ and $P = N\setminus J^+(A)$, where both $F$ and $P$ are globally hyperbolic and foliated by complete spacelike 3-manifolds. Suppose for arbitrary smooth data with finite energy and a Cauchy surface $M_P$ of $P$ that the scalar wave equation has a solution on $N$ with finite energy on $F$; and suppose that for arbitrary smooth data with finite energy and a Cauchy surface $M_F$ of $F$ that the scalar wave equation has a solution on $N$ with finite energy on $P$. Then, solutions on $N$ with finite energy in $N\setminus A$ are unique in $N\setminus A$.

Proof. The proof relies on the nondegeneracy of the symplectic form,
$$\omega_M(f, g) := \int_M dS_\alpha\,(f\nabla^\alpha g - g\nabla^\alpha f), \tag{4.3}$$
and the fact that $\omega$ is independent of hypersurface. That is, let $B \subset N$ be a slab, a 4-dimensional submanifold of $N$ bounded by two submanifolds $M$ and $M'$ in the foliations of $P$ and $F$ that coincide outside of a compact region. Then, for any two solutions with finite energy,
$$0 = \int_B d^4V\,f(\nabla_\alpha\nabla^\alpha - m^2)g = \int_{\partial B} dS_\alpha\,(f\nabla^\alpha g - g\nabla^\alpha f) = \omega_M(f, g) - \omega_{M'}(f, g). \tag{4.4}$$
Suppose the theorem is false. Then there is a solution $\Phi$ to Eq. (4.1) with zero initial data on $M_P$ (say) and with $\Phi$ nonzero and with finite energy somewhere on $N\setminus A$. Thus a hypersurface $M$ in the foliation of $F$ has nonzero data for $\Phi$, and we can deform $M$ outside the support of $\Phi$ (in the intersection of $F$ and $P$) to coincide with $M_P$. Because $\omega$ is non-degenerate, there is data $(\psi, \dot\psi)$ on $M$ such that
$$\omega_M(\Phi, \Psi) \neq 0. \tag{4.5}$$
But, by hypothesis, a solution $\Psi$ exists on $N$, corresponding to the initial data $(\psi, \dot\psi)$ on $M$; and the fact that $\omega$ is independent of hypersurface implies $\omega_{M_P}(\Phi, \Psi) \neq 0$, contradicting the assumption that $\Phi$ has vanishing initial data on $M_P$.
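The hypersurface independence of the symplectic form used in this proof follows from a short identity. As a sketch in the notation of Eqs. (4.1) and (4.3), where the current $j^\alpha$ is a label introduced here for convenience and $f$, $g$ are any two solutions:

```latex
\begin{align*}
j^\alpha &:= f\,\nabla^\alpha g - g\,\nabla^\alpha f, \\
\nabla_\alpha j^\alpha
  &= \nabla_\alpha f\,\nabla^\alpha g + f\,\nabla_\alpha\nabla^\alpha g
   - \nabla_\alpha g\,\nabla^\alpha f - g\,\nabla_\alpha\nabla^\alpha f \\
  &= f\,(m^2 g) - g\,(m^2 f) = 0 .
\end{align*}
```

Gauss's theorem applied to $j^\alpha$ on the slab $B$ then gives Eq. (4.4): because $M$ and $M'$ coincide outside a compact region, only the two bounding hypersurfaces contribute to the flux through $\partial B$.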
Fig. 3. Misner space is the region between the two null rays $\ell$ and $B\ell$, with points of the null boundaries identified by the boost $B$. The curve $C = CC'$ is a chronology horizon, a closed null geodesic that separates the nonchronal region above it from the globally hyperbolic spacetime to its past. Equivalently, Misner space is the quotient of the half of Minkowski space lying to the left of the null line $L$ by the group of boosts generated by $B$.

Corollary. Proposition 3 holds for the Maxwell, Dirac, and Weyl fields.

Proof. Each of the three fields has a conserved symplectic product $\omega$. The proof goes through as stated, with the symplectic product and initial data of a scalar field replaced by that of the Maxwell and Dirac fields and with the energy norm for a scalar field replaced by vector and spinor energy norms.

Because the billiard-ball examples considered by Echeverria et al. [3, 11] have a multiplicity of solutions for the same initial data, uniqueness in spacetimes with CTCs is likely to hold only for free or weakly interacting fields. Because solutions seem always to exist for the billiard-ball examples in the spacetimes they considered, it may be that classical interacting fields have solutions on spacetimes for which solutions to the free field equations exist.

B. A conjecture. The solution to the problem posed at the beginning of this section – to delineate the class of spacetimes for which a generalized Cauchy problem is well-defined – is, of course, not known. We present a conjecture here, motivated by examples of geometries which appear to have a well-defined Cauchy problem and by examples of others for which either existence or uniqueness fails. We will motivate the conjecture with a brief reminder of some examples that are by now well known; a more detailed discussion of additional spacetimes will be given in Sect. IV C. A helpful 2-dimensional example of a spacetime where the Cauchy problem is not well defined is Misner space. This is the quotient of the half of Minkowski
space on one side of a null line $L$ by the subgroup $\{1, B^{\pm 1}, B^{\pm 2}, \ldots\}$ generated by a boost $B$ about a point of the null line. Equivalently, if $\ell$ is a null line parallel to $L$, and $B\ell$ is its boosted image, then Misner space is the strip between $\ell$ and $B\ell$, where boundary points related by $B$ are identified (see Fig. 3). Misner space has a single closed null geodesic, $C = CC'$ in the figure, and the past $P$ of $C$ is globally hyperbolic. The future of $C$ is nonchronal, so $C$ is a chronology horizon, a Cauchy horizon that bounds the nonchronal region. Initial data for the scalar-wave equation can be posed on a Cauchy surface $M$ of $P$, but solutions have divergent energy on the chronology horizon. The reason solutions diverge is clear in the geometrical optics limit. A light ray $\gamma$, starting from $M$, loops about the space and is boosted each time it loops. Because $\gamma$ loops an infinite number of times before reaching $C$, its frequency and energy diverge as it approaches the horizon. The ray $\gamma$ is an incomplete geodesic: it reaches the horizon in finite affine parameter length, because each boost decreases the affine parameter by the blueshift factor $[(1 + V)/(1 - V)]^{1/2}$, with $V$ the velocity of the boost.³ This behavior is not unique to Misner space: A theorem due to Tipler [12] shows that geodesic incompleteness is generic in spacetimes like Misner space in which CTCs are "created" – spacetimes with a nonchronal region to the future of a spacelike hypersurface. In more than two dimensions, however, the existence of incomplete null geodesics like $\gamma$ does not always imply that the chronology horizon is unstable. This is because there may be only a set of measure zero of such geodesics so that the energy may remain finite on the chronology horizon. For the time-tunnel spacetimes considered in Refs. [2, 3], a congruence of null rays initially parallel to $\gamma$ spreads as the rays approach the chronology horizon.
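The geometrical-optics argument for Misner space is a pair of geometric series: the energy of the ray grows by the blueshift factor on every loop, while the affine length of each successive loop shrinks by the same factor, so the total affine length to the horizon is finite. A minimal sketch follows, assuming units with $c = 1$; the function names are hypothetical and not taken from the paper.

```python
import math


def blueshift_factor(V):
    """Blueshift factor [(1 + V) / (1 - V)]**(1/2) for boost velocity 0 <= V < 1."""
    if not 0.0 <= V < 1.0:
        raise ValueError("boost velocity must satisfy 0 <= V < 1 (units with c = 1)")
    return math.sqrt((1.0 + V) / (1.0 - V))


def energy_after_loops(E0, V, n):
    """Energy of the ray after n loops: each loop multiplies it by the blueshift factor."""
    return E0 * blueshift_factor(V) ** n


def total_affine_length(first_segment, V):
    """Sum of affine lengths s, s/b, s/b**2, ... over infinitely many loops."""
    b = blueshift_factor(V)
    if b == 1.0:
        return math.inf        # no boost: the segments never shrink
    return first_segment / (1.0 - 1.0 / b)


if __name__ == "__main__":
    V = 0.6
    print("blueshift factor:", blueshift_factor(V))
    print("energy after 10 loops:", energy_after_loops(1.0, V, 10))
    print("total affine length:", total_affine_length(1.0, V))
```

The energy diverges as the number of loops grows while the affine length converges, which is the sense in which the ray is an incomplete geodesic.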
When the spreading of the rays overcomes the successive boosts (when the fractional decrease in flux is greater than the fractional increase in squared frequency), the horizon is stable in the geometrical optics approximation. We suggest that this is the generic situation – that, roughly speaking, when a spacetime is stable in the geometric optics limit, its Cauchy problem is well defined.

³ That is, trajectories of a (locally-defined) timelike Killing vector cross the null geodesic at a sequence of points. The Killing vector can be used to compare the affine parameter at successive crossing points by time-translating a segment of the geodesic to successively later segments. Compared in this way, the affine parameter of a given segment will be less than that of the next segment by the blueshift factor $[(1 + V)/(1 - V)]^{1/2}$.

To obtain a more precise statement of the conjecture, we proceed as follows. Hawking [17] has a related definition of the optical stability of a Cauchy horizon, involving the divergence and boost of a closed null geodesic generator, but it is not directly applicable to the more general situation we consider. Let $N, g$ be an asymptotically flat spacetime, let $U, g|_U$ be a globally hyperbolic subspacetime, and let $V \subset U$ be compact. Then $U, g|_U$ can be smoothly foliated by a family of spacelike hypersurfaces. Let $S_t$ be the restriction to $V$ of such a foliation of $U$. Let $c(x, \lambda)$ be a smooth null geodesic congruence, transverse at $\lambda = 0$ to a spacelike surface $\Sigma \subset N$, with $\lambda$ an affine parameter and $c(x, 0) = x \in \Sigma$. Call $c$ regular if it is a subcongruence of a similarly defined smooth null congruence $\hat{c}(x, \lambda)$ transverse to $\Sigma$ at $\lambda = 0$, where the boundary of the support of $\hat{c}$ lies entirely outside the support of $c$ on $\Sigma$. Let $k^a(x, \lambda)$ be the vector field tangent to $c$. For fixed $x_0$, the curve $c(x_0, \lambda)$ intersects each surface $S_t$ in $V$ at most countably many times, at values $\{\lambda_n(x_0)\}$ of the affine parameter $\lambda$. Let $W_n$ be the $n$th intersection of the congruence $c$ with $S_t$:
$$W_n = \bigcup_x c(x, \lambda_n(x)). \tag{4.6}$$
To define an energy carried by the null congruence, we introduce a density $\rho(x, \lambda)$ by setting $\rho(x, 0) = \rho_0$ and extending to other values of $\lambda$ by the conservation equation,
$$\nabla_a(\rho k^a) = 0. \tag{4.7}$$
The energy carried across $S_t$ by $c$, summed over the crossings, is then
$$E = \sum_n \int_{W_n} \rho\,k^a k^b n_a n_b\,dS. \tag{4.8}$$
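The competition between spreading and blueshift can be illustrated with a toy version of the sum over crossings in Eq. (4.8). The sketch below rests on an assumption made here, not in the paper: on every loop the squared frequency grows by a constant factor `blueshift**2` and the flux is diluted by a constant factor `flux_ratio < 1`, so the energy per crossing forms a geometric sequence. The function and parameter names are hypothetical.

```python
import math


def total_crossing_energy(E0, blueshift, flux_ratio, n_crossings=None):
    """Toy model of the sum in Eq. (4.8) with E_n = E0 * (blueshift**2 * flux_ratio)**n.

    blueshift   -- frequency gain per loop (assumed constant)
    flux_ratio  -- fraction of the flux surviving each loop (assumed constant)
    n_crossings -- number of crossings to sum; None means the infinite sum
    """
    q = blueshift ** 2 * flux_ratio          # energy ratio between successive crossings
    if n_crossings is None:
        # Geometric series: finite only when spreading beats the boost (q < 1).
        return E0 / (1.0 - q) if q < 1.0 else math.inf
    return sum(E0 * q ** n for n in range(n_crossings))


if __name__ == "__main__":
    # Strong spreading: the total energy is finite (optically stable in this toy model).
    print(total_crossing_energy(1.0, blueshift=2.0, flux_ratio=0.125))
    # Weak spreading: the sum diverges (optically unstable in this toy model).
    print(total_crossing_energy(1.0, blueshift=2.0, flux_ratio=0.5))
```

In this model the total is finite exactly when `blueshift**2 * flux_ratio < 1`, which mirrors the criterion that the fractional decrease in flux exceed the fractional increase in squared frequency.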
Call a neighborhood V optically stable if, for each St and each regular congruence c, E is nite. Call the spacetime N; g optically stable if for each globally hyperbolic subspacetime U; g|U , every open V with V ⊂ U is optically stable. The existence part of our conjecture can be stated as follows. Existence Conjecture. Let N; g be a smooth, asymptotically at spacetime for which past and future regions P = N\J + (A) and F = N\J − (A) of a compact 4dimensional submanifold A are globally hyperbolic. If N; g is optically stable, the Cauchy problem for massless wave equations (for scalar, Maxwell, and Weyl elds) is well-de ned. A conjecture relating uniqueness for massless elds to uniqueness in a geometicoptics sense is easier to formulate. Uniqueness Conjecture. Again let N; g be a smooth, asymptotically at spacetime for which past and future regions P = N\J + (A) and F = N\J − (A) of a compact 4-dimensional submanifold A are globally hyperbolic. Let S± be Cauchy surfaces for NJ ± (A). If all but a set of measure zero of null geodesics intersect S+ and S− , then solutions to massless wave equations on N are unique for initial data on S− (and for initial data on S+ ). Because the instability of massive elds also corresponds to an instability of particles moving along trajectories that become null as one approaches the chronology horizon, it may be that massive wave equations also have a well-de ned Cauchy problem for the same spacetimes. C. Examples and counterexamples. For no asymptotically at spacetime in 4-dimensions, in which CTCs are con ned to a compact region, are we aware of a rigorous demonstration that nite-energy solutions to the scalar wave equation do exist for arbitrary initial data, or that solutions are unique. 
There may also be no published counterexamples, that is, 4-dimensional asymptotically flat spacetimes with a compact nonchronal region (more precisely, spacetimes satisfying condition (i) of the conjecture) for which the nonexistence or nonuniqueness of solutions to free field equations is proven; but counterexamples like this are not difficult to find. We present below two examples of asymptotically flat spacetimes which are globally hyperbolic to the past and future of a compact region; one can prove for one of them that no solution exists for generic data and for the other that solutions are not unique. The examples show that, without some requirement akin to the optical stability of our conjecture, there can be too many closed geodesics to allow
Fig. 4. A piece of Misner space is used in the construction of a 4-dimensional spacetime $N$ with a compact nonchronal region and an unstable chronology horizon. The lines $pq$ and $p'q'$ lie in the covering space, the half of Minkowski space lying to the left of $v = 0$, and they are identified by the boost that defines Misner space. In $N$, these lines can be regarded as lying along the trajectory of the wormhole mouth
a well-defined initial value problem. It might also be straightforward to show, for the time-tunnel spacetimes of Refs. [1–3], that whenever the chronology horizon is optically unstable, it is genuinely unstable for fields with smooth initial data.

The first example is a time-tunnel spacetime like those of Morris et al. but with a metric that is everywhere smooth and is chosen to induce a flat 2-metric on the part of the identified spheres that face each other. The spacetime is also flat between the flat pieces of the spheres, so the geometry includes a region isometric to a piece of (Misner space) $\times E^2$, where $E^2$ is flat Euclidean 2-space. For definiteness, we shall identify it with the following piece of Minkowski space. Regard Misner space as the strip of 2-dimensional Minkowski space between the parallel null lines, v = −A and v = −A(V ), with boundary identified by a boost with velocity $V$ as above; and take a finite section $S$ of that strip bounded by the bottom half of the hyperbola $uv = A^2$ and by the left half of the hyperbola $uv = -A^2$, as in Fig. 4 (here $u = t - z$, $v = t + z$ are the usual null coordinates). Finally, take as the piece of 4-dimensional Minkowski space $S \times D$, where $D$ is the disk $x^2 + y^2 < B^2$, and $B > A$. Spacelike sections of $S \times D$ prior to the horizon, $C \times D$, are cylinders with circular cross section in $E^2$. In the full spacetime, $N$, the surface $C \times D$ is again a part of the chronology horizon.

The key to showing that the chronology horizon of $N$ is unstable is to note that the past of points on $C \times D$ with $x = y = 0$ lies entirely in $C \times D$. This is most easily seen using the universal covering space of $S \times D$. Misner space has as its universal covering space the half of 2-dimensional Minkowski space to the left of $v = 0$. The corresponding cover $\tilde{S} \times D$ of $S \times D$ is the part of 4-dimensional Minkowski space bounded by the 3-surfaces v = −A; v = −A; uv = A, lower branch; uv = −A,
left branch; and $x^2 + y^2 < B^2$. The past light cone in $\tilde{S} \times D$ of a point on $\tilde{C} \times \{(0, 0)\}$ has maximum value of $x^2 + y^2$ where it meets the boundary uv = A, and calculation shows that the intersection is a surface with $x^2 + y^2 < A^2$. Since, by construction, $A^2 < B^2$, the past light cone of a point of $\tilde{C} \times \{(0, 0)\}$ never intersects the boundary $x^2 + y^2 = B^2$. Thus every point $\tilde{P} \in \tilde{S} \times D$ to the past of $\tilde{C} \times \{(0, 0)\}$ is in the domain of dependence of the spacelike boundary uv = A of $\tilde{S} \times D$; and every point $P \in S \times D$ to the past of $C \times \{(0, 0)\}$ is then in the domain of dependence of the boundary uv = A of $S \times D$. Data on uv = A that is independent of $x$ and $y$ yields a solution that diverges in $S \times D$ because the solution is identical in the domain of dependence of uv = A to the divergent solution in Misner space. Finally, by picking a spacelike hypersurface of the full spacetime that agrees with the uv = A surface in $S \times D$, we obtain a spacelike hypersurface to the past of the chronology horizon on which, for generic smooth initial data with finite energy, no finite-energy solution to the scalar wave equation exists.

The example of a geometry with compact nonchronal region for which uniqueness fails depends on a construction suggested by Geroch [18]. Although 2-dimensional geometries obtained by removing slits and identifying sides are singular [19], it is possible in 4 dimensions to build smooth geometries in a similar way. The construction relies on the following observation.

Lemma 9. Any smooth compact 4-dimensional spacetime with boundary can be embedded (i) in a smooth compact spacetime without boundary; and (ii) in a smooth spacetime that is isometric to Minkowski space outside a compact region.

Proof. The technique is borrowed from references [16, 20] (see also [21]). (i): Any manifold $M$ with boundary can be embedded in a compact manifold $\tilde{M}$ by attaching a second copy $M'$ of $M$ to the outward side of $\partial M$. Let $U$ be a collar of $\partial M$, a neighborhood $U \cong \partial M \times I$ with boundary $\partial M' \sqcup \partial M$.
A Lorentzian metric $g$ on $M$ can be extended to a Lorentzian metric $\tilde{g}$ on $\tilde{M}$ precisely when a direction field $\hat{t}$ of timelike directions on $U$ can be extended to a timelike direction field on $\tilde{M}$. Now $\hat{t}$ can always be extended to a Morse direction field on $\tilde{M}$, a direction field that has isolated zeroes, at each of which the line-element field is tangent to a vector field with index $\pm 1$. By cutting out a ball $B^4$ containing each zero and gluing in a copy of $\mathbb{RP}^4 \setminus B^4$ for each zero of index 1 and a copy of $\mathbb{CP}^2 \setminus B^4$ for each zero of index $-1$, we can extend the line element to a nonvanishing field on the interior. (ii): The proof here is nearly identical. First embed $M$ in $\tilde{M}$ as in (i). If one removes a ball $B^4$ from $M'$ then one can put a flat metric on a neighborhood $V$ of the spherical boundary $\partial B^4$ that makes $V$ into a copy of Minkowski space outside a ball. One is again asking to extend a direction field on a new boundary, $U \sqcup V$, to the 4-manifold that it bounds, and the construction proceeds as in (i). Finally, any Lorentzian metric $g$ on the compact manifold-with-boundary $M'$ ($M' \setminus B^4$) that is smooth on $U$ ($U \sqcup V$) can be deformed to a smooth metric that agrees with $g$ on a neighborhood of the boundary of $U$ and $V$.

Using this construction, we exhibit a smooth, asymptotically flat spacetime with compact nonchronal region, for which solutions with finite energy do not exist for generic initial data. Begin with a 4-torus $T^4$ with a flat metric chosen to make two of the generators null and the other two spacelike (see Fig. 5). Explicitly, identify
Fig. 5. An example of an asymptotically flat spacetime with compact nonchronal region is depicted here with two dimensions suppressed. Balls are removed from $N$ and the torus, and their boundary 3-spheres I and II are identified. Arrows at $P$ point along null generators of the torus. The shaded region is the support of a solution to the massless scalar wave equation
by translation opposite faces of the rectangular 4-cell in Minkowski space, $0 \le u \le A$, $0 \le v \le A$, $0 \le x \le A$, $0 \le y \le A$. The flat geometry on $T^4$ is chosen because it has solutions to the wave equation whose support is not all of $T^4$; examples are smooth plane waves, functions $\phi = \phi(u)$, with $\phi(u) = 0$ for $u < A/2$. Cut a ball out of $T^4$ and embed it in a spacetime that is isometric to Minkowski space outside a compact region $U$, using Lemma 9 (see footnote 4). The resulting spacetime satisfies condition (i) of the Conjecture, but solutions to the wave equation for data on a Cauchy surface $M$ for the past of the Cauchy horizon are not unique: zero data on $M$ is consistent with arbitrary solutions $\phi(u)$ whose support on $T^4$ is disjoint from the removed ball.

Although our primary interest is in smooth geometries with CTCs confined to a compact region, it is worth pointing out that if one allows singularities, there are simple examples of spacetimes with CTCs for which one can easily prove the existence of solutions to free-field equations for arbitrary initial data. For these geometries, however, solutions for smooth data with finite energy are not smooth and do not in general have finite energy; generic solutions are in $L_2^{\rm loc}$. Consider, for example, two-dimensional Minkowski space with two parallel timelike or spacelike segments of equal length removed, as in Fig. 6, and each side of each segment identified with a side of the other segment after translation by a timelike vector $V$, which will be taken to point up and to the right (see footnote 5).

Footnote 4. An example of a suitable manifold is $T^4 \# \mathbb{R}^4 \# \mathbb{CP}^2 \# \mathbb{CP}^2$.

Footnote 5. More precisely, to construct the first spacetime, let $\hat{N}$ be the manifold obtained from Minkowski space by removing the two timelike segments $L_1$ and $L_2 = T(L_I)$, where $T$ is translation by a timelike vector $V$. Call the segments with their endpoints removed $L_1$ and $L_2$. Formally reattach $L_1$ and $L_2$ by writing $N = \hat{N} \sqcup L_1 \sqcup L_2$. The topology of the first spacetime, $N$, is generated by the open sets of $\hat{N}$ together with neighborhoods of points of $L_1$ and $L_2$ defined as follows: Let $O$ be any open set in an atlas for Minkowski space intersecting the line through $L_1$ in an open interval $\ell \subset L_1$. Let $O_L$ ($O_R$) be the part of $O$ lying to the left (right) of the line through $L_1$. Let $O_R'$ be the part of the translated open set $O' = T(O)$ that lies to the right of the line through $L_2$; and let $O_L'$ be the part of the translated open set $O' = T(O)$ that lies to the left of the line through $L_2$ and to the right of the line through $L_1$. Then $O_L \cup \ell \cup O_R$ and $O_R' \cup \ell' \cup O_L'$ are open sets of $\hat{N}$. With the obvious maps to subsets of $\mathbb{R}^4$, the open sets just enumerated form an atlas. Because of the deleted endpoints, $N$ is not complete.

Fig. 6. Two simple spacetimes with CTCs and a well-defined Cauchy problem are shown in these two figures. Two parallel slits are removed from Minkowski space and points labelled by the same letter are identified

The resulting geometries are flat with two conical singularities, corresponding to the removed endpoints of the segments. Similar spacetimes have been discussed by Geroch and Horowitz [15] and Politzer [13]. Some identified points are timelike related, and timelike curves in Minkowski space joining points $P_I \in L_I$ to $T(P_I) \in L_{II}$ become CTCs in $N$. For each spacetime $N$ there is a spacelike hypersurface $M$ to the past of the nonchronal region on which initial data can be set, and we will see it is easy to find a solution for arbitrary initial data on $M$.

Proposition 4. For any initial data $\phi, \dot\phi$ in $L_2(M) \times L_2(M)$ there is a unique solution in $L_2^{\rm loc}$ to the massless scalar wave equation on spacetimes of the form described above. In a neighborhood of spatial infinity the solution agrees with the Minkowski space solution for the same initial data.

Proof. We can divide initial data on $M$ into a sum of data for right-moving and left-moving waves, $f(x - t)$ and $g(x + t)$, by writing
$$f(x) = \frac{1}{2}\left[\phi(x) - \int_{-\infty}^{x} dx'\, \dot\phi(x')\right], \qquad g(x) = \frac{1}{2}\left[\phi(x) + \int_{-\infty}^{x} dx'\, \dot\phi(x')\right]. \tag{4.9}$$
We separately construct solutions for right-moving and left-moving data. On Minkowski space, right-moving data, $(f, \dot f = -f')$, gives the solution $f(x - t)$; equivalently, $f(P) = f(p)$, where $p \in M$ is the past endpoint of the right moving
Fig. 7. Each shaded region is a strip isometric to a piece of Minkowski space. A right-moving solution to the massless wave equation is smooth and well-defined on each strip
null ray from $M$ to $P$. Note that data in $L_2$ that is discontinuous across a finite set of points $p, q, \ldots, r, s$ of $M$ yields a solution that is discontinuous across the boundaries of the strip between the two right-moving null lines through endpoints $p$, $q$. Each point of $N$ similarly lies on a unique right-moving null geodesic, and all but four of these rays, followed back to the past, intersect $M$. Define a solution in $L_2^{\rm loc}(N)$ by $f(P) = f(p)$, where $p \in M$ is the past endpoint of the right-moving null ray from $M$ to $P$. The four rays that fail to meet $M$ are the future parts of null lines that emerge from the (removed) endpoints of the identified segments. These lines are the future and past parts of right-moving null rays that are geodesically incomplete, leaving the manifold at the removed endpoints. They divide $N$ into five strips which intersect $M$ in five segments (see Fig. 7)
$$(-\infty, p_2'),\ (p_2', p_2),\ (p_2, p_1'),\ (p_1', p_1),\ (p_1, \infty),$$
where $p_i$ and $p_i'$ are the points where the right-moving null rays from the bottom and top endpoints of $L_i$, respectively, meet $M$. Since each strip is isometric to a strip bounded by null rays in Minkowski space, the function $f$ satisfies the scalar wave equation with the given initial data everywhere except at the null boundaries of the strips. The proof of existence for left-moving solutions is identical, where five new strips intersect $M$ in five new segments
$$(-\infty, q_1),\ (q_1, q_1'),\ (q_1', q_2),\ (q_2, q_2'),\ (q_2', \infty),$$
where $q_i$ and $q_i'$ are the points where left-moving null rays from the bottom and top endpoints of $L_i$, respectively, meet $M$. Outside the chronology horizon, the left- and right-moving solutions have their Minkowski space values because past directed null rays from points outside the horizon never intersect $L_1 \cup L_2$. Thus the solution outside a spatially compact region of any asymptotically spacelike hypersurface agrees with the Minkowski space solution for the same initial data. Proving uniqueness of these solutions appears to be straightforward: Assuming that any solution in $L_2^{\rm loc}$ is locally a sum of right-moving and left-moving solutions, one can trace it back to nonzero data on $M$.

Similar spacetimes can be constructed in 4 dimensions by removing two parallel planar 3-disks from Minkowski space and identifying their boundaries as in the two-dimensional example. Again it seems clear that solutions in $L_2^{\rm loc}$ exist and that they do not in general have finite energy. Curiously, in 4 dimensions these singular spacetimes can be made into smooth spacetimes by using a construction essentially equivalent to that of Lemma 9. In addition to removing two 3-disks, one removes a small solid torus ($D^3 \times S^1$) at the edge of each disk. Then when the sides of the disk are glued back in, one is left with a spacetime with boundary $S^3 \times S^1$, which one can glue to a compact spacetime.

Acknowledgements. We thank Piotr Chrusciel, Robert Geroch, and Robert Wald for helpful discussions and Rainer Picard for extensive coaching. M.S. Morris wishes to acknowledge support from a Holcomb Research Fellowship at Butler University while completing this work. The work was also supported by NSF Grant No. PHY-9105935.
References
1. Morris, M.S., Thorne, K.S.: Wormholes in spacetime and their use for interstellar travel: A tool for teaching general relativity. Am. J. Phys. 56, 395–412 (1988)
2. Morris, M.S., Thorne, K.S., Yurtsever, U.: Wormholes, time machines, and the weak energy condition. Phys. Rev. Lett. 61, 1446–1449 (1988)
3. Friedman, J.L., Morris, M.S., Novikov, I.D., Echeverria, F., Klinkhammer, G., Thorne, K.S., Yurtsever, U.: Cauchy problem on spacetimes with closed timelike curves. Phys. Rev. D 42, 1915–1930 (1990)
4. Friedman, J.L., Morris, M.S.: The Cauchy problem for the scalar wave equation is well-defined on a class of spacetimes with closed timelike curves. Phys. Rev. Lett. 66, 401–404 (1991)
5. Wald, R.M.: General Relativity. Chicago: University of Chicago Press, 1984
6. Penrose, R., Rindler, W.: Spinors and Space-Time, v. 1, 2. Cambridge: Cambridge University Press, 1984
7. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. 2. New York: Academic Press, 1972, Propositions IX.24 and IX.39
8. Palais, R.S.: Seminar on the Atiyah–Singer index theorem. Princeton: Princeton University Press, 1965, Chap. X
9. Wilcox, C.H.: Scattering Theory for the D'Alembert Equation in Exterior Domains. New York: Springer-Verlag, 1975
10. Aronszajn, N.: A unique continuation theorem for solutions of elliptic partial differential equations or inequalities of second order. J. de Math. Pures et Appl. 36, 235–249 (1957)
11. Echeverria, F., Klinkhammer, G., Thorne, K.S.: Billiard balls in wormhole spacetimes with closed timelike curves: Classical theory. Phys. Rev. D 44, 1077–1099 (1991)
12. Tipler, F.J.: Singularities and Causality Violation. Ann. Phys. 108, 1–36 (1977)
13. Politzer, H.D.: Simple quantum systems in spacetimes with closed timelike curves. Phys. Rev. D 46, 4470–4476 (1992)
14. Gott, J.R.: Closed timelike curves produced by pairs of moving cosmic strings: Exact solutions. Phys. Rev. Lett. 66, 1126–1129 (1991)
15. Geroch, R., Horowitz, G.: Global structure of spacetimes. In: S.W. Hawking, W. Israel (eds.), General Relativity. Cambridge: Cambridge Univ. Press, 1979, pp. 212–293
16. Geroch, R.: Topology in general relativity. J. Math. Phys. 8, 782–786 (1967)
17. Hawking, S.W.: Chronology protection conjecture. Phys. Rev. D 46, 603 (1992)
18. Geroch, R.: Private conversation.
19. Chamblin, A., Gibbons, G.W., Steif, A.R.: Kinks and Time Machines. DAMTP preprint (and electronic preprint gr-qc/9405001) (1994)
20. Sorkin, R.: Non-time-orientable Lorentzian cobordism allows for pair creation. Int. J. Theor. Phys. 25, 877 (1986)
21. Friedman, J.L.: Spacetime topology and quantum gravity. In: Conceptual Problems of Quantum Gravity, ed. A. Ashtekar, J. Stachel, Basel-Boston: Birkhäuser, 1991
Communicated by S.-T. Yau
Commun. Math. Phys. 186, 531 – 562 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
On the Structure of Verma Modules over Virasoro and Neveu-Schwarz Algebras
A. Astashkevich*
Department of Mathematics, MIT, Cambridge, MA 02139, USA.
Received: 15 December 1995 / Accepted: 4 March 1996
Abstract: In this paper we present a different proof of the theorem of B. L. Feigin and D. B. Fuchs about the structure of Verma modules over the Virasoro algebra. We state some new results about the structure of Verma modules over the Neveu-Schwarz algebra. The proof has two advantages: first, it is simpler in the most interesting cases (for example in the so-called minimal models); second, it can be generalized to the Neveu-Schwarz algebra for some class of Verma modules.
1. Introduction

The main goal of this paper is to present a different proof of the theorem of B. L. Feigin and D. B. Fuchs (see [Fe-Fu 1]) about the structure of Verma modules over the Virasoro algebra and to state some new results about the structure of Verma modules over the Neveu-Schwarz algebra. This proof has two advantages: first, it is simpler in the most interesting cases (for example in the so-called minimal models), and second, it can be generalized to the Neveu-Schwarz algebra for some class of Verma modules.

This text arose as a result of trying to understand the original proof of B. L. Feigin and D. B. Fuchs. The original proof uses some facts about Jantzen's filtration which I could not prove and nobody could explain to me. That is why I tried to find another proof.

I would like to express my deep gratitude to M. Finkelberg, E. Frenkel and W. Soergel for valuable discussions. I am happy to thank Professor V.G. Kac for his interest in my work and his questions. I am greatly indebted to D.B. Fuchs for numerous conversations and his constant care.

* Supported by Rosenbaum Fellowship
2. Notation

2.1. Virasoro algebra and Verma modules

Let $\mathcal{L}$ be the Lie algebra of algebraic vector fields on $\mathbb{C}^*$ with the basis $L_i$, $i \in \mathbb{Z}$, and commutators
$$[L_i, L_j] = (j - i)L_{i+j}.$$
The Virasoro algebra, Vir, is a one dimensional central extension of $\mathcal{L}$ corresponding to the cocycle $(L_i, L_j) \mapsto \delta_{-i,j}\,\frac{j^3 - j}{12}$. We have the following basis in the Virasoro algebra: $L_i$, $i \in \mathbb{Z}$, and $C$, with commutators
$$[L_i, C] = 0, \qquad [L_i, L_j] = (j - i)L_{i+j} + \delta_{-i,j}\,\frac{j^3 - j}{12}\,C.$$
Both algebras are $\mathbb{Z}$-graded: $\deg L_i = i$ and $\deg C = 0$. Let us denote by $\mathfrak{h}$ a Lie algebra with the basis $L_0$ and $C$, by $\mathfrak{n}_-$ a Lie algebra with the basis $\{L_{-i},\ i \in \mathbb{N}\}$, and by $\mathfrak{n}_+$ a Lie algebra with the basis $\{L_i,\ i \in \mathbb{N}\}$. We also denote by $\mathfrak{b}_+$ a Lie algebra with the basis $L_i$ and $C$, $i \in \mathbb{Z}_+$. All these algebras $\mathfrak{h}$, $\mathfrak{n}_-$, $\mathfrak{n}_+$ and $\mathfrak{b}_+$ are subalgebras of Vir. We have a Cartan type decomposition of Vir
$$\mathrm{Vir} = \mathfrak{n}_- \oplus \mathfrak{h} \oplus \mathfrak{n}_+, \qquad \mathrm{Vir} = \mathfrak{n}_- \oplus \mathfrak{b}_+.$$
Let $h, c \in \mathbb{C}$. Let us consider a one dimensional module $\mathbb{C}_{h,c}$ over $\mathfrak{b}_+$ such that $\mathfrak{n}_+$ acts by zero, $L_0$ is a multiplication by $h$ and $C$ is a multiplication by $c$. The Verma module $M_{h,c}$ over the Virasoro algebra is by definition the module induced from $\mathbb{C}_{h,c}$:
$$M_{h,c} = \mathrm{Ind}^{\mathrm{Vir}}_{\mathfrak{b}_+} \mathbb{C}_{h,c}.$$
We have a natural inclusion $\mathbb{C}_{h,c} \hookrightarrow M_{h,c}$. So we have a vector $v \in M_{h,c}$ corresponding to $1 \in \mathbb{C}_{h,c}$. Sometimes we will write $v_{h,c}$ to stress that this vector lies in $M_{h,c}$. The vector $v$ is called the vacuum vector. Let us make a few remarks about Verma modules. First, any Verma module $M_{h,c}$ is a free module over $U(\mathfrak{n}_-)$. Therefore, we have the following basis in $M_{h,c}$: $L_{-i_k} L_{-i_{k-1}} \cdots L_{-i_2} L_{-i_1} v_{h,c}$, where $i_k \ge i_{k-1} \ge \dots \ge i_1 \ge 1$. The operator $L_0$ on $M_{h,c}$ is semisimple. We can consider the eigenspace decomposition of $M_{h,c}$,
$$M_{h,c} = \bigoplus_{i=0}^{\infty} M^i_{h,c},$$
where $L_0$ acts as a multiplication by $h - i$ on $M^i_{h,c}$. It is easy to see that this decomposition respects the grading on Vir. We say that a vector $w \in M_{h,c}$ has level $n$ if $w \in M^n_{h,c}$. A vector $w$ is called singular if it has some level $n$ ($n \in \mathbb{Z}_+$) and $\mathfrak{n}_+$ acts by zero on this vector. It is obvious that any singular vector generates a submodule isomorphic to a Verma module. If a singular vector has level $n$ then it generates $M_{h-n,c}$.
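The bracket of Sect. 2.1 can be checked mechanically. The following is a minimal sketch, not taken from the paper: it encodes an element of Vir as a dictionary from basis labels (an integer $i$ for $L_i$, the string 'C' for the central element) to coefficients, and verifies the Jacobi identity on basis triples. The names `bracket`, `add` and `jacobi_holds` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def bracket(x, y):
    """[x, y] with [L_i, L_j] = (j - i) L_{i+j} + delta_{-i,j} (j^3 - j)/12 C."""
    out = {}
    for (i, a), (j, b) in product(x.items(), y.items()):
        if i == 'C' or j == 'C':
            continue  # C is central, so it contributes nothing
        out[i + j] = out.get(i + j, 0) + (j - i) * a * b
        if i + j == 0:
            out['C'] = out.get('C', 0) + Fraction(j**3 - j, 12) * a * b
    return {k: v for k, v in out.items() if v != 0}

def add(*vecs):
    """Sum of several elements, with zero coefficients dropped."""
    out = {}
    for vec in vecs:
        for k, v in vec.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def jacobi_holds(bound):
    """Check the Jacobi identity on all triples L_i, L_j, L_k with |i|,|j|,|k| <= bound."""
    for i, j, k in product(range(-bound, bound + 1), repeat=3):
        x, y, z = {i: 1}, {j: 1}, {k: 1}
        total = add(bracket(x, bracket(y, z)),
                    bracket(y, bracket(z, x)),
                    bracket(z, bracket(x, y)))
        if total:
            return False
    return True
```

For instance, `bracket({2: 1}, {-2: 1})` gives $-4L_0 - \tfrac{1}{2}C$ with the sign conventions written above.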
2.2. Categories $\mathcal{O}_c$, $\mathcal{O}$, Shapovalov's form and Kac determinant formula

Let us define categories $\mathcal{O}_c$ and $\mathcal{O}$. We say that a module $M \in \mathcal{O}_c$ if it satisfies the following conditions:
1) $C$ acts on $M$ as a multiplication by $c$.
2) $L_0$ acts semisimply on $M$ and we have a decomposition of $M$
$$M = \bigoplus_{h \in \mathbb{C}} M_h, \quad \text{where } L_0 \text{ acts on } M_h \text{ as a multiplication by } h$$
and $\dim(M_h) < \infty$ for any $h \in \mathbb{C}$.
3) $\mathfrak{n}_+$ acts locally finitely on $M$. This means that for any $w \in M$, $U(\mathfrak{n}_+)w$ is a finite dimensional space.
We define category $\mathcal{O}$ as a direct sum of $\mathcal{O}_c$ over all $c \in \mathbb{C}$.

Example 1. Module $M_{h,c} \in \mathcal{O}_c$.

Let us define for every module $M \in \mathcal{O}_c$ a contragradient module $\overline{M} \in \mathcal{O}_c$ in the following way:
$$\overline{M}_h \overset{\mathrm{def}}{=} M_h',$$
$L_i$ acts on $\overline{M}$ as $L_{-i}'$ and $C$ acts as a multiplication by $c$.

It is obvious that $w \overset{\mathrm{def}}{=} v_{h,c}' \in \overline{M}_{h,c}$ is a singular vector, thus we get a map $B : M_{h,c} \to \overline{M}_{h,c}$. This map defines a bilinear form on the module $M_{h,c}$ (Shapovalov's form). One can show that this form is symmetric. By definition this form is contravariant. It is easy to check that the spaces $M^n_{h,c}$ and $M^m_{h,c}$ for $n \ne m$ are orthogonal. Since we have a basis in $M^n_{h,c}$ we can calculate the determinant of the form $B_n$, the restriction of $B$ to $M^n_{h,c}$. The result is well known.

Kac determinant formula:
$$\det{}^2(B_n) = \mathrm{Const} \prod_{\substack{k,\, l \\ kl \le n}} \Phi_{k,l}(h, c)^{p(n - kl)},$$
where
$$\Phi_{k,l}(h, c) = \left(h + \frac{(l^2 - 1)(c - 13)}{24} + \frac{kl - 1}{2}\right)\left(h + \frac{(k^2 - 1)(c - 13)}{24} + \frac{kl - 1}{2}\right) + \frac{(k^2 - l^2)^2}{16}.$$
This formula gives us a condition under which the module $M_{h,c}$ is irreducible. It is obvious that if a Verma module is reducible then it contains at least one singular vector.
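As a sanity check of the determinant formula at low level, the sketch below computes the Gram matrix of the contravariant form on the level-$n$ basis and compares its determinant with $\Phi_{k,l}$. It is an illustration under a stated assumption, not the paper's computation: it uses the commonly used convention $[L_m, L_n] = (m - n)L_{m+n} + \delta_{m+n,0}\frac{m^3 - m}{12}c$ with $L_0 v = hv$ and $L_n v = 0$ for $n > 0$, which differs by signs from the bracket written in Sect. 2.1. All function names are hypothetical.

```python
from fractions import Fraction

def act_mono(n, mono, h, c):
    """Apply L_n (n > 0) to L_{-a1}...L_{-ak} v, encoded as mono = (a1, ..., ak)."""
    if not mono:
        return {}                                  # L_n v = 0 for n > 0
    a, rest = mono[0], mono[1:]
    out = {}
    # L_n L_{-a} = L_{-a} L_n + [L_n, L_{-a}]
    for m, coef in act_mono(n, rest, h, c).items():
        out[(a,) + m] = out.get((a,) + m, 0) + coef
    if n > a:                                      # commutator: (n + a) L_{n-a}
        for m, coef in act_mono(n - a, rest, h, c).items():
            out[m] = out.get(m, 0) + (n + a) * coef
    elif n == a:                                   # commutator: 2n L_0 + c (n^3 - n)/12
        val = 2 * n * (h + sum(rest)) + c * Fraction(n**3 - n, 12)
        out[rest] = out.get(rest, 0) + val
    else:                                          # commutator: (n + a) L_{-(a-n)}
        key = (a - n,) + rest
        out[key] = out.get(key, 0) + (n + a)
    return out

def pair(x, y, h, c):
    """B(x v, y v) = <omega(x) y v>, with omega(L_{-a1}...L_{-ak}) = L_{ak}...L_{a1}."""
    vec = {y: Fraction(1)}
    for a in x:                                    # L_{a1} acts first
        new = {}
        for mono, coef in vec.items():
            for m, c2 in act_mono(a, mono, h, c).items():
                new[m] = new.get(m, 0) + coef * c2
        vec = new
    return vec.get((), 0)

def partitions(n, largest=None):
    """Non-increasing tuples of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    largest = n if largest is None else largest
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def gram_det(n, h, c):
    """Determinant of the level-n Gram matrix, by exact Gaussian elimination."""
    basis = list(partitions(n))
    m = [[Fraction(pair(x, y, h, c)) for y in basis] for x in basis]
    d = Fraction(1)
    for i in range(len(m)):
        piv = next((r for r in range(i, len(m)) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, len(m)):
            f = m[r][i] / m[i][i]
            m[r] = [m[r][col] - f * m[i][col] for col in range(len(m))]
    return d

def phi(k, l, h, c):
    """Phi_{k,l}(h, c) as in the Kac determinant formula above."""
    first = h + Fraction(l * l - 1, 24) * (c - 13) + Fraction(k * l - 1, 2)
    second = h + Fraction(k * k - 1, 24) * (c - 13) + Fraction(k * l - 1, 2)
    return first * second + Fraction((k * k - l * l) ** 2, 16)
```

Under this assumed convention the level-2 determinant equals $32\,h\,\Phi_{2,1}(h, c)$, which is consistent with $\det^2(B_2) = \mathrm{Const}\cdot\Phi_{1,1}\Phi_{1,2}\Phi_{2,1}$, since $\Phi_{1,1} = h^2$ and $\Phi_{1,2} = \Phi_{2,1}$.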
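Remark 1. We can define the form $B$ in another way. Let $\omega : U(\mathrm{Vir}) \to U(\mathrm{Vir})$ be an anti-involution such that
$$\omega(L_i) \overset{\mathrm{def}}{=} L_{-i}, \qquad \omega(C) \overset{\mathrm{def}}{=} C.$$
For $w \in M_{h,c}$, let $\langle w \rangle$ be the vacuum expectation value,
$$\langle w \rangle \overset{\mathrm{def}}{=} a, \quad \text{where} \quad w = a\,v_{h,c} + \sum_{i_k \ge i_{k-1} \ge \dots \ge i_1 \ge 1} a_{i_1,\dots,i_k}\, L_{-i_k} L_{-i_{k-1}} \cdots L_{-i_2} L_{-i_1} v_{h,c}.$$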
B(x(vh,c ), y(vh,c )) = hω(x)y(vh,c )i. All these facts are well-known and can be found in [Kac-Ra] or [Fe-Fu 1]. 3. Singular Vectors in Verma Modules over Virasoro Theorem 3.1. ([Fu 1]) At each level n only one singular vector w can exist. If a singular vector w exists then we have the following formula for it X Pi(n) (h, c)L−ik L−ik−1 ...L−i2 L−i1 vh,c w = (L−1 )n vh,c + 1 ,...,ik ik + ... + i1 = n ik ≥ ik−1 ≥ ... ≥ i1 ≥ 1 ik ≥ 2 which defines w up to multiplication by a constant. Pi(n) (h, c) are polynomials in h 1 ,...,ik and c. n can be written in a unique way in the form: Proof. First of all, any element w ∈ Mh,c X ai1 ,...,ik L−ik L−ik−1 ...L−i2 L−i1 vh,c . w = a(L−1 )n vh,c +
ik + ... + i1 = n ik ≥ ik−1 ≥ ... ≥ i1 ≥ 1 ik ≥ 2 Let us order the monomials, L−ik L−ik−1 ...L−i2 L−i1 (ik ≥ ik−1 ≥ ... ≥ i1 ≥ 1), in the following way. We say that L−ik L−ik−1 ...L−i2 L−i1 is greater than L−jl L−jl−1 ...L−j2 L−j1 if i1 = j1 , i2 = j2 , ..., im = jm and im+1 > jm+1 for some m. Example 2. (L−1 )n < L−2 (L−1 )n−2 < L−3 (L−1 )n−3 < . . . Let us assume that w is a singular vector. This means that for any i ≥ 1, Li w = 0. Our goal is to express all coefficients ai1 ,...,ik as a×(some polynomial depending only on h and c). We will show that this can be done by induction. If we know all coefficients ai1 ,...,ik for L−ik L−ik−1 ...L−i2 L−i1 < L−jl L−jl−1 ...L−j2 L−j1 then we can express the coefficient aj1 ,...,jl as a sum of ai1 ,...,ik Qi1 ,...,ik (h, c), where Qi1 ,...,ik (h, c) are polynomials in h and c. Let j1 = j2 = ... = js = 1 and js+1 > 1. Then let us calculate the coefficient at L−jl L−jl−1 ...L−js+2 (L−1 )s+1 vh,c in Ljs+1 −1 w. One can see that it has the following form:
$$(2j_{s+1} - 1)\, a_{j_1,\dots,j_l} + \sum_{\substack{i_k \ge i_{k-1} \ge \dots \ge i_1 \ge 1 \\ L_{-i_k} L_{-i_{k-1}} \cdots L_{-i_1} < L_{-j_l} L_{-j_{l-1}} \cdots L_{-j_1}}} a_{i_1,\dots,i_k}\, Q_{i_1,\dots,i_k}(h, c),$$
where $a_{1,\dots,1} = a$. Since this coefficient equals zero we can express $a_{j_1,\dots,j_l}$ as a sum of $a_{i_1,\dots,i_k} Q_{i_1,\dots,i_k}(h, c)$, where $Q_{i_1,\dots,i_k}(h, c)$ are polynomials in $h$ and $c$, and the sum is over such indices $i_1, \dots, i_k$ that $L_{-i_k} L_{-i_{k-1}} \cdots L_{-i_2} L_{-i_1} < L_{-j_l} L_{-j_{l-1}} \cdots L_{-j_2} L_{-j_1}$. By induction we immediately get that $a_{j_1,\dots,j_l}$ is of the form $a \times$ (some polynomial depending only on $h$, $c$). From this the proposition follows immediately.
Now let us look at the Kac determinant formula:
$$\det{}^2(B_n) = \mathrm{Const} \prod_{\substack{k,\, l \\ kl \le n}} \Phi_{k,l}(h, c)^{p(n - kl)}.$$
The equation $\Phi_{k,l}(h, c) = 0$ defines a rational curve ($\mathbb{C}^*$) in $\mathbb{C}^2$. This curve can be given by the formulas:
$$h(t) = \frac{1 - k^2}{4}\,t + \frac{1 - kl}{2} + \frac{1 - l^2}{4}\,t^{-1}, \qquad c(t) = 6t + 13 + 6t^{-1}.$$
Denote this curve by $F(k, l)$. The Kac determinant formula shows that for almost all $t \in \mathbb{C}^*$ (all but a finite number) there exists a singular vector in $M_{h(t),c(t)}$ at level $kl$. We get the following corollary of Proposition 3.1.

Corollary 3.1. ([Fu 1]) For any natural numbers $k$ and $l$ there exists a unique map $S_{k,l} : F(k, l) \to U(\mathfrak{n}_-)_{kl}$ which has the following form:
$$S_{k,l}(t) = \sum_{\substack{i_r + \dots + i_1 = kl \\ i_r \ge i_{r-1} \ge \dots \ge i_1 \ge 1}} P^{k,l}_{i_1,\dots,i_r}(t)\, L_{-i_r} L_{-i_{r-1}} \cdots L_{-i_2} L_{-i_1},$$
where $P^{k,l}_{1,\dots,1} \equiv 1$ and $P^{k,l}_{i_1,\dots,i_r}(t)$ is a polynomial in $t$ and $t^{-1}$ for any $i_1, \dots, i_r$, and such that $S_{k,l}(t)v_{h,c}$ is a singular vector in the module $M_{h(t),c(t)}$, where $h(t)$ and $c(t)$ are given by the formulas above (i.e., $t$ is a parameter on the curve $F(k, l)$).
Proof. Follows immediately from Proposition 3.1.
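That the parametrization $(h(t), c(t))$ really lies on the curve $\Phi_{k,l} = 0$ is a short exact computation: on the curve the two factors of $\Phi_{k,l}$ reduce to $(k^2 - l^2)/(4t)$ and $(l^2 - k^2)t/4$, whose product cancels the last term. The sketch below checks this with rational arithmetic; the function names `phi` and `curve` are illustrative and not from the paper.

```python
from fractions import Fraction

def phi(k, l, h, c):
    """Phi_{k,l}(h, c) from the Kac determinant formula."""
    first = h + Fraction(l * l - 1, 24) * (c - 13) + Fraction(k * l - 1, 2)
    second = h + Fraction(k * k - 1, 24) * (c - 13) + Fraction(k * l - 1, 2)
    return first * second + Fraction((k * k - l * l) ** 2, 16)

def curve(k, l, t):
    """The point (h(t), c(t)) of F(k, l) for a nonzero rational parameter t."""
    t = Fraction(t)
    h = Fraction(1 - k * k, 4) * t + Fraction(1 - k * l, 2) + Fraction(1 - l * l, 4) / t
    c = 6 * t + 13 + 6 / t
    return h, c

def on_curve(k, l, t):
    """True when Phi_{k,l} vanishes exactly at the curve point with parameter t."""
    h, c = curve(k, l, t)
    return phi(k, l, h, c) == 0
```

For example, `curve(2, 1, 1)` returns $h = -5/4$, $c = 25$, and `on_curve(2, 1, 1)` holds.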
We have a trivial vector bundle $U(\mathfrak{n}_-)_{kl}$ over $\mathbb{CP}^1$ and we have a section $S_{k,l}(t)$ of this bundle over $\mathbb{C}^*$. Consider this section as a meromorphic section of our vector bundle over $\mathbb{CP}^1$. Now we would like to calculate the orders of the poles at the points zero and infinity. Let us formulate the final result. The proof of the following theorem is technical and can be found in [Ast-Fu].
Theorem 3.2. The coefficient at $L_{-i_r} L_{-i_{r-1}} \cdots L_{-i_2} L_{-i_1}$ in $S_{k,l}(t)$ has degree in $t$ less than or equal to $l(k - 1)$. The degree in $t$ is equal to $l(k - 1)$ only at the monomial $(L_{-k})^l$, and the coefficient at $t^{l(k-1)}$ equals $((k - 1)!)^l$.

As a corollary of the last theorem one gets the following important result.

Theorem 3.3. The orders of the poles of the section $S_{k,l}(t)$ of the trivial vector bundle $U(\mathfrak{n}_-)_{kl}$ over $\mathbb{CP}^1$ are equal to $l(k - 1)$ at $\infty$ and $k(l - 1)$ at 0.

Proof. Obvious.
4. Jantzen's Filtration

The main goal of this section is to define Jantzen's filtration and to formulate all the properties of it which we need. Let $C$ be a smooth algebraic curve over $\mathbb{C}$ with the sheaf of functions $\mathcal{O}$ and two vector bundles $M$ and $\overline{M}$. Denote the corresponding sheaves by $\mathcal{M}$ and $\overline{\mathcal{M}}$. Suppose we have a map $B : M \to \overline{M}$ of the vector bundles. Then for any point $p \in C$ we get Jantzen's filtration on the fiber of $M$ at the point $p$. Let us define it. Let $\tau$ be a local parameter at the point $p \in C$. Consider the ring $\mathcal{O}_p$ and the modules $\mathcal{M}_p$ and $\overline{\mathcal{M}}_p$ over it. These modules are free and we have a map $B_p : \mathcal{M}_p \to \overline{\mathcal{M}}_p$. The fibers of $M$ and $\overline{M}$ at the point $p$ are exactly $\mathcal{M}_p/\tau\mathcal{M}_p$ and $\overline{\mathcal{M}}_p/\tau\overline{\mathcal{M}}_p$. We denote them by $V$ and $W$ respectively. Now we will define a decreasing filtration $\cdots \subseteq V^{(2)} \subseteq V^{(1)} \subseteq V^{(0)} = V$.

Definition 1. $V^{(n)}$ is spanned by such vectors $v$ that there exists an element $\tilde{v} \in \mathcal{M}_p$ with the following properties:
i) Under the projection $\pi : \mathcal{M}_p \to \mathcal{M}_p/\tau\mathcal{M}_p = V$, $\pi(\tilde{v}) = v$.
ii) $B_p(\tilde{v}) \in \tau^n \overline{\mathcal{M}}_p$.

From the definition it is obvious that the filtration depends only on the map $B$ in some neighborhood of the point $p$. Suppose we have a symmetric bilinear form $B$ on the vector bundle $M$. Take $\overline{M} = M'$, the dual vector bundle. The bilinear form provides a map $M \to M' = \overline{M}$. So we obtain Jantzen's filtration on every fiber of the vector bundle $M$. From now on we assume that we have a vector bundle $M$ and a bilinear form $B$ on it. Choose a point $p \in C$ and let us denote the induced map $V \to V'$ by $B^{(p)}$, where $V$ is the fiber of $M$ at the point $p$.

Properties of Jantzen's filtration

1) $V^{(1)} = \mathrm{Ker}(B^{(p)})$.

2) Let us assume that we have two maps $A$ and $A' : M \to M$ such that $B(Av, w) = B(v, A'w)$ for any two sections $v, w \in \Gamma(U, M)$, where $U$ is any open subset of $C$. We have induced maps $A^{(p)}$ and $A'^{(p)} : V \to V$ and it is easy to check that $A^{(p)}(V^{(n)}) \subseteq V^{(n)}$.

Indeed, if $v \in V^{(n)}$ then there exists $\tilde{v} \in \mathcal{M}_p$ such that $B_p(\tilde{v}) \in \tau^n \overline{\mathcal{M}}_p$. This means that for any $w \in \mathcal{M}_p$, $B_p(\tilde{v}, w) \in \tau^n \mathcal{O}_p$. Therefore, in order to check that $A^{(p)}v \in V^{(n)}$ it is sufficient to show that for any $w \in \mathcal{M}_p$, $B_p(A_p\tilde{v}, w) \in \tau^n \mathcal{O}_p$. But $B_p(A_p\tilde{v}, w) = B_p(\tilde{v}, A'_p w) \in \tau^n \mathcal{O}_p$ because $A'_p w \in \mathcal{M}_p$ for any $w \in \mathcal{M}_p$.
3) Let us assume that the form $B$ is non-degenerate at the generic point. Then we can define a determinant of this form as a section of the following line bundle $((\Lambda^{\dim M} M)^{\otimes 2})'$, $\det(B) \in \Gamma(C, ((\Lambda^{\dim M} M)^{\otimes 2})')$. We have the following formula:
$$\mathrm{ord}_\tau(\det(B)) = \sum_{i=1}^{\infty} \dim(V^{(i)}).$$
The statement is local, so to prove it, it is enough to consider a free module $\mathcal{M}_p$ over $\mathcal{O}_p$ and a bilinear form $B_p$. In such a case the formula is almost obvious. One can find a proof of it in Jantzen's book (see [Ja]).

4) Assume that the form $B$ is symmetric and non-degenerate at the generic point. Then it induces a non-degenerate symmetric bilinear form on each quotient $V^{(i)}/V^{(i+1)}$, where $i \in \mathbb{N}$. The statement is local. One can find the proof of it in Jantzen's book (see [Ja]).

5) Assume that we have a smooth algebraic surface, $S$, over $\mathbb{C}$, a point, $p \in S$, and two smooth curves $C_1$ and $C_2$ which intersect transversally at this point. Suppose we have a vector bundle $M$ over $S$ and a bilinear form $B$ on $M$ which is non-degenerate at the generic point of the surface $S$. We will denote the fiber of the vector bundle $M$ at the point $p$ by $V$. In addition, assume that we have two sections $\tilde{v}$ and $\tilde{w} \in \Gamma(S, M)$ such that
a) $v = \tilde{v}(p) = \tilde{w}(p) \in V$,
b) $B|_{C_1}(\tilde{v}|_{C_1}, \bullet) = 0$ and $B|_{C_2}(\tilde{w}|_{C_2}, \bullet) = 0$.
Let $C$ be any smooth curve which contains the point $p$. If we restrict the vector bundle $M$ and the form $B$ to $C$ we obtain Jantzen's filtration on the vector space $V$. The claim is $v \in V^{(2)}$.

Proof. The statement is local, therefore we can consider a free module $\mathcal{M}_p$ over the local ring $\mathcal{O}_{S,p}$. Moreover, we can take completions with respect to the maximal ideal $\mathfrak{m} \subset \mathcal{O}_{S,p}$. Then $\hat{\mathcal{O}}_{S,p} \cong \mathbb{C}[[x, y]]$ and we can assume that the curves $C_1$ and $C_2$ are given by the equations $x = 0$ and $y = 0$ correspondingly. The equation for the curve $C$ is $z = 0$ where $z = ax + by + \tilde{z}$ and $\tilde{z} \in \mathfrak{m}^2$. We have the form $\hat{B}_p$ on the module $\hat{\mathcal{M}}_p$.

Now we forget about the curve and reformulate everything in purely algebraic terms. We have a ring $\mathbb{C}[[x, y]]$, a free module $\hat{V} = V \otimes \mathbb{C}[[x, y]]$, and a map $\hat{B} : \hat{V} \to \hat{V}'$ over $\mathbb{C}[[x, y]]$. Moreover, we have two vectors, $\tilde{v}, \tilde{w} \in \hat{V}$, such that modulo the maximal ideal $\mathfrak{m} \subset \mathbb{C}[[x, y]]$ they are equal to $v \in V = \hat{V}/\mathfrak{m}\hat{V}$ and $B(\tilde{v}) \in (x)\hat{V}'$, $B(\tilde{w}) \in (y)\hat{V}'$. We can write down the map $B$ as a Taylor series:
$$B = B_0 + xB_{1,0} + yB_{0,1} + O(\mathfrak{m}^2).$$
Also we can write
$$\tilde{v} = v + yv_{0,1} + xv_{1,0} + O(\mathfrak{m}^2)$$
and
$$\tilde{w} = v + yw_{0,1} + xw_{1,0} + O(\mathfrak{m}^2).$$
Comparing the coefficients of $y$ in $B(\tilde{v})$ and of $x$ in $B(\tilde{w})$, we have the following equalities:
$$B_0(v) = 0, \qquad B_{0,1}(v) = -B_0(v_{0,1}), \qquad B_{1,0}(v) = -B_0(w_{1,0}).$$
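We must show that we can find $u \in \hat V$ such that $u = v \bmod \mathfrak{m}$ and $B(u) = 0 \bmod (z, \mathfrak{m}^2)$, where $(z, \mathfrak{m}^2) = (ax + by + \tilde z, \mathfrak{m}^2) = (ax + by, \mathfrak{m}^2)$. But this is obvious, since we can solve the following equation:
$$xB_{1,0}(v) + yB_{0,1}(v) = -B_0(xu_{1,0} + yu_{0,1}) \bmod (ax + by, \mathfrak{m}^2).$$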
5. Structure of Submodules of Verma Modules and Jantzen's Filtration

In this section we will state the main theorems. Let us fix $h$ and $c$. Then the equation $\Phi_{k,l}(h, c) = 0$ defines in the plane $\mathbb{C}^2(k, l)$ a quadruple of straight lines, namely $pk + ql + m = 0$, where
$$c = \frac{(3p + 2q)(3q + 2p)}{pq}, \qquad h = \frac{(p + q)^2 - m^2}{4pq}.$$
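The parametrization above can be checked numerically. The sketch below uses exact rational arithmetic and the formulas for $c$ and $h$ exactly as printed; the function names are hypothetical. It illustrates that $c$ and $h$ are unchanged under rescaling $(p, q, m) \to (\lambda p, \lambda q, \lambda m)$, under $m \to -m$, and under swapping $p$ and $q$, which is one way to see where the quadruple of lines comes from.

```python
from fractions import Fraction


def central_charge(p, q):
    """c = (3p + 2q)(3q + 2p) / (pq), as printed in the text."""
    p, q = Fraction(p), Fraction(q)
    return (3 * p + 2 * q) * (3 * q + 2 * p) / (p * q)


def highest_weight(p, q, m):
    """h = ((p + q)^2 - m^2) / (4pq), as printed in the text."""
    p, q, m = Fraction(p), Fraction(q), Fraction(m)
    return ((p + q) ** 2 - m ** 2) / (4 * p * q)


# The boundary values c = 1 and c = 25 correspond to p = -q and p = q.
print(central_charge(1, -1), central_charge(1, 1))

# (p, q, m), (p, q, -m), (q, p, m), (q, p, -m) all give the same (h, c).
params = [(2, 3, 1), (2, 3, -1), (3, 2, 1), (3, 2, -1)]
print({(highest_weight(p, q, m), central_charge(p, q)) for p, q, m in params})
```

Equivalently, $c = 13 + 6(p/q + q/p)$, so the direction of the lines depends only on $c$.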
Certainly $p$, $q$ and $m$ are not defined uniquely, but nevertheless the lines $pk + ql + m = 0$ in the plane $\mathbb{C}^2(k, l)$ are well defined. It is obvious that the directions of these lines depend only on $c$. When $c \neq 1, 25$, these lines form a rhombus with the diagonals $k = \pm l$. If $c = 1$ (or $c = 25$) then this rhombus degenerates into a pair of lines parallel to the line $k = l$ (or $k = -l$) and symmetric with respect to this line. Moreover, if $c = 1$ and $h = 0$ (or $c = 25$ and $h = -1$) then these two lines become a single line $k = l$ (or $k = -l$). These lines are real if and only if $c \le 1$ or $c \ge 25$. If $c \le 1$ then all these lines have positive slope, and when $c \ge 25$ all these lines have negative slope. These lines are never parallel to the coordinate axes. Let us denote one of these lines by $l_{h,c}$.

5.1. Structure of Submodules of Verma Modules

We shall distinguish the following cases.

Case I. The line $l_{h,c}$ contains no integral points.

Case II. The line $l_{h,c}$ contains exactly one integral point $(k, l)$. We have the following subcases:
$II_+$. The product $kl > 0$.
$II_0$. The product $kl = 0$.
$II_-$. The product $kl < 0$.

Case III. The line $l_{h,c}$ contains infinitely many integral points. We will distinguish six subcases, which can be divided into two groups.

Subcase $c \le 1$. Let $(k_1, l_1), (k_2, l_2), (k_3, l_3), \dots$ be all integral points $(k, l)$ on the line $l_{h,c}$ such that $kl > 0$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \mathbb{N}$.
$III_-^{00}$. The line $l_{h,c}$ intersects both axes at integral points (see Fig. 1).
$III_-^{0}$. The line $l_{h,c}$ intersects only one axis at an integral point (see Fig. 2).
$III_-$. The line $l_{h,c}$ intersects both axes at non-integral points. In this case we draw an auxiliary line $l'_{h,c}$ parallel to $l_{h,c}$ through the point $(k_1, -l_1)$. We denote its points $(k_2, l_2 - 2l_1), (k_3, l_3 - 2l_1), (k_4, l_4 - 2l_1), \dots$ by $(k'_1, l'_1), (k'_2, l'_2), (k'_3, l'_3), \dots$ (see Fig. 3). It is easy to see that we have the following inequalities:
Fig. 1. [Figure: the line $l_{h,c}$ in the $(k, l)$-plane with its marked integral points $(k_i, l_i)$, case $III_-^{00}$.]
$$k_1 l_1 < k_2 l_2 < k_1 l_1 + k'_1 l'_1 < k_1 l_1 + k'_2 l'_2 < k_3 l_3 < k_4 l_4 < k_1 l_1 + k'_3 l'_3 < k_1 l_1 + k'_4 l'_4 < k_5 l_5 < k_6 l_6 < \dots .$$

Subcase $c \ge 25$. Let $\{(k_1, l_1), (k_2, l_2), \dots, (k_s, l_s)\}$ be all integral points $(k, l)$ on the line $l_{h,c}$ such that $kl > 0$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \{1, 2, \dots, s - 1\}$.
$III_+^{00}$. The line $l_{h,c}$ intersects both axes at integral points (see Fig. 4).
$III_+^{0}$. The line $l_{h,c}$ intersects only one axis at an integral point (see Fig. 5).
$III_+$. The line $l_{h,c}$ intersects both axes at non-integral points. In this case we draw an auxiliary line $l'_{h,c}$ parallel to $l_{h,c}$ through the point $(k_1, -l_1)$. We denote its points $(k_2, l_2 - 2l_1), (k_3, l_3 - 2l_1), (k_4, l_4 - 2l_1), \dots$ by $(k'_1, l'_1), (k'_2, l'_2), (k'_3, l'_3), \dots$ (see Fig. 6). It is easy to see that we have the following inequalities:
$$k_1 l_1 < k_2 l_2 < k_1 l_1 + k'_1 l'_1 < k_1 l_1 + k'_2 l'_2 < k_3 l_3 < k_4 l_4 < k_1 l_1 + k'_3 l'_3 < k_1 l_1 + k'_4 l'_4 < k_5 l_5 < k_6 l_6 < \dots .$$
Fig. 2. [Figure: the line $l_{h,c}$ in the $(k, l)$-plane with its marked integral points $(k_i, l_i)$, case $III_-^{0}$.]
Theorem A ([Fe-Fu 1]). a) All submodules of a Verma module are generated by singular vectors.
b)
i) In cases I, $II_-$ and $II_0$ the Verma module is irreducible.
ii) In case $II_+$ the Verma module $M_{h,c}$ has a unique submodule, generated by the singular vector at level $kl$. This submodule is isomorphic to the Verma module $M_{h-kl,c}$, which is irreducible (case $II_-$).
iii) In cases $III_-^{00}$, $III_-^{0}$ and $III_-$ we have an infinite number of singular vectors. All singular vectors and relations between them are shown in the diagrams below. Singular vectors are denoted by points with their weights indicated. An arrow or a chain of arrows from one point to another means that the second singular vector lies in the submodule generated by the first one.
Fig. 3. [Figure: the line $l_{h,c}$ with marked points $(k_i, l_i)$ and the auxiliary line $l'_{h,c}$ with marked points $(k'_i, l'_i)$ in the $(k, l)$-plane, case $III_-$.]
Cases $III_-^{00}$ and $III_-^{0}$ (a single infinite chain):

$(h, c) \to (h - k_1 l_1, c) \to (h - k_2 l_2, c) \to (h - k_3 l_3, c) \to \cdots$

Case $III_-$ (two points in each row below the top; there is an arrow from each point of a row to each point of the next row):

$(h, c)$
$(h - k_1 l_1, c)$, $(h - k_2 l_2, c)$
$(h - k_1 l_1 - k'_1 l'_1, c)$, $(h - k_1 l_1 - k'_2 l'_2, c)$
$(h - k_3 l_3, c)$, $(h - k_4 l_4, c)$
$(h - k_1 l_1 - k'_3 l'_3, c)$, $(h - k_1 l_1 - k'_4 l'_4, c)$
$(h - k_5 l_5, c)$, $(h - k_6 l_6, c)$
$\cdots$
Fig. 4. [Figure: the line $l_{h,c}$ in the $(k, l)$-plane with its marked integral points $(k_i, l_i)$, case $III_+^{00}$.]
iv) In cases $III_+^{00}$, $III_+^{0}$ and $III_+$ we have a finite number of singular vectors (maybe zero). All singular vectors and relations between them are shown in the diagrams below. Singular vectors are denoted by points with their weights indicated. An arrow or a chain of arrows from one point to another means that the second singular vector lies in the submodule generated by the first one.

Cases $III_+^{00}$ and $III_+^{0}$ (a single finite chain):

$(h, c) \to (h - k_1 l_1, c) \to (h - k_2 l_2, c) \to \cdots \to (h - k_{s-1} l_{s-1}, c) \to (h - k_s l_s, c)$

Case $III_+$ (two points in each row below the top; there is an arrow from each point of a row to each point of the next row; the diagram is finite and ends in a single point):

$(h, c)$
$(h - k_1 l_1, c)$, $(h - k_2 l_2, c)$
$(h - k_1 l_1 - k'_1 l'_1, c)$, $(h - k_1 l_1 - k'_2 l'_2, c)$
$(h - k_3 l_3, c)$, $(h - k_4 l_4, c)$
$(h - k_1 l_1 - k'_3 l'_3, c)$, $(h - k_1 l_1 - k'_4 l'_4, c)$
$\cdots$
(a single final point)
Fig. 5. [Figure: the line $l_{h,c}$ in the $(k, l)$-plane with its marked integral points $(k_i, l_i)$, case $III_+^{0}$.]
c) In case $III_-^{00}$ (respectively $III_-^{0}$, $III_-$, $III_+^{00}$, $III_+^{0}$ and $III_+$) any Verma submodule generated by a singular vector belongs to case $III_-^{00}$ (respectively $III_-^{0}$, $III_-$, $III_+^{00}$, $III_+^{0}$ and $III_+$).
5.2. Jantzen's Filtration

First of all, let us make some remarks about the curves $F(k, l)$. Curves $F(k, l)$ intersect each other only at real points. Curves $F(k, k)$ are the lines $h = \frac{k^2 - 1}{24}(1 - c)$. All other curves $F(k, l)$ for $k \neq l$ have two branches in the real plane $\mathbb{R}^2$. One of the branches of the curve $F(k, l)$ lies in the region $c \le 1$, $h \ge \frac{c - 1}{24}$, the other one lies in the region $c \ge 25$, $h \le 0$. All curves $F(k, l)$ for $k \neq l$ touch the boundary lines $c = 1$, $h = \frac{c - 1}{24}$ and $c = 25$.

We have a two-parameter family of Verma modules $M_{h,c}$. Moreover, we have a symmetric bilinear form $B(h, c)$ on our Verma module $M_{h,c}$. We can restrict the form $B(h, c)$ to $M^i_{h,c}$, where $i \in \mathbb{Z}_+$. We denote this restriction by $B_i(h, c)$. One can look at this situation in the following way. We have a complex algebraic surface $\mathbb{C}^2$ and a trivial vector bundle $M_i$ with fibers which are canonically isomorphic to $M^i_{h,c}$ at the point $(h, c) \in \mathbb{C}^2$. We have a symmetric bilinear form on the vector bundle $M_i$ which is non-degenerate at the generic point of $\mathbb{C}^2$. The Kac determinant formula gives us an expression for the determinant of this form in some basis.

Let us fix a point $(h, c) \in \mathbb{C}^2$. Let $C$ be any smooth curve which passes through this point $(h, c)$. Consider Jantzen's filtration on the fiber of the vector bundle $M_i$ at the point $(h, c)$, i.e., on $M^i_{h,c}$, along this curve $C$. From the properties of Jantzen's filtration (see Sec. 4) it follows that the filtration is Vir invariant. Let us describe it.

Fig. 6. [Figure: the line $l_{h,c}$ with marked points $(k_i, l_i)$ and the auxiliary line with marked points $(k'_i, l'_i)$ in the $(k, l)$-plane, case $III_+$.]

We assume here that
the curve $C$ is not tangent to any curve $F(k, l)$ at the point $(h, c)$. We will keep the same notation as in Theorem A.

Theorem B. a) In cases I, $II_0$ and $II_-$ all $M^{(i)}_{h,c} = 0$ for $i \in \mathbb{N}$.
b) In case $II_+$ we have one point $(k, l)$ on the line $l_{h,c}$ such that $kl > 0$. $M^{(1)}_{h,c}$ is generated by the singular vector at level $kl$ and is isomorphic to the Verma module $M_{h-kl,c}$. $M^{(i)}_{h,c} = 0$ for $i > 1$, $i \in \mathbb{N}$.
c) In case $III_-^{0}$ we have an infinite number of points $(k_i, l_i)$ on the line $l_{h,c}$. Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$.
d) In case $III_+^{0}$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$. Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $i \le s$. $M^{(i)}_{h,c} = 0$ for $i > s$.
e) In case $III_-$ we have an infinite number of points $(k_i, l_i)$ on the line $l_{h,c}$ and points $(k'_j, l'_j)$ on the additional line. Submodule $M^{(2i-1)}_{h,c}$ is generated by two singular vectors at levels $k_{2i-1} l_{2i-1}$ and $k_{2i} l_{2i}$ for $i \in \mathbb{N}$. Submodule $M^{(2i)}_{h,c}$ is generated by two singular vectors at levels $k_1 l_1 + k'_{2i-1} l'_{2i-1}$ and $k_1 l_1 + k'_{2i} l'_{2i}$ for $i \in \mathbb{N}$.
f) In case $III_+$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$ and points $(k'_1, l'_1), \dots, (k'_{s-1}, l'_{s-1})$ on the additional line. Submodule $M^{(2i-1)}_{h,c}$ is generated by two singular vectors at levels $k_{2i-1} l_{2i-1}$ and $k_{2i} l_{2i}$ for $2i \le s$. Submodule $M^{(2i)}_{h,c}$ is generated by two singular vectors at levels $k_1 l_1 + k'_{2i-1} l'_{2i-1}$ and $k_1 l_1 + k'_{2i} l'_{2i}$ for $2i < s$.
We have two cases (depending on whether $s$ is even or odd):
$s$ is even) Submodule $M^{(s)}_{h,c}$ is generated by the singular vector at level $k_1 l_1 + k'_{s-1} l'_{s-1}$ and is isomorphic to the Verma module $M_{h - k_1 l_1 - k'_{s-1} l'_{s-1},c}$. For all $i > s$ submodule $M^{(i)}_{h,c} = 0$.
$s$ is odd) Submodule $M^{(s)}_{h,c}$ is generated by the singular vector at level $k_s l_s$ and is isomorphic to the Verma module $M_{h-k_s l_s,c}$. For all $i > s$ submodule $M^{(i)}_{h,c} = 0$.
g) In case $III_-^{00}$ we have an infinite number of points $(k_i, l_i)$ on the line $l_{h,c}$. We distinguish two cases:
$c = 1$ or $c = 24h + 1$) Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$.
$c \neq 1$ and $c \neq 24h + 1$) Submodule $M^{(2i-1)}_{h,c} = M^{(2i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$.
h) In case $III_+^{00}$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$. We distinguish two cases:
$c = 25$) Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $1 \le i \le s$. For all $i > s$ we have $M^{(i)}_{h,c} = 0$.
$c \neq 25$) Submodule $M^{(2i-1)}_{h,c} = M^{(2i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $1 \le i \le s$. For all $i > 2s$ we have $M^{(i)}_{h,c} = 0$.

Now we will describe Jantzen's filtration in the case when $C = F(k, l)$. We have a fixed point $(h, c) \in \mathbb{C}^2$ and we assume that $(h, c) \in C = F(k, l)$. We will keep the same notation as in Theorem A.

Theorem C. a) Cases I, $II_0$ and $II_-$ cannot occur.
b) In case $II_+$ the unique point on the line $l_{h,c}$ with positive product of coordinates coincides with the pair $(k, l)$ defining the curve $C = F(k, l)$. Then $M^{(i)}_{h,c} = M^{(1)}_{h,c}$ is generated by the singular vector at level $kl$ and is isomorphic to the Verma module $M_{h-kl,c}$ for all $i \in \mathbb{N}$.
c) In case $III_-^{0}$ we have an infinite number of points $(k_i, l_i)$ on the line $l_{h,c}$, and for some $j \in \mathbb{N}$, $(k_j, l_j) = (k, l)$. Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $i \le j$. For $i > j$, $M^{(i)}_{h,c} = M^{(j)}_{h,c}$.
d) In case $III_+^{0}$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$. For some $j \in \{1, \dots, s\}$, $(k_j, l_j) = (k, l)$. Submodule $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $i \le j$. $M^{(i)}_{h,c} = M^{(j)}_{h,c}$ for $i > j$.
e) Case $III_-$. There is an infinite number of points $(k_j, l_j)$ on the line $l_{h,c}$. For some $j \in \mathbb{N}$, $(k_j, l_j) = (k, l)$. Submodule $M^{(2i-1)}_{h,c}$ is generated by two singular vectors at levels $k_{2i-1} l_{2i-1}$ and $k_{2i} l_{2i}$ for $2i - 1 \le j$. Submodule $M^{(2i)}_{h,c}$ is generated by two singular vectors at levels $k_1 l_1 + k'_{2i-1} l'_{2i-1}$ and $k_1 l_1 + k'_{2i} l'_{2i}$ for $2i < j$. Let us denote $\tilde j = j + 1$ if $j$ is odd and $\tilde j = j$ if $j$ is even. Then for any $i \ge \tilde j$, $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_j l_j$ and is isomorphic to the Verma module $M_{h-k_j l_j,c}$.
f) In case $III_+$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$ and points $(k'_1, l'_1), \dots, (k'_{s-1}, l'_{s-1})$ on the additional line. For some $1 \le j \le s$, $(k_j, l_j) = (k, l)$. Submodule $M^{(2i-1)}_{h,c}$ is generated by two singular vectors at levels $k_{2i-1} l_{2i-1}$ and $k_{2i} l_{2i}$ for $2i - 1 \le j$. Submodule $M^{(2i)}_{h,c}$ is generated by two singular vectors at levels $k_1 l_1 + k'_{2i-1} l'_{2i-1}$ and $k_1 l_1 + k'_{2i} l'_{2i}$ for $2i < j$. Let us denote $\tilde j = j + 1$ if $j$ is odd and $\tilde j = j$ if $j$ is even. Then for any $i \ge \tilde j$, $M^{(i)}_{h,c}$ is generated by the singular vector at level $k_j l_j$ and is isomorphic to the Verma module $M_{h-k_j l_j,c}$.
g) In case $III_-^{00}$ we have an infinite number of points $(k_i, l_i)$ on the line $l_{h,c}$. There exists $j \in \mathbb{N}$ such that $k_j l_j = kl$. Submodule $M^{(2i-1)}_{h,c} = M^{(2i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $1 \le i \le j$. For any $i > 2j$, $M^{(i)}_{h,c} = M^{(2j)}_{h,c}$.
h) In case $III_+^{00}$ we have a finite number of points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$. There exists $j$, $1 \le j \le s$, such that $k_j l_j = kl$. Submodule $M^{(2i-1)}_{h,c} = M^{(2i)}_{h,c}$ is generated by the singular vector at level $k_i l_i$ and is isomorphic to the Verma module $M_{h-k_i l_i,c}$ for $1 \le i \le j$. For any $i > 2j$, $M^{(i)}_{h,c} = M^{(2j)}_{h,c}$.
5.3. Remarks

1) We will prove these theorems in two steps.
Step 1) We prove Secs. b) (i), (ii) and (iii) of Theorem A and Sec. a) of Theorem A in these cases. Alongside we prove Secs. a) through f) of Theorem B. Then as a corollary we get that parts a) through f) of Theorem C are true.
Step 2) We prove Sec. b) (iv) of Theorem A and Sec. a) of Theorem A in the above case. Alongside we prove Secs. g) and h) of Theorem C. As a corollary we obtain that Secs. g) and h) of Theorem B are true.

2) We prove everything by induction on the level. Let us explain what this means. We say that some property is true up to level $k$ if this property holds in
$$\bigoplus_{i=0}^{k} M^i_{h,c}.$$
For example, we say that some submodule $V$ is generated by a vector $v$ up to level $k$ iff
$$\bigoplus_{i=0}^{k} M^i_{h,c} \cap V = \bigoplus_{i=0}^{k} M^i_{h,c} \cap W,$$
where $W$ is the submodule generated by $v$. Another example: we say that part a) of Theorem A holds up to level $k$, meaning that any submodule of the Verma module is generated by singular vectors up to level $k$.

3) It is easy to show (using the Kac determinant formula) that Theorem B follows from Theorem A and vice versa.
6. Proof of the structure theorem in simple cases

In this section we will prove the structure theorem in the following cases: I, II, $III_+^{0}$, $III_-^{0}$, $III_+$ and $III_-$. Let us emphasize that in all these cases all curves $F(k, l)$ which pass through the point $(h, c)$ intersect transversally at this point.

First, let us notice that the Kac determinant formula and Corollary 3.2 immediately imply the existence of all the singular vectors and the diagrams of inclusions between the corresponding Verma modules as stated in Theorem A. Also, it is easy to see that any Verma submodule generated by a singular vector (which we constructed) is of the same case as the original Verma module. Our goal is to show that the maximal submodule is generated by the singular vectors (or singular vector) at levels $k_1 l_1$ and $k_2 l_2$ (or level $k_1 l_1$) in the notation of Theorem A.

Second, let us notice that cases I, $II_-$ and $II_0$ are trivial. They immediately follow from the Kac determinant formula (since in these cases the determinant does not vanish).

Definition 2. For any module $M \in \mathcal{O}_c$ let us define its character
$$\mathrm{ch}(M) \stackrel{\mathrm{def}}{=} \sum_{h \in \mathbb{C}} \dim(M_h) q^h, \quad \text{where } M = \bigoplus_{h \in \mathbb{C}} M_h.$$
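To make the character concrete: the level-$n$ part of a Verma module $M_{h,c}$ has dimension $p(n)$, the number of partitions of $n$ (the same partition function $p$ appears in Sec. 7.2), and a Verma submodule generated by a singular vector at level $kl$ contributes $p(n - kl)$ at level $n$. The sketch below tabulates both; the function names are hypothetical.

```python
def partition_numbers(n_max):
    """Return [p(0), p(1), ..., p(n_max)], with p the partition function."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):      # allow parts of size `part`
        for n in range(part, n_max + 1):
            p[n] += p[n - part]
    return p


def submodule_level_dims(n_max, shift):
    """Level-by-level dimensions, inside M_{h,c}, of a Verma submodule
    generated by a singular vector at level `shift` (for example shift = k*l)."""
    p = partition_numbers(n_max)
    return [p[n - shift] if n >= shift else 0 for n in range(n_max + 1)]


print(partition_numbers(10))
print(submodule_level_dims(10, 2))
```

In case $II_+$, for instance, the second list with `shift = k*l` gives the coefficients of $\mathrm{ch}(M_{h-kl,c})$ level by level.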
Using Property 3 of Jantzen's filtration (Sec. 4) we can write an explicit formula for $\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c})$. In cases $III_-^{0}$ and $III_-$ we obtain the following formula:
$$\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c}) = \sum_{i=1}^{\infty} \mathrm{ch}(M_{h-k_i l_i,c}).$$
In cases $III_+^{0}$ and $III_+$ we have only a finite number of marked points $(k_1, l_1), \dots, (k_s, l_s)$ on the line $l_{h,c}$. So we obtain the following formula:
$$\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c}) = \sum_{i=1}^{s} \mathrm{ch}(M_{h-k_i l_i,c}).$$
In case $II_+$ we get
$$\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c}) = \mathrm{ch}(M_{h-kl,c}),$$
where $(k, l)$ is the marked point on the line $l_{h,c}$.

Let us make the following important remark. Jantzen's filtration is a filtration by Vir submodules (this follows immediately from the second property of Jantzen's filtration). Therefore, for example, if we know some vector $w \in M^{(i)}_{h,c}$, then we see that the submodule generated by the vector $w$ is contained in $M^{(i)}_{h,c}$.

Fig. 7. [Figure: Case α), nested cones representing Verma modules, with levels $n$ and $n + 1$ marked.]

Let us show that in case $II_+$ the Verma module $M_{h,c}$ has a unique submodule $M_{h-kl,c}$. Certainly, we know that such a submodule exists (since we have a singular vector at level $kl$). From the Kac determinant formula it follows that the module $M_{h-kl,c}$ is irreducible. Since the kernel of the form is exactly the maximal submodule, it contains our submodule
$M_{h-kl,c}$. From Property 1 of Jantzen's filtration it follows that $M_{h-kl,c} \subset M^{(1)}_{h,c}$. Comparing the formula
$$\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c}) = \mathrm{ch}(M_{h-kl,c})$$
with the fact that $\mathrm{ch}(M^{(1)}_{h,c}) \ge \mathrm{ch}(M_{h-kl,c})$ we obtain that $M^{(1)}_{h,c} = M_{h-kl,c}$ and $M^{(i)}_{h,c} = 0$ for all $i > 1$.

Now let us prove Theorems A and B for case $III_-$. The proof for other cases is the same with minor modifications. We will use the same notation as in Theorems A and B. It is useful to look at Figs. 7 and 8 for a better understanding of the proof.

First of all, let us make the following remarks. The submodule generated by the singular vectors at levels $k_1 l_1$ and $k_2 l_2$ is contained in $M^{(1)}_{h,c}$. The more important fact is that the module generated by the singular vectors at levels $k_1 l_1 + k'_1 l'_1$ and $k_1 l_1 + k'_2 l'_2$ is contained in $M^{(2)}_{h,c}$. This fact follows from Property 5 of Jantzen's filtration, since each of these two singular vectors comes along the curves $F(k_1, l_1)$ and $F(k_2, l_2)$. This immediately implies (comparing these remarks with the formula for $\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c})$) that the structure of submodules of the Verma module $M_{h,c}$ is exactly as stated in Theorems A and B up to level $\min(k_3 l_3, k_4 l_4) - 1$.

We will prove Theorem A together with Theorem B for this case (for all modules) by induction on the level. Certainly, we have the base of induction. Assume that we proved those theorems up to level $n$. Let us prove them up to level $n + 1$ for the module $M_{h,c}$.

Let us introduce two additional filtrations $F^{(i)}$ and $G^{(i)}$. $F^{(2i-1)}$ is generated by the singular vectors at levels $k_{2i-1} l_{2i-1}$ and $k_{2i} l_{2i}$, and $F^{(2i)}$ is generated by the singular vectors at levels $k_1 l_1 + k'_{2i-1} l'_{2i-1}$ and $k_1 l_1 + k'_{2i} l'_{2i}$. What we want to prove is that the filtration $F^{(i)}$ coincides with the filtration $M^{(i)}_{h,c}$.

$G^{(i)}$ is generated by the vectors $F^{(i)} \cap \bigoplus_{j=1}^{n} M^j_{h,c}$. It is important to understand how the filtration $G^{(i)}$ looks. In some sense it is a cutting of the filtration $F^{(i)}$.

Fig. 8. [Figure: Case β), $j = 1$, nested cones representing Verma modules, with levels $n$ and $n + 1$ marked.]

We know that,
up to level $n$, $G^{(i)}$ coincides with $M^{(i)}_{h,c}$. Let us also notice the following properties: $G^{(i)} \subset M^{(i)}_{h,c}$ and $G^{(i)} \subset F^{(i)}$ for any $i \in \mathbb{N}$.

We distinguish two cases.

Case α): $n + 1 \neq k_i l_i,\ k_1 l_1 + k'_i l'_i$ for all $i \in \mathbb{N}$. Then we see that the formulas for $\sum_{i=1}^{\infty} \mathrm{ch}(M^{(i)}_{h,c})$ and $\sum_{i=1}^{\infty} \mathrm{ch}(G^{(i)})$ coincide up to level $n + 1$. Together with the fact that $F^{(i)} = G^{(i)}$ up to level $n + 1$ in this case (direct check) we obtain that the filtration $F^{(i)}$ coincides with the filtration $M^{(i)}_{h,c}$ up to level $n + 1$. This proves the statements of Theorems A and B up to level $n + 1$ for $M_{h,c}$.

Case β): $n + 1 = k_i l_i$ or $k_1 l_1 + k'_i l'_i$ for some $i \in \mathbb{N}$. Then direct calculations (very easy) show that
$$\mathrm{Res}_q \Big( q^{-h+n} \Big( \sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c}) - \sum_{j=1}^{\infty} \mathrm{ch}(G^{(j)}) \Big) dq \Big) = 1.$$
Therefore, there exists $j \in \mathbb{N}$ such that $\dim((M^{(j)}_{h,c})^{n+1}/(G^{(j)})^{n+1}) = 1$. We will use the following notation:
$$\tilde i = \begin{cases} i - 1 & \text{if } n + 1 = k_i l_i \text{ and } i \text{ is odd}, \\ i - 2 & \text{if } n + 1 = k_i l_i \text{ and } i \text{ is even}, \\ i & \text{if } n + 1 = k_1 l_1 + k'_i l'_i \text{ and } i \text{ is odd}, \\ i - 1 & \text{if } n + 1 = k_1 l_1 + k'_i l'_i \text{ and } i \text{ is even}. \end{cases}$$

We distinguish three cases: first $j = 1$, second $1 < j \le \tilde i$, and third $j = \tilde i + 1$. The third case is exactly the statement of Theorem B. From the induction hypothesis one can see that $M^{(j)}_{h,c}$ is exactly as stated in the theorem (since everything is contained in $M^{(1)}_{h,c}$ and this "reduces" the level). We know that
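$$\mathrm{Res}_q \Big( q^{-h+n} \Big( \sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c}) - \sum_{j=1}^{\infty} \mathrm{ch}(G^{(j)}) \Big) dq \Big) = 1$$
and
$$\mathrm{Res}_q \Big( q^{-h+s} \Big( \sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c}) - \sum_{j=1}^{\infty} \mathrm{ch}(G^{(j)}) \Big) dq \Big) = 0 \quad \text{for } s < n.$$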
Therefore, the same argument as in the third case proves that the second case cannot occur (since everything is contained in $M^{(1)}_{h,c}$ and this "reduces" the level).

Thus, it only remains to show that the first case cannot occur. If $\tilde i = 1$ then everything follows from the remarks at the beginning of the proof. Thus, we can assume that $\tilde i > 1$. Since we assumed that $\dim((M^{(1)}_{h,c})^{n+1}/(G^{(1)})^{n+1}) = 1$, we have $G^{(s)} = M^{(s)}_{h,c}$ up to level $n + 1$ for all $s > 1$. Therefore
$$V \stackrel{\mathrm{def}}{=} M^{(\tilde i)}_{h,c}/M^{(\tilde i + 1)}_{h,c} = G^{(\tilde i)}/G^{(\tilde i + 1)}$$
up to level $n + 1$. From Property 4 of Jantzen's filtration we obtain a symmetric non-degenerate bilinear form on the module $V$. We know that (in terms of the module $M_{h,c}$) the module $V$ is generated by two singular vectors up to level $n + 1$. So up to level $n + 1$ this form is determined by its values on these two singular vectors. Let $V_1$ and $V_2$ be the two submodules of $V$ generated by the first and the second singular vectors respectively. By the induction hypothesis we know that these submodules intersect at level $n + 1$, and the intersection (at this level) is one-dimensional and is generated by the singular vector. Since $V_1$ and $V_2$ are quotients of Verma modules, we see that the form on them must coincide with the standard form (up to a constant multiple). It is zero on their intersection (since the form is zero on the maximal submodule). So we see (since $V = V_1 + V_2$ up to level $n + 1$) that the singular vector at level $n + 1$ lies in the kernel of the form. This contradicts the fact that the form is non-degenerate. We proved that the case $j = 1$ is impossible.

Thus we proved the statements of Theorems A and B for case $III_-$. Using similar arguments one can prove Theorems A and B for cases $III_+^{0}$, $III_-^{0}$ and $III_+$.

In Figs. 7 and 8 we draw the Verma modules as cones. One cone is contained in the other if the same is true for the corresponding Verma modules. By dotted lines we marked the Verma module generated by the singular vector at level $k_2 l_2$ and the quotient $V \stackrel{\mathrm{def}}{=} M^{(\tilde i)}_{h,c}/M^{(\tilde i + 1)}_{h,c} = G^{(\tilde i)}/G^{(\tilde i + 1)}$, assuming that $j = 1$. One can see the structure of this module in the figure, so it becomes clear that such a situation is impossible.
7. Completion of the Proof

7.1. General remarks and the idea of the proof

In this section we will prove Theorems A, B and C in the remaining cases. Our proof is a minor modification of the proof of Feigin and Fuchs (see [Fe-Fu 1]). First of all, let us make the following remark. If Theorem A is true up to level $n$ then Theorems B and C are also true up to level $n$. This is obvious, since we have an explicit formula for $\sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c})$ and we know the structure of submodules of the Verma module up to level $n$. This shows that there is a unique possibility for Jantzen's filtration. Therefore, Theorems B and C are true up to level $n$. The next remark is that in order to prove Theorem A up to level $n + 1$ (we assume that it is true up to level $n$) it is enough to show that in cases $III_+^{00}$ and $III_-^{00}$ the maximal submodule is generated by the singular vector at level $k_1 l_1$ up to level $n + 1$. Indeed, all submodules are contained in the submodule
generated by the singular vector at level $k_1 l_1$ up to level $n + 1$, i.e., in the Verma module $M_{h-k_1 l_1,c}$ up to level $n + 1$ with respect to the original Verma module $M_{h,c}$ (in other words, up to level $n + 1 - k_1 l_1$ with respect to $M_{h-k_1 l_1,c}$). Therefore, we can apply the induction hypothesis.

We will be proving Theorems A, B and C by induction "up to level $n$". Now we assume that they are true up to level $n$. From the previous remarks it is enough to show that in cases $III_+^{00}$ and $III_-^{00}$ the maximal submodule is generated by the singular vector at level $k_1 l_1$ up to level $n + 1$. We distinguish two cases.

α) $n + 1 \neq k_i l_i$ for all $i \in \mathbb{N}$ (or $1 \le i \le s$). Let us take some smooth curve through the point $(h, c)$ which is not tangent to any curve $F(k, l)$ at this point. An explicit formula for $\sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c})$ shows that Theorem A is true up to level $n + 1$ for the module $M_{h,c}$. Here it is.

By $I$ let us denote the set $\mathbb{N}$ for case $III_-^{00}$ and the set $\{1, \dots, s\}$ for case $III_+^{00}$. Then we distinguish two cases:
a) $c \neq 1, 25, 24h + 1$:
$$\sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c}) = 2 \Big( \sum_{i \in I} \mathrm{ch}(M_{h-k_i l_i,c}) \Big).$$
b) $c = 1$ or $25$ or $24h + 1$:
$$\sum_{j=1}^{\infty} \mathrm{ch}(M^{(j)}_{h,c}) = \sum_{i \in I} \mathrm{ch}(M_{h-k_i l_i,c}).$$
These formulas follow immediately from the Kac determinant formula and Property 3 of Jantzen's filtration.

β) $n + 1 = k_j l_j$ for some $j \in \mathbb{N}$ (or $1 \le j \le s$). Then we take $C = F(k_j, l_j)$. This curve $C$ is given by the parametric equations $h = h(t)$, $c = c(t)$ (see the formulas in Sec. 3). Corollary 3.2 shows that the module $M_{h(t),c(t)}$ has a singular vector at level $n + 1 = k_j l_j$ and this singular vector generates the Verma submodule $M_{h(t)-n-1,c(t)}$. Let
$$L(t) \stackrel{\mathrm{def}}{=} M_{h(t),c(t)}/M_{h(t)-n-1,c(t)}.$$
The contravariant form $B : M_{h(t),c(t)} \to M'_{h(t),c(t)}$ vanishes on the submodule $M_{h(t)-n-1,c(t)}$, and since it is a symmetric bilinear form, it defines a form $\tilde B : L(t) \to L(t)'$ with the same properties. So we can speak about Jantzen's filtration on $L(t)$ along this curve $C$. Unfortunately, we do not know the determinant formula for the module $L(t)$. Nevertheless, we can calculate the following sum:
$$\sum_{t \in C} \sum_{i=1}^{\infty} q^{-h(t)} \mathrm{ch}(L(t)^{(i)}).$$
This is going to be our first calculation.

The second calculation is described below. It is easy to see from the definition of Jantzen's filtration that if some submodule $N \subset M^{(k)}_{h(t_0),c(t_0)}$, then the image $\bar N$ of $N$ under the projection $M_{h(t_0),c(t_0)} \to L(t_0)$ is contained in $L(t_0)^{(k)}$, i.e., $\bar N \subset L(t_0)^{(k)}$. Since Theorems A, B and C are true up to level $n$ we have some information about Jantzen's filtration. Namely, we know that for $i < j$ the Verma module generated by the singular vector at level $k_i l_i$ is contained in $M^{(2i)}_{h(t_0),c(t_0)}$ (i.e. $M_{h(t_0)-k_i l_i,c(t_0)} \subset M^{(2i)}_{h(t_0),c(t_0)}$). Therefore, its image $\bar M_{h(t_0)-k_i l_i,c(t_0)} \subset L(t_0)^{(2i)}$. This gives us a lower bound for $\sum_{i=1}^{\infty} q^{-h(t_0)} \mathrm{ch}(L(t_0)^{(i)})$. Summing over all $t \in C$ (using the fact that in all other cases except $III_+^{00}$ and
$III_-^{00}$ we know Jantzen's filtration exactly) we obtain the estimate from below for $\sum_{t \in C} \sum_{i=1}^{\infty} q^{-h(t)} \mathrm{ch}(L(t)^{(i)})$. The remarkable fact is that these two calculations give us the same answer. Thus we know Jantzen's filtration on $L(t)$ along this curve $C$ for all $t \in C$. We immediately see that the maximal submodule in $M_{h,c}$ is generated by the singular vector at level $k_1 l_1$ up to level $n + 1$. This finishes the proof.
7.2. First calculation

This calculation is exactly the first calculation from the paper by Feigin and Fuchs ([Fe-Fu 1]). We present it here in greater detail for the sake of completeness.

Let us take the compactification of the curve $F(k_j, l_j)$, i.e. $\mathbb{CP}^1$. Then we have a trivial vector bundle over it, $M_{h(t),c(t)} = U(n_-)$, which is a direct sum of trivial bundles $U(n_-)_i$ for $i \in \mathbb{Z}_+$. We have the subbundle $N(t)$ which is generated by the singular vector $S_{k_j,l_j}(t)$ over Virasoro. Certainly, we have decompositions into direct sums,
$$M_{h(t),c(t)} = U(n_-) = \bigoplus_{i=0}^{\infty} U(n_-)_i \quad \text{and} \quad N(t) = \bigoplus_{i=k_j l_j}^{\infty} N(t)_i.$$
Let us denote by $\eta$ the line bundle $N(t)_{n+1}$ (remember that $n + 1 = k_j l_j$). It is easy to see that $N(t)_i = p(i - k_j l_j)\eta$ for all $i \in \mathbb{Z}_+$, where $p(i)$ is the partition function (we set $p(i) = 0$ for $i < 0$). We have $L(t) = M_{h(t),c(t)}/N(t)$, in particular $L(t)_i = U(n_-)_i/N(t)_i$. The determinant $\det(\tilde B_i)$ is a section of the line bundle $((\Lambda^{\dim L(t)_i} L(t)_i)^{\otimes 2})'$ (it is obvious that $\dim L(t)_i = p(i) - p(i - k_j l_j)$). This section is regular outside $0$ and $\infty$. Let us denote by $P_i(0)$ and $P_i(\infty)$ the orders of the poles of the section at zero and infinity respectively. We have the following formula:
$$\sum_{t \in C} \sum_{m=1}^{\infty} \dim(L(t)^{(m)}_i) = \mathrm{Eu}(((\Lambda^{\dim L(t)_i} L(t)_i)^{\otimes 2})') + P_i(0) + P_i(\infty),$$
where $\mathrm{Eu}(\bullet)$ denotes the Euler number of the vector bundle. This formula follows from Property 3 of Jantzen's filtration. Therefore, we must calculate the Euler number and the orders of the poles at zero and infinity of the corresponding vector bundle. Let us calculate the Euler number first.

Lemma 7.1. (see [Fe-Fu 1]) The Euler number $\mathrm{Eu}(\eta) = k_j + l_j - 2k_j l_j$.

Proof. From Theorem 3.4 we see that the line bundle $\eta$ has the section $S_{k_j,l_j}(t)$, which has no zeros and has two poles of orders $k_j(l_j - 1)$ and $l_j(k_j - 1)$. Hence $\mathrm{Eu}(\eta) = -k_j(l_j - 1) - l_j(k_j - 1) = k_j + l_j - 2k_j l_j$.

To calculate $\mathrm{Eu}(((\Lambda^{\dim L(t)_i} L(t)_i)^{\otimes 2})')$ is the same as to calculate the first Chern class, $c_1$, of this vector bundle:
$$c_1(((\Lambda^{\dim L(t)_i} L(t)_i)^{\otimes 2})') = -2c_1(\Lambda^{\dim L(t)_i} L(t)_i) = -2c_1(L(t)_i) = 2c_1(N(t)_i) = 2p(i - k_j l_j)c_1(\eta).$$
Lemma 7.2. (see [Fe-Fu 1]) The Euler number $\mathrm{Eu}(((\Lambda^{\dim L(t)_i} L(t)_i)^{\otimes 2})') = 2p(i - k_j l_j)(k_j + l_j - 2k_j l_j)$.

Proof. Obvious.
Now let us calculate the numbers $P_i(0)$ and $P_i(\infty)$. For example, near infinity we can calculate the determinant of the form $\tilde{B}_i$ as the determinant of the principal minor of the matrix of the contravariant form corresponding to the following part of the basis: $L_{-i_r} L_{-i_{r-1}} \cdots L_{-i_2} L_{-i_1} (L_{k_j})^s$, where $s < l_j$, $i_r \ge i_{r-1} \ge \dots \ge i_2 \ge i_1 \ge 1$ and $i_m \neq k_j$ for all $m$. For $t \to \infty$ we know that $h \sim \frac{1 - k_j^2}{4}\,t$ and $c \sim 6t$. One can see that the degree in $t$ of the determinant of this minor can be calculated as a sum of the degrees of its diagonal entries (other products have smaller degree). The computation of this sum is similar to the computations in Sec. 3. We must take into account only that
\[ L_i L_{-i} v_{h(t),c(t)} = \Bigl[-2i\,h(t) - \frac{i^3 - i}{12}\,c(t)\Bigr] v_{h(t),c(t)} \sim -\frac{i(i^2 - k_j^2)}{2}\,t\,v_{h(t),c(t)}, \]
which has degree 1 in $t$ if $i \neq k_j$, and that
\[ (L_{k_j})^s (L_{-k_j})^s v_{h(t),c(t)} \sim \Bigl(-\frac{k_j}{12}\Bigr)^{s} \prod_{m=1}^{s} \bigl(k_j(13k_j - 12(l_j + 2s - 2m)) - 1\bigr) \neq 0 \]
has degree 0 in $t$. So $P_i(\infty)$ equals the number of all elements not equal to $k_j$ of all partitions of $i$ which contain $k_j$ less than $l_j$ times. A similar statement is true for $P_i(0)$. We must only replace $k_j$ by $l_j$ and vice versa. We obtain the following formulas:
\[ \sum_{m=0}^{\infty} P_m(\infty)\,u^m = p(u)(1 - u^{k_j l_j})\Bigl(s(u) - \frac{u^{k_j}}{1 - u^{k_j}}\Bigr), \]
\[ \sum_{m=0}^{\infty} P_m(0)\,u^m = p(u)(1 - u^{k_j l_j})\Bigl(s(u) - \frac{u^{l_j}}{1 - u^{l_j}}\Bigr), \]
where $p(u) = \sum_{m=0}^{\infty} p(m)u^m$ and $s(u) = \sum_{m=1}^{\infty} \sum_{s=1}^{\infty} u^{ms}$. Indeed, $p(u)\,\frac{u^m}{1 - u^m}$ is a generating function for the number of all elements equal to $m$ of all partitions of a positive integer. Thus we proved the following proposition.

Proposition 1. (see [Fe-Fu 1])
\[ \sum_{t \in C} \sum_{m=1}^{\infty} q^{-h(t)}\,\mathrm{ch}(L(t)^{(m)})(q) = p(q)(1 - q^{k_j l_j})\Bigl(2s(q) - \frac{q^{l_j}}{1 - q^{l_j}} - \frac{q^{k_j}}{1 - q^{k_j}}\Bigr) - (2k_j l_j - k_j - l_j)\,p(q)\,q^{k_j l_j}. \]
Remark 2. We have n + 1 = kj lj .
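The combinatorial description of $P_i(\infty)$ above can be checked by brute force for small $i$. The following is a minimal sketch, not part of the original argument: it assumes $s(u) = \sum_{m \ge 1} u^m/(1 - u^m)$, and the helper names (`partitions`, `brute_force_P_inf`, `series_P_inf`) are hypothetical.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples of positive integers."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def brute_force_P_inf(i, k, l):
    """Count parts != k over all partitions of i that contain k fewer than l times."""
    total = 0
    for lam in partitions(i):
        if lam.count(k) < l:
            total += sum(1 for part in lam if part != k)
    return total


def series_P_inf(N, k, l):
    """Coefficients 0..N of p(u) (1 - u^{kl}) (s(u) - u^k / (1 - u^k))."""
    # p(u): ordinary partition numbers, built one allowed part at a time.
    p = [1] + [0] * N
    for part in range(1, N + 1):
        for d in range(part, N + 1):
            p[d] += p[d - part]
    # s(u) - u^k/(1 - u^k) = sum over m != k of (u^m + u^{2m} + ...).
    s = [0] * (N + 1)
    for m in range(1, N + 1):
        if m != k:
            for multiple in range(m, N + 1, m):
                s[multiple] += 1
    # Factor (1 - u^{kl}), truncated at degree N.
    factor = [1] + [0] * N
    if k * l <= N:
        factor[k * l] -= 1

    def mul(a, b):
        # Truncated product of two power series.
        c = [0] * (N + 1)
        for i, ai in enumerate(a):
            if ai:
                for j, bj in enumerate(b[: N + 1 - i]):
                    c[i + j] += ai * bj
        return c

    return mul(mul(p, factor), s)


if __name__ == "__main__":
    N = 12
    for k, l in [(1, 1), (2, 2), (2, 3), (3, 1)]:
        coeffs = series_P_inf(N, k, l)
        assert all(coeffs[i] == brute_force_P_inf(i, k, l) for i in range(N + 1))
    print("generating function agrees with brute-force count up to degree", N)
```

Swapping the roles of `k` and `l` gives the corresponding check for $P_i(0)$.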
7.3. Second calculation. This calculation is completely combinatorial and quite lengthy. It is a good exercise in combinatorics to carry it out oneself. In any case, anyone who is interested in it can find it in [Fe-Fu] Chapter 2, paragraph 1, Sec. 4. The result is the same as the right hand side in Proposition 7.3.
A. Astashkevich
7.4. Final remarks. Comparison of these two calculations finishes the proof of Theorems A, B and C. One can see that in cases III$_+^{00}$ and III$_-^{00}$ the proof uses the asymptotics of the formulas for the singular vectors. It seems interesting to me to find another proof which does not use such kinds of information.

8. On the Structure of Verma Modules over Neveu-Schwarz Algebra

8.1. Notation. Neveu-Schwarz is a Lie superalgebra with the basis $L_i$, $L_{i+\frac{1}{2}}$ and $C$, where $i \in \mathbb{Z}$, and the following commutators:
\[ [L_i, C] = 0, \qquad [L_{i+\frac{1}{2}}, C] = 0, \]
\[ [L_i, L_j] = (j - i)L_{i+j} + \delta_{-i,j}\,\frac{j^3 - j}{12}\,C, \]
\[ [L_{m+\frac{1}{2}}, L_{n+\frac{1}{2}}] = 2L_{n+m+1} + \delta_{0,m+n+1}\,\frac{4n^2 - 1}{12}\,C, \]
\[ [L_{m+\frac{1}{2}}, L_n] = \Bigl(\frac{n - 1}{2} - m\Bigr) L_{m+n+\frac{1}{2}}. \]
It is $\frac{1}{2}\mathbb{Z}$-graded: $\deg L_i = i$, $\deg L_{i+\frac{1}{2}} = i + \frac{1}{2}$ and $\deg C = 0$. Let us denote by $\mathfrak{h}$ the Lie algebra with the basis $L_0$ and $C$, by $\mathfrak{n}_-$ the Lie algebra with the basis $\{L_{-\frac{i}{2}},\ i \in \mathbb{N}\}$ and by $\mathfrak{n}_+$ the Lie algebra with the basis $\{L_{\frac{i}{2}},\ i \in \mathbb{N}\}$. We also denote by $\mathfrak{b}_+$ the Lie algebra with the basis $\{L_{\frac{i}{2}}$ and $C,\ i \in \mathbb{Z}_+\}$. All these algebras $\mathfrak{h}$, $\mathfrak{n}_-$, $\mathfrak{n}_+$ and $\mathfrak{b}_+$ are subalgebras of $NV$. We have a Cartan type decomposition of $NV$:
\[ NV = \mathfrak{n}_- \oplus \mathfrak{h} \oplus \mathfrak{n}_+, \qquad NV = \mathfrak{n}_- \oplus \mathfrak{b}_+. \]
Let $h, c \in \mathbb{C}$. Let us consider the one dimensional module $\mathbb{C}_{h,c}$ over $\mathfrak{b}_+$ such that $\mathfrak{n}_+$ acts by zero, $L_0$ is a multiplication by $h$ and $C$ is a multiplication by $c$. The Verma module $M_{h,c}$ over Neveu-Schwarz is (by definition) the induced module from $\mathbb{C}_{h,c}$,
\[ M_{h,c} = \mathrm{Ind}^{\mathrm{Vir}}_{\mathfrak{b}_+} \mathbb{C}_{h,c}. \]
We have a natural inclusion $\mathbb{C}_{h,c} \hookrightarrow M_{h,c}$. Therefore, we have a vector $v \in M_{h,c}$ corresponding to $1 \in \mathbb{C}_{h,c}$. Sometimes we denote it by $v_{h,c}$ to stress that this vector lies in $M_{h,c}$. The vector $v$ is called the vacuum vector. Let us make some remarks about the Verma modules. First, any Verma module $M_{h,c}$ is a free module over $U(\mathfrak{n}_-)$. We have the following basis in $M_{h,c}$:
\[ L_{-\frac{i_k}{2}} L_{-\frac{i_{k-1}}{2}} \cdots L_{-\frac{i_2}{2}} L_{-\frac{i_1}{2}}\, v_{h,c}, \]
where $i_k \ge i_{k-1} \ge \dots \ge i_1 \ge 1$ and $i_j$ is either odd or a multiple of 4 for any $j$. The operator $L_0$ on $M_{h,c}$ is semisimple. We can consider the eigenspace decomposition of $M_{h,c}$,
\[ M_{h,c} = \bigoplus_{i=0}^{\infty} M_{h,c}^{\frac{i}{2}}, \]
where $L_0$ acts as a multiplication by $h - \frac{i}{2}$ on $M_{h,c}^{\frac{i}{2}}$. It is easy to see that this decomposition respects the grading on $NV$. We say that a vector $w \in M_{h,c}$ has level $n$ if $w \in M_{h,c}^{n}$. We call a vector $w$ singular if it has some level $n$ ($n \in \frac{1}{2}\mathbb{Z}_+$) and $\mathfrak{n}_+$ acts by zero on this vector. It is obvious that any singular vector generates a submodule isomorphic to the Verma module. If a singular vector has level $n$ then it generates $M_{h-n,c}$. In the same way as for Virasoro one can define a contravariant Hermitian form $B(h, c)$ on the Verma module $M_{h,c}$. One can see that the spaces $M_{h,c}^{\frac{i}{2}}$, $i \in \mathbb{Z}_+$, are orthogonal for different $i$. We denote the restriction of the form $B(h, c)$ to $M_{h,c}^{\frac{i}{2}}$ by $B_{\frac{i}{2}}(h, c)$. We have the following determinant formula (see [Kac-Wa]):
\[ \det{}^2\bigl(B_{\frac{n}{2}}\bigr) = \mathrm{Const} \prod_{\substack{k, l \in \mathbb{Z}_+ \\ kl \le n \\ k = l \bmod 2}} \tilde{\Phi}_{k,l}(h, c)^{\tilde{p}\left(\frac{n - kl}{2}\right)}, \]
where
\[ \tilde{\Phi}_{k,l}(h, c) = \Bigl(h + \frac{(k^2 - 1)c}{24} + \frac{5(1 - k^2) - 4(1 - kl)}{16}\Bigr) \times \Bigl(h + \frac{(l^2 - 1)c}{24} + \frac{5(1 - l^2) - 4(1 - kl)}{16}\Bigr) + \frac{(k^2 - l^2)^2}{64}. \]
The curve $\tilde{\Phi}_{k,l}(h, c) = 0$ can be given by the following formulas:
\[ h = \frac{1 - k^2}{8}\,t + \frac{1 - kl}{4} + \frac{1 - l^2}{8}\,t^{-1}, \qquad c = 3t + \frac{15}{2} + 3t^{-1}, \]
where $t \in \mathbb{C}^*$.
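As a sanity check on the parametrization of the curve $\tilde{\Phi}_{k,l}(h, c) = 0$, one can substitute $h(t)$ and $c(t)$ into $\tilde{\Phi}_{k,l}$ using exact rational arithmetic. This is a small sketch added for illustration; the function names `phi_tilde` and `curve` are hypothetical, and only rational values of $t$ are tested.

```python
from fractions import Fraction as F


def phi_tilde(k, l, h, c):
    """The factor tilde-Phi_{k,l}(h, c) of the determinant formula."""
    first = h + F(k * k - 1, 24) * c + F(5 * (1 - k * k) - 4 * (1 - k * l), 16)
    second = h + F(l * l - 1, 24) * c + F(5 * (1 - l * l) - 4 * (1 - k * l), 16)
    return first * second + F((k * k - l * l) ** 2, 64)


def curve(k, l, t):
    """Point (h, c) of the parametrized curve for a nonzero rational t."""
    t = F(t)
    h = F(1 - k * k, 8) * t + F(1 - k * l, 4) + F(1 - l * l, 8) / t
    c = 3 * t + F(15, 2) + 3 / t
    return h, c


if __name__ == "__main__":
    for k in range(1, 6):
        for l in range(1, 6):
            for t in (F(1, 3), F(-2), F(5, 7)):
                h, c = curve(k, l, t)
                assert phi_tilde(k, l, h, c) == 0
    print("tilde-Phi vanishes on the parametrized curve for all tested (k, l, t)")
```

On the curve the first factor reduces to $(k^2 - l^2)/(8t)$ and the second to $(l^2 - k^2)t/8$, so their product cancels the term $(k^2 - l^2)^2/64$.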
In order to formulate the main result, let us make the following substitution:
\[ c = 3\,\frac{(2p + q)(2q + p)}{2pq}, \qquad h = \frac{(p + q)^2 - m^2}{8pq}. \]
Then as in the case of Virasoro we have a quadruple of lines in the plane $(k, l)$: $m = pk + ql$, $m = pl + qk$, $0 = m + pk + ql$, $0 = m + pl + qk$. Let us choose one of these lines, for example $0 = m + pk + ql$, and denote it by $l_{h,c}$.
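The role of the four lines can be illustrated numerically: for integers $k, l$ and any of the four choices of $m$, the substituted pair $(h, c)$ should be a zero of $\tilde{\Phi}_{k,l}$. The sketch below checks this with exact fractions. It is an illustration under the assumption that $\tilde{\Phi}_{k,l}$ has the form given in the determinant formula; `phi_tilde` and `hc_from_pqm` are hypothetical helper names.

```python
from fractions import Fraction as F


def phi_tilde(k, l, h, c):
    """The factor tilde-Phi_{k,l}(h, c) of the determinant formula."""
    first = h + F(k * k - 1, 24) * c + F(5 * (1 - k * k) - 4 * (1 - k * l), 16)
    second = h + F(l * l - 1, 24) * c + F(5 * (1 - l * l) - 4 * (1 - k * l), 16)
    return first * second + F((k * k - l * l) ** 2, 64)


def hc_from_pqm(p, q, m):
    """The substitution c = 3(2p+q)(2q+p)/(2pq), h = ((p+q)^2 - m^2)/(8pq)."""
    p, q, m = F(p), F(q), F(m)
    c = 3 * (2 * p + q) * (2 * q + p) / (2 * p * q)
    h = ((p + q) ** 2 - m ** 2) / (8 * p * q)
    return h, c


def on_some_line(p, q, m, k, l):
    """True if (k, l) lies on one of the four lines attached to (p, q, m)."""
    return m in (p * k + q * l, p * l + q * k, -(p * k + q * l), -(p * l + q * k))


if __name__ == "__main__":
    p, q = F(3), F(-5, 2)  # arbitrary nonzero rational test values
    for k in range(1, 5):
        for l in range(1, 5):
            for m in (p * k + q * l, p * l + q * k, -(p * k + q * l), -(p * l + q * k)):
                h, c = hc_from_pqm(p, q, m)
                assert on_some_line(p, q, m, k, l)
                assert phi_tilde(k, l, h, c) == 0
    print("each of the four lines gives a zero of tilde-Phi")
```

Since $h$ depends on $m$ only through $m^2$ and $\tilde{\Phi}_{k,l}$ is symmetric in $k$ and $l$, the four lines describe the same vanishing condition.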
8.2. On the structure of Verma Modules. We shall distinguish the following cases.

Case I. The line $l_{h,c}$ contains no integral points.

Case II. The line $l_{h,c}$ contains exactly one integral point $(k, l)$. We have the following subcases:
  II$_-$. The product $kl < 0$.
  II$_0$. The product $kl = 0$.
  II$_+$. We distinguish two cases: a) $k = l \bmod 2$; b) $k \neq l \bmod 2$.

Case III. The line $l_{h,c}$ contains infinitely many integral points. Let us consider two neighboring integral points on the line. Then the second point can be obtained by adding some vector $v$ to the first point. We can write the vector $v$ in coordinates $(v_1, v_2)$. The numbers $v_1$ and $v_2$ are integers and we are going to distinguish the following two cases (we denote them by A and B respectively):
  A) $v_1$ and $v_2$ are both odd;
  B) one of $v_1$ and $v_2$ is odd and the other one is even.
The numbers $v_1$ and $v_2$ cannot be even at the same time since we took neighboring points.

Subcase $c \le \frac{3}{2}$.

  III$_-^{00}$. The line $l_{h,c}$ intersects both axes at integral points.

  III$_-^{0}$. The line $l_{h,c}$ intersects only one axis at an integral point. Let $(k_1, l_1), (k_2, l_2), (k_3, l_3), \dots$ be all integral points $(k, l)$ on the line $l_{h,c}$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$, such that $kl > 0$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \mathbb{N}$.
    A) $v_1$ and $v_2$ are both odd. Then we distinguish two cases:
      α) $k_1 \neq l_1 \bmod 2$,
      β) $k_1 = l_1 \bmod 2$.
    B) One of $v_1$ and $v_2$ is odd and the other one is even. We have four cases:
      α) $k_1 = l_1$ and $k_2 = l_2 \bmod 2$,
      β) $k_1 \neq l_1$ and $k_2 \neq l_2 \bmod 2$, then $k_3 = l_3$ and $k_4 = l_4 \bmod 2$,
      γ) $k_1 \neq l_1 \bmod 2$, $k_2 = l_2$ and $k_3 = l_3 \bmod 2$, then $k_4 \neq l_4 \bmod 2$,
      δ) $k_1 = l_1$ and $k_4 = l_4 \bmod 2$, then $k_2 \neq l_2$ and $k_3 \neq l_3 \bmod 2$.

  III$_-$. The line $l_{h,c}$ intersects both axes at non-integral points. Let $(k_1, l_1), (k_2, l_2), (k_3, l_3), \dots$ be all integral points $(k, l)$ on the line $l_{h,c}$ such that $k_i = l_i \bmod 2$ for all $i \in \mathbb{N}$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$, and such that $kl > 0$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \mathbb{N}$. In this case we draw an auxiliary line, $l'_{h,c}$, parallel to $l_{h,c}$ through the point $(k_1, -l_1)$. Let $(k'_1, l'_1), (k'_2, l'_2), (k'_3, l'_3), \dots$ be all integral points $(k, l)$ on the line $l'_{h,c}$ such that $k'_i = l'_i \bmod 2$ for all $i \in \mathbb{N}$, up to the equivalence $(k, l) \sim (k'', l'')$ iff $kl = k''l''$, and such that $kl > 0$. We order them in such a way that $k'_i l'_i < k'_{i+1} l'_{i+1}$ for all $i \in \mathbb{N}$. Then it is easy to see that we have the following inequalities:
\[ k_1 l_1 < k_2 l_2 < k_1 l_1 + k'_1 l'_1 < k_1 l_1 + k'_2 l'_2 < k_3 l_3 < k_4 l_4 < k_1 l_1 + k'_3 l'_3 < k_1 l_1 + k'_4 l'_4 < k_5 l_5 < k_6 l_6 < \dots. \]
  We distinguish two cases:
    A) $v_1$ and $v_2$ are both odd. Then either both sets $\{(k_1, l_1), (k_2, l_2), (k_3, l_3), \dots\}$ and $\{(k'_1, l'_1), (k'_2, l'_2), (k'_3, l'_3), \dots\}$ are empty (subcase α) or not (subcase β).
    B) One of $v_1$ and $v_2$ is odd and the other one is even.

Subcase $c \ge \frac{27}{2}$.

  III$_+^{00}$. The line $l_{h,c}$ intersects both axes at integral points.

  III$_+^{0}$. The line $l_{h,c}$ intersects only one axis at an integral point. Let $\{(k_1, l_1), (k_2, l_2), \dots, (k_s, l_s)\}$ be all integral points $(k, l)$ on the line $l_{h,c}$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$, such that $kl > 0$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \{1, 2, \dots, s - 1\}$.
    A) $v_1$ and $v_2$ are both odd. We distinguish two cases:
      α) $k_1 \neq l_1 \bmod 2$,
      β) $k_1 = l_1 \bmod 2$.
    B) One of $v_1$ and $v_2$ is odd and the other one is even. Then we have four cases:
      α) $k_1 = l_1$ and $k_2 = l_2 \bmod 2$,
      β) $k_1 \neq l_1$ and $k_2 \neq l_2 \bmod 2$, then $k_3 = l_3$ and $k_4 = l_4 \bmod 2$,
      γ) $k_1 \neq l_1 \bmod 2$, $k_2 = l_2$ and $k_3 = l_3 \bmod 2$, then $k_4 \neq l_4 \bmod 2$,
      δ) $k_1 = l_1$ and $k_4 = l_4 \bmod 2$, then $k_2 \neq l_2$ and $k_3 \neq l_3 \bmod 2$.

  III$_+$. The line $l_{h,c}$ intersects both axes at non-integral points. Let $\{(k_1, l_1), (k_2, l_2), \dots, (k_s, l_s)\}$ be all integral points $(k, l)$ on the line $l_{h,c}$ such that $k_i = l_i \bmod 2$ for all $i \in \{1, 2, \dots, s\}$, up to the equivalence $(k, l) \sim (k', l')$ iff $kl = k'l'$, and such that $kl > 0$. We order them in such a way that $k_i l_i < k_{i+1} l_{i+1}$ for all $i \in \{1, 2, \dots, s - 1\}$. In this case we draw an auxiliary line, $l'_{h,c}$, parallel to $l_{h,c}$ through the point $(k_1, -l_1)$. Let $(k'_1, l'_1), (k'_2, l'_2), \dots, (k'_s, l'_s)$ be all integral points $(k, l)$ on the line $l'_{h,c}$ such that $k'_i = l'_i \bmod 2$ for all $i \in \{1, 2, \dots, s - 1\}$, up to the equivalence $(k, l) \sim (k'', l'')$ iff $kl = k''l''$, and such that $kl > 0$. We order them in such a way that $k'_i l'_i < k'_{i+1} l'_{i+1}$ for $i \in \{1, 2, \dots, s - 1\}$. Then it is easy to see that we have the following inequalities:
\[ k_1 l_1 < k_2 l_2 < k_1 l_1 + k'_1 l'_1 < k_1 l_1 + k'_2 l'_2 < k_3 l_3 < k_4 l_4 < k_1 l_1 + k'_3 l'_3 < k_1 l_1 + k'_4 l'_4 < k_5 l_5 < k_6 l_6 < \dots. \]
  We distinguish two cases:
    A) $v_1$ and $v_2$ are both odd. Then either both sets $\{(k_1, l_1), (k_2, l_2), (k_3, l_3), \dots\}$ and $\{(k'_1, l'_1), (k'_2, l'_2), (k'_3, l'_3), \dots\}$ are empty (subcase α) or not (subcase β).
    B) One of $v_1$ and $v_2$ is odd and the other one is even.

Theorem D.
a) In cases I, II, III$_-$, III$_-^{0}$, III$_+$ and III$_+^{0}$ all submodules of the Verma module are generated by singular vectors.
b) i) In cases I, II$_-$, II$_0$, II$_+$.a), III$_-^{0}$.A)α), III$_+^{0}$.A)α), III$_-$.A)α) and III$_+$.A)α) the Verma module is irreducible.
   ii) In case II$_+$.b) the Verma module $M_{h,c}$ has a unique submodule generated by the singular vector at level $\frac{kl}{2}$. This submodule is isomorphic to the Verma module $M_{h - \frac{kl}{2},c}$, which is irreducible (case II$_-$).
   iii) In cases III$_-^{0}$.A)β), III$_-^{0}$.B) and III$_-$ we have an infinite number of singular vectors. All singular vectors and relations between them are shown in the diagrams below. Singular vectors are denoted by points with their weights indicated. An arrow or a chain of arrows from one point to another means that the second singular vector lies in the submodule generated by the first one.
   iv) In cases III$_+^{0}$.A)β), III$_+^{0}$.B) and III$_+$ we have a finite number of singular vectors (maybe zero). All singular vectors and relations between them are shown in the diagrams below. Singular vectors are denoted by points with their weights indicated. An arrow or a chain of arrows from one point to another means that the second singular vector lies in the submodule generated by the first one.

8.3. Sketch of the proof. Let $M_{h,c}$ be a Verma module with the central charge $c$ and the highest weight $h$. Then the following theorem can be proved similarly to Theorem 3.1.

Theorem 8.4. At each level $n \in \frac{1}{2}\mathbb{Z}_+$ only one singular vector $w$ can exist. If the singular vector $w$ exists then it is given by the following formula:
\[ w = (L_{-\frac{1}{2}})^{2n} v_{h,c} + \sum P^{(n)}_{i_1,\dots,i_k}(h, c)\, L_{-\frac{i_k}{2}} L_{-\frac{i_{k-1}}{2}} \cdots L_{-\frac{i_2}{2}} L_{-\frac{i_1}{2}}\, v_{h,c}, \]
where the sum runs over all $i_k + \dots + i_1 = 2n$ with $i_k \ge i_{k-1} \ge \dots \ge i_1 \ge 1$, $i_k \ge 3$, and $i_j$ either odd or a multiple of 4 for any $j$. This formula defines $w$ up to multiplication by a constant. The coefficients $P^{(n)}_{i_1,\dots,i_k}(h, c)$ are polynomials in $h$ and $c$.

One can see that the proof of the structure of Verma modules in the simple cases uses only the Kac determinant formula and Theorem 3.1. We have the analog of both, i.e., Theorem 8.1 and the determinant formula. Repeating the arguments which we used in the case of Virasoro (almost word for word, with minor modifications), one can see that Theorem D is true.
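The index condition used for the basis monomials (each $i_j$ odd or a multiple of 4, repetitions allowed) can be explored by a short enumeration. The sketch below counts such index sets with total $i_1 + \dots + i_k = N$ and compares the count with a different convention that is not taken from the text: monomials in which each odd index appears at most once and even indices are unrestricted. The comparison, and the function names, are assumptions added for illustration.

```python
def count_odd_or_multiple_of_4(N):
    """Multisets of positive integers summing to N, each part odd or divisible by 4."""
    ways = [1] + [0] * N
    for part in range(1, N + 1):
        if part % 2 == 1 or part % 4 == 0:
            # Ascending update: the part may be used any number of times.
            for d in range(part, N + 1):
                ways[d] += ways[d - part]
    return ways[N]


def count_distinct_odd_any_even(N):
    """Multisets summing to N with odd parts used at most once, even parts unrestricted."""
    ways = [1] + [0] * N
    for part in range(1, N + 1):
        if part % 2 == 0:
            for d in range(part, N + 1):
                ways[d] += ways[d - part]
        else:
            # Descending update: the part may be used at most once.
            for d in range(N, part - 1, -1):
                ways[d] += ways[d - part]
    return ways[N]


if __name__ == "__main__":
    for N in range(0, 41):
        assert count_odd_or_multiple_of_4(N) == count_distinct_odd_any_even(N)
    # N = 2n indexes the level n; print the first few dimensions.
    print([count_odd_or_multiple_of_4(N) for N in range(0, 11)])
```

The two counts agree because $\prod_{i\ \mathrm{odd}} (1 + x^i) \prod_{i\ \mathrm{even}} (1 - x^i)^{-1} = \prod_{i\ \mathrm{odd}} (1 - x^i)^{-1} \prod_{4 \mid i} (1 - x^i)^{-1}$, so both conventions give the same dimension at each level.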
[Figures: diagrams of singular vectors and of the inclusions between the submodules they generate, one diagram for each of the cases III$_-^{0}$.A)β), III$_-$, III$_+^{0}$.A)β), III$_+$, III$_-^{0}$.B)α), III$_-^{0}$.B)β), III$_-^{0}$.B)γ), III$_-^{0}$.B)δ), III$_+^{0}$.B)α), III$_+^{0}$.B)β), III$_+^{0}$.B)γ) and III$_+^{0}$.B)δ). Each vertex is labelled by the weight of a singular vector: $(h, c)$ at the top, then weights of the form $(h - \frac{k_i l_i}{2}, c)$ and, in cases III$_-$ and III$_+$, also $(h - \frac{k_1 l_1}{2} - \frac{k'_i l'_i}{2}, c)$. In the cases III$_-$ the diagrams continue indefinitely; in the cases III$_+$ they are finite.]
References
[Ast-Fu] Astashkevich, A.B., Fuchs, D.B.: “Asymptotic of singular vectors in Verma modules over the Virasoro Lie algebra.” Pacific Journal of Math. 177, No. 2 (1997)
[Fe-Fu 1] Feigin, B.L., Fuchs, D.B.: “Representation of the Virasoro algebra.” Advanced Studies in Contemporary Mathematics, 7; New York: Gordon and Breach Science Publishers, 1990, 465–554
[Fe-Fu 2] Feigin, B.L., Fuchs, D.B.: “Skew-symmetric differential operator on the line and Verma modules over the Virasoro algebra.” Funct. Anal. and Appl. 16, No. 2, 47–63 (1982)
[Fre] Frenkel, E.: “Determinant formulas for the free field representations of the Virasoro and Kac-Moody algebras.” Physics Letters B 286, 71–77 (1992)
[Fu 1] Fuchs, D.B.: “Doctor’s thesis.” In Russian
[Fu 2] Fuchs, D.B.: “Cohomology of infinite-dimensional Lie algebras.” New York: Consultants Bureau, 1988
[Ja] Jantzen, J.C.: “Moduln mit einem höchsten Gewicht.” Lecture Notes in Mathematics 750, Berlin–Heidelberg–New York: Springer-Verlag
[Kac 1] Kac, V.G.: “Infinite-Dimensional Lie algebras.” Cambridge: Cambridge University Press, 1990
[Kac 2] Kac, V.G.: “Contravariant form for infinite-dimensional Lie algebras and superalgebras.” Lect. Notes in Phys. 94, 441–445 (1979)
[Kac-Kaz] Kac, V.G., Kazhdan, D.A.: “Structure of representations with highest weight of infinite-dimensional Lie algebras.” Adv. Math. 34, 97–108 (1979)
[Kac-Ra] Kac, V.G., Raina, A.K.: “Bombay lectures on highest weight representations of infinite dimensional Lie algebras.” Singapore: World Sci., 1987
[Kac-Wa] Kac, V.G., Wakimoto, M.: “Unitarizable highest weight representations of the Virasoro, Neveu-Schwarz and Ramond algebras.” In: Proceedings of the Symposium on conformal groups and structures, Clausthal, 1985, Lecture Notes in Physics 261, 345–372 (1986)

Communicated by M. Jimbo
Commun. Math. Phys. 186, 563 – 579 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
A Mathematical Construction of the Non-Abelian Chern-Simons Functional Integral
Sergio Albeverio* and Ambar Sengupta**
Fakultät und Institut für Mathematik, Ruhr-Universität Bochum, 44780 Bochum, Germany
Received: 2 April 1996 / Accepted: 15 August 1996
* BiBoS; SFB237 (Essen-Bochum-Düsseldorf); CERFIM (Locarno).
** Alexander von Humboldt Fellow; on leave from Department of Mathematics, Louisiana State University, Baton Rouge.

Abstract: We construct rigorously an infinite dimensional distribution which corresponds to the Chern-Simons (CS) functional integral associated with a principal fiber bundle over $\mathbb{R}^3$ with structure group a compact connected Lie group. We determine the ‘moments’ of the CS distribution and show that these coincide with those used in informal studies of the CS integral. A locality property of the CS distribution is proven. The complexified theory of Fröhlich and King is also discussed within our framework.

1. Introduction
The purpose of this paper is to give a rigorous meaning to normalized integrals of the form $\int_{\mathcal{A}} e^{iCS(A)} \phi(A)\,DA$, where $CS$ is the Chern-Simons action functional on the space $\mathcal{A}$ of connections on a bundle over $\mathbb{R}^3$. We realize such a normalized integral as $\Phi_{CS}(\phi)$, where $\Phi_{CS}$ is a rigorously defined element of a Sobolev-type closure of a space $L^2(\mu)$, where $\mu$ is a Gaussian measure over $\mathfrak{g} \oplus \mathfrak{g}$-valued distributions on $\mathbb{R}^3$ ($\mathfrak{g}$ being the Lie algebra of the gauge group). In this sense, $\Phi_{CS}$ is an infinite-dimensional distribution. We prove in Sect. 3 that $\Phi_{CS}$ has a certain locality property reminiscent of the Markov property of some quantum fields. The $n$-point functions (or moments) for $\Phi_{CS}$ often serve as a basis for informal arguments used in the literature concerning Chern-Simons integrals; in Sect. 4 we determine these moments rigorously, showing that they agree with the values one expects on a heuristic basis. In Sections 5 and 6 we apply our framework to the treatment of Chern-Simons theory in a complexified setting given by Fröhlich and King in [FK].

Much of the interest in Chern-Simons theory stems from the fact that certain topological invariants associated with knots can be obtained in terms of Chern-Simons functional integrals. A rigorous formulation of these functional integrals thus provides a solid basis for investigations connecting such integrals with knot invariants. In this paper we shall be content mainly with defining Chern-Simons functional integrals rigorously, the last section providing some of the tools needed to proceed to knots in this context. The relationship between Chern-Simons theory and knot theory is described (mainly at a heuristic level) in numerous works; for instance, [At, FK] and the groundbreaking work [Wi].

Infinite-dimensional distributions have been used to provide a rigorous formulation of the Feynman path integral (Chapter 12 of [HKPS] contains an exposition along with the original references, such as de Faria et al. [FPS]). In the context of Chern-Simons theory in the abelian case, rigorous formulations have been given by Schäfer [Sc] and Albeverio and Schäfer [AS1,2,3] in terms of oscillatory integrals, and in terms of white noise analysis (infinite-dimensional distribution theory) by Leukert and Schäfer [LS]. In [AS1,2,3, LS, Sc], it was shown within a rigorous framework that suitably renormalized versions of holonomies do yield link-invariants as expected from the heuristic theory.

The construction of the Chern-Simons distribution can be motivated by some ‘back-of-the-envelope’-style calculations. As we shall see in Sect. 2.1, the space of connections modulo gauge transformations can be identified with the space of $\mathfrak{g}^2$-valued functions $(a, f)$ on $\mathbb{R}^3$, and the Chern-Simons action then has the form $\langle a, f \rangle$, where $\langle \cdot, \cdot \rangle$ is an inner-product. To illustrate our procedure, let us replace for simplicity the space of connections modulo gauge transformations by $\mathbb{R} \times \mathbb{R}$ and the action by the function $(x, y) \mapsto xy$. Thus we are seeking to give meaning to normalized integrals of the form $\int e^{ixy} f(x, y)\,dx\,dy$.

In the infinite-dimensional situation there is no standard usable Lebesgue measure, Gaussian measure being used instead; with this in mind we rewrite $\int e^{ixy} f(x, y)\,dx\,dy$ in the form
\[ \langle f \rangle \overset{\mathrm{def}}{=} \int e^{ixy + |(x,y)|^2/2}\, f(x, y)\, e^{-|(x,y)|^2/2}\,dx\,dy. \]
The strategy now is to view $f \mapsto \int e^{ixy} f(x, y)\,dx\,dy$ as a distribution and realize this distribution concretely as an element of a completion of $L^2(e^{-|(x,y)|^2/2}\,dx\,dy)$. In more detail, we expand $e^{ixy + |(x,y)|^2/2}$ in orthonormal Hermite polynomials, $\sum_n c_n H_n(x, y)$, with the coefficients $c_n$ growing at a certain rate; then $\int f(x, y) e^{ixy}\,dx\,dy$ can be obtained by computing $\sum_n a_n(f) c_n$, where $a_n(f)$ are the coefficients of the Hermite expansion of $f$. Thus the distribution $f \mapsto \int e^{ixy} f(x, y)\,dx\,dy$ (for an appropriate class of test-functions $f$) is specified completely by the coefficients $c_n$, and these latter can be determined. For the Chern-Simons distribution we carry out this procedure in the infinite dimensional setting of the space of connections in place of $\mathbb{R}^2$.

A note on brackets. We adopt the following convention: Angular brackets of the form $\langle \cdot, \cdot \rangle$, with or without subscripts, always denote inner-products. Brackets of the form $(\cdot, \cdot)_k$, with subscript, always denote real or complex bilinear pairings. An expression of the form $(a, b)$, without subscript, denotes either an ordered pair of elements or the evaluation of a functional $a$ on an element $b$.
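The two-dimensional toy model can be made concrete for a single test function. The sketch below evaluates $\int e^{ixy} f(x, y)\,dx\,dy$ numerically for the Gaussian $f(x, y) = e^{-(x^2 + y^2)/2}$, for which the integral can also be done by hand: integrating over $y$ first gives $\sqrt{2\pi}\,e^{-x^2/2}$, and the remaining integral gives $\pi\sqrt{2}$. The choice of test function, the grid parameters and the name `toy_integral` are assumptions made for this illustration only; no normalization is applied.

```python
import cmath
import math


def toy_integral(f, half_width=8.0, step=0.05):
    """Trapezoidal approximation of the integral of exp(i x y) f(x, y) over a square."""
    n = int(round(2 * half_width / step))
    total = 0j
    for a in range(n + 1):
        x = -half_width + a * step
        wx = 0.5 if a in (0, n) else 1.0  # trapezoid end-point weights
        for b in range(n + 1):
            y = -half_width + b * step
            wy = 0.5 if b in (0, n) else 1.0
            total += wx * wy * cmath.exp(1j * x * y) * f(x, y)
    return total * step * step


def gaussian(x, y):
    """Test function f(x, y) = exp(-(x^2 + y^2) / 2)."""
    return math.exp(-(x * x + y * y) / 2.0)


if __name__ == "__main__":
    value = toy_integral(gaussian)
    exact = math.pi * math.sqrt(2.0)
    print(value.real, exact)
    assert abs(value - exact) < 1e-8
```

The imaginary part vanishes by symmetry, so for this test function the oscillatory factor only rescales the Gaussian integral.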
2. The CS Distribution
The goal of this section is to show that there is an infinite-dimensional distribution whose evaluation on a test-function $\phi$ corresponds to the formal Chern-Simons integral $Z_{CS}^{-1} \int_{\mathcal{A}} \phi(A) e^{iCS(A)}\,DA$, where $Z_{CS} = \int_{\mathcal{A}} e^{iCS(A)}\,DA$. To this end, we shall bring the CS action into a quadratic form by appropriate gauge choice, review the basic framework of the infinite-dimensional distribution theory provided by white noise analysis, and produce a distribution which corresponds to the CS integrand. We work with Chern-Simons over $\mathbb{R}^3$, but much of our framework is applicable to 3-manifolds of the type $\Sigma \times \mathbb{R}$, with $\Sigma$ being a 2-dimensional manifold.

2.1. The Chern-Simons action CS and a gauge choice. We shall work with connections on a principal $G$-bundle over $\mathbb{R}^3$, where $G$ is a compact (or abelian) connected Lie group with a fixed Ad-invariant inner-product $\langle \cdot, \cdot \rangle_{\mathfrak{g}}$. Using pullbacks by a fixed section of this (trivial) bundle, connections will be taken to be $\mathfrak{g}$-valued 1-forms on $\mathbb{R}^3$. The Chern-Simons action $CS$ is given on such a 1-form $A$ by:
\[ CS(A) = \frac{\kappa}{4\pi} \int_{\mathbb{R}^3} \Bigl( \langle A \wedge dA \rangle + \frac{1}{3} \langle A \wedge [A, A] \rangle \Bigr)\, d\mathrm{vol}_{\mathbb{R}^3}, \tag{2.1.1} \]
where $\kappa$ is a non-zero constant, $[A, A]$ is the 2-form whose value on a pair of vectors $(X, Y)$ is $2[A(X), A(Y)]$, and $\langle A \wedge B \rangle$, for a $\mathfrak{g}$-valued 1-form $A$ and a $\mathfrak{g}$-valued 2-form $B$, is the 3-form whose value on $(X, Y, Z)$ is given by the skew-symmetrized form of $\langle A(X), B(Y, Z) \rangle_{\mathfrak{g}}$. There is a natural constraint (gauge invariance of $e^{iCS(A)}$) which should be imposed on the constant $\kappa$, but we will not make any explicit use of this. (We shall use only (2.1.1), but general background on the Chern-Simons functional is available in the original work [CS].) Every connection over $\mathbb{R}^3$ can be gauge transformed into the form
\[ A = a_0\,dx_0 + a_1\,dx_1, \tag{2.1.2} \]
where $a_0$ and $a_1$ are functions over $\mathbb{R}^3$ taking values in the Lie algebra $\mathfrak{g}$, and
\[ a_1|(\mathbb{R}^2 \times \{0\}) = 0 \quad \text{and} \quad a_0|(\mathbb{R} \times \{(0, 0)\}) = 0. \tag{2.1.3} \]
In this gauge the Chern-Simons action $CS(A)$ loses the cubic term and, after integration by parts (assuming fast enough decay of $A$), becomes:
\[ CS(a_0, a_1) = -\frac{\kappa}{2\pi} \int_{\mathbb{R}^3} \langle a_0(x), \partial_2 a_1(x) \rangle_{\mathfrak{g}}\, d\mathrm{vol}_{\mathbb{R}^3}(x) \overset{\mathrm{def}}{=} \frac{\kappa}{2\pi} \langle a_0, -\partial_2 a_1 \rangle_{L^2(\mathbb{R}^3; \mathfrak{g})}. \tag{2.1.4} \]
Thus Chern-Simons integrals then take the form
\[ \int_{\mathcal{A}_0} \phi(a_0, a_1)\, e^{iCS(a_0, a_1)}\,Da_0\,Da_1, \tag{2.1.5} \]
where the integral is now over the space $\mathcal{A}_0$ of $(a_0, a_1)$ satisfying (2.1.3), and $\phi$ is “any” function on this space. The transformation of the ‘Lebesgue measure’ $DA$ to $Da_0\,Da_1$ works (informally of course) essentially because the gauge choice (2.1.3) involves only linear constraints.

2.2. Synopsis of the White Noise Analysis framework. In finite-dimensional analysis, Sobolev completions of $L^2(\mathbb{R}^n)$ provide useful spaces wherein “functions” of varying degrees of singular behavior, such as measures and distributions, can be realized. Analogously, in white noise analysis one completes the $L^2$ space over an infinite-dimensional space (with a Gaussian measure) with respect to analogs of Sobolev-type norms. We shall seek the Chern-Simons integral in such a framework. The basic framework of white noise analysis is quite flexible, but for now we describe one format. Consider a real separable Hilbert space $E$, with inner-product $\langle \cdot, \cdot \rangle$, and vector subspaces $E_1 \supset E_2 \supset \cdots$, each $E_p$ being a Hilbert space with inner-product $\langle \cdot, \cdot \rangle_p$ (we take the case $p = 0$ to be the same as $E$) such that: (i) $\mathcal{E} \overset{\mathrm{def}}{=} \cap_p E_p$ is dense in $E$ and in each $E_p$, (ii) $|u|_p \le |u|_q$ for every $q \ge p$ and $u \in E_q$, (iii) for every $p$, the Hilbert-Schmidt norm $\|i_{qp}\|_{HS}$ of the inclusion $i_{qp} : E_q \to E_p$ is finite for some $q \ge p$ and $\lim_{q \to \infty} \|i_{qp}\|_{HS} = 0$. Identifying $E_0$ with its dual $E_0^*$, there is the chain of spaces:
\[ \mathcal{E} \overset{\mathrm{def}}{=} \cap_p E_p \subset \cdots \subset E_2 \subset E_1 \subset E_0 = E \simeq E_0^* \subset E_{-1} \subset E_{-2} \subset \cdots \subset \mathcal{E}^* \overset{\mathrm{def}}{=} \cup_p E_p, \tag{2.2.1} \]
where $E_{-p} = E_p^*$, $p \in \mathbb{Z}$. In a typical example, one has a Hilbert-Schmidt operator $K$, with Hilbert-Schmidt norm $\|K\|_{HS} < 1$, and $E_p$ is the range $\mathrm{Im}(K^p)$, and $\langle u, v \rangle_p = \langle K^{-p}u, K^{-p}v \rangle$.

According to Minlos’ theorem there is a unique probability measure $\mu$ on the Borel $\sigma$-algebra (of the weak topology) on $\mathcal{E}^*$ such that for every $x \in \mathcal{E}$ the function $\mathcal{E}^* \to \mathbb{R} : \phi \mapsto (\phi, x) \overset{\mathrm{def}}{=} \phi(x)$ is a mean 0 Gaussian of variance $|x|_0^2$. By unitary extension, for every $x \in E$ there is a $\mu$-almost-everywhere defined mean 0, variance $|x|_0^2$, Gaussian random variable $(\cdot, x)$ on $\mathcal{E}^*$, and this extends by complex-linearity to complex Gaussians corresponding to elements $z$ of the complexification $E_{\mathbb{C}}$ of $E$. The inner-product on $E$ induces a complex bilinear form $(\cdot, \cdot)_0$, in addition to a Hermitian inner-product $\langle \cdot, \cdot \rangle_0$, on $E_{\mathbb{C}}$. Let $F_s(E_{\mathbb{C}})$ be the symmetric Fock space over $E_{\mathbb{C}}$, i.e. the Hilbert space obtained by completing the symmetric tensor algebra over $E_{\mathbb{C}}$ with respect to the inner-product given by $\langle\langle \sum_n u_n, \sum_m v_m \rangle\rangle_0 = \sum_n n! \langle u_n, v_n \rangle_0$, wherein $u_n$ and $v_n$ are $n$-tensors and, in the summand, $\langle \cdot, \cdot \rangle_0$ also denotes the inner-product on $n$-tensors induced by the inner-product on $E_{\mathbb{C}}$. The Hermite-Itô-Segal unitary isomorphism
\[ I : L^2(\mathcal{E}^*, \mu) \to F_s(E_{\mathbb{C}}) \tag{2.2.2a} \]
is specified by
\[ I\bigl(e^{(\cdot, z) - (z, z)_0/2}\bigr) = \mathrm{Exp}\, z \tag{2.2.2b} \]
for every $z \in E_{\mathbb{C}}$, where $(\cdot, z) : \mathcal{E}^* \to \mathbb{C} : \phi \mapsto \phi(z)$ and
\[ \mathrm{Exp}\, z = 1 + z + \frac{z^{\otimes 2}}{2!} + \frac{z^{\otimes 3}}{3!} + \cdots \in F_s(E_{\mathbb{C}}). \tag{2.2.2c} \]
The inner-products $\langle \cdot, \cdot \rangle_p$, restricted (for $p < 0$) to $E_{\mathbb{C}}$, produce inner-products $\langle\langle \cdot, \cdot \rangle\rangle_p$ on $F_s(E_{\mathbb{C}})$ in the usual way and these transfer by $I$ to inner-products, also denoted $\langle\langle \cdot, \cdot \rangle\rangle_p$, on $L^2(\mathcal{E}^*, \mu)$. These correspond to the finite-dimensional Sobolev norms. A white noise distribution over $\mathcal{E}^*$ is any element of the completion $[E_{-p}]$ of $L^2(\mathcal{E}^*, \mu)$ with respect to the dual norm $\langle\langle \cdot, \cdot \rangle\rangle_{-p}$, for any integer $p \ge 0$. We shall realize the Chern-Simons integrand as such a distribution for an appropriate choice of the spaces $E_p$. Let us note the chain of inclusions of Hilbert spaces
\[ [\mathcal{E}] \subset \cdots \subset [E_2] \subset [E_1] \subset [E_0] = L^2(\mathcal{E}^*, \mu) \simeq [E_{-0}] \subset [E_{-1}] \subset \cdots \subset [\mathcal{E}]^*, \tag{2.2.3a} \]
where
\[ [\mathcal{E}] \overset{\mathrm{def}}{=} \cap_p [E_p], \quad \text{and} \quad [\mathcal{E}]^* \overset{\mathrm{def}}{=} \cup_p [E_{-p}]. \tag{2.2.3b} \]
An element of $[E_p]$ is of the form $I^{-1}(z_0 + z_1 + \cdots)$, where each $z_n \in (E_p)_{\mathbb{C}}^{\otimes n}$ and the norm-squared $\sum_{n=0}^{\infty} n!\,|z_n|_p^2 < \infty$, where $|\cdot|_p$ here is the norm on the tensor power $(E_p)_{\mathbb{C}}^{\otimes n}$ induced by $\langle \cdot, \cdot \rangle_p$ on $E_p$.
Elements of $[\mathcal{E}]$ are taken to be the test functions over $\mathcal{E}^*$, and $[\mathcal{E}]^*$ is the corresponding space of distributions. The topology on $[\mathcal{E}]$ is the smallest one which makes the inclusions $[\mathcal{E}] \subset [E_p]$ continuous. Then $[\mathcal{E}]^*$ is identifiable as the dual of $[\mathcal{E}]$. A sequence of elements $\phi_n \in [\mathcal{E}]$ converges in $[\mathcal{E}]$ if there is an element $\phi \in [\mathcal{E}]$ such that $\|\phi_n - \phi\|_p \to 0$, as $n \to \infty$, for every $p$; this $\phi$ is then the limit in $[\mathcal{E}]$ of the sequence $(\phi_n)$. It is readily verifiable that for $z \in \mathcal{E}_{\mathbb{C}}$ the series for $\mathrm{Exp}\, z$ given by (2.2.2c) converges in $[\mathcal{E}]$. Moreover, $z \mapsto \mathrm{Exp}\, z$ is infinitely differentiable relative to any of the norms $|\cdot|_p$, and the $k$-th derivative at 0 is given by:
\[ \mathrm{Exp}^{(k)}(0)(z_1, \dots, z_k) = z_1 \widehat{\otimes} \cdots \widehat{\otimes} z_k \overset{\mathrm{def}}{=} \frac{1}{k!} \sum_{\sigma \in S_k} z_{\sigma(1)} \widehat{\otimes} \cdots \widehat{\otimes} z_{\sigma(k)}. \tag{2.2.4} \]
If $u_1, \dots, u_n$ are $\langle \cdot, \cdot \rangle_0$-orthonormal elements of $E$ then for any real $t_1, \dots, t_n$, we have
\[ e^{(\cdot,\, t_1 u_1 + \cdots + t_n u_n) - \frac{1}{2}|t_1 u_1 + \cdots + t_n u_n|_0^2} = \sum_{k_1, \dots, k_n \ge 0} \frac{t_1^{k_1} \cdots t_n^{k_n}}{k_1! \cdots k_n!}\, H_{k_1}((\cdot, u_1)) \cdots H_{k_n}((\cdot, u_n)), \tag{2.2.5a} \]
the series converging pointwise on $\mathcal{E}^*$ and in $L^2(\mathcal{E}^*, \mu)$ (i.e. in the $\|\cdot\|_0$-norm in $[\mathcal{E}]_0$), and $H_k(x)$ is the Hermite polynomial specified by
\[ e^{tx - \frac{1}{2}t^2} = \sum_{k=0}^{\infty} \frac{t^k}{k!} H_k(x). \]
On the other hand, we have
\[ \mathrm{Exp}(t_1 u_1 + \cdots + t_n u_n) = \sum_{k_1, \dots, k_n \ge 0} \frac{t_1^{k_1} \cdots t_n^{k_n}}{k_1! \cdots k_n!}\, u_1^{\otimes k_1} \widehat{\otimes} \cdots \widehat{\otimes} u_n^{\otimes k_n}, \tag{2.2.5b} \]
the series converging in the Fock space $F_s(E)$ (i.e. in the $\|\cdot\|_0$-norm in $F_s(E)$). Comparing (2.2.5a, b) one has
\[ I\bigl(H_{k_1}((\cdot, u_1)) \cdots H_{k_n}((\cdot, u_n))\bigr) = u_1^{\otimes k_1} \widehat{\otimes} \cdots \widehat{\otimes} u_n^{\otimes k_n}, \tag{2.2.6} \]
valid for $\langle \cdot, \cdot \rangle_0$-orthonormal $u_1, \dots, u_n$. Combining (2.2.6) with the derivative relation (2.2.4), and noting that $I$ is, by definition of the norms, a unitary isomorphism between $[\mathcal{E}]_p$ and $F_s((E_p)_{\mathbb{C}})$, we conclude that for any $\langle \cdot, \cdot \rangle_0$-orthonormal $u_1, \dots, u_n \in \mathcal{E}$, the functions $H_{k_1}((\cdot, u_1)) \cdots H_{k_n}((\cdot, u_n))$ are in the $[\mathcal{E}]$-closure of the linear span of the set of elements of the form $e^{(\cdot, x) - \frac{1}{2}(x, x)_0}$ with $x$ running over the linear span of $u_1, \dots, u_n$. Since Hermite polynomials form a basis for all polynomials we conclude, by using Gram-Schmidt orthogonalization, that if $x_1, \dots, x_n \in \mathcal{E}$ then:
\[ \text{every polynomial in the variables } (\cdot, x_1), \dots, (\cdot, x_n) \text{ belongs to the } [\mathcal{E}]\text{-closure of the linear span of } \Bigl\{ e^{(\cdot,\, \sum_{j=1}^{n} t_j x_j) - \frac{1}{2}|\sum_{j=1}^{n} t_j x_j|_0^2} ;\ t_1, \dots, t_n \in \mathbb{R} \Bigr\}. \tag{2.2.7} \]
Since, as already noted, the series for $\mathrm{Exp}\, x$ converges in all $\|\cdot\|_p$-norms we see that the $[\mathcal{E}]$-closure of the set of polynomials in $(\cdot, x_1), \dots, (\cdot, x_n)$ equals the closure of the linear span of the set of elements of the form $e^{(\cdot, v) - \frac{1}{2}(v, v)_0}$ with $v$ running over the linear span of $x_1, \dots, x_n$.
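The Hermite polynomials specified by the generating function $e^{tx - \frac{1}{2}t^2} = \sum_k \frac{t^k}{k!} H_k(x)$ are the ones satisfying the three-term recurrence $H_{k+1}(x) = xH_k(x) - kH_{k-1}(x)$ with $H_0 = 1$, $H_1 = x$. The following sketch checks the generating function numerically at a sample point; the function names, the sample values of $t$ and $x$, and the truncation at 60 terms are choices made for this illustration.

```python
import math


def hermite(k, x):
    """H_k(x) from the recurrence H_{j+1} = x H_j - j H_{j-1}, H_0 = 1, H_1 = x."""
    h_prev, h_curr = 1.0, x
    if k == 0:
        return h_prev
    for j in range(1, k):
        h_prev, h_curr = h_curr, x * h_curr - j * h_prev
    return h_curr


def generating_sum(t, x, terms=60):
    """Partial sum of sum_k t^k / k! * H_k(x)."""
    return sum(t ** k / math.factorial(k) * hermite(k, x) for k in range(terms))


if __name__ == "__main__":
    t, x = 0.7, 1.3
    lhs = math.exp(t * x - 0.5 * t * t)
    rhs = generating_sum(t, x)
    print(lhs, rhs)
    assert abs(lhs - rhs) < 1e-10
```

The first few polynomials produced this way are $1$, $x$, $x^2 - 1$ and $x^3 - 3x$, which are orthogonal with respect to the standard Gaussian measure.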
The space [E] of ‘test functions’ is, within the framework we shall use, an algebra under pointwise multiplication and addition; in fact, there exists a constant C and an integer r ≥ 0 such that for any large integer p and any φ, ψ ∈ [E], ||φψ||p ≤ C||φ||p+r ||ψ||q+r ,
(2.2.8)
which says, in particular, that $[E]\times[E]\to[E] : (\phi,\psi)\mapsto\phi\psi$ is continuous ([HKPS: Theorem 4.9]). A distribution is characterized conveniently by its S-transform (an extension of the Segal-Bargmann-Fock transform to distributions). If $\Phi\in[E]^*$ then $S\Phi$ is a function on $E_{\mathbb C}$:
\[ S\Phi(z) \overset{\rm def}{=} \big\langle \Phi,\ e^{(\cdot,z)-(z,z)_0/2}\big\rangle, \tag{2.2.9} \]
this being the evaluation of $\Phi$ on $e^{(\cdot,z)-(z,z)_0/2}\in[E]$. According to the Potthoff-Streit characterization theorem (Theorem 4.38 in [HKPS]), a function $\Psi : E_{\mathbb C}\to\mathbb C$ is of the form $S\Phi$ for some $\Phi\in[E]^*$ if and only if: (S1) $\Psi(w+\lambda z)$ is holomorphic in $\lambda\in\mathbb C$ for every $w,z\in E_{\mathbb C}$, and (S2) there are non-negative constants $A$, $B$ and an integer $p\ge 0$ such that $|\Psi(z)|\le A\,e^{B|z|_p^2}$ for every $z\in E_{\mathbb C}$. In fact $\Phi$ is recovered from $\Psi$ by the simple formula:
\[ \Phi(x) = \sum_{n=0}^{\infty} n!\,\Big\langle \frac{D^n\Psi(0)}{n!},\ I_n(x)\Big\rangle, \tag{2.2.10} \]
wherein the summand is the evaluation of $D^n\Psi(0) : E_{\mathbb C}^{\otimes n}\to\mathbb C$ (the $n$th derivative of $\Psi$ at 0) on $I_n(x)$, the latter being the component number $n$ of $I(x)\in F_s(E_{\mathbb C})$ (here $x\in[E]\subset L^2(E^*,\mu)$). Thus, roughly speaking, $\Phi$ is obtained by expanding $\Psi$ in a Taylor series and replacing the monomials by corresponding Hermite-monomials. The holomorphicity condition (S1) provides a Cauchy integral formula for $D^n\Psi(0)$ which, together with the growth condition (S2), leads to the appropriate convergence in (2.2.10) and hence the characterization theorem alluded to above.

Before returning to the Chern-Simons context, we wish to note again that there is a certain amount of flexibility in setting up the basic framework of white noise analysis. We summarize some of the important common elements of this framework. The basic structure consists of a triple $E\subset E_0\subset E^*$, where $E_0$ is a real Hilbert space (identified in the usual way with its dual $E_0^*$), and $E$ is a subspace which is equipped with the topology specified by a family of inner-products $\langle\cdot,\cdot\rangle_p$ (including $p=0$, the inner-product on $E_0$) satisfying the condition that for any such $p_1$ and any $\epsilon>0$, there is a $p_2\ (\ge p_1)$ such that the identity map on $E$, viewed as a map from the inner-product space $(E,\langle\cdot,\cdot\rangle_{p_2})$ to $(E,\langle\cdot,\cdot\rangle_{p_1})$, is Hilbert-Schmidt with Hilbert-Schmidt norm less than $\epsilon$. One then considers the Gaussian $L^2$-space $L^2(E^*,\mu)$ and identifies a subspace $[E]$ of test-functions on $E^*$ (these test functions do in fact have versions which are pointwise defined and continuous on $E^*$, and $[E]$ is an algebra under pointwise multiplication). This space $[E]$ is equipped with inner-products $\langle\langle\cdot,\cdot\rangle\rangle_p$ arising from those on $E$, and there is a corresponding dual space $[E]^*$ whose elements are considered to be distributions over $E^*$.
A Mathematical Construction of the Non-Abelian Chern-Simons
2.3. The spaces $E$ and $E_p$ for Chern-Simons. We shall apply the above framework with
\[ E_0 = L^2_{\rm real}(\mathbb R^3;\mathfrak g\oplus\mathfrak g) \simeq \big(L^2_{\rm real}(\mathbb R^3)\otimes\mathfrak g\big)\oplus\big(L^2_{\rm real}(\mathbb R^3)\otimes\mathfrak g\big). \tag{2.3.1} \]
Here $L^2_{\rm real}(\mathbb R^3)$ is the space of real-valued functions which are square-integrable with respect to Lebesgue measure on $\mathbb R^3$. To connect with the Chern-Simons setting, a point in $E$ should be viewed as corresponding to the pair $(a_0,-\partial_2a_1)$ in our previous notation. Thus if $(a_0,f_1)$ is a smooth pair in $E$ then $(a_0,a_1)$, where $a_1(x_0,x_1,x_2) = -\int_0^{x_2} f_1(x_0,x_1,t)\,dt$, satisfies the first constraint in (2.1.3). We shall not enforce the second constraint in (2.1.3); not imposing this constraint means that we sum over all the equal contributions of part of a gauge orbit, and this is all right since in the end we normalize. We take the operator $K$ to be the identity on $\mathfrak g\oplus\mathfrak g$ and given by $K_1^{\otimes 3}$ on $L^2_{\rm real}(\mathbb R)^{\otimes 3}\simeq L^2_{\rm real}(\mathbb R^3)$, where $K_1$ is the operator $\big(\tfrac14-\tfrac{d^2}{dx^2}+\tfrac{x^2}{4}\big)^{-1}$ on $L^2_{\rm real}(\mathbb R)$. Thus $E_p$ is the space $\{(K^pa_0,K^pf_1) : a_0,f_1\in L^2_{\rm real}(\mathbb R^3)\otimes\mathfrak g\}$ and the $\langle\cdot,\cdot\rangle_p$ inner-product is specified by the norm $\|(a_0,f_1)\|_p^2 = \|K^{-p}a_0\|^2_{L^2(\mathbb R^3;\mathfrak g)} + \|K^{-p}f_1\|^2_{L^2(\mathbb R^3;\mathfrak g)}$. The inner-product on $E_p$ (including the case $p=0$) will be denoted $\langle\cdot,\cdot\rangle_p$, and the corresponding bilinear form on $(E_p)_{\mathbb C}$ by $(\cdot,\cdot)_p$.

2.4. The distribution spaces for $(a_0,a_1)$. We can pass back to the $(a_0,a_1)$ formalism by starting with a space of $\mathfrak g^2$-valued smooth functions $(a_0,a_1)$, satisfying the constraints (2.1.3), and completing this space with the norm specified by
\[ \|(a_0,a_1)\|_0^2 = \|a_0\|^2_{L^2(\mathbb R^3;\mathfrak g)} + \|\partial_2a_1\|^2_{L^2(\mathbb R^3;\mathfrak g)}. \tag{2.4.1} \]
Then the mapping $(a_0,a_1)\mapsto(a_0,\partial_2a_1)$ extends to a unitary isomorphism of the new completed space $\tilde E_0$ with the Hilbert space $E_0$. The spaces $E_p$ may similarly be transferred to the setting of the $(a_0,a_1)$ space $\tilde E_0$.

2.5. The Chern-Simons integrator as a distribution. Instead of $(a_0,a_1)$ we shall use $(a_0,f_1)$, where $f_1=\partial_2a_1$, as coordinates on the space of connections; thus with this convention, the Chern-Simons action takes the simple form:
\[ CS(a_0,f_1) = \frac{\kappa}{2\pi}\,\langle a_0,f_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}. \tag{2.5.1} \]
A formal computation with $(j_0,j_1)\in E$ shows that
\[ \frac1N\int e^{iCS(a_0,f_1)}\,e^{((a_0,f_1),\,i(j_0,j_1))_0-(i(j_0,j_1),\,i(j_0,j_1))_0/2}\,Da_0\,Df_1 = e^{\frac{2\pi}{\kappa}\,i\,(ij_0,\,ij_1)_{L^2(\mathbb R^3;\mathfrak g)}-(i(j_0,j_1),\,i(j_0,j_1))_0/2}, \tag{2.5.2} \]
where $N$ is the normalizer $\int e^{iCS(a_0,f_1)}\,Da_0\,Df_1$. This computation, being in any case informal, is done most easily by pretending that $a_0,f_1,j_0,j_1$ were real variables. In view of (2.5.2), it is natural to make the following definition: The Chern-Simons integrator is the distribution $\Phi_{CS}$ whose S-transform is:
\[ (S\Phi_{CS})(j_0,j_1) = e^{\frac{2\pi}{\kappa}\,i\,(j_0,j_1)_{L^2(\mathbb R^3;\mathfrak g)}-((j_0,j_1),(j_0,j_1))_0/2}. \tag{2.5.3} \]
Since conditions (S1) and (S2) of Sect. 2.2 are clearly satisfied by the right side of (2.5.3), it follows that there is a unique distribution $\Phi_{CS}$ satisfying (2.5.3). The S-transform condition (2.5.3) is equivalent to:
\[ \Phi_{CS}\big(e^{(\cdot,j)}\big) = e^{\,i\frac{2\pi}{\kappa}(j_0,j_1)_{L^2(\mathbb R^3;\mathfrak g)}}, \tag{2.5.4} \]
where $j=(j_0,j_1)\in E_{\mathbb C}$ and $(\cdot,\cdot)_{L^2(\mathbb R^3;\mathfrak g)}$ is the complex-bilinear extension of the inner-product $\langle\cdot,\cdot\rangle_{L^2(\mathbb R^3;\mathfrak g)}$. In keeping with standard notation, if $\psi\in[E]$ then we shall often write $\Phi_{CS}(\psi)$ as $\langle\psi\rangle_{CS}$:
\[ \langle\psi\rangle_{CS} = \Phi_{CS}(\psi). \tag{2.5.5} \]
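3. Locality of the CS Distribution

The purpose of this section is to prove the following locality property of the Chern-Simons distribution.

Theorem 3.1. For any set $C\subset\mathbb R^3$, denote by $P_C$ the closure in $[E]$ of the set of all polynomials in $(\cdot,x)$ as $x$ runs over the elements of $E$ which vanish outside $C$. Then for any disjoint sets $A,B\subset\mathbb R^3$ and any $\psi_A\in P_A$ and $\psi_B\in P_B$,
\[ \langle\psi_A\psi_B\rangle_{CS} = \langle\psi_A\rangle_{CS}\,\langle\psi_B\rangle_{CS}. \tag{3.1.1} \]
Proof. First consider $x,y\in E$ with $x|A^c=0$ and $y|B^c=0$. Then
\[ |x+y|_0^2 = |x|_0^2+|y|_0^2, \tag{3.1.2} \]
and so, writing $x=(x_0,x_1)$, $y=(y_0,y_1)$ and using that the cross terms $\langle x_0,y_1\rangle$ and $\langle y_0,x_1\rangle$ vanish because $A$ and $B$ are disjoint,
\begin{align*}
\Phi_{CS}\Big(e^{(\cdot,x)-\frac12|x|_0^2}\,e^{(\cdot,y)-\frac12|y|_0^2}\Big)
&= \Phi_{CS}\Big(e^{(\cdot,x+y)-\frac12|x+y|_0^2}\Big) = (S\Phi_{CS})(x+y) \\
&\overset{(2.5.3)}{=} e^{\frac{2\pi}{\kappa}i\langle x_0+y_0,\,x_1+y_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}-\frac12|x+y|_0^2} \\
&= e^{\frac{2\pi}{\kappa}i\langle x_0,x_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}-\frac12|x|_0^2}\ e^{\frac{2\pi}{\kappa}i\langle y_0,y_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}-\frac12|y|_0^2} \\
&\overset{(2.5.3)}{=} \Phi_{CS}\Big(e^{(\cdot,x)-\frac12|x|_0^2}\Big)\,\Phi_{CS}\Big(e^{(\cdot,y)-\frac12|y|_0^2}\Big). \tag{3.1.3}
\end{align*}
Thus the locality relation (3.1.1) holds when $\psi_A$ is in
\[ L_A \overset{\rm def}{=} \text{the linear span of } \big\{e^{(\cdot,x)-\frac12|x|_0^2}\,;\ x\in E,\ x|A^c=0\big\} \tag{3.1.4} \]
and $\psi_B\in L_B$. From the observation in (2.2.7), the closure of $L_A$ in $[E]$ contains all polynomials in $(\cdot,x)$ with $x\in E$ and $x|A^c=0$. This, together with the continuity of multiplication stated in (2.2.8), implies that the locality relation (3.1.1) holds for every $\psi_A\in P_A$ and $\psi_B\in P_B$. $\square$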
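The factorisation step in (3.1.3) can be illustrated on a grid. The sketch below is a toy discretisation under stated assumptions: the fields are real scalar functions on a one-dimensional grid rather than $\mathfrak g$-valued functions on $\mathbb R^3$, the value of `KAPPA` is arbitrary, and the helper names are hypothetical. It evaluates the right side of (2.5.3) and compares $S(x+y)$ with $S(x)S(y)$ for disjoint and for overlapping supports.

```python
import numpy as np

KAPPA = 3.0  # arbitrary illustrative value of the coupling


def s_transform_cs(j0, j1, h):
    """Discretised right side of (2.5.3) for real grid functions j0, j1 (cell size h)."""
    pairing = h * np.dot(j0, j1)                       # <j0, j1>_{L^2}
    norm_sq = h * (np.dot(j0, j0) + np.dot(j1, j1))    # |(j0, j1)|_0^2
    return np.exp(1j * (2.0 * np.pi / KAPPA) * pairing - 0.5 * norm_sq)


n = 40
h = 1.0 / n
left = (np.arange(n) < n // 2).astype(float)   # indicator of the set A
right = 1.0 - left                             # indicator of the disjoint set B

x0, x1 = 1.0 * left, 0.5 * left                # x vanishes outside A
y0, y1 = 0.8 * right, -0.3 * right             # y vanishes outside B
w0, w1 = 0.8 * np.ones(n), -0.3 * np.ones(n)   # w overlaps the support of x

disjoint_defect = abs(s_transform_cs(x0 + y0, x1 + y1, h)
                      - s_transform_cs(x0, x1, h) * s_transform_cs(y0, y1, h))
overlap_ratio = (s_transform_cs(x0 + w0, x1 + w1, h)
                 / (s_transform_cs(x0, x1, h) * s_transform_cs(w0, w1, h)))
print(disjoint_defect, abs(overlap_ratio - 1.0))
```

For disjoint supports the defect is at rounding level, while for overlapping supports the cross terms in both the pairing and the norm make the ratio differ from 1.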
It should be kept in mind that the above result is in terms of our framework involving the space of pairs $(a_0,f_1)$; a reformulation can be made in terms of the pairs $(a_0,a_1)$ discussed in Sect. 2.4. General examples of elements of the classes $P_A$ used in Theorem 3.1 can be obtained using a result from Kubo and Kuo [KK]. Denote a multi-index $(k_1,\dots,k_n)$, with each $k_j$ a non-negative integer, by $k$, and then denote by $|k|$ the sum $k_1+\cdots+k_n$; denote by $H_k$ the Hermite polynomial on $\mathbb R^n$ given by $H_{k_1}(x_1)\cdots H_{k_n}(x_n)$. If $f$ is a function on $\mathbb R^n$ whose Hermite expansion $\sum_k c_kH_k$ has coefficients satisfying $\sum_k |k|!\,e^{t|k|}\,|c_k|^2<\infty$ for every $t>0$, then for any $\langle\cdot,\cdot\rangle_0$-orthonormal vectors $u_1,\dots,u_n$ in $E$, the function $f((\cdot,u_1),\dots,(\cdot,u_n))$ belongs to $[E]$, and hence to $P_A$ if each $u_i$ vanishes outside $A$.

4. n-Point Functions of the Chern-Simons Distribution

Often the informal Chern-Simons functional integral is taken to be specified through the 'n-point functions'
\[ \frac{1}{Z_{CS}}\int_{\mathcal A} a_0(p_1)\cdots a_0(p_n)\,a_1(p_1')\cdots a_1(p_m')\,e^{iCS(A)}\,DA, \]
where $A=a_0\,dx_0+a_1\,dx_1$, as before, and $p_1,\dots,p_n,p_1',\dots,p_m'$ are arbitrary points in $\mathbb R^3$. The goal of this section is to determine these "n-point functions", or "moments", of the distribution $\Phi_{CS}$ rigorously in our setting (in particular we shall work with $f_1=-\partial_2a_1$ rather than $a_1$, and with smeared values of the fields $a_0$ and $f_1$ instead of pointwise values). The values we determine coincide with the values obtained by standard heuristic reasoning. Actually we shall start with the 'moment generating function' $\Phi_{CS}\big(e^{(\cdot,t_1u_1+\cdots+t_nu_n)}\big)$ as determining the moments, instead of the polynomial form $\Phi_{CS}\big((\cdot,u_1)^{k_1}\cdots(\cdot,u_n)^{k_n}\big)$.

4.1. Interpretation of certain integrals. Let $X$ be a finite dimensional real vector space with an inner-product $\langle\cdot,\cdot\rangle_0$, and let $B$ be the quadratic form on $X^2$ given by $B(x,y)=-i\frac{\kappa}{2\pi}\langle x_0,y_1\rangle_0$, where $x=(x_0,x_1)$, $y=(y_0,y_1)\in X^2$. Let $B_\epsilon$ be non-degenerate quadratic forms on $X^2$ such that $\mathrm{Re}(B_\epsilon)>0$ and $\lim_{\epsilon\to0}B_\epsilon=B$. Then for an exponential function $\phi$ on $X^2$ we define
\[ \frac{\int e^{-B(x,x)}\phi(x)\,dx}{\int e^{-B(x,x)}\,dx} \overset{\rm def}{=} \lim_{\epsilon\to0}\frac{\int e^{-B_\epsilon(x,x)}\phi(x)\,dx}{\int e^{-B_\epsilon(x,x)}\,dx}. \tag{4.1.1} \]
As we shall see in (4.2.4) in the proof below, if $\lambda_1,\dots,\lambda_n$ are linear functionals on $X^2$ then the normalized integral
\[ \frac{\int e^{-B(x,x)}\,e^{(t_1\lambda_1+\cdots+t_n\lambda_n)(x)}\,dx}{\int e^{-B(x,x)}\,dx} \]
is the exponential of a quadratic function of the $t_j$ and can therefore be expanded in a convergent power series in these $t_j$; the coefficient of $\frac{t_1^{k_1}\cdots t_n^{k_n}}{k_1!\cdots k_n!}$ will be taken to be
\[ \frac{\int e^{-B(x,x)}\,(\lambda_1,x)^{k_1}\cdots(\lambda_n,x)^{k_n}\,dx}{\int e^{-B(x,x)}\,dx}. \tag{4.1.2} \]
(The integral in (4.1.2) is not defined directly by (4.1.1).) For the following result on moments of $\Phi_{CS}$, recall that each element of $E$ is of the form $(a_0,f_1)$ where $a_0$ and $f_1$ are smooth $\mathfrak g$-valued functions on $\mathbb R^3$ and that, as in (2.2.1), $E$ is identified as a subset of $E^*$ by means of the inner-product $\langle\cdot,\cdot\rangle_0$.
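The regularised definition (4.1.1) can be explored numerically in the smallest case $X=\mathbb R$. The sketch below rests on several assumptions made only for illustration: the regularisation is taken to be $B_\epsilon=B+\epsilon I$, the value of `KAPPA` is chosen so that $\kappa/2\pi=1$, and the function names are hypothetical. It compares a grid approximation of the normalised integral with the closed form $e^{\frac14(v,B_\epsilon^{-1}v)}$ that appears as (4.2.3), and then checks that this closed form approaches $e^{i\frac{2\pi}{\kappa}v_0v_1}$, the value in (4.2.4), as $\epsilon\to0$.

```python
import numpy as np

KAPPA = 2.0 * np.pi  # illustrative choice making kappa/(2*pi) equal to 1


def b_matrix(eps):
    """Matrix of B_eps = B + eps*I on R^2, where B(x,x) = -i*(kappa/2pi)*x0*x1."""
    c = KAPPA / (2.0 * np.pi)
    return c * np.array([[0.0, -0.5j], [-0.5j, 0.0]]) + eps * np.eye(2)


def closed_form(v, eps):
    """exp((1/4) v^T B_eps^{-1} v), the right side of (4.2.3)."""
    return np.exp(0.25 * (v @ np.linalg.solve(b_matrix(eps), v)))


def quadrature(v, eps, half_width=8.0, n=401):
    """Normalised integral on the left of (4.2.3), approximated by a uniform-grid sum."""
    g = np.linspace(-half_width, half_width, n)
    x0, x1 = np.meshgrid(g, g, indexing="ij")
    B = b_matrix(eps)
    quad_form = B[0, 0] * x0 * x0 + 2.0 * B[0, 1] * x0 * x1 + B[1, 1] * x1 * x1
    weight = np.exp(-quad_form)
    return np.sum(weight * np.exp(v[0] * x0 + v[1] * x1)) / np.sum(weight)


v = np.array([0.3, -0.5])
quadrature_error = abs(quadrature(v, 1.0) - closed_form(v, 1.0))
limit_value = np.exp(1j * (2.0 * np.pi / KAPPA) * v[0] * v[1])
limit_error = abs(closed_form(v, 1e-6) - limit_value)
print(quadrature_error, limit_error)
```

The quadrature is only attempted at $\epsilon=1$, where the integrand decays like a Gaussian; for small $\epsilon$ the integrand is highly oscillatory and the closed form is the practical route to the limit.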
Theorem 4.1 (Moments for $\Phi_{CS}$). Let $V$ be a finite dimensional space with $V^2\subset E$. Then
\[ \langle\phi\rangle_{CS} = \frac{\int_{V^2} e^{i\frac{\kappa}{2\pi}\langle a_0,f_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}}\,\phi((a_0,f_1))\,da_0\,df_1}{\int_{V^2} e^{i\frac{\kappa}{2\pi}\langle a_0,f_1\rangle_{L^2(\mathbb R^3;\mathfrak g)}}\,da_0\,df_1} \tag{4.2.1} \]
holds if $\phi$ is any finite linear combination of exponential functions of the form $e^{(\cdot,v)}$ with $v$ running over $V^2$, and polynomial functions in variables $(\cdot,w)$ with $w$ again running over $V^2$, and $da_0\,df_1$ is Lebesgue measure on $V^2$ corresponding to the inner-product $\langle\cdot,\cdot\rangle_0$.

Proof. Recall from (2.5.4) that if $v=(v_0,v_1)\in E_{\mathbb C}$ then
\[ \Phi_{CS}\big(e^{(\cdot,v)}\big) = e^{\,i\frac{2\pi}{\kappa}(v_0,v_1)_{L^2(\mathbb R^3;\mathfrak g)}}. \tag{4.2.2} \]
On the other hand, for any non-degenerate quadratic form $B_\epsilon$ on $V^2$ (viewed also as an operator on $V^2$ by means of the inner-product $\langle\cdot,\cdot\rangle_0$) with positive-definite real part,
\[ \frac{\int_{V^2} e^{-B_\epsilon(x,x)}\,e^{(x,v)_0}\,dx}{\int_{V^2} e^{-B_\epsilon(x,x)}\,dx} = e^{\frac14(v,\,B_\epsilon^{-1}v)_{L^2(\mathbb R^3;\mathfrak g)}}. \tag{4.2.3} \]
Taking $\lim_{\epsilon\downarrow0}B_\epsilon = B = \frac{\kappa}{2\pi}\begin{pmatrix}0&-\frac i2\\-\frac i2&0\end{pmatrix}$, as a matrix on $V\oplus V$, we then have
\[ \frac{\int_{V^2} e^{-B(x,x)}\,e^{(x,v)_0}\,dx}{\int_{V^2} e^{-B(x,x)}\,dx} = e^{\,i\frac{2\pi}{\kappa}(v_0,v_1)_0}. \tag{4.2.4} \]
Thus (4.2.1) holds when $\phi$ is itself an exponential function, and hence also if $\phi$ is a finite linear combination of such functions. Now choosing an orthonormal basis $u_1,\dots,u_n$ for $V$, for any real $t_1,\dots,t_n$ we have the $[E]$-convergent expansion
\[ e^{(\cdot,\,t_1u_1+\cdots+t_nu_n)-\frac12|t_1u_1+\cdots+t_nu_n|_0^2} = \sum_{k_1,\dots,k_n\ge0}\frac{t_1^{k_1}\cdots t_n^{k_n}}{k_1!\cdots k_n!}\,H_{k_1}((\cdot,u_1))\cdots H_{k_n}((\cdot,u_n)), \]
and hence, applying $\Phi_{CS}$ we see that the definition in (4.1.2) is meaningful, and also that the desired relation (4.2.1) holds if $\phi$ is a polynomial of the form $H_{k_1}((\cdot,u_1))\cdots H_{k_n}((\cdot,u_n))$. Since the Hermite polynomials form a vector-space basis for all polynomials it follows that (4.2.1) holds for all polynomials in the variables $(\cdot,u_1),\dots,(\cdot,u_n)$, i.e. for all polynomials $\phi$ on $V^2$. $\square$

5. The Fröhlich–King Setting

5.1. The complexified setting. Fröhlich and King [FK] have studied Chern-Simons theory in a setting where a pair of the coordinates are complexified; they start with a coordinate system (x0 , x1 , x2 ) which is related to our (x0 , x1 , x2 ) by : x1 = x1 + x2 and x2 = x1 − x2 . (5.1.1)
(Our $(x_0,x_1,x_2)$ correspond to $(x_0,x_+,x_-)$ in [FK].) Then the $x_2$-coordinate is made imaginary; thus:
$x_1$ becomes a complex coordinate $z$, and $x_2$ becomes the conjugate $\bar z$. A typical connection, in the gauge used earlier, can then be expressed in the form
\[ a = a_0\,dx_0 + a_1\,dz. \tag{5.1.2} \]
(In [FK], $a_1$ is denoted $\frac12A_+$.) The Chern-Simons functional becomes
\[ CS(a_0,a_1) = -i\frac{\kappa}{\pi}\int_{\mathbb R\times\mathbb C}(a_0,\partial_{\bar z}a_1)_{\mathfrak g}\,d\mathrm{vol}_{\mathbb R\times\mathbb C}, \tag{5.1.3} \]
where $\partial_{\bar z}$ is $\frac12\big(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\big)$ if $z=x+iy$, and $(\cdot,\cdot)_{\mathfrak g}$ is the complex-bilinear extension of $\langle\cdot,\cdot\rangle_{\mathfrak g}$. If $(j_0,j_1)$ is a pair of $\mathfrak g$-valued test functions on $\mathbb R\times\mathbb C$, an informal computation shows that
\[ \frac{1}{N'}\int e^{iCS(a_0,a_1)}\,e^{((a_0,a_1),\,i(j_0,j_1))_0-(i(j_0,j_1),\,i(j_0,j_1))_0/2}\,Da_0\,Da_1 = e^{-\frac{\pi}{\kappa}\left(\partial_{\bar z}^{-1}ij_0,\ ij_1\right)_{L^2(\mathbb R^3;\mathfrak g)}-(i(j_0,j_1),\,i(j_0,j_1))_0/2}, \tag{5.1.4} \]
where $N'$ is the normalizer $\int e^{iCS(a_0,a_1)}\,Da_0\,Da_1$, and
\[ \partial_{\bar z}^{-1}j_0(x_0,x_1,x_2) \overset{\rm def}{=} -\frac1\pi\int_{\mathbb R^2}\frac{j_0(x_0,x+iy)}{(x+iy)-(x_1+ix_2)}\,dx\,dy \tag{5.1.5} \]
(see, for instance, Lemma II.2.2 in [FKr]). Thus we consider the complex symmetric quadratic form $Q$ on the space $E$ of $\mathfrak g_{\mathbb C}^2$-valued test-functions on $\mathbb R\times\mathbb C$ specified by
\[ Q((j_0,j_1),(j_0,j_1)) = -\frac1\pi\int_{\mathbb R}dx_0\int_{\mathbb C^2}\frac{\langle j_0(x_0,z),\,j_1(x_0,z')\rangle_{\mathfrak g}}{z-z'}\,d\lambda(z)\,d\lambda(z'), \tag{5.1.6} \]
where $\lambda$ is Lebesgue measure on $\mathbb C=\mathbb R^2$; the convergence of the integrals on the right may be seen as follows. The inner integral can be replaced by
\[ \int_{\mathbb C^2}\frac{\langle j_0(x_0,z+w),\,j_1(x_0,z)\rangle_{\mathfrak g}}{w}\,d\lambda(w)\,d\lambda(z). \]
Since $j_0$ and $j_1$ decrease rapidly, we need only focus on a neighborhood of 0 in $\mathbb R\times\mathbb C^2$. On such a neighborhood $1/w$ is integrable, and so the integral on the right of (5.1.6) converges absolutely. As in Sect. 2.3, there is a chain of spaces
\[ E=\cap_pE_p\subset\cdots\subset E_2\subset E_1\subset E_0\subset E_{-1}\subset E_{-2}\subset\cdots\subset\cup_pE_p=E^*, \tag{5.1.7} \]
where $E_0$ is the $L^2$-space of $\mathfrak g^2$-valued functions on $\mathbb R\times\mathbb C$, and the inner-products for $E_p$ are chosen by means of a Hilbert-Schmidt operator as before. Correspondingly, there is a Gaussian $L^2$-space $L^2(E^*,\mu)$, a space $[E]$ of test-functions on $E^*$, and a corresponding dual space $[E]^*$ of distributions. It may be verified that $Q$ is continuous with respect to one of the usual Schwartz norms, and so there are constants $c$ and $p\ge0$ such that $|Q(j,j)|\le c|j|_p^2$ for all $j\in E$. Therefore, as explained in the context of (2.2.10), there is a distribution $\Phi^{FK}_{CS}\in[E]^*$ such that
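\[ S\Phi^{FK}_{CS}(j) = e^{-\frac{\pi}{\kappa}Q(j,j)-\frac12(j,j)_0}, \tag{5.1.8} \]
where $j=(j_0,j_1)\in E_{\mathbb C}$ and $S$ is the S-transform as defined in (2.2.9); equivalently,
\[ \Phi^{FK}_{CS}\big(e^{(\cdot,j)}\big) = e^{-\frac{\pi}{\kappa}Q(j,j)}. \tag{5.1.9} \]
We shall use the notation $\langle\phi\rangle_{CS}$ in the present setting to mean $\Phi^{FK}_{CS}(\phi)$:
\[ \langle\phi\rangle_{CS} \overset{\rm def}{=} \Phi^{FK}_{CS}(\phi). \tag{5.1.10} \]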
5.2. Locality. An analog of the locality result given in Theorem 3.1 holds in the complexified setting; this is a rigorous formulation of the "crucial fact" referred to by Fröhlich and King in the context of Eq. (2.2) in [FK]. As in Theorem 3.1, for $C\subset\mathbb R^3$, let $P_C$ be the closure in $[E]$ of the set of polynomials in $(\cdot,j)$ as $j$ runs over elements of $E$ which vanish outside $C$. We say that sets $A,B\subset\mathbb R\times\mathbb C$ are time-wise disjoint if there are no points of the form $(t,z)$ and $(t,z')$ with $(t,z)\in A$ and $(t,z')\in B$. Then we have the following analog of Theorem 3.1 for the present setting: If $A$ and $B$ are time-wise disjoint subsets of $\mathbb R\times\mathbb C$ then for any $\psi_A\in P_A$ and $\psi_B\in P_B$,
\[ \langle\psi_A\psi_B\rangle_{CS} = \langle\psi_A\rangle_{CS}\,\langle\psi_B\rangle_{CS}. \tag{5.2.1} \]
The proof is similar to that for Theorem 3.1. We observe that if $x,y\in E$ are such that $x|A^c=0$ and $y|B^c=0$ then
\[ |x+y|_0^2=|x|_0^2+|y|_0^2 \quad\text{and}\quad Q(x+y,x+y)=Q(x,x)+Q(y,y), \tag{5.2.2} \]
and so
\begin{align*}
\Phi^{FK}_{CS}\Big(e^{(\cdot,x)-\frac12|x|_0^2}\,e^{(\cdot,y)-\frac12|y|_0^2}\Big)
&= \Phi^{FK}_{CS}\Big(e^{(\cdot,x+y)-\frac12|x+y|_0^2}\Big) = S\Phi^{FK}_{CS}(x+y) \\
&\overset{(5.1.8)}{=} e^{-\frac{\pi}{\kappa}Q(x+y,x+y)-\frac12|x+y|_0^2} \\
&\overset{(5.2.2)}{=} e^{-\frac{\pi}{\kappa}Q(x,x)-\frac12|x|_0^2}\ e^{-\frac{\pi}{\kappa}Q(y,y)-\frac12|y|_0^2} \\
&\overset{(5.1.8)}{=} \Phi^{FK}_{CS}\Big(e^{(\cdot,x)-\frac12|x|_0^2}\Big)\,\Phi^{FK}_{CS}\Big(e^{(\cdot,y)-\frac12|y|_0^2}\Big). \tag{5.2.3}
\end{align*}
Taking linear combinations and using continuity, we obtain (5.2.1).

5.3. 2-point functions. Let $\psi_1=(\psi_{10},\psi_{11})$ and $\psi_2=(\psi_{20},\psi_{21})$ be elements of $E$. Then $\langle(\cdot,\psi_1)(\cdot,\psi_2)\rangle_{CS}$ is the coefficient of $t_1t_2$ in the expansion of $\langle e^{(\cdot,\,t_1\psi_1+t_2\psi_2)}\rangle_{CS}$. In view of the expression for $\langle e^{(\cdot,\psi)}\rangle_{CS}$ given in (5.1.9), we then have
\begin{align*}
\langle(\cdot,\psi_1)(\cdot,\psi_2)\rangle_{CS} &= -\frac{\pi}{\kappa}\,Q(\psi_1,\psi_2) \\
&\overset{(5.1.6)}{=} \frac{1}{2\kappa}\int_{(\mathbb R\times\mathbb C)^2}\frac{\delta(t-s)}{z-w}\Big\{\langle\psi_{10}(t,z),\psi_{21}(s,w)\rangle_{\mathfrak g}+\langle\psi_{20}(t,z),\psi_{11}(s,w)\rangle_{\mathfrak g}\Big\}\,dt\,d\lambda(z)\,ds\,d\lambda(w), \tag{5.3.1}
\end{align*}
where $\lambda$ is Lebesgue measure on $\mathbb C$. This can be expressed more concisely: choosing an orthonormal basis $\{e_1,\dots,e_d\}$ for $\mathfrak g$, and writing $a_0=\sum_{r=1}^d a_0^re_r$ and similarly for $a_1$, we can rewrite (5.3.1) in the form:
\[ \langle a_0^q(t,z)\,a_0^r(s,w)\rangle_{CS} = 0 = \langle a_1^q(t,z)\,a_1^r(s,w)\rangle_{CS}, \tag{5.3.2a} \]
\[ \langle a_0^q(t,z)\,a_1^r(s,w)\rangle_{CS} = \frac{1}{2\kappa}\,\delta^{qr}\,\frac{\delta(t-s)}{z-w}, \tag{5.3.2b} \]
where $q,r\in\{1,\dots,d\}$; these equations should be interpreted after 'integrating' against the functions $\psi_{ij}$. Equation (5.3.1) (or, equivalently, (5.3.2a, b)) constitutes a rigorous formulation of the two-point functions given in (2.1) of [FK].
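To see what 'integrating' the spatial kernel $1/(z-w)$ against test functions can look like, the sketch below smears it against two normalised, radially symmetric profiles centred at distinct points with disjoint supports. By the mean-value property of holomorphic functions the result equals $1/(z_1-z_2)$ for any such radial profile. The discretisation (equal-weight nodes on concentric circles), the centres and the radius are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def disk_nodes(center, eps, n_rings=8, n_angles=64):
    """Equal-weight nodes on concentric circles: a radial, unit-mass stand-in for a bump."""
    radii = eps * np.sqrt((np.arange(n_rings) + 0.5) / n_rings)
    angles = 2.0 * np.pi * np.arange(n_angles) / n_angles
    return (center + radii[:, None] * np.exp(1j * angles[None, :])).ravel()


def smeared_kernel(z1, z2, eps):
    """Average of 1/(z - w) with z near z1 and w near z2 (disks assumed disjoint)."""
    z = disk_nodes(z1, eps)
    w = disk_nodes(z2, eps)
    return np.mean(1.0 / (z[:, None] - w[None, :]))


z1, z2 = 0.3 + 0.2j, -0.5 - 0.4j      # illustrative centres with |z1 - z2| = 1
value = smeared_kernel(z1, z2, eps=0.1)
print(value, 1.0 / (z1 - z2))
```

The agreement does not depend on the radius as long as the two disks stay disjoint; when they overlap the singularity of $1/(z-w)$ is met and the smeared value no longer reduces to $1/(z_1-z_2)$.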
6. The parallel-transport equations of Fröhlich-King

Fröhlich and King [FK] have shown how consideration of parallel-transport in their complexified framework leads to the Knizhnik-Zamolodchikov equation (equation (6.4.7) below). In this section we shall show how this fits into our framework. The derivation of the Knizhnik-Zamolodchikov equation (in the complexified Chern-Simons setting) in [FK] is informal. Our discussion is rigorous except for the following inter-related technical issues which we expect can be settled: (i) the definition in Eq. (6.4.2) and the expressions in (6.4.3a, b) below require a further regularization to be rigorously meaningful; (ii) the steps (6.4.5) and (6.4.7), involving limits, have not been justified. Nevertheless, the discussion in this section shows how our framework permits a precise formulation of questions related to parallel-transport in the Chern-Simons context including, in particular, some of the considerations of [FK]. We shall take $G$ to be a matrix group.

6.1. Parallel-transport. The equation for parallel transport by a connection $(a_0,a_1,0)$ along a path $t\mapsto(t,x_1(t),x_2(t))\in\mathbb R^3$ is:
\[ \frac{du_t(a_0,a_1)}{dt} = -\big[a_0(t,x_1(t),x_2(t)) + a_1(t,x_1(t),x_2(t))\,x_1'(t)\big]\,u_t(a_0,a_1), \tag{6.1.1} \]
where $t\mapsto u_t(a_0,a_1)$ is $G$-valued, and $u_0=e$, the identity in $G$. Going over to the complexified setting we have the parallel-transport equation
\[ \frac{du_t(a_0,a_1)}{dt} = -\big[a_0(t,z(t)) + a_1(t,z(t))\,z'(t)\big]\,u_t(a_0,a_1). \tag{6.1.2} \]
This equation is rigorously meaningful if $a_0,a_1$ are smooth. In the context of our Chern-Simons theory, $(a_0,a_1)$ are in $E^*$, i.e. $a_0,a_1$ are distributions over $\mathbb R^3$, and so (6.1.2) is not rigorously meaningful in this setting. In Sect. 6.3 below, we shall consider a mathematically well-defined 'smeared' version of (6.1.2). First we rewrite the parallel-transport equation (6.1.2) in a more convenient form (as is done also in [FK]). Dropping the argument $(a_0,a_1)$, we will denote $u_t(\cdot)$ as $u(t)$. The equation of parallel-transport may be written as
\[ du(t) = -\big[dL(t)+dM(t)\big]\,u(t), \tag{6.1.3} \]
where
\[ L(t) = \int_0^t a_0(s,z(s))\,ds \quad\text{and}\quad M(t) = \int_0^t a_1(s,z(s))\,z'(s)\,ds. \tag{6.1.4} \]
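For smooth fields, the real-coordinate parallel-transport equation (6.1.1) can be integrated by a time-ordered product of short-time exponentials. The following is a minimal sketch under assumptions chosen only for illustration: $G=SU(2)$, the basis $e_r=-\frac i2\sigma_r$ of $su(2)$, a midpoint rule on each time step, and hypothetical smooth fields and path; none of these choices come from the paper.

```python
import numpy as np

# Pauli matrices and an (assumed) basis e_r = -(i/2) sigma_r of su(2)
SIGMA = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
E_BASIS = tuple(-0.5j * s for s in SIGMA)


def exp_antihermitian(X):
    """exp(X) for anti-Hermitian X, via the Hermitian eigen-decomposition of -iX."""
    w, V = np.linalg.eigh(-1j * X)
    return (V * np.exp(1j * w)) @ V.conj().T


def parallel_transport(a0, a1, x1, x2, dx1, t_final, steps):
    """Approximate u(t_final) for du/dt = -[a0 + a1*x1'(t)] u with u(0) = identity."""
    u = np.eye(2, dtype=complex)
    dt = t_final / steps
    for n in range(steps):
        t = (n + 0.5) * dt                                   # midpoint of the step
        A = a0(t, x1(t), x2(t)) + a1(t, x1(t), x2(t)) * dx1(t)
        u = exp_antihermitian(-dt * A) @ u                   # later times act on the left
    return u


# Constant field a0 = e_3, a1 = 0: the exact answer is exp(-t * e_3).
u_const = parallel_transport(
    lambda t, p, q: E_BASIS[2], lambda t, p, q: 0.0 * E_BASIS[0],
    lambda t: 0.0, lambda t: 0.0, lambda t: 0.0, 1.0, 50)

# Hypothetical non-commuting smooth fields along the path (t, t^2, t).
u_path = parallel_transport(
    lambda t, p, q: np.sin(t) * E_BASIS[0] + p * E_BASIS[1],
    lambda t, p, q: np.cos(q) * E_BASIS[2],
    lambda t: t**2, lambda t: t, lambda t: 2.0 * t, 1.0, 400)
print(np.round(u_const, 6))
```

Because every factor is the exponential of an element of $su(2)$, the approximation stays in $SU(2)$ at every step, which a plain Euler step would not guarantee.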
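6.2. Component-wise formulation. It will be convenient for our present purposes to work with the standard white-noise framework, using real-valued, rather than $\mathfrak g^2$-valued test-functions and distributions. To this end we use the standard white-noise triple
\[ S(\mathbb R^3)\subset L^2_{\rm real}(\mathbb R^3)\subset S^*(\mathbb R^3), \tag{6.2.1} \]
and the corresponding infinite-dimensional spaces
\[ [S]\subset[S]_0 \overset{\rm def}{=} L^2\big(S^*(\mathbb R^3),\nu\big)\subset[S]^*, \tag{6.2.2} \]
where $\nu$ is the Gauss-measure on $S^*(\mathbb R^3)$ (defined using the $L^2(\mathbb R^3)$-norm). To relate to our previous account,
\[ E = \big(S(\mathbb R^3)\otimes\mathfrak g\big)\oplus\big(S(\mathbb R^3)\otimes\mathfrak g\big), \tag{6.2.3a} \]
\[ E^* = \big(S^*(\mathbb R^3)\otimes\mathfrak g\big)\oplus\big(S^*(\mathbb R^3)\otimes\mathfrak g\big). \tag{6.2.3b} \]
Thus a typical element of $E^*$ can be written as $(a_0,a_1)$ with
\[ a_\alpha = \sum_{r=1}^d a_\alpha^r\,e_r, \]
wherein $\{e_1,\dots,e_d\}$ is an orthonormal basis of $\mathfrak g$ and the $a_\alpha^r$'s are ordinary distributions over $\mathbb R^3$. The specification of the distribution $\Phi^{FK}_{CS}$ requires that for any $u_0,u_1\in\mathfrak g$ and $\phi_0,\phi_1\in S(\mathbb R^3)$,
\[ \Big\langle e^{\int\langle a_0(s,z),\,\phi_0(s,z)u_0\rangle_{\mathfrak g}\,ds\,d\lambda(z)+\int\langle a_1(s,z),\,\phi_1(s,z)u_1\rangle_{\mathfrak g}\,ds\,d\lambda(z)}\Big\rangle_{CS} = e^{-\frac{\pi}{\kappa}\left\{\int_{\mathbb R\times\mathbb C^2}\frac{\phi_0(s,z)\,\phi_1(s,w)}{z-w}\,ds\,d\lambda(z)\,d\lambda(w)\right\}\langle u_0,u_1\rangle_{\mathfrak g}}. \tag{6.2.4} \]
(We are writing $\int\langle a_\alpha(s,z),\phi_\alpha(s,z)u_\alpha\rangle_{\mathfrak g}\,ds\,d\lambda(z)$ to mean the sum $\sum_{j=1}^d a_\alpha^j(\phi_\alpha)\,\langle e_j,u_\alpha\rangle_{\mathfrak g}$. The expression within $\langle\cdot\rangle_{CS}$ on the left is taken as a function of $(a_0,a_1)\in E^*$; the complete expression $\langle\cdot\rangle_{CS}$ on the left is a function of $(\phi_0u_0,\phi_1u_1)\in E$.) The 2-point functions given in (5.3.2a, b) can also be obtained from (6.2.4). For fixed $\phi\in S(\mathbb R^3)$, and $\alpha=0,1$, the integral
\[ \int_{\mathbb R}ds\int_{\mathbb C}d\lambda(z)\,a_\alpha(s,z)\,\phi(s,z), \tag{6.2.5a} \]
as a function of $a_\alpha$, is a $\mathfrak g$-valued Gaussian random variable, with mean 0 and $L^2$-norm $\|\phi\|_{L^2(\mathbb R^3)}$. By isometric extension, (6.2.5a) is meaningful, as an almost everywhere defined Gaussian random variable, for every $\phi\in L^2(\mathbb R^3)$, and hence so is
\[ \int_0^tds\int_{\mathbb C}d\lambda(z)\,a_\alpha(s,z)\,\phi(s,z). \tag{6.2.5b} \]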
It is readily seen that $a_\alpha(f)$ and $a_\alpha(h)$ are independent when $f$ and $h$ have disjoint support. Thus the integral in (6.2.5b) at $t+\Delta t$ is the sum of the integral $\int_0^t\cdots$ and the independent mean-0 random variable given by the integral $\int_t^{t+\Delta t}\cdots$. Consequently, (6.2.5b) is a martingale in its dependence on $t$.

6.3. Smeared parallel-transport. Let $t\mapsto(t,z(t))$ be a $C^\infty$ path in $\mathbb R\times\mathbb C$. For $\epsilon>0$, we choose a 2-dimensional 'bump function' $\psi_\epsilon$, centered at the origin $0\in\mathbb C$, and vanishing outside a disk of radius $\epsilon$. In view of the observations made in the context of (6.2.5), the integrals
\[ L^\epsilon(t) = \int_0^tds\int_{\mathbb C}a_0(s,z)\,\psi_\epsilon(z-z(s))\,d\lambda(z), \tag{6.3.1a} \]
\[ M^\epsilon(t) = \int_0^tds\int_{\mathbb C}a_1(s,z)\,\psi_\epsilon(z-z(s))\,z'(s)\,d\lambda(z) \tag{6.3.1b} \]
are meaningful for $\mu$-almost every $(a_0,a_1)$ and are $\mathfrak g$-valued martingales. According to standard practice, the smeared version of the parallel-transport equation (6.1.3) ought to be a Stratonovich stochastic differential equation, with $L$ and $M$ replaced by $L^\epsilon$ and $M^\epsilon$. If we were to do this and then apply the distribution $\Phi^{FK}_{CS}$ we would run into singularities. For this reason, we use the Itô version of (6.1.3) (as is done also in [FK]) and take the smeared parallel-transport equation to be the Itô stochastic differential equation:
\[ du^\epsilon(t) = -\big[dL^\epsilon(t)+dM^\epsilon(t)\big]\,u^\epsilon(t). \tag{6.3.2} \]
After application of $\Phi^{FK}_{CS}$ the choice of the Itô equation, as opposed to the Stratonovich form, appears to be a type of renormalization.

6.4. The Fröhlich-King parallel-transport equation. As mentioned before, the arguments in this section will be informal to the extent stemming from the fact that Eq. (6.4.2) below requires consideration of a further regularization to be meaningful (such a regularization would lead to modifications in (6.4.3b, c), (6.4.5), and in the interchange of limits in (6.4.7)). The solution of the smeared parallel-transport equation (6.3.2) is obtained by dividing the time domain into intervals of width $\Delta t$, taking a difference approximation to (6.3.2), and then taking the limit, in an appropriate sense, as $\Delta t\to0$. Consider parallel-transport along $n$ smooth curves $\sigma_j : t\mapsto(t,z_j(t))$, with $j=1,\dots,n$. Let $u_j^{\Delta t,\epsilon}(t)$ be the solution of the difference equation corresponding to
\[ du_j^\epsilon(t) = -\big[dL_j^\epsilon(t)+dM_j^\epsilon(t)\big]\,u_j^\epsilon(t). \tag{6.4.1} \]
We shall denote the $(p,q)$-entry of the matrix $u_j^{\Delta t,\epsilon}(t)$ by $u_j^{\Delta t,\epsilon}(t)_{p,q}$. The goal is first to obtain the "Chern-Simons expectation values" $\langle u_1^{\Delta t,\epsilon}(t)_{p_1,q_1}\cdots u_n^{\Delta t,\epsilon}(t)_{p_n,q_n}\rangle_{CS}$ (and then let $\epsilon\to0$ and $\Delta t\to0$). This information is conveniently and compactly contained in the tensor product
\[ \phi_n^{\Delta t,\epsilon}(t) \overset{\rm def}{=} \big\langle u_1^{\Delta t,\epsilon}(t)\otimes\cdots\otimes u_n^{\Delta t,\epsilon}(t)\big\rangle_{CS} \tag{6.4.2} \]
(this is an element of the $n$th tensor power of the space of matrices in which $G$ lies). We note, however, that it is not apparent that the products $u_1^{\Delta t,\epsilon}(t)_{p_1,q_1}\cdots u_n^{\Delta t,\epsilon}(t)_{p_n,q_n}$
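actually lie in the domain (suitably extended) of the distribution $\Phi^{FK}_{CS}$. A further smearing of the $u_j^{\Delta t,\epsilon}(t)$ would lead to a well-defined choice for $\phi_n^{\Delta t,\epsilon}(t)$, but we shall at this stage work with the expression in (6.4.2). In terms of an orthonormal basis $\{e_1,\dots,e_d\}$ of $\mathfrak g$, we may write
\[ L_j^\epsilon(t) = \sum_{r=1}^d L_j^{\epsilon,r}(t)\,e_r \quad\text{and}\quad M_j^\epsilon(t) = \sum_{r=1}^d M_j^{\epsilon,r}(t)\,e_r. \tag{6.4.3a} \]
Now recalling the definitions of $L^\epsilon$ and $M^\epsilon$ given in (6.3.1a, b), and using the 2-point functions of $\Phi^{FK}_{CS}$ given in (5.3.2a, b), we have
\[ \langle\Delta L_j^{\epsilon,r}(t)\,\Delta L_k^{\epsilon,q}(t)\rangle_{CS} = 0 = \langle\Delta M_j^{\epsilon,r}(t)\,\Delta M_k^{\epsilon,q}(t)\rangle_{CS} \tag{6.4.3b} \]
and
\[ \langle\Delta L_j^{\epsilon,r}(t)\,\Delta M_k^{\epsilon,q}(t)\rangle_{CS} = \frac{1}{2\kappa}\,\delta^{rq}\int_t^{t+\Delta t}ds\int_{\mathbb C^2}\frac{z_k'(s)}{z-w}\,\psi_\epsilon\big(z-z_j(s)\big)\,\psi_\epsilon\big(w-z_k(s)\big)\,d\lambda(z)\,d\lambda(w). \tag{6.4.3c} \]
Higher-order products have $\langle\cdot\rangle_{CS}$ of the order of $o(\Delta t)$. From (6.3.2) and the 2-point functions (6.4.3b), and using the locality property (5.2.1) of $\Phi^{FK}_{CS}$, we have
\[ \phi_n^{\Delta t,\epsilon}(t+\Delta t)-\phi_n^{\Delta t,\epsilon}(t) = \sum_{1\le j\ne k\le n}\ \sum_{r=1}^d\langle\Delta L_j^{\epsilon,r}(t)\,\Delta M_k^{\epsilon,r}(t)\rangle_{CS}\ \Omega^r_{jk}\,\phi_n^{\Delta t,\epsilon}(t) + o(\Delta t), \tag{6.4.4a} \]
where
\[ \Omega^r_{jk} = I\otimes\cdots\otimes\underbrace{e_r}_{j}\otimes\cdots\otimes\underbrace{e_r}_{k}\otimes\cdots\otimes I, \tag{6.4.4b} \]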
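with $\{e_1,\dots,e_d\}$ being, as before, an orthonormal basis of the Lie algebra $\mathfrak g$. At this point it will be necessary to assume that the curves $\sigma_j$ are distinct and non-intersecting, in order to avoid certain singularities. Using the 2-point functions given in (6.4.3b, c) in (6.4.4a), and letting $\Delta t\to0$ (after dividing by $\Delta t$), we obtain, formally,
\[ \frac{d}{dt}\phi_n^\epsilon(t) = \frac{1}{2\kappa}\sum_{1\le j,k\le n,\ j\ne k}\left\{\int_{\mathbb C^2}\frac{\psi_\epsilon\big(z-z_j(t)\big)\,z_k'(t)\,\psi_\epsilon\big(w-z_k(t)\big)}{z-w}\,d\lambda(z)\,d\lambda(w)\right\}\Omega_{jk}\,\phi_n^\epsilon(t), \tag{6.4.5} \]
wherein
\[ \Omega_{jk} = \sum_{r=1}^d I\otimes\cdots\otimes\underbrace{e_r}_{j}\otimes\cdots\otimes\underbrace{e_r}_{k}\otimes\cdots\otimes I \tag{6.4.6} \]
and $\phi_n^\epsilon(t)$ is, formally, $\langle u_1^\epsilon(t)\otimes\cdots\otimes u_n^\epsilon(t)\rangle_{CS}$. Letting $\epsilon\to0$, we obtain, upon formally interchanging limit and derivative,
\[ \frac{d}{dt}\phi_n(t) = -\frac{1}{2\kappa}\sum_{1\le j<k\le n}\frac{z_j'(t)-z_k'(t)}{z_j(t)-z_k(t)}\,\Omega_{jk}\,\phi_n(t), \tag{6.4.7} \]
0, 5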
By bound we mean confined to a compact set for all times.
HyperK¨ahler Quotient Construction of BPS Monopole Moduli Spaces
for all $4m$-vectors $X$, $Y$, or equivalently $V_{(a;b)}X^aY^b>0$. Along a geodesic with a tangent vector $L$ one therefore has:
\[ \frac{d}{dt}\,g(V,L) = \mathcal L_Vg(L,L) > 0. \tag{71} \]
Now if this is a bound or closed geodesic one may average over a time period $T$. The average of the left-hand side of (71) tends to zero as $T\to\infty$ while the average of the right-hand side tends to some positive constant, which is a contradiction. The existence of a distance increasing vector field can be easily demonstrated on spaces obtained by HyperKähler quotient restricting to the zero-set of the moment map$^6$. From our examples in Sect. 3 these are the Lee-Weinberg-Yi and Taubian-Calabi manifolds. The vector field $V$ is induced on $X_0$ from the following $\mathbb R^+$ action on $M=\mathbb H^m\times\mathbb H^p$:
\[ q_a\,i\,\bar q_a \to \alpha^{1/2}\,q_a\,i\,\bar q_a,\qquad \psi_a\to\psi_a,\qquad \mathrm{Re}\,w_i\to\mathrm{Re}\,w_i,\qquad \mathrm{Im}\,w_i\to\alpha\,\mathrm{Im}\,w_i,\qquad \alpha>0 \tag{72} \]
with qa = ra e−iψa /2 , a = 1, . . . , m and i = 1, . . . , p. This R+ action leaves invariant the level sets µ−1 (0) and commutes with the Rm , T m and SU (2) actions. It therefore descends to give a well-defined R+ action on µ−1 (0)/Rm which stabilises the point qa = 0 corresponding for Lee-Weinberg-Yi metric to the spherically symmetric monopole [4]. The action (72) is clearly distance increasing on M, so its restriction to µ−1 (0)/Rm is also distance increasing. Note that the argument just given is a more geometric version of the generalised Virial Theorem given earlier in [5]. We conclude this Sect. by making a remark about the integrability of the geodesic flow. Consider the Lagrangian for the configuration described by the moduli space metric (13) (see [7]). If one eliminates the conserved charges dτ drc a + ω ac . , Qa = Gab dt dt one obtains an effective Lagrangian on R3m = Xζ /T m : Gab
drc dra drb . − Gab Qa Qb + Qa ω ac . . dt dt dt
(73)
This many-body Lagrangian (73) may not look very tractable when considered on $\mathbb R^{3m}$ but in some cases it admits "hidden" symmetries which, although not apparent in $3m$ dimensions, are clearly present on the $4m$-dimensional manifold. A simple example of the phenomenon occurs in the Eguchi-Hanson manifold. On $\mathbb R^3$ there is only one manifest symmetry corresponding to rotation about the axis joining the two centres. However, the geodesic motion is completely integrable [23]. This happens because of the large isometry group, $U(2)$, which acts on the three-dimensional orbits. In fact the motion on electrically neutral geodesics, i.e. those with $Q=0$, in the ALE and ALF

6 It will be clear shortly why this argument does not apply to spaces where one cannot without loss of generality consider $\mu^{-1}(0)$.
G.W. Gibbons, P. Rychenkova, R. Goto
spaces associated to the cyclic group of order $k$ is the same as that of a light planet moving in the Newtonian gravitational field of $k$ fixed gravitating centres. The case $k=1$ corresponds to the Kepler problem and is clearly integrable. The case $k=2$ corresponds to the Euler problem and is also integrable. According to [24], when there is a plane containing the centres, the forces are attractive, and the motion of a planet with positive energy is confined to that plane, there are no analytic constants of the motion other than the energy if $k>2$. Since this case is a special case of the general motion, it strongly indicates that for $k>2$ the geodesic flow on the cyclic ALE and ALF spaces is not integrable. One might be able to use a similar argument in some other cases. However one might expect to encounter hidden symmetries in the case of the Calabi and the Taubian-Calabi metrics. As we have seen above neither the Calabi nor the Taubian-Calabi metrics look very symmetric when written out in terms of the cartesian coordinates but in fact they admit a large group of isometries whose principal orbits are of co-dimension one or two respectively. Another interesting question along the same lines is whether there are cases in which the geodesic flow admits Lagrange-Laplace-Runge-Lenz vectors as it does in the Taub-NUT case. We defer further considerations of these questions for a future publication.

6. Concluding Comments

In this paper we have presented a rather simple and elegant way to construct and analyse some known and some new HyperKähler metrics using the HyperKähler quotient. All our examples turned out to possess a tri-holomorphic torus action which considerably simplified the algebra. From the few applications that we have discussed, it is clear that this approach gives explicit answers to many interesting questions about the global properties of these manifolds. In many cases such properties are not immediately apparent from the local form of the metric. There are a number of open problems that can be explored using the HyperKähler quotient. In the future we hope to extend the method introduced in this paper further to discuss the differential forms and the spectrum of the Hodge-de Rham Laplacian, as well as the physics of the so-called massless monopoles. It would also be interesting to look at the singular metrics and understand the type of singularities that arise and how they may be resolved.

Acknowledgement. We would like to thank N. Hitchin for illuminating conversations. The research of P.R. is supported by Trinity College, Cambridge University. R.G. would also like to thank Prof. Fujiki and Prof. Nakajima for discussions and N. Hitchin for hospitality during his stay in Cambridge.
References
1. Hitchin, N.J., Karlhede, A., Lindström, U., Roček, M.: HyperKähler metrics and Supersymmetry. Commun. Math. Phys. 108, 535 (1987)
2. Lee, K., Weinberg, E.J., Yi, P.: The Moduli Space of Many BPS Monopoles. hep-th/9602167
3. Gauntlett, J.P., Lowe, D.A.: Dyons and S-duality in N = 4 Supersymmetric Gauge Theory. hep-th/9601085
4. Lee, K., Weinberg, E.J., Yi, P.: Electromagnetic Duality and SU(3) Monopoles. hep-th/9601097
5. Gibbons, G.W.: Phys. Lett. 382B, 93 (1996)
6. Intriligator, K., Seiberg, N.: Mirror Symmetry in Three Dimensional Gauge Theories. hep-th/9607207
7. Gibbons, G.W., Manton, N.S.: The moduli space metric for well-separated BPS monopoles. Phys. Lett. B356, 32 (1995)
8. Lee, K., Weinberg, E.J., Yi, P.: Massive and Massless Monopoles with Nonabelian Magnetic Charge. hep-th/9605229
9. Lindström, U., Roček, M.: Nucl. Phys. B222, 285 (1983)
10. Pedersen, H., Poon, Y.S.: Commun. Math. Phys. 117, 569 (1988)
11. Goto, R.: On Toric HyperKähler Manifolds. Kyoto University preprint RIMS-818 (1991); On HyperKähler Manifolds of Type A∞. Geom. Funct. Anal. 4(4), 424 (1994)
12. Murray, M.K.: A Note on the (1, 1, . . . , 1) Monopole Metric. hep-th/9605054
13. Connell, S.A.: The Dynamics of the SU(3) charge (1, 1) Magnetic Monopoles. University of South Australia, preprint
14. Curtright, T.L., Freedman, D.Z.: Phys. Lett. 90B, 71 (1980)
15. Dancer, A., Swann, A.: HyperKähler Metrics of Cohomogeneity One. J. Geom. Phys., to appear
16. Roček, M.: Physica 15D, 75 (1985)
17. Gibbons, G.W., Hawking, S.W.: Phys. Lett. B78(4), 430 (1978)
18. Hawking, S.W.: Phys. Lett. 60A, 81 (1977)
19. Bielawski, R.: Existence of Closed Geodesics on the Moduli Space of k-monopoles. McMaster University, preprint (1996)
20. Ward, R.S.: Phys. Lett. 158B, 424 (1985)
21. Manton, N.S.: Phys. Lett. B110, 54 (1982)
22. Benci, V., Giannoni, F.: Duke Math. J. 68(2), 195 (1992)
23. Mignemi, S.: Classical and Quantum Motion on an Eguchi-Hanson Space. J. Math. Phys. 32(11), 3047 (1991)
24. Fomenko, A.T.: Symplectic Geometry. Advanced Studies in Contemporary Mathematics 5, New York: Gordon and Breach, 1988

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 186, 601 – 648 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Null-Vectors in Integrable Field Theory

O. Babelon (1,*), D. Bernard (2,**), F. A. Smirnov (1,***)

(1) Laboratoire de Physique Théorique et Hautes Energies, Laboratoire associé au CNRS, Université Pierre et Marie Curie, Tour 16 1er étage, 4 place Jussieu, 75252 Paris cedex 05, France
(2) Service de Physique Théorique de Saclay, Laboratoire de la Direction des Sciences de la Matière du Commissariat à l'Energie Atomique, F-91191, Gif-sur-Yvette, France
Received: 1 July 1996 / Accepted: 18 October 1996
Abstract: The form factor bootstrap approach allows one to construct the space of local fields in the massive restricted sine-Gordon model. This space has to be isomorphic to that of the corresponding minimal model of conformal field theory. We describe the subspaces which correspond to the Verma modules of primary fields in terms of the commutative algebra of local integrals of motion and of a fermion (Neveu–Schwarz or Ramond depending on the particular primary field). The description of null-vectors relies on the relation between form factors and deformed hyper-elliptic integrals. The null-vectors correspond to the deformed exact forms and to the deformed Riemann bilinear identity. In the operator language, the null-vectors are created by the action of two operators $Q$ (linear in the fermion) and $C$ (quadratic in the fermion). We show that by factorizing out the null-vectors one gets the space of operators with the correct character. In the classical limit, using the operators $Q$ and $C$ we obtain a new, very compact, description of the KdV hierarchy. We also discuss a beautiful relation with the method of Whitham.
1. Introduction

In this article we present a synthesis of the ideas of the papers [1] and [2]. In the first of these papers the space of fields for the sine-Gordon model (SG) was described in terms of the form factors previously obtained in the bootstrap approach [3]. This description is based on rather special properties of the form factors for the SG model. Namely, it uses the fact that the form factors are written in terms of deformed hyper-elliptic differentials, which admit deformations of all the nice properties of the usual hyper-elliptic differentials: the notions of deformed exact forms and of the deformed Riemann bilinear identity are
* Member of the CNRS
** Member of the CNRS
*** On leave from Steklov Mathematical Institute, Fontanka 27, St. Petersburg, 191011, Russia
available for them [4]. Using these facts it has been shown in [1] that the same number of local operators can be constructed in the generic case of the sine-Gordon model as at the free fermion point. The deformed exact forms and the deformed Riemann bilinear identity are necessary in order to reduce the space of fields to the proper size, because in its original form factor description the space is too big. The description of [1] is basically independent of the coupling constant, but for rational coupling constant there is a possibility of finding additional degenerations.

The problem with the description of the space of fields obtained in [1] is that it is difficult to compare it with the description coming from Conformal Field Theory (CFT) or from the classical theory. The latter two are closely connected because the Virasoro algebra can be considered as a quantization of the second Poisson structure of KdV. In the description of [1], it is even difficult to distinguish the descendents with respect to the two chiral Virasoro algebras. On the other hand, in [2] the semi-classical limit of the form factor formulae has been understood. This opens the possibility of identifying all the local operators by their classical analogues. Using this result we decided to try to construct the module of the descendents of the primary fields with respect to the chiral Virasoro algebra. The result of this study turned out to be quite interesting.

Let us formulate more precisely the problems discussed in this paper. The sine-Gordon model is described by the action

$$S = \frac{\pi}{\gamma}\int \left( (\partial_\mu \varphi)^2 + m^2(\cos(2\varphi) - 1) \right) d^2x,$$

where $\gamma$ is the coupling constant. In the quantum theory, the relevant coupling constant is $\xi = \frac{\pi\gamma}{\pi-\gamma}$. The sine-Gordon theory contains two subalgebras of local operators which, as operator algebras, are generated by $\exp(i\varphi)$ and $\exp(-i\varphi)$ respectively. We shall consider one of them, say the one generated by $\exp(i\varphi)$.
It is known that this subalgebra can be considered independently of the rest of the operators, as the operator algebra of the theory with the modified energy-momentum tensor:

$$T^{mod}_{\mu\nu} = T^{SG}_{\mu\nu} + i\alpha\,\epsilon_{\mu,\mu'}\epsilon_{\nu,\nu'}\partial_{\mu'}\partial_{\nu'}\varphi,$$

where $\alpha = \pi\sqrt{\frac{6}{\xi(\pi+\xi)}}$. This modification changes the trace of the energy-momentum tensor, which is now $T^{mod}_{\mu\mu} = m^2\exp(2i\varphi)$. This modified energy-momentum tensor corresponds to the restricted sine-Gordon theory (RSG). For rational $\frac{\pi}{\xi}$, the RSG model describes the $\Phi_{[1,3]}$-perturbations of the minimal models of CFT. In this paper we consider only the RSG model.

It is natural from the physical point of view of integrable perturbations [5] to expect that the space of fields for the perturbed model is the same as for its conformal limit. The latter consists of the primary fields and their descendents with respect to the two chiral Virasoro algebras. In this paper we shall consider the descendents with respect to one of these algebras; the possibility of considering the descendents with respect to the other one is explained in Subsect. 2.2.

The Verma module of the Virasoro algebra is generated by the action of the generators $L_{-k}$ of the Virasoro algebra on the primary field. The irreducible representation corresponding to a given primary field is obtained by factorizing out the null-vectors [6, 7]. On the other hand, the very possibility of integrable deformations is due to the fact that there exists a commutative subalgebra of the universal enveloping algebra of the
Virasoro algebra: the algebra of local integrals [5, 8, 9]. The local integrals $I_{2k-1}$ have odd spins. Logically it must be possible to present the Verma module as the result of the action of the local integrals of motion on a smaller module: the quotient over the local integrals. The latter is isomorphic to the module created from the vacuum by the action of bosons of even spins $J_{2k}$. The important observation is that the form factor formulae allow us to give a description of this space. Certainly, such a realization of the Verma module is very useful for integrable applications, but there is a difficulty: it is not trivial to describe the null-vectors in this realization. This question, the description of the null-vectors, is the main problem solved in this paper.

Briefly, the result is as follows. It is useful to fermionize the bosons $J_{2k}$ by introducing Neveu–Schwarz ($\psi_{2k-1}$) or Ramond ($\psi_{2k}$) fermions, depending on which primary field we consider. As in [1] the null-vectors are due to the deformed exact forms and the deformed Riemann bilinear identity. In the fermionic language these null-vectors are created by the action of two operators: one linear in the fermion ($Q$) and one quadratic ($C$). By calculating the characters we show that, after factorizing out these null-vectors, we find exactly the same number of descendents of a given primary field as in the corresponding irreducible representation of the Virasoro algebra.

In Conformal Field Theory, the existence of null-vectors provides differential equations for the correlation functions. The null-vectors built out of $Q$ and $C$ in our approach will also lead to a system of equations for the correlation functions in the massive case. These equations will be similar to the hierarchies of equations in classical integrable systems. In spite of the fact that such an infinite set of equations seems a priori intractable, our hope is that, as in the classical case, large classes of solutions will be constructed.
Hence we hope that these equations will be really useful for the study of correlation functions in the massive case.

We also present a detailed analysis of the classical limit of our constructions. The description of the null-vectors in terms of $Q$ and $C$ implies, in the classical limit, a new and very compact description of the KdV hierarchy. This new description is not the same as the description in terms of $\tau$-functions [10]; still, the techniques we use are close to those of the Kyoto school. Finally, we discuss an amazing analogy between the quantum theory and the results of the Whitham method for small perturbations around a given quasi-periodic classical solution of KdV.

Let us make one general remark on the exposition. In this paper we restrict ourselves to the reflectionless case of the sine-Gordon model. This is done only in order to simplify the reading of the paper. All our conclusions are valid for a generic value of the coupling constant as well. For this reason, we chose to present the final formulae in the situation of a generic coupling constant. This leads, from time to time, to a contradictory situation. We hope that we shall be forgiven for this, because if we wrote generic formulae everywhere the paper would be much more difficult to understand.
2. The Space of Fields

2.1. The description of the space of fields in the A, B-variables. At the reflectionless points ($\xi = \frac{\pi}{\nu}$, $\nu = 1, 2, \cdots$) there is a wide class of local operators $O$ for which the form factors in the RSG-model corresponding to a state with $n$ solitons and $n$ anti-solitons are given by

$$f_O(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+} = c^n \prod_{i<j} \zeta(\beta_i - \beta_j) \prod_{i=1}^{n}\prod_{j=n+1}^{2n} \frac{1}{\sinh \nu(\beta_j - \beta_i - \pi i)}\, \exp\Big(-\frac{1}{2}(\nu(n-1) - n)\sum_j \beta_j\Big) \times \hat f_O(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+} \tag{1}$$
The function $\zeta(\beta)$, without poles in the strip $0 < \mathrm{Im}\,\beta < 2\pi$, satisfies $\zeta(-\beta) = S(\beta)\zeta(\beta)$ and $\zeta(\beta - 2\pi i) = \zeta(-\beta)$; the S-matrix $S(\beta)$ and the constant $c$ are given in Appendix A. The most essential part of the form factor is given by

$$\hat f_O(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+} = \frac{1}{(2\pi i)^n} \int dA_1 \cdots \int dA_n \prod_{i=1}^{n}\prod_{j=1}^{2n} \psi(A_i, B_j) \prod_{i<j} (A_i^2 - A_j^2)\, L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) \prod_{i=1}^{n} a_i^{-i} \tag{2}$$

where $B_j = e^{\beta_j}$ and

$$\psi(A, B) = \prod_{j=1}^{\nu-1} (B - Aq^{-j}), \qquad \text{with } q = e^{i\pi/\nu}.$$
As usual we define $a = A^{2\nu}$. Here and later, if the range of integration is not specified, the integral is taken around 0. Notice that the operator dependence of the form factors (1) only enters in $\hat f_O$. Different local operators $O$ are defined by different functions $L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n})$. These functions are symmetric polynomials of $A_1, \cdots, A_n$. For the primary operators $\Phi_{2k} = \exp(2ki\varphi)$ and their Virasoro descendents, the $L_O$ are symmetric Laurent polynomials of $B_1, \cdots, B_{2n}$. For the primary operators $\Phi_{2k+1} = \exp((2k+1)i\varphi)$, they are symmetric Laurent polynomials of $B_1, \cdots, B_{2n}$ multiplied by $\prod_j B_j^{1/2}$. Our definition of the fields $\Phi_m$ is related to the notations coming from CFT as follows: $\Phi_m$ corresponds to $\Phi_{[1,m+1]}$. The requirement of locality is guaranteed by the following simple recurrence relation for the polynomials $L_O^{(n)}$:

$$L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n})\Big|_{B_{2n} = -B_1,\ A_n = \pm B_1} = -\epsilon^{\pm}\, L_O^{(n-1)}(A_1, \cdots, A_{n-1}|B_2, \cdots, B_{2n-1}) \tag{3}$$

where $\epsilon = +$ or $-$ respectively for the operators $\Phi_{2k}$ and their descendents, or for $\Phi_{2k+1}$ and their descendents. In addition to the simple formula (2) we have the requirement

$$\mathrm{res}_{A_n = \infty}\Big( \prod_{i=1}^{n}\prod_{j=1}^{2n} \psi(A_i, B_j) \prod_{i<j} (A_i^2 - A_j^2)\, L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n})\, a_n^{-k} \Big) = 0, \qquad k \ge n+1 \tag{4}$$
which is true in particular if $\deg_{A_n}(L_O) < 2\nu$. We explain these conditions in Appendix A. We see that the restriction (4) disappears only in the classical limit $\nu \to \infty$. It will be clear later that this class of local operators is not complete, because the ansatz (1) is too restrictive. With this ansatz we obtain the complete set of operators only in the classical limit. However there is a possibility to define the form factors of local operators
which correspond to polynomials satisfying the relation (3) without any restriction of the kind (4). To do that for the reflectionless points one has to consider the coupling constant in generic position (in which case the formulae for the form factors are much more complicated [3]) and to perform carefully the limit $\xi = \frac{\pi}{\nu} + \epsilon$, $\epsilon \to 0$. An example of such a calculation for $\xi = \pi$ is given in [1]. We shall comment further on this procedure later; here we would like to emphasize that a local operator can be defined for any polynomial satisfying (3), but its form factors are not necessarily given by the ansatz (1). Physically, the existence of local operators for the reflectionless case whose form factors are not given by the ansatz (1) is related to the existence of additional local conserved quantities which constitute the algebra $\widehat{sl}(2)$. In spite of the fact that the form factors of the form (1) do not define all the operators, they provide a good example for explaining the properties valid in the generic case.

The explicit form of the polynomials $L_O$ for the primary operators is as follows:

$$L^{(n)}_{\Phi_m}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) = \prod_{i=1}^{n} A_i^{m} \prod_{j=1}^{2n} B_j^{-\frac{m}{2}}.$$
In this paper we shall consider the Virasoro descendents of the primary fields. We shall restrict ourselves to considering only one chirality. Obviously, the locality relation (3) is not destroyed if we multiply the polynomial $L_O^{(n)}(A|B)$ either by $I_{2k-1}(B)$ or by $J_{2k}(A|B)$ with

$$I_{2k-1}(B) = \frac{1 + q^{2k-1}}{1 - q^{2k-1}}\, s_{2k-1}(B), \qquad k = 1, 2, \cdots. \tag{5}$$

$$J_{2k}(A|B) = s_{2k}(A) - \frac{1}{2}\, s_{2k}(B), \qquad k = 1, 2, \cdots. \tag{6}$$

Here and later we shall use the following definition:

$$s_k(x_1, \cdots, x_m) = \sum_{j=1}^{m} x_j^k$$
The multiplication by $I_{2k-1}$ corresponds to the application of the local integrals of motion. The normalization factor $\frac{1+q^{2k-1}}{1-q^{2k-1}}$ is introduced for further convenience. Since the boost operator acts by dilatation on $A$ and $B$, $I_{2k-1}$ has spin $(2k-1)$ and $J_{2k}$ has spin $2k$. The crucial assumption which we make is that the space of local field descendents of the operator $\Phi_m$ is generated by the operators obtained from the generating function

$$L_m(t, y|A|B) = \exp\Big( \sum_{k \ge 1} \big( t_{2k-1} I_{2k-1}(B) + y_{2k} J_{2k}(A|B) \big) \Big) \prod_{i=1}^{n} A_i^{m} \prod_{j=1}^{2n} B_j^{-\frac{m}{2}}. \tag{7}$$
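As a rough consistency check on the counting behind (7): a monomial $\prod_k t_{2k-1}^{a_k} y_{2k}^{b_k}$ carries total spin $\sum_k (2k-1)a_k + \sum_k 2k\,b_k$, so the number of monomials of spin $N$ should equal the number of partitions of $N$, which is also the number of vectors $L_{-k_1}\cdots L_{-k_r}\Phi_m$ at level $N$ of a Virasoro Verma module. The sketch below (hypothetical helper names, standard library only) counts odd-spin and even-spin contributions separately and convolves them.

```python
def series(spins, max_level):
    """Coefficients of prod_{s in spins} 1/(1 - p**s) up to p**max_level."""
    coeffs = [1] + [0] * max_level
    for s in spins:
        for total in range(s, max_level + 1):
            coeffs[total] += coeffs[total - s]
    return coeffs


def convolve(a, b):
    """Truncated product of two power series of equal length."""
    return [sum(a[i] * b[N - i] for i in range(N + 1)) for N in range(len(a))]


def level_counts(max_level):
    """Number of monomials in t_{2k-1}, y_{2k} of each total spin."""
    odd = series(range(1, max_level + 1, 2), max_level)   # t_1, t_3, ...
    even = series(range(2, max_level + 1, 2), max_level)  # y_2, y_4, ...
    return convolve(odd, even)


if __name__ == "__main__":
    print(level_counts(10))
```

The output for levels 0 to 10 reproduces the partition numbers 1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, as expected before any null-vectors are factorized out.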
This is our main starting point. This assumption follows from the classical meaning of the variables $A$, $B$ [2]. We shall give classical arguments later, and the connection with CFT is explained in the next subsection.

2.2. The relation with CFT. The RSG model coincides with the $\Phi_{[1,3]}$-perturbation of the minimal models of CFT. For the coupling constants which we consider the minimal
models in question are not unitary, but this fact is of no importance for us. It is natural from a physical point of view to conjecture that the number of local operators in the perturbed model is the same as in its conformal limit.

We first need to recall a few basic facts concerning the minimal conformal field theories. At the coupling constant $\xi$, the conformal limit of the RSG model has central charge

$$c = 1 - 6\,\frac{\pi^2}{\xi(\xi + \pi)}.$$

The conformal limit of the fields $\Phi_m = e^{im\varphi}$ is identified with the operators $\Phi_{[1,m+1]}$ in the minimal models. They have conformal dimensions:

$$\Delta_m = \frac{m(m\xi - 2\pi)}{4(\xi + \pi)}.$$

The reflectionless points, $\xi = \frac{\pi}{\nu}$ with $\nu = 1, 2, \cdots$, correspond to the non-unitary minimal models $M_{(1,\nu+1)}$, which are degenerate cases. At these points one has to identify $\Phi_m$ with $\Phi_{2\nu-m}$.

The Virasoro Verma module corresponding to the primary field $\Phi_m$ is generated by the vectors $L_{-k_1} L_{-k_2} \cdots L_{-k_N} \Phi_m$. At $\xi = \frac{\pi}{\nu}$ their structure is the same as for generic value of $\xi$. This means that for $m$ integer, the Verma module possesses only one submodule. The so-called basic null-vectors, which we shall denote by $\Gamma_m$, are the generators of these submodules. They appear at level $m+1$. The first few are given by:

$$\Gamma_0 = L_{-1}\Phi_0,$$
$$\Gamma_1 = (L_{-2} + \kappa L_{-1}^2)\Phi_1, \qquad \kappa = -1 - \frac{\pi}{\xi},$$
$$\Gamma_2 = (L_{-3} + \kappa_1 L_{-1}L_{-2} + \kappa_2 L_{-1}^3)\Phi_2, \qquad \kappa_1 = -2\,\frac{\pi + \xi}{\pi + 3\xi}, \quad \kappa_2 = \frac{(\pi + \xi)^2}{2\xi(\pi + 3\xi)},$$

etc. Other null-vectors, the set of which forms the submodule, are created from the basic ones as $L_{-k_1} L_{-k_2} \cdots L_{-k_N} \Gamma_m$. As a consequence, the character of the irreducible Virasoro representation with highest weight $\Phi_m$ is:

$$\chi_m(p) = \frac{1 - p^{m+1}}{\prod_{j \ge 1} (1 - p^j)}.$$
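The coefficient $\kappa$ of the level-2 null-vector quoted above can be cross-checked against the central charge and the dimension $\Delta_1$. The sketch below assumes the standard Virasoro normalization $[L_m, L_n] = (m-n)L_{m+n} + \frac{c}{12}m(m^2-1)\delta_{m+n,0}$ (an assumption, since the text does not spell it out) and writes $\xi = x\pi$ so that everything is rational in $x$. A vector $(L_{-2} + \kappa L_{-1}^2)|h\rangle$ is a null-vector when it is annihilated by $L_1$ and $L_2$, which gives the two conditions $3 + \kappa(4h+2) = 0$ and $4h + \frac{c}{2} + 6h\kappa = 0$. The function names are illustrative only.

```python
from fractions import Fraction


def central_charge(x):
    """c = 1 - 6 pi^2 / (xi (xi + pi)) with xi = x * pi."""
    return 1 - Fraction(6) / (x * (x + 1))


def dimension(m, x):
    """Delta_m = m (m xi - 2 pi) / (4 (xi + pi)) with xi = x * pi."""
    return Fraction(m) * (m * x - 2) / (4 * (x + 1))


def level2_residuals(x):
    """Residuals of the two null-vector conditions at level 2 for Phi_1.

    r1: coefficient of L_{-1}|h> in L_1 (L_{-2} + kappa L_{-1}^2)|h>
    r2: coefficient of |h>       in L_2 (L_{-2} + kappa L_{-1}^2)|h>
    Both vanish exactly when the vector is a null-vector.
    """
    x = Fraction(x)
    c = central_charge(x)
    h = dimension(1, x)
    kappa = -1 - 1 / x  # kappa = -1 - pi/xi
    r1 = 3 + kappa * (4 * h + 2)
    r2 = 4 * h + c / 2 + 6 * h * kappa
    return r1, r2


if __name__ == "__main__":
    for x in (Fraction(1, 3), Fraction(2), Fraction(5, 7)):
        print(x, level2_residuals(x))
```

Both residuals vanish identically in $x$, so the quoted $\kappa$ is consistent with the stated $c$ and $\Delta_1$; the level-1 case is immediate since $\Delta_0 = 0$.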
There exists an alternative description of these modules which is more appropriate for our purposes. Indeed, as is well known, the integrability of the $\Phi_{[1,3]}$-perturbation is related to the existence of a certain commutative subalgebra of the Virasoro universal enveloping algebra generated by elements $I_{2k-1}$, polynomials of degree $k$ in the $L_n$, such that

$$[L_0, I_{2k-1}] = (2k-1) I_{2k-1}$$

and

$$[I_{2k-1}, I_{2l-1}] = 0, \qquad k, l = 1, 2, \cdots.$$

The first few are given by:

$$I_1 = L_{-1}, \qquad I_3 = 2\sum_{n \ge 0} L_{-n-2} L_{n-1},$$
etc. This is the subalgebra of the local integrals of motion. The meaning of these operators and of the $I_{2k-1}(B)$ which we had above is the same, so we denote them by the same letters. Of course, acting only with this subalgebra on $\Phi_m$ does not generate the whole Verma module. But the left quotient of the Verma module by the ideal generated by the integrals of motion produces a space whose character is $\prod_{j \ge 1}(1 - p^{2j})^{-1}$. This space can be thought of as generated by some $J_{2k}$ with $[L_0, J_{2k}] = 2k J_{2k}$.

We expect that there is the following alternative way of generating the Virasoro module. Namely, in a spirit similar to the Feigin-Fuchs construction [6], we expect that there exists an appropriate completion of the subalgebra of the integrals of motion by elements $J_{2k}$ such that (i) the Verma module is isomorphic, as a graded space, to the space generated by the vectors

$$\exp\Big(\sum_{k \ge 1} t_{2k-1} I_{2k-1}\Big) : \exp\Big(\sum_{k \ge 1} y_{2k} J_{2k}\Big) : \Phi_m, \tag{8}$$
where the double dots refer to an appropriate normal ordering, and (ii) for a certain normalization of $I_{2k-1}$ and a certain choice of $J_{2k}$ and their normal ordering the generating functions (7) and (8) provide different realizations of the same object. We shall call the description given by (7) the A, B-representation of the Virasoro module. The existence of this alternative description of the Virasoro module is very important for integrable applications.

The main problem for this $I$, $J$ description arises from the non-trivial construction of the null-vectors. This problem has two aspects. First one has to construct the null-vectors $\Gamma_m$ in terms of $I$, $J$. Then one also has to construct the submodule associated to it. This is not as trivial as in the standard representation since, while acting on $\Gamma_m$ with the $I_{2k-1}$ still produces a null-vector, acting with the $J_{2k}$ does not necessarily lead to a null-vector. We shall show that the whole null-vector submodule can be described in the A, B-representation. The proof is based on rather delicate properties of the form-factor integrals, the main one being the deformed Riemann bilinear identity [4].

Let us discuss briefly the problem of the second chirality. In [2] we have explained that the formula (1) has to be understood as a result of light-cone quantization in which $x_-$ is considered as space and $x_+$ as time. We must be able to consider the alternative possibility ($x_+$ as space, $x_-$ as time), and the results of quantization must coincide. Where exactly does the choice of hamiltonian picture manifest itself in our formulae? Consider formula (2). As explained in [2], the fact that $L_O^{(n)}$ is a polynomial in $A_i$ corresponds to the choice of $x_-$ as a space direction, while the multiplier $\prod_{i=1}^{n} a_i^{-i}$ corresponds semi-classically to the choice of trajectories under the $x_+$-flow. These are the two ingredients which change when the hamiltonian picture is changed. Indeed, analyzing the results of
[2], we find the following alternative description of the form factors. The form factors are given by the formulae (1) with $\hat f_O$ replaced by

$$\hat h_O(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+} = \frac{1}{(2\pi i)^n} \int dA_1 \cdots \int dA_n \prod_{i=1}^{n}\prod_{j=1}^{2n} \psi(A_i, B_j) \prod_{i<j} (A_i^2 - A_j^2)\, K_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) \prod_{i=1}^{n} a_i^{-i+1} \prod_{j=1}^{2n} b_j^{-\frac{1}{2}},$$

where $b_j = B_j^{2\nu}$, $K_O^{(n)}$ is a polynomial in $A_i^{-1}$, and the replacement of $\prod_i a_i^{-i}$ by $\prod_i a_i^{-i+1} \prod_j b_j^{-\frac{1}{2}}$ corresponds to the change of $x_+$-trajectories to $x_-$-trajectories. In particular for the primary fields we have

$$K^{(n)}_{\Phi_m}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) = \prod_{i=1}^{n} A_i^{-m} \prod_{j=1}^{2n} B_j^{\frac{m}{2}}.$$
The $\bar L_{-k}$ descendents are obtained by multiplying $K^{(n)}_{\Phi_m}$ by $I_{-(2k-1)}(B)$ and $J_{-2k}(A|B)$. The consistency of the two pictures requires that for the primary fields they give the same result:

$$\hat f_{\Phi_m}(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+} = \hat h_{\Phi_m}(\beta_1, \beta_2, \cdots, \beta_{2n})_{-\cdots-+\cdots+}. \tag{9}$$
This is a complicated identity which nevertheless can be proven. We do not present the proof here because it goes beyond the scope of this paper. It should be said, however, that the proof is based on the same technique as used below (deformed Riemann bilinear identity, etc.). Using the equivalence of the two representations of the form factors we can consider the descendents with respect to $\bar L_{-k}$ and also mixed $L_{-k}$, $\bar L_{-k}$ descendents. A formula similar to (9) holds for any coupling constant. However, at the reflectionless points there is another consequence of (9): it also shows that the operators $\Phi_m$ and $\Phi_{2\nu-m}$ are identified, as they should be.

3. Null-Vectors

3.1. The null-polynomials. Null-vectors correspond to operators with all the form factors vanishing. Consider the integral

$$\frac{1}{(2\pi i)^n} \int dA_1 \cdots \int dA_n \prod_{i=1}^{n}\prod_{j=1}^{2n} \psi(A_i, B_j) \prod_{i<j} (A_i^2 - A_j^2)\, L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) \prod_{i=1}^{n} a_i^{-i}. \tag{10}$$

Instead of $L_O^{(n)}$, we shall often use the anti-symmetric polynomials $M_O^{(n)}$:

$$M_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}) = \prod_{i<j} (A_i^2 - A_j^2)\, L_O^{(n)}(A_1, \cdots, A_n|B_1, \cdots, B_{2n}).$$
The dependence on $B_1, \cdots, B_{2n}$ in the polynomials $M_O^{(n)}$ will often be omitted. There are several reasons why this integral can vanish. Some of them depend on a particular value of the coupling constant or on a particular number of solitons. We shall not consider these occasional situations. There are three general reasons for the vanishing of the integral; let us present them.

1. Residue. The integral (10) vanishes if the residue with respect to $A_n$ at the point $A_n = \infty$ of the expression

$$\prod_{j=1}^{2n} \psi(A_n, B_j)\, a_n^{-n}\, M_O^{(n)}(A_1, \cdots, A_n)$$

vanishes. Of course the distinction of the variable $A_n$ is of no importance because $M_O^{(n)}(A_1, \cdots, A_n)$ is anti-symmetric.

2. "Exact forms". The integral (10) vanishes if $M_O^{(n)}(A_1, \cdots, A_n)$ happens to be an "exact form". Namely, if it can be written as:

$$M_O^{(n)}(A_1, \cdots, A_n) = \sum_k (-1)^k M(A_1, \cdots, \widehat{A_k}, \cdots, A_n)\, \big( Q(A_k)P(A_k) - qQ(qA_k)P(-A_k) \big), \tag{11}$$

with

$$P(A) = \prod_{j=1}^{2n} (B_j + A)$$

for some anti-symmetric polynomial $M(A_1, \cdots, A_{n-1})$. Here and later $\widehat{A_k}$ means that $A_k$ is omitted. This is a direct consequence of the functional equation satisfied by $\psi(A, B)$ (see Eq. (69) in Appendix B). For $Q(A)$ one can take in principle any Laurent polynomial, but since we want $M_O^{(n)}$ to be a polynomial the degree of $Q(A)$ has to be greater than or equal to $-1$.

3. Deformed Riemann bilinear relation. The integral (10) vanishes if

$$M_O^{(n)}(A_1, \cdots, A_n) = \sum_{i<j} (-1)^{i+j} M(A_1, \cdots, \widehat{A_i}, \cdots, \widehat{A_j}, \cdots, A_n)\, C(A_i, A_j),$$
where $M(A_1, \cdots, A_{n-2})$ is an anti-symmetric polynomial of $n-2$ variables, and $C(A_1, A_2)$ is given by

$$C(A_1, A_2) = \frac{1}{A_1 A_2} \left\{ \frac{A_1 - A_2}{A_1 + A_2} \big( P(A_1)P(A_2) - P(-A_1)P(-A_2) \big) + \big( P(-A_1)P(A_2) - P(A_1)P(-A_2) \big) \right\}. \tag{12}$$

This property needs some comments. For the case of the generic coupling constant its proof is rather complicated. It is a consequence of the so-called deformed Riemann bilinear identity [4]. The name is due to the fact that in the limit $\xi \to \infty$ the deformed Riemann bilinear identity happens to be the same as the Riemann bilinear identity for
hyper-elliptic integrals. The formula for $C(A_1, A_2)$ given in [11] differs from (12) by simple "exact forms". Notice that the formula for $C(A_1, A_2)$ does not depend on the coupling constant. For the reflectionless case a very simple proof is available, which is given in Appendix B.

In order to apply these restrictions to the description of the null-vectors we have to make some preparations. The expression for $C(A_1, A_2)$ given above is economical in the sense that, as a polynomial of $A_1$, $A_2$, it has degree $2n-1$, but it is not appropriate for our goals because it mixes odd and even degrees of $A_1$, $A_2$, while the descendents of primary fields contain only odd or only even polynomials. So, by adding "exact forms" we want to replace $C(A_1, A_2)$ by equivalent expressions of higher degrees which contain only odd or only even degrees.

Proposition 1. The following equivalent forms of $C(A_1, A_2)$ exist:

$$C(A_1, A_2) \simeq C_e(A_1, A_2) \simeq C_o(A_1, A_2), \tag{13}$$

here and later $\simeq$ means equivalence up to "exact forms". The formulae for $C_e(A_1, A_2)$ and $C_o(A_1, A_2)$ are as follows:

$$C_e(A_1, A_2) = \int_{|D_2| > |D_1|} dD_2 \int_{|D_1| > |A_1|, |A_2|} dD_1\, \tau_e\Big(\frac{D_1}{D_2}\Big) \left\{ \frac{P(D_1)P(D_2)}{(D_1^2 - A_1^2)(D_2^2 - A_2^2)} - \frac{P(D_1)P(D_2)}{(D_2^2 - A_1^2)(D_1^2 - A_2^2)} \right\}, \tag{14}$$

where

$$\tau_e(x) = \sum_{k=1}^{\infty} \frac{1 - q^{2k-1}}{1 + q^{2k-1}}\, x^{2k-1} - \sum_{k=1}^{\infty} \frac{1 + q^{2k}}{1 - q^{2k}}\, x^{2k}$$

and

$$C_o(A_1, A_2) = A_1 A_2 \int_{|D_2| > |D_1|} dD_2 \int_{|D_1| > |A_1|, |A_2|} dD_1\, \tau_o\Big(\frac{D_1}{D_2}\Big) \left\{ \frac{P(D_1)P(D_2)}{(D_1^2 - A_1^2)(D_2^2 - A_2^2)} - \frac{P(D_1)P(D_2)}{(D_2^2 - A_1^2)(D_1^2 - A_2^2)} \right\},$$

where

$$\tau_o(x) = \sum_{k=1}^{\infty} \frac{1 - q^{2k}}{1 + q^{2k}}\, x^{2k} - \sum_{k=1}^{\infty} \frac{1 + q^{2k-1}}{1 - q^{2k-1}}\, x^{2k-1}.$$
The proof of this proposition is given in Appendix B. The functions $\tau_e(x)$ and $\tau_o(x)$ are not well defined when $q^r = 1$ because certain denominators vanish. However, the formula (13), in which the LHS is independent of $q$, implies that in the dangerous places we always find "exact forms". So, for our applications these singularities are harmless. We shall comment more on this point later.

3.2. The fermionization. The descendents of the local operators are created by $I_{2k-1}$ and $J_{2k}$. This generates a bosonic Fock space. It is very convenient to fermionize $J_{2k}$. Let us introduce Neveu–Schwarz and Ramond fermions: $\psi_{2k-1}$, $\psi^*_{2k-1}$ and $\psi_{2k}$, $\psi^*_{2k}$. The commutation relations are as follows:

$$\psi_l \psi_m^* + \psi_m^* \psi_l = \delta_{l,m}.$$
We prefer to follow the notations from [10] rather than those coming from CFT; the reader used to CFT language has to replace $\psi_m^*$ by $\psi_{-m}^*$.

The vacuum vectors for the spaces with different charges are defined as follows. In the Neveu–Schwarz sector we have:

$$\psi_{2k-1}|2m-1\rangle = 0 \ \text{ for } k > m, \qquad \psi^*_{2k-1}|2m-1\rangle = 0 \ \text{ for } k \le m;$$
$$\langle 2m-1|\psi_{2k-1} = 0 \ \text{ for } k \le m, \qquad \langle 2m-1|\psi^*_{2k-1} = 0 \ \text{ for } k > m.$$

For the Ramond sector we have:

$$\psi_{2k}|2m\rangle = 0 \ \text{ for } k > m, \qquad \psi^*_{2k}|2m\rangle = 0 \ \text{ for } k \le m;$$
$$\langle 2m|\psi_{2k} = 0 \ \text{ for } k \le m, \qquad \langle 2m|\psi^*_{2k} = 0 \ \text{ for } k > m.$$
We shall never mix the Neveu–Schwarz and Ramond sectors. The spaces spanned by the right action of an equal number of $\psi$'s and $\psi^*$'s on the vector $\langle p|$ will be called $H_p^*$. The right action of $\psi$ sends $H_p^*$ to $H_{p+2}^*$. It is useful to think of the vector $\langle p|$ as a semi-infinite product $\langle p| = \cdots \psi_{p-4}\psi_{p-2}\psi_p$.

Let us introduce generating functions for the fermions. The operators $\psi(A)$, $\psi^*(A)$ are defined for the Neveu–Schwarz and the Ramond sectors respectively as follows:

$$\psi(A) = \sum_{k=-\infty}^{\infty} A^{-2k+1}\psi_{2k-1}, \qquad \psi^*(A) = \sum_{k=-\infty}^{\infty} A^{2k-1}\psi^*_{2k-1};$$

$$\psi(A) = \sum_{k=-\infty}^{\infty} A^{-2k}\psi_{2k}, \qquad \psi^*(A) = \sum_{k=-\infty}^{\infty} A^{2k}\psi^*_{2k}.$$

We shall use the decomposition of $\psi(A)$, $\psi^*(A)$ into the regular and singular parts (at zero):

$$\psi(A) = \psi(A)_{reg} + \psi(A)_{sing}, \qquad \psi^*(A) = \psi^*(A)_{reg} + \psi^*(A)_{sing},$$
where $\psi(A)_{reg}$ and $\psi^*(A)_{reg}$ contain all the terms with non-negative degrees of $A$.

Let us introduce the bosonic commuting operators $h_{-2k}$ for $k \ge 1$:

$$h_{-2k} = \sum_{j=-\infty}^{\infty} \psi_{2j-1}\psi^*_{2k+2j-1} \quad \text{for the Neveu–Schwarz sector},$$
$$h_{-2k} = \sum_{j=-\infty}^{\infty} \psi_{2j}\psi^*_{2k+2j} \quad \text{for the Ramond sector}.$$

They satisfy the commutation relations: $[h_{-2k}, h^*_{-2l}] = -k\delta_{k,l}$. We also have the following commutation relations between the fermions and the bosons:

$$\psi(A)h_{-2k} = (h_{-2k} - A^{-2k})\psi(A). \tag{15}$$

The bosonic generating function $L_m(t, y|A|B)$ for the descendents of the operators $\Phi_m$ can be rewritten as:

$$L_m(t, y|A|B) = \exp\Big(\sum_{k \ge 1} t_{2k-1} I_{2k-1}(B)\Big) \langle m-1| \exp\Big(\sum_{k \ge 1} y_{2k} h^*_{-2k}\Big) \hat L_m(A|B) |m-1\rangle, \tag{16}$$

where

$$\hat L_m(A|B) = \exp\Big(-\sum_{k \ge 1} \frac{1}{k} h_{-2k} J_{2k}(A|B)\Big) \prod_i A_i^{m} \prod_j B_j^{-\frac{m}{2}}.$$
In other words, to obtain a particular descendent one has to take in the expression

$$\exp\Big(\sum_{k \ge 1} t_{2k-1} I_{2k-1}(B)\Big) \hat L_m(A|B)|m-1\rangle$$

the coefficient in front of some monomial in $t_{2k-1}$ and to calculate the matrix element with some vector from the fermionic space $H^*_{m-1}$.

We can replace this bosonic expression by a fermionic one. Recall that as a direct result of the boson-fermion correspondence one has:

$$\prod_{i<j} (A_i^2 - A_j^2) \exp\Big(-\sum_{k \ge 1} \frac{1}{k} h_{-2k} \sum_{i=1}^{n} A_i^{2k}\Big)|m-1\rangle = \psi^*(A_1) \cdots \psi^*(A_n)|m-2n-1\rangle \prod_{i=1}^{n} A_i^{-m+2n-1},$$

where the fermions are Neveu–Schwarz or Ramond ones depending on the parity of $m$. Using this fact the formula (16) can be rewritten for any $n$ as follows:

$$\prod_{i<j} (A_i^2 - A_j^2)\, \hat L_m(A|B)|m-1\rangle = g(B)\, \psi^*(A_1) \cdots \psi^*(A_n)|m-2n-1\rangle \prod_i A_i^{2n-1} \prod_j B_j^{-\frac{m}{2}}, \tag{17}$$

where

$$g(B) = \exp\Big(\sum_{k \ge 1} \frac{1}{2k} h_{-2k} s_{2k}(B)\Big).$$
The occurrence of this operator is similar to that of the spin field in the description [12] of CFT on hyperelliptic curves.

Let us now concentrate on the operator $\Phi_0 = 1$ and its descendents. It means that we are working with the $H^*_{-1}$ subspace of the Neveu–Schwarz sector. The polynomials corresponding to the operators in this sector are even in $A_i$. Let us describe the null-vectors due to the restrictions 1, 2, 3 from the previous subsection. To do that we shall need the results of the following three propositions.

Proposition 2. The set of polynomials of the form

$$M_O^{(n)}(A_1, \cdots, A_n) = \sum_{i<j} (-1)^{i+j} M(A_1, \cdots, \widehat{A_i}, \cdots, \widehat{A_j}, \cdots, A_n)\, C_e(A_i, A_j) \tag{18}$$

coincides up to "exact forms" with the set of the matrix elements:

$$\langle \Psi_{-5}|\hat C\, \psi^*(A_1) \cdots \psi^*(A_n)|-2n-1\rangle \prod_i A_i^{2n-1}, \qquad \forall\, \Psi_{-5} \in H^*_{-5}, \tag{19}$$

where

$$\hat C = \int_{|D_2| > |D_1|} dD_2 \int dD_1\, P(D_1)P(D_2)\, D_1^{-2n-1} D_2^{-2n-1}\, \tau_e\Big(\frac{D_1}{D_2}\Big)\, \psi(D_1)\psi(D_2).$$
Proof. The vectors from the space $H^*$ can be written as $\langle \Psi| = \langle N|\psi_{k_1} \cdots \psi_{k_p}$, where $k_p > \cdots > k_1 > N+1$. We shall call $N$ the depth of $\langle \Psi|$. There are three possibilities for the matrix element (19) to differ from zero:

1. The depth of $\langle \Psi_{-5}|$ is greater than $-2n-1$.
2. The vector $\langle \Psi_{-5}|$ is obtained from a vector $\langle \Psi_{-1}|$ whose depth is greater than $-2n-1$ by application of $\psi^*_{-2p-1}\psi^*_{-2q-1}$ with $q > p \ge n$ (i.e. there are two holes below $-2n-1$).
3. The vector $\langle \Psi_{-5}|$ is obtained from a vector $\langle \Psi_{-3}|$ whose depth is greater than $-2n-1$ by application of $\psi^*_{-2p-1}$ with $p \ge n$ (i.e. there is one hole below $-2n-1$).

In the first case, using the formula

$$\langle -2n-1|\psi(D)\psi^*(A)|-2n-1\rangle = \Big(\frac{D}{A}\Big)^{2n-1} \frac{D^2}{D^2 - A^2}, \qquad |D| > |A|$$

and (14), one finds

$$\langle \Psi_{-5}|\hat C\, \psi^*(A_1) \cdots \psi^*(A_n)|-2n-1\rangle \prod_i A_i^{2n-1} = \sum_{i<j} (-1)^{i+j} M(A_1, \cdots, \widehat{A_i}, \cdots, \widehat{A_j}, \cdots, A_n)\, C_e(A_i, A_j),$$

where

$$M(A_1, \cdots, A_{n-2}) = \langle \Psi_{-5}|\psi^*(A_1) \cdots \psi^*(A_{n-2})|-2n-1\rangle \prod_i A_i^{2n-1}.$$

In the second case it is necessary that in the expression $\langle \Psi_{-1}|\psi^*_{-2p-1}\psi^*_{-2q-1}\hat C$ the two holes below $-2n-1$ are annihilated by $\hat C$; the result is

$$\langle \Psi_{-1}| \int_{|D_2| > |D_1|} dD_2 \int dD_1\, P(D_1)P(D_2)\, D_1^{-2n+2p} D_2^{-2n+2q}\, \tau_e\Big(\frac{D_1}{D_2}\Big) = 0$$

because the integrand is a regular function of $D_1$ for $p \ge n$.

In the third case it is necessary that in the expression $\langle \Psi_{-3}|\psi^*_{-2p-1}\hat C$ the hole below $-2n-1$ is annihilated by $\hat C$; the result is

$$\langle \Psi_{-3}| \int_{|D_2| > |D_1|} dD_2 \int dD_1\, P(D_1)P(D_2)\, D_1^{-2n-1} D_2^{-2n+2p}\, \tau_e\Big(\frac{D_1}{D_2}\Big)\, \psi(D_1)$$
where the pairing of $\psi(D_1)$ and $\psi^*_{-2p-1}$ is not considered because it produces zero for the same reason as above. In the matrix element we shall have the polynomials
$$\langle -2n-1|\int\!\!\int_{|D_2|>|D_1|} dD_2\,dD_1\,P(D_1)P(D_2)\,D_1^{-2n-1}D_2^{-2n+2p}\,\tau_e\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi^*(A_j)\,|-2n-1\rangle\,A_j^{2n-1}$$
$$= \int\!\!\int_{|D_2|>|D_1|} dD_2\,dD_1\,P(D_1)P(D_2)\,\frac{1}{D_1^2-A_j^2}\,D_2^{-2n+2p}\,\tau_e\!\left(\frac{D_1}{D_2}\right) \equiv R_{2n+2p}(A_j).$$
The polynomial $R_{2n+2p}(A)$ is an even polynomial of degree $2n+2p$. Let us show that it is an "exact form". From the calculations of Appendix B we have
$$R_{2n+2p}(A) = \int\!\!\int_{|D_2|>|D_1|} dD_2\,dD_1\,P(D_1)P(D_2)\,\frac{1}{D_1^2-A^2}\,D_2^{-2n+2p}\,\tau_e\!\left(\frac{D_1}{D_2}\right)$$
$$\simeq A\int\!\!\int_{|D_2|>|D_1|} dD_2\,dD_1\,P(D_1)P(D_2)\,\frac{1}{D_1^2-A^2}\,D_2^{-2n+2p}\,\frac{1}{D_1+D_2}$$
$$= A\int\!\!\int_{|D_1|>|D_2|} dD_2\,dD_1\,P(D_1)P(D_2)\,\frac{1}{D_1^2-A^2}\,D_2^{-2n+2p}\,\frac{1}{D_1+D_2} = 0, \tag{20}$$
where we have replaced the integral over $|D_2|>|D_1|$ by the integral over $|D_1|>|D_2|$ because the residue at $D_2=-D_1$ gives the integral over $D_1$ of an even function; such an integral equals zero. The last integral in (20) vanishes because the integrand is a regular function of $D_2$ for $p\ge n$. Let us emphasize that our construction is self-consistent because for every $n$ the polynomials of too high degree (greater than $4n-2$) are "exact forms". Thus we have non-trivial matrix elements only in the first case, which obviously exhausts the polynomials of the kind (18).
Consider now restriction 2 of the previous subsection. It is easy to figure out that there is only one uniform way to write, for all $n$, polynomials of the type (11) which are even in all variables $A_i$. Namely:
$$M_O^{(n)}(A_1,\cdots,A_n) = \sum_k (-1)^k\,M(A_1^2,\cdots,\widehat{A_k^2},\cdots,A_n^2)\,\bigl(P(A_k)-P(-A_k)\bigr)\,A_k^{-1}, \tag{21}$$
where $M(A_1^2,\cdots,A_{n-1}^2)$ is an arbitrary anti-symmetric polynomial. The following two simple propositions are given without proof.
Proposition 3. The set of polynomials (21) coincides with the set of matrix elements:
$$\langle\Psi_{-3}|\,\hat{Q}\,\psi^*(A_1)\cdots\psi^*(A_n)\,|-2n-1\rangle\prod_i A_i^{2n-1}, \qquad \forall\,\Psi_{-3}\in H^*_{-3},$$
where
$$\hat{Q} = \int dD\,D^{-2n-1}P(D)\,\psi(D).$$
Null-Vectors in Integrable Field Theory
Proposition 4. The set of polynomials $M_O^{(n)}(A_1,\cdots,A_n)$ such that
$$\mathrm{res}_{A_n=\infty}\,\prod_j \psi(A_n,B_j)\,a_n^{-n}\,M_O^{(n)}(A_1,\cdots,A_n) = 0$$
(we hope that the same letter $\psi$ used for the function $\psi(A,B_j)$ and for the fermion is not confusing) coincides with the set of matrix elements:
$$\langle\Psi_{1}|\,\hat{Q}^\dagger\,\psi^*(A_1)\cdots\psi^*(A_n)\,|-2n-1\rangle\prod_i A_i^{2n-1}, \qquad \forall\,\Psi_{1}\in H^*_{1},$$
where
$$\hat{Q}^\dagger = \mathrm{res}_{A=\infty}\,\prod_j\psi(A,B_j)\,a^{-n}\int_{|D|>|A|}\psi^*(D)\,\frac{1}{D^2-A^2}\,D^{2n}\,dD.$$
Let us apply these results to the description of null-vectors. We need to introduce the following notations:
$$X(D) = \sum_{k\ge1}\frac{1}{2k-1}\,D^{-2k+1}\,s_{2k-1}(B) = \sum_{k\ge1}\frac{1}{2k-1}\,\frac{1-q^{2k-1}}{1+q^{2k-1}}\,D^{-2k+1}\,I_{2k-1}(B), \tag{22}$$
$$Y(D) = \sum_{k\ge1}\frac{1}{2k}\,D^{-2k}\,s_{2k}(B). \tag{23}$$
Obviously
$$P(D) = D^{2n}\,e^{X(D)-Y(D)}. \tag{24}$$
The null-vectors will be produced by acting with some operators $C$, $Q$ and $Q^\dagger$ on $\hat{L}_0(A|B)|-1\rangle$. In view of the bosonization formulae, these operators are obtained from $\hat{C}$, $\hat{Q}$ and $\hat{Q}^\dagger$ by conjugation with $g(B)$:
$$C\,g(B) = g(B)\,\hat{C}, \qquad Q\,g(B) = g(B)\,\hat{Q}, \qquad Q^\dagger\,g(B) = g(B)\,\hat{Q}^\dagger.$$
The formulae for $C$ and $Q$ are given in the following two propositions:
Proposition 2'. From Proposition 2 we find that the null-vectors due to the deformed Riemann bilinear identity are of the form:
$$\exp\Bigl(\sum_{k\ge1} t_{2k-1}I_{2k-1}(B)\Bigr)\,\langle\Psi_{-5}|\,C\,\exp\Bigl(-\sum_{k\ge1}\frac{1}{k}\,h_{-2k}\,J_{2k}(A,B)\Bigr)|-1\rangle,$$
where
$$C = \int\!\!\int_{|D_2|>|D_1|}\frac{dD_2}{D_2}\,\frac{dD_1}{D_1}\;e^{X(D_1)}e^{X(D_2)}\,\tau_e\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi(D_2). \tag{25}$$
Proposition 3'. From Proposition 3 one gets the following set of null-vectors due to the "exact forms":
$$\exp\Bigl(\sum_{k\ge1} t_{2k-1}I_{2k-1}(B)\Bigr)\,\langle\Psi_{-3}|\,Q\,\exp\Bigl(-\sum_{k\ge1}\frac{1}{k}\,h_{-2k}\,J_{2k}(A,B)\Bigr)|-1\rangle,$$
where
$$Q = \int\frac{dD}{D}\,e^{X(D)}\,\psi(D). \tag{26}$$
Notice a very important feature in these formulae: the operators $Q$ and $C$ are independent of $n$. These propositions are direct consequences of the previous ones and of the following conjugation property of the fermions:
$$\psi(D)\,g(B) = g(B)\,\psi(D)\,e^{-Y(D)}, \qquad \psi^*(D)\,g(B) = g(B)\,\psi^*(D)\,e^{Y(D)}.$$
Before dealing with $Q^\dagger$ let us discuss the operators $C$ and $Q$ in more detail. It will be convenient to rewrite them in terms of another set of fermions $\tilde\psi$ and $\tilde\psi^\dagger$. To understand the purpose of introducing a new basis for the fermions consider the formula (25). In this formula the fermion $\psi(D_2)$ can be replaced by its regular part $\psi(D_2)_{reg}$ because the other multipliers in the integrand contain only negative powers of $D_2$. That is why $C$ can be rewritten in the form
$$C = \int\frac{dD}{D}\,\tilde\psi(D)_{sing}\,\tilde\psi(D)_{reg}, \tag{27}$$
where the modified fermion $\tilde\psi$ is defined as follows:
$$\tilde\psi(D)_{reg} = \psi(D)_{reg}, \qquad \tilde\psi(D)_{sing} = U\psi(D),$$
with $U$ the following operator
$$U f(D) = \left[\int_{|D|>|D_1|}\frac{dD_1}{D_1}\,e^{X(D_1)}e^{X(D)}\,\tau_e\!\left(\frac{D_1}{D}\right)f(D_1)\right]_{odd},$$
where $[\cdots]_{odd}$ means that only odd degrees of the expression with respect to $D$ are taken, because only those contribute to the integral (27). It is quite obvious that this transformation is triangular, namely
$$\tilde\psi_{2k-1} = \frac{1-q^{2k-1}}{1+q^{2k-1}}\,\psi_{2k-1} + (\text{terms with } \psi_{2l-1},\ l<k), \qquad k\ge1.$$
Altogether we can write $\tilde\psi(D) = \hat{U}\psi(D)$ where $\hat{U}$ is triangular. Introduce the fermions $\tilde\psi^\dagger$ satisfying canonical commutation relations with $\tilde\psi$:
$$\tilde\psi^\dagger(D) = \bigl(\hat{U}^T\bigr)^{-1}\psi^*(D).$$
Since the operator $\hat{U}$ is not unitary, we do not use $*$ but $\dagger$ for $\tilde\psi$. The triangularity of the operator $\hat{U}$ guarantees that the Fock space $H^*$ constructed in terms of $\tilde\psi$, $\tilde\psi^\dagger$ coincides with the original one. Thus, we can rewrite (27) as follows:
$$C = \sum_{j=1}^{\infty}\tilde\psi_{-2j+1}\,\tilde\psi_{2j-1}.$$
The important property of this formula is that for a given number of solitons $n$ the summation can be taken from 1 to $n$, because the operators $\tilde\psi_{2j-1}$ with $j>n$ produce "exact forms" when plugged into the matrix elements (see the proof of Proposition 2). Similarly, we can express the operator $Q$, defined in (26), in terms of $\tilde\psi$:
$$Q = \int\frac{dD}{D}\,e^{X(D)}\,\tilde\psi(D).$$
This equality is due to the fact that only the regular part of $\psi(D)$ contributes to the integral, and this part does not change under the transformation to $\tilde\psi(D)$.
Now we are ready to consider the operator $Q^\dagger$.
Proposition 4'. From Proposition 4 one gets the following set of null-vectors due to the vanishing of the residues:
$$\exp\Bigl(\sum_{k\ge1} t_{2k-1}I_{2k-1}(B)\Bigr)\,\langle\Psi_{1}|\,Q^\dagger\,\exp\Bigl(-\sum_{k\ge1}\frac{1}{k}\,h_{-2k}\,J_{2k}(A,B)\Bigr)|-1\rangle,$$
where
$$Q^\dagger = \int\frac{dD}{D}\,e^{X(D)}\,\tilde\psi^\dagger(D).$$
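Proof. Directly from Proposition 4 one gets the following formula for $Q^\dagger$:
$$Q^\dagger = \mathrm{res}_{A=\infty}\,\prod_j\psi(A,B_j)\,a^{-n}\int_{|D|>|A|} dD\,D^{2n}\,e^{-Y(D)}\,\psi^*(D)\,\frac{1}{D^2-A^2}. \tag{28}$$
This formula looks much simpler in terms of $\tilde\psi^\dagger$. By definition we have
$$\psi^*(D) = \hat{U}^T\tilde\psi^\dagger(D) = \left[\int_{|D_1|>|D|}\frac{dD_1}{D_1}\,e^{X(D_1)}e^{X(D)}\,\tau_e\!\left(\frac{D}{D_1}\right)\tilde\psi^\dagger_{reg}(D_1)\right]_{odd} + \tilde\psi^\dagger(D)_{sing}.$$
The last term does not contribute to the residue because
$$\int_{|D|>|A|} dD\,D^{2n}\,e^{-Y(D)}\,\frac{1}{D^2-A^2}\,\tilde\psi^\dagger(D)_{sing} = O(A^{2n-2})$$
and
$$\prod_j\psi(A,B_j)\,a^{-n} = A^{-2n}\bigl(1+O(A^{-1})\bigr). \tag{29}$$
Substituting the rest into (28) one has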
$$Q^\dagger = \int\frac{dD_1}{D_1}\,\tilde\psi^\dagger(D_1)\,e^{X(D_1)} \times \mathrm{res}_{A=\infty}\,\prod_j\psi(A,B_j)\,a^{-n}\int_{|D_1|>|D|} dD\,P(D)\,\frac{1}{D^2-A^2}\,\tau_e\!\left(\frac{D}{D_1}\right).$$
Using the formulae from Appendix B one can show that
$$\int_{|D_1|>|D|>|A|} dD\,P(D)\,\frac{1}{D^2-A^2}\,\tau_e\!\left(\frac{D}{D_1}\right) \simeq \frac12\left(\frac{P(A)}{A+D_1}+\frac{P(A)}{A-D_1}\right) + O(A^{-1}) = A^{2n-1}\bigl(1+O(A^{-1})\bigr).$$
Here the equality is up to an "exact form" in $A$; such "exact forms" never contribute to the residue. Now the formula (29) gives
$$\mathrm{res}_{A=\infty}\,\prod_j\psi(A,B_j)\,a^{-n}\int_{|D_1|>|D|>|A|} dD\,P(D)\,\frac{1}{D^2-A^2}\,\tau_e\!\left(\frac{D}{D_1}\right) = 1,$$
which proves the proposition.
This alternative expression for $Q^\dagger$ shows that it is independent of $n$, as $Q$ and $C$ are. Notice that in the formulae for $Q$ and $Q^\dagger$ only the holomorphic parts of $\tilde\psi(D)$ and $\tilde\psi^\dagger(D)$ are relevant. This leads to the important commutation relation:
$$[C, Q^\dagger] = Q. \tag{30}$$
Notice also that $Q$ and $Q^\dagger$ are nilpotent operators, $Q^2 = (Q^\dagger)^2 = 0$, and $[C,Q]=0$.
This is the proper place to discuss the problems which we had before: the definition of the local operator corresponding to an arbitrary polynomial satisfying (3) and the singularities in the definition of $C_o$ and $C_e$. Consider the polynomials
$$\prod_{i<j}(A_i^2-A_j^2)\,\langle\tilde\Psi_{-3}|\,Q\,\exp\Bigl(-\sum_{k\ge1}\frac{1}{k}\,h_{-2k}\,J_{2k}(A,B)\Bigr)|-1\rangle, \tag{31}$$
where the states $\langle\tilde\Psi_{-3}|$ are the vectors from the Fock space constructed via the fermions $\tilde\psi$. By the very definition of $\tilde\psi$ the polynomial (31) is equivalent to an antisymmetric polynomial $M_{\langle\tilde\Psi_{-3}|}(A_1,\cdots,A_n)$ of degree $\le 2n-1$ with respect to any $A_i$, which is independent of the coupling constant. This polynomial satisfies the requirement formulated in [1] and hence defines a local operator for arbitrary coupling constant. So, if we start from the fermions $\tilde\psi$ the counting of the local operators is independent of the coupling constant, as in the paper [1]. But where is the origin of vanishing denominators? The point is that if we consider a rational coupling constant $\xi$, for many of these local operators all the form factors will vanish. So, we have to be more careful: we consider the vicinity of the rational coupling constant, $\xi = \pi\frac{p}{q}+\epsilon$, and keep the first non-vanishing order in the form factors in the limit $\epsilon\to0$. Thus, to define the space of local operators it is better to start from the fermions $\tilde\psi$. If, when passing to the fermions $\psi$, we find somewhere an infinite coefficient when the coupling constant is rational, it is always accompanied by a vanishing operator in such a way that the result is finite.
3.3. Algebraic definition of the null-vectors. Let us summarize our study of the null-vectors for the descendents of $\Phi_0$. It is quite convenient to present the null-vectors as vectors from the dual space. In this section we shall write $\langle\Psi| \stackrel{w}{=} 0$ if the matrix element of $\langle\Psi|$ with the fermionic generating function of local fields vanishes under the integral ($w$ means "in a weak sense"). We have found three types of null-vectors:
$$\langle\Psi_{-5}|\,C \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-5}|\in H^*_{-5}, \tag{i}$$
$$\langle\Psi_{-3}|\,Q \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-3}|\in H^*_{-3}, \tag{ii}$$
$$\langle\Psi_{1}|\,Q^\dagger \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{1}|\in H^*_{1}, \tag{iii}$$
where the operators $C$, $Q$, $Q^\dagger$ are given respectively by
$$C = \int\frac{dD}{D}\,\tilde\psi(D)_{reg}\,\tilde\psi(D)_{sing}, \qquad Q = \int\frac{dD}{D}\,e^{X(D)}\,\tilde\psi(D), \qquad Q^\dagger = \int\frac{dD}{D}\,e^{X(D)}\,\tilde\psi^\dagger(D).$$
Let us show that these three conditions for the null-vectors are not independent. It is easy to show that the operator $C$ identifies the spaces $H^*_{-3}$ and $H^*_{1}$:
$$\mathrm{Ker}\bigl(C|_{H^*_{-3}\to H^*_{1}}\bigr) = 0, \qquad \mathrm{Im}\bigl(C|_{H^*_{-3}\to H^*_{1}}\bigr) = H^*_{1}.$$
Hence every $\langle\Psi_1|\in H^*_1$ can be presented as $\langle\Psi_{-3}|C$ for some $\langle\Psi_{-3}|\in H^*_{-3}$. Let us show that the null-vectors (iii) are linear combinations of (i) and (ii). We have:
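$$\langle\Psi_1|\,Q^\dagger = \langle\Psi_{-3}|\,C\,Q^\dagger = \langle\Psi_{-3}|\,Q + \langle\Psi_{-3}|\,Q^\dagger C,$$
where we have used the commutation relation (30). Thus we have proven the following
Proposition 5. In the space of descendents of $\Phi_0 = 1$, which is $H(I)\otimes H^*_{-1}$ (where $H(I)$ is the space of polynomials of $\{I_{2k-1}\}$), the null-vectors coincide with the vectors
$$\langle\Psi_{-5}|\,C \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-5}|\in H^*_{-5}, \tag{i}$$
$$\langle\Psi_{-3}|\,Q \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-3}|\in H^*_{-3}, \tag{ii}$$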
and their descendents with respect to the $I$'s.
The consideration of the other operators $\Phi_{2m}$ is based on the same formulae, but involves additional complications. We do not want to go into details, and we only present the final result.
Proposition 6. For the operator $\Phi_{2m}$, whose descendents are counted by the vectors from $H(I)\otimes H^*_{2m-1}$, we have two types of independent null-vectors:
$$\langle\Psi_{-5-2m}|\,(C)^{m+1} \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-5-2m}|\in H^*_{-5-2m}, \tag{i}$$
$$\langle\Psi_{-3-2m}|\,(C)^{m}\,Q \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-3-2m}|\in H^*_{-3-2m}, \tag{ii}$$
and their descendents with respect to the $I$'s.
For the operators $\Phi_{2m+1}$ one has the following picture. There is no uniform "exact form" which is an odd polynomial, so the analogue of the operator $Q$ does not exist for $\Phi_{2m+1}$, and the null-vectors are either due to the deformed Riemann bilinear identity or due to the vanishing of the residue. To construct the null-vectors in terms of fermions one has to introduce first the operator $C$:
$$C = \int\!\!\int_{|D_2|>|D_1|}\frac{dD_2}{D_2}\,\frac{dD_1}{D_1}\;e^{X(D_1)}e^{X(D_2)}\,\tau_o\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi(D_2),$$
where the fermions are from the Ramond sector. This operator can be rewritten in a form similar to (27):
$$C = \int\frac{dD}{D}\,\tilde\psi(D)_{reg}\,\tilde\psi(D)_{sing}.$$
The fermions $\tilde\psi$ are related to $\psi$ by a triangular transformation.
The consideration of the operator $\Phi_1$ is absolutely parallel to the consideration of $\Phi_0$. The null-vectors are created either by the action of $C$ (Riemann identity) or by the action of $\tilde\psi_0^\dagger$ (residue). Notice that
$$[C, \tilde\psi_0^\dagger] = 0,$$
which guarantees the consistency. For higher operators $\Phi_{2m+1}$ there are additional problems which we would not like to discuss here. The general result is given in the following
Proposition 7. The descendents of the operator $\Phi_{2m+1}$ are counted by the vectors from the space $H(I)\otimes H^*_{2m}$. We have two types of null-vectors
$$\langle\Psi_{-4-2m}|\,(C)^{m+1} \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-4-2m}|\in H^*_{-4-2m}, \tag{i}$$
$$\langle\Psi_{2-2m}|\,(C)^{m}\,\tilde\psi_0^\dagger \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{2-2m}|\in H^*_{2-2m}, \tag{ii}$$
and their descendents with respect to the $I$'s.
3.4. Examples of null-vectors and the characters. Let us present the simplest examples of null-vectors for the operators $\Phi_0$, $\Phi_1$ and $\Phi_2$. For the operator $\Phi_0$ the simplest null-vector is created by
$$\langle -3|\,Q = s_1(B)\,\langle -1|.$$
This null-vector is
$$\frac{1-q}{1+q}\,I_1\,\Phi_0.$$
This null-vector is to be compared with $L_{-1}\Phi_0$. For the operator $\Phi_1$ the simplest null-vector is created by
$$\langle 2|\,\tilde\psi_0^\dagger = \langle -2|\,\tilde\psi_2 = \frac{1-q^2}{1+q^2}\left(\langle -2|\,\psi_2 - \frac12\left(\frac{1+q}{1-q}\right)^{2}s_1(B)^2\,\langle 0|\right),$$
which gives the null-vector
$$\frac{1-q^2}{1+q^2}\,\Bigl(J_2 - \frac12 I_1^2\Bigr)\Phi_1,$$
which has to be compared with $(L_{-2}+\kappa L_{-1}^2)\Phi_1$. For the operator $\Phi_2$ the simplest null-vector is created by $\langle -5|\,CQ$. It yields
$$\frac13\,\frac{1-q^3}{1+q^3}\,\Bigl(I_3 - 3I_1J_2 + \frac12 I_1^3\Bigr)\Phi_2,$$
which has to be compared with $(L_{-3}+\kappa_1L_{-1}L_{-2}+\kappa_2L_{-1}^3)\Phi_2$. Notice that the relative coefficients in our parametrization of the null-vectors are independent of $\xi$. So, they are exactly of the same form as the classical ones. However, this is not always the case.
Let us show that generally the number of our null-vectors is the same as for the representations of the Virasoro algebra. Recall that we consider the null-vectors which do not depend on the arithmetical properties of $\frac{\pi}{\xi}$, so there is one basic null-vector in every Verma module of the Virasoro algebra. The character of the irreducible module associated with $\Phi_m$ is
$$\chi_m(p) = (1-p^{m+1})\prod_{j\ge1}\frac{1}{1-p^j}, \tag{32}$$
where we omitted the multiplier with the scaling dimension of the primary field. We cannot control this scaling dimension; the dimensions of the descendents are understood relative to the dimension of the primary field. The character (32) is obtained from the character of the Verma module by omitting the module of descendents of the null-vector on the level $m+1$. Let us consider the character of the module which we constructed in terms of $I$, $J$. The dimensions of $I_{2k-1}$ and $J_{2k}$ are naturally $2k-1$ and $2k$. If we do not take into account the null-vectors, the characters of all the modules associated with $\Phi_m$ are the same:
$$\chi(p) = \prod_{j\ge1}\frac{1}{1-p^j}.$$
Let us take into account the null-vectors. They are described in terms of fermions. By consistency with the dimensions of $I_{2k-1}$ and $J_{2k}$ one finds that the dimensions of $\psi_l$ and $\psi^\dagger_{-l}$ equal $l$.
Technically it is easier to start with $\Phi_{2m+1}$. The space of descendents is $H(I)\otimes H^*_{2m}$, where $H(I)$ is the space of polynomials of $\{I_{2k-1}\}$.
Proposition 8. The character of the space of descendents of $\Phi_{2m+1}$, modulo the null-vectors, equals
$$\chi_{2m+1}(p) = (1-p^{2(m+1)})\prod_{j\ge1}\frac{1}{1-p^j}.$$
Proof. The null-vectors are defined in Proposition 7. It is easy to eliminate the null-vectors (ii): we have to consider the subspace $H^*_{-2m,0}$ in which the level $\tilde\psi_0$ is always occupied. Consider the sequence
$$H^*_{-2m-4,0}\ \xrightarrow{\;C\;}\ H^*_{-2m,0}\ \xrightarrow{\;C^m\;}\ H^*_{2m,0}.$$
The operator $C^m$ identifies the spaces $H^*_{-2m,0}$ and $H^*_{2m,0}$:
$$\mathrm{Im}\bigl(C^m|_{H^*_{-2m,0}\to H^*_{2m,0}}\bigr) = H^*_{2m,0}, \qquad \mathrm{Ker}\bigl(C^m|_{H^*_{-2m,0}\to H^*_{2m,0}}\bigr) = 0,$$
and $\langle -2m-2|\,\tilde\psi_0^\dagger\,C^m = \langle 2m|$. Hence we can count the descendents by vectors of the space $H(I)\otimes H^*_{-2m,0}$ with the null-vectors:
$$\langle\Psi_{-4-2m}|\,C \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-4-2m}|\in H^*_{-4-2m,0}.$$
Notice that the operator $C$ is dimensionless and
$$\mathrm{Ker}\bigl(C|_{H^*_{-4-2m,0}\to H^*_{-2m,0}}\bigr) = 0.$$
That is why for the character of the space of descendents without null-vectors we have:
$$\chi_{2m+1}(p) = \prod_{j\ge1}\frac{1}{1-p^{2j-1}}\;p^{-m(m-1)}\Bigl(\chi_{H^*_{-2m,0}}(p) - \chi_{H^*_{-2m-4,0}}(p)\Bigr), \tag{33}$$
where the first multiplier comes from $H(I)$, and the multiplier $p^{-m(m-1)}$ is needed in order to cancel the dimension of the vacuum vector in $H^*_{-2m}$. Let us evaluate the expression in brackets:
$$\chi_{H^*_{-2m,0}}(p) - \chi_{H^*_{-2m-4,0}}(p)$$
$$= \int\frac{dx}{x}\prod_{j\ge1}(1+p^{2j}x)(1+p^{2j}x^{-1})\,x^{-m} - \int\frac{dx}{x}\prod_{j\ge1}(1+p^{2j}x)(1+p^{2j}x^{-1})\,x^{-m-2}$$
$$= \int\frac{dx}{x}\,(1+x^{-1})\prod_{j\ge1}(1+p^{2j}x)(1+p^{2j}x^{-1})\,x^{-m}$$
$$\quad - \int\frac{dx}{x}\,(1+x^{-1})\prod_{j\ge1}(1+p^{2j}x)(1+p^{2j}x^{-1})\,x^{-m-1}$$
$$= (1-p^{2(m+1)})\int\frac{dx}{x}\,(1+x^{-1})\prod_{j\ge1}(1+p^{2j}x)(1+p^{2j}x^{-1})\,x^{-m}$$
$$= (1-p^{2(m+1)})\,\chi_{H^*_{-2m}}(p) = p^{m(m-1)}\,(1-p^{2(m+1)})\prod_{j\ge1}\frac{1}{1-p^{2j}},$$
where we have changed the variable of integration $x\to xp^{-2}$ in the second integral when passing from the third to the fourth line. Substituting this result into (33) we get the correct character:
$$\chi_{2m+1}(p) = (1-p^{2(m+1)})\prod_{j\ge1}\frac{1}{1-p^j}.$$
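Let us consider now the operators $\Phi_{2m}$. We parametrize the descendents of $\Phi_{2m}$ by the vectors from $H(I)\otimes H^*_{2m-1}$.
Proposition 9. The character of the space of descendents of $\Phi_{2m}$, modulo the null-vectors, equals
$$\chi_{2m}(p) = (1-p^{2m+1})\prod_{j\ge1}\frac{1}{1-p^j}.$$
Proof. The null-vectors are defined in Proposition 6. We have
$$H^*_{-2m-5}\ \xrightarrow{\;C\;}\ H^*_{-2m-1}\ \xrightarrow{\;C^m\;}\ H^*_{2m-1},$$
$$H^*_{-2m-3}\ \xrightarrow{\;Q\;}\ H^*_{-2m-1}\ \xrightarrow{\;C^m\;}\ H^*_{2m-1}.$$
The operator $C^m$ identifies $H^*_{-2m-1}$ and $H^*_{2m-1}$:
$$\mathrm{Ker}\bigl(C^m|_{H^*_{-2m-1}\to H^*_{2m-1}}\bigr) = 0, \qquad \mathrm{Im}\bigl(C^m|_{H^*_{-2m-1}\to H^*_{2m-1}}\bigr) = H^*_{2m-1},$$
and $\langle -2m-1|\,C^m = \langle 2m-1|$. Hence we can replace the space $H(I)\otimes H^*_{2m-1}$ with these null-vectors by $H(I)\otimes H^*_{-2m-1}$ with null-vectors
$$\langle\Psi_{-5-2m}|\,C \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-5-2m}|\in H^*_{-5-2m}, \tag{i}$$
$$\langle\Psi_{-3-2m}|\,Q \stackrel{w}{=} 0, \qquad \forall\,\langle\Psi_{-3-2m}|\in H^*_{-3-2m}. \tag{ii}$$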
So, the character in question is
$$\chi_{2m}(p) = \prod_{j\ge1}\frac{1}{1-p^{2j-1}}\;p^{-m^2}\Bigl(\chi_{H^*_{-2m-1,0}}(p) - \chi_{H^*_{-2m-5,0}}(p)\Bigr),$$
where $H^*_{-2l-1,0} = H^*_{-2l-1}/H^*_{-2l-3}Q$. In order to calculate the character $\chi_{H^*_{-2l-1,0}}(p)$ one has to take into account that $Q$ is a nilpotent operator, $Q^2=0$, with a trivial cohomology. Hence
$$\mathrm{Ker}\bigl(Q|_{H^*_{-2j-3}\to H^*_{-2j-1}}\bigr) = \mathrm{Im}\bigl(Q|_{H^*_{-2j-5}\to H^*_{-2j-3}}\bigr).$$
Summing up over this complex we obtain:
$$\chi_{H^*_{-2l-1,0}}(p) = \int_{|x|>1}\frac{dx}{x}\prod_{j\ge1}(1+p^{2j-1}x)(1+p^{2j-1}x^{-1})\,x^{-l}\,\frac{x}{x+1}.$$
Hence
$$\chi_{H^*_{-2m-1,0}}(p) - \chi_{H^*_{-2m-5,0}}(p) = \int_{|x|>1}\frac{dx}{x}\prod_{j\ge1}(1+p^{2j-1}x)(1+p^{2j-1}x^{-1})\,x^{-m}\,\frac{x}{x+1}\,\frac{x^2-1}{x^2}$$
$$= \int\frac{dx}{x}\prod_{j\ge1}(1+p^{2j-1}x)(1+p^{2j-1}x^{-1})\,x^{-m} - \int\frac{dx}{x}\prod_{j\ge1}(1+p^{2j-1}x)(1+p^{2j-1}x^{-1})\,x^{-m-1}$$
$$= (1-p^{2m+1})\,\chi_{H^*_{-2m-1}}(p) = p^{m^2}\,(1-p^{2m+1})\prod_{j\ge1}\frac{1}{1-p^{2j}}.$$
Thus the character is given by
$$\chi_{2m}(p) = (1-p^{2m+1})\prod_{j\ge1}\frac{1}{1-p^j},$$
as it should be.
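The last step of the proof of Proposition 9 can be checked order by order in $p$. The sketch below is only an illustration and not part of the original argument: it assumes that $\int dx/x$ is normalized so that it extracts the coefficient of $x^0$, and the function names are hypothetical. It compares the coefficient extraction with $p^{m^2}(1-p^{2m+1})\prod_{j\ge1}(1-p^{2j})^{-1}$ up to a chosen truncation order.

```python
def theta_product(order):
    """prod_{j>=1} (1 + p^(2j-1) x)(1 + p^(2j-1)/x), truncated below p^order.

    Returned as {(power_of_p, power_of_x): coefficient}.
    """
    series = {(0, 0): 1}
    for k in range(1, order, 2):          # k = 2j - 1
        for dx in (1, -1):                # factor (1 + p^k x^dx)
            new = dict(series)
            for (a, n), c in series.items():
                if a + k < order:
                    key = (a + k, n + dx)
                    new[key] = new.get(key, 0) + c
            series = new
    return series


def x_coefficient(series, n, order):
    """Power series in p (as a list) multiplying x^n."""
    out = [0] * order
    for (a, nn), c in series.items():
        if nn == n:
            out[a] += c
    return out


def even_partition_series(order):
    """Coefficients of prod_{j>=1} 1/(1 - p^(2j)), truncated below p^order."""
    coeffs = [1] + [0] * (order - 1)
    for part in range(2, order, 2):
        for a in range(part, order):
            coeffs[a] += coeffs[a - part]
    return coeffs


def check_identity(m, order=30):
    """Compare the x^0 coefficient of product * (x^-m - x^-(m+1)) with the closed form."""
    series = theta_product(order)
    # Assumption: the integral over dx/x picks the x^0 coefficient, so a factor
    # x^-m selects the coefficient of x^m in the product.
    lhs = [a - b for a, b in zip(x_coefficient(series, m, order),
                                 x_coefficient(series, m + 1, order))]
    rhs = [0] * order
    for a, c in enumerate(even_partition_series(order)):
        if a + m * m < order:
            rhs[a + m * m] += c            # p^(m^2) * series
        if a + (m + 1) ** 2 < order:
            rhs[a + (m + 1) ** 2] -= c     # - p^(m^2 + 2m + 1) * series
    return lhs == rhs
```

Calling `check_identity(m)` for small `m` gives a quick consistency check of the truncated series; it is a numerical sanity check, not a proof.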
4. Classical Case
4.1. Local fields and null-vectors in the classical theory. The classical limit of the light-cone component $T_{--}$ of the energy-momentum tensor gives the KdV field $u(x_-)$. When working with the multi-time formalism we shall identify $x_-$ with $t_1$. Local fields in the KdV theory, descendents of the identity operator, are simply polynomials in $u(t)$ and its derivatives with respect to $t_1$:
$$O = O(u, u', u'', \ldots). \tag{34}$$
We shall use both notations $\partial_1$ and $'$ for the derivatives with respect to $x_- = t_1$. Instead of the variables $u, u', u'', \ldots$, it will be more convenient to replace the odd derivatives of $u$ by the higher time derivatives $\partial_{2k-1}u$, according to the equations of motion
$$\frac{\partial L}{\partial t_{2k-1}} = \Bigl[\bigl(L^{\frac{2k-1}{2}}\bigr)_+,\,L\Bigr] = \frac{1}{2^{2k-1}}\,u^{(2k-1)} + \cdots.$$
Here $L = \partial_1^2 - u$ is the Lax operator of KdV. We have used the pseudo-differential operator formalism. We follow the book [13]. The even derivatives of $u(x)$ will be replaced by the densities $S_{2k}$ of the local integrals of motion,
$$S_{2k} = \mathrm{res}_{\partial_1}L^{\frac{2k-1}{2}} = -\frac{1}{2^{2k-1}}\,u^{(2k-2)} + \cdots.$$
In particular on level 2 we have $S_2 = -\frac12 u$. For a reader who prefers the $\tau$-function language, $S_{2k} = \partial_1\partial_{2k-1}\log\tau$. From analogy with the conformal case we put forward the following main conjecture underlying the classical picture:
Conjecture. We can write any local field as
$$O(u, u', u'', \ldots) = F_{O,0}(S_2, S_4, \cdots) + \sum_{\nu\ge1}\partial^\nu F_{O,\nu}(S_2, S_4, \cdots), \tag{35}$$
where $\nu = (i_1, i_3, \cdots)$ is a multi-index, $\partial^\nu = \partial_1^{i_1}\partial_3^{i_3}\cdots$, $|\nu| = i_1 + 3i_3 + \cdots$. We have checked this conjecture up to very high levels. To see that this conjecture is a non-trivial one, let us compute the character of the space of local fields, Eq. (34). Attributing the degree 2 to $u$ and 1 to $\partial_1$, we find that
$$\chi_1 = \prod_{j\ge2}\frac{1}{1-p^j} = (1-p)\prod_{j\ge1}\frac{1}{1-p^j} = 1 + p^2 + p^3 + 2p^4 + 2p^5 + \cdots.$$
On the other hand the character of the elements in the right hand side of Eq. (35) is
$$\chi_2 = \prod_{j\ge1}\frac{1}{1-p^{2j-1}}\prod_{j\ge1}\frac{1}{1-p^{2j}} = \prod_{j\ge1}\frac{1}{1-p^j} = 1 + p + 2p^2 + 3p^3 + 5p^4 + 7p^5 + \cdots.$$
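Hence $\chi_1 < \chi_2$, and this is precisely why null-vectors exist. Let us give some examples of null-vectors: (36)
level 1: $\partial_1\cdot 1 = 0$,
level 2: $\partial_1^2\cdot 1 = 0$,
level 3: $\partial_1^3\cdot 1 = 0$, $\quad\partial_3\cdot 1 = 0$,
level 4: $\partial_1^4\cdot 1 = 0$, $\quad\partial_1\partial_3\cdot 1 = 0$, $\quad(\partial_1^2S_2 - 4S_4 + 6S_2^2)\cdot 1 = 0$,
level 5: $\partial_1^5\cdot 1 = 0$, $\quad\partial_1^2\partial_3\cdot 1 = 0$, $\quad\partial_5\cdot 1 = 0$, $\quad\partial_1(\partial_1^2S_2 - 4S_4 + 6S_2^2)\cdot 1 = 0$, $\quad(\partial_3S_2 - \partial_1S_4)\cdot 1 = 0$.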
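The counting behind the list (36) can be reproduced mechanically. The following sketch (an illustration only; the helper name `product_series` is hypothetical) expands $\chi_1$ and $\chi_2$ as truncated power series and takes their difference, which should give the number of null-vectors at each level.

```python
def product_series(parts, order):
    """Coefficients of prod_{j in parts} 1/(1 - p^j) up to and including p^order."""
    coeffs = [1] + [0] * order
    for part in parts:
        for n in range(part, order + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


ORDER = 5
# chi_1: local fields built from u (degree 2) and its t1-derivatives, i.e. j >= 2.
chi1 = product_series(range(2, ORDER + 1), ORDER)   # [1, 0, 1, 1, 2, 2]
# chi_2: right-hand side of Eq. (35), generators of every degree j >= 1.
chi2 = product_series(range(1, ORDER + 1), ORDER)   # [1, 1, 2, 3, 5, 7]
# Number of null-vectors per level = chi_2 - chi_1.
null_counts = [b - a for a, b in zip(chi1, chi2)]   # [0, 1, 1, 2, 3, 5]
```

The entries 1, 1, 2, 3, 5 at levels 1 to 5 agree with the number of relations listed in (36).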
We have written all the null-vectors explicitly to show that their numbers exactly match the character formulae. The non-trivial null-vector at level 4 expresses $S_4$ in terms of the original variable $u$: $4S_4 = -\frac12u'' + \frac32u^2$. With this identification the non-trivial null-vector at level 5, $\partial_3S_2 - \partial_1S_4$, gives the KdV equation itself
$$\partial_3u + \frac32\,uu' - \frac14\,u''' = 0.$$
More generally one can consider the descendents of the fields $e^{im\varphi}$, where $\varphi$ is related to $u$ by the Miura transformation $u = -\varphi'^2 + i\varphi''$. Here, the presence of $i$ is a matter of convention. The reality problems have been discussed at length in [2]. For this consideration and for other purposes we need certain information about the Baker-Akhiezer function. The Baker-Akhiezer function $w(t,A)$ is a solution of the equation
$$L\,w(t,A) = A^2\,w(t,A), \tag{37}$$
which admits an asymptotic expansion at $A=\infty$ of the form
$$w(t,A) = e^{\zeta(t,A)}\bigl(1+O(1/A)\bigr); \qquad \zeta(t,A) = \sum_{k\ge1}t_{2k-1}A^{2k-1}.$$
In these formulae, higher times are considered as parameters. The second solution of Eq. (37), denoted by $w^*(t,A)$, has the asymptotics
$$w^*(t,A) = e^{-\zeta(t,A)}\bigl(1+O(1/A)\bigr).$$
These definitions do not fix completely the Baker-Akhiezer functions since we can still multiply them by constant asymptotic series of the form $1+O(1/A)$. Since normalizations will be important to us, let us give a more precise definition. We first introduce the dressing operator $\Phi$:
$$L = \Phi\,\partial_1^2\,\Phi^{-1}; \qquad \Phi = 1 + \sum_{i>1}\Phi_i\,\partial_1^{-i},$$
and we define
$$w(t,A) = \Phi\,e^{\zeta(t,A)}, \qquad w^*(t,A) = (\Phi^*)^{-1}e^{-\zeta(t,A)},$$
where $\Phi^* = 1 + \sum_{i>1}(-\partial_1)^{-i}\Phi_i$ is the formal adjoint of $\Phi$.
Proposition 10. With the above definitions, one has
1. The wronskian $W(A) = w(t,A)'\,w^*(t,A) - w^*(t,A)'\,w(t,A)$ takes the value $W(A) = 2A$.
2. The generating function of the local densities $S(A) = 1 + \sum_{k>0}S_{2k}A^{-2k}$ is related to the Baker-Akhiezer function by
$$S(A) = w(t,A)\,w^*(t,A).$$
3. The function $S(A)$ satisfies the Riccati equation
$$2S(A)S(A)'' - (S(A)')^2 - 4uS(A)^2 - 4A^2S(A)^2 + 4A^2 = 0. \tag{38}$$
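Proof. Let us prove the wronskian identity. This amounts to showing that $\mathrm{res}_A\bigl(W(A)A^i\bigr) = 2\delta_{i,-2}$. But we have
$$\mathrm{res}_A\,W(A)A^i = \mathrm{res}_A\Bigl\{\partial_1\Phi\,\partial_1^i\,e^{\zeta(t,A)}\cdot(\Phi^*)^{-1}e^{-\zeta(t,A)} - \Phi\,e^{\zeta(t,A)}\cdot\partial_1(\Phi^*)^{-1}(-\partial_1)^i\,e^{-\zeta(t,A)}\Bigr\}.$$
We can transform the residue in $A$ into a residue in $\partial_1$ using the formula
$$\mathrm{res}_A\Bigl\{P\,e^{\zeta(t,A)}\cdot Q\,e^{-\zeta(t,A)}\Bigr\} = \mathrm{res}_{\partial_1}\,PQ^*.$$
Hence we find
$$\mathrm{res}_A\,W(A)A^i = \mathrm{res}_{\partial_1}\Bigl\{\partial_1\Phi\,\partial_1^i\,\Phi^{-1} + \Phi\,\partial_1^i\,\Phi^{-1}\partial_1\Bigr\} = \mathrm{res}_{\partial_1}\Bigl\{\partial_1L^{\frac i2} + L^{\frac i2}\partial_1\Bigr\}.$$
If $i$ is even and positive, the residue is zero because $L^{\frac i2}$ is a purely differential operator. If $i=-2$ the residue is obviously 2, and if $i<-2$ it is zero. If $i$ is odd, then $(L^{\frac i2})^* = -L^{\frac i2}$, so that the operator $\partial_1L^{\frac i2} + L^{\frac i2}\partial_1$ is formally self-adjoint and it cannot have a residue. The proof of 2) is simple [13]:
$$\mathrm{res}_A\,A^{2k-1}\,w(t,A)\,w^*(t,A) = \mathrm{res}_{\partial_1}\,\Phi\,\partial_1^{2k-1}\,\Phi^{-1} = \mathrm{res}_{\partial_1}L^{\frac{2k-1}{2}} = S_{2k}.$$
The Riccati equation follows immediately from 1), 2) and Eq. (37).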
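As an illustration of how Eq. (38) determines the densities, one can expand $S(A) = \sum_{j\ge0}S_{2j}A^{-2j}$ with $S_0=1$ and solve order by order in $A^{-2}$. The sketch below is not taken from the paper: it uses a small hand-rolled representation of differential polynomials in $u$ (all names are hypothetical), recovers $S_2$ and $S_4$, and checks the level-4 relation and the KdV equation quoted earlier in this section.

```python
from fractions import Fraction

# A differential polynomial in u is a dict {monomial: coefficient}; the monomial
# (e0, e1, e2, ...) stands for u^e0 * (u')^e1 * (u'')^e2 * ...

def trim(mono):
    """Drop trailing zero exponents so every monomial has a unique key."""
    mono = list(mono)
    while mono and mono[-1] == 0:
        mono.pop()
    return tuple(mono)

def add(a, b, scale=1):
    """Return a + scale*b with zero coefficients removed."""
    out = dict(a)
    for mono, c in b.items():
        out[mono] = out.get(mono, 0) + Fraction(scale) * c
        if out[mono] == 0:
            del out[mono]
    return out

def mul(a, b):
    """Product of two differential polynomials."""
    out = {}
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            n = max(len(m1), len(m2))
            p1, p2 = m1 + (0,) * n, m2 + (0,) * n
            mono = trim(tuple(p1[i] + p2[i] for i in range(n)))
            out = add(out, {mono: c1 * c2})
    return out

def d(a):
    """Derivative with respect to t1 (Leibniz rule on each monomial)."""
    out = {}
    for mono, c in a.items():
        for i, e in enumerate(mono):
            if e:
                new = list(mono) + [0]
                new[i] -= 1
                new[i + 1] += 1
                out = add(out, {trim(new): c * e})
    return out

ONE = {(): Fraction(1)}
U = {(1,): Fraction(1)}

def densities(kmax):
    """S[j] = S_{2j}, obtained from the A^(-2k) coefficients of Eq. (38)."""
    S = [ONE]
    for k in range(kmax):
        rhs = {}
        for i in range(k + 1):
            j = k - i
            rhs = add(rhs, mul(S[i], d(d(S[j]))), 2)         # 2 S S''
            rhs = add(rhs, mul(d(S[i]), d(S[j])), -1)        # -(S')^2
            rhs = add(rhs, mul(U, mul(S[i], S[j])), -4)      # -4 u S^2
        for i in range(1, k + 1):
            rhs = add(rhs, mul(S[i], S[k + 1 - i]), -4)      # -4 A^2 S^2, known part
        S.append({m: c / 8 for m, c in rhs.items()})         # 8 S_{2k+2} = rhs
    return S

S = densities(2)
S2, S4 = S[1], S[2]
# Level-4 relation: d1^2 S2 - 4 S4 + 6 S2^2 should vanish identically.
level4 = add(add(d(d(S2)), S4, -4), mul(S2, S2), 6)
# Level-5 relation d3 S2 - d1 S4 with d3 u = u'''/4 - (3/2) u u' (the KdV equation).
u_t3 = {(0, 0, 0, 1): Fraction(1, 4), (1, 1): Fraction(-3, 2)}
level5 = add(add({}, u_t3, Fraction(-1, 2)), d(S4), -1)
```

Under these assumptions the recursion gives $S_2 = -\frac12u$ and $S_4 = -\frac18u'' + \frac38u^2$, and both `level4` and `level5` come out empty, i.e. identically zero.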
Let us return to the descendents of the primary fields. For the descendents of the fields $e^{im\varphi}$, our conjecture states that
$$O(u, u', u'', \ldots)\,e^{im\varphi} = \sum_{\nu\ge0}\partial^\nu F_{O,\nu}(S_2, S_4, \cdots)\,e^{im\varphi}. \tag{39}$$
Let us consider for example $e^{i\varphi}$. For a true solution of the KdV equation, the Baker-Akhiezer function is a true function on the spectral curve, and it can be analytically continued to $A=0$. From the definition of $e^{i\varphi}$ we have $Le^{i\varphi}=0$. Comparing with Eq. (37), we see that $e^{i\varphi} = w(t,A)|_{A=0}$. To check Eq. (39), at least on the first few levels, we need the time derivatives of $e^{i\varphi}$. They can be obtained as follows. The time evolution of the Baker-Akhiezer function is well known,
$$\frac{\partial w}{\partial t_{2k-1}} = \bigl(L^{\frac{2k-1}{2}}\bigr)_+\,w.$$
By analytical continuation to $A=0$, we obtain the evolution equations for $e^{i\varphi}$. Let us give some examples of these null-vectors. We show below the first null-vector associated to the primary fields $e^{im\varphi}$.
$m=1$: $(\partial_1^2 + 2S_2)\,e^{i\varphi} = 0$,
$m=2$: $(2\partial_3 + \partial_1^3 + 6\partial_1S_2)\,e^{2i\varphi} = 0$,
$m=3$: $(8\partial_1\partial_3 + \partial_1^4 + 12\partial_1^2S_2 + 24S_4)\,e^{3i\varphi} = 0$,
$m=4$: $(24\partial_5\partial_1 + 20\partial_3\partial_1^3 + \partial_1^6 + 20\partial_1^4S_2 + 40\partial_3\partial_1S_2 + 120\partial_1^2S_4)\,e^{4i\varphi} = 0$.
In these formulae, the derivatives act on everything on their right, i.e. $\partial_1S_2\,e^{2i\varphi} = \partial_1(S_2\,e^{2i\varphi})$.
4.2. Finite-zone and soliton solutions. For the finite-zone solutions, the Baker-Akhiezer function is an analytical function on the spectral curve, which is an algebraic Riemann surface. Let us recall briefly the construction [14, 15]. We start with a hyperelliptic curve $\Gamma$ of genus $n$ described by the equation
$$\Gamma:\ Y^2 = X\,P(X), \qquad P(X) = \prod_{j=1}^{2n}(X - B_j^2), \qquad B_{2n} > \cdots > B_2 > B_1 > 0.$$
For historical reasons we prefer to work with the parameter $A$ such that $X = A^2$. The surface is realized as the $A$-plane with cuts on the real axis over the intervals $c_i = (B_{2i-1}, B_{2i})$ and $\bar{c}_i = (-B_{2i}, -B_{2i-1})$, $i = 1,\cdots,n$; the upper (lower) bank of $c_i$ is identified with the upper (lower) bank of $\bar{c}_i$. The square root $\sqrt{P(A^2)}$ is chosen so that $\sqrt{P(A^2)}\to A^{2n}$ as $A\to\infty$. The canonical basis of cycles is chosen as follows: the cycle $a_i$ starts from $B_{2i-1}$ and goes in the upper half-plane to $-B_{2i-1}$, and $b_i$ is an anti-clockwise cycle around the cut $c_i$.
Let us consider in addition a divisor of order $n$ on the surface $\Gamma$: $D = (P_1,\cdots,P_n)$. With these data we construct the Baker-Akhiezer function, which is the unique function with the following analytical properties:
– It has an essential singularity at infinity: $w(t,A) = e^{\zeta(t,A)}(1+O(1/A))$.
– It has $n$ simple poles outside infinity. The divisor of these poles is $D$.
Considering the quantity $-\partial_1^2w + A^2w$, we see that it has the same analytical properties as $w$ itself, apart from the first normalization condition. Hence, because $w$ is unique, there exists a function $u(t)$ such that
$$-\partial_1^2w + u(t)\,w + A^2w = 0. \tag{40}$$
We recognize Eq. (37). One can give various explicit constructions of the Baker-Akhiezer function. Let us introduce the divisor $Z(t)$ of the zeroes of the Baker-Akhiezer function. It is of degree $n$: $Z(t) = (A_1(t),\cdots,A_n(t))$. The equations of motion for the divisor $Z(t)$ read [15]
$$\partial_1A_i(t) = -\frac{\sqrt{P(A_i^2(t))}}{\prod_{j\ne i}\bigl(A_i^2(t)-A_j^2(t)\bigr)}. \tag{41}$$
The normalization of the Baker-Akhiezer function corresponds to a particular choice of the divisor of its poles $D$. Later we shall specify the divisor which corresponds to the normalization of the Baker-Akhiezer function required in the previous subsection; for the moment we give a formula in which the normalization is irrelevant. Consider two sets of times $t$ and $t^{(0)}$, differing only by the value of $t_1$. Then we can write
$$\frac{w(t,A)}{w(t^{(0)},A)} = \sqrt{\frac{Q(A^2,t)}{Q(A^2,t^{(0)})}}\;\exp\left(\int_{t_1^{(0)}}^{t_1}\frac{A\sqrt{P(A^2)}}{Q(A^2,t)}\,dt_1\right), \tag{42}$$
where the polynomial $Q(A^2,t)$ is defined as $Q(A^2,t) = \prod_i\bigl(A^2-A_i^2(t)\bigr)$.
The ratio of two dual Baker-Akhiezer functions $\frac{w^*(t,A)}{w^*(t^{(0)},A)}$ is obtained by applying the hyperelliptic involution. This amounts to the reflection $A\to-A$ in Eq. (42). Let us prove the following simple proposition.
Proposition 11. For the Baker-Akhiezer functions $w(t,A)$, $w^*(t,A)$ normalized by
$$w(t,A)'\,w^*(t,A) - w^*(t,A)'\,w(t,A) = 2A,$$
we have
$$S(A) = \frac{Q(A^2)}{\sqrt{P(A^2)}} \equiv \exp\Bigl(-\sum_k\frac1k\,J_{2k}\,A^{-2k}\Bigr); \tag{43}$$
the latter equality is the definition of $J_{2k}$. We recall that $Q(A^2)$ and $P(A^2)$ are the polynomials
$$Q(A^2) = \prod_{i=1}^{n}(A^2-A_i^2), \qquad P(A^2) = \prod_{i=1}^{2n}(A^2-B_i^2).$$
Proof. To prove the proposition we use the Wronskian identity
$$w(t,A)'\,w^*(t,A) - w^*(t,A)'\,w(t,A) = w(t,A)\,w^*(t,A)\,\partial_1\log\frac{w(t,A)}{w^*(t,A)} = 2A,$$
but using Eq. (42) and the fact that $\frac{w(t,A)}{w(t^{(0)},A)}$ and $\frac{w^*(t,A)}{w^*(t^{(0)},A)}$ differ by the sign of the square root, we have
$$\partial_1\log\frac{w(t,A)}{w^*(t,A)} = 2\,\frac{A\sqrt{P(A^2)}}{Q(A^2)},$$
and the result follows.
Notice that for $J_{2k}$ defined in (43) we have
$$J_{2k} = \sum_iA_i^{2k} - \frac12\sum_iB_i^{2k}.$$
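The expression for $J_{2k}$ follows from expanding $\log\bigl(Q(A^2)/\sqrt{P(A^2)}\bigr)$ in powers of $A^{-2}$. A quick numerical sanity check of Eq. (43) with this $J_{2k}$ can be sketched as follows; the sample values of $A_i$, $B_j$ and $A$ are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def S_closed(A, zeros, branch):
    """Closed form Q(A^2)/sqrt(P(A^2)) from Eq. (43), for real A above all B_j."""
    q = math.prod(A * A - a * a for a in zeros)
    p = math.prod(A * A - b * b for b in branch)
    return q / math.sqrt(p)

def J(k, zeros, branch):
    """J_{2k} = sum_i A_i^(2k) - (1/2) sum_i B_i^(2k)."""
    return sum(a ** (2 * k) for a in zeros) - 0.5 * sum(b ** (2 * k) for b in branch)

def S_series(A, zeros, branch, kmax=60):
    """exp(-sum_k J_{2k} A^(-2k) / k), truncated at kmax terms."""
    return math.exp(-sum(J(k, zeros, branch) * A ** (-2 * k) / k
                         for k in range(1, kmax + 1)))

# Illustrative data for n = 2: two zeros A_i and four branch points B_j.
zeros = [0.3, 0.5]
branch = [0.2, 0.4, 0.6, 0.8]
A = 3.0
difference = abs(S_closed(A, zeros, branch) - S_series(A, zeros, branch))
```

For $|A|$ larger than all $B_j$ the series converges quickly, so `difference` should be at the level of floating-point rounding.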
From Eq. (43) we see that the normalization of $w(t,A)$ and $w^*(t,A)$ which corresponds to the proper value of the Wronskian is such that the divisors $D$ and $D^*$ are composed of Weierstrass points and $D + D^* = (B_1,\cdots,B_{2n})$. Actually, it is this quite unique normalization which was used by Akhiezer in his original paper.
Now we are in a position to describe the dynamics of $S(A)$ with respect to all times. It is very useful to define the following strange object
$$dI(D) = \sum_{k\ge1}D^{-2k}\,\frac{\partial}{\partial t_{2k-1}}\,dD. \tag{44}$$
$dI(D)$ is a 1-form in the $D$-plane and a vector field with respect to times. We have
Proposition 12.
$$dI(D)\cdot S(A) = \frac{S(D)\,S(A)' - S(A)\,S(D)'}{D^2-A^2}\,dD. \tag{45}$$
Proof. We give a proof of this proposition for the finite-zone solutions, which are our main concern here, but clearly the formula is quite general. We are sure that a general proof of Eq. (45) exists, but it must be based on manipulations with asymptotic formulae. We prefer to work with analytical functions. Anyway, every solution of KdV can be obtained from the finite-zone ones by a suitable limiting procedure, so considering finite-zone solutions is not a real restriction.
Let us describe the motion, under the time $t_l$, of the divisor $Z(t)$ of the zeroes of the Baker-Akhiezer function. Introduce the normalized holomorphic differentials $d\omega_i$ for $i=1,\cdots,n$ and the normalized second kind differentials with singularity at infinity $d\tilde\omega_{2i-1}$, $i\ge1$,
$$\int_{a_j}d\omega_i = \delta_{i,j}, \qquad \int_{a_j}d\tilde\omega_{2i-1} = 0, \qquad d\tilde\omega_{2i-1}(A) = d(A^{2i-1}) + O(A^{-2})\,dA \quad\text{for } A\sim\infty.$$
It is well known that, by the Abel map, this motion is transformed into a linear flow on the Jacobi variety,
$$\frac{\partial}{\partial t_{2l-1}}\sum_j\int^{A_j}d\omega_k = \int_{b_k}d\tilde\omega_{2l-1}.$$
From this equation one easily finds:
$$dI(D)\cdot\sum_j\int^{A_j}d\omega_k = \int_{b_k}d\tilde\omega_D, \tag{46}$$
where $d\tilde\omega_D$ is a 2-differential defined on $\Gamma\times\Gamma'$ ($\Gamma'$ is the Riemann sphere) parametrized by $A$ and $D$ respectively. It is useful to think of $\Gamma'$ as a realization of the curve $Y^2=X$ similar to $\Gamma$. The $a$-periods of $d\tilde\omega_D$ on $\Gamma$ vanish. The only singularities of the differential $d\tilde\omega_D$ are the second order poles at the two points $A=\pm D$:
$$d\tilde\omega_D(A) = \left(\frac{A^2+D^2}{(A^2-D^2)^2} + O(1)\right)dA\,dD.$$
By Riemann's bilinear relations one easily finds:
$$\int_{b_k}d\tilde\omega_D = d\omega_k(D).$$
Equation (46) then takes the form
$$dI(D)\cdot\sum_j\int^{A_j}d\omega_k = d\omega_k(D). \tag{47}$$
The normalized holomorphic differentials are linear combinations of the differentials
$$d\sigma_k(A) = \frac{A^{2k-2}}{\sqrt{P(A^2)}}\,dA, \qquad k=1\cdots n,$$
with coefficients depending on $B_i$. They do not depend on times. Hence by linearity we can write for $d\sigma_k$ the same equation as (47). Differentiating explicitly we get the following system of equations:
$$\sum_{j=1}^{n}\frac{A_j^{2k-2}}{\sqrt{P(A_j^2)}}\,dI(D)\cdot A_j = \frac{D^{2k-2}}{\sqrt{P(D^2)}}\,dD; \qquad k=1\cdots n.$$
Solving this linear system of equations gives
$$dI(D)\cdot A_j^2 = 2A_j\,\frac{\sqrt{P(A_j^2)}}{\prod_{i\ne j}(A_j^2-A_i^2)}\;\frac{Q(D^2)}{\sqrt{P(D^2)}}\;\frac{1}{D^2-A_j^2}\,dD = -\frac{Q(D^2)}{\sqrt{P(D^2)}}\;\frac{1}{D^2-A_j^2}\;\partial_1A_j^2\,dD,$$
where we have used Eq. (41) in the last step. Finally, using Eq. (43) we find
$$dI(D)\cdot S(A) = S(D)\,S(A)\sum_j\frac{1}{D^2-A_j^2}\,\frac{1}{A^2-A_j^2}\,\partial_1(A_j^2)\,dD = \frac{1}{D^2-A^2}\,S(D)\,S(A)\left(\log\frac{S(D)}{S(A)}\right)'dD.$$
The soliton solutions correspond to a rational degeneration of the finite-zone solutions such that
$$B_{2j-1}\to\tilde{B}_j\leftarrow B_{2j}, \qquad j=1,\cdots,n.$$
The points of the divisor $A_i$ and the points $\tilde{B}_j$ are the coordinates in the $n$-soliton phase space. In [2] we gave a detailed discussion of the Hamiltonian structure. In particular we have
$$\frac{\partial}{\partial t_{2k-1}} = \{I_{2k-1},\cdot\}; \qquad I_{2k-1} = \frac{1}{2k-1}\sum_i\tilde{B}_i^{2k-1}.$$
The expressions for the local observables also follow easily from the finite-zone case. In particular
$$J_{2k}(A,\tilde{B}) = \sum_{i=1}^{n}\bigl(A_i^{2k}-\tilde{B}_i^{2k}\bigr). \tag{48}$$
The expressions for $S_{2k}$ follow from here.
4.3. Classical limit of Q and C. Let us consider the classical limit $\nu\to\infty$ of the operators $Q$ and $C$. To this aim, we have to understand the relation between the quantum and the classical descriptions of the observables. In the quantum case we considered the form factors, i.e. matrix elements of the form
$$f_O(\beta_1,\cdots,\beta_{2n})_{-\cdots-+\cdots+} = \langle0|\,O(0)\,|\beta_1,\cdots,\beta_n;\beta_{n+1},\cdots,\beta_{2n}\rangle,$$
where $\beta_1,\cdots,\beta_n$ are rapidities of anti-solitons and $\beta_{n+1},\cdots,\beta_{2n}$ are rapidities of solitons. The matrix elements of this form do not allow a direct semi-classical interpretation; it is necessary to perform a crossing transformation to the matrix elements between two $n$-soliton states:
$$\langle\beta_1,\cdots,\beta_n|\,O(0)\,|\beta_{n+1},\cdots,\beta_{2n}\rangle = f_O(\beta_1-\pi i,\cdots,\beta_n-\pi i,\beta_{n+1},\cdots,\beta_{2n})_{-\cdots-+\cdots+}.$$
In [2] it is explained that the formula (2) for this form factor is a result of quantization of $n$-soliton solutions in which $A_1,\cdots,A_n$ play the role of coordinates, and $B_1,\cdots,B_n$ and $B_{n+1},\cdots,B_{2n}$ give the collection of eigenvalues for two eigenstates. Recall that the generating function for the local descendents of the primary field $\Phi_m$ was written as follows:
$$L_m(t,y|A|B) = \exp\Bigl(\sum_{k\ge1}\bigl(t_{2k-1}I_{2k-1}(B) + y_{2k}J_{2k}(A|B)\bigr)\Bigr)\prod_{i=1}^{n}A_i^{m}\prod_{j=1}^{2n}B_j^{-\frac m2}, \tag{49}$$
where $I_{2k-1}(B)$ and $J_{2k}(A|B)$ are defined in Eqs. (5,6). The expression
O. Babelon, D. Bernard, F. A. Smirnov
$$\exp\Big(\sum_{k\ge 1} y_{2k}J_{2k}(A|B)\Big)\prod_{i=1}^{n}A_i^{m}\prod_{j=1}^{2n}B_j^{-m/2}$$
is practically unchanged under the crossing transformation which corresponds to $B_i \to -B_i$ for $i = 1,\dots,n$. Comparing it with the classical formulae (48) we see that it corresponds to a special symmetric ordering of them, for example
$$J_{2k}(A|B) = \frac12\Big(J_{2k}(A_1,\dots,A_n,B_1,\dots,B_n) + J_{2k}(A_1,\dots,A_n,B_{n+1},\dots,B_{2n})\Big).$$
This ordering is a prescription which we make for the quantization. On the other hand the eigenvalues of the Hamiltonians $I_{2k-1}(B)$ under the crossing transformation change to
$$-I_{2k-1}(B_1,\dots,B_n) + I_{2k-1}(B_{n+1},\dots,B_{2n}),$$
i.e. the descendents with respect to $I_{2k-1}$ correspond to taking the commutator of $O$ with $I_{2k-1}$. Certainly the classical limit makes sense only for the states with close eigenvalues, so it is needed that
$$s_{2k-1}(B_1,\dots,B_n) - s_{2k-1}(B_{n+1},\dots,B_{2n}) = O(\xi).$$
Recall that $\xi = -i\log(q)$ plays the role of Planck's constant. Thus comparing the classical and quantum pictures provides the following result. The quantum generating function (49) corresponds to the classical generating function
$$L^{cl}_m(t,y) = \exp\Big(\sum_{k\ge1}t_{2k-1}I_{2k-1}\Big)\cdot\exp\Big(\sum_{k\ge1}y_{2k}J_{2k}\Big)e^{im\varphi}, \tag{50}$$
where $\cdot$ means the application of Poisson brackets. In fact $I_{2k-1}$ can be replaced by $\partial_{2k-1}$. The normalization in the formula for $I(B)$ (5) is chosen in order to provide an exact agreement with the classical formulae.

Now let us consider the classical limit of the operators $Q$ and $C$ in the Neveu–Schwarz sector. For $Q$ we had the formula
$$Q = \int\frac{dD}{D}\,e^{X(D)}\psi(D) = \int\frac{dD}{D}\,\sinh(X(D))\,\psi(D).$$
The latter equation is due to the fact that the fermion is odd. From the definition (22) of $X(D)$, we have in the classical limit
$$X(D) \to -\frac{i\xi}{2}\sum_{k\ge1}D^{-2k+1}I_{2k-1}.$$
Hence the following expression is finite in the classical limit:
$$Q_{cl} = \lim_{\xi\to0}\frac{2i}{\xi}\,Q = \int\psi(D)\,dI(D), \tag{51}$$
where $dI(D)$ is the 1-form in the $D$-plane introduced in the previous subsection: $dI(D) = \sum_{k\ge1}D^{-2k}I_{2k-1}\,dD$. Remark that $Q_{cl}$ can be thought of as a generalized Dirac operator.

For $C$ we had the formula
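$$C = \int_{|D_2|>|D_1|}\frac{dD_2}{D_2}\int\frac{dD_1}{D_1}\,e^{X(D_1)}e^{X(D_2)}\,\tau_e\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi(D_2)$$
$$= \int_{|D_2|>|D_1|}\frac{dD_2}{D_2}\int\frac{dD_1}{D_1}\,\cosh(X(D_1))\cosh(X(D_2))\,\tau_e^{-}\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi(D_2)$$
$$+ \int_{|D_2|>|D_1|}\frac{dD_2}{D_2}\int\frac{dD_1}{D_1}\,\sinh(X(D_1))\sinh(X(D_2))\,\tau_e^{+}\!\left(\frac{D_1}{D_2}\right)\psi(D_1)\psi(D_2),$$
where $\tau_e^{+}$ and $\tau_e^{-}$ are even and odd parts of $\tau_e$:
$$\tau_e^{-}(x) = \sum_{k=1}^{\infty}\frac{1-q^{2k-1}}{1+q^{2k-1}}\,x^{2k-1}, \qquad \tau_e^{+}(x) = -\sum_{k=1}^{\infty}\frac{1+q^{2k}}{1-q^{2k}}\,x^{2k}.$$
Obviously when $\xi \to 0$ one has
$$\tau_e^{-}(x) \to -\frac{i\xi}{2}\,x\frac{d}{dx}\,\frac{x}{1-x^2}, \qquad \tau_e^{+}(x) \to -(i\xi)^{-1}\log(1-x^2).$$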
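The two limits just quoted can be checked numerically. The following is only a sketch and not part of the original argument: it assumes $q = e^{i\xi}$ (as implied by $\xi = -i\log q$), truncates both series at a finite number of terms (adequate for $|x| < 1$), and all function names are hypothetical.

```python
import cmath
import math


def tau_minus(x, xi, terms=80):
    """Odd part of tau_e: sum_{k>=1} (1 - q^(2k-1)) / (1 + q^(2k-1)) * x^(2k-1), q = exp(i*xi)."""
    q = cmath.exp(1j * xi)
    return sum((1 - q ** (2 * k - 1)) / (1 + q ** (2 * k - 1)) * x ** (2 * k - 1)
               for k in range(1, terms + 1))


def tau_plus(x, xi, terms=80):
    """Even part of tau_e: -sum_{k>=1} (1 + q^(2k)) / (1 - q^(2k)) * x^(2k), q = exp(i*xi)."""
    q = cmath.exp(1j * xi)
    return -sum((1 + q ** (2 * k)) / (1 - q ** (2 * k)) * x ** (2 * k)
                for k in range(1, terms + 1))


def tau_minus_classical(x, xi):
    """Leading term -(i*xi/2) * x * d/dx [x / (1 - x^2)], with the derivative taken by hand."""
    return -0.5j * xi * x * (1 + x * x) / (1 - x * x) ** 2


def tau_plus_classical(x, xi):
    """Leading term -(i*xi)^(-1) * log(1 - x^2)."""
    return -math.log(1 - x * x) / (1j * xi)


def relative_error(exact, approx):
    """Relative distance between two complex numbers."""
    return abs(exact - approx) / abs(approx)


if __name__ == "__main__":
    x, xi = 0.3, 1e-3
    print(relative_error(tau_minus(x, xi), tau_minus_classical(x, xi)))
    print(relative_error(tau_plus(x, xi), tau_plus_classical(x, xi)))
```

Both relative errors shrink like $\xi^2$ as $\xi \to 0$, which is the sense in which the odd part is of order $\xi$ and the even part of order $\xi^{-1}$.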
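So, the following expression is finite in the classical limit:
$$C_{cl} = \lim_{\xi\to0}\frac{2C}{\pi\xi} = \int\psi(D)\frac{d}{dD}\psi(D)\,dD + \frac{1}{2\pi i}\int\!\!\int_{|D_2|>|D_1|}dI(D_2)\,dI(D_1)\,\log\!\left(1-\Big(\frac{D_1}{D_2}\Big)^{2}\right)\psi(D_1)\psi(D_2). \tag{52}$$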
In the next subsection we are going to apply these operators to a description of the classical KdV hierarchy. Notice that as usual the quantum formulae are far more symmetric than the classical ones.

4.4. The classical equations of motion from $Q_{cl}$ and $C_{cl}$. In this subsection we shall consider only the descendents of the identity, i.e. the pure KdV fields. We have described this space by the generating function (50):
$$L^{cl}_{m=0}(t,y) = \exp\Big(\sum_{k\ge1}t_{2k-1}I_{2k-1}\Big)\cdot\exp\Big(\sum_{k\ge1}y_{2k}J_{2k}\Big)\cdot 1.$$
Let us fermionize $J_{2k}$ and apply the equations
$$\langle\Psi_{-3}|\,Q_{cl} \stackrel{w}{=} 0, \qquad \langle\Psi_{-5}|\,C_{cl} \stackrel{w}{=} 0 \tag{53}$$
to the description of the equations of motion. In this section the symbol $\stackrel{w}{=} 0$ means the vanishing of the scalar product with the generating function of local fields. We give the list of null-vectors following from these two equations up to level 5, specifying explicitly the vectors $\langle\Psi_{-3}|$ and $\langle\Psi_{-5}|$ from which they come. We do not write the descendents with respect to the $I$'s of the already listed null-vectors:

Null-vectors coming from $Q_{cl}$:
$$\begin{array}{lcl}
\langle -1|\psi^*_{-1} & : & \partial_1\cdot 1\\
\langle -1|\psi^*_{-3} & : & \partial_3\cdot 1\\
\langle -1|\psi^*_{-5} & : & \partial_5\cdot 1\\
\langle -1|\psi^*_{-3}\psi^*_{-1}\psi_{1} & : & (-\partial_3S_2+\partial_1S_4)\cdot 1.
\end{array}$$
Null-vectors coming from Ccl : 1 ∗ ∗ h−1|ψ−3 ψ−1 : (∂12 S2 − 4S4 + 6S22 + ∂1 ∂3 ) · 1. 2 Obviously these null-vectors coincide with (36). So, in particular, Eqs. (53) imply the KdV equation itself. We have verified that the null-vectors coincide with those obtained from the Gelfand-Dickey construction up to level 16. On higher levels we find higher equations of the KdV hierarchy. We have seen that the KdV equation follows from Eqs. (53). Let us prove the opposite: Eqs. (53) hold on any solution of KdV. We start with the operator Qcl . Proposition 13. Let Z Qcl =
$\psi(D)\,dI(D) = \sum_{k\ge1}\psi_{-2k+1}\,\dfrac{\partial}{\partial t_{2k-1}}$. Then if $J_{2k}$ are constructed from a solution of KdV we have
$$Q_{cl}\,\exp\Big(-\sum_{k\ge1}\frac1k\,J_{2k}h_{-2k}\Big)|-1\rangle = 0.$$
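Proof. Let us introduce the notation:
$$T = \exp\Big(-\sum_{k\ge1}\frac1k\,J_{2k}h_{-2k}\Big) = \exp\Big(\int\frac{dA}{2i\pi A}\,\log S(A)\,h(A)\Big),$$
where $h(A) = \sum_{k\ge1}h_{-2k}A^{2k}$. We have $\psi(D)\,T = T\,S^{-1}(D)\,\psi(D)$. So
$$Q_{cl}\,T|-1\rangle = T\int\frac{1}{S(D)S(A)}\,\psi(D)h(A)\,\frac{dA}{2i\pi A}\,\big(dI(D)\cdot S(A)\big)|-1\rangle.$$
We now use Eq. (45) to get
$$\int\frac{1}{S(D)S(A)}\,\psi(D)h(A)\,\frac{dA}{2i\pi A}\,\big(dI(D)\cdot S(A)\big)|-1\rangle = \int dD\,\frac{dA}{2i\pi A}\,\frac{1}{D^2-A^2}\Big((\log S(A))' - (\log S(D))'\Big)\psi(D)h(A)|-1\rangle.$$
But
$$\psi(D)h(A)|-1\rangle = {:}\psi(D)\psi(A)\psi^*(A){:}\,|-1\rangle + \frac{AD}{D^2-A^2}\,\psi(A)|-1\rangle, \qquad |D|>|A|.$$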
Let us consider first the integral Z AD dD dA D 0 ∗ (log S(A)) : ψ(D)ψ(A)ψ (A) : − 2 ψ(A) |−1i. 2 2 D −A2 |D|>|A| D 2iπA D −A One can do the integral over D. Notice that the integrand is regular at D = 0. Hence, the contributions to the integral come from the poles at D2 = A2 . The simple pole does not contribute because its residue vanishes since we have the product of two fermion fields at the same point in the normal product. The double pole obviously does not contribute either, and the integral is zero. Next we look at the integral Z AD dD dA D 0 ∗ (log S(D)) : ψ(D)ψ(A)ψ (A) : − 2 2 ψ(A) |−1i. 2 2 D −A |D|>|A| D 2iπA D −A This time one can do the integral over A. But it is clear that the integrand is regular at A = 0, and the integral also vanishes. Let us consider now the operator Ccl . Proposition 14. Let Z Z 1 d D12 ψ(D) + log 1− 2 ψ(D1 )ψ(D2 )dI(D1 )dI(D2 ). Ccl = dDψ(D) dD 2iπ |D1 | p ≥ n (i.e. there are two holes below −2n − 1). 3. The vector h9−5 | is obtained from a vector h9−3 | whose depth is greater than −2n−1 ∗ by application of ψ−2p−1 with p ≥ n (i.e. there is one hole below −2n − 1). In the first case using the identity Z 2 |D|>|A1 |,|A2 |
p P(D2 ) d 2 D − A21 dD
! p P(D2 ) DdD = C(A1 , A2 ), D2 − A22
one easily gets the formula (58) with M (A21 , · · · , A2n−2 ) = h9−5 |ψ ∗ (A1 ) · · · ψ ∗ (An−2 )| − 1 − 2ni
Y
A2n−1 . j
j ∗ ∗ In the second case it is necessary that in the expression h9−1 |ψ−2p−1 ψ−2q−1 Cb0 the operator Cb0 annihilates two holes. Hence one find the integral Z h9−1 |2(p − q) P(D2 )D−4n+2p+2q+1 dD = 0.
It vanishes because p, q ≥ n. ∗ Finally, in the third case it is necessary that in the expression h9−3 |ψ−2p−1 Cb0 the operator Cb0 annihilates the hole. This gives Z d 2 2 h9−3 | P(D ) D−4n+2p ψ(D)dD. P(D ) + D dD So, in the matrix elements we shall find the polynomials
d 1 2 P(D ) D−4n+2p 2 dD = P(D ) + D dD D − A2j
Z
2
|D|>|Aj |
q
=2
P(A2j )
d q , P(A2j )A2p−2n+1 j dAj
which corresponds to an exact form.
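The similarity of the proof of this proposition with that of Proposition 2 of Sect. 3 is quite impressive. Let us consider now the operator $Q_0$. It is responsible for the equation of motion as shown in the following proposition:

Proposition 15. The equations
$$\frac{\partial}{\partial T_{2p+1}}\,d\widetilde\omega_{2q+1}(A) = \frac{\partial}{\partial T_{2q+1}}\,d\widetilde\omega_{2p+1}(A) \tag{59}$$
follow from
$$\langle\Psi_{-3}|\,Q_0 \stackrel{w}{=} 0, \qquad \langle\Psi_{-5}|\,C_0 \stackrel{w}{=} 0. \tag{60}$$
Proof. We will show that the Whitham equations (59) follow by considering the vectors
$$\langle\Psi_{-3}| = \langle -1|\psi^*_{-2p-1}\psi^*_{-2q-1}\psi_{2s+1}.$$
The proof goes in two steps. First we shall show that Eq. (60) implies that:
$$(2p+1)\,\widehat I_{2q+1}\,\langle -1|\psi_{2p+1}\psi^*(A) - (2q+1)\,\widehat I_{2p+1}\,\langle -1|\psi_{2q+1}\psi^*(A) \stackrel{w}{=} 0. \tag{61}$$
Indeed, applying $Q_0$ to this vector $\langle\Psi_{-3}|$ gives
$$\langle -1|\psi^*_{-2p-1}\psi^*_{-2q-1}\psi_{2s+1}\,Q_0 = \widehat I_{2p+1}\,\langle -1|\psi^*_{-2q-1}\psi_{2s+1} - \widehat I_{2q+1}\,\langle -1|\psi^*_{-2p-1}\psi_{2s+1} \stackrel{w}{=} 0.$$
Now notice that
$$(2s+1)\,\langle -1|\psi^*_{-2p-1}\psi_{2s+1} = (2p+1)\,\langle -1|\psi^*_{-2s-1}\psi_{2p+1} + \langle -1|\psi^*_{-2s-1}\psi^*_{-2p-1}\,C_0.$$
Hence having in mind the equation $\langle\Psi_{-5}|\,C_0 \stackrel{w}{=} 0$ one gets
$$(2p+1)\,\widehat I_{2q+1}\,\langle -1|\psi_{2p+1}\psi^*_{-2s-1} - (2q+1)\,\widehat I_{2p+1}\,\langle -1|\psi_{2q+1}\psi^*_{-2s-1} \stackrel{w}{=} 0.$$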
Since it is true for every s we can write it for the generating function as in Eq.(61). The second step consists in computing the following average: 1−1 h−1|ψ2p+1 ψ ∗ (A) g(B) hhψ ∗ ii1 · · · hhψ ∗ iin | − 1 − 2ni. Noticing that h−1|ψ2p+1 ψ ∗ (A) g(B) = h−1|
Z
dDD−2n+2p
p A2n P(D2 ) p ψ(D)ψ ∗ (A) P(A2 )
and calculating the matrix element in a usual way we get the answer: 1−1 h−1|ψ2p+1 ψ ∗ (A) g(B) hhψ ∗ ii1 · · · hhψ ∗ iin | − 1 − 2ni = 1−1 A det(M (A)),
where M (A) is (n + 1) × (n + 1) matrix with the following matrix elements: Z M (A)i,j = aj
D2(i−1) p dD, P(D2 )
i, j = 1, · · · , n,
A2(i−1) , i = 1, · · · , n, M (A)i,n+1 = p P(A2 ) Z Q (D2 ) pp M (A)n+1,j = , j = 1, · · · , n, P(D2 ) aj
Qp (A2 ) , M (A)n+1,n+1 = p P(A2 ) where
Z Qp (A2 ) =
dD |D|>|A|
hp i D2p+1 p P(D2 ) = P(A2 )A2p , 2 2 D −A +
where $[\cdots]_+$ means taking the polynomial part in the expansion around infinity. It is quite obvious that the normalized differential $d\widetilde\omega_{2p+1}(A)$ is given by
$$d\widetilde\omega_{2p+1}(A) = (2p+1)\,\Delta^{-1}\det(M(A))\,dA,$$
which finishes the proof of the proposition.

Returning to the beginning of the proof one finds that the expression
$$(2p+1)\,\langle -1|\psi_{2p+1}\psi^*(A)\,\frac{dA}{A}$$
can be considered as the "symbol" of the normalized differential $d\widetilde\omega_{2p+1}(A)$.

The equation $\langle\Psi_{-3}|\,Q_0 \stackrel{w}{=} 0$ for more complicated states $\langle\Psi_{-3}|$ than those considered in Proposition 16 implies other linear partial differential equations for $B_j$, so we get the whole Whitham hierarchy. However Eqs. (59) are the only ones with derivatives with respect to only two times. We shall not go further into the study of the Whitham hierarchy, because it is not our goal. What we really wanted to do was to show the remarkable parallel between the Whitham method and the quantum form factor formulae. We hope that this goal is achieved.
6. Appendix A In this appendix we explain why the conditions (A , · · · , A |B , · · · , B ) L(n) 1 n 1 2n O
B2n =−B1 , An =±B2n−1
(A1 , · · · , An−1 |B2 , · · · , B2n−1 ) = −± L(n−1) O
= (62)
( = + or − respectively for the operators 82k and their descendents or 82k+1 and their descendents) and
2n n Y Y Y resAn =∞ ψ(Ai , Bj ) (A2i −A2j ) L(n) (A1 ,· · ·, An |B1 ,· · ·, B2n )akn = O
i<j
i=1 j=1
= 0,
k ≥n+1
(63)
are sufficient for the locality of the operator whose form factors are given by fO (β1 , β2 , · · · , β2n )−···−+···+ = = cn
Y
ζ(βi − βj )
i<j
2n n Y Y i=1 j=n+1
X 1 1 exp(− (ν(n − 1) − n) βj ) sinh ν(βj − βi − πi) 2
×fbO (β1 , β2 , · · · , β2n )−···−+···+ , where
Z 1 dA1 · · · × (2πi)n Z 2n n Y n Y Y Y ψ(Ai , Bj ) (A2i − A2j ) L(n) a−i × dAn i . O (A1 , · · · , An |B1 , · · · , B2n )
fbO (β1 , β2 , · · · , β2n )−···−+···+ =
i<j
i=1 j=1
i=1
The calculations which we are going to make are well known even for the more general case [3]. However, we want to repeat them for our particular situation for the completeness of the exposition. In the case of diagonal scattering the only non-trivial requirement for the form factors is the following: 2πi resβ2n =β1 +πi fO (β1 , β2 , · · · , β2n−1 , · · · β2n )−···−+···+ = 2n−1 Y S(βj − β1 ) − , = fO (β2 , · · · , · · · , β2n−1 )−···−+···+
(64)
j=2
where the S-matrix in our case is S(β2 − β1 ) =
ν−1 Y j=1
sinh 21 (β2 − β1 + πi ψ(−B1 , B2 ) ν j) .− = 1 πi ψ(B1 , B2 ) sinh 2 (β2 − β1 − ν j)
Using the identity [3] 1 ζ(β1 − βj )ζ(βj − β1 − πi) 1 exp(− (ν − 1)(β1 + βj )) , = 2 (sinh ν(βj − β1 ))2 ψ(B1 , Bj ) one finds that the relation (64) is equivalent to fbO (β1 , β2 , · · · , β2n−1 , β2n )−···−+···+ = β2n =β1 +πi 2n−1 2n−1 Y Y 1 b ψ(−B1 , Bj ) − ψ(B1 , Bj ) fO (β2 , · · · , β2n−1 )−···−+···+ = 2B1 j=2
j=2
if the constant $c$ is taken as $c = 2\nu\,(\zeta(-\pi i))^{-1}$. Explicitly the LHS of this equation is
Z
1 (2πi)n ×
Z dA1 · · ·
Y
n 2n−1 Y a i − b1 Y dAn ψ(Ai , Bj ), A2i − B12 j=2 i=1
(A2i − A2j ) L(n) O (A1 , · · · , An |B1 , · · · , B2n−1 , −B1 )
n Y
i<j
a−i i ,
(65)
i=1
where we have used the identity
$$\psi(A,B)\,\psi(A,-B) = \frac{a-b}{A^2-B^2}.$$
Let us consider the integral over An . If the contour is such that |An | > |B1 | we can −n+1 replace in this integral a−n (an − b1 )−1 . Indeed n by b1 n−1 X X b−n+1 −n+j −n+j 1 = a−n a−j + a−j ; n + n b1 n b1 (an − b1 ) j=1
the sum the sum
n−1 P j=1 P
j≥n+1
can be omitted due to anti-symmetry with respect to aj , j = 1, · · · , n − 1, can be omitted because the integrand does not have residue at An = ∞
j≥n+1
due to (63). Thus the integral over An becomes b−n+1 1
1 2πi
An >B1
Y
(A2i i|D1 |
|D1 |>|A1 |,|A2 |
Let us modify this expression by adding "exact forms" in variables A1 . It is convenient to use the formula Q(A1 )P (A1 ) − qQ(qA1 )P (−A1 ) = Z P (D1 ) dD1 2 (Q(D1 )(D1 + A1 ) − qQ(−qD1 )(D1 − A1 )). = (D1 − A21 ) |D1 |>|A1 |
Suppose that the polynomial Q(A) solves the equation Z P (D2 ) Q(D1 ) + qQ(−qD1 ) = dD2 (D1 + D2 )(D22 − A22 )
(70)
|D2 |>|D1 |
(obviously the RHS of this equation is a polynomial) then we can rewrite the expression for C 0 (A1 , A2 ) in the following equivalent form: Z P (D1 ) C 0 (A1 , A2 ) = dD1 D1 2 (Q(D1 ) − qQ(−qD1 )). (D1 − A21 ) |D1 |>|A1 |
Now we have to solve Eq. (70). It is simple:
$$Q(D_1) = \frac{1}{D_1}\int_{|D_2|>|D_1|}dD_2\;\eta\!\left(\frac{D_1}{D_2}\right)\frac{P(D_2)}{D_2^2-A_2^2},$$
where $\eta(x)$ satisfies $\eta(x) - \eta(-qx) = \dfrac{x}{1+x}$. Namely,
$$\eta(x) = \sum_{k=0}^{\infty}\frac{1}{1+q^{2k+1}}\,x^{2k+1} - \sum_{k=1}^{\infty}\frac{1}{1-q^{2k}}\,x^{2k}.$$
Hence
$$C'(A_1,A_2) = \int_{|D_2|>|D_1|}dD_2\int_{|D_1|>|A_1|}dD_1\;P(D_1)P(D_2)\,\tau_e\!\left(\frac{D_1}{D_2}\right)\frac{1}{(D_1^2-A_1^2)(D_2^2-A_2^2)},$$
where $\tau_e(x) = \eta(x) + \eta(-qx)$, i.e.
$$\tau_e(x) = \sum_{k=1}^{\infty}\frac{1-q^{2k-1}}{1+q^{2k-1}}\,x^{2k-1} - \sum_{k=1}^{\infty}\frac{1+q^{2k}}{1-q^{2k}}\,x^{2k}.$$
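The series for $\eta$ and $\tau_e$ can be cross-checked numerically. This is a sketch only: the coefficients of $\eta$ are those forced by the functional equation $\eta(x) - \eta(-qx) = x/(1+x)$ when expanded in powers of $x$, the real sample value of $q$ is an arbitrary assumption chosen to avoid vanishing denominators, and the function names are hypothetical.

```python
def eta(x, q, terms=60):
    """Truncated series: odd powers carry 1/(1 + q^m), even powers carry -1/(1 - q^m)."""
    odd = sum(x ** (2 * k + 1) / (1 + q ** (2 * k + 1)) for k in range(terms))
    even = sum(x ** (2 * k) / (1 - q ** (2 * k)) for k in range(1, terms + 1))
    return odd - even


def tau_e(x, q, terms=60):
    """Truncated series for tau_e(x) as written in the text."""
    odd = sum((1 - q ** (2 * k - 1)) / (1 + q ** (2 * k - 1)) * x ** (2 * k - 1)
              for k in range(1, terms + 1))
    even = sum((1 + q ** (2 * k)) / (1 - q ** (2 * k)) * x ** (2 * k)
               for k in range(1, terms + 1))
    return odd - even


def functional_equation_defect(x, q):
    """eta(x) - eta(-q x) - x / (1 + x); vanishes up to truncation and rounding."""
    return eta(x, q) - eta(-q * x, q) - x / (1 + x)


def tau_defect(x, q):
    """eta(x) + eta(-q x) - tau_e(x); vanishes up to truncation and rounding."""
    return eta(x, q) + eta(-q * x, q) - tau_e(x, q)


if __name__ == "__main__":
    print(functional_equation_defect(0.2, 0.5), tau_defect(0.2, 0.5))
```

Term by term, the coefficient of $x^m$ in $\eta(x) \mp \eta(-qx)$ is $c_m\,(1 \mp (-q)^m)$, which is what turns $1/(1+q^m)$ and $-1/(1-q^m)$ into the coefficients of $x/(1+x)$ and of $\tau_e$ respectively.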
The expression for Ce (A1 , A2 ) given in Proposition 1 follows from these formulae. The expression for Co (A1 , A2 ) can be obtained in a similar way eliminating the even degrees of A1 from C 0 (A1 , A2 ). Acknowledgement. We would like to thank Tetsuji Miwa for his interest in this work and careful reading of the manuscript.
References
1. Smirnov, F. A.: Nucl. Phys. B453 [FS], 807 (1995)
2. Babelon, O., Bernard, D., Smirnov, F. A.: Quantization of Solitons and the Restricted sine-Gordon Model. hep-th/9603010, Commun. Math. Phys. 182, 319–354 (1996)
3. Smirnov, F. A.: Form Factors in Completely Integrable Models of Quantum Field Theory. Adv. Series in Math. Phys. 14, Singapore: World Scientific, 1992
4. Smirnov, F. A.: Lett. Math. Phys. 36, 267 (1996)
5. Zamolodchikov, A. B.: Adv. Studies Pure Math. 19, 641 (1989)
6. Feigin, B., Fuchs, D.: In: Representations of Lie Groups and Related Topics. eds. A.M. Vershik and D.P. Zhelobenko, London: Gordon and Breach, 1990, p. 465
7. Belavin, A. A., Polyakov, A. M., Zamolodchikov, A. B.: Nucl. Phys. B241, 333 (1984)
8. Kupershmidt, B. A., Mathieu, P.: Phys. Lett. B227, 245 (1989)
9. Feigin, B., Frenkel, E.: Integrals of Motion and Quantum Groups. Lect. Notes in Math. v.1620, Berlin–Heidelberg–New York: Springer Verlag, 1995
10. Date, E., Jimbo, M., Kashiwara, M., Miwa, T.: In: Nonlinear Integrable Systems. Singapore: World Scientific, 1983
11. Smirnov, F. A.: Particle-Field Duality in Sine-Gordon Theory. To be published in Proceedings of Buckow Conference (1995)
12. Knizhnik, V. G.: Commun. Math. Phys. 112, 567 (1987)
13. Dickey, L. A.: Soliton Equations and Hamiltonian Systems. Adv. Series in Math. Phys. 12, Singapore: World Scientific, 1991
14. Belokolos, E. D., Bobenko, A. I., Enol'skii, V. Z., Its, A. R., Matveev, V. B.: Algebro-geometric Approach to Non-linear Integrable Equations. Berlin–Heidelberg–New York: Springer series in non-linear dynamics, 1994
15. Novikov, S., Manakov, S., Pitaevski, L., Zakharov, V.: Theory of Solitons. New York: Consultants Bureau, 1984
16. Whitham, G. B.: Linear and Nonlinear Waves. New York: Wiley-Interscience, 1974
17. Flaschka, H., Forest, M. G., McLaughlin, D. W.: Comm. Pure Appl. Math. XXXIII, 739 (1980)
18. Krichever, I. M.: Functional Anal. Appl. 22, 200 (1988)
19. Dubrovin, B. A., Novikov, S. P.: Russian Math. Surveys 44:6, 35 (1989)

Communicated by G. Felder
Commun. Math. Phys. 186, 649–669 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997
Bi-Hamiltonian Structure of Equations of Associativity in 2-d Topological Field Theory
E.V. Ferapontov¹, C. A. P. Galvão², O. I. Mokhov³, Y. Nutku⁴
¹ Institute for Mathematical Modelling, Academy of Science of Russia, Miusskaya, 4, Moscow 125047, Russia
² Universidade de Brasilia, Departamento de Física, 70.910 Brasilia DF, Brasil
³ Department of Geometry and Topology, The Steklov Mathematical Institute, Academy of Science of Russia, ul. Vavilova, 42, Moscow, GSP-1, 117966, Russia
⁴ TÜBİTAK - Marmara Research Center, Research Institute for Basic Sciences, Department of Physics, 41470 Gebze, Turkey
Received: 1 March 1996 / Accepted: 25 October 1996
Abstract: We exhibit the bi-Hamiltonian structure of the equations of associativity (Witten-Dijkgraaf-Verlinde-Verlinde-Dubrovin equations) in 2-d topological field theory, which reduce to a single equation of Monge-Ampère type
$$f_{ttt} = f_{xxt}^2 - f_{xxx}f_{xtt},$$
in the case of three primary fields. The first Hamiltonian structure of this equation is based on its representation as a 3-component system of hydrodynamic type and the second Hamiltonian structure follows from its formulation in terms of a variational principle with a degenerate Lagrangian.

1. Equations of Associativity

Witten has introduced model-independent recursion relations for the genus zero $m$-point correlation functions [39] where the underlying algebraic object is a Frobenius algebra such that the 2-point correlators define a metric and the 3-point correlators define the structure functions. Dijkgraaf, Verlinde and Verlinde [3] have proved that these structure functions can be expressed as third derivatives of a generating function $F$ which is the free energy, and the condition of associativity gives rise to Monge-Ampère type equations of third order. Finally Dubrovin [7] has demonstrated the integrability and given a systematic account of the associativity equations which we shall follow. Thus we consider a function of $n$ independent variables $F(t^1,\dots,t^n)$ satisfying the following two conditions:

1. The matrix
$$\eta_{\alpha\beta} = \frac{\partial^3F}{\partial t^1\,\partial t^\alpha\,\partial t^\beta}, \qquad \alpha,\beta = 1,\dots,n,$$
is constant and nondegenerate so that its inverse defined by $\eta^{\alpha\mu}\eta_{\mu\beta} = \delta^\alpha_\beta$ exists.
2. For all $t = (t^1,\dots,t^n)$ the functions
$$c^\alpha_{\beta\gamma}(t) = \eta^{\alpha\mu}\,\frac{\partial^3F}{\partial t^\mu\,\partial t^\beta\,\partial t^\gamma}$$
are the structure constants of an associative algebra $A(t)$ of dimension $n$ with the basis $\{e_1,\dots,e_n\}$ and the law of multiplication $e_\beta\circ e_\gamma = c^\alpha_{\beta\gamma}(t)\,e_\alpha$, where $c^\alpha_{\beta\gamma}$ determine the structure of the algebra.

The associativity condition $(e_\alpha\circ e_\beta)\circ e_\gamma = e_\alpha\circ(e_\beta\circ e_\gamma)$ results in
$$\eta^{\mu\nu}\,\frac{\partial^3F}{\partial t^\mu\,\partial t^\beta\,\partial t^\gamma}\,\frac{\partial^3F}{\partial t^\alpha\,\partial t^\lambda\,\partial t^\nu} = \eta^{\mu\nu}\,\frac{\partial^3F}{\partial t^\mu\,\partial t^\beta\,\partial t^\alpha}\,\frac{\partial^3F}{\partial t^\gamma\,\partial t^\lambda\,\partial t^\nu}, \tag{1}$$
which is a system of third order Monge-Ampère type nonlinear partial differential equations for $F$. In two-dimensional topological field theory this system is known as the equations of associativity, or the Witten-Dijkgraaf-H.Verlinde-E.Verlinde-Dubrovin (WDVVD) system. Any solution $F(t^1,\dots,t^n)$ of the WDVVD system gives an $n$-parametric deformation $A(t)$ of the Frobenius algebra of dimension $n$. We refer to [7] for the theory of integrability of the equations of associativity. For $n = 3$ Dubrovin showed that with
$$F = \frac12\,(t^1)^2\,t^3 + \frac12\,t^1\,(t^2)^2 + f(t^2,t^3)$$
the equations of associativity (1) reduce to a single equation
$$f_{ttt} = f_{xxt}^2 - f_{xxx}f_{xtt}, \tag{2}$$
where $x = t^2$, $t = t^3$. This paper is devoted to an investigation of the Hamiltonian structure of the third order Monge-Ampère type equation (2).

2. The Main Theorem

In order to discuss the Hamiltonian structure of Eq. (2) we need to cast it into the form of a triplet of first order nonlinear evolution equations. Introducing the auxiliary variables
$$a = f_{xxx}, \qquad b = f_{xxt}, \qquad c = f_{xtt}, \tag{3}$$
the associativity equation (2) results in the first order evolutionary system [25, 26]
$$a_t = b_x, \qquad b_t = c_x, \qquad c_t = (b^2 - ac)_x, \tag{4}$$
which, in the terminology of Dubrovin and Novikov [9, 11], is a 3-component system of hydrodynamic type. In general, equations of hydrodynamic type consist of a system of first order quasilinear equations
$$u^i_t = v^i_j(u)\,u^j_x \tag{5}$$
for which there exists a well-developed theory of integrability. This approach is also similar to the one used for second order Monge-Ampère equations in [31]. The main result of this paper is the following
Theorem 1. The system (4) can be represented in bi-Hamiltonian form
$$\begin{pmatrix}a_t\\ b_t\\ c_t\end{pmatrix} = J_0\,\delta H_0 = J_1\,\delta H_1, \tag{6}$$
where $\delta$ denotes the variational derivative, in this case with respect to $a, b, c$. Here $J_0$ is a first order Hamiltonian operator of Dubrovin-Novikov type
$$J_0 = \begin{pmatrix} -\frac32 D & \frac12 Da & Db\\[2pt] \frac12 aD & \frac12(Db+bD) & \frac32 cD + c_x\\[2pt] bD & \frac32 Dc - c_x & (b^2-ac)D + D(b^2-ac)\end{pmatrix}, \tag{7}$$
and $J_1$ is a third order homogeneous Hamiltonian operator given by
$$J_1 = \begin{pmatrix} 0 & 0 & D^3\\ 0 & D^3 & -D^2aD\\ D^3 & -DaD^2 & D^2bD + DbD^2 + DaDaD\end{pmatrix}, \tag{8}$$
where
$$D \equiv \frac{d}{dx}$$
is the total derivative. The corresponding densities of Hamiltonian functions are given by
$$H_0 = c, \qquad H_1 = -\frac12\,a\,(D^{-1}b)^2 - (D^{-1}b)(D^{-1}c) \tag{9}$$
respectively, and the Hamiltonian function
$$H = \int H\,dx$$
is in every case the integral of the density. The expression for $H_1$ above is nonlocal as $D^{-1}$ is the inverse of $D$. Hamiltonian operators $J_0$ and $J_1$ are compatible, so that according to Magri's theorem [21] the system (4) is integrable in the standard field-theoretic sense via the Lenard-Magri recursion scheme [21, 22, 23].

3. Introduction

The discussion of the bi-Hamiltonian structure of Eqs. (4) falls naturally into two parts, namely the derivation of the first and second Hamiltonian operators which is based on the results of [24 and 18] respectively. For the first Hamiltonian structure we need to cast the equation of associativity (2) into the form of a Dubrovin-Novikov type hydrodynamic system. It was shown in [25 and 26] that the result is a system which does not possess Riemann invariants. We shall obtain the first Hamiltonian structure of Eqs. (4) and, using the general theory of Hamiltonian 3-component homogeneous systems of hydrodynamic type without Riemann invariants [13–16], we shall show that it reduces to the well-known integrable 3-wave interaction. A remarkable result that emerges from
this discussion is that the Darboux coordinates for the first Hamiltonian operator are Halphen variables which play a prominent role in the 2-monopole problem. For the second Hamiltonian structure we start with the Lagrangian representation of the associativity equation which turns out to be degenerate as in the case of the multi-Hamiltonian structure of the Monge-Ampère equation [33, 34]. Since the Lagrangian is linear in the time derivatives of the appropriate first order variables we are directly led to a symplectic representation of the associativity equation and the second Hamiltonian operator. The two Hamiltonian operators are compatible so that by Magri's theorem [21] we arrive at a proof of the complete integrability of the associativity equation. The Darboux coordinates for the second Hamiltonian operator are the densities of its nonlocal Casimirs and the transformation is a differential substitution. In the original hydrodynamic type variables the second Hamiltonian operator is homogeneous of the third degree and provides the first non-trivial example of higher order homogeneous Hamiltonian operators of differential-geometric type that were introduced by Dubrovin and Novikov [10] and investigated in [35, 36, 6]. We shall therefore present a survey of some results for third order Hamiltonian operators and illustrate them with the example of the second Hamiltonian operator of the system (4). The application of the Lenard-Magri recursion scheme to the bi-Hamiltonian formulation of the associativity equation yields higher conservation laws and commuting flows. We shall also consider the construction of exact solutions of the associativity equation which is based on its restriction to the set of stationary points of its higher integrals, following a general scheme proposed in [2, 29, 30].

4. Equations of Hydrodynamic Type

A system of evolution equations will admit Hamiltonian structure provided it can be written in the form
$$u^i_t = \{u^i, H\} = J^{ij}\,\frac{\delta H}{\delta u^j}, \tag{10}$$
where $\{\ ,\ \}$ denotes the Poisson bracket defined by the Hamiltonian operator $J^{ik}$ which is a skew-symmetric matrix operator satisfying the Jacobi identity. For equations of hydrodynamic type (5) Dubrovin and Novikov [9] introduced Hamiltonian operators of the following form:
$$J^{ij} = g^{ij}(u)\,D - g^{is}(u)\,\Gamma^j_{sk}(u)\,u^k_x, \tag{11}$$
with $\det g^{ij} \neq 0$, where $g^{ij}(u)$ is a Riemannian metric and $\Gamma^j_{sk}(u)$ are the coefficients of the Levi-Civita connection compatible with the metric. The Jacobi identity requires the vanishing of the Riemann tensor
$$R^i_{jkl} = 0 \tag{12}$$
so that this metric is flat. Furthermore the Hamiltonian function is a functional of hydrodynamic type provided
$$H = H(u), \tag{13}$$
i.e. the Hamiltonian density does not depend on the derivatives of $u$. The most efficient method of integrability of equations of hydrodynamic type consists of the use of Riemann invariants [37, 38]
$$R^i_t = v^i(R)\,R^i_x$$
for diagonalizable systems. Tsarev [37] showed that all such systems have an infinite number of conservation laws and commuting flows of hydrodynamic type and can be integrated by the generalized hodograph method.
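As a worked illustration (not in the original text), the system (4) can be written explicitly in the form (5). Expanding $c_t = (b^2-ac)_x = -c\,a_x + 2b\,b_x - a\,c_x$ gives the velocity matrix below; its characteristic polynomial follows by a direct cofactor expansion along the first row.

```latex
\begin{pmatrix} a \\ b \\ c \end{pmatrix}_{t}
=
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -c & 2b & -a \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}_{x},
\qquad
\det\!\left(\lambda\,\delta^i_j - v^i_j\right)
= \lambda^3 + a\,\lambda^2 - 2b\,\lambda + c .
```

In the notation of (5), $(u^1,u^2,u^3) = (a,b,c)$ and $v^i_j$ is the $3\times3$ matrix above; the system is quasilinear because $v$ depends on $u$ but not on its derivatives.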
However, it was shown in [25, 26] that Eqs. (4) do not possess Riemann invariants and therefore our discussion will follow the general theory of integrability of nondiagonalisable Hamiltonian systems of hydrodynamic type developed in [13–16]. It turns out that any integrable nondiagonalisable Hamiltonian 3-component system of hydrodynamic type can be reduced to the 3-wave interaction by a sequence of reciprocal transformations and differential substitutions. In Sect. 8 we shall illustrate this general scheme by applying it to the system (4) which exactly fits into this class.

5. Spectral Problem and Halphen Variables

It was shown by Dubrovin [7] that Eq. (2) is connected with a spectral problem which has the form
$$\Psi_x = zA\Psi = z\begin{pmatrix}0&1&0\\ b&a&1\\ c&b&0\end{pmatrix}\Psi, \qquad \Psi_t = zB\Psi = z\begin{pmatrix}0&0&1\\ c&b&0\\ b^2-ac&c&0\end{pmatrix}\Psi, \tag{14}$$
where $z$ is the spectral parameter. The compatibility conditions of the spectral problem (14) are equivalent to the following two relations between the matrices $A$ and $B$,
$$A_t = B_x, \qquad [A,B] = 0, \tag{15}$$
which are satisfied identically by virtue of Eqs. (4). The eigenvalues of the matrix $A$ are conserved densities of the system (4). This can be seen simply because according to Eqs. (15) the matrices $A$, $B$ commute and therefore can be diagonalised simultaneously. Hence we can write
$$A = PUP^{-1}, \qquad B = PVP^{-1},$$
where $U = \mathrm{diag}(u^1,u^2,u^3)$, $V = \mathrm{diag}(v^1,v^2,v^3)$. Substitution in Eqs. (15) results in
$$[P^{-1}P_t, U] + U_t = [P^{-1}P_x, V] + V_x,$$
where we note that the matrices $[P^{-1}P_t, U]$ and $[P^{-1}P_x, V]$ are off-diagonal so that we must have
$$U_t = V_x, \tag{16}$$
which are new conservation laws. Thus besides the three evident conservation laws with densities $a, b, c$, the roots of the characteristic equation
$$\det(\lambda E - A) = \lambda^3 - a\lambda^2 - 2b\lambda - c = 0 \tag{17}$$
provide three further conservation laws with Hamiltonian densities $u^1, u^2, u^3$ for the system (4). By virtue of the obvious linear relation $a = u^1+u^2+u^3$ between them only five of these conserved densities $u^1, u^2, u^3, b, c$ are linearly independent. One can show that the system (4) has no other conservation laws of hydrodynamic type owing to the fact that it is nondiagonalisable. Variables similar to $u^i$ had earlier been introduced by Dubrovin [7] in connection with the reduction of the Chazy equation to the Halphen system and will be called Halphen variables. The ultimate reason for calling $u^i$ Halphen variables will become manifest in the next section.
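The algebraic claims of this section, namely that $A$ and $B$ commute and that the characteristic polynomial of $A$ is the cubic (17), can be checked by exact integer arithmetic. The sketch below is illustrative only; the helper names are hypothetical and the check runs over random integer values of $a, b, c$ rather than giving a symbolic proof.

```python
import random


def matmul(X, Y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def spectral_matrices(a, b, c):
    """Matrices A and B of the spectral problem (14)."""
    A = [[0, 1, 0], [b, a, 1], [c, b, 0]]
    B = [[0, 0, 1], [c, b, 0], [b * b - a * c, c, 0]]
    return A, B


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def char_poly(a, b, c, lam):
    """det(lam * E - A) evaluated at the number lam."""
    A, _ = spectral_matrices(a, b, c)
    M = [[(lam if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)]
    return det3(M)


def check_spectral_problem(trials=200, seed=0):
    """True if [A, B] = 0 and Eq. (17) hold for all sampled integer values."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, lam = (rng.randint(-9, 9) for _ in range(4))
        A, B = spectral_matrices(a, b, c)
        if matmul(A, B) != matmul(B, A):
            return False
        if char_poly(a, b, c, lam) != lam ** 3 - a * lam ** 2 - 2 * b * lam - c:
            return False
    return True


if __name__ == "__main__":
    print(check_spectral_problem())
```

The commutation is no accident: one can verify by hand that $B = A^2 - aA - bE$, so $B$ is a polynomial in $A$.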
6. First Hamiltonian Structure

The transformation of the first order system (4) to the Halphen variables is given by
$$a = u^1+u^2+u^3, \qquad b = -\tfrac12\,(u^1u^2+u^2u^3+u^3u^1), \qquad c = u^1u^2u^3 \tag{18}$$
according to the Viète formulas for the cubic (17). To simplify the calculations we note that the matrices $A$ and $B$ are connected by $B = A^2 - aA - bE$, so that the same relation is valid for the corresponding diagonal matrices $U$ and $V$, $V = U^2 - aU - bE$, as well. Substituting the expressions for $a$ and $b$ from the Viète formulas (18) and using Eq. (16) we find that Eqs. (4) assume the symmetric form [24]
$$\begin{aligned}
u^1_t &= \tfrac12\,(u^2u^3 - u^1u^2 - u^1u^3)_x,\\
u^2_t &= \tfrac12\,(u^1u^3 - u^2u^1 - u^2u^3)_x,\\
u^3_t &= \tfrac12\,(u^1u^2 - u^3u^1 - u^3u^2)_x,
\end{aligned} \tag{19}$$
which is manifestly Hamiltonian with the first Hamiltonian operator [24]
$$J_0 = \frac12\begin{pmatrix}1&-1&-1\\ -1&1&-1\\ -1&-1&1\end{pmatrix}D, \tag{20}$$
and the Hamiltonian density
$$H_0 = u^1u^2u^3,$$
which is the same conserved quantity we had in Eq. (9). There are also the conserved densities
$$P = 2b = -(u^1u^2+u^2u^3+u^3u^1), \qquad C^i = u^i, \quad i = 1,2,3,$$
which consist of the density of momentum and the Casimirs of the Hamiltonian operator (20). It can be verified directly that in terms of the original variables $a, b, c$ the Hamiltonian operator (20) is transformed to the form given in Eq. (7). The reduction of the system (19) to the ODE Halphen system is obtained by looking at its solutions linear in $x$,
$$u^i = a^i(t)\,x + b^i(t),$$
which immediately leads to the ODE Halphen system in the variables $a^i$
$$\begin{aligned}
\dot a^1 &= a^2a^3 - a^1(a^2+a^3),\\
\dot a^2 &= a^1a^3 - a^2(a^1+a^3),\\
\dot a^3 &= a^1a^2 - a^3(a^1+a^2),
\end{aligned} \tag{21}$$
with dot denoting the derivative with respect to $t$. The $b^i$ satisfy the linear ODE system
$$\begin{aligned}
2\dot b^1 &= a^2b^3 + a^3b^2 - a^1(b^2+b^3) - b^1(a^2+a^3),\\
2\dot b^2 &= a^1b^3 + a^3b^1 - a^2(b^1+b^3) - b^2(a^1+a^3),\\
2\dot b^3 &= a^1b^2 + a^2b^1 - a^3(b^1+b^2) - b^3(a^1+a^2),
\end{aligned}$$
which reduces to quadratures upon solution of the Halphen system. Thus a simple reduction of the equations of motion (19) results directly in the Halphen system [20], which justifies the name "Halphen variables" for $u^i$. The Halphen system (21) has recently been the subject of extensive investigation in connection with the 2-monopole problem [1]. The multi-Hamiltonian structure of the Halphen system is given in [19].
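The steps of this section lend themselves to an exact arithmetic check. The sketch below (hypothetical helper names, rational sample values chosen arbitrarily) compares three expressions for the fluxes of (19): the diagonal entries of $V = U^2 - aU - bE$ with $a, b$ from the Viète formulas (18), the right-hand sides of (19), and $\tfrac12 M\,\nabla H_0$ with $M$ the constant matrix in (20). It then confirms that the ansatz $u^i = a^i x + b^i$ reproduces the Halphen system (21) and the linear system for $b^i$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def flux_from_spectral(u):
    """Diagonal of V = U^2 - a U - b E, with a and b given by the Viete formulas (18)."""
    u1, u2, u3 = u
    a = u1 + u2 + u3
    b = -HALF * (u1 * u2 + u2 * u3 + u3 * u1)
    return [ui * ui - a * ui - b for ui in u]


def flux_eq19(u):
    """Fluxes on the right-hand side of Eqs. (19), before the x-derivative."""
    u1, u2, u3 = u
    return [HALF * (u2 * u3 - u1 * u2 - u1 * u3),
            HALF * (u1 * u3 - u2 * u1 - u2 * u3),
            HALF * (u1 * u2 - u3 * u1 - u3 * u2)]


def flux_hamiltonian(u):
    """(1/2) M grad(H0) for H0 = u1 u2 u3, with M the constant matrix in Eq. (20)."""
    u1, u2, u3 = u
    grad = [u2 * u3, u1 * u3, u1 * u2]
    M = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
    return [HALF * sum(M[i][j] * grad[j] for j in range(3)) for i in range(3)]


def halphen_rhs(a):
    """Right-hand side of the Halphen system (21)."""
    a1, a2, a3 = a
    return [a2 * a3 - a1 * (a2 + a3), a1 * a3 - a2 * (a1 + a3), a1 * a2 - a3 * (a1 + a2)]


def linear_rhs(a, b):
    """Time derivatives of b^i from the linear ODE system (note the factor 1/2)."""
    (a1, a2, a3), (b1, b2, b3) = a, b
    return [HALF * (a2 * b3 + a3 * b2 - a1 * (b2 + b3) - b1 * (a2 + a3)),
            HALF * (a1 * b3 + a3 * b1 - a2 * (b1 + b3) - b2 * (a1 + a3)),
            HALF * (a1 * b2 + a2 * b1 - a3 * (b1 + b2) - b3 * (a1 + a2))]


def check_halphen_reduction(a, b, x):
    """For u^i = a^i x + b^i, compare d(flux)/dx with (da^i/dt) x + db^i/dt."""
    h = Fraction(1)  # the central difference is exact because the fluxes are quadratic in x
    up = flux_eq19([ai * (x + h) + bi for ai, bi in zip(a, b)])
    dn = flux_eq19([ai * (x - h) + bi for ai, bi in zip(a, b)])
    lhs = [(p - m) / (2 * h) for p, m in zip(up, dn)]
    rhs = [ad * x + bd for ad, bd in zip(halphen_rhs(a), linear_rhs(a, b))]
    return lhs == rhs


if __name__ == "__main__":
    u = [Fraction(1), Fraction(2), Fraction(4)]
    print(flux_eq19(u), flux_from_spectral(u), flux_hamiltonian(u))
    print(check_halphen_reduction([Fraction(1), Fraction(2), Fraction(3)],
                                  [Fraction(0), Fraction(1), Fraction(-1)], Fraction(2)))
```

Because the matrix in (20) is constant, $J_0\,\delta H_0 = D\big(\tfrac12 M\,\nabla H_0\big)$, which is why comparing fluxes is enough to confirm the Hamiltonian form of (19).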
7. Nonexistence of Riemann Invariants

In order to prove the nondiagonalisability of our system it will be convenient to reformulate the equations of hydrodynamic type (19) in terms of differential forms (see [25, 26] for a different proof). Given a system of equations of hydrodynamic type (5), let $\lambda^i(u)$ be eigenvalues of the matrix $v^i_j$, i.e., the roots of the characteristic equation $\det(v^i_j(u) - \lambda\delta^i_j) = 0$, and we shall assume that the system under consideration is strictly hyperbolic so that all roots of the characteristic equation are real and distinct. Using the $i$th left eigenvector $l^i$ which corresponds to the eigenvalue $\lambda^i$, i.e. $l^i_k v^k_j = \lambda^i l^i_j$, we can introduce the 1-forms
$$\omega^i = l^i_k\,du^k, \qquad i = 1,\dots,n, \tag{22}$$
which are defined up to normalization $\omega^i \mapsto p^i\omega^i$, $p^i \neq 0$. Here as well as in the remainder of this section we shall suspend the summation convention. It is easy to verify that Eqs. (5) can be rewritten in the form of an exterior system
$$\omega^i\wedge(dx + \lambda^i\,dt) = 0, \qquad i = 1,\dots,n, \tag{23}$$
and provided the Frobenius criterion for integrability
$$d\omega^i\wedge\omega^i = 0 \tag{24}$$
is satisfied for some $i$, the corresponding 1-form can be represented as $\omega^i = p^i\,dR^i$, where $R^i$ is the Riemann invariant. In this case the $i$th equation in the set (23) can be rewritten in the diagonal form mentioned earlier in Sect. 4. For the system (19) the eigenvalues $\lambda^i$ and the corresponding left eigenvectors $l^i$ have the form
$$\begin{aligned}
\lambda^1 &= -u^1, & l^1 &= (u^2-u^3,\ u^1-u^3,\ u^2-u^1),\\
\lambda^2 &= -u^2, & l^2 &= (u^2-u^3,\ u^1-u^3,\ u^1-u^2),\\
\lambda^3 &= -u^3, & l^3 &= (u^2-u^3,\ u^3-u^1,\ u^2-u^1),
\end{aligned} \tag{25}$$
and therefore Eqs. (19) can be expressed as
$$\omega^i\wedge(dx - u^i\,dt) = 0, \tag{26}$$
where
656
E.V. Ferapontov, C. A. P. Galv˜ao, O. I. Mokhov, Y. Nutku
ω 1 = (u2 − u3 )du1 + (u1 − u3 )du2 + (u2 − u1 )du3 , ω 2 = (u2 − u3 )du1 + (u1 − u3 )du2 + (u1 − u2 )du3 , ω 3 = (u2 − u3 )du1 + (u3 − u1 )du2 + (u2 − u1 )du3
(27)
are the 1-forms. As one can verify directly, the Frobenius criterion for integrability (24) is not satisfied for any $i = 1, 2, 3$, so that our system does not possess Riemann invariants.

The general theory of integrability of nondiagonalisable Hamiltonian systems of hydrodynamic type was developed in [13–16]. For 3-component systems the following result was obtained.

Theorem 2. [15, 16]. A nondiagonalisable Hamiltonian 3-component system of hydrodynamic type is integrable if and only if it is linearly degenerate.

We recall that a system of hydrodynamic type (5) is called linearly degenerate if for any $i = 1, \dots, n$,

$$\mathcal{L}_{r_i}(\lambda^i) = 0, \tag{28}$$

where $r_i$ is the right eigenvector corresponding to $\lambda^i$. That is, for eigenvalues $\lambda^i(u)$ of the matrix $v^i_j(u)$ the Lie derivative of the eigenvalue $\lambda^i$ along the corresponding right eigenvector $r_i$ vanishes. The proof that Eqs. (4) form a linearly degenerate system using the condition (28) was given in [25, 26]. There exists another criterion of linear degeneracy which does not appeal to eigenvalues and eigenvectors.

Proposition 1. [15, 16]. A system of hydrodynamic type is linearly degenerate if and only if

$$(\nabla f_1)\,v^{n-1} + (\nabla f_2)\,v^{n-2} + \dots + (\nabla f_n)\,E = 0,$$

where $f_i$ are the coefficients of the characteristic polynomial

$$\det(\lambda\delta^i_j - v^i_j(u)) = \lambda^n + f_1(u)\lambda^{n-1} + f_2(u)\lambda^{n-2} + \dots + f_n(u),$$

and $v^n$ denotes the $n$th power of the matrix $v^i_j$.

The application of this criterion to show the linear degeneracy of Eqs. (4) can be found in [25, 26].

8. Reduction to the 3-Wave Interaction

We have found that the first order system resulting in the equation of associativity (2) is a linearly degenerate Hamiltonian 3-component system of hydrodynamic type which does not possess Riemann invariants. Such systems reduce to the 3-wave interaction.
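The failure of the Frobenius criterion (24) for the forms (27), asserted in Sect. 7, can be checked mechanically. The following is a minimal sketch, not part of the original derivation: in three variables $d\omega \wedge \omega$ equals $(\omega\cdot\operatorname{curl}\omega)\,du^1\wedge du^2\wedge du^3$, and the coefficients in (27) are linear, so unit differences give exact derivatives. The names `FORMS`, `partial` and `frobenius_density` and the sample point are hypothetical choices made for illustration.

```python
# Coefficients (A, B, C) of the 1-forms (27), omega = A du1 + B du2 + C du3.
FORMS = {
    1: lambda u1, u2, u3: (u2 - u3, u1 - u3, u2 - u1),
    2: lambda u1, u2, u3: (u2 - u3, u1 - u3, u1 - u2),
    3: lambda u1, u2, u3: (u2 - u3, u3 - u1, u2 - u1),
}


def partial(form, component, variable, point):
    """Exact partial derivative of one coefficient of the form.

    The coefficients in (27) are linear in u, so a unit forward
    difference reproduces the derivative without error.
    """
    shifted = list(point)
    shifted[variable] += 1
    return form(*shifted)[component] - form(*point)[component]


def frobenius_density(form, point):
    """Coefficient of du1^du2^du3 in d(omega)^omega at `point`."""
    a, b, c = form(*point)
    curl = (
        partial(form, 2, 1, point) - partial(form, 1, 2, point),
        partial(form, 0, 2, point) - partial(form, 2, 0, point),
        partial(form, 1, 0, point) - partial(form, 0, 1, point),
    )
    return a * curl[0] + b * curl[1] + c * curl[2]


if __name__ == "__main__":
    point = (1, 5, 2)  # hypothetical sample with u1 < u3 < u2
    for index, form in FORMS.items():
        print(index, frobenius_density(form, point))
```

Under these assumptions the three densities come out as $2(u^2 - u^3)$, $2(u^3 - u^1)$ and $2(u^1 - u^2)$, all nonzero when the $u^i$ are distinct, in line with the absence of Riemann invariants on which the reduction to the 3-wave interaction rests.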
In order to show this explicitly let $\sigma = B(u)\,dx + A(u)\,dt$, $\tau = N(u)\,dx + M(u)\,dt$ be two hydrodynamic type integrals of the system (5), i.e., the differential 1-forms $\sigma$, $\tau$ are closed on the solutions of the hydrodynamic system. Thus locally $\sigma$, $\tau$ are exact,

$$\begin{aligned}
\sigma &= B\,dx + A\,dt \equiv d\tilde{x},\\
\tau &= N\,dx + M\,dt \equiv d\tilde{t},
\end{aligned} \tag{29}$$

and we may change from $x, t$ to new independent variables $\tilde{x}, \tilde{t}$ to arrive at the equations

$$u^i_{\tilde{t}} = \tilde{v}^i_j(u)\,u^j_{\tilde{x}},$$

where the matrix $\tilde{v}$ is related to $v$ by the formula $\tilde{v} = (Bv - AE)(ME - Nv)^{-1}$ of reciprocal transformations. Under this change the exterior system (23) is transformed to the form

$$\omega^i \wedge (d\tilde{x} + \tilde{\lambda}^i\,d\tilde{t}) = 0, \qquad \text{where} \qquad \tilde{\lambda}^i = \frac{\lambda^i B - A}{M - \lambda^i N},$$

hence the 1-forms $\omega^i$ remain invariant under reciprocal transformations (29) while the eigenvalues $\lambda^i$ are transformed.

Theorem 3. [15, 16]. If a 3-component system of hydrodynamic type (5) is linearly degenerate and Hamiltonian with a nondegenerate Poisson bracket of hydrodynamic type, then we can perform a reciprocal transformation through a pair of integrals of this system such that the transformed system has constant eigenvalues $\tilde{\lambda}^i$, which can be put equal to 1, -1, 0 without loss of generality.

For the system (19) the existence of the transformation (29) is established by Theorem 3 and explicitly we have

$$\begin{aligned}
d\tilde{x} &= (u^1 - u^2)\,dx + u^3(u^2 - u^1)\,dt,\\
d\tilde{t} &= (2u^3 - u^1 - u^2)\,dx + (2u^1u^2 - u^1u^3 - u^2u^3)\,dt,
\end{aligned} \tag{30}$$

which by virtue of Eqs. (19) are exact 1-forms. Under the transformation (30) the eigenvalues will be 1, -1 and 0 respectively.

Theorem 4. [15, 16]. If a 3-component system of hydrodynamic type is nondiagonalisable, linearly degenerate and Hamiltonian, then the corresponding 1-forms $\omega^1, \omega^2, \omega^3$ can be normalized so that they will satisfy

$$d\omega^1 = \omega^2 \wedge \omega^3, \qquad d\omega^2 = \omega^3 \wedge \omega^1, \qquad d\omega^3 = \omega^1 \wedge \omega^2, \tag{31}$$

the Maurer-Cartan equations for either SO(3) or SO(2, 1), depending on the sign $\pm 1$ which enters into the signature of the metric defining the Poisson bracket of hydrodynamic type.

In our case the signature of the metric of the Poisson bracket (20) is Lorentzian and for the system (19) the 1-forms $\omega^i$ can be normalized so that they will satisfy the structure equations of SO(2, 1). We will not introduce a new notation for the normalized 1-forms $\omega^i$, which with the desired normalization are given by

$$\begin{aligned}
\omega^1 &= \frac{(u^2 - u^3)\,du^1 + (u^1 - u^3)\,du^2 + (u^2 - u^1)\,du^3}{2(u^2 - u^3)\sqrt{(u^2 - u^1)(u^3 - u^1)}},\\
\omega^2 &= \frac{(u^2 - u^3)\,du^1 + (u^1 - u^3)\,du^2 + (u^1 - u^2)\,du^3}{2(u^3 - u^1)\sqrt{(u^2 - u^1)(u^2 - u^3)}},\\
\omega^3 &= \frac{(u^2 - u^3)\,du^1 + (u^3 - u^1)\,du^2 + (u^2 - u^1)\,du^3}{2(u^2 - u^1)\sqrt{(u^3 - u^1)(u^2 - u^3)}},
\end{aligned} \tag{32}$$
where for definiteness we took $u^1 < u^3 < u^2$. One can verify directly that the 1-forms (32) satisfy the Maurer-Cartan equations of SO(2, 1). According to Theorems 3 and 4 any nondiagonalisable linearly degenerate Hamiltonian 3-component system of hydrodynamic type with nondegenerate Poisson bracket can be reduced to the canonical form

$$\omega^1 \wedge (d\tilde{x} + d\tilde{t}) = 0, \qquad \omega^2 \wedge (d\tilde{x} - d\tilde{t}) = 0, \qquad \omega^3 \wedge d\tilde{x} = 0 \tag{33}$$

by a suitable reciprocal transformation (30) in the new independent variables $\tilde{x}, \tilde{t}$. Moreover, the 1-forms $\omega^i$ still satisfy the Maurer-Cartan equations (31) as they are not affected by reciprocal transformations. Introducing the normalization factors $p^i$ for the 1-forms satisfying Eqs. (33),

$$\omega^1 = p^1(d\tilde{x} + d\tilde{t}), \qquad \omega^2 = p^2(d\tilde{x} - d\tilde{t}), \qquad \omega^3 = p^3\,d\tilde{x}, \tag{34}$$

we can then substitute them into the Maurer-Cartan equations (31) to obtain

$$\begin{aligned}
p^1_{\tilde{t}} - p^1_{\tilde{x}} &= -p^2 p^3,\\
p^2_{\tilde{t}} + p^2_{\tilde{x}} &= -p^1 p^3,\\
p^3_{\tilde{t}} &= 2p^1 p^2,
\end{aligned} \tag{35}$$
which is simply the integrable 3-wave system. Using the explicit coordinate representation of the 1-forms (22) we can obtain expressions of the form $p^i = l^i_k(u)\,u^k_{\tilde{x}}$ for the factors $p^i$. Hence, the change from $u^i$ to $p^i$ is a differential substitution of the first order. We can summarize the transformation of the system (19) in the Halphen variables to the 3-wave system (35) in two steps. First we introduce new independent variables $\tilde{x}, \tilde{t}$ by a reciprocal transformation given by Eqs. (30) and then transform the dependent variables from $u^i$ to $p^i$ according to

$$\begin{aligned}
p^1 &= \frac{(u^2 - u^3)\,u^1_{\tilde{x}} + (u^1 - u^3)\,u^2_{\tilde{x}} + (u^2 - u^1)\,u^3_{\tilde{x}}}{2(u^2 - u^3)\sqrt{(u^2 - u^1)(u^3 - u^1)}},\\
p^2 &= \frac{(u^2 - u^3)\,u^1_{\tilde{x}} + (u^1 - u^3)\,u^2_{\tilde{x}} + (u^1 - u^2)\,u^3_{\tilde{x}}}{2(u^3 - u^1)\sqrt{(u^2 - u^1)(u^2 - u^3)}},\\
p^3 &= \frac{(u^2 - u^3)\,u^1_{\tilde{x}} + (u^3 - u^1)\,u^2_{\tilde{x}} + (u^2 - u^1)\,u^3_{\tilde{x}}}{2(u^2 - u^1)\sqrt{(u^3 - u^1)(u^2 - u^3)}},
\end{aligned} \tag{36}$$
which should be compared to Eqs. (32). Finally we note that reciprocal transformations of the type (29) in general change a local Hamiltonian operator into a nonlocal one, cf. [27, 12, 17, 28, 26] so that the Hamiltonian operator (7) will assume the form of a nonlocal operator in terms of pi .
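The claim that the reciprocal transformation (30) sends the eigenvalues $\lambda^i = -u^i$ of (25) to 1, -1, 0 can be checked with exact rational arithmetic. The sketch below simply evaluates $\tilde{\lambda}^i = (\lambda^i B - A)/(M - \lambda^i N)$ with $B, A, N, M$ read off from (30); the function name `reciprocal_eigenvalues` and the sample points are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def reciprocal_eigenvalues(u1, u2, u3):
    """Transformed eigenvalues under (30) for integer or rational u1, u2, u3.

    dx~ = B dx + A dt and dt~ = N dx + M dt, with coefficients from (30).
    """
    B = u1 - u2
    A = u3 * (u2 - u1)
    N = 2 * u3 - u1 - u2
    M = 2 * u1 * u2 - u1 * u3 - u2 * u3
    eigenvalues = (-u1, -u2, -u3)  # lambda^i = -u^i, cf. (25)
    return tuple(Fraction(lam * B - A, M - lam * N) for lam in eigenvalues)


if __name__ == "__main__":
    # Any point with pairwise distinct u^i should give the same constants.
    print(reciprocal_eigenvalues(1, 5, 2))
    print(reciprocal_eigenvalues(-3, 7, 4))
```

The denominators factor as $(u^1 - u^2)(u^3 - u^1)$, $(u^1 - u^2)(u^2 - u^3)$ and $(u^3 - u^1)(u^2 - u^3)$ up to sign, so the check is meaningful exactly where the system is strictly hyperbolic.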
9. Variational Principle

In order to cast the associativity equation into the form of equations of hydrodynamic type we have introduced auxiliary variables (3) which are not suitable for formulating Eq. (2) in terms of a variational principle, as the Lagrangian density turns out to be nonlocal in these variables. However, if instead we introduce new auxiliary variables

$$p = f_x, \qquad q = f_t, \qquad r = f_{tt}, \tag{37}$$

whereby Eq. (2) will be given by the system [18]

$$\begin{aligned}
p_t &= q_x,\\
q_t &= r,\\
r_t &= q_{xx}^2 - p_{xx}\,r_x,
\end{aligned} \tag{38}$$

then Eqs. (38) can be obtained by varying the action $\int L\,dx\,dt$ with the local Lagrangian density [18]

$$L = p_x q_{xx}\,p_t + \left(p_x p_{xx} - \tfrac{1}{2} q_x\right) q_t + p\,r_t - q\,r_x + \tfrac{1}{2} q_x^2\,p_{xx}, \tag{39}$$

which is linear in the velocities. We note that in terms of the original variable $f$ this Lagrangian density is simply equivalent to

$$L = \tfrac{1}{2} f_{xt}^2 f_{xxx} + \tfrac{1}{2} f_{xt} f_{tt},$$
which results in the $x$-derivative of Eq. (2) upon variation with respect to $f$.

10. Symplectic Representation

The passage to a Hamiltonian formulation of the degenerate Lagrangian (39) requires use of Dirac's theory of constraints [4], which has been given in [18]. However, when the Lagrangian density is linear in the velocities, as in Eq. (39), Dirac's approach can be simplified and directly results in the symplectic representation of the system which is dual to its Hamiltonian representation (see [26] for this approach). In order to elucidate this point let us, following [26], consider an action $\int L\,dx\,dt$ with the Lagrangian density

$$L = f_i(u, u_x, \dots)\,u^i_t - H(u, u_x, \dots), \tag{40}$$

which is linear in the velocities but $f_i$ and $H$ are arbitrary functions of $u^k$ and their $x$-derivatives up to some finite order. For this case the Euler-Lagrange equations can be cast into the symplectic form [5, 26]

$$\omega_{ij}\,u^j_t = \frac{\delta H}{\delta u^i}, \tag{41}$$

where the symplectic matrix differential operator $\omega_{ij}$ is given by

$$\omega_{ij} = \frac{\partial f_j}{\partial u^i} - \frac{\partial f_i}{\partial u^j} - D\,\frac{\partial f_j}{\partial u^i_x} - \frac{\partial f_i}{\partial u^j_x}\,D + D^2\,\frac{\partial f_j}{\partial u^i_{xx}} - \frac{\partial f_i}{\partial u^j_{xx}}\,D^2 - \dots, \tag{42}$$
and the Hamiltonian density $H = H(u, u_x, \dots)$ is arbitrary. Examples of Lagrangians linear in $t$-derivatives arise naturally in nonlinear $\sigma$-models, Monge-Ampère equations and equations of hydrodynamic type, just to name a few; see a survey in [26]. Either by applying this procedure, or Dirac's theory, to the Lagrangian density (39), we arrive at the symplectic representation of the corresponding Euler-Lagrange equations (38) with

$$\omega_{ij} = \begin{pmatrix}
-q_{xx}D - D\,q_{xx} & p_{xx}D & 1\\
D\,p_{xx} & D & 0\\
-1 & 0 & 0
\end{pmatrix} \tag{43}$$

and the Hamiltonian density

$$H_1 = q\,r_x - \tfrac{1}{2} q_x^2\,p_{xx}, \tag{44}$$

which is the same expression as in Eq. (9) up to a divergence. The corresponding symplectic 2-form density [5] is given by

$$\omega = dp \wedge dr - q_{xx}\,dp \wedge dp_x + p_{xx}\,dp \wedge dq_x + \tfrac{1}{2}\,dq \wedge dq_x, \tag{45}$$

which can be directly verified to be a closed 2-form. By invoking the Poincaré lemma, in a local neighborhood we can write

$$\omega = d\alpha, \qquad \alpha = -(q_x p_{xx} + r)\,dp - \tfrac{1}{2} q_x\,dq, \tag{46}$$

where the coefficients of $dp$ and $dq$ are the Casimirs of (8). The closure of the symplectic 2-form (45) is equivalent to the satisfaction of the Jacobi identities by the Hamiltonian operator (8). Finally we can readily verify the symplectic form of the equations of motion

$$i_X\,\omega = dH_1, \tag{47}$$

obtained by contracting the 2-form (45) with the vector field

$$X = q_x\,\frac{\partial}{\partial p} + r\,\frac{\partial}{\partial q} + (q_{xx}^2 - p_{xx} r_x)\,\frac{\partial}{\partial r},$$

defining the flow given by Eqs. (38).

11. Second Hamiltonian Structure

Inverting the symplectic operator $\omega_{ij}$ in Eq. (43), we arrive at the Hamiltonian representation of the system (38) with the Hamiltonian operator [18]

$$J_1 = \begin{pmatrix}
0 & 0 & -1\\
0 & D^{-1} & p_{xx}\\
1 & -p_{xx} & -D\,q_{xx} - q_{xx}D - p_{xx}D\,p_{xx}
\end{pmatrix} \tag{48}$$

which appears to be nonlocal but transforming from $p, q, r$ back to the variables $a, b, c$ according to
$a = p_{xx}$, $b = q_{xx}$, $c = r_x$, the Hamiltonian operator (48) becomes simply the local homogeneous third order operator (8). The Hamiltonian operator $J_1$ belongs to the class of third order homogeneous Hamiltonian operators that were introduced in [10] and extensively investigated in [35, 36, 6]. In Sect. 14 we shall present a brief survey of results known presently on third order homogeneous Hamiltonian operators. Although in the variables $a, b, c$ the Hamiltonian density (44) assumes the nonlocal form given in Eqs. (9), it generates local equations of motion (4), as one can verify directly by a straightforward calculation. The impulse of the Hamiltonian operator $J_1$ is also nonlocal:

$$P_1 = -(D^{-1}a)(D^{-1}c) - \tfrac{1}{2}(D^{-1}b)^2 + \tfrac{1}{2}\,b\,(D^{-1}a)^2.$$

Moreover, besides the obvious Casimirs $a, b, c$, the operator $J_1$ possesses also three nonlocal Casimir densities

$$C_1 = D^{-1}a, \qquad C_2 = D^{-1}b, \qquad C_3 = D^{-1}c + a\,(D^{-1}b), \tag{49}$$

and we note that in the variables $p, q, r$ both $C_1$ and $C_2$ become trivial, while

$$P_1 = -r\,p_x - \tfrac{1}{2} q_x^2 + \tfrac{1}{2} q_{xx}\,p_x^2, \qquad C_3 = r + p_{xx} q_x,$$

assume the form of local expressions.
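The inversion of the symplectic operator (43) leading to (48) is short enough to write out. The following worked step is a sketch rather than part of the original text: it solves $\omega X = Y$ row by row for column vectors $X = (X_1, X_2, X_3)^t$ and $Y = (Y_1, Y_2, Y_3)^t$, which are auxiliary symbols introduced here, with $D^{-1}$ understood up to a constant of integration.

```latex
% Solve \omega X = Y with \omega as in (43); X and Y are auxiliary column vectors.
\begin{aligned}
\text{row 3:}\quad & -X_1 = Y_3
  &&\Rightarrow\quad X_1 = -Y_3,\\
\text{row 2:}\quad & D\,p_{xx}X_1 + D\,X_2 = Y_2
  &&\Rightarrow\quad X_2 = D^{-1}Y_2 + p_{xx}Y_3,\\
\text{row 1:}\quad & -(q_{xx}D + D\,q_{xx})X_1 + p_{xx}D\,X_2 + X_3 = Y_1
  &&\Rightarrow\quad X_3 = Y_1 - p_{xx}Y_2
     - (D\,q_{xx} + q_{xx}D + p_{xx}D\,p_{xx})\,Y_3 .
\end{aligned}
```

Reading off the coefficients of $Y_1, Y_2, Y_3$ in $X = J_1 Y$ reproduces the rows of (48), and the only nonlocal entry, $D^{-1}$, sits in the $(2,2)$ position.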
12. Compatibility

We have established that the associativity equation (2) admits two local Hamiltonian structures, with the Dubrovin-Novikov type first order Hamiltonian operator (7) and the homogeneous third order operator (8). In order to establish the complete integrability of this system according to Magri's theorem [21] we need

Theorem. Hamiltonian operators $J_0$ and $J_1$ are compatible.

The proof of compatibility lies in showing that the linear combination $\lambda J_0 + \mu J_1$ of these two local Hamiltonian operators with arbitrary constant coefficients $\lambda, \mu$ is Hamiltonian as well. We therefore need to check the Jacobi identities for this linear combination, which can be done by a standard algorithm. As a result of this lengthy but straightforward calculation we find that $J_0$ and $J_1$ are compatible Hamiltonian operators and the system (4) is bi-Hamiltonian. The bi-Hamiltonian representation of the associativity equation (2) proves its integrability via Magri's theorem [21] and the Lenard-Magri recursion scheme, as we shall detail in Sect. 15.
13. Darboux's Theorem

It is possible to demonstrate that the second Hamiltonian operator satisfies the conditions of Darboux's theorem. This can be accomplished by a change of dependent variables from $a, b, c$ to new variables $s^i$ according to

$$a = s^1_x, \qquad b = s^2_x, \qquad c = s^3_x - s^1_x s^2_x - s^2 s^1_{xx}, \tag{50}$$

where $s^i$ are densities of the three nonlocal Casimirs $C_i$ of the Hamiltonian operator $J_1$. It can be directly verified that in the new variables $s^i$ the Hamiltonian operator $J_1$ transforms into

$$\tilde{J}_1 = -\begin{pmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{pmatrix} D, \tag{51}$$

thus manifesting the validity of Darboux's theorem. To our knowledge this is the first Darboux-type result in the theory of nontrivial third order homogeneous Hamiltonian operators of differential-geometric type which cannot be reduced to constant coefficient form by a point transformation. Finally, it will be of interest to record the form of the system (4) in the new variables,

$$\begin{aligned}
s^1_t &= s^2_x,\\
s^2_t &= (-s^2 s^1_x + s^3)_x,\\
s^3_t &= (s^2 s^2_x)_x,
\end{aligned} \tag{52}$$

which turns out to be of reaction-diffusion type. Now we have

$$H_0 = \tfrac{1}{2}(s^2)^2 s^1_x - s^2 s^3, \qquad P = -s^1 s^3 - \tfrac{1}{2}(s^2)^2$$

for the densities of the Hamiltonian and impulse respectively.

14. Homogeneous Hamiltonian Operators

The remarkable result that first order homogeneous Hamiltonian operators (11) for equations of hydrodynamic type are connected with flat metrics has led Dubrovin and Novikov [10] to conjecture that homogeneous Hamiltonian operators of arbitrary order $n$ would also have interesting differential-geometric content, and they proposed the problem of classification of these operators. In the one-dimensional case homogeneous Hamiltonian operators of degree $n$ are of the form

$$J^{ij} = g^{ij}(u)\,D^n + b^{ij}_k(u)\,u^k_x\,D^{n-1} + \left[c^{ij}_k(u)\,u^k_{xx} + c^{ij}_{kl}(u)\,u^k_x u^l_x\right] D^{n-2} + \dots + \left[d^{ij}_k(u)\,u^k_{(n)} + \dots\right], \tag{53}$$

where $\det g^{ij} \neq 0$ and with respect to the natural grading $\deg(hg) = \deg h + \deg g$, $\deg f(u(x)) = \deg u(x) = 0$, $\deg D^k = \deg u_{(k)} = k$,
all the terms are homogeneous of degree $n$. Homogeneous Hamiltonian operators of Dubrovin-Novikov type (53) define several different geometries on a manifold with local coordinates $u^1, \dots, u^N$. The coefficient of $D^n$ transforms as a second rank contravariant tensor, a metric, with respect to local changes of coordinates, while the coefficients $g_{is}b^{sk}_j$, $g_{is}c^{sk}_j$, ..., $g_{is}d^{sk}_j$ transform as Christoffel symbols of some affine connections not necessarily related to the above metric. Unfortunately at this moment a complete classification of homogeneous Hamiltonian operators (53) exists only for $n = 0$ (Darboux), for $n = 1$ (Dubrovin and Novikov [9]) and for $n = 2$ (Potemin [35, 36] and Doyle [6]). The case $n = 3$ was partially studied in [35, 36] and [6]; however, a complete classification for $n = 3$ is still lacking. In [32] Novikov conjectured that the last connection

$$\tilde{\Gamma}^i_{jk} = g_{js}(u)\,d^{si}_k(u) \tag{54}$$

in Eq. (53) is torsion-free and flat. Novikov's conjecture was proved by Potemin in [36] for arbitrary $n$, see also Doyle [6]. For $n = 1$, since we have a flat metric, one can always choose coordinates where the Christoffel symbols $\Gamma^j_{sk}$ vanish and the metric tensor has constant coefficients $g^{ij} = g_0^{ij} = \text{const}$. In these coordinates the Hamiltonian operator $J^{ij}$ assumes the form $J^{ij} = g_0^{ij} D$, manifesting the validity of Darboux's theorem. As it was shown in Sect. 2, the flat coordinates for $J_0$ are simply $u^i$, whereby the metric coefficients become $\pm 1/2$. For the homogeneous Hamiltonian operator of the third order

$$J^{ij} = g^{ij}(u)\,D^3 + b^{ij}_k(u)\,u^k_x\,D^2 + \left[c^{ij}_k(u)\,u^k_{xx} + c^{ij}_{kl}(u)\,u^k_x u^l_x\right] D + \left[d^{ij}_k(u)\,u^k_{xxx} + d^{ij}_{kl}(u)\,u^k_{xx} u^l_x + d^{ij}_{klm}(u)\,u^k_x u^l_x u^m_x\right], \tag{55}$$

with $\det g^{ij} \neq 0$, Potemin [36] has shown that the last connection $\tilde{\Gamma}^j_{sk}$ in Eq. (54) is indeed flat and therefore there exists a local coordinate system where all these Christoffel symbols, and consequently the coefficients $d^{ij}_k$, are zero. Then in this local coordinate system we have $d^{jk}_l = 0$, $d^{jk}_{lm} = 0$, $d^{jk}_{lmn} = 0$, and therefore an arbitrary nondegenerate homogeneous Hamiltonian operator of the third order can always be reduced to the form

$$J^{ij} = g^{ij}(u)\,D^3 + b^{ij}_k(u)\,u^k_x\,D^2 + \left[c^{ij}_k(u)\,u^k_{xx} + c^{ij}_{kl}(u)\,u^k_x u^l_x\right] D, \tag{56}$$

which is a useful simplification in order to understand the conditions required by the Jacobi identities for the third order homogeneous operator (56). The coefficients $b^{ij}_k(u)$ and $c^{ij}_{kl}(u)$ are defined by virtue of the Jacobi identities by the relations

$$b^{ij}_k = 2c^{ij}_k + c^{ji}_k, \tag{57}$$

$$2c^{ij}_{lm} = \frac{\partial c^{ij}_l}{\partial u^m} + \frac{\partial c^{ij}_m}{\partial u^l}. \tag{58}$$

Potemin [36] has found that the expression (56) defines a Hamiltonian operator if and only if $g_{ij}$ and $c_{mnk} = g_{mj}\,g_{ni}\,c^{ij}_k$ satisfy
$$\frac{\partial g_{mn}}{\partial u^k} = -c_{mnk} - c_{nmk}, \tag{59}$$

$$c_{mnk} + c_{mkn} = 0, \tag{60}$$

$$c_{mnk} + c_{nkm} + c_{kmn} = 0, \tag{61}$$

$$\frac{\partial c_{mnk}}{\partial u^l} = -g^{pq}\,c_{pml}\,c_{qnk}, \tag{62}$$

$$\sum_{(m,n,p)} A^{lk}_{rp}\,c_{qlm}\,c_{skn} = 0, \tag{63}$$

where

$$A^{lk}_{rp} = -g^{li} g^{kj}\left(\frac{\partial c_{rjp}}{\partial u^i} + \frac{\partial c_{jip}}{\partial u^r} + \frac{\partial c_{rij}}{\partial u^p}\right), \tag{64}$$

and the sum $\sum_{(m,n,p)}$ is taken over all cyclic permutations of the elements $(m, n, p)$. As it was shown by Potemin [36], from the conditions (59)–(63) it follows that

$$\frac{\partial^2 c_{mnk}}{\partial u^l\,\partial u^p} = 0,$$

and the metric $g_{mn}(u)$ is quadratic in these special local coordinates,

$$g_{mn} = g_{mnpq}\,u^p u^q + g_{mnp}\,u^p + g^0_{mn}, \tag{65}$$

where $g_{mnpq}$, $g_{mnp}$ and $g^0_{mn}$ are constant. Furthermore, again in this special coordinate system, $c_{mnp}$ coincides with the torsion tensor with lower indices of the connection, $c^{ij}_k = -3g^{is}\Gamma^j_{sk}$. The torsion tensor $c_{mnp}$ satisfying the conditions (59)–(63) and the constant matrix $g^0_{mn}$ completely define nondegenerate homogeneous Hamiltonian operators of third order (56).
Theorem. [36, 6]. A nondegenerate homogeneous differential-geometric Hamiltonian operator of the third order (55) can be reduced to the constant coefficient form $J^{ij} = g^{ij}D^3$, where $g^{ij} = \text{const}$, by a local change of coordinates $u = u(v)$ if and only if the connection

$$\Gamma^j_{ik} = -\tfrac{1}{3}\,g_{is}\,c^{sj}_k$$

has vanishing torsion.

The homogeneous Hamiltonian operator of the third order (8) provides a nice illustration of this situation. The coordinates $a, b, c$ are flat coordinates of the last connection in (55), as every term in (8) has at least one $D$ on the right. Then we can read off the coefficients of the contravariant metric from the coefficients of $D^3$ in (8) and invert to obtain the metric

$$ds^2 = -2b\,da^2 + 2a\,da\,db + 2\,da\,dc + db^2, \tag{66}$$

where the metric coefficients are linear in $a, b, c$, in full agreement with (65). Interestingly enough the metric (66) is also flat, even though this is not required by the above conditions obtained from the Jacobi identities. If we introduce new coordinates $u, v, w$ through
$$a = u, \qquad b = v - u^2, \qquad c = w + uv - \frac{u^3}{3},$$

the metric (66) becomes manifestly flat, $ds^2 = 2\,du\,dw + dv^2$, with constant coefficients. However, we must emphasize that even though the metric is flat the operator $J^{ij}$ cannot be reduced to constant coefficient form, because the coordinate system where the last connection vanishes, $(a, b, c)$, and the coordinate system where the metric has the constant coefficient form, $(u, v, w)$, are different. On the other hand, in Sect. 13 we have shown that the operator $J^{ij}$ can be reduced to the first order constant coefficient form (51) by a differential substitution (50). We would like to conclude this section with the following question, which may shed light on a possible proof of Darboux's theorem for third order operators. Namely, is it true that an arbitrary homogeneous Hamiltonian operator of the third order can be reduced to constant coefficient form by an appropriate differential substitution?
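The flatness of the metric (66) can be confirmed by pulling it back along the change of coordinates $a = u$, $b = v - u^2$, $c = w + uv - u^3/3$. The sketch below does this with integer arithmetic at a sample point; the helper names `metric_abc`, `jacobian` and `pullback` and the sample point are hypothetical, introduced only for illustration.

```python
def metric_abc(a, b):
    """Symmetric matrix of (66) in the coordinates (a, b, c).

    ds^2 = -2b da^2 + 2a da db + 2 da dc + db^2; c does not enter.
    """
    return [[-2 * b, a, 1],
            [a, 1, 0],
            [1, 0, 0]]


def jacobian(u, v, w):
    """Rows d(a), d(b), d(c); columns du, dv, dw (w does not enter)."""
    return [[1, 0, 0],
            [-2 * u, 1, 0],
            [v - u * u, u, 1]]


def pullback(u, v, w):
    """Metric (66) expressed in (u, v, w): J^t g J."""
    g = metric_abc(u, v - u * u)
    jac = jacobian(u, v, w)
    return [[sum(jac[k][i] * g[k][l] * jac[l][j]
                 for k in range(3) for l in range(3))
             for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    # Expected constant form: ds^2 = 2 du dw + dv^2.
    print(pullback(3, -2, 7))
```

The result is the constant matrix with unit entries on the anti-diagonal, i.e. $ds^2 = 2\,du\,dw + dv^2$, independently of the point chosen.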
15. Higher Commuting Flows

Given a pair of compatible Hamiltonian operators we can generate higher conservation laws and commuting flows via the Lenard-Magri recursion scheme. For this purpose we shall use the representation of both Hamiltonian operators in the Halphen variables and take the Casimirs $u^1, u^2, u^3$ of the operator $J_0$ as the starting point of the hierarchy. In the Halphen variables the first Hamiltonian operator $J_0$ is given by Eq. (20) and for the second Hamiltonian operator we can write

$$J^1 = \mathcal{J}\,J_1\,\mathcal{J}^t, \tag{67}$$

where $J_1$ is (8) in the variables $a, b, c$ and the matrix

$$\mathcal{J} = \begin{pmatrix}
\dfrac{(u^1)^2}{(u^1 - u^2)(u^1 - u^3)} & \dfrac{2u^1}{(u^1 - u^2)(u^1 - u^3)} & \dfrac{1}{(u^1 - u^2)(u^1 - u^3)}\\[2ex]
\dfrac{(u^2)^2}{(u^2 - u^1)(u^2 - u^3)} & \dfrac{2u^2}{(u^2 - u^1)(u^2 - u^3)} & \dfrac{1}{(u^2 - u^1)(u^2 - u^3)}\\[2ex]
\dfrac{(u^3)^2}{(u^3 - u^1)(u^3 - u^2)} & \dfrac{2u^3}{(u^3 - u^1)(u^3 - u^2)} & \dfrac{1}{(u^3 - u^1)(u^3 - u^2)}
\end{pmatrix} \tag{68}$$

is the inverse of the Jacobian of the transformation,

$$\mathcal{J} = \left(\frac{\partial(a, b, c)}{\partial(u^1, u^2, u^3)}\right)^{-1},$$

from $a, b, c$ to the Halphen variables. In these variables higher conserved densities $I^1, I^2, I^3$ are defined by the recursive formulas

$$J_0\,\delta I^m = J_1\,\delta u^m, \qquad m = 1, 2, 3, \tag{69}$$

and direct calculation results in the following expression for the first conserved density:
$$\begin{aligned}
I^1 ={}& \frac{2u^1 - u^2 - u^3}{2(u^2 - u^1)^3 (u^3 - u^1)^3}\left\{(u^3 - u^2)^2 (u^1_x)^2 + \left[(u^2 u^3)_x - u^1 (u^2 + u^3)_x\right]^2\right\}\\
&+ \frac{(u^2 - u^1)^2 + (u^3 - u^1)^2}{(u^2 - u^1)^3 (u^3 - u^1)^3}\; u^1_x \left[(u^2 u^3)_x - u^1 (u^2 + u^3)_x\right],
\end{aligned} \tag{70}$$

from which $I^2$ and $I^3$ can be obtained by interchanging the indices $1 \leftrightarrow 2$ and $1 \leftrightarrow 3$, respectively. The flux $F^1$, corresponding to the density $I^1$, is given by

$$\begin{aligned}
F^1 ={}& \frac{(u^2 + u^3 - u^1)^2 - u^2 u^3}{2(u^2 - u^1)^3 (u^3 - u^1)^3}\left\{(u^3 - u^2)^2 (u^1_x)^2 + \left[(u^2 u^3)_x - u^1 (u^2 + u^3)_x\right]^2\right\}\\
&+ \frac{u^2 (u^2 - u^1)^2 + u^3 (u^3 - u^1)^2}{(u^2 - u^1)^3 (u^3 - u^1)^3}\; u^1_x \left[(u^2 u^3)_x - u^1 (u^2 + u^3)_x\right],
\end{aligned} \tag{71}$$

and it can be directly verified that $I^1_t = F^1_x$. Note that both $I^m$ and $F^m$ are quadratic in the first derivatives $u^k_x$, so that they are of second order. We emphasize, however, that there are no "good" expressions for the conserved densities $I^m$ in the variables $a, b, c$ due to their obvious nonsymmetry under the interchange of indices. From the three equations (69) we arrive at the identity

$$J_0\,\delta(I^1 + I^2 + I^3) = J_1\,\delta a = 0, \tag{72}$$

because $a$ is the Casimir of $J_1$. Hence $I^1 + I^2 + I^3 = 0$, so that among the quadratic integrals $I^m$ and the three corresponding third-order higher flows

$$u_{s_m} = J_0\,\delta I^m = J_1\,\delta u^m, \qquad m = 1, 2, 3, \tag{73}$$

only two are linearly independent. We conclude that the hydrodynamic system (19) possesses exactly two conservation laws of an arbitrary even order, excluding the order zero, where it has 5 conservation laws of hydrodynamic type, and exactly two higher commuting flows of an arbitrary odd order.
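The statement that the matrix (68) is the inverse Jacobian of the map from the Halphen variables to $a, b, c$ can be tested with exact rational arithmetic. The sketch below assumes the relations $a = u^1 + u^2 + u^3$, $b = -\tfrac{1}{2}(u^1u^2 + u^1u^3 + u^2u^3)$, $c = u^1u^2u^3$; these are not restated in this part of the text, so they are an assumption here, chosen because they are consistent with the entries of (68) and with the hydrodynamic integrals of Sect. 16. All function names are hypothetical.

```python
from fractions import Fraction


def jacobian_abc(u):
    """d(a, b, c)/d(u1, u2, u3) under the ASSUMED relations
    a = u1 + u2 + u3, b = -(u1 u2 + u1 u3 + u2 u3)/2, c = u1 u2 u3."""
    u1, u2, u3 = u
    return [[1, 1, 1],
            [Fraction(-(u2 + u3), 2), Fraction(-(u1 + u3), 2),
             Fraction(-(u1 + u2), 2)],
            [u2 * u3, u1 * u3, u1 * u2]]


def matrix_68(u):
    """Rows of (68): ((u^i)^2, 2 u^i, 1) / prod_{j != i} (u^i - u^j)."""
    rows = []
    for i in range(3):
        others = [u[j] for j in range(3) if j != i]
        denom = (u[i] - others[0]) * (u[i] - others[1])
        rows.append([Fraction(u[i] ** 2, denom),
                     Fraction(2 * u[i], denom),
                     Fraction(1, denom)])
    return rows


def matmul(x, y):
    """Plain 3x3 matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    point = (1, 5, 2)  # hypothetical sample with pairwise distinct integers
    print(matmul(matrix_68(point), jacobian_abc(point)))
```

The design rests on the observation that each $u^i$ is a root of $\mu^3 - a\mu^2 - 2b\mu - c$, so differentiating this cubic gives $\partial u^i/\partial a$, $\partial u^i/\partial b$, $\partial u^i/\partial c$ with the common denominator $\prod_{j \neq i}(u^i - u^j)$, which is the structure visible in (68).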
16. Exact Solutions

The most natural way of constructing exact solutions of a system of nonlinear evolution equations is to consider the restriction of the system under study to the set of stationary points of its higher integrals [2, 29, 30]. The natural variables for considering the restriction of the associativity equation to the set of stationary points of its higher integrals are the Halphen variables, because the first Hamiltonian operator is in canonical form. In the Halphen variables we have integrals of hydrodynamic type

$$\int u^1 u^2 u^3\,dx, \tag{74}$$

$$\int (u^1 u^2 + u^1 u^3 + u^2 u^3)\,dx, \tag{75}$$

$$\int u^i\,dx, \qquad i = 1, 2, 3, \tag{76}$$

and two second order integrals (say, $I^1$ and $I^2$), quadratic in the first derivatives,

$$I^1 = \int g^1_{ij}(u)\,u^i_x u^j_x\,dx, \qquad I^2 = \int g^2_{ij}(u)\,u^i_x u^j_x\,dx, \tag{77}$$

(see Sect. 15 where the explicit form of $I^1$ and $I^2$ is given). Any linear combination of these integrals can be represented in the form

$$I = \int \left[g_{ij}(u)\,u^i_x u^j_x + V(u)\right] dx, \tag{78}$$

governing the motion of a particle in curved space with metric

$$g_{ij} = \lambda_1\,g^1_{ij} + \lambda_2\,g^2_{ij} \tag{79}$$
under the influence of the potential $V(u)$, which in our case is a cubic expression in the variables $u^i$. Here $\lambda_1, \lambda_2$ are two arbitrary constants and in the generic case, that is apart from the choice of constants $\lambda_1 = \lambda_2$, or $\lambda_1 = 0$, alternatively $\lambda_2 = 0$, the metric (79) is nondegenerate. Although the canonical Hamiltonian formulation of the corresponding equations on the set of stationary points is absolutely clear (see a general theorem in [29, 30], where an explicit canonical Hamiltonian representation and an explicit expression for a first integral in involution with the corresponding Hamiltonian is found for an arbitrary evolution system, restricted on the set of stationary points of its nondegenerate integral), it is by no means trivial to integrate the resulting finite dimensional dynamical system in a closed form. We hope to return to these questions elsewhere.

Acknowledgement. We would like to thank B.A. Dubrovin, I.M. Krichever and S.P. Novikov for discussions and useful remarks. E.V.F., C.A.P.G. and O.I.M. thank TÜBİTAK and Y.N. thanks CNPq for grants which made this collaboration possible. O.I.M. and E.V.F. thank the International Science Foundation, Russian Foundation for Basic Research, RFBR-DFG (grant No. 96-01-00050) and INTAS (grant No. 93-0166-Ext) for partial financial support.
References

1. Atiyah, M. F., Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, New Jersey: Princeton University Press, 1988
2. Bogoyavlenskii, O. I., Novikov, S. P.: The connection between the Hamiltonian formalisms for stationary and nonstationary problems. Funkts. Anal. i ego Pril. 10, 9–13 (1976) (Funct. Anal. Appl. 10, (1976))
3. Dijkgraaf, R., Verlinde, H., Verlinde, E.: Topological strings in d < 1. Nucl. Phys. B 352, 59–86 (1991)
4. Dirac, P. A. M.: Lectures on Quantum Mechanics. Belfer Graduate School of Science Monographs series 2, New York, 1964
5. Dorfman, I. Ya., Mokhov, O. I.: Local symplectic operators and structures related to them. J. Math. Phys. 32, No. 12, 3288–3296 (1991)
6. Doyle, P. W.: Differential geometric Poisson bivectors in one space variable. J. Math. Phys. 34, No. 4, 1314–1338 (1993)
7. Dubrovin, B. A.: Geometry of 2D topological field theories. Preprint SISSA–89/94/FM, SISSA, Trieste (1994), hep-th/9407018
8. Dubrovin, B. A.: Integrable systems in topological field theory. Nucl. Phys. B 379, 627–689 (1992)
9. Dubrovin, B. A., Novikov, S. P.: Hamiltonian formalism of one dimensional systems of hydrodynamic type and the Bogoliubov-Whitham averaging method. Dokl. Akad. Nauk SSSR 270, 781–785 (1983) (Soviet Math. Dokl. 27, 665–669 (1983))
10. Dubrovin, B. A., Novikov, S. P.: On Poisson brackets of hydrodynamic type. Dokl. Akad. Nauk SSSR 279, 294–297 (1984) (Soviet Math. Dokl. 30, 651–654 (1984))
11. Dubrovin, B. A., Novikov, S. P.: Hydrodynamics of weakly deformed soliton lattices. Differential geometry and Hamiltonian theory. Uspekhi Mat. Nauk 44, No. 6, 29–98 (1989) (Russ. Math. Surv. 44, No. 6, (1989))
12. Ferapontov, E. V.: Differential geometry of nonlocal Hamiltonian operators of hydrodynamic type. Funkts. Analiz i ego Pril. 25, No. 3, 37–49 (1991)
13. Ferapontov, E. V.: On integrability of 3 × 3 semi-Hamiltonian systems of hydrodynamic type, which do not possess Riemann invariants. Physica D 63, 50–70 (1993)
14. Ferapontov, E. V.: Several conjectures and results in the theory of integrable Hamiltonian systems of hydrodynamic type which do not possess Riemann invariants. Teor. and Mat. Physics 99, No. 2, 257–262 (1994)
15. Ferapontov, E. V.: On the matrix Hopf equation and integrable Hamiltonian systems of hydrodynamic type, which do not possess Riemann invariants. Phys. Lett. A 179, 391–397 (1993)
16. Ferapontov, E. V.: Dupin hypersurfaces and integrable Hamiltonian systems of hydrodynamic type which do not possess Riemann invariants. Diff. Geometry and its Appl. 5, 121–152 (1995)
17. Ferapontov, E. V.: Nonlocal Hamiltonian operators of hydrodynamic type: Differential geometry and applications. Am. Math. Soc. Translations, ser. 2. Topics in Topology and Math. Physics, Ed. Novikov, S. P. 170, Providence, RI: AMS, 1995, pp. 33–58
18. Galvão, C.A.P., Nutku, Y.: Hamiltonian structure of Dubrovin's equation of associativity in 2-d topological field theory. J. Phys. A (1996) to be published
19. Gümral, H., Nutku, Y.: Poisson structure of dynamical systems with three degrees of freedom. J. Math. Phys. 34, 5691 (1993)
20. Halphen, M.: C. R. Acad. Sci. Paris 92, 1101 (1881)
21. Magri, F.: A simple model of an integrable system. J. Math. Phys. 19, 1156 (1978)
22. Magri, F.: In: Nonlinear Evolution Equations and Dynamical Systems. Eds. Boiti, M., Pempinelli, F., Soliani, G. Lecture Notes in Phys. 120, New York: Springer, 1980, p. 233
23. Magri, F., Morosi, C., Tondo, G.: Nijenhuis G-manifolds and Lenard bicomplexes: A new approach to KP systems. Commun. Math. Phys. 115, 457 (1988)
24. Mokhov, O. I., Ferapontov, E. V.: Equations of associativity in two-dimensional topological field theory as integrable Hamiltonian nondiagonalizable systems of hydrodynamic type. 1995, hep-th/9505180; Funkts. Analiz i ego Pril. 30, No. 3, 62–72 (1996) (Funct. Anal. and its Appl. 30, No. 3, 62–72 (1996))
25. Mokhov, O. I.: Differential equations of associativity in 2D topological field theories and geometry of nondiagonalizable systems of hydrodynamic type. In: Internat. Conference on Integrable Systems "Nonlinearity and Integrability: From Mathematics to Physics," Feb. 21–24, 1995, Montpellier, France
26. Mokhov, O. I.: Symplectic and Poisson geometry on loop spaces of manifolds and nonlinear equations. Am. Math. Soc. Translations, ser. 2. Topics in Topology and Math. Physics, Ed. Novikov, S. P. 170, Providence, RI: AMS, 1995, pp. 121–151, hep-th/9503076
27. Mokhov, O. I., Ferapontov, E. V.: On the nonlocal Hamiltonian operators of hydrodynamic type connected with constant curvature metrics. Uspekhi Mat. Nauk 45, No. 3, 191–192 (1990) (Russ. Math. Surv. 45, No. 3, 1990)
28. Mokhov, O. I.: Hamiltonian systems of hydrodynamic type and constant curvature metrics. Phys. Letters A 166, No. 3,4, 215–216 (1992)
29. Mokhov, O. I.: The Hamiltonian property of an evolutionary flow on the set of stationary points of its integral. Uspekhi Mat. Nauk SSSR 39, No. 4, 173–174 (1984) (Russ. Math. Surv. 39, No. 4, 133–134 (1984))
30. Mokhov, O. I.: On the Hamiltonian property of an arbitrary evolution system on the set of stationary points of its integral. Izv. Akad. Nauk SSSR, Ser. Mat. 51, No. 6, 1345–1352 (1987) (Math. USSR Izv. 31, No. 3, 657–664 (1988))
31. Mokhov, O. I., Nutku, Y.: Bianchi transformation between the real hyperbolic Monge-Ampère equation and the Born-Infeld equation. Lett. Math. Phys. 32, No. 2, 121–123 (1994)
32. Novikov, S. P.: Geometry of conservative systems of hydrodynamic type. Method of averaging for field-theoretic systems. Uspekhi Mat. Nauk 40, No. 4, 79–89 (1985) (Russ. Math. Surv. 40, No. 4 (1985))
33. Nutku, Y., Sarıoğlu, Ö.: An integrable family of Monge-Ampère equations and their multi-Hamiltonian structure. Phys. Lett. A 173, 270 (1993)
34. Nutku, Y.: Hamiltonian structure of real Monge-Ampère equations. J. Phys. A: Math. and Gen. 29, 3257 (1996)
35. Potemin, G. V.: On Poisson brackets of differential-geometric type. Doklady Akad. Nauk SSSR 286, 39–42 (1986) (Soviet Math. Dokl. 33, 30–33 (1986))
36. Potemin, G. V.: Some aspects of differential geometry and algebraic geometry in soliton theory, PhD Thesis, Moscow State University, Moscow, (1991)
37. Tsarev, S. P.: On Poisson brackets and one-dimensional Hamiltonian systems of hydrodynamic type. Dokl. Akad. Nauk SSSR 282, 534–537 (1985) (Sov. Math. Dokl. 31, 488–491 (1985))
38. Tsarev, S. P.: The geometry of Hamiltonian systems of hydrodynamic type. The generalised hodograph method. Izvestiya Akad. Nauk SSSR, Ser. mat., 54, No. 5, 1048–1068 (1990) (Math. USSR Izv. 37, 397–419 (1991))
39. Witten, E.: On the structure of topological phase of two-dimensional gravity. Nucl. Phys. B 340, 281–332 (1990)
40. Witten, E.: Two-dimensional gravity and intersection theory on moduli space. Surveys in Diff. Geometry 1, 243–310 (1991)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 186, 671 – 700 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Correlation Spectrum of Quenched and Annealed Equilibrium States for Random Expanding Maps
Viviane Baladi*
Section de Mathématiques, Université de Genève, CH-1211 Geneva 24, Switzerland. E-mail: [email protected]
Received: 16 April 1996 / Accepted: 25 October 1996
Abstract: We show that the integrated transfer operators for positively weighted independent identically distributed smooth expanding systems give rise to annealed equilibrium states for a new variational principle. The unique annealed equilibrium state coincides with the unique annealed Gibbs state. Using work of Ruelle [1990] and Fried [1995] on generalised Fredholm determinants for transfer operators, we prove that the discrete spectrum of the transfer operators coincides with the correlation spectrum of these invariant measures (yielding exponential decay of correlations), and with the poles of an annealed zeta function, defined also for complex weights. A modified integrated transfer operator is introduced, which describes the (relativised) quenched states studied e.g. by Kifer [1992], and conditions (including SRB) ensuring coincidence of quenched and annealed states are given. For small random perturbations we obtain stability results on the quenched and annealed measures and spectra by applying perturbative results of Young and the author [1993].
* On leave from CNRS, UMR 128, ENS Lyon, France

1. Introduction

The study of equilibrium states for a single map $f : M \to M$ and a positive weight function $g$ on $M$, i.e., the analysis of $f$-invariant Borel probability measures $\mu$ on $M$ which maximise the expression

$$h_f(\nu) + \int \log g(x)\, \nu(dx) \tag{1.1}$$

(with $h_f(\nu)$ the entropy of $(f, \nu)$) is now a well-developed subject in a variety of settings (see e.g. Ruelle [1989] and references therein). One of the main tools for this is a transfer operator acting on a suitable Banach space of test functions $\varphi : M \to \mathbb{C}$ by

$$\mathcal{L}\varphi(x) = \sum_{f(y)=x} \varphi(y)\, g(y) , \tag{1.2}$$

where we assumed for simplicity that the map $f$ has finitely or countably many branches. In many cases one constructs the equilibrium state $\mu$ by combining maximal eigenfunctions of $\mathcal{L}$ and its dual, and one obtains exponential decay of the corresponding correlation functions

$$C_{\varphi_1 \varphi_2}(n) = \int (\varphi_1 \circ f^n(x))\, \varphi_2(x)\, \mu(dx) - \int \varphi_1(x)\, \mu(dx) \int \varphi_2(x)\, \mu(dx) \tag{1.3}$$

for suitable $\varphi_1$, $\varphi_2$ by proving that there is a gap in the spectrum of $\mathcal{L}$. The discrete spectrum of $\mathcal{L}$ can be shown to correspond to the poles of the Fourier transform of $C_{\varphi_1 \varphi_2}$ in some strip; these poles are the resonances of Ruelle [1987].

A natural generalisation of this problem (see e.g. Ruelle [1995] for an overview) consists in starting from a family of maps $f_\xi$ (or their inverse branches) and positive weights $g_\xi$ for $\xi \in E$, and defining the mixed or generalised transfer operator

$$\mathcal{L}\varphi(x) = \sum_{\xi} \sum_{f_\xi(y)=x} \varphi(y)\, g_\xi(y) \tag{1.4}$$
(the sum over $\xi$ being replaced by an integral when the index set $E$ is uncountable). This framework appears naturally when considering (weighted) independent identically distributed (i.i.d.) random compositions of maps $f_\xi$ associated with a probability measure $\theta(d\xi)$ on the index set $E$, a convenient description of the system being given by the weighted (two-sided) skew product on $M \times E^{\mathbb{Z}}$,

$$\tau(x, \omega) = (f_{\omega_0}(x), \sigma\omega) , \qquad g(x, \omega) = g_{\omega_0}(x) , \tag{1.5}$$

with $\sigma$ the shift on $E^{\mathbb{Z}}$, or its corresponding “one-sided” version $\tau^+$ (see (2.3)).

For weighted random (not necessarily i.i.d.) compositions, equilibrium states for a relativised variational principle (Ledrappier–Walters [1977], see (2.6) below) have been studied, in particular by Kifer [1992]. In the case where the maps $f_\xi$ are expanding, and the weights are given by the Jacobians $g_\xi(x) = 1/|\det D_x f_\xi|$, the integrated transfer operator (1.4) (see (2.16) for a precise formula) gives rise to this relativised equilibrium state $\mu^{(q)}$, which is just the SRB measure of the random system, and has been studied in particular by Baladi–Young [1993]. The discrete spectrum of the operator is then also related to integrated correlation functions (2.20) for the random SRB measure.

In more general cases (consider for example a family of linear one-dimensional repellors $f_\xi$, each with an invariant Cantor set of Hausdorff dimension $\alpha_\xi$ and escape rate $P_\xi$, and the weight $g_\xi = 1/|f_\xi'|$, cf. Remark 3.4), we find that the annealed integrated operator $\widehat{\mathcal{L}}$ (1.4)–(2.16) acting on a suitably “large” space does not always give rise to the relativised equilibrium state, but to another $\tau$-invariant measure (see (2.10) below for the corresponding variational principle) which we call the annealed equilibrium state $\mu^{(a)}$ (in particular, we solve a conjecture of Ruelle [1995, Sect. 7], see Sect. 2.4). Extending the analogy with spin-glasses (see e.g. Mézard et al. [1987], and our random Ising model example in Sect. 2.4) we rename the relativised equilibrium states quenched equilibrium states. Using previous work of Kifer [1992] and Khanin–Kifer [1996] we describe the modified (and less directly accessible, for example in computer simulations) quenched integrated transfer operator $\widehat{\mathcal{M}}$ (see (4.12)) which gives rise to the quenched states.
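The mixed transfer operator (1.4) can be sketched numerically. The following is only an illustration under stated assumptions, not the construction used in the paper: the maps are taken to be linear circle maps $f_\xi(y) = k_\xi y \bmod 1$, the index set is finite, and the probabilities $\theta(\xi)$ are absorbed into the weights $g_\xi$. The function names are hypothetical.

```python
import math

def inverse_branches(k, x):
    """All k preimages of x under f(y) = k*y mod 1 on the circle [0, 1)."""
    return [(x + i) / k for i in range(k)]

def transfer_operator(k, g, phi):
    """Return L phi, where (L phi)(x) = sum over f(y) = x of phi(y) * g(y), as in (1.2)."""
    def L_phi(x):
        return sum(phi(y) * g(y) for y in inverse_branches(k, x))
    return L_phi

def mixed_transfer_operator(family, phi):
    """Return the mixed operator (1.4): the sum over xi of L_xi phi.

    `family` is a list of (k_xi, g_xi) pairs. Assumption: any probability
    weight theta(xi) has already been absorbed into g_xi.
    """
    parts = [transfer_operator(k, g, phi) for k, g in family]
    return lambda x: sum(part(x) for part in parts)

# Hypothetical two-map family with Jacobian weights g_xi = theta(xi) / |f_xi'|,
# where theta gives probability 1/2 to each map.
family = [(2, lambda y: 0.5 / 2), (3, lambda y: 0.5 / 3)]

L_one = mixed_transfer_operator(family, lambda x: 1.0)
L_cos = mixed_transfer_operator(family, lambda x: math.cos(2 * math.pi * x))
```

With these Jacobian weights the constant function is fixed by the operator, while the first Fourier mode is sent to zero, which is the simplest possible picture of a spectral gap.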
We are then able to extend the very powerful transfer operator techniques (including perturbative results from Baladi–Young [1993], as well as the analysis of the discrete spectrum in terms of zeta functions or generalised Fredholm determinants of Ruelle [1990]) to both integrated operators $\widehat{\mathcal{L}}$, $\widehat{\mathcal{M}}$, obtaining a good understanding of the ergodic properties of both the annealed and quenched invariant measures $\mu^{(a)}$, $\mu^{(q)}$, including the resonances of their integrated correlation functions. Some of our results also apply to negative, or even complex weights (negative weights appear naturally e.g. in the study of renormalisation, see e.g. Christiansen et al. [1990], Jiang et al. [1992] and references therein) where perturbative results are also desirable.

See Baladi et al. [1996] for a treatment of random correlation functions (as opposed to the integrated correlation functions (2.20)), without any i.i.d. assumption. The Birkhoff cones used there do not seem to be directly applicable to other Ruelle resonances than the first one. We refer to Ruedin [1994], Lanford–Ruedin [1996] for a study of pressure and Gibbs state via similar integrated transfer operators.

In this paper we use mainly three ingredients: We adapt the results of Ruelle [1990] and Fried [1995] to our (one-sided) skew product situation; we transport the two-sided techniques of Kifer [1992] and Khanin–Kifer [1996] to our one-sided skew product; we apply the perturbative results in Baladi–Young [1993] to get stochastic stability.

In some sense we are considering a “toy model”: our uniform expansion and smoothness assumptions are the strongest possible. We expect and hope that the techniques developed here may be extended to more realistic settings (expanding in average as in Khanin–Kifer [1996], or more generally non-uniformly hyperbolic, and/or piecewise smooth). The theory presented here is particularly simple in the i.i.d. setting, but most of it can be extended to more general situations as pointed out to us by David Ruelle (see Appendix B).

The outline of the paper is as follows: In Sect. 2 we define precisely our model for random compositions of expanding maps, as well as the annealed and quenched equilibrium and Gibbs states. We also state the main results: Theorem 1 (existence and uniqueness of annealed states), Theorem 3 (stochastic stability for annealed and quenched states), Theorem 4 (giving the relationship between the spectrum of the integrated operators and the correlation functions for annealed and quenched states), Theorem 5 (stability of these correlation spectra) and finally Theorem 6 on annealed zeta functions and annealed Fredholm determinants and their stability. Section 3 contains a proof of Theorem 1 based on an analysis of an integrated transfer operator ($\widehat{\mathcal{L}}$, see Proposition 3.1) which also yields Theorem 4, and proofs of the stability results concerning the annealed states in Theorems 3 and 5. Theorem 6 is also proved in Sect. 3. Finally, Sect. 4 is devoted to the proofs of the claims in Theorems 3, 4 and 5 on quenched states, using the transfer operator $\widehat{\mathcal{M}}$ (Proposition 4.2).

I am indebted to Thomas Bogenschütz, Konstantin Khanin, and especially David Ruelle for extremely useful conversations. I would also like to thank Yuri Kifer, François Ledrappier, Laurent Ruedin, and Lai-Sang Young for interesting comments. It is a pleasure to acknowledge the hospitality of ETH Zürich, IHES, SFB 170 in Göttingen, and IMPA, where part of this work was carried through, as well as financial support from the Société Académique de Genève and the Fonds National Suisse de la Recherche Scientifique.
2. Definitions and Statement of Results

2.1. Weighted random composition of expanding maps. For fixed $r \ge 1$ let $M$ be a compact, connected, $C^r$ Riemann manifold endowed with a Riemann metric $d_M$. For $\gamma > 1$ let $C^r_\gamma(M, M)$ denote the space of all $\gamma$-expanding $C^r$ maps $f : M \to M$ (i.e., maps such that for all $x \in M$, and all $v \in T_x M$, we have $\|D_x f(v)\| > \gamma \|v\|$), endowed with the $C^r$ metric. Finally, let $C^r(M, \mathbb{C})$, respectively $C^r(M, \mathbb{R}^+_*)$, be the space of all complex-valued or positive $C^r$ functions, endowed with the $C^r$ metric $d_r$ or norm $\|\cdot\|_r$. (Many of our results have versions for $M$ a compact metric space and Lipschitz or Hölder smoothness, or replacing the inverse branches of expanding maps by suitable families of contractions.)

Let $E$ be a compact subspace of $C^r_\gamma(M, M) \times C^r(M, \mathbb{C})$, for the $C^r$ metric $d_E$. Let $\Omega^+$ be the compact space of one-sided sequences $E^{\mathbb{Z}_+}$ endowed with the distance $d_\alpha(\omega, \tilde\omega) = \sum_{k=0}^{\infty} \alpha^k d_E(\omega_k, \tilde\omega_k)$ for some $0 < \alpha < 1$, and let $\Omega = E^{\mathbb{Z}}$ be the corresponding two-sided space, with an analogous metric also denoted $d_\alpha$. Let $\sigma^+$ be the one-sided shift to the left on $\Omega^+$, and $\sigma$ the two-sided shift to the left on $\Omega$. Fix $\theta$ a Borel probability on $E$. The product measure $\Theta^+ = \theta^{\mathbb{Z}_+}$ on $\Omega^+$ is $\sigma^+$-invariant and $\sigma^+$ is ergodic for $\Theta^+$. For a Borel measure $\upsilon$ on $M \times \Omega^+$, we shall write $\pi\upsilon$ for the marginal of $\upsilon$ on $\Omega^+$. We let $\mathcal{P}_\Theta$ denote the space of $\tau^+$-invariant probability measures $\mu$ on $M \times \Omega^+$ with $\pi\mu = \Theta^+$.

If $\omega \in \Omega^+$, write $f_\omega$ for the first coordinate of $\omega_0 \in E$, and $g_\omega$ for the second coordinate of $\omega_0$. We consider the independent identically distributed compositions

$$f_\omega^{(n)} = f_{(\sigma^+)^n \omega} \circ \cdots \circ f_{\sigma^+ \omega} \circ f_\omega , \tag{2.1}$$

weighted by

$$g_\omega^{(n)} = g_{(\sigma^+)^{n-1} \omega} \circ f_\omega^{(n-1)} \cdots g_{\sigma^+ \omega} \circ f_\omega \cdot g_\omega , \tag{2.2}$$

where $n \ge 1$ and $(f_\omega, g_\omega) = (f_{\omega_0}, g_{\omega_0})$ is chosen in $E$ following the distribution $\theta(d\omega_0)$. In other words, we are iterating the (weighted) one-sided skew-product $\tau^+ : M \times \Omega^+ \to M \times \Omega^+$,

$$\tau^+(x, \omega) = (f_{\omega_0}(x), \sigma^+(\omega)) , \qquad g(x, \omega) = g_{\omega_0}(x) . \tag{2.3}$$

The map $\tau^+$ is in general not positively expansive, but for each $\xi \in E$ and each local inverse branch $(f_\xi)^{-1}_i$ of $f_\xi$ the inverse branch

$$(\tau^+)^{-1}_{\xi,i}(x, \omega) = ((f_\xi)^{-1}_i x, \xi \wedge \omega) \tag{2.4}$$

(where $\xi \wedge \omega$, or simply $\xi\omega$, denotes the concatenation of $\xi \in E$ and $\omega \in \Omega^+$) is a $\max(\alpha, 1/\gamma)$ contraction for the metric $d_M \times d_\alpha$. In particular, we shall see that we are in the framework of Ruelle [1990] or Fried [1995].

We call such a system $(\tau^+, g, \theta)$ (note that the pair $(E, \theta)$ contains all the information) a $C^r$ weighted independent identically distributed (i.i.d.) expanding map. If all the $g_\omega$ are real and positive-valued (respectively nonnegative-valued) the system is called positively weighted (respectively nonnegatively weighted).

A special case of a (family of) random i.i.d. expanding maps is obtained by considering small random perturbations of $(f_0, g_0)$, for $f_0 \in C^r_\gamma(M, M)$ and $g_0 \in C^r(M, \mathbb{C})$: For each small $\epsilon \ge 0$ we have a probability measure $\theta_\epsilon$ on some fixed $E$ as above, with

$$\operatorname{supp} \theta_\epsilon \subset B_\epsilon(f_0, g_0) , \tag{2.5}$$

where $B_\epsilon$ is the $\epsilon$-ball in the $d_E$ metric. (In particular, $\theta_0$ is the Dirac mass at $(f_0, g_0)$.) Many of our results concern this special case.
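The random compositions (2.1) and their weights (2.2) can be sketched as follows. This is a minimal illustration under stated assumptions: $M$ is the circle, $E$ is a hypothetical two-element family of perturbed linear maps with Jacobian weights, $\theta$ is uniform, and $f_\omega^{(n)}$ is read as the composition of the first $n$ maps $f_{\omega_0}, \ldots, f_{\omega_{n-1}}$, so that it matches the $n$ factors appearing in (2.2).

```python
import math
import random

def make_pair(k, eps):
    """Hypothetical element (f, g) of E on the circle [0, 1).

    f(x) = k*x + eps*sin(2*pi*x)/(2*pi) mod 1 is expanding when k - eps > 1,
    and g(x) = 1/|f'(x)| is the Jacobian weight.
    """
    def f(x):
        return (k * x + eps * math.sin(2 * math.pi * x) / (2 * math.pi)) % 1.0
    def g(x):
        return 1.0 / abs(k + eps * math.cos(2 * math.pi * x))
    return f, g

def compose(omega, n, x):
    """Return (f_omega^(n)(x), g_omega^(n)(x)) using the first n entries of omega."""
    weight = 1.0
    for f, g in omega[:n]:
        weight *= g(x)   # factor g_{omega_j}(f_omega^(j)(x))
        x = f(x)         # advance to f_omega^(j+1)(x)
    return x, weight

rng = random.Random(0)
E = [make_pair(2, 0.3), make_pair(3, 0.5)]    # finite index set (assumption)
omega = [rng.choice(E) for _ in range(12)]    # i.i.d. draw, theta uniform on E
```

A useful sanity check is the cocycle property $g_\omega^{(m+n)}(x) = g_\omega^{(m)}(x)\, g_{(\sigma^+)^m \omega}^{(n)}(f_\omega^{(m)}(x))$, which in this sketch amounts to calling `compose` on `omega[m:]` starting from the point reached after `m` steps.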
Remark 2.1. Another model in which our arguments work without modification is given by the following data: let $M$, $\gamma$, $r$, $C^r_\gamma(M, M)$, $C^r(M, \mathbb{C})$ be as above, let $(E, d_E)$ be a compact metric space endowed with a probability measure $\theta$, set $\Omega^+ = E^{\mathbb{Z}_+}$ endowed with metrics $d_\alpha$ for $0 < \alpha < 1$, and consider

$$f : \Omega^+ \to C^r_\gamma(M, M) , \qquad g : \Omega^+ \to C^r(M, \mathbb{C}) ,$$

two Lipschitz functions with $f(\omega) = f(\omega_0)$ and $g(\omega) = g(\omega_0)$, which we view as random variables on $(\Omega^+, \theta^{\mathbb{Z}_+})$ (or equivalently $(E, \theta)$). The rest of the setup is as above. This other description is more convenient to describe one-dimensional random Ising models in Sect. 2.4. Also, it allows generalisations to Lipschitz $g$ on $\Omega^+$ which depend on the full sequence $\omega_0, \omega_1, \ldots$ (but assuming still that $f(\omega) = f(\omega_0)$). In this case, most of our results hold (see the discussion on the operator $\widehat{\mathcal{M}}$ in Sect. 4). The only differences are that the operator $\mathcal{L}$ does not exist any more (we must work with $\widehat{\mathcal{L}}$, most notably in Theorems 4 and 6, and the maximal eigenfunction $\hat\rho(x, \omega)$ of Proposition 3.1 (2) can depend on $\omega$) and that the definition of the annealed zeta function and determinant (2.24–2.25) must be slightly changed (completing periodically the sequences $\vec\xi$ appearing in $g_{\vec\xi}$).

2.2. Relativised equilibrium and Gibbs states (quenched and annealed). We assume in this subsection that all weights $g_\omega$ are real and nonnegative.

Quenched and annealed equilibrium states. Recall that an equilibrium state for the relativised variational principle (Ledrappier–Walters [1977], Ruelle [1978 (Sects. 6.21–22)], Kifer [1992]), for $\tau^+$, $g(x, \omega) = g_{\omega_0}(x)$, and $\theta$ is a Borel probability measure $\mu \in \mathcal{P}_\Theta$ which realises the supremum

$$Q^{(q)}(\log g) = \sup\Big\{ h_{\tau^+}(\nu | \Theta^+) + \int \log g(x, \omega)\, \nu(dx, d\omega) \;\Big|\; \nu \in \mathcal{P}_\Theta \Big\} , \tag{2.6}$$

where $h_{\tau^+}(\nu | \Theta^+)$ denotes the relative entropy of $\nu$ with respect to its marginal $\pi\nu = \Theta^+$. We shall apply the formula from Bogenschütz–Crauel [1992], Bogenschütz [1992]

$$h_{\tau^+}(\nu | \pi\nu) = \sup_{\mathcal{Q} \text{ finite partition of } M} \; \lim_{n \to \infty} \frac{1}{n} \int_{\Omega^+} H_{\nu^\omega}\Big( \bigvee_{i=0}^{n-1} (f_\omega^{(n)})^{-1} \mathcal{Q} \Big)\, \pi\nu(d\omega) , \tag{2.7}$$

where we use the essentially unique decomposition

$$\nu(dx, d\omega) = \nu^\omega(dx)\, \pi\nu(d\omega) , \tag{2.8}$$

and where the entropy $H_\upsilon(\mathcal{Q})$ of a measure $\upsilon$ for a finite partition $\mathcal{Q}$ is defined as usual by $-\sum_{Q \in \mathcal{Q}} \upsilon(Q) \log \upsilon(Q)$. We call the relativised equilibrium states defined by (2.6) quenched (relativised) equilibrium states for $\tau^+$, $g$, and $\theta$, and the supremum $Q^{(q)}(\log g)$ the quenched (relativised) topological pressure of $\tau^+$, $g$, and $\theta$.

We now introduce a new type of invariant equilibrium measure. Recall that the specific entropy per site $h^\theta(\upsilon)$ of a $\sigma^+$-invariant measure $\upsilon$, relative to the a priori measure $\theta$ on $E$, is

$$h^\theta(\upsilon) = -\int \log \beta(\xi\omega)\, \upsilon(d(\xi\omega)) = -\int \log \beta(\xi\omega)\, \beta(\xi\omega)\, \theta(d\xi)\, \upsilon(d\omega) \tag{2.9}$$
if $\upsilon(d(\xi\omega))$ is absolutely continuous with respect to $\theta(d\xi)\upsilon(d\omega)$, with Radon–Nikodym derivative $\beta(\xi\omega)$, and otherwise, $h^\theta(\upsilon) = -\infty$. (See Georgii [1988, pp. 317–318], Pinsker [1964, Sect. 15.2]: the two-sided framework there may be adapted to our one-sided shift $\sigma^+$.) Define now an annealed (relativised) equilibrium state for $\tau^+$, $g$ and the a priori measure $\theta$ to be a $\tau^+$-invariant Borel probability measure $\mu$ on $M \times \Omega^+$ realising the following supremum:

$$Q^{(a)}(\log g) = \sup\Big\{ h_{\tau^+}(\nu | \pi\nu) + h^\theta(\pi\nu) + \int \log g(x, \omega)\, \nu(dx, d\omega) \Big\} , \tag{2.10}$$

the supremum being over all $\tau^+$-invariant Borel probability measures $\nu$ on $M \times \Omega^+$. We call $Q^{(a)}(\log g)$ the annealed topological pressure of $\tau^+$, $g$, and $\theta$. Since $h^\theta(\pi\mu) = h^\theta(\Theta^+) = 0$ if and only if $\mu \in \mathcal{P}_\Theta$ (see e.g. Georgii [1988]), we have $Q^{(a)}(\log g) \ge Q^{(q)}(\log g)$. In some cases equality holds, but not always (see in particular Proposition 2 and Remark 3.4 below).

Quenched and annealed Gibbs states. We introduce first the random transfer operators $\mathcal{L}_\xi : C^r(M, \mathbb{C}) \to C^r(M, \mathbb{C})$, defined for $\xi \in E$ by

$$\mathcal{L}_\xi \varphi(x) = \sum_{f_\xi(y)=x} \varphi(y)\, g_\xi(y) . \tag{2.11}$$

We also write for $n \ge 1$ and $\omega \in \Omega^+$,

$$\mathcal{L}^n_\omega = \mathcal{L}_{\omega_{n-1}} \circ \cdots \circ \mathcal{L}_{\omega_1} \circ \mathcal{L}_{\omega_0} . \tag{2.12}$$

Define now a quenched (relativised) Gibbs state for $\tau^+$, $g$, and $\theta$ (see Khanin–Kifer [1996], and also Bogenschütz–Gundlach [1995] who used a slightly different but equivalent definition) to be a measure $\mu \in \mathcal{P}_\Theta$ such that the probability measures $\mu^\omega$ on $M$ arising in the essentially unique decomposition (2.8) satisfy

$$\mu^\omega \text{ is absolutely continuous with respect to } \nu^\omega , \tag{2.13}$$

where ($\Theta^+$-almost) each measure $\nu^\omega$ is a quenched (relativised) Gibbs measure for $\tau^+$, $g$, and $\theta$, i.e., satisfies

$$\int \varphi(x)\, \nu^\omega(dx) = \int \frac{\mathcal{L}^n_\omega \varphi(x)}{\mathcal{L}^n_\omega 1(x)}\, (f_\omega^{(n)})^* \nu^\omega(dx) , \tag{2.14}$$

for all continuous $\varphi : M \to \mathbb{C}$, and all $n \ge 1$, where $1$ denotes the constant function $= 1$ on $M$. Clearly, the definition of a Gibbs measure is equivalent to requiring that the conditional probabilities $\nu^\omega_n(dy | x)$ of the measure $\nu^\omega$ conditioned by $f_\omega^{(n)}(y) = x$ be the discrete measures defined on the finite set $(f_\omega^{(n)})^{-1}(x)$ by

$$\int \varphi(y)\, \nu^\omega_n(dy | x) = \frac{\mathcal{L}^n_\omega \varphi(x)}{\mathcal{L}^n_\omega 1(x)} . \tag{2.15}$$

Defining the integrated transfer operator $\widehat{\mathcal{L}}$ acting on measurable functions $\varphi : M \times \Omega^+ \to \mathbb{C}$ (we write $\varphi(x, \omega) = \varphi_\omega(x)$) by:

$$\widehat{\mathcal{L}} \varphi(x, \omega) = \int_E (\mathcal{L}_\xi \varphi_{\xi \wedge \omega})(x)\, \theta(d\xi) , \tag{2.16}$$
we define an annealed (relativised) Gibbs measure for $\tau^+$, $g$ and the a priori measure $\theta$ to be a Borel probability measure $\nu$ on $M \times \Omega^+$ such that for all measurable $\varphi : M \times \Omega^+ \to \mathbb{C}$ and all $n \ge 0$,

$$\int \varphi(x, \omega)\, \nu(dx, d\omega) = \int \frac{\widehat{\mathcal{L}}^n \varphi(x, \omega)}{\widehat{\mathcal{L}}^n 1(x, \omega)}\, ((\tau^+)^n)^* \nu(dx, d\omega) . \tag{2.17}$$

Again, there is an interpretation in terms of conditional probabilities: a Borel probability measure $\nu$ on $M \times \Omega^+$ is an annealed Gibbs measure if for any integer $n \ge 1$ the conditional probability $\nu_n((dy, d\eta) | (x, \omega))$ under the condition $(\tau^+)^n(y, \eta) = (x, \omega)$ (in particular $\eta_{n+j} = \omega_j$ for $j \ge 0$) is equal to the Radon measure

$$\int \varphi(y, \eta)\, \nu_n(dy, d\eta | (x, \omega)) = \frac{\widehat{\mathcal{L}}^n \varphi(x, \omega)}{\widehat{\mathcal{L}}^n 1(x, \omega)} . \tag{2.18}$$

Finally we define an annealed (relativised) Gibbs state for $\tau^+$, $g$ and the a priori measure $\theta$ to be a Borel probability measure $\mu$ on $M \times \Omega^+$ which is $\tau^+$-invariant and absolutely continuous with respect to an annealed Gibbs measure on $M \times \Omega^+$ for $\tau^+$, $g$, and $\theta$.

2.3. Results. Let us first recall results due to Kifer, Khanin–Kifer, and Bogenschütz–Gundlach:

Theorem (Unique quenched Gibbs and equilibrium states $\mu^{(q)}$). A $C^r$ positively weighted i.i.d. expanding map $(\tau^+, g, \theta)$ admits a unique quenched (relativised) Gibbs state and a unique quenched (relativised) equilibrium state. These two measures coincide.

For a proof, see Kifer [1992, Theorem A], Khanin–Kifer [1996, Theorem C] and Bogenschütz–Gundlach [1995] (their results are for the two-sided skew product $\tau$ in (1.5), but give readily our claim by integration). Our first main result is:

Theorem 1 (Unique annealed Gibbs and equilibrium states $\mu^{(a)}$). A $C^r$ positively weighted i.i.d. expanding map $(\tau^+, g, \theta)$ admits a unique annealed Gibbs state and a unique annealed equilibrium state. These two measures coincide.

We prove Theorem 1 in Sect. 3.2. In fact, we also construct there (non-necessarily unique) annealed states for nonnegative weights. In the case of SRB measure the quenched and annealed states are the same:

Proposition 2 (SRB). For a $C^r$ weighted i.i.d. expanding map $(\tau^+, g, \theta)$ with $g_\omega(x) = 1/|\mathrm{Jac}\, D_x f_\omega|$, the unique annealed equilibrium state and the unique quenched equilibrium state coincide.

Essentially in the same setting, Kifer [1992] proved that for a $C^r$ weighted i.i.d. expanding map $(\tau^+, g, \theta)$ with $g_\omega(x) = 1/|\mathrm{Jac}\, D_x f_\omega|$, the (relativised) equilibrium state is a direct product $\rho \times \Theta^+$, with $\rho$ a measure equivalent with Riemannian volume on $M$ and invariant for the Markov chain corresponding to $(\tau^+, \theta)$, which is defined by the transition probabilities

$$P(x, A) = \int \chi_A(f_{\omega_0}(x))\, \theta(d\omega_0) \tag{2.19}$$

for $x \in M$ and $A \subset M$ Borel. (The marginal on $M$ of an arbitrary $\tau^+$-invariant measure is not invariant for the Markov chain in general.)
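The transition probabilities (2.19) are easy to evaluate when $E$ is finite. The sketch below is an illustration under assumptions that are not in the text: $M$ is the circle, the family consists of two linear maps $x \mapsto kx \bmod 1$ with hypothetical probabilities 0.4 and 0.6, and $A$ is a half-open interval. For such linear maps Lebesgue measure is invariant under each map, so it should also be invariant for the Markov chain, which the midpoint rule below checks numerically.

```python
def transition_probability(x, A, family):
    """P(x, A) = sum over xi of theta(xi) * indicator_A(f_xi(x)), as in (2.19).

    `A` is a half-open interval (a, b) meaning [a, b); `family` is a list of
    (theta_xi, f_xi) pairs with the theta_xi summing to one.
    """
    a, b = A
    return sum(theta for theta, f in family if a <= f(x) < b)

# Hypothetical finite family of linear expanding circle maps.
family = [(0.4, lambda x: (2 * x) % 1.0), (0.6, lambda x: (3 * x) % 1.0)]

def integrate_P(A, family, n=60000):
    """Midpoint-rule approximation of the integral of P(x, A) dx over [0, 1)."""
    total = sum(transition_probability((i + 0.5) / n, A, family) for i in range(n))
    return total / n
```

For $A = [0.2, 0.5)$ the integral of $P(x, A)$ against Lebesgue measure comes out close to the length of $A$, as invariance requires. For nonlinear expanding maps the invariant density $\rho$ is no longer constant and this check would have to be weighted by $\rho$.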
Proposition 2 is a consequence of Remark 3.4 in Sect. 3. One-dimensional i.i.d. random Ising models give simple examples where the quenched and annealed states differ (see Sect. 2.4). We obtain the following stability result in Sect. 3.3 (the second claim in Theorem 3 was obtained previously by Kifer [1992, Sect. 4; 1990], see also Bogenschütz [1996]):

Theorem 3 (Stochastic stability for annealed and quenched states). Let $\mu_0$ be the equilibrium state for $f_0 \in C^r_\gamma(M, M)$ and $\log g_0$ with $g_0 \in C^r(M, \mathbb{R}^+_*)$. Consider a positively weighted small random perturbation of $f_0$, $g_0$ given by a family $\theta_\epsilon$ ($\epsilon \ge 0$).
(1) The annealed equilibrium states $\mu^{(a)}_\epsilon$ weakly converge to $\mu_0 \times \delta^{\mathbb{Z}_+}_{f_0, g_0}$, where $\delta_{f_0, g_0}$ is the Dirac measure at $(f_0, g_0)$, as $\epsilon \to 0$. The annealed relativised pressure $Q^{(a)}_\epsilon(\log g)$ converges to the topological pressure $P(\log g_0)$ of $f_0$.
(2) The quenched equilibrium states $\mu^{(q)}_\epsilon$ weakly converge to $\mu_0 \times \delta^{\mathbb{Z}_+}_{f_0, g_0}$ as $\epsilon \to 0$. The quenched relativised pressure $Q^{(q)}_\epsilon(\log g)$ converges to the topological pressure $P(\log g_0)$ of $f_0$.

Integrated correlation functions. If $\mu$ is a $\tau^+$-invariant probability measure, we define its integrated random correlation function for $\varphi_1, \varphi_2 \in L^2(\mu)$, and any integer $n \ge 0$ by

$$C_{\varphi_1 \varphi_2}(n) = \int (\varphi_1 \circ (\tau^+)^n)(x, \omega)\, \varphi_2(x, \omega)\, \mu(dx, d\omega) - \int \varphi_1(x, \omega)\, \mu(dx, d\omega) \int \varphi_2(x, \omega)\, \mu(dx, d\omega) . \tag{2.20}$$

For $\varphi_1$, $\varphi_2$ in some function class $\mathcal{F}$, we may ask if $|C_{\varphi_1 \varphi_2}(n)|$ goes to zero exponentially fast, i.e., if there exists $\tau < 1$ so that for any $\varphi_1, \varphi_2 \in \mathcal{F}$ there is $K(\varphi_1, \varphi_2)$ with $|C_{\varphi_1 \varphi_2}(n)| \le K(\varphi_1, \varphi_2) \cdot \tau^n$ for all $n$ (the smallest such $\tau$ is called the (exponential) rate of decay of correlations for $\mu$ and $\mathcal{F}$). More generally, we can ask if the formal Fourier transform (see Pollicott [1985], Ruelle [1987] for corresponding objects in a nonrandom setting)

$$\widehat{C}_{\varphi_1 \varphi_2}(\eta) = \sum_{n=0}^{\infty} C_{\varphi_1 \varphi_2}(n)\, e^{i\eta n} + \sum_{n=1}^{\infty} C_{\varphi_2 \varphi_1}(n)\, e^{-i\eta n} \tag{2.21}$$

admits an analytic extension to a strip and a meromorphic extension to a larger domain of the complex plane. Using the notation $\mathcal{L}$ to represent the restriction to $C^r(M, \mathbb{C})$ of the operator $\widehat{\mathcal{L}}$ defined in (2.16), and referring to (4.12) Sect. 4 for the definition of $\widehat{\mathcal{M}}$ (and to Sect. 3.1 for the definition of the Banach space $\mathcal{B}(\alpha)$) we have (see Sects. 3.3, 3.5 and 4 for proofs):

Theorem 4 (Annealed and quenched integrated correlation spectrum). Set $\mathcal{F} = C^r(M, \mathbb{C})$.
(1) The Fourier transform $\widehat{C}_{\varphi_1 \varphi_2}(\eta)$ of the integrated correlation function of the annealed equilibrium state $\mu^{(a)}$ of a $C^r$ positively weighted i.i.d. expanding map $(\tau^+, g, \theta)$ for test functions in $\mathcal{F}$ is analytic in a strip $|\Im(\eta)| \le \delta^{(a)}$ for some $\delta^{(a)} > 0$ and admits a meromorphic extension to the strip $|\Im(\eta)| \le \log \gamma^r$, where its poles appear at points $\eta$, where $\lambda = \exp(-i\eta + Q^{(a)}(\log g))$ is an eigenvalue of $\mathcal{L}$ acting on $\mathcal{F}$ with $\exp(Q^{(a)}(\log g))/\gamma^r < |\lambda| < \exp(Q^{(a)}(\log g))$.
(2) Let $\alpha > 1/\gamma$. The Fourier transform of the integrated correlation function of the quenched equilibrium state $\mu^{(q)}$ of a $C^r$ positively weighted i.i.d. expanding map $(\tau^+, g, \theta)$ for test functions in $\mathcal{F}$ or $\mathcal{B}(\alpha)$ is analytic in a strip $|\Im(\eta)| \le \delta^{(q)}$ for some $\delta^{(q)} > 0$ and admits a meromorphic extension to the strip $|\Im(\eta)| \le \log 1/\alpha$, where its poles appear at points $\eta$, where $\lambda = \exp(-i\eta)$ is an eigenvalue of $\widehat{\mathcal{M}}$ acting on $\mathcal{B}(\alpha)$ satisfying $\alpha < |\lambda| < 1$.
In particular, the rate of decay of $\mu^{(a)}$ for $\mathcal{F} = C^r(M, \mathbb{C})$ coincides with the ratio of the moduli of the “first two eigenvalues” of $\mathcal{L}$ acting on $\mathcal{F}$ and similarly for $\mu^{(q)}$ and $\widehat{\mathcal{M}}$.

Motivated by Theorem 4, and since we will show in Proposition 3.1 (respectively Proposition 4.2) that the essential spectral radius of $\mathcal{L}$ on $\mathcal{F}$ (respectively $\widehat{\mathcal{M}}$ on $\mathcal{B}(\alpha)$ with $\alpha > 1/\gamma$) is not bigger than $\exp(Q^{(a)}(\log g))/\gamma^r$ (respectively $\alpha$) we call the (discrete) spectrum of $\mathcal{L}$ in the annulus $\exp(Q^{(a)}(\log g))/\gamma^r < |\lambda| < \exp(Q^{(a)}(\log g))$ (respectively of $\widehat{\mathcal{M}}$ in $\alpha < |\lambda| < 1$) the annealed (respectively quenched) integrated correlation spectrum of the measure $\mu^{(a)}$ and the function class $\mathcal{F}$ (respectively $\mu^{(q)}$ and $\mathcal{B}(\alpha)$). Regarding small random perturbations, we shall show in Sects. 3.3, 3.5, and 4:

Theorem 5 (Stability of the annealed/quenched integrated correlation spectrum). Let $\mu_0$ be the equilibrium state for $f_0 \in C^r_\gamma(M, M)$ and $\log g_0$, with $g_0 \in C^r(M, \mathbb{R}^+_*)$, let $P(\log g_0)$ be the corresponding pressure, and let $\tau_0 < 1$ be the rate of decay of correlations for $\mu_0$ and $\mathcal{F} = C^r(M, \mathbb{C})$. Consider a positively weighted small random perturbation of $f_0$, $g_0$ given by a family $\theta_\epsilon$ ($\epsilon \ge 0$).
(1) The rate of decay $\tau^{(a)}_\epsilon$ of correlations for the annealed equilibrium state $\mu^{(a)}_\epsilon$ of $(\tau^+, g, \theta_\epsilon)$, and test functions in $\mathcal{F}$, satisfies $\limsup_{\epsilon \to 0} \tau^{(a)}_\epsilon \le \tau_0$. In fact, outside of any disc of radius $e^{P(\log g_0)}/\gamma^r + \delta$ ($\delta > 0$), the integrated correlation spectrum of $\mu^{(a)}_\epsilon$ for $\mathcal{F}$ converges to the correlation spectrum of $\mu_0$ for $\mathcal{F}$.
(2) The rate of decay $\tau^{(q)}_\epsilon$ of correlations for the quenched equilibrium state $\mu^{(q)}_\epsilon$ of $(\tau^+, g, \theta_\epsilon)$, and test functions in $\mathcal{B}(\alpha)$ ($\alpha > 1/\gamma$), satisfies $\limsup_{\epsilon \to 0} \tau^{(q)}_\epsilon \le \tau_0$. In fact, outside of any disc of radius $e^{P(\log g_0)}/\gamma + \delta$ ($\delta > 0$) the integrated correlation spectrum of $\mu^{(q)}_\epsilon$ for $\mathcal{B}(\alpha)$ converges to the correlation spectrum of $\mu_0$ for $\mathcal{F}$.

Annealed zeta functions and annealed generalised Fredholm determinants. First consider the deterministic system $f$, $g$ ($g$ not necessarily real or positive), and define the formal zeta function (see e.g. Ruelle [1989] and references therein) by

$$\zeta(z) = \exp \sum_{m \ge 1} \frac{z^m}{m}\, \zeta_m , \quad \text{where } \zeta_m = \sum_{f^m(x)=x} \; \prod_{j=0}^{m-1} g(f^j(x)) . \tag{2.22}$$
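The coefficients $\zeta_m$ in (2.22) can be computed directly when the periodic points are known. The following sketch assumes, for illustration only, that $f(x) = kx \bmod 1$ on the circle, whose $m$-periodic points are the $k^m - 1$ points $j/(k^m - 1)$; orbits are tracked with integer numerators to avoid floating-point drift. The function names are hypothetical.

```python
import math

def zeta_coefficient(m, g, k=2):
    """zeta_m = sum over f^m(x) = x of prod_{j<m} g(f^j(x)), for f(x) = k*x mod 1."""
    N = k**m - 1                      # fixed points of f^m are j / N, j = 0..N-1
    total = 0.0
    for j in range(N):
        a, weight = j, 1.0
        for _ in range(m):
            weight *= g(a / N)        # weight at the current orbit point
            a = (k * a) % N           # exact image of a/N under f
        total += weight
    return total

def log_zeta(z, g, order, k=2):
    """Truncation of log zeta(z) = sum_{m>=1} z^m/m * zeta_m at the given order."""
    return sum(z**m / m * zeta_coefficient(m, g, k) for m in range(1, order + 1))
```

For the doubling map with the constant weight $g = 1/2$ one has $\zeta_m = (2^m - 1)/2^m$, so that $\zeta(z) = (1 - z/2)/(1 - z)$: a pole at $z = 1$, the inverse of the leading eigenvalue of the transfer operator, and a zero at $z = 2$.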
A second formal series, the generalised Fredholm determinant (Ruelle [1990]), may be associated with the deterministic system by setting

$$d(z) = \exp - \sum_{m \ge 1} \frac{z^m}{m}\, d_m , \quad \text{where } d_m = \sum_{f^m(x)=x} \frac{\prod_{j=0}^{m-1} g(f^j(x))}{\det(1 - D_x f^{-m})} , \tag{2.23}$$

where $D_x f^{-m}$ denotes the derivative of the local inverse branch of $f^m$ associated to the $m$-periodic orbit of $x$. For a weighted i.i.d. map $(\tau^+, g, \theta)$, we define the formal annealed zeta function by
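$$\zeta^{(a)}(z) = \exp \sum_{m \ge 1} \frac{z^m}{m}\, \zeta^{(a)}_m , \quad \text{where } \zeta^{(a)}_m = \int_{E^m} \sum_{f^{(m)}_{\vec\xi}(x)=x} \; \prod_{j=0}^{m-1} g_{\xi_j}(f^{(j)}_{\vec\xi}(x))\, \theta(d\xi_0) \ldots \theta(d\xi_{m-1}) . \tag{2.24}$$

Similarly, we get an annealed Fredholm determinant by setting

$$d^{(a)}(z) = \exp - \sum_{m \ge 1} \frac{z^m}{m}\, d^{(a)}_m , \quad \text{where } d^{(a)}_m = \int_{E^m} \sum_{f^{(m)}_{\vec\xi}(x)=x} \frac{\prod_{j=0}^{m-1} g_{\xi_j}(f^{(j)}_{\vec\xi}(x))}{\det(1 - D_x f^{(-m)}_{\vec\xi})}\, \theta(d\xi_0) \ldots \theta(d\xi_{m-1}) . \tag{2.25}$$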
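The annealed coefficients in (2.24) and (2.25) can be evaluated by brute force for a finite family. The sketch below is an illustration under assumptions not made in the text: $E$ is finite, each map is a linear circle map $f_\xi(x) = k_\xi x \bmod 1$, so that the composition along $\vec\xi$ has degree $K = \prod_j k_{\xi_j}$, fixed points $j/(K-1)$, and $\det(1 - D_x f^{(-m)}_{\vec\xi}) = 1 - 1/K$. The tuple layout `(theta, k, g)` is hypothetical.

```python
import itertools
import math

def annealed_coefficients(m, family):
    """Return (zeta_m, d_m) of (2.24)-(2.25) for linear circle maps.

    `family` is a list of (theta_xi, k_xi, g_xi), with theta_xi the probability
    of xi, k_xi >= 2 the degree of f_xi(x) = k_xi*x mod 1, and g_xi a weight
    function on [0, 1).
    """
    zeta_m = 0.0
    d_m = 0.0
    for xis in itertools.product(family, repeat=m):
        prob = math.prod(theta for theta, _, _ in xis)
        K = math.prod(k for _, k, _ in xis)    # degree of the composed map
        N = K - 1                              # its fixed points are j / N
        orbit_sum = 0.0
        for j in range(N):
            a, weight = j, 1.0
            for _, k, g in xis:
                weight *= g(a / N)             # g_{xi_j} at the j-th orbit point
                a = (k * a) % N                # exact image under f_{xi_j}
            orbit_sum += weight
        zeta_m += prob * orbit_sum
        d_m += prob * orbit_sum / (1.0 - 1.0 / K)
    return zeta_m, d_m
```

With constant weights $g_\xi$, counting fixed points gives $\zeta^{(a)}_m = a^m - b^m$ and $d^{(a)}_m = a^m$, where $a = \sum_\xi \theta(\xi) k_\xi g_\xi$ and $b = \sum_\xi \theta(\xi) g_\xi$. In that special case $\zeta^{(a)}(z) = (1 - bz)/(1 - az)$ and $d^{(a)}(z) = 1 - az$, so the pole and the zero both sit at the inverse of the eigenvalue $a$ of the averaged operator acting on constants.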
The following stability result will be a consequence of the proofs in Ruelle [1990] and the spectral stability obtained in Proposition 3.5 below (see Sect. 3.4):

Theorem 6 (Annealed zeta functions and Fredholm determinants). Consider a $C^r$ weighted i.i.d. expanding map $(\tau^+, g, \theta)$, write $R^{(a)} = \exp Q^{(a)}(\log |g|)$, and let $\mathcal{F} = C^r(M, \mathbb{C})$.
(1) The annealed zeta function $\zeta^{(a)}(z)$ is analytic in the disc of radius $1/R^{(a)}$ and admits a meromorphic, zero-free, extension to the open disc of radius $\gamma/R^{(a)}$, where its poles are exactly the inverses of the eigenvalues of $\mathcal{L}$ acting on $\mathcal{F}$ of modulus $> R^{(a)}/\gamma$ (including multiplicities).
(2) The annealed Fredholm determinant $d^{(a)}(z)$ admits an analytic extension to the disc of radius $\gamma^r/R^{(a)}$, where its zeroes are exactly the inverses of the eigenvalues of $\mathcal{L}$ acting on $\mathcal{F}$ of modulus $> R^{(a)}/\gamma^r$ (including multiplicities).
(3) In the case of a weighted small random perturbation $(\tau^+, g, \theta_\epsilon)$ of $(f_0, g_0)$, writing $R_0 = \exp P(\log |g_0|)$, the functions $(\zeta^{(a)}_\epsilon(z))^{-1}$, respectively $d^{(a)}_\epsilon(z)$, converge to $(\zeta(z))^{-1}$, respectively $d(z)$, as $\epsilon \to 0$ in any compact subset of the disc of radius $\gamma/R_0$, respectively $\gamma^r/R_0$, in the sense of analytic functions.

2.4. Two examples.

A conjecture of Ruelle. Our first example is taken from Ruelle [1995, Sect. 7.4]. Assume that $E$ is countable or finite (with $\theta(\xi) > 0$ for all $\xi \in E$) and that $g$ is nonnegative. Assume also that the spectral radius $R > 0$ of the operator $\mathcal{L}$ acting on $C^r(M, \mathbb{C})$ is the only eigenvalue of modulus $R$ and is simple (this is true for example if $g$ is positive, see Sect. 3.1). We start by giving a different characterisation of the annealed equilibrium state $\mu$ for $(\tau^+, g)$ and $\theta$, assuming further that $\mu$ has the property that $h_{\sigma^+}(\pi\mu) < \infty$. In this case we have on the one hand the Abramov–Rokhlin formula

$$h_{\tau^+}(\mu | \pi\mu) = h_{\tau^+}(\mu) - h_{\sigma^+}(\pi\mu) , \tag{2.26}$$

and on the other

$$h_{\sigma^+}(\pi\mu) = - \int_{\Omega^+} \log \frac{\pi\mu(\omega)}{\pi\mu(\sigma^+ \omega)}\, \pi\mu(d\omega) , \tag{2.27}$$
where πµ (ω)/πµ (σ + ω) denotes the Radon–Nikodym derivative (recall that E is countable, πµ is σ + -invariant, and use Theorem 4.14 in Walters [1982]). It thus follows from the definition of the specific entropy per site that Z log gω0 (x) µ(dx, dω) hτ + (µ|πµ ) + hθ (πµ ) + + M × Z log(θ(ω0 )gω0 (x)) µ(dx, dω) . (2.28) = hτ + (µ) + + M × In other words, the annealed equilibrium state µ is the (almost ordinary1 ) equilibrium state for τ + and the weight G(x, ω) = θ(ω0 )gω0 (x) on M × + , whenever hσ+ (πµ ) < ∞. (Note that this finiteness property does not always hold for E countable infinite: just consider gξ = 1/|Det Dfξ | so that the annealed equilibrium state satisfies πµ = θZ+ , and take θ(n) of the order of 1/(n(log n)2 ); an example where it does hold would be given by an a priori measure of the order of 1/n2 .) To obtain a τ + -invariant measure, Ruelle starts from ρ, the nonnegative eigenfunction of L associated to its spectral radius R ≥ 0 (Ruelle [1989, 1990]), and constructs a measure ν¯ on M × + by iterating the corresponding eigenfunctional ν of L∗ (noting that ν is a positive measure): X −1 ν(dx, ¯ dω0 , dω1 , . . .) = lim θ(dω0 ) . . . θ(dωm ) (fω0 )−1 i0 (R gω0 m→∞
×(· · ·
X im
i0 −1 (fωm )−1 im (R gωm ν(dx)) · · ·) ,
(2.29)
where we use the notation (fξ )−1 i for the finitely many inverse branches of fξ . He then ¯ dω) and considers the normalisation υ(dx, dω) of the τ + -invariant measure ρ(x)ν(dx, formulates the Conjecture (Ruelle [1995, Sect. 7.4]). The spectral radius R is the exponential of the ¯ ν(ρ) ¯ topological pressure of log G for τ + . The τ + -invariant probability measure υ = ρν/ is an equilibrium state for the dynamical system τ + on (M × + ) and the function log G : M × + → R ∪ −∞. It is not difficult to check that Lb∗ ν¯ = Rν, ¯ so that ν¯ = ν, ˆ the maximal eigenfunctional of Lb∗ (Proposition 3.1). The invariant measure υ is thus the annealed equilibrium state µ for τ + , g and θ by Proposition 3.2 below. We have therefore proved the above conjecture, in our uniformly expanding framework, under the assumptions, nontrivial when E is infinite, that hσ+ (πυ ) < ∞ (equivalently hτ + (υ) < ∞, note that the conjecture needs to be reformulated otherwise) and that E = {(fξ , gξ ) , ξ ∈ Z+ } ⊂ Cγr (M, M ) × C r (M, C) is compact for the induced metric. Ruelle actually works with a countable family of inverse branches instead of expanding maps fω defined on the entire space M . However, the assumptions he makes on the support of the corresponding weights ensure that our arguments in Sects. 2 and 3 carry through. One dimensional exponentially decaying random Ising model. Our second example is a one-dimensional Ising model with i.i.d. random external field and coupling constants (see e.g. Ledrappier [1977] for a description in terms of relative variational principle and references). More precisely, we work on the half-lattice Z+ and consider the full 1
Restricting to τ + invariant measures with finite entropy hτ + (ν) in the variational principle.
682
V. Baladi
P shift f on the metric space M = {±1}Z+ with a metric dM (x, y) = k≥0 |xk − yk |/γ k for γ > 1 (a compact set of continuous spins could also be considered). It is more convenient to work in the setup described in Remark 2.1, considering of course Lipschitz (instead of C r ) functions on M . For the weight, we can fix for example some β ≥ 0, consider a probability law θ = θ1 × θ2 on a compact square E = [−A, A]2 ⊂ R2 (see Ledrappier [1977] on how to remove the compactness assumption in the nearest neighbour interaction case), and set gω0 (x) = exp(−β · (hω0 x0 + Jω0 x0 · x1 )) , x ∈ M ,
(2.30)
with $\omega_0 = (h_{\omega_0}, J_{\omega_0})$ picked in $E$ with law $\theta$. (At the end of this subsection we explain how to generalise to long-range exponentially decaying interactions.) The physical interpretation is that $\log g_\omega(f^k x)$ is the random contribution to the Hamiltonian associated with the $k$th site of the configuration $x$ (i.e., the sum of the interaction between the $k$th site and the $(k + j)$th sites for $j \ge 0$, as well as the term from the external random field acting on $x_k$). Note that since the skew product is in fact a direct product here, the marginal on $M$ of an annealed or quenched state will be a shift-invariant measure on $M$. In other words, if we define by the usual formula the partition function $Z_n(\omega, x) = Z_n(\omega_0, \ldots, \omega_{n-1}, x)$ ($\omega \in \Omega^+$, $x \in M$) of a finite one-sided box $[0, n-1]$ corresponding to the random Hamiltonian with fixed boundary condition $y_{n+j} = x_j$, $j \ge 0$, we find that $Z_n(\omega, x) = \mathcal{L}^{(n)}_\omega 1(x)$. The results of Ledrappier [1977] for finite range interaction and more generally (a slight modification) of Kifer [1992, Theorem 3.2 iii] imply that for $\theta^{\mathbb{Z}_+}$-almost all $\omega$ and all $x \in M$,
\[ \lim_{n \to \infty} \frac{1}{n} \log Z_n(\omega, x) = Q^{(q)}(\log g). \tag{2.31} \]
Therefore it follows from Proposition 3.2 below that for $\theta^{\mathbb{Z}_+}$-almost all $\omega$ and all $x \in M$,
\[ \lim_{n \to \infty} \frac{1}{n} \log Z_n(\omega, x) = Q^{(q)}(\log g) \le Q^{(a)}(\log g) = \lim_{n \to \infty} \frac{1}{n} \log \int_{E^n} Z_n(\omega, x)\, \theta(d\omega_0) \cdots \theta(d\omega_{n-1}), \tag{2.32} \]
with a strict inequality in general. Our definitions of one-sided quenched and annealed Gibbs states are consistent with the standard terminology and we recover in particular by Theorem 4 the folklore theorem of exponential decay of correlations for both states (note that the integrated correlations (2.20) are simply the space-correlation functions of observables in phase space $M$ for the shift-invariant $M$-marginal). Even when the quenched and annealed states are different, it is not obvious that they have different marginals on $M$. Since the physically observable measure is this $M$-marginal, it would be of interest to find conditions, if possible, ensuring that the quenched and annealed marginals are the same. We point out also that considering partition functions $Z_n^{(\mathrm{per})}(\omega)$ with periodic boundary conditions $x_{n+j} = x_j$, $j \ge 0$ yields
\[ Q^{(a)}(\log g) = \lim_{n \to \infty} \frac{1}{n} \log \int_{E^n} Z_n^{(\mathrm{per})}(\omega)\, \theta(d\omega_0) \cdots \theta(d\omega_{n-1}) \tag{2.33} \]
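The inequality $Q^{(q)}(\log g) \le Q^{(a)}(\log g)$ in (2.32) can be checked numerically in the nearest-neighbour case (2.30). The Python sketch below is only an illustration under stated assumptions: since $g_{\omega_0}$ depends on $(x_0, x_1)$ alone, each random transfer operator acts on functions of the first spin as a $2 \times 2$ positive matrix, the annealed pressure is the logarithm of the spectral radius of the $\theta$-averaged matrix, and the quenched pressure is estimated by Monte Carlo along a single sampled $\omega$. The function names and the finitely supported law (a list of `(probability, h, J)` triples) are hypothetical and do not come from the text.

```python
import math
import random

SPINS = (1, -1)

def transfer_matrix(beta, h, J):
    """Matrix T[i][j] = exp(-beta*(h*s + J*s*x0)) with x0 = SPINS[i], s = SPINS[j].

    Row index: first spin x0 of x; column index: the spin s prepended by an
    inverse branch of the shift, so that the weight (2.30) is evaluated at (s, x0, ...).
    """
    return [[math.exp(-beta * (h * s + J * s * x0)) for s in SPINS] for x0 in SPINS]

def annealed_pressure(beta, law):
    """log of the spectral radius of the averaged matrix; law = [(prob, h, J), ...]."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    for p, h, J in law:
        T = transfer_matrix(beta, h, J)
        for i in range(2):
            for j in range(2):
                A[i][j] += p * T[i][j]
    tr = A[0][0] + A[1][1]
    # Discriminant written as (a - d)^2 + 4bc, which is positive for positive entries.
    disc = (A[0][0] - A[1][1]) ** 2 + 4.0 * A[0][1] * A[1][0]
    return math.log((tr + math.sqrt(disc)) / 2.0)

def quenched_pressure(beta, law, n, rng):
    """Monte Carlo estimate of (1/n) log Z_n(omega, x) for one sampled omega.

    The vector v holds the values of L^(k)_omega 1 as a function of the first
    spin; it is renormalised at every step to avoid overflow. The result is the
    log of the largest component of Z_n divided by n, which differs from
    (1/n) log Z_n(omega, x) by O(1/n) for every x.
    """
    weights = [p for p, _, _ in law]
    v = [1.0, 1.0]
    log_z = 0.0
    for _ in range(n):
        _, h, J = rng.choices(law, weights=weights)[0]
        T = transfer_matrix(beta, h, J)
        v = [T[0][0] * v[0] + T[0][1] * v[1], T[1][0] * v[0] + T[1][1] * v[1]]
        m = max(v)
        log_z += math.log(m)
        v = [c / m for c in v]
    return log_z / n

if __name__ == "__main__":
    law = [(0.5, 0.0, 0.5), (0.5, 2.0, 0.5)]  # illustrative two-point law
    print("annealed:", annealed_pressure(1.0, law))
    print("quenched (estimate):", quenched_pressure(1.0, law, 5000, random.Random(0)))
```

When the law is a single point mass the two quantities agree up to the $O(1/n)$ error of the estimate, which is a convenient sanity check on the sketch.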
Equilibrium States for Random Expanding Maps
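(Formula (2.33) holds because $\zeta^{(a)}_n = \int_{E^n} Z_n^{(\mathrm{per})}(\omega)\, \theta(d\omega_0) \cdots \theta(d\omega_{n-1})$ and by Theorem 6 (1) on the annealed zeta function.) We may also consider exponentially decaying long-range interactions such as
\[ g_{\omega_0}(x) = \exp\Bigl(-\beta \cdot \Bigl(h_{\omega_0^0} x_0 + \sum_{j=1}^{\infty} J_{\omega_0^j} \frac{x_0 \cdot x_j}{\gamma^j}\Bigr)\Bigr), \quad x \in M, \tag{2.34} \]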
with $\omega_0 = (h_{\omega_0^0}, J_{\omega_0^j},\ j \ge 1)$ chosen in $E = [-A, A] \times [-A, A]^{\mathbb{Z}_+}$ with law $\theta_1 \times \theta_2^{\mathbb{Z}_+}$, where $\theta_1$ and $\theta_2$ are two probabilities on $[-A, A]$. Since we are in a purely Lipschitz context, there is no need to modify the results of Ruelle [1990] (see the beginning of the proof of Proposition 3.1 (2)) and it suffices to check that $g$ is a Lipschitz function on $M \times \Omega^+$ if we endow $E$ with a metric $d_\alpha$ for $\alpha > 1/\gamma$ in order to apply our results.

3. The Annealed Transfer Operators $\hat{\mathcal{L}}$

3.1. The integrated annealed transfer operators $\hat{\mathcal{L}}$. Let $\mathcal{B} = \mathcal{B}(\alpha)$ denote the Banach space of Lipschitz functions $\varphi : \Omega^+ \to C^r(M, \mathbb{C})$ (for the metric $d_\alpha$ on $\Omega^+$ and $d_r$ on $C^r(M, \mathbb{C})$) endowed with the norm $\|\varphi\|_\alpha = \sup_\omega \|\varphi\|_r + \mathrm{Lip}_\omega \varphi$, where $\mathrm{Lip}_\omega \varphi$ denotes the smallest Lipschitz constant. We may view an element of $\mathcal{B}$ as a function on $M \times \Omega^+$ by setting $\varphi(x, \omega) = \varphi(\omega)(x)$ and it is easy to see that the operator $\hat{\mathcal{L}}$ defined by (2.16) preserves the Banach space $\mathcal{B}$. We consider the operator $\mathcal{L} = \mathcal{L}_g$ defined by restricting $\hat{\mathcal{L}}$ (see (2.16)) to measurable functions on $M$:
\[ \mathcal{L}\varphi(x) = \int (\mathcal{L}_\xi \varphi)(x)\, \theta(d\xi). \tag{3.1} \]
The transfer operator $|\mathcal{L}|$ obtained by replacing $g_\xi$ by $|g_\xi|$ in (3.1) is bounded when acting on the Banach space of bounded functions on $M$ endowed with the supremum norm $\|\varphi\|_\infty$. Denote by $R = R(|g|)$ its spectral radius, which satisfies by definition
\[ R = \lim_{m \to \infty} \bigl(\| |\mathcal{L}|^m 1 \|_\infty\bigr)^{1/m}. \tag{3.2} \]
The basic properties of $\mathcal{L}$ and $\hat{\mathcal{L}}$ that we shall use are:

Proposition 3.1 (Quasicompacity). Set $\mathcal{F} = C^r(M, \mathbb{C})$.
(1) The spectral radius of $\mathcal{L}$ acting on $\mathcal{F}$ is bounded above by $R$, its essential spectral radius is bounded above by $R/\gamma^r$. If $g$ is nonnegative and $R > 0$, then $R$ is an eigenvalue of $\mathcal{L}$ with a nonnegative eigenfunction $\rho \in \mathcal{F}$. If $g$ is positive then $\rho$ is positive and $R$ is a simple eigenvalue; moreover it is the only eigenvalue of modulus $R$, and the corresponding eigenfunctional for $\mathcal{L}^*$ is a positive measure $\nu$ such that
\[ \lim_{n \to \infty} \sup_{x \in M} \Bigl| \frac{\mathcal{L}^n \varphi(x)}{R^n} - \rho(x) \cdot \nu(\varphi) \Bigr| = 0 \tag{3.3} \]
for all $\varphi \in L^1(\nu)$.
(2) The essential spectral radius of the operator $\hat{\mathcal{L}}$ acting on $\mathcal{B}(\alpha)$ is not larger than $R \cdot \max(\alpha, \gamma^{-r})$. The spectra of $\mathcal{L}$ acting on $\mathcal{F}$ and $\hat{\mathcal{L}}$ acting on $\mathcal{B}(\alpha)$ coincide, including multiplicities, in the domain $\{|z| > R \cdot \max(\alpha, \gamma^{-r})\}$. If $g$ is positive, then $R$ is a simple eigenvalue of $\hat{\mathcal{L}}$ with eigenfunction $\hat\rho$ equal to the eigenfunction $\rho$ of $\mathcal{L}$, and the corresponding positive eigenfunctional $\hat\nu$ is a positive measure with marginal $\nu$ on $M$. Also, when $g$ is positive
\[ \lim_{n \to \infty} \sup_{(x, \omega) \in M \times \Omega^+} \Bigl| \frac{\hat{\mathcal{L}}^n \varphi(x, \omega)}{R^n} - \rho(x) \cdot \hat\nu(\varphi) \Bigr| = 0 \tag{3.4} \]
for all $\varphi \in L^1(\hat\nu)$.

Proof of Proposition 3.1. (1) The bounds on the spectral and essential spectral radius are proved in Ruelle [1990, Theorem 1.1, Theorem 1.3] (condition (ii) of Ruelle is satisfied up to using a partition of unity). If $g$ is nonnegative, the spectral radius of $\mathcal{L}$ acting on $\mathcal{F}$ is equal to $R$ by (3.2). To prove that there is a corresponding nonnegative eigenfunction, just use the algebra in Ruelle [1989, (4.10–4.12)] (the stronger assumptions of that paper were not used in this particular argument, see also Baladi–Kitaev–Ruelle–Semmes [1995, Proof of Theorem 2.5] for more details). If $g$ is positive then, since each $f_\omega$ is transitive, we may use easy modifications of standard arguments (see e.g. Parry–Pollicott [1990, pp. 23–24]) to show that each nonnegative eigenfunction is positive. To show that there is a nonnegative eigenfunctional which is a positive measure, one may consider as usual the weight defined by $\bar g(y, \omega) = g(y, \omega)\rho(y)/(R\rho(f_\omega(y)))$. The constant function 1 is then fixed by the integrated operator $\mathcal{L}_{\bar g}$, so that the dual of this operator preserves the compact convex space of Borel probability measures. By Schauder–Tychonoff, $\mathcal{L}^*_{\bar g}$ therefore has a fixed point $\bar\nu$, and then the normalisation of the measure defined by $\nu = \bar\nu/\rho$ is the desired maximal eigenmeasure for $\mathcal{L}$. Since the iterates of $\mathcal{L}_{\bar g}$ satisfy a classical Yorke-type inequality (see e.g. Lemma 4.2 in Baladi et al. [1996]) and each $f_\omega$ is topologically mixing, the standard convexity argument (see Parry–Pollicott [1990, pp. 25–26]) may be applied to $\mathcal{L}_{\bar g}$, yielding $\mathcal{L}^n_{\bar g}\varphi \to \nu(\varphi)$ for continuous $\varphi$; the result for $\nu$-integrable $\varphi$ follows from Lusin's theorem. Standard arguments (see Parry–Pollicott [1990, pp. 25–26]) then show that $\mathcal{L}_{\bar g}$ restricted to $\{\varphi \in C^r(M) \mid \nu(\varphi) = 0\}$ has spectral radius $< 1$, so that the spectrum of $\mathcal{L}_{\bar g}$ is formed of the simple eigenvalue 1 and a subset of a disc with radius strictly less than 1.
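A finite-state analogue may help to fix ideas about the convergence (3.3) and about the normalised weight $\bar g = g\rho/(R\,\rho \circ f)$ used in part (1) of the proof. In the Python sketch below, which is an illustration under simplifying assumptions and not the setting of the proposition, the transfer operator is replaced by a positive matrix `A` acting by $(\mathcal{L}\varphi)(x) = \sum_y A[x][y]\varphi(y)$; `rho` and `nu` are its right and left Perron eigenvectors normalised so that $\nu(\rho) = 1$. All function names are hypothetical.

```python
def matvec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def perron(A, iters=500):
    """Power iteration: returns (R, v) with A v ~ R v, v > 0 and sum(v) == 1."""
    v = [1.0] * len(A)
    for _ in range(iters):
        w = matvec(A, v)
        s = sum(w)
        v = [x / s for x in w]
    # Since sum(v) == 1 and A v ~ R v, the sum of A v approximates R.
    return sum(matvec(A, v)), v

def perron_data(A):
    """Spectral radius R, eigenfunction rho, eigenmeasure nu with nu(rho) = 1."""
    R, rho = perron(A)
    _, nu = perron(transpose(A))
    scale = sum(n * r for n, r in zip(nu, rho))
    return R, rho, [n / scale for n in nu]

def normalised_iterate(A, phi, n):
    """Compute A^n phi / R^n, the finite-state counterpart of L^n phi / R^n."""
    R, _, _ = perron_data(A)
    w = list(phi)
    for _ in range(n):
        w = [x / R for x in matvec(A, w)]
    return w

def normalised_kernel(A):
    """Analogue of the weight g*rho/(R*rho∘f): the resulting matrix fixes the constant 1."""
    R, rho, _ = perron_data(A)
    size = len(A)
    return [[A[x][y] * rho[y] / (R * rho[x]) for y in range(size)] for x in range(size)]

if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    phi = [1.0, -0.5]
    R, rho, nu = perron_data(A)
    nu_phi = sum(n * p for n, p in zip(nu, phi))
    print("A^n phi / R^n:", normalised_iterate(A, phi, 80))
    print("rho * nu(phi):", [r * nu_phi for r in rho])
    print("row sums of normalised kernel:", [sum(row) for row in normalised_kernel(A)])
```

In this toy setting the normalised kernel is a stochastic matrix, which mirrors the fact that the constant function 1 is fixed by $\mathcal{L}_{\bar g}$.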
(2) Theorem 1.1 in Ruelle [1990] yields the upper bound $R \cdot \max(\alpha, \gamma^{-1})$ for the essential spectral radius of $\hat{\mathcal{L}}$ acting on the Banach space of Lipschitz functions on $M \times \Omega^+$ with the metric $d + d_\alpha$. To get the better bound claimed for the space $\mathcal{B}(\alpha)$, we could adapt Ruelle's original computation, but have chosen to follow Fried's [1995] subsequent presentation. Our setting is much simpler than the one considered by Fried; in particular, we are only considering a graph $(V, A)$ with vertex set $V$ reduced to a point $M \times \Omega^+$ so that all arrows $A$ have initial and final vertex equal to $M \times \Omega^+$. The arrows are simply an index set for the contractions which are the inverse branches $\psi_{\xi,i}(x, \omega) = ((f_\xi)^{-1}_i x, \xi\omega)$ (for $\xi \in E$ and $i$ in a finite set depending on $\xi$) as in (2.4), that we view as being defined on a closed subset $\hat M_{\xi,i} \times \Omega^+$ of $M \times \Omega^+$. We say that an $n$-tuple of local inverse branches of $\tau^+$ is admissible if the corresponding composition $\psi^n_{\vec\xi,\vec\imath} = \psi_{\xi_n,i_n} \circ \cdots \circ \psi_{\xi_1,i_1}$ has a non-empty domain of definition $D_{\vec\xi,\vec\imath} \times \Omega^+$ in
$M \times \Omega^+$. We need to refine Lemma 1 from Fried [1995], adapting it to our skew-product situation: We claim that there is a constant $C > 0$ so that for all $n \ge 0$, each admissible composition $\psi^n = \psi^n_{\vec\xi,\vec\imath}$ of $n$ local inverse branches of $\tau^+$, and any $(\tilde x, \tilde\omega)$ in the image $\psi^n(D_{\vec\xi,\vec\imath} \times \Omega^+)$, if $T = T_{\psi^n}(\tilde x, \tilde\omega)$ denotes the finite rank operator on $\mathcal{B}$ given by $T\varphi$ = the Taylor expansion of order $r$ about $\tilde x$ of $\varphi_{\tilde\omega}$, we have:
\[ \|(\psi^n)^*(I - T)\varphi\|_{\mathcal{B}} \le C \max(\alpha, \gamma^{-r})^n \|\varphi\|_{\mathcal{B}} \tag{3.5} \]
(see Appendix A for a proof, where we explain the slight differences with Fried's assumptions and bounds). Using (3.5) in place of Lemma 1 in Fried [1995], the proof of Proposition 1 from Fried [1995] combined with the Leibniz-telescoping argument in the proof of Ruelle [1990, Proposition 2.5] (useful to replace the growth rate appearing in Fried [1995] by the better bound $R$) then yields our claim. (We may ensure Fried's [1995, p. 1064] gap condition by using a suitable partition of unity.) Applying Theorem 1.1 in Ruelle [1990], we see that the eigenvalues of both $\mathcal{L}$ and $\hat{\mathcal{L}}$ acting on Lipschitz functions (on $M$, respectively $M \times \Omega^+$) in the domain $|z| > R \max(\alpha, \gamma^{-1})$ are exactly (including multiplicities) the inverses of the poles of the zeta function $\zeta^{(a)}(z)$ (2.24). Since any eigenfunction for $\mathcal{L}$ is clearly an eigenfunction for $\hat{\mathcal{L}}$, the statements on $\hat\rho$ and $\hat\nu$ in the case of a positive weight $g$ follow from the simplicity of the eigenvalue $R$ for $\mathcal{L}$. Using (3.5) to generalise the main theorem in Fried [1995, Sect. 3, p. 1067], we obtain a bijection between the spectra of $\mathcal{L}$, $\hat{\mathcal{L}}$ and the zeroes of the determinant $d^{(a)}(z)$ (2.25) in the bigger domain $|z| > R \max(\alpha, \gamma^{-r})$.
(To find the formula for the trace of each finite rank operator $K = \mathcal{L}_{\psi^n} T$ associated to an admissible composition $\psi^n$ and the corresponding operator $T = T_{\psi^n}(\tilde x, \tilde\omega)$, where we choose $(\tilde x, \tilde\omega)$ to be a fixed point of $\psi^n$ if possible, and otherwise a point in $\psi^n(D_{\vec\xi,\vec\imath} \times \Omega^+) \setminus (D_{\vec\xi,\vec\imath} \times \Omega^+)$, with $K$ acting on $\mathcal{B}$, we may compute instead the trace of $K$ as an operator on functions $\mathcal{B}_{\tilde\omega}$ depending only on the $x \in M$ variable, setting the random argument to be equal to $\tilde\omega$ in the notation above. This is possible because the corresponding projection $\Pi = \Pi_{\tilde\omega} : \mathcal{B} \to \mathcal{B}_{\tilde\omega}$ satisfies $K = \Pi K \Pi$ so that $\mathrm{Tr}\, K = \mathrm{Tr}\, \Pi K \Pi = \mathrm{Tr}\, K|_{\mathcal{B}(\tilde\omega)}$ and this last trace is computed as in page 1067 of Fried [1995].)

3.2. Annealed equilibrium and Gibbs states. Theorem 1 will be an immediate consequence of Proposition 3.2 and Proposition 3.3:

Proposition 3.2. Assume that $g$ is positive, and let $\rho$ be the maximal eigenfunction and $\hat\nu$ the maximal eigenmeasure of $\hat{\mathcal{L}}$ from Proposition 3.1 (2). Then the probability measure $\mu = \rho\hat\nu/\hat\nu(\rho)$ is the unique annealed equilibrium state for $(\tau^+, g, \theta)$. The maximal eigenvalue $R$ of $\hat{\mathcal{L}}$ is equal to $e^{Q^{(a)}(\log g)}$.

Proposition 3.3. Assume that $g$ is positive. The probability measure $\mu$ in Proposition 3.2 is the unique annealed Gibbs state for $(\tau^+, g, \theta)$.

Proof of Proposition 3.2. To check that the measure $\mu$ is $\tau^+$-invariant, consider the following chain of equalities, which holds for any $\varphi \in L^1(\mu)$ (assume that $\hat\nu(\rho) = 1$):
\[
\begin{aligned}
\int \varphi(x, \omega)\, \mu(dx, d\omega) &= \int \varphi_\omega(x) \rho(x)\, \hat\nu(dx, d\omega) \\
&= \frac{1}{R} \iint \varphi_\omega(x)\, \mathcal{L}_\xi(\rho)(x)\, \theta(d\xi)\, \hat\nu(dx, d\omega) \\
&= \frac{1}{R} \iint \mathcal{L}_\xi((\varphi_\omega \circ f_\xi) \cdot \rho)(x)\, \theta(d\xi)\, \hat\nu(dx, d\omega) \\
&= \int \varphi_{\sigma^+\omega}(f_\omega x) \rho(x)\, \hat\nu(dx, d\omega) \\
&= \int \varphi \circ \tau^+(x, \omega)\, \mu(dx, d\omega).
\end{aligned}
\tag{3.6}
\]
The basic strategy now is to go to the two-sided situation in order to apply the arguments in Kifer [1992] and Khanin–Kifer [1996]. We will use two other random transfer operators to construct an invariant measure $\upsilon$ for the two-sided skew product $\tau$ with the same relative entropy as $\mu$. Consider first the random operator $\mathcal{L}'_\xi$, defined by formula (2.11) for the weight $g'_\xi(x) = g_\xi(x)\rho(x)/(R \cdot \rho(f_\xi x))$ (note that $g' \in \mathcal{B}$). The operator $\mathcal{L}'_\xi$ has by definition the property that for all $\varphi \in L^1(\mu)$:
\[ \iint (\mathcal{L}'_\xi \varphi(\xi\omega))(x)\, \theta(d\xi)\, \mu(dx, d\omega) = \int \varphi(x, \omega)\, \mu(dx, d\omega). \tag{3.7} \]
It follows from the definitions that the measure $\pi_\mu(d\xi\omega)$ is equivalent with the product measure $\theta(d\xi)\pi_\mu(d\omega)$ with a density denoted by $\beta(\xi\omega) \in L^1(\pi_\mu)$. In fact, from (3.7) for functions $\varphi(x, \xi\omega)$ independent of $x$, we obtain by Fubini the explicit formula for $\pi_\mu$ almost all $\xi\omega \in \Omega^+$,
\[ \beta(\xi\omega) = \int_M \frac{1}{R\rho(x)} \sum_{f_\xi(y) = x} g_\xi(y)\rho(y)\, \mu^\omega(dx), \tag{3.8} \]
(where we use the decomposition (2.8) for $\mu$). It is clear from (3.8) that $\beta(\xi\omega)$ is $\Theta^+$ almost everywhere uniformly bounded and bounded away from zero (combining the (uniform) smoothness and positivity of $\rho$ and $g_\xi$, together with the fact that the number of inverse branches of the $f_\xi$ is uniformly bounded). We then define the second modified random operator associated to $g''_{\xi\omega} = g'_\xi/\beta(\xi\omega)$ by
\[ \mathcal{L}''_{\xi\omega}\varphi(x) = \frac{\mathcal{L}'_\xi \varphi(x)}{\beta(\xi\omega)}, \tag{3.9} \]
whose dual has the key property that for $\pi_\mu$-almost all $\xi\omega$:
\[ (\mathcal{L}''_{\xi\omega})^* \mu^\omega = \mu^{\xi\omega}. \tag{3.10} \]
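To prove (3.10), consider an arbitrary $\varphi \in L^1(\mu)$ and write (using (3.7))
\[
\begin{aligned}
\iint (\mathcal{L}''_{\xi\omega}\varphi(\xi\omega))(x)\, \mu^\omega(dx)\, \pi_\mu(d\xi\omega) &= \iint \beta(\xi\omega)^{-1} (\mathcal{L}'_\xi \varphi(\xi\omega))(x)\, \mu^\omega(dx)\, \pi_\mu(d\xi\omega) \\
&= \iiint (\mathcal{L}'_\xi \varphi(\xi\omega))(x)\, \theta(d\xi)\, \mu^\omega(dx)\, \pi_\mu(d\omega) \\
&= \iint \varphi(x, \omega)\, \mu^\omega(dx)\, \pi_\mu(d\omega) \\
&= \iint \varphi(x, \xi\omega)\, \mu^{\xi\omega}(dx)\, \pi_\mu(d\xi\omega).
\end{aligned}
\tag{3.11}
\]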
Consider now two-sided sequences $\omega \in \Omega$, viewing $g''_{\xi\omega} = g'_\xi/\beta(\xi\omega)$ as a function of $\omega$ depending only on the $\omega_i$ with $i \ge 0$, and let $\pi_\upsilon$ denote the natural extension of $\pi_\mu$ to $\Omega$. Since the family of positive weights $g''_\omega$ satisfies the equi-Hölder continuous property of Kifer [1992, (1.7)] we are now in a position to apply Kifer [1992, Proposition 2.5] to the operators $\mathcal{L}''_\omega$. Recall that in this two-sided and not necessarily i.i.d. setting Kifer constructs for $\pi_\upsilon$ almost all $\omega \in \Omega$ uniquely defined numbers $\lambda^\omega > 0$, probability measures $(\mu'')^\omega$ on $M$, and positive Hölder functions $h''_\omega : M \to \mathbb{R}$ with $(\mu'')^\omega(h''_\omega) = 1$ and such that
\[ \mathcal{L}''_\omega h''_\omega = \lambda^\omega h''_{\sigma\omega}, \qquad (\mathcal{L}''_\omega)^* (\mu'')^{\sigma\omega} = \lambda^\omega (\mu'')^\omega. \tag{3.12} \]
It follows from (3.10) and the uniqueness statement in Kifer that $\lambda^\omega \equiv 1$ and $\mu^\omega = (\mu'')^\omega$. By construction, the two-sided probability measure
\[ \upsilon(dx, d\omega) = h''_\omega(x)\, \mu^\omega(dx)\, \pi_\upsilon(d\omega) \tag{3.13} \]
is invariant under the two-sided skew product $\tau$, and from Theorem 3.2 in Kifer [1992] it is the unique (quenched) equilibrium state with marginal $\pi_\upsilon$ for the pair $(\tau, g'')$ on $M \times \Omega$. We claim that the relative entropy of $(\tau, \upsilon)$ over $(\sigma, \pi_\upsilon)$ coincides with the relative entropy of $(\tau^+, \mu)$ over $(\sigma^+, \pi_\mu)$. This follows from formula (2.7) applied both in the one-sided and two-sided settings, and the fact that $\sup_x |\log h''_\omega(x)|$ is bounded uniformly $\pi_\upsilon$-almost everywhere by Kifer [1992, Proposition 2.5, (2.16)]. (Indeed, this uniform bound implies that there is a positive constant $C > 0$ so that $C \cdot H_{\upsilon^\omega}(Q) \le H_{\mu^\omega}(Q) \le H_{\upsilon^\omega}(Q)/C$ for any finite partition $Q$ and $\pi_\upsilon$ almost all $\omega \in \Omega$, where we used the decomposition $\upsilon(dx, d\omega) = \upsilon^\omega(dx)\pi_\upsilon(d\omega)$.)

In fact, we will show next that $(\upsilon, \tau)$ is the $\Omega$-natural extension of $(\mu, \tau^+)$, i.e., the unique $\tau$-invariant measure $\upsilon$ such that $\upsilon(\varphi) = \mu(\varphi)$ for all $\varphi \in L^1(\mu)$ which depend only on ($x$ and) $\omega_j$ for $j \ge 0$. The $\Omega$-natural extension is constructed just like the standard natural extension (and has the property that it leaves the relativised entropy invariant, giving a second proof of that fact). To prove that $\upsilon$ is the $\Omega$-natural extension of $\mu$ we combine two ingredients. The first one is Theorem C from Khanin–Kifer [1996], which says (in a framework more general than ours) that since $\upsilon(dx, d\omega)$ is the unique equilibrium state and therefore the unique Gibbs state for $(g''_\omega, \tau)$ (with marginal $\pi_\upsilon$), it is also the unique $\tau$-invariant measure whose disintegrations $h''_\omega(x)\mu^\omega(dx)$ are almost all absolutely continuous with respect to $\mu^\omega(dx)$. The second ingredient is an abstract result on skew products: If the $\tau^+$-invariant probability $m_1^+$ has absolutely continuous disintegrations $(m_1^+)^\omega(dx)$ with respect to the disintegrations of the $\tau^+$-invariant probability $m_2^+$ on $M \times \Omega^+$, then the $\tau$-invariant $\Omega$-natural extension $m_1$ of $m_1^+$ has absolutely continuous disintegrations $m_1^\omega(dx)$ with respect to those of the $\Omega$-natural extension $m_2$ of $m_2^+$ on $M \times \Omega$.

(To show this abstract result, assume for a contradiction that there is a Borel set $A \subset M \times \Omega$ with $m_2^\omega(A^\omega) = 0$ but $m_1^\omega(A^\omega) > 0$ for a set of $\omega$s of positive measure, and consider approximations $A \subset A_n$ of $A$ by suitable $\Omega$-cylinders, so that $\tau^n A_n$ can be viewed as an element of $M \times \Omega^+$. We obtain our contradiction from the fact that $m_1^+(\tau^n A_n) = m_1(A_n) \ge m_1(A) > 0$, and $m_2^+(\tau^n A_n) = m_2(A_n) \to m_2(A) = 0$ as $n \to \infty$.) The application of the abstract result to our case is by setting $m_2^+ = \mu$ (so that $(m_2, \tau)$ is the $\Omega$-natural extension of $(\mu, \tau^+)$) and
\[ m_1^+(dx, d\omega_+) = \mu^{\omega_+}(dx) \cdot \Bigl(\int_{\Omega_-} h''_\omega(x)\, \pi_\upsilon^{\omega_+}(d\omega_-)\Bigr)\, \pi_\mu(d\omega_+), \]
where we write $\omega_- = (\ldots, \omega_{-2}, \omega_{-1})$, $\Omega_- = \{\omega_-\}$, $\omega_+ = (\omega_0, \omega_1, \cdots)$ and use the almost everywhere well-defined disintegration $\pi_\upsilon(d\omega) = \pi_\upsilon^{\omega_+}(d\omega_-)\, \pi_\mu(d\omega_+)$ (one easily checks
that $m_1 = \upsilon$). Note that we are using the fact that $\int_{\Omega_-} h''_\omega(x)\, \pi_\upsilon^{\omega_+}(d\omega_-)$ is bounded away from zero uniformly in $x$ and (essentially uniformly) in $\omega_+$: This is true because $h''_\omega(x)$ is bounded away from zero uniformly in $x$ and almost everywhere uniformly in $\omega$ by Kifer [1992, Proposition 2.5, (2.16)]. We now show that
\[ 0 = h_\tau(\upsilon|\pi_\upsilon) + \upsilon(\log g''). \tag{3.14} \]
Equality (3.14) follows from Kifer [1992, Proposition 3.1] which tells us in particular that if we set $g'''_\omega = g''_\omega h''_\omega/(h''_{\sigma\omega} \circ f_\omega)$ then we have for almost all $\omega$
\[ 0 = \upsilon^\omega\bigl(I_{\upsilon^\omega}(\mathcal{B}_M | f_\omega^{-1}\mathcal{B}_M) + \log g'''_\omega\bigr), \tag{3.15} \]
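where $I_\eta(\mathcal{B}_M | f_\omega^{-1}\mathcal{B}_M)$ denotes the conditional information of the partition $\mathcal{E}$ of $M$ into points with respect to the partition $f_\omega^{-1}\mathcal{E}$ for the probability measure $\eta$. (Just integrate (3.15) with respect to $\pi_\upsilon$, use the definitions of $g'''$ and $\upsilon$ and the fact that $h_\tau(\upsilon|\pi_\upsilon) = \int \upsilon^\omega(I_{\upsilon^\omega}(\mathcal{B}_M | f_\omega^{-1}\mathcal{B}_M))\, \pi_\upsilon(d\omega)$.) Since we have (using the fact that $\upsilon$ is the $\Omega$-natural extension of $\mu$)
\[ h_\tau(\upsilon|\pi_\upsilon) + \upsilon(\log g'') = h_{\tau^+}(\mu|\pi_\mu) + \mu(\log g'') = h_{\tau^+}(\mu|\pi_\mu) + \mu(\log g') + h^\theta(\pi_\mu), \tag{3.16} \]
Equation (3.14) implies by definition of $g'$ that
\[ h_{\tau^+}(\mu|\pi_\mu) + \mu(\log g) + h^\theta(\pi_\mu) = \log R. \tag{3.17} \]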
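We now check that $\log R = Q^{(a)}(\log g)$. Let then $\hat\mu(dx, d\omega) = \hat\mu^\omega(dx)\pi_{\hat\mu}(d\omega)$ be a $\tau^+$-invariant measure on $M \times \Omega^+$ with $h^\theta(\pi_{\hat\mu}) > -\infty$, and let $\hat\beta(\omega)$ be the one-sided Radon–Nikodym derivative of $\pi_{\hat\mu}(d\omega)$ with respect to $\theta(d\omega_0)\pi_{\hat\mu}(d\sigma\omega)$ (note that $\hat\beta$ is $\pi_{\hat\mu}$ almost everywhere nonzero since $-\log\hat\beta(\omega)$ is in $L^1(\pi_{\hat\mu})$ because the specific entropy per site is finite). We consider $\hat\upsilon(dx, d\omega) = \hat\upsilon^\omega(dx)\pi_{\hat\upsilon}(d\omega)$ the $\Omega$-natural extension of $(\hat\mu, \tau^+)$. Note that $\pi_{\hat\upsilon}$ is the (ordinary) natural extension of $\pi_{\hat\mu}$.

Next, let $\hat h_\omega > 0$ be the functions associated to the weight $g'_{\omega_0}(x)/\hat\beta(\omega)$ by Theorem 3.1 in Khanin–Kifer [1996] (note that the corresponding $\hat\lambda^\omega$ are equal to one $\pi_{\hat\upsilon}$ almost everywhere by a uniqueness argument and a computation identical to (3.11)). Set
\[ \hat g_\omega(x) = \frac{g'_{\omega_0}(x)\, \hat h_\omega(x)}{\hat\beta(\omega)\, (\hat h_{\sigma\omega} \circ f_{\omega_0}(x))}. \tag{3.18} \]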
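By construction, $\sum_{f_\omega y = x} \hat g_\omega(y) = 1$ for any $x \in M$ and $\pi_{\hat\upsilon}$ almost all $\omega \in \Omega$. Applying finally the arguments of Kifer [1992, (3.8)–(3.11)] to $\hat\upsilon$ we get for $\pi_{\hat\upsilon}$ almost every $\omega$,
\[ \hat\upsilon^\omega\bigl(I_{\hat\upsilon^\omega}(\mathcal{B}_M | f_\omega^{-1}\mathcal{B}_M) + \log\hat g_\omega\bigr) \le \int \sum_{y \in f_\omega^{-1} f_\omega(x)} \hat g_\omega(y)\, \hat\upsilon^\omega(dx) - 1 = 0. \tag{3.19} \]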
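Integrating both sides of (3.19) with respect to $\pi_{\hat\upsilon}$ and using the definition of $\hat g$, we get
\[ h_\tau(\hat\upsilon|\pi_{\hat\upsilon}) + \hat\upsilon(\log g') + h^\theta(\pi_{\hat\mu}) = h_{\tau^+}(\hat\mu|\pi_{\hat\mu}) + \hat\mu(\log g') + h^\theta(\pi_{\hat\mu}) \le 0. \tag{3.20} \]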
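Since $\hat\mu(\log g') = \hat\mu(\log g) - \log R$, we are done. It remains to prove uniqueness of the annealed state. Let $\hat\mu(dx, d\omega) = \hat\mu^\omega(dx)\pi_{\hat\mu}(d\omega)$ be a $\tau^+$-invariant probability measure with $h^\theta(\pi_{\hat\mu}) > -\infty$ and such that the inequality in (3.20) is an equality, so that the inequality for the corresponding $\hat\upsilon$ on $M \times \Omega$ in (3.19) is $\pi_{\hat\upsilon}$ almost everywhere an equality, i.e.,
\[ \hat\upsilon^\omega\bigl(I_{\hat\upsilon^\omega}(\mathcal{B}_M | f_\omega^{-1}\mathcal{B}_M) + \log\hat g_\omega\bigr) = \int \sum_{y \in f_\omega^{-1} f_\omega(x)} \hat g_\omega(y)\, \hat\upsilon^\omega(dx) - 1 = 0. \tag{3.21} \]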
Starting from (3.21), we may proceed exactly as Kifer [1992, (3.12)], obtaining that $\mathcal{L}^*_{\hat g_{\xi\omega}} \hat\upsilon^\omega = \hat\upsilon^{\xi\omega}$ ($\pi_{\hat\upsilon}$ almost everywhere) for $\mathcal{L}_{\hat g_\omega}$ the random operator associated to $\hat g_\omega$. Integration then shows that (3.7) holds with $\hat\mu$ instead of $\mu$:
\[ \int \sum_{f_\xi y = x} \frac{g'_\xi(y)}{\hat\beta(\xi\omega)}\, \hat h_{\xi\omega}(y)\varphi_{\xi\omega}(y)\, \hat\mu^\omega(dx)\, \pi_{\hat\mu}(d\xi\omega) = \int \varphi_\omega(x)\hat h_\omega(x)\, \hat\mu^\omega(dx)\, \pi_{\hat\mu}(d\omega). \tag{3.22} \]
Since the simplicity of the maximal eigenvalue statement in Proposition 3.1 (2) applies to the dual of the integrated operator $\hat{\mathcal{L}}'$ associated to $g'_\xi$, we get the claimed equality $\mu = \hat\mu$ from (3.22) by definition of $\hat\beta$.

Proof of Proposition 3.3. We essentially follow the path laid out by Khanin and Kifer [1996, Sect. 4], proving first the existence and uniqueness of the annealed Gibbs measure. Observe that any limit point of the probability measures
\[ \varphi \mapsto \frac{\hat{\mathcal{L}}^n \varphi(x, \omega)_n}{\hat{\mathcal{L}}^n 1(x, \omega)_n}, \qquad \varphi \in C^0(M \times \Omega^+, \mathbb{C}), \tag{3.23} \]
as $n \to \infty$ with $(x, \omega)_n \in M \times \Omega^+$ is an annealed Gibbs measure, and that all annealed Gibbs measures are constructed with this procedure. Such a limit point must exist by standard compactness arguments. In fact it follows from the proof of Proposition 3.1 that for all continuous $\varphi : M \times \Omega^+ \to \mathbb{C}$,
\[ \lim_{n \to \infty} \sup_{(x, \omega), (y, \tilde\omega)} \Bigl| \frac{\hat{\mathcal{L}}^n \varphi(x, \omega)}{\hat{\mathcal{L}}^n 1(x, \omega)} - \frac{\hat{\mathcal{L}}^n \varphi(y, \tilde\omega)}{\hat{\mathcal{L}}^n 1(y, \tilde\omega)} \Bigr| = \lim_{n \to \infty} \Bigl| \frac{\hat{\mathcal{L}}^n \varphi(x, \omega)}{\hat{\mathcal{L}}^n 1(x, \omega)} - \hat\nu(\varphi) \Bigr| = 0, \tag{3.24} \]
uniformly in $(x, \omega)$, where $\hat\nu$ is defined in Proposition 3.1. (Indeed, the difference $\hat{\mathcal{L}}^n \varphi(x, \omega)/(R^n\rho(x)) - \hat\nu(\varphi)$ converges to zero uniformly in $(x, \omega)$.) In particular, we also get uniqueness of the Gibbs measure, which coincides with $\hat\nu$.

Clearly, the annealed equilibrium state $\mu = \rho\hat\nu$ is therefore also an annealed Gibbs state. To prove that there is no other annealed Gibbs state we note that any such state $\mu'$ has a density $\rho' \in L^1(\hat\nu)$ with respect to $\hat\nu$ which satisfies $\hat{\mathcal{L}}\rho' = R\rho'$ in $L^1(\hat\nu)$ (indeed, we have by assumption $\hat\nu((\varphi \circ \tau^+)\rho') = \hat\nu(\varphi\rho')$ for all $\varphi \in L^1(\hat\nu)$, and we may use $\hat{\mathcal{L}}^*\hat\nu = R\hat\nu$). Since $(\hat{\mathcal{L}}^n \varphi(x))/R^n$ converges to $\rho(x)$ for all continuous $\varphi$ with $\hat\nu(\varphi) = 1$ and continuous functions are dense in $L^1(\hat\nu)$ by Lusin's Theorem, we get (Theorem VIII.5.1 in Dunford–Schwartz [1988]) that $\frac{1}{n}\sum_{k=0}^{n-1} (\hat{\mathcal{L}}^k \varphi(x)/R^k)$ converges to $\rho(x)$ for all $\varphi \in L^1(\hat\nu)$ with $\hat\nu(\varphi) = 1$, in particular $\rho' = \rho$ as desired.
Remark 3.4. The annealed equilibrium state $\mu$ is also a quenched equilibrium state if and only if $\pi_\mu = \Theta^+$ (if and only if $Q^{(a)}(\log g) = Q^{(q)}(\log g)$). In the simple case where $\mu = \hat\nu$, i.e., $h \equiv 1$, a necessary condition for this is the existence of a probability measure $\nu(dx)$ on $M$ so that $\int_M \mathcal{L}_\xi 1(x)\, \nu(dx)$ is $\theta$-almost everywhere constant (because $\mu \in \mathcal{P}_\Theta$ and $\mu = \hat\nu = \hat{\mathcal{L}}^*\hat\nu/R \in \mathcal{P}_\Theta$, in particular when integrating functions independent of $x \in M$). This constancy condition is violated for example if the number of branches of the $f_\xi$ is constant in $\xi$, and $g_\xi(x)$ is constant in $x$ but depends (essentially) on $\xi$. A concrete example is when each $f_\xi$ is a linear repeller of the interval with two branches (the slopes of which are chosen to depend essentially on $\xi$) and $g_\xi = 1/|f'_\xi|$. It seems difficult to state a simple necessary condition for the coincidence of the quenched and annealed states when $h$ is not constant.

A sufficient condition ensuring $\pi_\mu = \Theta^+$ (without assuming $\mu = \hat\nu$) is the existence of a probability $\nu(dx)$ on $M$ and a constant $\lambda > 0$ such that $\mathcal{L}^*_\xi(\nu) = \lambda(\nu)$ for $\theta$ almost all $\xi \in E$. By definition of the Jacobian, this property holds with $\lambda = 1$ for $g_\xi = 1/|\det D(f_\xi)|$ and $\nu$ Lebesgue measure, proving Proposition 2. In Sect. 4, Remark 4.3, we mention a weaker sufficient condition.

3.3. Stability of the discrete spectrum and annealed state. The stability claims in Theorems 3 and 5 will be a consequence of the following proposition and results from Baladi–Young [1993]:

Proposition 3.4. Consider a weighted small random perturbation of $f_0 \in C^r_\gamma(M, M)$, $g_0 \in C^r(M, \mathbb{C})$ given by a family $\theta_\epsilon$ ($\epsilon \ge 0$) and write $\hat{\mathcal{L}}_\epsilon$, $\mathcal{L}_\epsilon$ ($\epsilon \ge 0$) for the corresponding transfer operators acting on $\mathcal{B}(\alpha)$, respectively $\mathcal{F} = C^r(M, \mathbb{C})$. Write $R = \exp P(\log|g_0|)$ as usual.
(1) For any fixed $\psi \in \mathcal{B}$, $\varphi \in \mathcal{F}$, $n \ge 1$,
\[ \lim_{\epsilon \to 0} \|\hat{\mathcal{L}}^n_\epsilon \psi - \hat{\mathcal{L}}^n_0 \psi\|_{\mathcal{B}} = 0, \qquad \lim_{\epsilon \to 0} \|\mathcal{L}^n_\epsilon \varphi - \mathcal{L}^n_0 \varphi\|_{\mathcal{F}} = 0. \tag{3.25} \]
(2) Let $\bar\gamma < \gamma$ and $\bar\alpha > \alpha$. Then there is a constant $C > 0$ and an integer $N \ge 0$, so that for all $n \ge N$ there is $\epsilon(n)$ such that for all $\epsilon < \epsilon(n)$,
\[ \|\hat{\mathcal{L}}^n_\epsilon - \hat{\mathcal{L}}^n_0\|_{\mathcal{B}} \le C R^n \max(\bar\gamma^{-rn}, \bar\alpha^n), \qquad \|\mathcal{L}^n_\epsilon - \mathcal{L}^n_0\|_{\mathcal{F}} \le C R^n \max(\bar\gamma^{-rn}, \bar\alpha^n). \tag{3.26} \]
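Proof of Proposition 3.4. (1) By the triangle inequality it suffices to prove the claims for $n = 1$. To do this, use that each $\theta_\epsilon$ is a probability distribution and observe that
\[
\begin{aligned}
&\lim_{\epsilon \to 0} \sup_{\xi\omega \in \mathrm{support}\, \theta_\epsilon^{\mathbb{Z}_+}} \|\mathcal{L}_\xi \psi_{\xi\omega} - \mathcal{L}\psi_{\xi\omega}\|_{\mathcal{F}} = 0, \\
&\lim_{\epsilon \to 0} \sup_{\xi\omega \in \mathrm{support}\, \theta_\epsilon^{\mathbb{Z}_+}} \mathrm{Lip}_\omega D^j(\mathcal{L}_\xi \psi_{\xi\omega} - \mathcal{L}\psi_{\xi\omega}) = 0, \quad 0 \le j \le r,
\end{aligned}
\tag{3.27}
\]
(simply apply the Leibniz formula to each term in the finite sums over inverse branches of the $f_\xi$ and use the definition of a small random perturbation).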
(2) The argument follows the lines of the proof of Lemma 5 in Baladi–Young [1993] or Lemma A.1 in Baladi et al. [1996] and is left to the reader. (We may use that $\hat{\mathcal{L}}^n_\epsilon \varphi(x, \omega)$ can be written as an integral over $\theta_\epsilon(d\xi_1) \cdots \theta_\epsilon(d\xi_n)$ of random operators where the weights $g$ are evaluated at points which depend on $x$ and $\xi_1, \ldots, \xi_n$ but not on $\omega$.)

Theorem 3 (1) is an immediate consequence of Proposition 3.5:

Proposition 3.5. Consider a weighted small random perturbation of $f_0 \in C^r_\gamma(M, M)$, $g_0 \in C^r(M, \mathbb{C})$ given by a family $\theta_\epsilon$ ($\epsilon \ge 0$), and write $\hat{\mathcal{L}}_\epsilon$, $\mathcal{L}_\epsilon$ ($\epsilon \ge 0$) for the corresponding transfer operators acting on $\mathcal{B}(\alpha)$, respectively $\mathcal{F} = C^r(M, \mathbb{C})$. Write $R = \exp P(\log|g_0|)$ as usual. Let $\bar\gamma < \gamma$ and assume that $\alpha < 1/\bar\gamma^r$. The spectrum of $\mathcal{L}_\epsilon$ and that of $\hat{\mathcal{L}}_\epsilon$ outside of the disc of radius $R/\bar\gamma^r$ contains only isolated eigenvalues of finite multiplicity for small enough $\epsilon$, and both spectra converge to the spectrum of $\mathcal{L}_0$ acting on $\mathcal{F}$ (outside of this disc) as $\epsilon \to 0$. The corresponding generalised eigenspaces of $\mathcal{L}_\epsilon$, respectively $\hat{\mathcal{L}}_\epsilon$, converge in the $\mathcal{F}$, respectively $\mathcal{B}(\alpha)$, topology to those of $\mathcal{L}_0$, respectively $\hat{\mathcal{L}}_0$, and the dual eigenspaces converge in the weak topology. In particular, for positive weights, the maximal eigenmeasure $\hat\nu_\epsilon$ of $\hat{\mathcal{L}}_\epsilon$ converges to $\nu_0 \times \delta^{\mathbb{Z}_+}$ with $\delta$ the Dirac mass at $(f_0, g_0)$.

Proof of Proposition 3.5. The stability of the spectrum and the convergence of the eigenfunctions (in particular $\lim_{\epsilon \to 0} \|\rho_\epsilon - \rho_0\|_r = 0$ for positive weights) follow from Lemma 3 in Baladi–Young [1993] applied to the operators $\mathcal{L}_\epsilon$, using the statements in Proposition 3.4 about the operators $\mathcal{L}_\epsilon$. Indeed we get from Baladi–Young [1993] that the spectrum of $\mathcal{L}_\epsilon$ acting on $\mathcal{F}$ and of $\hat{\mathcal{L}}_\epsilon$ acting on $\mathcal{B}$ outside of the disc of radius $R/\bar\gamma^r$ both converge to the spectrum of $\mathcal{L}_0$ acting on $\mathcal{F}$ outside of this disc as $\epsilon$ goes to zero. The eigenfunctions converge in the $\mathcal{F}$, respectively $\mathcal{B}$, norm.

To get the weak convergence of the eigenfunctionals it suffices to observe that the bounds for $\hat{\mathcal{L}}_\epsilon$ in Proposition 3.4 also apply to $\hat{\mathcal{L}}^*_\epsilon$ by definition of the dual norm. Therefore, Lemma 3 in Baladi–Young [1993] may also be applied to the family $\hat{\mathcal{L}}^*_\epsilon$, yielding the desired convergence. For the final claim, use $\hat{\mathcal{L}}^*_0(\nu \times \delta^{\mathbb{Z}_+}) = R(\nu \times \delta^{\mathbb{Z}_+})$ for any maximal eigenmeasure $\nu$ of $\mathcal{L}_0$ and the fact that the multiplicity of the maximal eigenvalue is constant for small enough $\epsilon$ from Proposition 3.4.

3.4. The annealed zeta functions. Theorem 6 will be a consequence of Proposition 3.5 and the following result of Ruelle:

Theorem (Ruelle [1990, Theorem 1.1, Theorem 1.3]). Consider a $C^r$ (complex) weighted $\gamma$-expanding system $(\tau^+, g, \theta)$, write $\mathcal{L}$ for the corresponding transfer operator acting on $\mathcal{F} = C^r(M, \mathbb{C})$, and let $R = \exp Q^{(a)}(\log|g|)$.
(1) The zeta function $\zeta^{(a)}(z)$ is analytic in the disc of radius $R^{-1}$ and admits a zero-free meromorphic extension to the disc of radius $R^{-1}\gamma$, where its poles coincide (including multiplicity) with the inverses of the eigenvalues of modulus larger than $R/\gamma$ of $\mathcal{L}$ acting on $\mathcal{F}$. More precisely, if $E$ is fixed then for any $\delta > 0$ and $\bar\gamma < \gamma$ there is a constant $C(\bar\gamma, \delta) > 0$ which does not depend on the probability distribution $\theta$ on $E$, so that if $\lambda_1, \ldots, \lambda_N$ are the eigenvalues of $\mathcal{L}$ of modulus larger than $R/\bar\gamma$, then the coefficients $a_n$ in the expansion
\[ \sum_{n=0}^{\infty} a_n z^n := \log\Bigl(\zeta^{(a)}(z) \cdot \prod_{i=1}^{N} (1 - \lambda_i^{-1} z)\Bigr) \tag{3.28} \]
satisfy the uniform bounds
\[ |a_n| \le C \exp(n(Q^{(a)}(\log|g|) + \delta))/\bar\gamma^n. \tag{3.29} \]
(2) The generalised Fredholm determinant $d^{(a)}(z)$ admits an analytic extension to the disc of radius $R^{-1}\gamma^r$, where its zeroes coincide (including multiplicity) with the inverses of the eigenvalues of modulus larger than $R/\gamma^r$ of $\mathcal{L}$. More precisely, if $E$ is fixed, then for any $\delta > 0$ and $\bar\gamma < \gamma$ there is a constant $C(\bar\gamma, \delta) > 0$ which does not depend on the probability distribution $\theta$ on $E$, so that if $\lambda_1, \ldots, \lambda_M$ are the eigenvalues of $\mathcal{L}$ of modulus larger than $R/\bar\gamma^r$ then the coefficients $b_n$ in the expansion
\[ \sum_{n=0}^{\infty} b_n z^n := \log\Bigl(d^{(a)}(z) \Big/ \prod_{i=1}^{M} (1 - \lambda_i^{-1} z)\Bigr) \tag{3.30} \]
satisfy the uniform bounds
\[ |b_n| \le C \exp(n(Q^{(a)}(\log|g|) + \delta))/\bar\gamma^{rn}. \tag{3.31} \]
(Ruelle does not state explicitly the $\theta$-uniform bounds (3.29), (3.31) but they are easily obtained from his proofs.)

Proof of Theorem 6. For each fixed $m \ge 1$ we get by definition of a small random perturbation that $\zeta^{(a)}_\epsilon(m)$ converges to $\zeta^{(a)}_0(m)$ and $d^{(a)}_\epsilon(m)$ converges to $d^{(a)}_0(m)$ as $\epsilon \to 0$. Moreover, the eigenvalues $\lambda_{i,\epsilon}$ of $\mathcal{L}_\epsilon$ of modulus larger than $R/\bar\gamma^r$ converge to the corresponding eigenvalues of $\mathcal{L}_0$ by Proposition 3.5. The result is therefore an easy exercise on convergent power series using the uniform bounds in the theorem of Ruelle stated above.

3.5. Integrated annealed correlation functions. For $\varphi_1, \varphi_2 \in \mathcal{B}$, and $\mu = \rho\hat\nu$ the annealed equilibrium state of a positively weighted i.i.d. expanding map $(\tau^+, g, \theta)$, writing $R$ for the spectral radius of $\hat{\mathcal{L}}$ on $\mathcal{B}$, we get
\[ \int (\varphi_1 \circ (\tau^+)^n)(x, \omega)\, \varphi_2(x, \omega)\rho(x, \omega)\, \hat\nu(dx, d\omega) = \frac{1}{R^n} \int \varphi_1\, \hat{\mathcal{L}}^n(\rho\varphi_2)(x, \omega)\, \hat\nu(dx, d\omega). \tag{3.32} \]
(Just use the fact that $\hat\nu$ is an eigenfunctional for the dual of $\hat{\mathcal{L}}$ and the eigenvalue $R$.) If $C_{\varphi_1\varphi_2}(n)$ denotes the correlation function (2.20), it follows formally that
\[ \sum_{n \ge 0} e^{in\eta}\, C_{\varphi_1\varphi_2}(n) = \int \Bigl(1 - \Bigl(\frac{e^{i\eta}}{R}\Bigr)\hat{\mathcal{L}}\Bigr)^{-1}(\rho\varphi_2)(x, \omega)\, \varphi_1(x, \omega)\, \hat\nu(dx, d\omega). \tag{3.33} \]
Our results on the spectrum of $\hat{\mathcal{L}}$ in Proposition 3.1 give the desired meaning to (3.33). This proves Theorem 4 (1). (For $\varphi_1, \varphi_2 \in C^r(M, \mathbb{C})$, we may in fact replace $\hat{\mathcal{L}}$ by $\mathcal{L}$ and $(\tau^+)^n$ by $f^{(n)}_\omega$ in (3.32) and (3.33).) Finally, Theorem 5 (1) follows from Proposition 3.5, just as in the proof of Theorem 3.
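The formal step from (3.32) to (3.33) is a geometric (Neumann) series in the normalised operator $\hat{\mathcal{L}}/R$. The Python sketch below illustrates only that algebraic identity, under simplifying assumptions that are not those of the text: the operator is a positive $2 \times 2$ matrix `A`, and the complex parameter `z` has modulus strictly less than 1 so that the series converges absolutely (in (3.33) one has $|e^{i\eta}| = 1$, and it is the spectral information of Proposition 3.1 that gives the series a meaning). The function names are hypothetical.

```python
import cmath
import math

def spectral_radius_2x2(A):
    """Spectral radius of a 2x2 matrix with positive entries."""
    (a, b), (c, d) = A
    return ((a + d) + math.sqrt((a - d) ** 2 + 4.0 * b * c)) / 2.0

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def neumann_sum(A, z, w, terms):
    """Partial sum of sum_{n>=0} z^n (A/R)^n w."""
    R = spectral_radius_2x2(A)
    total = [0j, 0j]
    term = [complex(x) for x in w]
    for _ in range(terms):
        total = [t + u for t, u in zip(total, term)]
        term = [z * x / R for x in matvec(A, term)]
    return total

def resolvent(A, z, w):
    """Solve (I - (z/R) A) u = w by Cramer's rule."""
    R = spectral_radius_2x2(A)
    m00 = 1 - z * A[0][0] / R
    m01 = -z * A[0][1] / R
    m10 = -z * A[1][0] / R
    m11 = 1 - z * A[1][1] / R
    det = m00 * m11 - m01 * m10
    return [(w[0] * m11 - m01 * w[1]) / det, (m00 * w[1] - m10 * w[0]) / det]

if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    z = 0.9 * cmath.exp(0.7j)  # modulus below 1 so that the series converges
    w = [1.0, -0.5]
    print("partial sum:", neumann_sum(A, z, w, 400))
    print("resolvent:  ", resolvent(A, z, w))
```

The resolvent has poles exactly at the points $z = R/\lambda$ for $\lambda$ an eigenvalue of `A`, which is the finite-dimensional shadow of the link between the correlation spectrum and the spectrum of the transfer operator.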
4. The Quenched Transfer Operator $\widehat{\mathcal{M}}$

In this section, we restrict again to the case of positive weights and we construct a normalised integrated operator $\widehat{\mathcal{M}}$ related to the quenched state. This will be useful to study the quenched correlation spectrum and its stability for small perturbations, in particular to prove Theorem 3 (2), Theorem 4 (2), and Theorem 5 (2).

Consider first the two-sided situation $(\tau, g, \theta)$, viewing $g$ as a function on $M \times \Omega$ depending only on $x$ and $\omega_0$. Using the notations and results from Kifer [1992] recalled in (3.12) above, i.e., uniquely defined positive numbers $\lambda^\omega$, Borel probability measures $\nu^\omega$ on $M$, and functions $h_\omega : M \to \mathbb{R}$ with $\nu^\omega(h_\omega) = 1$, such that $\mathcal{L}_\omega h_\omega = \lambda^\omega h_{\sigma\omega}$, and $\mathcal{L}^*_\omega \nu^{\sigma\omega} = \lambda^\omega \nu^\omega$, we first show:

Proposition 4.1 (Properties of $\lambda^\omega$ and $\nu^\omega$).
(1) Since $\mathcal{L}_\omega = \mathcal{L}_{\omega_0}$, the objects $\lambda^\omega$ and $\nu^\omega$ only depend on $\omega_k$ for $k \ge 0$.
(2) The map $\omega \mapsto \log\lambda^\omega$ is Lipschitz from $\Omega^+ \to \mathbb{R}_+$ for the metric $d_\alpha$ for any $\alpha > 1/\gamma$.

Proof of Proposition 4.1. The first assertion is a consequence of the proof of Lemma 2.2 in Kifer [1992]. To prove the second claim, we use the observations of Kifer [1992, p. 16] that $\lambda^\omega = \nu^{\sigma\omega}(\mathcal{L}_\omega(1))$, and
\[ \nu^\omega(\varphi) = \lim_{n \to \infty} \frac{\mathcal{L}^n_\omega \varphi}{\mathcal{L}^n_\omega 1}. \tag{4.1} \]
Let $1 < \bar\gamma < \gamma$ and $\alpha = 1/\bar\gamma$; we shall prove that there is a constant $C > 0$ so that for any $\omega, \tilde\omega \in \Omega^+$ we have $|\log\lambda^\omega - \log\lambda^{\tilde\omega}| \le C d_\alpha(\omega, \tilde\omega)$.

We begin with two purely dynamical remarks. First observe that, by compactness of $E$, there is $\bar\epsilon(E)$ such that whenever $d_r(f_\omega, f_{\tilde\omega}) < \bar\epsilon$ then $f_\omega$ and $f_{\tilde\omega}$ have the same degree, and, moreover, for each $x \in M$ the bijection $\Psi$ between $\{y \mid f_\omega(y) = x\}$ and $\{\tilde y \mid f_{\tilde\omega}(\tilde y) = x\}$ can be chosen in such a way that
\[ d_M(y, \Psi(y)) \le \frac{d_r(f_\omega, f_{\tilde\omega})}{\gamma}. \tag{4.2} \]
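Indeed, if $\bar\epsilon$ is small enough we may choose $\Psi$ so that $y$ and $\Psi(y)$ are in the image of the same local inverse branch of $f_\omega$, so that if (4.2) were violated for some $y$ we would have
\[
\begin{aligned}
0 = d_M(x, x) &= d_M(f_\omega y, f_{\tilde\omega}(\Psi(y))) \\
&\ge d_M(f_\omega y, f_\omega(\Psi(y))) - d_M(f_\omega(\Psi(y)), f_{\tilde\omega}(\Psi(y))) \\
&> \gamma \cdot \frac{d_r(f_\omega, f_{\tilde\omega})}{\gamma} - d_r(f_\omega, f_{\tilde\omega}) = 0,
\end{aligned}
\tag{4.3}
\]
a contradiction. We claim now that for all $\omega$, $\tilde\omega$, all $n \ge 1$, and up to exchanging $\omega$ and $\tilde\omega$, there exists for any point $x \in M$ a surjective map
\[ \Psi_{n,\omega,\tilde\omega,x} : Y_{n,\omega,x} = \{y \in M \mid f^{(n)}_\omega(y) = x\} \to Y_{n,\tilde\omega,x} = \{\tilde y \in M \mid f^{(n)}_{\tilde\omega}(\tilde y) = x\}. \tag{4.4} \]
(If all $f_\omega$ have the same degree, then the $\Psi_{n,\omega,\tilde\omega,x}$ are bijections; otherwise, the cardinality of the fibers can be unbounded as $n \to \infty$.) Moreover, fixing $\bar\epsilon(E)$ as above, there is a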
constant $C > 0$ so that if, in addition, $\delta = d_{1/\bar\gamma}(\omega, \tilde\omega) \le \bar\epsilon$, then there is $n(\delta)$ so that for $n \ge n(\delta)$ and any $y \in Y_{n,\omega,x}$, we have
\[ d_M(y, \Psi_{n,\omega,\tilde\omega,x}(y)) < C d_{1/\bar\gamma}(\omega, \tilde\omega). \tag{4.5} \]
To prove (4.5), we first note that since $d_E(\omega_k, \tilde\omega_k) \le \delta \cdot \bar\gamma^k$ for all $k \ge 0$, there is for any $\delta \le \bar\epsilon$ an iterate $k_0(\delta) \ge 0$ with $\delta\bar\gamma^{k_0} \le \bar\epsilon < \delta\bar\gamma^{k_0+1}$ (if $\delta > \bar\epsilon$ we set $k_0 = 0$). For any $n \ge n(\delta) = k_0(\delta)$, and any $x \in M$ consider the finite sets $Y = Y_{n-k_0,\sigma^{k_0}\omega,x}$ and $\tilde Y = Y_{n-k_0,\sigma^{k_0}\tilde\omega,x}$, assuming that $\# Y \ge \# \tilde Y$ (the other case is symmetric) and choose an arbitrary surjection $\Psi : Y \to \tilde Y$. If $\delta > \bar\epsilon$, we are done. Otherwise, we fix an arbitrary pair $(y, \tilde y = \Psi(y)) \in Y \times \tilde Y$. Using the fact that for any $j \ge 1$ and any $u, v$ in $M$ the sets $(f^{(j)}_\omega)^{-1}(u)$ and $(f^{(j)}_\omega)^{-1}(v)$ are in bijection with the distance between two paired points not larger than $d_M(u, v)/\gamma^j$ (recall that each $f_\omega$ is $\gamma$-expanding), using the simplified notation $d((f^{(j)}_\omega)^{-1}(u), (f^{(j)}_\omega)^{-1}(v))$ to represent the maximum distance between two such paired points, and defining $d((f^{(j)}_\omega)^{-1}(u), (f^{(j)}_{\tilde\omega})^{-1}(u))$ analogously, we get by applying successively (4.2) and recalling the definition of $k_0$
0 ) −1 d((fω(k0 ) )−1 (y), (fω(k ˜ ˜ ) (y)) 0 ) −1 ˜ + d((fω(k0 ) )−1 (y), ˜ (fω(k ˜ ≤ d((fω(k0 ) )−1 (y), (fω(k0 ) )−1 (y)) ˜ ) (y)) diam M ≤ + d((fω(k0 −1) )−1 (fωk0 )−1 (y), ˜ (fω(k0 −1) )−1 (fω˜ k0 )−1 (y)) ˜ γ k0 0 −1) −1 +d((fω(k0 −1) )−1 (fω˜ k0 )−1 (y), ˜ (fω(k ) (fω˜ k0 )−1 (y)) ˜ ˜ ≤ ··· k0 −1 ¯ γ¯ k0 +1 · diam M X + ≤ k k +1 k −1−j 0 0 0 γ γ¯ γ · γ¯ j j=0 k0 1 γ¯ · diam M , ≤ δ · γ¯ · + ¯γ k0 1 − (γ/γ) ¯
as claimed. We now write
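\[
\begin{aligned}
\frac{\lambda^\omega}{\lambda^{\tilde\omega}}
&= \lim_{n\to\infty} \frac{L^n_\omega(1) \cdot L^{n-1}_{\sigma\tilde\omega}(1)}{L^{n-1}_{\sigma\omega}(1) \cdot L^n_{\tilde\omega}(1)} \\
&= \lim_{n\to\infty} \sup_x
\frac{\displaystyle \sum_{f^{(n)}_\omega u = x} \prod_{k=0}^{n-1} g_{\sigma^k(\omega)}(f^{(k)}_\omega u) \cdot \sum_{f^{(n-1)}_{\sigma\tilde\omega} s = x} \prod_{k=0}^{n-2} g_{\sigma^{k+1}(\tilde\omega)}(f^{(k)}_{\sigma\tilde\omega} s)}
{\displaystyle \sum_{f^{(n-1)}_{\sigma\omega} v = x} \prod_{k=0}^{n-2} g_{\sigma^{k+1}(\omega)}(f^{(k)}_{\sigma\omega} v) \cdot \sum_{f^{(n)}_{\tilde\omega} t = x} \prod_{k=0}^{n-1} g_{\sigma^k(\tilde\omega)}(f^{(k)}_{\tilde\omega} t)}
\end{aligned} \tag{4.6}
\]
\[
= \lim_{n\to\infty} \sup_x
\frac{\displaystyle \sum_{\substack{f^{(n-1)}_{\sigma\omega} y = x \\ f^{(n-1)}_{\sigma\tilde\omega} s = x}} \prod_{k=0}^{n-2} g_{\sigma^{k+1}(\omega)}(f^{(k)}_{\sigma\omega} y) \cdot g_{\sigma^{k+1}(\tilde\omega)}(f^{(k)}_{\sigma\tilde\omega} s) \sum_{f_{\omega_0} u = y} g_{\omega_0}(u)}
{\displaystyle \sum_{\substack{f^{(n-1)}_{\sigma\omega} v = x \\ f^{(n-1)}_{\sigma\tilde\omega} r = x}} \prod_{k=0}^{n-2} g_{\sigma^{k+1}(\omega)}(f^{(k)}_{\sigma\omega} v) \cdot g_{\sigma^{k+1}(\tilde\omega)}(f^{(k)}_{\sigma\tilde\omega} r) \sum_{f_{\tilde\omega_0} t = r} g_{\tilde\omega_0}(t)} .
\]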
(4.7)

If $\delta \le \bar\varepsilon$ (the case $\delta > \bar\varepsilon$ is simpler since we just need to bound (4.7) uniformly from above and below), we consider the right-hand side of (4.7) for a fixed $n \ge n(\delta)$ and any $x \in M$, and assume that the surjection $\Psi_{n,\omega,\tilde\omega}$ is as in (4.4) (the other case is similar). It suffices to replace $g_{\omega_0}(u)$ by $g_{\tilde\omega_0}(t) \cdot (g_{\omega_0}(u)/g_{\tilde\omega_0}(t))$ (where $t = \Psi_{n,\omega,\tilde\omega}(u)$) in the numerator, and to use the remark that $d_M(t, u) < d_{1/\bar\gamma}(\omega, \tilde\omega)$, the fact that $d_E(\omega_0, \tilde\omega_0) \le \delta$, as well as the following trivial inequalities for numbers $a_i, c_i > 0$ with $i \in I$ finite:
\[
\inf_{i\in I} c_i \le \frac{\sum_{i\in I} a_i c_i}{\sum_{i\in I} a_i} \le \sup_{i\in I} c_i . \tag{4.8}
\]

Remark 4.1. Equation (4.7) in the proof of Proposition 4.1 also shows that $\lambda^\omega$ only depends on $\omega_0$ if $f_{\omega_0}$ is independent of $\omega_0$ (as in the random Ising model in Sect. 2.4). In this case, and whenever $\lambda^\omega = \lambda^{\omega_0}$, it is not difficult to check by looking at the proof of Kifer [1992, Proposition 2.5] that $h^\omega$ only depends on $\omega_i$ for $i < 0$.

Remark 4.2. (1) Since the Lipschitz constant of $\log \lambda^\omega_\varepsilon$ is uniform in $\varepsilon$ in the case of a small random perturbation $\theta_\varepsilon$, we get
\[
\lim_{\varepsilon\to 0} \bigl(\sup_\omega \log \lambda^\omega_\varepsilon - \inf_\omega \log \lambda^\omega_\varepsilon\bigr) = 0 . \tag{4.9}
\]
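Since $\int \log \lambda^\omega_\varepsilon\, \theta_\varepsilon^{\mathbb{Z}_+}(d\omega) = Q^{(q)}_\varepsilon(\log g)$ and we know from Kifer [1992, Sect. 4] that $Q^{(q)}_\varepsilon(\log g) \to Q^{(q)}_0(\log g)$ as $\varepsilon \to 0$ we find
\[
\lim_{\varepsilon\to 0} \sup_\omega |\log \lambda^\omega_\varepsilon - Q^{(q)}_\varepsilon(\log g)| = 0 . \tag{4.10}
\]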
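(2) In the case of a small random perturbation $\theta_\varepsilon$, we claim that $\log \lambda^\omega_\varepsilon$ when viewed as a $C^\beta$ function of $\omega$ (for any $0 < \beta < 1$) has a Hölder constant which goes to zero as $\varepsilon \to 0$. (This will be useful in Proposition 4.3 below.) To see this, let $C > 0$ be an upper bound for the Lipschitz constant of $\log \lambda^\omega_\varepsilon$ for $\varepsilon < \varepsilon_0$, and observe that for any fixed $\varepsilon$ and any $\omega$, $\tilde\omega$, with each $\omega_i$, $\tilde\omega_i$ in the support of $\theta_\varepsilon$, we have $d_\alpha(\omega, \tilde\omega) \le \varepsilon/(1 - \alpha)$ so that
\[
|\log \lambda^\omega_\varepsilon - \log \lambda^{\tilde\omega}_\varepsilon| \le \frac{C \varepsilon^{1-\beta}}{(1 - \alpha)^{1-\beta}} \cdot d_\alpha(\omega, \tilde\omega)^\beta . \tag{4.11}
\]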
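The passage from the Lipschitz bound to (4.11) is a standard interpolation. As a sketch, using only the uniform Lipschitz constant $C$ and the bound $d_\alpha(\omega, \tilde\omega) \le \varepsilon/(1-\alpha)$ stated above, the intermediate step can be written as:

```latex
\begin{align*}
|\log\lambda^{\omega}_\varepsilon - \log\lambda^{\tilde\omega}_\varepsilon|
  &\le C\, d_\alpha(\omega,\tilde\omega)
   = C\, d_\alpha(\omega,\tilde\omega)^{1-\beta}\, d_\alpha(\omega,\tilde\omega)^{\beta} \\
  &\le C \Bigl(\frac{\varepsilon}{1-\alpha}\Bigr)^{1-\beta}
       d_\alpha(\omega,\tilde\omega)^{\beta} .
\end{align*}
```

Since $1 - \beta > 0$, the prefactor tends to zero as $\varepsilon \to 0$ for fixed $\alpha$ and $\beta$, which is the claimed decay of the Hölder constant.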
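We now define the integrated quenched transfer operator $\widehat{\mathcal M}$ acting on bounded functions $\varphi : M \times \Omega^+ \to \mathbb{C}$ by
\[
(\widehat{\mathcal M}\varphi)(x, \omega^+) = \int \frac{1}{\lambda^{\xi\wedge\omega^+}} (L_\xi \varphi_{\xi\wedge\omega^+})(x)\, \theta(d\xi) . \tag{4.12}
\]
Observe that the spectral radius of $\widehat{\mathcal M}$ acting on bounded functions is equal to 1.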
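Remark 4.3. There is in general no corresponding operator $\mathcal M$ acting on functions which only depend on the $x$-variable. Such a definition exists if $\lambda^\omega$ only depends on $\omega_0$ (e.g., if the dynamical system is deterministic but not the weight, see Remark 4.1). This is the case in particular when $\lambda^\omega$ is constant (for example in the SRB case where $\lambda^\omega \equiv 1$, or for a constant weight whenever the degree of the $f_\xi$ is constant); then the operator $\widehat{\mathcal M}$ is simply $\widehat{\mathcal L}$ rescaled by $\lambda^\omega = \lambda$ so that $\mu^{(a)} = \mu^{(q)}$.

Since $\lambda^\omega$ is Lipschitz for $d_\alpha$ ($\alpha > 1/\gamma$) by Proposition 4.1 (2) we may consider $\widehat{\mathcal M}$ as an operator acting on $\mathcal{B}(\alpha)$ and we find:

Proposition 4.2 (Quasicompactness of $\widehat{\mathcal M}$). The operator $\widehat{\mathcal M}$ acting on $\mathcal{B}(\alpha)$ for $\alpha > 1/\gamma$ has spectral radius equal to 1 and essential spectral radius strictly smaller than 1. The spectral radius is a simple eigenvalue with a corresponding eigenfunction which coincides with
\[
\tilde\rho_{\omega_0\wedge\omega^+}(x) = \int h^\omega(x)\, \Theta^-(d\omega^-) , \tag{4.13}
\]
where we use the notations $\Omega = \Omega^- \times E \times \Omega^+$, $\omega = \omega^- \wedge \omega_0 \wedge \omega^+$, and $\Theta^-$ for the marginal of the measure $\Theta$ on $\Omega^-$. The corresponding eigenfunctional for $\widehat{\mathcal M}^*$ is the positive measure $\tilde\nu$ on $M \times \Omega^+$ defined by $\tilde\nu(\varphi) = \int \nu^{\omega^+}(\varphi_{\omega^+})\, \Theta^+(d\omega^+)$. The probability measure $\tilde\nu\tilde\rho/\tilde\nu(\tilde\rho)$ is the unique relativised quenched equilibrium state for $\tau^+, g, \theta$. Note in particular that $\tilde\rho(x, \omega) \in \mathcal{B}(\alpha)$.

Proof of Proposition 4.2. The claims about the spectral radius and essential spectral radius follow by the same adaptation of the results of Ruelle [1990] and Fried [1995] as in Proposition 3.1. Since we have assumed that the weight $g$ is positive, $\widehat{\mathcal M}$ has a simple maximal fixed point also when acting on $L^1(\bar\nu)$ for the unique probability measure $\bar\nu$ such that $\widehat{\mathcal M}^* \bar\nu = \bar\nu$. Note that $\tilde\rho(x, \omega) \in L^\infty(\bar\nu) \subset L^1(\bar\nu)$. We have
\[
\begin{aligned}
(\widehat{\mathcal M}\tilde\rho)(x, \omega^+)
&= \int \frac{1}{\lambda^{\xi\wedge\omega^+}} L_\xi \tilde\rho_{\xi\wedge\omega^+}\, \theta(d\xi) \\
&= \iint \frac{1}{\lambda^{\xi\wedge\omega^+}} L_\xi h^{\omega^- \cdot \wedge \xi \wedge \omega^+}(x)\, \Theta^-(d\omega^-)\theta(d\xi) \\
&= \iint h^{\omega^- \wedge \xi \cdot \wedge \omega^+}(x)\, \Theta^-(d\omega^-)\theta(d\xi) = \tilde\rho(x, \omega^+) .
\end{aligned} \tag{4.14}
\]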
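Also, since $L^*_\xi \nu^{\omega^+} = \lambda^{\xi\wedge\omega^+} \nu^{\xi\wedge\omega^+}$, we get for $\varphi \in L^1(\tilde\nu)$:
\[
\begin{aligned}
\widehat{\mathcal M}^* \tilde\nu(\varphi)
&= \int \nu^{\omega^+}\bigl((\widehat{\mathcal M}\varphi)_{\omega^+}\bigr)\, \Theta^+(d\omega^+) \\
&= \iint \nu^{\omega^+}\Bigl(\frac{1}{\lambda^{\xi\wedge\omega^+}} L_\xi \varphi_{\xi\wedge\omega^+}\Bigr)\, \theta(d\xi)\, \Theta^+(d\omega^+) \\
&= \iint \nu^{\xi\wedge\omega^+}(\varphi_{\xi\wedge\omega^+})\, \theta(d\xi)\, \Theta^+(d\omega^+) = \tilde\nu(\varphi) ,
\end{aligned} \tag{4.15}
\]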
so that $\tilde\nu = \bar\nu$. Since $\widehat{\mathcal M}$ is just the transfer operator associated to the weight $g_{\omega_0}(x)/\lambda^\omega$, the same computation as (3.6) shows that $\mu$ is $\tau^+$-invariant (the fact that the weight now depends on the full sequence $\omega$ and that the eigenfunction $\tilde\rho$ depends on $\omega$ play no role there). Finally, we can check along the lines of the proof of Proposition 3.2 that the measure $\tilde\nu\tilde\rho/\tilde\nu(\tilde\rho)$ is a one-sided quenched relativised equilibrium state.

When $\lambda^\omega = \lambda^{\omega_0}$ we have that $\tilde\rho_\omega(x)$ only depends on $x$ by Remark 4.1. In that case, the marginal on $M$ of the quenched state $\mu^{(q)} = \tilde\rho\tilde\nu/\tilde\nu(\tilde\rho)$ is equal to $\tilde\rho\bar\nu$, where $\tilde\rho$ and $\bar\nu$ are the maximal eigenfunctions of $\mathcal M$ acting on $C^r(M)$ and its dual $\mathcal M^*$.

Finally, we have the following stability result which implies Proposition 3 (2):

Proposition 4.3 (Quenched stability). Consider a positively weighted small random perturbation of $f_0 \in C^r_\gamma(M, M)$, $g_0 \in C^r(M, \mathbb{R}^{+*})$ given by a family $\theta_\varepsilon$ ($\varepsilon \ge 0$), and write $\widehat{\mathcal M}_\varepsilon$, $\widehat{\mathcal L}_0$ ($\varepsilon \ge 0$) for the corresponding transfer operators acting on $\mathcal{B}(\alpha)$ ($\alpha > 1/\gamma$). Write $R_0 = \exp P(\log |g_0|)$ as usual. The spectrum of $R_0 \widehat{\mathcal M}_\varepsilon$ acting on $\mathcal{B}(\alpha)$ contains only isolated eigenvalues of finite multiplicity outside of the disc of radius $R_0/\alpha$ for small enough $\varepsilon$, and converges to the spectrum of $\widehat{\mathcal L}_0$ (outside of this disc) acting on $\mathcal{B}$ as $\varepsilon \to 0$. The corresponding generalised eigenspaces of $\widehat{\mathcal M}_\varepsilon$ converge in the $\mathcal{B}(\alpha)$ topology to those of $\widehat{\mathcal L}_0$, and the dual eigenspaces converge in the weak topology. In particular the maximal eigenmeasure $\tilde\nu_\varepsilon$ converges to $\nu_0 \times \delta^{\mathbb{Z}_+}$ with $\delta$ the Dirac mass at $(f_0, g_0)$.

Proof of Proposition 4.3. We first claim that for any $0 < \beta < 1$ the operators $\widehat{\mathcal M}_\varepsilon$ ($\varepsilon \ge 0$) have the same spectral radius when acting on $\mathcal{B}(\alpha)$ or $\mathcal{B}(\alpha, \beta)$ (where $\mathcal{B}(\alpha, \beta)$ is obtained by replacing Lipschitz by $\beta$-Hölder in the definition of $\mathcal{B}(\alpha)$), that their essential spectral radius acting on $\mathcal{B}(\alpha, \beta)$ is not bigger than $1/\alpha^\beta$ and that their eigenvalues of modulus larger than $1/\alpha^\beta$ coincide with those of $\widehat{\mathcal M}_\varepsilon$ acting on $\mathcal{B}(\alpha)$. (Analogous properties hold for $\widehat{\mathcal L}_\varepsilon$.) This is obtained again by adapting the results of Ruelle [1990] or Fried [1995]. It therefore suffices to show the claim for $\widehat{\mathcal M}_\varepsilon$ and $\widehat{\mathcal L}_0$ acting on $\mathcal{B}(\alpha, \beta)$ for all $0 < \beta < 1$.
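Writing $R_\varepsilon$ for the spectral radius of $\widehat{\mathcal L}_\varepsilon$, we have by definition of $\widehat{\mathcal M}_\varepsilon$ and (4.9), (4.11) that for all $n \ge 1$ and small enough $\varepsilon$,
\[
\begin{aligned}
\| R_0^n \widehat{\mathcal M}^n_\varepsilon - \widehat{\mathcal L}^n_\varepsilon \|_{\mathcal{B}(\alpha,\beta)}
&\le \sup_{(\xi_1\cdots\xi_n\omega)\in \operatorname{support} \theta_\varepsilon^{\mathbb{Z}_+}}
\left| \frac{R_0^n}{\lambda^{\xi_n\omega}_\varepsilon \cdots \lambda^{\xi_1\cdots\xi_n\omega}_\varepsilon} - 1 \right| \| \widehat{\mathcal L}^n_\varepsilon \|_{\mathcal{B}(\alpha,\beta)} \\
&\quad + \sup_{(\xi_1\cdots\xi_n\omega)\in \operatorname{support} \theta_\varepsilon^{\mathbb{Z}_+}}
\frac{R_0^n}{\lambda^{\xi_n\omega}_\varepsilon \cdots \lambda^{\xi_1\cdots\xi_n\omega}_\varepsilon}
\cdot \sum_{j=1}^n \frac{H^\beta_\omega(\lambda^{\xi_j\cdots\xi_n\omega}_\varepsilon)}{\lambda^{\xi_j\cdots\xi_n\omega}_\varepsilon}\, \| \widehat{\mathcal L}^n_\varepsilon \|_{\mathcal{B}(\alpha,\beta)} \\
&\le c_{n,\varepsilon} R^n_\varepsilon ,
\end{aligned} \tag{4.16}
\]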
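with $c_{n,\varepsilon}$ a constant tending to zero when $\varepsilon \to 0$ for each fixed $n$, and $H^\beta_\omega(\psi(\omega))$ the $\beta$-Hölder constant of a function $\psi : \Omega^+ \to \mathbb{R}$. It follows from (4.16) and Proposition 3.4 (which also holds for $\widehat{\mathcal L}_\varepsilon$ when considering $\mathcal{B}(\alpha, \beta)$, up to replacing $\alpha$ by $\alpha^\beta$ in (3.26)) that the analogue of Proposition 3.4 holds for $\widehat{\mathcal M}_\varepsilon$, i.e., for any fixed $\psi \in \mathcal{B}(\alpha, \beta)$, and $n \ge 1$,
\[
\lim_{\varepsilon\to 0} \| R_0^n \widehat{\mathcal M}^n_\varepsilon \psi - \widehat{\mathcal L}^n_0 \psi \|_{\mathcal{B}(\alpha,\beta)} = 0 , \tag{4.17}
\]
and there is a constant $C > 0$ and an integer $N \ge 0$, so that for all $n \ge N$ there is $\varepsilon(n)$ such that for all $\varepsilon < \varepsilon(n)$,
\[
\| R_0^n \widehat{\mathcal M}^n_\varepsilon - \widehat{\mathcal L}^n_0 \|_{\mathcal{B}(\alpha,\beta)} \le C R^n_\varepsilon \alpha^{n\beta} . \tag{4.18}
\]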
We may thus use Lemma 3 in Baladi–Young [1993] just as in the proof of Proposition 3.5.

We may define correlation functions associated with the quenched Gibbs state $\mu^{(q)} = \tilde\rho\tilde\nu$ and test functions in $\mathcal{B}(\alpha)$ or $\mathcal{B}(\alpha, \beta)$ ($\alpha > 1/\gamma$ and $0 < \beta < 1$) and proceed as in Sect. 3.5. The relevant spectrum is that of $\widehat{\mathcal M}$ acting on $\mathcal{B}(\alpha)$ and we get Theorem 4 (2) by Proposition 4.2 and Theorem 5 (2) by Proposition 4.3.

Note finally that a quenched zeta function $\zeta^{(q)}$ may be introduced by normalising $\zeta^{(a)}_m$ in the definition in (2.24) through the $\lambda^\omega$ associated with $m$-periodic sequences $\omega$. The results of Ruelle [1990] apply again, relating the discrete spectrum of $\widehat{\mathcal M}$ and the poles of $\zeta^{(q)}(z)$.

Appendix A

Proof of (3.5). We show here how the proof of Lemma 1 in Fried [1995] can be adapted to our skew-product situation, trying to keep close to the notation there. For $\varphi \in \mathcal{B}(\alpha)$ we define bounded functions $\nu_j(x, \omega)$ ($j = 0, \ldots, r, r + 1$) on $M \times \Omega^+$ by setting
\[
\nu_j(\varphi)(x, \omega) =
\begin{cases}
|\varphi(x, \omega)| & j = 0 , \\
\| D^j_x \varphi(\cdot, \omega) \| & 1 \le j \le r , \\
\operatorname{Lip}_\omega D^{j-(r+1)} \varphi(x, \cdot) & r + 1 \le j \le 2r + 1 ,
\end{cases} \tag{A.1}
\]
where $\operatorname{Lip}_\omega \psi(x, \cdot)$, for $\psi$ a complex or matrix-valued function, is the smallest constant $K(x)$ so that
\[
\| \psi(x, \tilde\omega) - \psi(x, \bar\omega) \| \le K(x)\, d_\alpha(\tilde\omega, \bar\omega) ,
\]
for all $\tilde\omega$, $\bar\omega$ in $\Omega^+$ (where $\| \cdot \|$ denotes complex modulus or matrix norm). Just like Fried [1995, p. 1063], we find that for all $j$ there are numbers $F_{jk}$ so that
\[
\nu_j\bigl((\psi^n)^* \varphi\bigr)(x, \omega) \le \sum_{k=0}^{j} F_{jk}\, \nu_k(\varphi)(\psi^n(x, \omega)) , \tag{A.2}
\]
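with $F_{jj} = (\gamma^{-j})^n$ for $0 \le j \le r$, $F_{j,j} = \alpha^n (\gamma^{-j+(r+1)})^n$ for $r + 1 \le j \le 2r + 1$ and $F_{j\ell} = 0$ for $\ell = 0 < j$. It remains to estimate
\[
\sup_{(x,\omega)\in \operatorname{Im} \psi^n} \nu_j(\varphi - T\varphi)(x, \omega) \tag{A.3}
\]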
for functions $\varphi$ with $\|\varphi\| \le 1$. Obviously
\[
\begin{aligned}
\| D^j \varphi(x, \omega) - D^j T(\tilde x, \tilde\omega)(\varphi)(x, \omega) \|
&\le \| D^j \varphi(x, \omega) - D^j T(\tilde x, \omega)(\varphi) \| \\
&\quad + \| D^j T(\tilde x, \omega)(\varphi) - D^j T(\tilde x, \tilde\omega)(\varphi) \| .
\end{aligned} \tag{A.4}
\]
To bound the first term in the right-hand side of (A.4) it is useful to observe that $M$ may be embedded in euclidean space such that for any $x$, $y$ in $M$ there is a piecewise linear path between them with length bounded by a uniform constant times $d_M(x, y)$, and such that the local inverse branches $(f_\xi)^{-1}_i$ may be extended to an open neighbourhood of $M$ with uniformly bounded derivatives (this is a weakened but sufficient version of assumptions (1)–(3) in Fried [1995, p. 1062]). Therefore the arguments of Fried [1995, p. 1064] yield
\[
\sup_{(x,\omega)\in \operatorname{Im} \psi^n} \nu_j((1 - T)\varphi)(x, \omega) \le \gamma^{-n(r-j)} + \alpha^n ,
\]
for $0 \le j \le r$ and
\[
\sup_{x\in \operatorname{Im} \psi^n} \nu_j((1 - T)\varphi(x, \omega)) \le 1 ,
\]
for $r + 1 \le j \le 2r + 1$ so that the proof of (3.5) may be completed just as the proof of Lemma 1 in Fried [1995].

Appendix B

The non i.i.d. case. We use the setup of Sect. 2.1, except that we do not assume that the $\sigma^+$-invariant and mixing probability measure $\Theta^+$ on $\Omega^+$ is a product measure, and indicate how our results could be extended. Since $\Theta^+$ is $\sigma^+$-invariant, its decomposition on $E \times \Omega^+$ takes the special form
\[
\Theta^+(d\omega) = \theta^{\sigma^+ \omega}(d\omega_0)\, \Theta^+(d\sigma^+\omega) . \tag{B.1}
\]
We now assume further that the functionals $\theta^\omega$ are Lipschitz functions of $\omega \in \Omega^+$ for some metric $d_\alpha$. We then define the annealed integrated operator $\widehat{\mathcal L}$ acting on $\mathcal{B}(\alpha)$ by
\[
(\widehat{\mathcal L}\varphi)_\omega(x) = \int_E (L_\xi \varphi_{\xi\omega})(x)\, \theta^\omega(d\xi) . \tag{B.2}
\]
There is in general no operator $\mathcal L$ acting on $\mathcal F$ in the non-i.i.d. setting, but our main results (quasicompactness, annealed zeta function, stability of spectrum, etc.) should hold as before (see Remark 2.1, note however that neither the quenched nor the annealed one-sided SRB state is a product measure on $M \times \Omega^+$ in general). The definition of the annealed Gibbs state is unchanged, and the definition of the annealed equilibrium state is (2.10), with the following formula for the specific entropy per site of a $\sigma^+$-invariant measure $\upsilon$ with respect to the family $\theta$ of a priori measures $\theta^\omega(d\xi)$ on $E$:
\[
h^\theta(\upsilon) = - \int_{\Omega^+} \log \beta(\xi\omega)\, \upsilon(d(\xi\omega)) \tag{B.3}
\]
if $\upsilon(d(\xi\omega))$ is absolutely continuous with respect to $\theta^\omega(d\xi)\upsilon(d\omega)$, with Radon–Nikodym derivative $\beta(\xi\omega)$, and otherwise, $h^\theta(\upsilon) = -\infty$. The operator $\widehat{\mathcal M}$ can be defined as in (4.12), replacing $\theta(d\xi)$ by $\theta^\omega(d\xi)$ and the results on annealed and quenched states should hold in this more general setting.

References

1. Baladi, V., Kitaev, A., Ruelle, D., Semmes, S.: Sharp determinants and kneading operators for holomorphic maps. IHES preprint (1995), to appear Proc. Steklov Math. Inst.
2. Baladi, V., Kondah, A., Schmitt, B.: Random correlations for small perturbations of expanding maps. Random and Computational Dynamics 4, 179–204 (1996)
3. Baladi, V., Young, L.-S.: On the spectra of randomly perturbed expanding maps. Commun. Math. Phys. 156, 355–385 (1993) (see also Erratum, Commun. Math. Phys. 166, 219–220 (1994))
4. Bogenschütz, T.: Entropy, pressure and a variational principle for random dynamical systems. Random and Computational Dynamics 1, 219–227 (1992)
5. Bogenschütz, T.: Stochastic stability of equilibrium states. Random and Computational Dynamics 4, 85–98 (1996)
6. Bogenschütz, T., Crauel, H.: The Abramov-Rokhlin formula. Ergodic Theory and Related Topics III, Proceedings 1990 (U. Krengel, K. Richter and V. Warstadt, ed.), Lecture Notes in Math. 1514, New York–Berlin: Springer-Verlag
7. Bogenschütz, T., Gundlach, V.M.: Ruelle's transfer operator for random subshifts of finite type. Ergodic Theory Dynamical Systems 15, 413–447 (1995)
8. Christiansen, F., Cvitanović, P., Rugh, H.H.: The spectrum of the period-doubling operator in terms of cycles. J. Phys. A 23, L713–L717 (1990)
9. Dunford, N., Schwartz, J.T.: Linear Operators, Part I: General Theory. New York: Wiley (Wiley Classics Library Edition), 1988
10. Fried, D.: The flat-trace asymptotics of a uniform system of contractions. Ergodic Theory Dynamical Systems 15, 1061–1073 (1995)
11. Georgii, H.-O.: Gibbs Measures and Phase Transitions. Berlin–New York: De Gruyter (Studies in Mathematics), 1988
12. Jiang, Y., Morita, T., Sullivan, D.: Expanding direction of the period doubling operator. Commun. Math. Phys. 144, 509–520 (1992)
13. Khanin, K., Kifer, Y.: Thermodynamic formalism for random transformations and statistical mechanics. Sinai's Moscow Seminar on Dynamical Systems (Amer. Math. Soc. Translations Series 2, Vol. 171), Providence, RI: Am. Math. Soc., 1996
14. Kifer, Y.: Principal eigenvalues, topological pressure, and stochastic stability of equilibrium states. Israel J. Math. 70, 1–47 (1990)
15. Kifer, Y.: Equilibrium States for Random Expanding Transformations. Random and Computational Dynamics 1, 1–31 (1992)
16. Lanford III, O.E., Ruedin, L.: Statistical mechanical methods and continued fractions. Helv. Phys. Acta 69, 908–948 (1996)
17. Ledrappier, F.: Pressure and variational principle for random Ising model. Commun. Math. Phys. 56, 297–302 (1977)
18. Ledrappier, F., Walters, P.: A relativised variational principle for continuous transformations. J. London Math. Soc. 16, 568–576 (1977)
19. Mézard, M., Parisi, G., Virasoro, M.A.: Spin glass theory and beyond. World Scientific Lecture Notes in Physics, Vol. 9, Singapore, New Jersey, Hong Kong: World Scientific, 1987
20. Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Soc. Math. France. Astérisque 187–188, Paris, 1990
21. Pinsker, M.S.: Information and information stability of random variables and processes. San Francisco: Holden-Day, 1964
22. Pollicott, M.: On the rate of mixing of Axiom A flows. Invent. Math. 81, 413–426 (1985)
23. Ruedin, L.: Statistical mechanical methods and continued fractions. Ph.D. Thesis, ETH Zürich, 1994
24. Ruelle, D.: Thermodynamic Formalism. Reading, MA: Addison-Wesley, 1978
25. Ruelle, D.: One-dimensional Gibbs states and Axiom A diffeomorphisms. J. Differ. Geom. 25, 117–137 (1987)
26. Ruelle, D.: The thermodynamic formalism for expanding maps. Commun. Math. Phys. 125, 239–262 (1989)
27. Ruelle, D.: An extension of the theory of Fredholm determinants. Inst. Hautes Études Sci. Publ. Math. 72, 175–193 (1990)
28. Ruelle, D.: Functional determinants related to dynamical systems and the thermodynamic formalism. (Lezioni Fermiane Pisa, 1995), Preprint (1995)
29. Walters, P.: An Introduction to Ergodic Theory. New York: Springer-Verlag, 1982

Communicated by Ya. G. Sinai
Commun. Math. Phys. 186, 701–730 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997

Initial Boundary Value Problem for Conservation Laws

Pui Tak Kan (1,*), Marcelo M. Santos (2,**), Zhouping Xin (3,***)

1 Department of Mathematics, IUPUI
2 Departamento de Matemática, IMECC–UNICAMP
3 Courant Institute of Mathematical Sciences, New York University and Department of Mathematics, Harvard University

Received: 23 July 1996 / Accepted: 28 October 1996
Abstract: This paper concerns the initial boundary value problems for some systems of quasilinear hyperbolic conservation laws in the space of bounded measurable functions. The main assumption is that the system under study admits a convex entropy extension. It is proved that then any twice differentiable entropy fluxes have traces on the boundary if the bounded solutions are generated by either Godunov schemes or by suitable viscous approximations. Furthermore, in the case that the weak interior solutions are generated by Godunov schemes, any Lipschitz continuous entropy fluxes corresponding to convex entropies have traces on the boundary and the traces are bounded above by computable numerical boundary values. This in particular gives a trace formula for the flux functions in terms of the numerical boundary data. We also investigate the formulation of boundary conditions for systems of hyperbolic conservation laws. It is shown that the set of expected boundary values derived from the viscous approximation contains the one derived in terms of the boundary Riemann problems, and the converse is not true in general. The general theory is then applied to some specific examples. First, several new facts are obtained for convex scalar conservation laws. For example, we give an example which shows that Godunov schemes produce numerical boundary layers. It is shown that any continuous functions of density have traces on the boundary (instead of only entropy fluxes). We also obtain interior and boundary regularity of the weak solutions for bounded measurable initial and boundary data. A generalized Oleinik entropy condition is also obtained. Next, we prove the existence of a weak solution to the initial-boundary value problem for a family of 2 × 2 quadratic systems with a uniformly characteristic boundary condition.
* NSF grant DMS-94-04341
** Supported in part by CNPq/Brazil, proc. 453613/95-0 and 300050/92-5.
*** Supported in part by a Sloan Foundation Fellowship, NSF Grant DMS-93-03887, and Department of Energy Grant De-FG02-88ER-25053.
1. Introduction

The theory for initial boundary value problems for systems of quasilinear conservation laws distinguishes itself from the one for Cauchy problems due to the complexities introduced by the boundaries. As even in the linear case, the very definitions of the appropriate boundary conditions themselves are an important issue in both the theoretical understanding and the numerical approximations. Furthermore, even in the case that the boundary conditions can be easily formulated (such as uniformly non-characteristic boundary conditions), one still has to obtain traces on the boundary for some functions of density variables to make sense of the boundary conditions for weak solutions, which are in general only bounded measurable functions due to the nonlinearity of the density fluxes. Also in the numerical computations of weak solutions in the presence of boundaries, one usually chooses some kind of up-winding schemes (such as Godunov type schemes) near the boundary to satisfy the physical boundary conditions. These exclude numerical boundary layers for smooth flows and linear problems. It is of practical significance to know whether up-winding type schemes exclude numerical boundary layers for weak solutions of nonlinear problems. In this paper, we address some of these issues associated with the initial boundary value problems for nonlinear hyperbolic conservation laws and their numerical approximations. Specifically, we consider the following initial boundary value problem (IBVP) for systems of conservation laws:
\[
\begin{cases}
U_t + F(U)_x = 0, & x > 0,\ t > 0 \\
U(x, 0) = U_0(x), & x > 0 \\
\text{Boundary condition at } x = 0,\ t > 0,
\end{cases} \tag{1.1}
\]
where $U \in \mathcal{U} \subset \mathbb{R}^n$, $F : \mathcal{U} \to \mathbb{R}^n$, and $U_0$ is the initial datum, and the conditions at the boundary $x = 0$ are either Dirichlet type conditions ((2.3)) or in general form ((2.2)). We will assume that the system (1.1) admits a convex entropy extension. First we consider a question associated with formulations of boundary conditions for IBVP (1.1).
It is well known that IBVP (1.1) is not well-posed in general as it stands. This necessitates formulating conditions for the solutions at the boundary $x = 0$ so that the problem becomes well-posed. Historically, there are two natural ways to achieve this. One is based on the idea that the appropriate boundary conditions for the inviscid equations should be consistent with the regularized problems when dissipations are taken into account, so the boundary set is described in terms of entropy–entropy flux pairs (as in Definition 3.1), which we will call the viscous set $\mathcal{V}$. See [BLN, DL, He]. Another way is based on solving the half-space Riemann problems so that numerical approximated solutions can be defined as first proposed for general systems by Goodman [Go], for gas dynamics by Liu [LI] and Nishida-Smoller [NS], see also [DL], which we will call the Riemann set $\mathcal{R}$ (see Definition 3.2). It is known [DL] that these two formulations are equivalent for scalar conservation laws. In Sect. 3, we will show that the second formulation based on solving the half-space Riemann problems is more stringent than the first formulation by the vanishing viscosity method. In fact, we will show that for arbitrary systems, $\mathcal{R} \subset \mathcal{V}$, and $\mathcal{V} \not\subset \mathcal{R}$ unless some geometric conditions on the wave curves are satisfied (see Theorem 3.4). We note that part of this result was announced by B-S in [BS] for some special cases. However we give an elementary but simple proof using a construction of the entropy-entropy flux pair due to Dafermos-DiPerna [DD]; furthermore, we derive a general condition for $\mathcal{V} \not\subset \mathcal{R}$. Next, we study the problem of traces of weak solutions to IBVP (1.1). We consider approximate solutions $U^\varepsilon = U^\varepsilon(x, t)$, $\varepsilon > 0$,
$x, t > 0$, to (1.1) obtained, e.g., by either numerical methods or vanishing viscosity limit approaches. Assume that $U^\varepsilon$ is measurable and uniformly bounded (in $\varepsilon$) and we can take the weak limit of $U^\varepsilon$, in $x > 0$, $t \ge 0$ when the parameter $\varepsilon$ goes to zero, such that in the limit we obtain a bounded measurable function $U(x, t)$, which is an interior solution of (1.1) in $x > 0$, $t \ge 0$ with initial datum $U_0(x)$. Due to the possible appearance of boundary layers, we cannot expect in general to take the weak limit including the boundary $x = 0$. Then a number of natural questions arise: What is the behavior of the interior solution $U(x, t)$ as $x$ goes to zero? What is the meaning that the boundary conditions in (1.1) are satisfied? Is there a trace of the density variables $U$, or of some functions composed with $U$, at the boundary $x = 0$? Are there boundary layers in the approximations $U^\varepsilon(x, t)$? Here by boundary layer we mean that the weak limit of $U^\varepsilon(0, t)$ as $\varepsilon$ goes to zero is different from the trace of $U$ at $x = 0$ if both exist. See Definition 5.6. The rest of this paper is devoted to answering some of these questions. In Sect. 4, we first study the behavior near the boundary of any smooth (twice differentiable) entropy flux $q(U)$ composed with a bounded measurable interior solution $U(x, t)$ generated by either Godunov schemes (see Sect. 2) or a viscous approximation (see (4.4)). By analysing the limiting behavior of the measure $\partial_t \eta(U^\varepsilon(x, t)) + \partial_x q(U^\varepsilon(x, t))$ (where $\eta$ is the entropy corresponding to the entropy flux $q$), we prove that $q(U(x, t))$ has a weak trace $\gamma q(U) \in L^\infty(\mathbb{R}^+)$, see Theorem 4.1. Furthermore, in the case that $U(x, t)$ is generated by the Godunov schemes, a much stronger result (Theorem 4.3) is obtained. In fact, we prove that for any entropy–entropy flux pair $(\eta, q)$ of class $W^{1,\infty}$ with $\eta$ being a convex function, $q(U(x, t))$ has a weak trace $\gamma q(U) \in L^\infty(\mathbb{R}^+)$ at $x = 0$, and
\[
\gamma q(U) \le w^*\text{-}\lim_{\varepsilon\to 0} q(U^\varepsilon(0, \cdot)) . \tag{1.2}
\]
In particular, the density flux $F(U)$ has a trace such that
\[
\gamma F(U) = w^*\text{-}\lim_{\varepsilon\to 0} F(U^\varepsilon(0, \cdot)) . \tag{1.3}
\]
It should be emphasized that the estimates (1.2) and (1.3) should be important in understanding the behavior of the weak solution $U(x, t)$ near the boundary $x = 0$ since the right hand sides of (1.2) and (1.3) are computable in terms of numerical boundary values. Indeed, in the case of scalar convex conservation laws, we show that (1.2) implies that the strong trace of $U(x, t)$ exists and (1.2) becomes an identity provided that the wave speed never vanishes (see Theorem 5.7). Also (1.3) implies that if there is a boundary layer for the scalar convex conservation law, it must be an entropy standing shock (see Theorem 5.11). In the study of IBVP for a family of quadratic systems, which is not strictly hyperbolic at the origin, with the characteristic boundary condition (6.3) (see Sect. 6), we can apply (1.2) to show that the boundary condition is satisfied in a strong sense so we obtain a global (in time) weak solution to the IBVP (6.1)–(6.3). The fact (Theorem 4.1) that any smooth entropy flux $q(U(x, t))$ has a trace $\gamma q(U)$ at $x = 0$ should be also useful. Applying this fact, we can show that for the scalar convex conservation law, the boundary Young measure associated with the limiting behavior of $U(x, t)$ as $x \to 0^+$ is in fact unique, and consequently any continuous function of $U$ has a well-defined trace at $x = 0$; in particular, $U(x, t)$ itself has a trace at $x = 0$ (see Theorem 5.4). In Sect. 5, we give an example for the Burgers equation to show that the Godunov scheme for IBVP can also introduce boundary layers provided that the wave speeds are allowed to change sign. This is somewhat surprising to us since it is in general believed that due to its up-winding property, the Godunov scheme excludes numerical boundary layers. The reason for the appearance of the boundary layer in the Godunov approximation is
due to the fact that the boundary is not uniformly non-characteristic (see Theorem 5.8). We conjecture that similar conclusions hold for general systems. The initial boundary value problems for scalar convex conservation laws with Dirichlet boundary condition via Godunov schemes are studied in great detail in Sect. 5. Besides the results mentioned in the above paragraph, we also obtain many other results on the stability of the Godunov scheme, boundary and interior regularity of the solution, etc. In particular, we obtain a generalized version of Oleinik's entropy condition for IBVP (Theorem 5.3) which yields interior BV regularity immediately even though both the initial and boundary data are assumed only bounded measurable. In [Go], Goodman proved global existence of the weak solution to IBVP for strictly hyperbolic systems with noncharacteristic boundary conditions provided that both initial and boundary data have small BV norms. In this case, the solutions also have small total variation, and thus have a strong trace so the boundary condition is always satisfied in the strong sense. However, there are many cases in which even small BV solutions cannot be obtained. A prototype example of this is the 2 × 2 systems with quadratic fluxes. For such systems, the only method (so far) to give existence of a weak solution is the theory of compensated compactness [CK1]. Thus one is forced to work in the space of bounded measurable functions. In Sect. 6, we study the IBVP for the following systems:
\[
\begin{cases}
u_t + \tfrac{1}{2}(a u^2 + v^2)_x = 0, & x > 0,\ t > 0 , \\
v_t + (uv)_x = 0, & x > 0,\ t > 0 ,
\end{cases} \tag{1.4}
\]
where $1 < a < 2$, and $U \equiv (u, v) \in \mathbb{R}^2_+ \overset{\mathrm{def}}{=} \{(u, v) \in \mathbb{R}^2 ;\ v \ge 0\}$, with initial datum
\[
U(x, 0) = U_0(x), \qquad U_0 \in L^2 \cap L^\infty , \tag{1.5}
\]
and the following boundary condition
\[
\sqrt{a}\, u(0, t) - v(0, t) = 0 . \tag{1.6}
\]
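The boundary line in (1.6) can be related to the characteristic speeds of (1.4) by a short worked computation. The following is an illustrative sketch added for orientation, not a formula quoted from the text; it only uses the flux $F(u, v) = (\tfrac12(a u^2 + v^2),\, uv)$ read off from (1.4).

```latex
\[
DF(U) = \begin{pmatrix} a u & v \\ v & u \end{pmatrix},
\qquad
\det\bigl(DF(U) - \lambda I\bigr)
  = \lambda^2 - (a+1)\,u\,\lambda + \bigl(a u^2 - v^2\bigr).
\]
% On the line v = \sqrt{a}\,u the constant term a u^2 - v^2 vanishes, hence
\[
\lambda\,\bigl(\lambda - (a+1)u\bigr) = 0
\quad\Longrightarrow\quad
\lambda = 0 \quad\text{or}\quad \lambda = (a+1)\,u .
\]
% In general the discriminant is (a-1)^2 u^2 + 4 v^2, which vanishes only at u = v = 0.
```

So one characteristic speed is identically zero along $\sqrt{a}\,u = v$, and the two speeds coincide only at the origin, which is consistent with the loss of strict hyperbolicity at the origin mentioned above.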
Note that the boundary condition (1.6) is uniformly characteristic for the system (1.4). We prove that the IBVP (1.4)–(1.6) has a global weak solution. Furthermore, as an application of a general trace theorem, we show that the boundary condition (1.6) is satisfied in a strong sense. The rest of this paper is organized as follows. In Sect. 2, we describe the Godunov approximations for the IBVP (1.1). Section 3 contains some discussions on the formulations of the boundary set corresponding to either viscosity approximation or half-space Riemann problems. An elementary and simple proof is given to show that the Riemann set is more stringent than the viscous set. The trace theorems on entropy fluxes are proved in Sect. 4. The IBVP via Godunov schemes are analyzed in Sect. 5. Then in Sect. 6, we prove the existence of a global weak solution to the IBVP (1.4)–(1.6). Finally, the details of the proof of the generalized Oleinik entropy condition stated in Sect. 5 are given in the Appendix. We conclude this introduction by pointing out that there is an extensive literature on the study of initial boundary value problems for hyperbolic conservation laws. We refer the reader to [He] for a more complete list of references.

2. Godunov Schemes for IBVP

In this section, we will describe the Godunov method for the IBVP (1.1), and make some conventions we will use throughout this paper. We refer the reader to [Sm] and [Le] for the usual definitions and terminologies.
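For orientation, the Godunov method on a half-line can be sketched in the simplest scalar setting. The block below is a minimal illustration under stated assumptions, not the scheme as defined by the authors for systems: it assumes the Burgers flux $f(u) = u^2/2$, a uniform grid on $x > 0$, a Dirichlet datum imposed by solving a Riemann problem between the boundary value and the first cell, and a zero-gradient closure at the right end. All function names are hypothetical.

```python
import numpy as np


def burgers_flux(u):
    """Burgers flux f(u) = u^2 / 2 (assumed model problem)."""
    return 0.5 * u * u


def godunov_flux(ul, ur):
    """Godunov numerical flux for the convex flux f with minimum at u = 0."""
    if ul <= ur:
        # Rarefaction: the flux is the minimum of f over [ul, ur].
        if ul > 0.0:
            return burgers_flux(ul)
        if ur < 0.0:
            return burgers_flux(ur)
        return 0.0
    # Shock: the flux is the maximum of f over [ur, ul].
    return max(burgers_flux(ul), burgers_flux(ur))


def godunov_step(u, u_boundary, dx, dt):
    """Advance the cell averages u on x > 0 by one time step.

    The flux at x = 0 comes from the Riemann problem between the
    prescribed datum u_boundary and the first cell average.
    Assumes the CFL condition dt * max|u| <= dx.
    """
    n = len(u)
    flux = np.empty(n + 1)
    flux[0] = godunov_flux(u_boundary, u[0])
    for j in range(1, n):
        flux[j] = godunov_flux(u[j - 1], u[j])
    # Zero-gradient closure at the right end (an assumption of this sketch).
    flux[n] = godunov_flux(u[n - 1], u[n - 1])
    return u - (dt / dx) * (flux[1:] - flux[:-1])


if __name__ == "__main__":
    u = -np.ones(50)
    for _ in range(100):
        u = godunov_step(u, u_boundary=1.0, dx=0.02, dt=0.01)
    print(u[:3])
```

In the usage example, the datum at the boundary is $+1$ while the interior state is $-1$; the Riemann problem at $x = 0$ is then solved by a shock of speed zero, so the cell averages remain at $-1$ and differ from the prescribed boundary value. This is in line with the standing-shock description of boundary layers given in the introduction, although the specific example of Sect. 5 may differ.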
We assume that the $n \times n$ system of conservation laws
\[
U_t + F(U)_x = 0, \qquad x > 0,\ t > 0 \tag{2.1}
\]
possesses a strictly convex entropy, is hyperbolic and genuinely nonlinear, and $F : \mathcal{U} \to \mathbb{R}^n$ is of class $C^2$, where $\mathcal{U}$ is a bounded domain of $\mathbb{R}^n$. We denote the eigenvalues of the Jacobian matrix $JF(U)$ by $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. We will refer to the Riemann problem
\[
\begin{cases}
U_t + F(U)_x = 0 \\
U(x, 0) = \begin{cases} U_L, & \text{if } x < 0 \\ U_R, & \text{if } x > 0 \end{cases}
\end{cases} \tag{2.2}
\]
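where $U_L, U_R \in \mathcal{U}$ are constant states, as 0, where $h$ is a given Lipschitz continuous map (cf. Theorem 6.1 in [He]). Again we also assume that $\{U^\varepsilon\}_{\varepsilon>0}$ is uniformly bounded in $L^\infty(\mathbb{R}^+ \times \mathbb{R}^+)$, and that it converges for almost every $x > 0$, $t > 0$ to a solution $U$ of (2.1). Multiplying the first equation in (4.4) by $\nabla\eta(U^\varepsilon)\phi$ and integrating by parts, one obtains
\[
\begin{aligned}
\langle E^\varepsilon, \phi \rangle
&= - \int_0^\infty \bigl[ q(U^\varepsilon)\phi - \nabla\eta(U^\varepsilon) h(U^\varepsilon)\phi \bigr]_{x=0}\, dt \\
&\quad + \varepsilon \int_0^\infty \!\! \int_0^\infty \bigl( \nabla^2\eta(U^\varepsilon) \cdot (U^\varepsilon_x, U^\varepsilon_x)\phi + \nabla\eta(U^\varepsilon) U^\varepsilon_x \phi_x \bigr)\, dx\, dt
\end{aligned} \tag{4.5}
\]
and so
\[
|\langle E^\varepsilon, \phi \rangle| \le C \bigl\{ \|\phi\|_\infty + \|\sqrt{\varepsilon}\, U^\varepsilon_x\|_{L^2} \|\phi\|_\infty + \sqrt{\varepsilon}\, \|\sqrt{\varepsilon}\, U^\varepsilon_x\|_{L^2} \|\phi_x\|_{L^2} \bigr\} . \tag{4.6}
\]
But $\{\sqrt{\varepsilon}\, U^\varepsilon_x\}_{\varepsilon>0}$ is uniformly bounded in $L^2$, as follows from the standard energy estimates and the assumed uniform boundedness of $U^\varepsilon$ in $L^\infty(\mathbb{R}^+ \times \mathbb{R}^+)$; then after taking $\lim_{\varepsilon\to 0^+}$ in (4.5), one can again apply Theorem 1.2 in [An] to obtain (4.3). Thus we have proved the following theorem.

Theorem 4.1 (Trace of entropy fluxes). Let $U = U(x, t) \in L^\infty$ be an interior solution to the IBVP (1.1) and (2.3) obtained either by the Godunov method or by the parabolic approximation (4.4). Then for every entropy–entropy flux pair of class $C^2$, $q(U(x, t))$ has a trace $\gamma q(U) \in L^\infty_t$ at $x = 0$ such that (4.3) is satisfied.

Remark 4.2. The trace $\gamma q(U)$ in Theorem 4.1 is attained in the weak sense:
\[
\gamma q(U) = w^*\text{-}\lim_{x\to 0} q(U(x, \cdot)) = w^*\text{-}\lim_{\delta\to 0} \frac{1}{\delta} \int_0^\delta q(U(x, \cdot))\, dx . \tag{4.7}
\]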
Proof. Setting in (4.3) $\phi(x,t) = \rho(x)\zeta(t)$ with $\rho(X) = \zeta(T) = \zeta(0) = 0$, one then gets
$$\int_0^X\!\!\int_0^T \eta(U)\,\zeta'(t)\rho(x)\,dx\,dt + \int_0^X \rho'(x)\int_0^T q(U)\,\zeta(t)\,dt\,dx + \int_{(0,X)\times(0,T)} \rho\,\zeta\,d\mu = -\rho(0)\int_0^T \gamma q(U)(t)\cdot\zeta(t)\,dt. \tag{4.8}$$
It follows from (4.8) that
$$\Big|\int_0^X \rho'(x)\Big[\int_0^T q(U)\,\zeta(t)\,dt\Big]dx\Big| \le C_\zeta\,\|\rho\|_\infty$$
for all $\rho \in C^1_0(0,X)$, where $C_\zeta$ is a constant that does not depend on $\rho$. Thus
$$\int_0^T q(U(x,t))\,\zeta(t)\,dt$$
is a function in $BV(x \ge 0)$. Consequently, there exists the limit
$$q_\zeta \overset{\mathrm{def}}{=} \lim_{x\to 0}\int_0^T q(U(x,t))\,\zeta(t)\,dt. \tag{4.9}$$
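Now we set $\rho(x) = \rho_\delta(x)$, 0 < δ 0 and $\phi \in C^1(\mathbb{R}^2)$, where
$$\Sigma(\phi) \overset{\mathrm{def}}{=} \int_0^T \sum_{\text{shocks}} \big(s[\eta(U^\varepsilon)] - [q(U^\varepsilon)]\big)\,\phi\,dt,$$
$$L(\phi) \overset{\mathrm{def}}{=} \int_0^X \sum_{n=1}^\infty \phi(x, n\Delta t)\,[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx,$$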
and $[\,\cdot\,]$ denotes the jump of $\cdot$ along a shock. Since all the shocks are admissible, we have $\Sigma(\phi), L(\phi) \ge 0$ if $\eta$ is convex and $\phi \ge 0$, and $\Sigma(\phi) = 0$ if $q = F$ for any $\phi$, by the Rankine–Hugoniot condition. Then (4.12) follows from (4.14) after taking the limit as $\varepsilon$ goes to zero and using the fact that the trace of $q(U(x,t))$ exists and is given by the formula (4.7).

Now let $(\eta, q)$ be a given entropy–entropy flux pair in $C \cap W^{1,\infty}$ with $\eta$ being a convex function. To prove the theorem in this case, one needs only to show that the trace of $q(U(x,t))$ at $x = 0$ exists, since then (4.12) follows in exactly the same way as before. Taking $\phi \equiv 1$ in (4.14) gives
$$\int_0^T \sum_{\text{shocks}} \big(s[\eta(U^\varepsilon)] - [q(U^\varepsilon)]\big)\,dt + \int_0^X \sum_{n=1}^\infty [\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx = -\int_0^X \big(\eta(U^\varepsilon)(x,T) - \eta(U^\varepsilon)(x,0)\big)\,dx - \int_0^T \big(q(U^\varepsilon)(X,t) - q(U^\varepsilon)(0,t)\big)\,dt. \tag{4.15}$$
Since $\eta$ is Lipschitz continuous and convex, one can approximate $\eta$ uniformly by $C^2$ convex functions, so by a standard but lengthy approximation argument, one can show that
$$s[\eta(U^\varepsilon)] - [q(U^\varepsilon)] \ge 0, \tag{4.16}$$
and
$$\int_{j\Delta x}^{(j+1)\Delta x} [\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx \ge 0 \tag{4.17}$$
for any $j$. It follows from (4.15)–(4.17) that
$$\int_0^T \sum_{\text{shocks}} \big|s[\eta(U^\varepsilon)] - [q(U^\varepsilon)]\big|\,dt \le C, \tag{4.18}$$
and
$$\sum_{n=1}^\infty \int_0^X [\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx \le C, \tag{4.19}$$
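where $C$ is a constant depending only on $(\eta, q)$. Using formula (4.14), we have that for any $\phi \in C^1_0((0,\infty)^2)$,
$$\langle E^\varepsilon, \phi\rangle \overset{\mathrm{def}}{=} -\int_0^\infty\!\!\int_0^\infty \big(\eta(U^\varepsilon)\phi_t + q(U^\varepsilon)\phi_x\big)\,dx\,dt = -\int_0^T \sum_{\text{shocks}} \big(s[\eta(U^\varepsilon)] - [q(U^\varepsilon)]\big)\,\phi\,dt - \int_0^X \sum_{n=1}^\infty \phi(x, n\Delta t)\,[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx,$$
which gives
$$|\langle E^\varepsilon, \phi\rangle| \le \Big|\int_0^T \sum_{\text{shocks}} \big(s[\eta(U^\varepsilon)] - [q(U^\varepsilon)]\big)\,\phi\,dt\Big| + \Big|\int_0^X \sum_{n=1}^\infty \phi(x, n\Delta t)\,[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx\Big| \equiv e_1 + e_2.$$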
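It follows from (4.18) that $e_1 \le \text{const.}\,\|\phi\|_\infty$. Next,
$$e_2 \le \Big|\sum_{n,j=1}^\infty \int_{j\Delta x}^{(j+1)\Delta x} \phi(j\Delta x, n\Delta t)\,[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx\Big| + \Big|\sum_{n,j=1}^\infty \int_{j\Delta x}^{(j+1)\Delta x} \big(\phi(x, n\Delta t) - \phi(j\Delta x, n\Delta t)\big)\,[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx\Big| \equiv e_{21} + e_{22}.$$
$e_{21}$ can be estimated by using (4.19) as
$$e_{21} = \Big|\sum_{n,j} \phi(j\Delta x, n\Delta t)\int_{j\Delta x}^{(j+1)\Delta x} [\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx\Big| \le \sum_{n,j} |\phi(j\Delta x, n\Delta t)|\int_{j\Delta x}^{(j+1)\Delta x} [\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\,dx \le \text{const.}\,\|\phi\|_\infty,$$
while $e_{22}$ admits the following estimate:
$$\begin{aligned}
e_{22} &\le \|\phi_x\|_\infty \sum_{n,j} \Delta x \int_{j\Delta x}^{(j+1)\Delta x} \big|[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\big|\,dx \\
&\le \|\phi_x\|_\infty \sum_{n,j} \int_{j\Delta x}^{(j+1)\Delta x} \Big\{\frac{1}{2}\,\frac{(\Delta x)^2}{(\Delta x)^{1/2}} + \frac{1}{2}\,(\Delta x)^{1/2}\,\big|[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\big|^2\Big\}\,dx \\
&\le \|\phi_x\|_\infty \sqrt{\Delta x}\,\Big\{\sum_{n,j} (\Delta x)^2 + \sum_{n,j} \int_{j\Delta x}^{(j+1)\Delta x} \big|[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}\big|^2\,dx\Big\} \\
&\le \|\phi_x\|_\infty \sqrt{\Delta x}\,\Big\{\sum_{n,j} (\Delta x)^2 + \|\nabla\eta\|_\infty^2 \sum_{n,j} \int_{j\Delta x}^{(j+1)\Delta x} \big|[U^\varepsilon]_{t=n\Delta t+0}^{t=n\Delta t-0}\big|^2\,dx\Big\} \\
&\le \text{const.}\,\sqrt{\Delta x}\,\|\phi_x\|_\infty,
\end{aligned}$$
where we have used the assumption that the system admits a strictly convex entropy and the fact that $\nabla\eta \in L^\infty$. Collecting the above estimates and taking the limit as $\varepsilon = O(\Delta x)$ goes to zero, we have shown that $\mathrm{div}(\eta(U), q(U))$ is a bounded measure in any bounded domain in $\mathbb{R}_+ \times \mathbb{R}_+$, so as in the proof of the previous theorem we can apply Anzellotti's theorem to conclude the result.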
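The two elementary steps behind the bound on $e_{22}$ can be written out explicitly. The following is only a sketch under stated assumptions: the mesh ratio $r = \Delta t/\Delta x$ is taken to be a fixed constant (the symbol $r$ is introduced here and is not the source's notation), the sums are assumed to run over the finitely many cells meeting the support of $\phi$, contained in $(0,X)\times(0,T)$, and $[\eta]$ abbreviates the jump $[\eta(U^\varepsilon)]_{t=n\Delta t+0}^{t=n\Delta t-0}$.

```latex
% Step 1: Young's inequality ab <= a^2/2 + b^2/2
% with a = (\Delta x)^{3/4} and b = (\Delta x)^{1/4} |[\eta]|:
\Delta x\,\bigl|[\eta]\bigr|
   \le \tfrac12 (\Delta x)^{3/2} + \tfrac12 (\Delta x)^{1/2}\,\bigl|[\eta]\bigr|^2 .
% Integrating over one cell of length \Delta x:
\Delta x \int_{j\Delta x}^{(j+1)\Delta x} \bigl|[\eta]\bigr|\,dx
   \le \tfrac12 \sqrt{\Delta x}\,
       \Bigl\{ (\Delta x)^2 + \int_{j\Delta x}^{(j+1)\Delta x} \bigl|[\eta]\bigr|^2\,dx \Bigr\}.
% Step 2: counting cells (at most T/\Delta t time levels and X/\Delta x space cells):
\sum_{n,j} (\Delta x)^2
   \le \frac{T}{\Delta t}\cdot\frac{X}{\Delta x}\,(\Delta x)^2
   = \frac{T X}{r} .
```

Under these assumptions the first sum in the braces stays bounded as $\Delta x \to 0$, so the factor $\sqrt{\Delta x}$ in front is what makes $e_{22}$ vanish in the limit.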
Remark 4.4. The estimates (4.12) and (4.13) should be very important in the study of the limiting behavior of weak solutions near the boundary x = 0. Indeed, as we will see in Sects. 5 and 6, the structure of the boundary Young measure associated with the limiting behavior of the density variable U(x, t) as x approaches 0+, for either scalar convex conservation laws or the system (1.4), can be characterized in quite detailed ways. However, we do not expect that estimate (4.12) alone will fully characterize the structure of the boundary Young measure associated with the density variables for general systems; additional estimates are needed for general systems. This is left for future research.

Remark 4.5. In general, due to the appearance of strong boundary layers, we do not expect that (4.12) holds true for the solutions generated by viscous approximations. The appropriate estimate for the viscous approximation is given by the inequality (3.1).

5. Scalar Equation

In this section, we study in detail the initial boundary value problem for scalar convex conservation laws with Dirichlet boundary conditions via Godunov schemes. Several new facts concerning the limiting behavior of bounded measurable weak solutions at the boundary are found. For example, it will be shown that the Young measure associated with the limiting behavior of the density variable U(x, t) as x approaches the boundary is in fact unique, and thus any continuous function of U (in particular U(x, t) itself) has a well-defined trace at x = 0. In the case that the wave speed never vanishes, we obtain the strong trace of the density variable, given by the pointwise limit of the numerical boundary data. More interestingly, it is found that even Godunov methods may introduce numerical boundary layers provided that the wave speed is allowed to change sign. This is surprising since it is generally believed that, due to their upwinding property, the Godunov schemes exclude numerical boundary layers.
We will also obtain a generalized version of Oleinik's entropy condition for the IBVP, which gives interior BV regularity immediately even though both the initial data and the boundary data are only bounded and measurable. Other information such as boundary layer structure and BV stability of the Godunov methods will also be presented.

For convenience, we first introduce some notation and terminology used only in this section. The flux function F will be replaced by $f : \mathbb{R} \to \mathbb{R}$, and it is assumed that f is a convex function, i.e. $f'' > 0$. The cases $f' > 0$ and $f' < 0$ are referred to as nontransonic flux. The case that $f'$ vanishes at a point is referred to as transonic flux, and we take that point to be zero, without loss of generality. We write $U = u \in \mathbb{R}$, $U_0 = u_0$, $U_b = u_b$, and the boundary condition is of the type (2.4), which reads here as u(0, t) = 0 a.e. is an interior solution to the IBVP (1.1)–(5.1); see [Ta]. Denote the middle states of the Riemann problem 0.

Proof. With estimate (5.3) on the approximate solutions at hand, one can complete the proof of this corollary by an easy modification of the classical analysis (see [Sm]), using the structure of the Godunov solutions on each mesh. We omit the details.

Next we show a regularity result for the interior solution of the IBVP (1.1)–(5.1) at the boundary x = 0. We recall that a Young measure, $\nu$, associated with the family $\{u(x,t)\}_{x>0}$ as x goes to zero, is a map from $\mathbb{R}_+$ into $P(\mathbb{R})$ defined for a.e. $x \in \mathbb{R}_+$, where $P(\mathbb{R})$ stands for the set of probability measures on $\mathbb{R}$, such that
$$w^*\text{-}\lim_{x\to 0} g(u(x,\cdot)) = \int g(\lambda)\,d\nu_{\cdot}(\lambda)$$
for any continuous function g; see [Ta].

Theorem 5.4 (Regularity at the boundary). Let $\nu$ and $\mu$ be any Young measures associated with the family $\{u(x,t)\}_{x>0}$ as x goes to zero. Then $\nu = \mu$, that is, the boundary Young measure associated with the interior solution of the IBVP (1.1)–(5.1) obtained by the Godunov method is unique. As a consequence, for any continuous function $g \in C(\mathbb{R})$, the whole sequence $g(u(x,t))$ converges weakly to a unique limit as $x \to 0$, so that its trace at x = 0 is well-defined. In particular, u has a trace $\gamma u$ at the boundary x = 0:
$$\gamma u \overset{\mathrm{def}}{=} w^*\text{-}\lim_{x\to 0} u(x,\cdot).$$

Proof. Since the trace $\gamma q(u) = w^*\text{-}\lim_{x\to 0} q(u(x,\cdot))$ exists due to Theorem 3.4, it follows that
$$\int q(\lambda)\,d\nu = \int q(\lambda)\,d\mu \tag{5.5}$$
for any entropy flux q. Taking $q \in C^1_0(\mathbb{R} \setminus \{0\})$ in (5.5), we obtain that $\nu|_{\mathbb{R}\setminus\{0\}} = \mu|_{\mathbb{R}\setminus\{0\}}$. But $\nu(\mathbb{R}) = \mu(\mathbb{R}) = 1$, so $\nu = \mu$.

Remark 5.5. The idea of taking boundary Young measures to deal with the IBVP for scalar conservation laws appeared first in [Sz]. There, an interior solution for the IBVP for the scalar conservation law is obtained by a streamline diffusion finite element method. Although the uniqueness of boundary Young measures is not proved in [Sz], they can be used to define the measure valued solution to the IBVP because, as is proved there, the integral of the flux function with respect to the boundary Young measures is uniquely defined.

We now address the question whether Godunov schemes introduce numerical boundary layers. We need a definition first.

Definition 5.6 (Boundary layer). Let
$$\bar{u} \overset{\mathrm{def}}{=} w^*\text{-}\lim_{\varepsilon\to 0} u^\varepsilon(0,\cdot).$$
It is said that the Godunov methods for the IBVP (1.1)–(5.1) have boundary layers if $\bar{u} \ne \gamma u$.
The next theorem gives a sufficient condition which ensures the exclusion of boundary layers in the Godunov methods.

Theorem 5.7. If the flux is nontransonic then the Godunov method for the IBVP (1.1)–(5.1) has no boundary layer. Furthermore, the trace of u at the boundary x = 0 exists in the strong sense. That is, $s\text{-}\lim_{x\to 0} u(x,\cdot) =: \gamma u$ exists and $\gamma u = \bar{u}$.

Proof. Since the flux is nontransonic, $q(u) = -u$ is an entropy flux with the convex entropy $\eta$ determined by $\eta'(u) = -1/f'(u)$. Then by the trace Theorem 4.3, the trace of u, $\gamma u$, at the boundary x = 0 exists in the weak sense, i.e. there exists the weak limit $w^*\text{-}\lim_{x\to 0} u(x,\cdot) =: \gamma u$, and
$$\gamma u \ge \bar{u}. \tag{5.6}$$
On the other hand, it also follows from Theorem 4.3 that
$$\gamma f(u) = w^*\text{-}\lim_{x\to 0} f(u(x,\cdot)) = w^*\text{-}\lim_{\varepsilon\to 0} f(u^\varepsilon(0,\cdot)).$$
Using again that the flux is nontransonic, one shows that $u^\varepsilon(0,\cdot)$ converges strongly to $\bar{u}$ as $\varepsilon \to 0$. Then
$$\gamma f(u) = w^*\text{-}\lim_{x\to 0} f(u(x,\cdot)) = w^*\text{-}\lim_{\varepsilon\to 0} f(u^\varepsilon(0,\cdot)) = f(\bar{u}). \tag{5.7}$$
Using Jensen's inequality, the convexity of f, and (5.7), one gets
$$f(\gamma u) \le \gamma f(u) = f(\bar{u}). \tag{5.8}$$
In the case $f' > 0$, (5.8) yields $\gamma u \le \bar{u}$. So combining with (5.6) gives that $\gamma u = \bar{u}$ and $f(\gamma u) = \gamma f(u)$. From this last equation and the convexity of f it follows that $\gamma u = s\text{-}\lim_{x\to 0} u(x,\cdot)$. It remains to complete the proof for the case $f' < 0$. In this case, $f^{-1}$ exists and it is a concave function, so using the facts mentioned above we have
$$\gamma u = w^*\text{-}\lim_{x\to 0} u(x,\cdot) = w^*\text{-}\lim_{x\to 0} f^{-1}(f(u(x,\cdot))) \le f^{-1}\big(w^*\text{-}\lim_{x\to 0} f(u(x,\cdot))\big) = f^{-1}(\gamma f(u)) = f^{-1}\big(w^*\text{-}\lim_{\varepsilon\to 0} f(u^\varepsilon(0,\cdot))\big) = \bar{u}.$$

Thus the Godunov methods introduce no numerical boundary layers for nontransonic flows. However, the following example shows that this is not the case in general.

Theorem 5.8. If the flux is transonic then the Godunov method for the IBVP (1.1)–(5.1) may have a boundary layer.

Proof. Consider the following IBVP for Burgers' equation:
$$u_t + \Big(\frac{u^2}{2}\Big)_x = 0, \qquad u(x,0) \equiv -1, \qquad u(0,t) \equiv 1.$$
Let $u^\varepsilon(x,0) = -1 + \Delta x$, where $\varepsilon = \Delta t \le (\Delta x)/2$. It is not difficult to see that for $\Delta x$ 0 independent of $\varepsilon$ such that $u^\varepsilon(0,t) = 1$ for all $0 < t < t_0$, and $u^\varepsilon(x,t) = -1 + \Delta x$ for all $(x,t) \in (\Delta x, \infty) \times (0, t_0)$. It follows that $\bar{u}(t) = 1 \ne \gamma u(t) = -1$, for all $0 < t < t_0$.

Remark 5.9. In the example of the proof above, the boundary layer is an entropic standing shock. This fact on the structure of boundary layers holds in general, according to the next theorem.

Theorem 5.10. Assume that $\gamma u = s\text{-}\lim_{x\to 0} u(x,\cdot)$ and $\bar{u} = s\text{-}\lim_{\varepsilon\to 0} u^\varepsilon(0,\cdot)$. Then, if there is a boundary layer it must be an entropy standing shock, i.e. $f(\gamma u) = f(\bar{u})$ and $\gamma u \le \bar{u}$. In particular, there is no boundary layer if $\bar{u} \le 0$, i.e. in this case we have $\gamma u = \bar{u}$.

Proof. Due to the hypothesis of strong convergence, it follows from (4.7) and (4.13) that $f(\gamma u) = f(\bar{u})$. Since $f'' > 0$ and $f'(0) = 0$, it suffices to prove that $\gamma u \le 0$ if $\bar{u} \le 0$. Let $\nu$ be a Young measure associated with the family of functions $\{u(x,\cdot)\}_{x>0}$ as x goes to zero.
Then there is a sequence $\{x_k \to 0\}$, and a family of probability measures $\nu$ in $\mathbb{R}$ such that
$$g(u(x_k,\cdot)) \;\overset{x_k\to 0}{\rightharpoonup}\; \int_{\mathbb{R}} g(\lambda)\,d\nu(\lambda), \tag{5.12}$$
for any continuous function g. In particular, for an entropy flux q one has
$$\gamma q(u) = w^*\text{-}\lim_{x_k\to 0} q(u(x_k,\cdot)) = \int_{\mathbb{R}} q(\lambda)\,d\nu. \tag{5.13}$$
In the case that the entropy of q is convex, Theorem 4.3 yields
$$\gamma q(u) \le w^*\text{-}\lim_{\varepsilon\to 0} q(u^\varepsilon(0,\cdot)) = s\text{-}\lim_{\varepsilon\to 0} q(u^\varepsilon(0,\cdot)) = q(\bar{u}).$$
It follows from this and (5.13) that
$$\int_{\mathbb{R}} q(\lambda)\,d\nu \le q(\bar{u}), \tag{5.14}$$
for any entropy flux q with convex entropy. Using in (5.14) a q that is positive for $\lambda > 0$ and zero for $\lambda \le 0$, we obtain that $\operatorname{supp}\nu \subset (-\infty, 0]$. This and (5.12) applied to the identity function give the desired result that $\gamma u \le 0$.

Next we give an example for a transonic flux such that $\bar{u} = \gamma u = 0$; in particular there is no boundary layer.
Example 4. Consider the following IBVP for Burgers' equation:
$$u_t + \Big(\frac{u^2}{2}\Big)_x = 0, \qquad u(x,0) \equiv 1, \qquad u(0,t) \equiv -1.$$
Let $u^\varepsilon(x,0) = -1$ for $0 \le x < \Delta x$, and $u^\varepsilon(x,0) \equiv 1$ for $x \ge \Delta x$, where $\varepsilon = \Delta t = (\Delta x)/2$. We will show that $\bar{u} = \gamma u = 0$. Due to Theorem 5.12, one may apply Theorem 5.10 to conclude as long as the following claim holds true.

Claim. $-1 \le u^n_1 < 0 < u^n_j \le 1$, $n = 0, 1, 2, \ldots$, $j = 2, 3, \ldots$, and $\lim_{n\to\infty} u^n_1 = 0$.

One can verify this claim by induction. If n = 0 then $u^n_b = u^0_1 = -1$, by definition. Suppose the claim holds for some $n \ge 0$. Then $u^n_{1/2} = u^n_1$ and $u^n_{3/2} = 0$, so (5.2) yields
$$u^{n+1}_1 = u^n_1 + \frac{1}{4}(u^n_1)^2 = \psi(u^n_1), \tag{5.15}$$
where
$$\psi(\lambda) \overset{\mathrm{def}}{=} \lambda + \frac{1}{4}\lambda^2, \qquad \lambda \in \mathbb{R}.$$
Now, notice that $-1 \le \psi(\lambda) \le 0$ for all $\lambda \in (-4, 0)$, so $-1 \le u^{n+1}_1 < 0$. On the other hand, if $u^n_1 < 0 < u^n_j \le 1$ for $j \ge 2$ then
$$u^{n+1}_2 = u^n_2 - \frac{1}{4}(u^n_2)^2 = \varphi(u^n_2) \tag{5.16}$$
and
$$u^{n+1}_j = \varphi(u^n_j) + \frac{1}{4}(u^n_{j-1})^2 \quad \text{if } j \ge 3, \tag{5.17}$$
where
$$\varphi(\lambda) \overset{\mathrm{def}}{=} \lambda - \frac{1}{4}\lambda^2, \qquad \lambda \in \mathbb{R}.$$
It follows from (5.16) and (5.17) that $0 < u^{n+1}_j \le 1$, while $\lim_{n\to\infty} u^n_1 = 0$ can be derived easily from (5.15).

In the case that both boundary and initial data are of bounded total variation, the following global stability estimate in the total variation norm holds for Godunov solutions, which improves the estimate in Theorem 5.1.

Theorem 5.11 (Regularity). Assume that both the initial and boundary data are BV functions. Then the following BV-estimate holds:
$$\mathrm{TV}(u^n) \le \mathrm{TV}(u_0) + \mathrm{TV}(u_b) + |u^0_1 - u^0_b|, \tag{5.18}$$
where
$$\mathrm{TV}(u^n) \overset{\mathrm{def}}{=} \sum_{j=1}^\infty |u^n_{j+1} - u^n_j|$$
and
$$u^0_1 \overset{\mathrm{def}}{=} \frac{1}{\Delta x}\int_0^{\Delta x} u_0(x)\,dx, \qquad u^0_b \overset{\mathrm{def}}{=} \frac{1}{\Delta t}\int_0^{\Delta t} u_b(t)\,dt.$$
Proof. This theorem will follow from the TVD property of the Godunov methods for the Cauchy problem. Consider two cases:

First. $u^n_{1/2} = u^n_b$ or 0. In this case, one can extend $U^\varepsilon$ to the strip $(-\infty, \infty) \times [t_n, t_{n+1})$ by $U^\varepsilon \equiv u^n_b$ if $x < 0$. Since the Godunov method for the Cauchy problem is TVD, one obtains
$$\mathrm{TV}(u^{n+1}) + |u^{n+1}_1 - u^n_b| \le \mathrm{TV}(u^n) + |u^n_1 - u^n_b|,$$
and so,
$$\mathrm{TV}(u^{n+1}) \le \mathrm{TV}(u^n) + |u^n_1 - u^n_b| - |u^{n+1}_1 - u^n_b|. \tag{5.19}$$

Second. $u^n_{1/2} = u^n_1$. In this case, one extends $U^\varepsilon$ to the strip $(-\infty, \infty) \times [t_n, t_{n+1})$ by $U^\varepsilon \equiv u^n_1$ if $x < 0$. Then, the same argument as in the first case shows that
$$\mathrm{TV}(u^{n+1}) + |u^{n+1}_1 - u^n_1| \le \mathrm{TV}(u^n).$$
Thus,
$$\mathrm{TV}(u^{n+1}) \le \mathrm{TV}(u^n) - |u^{n+1}_1 - u^n_1|,$$
which also implies (5.19). Now (5.18) follows from (5.19) by induction on n.
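The two Burgers examples of this section and the estimate (5.18) can be explored numerically. The Python sketch below is an illustration added here, not the authors' code. It assumes the standard conservative Godunov update $u^{n+1}_j = u^n_j - (\Delta t/\Delta x)\{f(u^n_{j+1/2}) - f(u^n_{j-1/2})\}$, which reproduces (5.15)–(5.17) for Burgers' flux; it truncates the half-line to finitely many cells with a zero-gradient right end; and the names `riemann_state`, `godunov_step`, `total_variation` and `run` are hypothetical.

```python
def f(u):
    """Burgers flux f(u) = u^2 / 2 (convex, sonic point at u = 0)."""
    return 0.5 * u * u


def riemann_state(ul, ur):
    """Value on the line x/t = 0 of the entropy solution of the Riemann problem (ul, ur)."""
    if ul <= ur:                       # rarefaction wave
        if ul >= 0.0:
            return ul
        if ur <= 0.0:
            return ur
        return 0.0                     # transonic rarefaction: sonic state
    # shock travelling with speed (ul + ur) / 2
    return ul if ul + ur > 0.0 else ur


def godunov_step(u, ub, lam):
    """One time step; u[0] is the first cell, ub the boundary datum, lam = dt/dx.

    Returns the new cell averages and the interface state u_{1/2} at x = 0.
    """
    n = len(u)
    states = [riemann_state(ub, u[0])]                       # u_{1/2}
    states += [riemann_state(u[j], u[j + 1]) for j in range(n - 1)]
    states.append(u[-1])                                     # assumed outflow closure
    new_u = [u[j] - lam * (f(states[j + 1]) - f(states[j])) for j in range(n)]
    return new_u, states[0]


def total_variation(u):
    """Discrete total variation: sum of |u_{j+1} - u_j|."""
    return sum(abs(u[j + 1] - u[j]) for j in range(len(u) - 1))


def run(u0, ub, lam, steps):
    """Iterate the scheme; return final cells, boundary states, and the largest TV seen."""
    u = list(u0)
    boundary_states = []
    max_tv = total_variation(u)
    for _ in range(steps):
        u, b = godunov_step(u, ub, lam)
        boundary_states.append(b)
        max_tv = max(max_tv, total_variation(u))
    return u, boundary_states, max_tv


if __name__ == "__main__":
    dx = 0.01
    # Data from the proof of Theorem 5.8: boundary datum 1, initial datum -1 + dx.
    u, b, tv = run([-1.0 + dx] * 50, 1.0, 0.5, 200)
    print("boundary states:", set(b), " second cell:", u[1], " max TV:", tv)
    # Data from Example 4: boundary datum -1, first cell -1, remaining cells 1.
    u, b, tv = run([-1.0] + [1.0] * 19, -1.0, 0.5, 2000)
    print("first cell after 2000 steps:", u[0])
```

In the first run the interface state at x = 0 stays at the boundary value while every cell beyond the first keeps its initial value, which is the discrete picture of the standing-shock boundary layer; in the second run the first cell follows the recursion (5.15) and creeps up to zero.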
6. A Class of Quadratic Systems

In this section we consider an initial boundary value problem for the symmetric quadratic systems of conservation laws
$$u_t + \tfrac{1}{2}(a u^2 + v^2)_x = 0, \qquad v_t + (uv)_x = 0, \tag{6.1}$$
where $x > 0$, $t > 0$, $U \equiv (u, v) \in \mathbb{R}^2_+ \overset{\mathrm{def}}{=} \{(u, v) \in \mathbb{R}^2 ;\ v \ge 0\}$, and $1 < a < 2$. These systems are in case III of the symmetric quadratic systems introduced in [SS]. The quadratic systems arise from 2 × 2 systems of nonstrictly hyperbolic conservation laws by neglecting higher order terms in the Taylor series of the flux functions, and they can be used as a model for oil recovery [SS]. The solution of their Cauchy or Riemann problem presents complexities that distinguish its own theory, see e.g. [CK 1, CK 2, FS, IMPT, IT, Ka, MPSS, SS]. We prescribe the initial datum,
$$U(x, 0) = U_0(x), \tag{6.2}$$
where $U_0(x) \in \mathbb{R}^2_+$ for all $x > 0$ and $U_0 \in L^2 \cap L^\infty$, and the following boundary condition,
$$\sqrt{a}\,u(0, t) - v(0, t) = 0. \tag{6.3}$$
Note that (6.3) is uniformly characteristic for the system (6.1). The main result in this section is the existence of a solution (u, v) to the Initial Boundary Value Problem (IBVP) (6.1)–(6.3); see Theorem 6.1 below. We will use the Godunov method to construct an approximate solution $U^\varepsilon(x, t)$, and then take the limit as $\varepsilon$ goes to zero to obtain an exact solution. Our approximations $U^\varepsilon$ satisfy the boundary condition (6.3) exactly, as we show in [KSX] by an analytical construction of global solutions of the Riemann problem and half-space Riemann problem for the systems (6.1); cf. Proposition 6.2 below.

Let us recall some basic facts on the systems (6.1). For the details we refer the reader to [IT], Sect. 2 of [Ka], and [CK 1, CK 2]. The eigenvalues of (6.1) are
$$\lambda_k = \frac{1}{2}\Big\{(a + 1)u + (-1)^k\sqrt{(a - 1)^2u^2 + 4v^2}\Big\}, \qquad k = 1, 2.$$
According to their signs, the upper half plane $\mathbb{R}^2_+$ is divided in three regions:
$$K_1 \overset{\mathrm{def}}{=} \{U \in \mathbb{R}^2_+ ;\ \lambda_1(U) < \lambda_2(U) < 0\}, \qquad K_2 \overset{\mathrm{def}}{=} \{U \in \mathbb{R}^2_+ ;\ \lambda_1(U) < 0 \le \lambda_2(U)\},$$
and
$$K_3 \overset{\mathrm{def}}{=} \{(0, 0)\} \cup \{U \in \mathbb{R}^2_+ ;\ 0 \le \lambda_1(U) < \lambda_2(U)\}.$$
Notice that the boundary condition (6.3) lies on $K_2 \cap K_3$, where $\lambda_1 \equiv 0$, and $\lambda_2 > 0$ for all $U \ne (0, 0)$. Notice also that the origin (0, 0) is an umbilic point for the system (6.1) [SS], where $\lambda_1 = \lambda_2 = 0$. The corresponding eigenvectors are $r_{1,2} = (v, \lambda_{1,2} - au)$. It is easy to check that $r_j \cdot \nabla\lambda_j \ne 0$, $j = 1, 2$, for all (u, v) such that $v > 0$, that is, the systems (1.1) are genuinely nonlinear for $v > 0$. Integrating these fields on the plane, one gets the rarefaction curves of (1.1). See [IT]. Associated to the rarefaction curves, there is a pair of Riemann invariants $w_1, w_2$, that is, a pair of real functions on $\mathbb{R}^2$ such that $\nabla w_i \cdot r_j = 0$, $i \ne j$, $i, j = 1, 2$. We will normalize $(w_1, w_2)$ such that $\nabla w_i \cdot r_i > 0$, $i = 1, 2$. Let $U_R$ be a constant state in $\mathbb{R}^2_+$. We will denote the backward 2-rarefaction wave curve about $U_R$ and the backward 2-shock wave curve about $U_R$ in the sense of Lax [La], respectively, by $\mathcal{R}^2_-(U_R)$ and $\mathcal{S}^2_-(U_R)$. Now our main theorem can be stated as follows.

Theorem 6.1. There exists a solution $U = U(x, t)$ in $L^\infty$ to the IBVP (6.1)–(6.3) such that
$$s\text{-}\lim_{x\to 0}\big(\sqrt{a}\,u(x,\cdot) - v(x,\cdot)\big)^2 = s\text{-}\lim_{\delta\to 0}\frac{1}{\delta}\int_0^\delta \big(\sqrt{a}\,u(x,\cdot) - v(x,\cdot)\big)^2\,dx = 0.$$

Theorem 6.1 will be proved at the end of this section. As mentioned earlier, the solution will be the limit of an approximate solution constructed by the Godunov methods described in Sect. 2. One of the main ingredients of this construction is the following proposition.

Proposition 6.2 ([KSX] Half-space Riemann problem). For any $U_R \in \mathbb{R}^2_+$, there is a unique $U_B = (u_B, v_B) \in \mathcal{R}^2_-(U_R) \cup \mathcal{S}^2_-(U_R)$ such that $\sqrt{a}\,u_B - v_B = 0$ and the Riemann solution U = 0, that is, U satisfies the boundary condition (6.3) for all $t > 0$.
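The eigenvalue formula and the remark that the boundary line (6.3) carries $\lambda_1 \equiv 0$ are easy to check numerically. The Python sketch below is an added illustration under stated assumptions: the Jacobian $[[au, v], [v, u]]$ is computed here from the flux in (6.1) rather than quoted from the source, the value a = 1.5 is an arbitrary choice in (1, 2), and the function names are hypothetical.

```python
import math


def eigenvalues(u, v, a):
    """(lambda_1, lambda_2) from the closed formula for the quadratic system (6.1)."""
    root = math.sqrt((a - 1.0) ** 2 * u * u + 4.0 * v * v)
    return 0.5 * ((a + 1.0) * u - root), 0.5 * ((a + 1.0) * u + root)


def char_poly(lam, u, v, a):
    """det(J - lam I) for J = [[a u, v], [v, u]], the Jacobian of (a u^2/2 + v^2/2, u v)."""
    return (a * u - lam) * (u - lam) - v * v


def region(u, v, a):
    """Classify a state of the upper half plane into K1, K2 or K3 by eigenvalue signs."""
    if u == 0.0 and v == 0.0:
        return "K3"                    # umbilic point, placed in K3 by definition
    lam1, lam2 = eigenvalues(u, v, a)
    if lam2 < 0.0:
        return "K1"
    if lam1 < 0.0:
        return "K2"
    return "K3"


if __name__ == "__main__":
    a = 1.5                            # illustrative value with 1 < a < 2
    for u in (0.5, 1.0, 2.0):
        v = math.sqrt(a) * u           # a state on the boundary line sqrt(a) u - v = 0
        lam1, lam2 = eigenvalues(u, v, a)
        print(u, v, lam1, lam2, char_poly(lam2, u, v, a))
    print(region(-1.0, 0.1, a), region(0.0, 1.0, a), region(1.0, 0.1, a))
```

On the line $v = \sqrt{a}\,u$ with $u > 0$ the square root reduces to $(a+1)u$, so the formula gives $\lambda_1 = 0$ and $\lambda_2 = (a+1)u$ up to rounding; the sketch avoids exact sign tests on that line for this reason.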
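To define the approximate solution and obtain an $L^\infty$ a priori estimate, we will use the following result about invariant regions of Riemann problems for (6.1).

Lemma 6.3 ([KSX]). For arbitrary $u_1 < 0$, $u_2 > 0$, let $S = S(u_1; u_2)$ be the closed region defined by
$$S \overset{\mathrm{def}}{=} \{U \in \mathbb{R}^2_+ ;\ w_1(U) \ge w_1((u_1, 0)) \text{ and } w_2(U) \le w_2((u_2, 0))\}.$$
Then S is an invariant region of the Riemann problem for (1.1), that is, if $U_L, U_R \in S$ then […] Sect. 2. Fix an invariant region S such that $U_0(x) \in S$ for all $x > 0$ and the intersection of the wave curve $\mathcal{R}^2_-(U_R) \cup \mathcal{S}^2_-(U_R)$ with the line $\sqrt{a}\,u - v = 0$ is contained in S for all $U_R \in S$. Let
$$\{(j\Delta x, n\Delta t) ;\ (j, n) \in \mathbb{N}^2\}, \qquad \mathbb{N} = \{1, 2, \ldots\},$$
be a net in $\mathbb{R}_+ \times \mathbb{R}_+ \overset{\mathrm{def}}{=} \{(x, t) \in \mathbb{R}^2 ;\ x \ge 0,\ t \ge 0\}$ such that $\delta = \Delta t/\Delta x$ is constant and satisfies the CFL condition
$$\delta\,\sup\{|\lambda_k(U)| ;\ k = 1, 2,\ U \in S\} < 1.$$
An approximate solution $U^\varepsilon$, $\varepsilon \overset{\mathrm{def}}{=} \Delta t = \delta\,\Delta x$, is defined as follows. First we approximate the initial data $U_0$ by
$$U^\varepsilon_0(x) \overset{\mathrm{def}}{=} \sum_{j=0}^\infty U_{0j}\,\chi_{(2j\Delta x, (2j+2)\Delta x]}(x),$$
where
$$U_{0j} \overset{\mathrm{def}}{=} \frac{1}{2\Delta x}\int_{2j\Delta x}^{(2j+2)\Delta x} U_0(x)\,dx.$$
Since $U_0(x) \in S$ for all $x > 0$ and S is convex, $U_{0j} \in S$ for all $j \in \mathbb{N}$ and $U^\varepsilon_0(x) \in S$ for all $x > 0$. Now suppose that $U^\varepsilon$ is defined in some strip $\mathbb{R}_+ \times [0, n\Delta t)$ and $U^\varepsilon(x, n\Delta t) \in S$ for all $x \ge 0$, and we show how to define $U^\varepsilon$ in $\mathbb{R}_+ \times [n\Delta t, (n + 1)\Delta t)$. First define $U^\varepsilon$ on $\mathbb{R}_+ \times \{n\Delta t\}$ by
$$U^\varepsilon_n(x) \overset{\mathrm{def}}{=} \sum_{j=0}^\infty U_{nj}\,\chi_{(2j\Delta x, (2j+2)\Delta x]}(x),$$
where
$$U_{nj} \overset{\mathrm{def}}{=} \frac{1}{2\Delta x}\int_{2j\Delta x}^{(2j+2)\Delta x} U^\varepsilon(x, n\Delta t - 0)\,dx.$$
Next set $U^\varepsilon$ in the mesh $((2j - 1)\Delta x, (2j + 1)\Delta x) \times (n\Delta t, (n + 1)\Delta t)$, $j \ge 1$, by U ε (x, t) = u∗R ] =⇒ f (uM ) ≥ f (uR ).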
We divide the proof of Lemma 5.2 in three main cases, namely, $f'(u_{j+1/2}) = 0$, $f'(u_{j+1/2}) < 0$, and $f'(u_{j+1/2}) > 0$.

First case: $f'(u_{j+1/2}) = 0$. In this case one has $u_{j+1/2} = 0$. So, due to Remark a, $u_j \le 0 \le u_{j+1}$; then we obtain from Remark h and Remark i that $f(u_{j+3/2}) \ge f(u_{j+1})$ and $f(u_{j-1/2}) \ge f(u_j)$. It follows from these estimates and (7.1) that
$$\bar{D}_j \le D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+1}) + f(u_j)\} = D_j - \frac{\Delta t}{\Delta x}\Big\{\frac{1}{2}f''(\xi_1)(u_{j+1})^2 + \frac{1}{2}f''(\xi_2)(u_j)^2\Big\} \le D_j - c\{(u_{j+1})^2 + (u_j)^2\} \le D_j - \frac{c}{2}(D_j)^2,$$
where c is defined in Lemma 5.2, and for the last estimate we have used the definition of $D_j$ and the simple inequality $(u_{j+1})^2 + (u_j)^2 \ge (u_{j+1} - u_j)^2/2$.

To study the other two cases $f'(u_{j+1/2}) < 0$ and $f'(u_{j+1/2}) > 0$, we first consider some subcases:

"Negative case": $u_{j-1/2} = u_j$, $u_{j+1/2} = u_{j+1}$, and $u_{j+3/2} = u_{j+2}$ or 0. In this case, (7.1) yields
$$\bar{D}_j = D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_{j+1}) + f(u_j)\}. \tag{7.2}$$
Expanding both $f(u_{j+3/2})$ and $f(u_j)$ in a Taylor series about $u_{j+1}$, one obtains
$$\bar{D}_j = \Big(1 + \frac{\Delta t}{\Delta x}f'(u_{j+1})\Big)D_j - \frac{\Delta t}{\Delta x}f'(u_{j+1})(u_{j+3/2} - u_{j+1}) - \frac{1}{2}\frac{\Delta t}{\Delta x}\{f''(\xi_1)(D_j)^2 + f''(\xi_2)(u_{j+3/2} - u_{j+1})^2\}, \tag{7.3}$$
where $\xi_1, \xi_2 \in [-M, M]$. Now it follows from Remark c that $D_j = 0$ or $f'(u_{j+1}) \le 0$, and from Remark a that
$$u_{j+3/2} - u_{j+1} \le D_{j+1} = u_{j+2} - u_{j+1}. \tag{7.4}$$
If $f'(u_{j+1}) \le 0$, then (7.3) and (7.4) give
$$\bar{D}_j \le D^\sharp_j - c(D^\sharp_j)^2 \le D^*_j - c(D^*_j)^2$$
with
$$D^\sharp_j \overset{\mathrm{def}}{=} \max\{D_j, D_{j+1}\}.$$
If $f'(u_{j+1}) > 0$, then $D_j = 0$, i.e. $u_{j+1} = u_j$. Furthermore, since $u_{j+1} > 0$, one has from Remark h that $f(u_{j+3/2}) \ge f(u_{j+1}) = f(u_j)$. Therefore, (7.2) yields
$$\bar{D}_j = -\frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - f(u_j)\} \le 0,$$
and so
$$\bar{D}_j \le 0 = D_j - c(D_j)^2 \le D^\sharp_j - c(D^\sharp_j)^2.$$

"Positive case": $u_{j-1/2} = u_{j-1}$ or 0, $u_{j+1/2} = u_j$, and $u_{j+3/2} = u_{j+1}$. In this case, (7.1) says that
$$\bar{D}_j = D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+1}) - 2f(u_j) + f(u_{j-1/2})\}. \tag{7.4}$$
As in the previous case, one has
$$\bar{D}_j = \Big(1 - \frac{\Delta t}{\Delta x}f'(u_j)\Big)D_j + \frac{\Delta t}{\Delta x}f'(u_j)(u_j - u_{j-1/2}) - \frac{1}{2}\frac{\Delta t}{\Delta x}\{f''(\xi_1)(D_j)^2 + f''(\xi_2)(u_j - u_{j-1/2})^2\}, \tag{7.5}$$
and Remark b shows either $D_j = 0$ or $f'(u_j) \ge 0$, while Remark a implies that
$$u_j - u_{j-1/2} \le D_{j-1} = u_j - u_{j-1}. \tag{7.6}$$
If $f'(u_j) \ge 0$, then combining (7.5) with (7.6) gives
$$\bar{D}_j \le D^\flat_j - c(D^\flat_j)^2 \le D^*_j - c(D^*_j)^2$$
with
$$D^\flat_j \overset{\mathrm{def}}{=} \max\{D_{j-1}, D_j\}.$$
If $f'(u_j) < 0$, then $D_j = 0$, and $f(u_{j-1/2}) \ge f(u_j) = f(u_{j+1})$ (due to Remark i.i). It follows from (7.4) that
$$\bar{D}_j = -\frac{\Delta t}{\Delta x}\{f(u_{j-1/2}) - f(u_j)\} \le 0.$$
Thus
$$\bar{D}_j \le 0 = D_j - c(D_j)^2 \le D^\flat_j - c(D^\flat_j)^2.$$

"Positive–negative–negative case": $u_{j-1/2} = u_{j-1}$ or 0, $u_{j+1/2} = u_{j+1}$, and $u_{j+3/2} = u_{j+2}$ or 0. In this case, (7.1) reads
$$\bar{D}_j = D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_{j+1}) + f(u_{j-1/2})\}, \tag{7.7}$$
and one has either $D_j = 0$ or $f'(u_{j+1}) \le 0$, and furthermore $u_{j+3/2} - u_{j+1} \le D_{j+1}$. There are several subcases to be considered.

• If $f'(u_{j+1}) \le 0$ and $u_{j-1/2} = u_{j-1}$, then from Remark b we have either $u_{j-1} = u_j$, for which the proof follows as in the "negative case", or $u_{j-1} \ge 0$ and $u_j > u^*_{j-1}$, where $u^*$ is defined by the unique solution of the equation $f(u^*) = f(u)$. The latter case is studied as follows. First we assume that $u^*_{j-1} < u_j \le u_{j-1}$. Then $f(u_{j-1}) \ge f(u_j)$ and
$$\bar{D}_j \le D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_{j+1}) + f(u_j)\}.$$
It now follows from the "negative case" that $\bar{D}_j \le D^\sharp_j - c(D^\sharp_j)^2$. Next, suppose that $u_j > u_{j-1}$. Then $u_j > 0$ and $D_{j-1} > 0$, and so
$$\bar{D}_j \le \Big\{u_{j+1} + 2\frac{\Delta t}{\Delta x}f(u_{j+1})\Big\} - u_j \le 0 \le \varphi(D_{j-1}),$$
where $\varphi$ is by definition the parabola $\varphi(\lambda) = \lambda - c\lambda^2$.
• If $f'(u_{j+1}) \le 0$ and $u_{j-1/2} = 0$, then $u_{j-1} \le u_j$ due to Remark a, and so (7.7) implies that
$$\bar{D}_j \le \Big\{u_{j+1} + 2\frac{\Delta t}{\Delta x}f(u_{j+1})\Big\} - u_j \le 0 \le \varphi(D_{j-1}).$$

• If $f'(u_{j+1}) > 0$, then $D_j = 0$, and $f(u_{j+3/2}) \ge f(u_{j+1}) = f(u_j)$ (as follows from Remark h). So
$$\bar{D}_j = -\frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_{j+1}) + f(u_{j-1/2})\} \le -\frac{\Delta t}{\Delta x}\{f(u_j) - 2f(u_j) + f(u_{j-1/2})\}.$$
Thus, applying the "positive case" with $u_j = u_{j+1} = u_{j+2}$, we obtain $\bar{D}_j \le \varphi(D^\flat_j)$.

We now consider a parallel case:

"Positive–positive–negative case": $u_{j-1/2} = u_{j-1}$ or 0, $u_{j+1/2} = u_j$, and $u_{j+3/2} = u_{j+2}$ or 0. In this case, (7.1) becomes
$$\bar{D}_j = D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_j) + f(u_{j-1/2})\}, \tag{7.8}$$
and one has either $D_j = 0$ or $f'(u_j) \ge 0$, and furthermore $u_j - u_{j-1/2} \le D_{j-1} = u_j - u_{j-1}$.

• If $f'(u_j) \ge 0$ and $u_{j+3/2} = u_{j+2}$, then it follows from Remark c that either $u_{j+1} = u_{j+2}$, for which the proof goes as in the "positive case", or $u_{j+2} \le 0$ and $u_{j+1} \le u^*_{j+2}$, for which we will argue as follows. Consider two cases:

1. $u_{j+2} \le u_{j+1} \le u^*_{j+2}$. In this case, $f(u_{j+2}) \ge f(u_{j+1})$, and
$$\bar{D}_j \le D_j - \frac{\Delta t}{\Delta x}\{f(u_{j+1}) - 2f(u_j) + f(u_{j-1/2})\}.$$
Then, by the "positive case", one obtains $\bar{D}_j \le \varphi(D^\flat_j)$.

2. $u_{j+1} < u_{j+2}$. Then $u_{j+1} < 0$ and $D_{j+1} > 0$. Thus
$$\bar{D}_j \le -\Big\{u_j - 2\frac{\Delta t}{\Delta x}f(u_j)\Big\} + u_{j+1} < 0 \le \varphi(D_{j+1}).$$

• If $f'(u_j) \ge 0$ and $u_{j+3/2} = 0$, then $u_{j+1} \le 0 \le u_{j+2}$, and so
$$\bar{D}_j \le -\Big\{u_j - 2\frac{\Delta t}{\Delta x}f(u_j)\Big\} + u_{j+1} \le 0 \le \varphi(D_{j+1}).$$

• If $f'(u_j) < 0$, then $D_j = 0$ and $f(u_{j-1/2}) \ge f(u_j)$. One gets that
$$\bar{D}_j \le -\frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_j) + f(u_j)\} \le -\frac{\Delta t}{\Delta x}\{f(u_{j+3/2}) - 2f(u_{j+1}) + f(u_{j+1})\}.$$
It follows from the "negative case" with $u_{j-1} = u_j = u_{j+1}$ that $\bar{D}_j \le \varphi(D^\sharp_j)$.

Now we turn to the two remaining cases, $f'(u_{j+1/2}) < 0$ and $f'(u_{j+1/2}) > 0$:

Second case: $f'(u_{j+1/2}) < 0$. Then $u_{j+1/2} < 0$. It follows from Remark d and Remark f, respectively, that
$$u_{j+1/2} = u_{j+1} < 0, \tag{7.9}$$
and
$$u_{j+3/2} = u_{j+2} \text{ or } 0. \tag{7.10}$$
Consequently, this case is reduced to either the "negative case" or the "positive–negative–negative case".

Third case: $f'(u_{j+1/2}) > 0$. In this case we have $u_{j+1/2} > 0$. Instead of (7.9) and (7.10), one has
$$u_{j+1/2} = u_j > 0, \tag{7.11}$$
and
$$u_{j-1/2} = u_{j-1} \text{ or } 0, \tag{7.12}$$
as follows from Remark e and Remark g. Now the analysis for this case is reduced to that for either the "positive case" or the "positive–positive–negative case". This finishes the proof of Lemma 5.2.

References

[An] Anzellotti, G.: Pairings between measures and bounded functions and compensated compactness. Ann. Mat. Pura Appl. 135, 293–318 (1983)
[BLN] Bardos, C., Leroux, A. Y. and Nedelec, J. C.: First order quasilinear equations with boundary conditions. Comm. Pure Appl. Math. 4(9), 1017–1034 (1979)
[BS] Benabdallah, A. and Serre, D.: Problèmes aux limites pour des systèmes hyperboliques non linéaires de deux équations à une dimension d'espace. C. R. Acad. Sci. Paris Ser. I Math. 305, 677–680 (1987)
[CK 1] Chen, G. Q. and Kan, P. T.: Hyperbolic conservation laws with umbilic degeneracy I. Arch. Rat. Mech. Anal. 130, 231–276 (1995)
[CK 2] Chen, G. Q. and Kan, P. T.: Hyperbolic conservation laws with umbilic degeneracy II. Preprint
[Di 1] DiPerna, R. J.: Convergence of approximate solutions to conservation laws. Arch. Rat. Mech. Anal. 82, 27–70 (1983)
[Di 2] DiPerna, R. J.: Convergence of the viscosity method for isentropic gas dynamics. Commun. Math. Phys. 91, 1–30 (1983)
[Di 3] DiPerna, R. J.: Uniqueness of solutions to hyperbolic conservation laws. Indiana Univ. Math. J. 28, 244–257 (1979)
[DL] LeFloch, P. and Dubois, F.: Boundary conditions for nonlinear hyperbolic systems of conservation laws. J. Diff. Eq. 71, 93–122 (1988)
[FS] Frid, H. and Santos, M. M.: Nonstrictly hyperbolic systems of conservation laws of the conjugate type. Comm. Part. Diff. Eq. 19(1&2), 27–59 (1994)
[Ge] Gel'fand, I. M.: Some problems in theory of quasi-linear equations. Amer. Math. Soc. Trans., Ser. 2, 29, 295–381 (1963)
[Go] Goodman, J. B.: Initial boundary value problems for hyperbolic systems of conservation laws. Thesis, Stanford University (1981)
[GL] Goodman, J. B. and LeVeque, R. J.: A geometric approach to high resolution TVD schemes. SIAM J. Num. Anal. 25, 268–284 (1988)
[He] Heidrich, A.: Global weak solutions to initial boundary value problems for the one-dimensional quasilinear wave equation with large data. Arch. Rat. Mech. Anal.
[IMPT] Isaacson, E., Marchesin, D., Plohr, B. and Temple, B.: The Riemann problem near a hyperbolic singularity: the classification of solutions of quadratic Riemann problems I. SIAM J. Appl. Math. 48(5), 1009–1032 (1988)
[IT] Isaacson, E. and Temple, B.: The Riemann problem near a hyperbolic singularity II. SIAM Appl. Math. 48(6), 1287–1301 (1988)
[Ka] Kan, P. T.: On the Cauchy problem of a 2 × 2 system of non-strictly hyperbolic conservation laws. Thesis, New York University (1989)
[KSX] Kan, P. T., Santos, M. M. and Xin, Z.: Initial boundary value problem for a class of quadratic systems of conservation laws. Matemática Contemporânea (Brazilian Mathematical Society), Vol. 11, 1–32 (1996)
[KK] Keyfitz, B. and Kranzer, H.: A system of nonstrictly hyperbolic conservation laws arising in elasticity theory. Arch. Rat. Mech. Anal. 72, 219–241 (1980)
[La] Lax, P. D.: Hyperbolic systems of conservation laws, II. Comm. Pure Appl. Math. 19, 537–556 (1957)
[Le] LeVeque, R. J.: Numerical methods for conservation laws. Basel–Boston: Birkhäuser (1992)
[Li] Liu, T. P.: Initial-boundary-value for gas dynamics. Arch. Rat. Mech. Anal. 64, 137–168 (1977)
[MPSS] Marchesin, D., Paes-Leme, P. J., Schaeffer, D. G. and Shearer, M.: Solution of the Riemann problem for a prototype 2 × 2 system of non-strictly hyperbolic conservation laws. Arch. Rat. Mech. Anal. 97, 299–320 (1987)
[NS] Nishida, T. and Smoller, J.: Mixed problems for nonlinear conservation laws. J. Diff. Eqns. 23, 244–269 (1977)
[SS] Schaeffer, D. G. and Shearer, M.: The classification of 2 × 2 systems of non-strictly hyperbolic conservation laws with application to oil recovery, with Appendix by D. Marchesin, P. J. Paes-Leme, D. G. Schaeffer and M. Shearer. Comm. Pure Appl. Math. 40, 141–178 (1987)
[Sm] Smoller, J.: Shock waves and reaction-diffusion equations. Berlin–Heidelberg–New York: Springer-Verlag, 1982
[Sz] Szepessy, A.: Measure valued solutions to scalar conservation laws with boundary conditions. Arch. Rat. Mech. Anal. 107, 181–193 (1989)
[Ta] Tartar, L.: Compensated Compactness and Applications to Partial Differential Equations. Research Notes in Math., Nonlinear Analysis and Mechanics, Heriot-Watt Symposium, Vol. 4, Knops, R. J. (ed.) Pitman Press, 1979

Communicated by S.-T. Yau
Commun. Math. Phys. 186, 731 – 750 (1997)
Communications in
Mathematical Physics
© Springer-Verlag 1997

The Spectral Action Principle
Ali H. Chamseddine¹,², Alain Connes²
¹ Theoretische Physik, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
² I.H.E.S., F-91440 Bures-sur-Yvette, France
Received: 1 October 1996 / Accepted: 15 November 1996
Abstract: We propose a new action principle to be associated with a noncommutative space (A, H, D). The universal formula for the spectral action is (ψ, Dψ) + Trace(χ(D/Λ)) where ψ is a spinor on the Hilbert space, Λ is a scale and χ a positive function. When this principle is applied to the noncommutative space defined by the spectrum of the standard model one obtains the standard model action coupled to Einstein plus Weyl gravity. There are relations between the gauge coupling constants identical to those of SU(5) as well as the Higgs self-coupling, to be taken at a fixed high energy scale.

1. Introduction

The basic data of Riemannian geometry consists in a manifold M whose points x ∈ M are locally labelled by finitely many coordinates $x^\mu \in \mathbb{R}$, and in the infinitesimal line element, ds,
$$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu. \tag{1.1}$$
The laws of physics at reasonably low energies are well encoded by the action functional,
$$I = I_E + I_{SM}, \tag{1.2}$$
where $I_E = \frac{1}{16\pi G}\int R\,\sqrt{g}\,d^4x$ is the Einstein action, which depends only upon the 4-geometry (we shall work throughout in the Euclidean, i.e. imaginary time formalism) and where $I_{SM}$ is the standard model action, $I_{SM} = \int L_{SM}$, $L_{SM} = L_G + L_{GH} + L_H + L_{Gf} + L_{Hf}$. The action functional $I_{SM}$ involves, besides the 4-geometry, several additional fields: bosons G of spin 1 such as γ, $W^\pm$ and Z, and the eight gluons, bosons of spin 0 such as the Higgs field H and fermions f of spin 1/2, the quarks and leptons. These additional fields have a priori a very different status than the geometry (M, g) and the gauge invariance group which governs their interaction is a priori very different from the diffeomorphism group which governs the invariance of the Einstein action. In fact the natural group of invariance of the functional (1.2) is the semidirect product,
$$G = \mathcal{U} \rtimes \mathrm{Diff}(M) \tag{1.3}$$
of the group of local gauge transformations, U = C ∞ (M, U (1) × SU (2) × SU (3)) by the natural action of Diff(M ). The basic data of noncommutative geometry consists of an involutive algebra A of operators in Hilbert space H and of a selfadjoint unbounded operator D in H [1–6]. The inverse D−1 of D plays the role of the infinitesimal unit of length ds of ordinary geometry. To a Riemannian compact spin manifold corresponds the spectral triple given by the algebra A = C ∞ (M ) of smooth functions on M , the Hilbert space H = L2 (M, S) of L2 -spinors and the Dirac operator D of the Levi-Civita Spin connection. The line element ds is by construction the propagator of fermions, ds = ×—× .
(1.4)
No information is lost in trading the original Riemannian manifold $M$ for the corresponding spectral triple $(\mathcal{A}, \mathcal{H}, D)$. The points of $M$ are recovered as the characters of the involutive algebra $\mathcal{A}$, i.e. as the homomorphisms $\rho : \mathcal{A} \to \mathbb{C}$ (linear maps such that $\rho(ab) = \rho(a)\rho(b)$ $\forall\, a, b \in \mathcal{A}$). The geodesic distance between points is recovered by
$$d(x, y) = \sup\,\{\, |a(x) - a(y)| \;;\; a \in \mathcal{A},\ \|[D, a]\| \le 1 \,\}. \tag{1.5}$$
More importantly one can characterize the spectral triples $(\mathcal{A}, \mathcal{H}, D)$ which come from the above spinorial construction by very simple axioms ([4]) which involve the dimension $n$ of $M$. The parity of $n$ implies a $\mathbb{Z}/2$ grading $\gamma$ of the Hilbert space $\mathcal{H}$ such that,
$$\gamma = \gamma^*, \quad \gamma^2 = 1, \quad \gamma a = a\gamma \ \ \forall\, a \in \mathcal{A}, \quad \gamma D = -D\gamma . \tag{1.6}$$
Moreover one keeps track of the real structure on $\mathcal{H}$ as an antilinear isometry $J$ in $\mathcal{H}$ satisfying simple relations
$$J^2 = \varepsilon, \quad JD = \varepsilon' DJ, \quad J\gamma = \varepsilon''\gamma J; \quad \varepsilon, \varepsilon', \varepsilon'' \in \{-1, 1\}, \tag{1.7}$$
where the value of $\varepsilon, \varepsilon', \varepsilon''$ is determined by $n$ modulo 8. One first virtue of these axioms is to allow for a shift of point of view, similar to Fourier transform, in which the usual emphasis on the points $x \in M$ of a geometric space is now replaced by the spectrum $\Sigma \subset \mathbb{R}$ of the operator $D$. Indeed, if one forgets about the algebra $\mathcal{A}$ in the spectral triple $(\mathcal{A}, \mathcal{H}, D)$ but retains only the operators $D$, $\gamma$ and $J$ acting in $\mathcal{H}$ one can (using (1.7)) characterize this data by the spectrum $\Sigma$ of $D$ which is a discrete subset with multiplicity of $\mathbb{R}$. In the even case $\Sigma = -\Sigma$. The existence of Riemannian manifolds which are isospectral (i.e. have the same $\Sigma$) but not isometric shows that the following hypothesis is stronger than the usual diffeomorphism invariance of the action of general relativity,
$$\text{“The physical action only depends upon } \Sigma\text{.”} \tag{1.8}$$
In order to apply this principle to the action (1.2) we need to exploit a second virtue of the axioms (cf. [4]) which is that they do not require the commutativity of the algebra $\mathcal{A}$. Instead one only needs the much weaker form,
$$a b^0 = b^0 a \quad \forall\, a, b \in \mathcal{A} \quad \text{with } b^0 = J b^* J^{-1} . \tag{1.9}$$
In the usual Riemannian case the group Diff(M ) of diffeomorphisms of M is canonically isomorphic to the group Aut(A) of automorphisms of the algebra A = C ∞ (M ). To each ϕ ∈ Diff(M ) one associates the algebra preserving map αϕ : A → A given by
$$\alpha_\varphi(f) = f \circ \varphi^{-1} \quad \forall\, f \in C^\infty(M) = \mathcal{A} . \tag{1.10}$$
In general the group Aut(A) of automorphisms of the involutive algebra A plays the role of the diffeomorphisms of the noncommutative (or spectral for short) geometry (A, H, D). The first interesting new feature of the general case is that the group Aut(A) has a natural normal subgroup, Int(A) ⊂ Aut(A),
(1.11)
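where an automorphism $\alpha$ is inner iff there exists a unitary operator $u \in \mathcal{A}$, ($uu^* = u^*u = 1$) such that,
$$\alpha(a) = u a u^* \quad \forall\, a \in \mathcal{A} . \tag{1.12}$$
The corresponding exact sequence of groups,
$$1 \to \mathrm{Int}(\mathcal{A}) \to \mathrm{Aut}(\mathcal{A}) \to \mathrm{Out}(\mathcal{A}) \to 1 \tag{1.13}$$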
looks very similar to the exact sequence 1 → U → G → Diff(M ) → 1,
(1.14)
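which describes the structure of the symmetry group $G$ of the action functional (1.2). Comparing (1.13) and (1.14) and taking into account the action of inner automorphisms of $\mathcal{A}$ in $\mathcal{H}$ given by
$$\xi \to u\,(u^*)^0\,\xi = u\,\xi\,u^* \tag{1.15}$$
one determines the algebra $\mathcal{A}$ such that $\widetilde{\mathrm{Aut}}(\mathcal{A}) = G$ (where $\widetilde{\mathrm{Aut}}$ takes into account the action of automorphisms in the Hilbert space $\mathcal{H}$). The answer is
$$\mathcal{A} = C^\infty(M) \otimes \mathcal{A}_F, \tag{1.16}$$
where the algebra $\mathcal{A}_F$ is finite dimensional,
$$\mathcal{A}_F = \mathbb{C} \oplus \mathbb{H} \oplus M_3(\mathbb{C}), \tag{1.17}$$
where $\mathbb{H} \subset M_2(\mathbb{C})$ is the algebra of quaternions,
$$\mathbb{H} = \left\{ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} ;\ \alpha, \beta \in \mathbb{C} \right\} . \tag{1.18}$$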
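As a small illustration of (1.18), the following sketch checks numerically that matrices of this form are closed under multiplication and do not commute in general. The helper names (`quat_matrix`, `matmul2`, `is_quaternion`) and the sample values are illustrative assumptions, not notation from the paper.

```python
# Sketch under stated assumptions: a quaternion is stored as the 2x2 complex
# matrix [[alpha, beta], [-conj(beta), conj(alpha)]] of Eq. (1.18).

def quat_matrix(alpha, beta):
    """Build the 2x2 matrix of Eq. (1.18) from two complex numbers."""
    return ((alpha, beta), (-beta.conjugate(), alpha.conjugate()))

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def is_quaternion(m, tol=1e-12):
    """True if m has the shape required by Eq. (1.18)."""
    return (abs(m[1][0] + m[0][1].conjugate()) < tol
            and abs(m[1][1] - m[0][0].conjugate()) < tol)

# Arbitrary sample elements (hypothetical values chosen for the check).
q1 = quat_matrix(1 + 2j, 3 - 1j)
q2 = quat_matrix(-0.5 + 1j, 2 + 2j)
product = matmul2(q1, q2)
print(is_quaternion(product), product != matmul2(q2, q1))
```

The product stays inside the set while `q1 q2` differs from `q2 q1`, which is the noncommutativity that distinguishes the $\mathbb{H}$ summand of $\mathcal{A}_F$ from the commutative summand $\mathbb{C}$.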
Giving the algebra $\mathcal{A}$ does not suffice to determine the spectral geometry; one still needs the action of $\mathcal{A}$ in $\mathcal{H}$ and the operator $D$. Since $\mathcal{A}$ is a tensor product (1.16) which geometrically corresponds to a product space, an instance of spectral geometry for $\mathcal{A}$ is given by the product rule,
$$\mathcal{H} = L^2(M, S) \otimes \mathcal{H}_F, \quad D = \partial\!\!\!/_M \otimes 1 + \gamma_5 \otimes D_F, \tag{1.19}$$
where (HF , DF ) is a spectral geometry on AF , while both L2 (M, S) and the Dirac operator ∂/M on M are as above. Since AF is finite dimensional the dimension of the corresponding space is 0 so that HF must be finite dimensional. The list of elementary fermions provides a natural candidate for HF . One lets HF be the Hilbert space with basis labelled by elementary leptons and quarks. Thus for the first generation of leptons we get eL , eR , νL , e¯L , e¯R , ν¯ L for instance, as the corresponding basis. The Z/2 grading γF is given by +1 for left handed particles and −1 for right handed ones. For quarks one has an additional color
index, y, r, b. The involution $J$ is just such that $Jf = \bar f$ for any $f$ in the basis. One has $J^2 = 1$, $J\gamma = \gamma J$ as dictated by the dimension $n = 0$. Moreover the algebra $\mathcal{A}_F$ has a natural representation in $\mathcal{H}_F$ and:
$$a b^0 = b^0 a \quad \forall\, a, b \in \mathcal{A}_F, \quad b^0 = J b^* J^{-1} . \tag{1.20}$$
Finally there is a natural matrix acting in the finite dimensional Hilbert space $\mathcal{H}_F$. It is
$$D_F = \begin{pmatrix} Y & 0 \\ 0 & \bar Y \end{pmatrix}, \tag{1.21}$$
where $Y$ is the Yukawa coupling matrix. The special features of $Y$ show that the algebraic rule
$$[[D, a], b^0] = 0 \quad \forall\, a, b \in \mathcal{A} \tag{1.22}$$
which is one of the essential axioms, holds for the spectral geometry (AF , HF , DF ) = F . Of course this 0-dimensional geometry is encoding the knowledge of the fermions of the standard model and it is a basic question to understand and characterize it abstractly, but let us postpone this problem and proceed with the product geometry M × F . The next important new feature of the noncommutative case is the following. We saw that the group Aut(A) of diffeomorphisms falls in equivalence classes under the normal subgroup Int(A) of inner automorphisms. In the same way the space of metrics has a natural foliation into equivalence classes. The internal fluctuations of a given metric are given by the formula, D = D0 + A + JAJ −1 , A = Σ ai [D0 , bi ] , ai , bi ∈ A and A = A∗ .
(1.23)
Thus starting from $(\mathcal{A}, \mathcal{H}, D_0)$ with obvious notations, one leaves the representation of $\mathcal{A}$ in $\mathcal{H}$ untouched and just perturbs the operator $D_0$ by (1.23), where $A$ is an arbitrary self-adjoint operator in $\mathcal{H}$ of the form $A = \Sigma\, a_i [D_0, b_i]$; $a_i, b_i \in \mathcal{A}$. One checks that this does not alter the axioms (check (1.22) for instance). These fluctuations are trivial: $D = D_0$ in the usual Riemannian case in the same way as the group of inner automorphisms $\mathrm{Int}(\mathcal{A}) = \{1\}$ is trivial for $\mathcal{A} = C^\infty(M)$. In general the natural action of $\widetilde{\mathrm{Int}}(\mathcal{A})$ on the space of metrics restricts to the above equivalence classes and is simply given by (for the automorphism associated to $u \in \mathcal{A}$, $uu^* = u^*u = 1$),
$$\xi \in \mathcal{H} \to u\,\xi\,u^* = u\,u^{*0}\,\xi, \quad A \to u[D, u^*] + uAu^* . \tag{1.24}$$
When one computes the internal fluctuations of the above product geometry $M \times F$ one finds ([6]) that they are parametrized exactly by the bosons $\gamma$, $W^\pm$, $Z$, the eight gluons and the Higgs fields $H$ of the standard model. The equality
$$\int_M (L_{Gf} + L_{Hf})\,\sqrt{g}\, d^4x = \langle \psi, D\psi \rangle \tag{1.25}$$
gives the contribution to (1.2) of the last two terms of the SM Lagrangian in terms of the operator D alone. The operator D encodes the metric of our “discrete Kaluza Klein” geometry M × F but this metric is no longer the product metric as it was for D0 . In fact the initial scale given by DF completely disappears when one considers the arbitrary internal fluctuations of D0 = ∂/M ⊗ 1 + γ5 ⊗ DF . What remains is to understand in a purely gravitational manner
the 4 remaining terms of the action (1.2). This is where we apply the basic principle (1.8). We shall check in this paper that for any smooth function $\chi$, one has
$$\mathrm{Trace}\,\chi\!\left(\frac{D}{\Lambda}\right) = I_E + I_G + I_{GH} + I_H + I_C + O(\Lambda^{-\infty}), \tag{1.26}$$
where $I_C$ is a sum of a cosmological term, a term of Weyl gravity and a term in $\int R\,H^2 \sqrt{g}\, d^4x$. The computation in itself is not new, and goes back to the work of DeWitt [7]. Similar computations also occur in the theory of induced gravity [8]. It is clear that the left hand side of (1.26) only depends upon the spectrum $\Sigma$ of the operator $D$, and following our principle (1.8) this allows us to take it as the natural candidate for the bare action at the cutoff scale $\Lambda$. In our framework there is a natural way to cut off the geometry at a given energy scale $\Lambda$, which has been developed in [9] for some concrete examples. It consists in replacing the Hilbert space $\mathcal{H}$ by the subspace $\mathcal{H}_\Lambda$,
$$\mathcal{H}_\Lambda = \mathrm{range}\ \chi\!\left(\frac{D}{\Lambda}\right) \tag{1.27}$$
and restricting both $D$ and $\mathcal{A}$ to this subspace, while maintaining the commutation rule (1.20) for the algebra $\mathcal{A}$. This procedure is superior to the familiar lattice approximation because it does respect the geometric symmetry group. The point is that finite dimensional noncommutative algebras have continuous Lie groups of automorphisms while the automorphism group of a commutative finite dimensional algebra is necessarily finite. The hypothesis which we shall test in this paper is that there exists an energy scale $\Lambda$ in the range $10^{15}$ to $10^{19}$ GeV at which the bare action (1.2) becomes geometric, i.e.
$$\sim\ \mathrm{Trace}\,\chi\!\left(\frac{D}{\Lambda}\right) + \langle \psi, D\psi \rangle . \tag{1.28}$$

2. The Spectral Action Principle Applied to the Einstein-Yang-Mills System
To test the spectral action functional (1.28) we shall first consider the simplest noncommutative modification of a manifold $M$. Thus we replace the algebra $C^\infty(M)$ of smooth functions on $M$ by the tensor product $\mathcal{A} = C^\infty(M) \otimes M_N(\mathbb{C})$, where $M_N(\mathbb{C})$ is the algebra of $N \times N$ matrices.
The spectral triple is obtained by tensoring the Dirac spectral triple for $M$ by the spectral triple for $M_N(\mathbb{C})$ given by the left action of $M_N(\mathbb{C})$ on the Hilbert space of $N \times N$ matrices with Hilbert-Schmidt norm. The real structure is given by the adjoint operation, $m \to m^*$ on matrices. Thus for the product geometry one has
$$\mathcal{H} = L^2(M, S) \otimes M_N(\mathbb{C}), \quad J(\xi \otimes m) = C\xi \otimes m^*, \quad D = \partial\!\!\!/_M \otimes 1 . \tag{2.1}$$
We shall compare the spectral action functional (1.28) with the following:
$$I = \frac{1}{2\kappa^2} \int R\,\sqrt{g}\, d^4x + I_{YM} \tag{2.2}$$
where $I_{YM} = \int (L_G + L_{Gf})\,\sqrt{g}\, d^4x$ is the action for an SU(N) Yang-Mills theory coupled to fermions in the adjoint representation. Starting with (2.1), one first computes the internal fluctuations of the metric and finds that they are parametrized exactly by an SU(N) Yang-Mills field $A$. Note that the formula $D = D_0 + A + JAJ^*$ eliminates the U(1) part of $A$ even if one starts with a U(N) gauge potential. One also checks that the coupling of the Yang-Mills field $A$ with the fermions is equal to
$$\langle \psi, D\psi \rangle, \quad \psi \in \mathcal{H} . \tag{2.3}$$
The operator $D = D_0 + A + JAJ^*$ is given by
$$D = e^\mu_a \gamma^a \left( (\partial_\mu + \omega_\mu) \otimes 1_N + 1 \otimes \left( -\tfrac{i}{2}\, g_0 A^i_\mu T^i \right) \right), \tag{2.4}$$
where $\omega_\mu$ is the spin-connection on $M$:
$$\omega_\mu = \tfrac{1}{4}\, \omega_\mu{}^{ab} \gamma_{ab},$$
and $T^i$ are matrices in the adjoint representation of SU(N) satisfying $\mathrm{Tr}(T^i T^j) = 2\delta^{ij}$. ($\omega_\mu{}^{ab}$ is related to the $e^a_\mu$ by the vanishing of the covariant derivative^1,
$$\nabla_\mu e^a_\nu = \partial_\mu e^a_\nu - \omega_\mu{}^{ab} e^b_\nu - \Gamma^\rho_{\mu\nu} e^a_\rho = 0 . \tag{2.5}$$
As the Christoffel connection
$$\Gamma^\rho_{\mu\nu} = \tfrac{1}{2}\, g^{\rho\sigma} \left( g_{\mu\sigma,\nu} + g_{\nu\sigma,\mu} - g_{\mu\nu,\sigma} \right) \tag{2.6}$$
is a given function of $g_{\mu\nu} = e^a_\mu e^a_\nu$, Eq. (2.5) could be solved to express $\omega_\mu{}^{ab}$ as a function of $e^a_\mu$.) It is a simple exercise to compute the square of the Dirac operator given by (2.4) [10–11]. This can be cast into the elliptic operator form [12]:
$$P = D^2 = -\left( g^{\mu\nu} \partial_\mu \partial_\nu \cdot 1\!\mathrm{I} + A^\mu \partial_\mu + B \right), \tag{2.7}$$
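where $1\!\mathrm{I}$, $A^\mu$ and $B$ are matrices of the same dimensions as $D$, and are given by:
$$A^\mu = (2\omega^\mu - \Gamma^\mu) \otimes 1_N - i g_0\, 1_4 \otimes A^{\mu i} T^i, \qquad B = (\partial^\mu \omega_\mu + \omega^\mu \omega_\mu - \Gamma^\nu \omega_\nu + R) \otimes 1_N - i g_0\, \omega_\mu \otimes A^{\mu i} T^i . \tag{2.8}$$
^1 We have limited our considerations to torsion free spaces. The more general case of torsion will be treated somewhere else.
In deriving (2.8) we have used Eq. (2.5) as well as the following definitions and identities
$$\begin{aligned} [\partial_\mu + \omega_\mu, \partial_\nu + \omega_\nu] &\equiv \tfrac{1}{4} R_{\mu\nu}{}^{ab}(\omega(e))\,\gamma_{ab}, \\ e^a_\rho e^b_\sigma R_{\mu\nu ab}(\omega(e)) &= R_{\mu\nu\rho\sigma}(g), \\ R^\mu{}_{\nu\rho\sigma} &= \partial_\rho \Gamma^\mu_{\nu\sigma} - \partial_\sigma \Gamma^\mu_{\nu\rho} + \Gamma^\mu_{\rho\kappa}\Gamma^\kappa_{\nu\sigma} - \Gamma^\mu_{\sigma\kappa}\Gamma^\kappa_{\nu\rho}, \\ \Gamma^\mu &= g^{\nu\sigma}\Gamma^\mu_{\nu\sigma}; \end{aligned} \tag{2.9}$$
we have also used the symmetries of the Riemann tensor to prove that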
γ µν Rµνab γab = −2R .
(2.10)
We shall now compute the spectral action for this theory given by
$$\mathrm{Tr}\,\chi\!\left(\frac{D^2}{m_0^2}\right) + (\psi, D\psi), \tag{2.11}$$
where the trace Tr is the usual trace of operators in the Hilbert space $\mathcal{H}$, and $m_0$ is a (mass) scale to be specified. The function $\chi$ is chosen to be positive and this has important consequences for the positivity of the gravity action. Using identities [12]:
$$\mathrm{Tr}(P^{-s}) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1}\, \mathrm{Tr}\, e^{-tP}\, dt, \quad \mathrm{Re}(s) \ge 0, \tag{2.12}$$
and the heat kernel expansion for
$$\mathrm{Tr}\, e^{-tP} \simeq \sum_{n \ge 0} t^{\frac{n-m}{d}} \int_M a_n(x, P)\, dv(x), \tag{2.13}$$
where $m$ is the dimension of the manifold in $C^\infty(M)$, $d$ is the order of $P$ (in our case $m = 4$, $d = 2$) and $dv(x) = \sqrt{g}\, d^m x$, where $g^{\mu\nu}$ is the metric on $M$ appearing in equation (2.7). If $s = 0, -1, \ldots$ is a non-positive integer then $\mathrm{Tr}(P^{-s})$ is regular at this value of $s$ and is given by
$$\mathrm{Tr}(P^{-s}) = \mathrm{Res}\ \Gamma(s)\big|_{s = \frac{m-n}{d}}\ a_n .$$
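The Mellin-transform identity (2.12) can be checked numerically in a toy setting. The sketch below assumes a finite-dimensional positive operator specified only by a list of eigenvalues (a stand-in for $P$, not the Dirac Laplacian of the paper) and uses a plain midpoint rule; the function names and sample spectrum are hypothetical.

```python
import math

def mellin_power(lam, s, t_max=60.0, steps=200_000):
    """Midpoint-rule estimate of (1/Gamma(s)) * int_0^t_max t^(s-1) exp(-t*lam) dt,
    which should approximate lam**(-s) for lam > 0 and s >= 1."""
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (s - 1) * math.exp(-t * lam)
    return total * h / math.gamma(s)

def trace_power(eigenvalues, s):
    """Tr(P^-s) via Eq. (2.12), summing the heat-kernel integral over the spectrum."""
    return sum(mellin_power(lam, s) for lam in eigenvalues)

# Hypothetical positive spectrum used only for the check.
spectrum = [1.0, 2.0, 5.0]
approx = trace_power(spectrum, 2)
exact = sum(lam ** -2 for lam in spectrum)
print(approx, exact)
```

For an operator with infinitely many eigenvalues the small-$t$ behaviour of $\mathrm{Tr}\, e^{-tP}$ is what the expansion (2.13) controls; the finite example only illustrates the integral representation itself.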
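From this we deduce that
$$\mathrm{Tr}\,\chi(P) \simeq \sum_{n \ge 0} f_n\, a_n(P), \tag{2.14}$$
where the coefficients $f_n$ are given by
$$f_0 = \int_0^\infty \chi(u)\, u\, du, \quad f_2 = \int_0^\infty \chi(u)\, du, \quad f_{2(n+2)} = (-1)^n \chi^{(n)}(0), \quad n \ge 0, \tag{2.15}$$
and $a_n(P) = \int a_n(x, P)\, dv(x)$. The Seeley-de Witt coefficients $a_n(P)$ vanish for odd values of $n$. The first three $a_n$'s for $n$ even are [12]:
$$\begin{aligned} a_0(x, P) &= (4\pi)^{-m/2}\, \mathrm{Tr}(1\!\mathrm{I}), \\ a_2(x, P) &= (4\pi)^{-m/2}\, \mathrm{Tr}\left( -\frac{R}{6}\, 1\!\mathrm{I} + E \right), \\ a_4(x, P) &= (4\pi)^{-m/2}\, \frac{1}{360}\, \mathrm{Tr}\Big( \big( -12 R_{;\mu}{}^{\mu} + 5R^2 - 2R_{\mu\nu}R^{\mu\nu} + 2R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} \big) 1\!\mathrm{I} - 60RE + 180E^2 + 60E_{;\mu}{}^{\mu} + 30\,\Omega_{\mu\nu}\Omega^{\mu\nu} \Big), \end{aligned} \tag{2.16}$$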
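where $E$ and $\Omega_{\mu\nu}$ are defined by
$$\begin{aligned} E &= B - g^{\mu\nu}\left( \partial_\mu \omega'_\nu + \omega'_\mu \omega'_\nu - \Gamma^\rho_{\mu\nu}\, \omega'_\rho \right), \\ \Omega_{\mu\nu} &= \partial_\mu \omega'_\nu - \partial_\nu \omega'_\mu + [\omega'_\mu, \omega'_\nu], \\ \omega'_\mu &= \tfrac{1}{2}\, g_{\mu\nu}\left( A^\nu - \Gamma^\nu \cdot 1\!\mathrm{I} \right) . \end{aligned} \tag{2.17}$$
The Ricci and scalar curvature are defined by
$$R_{\mu\rho} = R_{\mu\nu ab}\, e^{\nu b} e^a_\rho, \qquad R = R_{\mu\nu ab}\, e^{\mu a} e^{\nu b} . \tag{2.18}$$
We now have all the necessary tools to evaluate explicitly the spectral action (2.11). Using Eqs. (2.8) and (2.16) we find:
$$\begin{aligned} E &= \tfrac{1}{4} R \otimes 1\!\mathrm{I}_4 \otimes 1\!\mathrm{I}_N + \tfrac{i}{4}\, \gamma^{\mu\nu} \otimes g F^i_{\mu\nu} T^i, \\ \Omega_{\mu\nu} &= \tfrac{1}{4} R_{\mu\nu}{}^{ab} \gamma_{ab} \otimes 1_N - \tfrac{i}{2}\, 1\!\mathrm{I}_4 \otimes g F^i_{\mu\nu} T^i . \end{aligned} \tag{2.19}$$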
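From the knowledge that the invariants of the heat equation are polynomial functions of $R$, $R_{\mu\nu}$, $R_{\mu\nu\rho\sigma}$, $E$ and $\Omega_{\mu\nu}$ and their covariant derivatives, it is then evident from Eq. (2.19) that the spectral action would not only be diffeomorphism invariant but also gauge invariant. The first three invariants are then^2
$$\begin{aligned} a_0(P) &= \frac{N}{4\pi^2} \int_M \sqrt{g}\, d^4x, \\ a_2(P) &= \frac{N}{48\pi^2} \int_M \sqrt{g}\, R\, d^4x, \\ a_4(P) &= \frac{1}{16\pi^2} \cdot \frac{N}{360} \int_M d^4x\, \sqrt{g}\, \Big[ \big( 12 R_{;\mu}{}^{\mu} + 5R^2 - 8R_{\mu\nu}R^{\mu\nu} - 7R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} \big) + \frac{120}{N}\, g^2 F^i_{\mu\nu} F^{\mu\nu i} \Big] . \end{aligned} \tag{2.20}$$
^2 Note that according to our notations the scalar curvature $R$ is negative for spheres.
For the special case where the dimension of the manifold $M$ is four, we have a relation between the Gauss-Bonnet topological invariant and the three possible curvature square terms:
$$R^*R^* = R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} - 4R_{\mu\nu}R^{\mu\nu} + R^2, \tag{2.21}$$
where $R^*R^* \equiv \tfrac{1}{4}\, \varepsilon^{\mu\nu\rho\sigma} \varepsilon_{\alpha\beta\gamma\delta}\, R_{\mu\nu}{}^{\alpha\beta} R_{\rho\sigma}{}^{\gamma\delta}$. Moreover, we can change the expression for $a_4(P)$ in terms of $C_{\mu\nu\rho\sigma}$ instead of $R_{\mu\nu\rho\sigma}$, where
$$C_{\mu\nu\rho\sigma} = R_{\mu\nu\rho\sigma} - \left( g_{\mu[\rho} R_{\nu|\sigma]} - g_{\nu[\rho} R_{\mu|\sigma]} \right) + \tfrac{1}{6} \left( g_{\mu\rho} g_{\nu\sigma} - g_{\mu\sigma} g_{\nu\rho} \right) R \tag{2.22}$$
is the Weyl tensor. Using the identity:
$$R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} = C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + 2R_{\mu\nu}R^{\mu\nu} - \tfrac{1}{3} R^2, \tag{2.23}$$
we can recast $a_4(P)$ into the alternative form: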
$$a_4(P) = \frac{N}{48\pi^2} \int d^4x\, \sqrt{g}\, \Big[ -\frac{3}{20}\, C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + \frac{1}{120} \left( 11 R^*R^* + 12 R_{;\mu}{}^{\mu} \right) + \frac{g^2}{N}\, F^i_{\mu\nu} F^{\mu\nu i} \Big], \tag{2.24}$$
and this is explicitly conformal invariant. The Euler characteristic $\chi_E$ (not to be confused with the function $\chi$) is related to $R^*R^*$ by the relation
$$\chi_E = \frac{1}{32\pi^2} \int d^4x\, \sqrt{g}\, R^*R^* . \tag{2.25}$$
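The passage from (2.20) to (2.24) is a piece of linear algebra on curvature-squared invariants, and it can be verified with exact rational arithmetic. The sketch below expands every invariant in the basis $(C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma},\ R_{\mu\nu}R^{\mu\nu},\ R^2)$ using (2.21) and (2.23); the variable names and the choice of basis are illustrative assumptions.

```python
from fractions import Fraction as F

# Basis order (an assumption of this sketch): (C^2, Ric^2, R^2).
WEYL = (F(1), F(0), F(0))
RICCI = (F(0), F(1), F(0))
SCALAR = (F(0), F(0), F(1))

def combo(*terms):
    """Linear combination of basis vectors; terms are (coefficient, vector) pairs."""
    return tuple(sum(c * v[i] for c, v in terms) for i in range(3))

# Identity (2.23): Riem^2 = C^2 + 2 Ric^2 - R^2/3.
RIEMANN = combo((1, WEYL), (2, RICCI), (F(-1, 3), SCALAR))
# Identity (2.21): R*R* = Riem^2 - 4 Ric^2 + R^2.
GAUSS_BONNET = combo((1, RIEMANN), (-4, RICCI), (1, SCALAR))

# Curvature-squared part of a_4 in (2.20), stripped of the prefactor N/(16 pi^2 * 360).
lhs = combo((5, SCALAR), (-8, RICCI), (-7, RIEMANN))
# Same part in (2.24): N/(48 pi^2) = 120 * N/(16 pi^2 * 360), hence the factor 120.
rhs = combo((120 * F(-3, 20), WEYL), (120 * F(11, 120), GAUSS_BONNET))

print(lhs == rhs, lhs)
```

The total-derivative term ($12 R_{;\mu}{}^{\mu}$ against $120 \cdot \tfrac{12}{120}$) and the Yang-Mills term ($\tfrac{120}{N} g^2$ against $120 \cdot \tfrac{g^2}{N}$) match by inspection with the same factor of 120.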
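If we choose the function $\chi$ to be a cutoff function, i.e. $\chi(x) = 1$ for $x$ near 0, then $\chi^{(n)}(0)$ is zero $\forall\, n > 0$ and this removes the non-renormalizable interactions. It is also possible to introduce a mass scale $m_0$ and consider $\chi$ to be a function of the dimensionless variable $\chi\!\left(\frac{P}{m_0^2}\right)$. In this case terms coming from $a_n(P)$, $n > 4$ will be suppressed by powers of $\frac{1}{m_0^2}$:
$$I_b = \frac{N}{48\pi^2} \Big[ 12 m_0^4 f_0 \int d^4x\, \sqrt{g} + m_0^2 f_2 \int d^4x\, \sqrt{g}\, R + f_4 \int d^4x\, \sqrt{g}\, \Big( -\frac{3}{20}\, C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + \frac{1}{10}\, R_{;\mu}{}^{\mu} + \frac{11}{20}\, R^*R^* + \frac{g^2}{N}\, F^i_{\mu\nu} F^{\mu\nu i} \Big) + O\!\left( \frac{1}{m_0^2} \right) \Big] . \tag{2.26}$$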
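To make the moments $f_0$, $f_2$, $f_4$ of (2.15) concrete, here is a minimal numerical sketch for one sample profile. The choice $\chi(u) = e^{-u}$ is purely an assumption for illustration (it is positive but is not a sharp cutoff, so its higher derivatives at 0 do not vanish); the function names are hypothetical.

```python
import math

def chi_exp(u):
    """Illustrative positive profile chi(u) = exp(-u); an assumption, not the paper's choice."""
    return math.exp(-u)

def moment(chi, power, u_max=60.0, steps=200_000):
    """Midpoint-rule estimate of int_0^u_max chi(u) * u**power du."""
    h = u_max / steps
    return h * sum(chi((i + 0.5) * h) * ((i + 0.5) * h) ** power for i in range(steps))

f0 = moment(chi_exp, 1)   # f_0 = int chi(u) u du, multiplies the cosmological term
f2 = moment(chi_exp, 0)   # f_2 = int chi(u) du, multiplies the Einstein term
f4 = chi_exp(0.0)         # f_4 = chi(0), multiplies the a_4 terms
print(f0, f2, f4)
```

For this profile all three moments are equal to 1 analytically, so the relative sizes of the terms in (2.26) are then set entirely by the powers of $m_0$.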
We shall adopt Wilson’s viewpoint of the renormalization group approach to field theory [13] where the spectral action is taken to give the bare action with bare quantities $m_0$ and $g_0$ and with a cutoff scale $\Lambda$, where the theory is assumed to take a geometrical form. Introducing the cutoff scale $\Lambda$ will regularize the theory. The perturbative expansion is then reexpressed in terms of renormalized physical quantities. The fields also receive wave function renormalization. Normalizing the Einstein and Yang-Mills terms in the bare action we then have:
$$\frac{N m_0^2 f_2}{24\pi^2} = \frac{1}{\kappa_0^2} \equiv \frac{1}{8\pi G_0}, \qquad \frac{f_4\, g_0^2}{12\pi^2} = 1, \tag{2.27}$$
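and (2.26) becomes:
$$I_b = \int d^4x\, \sqrt{g}\, \Big[ \frac{1}{2\kappa_0^2}\, R + e_0 + a_0\, C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + c_0\, R^*R^* + d_0\, R_{;\mu}{}^{\mu} + \frac{1}{4}\, F^i_{\mu\nu} F^{\mu\nu i} \Big], \tag{2.28}$$
where
$$a_0 = -\frac{3N}{80}\, \frac{1}{g_0^2}, \qquad c_0 = -\frac{2}{3}\, a_0, \qquad d_0 = -\frac{11}{3}\, a_0, \qquad e_0 = \frac{N m_0^4 f_0}{4\pi^2} . \tag{2.29}$$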
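The normalizations (2.27) and the values of $a_0$ and $e_0$ in (2.29) follow from (2.26) by rational arithmetic on the numerical prefactors. The sketch below tracks only those rational factors, with the powers of $\pi$, $N$, $m_0$, $g_0$ and the moments $f_n$ factored out as indicated in the comments; the variable names are illustrative assumptions.

```python
from fractions import Fraction as F

PREFACTOR = F(1, 48)   # overall N / (48 pi^2) in (2.26), in units of N / pi^2

# Yang-Mills term: (N / 48 pi^2) * f4 * (g^2 / N) F^2 must equal (1/4) F^2.
# In units where f4 = x * pi^2 / g0^2 this fixes x:
f4_units = F(1, 4) / PREFACTOR                     # expected 12, i.e. f4 g0^2 / (12 pi^2) = 1

# Einstein term: (N / 48 pi^2) * m0^2 f2 * R must equal R / (2 kappa0^2).
inv_kappa0_sq_units = 2 * PREFACTOR                # 1/kappa0^2 in units of N m0^2 f2 / pi^2

# Weyl term: a0 = (N / 48 pi^2) * f4 * (-3/20), in units of N / g0^2.
a0_units = PREFACTOR * f4_units * F(-3, 20)

# Cosmological term: e0 = (N / 48 pi^2) * 12 m0^4 f0, in units of N m0^4 f0 / pi^2.
e0_units = PREFACTOR * 12

print(f4_units, inv_kappa0_sq_units, a0_units, e0_units)
```

Only the coefficients whose derivation is unambiguous from the equations as printed are checked here; the values of $c_0$ and $d_0$ are not rederived.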
The renormalized action receives counterterms of the same form as the bare action but with physical parameters $k, a, c, d$, and requires the addition of one new term [14]
$$\int d^4x\, \sqrt{g}\, \left( b\, R^2 \right) . \tag{2.30}$$
This adds one further boundary condition for Eqs. (2.29): $b_0 = 0$. The renormalized fermionic action $(\psi, D\psi)$ keeps the same form as the bare fermionic action. The renormalization group equations will yield relations between the bare quantities and the physical quantities with the addition of the cutoff scale $\Lambda$. Conditions on the bare quantities would translate into conditions on the physical quantities. In the present example only the gauge coupling $g(\Lambda)$ and Newton’s constant will have measurable effects. The dependence of $\kappa_0$ on $\kappa$ and the other physical quantities is such that $\kappa_0^{-2} - \kappa^{-2}$ contains terms proportional to the cutoff scale. As $\kappa^2$ must be identified with $8\pi G$ at low energy it is clear that both $\kappa_0^{-1}$ and $\Lambda$ could be as high as the Planck scale^3. The renormalization group equations of this system (after the addition of the $R^2$ term) were studied by Fradkin and Tseytlin [15], and the system is known to be renormalizable, but non-unitary [14] due to the presence of a spin-two ghost (tachyon) pole near the Planck mass. We shall not worry about non-unitarity (see, however, reference 16), because in our view at the Planck energy the manifold structure of space-time will break down and one must have a completely finite theory where only the part of the Hilbert space given by $\chi(D^2)\mathcal{H}$ enters. The algebra $\mathcal{A}$ becomes finite dimensional in such a way that all symmetries of the continuum (in some approximation) would be admitted. In the limit of flat space-time we have $g_{\mu\nu} = \delta_{\mu\nu}$ and the action (2.11) becomes (adopting the normalizations (2.29)):
$$\frac{1}{4}\, F^i_{\mu\nu} F^{\mu\nu i} + (\psi, D\psi), \tag{2.31}$$
where we have dropped the constant term. This action is known to have $N = 1$ global supersymmetry. In reality we can also obtain the $N = 2$ and $N = 4$ super Yang-Mills actions by taking the appropriate Dirac operators in six and ten dimensions respectively [17].
^3 We would like to thank A. Tseytlin for correspondence on this point.
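3. Spectral Action for the Standard Model
Having illustrated the computation of our spectral action for the Einstein-Yang-Mills system we now address the realistic case of obtaining action (1.2) for the Einstein-Standard model system. We first briefly summarize the spectral triple $(\mathcal{A}, \mathcal{H}, D)$ associated with the spectrum of the standard model. A complete treatment can be found in [4,6]. The geometry is that of a 4-dimensional smooth Riemannian manifold with a fixed spin structure times a discrete geometry. The product geometry is given by the rules
$$\mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2, \quad \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2, \quad D = D_1 \otimes 1 + \gamma_5 \otimes D_2, \tag{3.1}$$
where $\mathcal{A}_1 = C^\infty(M)$, $D_1 = \partial\!\!\!/_M$ the Dirac operator on $M$, $\mathcal{H}_1 = L^2(M, S)$ and the discrete geometry $(\mathcal{A}_2, \mathcal{H}_2, D_2)$ will now be described. The algebra $\mathcal{A}_2$ is the direct sum of the real involutive algebras $\mathbb{C}$ of complex numbers, $\mathbb{H}$ of quaternions, and $M_3(\mathbb{C})$ of $3 \times 3$ matrices. $\mathcal{H}_2$ is the Hilbert space with basis the elementary fermions, namely the quarks $Q$, leptons $L$ and their charge conjugates, where
$$Q = \begin{pmatrix} u_L \\ d_L \\ d_R \\ u_R \end{pmatrix}, \qquad L = \begin{pmatrix} \nu_L \\ e_L \\ e_R \end{pmatrix}, \tag{3.2}$$
and we have omitted family indices for $Q$ and $L$ and colour index for $Q$. The antilinear isometry $J = J_2$ in $\mathcal{H}_2$ exchanges a fermion with its conjugate. The action of an element $a = (\lambda, q, m) \in \mathcal{A}_2$ in $\mathcal{H}_2$ is given by:
$$a\, Q = \begin{pmatrix} q \begin{pmatrix} u_L \\ d_L \end{pmatrix} \\ \bar\lambda\, d_R \\ \lambda\, u_R \end{pmatrix}, \tag{3.3}$$
where $q = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$ is a quaternion. A similar formula holds for leptons. The action on conjugate particles is:
$$a\, \bar L = \lambda\, \bar L, \qquad a\, \bar Q = m\, \bar Q . \tag{3.4}$$
For the operator $D_2$ we take $D_2 = \begin{pmatrix} Y & 0 \\ 0 & \bar Y \end{pmatrix}$, where $Y$ is a Yukawa coupling matrix of the form
$$Y = Y_q \otimes 1_3 \oplus Y_\ell \tag{3.5}$$
with
$$Y_q = \begin{pmatrix} 0_2 & k_0^d \otimes H_0 & k_0^u \otimes \tilde H_0 \\ (k_0^d)^* \otimes H_0^* & 0 & 0 \\ (k_0^u)^* \otimes \tilde H_0^* & 0 & 0 \end{pmatrix}, \qquad Y_\ell = \begin{pmatrix} 0_2 & k_0^e \otimes H_0 \\ k_0^{e*} \otimes H_0^* & 0 \end{pmatrix} .$$
The matrices $k^d$, $k^u$ and $k^e$ are $3 \times 3$ family mixing matrices and
$$H_0 = \mu \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \tilde H_0 = i\sigma_2 H_0^* . \tag{3.6}$$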
The parameter µ has the dimension of mass. The choice of the Dirac operator and the action of A2 in H2 comes from the restrictions that these must satisfy: J 2 = 1 , [J, D2 ] = 0 , [a, Jb∗ J −1 ] = 0, [[D, a], Jb∗ J −1 ] = 0 ∀ a, b .
(3.7)
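The next step is to compute the inner fluctuations of the metric and thus the operators of the form: $A = \Sigma\, a_i [D, b_i]$. This with the self-adjointness condition $A = A^*$ gives U(1), SU(2) and U(3) gauge fields as well as a Higgs field. The computation of $A + JAJ^{-1}$ removes a U(1) part from the above gauge fields (such that the full matrix is traceless) (for derivation see [4]). The Dirac operator $D_q$ that takes the inner fluctuations into account is given by the $36 \times 36$ matrix (acting on the 36 quarks) (tensored with Clifford algebras)
$$D_q = \begin{pmatrix} \gamma^\mu \otimes \left( D_\mu \otimes 1_2 - \tfrac{i}{2}\, g_{02} A^\alpha_\mu \sigma^\alpha - \tfrac{i}{6}\, g_{01} B_\mu \otimes 1_2 \right) \otimes 1_3 & \gamma_5 \otimes k_0^d \otimes H & \gamma_5 \otimes k_0^u \otimes \tilde H \\ \gamma_5 \otimes k_0^{d*} \otimes H^* & \gamma^\mu \otimes \left( D_\mu + \tfrac{i}{3}\, g_{01} B_\mu \right) \otimes 1_3 & 0 \\ \gamma_5 \otimes k_0^{u*} \otimes \tilde H^* & 0 & \gamma^\mu \otimes \left( D_\mu - \tfrac{2i}{3}\, g_{01} B_\mu \right) \otimes 1_3 \end{pmatrix} \otimes 1_3 + \gamma^\mu \otimes 1_4 \otimes 1_3 \otimes \left( -\tfrac{i}{2}\, g_{03} V^i_\mu \lambda^i \right), \tag{3.8}$$
where $\sigma^\alpha$ are Pauli matrices and $\lambda^i$ are Gell-Mann matrices satisfying
$$\mathrm{Tr}(\lambda^i \lambda^j) = 2\delta^{ij} . \tag{3.9}$$
The vector fields $B_\mu$, $A^\alpha_\mu$ and $V^i_\mu$ are the U(1), SU(2)$_w$ and SU(3)$_c$ gauge fields with gauge couplings $g_{01}$, $g_{02}$ and $g_{03}$. The differential operator $D_\mu$ is given by
$$D_\mu = \partial_\mu + \omega_\mu \tag{3.10}$$
and $\gamma^\mu = e^\mu_a \gamma^a$. The scalar field $H$ is the Higgs doublet, and $\tilde H$ is the SU(2) conjugate of $H$:
$$\tilde H = (i\sigma^2 H^*) . \tag{3.11}$$
We note that although $H_0$ was introduced in the definition of $D_2$ it is absorbed in the field $H$. It is a simple exercise to see that the action for the fermionic quark sector is given by
$$(Q, D_q Q) . \tag{3.12}$$
The Dirac operator acting on the leptons, taking inner fluctuations into account, is given by the $9 \times 9$ matrix (tensored with Clifford algebra matrices):
$$D_\ell = \begin{pmatrix} \gamma^\mu \otimes \left( D_\mu - \tfrac{i}{2}\, g_{02} A^\alpha_\mu \sigma^\alpha + \tfrac{i}{2}\, g_{01} B_\mu \otimes 1_2 \right) \otimes 1_3 & \gamma_5 \otimes k_0^e \otimes H \\ \gamma_5 \otimes k_0^{*e} \otimes H^* & \gamma^\mu \otimes \left( D_\mu + i g_{01} B_\mu \right) \otimes 1_3 \end{pmatrix} . \tag{3.13}$$
Again the leptonic action has the simple form: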
$$(L, D_\ell L) . \tag{3.14}$$
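According to our universal formula (1.28) the spectral action for the standard model is given by:
$$\mathrm{Tr}[\chi(D^2/m_0^2)] + (\psi, D\psi), \tag{3.15}$$
where $(\psi, D\psi)$ will include the quark sector (3.12) and the leptonic sector (3.14). Calculating the bosonic part of the above action follows the same lines as in the previous section. The steps that lead to the result are given in the Appendix. The bosonic action is
$$\begin{aligned} I ={}& \frac{9 m_0^4}{\pi^2}\, f_0 \int \frac{5}{4}\, d^4x\, \sqrt{g} \\ &+ \frac{3 m_0^2}{4\pi^2}\, f_2 \int d^4x\, \sqrt{g}\, \Big[ \frac{5}{4}\, R - 2y^2 H^*H \Big] \\ &+ \frac{f_4}{4\pi^2} \int d^4x\, \sqrt{g}\, \Big[ \frac{1}{40}\, \frac{5}{4} \left( 12 R_{;\mu}{}^{\mu} + 11 R^*R^* - 18\, C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} \right) + 3y^2 \Big( D_\mu H^* D^\mu H - \frac{1}{6}\, R\, H^*H \Big) \\ &\qquad + g_{03}^2\, G^i_{\mu\nu} G^{\mu\nu i} + g_{02}^2\, F^\alpha_{\mu\nu} F^{\mu\nu\alpha} + \frac{5}{3}\, g_{01}^2\, B_{\mu\nu} B^{\mu\nu} + 3z^2 (H^*H)^2 - y^2 (H^*H)_{;\mu}{}^{\mu} \Big] + O\!\left( \frac{1}{m_0^2} \right), \end{aligned} \tag{3.16}$$
where we have denoted
$$\begin{aligned} y^2 &= \mathrm{Tr}\Big( |k_0^d|^2 + |k_0^u|^2 + \frac{1}{3} |k_0^e|^2 \Big), \\ z^2 &= \mathrm{Tr}\Big( |k_0^d|^4 + |k_0^u|^4 + \frac{1}{3} |k_0^e|^4 \Big), \\ D_\mu H &= \partial_\mu H - \frac{i}{2}\, g_{02} A^\alpha_\mu \sigma^\alpha H - \frac{i}{2}\, g_{01} B_\mu H . \end{aligned} \tag{3.17}$$
Normalizing the Einstein and Yang-Mills terms gives:
$$\frac{15 m_0^2 f_2}{4\pi^2} = \frac{1}{\kappa_0^2}, \qquad \frac{g_{03}^2 f_4}{\pi^2} = 1, \qquad g_{03}^2 = g_{02}^2 = \frac{5}{3}\, g_{01}^2 . \tag{3.18}$$
Relations (3.18) among the gauge coupling constants coincide with those coming from SU(5) unification. To normalize the Higgs fields kinetic energy we have to rescale $H$ by:
$$H \to \frac{2}{\sqrt{3}}\, \frac{g_{03}}{y}\, H . \tag{3.19}$$
This transforms the bosonic action (3.16) to the form: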
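$$\begin{aligned} I_b = \int d^4x\, \sqrt{g}\, \Big[ & \frac{1}{2\kappa_0^2}\, R - \mu_0^2 (H^*H) + a_0\, C_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + b_0 R^2 + c_0\, R^*R^* + d_0\, R_{;\mu}{}^{\mu} + e_0 \\ & + \frac{1}{4}\, G^i_{\mu\nu} G^{\mu\nu i} + \frac{1}{4}\, F^\alpha_{\mu\nu} F^{\mu\nu\alpha} + \frac{1}{4}\, B_{\mu\nu} B^{\mu\nu} + |D_\mu H|^2 - \xi_0 R |H|^2 + \lambda_0 (H^*H)^2 \Big], \end{aligned} \tag{3.20}$$
where
$$\begin{aligned} \mu_0^2 &= \frac{4}{3\kappa_0^2}, & a_0 &= -\frac{9}{8 g_{03}^2}, & b_0 &= 0, & c_0 &= -\frac{11}{18}\, a_0, \\ d_0 &= -\frac{2}{3}\, a_0, & e_0 &= \frac{45}{4\pi^2}\, f_0 m_0^4, & \lambda_0 &= \frac{4}{3}\, g_{03}^2\, \frac{z^2}{y^4}, & \xi_0 &= \frac{1}{6} . \end{aligned} \tag{3.21}$$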
As explained in the last section this action has to be taken as the bare action at some cutoff scale $\Lambda$. The renormalized action will have the same form as (3.20) but with the bare quantities $\kappa_0$, $\mu_0$, $\lambda_0$, $a_0$ to $e_0$ and $g_{01}$, $g_{02}$, $g_{03}$ replaced with physical quantities. Relations between the bare gauge coupling constants as well as Eqs. (3.19) have to be imposed as boundary conditions on the renormalization group equations [13]. The bare mass of the Higgs field is related to the bare value of Newton’s constant, and both have quadratic divergences in the limit of infinite cutoff $\Lambda$. The relation between $m_0^2$ and the physical quantities is:
$$m_0^2 = m^2 \left[ 1 + \frac{\frac{\Lambda^2}{m^2} - 1}{32\pi^2} \left( \frac{9}{4}\, g_2^2 + \frac{3}{4}\, g_1^2 + 6\lambda - 6k_t^2 \right) + O\!\left( \ln \frac{\Lambda^2}{m^2} \right) + \ldots \right] \tag{3.22}$$
For $m^2(\Lambda)$ to be small at low energies $m_0^2$ should be tuned to be proportional to the cutoff scale according to Eq. (3.22). Similarly the bare cosmological constant is related to the physical one (which must be tuned to zero at low energies):
$$e_0 = e + \frac{\Lambda^4}{32\pi^2}\, (62) + \ldots, \tag{3.23}$$
where 62 is the difference between the fermionic degrees of freedom (90) and the bosonic ones (28).
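The text does not spell out how the figures 90 and 28 are obtained. One counting that reproduces them, offered here only as an assumption, takes two on-shell degrees of freedom per Weyl fermion (with no right-handed neutrino) and counts the bosons in the unbroken phase:

```python
# Sketch of one possible counting (an assumption; the paper only quotes 90, 28 and 62).

GENERATIONS = 3
COLOURS = 3

# Weyl fermions per generation: u_L, d_L, u_R, d_R in each colour, plus nu_L, e_L, e_R.
quark_weyl = 4 * COLOURS
lepton_weyl = 3
weyl_total = GENERATIONS * (quark_weyl + lepton_weyl)
fermionic_dof = 2 * weyl_total            # two on-shell degrees of freedom per Weyl fermion

# Gauge bosons of U(1) x SU(2) x SU(3), two polarizations each, plus a complex Higgs doublet.
gauge_bosons = 1 + 3 + 8
higgs_real_components = 4
bosonic_dof = 2 * gauge_bosons + higgs_real_components

print(fermionic_dof, bosonic_dof, fermionic_dof - bosonic_dof)
```

Counting the bosons after symmetry breaking (three massive vectors with three polarizations each, nine massless vectors with two, and one physical Higgs scalar) gives the same total of 28.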
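There is also a relation between the bare scale $\kappa_0^{-2}$ and the physical one $\kappa^{-2}$ which is similar to Eq. (3.20) (but with all one-loop contributions coming with the same sign) which shows that $\kappa_0^{-1} \sim m_0$ and $\Lambda$ are of the same order as the Planck mass. The renormalization group equations for the gauge coupling constants are:
$$\frac{dg_1}{dt} = \frac{1}{16\pi^2} \left( \frac{41}{6} \right) g_1^3, \qquad \frac{dg_2}{dt} = \frac{1}{16\pi^2} \left( -\frac{19}{6} \right) g_2^3, \qquad \frac{dg_3}{dt} = \frac{1}{16\pi^2}\, (-7)\, g_3^3, \tag{3.24}$$
where $t = \ln \mu$, $\mu$ being the running scale. Solutions to Eqs. (3.24) are known from the SU(5) case and are given by [19]
$$\begin{aligned} \alpha_1^{-1}(M_Z) &= \alpha_1^{-1}(\Lambda) + \frac{41}{12\pi} \ln \frac{\Lambda}{M_Z}, \\ \alpha_2^{-1}(M_Z) &= \alpha_2^{-1}(\Lambda) - \frac{19}{12\pi} \ln \frac{\Lambda}{M_Z}, \\ \alpha_3^{-1}(M_Z) &= \alpha_3^{-1}(\Lambda) - \frac{42}{12\pi} \ln \frac{\Lambda}{M_Z}, \end{aligned} \tag{3.25}$$
where $\alpha_i = \frac{g_i^2}{4\pi}$, $i = 1, 2, 3$ and $M_Z$ is the mass of the Z vectors. At the scale $\Lambda$ we have to impose the boundary conditions (3.18):
$$\alpha_3(\Lambda) = \alpha_2(\Lambda) = \frac{5}{3}\, \alpha_1(\Lambda) . \tag{3.26}$$
Using Eqs. (3.25) and (3.26) one easily finds:
$$\sin^2\theta_w = \frac{3}{8} \left[ 1 - \frac{109}{18\pi}\, \alpha_{em} \ln \frac{\Lambda}{M_Z} \right], \qquad \ln \frac{\Lambda}{M_Z} = \frac{2\pi}{67} \left( 3\alpha_{em}^{-1}(M_Z) - 8\alpha_3^{-1}(M_Z) \right) . \tag{3.27}$$
The present experimental values for $\alpha_{em}^{-1}(M_Z)$ and $\alpha_3(M_Z)$ are
$$\alpha_{em}^{-1}(M_Z) = 128.09, \qquad 0.110 \le \alpha_3(M_Z) \le 0.123 . \tag{3.28}$$
These values lead to
$$4.44 \times 10^{14} \le \Lambda \le 9.14 \times 10^{14}\ (\mathrm{GeV}), \qquad 0.206 \le \sin^2\theta_w \le 0.210 . \tag{3.29}$$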
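The numbers in (3.29) can be reproduced directly from (3.27) and (3.28). The sketch below does so and also checks the coefficient 67 in (3.27) from the one-loop coefficients of (3.24). The value of $M_Z$ is not quoted in the text; 91.187 GeV is assumed here, and all function names are illustrative.

```python
import math
from fractions import Fraction as F

ALPHA_EM_INV = 128.09   # alpha_em^{-1}(M_Z), from (3.28)
M_Z_GEV = 91.187        # assumed Z mass in GeV; not quoted in the text

# One-loop coefficients b_i of (3.24); alpha_i^{-1}(M_Z) = alpha_i^{-1}(Lambda) + b_i/(2 pi) * ln(Lambda/M_Z).
B1, B2, B3 = F(41, 6), F(-19, 6), F(-7)
# With alpha_em^{-1} = alpha_1^{-1} + alpha_2^{-1} and the boundary condition (3.26), the
# combination 3*alpha_em^{-1} - 8*alpha_3^{-1} at M_Z equals this coefficient / (2 pi) * ln(Lambda/M_Z).
combination = 3 * (B1 + B2) - 8 * B3

def log_scale_ratio(alpha3):
    """ln(Lambda / M_Z) from the second relation in (3.27)."""
    return (2 * math.pi / 67) * (3 * ALPHA_EM_INV - 8 / alpha3)

def cutoff_scale(alpha3):
    """Cutoff scale Lambda in GeV."""
    return M_Z_GEV * math.exp(log_scale_ratio(alpha3))

def sin2_theta_w(alpha3):
    """sin^2(theta_w) from the first relation in (3.27)."""
    alpha_em = 1 / ALPHA_EM_INV
    return 0.375 * (1 - 109 / (18 * math.pi) * alpha_em * log_scale_ratio(alpha3))

for alpha3 in (0.110, 0.123):
    print(alpha3, cutoff_scale(alpha3), sin2_theta_w(alpha3))
```

With these inputs the lower value of $\alpha_3(M_Z)$ gives the smaller cutoff (roughly $4.4 \times 10^{14}$ GeV) and the larger $\sin^2\theta_w$, while the upper value gives roughly $9.1 \times 10^{14}$ GeV and the smaller $\sin^2\theta_w$.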
Therefore the bare action we obtained and associated with the spectrum of the standard model is consistent with experimental data provided the cutoff scale is taken to be $\Lambda \sim 10^{15}$ GeV. There is, however, a slight disagreement (10%) between the predicted value of $\sin^2\theta_w$ and the experimental value of 0.2325 known to a very high precision. It is a remarkable fact that starting from the spectrum of the standard model at low energies, and assuming that this spectrum does not change, one can get the geometrical
spectral action which holds at very high energies and is consistent within ten percent with experimental data. This can be taken to indicate that at higher energies the noncommutative nature of space-time reveals itself and shows that the effective theory at the scale $\Lambda$ has a higher symmetry. The other disagreement is that the gravity sector requires the cutoff scale to be of the same order as the Planck scale while the condition on gauge coupling constants gives $\Lambda \sim 10^{15}$ GeV. The gravitational coupling $G$ runs with $\Lambda$ due to the matter interactions. This dictates that it must be of the order $\Lambda^{-2}$ and gives a large value for Newton’s constant. These results must be taken as an indication that the spectrum of the standard model has to be altered as we climb up in energy. The change may happen at low energies (just as in supersymmetry which also pushes the cutoff scale to $10^{16}$ GeV) or at some intermediate scale. Incidentally the problem that Newton’s constant is coming out to be too large is also present in string theory where also a unification of gauge couplings and Newton’s constant occurs [20]. Ultimately one would hope that modification of the spectrum will increase the cutoff scale nearer to the Planck mass as dictated by gravity. There is one further relation in our theory between the $\lambda(H^*H)^2$ coupling and the gauge couplings to be imposed at the scale $\Lambda$ [21]:
$$\lambda_0 = \frac{4}{3}\, g_{03}^2\, \frac{z^2}{y^4} . \tag{3.30}$$
This relation could be simplified if we assume that the top quark Yukawa coupling is much larger than all the other Yukawa couplings. In this case Eq. (3.30) simplifies to
$$\lambda(\Lambda) = \frac{16\pi}{3}\, \alpha_3(\Lambda) . \tag{3.31}$$
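Relation (3.31) can be evaluated with the one-loop running of (3.25), using the value of $\ln(\Lambda/M_Z)$ fixed by (3.27) and the inputs (3.28). The following is a minimal sketch under those assumptions; the function names are illustrative.

```python
import math

ALPHA_EM_INV = 128.09   # alpha_em^{-1}(M_Z), from (3.28)

def alpha3_at_cutoff(alpha3_mz):
    """Run alpha_3 from M_Z up to Lambda with the one-loop solution (3.25)."""
    log_ratio = (2 * math.pi / 67) * (3 * ALPHA_EM_INV - 8 / alpha3_mz)   # second relation in (3.27)
    inv_at_cutoff = 1 / alpha3_mz + 42 / (12 * math.pi) * log_ratio       # (3.25) solved for alpha_3^{-1}(Lambda)
    return 1 / inv_at_cutoff

def lambda_at_cutoff(alpha3_mz):
    """Higgs self-coupling at the cutoff from Eq. (3.31)."""
    return 16 * math.pi / 3 * alpha3_at_cutoff(alpha3_mz)

for alpha3_mz in (0.110, 0.123):
    print(alpha3_mz, alpha3_at_cutoff(alpha3_mz), lambda_at_cutoff(alpha3_mz))
```

Across the quoted range of $\alpha_3(M_Z)$ the result varies only in the third decimal place and stays close to 0.40.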
Therefore the value of $\lambda$ at the unification scale is $\lambda_0 \simeq 0.402$ showing that one does not go outside the perturbation domain. In reality, Eq. (3.31) could be used, together with the RG equations for $\lambda$ and $k_t$ to determine the Higgs mass at the low-energy scale $M_Z$ [22]:
$$\frac{d\lambda}{dt} = 4\lambda\gamma + \frac{1}{8\pi^2} \left( 12\lambda^2 + B \right), \qquad \frac{dk_t}{dt} = \frac{1}{32\pi^2} \left[ 9k_t^3 - \left( 16 g_3^2 + \frac{9}{2}\, g_2^2 + \frac{17}{6}\, g_1^2 \right) k_t \right], \tag{3.32}$$
1 (12k 2 − 9gt2 − 3g12 ), 64π 2 t 3 1 4 2 2 4 4 (3g + 2g1 g2 − g1 ) − kt . B = 84π 2 16 2
(3.33)
where γ =
These equations have to be integrated numerically [21]. One can get a rough estimate on the Higgs mass from the triviality bound^4 on the $\lambda$ couplings. For $\Lambda \simeq 10^{15}$ GeV as given in Eq. (3.29) the limits are
$$160 < m_H < 200\ \mathrm{GeV} . \tag{3.34}$$
This together with the boundary condition (3.31) gives a mass of the Higgs near the lower bound of 160 GeV. The exact answer can only be determined by numerical integration, but this of course cannot be completely trusted as the predicted value for $\sin^2\theta_w$ is off by ten percent. It can, however, be taken as an approximate answer and in this respect one can say that the Higgs mass lies in the interval 160 to 180 GeV. We expect this prediction to be correct to the same precision as that of $\sin^2\theta_w$ in (3.29). In reality we can perform the same analysis for the gravitational sector to determine the dependence of $\kappa_0$, $a_0$, $b_0$, $c_0$, $d_0$ and $e_0$ on the physical quantities and the effect of the boundary conditions (3.19) on them. This, however, will not have measurable consequences and will not be pursued here.
^4 We would like to thank M. Lindner for explanations on this point.
4. Conclusions The basic symmetry for a noncommutative space (A, H, D) is Aut(A). This symmetry includes diffeomorphisms and internal symmetry transformations. The bosonic action is a spectral function of the Dirac operator while the fermionic action takes the simple linear form (ψ, Dψ) where ψ are spinors defined on the Hilbert space. Applying this principle to the simple case where the algebra is C ∞ (M ) ⊗ Mn (C) with a Hilbert space of fermions in the adjoint representation, one finds that the bosonic action contains the Yang-Mills, Einstein and Weyl actions. This action is to be interpreted as the bare Wilsonian action at some cutoff scale Λ. The same principle when applied to the less trivial noncommutative geometry of the standard model gives the standard model action coupled to Einstein and Weyl gravity plus higher order non-renormalizable interactions suppressed by powers of the inverse of the mass scale in the theory. One also gets a mass term for the Higgs field. This bare mass is of the same order as the cutoff scale and this is related to the fact that there are quadratic divergences associated with the Higgs mass in the standard model. There are some relations between the bare quantities. The renormalized action will have the same form as the bare action but with physical quantities replacing the bare ones (except for an R2 term which is absent in the bare action due to the scale invariance of the a4 term associated with the square of the Dirac operator). The relations among the bare quantities must be taken as boundary conditions on the renormalization group equations governing the scale dependence of the physical quantities. In particular there are relations among the gauge couplings coinciding with those of SU (5) (or any gauge group containing SU (5) and also between the Higgs couplings to be imposed at some scale. 
These relations give a unification scale (or cutoff scale) of order $\sim 10^{15}$ GeV and a value for $\sin^2\theta_w \sim 0.21$ which is off by ten percent from the true value. We also have a prediction of the Higgs mass in the interval 160-180 GeV. This can be taken as an indication that the noncommutative structure of space-time reveals itself at such a high scale, where the effective action has a geometrical interpretation. The slight disagreement with experiment indicates that the spectrum of the standard model could not be extrapolated to very high energies without adding new particles necessary to change the RG equations of the gauge couplings. One possibility could be supersymmetry, but there could also be less drastic solutions. It might be tempting, by changing the spectrum, to push the unification scale up nearer to the Planck scale, a situation which is also present in string theory. In summary, we have succeeded in finding a universal action formula that unifies the standard model with the Einstein action. This necessarily involved an extrapolation from the low-energy sector to $10^{15}$ GeV, assuming no new physics arises. Our slight disagreement for the prediction of $\sin^2\theta_w$ and for the low value of the unification scale
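seems to imply that the spectrum of the standard model must be modified either at low energy or at an intermediate scale. There is also the possibility that by formulating the theory at very high energies, the concept of space-time as a manifold breaks down and the noncommutativity of the algebra must be extended to include the manifold part. One expects that the algebra $\mathcal{A}$ becomes a finite dimensional algebra. Finally, we hope that our universal action formula should be applicable to many situations of which the most important could be superconformal field theory. Work along these ideas is now in progress.

Appendix

To derive a general formula for $\mathrm{Tr}\,\chi(D^2/\Lambda^2)$ we must evaluate the heat kernel invariants $a_n(x, P)$ for a Dirac operator of the form
\[
D = \begin{pmatrix}
\gamma^\mu (D_\mu \otimes 1_N + A_\mu) & \gamma_5 S \\
\gamma_5 S & \gamma^\mu (D_\mu \otimes 1_N + A_\mu)
\end{pmatrix}.
\tag{A.1}
\]
Evaluating $D^2$ we find that
\[
\mathbb{A}^\mu = \big( (2\omega^\mu - \Gamma^\mu) \otimes 1_N + 2A^\mu \big) \otimes 1_2 ,
\tag{A.2}
\]
\[
\begin{aligned}
\mathbb{B} = {} & \Big( (\partial^\mu \omega_\mu + \omega^\mu \omega_\mu - \Gamma^\mu \omega_\mu + R) \otimes 1_N + 2A^\mu \cdot \omega_\mu \\
& \quad + (\partial^\mu + \omega^\mu - \Gamma^\mu) A_\mu - \tfrac{1}{2} \gamma^{\mu\nu} F_{\mu\nu} + A^\mu A_\mu - S^2 \Big) \otimes 1_2 \\
& - \gamma^\mu \gamma_5 (D_\mu S + [A_\mu, S]) \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\end{aligned}
\tag{A.3}
\]
From this we can construct $E$ and $\Omega_{\mu\nu}$:
\[
E = \Big( \tfrac{1}{4} R \otimes 1_N - \tfrac{1}{2} \gamma^{\mu\nu} F_{\mu\nu} - S^2 \Big) \otimes 1_2
- \gamma^\mu \gamma_5 (D_\mu S + [A_\mu, S]) \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\tag{A.4}
\]
\[
\Omega_{\mu\nu} = \Big( \tfrac{1}{4} R_{\mu\nu}{}^{ab} \gamma_{ab} \otimes 1_N + F_{\mu\nu} \Big) \otimes 1_2 .
\tag{A.5}
\]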
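From this we deduce that
\[
\begin{aligned}
a_0(x, P) &= \frac{\Lambda^4}{4\pi^2} \int \sqrt{g}\, d^4x \, \mathrm{Tr}(1), \\
a_2(x, P) &= \frac{\Lambda^2}{4\pi^2} \int \sqrt{g}\, d^4x \left[ \frac{R}{12}\, \mathrm{Tr}(1) - 2\, \mathrm{Tr}(S^2) \right], \\
a_4(x, P) &= \frac{1}{4\pi^2} \int \sqrt{g}\, d^4x \bigg[ \frac{\mathrm{Tr}(1)}{360} \Big( 3 R_{;\mu}{}^{\mu} - \frac{9}{2}\, C_{\mu\nu\rho\sigma} C^{\mu\nu\rho\sigma} + \frac{11}{4}\, R^* R^* \Big) \\
&\qquad + \mathrm{Tr} \Big( (D_\mu S + [A_\mu, S])^2 - \frac{R}{6}\, S^2 \Big)
- \frac{1}{6}\, \mathrm{Tr}\, F_{\mu\nu} F^{\mu\nu} + \mathrm{Tr}\, S^4 - \frac{1}{3}\, \mathrm{Tr}(S^2)_{;\mu}{}^{\mu} \bigg].
\end{aligned}
\tag{A.6}
\]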
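Applying these formulas to the Dirac operator of the quark sector we can obtain the same answer as from an explicit calculation by replacing in the previous formulas:
\[
\begin{aligned}
\mathrm{Tr}(1) &\to 36, \\
\mathrm{Tr}\, S^2 &\to 3\, \mathrm{Tr}\big( |k_0^d|^2 + |k_0^u|^2 \big) H^* H, \\
\mathrm{Tr}\, S^4 &\to 3\, \mathrm{Tr}\big( |k_0^d|^4 + |k_0^u|^4 \big) (H^* H)^2, \\
A_\mu &\to \begin{pmatrix}
-\frac{i}{2} g_{02} A_\mu^\alpha \sigma^\alpha - \frac{i}{6} g_{01} B_\mu \cdot 1_2 & & \\
& \frac{i}{3} g_{01} B_\mu & \\
& & -\frac{2i}{3} g_{01} B_\mu
\end{pmatrix} \otimes 1_3 \otimes 1_3
+ 1_4 \otimes 1_3 \otimes \Big( -\frac{i}{2} g_{03} V_\mu^i \lambda^i \Big).
\end{aligned}
\tag{A.7}
\]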
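Then
\[
-\frac{1}{6}\, \mathrm{Tr}\, F_{\mu\nu} F^{\mu\nu} \to \frac{3}{4}\, g_{02}^2 F_{\mu\nu}^\alpha F^{\mu\nu\alpha} + \frac{11}{12}\, g_{01}^2 B_{\mu\nu} B^{\mu\nu} + g_{03}^2 G_{\mu\nu}^i G^{\mu\nu i} .
\tag{A.8}
\]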
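In the leptonic sector, we make the replacements:
\[
\begin{aligned}
\mathrm{Tr}(1) &\to 9, \\
\mathrm{Tr}\, S^2 &\to \mathrm{Tr}\, |k_0^e|^2 H^* H, \\
\mathrm{Tr}\, S^4 &\to \mathrm{Tr}\, |k_0^e|^4 (H^* H)^2, \\
-\frac{1}{6}\, \mathrm{Tr}\, F_{\mu\nu} F^{\mu\nu} &\to \frac{1}{4}\, g_{02}^2 F_{\mu\nu}^\alpha F^{\mu\nu\alpha} + \frac{3}{4}\, g_{01}^2 B_{\mu\nu} B^{\mu\nu} .
\end{aligned}
\tag{A.9}
\]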
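The gauge-field traces in the quark and leptonic sectors can be cross-checked with exact rational arithmetic. The sketch below is an illustration rather than the authors' calculation. It assumes three generations, generators $\sigma^\alpha/2$ and $\lambda^i/2$ with $\mathrm{tr}(\sigma^\alpha \sigma^\beta) = 2\delta^{\alpha\beta}$ and $\mathrm{tr}(\lambda^i \lambda^j) = 2\delta^{ij}$, a trace that runs over internal indices only (as in $\mathrm{Tr}(1) \to 36$), the quark hypercharge coefficients 1/6, 1/3, 2/3 read off from (A.7), and the standard leptonic coefficients 1/2 (doublet) and 1 (right-handed electron), which are not displayed in this appendix. The function and variable names are hypothetical.

```python
from fractions import Fraction as F

def gauge_trace_coefficients(multiplets, generations=3):
    """Return (c2, c1, c3) such that
        -(1/6) Tr F F -> c2 g02^2 F^a F^a + c1 g01^2 B B + c3 g03^2 G^i G^i.
    Each multiplet is (y, is_su2_doublet, is_colour_triplet), where the
    U(1) part of A_mu is +-i * y * g01 * B_mu (only y^2 enters).
    """
    c2 = c1 = c3 = F(0)
    for y, doublet, triplet in multiplets:
        dim_su2 = 2 if doublet else 1
        dim_su3 = 3 if triplet else 1
        # U(1): each state contributes y^2 to -Tr F F.
        c1 += F(1, 6) * y * y * dim_su2 * dim_su3
        # SU(2): tr((sigma^a/2)(sigma^b/2)) = delta^{ab}/2 per doublet.
        if doublet:
            c2 += F(1, 6) * F(1, 2) * dim_su3
        # SU(3): tr((lambda^i/2)(lambda^j/2)) = delta^{ij}/2 per triplet.
        if triplet:
            c3 += F(1, 6) * F(1, 2) * dim_su2
    return tuple(generations * c for c in (c2, c1, c3))

# Quark sector: left-handed doublet, d_R, u_R (all colour triplets).
quark = gauge_trace_coefficients(
    [(F(1, 6), True, True), (F(1, 3), False, True), (F(2, 3), False, True)])
# Leptonic sector (assumed hypercharges): left-handed doublet, e_R.
lepton = gauge_trace_coefficients(
    [(F(1, 2), True, False), (F(1), False, False)])
total = tuple(q + l for q, l in zip(quark, lepton))

print("quark :", quark)
print("lepton:", lepton)
print("total :", total)
```

Under these assumptions the summed coefficients are $(1, 5/3, 1)$, that is $g_{03}^2 = g_{02}^2 = \frac{5}{3} g_{01}^2$ once the kinetic terms are brought to a common normalization, which is the SU(5)-type relation among the gauge couplings referred to in the Conclusions.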
Acknowledgement. A.H.C. would like to thank Jürg Fröhlich for very useful discussions and I.H.E.S. for hospitality where part of this work was done.
References

1. Connes, A.: Publ. Math. IHES 62, 44 (1983); Noncommutative Geometry. New York: Academic Press, 1994
2. Connes, A. and Lott, J.: Nucl. Phys. Proc. Supp. B18, 295 (1990); Proceedings of 1991 Cargèse Summer Conference, edited by J. Fröhlich et al., New York: Plenum, 1992
3. Kastler, D.: Rev. Math. Phys. 5, 477 (1993)
4. Connes, A.: Gravity Coupled with Matter and the Foundation of Noncommutative Geometry. hep-th/9603053
5. Chamseddine, A.H., Felder, G. and Fröhlich, J.: Commun. Math. Phys. 155, 109 (1993); A.H. Chamseddine, J. Fröhlich and O. Grandjean, J. Math. Phys. 36, 6255 (1995)
6. Connes, A.: J. Math. Phys. 36, 6194 (1995)
7. De Witt, B.: Dynamical Theory of Groups and Fields. New York: Gordon and Breach, 1965
8. Adler, S.: In The high energy limit, Erice lectures edited by A. Zichichi, New York: Plenum, 1983
9. Grosse, H., Klimcik, C. and Presnajder, P.: On Finite 4D Quantum Field Theory in Noncommutative Geometry. hep-th/9602115
10. Kastler, D.: Commun. Math. Phys. 166, 633 (1995)
11. Kalau, W. and Walze, M.: J. Geom. Phys. 16, 327 (1995)
12. Gilkey, P.: Invariance theory, the heat equation and the Atiyah-Singer index theorem, Wilmington: Publish or Perish, 1984
13. Wilson, K.G.: Rev. Mod. Phys. 47, 773 (1975); For an exposition very close to the steps taken here see C. Itzykson and J.-M. Drouffe, Field theory, Chapter five, Cambridge: Cambridge University Press, 1989
14. Stelle, K.S.: Phys. Rev. D16, 953 (1977)
15. Fradkin E. and Tseytlin, A.: Nucl. Phys.
16. E. Tomboulis, Phys. Lett. 70B, 361 (1977).
17. Chamseddine, A.H.: Phys. Lett. B332, 349 (1994)
18. For a discussion of quadratic divergences and the hierarchy problem in the standard model see e.g. J. Ellis, Supersymmetry and Grand Unification, hep-ph/9512335
19. For a review see G. Ross, Grand unified theories. Frontiers in Physics Series, Vol. 60, New York: Benjamin, 1985
20. Witten, E.: Strong Coupling Expansion of Calabi-Yau Compactification, hep-th/9602070
21. Bég, M., Panagiotakopoulos, C. and Sirlin, A.: Phys. Rev. Lett. 52, 883 (1984); M. Lindner, Z. Phys. C31, 295 (1986)
22. For an extensive review see M. Sher, Phys. Rep. 179, 273 (1989)

Communicated by A. Jaffe