Commun. Math. Phys. 263, 1–19 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1491-6
Communications in
Mathematical Physics
Characterization and ‘Source-Receiver’ Continuation of Seismic Reflection Data
Maarten V. de Hoop1, Gunther Uhlmann2
1 Center for Computational and Applied Mathematics, Purdue University, 150 N. University Street, West Lafayette, IN 47907, USA. E-mail: [email protected]
2 Department of Mathematics, University of Washington, Seattle, WA 98195-4350, USA. E-mail: [email protected]
Received: 29 June 2004 / Accepted: 3 October 2005 Published online: 26 January 2006 – © Springer-Verlag 2006
Abstract: In reflection seismology one places sources and receivers on the Earth’s surface. The source generates elastic waves in the subsurface, that are reflected where the medium properties, stiffness and density, vary discontinuously. In the field, often, there are obstructions to collect seismic data for all source-receiver pairs desirable or needed for data processing and application of inverse scattering methods. Typically, data are measured on the Earth’s surface. We employ the term data continuation to describe the act of computing data that have not been collected in the field. Seismic data are commonly modeled by a scattering operator developed in a high-frequency, single scattering approximation. We initially focus on the determination of the range of the forward scattering operator that models the singular part of the data in the mentioned approximation. This encompasses the analysis of the properties of, and the construction of, a minimal elliptic projector that projects a space of distributions on the data acquisition manifold to the range of the mentioned scattering operator. This projector can be directly used for the purpose of seismic data continuation, and is derived from the global parametrix of a homogeneous pseudodifferential equation the solution of which coincides with the range of the scattering operator. We illustrate the data continuation by a numerical example.
1. Introduction In reflection seismology one places sources and receivers on the Earth’s surface. The source generates elastic waves in the subsurface, that are reflected where the medium properties, stiffness and density, vary discontinuously. Seismic data collected in the field are often not ideal for data processing and application of inverse scattering methods. Typically, data are measured on the Earth’s (two-dimensional) surface; the location of the
This research was supported in part under NSF CMG grant EAR-0417891. Partly supported by a John Simon Guggenheim fellowship.
receiver relative to the source can be coordinated by offset and azimuth. We employ the term data continuation to describe the act of computing data that have not been collected in the field. Special cases of data continuation are the so-called ‘transformation to zero offset’ (derived from what seismologists call Dip MoveOut [7]) to generate data at zero offset, and ‘transformation to common azimuth’ (derived from what seismologists call Azimuth MoveOut [2]) to generate data at a fixed, prescribed azimuth. Data continuation can also play the role of ‘forward extrapolation’ [9] in a data regularization scheme.
Seismic data are commonly modeled by a scattering operator developed in a high-frequency single scattering approximation. In this approximation one assumes that the medium is described by a singular contrast superimposed on a smooth background. Under geological constraints, often, the contrast is a conormal distribution. Initially, we focus on the determination of the range of the forward scattering operator that models the singular part of the data in the single scattering approximation. This encompasses the analysis of the properties of, and the construction of, a minimal elliptic projector that projects the space of distributions on the acquisition manifold to the range of the mentioned scattering operator. This projector can be directly used for the purpose of seismic data continuation, and is derived from the global parametrix of a homogeneous pseudodifferential equation the solution of which coincides with the range of the scattering operator. Through characterization of features in the data, applications of data continuation extend to survey design (i.e. the design of the acquisition geometry describing the locations of the source-receiver pairs). The range of the scattering operator can also be used as a criterion for muting the data for features that are undesirable for the purpose of imaging the data (such as multiply scattered waves).
The notion of data continuation was introduced in exploration seismology quite some time ago. As early as 1982, Bolondi et al. [3] came up with the idea of describing data offset continuation and Dip MoveOut in the form of solving a partial differential equation. Their approach, built on the approach of Deregowski and Rocca [7], is valid in homogeneous media for acoustic waves while their partial differential operator is approximate only. An ‘exact’ partial differential equation for space dimension n = 2 that addresses the mentioned offset continuation was later derived by Goldin [11]. In this application it is implicitly used that the kernel of the associated partial differential operator determines the range of the operator that models the singular part of seismic data – in the single scattering approximation. The operator can be written in the form of a generalized Radon transform. Heuristically, the procedure and analysis presented in this paper can be thought of as a generalization from two to higher dimensions, from acoustic to elastic, and from homogeneous to heterogeneous media, of Goldin’s ‘offset continuation’ equation approach.
Let the data be denoted by $d = d(s, r, t)$, where $s$ denotes source position, $r$ receiver position, and $t$ the time, while $(s, r, t) \in Y$ and $Y$ denotes the acquisition manifold. Let $Y \subset \mathbb{R}^{2n-1}$. We introduce the map
$$\kappa : (s, r, t) \mapsto (z, t_n, h), \quad z = \tfrac{1}{2}(r + s), \quad h = \tfrac{1}{2}(r - s), \quad t_n = t_n(h, t) = \sqrt{t^2 - \frac{4h^2}{v^2}},$$
where $v$ is the acoustic wave speed. Let $r$ be the pull back of $d$ by the inverse of this map, $r = (\kappa^{-1})^* d$. The singular support of $r$ can be parametrized by $(z, h)$ according to $(z, T_n(z, h), h)$ with $T_n(z, h) = t_n(h, T(z, h))$, in which the function $T(z, h)$ denotes the traveltime of a particular reflection in the data; in seismological terms, the function
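$T_n$ is the traveltime ‘after Normal MoveOut correction’. Goldin’s equation is of the form ($n = 2$)
$$P r = 0, \qquad P := t_n \frac{\partial^2}{\partial t_n\, \partial h} + h \left( \frac{\partial^2}{\partial h^2} - \frac{\partial^2}{\partial z^2} \right). \tag{1}$$
This equation is supplemented with the initial conditions
$$r(z, t_n, h)|_{h = h_0}, \qquad \frac{\partial r}{\partial h}(z, t_n, h)\Big|_{h = h_0}.$$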
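As a sanity check on the NMO map κ and on the kinematics implied by (1), the following minimal sketch may help. It assumes a constant wave speed and a single planar dipping reflector (a hypothetical test model, not taken from the paper), and it assumes that inserting a singular ansatz $r \sim f(t_n - T_n(z,h))$ into (1) gives, to leading order, the characteristic relation $h\,[(\partial_z T_n)^2 - (\partial_h T_n)^2] = -T_n\, \partial_h T_n$. All function names are illustrative.

```python
import math

def nmo_map(s, r, t, v):
    """Map (s, r, t) to midpoint z, NMO-corrected time t_n and half offset h."""
    z = 0.5 * (r + s)
    h = 0.5 * (r - s)
    tn = math.sqrt(t * t - 4.0 * h * h / (v * v))
    return z, tn, h

def plane_reflector_time(s, r, v, d0, dip):
    """Two-way traveltime for a planar reflector in a constant-speed medium.

    Hypothetical test model: the normal distance from a surface point x to the
    reflector is d0 + x * sin(dip).
    """
    z, h = 0.5 * (r + s), 0.5 * (r - s)
    dist = d0 + z * math.sin(dip)
    return (2.0 / v) * math.sqrt(dist ** 2 + (h * math.cos(dip)) ** 2)

def goldin_kinematic_residual(z, h, v, d0, dip, eps=1e-5):
    """Residual of h (Tz^2 - Th^2) + Tn * Th, with derivatives of Tn taken by
    central finite differences; it should vanish for the model above."""
    def Tn(zz, hh):
        t = plane_reflector_time(zz - hh, zz + hh, v, d0, dip)
        return nmo_map(zz - hh, zz + hh, t, v)[1]
    dTz = (Tn(z + eps, h) - Tn(z - eps, h)) / (2.0 * eps)
    dTh = (Tn(z, h + eps) - Tn(z, h - eps)) / (2.0 * eps)
    return h * (dTz ** 2 - dTh ** 2) + Tn(z, h) * dTh
```

For a horizontal reflector the NMO-corrected time no longer depends on offset, which is the sense in which the correction flattens the reflection; for a dipping reflector the residual above vanishes up to finite-difference error.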
The first initial condition represents what seismologists call a post-normal-moveout constant-offset section at half offset $h_0$; the second initial condition is the first-order derivative of the post-normal-moveout section at half offset $h_0$. Goldin’s equation is not exact in the sense that it does not account for the symbols of the reflection operators associated with the reflectors in the subsurface.
The notion of data continuation has also been introduced and exploited in helical x-ray transmission tomography (CT) [22]. Consider a flat area detector, which is contained in the plane described by Cartesian coordinates $(u, v)$. Let $R$ denote the radius of the helix and $2\pi h$ its pitch. Let $\lambda$ denote the angle describing rotation of the cone vertex. The axial shift of the assembly of the x-ray source and the detector is denoted by $\zeta$. The data are denoted by $g = g(u, v, \lambda, \zeta)$. In this case, John’s equation [19] describing the range of the Radon transform is used. John’s equation for the x-ray transform in dimension 3 is given by
$$R^2 \frac{\partial^2 g}{\partial u\, \partial \zeta} - 2u \frac{\partial g}{\partial v} + (R^2 + u^2) \frac{\partial^2 g}{\partial u\, \partial v} = R \frac{\partial^2 g}{\partial \lambda\, \partial v} - R h \frac{\partial^2 g}{\partial v\, \partial \zeta} + u v \frac{\partial^2 g}{\partial v^2}. \tag{2}$$
This equation is supplemented with the initial conditions $g(u, v, \lambda, \zeta)|_{\zeta = 0}$ that are measured. (The standard form of John’s equation is much simpler than (2).) Gel’fand and Graev [10] have generalized John’s result to k-planes in $\mathbb{R}^n$. John’s (and Goldin’s) partial differential equation in higher space dimension is second order and of ultrahyperbolic type.
The seismic forward scattering operator is a Fourier integral operator and can be identified with a generalized Radon transform [1, 5, 23]. We characterize seismic data by analyzing the range of the forward scattering operator. This range coincides with the kernel of a self-adjoint, second-order pseudodifferential operator, $P$, derived from annihilators, $P_i$, of the data, $d$,
$$P_i d = 0, \qquad P = \sum_i P_i^2. \tag{3}$$
Let $Q$ denote the global parametrix of $P$. The mentioned elliptic minimal projector then follows to be
$$\pi = I - QP + \text{smoothing operator} \tag{4}$$
and provides the Fourier integral operator for continuing the singular part of the data. The annihilators are functionally dependent on the background medium, and hence can be used to form a criterion to estimate it. This estimation is known to seismologists as
‘velocity analysis’ and can be formulated as a reflection tomography problem. Thus, data continuation and reflection tomography, and imaging, are intimately connected. The results presented in this paper are based on the work by Guillemin and Uhlmann [15]. Here, we speak of ‘data’ continuation rather than ‘offset’ continuation, because our approach continues data in sources and receivers and not only in offset due to the heterogeneity of the subsurface we can allow.
2. Modeling of Seismic Data in the Single Scattering Approximation The propagation and scattering of seismic waves is governed by the elastic wave equation, which is written in the form
$$W_{il} u_l = f_i, \tag{5}$$
where
$$u_l = \sqrt{\rho(x)}\, (\text{displacement})_l, \qquad f_i = \frac{1}{\sqrt{\rho(x)}}\, (\text{volume force density})_i, \tag{6}$$
and
$$W_{il} = \delta_{il} \frac{\partial^2}{\partial t^2} + A_{il} + \text{l.o.t.}, \qquad A_{il} = -\frac{\partial}{\partial x_j} \frac{c_{ijkl}(x)}{\rho(x)} \frac{\partial}{\partial x_k}. \tag{7}$$
Here, $x \in \mathbb{R}^n$ and the subscripts $i, j, k, l \in \{1, \ldots, n\}$; $\rho$ is the density of mass while $c_{ijkl}$ denotes the stiffness tensor. The system of partial differential equations is assumed to be of principal type. It supports different wave types (modes), one ‘compressional’ and $n - 1$ ‘shear’. We label the modes by $M, N, \ldots$. For waves in mode $M$, singularities are propagated along bicharacteristics, which are determined by Hamilton’s equations generated by a Hamiltonian $B_M$,
$$\frac{dx}{d\lambda} = \frac{\partial}{\partial \xi} B_M(x, \xi), \qquad \frac{dt}{d\lambda} = 1, \qquad \frac{d\xi}{d\lambda} = -\frac{\partial}{\partial x} B_M(x, \xi), \qquad \frac{d\tau}{d\lambda} = 0. \tag{8}$$
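To make the system (8) concrete, here is a minimal ray-tracing sketch. It assumes a single isotropic mode with Hamiltonian $B(x, \xi) = c(x)\,|\xi|$ in two space dimensions and a hypothetical background speed $c(x) = c_0 + g\, x_2$; neither the medium nor the function names come from the paper. Since $dt/d\lambda = 1$, the time $t$ is used directly as the integration parameter.

```python
import math

def hamiltonian(x, xi, c0=2.0, grad=0.5):
    """B(x, xi) = c(x) |xi| with the assumed speed c(x) = c0 + grad * x[1]."""
    return (c0 + grad * x[1]) * math.hypot(xi[0], xi[1])

def _rhs(state, c0, grad):
    """Right-hand side of Hamilton's equations for B = c(x) |xi|."""
    x1, x2, p1, p2 = state
    norm = math.hypot(p1, p2)
    c = c0 + grad * x2
    # dx/dt = dB/dxi = c * xi / |xi|,  dxi/dt = -dB/dx = -|xi| * grad(c)
    return [c * p1 / norm, c * p2 / norm, 0.0, -norm * grad]

def trace_ray(x0, xi0, t_end, n_steps=1000, c0=2.0, grad=0.5):
    """Integrate the bicharacteristic with classical RK4, using t as parameter."""
    dt = t_end / n_steps
    s = [x0[0], x0[1], xi0[0], xi0[1]]
    for _ in range(n_steps):
        k1 = _rhs(s, c0, grad)
        k2 = _rhs([a + 0.5 * dt * b for a, b in zip(s, k1)], c0, grad)
        k3 = _rhs([a + 0.5 * dt * b for a, b in zip(s, k2)], c0, grad)
        k4 = _rhs([a + dt * b for a, b in zip(s, k3)], c0, grad)
        s = [a + dt * (q1 + 2 * q2 + 2 * q3 + q4) / 6.0
             for a, q1, q2, q3, q4 in zip(s, k1, k2, k3, k4)]
    return (s[0], s[1]), (s[2], s[3])
```

Two properties of (8) can be checked with this sketch: in a constant medium (`grad=0.0`) the ray is a straight line with constant covector, and in the variable medium the value of the Hamiltonian along the ray stays constant, in line with $d\tau/d\lambda = 0$.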
The $B_M$ follow from the diagonalization of the principal symbol matrix of $A_{il}$, as the square roots of its eigenvalues. Clearly, the solution may be parameterized by $t$. We denote the solution of (8) subject to initial values $(x_0, \xi_0)$ at $t = 0$ by $(x_M(x_0, \xi_0, t), \xi_M(x_0, \xi_0, t))$.
In the contrast formulation the total value of the medium parameters $\rho, c_{ijkl}$ is written as the sum of a smooth background constituent $\rho(x), c_{ijkl}(x)$ and a singular perturbation $\delta\rho(x), \delta c_{ijkl}(x)$, viz. $\rho + \delta\rho$, $c_{ijkl} + \delta c_{ijkl}$. This decomposition induces a perturbation of $W_{il}$ (cf. (7)),
$$\delta W_{il} = \delta_{il} \frac{\delta\rho(x)}{\rho(x)} \frac{\partial^2}{\partial t^2} - \frac{\partial}{\partial x_j} \frac{\delta c_{ijkl}(x)}{\rho(x)} \frac{\partial}{\partial x_k}.$$
The scattered field, in the single scattering approximation, satisfies
$$W_{il}\, \delta u_l = -\delta W_{il}\, u_l.$$
Data are measurements of the scattered wave field $\delta u$, which we relate here to the Green’s function perturbation: They are assumed to be representable by $\delta G_{MN}(\hat{x}, \tilde{x}, t)$ for $(\hat{x}, \tilde{x}, t)$ in some acquisition manifold, which contains the receiver and source points and time. Let $y \mapsto (\hat{x}(y), \tilde{x}(y), t(y))$ be a coordinate transformation, such that $y = (y', y'')$ and the acquisition manifold, $Y$ say, is given by $y'' = 0$. We assume that the dimension of $y''$ is $2 + c$, where $c$ is the codimension of the acquisition geometry. For example, for marine acquisition in seismic reflection data, $c = 1$, while also in global seismology for many, but not all, regions $c = 1$ – seismologists recognize this as lack of ‘azimuthal’ coverage. An example of $c = 2$ is provided by the common-‘offset’ acquisition geometry. In this framework, the data are modeled by
$$\left( \frac{\delta\rho(x)}{\rho(x)}, \frac{\delta c_{ijkl}(x)}{\rho(x)} \right) \mapsto \delta G_{MN}(\hat{x}(y', 0), \tilde{x}(y', 0), t(y', 0)). \tag{9}$$
We investigate the propagation of singularities by this mapping. Let $\tau = \mp B_M(x_0, \hat{\xi}_0)$, and
$$\hat{x} = x_M(x_0, \hat{\xi}_0, \pm\hat{t}), \quad \tilde{x} = x_N(x_0, \tilde{\xi}_0, \pm\tilde{t}), \quad t = \hat{t} + \tilde{t}, \quad \hat{\xi} = \xi_M(x_0, \hat{\xi}_0, \pm\hat{t}), \quad \tilde{\xi} = \xi_N(x_0, \tilde{\xi}_0, \pm\tilde{t}).$$
We then obtain $(y(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}), \eta(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}))$ by transforming $(\hat{x}, \tilde{x}, \hat{t} + \tilde{t}, \hat{\xi}, \tilde{\xi}, \tau)$ to $(y, \eta)$ coordinates. We invoke the following assumptions that concern scattering over $\pi$ and rays grazing the acquisition manifold.
Assumption 1. There are no elements $(y', 0, \eta', \eta'')$ with $(y', \eta') \in T^*Y \backslash 0$ such that there is a direct bicharacteristic from $(\hat{x}(y', 0), \hat{\xi}(y', 0, \eta', \eta''))$ to $(\tilde{x}(y', 0), -\tilde{\xi}(y', 0, \eta', \eta''))$ with arrival time $t(y', 0)$.
Assumption 2. The matrix
$$\frac{\partial y''}{\partial (x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t})} \text{ has maximal rank.} \tag{10}$$
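The propagation of singularities by (9) is governed by the canonical relation
$$\Lambda_{MN} = \{ (y'(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}), \eta'(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}); x_0, \hat{\xi}_0 + \tilde{\xi}_0) \mid B_M(x_0, \hat{\xi}_0) = B_N(x_0, \tilde{\xi}_0) = \mp\tau, \; y''(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}) = 0 \} \subset T^*Y \backslash 0 \times T^*X \backslash 0. \tag{11}$$
The condition $y''(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}) = 0$ determines the traveltimes $\hat{t}$ for given $(x_0, \hat{\xi}_0)$ and $\tilde{t}$ for given $(x_0, \tilde{\xi}_0)$. Following Maslov and Fedoriuk [21], we choose coordinates for $\Lambda_{MN}$ of the form
$$(y_I, x_0, \eta_J), \tag{12}$$
where $I \cup J$ is a partition of $\{1, \ldots, 2n - 1 - c\}$, with associated generating function $S_{MN} = S_{MN}(y_I, x_0, \eta_J)$. The phase function in these coordinates becomes $\Phi_{MN} = \Phi_{MN}(y', x_0, \eta_J)$. Let $\tau = \tfrac{1}{2}(\hat{\tau} + \tilde{\tau})$ and $\bar{\tau} = \hat{\tau} - \tilde{\tau}$. The map
$$(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t}) \mapsto (x_0, y_I, y'', \eta_J, \bar{\tau}) \tag{13}$$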
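is bijective. Thus, for $y'' = 0$ and $\bar{\tau} = 0$ we can express $(\hat{\xi}_0, \tilde{\xi}_0)$ as functions of $(y_I, x_0, \eta_J)$. The amplitude associated with $\Lambda_{MN}$, to leading order, can be written in the form
$$|b_{MN}(y_I, x_0, \eta_J)| = \frac{1}{4\tau^2} (2\pi)^{-\frac{n+1+c}{4}} \left| \det \frac{\partial(\hat{x}, \tilde{x}, t)}{\partial(y', y'')} \right|^{-1/2} \left| \det \frac{\partial(x_0, \hat{\xi}_0, \tilde{\xi}_0, \hat{t}, \tilde{t})}{\partial(x_0, y_I, y'', \eta_J, \bar{\tau})} \right|^{1/2}_{y'' = 0,\, \bar{\tau} = 0}. \tag{14}$$
We assume that $\frac{\delta\rho}{\rho}$, $\frac{\delta c_{ijkl}}{\rho}$ are described by conormal distributions. We consider the case of a single interface, and a jump discontinuity in $(\delta\rho, \delta c_{ijkl})$ across this interface. Let $\kappa : \mathbb{R}^n \to \mathbb{R}^n$, $x \mapsto z$ be a coordinate transformation such that the interface is given by $z_n = 0$. The corresponding cotangent vector is denoted by $\zeta$, and transforms according to $\zeta_i(x, \xi) = \left( \left( \frac{\partial\kappa}{\partial x} \right)^{-1} \right)^t_{ij} \xi_j$; the $z$ form coordinates on the manifold $X$, and we write $z = (z', z_n)$. We introduce the distributions $(\widetilde{\delta\rho}, \widetilde{\delta c}_{ijkl})$ by pull back with $\kappa$:
$$\widetilde{\delta\rho}(\kappa(x)) = \delta\rho(x), \qquad \widetilde{\delta c}_{ijkl}(\kappa(x)) = \delta c_{ijkl}(x). \tag{15}$$
Then
$$\frac{\partial}{\partial x} \delta\rho = \frac{\partial z_n}{\partial x} \Delta_\rho + \text{l.o.t.}, \qquad \Delta_\rho = \frac{\partial}{\partial z_n} \widetilde{\delta\rho},$$
where $\Delta_\rho$ contains a factor $\delta(z_n(x))$, and similarly for $\frac{\partial}{\partial x} \delta c_{ijkl}$. Substituting (15) into the integral over $X$ representing the high-frequency Born approximation for scattered waves, and integrating by parts, then yields an oscillatory integral representation in which
$$w_{MN;0}(y_I, x, \eta_J) \frac{\delta\rho(x)}{\rho(x)} + w_{MN;ijkl}(y_I, x, \eta_J) \frac{\delta c_{ijkl}(x)}{\rho(x)},$$
where $w$ stands for the contrast-source radiation patterns derived from the pseudodifferential operators that diagonalize the elastodynamic system of equations [23], has been replaced by
$$2 i \tau\, R_{MN}(y_I, x, \eta_J) \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x)).$$
Here we use that $\tau$ will be one of the components of $\eta_J$. Also
$$\int (\cdots) \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x))\, dx = \int_{z_n = 0} (\cdots) \left\| \frac{\partial z_n}{\partial x} \right\| \left| \det \frac{\partial x}{\partial z} \right| dz'$$
becomes the Euclidean surface integral over the surface or manifold $z_n = 0$.
Theorem 1 [23]. Suppose Assumptions 1 and 2 are satisfied microlocally for the relevant part of the data. Let $\Phi_{MN}(y, x, \eta_J)$ and $b_{MN}(y_I, x, \eta_J)$ be the phase function and amplitude introduced above. Then the mapping
$$F : \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x)) \mapsto G^{\mathrm{refl}}_{MN}(y),$$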
where
$$G^{\mathrm{refl}}_{MN}(y) = (2\pi)^{-\frac{|J|}{2} - \frac{3n-1-c}{4}} \int_X \!\! \int \left( 2 i \tau(\eta_J)\, b_{MN}(y_I, x, \eta_J) R_{MN}(y_I, x, \eta_J) + \text{l.o.t.} \right) \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x)) \exp[i \Phi_{MN}(y, x, \eta_J)]\, dx\, d\eta_J, \tag{16}$$
defines a Fourier integral operator with canonical relation $\Lambda_{MN}$ and of order $\frac{n-1+c}{4} - 1$.
$F$ models seismic reflection data. In the Kirchhoff approximation, one can identify the principal part of $R_{MN}$ with the plane-wave reflection coefficient: Using (13) we find the $(x, \hat{\xi}_0, \tilde{\xi}_0)$ associated with $(y_I, x, \eta_J)$. A reflection from an incident $N$-mode with covector $\tilde{\xi}_0$ into a scattered $M$-mode with covector $\hat{\xi}_0$ takes place, at $x$, if the frequencies are equal and $\hat{\xi}_0 + \tilde{\xi}_0$ is in the wavefront set of $\delta(z_n(x))$. Given $\hat{\xi}_0, \tilde{\xi}_0$ one can identify the down- and upgoing modes $\mu(M), \nu(N)$ relative to the interface, and define (at least to highest order) the reflection coefficient at $x$,
$$R_{MN} = R_{\mu(M),\nu(N)}(z'(x), \zeta'(x, \hat{\xi}_0), \tau) \quad \text{if } z_n(x) = 0, \tag{17}$$
see De Hoop and Bleistein [4] and Stolk and De Hoop [23]. The Kirchhoff approximation requires the following assumption.
Assumption 3. There are no rays tangent to the interface $z_n = 0$, i.e. elements in $\Lambda_{MN}$ associated with $(x(z', 0), \hat{\xi}_0(z', 0, \zeta', 0))$ or with $(x(z', 0), \tilde{\xi}_0(z', 0, \zeta', 0))$ (cf. (11)).
For a treatment of reflection and transmission of waves in the elastic case, using microlocal analysis, see Taylor [24]; for the acoustic case, see also Hansen [16]. Examples of conormal distributions, $\left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x))$, in the Earth sciences the reflections off which are observed, include the core-mantle boundary, thermal and chemical boundary layers in the deep mantle, fault zones, and geological interfaces in sedimentary basins.
3. Extension of the Scattering Operator For simplicity of notation, from here on, we drop the subscripts MN and consider a single mode pair. In the single scattering approximation, subject to restriction to the acquisition manifold Y , the singular part of the medium parameters is a function of n variables, while the data are a function of 2n − 1 − c variables. Here, we discuss the extension of the scattering operator to act on distributions of 2n − 1(−c) variables, equal to the number of degrees of freedom in the acquisition.
3.1. The wavefront set of seismic data. The wavefront set of the modeled data is not arbitrary. This is a consequence of the fact that data consist of multiple experiments designed to provide a degree of redundancy, which we explain here.
Assumption 4 (Guillemin [13]). The projection $\pi_Y$ of $\Lambda$ on $T^*Y \backslash 0$ is an embedding.
This assumption is known as the Bolker condition. It admits the presence of caustics. Because $\Lambda$ is a canonical relation that projects submersively on the subsurface variables $(x, \xi)$ (using that the operator $W_{il}$ is of principal type), the projection of (11) on $T^*Y \backslash 0$ is immersive [17, Lemma 25.3.6 and (25.3.4)]. In fact, only the injectivity part of the Bolker condition needs to be verified. The image $L$ of $\pi_Y$ is locally a coisotropic submanifold of $T^*Y \backslash 0$. Hence, for each $(y, \eta) \in L$, $(T_{(y,\eta)} L)^\perp \subset T_{(y,\eta)} L$. Setting $V_{(y,\eta)} = (T_{(y,\eta)} L)^\perp$, the vector bundle $V \to L$ whose fiber at $(y, \eta)$ is $(T_{(y,\eta)} L)^\perp$, is an integrable subbundle of $TL$. Applying [14, Prop. 8.1], from the Bolker condition it follows that $L$ satisfies their Axiom F: the foliation of $L$ associated with $V$ is fibrating, i.e. there exists a $C^\infty$ Hausdorff manifold $X$ and a smooth fiber map $L \to X$ whose fibers are the connected leaves of the foliation defined by $V$.
We choose coordinates revealing the mentioned fibration. Since the projection $\pi_X$ of $\Lambda$ on $T^*X \backslash 0$ is submersive, we can choose $(x, \xi)$ as the first $2n$ local coordinates on $\Lambda$; the remaining $\dim Y - n = n - 1 - c$ coordinates are denoted by $e \in E$, $E$ being a manifold itself. The sets $X \ni (x, \xi) = \text{const.}$ are the isotropic fibers of the fibration of Hörmander [18], Theorem 21.2.6, see also Theorem 21.2.4. Duistermaat [8] calls them characteristic strips (see Theorem 3.6.2). Also, $\nu = \|\xi\|^{-1} \xi$ is then identified as the migration dip. The wavefront set of the data is contained in $L$ and is a union of such fibers. The map $\pi_X \pi_Y^{-1} : L \to X$ is a canonical isotropic fibration, known to seismologists as map migration.
We consider again the canonical relation $\Lambda$ and suppose that Assumption 4 is satisfied. We define $\Omega$ as the mapping $\pi_Y \pi_X^{-1}$,
$$\Omega : (x, \xi, e) \mapsto (y(x, \xi, e), \eta(x, \xi, e)) : T^*X \backslash 0 \times E \to T^*Y \backslash 0,$$
which is known to seismologists as map demigration. This map conserves the symplectic form of $T^*X \backslash 0$. Indeed, let $\sigma_Y$ denote the fundamental symplectic form on $T^*Y \backslash 0$. We consider the vector fields over an open subset of $L$ with components $w_{x_i} = \frac{\partial(y, \eta)}{\partial x_i}$ and similarly for $w_{\xi_i}$ and $w_{e_i}$. Then
$$\sigma_Y(w_{x_i}, w_{x_j}) = \sigma_Y(w_{\xi_i}, w_{\xi_j}) = 0, \quad \sigma_Y(w_{\xi_i}, w_{x_j}) = \delta_{ij}, \quad \sigma_Y(w_{e_i}, w_{x_j}) = \sigma_Y(w_{e_i}, w_{\xi_j}) = \sigma_Y(w_{e_i}, w_{e_j}) = 0. \tag{18}$$
The $(x, \xi, e)$ are ‘symplectic coordinates’ on the projection $L$ of $\Lambda$ on $T^*Y \backslash 0$. In the following lemma, we extend these coordinates to symplectic coordinates on an open neighborhood of $L$, which is a manifestation of Darboux’s theorem stating that $T^*Y$ can be covered with symplectic local charts.
Lemma 1. Let $L$ be an embedded coisotropic submanifold of $T^*Y \backslash 0$, with coordinates $(x, \xi, e)$ such that (18) holds. Denote $L \ni (y, \eta) = \Omega(x, \xi, e)$. We can find a homogeneous canonical map $G$ from an open part of $T^*(X \times E) \backslash 0$ to an open neighborhood of $L$ in $T^*Y \backslash 0$, such that $G(x, e, \xi, \varepsilon = 0) = \Omega(x, \xi, e)$.
3.2. An invertible Fourier integral operator. Let M be the canonical relation associated with the map G we introduced in Lemma 1, i.e. M = {(G(x, e, ξ, ε); x, e, ξ, ε)} ⊂ T ∗ Y \0 × T ∗ (X × E)\0.
We now construct a Maslov-type phase function for $M$ that is directly related to a phase function for $\Lambda$. Suppose $(y_I, x, \eta_J)$ are suitable coordinates for $\Lambda$, at $\varepsilon = 0$. For $\varepsilon$ small, the constant-$\varepsilon$ subset of $M$ allows the same set of coordinates, thus we can use coordinates $(y_I, \eta_J, x, \varepsilon)$ on $M$. Now there is (see Theorem 4.21 in Maslov and Fedoriuk [21]) a function $S(y_I, x, \eta_J, \varepsilon)$, called the generating function, such that $M$ is given by
$$y_J = -\frac{\partial S}{\partial \eta_J}, \qquad \xi = -\frac{\partial S}{\partial x}, \qquad \eta_I = \frac{\partial S}{\partial y_I}, \qquad e = \frac{\partial S}{\partial \varepsilon}. \tag{19}$$
Thus a phase function for $M$ is given by
$$\Psi(y, x, e, \eta_J, \varepsilon) = S(y_I, x, \eta_J, \varepsilon) + \langle \eta_J, y_J \rangle - \langle \varepsilon, e \rangle. \tag{20}$$
A phase function for $\Lambda$ then follows as
$$\Phi(y, x, \eta_J) = \Psi\!\left( y, x, \frac{\partial S}{\partial \varepsilon}\Big|_{\varepsilon = 0}, \eta_J, 0 \right) = S(y_I, x, \eta_J, 0) + \langle \eta_J, y_J \rangle.$$
We introduce the amplitude $b(y_I, x, \eta_J, \varepsilon)$ on $M$ such that $b(y_I, x, \eta_J, \varepsilon = 0)$ coincides with the amplitude in Theorem 1. To leading order,
$$\frac{\partial}{\partial \varepsilon} b = 0$$
because the coordinates $\varepsilon_i$ are in involution.
We construct a mapping from the reflectivity function to seismic data, extending the mapping from contrast to data. This is done by applying the results of Sect. 3.1 to the Kirchhoff modeling formula (16). We apply the change of coordinates on $\Lambda$ from $(y_I, x, \eta_J)$ to $(x, \xi, e)$ to the symbol $R_{MN}$ and write now $R_{MN} = R(x, \xi, e)$. To highest order, $R$ does not depend on $\xi$ and is simply a function of $(x, e)$.
Theorem 2 [23]. Suppose microlocally that Assumptions 3 (no grazing rays at any interface), 1 (no scattering over $\pi$), 2 (transversality), and 4 (Bolker condition) are satisfied. Let $H$ be the Fourier integral operator,
$$H : \mathcal{E}'(X \times E) \to \mathcal{D}'(Y),$$
with canonical relation given by the extended map $G : (x, \xi, e, \varepsilon) \mapsto (y, \eta)$ constructed in Sect. 3.1, and with amplitude to highest order given by $(2\pi)^{n/2}\, 2 i \tau(\eta_J)\, b(y_I, x, \eta_J, \varepsilon)$ expressible in terms of the coordinates $(x, e, \xi, \varepsilon)$. Then the data, in both Born and Kirchhoff approximations, can be modeled by $H$ acting on a distribution $r(x, e)$ of the form
$$r(x, e) = R(x, D_x, e)\, c(x), \tag{21}$$
where $R$ stands for a smooth $e$-family of pseudodifferential operators and $c \in \mathcal{E}'(X)$. For the Kirchhoff approximation the distribution $c$ equals $\left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x))$, while the principal symbol of the pseudodifferential operator $R$ equals $R(x, e)$, so to highest order
$$r(x, e) = R(x, e) \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x)). \tag{22}$$
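For the Born approximation the function $r(x, e)$ is given by a pseudodifferential operator $R$ with principal symbol $(2 i \tau(x, \xi, e))^{-1} (w_{MN;0}(x, \xi, e), w_{MN;ijkl}(x, \xi, e))$, acting on a distribution $c$ given by $\left( \frac{\delta\rho}{\rho}, \frac{\delta c_{ijkl}}{\rho} \right)$, so to highest order
$$r(x, e) = (2 i \tau(x, D_x, e))^{-1} \left[ w_{MN;0}(x, D_x, e) \frac{\delta\rho(x)}{\rho(x)} + w_{MN;ijkl}(x, D_x, e) \frac{\delta c_{ijkl}(x)}{\rho(x)} \right].$$
The operator $H$ is invertible.
Remark 1. Microlocally, we have obtained the following diagram (suggested by Symes, personal communication)
$$\begin{array}{ccc} \mathcal{E}'(X \times E) & \xrightarrow{\;H\;} & \mathcal{D}'(Y) \\ R(x, D_x, e) \uparrow & & \uparrow \mathrm{Id} \\ \mathcal{E}'(X) & \xrightarrow{\;F\;} & \mathcal{D}'(Y) \end{array} \tag{23}$$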
We note that $R(x, D_x, e)$ is of order 0. $H^{-1}$ maps data into what seismologists call common-image-point gathers (the integral over $\varepsilon$ replaces the notion of beamforming; $e$ plays the role of scattering angle and azimuth).
4. A Procedure for Data Continuation
4.1. The range of the scattering operator. If $n - 1 - c > 0$, there is a redundancy in the data parametrized by the variable $e$. The redundancy in the data manifests itself as a redundancy in images of the subsurface from these data. A smooth background is considered ‘acceptable’ if the data are contained in the range of $F$ (or $H$). If a smooth background is acceptable, then applying the operator $H^{-1}$ of Theorem 2 to the data results in a reflectivity distribution $r(x, e)$, the singular support (in $x$) of which does not depend on $e$. One way to measure the agreement in singular supports between images of reflectivity $r(x, e)$ parametrized by $e$ is by taking a derivative with respect to $e$. Taking (21) as the point of departure, we find that
$$\left( R(x, D_x, e) \frac{\partial}{\partial e} - \frac{\partial R}{\partial e}(x, D_x, e) \right) r(x, e) = \left[ R(x, D_x, e), \frac{\partial R}{\partial e}(x, D_x, e) \right] c(x). \tag{24}$$
Hence, microlocally where $R(x, D_x, e)$ is elliptic,
$$\left( R(x, D_x, e) \frac{\partial}{\partial e} - \frac{\partial R}{\partial e}(x, D_x, e) - \left[ R(x, D_x, e), \frac{\partial R}{\partial e}(x, D_x, e) \right] R(x, D_x, e)^{-1} \right) r(x, e) = 0 \tag{25}$$
to all orders. We observe that the first operator acting on distributions in $(x, e)$ in the sum is of order 1, the second operator is of order 0, while the third operator is of order $-1$. Falling back on (22) we exploit that, up to leading order, the operator $R$ acts as a multiplication by $R(x, e)$. Clearly,
$$\left[ R(x, e), \frac{\partial R}{\partial e}(x, e) \right] \left\| \frac{\partial z_n}{\partial x} \right\| \delta(z_n(x)) = 0,$$
cf. (24). Substituting (21) into (25) reveals that the operator in between parentheses on the left-hand side equals $\left( R(x, e) \frac{\partial}{\partial e} - \frac{\partial R}{\partial e}(x, e) \right)$ up to the leading two orders. Hence,
$$\left( R(x, e) \frac{\partial}{\partial e} - \frac{\partial R}{\partial e}(x, e) \right) r(x, e) = 0 \tag{26}$$
up to the highest two orders. Conjugating the operator in between parentheses in (26), or in (25), with the invertible Fourier integral operator $H$, we obtain a pseudodifferential operator on $\mathcal{D}'(Y)$ [23].
Lemma 2. Let the pseudodifferential operators $P_i(y, D_y) : \mathcal{D}'(Y) \to \mathcal{D}'(Y)$ of order 1 be given by the composition
$$P_i(y, D_y) = H \left( R(x, e) \frac{\partial}{\partial e_i} - \frac{\partial R}{\partial e_i}(x, e) \right) H^{-1}, \quad i = 1, \ldots, n - 1 - c.$$
Then for Kirchhoff data $d(y)$ modeled by $F$, we have to the highest two orders,
$$P_i(y, D_y)\, d(y) = 0, \quad i = 1, \ldots, n - 1 - c. \tag{27}$$
Microlocally, for values of $e$ where $R(x, e) \neq 0$, the operator $P_i(y, D_y)$ can be modeled after (25) such that (27) is valid to all orders. The principal part of the symbol of $P_i$ is denoted by $p_i$, while the next order term in the symbol’s polyhomogeneous expansion is denoted by $p_{i;0}$. The subprincipal symbols (which show up naturally in the Weyl calculus of symbols), $c_i$, of the annihilators are then given by $c_i := p_{i;0} + \frac{i}{2} \sum_j \frac{\partial^2 p_i}{\partial y_j\, \partial \eta_j}$.
Remark 2. The wavefront set of the data is contained in $L = \pi_Y(\Lambda)$, which, in analogy with the eikonal equation, is also the submanifold of $T^*Y \backslash 0$ defined by
$$p_i(y, \eta) = 0, \quad i = 1, \ldots, n - 1 - c, \tag{28}$$
where $p_i$ is the principal symbol of $P_i$ as before, and is of codimension $2[2(n - 1) - c + 1] - (3n - 1 - c) = n - 1 - c$, which is also the dimension of the covector $\varepsilon$.
The operator $F$ in Theorem 1 is continuous $H^{\frac{n-1}{2}}(X) \to L^2(Y)$. We now define the operator projection, $\pi : L^2(Y) \to L^2(Y)$, onto the range of the scattering operator $F$. Microlocally, $\pi^2 = \pi$. Since Assumption 4 is satisfied, using [14, Prop. 8.3], $\pi$ is an elliptic minimal projector. By [14, Theorem 6.6], the kernel of
$$P = P_1^2 + \cdots + P_{n-1-c}^2 \tag{29}$$
is identical with the range of $\pi$. More precisely, let $Q$ denote the global parametrix of $P$, then, by [14, Theorem 6.7],
$$\pi = I - QP + \text{smoothing operator}. \tag{30}$$
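A finite-dimensional analogue may help fix ideas about (29) and (30). In the sketch below, which is only an illustration under stated assumptions and not the microlocal construction of this paper, the scattering operator is replaced by a real matrix `F` with full column rank, the annihilators by symmetric rank-one matrices that vanish on the range of `F`, and the parametrix by the Moore-Penrose pseudoinverse; the smoothing remainder has no counterpart. The function name and the choice of weights are hypothetical.

```python
import numpy as np

def range_projector(F, weights=None):
    """Build pi = I - Q P from annihilators of range(F).

    F is assumed to be an (m x n) real matrix with full column rank, m > n.
    """
    m, n = F.shape
    U, _, _ = np.linalg.svd(F, full_matrices=True)
    basis = U[:, n:]                      # orthonormal basis of range(F)^perp
    if weights is None:
        weights = np.arange(1, m - n + 1, dtype=float)
    # Symmetric annihilators P_i = w_i a_i a_i^T, so that P_i @ F = 0.
    annihilators = [w * np.outer(a, a) for w, a in zip(weights, basis.T)]
    P = sum(Pi @ Pi for Pi in annihilators)   # analogue of P = sum_i P_i^2
    Q = np.linalg.pinv(P)                     # analogue of the parametrix Q
    return np.eye(m) - Q @ P, annihilators

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 3))
proj, annihilators = range_projector(F)
```

In this toy setting `proj` is idempotent, leaves the range of `F` fixed, and coincides with the orthogonal projector `F @ inv(F.T @ F) @ F.T`, so the kernel of `P` and the range of `proj` agree, as stated for the operators above.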
4.2. A global parametrix. The construction of a global parametrix, $Q$, for an operator of the type $P$ is given by Guillemin and Uhlmann [15]. A natural parametrix for $P$ would have as principal symbol $\frac{1}{p_1^2 + \cdots + p_{n-1-c}^2}$. However, this expression becomes singular at the set $\{p_1 = \cdots = p_{n-1-c} = 0\}$. A class of operators, containing pseudodifferential operators with singular symbols, was introduced by Guillemin and Uhlmann [15]. The wavefront set of the kernels of these operators consists of two Lagrangian manifolds, $\Lambda_0$ and $\Lambda_1$ say, intersecting cleanly in a submanifold of given codimension. In our case, $\Lambda_0$ is the diagonal $\mathrm{diag}(T^*Y \backslash 0)$ in $T^*Y \backslash 0 \times T^*Y \backslash 0$, while $\Lambda_1$ is the fiber product $L \times_X L$. The Lagrangian submanifold $\Lambda_1 \subset T^*Y \backslash 0 \times T^*Y \backslash 0$ precisely consists of points on the joint flowout from $\mathrm{diag}(T^*Y \backslash 0) \cap \{p_1 = \cdots = p_{n-1-c} = 0\}$ by the Hamiltonian flows of the $H_{p_1}, \ldots, H_{p_{n-1-c}}$, where $H_{p_i}$ denotes the Hamiltonian field associated with the function $p_i$. The flowout is described by the solution to the Hamilton systems with parameters $e_i$,
$$\frac{\partial y_j}{\partial e_i} = \frac{\partial p_i}{\partial \eta_j}(y, \eta), \qquad \frac{\partial \eta_j}{\partial e_i} = -\frac{\partial p_i}{\partial y_j}(y, \eta), \qquad 1 \le i, j \le n - 1 - c. \tag{31}$$
The Lagrangian submanifolds $\Lambda_0$ and $\Lambda_1$ intersect cleanly in a submanifold of codimension $n - 1 - c$, see Remark 2. Guillemin and Uhlmann’s construction relies on the introduction of the space of distributional half densities, $I^{p,l}(Y \times Y; \Lambda_0, \Lambda_1)$, defining a class of Fourier integral operators with singular symbols, with the properties $\cap_l I^{p,l}(Y \times Y; \Lambda_0, \Lambda_1) = I^p(Y \times Y, \Lambda_1)$ (defining standard Fourier integral operators with canonical relation $\Lambda_1$) for $p$ fixed, and $\cap_p I^{p,l}(Y \times Y; \Lambda_0, \Lambda_1) = C_0^\infty(Y \times Y)$ for $l$ fixed. Viewing the Schwartz kernel of the identity ($I$) as an element of $I^{0,0}(Y \times Y; \Lambda_0, \Lambda_1)$, Guillemin and Uhlmann’s recursive construction results in
$$QP = I - \pi + R,$$
where the kernel of $\pi$ belongs to $\cap_l I^{p,l}(Y \times Y; \Lambda_0, \Lambda_1)$, and $R$ is a smoothing operator with kernel in $\cap_p I^{p,l}(Y \times Y; \Lambda_0, \Lambda_1)$.
Here, we discuss the properties of $Q$. We observe that $\Lambda_1 \circ \Lambda_1 = \Lambda_1$. The elliptic minimal projector $\pi$, introduced in the previous subsection, is a Fourier integral operator with canonical relation $\Lambda_1$,
$$\begin{array}{ccc} L & \xrightarrow{\;(31)\;} & L \\ \downarrow & & \downarrow \\ X & \longrightarrow & X \end{array} \tag{32}$$
The wavefront set of $Q$ is contained in $\Lambda_0 \cup \Lambda_1$. $Q$ is a pseudodifferential operator on $\Lambda_0 \backslash (\Lambda_0 \cap \Lambda_1)$ and its principal symbol there is given by $\sigma_0 = \frac{1}{p_1^2 + \cdots + p_{n-1-c}^2}$ up to Maslov factors and half densities. $Q$ is a Fourier integral operator on $\Lambda_1 \backslash (\Lambda_0 \cap \Lambda_1)$. Its principal symbol, $\sigma_1$, solves the transport equation
$$\left( \sum_{i=1}^{n-1-c} \left( i H_{p_i} - c_i \right)^2 \right) \sigma_1 = \sigma_\pi$$
on $\Lambda_1 \backslash (\Lambda_0 \cap \Lambda_1)$, where $\sigma_\pi$ denotes the symbol of $\pi$. (We return to the evaluation of $\sigma_\pi$ in Sect. 6.) The expression between parentheses is an elliptic differential operator of
order 2 on each fiber of $L$. The equation is Laplace’s equation in every leaf of the foliation generated by the commuting vector fields $H_{p_i}$, $i = 1, \ldots, n - 1 - c$. The principal symbol $\sigma_1$ has a conormal singularity at $\Lambda_0 \cap \Lambda_1$, expressible by an appropriate Fourier transform of the singularity of $\sigma_0$, see [15, (5.14)].
4.3. Data continuation. We apply the results of the previous subsections to the problem of source-receiver continuation of seismic data: Seismic data are commonly measured on an open subset of the manifold of all possible observations. Continuation of these data from the open subset to the full acquisition manifold is desired for various data processing procedures, including imaging; in the seismic literature this continuation is referred to as the ‘forward extrapolation’ step within data regularization [9], or ‘data healing’.

Theorem 3. Suppose u is a distribution belonging to the range of the scattering operator F. Let $\chi = \chi(y, D_y)$ be a pseudodifferential operator of order 0 that acts as a cutoff in phase space $T^*Y \setminus 0$. Assume that we observe $u_0 = \chi u$ in accordance with the constraints of the acquisition geometry. Suppose χ is elliptic on a leaf of the foliation of L, then $\mathrm{WF}(\pi u_0)$ intersected with this leaf is equal to $\mathrm{WF}(u)$ intersected with the same leaf. In this case, π heals the data on this leaf.

Proof. We observe that u = πv for some v. Then
$$u_0 = (\chi \pi)\, v. \qquad (33)$$
Because $\pi^2 u = u$, it is natural to investigate $\pi u_0$, i.e.
$$\pi u_0 = (\pi \chi \pi)\, v. \qquad (34)$$
If χ is elliptic on a leaf of the foliation of L, then $(\pi - \pi \chi \pi)\, v = 0$, or $u - \pi u_0 = 0$, microlocally on this leaf. This implies the statement in the theorem.

We implement π by making use of the following observation. In view of the Bolker condition, Assumption 4, the composition $F^* F$ is an elliptic pseudodifferential operator of order n − 1. Let $\Xi$ denote the parametrix for $F^* F$. The operator $F \Xi F^*$ belongs to Guillemin and Sternberg’s algebra $R_L$ [14] of Fourier integral operators with canonical relation $\Lambda_1$ [6]. Clearly, $(F \Xi F^*)^2 = F \Xi F^*$ microlocally, while $I - \Xi F^* F = I - F^* F \Xi$ is the orthogonal projection onto the kernel of $F^* F$. Indeed, $F \Xi F^*$ is precisely an elliptic minimal projector [14, Proof of Thm. 8.3] of the type introduced in Sect. 4. The symbol of this operator follows by the standard composition calculus. Following the composition of F with $F^*$ in $F \Xi F^*$, we represent the canonical relation $\Lambda_1$ as the composition of canonical relations $\Lambda$ with $\Lambda^*$.

Remark 3. The transformation to zero offset (TZO) of seismic data, which is derived from Dip MoveOut, can be expressed in the form $R_0 \pi$, where $R_0$ is the restriction of distributions on Y to an acquisition manifold with coinciding sources and receivers: In this case, $y = (y', y'')$ with $y' = (\tfrac12 (s + r), t)$ and $y'' = \tfrac12 (r - s)$ if (s, r, t) are the original local coordinates on Y. Assumption 2, subject to this substitution, guarantees that the composition, $R_0 \pi$, is again a Fourier integral operator.
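The projector built from F, a parametrix of the normal operator $F^* F$, and $F^*$ has a simple finite-dimensional analogue that may help fix ideas. The sketch below is only an illustration under strong assumptions: the matrix `F` is a hypothetical 3×2 real matrix standing in for the scattering operator, the exact inverse of $F^T F$ replaces the parametrix, and none of the function names come from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse_2x2(m):
    """Invert a 2x2 matrix (assumed non-singular)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def range_projector(f):
    """Return F (F^T F)^{-1} F^T for a real matrix F with two independent columns."""
    ft = transpose(f)
    normal_inverse = inverse_2x2(matmul(ft, f))  # stands in for the parametrix
    return matmul(matmul(f, normal_inverse), ft)


# Hypothetical 3x2 'forward operator' (an assumption, not from the paper).
F = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
PI = range_projector(F)
PI_SQUARED = matmul(PI, PI)

# A vector in the range of F: u = F v with v = (2, -1).
u = [row[0] for row in matmul(F, [[2.0], [-1.0]])]
pi_u = [sum(p * x for p, x in zip(row, u)) for row in PI]
```

In this toy setting `PI_SQUARED` agrees with `PI` and `pi_u` agrees with `u` up to rounding, mirroring the projector property and the fact that elements of the range are left unchanged.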
5. Goldin’s Equation Revisited

In the context of the simplest seismic scattering theory, in a background that is essentially constant, the following simplifications are made. To begin with, the source, s, and receiver, r, points in Y are assumed to be contained in a flat surface, $\mathbb{R}^{n-1}$, while e is initially replaced by half source-receiver offset $h = \tfrac12 (r - s) \in \mathbb{R}^{n-1}$. Thus y is replaced by (s, r, t); we write η = (σ, ρ, τ). Essentially, we assume that the rays between reflector and acquisition surface are straight, see Fig. 1. We repeat the NMO correction,
$$\kappa : (s, r, t) \mapsto (z, t_n, h), \quad z = \tfrac12 (r + s), \quad h = \tfrac12 (r - s), \quad t_n = \sqrt{t^2 - \frac{4 h^2}{v^2}},$$
of the introduction. (The subscript n refers in this section to normal moveout.) Here, v could be thought of as the so-called NMO velocity, which can be introduced for ‘pure mode’ scattering (i.e. M = N), even in the anisotropic media under consideration here [12]; Goldin, however, restricts his analysis to an isotropic medium and compressional waves. We will reproduce Goldin’s result here in the context of our analysis subject to the substitutions n = 2 and c = 0 (and m = 1). NMO correction applied to the data yields $(\kappa^{-1})^* d$. Including a so-called geometrical spreading correction, a multiplication by time t, then leads to the map
$$d(s, r, t) \mapsto ((\kappa^{-1})^* (t\, d))(z, t_n, h) \qquad (35)$$
that replaces $(H^{-1} d)(x, e)$; the point x has attained coordinates $(z, t_n)$. The outcome is of the form
$$r(z, t_n, h) = R_n(z, h)\, c_n(t_n - T_n(z, h)). \qquad (36)$$

Fig. 1. Geometry underlying the annihilator symbol for constant coefficients
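The NMO coordinate map κ described above can be sketched in a few lines for n = 2, where s and r are scalars. This is a minimal illustration, assuming a constant velocity `v`; the function names `nmo_correct` and `nmo_inverse` are hypothetical and not taken from the paper.

```python
import math


def nmo_correct(s, r, t, v):
    """Map (s, r, t) to (z, t_n, h) with t_n = sqrt(t^2 - 4 h^2 / v^2)."""
    z = 0.5 * (r + s)          # midpoint
    h = 0.5 * (r - s)          # half offset
    arg = t * t - 4.0 * h * h / (v * v)
    if arg < 0.0:
        raise ValueError("t is smaller than the direct travel time 2|h|/v")
    return z, math.sqrt(arg), h


def nmo_inverse(z, t_n, h, v):
    """Map (z, t_n, h) back to (s, r, t)."""
    t = math.sqrt(t_n * t_n + 4.0 * h * h / (v * v))
    return z - h, z + h, t
```

For example, with s = 0, r = 6, t = 5 and v = 2, the half offset is 3, the moveout term is 4h²/v² = 9, and the corrected time is 4; applying `nmo_inverse` returns the original coordinates.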
Equation (36) replaces (22). The reflection time, T(s, r, x), maps under κ according to
$$T(s, r, x) \mapsto T_n(z, h) = \sqrt{(T(z - h, z + h, x))^2 - \frac{4 h^2}{v^2}}.$$
We observe that, in the simplification considered, $R_n$ is independent of $t_n$, while $c_n$ is not only a function of $(z, t_n)$ but also of h. Hence, a simple derivative of r with respect to h, motivated by (26), would not yield a vanishing outcome up to leading order. Instead, it is possible to construct a candidate operator $P_1$, acting on the data, directly. In Fig. 1 we introduce angles α and γ; in fact, (x, α, γ) can be identified with (x, ν, e). Using simple trigonometric identities (including the law of sines) and the geometry in Fig. 1 (observing that the total length of the reflected ray is $v t = (r - s) \frac{\cos \alpha}{\sin \gamma}$, with 2γ denoting the scattering angle and α denoting the incidence angle of the zero-offset ray at the surface), it follows that
$$p_1(s, r, t, \sigma, \rho, \tau) = \Bigl( t + \frac{(r - s)^2}{v^2 t} \Bigr) (\sigma - \rho) - 2 (r - s) \Bigl( \frac{\tau}{v^2} - \tau^{-1} \sigma \rho \Bigr) = 0 \qquad (37)$$
defines the points in L (Remark 2) in the simplification under consideration; $p_1$ is v dependent. Applying the coordinate transformation implied by κ to this symbol, and multiplying the result by frequency τ, yields
$$p'(z, t_n, h, \zeta, \tau_n, \varepsilon) = (\tau p_1)(\kappa(s, r, t), ((\kappa')^{-1})^t (\sigma, \rho, \tau)) \qquad (38)$$
or
$$p'(z, t_n, h, \zeta, \tau_n, \varepsilon) = -t_n \tau_n \varepsilon + h\, (\zeta^2 - \varepsilon^2), \qquad (39)$$
which defines the principal symbol of an operator $P'$; we observe that $p'$ is v independent. We recover Goldin’s equation,
$$P'(z, t_n, h, D_z, D_{t_n}, D_h)\, r = 0, \qquad P'(z, t_n, h, D_z, D_{t_n}, D_h) = t_n \frac{\partial^2}{\partial t_n \partial h} - h \Bigl( \frac{\partial^2}{\partial z^2} - \frac{\partial^2}{\partial h^2} \Bigr), \qquad (40)$$
which is valid up to highest order. It would be valid up to the next order if we had not applied the geometrical spreading correction in (35). Accounting for this correction leads to a subprincipal symbol contribution:
$$P'(z, t_n, h, D_z, D_{t_n}, D_h) := t_n \frac{\partial^2}{\partial t_n \partial h} - h \Bigl( \frac{\partial^2}{\partial z^2} - \frac{\partial^2}{\partial h^2} \Bigr) - \frac{\partial}{\partial h}.$$
Through the coordinate transformation implied by κ, we obtain a subprincipal symbol contribution to the operator with principal symbol $p_1$ (cf. (37)). Note that the operator $P'$ is of second order unlike the operator annihilating r in (26) which is of first order. However, in (38) we introduced a multiplication by τ, raising the order by one. A first-order operator derived from $P'$ in (40) is
$$P''(z, t_n, h, D_z, D_{t_n}, D_h) = \Bigl( \frac{\partial}{\partial h} \Bigr)^{-1} \frac{h}{t_n} \Bigl( \frac{\partial^2}{\partial h^2} - \frac{\partial^2}{\partial z^2} \Bigr) + \frac{\partial}{\partial t_n}.$$
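The passage from the v-dependent symbol (37) to the v-independent symbol (39) can be checked numerically at a sample phase-space point. The sketch below is an illustration under stated assumptions: `p1` encodes (37) in the form $(t + (r-s)^2/(v^2 t))(\sigma - \rho) - 2(r-s)(\tau/v^2 - \tau^{-1}\sigma\rho)$, `p_prime` encodes (39), and covectors are pulled back with the transposed Jacobian of the NMO map κ. All function names are hypothetical.

```python
import math


def p1(s, r, t, sigma, rho, tau, v):
    """Symbol (37), written for scalar s and r (n = 2)."""
    d = r - s
    return (t + d * d / (v * v * t)) * (sigma - rho) \
        - 2.0 * d * (tau / (v * v) - sigma * rho / tau)


def p_prime(z, t_n, h, zeta, tau_n, eps):
    """Symbol (39) in NMO-corrected coordinates."""
    return -t_n * tau_n * eps + h * (zeta * zeta - eps * eps)


def pull_back_covector(s, r, t, v, zeta, tau_n, eps):
    """Express (sigma, rho, tau) through (zeta, tau_n, eps) via the Jacobian of kappa."""
    h = 0.5 * (r - s)
    t_n = math.sqrt(t * t - 4.0 * h * h / (v * v))
    dtn_ds = 2.0 * h / (v * v * t_n)   # d t_n / d s
    dtn_dr = -dtn_ds                   # d t_n / d r
    dtn_dt = t / t_n                   # d t_n / d t
    sigma = 0.5 * zeta - 0.5 * eps + dtn_ds * tau_n
    rho = 0.5 * zeta + 0.5 * eps + dtn_dr * tau_n
    tau = dtn_dt * tau_n
    return sigma, rho, tau


def symbol_mismatch(s, r, t, v, zeta, tau_n, eps):
    """Return tau * p1 - p' at corresponding points; expected to vanish up to rounding."""
    h = 0.5 * (r - s)
    z = 0.5 * (r + s)
    t_n = math.sqrt(t * t - 4.0 * h * h / (v * v))
    sigma, rho, tau = pull_back_covector(s, r, t, v, zeta, tau_n, eps)
    return tau * p1(s, r, t, sigma, rho, tau, v) - p_prime(z, t_n, h, zeta, tau_n, eps)
```

Evaluating `symbol_mismatch` for different values of `v` at the same (z, t_n, h, ζ, τ_n, ε) gives values at rounding level, consistent with the remark that the transformed symbol no longer depends on v.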
We write down the Hamiltonian system describing the flowout as in (31); there is only one such system since n − c − 1 = 1. We use the first-order symbol, p (z, tn , h, ζ, τn , ε) = τn − εthn (ζ 2 − ε 2 ) (we omitted a factor i), so that 2hζ ∂z =− ∂e εtn
,
∂ζ = 0, ∂e
∂tn =1 ∂e
,
∂τn h = − 2 (ζ 2 − ε 2 ), ∂e εtn
∂h 2h ζ2 h , =− 2 + ∂e ε tn tn
(41)
∂ε 1 (ζ 2 − ε 2 ). = ∂e εtn
We set e = tn , and eliminate τn . To this end, we introduce the slowness vectors ζ and ε according to ζ = τn ζ and ε = τn ε. Substituting τn = εthn (ζ 2 − ε 2 ) (using that p = 0), and the equation for
∂τn ∂e ,
then yields the system
2h ζ ∂z =− ∂tn ε tn
,
∂ ζ = ∂tn
ζ , tn
∂h h 1 =− + , ∂tn ε tn
∂ ε = ∂tn
ζ2 . ε tn
(42)
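We note that $t_n$ and h are directly related to one another. Indeed, let the zero-offset reflection time be given by $T_0 = T(z_0, z_0, x) = T_n(z_0, 0)$. Then, for given z,
$$\frac{T_n}{T_0} = \frac{h}{(h^2 - (z - z_0)^2)^{1/2}}.$$
Along bicharacteristics,
$$\frac{h}{\bar{\varepsilon}\, t_n} = -\frac{v^2}{4 \sin^2 \alpha} \qquad (43)$$
is invariant (i.e. its derivative with respect to e is zero); here $\bar{\varepsilon} = \varepsilon / \tau_n$ is the slowness introduced above. We can convert $t_n$ to half scattering angle γ, keeping (x, α) fixed, according to the relation
$$\frac{T_n}{T_0} = \frac{\cos \alpha \cos \gamma}{(\cos^2 \alpha - \sin^2 \gamma)^{1/2}}, \qquad T_0 = \frac{2 V}{v}. \qquad (44)$$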
We discuss in as much the kernel of P determines the range of F under the simplication (‘straight rays’) under consideration. The range is described by wavefields of the form −1 cos2 γ + V C ¯ r(x, γ ) δ(t − T ), T = T (s, r, x), (45) d(s, r, t) = vt cos γ obtained after preprocessing d for time signature (a 2.5D correction) and source or receiver radiation characteristics, applying an appropriate pseudodifferential operator. In (45), V = 21 vT (z, z, x) denotes the length of the zero-offset ray, and C = x 2 (x1 ) cos3 α denotes the curvature of the reflector, see Fig. 1. We have assumed that the reflecting
interface can be described by the graph (x1 , x 2 (x1 )). (If the interface is the zero level ∂z2 set of z2 = z2 (x1 , x2 ) then we assume that ∂x = 0.) Equation (45) is the outcome of 2 a stationary phase calculation of the scattered field in the Kirchhoff approximation. We ¯ r, t) and obtain set r(x, γ ) ≡ 1 and apply (κ −1 )∗ to d(s, r¯ (z, tn , h) = An (z, h) δ(tn − Tn ), Tn = Tn (z, h), −1 cos2 γ + V C t , An = vt cos γ tn
(46)
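because $|\mathrm{d}t_n / \mathrm{d}t| = t / t_n$. Expression (46) is indeed of the form (36). To verify whether the wavefields in (45), via (46), coincide with functions in the kernel of $P'$, up to leading order, we first notice that by derivation $p'(z, T_n, h, -\mathrm{i} \tau_n \frac{\partial T_n}{\partial z}, \tau_n, -\mathrm{i} \tau_n \frac{\partial T_n}{\partial h}) = 0$. Secondly, we consider the transport equation derived from (40), which is given by
$$\Bigl( T_n - 2 h \frac{\partial T_n}{\partial h} \Bigr) \frac{\partial A_n}{\partial h} + 2 h \frac{\partial T_n}{\partial z} \frac{\partial A_n}{\partial z} + h A_n \Bigl( \frac{\partial^2 T_n}{\partial z^2} - \frac{\partial^2 T_n}{\partial h^2} \Bigr) = 0.$$
The velocity vector associated with a ray or characteristic is given by $(\frac{\partial T_n}{\partial z}, \frac{\partial T_n}{\partial h})$. Thus, along a characteristic, the transport equation becomes
$$-\frac{1}{A_n} \frac{\mathrm{d} A_n}{\mathrm{d} T_n} + h \Bigl( T_n \frac{\partial T_n}{\partial h} \Bigr)^{-1} \Bigl( \frac{\partial^2 T_n}{\partial z^2} - \frac{\partial^2 T_n}{\partial h^2} \Bigr) = 0, \qquad (47)$$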
where we made use of (42). We change variables according to (44), with 1 dTn sin2 α sin γ . =− Tn dγ (cos2 α − sin2 γ ) cos γ Furthermore, using the ray geometry, we find the identity Tn
∂ 2 Tn ∂ 2 Tn − ∂z2 ∂h2
∂ 2T cos2 γ sin2 α + V C cos2 γ =4 T = 4 + . ∂s∂r v2 v 2 cos2 γ + V C
(48)
Substituting identity (48) and invariant (43) into (47), applying the change of variables (44), leads to the equation, −
1 dAn + An dγ
1 1 − (cos2 α − sin2 γ ) cos2 γ + V C
sin γ cos γ = 0.
(49)
This equation can be directly integrated to yield solutions for An of the type (46). We conclude that the kernel of P generates wavefields of the type (45) which comprise the range of the scattering operator subject to processing for time signature and setting r(x, γ ) ≡ 1.
6. Numerical Example The minimal elliptic projector π is a Fourier integral operator and is directly implementable and applicable to data. This is the subject of this section. Indeed, given a smooth background model, we can construct a minimal elliptic projector for data continuation by operator composition, F F ∗ . On the other hand, however, in Sect. 4 we showed that, given the annihilators of the data (in practice, just their principal parts), we can construct the global parametrix Q, from which the elliptic minimal projector follows. This procedure is related to what Guillemin and Sternberg call relative geometrical quantization. We include an example to confirm the computability of our result. The algorithm used is designed and explained in [20]. In our example, n = 2 and χ is replaced by a smooth cutoff ψY ; the cutoff restricts the data to the set {(s, r, t) | s, r ∈ Rn−1 , r −s > h0 , t ∈ (0, T )}. The goal is to continue the data to an acquisition manifold with the constraint
r − s > h0 removed. Elastic-wave data were simulated over a model illustrated in Fig. 2 (top). The P-wave velocities are shown in grey scale; a low-velocity Gaussian lens was inserted (white-to-grey). The continuation is illustrated for the P-wave constituents even though the simulated data contained S waves as well. By selecting the vertical
Fig. 2. A numerical example of data continuation: The top figure is the isotropic P-wave velocity model used in the reflection data simulation, containing a Gaussian (low velocity) lens (in white). The bottom two figures both show a shot record (receiver location versus time) with source location at the black vertical line in top figure; the left record shows the outcome of continuation (the data in between the two vertical lines was missing) while the right record shows the original simulated data
displacement component, most energy in the wavefield can be attributed to P waves. For one value of s the synthetic data as a function of r and t are shown in Fig. 2 (bottom, right). We set T = 3 s and h0 = 500 m. The input to the continuation (u0) were the data with r values in between the black vertical lines removed (Fig. 2, bottom, left). The result of the continuation is plotted in between the black vertical lines of the same figure and should be compared with Fig. 2 (bottom, right).

References
1. Beylkin, G.: The inversion problem and applications of the generalized Radon transform. Comm. Pure Appl. Math. XXXVII, 579–599 (1984)
2. Biondi, B., Fomel, S., Chemingui, N.: Azimuth moveout for 3-D prestack imaging. Geophysics 63, 574–588 (1998)
3. Bolondi, G., Loinger, E., Rocca, F.: Offset continuation of seismic sections. Geoph. Prosp. 30, 813–828 (1982)
4. De Hoop, M.V., Bleistein, N.: Generalized Radon transform inversions for reflectivity in anisotropic elastic media. Inverse Problems 13, 669–690 (1997)
5. De Hoop, M.V., Brandsberg-Dahl, S.: Maslov asymptotic extension of generalized Radon transform inversion in anisotropic elastic media: A least-squares approach. Inverse Problems 16, 519–562 (2000)
6. De Hoop, M.V., Malcolm, A.E., Le Rousseau, J.H.: Seismic wavefield ‘continuation’ in the single scattering approximation: A framework for Dip and Azimuth MoveOut. Can. Appl. Math. Q. 10, 199–238 (2002)
7. Deregowski, S.G., Rocca, F.: Geometrical optics and wave theory of constant offset sections in layered media. Geoph. Prosp. 29, 374–406 (1981)
8. Duistermaat, J.J.: Fourier integral operators. Boston: Birkhäuser, 1996
9. Fomel, S.: Theory of differential offset continuation. Geophysics 68, 718–732 (2003)
10. Gel’fand, I.M., Graev, M.I.: Complexes of straight lines in the space C^n. Funct. Anal. Appl. 2, 39–52 (1968)
11. Goldin, S.: Superposition and continuation of transformations used in seismic migration. Russ. Geol. and Geophys. 35, 131–145 (1994)
12. Grechka, V., Tsvankin, I., Cohen, J.K.: Generalized Dix equation and analytic treatment of normal-moveout velocity for anisotropic media. Geoph. Prosp. 47, 117–148 (1999)
13. Guillemin, V.: In: Pseudodifferential operators and applications (Notre Dame, Ind., 1984), Chapter “On some results of Gel’fand in integral geometry”, Providence, RI: Amer. Math. Soc., 1985, pp. 149–155
14. Guillemin, V., Sternberg, S.: Some problems in integral geometry and some related problems in microlocal analysis. Amer. J. of Math. 101, 915–955 (1979)
15. Guillemin, V., Uhlmann, G.: Oscillatory integrals with singular symbols. Duke Math. J. 48, 251–267 (1981)
16. Hansen, S.: Solution of a hyperbolic inverse problem by linearization. Commun. Par. Differ. Eqs. 16, 291–309 (1991)
17. Hörmander, L.: The analysis of linear partial differential operators. Volume IV. Berlin: Springer-Verlag, 1985
18. Hörmander, L.: The analysis of linear partial differential operators. Volume III. Berlin: Springer-Verlag, 1985
19. John, F.: The ultrahyperbolic differential equation with four independent variables. Duke Math. J. 4, 300–322 (1938)
20. Malcolm, A.E., De Hoop, M.V., Le Rousseau, J.H.: The applicability of DMO/AMO in the presence of caustics. Geophysics 70, 51 (2005)
21. Maslov, V.P., Fedoriuk, M.V.: Semi-classical approximation in quantum mechanics. Dordrecht: Reidel Publishing Company, 1981
22. Patch, S.K.: Computation of unmeasured third-generation VCT views from measured views. IEEE Trans. Med. Imaging 21, 801–813 (2002)
23. Stolk, C.C., De Hoop, M.V.: Microlocal analysis of seismic inverse scattering in anisotropic, elastic media. Comm. Pure Appl. Math. 55, 261–301 (2002)
24. Taylor, M.E.: Reflection of singularities of solutions to systems of differential equations. Comm. Pure Appl. Math. 28, 457–478 (1975)

Communicated by P. Constantin
Commun. Math. Phys. 263, 21–64 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1492-5
Communications in
Mathematical Physics
Mott Law as Lower Bound for a Random Walk in a Random Environment
A. Faggionato1, H. Schulz-Baldes2, D. Spehner3
1 Weierstrass Institut für Angewandte Analysis und Stochastik, 10117 Berlin, Germany
2 Institut für Mathematik, Technische Universität Berlin, 10623 Berlin, Germany
3 Fachbereich Physik, Universität Duisburg-Essen, 45117 Essen, Germany
Received: 21 July 2004 / Accepted: 16 September 2005 Published online: 24 January 2006 – © Springer-Verlag 2006
Abstract: We consider a random walk on the support of an ergodic stationary simple point process on $\mathbb{R}^d$, d ≥ 2, which satisfies a mixing condition w.r.t. the translations or has a strictly positive density uniformly on large enough cubes. Furthermore the point process is furnished with independent random bounded energy marks. The transition rates of the random walk decay exponentially in the jump distances and depend on the energies through a factor of the Boltzmann-type. This is an effective model for the phonon-induced hopping of electrons in disordered solids within the regime of strong Anderson localization. We show that the rescaled random walk converges to a Brownian motion whose diffusion coefficient is bounded below by Mott’s law for the variable range hopping conductivity at zero frequency. The proof of the lower bound involves estimates for the supercritical regime of an associated site percolation problem.

1. Introduction

1.1. Main Result. Let us directly describe the model and the main results of this work, deferring a discussion of the underlying physics to the next section. Suppose given an infinite countable set of random points $\{x_j\} \subset \mathbb{R}^d$ distributed according to some ergodic stationary simple point process. One can identify this set with the simple counting measure $\hat{\xi} = \sum_j \delta_{x_j}$ having $\{x_j\}$ as its support, and then write $x \in \hat{\xi}$ if $x \in \{x_j\}$. The σ-algebra $\mathcal{B}(\hat{\mathcal{N}})$ on the space $\hat{\mathcal{N}}$ of counting measures on $\mathbb{R}^d$ is generated by the family of subsets $\{\hat{\xi} \in \hat{\mathcal{N}} : \hat{\xi}(B) = n\}$, where $B \subset \mathbb{R}^d$ is Borel and $n \in \mathbb{N}$. The distribution $\hat{\mathcal{P}}$ of the point process is a probability on the measure space $(\hat{\mathcal{N}}, \mathcal{B}(\hat{\mathcal{N}}))$. It is stationary and ergodic w.r.t. the translations $x \mapsto x + y$ of $\mathbb{R}^d$. In the sequel, we need to impose boundedness of some κth moment defined by
$$\rho_\kappa := \mathbf{E}_{\hat{\mathcal{P}}} \bigl( \hat{\xi}(C_1)^\kappa \bigr), \qquad (1)$$
where $C_1 = [-\tfrac12, \tfrac12]^d$ and $\mathbf{E}_{\hat{\mathcal{P}}}$ is the expectation w.r.t. $\hat{\mathcal{P}}$. Then $\rho = \rho_1$ is the so-called intensity of the process.
To each $x_j$ is associated a random energy mark $E_{x_j} \in [-1, 1]$. These marks are drawn independently and identically according to a probability measure ν. Again, $\{(x_j, E_{x_j})\}$ is naturally identified with an element ξ of the space $\mathcal{N}$ of counting measures on $\mathbb{R}^d \times [-1, 1]$, and the distribution $\mathcal{P}$ of the marked process is a measure on $(\mathcal{N}, \mathcal{B}(\mathcal{N}))$ (with $\mathcal{B}(\mathcal{N})$ defined similarly to $\mathcal{B}(\hat{\mathcal{N}})$). The distribution $\mathcal{P}$ is said to be the ν-randomization of $\hat{\mathcal{P}}$ [Kal]. It is stationary and ergodic w.r.t. $\mathbb{R}^d$-translations. In order to assure that $\{x_j\}$ contains the origin, we consider the measurable subset $\mathcal{N}_0 = \{\xi \in \mathcal{N} : \xi(\{0\} \times [-1, 1]) = 1\}$ furnished with the σ-algebra $\mathcal{B}(\mathcal{N}_0) = \{A \cap \mathcal{N}_0 : A \in \mathcal{B}(\mathcal{N})\}$. The random environment is given by a configuration $\xi \in \mathcal{N}_0$ randomly chosen along the Palm distribution $\mathcal{P}_0$ associated to $\mathcal{P}$. Roughly, one can think of $\mathcal{P}_0$ as the probability on $(\mathcal{N}_0, \mathcal{B}(\mathcal{N}_0))$ obtained by conditioning $\mathcal{P}$ to the event $\mathcal{N}_0$ (see Sect. 2). Note that almost each environment is a simple counting measure, and therefore it can be identified with its support as we will do in what follows.

For a fixed environment $\xi \equiv \{(x_j, E_{x_j})\} \in \mathcal{N}_0$ let us consider a continuous-time random walk over the points $\{x_j\}$ starting at the origin x = 0 with transition rates from $x \in \hat{\xi}$ to $y \in \hat{\xi}$ given by
$$c_{x,y}(\xi) := \exp \bigl( -|x - y| - \beta (|E_x - E_y| + |E_x| + |E_y|) \bigr), \quad x \neq y, \qquad (2)$$
where β > 0 is the inverse temperature. More precisely, let $\Omega_\xi = D([0, \infty), \mathrm{supp}(\hat{\xi}))$ be the space of right-continuous paths on the support of $\hat{\xi}$ having left limits, endowed with the Skorohod topology [Bil]. Let us write $(X_t^\xi)_{t \ge 0}$ for a generic element of $\Omega_\xi$. If $\mathbf{P}_0^\xi$ denotes the distribution on $(\Omega_\xi, \mathcal{B}(\Omega_\xi))$ of the above random walk starting at the origin, then the set of stationary transition probabilities $p_t^\xi(y|x) := \mathbf{P}_0^\xi(X_{s+t}^\xi = y \,|\, X_s^\xi = x)$, $x, y \in \hat{\xi}$, t ≥ 0, s > 0, satisfy the following conditions for small values of t [Bre]:

(C1) $p_t^\xi(y|x) = c_{x,y}(\xi)\, t + o(t)$ if x ≠ y;
(C2) $p_t^\xi(x|x) = 1 - \lambda_x(\xi)\, t + o(t)$ with $\lambda_x(\xi) := \sum_{y \in \hat{\xi}} c_{x,y}(\xi)$, where $c_{x,x}(\xi) := 0$.

It is verified in Appendix A that, provided that $\rho_2 < \infty$, no explosions occur and thus the random walk is well-defined for $\mathcal{P}_0$-almost all ξ. Our main interest concerns the long time asymptotics of the random walk and the diffusion matrix D defined by
$$(a \cdot D a) = \lim_{t \to \infty} \frac{1}{t}\, \mathbf{E}_{\mathcal{P}_0} \mathbf{E}_{\mathbf{P}_0^\xi} \bigl( (X_t^\xi \cdot a)^2 \bigr), \quad a \in \mathbb{R}^d, \qquad (3)$$
where (a · b) denotes the scalar product of the vectors a and b in $\mathbb{R}^d$. The main results of the work are (i) the existence of the limit (3) in any dimension d ≥ 1 as well as the convergence of the (diffusively rescaled) random walk to a Brownian motion with finite covariance matrix D ≥ 0; (ii) a quantitative lower bound on D in dimension d ≥ 2 under given assumptions on the energy distribution ν and either one of the following two technical hypotheses. Let $\ell$ denote the Lebesgue measure and $C_N = [-N/2, N/2]^d$. Given $A \subset \mathbb{R}^d$, let $\mathcal{F}_A$ be the σ-subalgebra in $\mathcal{B}(\hat{\mathcal{N}})$ generated by the random variables $\hat{\xi}(B)$ with B ⊂ A and $B \in \mathcal{B}(\mathbb{R}^d)$.

(H1) $\hat{\mathcal{P}}$ admits a lower bound ρ' > 0 on the point density:
$$\hat{\xi}(C_N) \ge \rho'\, \ell(C_N), \quad \forall\, N \ge N_0, \quad \hat{\mathcal{P}}\text{-a.s.}, \qquad (4)$$
with ρ' and $N_0$ independent of $\hat{\xi}$.
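(H2) $\hat{\mathcal{P}}$ satisfies the following mixing condition: there exists a function $h : \mathbb{R}_+ \to \mathbb{R}_+$ with $h(r) \le c\, (1 + r^{2d+7+\delta})^{-1}$ for some c, δ > 0 such that for any $r_2 \ge r_1 > 1$,
$$\bigl| \hat{\mathcal{P}}(A \,|\, \mathcal{F}_{\mathbb{R}^d \setminus C_{r_2}}) - \hat{\mathcal{P}}(A) \bigr| \le r_1^d\, r_2^{d-1}\, h(r_2 - r_1), \quad \forall\, A \in \mathcal{F}_{C_{r_1}}, \quad \hat{\mathcal{P}}\text{-a.s.} \qquad (5)$$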
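The transition rates (2) and the escape rate $\lambda_x(\xi)$ of condition (C2) are simple to evaluate for a finite sample of marked points. The sketch below is a minimal illustration, assuming a finite list of points (tuples of coordinates) and energies in place of the infinite point process; the function names are hypothetical.

```python
import math


def transition_rate(x, y, e_x, e_y, beta):
    """Rate (2): exp(-|x - y| - beta (|E_x - E_y| + |E_x| + |E_y|)), zero if x == y."""
    if x == y:
        return 0.0
    distance = math.dist(x, y)  # Euclidean distance (Python 3.8+)
    energy_cost = abs(e_x - e_y) + abs(e_x) + abs(e_y)
    return math.exp(-distance - beta * energy_cost)


def escape_rate(index, points, energies, beta):
    """Sum of the rates out of points[index] over a finite sample of marked points."""
    x, e_x = points[index], energies[index]
    return sum(transition_rate(x, y, e_x, e_y, beta)
               for y, e_y in zip(points, energies))
```

For instance, two points at distance 1 with energies 0.5 and −0.5 at β = 1 have energy cost 1 + 0.5 + 0.5 = 2, so the rate is exp(−3). The rates are symmetric in (x, y), and they decrease as β grows unless both energies vanish.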
We feel that Hypotheses (H1) and (H2) cover nearly all interesting examples; see however Example 2 below. The uniform lower bound (H1) holds in the case of random and quasiperiodic tilings and, more generally, the so-called Delone sets [BHZ]. The type of mixing condition (H2) is inspired by decorrelation estimates holding for Gibbs measures of spin systems in a high temperature phase [Mar]. It is satisfied for a stationary Poisson point process as well as for point processes with finite range correlations. Due to the stationarity of $\hat{\mathcal{P}}$, (H2) implies that $\hat{\mathcal{P}}$ is a mixing, and, in particular, ergodic point process (see [DV, Chap. 10]). We can now state more precisely the above-mentioned results.

Theorem 1. Let $\hat{\mathcal{P}}$ be the distribution of an ergodic stationary simple point process on $\mathbb{R}^d$, let $\mathcal{P}$ be the distribution of its ν-randomization with a probability measure ν on [−1, 1], and let $\mathcal{P}_0$ be the Palm distribution associated to $\mathcal{P}$. Assume that $\rho_{12} < \infty$ and that
$$\xi = \sum_j \delta_{(x_j, E_j)} \;\Rightarrow\; \xi \neq S_x \xi := \sum_j \delta_{(x_j - x, E_j)} \quad \forall\, x \in \mathbb{R}^d \setminus \{0\}, \quad \mathcal{P}\text{-a.s.} \qquad (6)$$
Condition (6) is automatically satisfied if ν is not a Dirac measure. Then:

(i) The limit in (3) exists and the rescaled process $Y^{\xi,\varepsilon} = (\varepsilon X^\xi_{t \varepsilon^{-2}})_{t \ge 0}$ defined on $(\Omega_\xi, \mathbf{P}_0^\xi)$ converges weakly in $\mathcal{P}_0$-probability as ε → 0 to a Brownian motion $W^D$ with covariance matrix D. Namely, for any bounded continuous function F on the path space $D([0, \infty), \mathbb{R}^d)$ endowed with the Skorohod topology,
$$\mathbf{E}_{\mathbf{P}_0^\xi} \bigl( F(Y^{\xi,\varepsilon}) \bigr) \to \mathbf{E} \bigl( F(W^D) \bigr) \quad \text{in } \mathcal{P}_0\text{-probability}.$$

(ii) Suppose d ≥ 2 and let either (H1) or (H2) be satisfied. Furthermore, suppose that there are some positive constants α, $c_0$ such that, for any 0 < E ≤ 1,
$$\nu([-E, E]) \ge c_0\, E^{1+\alpha}. \qquad (7)$$
Then
$$D \ge c_1\, \beta^{-\frac{d(\alpha+1)}{\alpha+1+d}} \exp \Bigl( -c_2\, \beta^{\frac{\alpha+1}{\alpha+1+d}} \Bigr)\, \mathbf{1}_d, \qquad (8)$$
where $\mathbf{1}_d$ is the d × d identity matrix and $c_1$ and $c_2$ are some positive β-independent constants.

The important factor in the lower bound (8) is the exponential factor and not the power law in front of it (on which we comment below though). Based on the following heuristics due to Mott [Mot, SE], we expect that the expression in the exponential in (8) captures the good asymptotic behavior of ln D in the low temperature limit β ↑ ∞ if $\nu([-E, E]) \sim c_0 E^{1+\alpha}$ as E ↓ 0. Indeed, as β becomes larger, the rates (2) fluctuate widely with (x, y) because of the exponential energy factor. The low temperature limit effectively selects only jumps between points with energies in a small interval [−E(β), E(β)] shrinking to zero as β ↑ ∞. Assuming that D is determined by those
jumps with the largest rate, one obtains directly the characteristic exponential factor on the r.h.s. of (8) by maximizing these rates for a fixed temperature under the constraint that the mean density of points $x_j$ with energies in [−E(β), E(β)] is equal to $\rho\, \nu([-E(\beta), E(\beta)]) \sim c_0\, \rho\, E(\beta)^{1+\alpha}$. One speaks of variable range hopping since the characteristic mean distance |x − y| between sites with optimal jump rates varies heavily with the temperature. A crucial (and physically reasonable, as discussed below) element of this argument is the independence of the energies $E_x$. The selection of the points $\{x_j\}$ with energies in the window [−E(β), E(β)] then corresponds mathematically to a p-thinning with p = ν([−E(β), E(β)]). It is a well-known fact (see e.g. [Kal, Theorem 16.19]) that an adequate rescaling of the p-thinning of a stationary point process converges in the limit p ↓ 0 (corresponding here to β ↑ ∞) to a stationary Poisson point process (PPP). Hence one might call the stationary PPP the normal form of a model leading to Mott’s law, namely the exponential factor on the r.h.s. of (8), and we believe that proving the upper bound corresponding to (8) should therefore be most simple for the PPP. In dimension d = 1, a different behavior of D is expected [LB] and this will not be considered here. Note that statement (i) does not necessarily imply that the motion of the particle is diffusive at large time, since it could happen that D = 0.

The preexponential factor in (8) can be improved to $\beta^{\frac{(\alpha+1)(2-d)}{1+\alpha+d}}$ by means of formal scaling arguments on the formulas in Sects. 4 to 6. As we are not sure that this is optimal and we do not control the constant $c_2$ in (8) anyway, we choose not to develop this improvement in detail.
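The powers of β appearing in the bound (8) are simple rational functions of α and d, and tabulating them for a few cases can make the statement more concrete. The sketch below is only an illustration; the function names are hypothetical, and exact rational arithmetic is used so that the values can be compared without rounding.

```python
from fractions import Fraction


def mott_exponent(alpha, d):
    """Power of beta inside the exponential of (8): (alpha + 1) / (alpha + 1 + d)."""
    alpha = Fraction(alpha)
    return (alpha + 1) / (alpha + 1 + d)


def prefactor_exponent(alpha, d):
    """Power of beta in front of the exponential of (8): -d (alpha + 1) / (alpha + 1 + d)."""
    alpha = Fraction(alpha)
    return -d * (alpha + 1) / (alpha + 1 + d)
```

For example, α = 0 and d = 3 give the exponent 1/4, while α = 2 and d = 3 give 1/2; in every case the prefactor exponent equals −d times the exponent inside the exponential.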
1.2. Physical discussion. Our main motivation for studying the above model comes from its importance for phonon-assisted hopping conduction [SE] in disordered solids in which the Fermi level (set equal to 0 above) lies in a region of strong Anderson localization. This means that the electron Hamiltonian has exponentially localized quantum eigenstates with localization centers xj if the corresponding energies Exj are close to the Fermi level. The DC conductivity of such materials would vanish if it were not for the lattice vibrations (phonons) at nonzero temperature. They induce transitions between the localized eigenstates, the rate of which can be calculated from first principle by means of the Fermi golden rule [MA, SE]. In the variable range hopping regime at low temperature, the Markov and adiabatic (or rotating wave) approximations can be used to treat quantum mechanically the electron-phonon coupling [Spe]. Coherences between electronic eigenstates with different energies decay very rapidly under the resulting dissipative electronic dynamics and one can show that the hopping DC conductivity of the disordered solid coincides with the conductivity associated with a Markov jump process on the set of localization centers {xj }, hence justifying the use of a model of classical mechanics [BRSW]. Because Pauli blocking due to Fermi statistics of the electrons has to be taken into account, this leads to a rather complicated exclusion process (e.g. [Qua, FM]). If, however, the blocking is treated in an effective medium (or mean field) approximation, one obtains a family of independent random walks with rates which are given by (2) in the limit β ↑ ∞ [MA, AHL]. Let us discuss the remaining aspects of the model. 
The stationarity of the underlying simple point process $\{x_j\}$ simply reflects that the material is homogeneous, while the independence of the energy marks is compatible with Poisson level statistics, which is a general rough indicator for the localization regime and has been proven to hold for an Anderson model [Min]. The exponent α allows one to model a possible Coulomb pseudogap in the density of states [SE].
Having in mind the Einstein relation between the conductivity and the diffusion coefficient (which can be stated as a theorem for a number of models [Spo]), the lower bound (8) gives a lower bound on the hopping DC conductivity. In the above materials, the DC conductivity shows experimentally Mott’s law, namely a low-temperature behavior which is well approximated by the exponential factor in the r.h.s. of (8) with α = 0, as predicted by Mott [Mot] based on the optimization argument discussed above. In certain materials having a Coulomb pseudogap in the density of states, Mott’s law with α = d − 1 is observed, as predicted by Efros and Shklovskii [EF]. A first convincing justification of Mott’s argument was given by Ambegaokar, Halperin and Langer [AHL], who first reduced the hopping model to a related random resistor network, in a manner similar to the work of Miller and Abrahams [MA], and then pointed out that the constant $c_2$ in (8) can be estimated using percolation theory [SE]. Our proof of the lower bound (8) is inspired by this work. Let us also mention that the low frequency AC conductivity (response to an oscillating electric field) in disordered solids has recently been studied within a quantum-mechanical one-body approximation in [KLP]. Here the energy necessary for a jump between localized states comes from a resonance at the frequency of the external electric field rather than a phonon. It leads to another well-known formula for the conductivity which is also due to Mott.

1.3. Overview. Let us develop the main ideas of the proof of Theorem 1, leaving precise statements and their proofs to the following sections. The model described above is a random walk in a random environment. A main tool used in this work is the contribution of De Masi, Ferrari, Goldstein and Wick [DFGW] which is based on prior work by Kipnis and Varadhan [KV].
They construct a new Markov process, called the environment viewed from the particle, which allows one to translate the homogeneity of the medium into properties of the random walk. In Sect. 3, we argue that $X_t^\xi$ has finite moments w.r.t. $\mathcal{P}_0(d\xi)\, \mathbf{P}_0^\xi$ (Proposition 1) and study the generator of the process environment viewed from the particle when the initial environment is chosen according to the Palm distribution $\mathcal{P}_0$ (Propositions 2 and 3), thus allowing us to apply the general Theorem 2.2 of [DFGW] to deduce the existence of the limit (3). The convergence to a Brownian motion stated in Theorem 1 also follows, but this could have been obtained (avoiding an analysis of the infinitesimal generator) by applying Theorem 17.1 of [Bil] and Theorem 2.1 of [DFGW]. The results of [DFGW] also lead to a variational formula for the diffusion matrix D (Theorem 2 below). The main virtue of this formula is that it allows one to bound D from below through bounds on the transition rates.

The first step in proving Theorem 1(ii) is to define a new random walk with transition rates bounded above by the rates (2). This is done in Sect. 4 in the following way. For a fixed configuration $\xi \in \mathcal{N}_0$ of the environment, consider the set $\{x_j^c\} = \{x_j : |E_{x_j}| \le E_c\}$ of all random points having energies inside a given energy window $[-E_c, E_c]$ with $0 < E_c \le 1$. The distribution $\hat{\mathcal{P}}^c$ of these points is obtained from $\mathcal{P}$ by a $\delta_c$-thinning with $\delta_c = \nu([-E_c, E_c])$. Given a cut-off distance $r_c > 0$, consider the random walk on $\mathrm{supp}(\hat{\xi})$ with the transition rates $\hat{c}_{x,y}(\xi) = \chi(|x - y| \le r_c)\, \chi(|E_x| \le E_c)\, \chi(|E_y| \le E_c)$, where χ is the characteristic function. Since we want this random walk to have a strictly positive diffusion coefficient in the limit $E_c \to 0$, one must choose $r_c$ such that the mean number of points $x_j$ with energies in $[-E_c, E_c]$ inside a ball of radius $r_c$ is larger than an $E_c$-independent constant $c_3 > 0$. This mean number is equal to $c_4\, \delta_c\, r_c^d$ and is larger than $c_5\, E_c^{1+\alpha}\, r_c^d$ by assumption (7), where $c_4$ and $c_5$ are constants depending on ρ and d only. Hence $r_c = c_6\, E_c^{-(1+\alpha)/d}$. It is shown in Proposition 5 that the diffusion matrix of
the new random walk is equal to $\delta_c\, D(r_c, E_c)$, where $D(r_c, E_c)$ is the diffusion matrix of a random walk on $\{x_j^c\}$ with energy-independent transition rates $\chi(|x - y| \le r_c)$. By the monotonicity of D in the jump rates and since $c_{x,y}(\xi) \ge \exp(-r_c - 4 \beta E_c)\, \hat{c}_{x,y}(\xi)$, one gets using also assumption (7) and the constant $c_0$ therein:
$$D \ge c_0\, E_c^{1+\alpha}\, e^{-r_c - 4 \beta E_c}\, D(r_c, E_c). \qquad (9)$$
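The trade-off in the exponent of (9) between the cut-off distance and the energy window can be explored numerically. The sketch below is an illustration under explicit assumptions: $r_c = c_6 E_c^{-(1+\alpha)/d}$ as chosen above, the factor $D(r_c, E_c)$ and the power-law prefactor are treated as constants, the constraint $E_c \le 1$ is ignored, and `c6 = 1` is an arbitrary default. The function names are hypothetical.

```python
def exponent_cost(e_c, beta, alpha, d, c6=1.0):
    """Exponent r_c + 4 beta E_c of (9) with r_c = c6 * E_c^{-(1 + alpha) / d}."""
    return c6 * e_c ** (-(1.0 + alpha) / d) + 4.0 * beta * e_c


def optimal_window_closed_form(beta, alpha, d, c6=1.0):
    """Stationary point of exponent_cost: E_c = (c6 (1 + alpha) / (4 d beta))^{d / (1 + alpha + d)}."""
    return (c6 * (1.0 + alpha) / (4.0 * d * beta)) ** (d / (1.0 + alpha + d))


def optimal_window_numeric(beta, alpha, d, c6=1.0, lo=1e-6, hi=10.0, steps=200):
    """Ternary search for the minimizer of the (convex) exponent on [lo, hi]."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if exponent_cost(m1, beta, alpha, d, c6) < exponent_cost(m2, beta, alpha, d, c6):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

Setting the derivative of the exponent to zero gives $E_c \propto \beta^{-d/(1+\alpha+d)}$, so that both $r_c$ and $\beta E_c$ scale like $\beta^{(1+\alpha)/(1+\alpha+d)}$, the power appearing in the exponential of (8). For α = 0 and d = 3, multiplying β by 16 divides the optimal window by 8.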
In Sect. 5, a lower bound on D(rc , Ec ) is obtained by considering periodic approximants (in the limit of large periods) as in [DFGW]. The diffusion coefficient of these approximants can be computed as the resistance of a random resistor network. The resistance of the random resistor network is bounded by invoking estimates from percolation theory in Sect. 6, hence showing that, if rc is large enough, D(rc , Ec ) > c7 1d , where −(1+α)/d c7 > 0 is independent on Ec , β. Recalling that rc = c6 Ec , an optimization w.r.t. d Ec of the right member of (9) then yields Ec = c8 β − 1+α+d and thus the lower bound (8). Let us note that this optimization is the same as in the Mott argument discussed above and that Ec ↓ 0 and rc ↑ ∞ as β ↑ ∞. Moreover, our optimized lower bound results from a critical resistor network roughly approximating the one appearing in [AHL]. The paper is organized as follows. In Sect. 2 we recall some definitions and results about point processes and state some technical results needed later on. The statements (i) and (ii) of Theorem 1 are proven in Sect. 3 and in Sects. 4 to 6, respectively. In Appendix A we show that the continuous-time random walk in the random environment is well defined by verifying the absence of explosion phenomena. Appendix B contains some technical proofs about the Palm measure. Appendix C is devoted to the proof of Proposition 1. 2. The Random Environment In this section, we recall some properties of point processes (for more details, see [DV, FKAS, MKM, Kal, Tho]). In the sequel, given a topological space X, B(X) will denote the σ -algebra of Borel subsets of X. Given a set A, |A| will denote its cardinality. Moreover, given a probability measure µ, we write Eµ for the corresponding expectation. 2.1. Stationary simple marked point processes. Given a bounded complete separable metric space K, consider the space N := N (Rd × K) of all counting measures ξ on Rd × K, i.e. 
integer-valued measures such that $\xi(B \times K) < \infty$ for any bounded set $B \in \mathcal{B}(\mathbb{R}^d)$. One can show that $\xi \in \mathcal{N}$ if and only if $\xi = \sum_j \delta_{(x_j, k_j)}$, where $\delta$ is the Dirac measure and $\{(x_j, k_j)\}$ is a countable family of (not necessarily distinct) points in $\mathbb{R}^d \times K$ with at most finitely many points in any bounded set. Then $k_j$ is called the mark at $x_j$. Given $\xi \in \mathcal{N}$, we write $\hat\xi \in \mathcal{N}(\mathbb{R}^d)$ for the counting measure on $\mathbb{R}^d$ defined by $\hat\xi(B) = \xi(B \times K)$ for any $B \in \mathcal{B}(\mathbb{R}^d)$. Given $x \in \mathbb{R}^d$, we write $x \in \hat\xi$ whenever $x \in \mathrm{supp}(\hat\xi)$. If $\hat\xi(\{x\}) \le 1$ for any $x \in \mathbb{R}^d$, we say that $\xi \in \mathcal{N}$ is simple and write $k_{x_j} := k_j$ for any $x_j \in \hat\xi$.

A metric on $\mathcal{N}$ can be defined in the following way [MKM, Sect. 1.15]. Fix an element $k^* \in K$. Denote by $B_r(x, k)$ and $B_r$ the open balls in $\mathbb{R}^d \times K$ of radius $r > 0$ centred on $(x, k)$ and on $(0, k^*)$, respectively. Let $\xi = \sum_{i \in I} \delta_{(x_i, k_i)}$ and $\xi' = \sum_{j \in J} \delta_{(x'_j, k'_j)}$ be elements of $\mathcal{N}$, where $I$, $J$ are countable sets. Then $\xi$ and $\xi'$ are close to each other if any point $(x_i, k_i)$ contained in $B_n$ is close to a point $(x'_j, k'_j)$ for arbitrarily large $n$, up
Mott Law as Lower Bound for a Random Walk in a Random Environment
to “boundary effects”. More precisely, given a positive integer $n$, let $d_n(\xi, \xi')$ be the infimum over all $\varepsilon > 0$ such that there is a one-to-one map $f$ from a (possibly empty) subset $D$ of $I$ into a subset of $J$ with the properties:
(i) $\mathrm{supp}(\xi) \cap B_{n-\varepsilon} \subset \{(x_i, k_i) : i \in D\}$;
(ii) $\mathrm{supp}(\xi') \cap B_{n-\varepsilon} \subset \{(x'_j, k'_j) : j \in f(D)\}$;
(iii) $(x'_{f(i)}, k'_{f(i)}) \in B_\varepsilon(x_i, k_i)$ for $i \in D$.

One can show that $d_{\mathcal{N}}(\xi, \xi') = \sum_{n=1}^{\infty} 2^{-n} d_n(\xi, \xi')$ is a bounded metric on $\mathcal{N}$ and for this metric $\mathcal{N}$ is complete and separable. Moreover, the sets $\{\xi \in \mathcal{N} : \xi(B) = n\}$, $B \in \mathcal{B}(\mathbb{R}^d \times K)$, $n \in \mathbb{N}$, generate the Borel $\sigma$-algebra $\mathcal{B}(\mathcal{N})$ and $d_{\mathcal{N}}$ generates the coarsest topology such that $\xi \in \mathcal{N} \mapsto \int \xi(dx, dk)\, f(x, k)$ is continuous for any continuous function $f \ge 0$ on $\mathbb{R}^d \times K$ with bounded support. Finally, by choosing different reference points $k^*$ one obtains equivalent metrics.

A marked point process on $\mathbb{R}^d$ with marks in $K$ is then a measurable map from a probability space into $\mathcal{N}$. We denote by $\mathcal{P}$ its distribution (a probability measure on $(\mathcal{N}, \mathcal{B}(\mathcal{N}))$). We say that the process is simple if $\mathcal{P}$-almost all $\xi \in \mathcal{N}$ are simple. The translations on $\mathbb{R}^d$ extend naturally to $\mathbb{R}^d \times K$ by $S_x : (y, k) \mapsto (x + y, k)$. This induces an action $S$ of the translation group $\mathbb{R}^d$ on $\mathcal{N}$ by $(S_x \xi)(B) = \xi(S_x B)$, where $B \in \mathcal{B}(\mathbb{R}^d \times K)$ and $x \in \mathbb{R}^d$. For simple counting measures,
$$S_x \xi = \sum_{y \in \hat\xi} \delta_{(y - x,\, k_y)}.$$
A marked point process is said to be stationary if $\mathcal{P}(A) = \mathcal{P}(S_x A)$ for all $x \in \mathbb{R}^d$, $A \in \mathcal{B}(\mathcal{N})$, and (space) ergodic if the $\sigma$-algebra of translation invariant sets is trivial, i.e., if $A \in \mathcal{B}(\mathcal{N})$ satisfies $S_x A = A$ for all $x \in \mathbb{R}^d$ then $\mathcal{P}(A) \in \{0, 1\}$. Due to [DV, Prop. 10.1.IV], if $\mathcal{P}$ is stationary and gives no weight to the trivial measure without any point (which will be assumed here), then
$$\mathcal{P}\big( \xi \in \mathcal{N} : |\mathrm{supp}(\hat\xi)| = \infty \big) = 1. \tag{10}$$

The marked point processes studied in this work are obtained by the procedure of randomization, which we recall now together with the related notion of thinning (see [Kal]). Let $\hat{\mathcal{P}}$ be a stationary simple point process (SSPP) on $\mathbb{R}^d$, $\nu$ be a probability measure on $[-1, 1]$ and $p \in [0, 1]$. The $\nu$-randomization of $\hat{\mathcal{P}}$ is the stationary simple marked point process (SSMPP) $\mathcal{P}_\nu$ obtained by assigning to each realization $\hat\xi = \sum_{i \in I} \delta_{x_i}$ of $\hat{\mathcal{P}}$ the measure $\xi = \sum_{i \in I} \delta_{(x_i, E_i)}$, where $\{E_i\}_{i \in I}$ are independent identically distributed random variables having distribution $\nu$. Finally, the $p$-thinning $\hat{\mathcal{P}}_p$ of $\hat{\mathcal{P}}$ is the SSPP on $\mathbb{R}^d$ obtained by assigning to each realization $\hat\xi$ the measure $\sum_{i \in I} P_i \delta_{x_i}$, where $\{P_i\}_{i \in I}$ are independent Bernoulli variables with $\mathrm{Prob}(P_i = 1) = p$ and $\mathrm{Prob}(P_i = 0) = 1 - p$. Both the point processes $\mathcal{P}_\nu$ and $\hat{\mathcal{P}}_p$ are examples of stationary cluster processes, also called homogeneous cluster fields (see [DV, Chap. 8] and [MKM, Chap. 10]). In particular, ergodicity is conserved by $\nu$-randomization and $p$-thinning ([DV, Prop. 10.3.IX] and [MKM, Prop. 11.1.4]). To conclude, let us give a few examples.

Example 1. A Poisson point process (PPP) appears, as already discussed, naturally as limit distribution of thinnings. Given a measure $\mu$ on $X$, with $X$ equal to $\mathbb{R}^d$ or $\mathbb{R}^d \times [-1, 1]$, the PPP on $X$ with intensity measure $\mu$ is defined by the two conditions (i) for any $B \in \mathcal{B}(X)$, $\xi(B)$ is a Poisson random variable with expectation $\mu(B)$; (ii) for any
A. Faggionato, H. Schulz-Baldes, D. Spehner
disjoint sets $B_1, \dots, B_n \in \mathcal{B}(X)$, $\xi(B_1), \dots, \xi(B_n)$ are independent. A PPP on $\mathbb{R}^d$ is stationary if and only if its intensity measure $\mu$ is proportional to the Lebesgue measure, $\mu = \rho\, dx$. In such a case it is an ergodic process satisfying the hypothesis (H2) of Theorem 1 and all moments $\rho_\kappa$, $\kappa > 0$, in (1) are finite. Its $p$-thinning is the PPP on $\mathbb{R}^d$ with intensity $p\rho$ while its $\nu$-randomization is the PPP on $\mathbb{R}^d \times [-1, 1]$ with intensity measure $\rho\, dx \otimes \nu$.

Example 2. Let us associate to the uniformly distributed random variable $y$ in the unit cube $C_1$ the point measure $\hat\xi = \sum_{z \in \mathbb{Z}^d} \delta_{z + y}$. The corresponding point process is an ergodic SSPP satisfying $\rho_\kappa = 1$ for any $\kappa > 0$. Although this process satisfies (H1), the SSMPP obtained from it via $p$-thinning and $\nu$-randomization does not, nor does it satisfy (H2). However, Theorem 1(ii) is still valid for this SSMPP, as can be checked by restricting the analysis of Sect. 6 to regions which are unions of boxes of the form $z + [0, 1)^d$, $z \in \mathbb{Z}^d$, and using the independence of $\hat\xi(A)$ and $\hat\xi(B)$ when $A$, $B$ are disjoint unions of such boxes.

Other examples of ergodic SSMPP can be obtained by means of SSPP with short-range correlations (see [DV, Exercise 10.3.4]). Of particular relevance for solid state physics are point processes associated to random or quasiperiodic tilings [BHZ], which satisfy the hypothesis (H1) of Theorem 1.

2.2. The Palm distribution. In what follows, it will always be assumed that $\hat{\mathcal{P}}$ and $\mathcal{P}$ are defined as in Theorem 1 and that (6) is satisfied if $\nu$ is a Dirac measure. In order to shorten notations, we will write $\mathcal{N}$ and $\hat{\mathcal{N}}$ for $\mathcal{N}(\mathbb{R}^d \times [-1, 1])$ and $\mathcal{N}(\mathbb{R}^d)$, respectively. We would now like to “pick up at random” a point among $\{x_j\}$ and take it as the origin. One thus looks at the following Borel subset of $\mathcal{N}$:
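$$\mathcal{N}_0 := \{ \xi \in \mathcal{N} : 0 \in \hat\xi \}.$$
Since $\mathcal{N}_0$ is closed, it defines a bounded complete separable metric space. Note that $x \in \hat\xi$ if and only if $S_x \xi \in \mathcal{N}_0$. The Palm distribution $\mathcal{P}_0$ on $\mathcal{N}_0$ associated to $\mathcal{P}$ is now defined as follows. Consider the measurable map $G$ from $\mathcal{N}$ into $\mathcal{N}(\mathbb{R}^d \times \mathcal{N}_0)$ given by $\xi \mapsto \xi^* = \sum_{x \in \hat\xi} \delta_{(x, S_x \xi)}$. Let $\mathcal{P}^* = G_* \mathcal{P}$ be the distribution of the marked point process on $\mathbb{R}^d \times \mathcal{N}_0$ with mark space $\mathcal{N}_0$, namely $\mathcal{P}^*$ is the image under $G$ of the probability measure $\mathcal{P}$ on $\mathcal{N}$. It is easy to show that $G \circ S_x = S^*_x \circ G$ for $x \in \mathbb{R}^d$, where $S^*_x$ is the action on $\mathbb{R}^d \times \mathcal{N}_0$ of the translations given by $(y, \xi) \mapsto (y + x, \xi)$. As a result, $\mathcal{P}^*$ is also stationary. Then, for any fixed $A \in \mathcal{B}(\mathcal{N}_0)$, the measure $\mu_A(B) = \int \mathcal{P}^*(d\xi^*)\, \xi^*(B \times A)$ on $\mathbb{R}^d$ is translation invariant and thus proportional to the Lebesgue measure. This implies that, for any $N > 0$ and any $A \in \mathcal{B}(\mathcal{N}_0)$,
$$C_{\mathcal{P}}(A) := \int_{\mathcal{N}(\mathbb{R}^d \times \mathcal{N}_0)} \mathcal{P}^*(d\xi^*)\, \xi^*(C_1 \times A) = \frac{1}{N^d} \int_{\mathcal{N}(\mathbb{R}^d \times \mathcal{N}_0)} \mathcal{P}^*(d\xi^*)\, \xi^*(C_N \times A).$$
The Palm distribution associated to $\mathcal{P}$ is the probability measure $\mathcal{P}_0$ on $\mathcal{N}_0$ obtained from $C_{\mathcal{P}}$ by normalization, namely, $\mathcal{P}_0 = \rho^{-1} C_{\mathcal{P}}$, where $\rho$ is defined in (1). Thus, for any $N > 0$,
$$\mathcal{P}_0(A) := \frac{1}{\rho} \frac{1}{N^d} \int_{\mathcal{N}} \mathcal{P}(d\xi) \int_{C_N} \hat\xi(dx)\, \chi_A(S_x \xi), \tag{11}$$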
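where $\chi_A$ is the characteristic function of the Borel set $A \subset \mathcal{N}_0$. One can show [FKAS, Theorem 1.2.8] that for any nonnegative measurable function $f$ on $\mathbb{R}^d \times \mathcal{N}_0$,
$$\int_{\mathbb{R}^d} dx \int_{\mathcal{N}_0} \mathcal{P}_0(d\xi)\, f(x, \xi) = \frac{1}{\rho} \int_{\mathcal{N}} \mathcal{P}(d\xi) \int_{\mathbb{R}^d} \hat\xi(dx)\, f(x, S_x \xi), \tag{12}$$
which is used in [DV] as the definition of $\mathcal{P}_0$. Similarly, there is a Palm distribution $\hat{\mathcal{P}}_0$ on $\hat{\mathcal{N}}_0 := \{\hat\xi \in \hat{\mathcal{N}} : 0 \in \hat\xi\}$ associated to the distribution $\hat{\mathcal{P}}$ of a SSPP on $\mathbb{R}^d$.

It is known that the Palm distribution of a stationary PPP on $\mathbb{R}^d$ with distribution $\hat{\mathcal{P}}$ (Example 1 above) is the convolution $\hat{\mathcal{P}}_0 = \hat{\mathcal{P}} * \delta_{\delta_0}$ of $\hat{\mathcal{P}}$ with the Dirac measure at $\hat\xi = \delta_0$ (i.e. $\hat{\mathcal{P}}_0$ is simply obtained by adding a point at the origin). The Palm distribution of a PPP on $\mathbb{R}^d \times [-1, 1]$ with intensity measure $\rho\, dx \otimes \nu$ is the convolution $\mathcal{P}_0 = \mathcal{P} * \zeta$, where $\zeta$ is the distribution of a marked point process obtained by $\nu$-randomization of $\delta_{\delta_0}$. The Palm distribution associated to the SSPP in Example 2 is $\hat{\mathcal{P}}_0 = \delta_{\sum_{x \in \mathbb{Z}^d} \delta_x}$. Its $\nu$-randomization is the Palm distribution of the $\nu$-randomization of Example 2.

We collect in the lemma below a number of results on the Palm distribution which will be needed in the sequel. Their proofs are given in Appendix B.

Lemma 1. (i) Let $k : \mathcal{N}_0 \times \mathcal{N}_0 \to \mathbb{R}$ be a measurable function such that $\int \hat\xi(dx)\, |k(\xi, S_x \xi)|$ and $\int \hat\xi(dx)\, |k(S_x \xi, \xi)|$ are in $L^1(\mathcal{N}_0, \mathcal{P}_0)$. Then
$$\int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, k(\xi, S_x \xi) = \int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, k(S_x \xi, \xi).$$
(ii) Let $\Gamma \in \mathcal{B}(\mathcal{N})$ be such that $S_x \Gamma = \Gamma$ for all $x \in \mathbb{R}^d$. Then $\mathcal{P}(\Gamma) = 1$ if and only if $\mathcal{P}_0(\Gamma_0) = 1$ with $\Gamma_0 = \Gamma \cap \mathcal{N}_0$.
(iii) Let $\mathcal{P}$ be ergodic and $A, B \in \mathcal{B}(\mathcal{N}_0)$ be such that $B \subset A$, $\mathcal{P}_0(A \setminus B) = 0$ and $S_x \xi \in A$ for any $\xi \in B$ and any $x \in \hat\xi$. Then $\mathcal{P}_0(A) \in \{0, 1\}$.
(iv) Let $A_j \in \mathcal{B}(\mathbb{R}^d)$ for $j = 1, \dots, n$. Then
$$E_{\mathcal{P}_0}\Big( \prod_{j=1}^{n} \hat\xi(A_j) \Big) \le \frac{c}{\rho}\, E_{\mathcal{P}}\big( \hat\xi(C_1)^{n+1} \big) + \frac{c}{\rho} \sum_{j=1}^{n} E_{\mathcal{P}}\big( \hat\xi(\tilde A_j)^{n+1} \big), \tag{13}$$
where $\tilde A_j := \cup_{x \in C_1} (A_j + x)$ and $c$ is a positive constant depending on $n$.

Remark 1. Here we point out a simple geometric property of point measures $\xi$ within the set
$$\mathcal{W} := \{ \xi \in \mathcal{N}_0 : S_x \xi \ne \xi \ \ \forall x \in \mathbb{R}^d \setminus \{0\} \}, \tag{14}$$
which will be fundamental in order to apply the methods developed in [KV] and [DFGW]. Let us consider a sequence $\{x_n\}_{n \ge 0}$ of elements in $\mathrm{supp}(\hat\xi)$ with $x_0 = 0$ and set $\xi_n := S_{x_n} \xi$. The $\xi_n$ can be thought of as the environment viewed from the point $x_n$. Due to the definition of $\mathcal{W}$, $\{x_n\}_{n \ge 0}$ can be recovered from $\{\xi_n\}_{n \in \mathbb{N}}$ by means of the identities $x_{n+1} - x_n = \Phi(\xi_n, \xi_{n+1})$, $n \in \mathbb{N}$, where the function $\Phi : \mathcal{W} \times \mathcal{N}_0 \to \mathbb{R}^d$ is defined as
$$\Phi(\xi, \xi') := \begin{cases} x & \text{if } \xi' = S_x \xi, \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$
Note that, by Lemma 1(ii), condition (6) is equivalent to $\mathcal{P}_0(\mathcal{W}) = 1$.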
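The recovery of positions from environments described in Remark 1 can be illustrated on a toy configuration. The sketch below is only an illustration: it uses a finite marked configuration with integer coordinates (so that equality tests are exact, and no nontrivial translation symmetry can occur), and the names `shift` and `phi` are hypothetical helpers standing in for $S_x$ and $\Phi$.

```python
def shift(config, x):
    """S_x xi for a simple marked configuration: translate so that x becomes the origin."""
    return frozenset(
        (tuple(yi - xi for yi, xi in zip(y, x)), mark) for y, mark in config
    )

def phi(xi, xi_prime):
    """Return x if xi_prime == S_x xi for some point x of xi, else the zero vector."""
    for y, _ in xi:
        if shift(xi, y) == xi_prime:
            return y
    dim = len(next(iter(xi))[0])
    return (0,) * dim

# Toy environment containing the origin; marks play the role of the energies E_x.
xi = frozenset({((0, 0), 0.3), ((2, 1), -0.5), ((-1, 4), 0.9), ((5, 5), 0.1)})
path = [(0, 0), (2, 1), (5, 5), (-1, 4)]        # positions x_0, x_1, x_2, x_3
envs = [shift(xi, x) for x in path]             # environments xi_n = S_{x_n} xi

# Rebuild the positions from the environments alone via x_{n+1} - x_n = Phi(xi_n, xi_{n+1}).
recovered = [(0, 0)]
for a, b in zip(envs, envs[1:]):
    step = phi(a, b)
    recovered.append(tuple(r + s for r, s in zip(recovered[-1], step)))
```

Here `recovered` coincides with `path`, which is the content of the identities $x_{n+1} - x_n = \Phi(\xi_n, \xi_{n+1})$ for this toy example.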
3. Variational Formula

The main object of this section is to show the following result, implying Theorem 1(i).

Theorem 2. Let $\mathcal{P}$ satisfy the assumptions of Theorem 1(i). Then the limit (3) exists and $D$ is given by the variational formula
$$(a \cdot D a) = \inf_{f \in L^\infty(\mathcal{N}_0, \mathcal{P}_0)} \int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, c_{0,x}(\xi)\, \big( a \cdot x + \nabla_x f(\xi) \big)^2, \qquad a \in \mathbb{R}^d, \tag{16}$$
with
$$\nabla_x f(\xi) := f(S_x \xi) - f(\xi). \tag{17}$$
Moreover, the rescaled process $Y^{\xi,\varepsilon} := (\varepsilon X^\xi_{t \varepsilon^{-2}})_{t \ge 0}$ defined on $(\Omega_\xi, \mathbf{P}^\xi_0)$ converges weakly in $\mathcal{P}_0$-probability as $\varepsilon \to 0$ to a Brownian motion $W_D$ with covariance matrix $D$.

The proof is based on the theory of Refs. [KV] and [DFGW] and, in particular, Theorem 2.2 of [DFGW]. Because of the geometric disorder and the possibility of jumps between any of the random points, the application of this general theorem to our model is technically considerably more involved than in the case of the lattice model with jumps to nearest neighbors studied in [DFGW, Sect. 4]. As a preamble, let us state a result on the process $X^\xi_t$ proven in Appendix C which will be used several times below.

Proposition 1. Let $\mathcal{P}$ satisfy $\rho_\kappa < \infty$ for some integer $\kappa > 3$. Then, given $t > 0$ and $0 < \gamma < \kappa - 3$,
$$E_{\mathcal{P}_0}\Big( E_{\mathbf{P}^\xi_0}\big( |X^\xi_t|^\gamma \big) \Big) < \infty.$$
Remark 2. From the variational formula of the diffusion matrix $D$ given in Theorem 2 one can easily prove (see e.g. [DFGW]) that $D$ is a multiple of the identity whenever $\mathcal{P}$ is isotropic (i.e., it is invariant under all rotations by $\pi/2$ in a coordinate plane). In this case, the arguments leading to a lower bound on $D$ are slightly simpler (and can be easily adapted to the general case). Therefore, in order to simplify the discussion and without loss of generality, in the last Sects. 5 and 6 we will assume $\mathcal{P}$ to be isotropic.

3.1. The result of De Masi, Ferrari, Goldstein and Wick. A main idea in [DFGW] is to study the process $(S_{X^\xi_t} \xi)_{t \ge 0}$ with values in the space $\mathcal{N}_0$ of the environment configurations, instead of the random walk $(X^\xi_t)_{t \ge 0}$. This process is called the process environment viewed from the particle. It is defined on the probability space $(\Omega_\xi, \mathbf{P}^\xi_0)$, with $\Omega_\xi = D([0, \infty), \mathrm{supp}(\hat\xi))$. Let $\mathbf{P}_\xi$ be its distribution on the path space $\Omega := D([0, \infty), \mathcal{N}_0)$ (endowed as usual with the Skorohod topology). A generic element of $\Omega$ will be denoted by $\underline{\xi} = (\xi_t)_{t \ge 0}$. Let us set $\mathbf{P} := \int \mathcal{P}_0(d\xi)\, \mathbf{P}_\xi$. The environment process is the process $(\xi_t)_{t \ge 0}$ defined on the probability space $(\Omega, \mathbf{P})$ with distribution $\mathbf{P}$. This is a continuous-time jump Markov process with initial measure $\mathcal{P}_0$ and transition probabilities
$$\mathbf{P}(\xi_{s+t} = \xi' \,|\, \xi_s = \xi) = \mathbf{P}_\xi(\xi_t = \xi') =: p_t(\xi'|\xi) \qquad \forall\, s, t \ge 0$$
with, for any $\xi \in \mathcal{W}$,
$$p_t(\xi'|\xi) = \begin{cases} p^\xi_t(x|0) & \text{if } \xi' = S_x \xi \text{ for some } x \in \hat\xi, \\ 0 & \text{otherwise.} \end{cases} \tag{18}$$
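For any time $t \ge 0$, let us introduce the random variable $X_t : \Omega \to \mathbb{R}^d$ defined by
$$X_t(\underline{\xi}) := \sum_{s \in [0, t]} \Delta_s(\underline{\xi}), \tag{19}$$
where
$$\Delta_s(\underline{\xi}) := \begin{cases} x & \text{if } \xi_s = S_x \xi_{s^-}, \\ 0 & \text{otherwise,} \end{cases}$$
and the sum runs over all jump times $s$ for which $\Delta_s(\underline{\xi}) \ne 0$. Note that $\{X_{[s,t]} := X_t - X_s : t > s \ge 0\}$ defines an antisymmetric additive covariant family of random variables as defined in [DFGW], and $X_t$ has paths in $D([0, \infty), \mathbb{R}^d)$. The crucial link to the dynamics of a particle in a fixed environment is now the following: due to Remark 1, for any $\xi \in \mathcal{W}$, the distribution of the process $(X_t)_{t \ge 0}$ defined on $(\Omega, \mathbf{P}_\xi)$ is equal to the distribution $\mathbf{P}^\xi_0$ of the random walk on $\mathrm{supp}(\hat\xi)$ (naturally embedded in $\mathbb{R}^d$) starting at the origin. Recalling that $\mathbf{P} = \int \mathcal{P}_0(d\xi)\, \mathbf{P}_\xi$, this implies
$$E_{\mathcal{P}_0}\Big( E_{\mathbf{P}^\xi_0}\big( (X^\xi_t \cdot a)^2 \big) \Big) = E_{\mathbf{P}}\big( (X_t \cdot a)^2 \big), \tag{20}$$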
which gives a way to calculate the diffusion matrix $D$ from the distribution $\mathbf{P}$ on $\Omega$. In order to apply Theorem 2.2 of [DFGW], it is enough to verify the following hypotheses:
(a) the environment process is reversible and ergodic;
(b) the random variables $X_{[s,t]}$, $0 \le s < t$, are in $L^1(\Omega, \mathbf{P})$;
(c) the mean forward velocity exists:
$$\varphi(\xi) := L^2\text{-}\lim_{t \downarrow 0} \frac{1}{t}\, E_{\mathbf{P}_\xi}(X_t); \tag{21}$$
(d) the martingale $X_t - \int_0^t ds\, \varphi(\xi_s)$ is in $L^2(\Omega, \mathbf{P})$.

Let us assume $\rho_{12} < \infty$. Then, statement (a) will be proved in Proposition 2, Subsect. 3.3. The statement (b) follows from Proposition 1. The $L^2$-convergence in (c) will be proved in Subsect. 3.4 (Proposition 4), where we also show the $L^2$-convergence in the following formula defining the mean square displacement matrix $\psi(\xi)$:
$$(a \cdot \psi(\xi) a) := L^2\text{-}\lim_{t \downarrow 0} \frac{1}{t}\, E_{\mathbf{P}_\xi}\big( (a \cdot X_t)^2 \big). \tag{22}$$
The last point (d) is a consequence of Proposition 1, assuring that $X_t \in L^2(\Omega, \mathbf{P})$, and of the fact that $\int_0^t ds\, \varphi(\xi_s) \in L^2(\Omega, \mathbf{P})$, which can be proved by means of the Cauchy-Schwarz inequality, the stationarity of $\mathbf{P}$ following from (a), and the property $\varphi \in L^2(\mathcal{N}_0, \mathcal{P}_0)$.
Once hypotheses (a)-(d) have been verified, one can invoke [DFGW, Theorem 2.2 and Remark 4, p. 802] to conclude that limit (3) exists and that the rescaled random walk $Y^{\xi,\varepsilon}$ converges weakly in $\mathcal{P}_0$-probability to the Brownian motion $W_D$ with covariance matrix $D$ given by (3), and that $D$ is moreover given by
$$(a \cdot D a) = E_{\mathcal{P}_0}(a \cdot \psi a) - 2 \int_0^\infty dt\, \big\langle \varphi \cdot a,\, e^{tL} \varphi \cdot a \big\rangle_{\mathcal{P}_0}, \tag{23}$$
where $L$ is the generator of the environment process and the integral on the r.h.s. is finite. Formula (16) can be deduced from the expressions of $L$, $\varphi$ and $\psi$ established in the following subsections (Propositions 3 and 4) by using a known general result on self-adjoint operators stated in (47) below.

3.2. Preliminaries. Before starting to prove the above-mentioned statements (a)-(d), let us fix some notations and recall some general facts about jump Markov processes. In what follows, given a complete separable metric space $Z$ we denote by $\mathcal{F}(Z)$ the family of bounded Borel functions on $Z$ and, given a (not necessarily finite) interval $I \subset \mathbb{R}$, we denote by $D(I, Z)$ the space of right continuous paths $z = (z_t)_{t \in I}$, $z_t \in Z$, having left limits. The path space $D(I, Z)$ is endowed with the Skorohod topology [Bil], which is the natural choice for the study of jump Markov processes. For a time $s \ge 0$, the time translation $\tau_s$ is defined as
$$\tau_s : D([0, \infty), Z) \to D([0, \infty), Z), \qquad (\tau_s z)_t := z_{t+s}.$$
Moreover, given $0 \le a < b$, we denote by $R_{[a,b]}$ the function
$$R_{[a,b]} : D([0, \infty), Z) \to D([a, b], Z), \qquad (R_{[a,b]} z)_t := \lim_{\delta \downarrow 0} z_{a+b-t-\delta}.$$
$R_{[a,b]} z$ is the time-reflection of $(z_t)_{t \in [a,b]}$ w.r.t. the middle point of $[a, b]$, and it can naturally be extended to paths on $[0, a + b]$.

A continuous-time Markov process with paths in $D([0, \infty), Z)$ and distribution $p$ is called stationary if $E_p(F) = E_p(F \circ \tau_s)$ for all $s \ge 0$ and for any bounded Borel function $F$ on $D([0, \infty), Z)$. It is called reversible if $E_p(F) = E_p(F \circ R_{[a,b]})$ for all $b > a \ge 0$ and any bounded Borel function $F$ on $D([0, \infty), Z)$ such that $F(z)$ depends only on $(z_t)_{t \in [a,b]}$. Thanks to the Markov property, one can show that stationarity is equivalent to
$$E_p\big( f(z_0) \big) = E_p\big( f(z_s) \big), \qquad \forall\, s \ge 0,\ \forall\, f \in \mathcal{F}(Z), \tag{24}$$
while reversibility is equivalent to
$$E_p\big( f(z_0)\, g(z_s) \big) = E_p\big( g(z_0)\, f(z_s) \big), \qquad \forall\, s \ge 0,\ \forall\, f, g \in \mathcal{F}(Z). \tag{25}$$
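On a finite state space, conditions (24) and (25) can be checked directly. The following sketch is an illustration under assumptions not made in the paper: a six-state chain with randomly chosen symmetric rates, for which the uniform measure is reversible, and a semigroup computed by spectral decomposition with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Symmetric jump rates c[x, y] = c[y, x] on a finite state space (illustrative choice).
c = rng.random((n, n))
c = (c + c.T) / 2.0
np.fill_diagonal(c, 0.0)

# Generator of the continuous-time jump process: off-diagonal rates, rows summing to zero.
gen = c - np.diag(c.sum(axis=1))
eigvals, eigvecs = np.linalg.eigh(gen)

def semigroup(s):
    """Transition matrix exp(s * gen), computed from the spectral decomposition."""
    return (eigvecs * np.exp(s * eigvals)) @ eigvecs.T

mu = np.full(n, 1.0 / n)          # uniform measure, reversible for symmetric rates
f, g = rng.random(n), rng.random(n)

def corr(u, v, s):
    """E[u(z_0) v(z_s)] when z_0 has law mu."""
    return float((mu * u) @ semigroup(s) @ v)

# (25): symmetry of time correlations; (24): invariance of one-time expectations.
reversibility_gap = max(abs(corr(f, g, s) - corr(g, f, s)) for s in (0.1, 1.0, 5.0))
stationarity_gap = max(abs(mu @ semigroup(s) @ f - mu @ f) for s in (0.1, 1.0, 5.0))
```

Taking $g \equiv 1$ in (25) gives (24), which is why the second gap vanishes whenever the first does.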
In particular, stationarity follows from reversibility. Finally, the Markov process is called (time) ergodic if $p(A) \in \{0, 1\}$ whenever $A \in \mathcal{B}\big(D([0, \infty), Z)\big)$ is time-shift invariant, i.e. $A = \tau_s A$ for all $s \ge 0$. Recall that if the Markov process is stationary then it can be extended to a Markov process with path space $D(\mathbb{R}, Z)$ and the resulting distribution is uniquely determined (this follows from Kolmogorov's extension theorem and the regularity of paths). Now stationarity, reversibility and ergodicity of the extended process are defined as above by means of $\tau_s$, $s \in \mathbb{R}$, and $R_{[a,b]}$, $-\infty < a < b < \infty$. Then one can check that these properties are preserved by extension (for what concerns ergodicity,
see in particular [Ros, Chapter 15, p. 96–97]). Therefore our definitions coincide with those in [DFGW]. All the above definitions and remarks can be extended in a natural way to discrete-time Markov processes (with path space $Z^{\mathbb{N}}$). Moreover, in the discrete case, stationarity and reversibility are equivalent respectively to (24) and (25) with $s = 1$.

We conclude this section by recalling the standard construction of the continuous-time random walk satisfying conditions (C1) and (C2) in the Introduction. We first note that these conditions are meaningful for $\mathcal{P}_0$-almost all $\xi$ if $E_{\mathcal{P}_0}(\lambda_0) < \infty$. In fact, due to the bound $\lambda_x(\xi) \le e^{4\beta} e^{|x|} \lambda_0(\xi)$, one can infer from $E_{\mathcal{P}_0}(\lambda_0) < \infty$ that $\lambda_x(\xi) < \infty$ for any $x \in \hat\xi$, $\mathcal{P}_0$-a.s. We note that the condition $E_{\mathcal{P}_0}(\lambda_0) < \infty$ is equivalent to $\rho_2 < \infty$ due to the following lemma:

Lemma 2. For any positive integer $k$, $E_{\mathcal{P}_0}(\lambda_0^k) < \infty$ if and only if $\rho_{k+1} < \infty$.

Proof. Note that for suitable positive constants $c_1$, $c_2$ one has
$$c_1 \sum_{z \in \mathbb{Z}^d} \hat\xi(C_1 + z)\, e^{-|z|} \le \lambda_0(\xi) \le c_2 \sum_{z \in \mathbb{Z}^d} \hat\xi(C_1 + z)\, e^{-|z|}, \qquad \mathcal{P}_0\text{-a.s.}$$
Next let us expand the $k$-th power of these inequalities. By applying Lemma 1(iv) and using the stationarity of $\mathcal{P}$, one gets that $E_{\mathcal{P}_0}(\lambda_0^k) < \infty$ if $\rho_{k+1} < \infty$. Suppose now that $E_{\mathcal{P}_0}(\lambda_0^k) < \infty$. Then the above expansion of the $k$-th power implies that $E_{\mathcal{P}_0}(\hat\xi(C_1)^k) < \infty$. Since due to (11),
$$E_{\mathcal{P}_0}\big( \hat\xi(C_1)^k \big) = \frac{2^d}{\rho} \int_{\mathcal{N}} \mathcal{P}(d\xi) \int_{C_{1/2}} \hat\xi(dx)\, \hat\xi(C_1 + x)^k \ge \frac{2^d}{\rho}\, E_{\mathcal{P}}\big( \hat\xi(C_{1/2})^{k+1} \big),$$
one concludes that $E_{\mathcal{P}}(\hat\xi(C_{1/2})^{k+1}) < \infty$, which is equivalent to $\rho_{k+1} < \infty$.
The construction of the continuous-time random walk follows standard references (e.g. [Bre, Chap. 15] and [Kal, Chap. 12]) and can be described roughly as follows: After arriving at site $y \in \hat\xi$, the particle waits an exponential time with parameter $\lambda_y(\xi)$ and then jumps to another site $z \in \hat\xi$ with probability
$$p^\xi(z|y) := \frac{c_{y,z}(\xi)}{\lambda_y(\xi)}. \tag{26}$$
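The rough description above translates into a short simulation. The sketch below makes several assumptions that are not part of the construction itself: it works on a finite sample of points in a box (so explosion cannot occur), it takes uniformly distributed energies, and it assumes jump rates of the Mott form $c_{x,y} = \exp(-|x-y| - \beta(|E_x| + |E_y| + |E_x - E_y|))$; the function names are illustrative.

```python
import numpy as np

def mott_rates(points, energies, beta):
    """Symmetric jump rates, assumed here to be of the Mott form (see lead-in)."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    abs_e = np.abs(energies)
    energy_cost = abs_e[:, None] + abs_e[None, :] + np.abs(energies[:, None] - energies[None, :])
    rates = np.exp(-dist - beta * energy_cost)
    np.fill_diagonal(rates, 0.0)          # no jump from a site to itself
    return rates

def simulate(points, rates, start, t_max, rng):
    """Wait Exp(lambda_y) at the current site y, then jump to z with probability c_{y,z}/lambda_y."""
    lam = rates.sum(axis=1)
    t, site, n_jumps = 0.0, start, 0
    while True:
        t += rng.exponential(1.0 / lam[site])
        if t > t_max:
            return points[site], n_jumps
        site = rng.choice(len(points), p=rates[site] / lam[site])
        n_jumps += 1

rng = np.random.default_rng(1)
points = rng.uniform(-5.0, 5.0, size=(200, 2))
points[0] = 0.0                            # place one point at the origin, as under the Palm measure
energies = rng.uniform(-1.0, 1.0, size=200)
rates = mott_rates(points, energies, beta=1.0)
final_position, n_jumps = simulate(points, rates, start=0, t_max=50.0, rng=rng)
```

Averaging `|final_position|**2 / t_max` over many environments and runs would give a crude finite-volume estimate of the diffusion coefficient, subject to boundary effects of the box.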
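More precisely, consider $\xi \in \mathcal{N}_0$ such that $0 < \lambda_z(\xi) < \infty$ for any $z \in \hat\xi$ and set $\tilde\Omega_\xi := \big(\mathrm{supp}(\hat\xi)\big)^{\mathbb{N}}$. A generic path in $\tilde\Omega_\xi$ is denoted by $(\tilde X^\xi_n)_{n \ge 0}$. Given $x \in \hat\xi$, let $\tilde{\mathbf{P}}^\xi_x$ be the distribution on $\tilde\Omega_\xi$ of a discrete-time random walk on $\mathrm{supp}(\hat\xi)$ starting in $x$ and having transition probabilities $p^\xi(z|y)$. Let $(\Theta, Q)$ be another probability space where the random variables $T^\xi_{n,z}$, $z \in \hat\xi$, $n \in \mathbb{N}$, are independent and exponentially distributed with parameter $\lambda_z(\xi)$, namely $Q(T^\xi_{n,z} > t) = \exp(-\lambda_z(\xi) t)$. On the probability space $(\tilde\Omega_\xi \times \Theta, \tilde{\mathbf{P}}^\xi_x \otimes Q)$ define the following functions:
$$R^\xi_0 := 0; \qquad R^\xi_n := T^\xi_{0, \tilde X^\xi_0} + T^\xi_{1, \tilde X^\xi_1} + \cdots + T^\xi_{n-1, \tilde X^\xi_{n-1}} \quad \text{if } n \ge 1,$$
$$n^\xi_*(t) := n \quad \text{if } R^\xi_n \le t < R^\xi_{n+1}.$$
Note that $n^\xi_*(t)$ is well defined for any $t \ge 0$ only if $\lim_{n \uparrow \infty} R^\xi_n = \infty$. If
$$\tilde{\mathbf{P}}^\xi_x \otimes Q\Big( \lim_{n \uparrow \infty} R^\xi_n = \infty \Big) = 1, \tag{27}$$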
then, due to [Bre, Theorem 15.37], the random walk $(\tilde X^\xi_{n^\xi_*(t)})_{t \ge 0}$, defined $\tilde{\mathbf{P}}^\xi_x \otimes Q$-almost everywhere, is a jump Markov process whose distribution satisfies the infinitesimal conditions (C1) and (C2). The condition $\lim_{n \uparrow \infty} R^\xi_n = \infty$ assures that no explosion phenomenon takes place, notably only finitely many jumps can occur in finite time intervals. In Appendix A we prove that (27) is verified if $\rho_2 < \infty$.

3.3. The environment viewed from the particle. The process environment viewed from the particle and the environment process have been introduced in Sect. 3.1. Given $t > 0$, we write $n_*(t)$ for the function on the path space $\Omega = D([0, \infty), \mathcal{N}_0)$ associating to each $\underline{\xi} \in \Omega$ the corresponding number of jumps in the time interval $[0, t]$. Motivated by further applications, it is convenient to consider also the discrete-time versions of the above processes. Consider the discrete-time Markov process $(S_{\tilde X^\xi_n} \xi)_{n \ge 0}$ defined on $(\tilde\Omega_\xi, \tilde{\mathbf{P}}^\xi_0)$, call $\tilde{\mathbf{P}}_\xi$ its distribution on the path space $\tilde\Omega := \mathcal{N}_0^{\mathbb{N}}$ and denote a generic path in $\tilde\Omega$ by $(\xi_n)_{n \ge 0}$. Such a Markov process can be thought of as the environment viewed from the particle performing the discrete-time random walk with distribution $\tilde{\mathbf{P}}^\xi_0$.

Let us point out a few properties of the distribution $\tilde{\mathbf{P}}_\xi$. First, we remark that due to the covariant relations
$$c_{z,y}(S_x \xi) = c_{z+x,\, y+x}(\xi), \qquad \lambda_y(S_x \xi) = \lambda_{y+x}(\xi), \tag{28}$$
the process $(\lambda_{\tilde X^\xi_n}(\xi))_{n \in \mathbb{N}}$ defined on $(\tilde\Omega_\xi, \tilde{\mathbf{P}}^\xi_0)$ and the process $(\lambda_0(\xi_n))_{n \in \mathbb{N}}$ defined on $(\tilde\Omega, \tilde{\mathbf{P}}_\xi)$ have the same distribution. Moreover, due to Remark 1, if $\xi \in \mathcal{W}$, then the process $(\zeta_n)_{n \in \mathbb{N}}$ defined on $(\tilde\Omega, \tilde{\mathbf{P}}_\xi)$ by $\zeta_0 = 0$ and
$$\zeta_n = \sum_{k=0}^{n-1} \Phi(\xi_k, \xi_{k+1}), \qquad \forall\, n \ge 1,$$
where $\Phi(\xi, \xi')$ is given by (15), has paths in $\tilde\Omega_\xi$ with distribution $\tilde{\mathbf{P}}^\xi_0$. Finally, it is convenient to consider a suitable average of the distributions $\tilde{\mathbf{P}}_\xi$. To this aim, let $Q_0$ be the probability measure on $\mathcal{N}_0$ defined as
$$Q_0(d\xi) := \frac{\lambda_0(\xi)}{E_{\mathcal{P}_0}(\lambda_0)}\, \mathcal{P}_0(d\xi),$$
and set $\tilde{\mathbf{P}} := \int Q_0(d\xi)\, \tilde{\mathbf{P}}_\xi$. If $\xi \in \mathcal{W}$, the transition probabilities are
$$p(\xi'|\xi) := \tilde{\mathbf{P}}\big( \xi_{n+1} = \xi' \,|\, \xi_n = \xi \big) = \begin{cases} \lambda_0^{-1}(\xi)\, c_{0,x}(\xi) & \text{if } \xi' = S_x \xi, \\ 0 & \text{otherwise.} \end{cases}$$
Note that, due to (28) and the symmetry of the jump rates (2), $\lambda_0(\xi)\, p(\xi'|\xi) = \lambda_0(\xi')\, p(\xi|\xi')$.
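The detailed-balance relation just stated explains why the discrete-time chain is weighted by $\lambda_0$. A finite analogue can be checked as follows; this is a sketch on an eight-point configuration with the purely illustrative rates $c_{x,y} = e^{-|x-y|}$, where sites of a fixed configuration stand in for the environments $S_x \xi$ and the vector `q0` plays the role of $Q_0$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
pts = rng.uniform(0.0, 3.0, size=(n, 2))

# Symmetric rates between the sites of one fixed finite configuration (illustrative choice).
c = np.exp(-np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1))
np.fill_diagonal(c, 0.0)

lam = c.sum(axis=1)                 # lambda_x = sum_y c_{x,y}
p = c / lam[:, None]                # p(y|x) = c_{x,y} / lambda_x
flow = lam[:, None] * p             # lambda_x p(y|x), which should be symmetric in (x, y)
q0 = lam / lam.sum()                # analogue of Q_0: weights proportional to lambda

detailed_balance_gap = np.max(np.abs(flow - flow.T))
stationarity_gap = np.max(np.abs(q0 @ p - q0))
```

The uniform measure is in general not invariant for `p`; only the $\lambda$-weighted one is, which mirrors the passage from $\mathcal{P}_0$ to $Q_0$.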
Proposition 2. Let $\rho_2 < \infty$. Then the process $(\xi_t)_{t \ge 0}$ defined on $(\Omega, \mathbf{P})$ is reversible, i.e.
$$E_{\mathbf{P}}\big( f(\xi_0)\, g(\xi_t) \big) = E_{\mathbf{P}}\big( g(\xi_0)\, f(\xi_t) \big) \qquad \forall\, f, g \in \mathcal{F}(\mathcal{N}_0),\ \forall\, t > 0, \tag{29}$$
and is (time) ergodic if $\mathcal{P}$ is ergodic. Similarly, the discrete-time Markov process $(\xi_n)_{n \ge 0}$ defined on $(\tilde\Omega, \tilde{\mathbf{P}})$ is reversible and is (time) ergodic.

Having at our disposal Lemma 1, the proof follows by modifying arguments of e.g. [DFGW].

Proof. We give the proof for the continuous-time process, the discrete-time case being similar. We first verify the symmetry property $p_t(\xi'|\xi) = p_t(\xi|\xi')$. Actually, thanks to the construction of the dynamics given in Sect. 3.2, one can show that for any positive integer $n$ and any $\xi = \xi^{(0)}, \xi^{(1)}, \dots, \xi^{(n-1)}, \xi^{(n)} = \xi' \in \mathcal{N}_0$,
$$\mathbf{P}_\xi\big( n_*(t) = n,\ \xi_{R_1} = \xi^{(1)}, \dots, \xi_{R_n} = \xi^{(n)} \big) = \mathbf{P}_{\xi'}\big( n_*(t) = n,\ \xi_{R_1} = \xi^{(n-1)}, \dots, \xi_{R_n} = \xi^{(0)} \big),$$
where, given $\underline{\xi} \in \Omega$, $R_1(\underline{\xi}) < R_2(\underline{\xi}) < \dots$ denote the jump times of the path $\underline{\xi}$. Next, given $f, g \in \mathcal{F}(\mathcal{N}_0)$ one gets by applying Lemma 1(i) and using $p_t(\xi'|\xi) = p_t(\xi|\xi')$ that
$$\int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, p_t(S_x \xi|\xi)\, f(\xi)\, g(S_x \xi) = \int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, p_t(S_x \xi|\xi)\, f(S_x \xi)\, g(\xi), \tag{30}$$
which is equivalent to (29). Hence $\mathbf{P}$ is reversible. Due to Corollary 5 in [Ros, Chap. IV], in order to prove ergodicity it is enough to show that $\mathcal{P}_0(A) \in \{0, 1\}$ if $A \in \mathcal{B}(\mathcal{N}_0)$ has the following property: $\mathbf{P}_\xi(\xi_t \in A) = \chi_A(\xi)$ for $\mathcal{P}_0$-almost all $\xi$. Given such a set $A$, there exists a Borel subset $B \subset A$ such that $\mathcal{P}_0(A \setminus B) = 0$ and $\mathbf{P}_\xi(\xi_t \in A) = 1$ for any $\xi \in B$. Fix $\xi \in B$ and $x \in \hat\xi$; then $\mathbf{P}_\xi(\xi_t = S_x \xi,\ \xi_t \in A) = \mathbf{P}_\xi(\xi_t = S_x \xi) > 0$ (the last bound follows from the positivity of the jump rates). Hence $S_x \xi \in A$. Lemma 1(iii) implies that $\mathcal{P}_0(A) \in \{0, 1\}$, which concludes the proof.

Let $\mathcal{P}$ fulfill the assumption of Proposition 2. Then,
$$(T_t f)(\xi) := E_{\mathbf{P}_\xi}\big( f(\xi_t) \big) = \int \hat\xi(dx)\, p_t(S_x \xi|\xi)\, f(S_x \xi), \qquad \mathcal{P}_0\text{-a.s.} \tag{31}$$
defines a strongly continuous contraction semigroup on $L^2(\mathcal{N}_0, \mathcal{P}_0)$ (Markov semigroup). Actually, (i) $T_t : L^2(\mathcal{N}_0, \mathcal{P}_0) \to L^2(\mathcal{N}_0, \mathcal{P}_0)$ is self-adjoint by (29) and is a contraction by the Cauchy-Schwarz inequality and the stationarity of $\mathbf{P}$; (ii) $T_{t+s} = T_t T_s$ follows from the Markov nature of the process; (iii) the continuity follows from the following argument: first observe that it is enough to prove the continuity of $T_t f$ at $t = 0$ for $f \in L^\infty(\mathcal{N}_0, \mathcal{P}_0)$, which is obtained from the dominated convergence theorem and the estimate $|(T_t f - f)(\xi)| \le 2 \|f\|_\infty (1 - p^\xi_t(0|0))$. Let us denote by $L$ the generator of the Markov semigroup $(T_t)_{t \ge 0}$ and by $\mathcal{D}(L) \subset L^2(\mathcal{N}_0, \mathcal{P}_0)$ its domain.
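Proposition 3. Let $\mathcal{P}$ satisfy $\rho_4 < \infty$. Then $L$ is nonpositive and self-adjoint with core $L^\infty(\mathcal{N}_0, \mathcal{P}_0)$. For any $f \in L^\infty(\mathcal{N}_0, \mathcal{P}_0)$, one has for $\mathcal{P}_0$-a.e. $\xi$,
$$(L f)(\xi) = \int \hat\xi(dx)\, c_{0,x}(\xi)\, \nabla_x f(\xi), \tag{32}$$
where $\nabla_x f$ is defined in (17), and, moreover,
$$\langle f, (-L) f \rangle_{\mathcal{P}_0} = \frac{1}{2} \int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, c_{0,x}(\xi)\, \big( \nabla_x f(\xi) \big)^2. \tag{33}$$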
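Proof. The self-adjointness of $L$ follows from [RS, Vol.2, Theorem X.1]. Actually, (i) $L$ is closed as a generator of a strongly continuous semigroup [RS, Vol.2, Chap. X.8]; (ii) $L$ is symmetric because $T_t$ is self-adjoint; (iii) the spectrum of $L$ is included in $(-\infty, 0]$ by contractivity of the semigroup. Note that (iii) also implies that $L$ is non-positive.

We use the abbreviation $L^p$ for $L^p = L^p(\mathcal{N}_0, \mathcal{P}_0)$, $p = 2$ or $\infty$. For any $f \in L^\infty$, denote by $\mathbb{L} f$ the function defined by the r.h.s. of (32). Due to Lemma 2, $E_{\mathcal{P}_0}(\lambda_0^2) < \infty$ and in particular
$$\int \mathcal{P}_0(d\xi)\, \big( (\mathbb{L} f)(\xi) \big)^2 \le 4 \|f\|_\infty^2\, E_{\mathcal{P}_0}\big( \lambda_0^2 \big) < \infty,$$
thus implying that $\mathbb{L} : L^\infty \to L^2$ is a well-defined operator. We claim that
$$L^2\text{-}\lim_{t \downarrow 0} \frac{T_t f - f}{t} = \mathbb{L} f, \qquad \forall\, f \in L^\infty. \tag{34}$$
Note that (34) implies that $L^\infty \subset \mathcal{D}(L)$ and $L f = \mathbb{L} f$ for all $f \in L^\infty$. Since moreover $T_t$ is a contraction and $T_t L^\infty \subset L^\infty$, it then follows from [RS, Vol.2, Theorem X.49] that $L^\infty$ is a core for $L$ and $L$ is the closure of $\mathbb{L}$. Finally, using (30) in the limit $t \to 0$, by straightforward computations (33) can be derived from (32).

Let us now prove (34). We assume $\xi \in \mathcal{W}$ and we set, for $\xi' \ne \xi$,
$$p_{t,1}(\xi'|\xi) := \mathbf{P}_\xi\big( \xi_t = \xi',\ n_*(t) = 1 \big) = p_t(\xi'|\xi) - \mathbf{P}_\xi\big( \xi_t = \xi',\ n_*(t) \ge 2 \big).$$
Thanks to the construction of the dynamics described in Sect. 3.2 and due to the estimate $1 - e^{-u} \le u$, $u \ge 0$, one has for any $x \in \hat\xi$ with $x \ne 0$,
$$p_{t,1}(S_x \xi|\xi) \le \tilde{\mathbf{P}}^\xi_0 \otimes Q\big( \tilde X^\xi_1 = x,\ T^\xi_{0,0} \le t \big) = p^\xi(x|0)\, \big( 1 - e^{-\lambda_0(\xi) t} \big) \le c_{0,x}(\xi)\, t. \tag{35}$$
Let $f \in L^\infty$. In view also of (31) and $\int \hat\xi(dx)\, p_t(S_x \xi|\xi) = 1$,
$$\big| \big( T_t f - f - t\, \mathbb{L} f \big)(\xi) \big| = \Big| \int \hat\xi(dx)\, \big( f(S_x \xi) - f(\xi) \big) \big( p_t(S_x \xi|\xi) - c_{0,x}(\xi)\, t \big) \Big|$$
$$\le 2 \|f\|_\infty \Big[ \int_{\{x \ne 0\}} \hat\xi(dx)\, \big( p_t(S_x \xi|\xi) - p_{t,1}(S_x \xi|\xi) \big) + \int_{\{x \ne 0\}} \hat\xi(dx)\, \big( -p_{t,1}(S_x \xi|\xi) + c_{0,x}(\xi)\, t \big) \Big].$$
The first integral in the second line can be bounded by $\mathbf{P}_\xi(n_*(t) \ge 2)$. The second integral equals
$$-\mathbf{P}_\xi\big( n_*(t) = 1 \big) + \lambda_0(\xi)\, t = -1 + e^{-\lambda_0(\xi) t} + \lambda_0(\xi)\, t + \mathbf{P}_\xi\big( n_*(t) \ge 2 \big).$$
By collecting the above estimates, we get
$$\frac{1}{t^2}\, E_{\mathcal{P}_0}\Big( \big( T_t f - f - t\, \mathbb{L} f \big)^2 \Big) \le \frac{32 \|f\|_\infty^2}{t^2}\, E_{\mathcal{P}_0}\Big( \mathbf{P}_\xi\big( n_*(t) \ge 2 \big)^2 \Big) + \frac{8 \|f\|_\infty^2}{t^2}\, E_{\mathcal{P}_0}\Big( \big( -1 + e^{-\lambda_0 t} + \lambda_0 t \big)^2 \Big). \tag{36}$$
By using the estimate $(e^{-u} - 1 + u)^2 \le u^3/2$ for $u \ge 0$ and the finiteness of $E_{\mathcal{P}_0}(\lambda_0^3)$, it is easy to check that the second term in the r.h.s. tends to zero as $t \to 0$. In order to bound the first term, we observe that
$$\mathbf{P}_\xi\big( n_*(t) \ge 2 \big) \le \tilde{\mathbf{P}}^\xi_0 \otimes Q\big( T^\xi_{0,0} \le t,\ T^\xi_{1, \tilde X^\xi_1} \le t \big) = \big( 1 - e^{-\lambda_0(\xi) t} \big) \int \hat\xi(dx)\, p(S_x \xi|\xi)\, \big( 1 - e^{-\lambda_0(S_x \xi) t} \big).$$
Due to the estimate $1 - e^{-u} \le u$, this implies the bound
$$\mathbf{P}_\xi\big( n_*(t) \ge 2 \big) \le t^2 \lambda_0(\xi) \int \hat\xi(dx)\, p(S_x \xi|\xi)\, \lambda_0(S_x \xi) = t^2 \lambda_0(\xi)\, E_{\tilde{\mathbf{P}}_\xi}\big( \lambda_0(\xi_1) \big). \tag{37}$$
Due also to the estimate $1 - e^{-u} \le 1$, it is also true that
$$\mathbf{P}_\xi\big( n_*(t) \ge 2 \big) \le t\, E_{\tilde{\mathbf{P}}_\xi}\big( \lambda_0(\xi_1) \big). \tag{38}$$
By multiplying the last two inequalities, and using the stationarity of $\tilde{\mathbf{P}}$, one obtains
$$\frac{1}{t^2}\, E_{\mathcal{P}_0}\Big( \mathbf{P}_\xi\big( n_*(t) \ge 2 \big)^2 \Big) \le t\, E_{\mathcal{P}_0}\Big( \lambda_0(\xi)\, \big( E_{\tilde{\mathbf{P}}_\xi} \lambda_0(\xi_1) \big)^2 \Big) \le t\, E_{\mathcal{P}_0}\Big( \lambda_0(\xi)\, E_{\tilde{\mathbf{P}}_\xi}\big( \lambda_0^2(\xi_1) \big) \Big) = t\, E_{\mathcal{P}_0}\big( \lambda_0^3 \big),$$
thus implying that the first term on the r.h.s. of (36) goes to 0 as $t \to 0$.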
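Formulas (32) and (33) have a finite-dimensional analogue that is easy to verify numerically. In the sketch below (an illustration only, with randomly chosen symmetric rates on seven states and the uniform measure standing in for $\mathcal{P}_0$), the matrix `gen` plays the role of $L$ and the two sides of the Dirichlet-form identity are compared.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7

# Symmetric rates c[x, y] on a finite state space (illustrative choice).
c = rng.random((n, n))
c = (c + c.T) / 2.0
np.fill_diagonal(c, 0.0)

# (gen f)(x) = sum_y c[x, y] * (f(y) - f(x)), the finite analogue of (32).
gen = c - np.diag(c.sum(axis=1))

f = rng.normal(size=n)
mu = 1.0 / n                                        # uniform weight of each state

# Left- and right-hand sides of the analogue of (33).
lhs = float((mu * f) @ (-gen @ f))
rhs = 0.5 * mu * float(np.sum(c * (f[None, :] - f[:, None]) ** 2))

largest_eigenvalue = float(np.linalg.eigvalsh(gen).max())   # zero up to rounding: gen is nonpositive
```

The right-hand side is a sum of squares, which makes the nonpositivity of the generator visible without any spectral argument.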
3.4. Mean forward velocity and infinitesimal square displacement.

Proposition 4. Let $\mathcal{P}$ satisfy $\rho_{12} < \infty$ and let $\varphi$ be the $\mathbb{R}^d$-valued function on $\mathcal{N}_0$ and $\psi$ be the function on $\mathcal{N}_0$ with values in the real symmetric $d \times d$ matrices, respectively defined by
$$\varphi(\xi) = \int \hat\xi(dx)\, c_{0,x}(\xi)\, x, \qquad (a \cdot \psi(\xi) a) = \int \hat\xi(dx)\, c_{0,x}(\xi)\, (a \cdot x)^2. \tag{39}$$
(i) $\varphi(\xi)$ is in $L^2(\mathcal{N}_0, \mathcal{P}_0)$ and is equal to the mean forward velocity given by the convergent $L^2$-strong limit (21).
(ii) $(a \cdot \psi(\xi) a)$ is in $L^2(\mathcal{N}_0, \mathcal{P}_0)$ and is equal to the infinitesimal mean square displacement given by the convergent $L^2$-strong limit (22).

We point out that $\varphi(\xi)$ and $\psi(\xi)$ are well defined for $\mathcal{P}_0$-almost all $\xi$ since $\rho_2 < \infty$ (see for example the proof of Lemma 2).
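Proof. (i) One has
$$\frac{1}{t^2} \int \mathcal{P}_0(d\xi)\, \big( E_{\mathbf{P}_\xi}(X_t) - t \varphi(\xi) \big)^2 \le \frac{2}{t^2} \int \mathcal{P}_0(d\xi)\, \Big( E_{\mathbf{P}_\xi}\big( X_t\, \chi(n_*(t) = 1) \big) - t \varphi(\xi) \Big)^2 + \frac{2}{t^2} \int \mathcal{P}_0(d\xi)\, \Big( E_{\mathbf{P}_\xi}\big( X_t\, \chi(n_*(t) \ge 2) \big) \Big)^2. \tag{40}$$
We first show that the first term on the r.h.s. vanishes as $t \to 0$. Using the same notation as in the proof of Proposition 3 and invoking (35),
$$E_{\mathcal{P}_0}\Big( \big( E_{\mathbf{P}_\xi}( X_t\, \chi(n_*(t) = 1) ) - t \varphi(\xi) \big)^2 \Big) = E_{\mathcal{P}_0}\Big( \Big( \int_{\{x \ne 0\}} \hat\xi(dx)\, \big( p_{t,1}(S_x \xi|\xi) - t\, c_{0,x}(\xi) \big)\, x \Big)^2 \Big)$$
is bounded according to the Cauchy-Schwarz inequality by
$$E_{\mathcal{P}_0}\Big( \int_{\{x \ne 0\}} \hat\xi(dx)\, \big( -p_{t,1}(S_x \xi|\xi) + t\, c_{0,x}(\xi) \big) \int_{\{y \ne 0\}} \hat\xi(dy)\, \big( -p_{t,1}(S_y \xi|\xi) + t\, c_{0,y}(\xi) \big)\, |y|^2 \Big). \tag{41}$$
Let us denote by $I_1(\xi)$ and $I_2(\xi)$ the (non-negative) integrals over $\hat\xi(dx)$ and $\hat\xi(dy)$ respectively. Using the identities of the proof of Proposition 3, the inequality $0 \le -1 + e^{-u} + u \le u^2$, $u \ge 0$, and (37), we deduce
$$I_1(\xi) = -1 + e^{-t \lambda_0(\xi)} + t \lambda_0(\xi) + \mathbf{P}_\xi\big( n_*(t) \ge 2 \big) \le t^2 \lambda_0(\xi)^2 + t^2 \lambda_0(\xi)\, E_{\tilde{\mathbf{P}}_\xi}\big( \lambda_0(\xi_1) \big).$$
Moreover, $I_2(\xi) \le t \int \hat\xi(dy)\, c_{0,y}(\xi)\, |y|^2$. Hence (41) is bounded by
$$t^3 \Big[ E_{\mathcal{P}_0}\Big( \lambda_0^2(\xi) \int \hat\xi(dy)\, c_{0,y}(\xi)\, |y|^2 \Big) + E_{\mathcal{P}_0}\Big( \lambda_0(\xi)\, E_{\tilde{\mathbf{P}}_\xi}\big( \lambda_0(\xi_1) \big) \int \hat\xi(dy)\, c_{0,y}(\xi)\, |y|^2 \Big) \Big].$$
As long as $\rho_4 < \infty$, the first expression can be bounded by applying Lemma 1(iv) (see the argument leading to Lemma 2). A short calculation shows that the second expression equals
$$\int \mathcal{P}_0(d\xi) \int \hat\xi(dx)\, c_{0,x}(\xi) \int \hat\xi(dz)\, c_{x,z}(\xi) \int \hat\xi(dy)\, c_{0,y}(\xi)\, |y|^2$$
and is therefore bounded if $\rho_4 < \infty$ (again by means of Lemma 1(iv)). Summarizing the results obtained so far, one gets
$$\frac{1}{t^2} \int \mathcal{P}_0(d\xi)\, \Big( E_{\mathbf{P}_\xi}\big( X_t\, \chi(n_*(t) = 1) \big) - t \varphi(\xi) \Big)^2 = O(t). \tag{42}$$
We now turn to the second term in (40). By Proposition 1, $E_{\mathcal{P}_0}(E_{\mathbf{P}_\xi}(|X_t|^\gamma)) < \infty$ as long as $0 < \gamma < \kappa - 3$ whenever $\rho_\kappa < \infty$ for $\kappa$ integer. By applying twice the Hölder inequality, if $\gamma > 2$,
$$E_{\mathcal{P}_0}\Big( \big( E_{\mathbf{P}_\xi}( X_t\, \chi(n_*(t) \ge 2) ) \big)^2 \Big) \le \Big( E_{\mathcal{P}_0}\big( E_{\mathbf{P}_\xi}( |X_t|^\gamma ) \big) \Big)^{\frac{2}{\gamma}} \Big( E_{\mathcal{P}_0}\Big( \mathbf{P}_\xi\big( n_*(t) \ge 2 \big)^{\frac{2\gamma - 2}{\gamma - 2}} \Big) \Big)^{1 - \frac{2}{\gamma}}.$$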
Let us take (38) to the power γ /(γ − 2), multiply the result by (37). This yields 2γ −2 3γ −4 3γ −4 3γ −4 2γ −2 γ −2 γ −2 . EP0 Pξ n∗ (t) ≥ 2 γ −2 ≤ t γ −2 EP0 λ0 (ξ ) EP˜ λ0 (ξ1 ) = t γ −2 EP0 λ0 ξ
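Hence, by Lemma 2, if $\rho_\kappa < \infty$ is satisfied for integer $\kappa > (4\gamma - 6)/(\gamma - 2)$ and $\gamma < \kappa - 3$, there is a finite constant $C > 0$ such that
\[ E_{\mathcal P_0}\Bigl( \bigl| E_{\mathbf P_\xi}\bigl( X_t\, \chi(n_*(t) \ge 2) \bigr) \bigr|^2 \Bigr) \le C\, t^{\frac{3\gamma-4}{\gamma}}. \tag{43} \]
One concludes the proof by choosing $\gamma > 4$ and by combining (40), (42) and (43), as long as $\kappa > 7$.

(ii) One follows the same strategy. The first term in the equation corresponding to (40) can be dealt with in exactly the same way. In the argument for the second term, $|X_t|$ is replaced by $|X_t|^2$ so that one needs $2\gamma < \kappa - 3$, hence $\kappa > 11$.

3.5. Proof of Theorem 2. Since all conditions (a)-(d) of Subsect. 3.1 have been checked in the preceding subsections, as already pointed out, one can invoke [DFGW, Theorem 2.2] to conclude that the limit (3) exists and that the rescaled random walk $Y^{\xi,\varepsilon}$ converges weakly in $\mathcal P_0$-probability to the Brownian motion $W_D$. We can now also derive the variational formula (16) from the general expression (23).

Let us first quote some general results concerning self-adjoint operators. Let $(\Omega, \mu)$ be a probability space and denote by $\langle \cdot, \cdot \rangle_\mu$ and by $\|\cdot\|_\mu$ the scalar product and the norm on $H = L^2(\Omega, \mu)$. Let $\mathcal L : D(\mathcal L) \to H$ be a nonpositive self-adjoint operator with (dense) domain $D(\mathcal L) \subset H$ and assume $\mathcal C \subset D(\mathcal L)$ is a core of $\mathcal L$. The space $H_1$ is the completion of $D(|\mathcal L|^{1/2}) \cap (\mathrm{Ker}(\mathcal L))^\perp$ under the norm $\|f\|_1 := \| |\mathcal L|^{1/2} f \|_\mu$ for $f \in D(|\mathcal L|^{1/2})$, while the dual $H_{-1}$ of $H_1$ under $\langle \cdot, \cdot \rangle_\mu$ can be identified with the completion of $D(|\mathcal L|^{-1/2}) = \mathrm{Ran}(|\mathcal L|^{1/2})$ under the $\|\cdot\|_{-1}$-norm defined as $\|\varphi\|_{-1} := \| |\mathcal L|^{-1/2} \varphi \|_\mu$ for $\varphi \in D(|\mathcal L|^{-1/2})$. Given $\varphi \in H \cap H_{-1}$, the dual norm $\|\varphi\|_{-1}$ admits several useful characterizations:
\[ \|\varphi\|_{-1}^2 = \sup_{f \in H_1 \cap H} \frac{|\langle \varphi, f \rangle_\mu|^2}{\|f\|_1^2} = \sup_{f \in \mathcal C \cap (\mathrm{Ker}(\mathcal L))^\perp} \frac{|\langle \varphi, f \rangle_\mu|^2}{\|f\|_1^2}, \tag{44} \]
where the last identity results from the fact that $\mathcal C$ is a core for $\mathcal L$. Moreover, the identity
\[ \|\varphi\|_{-1}^2 = \sup_{f \in \mathcal C} \Bigl\{ 2 \langle \varphi, f \rangle_\mu - \langle f, (-\mathcal L) f \rangle_\mu \Bigr\} \tag{45} \]
is obtained by using the nonlinearity in $f$ of the expression in the r.h.s. of (45) and observing that $\varphi \in (\mathrm{Ker}(\mathcal L))^\perp$. Finally, it follows from spectral calculus that
\[ \|\varphi\|_{-1}^2 = \int_0^\infty dt\, \langle \varphi, e^{t\mathcal L} \varphi \rangle_\mu. \tag{46} \]
In what follows, we extend the definition of $\|\cdot\|_{-1}$ to the whole space $H$ by setting $\|\varphi\|_{-1} := \infty$ whenever $\varphi \in H$ and $\varphi \notin H_{-1}$. Thanks to this choice, identities (44), (45) and (46) are true for all $\varphi \in H$. Invoking (45) and (46), one obtains
\[ \int_0^\infty dt\, \langle \varphi \cdot a, e^{t\mathcal L} \varphi \cdot a \rangle_{\mathcal P_0} = \sup_{f \in L^\infty(\mathcal N_0, \mathcal P_0)} \Bigl\{ 2 \langle \varphi \cdot a, f \rangle_{\mathcal P_0} - \langle f, (-\mathcal L) f \rangle_{\mathcal P_0} \Bigr\}. \tag{47} \]
Using (33), (39) and Lemma 1(i), a short calculation starting from (23) yields (16).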
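The equality between the variational expression (45) and the time integral (46) can be checked numerically on a finite state space. The following is only a minimal sketch under stated assumptions: the three-point nearest-neighbour generator `L`, the uniform measure and the test function `phi` are hypothetical choices made for illustration and do not appear in the text; the supremum is approximated by plain gradient ascent and the semigroup by explicit Euler steps.

```python
# Toy check that the variational formula (45) and the time integral (46)
# give the same value of the squared H_{-1} norm.
# Hypothetical example: nearest-neighbour walk on a 3-point path, uniform measure.

L = [[-1.0, 1.0, 0.0],
     [1.0, -2.0, 1.0],
     [0.0, 1.0, -1.0]]          # symmetric, nonpositive generator
phi = [1.0, 0.0, -1.0]          # mean zero, hence orthogonal to Ker(L) = constants


def apply(mat, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in mat]


def inner(u, v):
    """Scalar product in L^2(mu), mu uniform on the three points."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def variational(steps=400, eta=0.3):
    """Maximise 2<phi,f> - <f,(-L)f> by gradient ascent f <- f + eta (phi + L f)."""
    f = [0.0, 0.0, 0.0]
    for _ in range(steps):
        Lf = apply(L, f)
        f = [fi + eta * (p + l) for fi, p, l in zip(f, phi, Lf)]
    # 2<phi,f> - <f,(-L)f> = 2<phi,f> + <f,Lf>
    return 2 * inner(phi, f) + inner(f, apply(L, f))


def time_integral(dt=1e-3, horizon=40.0):
    """Left Riemann sum of <phi, e^{tL} phi>, advancing e^{tL} phi by Euler steps."""
    v, total = phi[:], 0.0
    for _ in range(int(horizon / dt)):
        total += inner(phi, v) * dt
        v = [vi + dt * lv for vi, lv in zip(v, apply(L, v))]
    return total
```

In this example $\varphi$ is an eigenvector of the generator with eigenvalue $-1$, so both functions should return a value close to $\langle \varphi, \varphi \rangle_\mu = 2/3$.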
A. Faggionato, H. Schulz-Baldes, D. Spehner
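4. Bound by Cut-off on the Transition Rates

This section and the next ones are devoted to the proof of Theorem 1(ii). In particular, we assume that $\hat{\mathcal P}$, $\mathcal P$ and $\nu$ satisfy the conditions of Theorem 1(ii), although many partial results are true under much weaker conditions. The variational formula (16) is particularly suited to deriving bounds on the diffusion matrix $D$. For example, due to the monotonicity of the jump rates $c_{x,y}(\xi)$ in the inverse temperature $\beta$, one deduces that the diffusion matrix is a non-increasing function of $\beta$. The aim of this section is to obtain more quantitative bounds.

Given an energy $0 \le E_c \le 1$, we define the map $\Phi_c : \mathcal N \to \hat{\mathcal N} := \mathcal N(\mathbb R^d)$ as follows:
\[ \bigl(\Phi_c(\xi)\bigr)(A) := \xi(A \times [-E_c, E_c]), \qquad A \in \mathcal B(\mathbb R^d). \tag{48} \]
Note that $\hat{\mathcal P}^c := \mathcal P \circ \Phi_c^{-1}$ is the distribution of a point process on $\mathbb R^d$ with finite intensity $\rho_c := E_{\hat{\mathcal P}^c}(\hat\xi(C_1)) \le E_{\hat{\mathcal P}}(\hat\xi(C_1)) = \rho$, and in general
\[ E_{\hat{\mathcal P}^c}\bigl( \hat\xi(C_1)^\kappa \bigr) \le \rho_\kappa, \qquad \forall\, \kappa > 0. \tag{49} \]
In what follows, we assume that $\rho_c > 0$. It can readily be checked that $\hat{\mathcal P}^c$ is an ergodic SSPP on $\mathbb R^d$. We write $\hat{\mathcal P}^c_0$ for the Palm distribution associated to $\hat{\mathcal P}^c$. Note that the distribution $\hat{\mathcal P}^c$ is obtained from $\hat{\mathcal P}$ by $\delta_c$-thinning with $\delta_c := \nu([-E_c, E_c])$. Thus, $\rho_c = \delta_c\, \rho$. The relation between the Palm distributions $\mathcal P_0$ and $\hat{\mathcal P}^c_0$ is described in the following lemma.

Lemma 3. For any Borel set $A \in \mathcal B(\hat{\mathcal N}_0)$ one has
\[ \hat{\mathcal P}^c_0(A) = \rho\, \rho_c^{-1}\, \mathcal P_0\bigl( |E_0| \le E_c,\ \Phi_c(\xi) \in A \bigr). \]

Proof. The assertion is proven by comparing the two following identities obtained from (11):
\[ \hat{\mathcal P}^c_0(A) = \frac{1}{\rho_c} \int_{\hat{\mathcal N}} \hat{\mathcal P}^c(d\hat\xi) \int_{C_1} \hat\xi(dx)\, \chi_A(S_x \hat\xi), \]
\[ \mathcal P_0\bigl( |E_0| \le E_c,\ \Phi_c(\xi) \in A \bigr) = \frac{1}{\rho} \int_{\mathcal N} \mathcal P(d\xi) \int_{C_1} \hat\xi(dx)\, \chi(|E_x| \le E_c)\, \chi_A\bigl(\Phi_c(S_x \xi)\bigr) = \frac{1}{\rho} \int_{\mathcal N} \mathcal P(d\xi) \int_{C_1} \Phi_c(\xi)(dx)\, \chi_A\bigl(S_x(\Phi_c(\xi))\bigr). \]

Proposition 5. Fix a distance $r_c > 0$ and an energy $0 \le E_c \le 1$ and let $\hat{\mathcal P}^c_0$ be as above. Moreover, define
\[ \varphi_c(\hat\xi) := \int \hat\xi(dx)\, \hat c_{0,x}\, x, \qquad (a \cdot \psi_c(\hat\xi)\, a) := \int \hat\xi(dx)\, \hat c_{0,x}\, (a \cdot x)^2 \tag{50} \]
as functions on $\hat{\mathcal N}_0$, where $\hat c_{0,x} := \chi(|x| \le r_c)$. Then the diffusion matrix $D$ for the process $(X^\xi_t)_{t \ge 0}$ in Theorem 2 admits the following lower bound
\[ D \ge \frac{\rho_c}{\rho}\, e^{-r_c - 4\beta E_c}\, D_c(r_c, E_c), \]
where
\[ (a \cdot D_c(r_c, E_c)\, a) := E_{\hat{\mathcal P}^c_0}(a \cdot \psi_c\, a) - 2 \int_0^\infty dt\, \langle \varphi_c \cdot a,\ e^{t \mathcal L_c} \varphi_c \cdot a \rangle_{\hat{\mathcal P}^c_0}, \tag{51} \]
and $\mathcal L_c$ is the unique self-adjoint operator on $L^2(\hat{\mathcal N}_0, \hat{\mathcal P}^c_0)$ such that
\[ (\mathcal L_c f)(\hat\xi) = \int \hat\xi(dx)\, \hat c_{0,x}\, \nabla_x f(\hat\xi), \qquad \forall\, f \in L^\infty(\hat{\mathcal N}_0, \hat{\mathcal P}^c_0). \tag{52} \]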
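One can prove by the same arguments used in the proof of Proposition 3 that $\mathcal L_c$ is well-defined and self-adjoint. Let $\hat{\mathbf P}^c$ and $\hat{\mathbf P}^c_{\hat\xi}$ be the probability measures on the path space $\hat\Omega := D([0, \infty), \hat{\mathcal N}_0)$ associated to the Markov process with generator $\mathcal L_c$ and initial distribution $\hat{\mathcal P}^c_0$ and $\delta_{\hat\xi}$, respectively, with $\hat\xi \in \hat{\mathcal N}_0$. One can prove that these Markov processes are well-defined (in particular, $\hat{\mathbf P}^c_{\hat\xi}$ is well-defined for $\hat{\mathcal P}^c_0$-almost all $\hat\xi$) and exhibit a realization as jump processes by means of the same arguments used in Sect. 3.2 (note that, for a suitable positive constant $c$, $\int \hat\xi(dx)\, \hat c_{0,x} \le c\, \lambda_0(\xi)$ for any $\xi \in \mathcal N_0$, thus allowing to exclude explosion phenomena from the results of Appendix A). Finally, given $\hat\xi \in \hat\Omega$, $X_t(\hat\xi)$ is defined as in (19).

Proof. Note that $c_{0,x}(\xi) \ge e^{-r_c - 4\beta E_c}\, \tilde c_{0,x}(\xi)$, where
\[ \tilde c_{x,y}(\xi) := \chi\bigl( |E_x| \le E_c,\ |E_y| \le E_c,\ |x - y| \le r_c \bigr), \qquad x, y \in \hat\xi. \]
Then (16) implies that $(a \cdot D a) \ge e^{-r_c - 4\beta E_c}\, g(a)$, where
\[ g(a) := \inf_{f \in L^\infty(\mathcal N_0, \mathcal P_0)} \int \mathcal P_0(d\xi) \int \hat\xi(dx)\, \tilde c_{0,x}(\xi)\, \bigl( a \cdot x + \nabla_x f(\xi) \bigr)^2 \ge 0. \]
By the same arguments used in the proof of Proposition 3 one can show that there is a unique self-adjoint operator $\tilde{\mathcal L}$ on $L^2(\mathcal N_0, \mathcal P_0)$ such that
\[ (\tilde{\mathcal L} f)(\xi) := \int \hat\xi(dx)\, \tilde c_{0,x}(\xi)\, \nabla_x f(\xi), \qquad \forall\, f \in L^\infty(\mathcal N_0, \mathcal P_0). \]
Moreover, $L^\infty(\mathcal N_0, \mathcal P_0)$ is a core of $\tilde{\mathcal L}$ and
\[ \langle f, (-\tilde{\mathcal L}) f \rangle_{\mathcal P_0} = \frac{1}{2} \int \mathcal P_0(d\xi) \int \hat\xi(dx)\, \tilde c_{0,x}(\xi)\, \bigl( \nabla_x f(\xi) \bigr)^2, \qquad \forall\, f \in L^\infty(\mathcal N_0, \mathcal P_0). \tag{53} \]
Next let us introduce the functions
\[ \tilde\varphi(\xi) = \int \hat\xi(dx)\, \tilde c_{0,x}(\xi)\, x, \qquad (a \cdot \tilde\psi(\xi)\, a) = \int \hat\xi(dx)\, \tilde c_{0,x}(\xi)\, (a \cdot x)^2. \]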
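Then we obtain by means of straightforward computations and the identities (45), (46) and (53) that
\[ g(a) = E_{\mathcal P_0}(a \cdot \tilde\psi\, a) - 2 \sup_{f \in L^\infty(\mathcal N_0, \mathcal P_0)} \Bigl\{ 2 \langle \tilde\varphi \cdot a, f \rangle_{\mathcal P_0} - \langle f, (-\tilde{\mathcal L}) f \rangle_{\mathcal P_0} \Bigr\} = E_{\mathcal P_0}(a \cdot \tilde\psi\, a) - 2 \int_0^\infty dt\, \langle \tilde\varphi \cdot a,\ e^{t \tilde{\mathcal L}} \tilde\varphi \cdot a \rangle_{\mathcal P_0}. \]
At this point, in order to get (51), it is enough to show that
\[ E_{\mathcal P_0}(a \cdot \tilde\psi\, a) = \delta_c\, E_{\hat{\mathcal P}^c_0}(a \cdot \psi_c\, a) \]
and
\[ \langle \tilde\varphi \cdot a,\ e^{t \tilde{\mathcal L}} \tilde\varphi \cdot a \rangle_{\mathcal P_0} = \delta_c\, \langle \varphi_c \cdot a,\ e^{t \mathcal L_c} \varphi_c \cdot a \rangle_{\hat{\mathcal P}^c_0}. \]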
This can be derived from Lemma 3 and the following identities, where $\Phi_c$ is defined by (48):
\[ \tilde\psi = \chi(|E_0| \le E_c)\, \psi_c \circ \Phi_c, \qquad \tilde\varphi = \chi(|E_0| \le E_c)\, \varphi_c \circ \Phi_c, \qquad \tilde{\mathcal L}(f \circ \Phi_c) = \chi(|E_0| \le E_c)\, (\mathcal L_c f) \circ \Phi_c. \]

5. Periodic Approximants and Resistor Networks

In this section, we compare $D_c(r_c, E_c)$ to the diffusion coefficient of adequately defined periodic approximants, which then in turn can be calculated as the conductance of a random resistor network as in [DFGW]. There have been numerous works on periodic approximants; a recent one containing further references is [Owh].

5.1. Random walk on a periodized medium. Let us choose a given direction in $\mathbb R^d$, say, the direction parallel to the axis of the first coordinate. Given a fixed configuration $\hat\xi \in \hat{\mathcal N}$ and $N > r_c$, we define the following subsets of $\mathbb R^d$:
\[ Q_N^{\hat\xi} := \mathrm{supp}(\hat\xi) \cap \check C_{2N}, \qquad \bar V_N^{\hat\xi} := Q_N^{\hat\xi} \cup \Gamma_N^+ \cup \Gamma_N^-, \]
\[ \Gamma_N^\pm := \mathbb Z^d \cap \{ x : x^{(1)} = \pm N,\ |x^{(j)}| < N \text{ for } j = 2, \dots, d \}, \qquad B_N^{\hat\xi \pm} := Q_N^{\hat\xi} \cap B_N^\pm, \]
where $\check C_{2N} := (-N, N)^d$, $B_N^- := \{ x \in \check C_{2N} : x^{(1)} \in (-N, -N + r_c] \}$ and $B_N^+ := \{ x \in \check C_{2N} : x^{(1)} \in [N - r_c, N) \}$.

Next let us introduce a graph $(\bar V_N^{\hat\xi}, \bar E_N^{\hat\xi})$ with set of vertices $\bar V_N^{\hat\xi}$ and set of edges $\bar E_N^{\hat\xi}$. Two vertices $x, y \in Q_N^{\hat\xi}$ are connected by a non-oriented edge $\{x, y\} \in \bar E_N^{\hat\xi}$ if and only if $|x - y| \le r_c$; moreover, all vertices $x \in B_N^{\hat\xi +}$ (respectively $x \in B_N^{\hat\xi -}$) are connected to all $y \in \Gamma_N^+$ (respectively $y \in \Gamma_N^-$) by an edge $\{x, y\} \in \bar E_N^{\hat\xi}$, and the points of $\Gamma_N^\pm$ are not connected between themselves.
We now define another graph $(V_N^{\hat\xi}, E_N^{\hat\xi})$ obtained from $(\bar V_N^{\hat\xi}, \bar E_N^{\hat\xi})$ by identifying the vertices
\[ x_- = (-N, x^{(2)}, \dots, x^{(d)}) \qquad \text{and} \qquad x_+ = (N, x^{(2)}, \dots, x^{(d)}). \]
Let us write $\pi : \bar V_N^{\hat\xi} \to V_N^{\hat\xi}$ for the identification map on the sets of vertices. Hence $\pi(\Gamma_N^-) = \pi(\Gamma_N^+)$ and $\pi$ restricted to $Q_N^{\hat\xi}$ is the identity map. The set $V_N^{\hat\xi} = \pi(\bar V_N^{\hat\xi})$ represents the medium periodized along the first coordinate. A vertex $y \in \pi(\Gamma_N^-)$ is connected to all vertices $x \in B_N^{\hat\xi +} \cup B_N^{\hat\xi -}$ by an edge of $E_N^{\hat\xi}$.

Now a continuous-time random walk with state space $V_N^{\hat\xi}$ and infinitesimal generator $\mathcal L_N^{\hat\xi}$ is given by
\[ (\mathcal L_N^{\hat\xi} f)(x) = \sum_{y \in V_N^{\hat\xi} :\ \{x, y\} \in E_N^{\hat\xi}} c(\{x, y\})\, \bigl( f(y) - f(x) \bigr), \qquad \forall\, x \in V_N^{\hat\xi}, \]
where the bond-dependent transition rates $c(\{x, y\})$ are defined for any $\{x, y\} \in E_N^{\hat\xi}$ by
\[ c(\{x, y\}) = \begin{cases} 1 & \text{if } x, y \in Q_N^{\hat\xi}, \\ \dfrac{1}{|\Gamma_N^-|} & \text{if } x \in \pi(\Gamma_N^-) \text{ or } y \in \pi(\Gamma_N^-). \end{cases} \tag{54} \]
Clearly the generator $\mathcal L_N^{\hat\xi}$ is symmetric w.r.t. the uniform distribution $m_N^{\hat\xi}$ on $V_N^{\hat\xi}$ given by
\[ m_N^{\hat\xi} = \frac{1}{|V_N^{\hat\xi}|} \sum_{x \in V_N^{\hat\xi}} \delta_x. \]
Hence the Markov process with generator $\mathcal L_N^{\hat\xi}$ and initial distribution $m_N^{\hat\xi}$ is reversible. Note that it is not ergodic, however, if there is more than one cluster (equivalence class of edges). In the latter case, the ergodic measures are the uniform distributions on a given cluster and this is sufficient for the present purposes.

We write $\mathbf P_N^{\hat\xi}$ (respectively $\mathbf P_{N,x}^{\hat\xi}$) for the probability on the path space $\Omega_N^{\hat\xi} = D([0, \infty), V_N^{\hat\xi})$ associated to the random walk with initial distribution $m_N^{\hat\xi}$ (respectively $\delta_x$) and generator $\mathcal L_N^{\hat\xi}$.
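The construction of the periodized graph and of its generator can be sketched in code for the simplest case $d = 1$. This is only an illustration under stated assumptions: the function names `periodized_rates` and `generator` and the label `"G"` for the glued boundary vertex are hypothetical, the points are assumed to be distinct, and in $d = 1$ each boundary set consists of a single point, so the boundary rate is taken equal to 1.

```python
def periodized_rates(points, N, rc):
    """Bond rates c({x, y}) of the periodized graph for d = 1 (illustrative sketch).

    points: distinct positions; only those in (-N, N) are kept (the set Q).
    The boundary points -N and +N are identified into the single vertex "G".
    """
    Q = sorted(x for x in points if -N < x < N)
    rates = {}
    for i, x in enumerate(Q):
        for y in Q[i + 1:]:
            if y - x <= rc:                      # |x - y| <= rc: unit rate inside Q
                rates[frozenset((x, y))] = 1.0
    for x in Q:
        if x <= -N + rc or x >= N - rc:          # x in the left or right boundary layer
            rates[frozenset((x, "G"))] = 1.0     # assumed boundary rate in d = 1
    return Q + ["G"], rates


def generator(vertices, rates, f):
    """(L f)(x) = sum over neighbours y of c({x, y}) * (f(y) - f(x))."""
    out = {x: 0.0 for x in vertices}
    for bond, c in rates.items():
        x, y = tuple(bond)
        out[x] += c * (f[y] - f[x])
        out[y] += c * (f[x] - f[y])
    return out
```

Because the rates depend only on the unordered bond, the identity $\sum_x g(x)\,(\mathcal L f)(x) = \sum_x f(x)\,(\mathcal L g)(x)$ holds for this sketch, which is the symmetry with respect to the uniform distribution used in the text.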
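Let us introduce an antisymmetric function $d_1(x, y)$ on $V_N^{\hat\xi}$ such that
\[ d_1(x, y) = \begin{cases} y^{(1)} - x^{(1)} & \text{if } x, y \in Q_N^{\hat\xi}, \\ y^{(1)} + N & \text{if } y \in Q_N^{\hat\xi},\ y^{(1)} < 0,\ x \in \pi(\Gamma_N^-), \\ y^{(1)} - N & \text{if } y \in Q_N^{\hat\xi},\ y^{(1)} > 0,\ x \in \pi(\Gamma_N^-). \end{cases} \]
Finally, given $t \ge 0$, we define the random variable
\[ X_{N,t}^{(1)\hat\xi}(\omega) = \sum_{s \in [0, t] :\ \omega_s \ne \omega_{s-}} d_1(\omega_{s-}, \omega_s), \]
where $(\omega_s)_{s \ge 0} \in \Omega_N^{\hat\xi}$. It is the sum of position increments along the first coordinate axis for all jumps occurring in the time interval $[0, t]$. Clearly, $X_{N,t}^{(1)\hat\xi}$ gives rise to a time-covariant and antisymmetric family so that, as in Sect. 3, [DFGW, Theorem 2.2] can be used in order to deduce the following result.

Proposition 6. Given $N \in \mathbb N$, $N > r_c$, and $\hat\xi \in \hat{\mathcal N}$,
\[ \lim_{t \uparrow \infty} \frac{1}{t}\, E_{\mathbf P_N^{\hat\xi}}\Bigl( \bigl( X_{N,t}^{(1)\hat\xi} \bigr)^2 \Bigr) = D_N^{\hat\xi}, \]
where the diffusion coefficient $D_N^{\hat\xi}$ is finite and given by
\[ D_N^{\hat\xi} = m_N^{\hat\xi}\bigl( \psi_N^{\hat\xi} \bigr) - 2 \int_0^\infty dt\, \bigl\langle \varphi_N^{\hat\xi},\ e^{t \mathcal L_N^{\hat\xi}} \varphi_N^{\hat\xi} \bigr\rangle_{m_N^{\hat\xi}}, \tag{55} \]
with $\psi_N^{\hat\xi}$, $\varphi_N^{\hat\xi}$ (scalar) functions on $V_N^{\hat\xi}$ defined as
\[ \psi_N^{\hat\xi}(x) = \sum_{y :\ \{y, x\} \in E_N^{\hat\xi}} c(\{x, y\})\, d_1(x, y)^2, \qquad \varphi_N^{\hat\xi}(x) = \sum_{y :\ \{y, x\} \in E_N^{\hat\xi}} c(\{x, y\})\, d_1(x, y). \tag{56} \]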
5.2. Link to periodized medium. Here we show that the diffusion matrix (51) can be bounded from below in terms of the average of the diffusion coefficient associated to the periodized random media. Our proof follows the arguments of [DFGW, Prop. 4.13], but additional technical problems are related to the randomness of geometry (absence of any lattice structure) and possible (albeit integrable) singularities of the mean forward velocity and infinitesimal mean square displacement.

Proposition 7. Suppose that for $1 \le p \le 8$
\[ \lim_{N \uparrow \infty} \frac{\rho_c\, \ell(C_{2N})}{\hat\xi(C_{2N}) + a_{2N}} = 1 \qquad \text{in } L^p(\hat{\mathcal N}, \hat{\mathcal P}^c), \tag{57} \]
where $\rho_c := E_{\hat{\mathcal P}^c}(\hat\xi(C_1))$ and $a_{2N} := |\Gamma_N^\pm| = (2N - 1)^{d-1}$. Then, for any $t > 0$,
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\bigl( \psi_N^{\hat\xi} \bigr) \Bigr) = E_{\hat{\mathcal P}^c_0}\bigl( \psi_c^{(11)} \bigr), \tag{58} \]
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( \bigl\langle \varphi_N^{\hat\xi},\ e^{t \mathcal L_N^{\hat\xi}} \varphi_N^{\hat\xi} \bigr\rangle_{m_N^{\hat\xi}} \Bigr) = \bigl\langle \varphi_c^{(1)},\ e^{t \mathcal L_c} \varphi_c^{(1)} \bigr\rangle_{\hat{\mathcal P}^c_0}, \tag{59} \]
where $\psi_c^{(11)}$ and $\varphi_c^{(1)}$ are the first diagonal matrix element of the matrix $\psi_c$ and the first component of the vector $\varphi_c$ introduced in (50), respectively.

Since $D_c(r_c, E_c)$ is given by (51) and is a multiple of the identity (cf. Remark 2), the identities (58) and (59) combined with Fatou's Lemma immediately imply:

Corollary 1. Under the same hypothesis as above,
\[ D_c(r_c, E_c) \ge \Bigl( \limsup_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\bigl( D_N^{\hat\xi} \bigr) \Bigr)\, \mathbf 1_d, \tag{60} \]
where $\mathbf 1_d$ is the $d \times d$ identity matrix.
Before giving the proof, let us comment on its assumptions. In Sect. 6 we will show that condition (57) is always satisfied. Due to (49), $\rho_p < \infty$ implies $E_{\hat{\mathcal P}^c}(\hat\xi(C_1)^p) < \infty$ for any $p > 0$. As $\hat{\mathcal P}^c$ is ergodic, this implies the following ergodic theorem, an extension of [DV, Theorem 10.2]. We recall that a convex averaging sequence of sets $\{A_n\}$ in $\mathbb R^d$ is a sequence of convex sets such that $A_n \subset A_{n+1}$ and $A_n$ contains a ball of radius $r_n$ with $r_n \to \infty$ as $n \to \infty$.

Lemma 4. Suppose that $\rho_p < \infty$, $p \ge 1$. Then, given a convex averaging sequence of Borel sets $\{A_n\}$ in $\mathbb R^d$,
\[ \frac{\hat\xi(A_n)}{\rho_c\, \ell(A_n)} \to 1 \quad \text{in } L^p(\hat{\mathcal N}, \hat{\mathcal P}^c) \qquad \text{and} \qquad \frac{\hat\xi(A_n)}{\rho_c\, \ell(A_n)} \to 1 \quad \hat{\mathcal P}^c\text{-a.s.} \]

We will also need a bound on $E_{\hat{\mathcal P}^c}\bigl( (\hat\xi(A_n)/\ell(A_n))^p \bigr)$, uniformly in $n$, for a sequence of sets that does not satisfy the assumptions of Lemma 4. To this aim we note that, given a Borel set $B \subset \mathbb R^d$ which is a union of $k$ non-overlapping cubes of side 1, one has
\[ E_{\hat{\mathcal P}^c}\Bigl( \bigl( \hat\xi(B)/k \bigr)^p \Bigr) \le E_{\hat{\mathcal P}^c}\bigl( \hat\xi(C_1)^p \bigr) \le \rho_p, \qquad \forall\, p \ge 1. \tag{61} \]
This follows from the stationarity of $\hat{\mathcal P}^c$ and the convexity of the function $f(x) = x^p$, $x \ge 0$.

Proof of Proposition 7. Without loss of generality, we assume $r_c = 1$. Note that, since $\hat{\mathcal P}$ is stationary with finite intensity $\rho_1$, one has $\hat{\mathcal P}$-a.s. $\hat\xi(\partial C_k) = 0$ for all $k \in \mathbb N$. In what follows we hence may assume $\hat\xi$ to be as such, thus allowing to simplify notation since $C_{2N} \cap \mathrm{supp}(\hat\xi) = \check C_{2N} \cap \mathrm{supp}(\hat\xi)$. A key observation in order to prove (58) and (59) is the following identity, valid for any nonnegative measurable function $h$ defined on $\hat{\mathcal N}_0$. It follows easily from (12):
\[ E_{\hat{\mathcal P}^c}\Bigl( \int_B \hat\xi(dx)\, h(S_x \hat\xi) \Bigr) = \rho_c\, \ell(B)\, E_{\hat{\mathcal P}^c_0}(h), \qquad \forall\, B \in \mathcal B(\mathbb R^d). \tag{62} \]
From this identity we can deduce for any h ∈ L2 (N0 , Pˆ 0c ) that 1 lim EPˆ c ξˆ (dx)h(Sx ξˆ ) = EPˆ c (h) . (63) 0 N↑∞ ξˆ (C2N ) + a2N C2N −2 In fact, due to (62), it is enough to show that 1 1 ˆ ˆ EPˆ c ξ (dx)h(Sx ξ ) ↓ 0 , as N ↑ ∞. − ξˆ (C2N )+a2N ρc (C2N−2 ) C2N−2 (64) By applying twice the Cauchy-Schwarz inequality and by invoking (62), we obtain 2 l.h.s. of (64) ρ (C 2 ξˆ (C 2N−2 ) 2N−2 ) c ≤ EPˆ c −1 ˆξ (C2N ) + a2N ρc2 (C2N−2 )2 2 1 EPˆ c ξˆ (dx)h(Sx ξˆ ) ξˆ (C2N−2 ) C2N −2 ρ (C 2 ξˆ (C c 2N−2 ) 2N−2 ) −1 ≤ EPˆ c EPˆ c (h2 ) . 0 ˆξ (C2N ) + a2N ρc (C2N−2 )
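At this point, (64) follows by applying the Cauchy-Schwarz inequality to the first expectation above and then applying (61) and the limit (57) for $p = 4$.

Let now $h_N^{\hat\xi}$ be a function on $V_N^{\hat\xi}$ such that for some constant $c > 0$ independent of $N$,
\[ |h_N^{\hat\xi}(x)| \le c \begin{cases} \hat\xi(B_1(x)) & \text{if } x \in Q_N^{\hat\xi}, \\ |B_N^{\hat\xi}| / a_{2N} & \text{otherwise}, \end{cases} \]
where $B_N^{\hat\xi} = B_N^{\hat\xi -} \cup B_N^{\hat\xi +}$ and $B_1(x)$ is the closed unit ball centered in $x$. Note that $\psi_N^{\hat\xi}$ and $\varphi_N^{\hat\xi}$ satisfy this inequality. We claim that the mean boundary contribution vanishes in the limit:
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( \frac{1}{\hat\xi(C_{2N}) + a_{2N}} \sum_{x \in V_N^{\hat\xi} \setminus Q_{N-1}^{\hat\xi}} |h_N^{\hat\xi}(x)|^p \Bigr) = 0, \qquad \text{for } 1 \le p \le 4. \tag{65} \]
In fact, the sum in (65) can be bounded by
\[ c^p\, a_{2N}\, \frac{|B_N^{\hat\xi}|^p}{a_{2N}^p} + c^p \sum_{x \in Q_N^{\hat\xi} \setminus Q_{N-1}^{\hat\xi}} \hat\xi(B_1(x))^p. \tag{66} \]
By the Cauchy-Schwarz inequality
\[ E_{\hat{\mathcal P}^c}\Bigl( \frac{a_{2N}}{\hat\xi(C_{2N}) + a_{2N}}\, \frac{|B_N^{\hat\xi}|^p}{a_{2N}^p} \Bigr) \le E_{\hat{\mathcal P}^c}^{1/2}\Bigl( \frac{a_{2N}^2}{(\hat\xi(C_{2N}) + a_{2N})^2} \Bigr)\, E_{\hat{\mathcal P}^c}^{1/2}\Bigl( \frac{|B_N^{\hat\xi}|^{2p}}{a_{2N}^{2p}} \Bigr). \]
The first factor on the r.h.s. is negligible as $N \uparrow \infty$ because of the limit (57) for $p = 2$, while the second factor is bounded, uniformly in $N$, because of (61). For the second summand in (66), we use twice the Cauchy-Schwarz inequality and invoke (62) to deduce
\[ E_{\hat{\mathcal P}^c}\Bigl( \frac{1}{\hat\xi(C_{2N}) + a_{2N}} \sum_{x \in Q_N^{\hat\xi} \setminus Q_{N-1}^{\hat\xi}} \hat\xi(B_1(x))^p \Bigr) \le E_{\hat{\mathcal P}^c}^{1/2}\Bigl( \frac{\hat\xi(C_{2N} \setminus C_{2N-2})}{(\hat\xi(C_{2N}) + a_{2N})^2} \Bigr) \times E_{\hat{\mathcal P}^c}^{1/2}\Bigl( \frac{1}{\hat\xi(C_{2N} \setminus C_{2N-2})} \Bigl( \sum_{x \in Q_N^{\hat\xi} \setminus Q_{N-1}^{\hat\xi}} \hat\xi(B_1(x))^p \Bigr)^2 \Bigr) \]
\[ \le \Bigl[ \rho_c\, \ell(C_{2N} \setminus C_{2N-2})\, E_{\hat{\mathcal P}^c}\Bigl( \frac{\hat\xi(C_{2N} \setminus C_{2N-2})}{(\hat\xi(C_{2N}) + a_{2N})^2} \Bigr) \Bigr]^{1/2}\, E_{\hat{\mathcal P}^c_0}^{1/2}\Bigl( \hat\xi(B_1(0))^{2p} \Bigr). \]
The last factor is bounded by hypothesis, the first one converges to 0 as $N \uparrow \infty$ because of Lemma 4 and (57).

In order to prove (58) observe that $\psi_c^{(11)}(S_x \hat\xi) = \psi_N^{\hat\xi}(x)$ if $x \in Q_{N-1}^{\hat\xi}$. Therefore we can write
\[ m_N^{\hat\xi}\bigl( \psi_N^{\hat\xi} \bigr) = \frac{1}{\hat\xi(C_{2N}) + a_{2N}} \int_{C_{2N-2}} \hat\xi(dx)\, \psi_c^{(11)}(S_x \hat\xi) + \frac{1}{\hat\xi(C_{2N}) + a_{2N}} \sum_{x \in V_N^{\hat\xi} \setminus C_{2N-2}} \psi_N^{\hat\xi}(x). \]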
Now (58) follows easily from (63) and (65) with $h_N^{\hat\xi} := \psi_N^{\hat\xi}$. Note that by the same arguments one can prove
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\bigl( |\varphi_N^{\hat\xi}(x)|^p \bigr) \Bigr) = E_{\hat{\mathcal P}^c_0}\bigl( |\varphi_c^{(1)}|^p \bigr) < \infty, \qquad 1 \le p \le 4, \tag{67} \]
which will be useful below.

In order to prove (59), we fix $0 < \alpha < 1$ and set $M = 2N - 2[N^\alpha]$, where $[N^\alpha]$ denotes the integer part of $N^\alpha$. Moreover, we define the hitting times
\[ \tau_N^{\hat\xi}(\omega) = \inf\{ s \ge 0 : \omega_s \notin C_{2N-2} \}, \qquad \omega = (\omega_s)_{s \ge 0} \in \Omega_N^{\hat\xi} = D([0, \infty), V_N^{\hat\xi}). \tag{68} \]
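Recall the definitions of the distributions $\hat{\mathbf P}^c_{\hat\xi}$, $\mathbf P_{N,x}^{\hat\xi}$ and $\mathbf P_N^{\hat\xi}$ given in Sects. 4 and 5.1. Thanks to the identity $(e^{t \mathcal L_N^{\hat\xi}} \varphi_N^{\hat\xi})(x) = E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \varphi_N^{\hat\xi}(\omega_t) \bigr)$, we can write
\[ E_{\hat{\mathcal P}^c}\Bigl( \bigl\langle \varphi_N^{\hat\xi},\ e^{t \mathcal L_N^{\hat\xi}} \varphi_N^{\hat\xi} \bigr\rangle_{m_N^{\hat\xi}} \Bigr) = E_{\hat{\mathcal P}^c}\bigl( A_{1,N}^{\hat\xi} + A_{2,N}^{\hat\xi} + A_{3,N}^{\hat\xi} \bigr), \]
where
\[ A_{1,N}^{\hat\xi} = m_N^{\hat\xi}\Bigl( \chi(x \notin C_M)\, \varphi_N^{\hat\xi}(x)\, E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \varphi_N^{\hat\xi}(\omega_t) \bigr) \Bigr), \]
\[ A_{2,N}^{\hat\xi} = m_N^{\hat\xi}\Bigl( \chi(x \in C_M)\, \varphi_N^{\hat\xi}(x)\, E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \chi(\tau_N^{\hat\xi} \le t)\, \varphi_N^{\hat\xi}(\omega_t) \bigr) \Bigr), \]
\[ A_{3,N}^{\hat\xi} = m_N^{\hat\xi}\Bigl( \chi(x \in C_M)\, \varphi_N^{\hat\xi}(x)\, E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \chi(\tau_N^{\hat\xi} > t)\, \varphi_N^{\hat\xi}(\omega_t) \bigr) \Bigr). \]
Then (59) follows from
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\bigl( A_{1,N}^{\hat\xi} \bigr) = 0, \qquad \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\bigl( A_{2,N}^{\hat\xi} \bigr) = 0, \qquad \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\bigl( A_{3,N}^{\hat\xi} \bigr) = \bigl\langle \varphi_c^{(1)},\ e^{t \mathcal L_c} \varphi_c^{(1)} \bigr\rangle_{\hat{\mathcal P}^c_0}. \tag{69} \]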
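Let us first prove the first limit in (69). By several applications of the Cauchy-Schwarz inequality and due to the identity $\mathbf P_N^{\hat\xi} = \int m_N^{\hat\xi}(dx)\, \mathbf P_{N,x}^{\hat\xi}$, we get
\[ \bigl| E_{\hat{\mathcal P}^c}\bigl( A_{1,N}^{\hat\xi} \bigr) \bigr| \le E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\bigl( V_N^{\hat\xi} \setminus C_M \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\Bigl( \varphi_N^{\hat\xi}(x)^2\, E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \varphi_N^{\hat\xi}(\omega_t)^2 \bigr) \Bigr) \Bigr) \]
\[ \le E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\bigl( V_N^{\hat\xi} \setminus C_M \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/4}\Bigl( m_N^{\hat\xi}\bigl( \varphi_N^{\hat\xi}(x)^4 \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/4}\Bigl( E_{\mathbf P_N^{\hat\xi}}\bigl( \varphi_N^{\hat\xi}(\omega_t)^4 \bigr) \Bigr) \]
\[ = E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\bigl( V_N^{\hat\xi} \setminus C_M \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\bigl( \varphi_N^{\hat\xi}(x)^4 \bigr) \Bigr), \]
where the last identity follows from the stationarity of $m_N^{\hat\xi}$ under the dynamics generated by $\mathcal L_N^{\hat\xi}$. Due to the dominated convergence theorem, the first expectation on the r.h.s. goes to 0, while the second expectation is bounded due to (67).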
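In order to prove the second limit in (69), we apply twice the Cauchy-Schwarz inequality in order to bound $\bigl| E_{\hat{\mathcal P}^c}\bigl( A_{2,N}^{\hat\xi} \bigr) \bigr|$ by
\[ E_{\hat{\mathcal P}^c}^{1/2}\Bigl( m_N^{\hat\xi}\bigl( \varphi_N^{\hat\xi}(x)^2 \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/4}\Bigl( E_{\mathbf P_N^{\hat\xi}}\bigl( \varphi_N^{\hat\xi}(\omega_t)^4 \bigr) \Bigr)\, E_{\hat{\mathcal P}^c}^{1/4}\Bigl( m_N^{\hat\xi}\bigl( \chi(x \in C_M)\, \mathbf P_{N,x}^{\hat\xi}(\tau_N^{\hat\xi} \le t) \bigr) \Bigr). \tag{70} \]
Again, because of stationarity and (67), the first two factors on the r.h.s. are bounded while the last one converges to 0 due to Lemma 5 below.

Finally we prove the last limit in (69). To this aim, given $\hat\xi \in \hat\Omega = D([0, \infty), \hat{\mathcal N}_0)$ and $x \in C_M$, we set
\[ \tau_{N,x}(\hat\xi) = \inf\bigl\{ s \ge 0 : x + X_s(\hat\xi) \notin C_{2N-2} \bigr\}, \tag{71} \]
where $X_s(\hat\xi)$ is defined as in (19). Note that for $x \in C_M \cap \mathrm{supp}(\hat\xi)$,
\[ \varphi_N^{\hat\xi}(x) = \varphi_c^{(1)}(S_x \hat\xi), \qquad E_{\mathbf P_{N,x}^{\hat\xi}}\bigl( \chi(\tau_N^{\hat\xi} > t)\, \varphi_N^{\hat\xi}(\omega_t) \bigr) = E_{\hat{\mathbf P}^c_{S_x \hat\xi}}\bigl( \chi(\tau_{N,x} > t)\, \varphi_c^{(1)}(\hat\xi_t) \bigr). \]
Therefore
\[ E_{\hat{\mathcal P}^c}\bigl( A_{3,N}^{\hat\xi} \bigr) = E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \varphi_c^{(1)}(S_x \hat\xi)\, E_{\hat{\mathbf P}^c_{S_x \hat\xi}}\bigl( \chi(\tau_{N,x} > t)\, \varphi_c^{(1)}(\hat\xi_t) \bigr) \Bigr] \Bigr). \]
On the other hand, by applying the Cauchy-Schwarz inequality as in (70) and due to Lemma 5, we obtain
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, |\varphi_c^{(1)}(S_x \hat\xi)|\, E_{\hat{\mathbf P}^c_{S_x \hat\xi}}\bigl( \chi(\tau_{N,x} \le t)\, |\varphi_c^{(1)}(\hat\xi_t)| \bigr) \Bigr] \Bigr) = 0. \]
The last two identities imply
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\bigl( A_{3,N}^{\hat\xi} \bigr) = \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \varphi_c^{(1)}(S_x \hat\xi)\, E_{\hat{\mathbf P}^c_{S_x \hat\xi}}\bigl( \varphi_c^{(1)}(\hat\xi_t) \bigr) \Bigr] \Bigr). \tag{72} \]
Observe now that (63) remains valid if the integral is performed on $C_M$ in place of $C_{2N-2}$ (the arguments used in the proof there work also in this case) and the function $h(\hat\xi)$ is defined as
\[ h(\hat\xi) = \varphi_c^{(1)}(\hat\xi)\, E_{\hat{\mathbf P}^c_{\hat\xi}}\bigl( \varphi_c^{(1)}(\hat\xi_t) \bigr) = \varphi_c^{(1)}(\hat\xi)\, \bigl( e^{t \mathcal L_c} \varphi_c^{(1)} \bigr)(\hat\xi). \]
Note that $h \in L^2(\hat{\mathcal N}_0, \hat{\mathcal P}^c_0)$. Therefore we can conclude that the r.h.s. of (72) is equal to $\bigl\langle \varphi_c^{(1)},\ e^{t \mathcal L_c} \varphi_c^{(1)} \bigr\rangle_{\hat{\mathcal P}^c_0}$.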
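Lemma 5. Let $\tau_N^{\hat\xi}$ and $\tau_{N,x}$ be defined as in (68) and (71), and let $M = 2N - 2[N^\alpha]$. Then
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \mathbf P_{N,x}^{\hat\xi}\bigl( \tau_N^{\hat\xi} \le t \bigr) \Bigr] \Bigr) = 0, \tag{73} \]
\[ \lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \hat{\mathbf P}^c_{S_x \hat\xi}\bigl( \tau_{N,x} \le t \bigr) \Bigr] \Bigr) = 0. \tag{74} \]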
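Proof. One can check by a coupling argument that the two expectations in (73) and (74) coincide: for each $N \in \mathbb N_+$, $\hat\xi \in \hat{\mathcal N}$ and $x \in C_M \cap \mathrm{supp}(\hat\xi)$, one can define a probability measure $\mu$ on $\Omega_N^{\hat\xi} \times \hat\Omega$ such that
\[ \mu(A \times \hat\Omega) = \mathbf P_{N,x}^{\hat\xi}(A), \qquad \mu(\Omega_N^{\hat\xi} \times B) = \hat{\mathbf P}^c_{S_x \hat\xi}(B), \qquad \forall\, A \in \mathcal B(\Omega_N^{\hat\xi}),\ \forall\, B \in \mathcal B(\hat\Omega), \]
and such that, $\mu$ almost surely, $\tau_N^{\hat\xi}(\omega) = \tau_{N,x}(\hat\xi)$ and $\omega_s = x + X_s(\hat\xi)$ for any $0 \le s < \tau_N^{\hat\xi}$. Such a coupling $\mu$ implies $\hat{\mathbf P}^c_{S_x \hat\xi}\bigl( \tau_{N,x} \le t \bigr) = \mathbf P_{N,x}^{\hat\xi}\bigl( \tau_N^{\hat\xi} \le t \bigr)$. Thus we need to prove only (73). Moreover, without loss of generality, we assume $r_c = 1$.

To this aim let us cover $C_{2N-2} \setminus C_M$ by disjoint cubes $C_{1,i}$ of side 1, $i \in I$, so that $C_{2N-2} \setminus C_M = \cup_{i \in I} C_{1,i}$ (the boundaries of these cubes are suitably chosen for them to be disjoint). Finally, given a positive integer $n$, we set
\[ I_*^n = \{ (l_1, \dots, l_n) \in I^n : l_j \ne l_k \text{ if } j \ne k \}. \]
For paths $\omega$ such that $\tau_N^{\hat\xi}(\omega) < \infty$, let us define $k = k(\omega)$ as the number of different cubes $C_{1,i}$, $i \in I$, visited by the particle in the time interval $[0, \tau_N^{\hat\xi}(\omega))$, and moreover we define by induction $(i_1, \dots, i_k) \in I_*^k$, $(x_1, \dots, x_k) \in (C_{2N-2} \setminus C_M)^k$ with $x_j \in C_{1,i_j}$ for all $1 \le j \le k$, and $(t_1, \dots, t_k)$ as follows. Let $x_1$ be the first point reached in $C_{2N-2} \setminus C_M$ and $t_1$ be the time spent in $x_1$ before jumping away. The index $i_1$ is characterized by the requirement that $x_1 \in C_{1,i_1}$. Suppose now that $i_1, \dots, i_j$, $x_1, \dots, x_j$ and $t_1, \dots, t_j$ have been defined and that $j < k$. Then $x_{j+1}$ is the first point in $C_{2N-2} \setminus (C_M \cup C_{1,i_1} \cup \dots \cup C_{1,i_j})$ visited during the time interval $[0, \tau_N^{\hat\xi}(\omega))$ and $t_{j+1}$ is the time spent at $x_{j+1}$ during such a first visit. Moreover, $i_{j+1}$ is such that $x_{j+1} \in C_{1,i_{j+1}}$.

Now let $T_i^{\hat\xi}$, $i \in I$ and $\hat\xi \in \hat{\mathcal N}$, be a family of independent exponential random variables (all independent from the above random objects) such that $T_i^{\hat\xi}$ has parameter $\hat\xi(\tilde C_{1,i})$, where
\[ \tilde C_{1,i} = \{ y \in \mathbb R^d : \mathrm{dist}(y, C_{1,i}) \le 1 \}. \]
Since, given $\hat\xi$, $k$ and $(x_1, \dots, x_k)$, the $t_j$ ($1 \le j \le k$) are independent exponential variables and $t_j$ has parameter not larger than $\hat\xi(\tilde C_{1,i_j})$, and since $k \ge k_{\min} := [N^\alpha] - 1$, we obtain
\[ E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \mathbf P_{N,x}^{\hat\xi}\bigl( \tau_N^{\hat\xi} \le t \bigr) \Bigr] \Bigr) = \sum_{n = k_{\min}}^{|I|} \sum_{l \in I_*^n} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M) \sum_{y \in \prod_{j=1}^n (C_{1,l_j} \cap V_N^{\hat\xi})} \mathbf P_{N,x}^{\hat\xi}\bigl( \tau_N^{\hat\xi} \le t,\ k = n,\ x_r = y_r,\ 1 \le r \le n \bigr) \Bigr] \Bigr) \]
\[ \le \sum_{n = k_{\min}}^{|I|} \sum_{l \in I_*^n} E_{\hat{\mathcal P}^c}\Bigl( m_N^{\hat\xi}\Bigl[ \chi(x \in C_M)\, \mathbf P_{N,x}^{\hat\xi}\bigl( k = n,\ i_1 = l_1, \dots, i_n = l_n \bigr) \Bigr] \times \mathrm{Prob}\bigl( T_{l_1}^{\hat\xi} + \dots + T_{l_n}^{\hat\xi} \le t \bigr) \Bigr), \tag{75} \]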
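where the last inequality follows from the bound
\[ \mathbf P_{N,x}^{\hat\xi}\bigl( \tau_N^{\hat\xi} \le t \mid k = n,\ x_1 = y_1, \dots, x_n = y_n \bigr) \le \mathrm{Prob}\bigl( T_{l_1}^{\hat\xi} + \dots + T_{l_n}^{\hat\xi} \le t \bigr). \]
In order to estimate the probability in the r.h.s., we use an argument similar to that of the proof of Proposition 1 in Appendix C. Let us define $m := E_{\hat{\mathcal P}^c}\bigl( \hat\xi(\tilde C_1) \bigr)$, where $\tilde C_1 = \{ y \in \mathbb R^d : \mathrm{dist}(y, C_1) \le 1 \}$. Given $\kappa > 0$ and $l \in I_*^n$ as above, we define $A = A(\kappa, l)$ as follows:
\[ A = \Bigl\{ \hat\xi \in \hat{\mathcal N} : \bigl| \{ j : 1 \le j \le n \text{ and } \hat\xi(\tilde C_{1,l_j}) > \kappa m \} \bigr| > \frac{n}{2} \Bigr\}. \]
Then, by the Chebyshev inequality and the stationarity of $\hat{\mathcal P}^c$,
\[ \hat{\mathcal P}^c(A) \le \frac{2}{n}\, E_{\hat{\mathcal P}^c}\Bigl( \bigl| \{ j : 1 \le j \le n \text{ and } \hat\xi(\tilde C_{1,l_j}) > \kappa m \} \bigr| \Bigr) \le 2\, \hat{\mathcal P}^c\bigl( \hat\xi(\tilde C_1) > \kappa m \bigr) \to 0, \]
as $\kappa \to \infty$. Note that the complement $A^c$ of $A$ can be written as
\[ A^c = \Bigl\{ \hat\xi \in \hat{\mathcal N} : \bigl| \{ j : 1 \le j \le n \text{ and } \hat\xi(\tilde C_{1,l_j}) \le \kappa m \} \bigr| \ge \Bigl[ \frac{n}{2} \Bigr]_* \Bigr\}, \]
where $[n/2]_*$ is defined as $n/2$ for $n$ even and as $(n+1)/2$ for $n$ odd. If $\hat\xi \in A^c$ then at least $[n/2]_*$ of the exponential variables $T_{l_1}^{\hat\xi}, \dots, T_{l_n}^{\hat\xi}$ have parameter not larger than $\kappa m$. Then, by a coupling argument (e.g. Appendix C), we get for all $\hat\xi \in A^c$,
\[ \mathrm{Prob}\bigl( T_{l_1}^{\hat\xi} + \dots + T_{l_n}^{\hat\xi} \le t \bigr) \le e^{-\kappa m t} \sum_{r = [n/2]_*}^{\infty} \frac{(\kappa m t)^r}{r!} =: \phi(\kappa, n). \]
Due to the above estimates and since $n \ge k_{\min} = [N^\alpha] - 1$, we get
\[ \text{r.h.s. of (75)} \le 2\, \hat{\mathcal P}^c\bigl( \hat\xi(\tilde C_1) > \kappa m \bigr) + \phi(\kappa, N^\alpha). \]
The lemma follows by taking first the limit $N \uparrow \infty$ and then the limit $\kappa \uparrow \infty$.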
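The function $\phi(\kappa, n)$ is the probability that a Poisson variable of mean $\kappa m t$ is at least $[n/2]_*$, which is why it vanishes as $n \to \infty$ for fixed $\kappa$. The following sketch evaluates it numerically; the function names and the default values `m = 1`, `t = 1` are illustrative assumptions, not quantities fixed by the text.

```python
import math


def poisson_tail(x, r0):
    """P(Poisson(x) >= r0) = exp(-x) * sum_{r >= r0} x**r / r!."""
    if r0 <= 0:
        return 1.0
    term = math.exp(-x)          # r = 0 term of the Poisson weights
    head = term
    for r in range(1, r0):
        term *= x / r            # x**r / r! built up recursively
        head += term
    return max(0.0, 1.0 - head)


def phi(kappa, n, m=1.0, t=1.0):
    """phi(kappa, n) with lower summation index [n/2]_*."""
    r0 = (n + 1) // 2            # n/2 for n even, (n + 1)/2 for n odd
    return poisson_tail(kappa * m * t, r0)
```

For fixed `kappa` the values `phi(kappa, n)` decrease to 0 as `n` grows, which corresponds to taking $N \uparrow \infty$ before $\kappa \uparrow \infty$ in the last step of the proof.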
5.3. Random resistor networks. We conclude this section by pointing out that the diffusion coefficient $D_N^{\hat\xi}$ of the periodized medium can be expressed in terms of the effective conductance of the graph $(\bar V_N^{\hat\xi}, \bar E_N^{\hat\xi})$ when assigning suitable bond conductances. More precisely, consider the electrical network given by the graph $(\bar V_N^{\hat\xi}, \bar E_N^{\hat\xi})$, where the bond $\{x, y\} \in \bar E_N^{\hat\xi}$ has conductivity $c(\{\pi(x), \pi(y)\})$ with $c(\{\cdot, \cdot\})$ defined in (54). Then, the effective conductance $G_N^{\hat\xi}$ of this network is defined as the current flowing from $\Gamma_N^-$ to $\Gamma_N^+$ when a unit potential difference between $\Gamma_N^-$ and $\Gamma_N^+$ is imposed. It can be calculated from Ohm's law and the Kirchhoff rule as follows. Let the electrical potential $V(x)$ vanish on the left border $\Gamma_N^-$, be equal to 1 on the right border $\Gamma_N^+$, and satisfy:
\[ \sum_{y :\ \{y, x\} \in \bar E_N^{\hat\xi}} c(\{\pi(x), \pi(y)\})\, \bigl( V(y) - V(x) \bigr) = 0 \qquad \text{for any } x \in Q_N^{\hat\xi}. \]
Then the effective conductance is given by the current flowing through the surfaces $\{ x \in [-N, N]^d : x^{(1)} = \pm N \}$:
\[ G_N^{\hat\xi} = \sum_{x \in B_N^{\hat\xi -}} V(x) = \sum_{x \in B_N^{\hat\xi +}} \bigl( 1 - V(x) \bigr). \tag{76} \]
By a well-known analogy it is linked to the diffusion coefficient $D_N^{\hat\xi}$ (see e.g. [DFGW, Prop. 4.15] for a similar proof):

Proposition 8. One has
\[ D_N^{\hat\xi} = \frac{8 N^2}{|\bar V_N^{\hat\xi}|}\, G_N^{\hat\xi}. \tag{77} \]
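The Kirchhoff problem defining the effective conductance can be solved numerically on a small network. The sketch below is a generic illustration, not the construction of the paper: the function name `effective_conductance`, the Gauss-Seidel relaxation and the toy networks in the usage note are assumptions made here. The potential is fixed to 0 on the left boundary nodes and to 1 on the right ones, and the returned value is the total current leaving the left boundary.

```python
def effective_conductance(nodes, edges, left, right, sweeps=2000):
    """Effective conductance between two boundary sets of a resistor network.

    nodes: iterable of hashable node labels
    edges: dict mapping (x, y) pairs to bond conductances
    left, right: sets of nodes held at potential 0 and 1, respectively
    """
    nbrs = {x: [] for x in nodes}
    for (x, y), c in edges.items():
        nbrs[x].append((y, c))
        nbrs[y].append((x, c))
    V = {x: 0.5 for x in nodes}
    for x in left:
        V[x] = 0.0
    for x in right:
        V[x] = 1.0
    interior = [x for x in nodes if x not in left and x not in right]
    # Gauss-Seidel sweeps: enforce the Kirchhoff rule at every interior node.
    for _ in range(sweeps):
        for x in interior:
            total = sum(c for _, c in nbrs[x])
            if total > 0:
                V[x] = sum(c * V[y] for y, c in nbrs[x]) / total
    # Ohm's law: current through the bonds attached to the left boundary.
    return sum(c * (V[y] - V[x]) for x in left for y, c in nbrs[x])
```

For instance, a chain of four unit conductances in series gives 1/4, and two parallel paths of two unit conductances each give 1, as expected from the series and parallel rules.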
6. Percolation Estimates

Let us set $\mathcal F_r := \mathcal F_{\mathbb R^d \setminus C_r}$ and recall that $\rho_c = \rho\, \delta_c$ with $\delta_c = \nu([-E_c, E_c])$.

6.1. Point density estimates. Here we show how the ergodic properties of Lemma 4 combined with the hypothesis (H1) or (H2) imply (57).

Proposition 9. Suppose that $\rho_8 < \infty$ and that the hypothesis (H1) or (H2) holds. For $1 \le p \le 8$,
\[ \lim_{N \uparrow \infty} \frac{\rho_c\, \ell(C_N)}{\hat\xi(C_N) + a_N} = 1 \qquad \text{in } L^p(\hat{\mathcal N}, \hat{\mathcal P}^c), \tag{78} \]
where $a_N = (N - 1)^{d-1}$.

We will first prove the following criterion.

Lemma 6. Property (78) holds if one has, for some $0 < \rho' < \rho$,
\[ \lim_{N \uparrow \infty} N^p\, \hat{\mathcal P}\bigl( \hat\xi(C_N) \le \rho' N^d \bigr) = 0. \tag{79} \]

Proof. We first check that (79) implies that, for some $0 < \rho'' < \rho' \delta_c$,
\[ \lim_{N \uparrow \infty} N^p\, \hat{\mathcal P}^c\bigl( \hat\xi(C_N) \le \rho'' N^d \bigr) = 0. \tag{80} \]
If $\delta_c = 1$, this is clearly true so let us suppose that $0 < \delta_c < 1$. Set $\tilde\delta_c = 1 - \delta_c$. If $C_j^k$ denotes the binomial coefficient, we have
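\[ \hat{\mathcal P}^c\bigl( \hat\xi(C_N) \le \rho'' N^d \bigr) = \sum_{k=0}^{[\rho'' N^d]} \hat{\mathcal P}\bigl( \hat\xi(C_N) = k \bigr) + \sum_{k = [\rho'' N^d] + 1}^{\infty} \hat{\mathcal P}\bigl( \hat\xi(C_N) = k \bigr) \sum_{j = k - [\rho'' N^d]}^{k} C_j^k\, \tilde\delta_c^{\,j}\, \delta_c^{\,k-j} \]
\[ \le \sum_{k=0}^{[\rho' N^d]} \hat{\mathcal P}\bigl( \hat\xi(C_N) = k \bigr) + \sup_{k > [\rho' N^d]} \sum_{j = k - [\rho'' N^d]}^{k} C_j^k\, \tilde\delta_c^{\,j}\, \delta_c^{\,k-j} \]
\[ \le \hat{\mathcal P}\bigl( \hat\xi(C_N) \le \rho' N^d \bigr) + \exp\bigl( -c\, [\rho' N^d]\, (\delta_c - \rho''/\rho')^2 \bigr), \]
where the last inequality, given $\rho'' < \delta_c \rho'$, follows from a standard large deviation type estimate for Bernoulli variables with some $c > 0$. Multiplying by $N^p$, (79) thus implies (80).

Now set $A_N = \{ \hat\xi : \hat\xi(C_N) \le \rho'' N^d \}$. Then, for some $c > 0$ independent of $N$,
\[ f_N(\hat\xi) := \Bigl| \frac{\rho_c\, \ell(C_N)}{\hat\xi(C_N) + a_N} - 1 \Bigr|^p \le c\, \rho_c^p\, N^p\, \chi_{A_N}(\hat\xi) + f_N(\hat\xi)\, \chi_{A_N^c}(\hat\xi). \]
Integrating w.r.t. $\hat{\mathcal P}^c$, the first term vanishes in the limit $N \uparrow \infty$ because of (80). For the second, let us first note that Lemma 4 implies that $\lim_{N \uparrow \infty} f_N\, \chi_{A_N^c} = 0$ holds $\hat{\mathcal P}^c$-a.s. Furthermore, $|f_N\, \chi_{A_N^c}| \le c'' < \infty$ uniformly in $N$ so that the dominated convergence theorem assures that $\lim_{N \uparrow \infty} E_{\hat{\mathcal P}^c}(f_N\, \chi_{A_N^c}) = 0$.

Proof of Proposition 9. Due to Lemma 6 we only need to show that (79) is satisfied for some $\rho' < \rho$. This is trivially true if (H1) holds. Hence let us consider the case where (H2) holds. This implies
\[ \bigl| E_{\hat{\mathcal P}}(f \mid \mathcal F_{r_2}) - E_{\hat{\mathcal P}}(f) \bigr| \le \|f\|_\infty\, r_1^d\, r_2^{d-1}\, h(r_2 - r_1), \qquad \hat{\mathcal P}\text{-a.s.}, \tag{81} \]
where $f$ is a bounded $\mathcal F_{C_{r_1}}$-measurable function.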
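Let $C_1^i$ denote the unit cube centered at $i \in \mathbb Z^d$ and $\check C_1^i$ be the interior of $C_1^i$. Let $I_N \subset \mathbb Z^d$ be such that $C_N = \cup_{i \in I_N} C_1^i$ and $\check C_1^i \cap \check C_1^j = \emptyset$ if $i \ne j$. Hence $|I_N| = N^d$. Given $M > 0$, set $\tilde Y_i(\hat\xi) = \min\{ \hat\xi(\check C_1^i), M/2 \}$ and $Y_i = \tilde Y_i - E_{\hat{\mathcal P}}(\tilde Y_i)$. Note that $Y_i$ is centered, $\mathcal F_{C_1^i}$-measurable and $\|Y_i\|_\infty \le M$. We choose $M$ large enough so that $\tilde\rho := E_{\hat{\mathcal P}}(\tilde Y_i) > \rho'$, which is possible because $\lim_{M \uparrow \infty} E_{\hat{\mathcal P}}(\tilde Y_i) = \rho > \rho'$. Now
\[ \bigl\{ \hat\xi(C_N) \le \rho' N^d \bigr\} \subset \Bigl\{ \sum_{i \in I_N} \tilde Y_i(\hat\xi) \le \rho' N^d \Bigr\} \subset \Bigl\{ \Bigl| \sum_{i \in I_N} Y_i(\hat\xi) \Bigr| \ge (\tilde\rho - \rho') N^d \Bigr\}. \]
Hence it is sufficient to show that, for $a > 0$,
\[ \lim_{N \uparrow \infty} N^p\, \hat{\mathcal P}\Bigl( \Bigl| \sum_{i \in I_N} Y_i \Bigr| \ge a N^d \Bigr) = 0. \tag{82} \]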
By the Chebyshev inequality, one has for any even $q \in \mathbb N$:
\[ \hat{\mathcal P}\Bigl( \Bigl| \sum_{i \in I_N} Y_i \Bigr| \ge a N^d \Bigr) \le \frac{1}{a^q N^{dq}} \sum_{i_1, \dots, i_q \in I_N} E_{\hat{\mathcal P}}\bigl( Y_{i_1} \cdots Y_{i_q} \bigr). \tag{83} \]
We will now bound the sum in the r.h.s. of (83). Let us define the norm $\|x\| = \max\{ |x^{(k)}| : 1 \le k \le d \}$ on $\mathbb R^d$ (recall that $x^{(k)}$ is the $k$th component of $x$) and introduce the notation $\mathbf i = (i_1, \dots, i_q)$, $\mathbf I_N = (I_N)^q$, and $r_j(\mathbf i) = \min\{ \|i_j - i_k\| : k = 1, \dots, q,\ k \ne j \}$. If $r_1(\mathbf i) = \dots = r_q(\mathbf i) = 0$, i.e., if each point appears at least twice in $(i_1, \dots, i_q)$, then use the bound $|E_{\hat{\mathcal P}}(Y_{i_1} \cdots Y_{i_q})| \le M^q$. The number of $\mathbf i \in \mathbf I_N$ satisfying this property is at most $c N^{dq/2}$ (here and below $c$ is a varying constant depending only on $d$ and on $q$). Suppose now that, say, $r_1(\mathbf i) = r \ge 1$. Then the open cubes $\check C_1^{i_2}, \dots, \check C_1^{i_q}$ are contained in $A := \mathbb R^d - C_{2r-1}^{i_1}$ and thus $Y_{i_2}, \dots, Y_{i_q}$ are $\mathcal F_A$-measurable. Using the conditional expectation, (81) and the fact that $Y_{i_1}$ is centered and $\|Y_{i_j}\|_\infty \le M$,
\[ \bigl| E_{\hat{\mathcal P}}\bigl( Y_{i_1} \cdots Y_{i_q} \bigr) \bigr| \le M^{q-1}\, E_{\hat{\mathcal P}}\bigl( \bigl| E_{\hat{\mathcal P}}(Y_{i_1} \mid \mathcal F_A) \bigr| \bigr) \le M^q\, h(2r - 2)\, (2r - 1)^{d-1}. \]
Note that $E_{\hat{\mathcal P}}(Y_{i_1} \cdots Y_{i_q})$ is invariant under permutations of the indices $i_1, \dots, i_q$. Hence
\[ \sum_{\mathbf i \in \mathbf I_N} \bigl| E_{\hat{\mathcal P}}\bigl( Y_{i_1} \cdots Y_{i_q} \bigr) \bigr| \le c \sum_{r=0}^{N} \sum_{\mathbf i \in \mathbf K_N(r)} \bigl| E_{\hat{\mathcal P}}\bigl( Y_{i_1} \cdots Y_{i_q} \bigr) \bigr| \le c\, M^q N^{dq/2} + c\, M^q \sum_{r=1}^{N} h(2r - 2)\, (2r - 1)^{d-1}\, |\mathbf K_N(r)|, \]
where $\mathbf K_N(r) = \{ \mathbf i \in \mathbf I_N : r_1(\mathbf i) = r,\ r_2(\mathbf i) \le r, \dots, r_q(\mathbf i) \le r \}$. One has
\[ |\mathbf K_N(r)| \le c\, N^{dq/2}\, r^{dq/2 - 1}. \tag{84} \]
In fact, on the set of points $i_1, \dots, i_q$ (treated as distinguishable) let us define a graph structure by connecting two points $i \sim j$ with a bond whenever $\|i - j\| \le r$. We call $G(\mathbf i)$ the resulting graph. Note that each connected component of $G(\mathbf i)$ has cardinality at least 2 whenever $\mathbf i \in \mathbf K_N(r)$, therefore $G(\mathbf i)$ has at most $q/2$ connected components. We claim that, given $1 \le l \le q/2$,
\[ \bigl| \{ \mathbf i \in \mathbf K_N(r) : G(\mathbf i) \text{ has } l \text{ connected components} \} \bigr| \le c\, (N/r)^{dl}\, r^{dq - 1}. \tag{85} \]
In order to prove (85), suppose that the connected component containing $i_1$ has cardinality $k_1$, while the other components have cardinality $k_2, \dots, k_l$ respectively. Each component can be built by first choosing one of its points in $I_N$ (there are $N^d$ possible choices), then its neighboring points w.r.t. $\sim$ (for each such neighboring point there are at most $c r^d$ possible choices) and then iteratively adding neighboring points w.r.t. $\sim$. Therefore, the $j$th component can be built in at most $c N^d r^{d(k_j - 1)}$ ways. If $j = 1$, since $i_1$ has a neighboring point at distance exactly $r$, the upper bound can be improved to $c N^d r^{d - 1 + d(k_1 - 2)}$. Summing over all possible $k_1, \dots, k_l$ such that $k_1 + \dots + k_l = q$, one gets (85). Since $r \le N$, (85) implies
\[ |\mathbf K_N(r)| \le \sum_{l=1}^{q/2} c\, (N/r)^{dl}\, r^{dq - 1} \le c\, (N/r)^{dq/2}\, r^{dq - 1}, \]
thus concluding the proof of (84). It implies
\[ \sum_{\mathbf i \in \mathbf I_N} \bigl| E_{\hat{\mathcal P}}\bigl( Y_{i_1} \cdots Y_{i_q} \bigr) \bigr| \le c\, M^q N^{dq/2} \Bigl( 1 + \sum_{r=1}^{\infty} h(2r - 2)\, r^{dq/2 + d - 2} \Bigr). \tag{86} \]
Provided that $dq > 2p$ and the sum over $r$ converges, that is, if $dq/2 - d \le 8$, we get the result (79) by combining (82), (83), and (86). Choosing for $q$ the smallest even integer larger than $16/d$, (79) is true for $1 \le p \le 8$ and $dq/2 - d \le 8$ as required.

6.2. Domination. Due to Proposition 9, we may apply the results of Sect. 5 so that combining with Proposition 3,
\[ D \ge \nu([-E_c, E_c])\, e^{-r_c - 4\beta E_c}\, \limsup_{N \to \infty} E_{\hat{\mathcal P}^c}\Bigl( \frac{8 N^2}{|\bar V_N^{\hat\xi}|}\, G_N^{\hat\xi} \Bigr). \tag{87} \]
In order to bound the conductance $G_N^{\hat\xi}$ for $N \gg r_c$ from below, we will discretize the space $\mathbb R^d$ using cubes of appropriate size and spacing. Given $r_2 \ge r_1 > 0$, let us then consider the following functions on $\hat{\mathcal N}$:
\[ \sigma_j(\hat\xi) := \chi\bigl( \hat\xi(C_{r_1} + r_2 j) > 0 \bigr), \qquad j \in \mathbb Z^d. \tag{88} \]
They form a random field $\Sigma = (\sigma_j)_{j \in \mathbb Z^d}$ on the probability space $(\hat{\mathcal N}, \hat{\mathcal P}^c)$. If $\hat{\mathcal P}$ is a PPP, the $\sigma_j$ are independent random variables. For a process with finite range correlations, this independence can also be assured by an adequate choice of $r_1$ and $r_2$, but in general the $\sigma_j$ are correlated. The side length $r_1$ and spacing $r_2$ are going to be chosen of order $O(r_c)$ in such a way that all points of neighboring cubes have a Euclidean distance less than $r_c$ and they are thus connected by an edge of the graph $(\bar V_N^{\hat\xi}, \bar E_N^{\hat\xi})$.

Next note that the $\sigma_j$ take values in $\{0, 1\}$. We shall consider the associated site percolation problem with bonds between nearest neighbors only [Gri]. For this purpose, we shall compare $\Sigma$ with a random field $Z^p = (z_j^p)_{j \in \mathbb Z^d}$ of independent and identically distributed random variables with $\mathrm{Prob}(z_j^p = 1) = p$ and $\mathrm{Prob}(z_j^p = 0) = 1 - p$. In this independent case, it is well-known that there is a critical probability $p_c(d) \in (0, 1)$ such that, if $p > p_c(d)$, there is almost surely a unique infinite cluster, while for $p < p_c(d)$ there is almost surely none [Gri]. We will need somewhat finer estimates for the supercritical regime. Let $|\cdot|$ denote the Euclidean norm in $\mathbb R^d$. A left-right crossing (LR-crossing) with length $k - 1$ of $C_{2N}$ of a configuration $(z_j^p)_{j \in \mathbb Z^d}$ is a sequence of distinct points $y_1, \dots, y_k$ in $C_{2N} \cap \mathbb Z^d$ such that $|y_i - y_{i+1}| = 1$ for $1 \le i < k$, $z_{y_i}^p = 1$ for $1 \le i \le k$, $y_1^{(1)} = -N$, $y_k^{(1)} = N$, $-N < y_i^{(1)} < N$ for $1 < i < k$, and finally $y_i^{(s)} = y_j^{(s)}$ for any $s \ge 3$ and for $1 \le i < j \le k$. Two crossings are called disjoint if all the involved $y_j$'s are distinct. In the same way, one defines disjoint LR-crossings for $(\sigma_j)_{j \in \mathbb Z^d}$. Note that this definition of LR-crossings for $d \ge 3$ uses LR-crossings in 2-dimensional slices only. For the random field $Z^p$, the techniques of [Gri, Sect. 2.6 and 11.3] transposed to site percolation imply that, if $p > p_c(2)$, there are positive constants $a = a(p)$, $b = b(p)$, and $c = c(p)$ such that for all $N \in \mathbb N_+$,
\[ \mathrm{Prob}\bigl( Z^p \text{ has less than } b N^{d-1} \text{ disjoint LR-crossings in } C_{2N} \bigr) \le c\, e^{-a N}. \tag{89} \]
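The coarse-graining (88) can be made concrete with a short sketch. It is written under the assumption that $C_{r}$ denotes the closed cube of side $r$ centred at the origin, so that $C_{r_1} + r_2 j$ is the cube of side $r_1$ centred at $r_2 j$; the function name `sigma_field` and the explicit list of sites `js` are illustrative choices, not notation from the text.

```python
def sigma_field(points, r1, r2, js):
    """Occupation variables sigma_j of the cubes of side r1 centred at r2 * j.

    points: iterable of d-dimensional points (tuples of floats)
    js:     iterable of lattice sites j (tuples of ints)
    Returns a dict mapping each site j to 0 or 1.
    """
    def in_cube(x, j):
        # x lies in the cube iff every coordinate is within r1/2 of the centre
        return all(abs(xk - r2 * jk) <= r1 / 2 for xk, jk in zip(x, j))

    return {j: int(any(in_cube(x, j) for x in points)) for j in js}
```

With the choice $r_1 = r$, $r_2 = 2r$ used later in Lemma 7, cubes attached to different sites are separated by a gap of width $r$, which is what makes the conditioning argument of that lemma possible.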
In order to transpose this result on Z p to one for , we will use the concept of stochastic dominance [Gri, Sect. 7.4]. One writes ≥st Z p whenever EPˆ c (f ()) ≥ EProb (f (Z p )) ,
(90)
for any bounded, increasing, measurable function f : {0, 1}Z → R (recall that a function is increasing if f ((zj )j ∈Zd ) ≥ f ((zj )j ∈Zd ) whenever zj ≥ zj for all j ∈ Zd ). As the event on the l.h.s. of (89) is decreasing, ≥st Z p with p > pc (2) implies that for all N ∈ N+ , Pˆ c (σj )j ∈Zd has less than bN d−1 disjoint LR–crossings in C2N ≤ c e−a N . (91) d
Moreover, let us call the configurations $\hat\xi$ in the set on the l.h.s. $N$-bad, those in the complementary set $N$-good. For every $N$-good configuration $\hat\xi$, let us fix a set of at least $bN^{d-1}$ disjoint LR-crossings in $C_{2N}$ for $(\sigma_j(\hat\xi))_{j\in\mathbb{Z}^d}$ and denote it $C_N(\hat\xi)$. Given an LR-crossing $\gamma$ in $C_{2N}$, we write $L(\gamma)$ for its length. Note that, since the LR-crossings are self-avoiding, $L(\gamma) = |\mathrm{supp}(\gamma)| - 1$ for all $\gamma\in C_N(\hat\xi)$. Moreover, since paths in $C_N(\hat\xi)$ are disjoint and have support in $C_{2N}\cap\mathbb{Z}^d$, $\sum_{\gamma\in C_N(\hat\xi)} |\mathrm{supp}(\gamma)| \le (2N+1)^d$. The above estimates imply that $\sum_{\gamma\in C_N(\hat\xi)} L(\gamma) \le (2N+1)^d \le (4N)^d$. In particular, due to the Jensen inequality, for any $N$-good configuration $\hat\xi$,
$$\sum_{\gamma\in C_N(\hat\xi)} \frac{1}{L(\gamma)} \ge \frac{|C_N(\hat\xi)|^2}{\sum_{\gamma\in C_N(\hat\xi)} L(\gamma)} \ge \frac{b^2 N^{d-2}}{4^d}. \tag{92}$$
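The Jensen step behind (92) can be spelled out as follows. This is only a sketch of the standard convexity argument; the shorthand $m := |C_N(\hat\xi)|$ is introduced here for illustration and is not notation from the text. Convexity of $u\mapsto 1/u$ on $(0,\infty)$ is applied to the uniform average over the $m \ge bN^{d-1}$ crossings, and the bound $\sum_\gamma L(\gamma)\le (4N)^d$ is used in the last step.

```latex
\frac{1}{m}\sum_{\gamma\in C_N(\hat\xi)} \frac{1}{L(\gamma)}
  \;\ge\; \Big(\frac{1}{m}\sum_{\gamma\in C_N(\hat\xi)} L(\gamma)\Big)^{-1}
\quad\Longrightarrow\quad
\sum_{\gamma\in C_N(\hat\xi)} \frac{1}{L(\gamma)}
  \;\ge\; \frac{m^2}{\sum_{\gamma} L(\gamma)}
  \;\ge\; \frac{(bN^{d-1})^2}{(4N)^d}
  \;=\; \frac{b^2 N^{d-2}}{4^d}.
```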
This will allow us to prove a lower bound on (87). Hence we need the following criterion for domination.

Lemma 7. $(\sigma_j)_{j\in\mathbb{Z}^d}\ge_{st} Z^p$ holds with $r_1 = r$, $r_2 = 2r$ if $\hat P$ and $r>0$ satisfy the following: There exists $\rho'>0$ such that
$$r^d\,\nu([-E_c,E_c]) \ge -\frac{\ln(p/2)}{\rho'}, \tag{93}$$
and
$$\hat P\big(\hat\xi(C_r) < \rho' r^d \,\big|\, \mathcal{F}_{2r}\big) \le 1 - \frac{3p}{2}, \qquad \hat P\text{-a.s.} \tag{94}$$
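Proof. The proof is based on the following criterion [Gri, Sect. 7.4]: if for any finite subset $J$ of $\mathbb{Z}^d$, $i\in\mathbb{Z}^d\setminus J$ and $z_j\in\{0,1\}$ for $j\in J$ satisfying $\hat P_c(\sigma_j = z_j\ \forall j\in J) > 0$, one has
$$\hat P_c(\sigma_i = 1 \mid \sigma_j = z_j\ \forall j\in J) \ge p, \tag{95}$$
then $(\sigma_j)_{j\in\mathbb{Z}^d}\ge_{st} Z^p$. Hence let $J$, $i$, $z_j$ be as above and set $\tilde\delta_c := 1-\delta_c$ and $J_0 := \{j\in J : z_j = 0\}$ as well as $J_1 := \{j\in J : z_j = 1\}$. Moreover, given $k\in\mathbb{N}^{J_0}$ and $s\in\mathbb{N}_+^{J_1}$, let
$$W(k,s) := \big\{\hat\xi\in\hat{\mathcal N} : \hat\xi(C_r+2rj) = k_j\ \forall j\in J_0,\ \hat\xi(C_r+2rj) = s_j\ \forall j\in J_1\big\}.$$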
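Then
$$\hat P_c(\sigma_i = 0 \mid \sigma_j = z_j\ \forall j\in J) = \frac{\sum_{k\in\mathbb{N}^{J_0}}\sum_{s\in\mathbb{N}_+^{J_1}}\sum_{n\in\mathbb{N}} \hat P\big(\hat\xi(C_r+2ri) = n,\ W(k,s)\big)\,\tilde\delta_c^{\,n}\prod_{j\in J_0}\tilde\delta_c^{\,k_j}\prod_{j\in J_1}\big(1-\tilde\delta_c^{\,s_j}\big)}{\sum_{k\in\mathbb{N}^{J_0}}\sum_{s\in\mathbb{N}_+^{J_1}} \hat P\big(W(k,s)\big)\prod_{j\in J_0}\tilde\delta_c^{\,k_j}\prod_{j\in J_1}\big(1-\tilde\delta_c^{\,s_j}\big)}.$$
Within this, we can, moreover, replace
$$\hat P\big(\hat\xi(C_r+2ri) = n,\ W(k,s)\big) = \hat P\big(\hat\xi(C_r+2ri) = n \mid W(k,s)\big)\,\hat P\big(W(k,s)\big).$$
Finally, note that $W(k,s)\in\mathcal{F}_A$, where $A = \mathbb{R}^d\setminus(C_{2r}+2ri)$. As $\tilde\delta_c\le e^{-\delta_c}$, we obtain the following bound
$$\sum_{n\in\mathbb{N}} \hat P\big(\hat\xi(C_r+2ri) = n \mid W(k,s)\big)\,\tilde\delta_c^{\,n} \le \hat P\big(\hat\xi(C_r+2ri) < \rho' r^d \mid W(k,s)\big) + e^{-\delta_c\rho' r^d}.$$
Due to the stationarity of $\hat P$, (93) and (94) imply (95).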
6.3. Proof of Theorem 1(ii). We fix $p > p_c(2)$ and $\rho' < \rho$. Then, given $E_c$, we choose $r_c$ such that (93) is satisfied, i.e. $r_c = c\,(E_c^{\alpha+1})^{-1/d}$ for some constant $c$. As $r_c\uparrow\infty$ in the limit of low temperature, we can next check that the condition (94) also holds. This is trivial for a process with a uniform lower bound (4) on the point density. For a mixing point process satisfying (5), one has
$$\hat P\big(\hat\xi(C_r) < \rho' r^d \,\big|\, \mathcal{F}_{2r}\big) \le \hat P\big(\hat\xi(C_r) < \rho' r^d\big) + r^d(2r)^{d-1}h(r), \qquad \hat P\text{-a.s.}$$
Due to the hypothesis on $h$, the second term converges to 0 in the limit $r\uparrow\infty$. If $\rho' < \rho$, the first one can be bounded by the Chebyshev inequality:
$$\hat P\big(\hat\xi(C_r)\le\rho' r^d\big) \le \hat P\Big(\Big|\frac{\hat\xi(C_r)}{\ell(C_r)} - \rho\Big| > \rho-\rho'\Big) \le \frac{1}{\rho-\rho'}\int\hat P(d\hat\xi)\,\Big|\frac{\hat\xi(C_r)}{\ell(C_r)} - \rho\Big|,$$
where $\ell(C_r)$ denotes the Lebesgue measure of $C_r$. By Lemma 4, the expression on the r.h.s. can be made arbitrarily small by choosing $r$ sufficiently large, thus implying that (94) is satisfied for $r$ sufficiently large. In conclusion, due to Lemma 7, (91) holds for $r$ large enough, i.e. temperature low enough.

We fix such a value $r$ satisfying (91) and call it $r_p$. Consider the variables $(\sigma_j)_{j\in\mathbb{Z}^d}$ defined for $r_1 = r_p$, $r_2 = 2r_p$ and choose $r_c = (d+8)^{1/2}\, r_p$. This assures that, if neighboring sites $j$ and $j'$ in $\mathbb{Z}^d$ have $\sigma_j(\hat\xi) = \sigma_{j'}(\hat\xi) = 1$, then $C_{r_p}+2jr_p$ and $C_{r_p}+2j'r_p$ each contain a point and these points are separated by a distance less than $r_c$. Two neighboring sites $j$ and $j'$ in $\mathbb{Z}^d$ such that $\sigma_j(\hat\xi) = \sigma_{j'}(\hat\xi) = 1$ define a bond of the site percolation problem. To such a bond one can associate (at least) two points $x\in\mathrm{supp}\,\hat\xi\cap(C_{r_p}+2jr_p)$ and $y\in\mathrm{supp}\,\hat\xi\cap(C_{r_p}+2j'r_p)$ separated by a distance less than $r_c$. Given an integer $N$, we define
$$\hat N := \max\big\{n\in\mathbb{N} : C_{r_p}+2r_pj\subset C_{2[r_pN]},\ \forall j\in C_{2n}\cap\mathbb{Z}^d\big\}.$$
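The choice $r_c = (d+8)^{1/2} r_p$ made above can be checked numerically. The following is a minimal sketch, under the assumption (not restated in this passage) that $C_r$ is the cube $[-r/2, r/2]^d$ of side $r$ centered at the origin; the function name is hypothetical. Since the Euclidean distance is convex, its maximum over two cubes is attained at corners, so it suffices to enumerate corner pairs of $C_r$ and of the neighboring box $C_r + 2re_1$.

```python
import itertools
import math


def max_distance_between_neighbor_boxes(d, r):
    """Largest distance between a point of C_r and a point of C_r + 2*r*e_1.

    Assumes C_r = [-r/2, r/2]^d. The maximum of a convex function over a
    product of cubes is attained at corners, so only corners are enumerated.
    """
    corners = list(itertools.product((-r / 2, r / 2), repeat=d))
    shift = (2 * r,) + (0.0,) * (d - 1)  # neighboring site j' = j + e_1
    best = 0.0
    for x in corners:
        for c in corners:
            y = tuple(ci + si for ci, si in zip(c, shift))
            best = max(best, math.dist(x, y))
    return best


if __name__ == "__main__":
    r = 1.5
    for d in range(1, 5):
        print(d, max_distance_between_neighbor_boxes(d, r), math.sqrt(d + 8) * r)
```

Along the direction joining the two boxes the coordinates can differ by $3r$, and along each of the other $d-1$ directions by $r$, which gives $\sqrt{9 + (d-1)}\,r = \sqrt{d+8}\,r$.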
Note that $\hat N = O(N)$. If $j, j'\in C_{2\hat N}\cap\mathbb{Z}^d$, then the above associated points $x$ and $y$ are linked by an edge of the graph $(V^{\hat\xi}_{[r_pN]}, E^{\hat\xi}_{[r_pN]})$ defined in Sect. 5.1. Each LR-crossing of $C_{2\hat N}$ for the site percolation problem gives in a natural way a connected path of edges of the graph $(V^{\hat\xi}_{[r_pN]}, E^{\hat\xi}_{[r_pN]})$ which connects the boundary faces $\Gamma^{\pm}_{N}$.

For an $\hat N$-good configuration $\hat\xi$, we now bound the conductance $G^{\hat\xi}_{[r_pN]}$ from below. For this purpose, let us consider the random resistor network with vertices $Q^{\hat\xi}_{[r_pN]}\cup\{\hat\Gamma^-_N, \hat\Gamma^+_N\}$, where unit conductances are put on all edges in $E^{\hat\xi}_{[r_pN]}$ with vertices in $Q^{\hat\xi}_{[r_pN]}$ as well as between the two added boundary points $\hat\Gamma^{\pm}_N$ and all points of $B^{\hat\xi,\pm}_{[r_pN]}$. This new network is obtained from the one of Sect. 5.1 upon placing superconducting wires between all vertices of $\Gamma^+_{[r_pN]}$ and $\Gamma^-_{[r_pN]}$ so that they can be identified with a single point $\hat\Gamma^+_N$ and $\hat\Gamma^-_N$, respectively. The conductance $g^{\hat\xi}_N$ of this new network (defined as the current flowing from $\hat\Gamma^-_N$ to $\hat\Gamma^+_N$ when a unit potential difference is imposed between these two points) is precisely equal to $G^{\hat\xi}_{[r_pN]}$ because all points of $\Gamma^{\pm}_{[r_pN]}$ have the same potential (0 or 1 respectively) and each has links to all points of $B^{\hat\xi,\pm}_{[r_pN]}$ with equal conductances summing up to 1.

In order to bound $g^{\hat\xi}_N$ from below, we now invoke Rayleigh's monotonicity law, which states that eliminating links (i.e. conductances) from the network never increases its conductance. For a given $\hat N$-good configuration $\hat\xi$, we cut all links but those belonging to the family of disjoint paths associated to $C_{\hat N}(\hat\xi)$. Each of these paths $\gamma$ connecting $\hat\Gamma^+_N$ and $\hat\Gamma^-_N$ is self-avoiding and hence has a conductance bounded below by $1/L(\gamma)$. As all the paths of $C_{\hat N}(\hat\xi)$ are disjoint and they connect $\hat\Gamma^+_N$ and $\hat\Gamma^-_N$ in parallel, the conductance of the reduced network is the sum of the conductances of all paths, and it follows from (92) that $g^{\hat\xi}_N \ge c(b)N^{d-2}$ for some positive constant $c(b)$ depending on $b$. We therefore deduce that
$$E_{\hat P_c}\Big(\frac{[r_pN]^2}{|V^{\hat\xi}_{[r_pN]}|}\, G^{\hat\xi}_{[r_pN]}\Big) \ge c(b)\, E_{\hat P_c}\Big(\frac{[r_pN]^2}{|V^{\hat\xi}_{[r_pN]}|}\, N^{d-2}\,\chi(\hat\xi \text{ is } \hat N\text{-good})\Big).$$
Due to (91) and Proposition 9 the r.h.s. converges to a positive value. Combining this with the estimate (87) we obtain
$$D \ge C\,\nu([-E_c,E_c])\, e^{-r_c-4\beta E_c} \ge C' E_c^{1+\alpha}\exp\big(-c\,E_c^{-\frac{\alpha+1}{d}} - 4\beta E_c\big),$$
where $C$ and $C'$ are positive constants. Optimizing the exponent leads to $E_c = c'\beta^{-\frac{d}{\alpha+1+d}}$, which completes the proof.
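The optimization of the exponent in the last step can be illustrated numerically. The sketch below minimizes $E\mapsto c\,E^{-(\alpha+1)/d} + 4\beta E$ by a ternary search (valid because the function is convex on $E>0$) and compares the result with the closed-form stationary point $E_c = \big(c(\alpha+1)/(4\beta d)\big)^{d/(\alpha+1+d)}$. The function names, the constant `c = 1.0` and the search interval are assumptions made for illustration only.

```python
def exponent(E, beta, alpha, d, c=1.0):
    """Quantity to minimize: c * E^{-(alpha+1)/d} + 4 * beta * E, for E > 0."""
    return c * E ** (-(alpha + 1) / d) + 4 * beta * E


def optimal_cutoff(beta, alpha, d, c=1.0):
    """Closed-form minimizer obtained by setting the derivative to zero."""
    return (c * (alpha + 1) / (4 * beta * d)) ** (d / (alpha + 1 + d))


def minimize_numerically(beta, alpha, d, c=1.0, lo=1e-9, hi=1e3, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if exponent(m1, beta, alpha, d, c) < exponent(m2, beta, alpha, d, c):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


if __name__ == "__main__":
    beta, alpha, d = 10.0, 0.0, 3
    print(optimal_cutoff(beta, alpha, d), minimize_numerically(beta, alpha, d))
```

Doubling $\beta$ rescales the minimizer by $2^{-d/(\alpha+1+d)}$, which is the power law in $\beta$ stated above.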
A. Proof that the Random Walk is Well-Defined

Proposition 10. Let $P$ be ergodic with $\rho_2 < \infty$. Then for $P_0$-almost all $\xi\in\mathcal{N}_0$ and for all $x\in\hat\xi$, there exists a unique probability measure $P_x^\xi$ on $\Omega_\xi = D([0,\infty),\mathrm{supp}(\hat\xi))$ of a continuous-time random walk starting at $x$ whose transition probabilities $p_t^\xi(y|x) := P_x^\xi(X^\xi_{s+t} = y \mid X^\xi_s = x)$, $x,y\in\hat\xi$, $t\ge 0$, $s\ge 0$, satisfy the infinitesimal conditions (C1) and (C2).

Proof. The uniqueness follows from [Bre, Chap. 15]. In order to prove existence, due to the construction described in Sect. 3.2, we only need to prove (27) for $P_0$-almost all $\xi$ and for any $x\in\hat\xi$. According to [Bre, Prop. 15.43], condition (27) is implied by the following one:
$$\tilde P_x^\xi\Big(\sum_{n=0}^{\infty}\frac{1}{\lambda_{\tilde X_n^\xi}(\xi)} = \infty\Big) = 1. \tag{96}$$
Due to the identity
$$\tilde P_0^\xi\Big(\sum_{n=1}^{\infty}\frac{1}{\lambda_{\tilde X_n^\xi}(\xi)} = \infty \,\Big|\, \tilde X_1^\xi = x\Big) = \tilde P_x^\xi\Big(\sum_{n=0}^{\infty}\frac{1}{\lambda_{\tilde X_n^\xi}(\xi)} = \infty\Big), \qquad \forall x\in\hat\xi,$$
the proof will be completed if we can show (96) for $x = 0$ and $P_0$-almost all $\xi$ and, in particular, if we can show
$$\tilde P\Big(\sum_{n=0}^{\infty}\frac{1}{\lambda_0(\xi_n)} = \infty\Big) = \int Q_0(d\xi)\,\tilde P_0^\xi\Big(\sum_{n=0}^{\infty}\frac{1}{\lambda_{\tilde X_n^\xi}(\xi)} = \infty\Big) = 1,$$
where the distributions $\tilde P$, $\tilde P_0^\xi$, and $Q_0$ are defined in Sect. 3.3. Due to Proposition 2, $\tilde P$ is ergodic and therefore, according to ergodic theory (see [Ros, Chap. IV]),
$$\lim_{N\uparrow\infty}\frac{1}{N}\sum_{n=0}^{N}\frac{1}{\lambda_0(\xi_n)} = E_{Q_0}\Big(\frac{1}{\lambda_0}\Big) = \frac{1}{E_{P_0}(\lambda_0)}, \qquad \tilde P\text{-almost surely},$$
thus allowing to conclude the proof.
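The concluding step of this proof can be made explicit. This is a sketch, under the assumption that $E_{P_0}(\lambda_0)$ is finite, so that the ergodic limit is strictly positive; the shorthands $a_n := 1/\lambda_0(\xi_n)$ and $L := 1/E_{P_0}(\lambda_0)$ are introduced here only for illustration. On the event where the Cesàro averages converge to $L > 0$, the partial sums grow at least linearly and therefore diverge:

```latex
\frac{1}{N}\sum_{n=0}^{N} a_n \xrightarrow[N\uparrow\infty]{} L > 0
\quad\Longrightarrow\quad
\sum_{n=0}^{N} a_n \ge \frac{L}{2}\,N \ \text{ for all large } N
\quad\Longrightarrow\quad
\sum_{n=0}^{\infty} a_n = \infty .
```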
Remark 3. Explosions are excluded if $\sup_{x\in\hat\xi}\lambda_x(\xi) < \infty$ (in such a case (96) is always true), but this simple criterion is typically not satisfied in our case. For instance, for a PPP,
$$\sup_{x\in\hat\xi}\lambda_x(\xi) \ge e^{-4\beta}\sup_{x\in\hat\xi}\sum_{y\in\hat\xi,\,|y-x|\le 1} e^{-|x-y|} \ge e^{-4\beta-1}\sup_{x\in\hat\xi}\hat\xi(C_1+x) = \infty, \qquad P_0\text{-a.s.}$$
B. Proof of Lemma 1

Note that the statements (ii) and (iii) of Lemma 1 are proved in [FKAS, Corollary 1.2.11 and Theorem 1.3.9] in dimension $d = 1$. The proof below is valid for any dimension $d$.

Proof of Lemma 1. (i) Let $h(\xi,\xi') := k(\xi,\xi') - k(\xi',\xi)$. By the definition (11) of the Palm distribution $P_0$, $\forall N>0$, $\forall A\in\mathcal{B}(\mathbb{R}^d)$ and for any nonnegative measurable function $f$,
$$\int P_0(d\xi)\int_A\hat\xi(dx)\,f(\xi,S_x\xi) = \frac{1}{\rho N^d}\int P(d\xi)\int_{C_N}\hat\xi(dy)\int_{A+y}\hat\xi(dx)\,f(S_y\xi,S_x\xi). \tag{97}$$
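The antisymmetry of $h(\xi,\xi')$ and the identity above imply
$$\int P_0(d\xi)\int_{\mathbb{R}^d}\hat\xi(dx)\,h(\xi,S_x\xi) = \frac{1}{\rho N^d}\int P(d\xi)\int_{C_N}\hat\xi(dy)\int_{\mathbb{R}^d\setminus C_N}\hat\xi(dx)\,h(S_y\xi,S_x\xi). \tag{98}$$
Let us split the last integral into two integrals over $\mathbb{R}^d\setminus C_{N+\sqrt N}$ and over $C_{N+\sqrt N}\setminus C_N$. Using (97) again,
$$\Big|\frac{1}{\rho N^d}\int P(d\xi)\int_{C_N}\hat\xi(dy)\int_{\mathbb{R}^d\setminus C_{N+\sqrt N}}\hat\xi(dx)\,h(S_y\xi,S_x\xi)\Big| \le \int P_0(d\xi)\int_{\mathbb{R}^d\setminus C_{\sqrt N}}\hat\xi(dx)\,\big(|k(\xi,S_x\xi)| + |k(S_x\xi,\xi)|\big),$$
which converges to zero as $N\to\infty$ by the dominated convergence theorem. The same holds for
$$\frac{1}{\rho N^d}\int P(d\xi)\int_{C_N}\hat\xi(dy)\int_{C_{N+\sqrt N}\setminus C_N}\hat\xi(dx)\,h(S_y\xi,S_x\xi),$$
since, due to (97), it can be bounded by
$$\frac{1}{\rho N^d}\int P(d\xi)\int_{C_{N+\sqrt N}\setminus C_N}\hat\xi(dx)\int_{\mathbb{R}^d}\hat\xi(dy)\,\big(|k(S_y\xi,S_x\xi)| + |k(S_x\xi,S_y\xi)|\big) = \frac{(N+\sqrt N)^d - N^d}{N^d}\int P_0(d\xi)\int_{\mathbb{R}^d}\hat\xi(dy)\,\big(|k(S_y\xi,\xi)| + |k(\xi,S_y\xi)|\big).$$
Letting $N\to\infty$ in (98) leads to the result.

(ii) Since $\Gamma\in\mathcal{B}(\mathcal{N})$ is translation invariant, one has $\chi_{\Gamma_0}(S_x\xi) = \chi_\Gamma(\xi)$ for all $\xi\in\mathcal{N}$ and $x\in\hat\xi$. The above remark together with (11) gives
$$P_0(\Gamma_0) = \frac{1}{\rho}\int P(d\xi)\int_{C_1}\hat\xi(dx)\,\chi_{\Gamma_0}(S_x\xi) = \frac{1}{\rho}\int_\Gamma P(d\xi)\,\hat\xi(C_1).$$
Comparing with (1), this yields $P_0(\Gamma_0) = 1$ if $P(\Gamma) = 1$. Conversely, again due to (1), if $P_0(\Gamma_0) = 1$, one gets $\hat\xi(C_1) = 0$ for $P$-almost all $\xi\in\mathcal{N}\setminus\Gamma$, and by translation invariance $\xi = 0$ for $P$-almost all $\xi\in\mathcal{N}\setminus\Gamma$, thus implying that $P(\Gamma) = 1$.

(iii) Let us suppose that $P_0(A) = P_0(B) > 0$ and set $\Gamma := \bigcup_{x\in\mathbb{R}^d} S_xB$. This is a translation-invariant Borel subset of $\mathcal{N}$ (see Lemma 8) and $B\subset\Gamma\cap\mathcal{N}_0\subset A$. In particular, $P(\Gamma)\in\{0,1\}$ by the ergodicity of $P$. Since $\chi_B(S_y\xi)\le\chi_\Gamma(\xi)$ for all $\xi\in\mathcal{N}$ and $y\in\mathbb{R}^d$, it follows from (11) that
$$P_0(B) = \frac{1}{\rho}\int_{\mathcal{N}} P(d\xi)\int_{C_1}\hat\xi(dy)\,\chi_B(S_y\xi) \le \frac{1}{\rho}\int_\Gamma P(d\xi)\,\hat\xi(C_1).$$
Therefore, $P(\Gamma) = 0$ would imply that $P_0(B) = 0$, in contradiction with our assumption. Thus $P(\Gamma) = 1$. But $\Gamma\cap\mathcal{N}_0\subset A$, therefore the statement follows from (ii).

(iv) The thesis follows by observing that (11) implies
$$E_{P_0}\Big(\prod_{j=1}^{k}\hat\xi(A_j)\Big) = \frac{1}{\rho}\int_{\mathcal{N}} P(d\xi)\int_{C_1}\hat\xi(dx)\prod_{j=1}^{k}\hat\xi(A_j+x) \le \frac{1}{\rho}\int_{\mathcal{N}} P(d\xi)\,\hat\xi(C_1)\prod_{j=1}^{k}\hat\xi(\tilde A_j)$$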
and by applying the estimate $a_1\cdots a_{k+1}\le c(k+1)\,(a_1^{k+1}+\cdots+a_{k+1}^{k+1})$, $a_1,\dots,a_{k+1}\ge 0$.

Lemma 8. Let $A\in\mathcal{B}(\mathcal{N}_0)$. Then $\bigcup_{x\in\mathbb{R}^d} S_xA\in\mathcal{B}(\mathcal{N})$.
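Proof. Let us introduce the following lexicographic ordering on $\mathbb{R}^d$: $x\prec y$ if and only if either $|x| < |y|$ or $|x| = |y|$ and there is $k$, $1\le k\le d$, such that $x^{(k)} < y^{(k)}$ and $x^{(l)} = y^{(l)}$ for $l<k$ (here $x^{(k)}$ is the $k$-th component of the vector $x$). Given $\hat\xi\in\hat{\mathcal N}$, one can then order the support of $\hat\xi$ according to $\prec$:
$$\mathrm{supp}(\hat\xi) = \begin{cases}\{y_1(\hat\xi), y_2(\hat\xi),\dots,y_N(\hat\xi)\} & \text{if } N := \hat\xi(\mathbb{R}^d) < \infty,\\ \{y_j(\hat\xi)\}_{j\in\mathbb{N}_+} & \text{otherwise},\end{cases}$$
where $y_j\prec y_k$ whenever $j<k$. For any $n\in\mathbb{N}$, let $x_n : \hat{\mathcal N}\to\mathbb{R}^d$ then be defined as
$$x_n(\hat\xi) = \begin{cases} y_n(\hat\xi) & \text{if } n\le\hat\xi(\mathbb{R}^d),\\ y_N(\hat\xi) & \text{if } n > N := \hat\xi(\mathbb{R}^d).\end{cases}$$
Using an adequate family of finite disjoint covers of $\mathbb{R}^d$ and the fact that $\hat\xi\in\hat{\mathcal N}\mapsto\hat\xi(B)$ is a Borel function for every Borel set $B\subset\mathbb{R}^d$, one can verify that $x_n$ is a Borel function for each $n$. Moreover, $\mathrm{supp}(\hat\xi) = \{x_n(\hat\xi) : n\in\mathbb{N}\}$ for all $\hat\xi\in\hat{\mathcal N}$. Due to the definition of the Borel sets in $\mathcal{N}$ and $\hat{\mathcal N}$, the map $\pi : \mathcal{N}\to\hat{\mathcal N}$ given by $\pi(\xi) = \hat\xi$ is Borel, and by [MKM, Sect. 6.1] the function $F : \mathbb{R}^d\times\mathcal{N}\to\mathcal{N}$ given by $F(x,\xi) = S_x\xi$ is even continuous. Hence we conclude that
$$H_n(\xi) := F\big(x_n(\hat\xi),\xi\big) = S_{x_n(\hat\xi)}\xi, \qquad H_n : \mathcal{N}\to\mathcal{N}_0,$$
is a Borel function. Its restriction $\hat H_n : \mathcal{N}_0\to\mathcal{N}_0$ is then also a Borel function. Now given a Borel subset $A$ of $\mathcal{N}_0$, we conclude that $\Phi(A) := \bigcup_{n=1}^{\infty}\hat H_n^{-1}(A)$ is a Borel subset in $\mathcal{N}_0$. One can check that $\Phi(A) = \{\xi : \xi = S_x\eta \text{ for some } \eta\in A \text{ and } x\in\hat\eta\}$. Since $\mathcal{N}_0$ is a Borel subset of $\mathcal{N}$, it follows that $\Phi(A)$ is a Borel subset of $\mathcal{N}$, as is $H_1^{-1}(\Phi(A))$ since $H_1$ is a Borel function. The identity
$$H_1^{-1}\big(\Phi(A)\big) = \bigcup_{x\in\mathbb{R}^d} S_xA$$
now completes the proof.

C. Proof of Proposition 1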
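Proof of Proposition 1. Due to the construction of the dynamics given in Sect. 3.2,
$$E_{P_0}E_{P_0^\xi}\big(|X_t^\xi|^\gamma\big) = E_{P_0}E_{\tilde P_0^\xi\otimes Q}\big(|\tilde X^\xi_{n_*^\xi(t)}|^\gamma\big).$$
Let $p,q>1$ be such that $1/p+1/q = 1$. Due to the Hölder inequality,
$$E_{P_0}E_{\tilde P_0^\xi\otimes Q}\big(|\tilde X^\xi_{n_*^\xi(t)}|^\gamma\big) = \sum_{n=1}^{\infty} E_{P_0}E_{\tilde P_0^\xi\otimes Q}\Big(|\tilde X_n^\xi|^\gamma\,\chi\big(n_*^\xi(t)\ge 1\big)\,\chi\big(n_*^\xi(t) = n\big)\Big)$$
$$\le \sum_{n=1}^{\infty}\Big[E_{P_0}E_{\tilde P_0^\xi\otimes Q}\Big(|\tilde X_n^\xi|^{\gamma q}\,\chi\big(n_*^\xi(t)\ge 1\big)\Big)\Big]^{\frac1q}\Big[E_{P_0}\Big(\tilde P_0^\xi\otimes Q\big(n_*^\xi(t) = n\big)\Big)\Big]^{\frac1p}.$$
Clearly, $n_*^\xi(t)\ge 1$ means $T^\xi_{0,\tilde X_0^\xi}\le t$. It then follows from the estimate $1-e^{-u}\le u$, $u\ge 0$, that
$$E_{\tilde P_0^\xi\otimes Q}\Big(|\tilde X_n^\xi|^{\gamma q}\,\chi\big(n_*^\xi(t)\ge 1\big)\Big) = \big(1-e^{-\lambda_0(\xi)t}\big)\,E_{\tilde P_0^\xi}\big(|\tilde X_n^\xi|^{\gamma q}\big) \le \lambda_0(\xi)\,t\,E_{\tilde P_0^\xi}\big(|\tilde X_n^\xi|^{\gamma q}\big). \tag{99}$$
We then obtain
$$E_{P_0}E_{P_0^\xi}\big(|X_t^\xi|^\gamma\big) \le C\sum_{n=1}^{\infty}\Big[\int Q_0(d\xi)\,E_{\tilde P_0^\xi}\big(|\tilde X_n^\xi|^{\gamma q}\big)\Big]^{1/q}\Big[\int P_0(d\xi)\,\tilde P_0^\xi\otimes Q\big(n_*^\xi(t) = n\big)\Big]^{1/p}, \tag{100}$$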
with $C = [t\,E_{P_0}(\lambda_0)]^{1/q}$. We claim that there is a (time-independent) constant $C'>0$ such that
$$\int Q_0(d\xi)\,E_{\tilde P_0^\xi}\big(|\tilde X_n^\xi|^{\gamma q}\big) \le C' n^{\gamma q}. \tag{101}$$
To show this, let us note first that, given $\tilde X_0^\xi = 0$, by another application of the Hölder inequality,
$$|\tilde X_n^\xi|^{\gamma q} = \Big|\sum_{m=0}^{n-1}\big(\tilde X^\xi_{m+1}-\tilde X^\xi_m\big)\Big|^{\gamma q} \le n^{\gamma q-1}\sum_{m=0}^{n-1}\big|\tilde X^\xi_{m+1}-\tilde X^\xi_m\big|^{\gamma q},$$
where it has been assumed that $\gamma q > 1$. One can derive from the stationarity of $\tilde P$ and Remark 1 that
$$\int Q_0(d\xi)\,E_{\tilde P_0^\xi}\big(|\tilde X^\xi_{n+1}-\tilde X^\xi_n|^{\gamma q}\big) = \int Q_0(d\xi)\,E_{\tilde P_0^\xi}\big(|\tilde X_1^\xi|^{\gamma q}\big) =: C'$$
for any $n\in\mathbb{N}$. One concludes the proof of (101) by checking that $C'$ is finite. Actually, by (26), $E_{P_0}(\lambda_0)\,C'$ is equal to
$$\int P_0(d\xi)\int\hat\xi(dx)\,c_{0,x}(\xi)\,|x|^{\gamma q} \le c\int P_0(d\xi)\int\hat\xi(dx)\,e^{-\frac{|x|}{2}},$$
for a suitable constant $c$. The r.h.s. can be bounded by means of Lemma 1(iv) and the same argument leading to Lemma 2.

In view of (100) and (101), the proposition will be proved if we can show that the expectation $E_{P_0}\big(\tilde P_0^\xi\otimes Q(n_*^\xi(t) = n)\big)$ converges to zero more rapidly than $n^{-(\gamma+1)p}$ as
$n\to\infty$. Let us fix $0<\alpha<1$. We will show that, if $l>0$ is such that $E_{P_0}(\lambda_0^{l+1}) < \infty$, then
$$E_{P_0}\Big(\tilde P_0^\xi\otimes Q\big(n_*^\xi(t) = n\big)\Big) = O(n^{-\alpha l}). \tag{102}$$
To this end, let us first make a general observation. Let $\lambda>0$ and let $T_1,\dots,T_k$ be independent exponential variables on some probability space $(\Omega,\mu)$, with parameters $\lambda_1,\dots,\lambda_k\le\lambda$. Define the random variables $T'_j := (\lambda_j/\lambda)\,T_j$, $j = 1,\dots,k$. These are independent identically distributed exponential variables with parameter $\lambda$. As $T'_j\le T_j$, this shows that
$$\mu\big(T_1+\cdots+T_k\le t\big) \le \mu\big(T'_1+\cdots+T'_k\le t\big) = e^{-\lambda t}\sum_{j=0}^{\infty}\frac{(\lambda t)^{j+k}}{(j+k)!} \le \frac{(\lambda t)^k}{k!}. \tag{103}$$
In order to proceed, for all $\xi\in\mathcal{N}_0$, let us set $B_n^\xi := \{x\in\hat\xi : \lambda_x(\xi)\le n^\alpha\}$ as well as
$$A_n^\xi := \Big\{(\tilde X_k^\xi)_{k\ge 0}\in\tilde\Omega_\xi : \exists\, J\subset I_n,\ |J| > \frac{n}{2},\ \tilde X_j^\xi\in B_n^\xi\ \forall j\in J\Big\},$$
where $I_n := \{0,\dots,n-1\}$ and $|J|$ is the cardinality of $J$. We write $\tilde P_0^\xi\otimes Q\big(n_*^\xi(t) = n\big) = g_n(\xi)+h_n(\xi)$ with
$$g_n(\xi) := \tilde P_0^\xi\otimes Q\big(\{n_*^\xi(t) = n\}\cap A_n^\xi\big), \qquad h_n(\xi) := \tilde P_0^\xi\otimes Q\big(\{n_*^\xi(t) = n\}\cap(A_n^\xi)^c\big).$$
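We first estimate $g_n$. Obviously $\{n_*^\xi(t) = n\}$ is contained in $\{\sum_{j\in J} T^\xi_{j,\tilde X_j^\xi}\le t\}$. As a result,
$$g_n(\xi) \le \sum_{J\subset I_n,\,|J|>n/2}\ \sum_{x_0,\dots,x_{n-1}\in\hat\xi}\chi\big(x_j\in B_n^\xi\ \forall j\in J\big)\,\chi\big(x_i\notin B_n^\xi\ \forall i\in I_n\setminus J\big)\,\tilde P_0^\xi\big(\tilde X_0^\xi = x_0,\dots,\tilde X_{n-1}^\xi = x_{n-1}\big)\,Q\Big(\sum_{j\in J} T^\xi_{j,x_j}\le t\Big) \le \max_{k=[n/2]+1,\dots,n-1}\frac{(n^\alpha t)^k}{k!}.$$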
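The general observation (103) used in the last inequality can be checked numerically for identically distributed variables. The sketch below is illustrative only: the function names are hypothetical, and it uses the standard closed form of the Erlang distribution function, $\mu(T'_1+\cdots+T'_k\le t) = 1 - e^{-x}\sum_{m<k} x^m/m!$ with $x = \lambda t$, which equals the series appearing in (103).

```python
import math


def erlang_cdf(k, x):
    """P(sum of k i.i.d. exponential(lambda) variables <= t), with x = lambda * t."""
    return 1.0 - math.exp(-x) * sum(x ** m / math.factorial(m) for m in range(k))


def taylor_bound(k, x):
    """Right-hand side of (103): (lambda * t)^k / k!."""
    return x ** k / math.factorial(k)


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        for x in (0.1, 1.0, 5.0):
            print(k, x, erlang_cdf(k, x), taylor_bound(k, x))
```

The bound is only useful when $\lambda t$ is small compared with $k$, which is the regime exploited above with $\lambda = n^\alpha$ and $k > n/2$.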
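Thanks to the Stirling formula $k!\sim k^k e^{-k}\sqrt{2\pi k}$ as $k\to\infty$, the last expression can be bounded by a constant times $(2et)^{n/2}\,n^{-n(1-\alpha)/2}$ and is thus exponentially small.

We now turn to $E_{P_0}(h_n)$, $n\ge 1$. Clearly,
$$\tilde P_0^\xi\big((A_n^\xi)^c\big) \le \frac{2}{n}\,E_{\tilde P_0^\xi}\Big(\chi\big(\tilde X_0^\xi\notin B_n^\xi\big)+\cdots+\chi\big(\tilde X_{n-1}^\xi\notin B_n^\xi\big)\Big) = \frac{2}{n}\sum_{m=0}^{n-1} E_{\tilde P_0^\xi}\Big(\chi\big(\lambda_0(\xi_m) > n^\alpha\big)\Big).$$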
By Proposition 2 and invoking Chebyshev's inequality, one obtains for any $l>0$,
$$E_{P_0}(h_n) \le \int P_0(d\xi)\,\tilde P_0^\xi\otimes Q\big(\{n_*^\xi(t)\ge 1\}\cap(A_n^\xi)^c\big) \le t\int P_0(d\xi)\,\lambda_0(\xi)\,\tilde P_0^\xi\big((A_n^\xi)^c\big)$$
$$\le \frac{2t}{n}\sum_{m=0}^{n-1}\int P_0(d\xi)\,\lambda_0(\xi)\,E_{\tilde P_0^\xi}\Big(\chi\big(\lambda_0(\xi_m) > n^\alpha\big)\Big) = 2t\,E_{P_0}\big(\lambda_0\,\chi(\lambda_0 > n^\alpha)\big) \le \frac{2t}{n^{\alpha l}}\,E_{P_0}\big(\lambda_0^{l+1}\big),$$
where the second inequality follows from the same argument leading to (99) and the equality follows from the stationarity of $\tilde P$. This proves (102). We may now choose $p = \alpha^{-1} > 1$ arbitrarily close to 1 so that $\gamma q > 1$ and such that one may take for $l$ the smallest integer strictly greater than $\gamma+1$. For such a choice the sum (100) converges, since by (101) and (102) its $n$-th term is $O(n^{\gamma-\alpha^2 l})$ with $\alpha^2 l > \gamma+1$ for $\alpha$ close enough to 1. We can now invoke Lemma 2 to get the result.

Acknowledgement. We would like to thank A. Bovier, J. Černý, B. Derrida, P. A. Ferrari, D. Gabrielli, A. Ramirez and R. Siegmund-Schultze for very useful comments. The work was supported by the SFB 288, SFB/TR 12 and the Dutch-German Bilateral Research Group "Mathematics of random spatial models from physics and biology".
References

[AHL] Ambegaokar, V., Halperin, B.I., Langer, J.S.: Hopping Conductivity in Disordered Systems. Phys. Rev. B 4, 2612–2620 (1971)
[BRSW] Bellissard, J., Rebolledo, R., Spehner, D., von Waldenfels, W.: In preparation
[BHZ] Bellissard, J., Hermann, D., Zarrouati, M.: Hull of Aperiodic Solids and Gap Labelling Theorems. In: Directions in Mathematical Quasicrystals, M.B. Baake, R.V. Moody, eds., CRM Monograph Series, Volume 13, Providence, RI: Amer. Math. Soc., 2000, pp. 207–259
[Bil] Billingsley, P.: Convergence of Probability Measures. New York: Wiley, 1968
[BS] Bolthausen, E., Sznitman, A.-S.: Ten lectures on random media. DMV Seminar 32, Basel: Birkhäuser, 2002
[Bre] Breiman, L.: Probability. Reading, MA: Addison-Wesley, 1953
[DV] Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes. New York: Springer, 1988
[DFGW] De Masi, A., Ferrari, P.A., Goldstein, S., Wick, W.D.: An Invariance Principle for Reversible Markov Processes. Applications to Random Motions in Random Environments. J. Stat. Phys. 55, 787–855 (1989)
[EF] Efros, A.L., Shklovskii, B.I.: Coulomb gap and low temperature conductivity of disordered systems. J. Phys. C: Solid State Phys. 8, L49–L51 (1975)
[FM] Faggionato, A., Martinelli, F.: Hydrodynamic limit of a disordered lattice gas. Probab. Theory Related Fields 127, 535–608 (2003)
[FKAS] Franken, P., König, D., Arndt, U., Schmidt, V.: Queues and Point Processes. Berlin: Akademie-Verlag, 1981
[Gri] Grimmett, G.: Percolation. Second Edition, Grundlehren 321, Berlin: Springer, 1999
[KV] Kipnis, C., Varadhan, S.R.S.: Central Limit Theorem for Additive Functionals of Reversible Markov Processes and Applications to Simple Exclusion. Commun. Math. Phys. 104, 1–19 (1986)
[Kal] Kallenberg, O.: Foundations of Modern Probability. Second Edition, New York: Springer-Verlag, 2001
[KLP] Kirsch, W., Lenoble, O., Pastur, L.: On the Mott formula for the a.c. conductivity and binary correlators in the strong localization regime of disordered systems. J. Phys. A: Math. Gen. 36, 12157–12180 (2003)
[LB] Ladieu, F., Bouchaud, J.-P.: Conductance statistics in small GaAs:Si wires at low temperatures: I. Theoretical analysis: truncated quantum fluctuations in insulating wires. J. Phys. I France 3, 2311–2320 (1993)
[Mar] Martinelli, F.: Lectures on Glauber dynamics for discrete spin models. Lecture Notes in Mathematics, Vol. 1717, Berlin-Heidelberg-New York: Springer, 2000
[MKM] Matthes, K., Kerstan, J., Mecke, J.: Infinitely Divisible Point Processes. Wiley Series in Probability and Mathematical Physics, New York: Wiley, 1978
[MR] Meester, R., Roy, R.: Continuum Percolation. Cambridge: Cambridge University Press, 1996
[MA] Miller, A., Abrahams, E.: Impurity Conduction at Low Concentrations. Phys. Rev. 120, 745–755 (1960)
[Min] Minami, N.: Local fluctuation of the spectrum of a multidimensional Anderson tight binding model. Commun. Math. Phys. 177, 709–725 (1996)
[Mot] Mott, N.F.: J. Non-Crystal. Solids 1, 1 (1968); Mott, N.F.: Phil. Mag. 19, 835 (1969); Mott, N.F., Davis, E.A.: Electronic Processes in Non-Crystalline Materials. New York: Oxford University Press, 1979
[Owh] Owhadi, H.: Approximation of the effective conductivity of ergodic media by periodization. Probab. Theory Related Fields 125, 225–258 (2003)
[Qua] Quastel, J.: Diffusion in Disordered Media. In: Funaki, T., Woyczinky, W., eds., Proceedings on stochastic method for nonlinear P.D.E., IMA volumes in Mathematics 77, New York: Springer Verlag, 1995, pp. 65–79
[RS] Reed, M., Simon, B.: Methods of Modern Mathematical Physics I-IV. San Diego: Academic Press, 1980
[Ros] Rosenblatt, M.: Markov Processes. Structure and Asymptotic Behavior. Grundlehren 184, Berlin: Springer, 1971
[SE] Shklovskii, B., Efros, A.L.: Electronic Properties of Doped Semiconductors. Berlin: Springer, 1984
[Spe] Spehner, D.: Contributions à la théorie du transport électronique dissipatif dans les solides apériodiques. PhD Thesis, Toulouse, 2000
[Spo] Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer, 1991
[Tho] Thorisson, H.: Coupling, Stationarity, and Regeneration. New York: Springer, 2000
Communicated by M. Aizenman
Commun. Math. Phys. 263, 65–88 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1494-3
Communications in
Mathematical Physics
A Hopf Bundle Over a Quantum Four-Sphere from the Symplectic Group

Giovanni Landi¹, Chiara Pagani², Cesare Reina²

¹ Dipartimento di Matematica e Informatica, Università di Trieste, Via A. Valerio 12/1, 34127 Trieste, Italy, and I.N.F.N., Sezione di Napoli, Napoli, Italy. E-mail: [email protected]
² S.I.S.S.A. International School for Advanced Studies, Via Beirut 2-4, 34014 Trieste, Italy. E-mail: [email protected]; [email protected]

Received: 7 September 2004 / Accepted: 16 August 2005 Published online: 24 January 2006 – © Springer-Verlag 2006
Abstract: We construct a quantum version of the $SU(2)$ Hopf bundle $S^7\to S^4$. The quantum sphere $S_q^7$ arises from the symplectic group $Sp_q(2)$ and a quantum 4-sphere $S_q^4$ is obtained via a suitable self-adjoint idempotent $p$ whose entries generate the algebra $A(S_q^4)$ of polynomial functions over it. This projection determines a deformation of an (anti-)instanton bundle over the classical sphere $S^4$. We compute the fundamental K-homology class of $S_q^4$ and pair it with the class of $p$ in the K-theory, getting the value $-1$ for the topological charge. There is a right coaction of $SU_q(2)$ on $S_q^7$ such that the algebra $A(S_q^7)$ is a non-trivial quantum principal bundle over $A(S_q^4)$ with structure quantum group $A(SU_q(2))$.
Contents
1. Introduction
2. Odd Spheres from Quantum Symplectic Groups
2.1 The quantum groups $Sp_q(N,\mathbb{C})$ and $Sp_q(n)$
2.2 The odd symplectic spheres
2.3 The symplectic 7-sphere $S_q^7$
3. The Principal Bundle $A(S_q^4)\to A(S_q^7)$
3.1 The quantum sphere $S_q^4$
3.2 The $SU_q(2)$-coaction
4. Representations of the Algebra $A(S_q^4)$
4.1 The representation $\beta$
4.2 The representation $\sigma$
5. The Index Pairings
6. Quantum Principal Bundle Structure
6.1 The associated bundle and the coequivariant maps
A. The Classical Hopf Fibration $S^7\to S^4$
References
1. Introduction

In this paper we study yet another example of how "quantization removes degeneracy" by constructing a new quantum version of the Hopf bundle $S^7\to S^4$. This is the first outcome of our attempt to generalize to the quantum case the ADHM construction of $SU(2)$ instantons together with their moduli spaces.

The q-monopole on two-dimensional quantum spheres was constructed in [8] more than a decade ago. There the general notion of a quantum principal bundle with quantum differential calculi was also introduced from a geometrical point of view. With universal differential calculi, this notion was later realised to be equivalent to that of a Hopf-Galois extension (see e.g. [14]). An analogous construction for q-instantons and their principal bundles has been an open problem ever since. A step in this direction was taken in [3], resulting in a bundle which is only a coalgebra extension [4]. Here we present a quantum principal instanton bundle which is an honest Hopf-Galois extension. One advantage is that non-universal calculi may be constructed on the bundle, as opposed to the case of a coalgebra bundle, where there is no such possibility.

In analogy with the classical case [1], it is natural to start with the quantum version of the (compact) symplectic groups $A(Sp_q(n))$, i.e. the Hopf algebras generated by matrix elements $T_i{}^j$ with commutation rules coming from the $R$ matrix of the C-series [25]. These quantum groups have comodule-subalgebras $A(S_q^{4n-1})$ yielding deformations of the algebras of polynomials over the spheres $S^{4n-1}$, which give more examples of the general construction of quantum homogeneous spaces [8].

The relevant case for us is $n = 2$, i.e. the symplectic quantum 7-sphere $A(S_q^7)$, which is generated by the matrix elements of the first and the last column of $T$. Indeed, as we will see, $\overline{T_i{}^4}\propto T_{5-i}{}^1$. A similar conjugation occurs for the elements of the middle columns, but contrary to what happens at $q = 1$, they do not generate a subalgebra. The algebra $A(S_q^7)$ is the quantum version of the homogeneous space $Sp(2)/Sp(1)$ and the injection $A(S_q^7)\to A(Sp_q(2))$ is a quantum principal bundle with "structure Hopf algebra" $A(Sp_q(1))$.

Most importantly, we show that $S_q^7$ is the total space of a quantum $SU_q(2)$ principal bundle over a quantum 4-sphere $S_q^4$. Unlike the previous construction, this is obviously not a quantum homogeneous structure. The algebra $A(S_q^4)$ is constructed as the subalgebra of $A(S_q^7)$ generated by the matrix elements of a self-adjoint projection $p$ which generalizes the anti-instanton of charge $-1$. This projection will be of the form $vv^*$ with $v$ a $4\times 2$ matrix whose entries are made out of generators of $A(S_q^7)$. The naive generalization of the classical case produces a subalgebra with extra generators which vanish at $q = 1$. Fortunately, there is just one alternative choice of $v$ which gives the right number of generators of an algebra which deforms the algebra of polynomial functions of $S^4$. At $q = 1$ this gives a projection which is gauge equivalent to the standard one.

This good choice becomes even better because there is a natural coaction of $SU_q(2)$ on $A(S_q^7)$ with coinvariant algebra $A(S_q^4)$, and the injection $A(S_q^4)\to A(S_q^7)$ turns out to be a faithfully flat $A(SU_q(2))$-Hopf-Galois extension.
Finally, we set the stage to compute the charge of our projection and to prove the non-triviality of our principal bundle. Following a general strategy of noncommutative index theorems [10], we construct representations of the algebra $A(S_q^4)$ and the corresponding K-homology. The analogue of the fundamental class of $S^4$ is given by a non-trivial Fredholm module $\mu$. The natural coupling between $\mu$ and the projection $p$ is computed via the pairing of the corresponding Chern characters $\mathrm{ch}^*(\mu)\in HC^*[A(S_q^4)]$ and $\mathrm{ch}_*(p)\in HC_*[A(S_q^4)]$ in cyclic cohomology and homology respectively [10]. As expected, the result of this pairing, which is in principle an integer, being the index of a Fredholm operator, is actually $-1$ and therefore the bundle is non-trivial.

Clearly the example presented in this paper is very special and limited, since it is just a particular anti-instanton of charge $-1$. Indeed our construction is based on the requirement that the matrix $v$ giving the projection is linear in the generators of $A(S_q^7)$ and such that $v^*v = 1$. This is false even classically at generic moduli and generic charge, except for the case considered here (and for a similar construction for the case of charge 1). A more elaborate strategy is needed to tackle the general case.

2. Odd Spheres from Quantum Symplectic Groups

We recall the construction of quantum spheres associated with the compact real form of the quantum symplectic groups $Sp_q(N,\mathbb{C})$ ($N = 2n$), the latter being given in [25]. Later we shall specialize to the case $N = 4$ and the corresponding 7-sphere will provide the 'total space' of our quantum Hopf bundle.

2.1. The quantum groups $Sp_q(N,\mathbb{C})$ and $Sp_q(n)$. The algebra $A(Sp_q(N,\mathbb{C}))$ is the associative noncommutative algebra generated over the ring of Laurent polynomials $\mathbb{C}_q := \mathbb{C}[q,q^{-1}]$ by the entries $T_i{}^j$, $i,j = 1,\dots,N$ of a matrix $T$ which satisfy RTT equations:
$$R\,T_1T_2 = T_2T_1\,R, \qquad T_1 = T\otimes 1, \qquad T_2 = 1\otimes T.$$
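In components $(T\otimes 1)_{ij}{}^{kl} = T_i{}^k\,\delta_j{}^l$. Here the relevant $N^2\times N^2$ matrix $R$ is the one for the $C_N$ series and has the form [25],
$$R = q\sum_{i=1}^{N} e_i{}^i\otimes e_i{}^i + \sum_{\substack{i,j=1\\ i\neq j,j'}}^{N} e_i{}^i\otimes e_j{}^j + q^{-1}\sum_{i=1}^{N} e_{i'}{}^{i'}\otimes e_i{}^i + (q-q^{-1})\sum_{\substack{i,j=1\\ i>j}}^{N} e_i{}^j\otimes e_j{}^i - (q-q^{-1})\sum_{\substack{i,j=1\\ i>j}}^{N} q^{\rho_i-\rho_j}\varepsilon_i\varepsilon_j\, e_i{}^j\otimes e_{i'}{}^{j'}, \tag{1}$$
where $i' = N+1-i$; $e_i{}^j\in M_n(\mathbb{C})$ are the elementary matrices, i.e. $(e_i{}^j)^k{}_l = \delta_i{}^k\,\delta^j{}_l$; $\varepsilon_i = 1$ for $i = 1,\dots,n$; $\varepsilon_i = -1$ for $i = n+1,\dots,N$; $(\rho_1,\dots,\rho_N) = (n, n-1,\dots,1,-1,\dots,-n)$.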
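To make the structure of (1) concrete, here is a minimal sketch that tabulates the nonzero components of $R$ for given $n$ and a numerical value of $q$. It assumes the reading of (1) in which $e_i{}^j$ has its single nonzero entry in row $i$, column $j$, and the last sum involves $e_i{}^j\otimes e_{i'}{}^{j'}$; the function name and the dictionary layout are hypothetical, and indices are 0-based in the code.

```python
def symplectic_r_matrix(n, q):
    """Nonzero components of R as a dict.

    Key (i, j, k, l) holds the coefficient of e_i^k (x) e_j^l, with 0-based
    indices and e_i^k the elementary matrix with a 1 in row i, column k
    (an assumed convention for this sketch).
    """
    N = 2 * n
    rho = list(range(n, 0, -1)) + list(range(-1, -n - 1, -1))  # (n, ..., 1, -1, ..., -n)
    eps = [1] * n + [-1] * n

    def conj(i):
        # 0-based version of i' = N + 1 - i
        return N - 1 - i

    R = {}

    def add(key, value):
        R[key] = R.get(key, 0.0) + value

    for i in range(N):
        add((i, i, i, i), q)                            # q e_i^i (x) e_i^i
        add((conj(i), i, conj(i), i), 1.0 / q)          # q^{-1} e_{i'}^{i'} (x) e_i^i
        for j in range(N):
            if j != i and j != conj(i):
                add((i, j, i, j), 1.0)                  # e_i^i (x) e_j^j
            if i > j:
                add((i, j, j, i), q - 1.0 / q)          # (q - q^{-1}) e_i^j (x) e_j^i
                add((i, conj(i), j, conj(j)),           # last sum of (1)
                    -(q - 1.0 / q) * q ** (rho[i] - rho[j]) * eps[i] * eps[j])
    return {key: val for key, val in R.items() if abs(val) > 1e-14}


if __name__ == "__main__":
    R = symplectic_r_matrix(2, 1.3)
    print(len(R), R[(3, 3, 3, 3)])
```

Two quick sanity checks follow from this tabulation: at $q = 1$ the matrix reduces to the identity, and the only nonzero component with both upper indices equal to $N$ is $R_{NN}{}^{NN} = q$.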
The symplectic group structure comes from the matrix $C_i{}^j = q^{\rho_j}\varepsilon_i\,\delta_{ij'}$ by imposing the additional relations
$$T\,C\,T^t\,C^{-1} = C\,T^t\,C^{-1}\,T = 1.$$
The Hopf algebra co-structures $(\Delta,\varepsilon,S)$ of the quantum group $Sp_q(N,\mathbb{C})$ are given by
$$\Delta(T) = T\,\dot\otimes\,T, \qquad \varepsilon(T) = I, \qquad S(T) = C\,T^t\,C^{-1}.$$
In components the antipode explicitly reads
$$S(T)_i{}^j = -q^{\rho_{i'}+\rho_j}\,\varepsilon_i\varepsilon_{j'}\,T_{j'}{}^{i'}. \tag{2}$$
At $q = 1$ the Hopf algebra $Sp_q(N,\mathbb{C})$ reduces to the algebra of polynomial functions over the symplectic group $Sp(N,\mathbb{C})$. The compact real form $A(Sp_q(n))$ of the quantum group $A(Sp_q(N,\mathbb{C}))$ is given by taking $q\in\mathbb{R}$ and the anti-involution [25]
$$\overline{T} = S(T)^t = C^t\,T\,(C^{-1})^t. \tag{3}$$
2.2. The odd symplectic spheres. Let us denote
$$x_i = T_i{}^N, \qquad v^j = S(T)_N{}^j, \qquad i,j = 1,\dots,N.$$
As we will show, these generators give subalgebras of $A(Sp_q(N,\mathbb{C}))$. With the natural involution (3), the algebra generated by the $\{x_i, v^j\}$ can be thought of as the algebra $A(S_q^{4n-1})$ of polynomial functions on a quantum sphere of 'dimension' $4n-1$. From here on, whenever no confusion arises, the sum over repeated indices is understood. In components the RTT equations are given by
$$R_{ij}{}^{kp}\,T_k{}^r\,T_p{}^s = T_j{}^p\,T_i{}^m\,R_{mp}{}^{rs}. \tag{4}$$
Hence $R_{ij}{}^{kl}\,T_k{}^r = T_j{}^p\,T_i{}^m\,R_{mp}{}^{rs}\,S(T)_s{}^l$, and in turn $S(T)_p{}^j\,R_{ij}{}^{kl} = T_i{}^a\,R_{ap}{}^{rs}\,S(T)_s{}^l\,S(T)_r{}^k$, so that
$$S(T)_a{}^i\,S(T)_p{}^j\,R_{ij}{}^{kl} = R_{ap}{}^{rs}\,S(T)_s{}^l\,S(T)_r{}^k. \tag{5}$$
Conversely, if we multiply $R_{ij}{}^{kp}\,T_k{}^r = T_j{}^l\,T_i{}^m\,R_{ml}{}^{rs}\,S(T)_s{}^p$ on the left by $S(T)$ we have
$$S(T)_l{}^j\,R_{ij}{}^{kp}\,T_k{}^r = T_i{}^m\,R_{ml}{}^{rs}\,S(T)_s{}^p. \tag{6}$$
We shall use Eqs. (4), (5) and (6) to describe the algebra generated by the $x_i$'s and by the $v^i$'s.
The algebra $\mathbb{C}_q[x_i]$. From (4) with $r = s = N$ we have
$$R_{ij}{}^{kp}\,x_k\,x_p = T_j{}^p\,T_i{}^m\,R_{mp}{}^{NN}. \tag{7}$$
Since the only element $R_{mp}{}^{NN}\propto e_m{}^N\otimes e_p{}^N$ ($m,p\le N$) which is different from zero is $R_{NN}{}^{NN} = q$, it follows that
$$R_{ij}{}^{kp}\,x_k\,x_p = q\,x_j\,x_i, \tag{8}$$
and the elements $x_i$ give an algebra with commutation relations
$$x_i x_j = q\,x_j x_i, \quad i<j,\ i\neq j'; \qquad x_{i'}x_i = q^{-2}\,x_i x_{i'} + (q^{-2}-1)\sum_{k=1}^{i-1} q^{\rho_i-\rho_k}\varepsilon_i\varepsilon_k\,x_k x_{k'}, \quad i<i'. \tag{9}$$
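As an illustration of (9), here is a sketch of its specialization to $N = 4$, so that $n = 2$, $i' = 5-i$, $(\rho_1,\dots,\rho_4) = (2,1,-1,-2)$ and $(\varepsilon_1,\dots,\varepsilon_4) = (1,1,-1,-1)$, as fixed after (1). It assumes that the second relation in (9) reads $x_{i'}x_i = q^{-2}x_ix_{i'} + (q^{-2}-1)\sum_{k<i} q^{\rho_i-\rho_k}\varepsilon_i\varepsilon_k\,x_kx_{k'}$ for $i<i'$.

```latex
i = 1:\quad x_4 x_1 = q^{-2}\, x_1 x_4 \qquad \text{(the sum over } k \text{ is empty)},
\\[4pt]
i = 2:\quad x_3 x_2 = q^{-2}\, x_2 x_3 + (q^{-2}-1)\, q^{\rho_2-\rho_1}\,\varepsilon_2\varepsilon_1\; x_1 x_4
            = q^{-2}\, x_2 x_3 + q^{-2}\,(q^{-1}-q)\; x_1 x_4 .
```

Here $(q^{-2}-1)\,q^{-1} = q^{-3}-q^{-1} = q^{-2}(q^{-1}-q)$. These are the two relations of this type that appear in the list (15) for the 7-sphere.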
The algebra Cq [v i ] Putting a = p = N in Eq. (5), we get v i v j Rij kl = RNN rs S(T )s l S(T )r k . The sum on the r.h.s. reduces to RNN NN S(T )N l S(T )N k and the v i ’s give an algebra with commutation relations v l v k Rlk j i = qv i v j .
(10)
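Explicitly
\[
\begin{aligned}
v^i v^j &= q^{-1} v^j v^i , && i < j,\ i \neq j' , \\
v^{i'} v^i &= q^{2} v^i v^{i'} + (q^{2} - 1) \sum_{k=i'+1}^{N} q^{\rho_k - \rho_{i'}} \varepsilon_k \varepsilon_{i'}\, v^k v^{k'} , && i < i' .
\end{aligned}
\tag{11}
\]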
The algebra $\mathbb{C}_q[x_i, v^j]$. Finally, for $l = r = N$ Eq. (6) reads: $v^j R_{ij}{}^{kp}\, x_k = T_i{}^m\, R_{mN}{}^{Ns}\, S(T)_s{}^p$. Once more, the only term in $R$ of the form $e_m{}^N \otimes e_N{}^s$ ($m \le N$) is $e_N{}^N \otimes e_N{}^N$ and therefore
\[ v^j R_{ij}{}^{kp}\, x_k = q\, x_i v^p . \tag{12} \]
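Explicitly the mixed commutation rules for the algebra $\mathbb{C}_q[x_i, v^j]$ read
\[
\begin{aligned}
x_i v^i &= v^i x_i + (1 - q^{-2}) \sum_{k=1}^{i-1} v^k x_k + (1 - q^{-2})\, q^{\rho_i - \rho_{i'}}\, v^{i'} x_{i'} , && \text{(last term only if } i > i') , \\
x_i v^{i'} &= q^{-2}\, v^{i'} x_i , \\
x_i v^j &= q^{-1}\, v^j x_i , && i \neq j \text{ and } i < j' , \\
x_i v^j &= q^{-1}\, v^j x_i + (q^{-2} - 1)\, q^{\rho_i - \rho_{j'}} \varepsilon_i \varepsilon_{j'}\, v^{i'} x_{j'} , && i \neq j \text{ and } i > j' .
\end{aligned}
\tag{13}
\]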
G. Landi, C. Pagani, C. Reina
The quantum spheres $S_q^{4n-1}$. Let us observe that with the anti-involution (3) we have the identification $v^i = S(T)_N{}^i = \bar{x}^i$. The subalgebra $A(S_q^{4n-1})$ of $A(Sp_q(n))$ generated by $\{x_i,\ v^i = \bar{x}^i,\ i = 1, \dots, 2n\}$ is the algebra of polynomial functions on a sphere. Indeed $S(T)\,T = I \Rightarrow S(T)_N{}^i\, T_i{}^N = \delta_N^N = 1$, i.e.
\[ \sum_i \bar{x}^i x_i = 1 . \tag{14} \]
Furthermore, the restriction of the comultiplication is a natural left coaction $\Delta_L : A(S_q^{4n-1}) \to A(Sp_q(n)) \otimes A(S_q^{4n-1})$. The fact that $\Delta_L$ is an algebra map then implies that $A(S_q^{4n-1})$ is a comodule algebra over $A(Sp_q(n))$. At $q = 1$ this algebra reduces to the algebra of polynomial functions over the spheres $S^{4n-1}$ as homogeneous spaces of the symplectic group $Sp(n)$: $S^{4n-1} = Sp(n)/Sp(n-1)$.

2.3. The symplectic 7-sphere $S_q^7$. The algebra $A(S_q^7)$ is generated by the elements $x_i = T_i{}^4$ and $\bar{x}^i = S(T)_4{}^i = q^{2+\rho_i} \varepsilon_{i'}\, T_{i'}{}^1$, for $i = 1, \dots, 4$. From $S(T)\, T = 1$ we have the sphere relation $\sum_{i=1}^{4} \bar{x}^i x_i = 1$. Since we shall use them systematically in the following, we give the commutation relations among the generators explicitly. From (9), the algebra of the $x_i$ is given by
\[
\begin{aligned}
x_1 x_2 &= q\, x_2 x_1 , & x_1 x_3 &= q\, x_3 x_1 , \\
x_2 x_4 &= q\, x_4 x_2 , & x_3 x_4 &= q\, x_4 x_3 , \\
x_4 x_1 &= q^{-2} x_1 x_4 , & x_3 x_2 &= q^{-2} x_2 x_3 + q^{-2}(q^{-1} - q)\, x_1 x_4 ,
\end{aligned}
\tag{15}
\]
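together with their conjugates (given in (11)). We also have the commutation relations between the $x_i$ and the $\bar{x}^j$ deduced from (12):
\[
\begin{aligned}
x_1 \bar{x}^1 &= \bar{x}^1 x_1 , \qquad x_1 \bar{x}^2 = q^{-1} \bar{x}^2 x_1 , \\
x_1 \bar{x}^3 &= q^{-1} \bar{x}^3 x_1 , \qquad x_1 \bar{x}^4 = q^{-2} \bar{x}^4 x_1 , \\
x_2 \bar{x}^2 &= \bar{x}^2 x_2 + (1 - q^{-2})\, \bar{x}^1 x_1 , \\
x_2 \bar{x}^3 &= q^{-2} \bar{x}^3 x_2 , \\
x_2 \bar{x}^4 &= q^{-1} \bar{x}^4 x_2 + q^{-1}(q^{-2} - 1)\, \bar{x}^3 x_1 , \\
x_3 \bar{x}^3 &= \bar{x}^3 x_3 + (1 - q^{-2}) \big[ \bar{x}^1 x_1 + (1 + q^{-2})\, \bar{x}^2 x_2 \big] , \\
x_3 \bar{x}^4 &= q^{-1} \bar{x}^4 x_3 + (1 - q^{-2})\, q^{-3}\, \bar{x}^2 x_1 , \\
x_4 \bar{x}^4 &= \bar{x}^4 x_4 + (1 - q^{-2}) \big[ (1 + q^{-4})\, \bar{x}^1 x_1 + \bar{x}^2 x_2 + \bar{x}^3 x_3 \big] ,
\end{aligned}
\tag{16}
\]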
again with their conjugates. Next we show that the algebra $A(S_q^7)$ can be realized as the subalgebra of $A(Sp_q(2))$ generated by the coinvariants under the right coaction of $A(Sp_q(1))$, in complete analogy with the classical homogeneous space $Sp(2)/Sp(1) \simeq S^7$.
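As a cross-check of the explicit relations (16), the sketch below specializes the general mixed rules (13) to $N = 4$ with exact rational arithmetic. It is only an illustration: the values of $\rho_i$ and $\varepsilon_i$ are assumed to be the usual ones for $Sp_q(4)$, namely $\rho = (2, 1, -1, -2)$ and $\varepsilon = (1, 1, -1, -1)$, which are not restated in this section, and the helper names `conj` and `mixed_relation` are hypothetical.

```python
from fractions import Fraction

# Assumed conventions for Sp_q(4) (N = 4, n = 2); not restated in this section.
N = 4
RHO = {1: 2, 2: 1, 3: -1, 4: -2}   # rho_i
EPS = {1: 1, 2: 1, 3: -1, 4: -1}   # epsilon_i


def conj(i):
    """Conjugate index i' = N + 1 - i."""
    return N + 1 - i


def mixed_relation(i, j, q):
    """Return {(k, l): c} with x_i xbar^j = sum of c * xbar^k x_l, following (13)."""
    q = Fraction(q)
    out = {}

    def add(k, l, c):
        out[(k, l)] = out.get((k, l), Fraction(0)) + c

    if i == j:
        add(i, i, Fraction(1))
        for k in range(1, i):
            add(k, k, 1 - q**-2)
        if i > conj(i):  # extra term, present only when i > i'
            ip = conj(i)
            add(ip, ip, (1 - q**-2) * q**(RHO[i] - RHO[ip]))
    elif j == conj(i):
        add(j, i, q**-2)
    elif i < conj(j):
        add(j, i, q**-1)
    else:  # i != j and i > j'
        jp = conj(j)
        add(j, i, q**-1)
        add(conj(i), jp, (q**-2 - 1) * q**(RHO[i] - RHO[jp]) * EPS[i] * EPS[jp])
    return out
```

For instance, `mixed_relation(3, 4, q)` returns the two coefficients of $x_3 \bar{x}^4$ in (16), keyed by the index pairs `(4, 3)` and `(2, 1)`.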
Lemma 1. The two-sided *-ideal in $A(Sp_q(2))$ generated as $I_q = \{T_1{}^1 - 1,\ T_4{}^4 - 1,\ T_1{}^2,\ T_1{}^3,\ T_1{}^4,\ T_2{}^1,\ T_2{}^4,\ T_3{}^1,\ T_3{}^4,\ T_4{}^1,\ T_4{}^2,\ T_4{}^3\}$ with the involution (3) is a Hopf ideal.
Proof. Since $S(T)_i{}^j \propto T_{j'}{}^{i'}$, we have $S(I_q) \subseteq I_q$, which also proves that $I_q$ is a *-ideal. One easily shows that $\varepsilon(I_q) = 0$ and $\Delta(I_q) \subseteq I_q \otimes A(Sp_q(2)) + A(Sp_q(2)) \otimes I_q$.
Proposition 1. The Hopf algebra $B_q := A(Sp_q(2))/I_q$ is isomorphic to the coordinate algebra $A(SU_{q^2}(2)) \cong A(Sp_q(1))$.
Proof. Using $\overline{T} = S(T)^t$ and setting $T_2{}^2 = \alpha$, $T_3{}^2 = \gamma$, the algebra $B_q$ can be described as the algebra generated by the entries of the matrix
\[
T' = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \alpha & -q^{2} \bar{\gamma} & 0 \\ 0 & \gamma & \bar{\alpha} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} .
\tag{17}
\]
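The commutation relations deduced from RTT equations (4) read:
\[
\alpha \bar{\gamma} = q^{2} \bar{\gamma} \alpha , \quad \alpha \gamma = q^{2} \gamma \alpha , \quad \gamma \bar{\gamma} = \bar{\gamma} \gamma , \quad \bar{\alpha} \alpha + \bar{\gamma} \gamma = 1 , \quad \alpha \bar{\alpha} + q^{4} \gamma \bar{\gamma} = 1 .
\tag{18}
\]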
Hence, as an algebra $B_q$ is isomorphic to the algebra $A(SU_{q^2}(2))$. Furthermore, the restriction of the coproduct of $A(Sp_q(2))$ to $B_q$ endows the latter with a coalgebra structure, $\Delta(T') = T' \,\dot{\otimes}\, T'$, which is the same as the one of $A(SU_{q^2}(2))$. We can conclude that also as a Hopf algebra, $B_q$ is isomorphic to the Hopf algebra $A(SU_{q^2}(2)) \cong A(Sp_q(1))$.
Proposition 2. The algebra $A(S_q^7) \subset A(Sp_q(2))$ is the algebra of coinvariants with respect to the natural right coaction
\[ \Delta_R : A(Sp_q(2)) \to A(Sp_q(2)) \otimes A(Sp_q(1)) ; \qquad \Delta_R(T) = T \,\dot{\otimes}\, T' . \]
Proof. It is straightforward to show that the generators of the algebra $A(S_q^7)$ are coinvariants: $\Delta_R(x_i) = \Delta_R(T_i{}^4) = x_i \otimes 1$; $\Delta_R(\bar{x}^i) = -q^{2+\rho_i} \varepsilon_i\, \Delta_R(T_{i'}{}^1) = \bar{x}^i \otimes 1$, thus the algebra $A(S_q^7)$ is made of coinvariants. There are no other coinvariants of degree one since each row of the submatrix of $T$ made out of the two central columns is a fundamental comodule under the coaction of $SU_{q^2}(2)$. Other coinvariants arising at higher even degree are of the form $(T_i{}^2\, T_{i'}{}^3 - q^{2}\, T_i{}^3\, T_{i'}{}^2)^n$; thanks to the commutation relations of $A(Sp_q(2))$, one checks these belong to $A(S_q^7)$ as well. It is an easy computation to check that similar expressions involving elements from different rows cannot be coinvariant.
The previous construction is one more example of the general construction of a quantum principal bundle over a quantum homogeneous space [8]. The latter is the datum of a Hopf quotient $\pi : A(G) \to A(K)$ with the right coaction of $A(K)$ on $A(G)$ given by the reduced coproduct $\Delta_R := (\mathrm{id} \otimes \pi)\Delta$, where $\Delta$ is the coproduct of $A(G)$. The subalgebra $B \subset A(G)$ made of the coinvariants with respect to $\Delta_R$ is called a quantum homogeneous space. To prove that it is a quantum principal bundle one needs some more assumptions (see Lemma 5.2 of [8]). In our case $A(G) = A(Sp_q(2))$, $A(K) = A(Sp_q(1))$ with $\pi(T) = T'$. We will prove in Sect. 6 that the resulting inclusion $B = A(S_q^7) \hookrightarrow A(Sp_q(2))$ is indeed a Hopf-Galois extension and hence a quantum principal bundle.

3. The Principal Bundle $A(S_q^4) \hookrightarrow A(S_q^7)$

The fundamental step of this paper is to make the sphere $S_q^7$ itself into the total space of a quantum principal bundle over a deformed 4-sphere. Unlike what we saw in the previous section, this is not a quantum homogeneous space construction and it is not obvious that such a bundle exists at all. Nonetheless the notion of quantum bundle is more general and one only needs that the total space algebra is a comodule algebra over a Hopf algebra with additional suitable properties. The notion of quantum principal bundle, as said, is encoded in that of Hopf-Galois extension (see e.g. [8, 14]). Let us recall some relevant definitions [20] (see also [22]). Recall that we work over the field $k = \mathbb{C}$.

Definition 1. Let $H$ be a Hopf algebra and $P$ a right $H$-comodule algebra with multiplication $m : P \otimes P \to P$ and coaction $\Delta_R : P \to P \otimes H$. Let $B \subseteq P$ be the subalgebra of coinvariants, i.e. $B = \{p \in P \mid \Delta_R(p) = p \otimes 1\}$. The extension $B \subseteq P$ is called an $H$-Hopf-Galois extension if the canonical map
\[
\chi : P \otimes_B P \to P \otimes H , \qquad \chi := (m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes_B \Delta_R) , \qquad p' \otimes_B p \mapsto \chi(p' \otimes_B p) = p' p_{(0)} \otimes p_{(1)}
\tag{19}
\]
is bijective. We use Sweedler-like notation $\Delta_R\, p = p_{(0)} \otimes p_{(1)}$. The canonical map is left $P$-linear and right $H$-colinear and is a morphism (an isomorphism for Hopf-Galois extensions) of left $P$-modules and right $H$-comodules. It is also clear that $P$ is both a left and a right $B$-module. The injectivity of the canonical map dualizes the condition of a group action $X \times G \to X$ to be free: if $\alpha$ is the map $\alpha : X \times G \to X \times_M X$, $(x, g) \mapsto (x, x \cdot g)$, then $\alpha^* = \chi$ with $P$, $H$ the algebras of functions on $X$, $G$ respectively, and the action is free if and only if $\alpha$ is injective. Here $M := X/G$ is the space of orbits with projection map $\pi : X \to M$, $\pi(x \cdot g) = \pi(x)$, for all $x \in X$, $g \in G$. Furthermore, $\alpha$ is surjective if and only if for all $x \in X$, the fibre $\pi^{-1}(\pi(x))$ of $\pi(x)$ is equal to the residue class $x \cdot G$, that is, if and only if $G$ acts transitively on the fibres of $\pi$. In differential geometry a principal bundle is more than just a free and effective action of a Lie group. In our example, thanks to the fact that the “structure group” is $SU_q(2)$, from Th. I of [28] further nice properties can be established. We shall elaborate more on these points later on in Sect. 6. The first natural step would be to construct a map from $S_q^7$ into a deformation of the Stiefel variety of unitary frames of 2-planes in $\mathbb{C}^4$ to parallel the classical construction
as recalled in Appendix A. The naive choice we have is to take as generators the elements of two (conjugate) columns of the matrix $T$. We are actually forced to take the first and the last columns of the matrix $T$ because the other choice (i.e. the second and the third columns) does not yield a subalgebra, since commutation relations of their elements will involve elements from the other two columns. If we set
\[
v = \begin{pmatrix} \bar{x}^4 & x_1 \\ q^{-1} \bar{x}^3 & x_2 \\ -q^{-3} \bar{x}^2 & x_3 \\ -q^{-4} \bar{x}^1 & x_4 \end{pmatrix} ,
\tag{20}
\]
we have $v^* v = I_2$ and the matrix $p = v v^*$ is a self-adjoint idempotent, i.e. $p = p^* = p^2$. At $q = 1$ the entries of $p$ are invariant for the natural action of $SU(2)$ on $S^7$ and generate the algebra of polynomials on $S^4$. This fails to be the case at generic $q$ due to the occurrence of extra generators, e.g.
\[ p_{14} = (1 - q^{-2})\, x_1 \bar{x}^4 , \qquad p_{23} = (1 - q^{-2})\, x_2 \bar{x}^3 , \tag{21} \]
which vanish at $q = 1$.

3.1. The quantum sphere $S_q^4$. These facts indicate that the naive quantum analogue of the quaternionic projective line as a homogeneous space of $Sp_q(2)$ does not have the right number of generators. Rather surprisingly, we shall nevertheless be able to select another subalgebra of $A(S_q^7)$ which is a deformation of the algebra of polynomials on $S^4$ having the same number of generators. These generators come from a better choice of a projection. On the free module $E := \mathbb{C}^4 \otimes A(S_q^7)$ we consider the hermitian structure given by
\[ h(|\xi_1\rangle, |\xi_2\rangle) = \sum_{j=1}^{4} \bar{\xi}_1^{\,j}\, \xi_2^{\,j} . \]
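To every element $|\xi\rangle \in E$ one associates an element $\langle\xi|$ in the dual module $E^*$ by the pairing $\langle\xi|(|\eta\rangle) := \langle\xi|\eta\rangle = h(|\xi\rangle, |\eta\rangle)$. Guided by the classical construction which we present in Appendix A, we shall look for two elements $|\phi_1\rangle$, $|\phi_2\rangle$ in $E$ with the property that
\[ \langle\phi_1|\phi_1\rangle = 1 , \qquad \langle\phi_2|\phi_2\rangle = 1 , \qquad \langle\phi_1|\phi_2\rangle = 0 . \]
As a consequence, the matrix valued function defined by
\[ p := |\phi_1\rangle\langle\phi_1| + |\phi_2\rangle\langle\phi_2| \tag{22} \]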
is a self-adjoint idempotent (a projection). In principle, $p \in \mathrm{Mat}_4(A(S_q^7))$, but we can choose $|\phi_1\rangle$, $|\phi_2\rangle$ in such a way that the entries of $p$ will generate a subalgebra $A(S_q^4)$ of $A(S_q^7)$ which is a deformation of the algebra of polynomial functions on the 4-sphere $S^4$. The two elements $|\phi_1\rangle$, $|\phi_2\rangle$ will be obtained in two steps as follows.
Firstly we write the relation $1 = \sum_i \bar{x}^i x_i$ in terms of the quadratic elements $\bar{x}^1 x_1$, $x_2 \bar{x}^2$, $\bar{x}^3 x_3$, $x_4 \bar{x}^4$ by using the commutation relations of Sect. 2.3. We have that
\[ 1 = q^{-6}\, \bar{x}^1 x_1 + q^{-2}\, x_2 \bar{x}^2 + q^{-2}\, \bar{x}^3 x_3 + x_4 \bar{x}^4 . \]
Then we take
\[ |\phi_1\rangle = (q^{-3} x_1 ,\ -q^{-1} \bar{x}^2 ,\ q^{-1} x_3 ,\ -\bar{x}^4)^t , \tag{23} \]
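($t$ denoting transposition) which is such that $\langle\phi_1|\phi_1\rangle = 1$. Next, we write $1 = \sum_i \bar{x}^i x_i$ as a function of the quadratic elements $x_1 \bar{x}^1$, $\bar{x}^2 x_2$, $x_3 \bar{x}^3$, $\bar{x}^4 x_4$:
\[ 1 = q^{-2}\, x_1 \bar{x}^1 + q^{-4}\, \bar{x}^2 x_2 + x_3 \bar{x}^3 + \bar{x}^4 x_4 . \]
By taking $|\phi_2\rangle = (\pm q^{-2} x_2 ,\ \pm q^{-1} \bar{x}^1 ,\ \pm x_4 ,\ \pm \bar{x}^3)^t$ we get $\langle\phi_2|\phi_2\rangle = 1$. The signs will be chosen in order to have also the orthogonality $\langle\phi_1|\phi_2\rangle = 0$; for
\[ |\phi_2\rangle = (q^{-2} x_2 ,\ q^{-1} \bar{x}^1 ,\ -x_4 ,\ -\bar{x}^3)^t \tag{24} \]
this is satisfied. The matrix
\[
v = (|\phi_1\rangle, |\phi_2\rangle) = \begin{pmatrix} q^{-3} x_1 & q^{-2} x_2 \\ -q^{-1} \bar{x}^2 & q^{-1} \bar{x}^1 \\ q^{-1} x_3 & -x_4 \\ -\bar{x}^4 & -\bar{x}^3 \end{pmatrix}
\tag{25}
\]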
is such that $v^* v = 1$ and hence $p = v v^*$ is a self-adjoint projection.

Proposition 3. The entries of the projection $p = v v^*$, with $v$ given in (25), generate a subalgebra of $A(S_q^7)$ which is a deformation of the algebra of polynomial functions on the 4-sphere $S^4$.

Proof. Let us compute explicitly the components of the projection $p$ and their commutation relations.
1. The diagonal elements are given by
\[ p_{11} = q^{-6} x_1 \bar{x}^1 + q^{-4} x_2 \bar{x}^2 , \quad p_{22} = q^{-2} \bar{x}^2 x_2 + q^{-2} \bar{x}^1 x_1 , \quad p_{33} = q^{-2} x_3 \bar{x}^3 + x_4 \bar{x}^4 , \quad p_{44} = \bar{x}^4 x_4 + \bar{x}^3 x_3 , \]
and satisfy the relation
\[ q^{-2} p_{11} + q^{2} p_{22} + p_{33} + p_{44} = 2 . \tag{26} \]
Only one of the $p_{ii}$ is independent; indeed by using the commutation relations and the equation $\sum_i \bar{x}^i x_i = 1$, we can rewrite the $p_{ii}$ in terms of $t := p_{22}$, as
\[ p_{11} = q^{-2} t , \quad p_{22} = t , \quad p_{33} = 1 - q^{-4} t , \quad p_{44} = 1 - q^{2} t . \tag{27} \]
Equation (26) is easily verified. Notice that $t$ is self-adjoint: $\bar{t} = t$.
2. As in the classical case, the elements p12 , p34 (and their conjugates) vanish: p12 = −q −4 x1 x2 + q −3 x2 x1 = 0 ,
p34 = −q −1 x3 x4 + x4 x3 = 0 .
3. The remaining elements are given by
\[
\begin{aligned}
p_{13} &= q^{-4} x_1 \bar{x}^3 - q^{-2} x_2 \bar{x}^4 , & p_{14} &= -q^{-3} x_1 x_4 - q^{-2} x_2 x_3 , \\
p_{23} &= -q^{-2} \bar{x}^2 \bar{x}^3 - q^{-1} \bar{x}^1 \bar{x}^4 , & p_{24} &= q^{-1} \bar{x}^2 x_4 - q^{-1} \bar{x}^1 x_3 ,
\end{aligned}
\]
with $p_{ji} = \bar{p}_{ij}$ when $j > i$. By using the commutation relations of $A(S_q^7)$, one finds that only two of these are independent. We take them to be $p_{13}$ and $p_{14}$; one finds $p_{23} = q^{-2} \bar{p}_{14}$ and $p_{24} = -q^{2} \bar{p}_{13}$. Finally, we also have the sphere relation
\[
(q^{6} - q^{8})\, p_{11}^2 + p_{22}^2 + p_{44}^2 + q^{4} (p_{13} p_{31} + p_{14} p_{41}) + q^{2} (p_{24} p_{42} + p_{23} p_{32}) = \Big( \sum_i \bar{x}^i x_i \Big)^{2} = 1 .
\tag{28}
\]
Summing up, together with $t = p_{22}$, we set $a := p_{13}$ and $b := p_{14}$. Then the projection $p$ takes the following form
\[
p = \begin{pmatrix} q^{-2} t & 0 & a & b \\ 0 & t & q^{-2} \bar{b} & -q^{2} \bar{a} \\ \bar{a} & q^{-2} b & 1 - q^{-4} t & 0 \\ \bar{b} & -q^{2} a & 0 & 1 - q^{2} t \end{pmatrix} .
\tag{29}
\]
By construction $p^* = p$ and this means that $\bar{t} = t$, as observed, and that $\bar{a}$, $\bar{b}$ are conjugate to $a$, $b$ respectively. Also, by construction $p^2 = p$; this property gives the easiest way to compute the commutation relations between the generators. One finds
\[ ab = q^{4}\, ba , \qquad \bar{a} b = b \bar{a} , \qquad ta = q^{-2}\, at , \qquad tb = q^{4}\, bt , \tag{30} \]
together with their conjugates, and sphere relations
\[
a \bar{a} + b \bar{b} = q^{-2} t (1 - q^{-2} t) , \qquad q^{4} \bar{a} a + q^{-4} \bar{b} b = t (1 - t) , \qquad b \bar{b} - q^{-4} \bar{b} b = (1 - q^{-4})\, t^2 .
\tag{31}
\]
It is straightforward to check also the relation (28).

We define the algebra $A(S_q^4)$ to be the algebra generated by the elements $a$, $\bar{a}$, $b$, $\bar{b}$, $t$ with the commutation relations (30) and (31). For $q = 1$ it reduces to the algebra of polynomial functions on the sphere $S^4$. Otherwise, we can limit ourselves to $|q| < 1$, because the map
\[ q \mapsto q^{-1} , \qquad a \mapsto q^{2} \bar{a} , \qquad b \mapsto q^{-2} \bar{b} , \qquad t \mapsto q^{-2} t \]
yields an isomorphic algebra.
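The check of the sphere relation (28) is left to the reader above; one possible way to organise it, using only the diagonal entries (27), the identifications $p_{13} = a$, $p_{14} = b$, $p_{23} = q^{-2} \bar{b}$, $p_{24} = -q^{2} \bar{a}$ and the sphere relations (31), is sketched here. This is a worked verification under those stated relations, not a derivation taken from the text.

```latex
\begin{align*}
(q^{6} - q^{8})\, p_{11}^2 + p_{22}^2 + p_{44}^2
  &= (q^{2} - q^{4})\, t^2 + t^2 + (1 - q^{2} t)^2
   = 1 - 2 q^{2} t + (1 + q^{2})\, t^2 , \\
q^{4} (p_{13} p_{31} + p_{14} p_{41})
  &= q^{4} (a \bar{a} + b \bar{b})
   = q^{2} t\, (1 - q^{-2} t)
   = q^{2} t - t^2 , \\
q^{2} (p_{24} p_{42} + p_{23} p_{32})
  &= q^{2} (q^{4} \bar{a} a + q^{-4} \bar{b} b)
   = q^{2} t\, (1 - t)
   = q^{2} t - q^{2} t^2 .
\end{align*}
% Adding the three lines, the terms in t and t^2 cancel and the sum equals 1.
```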
At $q = 1$, the projection $p$ in (29) is conjugate to the classical one given in Appendix A by the matrix diag[1, −1, 1, 1] (up to a renaming of the generators). Our sphere $S_q^4$ seems to be different from the one constructed in [3]. Two of our generators commute and, most importantly, it does not come from a deformation of a subgroup (let alone coisotropic) of $Sp(2)$. However, at the continuous level these two quantum spheres are the same since the $C^*$-algebra completion of both polynomial algebras is the minimal unitization $\mathcal{K} \oplus \mathbb{C} I$ of the compact operators on an infinite dimensional separable Hilbert space, a property shared with Podleś standard sphere as well [24]. This fact will be derived in Sect. 4 when we study the representations of the algebra $A(S_q^4)$.

3.2. The $SU_q(2)$-coaction. We now give a coaction of the quantum group $SU_q(2)$ on the sphere $S_q^7$. This coaction will be used later in Sect. 6 when analyzing the quantum principal bundle structure. Let us observe that the two pairs of generators $(x_1, x_2)$, $(x_3, x_4)$ both yield a quantum plane,
\[
\begin{aligned}
x_1 x_2 &= q\, x_2 x_1 , & \bar{x}^1 \bar{x}^2 &= q^{-1} \bar{x}^2 \bar{x}^1 , \\
x_3 x_4 &= q\, x_4 x_3 , & \bar{x}^3 \bar{x}^4 &= q^{-1} \bar{x}^4 \bar{x}^3 .
\end{aligned}
\]
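Then we shall look for a right coaction of $SU_q(2)$ on the rows of the matrix $v$ in (25). Other pairs of generators yield quantum planes but the only choice which gives a projection with the right number of generators is the one given above. The defining matrix of the quantum group $SU_q(2)$ reads
\[ \begin{pmatrix} \alpha & -q \bar{\gamma} \\ \gamma & \bar{\alpha} \end{pmatrix} \tag{32} \]
with commutation relations [30],
\[
\alpha \gamma = q\, \gamma \alpha , \qquad \alpha \bar{\gamma} = q\, \bar{\gamma} \alpha , \qquad \gamma \bar{\gamma} = \bar{\gamma} \gamma , \qquad \alpha \bar{\alpha} + q^{2} \bar{\gamma} \gamma = 1 , \qquad \bar{\alpha} \alpha + \bar{\gamma} \gamma = 1 .
\tag{33}
\]
We define a coaction of $SU_q(2)$ on the matrix (25) by
\[
\delta_R(v) := \begin{pmatrix} q^{-3} x_1 & q^{-2} x_2 \\ -q^{-1} \bar{x}^2 & q^{-1} \bar{x}^1 \\ q^{-1} x_3 & -x_4 \\ -\bar{x}^4 & -\bar{x}^3 \end{pmatrix} \dot{\otimes} \begin{pmatrix} \alpha & -q \bar{\gamma} \\ \gamma & \bar{\alpha} \end{pmatrix} .
\tag{34}
\]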
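We shall prove presently that this coaction comes from a coaction of $A(SU_q(2))$ on the sphere algebra $A(S_q^7)$. For the moment we remark that, by its form in (34), the entries of the projection $p = v v^*$ are automatically coinvariants, a fact that we shall also prove explicitly in the following. On the generators, the coaction (34) is given explicitly by
\[
\begin{aligned}
\delta_R(x_1) &= x_1 \otimes \alpha + q\, x_2 \otimes \gamma , & \delta_R(\bar{x}^1) &= q\, \bar{x}^2 \otimes \bar{\gamma} + \bar{x}^1 \otimes \bar{\alpha} = \overline{\delta_R(x_1)} , \\
\delta_R(x_2) &= -x_1 \otimes \bar{\gamma} + x_2 \otimes \bar{\alpha} , & \delta_R(\bar{x}^2) &= \bar{x}^2 \otimes \alpha - \bar{x}^1 \otimes \gamma = \overline{\delta_R(x_2)} , \\
\delta_R(x_3) &= x_3 \otimes \alpha - q\, x_4 \otimes \gamma , & \delta_R(\bar{x}^3) &= -q\, \bar{x}^4 \otimes \bar{\gamma} + \bar{x}^3 \otimes \bar{\alpha} = \overline{\delta_R(x_3)} , \\
\delta_R(x_4) &= x_3 \otimes \bar{\gamma} + x_4 \otimes \bar{\alpha} , & \delta_R(\bar{x}^4) &= \bar{x}^4 \otimes \alpha + \bar{x}^3 \otimes \gamma = \overline{\delta_R(x_4)} ,
\end{aligned}
\tag{35}
\]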
from which its compatibility with the anti-involution is also clear, i.e. $\delta_R(\bar{x}^i) = \overline{\delta_R(x_i)}$. The map $\delta_R$ in (35) extends as an algebra homomorphism to the whole of $A(S_q^7)$. Then, as alluded to before, we have the following

Proposition 4. The coaction (35) is a right coaction of the quantum group $SU_q(2)$ on the 7-sphere $S_q^7$,
\[ \delta_R : A(S_q^7) \to A(S_q^7) \otimes A(SU_q(2)) . \tag{36} \]
Proof. By using the commutation relations of A(SUq (2)) in (33), a lengthy but easy computation gives that the commutation relations of A(Sq7 ) are preserved. This fact also shows that extending δR as an algebra homomorphism yields a consistent coaction.
Proposition 5. The algebra $A(S_q^4)$ is the algebra of coinvariants under the coaction defined in (35).

Proof. We have to show that $A(S_q^4) = \{f \in A(S_q^7) \mid \delta_R(f) = f \otimes 1\}$. By using the commutation relations of $A(S_q^7)$ and those of $A(SU_q(2))$, we first prove explicitly that the generators of $A(S_q^4)$ are coinvariants:
\[
\begin{aligned}
\delta_R(a) &= q^{-4}\, \delta_R(x_1) \delta_R(\bar{x}^3) - q^{-2}\, \delta_R(x_2) \delta_R(\bar{x}^4) \\
&= q^{-4} x_1 \bar{x}^3 \otimes (\alpha \bar{\alpha} + q^{2} \bar{\gamma} \gamma) - q^{-2} x_2 \bar{x}^4 \otimes (\gamma \bar{\gamma} + \bar{\alpha} \alpha) \\
&= (q^{-4} x_1 \bar{x}^3 - q^{-2} x_2 \bar{x}^4) \otimes 1 = a \otimes 1 , \\
\delta_R(b) &= -q^{-3}\, \delta_R(x_1) \delta_R(x_4) - q^{-2}\, \delta_R(x_2) \delta_R(x_3) \\
&= -q^{-3} x_1 x_4 \otimes (\alpha \bar{\alpha} + q^{2} \bar{\gamma} \gamma) - q^{-2} x_2 x_3 \otimes (\gamma \bar{\gamma} + \bar{\alpha} \alpha) \\
&= -(q^{-3} x_1 x_4 + q^{-2} x_2 x_3) \otimes 1 = b \otimes 1 , \\
\delta_R(t) &= q^{-2}\, \delta_R(\bar{x}^2) \delta_R(x_2) + q^{-2}\, \delta_R(\bar{x}^1) \delta_R(x_1) \\
&= q^{-2} \bar{x}^2 x_2 \otimes (\alpha \bar{\alpha} + q^{2} \bar{\gamma} \gamma) + q^{-2} \bar{x}^1 x_1 \otimes (\gamma \bar{\gamma} + \bar{\alpha} \alpha) \\
&= (q^{-2} \bar{x}^2 x_2 + q^{-2} \bar{x}^1 x_1) \otimes 1 = t \otimes 1 .
\end{aligned}
\]
By construction the coaction is compatible with the anti-involution so that
\[ \delta_R(\bar{a}) = \overline{\delta_R(a)} = \bar{a} \otimes 1 , \qquad \delta_R(\bar{b}) = \overline{\delta_R(b)} = \bar{b} \otimes 1 . \]
In fact, this only shows that $A(S_q^4)$ is made of coinvariants but does not rule out the possibility of other coinvariants not in $A(S_q^4)$. However this does not happen for the following reason. From Eq. (35) it is clear that $w_1 \in \{x_1, x_3, \bar{x}^2, \bar{x}^4\}$ (respectively $w_{-1} \in \{x_2, x_4, \bar{x}^1, \bar{x}^3\}$) are weight vectors of weight 1 (resp. −1) in the fundamental comodule of $SU_q(2)$. It follows that the only possible coinvariants are of the form $(w_1 w_{-1} - q\, w_{-1} w_1)^n$. When $n = 1$ these are just the generators of $A(S_q^4)$.
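The cancellation of the mixed terms in $\delta_R(a)$ is implicit in the proof above. A sketch of that intermediate step, using only the coaction (35), the relation $\alpha \bar{\gamma} = q\, \bar{\gamma} \alpha$ from (33) and its conjugate $\gamma \bar{\alpha} = q\, \bar{\alpha} \gamma$ (valid for real $q$), could read as follows.

```latex
\begin{align*}
\delta_R(x_1)\, \delta_R(\bar{x}^3)
  &= -q\, x_1 \bar{x}^4 \otimes \alpha \bar{\gamma}
     + x_1 \bar{x}^3 \otimes \alpha \bar{\alpha}
     - q^{2}\, x_2 \bar{x}^4 \otimes \gamma \bar{\gamma}
     + q\, x_2 \bar{x}^3 \otimes \gamma \bar{\alpha} , \\
\delta_R(x_2)\, \delta_R(\bar{x}^4)
  &= - x_1 \bar{x}^4 \otimes \bar{\gamma} \alpha
     - x_1 \bar{x}^3 \otimes \bar{\gamma} \gamma
     + x_2 \bar{x}^4 \otimes \bar{\alpha} \alpha
     + x_2 \bar{x}^3 \otimes \bar{\alpha} \gamma .
\end{align*}
% Mixed terms in q^{-4} (first line) - q^{-2} (second line):
\begin{align*}
x_1 \bar{x}^4 &: \quad -q^{-3}\, \alpha \bar{\gamma} + q^{-2}\, \bar{\gamma} \alpha
   = (-q^{-2} + q^{-2})\, \bar{\gamma} \alpha = 0 , \\
x_2 \bar{x}^3 &: \quad q^{-3}\, \gamma \bar{\alpha} - q^{-2}\, \bar{\alpha} \gamma
   = (q^{-2} - q^{-2})\, \bar{\alpha} \gamma = 0 .
\end{align*}
```

The surviving terms are exactly those displayed in the proof; the computations for $b$ and $t$ follow the same pattern.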
Remark 1. The last part of the proof above is also related to the quantum Plücker coordinates. For every 2 × 2 submatrix of (25), let us define the determinant by
\[ \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} := a_{11} a_{22} - q\, a_{12} a_{21} . \tag{37} \]
(Note that a12 , a21 do not commute and so in the previous formula the ordering between them is fixed.) Let mij be the minors of (25) obtained by considering the i, j rows. Then m12 = q 2 p11 = t , m13 = p14 = b , m14 = −q p13 = −q a , m23 = p24 = −q 2 a¯ , m24 = −q p23 = −q −1 b¯ , m34 = −q p33 = q −3 t − q .
(38)
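At $q = 1$, these give the classical Plücker coordinates [1]. The right coaction of $SU_q(2)$ on the 7-sphere $S_q^7$ can be written as
\[
\delta_R(x_1, x_2, x_3, x_4) = (x_1, x_2, x_3, x_4) \,\dot{\otimes} \begin{pmatrix} \alpha & -\bar{\gamma} & 0 & 0 \\ q \gamma & \bar{\alpha} & 0 & 0 \\ 0 & 0 & \alpha & \bar{\gamma} \\ 0 & 0 & -q \gamma & \bar{\alpha} \end{pmatrix} ,
\tag{39}
\]
together with $\delta_R(\bar{x}^i) = \overline{\delta_R(x_i)}$. In the block-diagonal matrix which appears in (39) the second copy is given by $SU_q(2)$ while the first one is twisted as
\[
\begin{pmatrix} \alpha & -\bar{\gamma} \\ q \gamma & \bar{\alpha} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha & \bar{\gamma} \\ -q \gamma & \bar{\alpha} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\]
A similar phenomenon occurs in [3].

Remark 2. It is also interesting to observe that $\delta_R(v^* v) = v^* v \otimes 1 = 1 \otimes 1$. Indeed,
\[
\begin{aligned}
\delta_R(\langle\phi_1|\phi_1\rangle) &= \delta_R(q^{-6} \bar{x}^1 x_1 + q^{-2} x_2 \bar{x}^2 + q^{-2} \bar{x}^3 x_3 + x_4 \bar{x}^4) \\
&= (q^{-5} \bar{x}^2 x_1 - q^{-2} x_1 \bar{x}^2 - q^{-1} \bar{x}^4 x_3 + x_3 \bar{x}^4) \otimes \bar{\gamma} \alpha \\
&\quad + (q^{-4} \bar{x}^2 x_2 + q^{-2} x_1 \bar{x}^1 + \bar{x}^4 x_4 + x_3 \bar{x}^3) \otimes \bar{\gamma} \gamma \\
&\quad + (q^{-6} \bar{x}^1 x_1 + q^{-2} x_2 \bar{x}^2 + q^{-2} \bar{x}^3 x_3 + x_4 \bar{x}^4) \otimes \bar{\alpha} \alpha \\
&\quad + (q^{-5} \bar{x}^1 x_2 - q^{-2} x_2 \bar{x}^1 - q^{-1} \bar{x}^3 x_4 + x_4 \bar{x}^3) \otimes \bar{\alpha} \gamma \\
&= \langle\phi_2|\phi_1\rangle \otimes \bar{\gamma} \alpha + \langle\phi_2|\phi_2\rangle \otimes \bar{\gamma} \gamma + \langle\phi_1|\phi_1\rangle \otimes \bar{\alpha} \alpha + \langle\phi_1|\phi_2\rangle \otimes \bar{\alpha} \gamma \\
&= 1 \otimes (\bar{\gamma} \gamma + \bar{\alpha} \alpha) = 1 \otimes 1 , \\[4pt]
\delta_R(\langle\phi_2|\phi_2\rangle) &= \delta_R(q^{-2} x_1 \bar{x}^1 + q^{-4} \bar{x}^2 x_2 + x_3 \bar{x}^3 + \bar{x}^4 x_4) \\
&= (-q^{-4} \bar{x}^2 x_1 + q^{-1} x_1 \bar{x}^2 + \bar{x}^4 x_3 - q\, x_3 \bar{x}^4) \otimes \alpha \bar{\gamma} \\
&\quad + (q^{-4} \bar{x}^2 x_2 + q^{-2} x_1 \bar{x}^1 + \bar{x}^4 x_4 + x_3 \bar{x}^3) \otimes \alpha \bar{\alpha} \\
&\quad + (q^{-4} \bar{x}^1 x_1 + x_2 \bar{x}^2 + \bar{x}^3 x_3 + q^{2} x_4 \bar{x}^4) \otimes \gamma \bar{\gamma} \\
&\quad + (-q^{-4} \bar{x}^1 x_2 + q^{-1} x_2 \bar{x}^1 + \bar{x}^3 x_4 - q\, x_4 \bar{x}^3) \otimes \gamma \bar{\alpha} \\
&= -q \langle\phi_2|\phi_1\rangle \otimes \alpha \bar{\gamma} + \langle\phi_2|\phi_2\rangle \otimes \alpha \bar{\alpha} + q^{2} \langle\phi_1|\phi_1\rangle \otimes \gamma \bar{\gamma} - q \langle\phi_1|\phi_2\rangle \otimes \gamma \bar{\alpha} \\
&= 1 \otimes (\alpha \bar{\alpha} + q^{2} \gamma \bar{\gamma}) = 1 \otimes 1 , \\[4pt]
\delta_R(\langle\phi_1|\phi_2\rangle) &= q^{-5}\, \delta_R(\bar{x}^1) \delta_R(x_2) - q^{-2}\, \delta_R(x_2) \delta_R(\bar{x}^1) - q^{-1}\, \delta_R(\bar{x}^3) \delta_R(x_4) + \delta_R(x_4) \delta_R(\bar{x}^3) = 0 ,
\end{aligned}
\]
since $\delta_R$ defines a coaction on $S_q^7$ and so preserves its commutation relations.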
4. Representations of the Algebra $A(S_q^4)$

Let us now construct irreducible ∗-representations of $A(S_q^4)$ as bounded operators on a separable Hilbert space $\mathcal{H}$. For the moment, we denote in the same way the elements of the algebra and their images as operators in the given representation. As mentioned before, since $q \mapsto q^{-1}$ gives an isomorphic algebra, we can restrict ourselves to $|q| < 1$. We will consider the representations which are $t$-finite [19], i.e. such that the eigenvectors of $t$ span $\mathcal{H}$. Since the self-adjoint operator $t$ must be bounded due to the spherical relations, from the commutation relations $ta = q^{-2} at$, $t \bar{b} = q^{-4} \bar{b} t$, it follows that the spectrum should be of the form $\lambda q^{2k}$ and $a$, $\bar{b}$ (resp. $\bar{a}$, $b$) act as raising (resp. lowering) operators on the eigenvectors of $t$. Then boundedness implies the existence of a highest weight vector, i.e. there exists a vector $|0, 0\rangle$ such that
\[ t\, |0, 0\rangle = t_{00}\, |0, 0\rangle , \qquad a\, |0, 0\rangle = 0 , \qquad \bar{b}\, |0, 0\rangle = 0 . \]
By evaluating the identity
\[ q^{4} \bar{a} a + b \bar{b} = (1 - q^{-4} t)\, t , \tag{40} \]
which follows from the sphere relations (31), on $|0, 0\rangle$ we have
\[ (1 - q^{-4} t_{00})\, t_{00} = 0 . \]
According to the values of the eigenvalue $t_{00}$ we have two representations.

4.1. The representation β. The first representation, that we call β, is obtained for $t_{00} = 0$. Then, $t\, |0, 0\rangle = 0$ implies $t = 0$. Moreover, using the commutation relations (30) and (31), it follows that this representation is the trivial one
\[ t = 0 , \qquad a = 0 , \qquad b = 0 , \tag{41} \]
the representation Hilbert space being just $\mathbb{C}$; of course, $\beta(1) = 1$.

4.2. The representation σ. The second representation, that we call σ, is obtained for $t_{00} = q^{4}$. This is infinite dimensional. We take the set $|m, n\rangle = N_{mn}\, \bar{a}^m b^n\, |0, 0\rangle$ with $n, m \in \mathbb{N}$, to be an orthonormal basis of the representation Hilbert space $\mathcal{H}$, with $N_{00} = 1$ and $N_{mn} \in \mathbb{R}$ the normalizations, to be computed below. Then
\[ t\, |m, n\rangle = t_{mn}\, |m, n\rangle , \qquad \bar{a}\, |m, n\rangle = a_{mn}\, |m + 1, n\rangle , \qquad b\, |m, n\rangle = b_{mn}\, |m, n + 1\rangle . \]
By requiring that we have a ∗-representation we also have that
\[ a\, |m, n\rangle = a_{m-1,n}\, |m - 1, n\rangle , \qquad \bar{b}\, |m, n\rangle = b_{m,n-1}\, |m, n - 1\rangle , \]
with the following recursion relations:
\[ a_{m,n\pm1} = q^{\pm 2} a_{m,n} , \qquad b_{m\pm1,n} = q^{\pm 2} b_{m,n} , \qquad b_{m,n} = q^{2} a_{2n+1,m} . \]
By explicit computation, we find
\[
\begin{aligned}
t_{m,n} &= q^{2m+4n+4} , \\
a_{m,n} &= N_{mn} N_{m+1,n}^{-1} = (1 - q^{2m+2})^{1/2}\, q^{m+2n+1} , \\
b_{m,n} &= N_{mn} N_{m,n+1}^{-1} = (1 - q^{4n+4})^{1/2}\, q^{2(m+n+2)} .
\end{aligned}
\tag{42}
\]
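In conclusion we have the following action:
\[
\begin{aligned}
t\, |m, n\rangle &= q^{2m+4n+4}\, |m, n\rangle , \\
\bar{a}\, |m, n\rangle &= (1 - q^{2m+2})^{1/2}\, q^{m+2n+1}\, |m + 1, n\rangle , \\
a\, |m, n\rangle &= (1 - q^{2m})^{1/2}\, q^{m+2n}\, |m - 1, n\rangle , \\
b\, |m, n\rangle &= (1 - q^{4n+4})^{1/2}\, q^{2(m+n+2)}\, |m, n + 1\rangle , \\
\bar{b}\, |m, n\rangle &= (1 - q^{4n})^{1/2}\, q^{2(m+n+1)}\, |m, n - 1\rangle .
\end{aligned}
\tag{43}
\]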
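The action (43) can also be tested numerically against the defining relations (30) and (31). The following is a minimal sketch, not part of the original argument: it applies the generators to basis vectors $|m, n\rangle$ stored as sparse dictionaries, so no truncation of the infinite-dimensional space is needed. The sample value `Q = 0.5` and the helper names `apply`, `evaluate`, `max_residual` and `RELATIONS` are assumptions made for illustration.

```python
import math

Q = 0.5  # sample value with 0 < q < 1; an arbitrary choice for the check


def apply(op, vec, q=Q):
    """Apply one generator to a sparse vector {(m, n): coefficient}, following (43)."""
    out = {}
    for (m, n), c in vec.items():
        if op == "t":
            key, k = (m, n), q ** (2 * m + 4 * n + 4)
        elif op == "abar":
            key, k = (m + 1, n), math.sqrt(1 - q ** (2 * m + 2)) * q ** (m + 2 * n + 1)
        elif op == "a":
            key, k = (m - 1, n), math.sqrt(1 - q ** (2 * m)) * q ** (m + 2 * n)
        elif op == "b":
            key, k = (m, n + 1), math.sqrt(1 - q ** (4 * n + 4)) * q ** (2 * (m + n + 2))
        elif op == "bbar":
            key, k = (m, n - 1), math.sqrt(1 - q ** (4 * n)) * q ** (2 * (m + n + 1))
        else:
            raise ValueError(op)
        if k != 0.0:  # a kills the states with m = 0, bbar those with n = 0
            out[key] = out.get(key, 0.0) + c * k
    return out


def evaluate(terms, vec):
    """Evaluate sum of coeff * (product of generators) on vec; rightmost factor acts first."""
    total = {}
    for coeff, ops in terms:
        w = dict(vec)
        for op in reversed(ops):
            w = apply(op, w)
        for key, c in w.items():
            total[key] = total.get(key, 0.0) + coeff * c
    return total


def max_residual(terms, size=6):
    """Largest coefficient of terms|m, n> over 0 <= m, n < size (zero for a true relation)."""
    worst = 0.0
    for m in range(size):
        for n in range(size):
            for c in evaluate(terms, {(m, n): 1.0}).values():
                worst = max(worst, abs(c))
    return worst


q = Q
# Each relation of (30) and (31) written as "left-hand side minus right-hand side".
RELATIONS = {
    "ab = q^4 ba": [(1, ["a", "b"]), (-(q ** 4), ["b", "a"])],
    "abar b = b abar": [(1, ["abar", "b"]), (-1, ["b", "abar"])],
    "ta = q^-2 at": [(1, ["t", "a"]), (-(q ** -2), ["a", "t"])],
    "tb = q^4 bt": [(1, ["t", "b"]), (-(q ** 4), ["b", "t"])],
    "a abar + b bbar = q^-2 t (1 - q^-2 t)":
        [(1, ["a", "abar"]), (1, ["b", "bbar"]), (-(q ** -2), ["t"]), (q ** -4, ["t", "t"])],
    "q^4 abar a + q^-4 bbar b = t (1 - t)":
        [(q ** 4, ["abar", "a"]), (q ** -4, ["bbar", "b"]), (-1, ["t"]), (1, ["t", "t"])],
    "b bbar - q^-4 bbar b = (1 - q^-4) t^2":
        [(1, ["b", "bbar"]), (-(q ** -4), ["bbar", "b"]), (-(1 - q ** -4), ["t", "t"])],
}
```

Calling `max_residual(terms)` for each entry of `RELATIONS` gives the largest deviation from zero on the first few basis vectors; up to floating-point rounding it should vanish if (43) is a representation of (30) and (31).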
It is straightforward to check that all the defining relations (30) and (31) are satisfied. In this representation the algebra generators are all trace class:
\[
\begin{aligned}
\mathrm{Tr}(t) &= q^{4} \sum_m q^{2m} \sum_n q^{4n} = \frac{q^{4}}{(1 - q^{2})(1 - q^{4})} , \\
\mathrm{Tr}(|a|) &= q \sum_{m,n} (1 - q^{2m+2})^{1/2}\, q^{m+2n} = \frac{q}{1 - q^{2}} \sum_m (1 - q^{2m+2})^{1/2}\, q^{m} \le \frac{q}{1 - q^{2}} \sum_m q^{m} = \frac{q}{(1 - q)(1 - q^{2})} , \\
\mathrm{Tr}(|b|) &= q^{4} \sum_{m,n} (1 - q^{4n+4})^{1/2}\, q^{2(n+m)} = \frac{q^{4}}{1 - q^{2}} \sum_n (1 - q^{4n+4})^{1/2}\, q^{2n} \le \frac{q^{4}}{1 - q^{2}} \sum_n q^{2n} = \frac{q^{4}}{(1 - q^{2})^{2}} .
\end{aligned}
\tag{44}
\]
From the sequence of Schatten ideals in the algebra of compact operators one knows [29] that the norm closure of trace class operators gives the ideal of compact operators $\mathcal{K}$. As a consequence, the closure of $A(S_q^4)$ is the $C^*$-algebra $C(S_q^4) = \mathcal{K} \oplus \mathbb{C} I$.

5. The Index Pairings

The ‘defining’ self-adjoint idempotent $p$ in (29) determines a class in the K-theory of $S_q^4$, i.e. $[p] \in K_0[C(S_q^4)]$. A way to prove its nontriviality is by pairing it with a nontrivial element in the dual K-homology, i.e. with (the class of) a nontrivial Fredholm module $[\mu] \in K^0[C(S_q^4)]$. In fact, in order to compute the pairing of K-theory with K-homology, it is more convenient to first compute the corresponding Chern characters in the cyclic homology $\mathrm{ch}_*(p) \in HC_*[A(S_q^4)]$ and cyclic cohomology $\mathrm{ch}^*(\mu) \in HC^*[A(S_q^4)]$ respectively, and then use the pairing between cyclic homology and cohomology [10]. As happens for the q-monopole [14], to compute the pairing and to prove the nontriviality of the bundle it is enough to consider $HC_0[A(S_q^4)]$ and dually to take a suitable trace of the projector. The Chern character of the projection $p$ in (29) has a component in degree zero $\mathrm{ch}_0(p) \in HC_0[A(S_q^4)]$ simply given by the matrix trace,
\[ \mathrm{ch}_0(p) := \mathrm{tr}(p) = 2 - q^{-4} (1 - q^{2})(1 - q^{4})\, t \in A(S_q^4) . \tag{45} \]
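For completeness, the matrix trace in (45) can be read off from the diagonal of (29); the short computation below is a sketch of that step and uses nothing beyond the diagonal entries $q^{-2} t$, $t$, $1 - q^{-4} t$, $1 - q^{2} t$.

```latex
\begin{align*}
\mathrm{tr}(p) &= q^{-2} t + t + (1 - q^{-4} t) + (1 - q^{2} t) \\
  &= 2 - q^{-4} \left( 1 - q^{2} - q^{4} + q^{6} \right) t \\
  &= 2 - q^{-4} (1 - q^{2})(1 - q^{4})\, t .
\end{align*}
```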
The higher degree parts of $\mathrm{ch}_*(p)$ are obtained via the periodicity operator $S$; not needing them here we shall not dwell more upon this point and refer to [10] for the relevant details. As mentioned, the K-homology of an involutive algebra $A$ is given in terms of homotopy classes of Fredholm modules. In the present situation we are dealing with a 1-summable Fredholm module $[\mu] \in K^0[C(S_q^4)]$. This is in contrast to the fact that the analogous element of $K_0(S^4)$ for the undeformed sphere is given by a 4-summable Fredholm module, being the fundamental class of $S^4$. The Fredholm module $\mu := (\mathcal{H}, \Psi, \gamma)$ is constructed as follows. The Hilbert space is $\mathcal{H} = \mathcal{H}_\sigma \oplus \mathcal{H}_\sigma$ and the representation is $\Psi = \sigma \oplus \beta$. Here σ is the representation of $A(S_q^4)$ introduced in (43) and β given in (41) is trivially extended to $\mathcal{H}_\sigma$. The grading operator is
\[ \gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \]
The corresponding Chern character $\mathrm{ch}^*(\mu)$ of the class of this Fredholm module has a component in degree 0, $\mathrm{ch}^0(\mu) \in HC^0[A(S_q^{2n})]$. From the general construction [10], the element $\mathrm{ch}^0(\mu_{ev})$ is the trace
\[ \tau^1(x) := \mathrm{Tr}\,(\gamma \Psi(x)) = \mathrm{Tr}\,(\sigma(x) - \beta(x)) . \tag{46} \]
The operator $\sigma(x) - \beta(x)$ is always trace class. Obviously $\tau^1(1) = 0$. The higher degree parts of $\mathrm{ch}^*(\mu_{ev})$ can again be obtained via a periodicity operator. A similar construction of the class $[\mu]$ and the corresponding Chern character were given in [21] for quantum two and three dimensional spheres. We are ready to compute the pairing:
\[
\begin{aligned}
\langle [\mu], [p] \rangle := \langle \mathrm{ch}^0(\mu), \mathrm{ch}_0(p) \rangle &= -q^{-4} (1 - q^{2})(1 - q^{4})\, \tau^1(t) \\
&= -q^{-4} (1 - q^{2})(1 - q^{4})\, \mathrm{Tr}(t) \\
&= -q^{-4} (1 - q^{2})(1 - q^{4})\, q^{4} (1 - q^{2})^{-1} (1 - q^{4})^{-1} = -1 .
\end{aligned}
\tag{47}
\]
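The value −1 in (47) does not depend on $q$, which can be illustrated numerically. The sketch below sums the eigenvalues $q^{2m+4n+4}$ of $t$ from (43) up to a cutoff and multiplies by the prefactor from (45); the function names and the cutoff are illustrative assumptions, and the sample values of $q$ are arbitrary points in $(0, 1)$.

```python
def trace_t(q, cutoff=200):
    """Partial sum of Tr(t) = sum over m, n >= 0 of q^(2m + 4n + 4), see (43) and (44)."""
    return sum(q ** (2 * m + 4 * n + 4) for m in range(cutoff) for n in range(cutoff))


def index_pairing(q, cutoff=200):
    """Numerical value of -q^-4 (1 - q^2)(1 - q^4) Tr(t), the pairing in (47)."""
    return -(q ** -4) * (1 - q ** 2) * (1 - q ** 4) * trace_t(q, cutoff)


def closed_form_trace(q):
    """Closed form q^4 / ((1 - q^2)(1 - q^4)) from (44)."""
    return q ** 4 / ((1 - q ** 2) * (1 - q ** 4))
```

For moderate values such as `q = 0.3`, `0.5` or `0.9`, `index_pairing(q)` agrees with −1 up to rounding, since the neglected tail of the geometric series is negligible at this cutoff.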
This result shows also that the right $A(S_q^4)$-module $p[A(S_q^4)^4]$ is not free. Indeed, any free module is represented in $K_0[C(S_q^4)]$ by the idempotent 1, and since $\langle [\mu], [1] \rangle = 0$, the evaluation of $[\mu]$ on any free module always gives zero. We can extract the ‘trivial’ element in the K-homology $K^0[C(S_q^4)]$ of the quantum sphere $S_q^4$ and use it to measure the ‘rank’ of the idempotent $p$. This generator corresponds to the trivial generator of the K-homology $K_0(S^4)$ of the classical sphere $S^4$. The latter (classical) generator is the image of the generator of the K-homology of a point by the functorial map $K_*(\iota) : K_0(*) \to K_0(S^N)$, where $\iota : * \to S^N$ is the inclusion of a point into the sphere. Now, the quantum sphere $S_q^4$ has just one ‘classical point’, i.e. the 1-dimensional representation β constructed in Sect. 4.1. The corresponding 1-summable Fredholm module $[\varepsilon] \in K^0[C(S_q^4)]$ is easily described: the Hilbert space is $\mathbb{C}$ with representation β; the grading operator is $\gamma = 1$. Then the degree 0 component $\mathrm{ch}^0(\varepsilon) \in HC^0[A(S_q^{2n})]$ of the corresponding Chern character is the trace given by the representation itself (since it is a homomorphism to a commutative algebra),
\[ \tau^0(x) = \beta(x) , \tag{48} \]
and vanishes on all the generators whereas $\tau^0(1) = 1$. Not surprisingly, the pairing with the class of the idempotent $p$ is
\[ \langle [\varepsilon], [p] \rangle := \tau^0(\mathrm{ch}_0(p)) = \beta(2) = 2 . \tag{49} \]
6. Quantum Principal Bundle Structure

Recall that if $H$ is a Hopf algebra and $P$ a right $H$-comodule algebra with multiplication $m : P \otimes P \to P$ and coaction $\Delta_R : P \to P \otimes H$ and $B \subseteq P$ is the subalgebra of coinvariants, the extension $B \subseteq P$ is $H$-Hopf-Galois if the canonical map
\[ \chi : P \otimes_B P \to P \otimes H , \qquad p' \otimes_B p \mapsto \chi(p' \otimes_B p) = p' p_{(0)} \otimes p_{(1)} , \tag{50} \]
is bijective. As mentioned, for us a quantum principal bundle will be the same as a Hopf-Galois extension. For quantum structure groups which are cosemisimple and have bijective antipodes, as is the case for $SU_q(2)$, Th. I of [28] grants further nice properties. In particular the surjectivity of the canonical map implies bijectivity and faithful flatness of the extension. Moreover, an additional useful result [26] is that the map χ is surjective whenever, for any generator $h$ of $H$, the element $1 \otimes h$ is in its image. This follows from the left $P$-linearity and right $H$-colinearity of the map χ. Indeed, let $h$, $k$ be two elements of $H$ and $\sum p'_i \otimes p_i$, $\sum q'_j \otimes q_j \in P \otimes P$ be such that $\chi(\sum p'_i \otimes_B p_i) = 1 \otimes h$, $\chi(\sum q'_j \otimes_B q_j) = 1 \otimes k$. Then $\chi(\sum p'_i q'_j \otimes_B q_j p_i) = 1 \otimes kh$, that is $1 \otimes kh$ is in the image of χ. But, since the map χ is left $P$-linear, this implies its surjectivity.

Definition 2. Let $P$ be a bimodule over the ring $B$. Given any two elements $|\xi_1\rangle$ and $|\xi_2\rangle$ in the free module $E = \mathbb{C}^m \otimes P$, we shall define $\langle \xi_1 \,\dot{\otimes}_B\, \xi_2 \rangle \in P \otimes_B P$ by
\[ \langle \xi_1 \,\dot{\otimes}_B\, \xi_2 \rangle := \sum_{j=1}^{m} \bar{\xi}_1^{\,j} \otimes_B \xi_2^{\,j} . \tag{51} \]
Analogously, one can define quantities $\langle \xi_1 \,\dot{\otimes}\, \xi_2 \rangle \in P \otimes P$ with the same formula as above and tensor products taken over the ground field $\mathbb{C}$.

Proposition 6. The extension $A(S_q^7) \subset A(Sp_q(2))$ is a faithfully flat $A(Sp_q(1))$-Hopf-Galois extension.

Proof. Now $P = A(Sp_q(2))$, $H = A(Sp_q(1))$ and $B = A(S_q^7)$ and the coaction $\Delta_R$ of $H$ is given just before Prop. 2. Since $A(Sp_q(1)) \cong A(SU_{q^2}(2))$ has a bijective antipode and is cosemisimple ([19], Chap. 11), from the general considerations given above, in order to show the bijectivity of the canonical map
\[ \chi : A(Sp_q(2)) \otimes_{A(S_q^7)} A(Sp_q(2)) \to A(Sp_q(2)) \otimes A(Sp_q(1)) , \]
it is enough to show that all generators $\alpha$, $\gamma$, $\bar{\alpha}$, $\bar{\gamma}$ of $A(Sp_q(1))$ in (17) are in its image. Let $T^2$, $T^3$ be the second and third columns of the defining matrix $T$ of $Sp_q(2)$. We shall think of them as elements of the free module $\mathbb{C}^4 \otimes A(Sp_q(2))$. Obviously,
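$\langle T^i | T^j \rangle = \delta^{ij}$. Recalling that $A(Sp_q(2))$ is both a left and right $A(S_q^7)$-module and using Def. 2, we have that
\[
\chi \begin{pmatrix} \langle T^2 \,\dot{\otimes}_{A(S_q^7)}\, T^2 \rangle & \langle T^2 \,\dot{\otimes}_{A(S_q^7)}\, T^3 \rangle \\ \langle T^3 \,\dot{\otimes}_{A(S_q^7)}\, T^2 \rangle & \langle T^3 \,\dot{\otimes}_{A(S_q^7)}\, T^3 \rangle \end{pmatrix} = 1 \otimes \begin{pmatrix} \alpha & -q^{2} \bar{\gamma} \\ \gamma & \bar{\alpha} \end{pmatrix} .
\]
Indeed,
\[
\begin{aligned}
\chi(\langle T^2 \,\dot{\otimes}_{A(S_q^7)}\, T^2 \rangle) &= \overline{T_i{}^2}\, \Delta_R T_i{}^2 = \langle T^2 | T^2 \rangle \otimes \alpha + \langle T^2 | T^3 \rangle \otimes \gamma = 1 \otimes \alpha , \\
\chi(\langle T^3 \,\dot{\otimes}_{A(S_q^7)}\, T^2 \rangle) &= \overline{T_i{}^3}\, \Delta_R T_i{}^2 = \langle T^3 | T^2 \rangle \otimes \alpha + \langle T^3 | T^3 \rangle \otimes \gamma = 1 \otimes \gamma ;
\end{aligned}
\]
a similar computation giving the other two generators.

Proposition 7. The extension $A(S_q^4) \subset A(S_q^7)$ is a faithfully flat $A(SU_q(2))$-Hopf-Galois extension.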
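Proof. Now $P = A(S_q^7)$, $H = A(SU_q(2))$ and $B = A(S_q^4)$ and the coaction $\delta_R$ of $H$ is given in Prop. 4. As already mentioned, $A(SU_q(2))$ has a bijective antipode and is cosemisimple; then, as before, in order to show the bijectivity of the canonical map
\[ \chi : A(S_q^7) \otimes_{A(S_q^4)} A(S_q^7) \to A(S_q^7) \otimes A(SU_q(2)) , \]
we have to show that all generators $\alpha$, $\gamma$, $\bar{\alpha}$, $\bar{\gamma}$ of $A(SU_q(2))$ in (32) are in its image. Recalling that $A(S_q^7)$ is both a left and right $A(S_q^4)$-module and using Def. 2, we have that
\[
\chi \begin{pmatrix} \langle \phi_1 \,\dot{\otimes}_{A(S_q^4)}\, \phi_1 \rangle & \langle \phi_1 \,\dot{\otimes}_{A(S_q^4)}\, \phi_2 \rangle \\ \langle \phi_2 \,\dot{\otimes}_{A(S_q^4)}\, \phi_1 \rangle & \langle \phi_2 \,\dot{\otimes}_{A(S_q^4)}\, \phi_2 \rangle \end{pmatrix} = 1 \otimes \begin{pmatrix} \alpha & -q \bar{\gamma} \\ \gamma & \bar{\alpha} \end{pmatrix} ,
\]
where $|\phi_1\rangle$, $|\phi_2\rangle$ are the two vectors introduced in Eqs. (23) and (24). Indeed
\[
\begin{aligned}
\chi(\langle \phi_1 \,\dot{\otimes}_{A(S_q^4)}\, \phi_1 \rangle) &= \chi\big( q^{-6} \bar{x}^1 \otimes_{A(S_q^4)} x_1 + q^{-2} x_2 \otimes_{A(S_q^4)} \bar{x}^2 + q^{-2} \bar{x}^3 \otimes_{A(S_q^4)} x_3 + x_4 \otimes_{A(S_q^4)} \bar{x}^4 \big) \\
&= q^{-6} \bar{x}^1 \delta_R(x_1) + q^{-2} x_2\, \delta_R(\bar{x}^2) + q^{-2} \bar{x}^3 \delta_R(x_3) + x_4\, \delta_R(\bar{x}^4) \\
&= q^{-6} \bar{x}^1 x_1 \otimes \alpha + q^{-5} \bar{x}^1 x_2 \otimes \gamma + q^{-2} x_2 \bar{x}^2 \otimes \alpha - q^{-2} x_2 \bar{x}^1 \otimes \gamma \\
&\quad + q^{-2} \bar{x}^3 x_3 \otimes \alpha - q^{-1} \bar{x}^3 x_4 \otimes \gamma + x_4 \bar{x}^4 \otimes \alpha + x_4 \bar{x}^3 \otimes \gamma \\
&= \langle\phi_1|\phi_1\rangle \otimes \alpha + \langle\phi_1|\phi_2\rangle \otimes \gamma = 1 \otimes \alpha , \\[4pt]
\chi(\langle \phi_2 \,\dot{\otimes}_{A(S_q^4)}\, \phi_1 \rangle) &= q^{-5} \bar{x}^2 \delta_R(x_1) - q^{-2} x_1\, \delta_R(\bar{x}^2) - q^{-1} \bar{x}^4 \delta_R(x_3) + x_3\, \delta_R(\bar{x}^4) \\
&= q^{-5} \bar{x}^2 x_1 \otimes \alpha + q^{-4} \bar{x}^2 x_2 \otimes \gamma - q^{-2} x_1 \bar{x}^2 \otimes \alpha + q^{-2} x_1 \bar{x}^1 \otimes \gamma \\
&\quad - q^{-1} \bar{x}^4 x_3 \otimes \alpha + \bar{x}^4 x_4 \otimes \gamma + x_3 \bar{x}^4 \otimes \alpha + x_3 \bar{x}^3 \otimes \gamma \\
&= \langle\phi_2|\phi_1\rangle \otimes \alpha + \langle\phi_2|\phi_2\rangle \otimes \gamma = 1 \otimes \gamma ,
\end{aligned}
\]
with similar computations for the other generators.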
It was proven in [4] that the bundle constructed in [3] is a coalgebra Galois extension [9, 6]. The fact that our bundle $A(S_q^4) \subset A(S_q^7)$ is Hopf-Galois shows also that these two bundles cannot be the same.

On our extension $A(S_q^4) \subset A(S_q^7)$ there is a strong connection. Indeed an $H$-Hopf-Galois extension $B \subseteq P$ for which $H$ is cosemisimple and has a bijective antipode is also equivariantly projective, that is, there exists a left $B$-linear right $H$-colinear splitting $s : P \to B \otimes P$ of the multiplication map $m : B \otimes P \to P$, $m \circ s = \mathrm{id}_P$ [27]. Such a map characterizes the so-called strong connection. Constructing a strong connection is an alternative way to prove that one has a Hopf-Galois extension [12, 13]. In particular, if $H$ has an invertible antipode $S$, an equivalent description of a strong connection can be given in terms of a map $\ell : H \to P \otimes P$ satisfying a list of conditions [17, 7] (see also [15, 5]). We denote by $\Delta$ the coproduct on $H$ with Sweedler notation $\Delta(h) = h_{(1)} \otimes h_{(2)}$, by $\delta : P \to P \otimes H$ the right-comodule structure on $P$ with notation $\delta p = p_{(0)} \otimes p_{(1)}$, and $\delta_l : P \to H \otimes P$ is the induced left $H$-comodule structure of $P$ defined by $\delta_l(p) = S^{-1}(p_{(1)}) \otimes p_{(0)}$. Then, for the map $\ell$ one requires that $\ell(1) = 1 \otimes 1$ and that for all $h \in H$,
$$\begin{aligned}
\chi(\ell(h)) &= 1 \otimes h, \\
\ell(h_{(1)}) \otimes h_{(2)} &= (\mathrm{id} \otimes \delta) \circ \ell(h), \\
h_{(1)} \otimes \ell(h_{(2)}) &= (\delta_l \otimes \mathrm{id}) \circ \ell(h).
\end{aligned} \qquad (52)$$
The splitting $s$ of the multiplication map is then given by
$$s : P \to B \otimes P, \qquad p \mapsto p_{(0)} \, \ell(p_{(1)}).$$
Now, if $g, h \in H$ are such that $\ell(g) = g^1 \otimes g^2$ and $\ell(h) = h^1 \otimes h^2$ satisfy condition (52), so does $\ell(gh)$ defined by
$$\ell(gh) := h^1 g^1 \otimes g^2 h^2. \qquad (53)$$
If $H$ has a PBW basis [18], this fact can be used to iteratively construct $\ell$ once one knows its value on the generators of $H$. For $H = A(SU_q(2))$, with generators $\alpha, \gamma, \bar\alpha$ and $\bar\gamma$, the PBW basis is given by $\alpha^k \gamma^l \bar\gamma^m$, with $k, l, m \in \{0, 1, 2, \dots\}$, and $\gamma^k \bar\gamma^l \bar\alpha^m$, with $k, l \in \{0, 1, 2, \dots\}$ and $m \in \{1, 2, \dots\}$ [30]. Then, for our extension $A(S_q^4) \subset A(S_q^7)$ the map $\ell$ can be constructed as follows. Firstly, we put $\ell(1) = 1 \otimes 1$. Then, on the generators we set
$$\ell(\alpha) := \langle \phi_1 \,\dot{\otimes}\, \phi_1 \rangle, \qquad \ell(\bar\alpha) := \langle \phi_2 \,\dot{\otimes}\, \phi_2 \rangle, \qquad \ell(\gamma) := \langle \phi_2 \,\dot{\otimes}\, \phi_1 \rangle, \qquad \ell(\bar\gamma) := -q^{-1} \langle \phi_1 \,\dot{\otimes}\, \phi_2 \rangle.$$
These expressions for $\ell$ satisfy all the properties (52). Firstly, $\chi(\ell(\alpha)) = 1 \otimes \alpha$ follows from the proof of Prop. 7. Then,
$$\begin{aligned}
(\mathrm{id} \otimes \delta) \circ \ell(\alpha) &= q^{-6} \bar{x}_1 \otimes \delta x_1 + q^{-2} x_2 \otimes \delta \bar{x}_2 + q^{-2} \bar{x}_3 \otimes \delta x_3 + x_4 \otimes \delta \bar{x}_4 \\
&= \langle \phi_1 \,\dot{\otimes}\, \phi_1 \rangle \otimes \alpha + \langle \phi_1 \,\dot{\otimes}\, \phi_2 \rangle \otimes \gamma = \ell(\alpha) \otimes \alpha - q \, \ell(\bar\gamma) \otimes \gamma = \ell(\alpha_{(1)}) \otimes \alpha_{(2)}.
\end{aligned}$$
Moreover
$$\begin{aligned}
(\delta_l \otimes \mathrm{id}) \circ \ell(\alpha) &= q^{-6} (\alpha \otimes \bar{x}_1 - q^2 \bar\gamma \otimes \bar{x}_2) \otimes x_1 + q^{-2} (q \bar\gamma \otimes x_1 + \alpha \otimes x_2) \otimes \bar{x}_2 \\
&\quad + q^{-2} (q^2 \bar\gamma \otimes \bar{x}_4 + \alpha \otimes \bar{x}_3) \otimes x_3 + (-q \bar\gamma \otimes x_3 + \alpha \otimes x_4) \otimes \bar{x}_4 \\
&= \alpha \otimes \langle \phi_1 \,\dot{\otimes}\, \phi_1 \rangle - q \bar\gamma \otimes \langle \phi_2 \,\dot{\otimes}\, \phi_1 \rangle = \alpha \otimes \ell(\alpha) - q \bar\gamma \otimes \ell(\gamma) = \alpha_{(1)} \otimes \ell(\alpha_{(2)}).
\end{aligned}$$
Similar computations can be carried out for $\gamma$, $\bar\alpha$ and $\bar\gamma$. That an iterative procedure constructed by using (53) on the PBW basis leads to a well-defined $\ell$ on the whole of $H = A(SU_q(2))$ will be proven in the forthcoming paper [23], where other elaborations coming from the existence of a strong connection will be presented as well.

6.1. The associated bundle and the coequivariant maps. We now give some elements of the theory of associated quantum vector bundles [8] (see also [11]). Let $B \subset P$ be an $H$-Galois extension with $\Delta_R$ the coaction of $H$ on $P$. Let $\rho : V \to H \otimes V$ be a corepresentation of $H$ with $V$ a finite dimensional vector space. A coequivariant map is an element $\varphi$ in $P \otimes V$ with the property that
$$(\Delta_R \otimes \mathrm{id}) \varphi = (\mathrm{id} \otimes (S \otimes \mathrm{id}) \circ \rho) \varphi, \qquad (54)$$
where $S$ is the antipode of $H$. The collection $\Gamma_\rho(P, V)$ of coequivariant maps is a right and left $B$-module.

The algebraic analogue of bundle nontriviality is translated in the fact that the Hopf-Galois extension $B \subset P$ is not cleft. On the other hand, it is known that for a cleft Hopf-Galois extension, the module of coequivariant maps $\Gamma_\rho(P, V)$ is isomorphic to the free module of coinvariant maps $\Gamma_0(P, V) = B \otimes V$ [8, 14].

For our $A(SU_q(2))$-Hopf-Galois extension $A(S_q^4) \subset A(S_q^7)$, let $\rho_1 : \mathbb{C}^2 \to \mathbb{C}^2 \otimes A(SU_q(2))$ be the fundamental corepresentation of $A(SU_q(2))$ with $\Gamma_1(A(S_q^7), \mathbb{C}^2)$ the right $A(S_q^4)$-module of corresponding coequivariant maps. Now, the projection $p$ in (29) determines a quantum vector bundle over $S_q^4$ whose module of sections is $p[A(S_q^4)^4]$, which is clearly a right $A(S_q^4)$-module. The following proposition is straightforward.

Proposition 8. The modules $E := p[A(S_q^4)^4]$ and $\Gamma_1(A(S_q^7), \mathbb{C}^2)$ are isomorphic as right $A(S_q^4)$-modules.

Proof. Remember that $p = v v^*$ with $v$ in (25). The element $p(F) \in E$, with $F = (f_1, f_2, f_3, f_4)^t$, corresponds to the equivariant map $v^* F \in \Gamma_1(A(S_q^7), \mathbb{C}^2)$.

We expect that a similar construction extends to every irreducible corepresentation of $A(SU_q(2))$ by means of suitable projections giving the corresponding associated bundles [23].

Proposition 9. The Hopf-Galois extension $A(S_q^4) \subset A(S_q^7)$ is not cleft.

Proof. As mentioned, the cleftness of the extension does imply that all modules of coequivariant maps are free. On the other hand, the nontriviality of the pairing (47) between the defining projection $p$ in (29) and the Fredholm module $\mu$ constructed in Sect. 5 also shows that the module $p[A(S_q^4)^4] \simeq \Gamma_\rho(A(S_q^7), \mathbb{C}^2)$ is not free.
Acknowledgements. We thank the referee for many useful comments and suggestions. We are grateful to Tomasz Brzeziński and Piotr M. Hajac for several important remarks on a previous version of the compuscript. Also, Eli Hawkins, Walter van Suijlekom, Marco Tarlini are thanked for very useful discussions.
A. The Classical Hopf Fibration $S^7 \to S^4$

We shall review the classical construction of the basic anti-instanton bundle over the four dimensional sphere $S^4$ in a 'noncommutative parlance' following [16]. This has been useful in the main text for our construction of the quantum deformation of the Hopf bundle. We write the generic element of the group $SU(2)$ as
$$w = \begin{pmatrix} w_1 & w_2 \\ -\bar{w}_2 & \bar{w}_1 \end{pmatrix}. \qquad (55)$$
The $SU(2)$ principal fibration $SU(2) \to S^7 \to S^4$ over the sphere $S^4$ is explicitly realized as follows. The total space is
$$S^7 = \Big\{ z = (z_1, z_2, z_3, z_4) \in \mathbb{C}^4, \ \sum_{i=1}^4 |z_i|^2 = 1 \Big\},$$
with right diagonal action
$$S^7 \times SU(2) \to S^7, \qquad z \cdot w := (z_1, z_2, z_3, z_4) \begin{pmatrix} w_1 & w_2 & 0 & 0 \\ -\bar{w}_2 & \bar{w}_1 & 0 & 0 \\ 0 & 0 & w_1 & w_2 \\ 0 & 0 & -\bar{w}_2 & \bar{w}_1 \end{pmatrix}. \qquad (56)$$
The bundle projection $\pi : S^7 \to S^4$ is just the Hopf projection and it can be explicitly given as $\pi(z_1, z_2, z_3, z_4) := (x, \alpha, \beta)$ with
$$\begin{aligned}
x &= |z_1|^2 + |z_2|^2 - |z_3|^2 - |z_4|^2 = -1 + 2(|z_1|^2 + |z_2|^2) = 1 - 2(|z_3|^2 + |z_4|^2), \\
\alpha &= 2(z_1 \bar{z}_3 + z_2 \bar{z}_4), \qquad \beta = 2(-z_1 z_4 + z_2 z_3).
\end{aligned} \qquad (57)$$
One checks that $|\alpha|^2 + |\beta|^2 + x^2 = \big( \sum_{i=1}^4 |z_i|^2 \big)^2 = 1$.

We need the rank 2 complex vector bundle $E$ associated with the defining left representation $\rho$ of $SU(2)$ on $\mathbb{C}^2$. The quickest way to get this is to identify $S^7$ with the unit sphere in the 2-dimensional quaternionic (right) $\mathbb{H}$-module $\mathbb{H}^2$ and $S^4$ with the projective line $P^1(\mathbb{H})$, i.e. the set of equivalence classes $(w_1, w_2)^t \sim (w_1, w_2)^t \lambda$ with $(w_1, w_2) \in S^7$ and $\lambda \in Sp(1) \simeq SU(2)$. Identifying $\mathbb{H} \simeq \mathbb{C}^2$, the vector $(w_1, w_2)^t \in S^7$ reads
$$v = \begin{pmatrix} z_1 & z_2 \\ -\bar{z}_2 & \bar{z}_1 \\ z_3 & z_4 \\ -\bar{z}_4 & \bar{z}_3 \end{pmatrix}. \qquad (58)$$
This is actually a map from $S^7$ to the Stiefel variety of frames for $E$. In particular, notice that the two vectors $|\psi_1\rangle, |\psi_2\rangle$ given by the columns of $v$ are orthonormal, indeed $v^* v = I_2$. As a consequence, $p := v v^* = |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2|$ is a self-adjoint idempotent (a projector), $p^2 = p$, $p^* = p$. Of course $p$ is $SU(2)$ invariant and hence its entries are functions on $S^4$ rather than $S^7$.
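The two facts used above, that the coordinates (57) land on the unit sphere and that they are unchanged by the right action (56), can be spot-checked numerically. The following is a minimal sketch in plain Python; the function names and the random sampling are illustrative choices, not part of the original construction, and a check at one random point is of course not a proof.

```python
import math
import random


def random_unit_vector(n, rng):
    """Random point on the unit sphere of C^n (normalised Gaussian sample)."""
    z = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in z))
    return [c / norm for c in z]


def hopf_projection(z):
    """Coordinates (x, alpha, beta) of Eq. (57) for z = (z1, z2, z3, z4)."""
    z1, z2, z3, z4 = z
    x = abs(z1) ** 2 + abs(z2) ** 2 - abs(z3) ** 2 - abs(z4) ** 2
    alpha = 2 * (z1 * z3.conjugate() + z2 * z4.conjugate())
    beta = 2 * (-z1 * z4 + z2 * z3)
    return x, alpha, beta


def act(z, w):
    """Right diagonal action of Eq. (56); w = (w1, w2) with |w1|^2 + |w2|^2 = 1."""
    z1, z2, z3, z4 = z
    w1, w2 = w
    return [
        z1 * w1 - z2 * w2.conjugate(),
        z1 * w2 + z2 * w1.conjugate(),
        z3 * w1 - z4 * w2.conjugate(),
        z3 * w2 + z4 * w1.conjugate(),
    ]


rng = random.Random(0)          # fixed seed, only for reproducibility
z = random_unit_vector(4, rng)  # a point of S^7
w = random_unit_vector(2, rng)  # (w1, w2) parametrising an element of SU(2)

x, alpha, beta = hopf_projection(z)
x_w, alpha_w, beta_w = hopf_projection(act(z, w))

# |alpha|^2 + |beta|^2 + x^2 - 1 and the change of (x, alpha, beta) under the action
sphere_defect = abs(abs(alpha) ** 2 + abs(beta) ** 2 + x ** 2 - 1.0)
invariance_defect = max(abs(x - x_w), abs(alpha - alpha_w), abs(beta - beta_w))
print(sphere_defect, invariance_defect)  # both should be at rounding-error level
```

Both printed defects should vanish up to floating-point rounding, which is the numerical counterpart of the statement that $\pi$ is constant along $SU(2)$ orbits.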
An explicit computation yields
$$p = \frac{1}{2} \begin{pmatrix} 1 + x & 0 & \alpha & \beta \\ 0 & 1 + x & -\bar\beta & \bar\alpha \\ \bar\alpha & -\beta & 1 - x & 0 \\ \bar\beta & \alpha & 0 & 1 - x \end{pmatrix}, \qquad (59)$$
where $(x, \alpha, \beta)$ are the coordinates (57) on $S^4$. Then $p \in \mathrm{Mat}_4(C^\infty(S^4, \mathbb{C}))$ is of rank 2 by construction.

The matrix $v$ in (58) is a particular example of the matrices $v$ given in [1], for $n = 1$, $k = 1$, $C_0 = 0$, $C_1 = 1$, $D_0 = 1$, $D_1 = 0$. This gives the (anti-)instanton of charge $-1$ centered at the origin and with unit scale. The only difference is that here we identify $\mathbb{C}^4$ with $\mathbb{H}^2$ as a right $\mathbb{H}$-module. This notwithstanding, the projections constructed in the two formalisms actually coincide. Finally recall that, as mentioned already, the classical limit of our quantum projection (29) is conjugate to (59).

The canonical connection associated with the projector,
$$\nabla := p \circ d : \Gamma^\infty(S^4, E) \to \Gamma^\infty(S^4, E) \otimes_{C^\infty(S^4, \mathbb{C})} \Omega^1(S^4, \mathbb{C}), \qquad (60)$$
corresponds to a Lie-algebra valued ($su(2)$) 1-form $A$ on $S^7$ whose matrix components are given by
$$A_{ij} = \langle \psi_i | d\psi_j \rangle, \qquad i, j = 1, 2. \qquad (61)$$
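As a sanity check on the explicit matrix (59), one can build $v$ from (58) at a sample point of $S^7$ and compare $v v^*$ entry by entry with the closed formula. The sketch below does this in plain Python with matrices stored as lists of rows; the helper names (`frame`, `projector_59`, and so on) are hypothetical and introduced only for this illustration.

```python
import math
import random


def matmul(a, b):
    """Product of matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(m):
    """Conjugate transpose."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a[0])))


def frame(z):
    """The 4 x 2 matrix v of Eq. (58)."""
    z1, z2, z3, z4 = z
    return [[z1, z2],
            [-z2.conjugate(), z1.conjugate()],
            [z3, z4],
            [-z4.conjugate(), z3.conjugate()]]


def projector_59(z):
    """Right-hand side of Eq. (59), with (x, alpha, beta) taken from Eq. (57)."""
    z1, z2, z3, z4 = z
    x = abs(z1) ** 2 + abs(z2) ** 2 - abs(z3) ** 2 - abs(z4) ** 2
    alpha = 2 * (z1 * z3.conjugate() + z2 * z4.conjugate())
    beta = 2 * (-z1 * z4 + z2 * z3)
    ac, bc = alpha.conjugate(), beta.conjugate()
    rows = [[1 + x, 0, alpha, beta],
            [0, 1 + x, -bc, ac],
            [ac, -beta, 1 - x, 0],
            [bc, alpha, 0, 1 - x]]
    return [[entry / 2 for entry in row] for row in rows]


rng = random.Random(1)  # fixed seed, only for reproducibility
raw = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(4)]
norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
z = [c / norm for c in raw]  # a point of S^7

v = frame(z)
p = matmul(v, dagger(v))

orthonormal_defect = max_abs_diff(matmul(dagger(v), v), [[1.0, 0.0], [0.0, 1.0]])
formula_defect = max_abs_diff(p, projector_59(z))
idempotent_defect = max_abs_diff(matmul(p, p), p)
trace_p = sum(p[i][i] for i in range(4))
print(orthonormal_defect, formula_defect, idempotent_defect, trace_p)
```

The first three numbers should be at rounding-error level and the trace should be 2 up to rounding, in line with $v^* v = I_2$, Eq. (59), $p^2 = p$, and the rank of $p$.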
This connection can be used to compute the Chern character of the bundle. Out of the curvature of the connection $\nabla^2 = p(dp)^2$ one has the Chern 2-form and 4-form given respectively by
$$\begin{aligned}
C_1(p) &:= -\frac{1}{2\pi i} \, \mathrm{tr}(p(dp)^2), \\
C_2(p) &:= -\frac{1}{8\pi^2} \big[ \mathrm{tr}(p(dp)^4) - C_1(p) C_1(p) \big],
\end{aligned} \qquad (62)$$
with the trace tr just an ordinary matrix trace. It turns out that the 2-form $p(dp)^2$ has vanishing trace so that $C_1(p) = 0$. As for the second Chern class, a straightforward calculation shows that
$$\begin{aligned}
C_2(p) &= -\frac{1}{32\pi^2} \big[ (x_0 \, dx_4 - x_4 \, dx_0)(d\xi)^3 + 3 \, dx_0 \, dx_4 \, \xi (d\xi)^2 \big] \\
&= -\frac{3}{8\pi^2} \big[ x_0 \, dx_1 \, dx_2 \, dx_3 \, dx_4 + \text{cyclic permutations} \big] \\
&= -\frac{3}{8\pi^2} \, d(\mathrm{vol}(S^4)).
\end{aligned} \qquad (63)$$
The second Chern number is then given by
$$c_2(p) = \int_{S^4} C_2(p) = -\frac{3}{8\pi^2} \int_{S^4} d(\mathrm{vol}(S^4)) = -\frac{3}{8\pi^2} \cdot \frac{8}{3} \pi^2 = -1. \qquad (64)$$
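The arithmetic in (64) rests on the volume of the unit four-sphere being $\frac{8}{3}\pi^2$. A short sketch, assuming the standard formula $\mathrm{vol}(S^n) = 2\pi^{(n+1)/2} / \Gamma\big(\tfrac{n+1}{2}\big)$ for the unit $n$-sphere (this formula is not stated in the text above), reproduces the value $-1$:

```python
import math


def unit_sphere_volume(n):
    """Volume of the unit n-sphere S^n in R^(n+1): 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


vol_s4 = unit_sphere_volume(4)               # expected to equal (8/3) pi^2
c2 = -3.0 / (8.0 * math.pi ** 2) * vol_s4    # right-hand side of Eq. (64)
print(vol_s4, 8 * math.pi ** 2 / 3, c2)
```

Only the normalisation is checked here; the identification of $C_2(p)$ with a multiple of the volume form is the content of (63) and is taken as given.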
The connection $A$ in (61) is (anti-)self-dual, i.e. its curvature $F_A := dA + A \wedge A$ satisfies (anti-)self-duality equations, $*_H F_A = -F_A$, with $*_H$ the Hodge map of the canonical (round) metric on the sphere $S^4$. It is indeed the basic Yang-Mills anti-instanton found in [2].
References
1. Atiyah, M.: The geometry of Yang-Mills fields. Lezioni Fermiane. Accademia Nazionale dei Lincei e Scuola Normale Superiore, Pisa 1979
2. Belavin, A., Polyakov, A., Schwartz, A., Tyupkin, Y.: Pseudoparticles solutions of the Yang-Mills equations. Phys. Lett. 58 B, 85–87 (1975)
3. Bonechi, F., Ciccoli, N., Tarlini, M.: Noncommutative instantons on the 4-sphere from quantum groups. Commun. Math. Phys. 226, 419–432 (2002)
4. Bonechi, F., Ciccoli, N., Dąbrowski, L., Tarlini, L.M.: Bijectivity of the canonical map for the non-commutative instanton bundle. J. Geom. Phys. 51, 71–81 (2004)
5. Brzeziński, T., Dąbrowski, L., Zielinski, B.: Hopf fibration and monopole connection over the contact quantum spheres. J. Geom. Phys. 50, 345–359 (2004)
6. Brzeziński, T., Hajac, P.M.: Coalgebra extensions and algebra coextensions of Galois type. Commun. Algebra 27, 1347–1368 (1999)
7. Brzeziński, T., Hajac, P.M.: The Chern-Galois character. C. R. Acad. Sci. Paris, Ser. I 333, 113–116 (2004)
8. Brzeziński, T., Majid, S.: Quantum group gauge theory on quantum spaces. Commun. Math. Phys. 157, 591–638 (1993) Erratum 167, 235 (1995)
9. Brzeziński, T., Majid, S.: Coalgebra Bundles. Commun. Math. Phys. 191, 467–492 (1998)
10. Connes, A.: Noncommutative geometry. London-New York: Academic Press, 1994
11. Durdevich, M.: Geometry of quantum principal bundles I. Commun. Math. Phys. 175, 427–521 (1996); Geometry of quantum principal bundles II. Rev. Math. Phys. 9, 531–607 (1997)
12. Dąbrowski, L., Grosse, H., Hajac, P.M.: Strong connections and Chern-Connes pairing in the Hopf-Galois theory. Commun. Math. Phys. 206, 247–264 (1999)
13. Hajac, P.M.: Strong connections on quantum principal bundles. Commun. Math. Phys. 182, 579–617 (1996)
14. Hajac, P.M., Majid, S.: Projective module description of the q-monopole. Commun. Math. Phys. 206, 247–264 (1999)
15. Hajac, P.M., Matthes, R., Szymański, W.: A locally trivial quantum Hopf fibration. http://arXiv.org/list/math.QA/0112317, 2001; to appear in Algebra and Representation Theory
16. Landi, G.: Deconstructing monopoles and instantons. Rev. Math. Phys. 12, 1367–1390 (2000)
17. Majid, S.: Quantum and braided group Riemannian geometry. J. Geom. Phys. 30, 113–146 (1999)
18. Kassel, C.: Quantum groups. Berlin-Heidelberg-New York: Springer 1995
19. Klimyk, A., Schmüdgen, K.: Quantum groups and their representations. Berlin-Heidelberg: Springer Verlag, 1997
20. Kreimer, H.F., Takeuchi, M.: Hopf algebras and Galois extensions of an algebra. Indiana Univ. Math. J. 30, 675–692 (1981)
21. Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative differential geometry on the quantum SU(2). I: An algebraic viewpoint. K-Theory 4, 157–180 (1990); Noncommutative differential geometry on the quantum two sphere of P. Podleś. I: An algebraic viewpoint. K-Theory 5, 151–175 (1991)
22. Montgomery, S.: Hopf algebras and their actions on rings. Providence, RI: AMS 1993
23. Pagani, C.: In preparation
24. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
25. Reshetikhin, N.Yu., Takhtadzhyan, L.A., Faddeev, L.D.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
26. Schauenburg, P.: Bi-Galois objects over Taft algebras. Israel J. Math. 115, 101–123 (2000)
27. Schauenburg, P., Schneider, H.: Galois type extensions of noncommutative algebras. In preparation
28. Schneider, H.: Principal homogeneous spaces for arbitrary Hopf algebras. Israel J. Math. 72, 167–195 (1990)
29. Simon, B.: Trace ideals and their applications. Cambridge: Cambridge Univ. Press, 1979
30. Woronowicz, S.L.: Twisted SU(2) group. An example of a noncommutative differential calculus. Publ. Res. Inst. Math. Sci. 23, 117–181 (1987)

Communicated by L. Takhtajan
Commun. Math. Phys. 263, 89–132 (2006)
Digital Object Identifier (DOI) 10.1007/s00220-005-1482-7
Communications in Mathematical Physics

Setting the Quantum Integrand of M-Theory

Daniel S. Freed1, Gregory W. Moore2
1 Department of Mathematics, University of Texas at Austin, 1 University Station, Austin, TX, 78712-0257, USA. E-mail: [email protected]
2 Department of Physics, Rutgers University, Piscataway, NJ 08855-0849, USA. E-mail: [email protected]

Received: 21 September 2004 / Accepted: 26 January 2005
Published online: 24 January 2006 – © Springer-Verlag 2006
Abstract: In anomaly-free quantum field theories the integrand in the bosonic functional integral—the exponential of the effective action after integrating out fermions—is often defined only up to a phase without an additional choice. We term this choice “setting the quantum integrand”. In the low-energy approximation to M-theory the E8 -model for the C-field allows us to set the quantum integrand using geometric index theory. We derive mathematical results of independent interest about pfaffians of Dirac operators in 8k + 3 dimensions, both on closed manifolds and manifolds with boundary. These theorems are used to set the quantum integrand of M-theory for closed manifolds and for compact manifolds with either temporal (global) or spatial (local) boundary conditions. In particular, we show that M-theory makes sense on arbitrary 11-manifolds with spatial boundary, generalizing the construction of heterotic M-theory on cylinders.
Contents
1. Determinants, Pfaffians, and η-Invariants
   1.1. Determinant line bundle
   1.2. Odd dimensions
   1.3. 8k + 3 dimensions
   1.4. ζ-functions
2. M-theory Action on Closed Manifolds
3. (8k + 3)-Dimensional Manifolds with Boundary
   3.1. Generalities
   3.2. Global boundary conditions
   3.3. Local boundary conditions
4. M-Theory Action on Compact Manifolds with Boundary
   4.1. Actions and anomalies
   4.2. Temporal boundary conditions
   4.3. Spatial boundary conditions
5. Further Discussion
   5.1. The E8-model
   5.2. M2-branes
   5.3. Boundary values of C-fields
   5.4. Temporal boundaries and the Hamiltonian anomaly
   5.5. Topological terms
Appendix A: The Gravitino Path Integral
   A.1. The gravitino theory
   A.2. Local analysis of the equations of motion
   A.3. Partition function
   A.4. Boundary conditions for ghosts
Appendix B: Quaternionic Fredholms and Pfaffians

The work of D.F. is supported in part by NSF grant DMS-0305505. The work of G.M. is supported in part by DOE grant DE-FG02-96ER40949.
The low-energy approximation to M-theory is a refinement of classical 11-dimensional supergravity. It has a simple field content: a metric g, a 3-form gauge potential C, and a gravitino. The M-theory action contains rather subtle “Chern-Simons” terms which, on a topologically nontrivial manifold Y , raise delicate issues in the definition of the (exponentiated) action. Some aspects of the problem were resolved by Witten [W1]. The key ingredients are: a quantization law for C and a background magnetic current induced by the fourth Stiefel-Whitney class of the underlying manifold; an expression for the exponentiated Chern-Simons terms using an E8 gauge field and an associated Dirac operator in 12 dimensions; and finally a sign ambiguity in the gravitino partition function. In [DFM] the link to E8 was used to construct a model for the C-field and define precisely the action, assuming that the metric g is fixed. The present paper gives a complete treatment of the M-theory action as a function of both C and g. Furthermore, we treat manifolds with boundary. The boundary may have several components and each component is interpreted either as a fixed time slice (temporal boundary) or a boundary in space (spatial boundary). We do not mix temporal and spatial boundary conditions. Our discussion of spatial boundaries in §4.3 generalizes the case Y = X × [0, 1], where X is a closed 10-manifold, which was described in the work of Horava and Witten [HW1, HW2]. Our analysis here makes it clear that the anomaly cancellation is local. (As emphasized in [BM] the locality of anomaly cancellation in the Horava-Witten model is far from obvious.) In particular, we show that there is no topological obstruction to formulating M-theory on an 11-manifold with an arbitrary number of boundary components, provided an independent E8 super-Yang-Mills multiplet is present on each component. The analysis here is more than a cancellation of anomalies in M-theory. 
Already in [W1] Witten showed that there is a nontrivial Green-Schwarz mechanism canceling global anomalies on closed 11-manifolds. We go further and show that the anomaly is canceled canonically. This is a crucial distinction for the following reason. The absence of anomalies is a necessary condition for a quantum theory to be well-defined, but the cancellation mechanism depends on physically measurable choices. Put differently, there are undetermined phases if the configuration space of bosonic fields is not connected. As we explain quite generally in §4.1, the exponentiated effective action after integrating out fermionic fields is naturally a section of a hermitian line bundle with covariant derivative over the space of bosonic fields. The absence of anomalies means
that the line bundle is geometrically trivializable, i.e., the covariant derivative is flat with no holonomy. If there are no anomalies then global trivializations exist, and a choice of trivialization determines the integrand of the bosonic functional integral. When we make such a choice we say we have set the quantum integrand. The general uniqueness question for settings of the quantum integrand is discussed in §5.5. Our main result is that in M-theory there is a canonical choice of trivialization, thus a canonical setting of the quantum integrand of M-theory. The procedure by which we set the quantum integrand of M-theory is, as we have mentioned, an example of the Green-Schwarz mechanism. Quite generally, by the Green-Schwarz mechanism we mean that setting the quantum integrand involves a trivialization of the tensor product of two line bundles with covariant derivative, one coming from integration over fermionic fields and the other from the simultaneous presence of electric and magnetic current; see [F2, F3, Part 3] for a general discussion. The integral over fermionic fields is a section of a pfaffian line bundle. In this paper we use the E8-model for the C-field to define the exponentiated electric coupling. This has the advantage that the associated line bundle with covariant derivative is defined by Atiyah-Patodi-Singer η-invariants associated to the E8-gauge fields. With this model, then, we can analyze both line bundles in the context of standard invariants of geometric index theory and explicitly write down the trivialization which sets the quantum integrand. The mathematical results we apply to M-theory are given in §1 for closed manifolds and in §3 for manifolds with boundary. Determinant and pfaffian line bundles are usually considered for families of Dirac operators on even dimensional manifolds, but our interest here is in the odd dimensional case.
As we explain in §1.2 there is a second natural real line bundle with covariant derivative in odd dimensions, defined using the exponentiated η-invariant, and it is isomorphic to the determinant line bundle (Proposition 1.16). This isomorphism is equivalent to a trivialization of the tensor product—the trivialization needed for the physics—since the second line bundle is real. This isomorphism induces a real structure on the determinant line bundle in odd dimensions. Also, it induces a nonflat complex trivialization of the determinant line bundle, so gives a definition of the determinant of the Dirac operator in odd dimensions as a complex number [S]; see Remark 1.20. This definition is often used in the physics literature, and is arrived at with Pauli-Villars regularization [R1, R2, ADM]. However, the definition as an element of the determinant line is more fundamental. There is an important refinement (Proposition 1.31) to pfaffians in dimensions 3, 11, 19, . . . which includes the dimensions of interest in M-theory: 11 for the bulk and 3 for M2-branes (§5.2). This refinement is topological in a sense made precise in Appendix B (Proposition B.2). We take up the generalization of this isomorphism to Dirac operators on manifolds with boundary in Sect. 3. Most often considered in the geometric index theory literature are boundary conditions of global type, which in the physics correspond to a temporal boundary. The generalization of the basic theorem to this case is straightforward (§3.2). Local boundary conditions arise in the physics from spatial boundaries, but because they do not exist for every Dirac operator they are less studied. The generalization to this case is more subtle and (in general dimensions) is the subject of the forthcoming thesis of Matthew Scholl. The applications to M-theory on manifolds with boundary appear as Theorem 4.16 (temporal boundary) and Conjecture 4.35 (spatial boundary). 
Our treatment falls short by not defining precisely the partition function of the Rarita-Schwinger (gravitino) field. The definition commonly used in the literature seems
ill-defined due to singularities related to the zeromodes of bosonic ghosts for supersymmetry transformations. Moreover, the derivation of the standard expression in terms of pfaffians of Dirac operators assumes an off-shell formulation of supergravity, something which is lacking in the 10- and 11-dimensional cases. Nevertheless, we take the standard expression as motivation for the line bundle with covariant derivative of which the Rarita-Schwinger partition function is a section. We present a derivation of the standard formula in Appendix A, mostly to motivate the local boundary conditions for the ghost fields which are used in Sect. 4.3. The precise definition of the Rarita-Schwinger partition function is a general issue which we leave to future work. Another issue we do not confront is the dependence of the covariant derivative on the Rarita-Schwinger line bundle on background fluxes. Nontrivial dependence can in principle arise from terms of the form ψGψ in the supergravity action. (There are additional terms of a similar nature in heterotic M-theory.) We believe the above issues will not drastically alter the discussion we give, which is based on the simple assumption that the Rarita-Schwinger partition function is a section of the line bundle in Eq. (2.2), equipped with the standard covariant derivative. Some general issues of independent interest arose during our investigations. One concerns the definition of anomalies and the setting of quantum integrands for manifolds with temporal boundaries. This forms part of the discussion in §4.1 and is elaborated in §5.4 where we relate it to the Hamiltonian interpretation of anomalies. There are interesting mathematical questions which underlie that discussion, but they are not treated here. Another issue concerns boundary values for fields with automorphisms, such as gauge fields. 
Then the boundary condition includes a choice of isomorphism (for example, see [FQ] where gauge theories with finite gauge groups are treated carefully), and this shows up in the physics as certain phases, such as θ-angles. In §5.3 we indicate how this works for the C-field in M-theory.

1. Determinants, Pfaffians, and η-Invariants

The geometry of determinant line bundles was developed in [Q, BF]; see [F1] for a survey. In §1.1 we recall the main points. Our discussion is phrased in general terms and applies in arbitrary dimensions. In odd dimensions Clifford multiplication by the volume form induces a real structure on the determinant line bundle, which we explain in §1.2 by introducing a manifestly real line bundle associated to the η-invariant [APS] and proving it is isomorphic to the determinant line bundle. In §1.3 we prove a refinement in dimensions $8k + 3$ ($k \in \mathbb{Z}_{\geq 0}$) coming from the quaternionic structure. Some details about ζ-functions are addressed in §1.4.
1.1. Determinant line bundle.

Definition 1.1. Let $T$ be a smooth manifold. A geometric family of Dirac operators parametrized by $T$ consists of:
(i) a Riemannian manifold $Y \to T$; that is, a fiber bundle $Y \to T$, a metric on the relative tangent bundle $T(Y/T) \to Y$, and a horizontal distribution $H$ on $Y$ (thus $H \oplus T(Y/T) = TY$); and
(ii) a bundle $M = M^0 \oplus M^1 \to Y$ of complex $\mathbb{Z}/2\mathbb{Z}$-graded $\mathrm{Cliff}(Y/T)$-modules with compatible metric and covariant derivative.

The metric and horizontal distribution determine a Levi-Civita covariant derivative on $T(Y/T) \to Y$. The Riemannian metrics determine a bundle $\mathrm{Cliff}(Y/T) \to Y$ of (finite dimensional) Clifford algebras: the fiber at $y \in Y$ is the real Clifford algebra of the relative cotangent space $T_y^*(Y/T)$. The Clifford module structure on $M$ is given as a map
$$\gamma : T^*(Y/T) \longrightarrow \mathrm{End}(M) \qquad (1.2)$$
which obeys the Clifford relation
$$\gamma(\theta_1)\gamma(\theta_2) + \gamma(\theta_2)\gamma(\theta_1) = -2\langle \theta_1, \theta_2 \rangle, \qquad \theta_1, \theta_2 \in T_y^*(Y/T), \quad y \in Y. \qquad (1.3)$$
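For orientation, the Clifford relation (1.3) with the sign convention used here can be checked in the smallest odd-dimensional case. The sketch below assumes a three-dimensional Euclidean cotangent space and uses $\gamma_j = -i\sigma_j$ with $\sigma_j$ the Pauli matrices; this is an illustrative ungraded $2 \times 2$ model chosen for the example, so it tests the relation and skew-adjointness but does not model the $\mathbb{Z}/2\mathbb{Z}$-grading.

```python
import random


def matmul(a, b):
    """Product of 2 x 2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Scalar multiple of a 2 x 2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2 x 2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


# Pauli matrices.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# gamma_j := -i sigma_j squares to -1 and is skew-adjoint.
GAMMA = [scale(-1j, s) for s in SIGMA]


def gamma(theta):
    """gamma(theta) = sum_j theta_j gamma_j for a real covector theta in R^3."""
    out = [[0j, 0j], [0j, 0j]]
    for coeff, g in zip(theta, GAMMA):
        out = add(out, scale(coeff, g))
    return out


def inner(t1, t2):
    """Euclidean inner product on R^3."""
    return sum(a * b for a, b in zip(t1, t2))


rng = random.Random(2)  # fixed seed, only for reproducibility
theta1 = [rng.uniform(-1.0, 1.0) for _ in range(3)]
theta2 = [rng.uniform(-1.0, 1.0) for _ in range(3)]

g1, g2 = gamma(theta1), gamma(theta2)
anticommutator = add(matmul(g1, g2), matmul(g2, g1))
expected = scale(-2.0 * inner(theta1, theta2), [[1, 0], [0, 1]])

clifford_defect = max(abs(anticommutator[i][j] - expected[i][j])
                      for i in range(2) for j in range(2))
skew_defect = max(abs(dagger(g1)[i][j] + g1[i][j]) for i in range(2) for j in range(2))
print(clifford_defect, skew_defect)  # both should be at rounding-error level
```

A graded version could be obtained by placing $\gamma(\theta)$ in the off-diagonal blocks of a $4 \times 4$ matrix, which keeps both properties and makes the operator odd.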
We ask that the image consist of odd skew-adjoint transformations. The compatibility in the last line of Definition 1.1 also requires that (1.2) be flat. For each $t \in T$ the Dirac operator $D_t = \gamma \circ \nabla$ is defined on sections of $M|_{Y_t} \to Y_t$. It is odd relative to the $\mathbb{Z}/2\mathbb{Z}$-grading.¹

To illustrate the notation let $T$ be a point, so $D$ is a Dirac operator on a single manifold $Y$. Suppose first that $\dim Y = 2m$ is even and $Y$ is spin. Then for the standard Dirac operator $M = S$ is the bundle of $\mathbb{Z}/2\mathbb{Z}$-graded spinors with homogeneous components $S^0, S^1$ of complex rank $2^{m-1}$, the bundles of chiral spinors. The Dirac operator interchanges the chirality of homogeneous spinor fields. The covariant derivative on $S$ is induced from the Levi-Civita covariant derivative. If $\dim Y = 2m + 1$ is odd, then we usually say that spinors are ungraded: there is no chirality. In the $\mathbb{Z}/2\mathbb{Z}$-graded setup we can take each of $S^0, S^1$ to be the ungraded spinor bundle of complex rank $2^m$. This is compatible with the observation that for any $\mathbb{Z}/2\mathbb{Z}$-graded $\mathrm{Cliff}(Y)$-module $M \to Y$ Clifford multiplication by the volume form provides an isomorphism $M^0 \to M^1$ if $\dim Y$ is odd; see the next section for consequences. For Dirac operators with coefficients in a vector bundle $E \to Y$ take $M = S \otimes E$. We occasionally denote this Dirac operator as '$D_M$'. In the application to field theory the parameter space is typically an infinite dimensional space $B$ of all bosonic fields; from this point of view we study the pullback by a map $T \to B$.

Given a geometric family of Dirac operators parametrized by $T$ let
$$H = H^0 \oplus H^1 \longrightarrow T$$
be the Hilbert space bundle whose fiber at $t \in T$ is the space of $L^2$ sections of $M|_{Y_t} \to Y_t$. Assume each fiber $Y_t$ is closed, i.e., compact without boundary. Then the Dirac operator $D_t$ extends to an odd self-adjoint operator on $H_t$, and so $D_t^2$ to an even self-adjoint operator on $H_t$. The spectrum of $D_t^2$ is nonnegative, discrete, has no accumulation points, and the eigenspaces are graded finite dimensional subspaces of $H_t$. Furthermore, if $\lambda^2 > 0$ is an eigenvalue of $D_t^2$, then $D_t/\lambda$ is an isometry from the even component of the $\lambda^2$-eigenspace to the odd component. Define $\mathrm{spec}_0(D_t^2)$ to be the spectrum of $D_t^2$ restricted to $H_t^0$. There is a canonical open cover $\{U_a\}_{a \geq 0}$ of $T$:
$$U_a = \big\{ t \in T : a \notin \mathrm{spec}_0(D_t^2) \big\}. \qquad (1.4)$$

¹ We remark that the $\mathbb{Z}/2\mathbb{Z}$-grading on $M$ is not the physics grading of bosonic and fermionic fields. In our exposition here sections of $M$ (for example, spinor fields) are treated as ordinary commuting fields. When we turn to the physics applications in §2 we use the proper action for fermionic fields.
On $U_a$ we introduce the $\mathbb{Z}/2\mathbb{Z}$-graded vector bundle
$$H(a) = H^0(a) \oplus H^1(a) \longrightarrow U_a \qquad (1.5)$$
whose fiber $H(a)_t$ at $t \in T$ is the sum of the eigenspaces of $D_t^2$ for eigenvalues less than $a$. Then $H(a)$ is smooth of finite rank, with constant rank on each component of $U_a$. Furthermore, the geometric data induces a metric and covariant derivative on $H(a)$. The Dirac operator $D$ restricts to an operator $D(a)$ on $H(a)$. Global geometric invariants of Dirac operators are constructed by patching invariants on $U_a$.

Recall that the determinant line $\mathrm{Det}\, E$ of a finite dimensional ungraded vector space $E$ is its highest exterior power. A linear map $S : E_0 \to E_1$ between vector spaces of the same dimension has a determinant
$$\det S \in \mathrm{Hom}(\mathrm{Det}\, E_0, \mathrm{Det}\, E_1) \cong \mathrm{Det}\, E_1 \otimes (\mathrm{Det}\, E_0)^* \qquad (1.6)$$
which is the induced map on the highest exterior power. The line which appears in (1.6) is the determinant line of the $\mathbb{Z}/2\mathbb{Z}$-graded vector space $E_0 \oplus E_1$. It is natural to grade $\mathrm{Det}\, E$ by $\dim E$. For our purposes we take the grading to lie in $\mathbb{Z}/2\mathbb{Z}$ rather than $\mathbb{Z}$.

Returning to the family of Dirac operators, define the line bundle
$$\mathrm{Det}\, H(a) = \mathrm{Det}\, H^1(a) \otimes \mathrm{Det}\, H^0(a)^* \longrightarrow U_a.$$
For $b > a$ we set
$$H(a, b) = H^0(a, b) \oplus H^1(a, b) \longrightarrow U_a \cap U_b$$
whose fiber at $t$ is the sum of the eigenspaces of $D_t^2$ for eigenvalues between $a$ and $b$. There is a canonical isomorphism
$$\mathrm{Det}\, H(a) \otimes \mathrm{Det}\, H(a, b) \longrightarrow \mathrm{Det}\, H(b) \qquad \text{on } U_a \cap U_b \qquad (1.7)$$
and a canonical nonzero section $\det D(a, b)$ of $\mathrm{Det}\, H(a, b)$, where $D(a, b)_t : H^0(a, b)_t \to H^1(a, b)_t$ is the restriction of the "chiral" Dirac operator $D_t^0 : H_t^0 \to H_t^1$. From (1.7) we obtain the patching isomorphism
$$\mathrm{Det}\, H(a) \longrightarrow \mathrm{Det}\, H(b) \qquad \text{on } U_a \cap U_b, \qquad (1.8)$$
and a cocycle identity on $U_a \cap U_b \cap U_c$, whence a global smooth line bundle $\operatorname{Det} D \to T$. Furthermore, the sections $\det D(a)$ of $\operatorname{Det} H(a)$, defined analogously to $\det D(a,b)$, patch to a smooth section $\det D$ of $\operatorname{Det} D \to T$. The patching isomorphism (1.8) preserves the $\mathbb{Z}/2\mathbb{Z}$ grading of the determinant line: the parity of $\operatorname{Det} D_t$ is the parity of $\operatorname{index} D_t^0$. The metric and covariant derivative on $H(a)$ induce a metric and covariant derivative on $\operatorname{Det} H(a)$, but these are not preserved by (1.8). Modify the metric and covariant derivative to obtain invariance under patching: multiply the metric on $\operatorname{Det} H(a)_t$ by λ2 (1.9) a> 0 define

$$\eta(\alpha)_t[s] = \sum_{\lambda \in \operatorname{spec}^0(\omega_t D_t) \setminus \{0\}} \operatorname{sign}(\lambda - \alpha)\,|\lambda|^{-s} \;-\; \operatorname{sign}(\alpha) \cdot \#\bigl\{\operatorname{spec}^0(\omega_t D_t) \cap \{0\}\bigr\}$$
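and set $\eta(\alpha)_t$ to be the value of the meromorphic continuation of $\eta(\alpha)_t[s]$ at $s = 0$. For $\alpha < \beta$ we have

$$\frac{\eta(\beta)_t}{2} = \frac{\eta(\alpha)_t}{2} - \#\bigl\{\lambda \in \operatorname{spec}^0(\omega_t D_t) : \alpha < \lambda < \beta\bigr\} \quad \text{on } V_\alpha \cap V_\beta. \tag{1.12}$$

We use (1.12) to construct two invariants. First, let $\mathbb{T}$ denote the group of unit norm complex numbers. Then

$$\tau(\alpha) = \exp\Bigl(2\pi i\,\frac{\eta(\alpha)}{2}\Bigr) : V_\alpha \longrightarrow \mathbb{T} \tag{1.13}$$

is invariant under patching, so defines a global function

$$\tau_D : T \longrightarrow \mathbb{T}. \tag{1.14}$$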
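The patching invariance of $\tau$ can be checked on a toy example. The sketch below is only an illustration under simplifying assumptions: the spectrum is a finite list of real numbers, so the value at $s = 0$ is a finite sum and no meromorphic continuation is needed, and $\alpha$, $\beta$ are nonzero and not eigenvalues. The function names `eta` and `tau` and the sample spectrum are hypothetical.

```python
import cmath

def sign(x):
    """Sign of a real number: -1, 0 or +1."""
    return (x > 0) - (x < 0)

def eta(spectrum, alpha):
    """Finite-dimensional stand-in for eta(alpha) at s = 0.

    spectrum: real eigenvalues listed with multiplicity (assumption: finite list).
    alpha: nonzero real number that is not an eigenvalue.
    """
    nonzero_part = sum(sign(lam - alpha) for lam in spectrum if lam != 0)
    zero_modes = sum(1 for lam in spectrum if lam == 0)
    return nonzero_part - sign(alpha) * zero_modes

def tau(spectrum, alpha):
    """exp(2*pi*i * eta(alpha)/2), the toy analogue of (1.13)."""
    return cmath.exp(2j * cmath.pi * eta(spectrum, alpha) / 2)

# Hypothetical sample spectrum with a double zero eigenvalue.
spectrum = [-2.5, -1.0, 0.0, 0.0, 0.7, 1.3, 4.0]
alpha, beta = -0.5, 2.0

# Toy version of (1.12): eta(beta)/2 = eta(alpha)/2 - #{alpha < lambda < beta}.
between = sum(1 for lam in spectrum if alpha < lam < beta)
lhs = eta(spectrum, beta) / 2
rhs = eta(spectrum, alpha) / 2 - between
print(lhs, rhs)
print(tau(spectrum, alpha), tau(spectrum, beta))
```

Because the two half-eta values differ by an integer, the exponentials agree, which is the content of the statement that $\tau(\alpha)$ is invariant under patching.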
Second, we use the integers in the last term of (1.12) to patch a principal Z-bundle on T : the fiber at t ∈ T is
α 0},
$$e^{-2ikL}\,c(t,k) = O\!\left(\frac{1}{k}\right), \qquad k \in D_3 = \{k \mid \operatorname{Im} k < 0,\ \operatorname{Im} k^3 > 0\}. \tag{4.30}$$
In the case of the mKdV-I equation, the given (linearly well-posed) boundary conditions are g0 (t), h0 (t), and h1 (t), and we are looking for expressions for g1 (t), g2 (t), and h2 (t). Substitute (4.26) into (4.29) and rewrite the resulting equation in the form:
$$-e^{2ikL}\int_0^t \mathcal{F}_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau + \int_0^t F_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau + ik\int_0^t F_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau = G_1(t,k) + G_2(t,k) + e^{8ik^3t}c(t,k), \tag{4.31}$$
where

$$G_1(t,k) = e^{2ikL}\Bigl[\,ik\int_0^t \mathcal{F}_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau + k^2\int_0^t \mathcal{F}_1^{(2)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau\Bigr] - k^2\int_0^t F_1^{(2)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau, \tag{4.32}$$
t
t 3 3 G2 (t, k) = 2e2ikL F2 (t, t − 2τ, k)e8ik τ dτ F1 (t, 2τ − t, k)e8ik τ dτ 0 0
t
t 3 8ik 3 τ −2 F2 (t, t − 2τ, k)e dτ F1 (t, 2τ − t, k)e8ik τ dτ. (4.33) 0
0
Let $D = \{k : 0 < \arg k < \pi/3\}$. Considering (4.31) in $D$ as well as replacing $k$ by $Ek$ and by $E^2k$ in (4.31), where $E = e^{2\pi i/3}$, we obtain three equations, which are valid for $k \in D$. These equations can be written in the vector form as follows:

$$E(k)\,U(t,k) = H_1(t,k) + H_2(t,k) + e^{8ik^3t}H_c(t,k), \qquad k \in D, \tag{4.34}$$
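where ($j = 1, 2$)

$$E(k) = \begin{pmatrix} -e^{2ikL} & 1 & 1 \\ -e^{2iEkL} & 1 & E \\ -1 & e^{-2iE^2kL} & E^2 e^{-2iE^2kL} \end{pmatrix}, \qquad U(t,k) = \begin{pmatrix} \int_0^t \mathcal{F}_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \\ \int_0^t F_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \\ ik\int_0^t F_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \end{pmatrix},$$

$$H_j(t,k) = \begin{pmatrix} G_j(t,k) \\ G_j(t,Ek) \\ e^{-2iE^2kL}G_j(t,E^2k) \end{pmatrix}, \qquad H_c(t,k) = \begin{pmatrix} c(t,k) \\ c(t,Ek) \\ e^{-2iE^2kL}c(t,E^2k) \end{pmatrix}.$$

Notice that $\det E(k) \to 1 - E \neq 0$ as $|k| \to \infty$, $k \in \bar{D}$.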
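The limit of $\det E(k)$ can be checked numerically. The following is a minimal sketch, assuming the matrix entries are those obtained by writing (4.31) at $k$, $Ek$ and $E^2k$ and multiplying the third row by $e^{-2iE^2kL}$; the function names, the choice $L = 1$ and the ray $\arg k = \pi/6$ are illustrative assumptions.

```python
import numpy as np

E = np.exp(2j * np.pi / 3)  # primitive cube root of unity

def E_matrix(k, L=1.0):
    """Coefficient matrix of the 3x3 system for mKdV-I (entries are an assumption,
    read off from (4.31) evaluated at k, E*k and E^2*k)."""
    decay = np.exp(-2j * E**2 * k * L)  # factor applied to the third row
    return np.array([
        [-np.exp(2j * k * L),     1.0,   1.0],
        [-np.exp(2j * E * k * L), 1.0,   E],
        [-1.0,                    decay, E**2 * decay],
    ], dtype=complex)

def det_along_ray(radius, angle=np.pi / 6, L=1.0):
    """det E(k) at k = radius * exp(i*angle), with 0 < angle < pi/3 inside D."""
    k = radius * np.exp(1j * angle)
    return np.linalg.det(E_matrix(k, L))

for r in (1.0, 5.0, 20.0):
    d = det_along_ray(r)
    print(r, d, abs(d - (1 - E)))
```

Inside $D$ all three exponentials in the matrix decay, so the printed distance to $1 - E$ shrinks rapidly as the radius grows.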
Multiply (4.34) by $\operatorname{diag}\{k^2, k^2, -ik\}\,E^{-1}(k)\,e^{-8ik^3t'}$, $0 < t' < t$, and integrate along the contour $\partial D^{(0)}$, which is the boundary of $D$ deformed (in its finite part) to pass above the zeros of $\det E(k)$. Then (4.30) implies that the term containing $H_c$ vanishes. In order to evaluate the other terms we will use the following identities (see, e.g., [2]):
$$\int_{\partial D^{(0)}} k^2 \int_0^t \alpha(\tau)\,e^{8ik^3(\tau-t')}\,d\tau\,dk = \frac{\pi}{12}\,\alpha(t'), \tag{4.35}$$

$$\int_{\partial D^{(0)}} k^m \int_0^t \alpha(\tau)\,e^{8ik^3(\tau-t')}\,d\tau\,dk = \int_{\partial D^{(0)}} k^m \Bigl[\int_0^{t'} \alpha(\tau)\,e^{8ik^3(\tau-t')}\,d\tau - \frac{1}{8ik^3}\,\alpha(t')\Bigr]\,dk, \tag{4.36}$$

where $m = 3, 4$ and $\alpha(\tau)$ is a smooth function for $0 < \tau < t$. Then the integration by parts together with Jordan's lemma show that one can pass to the limit as $t' \to t$ in the right-hand side of (4.36). Applying (4.36) to the integral term containing $H_1$ one obtains
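$$\int_{\partial D^{(0)}} \operatorname{diag}\{k^2, k^2, -ik\}\,E^{-1}(k)\,H_1(t,k)\,e^{-8ik^3t'}\,dk = \int_{\partial D^{(0)}} \begin{pmatrix} k^2 & 0 & 0 \\ 0 & k^2 & 0 \\ 0 & 0 & -ik \end{pmatrix} E^{-1}(k) \begin{pmatrix} \tilde{G}_1(t,t',k) \\ \tilde{G}_1(t,t',Ek) \\ e^{-2iE^2kL}\tilde{G}_1(t,t',E^2k) \end{pmatrix} dk, \tag{4.37}$$

where

$$\begin{aligned} \tilde{G}_1(t,t',k) = {} & e^{2ikL}\Bigl[\,ik\int_0^{t'} \mathcal{F}_1^{(1)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau - \frac{1}{8k^2}\,\mathcal{F}_1^{(1)}(t,2t'-t) \\ & \qquad + k^2\int_0^{t'} \mathcal{F}_1^{(2)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau - \frac{1}{8ik}\,\mathcal{F}_1^{(2)}(t,2t'-t)\Bigr] \\ & - k^2\int_0^{t'} F_1^{(2)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau + \frac{1}{8ik}\,F_1^{(2)}(t,2t'-t). \end{aligned} \tag{4.38}$$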
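The constant $\pi/12$ in (4.35) can be motivated by a change of variables. What follows is a formal sketch rather than the argument of [2]: it assumes that the deformed contour may be moved back to $\partial D$ (the inner integral is an entire function of $k$), that $\partial D$ is oriented with $D$ on its left, and that the order of integration may be exchanged in the distributional sense. The variable $l = k^3$ is introduced only for this illustration.

```latex
% On the ray arg k = pi/3, traversed from infinity to 0, l = k^3 runs over (-\infty, 0);
% on the ray arg k = 0, traversed from 0 to infinity, l runs over (0, \infty).
\begin{align*}
\int_{\partial D} k^2 \int_0^t \alpha(\tau)\, e^{8ik^3(\tau-t')}\, d\tau\, dk
  &= \frac{1}{3}\int_{-\infty}^{\infty} \int_0^t \alpha(\tau)\, e^{8il(\tau-t')}\, d\tau\, dl
     && \bigl(l = k^3,\ k^2\,dk = \tfrac{1}{3}\,dl\bigr) \\
  &= \frac{1}{3}\int_0^t \alpha(\tau)\, 2\pi\,\delta\bigl(8(\tau-t')\bigr)\, d\tau \\
  &= \frac{1}{3}\cdot\frac{2\pi}{8}\,\alpha(t') = \frac{\pi}{12}\,\alpha(t'),
     \qquad 0 < t' < t .
\end{align*}
% For (4.36), split the inner integral at tau = t' and integrate the second piece by parts:
\begin{align*}
\int_{t'}^{t} \alpha(\tau)\, e^{8ik^3(\tau-t')}\, d\tau
  = \frac{\alpha(t)\, e^{8ik^3(t-t')}}{8ik^3} - \frac{\alpha(t')}{8ik^3}
    - \frac{1}{8ik^3}\int_{t'}^{t} \alpha'(\tau)\, e^{8ik^3(\tau-t')}\, d\tau .
\end{align*}
```

In the second display, the terms carrying $e^{8ik^3(\tau-t')}$ with $\tau > t'$ decay for $k \in D$, where $\operatorname{Im} k^3 > 0$; this is the place where the text invokes Jordan's lemma, leaving only the boundary term $-\alpha(t')/(8ik^3)$ that appears in (4.36).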
Applying (4.35) to the integral in the left-hand side of (4.34) we arrive at the equation (0) F1 (t, 2t − t) (0) F1 (t, 2t − t) (1) F1 (t, 2t − t) 2 ˜ 1 (t, t , k)
G k 0 0 12 ˜ 1 (t, t , Ek) 0 k 2 0 E −1 (k) dk G = π ∂D (0) 0 0 −ik 2 ˜ 1 (t, t , E 2 k) e−2iE kL G 2
G2 (t, k) k 0 0 12 0 k 2 0 E −1 (k) e−8ik 3 t dk. (4.39) G2 (t, Ek) + 2 π ∂D (0) 0 0 −ik e−2iE kL G2 (t, E 2 k) Evaluating this equation at t = t and using (4.27) and (4.28) we find the following equations for h2 (t), g1 (t), and g2 (t): ˜ 1 (t, k)
G 1 12i ˜ 1 (t, Ek) dk G g1 (t) = g0 (t)N2 (t, t) − k E −1 (k) 3 2 π ∂D (0) 2 kL −2iE 2 ˜ 1 (t, E k) G e
G2 (t, k) 12i e−8ik 3 t dk, G2 (t, Ek) − k E −1 (k) 2 3 π ∂D (0) e−2iE kL G2 (t, E 2 k) 1 g2 (t) = 2λg03 (t) + g0 (t)M2 (t, t) + g1 (t)N2 (t, t) 2 ˜ 1 (t, k)
G 24 ˜ 1 (t, Ek) dk G − k 2 E −1 (k) 2 π ∂D (0) 2 kL −2iE 2 ˜ 1 (t, E k) G e
G2 (t, k) 24 e−8ik 3 t dk, G2 (t, Ek) − k 2 E −1 (k) 2 2 π ∂D (0) e−2iE kL G2 (t, E 2 k) 1 h2 (t) = 2λh30 (t) + h0 (t)M2 (t, t) + h1 (t)N2 (t, t) 2 ˜ 1 (t, k)
G 24 ˜ 1 (t, Ek) dk G − k 2 E −1 (k) 1 π ∂D (0) 2 kL −2iE 2 ˜ 1 (t, E k) G e
G2 (t, k) 24 e−8ik 3 t dk, G2 (t, Ek) − k 2 E −1 (k) (4.40) 2 1 π ∂D (0) e−2iE kL G2 (t, E 2 k) where E −1 (k) j , j = 1, 2, 3, denotes the j th row of E −1 (k) and t 3 1 2ikL ˜ G1 (t, k) = e ik M1 (t, 2τ − t) − h0 (t)N2 (t, 2τ − t) e8ik (τ −t) dτ 2 0 1 1 − 2 h1 (t) + h0 (t)N2 (t, t) 8k 16k 2
t
1 h0 (t) 4ik 0
t 1 3 −k 2 N1 (t, 2τ − t)e8ik (τ −t) dτ + g0 (t). 4ik 0
+k 2
N1 (t, 2τ − t)e8ik
3 (τ −t)
dτ −
(4.41)
˜ 1 (t, k), G2 (t, k), N2 (t, t), M2 (t, t), N2 (t, t), and M2 (t, t) involved in The functions G (4.40) can be expressed in terms of (t, k) and (t, k), see Definitions 3.2 and 3.3. Indeed, (3.6) and (3.9) together with (4.2) give
t
1 3 1 (t, k)e8ik t , 2 0
t 1 3 3 F1 (t, 2τ − t, k)e8ik τ dτ = 1 (t, k)e8ik t , 2 0
t ¯ 8ik 3 τ dτ = 1 (2 (t, k) ¯ − 1), F2 (t, t − 2τ, k)e 2 0
t ¯ 8ik 3 τ dτ = 1 (2 (t, k) ¯ − 1). F2 (t, t − 2τ, k)e 2 0 3
F1 (t, 2τ − t, k)e8ik τ dτ =
(4.42) (4.43) (4.44) (4.45)
Hence, G2 (t, k) can be written as follows: G2 (t, k) =
1 8ik 3 t 2ikL ¯ − 1)1 (t, k) − (2 (t, k) ¯ − 1)1 (t, k) . (2 (t, k) e e 2
(4.46)
Since the exponentials in (4.42) depend on k only through k 3 , supplementing (4.42) with the two equations obtained from (4.42) by replacing k with Ek and E 2 k and taking into account (4.27) we obtain a linear system of equations, the solution of which gives
! t (2) 3 k 2 0 F1 (t, 2τ − t, k)e8ik τ dτ ! t (1) 3 ik 0 F1 (t, 2τ − t, k)e8ik τ dτ ! t (0) 8ik 3 τ dτ 0 F1 (t, 2τ − t, k)e 2 (t, E 2 k) (t, k) + E (t, Ek) + E 1 1 1 1 3 = e8ik t 1 (t, k) + E 2 1 (t, Ek) + E1 (t, E 2 k) . 6 1 (t, k) + 1 (t, Ek) + 1 (t, E 2 k)
(4.47)
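The linear system behind (4.47) is a three-point inversion with the cube root of unity $E$. The sketch below assumes, as the left-hand side of (4.47) suggests, that the quantity being split is a quadratic polynomial in $k$ of the form $k^2 a + ik\,b + c$; the names `split_quadratic`, `a`, `b`, `c` and the sample values are hypothetical.

```python
import cmath

E = cmath.exp(2j * cmath.pi / 3)  # E**3 == 1 and 1 + E + E**2 == 0

def split_quadratic(P, k):
    """Recover (k^2 a, i k b, c) from samples of P(k) = k^2 a + i k b + c
    taken at k, E*k and E^2*k."""
    p0, p1, p2 = P(k), P(E * k), P(E**2 * k)
    k2a = (p0 + E * p1 + E**2 * p2) / 3   # weights 1, E, E^2 isolate the k^2 term
    ikb = (p0 + E**2 * p1 + E * p2) / 3   # weights 1, E^2, E isolate the k term
    c = (p0 + p1 + p2) / 3                # plain average isolates the constant term
    return k2a, ikb, c

# Hypothetical coefficients and sample point.
a, b, c = 0.3 - 1.1j, 2.0 + 0.5j, -0.7j
P = lambda k: k**2 * a + 1j * k * b + c
k = 0.8 + 0.4j

k2a, ikb, c_rec = split_quadratic(P, k)
print(k2a, k**2 * a)
print(ikb, 1j * k * b)
print(c_rec, c)
```

The same weights appear in the three rows of (4.47). The common exponential factor can be pulled out of that system because $(Ek)^3 = (E^2k)^3 = k^3$.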
Similarly, supplementing (4.44) with the two equations obtained from (4.44) by replacing k with Ek and E 2 k and taking into account (4.27) we find
! t (2) 3 k 2 0 F2 (t, t − 2τ, k)e8ik τ dτ ! t (1) 3 ik 0 F2 (t, t − 2τ, k)e8ik τ dτ ! t (0) 3 8ik τ dτ 0 F2 (t, t − 2τ, k)e 2 2 2 (t, Ek) + E 2 (t, E k) 1 2 (t, k) + E = 2 (t, k) + E 2 2 (t, Ek) + E2 (t, E 2 k) . 6 1 (t, k) + 1 (t, Ek) + 1 (t, E 2 k) − 3
(4.48)
Using (4.35) and the expressions for $F_2^{(1)}(t,s)$ and $F_2^{(2)}(t,s)$ in terms of $N_j(t,s)$, $j = 1, 2$ and $M_2(t,s)$, see (4.27), from (4.48) we conclude that
2 N2 (t, t) = 2 (t, k) + E2 (t, Ek) + E 2 2 (t, E 2 k) dk, (4.49) π ∂D (0)
2i M2 (t, t) = −λg02 (t) − k 2 (t, k) + E 2 2 (t, Ek) + E2 (t, E 2 k) dk. π ∂D (0) Similarly,
2 (t, k) + E2 (t, Ek) + E 2 2 (t, E 2 k) dk, (4.50) ∂D (0)
2i M2 (t, t) = −λg02 (t) − k 2 (t, k) + E 2 2 (t, Ek) + E2 (t, E 2 k) dk. π ∂D (0) N2 (t, t) =
2 π
˜ 1: Substituting (4.47) and (4.50) into (4.41) we obtain the following expression for G ˜ 1 (t, k) = e2ikL 1 e8ik 3 t 1 (t, k) − 1 (t, Ek) − 1 (t, E 2 k) − 1 h1 (t) G 3 8k 2
1 1 − h0 (t) 2 (t, ζ ) + E2 (t, Eζ ) + E 2 2 (t, E 2 ζ ) dζ h0 (t)+ 2 4ik 8πk ∂D (0) 1 8ik 3 t 1 − e (4.51) 1 (t, k) + E1 (t, Ek) + E 2 1 (t, E 2 k) + g0 (t). 6 4ik Using (4.46), (4.51), (4.49), and (4.50) in (4.40) we obtain the equations for g1 , g2 , and h2 in terms of and . These equations, together with (3.5) and the similar equation for (t, k) constitute a system of four nonlinear ODEs for 1 , 2 , 1 , and 2 . mKdV-II. The integral representations for A and B are the same as in (4.26)–(4.28) but with g0 and g2 replaced by −g0 and −g2 , respectively. Similarly, the integral representations for A and B are the same as in the case of the mKdV-I, with h0 and h2 replaced by −h0 and −h2 , respectively. The global relation (4.29) and relations (4.30) become e−2ikL A(t, k)B(t, k) − B(t, k)A(t, k) = e8ik t c(t, k), 3
$k \in D_1 \cup D_3$, (4.52)

and

$$c(t,k) = O\!\left(\frac{1}{k}\right), \quad k \in D_1 = \{k \mid \operatorname{Im} k < 0,\ \operatorname{Im} k^3 > 0\}, \qquad e^{2ikL}c(t,k) = O\!\left(\frac{1}{k}\right), \quad k \in D_3 = \{k \mid \operatorname{Im} k > 0,\ \operatorname{Im} k^3 > 0\}, \tag{4.53}$$

respectively. Let $g_0(t)$, $g_1(t)$, and $h_0(t)$ be the given boundary conditions. Then the analysis of the global relation consists in finding equations for $g_2$, $h_1$, and $h_2$. Substitute (4.26) into (4.52) and rewrite the resulting equation in the form:
$$-e^{-2ikL}\int_0^t \mathcal{F}_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau - e^{-2ikL}\,ik\int_0^t \mathcal{F}_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau + \int_0^t F_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau = G_1(t,k) + G_2(t,k) + e^{8ik^3t}c(t,k), \tag{4.54}$$
where

$$G_1(t,k) = e^{-2ikL}\,k^2\int_0^t \mathcal{F}_1^{(2)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau - ik\int_0^t F_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau - k^2\int_0^t F_1^{(2)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau, \tag{4.55}$$
t
t 3 3 G2 (t, k) = 2e−2ikL F2 (t, t − 2τ, k)e8ik τ dτ F1 (t, 2τ − t, k)e8ik τ dτ 0 0
t
t 3 3 −2 F2 (t, t − 2τ, k)e8ik τ dτ F1 (t, 2τ − t, k)e8ik τ dτ. (4.56) 0
0
Let, as above, $D = \{k : 0 < \arg k < \pi/3\}$. Considering (4.54) in $D$ as well as replacing $k$ by $Ek$ and by $E^2k$ in (4.54), where $E = e^{2\pi i/3}$, we get three equations valid for $k \in D$, which can be written in the vector form as follows:

$$E(k)\,U(t,k) = H_1(t,k) + H_2(t,k) + e^{8ik^3t}H_c(t,k), \qquad k \in D, \tag{4.57}$$
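where ($j = 1, 2$)

$$E(k) = \begin{pmatrix} -1 & -1 & e^{2ikL} \\ -1 & -E & e^{2iEkL} \\ -e^{-2iE^2kL} & -E^2 e^{-2iE^2kL} & 1 \end{pmatrix}, \qquad U(t,k) = \begin{pmatrix} \int_0^t \mathcal{F}_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \\ ik\int_0^t \mathcal{F}_1^{(1)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \\ \int_0^t F_1^{(0)}(t,2\tau-t)\,e^{8ik^3\tau}\,d\tau \end{pmatrix},$$

$$H_j(t,k) = \begin{pmatrix} e^{2ikL}G_j(t,k) \\ e^{2iEkL}G_j(t,Ek) \\ G_j(t,E^2k) \end{pmatrix}, \qquad H_c(t,k) = \begin{pmatrix} e^{2ikL}c(t,k) \\ e^{2iEkL}c(t,Ek) \\ c(t,E^2k) \end{pmatrix}.$$

Notice that $\det E(k) \to E - 1 \neq 0$ as $|k| \to \infty$, $k \in \bar{D}$.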
Multiply (4.57) by $\operatorname{diag}\{k^2, -ik, k^2\}\,E^{-1}(k)\,e^{-8ik^3t'}$, $0 < t' < t$, and integrate over the contour $\partial D^{(0)}$, which is the boundary of $D$ deformed (in its finite part) to pass above the zeros of $\det E(k)$. Then (4.53) implies that the term containing $H_c$ vanishes, and the resulting equation takes the form
2 2ikL (0) ˜ 1 (t, t , k)
F1 (t, 2t − t) e G k 0 0 12 (1) 0 k 2 0 E −1 (k) e2iEkL G ˜ 1 (t, t , Ek) dk F1 (t, 2t − t) = (0) π ∂D (0) ˜ 1 (t, t , E 2 k) 0 0 −ik G F1 (t, 2t − t) 2 2ikL
G2 (t, k) e k 0 0 12 0 k 2 0 E −1 (k)e2iEkL G2 (t, Ek)e−8ik 3 t dk, + π ∂D (0) 0 0 −ik G2 (t, E 2 k) (4.58)
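where

$$\begin{aligned} \tilde{G}_1(t,t',k) = {} & e^{-2ikL}\Bigl[\,k^2\int_0^{t'} \mathcal{F}_1^{(2)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau - \frac{1}{8ik}\,\mathcal{F}_1^{(2)}(t,2t'-t)\Bigr] \\ & - ik\int_0^{t'} F_1^{(1)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau + \frac{1}{8k^2}\,F_1^{(1)}(t,2t'-t) \\ & - k^2\int_0^{t'} F_1^{(2)}(t,2\tau-t)\,e^{8ik^3(\tau-t')}\,d\tau + \frac{1}{8ik}\,F_1^{(2)}(t,2t'-t). \end{aligned} \tag{4.59}$$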
Evaluating this equation at $t' = t$ and using (4.27) and (4.28) we find the following equations for $g_2(t)$, $h_1(t)$, and $h_2(t)$:
˜ 1 (t, k) e2ikL G 1 12i ˜ 1 (t, Ek) g1 (t) = − g0 (t)N2 (t, t) − k E −1 (k) e2iEkL G 3 2 π ∂D (0) ˜ G1 (t, E 2 k) 2ikL
G2 (t, k) e 12i 3 − k E −1 (k) e2iEkL G2 (t, Ek) e−8ik t dk, 2 π ∂D (0) G2 (t, E 2 k) 2ikL ˜ 1 (t, k)
e G 1 12i ˜ 1 (t, Ek) dk h1 (t) = − h0 (t)N2 (t, t) − k E −1 (k) e2iEkL G 2 2 π ∂D (0) ˜ 1 (t, E 2 k) G 2ikL
G2 (t, k) e 12i 3 − k E −1 (k) e2iEkL G2 (t, Ek) e−8ik t dk, 3 π ∂D (0) G2 (t, E 2 k) 1 h2 (t) = 2λh30 (t) + h0 (t)M2 (t, t) − h1 (t)N2 (t, t) 2 2ikL ˜ 1 (t, k)
e G 24 ˜ 1 (t, Ek) dk k 2 E −1 (k) 1 e2iEkL G + π ∂D (0) ˜ G1 (t, E 2 k) 2ikL
G2 (t, k) e 24 3 + k 2 E −1 (k) 1 e2iEkL G2 (t, Ek) e−8ik t dk, (4.60) (0) π ∂D G2 (t, E 2 k)
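where

$$\begin{aligned} \tilde{G}_1(t,k) = {} & e^{-2ikL}\Bigl[\,k^2\int_0^{t} \mathcal{N}_1(t,2\tau-t)\,e^{8ik^3(\tau-t)}\,d\tau + \frac{1}{4ik}\,h_0(t)\Bigr] \\ & - ik\int_0^{t} \Bigl[M_1(t,2\tau-t) + \frac{1}{2}\,g_0(t)\,N_2(t,2\tau-t)\Bigr]e^{8ik^3(\tau-t)}\,d\tau + \frac{1}{8k^2}\,g_1(t) + \frac{1}{16k^2}\,g_0(t)\,N_2(t,t) \\ & - k^2\int_0^{t} N_1(t,2\tau-t)\,e^{8ik^3(\tau-t)}\,d\tau - \frac{1}{4ik}\,g_0(t). \end{aligned} \tag{4.61}$$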
Now one can express the functions involved in (4.60) in terms of and . The formulas for N2 (t, t), M2 (t, t), N2 (t, t), and M2 (t, t) have the same form as in the case of mKdV-I, (4.49) and (4.50), whereas 1 3 ¯ − 1)1 (t, k) − (2 (t, k) ¯ − 1)1 (t, k) (4.62) G2 (t, k) = e8ik t e−2ikL (2 (t, k) 2 and
˜ 1 (t, k) = e−2ikL 1 e8ik 3 t 1 (t, k) + E1 (t, Ek) + E 2 1 (t, E 2 k) + 1 h0 (t) G 6 4ik 1 8ik 3 t 1 1 − e 1 (t, k) − 1 (t, Ek) − 1 (t, E 2 k) + 2 g1 (t) − g0 (t) 3 8k 4ik
1 + g0 (t) (4.63) 2 (t, ζ ) + E2 (t, Eζ ) + E 2 2 (t, E 2 ζ ) dζ. 2 8π k ∂D (0)
5. Conclusions

We have presented a general method for the analysis of initial boundary value problems for nonlinear integrable evolution equations on the finite interval and have applied this method to the sine-Gordon and the two mKdV equations. In particular:

1. Given the Dirichlet data for the sG equation, $q(0,t) = g_0(t)$ and $q(L,t) = h_0(t)$, we have characterized the Neumann boundary values $q_x(0,t) = g_1(t)$ and $q_x(L,t) = h_1(t)$ through a system of nonlinear ODEs for four functions of $(t,k)$, two for each end of the interval (see Definitions 3.2 and 3.3). One pair satisfies Eqs. (3.5), the other pair satisfies similar equations, and the Neumann boundary values are given by Eqs. (4.21) and (4.25). Similarly, given the boundary data $q(0,t) = g_0(t)$, $q(L,t) = h_0(t)$, $q_x(L,t) = h_1(t)$ for the mKdV-I equation ($q(0,t) = g_0(t)$, $q_x(0,t) = g_1(t)$, $q(L,t) = h_0(t)$ for the mKdV-II equation), we have characterized the boundary values $q_x(0,t) = g_1(t)$, $q_{xx}(0,t) = g_2(t)$, $q_{xx}(L,t) = h_2(t)$ ($q_{xx}(0,t) = g_2(t)$, $q_x(L,t) = h_1(t)$, $q_{xx}(L,t) = h_2(t)$, respectively) through a system of nonlinear ODEs.

2. Given the initial conditions $q(x,0) = q_0(x)$ ($q(x,0) = q_0(x)$ and $q_t(x,0) = q_1(x)$ for the sG equation) we have defined $\{a(k), b(k)\}$, see Definition 3.1. Given $\{g_l(t)\}_0^{n-1}$ we have defined $\{A(k), B(k)\}$, and given $\{h_l(t)\}_0^{n-1}$ we have defined $\{\mathcal{A}(k), \mathcal{B}(k)\}$, see Definitions 3.2 and 3.3.

3. Given $\{a(k), b(k), A(k), B(k), \mathcal{A}(k), \mathcal{B}(k)\}$ we have defined a Riemann-Hilbert problem for $M(x,t,k)$, and then we have defined $q(x,t)$ in terms of $M$. We have shown that $q(x,t)$ solves the given nonlinear equation and that

$$q(x,0) = q_0(x) \ \ (\text{and } q_t(x,0) = q_1(x) \text{ for sG}), \qquad \partial_x^l q(0,t) = g_l(t), \quad \partial_x^l q(L,t) = h_l(t), \quad 0 \le l \le n-1,$$

see Theorem 3.1.

The most difficult step of this method is the analysis of the global relation coupling the spectral functions. Generally, this leads to a system of nonlinear ODEs. For integrable evolution PDEs on the half-line, there exist particular boundary conditions, the so-called linearizable boundary conditions, for which this nonlinear system can be avoided: the global relation yields directly $S(k)$ in terms of $s(k)$ and the prescribed boundary conditions [21, 15, 3]. Different aspects of linearizable boundary conditions have been studied
by a number of authors, see, for example, [28, 29, 25, 1]. The analysis of linearizable boundary conditions on a finite domain will be presented elsewhere. Here we only note that x-periodic boundary conditions belong to the linearizable class. In this case $S(k) = S_L(k)$ and the global relation simplifies. The analysis of this simplified global relation, together with the results presented in this paper, yields a new formalism for the solution of this classical problem.

The main advantage of the inverse scattering method, in comparison with the standard PDE techniques, is that it yields explicit asymptotic results. Indeed, using the inverse scattering method, the solution of the Cauchy problem on the line for an integrable nonlinear PDE can be expressed through the solution of a matrix Riemann-Hilbert problem which has a jump matrix involving an exponential (x, t)-dependence. By making use of the Deift-Zhou method [11] (which is a nonlinear version of the classical steepest descent method), it is possible to compute explicitly the long time behavior of the solution. Furthermore, using a nontrivial extension of the Deift-Zhou method [7], it is also possible to compute the small dispersion limit of the solution. Neither of these two important asymptotic results can be obtained by standard PDE techniques.

An important feature of the method of [12] is that it yields the solution of the given initial boundary value problem in terms of a matrix Riemann-Hilbert problem which also involves a jump matrix with an exponential (x, t)-dependence. The curve along which this jump matrix is defined is now more complicated, but this does not pose any additional difficulties. Thus, it is again possible to obtain explicit asymptotic results. Indeed, for problems on the half-line, the long time asymptotics for decreasing and for time-periodic boundary conditions is obtained in [17–19, 4] (see also [21] and [3]). Furthermore, the zero dispersion limit of the NLS equation is computed in [26]. For problems on the interval, it is again possible to study the asymptotic properties of the solution. In particular, it should be possible to study the small dispersion limit.

Another important feature of the method of [12] is that it characterizes the generalized Dirichlet-to-Neumann map. For example, for the Dirichlet problem for the NLS equation on the half-line, the method of [12] yields $q_x(0,t)$ in terms of $q(x,0)$ and $q(0,t)$. Actually, it is shown in [2] and [16] that $q_x(0,t)$ can be expressed explicitly through the solution of a system of nonlinear ODEs uniquely defined in terms of $q(x,0)$ and $q(0,t)$. This is the first time in the literature that such an explicit result is obtained for a nonlinear evolution PDE. In this paper we have presented similar results for initial boundary problems on the interval. For example, for the case of the Dirichlet problem for the sG equation, Eqs. (4.21) and (4.25) express $q_x(0,t)$ and $q_x(L,t)$ in terms of a system of four nonlinear ODEs which is uniquely defined in terms of $q(x,0)$, $q_t(x,0)$, $q(0,t)$, and $q(L,t)$. Such explicit results cannot be obtained by the standard PDE techniques.

Different approaches to initial-boundary value problems for soliton equations are presented in [8–10], where the analyticity properties of the scattering matrix for the x-equation of an associated Lax pair are used to obtain either an integro-differential evolution equation or a nonlinear Riemann–Hilbert problem for this scattering matrix. In [24], the boundary value problem for the nonlinear Schrödinger equation on an interval is transformed into a Cauchy problem for periodic profiles, for which the algebro-geometric tools of the finite gap method are applied giving a system of nonlinear ordinary differential equations with algebraic right-hand sides for the time evolution of the associated spectral data.

The analysis of initial-boundary value problems for linear evolution PDEs on the finite interval shows that, while there exists discrete spectrum for the Dirichlet problem of dispersive PDEs involving second order derivatives, typical boundary value problems
for PDEs involving third order derivatives do not possess discrete spectrum [22]. Since the algebro-geometric approach to nonlinear evolution PDEs is based on the nonlinearisation of the discrete spectrum, the above suggests that such an approach for the finite interval may be appropriate for the nonlinear Schrödinger, but not for the mKdV and the KdV.

References

1. Adler, V.E., Gürel, B., Gürses, M., Habibullin, I.: Boundary conditions for integrable equations. J. Phys. A 30, 3505–3513 (1997)
2. Boutet de Monvel, A., Fokas, A.S., Shepelsky, D.: Analysis of the global relation for the nonlinear Schrödinger equation on the half-line. Lett. Math. Phys. 65, 199–212 (2003)
3. Boutet de Monvel, A., Fokas, A.S., Shepelsky, D.: The mKdV equation on the half-line. J. Inst. Math. Jussieu 3(2), 139–164 (2004)
4. Boutet de Monvel, A., Kotlyarov, V.: Generation of asymptotic solitons of the nonlinear Schrödinger equation by boundary data. J. Math. Phys. 44(8), 3185–3215 (2003)
5. Boutet de Monvel, A., Shepelsky, D.: The modified KdV equation on a finite interval. C. R. Math. Acad. Sci. Paris 337(8), 517–522 (2003)
6. Boutet de Monvel, A., Shepelsky, D.: Initial boundary value problem for the mKdV equation on a finite interval. Ann. Inst. Fourier (Grenoble) 54(5), 1477–1495 (2004)
7. Deift, P., Venakides, S., Zhou, X.: New results in small dispersion KdV by an extension of the steepest descent method for Riemann-Hilbert problems. Intl. Math. Res. Notices 1997, No. 6, pp. 286–299
8. Degasperis, A., Manakov, S.V., Santini, P.M.: On the initial-boundary value problems for soliton equations. JETP Letters 74(10), 481–485 (2001)
9. Degasperis, A., Manakov, S.V., Santini, P.M.: Initial-boundary problems for linear and soliton PDEs. Theoret. and Math. Phys. 133(2), 1475–1489 (2002)
10. Degasperis, A., Manakov, S.V., Santini, P.M.: Integrable and nonintegrable initial boundary value problems for soliton equations. J. Nonlinear Math. Phys. 12, 228–243 (2005)
11. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the MKdV equation. Ann. Math. (2) 137, 295–368 (1993)
12. Fokas, A.S.: A unified transform method for solving linear and certain nonlinear PDEs. Proc. Roy. Soc. London Ser. A 453, 1411–1443 (1997)
13. Fokas, A.S.: On the integrability of linear and nonlinear partial differential equations. J. Math. Phys. 41, 4188–4237 (2000)
14. Fokas, A.S.: Two dimensional linear PDEs in a convex polygon. Proc. Roy. Soc. London Ser. A 457, 371–393 (2001)
15. Fokas, A.S.: Integrable nonlinear evolution equations on the half-line. Commun. Math. Phys. 230, 1–39 (2002)
16. Fokas, A.S.: The generalized Dirichlet-to-Neumann map for certain nonlinear evolution PDEs. Commun. Pure Appl. Math. 58, 639–670 (2005)
17. Fokas, A.S., Its, A.R.: An initial-boundary value problem for the sine-Gordon equation. Theor. Math. Physics 92, 388–403 (1992)
18. Fokas, A.S., Its, A.R.: An initial-boundary value problem for the Korteweg-de Vries equation. Math. Comput. Simul. 37, 293–321 (1994)
19. Fokas, A.S., Its, A.R.: The linearization of the initial-boundary value problem of the nonlinear Schrödinger equation. SIAM J. Math. Anal. 27, 738–764 (1996)
20. Fokas, A.S., Its, A.R.: The nonlinear Schrödinger equation on the interval. J. Phys. A 37, 6091–6114 (2004)
21. Fokas, A.S., Its, A.R., Sung, L.Y.: The nonlinear Schrödinger equation on the half-line. Nonlinearity 18, 1771–1822 (2005)
22. Fokas, A.S., Pelloni, B.: A transform method for evolution PDEs on the interval. IMA J. Appl. Math. 70(4), 564–587 (2005)
23. Gardner, C.S., Greene, J.M., Kruskal, M.D., Miura, R.M.: Methods for solving the Korteweg–de Vries equation. Phys. Rev. Lett. 19, 1095–1097 (1967)
24. Grinevich, P.G., Santini, P.M.: The initial-boundary value problem on the interval for the nonlinear Schrödinger equation. The algebro-geometric approach. I. In: V.M. Buchstaber, I.M. Krichever, (eds.), Geometry, Topology, and Mathematical Physics: S.P. Novikov Seminar 2001-2003, Volume 212 of AMS Translations Ser. 2, Providence, R.I.: Amer. Math. Soc., 2004, pp. 157–178
25. Habibullin, I.T.: Bäcklund transformation and integrable boundary-initial value problems. In: Nonlinear world, Volume 1 (Kiev, 1989), River Edge, N.J.: World Sci. Publishing, 1990, pp. 130–138
26. Kamvissis, S.: Semiclassical nonlinear Schrödinger on the half line. J. Math. Phys. 44, 5849–5868 (2003)
27. Lax, P.D.: Integrals of nonlinear equations of evolution and solitary waves. Commun. Pure Appl. Math. 21, 467–490 (1968)
28. Sklyanin, E.K.: Boundary conditions for integrable equations. Funct. Anal. Appl. 21, 86–87 (1987)
29. Tarasov, V.O.: An boundary value problem for the nonlinear Schrödinger equation. Zap. Nauchn. Sem. LOMI 169, 151–165 (1988); [transl.: J. Soviet Math. 54, 958–967 (1991)]
30. Zakharov, V.E., Shabat, A.B.: A scheme for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem. I and II. Funct. Anal. Appl. 8, 226–235 (1974) and 13, 166–174 (1979)
31. Zhou, X.: The Riemann-Hilbert problem and inverse scattering. SIAM J. Math. Anal. 20, 966–986 (1989)
32. Zhou, X.: Inverse scattering transform for systems with rational spectral dependence. J. Differ. Eqs. 115, 277–303 (1995)

Communicated by P. Constantin
Commun. Math. Phys. 263, 173–216 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1496-1
Communications in
Mathematical Physics
Fermionic Quantization and Configuration Spaces for the Skyrme and Faddeev-Hopf Models
Dave Auckly$^1$, Martin Speight$^2$
1 Department of Mathematics, Kansas State University, Manhattan, Kansas 66506, USA
2 School of Mathematics, University of Leeds, Leeds LS2 9JT, England
Received: 1 November 2004 / Accepted: 12 July 2005 Published online: 26 January 2006 – © Springer-Verlag 2006
Abstract: The fundamental group and rational cohomology of the configuration spaces of the Skyrme and Faddeev-Hopf models are computed. Physical space is taken to be a compact oriented 3-manifold, either with or without a marked point representing an end at infinity. For the Skyrme model, the codomain is any Lie group, while for the Faddeev-Hopf model it is $S^2$. It is determined when the topology of configuration space permits fermionic and isospinorial quantization of the solitons of the model within generalizations of the frameworks of Finkelstein-Rubinstein and Sorkin. Fermionic quantization of Skyrmions is possible only if the target group contains a symplectic or special unitary factor, while fermionic quantization of Hopfions is always possible. Geometric interpretations of the results are given.

The first author was partially supported by NSF grant DMS-0204651. The second author was partially supported by EPSRC grant GR/R66982/01.

1. Introduction

The most straightforward procedure for quantizing a Lagrangian dynamical system with configuration space $Q$ is to specify the quantum state by a wavefunction $\psi : Q \to \mathbb{C}$, but many other procedures are possible [30]. One may take $\psi$ to be a section of a complex line bundle over $Q$. Clearly we recover the original quantization if the bundle is trivial. Depending on the topology of $Q$ ($H^2(Q, \mathbb{Z})$) there may be many inequivalent line bundles, giving rise to quantization ambiguity. Quantization ambiguity can be used to generate fermionic or bosonic quantizations of the same classical system. The example of a charged particle under the influence of a magnetic monopole studied by Dirac [12, 13] clearly demonstrates the utility of complex line bundles for analyzing quantum dynamics. See also, the discussion in [36] and [7, Chap.6].

In the case of a Lagrangian field theory supporting topological solitons, configuration space is typically the space of (sufficiently regular) maps from some 3-manifold
(representing physical space) to some target manifold. A famous example of this is the Skyrme model with target space SU(N ), N ≥ 2 and physical space R3 . Here fermionic quantization is phenomenologically crucial since the solitons are taken to represent protons and neutrons. Recall the distinction between bosons and fermions: a macroscopic ensemble of identical bosons behaves statistically as if arbitrarily many particles can lie in the same state, while a macroscopic ensemble of identical fermions behaves as if no two particles can lie in the same state. Photons are examples of particles with bosonic statistics and electrons are examples of particles with fermionic statistics. There are several theoretical models of particle statistics. In quantum mechanics, the wavefunction representing a multiparticle state is symmetric under exchange of any pair of identical bosons, and antisymmetric under exchange of any pair of identical fermions. In conventional perturbative quantum field theory, commuting fields are used to represent bosons and anti-commuting fields are used to represent fermions. More precisely, bosons have commuting creation operators and fermions have anti-commuting creation operators. However, there are times when fermions may arise within a field theory with purely bosonic fundamental fields. This phenomenon is called emergent fermionicity, and it relies crucially on the topological properties of the underlying configuration space of the model. Analogous instances of emergent fermionicity in quantum mechanical (rather than field theoretic) settings are described in [10, 29, 36] and [7, Chap. 7]. When Skyrme originally proposed his model, it was not clear how fermionic quantization of the solitons could be achieved, a fundamental gap which he acknowledged [31]. The possibility of fermionically quantizing the Skyrme model was first demonstrated (for N = 2) by Finkelstein and Rubinstein [15]. 
Full consistency of their quantization procedure was established by Giulini [16]. The case N ≥ 3 was dealt with in a rather different way by Witten [35] at the cost of introducing a new term into the Skyrme action. This was a crucial development, since the N = 3 model is particularly phenomenologically favoured. Although the approaches of Finkelstein-Rubinstein and Witten appear quite different, they can be treated in a common framework, as demonstrated by Sorkin [32]. See [1] and [7] for exposition about where the Skyrme model fits into modern physics. Also see [2], and [28] for a discussion of fermionic quantization of SU(N ) valued Skyrme fields. We will review the Finkelstein-Rubinstein and Sorkin models of particle statistics in Sect. 5 below, and describe the obvious generalization of their models when the domain of the soliton is something other than R3 . Spin is a property of a particle state associated with how it transforms under spatial rotations. Let us briefly review the usual mathematical models of spin. There are two general situations. In one, space admits a global rotational symmetry, in the other, it does not. When physical space is R3 , so space-time is the usual Minkowski space, the classical rotational symmetries induce quantum symmetries that are representations of the Spin group. These irreducible representations are labeled by half integers (one half of the number of boxes in a Young diagram for a representation of SU(2).) The integral representations are honest representations of the rotation group, but the fractional ones are not. The spin of a particle is the half integer labelling the SU(2) representation under which its wavefunction transforms. If the spin is not an integer, the particle is said to be spinorial. When physical space is not R3 , one can instead consider the bundle of frames over spacetime (vierbeins). Spin may then be modeled by the action of the rotation group on these frames [9]. 
The easiest case in this direction is when space-time admits a spin structure. This reconciliation of spin with the possibility of a curved space time was an important discovery in the last century. The situation is parallel in nonlinear models. In
the easiest example of a model with solitons incorporating spin, the configuration space is not R3 , but maps defined on R3 . The rotation group acts on such maps by precomposition. Sorkin described a model for the spin of solitons which generalizes to any maps defined on any domain. We cover this in Sect. 5. Although spin and particle statistics have completely different conceptual origins, there are strong connexions between the two. The spin-statistics theorem asserts, in the context of axiomatic quantum field theory, that particles are fermionic if and only if they are spinorial. Said differently, any particle with fractional spin is a fermion, and any particle with integral spin is a boson. Analogous spin-statistics theorems have been found for solitons also. Such a theorem was proved for SU(2) Skyrmions on R3 by Finkelstein and Rubinstein [15], and for arbitrary textures on R3 by Sorkin [32], using only the topology of configuration space. By a texture, we mean that the field must approach a constant limiting value at spatial infinity, in contrast to, say, monopoles and vortices. So pervasive is the link between fermionicity and spinoriality that the two are often elided. For example, Ramadas argued that it was possible to fermionically quantize SU(N ) Skyrmions on R3 because it was possible to spinorially quantize them [28]. Isospin is the conserved quantity analogous to spin associated with an internal rotational symmetry. As in the simplest model of spin, a particle’s isospin is a half integer labelling the representation of the quantum symmetry group corresponding to the classical internal SO(3) symmetry. In all the models we consider, the target space has a natural SO(3) action, so it will make sense to determine whether these models admit isospinorial quantization in the usual sense. 
Krusch [24] has shown that SU(2) Skyrmions are spinorial if and only if they are isospinorial, which is in good agreement with nuclear phenomenology, since they represent bound states of nucleons. Recall that nucleon is the collective term for the proton and neutron. Both have spin and isospin 1/2 but are in different eigenstates of the third component of isospin: the proton has $I_3 = 1/2$, and the neutron has $I_3 = -1/2$. In general, the integrality of spin is unrelated to that of isospin. Strange hadrons can have half-integer spin and integer isospin (and vice-versa). For example, the $\Sigma$-baryon has isospin 1, but spin 1/2, and the $K^*$ mesons have isospin 1/2 and spin 1. One would hope, therefore, that the correlation found by Krusch fails in the SU(N) Skyrme model with N > 2, since this is supposed to model low energy QCD with more than two light quark flavours, and should therefore be able to accommodate the more exotic baryons. The mathematical reader unfamiliar with spin, isospin, strangeness, etc. may find the book by Halzen and Martin [18] and the comprehensive listing of particles and their properties in [26] helpful.

Emergent fermionicity, like (iso)spinoriality, can often be incorporated into a quantum system by exploiting the possibility of differences between the classical and quantum symmetries of the space of quantum states [7, Chap. 7]. A spinning top is a well known example of this. The classical symmetry group is SO(3), while the quantum symmetry group for some quantizations is SU(2) [10, 29]. An electron in the field of a magnetic monopole is also a good example [7, Chap. 7], [36]. We emphasize, however, that emergent fermionicity does not depend on any symmetry assumptions. In fact, the model of particle statistics that we mainly consider (Sorkin's model) depends only on the topology of the configuration space.
The purpose of this paper is to determine the quantization ambiguity for a wide class of field theories supporting topological solitons of texture type in 3 spatial dimensions. We will allow (the one point compactification of) physical space to be any compact, oriented 3-manifold M and target space to be any Lie group G, or the 2-sphere. The results use only the topology of configuration space and are completely independent of
the dynamical details of the field theory. They cover, therefore, the Faddeev-Hopf and general Skyrme models on any orientable domain. Our main mathematical results will be the computation of the fundamental group and the rational or real cohomology ring of $Q$. (The universal coefficient theorem implies that the rational dimension of the rational cohomology is equal to the real dimension of the real cohomology. Homotopy theorists tend to express results using rational coefficients and physicists tend to use real or complex coefficients.) We shall see that quantization ambiguity, as described by $H^2(Q, \mathbb{Z})$, may be reconstructed from these data. We also give geometric interpretations of the algebraic results which are useful for purposes of visualization. We then determine under what circumstances the quantization ambiguity allows for consistent fermionic quantization of Skyrmions and Hopfions within the frameworks of Finkelstein-Rubinstein and Sorkin. We finally discuss the spinorial and isospinorial quantization of these models.

The main motivation for this work was to test the phenomenon of emergent fermionicity (i.e. fermionic solitons in a bosonic theory) in the Skyrme model to see, in particular, whether it survives the generalization from domain $\mathbb{R}^3$ to domain $M$. Our philosophy is that a concept in field theory which cannot be properly formulated on any oriented domain should not be considered fundamental. In fact, we shall see indications that emergent fermionicity is insensitive to the topology of $M$, but depends crucially on the topology of the target space.

2. Notation and Statement of Results

Recall that topologically distinct complex line bundles over a topological space $Q$ are classified by $H^2(Q, \mathbb{Z})$. Note that $Q$ need have no differentiable structure to make sense of this: we can define $c_1(L) \in H^2(Q, \mathbb{Z})$ corresponding to bundle $L$ directly in terms of the transition functions of $L$ rather than thinking of it as the curvature of a unitary connexion on $L$ [30].
So this classification applies in the cases of interest. The free part of $H^2(Q, \mathbb{Z})$ is determined by $H^2(Q, \mathbb{R})$, while its torsion is isomorphic to the torsion of $H_1(Q)$. For any topological space, $H_1(Q)$ is isomorphic to the abelianization of $\pi_1(Q)$. The bundle classification problem is solved, therefore, once we know $\pi_1(Q)$ and $H^2(Q, \mathbb{R})$.

In this section we will define the configuration spaces that we consider, set up notation and state our main topological results. There are in fact many different but related configuration spaces that we could consider (for example spaces of free maps versus spaces of base pointed maps) and several different possibilities depending on whether the domain is connected, etc. We give clean statements of our results for special cases in this section, and describe how to obtain the most general results in the next section. The next section will also include some specific examples. Of course homotopy theorists have studied the algebraic topology of spaces of maps. The paper by Federer gives a spectral sequence whose limit group is a sum of composition factors of homotopy groups for a space of based maps [14]. We do not need a way to compute; we actually need the computations, and this is what is contained here.

Let $M$ be a compact, oriented 3-manifold and $G$ be any Lie group. Then the first configuration space we consider is either $\mathrm{FreeMaps}(M, G)$, the space of continuous maps $M \to G$, or $G^M$, the subset of $\mathrm{FreeMaps}(M, G)$ consisting of those maps which send a chosen basepoint $x_0 \in M$ to $1 \in G$. We will address configuration spaces of $S^2$-valued maps later in this section and paper. Both $\mathrm{FreeMaps}(M, G)$ and $G^M$ are given the compact open topology. In practice some Sobolev topology depending on the energy functional is probably appropriate. The issue of checking the algebraic topology
arguments given in this paper for classes of Sobolev maps is interesting. See [6] for a discussion of the correct setting and arguments generalizing the labels of the path components of these configuration spaces for Sobolev maps. The space $\mathrm{FreeMaps}(M, G)$ is appropriate to the $G$-valued Skyrme model on a genuinely compact domain, while $G^M$ is appropriate to the case where physical space $\hat{M}$ is noncompact but has a connected end at infinity which, for finite energy maps, may be regarded as a single point $x_0$ in the one point compactification $M$ of $\hat{M}$.

The space $G^M$ splits into disjoint path components which are labeled by certain cohomology classes on $M$ [4]. Let $(G^M)_0$ be the identity component of the topological group $G^M$, that is, the path component containing the constant map $u(x) = 1$. In physical terms $(G^M)_0$ is the vacuum sector of the model. Then $(G^M)_0$ is a normal subgroup of $G^M$ whose cosets are precisely the other path components. The set of path components itself has a natural group structure. As a set, the space of path components of the based maps is given by the following proposition.

Proposition 1 (Auckly-Kapitanski). Let $G$ be a compact, connected Lie group and $M$ be a connected, closed 3-manifold. The set of path components of $G^M$ is
$G^M/(G^M)_0 \cong H^3(M; \pi_3(G)) \times H^1(M; H_1(G))$.

The reason the above proposition only describes the set of path components is that Auckly and Kapitanski only establish an exact sequence
$0 \to H^3(M; \pi_3(G)) \to G^M/(G^M)_0 \to H^1(M; H_1(G)) \to 0$.
To understand the group structure on the set of path components one would have to understand a bit more about this sequence (e.g. does it split?). Every path component of $G^M$ is homeomorphic to $(G^M)_0$ since, for a fixed map $\tilde{u}$ in a given component, $u(x) \mapsto \tilde{u}(x)^{-1} u(x)$ is a homeomorphism $(G^M)_{\tilde{u}} \to (G^M)_0$ from the component containing $\tilde{u}$ onto the identity component. Our first result computes the fundamental group of the configuration space of based $G$-valued maps.

Theorem 2. If $M$ is a closed, connected, orientable 3-manifold, and $G$ is any Lie group, then
$\pi_1(G^M) \cong \mathbb{Z}_2^s \oplus H^2(M; \pi_3(G))$.
Here $s$ is the number of symplectic factors in the Lie algebra of $G$.

Our next result gives the whole real cohomology ring $H^*((G^M)_0, \mathbb{R})$, including its multiplicative structure. This, of course, includes the required computation of $H^2((G^M)_0, \mathbb{R})$. Similarly to Yang-Mills theory, there is a $\mu$ map,
$\mu : H_d(M; \mathbb{R}) \otimes H^j(G; \mathbb{R}) \to H^{j-d}(G^M; \mathbb{R})$,
and the cohomology ring is generated (as an algebra) by the images of this map. To state the theorem we do not need the definition of this $\mu$ map, but the definition may be found in Subsect. 4.3 of Sect. 4, in particular, Eq. (4.1).

Theorem 3. Let $G$ be a compact, simply-connected, simple Lie group. The cohomology ring of any of these groups is a free graded-commutative unital algebra over $\mathbb{R}$ generated by degree $k$ elements $x_k$ for certain values of $k$ (and, with at most one exception, at most one generator for any given degree). The values of $k$ depend on the group and are
listed in Table 1 in Sect. 3. Let $M$ be a closed, connected, orientable 3-manifold. The cohomology ring $H^*((G^M)_0; \mathbb{R})$ is the free graded-commutative unital algebra over $\mathbb{R}$ generated by the elements $\mu(j_d \otimes x_k)$, where $j_d$ form a basis for $H_d(M; \mathbb{R})$ for $d > 0$ and $k - d > 0$.

The examples in the next section best illuminate the details of the above theorem.

Turning to the Faddeev-Hopf model, the configuration space of interest is either the space of free $S^2$-valued maps $\mathrm{FreeMaps}(M, S^2)$, or $(S^2)^M$, the space of based continuous maps $M \to S^2$. One can analyze $\mathrm{FreeMaps}(M, S^2)$ in terms of $(S^2)^M$ by making use of the natural fibration
$(S^2)^M \to \mathrm{FreeMaps}(M, S^2) \xrightarrow{\pi} S^2, \qquad \pi : u(x) \mapsto u(x_0)$.
The fundamental cohomology class (orientation class) $\mu_{S^2} \in H^2(S^2, \mathbb{Z})$ plays an important role in the description of the mapping spaces of $S^2$-valued maps. The path components of $(S^2)^M$ were determined by Pontrjagin [27]:

Theorem 4 (Pontrjagin). Let $M$ be a closed, connected, oriented 3-manifold, and $\mu_{S^2}$ be a generator of $H^2(S^2; \mathbb{Z}) \cong \mathbb{Z}$. To any based map $\varphi$ from $M$ to $S^2$ one may associate the cohomology class $\varphi^*\mu_{S^2} \in H^2(M; \mathbb{Z})$. Every second cohomology class may be obtained from some map and any two maps with different cohomology classes lie in distinct path components of $(S^2)^M$. Furthermore, the set of path components corresponding to a cohomology class $\alpha \in H^2(M)$ is in bijective correspondence with $H^3(M)/(2\alpha \cup H^1(M))$.

A discussion of this theorem in the setting of the Faddeev model may be found in [5] and [6]. Let $(S^2)^M_0$ denote the vacuum sector, that is the path component of the constant map, and $(S^2)^M_\varphi$ denote the path component containing $\varphi$. Since $(S^2)^M$ is not a topological group, there is no reason to expect all its path components to be homeomorphic. We will prove, however, that two components $(S^2)^M_\varphi$ and $(S^2)^M_\psi$ are homeomorphic if $\varphi^*\mu_{S^2} = \psi^*\mu_{S^2}$:

Theorem 5. Let $\varphi, \psi \in (S^2)^M$ be such that $\varphi^*\mu_{S^2} = \psi^*\mu_{S^2}$. Then $(S^2)^M_\varphi \cong (S^2)^M_\psi$.

Moreover, the fundamental group of any component can be computed, as follows.

Theorem 6. Let $M$ be closed, connected and orientable. For any $\varphi \in (S^2)^M$, the fundamental group of $(S^2)^M_\varphi$ is given by
$\pi_1((S^2)^M_\varphi) \cong \mathbb{Z}_2 \oplus H^2(M; \mathbb{Z}) \oplus \ker(2\varphi^*\mu_{S^2})$.
Here $2\varphi^*\mu_{S^2} : H^1(M; \mathbb{Z}) \to H^3(M; \mathbb{Z})$ is the usual map given by the cup product.

There is a general relationship between the fundamental group of the configuration space of based $S^2$-valued maps and the corresponding configuration space of free maps. It implies the following result for the fundamental group of the space of free $S^2$-valued maps.

Theorem 7. We have
$\pi_1(\mathrm{FreeMaps}(M, S^2)_\varphi) \cong \mathbb{Z}_2 \oplus H^2(M; \mathbb{Z})/\langle 2\varphi^*\mu_{S^2} \rangle \oplus \ker(2\varphi^*\mu_{S^2})$.
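The formulas in Theorems 6 and 7 can be evaluated mechanically in a simple special case. The sketch below is only an illustration under assumptions that are not part of the paper: $H^2(M; \mathbb{Z})$ is taken to be torsion-free of rank $b$, the class $\varphi^*\mu_{S^2}$ is given by integer coordinates `c`, and $H^1(M; \mathbb{Z})$ carries the Poincaré-dual basis so that the cup product becomes a dot product. The function name and the output format are hypothetical.

```python
from functools import reduce
from math import gcd


def pi1_s2_mapping_spaces(c):
    """Abelian invariants of pi_1 for based and free S^2-valued maps.

    Assumptions (illustrative, not from the paper): H^2(M; Z) is
    torsion-free of rank b = len(c), c holds the coordinates of
    phi^* mu_{S^2}, and H^1(M; Z) has the Poincare-dual basis, so the
    map alpha -> 2 phi^* mu cup alpha is alpha -> 2 * (c . alpha).

    Each result is a triple
    (number of Z_2 summands, rank of the free part, other torsion orders).
    """
    b = len(c)
    g = reduce(gcd, c, 0)               # divisibility of the class; 0 iff c = 0
    ker_rank = b if g == 0 else b - 1   # rank of ker(2 phi^* mu cup)
    based = (1, b + ker_rank, [])       # Theorem 6
    if g == 0:
        free = based                    # quotient by the trivial subgroup
    else:
        # Z^b / <2c> = Z^(b-1) + Z_(2g), because c / g is primitive
        free = (1, (b - 1) + ker_rank, [2 * g])   # Theorem 7
    return based, free


based, free = pi1_s2_mapping_spaces([0, 0, 3])
```

For $M = T^3$ and a class that is three times a basis element, this gives $\mathbb{Z}_2 \oplus \mathbb{Z}^5$ for the based component and $\mathbb{Z}_2 \oplus \mathbb{Z}^4 \oplus \mathbb{Z}_6$ for the free component. Torsion in $H^2(M; \mathbb{Z})$ is deliberately left out of the sketch.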
To complete the classification of complex line bundles over $(S^2)^M_\varphi$ one also needs the second cohomology $H^2((S^2)^M_\varphi; \mathbb{R})$, which can be extracted from the above theorem and the following computation of the cohomology ring $H^*((S^2)^M_\varphi, \mathbb{R})$:

Theorem 8. Let $M$ be closed, connected and orientable, let $\varphi : M \to S^2$, let $j_d$ form a basis for $H_d(M; \mathbb{R})$ for $d < 3$, and let $\{\alpha_k\}$ form a basis for $\ker(2\varphi^*\mu_{S^2} \cup : H^1(M; \mathbb{Z}) \to H^3(M; \mathbb{Z}))$. The cohomology ring $H^*((S^2)^M_\varphi; \mathbb{R})$ is the free graded-commutative unital algebra over $\mathbb{R}$ generated by the elements $\alpha_k$ and $\mu(j_d \otimes x)$, where $x \in H^3(\mathrm{Sp}(1); \mathbb{Z})$ is the orientation class. The classes $\alpha_k$ have degree 1 and $\mu(j_d \otimes x)$ have degree $3 - d$.

We can compute the cohomology of the space of free $S^2$-valued maps using the following theorem.

Theorem 9. There is a spectral sequence with $E_2^{p,q} = H^p(S^2; \mathbb{R}) \otimes H^q((S^2)^M_\varphi; \mathbb{R})$ converging to $H^*(\mathrm{FreeMaps}(M, S^2)_\varphi; \mathbb{R})$. The second differential is given by $d_2\, \mu(\Sigma^{(2)} \otimes x) = 2\varphi^*\mu_{S^2}[\Sigma]\, \mu_{S^2}$ with $d_2$ of any other generator trivial. All higher differentials are trivial as well.

In order to compare the classical and quantum isospin symmetries, we will use the following theorem due to Gottlieb [17]. It is based on earlier work of Hattori and Yoshida [19].

Theorem 10 (Gottlieb). Let $L \to X$ be a complex line bundle over a locally compact space. An action of a compact connected Lie group on $X$, say $\rho : X \times G \to X$, lifts to a bundle action on $L$ if and only if two obstructions vanish. The first obstruction is the pullback of the first Chern class, $L_{x_0}^* c_1(L) \in H^2(G; \mathbb{Z})$. Here $L_{x_0}$ is the map induced by applying the group action to the base point. The second obstruction lives in $H^1(X; H^1(G; \mathbb{Z}))$.

We have taken the liberty of radically changing the notation from the original theorem, and we have only stated the result for line bundles. The actual theorem is stated for principal torus bundles. Since our configuration spaces are not locally compact, we should point out that we will use one direction of this theorem by restricting to a locally compact equivariant subset. In the other direction, we will just outline a construction of the lifted action.

Our main physical conclusions are:

C1 In these models, there is a portion of quantization ambiguity that depends only on the codomain and is completely independent of the topology of the domain. This allows for the possibility that emergent fermionicity may only depend on the target.
C2 It is possible to quantize $G$-valued solitons fermionically (with odd exchange statistics) if and only if the Lie algebra contains a symplectic ($C_n$) or special unitary ($A_n$) factor.
C3 It is possible to quantize $G$-valued solitons with fractional isospin when the Lie algebra of $G$ contains a symplectic ($C_n$) or special unitary ($A_n$) factor.
C4 It is not possible to quantize $G$-valued solitons with fractional isospin when the Lie algebra does not contain such a factor.
C5 It is always possible to choose a quantization of these systems with integral isospin (however such might not be consistent with other constraints on the model).
C6 It is always possible to quantize $S^2$-valued solitons with fractional isospin and odd exchange statistics.
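Conclusions C2 to C5 amount to a simple test on the Cartan types of the simple factors of the Lie algebra of $G$. The following sketch merely restates that test; the function name, the one-letter type labels and the returned dictionary are hypothetical conventions chosen for illustration, and $\mathfrak{su}(2)$ should be labeled "C" (or "A") in line with the paper's convention of calling SU(2) symplectic.

```python
def quantization_options(simple_factor_types):
    """Restate conclusions C2-C5 for G-valued solitons.

    simple_factor_types: Cartan type letters ("A", "B", "C", "D", "E",
    "F", "G") of the simple factors of the Lie algebra of G. This input
    convention is an assumption made for this sketch.
    """
    has_a_or_c = any(t in ("A", "C") for t in simple_factor_types)
    return {
        "fermionic": has_a_or_c,           # C2
        "fractional_isospin": has_a_or_c,  # C3 and C4
        "integral_isospin": True,          # C5
    }


# SO(8) has Lie algebra of type D_4; U(2) x SO(8) adds a C_1 factor.
so8 = quantization_options(["D"])
u2_so8 = quantization_options(["C", "D"])
```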
The rest of this paper is structured as follows. In Sect. 3 we describe how to reduce the description of the topology of general $G$-valued and $S^2$-valued mapping spaces to the theorems listed in this section. We also provide several illustrative examples. In Sect. 4 we review the Pontrjagin-Thom construction and describe geometric interpretations of some of our results using the Pontrjagin-Thom construction. Physical applications, particularly the possibility of consistent fermionic quantization of Skyrmions, are discussed in Sect. 5. Finally, Sect. 6 contains the proofs of our results.

3. Preliminary Reductions and Examples

We begin this section with a collection of observations that allows one to reduce questions about the topology of various mapping spaces of $G$-valued and $S^2$-valued maps to the theorems listed in the previous section. Many of these observations will reduce a more general mapping space to a product of special mapping spaces, or put such spaces into fibrations. These reductions ensure that our results are valid for arbitrary closed, orientable 3-manifolds, and valid for any Lie group.

It follows directly from the definition of $\pi_1$ that $\pi_1(X \times Y) \cong \pi_1(X) \times \pi_1(Y)$. The cohomology of a product is described by the Künneth theorem, see [33]. For real coefficients it takes the simple form $H^*(X \times Y) \cong H^*(X) \otimes H^*(Y)$. The cohomology ring of a disjoint union of spaces is the direct sum of the corresponding cohomology rings, i.e. $H^*(\coprod X_\nu; A) = \bigoplus H^*(X_\nu; A)$.

Recall that a fibration is a map with the covering homotopy property, see for example [33]. Given a fibration $F \to E \to B$, there is an induced long exact sequence of homotopy groups, $\ldots \to \pi_{k+1}(B) \to \pi_k(F) \to \pi_k(E) \to \pi_k(B) \to \ldots$, see [33]. By itself this sequence is not enough to determine the fundamental group of a term in a fibration from the other terms. However, combined with a bit of information about the twisting in the bundle it will be enough information.
One can also relate the cohomology rings of the terms in a fibration. This is accomplished by the Serre spectral sequence, see [33].

Reduction 1. We have $\mathrm{FreeMaps}(\coprod X_\nu, Y) \cong \prod_\nu \mathrm{FreeMaps}(X_\nu, Y)$ and $Y^{\coprod X_\nu} \cong Y^{X_0} \times \prod_{\nu \neq 0} \mathrm{FreeMaps}(X_\nu, Y)$, where $X_0$ is the component of $X$ containing the base point.

It follows that there is no loss of generality in assuming that $M$ is connected. Likewise there is no loss of generality in assuming that the target is connected because of the following reduction.

Reduction 2. We have $\mathrm{FreeMaps}(X, \coprod Y_\nu) \cong \coprod \mathrm{FreeMaps}(X, Y_\nu)$ and, assuming $X$ is connected, $Y^X = Y_0^X$, where $Y_0$ is the component containing the base point.

Both $\mathrm{FreeMaps}(M, G)$ and $G^M$ are topological groups under pointwise multiplication. In fact $\mathrm{FreeMaps}(M, G) \cong G^M \rtimes G$, the isomorphism being $u(x) \mapsto (u(x)u(x_0)^{-1}, u(x_0))$, which is clearly a homeomorphism $\mathrm{FreeMaps}(M, G) \to G^M \times G$. It is thus straightforward to deduce $\pi_1(\mathrm{FreeMaps}(M, G))$ and $H^*(\mathrm{FreeMaps}(M, G), \mathbb{R})$ from $\pi_1(G^M)$ and $H^*(G^M, \mathbb{Z})$. Note that the based case includes the standard choice $\hat{M} = \mathbb{R}^3$.

Reduction 3. We have $\mathrm{FreeMaps}(M, G) \cong G^M \times G$.

In the same way, we can reduce the free maps case to the based case for $S^2$-valued maps. In this case we only obtain a fibration. See Lemmas 18, 20 and 21.

Reduction 4. We have a fibration, $(S^2)^M \to \mathrm{FreeMaps}(M, S^2) \to S^2$, $\pi_0(\mathrm{FreeMaps}(M, S^2)) = \pi_0((S^2)^M)$, and $\pi_1(\mathrm{FreeMaps}(M, S^2)_\varphi) = \pi_1((S^2)^M_\varphi)$.
The relevant information about the twisting in this fibration, as far as the fundamental group detects it, is given in the proof of Theorem 7 contained in Subsect. 6.3. For cohomology, the information is encoded in the second differential of the associated spectral sequence.

Returning to the case of group-valued maps we know by the Cartan-Malcev-Iwasawa theorem that any connected Lie group is homeomorphic to a product $G = K \times \mathbb{R}^n$, where $K$ is compact [22].

Reduction 5. If $X'$ and $Y'$ are homotopy equivalent to $X$ and $Y$ respectively, then $(Y')^{X'}$ is homotopy equivalent to $Y^X$. In particular we have $G^M \simeq K^M$.
Recall that every path component of $G^M$ is homeomorphic to $(G^M)_0$. We may therefore consider only the vacuum sector $(G^M)_0$, without loss of generality. We shall see that things are very different for the Faddeev-Hopf configuration space, where we must keep track of which path component we are studying.

Reduction 6. If $G$ is a Lie group, $\tilde{G}$ is the universal covering group of its identity component and $M$ is a 3-manifold, then $(\tilde{G}^M)_0 \cong (G^M)_0$.

Proof. Without loss of generality we may assume that $G$ is connected. We have the exact sequence
$1 \to \tilde{G}^M \to G^M \to H^1(M; H_1(G)) \to 0$
from [4]. The exactness follows from the unique path lifting property of covers at the first term, the lifting criteria for maps to the universal cover at the center term, and induction on the skeleton of $M$ at the last term. Clearly, the identity component of $\tilde{G}^M$ maps to the identity component of $G^M$. By the above sequence, this map is injective. Any element of $(G^M)_0$, say $u$, maps to 0 in $H^1(M; H_1(G))$, so is the image of some map in $\tilde{G}^M$, say $\tilde{u}$. Using the homotopy lifting property of covering spaces, we may lift the homotopy of $u$ to a constant map, to a homotopy of $\tilde{u}$ to a constant map and conclude that $\tilde{u} \in (\tilde{G}^M)_0$. It follows that the map $(\tilde{G}^M)_0 \to (G^M)_0$ is a homeomorphism.

Reduction 7. The universal covering group of any compact Lie group is a product of $\mathbb{R}^m$ with a finite number of compact, simple, simply-connected factors [25]. Furthermore, $(\prod Y_\nu)^X \cong \prod Y_\nu^X$ and $\mathrm{FreeMaps}(X, \prod Y_\nu) \cong \prod \mathrm{FreeMaps}(X, Y_\nu)$.

We have therefore reduced to the case of closed, connected, orientable $M$ and compact, simple, simply-connected Lie groups. Recall from Proposition 1 that the path components of a configuration space of group-valued maps depend on the fundamental group of the group. The fundamental group of any Lie group is a discrete subgroup of the center of the universal covering group. The center of such a group is just the product of the centers of the factors.
All compact, simple, simply-connected Lie groups are listed together with their center and rational cohomology in Table 1.

Table 1. Simple groups

group, G                              center, Z(G)                                        H^*(G; Q)
$A_n = SU(n+1)$, $n \geq 2$           $\mathbb{Z}_{n+1}$                                  $\mathbb{Q}[x_3, x_5, \ldots, x_{2n+1}]$
$B_n = Spin(2n+1)$, $n \geq 3$        $\mathbb{Z}_2$                                      $\mathbb{Q}[x_3, x_7, \ldots, x_{4n-1}]$
$C_n = Sp(n)$, $n \geq 1$             $\mathbb{Z}_2$                                      $\mathbb{Q}[x_3, x_7, \ldots, x_{4n-1}]$
$D_n = Spin(2n)$, $n \geq 4$          $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ for $n \equiv_2 0$, $\mathbb{Z}_4$ for $n \equiv_2 1$    $\mathbb{Q}[x_3, x_7, \ldots, x_{4n-5}, y_{2n-1}]$
$E_6$                                 $\mathbb{Z}_3$                                      $\mathbb{Q}[x_3, x_9, x_{11}, x_{15}, x_{17}, x_{23}]$
$E_7$                                 $\mathbb{Z}_2$                                      $\mathbb{Q}[x_3, x_{11}, x_{15}, x_{19}, x_{23}, x_{27}, x_{35}]$
$E_8$                                 $0$                                                 $\mathbb{Q}[x_3, x_{15}, x_{23}, x_{27}, x_{35}, x_{39}, x_{47}, x_{59}]$
$F_4$                                 $0$                                                 $\mathbb{Q}[x_3, x_{11}, x_{15}, x_{23}]$
$G_2$                                 $0$                                                 $\mathbb{Q}[x_3, x_{11}]$

Some comments about Table 1 are in order at this point. The cohomology of a compact, simple, simply-connected Lie group is a free unital graded-commutative algebra. Each generator in the table is labeled with its degree. Thus $\mathbb{Q}[x_3, x_5]$ is not the ring of all polynomials in $x_3$ and $x_5$ since $x_3^2 = x_5^2 = 0$ and $x_3 x_5 = -x_5 x_3$ by graded-commutativity. The last generator of the cohomology ring of $D_n$ is labeled with a $y$ instead of an $x$ because there are two generators in degree $2n - 1$ when $n$ is even. As usual, SU(k) is the set of special unitary matrices, that is complex matrices with unit determinant satisfying $A^*A = I$. The symplectic groups, Sp(k), consist of the quaternionic matrices satisfying $A^*A = I$, and the special orthogonal groups, SO(k), consist of the real matrices with unit determinant satisfying $A^*A = I$. The spin groups, Spin(k), are the universal covering groups of the special orthogonal groups. The definitions of the exceptional groups may be found in [3]. The following isomorphisms hold: $SU(2) \cong Sp(1) \cong Spin(3)$, $Spin(5) \cong Sp(2)$, and $Spin(6) \cong SU(4)$ [3].

We will need some homotopy groups of Lie groups. Recall that the higher homotopy groups of a space are isomorphic to the higher homotopy groups of the universal cover of the space, and the higher homotopy groups take products to products. We have $\pi_3(G) \cong \mathbb{Z}$ for any of the simple $G$, and $\pi_4(Sp(n)) \cong \mathbb{Z}_2$ and $\pi_4(G) = 0$ for all other simple groups [25]. This is the reason we grouped the simple groups as we did. Note in particular that we are calling SU(2) a symplectic group.

3.1. Examples. In this subsection, we present two examples that suffice to illustrate all seven reductions described earlier.

Example 1. For our first example, we take $M = (S^2 \times S^1) \coprod \mathbb{RP}^3$ and
$G = Sp(2) \times \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \in GL(2, \mathbb{R}) \right\}$.
We take $((1, 0, 0), (1, 0)) \in S^2 \times S^1$ as the base point in $M$. In this example, neither the domain nor codomain is connected ($G/G_0 \cong \mathbb{Z}_2 \times \mathbb{Z}_2$). In addition, the group is not reductive. We also see exactly what is meant by the number of symplectic factors in the Lie algebra: it is just the number of $C_n$ factors in the Lie algebra of the maximal compact subgroup of the identity component of $G$. This example requires Reductions 1, 2, 3, and 5. To analyze the topology of the spaces of free and based maps, it suffices to understand maps from $S^2 \times S^1$ and $\mathbb{RP}^3$ into the identity component, $G_0$ (Reductions 1, 2 and 3).
In fact, we may replace $G_0$ with $Sp(2)$ (Reduction 5). Proposition 1 implies that $\pi_0(Sp(2)^{S^2 \times S^1}) = \mathbb{Z}$ and $\pi_0(Sp(2)^{\mathbb{RP}^3}) = \mathbb{Z}$, so $\pi_0(\mathrm{FreeMaps}(M, G)) = \mathbb{Z}_2^4 \times \mathbb{Z}^2$ and $\pi_0(G^M) = \mathbb{Z}_2^2 \times \mathbb{Z}^2$. Similarly, Theorem 2 implies that $\pi_1(Sp(2)^{S^2 \times S^1}) = \mathbb{Z}_2 \oplus \mathbb{Z}$ and $\pi_1(Sp(2)^{\mathbb{RP}^3}) = \mathbb{Z}_2 \oplus \mathbb{Z}_2$, so $\pi_1(\mathrm{FreeMaps}(M, G)) = \pi_1(G^M) = \mathbb{Z}_2^3 \oplus \mathbb{Z}$.

Turning to the cohomology, we know that $H^*(Sp(2); \mathbb{R})$ is the free graded-commutative unital algebra generated by $x_3$ and $x_7$. Graded-commutative means $xy = (-1)^{|x||y|} yx$. It follows that any term with repeated factors is zero. We can list the generators of the groups in each degree. In the expression below we list the generators left to right from degree 0 with each degree separated by vertical lines:
$H^*(Sp(2); \mathbb{R}) = |1|0|0|x_3|0|0|0|x_7|0|0|x_3 x_7|$.
The product structure is apparent. Theorem 3 tells us that $H^*((Sp(2))_0^{S^2 \times S^1}; \mathbb{R})$ is the free unital graded-commutative algebra generated by $\mu([S^2 \times \mathrm{pt}] \otimes x_3)^{(1)}$, $\mu([\mathrm{pt} \times S^1] \otimes x_3)^{(2)}$, $\mu([S^2 \times S^1] \otimes x_7)^{(4)}$, $\mu([S^2 \times \mathrm{pt}] \otimes x_7)^{(5)}$, and $\mu([\mathrm{pt} \times S^1] \otimes x_7)^{(6)}$. Here we have included the degree of the generator as a superscript. In the same way we see that $H^*((Sp(2))_0^{\mathbb{RP}^3}; \mathbb{R})$ is the free unital graded-commutative algebra (FUGCA) generated by $\mu([\mathbb{RP}^3] \otimes x_7)^{(4)}$. Using the reductions and the Künneth theorem we see that $H^*((G^M)_0; \mathbb{R})$ is the FUGCA generated by $\mu([S^2 \times \mathrm{pt}] \otimes x_3)^{(1)}$, $\mu([\mathrm{pt} \times S^1] \otimes x_3)^{(2)}$, $x_3$, $\mu([S^2 \times S^1] \otimes x_7)^{(4)}$, $\mu([\mathbb{RP}^3] \otimes x_7)^{(4)}$, $\mu([S^2 \times \mathrm{pt}] \otimes x_7)^{(5)}$, $\mu([\mathrm{pt} \times S^1] \otimes x_7)^{(6)}$, $x_7$. Notice that this is not finitely generated as a vector space even though it is finitely generated as an algebra. This is because it is possible to have repeated even degree factors. The vector space in each degree is still finite dimensional.

The cohomology ring of the configuration space of based maps is just the direct sum, $H^*(G^M; \mathbb{R}) = \bigoplus_{\pi_0(G^M)} H^*((G^M)_0; \mathbb{R})$. Notice that it is infinitely generated as an algebra. The cohomology of the identity component will usually be the important thing. Using the reductions, we see that the identity component of the space of free maps is, up to homotopy, just the product $\mathrm{FreeMaps}(M, G)_0 = (G^M)_0 \times Sp(2)$, so the cohomology ring $H^*(\mathrm{FreeMaps}(M, G)_0; \mathbb{R})$ is obtained from $H^*((G^M)_0; \mathbb{R})$ by adjoining new generators in degrees 3 and 7, say $y_3$ and $y_7$. Thus $H^2((G^M)_0; \mathbb{Z}) \cong H^2(\mathrm{FreeMaps}(M, G)_0; \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z}_2^3$.
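The bookkeeping in Example 1 can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, the Betti numbers of $M$ are supplied by hand, and the second function simply expands the Poincaré series of a free graded-commutative algebra, in which an odd generator contributes a factor $(1 + t^d)$ and an even generator a geometric series.

```python
def mapping_space_generator_degrees(group_degrees, betti):
    """Degrees k - d of the generators mu(j_d (x) x_k) in Theorem 3.

    group_degrees: degrees k of the generators x_k of H^*(G; R).
    betti: dict mapping d in {1, 2, 3} to dim H_d(M; R).
    """
    degrees = []
    for k in group_degrees:
        for d in (1, 2, 3):
            if k - d > 0:
                degrees.extend([k - d] * betti.get(d, 0))
    return sorted(degrees)


def betti_numbers(generator_degrees, max_degree):
    """Dimensions, up to max_degree, of a free graded-commutative algebra."""
    series = [1] + [0] * max_degree
    for d in generator_degrees:
        new = series[:]
        if d % 2 == 1:
            # odd generator: squares to zero, multiply by (1 + t^d)
            for n in range(d, max_degree + 1):
                new[n] = series[n] + series[n - d]
        else:
            # even generator: all powers survive, multiply by 1 / (1 - t^d)
            for n in range(d, max_degree + 1):
                new[n] = new[n] + new[n - d]
        series = new
    return series


# Sp(2) has generators x_3, x_7; S^2 x S^1 has b_1 = b_2 = b_3 = 1.
degs = mapping_space_generator_degrees([3, 7], {1: 1, 2: 1, 3: 1})
```

With these inputs `degs` is `[1, 2, 4, 5, 6]`, matching the superscripts listed in the example, and `betti_numbers([3, 7], 10)` reproduces the degree-by-degree listing of $H^*(Sp(2); \mathbb{R})$. A single even generator, as in `betti_numbers([2], 6)`, already shows why the algebra is infinite dimensional as a vector space while each degree stays finite dimensional.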
Example 2. For this example, we take M = T 3 #L(m, 1), G1 = SO(8) and G = U(2) × SO(8). Recall that the lens space L(m, 1) is the quotient Sp(1)/Zm , where we view Zm as the mth roots of unity in S 1 ⊂ Sp(1). In this example we will need to use Reductions 6 and 7. The unitary group is isomorphic to Sp(1) ×Z2 S 1 , where Z2 is viewed as the diagonal subgroup, ±(1, 1). The universal covering group of SO(8) is Spin(8). It follows that G1 and G are connected, G has universal covering group Sp(1) × R × Spin(8), and the fundamental groups are π1 (Spin(8)) = Z2 and π1 (G) = Z ⊕ Z2 . The group G has two simple factors, one of which is symplectic. The integral cohomology of M is given by H 1 (M; Z) ∼ = Z3 and H 2 (M; Z) ∼ = Z3 ⊕ Zm . The universal coefficient theorem and Proposition 1 imply π0 (GM ) = π (FreeMaps(M, G1 )) = Z × Z42 if m is even, Z × Z32 0 1 if m is odd, and π0 (GM ) = π0 (FreeMaps(M, G)) = Z5 × Z42 if m is even and Z5 × Z32 3 if m is odd. Theorem 2 implies that π1 (GM 1 ) = Z ⊕ Zm , π1 (FreeMaps(M, G1 )) = 3 M 6 2 Z ⊕ Zm ⊕ Z2 , π1 (G ) = Z ⊕ Zm ⊕ Z2 and π1 (FreeMaps(M, G)) = Z7 ⊕ Z2m ⊕ Z22 . Turning once again to cohomology, we see from Theorem 3 that H ∗ ((U(2)M )0 ; R) is the FUGCA generated by µ([T 2 × pt] ⊗ x3 )(1) , µ([S 1 × pt × S 1 ] ⊗ x3 )(1) , µ([pt × T 2 ] ⊗ x3 )(1) , µ([S 1 × pt] ⊗ x3 )(2) , µ([pt × S 1 pt] ⊗ x3 )(2) , and µ([pt × S 1 ] ⊗ x3 )(2) . 2 (1) 1 Also, H ∗ ((GM 1 )0 ; R) is the FUGCA generated by µ([T × pt] ⊗ y3 ) , µ([S × 1 (1) 2 (1) 1 (2) 1 pt × S ] ⊗ y3 ) , µ([pt × T ] ⊗ y3 ) , µ([S × pt] ⊗ y3 ) , µ([pt × S pt] ⊗ y3 )(2) , µ([pt × S 1 ] ⊗ y3 )(2) , µ([T 3 ] ⊗ y7 )(4) , µ([T 3 ] ⊗ z7 )(4) , µ([T 2 × pt] ⊗ y7 )(5) , µ([S 1 × pt × S 1 ] ⊗ y7 )(5) , µ([pt × T 2 ] ⊗ y7 )(5) , µ([S 1 × pt] ⊗ y7 )(6) , µ([pt × S 1 pt] ⊗ y7 )(6) , µ([pt×S 1 ]⊗y7 )(6) , µ([T 2 ×pt]⊗z7 )(5) , µ([S 1 ×pt×S 1 ]⊗z7 )(5) , µ([pt×T 2 ]⊗z7 )(5) , µ([S 1 × pt] ⊗ z7 )(6) , µ([pt × S 1 pt] ⊗ z7 )(6) , µ([pt × S 1 ] ⊗ z7 )(6) , µ([T 3 ] ⊗ y11 )(8) ,
184
D. Auckly, M. Speight
µ([T 2 × pt] ⊗ y11 )(9) , µ([S 1 × pt × S 1 ] ⊗ y11 )(9) , µ([pt × T 2 ] ⊗ y11 )(9) , µ([S 1 × pt] ⊗ y11 )(10) , µ([pt × S 1 pt] ⊗ y11 )(10) , and µ([pt × S 1 ] ⊗ y11 )(10) . Therefore, H ∗ ((GM )0 ; R) is the FUGCA generated by all of the generators listed for the two previous algebras. We changed the notation for the generators of the cohomology of the Lie groups as needed. To get to the cohomology of the identity component of the space of free maps, we would just have to add generators for the cohomology of the group G0 to this list. In general the cohomology of a connected Lie group is the same as the cohomology of the maximal compact subgroup, and every compact Lie group has a finite cover that is a product of simple, simply-connected, compact Lie groups and a torus. In this case, we need to add generators, t1 , u3 , w3 , u7 , v7 , and u11 . 2 ∼ 6 ∼ 6 Thus, H 2 ((GM 1 )0 ; Z) = Z ⊕ Zm , H (FreeMaps(G1 , M)0 ; Z) = Z ⊕ Zm ⊕ Z2 , H 2 ((GM )0 ; Z) ∼ = Z21 ⊕ Z2m ⊕ Z2 , and H 2 (FreeMaps(G, M)0 ; Z) ∼ = Z21 ⊕ Z2m ⊕ Z22 . 2 We can also analyze the topology of the space of S -valued maps with domain M. The path components of (S 2 )M agree with the path components of FreeMaps(M, S 2 ) (Reduction 4) and are given by Theorem 4. Let ϕ0 : M → S 2 be the constant map and let ϕ3 : M → S 2 be the map constructed as the composition of the map M → T 3 (collapse the L(m, 1)), the projection T 3 → T 2 , and a degree three map T 2 → S 2 . According to Theorem 6 and Theorem 7, we have ∼ 6 π1 (FreeMaps(M, S 2 )ϕ0 ) = π1 ((S 2 )M ϕ0 ) = Z ⊕ Zm ⊕ Z2 , ∼ 5 π1 ((S 2 )M and ϕ3 ) = Z ⊕ Zm ⊕ Z2 , 2 4 ∼ π1 (FreeMaps(M, S )ϕ3 ) = Z ⊕ Zm ⊕ Z6 ⊕ Z2 . Using Theorem 8 we can write out generators for the cohomology. 
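The cohomology $H^*((S^2)^M_{\varphi_0}; \mathbb{R})$ is the FGCUA generated by $PD([T^2 \times \mathrm{pt}])^{(1)}$, $PD([S^1 \times \mathrm{pt} \times S^1])^{(1)}$, $PD([\mathrm{pt} \times T^2])^{(1)}$, $\mu([T^2 \times \mathrm{pt}] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt} \times S^1] \otimes x)^{(1)}$, $\mu([\mathrm{pt} \times T^2] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, $\mu([\mathrm{pt} \times S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, and $\mu([\mathrm{pt} \times S^1] \otimes x)^{(2)}$, where $PD$ denotes Poincaré dual.

Similarly, $H^*((S^2)^M_{\varphi_3}; \mathbb{R})$ is the FGCUA generated by $PD([S^1 \times \mathrm{pt} \times S^1])^{(1)}$, $PD([\mathrm{pt} \times T^2])^{(1)}$, $\mu([T^2 \times \mathrm{pt}] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt} \times S^1] \otimes x)^{(1)}$, $\mu([\mathrm{pt} \times T^2] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, $\mu([\mathrm{pt} \times S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, and $\mu([\mathrm{pt} \times S^1] \otimes x)^{(2)}$. There is no generator corresponding to $PD([T^2 \times \mathrm{pt}])^{(1)}$ in the $\varphi_3$ cohomology because this class is not in the kernel: $2\varphi_3^*\mu_{S^2} \smile PD([T^2 \times \mathrm{pt}])^{(1)} = 6\mu_M$.

We can use Theorem 9 to compute the cohomology of the space of free maps. In the component with $\varphi_0$ we notice that the second differential is trivial because $\varphi_0^*\mu_{S^2} = 0$. It follows that $H^*(\mathrm{FreeMaps}(M, S^2)_{\varphi_0}; \mathbb{R})$ is the graded-commutative, unital algebra generated by $PD([T^2 \times \mathrm{pt}])^{(1)}$, $PD([S^1 \times \mathrm{pt} \times S^1])^{(1)}$, $PD([\mathrm{pt} \times T^2])^{(1)}$, $\mu([T^2 \times \mathrm{pt}] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt} \times S^1] \otimes x)^{(1)}$, $\mu([\mathrm{pt} \times T^2] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, $\mu([\mathrm{pt} \times S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, $\mu([\mathrm{pt} \times S^1] \otimes x)^{(2)}$, and $\mu_{S^2}$. Notice that this algebra is not free. It is subject to the single relation $\mu_{S^2}^2 = 0$.

In the component containing $\varphi_3$ all of the generators of $H^*((S^2)^M_{\varphi_3}; \mathbb{R})$ except $\mu([T^2 \times \mathrm{pt}] \otimes x)^{(1)}$ survive to $H^*(\mathrm{FreeMaps}(M, S^2)_{\varphi_3}; \mathbb{R})$ because they are in the kernel of $d_2$. However, $d_2\,\mu([T^2 \times \mathrm{pt}] \otimes x)^{(1)} = 6\mu_{S^2}$, so $\mu_{S^2}$ does not survive and $H^*(\mathrm{FreeMaps}(M, S^2)_{\varphi_3}; \mathbb{R})$ is the FUGCA generated by $PD([S^1 \times \mathrm{pt} \times S^1])^{(1)}$, $PD([\mathrm{pt} \times T^2])^{(1)}$, $\mu([S^1 \times \mathrm{pt} \times S^1] \otimes x)^{(1)}$, $\mu([\mathrm{pt} \times T^2] \otimes x)^{(1)}$, $\mu([S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, $\mu([\mathrm{pt} \times S^1 \times \mathrm{pt}] \otimes x)^{(2)}$, and $\mu([\mathrm{pt} \times S^1] \otimes x)^{(2)}$.

Thus $H^2((S^2)^M_{\varphi_0}; \mathbb{Z}) \cong \mathbb{Z}^{18} \oplus \mathbb{Z}_m \oplus \mathbb{Z}_2$, $H^2(\mathrm{FreeMaps}(M, S^2)_{\varphi_0}; \mathbb{Z}) \cong \mathbb{Z}^{19} \oplus \mathbb{Z}_m \oplus \mathbb{Z}_2$, $H^2((S^2)^M_{\varphi_3}; \mathbb{Z}) \cong \mathbb{Z}^{13} \oplus \mathbb{Z}_m \oplus \mathbb{Z}_2$, and $H^2(\mathrm{FreeMaps}(M, S^2)_{\varphi_3}; \mathbb{Z}) \cong \mathbb{Z}^9 \oplus \mathbb{Z}_m \oplus \mathbb{Z}_2$.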
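The free ranks 18, 19, 13 and 9 quoted above can be cross-checked by counting monomials of degree 2 in the listed generators. The following is a minimal sketch, assuming that the superscripts (1) and (2) on the generators record their degrees and that the algebra is free graded-commutative through degree 2 (the relation $\mu_{S^2}^2 = 0$ lives in degree 4, so it does not affect this count). The function name `betti_numbers` is illustrative only.

```python
def betti_numbers(generator_degrees, max_degree):
    """Betti numbers b_0..b_max of the free graded-commutative algebra on
    generators of the given positive degrees.

    An odd generator contributes a factor (1 + t^d) to the Poincare series,
    an even generator contributes 1 / (1 - t^d) = 1 + t^d + t^(2d) + ...
    """
    series = [1] + [0] * max_degree
    for d in generator_degrees:
        new = [0] * (max_degree + 1)
        for k, c in enumerate(series):
            if d % 2 == 1:
                # odd generator: it can appear at most once in a monomial
                new[k] += c
                if k + d <= max_degree:
                    new[k + d] += c
            else:
                # even generator: any power may appear
                j = k
                while j <= max_degree:
                    new[j] += c
                    j += d
        series = new
    return series


# Degree lists read off from the generators above (assumed reading).
based_phi0 = [1] * 6 + [2] * 3   # three PD classes, three mu^(1), three mu^(2)
free_phi0 = based_phi0 + [2]     # add mu_{S^2} in degree 2
based_phi3 = [1] * 5 + [2] * 3   # PD([T^2 x pt]) is absent
free_phi3 = [1] * 4 + [2] * 3    # mu([T^2 x pt] (x) x)^(1) does not survive

for name, degs in [("based, phi_0", based_phi0), ("free, phi_0", free_phi0),
                   ("based, phi_3", based_phi3), ("free, phi_3", free_phi3)]:
    print(name, "b_2 =", betti_numbers(degs, 2)[2])
```

With $n_1$ generators in degree 1 and $n_2$ in degree 2 the count is $\binom{n_1}{2} + n_2$, which is what the loop computes in this range.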
Quantization and Configuration Spaces for Skyrme and Faddeev-Hopf Models
4. Geometric Interpretations

We will follow the folklore maxim: think with intersection theory and prove with cohomology. The combination of Poincaré duality and the Pontrjagin-Thom construction gives a powerful tool for visualizing results in algebraic topology. If $W$ is an oriented $n$-dimensional homology manifold, Poincaré duality is the isomorphism $H^k(W) \cong H_{n-k}(W)$. It is tempting to think of the $k$th cohomology as the dual of the $k$th homology. This is not far from the truth. The universal coefficient theorem is the split exact sequence
$$0 \to \mathrm{Ext}^1_{\mathbb{Z}}(H_{k-1}(W; \mathbb{Z}), A) \to H^k(W; A) \to \mathrm{Hom}_{\mathbb{Z}}(H_k(W; \mathbb{Z}), A) \to 0.$$
Putting this together, we see that every degree $k$ cohomology class corresponds to a unique $(n-k)$-cycle (codimension $k$ homology cycle), and the image of the cocycle applied to a $k$-cycle is the weighted number of intersection points with the corresponding $(n-k)$-cycle. For field coefficients this is the entire story since there is no torsion and the Ext group vanishes. With other coefficients, this gives the correct answer up to torsion.

The Pontrjagin-Thom construction associates a framed codimension $k$ submanifold of $W$ to any map $W \to S^k$. The associated submanifold is just the inverse image of a regular point. This is well defined up to a framed cobordism. The framing is the inverse image of a standard frame. Going the other way, a framed submanifold produces a map $W \to S^k$ defined via the exponential map on fibers of a tubular neighborhood of the submanifold and as the constant map outside of the neighborhood. We will take this up in greater detail later in this section.

Before addressing the topology of our configuration spaces, we need to understand the cohomology of Lie groups. A number of different approaches may be utilized to compute the real cohomology of a compact Lie group: H-space methods, equivariant Morse theory, the Leray-Serre spectral sequence, Hodge theory. The cohomology is a free graded-commutative algebra over $\mathbb{R}$. Recall that this means that $xy = (-1)^{\deg(x)\deg(y)} yx$. For our purposes, the spectral sequence and Hodge theory are the two most important. The fibration $SU(N) \to SU(N+1) \to S^{2N+1}$ may be used to compute the cohomology of $SU(N)$, and we will use it and other similar fibrations to compute the cohomology of various configuration spaces. According to Hodge theory, the real cohomology is isomorphic to the collection of harmonic forms. Any compact Lie group admits an Ad-invariant inner product on the Lie algebra obtained by averaging any inner product over the group, or as the Killing form, $\langle X, Y \rangle = -\mathrm{Tr}(\mathrm{ad}(X)\,\mathrm{ad}(Y))$, in the semisimple case. Such an inner product induces a bi-invariant metric on the group. With respect to this metric, the space of harmonic forms is isomorphic to the space of Ad-invariant forms on the Lie algebra. Any harmonic form induces a form on the Lie algebra by restriction and any Ad-invariant form on the Lie algebra induces a harmonic form via left translation. In the case of $SU(N)$, these forms may be described as products of the elements $x_j = \mathrm{Tr}((u^{-1}du)^j)$. In some applications it might be appropriate to include a normalizing constant so that the integral of each of these forms on an associated primitive homology class is 1.

4.1. Components of $G^M$. For simplicity, we will just consider geometric descriptions of $G$-valued maps for the compact, simple, simply-connected Lie groups. By applying the Pontrjagin-Thom construction, we will obtain a correspondence between homotopy classes of based maps $M \to G$ and finite collections of signed points in $M$. This may be used to give a geometric interpretation of Proposition 1. In physical terms, the signed points may be thought of as particles and anti-particles in the theory.
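As a small illustration of the free graded-commutative structure recalled above, the Betti numbers of $SU(N)$ can be built up one sphere at a time along the fibrations $SU(m) \to SU(m+1) \to S^{2m+1}$. The sketch below assumes the standard fact that the real cohomology of $SU(N)$ is an exterior algebra on generators of degrees $3, 5, \ldots, 2N-1$, so each fibration multiplies the Poincaré polynomial by $(1 + t^{2m+1})$; the function name is illustrative.

```python
def poincare_polynomial_su(n):
    """Coefficient list [b_0, b_1, ...] of the Poincare polynomial of SU(n), n >= 1."""
    poly = [1]  # SU(1) is a point
    for m in range(1, n):
        d = 2 * m + 1  # base sphere S^{2m+1} of SU(m) -> SU(m+1) -> S^{2m+1}
        new = poly + [0] * d
        for k, c in enumerate(poly):
            new[k + d] += c  # multiply by (1 + t^d)
        poly = new
    return poly


p3 = poincare_polynomial_su(3)
print(p3)  # Betti numbers of SU(3), a closed 8-manifold
```

Three sanity checks are available for any $n$: the top degree is $\dim SU(n) = n^2 - 1$, the total rank is $2^{n-1}$, and the coefficient list is palindromic, as Poincaré duality requires.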
D. Auckly, M. Speight
To use the Pontrjagin-Thom construction in this setting we need a special basis for $H_*(G; \mathbb{R})$. By the universal coefficient theorem, there are $(2k+1)$-cycles $\beta_{2k+1}$ in $H_{2k+1}(G; \mathbb{R})$ dual to $x_{2k+1}$. Assuming that the generators $x_{2k+1}$ are suitably normalized, we may assume that the $\beta_{2k+1}$ are integral classes, i.e. images of classes in $H_{2k+1}(G; \mathbb{Z})$. We will often use notation from de Rham theory to denote the analogous constructions in singular, or cellular theory. For example, the evaluation pairing between cohomology and homology is called the cap product. It is usually denoted $x \cap \beta$ or $x[\beta]$. The cap product corresponds to integration ($\int_\beta x$) in de Rham theory. By Poincaré duality, we can identify each cocycle $x_{2k+1}$ with a codimension $(2k+1)$-cycle $F$ in $G$ so that the image of any $(2k+1)$-chain $c_{2k+1}$ under $x_{2k+1}$ is precisely the algebraic intersection number of $F$ and $c_{2k+1}$. Hence, each compact, simple, simply-connected Lie group contains a codimension 3 cycle $F$ Poincaré dual to $x_3$, which intersects $\beta_3$ algebraically in one positively oriented point. We will shortly describe these codimension 3 cycles in greater detail, but we first describe how these cycles may be used to determine the path components of the configuration space.

Assume for now that the cycle $F$ has a trivial normal bundle. We will justify this assumption later. (Throughout this paper we will use normal bundles, open and closed tubular neighborhoods and the relation between them via the exponential map without explicitly writing the map. If $\Sigma \subset M$ then $\nu(\Sigma) \subset TM$ will denote the normal bundle and $N(\Sigma) \subset M$ will denote the closed tubular neighborhood.) Fix a trivialization of the normal bundle. Using this trivialization, we may associate a finite collection of signed points to any generic based map $u : M \to G$. To such a map we associate the collection of points $u^{-1}(F)$. Such a point is positively oriented if the push forward of an oriented frame at the point has the same orientation as the trivialization of the normal bundle at the image. Conversely, to any finite collection of signed points we may associate a based map $u : M \to G$. Using a positively or negatively oriented frame at each point, we construct a diffeomorphism from the closed tubular neighborhood of each point to the 3-disk of radius $\pi$ in the space of purely imaginary quaternions, $\mathfrak{sp}(1)$. Via the exponential map $\exp : \mathfrak{sp}(1) \to Sp(1)$ given by
$$\exp(x) = \cos(|x|) + \frac{\sin(|x|)}{|x|}\, x,$$
we define a map from the closed tubular neighborhood of the points to $Sp(1)$.
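The two properties of the exponential map that the construction relies on are easy to check numerically: the image lies in the unit quaternions, and the whole boundary sphere $|x| = \pi$ is sent to $-1$, so the map on the disk can be glued to a constant map outside the neighborhood. The following is a minimal sketch; representing a quaternion as a 4-tuple `(w, i, j, k)` is a choice made here for illustration.

```python
import math


def quat_exp(x):
    """Exponential of a purely imaginary quaternion x = (x1, x2, x3).

    Returns the unit quaternion cos|x| + (sin|x| / |x|) x as a tuple (w, i, j, k).
    """
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(r) / r
    return (math.cos(r), s * x[0], s * x[1], s * x[2])


def quat_norm(q):
    """Euclidean norm of a quaternion given as a 4-tuple."""
    return math.sqrt(sum(c * c for c in q))


# A point in the interior of the disk of radius pi lands on the unit sphere Sp(1).
print(quat_norm(quat_exp((0.3, -1.2, 2.0))))

# A point on the boundary |x| = pi is sent (up to rounding) to -1.
print(quat_exp((0.0, math.pi, 0.0)))
```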
This map may be extended to the whole 3-manifold by sending points in the complement of the neighborhood to $-1$. We next modify the map by multiplying by $-1$, so that the base point will be 1. Finally, we notice that the class $\beta_3$ is represented by a homomorphic image of $Sp(1)$ in any Lie group. For the classical groups, this homomorphism is just the standard inclusion, $Sp(1) = SU(2) \to SU(n+1)$, $Sp(1) = \mathrm{Spin}(3) \to \mathrm{Spin}(n)$, or $Sp(1) \to Sp(n)$. The homomorphism for each exceptional group is described in [4]. This matches exactly with the statement of Proposition 1. In the case we are considering here, $H_1(G; \mathbb{Z}) = 0$, and an element of $H^3(M; \pi_3(G)) \cong H^3(M; H_3(\tilde{G}; \mathbb{Z})) \cong H^3(M; H^{g-3}(\tilde{G}; \mathbb{Z}))$ is just a machine that eats a 3-cycle in $M$, i.e. $[M]$, and spits out a machine that eats a codimension 3-cycle in $G$, i.e. $F$, and spits out an integer. If $G$ is not simple, there will be independent codimension 3-cycles for each simple factor, and one could interpret the intersection number with each cycle as a different type of particle (soliton). If $G$ were not simply connected, the element of $H^1(M; H_1(G_0))$ would be the obvious one, and one obtains the element of $H^3(M; \pi_3(G))$ from a modification of the map into $G$ that lifts to $\tilde{G}$.

It is not difficult to describe the cycles $\beta_{2k+1}$ and $F$ for $SU(n+1)$. Recall that the suspension of a pointed topological space is $SX = X \times [0, 1]/(X \times \{0, 1\} \cup \{p_0\} \times [0, 1])$. This may be visualized as the product $X \times S^1$ with the circle above the marked point in $X$ and the copy of $X$ above a marked point in $S^1$ collapsed to a point. Identify $\mathbb{CP}^k$
with $U(k+1)/(U(1) \times U(k))$ and define $\beta_{2k+1} : S\mathbb{CP}^k \to SU(n+1)$ by
$$\beta_{2k+1}([A, t]) = [A, e^{\pi i t} \oplus e^{-\pi i t} I_k]_{\mathrm{com}} \oplus I_{n-k}.$$
Here, $[A, B]_{\mathrm{com}} = ABA^{-1}B^{-1}$ is the usual commutator in a group. The normalization constants of $x_{2k+1}$ would ensure that $\int_{\beta_{2k+1}} x_{2k+1} = 1$. The values of these constants for $k = 1$ have been computed in [4]. We do not need these constants for this present work. The value of the normalization constants for $k = 2$ would, for example, be important if one wished to add a Wess-Zumino term to the Skyrme Lagrangian.

The multiplication on a Lie group may be used to endow the homology of the Lie group with a unital, graded-commutative algebra structure, and the cohomology with a comultiplication. The homology product is given by $(\sigma : \Sigma \to G) \cdot (\sigma' : \Sigma' \to G) := (\sigma\sigma' : \Sigma \times \Sigma' \to G)$ and the comultiplication on cohomology is dual to this. The multiplication and comultiplication give $H^*(G; \mathbb{R})$ the structure of a Hopf algebra. It is exactly in this context that Hopf algebras were first defined. Using this algebra structure, we may give an explicit description of the Poincaré duality isomorphism. Any product of generators $x_j$ in $H^*(G; \mathbb{R})$ is sent to the element of $H_*(G; \mathbb{R})$ obtained from the product $\prod_{k=1}^n \beta_{2k+1}$ by removing the corresponding $\beta_j$. In particular, $F = \prod_{k=2}^n \beta_{2k+1}$ is the cycle Poincaré dual to $x_3$. Geometrically, Poincaré duality is described by the equation $\int_\Sigma \omega = \#(PD(\omega) \cap \Sigma)$.

Since $S\mathbb{CP}^k$ is not a manifold, some words about our interpretation of the normal bundle to $F$ are in order at this point. For $SU(2)$ we may take $F = \{-1\}$. This is a codimension 3 submanifold, so there are no problems. Recall that $\mathbb{CP}^k - \mathbb{CP}^{k-1}$ is homeomorphic to $\mathbb{R}^{2k}$. It follows that the subset of $F$, call it $F_0$, obtained from the product of the $S\mathbb{CP}^k - S\mathbb{CP}^{k-1}$ is a codimension 3 cell properly embedded in $SU(n+1) - (F - F_0)$. Since $F - F_0$ has codimension 5, we may assume, using general position, that any map of a 3-manifold into $SU(n+1)$ avoids $F - F_0$. As $F_0$ is contractible, it has a trivial normal bundle, justifying our assumption at the beginning of this description.

4.2. The fundamental group of $G^M$. The Pontrjagin-Thom construction may also be used to understand the isomorphism $\phi : \pi_1(G^M) \to \mathbb{Z}_2^s \oplus H^2(M; \pi_3(G))$ asserted in Theorem 2. A loop in $(G^M)_0$ based at the constant map $u(x) = 1$ may be regarded as a based map $\gamma : SM \to G$. The identifications in the suspension provide a particularly nice way to summarize all of the constraints on $\gamma$ imposed by the base points. We will use the same notation for the map $\gamma : M \times [0, 1] \to G$ obtained from $\gamma$ by composition with the natural projection. The inverse image $\gamma^{-1}(F)$ with framing obtained by pulling back the trivialization of $\nu(F)$ may be associated to $\gamma$. Conversely, given a framed link in $(M - p_0) \times (0, 1)$ one may construct an element of $\pi_1(G^M)$. Using the framing, each fiber of the closed tubular neighborhood to the link may be identified with the disk of radius $\pi$ in $\mathfrak{sp}(1)$. As before, $-1$ times the exponential map may be used to construct a map $\gamma : SM \to G$ representing an element of $\pi_1(G^M)$.

It is now possible to describe the geometric content of the isomorphism in Theorem 2. For a class of loops $[\gamma]$ in $(G^M)_0$, let $\phi(\gamma) = (\phi_1(\gamma), \phi_2(\gamma))$.
Restrict attention to the case of simply-connected $G$, and make the identifications $\pi_3(G) \cong H_3(G; \mathbb{Z}) \cong H^{g-3}(G; \mathbb{Z})$. An element of $H^2(M; \pi_3(G))$ may be interpreted as a function that associates an integer to a surface in $M$, say $\Sigma$, and a codimension 3 cycle in $G$, say $F$. Set
$$\phi_2(\gamma)(\Sigma, F) = \#(\Sigma \times [0, 1] \cap \gamma^{-1}(F)).$$
Note that $\gamma^{-1}(F)$ inherits an orientation from the framing and orientation on $M$. Using Poincaré duality this may be said in a different way. The homology class of $\gamma^{-1}(F)$ in $(M - p_0) \times (0, 1)$ projects to an element of $H_1(M)$ dual to the element associated to $\phi_2(\gamma)$.

The first component of the isomorphism counts the parity of the number of twists in the framing. Consider the framing in greater detail. Using a spin structure on $M$ we associate a canonical framing to any oriented 1-dimensional submanifold of $(M - p_0) \times (0, 1)$. See Proposition 14 in the proofs section. For now restrict attention to null-homologous submanifolds. Let $\Sigma$ be an oriented 2-dimensional submanifold of $(M - N(p_0)) \times (0, 1)$ with non-trivial boundary. The normal bundle to $\Sigma$ inherits an orientation from the orientations on $(M - p_0) \times (0, 1)$ and $\Sigma$. Oriented 2-plane bundles are classified by the second cohomology. Since $H^2(\Sigma; \mathbb{Z}) = 0$, the normal bundle is trivial. Let $(e_1, e_2)$ be an oriented trivialization of this bundle. Let $e_3 \in \Gamma(T\Sigma|_{\partial\Sigma})$ be the outward unit normal. The canonical framing on $\partial\Sigma$ is $(e_1, e_2, e_3)$. Given a second framing $(f_1, f_2, f_3)$ on $\partial\Sigma$ and an orientation preserving parameterization of the boundary, we obtain an element $A \in \pi_1(GL^+(3, \mathbb{R})) = \pi_1(SO(3)) \cong \mathbb{Z}_2$ satisfying $(f_1, f_2, f_3) = (e_1, e_2, e_3)A$. This is the origin of the first component of the isomorphism.

The generator of $\pi_1(Sp(1)^{S^3}) \cong \mathbb{Z}_2$ is represented by
$$\gamma : (\lambda, x_1, x_2) \mapsto \begin{pmatrix} x_1 & -\bar{\lambda}\bar{x}_2 \\ \lambda x_2 & \bar{x}_1 \end{pmatrix},$$
having identified $S^3$ with the unit sphere in $\mathbb{C}^2$ (so $|x_1|^2 + |x_2|^2 = 1$), $S^1 \cong U(1)$ and $Sp(1) \cong SU(2)$. The image of $\gamma$ under the obvious inclusion $\iota : SU(2) \to SU(3)$, that is, $\iota(U) = \mathrm{diag}(U, 1)$, is homotopically trivial, as can be seen by constructing an explicit homotopy between it and $\iota \circ \gamma(1, \cdot)$. First note that any $SU(3)$ matrix is uniquely determined by its first two columns, which must be an orthonormal pair.
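The basic properties of the loop $\gamma$ can be checked numerically. The sketch below assumes the reading of the displayed generator as the matrix with rows $(x_1, -\bar{\lambda}\bar{x}_2)$ and $(\lambda x_2, \bar{x}_1)$; it verifies that each value lies in $SU(2)$, that the base point $(x_1, x_2) = (1, 0)$ is sent to the identity for every $\lambda$ (so this is a loop of based maps), and that $\lambda = 1$ gives the standard identification of $S^3$ with $SU(2)$. The helper names are illustrative.

```python
import cmath
import math


def gamma(lam, x1, x2):
    """Value of the loop at parameter lam in U(1) and point (x1, x2) in S^3."""
    return [[x1, -lam.conjugate() * x2.conjugate()],
            [lam * x2, x1.conjugate()]]


def is_special_unitary(m, tol=1e-12):
    """Check that a 2x2 complex matrix has orthonormal columns and determinant 1."""
    (a, b), (c, d) = m
    det = a * d - b * c
    col1 = abs(a) ** 2 + abs(c) ** 2
    col2 = abs(b) ** 2 + abs(d) ** 2
    inner = a.conjugate() * b + c.conjugate() * d
    return (abs(det - 1) < tol and abs(col1 - 1) < tol
            and abs(col2 - 1) < tol and abs(inner) < tol)


def sphere_point(alpha, beta, delta):
    """A point of the unit sphere in C^2, parametrised by three angles."""
    return (math.cos(alpha) * cmath.exp(1j * beta),
            math.sin(alpha) * cmath.exp(1j * delta))


x1, x2 = sphere_point(0.4, 1.3, -0.8)   # arbitrary sample point
lam = cmath.exp(2.1j)                   # arbitrary loop parameter
print(is_special_unitary(gamma(lam, x1, x2)))
```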
For all $t \in [0, 1]$, let $\mu_t(\lambda) = t\lambda + 1 - t$ (so $\mu_1 = \mathrm{id}$ and $\mu_0 = 1$) and define
$$e := \begin{pmatrix} x_1 \\ \mu_t(\lambda) x_2 \\ \sqrt{1 - |\mu_t(\lambda)|^2}\, x_2 \end{pmatrix}, \qquad v := \begin{pmatrix} -\bar{\lambda}\bar{x}_2 \\ \bar{x}_1 \\ 0 \end{pmatrix}, \qquad v_\perp := v - (e^\dagger v)e.$$
Then
$$(t, \lambda, x_1, x_2) \mapsto \left( e, \frac{v_\perp}{|v_\perp|}, * \right)$$
is the required homotopy between $\iota \circ \gamma$ ($t = 1$) and the trivial loop based at $\iota : S^3 \to SU(3)$. It is straightforward to check that $e$ and $v$ are never parallel (so the map is well defined), that $(t, \lambda, 1, 0) \mapsto I_3$ for all $t, \lambda$ (this is a homotopy through loops of based maps $S^3 \to SU(3)$) and that $(t, 1, x_1, x_2) \mapsto \iota(x_1, x_2)$ for all $t$ (each loop is based at $\iota$). The homomorphic image of $Sp(1)$ is contained in a standardly embedded $SU(3)$ in each of the exceptional groups and the classical groups $SU(n+1)$, $n \geq 2$, and $\mathrm{Spin}(N)$, $N \geq 7$, [4]. This is the reason why the $\mathbb{Z}_2$ factors only correspond to the symplectic factors of the Lie group.

The following figures show some loops in the configuration spaces. For the first two figures, the horizontal direction represents the interval direction in $M \times [0, 1]$. The disks represent the $x$-$y$ plane in a coordinate chart in $M$, and we suppress the $z$ direction due to lack of space. Figure 1 shows two copies of a typical loop representing an element in a
symplectic $\mathbb{Z}_2$ factor. Only the first vector of the framing is shown in Fig. 1. The second vector is obtained by taking the cross product with the tangent vector to the curve in the displayed slice, and the final vector is the $z$-direction. It is easy to see that the left copy may be deformed into the right copy. We describe the left copy as follows: a particle and antiparticle are born; the particle undergoes a full rotation; the two particles then annihilate. The right copy may be described as follows: a first particle-antiparticle pair is born; a second pair is born; the two particles exchange positions without rotating; the first particle and second antiparticle annihilate; the remaining pair annihilates.

Notice that there are two ways a pair of particles can exchange positions. Representing the particles by people in a room, the two people may step forwards/backwards and sideways, following diametrically opposite points on a circle while always facing the back of the room. This is the exchange without rotating described in Fig. 1. This exchange is non-trivial in $\pi_1(Sp(1)^{S^3})$. The second way a pair of people may change positions is to walk around a circle at diametrically opposite points, always facing the direction in which they walk, so that they end up facing the direction opposite to the one in which they started. This second change of position is actually homotopically trivial.

Since the framed links in Fig. 1 avoid the slices $M \times \{0, 1\}$, they represent a loop based at the constant identity map. It is possible to describe a framing without drawing any normal vectors at all. The first vector may be taken perpendicular to the plane of the figure, the second vector may be obtained from the cross product with the tangent vector, and the third vector may be taken to be the suppressed $z$-direction. The framing obtained by following this convention is called the blackboard framing. We use the blackboard framing in Fig. 2.
The Pontrjagin-Thom construction may also be used to visualize loops in other components of the configuration space. Figure 2 shows a loop in the degree 2 component of the space of maps from M to Sp(1). We can also use the Pontrjagin-Thom construction to draw figures of homotopies between loops in configuration space. Figure 3 displays a homotopy between the loop corresponding to a canonically framed unknot and the constant loop. In this figure, the horizontal direction represents the second interval factor of M ×[0, 1]×[0, 1], the direction out of the page represents the first interval factor, the vertical direction represents
Fig. 1. The rotation or exchange loop
Fig. 2. The degree 2 exchange loop
Fig. 3. The contraction of a canonically framed contractible link
the $x$ direction, and the $y$ and $z$ directions are suppressed. The framing is given by the normal vector to the hemisphere, the $y$ direction and the $z$ direction.

4.3. Cohomology of $G^M$. We now turn to a description of the real cohomology of $G^M$. We will use the slant product to associate a cohomology class on $G^M$ to a pair consisting of a homology class on $M$ and a cohomology class on $G$. Recall that the slant product is a map $H^n(X \times Y; A) \otimes H_k(X; B) \to H^{n-k}(Y; A \otimes B)$, [33]. In addition the universal coefficient theorem allows us to identify $H^k(G^M; \mathbb{R})$ with $\mathrm{Hom}(H_k(G^M; \mathbb{Z}), \mathbb{R})$. Let $\sigma : \Sigma \to M$ be a singular chain representing a homology class in $H_d(M; \mathbb{R}) = H_d(M; \mathbb{Z}) \otimes \mathbb{R}$ (instead of viewing singular chains as linear combinations of singular simplices, we will combine them together and view a singular chain as a map of a special polytope into the space), and let $x_j$ be a cohomology class in $H^j(G; \mathbb{R})$. To define the image of the mu map, $\mu(\Sigma \otimes x_j)$, let $u : F \to G^M$ be a singular chain representing an element in $H_{j-d}(G^M)$. This induces a natural singular chain $\hat{u} : M \times F \to G$. The pullback produces $\hat{u}^* x_j \in H^j(M \times F; \mathbb{R})$. The formal definition of the mu map is then
$$\mu(\Sigma \otimes x_j)(u) := (\hat{u}^* x_j / \Sigma)[F]. \qquad (4.1)$$
Writing this in notation from the de Rham model of cohomology may help to clarify the definitions. In principle one could construct a homology theory based on smooth chains and make the following rigorous. The $\mu$ map produces a $(j-d)$-cocycle in $G^M$ from a $d$-cycle in $M$ and a $j$-cocycle in $G$. On the level of chains, let $e^d : D^d \to M$ be a $d$-cell, and $x_j$ be a closed $j$-form on $G$. Given a singular simplex $u : \Delta^{j-d} \to G^M$, let $\hat{u} : M \times \Delta^{j-d} \to G$ be the natural map and write
$$\mu(e^d \otimes x_j)(u) = \int_{D^d \times \Delta^{j-d}} \hat{u}^* x_j.$$
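Using the product formula for the boundary, $\partial(D^d \times \Delta^{j-d+1}) = (\partial D^d) \times \Delta^{j-d+1} + (-1)^d D^d \times \partial\Delta^{j-d+1}$, we can get a simple formula for the coboundary of the image of an element under the $\mu$-map. Let $v : \Delta^{j-d+1} \to G^M$ be a singular simplex, with $\hat{v} : M \times \Delta^{j-d+1} \to G$ the natural induced map and $f_k$ the face maps of $\Delta^{j-d+1}$. Then
$$\begin{aligned}
\delta(\mu(e^d \otimes x_j))(v) &= \sum_{k=0}^{j-d+1} (-1)^k \int_{D^d \times \Delta^{j-d}} (\widehat{v \circ f_k})^* x_j = \int_{D^d \times \partial\Delta^{j-d+1}} \hat{v}^* x_j \\
&= (-1)^{d+1} \int_{(\partial D^d) \times \Delta^{j-d+1}} \hat{v}^* x_j + (-1)^d \int_{\partial(D^d \times \Delta^{j-d+1})} \hat{v}^* x_j \\
&= (-1)^{d+1} \mu((\partial e^d) \otimes x_j)(v). \qquad (4.2)
\end{aligned}$$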
We used Stokes's theorem in the last line. It follows that $\mu$ is well defined at the level of homology. Theorem 3 asserts that $H^*((G^M)_0; \mathbb{R})$ is a finitely generated algebra with generators $\mu(\Sigma_d^j \otimes x_k)$, where $\{\Sigma_d^j\}$ and $\{x_k\}$ are bases for $H_*(M; \mathbb{R})$ and $H^*(G; \mathbb{R})$ respectively. The multiplication on $H^*(G^M; \mathbb{R})$ is given by the cup product. Recall that this is defined at the level of cochains by $(\alpha \smile \beta)(w) = \alpha({}_k w)\,\beta(w_\ell)$, where $\alpha$ is a $k$-cocycle, $\beta$ is an $\ell$-cocycle, $w$ is a $(k+\ell)$-singular simplex, ${}_k w$ is the front $k$-face and $w_\ell$ is the back $\ell$-face [33]. Note that $\smile$ is graded-commutative, that is, $\alpha \smile \beta = (-1)^{k\ell} \beta \smile \alpha$.

It is instructive to understand some classes that do not appear as generators. One might expect $\mu(\mathrm{pt} \otimes x_j)$ to be a generator in degree $j$. However, since $G^M$ consists of based maps, the induced map $\hat{u} : M \times F \to G$ arising from a chain $u : F \to G^M$ restricts to a constant map on $\mathrm{pt} \times F$. It follows that $\mu(\mathrm{pt} \otimes x_j) = 0$. There would be an analogous class if we considered the cohomology of the space of free maps. Turning to the other end of the spectrum, one might expect to see classes of the form $\mu(M \otimes x_3)$ in degree zero. Such classes certainly could not appear in the cohomology of the identity component $(G^M)_0$. In fact we stated our theorem for the identity component because the argument leading to generators of the form $\mu(\Sigma \otimes x_3)$ breaks down when $\Sigma$ is a 3-cycle and $x_3$ is a 3-cocycle. The argument starts by considering maps of spheres into the group $G$, and then assembles the cohomology of these mapping spaces (which are denoted by $\Omega^k G$) into the cohomology of $G^M$. The path fibration is used to compute the cohomology of the $\Omega^k G$. The fibration leading to the cohomology of $\Omega^3 G$ does not have a simply connected base, and this is where the argument breaks down. See Lemma 15. Finally one might expect to see classes of the form $\mu(\Sigma \otimes x_j \cup x_k)$. It will turn out in the course of the proof (Lemma 15) that such classes vanish.
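The sign $(-1)^d$ in the product boundary formula used to derive (4.2) is exactly what makes the boundary of a product square to zero. As a minimal sketch, the code below takes $d = 1$ and models both factors by an interval with two vertices and one edge (a cellular stand-in for $D^1 \times \Delta^1$, chosen here for illustration), then compares the signed formula with the naive unsigned one.

```python
# Cells of the interval: two vertices and one edge, with boundary(e) = v1 - v0.
DIM = {"v0": 0, "v1": 0, "e": 1}
BOUNDARY_1D = {"v0": {}, "v1": {}, "e": {"v1": 1, "v0": -1}}


def boundary_of_cell(a, b, signed=True):
    """Boundary of the product cell a x b.

    With signed=True this is (boundary a) x b + (-1)^dim(a) a x (boundary b);
    with signed=False the sign (-1)^dim(a) is dropped.
    """
    out = {}
    for a2, c in BOUNDARY_1D[a].items():
        out[(a2, b)] = out.get((a2, b), 0) + c
    sign = (-1) ** DIM[a] if signed else 1
    for b2, c in BOUNDARY_1D[b].items():
        out[(a, b2)] = out.get((a, b2), 0) + sign * c
    return out


def boundary(chain, signed=True):
    """Extend boundary_of_cell linearly to integer chains {cell: coefficient}."""
    out = {}
    for (a, b), coeff in chain.items():
        for cell, c in boundary_of_cell(a, b, signed).items():
            out[cell] = out.get(cell, 0) + coeff * c
    return {cell: c for cell, c in out.items() if c != 0}


square = {("e", "e"): 1}
print(boundary(boundary(square)))                 # empty chain: boundary squared vanishes
print(boundary(boundary(square, False), False))   # nonzero without the sign
```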
Up to this point, our geometric descriptions of the algebraic topology of configuration spaces have been simpler than we had any right to expect. We were able to describe the space of path components and the fundamental group of the configuration space of maps from an orientable 3-manifold into an arbitrary simply-connected Lie group by just considering subgroups isomorphic to $Sp(1)$. This will not hold for all homotopy invariants of $G^M$. The main object of interest to us is the second cohomology of the configuration space with integral coefficients, because this classifies the complex line bundles over the configuration space (the quantization ambiguity). It is possible to describe one second cohomology class on $Sp(n)^M$ in terms of $Sp(1)$ geometry. However we need to pass to $SU(3)$ subgroups to get at the second cohomology in general.

Before considering these geometric representatives of the second cohomology, briefly recall the definition of the Ext groups. Given $R$-modules $A$ and $B$, pick a free resolution of $A$, say $\cdots \to C_2 \to C_1 \to C_0 \to A$. The $k$th Ext group is just defined to be the $k$th homology of the complex $\mathrm{Hom}(C_*, B)$, i.e. $\mathrm{Ext}^k_R(A, B) = H_k(\mathrm{Hom}(C_*, B))$. When $R$ is a PID (principal ideal domain) every $R$-module has a free resolution of the form $0 \to C_1 \to C_0 \to A$. Given such a resolution one obtains the exact sequence
$$0 \to \mathrm{Hom}(A, B) \to \mathrm{Hom}(C_0, B) \to \mathrm{Hom}(C_1, B) \to \mathrm{Ext}^1_R(A, B) \to 0, \qquad (4.3)$$
and all higher Ext groups vanish. We will always take $R = \mathbb{Z}$ and drop the ground ring from the notation. Based on the above exact sequence, we say that the Ext groups
measure the failure of Hom to be exact, i.e. take exact sequences to exact sequences. The Ext group may also be identified with the collection of extensions of $A$ by $B$ [33]. By the universal coefficient theorem
$$H^2(G^M; \mathbb{Z}) \cong \mathrm{Ext}^1(H_1(G^M; \mathbb{Z}), \mathbb{Z}) \oplus \mathrm{Hom}(H_2(G^M; \mathbb{Z}), \mathbb{Z}),$$
$$H^2(G^M; \mathbb{R}) \cong \mathrm{Ext}^1(H_1(G^M; \mathbb{Z}), \mathbb{R}) \oplus \mathrm{Hom}(H_2(G^M; \mathbb{Z}), \mathbb{R}).$$
Now for all $A$, $\mathrm{Ext}^1(A, \mathbb{R}) = 0$, so $\mathrm{Hom}(H_2(G^M; \mathbb{Z}), \mathbb{Z})$ is a free abelian group of rank $b_2 = \dim_{\mathbb{R}} H^2(G^M; \mathbb{R})$. In addition, $\mathrm{Ext}^1(A, \mathbb{Z})$ is just the torsion subgroup of $A$ and $H_1(G^M; \mathbb{Z}) \cong \pi_1^{\mathrm{ab}}(G^M) = \pi_1(G^M)$. Hence
$$H^2(G^M; \mathbb{Z}) \cong \mathbb{Z}^{b_2} \oplus \mathrm{Tor}(\pi_1(G^M)),$$
where $\pi_1(G^M)$ and the Betti number $b_2$ may be obtained from Theorems 2 and 3.

We will use the universal coefficient theorem and Ext groups to describe some cohomology classes of our configuration spaces. There is a natural $\mathbb{Z}_2$ contained in the fundamental group of the configuration space for any group with a symplectic factor. This $\mathbb{Z}_2$ is generated by the exchange loop. Wrapping twice around the exchange loop is the boundary of a disk in the configuration space. Since $\mathbb{RP}^2$ is the result of identifying the points on the boundary of a disk via a degree 2 map, one expects to find an $\mathbb{RP}^2$ embedded into any of the Skyrme configuration spaces with a symplectic factor. In [32], R. Sorkin describes an embedding $f_{\mathrm{stat}} : \mathbb{RP}^2 \to SU(n+1)^{S^3}$. He also describes an embedding $f_{\mathrm{spin}} : \mathbb{RP}^3 \to SU(n+1)^{S^3}$. He further shows that $f_{\mathrm{spin}}$ restricted to the $\mathbb{RP}^2$ subspace is homotopic to $f_{\mathrm{stat}}$. Using the map $M \to M^{(3)}/M^{(2)} \cong S^3$, these induce maps into $SU(n+1)^M$. Here $M^{(k)}$ is the $k$-skeleton of $M$ with respect to some CW structure. In fact, using the inclusion of $Sp(1) = SU(2)$ into any simply-connected simple Lie group one obtains maps from $\mathbb{RP}^2$ and $\mathbb{RP}^3$ into any configuration space of Lie group valued maps. This is most interesting when the map factors through a symplectic factor.

We briefly recall Sorkin's elegant construction. Describe $\mathbb{RP}^2$ as the 2-sphere with antipodal points identified.
By the addition of particle-antiparticle pairs, we may assume that there are two particles in a coordinate chart. We may place the particles at antipodal points of a sphere in a coordinate chart using frames parallel to the coordinate directions. The map obtained from these frames using the Pontrjagin-Thom construction is $f_{\mathrm{stat}}$. The projective space $\mathbb{RP}^3$ is homeomorphic to the rotation group $SO(3)$. The map $f_{\mathrm{spin}}$ may be described by using $SO(3)$ to rotate a single frame and then applying the Pontrjagin-Thom construction. Sorkin includes a second unaffected particle in his description of $f_{\mathrm{spin}}$ to make the comparison with $f_{\mathrm{stat}}$ easier.

A degree one map $M \to S^3$ (which always exists) induces a map $G^{S^3} \to G^M$. The space $G^{S^3}$ is typically denoted $\Omega^3 G$. If the Lie algebra of the maximal compact subgroup admits a symplectic factor, then we have an interesting map $Sp(1) \to G$ which induces a map $\Omega^3 Sp(1) \to \Omega^3 G$. We will see in the course of our proofs that on the level of $\pi_1$ or $H_1$ these maps give a sequence of injections
$$H_1(\mathbb{RP}^2; \mathbb{Z}) \to H_1(\mathbb{RP}^3; \mathbb{Z}) \to H_1(\Omega^3 Sp(1); \mathbb{Z}) \to H_1(\Omega^3 G; \mathbb{Z}) \to H_1(G^M; \mathbb{Z}).$$
The universal coefficient theorem implies that there is a $\mathbb{Z}_2$ factor in the second cohomology of $G^M$ when $G$ contains a symplectic factor. In fact, in this case we see that twice the exchange loop is a generator of the 1-dimensional boundaries. This means we can define a homomorphism from $B_1$ (the 1-dimensional boundaries) to $\mathbb{Z}$ taking twice the
exchange loop to 1. The cocycle defined by following the boundary map from 2-chains by this homomorphism generates the $\mathbb{Z}_2$ in $H^2(G^M; \mathbb{Z})$. We see that this class evaluates nontrivially on the Sorkin $\mathbb{RP}^2$.

When $G$ has unitary factors, there will be infinite cyclic factors in the second cohomology of $G^M$. This is nicely explained by a construction of Ramadas, [28]. Ramadas constructs a map $S^2 \times S^3 \to SU(3)$. This construction goes as follows. He first defines a map $K : SU(2) \to SU(2)$ by
$$K\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} = (|a|^4 + |b|^4)^{-\frac{1}{2}} \begin{pmatrix} a^2 & -\bar{b}^2 \\ b^2 & \bar{a}^2 \end{pmatrix}.$$
This map satisfies $K(\mathrm{diag}(\lambda, \bar{\lambda})A) = K(A)\,\mathrm{diag}(\lambda^2, \bar{\lambda}^2)$. Finally define $\sigma : S^1 \backslash SU(2) \times SU(2) \to SU(3)$ by
$$\sigma([A], B) = \mathrm{diag}(1, K(A))\,\mathrm{diag}(ABA^*, 1)\,\mathrm{diag}(1, K(A)^*).$$
Here we are viewing $S^1$ as the subgroup $\mathrm{diag}(\lambda, \bar{\lambda})$ of $SU(2)$. It is well known that $S^1 \backslash SU(2) \cong S^2$ and $SU(2) \cong S^3$. The map $\sigma : S^2 \times S^3 \to SU(3)$ induces a map $\sigma : S^2 \to \Omega^3 SU(3)$. Ramadas shows that this map generates $H_2(\Omega^3 SU(3); \mathbb{Z}) \cong \mathbb{Z}$. Combining with the degree one map from $M$ and the inclusion into a special unitary factor of $G$, we obtain a map $S^2 \to G^M$ generating an infinite cyclic factor of $H_2(G^M; \mathbb{Z})$. By the universal coefficient theorem a map from $H_2(G^M; \mathbb{Z})$ to $\mathbb{Z}$ taking this generator to 1 is a cohomology class in $H^2(G^M; \mathbb{Z})$. Clearly this class evaluates non-trivially on this $S^2$. If $G$ does not have a symplectic or special unitary factor, then there is no reason to expect any elements of the second cohomology. In fact under this hypothesis, $H^2(\Omega^3 G; \mathbb{Z}) = 0$.

It is worth mentioning how these maps behave in general. The third homotopy group of any Lie group is generated by homomorphic images of $Sp(1)$. Each time one of these generators is contained in a symplectic factor, we get a $\mathbb{Z}_2$ in the second cohomology detected by a Sorkin $\mathbb{RP}^2$. When one of these factors is not contained in a symplectic factor, it is contained in a copy of $SU(3)$. This kills the $\mathbb{Z}_2$ factor in $\pi_1$ as explained above in Subsect. 4.2.
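The equivariance property of $K$ quoted above is what makes Ramadas's map $\sigma$ well defined on the quotient $S^1 \backslash SU(2)$, and both statements can be checked numerically. The sketch below assumes the reading of $K$ in which the argument has first row $(a, b)$ and the value has first column proportional to $(a^2, b^2)$; the sample matrices and helper names are arbitrary choices made here for illustration.

```python
import cmath


def matmul(X, Y):
    """Product of two square complex matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of two square matrices of the same size."""
    n = len(X)
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(n) for j in range(n))


def su2(a, b):
    """SU(2) matrix with first row (a, b); assumes |a|^2 + |b|^2 = 1."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def K(A):
    """The map K : SU(2) -> SU(2), reading a and b off the first row of A."""
    a, b = A[0][0], A[0][1]
    n = (abs(a) ** 4 + abs(b) ** 4) ** -0.5
    return [[n * a * a, -n * b.conjugate() ** 2],
            [n * b * b, n * a.conjugate() ** 2]]


def one_then(X):
    """diag(1, X) as a 3x3 matrix."""
    return [[1, 0, 0], [0, X[0][0], X[0][1]], [0, X[1][0], X[1][1]]]


def then_one(X):
    """diag(X, 1) as a 3x3 matrix."""
    return [[X[0][0], X[0][1], 0], [X[1][0], X[1][1], 0], [0, 0, 1]]


def sigma(A, B):
    """sigma([A], B) = diag(1, K(A)) diag(A B A^*, 1) diag(1, K(A)^*)."""
    KA = K(A)
    conj = matmul(matmul(A, B), dagger(A))
    return matmul(matmul(one_then(KA), then_one(conj)), one_then(dagger(KA)))


# Arbitrary sample data.
A = su2(0.6 * cmath.exp(0.7j), 0.8 * cmath.exp(-1.1j))
B = su2(cmath.exp(0.3j) / 2 ** 0.5, cmath.exp(2.0j) / 2 ** 0.5)
lam = cmath.exp(0.9j)
D = [[lam, 0], [0, lam.conjugate()]]
D2 = [[lam ** 2, 0], [0, lam.conjugate() ** 2]]

print(close(K(matmul(D, A)), matmul(K(A), D2)))    # equivariance of K
print(close(sigma(matmul(D, A), B), sigma(A, B)))  # sigma descends to the quotient
```

The second check works because $\mathrm{diag}(1, \lambda^2, \bar{\lambda}^2)\,\mathrm{diag}(\lambda, \bar{\lambda}, 1) = \mathrm{diag}(\lambda, \lambda, \bar{\lambda}^2)$ acts as a scalar on the upper left $2 \times 2$ block and therefore cancels under conjugation.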
If the $SU(3)$ is contained in a special unitary factor, the Sorkin map $\mathbb{RP}^2 \to Sp(1)^M \to SU(3)^M \to G^M$ pulls back the second cohomology class described by Ramadas (and extended to arbitrary $M$ and $G$ with special unitary factor as above) to the generator of $H^2(\mathbb{RP}^2; \mathbb{Z}) \cong \mathbb{Z}_2$. (Ramadas proves that the generator of $H^2(\Omega^3 SU(3); \mathbb{Z})$ pulls back to the generator of $H^2(\mathbb{RP}^2; \mathbb{Z})$ and the rest follows from our proofs.) If this $SU(3)$ is not contained in a special unitary factor, it follows from our proofs that the second homology class associated to $S^2 \to G^M$ bounds, so there is no associated cohomology class.

4.4. Components of $(S^2)^M_\varphi$. The picture of the components of $(S^2)^M_\varphi$ arising from the Pontrjagin-Thom construction and Poincaré duality is quite nice. The inverse image of a regular value in $S^2$ is Poincaré dual to $\varphi^*\mu_{S^2}$. The number of twists in the framing of a second map with the same pull-back is the element of $H^3(M; \mathbb{Z})/\langle 2\varphi^*\mu_{S^2} \rangle$. This is very similar to the description of elements of the fundamental group of $G^M$ when $G$ has symplectic factors. We give three examples to clarify this. Identify $S^2$ with $\mathbb{CP}^1$ and consider the maps $\varphi_1, \varphi_1', \varphi_3 : \mathbb{CP}^1 \times S^1 \to \mathbb{CP}^1$ given by
$$\varphi_1([z : w], \lambda) = [z : w], \qquad \varphi_1'([z : w], \lambda) = [\lambda z : w], \qquad \text{and} \qquad \varphi_3([z : w], \lambda) = [z^3 : w^3]. \qquad (4.4)$$
We can view $\mathbb{CP}^1 \times S^1$ as $S^2 \times [0, 1]$ (a spherical shell) with the inner and outer spheres ($S^2 \times \{0\}$ and $S^2 \times \{1\}$) identified. Using this convention and the framing conventions from Subsect. 4.2, we have displayed the framed 1-manifolds arising as the inverse images of a regular value in Fig. 4. It may appear that there is a well defined twist number associated to an $S^2$-valued map. However, there is a homeomorphism of $\mathbb{CP}^1 \times S^1$ twisting the 2-sphere (such a map is given by $([z : w], \lambda) \mapsto ([\lambda z : w], \lambda)$). This will change the number of twists in a framing, but will not change the relative number of twists. The reason why this relative number of twists is at most well defined modulo twice the divisibility of the cohomology class $\varphi^*\mu_{S^2}$ is demonstrated for $\varphi_1$ in Fig. 5.

4.5. Fundamental group of $(S^2)^M_\varphi$. An element of $\pi_1((S^2)^M_\varphi)$ is represented by a map $\gamma : M \times S^1 \to S^2$. The inverse image of a regular value is a 2-dimensional submanifold, say $\Sigma$. This defines an element of $H^1(M; \mathbb{Z})$ as follows. To any 1-cycle in $M$, say $\sigma$, we associate the intersection number of $\Sigma$ and $\sigma \times S^1$. Since our loop is in the path component of $\varphi$, the surface $\Sigma$ is parallel to the $\varphi$-inverse image of a regular value. This implies that our element of $H^1(M; \mathbb{Z})$ is in the kernel of the map $2\varphi^*\mu_{S^2} \smile : H^1(M; \mathbb{Z}) \to H^3(M; \mathbb{Z})$. Given any element of this kernel, we can define a loop in $(S^2)^M_\varphi$ via the $q$-map defined in Sect. 6.3. There is a map $u : M \times S^1 \to Sp(1)$ that may be used to change this new loop back into $\gamma$. The remaining homotopy invariants of $\gamma$ are just those of $u$ as described in Subsect. 4.2.
5. Physical Consequences

As explained in Sect. 3, the configuration space of the Skyrme model with arbitrary target group is homotopy equivalent to the configuration space of a collection of uncoupled Skyrme fields each taking values in a compact, simply connected, simple Lie group. We will therefore assume, throughout this section, that $G$ is compact, simply connected and simple. In this case, by Proposition 1, the path components of $G^M$ are labelled by $H^3(M; \mathbb{Z}) \cong \mathbb{Z}$, identified with the baryon number $B$ of the configuration. This identification has already been justified by consideration of the Pontrjagin-Thom construction. Let us denote the baryon number $B$ sector by $Q_B$.
Fig. 4. Pontrjagin-Thom representatives of the S 2 -valued maps ϕ1 , ϕ1 and ϕ3
Fig. 5. Introducing 2d twists
Quantization and Configuration Spaces for Skyrme and Faddeev-Hopf Models
We first recall how Finkelstein and Rubinstein introduced fermionicity to the Skyrme model [15]. The idea is that the quantum state is specified by a wavefunction on $\widetilde{Q}_B$, the universal cover of $Q_B$, rather than $Q_B$ itself. By the uniqueness of lifts, there is a natural action of $\pi_1(Q_B)$ on $\widetilde{Q}_B$ by deck transformations. Let $\pi : \widetilde{Q}_B \to Q_B$ denote the covering projection, $\lambda \in \pi_1(Q_B)$ and $D_\lambda$ be the associated deck transformation. Since all points in $\pi^{-1}(u)$ are physically indistinguishable, we must impose the constraint $|\psi(D_\lambda q)| = |\psi(q)|$ on the wavefunction $\psi : \widetilde{Q}_B \to \mathbb{C}$, for all $q \in \widetilde{Q}_B$ and $\lambda \in \pi_1(Q_B)$. This leaves us the freedom to assign phases to the deck transformations; that is, the remaining quantization ambiguity consists of a choice of $U(1)$ representation of $\pi_1(Q_B)$. The possibility of fermionic quantization arises if the two-Skyrmion exchange loop in $Q_2$ is noncontractible with even order: we can then choose a representation which assigns this loop the phase $-1$. In this case our wavefunction acquires a minus sign under Skyrmion interchange. Clearly, the Finkelstein-Rubinstein model could apply to any sigma model with a configuration space admitting non-trivial elements of the fundamental group representing the exchange of identical particles. In particular, the domain does not have to be $\mathbb{R}^3$. Note we have insisted that the wavefunction $\psi$ have support on a single path component $Q_B$, because baryon number is conserved in nature, so transitions which change $B$ have zero probability. It seems, then, that the choice of representation of $\pi_1(Q_B)$ can be made independently for each $B$, but in fact there is a strong consistency requirement between the representations associated with the various components. Recall that all the sectors are homeomorphic and that given any $u \in Q_B$ one obtains a homeomorphism $Q_0 \to Q_B$ by pointwise multiplication by $u$. Hence, to each $u \in Q_B$ there is associated an isomorphism $\pi_1(Q_0) \to \pi_1(Q_B)$, so one has a map $Q_B \to \mathrm{Iso}(\pi_1(Q_0), \pi_1(Q_B))$.
Since $Q_B$ is connected and $\pi_1$ is discrete, this map is constant, that is, there is a canonical isomorphism $\pi_1(Q_0) \to \pi_1(Q_B)$, which may be obtained by pointwise multiplication by any charge $B$ configuration. Having chosen a representation of $\pi_1(Q_0)$, we obtain canonical representations of $\pi_1(Q_B)$ for all other $B$. Physically, we are demanding that the phase introduced by transporting a configuration around a closed loop should be independent of the presence of static Skyrmions held remote from the loop. This imposes nontrivial consistency conditions if we are to obtain a genuinely fermionic quantization. In particular, the loop in $Q_{2B}$ consisting of the exchange of a pair of identical charge $B$ Skyrmions must be assigned the phase $(-1)^B$, since a charge $B$ Skyrmion represents a bound state of $B$ nucleons, which is a fermion for $B$ odd and a boson for $B$ even. The Finkelstein-Rubinstein formalism can be used to give a consistent fermionic quantization of the Skyrme model on any domain $M$ if $G = Sp(n)$, but not for any of the other simple target groups. When $G = Sp(n)$, Theorem 2 tells us that $\pi_1(Q_B) \cong \mathbb{Z}_2 \oplus H_1(M)$, and we can choose (and fix) a $U(1)$ representation which maps the generator of $\mathbb{Z}_2$ to $(-1)$. The generator of the $\mathbb{Z}_2$-factor in the baryon number zero component is exactly the rotation-exchange loop, as may be seen in the proof of Proposition 13 in the next section. To see that this assigns phase $(-1)$ to the 2-Skyrmion exchange loop, we may consider the Pontrjagin-Thom representative of the loop. This is a framed 1-cycle in $S^1 \times M$ depicted in Fig. 6. It is framed-cobordant to the representative of the loop in which one of the Skyrmions remains static, while the other rotates through $2\pi$ about its center. Figure 6 gives a sketch of the cobordism. The horizontal direction represents the
D. Auckly, M. Speight
Fig. 6. The cobordism between Skyrmion rotation and Skyrmion exchange
loop parameter (“time”), the vertical direction represents M and the direction into the page represents the cobordism parameter. The framing has been omitted, and the start and end 1-cycles of the cobordism have been repeated, for clarity. Note that the apparent self intersection of the cobordism (along the dashed line) is an artifact of the pictorial projection from 5 dimensions to 3. Hence, the exchange loop in Q2 is homotopic to the loop represented by one static Skyrmion and one Skyrmion that undergoes a full rotation. To identify the phase assigned to this homotopy class, we must transfer the loops to Q0 by adding a pair of anti-Skyrmions, as depicted in Fig. 7. This changes each configuration by multiplying by a fixed charge −2 configuration which is 1 outside a small ball – precisely one of the homeomorphisms discussed above. The figure may be described thus: the exchange loop is homotopic to the rotation loop with an extra static 1-Skyrmion lump (far left) which is transferred to the vacuum sector by adding a stationary pair of anti-Skyrmions (2nd box). This loop is homotopic to the charge 0 rotation loop of Fig. 1, via the sequence of moves shown. The orientations on the curves indicate how to assign a framing via the blackboard framing convention. The resulting Pontrjagin-Thom representative is framed cobordant to the charge 0 exchange loop described in Sect. 4.2, which, as explained, generates the Z2 factor in π1 (Q0 ). Hence, the loop along which two identical 1-Skyrmions are exchanged (without rotating) around a contractible path in M is assigned the phase (−1). Exchange of higher charge Skyrmions may be treated by considering composites of B unit Skyrmions, as depicted for B = 2 in Fig. 8. The loop may be deformed into one with four distinct single exchange events (surrounded by dashed boxes). Each of these may be replaced by a pair of uncrossed strands, one of which has a 2π twist, using the homotopy described in Figs. 1 and 6 in each box. 
Since each strand has an even number of twists, this is homotopic to the constant loop. Hence it must be assigned the phase
Fig. 7. Mapping the Skyrmion exchange loop into the vacuum sector Q0
Fig. 8. Exchange of baryon number 2 Skyrmions is contractible in Q4
(+1). The argument clearly generalizes: given an exchange loop of a pair of charge $B$ composites, one may isolate $B^2$ single exchange events, each of which can be replaced by a single twist in one of the uncrossed strands. It is easy to see that the twists may be distributed so that every strand except at most one has an even number of twists. Hence if $B$ is even, this last strand also has an even number of twists and the loop is necessarily contractible. If $B$ is odd, the last strand has an odd number of twists, so the loop is homotopic to the loop where $2B - 1$ Skyrmions remain static and one Skyrmion executes a $2\pi$ twist. Adding $2B$ anti-Skyrmions, this loop is identified with the baryon number 0 exchange loop and hence receives a phase of $(-1)$. Finkelstein and Rubinstein also model spin in this framework. In this model spin is determined by the phase associated to the rotation loop. As we saw in the previous section, the rotation and exchange loops agree up to homotopy, confirming the spin-statistics theorem in this model. This is essentially the observation that the exchange loop is homotopic to the $2\pi$ rotation loop (Fig. 6). Note that throughout the above discussion we have used only a local version of exchange to model particle statistics and to verify the spin-statistics correlation. This definition of particle statistics and spin makes sense on general $M$, even without an action of $SO(3)$, because the exchange and rotation loops have support over a single coordinate chart, so we have a local notion of rotating a Skyrmion. Things become much more subtle when a loop has a Pontrjagin-Thom representative which projects to a nontrivial cycle in $M$. It should not be surprising that it requires a spin structure on $M$ to specify whether the constituent Skyrmions of such a loop undergo an even or odd number of rotations. It was precisely the generalization from $\mathbb{R}^3$ to general spaces that motivated the definition of spin structures in the first place.
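The counting argument for a pair of charge $B$ composites given above comes down to a parity computation. The following is only a sketch, assuming (as in the text) that each of the $B^2$ single exchange events is traded for one $2\pi$ twist and that each $2\pi$ twist carries the phase $(-1)$:

```latex
% Phase of the exchange loop for two charge-B composites,
% assuming one 2\pi twist per exchange event and phase (-1) per twist.
\[
  (-1)^{B^2} \;=\; (-1)^{B}\,(-1)^{B(B-1)} \;=\; (-1)^{B},
  \qquad \text{since } B(B-1) \text{ is always even.}
\]
% B = 2: four exchange events, total phase (+1).
% B = 3: nine exchange events, total phase (-1).
```

This reproduces the phase $(+1)$ found for $B = 2$ and the requirement that a charge $B$ composite be fermionic exactly when $B$ is odd.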
Notice that by changing the spin structure, we can interpret a loop as either having an even or odd number of rotations, so one must fix a spin structure on space before discussing spin. (This is similar to the reason, discussed in Sect. 4.4 above, why the secondary invariant for path components of S 2 -valued maps is only a relative invariant.) Even in the simple case of quantization of many point particles on a topologically nontrivial domain, the statistical type (boson, fermion or something more exotic) of the particles is usually taken to be determined only by their exchange behavior around trivial loops in M [21]. It may be more reasonable to require that spin be determined by the behavior of locally supported rotations, but to insist that the statistical type be consistent under any particle exchange. It follows from Proposition 13 that the notion of an exchange or rotation loop around a contractible loop is well defined independent of the choice of spin structure. This is just the image of π4 (G). That the parity of a rotation around a non-contractible loop is determined by a spin structure is explained in Proposition 14 in the next section. In fact, the Finkelstein-Rubinstein quantization scheme remains consistently fermionic in this extended sense provided that the correct representation into U(1) is chosen. As with more traditional models of spin, a spin structure on the domain will be required. When the domain has non-trivial first cohomology with Z2 coefficients there are many
spin structures to choose from. Selecting a spin structure produces an isomorphism of π1 (QB ) with Z2 ⊕H1 (M). The required representation is just projection onto the Z2 factor. Exchange around a (possibly) non-contractible simple closed curve in space means that two identical solitons start at antipodal points on the curve and each one moves without rotating half way around the curve to exchange places with the other soliton. The notion of moving without rotating is where the spin structure enters. We will define this after describing the representation of an exchange displayed in Fig. 9. Each rectangle in this figure represents a slice of a cobordism. The horizontal direction represents time, the vertical direction represents space, and the thick lines are the world lines of the solitons. The top and bottom of each rectangle are to be identified to make each slice a cylinder representing the curve cross time. We can imagine different spin structures obtained by identifying the top and bottom in the straightforward way or by putting a full twist before making the identification. The first slice is just the exchange around the curve. One of the loops makes a lefthand rotation followed by a righthand rotation, but this wobble is the same as no rotation at all. Adding a ribbon between the non-rotating soliton and the right rotation of the bottom soliton produces the second slice. This slice may be described as one soliton making a full left rotation in a fixed location while a second soliton traverses once around the curve without rotation. This slice homotopes to the third slice, and a second ribbon gives the fourth slice. The fourth slice may be described as follows. The vertical S-curve represents the birth of a Skyrmion-anti-Skyrmion pair after which the Skyrmion and anti-Skyrmion move in opposite directions around the curve until they collide and annihilate. 
The horizontal lines are two (nearly) static Skyrmions, and the figure eight curve is a contractible (left) rotation loop. By definition, the exchange is non-rotating with respect to the spin structure if the $\mathbb{Z}_2$ representation of the vertical S-curve resulting from a baryon number 1 exchange is trivial. We now see that the general exchange is consistent because the two horizontal lines contribute nothing to the representation, the S-curve contributes nothing, and the baryon number $B$ contractible rotation loop contributes $(-1)^B$ as described previously and seen in Fig. 8.
As will be seen in Sect. 6.3, there is a close connexion between $Sp(1)^M$ and $(S^2)^M$. This allows us to transfer the Finkelstein-Rubinstein construction of a fermionic quantization scheme for $Sp(N)$-valued Skyrmions to the Faddeev-Hopf model. Recall that here, unless $H^2(M;\mathbb{Z}) = 0$, the path components of $Q$ are not labelled simply by an integer, but rather are separated by an invariant $\alpha \in H^2(M)$, and a relative invariant $c \in H^3(M)/(2\alpha \smile H^1(M))$. Configurations with $\alpha \neq 0$ necessarily have support which wraps around a nontrivial cycle in $M$. They therefore lack one of the key point-like characteristics of conventional solitons: they are not homotopic to arbitrarily highly localized configurations. Such configurations are intrinsically tied to some topological “defect” in physical space, and so are somewhat exotic. We therefore mainly restrict our attention to configurations with $\alpha = 0$. As with Skyrmions, these configurations are labelled by $B \in \mathbb{Z} \cong H^3(M;\mathbb{Z})$, which we identify with the Hopfion number. This is the relative
Fig. 9. Sum of a loop with the rotation loop
invariant between the given configuration and the constant map. It turns out that all the $\alpha = 0$ sectors, $Q_B$, are homeomorphic (Theorem 5) and, further, that $Q_B$ is homotopy equivalent to $(Sp(1))^M_0$, the vacuum sector of the classical Skyrme model. The homotopy equivalence is given by the fibration $f_*$ of Lemma 20. This may be used to map the charge zero Skyrmion exchange loop to a charge 2 Hopfion exchange loop, generating a $\mathbb{Z}_2$ factor in $\pi_1(Q_2)$. It follows that Hopfions can be quantized fermionically within the Finkelstein-Rubinstein scheme. In the more exotic configurations with $\alpha \neq 0$, the relative invariant takes values in $\mathbb{Z}/(2d\mathbb{Z}) \cong H^3(M;\mathbb{Z})/(2\alpha \smile H^1(M;\mathbb{Z}))$. So even though the Hopfion number is not defined in this case, the parity of the Hopfion number is well defined and the Finkelstein-Rubinstein formalism still yields a consistently fermionic quantization scheme.
We return now to the Skyrme model, but in the case where $G$ is not $Sp(N)$ but is rather $SU(N)$, $Spin(N)$ or exceptional. In this case the Skyrmion exchange loop is contractible, so it must be assigned phase $(+1)$ in the Finkelstein-Rubinstein quantization scheme, and only bosonic quantization is possible in that framework. To proceed, one may take the wavefunction to be a section of a complex line bundle over $Q_B$ equipped with a unitary connexion. Parallel transport with respect to the connexion associates phases to closed loops in $Q_B$ in a way that one might hope will mimic fermionic behaviour. The problem with this is that the holonomy of a loop is not (for non-flat connexions) homotopy invariant, so the phase assigned to an exchange loop will depend on the fine detail of how the exchange is transacted. To get around this, Sorkin introduced a purely topological definition of statistical type and spin for solitons defined on $\mathbb{R}^3$. His definition extends immediately to solitons defined on arbitrary domains. We review these definitions next.
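For the $\alpha \neq 0$ sectors discussed above, the claim that the parity of the Hopfion number survives can be spelled out in one line. This is a sketch under the assumption that $d$ denotes the divisibility of $\alpha$, so that the relative invariant is an integer $c$ defined only up to adding multiples of $2d$:

```latex
% Reduction mod 2 is well defined on Z/(2dZ) because 2d is even.
\[
  c' = c + 2dk \quad (k \in \mathbb{Z})
  \;\Longrightarrow\;
  c' \equiv c \pmod{2},
  \qquad\text{so}\qquad
  \mathbb{Z}/(2d\mathbb{Z}) \longrightarrow \mathbb{Z}_2, \quad [c] \longmapsto c \bmod 2
\]
% is a well-defined homomorphism, even though c itself is not an integer invariant.
```

The integer $c$ is ambiguous, but its reduction mod 2 is not, which is all the Finkelstein-Rubinstein sign assignment needs.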
Recall Sorkin's definition of $f_{\mathrm{stat}} : \mathbb{RP}^2 \to Q_2$: choose a sphere in $\mathbb{R}^3$ and associate to each antipodal pair of points on this sphere the charge 2 configuration with Pontrjagin-Thom representation given by that pair of points, framed by the coordinate basis vectors. The subscript stat in $f_{\mathrm{stat}}$ refers to statistics. There is an associated homomorphism $f^*_{\mathrm{stat}} : H^2(Q_2) \to H^2(\mathbb{RP}^2) \cong \mathbb{Z}_2$ given by pullback. According to Sorkin's definition, the quantization wherein the wavefunction is a section of the line bundle over $Q_2$ associated with class $c \in H^2(Q_2;\mathbb{Z})$ is fermionic if $f^*_{\mathrm{stat}}(c) = 1$, bosonic otherwise [32]. Thinking of $f_{\mathrm{stat}}$ as an inclusion map, the pulled-back class $f^*_{\mathrm{stat}}(c)$ represents the Chern class of the restriction of the bundle associated to $c$ over $Q_2$ to the subset $\mathbb{RP}^2$. The intuition behind this definition is that if there were a unitary connection on the bundle with parallel transport equal to $(-1)$ around exchange loops, then the restriction of the bundle to such an $\mathbb{RP}^2$ would have to be non-trivial. This definition generalizes to solitons defined on arbitrary domains by analogous maps from $\mathbb{RP}^2$ into the configuration space based on embeddings of $S^2$ in the domain. To make sense of the framing, one must pick a trivialization of the tangent bundle of the domain restricted to the $S^2$. Up to homotopy there is a unique such framing. The elementary Sorkin maps will be the ones associated with sufficiently small spheres that lie in a single coordinate chart.
To model spin for solitons with domain $\mathbb{R}^3$, Sorkin considers the action of $SO(3)$ on the configuration space given by precomposition with any field. The orbit of a basic soliton with Pontrjagin-Thom representative given by one point and an arbitrary frame has representatives obtained by rotating the frame. This rotation may be performed on any isolated lump in any component of configuration space to define a map $f_{\mathrm{spin}} : SO(3) \to Q_B$.
Sorkin defines the quantization associated to a class $c \in H^2(Q_B;\mathbb{Z})$ to be spinorial if and only if the pull-back $f^*_{\mathrm{spin}} c \in H^2(SO(3);\mathbb{Z}) \cong \mathbb{Z}_2$ is non-trivial [32]. This definition generalizes immediately to solitons on arbitrary domains. The intuition behind this definition is clearly explained in the paper of Ramadas [28]. The idea is
that the classical $SO(3)$ symmetry of $Q_B$ lifts to a quantum $Sp(1) = SU(2) = Spin(3)$ symmetry that descends to an $SO(3)$ action on the space of quantum states if and only if the pull-back $f^*_{\mathrm{spin}} c \in H^2(SO(3);\mathbb{Z})$ is trivial. In the case where the domain is not $\mathbb{R}^3$, there is no $SO(3)$ symmetry, but there is still an $SO(3)$ orbit of any single soliton, obtained by rotating the framing of a single-point Pontrjagin-Thom representative, so one still has a local notion of what it means to rotate a soliton. One may still define a map $f_{\mathrm{spin}} : SO(3) \to Q_B$ and define the quantization corresponding to class $c$ to be spinorial if and only if $f^*_{\mathrm{spin}} c \neq 0$, though there is no corresponding statement about quantum symmetries. This should be contrasted with the case of isospin, which we discuss later in this section.
Sorkin proves a version of the spin statistics theorem when the domain is $\mathbb{R}^3$. Recall that rotations may be represented by vectors along the axis of the rotation with magnitude equal to the angle of rotation. A one-half rotation in one direction is equivalent to a one-half rotation in the opposite direction. This gives a natural inclusion of $\mathbb{RP}^2$ into $SO(3)$ as the set of one-half rotations. This inclusion induces an isomorphism on cohomology, $\iota^* : H^2(SO(3);\mathbb{Z}) \to H^2(\mathbb{RP}^2;\mathbb{Z})$. Sorkin's version of the spin statistics theorem states that $\iota^* f^*_{\mathrm{spin}} = f^*_{\mathrm{stat}}$. There is one slightly stronger version of the spin statistics correspondence that one may hope for when the domain is arbitrary. We will discuss this later in this section.
Ramadas proved that the Sorkin definitions of statistical type and spinoriality are strict generalizations of the Finkelstein-Rubinstein definition when the target group is $SU(N)$. The statement works as follows. One first notices that the universal coefficient theorem gives an isomorphism
\[
H^2(Q;\mathbb{Z}) \cong \mathrm{Hom}(H_2(Q;\mathbb{Z}),\mathbb{Z}) \oplus \mathrm{Ext}^1(H_1(Q;\mathbb{Z}),\mathbb{Z}) \cong \mathrm{Hom}(H_2(Q;\mathbb{Z}),\mathbb{Z}) \oplus \mathrm{Ext}^1(\pi_1(Q),\mathbb{Z}).
\]
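To see how an order 2 element of $\pi_1(Q)$ produces a degree 2 cohomology class through the universal coefficient isomorphism above, it may help to compute the relevant Ext group directly. The following is a standard homological algebra computation, offered as a sketch; the final line assumes the form $\pi_1(Q_B) \cong \mathbb{Z}_2 \oplus H_1(M)$ quoted from Theorem 2 for $G = Sp(n)$:

```latex
% Apply Hom(-, Z) to the free resolution 0 -> Z --(x2)--> Z -> Z_2 -> 0.
\[
  0 \to \mathrm{Hom}(\mathbb{Z}_2,\mathbb{Z}) = 0 \to \mathbb{Z}
    \xrightarrow{\;\times 2\;} \mathbb{Z}
    \to \mathrm{Ext}^1(\mathbb{Z}_2,\mathbb{Z}) \to 0
  \quad\Longrightarrow\quad
  \mathrm{Ext}^1(\mathbb{Z}_2,\mathbb{Z}) \cong \mathbb{Z}/2\mathbb{Z}.
\]
% Ext^1(Z, Z) = 0, and Ext^1(A, Z) is isomorphic to Tor(A) for finitely generated A, so
\[
  \mathrm{Ext}^1\bigl(\mathbb{Z}_2 \oplus H_1(M),\, \mathbb{Z}\bigr)
  \;\cong\; \mathbb{Z}_2 \oplus \mathrm{Tor}(H_1(M)).
\]
```

In this reading, the $\mathbb{Z}_2$ summand is the contribution of the exchange loop, and it is this torsion class whose pullbacks under $f_{\mathrm{spin}}$ and $f_{\mathrm{stat}}$ are examined next.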
When fermionic quantization is possible in the framework of Finkelstein-Rubinstein, the exchange loop is an element of order 2 in $\pi_1(Q)$. Ramadas shows that the corresponding element of $H^2(Q;\mathbb{Z})$ pulls back to the non-trivial element of $H^2(SO(3);\mathbb{Z})$ under $f_{\mathrm{spin}}$ (when the target is $SU(2)$ so that $f_{\mathrm{spin}}$ is defined). More precisely, he shows several things. He shows that $H^2(SU(N)^{S^3};\mathbb{Z}) \cong \mathbb{Z}$ for $N > 2$ and $H^2(SU(2)^{S^3};\mathbb{Z}) \cong \mathbb{Z}_2$. He shows that the inclusion $SU(N) \to SU(N+1)$ induces an isomorphism
\[
H^2(SU(N+1)^{S^3};\mathbb{Z}) \to H^2(SU(N)^{S^3};\mathbb{Z})
\]
for $N > 2$ and a surjection for $N = 2$. The $N > 2$ case follows from the fibration $SU(N) \to SU(N+1) \to S^{2N+1}$. The $N = 2$ case follows from the four term exact sequence induced by the Ext functor together with several ingeniously defined maps, see [28]. Since $H^2(SU(2)^{S^3};\mathbb{Z}) \cong \mathbb{Z}_2$, the exchange loop in $\pi_1(SU(2)^{S^3})$ corresponds to the generator of $H^2(SU(2)^{S^3};\mathbb{Z})$ under the universal coefficient isomorphism. This class pulls back to the generator of $H^2(SO(3);\mathbb{Z})$ under $f_{\mathrm{spin}}$. Thus, when it is possible to quantize fermionically in the Finkelstein-Rubinstein framework, the exchange loop is an element of order 2, so it corresponds to a cohomology class which pulls back non-trivially under $f_{\mathrm{spin}}$ and $f_{\mathrm{stat}}$. Hence it is possible to quantize fermionically in the Sorkin framework, also.
Now turn to the case of an arbitrary domain and compact, simply connected, simple target group. Given an arbitrary domain, $M$, we can construct a degree one map to $S^3$ by collapsing the 2-skeleton. This map induces, via precomposition, a map between the
corresponding configuration spaces, $\iota_M : G^{S^3} \to G^M$. (Given a soliton configuration on $S^3$, define one on $M$ by mapping points in $M$ to points in $S^3$, then follow the configuration into the target.) The induced map on cohomology is surjective, so there is a portion of the quantization ambiguity ($H^2(Q_B;\mathbb{Z})$) that depends only on the codomain and is completely independent of the domain. This confirms our first physical conclusion, C1. Recall that we call $H^2(Q_B;\mathbb{Z})$ the quantization ambiguity because line bundles are classified by elements of $H^2(Q_B;\mathbb{Z})$ and wave functions are sections of such bundles. By Theorems 2 and 3 this cohomology is
\[
H^2(Q_B;\mathbb{Z}) = \mathbb{Z}^{b_2(Q_B)} \oplus \mathrm{Tor}(H_1(M)) \oplus \pi_4(G),
\]
where $b_2(Q_B) = b_1(M) + 1$ for $G = SU(N)$, $N \geq 3$, and $b_2(Q_B) = b_1(M)$ for $G = Spin(N)$, $N \geq 7$, $G = Sp(N)$, $N \geq 1$, or $G$ exceptional. Here $b_k(X)$ denotes the $k$th Betti number of $X$, that is, $\dim_{\mathbb{R}} H_k(X;\mathbb{R})$. Also $\pi_4(G) = \mathbb{Z}_2$ for $G = Sp(N)$, $N \geq 1$, and $\pi_4(G) = 0$ otherwise. Notice that the elementary Sorkin maps factor through $\iota_M : G^{S^3} \to G^M$. Use $f_{\mathrm{Estat}}$ and $f_{\mathrm{Espin}}$ to denote the elementary Sorkin maps defined on an arbitrary domain. By definition we have $f_{\mathrm{Estat}} = \iota_M \circ f_{\mathrm{stat}}$ and $f_{\mathrm{Espin}} = \iota_M \circ f_{\mathrm{spin}}$. If $G = Spin(N)$, $N \geq 7$, or exceptional, then $H^2(G^{S^3};\mathbb{Z}) = 0$ so $\iota^*_M$ is trivial. Hence, fermionic quantization is impossible in these cases. The inclusions $Sp(N) \to Sp(N+1)$ induce maps on the configuration spaces $Sp(N)^{S^3}$ that induce isomorphisms on cohomology, and $Sp(1) \cong SU(2)$, so we have reduced to the special unitary case. If $G = SU(N)$, $N \geq 3$, then $H^2(G^{S^3};\mathbb{Z}) = \mathbb{Z}$ and $\iota^*_M$ maps $\mathrm{Tor}(H_1(M))$ and all the generators $\mu(\Sigma^1_k \otimes x_3)$ to 0, and $\mu([M] \otimes x_5)$ to $\mu([S^3] \otimes x_5)$. Since the map on cohomology induced by $\iota_M$ is surjective, fermionic quantization in the generalization of the sense of Sorkin is possible over an arbitrary domain if and only if it is possible for domain $S^3$.
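As a consistency check on the formula for $H^2(Q_B;\mathbb{Z})$ above, one can specialize it to $M = S^3$ and compare with the values for $H^2(G^{S^3};\mathbb{Z})$ quoted in this section. The check below is only a sketch and uses nothing beyond $b_1(S^3) = 0$ and $H_1(S^3) = 0$:

```latex
% Specialize H^2(Q_B; Z) = Z^{b_2(Q_B)} (+) Tor(H_1(M)) (+) pi_4(G) to M = S^3,
% where b_1(S^3) = 0 and Tor(H_1(S^3)) = 0.
\begin{align*}
  G = SU(N),\ N \ge 3:\quad
    & H^2(Q_B;\mathbb{Z}) = \mathbb{Z}^{0+1} \oplus 0 \oplus 0 = \mathbb{Z}, \\
  G = Sp(N),\ N \ge 1:\quad
    & H^2(Q_B;\mathbb{Z}) = \mathbb{Z}^{0} \oplus 0 \oplus \mathbb{Z}_2 = \mathbb{Z}_2, \\
  G = Spin(N),\ N \ge 7,\ \text{or exceptional}:\quad
    & H^2(Q_B;\mathbb{Z}) = \mathbb{Z}^{0} \oplus 0 \oplus 0 = 0.
\end{align*}
```

These agree with the values $\mathbb{Z}$ for $SU(N)$, $N > 2$, and $\mathbb{Z}_2$ for $SU(2) \cong Sp(1)$ attributed to Ramadas, and with the vanishing stated for the Spin and exceptional groups.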
Combined with the result of Ramadas that $f^*_{\mathrm{spin}} : H^2(G^{S^3};\mathbb{Z}) \to H^2(SO(3);\mathbb{Z})$ surjects and Sorkin's spin statistics correlation that $f^*_{\mathrm{stat}} = \iota^* f^*_{\mathrm{spin}}$, one obtains $f^*_{\mathrm{stat}}(m\mu([M] \otimes x_5)) = m \in \mathbb{Z}_2$, so quantization on the bundle represented by the class $m\mu([M] \otimes x_5)$ is fermionic if and only if $m$ is odd. Our second physical conclusion, C2, establishing necessary and sufficient conditions for the existence of fermionic quantizations, follows from these comments.
There are consistency conditions that one would like to check with regard to the generalized Sorkin model of particle statistics. Since $Q_B$ is connected and $H^2(Q_0;\mathbb{Z})$ is discrete, there is a canonical isomorphism $H^2(Q_0) \to H^2(Q_B)$, so a class $c \in H^2(Q_0;\mathbb{Z})$ defines a fixed class over each sector $Q_B$. As in the discussion of the Finkelstein-Rubinstein model of particle statistics, we would like to know that baryon number $B$ lumps in $Q_B$ are all bosonic when the one lump class in $Q_1$ is bosonic ($f^*_{\mathrm{Estat}} c = 0$). We would also like to know that baryon number $B$ lumps in $Q_B$ are bosonic or fermionic according to the parity of $B$ when the one lump class in $Q_1$ is fermionic ($f^*_{\mathrm{Estat}} c = 1$). The statistical type of a baryon number $B$ lump may be defined via a generalization of the Sorkin map in which a pair of baryon number $B$ lumps is placed at antipodal points on a sphere. With this definition, cobordism arguments similar to those given in the Finkelstein-Rubinstein case show that this model of particle statistics is indeed consistent.
As in the Finkelstein-Rubinstein case, the spin of a particle will just be determined by a local picture, and the statistical type may be based on non-local exchanges of identical particles. A non-local exchange will be defined by a generalization of the Sorkin map associated to an arbitrary embedded 2-sphere, say $S \to M$. Denote the associated map by $f_{S:\mathrm{stat}} : \mathbb{RP}^2 \to Q_B$. There are two cases: either $S$ separates the domain, so
$M - S = M_1 \cup M_2$, or $S$ does not separate. If the 2-sphere in $M$ separates, we can define a degree one map $p : M \to S^3$ by collapsing the relative 2-skeleta of $\overline{M}_1$ and $\overline{M}_2$. As in the case of the elementary Sorkin map, we obtain $f_{S:\mathrm{stat}} = p^* \circ f_{\mathrm{stat}}$, where $p^* : G^{S^3} \to G^M$ denotes precomposition with $p$. It follows that this model of particle statistics is consistent with these non-local exchanges. If the sphere does not separate, then there is a simple path from one side of the sphere to the other side of the sphere. A tubular neighborhood of the union of this path and the sphere is homeomorphic to a punctured $S^2 \times S^1$. We may construct a degree one projection from $M$ to $S^2 \times S^1$ by collapsing the complement of this tubular neighborhood. This intertwines the Sorkin map defined using the non-separating sphere with the Sorkin map defined using $S^2 \times \{1\} \subset S^2 \times S^1$. If we knew the following conjecture, then this model of particle statistics would be consistent in this larger sense. As it is, we know that it satisfies the stronger consistency condition in the typical case where the domain does not contain a non-separating sphere.
Conjecture. $f^*_{S^2 \times \{1\}:\mathrm{stat}}\, \mu([S^2 \times S^1] \otimes x_5) = 0$.
To discuss isospin, we recall some standard facts about extending group actions on a configuration space. Given a complex line bundle over $Q$, let $\hat{Q}$ be the associated principal $U(1)$ bundle. If a group $\Gamma$ acts on $Q$ it is possible to construct an extension of $\Gamma$ by $U(1)$ that acts on $\hat{Q}$ so that the projection to $\Gamma$ intertwines the two actions. The extension may be defined as equivalence classes of paths in $\Gamma$, see [8]. The quantum symmetry group is a subgroup of this extension. When $\Gamma = SO(3)$ the possible $U(1)$ extensions are $SO(3) \times U(1)$ and $U(2)$. These correspond to integral and fractional isospin respectively when $SO(3)$ acts as rotations on the target. Recall that every compact, simply connected, simple Lie group $G$ has an $Sp(1)$ subgroup. We define the isospin action on $G$ to be the adjoint action of this $Sp(1)$ subgroup.
This coincides with the usual definition if $G = Sp(1)$. Of course, we can always take a trivial line bundle over $Q$, so any of our configuration spaces admit quantizations with integral isospin, confirming our fifth physical conclusion, C5.
To justify our remaining physical conclusions about isospin, we review the required constructions. The Sorkin map $SO(3) \to SU(2)^{S^3}$ is the map obtained by the isospin action. To see this, notice that we can rotate the frame in the Pontrjagin-Thom representative by either rotating the domain or by rotating the codomain. When a class in $H^2(G^M;\mathbb{Z})$ pulls back to the generator of $H^2(\mathbb{RP}^2;\mathbb{Z})$ under the Sorkin map, we claim that the associated quantization has fractional isospin. Assume otherwise, so the extension of the rotation group is $SO(3) \times U(1)$. This means that the $SO(3)$ subgroup is a lift of the $SO(3)$ action on the configuration space to the bundle over the space. Restricting this to the image of $SO(3)$ under $f_{\mathrm{spin}}$ we obtain a contradiction from Theorem 10. Since such classes exist whenever the configuration space admits a fermionic quantization, we obtain our third physical conclusion, C3. To show that quantizations with fractional isospin are not possible when the group does not have a symplectic or special unitary factor, one must just follow through the construction of the extension given in [8] to see that the resulting extension is trivial. This establishes our fourth conclusion, C4.
As we noted earlier, the relation between the configuration space of $Sp(1)$-valued maps and $S^2$-valued maps implies that it is always possible to fermionically quantize Hopfions. Since it is possible to quantize $Sp(1)$-valued solitons with fractional isospin, the same relation implies that it is possible to quantize $S^2$-valued solitons with fractional isospin. This is our sixth physical conclusion, C6.
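The two extensions of $SO(3)$ by $U(1)$ named in the isospin discussion above can be written out explicitly. The following is a sketch using standard facts about $U(2)$; it is an illustration and is not taken from the construction in [8]:

```latex
% Trivial extension: SO(3) lifts as a subgroup (integral isospin).
\[
  1 \to U(1) \to SO(3) \times U(1) \to SO(3) \to 1 .
\]
% Nontrivial extension: U(1) is the centre of U(2) (scalar matrices) and U(2)/U(1) = PU(2) = SO(3).
\[
  1 \to U(1) \to U(2) \to SO(3) \to 1,
  \qquad
  U(2) \;\cong\; \bigl(SU(2) \times U(1)\bigr) / \{\pm(1,1)\} .
\]
% In U(2) only the double cover SU(2) = Sp(1) of SO(3) sits inside as a subgroup,
% and its central element -1 acts by the phase (-1): fractional isospin.
```

In the second case a $2\pi$ isorotation acts by the phase $(-1)$ rather than trivially, which is the sense in which the isospin is fractional.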
6. Proofs
We begin by recalling some basic homotopy and homology theory [33]. For pointed spaces $X$, $Y$, let $[X, Y]$ denote the set of based homotopy classes of maps $X \to Y$. There is a distinguished element 0 in $[X, Y]$, namely the class of the constant map. Given a map $f : X \to X'$, there is for each $Y$ a natural map $f^* : [X', Y] \to [X, Y]$ defined by composition. We define $\ker f^* \subset [X', Y]$ to be the inverse image of the null class $0 \in [X, Y]$. A sequence of maps $X \xrightarrow{f} X' \xrightarrow{g} X''$ is coexact if $\ker f^* = \mathrm{Im}\, g^*$ for every choice of codomain $Y$. Longer sequences of maps are coexact if every constituent triple is coexact. Note that this makes sense even in the absence of group structure. If $Y$ happens to be a Lie group $G$, as it will be for us, then $[X, G]$ inherits a group structure by pointwise multiplication, $f^*$ and $g^*$ are homomorphisms, and the sequence $[X'', G] \xrightarrow{g^*} [X', G] \xrightarrow{f^*} [X, G]$ is exact in the usual sense. In the following, we will make extensive use of the following standard result [33]:
Proposition 11. If $X$ is a CW complex and $A \subset X$ is a subcomplex then there is an infinite coexact sequence,
\[
A \to X \to X/A \to SA \to SX \to S(X/A) \to \cdots \to S^n A \to S^n X \to S^n(X/A) \to \cdots,
\]
where $S^n$ denotes iterated suspension.
The proofs will use several naturally defined homomorphisms. Any map $f : X \to Y$ defines homomorphisms $f_* : H_k(X) \to H_k(Y)$ which depend on $f$ only up to homotopy. Hence, one has natural maps $H_k : [X, Y] \to \mathrm{Hom}(H_k(X), H_k(Y))$. There is a natural (Hurewicz) homomorphism $\mathrm{Hur}_k : \pi_k(X) \to H_k(X)$ sending each map $S^k \to X$ to the push-forward of the fundamental class via the map in $X$. If $X$ is $(k-1)$-connected then $\mathrm{Hur}_k$ is an isomorphism. There is also a natural isomorphism $\mathrm{Susp}_k : H_k(SX) \to H_{k-1}(X)$ relating the homologies of $X$ and $SX$.
We may now prove a preliminary lemma that is used in the computation of both the fundamental group and the real cohomology. This lemma is the place where we use the assumption that the domain is orientable. This lemma was used in [4] as well. Note that $SX^{(k)}$ denotes the suspension of the $k$-skeleton of $X$. The $k$-skeleton of the suspension of $X$ will always be denoted $(SX)^{(k)}$.
Lemma 12. For a closed, connected, orientable 3-manifold $M$ and a simply-connected, compact Lie group $G$, the map $[SM, G] \to [SM^{(2)}, G]$ induced by inclusion is surjective.
Proof. Start with a cell decomposition of $M$ with exactly one 0-cell and exactly one 3-cell. The sequences
\[
M/M^{(2)} \xrightarrow{\;\partial\;} SM^{(2)} \to SM
\qquad\text{and}\qquad
SM^{(1)} \to SM^{(2)} \xrightarrow{\;q\;} S(M^{(2)}/M^{(1)})
\]
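are coexact by Proposition 11 with $X = M$, $A = M^{(2)}$ and $X = M^{(2)}$, $A = M^{(1)}$ respectively. Hence, the sequences,
$$[SM, G] \to [SM^{(2)}, G] \xrightarrow{\ \partial^*\ } [M/M^{(2)}, G]$$
and
$$[S(M^{(2)}/M^{(1)}), G] \xrightarrow{\ q^*\ } [SM^{(2)}, G] \to [SM^{(1)}, G] = 0,$$
are exact. The group $[SM^{(1)}, G]$ is trivial because $G$ is 2-connected. Hence $q^*$ is surjective. The space $M^{(3)}/M^{(2)}$ is homeomorphic to $D^3/S^2$. Under this identification,
$$\partial(x) = \left[ f^{(3)}\!\left(\frac{x}{|x|}\right),\ |x| \right] \quad\text{for } x \in D^3,$$
where $f^{(3)} : S^2 \to M^{(2)}$ is the attaching map for the 3-cell. Now we can construct the following commutative diagram:
$$
\begin{array}{ccc}
[S(M^{(2)}/M^{(1)}), G] & \xrightarrow{\ \partial^* \circ q^*\ } & [M^{(3)}/M^{(2)}, G] \\
\downarrow {\scriptstyle H_3} & & \downarrow {\scriptstyle H_3} \\
\mathrm{Hom}(H_3(S(M^{(2)}/M^{(1)})), H_3(G)) & & \mathrm{Hom}(H_3(M^{(3)}/M^{(2)}), H_3(G)) \\
\downarrow {\scriptstyle \mathrm{Susp}_3^{-1\,*} \circ\, \mathrm{Hur}_{3\,*}} & & \downarrow {\scriptstyle \mathrm{Hur}_{3\,*}} \\
\mathrm{Hom}(H_2(M^{(2)}/M^{(1)}), \pi_3(G)) & \xrightarrow{\ \delta\ } & \mathrm{Hom}(H_3(M^{(3)}/M^{(2)}), \pi_3(G)).
\end{array}
$$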
Note that both maps $H_3$ are isomorphisms, and $\mathrm{Hur}_3 : \pi_3(G) \to H_3(G)$ is an isomorphism because $G$ is 2-connected, so the vertical maps are all isomorphisms. We may therefore identify $\partial^* \circ q^*$ with the map $\delta$. But $\delta$ is the coboundary map in the CW cochain complex for $H^*(M; \pi_3(G))$. Since $M$ is orientable, this coboundary is trivial. Since $q^*$ is surjective, we conclude that $\partial^* = 0$, and exactness of the first sequence then gives the surjectivity of $[SM, G] \to [SM^{(2)}, G]$.

6.1. The fundamental group of the Skyrme configuration space. We saw in Section 3 that it was sufficient to study simple, simply-connected Lie groups. We begin our study of the fundamental group by showing that the fundamental group of the configuration space fits into a short exact sequence in this case.

Proposition 13. If $M$ is a closed, connected, orientable 3-manifold and $G$ is a simple, simply-connected Lie group, then
$$0 \to \pi_4(G) \to \pi_1(G^M) \to H^2(M; \pi_3(G)) \to 0$$
is an exact sequence of abelian groups. The maps in this sequence will be defined in the course of the proof.

Proof. We have $\pi_1(G^M) = [S^1, G^M] \cong [SM, G]$, where $SM$ is the suspension of $M$. The sequence $(SM)^{(3)} \to SM \to SM/(SM)^{(3)}$
Quantization and Configuration Spaces for Skyrme and Faddeev-Hopf Models
is coexact by Proposition 11 with X = SM, A = (SM)(3) , and hence induces the exact sequence of groups: [SM/(SM)(3) , G] → [SM, G] → [(SM)(3) , G].
(6.1)
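As a consistency check on sequence (6.1), consider the simplest case $M = S^3$ with the cell structure having one 0-cell and one 3-cell. This is only an illustrative sketch under that assumption; it is not part of the proof.

```latex
% Worked check of sequence (6.1) for M = S^3 (one 0-cell, one 3-cell).
% Assumption: SM = S^4 then has one 0-cell and one 4-cell, so its 3-skeleton is a point.
\begin{align*}
SM &= S^4, \qquad (SM)^{(3)} = \mathrm{pt}, \qquad SM/(SM)^{(3)} = S^4, \\
[SM/(SM)^{(3)}, G] &= [S^4, G] = \pi_4(G), \qquad [(SM)^{(3)}, G] = 0, \\
\pi_1(G^{S^3}) &\cong [S^4, G] = \pi_4(G).
\end{align*}
```

The outcome agrees with the classical isomorphism $\pi_1(\Omega^3 G) \cong \pi_4(G)$, and with the statement of Proposition 13, since $H^2(S^3; \pi_3(G)) = 0$.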
Noting that $(SM)^{(3)} = SM^{(2)}$, Lemma 12 implies that the last map in the above sequence is surjective. This will become the exact sequence we seek, after suitable identifications. First we identify the third group in sequence (6.1) with the second cohomology of $M$, in similar fashion to the proof of Lemma 12. From Proposition 11, we have exact sequences,
$$[S((SM)^{(2)}/(SM)^{(1)}), G] \xrightarrow{\ q^*\ } [S((SM)^{(2)}), G] \to [S((SM)^{(1)}), G]$$
($X = (SM)^{(2)}$, $A = (SM)^{(1)}$) and
$$[S((SM)^{(2)}), G] \xrightarrow{\ \partial^*\ } [(SM)^{(3)}/(SM)^{(2)}, G] \to [(SM)^{(3)}, G]$$
($X = (SM)^{(3)}$, $A = (SM)^{(2)}$). Since $G$ is 2-connected, $[S((SM)^{(1)}), G] = 0$, and $q^*$ is surjective. Thus $\mathrm{coker}(\partial^* \circ q^*) = \mathrm{coker}(\partial^*) = [(SM)^{(3)}, G]$. Using the Hurewicz and suspension isomorphisms as before, we may identify $\partial^* \circ q^*$ with a coboundary map in the CW cochain complex for $H^*(M; \pi_3(G))$:
$$
\begin{array}{ccc}
[S((SM)^{(2)}/(SM)^{(1)}), G] & \xrightarrow{\ \partial^* \circ q^*\ } & [(SM)^{(3)}/(SM)^{(2)}, G] \\
\downarrow & & \downarrow \\
\mathrm{Hom}(H_2((SM)^{(2)}/(SM)^{(1)}), \pi_3(G)) & \xrightarrow{\ \delta\ } & \mathrm{Hom}(H_3((SM)^{(3)}/(SM)^{(2)}), \pi_3(G)).
\end{array}
$$
Now, $[(SM)^{(3)}, G] \cong \mathrm{coker}(\partial^* \circ q^*) \cong \mathrm{coker}(\delta) \cong H^3(SM; \pi_3(G)) \cong H^2(M; \pi_3(G))$.

The first group in sequence (6.1) is $\pi_4(G)$ because $SM/(SM)^{(3)} \cong S^4$. For the nonsymplectic groups $\pi_4(G) = 0$, and we are done. For the higher symplectic groups, the fibration $\mathrm{Sp}(n) \to \mathrm{Sp}(n+1) \to S^{4n+3}$ induces a fibration $(\mathrm{Sp}(n))^M \to (\mathrm{Sp}(n+1))^M \to (S^{4n+3})^M$. The homotopy exact sequence of this fibration reads
$$\pi_2((S^{4n+3})^M) \to \pi_1((\mathrm{Sp}(n))^M) \to \pi_1((\mathrm{Sp}(n+1))^M) \to \pi_1((S^{4n+3})^M).$$
Now $\pi_k((S^{4n+3})^M) = [S^k, (S^{4n+3})^M] \cong [S^kM, S^{4n+3}]$. For $k = 1, 2$ these groups are trivial since $S^{4n+3}$ is 5-connected. Hence $\pi_1(\mathrm{Sp}(n+1)^M) \cong \pi_1(\mathrm{Sp}(n)^M)$ for all $n \ge 1$, so the proposition reduces to showing that the first map in sequence (6.1) is injective for $G = \mathrm{Sp}(1)$. In the special case of $M \cong S^3$ the exchange loop depicted in Fig. 1 represents the generator of $\pi_4(\mathrm{Sp}(1)) \cong \pi_1((\mathrm{Sp}(1))^{S^3})$. Our final task is to show that
the image of this generator under push forward by the collapsing map $M \to M/M^{(2)}$ is non-trivial in $\pi_1((\mathrm{Sp}(1))^M)$. Proceed indirectly and assume that there is a homotopy between the constant loop and the exchange loop, say $H : M \times [0,1] \times [0,1] \to \mathrm{Sp}(1)$. Set $\Sigma = H^{-1}(-1)$. Now glue the homotopy from Fig. 3 to this homotopy and let $\widehat{\Sigma} = \Sigma \cup \Sigma'$, where $\Sigma'$ is the hemisphere in Fig. 3. The trivialization of the normal bundle to $\widehat{\Sigma}$ defined over $\Sigma$ and the trivialization of the normal bundle to $\widehat{\Sigma}$ defined over $\Sigma'$ do not match. The discrepancy is the generator of $\pi_1(SO(3))$. It follows that the second Stiefel-Whitney class, $w_2(N(\widehat{\Sigma})) \in H^2(M \times [0,1] \times [0,1]; \pi_1(SO(3)))$, is non-trivial [34]. However, the Whitney product formula yields,
$$w(N(\widehat{\Sigma})) = w(N(\widehat{\Sigma}))\, w(T\widehat{\Sigma}) = w(T\widehat{\Sigma} \oplus N(\widehat{\Sigma})) = w\big(T(M \times [0,1] \times [0,1])|_{\widehat{\Sigma}}\big) = 1.$$
Here $w$ is the total Stiefel-Whitney class, $w(T\widehat{\Sigma}) = 1$ because the Stiefel-Whitney class of any orientable surface is 1, and $T(M \times [0,1] \times [0,1])|_{\widehat{\Sigma}}$ is trivial. This contradiction establishes the proposition. □
It is well known that a split exact sequence of abelian groups, $0 \to K \to G \to H \to 0$, induces an isomorphism, $K \oplus H \cong G$. The following proposition will establish such a splitting, and therefore complete our computation of the fundamental group of the Skyrme configuration spaces. The proof will require surgery descriptions of 3-manifolds, so we recall what this means. Given a framed link, say $L$ (i.e. a 1-dimensional submanifold with trivialized normal bundle, or an identification of a closed tubular neighborhood with $\sqcup\, S^1 \times D^2$), in $S^3 = \partial D^4$, we define a 4-manifold by $D^4 \cup_{\sqcup S^1 \times D^2} D^2 \times D^2$. The boundary of this 4-manifold is said to be the 3-manifold obtained by surgery on $L$. It is denoted $S^3_L$.

Proposition 14. The sequence, $0 \to \pi_4(G) \to \pi_1(G^M) \to H^2(M; \pi_3(G)) \to 0$ splits, and there is a splitting associated to each spin structure on $M$.

Proof. As we saw at the end of the proof of the previous proposition, it is sufficient to check the result for $G = \mathrm{Sp}(1)$. Since the three dimensional Spin cobordism group is trivial, every 3-manifold is surgery on a framed link with even self-linking numbers [23]. Such a surgery description induces a Spin structure on $M$. Let $M = S^3_L$ be such a surgery description, orient the link and let $\{\mu_j\}_{j=1}^{c}$ be the positively oriented meridians to the components of the link. These meridians generate $H_1(M) \cong H^2(M; \pi_3(\mathrm{Sp}(1)))$. This last isomorphism is Poincaré duality. Define a splitting by:
$$s : H_1(M) \to \pi_1((\mathrm{Sp}(1))^M); \qquad s(\mu_j) = PT\left(\mu_j \times \{\tfrac{1}{2}\},\ \text{canonical framing}\right).$$
Here $PT$ represents the Pontrjagin-Thom construction and the canonical framing is constructed as follows. The first vector is chosen to be the 0-framing on $\mu_j$ considered as an unknot in $S^3$. The second vector is obtained by taking the cross product of the tangent vector with the first vector, and the third vector is just the direction of the interval. We will now check that this map respects the relations in $H_1(M)$. Let $Q_L = (n_{jk})$ be the
linking matrix so that, $H_1(M) = \langle \mu_j \mid n_{jk}\mu_j = 0 \rangle$. We are using the summation convention in this description. The 2-cycle representing the relation $n_{jk}\mu_j = 0$ may be constructed from a Seifert surface to the $j$th component of the link, when this component is viewed as a knot in $S^3$. Let $\Sigma_j$ denote this Seifert surface. The desired 2-cycle is then $\widehat{\Sigma}_j = (\Sigma_j - \mathring{N}(L)) \cup \sigma_j$. Here $\sigma_j$ is the surface in $S^1 \times D^2$ with $n_{jj}$ meridians depicted on the left in Fig. 10. The boundary of $\widehat{\Sigma}_j$ is exactly the relation $n_{jk}\mu_j = 0$. The framing on each copy of $\mu_k$ for $k \neq j$ induced from this surface agrees with the 0-framing. The framing on each copy of $\mu_j$ is $-\mathrm{sign}(n_{jj})$. The surface $\widehat{\Sigma}_j$ may be extended to a surface in $M \times [0,1] \times [0,1]$ by adding a collar of the boundary in the direction of the second interval followed by one band for each pair of the $\mu_j$ as depicted on the right of Fig. 10. The resulting surface has a canonical framing, and the corresponding homotopy given by the Pontrjagin-Thom construction homotopes the loop corresponding to the relation to a loop corresponding to a $\pm 2$-framed unlink. Such a loop is null-homotopic, as required.

We remark that the Spin structures on $M$ correspond to $H^1(M; \mathbb{Z}_2)$. In addition, the splittings of $\mathbb{Z}_2 \to \pi_1(\mathrm{Sp}(1)^M) \to H^2(M; \mathbb{Z})$ correspond to the group cohomology, $H^1(H^2(M; \mathbb{Z}); \mathbb{Z}_2) \cong H^1(H_1(M; \mathbb{Z}); \mathbb{Z}_2) \cong H^1(M; \mathbb{Z}_2)$. The last isomorphism is because the 2-skeleton of $M$ is the 2-skeleton of a $K(H_1(M; \mathbb{Z}), 1)$. A combination of Propositions 13 and 14 together with Reductions 5, 6 and 7 and the corollary of the universal coefficient theorem that $H^*(X; A \oplus B) \cong H^*(X; A) \oplus H^*(X; B)$ give Theorem 2.

6.2. Cohomology of Skyrme configuration spaces. As we have seen we may restrict our attention to compact, simple, simply-connected $G$. Recall the cohomology classes, $x_j$, and the $\mu$-map defined in Sect. 4. Throughout this section we will take the coefficients of any homology or cohomology to be the real numbers unless noted to the contrary.

To compute the cohomology of $G^M$ we will use the cofibrations $M^{(k)} \to M^{(k+1)} \to M^{(k+1)}/M^{(k)}$ and the fact that $M^{(k+1)}/M^{(k)}$ is a bouquet of spheres to reduce the problem to the case where the domain is a sphere. Briefly recall the computation of the cohomology of the loop spaces. These are well known results, but we sketch a proof because this explains why the classes $\mu(\Sigma, x_j x_k)$ are trivial. As usual let $\Omega^k G = G^{S^k}$ denote the $k$-iterated loop space. We have the following lemma.
Fig. 10. The 2-cycle in the proof of Proposition 14 (figure labels: 1, 1, $n_{jj}$)
Lemma 15. The cohomology rings of the first loop groups are given by, $H^*(\Omega G) = \mathbb{R}[\mu([S^1] \otimes x_j)]$, $H^*(\Omega^2 G) = \mathbb{R}[\mu([S^2] \otimes x_j)]$, and $H^*(\Omega^3_0 G) = \mathbb{R}[\mu([S^3] \otimes x_j),\ j > 3]$.

Proof. Recall that the path space $PG$ is contractible, and fits into a fibration, $\Omega G \to PG \to G$. The Serre spectral sequence of this fibration has $E_2^{p,q} = H^p(G; H^q(\Omega G))$. Since $G$ is simply-connected, the coefficient system is untwisted. Since $PG$ is contractible all classes of positive degree have to die at some point in this spectral sequence. By location we know that all differentials of $x_3$ vanish, so there must be some class in $H^2(\Omega G)$ mapping to $x_3$. The class $\mu([S^1] \otimes x_3)$ is one such class, and is the only class that there can be without having something else live to the limit group of the spectral sequence. Notice that classes of the form $x_3 x_j^k$ are images of classes of the form $\mu([S^1] \otimes x_3)\, x_j^k$, so we have killed all classes with a factor of $x_3$. In the same way, we can kill terms with a factor of the next $x_j$. We conclude that $H^*(\Omega G) = \mathbb{R}[\mu([S^1] \otimes x_j)]$. Repeating the argument with the fibration, $\Omega^2 G \to P\Omega G \to \Omega G$ we obtain, $H^*(\Omega^2 G) = \mathbb{R}[\mu([S^2] \otimes x_j)]$. This time the coefficient system is untwisted because $\pi_1(\Omega G) \cong \pi_2(G) = 0$.

We need to adjust the argument a bit at the next stage because $\pi_0(\Omega^3 G) \cong \pi_1(\Omega^2 G) \cong \pi_3(G) \cong \mathbb{Z}$. This shows that the path components of $\Omega^3 G$ may be labeled by the integers. Each component is homeomorphic to the identity component since $\Omega^3 G$ is a topological group. In this case, we have no guarantee that the coefficient system is untwisted, so we will use a different approach that will be useful again in Sect. 6.4. Let $\widetilde{\Omega^2 G}$ denote the universal cover of $\Omega^2 G$ and let $\Omega^3_0 G$ denote the identity component of $\Omega^3 G$. These fit into a fibration, $\Omega^3_0 G \to P\widetilde{\Omega^2 G} \to \widetilde{\Omega^2 G}$ that may be used to obtain $H^*(\Omega^3_0 G) = \mathbb{R}[\mu([S^3] \otimes x_j),\ j > 3]$. We will use equivariant cohomology to compute the cohomology of $\widetilde{\Omega^2 G}$.

Recall that any Lie group, say $\Gamma$, acts properly on a contractible space called the total space of the universal bundle. This space is denoted $E\Gamma$. The quotient of this by $\Gamma$ is the classifying space $B\Gamma$. Let $X$ be a $\Gamma$ space (i.e. a space with a $\Gamma$ action) and consider the space $X_\Gamma := E\Gamma \times_\Gamma X$. The cohomology of the space $X_\Gamma$ is called the equivariant cohomology of $X$. It is denoted by $H^*_\Gamma(X)$. When the $\Gamma$ action on $X$ is free and proper (as it is in our case), we have a fibration $X_\Gamma \to X/\Gamma$ obtained by ignoring the $E\Gamma$ component in the definition of $X_\Gamma$. The fiber of this fibration is just $E\Gamma$ which is contractible, so the spectral sequence of the fibration implies that the cohomology of $X/\Gamma$ is isomorphic to the equivariant cohomology of $X$. By ignoring the $X$ component in the definition of $X_\Gamma$ we obtain a fibration $X_\Gamma \to B\Gamma$ that may be used to relate the equivariant cohomology of $X$ to the cohomology of $X$. If we apply these ideas with $X = \widetilde{\Omega^2 G}$ and $\Gamma = \pi_1(\Omega^2 G) \cong \mathbb{Z}$, we obtain a spectral sequence that may be used to show that the cohomology of $\widetilde{\Omega^2 G}$ is generated by $\mu([S^2] \otimes x_j)$ for $j > 3$. This then plugs in to give the stated result for $\Omega^3_0 G$.

Returning to the situation of a 3-manifold domain, let $M$ have a cell decomposition with one 0-cell ($p_0$), several 1-cells ($e_r$), several 2-cells ($f_s$) and one 3-cell ($[M]$). Since $G^{X \vee Y} = G^X \times G^Y$, and $M^{(k+1)}/M^{(k)}$ is a bouquet of spheres we have,
$$H^*(G^{M^{(1)}}) = \mathbb{R}[\mu(e_r \otimes x_j)],$$
$$H^*(G^{M^{(2)}/M^{(1)}}) = \mathbb{R}[\mu(f_s \otimes x_j)],$$
$$H^*(G_0^{M^{(3)}/M^{(2)}}) = \mathbb{R}[\mu([M] \otimes x_j),\ j > 0].$$
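To illustrate Lemma 15, here is a sketch for the special case $G = SU(2) \cong S^3$, where the only generator of $H^*(G; \mathbb{R})$ is $x_3$. The degree count $\deg \mu([S^k] \otimes x_j) = j - k$ is the bookkeeping assumed below, and $\mathbb{R}[\cdot]$ is read as the free graded-commutative algebra, so an odd-degree generator gives an exterior algebra.

```latex
% Worked instance of Lemma 15 for G = SU(2).
% Assumption: deg mu([S^k] \otimes x_j) = j - k, and R[...] is free graded-commutative.
\begin{align*}
H^*(SU(2); \mathbb{R}) &= \Lambda[x_3], \\
H^*(\Omega SU(2); \mathbb{R}) &= \mathbb{R}[y_2],
  & y_2 &= \mu([S^1] \otimes x_3),\ \deg y_2 = 2, \\
H^*(\Omega^2 SU(2); \mathbb{R}) &= \Lambda[y_1],
  & y_1 &= \mu([S^2] \otimes x_3),\ \deg y_1 = 1, \\
H^*(\Omega^3_0 SU(2); \mathbb{R}) &= \mathbb{R}
  & &\text{(no generators: there is no } x_j \text{ with } j > 3\text{)}.
\end{align*}
```

These answers match the familiar rational picture of the iterated loop spaces of $S^3$: $\Omega S^3$ has polynomial real cohomology on a degree 2 class, $\Omega^2 S^3$ has the real cohomology of a circle, and each component of $\Omega^3 S^3$ has trivial real cohomology in positive degrees.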
The next lemma assembles these facts into the cohomology of $G^{M^{(2)}}$.

Lemma 16. If $\Sigma^1_r$ form a basis for $H_1(M)$, and $\Sigma^2_s$ form a basis for $H_2(M)$ we have, $H^*(G^{M^{(2)}}) = \mathbb{R}[\mu(\Sigma^1_r \otimes x_j), \mu(\Sigma^2_s \otimes x_k)]$.

Proof. The cofibration $M^{(1)} \to M^{(2)} \to M^{(2)}/M^{(1)}$ leads to a fibration,
$$G^{M^{(2)}/M^{(1)}} \to G^{M^{(2)}} \to G^{M^{(1)}}.$$
Since $G$ is 2-connected, $\pi_1(G^{M^{(1)}}) = 0$, so the coefficients in the cohomology appearing in the second term of the Serre spectral sequence are untwisted. We have,
$$E_2^{*,*} = H^*(G^{M^{(1)}}; H^*(G^{M^{(2)}/M^{(1)}})) = H^*(G^{M^{(1)}}) \otimes H^*(G^{M^{(2)}/M^{(1)}}) \cong \mathbb{R}[\mu(e_r \otimes x_j), \mu(f_s \otimes x_k)].$$
To go further we need to understand the differentials in this spectral sequence. Since $\mu(e_r \otimes x_j) \in E_2^{j-1,0}$, we have $d_k\, \mu(e_r \otimes x_j) = 0$ for all $k$. We will show that $d_\ell\, \mu(f_s \otimes x_k) = 0$ for $\ell < k-1$ and $d_{k-1}\, \mu(f_s \otimes x_k) = -\mu((\partial f_s) \otimes x_k)$, from which the result will follow. The multiplication on $G$ induces a ring structure on the homology of $G^{M^{(1)}}$ and $G^{M^{(2)}/M^{(1)}}$. Using the homology spectral sequence of the path fibrations, one may show that these homology groups are generated by cycles $\beta_j^{e_r} : \Sigma_r^{j-1} \to G^{M^{(1)}}$ and $\beta_k^{f_s} : \Sigma_s^{k-2} \to G^{M^{(2)}/M^{(1)}}$ dual to $\mu(e_r \otimes x_j)$ and $\mu(f_s \otimes x_k)$ respectively. The product in the homology ring of $G^X$ is given by $\beta \cdot \beta' : \Sigma \times \Sigma' \to G^X$ with $(\beta \cdot \beta')(x, y)(p) = \beta(x)(p)\,\beta'(y)(p)$. The computation (4.2) in Sect. 4 shows that the differential of our spectral sequence is given by
$$(d_\ell\, \mu(f_s \otimes x_k))(\beta \otimes \beta') = -\int_{(\partial f_s) \times \Sigma \times \Sigma'} w^* x_k,$$
where $w(p, x, y) = \beta(x)(p)\,\beta'(y)(\pi(p))$ and $\pi : M^{(2)} \to M^{(2)}/M^{(1)}$ is the canonical projection. We are using the integral as a suggestive notation for the cap product. We see that the map $w$ factors through $(\partial f_s) \times \Sigma \times \text{point}$. When $\ell < k - 1$, $(\partial f_s) \times \Sigma \times \text{point}$ has dimension less than $k$, so the differential is trivial. For $\ell = k - 1$, this reduces to the claimed result.

We remark that the above lemma is a valid computation of the cohomology of $G^K$ when $K$ is any connected 2-complex. To go up to the next and final stage we need to analyze the action of the fundamental group of $G^{M^{(2)}}$ on the cohomology of $G^{M^{(3)}/M^{(2)}}$. This will require Lemma 12 from the beginning of this section. This is the place in the cohomology computation where we use the fact that $M$ is orientable.

Let us review the situation for general fibrations first. If $F \to E \to B$ is a fibration and $\gamma : [0,1] \to B$ represents an element of the fundamental group of $B$ one can define a map, $\Gamma : F \times [0,1] \to B$ by $\Gamma(x, t) = \gamma(t)$. The map, $\Gamma_0 : F \to B$ lifts to the inclusion, $F \to E$. By the homotopy lifting property, there is a lift, $\bar{\Gamma} : F \times [0,1] \to E$. The restriction, $\bar{\Gamma}_1 : F \to F$ induces a map on the cohomology of $F$. This is how the fundamental group of the base of a fibration acts on the cohomology of the fiber. The following lemma shows that the action of the fundamental group on the cohomology of the fiber of our final fibration is trivial.
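Lemma 17. If $M$ is an orientable 3-manifold and $G$ is a compact, simply-connected Lie group then the action of $\pi_1(G^{M^{(2)}})$ on $H^*(G^{M^{(3)}/M^{(2)}})$ is trivial.

Proof. Let $\gamma : SM^{(2)} \to G$ represent an element of $\pi_1(G^{M^{(2)}})$. By Lemma 12, this extends to a map, $\zeta : SM^{(3)} \to G$. Now define $\bar{\Gamma} : G^{M^{(3)}/M^{(2)}} \times [0,1] \to G^{M^{(3)}}$ by $\bar{\Gamma}(u, t)(x) = u([x])\,\zeta([x, t])$. The upper triangle of the following diagram commutes because $\zeta([x, 0]) = 1$. The lower triangle commutes because $\zeta|_{SM^{(2)}} = \gamma$.
$$
\begin{array}{ccc}
G^{M^{(3)}/M^{(2)}} \times \{0\} & \longrightarrow & G^{M^{(3)}} \\
\downarrow & \nearrow {\scriptstyle \bar{\Gamma}} & \downarrow \\
G^{M^{(3)}/M^{(2)}} \times [0,1] & \longrightarrow & G^{M^{(2)}}
\end{array}
$$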
Thus $\bar{\Gamma}$ is an appropriate lift. Since $u([x]) = 1$ for $x \in M^{(2)}$, $\bar{\Gamma}_1$ is the identity map and the action on the cohomology of the fiber is trivial.

We can now complete the proof of Theorem 2. The cofibration $M^{(2)} \to M^{(3)} \to M^{(3)}/M^{(2)}$ leads to a fibration,
$$G^{M^{(3)}/M^{(2)}} \to G^{M^{(3)}} \to G^{M^{(2)}}.$$
By the previous lemma, the coefficients in the cohomology appearing in the second term of the Serre spectral sequence are untwisted. Using Lemma 16, we have,
$$E_2^{*,*} = H^*(G^{M^{(2)}}; H^*(G^{M^{(3)}/M^{(2)}})) = H^*(G^{M^{(2)}}) \otimes H^*(G^{M^{(3)}/M^{(2)}}) \cong \mathbb{R}[\mu(\Sigma^1_r \otimes x_j), \mu(\Sigma^2_s \otimes x_k), \mu([M] \otimes x_\ell),\ \ell > 0].$$
Repeating the argument from Lemma 16 with computation (4.2), we see that all of the differentials of this spectral sequence vanish. This completes our computation of the cohomology of the Skyrme configuration space.

6.3. The fundamental group of Faddeev-Hopf configuration spaces. In this subsection we compute the fundamental group of the Faddeev-Hopf configuration space, $(S^2)^M$. Recall (Theorem 4) that the path components $(S^2)^M_\varphi$ (where $\varphi$ is any representative of the component) fall into families labelled by $\varphi^*\mu_{S^2} \in H^2(M; \mathbb{Z})$, where $\mu_{S^2}$ is a generator of $H^2(S^2; \mathbb{Z})$, and that components within a given family are labelled by $\alpha \in H^3(M; \mathbb{Z})/(2\varphi^*\mu_{S^2} \cup H^1(M; \mathbb{Z}))$. To analyze the Faddeev-Hopf configuration space in more detail we will further exploit its natural relationship with the classical ($G = SU(2) = \mathrm{Sp}(1)$) Skyrme configuration space. These ideas were concurrently introduced in [5]. We identify $S^2$ with the unit purely imaginary quaternions, and $S^1$ with the unit complex numbers. The quotient, $\mathrm{Sp}(1)/S^1$ is homeomorphic to $S^2$, with an explicit homeomorphism given by $[q] \mapsto qiq^*$. Our main tool will be the map $q : S^2 \times S^1 \to \mathrm{Sp}(1)$,
$$q(x, \lambda) = q\lambda q^*, \quad \text{where } x = qiq^*. \qquad (6.2)$$
It is not difficult to verify the following properties of $q$:
1. It is well defined and smooth.
2. $q(x, \lambda_1\lambda_2) = q(x, \lambda_1)\,q(x, \lambda_2)$.
3. $q^{-1}(1) = S^2 \times \{1\}$.
4. $q(x, \lambda)\,x\,(q(x, \lambda))^* = x$.
5. $q(x, \cdot) : S^1 \to \{q \mid qxq^* = x\}$ is a diffeomorphism.
6. $\deg(q) = 2$ with the standard "outer normal first" orientations.
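The algebraic properties of the map in (6.2) can be spot-checked numerically. The following is a minimal sketch, not taken from the paper: it stores quaternions as `(w, x, y, z)` tuples, and the helper names `qmul`, `qconj`, `q_map`, `close` and the sample values `q0`, `lam1`, `lam2` are all hypothetical. It relies on the observation that for $\lambda = a + bi$ one has $q\lambda q^* = a + b\,qiq^* = a + bx$, which also makes it plausible that the map is well defined (independent of the choice of unit quaternion representing $x$).

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qconj(a):
    """Quaternion conjugate, written q^* in the text."""
    return (a[0], -a[1], -a[2], -a[3])

def q_map(x, lam):
    """Closed form of the map in (6.2): for lam = a + b i, q lam q^* = a + b x."""
    return (lam.real, lam.imag * x[1], lam.imag * x[2], lam.imag * x[3])

def close(a, b, tol=1e-12):
    """Componentwise comparison up to floating-point tolerance."""
    return all(abs(s - t) < tol for s, t in zip(a, b))

# Hypothetical sample data: a unit quaternion q0 and the point x = q0 i q0^*.
q0 = (0.5, 0.5, 0.5, 0.5)
x = qmul(qmul(q0, (0.0, 1.0, 0.0, 0.0)), qconj(q0))

lam1 = complex(math.cos(0.7), math.sin(0.7))
lam2 = complex(math.cos(-1.9), math.sin(-1.9))

# The closed form agrees with the defining expression q0 lam q0^*.
direct = qmul(qmul(q0, (lam1.real, lam1.imag, 0.0, 0.0)), qconj(q0))
assert close(direct, q_map(x, lam1))

# Property 2: q(x, lam1 lam2) = q(x, lam1) q(x, lam2).
assert close(q_map(x, lam1 * lam2), qmul(q_map(x, lam1), q_map(x, lam2)))

# Property 4: q(x, lam) x q(x, lam)^* = x.
u = q_map(x, lam1)
assert close(qmul(qmul(u, x), qconj(u)), x)
```

A numerical check of this kind only tests single sample points; it does not replace the short algebraic verifications, and it says nothing about the smoothness or degree statements.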
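For example writing $x = qiq^*$, the fourth property may be verified as $q(x, \lambda)\,x\,(q(x, \lambda))^* = q\lambda q^*\,qiq^*\,q\lambda^* q^* = x$. We will let $\lambda_x$ denote the inverse to $q(x, \cdot)$. We will also use the related maps, $\rho : S^2 \times \mathrm{Sp}(1) \times S^1 \to S^2 \times \mathrm{Sp}(1)$ and $f : S^2 \times \mathrm{Sp}(1) \to S^2 \times S^2$ defined by $\rho(x, y, \lambda) = (x, y\,q(x, \lambda))$ and $f(x, q) = (x, qxq^*)$. Properties (2) and (3) show that $\rho$ is a free right action. Properties (4) and (5) show that $f$ is a principal fibration with action $\rho$. As our first application of these maps we show that the evaluation map is a fibration.

Lemma 18. The evaluation map, $\mathrm{ev}_{p_0} : \mathrm{FreeMap}(M, S^2) \to S^2$, given by $\mathrm{ev}_{p_0}(\varphi) = \varphi(p_0)$ is a fibration.

Proof. We just need to construct the diagonal map in the following diagram.
$$
\begin{array}{ccc}
X \times \{0\} & \xrightarrow{\ h\ } & \mathrm{FreeMap}(M, S^2) \\
\downarrow {\scriptstyle i} & \nearrow {\scriptstyle \bar{H}} & \downarrow {\scriptstyle \mathrm{ev}_{p_0}} \\
X \times [0,1] & \xrightarrow{\ H\ } & S^2
\end{array}
$$
If we define the horizontal maps in the following diagram by $\mu(x, t) = (h(x)(p_0), H(x, t))$ and $\nu(x) = (h(x)(p_0), 1)$, then the existence of the diagonal map will follow because $f$ is a fibration.
$$
\begin{array}{ccc}
X \times \{0\} & \xrightarrow{\ \nu\ } & S^2 \times \mathrm{Sp}(1) \\
\downarrow {\scriptstyle i} & \nearrow {\scriptstyle \bar{\mu}} & \downarrow {\scriptstyle f} \\
X \times [0,1] & \xrightarrow{\ \mu\ } & S^2 \times S^2
\end{array}
$$
The desired map is just, $\bar{H}(x, t)(p) = \bar{\mu}_2(x, t)\,h(x)(p)\,(\bar{\mu}_2(x, t))^*$.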
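It may help to spell out why this formula gives a lift. The sketch below writes $\bar{\mu} = (\bar{\mu}_1, \bar{\mu}_2)$ for the diagonal lift in the second diagram and $\bar{H}$ for the map being constructed; this bar notation is an assumption about the intended symbols, and the check uses only $f \circ \bar{\mu} = \mu$, $\bar{\mu} \circ i = \nu$ and $f(x, q) = (x, qxq^*)$.

```latex
% Check that \bar{H}(x,t)(p) = \bar{\mu}_2(x,t) h(x)(p) \bar{\mu}_2(x,t)^* lifts H and extends h.
\begin{align*}
f(\bar{\mu}(x,t)) = \mu(x,t)
  &\;\Longrightarrow\; \bar{\mu}_1(x,t) = h(x)(p_0), \quad
     \bar{\mu}_2(x,t)\, h(x)(p_0)\, \bar{\mu}_2(x,t)^* = H(x,t), \\
\mathrm{ev}_{p_0}\big(\bar{H}(x,t)\big)
  &= \bar{\mu}_2(x,t)\, h(x)(p_0)\, \bar{\mu}_2(x,t)^* = H(x,t), \\
\bar{\mu}(x,0) = \nu(x) = (h(x)(p_0), 1)
  &\;\Longrightarrow\; \bar{H}(x,0)(p) = 1 \cdot h(x)(p) \cdot 1^* = h(x)(p).
\end{align*}
```

So the lower triangle of the first diagram commutes by the second line, and the upper triangle commutes by the third.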
Clearly the fiber of this fibration is $(S^2)^M$. Recall that we are using $(S^2)^M_\varphi$ to denote the $\varphi$-component of the space of based maps. In [5] these ideas were used to give a new proof of Pontrjagin's homotopy classification of maps from a 3-manifold to $S^2$. The following lemma comes from that paper. A second proof of this lemma may be found in [6].

Lemma 19 (Auckly-Kapitanski). There exists a map $u : M \to \mathrm{Sp}(1)$ such that $\psi : M \to S^2$ and $\varphi : M \to S^2$ are related by $\psi = u\varphi u^*$ if and only if $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$.
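Theorem 5 follows directly from this lemma. Assuming $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$, define a map $F : (S^2)^M_\varphi \to (S^2)^M_\psi$ by $F(\xi) = u\xi u^*$. This is clearly well defined because any map homotopic to $\varphi$ will be mapped to a map homotopic to $\psi$ under $F$. There is a well-defined inverse given by $F^{-1}(\zeta) = u^*\zeta u$.

We have a fibration relating the identity component of the Skyrme configuration space to any component of the Faddeev-Hopf configuration space.

Lemma 20. The map induced by $f$,
$$f_* : \{\varphi\} \times \mathrm{Sp}(1)^M_0 \to \{\varphi\} \times (S^2)^M_\varphi$$
is a fibration.

Proof. Once again we just need to construct the diagonal map in a diagram.
$$
\begin{array}{ccc}
X \times \{0\} & \xrightarrow{\ h\ } & \mathrm{Sp}(1)^M_0 \\
\downarrow {\scriptstyle i} & \nearrow {\scriptstyle \bar{H}} & \downarrow {\scriptstyle f_*} \\
X \times [0,1] & \xrightarrow{\ H\ } & (S^2)^M_\varphi
\end{array}
$$
So once again we consider a second diagram.
$$
\begin{array}{ccc}
M \times X \times \{0\} & \xrightarrow{\ \nu\ } & S^2 \times \mathrm{Sp}(1) \\
\downarrow {\scriptstyle i} & \nearrow {\scriptstyle \bar{\mu}} & \downarrow {\scriptstyle f} \\
M \times X \times [0,1] & \xrightarrow{\ \mu\ } & S^2 \times S^2
\end{array}
$$
Here the horizontal maps are given by $\nu(p, x) = (\varphi(p), h(x)(p))$ and $\mu(p, x, t) = (\varphi(p), H(x, t)(p))$. The diagonal lift exists because $f$ is a fibration. We need to use Property (5) of $q$ to adjust the base points. Let $x_0$ be the basepoint of $S^2$ and define the desired lift by $\bar{H}(x, t)(p) = \bar{\mu}_2(p, x, t)\,q(\varphi(p), \lambda_{x_0}(\bar{\mu}_2(p, x, t)^{-1}))$. This completes the proof.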
By Property (5) of $q$, we see that any element of the fiber of the above fibration may be written in the form $q(\varphi, \lambda)$ for some map $\lambda : M \to S^1$. Since $q(\varphi, \lambda)$ is null homotopic, its degree must be zero. By Property (6) of $q$, this implies that $\lambda^*\mu_{S^1}$ must be in the kernel of the cup product $2\varphi^*\mu_{S^2} \cup$. Conversely, given any map $\lambda$ with $\lambda^*\mu_{S^1} \in \ker(2\varphi^*\mu_{S^2})$ we get an element of the fiber. Recall that the components of the space of maps from $M$ to $S^1$ correspond to $H^1(M; \mathbb{Z})$ and each component is homeomorphic to the identity component which is homeomorphic to $\mathbb{R}^M$ which is contractible. It follows that up to homotopy $\mathrm{Sp}(1)^M_0$ is a regular $\ker(2\varphi^*\mu_{S^2})$ cover of $(S^2)^M_\varphi$ (the fiber is homotopy equivalent to $\ker(2\varphi^*\mu_{S^2})$). The homotopy sequence of the fibration then gives us the following sequence:
$$0 \to \pi_1(\mathrm{Sp}(1)^M) \to \pi_1((S^2)^M_\varphi) \to \ker(2\varphi^*\mu_{S^2}) \to 0. \qquad (6.3)$$
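A quick sanity check of sequence (6.3) is the case $M = S^3$. The sketch below assumes only the standard values $H^1(S^3; \mathbb{Z}) = 0$ and $\pi_4(S^3) \cong \pi_4(S^2) \cong \mathbb{Z}_2$, together with the identification of based maps $S^3 \to X$ with $\Omega^3 X$.

```latex
% Sequence (6.3) for M = S^3.
\begin{align*}
\ker(2\varphi^*\mu_{S^2}) &\subset H^1(S^3; \mathbb{Z}) = 0, \\
\pi_1(\mathrm{Sp}(1)^{S^3}) &\cong \pi_4(\mathrm{Sp}(1)) = \pi_4(S^3) \cong \mathbb{Z}_2, \\
0 \to \mathbb{Z}_2 \to \pi_1\big((S^2)^{S^3}_\varphi\big) \to 0 \to 0
  &\;\Longrightarrow\; \pi_1\big((S^2)^{S^3}_\varphi\big) \cong \mathbb{Z}_2 \cong \pi_4(S^2).
\end{align*}
```

The result agrees with the direct computation $\pi_1(\Omega^3 S^2) \cong \pi_4(S^2)$ on any path component.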
Since we do not already know that $\pi_1((S^2)^M_\varphi)$ is abelian, we not only need to show that the sequence splits, we also need to show that the image of the splitting commutes with the image of $\pi_1(\mathrm{Sp}(1)^M)$. This is the content of the following lemma. This lemma will complete the proof of Theorem 6.

Lemma 21. The sequence (6.3) splits and the image of the splitting commutes with the image of $\pi_1(\mathrm{Sp}(1)^M)$.

Proof. Given $\theta \in \ker(2\varphi^*\mu_{S^2})$ define a corresponding map $\lambda_\theta : M \to S^1$ in the usual way, $\lambda_\theta(p) = e^{\int_{p_0}^{p} \theta}$. This induces a map, $q_\theta : M \to \mathrm{Sp}(1)$ by $q_\theta(p) = q(\varphi(p), \lambda_\theta(p))$. We compute the degree as follows:
$$\deg(q_\theta) = \int_M q_\theta^*\mu_{\mathrm{Sp}(1)} = 2\int_M \varphi^*\mu_{S^2} \wedge \lambda_\theta^*\mu_{S^1} = 2\int_M \varphi^*\mu_{S^2} \wedge \theta = 0.$$
It follows that there is a homotopy, $H^\theta$ with $H^\theta(0) = 1$ and $H^\theta(1) = q_\theta$. Define the splitting by sending $\theta$ to $H_\theta \in \pi_1((S^2)^M_\varphi)$ given by $H_\theta(t)(p) = f(\varphi(p), H^\theta(t)(p))$. To see that the image of this splitting commutes with the image of $\pi_1(\mathrm{Sp}(1)^M)$, let $\gamma : [0,1] \to \mathrm{Sp}(1)^M$ be a loop and define maps $\delta_1$ and $\delta_2$ by $\delta_1(t, s) = \frac{2t}{s+1}$ for $t \le \frac{1}{2}(s+1)$, $\delta_1(t, s) = 1$ otherwise, and $\delta_2(t, s) = 1 - \delta_1(1-t, s)$. We see that $f(\varphi, (\gamma \circ \delta_1) \cdot (H^\theta \circ \delta_2))$ is a homotopy between $f(\varphi, \gamma) * f(\varphi, H^\theta)$ and $f(\varphi, \gamma \cdot H^\theta)$. Likewise, $f(\varphi, (\gamma \circ \delta_2) \cdot (H^\theta \circ \delta_1))$ is a homotopy between $f(\varphi, H^\theta) * f(\varphi, \gamma)$ and $f(\varphi, \gamma \cdot H^\theta)$.

To prove Theorem 7, notice that we have a left $S^1$ action on $(S^2)^M_\varphi$ given by $z \cdot \psi := z\psi z^*$. We claim that the fibration, $\mathrm{FreeMap}(M, S^2)_\varphi \to S^2$, is just the fiber bundle with associated principal bundle $\mathrm{Sp}(1) \to S^2$ and fiber $(S^2)^M_\varphi$. In fact, the map $\mathrm{Sp}(1) \times_{S^1} (S^2)^M_\varphi \to \mathrm{FreeMaps}(M, S^2)_\varphi$ given by $[q, \psi] \mapsto q\psi q^*$ is the desired isomorphism. Now consider the homotopy exact sequence of the fibration, $\mathrm{FreeMap}(M, S^2)_\varphi \to S^2$,
$$\to \pi_2(S^2) \to \pi_1((S^2)^M_\varphi) \to \pi_1(\mathrm{FreeMaps}(M, S^2)_\varphi) \to 0.$$
It follows that $\pi_1(\mathrm{FreeMaps}(M, S^2)_\varphi)$ is just the quotient of $\pi_1((S^2)^M_\varphi)$ by the image of $\pi_2(S^2)$. The next lemma identifies this image, to complete the proof of Theorem 7.

Lemma 22. The image of the map from $\pi_2$ is the subgroup of $H^2(M; \mathbb{Z}) < \pi_1((S^2)^M_\varphi)$ generated by $2\varphi^*\mu_{S^2}$.

Proof. Recall that the map from $\pi_2$ of the base to $\pi_1$ of the fiber is defined by taking a map of a disk into the base to the restriction to the boundary of a lift of the disk to the total space. Since the boundary of the disk maps to the base point, the restriction to the boundary of the lift lies in the fiber. The homotopy exact sequence of the fibration, $\mathrm{Sp}(1) \to S^2$ implies that the disk representing a generator of $\pi_2(S^2)$ lifts to a disk with boundary generating the fundamental group of the fiber $S^1$. This lift, say $\tilde{\gamma}$ to $\mathrm{Sp}(1)$, gives a lift $D^2 \to \mathrm{Sp}(1) \times_{S^1} (S^2)^M_\varphi \cong \mathrm{FreeMaps}(M, S^2)$ defined by $z \mapsto [\tilde{\gamma}(z), \varphi]$. Restricted to the boundary, this map is just $z \mapsto [z, \varphi] = [1, z\varphi z^*]$. It follows that the image of $\pi_2(S^2)$ is just the subgroup generated by the loop $z\varphi z^*$. We now just have to trace this loop through the proof of the isomorphism,
$$\pi_1((S^2)^M_\varphi) \cong \mathbb{Z}_2 \oplus H^2(M; \mathbb{Z}) \oplus \ker(2\varphi^*\mu_{S^2}).$$
The projection to $\ker(2\varphi^*\mu_{S^2})$ was defined by taking a lift of each map in the 1-parameter family representing the loop in $\pi_1((S^2)^M_\varphi)$ to $\mathrm{Sp}(1)^M_0$ and comparing the maps at the beginning and end. In our case the entire path consistently lifts to the path $\gamma_\varphi : S^1 \to \mathrm{Sp}(1)^M_0$ given by $\gamma_\varphi(z) = z\,q(\varphi, z^*)$. It follows that the component in $\ker(2\varphi^*\mu_{S^2})$ is zero.

A loop such as $\gamma_\varphi$ naturally defines a map, $\bar{\gamma}_\varphi : M \times S^1 \to \mathrm{Sp}(1)$. The image of our loop in $H^2(M; \mathbb{Z})$ is just $\bar{\gamma}_\varphi^*\mu_{\mathrm{Sp}(1)}/[\mathrm{pt} \times S^1]$. In notation reminiscent of differential forms this would be $\int_{\mathrm{pt} \times S^1} \bar{\gamma}_\varphi^*\mu_{\mathrm{Sp}(1)}$. In order to evaluate this, we write $\bar{\gamma}_\varphi$ as the composition of the map $(\varphi, \mathrm{id}_{S^1}) : M \times S^1 \to S^2 \times S^1$ and the map $\tilde{q} : S^2 \times S^1 \to \mathrm{Sp}(1)$ given by $\tilde{q}(x, z) = z\,q(x, z^*)$. This latter map is then expressed as the composition of $(q, \mathrm{pr}_2^*) : S^2 \times S^1 \to \mathrm{Sp}(1) \times S^1$ and the map $\mathrm{Sp}(1) \times S^1 \to \mathrm{Sp}(1)$ given by $(u, \lambda) \mapsto \lambda u$. The form $\mu_{\mathrm{Sp}(1)}$ pulls back to $\mu_{\mathrm{Sp}(1)} \otimes 1$ under the first map, and this pulls back to $2\mu_{S^2} \otimes \mu_{S^1}$ under the first factor of $\tilde{q}$ since $q$ has degree two. In particular $\tilde{q}$ has degree two as well. We can now complete this computation to see that our loop projects to $2\varphi^*\mu_{S^2}$ in $H^2(M; \mathbb{Z})$.

To complete the proof, we need to compute the projection of our loop in the $\mathbb{Z}_2$-factor. The projection to $\mathbb{Z}_2$ is defined by multiplying the inverse of our map by the image of $2\varphi^*\mu_{S^2}$ under the splitting $H^2 \to \pi_1$ and taking the framing of the inverse image of a regular value. The equivalence classes of framings may be identified with $\mathbb{Z}_2$ since the inverse image is homologically trivial. Alternatively we may compare the framing coming from our map to the framing of the map coming from the splitting. The image under the splitting of $2\varphi^*\mu_{S^2}$ is, of course, just two times the image of $\varphi^*\mu_{S^2}$ under the splitting. The inverse image coming from our map is just two copies of the inverse image of a frame under the map $(\varphi, \mathrm{id}_{S^1}) : M \times S^1 \to S^2 \times S^1$. This means that the projection is even, so zero in $\mathbb{Z}_2$.

6.4. The cohomology of Faddeev-Hopf configuration spaces. In order to compute the cohomology of $(S^2)^M_\varphi$ we will use the fibration, $\mathrm{Sp}(1)^M_0 \to (S^2)^M_\varphi$. The fiber of this fibration is just $\sqcup_{\alpha \in K} (S^1)^M_\alpha$, where $K = \ker(2\varphi^*\mu_{S^2} \cup)$. Up to homotopy, the fiber is just $K$. In fact we can assume that the fiber is exactly $K$ if we first take the quotient by $(S^1)^M_0$ (which is contractible by Reduction 6).
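To make the group $K$ concrete, consider the following sketch for $M = T^3$. The choice of $\varphi$ is an assumption made for illustration: it is the composition of the projection $T^3 \to T^2$ onto the first two factors with a degree one map $T^2 \to S^2$, so that $\varphi^*\mu_{S^2} = e_1 \cup e_2$ for the standard basis $e_1, e_2, e_3$ of $H^1(T^3; \mathbb{Z})$.

```latex
% Worked example of K = ker(2 \varphi^* \mu_{S^2} \cup) for M = T^3, \varphi^*\mu_{S^2} = e_1 \cup e_2.
\begin{align*}
2\varphi^*\mu_{S^2} \cup (a e_1 + b e_2 + c e_3)
  &= 2a\, e_1 e_2 e_1 + 2b\, e_1 e_2 e_2 + 2c\, e_1 e_2 e_3
   = 2c\, e_1 e_2 e_3, \\
K &= \{ a e_1 + b e_2 : a, b \in \mathbb{Z} \} \cong \mathbb{Z}^2, \\
H^3(T^3; \mathbb{Z}) / \big(2\varphi^*\mu_{S^2} \cup H^1(T^3; \mathbb{Z})\big)
  &\cong \mathbb{Z}/2\mathbb{Z}.
\end{align*}
```

In this example $K$ is free abelian of rank 2, and the family of components with this value of $\varphi^*\mu_{S^2}$ contains exactly two components.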
It is slightly tricky to use a spectral sequence to compute the cohomology of the base of a fibration, so we will use equivariant cohomology to recast the problem. Recall that any Lie group acts properly on a contractible space called the total space of the universal bundle. In our case, this space is denoted $EK$. The quotient of this by $K$ is the classifying space $BK$. Let $X$ be a $K$ space (i.e. a space with a $K$ action) and consider the space $X_K := EK \times_K X$. We will be interested in the situation when $X = \mathrm{Sp}(1)^M_0$ (or rather this divided by $(S^1)^M_0$, but this has the same homotopy type). The cohomology of the space $X_K$ is called the equivariant cohomology of $X$. It is denoted by $H^*_K(X)$. When the $K$ action on $X$ is free and proper (as it is in our case), we have a fibration $X_K \to X/K$ obtained by ignoring the $EK$ component in the definition of $X_K$. The fiber of this fibration is just $EK$ which is contractible, so the spectral sequence of the fibration implies that the cohomology of $X/K$ is isomorphic to the equivariant cohomology of $X$. By ignoring the $X$ component in the definition of $X_K$ we obtain a fibration $X_K \to BK$ which may be used to compute the equivariant cohomology of $X$.

Since $H^1(M; \mathbb{Z})$ is a free abelian group, the kernel $K$ is as well. It follows that we may take $EK$ to be $\mathbb{R}^n$ with $n$ equal to the rank of $K$ and with $K$ acting by translations. It follows that $BK$ is just an $n$-torus, and we have a spectral sequence with $E_2$ term, $E_2^{p,q} = H^p(T^n; H^q(\widetilde{\mathrm{Sp}(1)^M_0}))$ converging to the cohomology of $(S^2)^M_\varphi$. Clearly the fundamental group of $T^n$ is just $K$. To compute the action of $K$ on $H^*(\mathrm{Sp}(1)^M_0)$, let $\lambda : M \to S^1$ satisfy $\lambda^*\mu_{S^1} \in K$, and $\mu(\Sigma \otimes x) \in H^q(\mathrm{Sp}(1)^M_0)$ with $\sigma : \Sigma \to M$. Let $u : \Delta^q \to \mathrm{Sp}(1)^M_0$ be a singular $q$-simplex and let $m : \mathrm{Sp}(1) \times \mathrm{Sp}(1) \to \mathrm{Sp}(1)$ be the multiplication. Then we have,
$$
\begin{aligned}
(\lambda^*\mu_{S^1} \cdot \mu(\Sigma \otimes x))(u)
  &= \int_{\Sigma \times \Delta^q} m(u,\ q(\varphi, \lambda) \circ (\sigma, 1))^* x \\
  &= \int_{\Sigma \times \Delta^q} \Big( (\sigma, 1)^* q(\varphi, \lambda)^* L_u^* x + R^*_{q(\varphi, \lambda) \circ (\sigma, 1)} u^* x \Big) \\
  &= \int_{\Sigma \times \Delta^q} u^* x = \mu(\Sigma \otimes x)(u).
\end{aligned}
$$
Thus the fundamental group of the base acts trivially on the cohomology of the fiber. Because this fibration has an associated principal fibration with discrete group, all of the higher differentials vanish, and we obtain Theorem 8.

Theorem 9 will follow from considerations of a general fiber bundle with structure group $S^1$ and one computation. Let $P \to X$ be a principal $S^1$ bundle with simply-connected base and let $\tau : S^1 \times F \to F$ be a left action. The Serre spectral sequence of the fibration $E = P \times_{S^1} F \to X$ has $E_2^{p,q} = H^p(X; H^q(F; \mathbb{R}))$ and second differential $d_2\omega = c_1(P) \cup \tau^*\omega/[S^1]$. In our case, the principal bundle is $\mathrm{Sp}(1) \to S^2$. It follows immediately that the coefficient system in the $E_2$ term of the Serre spectral sequence is untwisted, and that the only non-trivial differential is the $d_2$ differential. In this case, the first Chern class is $\mu_{S^2}$. The action that we consider is the map $\tau : S^1 \times (S^2)^M_\varphi \to (S^2)^M_\varphi$ given by $\tau(z, u) = zuz^*$. In fact, we only need to consider the effect of this action on terms coming from $\mathrm{Sp}(1)^M_0$. This is because the action is trivial on the classes coming from $BK$. This can be seen by considering a map from $(S^2)^M_\varphi$ to $BK$. However, the easiest way to see this is first to compute the cohomology of the fiber bundle with fiber $\mathrm{Sp}(1)^M_0$, and then recognize that, up to homotopy, the total space of this bundle is a regular $K$-cover of $\mathrm{FreeMaps}(M, S^2)_\varphi$. Either way, we need to compute the second differential coming from the action, $\tau_0 : S^1 \times \mathrm{Sp}(1)^M_0 \to \mathrm{Sp}(1)^M_0$ given by $\tau_0(z, u) = \tilde{q}(\varphi, z)\,u$. Let $u : F \to \mathrm{Sp}(1)^M_0$ be a singular chain and compute
$$\big(\tau_0^*\mu(\Sigma \otimes x)/[S^1]\big)(u) = \int_{S^1 \times \Sigma \times F} \Big( m \circ \big(\tilde{q}(\varphi \circ \sigma, \mathrm{pr}_{S^1}),\ u \circ (\sigma, \mathrm{pr}_F)\big) \Big)^* x.$$
Here $m : \mathrm{Sp}(1) \times \mathrm{Sp}(1) \to \mathrm{Sp}(1)$ is multiplication, and the rest of the maps are as in the definition of $\mu(\Sigma \otimes x)$ in line (4.1). This vanishes for dimensional reasons when $\Sigma$ is a 1-cycle ($\varphi \circ \sigma$ would push it forward to a 1-cycle in $S^2$). When $\Sigma$ is a 2-cycle, we use the product rule and the fact that $\tilde{q} : S^2 \times S^1 \to \mathrm{Sp}(1)$ has degree two to conclude that $\tau_0^*\mu(\Sigma \otimes x)/[S^1] = 2\varphi^*\mu_{S^2}[\Sigma]$. This completes the proof of our last theorem.

Acknowledgement. We would like to thank Louis Crane, Steffen Krusch and Larry Weaver for helpful conversations about particle physics.
References
1. Aitchison, I.J.R.: Effective Lagrangians and soliton physics I: derivative expansion, and decoupling. Acta Phys. Polon. B18:3, 191–205 (1987)
2. Aitchison, I.J.R.: Berry phases, magnetic monopoles, and Wess-Zumino terms or how the Skyrmion got its spin. Acta Phys. Polon. B18:3, 207–235 (1987)
3. Adams, J.F.: Lectures on exceptional Lie groups. Chicago and London: The University of Chicago Press, 1996
4. Auckly, D., Kapitanski, L.: Holonomy and Skyrme's Model. Commun. Math. Phys. 240, 97–122 (2003)
5. Auckly, D., Kapitanski, L.: Analysis of the Faddeev Model. Commun. Math. Phys. 256, 611–620 (2005)
6. Auckly, D., Kapitanski, L.: The Pontrjagin-Hopf invariants for Sobolev maps. In progress
7. Balachandran, A., Marmo, G., Skagerstam, B., Stern, A.: Classical topology and quantum states. New Jersey: World Scientific, 1991
8. Balachandran, A., Marmo, G., Simoni, A., Sparno, G.: Quantum bundles and their symmetries. Int. J. Mod. Phys. A 7:8, 1641–1667 (1982)
9. Birrell, N., Davies, P.: Quantum fields in curved space. Cambridge: Cambridge University Press, 1982
10. Bopp, F., Haag, Z.: Über die Möglichkeit von Spinmodellen. Zeits. für Natur. 5a, 644 (1950)
11. Bröcker, Th., tom Dieck, T.: Representation of compact Lie groups. New York: Springer-Verlag, 1985
12. Dirac, P.: Quantized singularities in the electromagnetic field. Proc. Roy. Soc. London A133, 60–72 (1931)
13. Dirac, P.: The theory of magnetic poles. Phys. Rev. 74, 817–830 (1948)
14. Federer, H.: A study of function spaces by spectral sequences. Ann. of Math. 61, 340–361 (1956)
15. Finkelstein, D., Rubinstein, J.: Connection between spin statistics and kinks. J. Math. Phys. 9, 1762–1779 (1968)
16. Giulini, D.: On the possibility of spinorial quantization in the Skyrme model. Mod. Phys. Lett. A8, 1917–1924 (1993)
17. Gottlieb, D.: Lifting actions in fibrations. In: Geometric applications of homotopy theory I. Lecture Notes in Mathematics 657, Berlin-Heidelberg-New York: Springer, 1978, pp. 217–253
18. Halzen, F., Martin, A.: Quarks and Leptons: an introductory course in modern particle physics. New York: John Wiley and Sons, 1984
19. Hattori, A., Yoshida, T.: Lifting group actions in fiber bundles. Japan J. Math. 24, 13–25 (1976)
20. Helgason, S.: Differential geometry, Lie groups and symmetric spaces. Providence, Rhode Island: Amer. Math. Soc., 2001
21. Imbo, T.D., Shah Imbo, C., Sudarshan, E.C.G.: Identical particles, exotic statistics and braid groups. Phys. Lett. B 234, 103–107 (1990)
22. Iwasawa, K.: On some types of topological groups. Ann. of Math. 50:3, 507–558 (1949)
23. Kirby, R.C.: The topology of 4-manifolds. Lecture Notes in Mathematics 1374, New York: Springer-Verlag, 1985
24. Krusch, S.: Homotopy of rational maps and quantization of Skyrmions. Annals Phys. 304, 103–127 (2003)
25. Mimura, M., Toda, H.: Topology of Lie groups, I and II. Transl. Amer. Math. Soc., Providence, Rhode Island: Amer. Math. Soc., 1991
26. Particle data group: Review of particle physics. http://pdg.lbl.gov
27. Pontrjagin, L.: A classification of mappings of the 3-dimensional complex into the 2-dimensional sphere. Mat. Sbornik N.S. 9:51, 331–363 (1941)
28. Ramadas, T.R.: The Wess-Zumino term and fermionic solitons. Commun. Math. Phys. 93, 355–365 (1984)
29. Schulman, L.: A path integral for spin. Phys. Rev. 176:5, 1558–1569 (1968)
30. Simms, D.J., Woodhouse, N.M.J.: Geometric quantization. Lecture Notes in Physics 53, Berlin: Springer-Verlag, 1977
31. Skyrme, T.H.R.: A unified field theory of mesons and baryons. Nucl. Phys. 31, 555–569 (1962)
32. Sorkin, R.: A general relation between kink-exchange and kink-rotation. Commun. Math. Phys. 115, 421–434 (1988)
33. Spanier, E.H.: Algebraic topology. New York: Springer, 1966
34. Steenrod, N.: The topology of fiber bundles. Princeton: Princeton University Press, 1951
35. Witten, E.: Current algebra, baryons and quark confinement. Nucl. Phys. B233, 433–444 (1983)
36. Zaccaria, F., Sudarshan, E., Nilsson, J., Mukunda, N., Marmo, G., Balachandran, A.: Universal unfolding of Hamiltonian systems: from symplectic structures to fibre bundles. Phys. Rev. D27, 2327–2340 (1983)

Communicated by L. Takhtajan
Commun. Math. Phys. 263, 217–258 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1452-0
Communications in
Mathematical Physics
Unitary Representations of Super Lie Groups and Applications to the Classification and Multiplet Structure of Super Particles

C. Carmeli1, G. Cassinelli1, A. Toigo1, V.S. Varadarajan2

1 Dipartimento di Fisica, Università di Genova, I.N.F.N., Sezione di Genova, Via Dodecaneso 33, 16146 Genova, Italy. E-mail: [email protected]; [email protected]; [email protected]
2 Department of Mathematics, University of California at Los Angeles, Box 951555, Los Angeles, CA 90095-1555, USA. E-mail: [email protected]

Received: 23 December 2004 / Accepted: 9 May 2005
Published online: 24 January 2006 – © Springer-Verlag 2006
Abstract: It is well known that the category of super Lie groups (SLG) is equivalent to the category of super Harish-Chandra pairs (SHCP). Using this equivalence, we define the category of unitary representations (UR's) of a super Lie group. We give an extension of the classical inducing construction and Mackey imprimitivity theorem to this setting. We use our results to classify the irreducible unitary representations of semidirect products of super translation groups by classical Lie groups, in particular of the super Poincaré groups in arbitrary dimension and signature. Finally we compare our results with those in the physical literature on the structure and classification of super multiplets.

1. Introduction

The classification of free relativistic super particles in SUSY quantum mechanics is well known (see for example [SS74, FSZ81]). It is based on the technique of little (super)groups which, in the classical context, goes back to Wigner and Mackey. However the treatments of this question in the physics literature make the implicit assumption that the technique of little groups remains valid in the SUSY set up without any changes; in particular no attempt is made in currently available treatments to exhibit the SUSY transformations explicitly for the super particle. The aim of this paper is to remedy this situation by laying a precise mathematical foundation for the theory of unitary representations of super Lie groups, and then to apply it to the case of the super Poincaré groups. In the process of doing this we clarify and extend the results in the physical literature to Minkowskian spacetimes of arbitrary dimension $D \ge 4$ and $N$-extended supersymmetry for arbitrary $N \ge 1$.

Super Lie groups differ from classical Lie groups in a fundamental way: one should think of them as group-valued functors rather than groups.
As long as only finite dimensional representations are being considered, there is no difficulty in adapting the functorial theory to study representations. However, this marriage of the functorial
approach with representation theory requires modifications when one considers infinite dimensional representations.

Our entire approach is based on two observations. The first is to view a super Lie group as a Harish-Chandra pair, namely a pair $(G_0, \mathfrak{g})$ where $G_0$ is a classical Lie group, $\mathfrak{g}$ is a super Lie algebra which is a $G_0$-module, $\mathrm{Lie}(G_0) = \mathfrak{g}_0$, and the action of $\mathfrak{g}_0$ on $\mathfrak{g}$ is the differential of the action of $G_0$. That this is a valid starting point is justified by the result [DM99] that the category of super Lie groups is equivalent to the category of Harish-Chandra pairs. This point of view leads naturally to define a unitary representation of a super Lie group $(G_0, \mathfrak{g})$ as a classical unitary representation of $G_0$ together with a compatible infinitesimal unitary action of $\mathfrak{g}$. This is in fact very close to the approach of the physicists.

The source of our second observation is more technical and is the fact, which is a consequence of the commutation rules, that the operators corresponding to the odd elements of $\mathfrak{g}$ are in general unbounded and so care is needed to work with them. Our second observation is in fact a basic result of this paper, namely, that the commutation rules and the symmetry requirements that are implicit in a supersymmetric theory force the unbounded odd operators to be well behaved and lead to an essentially unique way to define a unitary representation of a super Lie group. Of course this aspect was not treated in the physical literature, not only because representations of the big super Lie groups were not considered, but even more, because only finite dimensional super representations of the little groups were considered, where unbounded phenomena obviously do not occur. Our treatment has the additional feature that it is able to handle the construction of super particles with infinite spin also.
In the first section of the paper we treat the foundations of the theory of unitary representations of a super Lie group based on these two observations. The basic result is Proposition 2 which asserts that the odd operators are essentially self adjoint on their domain and that the representation of the Harish-Chandra pair, extended to the space of $C^\infty$ vectors of the representation of $G_0$, is unique. This result is essential to everything we do and shows that the formal aspects of representations of super Lie groups already control all their analytic aspects.

The second section discusses the imprimitivity theorem in the super context. Here an important assumption is made, namely that the super homogeneous space on which we have the system of imprimitivity is a purely even manifold, or equivalently, the sub super Lie group $(H_0, \mathfrak{h})$ defining the system of imprimitivity has the same odd dimension as the ambient super Lie group, i.e., $\mathfrak{h}_1 = \mathfrak{g}_1$. This restriction, although severe, is entirely adequate for treating super Poincaré groups and more generally a wide class of super semi direct products. The main result is Theorem 2 which asserts that the inducing functor is an equivalence of categories from the category of unitary representations of $(H_0, \mathfrak{h})$ to the category of super systems of imprimitivity based on $G_0/H_0$. These two sections complete the foundational aspects of this paper.

The third section is concerned with applications and bringing our treatment as close as possible to the ones in the literature. We consider a super semi direct product of a super translation group with a classical Lie group $L_0$ acting on it. No special assumption is made about the action of $L_0$ on $\mathfrak{g}_1$, so that the class of super Lie groups considered is vastly larger than the ones treated in the physical literature, where this action is always assumed to be spinorial.
If $T_0$ is the vector group which is the even part of the super translation group, the theory developed in §§3,4 leads to the result that the irreducible representations of $(G_0, \mathfrak{g})$ are in one-one correspondence with certain $L_0$-orbits in the dual $T_0^*$ of $T_0$ together with certain irreducible representations of the super Lie groups (little groups) which are stabilizers of the points in the orbits. This is the super version of the classical little group method (Theorem 3).
It is from this point on that the SUSY theory acquires its own distinctive flavor. In the first place, unlike the classical situation, Theorem 3 stipulates that not all orbits are allowed, only those belonging to a suitable subset $T_0^+$. We shall call these orbits admissible. These are the orbits where the little super group admits an irreducible unitary representation which restricts to a character of $T_0$. These representations will be called admissible. These orbits satisfy a positivity condition which we interpret as the condition of positivity of energy. This condition is therefore necessary for admissibility. However it requires some effort to show that it is also sufficient for admissibility, and then to determine all the irreducible unitary representations of the little super Lie group at $\lambda$ (Theorem 4).

The road to Theorem 4 is somewhat complicated. Let $\lambda \in T_0^*$ be fixed. The classical stabilizer of $\lambda$ is $T_0 L_0^\lambda$, where $L_0^\lambda$ is the classical little group at $\lambda$, namely the stabilizer of $\lambda$ in $L_0$. The super stabilizer of $\lambda$ is the super Lie group $S^\lambda$ defined by
$$S^\lambda = (T_0 L_0^\lambda, \mathfrak{g}^\lambda), \qquad \mathfrak{g}^\lambda = \mathfrak{t}_0 \oplus \mathfrak{l}_0^\lambda \oplus \mathfrak{g}_1.$$
(By convention, the Lie algebra of a Lie group is denoted by the corresponding gothic letter.) Given $\lambda$, there is an associated $L_0^\lambda$-invariant quadratic form $\Phi_\lambda$ on $\mathfrak{g}_1$ which will be nonnegative definite; the nonnegativity of $\Phi_\lambda$ is the positive energy condition mentioned earlier. This form need not be strictly positive, but one can pass to the quotient $\mathfrak{g}_{1\lambda}$ of $\mathfrak{g}_1$ by its radical, and obtain a positive definite quadratic vector space on which the classical part $L_0^\lambda$ of the little group operates. So we obtain a map
$$j_\lambda : L_0^\lambda \longrightarrow \mathrm{O}(\mathfrak{g}_{1\lambda}).$$
Now $L_0^\lambda$ need not be connected and so $j_\lambda$ need not map into $\mathrm{SO}(\mathfrak{g}_{1\lambda})$. Also, since we are not making any assumption about the action of $L_0$ on $\mathfrak{g}_1$, it is quite possible that $\dim(\mathfrak{g}_{1\lambda})$ could be odd. We introduce the subgroup
$$L_{00}^\lambda = \begin{cases} j_\lambda^{-1}(\mathrm{SO}(\mathfrak{g}_{1\lambda})) & \text{if } j_\lambda(L_0^\lambda) \not\subset \mathrm{SO}(\mathfrak{g}_{1\lambda}) \text{ and } \dim(\mathfrak{g}_{1\lambda}) \text{ is even} \\ L_0^\lambda & \text{otherwise.} \end{cases}$$
Then $L_{00}^\lambda$ is either the whole of $L_0^\lambda$ or a (normal) subgroup of index 2 in it. Theorem 4 asserts that there is a functorial map $r \longmapsto \theta_{r\lambda}$ which is an equivalence of categories from the category of unitary projective representations $r$ of $L_{00}^\lambda$ corresponding to a certain canonical multiplier $\mu_\lambda$, to the category of admissible unitary representations of the super group $S^\lambda$. In particular, the admissible irreducible unitary representations of $S^\lambda$ correspond one-one to irreducible unitary $\mu_\lambda$-representations of $L_{00}^\lambda$.

We shall now explain how this correspondence is set up. The odd operators of the super representation of the little group are obtained from a self adjoint representation of the Clifford algebra of $\mathfrak{g}_{1\lambda}$. Since any such representation is a multiple of an essentially unique one, we begin with an irreducible self adjoint representation. Let us call it $\tau_\lambda$. In the space of $\tau_\lambda$ one can define in an essentially unique manner a projective representation $\kappa_\lambda$ of $L_0^\lambda$ that intertwines $\tau_\lambda$ and its transforms by elements of $L_0^\lambda$:
$$\kappa_\lambda(t)\,\tau_\lambda(X)\,\kappa_\lambda(t)^{-1} = \tau_\lambda(tX). \qquad (\ast)$$
The class of the multiplier $\mu_\lambda$ in $H^2(L_0^\lambda, \mathbb{T})$ is then uniquely determined. We can normalize $\mu_\lambda$ so that it takes only $\pm 1$-values. Starting from any unitary $\mu_\lambda$-representation $r$ of $L_{00}^\lambda$ we build the $\mu_\lambda$-representation $\mathrm{Ind}(r)$ of $L_0^\lambda$ induced by $r$. Then the admissible unitary representation $\theta_{r\lambda}$ of $S^\lambda$ corresponding to $r$ is given by
$$\theta_{r\lambda} = \bigl(e^{i\lambda(t)}\,\mathrm{Ind}(r) \otimes \kappa_\lambda,\ 1 \otimes \tau_\lambda\bigr);$$
the fact that $\mu_\lambda$ is $\pm 1$-valued implies that $\theta_{r\lambda}$ is an ordinary rather than a projective representation. The representation of the super Lie group $(G_0, \mathfrak{g})$ induced by $\theta_{r\lambda}$ from $S^\lambda$ is then irreducible if $r$ is irreducible. Theorem 5 asserts that all the irreducible unitary representations of the super Lie group are parametrized bijectively as above by the irreducible representations of $L_{00}^\lambda$, thus giving the super version of the classical theory.

The representation $\kappa_\lambda$ is therefore at the heart of the theory of irreducible unitary representations of super semi direct products. It is finite dimensional; in fact it is the lift via $j_\lambda$ to $L_0^\lambda$ of the spin representation of the quadratic vector space. It is dependent only on $\lambda$. Clearly, the representation $\mathrm{Ind}(r) \otimes \kappa_\lambda$ of $L_0^\lambda$ will not in general be irreducible even if $r$ is. The irreducible constituents of $\mathrm{Ind}(r) \otimes \kappa_\lambda$ then define, via the classical procedure of inducing, irreducible unitary representations of $G_0$ which are the constituents of the even part of the full super representation. This family of irreducible representations of $G_0$ is the multiplet of the irreducible representation of the super semi direct product in question. If $\mu_\lambda$ is trivial, we can choose $r$ to be trivial, and the corresponding multiplet is the fundamental multiplet. Thus $\kappa_\lambda$ determines the entire correspondence in a simple manner.

In §4.3 we discuss the case of the super Poincaré groups. We consider spacetimes of Minkowski signature and of arbitrary dimension $D \ge 4$ together with $N$-extended supersymmetry for arbitrary $N \ge 1$. In this case, the groups $L_0^\lambda$ are all connected and $L_0^\lambda = L_{00}^\lambda$.
Moreover, the multiplier $\mu_\lambda$ becomes trivial, so that $\kappa_\lambda$ becomes an ordinary representation (Lemma 13). Hence
$$\theta_{r\lambda} = \bigl(e^{i\lambda(t)}\, r \otimes \kappa_\lambda,\ 1 \otimes \tau_\lambda\bigr).$$
Thus in this case we finally reach the conclusion that the super particles are parametrized by the admissible orbits and irreducible unitary representations of the stabilizers of the classical little groups, exactly as in the classical theory. The positive energy condition $\Phi_\lambda \ge 0$ becomes just that $\lambda$, which we replace by $p$ to display the fact that it is a momentum vector, lies in the closure of the forward light cone. Thus the orbits of imaginary mass are excluded by supersymmetry (Theorem 6). Our approach enables us to handle super particles with infinite spin in the same manner as those with finite spin because of the result that the odd operators of the little group are bounded (Lemma 10). Finally, in §4.4 we give the explicit determination of $\kappa_p$ when the spacetime has arbitrary dimension $D \ge 4$ and we have $N$-extended supersymmetry. The results in the physical literature are in general only for $D = 4$.

The literature on supersymmetric representation theory is very extensive. We have been particularly influenced by [SS74, Fer01, Fer03, FSZ81]. In [DP85, DP86, DP87]
Dobrev et al. discuss representations of superconformal groups induced from a maximal parabolic subgroup. The paper [SS74] gave the structure of $\kappa_p$ for $N = 1$ supersymmetry, while [FSZ81] gave a very complete treatment of the structure of $\kappa_p$ in the case of extended supersymmetry when $D = 4$, including the case of central charges.

2. Super Lie Groups and Their Unitary Representations

2.1. Super Hilbert spaces. All sesquilinear forms are linear in the first argument and conjugate linear in the second. We use the usual terminology of super geometry as in [DM99, Var04]. A super Hilbert space (SHS) is a super vector space $\mathcal{H} = \mathcal{H}_0 \oplus \mathcal{H}_1$ over $\mathbb{C}$ with a scalar product $(\cdot\,,\cdot)$ such that $\mathcal{H}$ is a Hilbert space under $(\cdot\,,\cdot)$, and $\mathcal{H}_i$ ($i = 0, 1$) are mutually orthogonal closed linear subspaces. If we define
$$\langle x, y\rangle = \begin{cases} 0 & \text{if } x \text{ and } y \text{ are of opposite parity} \\ (x, y) & \text{if } x \text{ and } y \text{ are even} \\ i(x, y) & \text{if } x \text{ and } y \text{ are odd} \end{cases}$$
then $\langle x, y\rangle$ is an even super Hermitian form with
$$\langle y, x\rangle = (-1)^{p(x)p(y)}\,\overline{\langle x, y\rangle}, \qquad \langle x, x\rangle > 0 \ (x \neq 0 \text{ even}), \qquad i^{-1}\langle x, x\rangle > 0 \ (x \neq 0 \text{ odd}).$$
If $T$ ($\mathcal{H} \to \mathcal{H}$) is a bounded linear operator, we denote by $T^*$ its Hilbert space adjoint and by $T^\dagger$ its super adjoint given by $\langle Tx, y\rangle = (-1)^{p(T)p(x)}\langle x, T^\dagger y\rangle$. Clearly $T^\dagger$ is bounded, $p(T) = p(T^\dagger)$, and $T^\dagger = T^*$ or $-iT^*$ according as $T$ is even or odd. For unbounded $T$ we define $T^\dagger$ in terms of $T^*$ by the above formula. These definitions are equally consistent if we use $-i$ in place of $i$. But our convention is as above.

2.2. SUSY quantum mechanics. In SUSY quantum mechanics in a SHS $\mathcal{H}$, it is usual to stipulate that the Hamiltonian $H = X^2$, where $X$ is an odd operator [Wit82]; it is customary to argue that this implies that $H \ge 0$ (positivity of energy); but this is true only if we know that $H$ is essentially self adjoint on the domain of $X^2$. We shall now prove two lemmas which clarify this situation and will play a crucial role when we consider systems with a super Lie group of symmetries.

If $A$ is a linear operator on $\mathcal{H}$, we denote by $D(A)$ its domain. We always assume that $D(A)$ is dense in $\mathcal{H}$, and then refer to $A$ as densely defined. We write $A \prec B$ if $D(A) \subset D(B)$ and $B$ restricts to $A$ on $D(A)$; $A$ is symmetric iff $A \prec A^*$, and then $A$ has a closure $\bar{A}$. $A \prec B \Rightarrow B^* \prec A^*$. If $A$ is symmetric we say that it is essentially self adjoint if $\bar{A}$ is self adjoint; in this case $A^* = \bar{A}$. If $A$ is symmetric and $B$ is a symmetric extension of $A$, then $A \prec B \prec A^*$; in fact $A \prec A^*$ and $A \prec B \prec B^*$, and so $B^* \prec A^*$ and $A \prec B \prec B^* \prec A^*$. If $A$ is self adjoint and $L \subset D(A)$, we say that $L$ is a core for $A$ if $A$ is the closure of its restriction to $L$. A vector $\psi \in \mathcal{H}$ is analytic for a symmetric operator $H$ if $\psi \in D(H^n)$ for all $n$ and the series $\sum_n t^n (n!)^{-1}\|H^n\psi\| < \infty$ for some $t > 0$. It is a well known result of Nelson [Nel59] that if $D \subset D(H)$ and contains a dense set of analytic vectors, then $H$ is essentially self adjoint on $D$. In this case $\psi \in D(H)$ is analytic for $H$ if and only if $t \longmapsto e^{itH}\psi$ is analytic in $t \in \mathbb{R}$. If $A$ is self adjoint, then $A^2$, defined on the domain $D(A^2) = \{\psi \mid \psi, A\psi \in D(A)\}$, is self adjoint; this is well known and follows easily from the spectral theorem.
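The two formal computations behind §§2.1 and 2.2 can be written out explicitly. The following is only a sketch under stated assumptions: in the first part $T$ is taken to be bounded and odd, and in the second part $X$ is taken to be symmetric with $\psi \in D(X^2)$. Nothing here addresses closures or cores, which is precisely the point the two lemmas of this subsection are meant to settle.

```latex
% Part 1: the super adjoint of a bounded odd operator T.
% Recall <u,v> = (u,v) for even u,v and <u,v> = i(u,v) for odd u,v,
% and that (.,.) is conjugate linear in the second argument.
\begin{align*}
\text{$x$ even, $y$ odd:}\quad
  \langle Tx, y\rangle &= i\,(Tx, y) = i\,(x, T^{*}y) = (x, -i\,T^{*}y), \\
  (-1)^{p(T)p(x)}\langle x, T^{\dagger}y\rangle &= (x, T^{\dagger}y), \\[4pt]
\text{$x$ odd, $y$ even:}\quad
  \langle Tx, y\rangle &= (Tx, y) = (x, T^{*}y), \\
  (-1)^{p(T)p(x)}\langle x, T^{\dagger}y\rangle &= -\,i\,(x, T^{\dagger}y)
  \;\Longrightarrow\; (x, T^{\dagger}y) = i\,(x, T^{*}y) = (x, -i\,T^{*}y).
\end{align*}
% In both cases T^\dagger = -i T^* on the relevant component.

% Part 2: formal positivity of H = X^2 for symmetric X.
\begin{align*}
(X^{2}\psi, \psi) = (X\psi, X\psi) = \|X\psi\|^{2} \ \ge\ 0
\qquad (\psi \in D(X^{2})).
\end{align*}
% This gives positivity of the quadratic form on D(X^2) only; it yields
% H >= 0 for a self adjoint H once D(X^2) is known to be a core for H.
```

Part 2 shows why the customary argument is incomplete: the inequality is established on $D(X^2)$ alone, and passing to the self adjoint operator requires control of the domain.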
Lemma 1. Let $H$ be a self adjoint operator on $\mathcal{H}$ and $U(t) = e^{itH}$ the corresponding one parameter unitary group. Let $B \subset D(H)$ be a dense $U$-invariant linear subspace. We then have the following:
(i) $B$ is a core for $H$.
(ii) Let $X$ be a symmetric operator with $B \subset D(X)$ such that $XB \subset D(X)$ and $X^2|_B = H|_B$. Then $X|_B$ is essentially self adjoint, $\overline{X|_B} = \bar{X}$ and $\bar{X}^2 = H$. In particular, $H \ge 0$, $D(H) \subset D(\bar{X})$.
Finally, these results are valid if we only assume that $B$ is invariant under $H$ and contains a dense set of analytic vectors.

Proof. Let $H_1 = H|_B$. We must show that if $L(\lambda)$ ($\lambda \in \mathbb{C}$) is the subspace of $\psi$ such that $H_1^*\psi = \lambda\psi$, then $L(\lambda) = 0$ if $\Im(\lambda) \neq 0$. Now $L(\lambda)$ is a closed subspace. Moreover, as $H_1$ is invariant under $U$, so is $H_1^*$ and so $L(\lambda)$ is invariant under $U$ also. So the vectors in $L(\lambda)$ that are $C^\infty$ for $U$ are dense in $L(\lambda)$ and so it is enough to prove that 0 is the only $C^\infty$ vector in $L(\lambda)$. But $H = H^* \prec H_1^*$ while the $C^\infty$ vectors for $U$ are all in $D(H)$, and so if $\psi$ is a $C^\infty$ vector in $L(\lambda)$, $H\psi = H_1^*\psi = \lambda\psi$. This is a contradiction since $H$ is self adjoint and so all its eigenvalues are real. This proves (i).

Let $X_1 = X|_B$. Clearly, $X_1$ is symmetric on $B$. It is enough to show that $X_1$ is essentially self adjoint and $\bar{X}_1^{\,2} = H$, since in this case $X_1 \prec X \prec X_1^* = \bar{X}_1$ and hence $\bar{X} = \bar{X}_1$. We have $X_1^2 = H_1$. So $H \ge 0$ on $B$ and hence $H \ge 0$ by (i). Again it is a question of showing that for $\lambda \in \mathbb{C}$ with $\Im(\lambda) \neq 0$, we must have $M(\lambda) = 0$, where $M(\lambda)$ is the eigenspace for $X_1^*$ for the eigenvalue $\lambda$. Let $\psi \in M(\lambda)$. Now, for $\varphi \in B$,
$$(X_1^2\varphi, \psi) = (X_1\varphi, X_1^*\psi) = \bar{\lambda}\,(X_1\varphi, \psi) = \bar{\lambda}^2(\varphi, \psi) = (\varphi, \lambda^2\psi).$$
Hence $\psi \in D((X_1^2)^*)$ and $(X_1^2)^*\psi = \lambda^2\psi$. But $X_1^2 = H_1$ and so $(X_1^2)^* = H_1^* = H$ by (i). So $H\psi = \lambda^2\psi$. Hence $\lambda^2$ is real and $\ge 0$. This contradicts the fact that $\Im(\lambda) \neq 0$. Furthermore, $X_1^2 \prec \bar{X}_1^{\,2}$ and so $\bar{X}_1^{\,2} = (\bar{X}_1^{\,2})^* \prec (X_1^2)^* = H_1^* = H$. On the other hand, as $\bar{X}_1^{\,2}$ is closed, $H = \bar{H}_1 = \overline{X_1^2} \prec \bar{X}_1^{\,2}$. So $H = \bar{X}_1^{\,2} = \bar{X}^2$. This means that $D(H) \subset D(\bar{X})$.

Finally, let us assume that $HB \subset B$ and that $B$ contains a dense set of analytic vectors for $H$. Clearly $B$ is a core for $H$. If $\psi \in B$ is analytic for $H$ we have $X^{2n}\psi = H^n\psi \in B$ and $X^{2n+1}\psi \in D(X)$ by assumption, and
$$\|X^n\psi\|^2 = |(H^n\psi, \psi)| \le \|\psi\|\,\|H^n\psi\| \le M^n n!$$
for some $M > 0$ and all $n$. Consequently $\sum_n t^n (n!)^{-1}\|X^n\psi\| \le \sum_n (t\sqrt{M})^n (n!)^{-1/2} < \infty$ for every $t > 0$. Thus $\psi$ is analytic for $X$ and so its essential self adjointness is a consequence of the theorem of Nelson. The rest of the argument is unchanged.

Lemma 2. Let $A$ be a self adjoint operator in $\mathcal{H}$. Let $M$ be a smooth (resp. analytic) manifold and $f$ ($M \longrightarrow \mathcal{H}$) a smooth (resp. analytic) map. We assume that (i) $f(M) \subset D(A^2)$ and (ii) $A^2 f : m \longmapsto A^2 f(m)$ is a smooth (resp. analytic) map of $M$ into $\mathcal{H}$. Then $Af : m \longmapsto Af(m)$ is a smooth (resp. analytic) map of $M$ into $\mathcal{H}$. Moreover, if $E$ is any smooth differential operator on $M$, $(Ef)(m) \in D(A^2)$ for all $m \in M$, and
$$E(A^2 f) = A^2 Ef, \qquad E(Af) = AEf.$$
Proof. It is standard that if $g$ ($M \longrightarrow \mathcal{H}$) is smooth (resp. analytic) and $L$ is a bounded linear operator on $\mathcal{H}$, then $Lg$ is a smooth (resp. analytic) map. We have
$$A\psi = A(I + A^2)^{-1}(I + A^2)\psi, \qquad \psi \in D(A^2).$$
Moreover $A(I + A^2)^{-1}$ is a bounded operator. Now $(I + A^2)f$ is smooth (resp. analytic) and so it is immediate from the above that $Af$ is smooth (resp. analytic).

For the last part we assume that $M$ is an open set in $\mathbb{R}^n$ since the result is clearly local. Let $x^i$ ($1 \le i \le n$) be the coordinates and let $\partial_j = \partial/\partial x^j$, $\partial^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$, where $\alpha = (\alpha_1, \ldots, \alpha_n)$. It is enough to prove that
$$(\partial^\alpha f)(m) \in D(A^2), \qquad A^2\partial^\alpha f = \partial^\alpha(A^2 f), \qquad A\partial^\alpha f = \partial^\alpha(Af).$$
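The boundedness of $A(I + A^2)^{-1}$ invoked at the start of this proof can be checked from the spectral theorem. The sketch below assumes the spectral resolution $A = \int_{\mathbb{R}} \lambda\, dP(\lambda)$; the symbol $P$ for the spectral measure is notation introduced here for illustration and does not appear elsewhere in the text.

```latex
% Functional calculus for the bounded Borel function lambda/(1+lambda^2).
% P denotes the (assumed) spectral measure of the self adjoint operator A.
\begin{align*}
A(I+A^{2})^{-1} &= \int_{\mathbb{R}} \frac{\lambda}{1+\lambda^{2}}\, dP(\lambda), \\
\bigl\|A(I+A^{2})^{-1}\bigr\| &\le \sup_{\lambda\in\mathbb{R}} \frac{|\lambda|}{1+\lambda^{2}} = \frac12
  \qquad\text{(since } 1+\lambda^{2} \ge 2|\lambda|\text{, with equality at } \lambda = \pm 1\text{)}, \\
\|A\psi\| &\le \tfrac12\,\bigl\|(I+A^{2})\psi\bigr\| \qquad (\psi \in D(A^{2})).
\end{align*}
```

Applied to $\psi = f(m)$, this exhibits $Af$ as the image of the smooth (resp. analytic) map $(I + A^2)f$ under a bounded operator, which is all the first part of the proof uses.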
We begin with a simple observation. Since $(A^2\psi, \psi) = \|A\psi\|^2$ for all $\psi \in D(A^2)$, it follows that whenever $\psi_n \in D(A^2)$ and $(\psi_n)$ and $(A^2\psi_n)$ are Cauchy sequences, then $(A\psi_n)$ is also a Cauchy sequence, because $\|A(\psi_n - \psi_m)\|^2 = (A^2(\psi_n - \psi_m), \psi_n - \psi_m) \le \|A^2(\psi_n - \psi_m)\|\,\|\psi_n - \psi_m\|$; moreover, if $\psi = \lim_n \psi_n$, then $\psi \in D(A^2)$ and $A^2\psi = \lim_n A^2\psi_n$, $A\psi = \lim_n A\psi_n$. This said, we shall prove the above formulae by induction on $|\alpha| = \alpha_1 + \cdots + \alpha_n$. Assume them for $|\alpha| \le r$ and fix $j$, $1 \le j \le n$. Let $g = \partial^\alpha f$, $|\alpha| = r$. Let
$$g_h(x) = \frac{1}{h}\bigl(g(x^1, \ldots, x^j + h, \ldots, x^n) - g(x^1, \ldots, x^n)\bigr) \qquad (h \text{ is in the } j\text{th place}).$$
Then, as $h \to 0$, $g_h(x) \to \partial_j\partial^\alpha f(x)$ while $A^2 g_h(x) = (\partial^\alpha A^2 f)_h(x) \to \partial_j\partial^\alpha A^2 f(x)$, and $A g_h(x) = (\partial^\alpha Af)_h(x) \to \partial_j\partial^\alpha Af(x)$. From the observation made above we have $\partial_j\partial^\alpha f(x) \in D(A^2)$ and $A^2$ and $A$ map it respectively into $\partial_j\partial^\alpha A^2 f(x)$ and $\partial_j\partial^\alpha Af(x)$.

Definition 1. For self adjoint operators $L$, $X$ with $L$ bounded, we write $L \leftrightarrow X$ to mean that $L$ commutes with the spectral projections of $X$.

Lemma 3. Let $X$ be a self adjoint operator and $B$ a dense subspace of $\mathcal{H}$ which is a core for $X$ such that $XB \subset B$. If $L$ is a bounded self adjoint operator such that $LB \subset B$, then the following are equivalent:
(i) $LX = XL$ on $B$;
(ii) $LX = XL$ on $D(X)$ (this carries with it the inclusion $LD(X) \subset D(X)$);
(iii) $L \leftrightarrow X$.
In this case $e^{itL}X = Xe^{itL}$ for all $t \in \mathbb{R}$.

Proof. (i) $\Longleftrightarrow$ (ii). Let $b \in D(X)$. Then there is a sequence $(b_n)$ in $B$ such that $b_n \to b$, $Xb_n \to Xb$. Then $XLb_n = LXb_n \to LXb$. Since $Lb_n \to Lb$ we infer that $Lb \in D(X)$ and $XLb = LXb$. This proves (i) $\Rightarrow$ (ii). The reverse implication is trivial.

(ii) $\Rightarrow$ (iii). We have $L^n Xb = XL^n b$ for all $b \in D(X)$, $n \ge 1$. So $e^{itL}Xb = \sum_n ((it)^n/n!)\,XL^n b$. If $s_N = \sum_{n \le N} ((it)^n/n!)\,L^n b$, then $s_N \to e^{itL}b$, $Xs_N \to e^{itL}Xb$. So $e^{itL}b \in D(X)$ and $Xe^{itL}b = e^{itL}Xb$. If $U(t) = e^{itL}$, this means that $U(t)XU(t)^{-1} = X$ and so, by the uniqueness of the spectral resolution of $X$, $U(t)$ commutes with the spectral projections of $X$. But then $L \leftrightarrow X$.

(iii) $\Rightarrow$ (i). Under (iii) we have $U(t)XU(t)^{-1} = X$ or $XU(t)b = U(t)Xb$ for $b \in D(X)$, $U(t)b$ being in $D(X)$ for all $t$. Let $b_t = (it)^{-1}(U(t)b - b)$. Then $Xb_t = (it)^{-1}(U(t)Xb - Xb)$. Hence, as $t \to 0$, $b_t \to Lb$ while $Xb_t \to LXb$. Hence $Lb \in D(X)$ and $XLb = LXb$. Thus we have (ii), hence (i).
2.3. Unitary representations of super Lie groups. We take the point of view [DM99] that a super Lie group (SLG) is a super Harish-Chandra pair, that is, a pair $(G_0, \mathfrak{g})$, where $G_0$ is a classical Lie group, $\mathfrak{g}$ is a super Lie algebra which is a $G_0$-module, $\mathrm{Lie}(G_0) = \mathfrak{g}_0$, and the action of $\mathfrak{g}_0$ on $\mathfrak{g}$ is the differential of the action of $G_0$. The notion of morphisms between two super Lie groups in the above sense is the obvious one, from which it is easy to see what is meant by a finite dimensional representation of a SLG $(G_0, \mathfrak{g})$: it is a triple $(\pi_0, \pi, V)$, where $\pi_0$ is an even representation of $G_0$ in a super vector space $V$ of finite dimension over $\mathbb{C}$, i.e., a representation such that $\pi_0(g)$ is even for all $g \in G_0$; $\pi$ is a representation of the super Lie algebra $\mathfrak{g}$ in $V$ such that $\pi|_{\mathfrak{g}_0} = d\pi_0$; and
$$\pi(gX) = \pi_0(g)\pi(X)\pi_0(g)^{-1}, \qquad g \in G_0,\ X \in \mathfrak{g}_1.$$
If $V$ is a SHS and $\pi(X)^\dagger = -\pi(X)$ for all $X \in \mathfrak{g}$, we say that $(\pi_0, \pi, V)$ is a unitary representation (UR) of the SLG $(G_0, \mathfrak{g})$. The condition on $\pi$ is equivalent to saying that $\pi_0$ is a unitary representation of $G_0$ in the usual sense and $\pi(X)^* = -i\pi(X)$ for all $X \in \mathfrak{g}_1$. It is then clear that a finite dimensional UR of $(G_0, \mathfrak{g})$ is a triple $(\pi_0, \pi, V)$, where
(a) $\pi_0$ is an even unitary representation of $G_0$ in a SHS $V$;
(b) $\pi$ is a linear map of $\mathfrak{g}_1$ into the space $\mathfrak{gl}(V)_1$ of odd endomorphisms of $V$ with $\pi(X)^* = -i\pi(X)$ for all $X \in \mathfrak{g}_1$;
(c) $d\pi_0([X, Y]) = \pi(X)\pi(Y) + \pi(Y)\pi(X)$ for $X, Y \in \mathfrak{g}_1$;
(d) $\pi(g_0 X) = \pi_0(g_0)\pi(X)\pi_0(g_0)^{-1}$ for $X \in \mathfrak{g}_1$, $g_0 \in G_0$.
Let
$$\zeta = e^{-i\pi/4}, \qquad \rho(X) = \zeta\pi(X).$$
Then, we may replace $\pi(X)$ by $\rho(X)$ for $X \in \mathfrak{g}_1$; the condition (b) becomes the requirement that $\rho(X)$ is self adjoint for all $X \in \mathfrak{g}_1$, while the commutation rule in condition (c) becomes
$$-i\,d\pi_0([X, Y]) = \rho(X)\rho(Y) + \rho(Y)\rho(X), \qquad X, Y \in \mathfrak{g}_1.$$
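The passage from $\pi$ to $\rho$ rests on two elementary identities for $\zeta$. The following is a sketch of the finite dimensional computation, using only conditions (b) and (c) above; no domain questions arise in this setting.

```latex
% zeta = e^{-i pi/4}: its conjugate and its square.
\begin{align*}
\bar\zeta &= e^{i\pi/4}, \qquad \zeta^{2} = e^{-i\pi/2} = -i, \\[4pt]
% Condition (b): pi(X)^* = -i pi(X) makes rho(X) self adjoint.
\rho(X)^{*} &= \bar\zeta\,\pi(X)^{*}
            = e^{i\pi/4}\,(-i)\,\pi(X)
            = e^{i\pi/4}e^{-i\pi/2}\,\pi(X)
            = \zeta\,\pi(X) = \rho(X), \\[4pt]
% Condition (c): the anticommutator picks up the factor zeta^2 = -i.
\rho(X)\rho(Y) + \rho(Y)\rho(X)
  &= \zeta^{2}\bigl(\pi(X)\pi(Y) + \pi(Y)\pi(X)\bigr)
   = -i\,d\pi_{0}([X,Y]), \\[4pt]
% Special case Y = X.
\rho(X)^{2} &= -\tfrac{i}{2}\,d\pi_{0}([X,X]).
\end{align*}
```

Since $\rho(X)$ is self adjoint, the last line shows that $-\tfrac{i}{2}\,d\pi_0([X, X]) = \rho(X)^2 \ge 0$ in the finite dimensional case.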
If we want to extend this definition to the infinite dimensional context it is necessary to take into account the fact that the $\pi(X)$ for $X \in \mathfrak{g}_1$ will in general be unbounded; indeed, from (c) above we find that $d\pi_0([X, X]) = 2\pi(X)^2$, and as $d\pi_0$ typically takes elements of $\mathfrak{g}_0$ into unbounded operators, the $\pi(X)$ cannot be bounded. So the concept of a UR of a SLG in the infinite dimensional case must be formulated with greater care to take into account the domains of definition of the $\pi(X)$ for $X \in \mathfrak{g}_1$. In the physics literature this aspect is generally ignored. We shall prove below that contrary to what one may expect, the domain restrictions can be formulated with great freedom, and the formal and rigorous pictures are essentially the same. In particular, the concept of a UR of a super Lie group is a very stable one and allows great flexibility of handling.

If $V$ is a super vector space (not necessarily finite dimensional), we write $\mathrm{End}(V)$ for the super algebra of all endomorphisms of $V$. If $\pi_0$ is a unitary representation of $G_0$ in a Hilbert space $\mathcal{H}$, we write $C^\infty(\pi_0)$ for the subspace of differentiable vectors in $\mathcal{H}$ for $\pi_0$. We denote by $C^\omega(\pi_0)$ the subspace of analytic vectors of $\pi_0$. Here we recall that a vector $v \in \mathcal{H}$ is called a differentiable vector (resp. analytic vector) for $\pi_0$ if the map $g \longmapsto \pi_0(g)v$ is smooth (resp. analytic). If $\mathcal{H}$ is a SHS and $\pi_0$ is even, then $C^\infty(\pi_0)$ and $C^\omega(\pi_0)$ are $\pi_0$-invariant dense linear super subspaces. We also need the following fact
which is standard but we shall give a partial proof because the argument will be used again later.

Lemma 4. $C^\infty(\pi_0)$ and $C^\omega(\pi_0)$ are stable under $d\pi_0(\mathfrak{g}_0)$. For any $Z \in \mathfrak{g}_0$, $i\,d\pi_0(Z)$ is essentially self adjoint both on $C^\infty(\pi_0)$ and on $C^\omega(\pi_0)$; moreover, for any $Z_1, \ldots, Z_r \in \mathfrak{g}_0$ and $\psi \in C^\infty(\pi_0)$ (resp. $\psi \in C^\omega(\pi_0)$) the map $g \longmapsto d\pi_0(Z_1) \cdots d\pi_0(Z_r)\pi_0(g)\psi$ is $C^\infty$ (resp. analytic).

Proof. We prove only the second statement. That $i\,d\pi_0(Z)$ is essentially self adjoint on $C^\infty(\pi_0)$ and on $C^\omega(\pi_0)$ is immediate from Lemma 1. Using the adjoint representation we have, for any $Z \in \mathfrak{g}_0$, $gZ = \sum_i c_i(g)W_i$ for $g \in G_0$, where the $c_i$ are analytic functions on $G_0$ and $W_i \in \mathfrak{g}_0$. Hence, as $d\pi_0(Z)\pi_0(g) = \pi_0(g)\,d\pi_0(g^{-1}Z)$, we can write $d\pi_0(Z_1) \cdots d\pi_0(Z_r)\pi_0(g)\psi$ as a linear combination with analytic coefficients of $\pi_0(g)\,d\pi_0(R_1) \cdots d\pi_0(R_r)\psi$ for suitable $R_j \in \mathfrak{g}_0$. The result is then immediate.

Definition 2. A unitary representation (UR) of a SLG $(G_0, \mathfrak{g})$ is a triple $(\pi_0, \rho, \mathcal{H})$, $\mathcal{H}$ a SHS, with the following properties:
(a) $\pi_0$ is an even UR of $G_0$ in $\mathcal{H}$;
(b) $\rho$ ($X \longmapsto \rho(X)$) is a linear map of $\mathfrak{g}_1$ into $\mathrm{End}(C^\infty(\pi_0))_1$ such that
(i) $\rho(g_0 X) = \pi_0(g_0)\rho(X)\pi_0(g_0)^{-1}$ ($X \in \mathfrak{g}_1$, $g_0 \in G_0$),
(ii) $\rho(X)$ with domain $C^\infty(\pi_0)$ is symmetric for all $X \in \mathfrak{g}_1$,
(iii) $-i\,d\pi_0([X, Y]) = \rho(X)\rho(Y) + \rho(Y)\rho(X)$ ($X, Y \in \mathfrak{g}_1$) on $C^\infty(\pi_0)$.

Proposition 1. If $(\pi_0, \rho, \mathcal{H})$ is a UR of the SLG $(G_0, \mathfrak{g})$, then $\rho(X)$ with domain $C^\infty(\pi_0)$ is essentially self adjoint for all $X \in \mathfrak{g}_1$. Moreover
$$\pi : X_0 + X_1 \longmapsto d\pi_0(X_0) + \zeta^{-1}\rho(X_1) \qquad (X_i \in \mathfrak{g}_i)$$
is a representation of $\mathfrak{g}$ in $C^\infty(\pi_0)$.

Proof. Let $Z = (1/2)[X, X]$. We apply Lemma 1 with $U(t) = \pi_0(\exp tZ) = e^{itH}$, $B = C^\infty(\pi_0)$. Then $H = -i\,d\pi_0(Z) = \rho(X)^2$ on $C^\infty(\pi_0)$. We conclude that $H$ and $\rho(X)$ are essentially self adjoint on $C^\infty(\pi_0)$ and that $H = \overline{\rho(X)}^{\,2}$; in particular, $H \ge 0$. For the second assertion the only non-obvious statement is that for $Z \in \mathfrak{g}_0$, $X \in \mathfrak{g}_1$, $\psi \in C^\infty(\pi_0)$,
$$\rho([Z, X])\psi = d\pi_0(Z)\rho(X)\psi - \rho(X)\,d\pi_0(Z)\psi.$$
Let $g_t = \exp(tZ)$ and let $(X_k)$ be a basis for $\mathfrak{g}_1$. Then $gX = \sum_k c_k(g)X_k$, where the $c_k$ are smooth functions on $G_0$. So
$$\pi_0(g_t)\rho(X)\psi = \rho(g_t X)\pi_0(g_t)\psi = \sum_k c_k(g_t)\rho(X_k)\pi_0(g_t)\psi.$$
Now $g \longmapsto \pi_0(g)\psi$ is a smooth map into $C^\infty(\pi_0)$. On the other hand, if $H_k = -(1/2)[X_k, X_k]$, we have $i\,d\pi_0(H_k) = \rho(X_k)^2$ on $C^\infty(\pi_0)$, so $\pi_0(g)\psi \in D(\rho(X_k)^2)$, and by Lemma 4, $\rho(X_k)^2\pi_0(g)\psi = i\,d\pi_0(H_k)\pi_0(g)\psi$ is smooth in $g$. Lemma 2 now applies and shows that $\rho(X_k)\pi_0(g_t)\psi$ is smooth in $t$ and
$$\left.\frac{d}{dt}\right|_{t=0} \rho(X_k)\pi_0(g_t)\psi = \rho(X_k)\,d\pi_0(Z)\psi.$$
Hence
$$d\pi_0(Z)\rho(X)\psi = \sum_k c_k(1)\rho(X_k)\,d\pi_0(Z)\psi + \sum_k (Zc_k)(1)\rho(X_k)\psi.$$
Since
$$[Z, X] = \sum_k (Zc_k)(1)X_k, \qquad X = \sum_k c_k(1)X_k,$$
the right side is equal to $\rho(X)\,d\pi_0(Z)\psi + \rho([Z, X])\psi$.

Remark 1. For $Z$ such that $\exp tZ$ represents time translation, $H$ is the energy operator, and so is positive in the supersymmetric model.

We shall now show that one may replace $C^\infty(\pi_0)$ by a more or less arbitrary domain without changing the content of the definition. This shows that the concept of a UR of a SLG is a viable one even in the infinite dimensional context. Let us consider a system $(\pi_0, \rho, B, \mathcal{H})$ with the following properties:
(a) $B$ is a dense super linear subspace of $\mathcal{H}$ invariant under $\pi_0$ and $B \subset D(d\pi_0(Z))$ for all $Z \in [\mathfrak{g}_1, \mathfrak{g}_1]$;
(b) $(\rho(X))_{X \in \mathfrak{g}_1}$ is a set of linear operators such that:
(i) $\rho(X)$ is symmetric for all $X \in \mathfrak{g}_1$,
(ii) $B \subset D(\rho(X))$ for all $X \in \mathfrak{g}_1$,
(iii) $\rho(X)B_i \subset \mathcal{H}_{i+1 \ (\mathrm{mod}\ 2)}$ for all $X \in \mathfrak{g}_1$,
(iv) $\rho(aX + bY) = a\rho(X) + b\rho(Y)$ on $B$ for $X, Y \in \mathfrak{g}_1$ and $a, b$ scalars,
(v) $\pi_0(g)\rho(X)\pi_0(g)^{-1} = \rho(gX)$ on $B$ for all $g \in G_0$, $X \in \mathfrak{g}_1$,
(vi) $\rho(X)B \subset D(\rho(Y))$ for all $X, Y \in \mathfrak{g}_1$, and $-i\,d\pi_0([X, Y]) = \rho(X)\rho(Y) + \rho(Y)\rho(X)$ on $B$.

Proposition 2. Let $(\pi_0, \rho, B, \mathcal{H})$ be as above. We then have the following:
(a) For any $X \in \mathfrak{g}_1$, $\rho(X)$ is essentially self adjoint and $C^\infty(\pi_0) \subset D(\overline{\rho(X)})$.
(b) Let $\bar\rho(X) = \overline{\rho(X)}|_{C^\infty(\pi_0)}$ for $X \in \mathfrak{g}_1$. Then $(\pi_0, \bar\rho, \mathcal{H})$ is a UR of the SLG $(G_0, \mathfrak{g})$.
If $(\pi_0, \rho', \mathcal{H})$ is a UR of the SLG $(G_0, \mathfrak{g})$, such that $B \subset D(\overline{\rho'(X)})$ and $\overline{\rho'(X)}$ restricts to $\rho(X)$ on $B$ for all $X \in \mathfrak{g}_1$, then $\rho' = \bar\rho$.

Proof. Let $X \in \mathfrak{g}_1$. By assumption $B$ is invariant under the one parameter unitary group generated by $H = -(1/2)\,i\,d\pi_0([X, X])$ while $H = \rho(X)^2$ on $B$. So, by Lemma 1, $\rho(X)$ is essentially self adjoint, $\overline{\rho(X)} = \overline{\rho(X)|_B}$, $H = (\overline{\rho(X)})^2$, and $D(H) \subset D(\overline{\rho(X)})$. Since $C^\infty(\pi_0) \subset D(H)$, we have proved (a).

Let us now prove (b). If $a$ is scalar and $X \in \mathfrak{g}_1$, $\bar\rho(aX) = a\bar\rho(X)$ follows from item (iv) and the fact that $\overline{\rho(X)} = \overline{\rho(X)|_B}$. For the additivity of $\bar\rho$, let $X, Y \in \mathfrak{g}_1$. Then $\rho(X + Y)$ is essentially self adjoint and its closure restricts to $\bar\rho(X + Y)$ on $C^\infty(\pi_0)$. Then, viewing $\overline{\rho(X)} + \overline{\rho(Y)}$ as a symmetric operator defined on the intersection of the domains of the two operators (which includes $C^\infty(\pi_0)$), we see that $\overline{\rho(X)} + \overline{\rho(Y)}$ is
a symmetric extension of ρ(X)|_B + ρ(Y)|_B = ρ(X + Y)|_B. But as ρ(X + Y)|_B is essentially self adjoint, we have, by the remark made earlier, $\overline{\rho(X)} + \overline{\rho(Y)} \prec \overline{\rho(X+Y)|_B} = \overline{\rho(X+Y)}$. Restricting both of these operators to C^∞(π0) we find that $\bar\rho(X+Y) = \bar\rho(X) + \bar\rho(Y)$. From the relation π0(g)ρ(X)π0(g)^{-1} = ρ(gX) on B follows $\pi_0(g)\overline{\rho(X)}\pi_0(g)^{-1} = \overline{\rho(gX)}$. The key step is now to prove that for any X ∈ ᒄ1, $\bar\rho(X)$ maps C^∞(π0) into itself. Fix X ∈ ᒄ1, ψ ∈ C^∞(π0). Now
\[ \pi_0(g)\bar\rho(X)\psi = \bar\rho(gX)\,\pi_0(g)\psi. \]
So, writing gX = Σ_k c_k(g)X_k, where the c_k are smooth functions on G0 and (X_k) is a basis for ᒄ1, we have, remembering the linearity of $\bar\rho$ on C^∞(π0),
\[ \pi_0(g)\bar\rho(X)\psi = \sum_k c_k(g)\,\bar\rho(X_k)\,\pi_0(g)\psi. \]
It is thus enough to show that g ↦ $\bar\rho(X_k)$π0(g)ψ is smooth. If H_k = −[X_k, X_k]/2, we know from Lemma 4 that π0(g)ψ lies in D(H_k) and idπ0(H_k)π0(g)ψ = $\bar\rho(X_k)^2$π0(g)ψ is smooth in g. Lemma 2 now shows that $\bar\rho(X_k)$π0(g)ψ is smooth in g. It remains only to show that for X, Y ∈ ᒄ1 we have −idπ0([X, Y]) = $\bar\rho(X)\bar\rho(Y) + \bar\rho(Y)\bar\rho(X)$ on C^∞(π0). But the right side is $\bar\rho(X+Y)^2 - \bar\rho(X)^2 - \bar\rho(Y)^2$ while the left side is the restriction of (−i/2)dπ0([X + Y, X + Y]) + (i/2)dπ0([X, X]) + (i/2)dπ0([Y, Y]) to C^∞(π0), and so we are done. We must show the uniqueness of $\bar\rho$. Let ρ′ have the required properties also. Then ρ′(X) is essentially self adjoint on C^∞(π0) and B is a core for its closure, by Lemma 1. Hence ρ′(X) = $\bar\rho(X)$. The proof is complete.

We shall now prove a variant of the above result involving analytic vectors.

Proposition 3. (i) If (π0, ρ, H) is a UR of the SLG (G0, ᒄ), then ρ(X) maps C^ω(π0) into itself for all X ∈ ᒄ1, so that π, as in Proposition 1, is a representation of ᒄ in C^∞(π0). (ii) Let G0 be connected. Let π0 be an even unitary representation of G0 and B ⊂ C^ω(π0) a dense linear super subspace. Let π be a representation of ᒄ in B such that π(Z) ≺ dπ0(Z) for Z ∈ ᒄ0 and ρ(X) = ζπ(X) is symmetric for X ∈ ᒄ1. Then, for each X ∈ ᒄ1, ρ(X) is essentially self adjoint on B and C^∞(π0) ⊂ D($\overline{\rho(X)}$). If $\bar\rho(X)$ is the restriction of $\overline{\rho(X)}$ to C^∞(π0), then (π0, $\bar\rho$, H) is a UR of the SLG (G0, ᒄ) and is the unique one in the following sense: if (π0, ρ′, H) is a UR with B ⊂ D($\overline{\rho'(X)}$) and $\overline{\rho'(X)}\,|_B$ = ρ(X) for all X ∈ ᒄ1, then ρ′ = $\bar\rho$.

Proof. (i) This is proved as its C^∞ analogue in the proof of Proposition 2, using the analytic parts of Lemmas 2 and 4. (ii) The proof that ρ(X) for X ∈ ᒄ1 is essentially self adjoint with D($\overline{\rho(X)}$) ⊃ C^∞(π0) follows as before from (the analytic part of) Lemma 1. The same goes for the linearity of $\bar\rho$.
We shall now show that for X ∈ ᒄ1, g ∈ G0,
\[ \pi_0(g^{-1})\,\overline{\rho(X)}\,\pi_0(g) = \overline{\rho(g^{-1}X)}. \tag{1} \]
Write g^{-1}X = Σ_k c_k(g)X_k, where (X_k) is a basis for ᒄ1 and the c_k are analytic functions on G0. We begin by showing that for all ψ ∈ B,
\[ \overline{\rho(X)}\,\pi_0(g)\psi = \pi_0(g)\,\rho(g^{-1}X)\psi. \tag{2} \]
Now
\[ \pi_0(g)\,\rho(g^{-1}X)\psi = \sum_k c_k(g)\,\pi_0(g)\,\rho(X_k)\psi. \]
We argue as in Proposition 2 to conclude, using Lemmas 2 and 4, that the function ρ(X)π0 (g)ψ is analytic in g and its derivatives can be calculated explicitly. It is also clear that the right side is analytic in g since ρ(Xk )ψ ∈ B for all k. So, as G0 is connected, it is enough to prove that the two sides in (2) have all derivatives equal at g = 1. This comes down to showing that for any integer n ≥ 0 and any Z ∈ ᒄ0 ,
\[ \rho(X)\,d\pi_0(Z)^n\psi = \sum_{k,r}\binom{n}{r}(Z^r c_k)(1)\,d\pi_0(Z)^{n-r}\rho(X_k)\psi. \tag{3} \]
Let λ be the representation of G0 on ᒄ1 and write λ again for dλ. Then, taking gt = exp(tZ),
\[ \lambda(g_t^{-1})(X) = \sum_k c_k(g_t)\,X_k, \]
from which we get, on differentiating n times with respect to t at t = 0,
\[ \lambda(-Z)^r(X) = \sum_k (Z^r c_k)(1)\,X_k. \]
Hence the right side of (3) becomes
\[ \sum_r \binom{n}{r}\,d\pi_0(Z)^{n-r}\rho(\lambda(-Z)^r(X))\psi. \]
On the other hand, from the fact that π is a representation of ᒄ in B we get
\[ \rho(X)\,d\pi_0(Z) = d\pi_0(Z)\,\rho(X) + \rho(\lambda(-Z)(X)) \]
on B. Iterating this we get, on B,
\[ \rho(X)\,d\pi_0(Z)^n = \sum_r \binom{n}{r}\,d\pi_0(Z)^{n-r}\rho(\lambda(-Z)^r(X)) \]
which gives (2). But then (1) follows from (2) by a simple closure argument. Using (1), the proof that ρ(X) maps C ∞ (π0 ) into itself is the same as Proposition 2. The proof of the relation −idπ0 ([X, Y ]) = ρ(X)ρ(Y ) + ρ(Y )ρ(X) for X, Y ∈ ᒄ1 is also the same. The proof is complete.
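As a concrete check of the iteration step in the proof of Proposition 3 (ii), the case n = 2 can be written out in full. The sketch below uses only the commutation rule ρ(X)dπ0(Z) = dπ0(Z)ρ(X) + ρ(λ(−Z)(X)) on B; the abbreviations D = dπ0(Z) and X^{(r)} = λ(−Z)^r(X) are introduced here for readability and are not notation from the text.
\[
\begin{aligned}
\rho(X)D^2 &= \bigl(D\rho(X) + \rho(X^{(1)})\bigr)D \\
&= D\bigl(D\rho(X) + \rho(X^{(1)})\bigr) + \bigl(D\rho(X^{(1)}) + \rho(X^{(2)})\bigr) \\
&= D^2\rho(X) + 2D\rho(X^{(1)}) + \rho(X^{(2)}).
\end{aligned}
\]
The coefficients 1, 2, 1 are the binomial coefficients for n = 2; the general case follows in the same way by induction on n, using Pascal's rule.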
2.4. The category of unitary representations of a super Lie group. If Π = (π0, ρ, H) and Π′ = (π0′, ρ′, H′) are two UR's of a SLG (G0, ᒄ), a morphism A : Π → Π′ is an even bounded linear operator from H to H′ such that A intertwines π0, ρ and π0′, ρ′; notice that as soon as A intertwines π0 and π0′, it maps C^∞(π0) into C^∞(π0′), and so the requirement that it intertwine ρ and ρ′ makes sense. An isomorphism is then a morphism A such that A^{-1} is a bounded operator; in this case A is a linear isomorphism of C^∞(π0) with C^∞(π0′) intertwining ρ and ρ′. If A is unitary we then speak of unitary equivalence of Π and Π′. It is easily checked that equivalence implies unitary equivalence, just as in the classical case. Π′ is a subrepresentation of Π if H′ is a closed graded subspace of H invariant under π0 and ρ, and π0′ (resp. ρ′) is the restriction of π0 (resp. ρ) to H′ (resp. C^∞(π0) ∩ H′). The UR Π is said to be irreducible if there is no proper nonzero closed graded subspace H′ that defines a subrepresentation. If Π′ is a nonzero proper subrepresentation of Π, and H″ is H′^⊥, it follows from the self adjointness of ρ(X) for X ∈ ᒄ1 that H″ ∩ C^∞(π0) is invariant under all ρ(X) (X ∈ ᒄ1); then the restrictions of π0, ρ to H″ define a subrepresentation Π″ such that Π = Π′ ⊕ Π″ in an obvious manner.

Lemma 5. Π is irreducible if and only if Hom(Π, Π) = C.

Proof. If Π splits as above, then the orthogonal projection H → H′ is a nonscalar element of Hom(Π, Π). Conversely, suppose that Π is irreducible and A ∈ Hom(Π, Π). Then A* ∈ Hom(Π, Π) also and so, to prove that A is a scalar we may suppose that A is self adjoint. Let P be the spectral measure of A. Clearly all the P(E) are even. Then P commutes with π0 and so P(E) leaves C^∞(π0) invariant for all Borel sets E. Moreover, by Lemma 3, the relation Aρ(X) = ρ(X)A on C^∞(π0) implies that P(E) ↔ ρ(X) for all E and X ∈ ᒄ1, and hence that P(E)ρ(X) = ρ(X)P(E) on C^∞(π0) for all E, X. The range of P(E) thus defines a subrepresentation and so P(E) = 0 or I.
Since this is true for all E, A must be a scalar.

Lemma 6. Let (R0, ᒏ) be a SLG and (θ, ρ^θ) a UR of it, in a Hilbert space L. Let P^X be the spectral measure of ρ^θ(X), X ∈ ᒏ1. Then the following properties of a closed linear subspace M of L are equivalent: (i) M is stable under θ and M^∞ = C^∞(θ) ∩ M is stable under all ρ^θ(X) (X ∈ ᒏ1); (ii) M is stable under θ and all the spectral projections P^X_F (Borel F ⊂ R). In particular, (θ, ρ^θ) is irreducible if and only if L is irreducible under θ and all P^X_F.

Proof. Follows from Lemma 3 applied to the orthogonal projection L : L → M. Indeed, suppose that M is a closed linear subspace of L stable under θ. Then L maps L^∞ onto M^∞. By Lemma 3, L commutes with ρ^θ(X) on L^∞ if and only if L ↔ ρ^θ(X); this is the same as saying that P^X stabilizes M.

3. Induced Representations of Super Lie Groups, Super Systems of Imprimitivity, and the Super Imprimitivity Theorem

3.1. Smooth structure of the classical induced representation and its system of imprimitivity. Let G0 be a unimodular Lie group and H0 a closed Lie subgroup. We write Ω = G0/H0 and assume that Ω has a G0-invariant measure; one can easily modify the treatment below to avoid these assumptions. We write x ↦ $\bar x$ for the natural map from G0 to Ω and $d\bar x$ for a choice of the invariant measure on Ω. For any UR σ of H0 in a Hilbert space K one has the representation π of G0 induced by σ. One may
take π as acting in the Hilbert space H of (equivalence classes of) Borel functions f from G0 to K such that (i) f(xξ) = σ(ξ)^{-1}f(x) for all x ∈ G0, ξ ∈ H0, and (ii)
\[ \|f\|_H^2 := \int_\Omega |f(x)|_K^2 \, d\bar x < \infty. \]
Here |f(x)|_K is the norm of f(x) in K, and the function x ↦ |f(x)|_K^2 is defined on Ω so that it makes sense to integrate it on Ω. Let P be the natural projection valued measure on H defined as follows: for any Borel set E ⊂ Ω the projection P(E) is the operator f ↦ χ_E f, where χ_E is the characteristic function of E. Then (π, H, P) is the classical system of imprimitivity (SI) associated to the UR σ of H0. In our case G0 is a Lie group and it is better to work with a smooth version of π; its structure is determined by a well known theorem of Dixmier-Malliavin in a manner that will be explained below. We begin with a standard but technical lemma that says that certain integrals containing a parameter are smooth.

Lemma 7. Let M, N be smooth manifolds, dn a smooth measure on N, and B a separable Banach space with norm |·|. Let F : M × N → B be a map with the following properties: (i) For each n ∈ N, m ↦ F(m, n) is smooth. (ii) If A ⊂ M is an open set with compact closure, and G is any derivative of F with respect to m, there is a g_A ∈ L^1(N, dn) such that |G(m, n)| ≤ g_A(n) for all m ∈ A, n ∈ N. Then
\[ f(m) = \int_N F(m,n)\,dn \]
exists for all m and f is a smooth map of M into B.

Proof. It is a question of proving that the integrals
\[ \int_N |G(m,n)|\,dn \]
converge uniformly when m varies in an open subset A of M with compact closure. But the integrand is majorized by g_A which is integrable on N and so the uniform convergence is clear.

We also observe that any f ∈ H lies in L^{p,loc}(G0) for p = 1, 2, i.e., θ(x)|f(x)|_K^2 is integrable on G0 for any continuous compactly supported scalar function θ ≥ 0. In fact
\[ \int_{G_0} \theta(x)|f(x)|_K^2\,dx = \int_\Omega \int_{H_0} \theta(x\xi)|f(x\xi)|_K^2\,d\xi\,d\bar x = \int_\Omega \bar\theta(\bar x)|f(x)|_K^2\,d\bar x < \infty, \]
where $\bar\theta(\bar x) = \int_{H_0}\theta(x\xi)\,d\xi$. In H we have the space C^∞(π) of smooth vectors for π. We also have its Gårding subspace, the subspace spanned by all vectors π(α)h, where α ∈ C_c^∞(G0) and h ∈ H. We have
\[ (\pi(\alpha)h)(z) = \int_{G_0} \alpha(x)\,h(x^{-1}z)\,dx = \int_{G_0} \alpha(zt^{-1})\,h(t)\,dt \qquad (z \in G_0). \]
The integrals exist because h is locally L^2 on G0 as mentioned above. Since h ∈ L^{1,loc}(G0), α ∈ C_c^∞(G0), the conditions of Lemma 7 are met and so π(α)h is smooth. Thus all elements of the Gårding space are smooth functions. But the Dixmier-Malliavin theorem asserts that the Gårding space is exactly the same as C^∞(π) [DM78]. Thus all
elements of C^∞(π) are smooth functions from G0 to K. This is the key point that leads to the smooth versions of the induced representation and the SI at the classical level. Let us define B as the space of all functions f from G0 to K such that (i) f is smooth and f(xξ) = σ(ξ)^{-1}f(x) for all x ∈ G0, ξ ∈ H0, (ii) f has compact support mod H0. Let C_c^∞(π) be the subspace of all elements of C^∞(π) with compact support mod H0.

Proposition 4. B has the following properties: (i) B = C_c^∞(π), (ii) B is dense in H, (iii) f(x) ∈ C^∞(σ) for all x ∈ G0, (iv) B is stable under dπ.

Proof. (i) Let f ∈ B. To show that f ∈ C^∞(π) it is enough to show that for any u ∈ H the map x ↦ (π(x^{-1})f, u)_H is smooth in x. Now
\[ (\pi(x^{-1})f, u)_H = \int_\Omega (f(xy), u(y))_K \, d\bar y. \]
Since |u|_K is locally L^1 on Ω and f is smooth, the conditions of Lemma 7 are met. We have B ⊂ C_c^∞(π). The reverse inclusion is immediate from the Dixmier-Malliavin theorem, as remarked above. (ii) It is enough to prove that any h ∈ H with compact support mod H0 is in the closure of B. We know that π(α)h → h as α ∈ C_c^∞(G0) goes suitably to the delta function at the identity of G0. But π(α)h is smooth and has compact support mod H0 because h has the same property, so that π(α)h ∈ B. (iii) Fix x ∈ G0. Since σ(ξ)f(x) = f(xξ^{-1}) for ξ ∈ H0 it is clear that f(x) ∈ C^∞(σ). (iv) Let f ∈ B, Z ∈ ᒄ0. Then (dπ(Z)f)(x) = (d/dt)_{t=0} f(exp(−tZ)x) is smooth and we are done.
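The covariance condition f(xξ) = σ(ξ)^{-1}f(x) and the action by left translations can be illustrated in a finite toy setting. The following is a minimal sketch, not the construction of the text: it assumes G = Z_6 written additively, H = {0, 3}, and the nontrivial character σ of H; all function names are hypothetical. It checks that translation preserves the covariant functions and that |f|^2 descends to the coset space, which is what makes the norm in (ii) well defined.

```python
N, M = 6, 3      # toy assumption: G = Z_6 (additive), H = {0, 3}
H = [0, M]


def sigma(xi):
    """Nontrivial character of H = {0, 3}: sigma(0) = 1, sigma(3) = -1."""
    return -1.0 if xi % N == M else 1.0


def make_covariant(values):
    """Extend values on the coset representatives 0..M-1 to a function f on G
    satisfying f(x + xi) = sigma(xi)^(-1) * f(x) for xi in H."""
    f = {}
    for x0, v in enumerate(values):
        for xi in H:
            f[(x0 + xi) % N] = v / sigma(xi)
    return f


def translate(g, f):
    """(pi(g) f)(x) = f(g^{-1} x), which reads f(x - g) in additive notation."""
    return {x: f[(x - g) % N] for x in range(N)}


def is_covariant(f):
    """Check f(x + xi) = sigma(xi)^(-1) * f(x) for all x in G, xi in H."""
    return all(
        abs(f[(x + xi) % N] - f[x] / sigma(xi)) < 1e-12
        for x in range(N)
        for xi in H
    )


def norm_sq(f):
    """Sum of |f|^2 over coset representatives; |f|^2 is constant on cosets
    because sigma is unitary, so the choice of representatives is irrelevant."""
    return sum(abs(f[x]) ** 2 for x in range(M))


if __name__ == "__main__":
    f = make_covariant([1.0, 2.0j, -0.5])
    print(is_covariant(f), is_covariant(translate(2, f)))
    print(norm_sq(f), norm_sq(translate(2, f)))
```

In this toy model the translated function is again covariant and has the same norm, mirroring the fact that π acts unitarily on H. Because Z_6 is abelian the example is deliberately simple; it is meant only to make the definitions concrete.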
We refer to (π, B) as the smooth representation induced by σ. We shall also define the smooth version of the SI. For any u ∈ C_c^∞(Ω) let M(u) be the bounded operator on H which is multiplication by u. Then M(u) leaves B invariant and M : u ↦ M(u) is a ∗-representation of the ∗-algebra C_c^∞(Ω) in H. It is natural to refer to (π, B, M) as the smooth system of imprimitivity associated to σ. Observe that f ∈ C^∞(π) has compact support mod H0 if and only if there is some u ∈ C_c^∞(Ω) such that f = M(u)f. Proposition 4 shows that B is thus determined intrinsically by the SI associated to σ. The passage from (π, H, P) to (π, B, M) is thus functorial and is a categorical equivalence. Thus we are justified in working just with smooth SI's.

It is easy to see that the assignment that takes σ to the associated smooth SI is functorial. Indeed, let R be a morphism from σ to σ′, i.e., R is a bounded operator from K to K′ intertwining σ and σ′. We then define T_R = T (H → H′) by (T_R f)(x) = Rf(x) (x ∈ G0). It is then immediate that T_R is a morphism from the (smooth) SI associated to σ to the (smooth) SI associated to σ′. This functor is an equivalence of categories. To verify this one must show that every morphism between the two SI's is of this form. This is of course classical but we sketch the argument depending on the following lemma which will be essentially used in the super context also.

Lemma 8. Suppose f ∈ B and f(1) = 0. Then we can find u_i ∈ C_c^∞(Ω), g_i ∈ B such that (i) u_i(1) = 0 for all i, (ii) we have f = Σ_i u_i g_i.

Proof. If f vanishes in a neighborhood of 1, we can choose u ∈ C_c^∞(Ω) such that u = 0 in a neighborhood of 1 and f = uf. The result is thus true for f. Let f ∈ B be arbitrary
but vanishing at 1. Let ᒗ be a linear subspace of ᒄ0 = Lie (G0 ) complementary to ᒅ0 = Lie (H0 ). Then there is a sufficiently small r > 0 such that if ᒗr = {Z ∈ ᒗ | |Z| < r}, |·| being a norm on ᒗ, the map ᒗr × H0 −→ G0 ,
(Z, ξ ) −→ exp Z·ξ
is a diffeomorphism onto an open set G1 = G1H0 of G0. We transfer f from G1 to a function, denoted by ϕ, on ᒗr × H0. We have ϕ(0, ξ) = 0, and ϕ(Z, ξξ′) = σ(ξ′)^{-1}ϕ(Z, ξ) for ξ′ ∈ H0. If t_i (1 ≤ i ≤ k) are the linear coordinates on ᒗ,
\[ \varphi(Z,\xi) = \sum_i t_i(Z)\int_0^1 (\partial\varphi/\partial t_i)(sZ,\xi)\,ds. \]
The functions ψ_i(Z, ξ) = ∫_0^1 (∂ϕ/∂t_i)(sZ, ξ) ds are smooth by Lemma 7 while ψ_i(Z, ξξ′) = σ(ξ′)^{-1}ψ_i(Z, ξ) for ξ′ ∈ H0. So, going back to G1 we can write f = Σ_i t_i h_i, where the t_i are now in C^∞(G1), right invariant under H0 and vanishing at 1, while the h_i are smooth and satisfy h_i(xξ) = σ(ξ)^{-1}h_i(x) for x ∈ G1, ξ ∈ H0. If u ∈ C_c^∞(Ω) is such that u is 1 in a neighborhood of 1 and supp(u) ⊂ G1, then u^2 f = Σ_i u_i g_i where u_i = ut_i ∈ C_c^∞(Ω), u_i(1) = 0, and g_i = uh_i ∈ B. Since f = u^2 f + (1 − u^2)f and (1 − u^2)f = 0 in a neighborhood of 1, we are done.

We can now determine all the morphisms from H to H′. Let T be a morphism H → H′. Then, as T commutes with multiplications by elements of C_c^∞(Ω), it maps B to B′. Moreover, for the same reason, the above lemma shows that if f ∈ B and f(1) = 0, then (Tf)(1) = 0. So the map
\[ R : f(1) \longmapsto (Tf)(1) \qquad (f \in B) \]
is well defined. From the fact that T intertwines π and π′ we obtain that (Tf)(x) = Rf(x) for all x ∈ G0. To complete the proof we must show two things: (1) R is defined on all of C^∞(σ) and (2) R is bounded. For (1), let v ∈ C^∞(σ). In the earlier notation, if u ∈ C_c^∞(G0) is 1 at 1 and has support contained in G1, then h : (exp Z, ξ) ↦ u(exp Z)σ(ξ)^{-1}v is in B and h(1) = v. For proving (2), let the constant C > 0 be such that
\[ \|Tg\|_{H'} \le C\,\|g\|_H \qquad (g \in H). \]
Then, taking g = u^{1/2}f for f ∈ B and u ≥ 0 in C_c^∞(Ω), we get
\[ \int_\Omega u(x)\,|Rf(x)|_{K'}^2\,d\bar x \le C^2 \int_\Omega u(x)\,|f(x)|_K^2\,d\bar x \]
for all f ∈ B and u ≥ 0 in C_c^∞(Ω). So |Rf(x)|_{K′} ≤ C|f(x)|_K for almost all x. As f and Rf = Tf are continuous this inequality is valid for all x, in particular for x = 1, proving that R is bounded.
3.2. Representations induced from a special sub super Lie group. It is now our purpose to extend this smooth classical theory to the super context. A SLG (H0, ᒅ) is a sub super Lie group of the SLG (G0, ᒄ) if H0 ⊂ G0, ᒅ ⊂ ᒄ, and the action of H0 on ᒅ is the restriction of the action of H0 (as a subgroup of G0) on ᒄ. We shall always suppose that H0 is closed in G0. The sub SLG (H0, ᒅ) is called special if ᒅ has the same odd part as ᒄ, i.e., ᒅ1 = ᒄ1. In this case the super homogeneous space associated is purely even and coincides with Ω = G0/H0. As in §3.1 we shall assume that Ω admits an invariant measure although it is not difficult to modify the treatment to avoid this assumption. Both conditions are satisfied in the case of the super Poincaré groups and their variants.

We start with a UR (σ, ρ^σ, K) of (H0, ᒅ) and associate to it the smooth induced representation (π, B) of the classical group G0. In our case K is a SHS and so H becomes a SHS in a natural manner, the parity subspaces being the subspaces where f takes its values in the corresponding parity subspace of K. π is an even UR. We shall now define the operators ρ^π(X) for X ∈ ᒄ1 as follows:
\[ (\rho^\pi(X)f)(x) = \rho^\sigma(x^{-1}X)\,f(x) \qquad (f \in B). \]
Since the values of f are in C^∞(σ) the right side is well defined. In order to prove that the definition gives us an odd operator on B we need a lemma.

Lemma 9. [ᒄ1, ᒄ1] ⊂ ᒅ0 and is stable under G0. In particular it is an ideal in ᒄ0.

Proof. For g ∈ G0, Y, Y′ ∈ ᒄ1, we have g[Y, Y′] = [gY, gY′] ∈ [ᒄ1, ᒄ1]. Since ᒅ0 ⊕ ᒄ1 is a super Lie algebra, [ᒄ1, ᒄ1] ⊂ ᒅ0.

Proposition 5. ρ^π(X) is an odd linear map B → B for all X ∈ ᒄ1. Moreover ρ^π(X) is local, i.e., supp(ρ^π(X)f) ⊂ supp(f) for f ∈ B. Finally, if Z ∈ ᒅ0, we have −dσ(Z)f(g) = (Zf)(g).

Proof. The support relation is trivial. Further, for x ∈ G0, ξ ∈ H0,
\[ (\rho^\pi(X)f)(x\xi) = \rho^\sigma(\xi^{-1}x^{-1}X)\,f(x\xi) = \sigma(\xi)^{-1}\rho^\sigma(x^{-1}X)\,\sigma(\xi)\,f(x\xi) = \sigma(\xi)^{-1}(\rho^\pi(X)f)(x). \]
It is thus a question of proving that g ↦ ρ^σ(g^{-1}X)f(g) is smooth. If (X_k) is a basis for ᒄ1, g^{-1}X = Σ_k c_k(g)X_k, where the c_k are smooth functions, and so it is enough to prove that g ↦ ρ^σ(Y)f(g) is smooth for any Y ∈ ᒄ1, f ∈ B. We use Lemma 2. If Z = (1/2)[Y, Y], we have ρ^σ(Y)^2 f(g) = −idσ(Z)f(g), and we need only show that −dσ(Z)f(g) is smooth in g. But Z ∈ ᒅ0 and f(g exp tZ) = σ(exp(−tZ))f(g) so that −dσ(Z)f(g) = (Zf)(g) is clearly smooth in g. Note that this argument applies to any Z ∈ ᒅ0, giving the last assertion.
Proposition 6. (π, ρ^π, B) is a UR of the SLG (G0, ᒄ).

Proof. The symmetry of ρ^π(X) and the relations ρ^π(yX) = π(y)ρ^π(X)π(y)^{-1} follow immediately from the corresponding relations for ρ^σ. Suppose now that X, Y ∈ ᒄ1.
Then (ρ^π(X)ρ^π(Y)f)(x) = ρ^σ(x^{-1}X)ρ^σ(x^{-1}Y)f(x). Hence
\[
\begin{aligned}
((\rho^\pi(X)\rho^\pi(Y) + \rho^\pi(Y)\rho^\pi(X))f)(x) &= -i\,d\sigma(x^{-1}[X,Y])\,f(x) \\
&= i\,(x^{-1}[X,Y]\,f)(x) \qquad \text{(Proposition 5)} \\
&= i\,(d/dt)_{t=0}\, f(x(x^{-1}\exp t[X,Y]\,x)) \\
&= i\,(d/dt)_{t=0}\, f(\exp t[X,Y]\,x) \\
&= i\,(d/dt)_{t=0}\, (\pi(\exp(-t[X,Y]))f)(x) \\
&= -i\,(d\pi([X,Y])f)(x).
\end{aligned}
\]
This proves the proposition.
We refer to (π, ρ^π, H) as the UR of the SLG (G0, ᒄ) induced by the UR (σ, ρ^σ, K) of (H0, ᒅ), and to (π, ρ^π, B) as the corresponding smooth induced UR. Write P for the natural projection valued measure in H based on Ω: for any Borel E ⊂ Ω, P(E) is the operator in H of multiplication by χ_E, the characteristic function of E. Recall the definition of ↔ before Lemma 3.

Proposition 7. For X ∈ ᒄ1, and u ∈ C_c^∞(Ω), we have M(u) ↔ ρ^π(X). Furthermore P(E) ↔ ρ^π(X) for Borel E ⊂ Ω.

Proof. It is standard that a bounded operator commutes with all P(E) if and only if it commutes with all M(u) for u ∈ C_c^∞(Ω). It is thus enough to prove that M(u) ↔ ρ^π(X) for all u, X. On B we have M(u)ρ^π(X) = ρ^π(X)M(u) trivially from the definitions, and so we are done in view of Lemma 3.

Theorem 1. The assignment that takes (σ, ρ^σ) to (π, ρ^π, B, M) is a fully faithful functor.

Proof. Let R be a morphism intertwining (σ, ρ^σ) and (σ′, ρ^{σ′}), and let T : B → B′ be associated to R such that (Tf)(x) = Rf(x). It is then immediate that T intertwines ρ^π and ρ^{π′}. Conversely, if T is a morphism between the induced systems, from the classical discussion following Lemma 8 we know that (Tf)(x) = Rf(x) for a bounded even operator R intertwining σ and σ′. Since T intertwines ρ^π and ρ^{π′} we conclude that R must intertwine ρ^σ and ρ^{σ′}.

3.3. Super systems of imprimitivity and the super imprimitivity theorem. A super system of imprimitivity (SSI) based on Ω is a collection (π, ρ^π, H, P), where (π, ρ^π, H) is a UR of the SLG (G0, ᒄ), (π, H, P) is a classical system of imprimitivity, π, P are both even, and ρ^π(X) ↔ P(E) for all X ∈ ᒄ1 and Borel E ⊂ Ω. Let (π, ρ^π, H) be the induced representation defined in §3.2 and let P be the projection valued measure introduced above. Proposition 7 shows that (π, ρ^π, H, P) is a SSI based on Ω. We call this the SSI induced by (σ, ρ^σ).

Theorem 2 (Super imprimitivity theorem). The assignment that takes (σ, ρ^σ) to (π, ρ^π, H, P) is an equivalence of categories from the category of UR's of the special sub SLG (H0, ᒅ) to the category of SSI's based on Ω.

Proof. Let us first prove that any SSI of the SLG (G0, ᒄ) is induced from a UR of the SLG (H0, ᒅ). We may assume, in view of the classical imprimitivity theorem, that π is
the representation induced by a UR σ of H0 in K and that π acts by left translations on H. By assumption ρ^π(X) leaves C^∞(π) invariant. We claim that it leaves C_c^∞(π) also invariant. Indeed, let f ∈ C_c^∞(π); then there is u ∈ C_c^∞(Ω) such that f = uf. On the other hand, by Lemma 3, ρ^π(X)M(u) = M(u)ρ^π(X) so that uf ∈ D(ρ^π(X)) and ρ^π(X)(uf) = uρ^π(X)f. Since uf = f this comes to ρ^π(X)f = uρ^π(X)f, showing that ρ^π(X)f ∈ C_c^∞(π). Thus the ρ^π(X) leave B invariant and commute with all M(u) there. In other words we may work with the smooth SSI. By Lemma 8 the map f(1) ↦ (ρ^π(X)f)(1) is well defined and so, as in §3.1, we can define a map ρ^σ(X) : C^∞(σ) → C^∞(σ) by
\[ \rho^\sigma(X)v = (\rho^\pi(X)f)(1), \qquad f(1) = v, \quad f \in B. \]
Then, for f ∈ B, x ∈ G0,
\[ (\rho^\pi(X)f)(x) = (\pi(x^{-1})\rho^\pi(X)f)(1) = (\rho^\pi(x^{-1}X)\pi(x^{-1})f)(1) = \rho^\sigma(x^{-1}X)(\pi(x^{-1})f)(1) = \rho^\sigma(x^{-1}X)f(x). \]
If we now prove that (σ, ρ^σ, K) is a UR of the SLG (H0, ᒅ), we are done. This is completely formal.

Covariance with respect to H0. For f ∈ B, ξ ∈ H0,
\[
\begin{aligned}
\rho^\sigma(\xi X)f(1) &= (\rho^\pi(\xi X)f)(1) = (\pi(\xi)\rho^\pi(X)\pi(\xi^{-1})f)(1) = (\rho^\pi(X)\pi(\xi^{-1})f)(\xi^{-1}) \\
&= \sigma(\xi)(\rho^\pi(X)\pi(\xi^{-1})f)(1) = \sigma(\xi)\rho^\sigma(X)(\pi(\xi^{-1})f)(1) = \sigma(\xi)\rho^\sigma(X)\sigma(\xi)^{-1}f(1).
\end{aligned}
\]
Odd commutators. Let X, Y ∈ ᒄ1 = ᒅ1 so that Z = [X, Y] ∈ ᒅ0. We have
\[ [\rho^\pi(X), \rho^\pi(Y)]f = -i\,d\pi([X,Y])f \tag{$*$} \]
for all f ∈ B. Now, i(−dπ(Z)f)(1) = i(d/dt)_{t=0} f(exp tZ) = i(d/dt)_{t=0} σ(exp(−tZ))f(1) = −idσ(Z)f(1). On the other hand, (ρ^π(X)ρ^π(Y)f)(1) = ρ^σ(X)ρ^σ(Y)f(1) so that the left side of (∗), evaluated at 1, becomes [ρ^σ(X), ρ^σ(Y)]f(1).
Thus [ρ^σ(X), ρ^σ(Y)]f(1) = −idσ(Z)f(1).

Symmetry. From the symmetry of the ρ^π(X) we have, for all f, g ∈ B, a, b ∈ C_c^∞(Ω),
\[ (\rho^\pi(X)(af), bg)_H = (af, \rho^\pi(X)(bg))_H. \]
This means that
\[ \int_\Omega (\rho^\sigma(x^{-1}X)f(x), g(x))_K\, a(x)b(x)\,d\bar x = \int_\Omega (f(x), \rho^\sigma(x^{-1}X)g(x))_K\, a(x)b(x)\,d\bar x. \]
Since a and b are arbitrary we conclude that (ρ^σ(x^{-1}X)f(x), g(x))_K = (f(x), ρ^σ(x^{-1}X)g(x))_K for almost all x. All functions in sight are continuous and so this relation is true for all x. The evaluation at 1 gives the symmetry of ρ^σ(X) on C^∞(σ). This proves that (σ, ρ^σ, K) is a UR of the SLG (H0, ᒅ) and that the corresponding induced SSI is the one we started with.

To complete the proof we must show that the set of morphisms of the induced SSI's is in canonical bijection with the set of morphisms of the inducing UR's of the sub SLG in question. Let (π, ρ^π, H, P) and (π′, ρ^{π′}, H′, P′) be the SSI's induced by (σ, ρ^σ) and (σ′, ρ^{σ′}) respectively. For any morphism R from (σ, ρ^σ) to (σ′, ρ^{σ′}) let T be as in Theorem 1. Then T extends uniquely to a bounded even operator from H to H′, and the relations TM(u) = M′(u)T for all u ∈ C_c^∞(Ω) imply that TP(E) = P′(E)T for all Borel E ⊂ Ω. Hence T is a morphism from (π, ρ^π, H, P) to (π′, ρ^{π′}, H′, P′). It is clear that the assignment R ↦ T is functorial. To complete the proof we must show that any morphism T from (π, ρ^π, H, P) to (π′, ρ^{π′}, H′, P′) is of this form for a unique R. But T must take B = C_c^∞(π) to B′ = C_c^∞(π′) and commute with the actions of C_c^∞(Ω). Hence T is a morphism from (π, ρ^π, B, M) to (π′, ρ^{π′}, B′, M′). Theorem 1 now implies that T arises from a unique morphism of (σ, ρ^σ) to (σ′, ρ^{σ′}). This finishes the proof of Theorem 2.

4. Representations of Super Semidirect Products and Super Poincaré Groups

4.1. Super semidirect products and their irreducible unitary representations. We start with a classical semidirect product G0 = T0 × L0, where T0 is a vector space of finite dimension over R, the translation group, and L0 is a closed unimodular subgroup of GL(T0) acting on T0 naturally.
For any Lie group the corresponding gothic letter denotes its Lie algebra. In applications L0 is usually an orthogonal group of Minkowskian signature, or its 2-fold cover, the corresponding spin group. By a super semidirect product (SSDP) we mean a SLG (G0 , ᒄ), where T0 acts trivially on ᒄ1 and [ᒄ1 , ᒄ1 ] ⊂ ᒑ0 . Clearly ᒑ := ᒑ0 ⊕ ᒄ1 is also a super Lie algebra, and (T0 , ᒑ) is a SLG called the super translation group. For any closed subgroup S0 ⊂ L0 , H0 = T0 S0 is a closed subgroup of G0 , ᒅ = ᒅ0 ⊕ ᒄ1 is a super Lie algebra, where ᒅ0 = ᒑ0 ⊕ ᒐ0 is the Lie algebra of H0 . Notice that (H0 , ᒅ) is a special sub SLG of (G0 , ᒄ). We begin by showing that the irreducible UR’s of (G0 , ᒄ) are in natural bijection with the irreducible UR’s of
suitable special sub SLG's of the form (H0, ᒅ) with the property that the translations act as scalars. For brevity we shall write S = (G0, ᒄ), T = (T0, ᒑ). The action of L0 on T0 induces an action on the dual T0* of T0. We assume that this action is regular, i.e., the orbits are all locally closed. By the well known theorem of Effros this implies that if Q is any projection valued measure on T0* such that Q_E = 0 or the identity operator I for any invariant Borel subset E of T0*, then Q is necessarily concentrated on a single orbit. This is precisely the condition under which the classical method of little groups of Frobenius-Mackey-Wigner works. For any λ ∈ T0* let L0^λ be the stabilizer of λ in L0 and let ᒄ^λ = ᒑ0 ⊕ ᒉ0^λ ⊕ ᒄ1. The SLG (T0L0^λ, ᒄ^λ) will be denoted by S^λ. We shall call it the little super group at λ. It is a special sub SLG of (G0, ᒄ). Two λ's are called equivalent if they are in the same L0-orbit. If θ is a UR of the classical group T0L0 and O is an orbit in T0*, its spectrum is said to be in O if the spectral measure (via the SNAG theorem) of the restriction of θ to T0 is supported by O. Given λ ∈ T0*, a UR (σ, ρ^σ) of S^λ is λ-admissible if σ(t) = e^{iλ(t)}I for t ∈ T0. λ itself is called admissible if there is an irreducible UR which is λ-admissible. It is obvious that the property of being admissible is preserved under the action of L0. Let T0^+ = {λ ∈ T0* | λ admissible}. Then T0^+ is an invariant subset of T0*.

Theorem 3. The spectrum of every irreducible UR of the SLG (G0, ᒄ) is in some orbit in T0^+. For each orbit in T0^+ and choice of λ in that orbit, the assignment that takes a λ-admissible UR γ := (σ, ρ^σ) of S^λ into the UR U^γ of (G0, ᒄ) induced by it, is a functor which is an equivalence of categories between the category of the λ-admissible UR's of S^λ and the category of UR's of (G0, ᒄ) with their spectra in that orbit. Varying λ in that orbit changes the functor into an equivalent one.
In particular this functor gives a bijection between the respective sets of equivalence classes of irreducible UR's.

Proof. Notice first of all that since T0 acts trivially on ᒄ1, π0(t) commutes with ρ^π(X) on C^∞(π0) for all t ∈ T0, X ∈ ᒄ1. Hence P_E ↔ ρ^π(X) for all Borel E ⊂ T0*, X ∈ ᒄ1. For the first statement, let E be an invariant Borel subset of T0*. Let P be the spectral measure of the restriction of π to T0. Then P_E commutes with π, ρ^π(X). So, if (π, ρ^π) is irreducible, P_E = 0 or I. Hence P is concentrated in some orbit O, i.e., P_O = I. The system (π, ρ^π) is clearly equivalent to (π, ρ^π, P) since P and the restriction of π to T0 generate the same algebra. If λ ∈ O and L0^λ is the stabilizer of λ in L0, we can transfer P from O to a projection valued measure P* on L0/L0^λ = T0L0/T0L0^λ. So (π, ρ^π) is equivalent to the SSI (π, ρ^π, P*). The rest of the theorem is an immediate consequence of Theorem 2. The fact that σ(t) = e^{iλ(t)}I for t ∈ T0 is classical. Indeed, in the smooth model for π treated in §3.2, the fact that the spectrum of π is contained in the orbit of λ implies that (π(t)f)(x) = e^{iλ(x^{-1}tx)}f(x) for all f ∈ B, t ∈ T0, x ∈ G0. Hence f(t^{-1}) = e^{iλ(t)}f(1) while f(t^{-1}) = σ(t)f(1). So σ(t) = e^{iλ(t)}I.

Remark 2. In the classical theory all orbits of L0 are allowed and an additional argument of the positivity of energy is needed to single out the physically occurring representations. In SUSY theories as exemplified by Theorem 3, a restriction is already present: only orbits in T0^+ are permitted. We shall prove in the next section that T0^+ may be interpreted precisely as the set of all positive energy representations.
4.2. Determination of the admissible orbits. Product structure of the representations of the little super groups. We fix λ ∈ T0^+ and let (σ, ρ^σ) be a λ-admissible irreducible UR of S^λ. Clearly
\[ -i\,d\sigma(Z) = \lambda(Z)I \qquad (Z \in \mathfrak{t}_0). \]
Define
\[ \Phi_\lambda(X_1, X_2) = (1/2)\,\lambda([X_1, X_2]) \qquad (X_1, X_2 \in \mathfrak{g}_1). \]
Then, on C^∞(σ), [ρ^σ(X1), ρ^σ(X2)] = λ([X1, X2])I = 2Φ_λ(X1, X2)I. Clearly Φ_λ is a symmetric bilinear form on ᒄ1 × ᒄ1. Let Q_λ(X) = Φ_λ(X, X) = (1/2)λ([X, X]). Then Q_λ is invariant under L0^λ because for X1, X2 ∈ ᒄ1, h ∈ L0^λ, [ρ^σ(hX1), ρ^σ(hX2)] = σ(h)[ρ^σ(X1), ρ^σ(X2)]σ(h)^{-1} = 2Φ_λ(X1, X2)I. Now
\[ \rho^\sigma(X)^2 = Q_\lambda(X)I \qquad (X \in \mathfrak{g}_1). \]
Since ρ^σ(X) is essentially self adjoint on C^∞(σ), it is immediate that Q_λ(X) ≥ 0. We thus obtain the necessary condition for admissibility:
\[ Q_\lambda(X) = \Phi_\lambda(X, X) \ge 0 \qquad (X \in \mathfrak{g}_1). \]
In the remainder of this subsection we shall show that the condition that Φ_λ ≥ 0, which we refer to as the positive energy condition, is also sufficient to ensure that λ is admissible. We will then find all the λ-admissible irreducible UR's of S^λ. It will follow in the next section that if the super Lie group (G0, ᒄ) is a super Poincaré group, the condition Φ_λ ≥ 0 expresses precisely the positivity of the energy. This is the reason for our describing this condition in the general case also as the positive energy condition. From now on we fix λ such that Φ_λ ≥ 0.

Lemma 10. For any admissible UR (σ, ρ^σ) of S^λ, ρ^σ(X) is a bounded self adjoint operator for X ∈ ᒄ1, and ρ^σ(X)^2 = Q_λ(X)I. Moreover, Q_λ ≥ 0 and is invariant under L0^λ.

Proof. We have, for X ∈ ᒄ1, ψ ∈ C^∞(σ), |ρ^σ(X)ψ|_K^2 = (ρ^σ(X)^2 ψ, ψ) = Q_λ(X)|ψ|_K^2, which proves the lemma.
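The content of Lemma 10 can be checked numerically in a toy case. The following is a minimal sketch under assumptions that are not in the text: a three-dimensional odd space with a degenerate nonnegative form Q(x) = x_0^2 + x_1^2 (the third direction lies in the radical), realized by Pauli matrices on C^2 graded by the third Pauli matrix. The names `rho`, `Q` and `Phi` are hypothetical stand-ins for ρ^σ, Q_λ and Φ_λ.

```python
import numpy as np

# Pauli matrices; SZ plays the role of the parity (grading) operator.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def Phi(x, y):
    """Toy symmetric bilinear form; the third coordinate is in its radical."""
    return x[0] * y[0] + x[1] * y[1]


def Q(x):
    """Toy nonnegative quadratic form Q(x) = Phi(x, x)."""
    return Phi(x, x)


def rho(x):
    """Toy odd operators: self adjoint, anticommuting with SZ,
    and vanishing on the radical direction x[2]."""
    return x[0] * SX + x[1] * SY


if __name__ == "__main__":
    x = np.array([0.3, -1.2, 5.0])
    # rho(x)^2 should equal Q(x) times the identity.
    print(np.allclose(rho(x) @ rho(x), Q(x) * np.eye(2)))
    # The operator norm should equal sqrt(Q(x)), so rho(x) is bounded.
    print(np.isclose(np.linalg.norm(rho(x), 2), np.sqrt(Q(x))))
```

In this model the operator norm of `rho(x)` is the square root of `Q(x)`, which is the finite dimensional shadow of the boundedness statement in Lemma 10, and the radical direction acts by zero.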
This lemma suggests we study the following situation. Let W be a finite dimensional real vector space and let q be a nonnegative quadratic form on W , i.e., q(w) ≥ 0 for w ∈ W . Let ϕ be the corresponding symmetric bilinear form (q(w) = ϕ(w, w)). Let C be the real algebra generated by W with the relations w 2 = q(w)1(w ∈ W ). If q is nondegenerate, i.e., positive definite, this is the Clifford algebra associated to the quadratic vector space (W, q). If q = 0 it is just the exterior algebra over W . If (wi )1≤i≤n is a basis for W such that ϕ(wi , wj ) = εi δij with εi = 0 or 1 according as i ≤ a or > a, then C is the algebra generated by the wi with the relations wi wj + wj wi = 2εi δij . Let W0 be the radical of q, i.e., W0 = {w0 |ϕ(w0 , w) = 0 for all w ∈ W }; in the above notation W0 is spanned by the wi for i ≤ a. If W ∼ = W/W0 and q ∼ , ϕ ∼ are the corresponding objects induced on W ∼ , q ∼ is positive definite, and so we have the usual Clifford algebra C ∼ generated by (W ∼ , q ∼ ) with W ∼ ⊂ C ∼ . The natural map W −→ W ∼ extends uniquely to a morphism C −→ C ∼ which is clearly surjective. We claim that its kernel is the ideal C0 in C generated by W0 . Indeed, let I be this kernel. If s ∈ I , s is a linear combination of elements wI wJ , where wI is a product wi1 . . . wir (i1< · · · < ir ≤ a) and wJ is a product wj1 . . . wjs (a < j1 < · · · < js ); hence, s ≡ J cJ wJ mod C0 , and as the image of this element in C ∼ is 0, cJ = 0 for all J because the images of the wJ are linearly independent in C ∼ . Hence s ∈ C0 , proving our claim. A representation θ of C by bounded operators in a SHS K is called self adjoint (SA) if θ (w) is odd and self adjoint for all w ∈ W . θ can be viewed as a representation of the complexification C ⊗ C of C; a representation of C ⊗ C arises in this manner from a SA representation of C if and only if it maps elements of W into odd operators and takes complex conjugates to adjoints. 
Also we wish to stress that irreducibility is in the graded sense.
Lemma 11. (i) If $\tau$ is a SA representation of $C$ in $K$, then $\tau = 0$ on $C_0$ and so it is the lift of a SA representation $\tau^\sim$ of $C^\sim$. (ii) There exist irreducible SA representations $\tau$ of $C$; these are finite dimensional, unique if $\dim(W^\sim)$ is odd, and unique up to parity reversal if $\dim(W^\sim)$ is even. (iii) Let $\tau$ be an irreducible SA representation of $C$ in a SHS $L$ and let $\theta$ be any SA representation of $C$ in a SHS $R$. Then $R \simeq K \otimes L$, where $K$ is a SHS and $\theta(a) = 1 \otimes \tau(a)$ for all $a \in C$; moreover, if $\dim(W^\sim)$ is odd, we can choose $K$ to be purely even.
Proof. (i) If $w_0 \in W_0$, then $\tau(w_0)^2 = q(w_0)I = 0$ and so $\tau(w_0)$ itself must be $0$ since it is self adjoint. (ii) In view of (i) we may assume that $W_0 = 0$ so that $q$ is positive definite.
Case I. $\dim(W) = 2m$. Select an ON basis $a_1, b_1, \dots, a_m, b_m$ for $W$. Let $e_j = (1/2)(a_j + i b_j)$, $f_j = (1/2)(a_j - i b_j)$. Then $\varphi(e_j, e_k) = \varphi(f_j, f_k) = 0$ while $\varphi(e_j, f_k) = (1/2)\delta_{jk}$. Then $\mathbb{C} \otimes C$ is generated by the $e_j, f_k$ with the relations
$$e_j e_k + e_k e_j = f_j f_k + f_k f_j = 0, \qquad e_j f_k + f_k e_j = \delta_{jk}.$$
We now set up the standard “Schrödinger” representation of $\mathbb{C} \otimes C$. The representation acts on the SHS $L = \Lambda(U)$, where $U$ is a Hilbert space of dimension $m$ and the grading on $L$ is the $\mathbb{Z}_2$-grading induced by the usual $\mathbb{Z}$-grading of $\Lambda(U)$. Let $(u_j)_{1 \le j \le m}$ be an ON basis for $U$. We define
$$\tau(e_j) f = u_j \wedge f, \qquad \tau(f_j) f = \partial(u_j)(f) \qquad (f \in \Lambda(U)),$$
$\partial(u)$ for any $u \in U$ being the odd derivation on $\Lambda(U)$ such that $\partial(u)v = (v, u)$ for $v \in U$, as required by the relation $e_j f_k + f_k e_j = \delta_{jk}$ (here $(\cdot, \cdot)$ is the scalar product in $\Lambda(U)$ extending the scalar product of $U$). It is standard
that $\tau$ is an irreducible representation of $\mathbb{C} \otimes C$. The vector $1$ is called the Clifford vacuum. We shall now verify that $\tau$ is SA. Since $a_j = e_j + f_j$, $b_j = -i(e_j - f_j)$, we need to verify that $\tau(f_j) = \tau(e_j)^*$ for all $j$, $^*$ denoting adjoints. For any subset $K = \{k_1 < \cdots < k_r\} \subset \{1, 2, \dots, m\}$ we write $u_K = u_{k_1} \wedge \cdots \wedge u_{k_r}$. Then we should verify that
$$(u_j \wedge u_K, u_L) = (u_K, \partial(u_j) u_L) \qquad (K, L \subset \{1, 2, \dots, m\}).$$
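Before the proof, the identity above and the relations among the $e_j$, $f_k$ can be sanity-checked by brute force for small $m$. The following is only an illustrative sketch, not part of the original argument: it assumes the basis $u_K$ of $\Lambda(U)$ is indexed by bitmasks, uses the normalization $\partial(u_j)u_k = \delta_{jk}$, and all function names are hypothetical.

```python
def wedge(j, m):
    """Matrix of f -> u_j ^ f on Lambda(U), dim U = m, in the basis u_K (K a bitmask)."""
    size = 1 << m
    mat = [[0] * size for _ in range(size)]
    for K in range(size):
        if (K >> j) & 1:
            continue  # u_j ^ u_K = 0 when j is already in K
        # moving u_j past the factors with smaller index costs one sign each
        sign = -1 if bin(K & ((1 << j) - 1)).count("1") % 2 else 1
        mat[K | (1 << j)][K] = sign
    return mat


def contract(j, m):
    """Matrix of the odd derivation d(u_j) with d(u_j) u_k = delta_jk."""
    size = 1 << m
    mat = [[0] * size for _ in range(size)]
    for L in range(size):
        if not (L >> j) & 1:
            continue  # d(u_j) kills u_L when j is not in L
        sign = -1 if bin(L & ((1 << j) - 1)).count("1") % 2 else 1
        mat[L ^ (1 << j)][L] = sign
    return mat


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def scaled_identity(n, c):
    return [[c if i == k else 0 for k in range(n)] for i in range(n)]


def check_fock_relations(m):
    """True if tau(f_j) = tau(e_j)^* and the e/f anticommutation relations hold."""
    size = 1 << m
    zero = scaled_identity(size, 0)
    e = [wedge(j, m) for j in range(m)]
    f = [contract(j, m) for j in range(m)]
    for j in range(m):
        if f[j] != transpose(e[j]):  # real matrices: adjoint = transpose
            return False
        for k in range(m):
            if mat_add(mat_mul(e[j], e[k]), mat_mul(e[k], e[j])) != zero:
                return False
            if mat_add(mat_mul(f[j], f[k]), mat_mul(f[k], f[j])) != zero:
                return False
            expected = scaled_identity(size, 1 if j == k else 0)
            if mat_add(mat_mul(e[j], f[k]), mat_mul(f[k], e[j])) != expected:
                return False
    return True
```

Because the basis $u_K$ is orthonormal and the matrices are real, the adjointness $\tau(f_j) = \tau(e_j)^*$ reduces to a transpose check, which is exactly the displayed identity evaluated on basis vectors.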
Write $K = \{k_1, \dots, k_r\}$, $L = \{\ell_1, \dots, \ell_s\}$, where $k_1 < \cdots < k_r$, $\ell_1 < \cdots < \ell_s$. We assume that $j = \ell_a$ for some $a$ and $K = L \setminus \{\ell_a\}$, as otherwise both sides are $0$. Then $K = \{\ell_1, \dots, \ell_{a-1}, \ell_{a+1}, \dots, \ell_s\}$ (note that $r = s - 1$). But then both sides are equal to $(-1)^{a-1}$.
From the general theory of Clifford algebras we know that if $\tau'$ is another irreducible SA representation of $C$, then either $\tau' \approx \tau$ or else $\tau' \approx \Pi\tau$, where $\Pi$ is the parity reversal map and we write $\approx$ for linear (not necessarily unitary) equivalence. So it remains to show that $\approx$ implies unitary equivalence, which we write $\simeq$. This is standard since the linear equivalence preserves self adjointness. Indeed, if $R : \tau_1 \to \tau_2$ is an even linear isomorphism, then $R^* R$ is an even automorphism of $\tau_1$ and so $R^* R = a^2 I$, where $a$ is a scalar which is $> 0$. Then $U = a^{-1} R$ is an even unitary isomorphism $\tau_1 \simeq \tau_2$. Also, for use in the odd case to be treated next, we note that $\tau$ is irreducible in the ungraded sense since its image is the full endomorphism algebra of $L$.
Case II. $\dim(W) = 2m + 1$. It is enough to construct an irreducible SA representation as it will be unique up to linear, and hence unitary, equivalence. Let $a_0, a_1, \dots, a_{2m}$ be an ON basis for $W$. Write $x_j = i a_0 a_j$ ($1 \le j \le 2m$), $x_0 = i^m a_0 a_1 \cdots a_{2m}$. Then $x_0^2 = 1$, $x_j x_k + x_k x_j = 2\delta_{jk}$ ($j, k = 1, 2, \dots, 2m$). Moreover $x_0$ commutes with all $a_j$ and hence with all $x_j$. The $x_j$ generate a Clifford algebra over $\mathbb{R}$ corresponding to a positive definite quadratic form and so there is an irreducible ungraded representation $\tau^+$ of it in an ungraded Hilbert space $L^+$ such that $\tau^+(x_j)$ is self adjoint for all $j = 1, 2, \dots, 2m$ (cf. the remark above). Within $\mathbb{C} \otimes C$ the $x_j$ generate $\mathbb{C} \otimes C^+$ so that $\tau^+$ is a representation of $\mathbb{C} \otimes C^+$ in $L^+$ such that $i\tau^+(a_0 a_j)$ is self adjoint for all $j$. We now take
$$L = L^+ \oplus L^+, \qquad \tau = \tau^+ \oplus \tau^+, \qquad \tau(x_0) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Here $L$ is given the $\mathbb{Z}_2$-grading such that the first and second copies of $L^+$ are the even and odd parts. It is clear that $\tau$ is an irreducible representation of $\mathbb{C} \otimes C$. We wish to show that $\tau(a_r)$ is odd and self adjoint for $0 \le r \le 2m$. But this follows from the fact that the $\tau(x_r)$ are self adjoint, the $\tau(x_j)$ are even, and $\tau(x_0)$ is odd, in view of the formulae
$$a_0 = i^m x_0 x_1 \cdots x_{2m}, \qquad a_j = -i a_0 x_j.$$
This finishes the proof of (ii). (iii) Let now $\theta$ be a SA representation of $C$ in a SHS $R$ of possibly infinite dimension. For any homogeneous $\psi \in R$ the cyclic subspace $\theta(C)\psi$ is finite dimensional, hence closed, graded and $\theta$-stable; moreover, by the SA nature of $\theta$, for any graded invariant subspace its orthogonal complement is also graded and invariant. Hence we can write $R = \oplus_\alpha R_\alpha$, where the sum is direct and each $R_\alpha$ is graded, invariant, and irreducible. Let $L$ be a SHS on which we have an irreducible SA representation of $C$. If $\dim(W^\sim)$ is even we can thus write $R = (M_0 \otimes L) \oplus (M_1 \otimes \Pi L)$, where the $M_j$ are even Hilbert spaces; if $\dim(W^\sim)$ is odd we can write $R = K \otimes L$, where $K$ is an even Hilbert space.
In the first case, since $M_1 \otimes \Pi L = \Pi M_1 \otimes L$, we have $R = K \otimes L$, where $K$ is a SHS with $K_0 = M_0$, $K_1 = M_1$.
For studying the question of admissibility of $\lambda$ we need a second ingredient. Let $H$ be a not necessarily connected Lie group and let us be given a morphism $j : H \to O(W^\sim)$ so that $H$ acts on $W^\sim$ preserving the quadratic form on $W^\sim$. We wish to find out when there is a UR $\kappa$ of $H$, possibly projective, and preferably, but not necessarily, even, in the space of the irreducible SA representation $\tau^\sim$, such that
$$\kappa(t)\,\tau^\sim(w)\,\kappa(t)^{-1} = \tau^\sim(tw) \qquad (t \in H,\ w \in W^\sim). \tag{$*$}$$
For $h \in O(W^\sim)$, let
$$\tau_h^\sim(w) = \tau^\sim(hw) \qquad (w \in W^\sim).$$
Then $\tau_h^\sim$ is also an irreducible SA representation of $C^\sim$ and so we can find a unitary operator $K(h)$ such that
$$\tau_h^\sim(w) = K(h)\,\tau^\sim(w)\,K(h)^{-1} \qquad (w \in W^\sim).$$
If $\dim W^\sim$ is even, $\tau^\sim$ is irreducible even as an ungraded representation, and so $K(h)$ will be unique up to a phase; it will be even or odd according as $\tau_h^\sim \simeq \tau^\sim$ or $\tau_h^\sim \simeq \Pi\tau^\sim$, where $\Pi$ is parity reversal. If $\dim W^\sim$ is odd, $\tau^\sim$ is irreducible only as a graded representation and so we also need to require $K(h)$ to be an even operator in order that it is uniquely determined up to a phase. With this additional requirement in the odd dimensional case, we then see that in both cases the class of $\kappa$ as a projective UR of $H$ is uniquely determined, i.e., the class of its multiplier $\mu$ in $H^2(H, \mathbb{T})$ is fixed. In the following, we shall show that $\mu$ can be chosen to be $\pm 1$-valued, and examine the structure of $\kappa$ more closely.
We begin with some preparation (see [Del99, Var04]). Let $C^{\sim\times}$ be the group of invertible elements in $C^\sim$. Define the full Clifford group $\Gamma$ as follows:
$$\Gamma = \{x \in C^{\sim\times} \cap (C^{\sim+} \cup C^{\sim-}) \mid x W^\sim x^{-1} \subset W^\sim\}.$$
We have a homomorphism $\alpha : \Gamma \to O(W^\sim)$ given by $\alpha(x)w = (-1)^{p(x)} x w x^{-1}$ for all $w \in W^\sim$, $p(x)$ being $0$ or $1$ according as $x \in C^{\sim+}$ or $x \in C^{\sim-}$. Let $\beta$ be the principal antiautomorphism of $C^\sim$; then $x\beta(x) \in \mathbb{R}^\times$ for all $x \in \Gamma$, and we write $G$ for the kernel of the homomorphism $x \mapsto x\beta(x)$ of $\Gamma$ into $\mathbb{R}^\times$. Since $W^\sim$ is a positive definite quadratic space, we have an exact sequence
$$1 \to \{\pm 1\} \to G \xrightarrow{\ \alpha\ } O(W^\sim) \to 1.$$
For $\dim W^\sim \ge 2$, the connected component $G^0$ of $G$ is contained in $C^{\sim+}$ and coincides with $\mathrm{Spin}(W^\sim)$.
Lemma 12. Let $\tau^\sim$ be a SA irreducible representation of $C^\sim$. We then have the following. (i) $\tau^\sim$ restricts to a unitary representation of $G$. (ii) The operator $\tau^\sim(x)$ is even or odd according as $x \in G^0$ or $x \in G \setminus G^0$. (iii) $\tau^\sim(x)\tau^\sim(w)\tau^\sim(x)^{-1} = (-1)^{p(x)} \tau^\sim(\alpha(x)(w))$ for $x \in G$, $w \in W^\sim$.
Proof. Each $x \in G$ is expressible in the form $x = c v_1 \cdots v_r$, where the $v_i$ are unit vectors in $W^\sim$ and $c \in \{\pm 1\}$. Since $\tau^\sim(v_i)$ is odd, the parity of $\tau^\sim(x)$ is the same as that of $x$. Moreover, since the $\tau^\sim(v_i)$ are self adjoint,
$$\tau^\sim(x)\,\tau^\sim(x)^* = c^2\, \tau^\sim(v_1) \cdots \tau^\sim(v_r)\, \tau^\sim(v_r) \cdots \tau^\sim(v_1) = I.$$
This proves (i). (ii) and (iii) are obvious.
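The twisted conjugation $\alpha(v)w = -v w v^{-1}$ by a unit vector $v$ is the reflection in the hyperplane orthogonal to $v$, which is why products of unit vectors cover $O(W^\sim)$. As a hedged numerical illustration (an assumption made here, not the paper's construction), one can model a three dimensional positive definite $W^\sim$ by the Pauli matrices, where $v^2 = |v|^2 I$, and compare $-v w v$ with the reflected vector. The helper names below are hypothetical.

```python
def embed(w):
    """Send w = (w1, w2, w3) to w1*sigma1 + w2*sigma2 + w3*sigma3 (Pauli matrices)."""
    x, y, z = w
    return [[complex(z), x - 1j * y], [x + 1j * y, complex(-z)]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][c] for k in range(2)) for c in range(2)] for i in range(2)]


def alpha_unit(v, w):
    """Twisted conjugation -v w v^{-1} for a unit vector v (so that v^{-1} = v)."""
    V, W = embed(v), embed(w)
    conj = mat_mul(mat_mul(V, W), V)
    return [[-entry for entry in row] for row in conj]


def reflect(v, w):
    """Reflection of w in the hyperplane orthogonal to the unit vector v."""
    d = sum(a * b for a, b in zip(v, w))
    return tuple(wi - 2 * d * vi for wi, vi in zip(w, v))


def max_abs_diff(a, b):
    return max(abs(a[i][c] - b[i][c]) for i in range(2) for c in range(2))
```

For example, with `v = (0.6, 0.0, 0.8)` and any `w`, `alpha_unit(v, w)` agrees with `embed(reflect(v, w))` up to rounding; this follows from $vw + wv = 2\varphi(v, w)$, which gives $-vwv = w - 2\varphi(v, w)v$.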
We now consider two cases.
Case I. $j(H) \subset SO(W^\sim)$. Let $\zeta$ be a Borel map of $SO(W^\sim)$ into $\mathrm{Spin}(W^\sim)$ which is a right inverse of $\alpha : \mathrm{Spin}(W^\sim) \to SO(W^\sim)$ with $\zeta(1) = 1$. Then $\zeta(xy) = \pm\zeta(x)\zeta(y)$ for $x, y \in SO(W^\sim)$, and so $\kappa_H = \tau^\sim \circ \zeta \circ j$ is an even projective UR of $H$ satisfying ($*$) with a $\pm 1$-valued multiplier $\mu_H$. Since $\zeta(1) = 1$ it follows that $\mu_H$ is normalized, i.e.,
$$\mu_H(h, 1) = \mu_H(1, h) = 1 \qquad (h \in H).$$
Clearly, the class of $\mu_H$ in $H^2(H, \mathbb{Z}_2)$ is trivial if and only if $j : H \to SO(W^\sim)$ can be lifted to a morphism $\hat{j} : H \to \mathrm{Spin}(W^\sim)$. In particular, this happens if $H$ is connected and simply connected. Suppose now $H$ is connected but $\hat{j}$ does not exist. We can then find a two-fold cover $H^\sim$ of $H$ with a covering map $p : H^\sim \to H$ such that $j : H \to SO(W^\sim)$ lifts to a morphism $j^\sim : H^\sim \to \mathrm{Spin}(W^\sim)$, and if $\xi$ is the nontrivial element in $\ker p$, then $j^\sim(\xi) = -1$.
Lemma 13. If $j$ maps $H$ into $SO(W^\sim)$, there is a projectively unique even projective UR $\kappa$ of $H$ satisfying ($*$), with a normalized $\pm 1$-valued multiplier $\mu$. If $H$ is connected, for $\kappa$ to be an ordinary even representation (which will be unique up to multiplication by a character of $H$) it is necessary and sufficient that either (i) $j : H \to SO(W^\sim)$ can be lifted to $\mathrm{Spin}(W^\sim)$ or (ii) there exists a character $\chi$ of $H^\sim$ such that $\chi(\xi) = -1$. In particular, if $H = A \times T$, where $A$ is simply connected and $T$ is a torus, then $\kappa$ is an ordinary even unitary representation.
Proof. The first statement has already been proved. We next prove the sufficiency part of the second statement. Sufficiency of (i) has already been observed. To see that (ii) is sufficient, note that $\kappa^\sim = \tau^\sim \circ j^\sim$ is an even UR of $H^\sim$ satisfying ($*$); one can clearly replace $\kappa^\sim$ by $\kappa^\sim \chi$ without destroying ($*$). As $\kappa^\sim(\xi) = -1$, we have $(\kappa^\sim \chi)(\xi) = 1$, and so it is immediate that $\kappa^\sim \chi$ descends to $H$. We leave the necessity part to the reader; it will not be used in the sequel. The statement for $H = A \times T$ will follow if we show that $H^\sim$ has a character $\chi$ as in condition (ii). We have $H^\sim = A \times T^\sim$, $T^\sim$ being the double cover of $T$, and $\xi = (1, t)$, with $t \ne 1$, $t^2 = 1$. There exists a character $\chi$ of $T^\sim$ such that $\chi(t) = -1$, and such a character can be extended to $H^\sim$ by making it trivial on $A$.
Case II. $j(H) \not\subset SO(W^\sim)$. Let $H_0 = j^{-1}(SO(W^\sim))$. Then $H_0$ is a normal subgroup of $H$ of index 2. We must distinguish two subcases.
Case II.a. $\dim(W^\sim)$ is even. Let $\zeta_0$ be a Borel right inverse of $\alpha : G^0 \to SO(W^\sim)$ with $\zeta_0(1) = 1$. Fix a unit vector $v_0 \in W^\sim$ and let $r_0 = -\alpha(v_0)$. Since $\alpha(v_0)$ is the reflection in the hyperplane orthogonal to $v_0$ and $\dim(W^\sim)$ is even, we see that $r_0 \in O(W^\sim) \setminus SO(W^\sim)$. We then define a map $\zeta : O(W^\sim) \to G$ by
$$\zeta(h) = \begin{cases} \zeta_0(h) & \text{if } h \in SO(W^\sim), \\ \zeta_0(h_0)\, v_0 & \text{if } h = h_0 r_0,\ h_0 \in SO(W^\sim). \end{cases}$$
Once again we have $\zeta(h_1 h_2) = \pm\zeta(h_1)\zeta(h_2)$. Define $\kappa_H = \tau^\sim \circ \zeta \circ j$. Then $\kappa_H$ is a projective UR of $H$ satisfying ($*$) with a $\pm 1$-valued normalized multiplier $\mu_H$. But $\kappa_H$ is not an even representation; elements of $H \setminus H_0$ map into odd unitary operators in the space of $\tau^\sim$. We shall call such a representation of $H$ graded with respect to $H_0$, or simply graded. We have thus proved the following.
Lemma 14. If $j(H) \not\subset SO(W^\sim)$, and $\dim(W^\sim)$ is even, and if we define $H_0 = j^{-1}(SO(W^\sim))$, then there is a projective UR $\kappa_H$, graded with respect to $H_0$, and satisfying ($*$) with a $\pm 1$-valued normalized multiplier $\mu_H$.
Before we take up the case when $\dim(W^\sim)$ is odd, we shall describe how the projective graded representations of $H$ are constructed. This is a very general situation and so we shall work with a locally compact second countable group $A$ and a closed subgroup $A_0$ of index 2; $A_0$ is automatically normal and we write $A_1 = A \setminus A_0$. Gradedness is with respect to $A_0$. We fix a $\pm 1$-valued multiplier $\mu$ for $A$ which is normalized. For brevity a representation will mean a unitary $\mu$-representation. Moreover, with a slight abuse of language a $\mu$-representation of $A_0$ will mean a $\mu|_{A_0 \times A_0}$-representation of $A_0$. If $R^g$ is a graded representation in a SHS $\mathcal{H}$, $R$ the corresponding ungraded representation, and $P_j$ is the orthogonal projection $\mathcal{H} \to \mathcal{H}_j$, we associate to $R$ the projection valued measure $P$ on $A/A_0$, where $P_{A_0} = P_0$ and $P_{A_1} = P_1$. Then the condition that $R^g$ is graded is exactly the same as saying that $(R, P)$ is a system of imprimitivity for $A$ based on $A/A_0$.
Conversely, given a system of imprimitivity $(R, P)$ for $A$ based on $A/A_0$, let us define the grading for $\mathcal{H} = \mathcal{H}(R)$ by $\mathcal{H}_j$ = range of $P_{A_j}$ ($j = 0, 1$); then $R$ becomes a graded representation. Moreover, for graded representations $R^g$, $R'^g$, we have $\mathrm{Hom}(R^g, R'^g) = \mathrm{Hom}((R, P), (R', P'))$. In other words, the category of systems of imprimitivity for $A$ based on $A/A_0$ and the category of representations of $A$ graded with respect to $A_0$ are equivalent naturally. For any $\mu$-representation $r$ of $A_0$ in a purely even Hilbert space $\mathcal{H}(r)$, let $R_r := \mathrm{Ind}_{A_0}^{A}\, r$ be the representation of $A$ induced by $r$. We recall that $R_r$ acts in the Hilbert space $\mathcal{H}(R_r)$ of all (equivalence classes of Borel) functions $f : A \to \mathcal{H}(r)$ such that for each $\alpha \in A_0$, $f(\alpha a) = \mu(\alpha, a)\, r(\alpha) f(a)$ for almost all $a \in A$; and
$$(R_r(a) f)(y) = \mu(y, a)\, f(ya) \qquad (a, y \in A).$$
The space $\mathcal{H}(R_r)$ is naturally graded by defining
$$\mathcal{H}(R_r)_j = \{f \in \mathcal{H}(R_r) \mid \mathrm{supp}(f) \subset A_j\} \qquad (j = 0, 1).$$
It is then obvious that $R_r$ is a graded $\mu$-representation. We write $\hat{R}_r$ for $R_r$ treated as a graded representation. These remarks suggest the following lemma.
Lemma 15. For any unitary $\mu$-representation $r$ of $A_0$ let $R_r = \mathrm{Ind}(r)$ and let $\hat{R}_r$ be the graded $\mu$-representation defined by $R_r$. Then the assignment $r \mapsto \hat{R}_r$ is an equivalence from the category of unitary $\mu$-representations of $A_0$ to the category of unitary graded $\mu$-representations of $A$.
Proof. Let us first assume that $\mu = 1$. Then we are dealing with UR's and the above remarks imply the lemma in view of the classical imprimitivity theorem. When $\mu$ is not $1$ we go to the central extension $A^\sim$ of $A$ by $\mathbb{Z}_2$ defined by $\mu$. Recall that $A^\sim = A \times_\mu \mathbb{Z}_2$ with multiplication defined by
$$(a, \xi)(a', \xi') = (aa', \xi\xi'\mu(a, a')) \qquad (a, a' \in A,\ \xi, \xi' \in \mathbb{Z}_2).$$
(We must give to $A^\sim$ the Weil topology.) Then $A_0^\sim = A_0 \times_\mu \mathbb{Z}_2$ and $A^\sim / A_0^\sim = A/A_0$. The $\mu$-representations $R$ of $A$ are in natural bijection with UR's $R^\sim$ of $A^\sim$ such that $R^\sim$ is nontrivial on $\mathbb{Z}_2$ by the correspondence
$$R^\sim(a, \xi) = \xi R(a), \qquad R(a) = R^\sim(a, 1).$$
The assignment $R \mapsto R^\sim$ is an equivalence of categories. Analogous considerations hold for $\mu$-representations $r$ of $A_0$ and UR's $r^\sim$ of $A_0^\sim$ which are nontrivial on $\mathbb{Z}_2$. The lemma would now follow if we establish two things: (a) For any unitary $\mu$-representation $r$ of $A_0$, and $R_r = \mathrm{Ind}(r)$, we have $R_r^\sim \simeq \mathrm{Ind}(r^\sim)$, and (b) If $\rho$ is a UR of $A_0^\sim$ and $\mathrm{Ind}(\rho) = R^\sim$ for some $\mu$-representation $R$ of $A$, then $\rho = r^\sim$ for some $\mu$-representation $r$ of $A_0$. To prove (a) we set up the map $f \mapsto f^\sim$ from $\mathcal{H}(R_r^\sim)$ to $\mathcal{H}(\mathrm{Ind}(r^\sim))$ by
$$f^\sim(a, \xi) = f(a)\xi.$$
It is an easy calculation that this is an isomorphism of $R_r^\sim$ with $\mathrm{Ind}(r^\sim)$ that intertwines the two projection valued measures on $A/A_0$ and $A^\sim/A_0^\sim \approx A/A_0$. To prove (b) we have only to check that $\rho(1, \xi) = \xi$; this however is a straightforward calculation.
Remark 3. Given a graded $\mu$-representation $R$ of $A$, let $r$ be the $\mu$-representation of $A_0$ defined by
$$r(\alpha) = R(\alpha)\big|_{\mathcal{H}(R)_0} \qquad (\alpha \in A_0).$$
It is then easy to show that $R \simeq \hat{R}_r$. In fact it is enough to verify this (as before) when $\mu = 1$. In this simple situation this is well known.
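The passage from $\mu$-representations of $A$ to ordinary representations of the central extension $A^\sim = A \times_\mu \mathbb{Z}_2$ can be illustrated on a toy finite example. The sketch below is an assumption made for illustration only: it takes $A = \mathbb{Z}_2 \times \mathbb{Z}_2$ (written additively), the multiplier $\mu(a, b) = (-1)^{a_2 b_1}$, and the $\mu$-representation $R(a) = X^{a_1} Z^{a_2}$ built from two anticommuting matrices; none of these choices come from the text.

```python
from itertools import product

A = list(product((0, 1), repeat=2))  # toy group Z2 x Z2, written additively
Z2 = (1, -1)                         # the central subgroup {+1, -1}


def mu(a, b):
    """A normalized +-1-valued multiplier on A (bilinear, hence a 2-cocycle)."""
    return -1 if (a[1] * b[0]) % 2 else 1


def add(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)


def ext_mul(x, y):
    """Group law (a, xi)(a', xi') = (aa', xi xi' mu(a, a')) on the extension."""
    (a, xi), (b, eta) = x, y
    return (add(a, b), xi * eta * mu(a, b))


X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]


def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][c] for k in range(2)) for c in range(2)] for i in range(2)]


def scale(c, p):
    return [[c * entry for entry in row] for row in p]


def R(a):
    """mu-representation of A: R(a) R(b) = mu(a, b) R(a + b)."""
    out = I2
    if a[0]:
        out = mat_mul(out, X)
    if a[1]:
        out = mat_mul(out, Z)
    return out


def R_ext(x):
    """Ordinary representation of the extension: R~(a, xi) = xi R(a)."""
    a, xi = x
    return scale(xi, R(a))


EXT = [(a, xi) for a in A for xi in Z2]
```

In this toy case `R_ext` is multiplicative on all of `EXT` and sends the central element `((0, 0), -1)` to $-I$, which is the "nontrivial on $\mathbb{Z}_2$" condition, while `R(a) = R_ext((a, 1))` recovers the original $\mu$-representation.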
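We now resume our discussion and treat the odd dimensional case.
Case II.b. $\dim(W^\sim)$ is odd. We shall exhibit a projective even UR $\kappa$ of $H$ satisfying ($*$). We refer back to the construction of $\tau^\sim$ in Lemma 11. Then
$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
is an odd unitary operator such that $S^2 = -1$ and $\tau^\sim(x) S = (-1)^{p(x)} S \tau^\sim(x)$ for all $x \in C^\sim$. Let $\gamma : O(W^\sim) \to G$ be a Borel right inverse of $\alpha$. Then, it is easily checked that
$$\kappa_H(h) = \begin{cases} (\tau^\sim \circ \gamma \circ j)(h) & \text{if } h \in H_0, \\ (\tau^\sim \circ \gamma \circ j)(h)\, S & \text{if } h \in H \setminus H_0 \end{cases}$$
has the required properties.
We now return to our original setting. First, the graded representations of $H$ are obtained by taking $H = A$, $H_0 = A_0$ in the foregoing discussion. Let $\mathfrak{g}_{1\lambda} = \mathfrak{g}_1 / \mathrm{rad}\, \Phi_\lambda$. We write $C_\lambda$ for the algebra generated by $\mathfrak{g}_1$ with the relations $X^2 = Q_\lambda(X)1$ for all $X \in \mathfrak{g}_1$. We have a map $j_\lambda : L_0^\lambda \to O(\mathfrak{g}_{1\lambda})$. We take in the preceding theory
$$q = Q_\lambda, \quad \varphi = \Phi_\lambda, \quad H = L_0^\lambda, \quad W^\sim = \mathfrak{g}_{1\lambda}, \quad j = j_\lambda, \quad C^\sim = C_\lambda^\sim.$$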
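Furthermore let
$$\kappa_\lambda = \kappa, \quad \mu_\lambda = \mu, \quad \tau_\lambda^\sim = \tau^\sim, \quad \tau_\lambda = \text{lift of } \tau^\sim \text{ to } C_\lambda.$$
Then $\mu_\lambda$ is a normalized multiplier for $L_0^\lambda$ which we can choose to be $\pm 1$-valued, $\kappa_\lambda$ is a $\mu_\lambda$-representation (unitary) of $L_0^\lambda$ in the space of $\tau_\lambda$, and
$$\kappa_\lambda(t)\,\tau_\lambda(X)\,\kappa_\lambda(t)^{-1} = \tau_\lambda(tX) \qquad (t \in L_0^\lambda).$$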
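Moreover, $\kappa_\lambda$ is graded if and only if $j_\lambda(L_0^\lambda) \not\subset SO(\mathfrak{g}_{1\lambda})$ and $\dim(\mathfrak{g}_{1\lambda})$ is even, otherwise $\kappa_\lambda$ is even. Finally, let
$$L_{00}^\lambda = \begin{cases} j_\lambda^{-1}(SO(\mathfrak{g}_{1\lambda})) & \text{if } j_\lambda(L_0^\lambda) \not\subset SO(\mathfrak{g}_{1\lambda}) \text{ and } \dim(\mathfrak{g}_{1\lambda}) \text{ is even}, \\ L_0^\lambda & \text{otherwise.} \end{cases}$$
For any unitary $\mu_\lambda$-representation $r$ of $L_{00}^\lambda$ in an even Hilbert space $K_\lambda$, let $\hat{R}_r$ be the unitary $\mu_\lambda$-representation of $L_0^\lambda$ induced by $r$, which is graded if $j_\lambda(L_0^\lambda) \not\subset SO(\mathfrak{g}_{1\lambda})$ and $\dim(\mathfrak{g}_{1\lambda})$ is even, and is just $r$ in all other cases.
Theorem 4. Let $\lambda$ be such that $\Phi_\lambda \ge 0$. Then $\lambda$ is admissible, i.e., $\lambda \in T_0^+$. For a fixed such $\lambda$ let $\tau_\lambda$ be an irreducible SA representation of $C_\lambda$ in a SHS $L_\lambda$ and $\kappa_\lambda$ the unitary $\mu_\lambda$-representation of $L_0^\lambda$ in $L_\lambda$ associated to $\tau_\lambda$ as above. For any unitary $\mu_\lambda$-representation $r$ of $L_{00}^\lambda$ in an even Hilbert space $K_\lambda$, let $\hat{R}_r$ be the unitary $\mu_\lambda$-representation of $L_0^\lambda$ defined as above, and let
$$\theta_{r\lambda} = (\sigma_{r\lambda}, \rho_\lambda^\sigma)$$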
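be the UR of the little SLG $S^\lambda$, where, for $X \in \mathfrak{g}_1$, $h \in L_0^\lambda$, $t \in T_0$,
$$\sigma_{r\lambda}(th) = e^{i\lambda(t)}\, \sigma_{r\lambda}(h)$$
and
$$\sigma_{r\lambda}(h) = \hat{R}_r(h) \otimes \kappa_\lambda(h), \qquad \rho_\lambda^\sigma(X) = 1 \otimes \tau_\lambda(X) \qquad (X \in \mathfrak{g}_1).$$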
Then $\theta_{r\lambda}$ is an admissible UR of $S^\lambda$. The assignment $r \mapsto \theta_{r\lambda}$ is functorial, commutes with direct sums, and is an equivalence of categories from the category of unitary $\mu_\lambda$-representations of $L_{00}^\lambda$ to the category of admissible UR's of the little super group $S^\lambda$. If $L_0^\lambda$ is connected and satisfies either of the conditions of Lemma 13, then $r \mapsto \theta_{r\lambda}$ is an equivalence from the category of even UR's of $L_0^\lambda$ into the category of admissible UR's of $S^\lambda$.
Proof. Once $\kappa_\lambda$ is fixed, the assignment $r \mapsto \theta_{r\lambda}$ is clearly functorial (although it depends on $\kappa_\lambda$). If $\dim(\mathfrak{g}_{1\lambda})$ is even, a morphism $M : r_1 \to r_2$ obviously gives rise to the morphism $\hat{M} : \hat{R}_{r_1} \to \hat{R}_{r_2}$ and hence to the morphism $\hat{M} \otimes 1$ from $\theta_{r_1\lambda}$ to $\theta_{r_2\lambda}$. Conversely, if $T$ is a bounded even operator commuting with $1 \otimes \tau_\lambda$, it is immediate (since the $\tau_\lambda(X)$ generate the full super algebra of endomorphisms of $L_\lambda$) that $T$ must be of the form $\hat{M} \otimes 1$, where $\hat{M} : \mathcal{H}(\hat{R}_{r_1}) \to \mathcal{H}(\hat{R}_{r_2})$ is a bounded even operator. If now $T$ intertwines $\hat{R}_{r_1} \otimes \kappa_\lambda$ and $\hat{R}_{r_2} \otimes \kappa_\lambda$, then $\hat{M}$ must belong to $\mathrm{Hom}(\hat{R}_{r_1}, \hat{R}_{r_2}) \approx \mathrm{Hom}(r_1, r_2)$. Thus $r \mapsto \theta_{r\lambda}$ is a fully faithful functor. If $\dim(\mathfrak{g}_{1\lambda})$ is odd, we can choose $K_\lambda$ to be purely even (see Lemma 11). If $T$ is a bounded even operator commuting with $1 \otimes \tau_\lambda$, we use the fact that it commutes with
$$1 \otimes \begin{pmatrix} \tau^+(a) & 0 \\ 0 & \tau^+(a) \end{pmatrix} \quad \text{and} \quad 1 \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
(in the notation of Lemma 11) to conclude, via an argument similar to the one used in the even dimensional case, that $T$ is of the form $M \otimes 1$. Arguing as before we conclude that $M \in \mathrm{Hom}(r_1, r_2)$. Thus $r \mapsto \theta_{r\lambda}$ is a fully faithful functor in this case also.
It remains to show that every admissible UR of $S^\lambda$ is of the form $\theta_{r\lambda}$. Let $\theta$ be an admissible UR of $S^\lambda$ in $\mathcal{H}$. Then $\theta = (\xi, \tau)$, where $\xi$ is an even UR of $T_0 L_0^\lambda$ which restricts to $e^{i\lambda} I$ on $T_0$, and $\tau$ is a SA representation of $C_\lambda$ related to $\xi$ as usual. We may then assume by Lemma 11 that $\mathcal{H} = K \otimes L_\lambda$ and $\tau = 1 \otimes \tau_\lambda$. If $\dim(\mathfrak{g}_{1\lambda})$ is odd, we choose $K$ purely even. Then
$$1 \otimes \tau_\lambda(hX) = \xi(h)\,[1 \otimes \tau_\lambda(X)]\,\xi(h)^{-1}.$$
But the same relation is true if we replace $\xi$ by $1 \otimes \kappa_\lambda$. So if $\xi_1 = [1 \otimes \kappa_\lambda]^{-1} \xi$, then $\xi_1(h)$ is even or odd according to the grading of $\kappa_\lambda(h)$, and commutes with $1 \otimes \tau_\lambda$. Hence $\xi_1$ is of the form $R' \otimes 1$ for a Borel map $R'$ from $L_0^\lambda$ into the unitary group of $K$. Thus
$$\xi(h) = [1 \otimes \kappa_\lambda(h)]\,[R'(h) \otimes 1].$$
The two factors on the right side of this equation commute; the left side is an even UR and the first factor on the right is a unitary $\mu_\lambda$-representation of $L_0^\lambda$, which is graded or even according to $\dim(\mathfrak{g}_{1\lambda})$ even or $\dim(\mathfrak{g}_{1\lambda})$ odd. So $R'$ is a $\mu_\lambda^{-1} = \mu_\lambda$-representation of $L_0^\lambda$ in $K$, which is graded or even according to $\dim(\mathfrak{g}_{1\lambda})$ even or $\dim(\mathfrak{g}_{1\lambda})$ odd. This finishes the proof.
Remark 4. If $\lambda = 0$, then $\Phi_\lambda = 0$, $L_\lambda = 0$, and $\theta_{r0} = (r, 0)$.
Remark 5. For the super Poincaré groups we shall see in the next subsection that the situation is much simpler and $L_0^\lambda$ is always connected and satisfies the conditions of Lemma 13.
Combining Theorems 3 and 4 we obtain the following theorem. Let $\Theta_{r\lambda}$ be the UR of $(G_0, \mathfrak{g})$ induced by $\theta_{r\lambda}$ as described in Theorem 4.
Theorem 5. Let $\lambda$ be such that $\Phi_\lambda \ge 0$. The assignment that takes $r$ to the UR $\Theta_{r\lambda}$ is an equivalence of categories from the category of unitary $\mu_\lambda$-representations of $L_{00}^\lambda$ to the category of UR's of $(G_0, \mathfrak{g})$ whose spectra are contained in the orbit of $\lambda$. In particular, for $r$ irreducible, $\Theta_{r\lambda}$ is irreducible, and every irreducible UR of $(G_0, \mathfrak{g})$ is obtained in this way. If the conditions of Lemma 13 are satisfied, then the $r$'s come from the category of UR's of $L_0^\lambda$.
In the case of super Poincaré groups (see Remark 5 above), $\Theta_{r\lambda}$ induced by $\theta_{r\lambda}$ represents a superparticle. In general the UR $\pi_{r\lambda}$ of $T_0 L_0$ contained in $\Theta_{r\lambda}$ will not be an irreducible UR of $G_0$. Its decomposition into irreducibles gives the multiplet that the UR of $S$ determines. This is of course the set of irreducible UR's $U_{r\lambda j}$ of $G_0$ induced by the $r_{\lambda j}$, where the $r_{\lambda j}$ are the irreducible UR's of $L_0^\lambda$ contained in $r \otimes \kappa_\lambda$:
$$r \otimes \kappa_\lambda = \bigoplus_j r_{\lambda j}, \qquad \pi_{r\lambda} = \bigoplus_j U_{r\lambda j}.$$
The set $(r_{\lambda j})$ thus defines the multiplet. For $r$ trivial the corresponding multiplet is called fundamental.
4.3. The case of the super Poincaré groups.
We shall now specialize the entire theory to the case when $(G_0, \mathfrak{g})$ is a super Poincaré group (SPG). This means that the following conditions are satisfied.
(a) $T_0 = \mathbb{R}^{1,D-1}$ is the $D$-dimensional Minkowski space of signature $(1, D-1)$ with $D \ge 4$; the Minkowski bilinear form is $\langle x, x' \rangle = x_0 x_0' - \sum_j x_j x_j'$.
(b) $L_0 = \mathrm{Spin}(1, D-1)$.
(c) $\mathfrak{g}_1$ is a real spinorial module for $L_0$, i.e., is a direct sum of spin representations over $\mathbb{C}$.
(d) For any $0 \ne X \in \mathfrak{g}_1$, and any $x \in T_0$ lying in the interior $\Gamma^+$ of the forward light cone $\Gamma$, we have $\langle [X, X], x \rangle > 0$.
If in (c) $\mathfrak{g}_1$ is the sum of $N$ real irreducible spin modules of $L_0$, we say we are in the context of $N$-extended supersymmetry. Sometimes $N$ refers to the number of irreducible components over $\mathbb{C}$. In (d)
$$\Gamma = \{x \mid \langle x, x \rangle \ge 0,\ x_0 \ge 0\}, \qquad \Gamma^+ = \{x \mid \langle x, x \rangle > 0,\ x_0 > 0\}.$$
In the case when $D = 4$ and $\mathfrak{g}_1$ is the Majorana spinor, the condition (d) is automatic (one may have to change the sign of the odd commutators to achieve this); in the general case, as we shall see below, it ensures that only positive energy representations are allowed. We identify $T_0^*$ with $\mathbb{R}^{1,D-1}$ by the pairing $\langle x, p \rangle = x_0 p_0 - \sum_j x_j p_j$. The dual action of $L_0$ is then the original action. The orbit structure of $T_0^*$ is classical.
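The cone conditions in (d) and the pairing between $T_0$ and $T_0^*$ are simple enough to spell out computationally. The following is a minimal sketch, assuming vectors are given as tuples $(x_0, x_1, \dots, x_{D-1})$; the function names are hypothetical and chosen only for this illustration.

```python
def minkowski(x, p):
    """Pairing <x, p> = x0*p0 - sum_j xj*pj for signature (1, D-1)."""
    return x[0] * p[0] - sum(a * b for a, b in zip(x[1:], p[1:]))


def in_closed_cone(x):
    """Membership in the closed forward light cone: <x, x> >= 0 and x0 >= 0."""
    return minkowski(x, x) >= 0 and x[0] >= 0


def in_open_cone(x):
    """Membership in the interior of the forward light cone: <x, x> > 0 and x0 > 0."""
    return minkowski(x, x) > 0 and x[0] > 0
```

For instance, a rest-frame momentum `(m, 0, 0, 0)` with `m > 0` lies in the open cone, a light-like vector such as `(1, 1, 0, 0)` lies in the closed cone but not in its interior, and a space-like vector such as `(0, 1, 0, 0)` lies in neither.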
Lemma 16. (i) Let $V$ be a finite dimensional real vector space with a nondegenerate quadratic form and let $V_1$ be a subspace of $V$ on which the quadratic form remains nondegenerate. Then the spin representations of $\mathrm{Spin}(V)$ restrict on $\mathrm{Spin}(V_1)$ to direct sums of spin representations of $\mathrm{Spin}(V_1)$. (ii) Suppose $V = \mathbb{R}^{1,D-1}$. Let $p \in V$ be such that $\langle p, p \rangle = \pm m^2 \ne 0$ and $V_1 = p^\perp$. Then $V_1$ is a quadratic subspace, the stabilizer $L_0^p$ of $p$ in $\mathrm{Spin}(V)$ is precisely $\mathrm{Spin}(V_1)$, and it is $\mathrm{Spin}(D-1)$ for $\langle p, p \rangle = m^2$ and $\mathrm{Spin}(1, D-2)$ for $\langle p, p \rangle = -m^2 \ne 0$.
Proof. (i) Let $C$, $C_1$ be the Clifford algebras of $V$ and $V_1$. Then $C_1^+ \subset C^+$ and hence, as the spin groups are imbedded in the even parts of the Clifford algebras, we have $\mathrm{Spin}(V_1) \subset \mathrm{Spin}(V)$. Now the spin modules are precisely the modules for the even parts of the corresponding Clifford algebras and so, as these algebras are semisimple, the decomposition of the spin module of $\mathrm{Spin}(V)$, viewed as an irreducible module for $C^+$, into irreducible modules for $C_1^+$ under restriction to $C_1^+$, gives the decomposition of the restriction of the original spin module to $\mathrm{Spin}(V_1)$. See [Del99, Var04]. (ii) Choose an orthogonal basis $(e_\alpha)_{0 \le \alpha \le D-1}$ such that $\langle e_0, e_0 \rangle = -\langle e_j, e_j \rangle = 1$ for $1 \le j \le D-1$. It is easy to see that we can move $p$ to either $(m, 0, \dots, 0)$ or $(0, m, 0, \dots, 0)$ by $L_0$ and so we may assume that $p$ is in one of these two positions. For $u \in C(V)^+$ it is then a straightforward matter to verify that $up = pu$ if and only if $u \in C(V_1)^+$. From the characterization of the spin group ([Del99, Var04]) it is now clear that $L_0^p = \mathrm{Spin}(V_1)$.
Lemma 17. Let $M$ be a connected real semisimple Lie group whose universal cover does not have a compact factor, i.e., the Lie algebra of $M$ does not have a factor Lie algebra whose group is compact. Then $M$ has no nontrivial morphisms into any compact Lie group, and hence no nontrivial finite dimensional UR's.
Proof. We may assume that $M$ is simply connected.
If such a morphism exists we have a nontrivial morphism $\mathfrak{m} \to \mathfrak{k}$, where $\mathfrak{m}$ is the Lie algebra of $M$ and $\mathfrak{k}$ is the Lie algebra of a compact Lie group. Let $\mathfrak{a}$ be the kernel of this Lie algebra morphism. Then $\mathfrak{a}$ is an ideal of $\mathfrak{m}$ different from $\mathfrak{m}$, and so we can write $\mathfrak{m}$ as $\mathfrak{a} \times \mathfrak{k}'$, where $\mathfrak{k}'$ is also an ideal and is non-zero; moreover, the map from $\mathfrak{k}'$ to $\mathfrak{k}$ is injective. $\mathfrak{k}'$ is semisimple and admits an invariant negative definite form (the restriction from the Cartan-Killing form of $\mathfrak{k}$), and so its associated simply connected group $K'$ is compact. If $A$ is the simply connected group for $\mathfrak{a}$, we have $M = A \times K'$, showing that $M$ admits a compact factor, contrary to hypothesis.
Corollary 1. If $V$ is a quadratic vector space of signature $(p, q)$, $p, q > 0$, $p + q \ge 3$, then $\mathrm{Spin}(V)$ does not have any nontrivial map into a compact Lie group.
Proof. The Lie algebra is semisimple and the simple factors are not compact.
Lemma 18. We have $\Gamma = \{p \mid \Phi_p \ge 0\}$, i.e., for any $p \in \mathbb{R}^{1,D-1}$,
$$\Phi_p \ge 0 \iff p_0 \ge 0,\ \langle p, p \rangle \ge 0.$$
Moreover, $p_0 > 0$, $\langle p, p \rangle > 0 \Rightarrow \Phi_p > 0$.
Proof. For $0 \ne X \in \mathfrak{g}_1$ we have $\langle [X, X], x \rangle > 0$ for all $x \in \Gamma^+$ and hence the inequality is true with $\ge 0$ replacing $> 0$ for $x \in \Gamma$. Hence $2\Phi_p(X, X) = \langle [X, X], p \rangle \ge 0$ if $p \in \Gamma$. So $\Gamma \subset \{p \mid \Phi_p \ge 0\}$. We shall show next that $\{p \mid \Phi_p \ge 0\} \subset \{p \mid \langle p, p \rangle \ge 0\}$. Suppose on the contrary that $\Phi_p \ge 0$ but $\langle p, p \rangle < 0$. Since $\Phi_p$ is invariant under $L_0^p$ which is connected, we have a map $L_0^p \to SO(\mathfrak{g}_{1p})$. Then $L_0^p = \mathrm{Spin}(V_1) = \mathrm{Spin}(1, D-2)$ by Lemma 16, and Corollary 1 shows that $L_0^p$ has no nontrivial morphisms into any compact Lie group. Hence $L_0^p$ acts trivially on $\mathfrak{g}_{1p}$. Since $L_0^p$ is a semisimple group, $\mathfrak{g}_{1p}$ can be lifted to an $L_0^p$-invariant subspace of $\mathfrak{g}_1$. Hence, if $\mathfrak{g}_{1p} \ne 0$, the action of $L_0^p$ on $\mathfrak{g}_1$ must have non-zero trivial submodules. However, by Lemma 16, the spin modules of $\mathrm{Spin}(V)$ restrict on $\mathrm{Spin}(V_1)$ to direct sums of spin modules of the smaller group and there is no trivial module in this decomposition. Hence $\mathfrak{g}_{1p} = 0$, i.e., $\Phi_p = 0$. Hence $p$ vanishes on $[\mathfrak{g}_1, \mathfrak{g}_1]$. Now $[\mathfrak{g}_1, \mathfrak{g}_1]$ is stable under $L_0$ and non-zero, and so must be the whole of $\mathfrak{t}_0$. So $p = 0$, a contradiction. To finish the proof we should prove that if $\Phi_p \ge 0$ then $p_0 \ge 0$. Otherwise $p_0 < 0$ and so $-p \in \Gamma$ and so from what we have already proved, we have $\Phi_{-p} = -\Phi_p \ge 0$. Hence $\Phi_p = 0$. But then as before $p = 0$, a contradiction. Finally, if $p_0 > 0$ and $\langle p, p \rangle > 0$, then $\Phi_p > 0$ by definition of the SPG structure. This completes the proof.
Theorem 6. Let $S = (G_0, \mathfrak{g})$ be a SPG. Then all stabilizers are connected and
$$T_0^+ = \{p \mid \Phi_p \ge 0\} = \Gamma.$$
Moreover, $\kappa_p$ is an even UR of $L_0^p$, and the irreducible UR's of $S$ whose spectra are in the orbit of $p$ are in natural bijection with the irreducible UR's of $L_0^p$. The corresponding multiplet is then the set of irreducible UR's parametrized by the irreducibles of $L_0^p$ occurring in the decomposition of $\alpha \otimes \kappa_p$ as a UR of $L_0^p$.
Proof. In view of Theorem 4 and Lemma 18 we have $T_0^+ = \Gamma$. For $p \in \Gamma$, the stabilizers are all known classically. If $\langle p, p \rangle > 0$, $L_0^p = \mathrm{Spin}(D-1)$; if $\langle p, p \rangle = 0$ but $p_0 > 0$, then $L_0^p = \mathbb{R}^{D-2} \times \mathrm{Spin}(D-2)$; and for $p = 0$, $L_0^p = L_0$. So, except when $D = 4$ and $p$ is non-zero and is in the zero mass orbit, the stabilizer is connected and simply connected, thus $\kappa_p$ is an even UR of $L_0^p$ by Lemma 13. But in the exceptional case, $L_0^p = \mathbb{R}^2 \times S^1$, where $S^1$ is the circle, and Lemma 13 is again applicable. This finishes the proof.
4.4. Determination of $\kappa_p$ and the structure of the multiplets. Examples.
We have seen that the multiplet defined by the super particle $\Theta_{\alpha p}$ is parametrized by the set of irreducible UR's of $L_0^p$ that occur in the decomposition of $\alpha \otimes \kappa_p$. Clearly it is desirable to determine $\kappa_p$ as explicitly as possible. We shall do this in what follows. To determine $\kappa_p$ the following lemma is useful. $(W, q)$ is a positive definite quadratic vector space and $\varphi$ is the bilinear form of $q$. $C(W)$ is the Clifford algebra of $W$
and $H$ is a connected Lie group with a morphism $H \to SO(W)$. $\tau$ is an irreducible SA representation of $C(W)$ and $\kappa$ is a UR such that $\kappa(t)\tau(u)\kappa(t)^{-1} = \tau(tu)$ for all $u \in W$, $t \in H$. We write $\approx$ for equivalence after multiplying by a suitable character.
Lemma 19. Suppose that $\dim(W) = 2m$ is even and $W_{\mathbb{C}} := \mathbb{C} \otimes_{\mathbb{R}} W$ has an isotropic subspace $E$ of dimension $m$ stable under $H$. Let $\eta$ be the action of $H$ on $\Lambda(E)$ extending its action on $E$. Then
$$\kappa \approx \Lambda(E) \approx \Lambda(E^*) \qquad (E^* \text{ is the complex conjugate of } E).$$
Proof. Clearly $E^*$ is also isotropic and $H$-stable. $E \cap E^* = 0$ as otherwise $E \cap E^* \cap W$ will be a non-zero isotropic subspace of $W$. So $W_{\mathbb{C}} = E \oplus E^*$. We write $\tau'$ for the representation of $C(W)$ in $\Lambda(E)$, where
$$\tau'(u)(x) = u \wedge x, \qquad \tau'(v)(x) = \partial(v)(x) \qquad (u \in E,\ v \in E^*,\ x \in \Lambda(E)).$$
Here $\partial(v)$ is the odd derivation taking $x \in E$ to $2\varphi(x, v)$. It is then routine to show that
$$\eta(t)\,\tau'(u)\,\eta(t)^{-1} = \tau'(tu) \qquad (u \in E \cup E^*).$$
Now $\tau'$ is equivalent to $\tau$ and so we can transfer $\eta$ to an action, written again as $\eta$, of $H$ in the space of $\tau$ satisfying the above relation with respect to $\tau$. It is not necessary that $\eta$ be unitary. But we can normalize it to be a UR, namely $\kappa(t) = |\det(\eta(t))|^{-1/\dim(\tau)}\, \eta(t)$.
Remark 6. It is easy to give an independent argument that $\Lambda(E) \approx \Lambda(E^*)$. For the unitary group $U(E)$ of $E$ let $\Lambda^r$ be the representation on $\Lambda^r(E)$, and let $\Lambda$ be their direct sum; then a simple calculation of the characters on the diagonal group shows that $(\Lambda^r)^* \simeq \det^{-1} \otimes \Lambda^{n-r}$, where $n = \dim(E)$. Hence $\Lambda^* \simeq \det^{-1} \otimes \Lambda$, showing that $\Lambda^* \approx \Lambda$. It is then immediate that this result remains true for any group which acts unitarily on $E$.
Corollary 2. The conditions of the above lemma are met if $W = A \oplus B$, where $A$, $B$ are orthogonal submodules for $H$ which are equivalent. Moreover $\kappa \approx \Lambda(E) \simeq \Lambda(E^*) \simeq \Lambda(A) \simeq \Lambda(B)$.
Proof. Take ON bases $(a_j)$, $(b_j)$ for $A$ and $B$ respectively so that the map $a_j \mapsto b_j$ is an isomorphism of $H$-modules. If $E$ is the span of the $e_j = a_j + i b_j$, it is easy to check that $E$ is isotropic, and is a module for $H$ which is equivalent to $A$ and $B$.
We now assume that for some $r \ge 3$ we have a map $H \to \mathrm{Spin}(r) \to \mathrm{Spin}(W)$, where the first map is surjective, and $H$ acts on $W$ through $\mathrm{Spin}(W)$. Further let the representation of $\mathrm{Spin}(r)$ on $W$ be spinorial. We write $\sigma_r$ for the (complex) spin representation of $\mathrm{Spin}(r)$ if $r$ is odd and $\sigma_r^\pm$ for the (complex) spin representations of $\mathrm{Spin}(r)$ if $r$ is even. Likewise we write $s_r$, $s_r^\pm$ for the real irreducible spin modules. Note that $\dim(W)$ must be even.
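The isotropy claim in the proof of Corollary 2 is a one-line computation, $\varphi(e_j, e_k) = \varphi(a_j, a_k) - \varphi(b_j, b_k) = 0$, and can be checked mechanically. The sketch below is only an illustration under stated assumptions: $W = \mathbb{R}^{2m}$ with the standard form, $A$ and $B$ spanned by the first and last $m$ standard basis vectors, and $\varphi$ extended complex-bilinearly (no conjugation); the function names are hypothetical.

```python
def bilinear(x, y):
    """Complex-bilinear extension of the standard form on R^n (no conjugation)."""
    return sum(a * b for a, b in zip(x, y))


def isotropic_basis(m):
    """Vectors e_j = a_j + i b_j in C^(2m), with a_j, b_j the standard ON bases of A and B."""
    basis = []
    for j in range(m):
        e = [0j] * (2 * m)
        e[j] = 1 + 0j    # coefficient of a_j
        e[m + j] = 1j    # coefficient of b_j
        basis.append(e)
    return basis


def conjugate(x):
    return [z.conjugate() for z in x]


def is_isotropic(vectors):
    """True if the bilinear form vanishes on the span of the given vectors."""
    return all(bilinear(x, y) == 0 for x in vectors for y in vectors)
```

Under these assumptions the span $E$ of the $e_j$ is isotropic, its conjugate $E^*$ is isotropic as well, and $\varphi(e_j, \bar{e}_k) = 2\delta_{jk}$, so the pairing between $E$ and $E^*$ is nondegenerate and $W_{\mathbb{C}} = E \oplus E^*$.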
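Lemma 20. Let the representation of $\mathrm{Spin}(r)$ on $W$ be spinorial. Let $n$ be the number of real irreducible constituents of $W$ as a module for $\mathrm{Spin}(r)$, and, when $r$ is even, let $n^\pm$ be the number of irreducible constituents of real or quaternionic type. We then have the following determination of $\kappa$:
$$\begin{array}{ll}
r \bmod 8 & \kappa \\
0\ (n^\pm \text{ even}) & \Lambda\big((n^+/2)\sigma_r^+ \oplus (n^-/2)\sigma_r^-\big) \\
1, 7\ (n \text{ even}) & \Lambda\big((n/2)\sigma_r\big) \\
2, 6 & \Lambda(n\sigma_r^+) \approx \Lambda(n\sigma_r^-) \\
3, 5 & \Lambda(n\sigma_r) \\
4 & \Lambda(n^+\sigma_r^+ \oplus n^-\sigma_r^-)
\end{array}$$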
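Proof. This is a routine application of the lemma and corollary above if we note the following facts:
$r \equiv 0$: Here $\sigma_r^\pm = s_r^\pm$ and $W = n^+ s_r^+ + n^- s_r^-$.
$r \equiv 1, 7$: Here $\sigma_r = s_r$, $W = n s_r$.
$r \equiv 2, 6$: Over $\mathbf{C}$, $s_r$ becomes $\sigma_r^+ \oplus \sigma_r^-$ while $\sigma_r^\pm$ do not admit a non-zero invariant form. So $W_{\mathbf{C}} = E \oplus E^*$, where $E = n\sigma_r^+$, $E^* = n\sigma_r^-$, and $q$ is zero on $E$.
$r \equiv 3, 5$: $s_r$ is quaternionic, $W = n s_r$, $W_{\mathbf{C}} = 2n\sigma_r$ and $\sigma_r$ does not admit an invariant symmetric form.
$r \equiv 4$: $s_r^\pm$ are quaternionic and $\sigma_r^\pm$ do not admit a non-zero invariant symmetric form; $W_{\mathbf{C}} = E \oplus E^*$, where $E = n^+\sigma_r^+ + n^-\sigma_r^-$ and $q$ is zero on $E$.
In deriving these the reader should use the results in [Del99] and [Var04] on the reality of the complex spin modules and the theory of invariant forms for them.
4.4.1. Super Poincaré group associated to $\mathbf{R}^{1,3}$: N=1 supersymmetry. Here $T_0 = \mathbf{R}^{1,3}$, $L_0 = SL(2, \mathbf{C})_{\mathbf{R}}$, where the suffix $\mathbf{R}$ means that the complex group is viewed as a real Lie group. Let $\mathfrak{s} = \mathbf{2} \oplus \bar{\mathbf{2}}$, $\mathbf{2}$ being the holomorphic representation of $L_0$ in $\mathbf{C}^2$ and $\bar{\mathbf{2}}$ its complex conjugate. Thus we identify $\mathfrak{s}$ with $\mathbf{C}^2 \oplus \mathbf{C}^2$ and introduce the conjugation on $\mathfrak{s}$ given by $\overline{(u, v)} = (\bar v, \bar u)$. The action $(u, v) \to (gu, \bar g v)$ of $L_0$ ($\bar g$ is the complex conjugate of $g$) commutes with the conjugation and so defines the real form $\mathfrak{s}_{\mathbf{R}}$ invariant under $L_0$ (Majorana spinor). We take $\mathfrak{t}_0$ to be the space of $2 \times 2$ Hermitian matrices and the action of $L_0$ on it as $g, A \to g A \bar g^T$. For $(u_i, \bar u_i) \in \mathfrak{s}_{\mathbf{R}}$ ($i = 1, 2$) we put
$$[(u_1, \bar u_1), (u_2, \bar u_2)] = \tfrac{1}{2}\big(u_1 \bar u_2^T + u_2 \bar u_1^T\big).$$
Then $\mathfrak{g} = \mathfrak{g}_0 \oplus \mathfrak{g}_1$ with $\mathfrak{g}_0 = \mathfrak{t}_0 \oplus \mathfrak{l}_0$, $\mathfrak{g}_1 = \mathfrak{s}_{\mathbf{R}}$ is a super Lie algebra and $(T_0 \times' L_0, \mathfrak{g})$ is the SLG with which we are concerned. Here $\mathbf{R}^{1,3} \simeq \mathfrak{t}_0$ by the map
$$a \to h_a = \begin{pmatrix} a_0 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & a_0 - a_3 \end{pmatrix};$$
$\mathfrak{t}_0 \simeq \mathfrak{t}_0^*$ with $p \in \mathfrak{t}_0$ viewed as the linear form $a \to \langle a, p \rangle = a_0 p_0 - a_1 p_1 - a_2 p_2 - a_3 p_3$. Then
$$Q_p((u, \bar u)) = \tfrac{1}{4}\, \bar u^T h_{\check p}\, u, \qquad \check p = (p_0, -p_1, -p_2, -p_3).$$
I: $p_0 > 0$, $m^2 = \langle p, p \rangle > 0$. We take $p = mI$ so that $L_0^p = SU(2)$. Take $E = \{(u, 0)\}$, $E^* = \{(0, u)\}$. Then we are in the set up of Lemma 19. Then
$$\kappa_p = \Lambda(E) \simeq 2D^0 \oplus D^{1/2}, \qquad D^j \otimes \Lambda(E) = \begin{cases} 2D^j \oplus D^{j+1/2} \oplus D^{j-1/2} & (j \ge 1/2), \\ 2D^0 \oplus D^{1/2} & (j = 0). \end{cases}$$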
C. Carmeli, G. Cassinelli, A. Toigo, V.S. Varadarajan
Thus the multiplet with mass $m$ has the same mass $m$ and spins
$$\begin{cases} \{j, j, j + 1/2, j - 1/2\} & (j > 0), \\ \{0, 0, 1/2\} & (j = 0). \end{cases}$$
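The spin content just stated can be checked mechanically. The following is a minimal sketch, assuming only the Clebsch-Gordan rule for tensoring with spin 1/2 and the decomposition of the fundamental multiplet into two copies of spin 0 and one copy of spin 1/2; the function names are hypothetical and not taken from the paper.

```python
from fractions import Fraction

def tensor_with_spin_half(j):
    """Spins occurring in D^j (x) D^{1/2} by the Clebsch-Gordan rule."""
    j = Fraction(j)
    half = Fraction(1, 2)
    # For j = 0 only spin 1/2 occurs; otherwise both j - 1/2 and j + 1/2 occur.
    return [j + half] if j == 0 else [j - half, j + half]

def n1_massive_multiplet(j):
    """Spins, with multiplicity, in D^j (x) (2 D^0 (+) D^{1/2}), sorted."""
    j = Fraction(j)
    # Two copies of D^j come from the factor 2 D^0.
    return sorted([j, j] + tensor_with_spin_half(j))

for j in (0, Fraction(1, 2), 1):
    print(j, [str(s) for s in n1_massive_multiplet(j)])
```

A quick consistency check is the dimension count: the spins returned for a given $j$ should have total dimension $4(2j + 1)$, since the fundamental multiplet has dimension 4.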
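II: $p_0 > 0$, $\langle p, p \rangle = 0$. Here we take $p = (1, 0, 0, -1)$, $h_p = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}$. Then
$$L_0^p = \left\{ \begin{pmatrix} a & 0 \\ c & \bar a \end{pmatrix} \right\} \qquad (|a| = 1,\ c \in \mathbf{C}).$$
The characters $\chi_{n/2} : a \to a^n$ ($n \in \mathbf{Z}$) are viewed as characters of $L_0^p$. Here $Q_p((u, \bar u)) = \tfrac{1}{4}\, \bar u^T h_{\check p}\, u = |u_1|^2$. The radical of $\Phi_p$ is the span of $(e_2, e_2)$ and $(ie_2, -ie_2)$, $e_1, e_2$ being the standard basis of $\mathbf{C}^2$. We identify $\mathfrak{g}_{1p}$ with the span of $(e_1, e_1)$ and $(ie_1, -ie_1)$. We now apply Lemma 19 with $E = \mathbf{C}(e_1, 0)$ which carries the character defined by $\chi_{1/2}$; then
$$\Lambda(E) = \chi_0 \oplus \chi_{1/2}, \qquad \chi_{n/2} \otimes \Lambda(E) = \chi_{n/2} \oplus \chi_{(n+1)/2}.$$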
The multiplet is $\{n/2, (n + 1)/2\}$. These results go back to [SS74].
4.4.2. Extended supersymmetry. Here the SLG has still the Poincaré group as its even part but $\mathfrak{g}_1$ is the sum of $N > 1$ copies of $\mathfrak{s}_{\mathbf{R}}$. It is known ([Del99, Var04]) that one can identify $\mathfrak{g}_1$ with the direct sum $\mathfrak{s}_{\mathbf{R}}^N$ of $N$ copies of $\mathfrak{s}_{\mathbf{R}}$ in such a way that for the odd commutators we have
$$[(s_1, s_2, \dots, s_N), (s_1, s_2, \dots, s_N)] = \sum_{1 \le i \le N} [s_i, s_i]_1,$$
so that
$$Q_\lambda((s_1, \dots, s_N)) = \sum_{1 \le i \le N} Q^1_\lambda((s_i, s_i)).$$
Here the index 1 means the $[\ ,\ ]$ and $Q$ for the case $N = 1$ discussed above. Let $E^N = N E^1$.
I: $p_0 > 0$, $m^2 = \langle p, p \rangle > 0$. Then we apply Lemma 19 with $E = E^N$ so that $\kappa_p = \Lambda(N D^{1/2})$. The decomposition of the exterior algebra of $N D^{1/2}$ is tedious but there is no difficulty in principle. We have
$$\kappa_p = \bigoplus_{0 \le r \le N} c_{Nr}\, D^{r/2}, \qquad c_{Nr} > 0,\ c_{NN} = 1.$$
Then $j + N/2$ is the maximum value of $r$ for which $D^r$ occurs in $D^j \otimes \Lambda(N D^{1/2})$. The multiplet defined by the super particle of mass $m$ is thus
$$\begin{cases} \{j - N/2,\ j - N/2 + 1/2,\ \dots,\ j + N/2 - 1/2,\ j + N/2\} & (j \ge N/2), \\ \{0,\ 1/2,\ \dots,\ j + N/2 - 1/2,\ j + N/2\} & (0 \le j < N/2). \end{cases}$$
II: $p_0 > 0$, $m = 0$. Here
$$\kappa_\lambda = \Lambda(N \chi_{1/2}) = \bigoplus_{0 \le r \le N} \binom{N}{r} \chi_{r/2}.$$
The multiplet of the super particle has the helicity content $\{r/2, (r + 1)/2, \dots, (r + N)/2\}$.
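The multiplicities $c_{Nr}$ are not written out in the text. As a hedged illustration, they can be computed from a standard weight count that is an assumption of this sketch rather than a formula from the paper: in the exterior algebra of $N D^{1/2}$ (a space of dimension $2^{2N}$) the weight $(k - N)/2$ occurs $\binom{2N}{k}$ times, and the multiplicity of spin $r/2$ is the drop in weight multiplicity between $r/2$ and $r/2 + 1$. The function names are hypothetical.

```python
from math import comb

def massive_multiplicities(N):
    """Multiplicity c_{N r} of D^{r/2} in the exterior algebra of N D^{1/2}.

    Assumption of this sketch: weight (k - N)/2 has multiplicity C(2N, k),
    so c_{N r} = C(2N, N + r) - C(2N, N + r + 2) for 0 <= r <= N.
    """
    return {r: comb(2 * N, N + r) - comb(2 * N, N + r + 2) for r in range(N + 1)}

def massless_multiplicities(N):
    """Multiplicity C(N, r) of the character chi_{r/2} in Lambda(N chi_{1/2})."""
    return {r: comb(N, r) for r in range(N + 1)}

for N in (1, 2, 3):
    print(N, massive_multiplicities(N), massless_multiplicities(N))
```

For $N = 1$ this gives two copies of $D^0$ and one of $D^{1/2}$, in agreement with the $N = 1$ case above, and in every case the dimensions add up to $4^N$ (massive) and $2^N$ (massless).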
4.4.3. Super particles of infinite spin. The little groups for zero mass have irreducible UR's which are infinite dimensional. Since $L_0^p$ is also a semidirect product its irreducible UR's can be determined by the usual method. The orbits of $S^1$ in $\mathbf{C}$ (which is identified with its dual) are the circles $\{|a| = r\}$ for $r > 0$ and the stabilizers of the points are all the same, the group $\{\pm 1\}$. The irreducible UR's of infinite dimension can then be parametrized as $\{\alpha_{r,\pm}\}$. Now $\Lambda(E) = \chi_0 \oplus \chi_{1/2}$ and an easy calculation gives $\alpha_{r,\pm} \otimes \Lambda(E) = \alpha_{r,+} \oplus \alpha_{r,-}$. The particles in the multiplet with mass 0 corresponding to spin $(r, \pm)$ consist of both types of infinite spin with the same $r$.
4.4.4. Super Poincaré groups of Minkowski super spacetimes of arbitrary dimension. Let $T_0 = \mathbf{R}^{1,D-1}$. We first determine $\kappa_p$ in the massive case. Here $L_0^p = \mathrm{Spin}(D - 1)$ and the form $\Phi_p$ is strictly positive definite. So $\mathfrak{g}_{1p} = \mathfrak{g}_1$ and $L_0^p$ acts on it by restriction, hence spinorially by Lemma 16. So Lemma 20 applies at once. It only remains to determine $n$, $n^\pm$ in terms of the corresponding $N$, $N^\pm$ for $\mathfrak{g}_1$ viewed as a module for $L_0$. Notation is as in Lemma 20, and res is restriction to $L_0^p$; $r = D - 1$. This is done by writing $\mathfrak{g}_1$ as a sum of the $s_D$ and determining the restrictions of the $s_D$ to $L_0^p$ by dimension counting. We again omit the details but refer the reader to [Del99, Var04].
Proposition 8. When $p_0 > 0$, $m^2 = \langle p, p \rangle > 0$, $\kappa_p$, the fundamental multiplet of the super particle of mass $m$, is given according to the following table:
D mod 8         res($s_D$)                     $\kappa_p$
0               $2s_{D-1}$                     $\Lambda(N\sigma_{D-1})$
1 ($N = 2k$)    $s^+_{D-1} + s^-_{D-1}$        $\Lambda(k\sigma^+_{D-1} \oplus k\sigma^-_{D-1})$
2 ($N = 2k$)    $s_{D-1}$                      $\Lambda(k\sigma_{D-1})$
3               $s_{D-1}$                      $\Lambda(N\sigma^\pm_{D-1})$
4               $s_{D-1}$                      $\Lambda(N\sigma_{D-1})$
5               $s^+_{D-1} + s^-_{D-1}$        $\Lambda(N\sigma^+_{D-1} \oplus N\sigma^-_{D-1})$
6               $s_{D-1}$                      $\Lambda(N\sigma_{D-1})$
7               $2s_{D-1}$                     $\Lambda((2N)\sigma^\pm_{D-1})$
We now extend these results to the case when $p$ has zero mass. Let $V = \mathbf{R}^{1,D-1}$ ($D \ge 4$) and $p \ne 0$ a null vector in $V$. Let $e_j$ be the standard basis vectors for $V$ so that $(e_0, e_0) = -(e_j, e_j) = 1$ for $j = 1, \dots, D - 1$. We may assume that $p = e_0 + e_1$. Let $V_1$ be the span of $e_j$ ($2 \le j \le D - 1$). The signature of $V_1$ is $(0, D - 2)$. Then we have the flag $0 \subset \mathbf{R}p \subset p^\perp \subset V$ left stable by the stabilizer $L_0^p$ of $p$ in $L_0$. The quadratic form on $p^\perp$ has $\mathbf{R}p$ as its radical and so induces a nondegenerate form on $V_1' := p^\perp/\mathbf{R}p$. Write $L_1 = \mathrm{Spin}(V_1')$. Note that $V_1' \simeq V_1$.
We have a map $x \to x'$ from $L_0^p$ to $L_1$ where, for $v \in p^\perp$ with image $v' \in V_1'$, $x'v' = (xv)'$. It is known that this is surjective and its kernel $T_1 := T_0^p$ is isomorphic to $V_1'$ canonically: for $x \in T_0^p$, the vector $xe_0 - e_0 \in p^\perp$, and the map that sends $x$ to the image $t(x)$ of $xe_0 - e_0$ in $V_1'$ is well defined and is an isomorphism of $T_0^p$ with $V_1'$. The map $x \to (t(x), x')$ is an isomorphism of $L_0^p$ with the semidirect product $V_1' \times' L_1$. The Lie algebra of the big spin group $L_0$ has the $e_r e_s$ ($r < s$) as basis and it is a simple calculation that the Lie algebra of $L_0^p$ has as basis $t_j = (e_0 + e_1)e_j$ ($2 \le j \le D - 1$), $e_r e_s$ ($2 \le r < s \le D - 1$) with the $t_j$ forming a basis of the Lie algebra of $T_1$. The $e_r e_s$ ($2 \le r < s \le D - 1$) span a Lie subalgebra of the Lie algebra of $L_0$ and the corresponding subgroup $H \subset L_0^p$ is such that $L_0^p \simeq T_1 \times' H$. For all of this see [Var04], pp. 36-37.
We shall now determine the structure of the restriction to $L_0^p$ of the irreducible spin representation(s) of $L_0$, over $\mathbf{R}$ as well as over $\mathbf{C}$. Since this may not be known widely we give some details. We begin with some preliminary remarks.
Let $U$ be any finite dimensional complex $L_0^p$-module. Write $U = \oplus_\chi U_\chi$, where $U_\chi$, for any character (not necessarily unitary) $\chi$ of $T_1$, is the subspace of all elements $u \in U$ such that $(t - \chi(t))^m u = 0$ for sufficiently large $m$. The action of $L_1$ permutes the $U_\chi$, and so, since $L_1$ has no finite nontrivial orbit in the space of characters of $T_1$, it follows that the spectrum of $T_1$ consists only of the trivial character, i.e., $T_1$ acts unipotently. In particular $U_1$, the subspace of $T_1$-invariant elements of $U$, is $\ne 0$, an assertion which is then valid for real modules also. It follows that we have a strictly increasing filtration $(U_i)_{i \ge 1}$, where $U_{i+1}$ is the preimage in $U$ of $(U/U_i)_1$. In particular, if $U$ is semisimple, $U = U_1$.
Lemma 21. Let $W$ be an irreducible real or complex spin module for $L_0$. Let $W_1$ be the subspace of all elements of $W$ fixed by $T_1$ and $W^1 := W/W_1$. We then have the following:
(i) $0 \ne W_1 \ne W$, $W_1$ is the unique proper non-zero $L_0^p$-submodule of $W$, and $T_1$ acts trivially also on $W^1$.
(ii) $W_1$ and $W^1$ are both irreducible $L_0^p$-modules on which $L_0^p/T_1 \simeq L_1$ acts as a spin module.
(iii) The exact sequence $0 \longrightarrow W_1 \longrightarrow W \longrightarrow W/W_1 \longrightarrow 0$ does not split.
(iv) Over $\mathbf{R}$, $W$, $W_1$, $W^1$ are all of the same type. If $\dim(V)$ is odd, $W_1 \simeq W^1$. Let $\dim(V)$ be even; then, over $\mathbf{C}$, $W_1$, $W^1$ are the two irreducible spin modules for $L_1$; over $\mathbf{R}$, $W_1 \simeq W^1$ when $W$ is of complex type, namely, when $D \equiv 0, 4 \bmod 8$; otherwise, the modules $W_1$, $W^1$ are the two irreducible modules of $L_1$ (which are either real or quaternionic).
Proof. We first work over $\mathbf{C}$. Let $C$ be the Clifford algebra of $V$. The key point is that $W_1 \ne W$. Suppose $W_1 = W$. Then $t_j = 0$ on $W$ for all $j$. If $D$ is odd, $C^+$ is a full matrix algebra and so all of its modules are faithful, giving a contradiction. Let $D$ be even and $W$ one of the spin modules for $L_0$. We know that inner automorphism by the invertible odd element $e_2$ changes $W$ to the other spin module. But as $e_2 t_j e_2^{-1} = t_j$ for $j > 2$ and $-t_2$ for $j = 2$, it follows that $t_j = 0$ on the other spin module also. Hence $t_j = 0$ in the irreducible module for the full Clifford algebra $C$. Now $C$ is isomorphic to a full matrix algebra and so its modules are faithful, giving again $t_j = 0$, a contradiction.
Let $(W_i)$ be the strictly increasing flag of $L_0^p$-modules, with $T_1$ acting trivially on each $W_{i+1}/W_i$, defined by the previous discussion. Let $m$ be such that $W_m = W$. Clearly $m \ge 2$. On the other hand, the element $-1$ of $L_0$ lies in $L_0^p$ and as it acts as $-1$ on $W$, it acts as $-1$ on all the $W_{i+1}/W_i$. Hence $\dim(W_{i+1}/W_i) \ge \dim(\sigma_{D-2})$ (see Lemma 6.8.1 of [Var04]), and there is equality if and only if $W_{i+1}/W_i \simeq \sigma_{D-2}$. Since $\dim(\sigma_D) = 2\dim(\sigma_{D-2})$, we see at once that $m = 2$ and that both $W_1$ and $W/W_1$ are irreducible $L_0^p$-modules which are spin modules for $L_1$. The exact sequence in (iii) cannot split, as otherwise $T_1$ will be trivial on all of $W$. Suppose now $U$ is a non-zero proper $L_0^p$-submodule of $W$. Then $\dim(U) = \dim(W_1)$ for the same dimensional argument as above, and so $U$ is irreducible, thus $T_1$ is trivial on it, showing that $U = W_1$. We have thus proved (i)–(iii).
We now prove (iv). There is nothing to prove when $D$ is odd since there is only one spin module. Suppose $D$ is even. Let us again write $\sigma^\pm_{D-2}$ for the $L_0^p$-modules obtained by lifting the irreducible spin modules of $L_1$ to $L_0^p$. We consider two cases.
$D \equiv 0, 4 \bmod 8$. In this case $\sigma_D^\pm$ are self dual while $\sigma^\pm_{D-2}$ are dual to each other. It is not restrictive to assume $W = \sigma_D^+$ and $W_1 = \sigma^+_{D-2}$. We have the quotient map $W = \sigma_D^+ \longrightarrow W^1 = \sigma$; and $\sigma$ is to be determined. Writing $\sigma'$ for the dual of $\sigma$, we get $\sigma' \subset (\sigma_D^+)' \simeq \sigma_D^+$, so that, by the uniqueness of the submodule proved above, $\sigma' = \sigma^+_{D-2}$. Hence $\sigma = \sigma^-_{D-2}$.
$D \equiv 2, 6 \bmod 8$. Now $\sigma_D^\pm$ are dual to each other while $\sigma^\pm_{D-2}$ are self dual. Again, suppose $W = \sigma_D^+$ and $W_1 = \sigma^+_{D-2}$. The above argument then gives $\sigma' \subset \sigma_D^-$. On the other hand, the inner automorphism by $e_2$ transforms $\sigma_D^+$ into $\sigma_D^-$, and the subspace of $\sigma_D^+$ fixed by $T_1$ into the corresponding subspace of $\sigma_D^-$, while at the same time changing $\sigma^+_{D-2}$ into $\sigma^-_{D-2}$. Hence it changes the inclusion $\sigma^+_{D-2} \subset \sigma_D^+$ into the inclusion $\sigma^-_{D-2} \subset \sigma_D^-$. Hence we have $\sigma' = \sigma^-_{D-2}$. Dualizing, this gives $\sigma = \sigma^-_{D-2}$ once again. This finishes the proof of the lemma over $\mathbf{C}$.
We now work over $\mathbf{R}$. Since both $V$ and $V_1$ have the same signature $D - 2$, it is immediate that the real spin modules for $L_0$ and $L_1$ are of the same type. As an $L_0$-module, the complexification $W_{\mathbf{C}}$ of $W$ is either irreducible or is a direct sum $U \oplus \bar U$, where $U$ is a complex spin module for $L_0$. In the first case $W$ is of real type and the lemma follows from the lemma for the complex spin modules. In the second case $W$ is of quaternionic or complex type according as $U$ and $\bar U$ are equivalent or not.
Complex type. We have $W_{\mathbf{C}} = U \oplus Z$, where $Z = \bar U$ is the complex conjugate of $U$. We have $U \simeq \sigma_D^+$, $Z \simeq \sigma_D^-$. Since $\mathbf{C}W_1 = U_1 \oplus Z_1$ it is clear that $0 \ne W_1 \ne W$. The real irreducible spin modules of $L_1$ have dimension $2^{D/2-1}$ and so we find that $\dim(W_1) = 2^{D/2-1}$ and $W_1$, $W^1$ are both irreducible; they are equivalent as they are of complex type. A non-zero proper submodule $R$ of $W$ then has dimension $2^{D/2-1}$ and so must be irreducible. Hence either $R = W_1$ or $W = W_1 \oplus R$. But then $W = W_1$, a contradiction. The same argument shows that the exact sequence in the lemma does not split.
Quaternionic type. For proving (i)–(iii) of the lemma the argument is the same as in the complex type, except that $Z \simeq U$. We now check (iv). The case of odd dimension is obvious. So let $D$ be even and $W_1 \simeq s^+_{D-2}$. Then $\mathbf{C}W_1 \simeq 2\sigma^+_{D-2}$ so that $U_1 \simeq Z_1 \simeq \sigma^+_{D-2}$. Then $U^1 \simeq Z^1 \simeq \sigma^-_{D-2}$ and so $\mathbf{C}W^1 \simeq 2\sigma^-_{D-2}$. Hence $W^1 \simeq s^-_{D-2}$. The lemma is completely proved.
Remark 7. Since we are interested in the quotient $W^1$ rather than $W_1$ below, we change our convention slightly; for $W = s_D^\pm$ we write $W^1 = s^\pm_{D-2}$ and $W_1 = s^\mp_{D-2}$.
We now come to the discussion of the structure of $\kappa_p$ when $p$ is in a massless orbit.
Proposition 9. Let $p_0 > 0$, $\langle p, p \rangle = 0$. Then $\mathrm{rad}(\Phi_p) = \mathfrak{g}_1^{T_1}$, the subspace of elements of $\mathfrak{g}_1$ fixed by $T_1$. Moreover $T_1$ acts trivially on $\mathfrak{g}_{1p}$, $\mathrm{rad}(\Phi_p) \simeq \mathfrak{g}_{1p}$ except in the cases $D \equiv 2, 6 \bmod 8$ when $\mathrm{rad}(\Phi_p)$ and $\mathfrak{g}_{1p}$ are dual to each other, and $L_0^p/T_1 \simeq L_1$ acts spinorially on $\mathrm{rad}\,\Phi_p$ and $\mathfrak{g}_{1p}$. In all cases $\dim(\mathrm{rad}(\Phi_p)) = \dim(\mathfrak{g}_{1p}) = (1/2)\dim(\mathfrak{g}_1)$. For $\mathfrak{g}_{1p}$ as well as the associated $\kappa_p$ the results are as in the following table:
D mod 8                $\mathfrak{g}_{1p}$                          $\kappa_p$
0, 4                   $N s_{D-2}$                                  $\Lambda(N\sigma^\pm_{D-2})$
2 ($N^\pm = 2n^\pm$)   $N^+ s^+_{D-2} + N^- s^-_{D-2}$              $\Lambda(n^+\sigma^+_{D-2} + n^-\sigma^-_{D-2})$
6                      $N^+ s^+_{D-2} + N^- s^-_{D-2}$              $\Lambda(N^+\sigma^+_{D-2} + N^-\sigma^-_{D-2})$
1, 3 ($N = 2n$)        $N s_{D-2}$                                  $\Lambda(n\sigma_{D-2})$
5, 7                   $N s_{D-2}$                                  $\Lambda(N\sigma_{D-2})$
Proof. We have $\mathfrak{g}_1 = \oplus_{1 \le i \le N} \mathfrak{h}_i$, where the $\mathfrak{h}_i$ are real irreducible spin modules and $[\mathfrak{h}_i, \mathfrak{h}_j] = 0$ for $i \ne j$ while $\langle [X, X], q \rangle > 0$ for all $q \in \Gamma^+$. Let $\mathfrak{r}_p = \mathrm{rad}\,\Phi_p$ and $\mathfrak{r}_{ip}$ the radical of the restriction of $\Phi_p$ to $\mathfrak{h}_i$. Since $Q_p(X) = \sum_i Q_p(X_i)$, where $X_i$ is the component of $X$ in $\mathfrak{h}_i$, it follows that $\mathfrak{r}_p = \oplus_i \mathfrak{r}_{ip}$. We now claim that $\mathfrak{r}_{ip} = (\mathfrak{h}_i)_1$, namely, the subspace of elements of $\mathfrak{h}_i$ fixed by $T_1$. Since $\mathfrak{r}_{ip}$ is a $L_0^p$-submodule it suffices, in view of the lemma above, to show that $0 \ne \mathfrak{r}_{ip} \ne \mathfrak{h}_i$. If $\mathfrak{r}_{ip}$ were 0, $\Phi_p$ would be strictly positive definite on $\mathfrak{h}_i$, and hence the action of $L_0^p$ will have an invariant positive definite quadratic form. So the action of $L_0^p$ on $\mathfrak{h}_i$ will be semisimple, implying that $T_1$ will act trivially on $\mathfrak{h}_i$. This is impossible, since, by the preceding lemma, $\mathfrak{h}_i \ne (\mathfrak{h}_i)_1$. If $\mathfrak{r}_{ip} = \mathfrak{h}_i$, then $\Phi_p = 0$ on $\mathfrak{h}_i$, and this will imply that $p = 0$. Thus $\mathfrak{r}_{ip} = (\mathfrak{h}_i)_1$, hence $\mathfrak{r}_p = (\mathfrak{g}_1)_1$. The other assertions except the table are now clear. For the table we need to observe that $\mathfrak{g}_{1p} = \oplus_i \mathfrak{h}_i/\mathfrak{r}_{ip}$ and that $\mathfrak{r}_{ip} \simeq \mathfrak{h}_i/\mathfrak{r}_{ip}$ except when $D \equiv 2, 6 \bmod 8$; in these cases, the two modules are the two real or quaternionic spin modules which are dual to each other. The table is worked out in a similar manner to Proposition 8. We omit the details.
Remark 8. The result that $\mathfrak{g}_{1p}$ has half the dimension of $\mathfrak{g}_1$ extends the known calculations when $D = 4$ (see [FSZ81]).
4.4.5. The role of the R-group in classifying the states of $\kappa_p$. In the case of $N$-extended supersymmetry we have two groups acting on $\mathfrak{g}_1$: $L_0^p$, the even part of the little super group at $p$, and the R-group ([Del99, Var04]) $R$. Their actions commute and they both leave the quadratic form $Q_p$ invariant. In the massive case we have a map
$$L_0^p \times R \longrightarrow \mathrm{Spin}(\mathfrak{g}_{1p})$$
so that one can speak of the restriction $\kappa_p$ of the spin representation of $\mathrm{Spin}(\mathfrak{g}_{1p})$ to $L_0^p \times R$. The same is true in the massless case except we have to replace $L_0^p$ by a twofold cover of it. It is thus desirable to not just determine $\kappa_p$ as we have done but actually determine this representation $\kappa_p$ of $L_0^p \times R$. We have not done this but there is no difficulty in principle. However, when $D = 4$, we have a beautiful formula [FSZ81]. To describe this, assume that we are in the massive case. We first remark that $\mathfrak{g}_1 \simeq \mathbf{H}^N \otimes_{\mathbf{H}} S_0$, where $S_0$ is the quaternionic irreducible of $SU(2)$ of dimension 4. Thus the R-group is the unitary group $U(N, \mathbf{H})$. Over $\mathbf{C}$ we thus have $\mathfrak{g}_{1\mathbf{C}} \simeq \mathbf{C}^{2N} \otimes \mathbf{C}^2$, where the R-group is the symplectic group $Sp(2N, \mathbf{C})$ acting on the first factor and $L_0^p \simeq SU(2)$ acts as $D^{1/2}$ on the second factor. The irreducible representations of $Sp(2N, \mathbf{C}) \times SU(2)$ are outer tensor products of irreducibles $a$ of the first factor and $b$ of the second factor, written as $(a, b)$.
Let $\mathbf{k}$ denote the irreducible of dimension $k$ of $SU(2)$ and $[2N]_k$ denote the irreducible representation of the symplectic group in the space of traceless antisymmetric tensors of rank $k$ over $\mathbf{C}^{2N}$; by convention for $k = 0$ this is the trivial representation and for $k = 1$ it is the vector representation. Then
$$\kappa_p \simeq ([2N]_0, \mathbf{N + 1}) + ([2N]_1, \mathbf{N}) + \dots + ([2N]_k, \mathbf{N + 1 - k}) + \dots + ([2N]_N, \mathbf{1}).$$
To see how this follows from our theory note that $\mathbf{C}^{2N} \otimes e_1$ is a subspace satisfying the conditions of Lemma 19 for the symplectic group and so $\kappa_p \simeq \Lambda(\mathbf{C}^{2N})$. It is known that
$$\Lambda(\mathbf{C}^{2N}) \simeq \bigoplus_{0 \le k \le N} (N + 1 - k)\,[2N]_k.$$
On the other hand we know that the representations of $SU(2)$ in $\kappa_p$ are precisely the $\mathbf{N + 1 - k}$ ($0 \le k \le N$). The formula for $\kappa_p$ is now immediate. In the massless case the R-group becomes $U(N)$ and
$$\kappa_p \simeq \bigoplus_{0 \le k \le N} ((N - k)/2, [N]_k),$$
where $r/2$ denotes the character denoted earlier by $\chi_{r/2}$ and $[N]_k$ is the irreducible representation of $U(N)$ defined on the space $\Lambda^k(\mathbf{C}^N)$. We omit the proof which is similar.
Acknowledgement. G. C. gratefully acknowledges a grant of the Università di Genova that has made possible a visit to Los Angeles during which some of the work for this paper has been done. He thanks Professor V. S. Varadarajan for his warm hospitality. V. S. V. would like to thank Professor Giuseppe Marmo and INFN, Naples, Professor Sergio Ferrara and CERN, Geneva, and Professors Enrico Beltrametti and Gianni Cassinelli and INFN, Genoa, for their hospitality during the summers of 2003 and 2004, during which most of the work for this paper was done. We are grateful to Professor Pierre Deligne of the Institute for Advanced Study, Princeton, NJ, for his interest in our work and for his comments which have improved the paper.
References
[Del99] Deligne, P.: Notes on spinors. In: Quantum fields and strings: a course for mathematicians, Vol. 1, 2 (Princeton, NJ, 1996/1997), Providence, RI: Amer. Math. Soc., 1999, pp. 99–135
[DM78] Dixmier, J., Malliavin, P.: Factorisations de fonctions et de vecteurs indéfiniment différentiables. Bull. Sci. Math. (2), 102(4), 307–330 (1978)
[DM99] Deligne, P., Morgan, J.W.: Notes on supersymmetry (following Joseph Bernstein). In: Quantum fields and strings: a course for mathematicians, Vol. 1, 2 (Princeton, NJ, 1996/1997), Providence, RI: Amer. Math. Soc., 1999, pp. 41–97
[DP85] Dobrev, V.K., Petkova, V.B.: All positive energy unitary irreducible representations of extended conformal supersymmetry. Phys. Lett. B 162(1–3), 127–132 (1985)
[DP86] Dobrev, V.K., Petkova, V.B.: All positive energy unitary irreducible representations of the extended conformal superalgebra. In: Conformal groups and related symmetries: physical results and mathematical background (Clausthal-Zellerfeld, 1985), Vol. 261 of Lecture Notes in Phys., Berlin: Springer, 1986, pp. 300–308
[DP87] Dobrev, V.K., Petkova, V.B.: Group-theoretical approach to extended conformal supersymmetry: function space realization and invariant differential operators. Fortschr. Phys. 35(7), 537–572 (1987)
[Fer01] Ferrara, S.: UCLA lectures on supersymmetry. 2001
[Fer03] Ferrara, S.: UCLA lectures on supersymmetry. 2003
[FSZ81] Ferrara, S., Savoy, C.A., Zumino, B.: General massive multiplets in extended supersymmetry. Phys. Lett. B 100(5), 393–398 (1981)
[Nel59] Nelson, E.: Analytic vectors. Ann. Math. (2), 70, 572–615 (1959)
[SS74] Salam, A., Strathdee, J.: Unitary representations of super-gauge symmetries. Nucl. Phys. B80, 499–505 (1974)
[Var04] Varadarajan, V.S.: Supersymmetry for mathematicians: an introduction, Vol. 11 of Courant Lecture Notes in Mathematics. New York: New York University Courant Institute of Mathematical Sciences, 2004
[Wit82] Witten, E.: Supersymmetry and Morse theory. J. Differ. Geom. 17(4), 661–692 (1983)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 263, 259–276 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1510-7
Communications in
Mathematical Physics
Sufficiency in Quantum Statistical Inference
Anna Jenčová (1), Dénes Petz (2)
(1) Mathematical Institute of the Slovak Academy of Sciences, Stefanikova 49, Bratislava, Slovakia. E-mail: jenca@mat.savba.sk
(2) Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, POB 127, 1364 Budapest, Hungary. E-mail:
[email protected] Received: 27 December 2004 / Accepted: 22 September 2005 Published online: 26 January 2006 – © Springer-Verlag 2006
Abstract: This paper attempts to develop a theory of sufficiency in the setting of non-commutative algebras parallel to the ideas in classical mathematical statistics. Sufficiency of a coarse-graining means that all information is extracted about the mutual relation of a given family of states. In the paper sufficient coarse-grainings are characterized in several equivalent ways and the non-commutative analogue of the factorization theorem is obtained. As an application we discuss exponential families. Our factorization theorem also implies two further important results, previously known only in finite Hilbert space dimension, but proved here in generality: the Koashi-Imoto theorem on maps leaving a family of states invariant, and the characterization of the general form of states in the equality case of strong subadditivity.
Supported by the EU Research Training Network Quantum Probability with Applications to Physics, Information Theory and Biology and Center of Excellence SAS Physics of Information I/2/2005.
Supported by the Hungarian grant OTKA T032662.
1. Introduction
A quantum mechanical system is described by a C*-algebra, the dynamical variables (or observables) correspond to the self-adjoint elements and the physical states of the system are modelled by the normalized positive functionals of the algebra, see [4, 5]. The evolution of the system M can be described in the Heisenberg picture in which an observable A ∈ M moves into α(A), where α is a linear transformation. α is an automorphism in the case of the time evolution of a closed system but it could be the irreversible evolution of an open system. The Schrödinger picture is dual, it gives the transformation of the states, the state ϕ ∈ M∗ moves into ϕ ◦ α. The algebra of a quantum system is typically non-commutative but the mathematical formalism supports commutative algebras as well. A simple measurement is usually modelled by a family of pairwise orthogonal projections, or more generally, by a partition of unity, $(E_i)_{i=1}^n$. Since all $E_i$ are supposed to be positive and $\sum_i E_i = I$, $\beta : \mathbf{C}^n \to M$, $(z_1, z_2, \dots, z_n) \mapsto \sum_i z_i E_i$ gives a positive unital mapping from the commutative C*-algebra $\mathbf{C}^n$ to the non-commutative algebra M. Every positive unital mapping occurs in this way. The essential concept in quantum information theory is the state transformation which is affine and the dual of a positive unital mapping. All these and several other situations justify the study of positive unital mappings between C*-algebras from a quantum statistical viewpoint.
If the algebra M is “small” and N is “large”, and the mapping α : M → N sends the state ϕ of the system of interest to the state ϕ ◦ α at our disposal, then loss of information takes place and the problem of statistical inference is to reconstruct the real state from partial information. In this paper we mostly consider parametric statistical models, a parametric family $S := \{\varphi_\theta : \theta \in \Theta\}$ of states is given and on the basis of the partial information the correct value of the parameter should be decided. If the partial information is the outcome of a measurement, then we have statistical inference in the very strong sense. However, there are “more quantum” situations, to decide between quantum states on the basis of quantum data, see Example 4 below. The problem we discuss is not the procedure of the decision about the true state of the system but the circumstances under which this is perfectly possible.
The paper is organized as follows. In the next section we summarize the relevant basic concepts both in classical statistics and in the non-commutative framework. The first part of Sect. 3 is about sufficient subalgebras, or subsystems of a quantum system. The second part is devoted to sufficient coarse-grainings. Most of the results of this section have been known in a more restricted situation of faithful states, see Chap. 9 in [12]. The importance of the multiplicative domain of a completely positive mapping is emphasized here. This concept allows us to give a sufficient subalgebra determined by a sufficient coarse-graining.
The quantum factorization theorem of Sect. 4 is the main result of the paper. The factorization of the states corresponds to a special structure of the algebras and the sufficient coarse-grainings. We use the properties of the von Neumann entropy and of the modular group to prove this result in some infinite dimensional situations (where the essential condition is the finiteness of the von Neumann entropy). The factorization implies a generalization of the Koashi-Imoto Theorem [7]. In Sect. 5 the equality case in the strong subadditivity of the von Neumann entropy is discussed in a possibly infinite dimensional framework and the factorization result is applied.
2. Preliminaries
In this paper, C*-algebras always have a unit $I$. Given a C*-algebra M, a state ϕ of M is a linear function M → C such that $\varphi(I) = 1 = \|\varphi\|$. (Note that the second condition is equivalent to the positivity of ϕ.) The books [4, 5], among many others, explain the basic facts about C*-algebras. The class of finite dimensional full matrix algebras form a small and algebraically rather trivial subclass of C*-algebras, but from the viewpoint of non-commutative statistics, almost all ideas and concepts appear in this setting. A matrix algebra $M_n(\mathbf{C})$ admits a canonical trace Tr and all states are described by their densities with respect to Tr. The correspondence is given by $\varphi(A) = \mathrm{Tr}\, D_\varphi A$ ($A \in M_n(\mathbf{C})$) and we can simply identify the functional ϕ by the density $D_\varphi$. Note that the density is a positive (semi-definite) matrix of trace 1.
Let M and N be C*-algebras. Recall that 2-positivity of α : M → N means that
$$\begin{pmatrix} \alpha(A) & \alpha(B) \\ \alpha(C) & \alpha(D) \end{pmatrix} \ge 0 \quad \text{if} \quad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \ge 0$$
for 2×2 matrices with operator entries. It is well-known that a 2-positive unit-preserving mapping α satisfies the Schwarz inequality
$$\alpha(A^* A) \ge \alpha(A)^* \alpha(A). \qquad (1)$$
A 2-positive unital mapping between C*-algebras will be called coarse-graining. Here are two fundamental examples.
Example 1. Let $X$ be a finite set and N be a C*-algebra. Assume that for each $x \in X$ a positive operator $E(x) \in N$ is given and $\sum_x E(x) = I$. In quantum mechanics such a setting is a model for a measurement with values in $X$. The space $C(X)$ of functions on $X$ is a C*-algebra and the partition of unity $E$ induces a coarse-graining $\alpha : C(X) \to N$ given by $\alpha(f) = \sum_x f(x) E(x)$. Therefore a coarse-graining defined on a commutative algebra is an equivalent way to give a measurement. (Note that the condition of 2-positivity is automatically fulfilled on a commutative algebra.)
Example 2. Let M be the algebra of all bounded operators acting on a Hilbert space H and let N be the infinite tensor product M ⊗ M ⊗ . . .. (To understand the essence of the example one does not need the very formal definition of the infinite tensor product.) If γ denotes the right shift on N, then we can define a sequence $\alpha_n$ of coarse-grainings M → N:
$$\alpha_n(A) := \frac{1}{n}\big(A + \gamma(A) + \dots + \gamma^{n-1}(A)\big).$$
$\alpha_n$ is the quantum analogue of the sample mean.
Let $(X_i, \mathcal{A}_i, \mu_i)$ be a measure space ($i = 1, 2$). Recall that a positive linear map $M : L^\infty(X_1, \mathcal{A}_1, \mu_1) \to L^\infty(X_2, \mathcal{A}_2, \mu_2)$ is called a Markov operator if it satisfies $M1 = 1$ and $f_n \searrow 0$ implies $Mf_n \searrow 0$. For mappings defined between von Neumann algebras, the monotone continuity is called normality. In the case that M and N are von Neumann algebras, a coarse-graining M → N will be always supposed to be normal. Our concept of coarse-graining is the analogue of the Markov operator. We mostly mean that a coarse-graining transforms observables to observables corresponding to the Heisenberg picture and in this case we assume that it is unit preserving. The dual of such a mapping acts on states or on density matrices and it will be called coarse-graining as well.
We recall some well-known results from mathematical statistics, see [24] for details.
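The measurement coarse-graining of Example 1 can be made concrete in the smallest non-trivial case. The sketch below assumes a three-point set $X = \{0, 1, 2\}$, the algebra of real symmetric $2 \times 2$ matrices as a stand-in for N, and a partition of unity $E$ whose entries are illustrative choices, not taken from the paper. It checks that $\alpha$ is unital and evaluates the gap $\alpha(f^2) - \alpha(f)^2$ in the Schwarz inequality (1) for a real function $f$.

```python
from fractions import Fraction as F

# Hypothetical partition of unity on X = {0, 1, 2} by positive 2x2 matrices;
# the entries are illustrative and sum to the identity.
E = {
    0: [[F(1, 2), F(0)], [F(0), F(0)]],
    1: [[F(1, 4), F(1, 4)], [F(1, 4), F(1, 4)]],
    2: [[F(1, 4), F(-1, 4)], [F(-1, 4), F(3, 4)]],
}

def alpha(f):
    """Coarse-graining alpha(f) = sum_x f(x) E(x) for a real function f on X."""
    return [[sum(f[x] * E[x][i][j] for x in E) for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def is_psd(m):
    """Positive semidefiniteness test for a real symmetric 2x2 matrix."""
    symmetric = m[0][1] == m[1][0]
    return symmetric and m[0][0] >= 0 and m[1][1] >= 0 and m[0][0] * m[1][1] - m[0][1] ** 2 >= 0

def schwarz_gap(f):
    """alpha(f^2) - alpha(f)^2, which inequality (1) requires to be positive."""
    squared = alpha({x: f[x] ** 2 for x in f})
    a = alpha(f)
    a2 = matmul(a, a)  # alpha(f)* alpha(f) = alpha(f)^2 since f is real
    return [[squared[i][j] - a2[i][j] for j in range(2)] for i in range(2)]

one = {0: 1, 1: 1, 2: 1}
f = {0: 1, 1: 2, 2: -1}
print(alpha(one))                  # the identity matrix: alpha is unital
print(is_psd(schwarz_gap(f)))      # positivity of the Schwarz gap for this f
```

Exact rational arithmetic is used so that the positivity tests are not affected by rounding; the restriction to real symmetric matrices keeps the sketch free of complex conjugation.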
Let $(X, \mathcal{A})$ be a measurable space and let $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ be a set of probability measures on $(X, \mathcal{A})$. A sub-σ-algebra $\mathcal{A}_0 \subset \mathcal{A}$ is sufficient for $\mathcal{P}$ if for all $A \in \mathcal{A}$, there is an $\mathcal{A}_0$-measurable function $f_A$ such that for all θ, $f_A = P_\theta(A|\mathcal{A}_0)$ $P_\theta$-almost everywhere, that is,
$$P_\theta(A \cap A_0) = \int_{A_0} f_A \, dP_\theta \qquad (2)$$
for all $A_0 \in \mathcal{A}_0$ and for all θ. It is clear from this definition that if $\mathcal{A}_0$ is sufficient then for all $P_\theta$ there is a common version of the conditional expectation $E_\theta[g|\mathcal{A}_0]$ for any measurable step function $g$, or, more generally, for any function $g \in \cap_{\theta \in \Theta} L^1(X, \mathcal{A}, P_\theta)$. In the most important case, the family $\mathcal{P}$ is dominated, that is there is a σ-finite measure µ such that $P_\theta$ is absolutely continuous with respect to µ for all θ, this will be denoted by P
[...]
0 centered at x. We shall drop writing either $x$, or $R$ in the notation of the sphere (ball) in the particular cases when either $x = 0$, or $R = 1$. For a fixed $M > 1$ we define the spherical shell $A(M) := [k \in \mathbf{R}^d_* : M^{-1} \le |k| \le M]$ in the $k$-space, and $\mathcal{A}(M) := \mathbf{R}^d \times A(M)$ in the whole phase space. Given a vector $v \in \mathbf{R}^d_*$ we denote by $\hat v := v/|v| \in \mathbf{S}^{d-1}$ the unit vector in the direction of $v$. For any set $A$ we shall denote by $A^c$ its complement. For any non-negative integers $p, q, r$, positive times $T > T_* \ge 0$ and a function $G : [T_*, T] \times \mathbf{R}^{2d}_* \to \mathbf{R}$ that has $p$, $q$ and $r$ derivatives in the respective variables we define
$$\|G\|^{[T_*,T]}_{p,q,r} := \sum \ \sup_{(t,x,k) \in [T_*,T] \times \mathbf{R}^{2d}_*} |\partial_t^\alpha \partial_x^\beta \partial_k^\gamma G(t, x, k)|. \qquad (2.1)$$
The summation range covers all integers $0 \le \alpha \le p$ and all integer valued multiindices $|\beta| \le q$ and $|\gamma| \le r$. In the special case when $T_* = 0$, $T = +\infty$ we write $\|G\|_{p,q,r} = \|G\|^{[0,+\infty)}_{p,q,r}$. We denote by $C_b^{p,q,r}([0, +\infty) \times \mathbf{R}^{2d}_*)$ the space of all functions $G$ with $\|G\|_{p,q,r} < +\infty$. We shall also consider spaces of bounded and a suitable number of times continuously differentiable functions $C_b^{p,q}(\mathbf{R}^{2d}_*)$ and $C_b^p(\mathbf{R}^d_*)$ with the respective norms $\|\cdot\|_{p,q}$ and $\|\cdot\|_p$.
2.2. The background Hamiltonian. We assume that the background Hamiltonian $H_0(k)$ is isotropic, that is, it depends only on $k = |k|$, and is uniform in space. Moreover, we assume that $H_0 : [0, +\infty) \to \mathbf{R}$ is a strictly increasing function satisfying $H_0(0) \ge 0$ and such that it is of $C^3$-class of regularity in $(0, +\infty)$ with $H_0'(k) > 0$ for all $k > 0$, and let
$$h^*(M) := \max_{k \in [M^{-1}, M]} \big(H_0'(k) + |H_0''(k)| + |H_0'''(k)|\big), \qquad h_*(M) := \min_{k \in [M^{-1}, M]} H_0'(k). \qquad (2.2)$$
Two examples of such Hamiltonians are the quantum Hamiltonian $H_0(k) = k^2/2$ and the acoustic wave Hamiltonian $H_0(k) = c_0 k$.
2.3. The random medium. Let $(\Omega, \Sigma, \mathbf{P})$ be a probability space, and let $\mathbf{E}$ denote the expectation with respect to $\mathbf{P}$. We denote by $\|X\|_{L^p(\Omega)}$ the $L^p$-norm of a given random variable $X : \Omega \to \mathbf{R}$, $p \in [1, +\infty]$. Let $H_1 : \mathbf{R}^d \times [0, +\infty) \times \Omega \to \mathbf{R}$ be a random field that is measurable and strictly stationary in the first variable. This means that for any shift $x \in \mathbf{R}^d$ and a collection of points $k_1, \dots, k_n \in [0, +\infty)$, $x_1, \dots, x_n \in \mathbf{R}^d$ the laws of $(H_1(x_1 + x, k_1), \dots, H_1(x_n + x, k_n))$ and $(H_1(x_1, k_1), \dots, H_1(x_n, k_n))$ are identical. In addition, we assume that $\mathbf{E}H_1(x, k) = 0$ for all $k \ge 0$, $x \in \mathbf{R}^d$, the realizations of $H_1(x, k)$ are $\mathbf{P}$-a.s. $C^2$-smooth in $(x, k) \in \mathbf{R}^d \times (0, +\infty)$ and they satisfy
$$D_{i,j}(M) := \max_{|\alpha| = i} \ \operatorname*{ess\text{-}sup}_{(x,k,\omega) \in \mathbf{R}^d \times [M^{-1}, M] \times \Omega} |\partial_x^\alpha \partial_k^j H_1(x, k; \omega)| < +\infty, \qquad i, j = 0, 1, 2. \qquad (2.3)$$
T. Komorowski, L. Ryzhik
We define $\tilde D(M) := \sum_{0 \le i+j \le 2} D_{i,j}(M)$. We suppose further that the random field is strongly mixing in the uniform sense. More precisely, for any $R > 0$ we let $\mathcal{C}^i_R$ and $\mathcal{C}^e_R$ be the $\sigma$-algebras generated by the random variables $H_1(x, k)$ with $k \in [0, +\infty)$, $x \in B_R$ and $x \in B_R^c$ respectively. The uniform mixing coefficient between the $\sigma$-algebras is
\[
\phi(\rho) := \sup\big[\, |\mathbb{P}(B) - \mathbb{P}(B \mid A)| : R > 0,\ A \in \mathcal{C}^i_R,\ B \in \mathcal{C}^e_{R+\rho} \,\big],
\]
for all $\rho > 0$. We suppose that $\phi(\rho)$ decays faster than any power: for each $p > 0$,
\[
h_p := \sup_{\rho \ge 0} \rho^p \phi(\rho) < +\infty.
\tag{2.4}
\]
The two-point spatial correlation function of the random field $H_1$ is $R(y, k) := \mathbb{E}[H_1(y, k) H_1(0, k)]$. Note that (2.4) implies that for each $p > 0$,
\[
h_p(M) := \sum_{i=0}^{4} \sum_{|\alpha| = i} \ \sup_{(y,k) \in \mathbb{R}^d \times [M^{-1}, M]} (1 + |y|^2)^{p/2} |\partial_y^{\alpha} R(y, k)| < +\infty,
\qquad M > 0.
\tag{2.5}
\]
We also assume that the correlation function $R(y, l)$ is of the $C^\infty$-class for a fixed $l > 0$, is sufficiently smooth in $l$, and that for any fixed $l > 0$,
\[
\hat R(k, l) \ \text{does not vanish identically on any hyperplane } H_p = \{k : (k \cdot p) = 0\}.
\tag{2.6}
\]
Here $\hat R(k, l) = \int R(x, l) \exp(-i k \cdot x)\, dx$ is the power spectrum of $H_1$. The above assumptions are satisfied, for example, if $H_1(x, k) = c_1(x) h(k)$, where $c_1(x)$ is a stationary uniformly mixing random field with a smooth correlation function, and $h(k)$ is a smooth deterministic function.

2.4. Certain path-spaces. For fixed integers $d, m \ge 1$ we let $\mathcal{C}^{d,m} := C([0, +\infty); \mathbb{R}^d \times \mathbb{R}^m_*)$; we shall omit the superscripts in the notation of the path space if $m = d$. We define $(X(t), K(t)) : \mathcal{C}^{d,m} \to \mathbb{R}^d \times \mathbb{R}^m_*$ as the canonical mapping $(X(t; \pi), K(t; \pi)) := \pi(t)$, $\pi \in \mathcal{C}^{d,m}$, and also let $\theta_s(\pi)(\cdot) := \pi(\cdot + s)$ be the standard shift transformation. For any $u \le v$ denote by $\mathcal{M}^v_u$ the $\sigma$-algebra of subsets of $\mathcal{C}$ generated by $(X(t), K(t))$, $t \in [u, v]$. We write $\mathcal{M}^v := \mathcal{M}^v_0$ and $\mathcal{M}$ for the $\sigma$-algebra of Borel subsets of $\mathcal{C}$. It coincides with the smallest $\sigma$-algebra that contains all $\mathcal{M}^t$, $t \ge 0$.

Let $\delta_*^{(0)}(M) := H_0(M^{-1}) / (2 \tilde D(M))$. For a given $M > 0$ and $\delta \in (0, \delta_*^{(0)}(M)]$ we let
\[
M_\delta := \max\Big\{ H_0^{-1}\big( H_0(M) + 2\sqrt{\delta}\, \tilde D(M) \big),\ \Big[ H_0^{-1}\big( H_0(M^{-1}) - 2\sqrt{\delta}\, \tilde D(M) \big) \Big]^{-1} \Big\}.
\tag{2.7}
\]
We select $\delta_*(M) \in (0, \delta_*^{(0)}(M))$ in such a way that $M_\delta < 2M$ for all $\delta \in (0, \delta_*(M))$. For a particle that is governed by the Hamiltonian flow generated by $H_\delta(x, k)$ we have $M_\delta^{-1} \le |K(t)| \le M_\delta$ for all $t$ provided that $K(0) \in A(M)$. Accordingly, we define $\mathcal{C}(T, \delta)$ as the set of paths $\pi \in \mathcal{C}$ so that both $(2M_\delta)^{-1} \le |K(t)| \le 2M_\delta$, and
Diffusion in a Weakly Random Hamiltonian Flow
\[
\Big| X(t) - X(u) - \int_u^t H_0'(K(s)) \hat K(s)\, ds \Big| \le \tilde D(2M_\delta) \sqrt{\delta}\, (t - u),
\qquad \text{for all } 0 \le u < t \le T.
\]
In the case when $\delta = 1$, or $T = +\infty$ we shall write simply $\mathcal{C}(T)$, or $\mathcal{C}(\delta)$ respectively.

2.5. The main results. Let the function $\phi^\delta(t, x, k)$ satisfy the Liouville equation
\[
\frac{\partial \phi^\delta}{\partial t} + \nabla_x H_\delta(x, k) \cdot \nabla_k \phi^\delta - \nabla_k H_\delta(x, k) \cdot \nabla_x \phi^\delta = 0,
\qquad
\phi^\delta(0, x, k) = \phi_0(\delta x, k).
\tag{2.8}
\]
We assume that the initial data $\phi_0(x, k)$ is a compactly supported function, four times differentiable in $k$ and twice differentiable in $x$, whose support is contained inside a spherical shell $\mathcal{A}(M) = \{(x, k) : M^{-1} < |k| < M\}$ for some $M > 1$. Let us define the diffusion matrix $D_{mn}(\hat l, l)$ for $\hat l \in \mathbb{S}^{d-1}$ and $l > 0$ by
\[
D_{mn}(\hat l, l) = -\frac{1}{2} \int_{-\infty}^{\infty} \frac{\partial^2 R(H_0'(l) s \hat l, l)}{\partial x_n \partial x_m}\, ds
= -\frac{1}{2 H_0'(l)} \int_{-\infty}^{\infty} \frac{\partial^2 R(s \hat l, l)}{\partial x_n \partial x_m}\, ds,
\qquad m, n = 1, \dots, d.
\tag{2.9}
\]
Then we have the following result.

Theorem 2.1. Assume that $d \ge 3$. Let $\phi^\delta$ be the solution of (2.8) and let $\bar\phi \in C_b^{1,1,2}([0, +\infty) \times \mathbb{R}^{2d}_*)$ satisfy
\[
\frac{\partial \bar\phi}{\partial t} = \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m} \Big( D_{mn}(\hat k, k) \frac{\partial \bar\phi}{\partial k_n} \Big) + H_0'(k)\, \hat k \cdot \nabla_x \bar\phi,
\qquad
\bar\phi(0, x, k) = \phi_0(x, k).
\tag{2.10}
\]
Suppose that $M > 1$. Then, there exist two constants $C, \alpha_0 > 0$ such that for all $T \ge 1$ and all compact sets $K \subset \mathcal{A}(M)$ we have
\[
\sup_{(t,x,k) \in [0,T] \times K} \Big| \mathbb{E} \phi^\delta\Big( \frac{t}{\delta}, \frac{x}{\delta}, k \Big) - \bar\phi(t, x, k) \Big| \le C T (1 + \|\phi_0\|_{1,4})\, \delta^{\alpha_0}.
\tag{2.11}
\]

Remark 2.2. We shall denote by $C, C_1, \dots, \alpha_0, \alpha_1, \dots, \gamma_0, \gamma_1, \dots$ throughout this article generic positive constants. Unless specified otherwise, the constants denoted this way shall depend neither on $\delta$, nor on $T$. In the statements of the results appearing throughout the paper we will always assume, unless stated otherwise, that the parameter $T \ge 1$.

Remark 2.3. Classical results of the theory of stochastic differential equations, see e.g. Theorem 6 of Chapter 2, p. 176 and Corollary 4 of Chapter 3, p. 303 of [6], imply that there exists a unique solution to the Cauchy problem (2.10) that belongs to the class $C_b^{1,1,2}([0, +\infty) \times \mathbb{R}^{2d}_*)$. This solution admits a probabilistic representation using the law of a time homogeneous diffusion $Q_{x,k}$ whose Kolmogorov equation is given by (2.10), see Sect. 3.6 below. Note that
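\[
\sum_{m=1}^{d} D_{nm}(\hat k, k)\, \hat k_m
= -\sum_{m=1}^{d} \frac{1}{2 H_0'(k)} \int_{-\infty}^{\infty} \frac{\partial^2 R(s \hat k, k)}{\partial x_n \partial x_m}\, \hat k_m\, ds
= -\frac{1}{2 H_0'(k)} \int_{-\infty}^{\infty} \frac{d}{ds} \Big( \frac{\partial R(s \hat k, k)}{\partial x_n} \Big)\, ds = 0,
\]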
and thus the $K$-process generated by (2.10) is indeed a diffusion process on a sphere $\mathbb{S}^{d-1}_k$, or, equivalently, Eqs. (2.10) for different values of $k$ are decoupled. Assumption (2.6) implies the following.

Proposition 2.4. The matrix $D(\hat l, l)$ has rank $d - 1$ for each $\hat l \in \mathbb{S}^{d-1}$ and $l > 0$.

The proof is the same as that of Proposition 4.3 in [1]. It can be shown, using the argument given on pp. 122–123 of ibid., that, under assumption (2.6), Eq. (2.10) is hypoelliptic on the manifold $\mathbb{R}^d \times \mathbb{S}^{d-1}_k$ for each $k > 0$.

We also show that solutions of (2.10) converge in the long time limit to the solutions of the spatial diffusion equation. More precisely, we have the following result. Let $\bar\phi_\gamma(t, x, k) = \bar\phi(t/\gamma^2, x/\gamma, k)$, where $\bar\phi$ satisfies (2.10) with the initial data $\bar\phi_\gamma(0, x, k) = \phi_0(\gamma x, k)$, and let $w(t, x, k)$ be the solution of the spatial diffusion equation
\[
\frac{\partial w}{\partial t} = \sum_{m,n=1}^{d} a_{mn}(k) \frac{\partial^2 w}{\partial x_n \partial x_m},
\qquad
w(0, x, k) = \bar\phi_0(x, k),
\tag{2.12}
\]
with the averaged initial data
\[
\bar\phi_0(x, k) = \frac{1}{\Gamma_{d-1}} \int_{\mathbb{S}^{d-1}} \phi_0(x, k \hat l)\, d\Omega(\hat l).
\]
Here $d\Omega(\hat l)$ is the surface measure on the unit sphere $\mathbb{S}^{d-1}$ and $\Gamma_n$ is the area of the $n$-dimensional unit sphere. The diffusion matrix $A(k) := [a_{nm}(k)]$ in (2.12) is given explicitly as
\[
a_{nm}(k) = \frac{1}{\Gamma_{d-1}} \int_{\mathbb{S}^{d-1}} H_0'(k)\, \hat l_n\, \chi_m(k \hat l)\, d\Omega(\hat l),
\tag{2.13}
\]
where $\hat l = (\hat l_1, \dots, \hat l_d)$. The functions $\chi_j$ appearing above are the mean-zero solutions of
\[
\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m} \Big( D_{mn}(\hat k, k) \frac{\partial \chi_j}{\partial k_n} \Big) = -H_0'(k)\, \hat k_j.
\tag{2.14}
\]
Note that Eqs. (2.14) for $\chi_m$ are elliptic on each sphere $\mathbb{S}^{d-1}_k$. This follows from Proposition 2.4 and the fact that the equations for the different spheres are decoupled. Also note that the matrix $A(k)$ is symmetric. Indeed, let $c_1 = (c^{(1)}_1, \dots, c^{(1)}_d)$, $c_2 = (c^{(2)}_1, \dots, c^{(2)}_d) \in \mathbb{R}^d$ be fixed vectors and let $\chi_{c_i} := \sum_{m=1}^{d} c^{(i)}_m \chi_m$, $i = 1, 2$. Since the matrix $D$ is symmetric we have
\[
\begin{aligned}
(A(k) c_1, c_2)_{\mathbb{R}^d}
&= -\frac{1}{\Gamma_{d-1}} \sum_{m,n=1}^{d} \int_{\mathbb{S}^{d-1}} \chi_{c_1}(k) \frac{\partial}{\partial k_m} \Big( D_{mn}(\hat k, k) \frac{\partial \chi_{c_2}(k)}{\partial k_n} \Big)\, d\Omega(\hat k) \\
&= -\frac{1}{\Gamma_{d-1}} \sum_{m,n=1}^{d} \int_{\mathbb{R}^d} \chi_{c_1}(l) \frac{\partial}{\partial l_m} \Big( D_{mn}(\hat l, |l|) \frac{\partial \chi_{c_2}(l)}{\partial l_n} \Big)\, \delta(k - |l|)\, \frac{dl}{|l|^{d-1}} \\
&= \frac{1}{\Gamma_{d-1}} \int_{\mathbb{S}^{d-1}} \big( D(\hat k, k) \nabla \chi_{c_1}(k \hat k), \nabla \chi_{c_2}(k \hat k) \big)_{\mathbb{R}^d}\, d\Omega(\hat k)
= (c_1, A(k) c_2)_{\mathbb{R}^d}.
\end{aligned}
\tag{2.15}
\]
The last but one equality holds after integration by parts because $D(\hat k, k) \hat k = 0$. Moreover, substituting $c_1 = c_2$ we obtain that
\[
(A(k) c_1, c_1)_{\mathbb{R}^d} = \frac{1}{\Gamma_{d-1}} \int_{\mathbb{S}^{d-1}_k} \big( D(\hat k, k) \nabla \chi_{c_1}(k \hat k), \nabla \chi_{c_1}(k \hat k) \big)_{\mathbb{R}^d}\, d\Omega(\hat k) \ge 0.
\tag{2.16}
\]
In fact, the above inequality holds in the strict sense. This can be seen as follows. Since for each $\hat k \in \mathbb{S}^{d-1}$ the null-space of the matrix $D(\hat k, k)$ is one-dimensional and consists of the vectors parallel to $\hat k$, in order for $(A(k) c_1, c_1)_{\mathbb{R}^d}$ to vanish one needs that the gradient $\nabla \chi_{c_1}(k \hat k)$ is parallel to $\hat k$ for all $\hat k$. This, however, together with (2.14), would imply that $\hat k \cdot c_1 = 0$ for all $\hat k \in \mathbb{S}^{d-1}$, which is impossible. The following theorem holds.

Theorem 2.5. For every $0 < T_* < T < +\infty$ the re-scaled solution $\bar\phi_\gamma(t, x, k) = \bar\phi(t/\gamma^2, x/\gamma, k)$ of (2.10) converges as $\gamma \to 0+$ in $C([T_*, T]; L^\infty(\mathbb{R}^{2d}_*))$ to $w(t, x, k)$. Moreover, there exists a constant $C > 0$ so that we have
\[
\| w(t, \cdot) - \bar\phi_\gamma(t, \cdot) \|_{0,0} \le C \big( \gamma T + \sqrt{\gamma} \big) \|\phi_0\|_{1,1}
\tag{2.17}
\]
for all $T_* \le t \le T$.

Remark 2.6. In fact, as will become apparent in the course of the proof, we have a stronger result, namely $T_*$ can be made to vanish as $\gamma \to 0+$. For instance, we can choose $T_* = \gamma^{3/2}$, see (4.16).

The proof of Theorem 2.5 is based on some classical asymptotic expansions and is quite straightforward. As an immediate corollary of Theorems 2.1 and 2.5 we obtain the following result, which is the main result of this paper.

Theorem 2.7. Assume that $d \ge 3$, $T_* > 0$ and $M > 1$. Let $\phi_\delta$ be the solution of (2.8) with the initial data $\phi_\delta(0, x, k) = \phi_0(\delta^{1+\alpha} x, k)$ and let $w(t, x, k)$ be the solution of the diffusion equation (2.12) with the initial data $w(0, x, k) = \bar\phi_0(x, k)$. Then, there exist $\alpha_0 > 0$ and a constant $C > 0$ so that for all $0 \le \alpha < \alpha_0$, $T_* \le T$ and all compact sets $K \subset \mathcal{A}(M)$ we have
\[
\sup_{(t,x,k) \in [T_*, T] \times K} \big| w(t, x, k) - \mathbb{E} \bar\phi_\delta(t, x, k) \big| \le C T \delta^{\alpha_0 - \alpha},
\tag{2.18}
\]
where $\bar\phi_\delta(t, x, k) := \phi_\delta\big( t/\delta^{1+2\alpha}, x/\delta^{1+\alpha}, k \big)$.
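Theorem 2.7 shows that the movement of a particle in a weakly random quenched Hamiltonian is, indeed, approximated by a Brownian motion in the long time-large space limit, at least for times $T \ll \delta^{-\alpha_0}$. In fact, according to Remark 2.6 we can allow $T_*$ to vanish as $\delta \to 0$, choosing $T_* = \delta^{3\alpha/2}$. The estimate (2.18) shall still be valid then, provided that $T$ is not too small, e.g. one can assume that $T \ge 1$.

In the isotropic case when $R = R(|x|, k)$ we may simplify the above expressions for the diffusion matrices $[D_{mn}]$ and $[a_{mn}]$. Writing $R'$ and $R''$ for the derivatives of $R$ in its first (radial) argument, in that case we have
\[
\begin{aligned}
D_{mn}(\hat k, k)
&= -\frac{1}{2} \int_{-\infty}^{\infty} \frac{\partial^2 R(H_0'(k) s \hat k, k)}{\partial x_n \partial x_m}\, ds \\
&= -\int_{0}^{\infty} \Big[ \frac{k_n k_m}{k^2} R''(H_0'(k) s, k) + \Big( \delta_{nm} - \frac{k_n k_m}{k^2} \Big) \frac{R'(H_0'(k) s, k)}{H_0'(k) s} \Big]\, ds \\
&= -\frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{R'(s, k)}{s}\, ds\ \Big( \delta_{nm} - \frac{k_n k_m}{k^2} \Big),
\end{aligned}
\]
so that the matrix $[D_{mn}(\hat k, k)]$ has the form
\[
D(\hat k, k) = D_0(k) \big( I - \hat k \otimes \hat k \big),
\qquad
D_0(k) = -\frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{R'(s, k)}{s}\, ds.
\]
In that case the functions $\chi_j$ are given explicitly by
\[
\chi_j(\hat k, k) = \frac{(H_0'(k))^2 k^2\, \hat k_j}{(d-1) \bar D_0(k)},
\qquad
\bar D_0(k) = -\int_{0}^{\infty} \frac{R'(s, k)}{s}\, ds,
\]
(the sign is the one that makes $\chi_j$ solve (2.14) and is consistent with the positive expression for $a_{nm}$ below) and
\[
a_{nm}(k) = \frac{(H_0'(k))^3 k^2}{\Gamma_{d-1} (d-1) \bar D_0(k)} \int_{\mathbb{S}^{d-1}} \hat k_n \hat k_m\, d\Omega(\hat k) = \frac{(H_0'(k))^3 k^2}{d(d-1) \bar D_0(k)}\, \delta_{nm}.
\]

2.6. A formal derivation of the momentum diffusion. We now recall how the diffusion operator in (2.10) can be derived in a quick formal way. We represent the solution of (2.8) as $\phi^\delta(t, x, k) = \psi^\delta(\delta t, \delta x, k)$ and write an asymptotic multiple scale expansion for $\psi^\delta$,
\[
\psi^\delta(t, x, k) = \bar\phi(t, x, k) + \sqrt{\delta}\, \phi_1\Big( t, x, \frac{x}{\delta}, k \Big) + \delta\, \phi_2\Big( t, x, \frac{x}{\delta}, k \Big) + \dots.
\tag{2.19}
\]
We assume formally that the leading order term $\bar\phi$ is deterministic and independent of the fast variable $z = x/\delta$. We insert this expansion into (2.8) and obtain in the order $O(\delta^{-1/2})$:
\[
\nabla_z H_1(z, k) \cdot \nabla_k \bar\phi - H_0'(k)\, \hat k \cdot \nabla_z \phi_1 = 0.
\tag{2.20}
\]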
Let $\theta \ll 1$ be a small positive regularization parameter that will be later sent to zero, and consider a regularized version of (2.20):
\[
\frac{1}{H_0'(k)} \nabla_z H_1(z, k) \cdot \nabla_k \bar\phi - \hat k \cdot \nabla_z \phi_1 + \theta \phi_1 = 0.
\]
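Its solution is
\[
\phi_1(t, x, z, k) = -\frac{1}{H_0'(k)} \int_{0}^{\infty} \sum_{m=1}^{d} \frac{\partial H_1(z + s \hat k, k)}{\partial z_m} \frac{\partial \bar\phi(t, x, k)}{\partial k_m}\, e^{-\theta s}\, ds.
\tag{2.21}
\]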
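The next order equation becomes upon averaging
\[
\frac{\partial \bar\phi}{\partial t} = \mathbb{E}\Big( \frac{\partial H_1(z, k)}{\partial k}\, \hat k \cdot \nabla_z \phi_1 \Big) - \mathbb{E}\big( \nabla_z H_1(z, k) \cdot \nabla_k \phi_1 \big) + H_0'(k)\, \hat k \cdot \nabla_x \bar\phi.
\tag{2.22}
\]
The first two terms on the right hand side above may be computed explicitly using expression (2.21) for $\phi_1$:
\[
\begin{aligned}
&\mathbb{E}\Big( \frac{\partial H_1(z, k)}{\partial k}\, \hat k \cdot \nabla_z \phi_1 \Big) - \mathbb{E}\big( \nabla_z H_1(z, k) \cdot \nabla_k \phi_1 \big) \\
&\quad = -\mathbb{E}\Big( \sum_{m,n=1}^{d} \frac{\partial H_1(z, k)}{\partial k}\, \hat k_m \frac{\partial}{\partial z_m} \Big[ \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial H_1(z + s \hat k, k)}{\partial z_n} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big] \Big) \\
&\qquad + \mathbb{E}\Big( \sum_{m,n=1}^{d} \frac{\partial H_1(z, k)}{\partial z_m} \frac{\partial}{\partial k_m} \Big[ \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial H_1(z + s \hat k, k)}{\partial z_n} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big] \Big).
\end{aligned}
\]
Using spatial stationarity of $H_1(z, k)$ we may rewrite the above as
\[
\begin{aligned}
&-\mathbb{E}\Big( \sum_{m,n=1}^{d} \frac{\partial H_1(z, k)}{\partial k}\, \hat k_m \frac{\partial}{\partial z_m} \Big[ \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial H_1(z + s \hat k, k)}{\partial z_n} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big] \Big) \\
&\quad - \mathbb{E}\Big( \sum_{m,n=1}^{d} H_1(z, k) \frac{\partial}{\partial z_m} \frac{\partial}{\partial k_m} \Big[ \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial H_1(z + s \hat k, k)}{\partial z_n} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big] \Big) \\
&= -\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m} \mathbb{E}\Big( H_1(z, k)\, \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial^2 H_1(z + s \hat k, k)}{\partial z_n \partial z_m} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big) \\
&= -\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m} \Big( \frac{1}{H_0'(k)} \int_{0}^{\infty} \frac{\partial^2 R(s \hat k, k)}{\partial x_n \partial x_m} \frac{\partial \bar\phi(t, x, k)}{\partial k_n}\, e^{-\theta s}\, ds \Big) \\
&\to -\frac{1}{2} \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m} \Big( \frac{1}{H_0'(k)} \int_{-\infty}^{\infty} \frac{\partial^2 R(s \hat k, k)}{\partial x_n \partial x_m}\, ds\ \frac{\partial \bar\phi(t, x, k)}{\partial k_n} \Big), \qquad \text{as } \theta \to 0+.
\end{aligned}
\]
We insert the above expression into (2.22) and obtain
\[
\frac{\partial \bar\phi}{\partial t} = \sum_{m,n=1}^{d} \frac{\partial}{\partial k_n} \Big( D_{nm}(\hat k, k) \frac{\partial \bar\phi}{\partial k_m} \Big) + H_0'(k)\, \hat k \cdot \nabla_x \bar\phi
\tag{2.23}
\]
with the diffusion matrix $D(\hat k, k)$ as in (2.9). Observe that (2.23) is nothing but (2.10). However, the naive asymptotic expansion (2.19) may not be justified. The rigorous proof presented in the next section is based on a quite different method.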
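As a sanity check on the diffusion matrix (2.9), the following minimal sketch evaluates it by quadrature for one illustrative model and verifies two facts stated in the text: $D(\hat l, l)\hat l = 0$, and agreement with the isotropic form $D_0 (I - \hat l \otimes \hat l)$. The Gaussian correlation $R(x) = \exp(-|x|^2/2)$, the constant group velocity `h0_prime` (as for the acoustic Hamiltonian), and the function name `diffusion_matrix` are assumptions made only for this example; they are not taken from the paper.

```python
import math

def diffusion_matrix(l_hat, h0_prime, s_max=10.0, n=4000):
    """Trapezoid-rule evaluation of (2.9) for the assumed R(x) = exp(-|x|^2 / 2).

    l_hat is a unit vector and h0_prime plays the role of H_0'(l).
    """
    d = len(l_hat)
    h = 2.0 * s_max / n
    D = [[0.0] * d for _ in range(d)]
    for i in range(n + 1):
        s = -s_max + i * h
        w = h * (0.5 if i in (0, n) else 1.0)  # trapezoid weights
        g = math.exp(-0.5 * s * s)             # R evaluated at x = s * l_hat
        for m in range(d):
            for q in range(d):
                # Hessian of R at x = s * l_hat: (x_m x_q - delta_mq) * exp(-|x|^2 / 2)
                hess = (s * s * l_hat[m] * l_hat[q] - (1.0 if m == q else 0.0)) * g
                D[m][q] += -0.5 / h0_prime * w * hess
    return D

l_hat = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]      # a unit vector in R^3
c0 = 2.0                                        # assumed constant H_0'
D = diffusion_matrix(l_hat, c0)

# Isotropic prediction: D_0 = -(1/H_0') * int_0^inf R'(s)/s ds = sqrt(pi/2) / c0 here.
D0 = math.sqrt(math.pi / 2.0) / c0
null_residual = max(abs(sum(D[m][q] * l_hat[q] for q in range(3))) for m in range(3))
iso_residual = max(
    abs(D[m][q] - D0 * ((1.0 if m == q else 0.0) - l_hat[m] * l_hat[q]))
    for m in range(3) for q in range(3)
)
```

Both residuals should be at the level of rounding error, since the trapezoid rule is extremely accurate for Gaussian integrands on a wide interval.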
3. From the Liouville Equation to the Momentum Diffusion. Estimation of the Convergence Rates: Proof of Theorem 2.1

3.1. Outline of the proof. The basic idea of the proof of Theorem 2.1 is a modification of that of [1, 9]. We consider the trajectories corresponding to the Liouville equation (2.8) and introduce a stopping time, called $\tau_\delta$, that, among other things, prevents near self-intersection of trajectories. This ensures that until the stopping time occurs the particle is “exploring a new territory” and, thanks to the strong mixing properties of the medium, “memory effects” are lost. Therefore, roughly speaking, until the stopping time the process is approximately characterized by the Markov property. Furthermore, since the amplitude of the random Hamiltonian is not strong enough to destroy the continuity of its path, it becomes a diffusion in the limit, as $\delta \to 0$. We also introduce an augmented process that follows the trajectories of the Hamiltonian flow until the stopping time $\tau_\delta$ and becomes a diffusion after $t = \tau_\delta$. We show that the law of the augmented process is close to the law of a diffusion, see Proposition 3.4, with an explicit error bound. We also prove that the stopping time tends to infinity as $\delta \to 0$, once again with an error bound, which is proved in Theorem 3.6. The combination of these two results allows us to estimate the difference between the solutions of the Liouville and the diffusion equations in a rather straightforward manner (see Sect. 3.9): they are close until the stopping time, as the law of the diffusion is always close to that of the augmented process, while the latter coincides with the true process until $\tau_\delta$. On the other hand, the fact that $\tau_\delta \to \infty$ as $\delta \to 0$ shows that with a large probability the augmented process is close to the true process. This combination finishes the proof.

3.2. The random characteristics corresponding to (2.8).
Consider the motion of a particle governed by a Hamiltonian system of equations
\[
\begin{cases}
\dfrac{d z^{(\delta)}(t; x, k)}{dt} = (\nabla_k H_\delta)\Big( \dfrac{z^{(\delta)}(t; x, k)}{\delta}, m^{(\delta)}(t; x, k) \Big), \\[2ex]
\dfrac{d m^{(\delta)}(t; x, k)}{dt} = -\dfrac{1}{\sqrt{\delta}} (\nabla_z H_\delta)\Big( \dfrac{z^{(\delta)}(t; x, k)}{\delta}, m^{(\delta)}(t; x, k) \Big), \\[2ex]
z^{(\delta)}(0; x, k) = x, \qquad m^{(\delta)}(0; x, k) = k,
\end{cases}
\tag{3.1}
\]
where the Hamiltonian $H_\delta(x, k) := H_0(k) + \sqrt{\delta}\, H_1(x, k)$, $k = |k|$. The trajectories of (3.1) are the characteristics of the Liouville equation (2.8). The hypotheses made in Sect. 2 imply that the trajectory $(z^{(\delta)}(t; x, k), m^{(\delta)}(t; x, k))$ necessarily lies in $\mathcal{C}(T, \delta)$ for each $T > 0$, $\delta \in (0, \delta_*(M)]$, provided that the initial data $(x, k) \in \mathcal{A}(M)$. Indeed, it follows from the Hamiltonian structure of (3.1) that the Hamiltonian $H_\delta(x, m) = H_0(m) + \sqrt{\delta}\, H_1(x, m)$ must be conserved along the trajectory. Hence, the definition (2.7) implies that $M_\delta^{-1} \le |m^{(\delta)}(\cdot; x, k)| \le M_\delta$. We denote by $Q^\delta_{s,x,k}(\cdot)$ the law over $\mathcal{C}$ of the process corresponding to (3.1) starting at $t = s$ from $(x, k)$ (this law is actually supported in $\mathcal{C}(\delta)$). We shall omit writing the subscript $s$ when it equals $0$.

3.3. The stopping times. We now define the stopping time $\tau_\delta$, described in Sect. 3.1, that prevents the trajectories of (3.1) from having near self-intersections (recall that the intent of the stopping time is to prevent any “memory effects” of the trajectories). As we have already mentioned, we will later show that the probability of the event $[\tau_\delta < T]$ for a fixed $T > 0$ goes to zero, as $\delta \to 0$.
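The conservation argument above can be illustrated numerically. The sketch below integrates the rescaled characteristics with a classical fourth-order Runge-Kutta step and checks that $H_0(m) + \sqrt{\delta} H_1(z/\delta)$ stays constant along the trajectory. Everything specific here is an assumption for illustration: the dimension $d = 2$, the choice $H_0(k) = |k|^2/2$, the deterministic periodic potential `h1` standing in for a realization of the random field (with no $k$-dependence), and the momentum equation written with $\nabla_z H_1$ and the prefactor $1/\sqrt{\delta}$, as in the cut-off dynamics (3.12).

```python
import math

def h1(x):
    # Toy smooth stand-in for one realization of H_1 (assumption, not random).
    return math.cos(x[0]) + math.cos(x[1])

def grad_h1(x):
    return (-math.sin(x[0]), -math.sin(x[1]))

def rhs(state, delta):
    z1, z2, m1, m2 = state
    g = grad_h1((z1 / delta, z2 / delta))
    s = 1.0 / math.sqrt(delta)
    # H_0(k) = |k|^2 / 2, so grad_k H_delta = m; momentum is driven by -(1/sqrt(delta)) grad H_1.
    return (m1, m2, -s * g[0], -s * g[1])

def rk4_step(state, dt, delta):
    def shifted(a, b, c):
        return tuple(x + c * y for x, y in zip(a, b))
    k1 = rhs(state, delta)
    k2 = rhs(shifted(state, k1, 0.5 * dt), delta)
    k3 = rhs(shifted(state, k2, 0.5 * dt), delta)
    k4 = rhs(shifted(state, k3, dt), delta)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def energy(state, delta):
    z1, z2, m1, m2 = state
    return 0.5 * (m1 * m1 + m2 * m2) + math.sqrt(delta) * h1((z1 / delta, z2 / delta))

def simulate(state, delta, t_end, dt):
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, delta)
    return state

delta = 0.01
start = (0.3, -0.2, 1.0, 0.5)          # (z_1, z_2, m_1, m_2)
end = simulate(start, delta, t_end=0.2, dt=1e-4)
energy_drift = abs(energy(end, delta) - energy(start, delta))
kinetic_change = abs(0.5 * (end[2] ** 2 + end[3] ** 2) - 0.5 * (start[2] ** 2 + start[3] ** 2))
```

Since $|H_1| \le 2$ for this toy potential, conservation forces the change of $H_0(m)$ to stay below $2\sqrt{\delta} \cdot 2$, which is the mechanism behind the bounds $M_\delta^{-1} \le |m^{(\delta)}| \le M_\delta$.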
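Let $0 < \epsilon_1 < \epsilon_2 < 1/2$, $\epsilon_3 \in (0, 1/2 - \epsilon_2)$, $\epsilon_4 \in (1/2, 1 - \epsilon_1 - \epsilon_2)$ be small positive constants that will be further determined later and set
\[
N = [\delta^{-\epsilon_1}], \qquad p = [\delta^{-\epsilon_2}], \qquad q = p\,[\delta^{-\epsilon_3}], \qquad N_1 = N p\,[\delta^{-\epsilon_4}].
\tag{3.2}
\]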
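To make the mesh parameters in (3.2) concrete, here is a small sketch that validates the constraints on the exponents and evaluates $N$, $p$, $q$, $N_1$ for a given $\delta$. The function name `mesh_parameters` and the sample exponent values are illustrative assumptions; the paper fixes the exponents only through later constraints.

```python
import math

def mesh_parameters(delta, eps1, eps2, eps3, eps4):
    """Return (N, p, q, N1) from (3.2); [.] is the integer part."""
    if not (0.0 < eps1 < eps2 < 0.5):
        raise ValueError("need 0 < eps1 < eps2 < 1/2")
    if not (0.0 < eps3 < 0.5 - eps2):
        raise ValueError("need eps3 in (0, 1/2 - eps2)")
    if not (0.5 < eps4 < 1.0 - eps1 - eps2):
        raise ValueError("need eps4 in (1/2, 1 - eps1 - eps2)")
    N = math.floor(delta ** (-eps1))
    p = math.floor(delta ** (-eps2))
    q = p * math.floor(delta ** (-eps3))
    N1 = N * p * math.floor(delta ** (-eps4))
    return N, p, q, N1

# Sample (hypothetical) exponents satisfying all three constraints.
N, p, q, N1 = mesh_parameters(1e-8, eps1=0.05, eps2=0.1, eps3=0.1, eps4=0.8)
```

With these sample values the ordering $N < p \le q < N_1$ holds, reflecting the separation of scales between the angular tolerance $1/N$, the time mesh $1/p$, and the tube radius $1/q$.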
We will specify additional restrictions on the constants $\epsilon_j$ as the need for such constraints arises. However, the basic requirement is that $\epsilon_i$, $i \in \{1, 2, 3\}$ should be sufficiently small, while $\epsilon_4$ is bigger than $1/2$, less than one, and can be made as close to one as we need. It is important that $\epsilon_1 < \epsilon_2$ so that $N \ll p$ when $\delta \ll 1$. We introduce the following $(\mathcal{M}^t)_{t \ge 0}$-stopping times. Let $t^{(p)}_k := k p^{-1}$ be a mesh of times, and $\pi \in \mathcal{C}$ be a path. We define the “violent turn” stopping time
\[
S_\delta(\pi) := \inf\Big[\, t \ge 0 : \text{for some } k \ge 0 \text{ we have } t \in \big[ t^{(p)}_k, t^{(p)}_{k+1} \big) \text{ and }
\hat K(t^{(p)}_{k-1}) \cdot \hat K(t) \le 1 - \frac{1}{N}, \ \text{or}\ \hat K\Big( t^{(p)}_k - \frac{1}{N_1} \Big) \cdot \hat K(t) \le 1 - \frac{1}{N} \,\Big],
\tag{3.3}
\]
where by convention we set $\hat K(-1/p) := \hat K(0)$. Note that with the above choice of $\epsilon_4$ we have $\hat K\big( t^{(p)}_k - 1/N_1 \big) \cdot \hat K(t^{(p)}_k) > 1 - 1/N$, $Q^\delta_{x,k}$-a.s., provided that $\delta \in (0, \delta_0]$ and $\delta_0$ is sufficiently small. We adopt in (3.3) the customary convention that the infimum of an empty set equals $+\infty$. The stopping time $S_\delta$ is triggered when the trajectory performs a sudden turn. This is undesirable, as the trajectory may then return to the region it has already visited and create correlations with the past.

For each $t \ge 0$, we denote by $\mathfrak{X}_t(\pi) := \bigcup_{0 \le s \le t} X(s; \pi)$ the trace of the spatial component of the path $\pi$ up to time $t$, and by $\mathfrak{X}_t(q; \pi) := [x : \operatorname{dist}(x, \mathfrak{X}_t(\pi)) \le 1/q]$ a tubular region around the path. We introduce the stopping time
\[
U_\delta(\pi) := \inf\Big[\, t \ge 0 : \exists\, k \ge 1 \text{ and } t \in \big[ t^{(p)}_k, t^{(p)}_{k+1} \big) \text{ for which } X(t) \in \mathfrak{X}_{t^{(p)}_{k-1}}(q) \,\Big].
\tag{3.4}
\]
It is associated with the return of the $X$ component of the trajectory to the tube around its past, which is again an undesirable way to create correlations with the past. Finally, we set the stopping time
\[
\tau_\delta(\pi) := S_\delta(\pi) \wedge U_\delta(\pi) \wedge \delta^{-1}.
\tag{3.5}
\]
The last term appearing on the right hand side of (3.5) ensures that $\tau_\delta < +\infty$, $Q^\delta_{x,k}$-a.s.

3.4. The cut-off functions and the corresponding dynamics. Let $M > 1$ be fixed and $p, q, N, N_1$ be the positive integers defined in Sect. 3.3. We now define several auxiliary functions that will be used to introduce the cut-offs in the dynamics. These cut-offs will ensure that the particle moving under the modified dynamics will avoid self-intersections, will have no violent turns, and that the changes of its momentum will be under control. In addition, up to the stopping time $\tau_\delta$ the motion of the particle will coincide with the motion under the original Hamiltonian flow. Let $a_1 = 2$ and $a_2 = 3/2$. The functions $\psi_j : \mathbb{R}^d \times \mathbb{S}^{d-1} \to [0, 1]$, $j = 1, 2$ are of $C^\infty$ class and satisfy
\[
\psi_j(k, l) =
\begin{cases}
1, & \text{if } \hat k \cdot l \ge 1 - 1/N \ \text{and}\ M_\delta^{-1} \le |k| \le M_\delta, \\
0, & \text{if } \hat k \cdot l \le 1 - a_j/N, \ \text{or}\ |k| \le (2M_\delta)^{-1}, \ \text{or}\ |k| \ge 2M_\delta.
\end{cases}
\tag{3.6}
\]
One can construct $\psi_j$ in such a way that for arbitrary nonnegative integers $m, n$ it is possible to find a constant $C_{m,n}$ for which $\|\psi_j\|_{m,n} \le C_{m,n} N^{m+n}$. The cut-off function
\[
\Psi(t, k; \pi) :=
\begin{cases}
\psi_1\big( k, \hat K(t^{(p)}_{k-1}) \big)\, \psi_2\big( k, \hat K(t^{(p)}_k - 1/N_1) \big), & \text{for } t \in [t^{(p)}_k, t^{(p)}_{k+1}) \text{ and } k \ge 1, \\
\psi_2\big( k, \hat K(0) \big), & \text{for } t \in [0, t^{(p)}_1),
\end{cases}
\tag{3.7}
\]
will allow us to control the direction of the particle motion over each interval of the partition, and will prevent the trajectory from escaping to the regions where the change of the size of the velocity can be uncontrollable. Let $\phi : \mathbb{R}^d \times \mathbb{R}^d \to [0, 1]$ be a function of the $C^\infty$ class that satisfies $\phi(y, x) = 1$, when $|y - x| \ge 1/(2q)$ and $\phi(y, x) = 0$, when $|y - x| \le 1/(3q)$. Again, in this case we can construct $\phi$ in such a way that $\|\phi\|_{m,n} \le C q^{m+n}$ for arbitrary integers $m, n$ and a suitably chosen constant $C$. Let $h^* := [4 h^*(M)] + 1$ and $q_* := q h^*$, cf. (2.2). The function $\phi_k : \mathbb{R}^d \times \mathcal{C} \to [0, 1]$ for a fixed path $\pi$ is given by
\[
\phi_k(y; \pi) = \prod_{0 \le l/q_* \le t^{(p)}_{k-1}} \phi\Big( y, X\Big( \frac{l}{q_*} \Big) \Big).
\tag{3.8}
\]
We set
\[
\Phi(t, y; \pi) :=
\begin{cases}
1, & \text{if } 0 \le t < t^{(p)}_1, \\
\phi_k(y; \pi), & \text{if } t^{(p)}_k \le t < t^{(p)}_{k+1}.
\end{cases}
\tag{3.9}
\]
The function $\Phi$ shall be used to modify the dynamics of the particle in order to avoid a possibility of near self-intersections of its trajectory. For a given $t \ge 0$, $(y, k) \in \mathbb{R}^{2d}_*$ and $\pi \in \mathcal{C}$ let us denote $\Theta(t, y, k; \pi) := \Psi(t, k; \pi)\, \Phi(t, y; \pi)$. The following lemma can be verified by a direct calculation.

Lemma 3.1. Let $(\beta_1, \beta_2)$ be a multi-index with nonnegative integer valued components, $m = |\beta_1| + |\beta_2|$. There exists a constant $C$ depending only on $m$ and $M$ such that
\[
|\partial_y^{\beta_1} \partial_k^{\beta_2} \Theta(t, y, k; \pi)| \le C T^{|\beta_1|} q^{2|\beta_1|} N^{|\beta_2|}
\]
for all $t \in [0, T]$, $(y, k) \in \mathcal{A}(2M)$, $\pi \in \mathcal{C}$.

Finally, let us set
\[
F_\delta(t, y, l; \pi, \omega) = \Theta(t, \delta y, l; \pi)\, \nabla_y H_1(y, |l|; \omega).
\tag{3.10}
\]
Note that according to Lemma 3.1 we obtain that
\[
|\partial_y^{\beta_1} \partial_k^{\beta_2} \Theta(t, \delta y, l; \pi)| \le C T^{|\beta_1|} \delta^{|\beta_1| [1 - 2(\epsilon_2 + \epsilon_3)]} N^{|\beta_2|}
\tag{3.11}
\]
for all $t \in [0, T]$, $(y, k) \in \mathcal{A}(2M)$, $\pi \in \mathcal{C}$. For a fixed $(x, k) \in \mathbb{R}^{2d}_*$, $\delta > 0$ and $\omega \in \Omega$ we consider the modified particle dynamics with the cut-off that is described by the stochastic process $(y^{(\delta)}(t; x, k, \omega), l^{(\delta)}(t; x, k, \omega))_{t \ge 0}$ whose paths are the solutions of the following equation:
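\[
\begin{cases}
\dfrac{d y^{(\delta)}(t; x, k)}{dt} = \Big[ H_0'(|l^{(\delta)}(t; x, k)|) + \sqrt{\delta}\, \partial_l H_1\Big( \dfrac{y^{(\delta)}(t; x, k)}{\delta}, |l^{(\delta)}(t; x, k)| \Big) \Big]\, \hat l^{(\delta)}(t; x, k), \\[2ex]
\dfrac{d l^{(\delta)}(t; x, k)}{dt} = -\dfrac{1}{\sqrt{\delta}}\, F_\delta\Big( t, \dfrac{y^{(\delta)}(t; x, k)}{\delta}, l^{(\delta)}(t; x, k); y^{(\delta)}(\cdot; x, k), l^{(\delta)}(\cdot; x, k) \Big), \\[2ex]
y^{(\delta)}(0; x, k) = x, \qquad l^{(\delta)}(0; x, k) = k.
\end{cases}
\tag{3.12}
\]
We will denote by $\tilde Q^{(\delta)}_{x,k}$ the law of the modified process $(y^{(\delta)}(\cdot; x, k), l^{(\delta)}(\cdot; x, k))$ over $\mathcal{C}$ for a given $\delta > 0$ and by $\tilde E^{(\delta)}_{x,k}$ the corresponding expectation. We assume that the initial momentum $k \in A(M)$. From the construction of the cut-offs we immediately conclude that
\[
\hat l^{(\delta)}(t) \cdot \hat l^{(\delta)}(t^{(p)}_{k-1}) \ge 1 - \frac{2}{N},
\qquad t \in [t^{(p)}_{k-1}, t^{(p)}_{k+1}), \quad \forall\, k \ge 0.
\tag{3.13}
\]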
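The text asserts that cut-offs $\psi_j$ with the properties (3.6) can be constructed. One standard way to do so, sketched below, is to compose a $C^\infty$ transition function with the angular variable $\hat k \cdot l$ and with $|k|$. This is only an illustrative construction under assumptions of this example: the names `smooth_step` and `psi`, and the sample values of `N` and `M_delta`, are hypothetical, and the paper does not specify which construction it uses.

```python
import math

def smooth_step(x):
    """C-infinity function equal to 0 for x <= 0 and to 1 for x >= 1."""
    def f(t):
        return math.exp(-1.0 / t) if t > 0.0 else 0.0
    return f(x) / (f(x) + f(1.0 - x))

def psi(k, l, a_j, N, M_delta):
    """A cut-off with the properties (3.6); l must be a unit vector, a_j > 1."""
    norm = math.sqrt(sum(c * c for c in k))
    if norm == 0.0:
        return 0.0
    cos_angle = sum(c * e for c, e in zip(k, l)) / norm
    # 1 when k_hat . l >= 1 - 1/N, 0 when k_hat . l <= 1 - a_j/N
    angular = smooth_step((cos_angle - (1.0 - a_j / N)) / ((a_j - 1.0) / N))
    # 1 when |k| >= 1/M_delta, 0 when |k| <= 1/(2 M_delta)
    lower = smooth_step((norm - 0.5 / M_delta) / (0.5 / M_delta))
    # 1 when |k| <= M_delta, 0 when |k| >= 2 M_delta
    upper = smooth_step((2.0 * M_delta - norm) / M_delta)
    return angular * lower * upper

N, M_delta = 10, 1.5
l = (1.0, 0.0, 0.0)
inside = psi((1.0, 0.0, 0.0), l, 2.0, N, M_delta)        # aligned, |k| in the shell
turned = psi((0.0, 1.0, 0.0), l, 2.0, N, M_delta)        # orthogonal direction
too_fast = psi((5.0, 0.0, 0.0), l, 2.0, N, M_delta)      # |k| >= 2 M_delta
partial = psi((0.85, math.sqrt(1.0 - 0.85 ** 2), 0.0), l, 2.0, N, M_delta)
```

Each derivative of the angular factor costs a factor of order $N$, which is consistent with the bound $\|\psi_j\|_{m,n} \le C_{m,n} N^{m+n}$ quoted in the text.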
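3.5. Some consequences of the mixing assumption. For any $t \ge 0$ we denote by $\mathcal{F}_t$ the $\sigma$-algebra generated by $(y^{(\delta)}(s), l^{(\delta)}(s))$, $s \le t$. Here we suppress, for the sake of abbreviation, writing the initial data in the notation of the trajectory. In this section we assume that $M > 1$ is fixed, $X_1, X_2 : (\mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^{d^2})^2 \to \mathbb{R}$ are certain continuous functions, $Z$ is a random variable and $g_1, g_2$ are $\mathbb{R}^d \times [M^{-1}, M]$-valued random vectors. We suppose further that $Z, g_1, g_2$ are $\mathcal{F}_t$-measurable, while $\tilde X_1, \tilde X_2$ are random fields of the form
\[
\tilde X_i(x, k) = X_i\Big( \big( \partial_k^j H_1(x, k), \nabla_x \partial_k^j H_1(x, k), \nabla_x^2 \partial_k^j H_1(x, k) \big)_{j = 0, 1} \Big).
\]
For $i = 1, 2$ we denote $g_i := (g_i^{(1)}, g_i^{(2)})$, where $g_i^{(1)} \in \mathbb{R}^d$ and $g_i^{(2)} \in [M^{-1}, M]$. We also let
\[
U(\theta_1, \theta_2) := \mathbb{E}\big[ \tilde X_1(\theta_1) \tilde X_2(\theta_2) \big],
\qquad \theta_1, \theta_2 \in \mathbb{R}^d \times [M^{-1}, M].
\tag{3.14}
\]
The following mixing lemma is useful in formalizing the “memory loss effect” and can be proved in the same way as Lemmas 5.2 and 5.3 of [1].

Lemma 3.2. (i) Assume that $r, t \ge 0$ and
\[
\inf_{u \le t} \Big| g_i^{(1)} - \frac{y^{(\delta)}(u)}{\delta} \Big| \ge \frac{r}{\delta},
\tag{3.15}
\]
$\mathbb{P}$-a.s. on the set $Z \ne 0$ for $i = 1, 2$. Then, we have
\[
\Big| \mathbb{E}\big[ \tilde X_1(g_1) \tilde X_2(g_2) Z \big] - \mathbb{E}\big[ U(g_1, g_2) Z \big] \Big| \le 2 \phi\Big( \frac{r}{2\delta} \Big) \|X_1\|_{L^\infty} \|X_2\|_{L^\infty} \|Z\|_{L^1(\Omega)}.
\tag{3.16}
\]
(ii) Let $\mathbb{E} \tilde X_1(0, k) = 0$ for all $k \in [M^{-1}, M]$. Furthermore, we assume that $g_2$ satisfies (3.15),
\[
\inf_{u \le t} \Big| g_1^{(1)} - \frac{y^{(\delta)}(u)}{\delta} \Big| \ge \frac{r + r_1}{\delta}
\tag{3.17}
\]
and $|g_1^{(1)} - g_2^{(1)}| \ge r_1 \delta^{-1}$ for some $r_1 \ge 0$, $\mathbb{P}$-a.s. on the event $Z \ne 0$. Then, we have
\[
\Big| \mathbb{E}\big[ \tilde X_1(g_1) \tilde X_2(g_2) Z \big] - \mathbb{E}\big[ U(g_1, g_2) Z \big] \Big| \le C \phi^{1/2}\Big( \frac{r}{2\delta} \Big) \phi^{1/2}\Big( \frac{r_1}{2\delta} \Big) \|X_1\|_{L^\infty} \|X_2\|_{L^\infty} \|Z\|_{L^1(\Omega)}
\tag{3.18}
\]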
for some absolute constant $C > 0$. Here the function $U$ is given by (3.14).

3.6. The momentum diffusion. Let $k(t)$ be a diffusion, starting at $k \in \mathbb{R}^d_*$ at $t = 0$, with the generator of the form
\[
L F(k) = \sum_{m,n=1}^{d} D_{mn}(\hat k, |k|)\, \partial^2_{k_m, k_n} F(k) + \sum_{m=1}^{d} E_m(\hat k, |k|)\, \partial_{k_m} F(k)
= \sum_{m,n=1}^{d} \partial_{k_m} \big( D_{m,n}(\hat k, |k|)\, \partial_{k_n} F(k) \big),
\qquad F \in C_0^\infty(\mathbb{R}^d_*).
\tag{3.19}
\]
Here the diffusion matrix is given by (2.9) and the drift vector is
\[
E_m(\hat k, l) = -\frac{1}{H_0'(l)\, l} \sum_{n=1}^{d} \int_{0}^{+\infty} s\, \frac{\partial^3 R(s \hat k, l)}{\partial x_m \partial x_n^2}\, ds,
\qquad m = 1, \dots, d.
\]
Let $Q_{x,k}$ be the law of the process $(x(t), k(t))$ that starts at $t = 0$ from $(x, k)$, given by $x(t) = x + \int_0^t H_0'(|k(s)|)\, \hat k(s)\, ds$, where $k(t)$ is the diffusion described by (3.19). This process is a degenerate diffusion whose generator is given by
\[
\tilde L F(x, k) = L_k F(x, k) + H_0'(|k|)\, \hat k \cdot \nabla_x F(x, k),
\qquad F \in C_c^\infty(\mathbb{R}^{2d}_*).
\tag{3.20}
\]
Here the notation $L_k$ stresses that the operator $L$ defined in (3.19) acts on the respective function in the $k$ variable. We denote by $M_{x,k}$ the expectation corresponding to the path measure $Q_{x,k}$.
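For intuition about the momentum diffusion, the sketch below simulates it in the isotropic case, where $D(\hat k, k) = D_0(k)(I - \hat k \otimes \hat k)$ and the process is a Brownian motion on the sphere of radius $|k|$. The scheme is a projected Euler-Maruyama step: a tangential Gaussian increment with covariance $2 D_0\, dt\, (I - \hat k \otimes \hat k)$ followed by renormalization to the original radius, which reproduces the drift in (3.19) to leading order in $dt$. The constant `d0`, the step size, and the function name are assumptions of this example, and the scheme is a generic numerical approximation rather than anything used in the paper.

```python
import math
import random

def sphere_diffusion_step(k, d0, dt, rng):
    """One projected Euler-Maruyama step for the generator sum_mn d_m( D_mn d_n ),
    with D = d0 * (I - k_hat (x) k_hat); the radius |k| is preserved."""
    radius = math.sqrt(sum(c * c for c in k))
    k_hat = [c / radius for c in k]
    xi = [rng.gauss(0.0, 1.0) for _ in k]
    radial = sum(x * h for x, h in zip(xi, k_hat))
    scale = math.sqrt(2.0 * d0 * dt)
    # Keep only the tangential part of the noise, then return to the sphere.
    trial = [c + scale * (x - radial * h) for c, x, h in zip(k, xi, k_hat)]
    norm = math.sqrt(sum(c * c for c in trial))
    return [radius * c / norm for c in trial]

rng = random.Random(12345)
k0 = [0.0, 0.0, 2.0]
k = list(k0)
for _ in range(1000):
    k = sphere_diffusion_step(k, d0=0.5, dt=1e-3, rng=rng)
final_radius = math.sqrt(sum(c * c for c in k))
```

The position component would then be recovered by integrating $H_0'(|k(s)|)\hat k(s)$ in time, as in the definition of $Q_{x,k}$.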
3.7. The augmented process. The following construction of the augmentation of path measures has been carried out in Sect. 6.1 of [16]. Let $s \ge 0$ be fixed and $\pi \in \mathcal{C}$. Then, according to Lemma 6.1.1 of ibid. there exists a unique probability measure, denoted by $\delta_\pi \otimes_s Q_{X(s),K(s)}$, such that for any pair of events $A \in \mathcal{M}^s$, $B \in \mathcal{M}$ we have $\delta_\pi \otimes_s Q_{X(s),K(s)}[A] = 1_A(\pi)$ and $\delta_\pi \otimes_s Q_{X(s),K(s)}[\theta_s(B)] = Q_{X(s),K(s)}[B]$. The following result is a direct consequence of Theorem 6.2.1 of [16].

Proposition 3.3. There exists a unique probability measure $R^{(\delta)}_{x,k}$ on $\mathcal{C}$ such that $R^{(\delta)}_{x,k}[A] := Q^{(\delta)}_{x,k}[A]$ for all $A \in \mathcal{M}^{\tau_\delta}$ and the regular conditional probability distribution of $R^{(\delta)}_{x,k}[\,\cdot \mid \mathcal{M}^{\tau_\delta}]$ is given by $\delta_\pi \otimes_{\tau_\delta(\pi)} Q_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}$, $\pi \in \mathcal{C}$. This measure shall be also denoted by $Q^{(\delta)}_{x,k} \otimes_{\tau_\delta} Q_{X(\tau_\delta), K(\tau_\delta)}$.
Note that for any $(x, k) \in \mathcal{A}(M)$ and $A \in \mathcal{M}^{\tau_\delta}$ we have
\[
R^{(\delta)}_{x,k}[A] = Q^{(\delta)}_{x,k}[A] = \tilde Q^{(\delta)}_{x,k}[A],
\tag{3.21}
\]
that is, the law of the augmented process coincides with that of the true process, and with that of the modified process with the cut-offs, until the stopping time $\tau_\delta$. Hence, according to the uniqueness part of Proposition 3.3, in such a case $Q^{(\delta)}_{x,k} \otimes_{\tau_\delta} Q_{X(\tau_\delta), K(\tau_\delta)} = \tilde Q^{(\delta)}_{x,k} \otimes_{\tau_\delta} Q_{X(\tau_\delta), K(\tau_\delta)}$. We denote by $E^{(\delta)}_{x,k}$ the expectation with respect to the augmented measure described by the above proposition. Let also $R^{(\delta)}_{x,k,\pi}$, $E^{(\delta)}_{x,k,\pi}$ denote the respective conditional law and expectation obtained by conditioning $R^{(\delta)}_{x,k}$ on $\mathcal{M}^{\tau_\delta}$. The following proposition is of crucial importance for us, as it shows that the law of the augmented process is close to that of the momentum diffusion as $\delta \to 0$. To abbreviate the notation we let
\[
N_t(G) := G(t, X(t), K(t)) - G(0, X(0), K(0)) - \int_0^t (\partial_\varrho + \tilde L) G(\varrho, X(\varrho), K(\varrho))\, d\varrho
\]
for any $G \in C_b^{1,1,3}([0, +\infty) \times \mathbb{R}^{2d}_*)$ and $t \ge 0$.

Proposition 3.4. Suppose that $(x, k) \in \mathcal{A}(M)$ and $\zeta \in C_b((\mathbb{R}^{2d}_*)^n)$ is nonnegative. Let $\gamma_0 \in (0, 1/2)$ and let $0 \le t_1 < \dots < t_n \le T_* \le t < v \le T$. We assume further that $v - t \ge \delta^{\gamma_0}$. Then, there exist constants $\gamma_1, C$ such that for any function $G \in C^{1,1,3}([T_*, T] \times \mathbb{R}^{2d}_*)$ we have
\[
\Big| E^{(\delta)}_{x,k} \big\{ [N_v(G) - N_t(G)]\, \tilde\zeta \big\} \Big| \le C \delta^{\gamma_1} (v - t)\, \|G\|^{[T_*, T]}_{1,1,3}\, T^2\, E^{(\delta)}_{x,k} \tilde\zeta.
\tag{3.22}
\]
Here $\tilde\zeta(\pi) := \zeta(X(t_1), K(t_1), \dots, X(t_n), K(t_n))$, $\pi \in \mathcal{C}(T, \delta)$. The choice of the constants $\gamma_1, C$ does not depend on $(x, k)$, $\delta \in (0, 1]$, $\zeta$, the times $t_1, \dots, t_n, T_*, T, v, t$, or the function $G$.

Proof. Let $0 = s_0 \le s_1 \le \dots \le s_n \le t$ and $B_1, \dots, B_n \in \mathcal{B}(\mathbb{R}^{2d}_*)$ be Borel sets. We denote $A_0 := \mathcal{C}$ and for any $k \in \{1, \dots, n\}$, $s \le s_k$ we define the events
\[
A_k := [\pi : (X(s_1), K(s_1)) \in B_1, \dots, (X(s_k), K(s_k)) \in B_k],
\]
and their shifted counterparts
\[
A^{(s)}_k := [\pi : (X(s_k - s), K(s_k - s)) \in B_k, \dots, (X(s_n - s), K(s_n - s)) \in B_n].
\]
For $(x, k) \in \mathbb{R}^{2d}_*$, $\pi \in \mathcal{C}$ and $G \in C^{1,1,2}([0, +\infty) \times \mathbb{R}^{2d}_*)$ we let
\[
\begin{aligned}
\widehat L_t G(t, x, k; \pi) :={}& H_0'(|k|)\, \hat k \cdot \nabla_x G(t, x, k) + \Theta^2(t, X(t), K(t); \pi)\, L_k G(t, x, k) \\
& - \Theta(t, X(t), K(t); \pi) \sum_{m,n=1}^{d} \partial_{K_m} \Theta(t, X(t), K(t); \pi)\, D_{m,n}(\hat k, |k|)\, \partial_{k_n} G(t, x, k)
\end{aligned}
\]
and
\[
\widehat N_t(G) := G(t, X(t), K(t)) - G(0, X(0), K(0)) - \int_0^t (\partial_\varrho + \widehat L_\varrho) G(\varrho, X(\varrho), K(\varrho); \pi)\, d\varrho.
\]
It follows from the definition of the stopping time $\tau_\delta(\pi)$ and the cut-off function $\Theta$ that $\nabla_K \Theta(t, X(t), K(t); \pi) = 0$, $t \in [0, \tau_\delta(\pi)]$, hence
\[
\widehat L_t G(t, X(t), K(t); \pi) = \tilde L G(t, X(t), K(t); \pi), \qquad t \in [0, \tau_\delta(\pi)].
\]
We need the following result.

Lemma 3.5. Suppose that $(x, k) \in \mathcal{A}(M)$ and $\zeta \in C_b((\mathbb{R}^{2d}_*)^n)$ is nonnegative. Let $\gamma_0 \in (0, 1)$, $0 \le t_1 < \dots < t_n \le T_* \le t < u \le T$ and $t - T_* \ge \delta^{\gamma_0}$. Then, there exist constants $\gamma_1, C > 0$ such that for any function $G \in C^{1,1,3}([T_*, T] \times \mathbb{R}^{2d}_*)$ we have
\[
\Big| \tilde E^{(\delta)}_{x,k} \big\{ [\widehat N_u(G) - \widehat N_t(G)]\, \tilde\zeta \big\} \Big| \le C \delta^{\gamma_1} (u - t)\, \|G\|^{[T_*, T]}_{1,1,3}\, T^2\, \tilde E^{(\delta)}_{x,k} \tilde\zeta.
\tag{3.23}
\]
The choice of the constants $\gamma_1, C$ does not depend on $(x, k)$, $\delta \in (0, 1]$, the times $t_1, \dots, t_n, T_*, T, t, u$, or the function $G$.

The proof of this lemma follows very closely the argument presented in Sect. 5.3 of [1] and we postpone it until the Appendix. In the meantime we apply this result to conclude the proof of Proposition 3.4. We write
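\[
\begin{aligned}
E^{(\delta)}_{x,k,\pi}[N_v(G) - N_{v \wedge \tau_\delta(\pi)}(G), A_n]
={}& \sum_{p=0}^{n-1} 1_{[s_p, s_{p+1})}(\tau_\delta(\pi))\, 1_{A_p}(\pi)\, M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}\big[ N_{v - \tau_\delta(\pi)}(G), A^{(\tau_\delta(\pi))}_{p+1} \big] \\
& + 1_{[s_n, v)}(\tau_\delta(\pi))\, 1_{A_n}(\pi)\, M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}\big[ N_{v - \tau_\delta(\pi)}(G) \big].
\end{aligned}
\tag{3.24}
\]
When $\tau_\delta(\pi) \in [s_p, s_{p+1})$ we obviously have
\[
M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}\big[ N_{v - \tau_\delta(\pi)}(G), A^{(\tau_\delta(\pi))}_{p+1} \big] = M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}\big[ N_{t - \tau_\delta(\pi)}(G), A^{(\tau_\delta(\pi))}_{p+1} \big]
\]
and $M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}[N_{v - \tau_\delta(\pi)}(G)] = 0$. Hence the left hand side of (3.24) equals
\[
\sum_{p=0}^{n-1} 1_{[s_p, s_{p+1})}(\tau_\delta(\pi))\, 1_{A_p}(\pi)\, M_{X(\tau_\delta(\pi)), K(\tau_\delta(\pi))}\big[ N_{t - \tau_\delta(\pi)}(G), A^{(\tau_\delta(\pi))}_{p+1} \big]
= E^{(\delta)}_{x,k,\pi}[N_t(G) - N_{t \wedge \tau_\delta(\pi)}(G), A_n].
\tag{3.25}
\]
We conclude from (3.24), (3.25) that
\[
E^{(\delta)}_{x,k,\pi}[N_v(G), A_n] = E^{(\delta)}_{x,k,\pi}[N_{v \wedge \tau_\delta(\pi)}(G) + N_t(G) - N_{t \wedge \tau_\delta(\pi)}(G), A_n]
= E^{(\delta)}_{x,k,\pi}[N_{(v \wedge \tau_\delta(\pi)) \vee t}(G), A_n],
\tag{3.26}
\]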
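and therefore
\[
\begin{aligned}
E^{(\delta)}_{x,k}[N_v(G), A_n]
&= E^{(\delta)}_{x,k} \big\{ E^{(\delta)}_{x,k,\pi}[N_{(v \wedge \tau_\delta(\pi)) \vee t}(G), A_n] \big\} \\
&= E^{(\delta)}_{x,k} \big\{ E^{(\delta)}_{x,k,\pi}\big[ N_{(v \wedge \tau_\delta(\pi)) \vee t}(G), A_n \big],\ \tau_\delta(\pi) \le t \big\}
+ E^{(\delta)}_{x,k} \big\{ E^{(\delta)}_{x,k,\pi}\big[ N_{(v \wedge \tau_\delta(\pi)) \vee t}(G), A_n \big],\ \tau_\delta(\pi) > t \big\}.
\end{aligned}
\tag{3.27}
\]
The first term on the utmost right hand side of (3.27) equals $E^{(\delta)}_{x,k}[N_t(G), A_n, \tau_\delta \le t]$, while the second one equals $\tilde E^{(\delta)}_{x,k}\big[ N_{(v \wedge \tau_\delta) \vee t}(G), B \big]$. Here $B := A_n \cap [\tau_\delta > t]$ is an $\mathcal{M}^t$-measurable event. Suppose that $\gamma_0' \in (\gamma_0 + 1/2, 1)$ and let $L := [\delta^{-\gamma_0'}]$ be yet another mesh size parameter. We define $\sigma := L^{-1} \big[ ([L(v \wedge \tau_\delta)] + 2) \vee ([Lt] + 2) \big]$ and note that
\[
\tilde E^{(\delta)}_{x,k}[N_\sigma(G), B] = \sum_{p = [Lt] + 2}^{[Lv] + 2} \tilde E^{(\delta)}_{x,k}\Big[ N_{p/L}(G), B, \sigma = \frac{p}{L} \Big].
\tag{3.28}
\]
Representing the event $[\sigma = p/L]$ as the difference of $[\sigma \ge p/L]$ and $[\sigma \ge (p+1)/L]$ (note that $[\sigma \ge ([Lv] + 3)/L] = \emptyset$) and grouping the terms of the sum that correspond to the same index $p$, we obtain that the right-hand side of (3.28) equals
\[
\tilde E^{(\delta)}_{x,k}\big[ N_{([Lt]+2)/L}(G), B \big] + \sum_{p = [Lt] + 2}^{[Lv] + 2} \tilde E^{(\delta)}_{x,k}\Big[ N_{(p+1)/L}(G) - N_{p/L}(G), B, \sigma \ge \frac{p+1}{L} \Big].
\tag{3.29}
\]
Since the event $B \cap [\sigma \ge (p+1)/L]$ is $\mathcal{M}^{(p-1)/L}$-measurable, from Lemma 3.5 we conclude that the absolute value of each term appearing under the summation sign in (3.29) can be estimated by $C \|G\|^{[T_*,T]}_{1,1,3}\, \delta^{\gamma_1} L^{-1} T^2\, \tilde Q^{(\delta)}_{x,k}[B]$, which implies
\[
\Big| \tilde E^{(\delta)}_{x,k}[N_\sigma(G), B] - \tilde E^{(\delta)}_{x,k}\big[ N_{([Lt]+2)L^{-1}}(G), B \big] \Big| \le C \delta^{\gamma_1}\, \|G\|^{[T_*,T]}_{1,1,3}\, T^2\, \tilde Q^{(\delta)}_{x,k}[B]\, \frac{[Lv] + 1 - [Lt]}{L}.
\]
A direct calculation using formulas (3.1) allows us to conclude also that both $|N_\sigma(G) - N_{(v \wedge \tau_\delta) \vee t}(G)|$ and $|N_{([Lt]+2)L^{-1}}(G) - N_t(G)|$ are estimated by $C \|G\|^{[T_*,T]}_{1,1,3}\, \delta^{\gamma_0' - 1/2}$. Hence (since $\gamma_0' > 1/2 + \gamma_0$),
\[
\begin{aligned}
\Big| \tilde E^{(\delta)}_{x,k}\big[ N_{(v \wedge \tau_\delta) \vee t}(G), B \big] - \tilde E^{(\delta)}_{x,k}[N_t(G), B] \Big|
\le{}& \Big| \tilde E^{(\delta)}_{x,k}\big[ N_\sigma(G) - N_{(v \wedge \tau_\delta) \vee t}(G), B \big] \Big| \\
& + \Big| \tilde E^{(\delta)}_{x,k}[N_\sigma(G), B] - \tilde E^{(\delta)}_{x,k}\big[ N_{([Lt]+2)L^{-1}}(G), B \big] \Big| \\
& + \Big| \tilde E^{(\delta)}_{x,k}\big[ N_{([Lt]+2)L^{-1}}(G) - N_t(G), B \big] \Big| \\
\le{}& C \delta^{\gamma_1'}\, \|G\|^{[T_*,T]}_{1,1,3}\, T^2\, \tilde Q^{(\delta)}_{x,k}[B]\, \big( (v - t) \vee \delta^{\gamma_0} \big)
\end{aligned}
\tag{3.30}
\]
for a certain constant $C > 0$ and $\gamma_1' := \min[\gamma_0' - \gamma_0 - 1/2, \gamma_1]$. From (3.27), (3.30) and the observation just below (3.27), we obtain
\[
\Big| E^{(\delta)}_{x,k}[N_v(G) - N_t(G), A_n] \Big| \le C \delta^{\gamma_1'}\, \|G\|^{[T_*,T]}_{1,1,3}\, T^2\, R^{(\delta)}_{x,k}[A_n]\, \big( (v - t) \vee \delta^{\gamma_0} \big)
\]
for a certain constant $C > 0$ and the conclusion of Proposition 3.4 follows.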
3.8. An estimate of the stopping time. The purpose of this section is to prove the fol(δ) lowing estimate for Rx,k [τδ < T ]. Theorem 3.6. Assume that the dimension d ≥ 3. Then, one can choose 1 , 2 , 3 , 4 in such a way that there exist constants C, γ > 0 for which (δ)
Rx,k [ τδ < T ] ≤ Cδ γ T ,
∀ δ ∈ (0, 1], T ≥ 1, (x, k) ∈ A(M).
(3.31)
Proof. With no loss of generality we can assume that δ −1 > T , since otherwise (3.31) holds with C = γ = 1. We obviously have then [ τδ < T ] = [ Uδ ≤ τδ , Uδ < T ] ∪ [ Sδ ≤ τδ , Sδ < T ]
(3.32)
with the stopping times Sδ and Uδ defined in (3.3) and (3.4). Let us denote the first and second event appearing on the right hand side of (3.32) by A(δ) and B(δ) respectively. (δ) To show that (3.32) holds we prove that the Rx,k probabilities of both events can be estimated by Cδ γ T for some C, γ > 0: see (3.40), (3.41) and (3.45). (δ)
3.8.1. An estimate of Rx,k [A(δ)]. The first step towards obtaining the desired estimate will be to replace the event A(δ) whose definition involves a stopping time by an event C(δ) whose definition depends only on deterministic times, see (3.33) below. Next we use the estimate (3.22) of Proposition 3.4 for an appropriately chosen function G to (δ) ˜ by an easier problem of reduce the question of bounding the Rx,k probability of A(δ) estimating its Qx,k probability (Qx,k corresponds to a degenerate diffusion determined by (3.20)). The latter is achieved by using bounds on heat kernels corresponding to hypoelliptic diffusions due to Kusuoka and Stroock. We assume in √this section to simplify the notation and without any loss of generality ˜ that h∗ (2M) + δ D(2M) ≤ 1. Note that then
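\[ A(\delta) \subset \tilde A(\delta) := \Big[ \Big| X\Big(\frac{j}{q}\Big) - X\Big(\frac{i}{q}\Big) \Big| \le \frac{3}{q} \ :\ 1 \le i \le j \le [Tq],\ |i-j| \ge \frac{q}{p} \Big] \tag{3.33} \]
and thus
\[ R^{(\delta)}_{x,k}[A(\delta)] \le [Tq]^2 \max\Big\{ R^{(\delta)}_{x,k}\Big[ \Big| X\Big(\frac{j}{q}\Big) - X\Big(\frac{i}{q}\Big) \Big| \le \frac{3}{q} \Big] \ :\ 1 \le i \le j \le [Tq],\ |i-j| \ge \frac{q}{p} \Big\}. \tag{3.34} \]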
Suppose that $f^{(\delta)} : \mathbb{R}^d \to [0,1]$ is a $C^\infty$-regular function that satisfies $f^{(\delta)}(x) = 1$ if $|x| \le 4/q$ and $f^{(\delta)}(x) = 0$ if $|x| \ge 5/q$. We assume furthermore that $i, j$ are positive integers such that $(j-i)/q \in [0,1]$ and $\|f^{(\delta)}\|_3 \le 2q^3$. For any $x_0 \in \mathbb{R}^d$ and $i/q \le t \le j/q$ define
\[ G_j(t,x,k;x_0) := M_{x,k} f^{(\delta)}\Big( X\Big(\frac{j}{q} - t\Big) - x_0 \Big) \]
for $(x,k) \in A(M_\delta)$ and extend it elsewhere in such a way that it satisfies
\[ \partial_t G_j(t,x,k;x_0) + \tilde L G_j(t,x,k;x_0) = 0, \quad i/q \le t \le j/q, \]
where $\tilde L$ is a generator of a certain diffusion with $C^\infty$ bounded coefficients that agree with the coefficients of $\tilde L$ on $A(M_\delta)$. Hence, using Proposition 3.4 with $v = j/q$ and $t = i/q$ (note that $v - t \ge 1/p \ge \delta^{\epsilon_2}$ and $\epsilon_2 \in (0,1/2)$), we obtain that there exists $\gamma_1 > 0$ such that
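\[ \Big| E^{(\delta)}_{x,k}\Big[ f^{(\delta)}\Big( X\Big(\frac{j}{q}\Big) - x_0 \Big) - G_j\Big( \frac{i}{q}, X\Big(\frac{i}{q}\Big), K\Big(\frac{i}{q}\Big); x_0 \Big) \,\Big|\, \mathcal{M}^{i/q} \Big] \Big| \le C\,\frac{j-i}{q}\, \|G_j(\cdot,\cdot,\cdot;x_0)\|^{[i/q,j/q]}_{1,1,3}\, T^2 \delta^{\gamma_1}, \quad \forall\, \delta \in (0,1]. \tag{3.35} \]
According to [15], Theorem 2.58, p. 53, we have
\[ \|G_j(\cdot,\cdot,\cdot;x_0)\|^{[i/q,j/q]}_{1,1,3} \le C\|f^{(\delta)}\|_3 \le Cq^3 \le C\delta^{-3(\epsilon_2+\epsilon_3)}, \quad j \in \{0,\ldots,[qT]\}. \tag{3.36} \]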
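Hence combining (3.35) and (3.36) we obtain that the left-hand side of (3.35) is less than or equal to $C\delta^{\gamma_1 - 3(\epsilon_2+\epsilon_3)}$ for all $\delta \in (0,1]$. Let now $i_0 = j - q/p$ so that $1 \le i \le i_0 \le j \le [Tq]$. We have
\[ R^{(\delta)}_{x,k}\Big[ \Big| X\Big(\frac{j}{q}\Big) - X\Big(\frac{i}{q}\Big) \Big| \le \frac{3}{q} \Big] \le E^{(\delta)}_{x,k}\, f^{(\delta)}\Big( X\Big(\frac{j}{q}\Big) - X\Big(\frac{i}{q}\Big) \Big) = E^{(\delta)}_{x,k}\Big[ E^{(\delta)}_{x,k}\Big[ f^{(\delta)}\Big( X\Big(\frac{j}{q}\Big) - y \Big) \,\Big|\, \mathcal{M}^{i_0/q} \Big]\Big|_{y = X(i/q)} \Big]. \tag{3.37} \]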
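According to (3.35) and (3.36) we can estimate the utmost right-hand side of (3.37) by
\[ \sup\Big\{ M_{x,k} f^{(\delta)}\Big( X\Big(\frac{1}{p}\Big) - y \Big) : x, y \in \mathbb{R}^d,\ k \in A(2M) \Big\} + C\delta^{\gamma_1 - 3(\epsilon_2+\epsilon_3)} T^2. \tag{3.38} \]
To estimate the first term in (3.38) we use the following.
Lemma 3.7. Let $p, q$ be as in (3.2). Then, there exist positive constants $C_1$, $C_2$ and $C_3$ such that for all $x, y \in \mathbb{R}^d$, $k \in A(2M)$, $j \in \{1,\ldots,[pT]\}$, $\delta \in (0,1]$ we have
\[ Q_{x,k}\Big[ \Big| X\Big(\frac{j}{p}\Big) - y \Big| \le \frac{5}{q} \Big] \le C_1\Big( \frac{p^{C_2}}{q^d} + e^{-C_3 p} \Big). \tag{3.39} \]
We postpone the proof of the lemma for a moment in order to finish the estimate of $R^{(\delta)}_{x,k}[A(\delta)]$. Using (3.39) we obtain that the expression in (3.38) can be estimated by
\[ C_1\Big( \frac{p^{C_2}}{q^d} + e^{-C_3 p} \Big) + C\delta^{\gamma_1 - 3(\epsilon_2+\epsilon_3)} T^2 \le C_1\Big( \delta^{(d-C_2)\epsilon_2 + d\epsilon_3} + \exp\big\{-C_3\delta^{-\epsilon_2}\big\} \Big) + C\delta^{\gamma_1 - 3(\epsilon_2+\epsilon_3)} T^2. \]
Hence, from (3.34), we obtain that
\begin{align*}
R^{(\delta)}_{x,k}[A(\delta)] &\le [Tq]^2 \Big\{ C_1\Big( \delta^{(d-C_2)\epsilon_2 + d\epsilon_3} + \exp\big\{-C_3\delta^{-\epsilon_2}\big\} \Big) + C\delta^{\gamma_1 - 3(\epsilon_2+\epsilon_3)} T^2 \Big\} \\
&\le CT^2 \Big\{ \delta^{(d-2-C_2)\epsilon_2 + (d-2)\epsilon_3} + \delta^{-2(\epsilon_2+\epsilon_3)} \exp\big\{-C_3\delta^{-\epsilon_2}\big\} + \delta^{\gamma_1 - 5(\epsilon_2+\epsilon_3)} T^2 \Big\} \le C\delta^{\gamma_2} T^4 \tag{3.40}
\end{align*}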
for $\gamma_2 := \min[(d-2-C_2)\epsilon_2 + (d-2)\epsilon_3,\ \gamma_1 - 5(\epsilon_2+\epsilon_3)] > 0$, provided that $\epsilon_2 + \epsilon_3 < \gamma_1/5$ and $\epsilon_2 \in (0, (d-2)\epsilon_3/(C_2+2-d))$. Here with no loss of generality we have assumed that $C_2 + 2 > d$. Recall also that $d \ge 3$. Now suppose that $\gamma_3 \in (0,\gamma_2)$. Consider two cases: $T^3 < \delta^{-\gamma_3}$ and $T^3 \ge \delta^{-\gamma_3}$. In the first one, the utmost right-hand side of (3.40) can be bounded from above by $C\delta^{\gamma_2-\gamma_3} T$. In the second we have a trivial bound of the left side by $\delta^{\gamma_3/3} T$. We have proved therefore that
\[ R^{(\delta)}_{x,k}[A(\delta)] \le C\delta^{\gamma} T \tag{3.41} \]
for some $C, \gamma > 0$ independent of $\delta$ and $T$.
Proof of Lemma 3.7. We prove this lemma by induction on $j$. First, we verify it for $j = 1$. Without any loss of generality we may suppose that $k = (k_1,\ldots,k_d)$ and $k_d > (4dM_\delta)^{-1}$. Let $\tilde D_{mn} : \mathbb{R}^{d-1} \to \mathbb{R}$, $m, n = 1,\ldots,d-1$, $\tilde E_m : \mathbb{R}^{d-1} \to \mathbb{R}$, $m = 1,\ldots,d$ be given by
\[ \tilde D_{pq}(\mathbf{l}) := D_{pq}\big(k^{-1}\mathbf{l},\ k^{-1}\sqrt{k^2-l^2},\ k\big), \qquad \tilde E_p(\mathbf{l}) := E_p\big(k^{-1}\mathbf{l},\ k^{-1}\sqrt{k^2-l^2},\ k\big), \]
when $\mathbf{l} \in Z := [\mathbf{l} \in B^{d-1}_k : k^{-1}\sqrt{k^2-l^2} > (4dM_\delta)^{-1}]$, $l = |\mathbf{l}|$. These functions are $C^\infty$ smooth and bounded together with all their derivatives. Note also that the matrix $\tilde D = [\tilde D_{mn}]$ is symmetric and $\tilde D\xi\cdot\xi \ge \lambda_0|\xi|^2$ for all $\xi \in \mathbb{R}^{d-1}$ and a certain $\lambda_0 > 0$. The projection $K(t) = (K_1(t),\ldots,K_d(t))$ of the canonical path process $(X(t;\pi), K(t;\pi))$ considered over the probability space $(C, \mathcal{M}, Q_{x,k_0})$, where $k_0 := (\mathbf{l}, \sqrt{k^2-l^2})$, with $\mathbf{l} \in Z$, is a diffusion whose generator equals $L$, see (3.19). It can be easily seen that $(K_1(t),\ldots,K_{d-1}(t))_{t\ge 0}$ is then a diffusion starting at $\mathbf{l}$, whose generator $N$ is of the form
\[ N F(\mathbf{l}) := \sum_{p=1}^{d-1} X_p^2 F(\mathbf{l}) + \sum_{q=1}^{d-1} a_q(\mathbf{l})\,\partial_{l_q} F(\mathbf{l}), \quad F \in C_0^\infty(\mathbb{R}^{d-1}), \]
where $a_q(\mathbf{l})$, $q = 1,\ldots,d-1$ are certain $C^\infty$-functions and
\[ X_p(\mathbf{l}) := \sum_{q=1}^{d-1} \tilde D^{1/2}_{pq}(\mathbf{l})\,\partial_{l_q}, \quad p = 1,\ldots,d-1. \]
The $(d-1)\times(d-1)$ matrix $[\tilde D^{1/2}_{pq}(\mathbf{l})]$ is non-degenerate when $\mathbf{l} \in Z$. Let
\[ \tilde N F(\mathbf{l},x) := \sum_{p=1}^{d-1} \tilde X_p^2 F(\mathbf{l},x) + \tilde X_0 F(\mathbf{l},x), \quad F \in C_0^\infty(\mathbb{R}^{d-1}\times\mathbb{R}^d), \tag{3.42} \]
where $\tilde X_0$ is a $C^\infty$-smooth extension of the field
\[ X_0(\mathbf{l}) := \frac{H_0'(k)}{k}\sum_{q=1}^{d-1} l_q\,\partial_{x_q} + \frac{H_0'(k)}{k}\sqrt{k^2-l^2}\,\partial_{x_d} + \sum_{q=1}^{d-1} a_q(\mathbf{l})\,\partial_{l_q}, \quad \mathbf{l} \in Z. \]
It can be shown, by the same type of argument as that given on pp. 122–123 of [1], that for each $(x,\mathbf{l})$, with $\mathbf{l} \in Z$, the linear space spanned at that point by the fields belonging to the Lie algebra generated by $[X_0,X_1],\ldots,[X_0,X_{d-1}], X_1,\ldots,X_{d-1}$ is of dimension $2d-1$. One can also guarantee that the extensions $\tilde X_0,\ldots,\tilde X_{d-1}$ satisfy the same condition. We shall denote the respective extension of $N$ by the same symbol.
Set $\mathbf{l}_0 := (k_1,\ldots,k_{d-1})$. Let $R_{\mathbf{l}_0}$, $\tilde R_{x,\mathbf{l}_0}$ be the path measures supported on $C^{d-1}$ and $C^{d,d-1}$ respectively (see Sect. 2.4) that solve the martingale problems corresponding to the generators $N$ and $\tilde N$ with the respective initial conditions at $t = 0$ given by $\mathbf{l}_0$ and $(x,\mathbf{l}_0)$. Let $r(t, x-y, \mathbf{l}_1, \mathbf{l}_2)$, $t \in (0,+\infty)$, $x, y \in \mathbb{R}^d$, $\mathbf{l}_1, \mathbf{l}_2 \in \mathbb{R}^{d-1}$ be the transition of probability density that corresponds to $\tilde R_{x,\mathbf{l}_0}$. Using Corollary 3.25, p. 22 of [10] we have that for some constants $C, m > 0$,
\[ r(t, x-y, k, l) \le Ct^{-m}, \quad \text{provided that } |x-y| \le 1,\ |k|, |l| \le 2M,\ t \in (0,1]. \tag{3.43} \]
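Denote by $\tau_Z(\pi)$ the exit time of a path $\pi \in C^{d-1}$ from the set $Z$. For any $\pi \in C^{d,d-1}$ we set also $\tilde\tau_Z(\pi) = \tau_Z(K(\cdot;\pi))$. Let $S : B^{d-1}_k \to S^{d-1}_k$ be given by
\[ S(\mathbf{l}) := \big(l_1,\ldots,l_{d-1}, \sqrt{k^2-l^2}\big), \quad \mathbf{l} = (l_1,\ldots,l_{d-1}) \in B^{d-1}_k,\ l := |\mathbf{l}|, \]
and let $\tilde S : C^{d,d-1} \to C$ be given by $\tilde S(\pi)(t) := (X(t;\pi), S\circ K(t;\pi))$, $t \ge 0$. For any $A \in \mathcal{M}^{\tilde\tau_Z}$ we have $\tilde R_{x,\mathbf{l}_0}[\tilde S^{-1}(A)] = Q_{x,S(\mathbf{l}_0)}[A]$. Since the event $[|X(1/p) - y| \le 5/q] \cap [\tilde\tau_Z \ge 1/p]$ is $\mathcal{M}^{\tilde\tau_Z}$-measurable we have for $\delta$ so small that $q \ge 5$ and $\delta < \delta_*(M)$:
\begin{align*}
Q_{x,k}\Big[ \Big| X\Big(\frac{1}{p}\Big) - y \Big| \le \frac{5}{q} \Big] &\le \tilde R_{x,\mathbf{l}_0}\Big[ \Big| X\Big(\frac{1}{p}\Big) - y \Big| \le \frac{5}{q},\ \tilde\tau_Z \ge \frac{1}{p} \Big] + R_{\mathbf{l}_0}\Big[ \tau_Z < \frac{1}{p} \Big] \\
&\le C\bar\omega_d\, p^m \Big(\frac{4}{q}\Big)^d + Ce^{-C_3 p}. \tag{3.44}
\end{align*}
Here $\bar\omega_d$ denotes the volume of $B^d$. To obtain the last inequality we have used (3.43) and an estimate for non-degenerate diffusions stating that $R_{\mathbf{l}_0}[\tau_Z < 1/p] < Ce^{-C_3 p}$ for some constants $C, C_3 > 0$ depending only on $d$ and $\lambda_0$, see e.g. (2.1) p. 87 of [16]. Inequality (3.44) implies easily (3.39) for $j = 1$ with $C_2 = m$. To finish the induction argument assume that (3.39) holds for a certain $j$. We show that it holds for $j+1$ with the same constants $C_1$, $C_2$ and $C_3 > 0$. The latter follows easily from the Chapman-Kolmogorov equation, since
\begin{align*}
Q_{x,k}\Big[ \Big| X\Big(\frac{j+1}{p}\Big) - y \Big| \le \frac{5}{q} \Big] &= \int_{\mathbb{R}^d\times S^{d-1}_k} Q_{z,m}\Big[ \Big| X\Big(\frac{j}{p}\Big) - y \Big| \le \frac{5}{q} \Big]\, Q\Big(\frac{1}{p}, x, k, dz, dm\Big) \\
&\le C_1\Big( \frac{p^{C_2}}{q^d} + e^{-C_3 p} \Big) \int_{\mathbb{R}^d\times S^{d-1}_k} Q\Big(\frac{1}{p}, x, k, dz, dm\Big) = C_1\Big( \frac{p^{C_2}}{q^d} + e^{-C_3 p} \Big),
\end{align*}
where the inequality uses the induction assumption, and the formula (3.39) for $j+1$ follows. Here $Q(t,x,k,\cdot,\cdot)$ is the transition of probability corresponding to the path measure $Q_{x,k}$.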
3.8.2. An estimate of $R^{(\delta)}_{x,k}[B(\delta)]$. We start with a simple observation concerning the Hölder regularity of the $K$ component of any path $\pi \in B(\delta)$. Let us denote $\rho := 2M_\delta^{-1}N^{-1/2}$ and
\[ D := \Big[ \pi \in C(T,\delta) : |K(t) - K(s)| \ge \rho \text{ for some } k \text{ s.t. } t^{(p)}_k \le T \text{ and } t \in [t^{(p)}_k, t^{(p)}_{k+1}],\ s \in [t^{(p)}_{k-1}, t^{(p)}_k] \Big], \]
where $M_\delta$ has been defined in (2.7) and $N$ in (3.2). Suppose that $\pi \in B(\delta)$, then we can find $t \in [t^{(p)}_k, t^{(p)}_{k+1}]$, $s \in [t^{(p)}_{k-1}, t^{(p)}_k]$ for which $\hat K(t)\cdot\hat K(s) \le 1 - 1/N$. This, however, implies that
\[ |K(t) - K(s)|^2 \ge \frac{1}{M_\delta^2}\,|\hat K(t) - \hat K(s)|^2 \ge \frac{2}{M_\delta^2 N}, \]
thus $\pi \in D$. Hence the desired estimate of $R^{(\delta)}_{x,k}[B(\delta)]$ follows from the following lemma.
Lemma 3.8. Under the assumptions of Theorem 3.6 there exist $C, \gamma > 0$ such that
\[ R^{(\delta)}_{x,k}[D] \le CT\delta^{\gamma}, \quad \forall\, \delta \in (0,1],\ T \ge 1,\ (x,k) \in A(M). \tag{3.45} \]
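Proof. We define the following events:
\begin{align*}
F_1 &:= \Big[ |K(t) - K(s)| \ge \rho \text{ for some } s, t \in [0,T],\ 0 < t - s < \frac{2}{p},\ t \le \tau_\delta \Big], \\
F_2 &:= \Big[ |K(t) - K(s)| \ge \rho \text{ for some } s, t \in [0,T],\ 0 < t - s < \frac{2}{p},\ s \ge \tau_\delta \Big], \\
F_3 &:= \Big[ |K(\tau_\delta) - K(s)| \ge \frac{\rho}{2} \text{ for some } s \in [0,T],\ 0 < \tau_\delta - s < \frac{2}{p},\ \tau_\delta \le T \Big], \\
F_4 &:= \Big[ |K(\tau_\delta) - K(t)| \ge \frac{\rho}{2} \text{ for some } t \in [0,T],\ 0 < t - \tau_\delta < \frac{2}{p} \Big].
\end{align*}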
Observe that $D \subset \bigcup_{i=1}^{4} F_i$. Note that $F_1$, $F_3$ are $\mathcal{M}^{\tau_\delta}$-measurable, hence
\[ R^{(\delta)}_{x,k}[F_i] = \tilde Q^{(\delta)}_{x,k}[F_i], \quad i = 1, 3. \tag{3.46} \]
On the other hand for $i = 2, 4$ we have
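\[ R^{(\delta)}_{x,k}[F_i] = \int Q_{X(\tau_\delta(\pi)),K(\tau_\delta(\pi))}[F_{i,\pi}]\, \tilde Q^{(\delta)}_{x,k}(d\pi), \]
where for a given $\pi \in C$,
\begin{align*}
F_{2,\pi} &:= \Big[ |K(t) - K(s)| \ge \rho \text{ for some } s, t \in [0, (T - \tau_\delta(\pi)) \wedge 0],\ 0 < t - s < \frac{2}{p} \Big], \\
F_{4,\pi} &:= \Big[ |K(0) - K(t)| \ge \frac{\rho}{2} \text{ for some } t \in [0, (T - \tau_\delta(\pi)) \wedge 0],\ 0 < t < \frac{2}{p} \Big].
\end{align*}
Since all $F_i$, $i = 1, 3$ and $F_{i,\pi}$, $i = 2, 4$, $\pi \in C$ are contained in the event
\[ F := \Big[ |K(t) - K(s)| \ge \frac{\rho}{2} \text{ for some } s, t \in [0,T],\ 0 < t - s < \frac{2}{p} \Big], \]
(3.45) would follow if we show that there exist $C > 0$ and $\gamma > 0$ for which
\[ \tilde Q^{(\delta)}_{x,k}[F] \le CT\delta^{\gamma} \quad \text{for all } (x,k) \in A(M) \tag{3.47} \]
and
\[ Q_{x,k}[F] \le CT\delta^{\gamma} \quad \text{for all } (x,k) \in A(M_\delta). \tag{3.48} \]
The estimate (3.48) follows from elementary properties of diffusions, see e.g. (2.46) p. 47 of [15]. We carry on with the proof of (3.47). The argument is analogous to the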
proof of Theorem 1.4.6 of [16]. Let $L$ be a multiple of $p$ such that $L := [\delta^{-\gamma_0}]$, where $\gamma_0 \in (1/2, 1)$ is to be specified even further later on. Let also $s^{(L)}_k := k/L$, $k = 0, 1, \ldots$. We define now the stopping times $\tau_k(\pi)$ that determine the times at which the $K$ component of the path $\pi$ performs its $k$-th oscillation of size $\rho/8$. Let $\tau_0(\pi) := 0$ and for any $k \ge 0$,
\[ \tau_{k+1}(\pi) := \inf\Big[ s^{(L)}_k \ge \tau_k(\pi) : |K(s^{(L)}_k) - K(\tau_k(\pi))| \ge \frac{\rho}{8} \Big], \]
with the convention that $\tau_{n+1} = +\infty$ when $\tau_n = +\infty$, or when the respective event is impossible. Let $N_\# := \min[n : \tau_{n+1} > T]$ and $\delta^* := \min[\tau_n - \tau_{n-1} : n = 1,\ldots,N_\#]$. Then, for a sufficiently small $\delta_0$ and $\delta \in (0,\delta_0)$ we have $F \subset [\delta^* \le 1/p]$ so we only need to estimate the $\tilde Q^{(\delta)}_{x,k}$ probability of the latter event.
Let $f : \mathbb{R}^d \to [0,1]$ be a function of $C_c^\infty(\mathbb{R}^d)$ class such that $f(k) \equiv 1$ when $|k| \le \rho/16$ and $f(k) \equiv 0$ when $|k| \ge \rho/8$. Let also $f_l(\cdot) := f(\cdot - l)$ for any $l \in \mathbb{R}^d$. Note that according to Lemma 3.5 we can choose constants $A_\rho, C > 0$, where $C$ is independent of $\rho$, in such a way that $A_\rho < CT^2\rho^{-3}$ and the random sequence
\[ S^l_N := \tilde E^{(\delta)}_{x,k}\Big[ f_l\Big( K\Big(\frac{N+1}{L}\Big) \Big) \,\Big|\, \mathcal{M}^{N/L} \Big] + A_\rho \frac{N}{L}, \quad N \ge 0 \tag{3.49} \]
is a $\tilde Q^{(\delta)}_{x,k}$-submartingale with respect to the filtration $(\mathcal{M}^{N/L})_{N\ge 0}$ for all $l$ with $|l| \in ((3M_\delta)^{-1}, 3M_\delta)$ provided that $\delta$ is sufficiently small. We can decompose
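\begin{align*}
\tilde Q^{(\delta)}_{x,k}\Big[ \delta^* \le \frac{2}{p} \Big] &\le \tilde Q^{(\delta)}_{x,k}\Big[ \delta^* \le \frac{2}{p},\ N_\# \le [\delta^{-\alpha}] \Big] + \tilde Q^{(\delta)}_{x,k}\Big[ \delta^* \le \frac{2}{p},\ N_\# > [\delta^{-\alpha}] \Big] \\
&\le \sum_{i=1}^{[\delta^{-\alpha}]} \tilde Q^{(\delta)}_{x,k}\Big[ \tau_i - \tau_{i-1} \le \frac{2}{p} \Big] + \tilde Q^{(\delta)}_{x,k}\big[ N_\# > [\delta^{-\alpha}] \big], \tag{3.50}
\end{align*}
where $\alpha > 0$ is to be determined later. We will show that
\[ \tilde Q^{(\delta)}_{x,k}\big[ N_\# > [\delta^{-\alpha}] \big] \le Ce^{T}\Big( 1 - \frac{\delta^{\frac12(\epsilon_1+\epsilon_2)}}{2} \Big)^{\delta^{-\alpha}} \tag{3.51} \]
and
\[ \tilde Q^{(\delta)}_{x,k}\big[ \tau_{n+1} - \tau_n \le [L\delta^{\epsilon_2}]/L \,\big|\, \mathcal{M}^{\tau_n} \big] \le C\delta^{\gamma} T^2, \tag{3.52} \]
for $0 < \gamma < \min[\epsilon_2 - 3\epsilon_1/2,\ \gamma_0 - (1+\epsilon_1)/2]$. From (3.49), (3.50), (3.51) and (3.52) we further conclude that
\[ \tilde Q^{(\delta)}_{x,k}\Big[ \delta^* \le \frac{1}{p} \Big] \le CT^2\delta^{\gamma-\alpha} + Ce^{T}\Big( 1 - \frac{\delta^{\frac12(\epsilon_1+\epsilon_2)}}{2} \Big)^{\delta^{-\alpha}} \tag{3.53} \]
for some $C > 0$, independent of $\delta \in (0,1]$ and $T \ge T_0$, provided that we choose $\alpha \in (\frac12(\epsilon_1+\epsilon_2), \gamma)$. This is possible if $\min[\epsilon_2 - 3\epsilon_1/2,\ \gamma_0 - (1+\epsilon_1)/2] > (\epsilon_1+\epsilon_2)/2$, which is true if we assume $\epsilon_2 > 10\epsilon_1 > 0$ and $1 > \gamma_0 > (1+\epsilon_2)/2 + \epsilon_1$. Now, by the argument made after (3.40) we can always replace the first term on the right side of (3.53) by $CT\delta^{\gamma_1}$. We can also assume that the second term on the right-hand side of (3.53) is less than or equal to $CT\delta^{\gamma_1}$. This can be seen as follows. Let $\beta := \alpha - \frac12(\epsilon_1+\epsilon_2)$. The term in question is bounded by $C\exp\{T - C_1\delta^{-\beta}\}$ with $C_1 := \inf_{\rho\in(0,1]} \rho^{-1}\log(1-\rho/2)^{-1}$. For $\delta^{-\beta} \ge 2T/C_1$ we get that $\exp\{T - C_1\delta^{-\beta}\}$ is less than or equal to $\exp\{-C_1\delta^{-\beta}/2\}$, while for $\delta^{-\beta} < 2T/C_1$ the left side of (3.53) is obviously less than $2T\delta^{\beta}/C_1$. In both cases we can find a bound as claimed. This proves (3.47) and hence the proof of Lemma 3.8 will be complete if we prove (3.51) and (3.52).
To this end, let $\tilde Q^{(\delta)}_{x,k,\pi}$, $\pi \in C$ denote the family of the regular conditional probability distributions that corresponds to $\tilde Q^{(\delta)}_{x,k}[\,\cdot\,|\,\mathcal{M}^{\tau_n}]$. Then, there exists an $\mathcal{M}^{\tau_n}$-measurable, null $\tilde Q^{(\delta)}_{x,k}$ probability event $Z$ such that for each $\pi \notin Z$ and each $l \in \mathbb{R}^d_*$ the random sequence
\[ S^l_{N,\pi} := S^l_N\, 1_{[0,N/L]}(\tau_n(\pi)), \quad N \ge 0 \]
is an $(\mathcal{M}^{N/L})_{N\ge 0}$ submartingale under $\tilde Q^{(\delta)}_{x,k,\pi}$. Let $T_{n,\pi} := \tau_{n+1} \wedge (\tau_n(\pi) + 2[L\delta^{\epsilon}]/L)$, where $\epsilon \in (0,1)$ is a constant to be chosen later on. We can choose the event $Z$ in such a way that
\[ \tilde Q^{(\delta)}_{x,k,\pi}[T_{n,\pi} \ge \tau_n(\pi)] = 1, \quad \forall\, \pi \notin Z. \tag{3.54} \]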
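Let $\tilde S_{N,\pi} := S^{K(\tau_n(\pi))}_{N,\pi}$, then the submartingale property of $(\tilde S_{N,\pi})_{N\ge 0}$ and (3.54) imply that
\[ \tilde E^{(\delta)}_{x,k,\pi}\, \tilde S_{LT_{n,\pi},\pi} \ge \tilde E^{(\delta)}_{x,k,\pi}\, \tilde S_{L\tau_n(\pi),\pi} = 1 + A_\rho\tau_n(\pi), \tag{3.55} \]
provided that $\gamma_0 \ge (1+\epsilon_1)/2$. The latter condition assures that $\rho \ge C/(L\sqrt{\delta})$ so that $K(t)$ does not change by more than $\rho$ during the time $1/L$. In consequence of (3.55) we have
\[ \tilde E^{(\delta)}_{x,k,\pi}\, f_{K(\tau_n(\pi))}\Big( K\Big( T_{n,\pi} + \frac{1}{L} \Big) \Big) + 2A_\rho\delta^{\epsilon} \ge 1, \tag{3.56} \]
as $T_{n,\pi} - \tau_n(\pi) \le 2[L\delta^{\epsilon}]/L$. Since
\[ \Big| f_{K(\tau_n(\pi))}\Big( K\Big( T_{n,\pi} + \frac{1}{L} \Big) \Big) - f_{K(\tau_n(\pi))}\big( K(T_{n,\pi}) \big) \Big| \le \frac{C}{L\rho\delta^{1/2}} \]
we obtain from (3.56)
\[ 2A_\rho\delta^{\epsilon} \ge \tilde E^{(\delta)}_{x,k,\pi}\big[ 1 - f_{K(\tau_n(\pi))}\big( K(T_{n,\pi}) \big) \big] - \frac{C}{L\rho\delta^{1/2}}, \]
so in particular
\begin{align*}
2A_\rho\delta^{\epsilon} + \frac{C}{L\rho\delta^{1/2}} &\ge \tilde E^{(\delta)}_{x,k,\pi}\Big[ 1 - f_{K(\tau_n(\pi))}\big( K(\tau_{n+1}) \big),\ \tau_{n+1} \le \tau_n(\pi) + \frac{[L\delta^{\epsilon}]}{L} \Big] \\
&= \tilde Q^{(\delta)}_{x,k,\pi}\Big[ \tau_{n+1} \le \tau_n(\pi) + \frac{[L\delta^{\epsilon}]}{L} \Big]. \tag{3.57}
\end{align*}
We have shown, therefore, that
\[ \tilde Q^{(\delta)}_{x,k}\Big[ \tau_{n+1} - \tau_n \le \frac{[L\delta^{\epsilon}]}{L} \,\Big|\, \mathcal{M}^{\tau_n} \Big] \le \frac{CT^2\delta^{\epsilon}}{\rho^3} + \frac{C}{L\rho\delta^{1/2}} \le C\big( \delta^{\epsilon - 3\epsilon_1/2} T^2 + \delta^{\gamma_0 - (1+\epsilon_1)/2} \big) \le C\delta^{\gamma_1} T^2 \tag{3.58} \]
for $\gamma_1 < \min[\epsilon - 3\epsilon_1/2,\ \gamma_0 - (1+\epsilon_1)/2]$ and some constant $C > 0$. We can always assume that $T^2\delta^{\gamma_1/2} \le 1$. If otherwise, we can always write $\tilde Q^{(\delta)}_{x,k}[F] \le T\delta^{\gamma_1/4}$ and (3.47) follows. In particular, selecting $\epsilon := (\epsilon_1+\epsilon_2)/2$, one concludes from (3.58) that
\begin{align*}
\tilde E^{(\delta)}_{x,k}\big[ \exp\{-(\tau_{n+1} - \tau_n)\} \,\big|\, \mathcal{M}^{\tau_n} \big] &\le e^{-\delta^{(\epsilon_1+\epsilon_2)/2}}\, \tilde Q^{(\delta)}_{x,k}\Big[ \tau_{n+1} - \tau_n \ge \frac{[L\delta^{(\epsilon_1+\epsilon_2)/2}]}{L} \,\Big|\, \mathcal{M}^{\tau_n} \Big] + \tilde Q^{(\delta)}_{x,k}\Big[ \tau_{n+1} - \tau_n \le \frac{[L\delta^{(\epsilon_1+\epsilon_2)/2}]}{L} \,\Big|\, \mathcal{M}^{\tau_n} \Big] \\
&\le e^{-\delta^{(\epsilon_1+\epsilon_2)/2}} + C\big( 1 - e^{-\delta^{(\epsilon_1+\epsilon_2)/2}} \big)\delta^{\gamma/2} < 1 - \frac{\delta^{(\epsilon_1+\epsilon_2)/2}}{2}, \tag{3.59}
\end{align*}
where the second inequality uses (3.58), provided that $\delta$ is sufficiently small. From (3.59) one concludes easily, see e.g. Lemma 1.4.5, p. 38 of [16], that (3.51) holds. On the other hand, taking $\epsilon = \epsilon_2$ in (3.58) we obtain (3.52) with $0 < \gamma < \min[\epsilon_2 - 3\epsilon_1/2,\ \gamma_0 - (1+\epsilon_1)/2]$. Hence the proof of Lemma 3.8 is now complete.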
3.9. The estimation of the convergence rate. The proof of Theorem 2.1. Recall that $\phi_\delta$, $\bar\phi$ satisfy (2.8), (2.10), respectively, with the initial condition $\phi_0$. We start with the following lemma.
Lemma 3.9. Assume that $\phi_0$ satisfies the hypotheses formulated in Sect. 2.5. Then,
\[ \|\bar\phi\|^{[0,T]}_{0,0,0} \le \|\phi_0\|_{0,0}, \qquad \sum_{i=1}^{d} \|\partial_{x_i}\bar\phi\|^{[0,T]}_{0,0,0} \le \|\phi_0\|_{1,0}. \tag{3.60} \]
Furthermore, there exists a constant $C > 0$ such that for all $T \ge 1$,
\[ \|\partial_t\bar\phi\|^{[0,T]}_{0,0,0} \le C\|\phi_0\|_{1,2}. \tag{3.61} \]
In addition, for any nonnegative integer valued multi-index $\gamma = (\alpha_1,\alpha_2,\alpha_3)$ satisfying $|\gamma| \le 3$ we have
\[ \sum_{i_1,i_2,i_3=1}^{d} \big\|\partial^{\gamma}_{k_{i_1},k_{i_2},k_{i_3}}\bar\phi\big\|^{[0,T]}_{0,0,0} \le CT^{|\gamma|}\|\phi_0\|_{1,4}. \tag{3.62} \]
Proof. The first estimate of (3.60) is a consequence of the maximum principle for the solution of (2.10) while the second one follows directly from differentiating (2.10) with respect to $x$ and applying again the maximum principle. To obtain the estimates (3.61) and (3.62) we note first that the application of the operator $\tilde L$ to both sides of (2.10) and the maximum principle leads to the estimate $\|\tilde L\bar\phi(t,x,\cdot)\|_{L^\infty(A(M))} \le \|\tilde L\phi_0\|_{L^\infty(A(M))}$ for all $t \ge 0$, hence we conclude bound (3.61). In fact, thanks to the already proven estimate (3.60) we conclude that $\|L\bar\phi(t,x,\cdot)\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2}$ for some constant $C > 0$ and all $(t,x) \in [0,+\infty)\times\mathbb{R}^d$. Let $Z$ be as in the proof of Lemma 3.7. Define $S : Z\times[M^{-1},M] \to A(M)$ as $S(\mathbf{l},k) := (\mathbf{l}, \sqrt{k^2-l^2})$, where $l = |\mathbf{l}|$. Let also $\psi(\mathbf{l},k) = \bar\phi\circ S(\mathbf{l},k)$. We have $(L_k\bar\phi)\circ S(\mathbf{l},k) = N\psi(\mathbf{l},k)$, see (3.42). The $L^p$ estimates for elliptic partial differential equations, see e.g. Theorem 9.13, p. 239 of [7], allow us to estimate $\|\psi\|_{W^{2,p}(Z)} \le C(\|\psi\|_{L^p(Z)} + \|N\psi\|_{L^p(Z)}) \le C\|\phi_0\|_{1,2}$. Choosing $p$ sufficiently large we obtain that $\sum_i \|\partial_{l_i}\psi\|_{L^\infty(Z)} \le C\|\phi_0\|_{1,2}$, which in fact implies that $\|D(\cdot)\nabla_k\bar\phi(t,\cdot)\|_{L^\infty(S(Z))} \le C\|\phi_0\|_{1,2}$. Obviously, one can find a covering of $A(M)$ with charts corresponding to different choices of the components of $k$ being projected onto the hyperplane $\mathbb{R}^{d-1}$ and we obtain in that way that $\|D(\cdot)\nabla_k\bar\phi(t,\cdot)\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2}$ for all $t \ge 0$. Since the rank of the matrix $D(\hat k,k)$ equals $d-1$, with the kernel spanned by the vector $k$, we obtain in that way the $L^\infty$ estimates of directional derivatives in any direction perpendicular to $k$. We still need to obtain the $L^\infty$ bound on the derivative in the direction $k$, denoted by $\partial_n := k_1\partial_{k_1} + \ldots + k_d\partial_{k_d}$. To that purpose we apply $\partial_n$ to both sides of (2.10) and after a straightforward calculation we get
\[ \partial_t\partial_n\bar\phi = \tilde L\,\partial_n\bar\phi - 2L_k\bar\phi + L_1\bar\phi + H_0''(k)\,\hat k\cdot\nabla_x\bar\phi, \]
where
\[ L_1\bar\phi := \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( \partial_k D_{mn}(\hat k,k)\,\frac{\partial\bar\phi}{\partial k_n} \Big). \]
Note that $D(\hat k,k)\hat k = 0$ implies that $\partial_k D(\hat k,k)\hat k = 0$, hence $\|L_1\bar\phi(t,\cdot)\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2}$. We already know that $\|L_k\bar\phi\|_{L^\infty(A(M))}$ and $\|\nabla_x\bar\phi\|_{L^\infty(A(M))}$ are bounded, hence $\|\partial_n\bar\phi(t,\cdot)\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2}T$ for $t \in [0,T]$. We have shown therefore that $\|\bar\phi(t,\cdot)\|_{1,1} \le C\|\phi_0\|_{1,2}T$ for $t \in [0,T]$. The above procedure can be iterated in order to obtain the estimates of the suprema of derivatives of the higher order.
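The linear growth in $T$ of the bound on $\partial_n\bar\phi$ can be made explicit by a Duhamel-type estimate. The following is only a sketch: $u$ and $g$ are our shorthand (not notation from the text) for $\partial_n\bar\phi$ and for the sum of all terms in its evolution equation that do not contain $\partial_n\bar\phi$, and we assume that the maximum principle used above also controls the inhomogeneous problem and that $\|\partial_n\phi_0\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2}$.

```latex
% Shorthand (ours): u := \partial_n \bar\phi,  g := forcing terms not involving u.
\begin{align*}
\partial_t u &= \tilde L u + g, \qquad
  \|g(t,\cdot)\|_{L^\infty(A(M))} \le C\|\phi_0\|_{1,2} \quad (t \ge 0),\\
\|u(t,\cdot)\|_{L^\infty(A(M))}
  &\le \|u(0,\cdot)\|_{L^\infty(A(M))} + \int_0^t \|g(s,\cdot)\|_{L^\infty(A(M))}\,ds\\
  &\le C\|\phi_0\|_{1,2}\,(1 + t) \le 2C\|\phi_0\|_{1,2}\,T, \qquad 0 \le t \le T,\ T \ge 1.
\end{align*}
```

Each further differentiation in $k$ produces a forcing term that already carries one power of $T$, which is consistent with the factor $T^{|\gamma|}$ in (3.62).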
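Proof of Theorem 2.1. Let $u \in [\delta^{\gamma_0}, T]$, where we assume that $\gamma_0$ (as in the statement of Lemma 3.5) belongs to the interval $(1/2, 1)$. Substituting $G(t,x,k) := \bar\phi(u-t,x,k)$, $\zeta \equiv 1$ into (3.22) we obtain (taking $v = u$, $t = \delta^{\gamma_0}$)
\[ \Big| \tilde E^{(\delta)}_{x,k}\Big[ \phi_0(X(u),K(u)) - \bar\phi\big(u - \delta^{\gamma_0}, X(\delta^{\gamma_0}), K(\delta^{\gamma_0})\big) - \int_{\delta^{\gamma_0}}^{u} (\partial_\varrho + \tilde L)G(\varrho, X(\varrho), K(\varrho))\,d\varrho \Big] \Big| \le C\|G\|^{[0,T]}_{1,1,3}\,\delta^{\gamma_1} T^2, \quad \forall\, \delta \in (0,1]. \tag{3.63} \]
Using the fact that $|X(\delta^{\gamma_0}) - x| \le C\delta^{\gamma_0}$, $|K(\delta^{\gamma_0}) - k| \le C\delta^{\gamma_0-1/2}$, $\tilde Q^{(\delta)}_{x,k}$-a.s. for some deterministic constant $C > 0$, cf. (3.12), and Lemma 3.9 we obtain that there exist constants $C, \gamma > 0$ such that
\[ \Big| \tilde E^{(\delta)}_{x,k}\Big[ \phi_0(X(u),K(u)) - \bar\phi(u,x,k) - \int_0^{u} (\partial_\varrho + \tilde L)G(\varrho, X(\varrho), K(\varrho))\,d\varrho \Big] \Big| \le C\|G\|^{[0,T]}_{1,1,3}\,\delta^{\gamma} T^2, \quad \delta \in (0,1],\ T \ge 1,\ u \in [0,T]. \tag{3.64} \]
We have however
\begin{align*}
\Big| E^{(\delta)}_{x,k}\big[ \phi_0(X(u),K(u)) - \bar\phi(u,x,k),\ \tau_\delta \ge T \big] \Big| &= \Big| \tilde E^{(\delta)}_{x,k}\big[ \phi_0(X(u),K(u)) - \bar\phi(u,x,k),\ \tau_\delta \ge T \big] \Big| \\
&\le C\|G\|^{[0,T]}_{1,1,3}\,\delta^{\gamma} T^2 + \big( 2\|\phi_0\|_{0,0} + T\|G\|^{[0,T]}_{1,1,2} \big)\, \tilde Q^{(\delta)}_{x,k}[\tau_\delta < T], \tag{3.65}
\end{align*}
where the inequality uses (3.64). Using $\mathcal{M}^{\tau_\delta}$ measurability of the event $[\tau_\delta < T]$ we obtain that $\tilde Q^{(\delta)}_{x,k}[\tau_\delta < T] = R^{(\delta)}_{x,k}[\tau_\delta < T]$ and by virtue of Theorem 3.6 we can estimate the right-hand side of (3.65) by
\[ C\|G\|^{[0,T]}_{1,1,3}\,\delta^{\gamma} T^2 + C\delta^{\gamma} T\big( 2\|\phi_0\|_{0,0} + T\|G\|^{[0,T]}_{1,1,2} \big) \le C\delta^{\gamma} T^5, \]
where the last inequality follows from Lemma 3.9. On the other hand, the expression under the absolute value on the utmost left-hand side of (3.65) equals
\[ E^{(\delta)}_{x,k}\big[ \phi_0(X(u),K(u)) - \bar\phi(u,x,k) \big] - E^{(\delta)}_{x,k}\big[ \phi_0(X(u),K(u)) - \bar\phi(u,x,k),\ \tau_\delta < T \big]. \]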
The second term can be estimated by
\[ 2\|\phi_0\|_{0,0}\, R^{(\delta)}_{x,k}[\tau_\delta < T] \le C\delta^{\gamma}\|\phi_0\|_{0,0}\,T, \]
by virtue of (3.31) in Theorem 3.6. Since
\[ E\phi_\delta\Big( \frac{u}{\delta}, \frac{x}{\delta}, k \Big) = E\phi_0\big( z^{(\delta)}(u;x,k), m^{(\delta)}(u;x,k) \big) = E^{(\delta)}_{x,k}\,\phi_0(X(u),K(u)) \]
we conclude from the above that the left-hand side of (2.11) can be estimated by $C\delta^{\gamma}\|\phi_0\|_{1,4}T^5$ for some constants $C, \gamma > 0$ independent of $\delta > 0$, $T \ge 1$. The bound appearing on the right-hand side of (2.11) can be now concluded by the same argument as the one used after (3.40).
4. Momentum Diffusion to Spatial Diffusion: Proof of Theorem 2.5
We show in this section that solutions of the momentum diffusion equation (2.10) in the long-time, large space limit converge to the solutions of the spatial diffusion equation (2.12). We first recall the setup of Theorem 2.5. Let $\bar\phi_\gamma(t,x,k) = \bar\phi(t/\gamma^2, x/\gamma, k)$, where $\bar\phi$ satisfies (2.10) and let $w(t,x,k)$ be the solution of the spatial diffusion equation (2.12). In order to prove Theorem 2.5 we need to show that the re-scaled solution $\bar\phi_\gamma(t,x,k)$ converges as $\gamma \to 0$ in the space $C([0,T]; L^\infty(A(M)))$ to $w(t,x,k)$, so that
\[ \|w(t) - \bar\phi_\gamma(t)\|_{L^\infty(A(M))} \le C\big( \gamma T + \gamma^{1/2} \big)\|\phi_0\|_{2,0}, \quad 0 \le t \le T. \tag{4.1} \]
Proof of Theorem 2.5. The proof is quite standard. We present it for the sake of completeness and convenience to the reader. The function $\bar\phi_\gamma$ is the unique $C_b^{1,1,2}([0,+\infty), \mathbb{R}^{2d}_*)$-solution to
\[ \gamma^2\,\frac{\partial\bar\phi_\gamma}{\partial t} = \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial\bar\phi_\gamma}{\partial k_n} \Big) + \gamma H_0'(k)\,\hat k\cdot\nabla_x\bar\phi_\gamma, \qquad \bar\phi_\gamma(0,x,k) = \phi_0(x,k), \tag{4.2} \]
see Remark 2.3. We represent $\bar\phi_\gamma$ as
\[ \bar\phi_\gamma = w + \gamma w_1 + \gamma^2 w_2 + R. \tag{4.3} \]
Here $w$ is the solution of the diffusion equation (2.12), the correctors $w_1$ and $w_2$ will be constructed explicitly, and the remainder $R$ will be shown to be small. The first corrector $w_1$ is the unique solution of zero mean over each sphere $S^{d-1}_k$ of the equation
\[ \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial w_1}{\partial k_n} \Big) = -H_0'(k)\,\hat k\cdot\nabla_x w. \tag{4.4} \]
It has an explicit form
\[ w_1(t,x,k) = \sum_{j=1}^{d} \chi_j(k)\,\frac{\partial w(t,x,k)}{\partial x_j} \tag{4.5} \]
with the functions $\chi_j$ defined in (2.14). The second order corrector $w_2$ is the unique zero mean over each sphere $S^{d-1}_k$ solution of the equation
\[ \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial w_2}{\partial k_n} \Big) = \frac{\partial w}{\partial t} - H_0'(k)\,\hat k\cdot\nabla_x w_1. \tag{4.6} \]
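For orientation, the corrector equations (4.4) and (4.6) are what one obtains by matching powers of $\gamma$ in (4.2). The following is a formal sketch only: $\mathcal{L}_k$ and $c(k)$ are our shorthand for the second-order operator in $k$ and for the scalar coefficient of $\hat k\cdot\nabla_x$ in (4.2), and the remainder $R$ is ignored.

```latex
% Shorthand (ours): \mathcal{L}_k g := \sum_{m,n=1}^d \partial_{k_m}\big( D_{mn}(\hat k,k)\,\partial_{k_n} g \big),
% and c(k) is the scalar coefficient multiplying \hat k\cdot\nabla_x in (4.2).
\begin{align*}
\gamma^2\,\partial_t\bar\phi_\gamma
  &= \mathcal{L}_k\bar\phi_\gamma + \gamma\,c(k)\,\hat k\cdot\nabla_x\bar\phi_\gamma,
  \qquad \bar\phi_\gamma \approx w + \gamma w_1 + \gamma^2 w_2,\\
O(\gamma^0):\quad & 0 = \mathcal{L}_k w,\\
O(\gamma^1):\quad & 0 = \mathcal{L}_k w_1 + c(k)\,\hat k\cdot\nabla_x w
  \;\Longrightarrow\; \mathcal{L}_k w_1 = -\,c(k)\,\hat k\cdot\nabla_x w,\\
O(\gamma^2):\quad & \partial_t w = \mathcal{L}_k w_2 + c(k)\,\hat k\cdot\nabla_x w_1
  \;\Longrightarrow\; \mathcal{L}_k w_2 = \partial_t w - c(k)\,\hat k\cdot\nabla_x w_1 .
\end{align*}
```

The order-zero relation is satisfied if, as the notation suggests, $w$ depends on the momentum only through its length, since then $\nabla_k w$ is parallel to $\hat k$, which lies in the kernel of $D(\hat k,k)$.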
Note that the expression on the right-hand side of (4.6) is of zero mean since thanks to (2.12) and equality (2.13) we have
\[ \frac{\partial w}{\partial t} = \frac{1}{\Gamma_{d-1}}\int_{S^{d-1}} H_0'(k)\, l\cdot\nabla_x w_1\, d\Omega(l). \]
Equations (4.4) and (4.6) for various values of $k = |\mathbf{k}|$ are decoupled. As a consequence of this fact and the regularity properties for solutions of elliptic equations on a sphere we have that $w_1$, $w_2$ belong to $C([0,T]; L^\infty(A(M)))$. More explicitly, we may represent the function $w_2$ as
\[ w_2(t,x,k) = \sum_{j,l=1}^{d} \psi_{jl}(k)\,\frac{\partial^2 w(t,x,k)}{\partial x_j\,\partial x_l}. \]
The functions $\psi_{jl}(k)$ satisfy
\[ \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial\psi_{jl}}{\partial k_n} \Big) = -H_0'(k)\,\hat k_j\,\chi_l(k) + a_{jl}(k). \tag{4.7} \]
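A unique mean-zero, bounded solution of (4.7) exists according to the Fredholm alternative combined with the regularity properties for solutions of (4.7) on each sphere $S^{d-1}_k$. With the above definitions of $w$, $w_1$, $w_2$, Eq. (4.2) for $\bar\phi_\gamma$ implies that the remainder $R$ in (4.3) satisfies
\[ \gamma^2\frac{\partial R}{\partial t} + \gamma^3\frac{\partial w_1}{\partial t} + \gamma^4\frac{\partial w_2}{\partial t} - \gamma H_0'(k)\,\hat k\cdot\nabla_x R - \gamma^3 H_0'(k)\,\hat k\cdot\nabla_x w_2 = \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial R}{\partial k_n} \Big). \]
We re-write this equation in the form
\[ \frac{\partial R}{\partial t} - \frac{1}{\gamma}H_0'(k)\,\hat k\cdot\nabla_x R - \frac{1}{\gamma^2}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial R}{\partial k_n} \Big) = f, \qquad R(0,x,k) = \phi_0(x,k) - \bar\phi_0(x,k) - \gamma w_1(0,x,k) - \gamma^2 w_2(0,x,k), \tag{4.8} \]
where $f := -\gamma\,\partial_t w_1 - \gamma^2\,\partial_t w_2 + \gamma H_0'(k)\,\hat k\cdot\nabla_x w_2$. Here, as before, $R$ is understood as the unique solution to (4.8) that belongs to $C_b^{1,1,2}([0,+\infty), \mathbb{R}^{2d}_*)$. We may split $R = R_1 + R_2$ according to the initial data and forcing in the equation: $R_1$ satisfies
\[ \frac{\partial R_1}{\partial t} - \frac{1}{\gamma}H_0'(k)\,\hat k\cdot\nabla_x R_1 - \frac{1}{\gamma^2}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial R_1}{\partial k_n} \Big) = f, \qquad R_1(0,x,k) = -\gamma w_1(0,x,k) - \gamma^2 w_2(0,x,k), \tag{4.9} \]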
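and the initial time boundary layer corrector $R_2$ satisfies
\[ \frac{\partial R_2}{\partial t} - \frac{1}{\gamma}H_0'(k)\,\hat k\cdot\nabla_x R_2 - \frac{1}{\gamma^2}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial R_2}{\partial k_n} \Big) = 0, \qquad R_2(0,x,k) = \phi_0(x,k) - \bar\phi_0(x,k). \tag{4.10} \]
Using the probabilistic representation for the solution of (4.9) as well as the regularity of $w_1$ and $w_2$ we obtain that
\[ \|R_1(t)\|_{L^\infty(A(M))} \le C\gamma T, \quad 0 \le t \le T. \tag{4.11} \]
To obtain the bound for $R_2$ we consider $R_2^\gamma(t,x,k) := R_2(\gamma^{3/2}t, x, k)$. This function satisfies
\[ \frac{\partial R_2^\gamma}{\partial t} - \gamma^{1/2}H_0'(k)\,\hat k\cdot\nabla_x R_2^\gamma - \frac{1}{\gamma^{1/2}}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial R_2^\gamma}{\partial k_n} \Big) = 0, \qquad R_2^\gamma(0,x,k) = \phi_0(x,k) - \bar\phi_0(x,k). \]
We also define $\tilde R_2^\gamma$, the solution of
\[ \frac{\partial\tilde R_2^\gamma}{\partial t} - \frac{1}{\gamma^{1/2}}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial\tilde R_2^\gamma}{\partial k_n} \Big) = 0, \qquad \tilde R_2^\gamma(0,x,k) = \phi_0(x,k) - \bar\phi_0(x,k). \tag{4.12} \]
The uniform ellipticity of the right-hand side of (4.12) on each sphere $S^{d-1}_k$ implies, see e.g. Proposition 13.3, p. 55 of [17], that the function $\tilde R_2^\gamma$ satisfies the decay estimate on each sphere
\[ \|\tilde R_2^\gamma(t)\|_{L^\infty(S^{d-1}_k)} \le \frac{C\gamma^{(d-1)/4}}{t^{(d-1)/2}}\|\phi_0\|_{L^1(S^{d-1}_k)} \le \frac{C\gamma^{(d-1)/4}}{t^{(d-1)/2}}\|\phi_0\|_{L^\infty(S^{d-1}_k)} \tag{4.13} \]
for $t \in [0,T]$ and, similarly,
\[ \|\nabla_x\tilde R_2^\gamma(t)\|_{L^\infty(S^{d-1}_k)} \le \frac{C\gamma^{(d-1)/4}}{t^{(d-1)/2}}\|\phi_0\|_{1,0}. \]
Furthermore, the difference $q^\gamma = R_2^\gamma - \tilde R_2^\gamma$ satisfies
\[ \frac{\partial q^\gamma}{\partial t} - \gamma^{1/2}H_0'(k)\,\hat k\cdot\nabla_x q^\gamma - \frac{1}{\gamma^{1/2}}\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( D_{mn}(\hat k,k)\,\frac{\partial q^\gamma}{\partial k_n} \Big) = \gamma^{1/2}H_0'(k)\,\hat k\cdot\nabla_x\tilde R_2^\gamma, \qquad q^\gamma(0,x,k) = 0. \tag{4.14} \]
We conclude, using the probabilistic representation of the solution of (4.14), that
\[ \|q^\gamma(t)\|_{L^\infty(A(M))} \le C\gamma^{1/2}\,t\,\|\phi_0\|_{1,0}, \]
and thus
\[ \|R_2(\gamma^{3/2})\|_{L^\infty(A(M))} \le \|\tilde R_2^\gamma(1)\|_{L^\infty(A(M))} + \|q^\gamma(1)\|_{L^\infty(A(M))} \le C\big( \gamma^{(d-1)/4}\|\phi_0\|_{0,0} + \gamma^{1/2}\|\phi_0\|_{1,0} \big). \]
The maximum principle for (4.10) implies that the above estimate persists for later times:
\[ \|R_2(t)\|_{L^\infty(A(M))} \le C\big( \gamma^{(d-1)/4}\|\phi_0\|_{0,0} + \gamma^{1/2}\|\phi_0\|_{1,0} \big), \quad t \ge \gamma^{3/2}. \tag{4.15} \]
Combining (4.3), (4.11) and (4.15) we conclude that
\[ \|w(t) - \bar\phi_\gamma(t)\|_{L^\infty(A(M))} \le C\big( \gamma T + \gamma^{(d-1)/4} + \gamma^{1/2} \big)\|\phi_0\|_{1,0}, \quad \gamma^{3/2} \le t \le T, \tag{4.16} \]
and thus (4.1) follows, as $d \ge 3$. This finishes the proof of Theorem 2.5.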
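The last step, passing from (4.16) to (4.1), uses only that $\gamma^{(d-1)/4} \le \gamma^{1/2}$ when $d \ge 3$ and $\gamma \in (0,1]$. A minimal numerical sketch of this comparison follows; the function names are ours and purely illustrative, and the constant $C$ and the norms of $\phi_0$ are set to one.

```python
def rate_4_16(gamma, T, d):
    """Bracket in (4.16): gamma*T + gamma**((d-1)/4) + gamma**(1/2)."""
    return gamma * T + gamma ** ((d - 1) / 4.0) + gamma ** 0.5


def rate_4_1(gamma, T):
    """Bracket in (4.1): gamma*T + gamma**(1/2)."""
    return gamma * T + gamma ** 0.5


def dominated(gamma, T, d):
    """True if the (4.16) bracket is at most twice the (4.1) bracket.

    The factor 1 + 1e-12 only absorbs floating-point rounding.
    """
    return rate_4_16(gamma, T, d) <= 2.0 * rate_4_1(gamma, T) * (1.0 + 1e-12)
```

For $d = 3$ the two middle powers coincide, while for $d = 2$ the comparison fails once $\gamma$ is small, which is where the restriction $d \ge 3$ enters this step.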
5. The Spatial Diffusion of Wave Energy
In this section we consider an application of the previous results to the random geometrical optics regime of propagation of acoustic waves. We show that when the wave length is much shorter than the correlation length of the random medium, there exist temporal and spatial scales where the energy density of the wave undergoes the spatial diffusion. We start with the wave equation in dimension $d \ge 3$,
\[ \frac{1}{c^2(x)}\frac{\partial^2\phi}{\partial t^2} - \Delta\phi = 0, \tag{5.1} \]
and assume that the wave speed has the form $c(x) = c_0 + \sqrt{\delta}\,c_1(x)$. Here $c_0 > 0$ is the constant sound speed of the uniform background medium, while the small parameter $\delta \ll 1$ measures the strength of the mean zero random perturbation $c_1$. Rescaling the spatial and temporal variables $x = x'/\delta$ and $t = t'/\delta$ we obtain (after dropping the primes) Eq. (5.1) with a rapidly fluctuating wave speed
\[ c_\delta(x) = c_0 + \sqrt{\delta}\,c_1\Big(\frac{x}{\delta}\Big). \tag{5.2} \]
It is convenient to rewrite (5.1) as the system of acoustic equations for the “pressure” $p = \phi_t/c$ and “acoustic velocity” $u = -\nabla\phi$:
\[ \frac{\partial u}{\partial t} + \nabla\big( c_\delta(x)\,p \big) = 0, \qquad \frac{\partial p}{\partial t} + c_\delta(x)\,\nabla\cdot u = 0. \tag{5.3} \]
We will denote for brevity $v = (u,p) \in \mathbb{R}^{d+1}$ and write (5.3) in the more general form of a first order linear symmetric hyperbolic system. To do so we introduce symmetric matrices $A_\delta$ and $D^j$ defined by
\[ A_\delta(x) = \mathrm{diag}(1,1,1,c_\delta(x)), \quad \text{and} \quad D^j = e_j\otimes e_{d+1} + e_{d+1}\otimes e_j, \quad j = 1,\ldots,d. \tag{5.4} \]
Here $e_m \in \mathbb{R}^{d+1}$ is the standard orthonormal basis: $(e_m)_k = \delta_{mk}$.
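We consider the initial data for (5.3) as a mixture of states. Let $S$ be a measure space equipped with a non-negative finite measure $\mu$. A typical example is that the initial data is random, $S$ is the state space and $\mu$ is the corresponding probability measure. We assume that for each parameter $\zeta \in S$ and $\varepsilon, \delta > 0$ the initial data is given by $v^\delta_\varepsilon(0,x;\zeta) := (-\varepsilon\nabla\phi^\varepsilon_0(x),\ 1/c_\delta(x)\,\dot\phi^\varepsilon_0(x))$ and $v^\delta_\varepsilon(t,x;\zeta)$ solves the system of equations
\[ \frac{\partial v^\delta_\varepsilon}{\partial t} + \sum_{j=1}^{d} A_\delta(x)D^j\frac{\partial}{\partial x^j}\big( A_\delta(x)\,v^\delta_\varepsilon(x) \big) = 0. \tag{5.5} \]
The set of initial data is assumed to form an $\varepsilon$-oscillatory and compact at infinity family [5] as $\varepsilon \to 0$. By the above we mean that for any continuous, compactly supported function $\varphi : \mathbb{R}^d \to \mathbb{R}$ we have
\[ \lim_{R\to+\infty}\limsup_{\varepsilon\to 0+}\int_{|k|\ge R/\varepsilon} |\widehat{\varphi v^\delta_\varepsilon}|^2\,dk \to 0 \quad \text{and} \quad \lim_{R\to+\infty}\limsup_{\varepsilon\to 0+}\int_{|x|\ge R} |v^\delta_\varepsilon|^2\,dx \to 0 \]
for a fixed realization $\zeta \in S$ of the initial data and each $\delta > 0$. In the regime of geometric acoustics the scale $\varepsilon$ of oscillations of the initial data is much smaller than the correlation length $\delta$ of the medium: $\varepsilon \ll \delta \ll 1$. The dispersion matrix for (5.5) is
\[ P^\delta_0(x,k) = i\sum_{j=1}^{d} A_\delta(x)\,k_j D^j A_\delta(x) = i\sum_{j=1}^{d} c_\delta(x)\,k_j D^j = ic_\delta(x)\big( \tilde k\otimes e_{d+1} + e_{d+1}\otimes\tilde k \big), \tag{5.6} \]
where $\tilde k = \sum_{j=1}^{d} k_j e_j$. The self-adjoint matrix $(-iP^\delta_0)$ has an eigenvalue $H_0 = 0$ of the multiplicity $d-1$, and two simple eigenvalues
\[ H^\delta_\pm(x,k) = \pm c_\delta(x)|k|. \tag{5.7} \]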
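The spectral structure stated above can be checked directly for a fixed wave vector. The following is a minimal sketch in plain Python; the function names are ours, `c` stands for the value of $c_\delta(x)$ at a frozen point $x$, and the candidate eigenvectors are the normalized combinations of $\tilde k/|k|$ with $\pm e_{d+1}$, together with vectors of the form (vector orthogonal to $k$, $0$).

```python
import math


def dispersion_matrix(k, c):
    """Real symmetric matrix -i*P0 = c * (k~ (x) e_{d+1} + e_{d+1} (x) k~)."""
    d = len(k)
    m = [[0.0] * (d + 1) for _ in range(d + 1)]
    for j in range(d):
        m[j][d] = c * k[j]  # entry from k~ (x) e_{d+1}
        m[d][j] = c * k[j]  # entry from e_{d+1} (x) k~
    return m


def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def is_eigenpair(m, v, lam, tol=1e-12):
    """Check m v = lam v entrywise up to the tolerance tol."""
    return all(abs(y - lam * x) <= tol for x, y in zip(v, matvec(m, v)))


def acoustic_eigenpairs(k, c):
    """Candidate pairs (+c|k|, b_plus) and (-c|k|, b_minus)."""
    norm_k = math.sqrt(sum(x * x for x in k))
    pairs = []
    for sign in (1.0, -1.0):
        b = [x / (norm_k * math.sqrt(2.0)) for x in k] + [sign / math.sqrt(2.0)]
        pairs.append((sign * c * norm_k, b))
    return pairs
```

For example, with `k = [1.0, 2.0, 2.0]` and `c = 1.5` the two nonzero eigenvalues are $\pm 4.5$, and any vector such as `[2.0, -1.0, 0.0, 0.0]`, whose first three components are orthogonal to `k`, is annihilated by the matrix.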
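The corresponding eigenvectors are
\[ b^0_m = \big( k^\perp_m, 0 \big), \quad m = 1,\ldots,d-1; \qquad b_\pm = \frac{1}{\sqrt{2}}\Big( \frac{\tilde k}{|k|} \pm e_{d+1} \Big), \tag{5.8} \]
where $k^\perp_m \in \mathbb{R}^d$ is the orthonormal basis of vectors orthogonal to $k$. The $(d+1)\times(d+1)$ Wigner matrix of a mixture of solutions of (5.5) is defined by
\[ W^\delta_\varepsilon(t,x,k) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\int_S e^{ik\cdot y}\, v^\delta_\varepsilon\Big( t, x - \frac{\varepsilon y}{2}; \zeta \Big)\otimes v^{\delta*}_\varepsilon\Big( t, x + \frac{\varepsilon y}{2}; \zeta \Big)\,dy\,\mu(d\zeta). \tag{5.9} \]
It is well-known, see [5, 11, 13], that for each fixed $\delta > 0$ (and even without introduction of a mixture of states) when $W^\delta_\varepsilon(0,x,k)$ converges weakly in $\mathcal{S}'(\mathbb{R}^d\times\mathbb{R}^d)$, as $\varepsilon \to 0$, to
\[ W_0(x,k) = u^0_+(x,k)\,b_+(k)\otimes b_+(k) + u^0_-(x,k)\,b_-(k)\otimes b_-(k), \tag{5.10} \]
then $W^\delta_\varepsilon(t)$ converges weakly in $\mathcal{S}'(\mathbb{R}^d\times\mathbb{R}^d)$ to
\[ U^\delta(t,x,k) = u^\delta_+(t,x,k)\,b_+(k)\otimes b_+(k) + u^\delta_-(t,x,k)\,b_-(k)\otimes b_-(k). \]
The scalar amplitudes $u^\delta_\pm$ satisfy the Liouville equations:
\[ \partial_t u^\delta_\pm + \nabla_k H^\delta_\pm\cdot\nabla_x u^\delta_\pm - \nabla_x H^\delta_\pm\cdot\nabla_k u^\delta_\pm = 0, \qquad u^\delta_\pm(0,x,k) = u^0_\pm(x,k). \tag{5.11} \]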
These equations are of the form (2.8), written in the macroscopic variables, with the Hamiltonian given by (5.7). One may obtain an $L^2$-error estimate for this convergence when a mixture of states is introduced, as in (5.9). In order to make the scale separation $\varepsilon \ll \delta \ll 1$ precise we define the set
\[ K_\mu := \big\{ (\varepsilon,\delta) : |\ln\varepsilon|^{-2/3+\mu} \le \delta \le 1 \big\}. \]
The parameter $\mu$ is a fixed number in the interval $(0, 2/3)$. The following proposition has been proved in Theorem 3.2 of [1], using straightforward if tedious asymptotic expansions.
Proposition 5.1. Let the acoustic speed $c_\delta(x)$ be of the form (5.2) and such that the Hamiltonian $H^\delta$ given by (5.7) satisfies assumptions (2.3). We assume that the Wigner transform $W^\delta_\varepsilon$ satisfies
\[ W^\delta_\varepsilon(0,x,k) \to W_0(x,k) \text{ strongly in } L^2(\mathbb{R}^d\times\mathbb{R}^d) \text{ as } K_\mu \ni (\varepsilon,\delta) \to 0. \tag{5.12} \]
We also assume that the limit $W_0 \in C_c^2(\mathbb{R}^{2d}_*)$ with a support that satisfies
\[ \mathrm{supp}\, W_0(x,k) \subseteq A(M) \tag{5.13} \]
for some $M > 0$. Moreover, we assume that the initial limit Wigner transform $W_0$ is of the form
\[ W_0(x,k) = \sum_{q=\pm} u^0_q(x,k)\,\Pi_q(k), \qquad \Pi_q(k) = b_q(k)\otimes b_q(k). \tag{5.14} \]
Let $U^\delta(t,x,k) = \sum_{p=\pm} u^\delta_p(t,x,k)\,\Pi_p(k)$, where the functions $u^\delta_p$ satisfy the Liouville equations (5.11). Then there exists a constant $C_1 > 0$ that is independent of $\delta$ so that
\[ \|W^\delta_\varepsilon(t,x,k) - U^\delta(t,x,k)\|_2 \le C(\delta)\Big( \varepsilon\|W_0\|_{H^2}\,e^{C_1 t/\delta^{3/2}} + \varepsilon^2\|W_0\|_{H^3}\,e^{C_1 t/\delta^{3/2}} \Big) + \|W^\delta_\varepsilon(0) - W_0\|_2, \tag{5.15} \]
where $C(\delta)$ is a rational function of $\delta$ with deterministic coefficients that may depend on the constant $M > 0$ in the bound (5.13) on the support of $W_0$.
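To get a feeling for how restrictive the condition defining $K_\mu$ is, one can evaluate its lower bound on $\delta$ numerically. The helper names below are ours and purely illustrative; the sketch assumes $0 < \varepsilon < 1$, so that $|\ln\varepsilon| > 0$.

```python
import math


def kmu_lower_bound(eps, mu):
    """Smallest delta allowed by K_mu for a given eps: |ln eps|**(-2/3 + mu)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if not 0.0 < mu < 2.0 / 3.0:
        raise ValueError("mu must lie in (0, 2/3)")
    return abs(math.log(eps)) ** (-2.0 / 3.0 + mu)


def in_K_mu(eps, delta, mu):
    """Membership test for the set K_mu of admissible pairs (eps, delta)."""
    return kmu_lower_bound(eps, mu) <= delta <= 1.0
```

Because the bound depends on $\varepsilon$ only through $|\ln\varepsilon|$, it decays very slowly: for `eps = 1e-6` and `mu = 0.1` the admissible `delta` must still exceed roughly 0.2, so $\delta$ may tend to zero only logarithmically slowly compared with $\varepsilon$.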
The Liouville equations (5.11) are of the form (2.8). Therefore, one may pass to the limit $\delta \to 0$ in (5.11) using Theorem 2.1 and conclude that $Eu^\delta_\pm$ converge to the respective solutions of
\[ \frac{\partial\bar u_\pm}{\partial t} = \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( |k|^2 D_{mn}(\hat k)\,\frac{\partial\bar u_\pm}{\partial k_n} \Big) \pm c_0\,\hat k\cdot\nabla_x\bar u_\pm \tag{5.16} \]
with the initial conditions as in (5.11). Here the diffusion matrix $D(\hat k) = [D_{mn}(\hat k)]$ is given by
\[ D_{mn}(\hat k) = -\frac{1}{2}\int_{-\infty}^{\infty} \frac{\partial^2 R(c_0 s\hat k)}{\partial x_n\,\partial x_m}\,ds, \tag{5.17} \]
where $R(x)$ is the correlation function of the random field $c_1(x)$: $E[c_1(z)c_1(x+z)] = R(x)$. Furthermore, it follows from Theorem 2.7 that there exists $\alpha_0 > 0$ so that solutions of (5.11) with the initial data of the form $u^\delta_\pm(0,x,k) = u^0_\pm(\delta^\alpha x, k)$ with $0 < \alpha < \alpha_0$, converge in the long time limit to the solutions of the spatial diffusion equation. More precisely, in that case the function $\bar u^\delta(t,x,k) = u^\delta_+(t/\delta^{2\alpha}, x/\delta^\alpha, k)$ (and similarly for $u^\delta_-$) converges as $\delta \to 0$ to $w(t,x,k)$, the solution of the spatial diffusion equation
\[ \frac{\partial w}{\partial t} = \sum_{m,n=1}^{d} a_{mn}(k)\,\frac{\partial^2 w}{\partial x_n\,\partial x_m}, \qquad w(0,x,k) = \bar u^0_+(x;k) := \frac{1}{\Gamma_{d-1}}\int_{S^{d-1}} u^0_+(x,kl)\,d\Omega(l) \tag{5.18} \]
with the diffusion matrix $a_{mn}$ given by:
\[ a_{nm}(k) = \frac{c_0}{\Gamma_{d-1}}\int_{S^{d-1}} l_n\,\chi_m(kl)\,d\Omega(l), \tag{5.19} \]
and the functions $\chi_j$ above are the mean-zero solutions of
\[ \sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( k^2 D_{mn}(\hat k)\,\frac{\partial\chi_j}{\partial k_n} \Big) = -c_0\,\hat k_j. \tag{5.20} \]
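As an illustration of (5.19) and (5.20), the cell problem can be solved in closed form in the isotropic case. The computation below is a sketch under an assumption that is ours and not made in the text, namely that the diffusion matrix is a constant multiple of the projection orthogonal to $\hat k$; here $\Gamma_{d-1}$ and $d\Omega$ denote the area and the surface measure of the unit sphere, and $\Delta_{S^{d-1}}$ is the Laplace-Beltrami operator, for which the coordinate functions $\hat k_j$ are eigenfunctions with eigenvalue $-(d-1)$.

```latex
% Assumption (ours, for illustration only):
%   D_{mn}(\hat k) = D_0\,(\delta_{mn} - \hat k_m \hat k_n), with a constant D_0 > 0.
\begin{align*}
\sum_{m,n=1}^{d} \frac{\partial}{\partial k_m}\Big( k^2 D_{mn}(\hat k)\,\frac{\partial \hat k_j}{\partial k_n} \Big)
  &= D_0\,\Delta_{S^{d-1}}\hat k_j = -(d-1)\,D_0\,\hat k_j,\\
\chi_j(k) &= \frac{c_0}{(d-1)\,D_0}\,\hat k_j
  \quad\text{(mean zero on each sphere, solves (5.20))},\\
a_{nm} &= \frac{c_0}{\Gamma_{d-1}}\int_{S^{d-1}} l_n\,\chi_m(kl)\,d\Omega(l)
  = \frac{c_0^2}{(d-1)\,D_0}\cdot\frac{\delta_{nm}}{d}
  = \frac{c_0^2}{d\,(d-1)\,D_0}\,\delta_{nm}.
\end{align*}
```

Under this assumption the spatial diffusion matrix is a multiple of the identity, independent of $k$, and inversely proportional to the strength $D_0$ of the momentum diffusion.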
Theorems 2.1, 2.5 and 2.7 allow us to make the passage to the limit $\varepsilon, \delta, \gamma \to 0$ rigorous. The assumption that $\varepsilon \ll \delta \ll \gamma$ is formalized as follows. We let
\[ K_{\mu,\rho} := \big\{ (\varepsilon,\delta,\gamma) : \delta \ge |\ln\varepsilon|^{-2/3+\mu} \text{ and } \gamma \ge \delta^{\rho} \big\}, \]
with $0 < \mu < 2/3$, $\rho \in (0,1)$. Suppose also that $u^\pm_0 \in C_c^3(\mathbb{R}^{2d}_*)$ and $\mathrm{supp}\, u^\pm_0 \subseteq A(M)$. Let
\[ W^0(x,k) := u^0_+(x,k)\,b_+(k)\otimes b_+(k) + u^0_-(x,k)\,b_-(k)\otimes b_-(k), \tag{5.21} \]
and
\[ W(t,x,k) := w_+(t,x;k)\,b_+(k)\otimes b_+(k) + w_-(t,x;k)\,b_-(k)\otimes b_-(k). \tag{5.22} \]
Our main result regarding the diffusion of wave energy can be stated as follows.
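Theorem 5.2. Assume that the dimension $d \ge 3$ and $M \ge 1$ are fixed. Suppose for some $0 < \mu < 2/3$, $\rho \in (0,1)$ we have, with $W^0$ as in (5.21) and $W^\delta_\varepsilon$ defined by (5.9),
\[ \int_{\mathbb{R}^{2d}} \Big| EW^\delta_\varepsilon\Big( 0, \frac{x}{\gamma}, k \Big) - W^0(x,k) \Big|^2 dx\,dk \to 0, \quad \text{as } (\varepsilon,\delta,\gamma) \to 0 \text{ and } (\varepsilon,\delta,\gamma) \in K_{\mu,\rho}. \]
Then, there exists $\rho_1 \in (0,\rho]$ such that for any $T > T_* > 0$ we have
\[ \sup_{t\in[T_*,T]} \int_{\mathbb{R}^{2d}} \Big| EW^\delta_\varepsilon\Big( \frac{t}{\gamma^2}, \frac{x}{\gamma}, k \Big) - W(t,x,k) \Big|^2 dx\,dk \to 0, \quad \text{as } (\varepsilon,\delta,\gamma) \to 0 \text{ and } (\varepsilon,\delta,\gamma) \in K_{\mu,\rho_1}. \]
Here $W(t,x,k)$ is of the form (5.22) with the functions $w_\pm$ that satisfy (5.18) with the initial data $w_\pm(0,x,k) = \bar u^0_\pm(x,k)$.
The proof follows immediately from Theorems 2.1, 2.5 and 2.7 as well as Proposition 5.1.
A. The Proof of Lemma 3.5
Given $s \ge \sigma > 0$, $\pi \in C$ we define the linear approximation of the trajectory
\[ L(\sigma,s;\pi) := X(\sigma) + (s-\sigma)\,H_0'(K(\sigma))\,\hat K(\sigma) \tag{A.1} \]
and for any $v \in [0,1]$ let
\[ R(v,\sigma,s;\pi) := (1-v)\,L(\sigma,s;\pi) + vX(s). \tag{A.2} \]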
The following simple fact can be verified by a direct calculation, see Lemma 5.4 of [1].

Proposition A.1. Suppose that $s \ge \sigma \ge 0$ and $\pi \in \mathcal{C}(\delta)$. Then,
\[
|X(s) - L(\sigma,s;\pi)| \le \tilde D(2M_\delta) \sqrt{\delta}\, (s-\sigma) + \int_\sigma^s \left| H_0'(K(\rho)) \hat K(\rho) - H_0'(K(\sigma)) \hat K(\sigma) \right| d\rho.
\]
We obtain from Proposition A.1 for each $s \ge \sigma$ an error for the first-order approximation of the trajectory
\[
|z^{(\delta)}(s) - l^{(\delta)}(\sigma,s)| \le \tilde D(2M_\delta) \sqrt{\delta}\, (s-\sigma) + \frac{C (s-\sigma)^2}{2\sqrt{\delta}}, \qquad \delta \in (0, \delta_*(M)].
\]
Here $l^{(\delta)}(\sigma,s) := z^{(\delta)}(\sigma) + (s-\sigma)\, \hat m^{(\delta)}(\sigma)$ is the linear approximation between the times $\sigma$ and $s$ and
\[
C := \sup_{\delta \in (0, \delta_*(M)]} \left( M_\delta\, h_0^*(M_\delta) + \tilde h_0^*(M_\delta) \right) \tilde D(2M_\delta).
\]
With no loss of generality we may assume that $x = 0$ and that there exists $k \ge 0$ such that $u, t \in [t_k^{(p)}, t_{k+1}^{(p)})$. We shall omit the initial condition in the notation of the solution to (3.12). Throughout this argument we use Proposition A.1 with
\[
\sigma(s) := s - \delta^{1-\gamma_A} \quad \text{for some } \gamma_A \in (0, 1/16 \wedge (1 - \epsilon_4)), \qquad s \in [t,u]. \tag{A.3}
\]
T. Komorowski, L. Ryzhik
The aforementioned proposition proves that for this choice of $\sigma$ we have
\[
|L^{(\delta)}(\sigma,s) - y^{(\delta)}(s)| \le C_A\, \delta^{3/2 - 2\gamma_A}, \qquad \forall\, \delta \in (0,1]. \tag{A.4}
\]
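The exponent in (A.4) can be traced by a short power count. The following is only a heuristic sketch: it takes the first-order bound stated after Proposition A.1 as its starting point and treats the prefactors $\tilde D(2M_\delta)$ and $C$ as if they were bounded uniformly in $\delta$, which is an assumption made here for illustration and is not claimed in the text.

```latex
% Substitute s - \sigma = \delta^{1-\gamma_A} from (A.3) into the bound
%   \tilde D \sqrt{\delta}\,(s-\sigma) + C (s-\sigma)^2 / (2\sqrt{\delta}).
\begin{align*}
\sqrt{\delta}\,(s-\sigma) &= \delta^{1/2}\,\delta^{1-\gamma_A} = \delta^{3/2-\gamma_A},\\
\frac{(s-\sigma)^2}{2\sqrt{\delta}} &= \tfrac12\,\delta^{2-2\gamma_A}\,\delta^{-1/2} = \tfrac12\,\delta^{3/2-2\gamma_A}.
\end{align*}
% For 0 < \delta \le 1 and \gamma_A > 0 one has
%   \delta^{3/2-\gamma_A} \le \delta^{3/2-2\gamma_A},
% so both contributions are bounded by a constant times \delta^{3/2-2\gamma_A}.
```

The quadratic term dominates, which is why the exponent $3/2 - 2\gamma_A$ rather than $3/2 - \gamma_A$ appears in (A.4).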
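Throughout this section we denote $\tilde\zeta = \zeta(y^{(\delta)}(t_1), l^{(\delta)}(t_1), \ldots, y^{(\delta)}(t_n), l^{(\delta)}(t_n))$. We assume first that $G \in C^2(\mathbb{R}^d_*)$ and $\|G\|_2 < +\infty$. Note that
\[
G(l^{(\delta)}(u)) - G(l^{(\delta)}(t)) = -\frac{1}{\sqrt{\delta}} \sum_{j=1}^{d} \int_t^u \partial_j G(l^{(\delta)}(s))\, F_{j,\delta}\left(s, \frac{y^{(\delta)}(s)}{\delta}, l^{(\delta)}(s)\right) ds. \tag{A.5}
\]
We can then rewrite (A.5) in the form $I^{(1)} + I^{(2)} + I^{(3)}$, where
\[
I^{(1)} := -\frac{1}{\sqrt{\delta}} \sum_{j=1}^{d} \int_t^u \partial_j G(l^{(\delta)}(\sigma))\, F_{j,\delta}\left(s, \frac{y^{(\delta)}(s)}{\delta}, l^{(\delta)}(\sigma)\right) ds,
\]
\[
I^{(2)} := \frac{1}{\delta} \sum_{i,j=1}^{d} \int_t^u \int_\sigma^s \partial_j G(l^{(\delta)}(\rho))\, \partial_i F_{j,\delta}\left(s, \frac{y^{(\delta)}(s)}{\delta}, l^{(\delta)}(\rho)\right) F_{i,\delta}\left(\rho, \frac{y^{(\delta)}(\rho)}{\delta}, l^{(\delta)}(\rho)\right) ds\, d\rho,
\]
\[
I^{(3)} := \frac{1}{\delta} \sum_{i,j=1}^{d} \int_t^u \int_\sigma^s \partial^2_{i,j} G(l^{(\delta)}(\rho))\, F_{j,\delta}\left(s, \frac{y^{(\delta)}(s)}{\delta}, l^{(\delta)}(\rho)\right) F_{i,\delta}\left(\rho, \frac{y^{(\delta)}(\rho)}{\delta}, l^{(\delta)}(\rho)\right) ds\, d\rho,
\]
and $\sigma$ is given by (A.3). Each of these terms will be estimated separately below.

A.1. The term $\mathbb{E}[I^{(1)} \tilde\zeta]$. The term $I^{(1)}$ can be rewritten in the form $J^{(1)} + J^{(2)}$, where
\[
J^{(1)} := -\frac{1}{\sqrt{\delta}} \sum_{j=1}^{d} \int_t^u \partial_j G(l^{(\delta)}(\sigma))\, F_{j,\delta}\left(s, \frac{L^{(\delta)}(\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) ds,
\]
and
\[
J^{(2)} := -\frac{1}{\delta^{3/2}} \sum_{i,j=1}^{d} \int_t^u \int_0^1 \partial_j G(l^{(\delta)}(\sigma))\, \partial_{y_i} F_{j,\delta}\left(s, \frac{R^{(\delta)}(v,\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) \left( y_i^{(\delta)}(s) - L_i^{(\delta)}(\sigma,s) \right) ds\, dv, \tag{A.6}
\]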
where, see (A.1) and (A.2), L(δ) (σ, s) = L(σ, s; y (δ) (·), l (δ) (·)), R (δ) (σ, s) = R(σ, s; y (δ) (·), l (δ) (·)). We use part (i) of Lemma 3.2 to handle the term E[J (1) ζ˜ ]. Let X˜ 1 (x, k) = −∂xi H1 (x, k), X˜ 2 (x, k) ≡ 1, (p) Z = tk , L(δ) (σ, s), l (δ) (σ ) ∂j G(l (δ) (σ ))ζ˜ ,
and g1 = (L(δ) (σ, s)δ −1 , |l (δ) (σ )|). Note that g1 and Z are both Fσ measurable. We need (p) to verify (3.15). Suppose therefore that Z = 0. For ρ ∈ [0, tk−1 ] we have |L(δ) (σ, s) − y (δ) (ρ)| ≥ (4q)−1 , provided that CA δ 3/2−2γA < 1/(12q), which holds for sufficiently small δ, since our assumptions on the exponents 2 , 3 , γA (namely that 2 , 3 < 1/8, (p) γA < 1/8) guarantee that 2 + 3 < 3/4 − γA /2. For ρ ∈ [tk−1 , σ ] we have (δ) (p) tk−1 (L(δ) (σ, s) − y (δ) (ρ)) · lˆ (δ) (δ) (p) tk−1 ≥ (s − σ )H0 (|l (δ) (σ )|) lˆ (σ ) · lˆ
σ √ y (δ) (ρ1 ) (δ) (δ) (δ) (p)
(δ) + H0 (|l (ρ1 )|) + δ ∂l H1 tk−1 dρ1 , |l (ρ1 )| lˆ (ρ1 ) · lˆ δ ρ √ (3.13) 2 2 ˜ ≥ (s − σ )h∗ (2Mδ ) 1 − + h∗ (2Mδ ) − δ D(2M δ ) (s − ρ) 1 − N N 2 ≥ (s − σ )h∗ (2Mδ ) 1 − , (A.7) N provided that δ ∈ (0, δ0 ] and δ0 is sufficiently small. We see from (A.7) that (3.15) is satisfied with r = (1 − 2/N) h∗ (2Mδ )δ 1−γA . Using Lemma 3.2 we estimate
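\[
\left| \mathbb{E}[J^{(1)} \tilde\zeta] \right| \le \frac{\tilde D(2M_\delta)}{\sqrt{\delta}}\, \|G\|_1\, \mathbb{E}[\tilde\zeta] \int_t^u \phi\left( C_A^{(1)}\, \frac{s-\sigma}{\delta} \right) ds
\le C_A^{(2)}\, \|G\|_1\, \mathbb{E}[\tilde\zeta]\, \delta^{-1/2}\, \phi\left( C_A^{(1)} \delta^{-\gamma_A} \right) (u-t)
\le C_A^{(3)}\, \|G\|_1\, \mathbb{E}[\tilde\zeta]\, \delta\, (u-t), \tag{A.8}
\]
and $C_A^{(3)}$ exists by virtue of assumption (2.4). On the other hand, the term $J^{(2)}$ defined by (A.6) may be written as $J^{(2)} = J_1^{(2)} + J_2^{(2)}$, where
\[
J_1^{(2)} := -\frac{1}{\delta^{3/2}} \sum_{i,j=1}^{d} \int_t^u \partial_j G(l^{(\delta)}(\sigma))\, \partial_{y_i} F_{j,\delta}\left(s, \frac{L^{(\delta)}(\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) \left( y_i^{(\delta)}(s) - L_i^{(\delta)}(\sigma,s) \right) ds
\]
and
\[
J_2^{(2)} := -\frac{1}{\delta^{5/2}} \sum_{i,j,k=1}^{d} \int_t^u \int_0^1 \int_0^1 \partial^2_{y_i,y_k} F_{j,\delta}\left(s, \frac{R^{(\delta)}(\theta v,\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) v\, \partial_j G(l^{(\delta)}(\sigma)) \left( y_i^{(\delta)}(s) - L_i^{(\delta)}(\sigma,s) \right) \left( y_k^{(\delta)}(s) - L_k^{(\delta)}(\sigma,s) \right) ds\, dv\, d\theta. \tag{A.9}
\]
The term involving $J_2^{(2)}$ may be handled easily with the help of (A.4) and (3.11). We have then
\[
|\mathbb{E}[J_2^{(2)} \tilde\zeta]| \le C_A^{(4)}\, \tilde D(2M_\delta)\, \mathbb{E}[\tilde\zeta]\, \|G\|_1\, (u-t)\, \delta^{-5/2}\, \delta^{3-4\gamma_A}\, T^2
\le C_A^{(5)}\, \delta^{1/2-4\gamma_A}\, T^2\, (u-t)\, \mathbb{E}[\tilde\zeta]\, \|G\|_1. \tag{A.10}
\]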
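In order to estimate the term corresponding to $J_1^{(2)}$ we write $J_1^{(2)} = J_{1,1}^{(2)} + J_{1,2}^{(2)}$, where
\[
J_{1,1}^{(2)} := -\frac{1}{\delta^{3/2}} \sum_{i,j=1}^{d} \int_t^u \int_\sigma^s \partial_j G(l^{(\delta)}(\sigma))\, \partial_{y_i} F_{j,\delta}\left(s, \frac{L^{(\delta)}(\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) (s-\rho_1)\, \frac{d}{d\rho_1}\left[ H_0'(|l^{(\delta)}(\rho_1)|)\, \hat l_i^{(\delta)}(\rho_1) \right] ds\, d\rho_1 \tag{A.11}
\]
and
\[
J_{1,2}^{(2)} := -\frac{1}{\delta} \sum_{i,j=1}^{d} \int_t^u \int_\sigma^s \partial_j G(l^{(\delta)}(\sigma))\, \partial_{y_i} F_{j,\delta}\left(s, \frac{L^{(\delta)}(\sigma,s)}{\delta}, l^{(\delta)}(\sigma)\right) \partial_l H_1\left( \frac{y^{(\delta)}(\rho)}{\delta}, |l^{(\delta)}(\rho)| \right) \hat l_i^{(\delta)}(\rho)\, ds\, d\rho,
\]
with
\[
\frac{d}{d\rho_1}\left[ H_0'(|l^{(\delta)}(\rho_1)|)\, \hat l_i^{(\delta)}(\rho_1) \right]
= H_0''(|l^{(\delta)}(\rho_1)|)\, \left( \hat l^{(\delta)}(\rho_1), \dot l^{(\delta)}(\rho_1) \right)_{\mathbb{R}^d} \hat l_i^{(\delta)}(\rho_1)
+ H_0'(|l^{(\delta)}(\rho_1)|)\, |l^{(\delta)}(\rho_1)|^{-1} \left[ \frac{d\, l_i^{(\delta)}}{d\rho_1}(\rho_1) - \left( \hat l^{(\delta)}(\rho_1), \dot l^{(\delta)}(\rho_1) \right)_{\mathbb{R}^d} \hat l_i^{(\delta)}(\rho_1) \right]. \tag{A.12}
\]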
(2)
(2)
(2)
(2)
We deal with J1,2 first. It may be split as J1,2 = J1,2,1 + J1,2,2 + J1,2,3 , where (2) J1,2,1
(2) J1,2,2
d u s L(δ) (σ, s) (δ) 1 (δ) := − ∂j G(l (σ ))∂yi Fj,δ s, , l (σ ) δ δ i,j =1 t σ L(δ) (σ, ρ) (δ) (δ) ×∂l H1 (A.13) , |l (σ )| lˆi (σ ) ds dρ, δ
d u s 1 L(δ) (σ, s) (δ) 1 (δ) , l (σ ) := − 2 ∂j G(l (σ ))∂yi Fj,δ s, δ δ i,j =1 t σ 0 R (δ) (v, σ, ρ) (δ) (δ) (δ) (δ) ×(∂yi ∂l H1 ) , |l (ρ)| (yi (ρ) − Li (σ, ρ)) lˆi (ρ) ds dρ dv δ
and (2) J1,2,3
d u s ρ L(δ) (σ, s) (δ) 1 (δ) , l (σ ) := − ∂j G(l (σ ))∂yi Fj,δ s, δ δ i,j =1 t σ σ d L(δ) (σ, s) (δ) (δ) × ∂l H 1 , |l (ρ1 )| lˆi (ρ1 ) ds dρ dρ1 . dρ1 δ
By virtue of (A.4), definition (3.10) and (3.11) we obtain easily that |E[J1,2,2 ζ˜ ]| ≤ CA δ 1/2−3γA G1 T (u − t)Eζ˜ . (2)
(6)
(A.14) (2)
The same argument and equality (A.12) also allow us to estimate |E[J1,2,3 ζ ]| by the right-hand side of (A.14). Using Lemma 3.1 and the definition (3.10) we conclude that there exists a constant (7) CA > 0 independent of δ such that (δ) (δ) ∂y Fj,δ s, L (σ, s) , l (δ) (σ ) − t (p), L(δ)(σ, s), l (δ)(σ ) ∂ 2 H1 L (σ, s) , |l (δ) (σ )| yi ,yj k i δ δ ≤ CA(7) δT ,
i, j = 1, . . . , d.
Therefore, we can write d u s 1 (p) (2) E[J ˜ E ∂j G(l (δ) (σ )) tk , L(δ) (σ, s), l (δ) (σ ) 1,2,1 ζ ] + δ i,j =1 t σ (δ) (σ, s) (δ) (σ, ρ) L L (δ) × ∂y2i ,yj H1 , |l (δ) (σ )| ∂l H1 , |l (δ) (σ )| lˆi (σ ) ζ˜ ds dρ δ δ ≤ CA δ 1−γA (u − t)G1 T Eζ˜ . (8)
(A.15)
We apply now part (ii) of Lemma 3.2 with (p) (δ) Z = ∂j G(l (δ) (σ )) tk , L(δ) (σ, s), l (δ) (σ ) lˆi (σ ) ζ˜ , X˜ 1 (x, k) := ∂x2i ,xj H1 (x, k), X˜ 2 (x) := ∂k H1 (x, k), L(δ) (σ, s) (δ) L(δ) (σ, ρ) (δ) g1 := , |l (σ )| , g2 := , |l (σ )| , δ δ (9)
r = CA (ρ − σ ),
(9)
r1 = CA (s − ρ).
We conclude that d u s (p) (δ) (δ) (δ) E J (2) ζ + 1 E ∂ G(l (σ )) t , L (σ, s), l (σ ) j 1,2,1 k δ i,j =1 t σ L(δ) (σ, s) − L(δ) (σ, ρ) (δ) (δ) 2 ×∂yi ,yj R1 , |l (σ )| lˆi (σ )ζ˜ ds dρ δ (10)
C (8) ≤ CA δ 1−γA (u − t)G1 T Eζ˜ + A G1 E[ζ˜ ] δ (9) (9)
u s C C (ρ − σ ) (s − ρ) A A × φ 1/2 φ 1/2 ds dρ, 2δ 2δ t
σ
(A.16)
where
\[
R_1(y,k) := \mathbb{E}[H_1(y,k)\, \partial_k H_1(0,k)], \qquad (y,k) \in \mathbb{R}^d \times [0,+\infty). \tag{A.17}
\]
We can use assumption (2.4) to estimate the second term on the right hand side of (A.16) (11) e.g. by CA δ(u − t)G1 Eζ˜ . The second term appearing on the left hand side of (A.16) equals d u 1 (p) E ∂j G(l (δ) (σ )) tk , L(δ) (σ, s), l (δ) (σ ) (δ)
H0 (|l (σ )|) j =1 t s
d s − ρ (δ) × − ∂yj R1 H0 (|l (δ) (σ )|) lˆ (σ ), |l (δ) (σ )| dρ ζ˜ ds, (A.18) dρ δ σ
and integrating over dρ we obtain that it equals −
d j =1 t
u
E
∂j G(l (δ) (σ )) H0 (|l (δ) (σ )|)
(p) tk , L(δ) (σ, s), l (δ) (σ )
∂yj R1 0, |l (σ )| ζ˜ (δ)
ds
d u ∂j G(l (δ) (σ )) (p) (δ) (δ) (δ) −γA ˆ (δ) tk , L (σ, s), l (σ ) ∂yj R1 δ E (δ) l (σ ), |l (σ )| ζ˜ ds. + H0 (|l (σ )|) j =1 t
(A.19) (12)
By virtue of (2.5) the second term appearing in (A.19) is bounded e.g. by CA (12) t)G1 Eζ˜ for some constant CA > 0, thus we have shown that d u ∂j G(l (δ) (σ )) (p) (δ) (δ) E[J (2) ζ ]− E , L (σ, s), l (σ ) ∂yj R1 0, |l (δ) (σ )| t 1,2,1 k (δ)
H0 (|l (σ )|) j =1 t ≤ CA(13) δ 1−γA (u − t)G1 T Eζ˜ .
δ(u −
˜ζ ds
(A.20) (2)
Let us consider the term corresponding to J1,1 , cf. (A.11). Note that according to (A.12) (2)
(2)
(2)
and (3.12) we have J1,1 = J1,1,1 + J1,1,2 , where (2) J1,1,1
with
d u s L(δ) (σ, s) (δ) 1 (δ) , l (σ ) := − 2 ∂j G(l (σ ))∂yi Fj,δ s, δ δ i,j =1 t σ y (δ) (ρ1 ) (δ) ×(s − ρ1 )i ρ1 , , l (σ ) ds dρ1 , δ
i (ρ, y, l) := |l|−1 H0 (|l|) ˆl, Fδ (ρ, y, l) d li − Fi,δ (ρ, y, l) R
ˆ −H0 (|l|) l, Fδ (ρ, y, l) d lˆi , R
while (2) J1,1,2
d u s ρ1 L(δ) (σ, s) (δ) 1 (δ) := − 2 ∂j G(l (σ ))∂yi Fj,δ s, , l (σ ) δ δ i,j =1 t σ σ d y (δ) (ρ1 ) (δ) ×(s − ρ1 ) i ρ1 , (A.21) , l (ρ2 ) ds dρ1 dρ2 . dρ2 δ
Note that | dρd 2 i | ≤ CA δ −1/2 for some constant CA (14)
(14)
> 0. A straightforward com-
(2) |E[J1,1,2 ζ ]|
(15)
putation, using (A.3) and Lemma 3.1, shows that ≤ CA δ 1/2−3γA (u − t)G1 T E[ζ˜ ]. An application of (A.4), in the same fashion as it was done in the calcu(2) (2) lations concerning the terms E[J1,2,2 ζ ] and E[J1,2,3 ζ ], yields that d u s (δ) (σ, s) L 1 (2) (δ) (δ) E[J , l (σ ) (s−ρ1 )E ∂j G(l (σ ))∂yi Fj,δ s, 1,1,1 ζ ]+ 2 δ δ i,j =1 t σ L(δ) (σ, ρ1 ) (δ) (16) × i ρ1 , , l (σ ) ζ˜ ds dρ1 ≤ CA δ 1/2−4γA (u − t)G1 T E[ζ˜ ]. δ (A.22) For j = 1, . . . , d we let Vj (y, y , l) :=
d
H0
(|l|) − H0 (|l|) ∂y3i ,yj ,yk R(y − y , |l|)lˆi lˆk
i,k=1
+
d
H0 (|l|)|l|−1 ∂y3i ,yi ,yj R(y − y , |l|),
i=1
and also (t, y, y , l; π) := (t, y, l; π)(t, y , l; π ),
(A.23)
t ≥ 0, y, y ∈ Rd , l ∈ Rd∗ , π ∈ C, P := L(δ) (σ, s), L(δ) (σ, ρ1 ), l (δ) (σ ) , Pδ := δ −1 L(δ) (σ, s), δ −1 L(δ) (σ, ρ1 ), l (δ) (σ ) and (s) := (s, y (δ) (s), l (δ) (s); y (δ) (·), l (δ) (·)). Applying Lemma 3.1 and part ii) of Lemma 3.2, as in (A.15) and (A.16), we conclude that the difference between the second term on the left-hand side of (A.22) and d u s 1 (δ) ˜ ds dρ1 , ζ (s − ρ )E ∂ G(l (σ ))(σ, P )V (P ) 1 j j δ δ2 j =1 t
σ
(A.24)
can be estimated by $C_A^{(17)}\, \delta^{\gamma_A^{(1)}} (u-t)\, \|G\|_1\, \mathbb{E}[\tilde\zeta]$ for some $\gamma_A^{(1)} > 0$. Using the fact that
\[
|l^{(\delta)}(\rho) - l^{(\delta)}(\sigma)| \le C_A^{(22)}\, \delta^{1/2-\gamma_A}, \qquad \rho \in [\sigma, s], \tag{A.25}
\]
estimate (A.4) and Lemma 3.1 we can argue that 2 (18) (σ, P ) − (s) ≤ CA (δ 1/2−γA −1 + δ 1/2−2(γA +2 +3 ) T ). We conclude therefore that the magnitude of the difference between the expression in (A.24) and s
d u 1 2 E ∂j G(l (δ) (σ )) (s) (s − ρ1 )Vj (Pδ ) dρ1 ζ˜ ds, (A.26) δ2 j =1 t
σ (2)
(19) (2) can be estimated by CA δ γA (u − t)G1 T E[ζ˜ ] for some γA > 0. Using shorthand (δ) notation Q(σ ) := H0 (|l (δ) (σ )|) lˆ (σ ) we can write the integral from σ to s appearing above as being equal to
s d 1
(δ)
(δ) (s − ρ ) (|l (σ )|) − H (|l (σ )|) H 1 0 0 δ2 s−δ 1−γA
i,k=1
s − ρ1 (δ) (δ) Q(σ ), |l (δ) (σ )| lˆi (σ )lˆk (σ ) + H0 (|l (δ) (σ )|)|l (δ) (σ )|−1 δ d s − ρ1 3 (δ) × ∂yi ,yi ,yj R dρ1 , Q(σ ), |l (σ )| δ ×∂y3i ,yj ,yk R
i=1
which upon the change of variables ρ1 := (s − ρ1 )/δ is equal to δ −γA
0
ρ1
d
H0
(|l (δ) (σ )|) − H0 (|l (δ) (σ )|)
i,k=1
(δ) (δ) ×∂y3i ,yj ,yk R ρ1 Q(σ ), |l (δ) (σ )| lˆi (σ )lˆk (σ ) + H0 (|l (δ) (σ )|)|l (δ) (σ )|−1
d
∂y3i ,yi ,yj R ρ1 Q(σ ), |l (δ) (σ )| dρ1 . (A.27)
i=1
Using the fact that d
(δ) ∂y3i ,yj ,yk R ρ1 Q(σ ), |l (δ) (σ )| lˆk (σ )
k=1
=
d 2 (δ) R ρ Q(σ ), |l (σ )| ∂ 1 ,y y i j H0 (|l (δ) (σ )|) dρ1 1
we obtain, upon integrating by parts in the first term on the right-hand side of (A.27), that this expression equals H0 (|l (δ) (σ )|)−1 H0
(|l (δ) (σ )|) − H0 (|l (δ) (σ )|) d δ −γA ∂y2 ,y R δ −γA Q(σ ), |l (δ) (σ )| lˆ(δ) (σ ) × i i j i=1 δ −γA
−
(δ) ∂y2i ,yj R ρ1 Q(σ ), |l (δ) (σ )| lˆi (σ ) dρ1
(A.28)
0 δ −γA
+H0 (|l (δ) (σ )|)|l (δ) (σ )|−1
ρ1 ∂y3i ,yi ,yj R ρ1 Q(σ ), |l (δ) (σ )| dρ1 .
0
(A.29) Note that ∇R(0, l) = 0 and d
(δ) ∂y2i ,yj R ρ1 Q(σ ), |l (δ) (σ )| lˆi (σ ) =
i=1
d (δ) ∂ R ρ Q(σ ), |l (σ )| . yj 1 H0 (|l (δ) (σ )|) dρ1 1
We obtain therefore that the expression in (A.28) equals H0 (|l (δ) (σ )|)−1 H0
(|l (δ) (σ )|) − H0 (|l (δ) (σ )|) d (δ) × δ −γA ∂y2 ,y R δ −γA Q(σ ), |l (δ) (σ )| lˆ (σ ) i
i
j
i=1
−H0 (|l (δ) (σ )|)−1 ∂yj R δ −γA Q(σ ), |l (δ) (σ )|
+H0 (|l (δ) (σ )|)|l (δ) (σ )|−1
−γA d δ
ρ1 ∂y3i ,yi ,yj R ρ1 Q(σ ), |l (δ) (σ )| dρ1 .
i=1 0
(A.30) Recalling assumption (2.5) we conclude that the expressions corresponding to the first (3) (3) two terms appearing in (A.30) are of order of magnitude O(δ γA ) for some γA > 0. Summarizing work done in this section, we have shown that d u 2 (δ) (δ) E I (1) − ˜ C (l (σ )) (s)∂ G(l (σ )) ds ζ j j j =1 t
(20)
≤ CA δ
(4) γA
(u − t)G1 T 2 Eζ˜
(A.31)
(4)
for some constants CA , γA > 0 and (cf. (A.17)) Cj (l) := Ej (ˆl, |l|) +
∂yj R1 (0, |l|)
, H0 (|l|) +∞ d H0 (k) ρ1 ∂y3i ,yi ,yj R ρ1 H0 (k)ˆl, k dρ1 , Ej (ˆl, k) := − k
j = 1, . . . , d.
i=1 0
A.1.1. The terms E[I (2) ζ˜ ] and E[I (3) ζ˜ ]. The calculations concerning these terms essentially follow the respective steps performed in the previous section so we only highlight their main points. First, we note that the difference between E[I (2) ζ˜ ] and
(δ) d u s 1 y (s) (δ) y (δ) (ρ) (δ) E ∂j G(l (δ) (σ ))∂i Fj,δ s, , l (σ ) Fi,δ ρ, , l (σ ) ζ˜ ds dρ δ i,j =1 δ δ t
σ
(A.32) (5)
(21) is less than, or equal to CA δ γA (u − t)G1 E[ζ˜ ], cf. (A.25). Next we note that (A.32) equals
d u s (δ) (σ, s) 1 L E ∂j G(l (δ) (σ ))∂i Fj,δ s, , l (δ) (σ ) δ δ i,j =1 t σ L(δ) (σ, ρ) (δ) ×Fi,δ ρ, , l (σ ) ζ˜ ds dρ δ
u s 1 d 1 R (δ) (v, σ, s) (δ) (δ) + 2 E ∂j G(l (σ ))∂i ∂yk Fj,δ s, , l (σ ) δ δ i,j,k=1 t σ 0 L(δ) (σ, ρ) (δ) (δ) (δ) ˜ × Fi,δ ρ, , l (σ ) (yk (s) − Lk (σ, s))ζ ds dρ dv δ
u s 1 d 1 y (δ) (s) (δ) (δ) + 2 E ∂j G(l (σ ))∂i Fj,δ s, , l (σ ) δ δ i,j,k=1 t σ 0 R (δ) (v, σ, ρ) (δ) (δ) (δ) ˜ × ∂yk Fi,δ ρ, , l (σ ) (yk (ρ) − Lk (σ, ρ))ζ ds dρ dv. δ (A.33) A straightforward argument using Lemma 3.1 and (A.4) shows that both the second and (23) third terms of (A.33) can be estimated by CA δ 1/2−(6γA +1 ) (u − t)G1 T 2 E[ζ˜ ]. The first term, on the other hand, can be handled with the help of part ii) of Lemma 3.2 in (2) the same fashion as we have dealt with the term J1,2,1 , given by (A.13) of Sect. A.1, and
we obtain that d u 2 (δ) (δ) (δ) (δ) E I (2) − ˜ (|l (σ )|) (s)+J (s; y (·), l (·))(s) ∂ G(l (σ )) ds ζ D j j j j =1 t
≤
(24) (6) CA δ γA (u − t)G1 T E[ζ˜ ].
(A.34)
Here Dj (l) :=
∂yj R2 (0,l) H0 (l)
R2 (y, l) := E[∂l H1 (y, l)H1 (0, l)],
,
Jj (s; y (δ) (·), l (δ) (·)) := − i (s) := ∂li
d
(δ)
i (s)Di,j (lˆ (σ ), |l (δ) (σ )|),
(A.35)
i=1 (s, y (δ) (s), l (δ) (s); y (δ) (·), l (δ) (·)).
Finally, concerning the limit of E[I (3) ζ˜ ], another application of (A.4) yields (25) (7) E[I (3) ζ˜ ] − I ≤ CA δ γA (u − t)G1 E[ζ˜ ],
(A.36)
where 1 I := δ
E t
σ
u s
2 ∂i,j G(l (δ) (σ ))Fj,δ
×Fi,δ ρ,
L(δ) (σ, s) (δ) s, , l (σ ) δ
L(δ) (σ, ρ) (δ) , l (σ ) ζ˜ ds dρ. δ
Then, we can use part ii) of Lemma 3.2 in order to obtain d u (δ) I − ˆ (σ ), |l (δ) (σ )|)2 (s)∂ 2 G(l (δ) (σ )) ds D ( l i,j i,j i,j =1 t
≤
(26) (8) CA δ γA (u − t)G2 T E[ζ˜ ].
(A.37)
Next we replace the argument $\sigma$, in formulas (A.31), (A.34) and (A.36), by $s$. This can be done thanks to estimate (A.25) and the assumption on the regularity of the random field $H_1(\cdot,\cdot)$. In order to make this approximation work we will be forced to use the third derivative of $G(\cdot)$. Finally (cf. (A.17), (A.35)) note that
\[
\nabla_y R_1(0,l) + \nabla_y R_2(0,l) = \nabla_y\, \mathbb{E}[\partial_l H_1(y,l)\, H_1(y,l)] \Big|_{y=0} = 0.
\]
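The cancellation above is a product-rule identity combined with stationarity. A short sketch of the step, assuming (as the dependence of the correlation functions on the difference $y - y'$ elsewhere in this appendix suggests) that $H_1(\cdot, l)$ is spatially stationary and regular enough to differentiate under the expectation:

```latex
% Definitions used: R_1(y,l) = E[H_1(y,l) \partial_l H_1(0,l)],
%                   R_2(y,l) = E[\partial_l H_1(y,l) H_1(0,l)].
\begin{align*}
\nabla_y R_1(0,l) + \nabla_y R_2(0,l)
 &= \mathbb{E}\big[\nabla_y H_1(0,l)\,\partial_l H_1(0,l)\big]
  + \mathbb{E}\big[\nabla_y \partial_l H_1(0,l)\, H_1(0,l)\big] \\
 &= \mathbb{E}\big[\nabla_y\big(H_1\,\partial_l H_1\big)(y,l)\big]\Big|_{y=0}
  = \nabla_y\, \mathbb{E}\big[H_1(y,l)\,\partial_l H_1(y,l)\big]\Big|_{y=0}.
\end{align*}
% By stationarity, y \mapsto E[H_1(y,l) \partial_l H_1(y,l)] is constant,
% so its gradient vanishes.
```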
Hence we conclude that the assertion of the lemma holds for any function $G \in C^3(\mathbb{R}^d_*)$ satisfying $\|G\|_3 < +\infty$. Generalization to an arbitrary $G \in C_b^{1,1,3}([0,+\infty) \times \mathbb{R}^{2d}_*)$
is fairly standard. Let r0 be any positive integer and consider sk := t + kr0−1 (u − t), k = 0, . . . , r0 . Then ! E [G(u, y (δ) (u), l (δ) (u)) − G(t, y (δ) (t), l (δ) (t))]ζ˜ = =
r 0 −1 k=0 r 0 −1
! E [G(sk+1 , y (δ) (sk+1 ), l (δ) (sk+1 )) − G(sk , y (δ) (sk ), l (δ) (sk ))]ζ˜ . E [G(sk , y (δ) (sk ), l (δ) (sk+1 )) − G(sk , y (δ) (sk ), l (δ) (sk ))]ζ˜
k=0 r 0 −1
+
!
! E [G(sk+1 , y (δ) (sk+1 ), l (δ) (sk )) − G(sk , y (δ) (sk ), l (δ) (sk ))]ζ˜ . (A.38)
k=0
Using the already proven part of the lemma we obtain r −1 0 ! "sk+1 (G(sk , y (δ) (sk ), · )) − N "sk (G(sk , y (δ) (sk ), · ))]ζ˜ E [N k=0
(9)
≤ CA δ γA (u − t)G1,1,3 T 2 Eζ˜ . (27)
(A.39)
On the other hand, the second term on the right-hand side of (A.38) equals sk+1 r 0 −1
E
H0 (|l (δ) (ρ)|) +
√
δ∂l H1
k=0 sk
y (δ) (ρ) (δ) , |l (ρ)| δ
(δ)
×lˆ (ρ) · ∇y G(ρ, y (δ) (ρ), l (δ) (sk )) + ∂ρ G(ρ, y (δ) (ρ), l (δ) (sk )) ζ˜
dρ.
(A.40)
The conclusion of the lemma for an arbitrary function G ∈ Cb1,1,3 ([0, +∞) × R2d ∗ ) is an easy consequence of (A.38)–(A.40) upon passing to the limit with r0 → +∞. Acknowledgement. The research of TK was partially supported by KBN grant 2PO3A03123. The work of LR was partially supported by an NSF grant DMS-0203537, an ONR grant N00014-02-1-0089 and an Alfred P. Sloan Fellowship.
References
1. Bal, G., Komorowski, T., Ryzhik, L.: Self-averaging of Wigner Transforms in Random Media. Commun. Math. Phys. 242, 81–135 (2003)
2. Dürr, D., Goldstein, S., Lebowitz, J.: Asymptotic motion of a classical particle in a random potential in two dimensions: Landau model. Commun. Math. Phys. 113, 209–230 (1987)
3. Erdös, L., Yau, H.T.: Linear Boltzmann equation as the weak coupling limit of a random Schrödinger Equation. Commun. Pure Appl. Math. 53, 667–735 (2000)
4. Erdös, L., Salmhofer, M., Yau, H.T.: Quantum diffusion of the random Schrödinger evolution in the scaling limit. http://arxiv.org/list/math-ph/0502025, 2005
5. Gérard, P., Markowich, P.A., Mauser, N.J., Poupaud, F.: Homogenization limits and Wigner transforms. Commun. Pure Appl. Math. 50, 323–380 (1997)
6. Gikhman, I.I., Skorochod, A.V.: Theory of stochastic processes. Vol. 3, Berlin: Springer Verlag, 1974
7. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Berlin: Springer Verlag, 1998
8. Kesten, H., Papanicolaou, G.: A Limit Theorem For Turbulent Diffusion. Commun. Math. Phys. 65, 97–128 (1979)
9. Kesten, H., Papanicolaou, G.C.: A Limit Theorem for Stochastic Acceleration. Commun. Math. Phys. 78, 19–63 (1980)
10. Kusuoka, S., Stroock, D.: Applications of the Malliavin calculus, Part II. J. Fac. Sci. Univ. Tokyo, Sect. IA, Math. 32, 1–76 (1985)
11. Lions, P.-L., Paul, T.: Sur les mesures de Wigner. Rev. Mat. Iberoamericana 9, 553–618 (1993)
12. Lukkarinen, J., Spohn, H.: Kinetic limit for wave propagation in a random medium. http://arxiv.org/list/math-ph/0505075, 2005
13. Ryzhik, L., Papanicolaou, G., Keller, J.: Transport equations for elastic and other waves in random media. Wave Motion 24, 327–370 (1996)
14. Spohn, H.: Derivation of the transport equation for electrons moving through random impurities. J. Stat. Phys. 17, 385–412 (1977)
15. Stroock, D.: An Introduction to the Analysis of Paths on a Riemannian Manifold. Math. Surv. and Monographs 74, Providence, RI: Amer. Math. Soc., 2000
16. Stroock, D., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Berlin, Heidelberg, New York: Springer-Verlag, 1979
17. Taylor, M.: Partial differential equations. Vol. 2, New York: Springer-Verlag, 1996

Communicated by P. Constantin
Commun. Math. Phys. 263, 325–352 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1397-3
Quantum Variance and Ergodicity for the Baker’s Map
M. Degli Esposti1, S. Nonnenmacher2, B. Winn1
1 Dipartimento di Matematica, Università di Bologna, 40127 Bologna, Italy. E-mail: [email protected]
2 Service de Physique Théorique, CEA/DSM/PhT Unité de recherche associée au CNRS, CEA/Saclay, 91191 Gif-sur-Yvette cédex, France. E-mail: [email protected]
Received: 15 December 2004 / Accepted: 15 March 2005
Published online: 3 February 2006 – © Springer-Verlag 2006
Abstract: We prove an Egorov theorem, or quantum-classical correspondence, for the quantised baker’s map, valid up to the Ehrenfest time. This yields a logarithmic upper bound for the decay of the quantum variance, and, as a corollary, a quantum ergodic theorem for this map.
Present address: Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA. E-mail: [email protected]

1. Introduction

The correspondence principle of quantum mechanics suggests that in the classical limit the behaviour of quantum systems reproduces that of the system’s classical dynamics. It is becoming clear that understanding this process fully represents a challenge not only to methods of semiclassical analysis, but also to the modern theory of dynamical systems.

For a broad class of smooth Hamiltonian systems it has been proved that if the system is ergodic, then, in the classical limit, almost all eigenfunctions of the corresponding quantum mechanical Hamiltonian operator become equidistributed with respect to the natural measure (Liouville) over the energy shell. This is the content of the so-called quantum ergodicity theorem [Šni, Zel1, CdV, HMR].

This mathematical result, even if it can be considered quite mild from the physical point of view, still constitutes one of the few rigorous results concerning the properties of quantum eigenfunctions in the classical limit, and it still leaves open the possible existence of exceptional subsequences of eigenstates which might converge to other invariant measures. In the last few years a certain number of works have explored this mathematically and physically interesting issue. While exceptional subsequences can be present for some hyperbolic systems with extremely high quantum degeneracies [FDBN], it is believed that they do not exist for a typical chaotic system (by chaotic, we generally mean that the system is ergodic and mixing). The uniqueness of the classical limit for the quantum diagonal matrix elements is called quantum unique ergodicity (QUE) [RudSar, Sar1]. There have been interesting recent results in this direction for Hecke eigenstates of the Laplacian on compact arithmetic surfaces [Lin], using methods which combine rigidity properties of semi-classical measures with purely dynamical systems theory.

The model studied in the present paper is not a Hamiltonian flow, but rather a discrete-time symplectic map on the 2-dimensional torus phase space. In the case of quantised hyperbolic automorphisms of the 2-torus (“quantum cat maps”), QUE has been proven along a subsequence of Planck’s constants [DEGI, KR2], and for a certain class of eigenstates (also called “Hecke” eigenstates) [KR1] without restricting Planck’s constant. QUE has also been proved in the case of some uniquely ergodic maps [MR, Ros]. Quantum (possibly non-unique) ergodicity has been shown for some ergodic maps which are smooth by parts, with discontinuities on a set of zero Lebesgue measure [DBDE, MO’K, DE+]. Discontinuities generally produce diffraction effects at the quantum level, which need to be taken care of (this problem also appears in the case of Euclidean billiards with non-smooth boundaries [GL, ZZ]). Most proofs of quantum ergodicity consist of showing that the quantum variance defined below (Eq. (1.1)) vanishes in the classical limit.

To state our results we now turn to the specific dynamics considered in the present article. We take as classical dynamical system the baker’s map on $\mathbb{T}^2$, the 2-dimensional torus [AA]. For any even positive integer $N \in 2\mathbb{N}$ ($N$ is the inverse of Planck’s constant $h$), this map can be quantised into a unitary operator (propagator) $\hat B_N$ acting on an $N$-dimensional Hilbert space. The quantum variance measures the average equidistribution of the eigenfunctions $\{\varphi_{N,j}\}_{j=0}^{N-1}$ of $\hat B_N$:
\[
S_2(a,N) := \frac{1}{N} \sum_{j=0}^{N-1} \left| \langle \varphi_{N,j}, \mathrm{Op}_N^W(a)\, \varphi_{N,j} \rangle - \int_{\mathbb{T}^2} a(q,p)\, dq\, dp \right|^2. \tag{1.1}
\]
Here $a$ is some smooth function (observable) on $\mathbb{T}^2$ and $\mathrm{Op}_N^W(\cdot)$ is the Weyl quantisation mapping a classical observable to a corresponding quantum operator. The quantised baker’s map (or some variant of it) is a well-studied example in the physics literature on quantum chaology [BV, Sa, SaVo, O’CTH, Lak, Kap, ALPŻ], which motivated our desire to provide rigorous proofs for both the quantum-classical correspondence and quantum ergodicity.

In this paper we prove a logarithmic upper bound on the decay of the quantum variance (see Theorem 1 below), which implies quantum ergodicity as a byproduct (Corollary 2). A similar upper bound was first obtained by Zelditch [Zel2] in the case of the geodesic flow of a compact negatively curved Riemannian manifold, and was generalized by Robert [Rob] to more general ergodic Hamiltonian systems. Both use some control on the rate of classical ergodicity (Zelditch also proved similar upper bounds for higher moments of the matrix elements).

The main semiclassical ingredient needed for all proofs of quantum ergodicity is some control on the correspondence between quantum and classical evolutions of observables, namely some Egorov estimate. As for billiard flows [Fa], such a correspondence can only hold for observables supported away from the set of discontinuities. We establish this correspondence for the quantum baker’s map in Sect. 5.2, generalizing previous results [DBDE] for a subclass of observables (an Egorov theorem was already proven in [RubSal] for a different quantisation of the baker’s map). Some related results can be found in [BGP, BR] for the case of smooth Hamiltonian systems. To obtain this Egorov estimate, we study the propagation of coherent states (Gaussian wavepackets): they provide a convenient way to “avoid” the set of discontinuities. The correspondence will hold up to times of the order of the Ehrenfest time
\[
T_E(N) := \frac{\log N}{\log 2} \tag{1.2}
\]
(here $\log 2$ is the positive Lyapunov exponent of the classical baker’s map).

Equipped with this estimate, one could apply the general results of [MO’K] to prove that the quantum variance semiclassically vanishes. We prefer to generalise the method of [Schu2] (applied to smooth maps or flows) to our discontinuous baker’s map. This method, inspired by some earlier heuristic calculations [FP, Wil, EFK+], yields a logarithmic upper bound for the variance. It relies on the decay of classical correlations (mixing property), which is related, yet not equivalent, to the control on the rate of ergodicity used in [Zel2, Rob]. Our main result is the following theorem.

Theorem 1. For any observable $a \in C^\infty(\mathbb{T}^2)$, there is a constant $C(a)$ depending only on $a$, such that the quantum variance over the eigenstates of $\hat B_N$ satisfies:
\[
\forall N \in 2\mathbb{N}, \qquad S_2(a,N) \le \frac{C(a)}{\log N}.
\]
We believe that this method can be extended to any piecewise linear map satisfying a fast mixing property. We can also speculate that the method would work for non-linear piecewise-smooth maps, although in that case the propagation of coherent states should be analysed in more detail (see Remark 2).

The upper bound in Theorem 1 seems far from being sharp. The heuristic calculations in [FP, Wil, EFK+] suggest that the quantum variance decays like $V(a)\, N^{-1}$, where the prefactor $V(a)$ is the classical variance of the observable $a$, appearing in the central limit theorem. This has been conjectured to be the true decay rate for a “generic” Anosov system. The decay of quantum variance has been studied numerically in [EFK+] for the baker’s map and [BSS] for Euclidean billiards; in both cases, the results seem to be compatible with a decay $N^{-1}$; however, a discrepancy of around 10% was noted between the observed and conjectured prefactors. This was attributed to the low values of $N$ (or energy in the case of billiards) considered. A more recent numerical study of a chaotic billiard, at higher energies, still reveals some (smaller) deviations from the conjectured law [Bar], leaving open the possibility of a decay $N^{-\gamma}$ with $\gamma \neq 1$.

A decay of the form $\tilde V(a)\, N^{-1}$ (with an explicit factor $\tilde V(a)$) could be rigorously proven for two particular Anosov systems, using their rich arithmetic structure [KR1, LS, RuSo]. In both cases, the prefactor $\tilde V(a)$ generally differs from the classical variance $V(a)$, which is attributed to the arithmetic properties of the systems, which potentially makes them “non-generic”. Algebraic decays have also been proven for some uniquely ergodic (non-hyperbolic) maps [MR, Ros], by pushing the Egorov property to times of order $O(N)$. The rigorous investigation of the quantum variance thus remains an important open problem in quantum chaology [Sar2].

Quantum ergodicity follows from Theorem 1 as a corollary:

Corollary 2. For each $N \in 2\mathbb{N}$ there exists a subset $J_N \subset \{1, \ldots, N\}$, with $\#J_N / N \to 1$ as $N \to \infty$, such that for any $a \in C^\infty(\mathbb{T}^2)$ and any sequence $(j_N \in J_N)_{N \in 2\mathbb{N}}$,
\[
\lim_{N \to \infty} \langle \varphi_{N,j_N}, \mathrm{Op}_N^W(a)\, \varphi_{N,j_N} \rangle = \int_{\mathbb{T}^2} a(x)\, dx. \tag{1.3}
\]
This generalises a result of [DBDE] to any observable $a \in C^\infty(\mathbb{T}^2)$ (previously only observables of the form $a = a(q)$ could be handled). The restriction to a subset $J_N$ is the “almost all” clarification in quantum ergodicity.

The paper is organised as follows. In Sect. 2 we briefly describe the classical baker’s map on $\mathbb{T}^2$. In Sect. 3, we recall how this map can be quantised [BV] into an $N \times N$ unitary matrix. We then describe the action of the quantised baker map on coherent states (Proposition 5). This is the first step towards the Egorov estimates proven in Sect. 5 (Theorems 12 and 13, which show the correspondence up to the Ehrenfest time). The first part of that section (Subsect. 5.1) compares the Weyl and anti-Wick quantisations, for observables which become more singular when $N$ grows. This technical step is necessary to obtain Egorov estimates for times $\sim \log N$. In the final section, we apply the method of [Schu2] to the quantum baker’s map, using our Egorov estimates up to logarithmic times, and prove Theorem 1.

2. The Classical Baker’s Map

The baker’s map¹ is the prototype model for discontinuous hyperbolic systems, and it has been extensively studied in the literature. Standard results may be found in [AA], while the exponential mixing property was analyzed by [Has], and also derives from the results of [Ch]. Here, for the sake of fixing notations, we restrict ourselves to recalling the very basic definitions and properties, referring the reader to the above references for more details concerning the ergodic properties of the map.

We identify the torus $\mathbb{T}^2$ with the square $[0,1) \times [0,1)$. The first (horizontal) coordinate $q$ represents the “position”, while the second (vertical) represents the “momentum”. In our notations, $x = (q,p)$ will always represent a phase space point, either on $\mathbb{R}^2$ or on its quotient $\mathbb{T}^2$.
The baker’s map is defined as the following piecewise linear bijective transformation on $\mathbb{T}^2$:

$$B(q, p) = (q', p') = \begin{cases} (2q,\ p/2), & \text{if } q \in [0, 1/2),\\ (2q - 1,\ (p + 1)/2), & \text{if } q \in [1/2, 1). \end{cases} \tag{2.1}$$

The transformation is discontinuous on the following subset of $\mathbb{T}^2$:

$$S_1 := \{p = 0\} \cup \{q = 0\} \cup \{q = 1/2\}, \tag{2.2}$$

and smooth everywhere else. If we consider iterates of the map, the discontinuity set becomes larger: for any $n \in \mathbb{N}$, the map $B^n$ is piecewise linear, and discontinuous on the set

$$S_n := \{p = 0\} \cup \bigcup_{j=0}^{2^n - 1} \Big\{ q = \frac{j}{2^n} \Big\},$$

while its inverse $B^{-n}$ is discontinuous on the set $S_{-n}$ obtained from $S_n$ by exchanging the $q$ and $p$ coordinates. Clearly, the discontinuity set becomes dense in $\mathbb{T}^2$ as $|n| \to \infty$. The map is area preserving and uniformly hyperbolic outside the discontinuity set, with constant Lyapunov exponents $\pm \log 2$ and positive topological entropy (see below). The stable (resp. unstable) manifold is made of vertical (resp. horizontal) segments.

¹ The name refers to the cutting and stretching mechanism in the dynamics of the map which is reminiscent of the procedure for making bread. Hence we write the word “baker” with a lower case “b”.

A nice feature of this map lies in a simple symbolic coding for its orbits. Each real number $q \in [0, 1)$ can be associated with a binary expansion

$$q = \cdot\,\epsilon_0 \epsilon_1 \epsilon_2 \ldots \qquad (\epsilon_i \in \{0, 1\}).$$

This representation is one-to-one if we forbid expansions of the form $\cdot\,\epsilon_0 \epsilon_1 \ldots 111 \ldots$. Using the same representation for the $p$-coordinate, $p = \cdot\,\epsilon_{-1} \epsilon_{-2} \ldots$, a point $x = (q, p) \in \mathbb{T}^2$ can be represented by the doubly-infinite sequence $x = \ldots \epsilon_{-2} \epsilon_{-1} \cdot \epsilon_0 \epsilon_1 \ldots$. Then, one can easily check that the baker’s map acts on this representation as a symbolic shift:

$$B(\ldots \epsilon_{-2} \epsilon_{-1} \cdot \epsilon_0 \epsilon_1 \ldots) = \ldots \epsilon_{-2} \epsilon_{-1} \epsilon_0 \cdot \epsilon_1 \ldots \tag{2.3}$$
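The equivalence between the map (2.1) and the shift (2.3) can be checked on finite binary strings. The following is a minimal sketch in exact rational arithmetic; the helper names `baker`, `to_point` and `shift` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def baker(q, p):
    """One step of the baker's map (2.1) on [0, 1) x [0, 1)."""
    if q < Fraction(1, 2):
        return 2 * q, p / 2
    return 2 * q - 1, (p + 1) / 2


def to_point(past, future):
    """Point x = (q, p) encoded by ... eps_{-2} eps_{-1} . eps_0 eps_1 ...

    `future` lists eps_0, eps_1, ... (binary digits of q),
    `past` lists eps_{-1}, eps_{-2}, ... (binary digits of p).
    """
    q = sum(Fraction(e, 2 ** (i + 1)) for i, e in enumerate(future))
    p = sum(Fraction(e, 2 ** (i + 1)) for i, e in enumerate(past))
    return q, p


def shift(past, future):
    """Symbolic shift (2.3): the digit eps_0 moves across the dot."""
    return [future[0]] + past, future[1:]


# Two finite words, one starting in each half of the square.
for past, future in ([1, 0, 1], [0, 1, 1, 0]), ([0, 1], [1, 0, 1]):
    x = to_point(past, future)
    assert baker(*x) == to_point(*shift(past, future))
```

Finite words correspond to dyadic points, so the comparison is exact; for generic points the same identity holds digit by digit.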
From this symbolic representation, one gets the Kolmogorov-Sinai entropy of the map with respect to the Lebesgue measure, $h_{KS} = \log 2$, as well as exponential mixing properties [Ch, Has]: there exist $\Gamma > 0$ and $C > 0$ such that, for any smooth observables $a, b$ on $\mathbb{T}^2$, the correlation function

$$K_{ab}(n) := \int_{\mathbb{T}^2} a(x)\, b(B^{-n} x)\, dx - \int_{\mathbb{T}^2} a(x)\, dx \int_{\mathbb{T}^2} b(x)\, dx \tag{2.4}$$

is bounded as

$$|K_{ab}(n)| \le C\, \|a\|_{C^1} \|b\|_{C^1}\, e^{-\Gamma |n|}. \tag{2.5}$$

According to [Has], one can take for $\Gamma$ any number smaller than $\log 2$.

3. Quantised Baker’s Map

The quantisation of the 2-torus phase space is now well-known and we refer the reader to [DEG], here describing only the important facts. The quantisation of an area-preserving map on the torus is less straightforward, and in general it contains some arbitrariness. The quantisation of linear symplectomorphisms of the 2-torus (or “generalised Arnold cat maps”) was first considered in [HB], and the case of nonlinear perturbations of cat maps was treated in [BdM+] (quantum ergodicity was proven for these maps in [BDB]). The scheme we present below, specific for the baker’s map, was introduced in [BV].

We start by defining the quantum Hilbert space associated to the torus phase space. For any $\hbar \in (0, 1]$, we consider the quantum translations (elements of the Heisenberg group) $\hat T_v = e^{i(v_2 \hat q - v_1 \hat p)/\hbar}$, $v \in \mathbb{R}^2$, acting on $L^2(\mathbb{R})$ and by extension on $\mathcal{S}'(\mathbb{R})$. We then define the space of distributions $\mathcal{H}_\hbar = \{\psi \in \mathcal{S}'(\mathbb{R}),\ \hat T_{(1,0)} \psi = \hat T_{(0,1)} \psi = \psi\}$. These are distributions $\psi(q)$ which are $\mathbb{Z}$-periodic, and such that their $\hbar$-Fourier transform

$$(\hat F_\hbar \psi)(p) := \int_{-\infty}^{\infty} \psi(q)\, e^{-iqp/\hbar}\, \frac{dq}{\sqrt{2\pi\hbar}} \tag{3.1}$$

is also $\mathbb{Z}$-periodic.
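One easily shows that this space is nontrivial iff $(2\pi\hbar)^{-1} = N \in \mathbb{N}$, which we will always assume from now on. This space can be obtained as the image of $L^2(\mathbb{R})$ through the “projector”

$$\hat P_{\mathbb{T}^2} = \sum_{m \in \mathbb{Z}^2} (-1)^{N m_1 m_2}\, \hat T_m = \Big( \sum_{m_2 \in \mathbb{Z}} \hat T_{0, m_2} \Big) \Big( \sum_{m_1 \in \mathbb{Z}} \hat T_{m_1, 0} \Big). \tag{3.2}$$

$\mathcal{H}_\hbar = \mathcal{H}_N$ then forms an $N$-dimensional vector space of distributions, admitting a “position representation”

$$\psi(q) = \frac{1}{\sqrt N} \sum_{j=0}^{N-1} \sum_{\nu \in \mathbb{Z}} \psi_j\, \delta\Big(q - \frac{j}{N} + \nu\Big) =: \sum_{j=0}^{N-1} \psi_j\, q_j(q), \tag{3.3}$$

where each coefficient $\psi_j \in \mathbb{C}$. Here we have denoted by $\{q_j\}_{j=0}^{N-1}$ the canonical (“position”) basis for $\mathcal{H}_N$. This space can be naturally equipped with the Hermitian inner product:

$$\langle q_j, q_k \rangle = \delta_{jk} \implies \langle \psi, \omega \rangle := \sum_{j=0}^{N-1} \overline{\psi_j}\, \omega_j. \tag{3.4}$$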
Since $\mathcal{H}_N$ is the image of $\mathcal{S}(\mathbb{R})$ through the “projector” (3.2), any state $\psi \in \mathcal{H}_N$ can be constructed by projecting some Schwartz function $\Psi(q)$. The decomposition on the RHS of (3.2) suggests that we may first periodicise in the $q$-direction, obtaining a periodic function $\Psi_C(q)$; such a wavefunction describes a state living in the cylinder phase space $C = \mathbb{T} \times \mathbb{R}$. The torus state $\psi(q)$ is finally obtained by periodicising $\Psi_C$ in the Fourier variable; equivalently, the $N$ components of $\psi$ in the basis $\{q_j\}$ are obtained by sampling this function at the points $q_j = j/N$:

$$\psi_j = \frac{1}{\sqrt N}\, \Psi_C\Big(\frac{j}{N}\Big), \qquad 0 \le j < N. \tag{3.5}$$

The $\hbar$-Fourier transform $\hat F_\hbar$ (seen as a linear operator on $\mathcal{S}'(\mathbb{R})$) leaves the space $\mathcal{H}_N$ invariant. On the basis $\{q_j\}$, it acts as an $N \times N$ unitary matrix $\hat F_N$ called the “discrete Fourier transform”:

$$(\hat F_N)_{kj} = \frac{1}{\sqrt N}\, e^{-2i\pi kj/N}, \qquad k, j = 0, \ldots, N - 1. \tag{3.6}$$

$\hat F_\hbar$ quantises the rotation by $-\pi/2$ around the origin, $F(q_0, p_0) = (p_0, -q_0)$. As a result, $\hat F_N$ maps the “position basis” $\{q_j\}$ onto the “momentum basis” $\{p_j\}$:

$$p_j = \sum_{k=0}^{N-1} (\hat F_N^{-1})_{kj}\, q_k.$$

The quantised baker’s map $\hat B_N$ was introduced by Balazs and Voros [BV]. They require $N$ to be an even integer, and prescribe the following matrix in the basis $\{q_j\}$:

$$\hat B_N := (\hat F_N)^{-1}\, \hat B_{N,\mathrm{mix}}, \qquad \text{with} \qquad \hat B_{N,\mathrm{mix}} := \begin{pmatrix} \hat F_{N/2} & 0 \\ 0 & \hat F_{N/2} \end{pmatrix}. \tag{3.7}$$
This definition was slightly modified by Saraceno [Sa], in order to restore the parity symmetry of the classical map. Although we will concentrate on the map (3.7), all our results also apply to this modified setting.
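As an illustration of the prescription (3.7), the matrix can be assembled directly from the discrete Fourier transform (3.6). This is a minimal sketch assuming NumPy; the function names `dft_matrix` and `quantum_baker` are hypothetical, and only the Balazs-Voros version (not the Saraceno variant) is built.

```python
import numpy as np


def dft_matrix(n):
    """Discrete Fourier transform (3.6): (F_n)_{kj} = exp(-2i*pi*k*j/n) / sqrt(n)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)


def quantum_baker(n):
    """Matrix (3.7): B_N = F_N^{-1} B_mix with B_mix = diag(F_{N/2}, F_{N/2})."""
    if n % 2:
        raise ValueError("N must be even")
    half = dft_matrix(n // 2)
    zero = np.zeros_like(half)
    b_mix = np.block([[half, zero], [zero, half]])
    # F_N is unitary, so its inverse is its conjugate transpose.
    return dft_matrix(n).conj().T @ b_mix


N = 8
B = quantum_baker(N)
# Distance of B^dagger B from the identity; it should be at rounding level.
unitarity_defect = np.linalg.norm(B.conj().T @ B - np.eye(N))
```

Since both factors in (3.7) are unitary, `unitarity_defect` only measures floating-point error.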
3.1. Notations. Since we will be dealing with quantities depending on Planck’s constant N (plus possibly other parameters), all asymptotic notations will refer to the classical limit $N \to \infty$. The notations $A = O(B)$ and $A \lesssim B$ both mean that there exists a constant $c$ such that for any $N \ge 1$, $|A(N)| \le c\,|B(N)|$. Writing $A = O_r(B)$ and $A \lesssim_r B$ means that the constant $c$ depends on the parameter $r$. Similarly A = o(B) and A 0, the C j -norm is defined as
$$\|f\|_{C^j} := \sum_{0 \le |\gamma| \le j} \|\partial^\gamma f\|_{C^0}.$$

Here $\gamma = (\gamma_1, \gamma_2) \in \mathbb{N}_0^2$ denotes the multiindex of differentiation: $\partial^\gamma = \partial_q^{\gamma_1} \partial_p^{\gamma_2}$, and $|\gamma| := \gamma_1 + \gamma_2$.

Because we want to consider large time evolution, namely times $n$ of order $\log N$, we need to consider (smooth) functions which depend on Planck’s constant $1/N$. Indeed, starting from a given smooth function $a$, its evolution $a \circ B^{-n}$ fluctuates more and more strongly along the vertical direction, while it is smoother and smoother along the horizontal one as $n \to \infty$ (assuming $a$ is supported away from the discontinuity set $S_n$). For this reason, we introduce the following spaces of functions [DS, Chap. 7]:

Definition 1. For any $\alpha = (\alpha_1, \alpha_2) \in \mathbb{R}_+^2$, we call $S_\alpha(\mathbb{T}^2)$ the space of $N$-dependent smooth functions $f = f(\cdot, N)$ such that, for any multiindex $\gamma \in \mathbb{N}_0^2$, the quantity

$$C_{\alpha,\gamma}(f) := \sup_{N \in \mathbb{N}} \frac{\|\partial^\gamma f(\cdot, N)\|_{C^0}}{N^{\alpha \cdot \gamma}}$$

is finite (here $\alpha \cdot \gamma = \alpha_1 \gamma_1 + \alpha_2 \gamma_2$). The seminorms $C_{\alpha,\gamma}$ ($\gamma \in \mathbb{N}_0^2$) endow $S_\alpha(\mathbb{T}^2)$ with the structure of a Fréchet space.
4. Coherent States on $\mathbb{T}^2$

Our proof of the quantum-classical correspondence will use coherent states on $\mathbb{T}^2$. Below we define them, and collect some useful properties. More comprehensive details and proofs may be found in [Fo, Per, LV, BDB, BonDB].

We define a (plane) coherent state at the origin with squeezing $\sigma > 0$ through its wavefunction $\Psi_{0,\sigma} \in \mathcal{S}(\mathbb{R})$ (we will always omit the indication of $\hbar$-dependence):

$$\Psi_{0,\sigma}(q) := \Big(\frac{\sigma}{\pi\hbar}\Big)^{1/4} e^{-\frac{\sigma q^2}{2\hbar}}. \tag{4.1}$$

The (plane) coherent state at the point $x = (q_0, p_0) \in \mathbb{R}^2$ is obtained by applying a quantum translation $\hat T_x$ to the state above, which yields:

$$\begin{aligned} \Psi_{x,\sigma}(q) &:= \Big(\frac{\sigma}{\pi\hbar}\Big)^{1/4} e^{-i\frac{p_0 q_0}{2\hbar}}\, e^{i\frac{p_0 q}{\hbar}}\, e^{-\frac{\sigma (q - q_0)^2}{2\hbar}} \\ &= (2N\sigma)^{1/4}\, e^{-\pi i N q_0 p_0 + 2\pi i N p_0 q - \sigma N \pi (q - q_0)^2}. \end{aligned}$$

(In the second line, we took $\hbar = (2\pi N)^{-1}$, as is required if we want to project on the torus.) From here we obtain a coherent state on the cylinder by periodicising along the $q$-axis:

$$\Psi_{x,\sigma,C}(q) := \sum_{\nu \in \mathbb{Z}} \Psi_{x,\sigma}(q + \nu). \tag{4.2}$$

Finally, the coherent state on the torus is obtained by further periodicising in the Fourier variable, or equivalently by sampling this cylinder wavefunction: its coefficients in the canonical basis read

$$\big(\psi_{x,\sigma,\mathbb{T}^2}\big)_j = \frac{1}{\sqrt N}\, \Psi_{x,\sigma,C}(j/N), \qquad j = 0, \ldots, N - 1. \tag{4.3}$$

One can check that $\psi_{x+m,\sigma,\mathbb{T}^2} \propto \psi_{x,\sigma,\mathbb{T}^2}$ for any $m \in \mathbb{Z}^2$: up to a phase, the state $\psi_{x,\sigma,\mathbb{T}^2}$ depends on the projection on $\mathbb{T}^2$ of the point $x$. In the classical limit, it will often be useful to approximate a torus (or cylinder) coherent state by the corresponding planar one:

Lemma 3. Let $q_0 \in (\delta, 1 - \delta)$ for some $0 < \delta < 1/2$. Then in the classical limit:

$$\forall q \in [0, 1), \qquad \Psi_{x,\sigma,C}(q) = \Psi_{x,\sigma}(q) + O\big((\sigma N)^{1/4} e^{-\pi N \sigma \delta^2}\big). \tag{4.4}$$

The error estimate is uniform for $\sigma N \ge 1$.

Proof. Extracting the $\nu = 0$ term in (4.2), one gets

$$\forall q \in [0, 1), \qquad \Psi_{x,\sigma,C}(q) = \Psi_{x,\sigma}(q) + O\big((\sigma N)^{1/4} e^{-\pi \sigma N \min\{|q - q_0 + \nu|^2 \,:\, \nu \ne 0\}}\big).$$

Now, if $q_0 \in (\delta, 1 - \delta)$, one has $|q - q_0| \le 1 - \delta$, so that

$$\forall \nu \ne 0, \qquad |q - q_0 - \nu| \ge |\nu| - |q - q_0| \ge 1 - |q - q_0| \ge \delta.$$
The next lemma describes how a torus coherent state transforms under the application of the discrete Fourier transform.
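Lemma 4. For any $x = (q_0, p_0) \in \mathbb{R}^2$, let $Fx := (p_0, -q_0)$ denote its rotation by $-\pi/2$ around the origin. Then

$$\forall N \ge 1,\ \forall \sigma > 0, \qquad \hat F_N\, \psi_{x,\sigma,\mathbb{T}^2} = \psi_{Fx,1/\sigma,\mathbb{T}^2}. \tag{4.5}$$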
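The covariance (4.5) lends itself to a direct numerical check. The sketch below, assuming NumPy, builds the torus coherent state from (4.2) and (4.3) with the sum over $\nu$ truncated to a few terms (an approximation made here, harmless when $\sigma N$ and $N/\sigma$ are not small); the names `dft_matrix` and `torus_coherent_state` are illustrative.

```python
import numpy as np


def dft_matrix(n):
    """Discrete Fourier transform (3.6)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)


def torus_coherent_state(x, sigma, n, copies=4):
    """Components (4.3) of psi_{x,sigma,T^2} in the position basis.

    The periodisation (4.2) is truncated to |nu| <= copies.
    """
    q0, p0 = x
    nu = np.arange(-copies, copies + 1)
    qq = np.arange(n)[:, None] / n + nu[None, :]  # points j/N + nu
    plane = (2 * n * sigma) ** 0.25 * np.exp(
        -1j * np.pi * n * q0 * p0
        + 2j * np.pi * n * p0 * qq
        - sigma * n * np.pi * (qq - q0) ** 2
    )
    return plane.sum(axis=1) / np.sqrt(n)


N, sigma = 16, 2.0
x = (0.3, 0.6)
Fx = (x[1], -x[0])  # rotation by -pi/2, as in Lemma 4

lhs = dft_matrix(N) @ torus_coherent_state(x, sigma, N)
rhs = torus_coherent_state(Fx, 1 / sigma, N)
lemma4_defect = np.linalg.norm(lhs - rhs)
```

With these parameters the neglected Gaussian tails are far below machine precision, so `lemma4_defect` reflects rounding error only.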
Proof. The plane coherent states, which are Gaussian wavefunctions, are obviously covariant through the Fourier transform $\hat F_\hbar$: a straightforward computation shows that

$$\forall x \in \mathbb{R}^2, \qquad \hat F_\hbar\, \Psi_{x,\sigma} = \Psi_{Fx,1/\sigma}.$$

When $2\pi\hbar = N^{-1}$, we apply the projector (3.2) to both sides of this equality, and remember that $\hat F_\hbar$ acts on $\mathcal{H}_N$ as the matrix $\hat F_N$: this means $\hat P_{\mathbb{T}^2} \hat F_\hbar = \hat F_N \hat P_{\mathbb{T}^2}$, so the above covariance is carried over to the torus coherent states.

4.1. Action of $\hat B_N$ on coherent states. We assume $N$ to be an even integer, and apply the matrix $\hat B_N$ to the coherent state $\psi_{x,\sigma,\mathbb{T}^2}$, seen as an $N$-component vector in the basis $\{q_j\}$. We get nice results if the point $x = (q_0, p_0)$ is “far enough” from the singularity set $S_1$ (in this case $Bx$ is well-defined). More precisely, we define the following subsets of $\mathbb{T}^2$:

Definition 2. For any $0 < \delta < 1/4$ and $0 < \gamma < 1/2$, let

$$D_{1,\delta,\gamma} := \big\{ (q, p) \in \mathbb{T}^2,\ q \in (\delta, 1/2 - \delta) \cup (1/2 + \delta, 1 - \delta),\ p \in (\gamma, 1 - \gamma) \big\}. \tag{4.6}$$

The evolution of coherent states will be simple for states localised in this set.

Proposition 5. For some parameters $\delta, \gamma$ (which may depend on $N$), we consider points $x = (q_0, p_0) \in \mathbb{T}^2$ in the set $D_{1,\delta,\gamma}$. We associate to these points the phase

$$\Theta(x) = \begin{cases} 0, & \text{if } q_0 \in (\delta, 1/2 - \delta), \\ q_0 + \dfrac{p_0 + 1}{2}, & \text{if } q_0 \in (1/2 + \delta, 1 - \delta). \end{cases} \tag{4.7}$$

We assume that the squeezing $\sigma$ may also depend on $N$, remaining in the interval $\sigma \in [1/N, N]$. From $\delta, \gamma, \sigma$ we form the parameter

$$\theta = \theta(\delta, \gamma, \sigma) := \min(\sigma \delta^2, \gamma^2/\sigma). \tag{4.8}$$

Then, in the semiclassical limit, the coherent state $\psi_{x,\sigma,\mathbb{T}^2}$ evolves almost covariantly through the quantum baker’s map, with the same phase factor $e^{i\pi N \Theta(x)}$ as in (4.16):

$$\big\| \hat B_N\, \psi_{x,\sigma,\mathbb{T}^2} - e^{i\pi N \Theta(x)}\, \psi_{Bx,\sigma/4,\mathbb{T}^2} \big\|_{\mathcal{H}_N} = O\big(N^{3/4} \sigma^{1/4} e^{-\pi N \theta}\big). \tag{4.9}$$
The implied constant is uniform with respect to δ, γ , σ ∈ [1/N, N ], and the point x ∈ D1,δ,γ . We notice that the exponential in the above remainder will be small only if θ >> 1/N , which requires both σ >> 1/N and σ 0, ∀x = (q0 , p0 ) ∈ R , p0 +1 i Sˆ1, x,σ = e 2 (q0 + 2 ) S1 x,σ/4 . The approximate covariance stated in Proposition 5 is therefore a microlocal version of this exact global covariance. Remark 2. The fact that the error is exponentially small is due to the piecewise-linear character of the map B. Indeed, for a nonlinear area-preserving map M on T2 , coherent states are also transformed covariantly through Mˆ N , but the error term is in general of order O(N x 3 ), where x is the “maximal width” of the coherent state (here x = max(σ, σ −1 ) N −1/2 ) [Schu1]. Moreover, in general the squeezing σ takes values in the complex half-plane {Re (σ ) > 0}: the reason why we can here restrict ourselves to the positive real line is due to the orientation of the baker’s dynamics. Proof of Proposition 5. Since we already know that FˆN acts covariantly on coherent states, we only need to analyse the action of Bˆ N,mix (Eq. (3.7)). We first consider a coherent state in the “left” strip (δ, 1/2 −δ)×(γ , 1−γ ) of D1,δ,γ . In this case, the “relevant” coefficients of Bˆ N,mix ψx,σ,T2 are in the interval 0 ≤ m < N2 :
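$$\big(\hat B_{N,\mathrm{mix}}\, \psi_{x,\sigma,\mathbb{T}^2}\big)_m = \frac{1}{\sqrt N} \sum_{j=0}^{N/2-1} (\hat F_{N/2})_{mj}\, \Psi_{x,\sigma,C}\Big(\frac{j}{N}\Big). \tag{4.10}$$

From the formula (3.6), we have for all $0 \le j, m < N/2$: $(\hat F_{N/2})_{mj} = \sqrt2\, (\hat F_N)_{2m\,j}$. Since $q_0 \in (\delta, 1/2 - \delta)$, for any $N/2 \le j$ one has $j/N - q_0 \ge \delta$; using Lemma 3, we obtain

$$\forall j \in \{N/2, \ldots, N - 1\}, \qquad \Psi_{x,\sigma,C}\Big(\frac{j}{N}\Big) = O\big((\sigma N)^{1/4} e^{-\pi N \sigma \delta^2}\big). \tag{4.11}$$

We can therefore extend the range of summation in (4.10) to $j \in \{0, \ldots, N - 1\}$, incurring only an exponentially small error:

$$\begin{aligned} \big(\hat B_{N,\mathrm{mix}}\, \psi_{x,\sigma,\mathbb{T}^2}\big)_m &= \sqrt2 \sum_{j=0}^{N-1} (\hat F_N)_{2m\,j}\, \big(\psi_{x,\sigma,\mathbb{T}^2}\big)_j + O\big((\sigma N)^{1/4} e^{-\pi N \sigma \delta^2}\big) \\ &= \sqrt2\, \big(\psi_{Fx,1/\sigma,\mathbb{T}^2}\big)_{2m} + O\big((\sigma N)^{1/4} e^{-\pi N \sigma \delta^2}\big). \end{aligned} \tag{4.12}$$

In the last step, we have used the covariance property of Lemma 4.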
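Since $p_0 \in (\gamma, 1 - \gamma)$ and $N/\sigma \ge 1$, it follows from Lemma 3 and simple manipulations of plane coherent states that for all $q \in [0, 1/2)$,

$$\begin{aligned} \sqrt2\, \Psi_{Fx,1/\sigma,C}(2q) &= \sqrt2\, \Psi_{Fx,1/\sigma}(2q) + O\big((N/\sigma)^{1/4} e^{-\pi N \gamma^2/\sigma}\big) \\ &= \Psi_{(p_0/2,\,-2q_0),\,4/\sigma}(q) + O\big((N/\sigma)^{1/4} e^{-\pi N \gamma^2/\sigma}\big) \\ &= \Psi_{(p_0/2,\,-2q_0),\,4/\sigma,C}(q) + O\big((N/\sigma)^{1/4} e^{-\pi N \gamma^2/\sigma}\big). \end{aligned}$$

The identity $(p_0/2, -2q_0) = FBx$ (valid for $x$ in the left strip) inserted in (4.12) yields for all $m \in \{0, \ldots, N/2 - 1\}$,

$$\big(\hat B_{N,\mathrm{mix}}\, \psi_{x,\sigma,\mathbb{T}^2}\big)_m = \big(\psi_{FBx,4/\sigma,\mathbb{T}^2}\big)_m + O\big((\sigma N)^{1/4} e^{-\pi N \theta}\big) \tag{4.13}$$

($\theta$ is defined in (4.8), and we used the assumption $\sigma N > 1$ to simplify the remainder). The remaining coefficients $N/2 \le m < N$ are bounded using (4.11):

$$\big(\hat B_{N,\mathrm{mix}}\, \psi_{x,\sigma,\mathbb{T}^2}\big)_m = \frac{1}{\sqrt N} \sum_{j=N/2}^{N-1} (\hat F_{N/2})_{m\,j}\, \Psi_{x,\sigma,C}\Big(\frac{j}{N}\Big) = O\big((\sigma N)^{1/4} e^{-\pi \sigma N \delta^2}\big). \tag{4.14}$$

On the other hand, Lemma 3 shows that the coefficients $\big(\psi_{FBx,4/\sigma,\mathbb{T}^2}\big)_m$ for $N/2 \le m < N$ are bounded from above by the same RHS. Hence, Eq. (4.13) holds for all $m = 0, \ldots, N - 1$. A norm estimate is obtained by multiplying this component-wise estimate by a factor $\sqrt N$. We now apply the inverse Fourier transform and Lemma 4, to obtain the part of the proposition dealing with coherent states in the left strip of $D_{1,\delta,\gamma}$.

A similar computation treats the case of coherent states in the right strip of $D_{1,\delta,\gamma}$. The large components of $\psi_{x,\sigma,\mathbb{T}^2}$ are in the interval $j \ge N/2$, so the second block of $\hat B_{N,\mathrm{mix}}$ is relevant. The analogue to (4.13) reads, for $m \in \{N/2, \ldots, N - 1\}$:

$$\big(\hat B_{N,\mathrm{mix}}\, \psi_{x,\sigma,\mathbb{T}^2}\big)_m = \sqrt{\frac{2}{N}}\, \Psi_{Fx,1/\sigma}\Big(\frac{2m}{N} - 1\Big) + O\big((\sigma N)^{1/4} e^{-\pi N \theta}\big). \tag{4.15}$$

Proceeding as before, we identify for all $q \in [1/2, 1)$,

$$\begin{aligned} \sqrt2\, \Psi_{Fx,1/\sigma}(2q - 1) &= e^{\pi i N (q_0 + \frac{p_0 + 1}{2})}\, \Psi_{((p_0+1)/2,\,-(2q_0 - 1)),\,4/\sigma}(q) \\ &= e^{\pi i N (q_0 + \frac{p_0 + 1}{2})}\, \Psi_{FBx,4/\sigma,C}(q) + O\big((N/\sigma)^{1/4} e^{-\pi N \gamma^2/\sigma}\big). \end{aligned} \tag{4.16}$$

Applying the inverse Fourier transform we obtain the second part of the proposition.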
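The quasi-covariance just proven can be observed numerically for moderate $N$. The following is a minimal sketch assuming NumPy: it applies the matrix (3.7) to a torus coherent state built from (4.2) and (4.3) (periodisation truncated to a few copies, an approximation made here) and compares with the coherent state at $Bx$ with squeezing $\sigma/4$, multiplied by the phase $e^{\pi i N (q_0 + (p_0+1)/2)}$ of (4.16) in the right strip and by 1 in the left strip. All function names are illustrative.

```python
import numpy as np


def dft_matrix(n):
    """Discrete Fourier transform (3.6)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)


def quantum_baker(n):
    """Balazs-Voros matrix (3.7), N even."""
    half = dft_matrix(n // 2)
    zero = np.zeros_like(half)
    return dft_matrix(n).conj().T @ np.block([[half, zero], [zero, half]])


def torus_coherent_state(x, sigma, n, copies=4):
    """Components (4.3); the sum (4.2) is truncated to |nu| <= copies."""
    q0, p0 = x
    qq = np.arange(n)[:, None] / n + np.arange(-copies, copies + 1)[None, :]
    plane = (2 * n * sigma) ** 0.25 * np.exp(
        -1j * np.pi * n * q0 * p0
        + 2j * np.pi * n * p0 * qq
        - sigma * n * np.pi * (qq - q0) ** 2
    )
    return plane.sum(axis=1) / np.sqrt(n)


def baker(x):
    """Classical baker's map (2.1)."""
    q, p = x
    return (2 * q, p / 2) if q < 0.5 else (2 * q - 1, (p + 1) / 2)


def covariance_defect(x, sigma, n):
    """Norm of B_N psi_x - phase * psi_{Bx, sigma/4}."""
    q0, p0 = x
    big_theta = 0.0 if q0 < 0.5 else q0 + (p0 + 1) / 2
    evolved = quantum_baker(n) @ torus_coherent_state(x, sigma, n)
    target = np.exp(1j * np.pi * n * big_theta) * torus_coherent_state(
        baker(x), sigma / 4, n
    )
    return np.linalg.norm(evolved - target)


N = 256
left_defect = covariance_defect((0.3, 0.4), 1.0, N)   # delta = 0.2, gamma = 0.4
right_defect = covariance_defect((0.7, 0.4), 1.0, N)  # delta = 0.2, gamma = 0.4
```

For these points $\theta = 0.04$, so the remainder $N^{3/4}\sigma^{1/4}e^{-\pi N\theta}$ is of order $10^{-12}$ and both defects are expected to sit near rounding level; moving $q_0$ towards $0$, $1/2$ or $1$ makes them grow.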
5. Egorov Property

Our objective in this section is to control the evolution of quantum observables through $\hat B_N$, in terms of the corresponding classical evolution. Namely, we want to prove an Egorov theorem of the type

$$\big\| \hat B_N^{\,n}\, \mathrm{Op}_N(a)\, \hat B_N^{-n} - \mathrm{Op}_N(a \circ B^{-n}) \big\| \xrightarrow{N \to \infty} 0. \tag{5.1}$$

Here $\mathrm{Op}_N(a)$ is some quantisation of an observable $a \in C^\infty(\mathbb{T}^2)$. As explained in the introduction, to avoid the diffraction problems due to the discontinuities of $B$, we will require the function $a$ to be supported away from the set $S_n$ of discontinuities of $B^n$. Otherwise, $a \circ B^{-n}$ may be discontinuous, and already its quantisation poses some problems.

An Egorov theorem has been proven in [RubSal] for a different quantisation of the baker’s map, also using coherent states. In [DBDE, Cor. 17] an Egorov theorem was obtained for $\hat B_N$, but valid only for observables of the form $a(q)$ (or $a(p)$, depending on the direction of time) and restricting the observables to a “good” subspace of $\mathcal{H}_N$ of dimension $N - o(N)$.

Since we control the evolution of coherent states through $\hat B_N$ (Proposition 5), it is natural to use a quantisation defined in terms of coherent states, namely the anti-Wick quantisation [Per] (see Definition 4 below). However, because the quasi-covariance (4.9) connects a squeezing $\sigma$ to a squeezing $\sigma/4$, it will be necessary to relate the corresponding quantisations $\mathrm{Op}_N^{AW,\sigma}$ and $\mathrm{Op}_N^{AW,\sigma/4}$ to one another. This will be done in the next subsection, by using the Weyl quantisation as a reference.

Besides, we want to control the correspondence (5.1) uniformly with respect to the time $n$. We already noticed that for $n >> 1$, an observable $a$ supported away from $S_n$ needs to fluctuate quite strongly along the $q$-direction, while its dependence in the $p$ variable may remain mild. Likewise, $a \circ B^{-n}$, supported away from $S_{-n}$, will strongly fluctuate along the $p$-direction. All results in this section will be stated for two classes of observables:
– general functions $f \in C^\infty(\mathbb{T}^2)$, without any indication on how $f$ depends on $N$. This yields an Egorov theorem valid for times $|n| \le (\frac16 - \epsilon)\, T_E$ (with $\epsilon > 0$ fixed), which will suffice to prove Theorem 1 ($T_E = T_E(N)$ is the Ehrenfest time (1.2)).
– functions $f \in S_\alpha(\mathbb{T}^2)$ for some $\alpha \in \mathbb{R}_+^2$ with $|\alpha| < 1$ (see Definition 1). Here we use more sophisticated methods in order to push the Egorov theorem up to the times $|n| \le (1 - \epsilon)\, T_E$.
5.1. Weyl vs. anti-Wick quantizations on $\mathbb{T}^2$. In this subsection, we define and compare the Weyl and anti-Wick quantisations on the torus. The main result is Proposition 8, which precisely estimates the discrepancies between these quantisations, in the classical limit. We start by recalling the definition of the Weyl quantisation on the torus [BDB, DEG].

Definition 3. Any function $f \in C^\infty(\mathbb{T}^2)$ can be Fourier expanded as follows:

$$f = \sum_{k \in \mathbb{Z}^2} \tilde f(k)\, e_k, \qquad \text{where } e_k(x) := e^{2\pi i\, x \wedge k} = e^{2\pi i (q k_2 - p k_1)}.$$

The Weyl quantisation of this function is the following operator:

$$\mathrm{Op}_N^W(f) := \sum_{k \in \mathbb{Z}^2} \tilde f(k)\, T(k), \qquad \text{where } T(k) := \hat T_{hk} = \hat T_{k/N}. \tag{5.2}$$

We use the same notations for translation operators $T(k)$ acting on either $\mathcal{H}_N$ or $L^2(\mathbb{R})$; in the latter case, the Weyl-quantised operator will be denoted by $\mathrm{Op}_N^{W,\mathbb{R}^2}(f)$. The operators $\{T(k);\ k \in \mathbb{Z}^2\}$ acting on $L^2(\mathbb{R})$ form an independent set of unitary operators. On the other hand, on $\mathcal{H}_N$ these operators satisfy $T(k + Nm) = (-1)^{k \wedge m}\, T(k)$. Hence, defining $\mathbb{Z}_N := \{-N/2, \ldots, N/2 - 1\}$, the set $\{T(k),\ k \in \mathbb{Z}_N^2\}$ forms a basis of the space of operators on $\mathcal{H}_N$. This basis is orthonormal with respect to the Hilbert-Schmidt scalar product (3.8). The Weyl quantisations on $L^2(\mathbb{R})$ and $\mathcal{H}_N$ satisfy the following inequality [BDB, Lemma 3.9]:

$$\forall f \in C^\infty(\mathbb{T}^2),\ \forall N \in \mathbb{N}, \qquad \big\| \mathrm{Op}_N^W(f) \big\|_{B(\mathcal{H}_N)} \le \big\| \mathrm{Op}_N^{W,\mathbb{R}^2}(f) \big\|_{B(L^2(\mathbb{R}))}. \tag{5.3}$$

This will allow us to use results pertaining to the Weyl quantisation of bounded functions on the plane (see the proof of Lemma 9). We now define a family of anti-Wick quantisations.

Definition 4. For any squeezing $\sigma > 0$, the anti-Wick quantisation of a function $f \in L^1(\mathbb{T}^2)$ is the operator $\mathrm{Op}_N^{AW,\sigma}(f)$ on $\mathcal{H}_N$ defined as:

$$\forall \phi, \phi' \in \mathcal{H}_N, \qquad \langle \phi, \mathrm{Op}_N^{AW,\sigma}(f)\, \phi' \rangle := N \int_{\mathbb{T}^2} f(x)\, \langle \phi, \psi_{x,\sigma,\mathbb{T}^2} \rangle \langle \psi_{x,\sigma,\mathbb{T}^2}, \phi' \rangle\, dx. \tag{5.4}$$
Both Weyl and anti-Wick quantisations map a real observable onto a Hermitian operator. As opposed to the Weyl quantisation, the anti-Wick quantisation enjoys the important property of positivity. Namely, if the function $a$ is nonnegative, then for any $N, \sigma$, the operator $\mathrm{Op}_N^{AW,\sigma}(a)$ is positive.

These quantisations will be easy to compare once we have expressed the anti-Wick quantisation in terms of the Weyl one [BonDB].

Lemma 6. Using the quadratic form $Q_\sigma(k) := \sigma k_1^2 + \sigma^{-1} k_2^2$, one has the following expression for the anti-Wick quantisation:

$$\forall f \in L^1(\mathbb{T}^2), \qquad \mathrm{Op}_N^{AW,\sigma}(f) = \sum_{k \in \mathbb{Z}^2} \tilde f(k)\, e^{-\frac{\pi}{2N} Q_\sigma(k)}\, T(k). \tag{5.5}$$

Equivalently, $\mathrm{Op}_N^{AW,\sigma}(f) = \mathrm{Op}_N^W(f^\sharp)$, where the function $f^\sharp$ is obtained by convolution of $f$ (on $\mathbb{R}^2$) with the Gaussian kernel

$$K_{N,\sigma}(x) := 2N\, e^{-2\pi N Q_\sigma(x)}. \tag{5.6}$$
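The equivalence between the multiplier in (5.5) and the kernel (5.6) amounts to a Gaussian Fourier integral: convolving the mode $e_k$ with $K_{N,\sigma}$ multiplies it by $\int K_{N,\sigma}(y)\, e^{-2\pi i\, y \wedge k}\, dy$. The sketch below, assuming NumPy, evaluates this integral by Riemann sums on a truncated grid (a numerical shortcut chosen here) and compares it with $e^{-\frac{\pi}{2N} Q_\sigma(k)}$; the function names are illustrative.

```python
import numpy as np


def q_form(k, sigma):
    """Quadratic form Q_sigma(k) = sigma*k1^2 + k2^2/sigma of Lemma 6."""
    return sigma * k[0] ** 2 + k[1] ** 2 / sigma


def gaussian_multiplier(k, sigma, n, half_width=1.0, points=200001):
    """Integral of K_{N,sigma}(y) * exp(-2i*pi*(y wedge k)) over the plane.

    K factorises into a q-part and a p-part with 2N = sqrt(2N*sigma)*sqrt(2N/sigma),
    and y wedge k = y1*k2 - y2*k1, so the integral is a product of two 1-D sums.
    """
    y = np.linspace(-half_width, half_width, points)
    dy = y[1] - y[0]
    along_q = np.sqrt(2 * n * sigma) * dy * np.sum(
        np.exp(-2 * np.pi * n * sigma * y ** 2 - 2j * np.pi * y * k[1])
    )
    along_p = np.sqrt(2 * n / sigma) * dy * np.sum(
        np.exp(-2 * np.pi * n * y ** 2 / sigma + 2j * np.pi * y * k[0])
    )
    return along_q * along_p


N, sigma, k = 20, 0.5, (3, -2)
numeric = gaussian_multiplier(k, sigma, N)
expected = np.exp(-np.pi * q_form(k, sigma) / (2 * N))
multiplier_error = abs(numeric - expected)
```

The same computation with $k = 0$ returns the total mass of $K_{N,\sigma}$, which equals 1.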
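Proof. To prove this lemma, it is sufficient to show that for any $k_0 \in \mathbb{Z}^2$, the anti-Wick quantisation on $\mathcal{H}_N$ of the Fourier mode $e_{k_0}(x)$ reads:

$$\mathrm{Op}_N^{AW,\sigma}(e_{k_0}) = e^{-\frac{\pi}{2N} Q_\sigma(k_0)}\, T(k_0). \tag{5.7}$$

This formula has been proven in [BonDB, Lemma 2.3 (ii)], yet we give here its proof for completeness. The idea is to decompose $\mathrm{Op}_N^{AW,\sigma}(e_{k_0})$ in the basis $\{T(k),\ k \in \mathbb{Z}_N^2\}$, using the Hilbert-Schmidt scalar product (3.8). That is, we need to compute

$$\big\langle T(k), \mathrm{Op}_N^{AW,\sigma}(e_{k_0}) \big\rangle = \int_{\mathbb{T}^2} e_{k_0}(x)\, \langle \psi_{x,\sigma,\mathbb{T}^2}, T(k)^\dagger\, \psi_{x,\sigma,\mathbb{T}^2} \rangle\, dx. \tag{5.8}$$

The overlaps between torus coherent states derive from the overlaps between plane coherent states, which are simple Gaussian integrals:

$$\forall x, y \in \mathbb{R}^2, \qquad \langle \Psi_{y,\sigma}, \Psi_{x,\sigma} \rangle_{\mathbb{R}^2} = e^{i\frac{y \wedge x}{2\hbar}}\, \langle \Psi_{0,\sigma}, \hat T_{x-y}\, \Psi_{0,\sigma} \rangle_{\mathbb{R}^2} = e^{i\frac{y \wedge x}{2\hbar}}\, e^{-\frac{Q_\sigma(x-y)}{4\hbar}}.$$

Using the projector (3.2), we get

$$\begin{aligned} \langle \psi_{x,\sigma,\mathbb{T}^2}, \hat T_{k/N}\, \psi_{x,\sigma,\mathbb{T}^2} \rangle &= \sum_{m \in \mathbb{Z}^2} (-1)^{N m_1 m_2}\, \langle \Psi_{x,\sigma}, \hat T_{k/N} \hat T_m\, \Psi_{x,\sigma} \rangle_{\mathbb{R}^2} \\ &= \sum_{m \in \mathbb{Z}^2} (-1)^{N m_1 m_2 + m \wedge k}\, e^{2i\pi (x \wedge (k + Nm))}\, e^{-\frac{\pi N}{2} Q_\sigma(m + k/N)}. \end{aligned}$$

We insert this expression in the RHS of (5.8) (and remember that $N$ is even):

$$\big\langle T(k), \mathrm{Op}_N^{AW,\sigma}(e_{k_0}) \big\rangle = \sum_{m \in \mathbb{Z}^2} \delta_{k_0,\, k + Nm}\, (-1)^{m \wedge k}\, e^{-\frac{\pi N}{2} Q_\sigma(m + k/N)}.$$

This expression vanishes unless $k = k_1$, the unique element of $\mathbb{Z}_N^2$ such that $k_1 = k_0 + N m_1$ for some $m_1 \in \mathbb{Z}^2$. In that case $m + k/N = k_0/N$, so the Gaussian factor equals $e^{-\frac{\pi}{2N} Q_\sigma(k_0)}$. The orthonormality of the basis $\{T(k) : k \in \mathbb{Z}_N^2\}$ gives that

$$\mathrm{Op}_N^{AW,\sigma}(e_{k_0}) = (-1)^{m_1 \wedge k_1}\, e^{-\frac{\pi}{2N} Q_\sigma(k_0)}\, T(k_1) = e^{-\frac{\pi}{2N} Q_\sigma(k_0)}\, T(k_0).$$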
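A simple property of these quantisations is the semi-classical behaviour of the traces of quantized observables:

Lemma 7. For any integer $M \ge 3$,

$$\forall f \in C^\infty(\mathbb{T}^2), \qquad \frac{1}{N}\, \mathrm{Tr}\big(\mathrm{Op}_N^W(f)\big) = \int_{\mathbb{T}^2} f(x)\, dx + O_M\Big(\frac{\|f\|_{C^M}}{N^M}\Big). \tag{5.9}$$

For the anti-Wick quantisation, we have:

$$\forall f \in L^1(\mathbb{T}^2), \qquad \frac{1}{N}\, \mathrm{Tr}\big(\mathrm{Op}_N^{AW,\sigma}(f)\big) = \int_{\mathbb{T}^2} f(x)\, dx + O\big(\|f\|_{L^1}\, e^{-\frac{\pi N}{2} \min(\sigma, 1/\sigma)}\big). \tag{5.10}$$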
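Proof. The first identity uses the fact that on the space $\mathcal{H}_N$,

$$\frac{1}{N}\, \mathrm{Tr}\, T(k) = \begin{cases} 1 & \text{if } k = Nm \text{ for some } m \in \mathbb{Z}^2, \\ 0 & \text{otherwise.} \end{cases}$$

The error term in (5.9) is bounded above by $\sum_{m \in \mathbb{Z}^2 \setminus \{0\}} |\tilde f(Nm)|$. Now, the Fourier coefficients of a smooth function satisfy

$$\forall M \ge 1,\ \forall k \in \mathbb{Z}^2, \qquad |\tilde f(k)| \lesssim_M \frac{\|f\|_{C^M}}{(1 + |k|)^M}. \tag{5.11}$$

Using this upper bound (with $M \ge 3$) in the above sum yields (5.9). In the anti-Wick case, each term $|\tilde f(Nm)| \le \|f\|_{L^1}$ of the sum is multiplied by $e^{-\frac{\pi N}{2} Q_\sigma(m)} \le e^{-\frac{\pi N}{2} \min(\sigma, 1/\sigma) |m|^2}$, which yields (5.10).

We will now compare the Weyl and anti-Wick quantisations in the operator norm. We give two estimates, corresponding to the two classes of functions described in the introduction of this section.

Proposition 8. I) For any $f \in C^\infty(\mathbb{T}^2)$ and $\sigma > 0$,

$$\big\| \mathrm{Op}_N^W(f) - \mathrm{Op}_N^{AW,\sigma}(f) \big\| \lesssim \|f\|_{C^5}\, \frac{\max\{\sigma, \sigma^{-1}\}}{N}. \tag{5.12}$$

Here $\sigma$ may depend arbitrarily on $N$.

II) Let $\alpha \in \mathbb{R}_+^2$, $|\alpha| < 1$ and assume that $\sigma > 0$ may depend on $N$ such that the quantity

$$\hbar_\alpha(N, \sigma) := \max\Big( \frac{N^{2\alpha_1 - 1}}{\sigma},\ N^{\alpha_1 + \alpha_2 - 1},\ \sigma N^{2\alpha_2 - 1} \Big) \tag{5.13}$$

goes to zero as $N \to \infty$. Then there exists a seminorm $\mathcal{N}_\alpha$ on the space $S_\alpha(\mathbb{T}^2)$ such that, for any $f = f(\cdot, N) \in S_\alpha(\mathbb{T}^2)$, one has:

$$\forall N \ge 1, \qquad \big\| \mathrm{Op}_N^{AW,\sigma}(f(\cdot, N)) - \mathrm{Op}_N^W(f(\cdot, N)) \big\| \lesssim \mathcal{N}_\alpha(f)\, \hbar_\alpha(N, \sigma). \tag{5.14}$$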
Remark 3. The effective “small parameter” α (N, σ ) will be small as N → ∞ only if three conditions are simultaneously satisfied: – |α| = α1 + α2 < 1, – N 2α1 0,
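$$\big\| \mathrm{Op}_\hbar^{W,\mathbb{R}^2}(f) \big\| \le C \sum_{\gamma_1, \gamma_2 = 0}^{1} \hbar^{\beta \gamma_1 + (1 - \beta) \gamma_2}\, \|\partial^\gamma f\|_{C^0(\mathbb{R}^2)}.$$

In the case $\hbar = (2\pi N)^{-1}$ we apply this bound to a function $f \in S_\alpha(\mathbb{T}^2)$, selecting $\boldsymbol\beta = (\beta, 1 - \beta)$ such that $\boldsymbol\beta \ge \alpha$: we then obtain the upper bound of (5.16) for $\| \mathrm{Op}_N^{W,\mathbb{R}^2}(f) \|$. The inequality (5.3) shows that this bound applies as well to the Weyl operator on $\mathcal{H}_N$.

² We thank N. Anantharaman for pointing out to us this scaling argument.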
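Equipped with this lemma, we can now prove the second part of Proposition 8. Let $f^\sharp = f * K_{N,\sigma}$ denote the smoothed function of Lemma 6. From the Taylor expansion

$$\big| f(x + y) - f(x) - (y \cdot \nabla) f(x) \big| \le \frac12 \max_{0 \le t \le 1} \big| (y \cdot \nabla)^2 f(z) \big|, \qquad z = x + ty,$$

and Lemma 6, one easily checks that for any $f \in C^\infty(\mathbb{T}^2)$,

$$\|f^\sharp - f\|_{C^0} \le \frac{1}{8\pi N} \Big( \frac{1}{\sigma}\, \|\partial_q^2 f\|_{C^0} + 2\, \|\partial_q \partial_p f\|_{C^0} + \sigma\, \|\partial_p^2 f\|_{C^0} \Big).$$

Since differentiation commutes with convolution, one controls all derivatives: for all $\gamma \in \mathbb{N}_0^2$,

$$\|\partial^\gamma (f^\sharp - f)\|_{C^0} \le \frac{1}{8\pi N} \Big( \frac{1}{\sigma}\, \|\partial^{\gamma + (2,0)} f\|_{C^0} + 2\, \|\partial^{\gamma + (1,1)} f\|_{C^0} + \sigma\, \|\partial^{\gamma + (0,2)} f\|_{C^0} \Big). \tag{5.19}$$

For $f = f(\cdot, N) \in S_\alpha(\mathbb{T}^2)$, this estimate implies:

$$\begin{aligned} \|\partial^\gamma (f^\sharp - f)\|_{C^0} &\le N^{\alpha \cdot \gamma} \Big( \frac{N^{2\alpha_1 - 1}}{\sigma}\, C_{\alpha,\gamma + (2,0)}(f) + N^{\alpha_1 + \alpha_2 - 1}\, C_{\alpha,\gamma + (1,1)}(f) + \sigma N^{2\alpha_2 - 1}\, C_{\alpha,\gamma + (0,2)}(f) \Big) \\ &\le N^{\alpha \cdot \gamma}\, \hbar_\alpha(N, \sigma)\, \big( C_{\alpha,\gamma + (2,0)}(f) + C_{\alpha,\gamma + (1,1)}(f) + C_{\alpha,\gamma + (0,2)}(f) \big). \end{aligned} \tag{5.20}$$

Here we used the parameter $\hbar_\alpha(N, \sigma)$ defined in (5.13). This shows that the function $f^{\sharp,\mathrm{rem}}(\cdot, N) := \frac{1}{\hbar_\alpha(N, \sigma)} \big( f^\sharp(\cdot, N) - f(\cdot, N) \big)$ is also an element of $S_\alpha(\mathbb{T}^2)$, with seminorms dominated by seminorms of $f$. Applying Lemma 9 to that function and taking any $\boldsymbol\beta \ge \alpha$, $|\boldsymbol\beta| = 1$, we get

$$\big\| \mathrm{Op}_N^{AW,\sigma}(f(\cdot, N)) - \mathrm{Op}_N^W(f(\cdot, N)) \big\| \lesssim \hbar_\alpha(N, \sigma) \sum_{|\gamma'| \le 2}\ \sum_{\gamma_1, \gamma_2 = 0}^{1} C_{\alpha,\gamma + \gamma'}(f).$$

The seminorm stated in the proposition can therefore be defined as

$$\mathcal{N}_\alpha(f) := \sum_{|\gamma'| \le 2}\ \sum_{\gamma_1, \gamma_2 = 0}^{1} C_{\alpha,\gamma + \gamma'}(f). \tag{5.21}$$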
5.2. Egorov estimates for the baker’s map. We now turn to the proof of the Egorov property (5.1). Let us start with the case $n = 1$. We assume that $a$ is supported in the set $D_{1,\delta,\gamma}$ defined in Eq. (4.6), away from the discontinuity set $S_1$ of $B$.

Proposition 10. Let $0 < \delta < 1/4$ and $0 < \gamma < 1/2$. Assume that the support of $a \in C^\infty(\mathbb{T}^2)$ is contained in $D_{1,\delta,\gamma}$. Then, in the classical limit,

$$\big\| \hat B_N\, \mathrm{Op}_N^{AW,\sigma}(a)\, \hat B_N^{-1} - \mathrm{Op}_N^{AW,\sigma/4}(a \circ B^{-1}) \big\| \lesssim \|a\|_{C^0}\, N^{5/4} \sigma^{1/4} e^{-\pi N \theta},$$

uniformly with respect to $\delta$, $\gamma$, $\sigma \in [1/N, N]$. Here we took as before $\theta = \min(\sigma \delta^2, \gamma^2/\sigma)$.

Proof. For any normalised state $\phi \in \mathcal{H}_N$, we consider the matrix element

$$\langle \phi, \hat B_N\, \mathrm{Op}_N^{AW,\sigma}(a)\, \hat B_N^{-1} \phi \rangle = N \int_{\mathbb{T}^2} a(x)\, \langle \phi, \hat B_N \psi_{x,\sigma,\mathbb{T}^2} \rangle \langle \hat B_N \psi_{x,\sigma,\mathbb{T}^2}, \phi \rangle\, dx. \tag{5.22}$$

Using the quasi-covariance of coherent states localised in $D_{1,\delta,\gamma}$ (Proposition 5) and applying the Cauchy-Schwarz inequality, the RHS reads

$$N \int_{\mathbb{T}^2} a(x)\, \langle \phi, \psi_{Bx,\sigma/4,\mathbb{T}^2} \rangle \langle \psi_{Bx,\sigma/4,\mathbb{T}^2}, \phi \rangle\, dx + O\big(\|a\|_{C^0}\, N^{5/4} \sigma^{1/4} e^{-\pi N \theta}\big). \tag{5.23}$$

The remainder is uniform with respect to the state $\phi$. Through the variable substitution $x = B^{-1}(y)$, this gives

$$\langle \phi, \hat B_N\, \mathrm{Op}_N^{AW,\sigma}(a)\, \hat B_N^{-1} \phi \rangle = \langle \phi, \mathrm{Op}_N^{AW,\sigma/4}(a \circ B^{-1})\, \phi \rangle + O\big(\|a\|_{C^0}\, N^{5/4} \sigma^{1/4} e^{-\pi N \theta}\big). \tag{5.24}$$

Since the operators on both sides are self-adjoint, this identity implies the norm estimate of the proposition.

Remark 4. Here we used the property that the linear local dynamics is the same at each point $x \in \mathbb{T}^2 \setminus S_1$ (expansion by a factor 2 along the horizontal, contraction by 1/2 along the vertical). Were this not the case, the state $\hat B_N \psi_{x,\sigma,\mathbb{T}^2}$ would be close to a coherent state at the point $Bx$, but with a squeezing depending on the point $x$. Integrating over $x$, we would get an anti-Wick quantisation of $a \circ B^{-1}$ with $x$-dependent squeezing, the analysis of which would be more complicated (see [Schu1, Chap. 4] for a discussion of such quantisations).

We now generalise to $n > 1$. We assume that $a$ is supported away from the set $S_n$ of discontinuities of $B^n$. More precisely, for some $\delta \in (0, 2^{-n-1})$ and $\gamma \in (0, 1/2)$, we define the following open set, generalizing (4.6):

$$D_{n,\delta,\gamma} := \Big\{ (q, p) \in \mathbb{T}^2,\ \forall k \in \mathbb{Z},\ \Big| q - \frac{k}{2^n} \Big| > \delta,\ p \in (\gamma, 1 - \gamma) \Big\}.$$

The evolution of the sets $D_{n,\delta,\gamma}$ through $B$ satisfies:

$$\forall j \in \{0, \ldots, n - 1\}, \qquad B^j D_{n,\delta,\gamma} \subset D_{n-j,\, 2^j \delta,\, \gamma/2^j}. \tag{5.25}$$

This is illustrated for $n = 2$, $j = 1$ in Fig. 5.1. If $a$ is supported in $D_{n,\delta,\gamma}$, then the support of $a \circ B^{-j}$ is contained in $D_{n-j,\, 2^j \delta,\, \gamma/2^j} \subset D_{1,\, 2^j \delta,\, \gamma/2^j}$. So for each $0 \le j < n$, we can apply Proposition 10 to the observable $a \circ B^{-j}$, replacing the parameters $\delta, \gamma, \sigma$ by their corresponding values at time $j$; we find that the parameter $\theta$ is independent of $j$. The triangle inequality then yields:
Fig. 5.1. The action of the map $B$. On the left we show the set $D_{2,\delta,\gamma}$ (shaded) and on the right is its image under the action of $B$
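Corollary 11. Let $n > 0$ and for some $\delta \in (0, 2^{-n-1})$, $\gamma \in (0, 1/2)$, let $a \in C^\infty(\mathbb{T}^2)$ have support in $D_{n,\delta,\gamma}$. Then, as $N \to \infty$,

$$\big\| \hat B_N^{\,n}\, \mathrm{Op}_N^{AW,\sigma}(a)\, \hat B_N^{-n} - \mathrm{Op}_N^{AW,\sigma/4^n}(a \circ B^{-n}) \big\| \lesssim \|a\|_{C^0}\, N^{5/4} \sigma^{1/4} e^{-\pi N \theta}. \tag{5.26}$$

This estimate is uniform with respect to $n$, the parameters $\delta, \gamma$ in the above ranges and the squeezing $\sigma \in [\frac{4^n}{N}, N]$.

Remark 5. The requirement $N\theta >> 1$, together with the allowed ranges for $\delta, \gamma$, imposes the restriction $\frac{4^n}{N} << \sigma << N$, hence $T_E - n >> 1$, where $T_E$ is the Ehrenfest time (1.2). We can reach times $n \sim T_E (1 - \epsilon)$ (with $\epsilon > 0$ fixed) by taking the parameter $\delta = 2^{-n-2}$ (which is of order $N^{-1+\epsilon}$), $\gamma$ of order 1 and $\sigma$ of order $N^{1-\epsilon}$: in that case, the argument of the exponential in the RHS of Eq. (5.26) satisfies that $\pi N \theta$ is of order $N^\epsilon$, so that the RHS decays in the classical limit.

We wish to obtain Egorov theorems where both terms correspond to a quantisation with the same parameter $\sigma$, or the Weyl quantisation. To do so, we will use Proposition 8 to replace the anti-Wick quantisations by the Weyl quantisation. Using the first statement of that proposition, we easily obtain the following Egorov theorem:

Theorem 12. Let $n > 0$ and for some $\delta \in (0, 2^{-n-1})$, $\gamma \in (0, 1/2)$, let $a \in C^\infty(\mathbb{T}^2)$ have support in $D_{n,\delta,\gamma}$. Then, in the limit $N \to \infty$, and for any squeezing parameter $\sigma \in [\frac{4^n}{N}, N]$, we have

$$\begin{aligned} \big\| \hat B_N^{\,n}\, \mathrm{Op}_N^W(a)\, \hat B_N^{-n} - \mathrm{Op}_N^W(a \circ B^{-n}) \big\| \lesssim{} & \|a\|_{C^0}\, N^{5/4} \sigma^{1/4} e^{-\pi N \theta} \\ & + \frac{1}{N} \Big( \max(\sigma, \sigma^{-1})\, \|a\|_{C^5} + \max\Big(\frac{4^n}{\sigma}, \frac{\sigma}{4^n}\Big)\, \|a \circ B^{-n}\|_{C^5} \Big). \end{aligned} \tag{5.27}$$

The implied constants are uniform in $n, \sigma, \delta, \gamma$.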
If $n, \delta, \gamma$ and the observable $a$ supported on $D_{n,\delta,\gamma}$ are independent of $N$, the RHS semi-classically converges to zero if we simply take $\sigma = 1$. This is the “finite-time” Egorov theorem. On the other hand, if we let $n$ grow with $N$, the function $a$ needs to change with $N$ as well (at least because its support needs to change). In the next subsection we construct a specific family of functions $\{a_n\}_{n \ge 1}$, each one supported away from $S_n$, and compute the estimate (5.27) for this family.

Remark 6. The same estimate holds if we replace $n$ by $-n$ on the LHS of (5.27), and replace $\sigma$ by $\sigma^{-1}$ on the RHS, including the definition of $\theta$. Now, the function $a$ must be supported in the set $D_{-n,\delta,\gamma}$ obtained from $D_{n,\delta,\gamma}$ by exchanging the roles of $q$ and $p$. Indeed, using the unitarity of $\hat B_N$, we may interpret the estimate (4.9) as the quasicovariant evolution of the coherent state $\psi_{y,\sigma',\mathbb{T}^2}$ (where $y = Bx$, $\sigma' = \sigma/4$) into the state $\psi_{B^{-1}y,4\sigma',\mathbb{T}^2}$, and the rest of the proof identically follows.

5.3. Egorov estimates for truncated observables.

5.3.1. A family of admissible functions. For future purposes (see the proof of Theorem 1 in the next section), and in order to understand better the bound (5.27), we explicitly construct a sequence of functions $\{a_n\}_{n \ge 0}$, each function being supported away from $S_n$. This sequence is simply obtained by taking the products of a fixed observable $a \in C^\infty(\mathbb{T}^2)$ with cutoff functions $\chi_{\delta,n}$, which we now describe.

Definition 5. For some $0 < \delta < 1/4$, we consider a $\mathbb{Z}$-periodic function $\tilde\chi_\delta \in C^\infty(\mathbb{R})$ which vanishes for $x \in [-\delta, \delta] \bmod \mathbb{Z}$ and takes value 1 for $x \in [2\delta, 1 - 2\delta] \bmod \mathbb{Z}$. For any $n \ge 0$, we then define the following cutoff functions on $\mathbb{T}^2$:

$$\chi_{\delta,n}(x) := \tilde\chi_\delta(2^n q)\, \tilde\chi_\delta(p), \qquad \chi_{\delta,-n}(x) := \tilde\chi_\delta(2^n p)\, \tilde\chi_\delta(q).$$

For any $n \in \mathbb{Z}$, we split the observable $a \in C^\infty(\mathbb{T}^2)$ into its “good part” $a_n(x) := a(x)\, \chi_{\delta,n}(x)$ and its “bad part” $a_n^{\mathrm{bad}}(x) = a(x)\, (1 - \chi_{\delta,n}(x))$.

One easily checks that $a_n$ is supported on $D_{n,\delta/2^n,\delta}$, while $a_n^{\mathrm{bad}}$ is supported on a neighbourhood of $S_n$ of area $O(\delta)$. In light of Remark 6 we can, without loss of generality, consider only times $n > 0$. For any multiindex $\gamma \in \mathbb{N}_0^2$, we have

$$\|\partial^\gamma a_n\|_{C^0} \lesssim_\gamma \|a\|_{C^{|\gamma|}}\, 2^{n\gamma_1}\, \delta^{-|\gamma|}. \tag{5.28}$$

When evolving $a_n$ through the map $B$, the derivatives grow along $p$ and decrease along $q$; after $n$ iterations, $a_n \circ B^{-n}$ is still smooth, and

$$\|\partial^\gamma (a_n \circ B^{-n})\|_{C^0} \lesssim_\gamma \|a\|_{C^{|\gamma|}}\, 2^{n\gamma_2}\, \delta^{-|\gamma|}. \tag{5.29}$$
These estimates show that the $C^5$-norms of $a_n$ and $a_n \circ B^{-n}$ (appearing on the RHS of Eq. (5.27)) are both of order $2^{5n}/\delta^5$. With our conventions, the parameter $\theta$ appearing in the RHS of (5.26) reads $\theta = \frac{\delta^2}{\max(\sigma,\, 4^n/\sigma)}$. We maximise it by selecting $\sigma = 2^n$. With this choice, the upper bound (5.27) reads

$$\big\| \hat B_N^{\,n}\, \mathrm{Op}_N^W(a_n)\, \hat B_N^{-n} - \mathrm{Op}_N^W(a_n \circ B^{-n}) \big\| \lesssim \|a\|_{C^0}\, N^{5/4}\, 2^{n/4}\, e^{-\pi N \delta^2/2^n} + \frac{2^{6n}}{N \delta^5}\, \|a\|_{C^5}. \tag{5.30}$$

Using Remark 6, the same estimate holds if we replace $n$ by $-n$ on the LHS. The last term of the RHS in (5.30) can semiclassically vanish only if $|n| < \frac{T_E}{6}$. This time window, although not optimal (see the following subsection), will be sufficient to prove Theorem 1 in Sect. 6. Before that, in the last part of this section we will sharpen this estimate by using the second part of Proposition 8: this will allow us to prove an Egorov property up to times $|n| \le (1 - \epsilon)\, T_E$, for any $\epsilon > 0$.
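The origin of the window $|n| < T_E/6$ can be seen by tabulating the algebraic term of (5.30). The sketch below assumes $T_E = \log N / \log 2$, which is the form suggested by the relation $2^n = N^{n/T_E}$ used later in the text; the definition (1.2) itself is not reproduced here, so this is an assumption. The function names are illustrative, and the norms $\|a\|_{C^0}$, $\|a\|_{C^5}$ are left out.

```python
import math


def ehrenfest_time(n_dim):
    """Assumed form of the Ehrenfest time: T_E = log N / log 2."""
    return math.log(n_dim) / math.log(2)


def algebraic_term(n, n_dim, delta):
    """Last term of (5.30) without ||a||_{C^5}: 2^{6n} / (N * delta^5)."""
    return 2.0 ** (6 * n) / (n_dim * delta ** 5)


def exponential_term(n, n_dim, delta):
    """First term of (5.30) without ||a||_{C^0}."""
    return n_dim ** 1.25 * 2.0 ** (n / 4) * math.exp(
        -math.pi * n_dim * delta ** 2 / 2.0 ** n
    )


delta = 0.2
dimensions = [2 ** 20, 2 ** 40, 2 ** 60]

# Times n = 0.10 * T_E lie inside the window, n = 0.25 * T_E outside it.
inside = [algebraic_term(0.10 * ehrenfest_time(d), d, delta) for d in dimensions]
outside = [algebraic_term(0.25 * ehrenfest_time(d), d, delta) for d in dimensions]
exp_inside = [exponential_term(0.10 * ehrenfest_time(d), d, delta) for d in dimensions]
```

At $n = t\,T_E$ the algebraic term equals $N^{6t-1}/\delta^5$, so it decreases with $N$ for $t < 1/6$ and grows for $t > 1/6$, while the exponential term is negligible at fixed $\delta$ in both regimes.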
5.3.2. Optimised Egorov estimates. In this subsection we prove the following “optimal” Egorov theorem.

Theorem 13. Choose $\epsilon > 0$ arbitrarily small, and consider any observable $a \in C^\infty(\mathbb{T}^2)$. For any $N \ge 1$ and $n \in \mathbb{Z}$, construct the “good part” $a_n$ of that observable using Definition 5 with a width $\delta(N) \ge \min(N^{-\epsilon/4}, 1/10)$. Then, the following Egorov estimate holds: there exists $C > 0$ (independent of $a$, $\epsilon$) and $N(\epsilon) > 0$ such that for any $N \ge N(\epsilon)$ and any time $|n| \le (1 - \epsilon)\, T_E$,

$$\big\| \hat B_N^{\,n}\, \mathrm{Op}_N^W(a_n)\, \hat B_N^{-n} - \mathrm{Op}_N^W(a_n \circ B^{-n}) \big\| \le C \Big( \|a\|_{C^0}\, N^{3/2}\, e^{-\pi N^{\epsilon/2}} + \frac{\|a\|_{C^4}}{N^{\epsilon/2}} \Big). \tag{5.31}$$

Proof. We only treat the case $n \ge 0$, finally invoking the time-reversal symmetry as in Remark 6. We consider $\epsilon > 0$ fixed, and define $N(\epsilon)$ through the equation $N(\epsilon)^{-\epsilon/4} = 1/10$. We then take $N \ge N(\epsilon)$ and consider any positive time $n \le (1 - \epsilon)\, T_E$. The improvement over Theorem 12 will be a sharper bound for the norms $\| \mathrm{Op}_N^{AW,\sigma}(a_n) - \mathrm{Op}_N^W(a_n) \|$ and $\| \mathrm{Op}_N^{AW,\sigma/4^n}(a_n \circ B^{-n}) - \mathrm{Op}_N^W(a_n \circ B^{-n}) \|$. Using the rescaled time $t = \frac{n}{T_E}$ and the property $\delta(N) \ge N^{-\epsilon/4}$, the bound (5.28) on derivatives of $a_n$ reads:

$$\|\partial^\gamma a_n\|_{C^0} \lesssim_\gamma \|a\|_{C^{|\gamma|}}\, 2^{n\gamma_1}\, N^{\frac{\epsilon}{4} |\gamma|} = \|a\|_{C^{|\gamma|}}\, N^{t\gamma_1}\, N^{\frac{\epsilon}{4} |\gamma|}.$$

Thus, the derivatives of $a_n$ scale as those of an $N$-dependent function in the space $S_{\alpha_t}(\mathbb{T}^2)$, where $\alpha_t := (t + \epsilon/4, \epsilon/4)$. As in the former subsection, we must take $\sigma = 2^n = N^t$ to minimise the remainder. The second part of Proposition 8 applied to a function in $S_{\alpha_t}(\mathbb{T}^2)$ yields a “small parameter” $\hbar_{\alpha_t}(N, 2^n) = N^{t + \epsilon/2 - 1}$, so that the difference between the two quantisations of $a_n$ is bounded as

$$\big\| \mathrm{Op}_N^W(a_n) - \mathrm{Op}_N^{AW,2^n}(a_n) \big\| \lesssim \|a\|_{C^4}\, N^{t + \epsilon/2 - 1}.$$

Similar considerations using (5.29) show that $\| \mathrm{Op}_N^W(a_n \circ B^{-n}) - \mathrm{Op}_N^{AW,2^{-n}}(a_n \circ B^{-n}) \|$ is bounded by the same quantity. The argument of the exponential in Eq. (5.26) takes the value $N\theta = N\delta^2/2^n \ge N^{1 - t - \epsilon/2}$, so that the full estimate reads:

$$\big\| \hat B_N^{\,n}\, \mathrm{Op}_N^W(a_n)\, \hat B_N^{-n} - \mathrm{Op}_N^W(a_n \circ B^{-n}) \big\| \lesssim \|a\|_{C^0}\, N^{3/2}\, e^{-\pi N^{1 - t - \epsilon/2}} + \frac{\|a\|_{C^4}}{N^{1 - t - \epsilon/2}}.$$
We obtain the bound (5.31) uniform in $n$ by noticing that for the time window we consider, $N^{1-t-\epsilon/2} \ge N^{\epsilon/2}$.

Our reason for believing that this estimate is “optimal” lies in Remark 5: we evolve states which stay away from the discontinuity set $S_1$ along their evolution. Since any state satisfies $\Delta q\, \Delta p \gtrsim \hbar/2$ due to Heisenberg’s uncertainty principle, and $\Delta q$ doubles at each time step, it is impossible for such a state to remain away from $S_1$ during a time window larger than $T_E$. Besides, at the time $T_E$ the “good part” $a_n$ oscillates on a scale $\approx \hbar$ in the $q$ direction, so it behaves more like a Fourier integral operator than an observable (pseudo-differential operator).

6. Quantum Ergodicity

For any even $N$, we denote by $\{\varphi_{N,j}\}$ the eigenvectors of $\hat B_N$ (if some eigenvalues happen to be degenerate, which seems to be ruled out by numerical simulations, take an arbitrary orthonormal eigenbasis). Let us consider a fixed real-valued observable $a \in C^\infty(\mathbb{T}^2)$ satisfying $\int_{\mathbb{T}^2} a(x)\, dx = 0$. Quantum ergodicity follows if we prove that the quantum variance

$$S_2(a, N) = \frac{1}{N} \sum_{j=1}^{N} \big| \langle \varphi_{N,j}, \mathrm{Op}^W_N(a)\, \varphi_{N,j} \rangle \big|^2 \xrightarrow{\ N \to \infty\ } 0. \tag{6.1}$$
One method to prove this limit for our quantised baker’s map would be to apply the methods of [MO’K]: one only needs the Egorov property (Theorem 12) for finite times n, and the classical ergodicity of B. However, this method seems unable to give information about the rate of decay of the variance. In order to prove the upper bound stated in Theorem 1, we will rather adapt the method used in [Zel2, Schu2] to our discontinuous map. This method requires the correlation functions of the classical map to decay sufficiently fast, which is the case here (Eq. 2.5). Proof of Theorem 1. To begin with, we consider the function
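$$g(x) := 2\, \frac{1 - \cos x}{x^2}$$

and its Fourier transform

$$\hat g(k) = \int_{-\infty}^{\infty} g(x)\, e^{-2\pi i k x}\, dx = \begin{cases} 2\pi (1 - |k|), & \text{for } -1 \le k \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

For any $T \ge 1$, we use it to construct the following periodic function:

$$f_T(\theta) := \sum_{m \in \mathbb{Z}} g(T(\theta + m)).$$

$f_T$ admits the Fourier decomposition $f_T(\theta) = \sum_{k \in \mathbb{Z}} \hat f_T(k)\, e^{2\pi i k \theta}$, where

$$\hat f_T(k) = \begin{cases} \dfrac{2\pi}{T} \Big( 1 - \dfrac{|k|}{T} \Big) & \text{for } -T \le k \le T, \\ 0 & \text{for } |k| > T. \end{cases} \tag{6.2}$$

Using this function, one may easily prove the following lemma [Schu2].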
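Lemma 14. With notations described above, for any even $N \ge 2$ and $T \ge 1$ one has

$$S_2(a, N) \le \frac{1}{N} \sum_{n \in \mathbb{Z}} \hat f_T(n)\, \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a)\, \hat B_N^{-n} \Big).$$

Notice that the terms in the sum on the RHS vanish for $|n| > T$.

Proof. Let $\{\varphi_j\}$ be the eigenbasis of $\hat B_N$, with $\hat B_N \varphi_j = e^{2\pi i \theta_j} \varphi_j$. Then one has

$$\mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a)\, \hat B_N^{-n} \Big) = \sum_{j,k=0}^{N-1} e^{2\pi i n (\theta_k - \theta_j)}\, \big| \langle \mathrm{Op}^W_N(a) \varphi_j, \varphi_k \rangle \big|^2.$$

Multiplying by $\hat f_T(n)$ and summing over $n$, we get

$$\begin{aligned} \sum_{n \in \mathbb{Z}} \hat f_T(n)\, \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a)\, \hat B_N^{-n} \Big) &= \sum_{j,k=0}^{N-1} f_T(\theta_k - \theta_j)\, \big| \langle \mathrm{Op}^W_N(a) \varphi_j, \varphi_k \rangle \big|^2 \\ &= \sum_{j=0}^{N-1} f_T(0)\, \big| \langle \mathrm{Op}^W_N(a) \varphi_j, \varphi_j \rangle \big|^2 + \sum_{j \ne k} f_T(\theta_k - \theta_j)\, \big| \langle \mathrm{Op}^W_N(a) \varphi_j, \varphi_k \rangle \big|^2 \\ &\ge N\, S_2(a, N). \end{aligned}$$

The final inequality follows from the positivity of $f_T$ and the property $f_T(0) \ge 1$.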
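The mechanism of Lemma 14 can be checked numerically on random matrices. The sketch below is only an illustration under stated assumptions: it replaces $\mathrm{Op}^W_N(a)$ by a random Hermitian matrix `A`, replaces $\hat B_N$ by a random unitary `U` built from a known eigenbasis `V` and eigenphases `theta`, and swaps $f_T$ for a normalised Fejér kernel with coefficients $c_n = (1 - |n|/T)/T$ for $|n| < T$ (non-negative, equal to 1 at $\theta = 0$, with the same triangular coefficient profile). All names are hypothetical.

```python
import numpy as np

def fejer_weights(T):
    """Coefficients c_n = (1 - |n|/T)/T for |n| < T of a normalised Fejer kernel.

    The kernel sum_n c_n exp(2 pi i n theta) is non-negative and equals 1 at theta = 0.
    """
    n = np.arange(-T + 1, T)
    return n, (1.0 - np.abs(n) / T) / T

def variance_and_bound(A, V, theta, T):
    """Return (S2, bound) for Hermitian A and U = V diag(exp(2 pi i theta)) V^dagger."""
    N = A.shape[0]
    # Multiply column j of V by the j-th eigenvalue, then close with V^dagger.
    U = (V * np.exp(2j * np.pi * theta)) @ V.conj().T
    # Diagonal matrix elements <phi_j, A phi_j> in the eigenbasis of U.
    diag = np.einsum("ij,ik,kj->j", V.conj(), A, V)
    S2 = np.sum(np.abs(diag) ** 2) / N
    total = 0.0
    for n, c in zip(*fejer_weights(T)):
        P = np.linalg.matrix_power(U, abs(int(n)))
        Un = P if n >= 0 else P.conj().T      # U^{-|n|} = (U^{|n|})^dagger
        total += c * np.trace(A @ Un @ A @ Un.conj().T).real
    return S2, total / N

rng = np.random.default_rng(0)
N, T = 8, 5
X = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
A = (X + X.conj().T) / 2                       # stand-in for Op_N^W(a)
V, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
theta = rng.uniform(size=N)                    # eigenphases of the stand-in map
S2, bound = variance_and_bound(A, V, theta, T)
print(S2, bound)
```

The off-diagonal terms enter the bound with non-negative weights, which is why the weighted sum of traces dominates the diagonal sum for any choice of `A`, `V` and `theta`.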
To prove the theorem we will estimate the traces appearing in Lemma 14. Due to the support properties of $\hat f_T$, only the terms with $n \in [-T, T]$ will be needed. We take the time $T$ depending on $N$, precisely as

$$T = T(N) := \frac{T_E}{11},$$

where $T_E$ is the Ehrenfest time (1.2). For each $n \in \mathbb{Z} \cap [-T, T]$, we will apply the Egorov Theorem 12. We first decompose $a$ into a “good” part $a_n$ and “bad” part $a_n^{\mathrm{bad}}$, as described in Definition 5:

$$a = a_n + a_n^{\mathrm{bad}}, \qquad a_n := a \cdot \chi_{\delta,n}. \tag{6.3}$$

We let the width $\delta > 0$ depend on $N$ as $\delta \asymp (\log N)^{-1}$. Therefore, for any $n \in [-T, T]$ we will have $2^{|n|}/\delta \lesssim N^{1/10}$. As a result, the bounds (5.28) for the derivatives of $a_n$ read:

$$\forall n \in \mathbb{Z} \cap [-T, T], \qquad \| \partial^\gamma a_n \|_{C^0} \lesssim_\gamma \|a\|_{C^{|\gamma|}}\, N^{|\gamma|/10}. \tag{6.4}$$

Furthermore, the same bounds are satisfied by the derivatives of $a_n^{\mathrm{bad}}$ and $a_n \circ B^{-n}$. We decompose the traces of Lemma 14 according to the splitting (6.3):

$$\mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a)\, \hat B_N^{-n} \Big) = \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a_n)\, \hat B_N^{-n} \Big) + \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \Big). \tag{6.5}$$
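The second term in the RHS will be controlled by replacing $\mathrm{Op}^W_N(a_n^{\mathrm{bad}})$ by its anti-Wick quantisation:

$$\mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \Big) = \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^{AW,1}_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \Big) + R_N(n). \tag{6.6}$$

The remainder $R_N(n)$ is dealt with using part I of Proposition 8, together with the bounds (6.4) applied to $a_n^{\mathrm{bad}}$:

$$R_N(n) \le \| \mathrm{Op}^W_N(a) \|\, \big\| \mathrm{Op}^W_N(a_n^{\mathrm{bad}}) - \mathrm{Op}^{AW,1}_N(a_n^{\mathrm{bad}}) \big\| \lesssim \| \mathrm{Op}^W_N(a) \|\, \frac{\| a_n^{\mathrm{bad}} \|_{C^5}}{N} \lesssim \| \mathrm{Op}^W_N(a) \|\, \frac{\|a\|_{C^5}}{N^{1/2}}. \tag{6.7}$$

In order to compute $\mathrm{Tr}\big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^{AW,1}_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \big)$, we split the function $a_n^{\mathrm{bad}}$ into its positive and negative parts, $a_n^{\mathrm{bad}} = a_{n,+}^{\mathrm{bad}} - a_{n,-}^{\mathrm{bad}}$, where $a_{n,\pm}^{\mathrm{bad}} \ge 0$. We then use the following (standard) linear algebra lemma to estimate the trace: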
Lemma 15. Let $A$, $B$ be self-adjoint operators on $\mathcal{H}_N$, and assume $B$ is positive. Then

$$|\mathrm{Tr}(AB)| \le \|A\|\, \mathrm{Tr}(B). \tag{6.8}$$
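A quick numerical sanity check of the inequality (6.8), as a minimal sketch on random matrices (the function name and test matrices are illustrative choices, not part of the paper):

```python
import numpy as np

def trace_bound(A, B):
    """Return (|Tr(AB)|, ||A|| Tr(B)) for self-adjoint A and positive B."""
    lhs = abs(np.trace(A @ B))
    rhs = np.linalg.norm(A, 2) * np.trace(B).real   # ord=2 is the operator norm
    return lhs, rhs

rng = np.random.default_rng(1)
n = 6
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (X + X.conj().T) / 2          # self-adjoint
Y = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = Y @ Y.conj().T                # positive semi-definite by construction
lhs, rhs = trace_bound(A, B)
print(lhs, rhs)
```

Writing $B = \sum_i b_i |e_i\rangle\langle e_i|$ with $b_i \ge 0$ gives $\mathrm{Tr}(AB) = \sum_i b_i \langle e_i, A e_i \rangle$, and each diagonal element is bounded by $\|A\|$ in absolute value.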
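Since the anti-Wick operator $\mathrm{Op}^{AW,1}_N(a_{n,+}^{\mathrm{bad}})$ is positive, this lemma yields:

$$\Big| \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^{AW,1}_N(a_{n,+}^{\mathrm{bad}})\, \hat B_N^{-n} \Big) \Big| \le \| \mathrm{Op}^W_N(a) \|\, \mathrm{Tr}\Big( \mathrm{Op}^{AW,1}_N(a_{n,+}^{\mathrm{bad}}) \Big),$$

and similarly by replacing $a_{n,+}^{\mathrm{bad}}$ by $a_{n,-}^{\mathrm{bad}}$. By linearity and $a_{n,+}^{\mathrm{bad}} + a_{n,-}^{\mathrm{bad}} = |a_n^{\mathrm{bad}}|$, we get

$$\Big| \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^{AW,1}_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \Big) \Big| \le \| \mathrm{Op}^W_N(a) \|\, \mathrm{Tr}\Big( \mathrm{Op}^{AW,1}_N(|a_n^{\mathrm{bad}}|) \Big).$$

From Eq. (5.10), the trace on the RHS is equal to $N \cdot \| a_n^{\mathrm{bad}} \|_{L^1(\mathbb{T}^2)} \big( 1 + O(e^{-\pi N/2}) \big)$. Since $a_n^{\mathrm{bad}}$ is supported on a neighbourhood of $S_n$ of area $O(\delta)$, its $L^1$ norm is of order $O(\delta \|a\|_{C^0})$. Using the Calderón-Vaillancourt estimate $\| \mathrm{Op}^W_N(a) \| \le C \|a\|_{C^2}$, we have thus proven the following bound for the second term in (6.5):

$$\frac{1}{N} \Big| \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a_n^{\mathrm{bad}})\, \hat B_N^{-n} \Big) \Big| \lesssim \|a\|_{C^2} \Big( \delta\, \|a\|_{C^0} + \frac{\|a\|_{C^5}}{N^{1/2}} \Big). \tag{6.9}$$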
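We now estimate the first term in (6.5). We write

$$\mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \hat B_N^{n}\, \mathrm{Op}^W_N(a_n)\, \hat B_N^{-n} \Big) = \mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \mathrm{Op}^W_N(a_n \circ B^{-n}) \Big) + R_N(n), \tag{6.10}$$

and control the remainder $R_N(n)$ with the Egorov estimate (5.30), remembering that $n \le T_E/11$:

$$R_N(n) \lesssim \| \mathrm{Op}^W_N(a) \| \Big( \|a\|_{C^0}\, N^{5/4}\, 2^{n/4}\, e^{-\pi N \delta^2/2^n} + \frac{2^{6n}\, \|a\|_{C^5}}{N \delta^5} \Big) \lesssim \frac{\|a\|^2_{C^5}}{N^{2/5}}. \tag{6.11}$$

The following lemma (proved in [MO’K, Lemma 3.1]) will allow us to replace the quantum product by a classical one.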
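Lemma 16. There exists $C > 0$ such that, for any pair $a, b \in C^\infty(\mathbb{T}^2)$,

$$\forall N \ge 1, \qquad \big\| \mathrm{Op}^W_N(a)\, \mathrm{Op}^W_N(b) - \mathrm{Op}^W_N(ab) \big\| \le C\, \frac{\|a\|_{C^4}\, \|b\|_{C^4}}{N}. \tag{6.12}$$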
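Using this lemma and the bounds (6.4), we get

$$\mathrm{Tr}\Big( \mathrm{Op}^W_N(a)\, \mathrm{Op}^W_N(a_n \circ B^{-n}) \Big) = \mathrm{Tr}\Big( \mathrm{Op}^W_N\big( a\, (a_n \circ B^{-n}) \big) \Big) + R_N(n), \quad \text{with} \quad R_N(n) \lesssim \frac{\|a\|_{C^4}\, \| a_n \circ B^{-n} \|_{C^4}}{N} \lesssim \frac{\|a\|^2_{C^4}}{N^{1/5}}. \tag{6.13}$$

To finally estimate the trace of $\mathrm{Op}^W_N\big( a\, (a_n \circ B^{-n}) \big)$, we use Eq. (5.9) together with the estimates (6.4):

$$\frac{1}{N}\, \mathrm{Tr}\Big( \mathrm{Op}^W_N\big( a\, (a_n \circ B^{-n}) \big) \Big) = \int_{\mathbb{T}^2} a\, (a_n \circ B^{-n})(x)\, dx + O\Big( \frac{\|a\|^2_{C^3}}{N^2} \Big).$$

It remains to compute the integral on the RHS. We split it in two integrals, according to $a_n = a - a_n^{\mathrm{bad}}$. The second integral can be bounded by

$$\Big| \int_{\mathbb{T}^2} a(x)\, a_n^{\mathrm{bad}}(B^{-n} x)\, dx \Big| \le \|a\|_{C^0}\, \| a_n^{\mathrm{bad}} \|_{L^1} \lesssim \|a\|^2_{C^0}\, \delta, \tag{6.14}$$

while the first one reads

$$\int_{\mathbb{T}^2} a(x)\, a(B^{-n} x)\, dx = K_{aa}(n). \tag{6.15}$$

This integral is the classical autocorrelation function for the observable $a(x)$, a purely classical quantity. At this point we must use the dynamical properties of the classical baker’s map $B$, namely its fast mixing properties (see the end of Sect. 2): for some $\Gamma > 0$, the autocorrelation decays (when $n \to \infty$) as $K_{aa}(n) \lesssim \|a\|^2_{C^1}\, e^{-\Gamma |n|}$. Collecting all terms and using the properties of the function $\hat f_T$, Lemma 14 finally yields the following upper bound:

$$S_2(a, N) \lesssim \|a\|^2_{C^5} \sum_{n \in [-T, T]} |\hat f_T(n)| \Big( e^{-\Gamma |n|} + \delta + \frac{1}{N^{1/5}} \Big) \lesssim \|a\|^2_{C^5} \Big( \frac{1}{T} + \delta \Big).$$

Since we took $T \asymp \log N$ and $\delta \asymp (\log N)^{-1}$, this concludes the proof of Theorem 1.

Proof of Corollary 2. We start by picking an observable $a \in C^\infty(\mathbb{T}^2)$, assuming $\int a(x)\, dx = 0$. For any decreasing sequence $\alpha(N) \xrightarrow{N \to \infty} 0$, Chebychev’s inequality yields an upper bound on the number of eigenvectors of $\hat B_N$ for which $|\langle \varphi_{N,j}, \mathrm{Op}^W_N(a) \varphi_{N,j} \rangle| > \alpha(N)$:

$$\frac{\# \big\{ j \in \{1, \ldots, N\} : |\langle \varphi_{N,j}, \mathrm{Op}^W_N(a)\, \varphi_{N,j} \rangle| > \alpha(N) \big\}}{N} \le \frac{S_2(a, N)}{\alpha(N)^2}. \tag{6.16}$$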
From Theorem 1, if we take $\alpha(N) \gg (\log N)^{-1/2}$, the above fraction converges to zero. Defining $J_N(a)$ as the complement of the set in the above numerator, we obtain a sequence of subsets $J_N(a) \subset \{1, \ldots, N\}$ satisfying $\frac{\# J_N(a)}{N} \to 1$, such that the eigenstates $\varphi_{N,j_N}$ with $j_N \in J_N(a)$ satisfy (1.3). Using a standard diagonal argument [CdV, HMR, Zel1], one can then extract subsets $J_N \subset \{1, \ldots, N\}$ independent of the observable $a \in C^\infty(\mathbb{T}^2)$, with $\frac{\# J_N}{N} \to 1$, such that (1.3) is satisfied for any $a \in C^\infty(\mathbb{T}^2)$ if one takes $j_N \in J_N$.

Acknowledgement. We are grateful to R. Schubert for communicating to us his results [Schu2] prior to publication, and for interesting comments. We also thank S. De Bièvre, M. Saraceno, N. Anantharaman, A. Martinez and S. Graffi for interesting discussions and comments. This work has been partially supported by the European Commission under the Research Training Network (Mathematical Aspects of Quantum Chaos) HPRN-CT-2000-00103 of the IHP Programme.
References

[ALPZ] Alicki, R., Lozinski, A., Pakonski, P., Życzkowski, K.: Quantum dynamical entropy and decoherence rate. J. Phys. A 37, 5157–5172 (2004)
[AA] Arnol’d, V.I., Avez, A.: Problèmes ergodiques de la mécanique classique. Paris: Gauthier-Villars, 1967
[BSS] Bäcker, A., Schubert, R., Stifter, P.: Rate of quantum ergodicity in Euclidean billiards. Phys. Rev. E 57, 5425–5447; Erratum ibid. 58, 5192 (1998)
[BGP] Bambusi, D., Graffi, S., Paul, T.: Long time semiclassical approximation of quantum flows: a proof of the Ehrenfest time. Asymptot. Anal. 21, 149–160 (1999)
[Bar] Barnett, A.: Asymptotic rate of quantum ergodicity in chaotic Euclidean billiards. Submitted to Comm. Pure Appl. Math., 2004, http://www.cims.nyu.edu/~barnett/papers/q.pdf, 2004
[BdM+] Basilio de Matos, M., Ozorio de Almeida, A.M.: Quantization of Anosov maps. Ann. Phys. 237, 46–65 (1995)
[BonDB] Bonechi, F., De Bièvre, S.: Exponential mixing and ln ħ timescales in quantized hyperbolic maps on the torus. Commun. Math. Phys. 211, 659–686 (2000)
[Boul] Boulkhemair, A.: $L^2$ estimates for Weyl quantization. J. Funct. Anal. 165, 173–204 (1999)
[BDB] Bouzouina, A., De Bièvre, S.: Equipartition of the eigenfunctions of quantized ergodic maps on the torus. Commun. Math. Phys. 178, 83–105 (1996)
[BR] Bouzouina, A., Robert, D.: Uniform semi-classical estimates for the propagation of quantum observables. Duke Math. J. 111, 223–252 (2002)
[BV] Balazs, N.L., Voros, A.: The quantized baker’s transformation. Ann. Phys. 190, 1–31 (1989)
[Ch] Chernov, N.I.: Ergodic and statistical properties of piecewise linear hyperbolic automorphisms of the 2-torus. J. Stat. Phys. 69, 111–134 (1992)
[CdV] Colin de Verdière, Y.: Ergodicité et fonctions propres du Laplacien. Commun. Math. Phys. 102, 497–502 (1985)
[DBDE] De Bièvre, S., Degli Esposti, M.: Egorov theorems and equidistribution of eigenfunctions for the quantized sawtooth and Baker maps. Annales de l’Institut H. Poincaré, Phys. Theor. 69, 1–30 (1998)
[DEG] Degli Esposti, M., Graffi, S.: Mathematical aspects of quantum maps. In: M. Degli Esposti, S. Graffi (eds), The mathematical aspects of quantum maps, Volume 618 of Lecture Notes in Physics, Berlin-Heidelberg-New York: Springer, 2003, pp. 49–90
[DEGI] Degli Esposti, M., Graffi, S., Isola, S.: Classical limit of the quantized hyperbolic toral automorphism. Commun. Math. Phys. 167, 471–507 (1995)
[DE+] Degli Esposti, M., O’Keefe, S., Winn, B.: A semi-classical study of the Casati-Prosen triangle map. Nonlinearity 18, 1073–1094 (2005)
[DS] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. Cambridge: Cambridge University Press, 1999
[EFK+] Eckhardt, B., Fishman, S., Keating, J.P., Agam, O., Main, J., Müller, K.: Approach to ergodicity in quantum wave functions. Phys. Rev. E 52, 5893–5903 (1995)
[Fa] Farris, M.: Egorov’s theorem on a manifold with diffractive boundary. Commun. Partial Differ. Eqs. 6, 651–687 (1981)
[FP] Feingold, M., Peres, A.: Distribution of matrix elements of chaotic systems. Phys. Rev. A 34, 591–595 (1986)
[FDBN] Faure, F., Nonnenmacher, S., De Bièvre, S.: Scarred eigenstates for quantum cat maps of minimal periods. Commun. Math. Phys. 239, 449–492 (2003)
[Fo] Folland, G.B.: Harmonic analysis in phase space, The Annals of Mathematics Studies 122, Princeton, NJ: Princeton University Press, 1989
[GL] Gérard, P., Leichtnam, É.: Ergodic properties of eigenfunctions for the Dirichlet problem. Duke Math. J. 71, 559–607 (1993)
[HB] Hannay, J.H., Berry, M.V.: Quantisation of linear maps on the torus – Fresnel diffraction by a periodic grating. Physica D 1, 267–290 (1980)
[Has] Hasegawa, H.H., Saphir, W.C.: Unitarity and irreversibility in chaotic systems. Phys. Rev. A 46, 7401–7423 (1992)
[HMR] Helffer, B., Martinez, A., Robert, D.: Ergodicité et limite semi-classique. Commun. Math. Phys. 109, 313–326 (1987)
[Kap] Kaplan, L., Heller, E.J.: Linear and nonlinear theory of eigenfunction scars. Ann. Phys. (NY) 264, 171–206 (1998)
[KM] Keating, J.P., Mezzadri, F.: Pseudo-symmetries of Anosov maps and spectral statistics. Nonlinearity 13, 747–775 (2000)
[KR1] Kurlberg, P., Rudnick, Z.: Hecke theory and equidistribution for the quantization of linear maps of the torus. Duke Math. J. 103, 47–77 (2001)
[KR2] Kurlberg, P., Rudnick, Z.: On quantum ergodicity for linear maps of the torus. Commun. Math. Phys. 222, 201–227 (2001)
[KR3] Kurlberg, P., Rudnick, Z.: On the distribution of matrix elements for the quantum cat map. Ann. Math. 161, 489–507 (2005)
[Lak] Lakshminarayan, A.: On the quantum baker’s map and its unusual traces. Ann. Phys. (NY) 239, 272–295 (1995)
[LV] Lebœuf, P., Voros, A.: Chaos revealing multiplicative representation of quantum eigenstates. J. Phys. A 23, 1765–1774 (1990)
[Lin] Lindenstrauss, E.: Invariant measures and arithmetic quantum unique ergodicity. Ann. Math. 163, 165–219 (2006)
[LS] Luo, W., Sarnak, P.: Quantum variance for Hecke eigenforms. Ann. Sci. Ecole Norm. Sup. 37, 769–799 (2004)
[MO’K] Marklof, J., O’Keefe, S.: Weyl’s law and quantum ergodicity for maps with divided phase space; Appendix by Zelditch, S.: Converse quantum ergodicity. Nonlinearity 18, 277–304 (2005)
[MR] Marklof, J., Rudnick, Z.: Quantum unique ergodicity for parabolic maps. Geom. Func. Anal. 10, 1554–1578 (2000)
[Ma] Martinez, A.: An introduction to semiclassical and microlocal analysis. Berlin-Heidelberg-New York: Springer-Verlag, 2002
[O’CTH] O’Connor, P.W., Tomsovic, S., Heller, E.J.: Accuracy of semiclassical dynamics in the presence of chaos. J. Stat. Phys. 68, 131–152 (1992)
[Per] Perelomov, A.M.: Generalized coherent states and their applications. Heidelberg: Springer Verlag, 1986
[Rob] Robert, D.: Remarks on time dependent Schrödinger equation, bound states and coherent states. In: Multiscale methods in quantum mechanics, Trends Maths, Boston: Birkhäuser 2004, pp. 139–158
[Ros] Rosenzweig, L.: Quantum unique ergodicity for maps on $\mathbb{T}^2$. M.Sc. Thesis, Tel Aviv University, 2004
[RubSal] Rubin, R., Salwen, N.: A canonical quantization of the Baker’s Map. Ann. Phys. (NY) 269, 159–181 (1998)
[RudSar] Rudnick, Z., Sarnak, P.: The behaviour of eigenstates of arithmetic hyperbolic manifolds. Commun. Math. Phys. 161, 195–213 (1994)
[RuSo] Rudnick, Z., Soundararajan, K.: In preparation, 2004
[Sa] Saraceno, M.: Classical structures in the quantized baker transformation. Ann. Phys. (NY) 199, 37–60 (1990)
[SaVo] Saraceno, M., Voros, A.: Towards a semiclassical theory of the quantum baker’s map. Physica D 79, 206–268 (1994)
[Sar1] Sarnak, P.: Spectra of hyperbolic surfaces. Bull. Amer. Math. Soc. 40, 441–478 (2003)
[Sar2] Sarnak, P.: Quantum versus classical fluctuations on the modular surface. Talk given at the meeting: “Random Matrix Theory and Arithmetic Aspects of Quantum Chaos” at the Isaac Newton Institute, Cambridge, June 2004. Audio file available at http://www.newton.cam.ac.uk/webseminars/
[Schu1] Schubert, R.: Semiclassical localization in phase space. Ph.D. Thesis, Universität Ulm, 2001. Available at http://vts.uni-ulm.de
[Schu2] Schubert, R.: Upper bounds on the rate of quantum ergodicity. Preprint 2005, http://arXiv.org/list/math-ph/0503045, 2005
[Šni] Šnirel’man, A.I.: Ergodic properties of eigenfunctions. Usp. Mat. Nauk. 29, 181–182 (1974)
[Wil] Wilkinson, M.: A semiclassical sum rule for matrix elements of classically chaotic systems. J. Phys. A 9, 2415–2423 (1987)
[Zel1] Zelditch, S.: Uniform distribution of eigenfunctions on compact hyperbolic surfaces. Duke Math. J. 55, 919–941 (1987)
[Zel2] Zelditch, S.: On the rate of quantum ergodicity. I. Upper bounds. Commun. Math. Phys. 160, 81–92 (1994)
[ZZ] Zelditch, S., Zworski, M.: Ergodicity of eigenfunctions for ergodic billiards. Commun. Math. Phys. 175, 673–682 (1996)
Communicated by P. Sarnak
Commun. Math. Phys. 263, 353–380 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1507-2
Communications in
Mathematical Physics
A Canonical Ensemble Approach to the Fermion/Boson Random Point Processes and Its Applications
H. Tamura1, K.R. Ito2
1 Department of Mathematics, Kanazawa University, Kanazawa 920-1192, Japan. E-mail: [email protected]
2 Department of Mathematics and Physics, Setsunan University, Neyagawa, Osaka 572-8508, Japan. E-mail: [email protected]; [email protected]
Received: 1 February 2005 / Accepted: 28 July 2005
Published online: 3 February 2006 – © Springer-Verlag 2006
Abstract: We introduce the boson and the fermion point processes from the elementary quantum mechanical point of view. That is, we consider quantum statistical mechanics of the canonical ensemble for a fixed number of particles which obey Bose-Einstein, Fermi-Dirac statistics, respectively, in a finite volume. Focusing on the distribution of positions of the particles, we have point processes of the fixed number of points in a bounded domain. By taking the thermodynamic limit such that the particle density converges to a finite value, the boson/fermion processes are obtained. This argument is a realization of the equivalence of ensembles, since resulting processes are considered to describe a grand canonical ensemble of points. Random point processes corresponding to para-particles of order two are discussed as an application of the formulation. Statistics of a system of composite particles at zero temperature are also considered as a model of determinantal random point processes.
1. Introduction

As special classes of random point processes, fermion point processes and boson point processes have been studied by many authors since [1, 16, 17]. Among them, [8, 6, 9, 7] made a correspondence between boson processes and locally normal states on the $C^*$-algebra of operators on the boson Fock space. A functional integral method is used in [15] to obtain these processes from quantum field theories of finite temperatures. On the other hand, [23] formulated both the fermion and boson processes in a unified way in terms of the Laplace transformation and generalized them. Let $Q(R)$ be the space of all the locally finite configurations over a Polish space $R$ and $K$ a locally trace class integral operator on $L^2(R)$ with a Radon measure $\lambda$ on $R$. For any nonnegative function $f$ having bounded support and $\xi = \sum_j \delta_{x_j} \in Q(R)$, we set $\langle \xi, f \rangle = \sum_j f(x_j)$. Shirai and Takahashi [23] have formulated and studied the random processes $\mu_{\alpha,K}$ which have Laplace transformations

$$E\big[ e^{-\langle \xi, f \rangle} \big] \equiv \int_{Q(R)} \mu_{\alpha,K}(d\xi)\, e^{-\langle \xi, f \rangle} = \mathrm{Det}\Big( I + \alpha \sqrt{1 - e^{-f}}\, K \sqrt{1 - e^{-f}} \Big)^{-1/\alpha} \tag{1.1}$$

for the parameters $\alpha \in \{2/m;\ m \in \mathbb{N}\} \cup \{-1/m;\ m \in \mathbb{N}\}$. Here the cases $\alpha = \pm 1$ correspond to boson/fermion processes, respectively. In their argument, the generalized Vere-Jones’ formula [29]

$$\mathrm{Det}(1 - \alpha J)^{-1/\alpha} = \sum_{n=0}^{\infty} \frac{1}{n!} \int_{R^n} \det{}_\alpha \big( J(x_i, x_j) \big)_{i,j=1}^{n}\, \lambda^{\otimes n}(dx_1 \cdots dx_n) \tag{1.2}$$

has played an essential role. Here $J$ is a trace class integral operator, for which we need the condition $\|\alpha J\| < 1$ unless $-1/\alpha \in \mathbb{N}$, $\mathrm{Det}(\,\cdot\,)$ the Fredholm determinant and $\det_\alpha A$ the $\alpha$-determinant defined by

$$\det{}_\alpha A = \sum_{\sigma \in S_n} \alpha^{n - \nu(\sigma)} \prod_{i} A_{i \sigma(i)} \tag{1.3}$$

for a matrix $A$ of size $n \times n$, where $\nu(\sigma)$ is the number of cycles in $\sigma$. The formula (1.2) is Fredholm’s original definition of his functional determinant in the case $\alpha = -1$.

The purpose of the paper is to construct both the fermion and boson processes from a viewpoint of elementary quantum mechanics in order to get a simple, clear and straightforward understanding of them in connection with physics. Let us consider the system of $N$ free fermions/bosons in a box of finite volume $V$ in $\mathbb{R}^d$ and the quantum statistical mechanical state of the system with a finite temperature. Giving the distribution function of the positions of all particles in terms of the square of the absolute value of the wave functions, we obtain a point process of $N$ points in the box. As the thermodynamic limit, $N, V \to \infty$ and $N/V \to \rho$, of these processes of finite points, fermion and boson processes in $\mathbb{R}^d$ with density $\rho$ are obtained. In the argument, we will use the generalized Vere-Jones’ formula in the form:

$$\frac{1}{N!} \int \det{}_\alpha \big( J(x_i, x_j) \big)_{i,j=1}^{N}\, \lambda^{\otimes N}(dx_1 \cdots dx_N) = \oint_{S_r(0)} \frac{dz}{2\pi i z^{N+1}}\, \mathrm{Det}(1 - z \alpha J)^{-1/\alpha}, \tag{1.4}$$

where $r > 0$ is arbitrary for $-1/\alpha \in \mathbb{N}$, otherwise $r$ should satisfy $\| r \alpha J \| < 1$. Here and hereafter, $S_r(\zeta)$ denotes the integration contour defined by the map $\theta \mapsto \zeta + r \exp(i\theta)$, where $\theta$ ranges from $-\pi$ to $\pi$, $r > 0$ and $\zeta \in \mathbb{C}$. In the terminology of statistical mechanics, we start from the canonical ensemble and end up with formulae like (1.1) and (1.2) of a grand canonical nature. In this sense, the argument is related to the equivalence of ensembles. The use of (1.4) makes our approach simple. The thermodynamic limit has been discussed in [11] and [18] in the contexts of local current algebras for boson and fermion gases respectively at zero temperature. In our approach, we need neither quantum field theories nor the theory of states on the operator algebras to derive the boson/fermion processes.
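To make the definition (1.3) concrete, here is a minimal brute-force sketch of the $\alpha$-determinant. The function names are illustrative, and the sum over all permutations is only practical for small matrices. It shows that $\alpha = -1$ reproduces the usual determinant, since $(-1)^{n - \nu(\sigma)} = \mathrm{sgn}(\sigma)$, and that $\alpha = 1$ gives the permanent.

```python
from itertools import permutations
import numpy as np

def cycle_count(perm):
    """Number of cycles nu(sigma) of a permutation given as a tuple of images."""
    seen = [False] * len(perm)
    count = 0
    for start in range(len(perm)):
        if not seen[start]:
            count += 1
            j = start
            while not seen[j]:      # walk once around the cycle containing start
                seen[j] = True
                j = perm[j]
    return count

def alpha_det(A, alpha):
    """Brute-force alpha-determinant: sum over sigma of alpha^(n - nu(sigma)) prod_i A[i, sigma(i)]."""
    n = A.shape[0]
    total = 0.0
    for perm in permutations(range(n)):
        term = alpha ** (n - cycle_count(perm))
        for i in range(n):
            term = term * A[i, perm[i]]
        total += term
    return total

M = np.array([[1.0, 2.0], [3.0, 4.0]])
print(alpha_det(M, -1))   # usual determinant: 1*4 - 2*3
print(alpha_det(M, 1))    # permanent: 1*4 + 2*3
```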
It is interesting to apply the method to the problems which have not been formulated in statistical mechanics on quantum field theories yet. Here, we study the system of para-fermions and para-bosons
of order 2. Para statistics was first introduced by Green [10] in the context of quantum field theories. For its review, see [21]. References [19] and [12, 27] formulated it within the framework of quantum mechanics of a finite number of particles. See also [20]. Recently statistical mechanics of para-particles are formulated in [28, 2, 3]. However, it does not seem to be fully developed so far. We formulate here point processes as the distributions of positions of para-particles of order 2 with finite temperature and positive density through the thermodynamic limit. It turns out that the resulting processes correspond to the cases $\alpha = \pm 1/2$ in [23]. We also try to derive point processes from ensembles of composite particles at zero temperature and positive density in this formalism. The resulting processes also have their Laplace transforms expressed by Fredholm determinants.

This paper is organized as follows. In Sect. 2, the random point processes of fixed numbers of fermions as well as bosons are formulated on the basis of quantum mechanics in a bounded box. Then, the theorems on thermodynamic limits are stated. The proofs of the theorems are presented in Sect. 3 as applications of a theorem of rather abstract form. In Sects. 4 and 5, we consider the systems of para-particles and composite particles, respectively. In the Appendix, we calculate complex integrals needed for the thermodynamic limits.

2. Fermion and Boson Processes

Consider $L^2(\Lambda_L)$ on $\Lambda_L = [-L/2, L/2]^d \subset \mathbb{R}^d$ with the Lebesgue measure on $\Lambda_L$. Let $\Delta_L$ be the Laplacian on $\mathcal{H}_L = L^2(\Lambda_L)$ satisfying periodic boundary conditions at $\partial \Lambda_L$. We deal with periodic boundary conditions in this paper; however, all the arguments except that in Sect. 5 may be applied to other boundary conditions. Hereafter we regard $-\Delta_L$ as the quantum mechanical Hamiltonian of a single free particle. The usual factor $\hbar^2/2m$ is set at unity. For $k \in \mathbb{Z}^d$, $\varphi_k^{(L)}(x) = L^{-d/2} \exp(i 2\pi k \cdot x / L)$ is an eigenfunction of $\Delta_L$, and $\{\varphi_k^{(L)}\}_{k \in \mathbb{Z}^d}$ forms a complete orthonormal system [CONS] of $\mathcal{H}_L$. In the following, we use the operator $G_L = \exp(\beta \Delta_L)$ whose kernel is given by

$$G_L(x, y) = \sum_{k \in \mathbb{Z}^d} e^{-\beta |2\pi k / L|^2}\, \varphi_k^{(L)}(x)\, \overline{\varphi_k^{(L)}(y)} \tag{2.1}$$

for $\beta > 0$. We put $g_k^{(L)} = \exp(-\beta |2\pi k / L|^2)$, the eigenvalue of $G_L$ for the eigenfunction $\varphi_k^{(L)}$. We also need $G = \exp(\beta \Delta)$ on $L^2(\mathbb{R}^d)$ and its kernel

$$G(x, y) = \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d}\, e^{-\beta |p|^2 + i p \cdot (x - y)} = \frac{\exp(-|x - y|^2 / 4\beta)}{(4\pi\beta)^{d/2}}.$$

Note that $G_L(x, y)$ and $G(x, y)$ are real symmetric and

$$G_L(x, y) = \sum_{k \in \mathbb{Z}^d} G(x, y + kL). \tag{2.2}$$
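The periodisation identity (2.2) is a Poisson summation statement and can be checked numerically. The following minimal sketch does so in dimension $d = 1$, comparing the eigenfunction expansion (2.1) with the sum over periodic images of the free kernel. The truncation `kmax` and the sample values are arbitrary choices for illustration.

```python
import numpy as np

def kernel_fourier(x, y, beta, L, kmax=200):
    """G_L(x, y) in d = 1 from the eigenfunction expansion (2.1), truncated at |k| <= kmax."""
    k = np.arange(-kmax, kmax + 1)
    # phi_k(x) * conj(phi_k(y)) = exp(2 pi i k (x - y) / L) / L; imaginary parts cancel in the sum.
    return np.sum(np.exp(-beta * (2 * np.pi * k / L) ** 2)
                  * np.cos(2 * np.pi * k * (x - y) / L)) / L

def kernel_images(x, y, beta, L, kmax=200):
    """G_L(x, y) in d = 1 as the sum of the free heat kernel over periodic images, as in (2.2)."""
    k = np.arange(-kmax, kmax + 1)
    return np.sum(np.exp(-(x - y - k * L) ** 2 / (4 * beta))) / np.sqrt(4 * np.pi * beta)

a = kernel_fourier(0.3, -0.7, beta=0.5, L=4.0)
b = kernel_images(0.3, -0.7, beta=0.5, L=4.0)
print(a, b)
```

Both sums converge rapidly because the terms decay like Gaussians in $k$, so a modest truncation already agrees to machine precision.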
Let $f : \mathbb{R}^d \to [0, \infty)$ be an arbitrary continuous function whose support is compact. In the course of the thermodynamic limit, $f$ is fixed and we assume that $L$ is so large that $\Lambda_L$ contains the support, and regard $f$ as a function on $\Lambda_L$.
2.1. Fermion processes. In this subsection, we construct the fermion process in $\mathbb{R}^d$ as a limit of the process of $N$ points in $\Lambda_L$. Suppose there are $N$ identical particles which obey the Fermi-Dirac statistics in a finite box $\Lambda_L$. The space of the quantum mechanical states of the system is given by

$$\mathcal{H}^F_{L,N} = \{ A_N f \mid f \in \otimes^N \mathcal{H}_L \},$$

where

$$A_N f(x_1, \ldots, x_N) = \frac{1}{N!} \sum_{\sigma \in S_N} \mathrm{sgn}(\sigma)\, f(x_{\sigma(1)}, \ldots, x_{\sigma(N)}) \qquad (x_1, \ldots, x_N \in \Lambda_L)$$

is anti-symmetrization in the $N$ indices. Using the CONS $\{\varphi_k^{(L)}\}_{k \in \mathbb{Z}^d}$ of $\mathcal{H}_L = L^2(\Lambda_L)$, we make the element

$$\Phi_k(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!}} \sum_{\sigma \in S_N} \mathrm{sgn}(\sigma)\, \varphi_{k_1}(x_{\sigma(1)}) \cdots \varphi_{k_N}(x_{\sigma(N)}) \tag{2.3}$$

of $\mathcal{H}^F_{L,N}$ for $k = (k_1, \ldots, k_N) \in (\mathbb{Z}^d)^N$. Let us introduce the lexicographic order $\prec$ in $\mathbb{Z}^d$ and put $(\mathbb{Z}^d)^N_{\precneqq} = \{ (k_1, \ldots, k_N) \in (\mathbb{Z}^d)^N \mid k_1 \precneqq \cdots \precneqq k_N \}$. Then $\{\Phi_k\}_{k \in (\mathbb{Z}^d)^N_{\precneqq}}$ forms a CONS of $\mathcal{H}^F_{L,N}$.

According to the idea of the canonical ensemble in quantum statistical mechanics, the probability density distribution of the positions of the $N$ free fermions in the periodic box $\Lambda_L$ at the inverse temperature $\beta$ is given by

$$\begin{aligned} p^F_{L,N}(x_1, \ldots, x_N) &= Z_F^{-1} \sum_{k \in (\mathbb{Z}^d)^N_{\precneqq}} \Big( \prod_{j=1}^{N} g_{k_j}^{(L)} \Big) |\Phi_k(x_1, \ldots, x_N)|^2 \\ &= Z_F^{-1} \sum_{k \in (\mathbb{Z}^d)^N_{\precneqq}} \overline{\Phi_k(x_1, \ldots, x_N)} \times \big( (\otimes^N G_L) \Phi_k \big)(x_1, \ldots, x_N), \end{aligned} \tag{2.4}$$

where $Z_F$ is the normalization constant. We can define the point process of $N$ points in $\Lambda_L$ from the density (2.4). I.e., consider a map $\Lambda_L^N \ni (x_1, \ldots, x_N) \mapsto \sum_{j=1}^{N} \delta_{x_j} \in Q(\mathbb{R}^d)$. Let $\mu^F_{L,N}$ be the probability measure on $Q(\mathbb{R}^d)$ induced by the map from the probability measure on $\Lambda_L^N$ which has the density (2.4). By $E^F_{L,N}$, we denote expectation with respect to the measure $\mu^F_{L,N}$. The Laplace transform of the point process is given by

$$\begin{aligned} E^F_{L,N}\big[ e^{-\langle \xi, f \rangle} \big] &= \int_{Q(\mathbb{R}^d)} d\mu^F_{L,N}(\xi)\, e^{-\langle \xi, f \rangle} \\ &= \int_{\Lambda_L^N} \exp\Big( -\sum_{j=1}^{N} f(x_j) \Big)\, p^F_{L,N}(x_1, \ldots, x_N)\, dx_1 \cdots dx_N \\ &= \frac{\mathrm{Tr}_{\mathcal{H}^F_{L,N}} \big[ (\otimes^N e^{-f}) (\otimes^N G_L) \big]}{\mathrm{Tr}_{\mathcal{H}^F_{L,N}} \big[ \otimes^N G_L \big]} \\ &= \frac{\mathrm{Tr}_{\otimes^N \mathcal{H}_L} \big[ (\otimes^N \tilde G_L) A_N \big]}{\mathrm{Tr}_{\otimes^N \mathcal{H}_L} \big[ (\otimes^N G_L) A_N \big]} \\ &= \frac{\int_{\Lambda_L^N} \det{}_{-1} \tilde G_L(x_i, x_j)\, dx_1 \cdots dx_N}{\int_{\Lambda_L^N} \det{}_{-1} G_L(x_i, x_j)\, dx_1 \cdots dx_N}, \end{aligned} \tag{2.5}$$

where $\tilde G_L$ is defined by

$$\tilde G_L = G_L^{1/2}\, e^{-f}\, G_L^{1/2}, \tag{2.6}$$

and $e^{-f}$ represents the operator of multiplication by the function $e^{-f}$. The fifth expression follows from $[\otimes^N G_L^{1/2}, A_N] = 0$, cyclicity of the trace and $(\otimes^N G_L^{1/2}) (\otimes^N e^{-f}) (\otimes^N G_L^{1/2}) = \otimes^N \tilde G_L$ and so on. The last expression can be obtained by calculating the trace on $\otimes^N \mathcal{H}_L$ using its CONS $\{ \varphi_{k_1} \otimes \cdots \otimes \varphi_{k_N} \mid k_1, \ldots, k_N \in \mathbb{Z}^d \}$, where $\det_{-1}$ is the usual determinant, see Eq. (1.3).

Now, let us consider the thermodynamic limit, where the volume of the box $\Lambda_L$ and the number of points $N$ in the box $\Lambda_L$ tend to infinity in such a way that the densities tend to a positive finite value $\rho$, i.e.,

$$L, N \to \infty, \qquad N / L^d \to \rho > 0. \tag{2.7}$$

Theorem 2.1. The finite fermion processes $\{\mu^F_{L,N}\}$ defined above converge weakly to the fermion process $\mu^F_\rho$ whose Laplace transform is given by

$$\int_{Q(\mathbb{R}^d)} e^{-\langle \xi, f \rangle}\, d\mu^F_\rho(\xi) = \mathrm{Det}\Big[ 1 - \sqrt{1 - e^{-f}}\, z_* G (1 + z_* G)^{-1} \sqrt{1 - e^{-f}} \Big] \tag{2.8}$$

in the thermodynamic limit (2.7), where $z_*$ is the positive number uniquely determined by

$$\rho = \int \frac{dp}{(2\pi)^d}\, \frac{z_* e^{-\beta |p|^2}}{1 + z_* e^{-\beta |p|^2}} = \big( z_* G (1 + z_* G)^{-1} \big)(x, x).$$

Remark. The existence of $\mu^F_\rho$ which has the above Laplace transform is a consequence of the result of [23] we have mentioned in the introduction.
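The condition fixing $z_*$ in Theorem 2.1 can be solved numerically, since the right-hand side is strictly increasing in $z_*$. The sketch below is a minimal illustration in dimension $d = 1$: the momentum integral is approximated by a Riemann sum on a truncated grid, and $z_*$ is found by bisection. The function names, grid parameters and the sample values $\rho = 0.2$, $\beta = 1$ are assumptions made for the example only.

```python
import numpy as np

def fermion_density(z, beta, pmax=20.0, npts=20001):
    """Approximate rho(z) = int dp/(2 pi) z e^{-beta p^2} / (1 + z e^{-beta p^2}) in d = 1."""
    p = np.linspace(-pmax, pmax, npts)
    w = z * np.exp(-beta * p ** 2)
    dp = p[1] - p[0]
    # Plain Riemann sum; the integrand is negligible at the ends of the grid.
    return np.sum(w / (1.0 + w)) * dp / (2 * np.pi)

def solve_fugacity(rho, beta, tol=1e-12):
    """Find z > 0 with fermion_density(z, beta) = rho by bisection (rho is increasing in z)."""
    lo, hi = 0.0, 1.0
    while fermion_density(hi, beta) < rho:   # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if fermion_density(mid, beta) < rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z_star = solve_fugacity(0.2, 1.0)
print(z_star, fermion_density(z_star, 1.0))
```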
2.2. Boson processes. Suppose there are $N$ identical particles which obey Bose-Einstein statistics in a finite box $\Lambda_L$. The space of the quantum mechanical states of the system is given by

$$\mathcal{H}^B_{L,N} = \{ S_N f \mid f \in \otimes^N \mathcal{H}_L \},$$

where

$$S_N f(x_1, \ldots, x_N) = \frac{1}{N!} \sum_{\sigma \in S_N} f(x_{\sigma(1)}, \ldots, x_{\sigma(N)}) \qquad (x_1, \ldots, x_N \in \Lambda_L)$$

is symmetrization in the $N$ indices. Using the CONS $\{\varphi_k^{(L)}\}_{k \in \mathbb{Z}^d}$ of $L^2(\Lambda_L)$, we make the element

$$\Phi_k(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!\, n(k)}} \sum_{\sigma \in S_N} \varphi_{k_1}(x_{\sigma(1)}) \cdots \varphi_{k_N}(x_{\sigma(N)}) \tag{2.9}$$

of $\mathcal{H}^B_{L,N}$ for $k = (k_1, \ldots, k_N) \in (\mathbb{Z}^d)^N$, where $n(k) = \prod_{l \in \mathbb{Z}^d} \big( \#\{ n \in \{1, \ldots, N\} \mid k_n = l \}! \big)$. Let us introduce the subset $(\mathbb{Z}^d)^N_{\prec} = \{ (k_1, \ldots, k_N) \in (\mathbb{Z}^d)^N \mid k_1 \prec \cdots \prec k_N \}$ of $(\mathbb{Z}^d)^N$. Then $\{\Phi_k\}_{k \in (\mathbb{Z}^d)^N_{\prec}}$ forms a CONS of $\mathcal{H}^B_{L,N}$.

As in the fermion’s case, the probability density distribution of the positions of the $N$ free bosons in the periodic box $\Lambda_L$ at the inverse temperature $\beta$ is given by

$$p^B_{L,N}(x_1, \ldots, x_N) = Z_B^{-1} \sum_{k \in (\mathbb{Z}^d)^N_{\prec}} \Big( \prod_{j=1}^{N} g_{k_j}^{(L)} \Big) |\Phi_k(x_1, \ldots, x_N)|^2, \tag{2.10}$$

where $Z_B$ is the normalization constant. We can define a point process of $N$ points $\mu^B_{L,N}$ from the density (2.10) as in the previous section. The Laplace transform of the point process is given by

$$E^B_{L,N}\big[ e^{-\langle \xi, f \rangle} \big] = \frac{\mathrm{Tr}_{\otimes^N \mathcal{H}_L} \big[ (\otimes^N \tilde G_L) S_N \big]}{\mathrm{Tr}_{\otimes^N \mathcal{H}_L} \big[ (\otimes^N G_L) S_N \big]} = \frac{\int_{\Lambda_L^N} \det{}_{1} \tilde G_L(x_i, x_j)\, dx_1 \cdots dx_N}{\int_{\Lambda_L^N} \det{}_{1} G_L(x_i, x_j)\, dx_1 \cdots dx_N}, \tag{2.11}$$

where $\det_1$ denotes permanent, see Eq. (1.3). We set

$$\rho_c = \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d}\, \frac{e^{-\beta |p|^2}}{1 - e^{-\beta |p|^2}}, \tag{2.12}$$

which is finite for $d > 2$. Now, we have

Theorem 2.2. The finite boson processes $\{\mu^B_{L,N}\}$ defined above converge weakly to the boson process $\mu^B_\rho$ whose Laplace transform is given by

$$\int_{Q(\mathbb{R}^d)} e^{-\langle \xi, f \rangle}\, d\mu^B_\rho(\xi) = \mathrm{Det}\Big[ 1 + \sqrt{1 - e^{-f}}\, z_* G (1 - z_* G)^{-1} \sqrt{1 - e^{-f}} \Big]^{-1} \tag{2.13}$$

in the thermodynamic limit (2.7) if

$$\rho = \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d}\, \frac{z_* e^{-\beta |p|^2}}{1 - z_* e^{-\beta |p|^2}} = \big( z_* G (1 - z_* G)^{-1} \big)(x, x) < \rho_c.$$

Remark 1. For the existence of $\mu^B_\rho$, we refer to [23].

Remark 2. In this paper, we only consider the boson processes with low densities: $\rho < \rho_c$. The high density cases $\rho \geqslant \rho_c$ are related to the Bose-Einstein condensation. We need the detailed knowledge about the spectrum of $\tilde G_L$ to deal with these cases. It will be reported in another publication.
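As a worked step on why the critical density (2.12) is finite exactly for $d > 2$: expanding the integrand as a geometric series and using the Gaussian integral $\int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d}\, e^{-n\beta|p|^2} = (4\pi n \beta)^{-d/2}$, which is the value $G(x,x)$ of the kernel of Sect. 2 with $\beta$ replaced by $n\beta$, one finds

$$\rho_c = \sum_{n=1}^{\infty} \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d}\, e^{-n \beta |p|^2} = \sum_{n=1}^{\infty} \frac{1}{(4\pi n \beta)^{d/2}} = \frac{\zeta(d/2)}{(4\pi\beta)^{d/2}},$$

where $\zeta$ denotes the Riemann zeta function (notation introduced here for illustration). The series converges if and only if $d/2 > 1$.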
3. Thermodynamic Limits

3.1. A general framework. It is convenient to consider the problem in a general framework on a Hilbert space $H$ over $\mathbb{C}$. The proofs of the theorems of Sect. 2 are given in the next subsection. We denote the operator norm by $\|\cdot\|$, the trace norm by $\|\cdot\|_1$ and the Hilbert-Schmidt norm by $\|\cdot\|_2$. Let $\{V_L\}_{L>0}$ be a one-parameter family of Hilbert-Schmidt operators on $H$ which satisfy the conditions
\[
\forall L > 0:\ \|V_L\| = 1, \qquad \lim_{L\to\infty}\|V_L\|_2 = \infty,
\]
and $A$ a bounded self-adjoint operator on $H$ satisfying $0 \le A \le 1$. Then $G_L = V_L^* V_L$ and $\tilde G_L = V_L^* A V_L$ are self-adjoint trace class operators satisfying
\[
\forall L > 0:\ 0 \le \tilde G_L \le G_L \le 1, \quad \|G_L\| = 1 \quad\text{and}\quad \lim_{L\to\infty}\mathrm{Tr}\,G_L = \infty .
\]
We define $I_{-1/n} = [0,\infty)$ for $n \in \mathbb{N}$ and $I_\alpha = [0, 1/|\alpha|)$ for $\alpha \in [-1,1] - \{0,-1,-1/2,\ldots\}$. Then the function
\[
h_L^{(\alpha)}(z) = \frac{\mathrm{Tr}\,[zG_L(1-z\alpha G_L)^{-1}]}{\mathrm{Tr}\,G_L}
\]
is well defined on $I_\alpha$ for each $L > 0$ and $\alpha \in [-1,1]-\{0\}$.

Theorem 3.1. Let $\alpha \in [-1,1]-\{0\}$ be arbitrary but fixed. Suppose that for every $z \in I_\alpha$, there exist a limit $h^{(\alpha)}(z) = \lim_{L\to\infty} h_L^{(\alpha)}(z)$ and a trace class operator $K_z$ satisfying
\[
\lim_{L\to\infty}\bigl\| K_z - (1-A)^{1/2} V_L (1-z\alpha V_L^* V_L)^{-1} V_L^* (1-A)^{1/2}\bigr\|_1 = 0 .
\tag{3.1}
\]
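Then, for every $\hat\rho \in [0, \sup_{z\in I_\alpha} h^{(\alpha)}(z))$, there exists a unique solution $z = z_* \in I_\alpha$ of $h^{(\alpha)}(z) = \hat\rho$. Moreover suppose that a sequence $L_1 < L_2 < \cdots < L_N < \cdots$ satisfies
\[
\lim_{N\to\infty} N/\mathrm{Tr}\,G_{L_N} = \hat\rho .
\tag{3.2}
\]
Then
\[
\lim_{N\to\infty}
\frac{\sum_{\sigma\in S_N} \alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[\otimes^N \tilde G_{L_N} U(\sigma)]}
{\sum_{\sigma\in S_N} \alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[\otimes^N G_{L_N} U(\sigma)]}
= \mathrm{Det}[1 + z_*\alpha K_{z_*}]^{-1/\alpha}
\tag{3.3}
\]
holds. Here the operator $U(\sigma)$ on $\otimes^N H$ is defined by $U(\sigma)\varphi_1\otimes\cdots\otimes\varphi_N = \varphi_{\sigma^{-1}(1)}\otimes\cdots\otimes\varphi_{\sigma^{-1}(N)}$ for $\sigma \in S_N$ and $\varphi_1,\ldots,\varphi_N \in H$.

In order to prove the theorem, we prepare several lemmas under the same assumptions as the theorem.

Lemma 3.2. $h^{(\alpha)}$ is a strictly increasing continuous function on $I_\alpha$ and there exists a unique $z_* \in I_\alpha$ which satisfies $h^{(\alpha)}(z_*) = \hat\rho$.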
H. Tamura, K.R. Ito
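Proof. From $h_L^{(\alpha)\prime}(z) = \mathrm{Tr}\,[G_L(1-z\alpha G_L)^{-2}]/\mathrm{Tr}\,G_L$, we have $1 \le h_L^{(\alpha)\prime}(z) \le (1-z\alpha)^{-2}$ for $\alpha > 0$ and $1 \ge h_L^{(\alpha)\prime}(z) \ge (1-z\alpha)^{-2}$ for $\alpha < 0$, i.e., $\{h_L^{(\alpha)}\}_{L>0}$ is equi-continuous on $I_\alpha$. By the Ascoli-Arzelà theorem, the convergence $h_L^{(\alpha)} \to h^{(\alpha)}$ is locally uniform and hence $h^{(\alpha)}$ is continuous on $I_\alpha$. It also follows that $h^{(\alpha)}$ is strictly increasing. Together with $h^{(\alpha)}(0) = 0$, which comes from $h_L^{(\alpha)}(0) = 0$, we get that $h^{(\alpha)}(z) = \hat\rho$ has a unique solution in $I_\alpha$.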
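The monotonicity argument of Lemma 3.2 can be illustrated on a finite spectrum. The sketch below is an assumption-laden toy, not the authors' construction: it replaces $G_L$ by a hypothetical finite list of eigenvalues `g` in $[0,1]$ with largest value 1, evaluates $h_L^{(\alpha)}$ and its derivative, and solves $h_L^{(\alpha)}(z) = \hat\rho$ by bisection, which is justified because the function is strictly increasing. The names `h`, `h_prime` and `solve_activity` are illustrative.

```python
def h(z, g, alpha):
    """h_L^{(alpha)}(z) = Tr[z G (1 - z alpha G)^{-1}] / Tr G for a diagonal G
    with eigenvalues g (a finite toy spectrum, assumed for illustration)."""
    return sum(z * gj / (1.0 - z * alpha * gj) for gj in g) / sum(g)

def h_prime(z, g, alpha):
    """Derivative Tr[G (1 - z alpha G)^{-2}] / Tr G used in the proof."""
    return sum(gj / (1.0 - z * alpha * gj) ** 2 for gj in g) / sum(g)

def solve_activity(rho_hat, g, alpha, iterations=100):
    """Bisection on I_alpha = [0, 1/|alpha|); rho_hat must lie below sup h."""
    lo, hi = 0.0, 1.0 / abs(alpha)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid, g, alpha) < rho_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With `g = [1.0, 0.7, 0.4, 0.1]` one can check the two-sided derivative bounds of the proof at, say, `z = 0.5` for both signs of `alpha`.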
Lemma 3.3. There exists a constant $c_0 > 0$ such that
\[
\|G_L - \tilde G_L\|_1 = \mathrm{Tr}\,[V_L^*(1-A)V_L] \le c_0
\]
uniformly in $L > 0$.

Proof. Since $1 - z\alpha G_L$ is invertible for $z \in I_\alpha$ and $V_L$ is Hilbert-Schmidt, we have
\begin{align*}
\mathrm{Tr}\,[V_L^*(1-A)V_L]
&= \mathrm{Tr}\,[(1-z\alpha G_L)^{1/2}(1-z\alpha G_L)^{-1/2} V_L^*(1-A)V_L (1-z\alpha G_L)^{-1/2}(1-z\alpha G_L)^{1/2}] \\
&\le \|1-z\alpha G_L\|\,\mathrm{Tr}\,[(1-z\alpha G_L)^{-1/2} V_L^*(1-A)V_L (1-z\alpha G_L)^{-1/2}] \\
&= \|1-z\alpha G_L\|\,\mathrm{Tr}\,[(1-A)^{1/2} V_L (1-z\alpha G_L)^{-1} V_L^* (1-A)^{1/2}] \\
&= (1-(\alpha\wedge 0)z)(\mathrm{Tr}\,K_z + o(1)). \tag{3.4}
\end{align*}
Here we have used $|\mathrm{Tr}\,B_1 C B_2| \le \|B_1\|\,\|B_2\|\,\|C\|_1 = \|B_1\|\,\|B_2\|\,\mathrm{Tr}\,C$ for bounded operators $B_1, B_2$ and a positive trace class operator $C$, and $\mathrm{Tr}\,WV = \mathrm{Tr}\,VW$ for Hilbert-Schmidt operators $W, V$.

Let us denote all the eigenvalues of $G_L$ and $\tilde G_L$ in decreasing order by
\[
g_0(L) = 1 \ge g_1(L) \ge \cdots \ge g_j(L) \ge \cdots \quad\text{and}\quad \tilde g_0(L) \ge \tilde g_1(L) \ge \cdots \ge \tilde g_j(L) \ge \cdots,
\]
respectively. Then we have

Lemma 3.4. For each $j = 0, 1, 2, \ldots$,
\[
g_j(L) \ge \tilde g_j(L)
\]
holds.

Proof. By the min-max principle, we have
\[
\tilde g_j(L)
= \min_{\psi_0,\ldots,\psi_{j-1}\in H_L}\ \max_{\psi\in\{\psi_0,\ldots,\psi_{j-1}\}^\perp} \frac{(\psi,\tilde G_L\psi)}{\|\psi\|^2}
\le \min_{\psi_0,\ldots,\psi_{j-1}\in H_L}\ \max_{\psi\in\{\psi_0,\ldots,\psi_{j-1}\}^\perp} \frac{(\psi, G_L\psi)}{\|\psi\|^2}
= g_j(L).
\]
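Lemma 3.5. For $N$ large enough, the conditions
\[
\mathrm{Tr}\,[z_N G_{L_N}(1-\alpha z_N G_{L_N})^{-1}] = \mathrm{Tr}\,[\tilde z_N \tilde G_{L_N}(1-\alpha\tilde z_N \tilde G_{L_N})^{-1}] = N
\tag{3.5}
\]
determine $z_N, \tilde z_N \in I_\alpha$ uniquely. $z_N$ and $\tilde z_N$ satisfy
\[
z_N \le \tilde z_N, \qquad |\tilde z_N - z_N| = O(1/N) \qquad\text{and}\qquad \lim_{N\to\infty} z_N = \lim_{N\to\infty}\tilde z_N = z_* .
\]

Proof. From the proof of Lemma 3.2, $H_N(z) = \mathrm{Tr}\,[zG_{L_N}(1-z\alpha G_{L_N})^{-1}] = h_{L_N}^{(\alpha)}(z)\,\mathrm{Tr}\,G_{L_N}$ is a strictly increasing continuous function on $I_\alpha$ and $H_N(0) = 0$. Let us pick $z_0 \in I_\alpha$ such that $z_0 > z_*$. Since $h^{(\alpha)}$ is strictly increasing, $h^{(\alpha)}(z_0) - h^{(\alpha)}(z_*) = \epsilon > 0$. We have
\[
\frac{H_N(z_0)}{N} = h_{L_N}^{(\alpha)}(z_0)\,\frac{\mathrm{Tr}\,G_{L_N}}{N} \to \frac{h^{(\alpha)}(z_0)}{\hat\rho} = 1 + \frac{\epsilon}{\hat\rho},
\tag{3.6}
\]
which shows $H_N(\sup I_\alpha - 0) \ge H_N(z_0) > N$ for large enough $N$. Thus $z_N \in [0, z_0) \subset I_\alpha$ is uniquely determined by $H_N(z_N) = N$.

Put $\tilde H_N(z) = \mathrm{Tr}\,[z\tilde G_{L_N}(1-z\alpha\tilde G_{L_N})^{-1}]$. Then by Lemma 3.4, $\tilde H_N$ is well-defined on $I_\alpha$ and $\tilde H_N \le H_N$ there. Moreover
\begin{align*}
H_N(z) - \tilde H_N(z)
&= \mathrm{Tr}\,[(1-\alpha z G_{L_N})^{-1} z (G_{L_N} - \tilde G_{L_N})(1-\alpha z\tilde G_{L_N})^{-1}] \\
&\le \|(1-\alpha z G_{L_N})^{-1}\|\,\|(1-\alpha z\tilde G_{L_N})^{-1}\|\, z\,\mathrm{Tr}\,[G_{L_N} - \tilde G_{L_N}]
\le \frac{z c_0}{(1-(\alpha\vee 0)z)^2} = C_z
\end{align*}
holds. Together with (3.6), we have
\[
\frac{\tilde H_N(z_0)}{N} \ge \frac{H_N(z_0) - C_{z_0}}{N} > 1 + \frac{\epsilon}{2\hat\rho} - \frac{C_{z_0}}{N},
\]
hence $\tilde H_N(z_0) > N$, if $N$ is large enough. It is also obvious that $\tilde H_N$ is strictly increasing and continuous on $I_\alpha$ and $\tilde H_N(0) = 0$. Thus $\tilde z_N \in [0, z_0) \subset I_\alpha$ is uniquely determined by $\tilde H_N(\tilde z_N) = N$.

The convergence $z_N \to z_*$ is a consequence of $h_{L_N}^{(\alpha)}(z_N) = N/\mathrm{Tr}\,G_{L_N} \to \hat\rho = h^{(\alpha)}(z_*)$, the strict monotonicity of $h^{(\alpha)}$ and $h_L^{(\alpha)}$, and the pointwise convergence $h_L^{(\alpha)} \to h^{(\alpha)}$. We get $z_N \le \tilde z_N$ from $H_N \ge \tilde H_N$ and the monotonicity of $H_N$, $\tilde H_N$.

Now, let us show $|\tilde z_N - z_N| = O(N^{-1})$, which together with $z_N \to z_*$ yields $\tilde z_N \to z_*$. From
\begin{align*}
0 = N - N &= H_N(z_N) - \tilde H_N(\tilde z_N) \\
&= \mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1}(z_N G_{L_N} - \tilde z_N\tilde G_{L_N})(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}] \\
&= z_N\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1}(G_{L_N} - \tilde G_{L_N})(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}] \\
&\quad - (\tilde z_N - z_N)\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1}\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}],
\end{align*}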
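we get
\[
\frac{\tilde z_N - z_N}{\tilde z_N}\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1/2}\,\tilde z_N\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}(1-\alpha z_N G_{L_N})^{-1/2}]
= z_N\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1}(G_{L_N} - \tilde G_{L_N})(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}].
\]
It follows that
\begin{align*}
\frac{\tilde z_N - z_N}{\tilde z_N}\,N
&= \frac{\tilde z_N - z_N}{\tilde z_N}\,\tilde H_N(\tilde z_N) \\
&= \frac{\tilde z_N - z_N}{\tilde z_N}\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})\,(1-\alpha z_N G_{L_N})^{-1/2}\,\tilde z_N\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}(1-\alpha z_N G_{L_N})^{-1/2}] \\
&\le \frac{\tilde z_N - z_N}{\tilde z_N}\,\|1-\alpha z_N G_{L_N}\|\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1/2}\,\tilde z_N\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}(1-\alpha z_N G_{L_N})^{-1/2}] \\
&= \|1-\alpha z_N G_{L_N}\|\, z_N\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-1}(G_{L_N} - \tilde G_{L_N})(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}] \\
&\le z_N\,\|1-\alpha z_N G_{L_N}\|\,\|(1-\alpha z_N G_{L_N})^{-1}\|\,\mathrm{Tr}\,[G_{L_N} - \tilde G_{L_N}]\,\|(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}\| \\
&\le c_0 z_0 (1-(\alpha\wedge 0)z_0)/(1-(\alpha\vee 0)z_0)^2
\end{align*}
for $N$ large enough, because $z_N, \tilde z_N < z_0$. Thus, we have obtained $\tilde z_N - z_N = O(N^{-1})$.

We put
\[
v^{(N)} = \mathrm{Tr}\,[z_N G_{L_N}(1-\alpha z_N G_{L_N})^{-2}] \quad\text{and}\quad \tilde v^{(N)} = \mathrm{Tr}\,[\tilde z_N\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-2}].
\]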
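Then we have:

Lemma 3.6.
(i) $v^{(N)}, \tilde v^{(N)} \to \infty$,
(ii) $v^{(N)}/\tilde v^{(N)} \to 1$.

Proof. (i) follows from the lower bound
\begin{align*}
v^{(N)} = \mathrm{Tr}\,[z_N G_{L_N}(1-\alpha z_N G_{L_N})^{-2}]
&\ge \mathrm{Tr}\,[z_N G_{L_N}(1-\alpha z_N G_{L_N})^{-1}]\,\|1-\alpha z_N G_{L_N}\|^{-1} \\
&\ge N(1+o(1))/(1-(\alpha\wedge 0)z_*),
\end{align*}
since $z_N \to z_*$. The same bound is also true for $\tilde v^{(N)}$.

(ii) Using
\[
v^{(N)} = \mathrm{Tr}\,[-z_N G_{L_N}(1-\alpha z_N G_{L_N})^{-1} + \alpha^{-1}(1-\alpha z_N G_{L_N})^{-2} - \alpha^{-1}]
= -N + \alpha^{-1}\,\mathrm{Tr}\,[(1-\alpha z_N G_{L_N})^{-2} - 1]
\]
and the same for $\tilde v^{(N)}$, we get, writing $G = G_{L_N}$ and $\tilde G = \tilde G_{L_N}$ for brevity,
\begin{align*}
|\tilde v^{(N)} - v^{(N)}|
&= |\alpha^{-1}\,\mathrm{Tr}\,[(1-\alpha\tilde z_N\tilde G)^{-2} - (1-\alpha z_N G)^{-2}]| \\
&\le |\alpha^{-1}\,\mathrm{Tr}\,[\{(1-\alpha\tilde z_N\tilde G)^{-1} - (1-\alpha z_N\tilde G)^{-1}\}(1-\alpha\tilde z_N\tilde G)^{-1}]| \\
&\quad + |\alpha^{-1}\,\mathrm{Tr}\,[\{(1-\alpha z_N\tilde G)^{-1} - (1-\alpha z_N G)^{-1}\}(1-\alpha\tilde z_N\tilde G)^{-1}]| \\
&\quad + |\alpha^{-1}\,\mathrm{Tr}\,[(1-\alpha z_N G)^{-1}\{(1-\alpha\tilde z_N\tilde G)^{-1} - (1-\alpha z_N\tilde G)^{-1}\}]| \\
&\quad + |\alpha^{-1}\,\mathrm{Tr}\,[(1-\alpha z_N G)^{-1}\{(1-\alpha z_N\tilde G)^{-1} - (1-\alpha z_N G)^{-1}\}]| \\
&\le \|(1-\alpha\tilde z_N\tilde G)^{-1}\|\,\|(1-\alpha\tilde z_N\tilde G)^{-1}(\tilde z_N - z_N)\tilde G(1-\alpha z_N\tilde G)^{-1}\|_1 \\
&\quad + \|(1-\alpha\tilde z_N\tilde G)^{-1}\|\,\|(1-\alpha z_N\tilde G)^{-1} z_N (G - \tilde G)(1-\alpha z_N G)^{-1}\|_1 \\
&\quad + \|(1-\alpha z_N G)^{-1}\|\,\|(1-\alpha\tilde z_N\tilde G)^{-1}(\tilde z_N - z_N)\tilde G(1-\alpha z_N\tilde G)^{-1}\|_1 \\
&\quad + \|(1-\alpha z_N G)^{-1}\|\,\|(1-\alpha z_N\tilde G)^{-1} z_N (G - \tilde G)(1-\alpha z_N G)^{-1}\|_1 \\
&\le \bigl(\|(1-\alpha\tilde z_N\tilde G)^{-1}\| + \|(1-\alpha z_N G)^{-1}\|\bigr) \\
&\quad\times\Bigl(\frac{\tilde z_N - z_N}{\tilde z_N}\,\|\tilde z_N\tilde G(1-\alpha\tilde z_N\tilde G)^{-1}\|_1\,\|(1-\alpha z_N\tilde G)^{-1}\|
+ z_N\,\|(1-\alpha z_N\tilde G)^{-1}\|\,\|G - \tilde G\|_1\,\|(1-\alpha z_N G)^{-1}\|\Bigr) \\
&= O(1).
\end{align*}
In the last step, we have used Lemmas 3.3 and 3.5. This, together with (i), implies (ii).

Lemma 3.7.
\begin{align*}
\lim_{N\to\infty}\sqrt{2\pi v^{(N)}}\oint_{S_1(0)}\frac{d\eta}{2\pi i\,\eta^{N+1}}\,\mathrm{Det}\bigl[1-\alpha z_N(\eta-1)G_{L_N}(1-\alpha z_N G_{L_N})^{-1}\bigr]^{-1/\alpha} &= 1, \\
\lim_{N\to\infty}\sqrt{2\pi\tilde v^{(N)}}\oint_{S_1(0)}\frac{d\eta}{2\pi i\,\eta^{N+1}}\,\mathrm{Det}\bigl[1-\alpha\tilde z_N(\eta-1)\tilde G_{L_N}(1-\alpha\tilde z_N\tilde G_{L_N})^{-1}\bigr]^{-1/\alpha} &= 1.
\end{align*}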
Proof. Put $s = 1/|\alpha|$ and
\[
p_j^{(N)} = \frac{|\alpha| z_N g_j(L_N)}{1-\alpha z_N g_j(L_N)} .
\]
Then the first equality is nothing but Proposition A.2(i) for $\alpha < 0$ and Proposition A.2(ii) for $\alpha > 0$. The same is true for the second equality.

Proof of Theorem 3.1. Since the uniqueness of $z_*$ has already been shown, it is enough to prove (3.3). The main apparatus of the proof is Vere-Jones' formula in the following form: Let $\alpha = -1/n$ for $n \in \mathbb{N}$. Then
\[
\mathrm{Det}(1-\alpha J)^{-1/\alpha} = \sum_{n=0}^{\infty}\frac{1}{n!}\sum_{\sigma\in S_n}\alpha^{n-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^n H}[(\otimes^n J)U(\sigma)]
\]
holds for any trace class operator $J$. For $\alpha \in [-1,1]-\{0,-1,-1/2,\ldots,-1/n,\ldots\}$, this holds under the additional condition $\|\alpha J\| < 1$. This has actually been proved in Theorem 2.4 of [23]. We use the formula in the form
\[
\frac{1}{N!}\sum_{\sigma\in S_N}\alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[(\otimes^N G_{L_N})U(\sigma)]
= \oint_{S_{z_N}(0)}\frac{dz}{2\pi i\,z^{N+1}}\,\mathrm{Det}(1-z\alpha G_{L_N})^{-1/\alpha}
\tag{3.7}
\]
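and in the form in which $G_{L_N}$ is replaced by $\tilde G_{L_N}$. Here, recall that $z_N, \tilde z_N \in I_\alpha$. We calculate the right-hand side by the saddle point method. Using the above integral representation and the product property of the Fredholm determinants, followed by the changes of integration variables $z = z_N\eta$ and $z = \tilde z_N\eta$, we get
\begin{align*}
\frac{\sum_{\sigma\in S_N}\alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[\otimes^N\tilde G_{L_N}U(\sigma)]}{\sum_{\sigma\in S_N}\alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[\otimes^N G_{L_N}U(\sigma)]}
&= \frac{\oint_{S_{\tilde z_N}(0)}\mathrm{Det}(1-z\alpha\tilde G_{L_N})^{-1/\alpha}\,dz/2\pi i z^{N+1}}{\oint_{S_{z_N}(0)}\mathrm{Det}(1-z\alpha G_{L_N})^{-1/\alpha}\,dz/2\pi i z^{N+1}} \\
&= \frac{z_N^N}{\tilde z_N^N}\,
\frac{\mathrm{Det}[1-\tilde z_N\alpha G_{L_N}]^{-1/\alpha}}{\mathrm{Det}[1-z_N\alpha G_{L_N}]^{-1/\alpha}}\,
\frac{\mathrm{Det}[1-\tilde z_N\alpha\tilde G_{L_N}]^{-1/\alpha}}{\mathrm{Det}[1-\tilde z_N\alpha G_{L_N}]^{-1/\alpha}} \\
&\quad\times\frac{\oint_{S_1(0)}\mathrm{Det}[1-\tilde z_N(\eta-1)\alpha\tilde G_{L_N}(1-\tilde z_N\alpha\tilde G_{L_N})^{-1}]^{-1/\alpha}\,d\eta/2\pi i\eta^{N+1}}{\oint_{S_1(0)}\mathrm{Det}[1-z_N(\eta-1)\alpha G_{L_N}(1-z_N\alpha G_{L_N})^{-1}]^{-1/\alpha}\,d\eta/2\pi i\eta^{N+1}} .
\end{align*}
Thus the theorem is proved if the following behaviors as $N \to \infty$ are valid:

(a) $\displaystyle \frac{z_N^N}{\tilde z_N^N} = \exp\Bigl(-\frac{\tilde z_N - z_N}{z_N}N + o(1)\Bigr)$,

(b) $\displaystyle \frac{\mathrm{Det}[1-\tilde z_N\alpha G_{L_N}]^{-1/\alpha}}{\mathrm{Det}[1-z_N\alpha G_{L_N}]^{-1/\alpha}} = \exp\Bigl(\frac{\tilde z_N - z_N}{z_N}N + o(1)\Bigr)$,

(c) $\displaystyle \frac{\mathrm{Det}[1-\tilde z_N\alpha\tilde G_{L_N}]^{-1/\alpha}}{\mathrm{Det}[1-\tilde z_N\alpha G_{L_N}]^{-1/\alpha}} \to \mathrm{Det}[1+z_*\alpha K_{z_*}]^{-1/\alpha}$,

(d) $\displaystyle \frac{\oint_{S_1(0)}\mathrm{Det}[1-\tilde z_N(\eta-1)\alpha\tilde G_{L_N}(1-\tilde z_N\alpha\tilde G_{L_N})^{-1}]^{-1/\alpha}\,d\eta/2\pi i\eta^{N+1}}{\oint_{S_1(0)}\mathrm{Det}[1-z_N(\eta-1)\alpha G_{L_N}(1-z_N\alpha G_{L_N})^{-1}]^{-1/\alpha}\,d\eta/2\pi i\eta^{N+1}} \to 1$.

In fact, (a) is a consequence of Lemma 3.5. For (b), let us define a function $k(z) = \log\mathrm{Det}[1-z\alpha G_{L_N}]^{-1/\alpha} = -\alpha^{-1}\sum_{j=0}^{\infty}\log(1-z\alpha g_j(L_N))$. Then by Taylor's formula and (3.5), writing $g_j = g_j(L_N)$, we get
\begin{align*}
k(\tilde z_N) - k(z_N)
&= k'(z_N)(\tilde z_N - z_N) + k''(\bar z)\frac{(\tilde z_N - z_N)^2}{2} \\
&= \sum_{j=0}^{\infty}\frac{g_j}{1-z_N\alpha g_j}(\tilde z_N - z_N) + \sum_{j=0}^{\infty}\frac{\alpha g_j^2}{(1-\bar z\alpha g_j)^2}\frac{(\tilde z_N - z_N)^2}{2} \\
&= N\frac{\tilde z_N - z_N}{z_N} + \delta,
\end{align*}
where $\bar z$ is a mean value of $z_N$ and $\tilde z_N$ and $|\delta| = O(1/N)$ by Lemma 3.5. From the product property and the cyclic nature of the Fredholm determinants, we have
\begin{align*}
\frac{\mathrm{Det}[1-\tilde z_N\alpha\tilde G_{L_N}]}{\mathrm{Det}[1-\tilde z_N\alpha G_{L_N}]}
&= \mathrm{Det}[1+z_*\alpha(1-A)^{1/2}V_{L_N}(1-z_*\alpha G_{L_N})^{-1}V_{L_N}^*(1-A)^{1/2}] \\
&\quad + \bigl\{\mathrm{Det}[1+\tilde z_N\alpha(G_{L_N}-\tilde G_{L_N})(1-\tilde z_N\alpha G_{L_N})^{-1}] - \mathrm{Det}[1+z_*\alpha(G_{L_N}-\tilde G_{L_N})(1-z_*\alpha G_{L_N})^{-1}]\bigr\}.
\end{align*}
The first term converges to $\mathrm{Det}[1+z_*\alpha K_{z_*}]$ by the assumption (3.1) and the continuity of the Fredholm determinants with respect to the trace norm. The brace in the above equation tends to 0, because of the continuity and
\begin{align*}
&\|\tilde z_N\alpha(G_{L_N}-\tilde G_{L_N})(1-\tilde z_N\alpha G_{L_N})^{-1} - z_*\alpha(G_{L_N}-\tilde G_{L_N})(1-z_*\alpha G_{L_N})^{-1}\|_1 \\
&\quad\le |\tilde z_N - z_*|\,|\alpha|\,\|G_{L_N}-\tilde G_{L_N}\|_1\,\|(1-\tilde z_N\alpha G_{L_N})^{-1}\| \\
&\qquad + z_*|\alpha|\,\|G_{L_N}-\tilde G_{L_N}\|_1\,\|(1-\tilde z_N\alpha G_{L_N})^{-1} - (1-z_*\alpha G_{L_N})^{-1}\| \to 0,
\end{align*}
where we have used Lemmas 3.3 and 3.5. Thus, we get (c). (d) is a consequence of Lemma 3.6 and Lemma 3.7.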
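The Vere-Jones formula used in the proof above can be checked on a small example. The sketch below is illustrative and rests on stated assumptions: $J$ is taken to be a diagonal matrix with a hypothetical eigenvalue list `eigs`, and it uses the standard fact that $\mathrm{Tr}[(\otimes^N J)U(\sigma)]$ factorizes as a product of $\mathrm{Tr}\,J^{\ell}$ over the cycle lengths $\ell$ of $\sigma$. It compares the permutation sum on the left of (3.7) with the coefficient of $z^N$ in $\mathrm{Det}(1-z\alpha J)^{-1/\alpha}$, which is what the contour integral on the right extracts. The function names are hypothetical.

```python
import itertools
import math

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple i -> perm[i]."""
    seen, lengths = set(), []
    for i in range(len(perm)):
        if i not in seen:
            n, j = 0, i
            while j not in seen:
                seen.add(j)
                j = perm[j]
                n += 1
            lengths.append(n)
    return lengths

def alpha_sum(eigs, N, alpha):
    """(1/N!) sum_sigma alpha^{N - nu(sigma)} prod_{cycles} Tr J^{length}."""
    total = 0.0
    for perm in itertools.permutations(range(N)):
        lens = cycle_lengths(perm)
        term = alpha ** (N - len(lens))      # nu(sigma) = number of cycles
        for length in lens:
            term *= sum(g ** length for g in eigs)
        total += term
    return total / math.factorial(N)

def det_coefficient(eigs, N, alpha):
    """Coefficient of z^N in prod_j (1 - alpha z g_j)^{-1/alpha}."""
    series = [1.0] + [0.0] * N
    for g in eigs:
        factor = [1.0]                       # binomial series of one factor
        for n in range(1, N + 1):
            factor.append(factor[-1] * (1.0 / alpha + n - 1) * alpha * g / n)
        series = [sum(series[k] * factor[n - k] for k in range(n + 1))
                  for n in range(N + 1)]
    return series[N]
```

For $\alpha = -1$ the two sides reduce to the elementary symmetric polynomial of the eigenvalues (the fermion case), and for $\alpha = 1$ to the complete homogeneous one (the boson case).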
3.2. Proofs of the theorems. To prove Theorem 2.1 [2.2], it is enough to show that (2.5) [(2.11)] converges to the right-hand side of (2.8) [(2.13), respectively] for every $f \in C_o(\mathbb{R}^d)$ [5]. We regard $H_L = L^2(\Lambda_L)$ as a closed subspace of $L^2(\mathbb{R}^d)$. Corresponding to the orthogonal decomposition $L^2(\mathbb{R}^d) = L^2(\Lambda_L)\oplus L^2(\Lambda_L^c)$, we set $V_L = e^{\beta\Delta_L/2}\oplus 0$. Let $A = e^{-f}$ be the multiplication operator on $L^2(\mathbb{R}^d)$, which can be decomposed as $A = e^{-f}\chi_{\Lambda_L}\oplus\chi_{\Lambda_L^c}$ for large $L$ since $\mathrm{supp}\,f$ is compact. Then
\[
G_L = V_L^* V_L = e^{\beta\Delta_L}\oplus 0 \quad\text{and}\quad \tilde G_L = V_L^* A V_L = e^{\beta\Delta_L/2}e^{-f}e^{\beta\Delta_L/2}\oplus 0
\]
can be identified with those in Sect. 2. We begin with the following fact, where we denote
\[
\Delta_k^{(L)} = \frac{2\pi}{L}\Bigl(k + \Bigl(-\frac12,\frac12\Bigr]^d\Bigr) \quad\text{for } k \in \mathbb{Z}^d .
\]

Lemma 3.8. Let $b : [0,\infty)\to[0,\infty)$ be a monotone decreasing continuous function such that
\[
\int_{\mathbb{R}^d} b(|p|)\,dp < \infty .
\]
Define the function $b_L : \mathbb{R}^d\to[0,\infty)$ by
\[
b_L(p) = b(|2\pi k/L|) \quad\text{if } p \in \Delta_k^{(L)} \text{ for } k \in \mathbb{Z}^d .
\]
Then $b_L(p) \to b(|p|)$ in $L^1(\mathbb{R}^d)$ as $L \to \infty$.

Proof. There exist positive constants $c_1$ and $c_2$ such that $b_L(p) \le c_1 b(c_2|p|)$ holds for all $L \ge 1$ and $p \in \mathbb{R}^d$. Indeed, $c_1 = b(0)/b(2\pi\sqrt{d/(d+8)})$ and $c_2 = 2/\sqrt{d+8}$ satisfy the condition, since $\inf\{c_1 b(c_2|p|) \mid p \in \Delta_0^{(L)}\} \ge b(0)$ for all $L \ge 1$ and $\sup\{c_2|p| \mid p \in \Delta_k^{(L)}\} \le 2\pi|k|/L$ for $k \in \mathbb{Z}^d - \{0\}$. Obviously $c_1 b(c_2|p|)$ is an integrable function of $p \in \mathbb{R}^d$. The lemma follows by the dominated convergence theorem.

Finally we confirm the assumptions of Theorem 3.1.

Proposition 3.9. (i)
\[
\forall L > 0:\ \|V_L\| = 1, \qquad \lim_{L\to\infty}\mathrm{Tr}\,G_L/L^d = (4\pi\beta)^{-d/2}.
\tag{3.8}
\]
(ii) The following convergences hold as $L \to \infty$ for each $z \in I_\alpha$:
\[
h_L^{(\alpha)}(z) = \frac{\mathrm{Tr}\,[zG_L(1-z\alpha G_L)^{-1}]}{\mathrm{Tr}\,G_L}
\to (4\pi\beta)^{d/2}\int_{\mathbb{R}^d}\frac{dp}{(2\pi)^d}\frac{z e^{-\beta|p|^2}}{1-z\alpha e^{-\beta|p|^2}} = h^{(\alpha)}(z),
\tag{3.9}
\]
\[
\bigl\|\sqrt{1-e^{-f}}\bigl(G_L(1-z\alpha G_L)^{-1} - G(1-z\alpha G)^{-1}\bigr)\sqrt{1-e^{-f}}\bigr\|_1 \to 0.
\tag{3.10}
\]
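The limit (3.8) is a Riemann-sum statement and is easy to probe numerically. The sketch below is an illustration under assumptions not spelled out in this section: it takes the eigenvalues of $G_L$ on the periodic box to be $e^{-\beta|2\pi k/L|^2}$, $k \in \mathbb{Z}^d$, so that the trace factorizes over coordinates, and it truncates the sum at a hypothetical cutoff `kmax`. The function name `trace_per_volume` is illustrative.

```python
import math

def trace_per_volume(L, beta=1.0, d=1, kmax=None):
    """Tr G_L / L^d for eigenvalues exp(-beta |2 pi k / L|^2), k in Z^d.
    The d-dimensional sum factorizes into the d-th power of a 1D sum."""
    if kmax is None:
        # cutoff where the Gaussian weight is far below double precision
        kmax = int(10 * L / (2 * math.pi * math.sqrt(beta))) + 10
    one_dim = sum(math.exp(-beta * (2 * math.pi * k / L) ** 2)
                  for k in range(-kmax, kmax + 1))
    return (one_dim / L) ** d

limit_3d = (4.0 * math.pi * 1.0) ** (-1.5)   # (4 pi beta)^{-d/2}, beta = 1, d = 3
```

For a small box the ratio is visibly off the limiting value, while already at moderate $L$ it agrees with $(4\pi\beta)^{-d/2}$ to high accuracy.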
Proof. By applying the above lemma to $b(|p|) = e^{-\beta|p|^2}$ and $\tilde b(|p|) = z e^{-\beta|p|^2}/(1-z\alpha e^{-\beta|p|^2})$, we have (3.8) and (3.9). By Grüm's convergence theorem, it is enough to show
\[
\sqrt{1-e^{-f}}\,G_L(1-z\alpha G_L)^{-1}\sqrt{1-e^{-f}} \to \sqrt{1-e^{-f}}\,G(1-z\alpha G)^{-1}\sqrt{1-e^{-f}}
\]
strongly and
\begin{align*}
\mathrm{Tr}\,[\sqrt{1-e^{-f}}\,G_L(1-z\alpha G_L)^{-1}\sqrt{1-e^{-f}}]
&= \int_{\mathbb{R}^d}(1-e^{-f(x)})\,\bigl(G_L(1-z\alpha G_L)^{-1}\bigr)(x,x)\,dx \\
&\to \int_{\mathbb{R}^d}(1-e^{-f(x)})\,\bigl(G(1-z\alpha G)^{-1}\bigr)(x,x)\,dx \\
&= \mathrm{Tr}\,[\sqrt{1-e^{-f}}\,G(1-z\alpha G)^{-1}\sqrt{1-e^{-f}}]
\end{align*}
for (3.10). These are direct consequences of
\begin{align*}
|zG_L(1-z\alpha G_L)^{-1}(x,y) - zG(1-z\alpha G)^{-1}(x,y)|
&= \Bigl|\int\frac{dp}{(2\pi)^d}\bigl(e_L(p,x-y)\tilde b_L(p) - e(p,x-y)\tilde b(|p|)\bigr)\Bigr| \\
&\le \int\frac{dp}{(2\pi)^d}\bigl(|\tilde b_L(p) - \tilde b(|p|)| + |e_L(p,x-y) - e(p,x-y)|\,\tilde b(|p|)\bigr) \to 0
\end{align*}
uniformly in $x, y \in \mathrm{supp}\,f$. Here we have used the above lemma for $\tilde b(|p|)$ and we put $e(p,x) = e^{ip\cdot x}$ and
\[
e_L(p;x) = e(2\pi k/L;x) \quad\text{if } p \in \Delta_k^{(L)} \text{ for } k \in \mathbb{Z}^d .
\]

Thanks to (3.8), we can take a sequence $\{L_N\}_{N\in\mathbb{N}}$ which satisfies (3.2). On the relation between $\rho$ in Theorems 2.1, 2.2 and $\hat\rho$ in Theorem 3.1, $\hat\rho = (4\pi\beta)^{d/2}\rho$ is derived from (2.7). We have the ranges of $\rho$ in Theorem 2.2 and Theorem 2.1, since
\[
\sup_{z\in I_1} h^{(1)}(z) = (4\pi\beta)^{d/2}\int_{\mathbb{R}^d}\frac{dp}{(2\pi)^d}\frac{e^{-\beta|p|^2}}{1-e^{-\beta|p|^2}} = (4\pi\beta)^{d/2}\rho_c
\]
and $\sup_{z\in I_{-1}} h^{(-1)}(z) = \infty$ from (3.9). Thus we get Theorem 2.1 and Theorem 2.2 using Theorem 3.1.

4. Para-Particles

The purpose of this section is to apply the method which we have developed in the preceding sections to statistical mechanics of gases which consist of identical particles obeying para-statistics. Here, we restrict our attention to para-fermions and para-bosons of order 2. We will see that the point processes obtained after the thermodynamic limit are the point processes corresponding to the cases of $\alpha = \pm 1/2$ given in [23].
In this section, we use the representation theory of the symmetric group (cf. e.g. [13, 22, 25]). We say that $(\lambda_1,\lambda_2,\ldots,\lambda_n) \in \mathbb{N}^n$ is a Young frame of length $n$ for the symmetric group $S_N$ if
\[
\sum_{j=1}^{n}\lambda_j = N, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0 .
\]
We associate the Young frame $(\lambda_1,\lambda_2,\ldots,\lambda_n)$ with the diagram of $\lambda_1$ boxes in the first row, $\lambda_2$ boxes in the second row, ..., and $\lambda_n$ boxes in the $n$th row. A Young tableau on a Young frame is a bijection from the numbers $1, 2, \ldots, N$ to the $N$ boxes of the frame.

4.1. Para-bosons of order 2. Let us select one Young tableau, arbitrary but fixed, on each Young frame of length less than or equal to 2, say the tableau $T_j$ on the frame $(N-j, j)$ for $j = 1, 2, \ldots, [N/2]$ and the tableau $T_0$ on the frame $(N)$. We denote by $R(T_j)$ the row stabilizer of $T_j$, i.e., the subgroup of $S_N$ consisting of those elements that keep all rows of $T_j$ invariant, and by $C(T_j)$ the column stabilizer, whose elements preserve all columns of $T_j$. Let us introduce the three elements
\[
a(T_j) = \frac{1}{\#R(T_j)}\sum_{\sigma\in R(T_j)}\sigma, \qquad
b(T_j) = \frac{1}{\#C(T_j)}\sum_{\sigma\in C(T_j)}\mathrm{sgn}(\sigma)\,\sigma
\]
and
\[
e(T_j) = \frac{d_{T_j}}{N!}\sum_{\sigma\in R(T_j)}\sum_{\tau\in C(T_j)}\mathrm{sgn}(\tau)\,\sigma\tau = c_j\,a(T_j)b(T_j)
\]
of the group algebra $\mathbb{C}[S_N]$ for each $j = 0, 1, \ldots, [N/2]$, where $d_{T_j}$ is the dimension of the irreducible representation of $S_N$ corresponding to $T_j$ and $c_j = d_{T_j}\,\#R(T_j)\,\#C(T_j)/N!$. As is known,
\[
a(T_j)\sigma b(T_k) = b(T_k)\sigma a(T_j) = 0
\tag{4.1}
\]
hold for any $\sigma \in S_N$ and $0 \le j < k \le [N/2]$. The relations
\[
a(T_j)^2 = a(T_j), \qquad b(T_j)^2 = b(T_j), \qquad e(T_j)e(T_k) = \delta_{jk}e(T_j)
\tag{4.2}
\]
also hold. For later use, let us introduce
\[
d(T_j) = e(T_j)a(T_j) = c_j\,a(T_j)b(T_j)a(T_j) \qquad (j = 0, 1, \ldots, [N/2]).
\tag{4.3}
\]
They satisfy
\[
d(T_j)d(T_k) = \delta_{jk}d(T_j) \quad\text{for } 0 \le j, k \le [N/2],
\tag{4.4}
\]
as is shown readily from (4.1) and (4.2). The inner product $\langle\cdot,\cdot\rangle$ of $\mathbb{C}[S_N]$ is defined by
\[
\langle\sigma,\tau\rangle = \delta_{\sigma\tau} \quad\text{for } \sigma, \tau \in S_N
\]
and extended to all elements of $\mathbb{C}[S_N]$ by sesqui-linearity.
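The left representation $L$ and the right representation $R$ of $S_N$ on $\mathbb{C}[S_N]$ are defined by
\[
L(\sigma)g = L(\sigma)\sum_{\tau\in S_N}g(\tau)\tau = \sum_{\tau\in S_N}g(\tau)\sigma\tau = \sum_{\tau\in S_N}g(\sigma^{-1}\tau)\tau
\]
and
\[
R(\sigma)g = R(\sigma)\sum_{\tau\in S_N}g(\tau)\tau = \sum_{\tau\in S_N}g(\tau)\tau\sigma^{-1} = \sum_{\tau\in S_N}g(\tau\sigma)\tau,
\]
respectively. Here and hereafter we identify $g : S_N \to \mathbb{C}$ and $\sum_{\tau\in S_N}g(\tau)\tau \in \mathbb{C}[S_N]$. They are extended to representations of $\mathbb{C}[S_N]$ on $\mathbb{C}[S_N]$ as
\[
L(f)g = fg = \sum_{\sigma,\tau}f(\sigma)g(\tau)\sigma\tau = \sum_{\sigma}\Bigl(\sum_{\tau}f(\sigma\tau^{-1})g(\tau)\Bigr)\sigma
\]
and
\[
R(f)g = g\hat f = \sum_{\sigma,\tau}g(\sigma)f(\tau)\sigma\tau^{-1} = \sum_{\sigma}\Bigl(\sum_{\tau}g(\sigma\tau)f(\tau)\Bigr)\sigma,
\]
where $\hat f = \sum_\tau\hat f(\tau)\tau = \sum_\tau f(\tau^{-1})\tau = \sum_\tau f(\tau)\tau^{-1}$. The character of the irreducible representation of $S_N$ corresponding to the tableau $T_j$ is obtained by
\[
\chi_{T_j}(\sigma) = \sum_{\tau\in S_N}\langle\tau, \sigma R(e(T_j))\tau\rangle = \sum_{\tau\in S_N}\langle\tau, \sigma\tau\,\widehat{e(T_j)}\rangle .
\]
We introduce a tentative notation
\[
\chi_g(\sigma) \equiv \sum_{\tau\in S_N}\langle\tau, \sigma R(g)\tau\rangle = \sum_{\tau,\gamma\in S_N}\langle\tau, \sigma\tau\gamma^{-1}\rangle\,g(\gamma) = \sum_{\tau\in S_N}g(\tau^{-1}\sigma\tau)
\tag{4.5}
\]
for $g = \sum_\tau g(\tau)\tau \in \mathbb{C}[S_N]$. Let $U$ be the representation of $S_N$ (and its extension to $\mathbb{C}[S_N]$) on $\otimes^N H_L$ defined by
\[
U(\sigma)\varphi_1\otimes\cdots\otimes\varphi_N = \varphi_{\sigma^{-1}(1)}\otimes\cdots\otimes\varphi_{\sigma^{-1}(N)} \quad\text{for } \varphi_1,\ldots,\varphi_N \in H_L,
\]
or equivalently by
\[
(U(\sigma)f)(x_1,\ldots,x_N) = f(x_{\sigma(1)},\ldots,x_{\sigma(N)}) \quad\text{for } f \in \otimes^N H_L .
\]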
Obviously, $U$ is unitary: $U(\sigma)^* = U(\sigma^{-1}) = U(\sigma)^{-1}$. Hence $U(a(T_j))$ is an orthogonal projection because of $U(a(T_j))^* = U(\widehat{a(T_j)}) = U(a(T_j))$ and (4.2). So are the $U(b(T_j))$'s, the $U(d(T_j))$'s and $P_{2B} = \sum_{j=0}^{[N/2]}U(d(T_j))$. Note that $\mathrm{Ran}\,U(d(T_j)) = \mathrm{Ran}\,U(e(T_j))$ because of $d(T_j)e(T_j) = e(T_j)$ and $e(T_j)d(T_j) = d(T_j)$.

We refer to the literature [19, 12, 27] for quantum mechanics of para-particles. (See also [20].) The arguments of the literature indicate that the state space of $N$ para-bosons of order 2 in the finite box $\Lambda_L$ is given by $H^{2B}_{L,N} = P_{2B}\otimes^N H_L$. It is obvious that there is a CONS of $H^{2B}_{L,N}$ which consists of vectors of the form $U(d(T_j))\varphi^{(L)}_{k_1}\otimes\cdots\otimes\varphi^{(L)}_{k_N}$, which are eigenfunctions of $\otimes^N G_L$. Then, we define a point process of $N$ free para-bosons of order 2 as in Sect. 2 and its generating functional is given by
\[
E^{2B}_{L,N}\bigl[e^{-\langle f,\xi\rangle}\bigr] = \frac{\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N\tilde G_L)P_{2B}]}{\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)P_{2B}]} .
\]
Let us give expressions which have a clear correspondence with (2.11).

Lemma 4.1.
\begin{align*}
E^{2B}_{L,N}\bigl[e^{-\langle f,\xi\rangle}\bigr]
&= \frac{\sum_{j=0}^{[N/2]}\sum_{\sigma\in S_N}\chi_{T_j}(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N\tilde G_L)U(\sigma)]}{\sum_{j=0}^{[N/2]}\sum_{\sigma\in S_N}\chi_{T_j}(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(\sigma)]} \tag{4.6} \\
&= \frac{\sum_{j=0}^{[N/2]}\int_{\Lambda_L^N}\det{}_{T_j}\{\tilde G_L(x_i,x_j)\}\,dx_1\cdots dx_N}{\sum_{j=0}^{[N/2]}\int_{\Lambda_L^N}\det{}_{T_j}\{G_L(x_i,x_j)\}\,dx_1\cdots dx_N} . \tag{4.7}
\end{align*}
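Remark 1. $H^{2B}_{L,N} = P_{2B}\otimes^N H_L$ is determined by the choice of the tableaux $T_j$. The spaces corresponding to different choices of tableaux are different subspaces of $\otimes^N H_L$. However, they are unitarily equivalent and the generating functional given above is not affected by the choice. In fact, $\chi_{T_j}(\sigma)$ depends only on the frame on which the tableau $T_j$ is defined.

Remark 2. $\det_T A = \sum_{\sigma\in S_N}\chi_T(\sigma)\prod_{i=1}^{N}A_{i\sigma(i)}$ in (4.7) is called the immanant, a generalization of the determinant different from $\det_\alpha$.

Proof. Since $\otimes^N G$ commutes with $U(\sigma)$ and $a(T_j)e(T_j) = e(T_j)$, we have
\begin{align*}
\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(d(T_j))]
&= \mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(e(T_j))U(a(T_j))] \\
&= \mathrm{Tr}_{\otimes^N H_L}[U(a(T_j))(\otimes^N G_L)U(e(T_j))]
= \mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(e(T_j))]. \tag{4.8}
\end{align*}
On the other hand, we get from (4.5) that
\begin{align*}
\sum_{\sigma\in S_N}\chi_g(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(\sigma)]
&= \sum_{\tau,\sigma\in S_N}g(\tau^{-1}\sigma\tau)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(\sigma)] \\
&= \sum_{\tau,\sigma}g(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(\tau\sigma\tau^{-1})] \\
&= \sum_{\tau,\sigma}g(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(\tau)U(\sigma)U(\tau^{-1})] \\
&= N!\sum_{\sigma}g(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(\sigma)]
= N!\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(g)], \tag{4.9}
\end{align*}
where we have used the cyclicity of the trace and the commutativity of $U(\tau)$ with $\otimes^N G$. Putting $g = e(T_j)$ and using (4.8), the first equation is derived. The second one is obvious.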
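Let $\psi_{T_j}$ be the character of the induced representation $\mathrm{Ind}^{S_N}_{R(T_j)}[\mathbf{1}]$, where $\mathbf{1}$ is the representation $R(T_j)\ni\sigma\to 1$, i.e.,
\[
\psi_{T_j}(\sigma) = \sum_{\tau\in S_N}\langle\tau, \sigma R(a(T_j))\tau\rangle = \chi_{a(T_j)}(\sigma).
\]
Then the determinantal form [13]
\[
\chi_{T_j} = \psi_{T_j} - \psi_{T_{j-1}} \quad (j = 1,\ldots,[N/2]), \qquad \chi_{T_0} = \psi_{T_0}
\tag{4.10}
\]
yields the following result:

Theorem 4.2. The finite para-boson processes defined above converge weakly to the point process whose Laplace transform is given by
\[
E^{2B}_\rho\bigl[e^{-\langle f,\xi\rangle}\bigr] = \mathrm{Det}\bigl[1+\sqrt{1-e^{-f}}\,z_*G(1-z_*G)^{-1}\sqrt{1-e^{-f}}\bigr]^{-2}
\]
in the thermodynamic limit, where $z_* \in (0,1)$ is determined by
\[
\frac{\rho}{2} = \int\frac{dp}{(2\pi)^d}\frac{z_*e^{-\beta|p|^2}}{1-z_*e^{-\beta|p|^2}} = (z_*G(1-z_*G)^{-1})(x,x) < \rho_c,
\]
and $\rho_c$ is given by (2.12).

Proof. Using (4.10) in the expression in Lemma 4.1 and (4.9) for $g = a(T_{[N/2]})$, we have
\begin{align*}
E^{2B}_{L,N}\bigl[e^{-\langle f,\xi\rangle}\bigr]
&= \frac{\sum_{\sigma\in S_N}\psi_{T_{[N/2]}}(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N\tilde G_L)U(\sigma)]}{\sum_{\sigma\in S_N}\psi_{T_{[N/2]}}(\sigma)\,\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(\sigma)]} \\
&= \frac{\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N\tilde G_L)U(a(T_{[N/2]}))]}{\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G_L)U(a(T_{[N/2]}))]} \\
&= \frac{\mathrm{Tr}_{\otimes^{[(N+1)/2]}H_L}[(\otimes^{[(N+1)/2]}\tilde G_L)S_{[(N+1)/2]}]\ \mathrm{Tr}_{\otimes^{[N/2]}H_L}[(\otimes^{[N/2]}\tilde G_L)S_{[N/2]}]}{\mathrm{Tr}_{\otimes^{[(N+1)/2]}H_L}[(\otimes^{[(N+1)/2]}G_L)S_{[(N+1)/2]}]\ \mathrm{Tr}_{\otimes^{[N/2]}H_L}[(\otimes^{[N/2]}G_L)S_{[N/2]}]} .
\end{align*}
In the last equality, we have used
\[
a(T_{[N/2]}) = \frac{\sum_{\sigma\in R_1}\sigma}{\#R_1}\,\frac{\sum_{\tau\in R_2}\tau}{\#R_2},
\]
where $R_1$ is the symmetric group of the $[(N+1)/2]$ numbers which are on the first row of the tableau $T_{[N/2]}$ and $R_2$ that of the $[N/2]$ numbers on the second row. Then, Theorem 2.2 yields the theorem.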
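The determinantal form (4.10) and the formula $\psi_{T_j} = \chi_{a(T_j)}$ can be verified by brute force for small $N$. The sketch below is only an illustration: by (4.5) with $g = a(T)$, the induced character is $\psi_T(\sigma) = \#\{\tau : \tau^{-1}\sigma\tau \in R(T)\}/\#R(T)$, which the code evaluates by enumerating $S_N$. Tableaux are represented, as an assumption of this sketch, by lists of rows over the labels $0,\ldots,N-1$, and all function names are hypothetical.

```python
import itertools
from fractions import Fraction

def compose(p, q):
    """Product pq acting as (pq)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def row_stabilizer(rows, n):
    """All permutations of range(n) that map every row onto itself."""
    return [p for p in itertools.permutations(range(n))
            if all({p[i] for i in row} == set(row) for row in rows)]

def induced_character(rows, n):
    """psi_T(sigma) = (1/#R) * #{tau : tau^{-1} sigma tau in R(T)}, cf. (4.5)."""
    group = list(itertools.permutations(range(n)))
    R = set(row_stabilizer(rows, n))
    return {s: Fraction(sum(compose(inverse(t), compose(s, t)) in R
                            for t in group), len(R))
            for s in group}

# N = 3: T_0 on the frame (3), T_1 on the frame (2, 1).
psi_T0 = induced_character([[0, 1, 2]], 3)
psi_T1 = induced_character([[0, 1], [2]], 3)
chi_T1 = {s: psi_T1[s] - psi_T0[s] for s in psi_T1}   # relation (4.10)
```

For $S_3$ the difference reproduces the character of the two-dimensional irreducible representation, with values 2, 0 and $-1$ on the identity, the transpositions and the 3-cycles.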
4.2. Para-fermions of order 2. For a Young tableau $T$, we denote by $T'$ the tableau obtained by interchanging the rows and the columns of $T$. In other words, $T'$ is the transpose of $T$. The tableau $T'_j$ is on the frame $(\underbrace{2,\ldots,2}_{j},\underbrace{1,\ldots,1}_{N-2j})$ and satisfies
\[
R(T'_j) = C(T_j), \qquad C(T'_j) = R(T_j).
\]
The generating functional of the point process for $N$ para-fermions of order 2 in the finite box $\Lambda_L$ is given by
\[
E^{2F}_{L,N}\bigl[e^{-\langle f,\xi\rangle}\bigr] = \frac{\sum_{j=0}^{[N/2]}\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N\tilde G)U(d(T'_j))]}{\sum_{j=0}^{[N/2]}\mathrm{Tr}_{\otimes^N H_L}[(\otimes^N G)U(d(T'_j))]}
\]
as in the case of para-bosons of order 2. Let us recall the relations
\[
\chi_{T'_j}(\sigma) = \mathrm{sgn}(\sigma)\chi_{T_j}(\sigma), \qquad \varphi_{T'_j}(\sigma) = \mathrm{sgn}(\sigma)\psi_{T_j}(\sigma),
\]
where we have denoted by
\[
\varphi_{T'_j}(\sigma) = \sum_{\tau}\langle\tau, \sigma R(b(T'_j))\tau\rangle
\]
the character of the induced representation $\mathrm{Ind}^{S_N}_{C(T'_j)}[\mathrm{sgn}]$, where $\mathrm{sgn}$ is the representation $C(T'_j) = R(T_j)\ni\sigma\to\mathrm{sgn}(\sigma)$. Thanks to these relations, we can easily translate the argument for para-bosons to that for para-fermions and get the following theorem.
Theorem 4.3. The finite para-fermion processes defined above converge weakly to the point process whose Laplace transform is given by
\[
E^{2F}_\rho\bigl[e^{-\langle f,\xi\rangle}\bigr] = \mathrm{Det}\bigl[1-\sqrt{1-e^{-f}}\,z_*G(1+z_*G)^{-1}\sqrt{1-e^{-f}}\bigr]^{2}
\]
in the thermodynamic limit, where $z_* \in (0,\infty)$ is determined by
\[
\frac{\rho}{2} = \int\frac{dp}{(2\pi)^d}\frac{z_*e^{-\beta|p|^2}}{1+z_*e^{-\beta|p|^2}} = (z_*G(1+z_*G)^{-1})(x,x).
\]

5. Gas of Composite Particles

Most gases are composed of composite particles. In this section, we formulate point processes which yield the position distributions of constituents of such gases. Each composite particle is called a "molecule", and molecules consist of "atoms". Suppose that there are two kinds of atoms, say A and B, such that both of them obey Fermi-Dirac or Bose-Einstein statistics simultaneously, that $N$ atoms of kind A and $N$ atoms of kind B are in the same box $\Lambda_L$, and that one A-atom and one B-atom are bound to form a molecule by the non-relativistic interaction described by the Hamiltonian
\[
H_L = -\Delta_x - \Delta_y + U(x-y)
\]
with periodic boundary conditions in $L^2(\Lambda_L\times\Lambda_L)$. Hence there are in total $N$ such molecules in $\Lambda_L$. We assume that the interaction between atoms in different molecules can be neglected. We only consider such systems at zero temperature, where the $N$ molecules are in the ground state and (anti-)symmetrizations of the wave functions of the $N$ atoms of type A and the $N$ atoms of type B are considered. In order to avoid difficulties due to boundary conditions, we have set the masses of the two atoms A and B equal. We also assume that the potential $U$ is infinitely deep so that the wave function of the ground state has a compact support. We put
\[
H_L = -\frac12\Delta_R - 2\Delta_r + U(r) = H_L^{(R)} + H_L^{(r)},
\]
where $R = (x+y)/2$, $r = x-y$. The normalized wave function of the ground state of $H_L^{(R)}$ is the constant function $L^{-d/2}$. Let $\varphi_L(r)$ be that of the ground state of $H_L^{(r)}$. Then, the ground state of $H_L$ is $\psi_L(x,y) = L^{-d/2}\varphi_L(x-y)$. The ground state of the $N$-particle system in $\Lambda_L$ is, by taking the (anti-)symmetrizations,
\begin{align*}
\Psi_{L,N}(x_1,\ldots,x_N;y_1,\ldots,y_N)
&= Z_{c\alpha}^{-1}\sum_{\sigma,\tau\in S_N}\alpha^{N-\nu(\sigma)}\alpha^{N-\nu(\tau)}\prod_{j=1}^{N}\psi_L(x_{\sigma(j)},y_{\tau(j)}) \\
&= \frac{N!}{Z_{c\alpha}L^{dN/2}}\sum_{\sigma}\alpha^{N-\nu(\sigma)}\prod_{j=1}^{N}\varphi_L(x_j - y_{\sigma(j)}), \tag{5.1}
\end{align*}
where $Z_{c\alpha}$ is the normalization constant and $\alpha = \pm 1$. Recall that $\alpha^{N-\nu(\sigma)} = \mathrm{sgn}(\sigma)$ for $\alpha = -1$. The distribution function of the positions of the $2N$ atoms of the system at zero temperature is given by the square of the magnitude of (5.1),
\[
p^{c\alpha}_{L,N}(x_1,\ldots,x_N;y_1,\ldots,y_N) = \frac{(N!)^2}{Z_{c\alpha}^2L^{dN}}\sum_{\sigma,\tau\in S_N}\alpha^{N-\nu(\sigma)}\prod_{j=1}^{N}\overline{\varphi_L(x_j - y_{\tau(j)})}\,\varphi_L(x_{\sigma(j)} - y_{\tau(j)}).
\tag{5.2}
\]
Suppose that we are interested in one kind of atoms, say of type A. We introduce the operator $\varphi_L$ on $H_L = L^2(\Lambda_L)$ which has the integral kernel $\varphi_L(x-y)$. Then the Laplace transform of the distribution of the positions of the $N$ A-atoms can be written as
\begin{align*}
E^{c\alpha}_{L,N}\bigl[e^{-\langle f,\xi\rangle}\bigr]
&= \int_{\Lambda_L^{2N}}e^{-\sum_{j=1}^{N}f(x_j)}\,p^{c\alpha}_{L,N}(x_1,\ldots,x_N;y_1,\ldots,y_N)\,dx_1\cdots dx_N\,dy_1\cdots dy_N \\
&= \frac{\sum_{\sigma\in S_N}\alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[(\otimes^N\varphi_L^*e^{-f}\varphi_L)U(\sigma)]}{\sum_{\sigma\in S_N}\alpha^{N-\nu(\sigma)}\,\mathrm{Tr}_{\otimes^N H}[(\otimes^N\varphi_L^*\varphi_L)U(\sigma)]} .
\end{align*}
In order to take the thermodynamic limit $N, L \to \infty$, $N/L^d \to \rho$, we consider a Schrödinger operator in the whole space. Let $\varphi$ be the normalized wave function of the ground state of $H_r = -2\Delta_r + U(r)$ in $L^2(\mathbb{R}^d)$. Then $\varphi(r) = \varphi_L(r)$ ($\forall r \in \Lambda_L$) holds for large $L$ by the assumption on $U$. The Fourier series expansion of $\varphi_L$ is given by
\[
\varphi_L(r) = \sum_{k\in\mathbb{Z}^d}\Bigl(\frac{2\pi}{L}\Bigr)^{d/2}\hat\varphi\Bigl(\frac{2\pi k}{L}\Bigr)\frac{e^{i2\pi k\cdot r/L}}{L^{d/2}},
\]
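where $\hat\varphi$ is the Fourier transform of $\varphi$:
\[
\hat\varphi(p) = \int_{\mathbb{R}^d}\varphi(r)e^{-ip\cdot r}\frac{dr}{(2\pi)^{d/2}} .
\]
By $\varphi$, we denote the integral operator on $H = L^2(\mathbb{R}^d)$ having kernel $\varphi(x-y)$. Now we have the following theorem on the thermodynamic limit, where the density $\rho > 0$ is arbitrary for $\alpha = -1$, $\rho \in (0,\rho_c^c)$ for $\alpha = 1$ and
\[
\rho_c^c = \int\frac{dp}{(2\pi)^d}\frac{|\hat\varphi(p)|^2}{|\hat\varphi(0)|^2 - |\hat\varphi(p)|^2} .
\]

Theorem 5.1. The finite point processes defined above for $\alpha = \pm 1$ converge weakly to the process whose Laplace transform is given by
\[
E^{c\alpha}_\rho\bigl[e^{-\langle f,\xi\rangle}\bigr] = \mathrm{Det}\bigl[1+z_*\alpha\sqrt{1-e^{-f}}\,\varphi(\|\varphi\|^2_{L^1} - z_*\alpha\varphi^*\varphi)^{-1}\varphi^*\sqrt{1-e^{-f}}\bigr]^{-1/\alpha}
\]
in the thermodynamic limit (2.7), where the parameter $z_*$ is the positive constant uniquely determined by
\[
\rho = \int\frac{dp}{(2\pi)^d}\frac{z_*|\hat\varphi(p)|^2}{|\hat\varphi(0)|^2 - z_*\alpha|\hat\varphi(p)|^2} = (z_*\varphi(\|\varphi\|^2_{L^1} - z_*\alpha\varphi^*\varphi)^{-1}\varphi^*)(x,x).
\]

Proof. The eigenvalues of the integral operator $\varphi_L$ are $\{(2\pi)^{d/2}\hat\varphi(2\pi k/L)\}_{k\in\mathbb{Z}^d}$. Since $\varphi$ is the ground state of the Schrödinger operator, we can assume $\varphi \ge 0$. Hence the largest eigenvalue is $(2\pi)^{d/2}\hat\varphi(0) = \|\varphi\|_{L^1}$. We also have
\[
1 = \|\varphi\|^2_{L^2(\mathbb{R}^d)} = \int_{\mathbb{R}^d}|\hat\varphi(p)|^2\,dp = \|\varphi_L\|^2_{L^2(\Lambda_L)} = \sum_{k\in\mathbb{Z}^d}\Bigl(\frac{2\pi}{L}\Bigr)^{d}\Bigl|\hat\varphi\Bigl(\frac{2\pi k}{L}\Bigr)\Bigr|^2 .
\tag{5.3}
\]
Set $V_L = \varphi_L/\|\varphi\|_{L^1}$ so that
\[
\|V_L\| = 1, \qquad \|V_L\|_2^2 = L^d/\|\varphi\|^2_{L^1} .
\]
Then Theorem 3.1 applies as follows: For $z \in I_\alpha$, let us define functions $d$, $d_L$ on $\mathbb{R}^d$ by
\[
d(p) = \frac{z|\hat\varphi(p)|^2}{|\hat\varphi(0)|^2 - z\alpha|\hat\varphi(p)|^2}
\]
and
\[
d_L(p) = d(2\pi k/L) \quad\text{if } p \in \Delta_k^{(L)} \text{ for } k \in \mathbb{Z}^d .
\]
Then
\[
\int_{\mathbb{R}^d}\frac{dp}{(2\pi)^d}\,d_L(p) = L^{-d}\,\|zV_L(1-z\alpha V_L^*V_L)^{-1}V_L^*\|_1
\tag{5.4}
\]
and the following lemma holds: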
Lemma 5.2.
\[
\lim_{L\to\infty}\|d_L - d\|_{L^1} = 0 .
\]

Proof. Put
\[
\hat\varphi_{[L]}(p) = \hat\varphi(2\pi k/L) \quad\text{if } p \in \Delta_k^{(L)} \text{ for } k \in \mathbb{Z}^d
\]
and note that compactness of $\mathrm{supp}\,\varphi$ implies $\varphi \in L^1(\mathbb{R}^d)$ and uniform continuity of $\hat\varphi$. Then we have $\|\,|\hat\varphi_{[L]}|^2 - |\hat\varphi|^2\|_{L^\infty} \to 0$ and $\|d_L - d\|_{L^\infty} \to 0$. On the other hand, we get $\|\,|\hat\varphi_{[L]}|^2\|_{L^1} = \|\,|\hat\varphi|^2\|_{L^1}$ from (5.3). It is obvious that
\[
\bigl|\,\|d_L\|_{L^1} - \|d\|_{L^1}\bigr| \le \frac{z}{(1-z(\alpha\vee 0))^2|\hat\varphi(0)|^2}\,\|\,|\hat\varphi_{[L]}|^2 - |\hat\varphi|^2\|_{L^1} .
\]
Hence the lemma is derived by using the following fact twice: If $f, f_1, f_2, \ldots \in L^1(\mathbb{R}^d)$ satisfy $\|f_n - f\|_{L^\infty} \to 0$ and $\|f_n\|_{L^1} \to \|f\|_{L^1}$, then $\|f_n - f\|_{L^1} \to 0$ holds. In fact, using
\[
\int_{|x|>R}|f_n(x)|\,dx = \int_{|x|>R}|f(x)|\,dx + \int_{|x|\le R}(|f(x)| - |f_n(x)|)\,dx + \|f_n\|_{L^1} - \|f\|_{L^1},
\]
we have
\begin{align*}
\|f_n - f\|_{L^1}
&\le \int_{|x|\le R}|f_n(x) - f(x)|\,dx + \int_{|x|>R}(|f_n(x)| + |f(x)|)\,dx \\
&\le 2\int_{|x|\le R}|f_n(x) - f(x)|\,dx + 2\int_{|x|>R}|f(x)|\,dx + \|f_n\|_{L^1} - \|f\|_{L^1} .
\end{align*}
For any $\epsilon > 0$, we can choose $R$ large enough to make the second term of the right-hand side smaller than $\epsilon$. For this choice of $R$, we take $n$ so large that the first term and the remainder are each smaller than $\epsilon$, and then $\|f_n - f\|_{L^1} < 3\epsilon$.

(Continuation of the proof of Theorem 5.1). Using this lemma, we can show
\[
h_L^{(\alpha)}(z) = \frac{\mathrm{Tr}\,[zV_L(1-z\alpha V_L^*V_L)^{-1}V_L^*]}{\mathrm{Tr}\,V_L^*V_L}
\to |\hat\varphi(0)|^2\int_{\mathbb{R}^d}\frac{z|\hat\varphi(p)|^2}{|\hat\varphi(0)|^2 - z\alpha|\hat\varphi(p)|^2}\,dp = h^{(\alpha)}(z),
\]
\[
\bigl\|\sqrt{1-e^{-f}}\bigl(V_L(1-z\alpha V_L^*V_L)^{-1}V_L^* - \varphi(\|\varphi\|^2_{L^1} - z\alpha\varphi^*\varphi)^{-1}\varphi^*\bigr)\sqrt{1-e^{-f}}\bigr\|_1 \to 0,
\]
as in the proof of (3.9) and (3.10). We have the conversion $\hat\rho = \|\varphi\|^2_{L^1}\rho$ and hence $\rho_c^c = \sup_{z\in I_1}h^{(1)}(z)/\|\varphi\|^2_{L^1}$. Hence the proof is completed by Theorem 3.1.
Acknowledgements. We would like to thank Professor Y. Takahashi and Professor T. Shirai for many useful discussions. K. R. I. thanks the Grant-in-Aid for Science Research (C)15540222 from JSPS.
A. Complex Integrals

Lemma A.1. (i) For $0 \le p \le 1$ and $-\pi \le \theta \le \pi$,
\[
|1+p(e^{i\theta}-1)| \le \exp\Bigl(-\frac{2p(1-p)\theta^2}{\pi^2}\Bigr)
\]
holds. For $0 \le p \le 1$ and $-\pi/3 \le \theta \le \pi/3$,
\[
\Bigl|\log\bigl(1+p(e^{i\theta}-1)\bigr) - ip\theta + \frac{p(1-p)}{2}\theta^2\Bigr| \le \frac{4p(1-p)|\theta|^3}{9\sqrt3}
\]
holds.

(ii) For $p \ge 0$ and $-\pi \le \theta \le \pi$, the following inequalities hold:
\[
|1-p(e^{i\theta}-1)| \ge \exp\Bigl(\frac{2p(1+p)}{1+4p(1+p)}\frac{\theta^2}{\pi^2}\Bigr),
\]
\[
\Bigl|\log\bigl(1-p(e^{i\theta}-1)\bigr) + ip\theta - \frac{p(1+p)}{2}\theta^2\Bigr| \le \frac{p(1+p)(1+2p)|\theta|^3}{6} .
\]

Proof. (i) The first inequality follows from
\[
|1+p(e^{i\theta}-1)|^2 = 1-2p(1-p)(1-\cos\theta) \le \exp(-2p(1-p)(1-\cos\theta)) \le \exp(-4p(1-p)\theta^2/\pi^2),
\tag{A.1}
\]
where $1-\cos\theta \ge 2\theta^2/\pi^2$ for $\theta \in [-\pi,\pi]$ is used in the second inequality. Put $f(\theta) = \log(1+p(e^{i\theta}-1))$. Then we have $f(0) = 0$,
\begin{align*}
f'(\theta) &= i - \frac{i(1-p)}{1-p+pe^{i\theta}}, & f'(0) &= ip, \\
f''(\theta) &= -\frac{p(1-p)e^{i\theta}}{(1-p+pe^{i\theta})^2}, & f''(0) &= -p(1-p),
\end{align*}
and
\[
f^{(3)}(\theta) = -\frac{ip(1-p)e^{i\theta}(1-p-pe^{i\theta})}{(1-p+pe^{i\theta})^3} .
\]
By (A.1) and $\theta \in [-\pi/3,\pi/3]$, we have $|1+p(e^{i\theta}-1)|^2 \ge 1-p(1-p) \ge 3/4$. Hence, $|f^{(3)}(\theta)| \le 8p(1-p)/(3\sqrt3)$ holds. Taylor's theorem yields the second inequality.
(ii) The first inequality follows from
\begin{align*}
|1-p(e^{i\theta}-1)|^2 &= 1+2p(1+p)(1-\cos\theta) \\
&\ge \exp\Bigl(\frac{2p(1+p)(1-\cos\theta)}{1+4p(1+p)}\Bigr) \ge \exp\Bigl(\frac{4p(1+p)}{1+4p(1+p)}\frac{\theta^2}{\pi^2}\Bigr).
\end{align*}
Here we have used $1+x \ge e^{x/(1+a)}$ for $x \in [0,a]$ in the first inequality, which is derived from $\log(1+x) = \int_0^x dt/(1+t) \ge x/(1+a)$. Put $f(\theta) = \log(1-p(e^{i\theta}-1))$. Then we have $f(0) = 0$,
\begin{align*}
f'(\theta) &= i - \frac{i(1+p)}{1+p-pe^{i\theta}}, & f'(0) &= -ip, \\
f''(\theta) &= \frac{p(1+p)e^{i\theta}}{(1+p-pe^{i\theta})^2}, & f''(0) &= p(1+p),
\end{align*}
and
\[
f^{(3)}(\theta) = \frac{ip(1+p)e^{i\theta}(1+p+pe^{i\theta})}{(1+p-pe^{i\theta})^3} .
\]
Hence, we have $|f^{(3)}(\theta)| \le p(1+p)(1+2p)$. Thus we get the second inequality.
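The four inequalities of Lemma A.1 can be spot-checked on a grid. The sketch below is a numerical sanity check only, not a proof: the grid sizes, the sampling range $[0,5]$ for $p$ in part (ii) and the rounding tolerance `tol` are arbitrary assumptions, and the function name `check_lemma_a1` is hypothetical. The principal branch of the logarithm is the relevant one because the arguments have positive real part on the ranges sampled.

```python
import cmath
import math

def check_lemma_a1(num_p=40, num_theta=200, tol=1e-12):
    """Return True if all four inequalities of Lemma A.1 hold on the grid."""
    ok = True
    for a in range(num_p + 1):
        for b in range(-num_theta, num_theta + 1):
            theta = math.pi * b / num_theta
            e = cmath.exp(1j * theta)

            p = a / num_p                      # part (i): 0 <= p <= 1
            w = 1 + p * (e - 1)
            ok &= abs(w) <= math.exp(-2 * p * (1 - p) * theta ** 2 / math.pi ** 2) + tol
            if abs(theta) <= math.pi / 3:
                lhs = abs(cmath.log(w) - 1j * p * theta + p * (1 - p) * theta ** 2 / 2)
                ok &= lhs <= 4 * p * (1 - p) * abs(theta) ** 3 / (9 * math.sqrt(3)) + tol

            q = 5.0 * a / num_p                # part (ii): p >= 0, sampled on [0, 5]
            u = 1 - q * (e - 1)
            lower = math.exp(2 * q * (1 + q) * theta ** 2
                             / ((1 + 4 * q * (1 + q)) * math.pi ** 2))
            ok &= abs(u) >= lower - tol
            lhs = abs(cmath.log(u) + 1j * q * theta - q * (1 + q) * theta ** 2 / 2)
            ok &= lhs <= q * (1 + q) * (1 + 2 * q) * abs(theta) ** 3 / 6 + tol
    return ok
```

A single violation on the grid would point to a transcription error in one of the bounds, which is the main use of such a check.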
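Proposition A.2. Let $s > 0$ and let a collection of numbers $\{p_j^{(N)}\}_{j,N}$ satisfy
\[
p_0^{(N)} \ge p_1^{(N)} \ge p_2^{(N)} \ge \cdots \ge p_j^{(N)} \ge \cdots \ge 0, \qquad \sum_{j=0}^{\infty}s\,p_j^{(N)} = N .
\]
(i) Moreover, if $p_0^{(N)} \le 1$ and
\[
v^{(N)} \equiv \sum_{j=0}^{\infty}s\,p_j^{(N)}(1-p_j^{(N)}) \to \infty \quad (N\to\infty),
\]
then
\[
\lim_{N\to\infty}\sqrt{v^{(N)}}\oint_{S_1(0)}\frac{d\eta}{2\pi i\,\eta^{N+1}}\prod_{j=0}^{\infty}\bigl(1+p_j^{(N)}(\eta-1)\bigr)^s = \frac{1}{\sqrt{2\pi}}
\]
holds.

(ii) If $\{p_0^{(N)}\}$ is bounded, then
\[
\lim_{N\to\infty}\sqrt{w^{(N)}}\oint_{S_1(0)}\frac{d\eta}{2\pi i\,\eta^{N+1}}\frac{1}{\prod_{j=0}^{\infty}\bigl(1-p_j^{(N)}(\eta-1)\bigr)^s} = \frac{1}{\sqrt{2\pi}}
\]
holds, where
\[
w^{(N)} \equiv \sum_{j=0}^{\infty}s\,p_j^{(N)}(1+p_j^{(N)}).
\]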
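Proof. (i) Set $\eta = \exp(ix/\sqrt{v^{(N)}})$. Then the integral is written as $\int_{-\infty}^{\infty} h_N(x)\,dx/2\pi$, where
$$h_N(x) = \chi_{[-\pi\sqrt{v^{(N)}},\,\pi\sqrt{v^{(N)}}]}(x)\; e^{-iNx/\sqrt{v^{(N)}}} \prod_{j=0}^{\infty} \big(1 + p_j^{(N)}(e^{ix/\sqrt{v^{(N)}}} - 1)\big)^s.$$
By Lemma A.1(i), we have
$$|h_N(x)| \leq \prod_{j=0}^{\infty} e^{-2 s p_j^{(N)}(1 - p_j^{(N)})\,x^2/\pi^2 v^{(N)}} = e^{-2x^2/\pi^2} \in L^1(\mathbb{R}).$$
If $N$ is so large that $|x/\sqrt{v^{(N)}}| \leq \pi/3$, we also get
$$h_N(x) = \chi_{[-\pi\sqrt{v^{(N)}},\,\pi\sqrt{v^{(N)}}]}(x) \exp\Big[-i\frac{Nx}{\sqrt{v^{(N)}}} + s\sum_{j=0}^{\infty} \log\big(1 + p_j^{(N)}(e^{ix/\sqrt{v^{(N)}}} - 1)\big)\Big]$$
$$= \chi_{[-\pi\sqrt{v^{(N)}},\,\pi\sqrt{v^{(N)}}]}(x) \exp\Big[-i\frac{Nx}{\sqrt{v^{(N)}}} + s\sum_{j=0}^{\infty} \Big(i\frac{p_j^{(N)} x}{\sqrt{v^{(N)}}} - \frac{p_j^{(N)}(1 - p_j^{(N)})\,x^2}{2v^{(N)}} + \delta_j^{(N)}\Big)\Big]$$
$$= \chi_{[-\pi\sqrt{v^{(N)}},\,\pi\sqrt{v^{(N)}}]}(x) \exp\Big[-\frac{x^2}{2} + \delta^{(N)}\Big] \underset{N\to\infty}{\longrightarrow} e^{-x^2/2},$$
where
$$|\delta^{(N)}| = \Big|\sum_{j=0}^{\infty} s\,\delta_j^{(N)}\Big| \leq \sum_{j=0}^{\infty} \frac{4 s p_j^{(N)}(1 - p_j^{(N)})\,|x^3|}{9\sqrt{3}\,\big(\sqrt{v^{(N)}}\big)^3} = \frac{4|x^3|}{9\sqrt{3}\,\sqrt{v^{(N)}}}.$$
The dominated convergence theorem yields
$$\int_{-\infty}^{\infty} h_N(x)\,\frac{dx}{2\pi} \underset{N\to\infty}{\longrightarrow} \int_{-\infty}^{\infty} e^{-x^2/2}\,\frac{dx}{2\pi} = \frac{1}{\sqrt{2\pi}}.$$
(ii) Note that $w^{(N)} \to \infty$ as $N \to \infty$. Set $\eta = \exp(ix/\sqrt{w^{(N)}})$. Then the integral is written as $\int_{-\infty}^{\infty} k_N(x)\,dx/2\pi$, where
$$k_N(x) = \chi_{[-\pi\sqrt{w^{(N)}},\,\pi\sqrt{w^{(N)}}]}(x)\, \frac{e^{-iNx/\sqrt{w^{(N)}}}}{\prod_{j=0}^{\infty} \big(1 - p_j^{(N)}(e^{ix/\sqrt{w^{(N)}}} - 1)\big)^s}.$$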
By Lemma A.1(ii) and the boundedness of $\{p_0^{(N)}\}$, we have, with some positive constant $c$,
$$|k_N(x)| \leq \prod_{j=0}^{\infty} \exp\left[-\frac{2 s p_j^{(N)}(1 + p_j^{(N)})\,x^2}{\pi^2 w^{(N)}\big(1 + 4p_0^{(N)}(1 + p_0^{(N)})\big)}\right] \leq e^{-cx^2} \in L^1(\mathbb{R})$$
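and
$$k_N(x) = \chi_{[-\pi\sqrt{w^{(N)}},\,\pi\sqrt{w^{(N)}}]}(x) \exp\Big[-i\frac{Nx}{\sqrt{w^{(N)}}} - s\sum_{j=0}^{\infty} \log\big(1 - p_j^{(N)}(e^{ix/\sqrt{w^{(N)}}} - 1)\big)\Big]$$
$$= \chi_{[-\pi\sqrt{w^{(N)}},\,\pi\sqrt{w^{(N)}}]}(x) \exp\Big[-i\frac{Nx}{\sqrt{w^{(N)}}} - s\sum_{j=0}^{\infty} \Big(-i\frac{p_j^{(N)} x}{\sqrt{w^{(N)}}} + \frac{p_j^{(N)}(1 + p_j^{(N)})\,x^2}{2w^{(N)}} + \delta_j^{(N)}\Big)\Big]$$
$$= \chi_{[-\pi\sqrt{w^{(N)}},\,\pi\sqrt{w^{(N)}}]}(x) \exp\Big[-\frac{x^2}{2} + \delta^{(N)}\Big] \underset{N\to\infty}{\longrightarrow} e^{-x^2/2},$$
where
$$|\delta^{(N)}| = \Big|\sum_{j=0}^{\infty} s\,\delta_j^{(N)}\Big| \leq \sum_{j=0}^{\infty} \frac{s\,p_j^{(N)}(1 + p_j^{(N)})(1 + 2p_j^{(N)})\,|x^3|}{6\,\big(\sqrt{w^{(N)}}\big)^3} \leq \frac{(1 + 2p_0^{(N)})}{6\sqrt{w^{(N)}}}\,|x|^3.$$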
The result is obtained by the dominated convergence theorem.
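Both limits can be illustrated on explicit sequences for which the contour integral, which extracts the coefficient of $\eta^N$, is a binomial coefficient. The sketch below assumes $s = 1$ and two hypothetical choices of $\{p_j^{(N)}\}$ made only for illustration: $p_j = 1/2$ for $j < 2N$ (case (i)) and $p_j = 1$ for $j < N$ (case (ii)), with all other $p_j = 0$.

```python
import math

def case_i(N):
    """s = 1, p_j = 1/2 for j < 2N.

    The product is ((1 + eta)/2)^(2N), so the coefficient of eta^N is
    C(2N, N) / 4^N, and v = 2N * (1/2) * (1/2) = N/2.
    """
    coeff = math.comb(2 * N, N) / 4 ** N
    v = N / 2
    return math.sqrt(v) * coeff

def case_ii(N):
    """s = 1, p_j = 1 for j < N.

    The integrand is (2 - eta)^(-N), whose eta^N coefficient is
    C(2N - 1, N) / 4^N, and w = N * 1 * (1 + 1) = 2N.
    """
    coeff = math.comb(2 * N - 1, N) / 4 ** N
    w = 2 * N
    return math.sqrt(w) * coeff

target = 1 / math.sqrt(2 * math.pi)
errors_i = [abs(case_i(N) - target) for N in (20, 200, 2000)]
errors_ii = [abs(case_ii(N) - target) for N in (20, 200, 2000)]
```

In both cases $\sum_j s\,p_j^{(N)} = N$ holds by construction, and the errors shrink as $N$ grows, in line with the limit $1/\sqrt{2\pi}$.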
References 1. Benard, C., Macchi, O.: Detection and emission processes of quantum particles in a chaotic state. J. Math. Phys. 14, 155–167 (1973) 2. Chaturvedi, S.: Canonical partition functions for parastatistical systems of any order. Phys. Rev. E 54, 1378–1382 (1996) 3. Chaturvedi, S., Srinivasan, V.: Grand canonical partition functions for multi-level para-Fermi systems of any order. Phys. Lett. A 224, 249–252 (1997) 4. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics I. Commun. Math. Phys. 23, 199–230 (1971) 5. Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes. Berlin: Springer Verlag, 1988 6. Fichtner, K.-H.: On the position distribution of the ideal Bose gas. Math. Nachr. 151, 59–67 (1991) 7. Freudenberg, W.: Characterization of states of infinite boson systems. II: On the existence of the conditional reduced density matrix. Commun. Math. Phys. 137, 461–472 (1991) 8. Fichtner, K.-H., Freudenberg, W.: Point processes and the position distribution of infinite boson systems. J. Stat. Phys. 47, 959–978 (1987) 9. Fichtner, K.-H., Freudenberg, W.: Characterization of states of infinite boson systems. I: On the construction of states of boson systems. Commun. Math. Phys. 137, 315–357 (1991) 10. Green, H.S.: A generalized method of field quantization. Phys. Rev. 90, 270–273 (1953) 11. Goldin, G.A., Grodnik, J., Powers, R.T., Sharp, D.H.: Nonrelativistic current algebra in the N/V limit. J. Math. Phys. 15, 88–100 (1974) 12. Hartle, J.B., Taylor, J.R.: Quantum mechanics of paraparticles. Phys. Rev. 178, 2043–2051 (1969) 13. James, G., Kerber, A.: The Representation Theory of the Symmetric Group. (Encyclopedia of mathematics and its applications vol. 16) London: Addison-Wesley Publishing, 1981 14. Lenard, A.: States of classical statistical mechanical systems of infinitely many particles. I. Arch. Rat. Mech. Anal. 59, 219–139 (1975) 15. 
Lytvynov, E.: Fermion and boson random point processes as particle distributions of infinite free Fermi and Bose gases of finite density. Rev. Math. Phys. 14, 1073–1098 (2002) 16. Macchi, O.: The coincidence approach to stochastic point processes. Adv. Appl. Prob. 7, 83–122 (1975)
17. Macchi, O.: The fermion process–a model of stochastic point process with repulsive points. In: Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the Eighth European Meeting of Statisticians Vol. A, Prague 1974. Dordrecht: Reidel Publishing, 1977, pp. 391–398 18. Menikoff, R.: The Hamiltonian and generating functional for a nonrelativistic local current algebra. J. Math. Phys. 15, 1138–1152 (1974) 19. Messiah,A.M.L., Greenberg, O.W.: Symmetrization postulate and its experimental foundation. Phys. Rev. 136, B248–B267 (1964) 20. Ohnuki, Y., Kamefuchi, S.: Wavefunctions of identical particles. Ann. Phys. 51, 337–358 (1969) 21. Ohnuki, Y., Kamefuchi, S.: Quantum field theory and parastatistics. Berlin: Springer-Verlag 1982 22. Sagan, B.E.: The Symmetric Group. New York: Springer-Verlag, 1991 23. Shirai, T., Takahashi, Y.: Random point fields associated with certain Fredholm determinants I: fermion, Poisson and boson point processes. J. Funct. Anal. 205, 414–463 (2003) 24. Simon, B.: Trace ideals and their applications. London Mathematical Society Lecture Note Series, Vol. 35, Cambridge: Cambridge University Press, 1979 25. Simon, B.: Representations of Finite and Compact Groups. Providence, RI: A.M.S, 1996 26. Soshnikov, A.: Determinantal random point fields. Russ. Math. Surv. 55, 923–975 (2000) 27. Stolt, R.H., Taylor, J.R.: Classification of paraparticles. Phys. Rev. D1, 2226–2228 (1970) 28. Suranyi, P.: Thermodynamics of parabosonic and parafermionic systems of order two. Phys. Rev. Lett. 65, 2329–2330 (1990) 29. Vere-Jones, D.: A generalization of permanents and determinants. Linear Algebra Appl. 111, 119–124 (1988) Communicated by J.L. Lebowitz
Commun. Math. Phys. 263, 381–400 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1506-3
Communications in
Mathematical Physics
The Pearcey Process
Craig A. Tracy$^1$, Harold Widom$^2$
1 Department of Mathematics, University of California, Davis, CA 95616, USA
2 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA
Received: 1 February 2005 / Accepted: 19 September 2005 Published online: 10 February 2006 – © Springer-Verlag 2006
Abstract: The extended Airy kernel describes the space-time correlation functions for the Airy process, which is the limiting process for a polynuclear growth model. The Airy functions themselves are given by integrals in which the exponents have a cubic singularity, arising from the coalescence of two saddle points in an asymptotic analysis. Pearcey functions are given by integrals in which the exponents have a quartic singularity, arising from the coalescence of three saddle points. A corresponding Pearcey kernel appears in a random matrix model and a Brownian motion model for a fixed time. This paper derives an extended Pearcey kernel by scaling the Brownian motion model at several times, and a system of partial differential equations whose solution determines associated distribution functions. We expect there to be a limiting nonstationary process consisting of infinitely many paths, which we call the Pearcey process, whose space-time correlation functions are expressible in terms of this extended kernel.

1. Introduction

Determinantal processes are at the center of some recent remarkable developments in probability theory. These processes describe the mathematical structure underpinning random matrix theory, shape fluctuations of random Young tableaux, and certain 1 + 1 dimensional random growth models. (See [2, 9, 10, 18, 20] for recent reviews.) Each such process has an associated kernel $K(x, y)$, and certain distribution functions for the process are expressed in terms of determinants involving this kernel. (They can be ordinary determinants or operator determinants associated with the corresponding operator $K$ on an $L^2$ space.) Typically these models have a parameter $n$ which might measure the size of the system and one is usually interested in the existence of limiting distributions as $n \to \infty$. Limit laws then come down to proving that the operator $K_n$, where we now make the $n$ dependence explicit, converges in trace class norm to a limiting operator $K$.
In this context universality theorems become statements that certain canonical operators $K$ are the limits for a wide variety of $K_n$. What canonical $K$ can we expect to encounter?
In various examples the kernel $K_n(x, y)$ (or, in the case of matrix kernels, the matrix entries $K_{n,ij}(x, y)$) can be expressed as an integral
$$\int_{C_1}\int_{C_2} f(s, t)\, e^{\phi_n(s,t;\,x,y)}\, ds\, dt.$$
To study the asymptotics of such integrals one turns to a saddle point analysis. Typically one finds a nontrivial limit law when there is a coalescence of saddle points. The simplest example is the coalescence of two saddle points. This leads to the fold singularity $\phi_2(z) = \frac{1}{3} z^3 + \lambda z$ in the theory of Thom and Arnold and a limiting kernel, the Airy kernel [19] or the more general matrix-valued extended Airy kernel [17, 11]. After the fold singularity comes the cusp singularity $\phi_3(z) = \frac{1}{4} z^4 + \lambda_2 z^2 + \lambda_1 z$. The diffraction integrals, which are Airy functions in the case of a fold singularity, now become Pearcey functions [16]. What may be called the Pearcey kernel, since it is expressed in terms of Pearcey functions, arose in the work of Brézin and Hikami [6, 7] on the level spacing distribution for random Hermitian matrices in an external field. More precisely, let $H$ be an $n \times n$ GUE matrix (with $n$ even), suitably scaled, and $H_0$ a fixed Hermitian matrix with eigenvalues $\pm a$ each of multiplicity $n/2$. Let $n \to \infty$. If $a$ is small the density of eigenvalues is supported in the limit on a single interval. If $a$ is large then it is supported on two intervals. At the “closing of the gap” the limiting eigenvalue distribution is described by the Pearcey kernel. Bleher, Kuijlaars and Aptekarev [4, 5, 3] have shown that the same kernel arises in a Brownian motion model. Okounkov and Reshetikhin [15] have encountered the same kernel in a certain growth model.

Our starting point is with the work of Aptekarev, Bleher and Kuijlaars [3]. With $n$ even again, consider $n$ nonintersecting Brownian paths starting at position 0 at time $\tau = 0$, with half the paths conditioned to end at $b > 0$ at time $\tau = 1$ and the other half conditioned to end at $-b$. At any fixed time this model is equivalent to the random matrix model of Brézin and Hikami since they are described by the same distribution function.
If $b$ is of the order $n^{1/2}$ there is a critical time $\tau_c$ such that the limiting distribution of the Brownian paths as $n \to \infty$ is supported by one interval for $\tau < \tau_c$ and by two intervals when $\tau > \tau_c$. The limiting distribution at the critical time is described by the Pearcey kernel.

It is in searching for the limiting joint distribution at several times that an extended Pearcey kernel arises.$^1$ Consider times $0 < \tau_1 \leq \cdots \leq \tau_m < 1$ and ask for the probability that for each $k$ no path passes through a set $X_k$ at time $\tau_k$. We show that this probability is given by the operator determinant $\det(I - K\chi)$ with an $m \times m$ matrix kernel $K(x, y)$, where $\chi(y) = \mathrm{diag}(\chi_{X_k}(y))$. We then take $b = n^{1/2}$ and scale all the times near the critical time by the substitutions $\tau_k \to 1/2 + n^{-1/2}\tau_k$ and scale the kernel by $x \to n^{-1/4}x$, $y \to n^{-1/4}y$. (Actually there are some awkward coefficients involving $2^{1/4}$ which we need not write down exactly.) The resulting limiting kernel, the extended Pearcey kernel, has $i, j$ entry
$$-\frac{1}{4\pi^2}\int_C\int_{-i\infty}^{i\infty} e^{-s^4/4 + \tau_j s^2/2 - ys + t^4/4 - \tau_i t^2/2 + xt}\,\frac{ds\,dt}{s - t} \tag{1.1}$$
1 It was in this context that the extended Airy kernel, and other extended kernels considered in [21], arose.
plus a Gaussian when $i < j$. The $t$ contour $C$ consists of the rays from $\pm\infty e^{i\pi/4}$ to 0 and the rays from 0 to $\pm\infty e^{-i\pi/4}$. For $m = 1$ and $\tau_1 = 0$ this reduces to the Pearcey kernel of Brézin and Hikami.$^2$

These authors also asked the question whether modifications of their matrix model could lead to kernels involving higher-order singularities. They found that this was so, but that the eigenvalues of the deterministic matrix $H_0$ had to be complex. Of course there are no such matrices, but the kernels describing the distribution of eigenvalues of $H_0 + H$ make perfectly good sense. So in a way this was a fictitious random matrix model. In Sect. 5 we shall show how to derive analogous extended kernels and limiting processes from fictitious Brownian motion models, in which the end-points of the paths are complex numbers.

For the extended Airy kernel the authors in [21] derived a system of partial differential equations, with the end-points of the intervals of the $X_k$ as independent variables, whose solution determines $\det(I - K\chi)$.$^3$ Here it is assumed that each $X_k$ is a finite union of intervals. For $m = 1$ and $X_1 = (\xi, \infty)$ these partial differential equations reduce to ordinary differential equations which in turn can be reduced to the familiar Painlevé II equation. In Sect. 4 of this paper we find the analogous system of partial differential equations where now the underlying kernel is the extended Pearcey kernel.$^4$ Unlike the case of the extended Airy kernel, here it is not until a computation at the very end that one sees that the equations close. It is fair to say that we do not really understand, from this point of view, why there should be such a system of equations.

The observant reader will have noticed that so far there has been no mention of the Pearcey process, supposedly the subject of the paper.
The reason is that the existence of an actual limiting process consisting of infinitely many paths, with correlation functions and spacing distributions described by the extended Pearcey kernel, is a subtle probabilistic question which we do not now address. That for each fixed time there is a limiting random point field follows from a theorem of Lenard [13, 14] (see also [18]), since that requires only a family of inequalities for the correlation functions which are preserved in the limit. But the construction of a process, a time-dependent random point field, is another matter. Of course we expect there to be one.

2. Extended Kernel for the Brownian Motion Model

Suppose we have $n$ nonintersecting Brownian paths. It follows from a theorem of Karlin and McGregor [12] that the probability density that at times $\tau_0, \ldots, \tau_{m+1}$ their positions are in infinitesimal neighborhoods of $x_{0i}, \ldots, x_{m+1,i}$ is equal to
$$\prod_{k=0}^{m} \det\big(P(x_{k,i}, x_{k+1,j}, \sigma_k)\big), \tag{2.1}$$
where $\sigma_k = \tau_{k+1} - \tau_k$ and
$$P(x, y, \sigma) = (\pi\sigma)^{-1/2}\, e^{-(x-y)^2/\sigma}.$$

2 In the external source random matrix model, an interpretation is also given for the coefficients of $s^2$ and $t^2$ in the exponential. It is not related to time as it is here.
3 Equations of a different kind in the case $m = 2$ were found by Adler and van Moerbeke [1].
4 In the case $m = 1$ the kernel is integrable, i.e., it is a finite-rank kernel divided by $x - y$. (See footnote 7 for the exact formula.) A system of associated PDEs in this case was found in [7], in the spirit of [19], when $X_1$ is an interval. This method does not work when $m > 1$, and our equations are completely different.
The indices $i$ and $j$ run from 0 to $n - 1$, and we take $\tau_0 = 0$, $\tau_{m+1} = 1$. We set all the $x_{0i} = a_i$ and $x_{m+1,j} = b_j$, thus requiring our paths to start at $a_i$ and end at $b_j$. (Later we will let all $a_i \to 0$.) By the method of [8] (modified and somewhat simplified in [21]) we shall derive an “extended kernel” $K(x, y)$, which is a matrix kernel $(K_{k\ell}(x, y))_{k,\ell=1}^{m}$ such that for general functions $f_1, \ldots, f_m$ the expected value of
$$\prod_{k=1}^{m}\prod_{i=0}^{n-1} (1 + f_k(x_{ki}))$$
is equal to $\det(I - Kf)$, where $f(y) = \mathrm{diag}(f_k(y))$. In particular the probability that for each $k$ no path passes through the set $X_k$ at time $\tau_k$ is equal to $\det(I - K\chi)$, where $\chi(y) = \mathrm{diag}(\chi_{X_k}(y))$. (The same kernel gives the correlation functions [8]. In particular the probability density (2.1) is equal to $(n!)^{-m} \det(K_{k\ell}(x_{ki}, x_{\ell j}))_{k,\ell=1,\ldots,m;\ i,j=0,\ldots,n-1}$.)

The extended kernel $K$ will be a difference $H - E$, where $E$ is the strictly upper-triangular matrix with $k, \ell$ entry $P(x, y, \tau_\ell - \tau_k)$ when $k < \ell$, and where $H_{k\ell}(x, y)$ is given at first by the rather abstract formula (2.5) below and then by the more concrete formula (2.6). Then we let all $a_i \to 0$ and find the integral representation (2.11) for the case when all the Brownian paths start at zero. This representation will enable us to take the scaling limit in the next section.

We now present the derivation of $K$. Although in the cited references the determinants at either end (corresponding to $k = 0$ and $m$ in (2.1)) were Vandermonde determinants, it is straightforward to apply the method to the present case. Therefore, rather than go through the full derivation again we just describe how one finds the extended kernel. For $i, j = 0, \ldots, n - 1$ we find $P_i(x)$, which are a linear combination of the $P(x, a_k, \sigma_0)$ and $Q_j(y)$, which are a linear combination of the $P(y, b_k, \sigma_m)$ such that
$$\int\cdots\int P_i(x_1) \prod_{k=1}^{m-1} P(x_k, x_{k+1}, \sigma_k)\, Q_j(x_m)\, dx_1 \cdots dx_m = \delta_{ij}.$$
Because of the semi-group property of the $P(x, y, \tau)$ this is the same as
$$\int\!\!\int P_i(x)\, P(x, y, \tau_m - \tau_1)\, Q_j(y)\, dx\, dy = \delta_{ij}. \tag{2.2}$$
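The semi-group property invoked here, $\int P(x, u, \sigma_1)\,P(u, y, \sigma_2)\,du = P(x, y, \sigma_1 + \sigma_2)$, can be checked by direct quadrature. The following is a minimal sketch using the transition density $P(x, y, \sigma) = (\pi\sigma)^{-1/2} e^{-(x-y)^2/\sigma}$ from the text; the helper `compose`, the integration window, and the sample arguments are illustrative assumptions.

```python
import math

def P(x, y, sigma):
    """Transition density P(x, y, sigma) = (pi sigma)^(-1/2) exp(-(x - y)^2 / sigma)."""
    return math.exp(-(x - y) ** 2 / sigma) / math.sqrt(math.pi * sigma)

def compose(x, y, sigma1, sigma2, half_width=12.0, steps=4800):
    """Trapezoid-rule approximation of the integral over u of P(x, u, sigma1) P(u, y, sigma2)."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * P(x, u, sigma1) * P(u, y, sigma2)
    return total * h

# Compare the composed kernel with the single kernel at the summed "time".
lhs = compose(0.3, -0.7, 0.2, 0.5)
rhs = P(0.3, -0.7, 0.7)
```

Repeated application of this identity collapses the $m$-fold integral preceding (2.2) to the double integral in (2.2).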
We next define for $k < \ell$,
$$E_{k\ell}(x_k, x_\ell) = \int\cdots\int \prod_{r=k}^{\ell-1} P(x_r, x_{r+1}, \sigma_r)\, dx_{k+1} \cdots dx_{\ell-1} = P(x_k, x_\ell, \tau_\ell - \tau_k).$$
Set $P_i = P_{1i}$, $Q_{mj} = Q_j$, and for $k > 1$ define
$$P_{ki}(y) = \int E_{1k}(y, u)\, P_i(u)\, du = \int P(y, u, \tau_k - \tau_1)\, P_i(u)\, du, \tag{2.3}$$
and for $k < m$ define
$$Q_{kj}(x) = \int E_{km}(x, v)\, Q_j(v)\, dv = \int P(x, v, \tau_m - \tau_k)\, Q_j(v)\, dv. \tag{2.4}$$
(These hold also for $P_{1i}$ and $Q_{mj}$ if we set $E_{kk}(x, y) = \delta(x - y)$.) The extended kernel is given by $K = H - E$ where
$$H_{k\ell}(x, y) = \sum_{i=0}^{n-1} Q_{ki}(x)\, P_{\ell i}(y), \tag{2.5}$$
and $E_{k\ell}(x, y)$ is as given above for $k < \ell$ and equal to zero otherwise. This is essentially the derivation in [8] applied to this special case.

We now determine $H_{k\ell}(x, y)$ explicitly. Suppose
$$P_i(x) = \sum_k p_{ik}\, P(x, a_k, \sigma_0), \qquad Q_j(y) = \sum_\ell q_{j\ell}\, P(y, b_\ell, \sigma_m).$$
If we substitute these into (2.2) and use the fact that $\sigma_0 + \tau_m - \tau_1 + \sigma_m = 1$ we see that it becomes
$$\frac{1}{\sqrt{2\pi}} \sum_{k,\ell} p_{ik}\, q_{j\ell}\, e^{-(a_k - b_\ell)^2} = \delta_{ij}.$$
Thus, if we define matrices $P$, $Q$ and $A$ by $P = (p_{ij})$, $Q = (q_{ij})$, $A = (e^{-(a_i - b_j)^2})$, then we require $P A Q^t = \sqrt{2\pi}\, I$.
Next we compute
$$P_{ri}(y) = \int P(y, u, \tau_r - \tau_1)\, P_i(u)\, du = \sum_k p_{ik}\, P(y, a_k, \tau_r),$$
$$Q_{sj}(x) = \int P(x, v, \tau_m - \tau_s)\, Q_j(v)\, dv = \sum_\ell q_{j\ell}\, P(x, b_\ell, 1 - \tau_s).$$
Hence
$$H_{rs}(x, y) = \sum_i Q_{ri}(x)\, P_{si}(y) = \sum_{i,k,\ell} P(x, b_\ell, 1 - \tau_r)\, q_{i\ell}\, p_{ik}\, P(y, a_k, \tau_s).$$
The internal sum over $i$ is equal to the $\ell, k$ entry of $Q^t P = \sqrt{2\pi}\, A^{-1}$. So the above can be written (changing indices)
$$H_{k\ell}(x, y) = \sqrt{2\pi} \sum_{i,j} P(x, b_i, 1 - \tau_k)\, (A^{-1})_{ij}\, P(y, a_j, \tau_\ell).$$
If we set $B = (e^{2 a_i b_j})$ then this becomes
$$H_{k\ell}(x, y) = \sqrt{2\pi} \sum_{i,j} P(x, b_i, 1 - \tau_k)\, e^{b_i^2}\, (B^{-1})_{ij}\, e^{a_j^2}\, P(y, a_j, \tau_\ell). \tag{2.6}$$
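The passage from $A^{-1}$ to $B^{-1}$ in (2.6) rests on the factorization $e^{-(a_i - b_j)^2} = e^{-a_i^2}\, e^{2a_i b_j}\, e^{-b_j^2}$, which gives $(A^{-1})_{ij} = e^{b_i^2} (B^{-1})_{ij}\, e^{a_j^2}$. A minimal sketch checking this for $n = 2$ follows; the sample points `a`, `b` and the helper `inv2` are illustrative assumptions.

```python
import math

def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

a = [0.3, -0.4]   # starting points (hypothetical sample values)
b = [0.5, 1.1]    # end points (hypothetical sample values)

A = [[math.exp(-(a[i] - b[j]) ** 2) for j in range(2)] for i in range(2)]
B = [[math.exp(2 * a[i] * b[j]) for j in range(2)] for i in range(2)]

A_inv = inv2(A)
B_inv = inv2(B)

# Right-hand side of the identity: e^{b_i^2} (B^{-1})_{ij} e^{a_j^2}.
rhs = [[math.exp(b[i] ** 2) * B_inv[i][j] * math.exp(a[j] ** 2) for j in range(2)]
       for i in range(2)]

max_gap = max(abs(A_inv[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Note the index pattern: the row index of the inverse is paired with a $b$ and the column index with an $a$, which is why $e^{b_i^2}$ appears on the left and $e^{a_j^2}$ on the right in (2.6).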
This gives the extended kernel when the Brownian paths start at the $a_j$. Now we are going to let all $a_j \to 0$. There is a matrix function $D = D(a_0, \ldots, a_{n-1})$ such that for any smooth function $f$,
$$\lim_{a_i \to 0} D(a_0, \ldots, a_{n-1}) \begin{pmatrix} f(a_0) \\ f(a_1) \\ \vdots \\ f(a_{n-1}) \end{pmatrix} = \begin{pmatrix} f(0) \\ f'(0) \\ \vdots \\ f^{(n-1)}(0) \end{pmatrix}.$$
Here $\lim_{a_i \to 0}$ is short for a certain sequence of limiting operations. Now $B^{-1}$ applied to the column vector
$$(e^{a_j^2}\, P(y, a_j, \tau_\ell))$$
equals $(DB)^{-1}$ applied to the vector
$$D\, (e^{a_j^2}\, P(y, a_j, \tau_\ell)).$$
When we apply $\lim_{a_i \to 0}$ this vector becomes
$$(\partial_a^j\, e^{a^2} P(y, a, \tau_\ell)|_{a=0}),$$
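while $DB$ becomes the matrix $((2 b_j)^i)$, which is invertible when all the $b_j$ are distinct. If we set $V = (b_j^i)$ then the limiting $(DB)^{-1}$ is equal to $V^{-1}\, \mathrm{diag}(2^{-j})$. Thus we have shown that when all $a_i = 0$,
$$H_{k\ell}(x, y) = \sqrt{2\pi} \sum_{i,j} P(x, b_i, 1 - \tau_k)\, e^{b_i^2}\, (V^{-1})_{ij}\, 2^{-j}\, \partial_a^j\, e^{a^2} P(y, a, \tau_\ell)|_{a=0}. \tag{2.7}$$
The next step is to write down an integral representation for the last factor. We have
$$e^{a^2} P(y, a, \tau_\ell) = \frac{1}{\pi i \sqrt{2(1 - \tau_\ell)}}\; e^{\frac{y^2}{1 - \tau_\ell}} \int_{-i\infty}^{i\infty} e^{\frac{\tau_\ell}{1 - \tau_\ell} s^2 + 2s\left(a - \frac{y}{1 - \tau_\ell}\right)}\, ds.$$
Hence
$$2^{-j}\, \partial_a^j\, e^{a^2} P(y, a, \tau_\ell)|_{a=0} = \frac{1}{\pi i \sqrt{2(1 - \tau_\ell)}}\; e^{\frac{y^2}{1 - \tau_\ell}} \int_{-i\infty}^{i\infty} s^j\, e^{\frac{\tau_\ell}{1 - \tau_\ell} s^2 - \frac{2sy}{1 - \tau_\ell}}\, ds. \tag{2.8}$$
Next we are to multiply this by $(V^{-1})_{ij}$ and sum over $j$. The index $j$ appears in (2.8) only in the factor $s^j$ in the integrand, so what we want to compute is
$$\sum_{j=0}^{n-1} (V^{-1})_{ij}\, s^j. \tag{2.9}$$
Cramer’s rule in this case tells us that the above is equal to $\Delta_i/\Delta$, where $\Delta$ denotes the Vandermonde determinant of $b = \{b_0, \ldots, b_{n-1}\}$ and $\Delta_i$ the Vandermonde determinant of $b$ with $b_i$ replaced by $s$. This is equal to
$$\prod_{r \neq i} \frac{s - b_r}{b_i - b_r}.$$
Observe that this is the same as the residue
$$\mathrm{res}\left( \prod_r \frac{s - b_r}{t - b_r}\; \frac{1}{s - t},\ t = b_i \right). \tag{2.10}$$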
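The identification of (2.9) with the Lagrange-type product can be tested without inverting $V$. Writing $c_i(s) = \sum_j (V^{-1})_{ij} s^j$ with $V = (b_j^i)$, multiplying by $V$ shows that $c_i(s)$ is characterized by $\sum_i b_i^k\, c_i(s) = s^k$ for $k = 0, \ldots, n-1$. The sketch below, with hypothetical sample nodes `b` and a sample point `s`, checks that the product $\prod_{r \neq i} (s - b_r)/(b_i - b_r)$ satisfies these relations.

```python
def lagrange_product(i, s, b):
    """The product over r != i of (s - b_r) / (b_i - b_r)."""
    out = 1.0
    for r, b_r in enumerate(b):
        if r != i:
            out *= (s - b_r) / (b[i] - b_r)
    return out

b = [-1.5, -0.2, 0.7, 2.0]   # distinct end points (illustrative values)
s = 0.35                     # sample evaluation point
n = len(b)

# If c_i(s) equals the Lagrange product, then sum_i b_i^k c_i(s) = s^k for k < n.
moments = [sum(b[i] ** k * lagrange_product(i, s, b) for i in range(n))
           for k in range(n)]
gaps = [abs(moments[k] - s ** k) for k in range(n)]
```

Since $V$ is invertible when the $b_j$ are distinct, these $n$ relations determine $c_i(s)$ uniquely, so agreement here is equivalent to the Cramer's rule statement in the text.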
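This allows us to replace the sum over $i$ in (2.7) by an integral over $t$. In fact, using (2.8) and the identification of (2.9) with (2.10) we see that the right side of (2.7) is equal to
$$-\frac{1}{2\pi}\, \frac{1}{\sqrt{\pi(1 - \tau_\ell)}} \int_C \int_{-i\infty}^{i\infty} P(x, t, 1 - \tau_k)\, e^{t^2}\, e^{\frac{y^2}{1 - \tau_\ell}}\, e^{\frac{\tau_\ell}{1 - \tau_\ell} s^2 - \frac{2sy}{1 - \tau_\ell}} \prod_r \frac{s - b_r}{t - b_r}\; \frac{ds\, dt}{s - t},$$
where the contour of integration $C$ surrounds all the $b_i$ and lies to one side (it doesn’t matter which) of the $s$ contour. Thus
$$H_{k\ell}(x, y) = -\frac{1}{2\pi^2}\, \frac{1}{\sqrt{(1 - \tau_k)(1 - \tau_\ell)}}\; e^{\frac{y^2}{1 - \tau_\ell} - \frac{x^2}{1 - \tau_k}} \int_C \int_{-i\infty}^{i\infty} e^{-\frac{\tau_k}{1 - \tau_k} t^2 + \frac{2xt}{1 - \tau_k} + \frac{\tau_\ell}{1 - \tau_\ell} s^2 - \frac{2sy}{1 - \tau_\ell}} \prod_r \frac{s - b_r}{t - b_r}\; \frac{ds\, dt}{s - t}. \tag{2.11}$$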
In this representation the $s$ contour (which passes to one side of the closed $t$ contour) may be replaced by the imaginary axis and $C$ by the contour consisting of the rays from $\pm\infty e^{i\pi/4}$ to 0 and the rays from 0 to $\pm\infty e^{-i\pi/4}$. (We temporarily call this the “new” contour $C$.) To see this let $C_R$ denote the new contour $C$, but with $R$ replacing $\infty$ and the ends joined by two vertical lines (where $t^2$ has positive real part). The $t$ contour may be replaced by $C_R$ if the $s$ contour passes to the left of it. To show that the $s$ contour may be replaced by the imaginary axis it is enough to show that we get 0 when the $s$ contour is the interval $[-iR, iR]$ plus a curve from $iR$ to $-iR$ passing around to the left of $C_R$. If we integrate first with respect to $s$ we get a pole at $s = t$, and the resulting $t$ integral is zero because the integrand is analytic inside $C_R$. So we can replace the $s$ contour by the imaginary axis. We then let $R \to \infty$ to see that $C_R$ may be replaced by the new contour $C$.

3. The Extended Pearcey Kernel

The case of interest here is where half the $b_r$ equal $b$ and half equal $-b$. It is convenient to replace $n$ by $2n$, so that the product in the integrand in (2.11) is equal to
$$\left(\frac{s^2 - b^2}{t^2 - b^2}\right)^n.$$
We take the case $b = n^{1/2}$. We know from [3] that the critical time (the time when the support of the limiting density changes from one interval to two) is 1/2, and the place (where the intervals separate) is $x = 0$. We make the replacements $\tau_k \to 1/2 + n^{-1/2}\tau_k$ and the scaling $x \to n^{-1/4}x$, $y \to n^{-1/4}y$. More exactly, we define
$$K_{n,ij}(x, y) = n^{-1/4}\, K_{ij}(n^{-1/4}x, n^{-1/4}y),$$
with the new definition of the $\tau_k$. (Notice the change of indices from $k$ and $\ell$ to $i$ and $j$. This is for later convenience.) The kernel $E_{n,ij}(x, y)$ is exactly the same as $E_{ij}(x, y)$. As for $H_{n,ij}(x, y)$, its integral representation is obtained from (2.11) by the scaling replacements and then by the substitutions $s \to n^{1/4}s$, $t \to n^{1/4}t$ in the integral itself.
The result is (we apologize for its length)
$$H_{n,ij}(x, y) = -\frac{1}{\pi^2}\, \frac{1}{\sqrt{(1 - 2n^{-1/2}\tau_i)(1 - 2n^{-1/2}\tau_j)}}\, \exp\left\{\frac{2y^2}{n^{1/2} - 2\tau_j} - \frac{2x^2}{n^{1/2} - 2\tau_i}\right\}$$
$$\times \int_C \int_{-i\infty}^{i\infty} \exp\left\{-n^{1/2}\, \frac{1 + 2n^{-1/2}\tau_i}{1 - 2n^{-1/2}\tau_i}\, t^2 + \frac{4xt}{1 - 2n^{-1/2}\tau_i}\right\}$$
$$\times \exp\left\{n^{1/2}\, \frac{1 + 2n^{-1/2}\tau_j}{1 - 2n^{-1/2}\tau_j}\, s^2 - \frac{4ys}{1 - 2n^{-1/2}\tau_j}\right\}$$
$$\times \left(\frac{1 - s^2/n^{1/2}}{1 - t^2/n^{1/2}}\right)^n \frac{ds\, dt}{s - t}. \tag{3.1}$$
We shall show that this has the limiting form
$$H^{\mathrm{Pearcey}}_{ij}(x, y) = -\frac{1}{\pi^2} \int_C \int_{-i\infty}^{i\infty} e^{-s^4/2 + 4\tau_j s^2 - 4ys + t^4/2 - 4\tau_i t^2 + 4xt}\, \frac{ds\, dt}{s - t}, \tag{3.2}$$
where, as in (2.11) and (3.1), the $s$ integration is along the imaginary axis and the contour $C$ consists of the rays from $\pm\infty e^{i\pi/4}$ to 0 and the rays from 0 to $\pm\infty e^{-i\pi/4}$. Precisely, we shall show that
$$\lim_{n \to \infty} H_{n,ij}(x, y) = H^{\mathrm{Pearcey}}_{ij}(x, y) \tag{3.3}$$
uniformly for $x$ and $y$ in an arbitrary bounded set, and similarly for all partial derivatives.$^5$

The factor outside the integral in (3.1) converges to $-1/\pi^2$. The first step in proving the convergence of the integral in (3.1) to that in (3.2) is to establish pointwise convergence of the integrand. The first exponential factor in the integrand in (3.1) is
$$\exp\left\{-(n^{1/2} + 4\tau_i + O(n^{-1/2}))\, t^2 + (4 + O(n^{-1/2}))\, xt\right\},$$
while the second exponential factor is
$$\exp\left\{(n^{1/2} + 4\tau_j + O(n^{-1/2}))\, s^2 - (4 + O(n^{-1/2}))\, ys\right\}. \tag{3.4}$$
When $s = o(n^{1/4})$, $t = o(n^{1/4})$ the last factor in the integrand is equal to
$$\exp\left\{n^{1/2} t^2 + t^4/2 + o(t^4/n) - n^{1/2} s^2 - s^4/2 + o(s^4/n)\right\}.$$
Thus the entire integrand (aside from the factor $1/(s - t)$) is
$$\exp\left\{-(1 + o(1))\, s^4/2 + (4\tau_j + O(n^{-1/2}))\, s^2 - (4 + O(n^{-1/2}))\, ys\right\}$$
$$\times \exp\left\{(1 + o(1))\, t^4/2 - (4\tau_i + O(n^{-1/2}))\, t^2 + (4 + O(n^{-1/2}))\, xt\right\}. \tag{3.5}$$
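The two expansions feeding (3.5) can be checked numerically: the quadratic coefficient $n^{1/2}(1 + 2n^{-1/2}\tau)/(1 - 2n^{-1/2}\tau) = n^{1/2} + 4\tau + O(n^{-1/2})$, and the logarithm of the last factor in (3.1), which approaches $n^{1/2}t^2 + t^4/2 - n^{1/2}s^2 - s^4/2$. The sketch below uses one sample point on each contour ($s$ on the imaginary axis, $t$ on the ray at angle $\pi/4$); the function names and sample values are illustrative assumptions.

```python
import cmath
import math

def log_last_factor(s, t, n):
    """n * [log(1 - s^2/sqrt(n)) - log(1 - t^2/sqrt(n))], the log of the last factor in (3.1)."""
    r = math.sqrt(n)
    return n * (cmath.log(1 - s * s / r) - cmath.log(1 - t * t / r))

def leading_terms(s, t, n):
    """sqrt(n) t^2 + t^4/2 - sqrt(n) s^2 - s^4/2."""
    r = math.sqrt(n)
    return r * t ** 2 + t ** 4 / 2 - r * s ** 2 - s ** 4 / 2

def quadratic_coefficient(tau, n):
    """sqrt(n) (1 + 2 tau / sqrt(n)) / (1 - 2 tau / sqrt(n))."""
    eps = 2 * tau / math.sqrt(n)
    return math.sqrt(n) * (1 + eps) / (1 - eps)

s = 0.8j                                  # point on the imaginary axis
t = 0.9 * cmath.exp(1j * math.pi / 4)     # point on the ray of C at angle pi/4
ns = (1e4, 1e6, 1e8)

gaps = [abs(log_last_factor(s, t, n) - leading_terms(s, t, n)) for n in ns]
coeff_gaps = [abs(quadratic_coefficient(0.7, n) - (math.sqrt(n) + 4 * 0.7)) for n in ns]
```

The large terms $\pm n^{1/2}t^2$ and $\pm n^{1/2}s^2$ cancel against the exponential factors in (3.4) and its $t$ analogue, which is what leaves the quartic exponent of (3.2).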
In particular this establishes pointwise convergence of the integrands in (3.1) to that in (3.2). For the claimed uniform convergence of the integrals and their derivatives it is enough to show that they all converge pointwise and boundedly. To do this we change the $t$ contour $C$ by rotating its rays slightly toward the real line. (How much we rotate we say below. We can revert to the original contour after taking the limit.) This is so that on the modified contour, which we denote by $C'$, we have $\Re\, t^2 > 0$ as well as $\Re\, t^4 < 0$. The function $1/(s - t)$ belongs to $L^q$ for any $q < 2$ in the neighborhood of $s = t = 0$ on the contours of integration and to $L^q$ for any $q > 2$ outside this neighborhood. To establish pointwise bounded convergence of the integral it therefore suffices to show that for any $p \in (1, \infty)$ the rest of the integrand (which we know converges pointwise) has $L^p$ norm which is uniformly bounded in $x$ and $y$.$^6$ The rest of the integrand is the product of a function of $s$ and a function of $t$ and we show that both of these functions have uniformly bounded $L^p$ norms.

5 The constants in (3.2) are different from those in (1.1), a matter of no importance. In the next section we shall make the appropriate change so that they agree.
6 That this suffices follows from the fact, an exercise, that if $\{f_n\}$ is a bounded sequence in $L^p$ converging pointwise to $f$ then $(f_n, g) \to (f, g)$ for all $g \in L^q$, where $p = q/(q - 1)$. We take $f_n$ to be the integrand in (3.1) except for the factor $1/(s - t)$ and $g$ to be $1/(s - t)$, and apply this twice, with $q < 2$ in a neighborhood of $s = t = 0$ and with $q > 2$ outside the neighborhood.

From (3.5) it follows that for some small $\varepsilon > 0$ the function of $t$ is $O(e^{\Re\, t^4/2 + O(|t|^2)})$ uniformly in $x$ if $|t| \leq \varepsilon\, n^{1/4}$. Given this $\varepsilon$ we choose $C'$ to consist of the rays of $C$ rotated slightly toward the real axis so that if $\theta = \arg t^2$ when $t \in C'$ then $\cos\theta = \varepsilon^2/2$. When $t \in C'$ and $|t| \leq \varepsilon\, n^{1/4}$ the function of $t$ is $O(e^{\Re\, t^4/2 + O(|t|^2)}) = O(e^{\cos 2\theta\, |t|^4/2})$. Since $\cos 2\theta < 0$ the $L^p$ norm on this part of $C'$ is $O(1)$. When $t \in C'$ and $|t| \geq \varepsilon\, n^{1/4}$ we have
$$|1 - t^2/n^{1/2}|^2 = 1 + n^{-1}|t|^4 - 2n^{-1/2} \cos\theta\, |t|^2.$$
But
$$n^{-1}|t|^4 - 2n^{-1/2} \cos\theta\, |t|^2 = n^{-1/2}|t|^2\, (n^{-1/2}|t|^2 - \varepsilon^2) \geq 0$$
when $|t| \geq \varepsilon\, n^{1/4}$. Thus $|1 - t^2/n^{1/2}| \geq 1$ and the function of $t$ is
$$O(e^{-\cos\theta\, n^{1/2}|t|^2 + O(|t|^2)}) = O(e^{-\cos\theta\, n^{1/2}|t|^2/2}),$$
and the $L^p$ norm on this part of $C'$ is $o(1)$. We have shown that on $C'$ the function of $t$ has uniformly bounded $L^p$ norm.

For the $L^p$ norm of the function of $s$ we see from (3.4) that the integral of its $p$th power is at most a constant independent of $y$ times
$$\int_0^\infty e^{-p n^{1/2} s^2 + \tau s^2}\, (1 + n^{-1/2} s^2)^{pn}\, ds.$$
(We replaced s by is, used evenness, and took any τ > −4p τj .) The variable change s → n1/4 s replaces this by ∞ 2 2 1/2 2 n1/4 e−pn (s −log (1+s ))+n τ s ds. 0
The integral over (1, ∞) is exponentially small. Since s 2 − log (1 + s 2 ) ≥ s 4 /2 when s ≤ 1, what remains is at most
1
n1/4 0
e−pn s
4 /2+O(n1/2 s 2 )
ds =
n1/4
e−p s
4 /2+O(s 2 )
ds,
0
which is $O(1)$. This completes the demonstration of the bounded pointwise convergence of $H_{n,ij}(x, y)$ to $H^{\mathrm{Pearcey}}_{ij}(x, y)$. Taking any partial derivative just inserts in the integrand a polynomial in $x$, $y$, $s$ and $t$, and the argument for the modified integral is virtually the same. This completes the proof of (3.3).

4. Differential Equations for the Pearcey Process

We expect the extended Pearcey kernel to characterize the Pearcey process, a point process which can be thought of as infinitely many nonintersecting paths. Given sets $X_k$, the probability that for each $k$ no path passes through the set $X_k$ at time $\tau_k$ is equal to $\det(I - K\chi)$,
where $\chi(y) = \mathrm{diag}(\chi_{X_k}(y))$. The following discussion follows closely that in [21]. We take the case where each $X_k$ is a finite union of intervals with end-points $\xi_{kw}$, $w = 1, 2, \ldots$, in increasing order. If we set $R = (I - K\chi)^{-1} K$, with kernel $R(x, y)$, then
$$\partial_{kw} \det(I - K\chi) = (-1)^{w+1} R_{kk}(\xi_{kw}, \xi_{kw}).$$
(We use the notation $\partial_{kw}$ for $\partial_{\xi_{kw}}$.) We shall find a system of PDEs in the variables $\xi_{kw}$ with the right sides above among the unknown functions.

In order to have the simplest coefficients later we make the further variable changes $s \to s/2^{1/4}$, $t \to t/2^{1/4}$ and substitutions $x \to 2^{1/4}x/4$, $y \to 2^{1/4}y/4$, $\tau_k \to 2^{1/2}\tau_k/8$. The resulting rescaled kernels are (we omit the superscripts “Pearcey”)
$$H_{ij}(x, y) = -\frac{1}{4\pi^2} \int_C \int_{-i\infty}^{i\infty} e^{-s^4/4 + \tau_j s^2/2 - ys + t^4/4 - \tau_i t^2/2 + xt}\, \frac{ds\, dt}{s - t},$$
which is (1.1), and
$$E_{ij}(x, y) = \frac{1}{\sqrt{2\pi(\tau_j - \tau_i)}}\; e^{-\frac{(x - y)^2}{2(\tau_j - \tau_i)}}.$$
Define the vector functions
$$\varphi(x) = \left(\frac{1}{2\pi i} \int_C e^{t^4/4 - \tau_k t^2/2 + xt}\, dt\right), \qquad \psi(y) = \left(\frac{1}{2\pi i} \int_{-i\infty}^{i\infty} e^{-s^4/4 + \tau_k s^2/2 - ys}\, ds\right).$$
We think of $\varphi$ as a column $m$-vector and $\psi$ as a row $m$-vector. Their components are Pearcey functions.$^7$ The vector functions satisfy the differential equations
$$\varphi'''(x) - \tau\, \varphi'(x) + x\, \varphi(x) = 0, \qquad \psi'''(y) - \psi'(y)\, \tau - y\, \psi(y) = 0,$$
where $\tau = \mathrm{diag}(\tau_k)$.

7 In case $m = 1$ the kernel has the explicit representation
$$K(x, y) = \frac{\varphi''(x)\, \psi(y) - \varphi'(x)\, \psi'(y) + \varphi(x)\, \psi''(y) - \tau\, \varphi(x)\, \psi(y)}{x - y}$$
in terms of the Pearcey functions. (Here we set $\tau = \tau_1$. This is the same as the $i, i$ entry of the matrix kernel if we set $\tau = \tau_i$.) This was shown in [6]. Another derivation will be given in footnote 9.
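The differential equation for a single component of $\psi$ can be verified by quadrature. Substituting $s = iu$ turns the defining integral into $\psi(y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-u^4/4 - \tau u^2/2 - iyu}\, du$, and each $y$-derivative inserts a factor $-iu$. The sketch below, for a scalar $\tau$ and with illustrative names and sample values, evaluates $\psi$, $\psi'$, $\psi'''$ with the trapezoid rule and forms the residual of $\psi''' - \tau\psi' - y\psi = 0$.

```python
import cmath
import math

def psi_derivative(k, y, tau, half_width=10.0, steps=4000):
    """k-th derivative of psi(y) = (1/2 pi) * integral of exp(-u^4/4 - tau u^2/2 - i y u) du.

    Obtained from the contour integral in the text by the substitution s = i u;
    differentiating k times in y inserts the factor (-i u)^k.
    """
    h = 2 * half_width / steps
    total = 0j
    for m in range(steps + 1):
        u = -half_width + m * h
        weight = 0.5 if m in (0, steps) else 1.0
        total += weight * (-1j * u) ** k * cmath.exp(-u ** 4 / 4 - tau * u * u / 2 - 1j * y * u)
    return total * h / (2 * math.pi)

tau, y = 0.4, 1.3   # sample parameter and argument (illustrative values)
residual = (psi_derivative(3, y, tau)
            - tau * psi_derivative(1, y, tau)
            - y * psi_derivative(0, y, tau))
psi_at_zero = psi_derivative(0, 0.0, tau)
```

The residual vanishes because the combined integrand is an exact $u$-derivative of the exponential, which is the same integration-by-parts argument that yields the equation for $\varphi$ on the contour $C$.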
Define the column vector function $Q$ and row vector function $P$ by
$$Q = (I - K\chi)^{-1} \varphi, \qquad P = \psi\, (I - \chi K)^{-1}. \tag{4.1}$$
The unknowns in our equations will be six vector functions indexed by the end-points $kw$ of the $X_k$ and three matrix functions with the same indices. The vector functions are denoted by $q, q', q'', p, p', p''$. The first three are column vectors and the second three are row vectors. They are defined by
$$q_{iu} = Q_i(\xi_{iu}), \qquad q'_{iu} = Q_i'(\xi_{iu}), \qquad q''_{iu} = Q_i''(\xi_{iu}),$$
and analogously for $p, p', p''$. The matrix function unknowns are $r, r_x, r_y$ defined by
$$r_{iu,jv} = R_{ij}(\xi_{iu}, \xi_{jv}), \qquad r_{x,iu,jv} = R_{xij}(\xi_{iu}, \xi_{jv}), \qquad r_{y,iu,jv} = R_{yij}(\xi_{iu}, \xi_{jv}).$$
(Here $R_{xij}$, for example, means $\partial_x R_{ij}(x, y)$.) The equations themselves will contain the matrix functions $r_{xx}, r_{xy}, r_{yy}$ defined analogously, but we shall see that the combinations of them that appear can be expressed in terms of the unknown functions.

The equations will be stated in differential form. We use the notation
$$\xi = \mathrm{diag}(\xi_{kw}), \qquad d\xi = \mathrm{diag}(d\xi_{kw}), \qquad s = \mathrm{diag}((-1)^{w+1}).$$
Recall that $q$ is a column vector and $p$ a row vector. Our equations are
$$dr = -r\, s\, d\xi\, r + d\xi\, r_x + r_y\, d\xi, \tag{4.2}$$
$$dr_x = -r_x\, s\, d\xi\, r + d\xi\, r_{xx} + r_{xy}\, d\xi, \tag{4.3}$$
$$dr_y = -r\, s\, d\xi\, r_y + d\xi\, r_{xy} + r_{yy}\, d\xi, \tag{4.4}$$
$$dq = d\xi\, q' - r\, s\, d\xi\, q, \tag{4.5}$$
$$dp = p'\, d\xi - p\, d\xi\, s\, r, \tag{4.6}$$
$$dq' = d\xi\, q'' - r_x\, s\, d\xi\, q, \tag{4.7}$$
$$dp' = p''\, d\xi - p\, d\xi\, s\, r_y, \tag{4.8}$$
$$dq'' = d\xi\, (\tau\, q' - \xi\, q + r\, s\, q'' - r_y\, s\, q' + r_{yy}\, s\, q - r\, \tau\, s\, q) - r_{xx}\, s\, d\xi\, q, \tag{4.9}$$
$$dp'' = (p'\, \tau + p\, \xi + p''\, s\, r - p'\, s\, r_x + p\, s\, r_{xx} - p\, s\, \tau\, r)\, d\xi - p\, d\xi\, s\, r_{yy}. \tag{4.10}$$
One remark about the matrix $\tau$ in Eqs. (4.9) and (4.10). Earlier $\tau$ was the $m \times m$ diagonal matrix with $k$th diagonal entry $\tau_k$. In the equations here it is the diagonal matrix with $kw$ diagonal entry $\tau_k$. The exact meaning of $\tau$ when it appears will be clear from the context.

As in [21], what makes the equations possible is that the operator $K$ has some nice commutators. In this case we also need a miracle at the end.$^8$

8 That it seems a miracle to us shows that we do not really understand why the equations should close.
Denote by ϕ ⊗ ψ the operator with matrix kernel (ϕi (x) ψj (y)), where ϕi and ψj are the components of ϕ and ψ, respectively. If we apply the operator ∂x + ∂y to the integral defining Hij (x, y) we obtain the commutator relation [D, H ] = −ϕ ⊗ ψ, where D = d/dx. Since also [D, E] = 0 we have [D, K] = −ϕ ⊗ ψ.
(4.11)
This is the first commutator. From it follows
$$[D, K\chi] = -\varphi \otimes \psi\, \chi + K\delta.$$
Here we have used the following notation: $\delta_{kw}$ is the diagonal matrix operator whose $k$th diagonal entry equals multiplication by $\delta(y - \xi_{kw})$, and
$$\delta = \sum_{kw} (-1)^{w+1}\, \delta_{kw}.$$
It appears above because D χ = δ. Set ρ = (I − K χ )−1 , R = ρ K = (I − K χ K)−1 − I. It follows from the last commutator upon left- and right-multiplication by ρ that [D, ρ] = −ρ ϕ ⊗ ψ χ ρ + R δ ρ. From the commutators of D with K and ρ we compute [D, R] = [D, ρ K] = ρ [D, K] + [D, ρ] K = −ρ ϕ ⊗ ψ (I + χ R) + R δ R. Notice that I + χ R = (I − χ K)−1 . If we recall (4.1) we see that we have shown [D, R] = −Q ⊗ P + R δ R.
(4.12)
To obtain our second commutator we observe that if we apply $\partial_t + \partial_s$ to the integrand in the formula for $H_{ij}(x, y)$ we get zero for the resulting integral. If we apply it to $(s - t)^{-1}$ we also get zero. Therefore we get zero if we apply it to the numerator, and this operation brings down the factor
\[ t^3 - \tau_i t + x - s^3 + \tau_j s - y. \]
The same factor results if we apply to $H_{ij}(x, y)$ the operator
\[ \partial_x^3 + \partial_y^3 - (\tau_i\,\partial_x + \tau_j\,\partial_y) + (x - y). \]
We deduce $[D^3 - \tau D + M, H] = 0$. One verifies that also $[D^3 - \tau D + M, E] = 0$. Hence
\[ [D^3 - \tau D + M, K] = 0. \tag{4.13} \]
This is the second commutator.$^{9}$ From it we obtain
\[ [D^3 - \tau D + M, K\chi] = K\,[D^3 - \tau D + M, \chi] = K\,(D\delta D + D^2\delta + \delta D^2 - \tau\delta), \]
and this gives
\[ [D^3 - \tau D + M, \rho] = R\,(D\delta D + D^2\delta + \delta D^2)\,\rho - R\,\tau\,\delta\,\rho, \tag{4.14} \]
which in turn gives
\[ [D^3 - \tau D + M, R] = [D^3 - \tau D + M, \rho K] = R\,(D\delta D + D^2\delta + \delta D^2)\,R - R\,\tau\,\delta\,R. \tag{4.15} \]
Of our nine equations the first seven are universal: they do not depend on the particulars of the kernel $K$ or vector functions $\varphi$ or $\psi$. (The same was observed in [21].) What are not universal are Eqs. (4.9) and (4.10). For the equations to close we shall also have to show that the combinations of the entries of $r_{xx}$, $r_{xy}$ and $r_{yy}$ which actually appear in the equations are all expressible in terms of the unknown functions. The reader can check that these are the diagonal entries of $r_{xx} + r_{xy}$ and $r_{xy} + r_{yy}$ (which also give the diagonal entries of $r_{xx} - r_{yy}$) and the off-diagonal entries of $r_{xx}$, $r_{xy}$ and $r_{yy}$.

What we do at the beginning of our derivation will be a repetition of what was done in [21]. First, we have
\[ \partial_{kw}\,\rho = \rho\,(K\,\partial_{kw}\chi)\,\rho = (-1)^w\,R\,\delta_{kw}\,\rho. \tag{4.16} \]
From this we obtain $\partial_{kw} R = (-1)^w R\,\delta_{kw} R$, and so
\[ \begin{aligned} \partial_{kw}\, r_{iu,jv} &= (\partial_{kw} R_{ij})(\xi_{iu}, \xi_{jv}) + R_{x,ij}(\xi_{iu}, \xi_{jv})\,\delta_{iu,kw} + R_{y,ij}(\xi_{iu}, \xi_{jv})\,\delta_{jv,kw} \\ &= (-1)^w\, r_{iu,kw}\, r_{kw,jv} + r_{x,iu,jv}\,\delta_{iu,kw} + r_{y,iu,jv}\,\delta_{jv,kw}. \end{aligned} \]
Multiplying by $d\xi_{kw}$ and summing over $kw$ give (4.2). Equations (4.3) and (4.4) are derived analogously.

Next we derive (4.5) and (4.7). Using (4.16) applied to $\varphi$ we obtain
\[ \partial_{kw}\, q_{iu} = Q'_i(\xi_{iu})\,\delta_{iu,kw} + (-1)^w (R\,\delta_{kw}\,Q)_i(\xi_{iu}) = q'_{iu}\,\delta_{iu,kw} + (-1)^w\, r_{iu,kw}\, q_{kw}. \]
Multiplying by $d\xi_{kw}$ and summing over $kw$ give (4.5). If we multiply (4.16) on the left by $D$ we obtain $\partial_{kw}\,\rho_x = (-1)^w R_x\,\delta_{kw}\,\rho$, and applying the result to $\varphi$ we obtain (4.7) similarly.

For (4.9) we begin as above, now applying $D^2$ to (4.16) on the left and $\varphi$ on the right to obtain
\[ \partial_{kw}\, q''_{iu} = Q'''_i(\xi_{iu})\,\delta_{iu,kw} + (-1)^w\, r_{xx,iu,kw}\, q_{kw}. \tag{4.17} \]
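$^{9}$ From (4.11) we obtain also $[D^3, K] = -\varphi'' \otimes \psi + \varphi' \otimes \psi' - \varphi \otimes \psi''$. Combining this with (4.11) itself and (4.13) for $m = 1$ with $\tau = \tau_1$ we obtain $[M, K] = \varphi'' \otimes \psi - \varphi' \otimes \psi' + \varphi \otimes \psi'' - \tau\,\varphi \otimes \psi$. This is equivalent to the formula stated in footnote 7.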
Now, though, we have to compute the first term on the right. To do it we apply (4.14) to $\varphi$ and use the differential equation satisfied by $\varphi$ to obtain
\[ Q'''(x) - \tau\, Q'(x) + x\, Q(x) = -R_y\,\delta\, Q' + R_{yy}\,\delta\, Q + R\,\delta\, Q'' - R\,\tau\,\delta\, Q. \tag{4.18} \]
This gives
\[ Q'''_i(\xi_{iu}) = \tau_i\, q'_{iu} - \xi_{iu}\, q_{iu} + (-r_y\, s\, q' + r_{yy}\, s\, q + r\, s\, q'' - r\,\tau\, s\, q)_{iu}. \]
If we substitute this into (4.17), multiply by $d\xi_{kw}$ and sum over $kw$ we obtain Eq. (4.9). This completes the derivation of the equations for the differentials of $q$, $q'$ and $q''$.

We could say that the derivation of the equations for the differentials of $p$, $p'$ and $p''$ is analogous, which is true. But here is a better way. Observe that the $P$ for the operator $K$ is the transpose of the $Q$ for the transpose of $K$, and similarly with $P$ and $Q$ interchanged. It follows from this that for any equation involving $Q$ there is another one for $P$ obtained by replacing $K$ by its transpose (and so interchanging $\partial_x$ and $\partial_y$) and taking transposes. The upshot is that Eqs. (4.6), (4.8) and (4.10) are consequences of (4.5), (4.7) and (4.9). The reason for the difference in signs in the appearance of $\xi$ on the right sides of (4.9) and (4.10) is the difference in signs in the last terms in the differential equations for $\varphi$ and $\psi$.

Finally we have to show that the diagonal entries of $r_{xx} + r_{xy}$ and $r_{xy} + r_{yy}$ and the off-diagonal entries of $r_{xx}$, $r_{xy}$ and $r_{yy}$ are all known, in the sense that they are expressible in terms of the unknown functions. This is really the heart of the matter.

We use $\equiv$ between expressions involving $R$, $Q$ and $P$ and their derivatives to indicate that the difference involves at most two derivatives of $Q$ or $P$ and at most one derivative of $R$. The reason is that if we take the appropriate entries evaluated at the appropriate points we obtain a known quantity, i.e., one expressible in terms of the unknown functions. If we multiply (4.12) on the left or right by $D$ we obtain
\[ R_{xx} + R_{xy} = -Q' \otimes P + R_x\,\delta\, R, \qquad R_{xy} + R_{yy} = -Q \otimes P' + R\,\delta\, R_y. \]
In particular
\[ R_{xx} + R_{xy} \equiv 0, \qquad R_{xy} + R_{yy} \equiv 0, \]
so in fact all entries of $r_{xx} + r_{xy}$ and $r_{xy} + r_{yy}$ are known. From (4.12) we obtain consecutively
\[ [D^2, R] = -Q' \otimes P + Q \otimes P' + DR\,\delta\, R + R\,\delta\, RD, \tag{4.19} \]
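\[ [D^3, R] = -Q'' \otimes P + Q' \otimes P' - Q \otimes P'' + D^2 R\,\delta\, R + DR\,\delta\, RD + R\,\delta\, RD^2. \tag{4.20} \]
If we subtract (4.15) from (4.20) we find
\[ \begin{aligned} [\tau D - M, R] = {} & -Q'' \otimes P + Q' \otimes P' - Q \otimes P'' \\ & + R_{xx}\,\delta\, R - R_x\,\delta\, R_y + R\,\delta\, R_{yy} + R_y\,\delta\, R_x - R_{yy}\,\delta\, R - R\,\delta\, R_{xx} + R\,\tau\,\delta\, R. \end{aligned} \]
We use
\[ R_{xx} - R_{yy} = Q \otimes P' - Q' \otimes P + R_x\,\delta\, R - R\,\delta\, R_y \]
and $R_y = -R_x - Q \otimes P + R\,\delta\, R$ to see that this equals
\[ \begin{aligned} & -Q'' \otimes P + Q' \otimes P' - Q \otimes P'' + (Q \otimes P' - Q' \otimes P - R\,\delta\, R_y)\,\delta\, R \\ & - R\,\delta\,(Q \otimes P' - Q' \otimes P + R_x\,\delta\, R - R\,\delta\, R_y) + R_x\,\delta\, Q \otimes P \\ & - (Q \otimes P - R\,\delta\, R)\,\delta\, R_x + R\,\tau\,\delta\, R. \end{aligned} \tag{4.21} \]
We first apply $D$ acting on the left to this, and deduce that
\[ D\,[\tau D, R] \equiv -Q''' \otimes P + R_{xx}\,\delta\, Q \otimes P. \]
If we use (4.18) and the fact that $R_{yy} \equiv R_{xx}$ we see that this is $\equiv 0$. This means that
\[ \tau_i\, R_{xx,i,j} + \tau_j\, R_{xy,i,j} \equiv 0. \]
Since $R_{xx,i,j} + R_{xy,i,j} \equiv 0$ we deduce that $R_{xx,i,j}$ and $R_{xy,i,j}$ are individually $\equiv 0$ when $i \neq j$. Therefore $r_{xx,iu,jv}$ and $r_{xy,iu,jv}$ are known then.

But we still have to show that $r_{xx,iu,iv}$ is known when $u \neq v$, and for this we apply $D^2$ to (4.21) rather than $D$. We get this time
\[ \begin{aligned} D^2\,[\tau D - M, R] \equiv {} & -Q'''' \otimes P + Q''' \otimes (P' - P\,\delta\, R) - R_{xx}\,\delta\, R_y\,\delta\, R \\ & - R_{xx}\,\delta\,(Q \otimes P' - Q' \otimes P + R_x\,\delta\, R - R\,\delta\, R_y) \\ & + R_{xxx}\,\delta\, Q \otimes P + R_{xx}\,\delta\, R\,\delta\, R_x + R_{xx}\,\tau\,\delta\, R. \end{aligned} \tag{4.22} \]
We first compare the diagonal entries of $D^2\,[\tau D, R]$ on the left with those of $R_{xx}\,\tau\,\delta\, R$ on the right. The diagonal entries of the former are the same as those of $\tau\,(R_{xxx} + R_{xxy}) \equiv \tau\, R_{xx}\,\delta\, R$. (Notice that applying $D^2$ to (4.12) on the left gives $R_{xxx} + R_{xxy} \equiv R_{xx}\,\delta\, R$.) The difference between this and $R_{xx}\,\tau\,\delta\, R$ is $[\tau, R_{xx}]\,\delta\, R$. Only the off-diagonal entries of $R_{xx}$ occur here, so this is $\equiv 0$. If we remove these terms from (4.22) the left side becomes $-D^2\,[M, R]$ and the resulting right side we write, after using the fact $R_x + R_y = -Q \otimes P + R\,\delta\, R$ twice, as
\[ -Q'''' \otimes P + Q''' \otimes (P' - P\,\delta\, R) - R_{xx}\,\delta\,(Q \otimes P' - Q' \otimes P - Q \otimes P\,\delta\, R + R\,\delta\, Q \otimes P) + R_{xxx}\,\delta\, Q \otimes P. \tag{4.23} \]
Now we use (4.18) and the facts $R_{yy} \equiv R_{xx}$, $R_{xy} \equiv -R_{xx}$, $R_{xxx} - R_{xyy} \equiv R_{xx}\,\delta\, R$ (the last comes from applying $D$ to (4.19) on the left) to obtain
\[ Q''' \equiv R_{xx}\,\delta\, Q, \qquad Q'''' \equiv R_{xx}\,\delta\, Q' + (R_{xxx} - R_{xx}\,\delta\, R)\,\delta\, Q. \]
Substituting these expressions for $Q'''$ and $Q''''$ into (4.23) shows that it is $\equiv 0$. This was the miracle.

We have shown that $D^2\,[M, R] \equiv 0$, in other words $(x - y)\, R_{xx}(x, y) \equiv 0$. If we set $x = \xi_{iu}$, $y = \xi_{iv}$ we deduce that $R_{xx}(\xi_{iu}, \xi_{iv}) = r_{xx,iu,iv}$ is known when $u \neq v$.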
5. Higher-Order Singularities

We begin with the fictitious Brownian motion model, in which the end-points of the paths are complex numbers. The model consists of $2Rn$ nonintersecting Brownian paths starting at zero, with $n$ of them ending at each of the points $\pm n^{1/2} b_r$ ($r = 1, \ldots, R$). The product in the integrand in (2.11) becomes
\[ \prod_{r=1}^{R} \left( \frac{b_r^2 - s^2/n}{b_r^2 - t^2/n} \right)^{n}, \tag{5.1} \]
and we use the same contours as before. We shall first make the substitutions $\tau_k \to 1/2 + n^{-\delta}\,\tau_k$ with $\delta$ to be determined. The first exponential in (2.11) becomes
\[ \exp\left\{ -(1 + 4\, n^{-\delta}\,\tau_k + O(n^{-2\delta}))\, t^2 + (4 + O(n^{-\delta}))\, xt + O(x^2) \right\}, \tag{5.2} \]
and the second exponential becomes
\[ \exp\left\{ (1 + 4\, n^{-\delta}\,\tau_\ell + O(n^{-2\delta}))\, s^2 - (4 + O(n^{-\delta}))\, ys + O(y^2) \right\}. \]
If we set $a_r = 1/b_r^2$ the product (5.1) is the exponential of
\[ \sum_r a_r\,(t^2 - s^2) + \frac{n^{-1}}{2} \sum_r a_r^2\,(t^4 - s^4) + \cdots + \frac{n^{-R}}{R+1} \sum_r a_r^{R+1}\,(t^{2R+2} - s^{2R+2}) + O\!\left(n^{-R-1}(|t|^{2R+4} + |s|^{2R+4})\right). \]
If $R > 1$ we choose the $a_r$ such that
\[ \sum_r a_r = 1, \qquad \sum_r a_r^2 = \cdots = \sum_r a_r^R = 0. \]
The $a_r$ are the roots of the equation
\[ a^R - a^{R-1} + \frac{1}{2!}\, a^{R-2} - \cdots + \frac{(-1)^R}{R!} = 0, \tag{5.3} \]
from which it follows that
\[ \sum_r a_r^{R+1} = \frac{(-1)^{R+1}}{R!}. \]
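The power-sum conditions on the $a_r$ and the value of $\sum_r a_r^{R+1}$ can be checked without computing the roots. The sketch below is illustrative only: the function name `power_sums` is hypothetical, and it assumes that the elementary symmetric polynomials read off from (5.3) are $e_k = 1/k!$ for $k \le R$, after which Newton's identities give the power sums exactly.

```python
from fractions import Fraction
from math import factorial


def power_sums(R, kmax=None):
    """Power sums p_k = sum_r a_r**k (k = 1..kmax) of the roots of (5.3).

    Assumption: (5.3) is monic with coefficients (-1)**k / k!, so the
    elementary symmetric polynomials of its roots are e_k = 1/k! for
    k <= R and e_k = 0 for k > R.  Newton's identities then give p_k.
    """
    if kmax is None:
        kmax = R + 1
    e = [Fraction(1, factorial(k)) if k <= R else Fraction(0)
         for k in range(kmax + 1)]
    p = [Fraction(0)] * (kmax + 1)  # p[0] is unused
    for k in range(1, kmax + 1):
        # Newton: p_k = sum_{i<k} (-1)^(i-1) e_i p_{k-i} + (-1)^(k-1) k e_k
        s = (-1) ** (k - 1) * k * e[k]
        for i in range(1, k):
            s += (-1) ** (i - 1) * e[i] * p[k - i]
        p[k] = s
    return p[1:]


if __name__ == "__main__":
    for R in (2, 3, 4, 5):
        print(R, [str(v) for v in power_sums(R)])
```

For each $R$ the list starts with 1, continues with $R - 1$ zeros, and ends with $(-1)^{R+1}/R!$, which is the coefficient that survives in front of $t^{2R+2} - s^{2R+2}$.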
In general the $a_r$ will be complex, and so the same will be true of the end-points of our Brownian paths. In the integrals defining $H_{ij}$ we make the substitutions $t \to n^{\delta/2}\, t$, $s \to n^{\delta/2}\, s$, where
\[ \delta = \frac{R}{R+1}, \]
and in the kernel we make the scaling $x \to n^{-\delta/2}\, x$, $y \to n^{-\delta/2}\, y$. This gives us the kernel $H_{n,ij}(x, y)$. As before $E_{ij}$ is unchanged, and the limiting form of $H_{n,ij}(x, y)$ is now
\[ -\frac{1}{\pi^2} \int_C \int_{-i\infty}^{i\infty} e^{(-1)^R s^{2R+2}/(R+1)! + 4\,\tau_j s^2 - 4ys + (-1)^{R+1} t^{2R+2}/(R+1)! - 4\,\tau_i t^2 + 4xt}\; \frac{ds\, dt}{s - t}. \]
This is formal and it is not at first clear what the $C$ contour should be, although one might guess that it consists of four rays, one in each quadrant, on which $(-1)^{R+1} t^{2R+2}$ is negative and real. We shall see that this is so, and that the rays are the most vertical ones, those between $0$ and $\pm\infty\, e^{\pm \frac{R}{2R+2}\pi i}$. The orientation of the rays is as in the case $R = 1$. (The $s$ integration should cause no new problems.)

After the variable changes the product of the two functions of $t$ in the integrand in (2.11) is of the form
\[ \frac{e^{-n^{\delta} t^2}}{\prod_{r} (b_r^2 - n^{\delta - 1}\, t^2)^n}\; e^{-4\,\tau_k t^2 + 4xt + O(n^{-\delta}(|t| + |t|^2))}. \tag{5.4} \]
The main part of this is the quotient. Upon the substitution $t \to n^{(1-\delta)/2}\, t$ the quotient becomes
\[ \frac{e^{-n t^2}}{\prod_{r} (b_r^2 - t^2)^n}. \tag{5.5} \]
Suppose we want to do a steepest descent analysis of the integral of this over a nearly vertical ray from 0 in the right half-plane. (This nearly vertical ray would be the part of the $t$ contour in (2.11) in the first quadrant.) No pole $\pm b_r$ is purely imaginary, as is clear from the equation the $a_r$ satisfy. So there are $R$ poles in the right half-plane and $R$ in the left. There are $2R + 2$ steepest descent curves emanating from the origin, half starting out in the right half-plane. These remain there since, as one can show, the integrand is positive and increasing on the imaginary axis. We claim that there is at least one pole between any two of these curves. The reason is that otherwise the integrals over these two curves would be equal, and so have equal asymptotics. That means, after computing the asymptotics, that the integrals
\[ \int_0^{\infty e^{ik\pi/(R+1)}} e^{(-1)^{R+1} t^{2R+2}}\, dt \]
would be the same for two different integers $k \in (-(R+1)/2,\ (R+1)/2)$. But the integrals are all different. Therefore there is a pole between any two of the curves.

Let $\Gamma$ be the curve which starts out most steeply, in the direction $\arg t = \frac{R}{2R+2}\,\pi$. It follows from what we have just shown that there is no pole between $\Gamma$ and the positive imaginary axis. This is what we wanted to show. The curve we take for the $t$ integral in (2.11) is $\Gamma_n = n^{(1-\delta)/2}\,\Gamma$. The original contour for the $t$-integral in the representation of $H_{n,ij}$ can be deformed to this one. (We are speaking now, of course, of one quarter of the full contour.)
We can now take care of the annoying part of the argument establishing the claimed asymptotics. The curve $\Gamma$ is asymptotic to the positive real axis at $+\infty$. Therefore for $A$ sufficiently large (5.5) is $O(e^{-n|t|^2/2})$ when $t \in \Gamma$, $|t| > A$. Hence (5.4) is $O(e^{-n^{\delta}|t|^2/4})$ when $t \in \Gamma_n$, $|t| > n^{(1-\delta)/2} A$. It follows that its $L^2$ norm over this portion of $\Gamma_n$ is exponentially small. When $t \in \Gamma$, $|t| > \varepsilon$ then (5.5) is $O(e^{-n\eta})$ for some $\eta > 0$, and it follows that (5.4) is $O(e^{-n\eta/2})$ when $t \in \Gamma_n$, $|t| > n^{(1-\delta)/2}\,\varepsilon$ and also $|t| < n^{(1-\delta)/2} A$. Therefore the norm of (5.4) over this portion of $\Gamma_n$ is also exponentially small. So we need consider only the portion of $\Gamma_n$ on which $|t| < n^{(1-\delta)/2}\,\varepsilon$, and for this we get the limit
\[ \int_0^{\infty e^{\frac{i\pi}{2} R/(R+1)}} e^{(-1)^{R+1} t^{2R+2} - 4\,\tau_k t^2 + x t}\, dt \]
with appropriate uniformity, in the usual way.

Just as with the Pearcey kernel we can search for a system of PDEs associated with $\det(I - K\chi)$. Again we obtain two commutators, which when combined show that $K$ is an integrable kernel when $m = 1$. In this case we define $\varphi$ and $\psi$ by
\[ \varphi(x) = \frac{1}{\pi i} \int_C e^{(-1)^{R+1} t^{2R+2}/(R+1)! - 4\,\tau_k t^2/2 + 4xt}\, dt, \qquad \psi(y) = \frac{1}{\pi i} \int_{-i\infty}^{i\infty} e^{(-1)^{R} s^{2R+2}/(R+1)! + 4\,\tau_k s^2/2 - 4ys}\, ds. \]
They satisfy the differential equations
\[ c_R\,\varphi^{(2R+1)}(x) - 2\,\tau\,\varphi'(x) + 4x\,\varphi(x) = 0, \qquad c_R\,\psi^{(2R+1)}(y) - 2\,\psi'(y)\,\tau - 4y\,\psi(y) = 0, \]
where
\[ c_R = 2\,\frac{(-1)^{R+1}}{4^{2R+1}\, R!}. \]
The first commutator is $[D, K] = -4\,\varphi \otimes \psi$, which also gives
\[ [D^n, K] = -4 \sum_{k=0}^{n-1} (-1)^k\, \varphi^{(n-k-1)} \otimes \psi^{(k)}. \tag{5.6} \]
The second commutator is
\[ [c_R\, D^{2R+1} - 2\,\tau D + 4M, K] = 0. \]
In case $m = 1$ (or for a general $m$ and a diagonal entry of $K$), combining this with the commutator $[D^{2R+1}, K]$ obtained from (5.6) and the differential equations for $\varphi$ and $\psi$ we get an expression for $[M, K]$ in terms of derivatives of $\varphi$ and $\psi$ up to order $2R$. This gives the analogue of the expression for $K(x, y)$ in footnote 7.

For a system of PDEs in this case we would have many more unknowns, and the industrious reader could write them down. However there will remain the problem of showing that certain quantities involving $2R$th derivatives of the resolvent kernel $R$ (too many $R$s!) evaluated at endpoints of the $X_k$ are expressible in terms of the unknowns. For the case $R = 1$ a miracle took place. Even to determine what miracle has to take place for general $R$ would be a nontrivial computational task.

Acknowledgement. This work was supported by the National Science Foundation under grants DMS-0304414 (first author) and DMS-0243982 (second author).
References
1. Adler, M., van Moerbeke, P.: A PDE for the joint distributions of the Airy process. http://arxiv.org/list/math.PR/0302329, 2003; PDEs for the joint distribution of the Dyson, Airy and Sine processes. Ann. Prob. 33, 1326–1361 (2005)
2. Aldous, D., Diaconis, P.: Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem. Bull. Amer. Math. Soc. 36, 413–432 (1999)
3. Aptekarev, A.I., Bleher, P.M., Kuijlaars, A.B.J.: Large n limit of Gaussian random matrices with external source, Part II. Commun. Math. Phys. 259, 367–389 (2005)
4. Bleher, P.M., Kuijlaars, A.B.J.: Random matrices with an external source and multiple orthogonal polynomials. Int. Math. Res. Not. 2004, No. 3, 109–129
5. Bleher, P.M., Kuijlaars, A.B.J.: Large n limit of Gaussian random matrices with external source, Part I. Commun. Math. Phys. 252, 43–76 (2004)
6. Brézin, E., Hikami, S.: Universal singularity at the closure of a gap in a random matrix theory. Phys. Rev. E 57, 4140–4149 (1998)
7. Brézin, E., Hikami, S.: Level spacing of random matrices in an external source. Phys. Rev. E 58, 7176–7185 (1998)
8. Eynard, B., Mehta, M.L.: Matrices coupled in a chain. I. Eigenvalue correlations. J. Phys. A: Math. Gen. 31, 4449–4456 (1998)
9. Ferrari, P.L., Prähofer, M., Spohn, H.: Stochastic growth in one dimension and Gaussian multi-matrix models. http://arxiv.org./list/math-ph/0310053, 2003
10. Johansson, K.: Toeplitz determinants, random growth and determinantal processes. Proc. Inter. Congress of Math., Vol. III (Beijing, 2002), Beijing: Higher Ed. Press, 2002, pp. 53–62
11. Johansson, K.: Discrete polynuclear growth and determinantal processes. Commun. Math. Phys. 242, 277–329 (2003)
12. Karlin, S., McGregor, J.: Coincidence probabilities. Pacific J. Math. 9, 1141–1164 (1959)
13. Lenard, A.: States of classical statistical mechanical systems of infinitely many particles, I. Arch. Rat. Mech. Anal. 59, 219–239 (1975)
14. Lenard, A.: States of classical statistical mechanical systems of infinitely many particles, II. Arch. Rat. Mech. Anal. 59, 241–256 (1975)
15. Okounkov, A., Reshetikhin, N.: Private communication with authors, June, 2003; Poitiers lecture, June 2004; Random skew plane partitions and the Pearcey process, http://front.math.ucdavis.edu/math.CO/0503508, 2005
16. Pearcey, T.: The structure of an electromagnetic field in the neighborhood of a cusp of a caustic. Philos. Mag. 37, 311–317 (1946)
17. Prähofer, M., Spohn, H.: Scale invariance of the PNG droplet and the Airy process. J. Stat. Phys. 108, 1071–1106 (2002)
18. Soshnikov, A.: Determinantal random point fields. Russ. Math. Surv. 55, 923–975 (2000)
19. Tracy, C.A., Widom, H.: Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
20. Tracy, C.A., Widom, H.: Distribution functions for largest eigenvalues and their applications. Proc. Inter. Congress of Math., Vol. I (Beijing, 2002), Beijing: Higher Ed. Press, 2002, pp. 587–596
21. Tracy, C.A., Widom, H.: Differential equations for Dyson processes. Commun. Math. Phys. 252, 7–41 (2004)

Communicated by H. Spohn
Commun. Math. Phys. 263, 401–437 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1505-4
Communications in Mathematical Physics
Semiclassical Orthogonal Polynomials, Matrix Models and Isomonodromic Tau Functions
M. Bertola1,2, B. Eynard1,3, J. Harnad1,2
1 Centre de recherches mathématiques, Université de Montréal, C. P. 6128, succ. centre ville, Montréal, Québec, Canada H3C 3J7. E-mail: [email protected], [email protected]
2 Department of Mathematics and Statistics, Concordia University, 7141 Sherbrooke W., Montréal, Québec, Canada H4B 1R6
3 Service de Physique Théorique, CEA/Saclay, Orme des Merisiers, 91191 Gif-sur-Yvette Cedex, France. E-mail: [email protected]
Received: 2 February 2005 / Accepted: 16 August 2005 Published online: 31 January 2006 – © Springer-Verlag 2006
Abstract: The differential systems satisfied by orthogonal polynomials with arbitrary semiclassical measures supported on contours in the complex plane are derived, as well as the compatible systems of deformation equations obtained from varying such measures. These are shown to preserve the generalized monodromy of the associated rank-2 rational covariant derivative operators. The corresponding matrix models, consisting of unitarily diagonalizable matrices with spectra supported on these contours are analyzed, and it is shown that all coefficients of the associated spectral curves are given by logarithmic derivatives of the partition function or, more generally, the gap probabilities. The associated isomonodromic tau functions are shown to coincide, within an explicitly computed factor, with these partition functions.

Contents
1. Introduction
2. Generalized Orthogonal Polynomials and Partition Functions
   2.1 Orthogonality measures and integration contours
       2.1.1 Definition of the boundary-free contours
       2.1.2 Definition of the hard-edge contours
   2.2 Recursion relations, derivatives and deformations equations
       2.2.1 Existence of orthogonal polynomials and relation to random matrices
       2.2.2 Wave vector equations
       2.2.3 Wave vector of the second kind
3. Folding
   3.1 n-windows and Christoffel–Darboux formula
   3.2 Folded version of the deformation equations for changes in the potential
   3.3 Folding of the endpoint deformations
   3.4 Folded version of the recursion relations and $\partial_x$ relations
4. Spectral Curve and Spectral Invariants
   4.1 Virasoro generators and the spectral curve
   4.2 Spectral residue formulæ
5. Isomonodromic Tau Function
   5.1 Isomonodromic deformations and residue formula
   5.2 Traceless gauge

Research supported in part by the Natural Sciences and Engineering Research Council of Canada, the Fonds FCAR du Québec and EC ITH Network HPRN-CT-1999-000161.
1. Introduction

The partition function for Hermitian random matrix models with measures that are exponentials of a polynomial potential was shown in [4] to be equal, within a multiplicative factor independent of the deformation parameters, to the Jimbo-Miwa-Ueno isomonodromic tau function [11] for the rank 2 linear differential system satisfied by the corresponding set of orthogonal polynomials. The results of [4] were in fact more general, in that polynomials orthogonal with respect to complex measures supported along certain contours in the complex plane were considered. These may be viewed as corresponding to unitarily diagonalizable matrix models in which the spectrum is constrained to lie on these contours. The purpose of the present work is to extend these considerations to the more general setting of complex measures whose logarithmic derivatives are arbitrary rational functions, the associated semiclassical orthogonal polynomials and generalized matrix models. By also including contours with endpoints, the latter viewed as further deformation parameters, the gap probability densities are included as special cases of partition functions.

To place the results in context, we first briefly recall the main points of [4], restricting to the more standard case of Hermitian matrices and real measures. Consider orthogonal polynomials $\pi_n(x) \in L^2(\mathbb{R}, e^{-\hbar^{-1} V(x)}\,dx)$ supported on the real line, with the measure defined by exponentiating a real polynomial potential
\[ V(x) = \sum_{J=1}^{d} \frac{t_J}{J}\, x^J. \tag{1-1} \]
(Here we assume $V(x)$ is of even degree and with positive leading coefficient, although these restrictions are unnecessary in the more general setting of [4].) The small parameter $\hbar$ is usually taken as $O(N^{-1})$ when considering the limit $N \to \infty$. Any two consecutive polynomials satisfy a first order system of ODE's,
\[ \frac{d}{dx} \begin{pmatrix} \pi_{n-1}(x) \\ \pi_n(x) \end{pmatrix} = D_n(x) \begin{pmatrix} \pi_{n-1}(x) \\ \pi_n(x) \end{pmatrix}, \tag{1-2} \]
where $D_n(x)$ is a $2 \times 2$ matrix with polynomial coefficients of degree at most $d - 1 = \deg(V'(x))$. The infinitesimal deformations corresponding to changes in the coefficients $\{t_J\}$ result in a sequence of Frobenius compatible, overdetermined systems of PDE's,
\[ \frac{\partial}{\partial t_J} \begin{pmatrix} \pi_{n-1}(x) \\ \pi_n(x) \end{pmatrix} = T_{n,J}(x) \begin{pmatrix} \pi_{n-1}(x) \\ \pi_n(x) \end{pmatrix}, \qquad J = 1, \ldots, d, \tag{1-3} \]
where the matrices $T_{n,J}(x)$ are polynomials in $x$ of degree $J$ which satisfy the compatibility conditions
\[ \left[ \frac{\partial}{\partial t_J} - T_{n,J}(x),\ \frac{d}{dx} - D_n(x) \right] = 0. \tag{1-4} \]
It follows that the generalized monodromy data of the sequence of rational covariant derivative operators $\frac{d}{dx} - D_n(x)$ are invariant under these deformations, and independent of the integer $n$. This is a particular case of the general problem of rational isomonodromic deformation systems [11]. An important rôle is played in this theory by the isomonodromic tau function $\tau_n^{IM}$ associated with any solution of an isomonodromic deformation system. This function on the space of deformation parameters is obtained by integrating a closed differential whose coefficients are given by residues involving the fundamental solutions of the system.

The main results of [4] were the following. First, the coefficients of the associated spectral curve, given by the characteristic equation of the matrix $D_n(x)$, can be obtained by applying certain first order differential operators with respect to the deformation parameters (Virasoro generators) to $\ln(Z_n(V))$, where the partition function $Z_n(V)$ of the associated $n \times n$ matrix model is
\[ Z_n(V) := \int_{H_n} dM\, \exp\left( -\frac{1}{\hbar}\, \mathrm{Tr}\, V(M) \right). \tag{1-5} \]
Second, this partition function is equal to the isomonodromic tau function up to a multiplicative factor that does not depend on the deformation parameters
\[ \tau_n^{IM} = Z_n(V)\, F_n. \tag{1-6} \]
The present work generalizes these results to the case of measures whose logarithmic derivatives are arbitrary rational functions, including those supported on curve segments in which the endpoints may play the rôle of further deformation parameters. The latter are of importance in the calculation of gap probabilities in matrix models [17] since these may, in this way, be put on the same footing as partition functions using measures supported on such segments [6]. A Frobenius compatible system of first order differential and deformation equations satisfied by the corresponding orthogonal polynomials is derived (Propositions 3.2, 3.4) and the coefficients of the associated spectral curve are again shown to be obtained by applying suitable Virasoro generators to $\ln(Z_n(V))$ (Theorems 4.1–4.2). A formula that generalizes (1-6) is also derived (Theorem 5.1):
\[ \tau_n^{IM} = Z_n(V)\, F_n(V), \tag{1-7} \]
where the factor $F_n(V)$ is an explicitly computed function of the deformation parameters determining $V$, which can in fact be eliminated by making a suitable scalar gauge transformation.

The results of Theorems 4.1, 4.2 give a precise meaning, for finite $n$, to formulæ that are usually derived in the asymptotic limit $n \to \infty$ ($\hbar n \sim O(1)$) through saddle point computations, relating the free energy to the asymptotic spectral curve. It is well known [9] that the free energy in the large $n$ limit is given by solving a minimization problem (in the Hermitian matrix model)
\[ F_0 := -\lim_{n \to \infty} \hbar^2 \ln Z_n = \min_{\rho(x) \geq 0} \left[ \int V(x)\,\rho(x)\,dx - \iint \rho(x)\,\rho(x') \ln|x - x'|\, dx\, dx' \right], \tag{1-8} \]
giving the equilibrium density $\rho_{eq}$ for the eigenvalue distribution. If, for instance, the potential is a real polynomial bounded from below, it is known [8] that the support of the equilibrium density is a union of finite segments $I \subset \mathbb{R}$. The density $\rho_{eq}$ is obtained from the variational equation
\[ 2\, P\!\!\int \frac{\rho_{eq}(x')\,dx'}{x - x'} = V'(x) \tag{1-9} \]
(for $x \in I$), and is related to the resolvent by
\[ \omega(z) := \lim_{n \to \infty} \hbar \left\langle \mathrm{Tr}\, \frac{1}{M - z} \right\rangle = \int_I \frac{\rho_{eq}(x)}{x - z}\, dx, \qquad z \in \mathbb{C} \setminus I. \tag{1-10} \]
In terms of this, the spectral density may be recovered as the jump-discontinuity of $\omega(z)$ across $I$, and all its moments are given by
\[ \partial_{t_J} F_0 = \frac{1}{J} \lim_{n \to \infty} \hbar \left\langle \mathrm{Tr}\, M^J \right\rangle = \int_I dx\, \rho_{eq}(x)\, \frac{x^J}{J} = -\mathop{\mathrm{res}}_{z = \infty} \frac{z^J}{J}\, \omega(z)\, dz. \tag{1-11} \]
The function $y = -\omega(x)$ satisfies an algebraic relation given by
\[ y^2 = y\, V'(x) + R(x), \tag{1-12} \]
where $R(x)$ is a polynomial of degree less than $V'(x)$ that is uniquely determined by the consistency of (1-9) and (1-12). The point to be stressed here is that this asymptotic spectral curve should be compared with the spectral curve of Theorem 4.1, given by the characteristic equation (4-28), which also contains all the relevant information about the finite $n$ case. In the $n \to \infty$ limit, logarithmic derivatives of the partition function are expressed in (1-11) as residues of the meromorphic differentials $z^k\, y\, dz$ on the curve. The same formulæ are shown in Theorem 4.2 to hold as exact relations for the finite $n$ case if we replace the "asymptotic" spectral curve by the spectral curve given by the characteristic equation of the matrix $D_n(x)$.

The paper is organized as follows. In Sect. 2 the problem is defined in terms of polynomials orthogonal with respect to an arbitrary semiclassical measure supported on complex contours, and the corresponding generalized matrix model partition functions. In Sect. 2.2 the recursion relations, differential systems and deformation equations which these satisfy are expressed in terms of the semi-infinite "wave vector" formed from the orthogonal polynomials. In Sect. 3 the notion of "folding" is introduced and used (Propositions 3.2–3.4) to express the preceding equations as an infinite sequence of compatible overdetermined $2 \times 2$ systems of linear differential equations and recursion relations satisfied by pairs of consecutive orthogonal polynomials. In Sect. 4 the results of folding are used to express the spectral curve in terms of logarithmic derivatives of the partition function and it is shown that the $n \to \infty$ relation between the free energy and the spectral curve is also valid as an exact result for finite $n$. In Sect. 5 the definition of the isomonodromic tau function [11] is recalled and it is computed by relating it to the spectral invariants of the rational matrix generalizing $D_n(x)$ in (1-2). These invariants are shown to give the logarithmic derivatives of the tau functions in terms of residues of meromorphic differentials on the spectral curves through formulæ that are nearly identical to those for the partition function. This leads to the main result, Theorem 5.1, which gives the explicit relation between $Z_n$ and $\tau_n^{IM}$.
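The asymptotic relations (1-9)–(1-12) can be illustrated in the simplest case. The sketch below is an illustration under stated assumptions rather than part of the paper's argument: it takes the Gaussian potential $V(x) = x^2/2$, normalizes the equilibrium density to total mass one, and uses the standard semicircle density $\rho_{eq}(x) = \sqrt{4 - x^2}/(2\pi)$ on $I = [-2, 2]$. The function names are hypothetical. It evaluates the resolvent (1-10) by quadrature and checks that $y = -\omega$ satisfies (1-12) with the constant $R(x) = -1$, whose degree is indeed less than that of $V'(x) = x$.

```python
import math


def resolvent(z, n=4000):
    """omega(z) = int_I rho_eq(x) / (x - z) dx for the semicircle on [-2, 2].

    Assumption: V(x) = x**2 / 2 with unit total mass.  Substituting
    x = 2*cos(theta) gives rho_eq(x) dx = (2/pi) * sin(theta)**2 dtheta,
    and the midpoint rule converges quickly for z off the cut.
    """
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        total += math.sin(theta) ** 2 / (2.0 * math.cos(theta) - z)
    return (2.0 / math.pi) * h * total


def spectral_curve_defect(z):
    """y**2 - y*V'(z) - R(z) with y = -omega(z), V'(z) = z and R = -1."""
    y = -resolvent(z)
    return y * y - y * z - (-1.0)


if __name__ == "__main__":
    for z in (1 + 2j, -0.5 + 1j, 3.0 + 0.5j):
        print(z, abs(spectral_curve_defect(z)))
```

The printed defects should be at the level of rounding error, so in this example the whole spectral curve is fixed by $V'$ and a single constant.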
2. Generalized Orthogonal Polynomials and Partition Functions

2.1. Orthogonality measures and integration contours. Given a measure on the real line, the associated orthogonal polynomials are those that diagonalize the quadratic form associated to the corresponding (complex) moment functional; i.e., the linear form over the space of polynomials obtained by integration with respect to the measure
\[ \mathcal{L} : \mathbb{C}[x] \to \mathbb{C}, \qquad p(x) \mapsto \mathcal{L}(p(x)) = \int_{\mathbb{R}} p(x)\, d\mu(x). \tag{2-1} \]
A natural generalization consists of including moment functionals that are expressed by integration along more general contours in the complex $x$ plane, with respect to a complex measure defined by locally analytic weight functions that may have isolated essential singular points and complex power-like branch points. We thus consider linear forms on polynomials given by integrals of the form
\[ \mathcal{L}(p(x)) = \int_{\kappa} p(x)\, \mu(x)\, dx, \qquad \mu(x) = e^{-\frac{1}{\hbar} V(x)}, \qquad V(x) := \sum_{r=0}^{K} T_r(x), \tag{2-2} \]
where
\[ T_0(x) := t_{0,0} + \sum_{J=1}^{d_0} \frac{t_{0,J}}{J}\, x^J, \qquad T_r(x) := \sum_{J=1}^{d_r} \frac{t_{r,J}}{J\,(x - c_r)^J} - t_{r,0} \ln(x - c_r), \]
\[ -\hbar\, \partial_x \ln \mu(x) = V'(x) = \sum_{J=1}^{d_0} t_{0,J}\, x^{J-1} - \sum_{r=1}^{K} \sum_{J=1}^{d_r + 1} \frac{t_{r,J-1}}{(x - c_r)^J}, \tag{2-3} \]
and the symbol $\int_{\kappa}$ denotes integration over linear combinations of contours on which the integrals are convergent, as explained below. This class of linear functionals is sometimes referred to as semiclassical moment functionals [3, 14, 15]. We consider the corresponding monic generalized orthogonal polynomials $p_n(x)$, which satisfy
\[ \int_{\kappa} p_n(x)\, p_m(x)\, \mu(x)\, dx = h_n\, \delta_{nm}, \qquad h_n \in \mathbb{C} \setminus \{0\}. \tag{2-4} \]
If all the contours are contained in the real axis and the weight is real and positive, we reduce to the usual notion of semiclassical orthogonal polynomials. The small parameter $\hbar$ introduced in (2-2) is not of essential importance here; it is only retained in the formulæ below to recall that, when taking the large $n$ limit, it plays the rôle of small parameter for which $\hbar n$ remains finite as $n \to \infty$.

To describe the contours of integration, we first define sectors $S_k^{(r)}$, $r = 0, \ldots, K$, $k = 1, \ldots, d_r$ around the points $c_r$ for which $d_r > 1$ ($c_0 := \infty$) in such a way that
\[ \Re(V(x)) \longrightarrow +\infty \quad \text{as } x \to c_r,\ x \in S_k^{(r)}. \tag{2-5} \]
The number of sectors for each pole in $V$ is equal to the degree of that pole; that is, $d_0$ for the pole at infinity and $d_r$ for the pole at $c_r$. Explicitly
\[ S_k^{(0)} := \left\{ x \in \mathbb{C};\ \frac{2k\pi - \arg(t_{0,d_0}) - \frac{\pi}{2}}{d_0} < \arg(x) < \frac{2k\pi - \arg(t_{0,d_0}) + \frac{\pi}{2}}{d_0} \right\}, \qquad k = 0, \ldots, d_0 - 1; \tag{2-6} \]
\[ S_k^{(r)} := \left\{ x \in \mathbb{C};\ \frac{2k\pi + \arg(t_{r,d_r}) - \frac{\pi}{2}}{d_r} < \arg(x - c_r) < \frac{2k\pi + \arg(t_{r,d_r}) + \frac{\pi}{2}}{d_r} \right\}, \qquad k = 0, \ldots, d_r - 1,\ r = 1, \ldots, K. \tag{2-7} \]
These sectors are defined precisely so that approaching within them any of the essential singularities of $\mu(x)$ (i.e., a $c_r$ such that $d_r > 0$), the function $\mu(x)$ tends to zero faster than any power of the local parameter.

2.1.1. Definition of the boundary-free contours. The definition of the contours follows [16] (see Fig. 1).

1. For any $c_r$ for which there is no essential singularity in the measure (i.e., $d_r = 0$), there are two subcases:
   (a) For the $c_r$'s that are branch points or poles in $\mu$ (i.e., $t_{r,0} \notin \mathbb{N}$), we take a loop starting at infinity in some fixed sector $S_k^{(0)}$ encircling the singularity and going back to infinity in the same sector. (Note that if $c_r$ is just a pole; i.e., $t_{r,0} \in -\mathbb{N}_+$, the contour could equivalently be taken as a circle around $c_r$.)
   (b) For the $c_r$'s that are regular points ($t_{r,0} \in \mathbb{N}$), we take a line joining $c_r$ to infinity, approaching $\infty$ in a sector $S_k^{(0)}$ as before.
2. For any $c_r$ for which there is an essential singularity in $\mu$ (i.e., $d_r > 0$) we define $d_r$ contours starting from $c_r$ in the sector $S_k^{(r)}$ and returning to it in the next sector $S_{k+1}^{(r)}$. Also, if $t_{r,0} \notin \mathbb{Z}$, we join the singularity $c_r$ to $\infty$ by a path approaching $\infty$ within one fixed sector $S_k^{(0)}$.
3. For $c_0 := \infty$, we take $d_0 - 1$ contours starting at $c_0$ in the sector $S_k^{(0)}$ and returning at $c_0$ in the next sector $S_{k+1}^{(0)}$.

Note that, with these definitions, the integrals involved are convergent and we can perform integration by parts. Moreover, any contour in the complex plane for which the integral of $\mu(x)\, p(x)\, dx$ is convergent for all polynomials $p(x)$ is equivalent to a linear combination of the contours defined above, no two of which are, in this sense, equivalent.

2.1.2. Definition of the hard-edge contours. We also include some additional contours in the complex plane $\{m_j\}_{j=1,\ldots,L}$, starting at some points $a_j$, $j = 1, \ldots, L$ and going to $\infty$ within one of the sectors $S_k^{(0)}$. These could be viewed as corresponding to additional points in 1(b) for which both $d_r = 0$ and $t_{r,0} = 0$, but we prefer to deal with them separately since integration by parts on these contours does give a contribution.

In total there are $S := d_0 + \sum_{r=1}^{K} (d_r + 1)$ boundary-free contours $\sigma_\ell$, $\ell = 1, \ldots, S$ and $L$ hard-edge contours $m_h$, $h = 1, \ldots, L$. The moment functional is an arbitrary linear combination of integrals taken along these contours
\[ \int_{\kappa} := \sum_{j=1}^{L} \kappa_j \int_{m_j} + \sum_{j=1}^{S} \kappa_{L+j} \int_{\sigma_j}. \tag{2-8} \]
Orthogonal Polynomials, Matrix Models and Isomonodromic Tau Functions
Fig. 1. The types of contours considered in the $x$ Riemann sphere $\mathbb{P}^1$. Here we have $c_1$ with $d_1 = 3$ and $c_2$ with $d_2 = 0$, $t_{2,0} \notin \mathbb{Z}$ (logarithmic singularity in the potential), $c_3$ with $d_3 = 0$, $t_{3,0} \in \mathbb{N}$, and the degree of the potential at infinity $c_0 = \infty$ is $d_0 = 5$. The essential singularity in $\mu$ at $c_1$ is of the form $\exp\left((x - c_1)^{-3}\right)$ and there is also a cut extending from $c_1$ to $\infty$ if $t_{1,0} \notin \mathbb{Z}$. The point $c_2$ is a branch point of $\mu(x)$ since $t_{2,0} \notin \mathbb{Z}$, and the cut extends to infinity "inside" the contour (as shown here). If it were a pole ($t_{2,0} \in -\mathbb{N}_+$), the contour would be replaced by a circle around it. The point $c_3$ is a regular point with $t_{3,0} \in \mathbb{N}^\times$, and the contour extending from it to infinity is no different from the ones starting at the regular points $a_1$, $a_2$, $a_3$. The latter are the "hard-edge" segments joining the points $a_1$, $a_2$ and $a_3$ to $\infty$ within one of the sectors $S_k^{(0)}$.
Note that, by taking appropriate linear combinations of the contours, we could alternatively have had contours consisting of finite segments joining the points $a_j$.

2.2. Recursion relations, derivatives and deformations equations.

2.2.1. Existence of orthogonal polynomials and relation to random matrices. Recall [7] that orthogonal polynomials satisfying (2-4) exist provided all the Hankel determinants formed from the moments are nonzero:
$$\Delta_n(\kappa) := \det\left( \int_\kappa x^{i+j} \mu(x)\,dx \right)_{0 \le i,j \le n-1} \neq 0, \quad \forall n \in \mathbb{N}. \tag{2-9}$$
Since the $\Delta_n(\kappa)$'s are homogeneous polynomials in the coefficients $\kappa_j$, the zero locus excluded by (2-9) is of zero measure (in the space of $\kappa_j$'s), and hence "generically" the conditions (2-9) are fulfilled. The development to follow will in fact only involve orthogonal polynomials up to some arbitrarily large fixed degree, say $N$, and hence the excluded locus, where $\Delta_n(\kappa) = 0$ for some $n \le N - 1$, is a Zariski closed set in $\{\kappa_j\}$ (and a closed set of measure zero in the space of coefficients of $V$).
M. Bertola, B. Eynard, J. Harnad
The orthogonal polynomials considered here are related to models of unitarily diagonalizable random matrices $M \in gl(n, \mathbb{C})$ with spectra supported on the contours defined above. More specifically we have the partition function
$$Z_n := C_n \int_{\mathrm{spec}(M) \in \kappa} dM\, e^{-\frac{1}{\hbar} \mathrm{Tr}\, V(M)} = \int_\kappa dx_1 \cdots \int_\kappa dx_n\, \Delta(x)^2\, e^{-\frac{1}{\hbar} \sum_{j=1}^{n} V(x_j)} \tag{2-10}$$
$$= n!\, \Delta_n(\kappa, V) = n! \prod_{j=0}^{n-1} h_j, \tag{2-11}$$
where
$$C_n := \frac{1}{\int_{U(n)} dU} \tag{2-12}$$
is the inverse of the $U(n)$ group volume, and
$$\Delta(x) := \prod_{i<j} (x_i - x_j) \tag{2-13}$$
is the usual Vandermonde determinant. The notation $\mathrm{spec}(M) \in \kappa$ in the first integral just means that $M$ is unitarily diagonalizable,
$$M = U D U^\dagger, \quad D := \mathrm{diag}(x_1, \dots, x_n), \quad U \in U(n), \tag{2-14}$$
and the eigenvalues $\{x_1, \dots, x_n\}$ of $M$ are constrained to lie on the contours entering in $\int_\kappa$. In particular, as in the standard case, the orthogonal polynomials may be shown to be equal to the expectation values of the characteristic polynomials in such models,
$$p_n(x) = \langle \det(x\mathbf{1} - M) \rangle = \frac{1}{Z_n} \int_\kappa dx_1 \cdots \int_\kappa dx_n \prod_{i=1}^{n} (x - x_i)\, \Delta(x)^2\, e^{-\frac{1}{\hbar} \sum_{j=1}^{n} V(x_j)}, \tag{2-15}$$
and all correlation functions between the eigenvalues may be expressed as determinants in terms of the standard Christoffel-Darboux kernel formed from them,
$$K_n(x, y) := \sum_{j=0}^{n-1} \frac{1}{h_j}\, p_j(x) p_j(y)\, e^{-\frac{1}{2\hbar}(V(x) + V(y))}. \tag{2-16}$$
More precisely, this is valid when there are no “hard-edge” contours present. Inclusion of the latter however allows one to interpret these determinants as certain conditional correlators, known as “Janossy distribution” correlators [6], giving the probability densities for a certain number of eigenvalues to lie at given locations within the complementary part of the support, while the remaining ones lie within it. The partition function Zn in this case can be reinterpreted as the corresponding gap probability [6, 17].
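The relation between the Hankel determinants, the norms $h_j$ and the partition function can be checked numerically in the simplest case. The following is a minimal sketch, assuming a Gaussian weight $\mu(x) = e^{-x^2/2}$ on the real line with $\hbar = 1$ and no hard edges (an illustrative choice, not one made in the text); the function names are hypothetical.

```python
import math
import numpy as np

def gaussian_moment(k):
    """k-th moment of exp(-x^2/2) on the real line: sqrt(2*pi)*(k-1)!! for even k, else 0."""
    if k % 2:
        return 0.0
    double_factorial = 1.0
    for m in range(k - 1, 0, -2):
        double_factorial *= m
    return double_factorial * math.sqrt(2 * math.pi)

def hankel_det(n):
    """Hankel determinant Delta_n built from the moments, as in (2-9); Delta_0 = 1."""
    if n == 0:
        return 1.0
    H = np.array([[gaussian_moment(i + j) for j in range(n)] for i in range(n)])
    return float(np.linalg.det(H))

def norms_from_hankel(n_max):
    """h_j = Delta_{j+1} / Delta_j for j = 0..n_max-1."""
    dets = [hankel_det(n) for n in range(n_max + 1)]
    return [dets[j + 1] / dets[j] for j in range(n_max)]

def partition_function(n):
    """Z_n = n! * Delta_n, which should equal n! * prod_j h_j."""
    return math.factorial(n) * hankel_det(n)
```

For this weight the norms are known in closed form, $h_j = \sqrt{2\pi}\, j!$, so the ratios of successive Hankel determinants can be compared against them directly.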
2.2.2. Wave vector equations. We now define the normalized orthogonal polynomials
$$\pi_n(x) := \frac{1}{\sqrt{h_n}}\, p_n(x) \tag{2-17}$$
and what will be referred to as the "orthonormal quasi-polynomials"
$$\psi_n(x) := \pi_n(x)\, e^{-\frac{1}{2\hbar} V(x)}, \tag{2-18}$$
satisfying
$$\int_\kappa \psi_n(x) \psi_m(x)\,dx = \delta_{mn}. \tag{2-19}$$
We now form the semi-infinite "wave vectors"
$$\Psi(x) := [\pi_0(x), \pi_1(x), \dots, \pi_n(x), \dots]^t. \tag{2-20}$$
As in the theory of ordinary orthogonal polynomials, we have
$$x \Psi(x) = Q \Psi(x), \tag{2-21}$$
where $Q$ is a symmetric tridiagonal semi-infinite matrix with components
$$Q_{ij} = \gamma_j \delta_{i,j-1} + \beta_i \delta_{ij} + \gamma_i \delta_{i,j+1}, \quad i, j \in \mathbb{N}, \tag{2-22}$$
defining a three term recursion relation of the form
$$x \pi_j(x) = \gamma_{j+1} \pi_{j+1}(x) + \beta_j \pi_j(x) + \gamma_j \pi_{j-1}(x). \tag{2-23}$$
Now introduce semi-infinite matrices $P$, $A_i$, $C_r$, $T_{r,J}$ such that
$$\hbar \partial_x \Psi(x) = P \Psi(x), \tag{2-24}$$
$$\hbar \partial_{a_i} \Psi(x) = A_i \Psi(x), \quad i = 1, \dots, L, \tag{2-25}$$
$$\hbar \partial_{c_r} \Psi(x) = C_r \Psi(x), \quad r = 1, \dots, K, \tag{2-26}$$
$$\hbar \partial_{t_{r,J}} \Psi(x) = T_{r,J} \Psi(x), \quad r = 0, \dots, K, \quad J = 0, \dots, d_r. \tag{2-27}$$
Their matrix elements are determined simply by integration,
$$X_{nm} = \hbar \int_\kappa (\partial \pi_n(x))\, \pi_m(x)\, \mu(x)\,dx, \tag{2-28}$$
where ∂ denotes any of the derivatives ∂x , ∂ai , ∂cr , ∂tr,J above for which X becomes the corresponding matrices P , Ai , Cr or Tr,J on the RHS of (2-24) – (2-27). Remark 2.1. Such wave vectors and associated deformation equations have been studied in many previous works relating orthogonal polynomials, matrix models and integrable systems (see, e.g. [2, 18]). However, considerations of the deformation theory have mainly been within the formal setting, with the potential V (x) replaced by some initial value, V0 (x), plus a perturbation consisting of an infinite power series with arbitrary coefficients, without regard to domains of convergence. Results obtained in this formal setting cannot be directly applied to the study of isomonodromic deformations, where the local analytic structure in the neighborhood of a number of isolated singular points is of primary interest.
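The three-term recursion (2-23) and the tridiagonal structure (2-22) of $Q$ can be illustrated numerically. The sketch below assumes, purely for illustration, the Gaussian weight $\mu(x) = e^{-x^2/2}$ on the real line with $\hbar = 1$, for which $\gamma_j = \sqrt{j}$ and $\beta_j = 0$; the helper name `orthonormal_polys` is hypothetical. It builds the $\pi_j$ from the recursion and recovers the Gram matrix and $Q_{ij} = \int x\, \pi_i \pi_j\, \mu\, dx$ by Gauss-Hermite quadrature.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def orthonormal_polys(x, n_max, beta, gamma):
    """Evaluate pi_0..pi_{n_max} at the points x using
    x*pi_j = gamma[j+1]*pi_{j+1} + beta[j]*pi_j + gamma[j]*pi_{j-1}.
    pi_0 = 1/sqrt(h_0) with h_0 = sqrt(2*pi) is specific to the assumed Gaussian weight."""
    x = np.asarray(x, dtype=float)
    pis = np.zeros((n_max + 1, x.size))
    pis[0] = (2 * np.pi) ** -0.25
    for j in range(n_max):
        prev = pis[j - 1] if j > 0 else 0.0
        pis[j + 1] = ((x - beta[j]) * pis[j] - gamma[j] * prev) / gamma[j + 1]
    return pis

n_max = 6
gamma = np.sqrt(np.arange(n_max + 1))   # gamma_j = sqrt(j) for exp(-x^2/2)
beta = np.zeros(n_max + 1)              # beta_j = 0 by symmetry of the weight

# Quadrature nodes and weights for the weight function exp(-x^2/2)
nodes, weights = hermegauss(40)
P = orthonormal_polys(nodes, n_max, beta, gamma)

gram = (P * weights) @ P.T              # should be the identity matrix
Q = (P * weights * nodes) @ P.T         # should be symmetric tridiagonal
```

Here `gram` approximates the orthonormality relation and `Q` approximates the leading block of the matrix in (2-22), with $\gamma_{j+1}$ on the first off-diagonals.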
For any such semi-infinite square matrix $X$, let $X_0$, $X_+$, $X_-$ denote the diagonal, upper and lower triangular parts, respectively, and let
$$X_{-0} := \frac{1}{2} X_0 + X_-. \tag{2-29}$$
Proposition 2.1. The matrices $P$, $A_i$, $C_r$ and $T_{r,J}$ are all lower semi-triangular (with $P$ strictly lower triangular), and are given by
$$P = V'(Q)_{-0} - \sum_{i=1}^{L} A_i = V'(Q)_- - \sum_{i=1}^{L} (A_i)_-, \tag{2-30}$$
$$A_i = \hbar \kappa_i \left( \Psi(a_i) \Psi^t(a_i) \right)_{-0} \mu(a_i), \tag{2-31}$$
$$C_r = \sum_{J=0}^{d_r} t_{r,J} \left( (Q - c_r)^{-J-1} \right)_{-0}, \quad r = 1, \dots, K, \tag{2-32}$$
$$T_{0,0} = \frac{1}{2} \mathbf{1}, \tag{2-33}$$
$$T_{0,J} = \frac{1}{J} \left( Q^J \right)_{-0}, \quad J = 1, \dots, d_0, \tag{2-34}$$
$$T_{r,J} = \frac{1}{J} \left( (Q - c_r)^{-J} \right)_{-0}, \quad r = 1, \dots, K, \quad J = 1, \dots, d_r, \tag{2-35}$$
$$T_{r,0} = -\left( \ln(Q - c_r) \right)_{-0}, \quad r = 1, \dots, K, \tag{2-36}$$
where $(Q - c_r)^{-J}$ and $\ln(Q - c_r)$ are defined by the formulæ
$$\left( (Q - c_r)^{-J} \right)_{nm} := \int_\kappa \frac{\pi_n(z) \pi_m(z)}{(z - c_r)^J}\, \mu(z)\,dz, \tag{2-37}$$
$$\left( \ln(Q - c_r) \right)_{nm} := \int_\kappa \ln(z - c_r)\, \pi_n(z) \pi_m(z)\, \mu(z)\,dz. \tag{2-38}$$
The diagonal matrix elements for each of the above are given by the formula
$$X_{jj} = -\frac{\hbar}{2}\, \partial (\ln h_j), \tag{2-39}$$
where $\partial = \partial_x$, $\partial_{a_i}$, $\partial_{c_r}$ and $\partial_{t_{r,J}}$, respectively, for $X = P$, $A_i$, $C_r$ and $T_{r,J}$. In particular, they vanish for $P$, which is strictly lower triangular, and hence
$$V'(Q)_{jj} = \hbar \sum_{i=1}^{L} \kappa_i\, \psi_j(a_i)^2. \tag{2-40}$$
Proof. We make use of the orthogonality relations
$$\int_\kappa \Psi(x) \Psi^t(x)\, \mu(x)\,dx = \mathbf{1}. \tag{2-41}$$
Equations (2-32)–(2-36) are obtained as follows. Consider a deformation $\partial$ with respect to any of the above $c_r$'s or $t_{r,J}$'s and denote by $X$ the corresponding matrix; then
$$\partial \pi_n(x) = \partial\left( \frac{p_n(x)}{\sqrt{h_n}} \right) = -\frac{1}{2} (\partial \ln(h_n))\, \pi_n(x) + \frac{1}{\sqrt{h_n}}\, \partial p_n(x) \tag{2-42}$$
$$= -\frac{1}{2} (\partial \ln(h_n))\, \pi_n(x) + \text{lower degree polynomials}, \tag{2-43}$$
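since the polynomials $p_n$ are monic. It follows that the deformation matrix $X$ is lower semi-triangular. On the other hand, differentiating Eq. (2-41) gives
$$0 = \hbar \int_\kappa (\partial \Psi) \Psi^t \mu(x)\,dx + \hbar \int_\kappa \Psi\, (\partial \Psi^t)\, \mu(x)\,dx + \hbar \int_\kappa \Psi \Psi^t\, \partial \mu(x)\,dx = X + X^t + \hbar \int_\kappa \Psi \Psi^t\, \partial \mu(x)\,dx. \tag{2-44}$$
Applying the operators for each case to $\mu$ as defined in (2-2) and using Eq. (2-21) then gives the result. Now consider the deformations of the endpoints $a_i$ of the "hard-edge" contours. Differentiating (2-41) gives
$$0 = \hbar\, \partial_{a_i} \int_\kappa \Psi \Psi^t \mu(x)\,dx = -\hbar \kappa_i \Psi(a_i) \Psi^t(a_i) \mu(a_i) + \hbar \int_\kappa \left[ (\partial_{a_i} \Psi) \Psi^t + \Psi\, \partial_{a_i} \Psi^t \right] \mu(x)\,dx = -\hbar \kappa_i \Psi(a_i) \Psi^t(a_i) \mu(a_i) + A_i + A_i^t, \tag{2-45}$$
where
$$A_i := \hbar \int_\kappa (\partial_{a_i} \Psi)\, \Psi^t\, \mu(x)\,dx. \tag{2-46}$$
It follows that
$$A_i = \hbar \kappa_i \left( \Psi \Psi^t \right)_{-0} \Big|_{x = a_i}\, \mu(a_i), \tag{2-47}$$
proving Eq. (2-31), and also that
$$(A_i)_{nn} = -\frac{\hbar}{2}\, \partial_{a_i} \ln(h_n) = \frac{\hbar}{2}\, \kappa_i\, \psi_n^2(a_i). \tag{2-48}$$
To determine the matrix $P$, note that it is strictly lower triangular and that the boundary terms of the total derivative $\hbar\, \partial_x (\Psi \Psi^t \mu)$ come only from the hard-edge endpoints, so that by (2-45)
$$-\sum_{i=1}^{L} \left( A_i + A_i^t \right) = \hbar \int_\kappa \left[ \Psi' \Psi^t + \Psi \Psi'^t + \Psi \Psi^t\, \partial_x \ln \mu(x) \right] \mu(x)\,dx = \int_\kappa \left[ \hbar \Psi' \Psi^t + \hbar \Psi \Psi'^t - V'(x) \Psi \Psi^t \right] \mu(x)\,dx = P + P^t - V'(Q). \tag{2-49}$$
This implies that
$$P = V'(Q)_{-0} - \sum_{i=1}^{L} A_i = V'(Q)_- - \sum_{i=1}^{L} (A_i)_-. \tag{2-50}$$
This last equality follows from (2-40), which, in turn, follows from integration by parts in the definition of $V'(Q)_{nn}$. It may be seen as a consequence of the invariance of the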
partition function under an infinitesimal change in the integration variables $x_j \to x_j + \epsilon$ in (2-11); i.e., translational invariance.

From (2-39) and (2-11) follows a relation between the diagonal elements of the deformation matrices and the logarithmic derivatives of the partition function that will be very important in what follows. Define the truncated trace of a semi-infinite matrix $X$ to be
$$\mathrm{Tr}_n X := \sum_{j=0}^{n-1} X_{jj}. \tag{2-51}$$
Corollary 2.1. For $\partial = \partial_{a_j}$, $\partial_{c_r}$ and $\partial_{t_{r,J}}$,
$$\hbar\, \partial \ln Z_n = -2\, \mathrm{Tr}_n X, \tag{2-52}$$
with $X = A_j$, $C_r$ and $T_{r,J}$, respectively. For the cases $\partial_{c_r}$ and $\partial_{t_{r,J}}$,
$$\hbar\, \partial \ln Z_n = -\mathrm{Tr}_n\, \partial V(Q), \tag{2-53}$$
while for the $\partial_{a_i}$'s we have
$$\hbar \sum_{i=1}^{L} \partial_{a_i} \ln Z_n = -\sum_{j=0}^{n-1} V'(Q)_{jj}. \tag{2-54}$$
Proof. The first of these relations follows from (2-39) and (2-11) directly, the second from the explicit expressions for the deformation matrices (2-32)–(2-36) and of the potential $V(x)$, and the third is a restatement of (2-40) (translational invariance).

Corollary 2.2. The compatibility conditions
$$[G, H] = 0 \tag{2-55}$$
are satisfied, where $G$, $H$ are any of the following operators:
$$\hbar \partial_{a_i} - A_i, \quad \hbar \partial_{t_{r,J}} - T_{r,J}, \quad \hbar \partial_{c_r} - C_r, \quad \hbar \partial_x - P, \quad x - Q, \tag{2-56}$$
and $r = 0, \dots, K$, $J = 0, \dots, d_r$.

Proof. This follows immediately from the fact that the orthogonal polynomials entering in Eqs. (2-24)–(2-27) are linearly independent.

Remark 2.2. Note that
$$[\hbar \partial_x - P,\ x - Q] = 0 \tag{2-57}$$
is just the string equation, while the other compatibility conditions involving $x - Q$ imply the Lax equations
$$\hbar \partial_{a_i} Q = [A_i, Q], \quad \hbar \partial_{t_{r,J}} Q = [T_{r,J}, Q], \quad \hbar \partial_{c_r} Q = [C_r, Q], \tag{2-58}$$
showing that the spectrum of the matrix $Q$ is invariant under these deformations.
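The link between logarithmic derivatives of $Z_n$ and truncated traces stated in Corollary 2.1 can be checked by hand in a one-parameter example. The sketch below assumes (as an illustration only) the weight $e^{-t x^2/2}$ on the real line with $\hbar = 1$, i.e. $V(x) = t x^2/2$, so that $\partial_t V = x^2/2$, $h_j = \sqrt{2\pi}\, j!\, t^{-(j+1/2)}$ and $\gamma_j = \sqrt{j/t}$. In that case the corollary predicts $\partial_t \ln Z_n = -\tfrac{1}{2} \mathrm{Tr}_n Q^2$. Function names are hypothetical.

```python
import math
import numpy as np

def log_Z(n, t):
    """ln Z_n = ln n! + sum_j ln h_j with h_j = sqrt(2*pi) * j! * t**(-(j + 1/2))."""
    return math.lgamma(n + 1) + sum(
        0.5 * math.log(2 * math.pi) + math.lgamma(j + 1) - (j + 0.5) * math.log(t)
        for j in range(n)
    )

def truncated_trace_Q2(n, t):
    """Tr_n(Q^2): sum of the first n diagonal entries of Q^2.
    An (n+1) x (n+1) block of Q suffices because Q is tridiagonal."""
    g = np.sqrt(np.arange(1, n + 1) / t)     # gamma_1 .. gamma_n
    Q = np.diag(g, 1) + np.diag(g, -1)
    return float(np.trace((Q @ Q)[:n, :n]))

def dlogZ_dt(n, t, eps=1e-6):
    """Central finite-difference approximation of d(ln Z_n)/dt."""
    return (log_Z(n, t + eps) - log_Z(n, t - eps)) / (2 * eps)
```

Both sides reduce to $-n^2/(2t)$ in this example, which gives a quick consistency check on the sign and the factor of two in the trace relation.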
2.2.3. Wave vector of the second kind. We now consider solutions of the second kind,
$$\phi_n(x) := e^{\frac{1}{\hbar} V(x)} \int_\kappa \frac{e^{-\frac{1}{\hbar} V(z)}\, \pi_n(z)}{x - z}\,dz, \tag{2-59}$$
which may be combined to form the components of a wave vector of the second kind
$$\Phi(x) := [\phi_0(x), \phi_1(x), \dots, \phi_n(x), \dots]^t. \tag{2-60}$$
Denote by
$$\nabla_Q V'(x) := \frac{V'(x) - V'(Q)}{x - Q} \tag{2-61}$$
the semi-infinite square matrix with elements
$$\left( \frac{V'(x) - V'(Q)}{x - Q} \right)_{nm} = \int_\kappa dz\, e^{-\frac{1}{\hbar} V(z)}\, \pi_n(z) \pi_m(z)\, \frac{V'(x) - V'(z)}{x - z}, \tag{2-62}$$
and define $U(x)$ to be the semi-infinite column vector (with only its zeroth component nonvanishing) given by
$$(U(x))_n := \sqrt{h_0}\, e^{\frac{1}{\hbar} V(x)}\, \delta_{n,0}. \tag{2-63}$$
The following lemma gives the effect of multiplication of $\Phi(x)$ by $x$ and of application of $\hbar \partial_x$ to it. It may be deduced immediately from Eqs. (2-21) and (2-24), applied inside the integral, together with integration by parts.

Lemma 2.1.
$$x \Phi(x) = Q \Phi(x) + U(x), \tag{2-64}$$
$$\hbar \partial_x \Phi(x) = P \Phi(x) + \nabla_Q V'(x)\, U(x) + \hbar \sum_{i=1}^{L} \kappa_i\, \frac{e^{\frac{1}{\hbar}(V(x) - V(a_i))}}{x - a_i}\, \Psi(a_i). \tag{2-65}$$
The next proposition, which is similarly verified, gives the effects of the above deformations on the wave vector of the second kind. Proposition 2.2. ∂ai (x) = Ai (x) 1
e (V (x)−V (ai )) −κi (ai ), x − ai ∂cr (x) = Cr (x) +
dr
tr,J
J =0
i = 1, . . . , L,
(Q−cr )−J −1 −(x −cr )−J −1 U (x), Q−x
(2-66)
r = 1, . . . , K, (2-67)
∂t0,J (x) = T0,J (x) +
QJ − x J U (x), Q−x
J = 1, . . . , d0 ,
(2-68)
∂tr,J (x) = Tr,J (x) +
(Q−cr )−J −(x −cr )−J U (x) , Q−x
J = 1, . . . , dr ,
r = 1, . . . , K, (2-69)
∂tr,0 (x) = Tr,0 (x) ln(Q − cr ) − ln(x − cr ) + U (x), Q−x
r = 1, . . . , K.
(2-70)
The content of Eqs. (2-26)–(2-27) and (2-67)–(2-70) may be summarized uniformly as follows. Let $v(x)$ be any function that is analytic at each point of the contours except, possibly, the points $c_r$, and for which the following integrals are convergent:
$$v(Q)_{nm} := \int_\kappa v(z)\, \pi_n(z) \pi_m(z)\, e^{-\frac{1}{\hbar} V(z)}\,dz = \int_\kappa v(z)\, \psi_n(z) \psi_m(z)\,dz,$$
$$(\nabla_Q v(x))_{nm} := \left( \frac{v(x) - v(Q)}{x - Q} \right)_{nm} := \int_\kappa \frac{v(x) - v(z)}{x - z}\, \psi_n(z) \psi_m(z)\,dz. \tag{2-71}$$
Define the deformation matrix under the infinitesimal variation of the potential $V(x) \to V(x) + v(x)$ to be
$$X_v := v(Q)_{-0}. \tag{2-72}$$
Then the two infinite systems
$$\hbar\, \delta_v \Psi(x) := X_v \Psi(x), \tag{2-73}$$
$$\hbar\, \delta_v \Phi(x) := X_v \Phi(x) + \nabla_Q v(x)\, U(x) \tag{2-74}$$
describe the infinitesimal deformation of the orthogonal polynomials and the second-kind solutions under such infinitesimal variations of the potential. Equivalently, define the $\infty \times 2$ matrix
$$\Gamma(x) := [\Psi(x), \Phi(x)]. \tag{2-75}$$
In terms of $\Gamma(x)$, all the recursion, differential and deformation equations (2-21), (2-24)–(2-27) and (2-64)–(2-70) may be expressed simultaneously as
$$x\, \Gamma = Q\, \Gamma + (0, U), \tag{2-76}$$
$$\hbar \partial_x \Gamma = P\, \Gamma + \left( 0,\ \nabla_Q V'\, U + \hbar \sum_{i=1}^{L} \kappa_i\, \frac{e^{\frac{1}{\hbar}(V(x) - V(a_i))}}{x - a_i}\, \Psi(a_i) \right), \tag{2-77}$$
$$\hbar\, \delta_v \Gamma = X_v\, \Gamma + (0, \nabla_Q v\, U), \tag{2-78}$$
$$\hbar \partial_{a_i} \Gamma = A_i\, \Gamma - \left( 0,\ \hbar \kappa_i\, \frac{e^{\frac{1}{\hbar}(V(x) - V(a_i))}}{x - a_i}\, \Psi(a_i) \right), \tag{2-79}$$
where $v$ signifies any of the infinitesimal deformations of the potential $\partial_{c_r} V$, $\partial_{t_{r,J}} V$ ($r = 0, \dots, K$, $J = 1, \dots, d_r$).
3. Folding

3.1. n-windows and Christoffel–Darboux formula. Let $i_n$ be the $\infty \times 2$ matrix that represents the injection of the 2-dimensional subspace spanned by the $(n-1, n)$ basis elements into the (semi-)infinite space corresponding to the components of $\Psi$ or $\Phi$. Its matrix elements are thus:
$$(i_n)_{jk} = \delta_{k,1} \delta_{j,n-1} + \delta_{k,2} \delta_{j,n}, \quad j = 0, 1, 2, \dots, \quad k = 1, 2. \tag{3-1}$$
Let $i_n^T$ denote its transpose, which is the corresponding projection operator. The $n$th $2 \times 2$ block (or "window") of $\Gamma$ is then given by:
$$\Gamma_n(x) := i_n^T\, \Gamma(x) = \begin{pmatrix} \pi_{n-1}(x) & \phi_{n-1}(x) \\ \pi_n(x) & \phi_n(x) \end{pmatrix}. \tag{3-2}$$
By "folding" the infinite recursion and differential-deformation equations (2-21), (2-24)–(2-27), (2-64)–(2-70), (2-76)–(2-79), we mean the corresponding sequence of recursion relations, ODEs and PDEs satisfied by the $\Gamma_n(x)$'s. To derive these, a form of the Christoffel–Darboux identity for orthogonal polynomials will repeatedly be used. Let $\Pi_0^n$ denote the semi-infinite square matrix whose only nonvanishing entries are 1's on the diagonal in positions 0 to $n$ (i.e., the projection onto the first $n + 1$ components),
$$(\Pi_0^n)_{ij} := \begin{cases} \delta_{ij} & \text{if } 0 \le i, j \le n \\ 0 & \text{otherwise.} \end{cases} \tag{3-3}$$
Let
$$\sigma := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \tag{3-4}$$
be the standard $2 \times 2$ symplectic matrix, and let
$$\Sigma_n := i_n\, \sigma\, i_n^T \tag{3-5}$$
denote its projection onto the 2 × 2 subspace in position (n − 1, n). Proposition 3.1. The following extended Christoffel-Darboux formulæ are satisfied: (x − x ) T (x)0n−1 (x ) =
(3-6)
γn Tn (x)σ n (x )
+
0 1
e V (x)
1
−e V (x ) 1 1 1 e (V (x)+V (x )) κ e− 2 V (z) x−z −
= γn T (x) n (x ) 1 0 −e V (x ) + 1 1 1 1 e V (x) e (V (x)+V (x )) κ e− 2 V (z) x−z −
1 x −z
(3-7)
dz
1 x −z
dz
.
(3-8)
Equivalently, in components, (x − x )
n−1
πj (x)πj (x ) = γn πn (x)πn−1 (x ) − πn−1 (x)πn (x )
(3-9)
j =0
(x − x )
n−1
1 πj (x)φj (x ) = γn πn (x)φn−1 (x )−πn−1 (x)φn (x ) −e V (x )
(3-10)
j =0
(x − x )
n−1
φj (x)φj (x ) = γn φn (x)φn−1 (x ) − φn−1 (x)φn (x )
j =0
+e
1 (V (x)+V (x ))
κ
e
− 21 V (z)
1 1 − dz, x − z x − z (3-11)
and evaluating at x = x gives det n (x) = πn−1 (x)φn (x) − φn−1 (x)πn (x) = −
1 1 V (x) e . γn
(3-12)
Proof. Equation (3-9) is the standard Christoffel-Darboux relation for orthogonal polynomials. The extended system may be derived as follows. Multiplying the expression (x)0n−1 (x ) by (x − x ), and applying the relation (2-76) with respect to both x and x gives (x − x ) T (x)0n−1 (x ) 0 = T (x) Q0n−1 − πn−1 Q (x ) 1 0 −e V (x ) + 1 1 1 1 e V (x) e (V (x)+V (x )) κ e− 2 V (z) x−z −
(3-13) 1 x −z
dz
.
(3-14)
The result (3-8) is obtained by substituting the following identity, which holds for any tridiagonal symmetric matrix of the form of $Q$ in (2-22):
$$Q\, \Pi_0^{n-1} - \Pi_0^{n-1} Q = \gamma_n\, i_n\, \sigma\, i_n^T. \tag{3-15}$$
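Both the scalar Christoffel–Darboux relation (3-9) and the commutator identity (3-15) are easy to test numerically. The sketch below again assumes, for illustration only, the Gaussian weight $e^{-x^2/2}$ with $\hbar = 1$ ($\gamma_j = \sqrt{j}$, $\beta_j = 0$), takes $\sigma$ with entries $\sigma_{12} = -1$, $\sigma_{21} = 1$, and uses hypothetical helper names.

```python
import numpy as np

def hermite_orthonormal(x, n_max):
    """pi_0..pi_{n_max} at a scalar x for the weight exp(-x^2/2)."""
    pis = [(2 * np.pi) ** -0.25]
    for j in range(n_max):
        prev = pis[j - 1] if j > 0 else 0.0
        pis.append((x * pis[j] - np.sqrt(j) * prev) / np.sqrt(j + 1))
    return np.array(pis)

def cd_sides(x, xp, n):
    """Left and right sides of
    (x - x') * sum_{j<n} pi_j(x) pi_j(x') = gamma_n * (pi_n(x) pi_{n-1}(x') - pi_{n-1}(x) pi_n(x'))."""
    px, pxp = hermite_orthonormal(x, n), hermite_orthonormal(xp, n)
    lhs = (x - xp) * np.dot(px[:n], pxp[:n])
    rhs = np.sqrt(n) * (px[n] * pxp[n - 1] - px[n - 1] * pxp[n])
    return lhs, rhs

def commutator_with_projector(n, size):
    """Q @ Pi - Pi @ Q for a size x size block of Q, with Pi the projector onto indices 0..n-1."""
    g = np.sqrt(np.arange(1, size))
    Q = np.diag(g, 1) + np.diag(g, -1)
    proj = np.diag((np.arange(size) < n).astype(float))
    return Q @ proj - proj @ Q
```

The commutator has only two nonzero entries, $+\gamma_n$ in position $(n, n-1)$ and $-\gamma_n$ in position $(n-1, n)$, which is exactly the $2 \times 2$ block $\gamma_n \sigma$ embedded by $i_n$.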
3.2. Folded version of the deformation equations for changes in the potential. Under infinitesimal changes of the parameters in the potential $V$ and the end-points of the "hard-edge" contours, the wave vectors $\Psi(x)$ and $\Phi(x)$ and the combined system $\Gamma(x)$ undergo changes determined by Eqs. (2-25)–(2-27), (2-66)–(2-70) and (2-78)–(2-79). Apart from the deformations induced by infinitesimal changes of the endpoints $\{a_j\}$, all these deformations have the same general form, depending only on the function $v(x) = \delta V(x)$ that gives the infinitesimal deformation of the potential. We deal with them all on the same footing in the following proposition, which expresses the explicit form they take on the window $\Gamma_n(x)$.
Proposition 3.2. The deformation equations (2-73), (2-74), (2-78) are equivalent to the infinite sequence of $2 \times 2$ equations
$$\hbar\, \delta_v \Gamma_n(x) = \mathcal{V}_n(x)\, \Gamma_n(x), \tag{3-16}$$
where the folded matrix of the deformation is defined by
$$\mathcal{V}_n(x) = \begin{pmatrix} v(x) - \frac{1}{2} v(Q)_{n-1,n-1} & 0 \\ 0 & \frac{1}{2} v(Q)_{nn} \end{pmatrix} + \gamma_n \begin{pmatrix} \nabla_Q v(x)_{n-1,n-1} & \nabla_Q v(x)_{n-1,n} \\ \nabla_Q v(x)_{n,n-1} & \nabla_Q v(x)_{nn} \end{pmatrix} \sigma. \tag{3-17}$$
For the deformations in (2-24)–(2-27) and (2-66)–(2-70), this gives the following equations corresponding to changes in the potential,
$$\hbar \partial_{c_r} \Gamma_n(x) = C_{r;n}(x)\, \Gamma_n(x), \tag{3-18}$$
$$\hbar \partial_{t_{r,J}} \Gamma_n(x) = T_{r,J;n}(x)\, \Gamma_n(x), \tag{3-19}$$
where the sequence of $2 \times 2$ matrices $C_{r;n}(x)$ and $T_{r,J;n}(x)$ are rational in $x$, with poles at the points $\{c_r\}$, obtained by making the following substitutions in Eq. (3-17):
$$C_r: v(x) \to \sum_{J=0}^{d_r} t_{r,J} (x - c_r)^{-J-1}, \quad T_{r,J}: v(x) \to \frac{1}{J} (x - c_r)^{-J}, \quad T_{0,J}: v(x) \to \frac{1}{J} x^J, \quad T_{r,0}: v(x) \to -\ln(x - c_r). \tag{3-20}$$
Proof. Using the definition (2-71) of ∇Q v(x) and the extended Christoffel-Darboux relation (3-8), we have 1 T (y) Tn (x) γn ∇Q v(x) n = γn (3-21) dy e− V (y) (v(y) − v(x)) (y) y−x κ 1 = dye− V (y) (v(y) − v(x)))(y)T (y)0n−1 (x) (3-22) κ 1 V (x) − 1 V (y) v(y) − v(x) + 0, −e (y) (3-23) dye y−x κ = v(Q)0n−1 (x)−v(x)0n−1 (x)− 0, ∇Q v(x)U (x) . (3-24) Applying the projector inT and noting that inT v(Q)0n−1 = inT v(Q)−0 +
1 v(Q)n−1,n−1 0 2
0 n, −v(Q)n,n
(3-25)
we obtain δv n (x) = inT δv (x) = inT Xv (x) + (0, ∇Q vU (x) = =
=
(3-26)
inT v(Q)−0 (x) + (0, inT ∇Q v(x)U (x)) inT v(Q)0n−1 (x) 1 − 2 v(Q)n−1,n−1 0 n + 1 0 2 v(Q)n,n +(0, inT ∇Q v(x)U (x)) γn inT ∇Q v(x)inT σ n (x) v(x) − 21 v(Q)n−1,n−1 0 + n (x), 1 0 2 v(Q)n,n
(3-27)
(3-28)
(3-29)
proving the relation (3-17).

Remark 3.1. Note that formula (3-17) for the deformation of the measure in Proposition 3.2, as well as those below, (3-36), (3-37), which are obtained through folding of the $\partial_x$ operator, could also be derived for arbitrary locally analytic potentials $V(x)$, provided all the integrals involved are convergent [13]. However, applicability of the subsequent isomonodromic analysis would be lost if the derivatives were not rational, since the resulting deformation equations would then have essential singularities.

3.3. Folding of the endpoint deformations. The case (2-25) and (2-66) involving deformations of the locations of the "hard edge" end-points must be considered separately.

Proposition 3.3. The following gives a closed system for the $n$th window of Eqs. (2-25) and (2-66):
$$\hbar \partial_{a_i} \Gamma_n(x) = A_{i,n}(x)\, \Gamma_n(x), \tag{3-30}$$
where
$$A_{i,n} := \frac{\hbar \kappa_i \gamma_n}{a_i - x} \begin{pmatrix} \psi_{n-1}(a_i) \psi_n(a_i) & -\psi_{n-1}^2(a_i) \\ \psi_n^2(a_i) & -\psi_{n-1}(a_i) \psi_n(a_i) \end{pmatrix} + \frac{\hbar \kappa_i}{2} \begin{pmatrix} -\psi_{n-1}^2(a_i) & 0 \\ 0 & \psi_n^2(a_i) \end{pmatrix}$$
$$= \frac{\hbar \kappa_i \gamma_n}{a_i - x} \begin{pmatrix} \psi_{n-1}^2(a_i) & \psi_{n-1}(a_i) \psi_n(a_i) \\ \psi_{n-1}(a_i) \psi_n(a_i) & \psi_n^2(a_i) \end{pmatrix} \sigma + \frac{\hbar \kappa_i}{2} \begin{pmatrix} -\psi_{n-1}^2(a_i) & 0 \\ 0 & \psi_n^2(a_i) \end{pmatrix}. \tag{3-31}$$
Proof. This is very similar to the proof of Prop. 3.2. Using the definition (2-31) of the matrices Ai and the extended Christoffel–Darboux relation (3-8) we have ∂ai n (x) = inT ∂ai (x) T = in κi (ai ) T (ai )
1
e (V (x)−V (ai )) (x) − 0, κi (ai ) −0 x − ai
κi (ai ) (ai ) 2 (a ) κi −ψn−1 0 i + n (x) 0 ψn2 (ai ) 2 γn κi = inT (ai ) T (ai ) n (x) ai − x 2 (a ) κi −ψn−1 0 i + n (x) . 0 ψn2 (ai ) 2
=
inT
T
0n−1 (x) −
1
e (V (x)−V (ai )) 0, κi (ai ) x − ai
(3-32)
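Recalling the definition (3-5) of $\Sigma_n$ and computing the matrix product yields the result in the statement. Q.E.D.

3.4. Folded version of the recursion relations and $\partial_x$ relations. We now consider the recursion relations (2-21), (2-64) and (2-76) and the action of the $\partial_x$ operator in (2-24), (2-65) and (2-77) which, in their folded form, are given by the following.

Proposition 3.4. The folded forms of the relations (2-76) and (2-77) are
$$\Gamma_{n+1}(x) = R_n(x)\, \Gamma_n(x), \quad n \ge 1, \tag{3-33}$$
$$\hbar \partial_x \Gamma_n(x) = D_n(x)\, \Gamma_n(x), \tag{3-34}$$
where
$$R_n := \begin{pmatrix} 0 & 1 \\ -\frac{\gamma_n}{\gamma_{n+1}} & \frac{x - \beta_n}{\gamma_{n+1}} \end{pmatrix} \tag{3-35}$$
and
$$D_n(x) = D_n^{(0)}(x) + \sum_{i=1}^{L} \frac{\hbar \kappa_i \gamma_n}{x - a_i} \begin{pmatrix} \psi_{n-1}(a_i) \psi_n(a_i) & -\psi_{n-1}^2(a_i) \\ \psi_n^2(a_i) & -\psi_{n-1}(a_i) \psi_n(a_i) \end{pmatrix}, \tag{3-36}$$
with
$$D_n^{(0)}(x) = \begin{pmatrix} V'(x) & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} \nabla_Q V'(x)_{n-1,n-1} & \nabla_Q V'(x)_{n-1,n} \\ \nabla_Q V'(x)_{n,n-1} & \nabla_Q V'(x)_{nn} \end{pmatrix} \begin{pmatrix} 0 & -\gamma_n \\ \gamma_n & 0 \end{pmatrix}$$
$$= \begin{pmatrix} V'(x) & 0 \\ 0 & 0 \end{pmatrix} + \gamma_n \begin{pmatrix} \nabla_Q V'(x)_{n-1,n} & -\nabla_Q V'(x)_{n-1,n-1} \\ \nabla_Q V'(x)_{nn} & -\nabla_Q V'(x)_{n,n-1} \end{pmatrix}. \tag{3-37}$$
Remark 3.2. Note that formula (3-36) implies that
$$\mathrm{Tr}(D_n(x)) = V'(x). \tag{3-38}$$
Proof. The folded form (3-33) of the recursion relations follows directly from Eqs. (2-21) and (2-64),
$$x \pi_n(x) = \gamma_{n+1} \pi_{n+1}(x) + \beta_n \pi_n(x) + \gamma_n \pi_{n-1}(x), \tag{3-39}$$
$$x \phi_n(x) = \gamma_{n+1} \phi_{n+1}(x) + \beta_n \phi_n(x) + \gamma_n \phi_{n-1}(x) + \delta_{n0} \sqrt{h_0}\, e^{\frac{1}{\hbar} V(x)}. \tag{3-40}$$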
To prove (3-34), note that the folding relations (3-16) may be expressed inT δv n = Vn n
(3-41)
for any infinitesimal variation v = δV in the potential. Choosing δ := −
K
∂cr +
d 0 −1
J t0,J +1 ∂t0,J + t0,1 ∂t0,0 ,
(3-42)
J =1
r=1
we have V (x) ≡ δV . Using (2-30) and (2-65), we have ∂x n =
P + 0 , ∇Q V (x)U −
inT
= inT (V (Q)−0 −
L
1
e (V (x)−V (ai )) κi (ai ) x − ai
Ai )
i=1
+ 0 , ∇Q (δV )(x)U −
L i=1
= inT ((δV )(Q)−0 −
L
1
1
e (V (x)−V (ai )) κi (ai ) x − ai
Ai )
i=1
+ 0 , ∇Q (δV )(x)U − = inT δ −
L i=1
(3-43)
L
L i=1
e (V (x)−V (ai )) κi (ai ) x − ai
∂ai ,
(3-44)
i=1
where we have used the deformation equations (2-25)–(2-27), (2-66)–(2-70). Applying the folded relations (3-16), (3-17) and (3-30), this gives K K ˆ ¯ ∂x n = Vn − Ai,n − Ai,n n , (3-45) i=1
where
and
i=1
2 (a ) κi γn ψn−1 (ai )ψn (ai ) −ψn−1 i ˆ Ai,n := , ψn2 (ai ) −ψn−1 (ai )ψn (ai ) ai − x 2 (a ) κi −ψn−1 0 i A¯ i,n := 2 (a ) , 0 ψ 2 n i
V (x)− 21 V (Q)n−1,n−1 0 Vn (x) = 1 0 2 V (Q)nn
(3-46) (3-47)
+
∇Q V (x)n−1,n−1 ∇Q V (x)n−1,n ∇Q V (x)n,n−1 ∇Q V (x)nn
0 −γn γn 0
.
(3-48)
It follows from (2-40) that the diagonal $V'(Q)$ terms in $\mathcal{V}_n(x)$ are canceled by the sum in the last term of (3-45), giving the stated result (3-36), (3-37).

Combining the differential, recursion and deformation relations (3-34), (3-33), (3-18), (3-19) and (3-30), the fact that the invertible matrices $\Gamma_n$ are simultaneous fundamental systems for all these equations implies the compatibility of the cross-derivatives; i.e., the corresponding set of zero-curvature equations.

Corollary 3.1. For $n \ge 0$ the set of PDE's and recursion equations,
$$\hbar \partial_x \Gamma_n(x) = D_n(x)\, \Gamma_n(x), \quad \hbar \partial_{a_i} \Gamma_n(x) = A_{i;n}(x)\, \Gamma_n(x), \quad \hbar \partial_{c_r} \Gamma_n(x) = C_{r;n}(x)\, \Gamma_n(x),$$
$$\hbar \partial_{t_{r,J}} \Gamma_n(x) = T_{r,J;n}(x)\, \Gamma_n(x), \quad \Gamma_{n+1}(x) = R_n(x)\, \Gamma_n(x), \tag{3-49}$$
are simultaneously satisfied by the invertible matrices n (x), and hence the zero-curvature equations, [∂x − Dn , ∂ai − Ai;n ] = 0, [∂x − Dn , ∂ai − Cr;n ] = 0, [∂x − Dn , ∂ai − Tr,J ;n ] = 0 , [∂ai = Ai;n , ∂ai − Cr;n ] = 0, [∂ai − Ai;n , ∂ai − Tr,J ;n ] = 0 , [∂ai − Cr;n , ∂ai − Tr,J ;n ] = 0, ∂ai Rn = Ai;n+1 Rn − Rn Ai;n , ∂cr Rn = Cr;n+1 Rn − Rn Cr;n , ∂tr,J Rn = Tr,J ;n+1 Rn − Rn Tr,J ;n (3-50) are satisfied. Remark 3.3. (The Riemann–Hilbert method.) The Riemann-Hilbert method for characterizing orthogonal polynomials [10, 8] provides an alternative approach to deriving the results of this section. This is a well-established approach, and will not be developed in detail here, except to indicate briefly how it could be applied to deducing the differential and deformation equations satisfied by the fundamental systems. The fundamental system n (x) has, by construction, a jump-discontinuity across any of the contours defining the orthogonality measure. Denoting the limiting values when approaching any of these contours from the left or the right by n,± we have the jump discontinuity conditions 1 2iπκj (3-51) n,+ (x) = n,− (x) , x ∈ γj . 0 1 Furthermore, the local asymptotic behavior near the singularities at ∞ are specified as in Sect. 5.2. To be more precise the function n (x) has local formal asymptotic form, within any of the Stokes sectors, 1 1 + O(x − c ) e− 2 Tr (x)σ3 x → cr C r r 1 e− 2 V (x) n (x) ∼ x → aj (3-52) Aj 1+O(x −aj ) e−κj ln(x−aj )σ+ . 1 1 e− 2 T0 (x)σ3 +(n− r t0,r )σ3 ln(x) x → ∞ C0 1 + O x
It follows from the usual argument based on Liouville’s theorem that any two fundamental solutions (with the same Stokes matrices, given in fact by the same matrices (3-51) ) satisfying the above Riemann–Hilbert conditions are equal, within a constant scalar multiple. Also, from Liouville’s theorem it follows that the first column of (x) consists of polynomials (the orthogonal polynomials). Using similar arguments one can show that the following matrix is rational with poles, of the correct order, at the singular points cr , aj , ∞: 1 Dn (x) := ∂x n (x) n (x)−1 + V (x)1. 2
(3-53)
By comparing the local singular behavior of the logarithmic (matrix) derivatives of any two solutions and applying Liouville's theorem, it follows again that these globally combine to define rational matrix functions which give the deformation matrices with respect to the various parameters at the poles.

4. Spectral Curve and Spectral Invariants

The aim of this section is to express the spectral curve of the ODE (3-34) (i.e., the characteristic equation of $D_n(x)$) in terms of the partition function. In fact we will prove an exact finite $n$ analog (Thm. 4.2) of the formulæ that are obtained by variational methods in the $n \to \infty$ limit [9]. We start by expressing the explicit relation between the partition function and the spectral curve of the isomonodromic system.
4.1. Virasoro generators and the spectral curve. To express the result in a compact form, introduce the following local Virasoro generators:
$$\mathbb{V}^{(r)}_{-J} := \sum_{M=1}^{d_r - J} M\, t_{r,M+J}\, \frac{\partial}{\partial t_{r,M}}, \quad J = 0, \dots, d_r - 1, \quad r = 0, \dots, K, \tag{4-1}$$
in terms of which we define the following differential operator with coefficients that are rational functions of $x$:
$$\mathcal{D}(x) := \sum_{i=1}^{L} \frac{1}{x - a_i} \frac{\partial}{\partial a_i} - \sum_{J=0}^{d_0 - 3} x^J\, \mathbb{V}^{(0)}_{-J-2} - \sum_{r=1}^{K} \sum_{J=2}^{d_r + 1} \frac{1}{(x - c_r)^J}\, \mathbb{V}^{(r)}_{2-J} - \sum_{r=1}^{K} \frac{1}{x - c_r} \frac{\partial}{\partial c_r}. \tag{4-2}$$
Theorem 4.1. The characteristic polynomial of the matrix $D_n(x)$ in the differential system (3-34) is given by
$$\det\left( y - D_n(x) \right) = y^2 - y V'(x) + \hbar \left\langle \mathrm{Tr}\, \frac{V'(M) - V'(x)}{M - x} \right\rangle - \sum_{i=1}^{L} \frac{\hbar^2}{x - a_i}\, \partial_{a_i} \ln(Z_n) \tag{4-3}$$
$$= y^2 - y V'(x) + \hbar\, \mathrm{Tr}_n \left( \frac{V'(Q) - V'(x)}{Q - x} \right) - \sum_{i=1}^{L} \frac{\hbar^2}{x - a_i}\, \partial_{a_i} \ln(Z_n) \tag{4-4}$$
$$= y^2 - y V'(x) + n \hbar \sum_{J=1}^{d_0 - 1} t_{0,J+1}\, x^{J-1} - \hbar^2\, \mathcal{D}(x) \ln Z_n, \tag{4-5}$$
and the quadratic trace invariant is
$$\mathrm{Tr}\, D_n(x)^2 = V'(x)^2 - 2 n \hbar \sum_{J=1}^{d_0 - 1} t_{0,J+1}\, x^{J-1} + 2 \hbar^2\, \mathcal{D}(x) \ln Z_n. \tag{4-6}$$
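Proof. The equivalence of (4-3) and (4-4) follows from the well-known relation
$$\langle \mathrm{Tr}(f(M)) \rangle = \mathrm{Tr}_n (f(Q)) \tag{4-7}$$
for any scalar function $f(x)$ for which $\langle \mathrm{Tr}(f(M)) \rangle$ is a convergent integral. The equivalence of (4-5) and (4-6) follows from (3-38). To prove (4-4), we use the recursion relation (3-33) and the explicit expression of $D_n$ to obtain
$$D_{n+1} = R_n D_n R_n^{-1} + \hbar\, R_n' R_n^{-1}, \tag{4-8}$$
$$R_n' R_n^{-1} = \begin{pmatrix} 0 & 0 \\ \frac{1}{\gamma_{n+1}} & 0 \end{pmatrix}, \quad R_n^{-1} R_n' = \begin{pmatrix} 0 & -\frac{1}{\gamma_n} \\ 0 & 0 \end{pmatrix}. \tag{4-9}$$
Therefore,
$$\mathrm{Tr}\, D_{n+1}(x)^2 = \mathrm{Tr}\, D_n(x)^2 + 2 \hbar\, \mathrm{Tr}\left( D_n(x) R_n^{-1} R_n' \right) + \hbar^2\, \mathrm{Tr}\left( R_n' R_n^{-1} \right)^2 \tag{4-10}$$
$$= \mathrm{Tr}\, D_n(x)^2 - 2 \hbar \left( \frac{V'(Q) - V'(x)}{Q - x} \right)_{nn} - 2 \hbar^2 \sum_{i=1}^{L} \frac{\kappa_i\, \psi_n^2(a_i)}{x - a_i} \tag{4-11}$$
$$= \mathrm{Tr}\, D_n(x)^2 - 2 \hbar \left( \frac{V'(Q) - V'(x)}{Q - x} \right)_{nn} + 2 \hbar^2 \sum_{i=1}^{L} \frac{1}{x - a_i}\, \partial_{a_i} \ln(h_n), \tag{4-12}$$
where we have used Eqs. (3-36), (3-37) in (4-11) and (2-48) in (4-12). These equations imply that
$$\mathrm{Tr}(D_n(x)^2) = \mathrm{Tr}(D_1(x)^2) - 2 \hbar \sum_{j=1}^{n-1} \left( \frac{V'(Q) - V'(x)}{Q - x} \right)_{jj} + 2 \hbar^2 \sum_{j=1}^{n-1} \sum_{i=1}^{L} \frac{1}{x - a_i}\, \partial_{a_i} \ln(h_j). \tag{4-13}$$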
From the definition of D1 , we have D1 =
π0 φ0 π1 φ1
π0 φ0 π1 φ 1
−1 (4-14)
.
Using (3-12), this gives 1
det(D1 (x)) = 2 γ1 e− V (x) φ0 π1
− 1 V (z)
1 h1 − 1 V (x) 1 d e e e+ V (x) dz h0 h0 h1 dx κ x−z − 1 V (z) − 1 V (z) 2 1 e e = V (x) dz − dz 2 h0 κ x−z κ (x − z) − 1 V (z) 2 1 e = V (x) dz h0 κ x−z 1 − 1 V (z) ∂ − e dz ∂z x − z κ K 2 (a ) (V (x) − V (z))ψ02 (z) κ ψ i i 0 = , dz + 2 x−z x − ai κ
= 2
(4-15) (4-16)
(4-17)
(4-18) (4-19)
i=1
and hence Tr(D12 (x)) = −2 det(D1 (x)) + Tr(D1 (x))2 (V (x) − V (z))ψ02 (z) 2 = ((V (x)) − 2 dz x−z κ −22
K κi ψ 2 (ai ) 0
x − ai i=1 V (x) − V (Q) 2 = (V (x)) − 2 x−Q 00 +22
L i=1
1 ∂a ln(h0 ). x − ai i
(4-20)
(4-21)
(4-22)
Combining this with (4-13) gives Tr(Dn (x)2 ) = (V (x))2 −2
n−1 V (Q) − V (x) j =0
Q−x
+22 jj
L n−1 j =0 i=1
1 ∂a ln(hj ), x − ai i
(4-23) which, taking the expression (2-11) for the partition function into account, completes the proof of Eq. (4-4).
We now proceed to the proof of Eq. (4-5). By expanding the third term on the right of (4-4), we obtain d J −1 0 −1 V (x) − V (Q) = t0,J +1 x M QJ −M−1 x−Q J =1
M=0
K d r +1
+
tr,J −1
r=1 J =1
=
d 0 −1
J −1 M=0
t0,J +1 x J −1 1 +
J =1
tr,J −1
r=1 J =1
=
d 0 −1
J −1 M=0
t0,J +1 x
J −1
1+
J =1
x M t0,J +1 QJ −M−1
1 (Q − cr )−M−1 (x − cr )J −M
d 0 −3
x
r=1 M=1
1 (x − cr )M
dr
M
t0,J QJ −M−1
J =M+2
M=0
K d r +1
+
d −2 0 −1 J J =2 M=0
K d r +1
+
1 (Q − cr )−M−1 (x − cr )J −M
d r +1
tr,J −1 (Q − cr )M−J −1 . (4-24)
J =M
Now recall that for any deformation matrix X corresponding to an infinitesimal variation ∂ we have n−1 j =0
Xjj = − ∂ ln(Zn ). 2
(4-25)
Summing the diagonal terms of (4-24) up to n−1 and substituting into (4-4) we therefore obtain Tr(Dn (x)2 ) = (V (x))2 − 2n − 22 − 2
2
d 0 −2
t0,J +1 x J −1
J =1
xM
d0
(J − M − 1)t0,J +1
M=1 J =M+1 K d r +1 r=1 M=2
− 22
d 0 −1
K r=1
∂ ln(Zn ) ∂t0,J −M−1
d r +1 1 ∂ ln(Zn ) (J − M + 1)tr,J −1 (x − cr )M ∂tr,J −M+1 J =M
1 ∂ ln(Zn ) 1 ∂ ln(Zn ) + 22 x − cr ∂cr x − ai ∂ai L
i=1
Using Eq. (4-25) we finally get d 0 −1 t0,J +1 x J −1 det y − Dn (x) = y 2 − yV (x) + n J =1
(4-26)
+2 +
2
d 0 −3
xM
(J − M − 1)t0,J +1
∂ ln(Zn ) ∂t0,J −M−1
M=0
J =M+2
K d r +1
d r +1 1 ∂ ln(Zn ) (J − M + 1)tr,J −1 (x − cr )M ∂tr,J −M+1
r=1 M=2
+2
d 0 −1
K r=1
J =M
1 ∂ ln(Zn ) 1 ∂ ln(Zn ) − 2 , x − cr ∂cr x − ai ∂ai L
(4-27)
i=1
which completes the proof of Eq. (4-5).
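As a sanity check of Theorem 4.1 in the simplest possible setting, one can take a Gaussian potential. The sketch below assumes $V(x) = x^2/2$, $\hbar = 1$ and no hard-edge contours (illustrative assumptions, not a case treated explicitly in the text). Then $\nabla_Q V'$ is the identity, formula (3-37) reduces to $D_n = \begin{pmatrix} x & -\gamma_n \\ \gamma_n & 0 \end{pmatrix}$ with $\gamma_n = \sqrt{n}$, and the spectral curve (4-5) should read $y^2 - x y + n = 0$. The function names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial

def hermite_orthonormal_polys(n_max):
    """pi_0..pi_{n_max} as Polynomial objects for the weight exp(-x^2/2)."""
    x = Polynomial([0.0, 1.0])
    pis = [Polynomial([(2 * math.pi) ** -0.25])]
    for j in range(n_max):
        prev = pis[j - 1] if j > 0 else Polynomial([0.0])
        pis.append((x * pis[j] - prev * math.sqrt(j)) / math.sqrt(j + 1))
    return pis

def D_gaussian(n, x):
    """Folded matrix D_n(x) for V = x^2/2, hbar = 1 (derived under the stated assumptions)."""
    g = math.sqrt(n)
    return np.array([[x, -g], [g, 0.0]])

def char_poly_coeffs(D):
    """Coefficients (b, c) of det(y - D) = y^2 + b*y + c for a 2x2 matrix D."""
    return -np.trace(D), np.linalg.det(D)

def check_first_column(n, x0):
    """Compare d/dx of (pi_{n-1}, pi_n) at x0 with D_n(x0) applied to (pi_{n-1}, pi_n)."""
    pis = hermite_orthonormal_polys(n)
    lhs = np.array([pis[n - 1].deriv()(x0), pis[n].deriv()(x0)])
    rhs = D_gaussian(n, x0) @ np.array([pis[n - 1](x0), pis[n](x0)])
    return lhs, rhs
```

Only the polynomial column of the window is checked here; the second-kind column would require evaluating the Cauchy-type integrals (2-59) numerically.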
4.2. Spectral residue formulæ. Theorem 4.1, which determines all the coefficients of the spectral curve as logarithmic derivatives of the partition function, may be expressed in another form, in which the individual deformation parameters, as well as the logarithmic derivatives with respect to them, may be directly expressed as spectral invariants. The characteristic equation of Dn (x) det (y(x)I − Dn (x)) = 0,
(4-28)
defines a hyperelliptic curve Cn as a 2–sheeted branched cover of the Riemann sphere, on which y is a meromorphic function. It follows from (3-36) and Theorem 4.1 that y, viewed as a double valued function of x, has the same pole structure and degree as Dn (x) at the points {c0 = ∞, cr , ai }, but that the points {ai } are branch points. Let Y± (x) denote the two values of y(x). Defining W (x) := Tr n
L V (x) − V (Q) 2 ∂aj ln Zn − , x−Q x − aj
(4-29)
j =1
it follows from the explicit expression (4-4) for the spectral curve that, near any of the poles c0 = ∞, c1 , . . . , cK , the two branches have the asymptotic form ! 1 1 Y± (x) = V (x) ± (V (x))2 − W 2 4 1 W2 1 O(x −2d0 −1 ) x→∞ ∼ V (x) ∓ W+ +. . . + . 2 O((x − cr )2dr +2 ) x → cr 0 V (x) (V (x) ) (4-30) Before proceeding we have to point out that here we have assumed that d0 ≥ 1, for otherwise ∞ is a branch-point of the spectral curve. Theorem 4.2. The following residue formulæ express the deformation parameters and the logarithmic derivatives of Zn as spectral invariants of the matrix Dn (x): Y+ (x) dx, J = 1 . . . d0 , xJ = − res (x −cr )J Y+ (x)dx, r = 1, . . . , K, J = 1, . . . , dr ,
t0,J = − res
(4-31)
tr,J
(4-32)
x=∞ x=cr
Orthogonal Polynomials, Matrix Models and Isomonodromic Tau Functions
2 ∂t0,0 ln Zn = res Y− (x)dx = −n,
( for d0 ≥ 1),
x=∞
(4-33)
1 res Y− (x)x J dx, J = 1, . . . , d0 , (4-34) J x=∞ 1 1 2 ∂tr,J ln Zn = dx, r = 1, . . . , K, J = 1, . . . , dr , (4-35) res Y− (x) J x=cr (x −cr )J 2 ∂cr ln Zn = − res Y− (x)Tr (x)dx, r = 1, . . . , K, (4-36)
2 ∂t0,J ln Zn =
x=cr
1 ∂aj ln Zn = res Tr(Dn2 (x))dx . 2 x=aj 2
(4-37)
Note that identity (4-33) can be established only for d0 ≥ 1 for otherwise the residue would not make sense since ∞ becomes a branch-point. Proof. Considering the various deformations associated to the poles and end-points we have: At infinity: 1 x J /J V (x) − V (Q) 2 J res x Y− dx = res Tr n − ∂a ln Zn x=∞ V (x) J x=∞ x−Q x − aj j x J /J V (x) − V (Q) Tr = − Tr n QJ = 2 ∂t0,J ln Zn , n x=∞ V (x) x−Q J J = 1, . . . d0 − 2. (4-38)
= res
Note that this computation does not provide the derivatives with respect to the two highest coefficients t0,d0 and t0,d0 −1 , which will be computed below. Moreover we should remark that the last equality follows from the following interchange of order of integrals: n−1 1 x J /J V (x) − V (Q) x J /J V (x) − V (z) 2 res Tr = πj (z)e− V (z) dz n x=∞ V (x) x=∞ V (x) κ x−Q x−z res
j =0
=
n−1 j =0
=−
1 x J /J V (x) − V (z) 2 πj (z)e− V (z) dz (x) x=∞ V x − z κ
res
n−1 1 zJ 2 1 πj (z)e− V (z) dz = − Tr n QJ . (4-39) J κ J j =0
The exchange is justified by the usual arguments observing that the expression V (x)−V x−z has no singularities at coinciding points x = z (away from the singularities of V ). At the poles cr :
(z)
(x − cr )−J /J V (x) − V (Q) 1 res (x − cr )−J Y− dx = res Tr n x=cr J x=cr V (x) x−Q = − Tr n (Q − cr )−J = 2 ∂tr,J ln Zn , J = 1, . . . , dr ; J T (x) V (x) − V (Q) − res Y− (x)Tr (x)dx = − res r Tr n = Tr n Tr (Q) x=cr x=cr V (x) x−Q = 2 ∂cr ln Zn . (4-40)
The last equalities in (4-40) are obtained by a similar argument used for the deformations at c0 = ∞ here above. At the endpoints aj : V (x) − V (Q) 1 1 2 res Tr(Dn (x)) = res (V (x))2 − 2Tr n 2 x=aj 2 x=aj x−Q L 2 +2 ∂a ln Zn x − aj j j =1
= ∂aj ln Zn . 2
(4-41)
The determination Y− has the asymptotic behavior L V (x) − V (Q) 2 1 Y− (x) ∼ ∂a ln Zn Tr n − V (x) x−Q x − aj j j =1 & 2 2 n − x2 + O(x −d0 −2 ) x → ∞ + . O((x − cr )dr +1 ) x → cr
(4-42)
Identities (4-35) and (4-36) follow immediately from the expressions in (4-38), (4-40) and the asymptotic forms (4-42), as do the identities (4-34) for J ≤ d0 − 2. For the remaining two values of J (d0 − 1, d0 ), we compute 1 − res T0 (x)Y− (x)dx = − res (V (x) + O(1/x)) x=∞ x=∞ V (x) n2 2 −3 W − 2 + O x ) dx x =−
K
Tr n Tr (Q) +
L
2 ∂aj ln Zn
j =1
r=1
= Tr n T0 (Q),
(4-43)
where the last equality follows from Eq. 2-40 (translational invariance). This identity, together with Eqs. (4-34) for j ≤ d0 − 2 implies res x d0 −1 Y− (x)dx = 2
x=∞
∂ ∂t0,d0 −1
ln Zn ,
(4-44)
which is the case J = d0 − 1 of (4-34). Similarly − res xT0 (x)Y− (x)dx = − res (xV (x) + x=∞
x=∞
+O(1/x)) = n2 2 − n
K
tr,0
r=1
1 n2 2 −3 + O x ) dx W − V (x) x2 K r=1
= Tr n QT0 (Q),
tr,0 −
K r=1
Tr n QTr (Q) +
L
2 aj ∂aj ln Zn
j =1
(4-45)
where the last equality holds because of dilation invariance. This, together with the above proves (4-34) for J = d0 . The formulæ (4-37) follow from equating the residues at the poles x = aj in Eq. (4-6). Finally we examine (4-33): the second equality follows from the fact that t0,0 appears t0,0
in the integral (2-11) defining Zn only in the overall normalization factor e− . The first equality in (4-33) for the cases d0 ≥ 2 follows from the asymptotic expression of Y− . For the case d0 = 1 instead one has to use (4-43) and the fact that T0 (x) = t0,1 =constant. Remark 4.1. In the formulæ (4-33), (4-34), (4-36), (4-37) we may replace Y− by −(Y+ − V (x)); this corresponds to the fact that, in the large n limit, the behavior of Y (x) on the physical sheet (i.e. Y+ ) is related to the resolvent of the model by ' ( Y+ = V (x) + Tr(x − M)−1 . (4-46) 5. Isomonodromic Tau Function 5.1. Isomonodromic deformations and residue formula. In this section we briefly recall the definition of the isomonodromic tau-function given in [11] and compute its logarithmic derivatives in the present case in order to compare it with the partition function. This will lead to the main result of this section, Theorem 5.1, which explicitly gives this relation. Consider a rational covariant derivative operator on a rank p vector bundle over CP 1 , Dx = ∂x − A(x) ,
(5-1)
where the connection component A(x) is a p × p matrix, rational in x. Deformations of such an operator that preserve its (generalized) monodromy (i.e. including the Stokes' data) are determined infinitesimally by requiring compatibility of the equations
\[ \partial_x \Psi(x) = A(x)\Psi(x), \tag{5-2} \]
\[ \partial_{u_i} \Psi(x) = U_i(x)\Psi(x), \qquad i = 1, \dots, \tag{5-3} \]
where in the second set of equations Ui (x) are also p × p matrices, rational in x, viewed as components of a connection over the extended space consisting of the product of CP 1 with the space of deformation parameters {u1 , . . .}. The invariance of the generalized monodromy of Dx follows [11] from the compatibility of this overdetermined system, which is equivalent to the zero-curvature equations [∂x − A(x), ∂ui − Ui (x)] = 0 ,
[∂ui − Ui (x), ∂uj − Uj (x)] = 0.
(5-4)
Near a pole $x = c_\nu$ of A(x) a fundamental solution can be found that has the formal asymptotic behavior, in a suitable sector:
\[ \Psi(x) \sim C_\nu Y_\nu(x) e^{T_\nu(x)}, \tag{5-5} \]
where $C_\nu$ is a constant matrix,
\[ Y_\nu(x) = 1 + O(x - c_\nu) \tag{5-6} \]
is a formal power series in the local parameter $(x - c_\nu)$ (or 1/x for the pole at infinity) and $T_\nu(x)$ is a Laurent-polynomial matrix in the local parameter, plus a possible logarithmic term $t_0 \ln(x - c)$. In the generic case $T_\nu(x)$ is a diagonal matrix, and, more generally, may be an element of a maximal Abelian subalgebra containing an element with no multiple eigenvalues. The locations of the poles $c_\nu$ and the coefficients of the nonlogarithmic part of $T_\nu(x)$ are the independent deformation parameters. The deformation of the connection matrix A(x) is determined by the requirement that the (generalized) monodromy data be independent of all these isomonodromic deformation parameters. Given a solution of such an isomonodromic deformation problem, one is led to consider the associated isomonodromic τ-function [11], determined by integrating the following closed differential on the space of deformation parameters:
\[ \omega := \sum_{\nu} \operatorname*{res}_{x=c_\nu} \operatorname{Tr}\left( Y_\nu^{-1} Y_\nu' \cdot dT_\nu(x) \right) = d \ln \tau^{IM}, \tag{5-7} \]
where the sum is over all poles of A(x) (including possibly one at x = ∞), and the differential is over all the independent isomonodromic deformation parameters. In the present situation A(x) is our 2 × 2 matrix Dn (x) and the (generalized) monodromy of the operator ∂x − Dn (x) is invariant under changes in the parameters cr , tr,J , aj and n.
5.2. Traceless gauge. For convenience in the computations we perform a scalar gauge transformation of the ODE by choosing quasipolynomials rather than polynomials. Explicitly we set 1
n (x) := e− 2 V (x) n (x) =
ψn−1 (x) ψn (x)
)n−1 (x) ψ )n (x) , ψ
n (x) = An (x)n (x), 1 An (x) = Dn (x) − V (x)1, 2
(5-8)
where )n := e− 2 V (x) φn . ψ 1
(5-9)
In this gauge the matrix of the ODE is traceless and the infinitesimal deformation matrices are transformed correspondingly by addition of the identity element multiplied by the derivatives of − 21 V (x) with respect to the parameters {cr , trJ , aj }. This choice gives a consistent reduction of the general gl(p, C) isomonodromic deformation problem to sl(p, C). (To be precise, this would require a further, x–independent diagonal −1
1
2 gauge transformation of the form diag(hn−1 , hn2 ) to render the infinitesimal deformation matrices also traceless.) At each of the poles c0 := ∞, c1 , . . . we then have the following asymptotic expansions. (To simplify notation, the index n is omitted in labeling the fundamental system and its local asymptotic form.)
1 1 (x) ∼ Cr Yr (x) exp − Tr (x) + δr0 n + tr,0 ln(x) σ3 . (5-10) 2 2
r≥1
Here we have set Y0 (x) := 1 + Yr (x) := 1 + Cr =
∞ Y0;k k=1 ∞
xk
C0 =
,
0
√
√1 hn
hn−1 , 0
Yr;k (x − cr )k ,
k=1
Vr (cr ) −1 √ 2 , , πn−1 (cr )e ˇ (cr − Q)n−1,0 h0 eˇ √ Vr (cr ) Vr (cr ) 2 πn (cr )e− 2 (cr − Q)−1 h e 0 n,0 ˇ
ˇ
(cr ) − Vr2
(5-11)
where Vˇr (x) = V (x) − Tr (x) is the holomorphic part of the potential at cr . The asymptotic forms given by (5-10)–(5-11) follow from the fact that, in any Stokes’ sector near cr , the second-kind solutions behave like ∞
)n (x) ∼ e 2 V (x) ψ 1
x→∞
x −k
k=n+1
1
κ
πn (z)e− V zk−1
1 1 = e 2 V (x) x −n−1 hn 1 + O , x 1 1 e− V (z) πn (z) V (x) ) ψn (x) ∼ e 2 dz (1 + O(x − cr )) x→cr cr − z κ 1 = e V (x) (cr − Q)−1 n,0 h0 (1 + O(x − cr )).
(5-12)
(5-13)
Near the endpoints aj we have 0 1 , σ+ := (x) ∼ Aj · Yj (x) · exp −κj ln(x − aj )σ+ , 0 0 1 − V (z) V (aj ) V (a ) e (πn−1 (z)−πn−1 (aj )) − j πn−1 (aj )e 2 e 2 κ dz aj −z (5-14) Aj = , − 1 V (z) V (aj ) V (aj ) e (π (z)−π (a )) n n j − 2 πn (aj )e e 2 κ dz aj −z since the matrix
(x) · exp −σ+ =
− πn−1 (x)e
πn (x)e−
1
e− V (z) dz x−z κ
V (x) 2
V (x) 2
e
− 1 V (z) e (πn−1 (z)−πn−1 (x)) dz κ x−z − 1 (z) V (x) (πn (z)−πn (x)) e e 2 κ dz x−z
V (x) 2
(5-15) is analytic in a neighborhood of aj and has the limiting value indicated in (5-14). The − 1 V (z) function − κ dz e x−z in the exponential of the second matrix in this formula has the same singularity as κj ln(x − aj ). (The signs in (5-14) follow from the orientation of the contour originating at aj ).
The differential (5-7) can now be written κj daj −1 1 d ln τnI M = res dTr (x)Tr Yr−1 Yr σ3 + res Tr Yj Yj σ+ , x=cr x=aj x − aj 2 r=0
j
(5-16) where the differential involves the isomonodromic parameters only d K r ∂ ∂ ∂ ∂ d := + dtr,J + dcr daj = d(r) + daj . ∂tr,J ∂cr ∂aj ∂aj r=0
J =1
j
r=0
j
(5-17) We now derive residue formulæ for the deformation parameters and the logarithmic derivatives of the tau function for our rational 2 × 2 isomonodromic deformation problem. These essentially are the same as the formulæ of Thm. 4.2 giving the latter quantities in terms of logarithmic derivatives of the partition function of the matrix model1 . Consider the quadratic spectral invariant near any of the singularities: by virtue of the asymptotics (5-10, 5-14) we have, near cr and aj respectively (setting S := 2n + K r=1 tr,0 ) 2 2 2 −1 2 2 −1 Tr(A (x)) = Tr(( ) ) = Tr Yr Yr δr0 S −1 −Tr Yr Yr σ3 Tr − x 2 δr0 S 1 Tr − , (5-18) + 2 x Tr(A2 (x)) = 2 Tr(( −1 )2 ) 2 2κj + Tr Yj−1 Yj σ+ . = 2 Tr Yj−1 Yj x − aj
(5-19)
Taking the principal part at each singularity and using Liouville’s theorem (since TrA2 is a priori a rational function) we find K 1 δr0 S 2 1 δr0 S 2 −1 Tr(A (x)) = + Tr − Tr − Tr(Yr Yr σ3 ) 2 x x r=0 r,+ L 2 κj Tr(Yj (aj )σ+ ) , (5-20) + x − aj j =1
where the subscripts r,+ mean the singular part at the pole $x = c_r$ (including the constant for $x = c_0 = \infty$). Consider now the spectral curve of the connection $\partial_x - A(x)$,
\[ w^2 = \tfrac{1}{2} \operatorname{Tr} A^2(x), \tag{5-21} \]
1 The case of an arbitrary rank rational, nonresonant isomonodromic deformation problem will be developed elsewhere [5], together with further properties that allow us to view these as nonautonomous Hamiltonian systems, in which the logarithmic derivatives of the τ -function computed below are interpreted as the Hamiltonians generating the deformation dynamics.
One immediately finds w± (x) 0 1 11 δr0 S 2 1 δr0 S −1 2 Tr(Yr Yr σ3 ) Tr − Tr − − =± 4 r x 2 x r,+
r,+
+
1 κj Tr(Yj (aj )σ+ ) . x − aj j (5-22)
Near any of the poles one has the asymptotic behavior 1 δr0 S 1 −1 Tr − − T Tr(Yr Yr σ3 ) 2Tr (x) r 2 r,+ 2x dr +1 O((x − c ) ) near x = c r r + ±w± = O(x −d0 −2 ) near x = ∞ . κj Tr(Yj (aj )σ+ ) (1 + O(x − aj )) near x = aj (x − aj )
(5-23)
This immediately implies the following identities: 1 n + tr,0 = ∓ res w± dx. x=∞ 2 r 1 1 t0,J = ∓ res J w± dx , J ≥ 1, x=∞ x 2 1 tr,J = ∓ res (x − cr )J w± dx, x=cr 2
(5-24)
and 2
∂ (x − cr )−J w± dx, ln τnI M = ∓ res x=cr ∂tr,J J
∂ xJ w± dx, ln τnI M = ∓ res x=∞ J ∂t0,J ∂ 2 ln τnI M = ± res Tr (x)w± dx, x=cr ∂cr ∂ 2 ln τnI M = res (w± )2 dx . x=aj ∂aj
2
(5-25) (5-26) (5-27) (5-28)
In order to compare with the formulæ given in Thm. (4.2) we note that the eigenvalues w of A(x) and Y of Dn (x) are related as follows due to the change of gauge (5-8): 1 V (x) + w± . 2 Comparing Eqs. (4-34)–(4-37) with Eqs. (5-26)–(5-28) we obtain Y± =
Zn 1 res x J V , = I M τn 2J x=∞ 1 Zn 1 res ln I M = V (x)dx, τn 2J x=cr (x − cr )j
(5-29)
2 ∂t0,J ln 2 ∂tr,J
r = 1, . . . , K, J = 1, . . . , dr ,
2 ∂cr ln
Zn 1 = res Tr (x)V (x)dx, I M τn 2 x=cr
r = 1, . . . , K,
2 ∂aj ln Zn = 2 ∂aj ln τnI M .
(5-30)
These relations define a closed differential Zn d ln =: d ln(Fn ) , τnI M
(5-31)
where the quantity Fn is determined up to a multiplicative factor independent of the a ,a } isomonodromic deformation parameters {cr , trJ j J ≥1 , but which may depend on n. This may be explicitly integrated to give Fn 1 ln res T (x)Tq (x) , (5-32) =− 2 x=cr r n! 2 0≤qm
ζl ζm
kl,m ,
where µ(α; k) =
1 1 (α; q)k (q 2 t − 2 )k . −1 (αqt ; q)k
J. Shiraishi
Thus it is explicitly seen that I(α) is lower-triangular with respect to the dominance ordering on the monomial basis. It is easy to see that the diagonal elements are given by Eq. (35), and that all of them are distinct since the parameters are generic. Hence there is no obstruction to the construction of an eigenfunction of the form Eq. (34) corresponding to the eigenvalue Eq. (35). The uniqueness follows from the normalization $c_{j_1,j_2,\dots,j_{n-1}} = 1$. In view of the equality $\mu(\alpha; k+l) = \mu(\alpha; k)\,\mu(\alpha q^k; l)$, it immediately follows that all the eigenfunctions are related to each other.

Proposition 4.2. We have
\[ f_{j_1,j_2,\dots,j_{n-1}}(\zeta_1,\dots,\zeta_n) = \prod_{i=1}^{n} \zeta_i^{\,j_{i-1}-j_i}\,(T_{q,s_i})^{-j_{i-1}+j_i} \cdot f_{0,0,\dots,0}(\zeta_1,\dots,\zeta_n). \tag{36} \]
Remark. The family $f_{j_1,j_2,\dots,j_{n-1}}(\zeta_1,\dots,\zeta_n)$ forms a basis of $F_n$. At this moment, Conjecture 1.2 remains open when n ≥ 3, since properties of the eigenfunctions are not well known. For n = 3, one may guess an explicit formula for them by a brute force calculation.

Conjecture 4.3. The first eigenfunction of I(α) for n = 3 is given by
\[ f_{0,0}(\zeta_1,\zeta_2,\zeta_3) = \sum_{k=0}^{\infty} \frac{(qt^{-1}, qt^{-1}, t, t; q)_k}{(q, qs_1/s_2, qs_2/s_3, qs_1/s_3; q)_k}\,(qs_1/s_3)^k (\zeta_3/\zeta_1)^k \times \prod_{1\le i<j\le 3} (1-\zeta_j/\zeta_i) \cdot {}_2\phi_1\!\left[\begin{matrix} q^{k+1}t^{-1},\ qt^{-1}s_i/s_j \\ q^{k+1}s_i/s_j \end{matrix}; q, t\zeta_j/\zeta_i\right]. \tag{37} \]
Remark. No dependence on the parameter α is observed in $f_{0,0}(\zeta_1,\zeta_2,\zeta_3)$, nor in the complete set of eigenfunctions (from Proposition 4.2), from which Conjecture 1.2 follows immediately.

5. Modified Macdonald Difference Operator $D^1$

In this section, we study some properties of the modified Macdonald difference operator $D^1$ given by Definition 1.3. We will see that these properties for $D^1$ are just the same as for I(α), though we do not have any a priori reason for such a coincidence.

Proposition 5.1. For any set of nonnegative integers $j_1, j_2, \dots, j_{n-1}$, there exists a unique solution to the equation
\[ D^1 f_{j_1,j_2,\dots,j_{n-1}}(\zeta_1,\dots,\zeta_n) = \lambda_{j_1,j_2,\dots,j_{n-1}}\, f_{j_1,j_2,\dots,j_{n-1}}(\zeta_1,\dots,\zeta_n), \tag{38} \]
with the expansion Eq. (34) and the normalization $c_{j_1,j_2,\dots,j_{n-1}} = 1$, if and only if
\[ \lambda_{j_1,j_2,\dots,j_{n-1}} = \sum_{i=1}^{n} s_i\, q^{-j_{i-1}+j_i}. \tag{39} \]
The eigenfunctions of $D^1$ satisfy the relation Eq. (36).
Family of Integral Transformations and Basic Hypergeometric Series
Remark. One can regard the eigenfunctions as a basic analogue of the Heckman-Opdam hypergeometric function [6].

Proof. The proof parallels the case of I(α). We have
\[ D^1 \left(\frac{\zeta_2}{\zeta_1}\right)^{j_1}\left(\frac{\zeta_3}{\zeta_2}\right)^{j_2}\cdots\left(\frac{\zeta_n}{\zeta_{n-1}}\right)^{j_{n-1}} = \left(\frac{\zeta_2}{\zeta_1}\right)^{j_1}\left(\frac{\zeta_3}{\zeta_2}\right)^{j_2}\cdots\left(\frac{\zeta_n}{\zeta_{n-1}}\right)^{j_{n-1}} \times \sum_{i=1}^{n} \prod_{j<i} \frac{1-q^{-1}t\zeta_i/\zeta_j}{1-q^{-1}\zeta_i/\zeta_j} \prod_{k>i} \frac{1-qt^{-1}\zeta_k/\zeta_i}{1-q\zeta_k/\zeta_i}\; s_i\, q^{-j_{i-1}+j_i}, \tag{40} \]
where the rational factors should be understood as the series in Eq. (8). Hence it is explicit in Eq. (40) that $D^1$ is lower-triangular in the monomial basis with respect to the dominance order, with distinct diagonals given by Eq. (39). In view of Eq. (40), it follows that the relation Eq. (36) holds.

We study the simplest case n = 2.

Proposition 5.2. For n = 2, all the eigenfunctions of I(α) and $D^1$ are the same. Hence Conjecture 1.4 is true for n = 2.

Proof. Set $f_0(\zeta_1,\zeta_2) = (1-\zeta)g(\zeta)$ and $g(\zeta) = \sum_{n=0}^{\infty} g_n \zeta^n$, where $\zeta = \zeta_2/\zeta_1$. The difference equation for g reads
\[ s_1(1-qt^{-1}\zeta)\,g(q\zeta) + s_2(1-q^{-1}t\zeta)\,g(q^{-1}\zeta) = (s_1+s_2)(1-\zeta)\,g(\zeta). \]
Solving this with the condition $g_0 = 1$, we have
\[ g_n = \frac{(qt^{-1}, qt^{-1}s_1/s_2; q)_n}{(q, qs_1/s_2; q)_n}\, t^n. \]
Hence the first eigenfunction of $D^1$ is
\[ f_0(\zeta_1,\zeta_2) = (1-\zeta_2/\zeta_1)\; {}_2\phi_1\!\left[\begin{matrix} qt^{-1},\ qt^{-1}s_1/s_2 \\ qs_1/s_2 \end{matrix}; q, t\zeta_2/\zeta_1\right]. \tag{41} \]
From Lemma 5.3 below, we find that $f_0(\zeta_1,\zeta_2)$ in Eq. (41) agrees with Eq. (24) written for j = 0. Since we have the same relations Eq. (36) for I(α) and $D^1$, it follows that all the eigenfunctions of I(α) and $D^1$ must be the same. Hence we have the commutativity $I(\alpha)D^1 = D^1 I(\alpha)$ for n = 2.
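The coefficient formula in the proof of Proposition 5.2 can be checked mechanically. The following is a minimal sketch, not part of the original argument: it compares the coefficient of $\zeta^n$ on both sides of the difference equation for g, using exact rational arithmetic. The function names and the sample parameter values are illustrative assumptions.

```python
from fractions import Fraction


def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n = prod_{i=0}^{n-1} (1 - a q^i)."""
    result = Fraction(1)
    for i in range(n):
        result *= 1 - a * q**i
    return result


def g_coeff(n, q, t, s1, s2):
    """Closed form g_n = (q/t, q s1/(t s2); q)_n / (q, q s1/s2; q)_n * t^n."""
    r = s1 / s2
    numerator = qpoch(q / t, q, n) * qpoch(q * r / t, q, n)
    denominator = qpoch(q, q, n) * qpoch(q * r, q, n)
    return numerator / denominator * t**n


def recurrence_defect(n, q, t, s1, s2):
    """Coefficient of zeta^n in
    s1 (1 - q zeta / t) g(q zeta) + s2 (1 - t zeta / q) g(zeta / q)
      - (s1 + s2) (1 - zeta) g(zeta),
    which should vanish for every n >= 0."""
    gn = g_coeff(n, q, t, s1, s2)
    gm = g_coeff(n - 1, q, t, s1, s2) if n >= 1 else Fraction(0)
    lhs = (s1 * (q**n * gn - (q / t) * q**(n - 1) * gm)
           + s2 * (q**(-n) * gn - (t / q) * q**(-(n - 1)) * gm))
    rhs = (s1 + s2) * (gn - gm)
    return lhs - rhs


if __name__ == "__main__":
    # Illustrative generic rational parameters (hypothetical values).
    q, t, s1, s2 = Fraction(1, 3), Fraction(2, 5), Fraction(3, 7), Fraction(5, 11)
    print([recurrence_defect(n, q, t, s1, s2) for n in range(6)])
```

Exact fractions are used so that a vanishing defect is an identity at the chosen point and not a floating-point coincidence.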
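Lemma 5.3. We have
\[ (1-z)\; {}_2\phi_1\!\left[\begin{matrix} a,\ b \\ aq/b \end{matrix}; q, zq/b\right] = {}_4W_3(a/q; b/q; q, zq/b). \]

Proof.
\[
\begin{aligned}
\text{LHS} &= \sum_{n=0}^{\infty} \frac{(a,b;q)_n}{(aq/b,q;q)_n}\,(q/b)^n z^n - \sum_{n=1}^{\infty} \frac{(a,b;q)_{n-1}}{(aq/b,q;q)_{n-1}}\,(q/b)^{n-1} z^n \\
&= \sum_{n=0}^{\infty} \frac{(a,b;q)_n}{(aq/b,q;q)_n} \left(1 - (b/q)\,\frac{(1-aq^{n}/b)(1-q^{n})}{(1-aq^{n-1})(1-bq^{n-1})}\right) (zqb^{-1})^n \\
&= \sum_{n=0}^{\infty} \frac{(a,b;q)_n}{(aq/b,q;q)_n}\, \frac{(1-bq^{-1})(1-aq^{2n-1})}{(1-aq^{n-1})(1-bq^{n-1})}\, (zqb^{-1})^n = \text{RHS}.
\end{aligned}
\tag{42}
\]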
At present, it remains open how to construct an explicit formula for the eigenfunctions of $D^1$ for general n. For n = 3, a brute force calculation suggests the following.

Conjecture 5.4. For n = 3, Eq. (37) also gives the first eigenfunction of the difference operator $D^1$, from which Conjecture 1.4 follows.

6. Weyl Group Symmetry

In this section, we study a hidden symmetry of the eigenfunctions of I(α) or $D^1$ in terms of the Weyl group of type $A_{n-1}$. This Weyl group symmetry appears when the series for the eigenfunctions become truncated under the specialization $t = q^m$ (m = 1, 2, 3, ...). Let $W(A_{n-1})$ be the Weyl group of type $A_{n-1}$ generated by $\sigma_1, \sigma_2, \dots, \sigma_{n-1}$ with the braid relations $\sigma_i^2 = \mathrm{id}$ and $\sigma_i\sigma_{i+1}\sigma_i = \sigma_{i+1}\sigma_i\sigma_{i+1}$. Let $P_n$ be the space of polynomials in $\zeta_1, \zeta_2, \dots, \zeta_n$ with coefficients which are rational expressions in the $s_i$'s and q.

Definition 6.1. For a positive integer m, we define the action $\pi_m$ of $W(A_{n-1})$ on $P_n$ by
\[ \pi_m(\sigma_i) f(\zeta_1,\dots,\zeta_i,\zeta_{i+1},\dots,\zeta_n, s_1,\dots,s_i,s_{i+1},\dots,s_n) = \prod_{k=1}^{m-1} \frac{s_i - q^k s_{i+1}}{s_{i+1} - q^k s_i}\; f(\zeta_1,\dots,\zeta_{i+1},\zeta_i,\dots,\zeta_n, s_1,\dots,s_{i+1},s_i,\dots,s_n). \tag{43} \]
We make a conjecture.

Conjecture 6.2. If $t = q^m$ (m = 1, 2, 3, ...), the eigenfunctions of I(α) or $D^1$ become truncated. We have
\[ \prod_{k=1}^{n} \zeta_k^{(n-k)m}\, f_{0,0,\dots,0}(\zeta_1,\zeta_2,\dots,\zeta_n) \in P_n, \tag{44} \]
and the antisymmetry with respect to the action $\pi_m(\sigma)$ ($\sigma \in W(A_{n-1})$)
\[ \pi_m(\sigma) \cdot \prod_{k=1}^{n} \zeta_k^{(n-k)m}\, f_{0,0,\dots,0}(\zeta_1,\zeta_2,\dots,\zeta_n) = - \prod_{k=1}^{n} \zeta_k^{(n-k)m}\, f_{0,0,\dots,0}(\zeta_1,\zeta_2,\dots,\zeta_n). \tag{45} \]

Consider the case n = 2. An easy calculation gives us the following.

Proposition 6.3. For m = 1, 2, 3, ..., we have the $W(A_1)$-symmetric polynomial in $P_2$:
\[ \pi_m(\sigma_1) \cdot \zeta_1^{m-1}\; {}_2\phi_1\!\left[\begin{matrix} qt^{-1},\ qt^{-1}s_1/s_2 \\ qs_1/s_2 \end{matrix}; q, t\zeta_2/\zeta_1\right]\Bigg|_{t=q^m} = \zeta_1^{m-1}\; {}_2\phi_1\!\left[\begin{matrix} qt^{-1},\ qt^{-1}s_1/s_2 \\ qs_1/s_2 \end{matrix}; q, t\zeta_2/\zeta_1\right]\Bigg|_{t=q^m}. \tag{46} \]
Hence from Eq. (41) Conjecture 6.2 is true for n = 2. Next, we look at the case n = 3. One may find a family of W (A2 )-symmetric polynomials.
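The n = 2 symmetry of Proposition 6.3 lends itself to a direct check at sample points. The sketch below is an illustration under stated assumptions: it evaluates the terminating series at $t = q^m$ with exact fractions and applies the action (43) for $\sigma_1$, namely swapping $(\zeta_1, s_1)$ with $(\zeta_2, s_2)$ and multiplying by the prefactor. The function names and the evaluation point are hypothetical.

```python
from fractions import Fraction


def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n with exact rational arithmetic."""
    result = Fraction(1)
    for i in range(n):
        result *= 1 - a * q**i
    return result


def truncated_eigenfactor(m, q, z1, z2, s1, s2):
    """z1^(m-1) * 2phi1(q/t, q s1/(t s2); q s1/s2; q, t z2/z1) at t = q^m.
    The series terminates because (q^(1-m); q)_n = 0 for n >= m."""
    t = q**m
    r = s1 / s2
    total = Fraction(0)
    for n in range(m):
        coeff = (qpoch(q / t, q, n) * qpoch(q * r / t, q, n)
                 / (qpoch(q, q, n) * qpoch(q * r, q, n)))
        total += coeff * (t * z2 / z1) ** n
    return z1 ** (m - 1) * total


def pi_m_sigma1(m, q, z1, z2, s1, s2):
    """Action (43) of sigma_1 on the function above: swap the two
    variables and the two parameters, then multiply by the prefactor."""
    factor = Fraction(1)
    for k in range(1, m):
        factor *= (s1 - q**k * s2) / (s2 - q**k * s1)
    return factor * truncated_eigenfactor(m, q, z2, z1, s2, s1)


if __name__ == "__main__":
    # Hypothetical generic evaluation point.
    q, z1, z2 = Fraction(1, 3), Fraction(2), Fraction(5)
    s1, s2 = Fraction(3, 7), Fraction(5, 11)
    for m in range(1, 5):
        print(m, pi_m_sigma1(m, q, z1, z2, s1, s2)
              == truncated_eigenfactor(m, q, z1, z2, s1, s2))
```

Agreement at a generic rational point does not replace the proof, but it does catch sign or index slips when the formula (46) is transcribed.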
Lemma 6.4. Let m be a positive integer and k be a nonnegative integer. Set
\[ \varphi_k = (\zeta_3/\zeta_1)^k (qs_1/s_3)^k\, \frac{(qt^{-1}, qt^{-1}, t, t; q)_k}{(q, qs_1/s_2, qs_2/s_3, qs_1/s_3; q)_k} \times \prod_{1\le i<j\le 3} {}_2\phi_1\!\left[\begin{matrix} q^{k+1}t^{-1},\ qt^{-1}s_i/s_j \\ q^{k+1}s_i/s_j \end{matrix}; q, t\zeta_j/\zeta_i\right]\Bigg|_{t=q^m}. \tag{47} \]
Then we have $\zeta_1^{2m-2}\zeta_2^{m-1}\varphi_k \in P_3$ and the $W(A_2)$-symmetry
\[ \pi_m(\sigma) \cdot \zeta_1^{2m-2}\zeta_2^{m-1}\varphi_k = \zeta_1^{2m-2}\zeta_2^{m-1}\varphi_k. \tag{48} \]
From Conjecture 4.3 and the lemma above, we have $f_{0,0} = \prod_{i<j}(1-\zeta_j/\zeta_i) \sum_k \varphi_k$, from which Conjecture 6.2 follows for n = 3.

7. Product Formula for the Eigenfunction

In this section, we study the special case
\[ (s_1, s_2, \dots, s_n) = (1, t, \dots, t^{n-1}), \tag{49} \]
and consider an infinite product formula for the first eigenfunction.

Proposition 7.1. We have the infinite product formula for the first eigenfunction of $D^1$ as follows:
\[ f_{0,0,\dots,0}(\zeta_1,\zeta_2,\dots,\zeta_n)\Big|_{(s_1,s_2,\dots,s_n)=(1,t,\dots,t^{n-1})} = \prod_{1\le i<j\le n} (1-\zeta_j/\zeta_i)\, \frac{(qt^{-1}\zeta_j/\zeta_i; q)_\infty}{(t\zeta_j/\zeta_i; q)_\infty}. \tag{50} \]

Remark. Conjecture 1.4 suggests that the same is true also for I(α).

Proof. We have
\[
\begin{aligned}
D^1(1, t, \dots, t^{n-1}; q, t) \prod_{1\le i<j\le n} (1-\zeta_j/\zeta_i)\, \frac{(qt^{-1}\zeta_j/\zeta_i; q)_\infty}{(t\zeta_j/\zeta_i; q)_\infty}
&= t^{n-1} \prod_{1\le i<j\le n} (1-\zeta_j/\zeta_i)\, \frac{(qt^{-1}\zeta_j/\zeta_i; q)_\infty}{(t\zeta_j/\zeta_i; q)_\infty}\; \sum_{i=1}^{n} \prod_{j\ne i} \frac{t^{-1}\zeta_i - \zeta_j}{\zeta_i - \zeta_j} \\
&= (1 + t + \cdots + t^{n-1}) \prod_{1\le i<j\le n} (1-\zeta_j/\zeta_i)\, \frac{(qt^{-1}\zeta_j/\zeta_i; q)_\infty}{(t\zeta_j/\zeta_i; q)_\infty}.
\end{aligned}
\]
Here we have used the identity
\[ \sum_{i=1}^{n} \prod_{j\ne i} \frac{t^{-1}\zeta_i - \zeta_j}{\zeta_i - \zeta_j} = (1 + t^{-1} + \cdots + t^{-n+1}), \]
which can be proved by decomposition in partial fractions.
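The partial-fraction identity used in the proof of Proposition 7.1 is easy to test at rational points. The following sketch is an illustration only (the function names and sample values are assumptions); it evaluates both sides exactly.

```python
from fractions import Fraction


def sum_of_products(t, zetas):
    """sum_i prod_{j != i} (zeta_i / t - zeta_j) / (zeta_i - zeta_j)."""
    total = Fraction(0)
    for i, zi in enumerate(zetas):
        term = Fraction(1)
        for j, zj in enumerate(zetas):
            if j != i:
                term *= (zi / t - zj) / (zi - zj)
        total += term
    return total


def geometric_sum(t, n):
    """1 + t^(-1) + ... + t^(-n+1)."""
    return sum((t ** (-k) for k in range(n)), Fraction(0))


if __name__ == "__main__":
    t = Fraction(3, 5)
    # Hypothetical pairwise distinct sample values for zeta_1, ..., zeta_5.
    zetas = [Fraction(2), Fraction(-1, 3), Fraction(7, 4), Fraction(5), Fraction(1, 9)]
    for n in range(1, 6):
        print(n, sum_of_products(t, zetas[:n]) == geometric_sum(t, n))
```

The left-hand side is a symmetric rational function of the $\zeta_i$ whose poles cancel, which is why the result does not depend on the sample values chosen.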
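It is of some interest to check this factorization by restricting the general formulas for the eigenfunctions. For n = 2, setting $s_1 = 1$, $s_2 = t$, $\zeta = \zeta_2/\zeta_1$ in Eq. (41), and using the q-binomial theorem (Eq. (1.3.2) in [5]), we have
\[ f_0(\zeta_1,\zeta_2)\Big|_{(s_1,s_2)=(1,t)} = (1-\zeta) \times {}_2\phi_1\!\left[\begin{matrix} qt^{-1},\ qt^{-2} \\ qt^{-1} \end{matrix}; q, t\zeta\right] = (1-\zeta)\, \frac{(qt^{-1}\zeta; q)_\infty}{(t\zeta; q)_\infty}. \]
Next we look at the case n = 3. Using the q-Pfaff-Saalschütz formula (Eq. (1.7.2) in [5]), we have the equality,
\[
\begin{aligned}
\sum_{k=0}^{\infty} \frac{(t;q)_k (t;q)_k}{(q;q)_k (qt^{-2};q)_k}\, (qt^{-2}\zeta)^k\; {}_2\phi_1\!\left[\begin{matrix} q^{k+1}t^{-1},\ qt^{-3} \\ q^{k+1}t^{-2} \end{matrix}; q, t\zeta\right]
&= \sum_{m=0}^{\infty} \frac{(qt^{-1};q)_m (qt^{-3};q)_m}{(qt^{-2};q)_m (q;q)_m}\, t^m \zeta^m\; {}_3\phi_2\!\left[\begin{matrix} t,\ t,\ q^{-m} \\ qt^{-1},\ q^{-m}t^{3} \end{matrix}; q, q\right] \\
&= \frac{(qt^{-1}\zeta; q)_\infty}{(t\zeta; q)_\infty}.
\end{aligned}
\]
Hence from Conjecture 4.3, it follows that
\[
\begin{aligned}
f_{0,0}(\zeta_1,\zeta_2,\zeta_3)\Big|_{(s_1,s_2,s_3)=(1,t,t^2)}
&= \prod_{1\le i<j\le 3} (1-\zeta_j/\zeta_i) \times \frac{(qt^{-1}\zeta_2/\zeta_1; q)_\infty}{(t\zeta_2/\zeta_1; q)_\infty} \times \frac{(qt^{-1}\zeta_3/\zeta_2; q)_\infty}{(t\zeta_3/\zeta_2; q)_\infty} \\
&\quad \times \sum_{k=0}^{\infty} \frac{(qt^{-1}, qt^{-1}, t, t; q)_k}{(q, qt^{-1}, qt^{-1}, qt^{-2}; q)_k}\, (qt^{-2})^k (\zeta_3/\zeta_1)^k \cdot {}_2\phi_1\!\left[\begin{matrix} q^{k+1}t^{-1},\ qt^{-3} \\ q^{k+1}t^{-2} \end{matrix}; q, t\zeta_3/\zeta_1\right] \\
&= \prod_{1\le i<j\le 3} (1-\zeta_j/\zeta_i)\, \frac{(qt^{-1}\zeta_j/\zeta_i; q)_\infty}{(t\zeta_j/\zeta_i; q)_\infty}.
\end{aligned}
\]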
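Both reductions above (the q-binomial step for n = 2 and the summation over k for n = 3) can be probed numerically. The sketch below is a floating-point illustration under assumptions: the function names, the truncation orders and the sample values of q, t, ζ are ours, and the parameters must satisfy $|t\zeta| < 1$ and $|qt^{-2}\zeta| < 1$ for the series to converge.

```python
def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n in floating point."""
    result = 1.0
    for i in range(n):
        result *= 1.0 - a * q**i
    return result


def qpoch_inf(a, q, terms=200):
    """Truncated infinite product (a; q)_infinity for |q| < 1."""
    return qpoch(a, q, terms)


def phi21(a, b, c, q, z, terms=200):
    """Truncated basic hypergeometric series 2phi1(a, b; c; q, z)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= ((1 - a * q**n) * (1 - b * q**n)
                 / ((1 - c * q**n) * (1 - q ** (n + 1)))) * z
    return total


def product_side(q, t, zeta):
    """(q zeta / t; q)_inf / (t zeta; q)_inf."""
    return qpoch_inf(q * zeta / t, q) / qpoch_inf(t * zeta, q)


def n2_series(q, t, zeta):
    """2phi1(q/t, q/t^2; q/t; q, t zeta), the n = 2 specialization."""
    return phi21(q / t, q / t**2, q / t, q, t * zeta)


def n3_series(q, t, zeta, terms=80):
    """Sum over k appearing in the n = 3 specialization."""
    total = 0.0
    for k in range(terms):
        coeff = (qpoch(t, q, k) ** 2
                 / (qpoch(q, q, k) * qpoch(q / t**2, q, k))) * (q * zeta / t**2) ** k
        total += coeff * phi21(q ** (k + 1) / t, q / t**3, q ** (k + 1) / t**2,
                               q, t * zeta, terms)
    return total


if __name__ == "__main__":
    q, t, zeta = 0.3, 0.8, 0.4  # hypothetical sample point
    print(product_side(q, t, zeta), n2_series(q, t, zeta), n3_series(q, t, zeta))
```

All three printed numbers should coincide up to rounding; a mismatch would point to a transcription error in one of the parameter lists.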
8. Raising Operators for Macdonald Polynomials

In this section, we connect the modified Macdonald operator $D^1$ with raising operators for the Macdonald symmetric polynomials. We briefly recall the notion of the Macdonald polynomials [3]. Let $x_1, x_2, \dots, x_n$ be a set of indeterminates, and let $\Lambda_n = \mathbb{Z}[x_1,\dots,x_n]^{S_n}$ denote the ring of symmetric polynomials. The ring $\Lambda$ of symmetric functions is defined as the inverse limit of the $\Lambda_n$ in the category of graded rings. In this section, we regard q and t as independent indeterminates. Let $F = \mathbb{Q}(q,t)$ be the field of rational functions in q and t, and set $\Lambda_F = \Lambda \otimes_{\mathbb{Z}} F$. Let $p_n = \sum_i x_i^n$ be the power sum symmetric functions, and denote $p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots$ for any partition $\lambda = (\lambda_1, \lambda_2, \dots)$. The scalar product is introduced by
\[ \langle p_\lambda, p_\mu \rangle_{q,t} = \delta_{\lambda,\mu} \prod_{i\ge 1} i^{m_i}\, m_i! \prod_{j\ge 1} \frac{1-q^{\lambda_j}}{1-t^{\lambda_j}}, \tag{51} \]
where $m_i = m_i(\lambda)$ is the multiplicity of the part i in the partition λ.
The Macdonald symmetric functions Pλ (x; q, t) ∈ F are uniquely characterized by the following two conditions [3]: (a) Pλ = mλ + uλµ mµ , (52) µn2p f¯0 − E(f¯0 | U¯ n B0 ) 2 = O(1/n2 ), 0 L which is summable. This proves (50). Let h¯ = E(f¯0 | B0 ). This function is constant along the stable leaves, and has zero integral (since f¯0 also has zero integral). Hence, it induces a function h on the quotient 0 . Since f¯0 ∈ L2 , it satisfies h ∈ L2 ( 0 ). The following lemma is an easy consequence of the H¨older properties of the invariant measure and (52), see [You98, Sublemma, p. 612] for details. Lemma 5.4. There exist constants C > 0 and τ < 1 such that, for all x, y in the same ¯ 0,i , unstable leaf of a set ¯ ¯ |h(x) − h(y)| CA(x)τ s(x.y) . The function A is integrable. Hence, by [Gou04, Lemma 3.4], this implies that the 0 h is H¨older continuous on 0 . By [Gou04, Corollary 3.3], we get: function U 0n h tends exponentially fast to 0 U in the space of H¨older continuous functions on 0 . A computation gives
E(f¯0 | U¯ −n B0 ) 2 2 = 0 L
(55)
n
0n h) ◦ U0n hL2 (U 0 h) ◦ U0n 2 h · (U L
n = hL2 U0 h L2 .
Hence, this term is exponentially small. This proves (51) and concludes the proof of Lemma 5.3.

The return time ϕ also satisfies a central limit theorem, by the same argument. Hence, by Theorem 5.1 (applied with b = 1), there exists $\sigma_1^2 \ge 0$ such that
\[ \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} \bar f \circ \bar U^k \to \mathcal{N}(0, \sigma_1^2). \]
Going from $\bar\Delta$ to X, it implies that
\[ \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} f \circ T^k \to \mathcal{N}(0, \sigma_1^2). \tag{56} \]
Moreover, the return time $\varphi_+ : X \to \mathbb{N}$ satisfies a limit theorem with normalization $\sqrt{n \log n}$. Since $\sqrt{n} = o(\sqrt{n \log n})$, we can unfortunately not apply Theorem 5.1 with b = 1. However, if we can prove the following lemma, then this theorem applies with b < 1.

Lemma 5.5. For all b > 1/2,
\[ \frac{1}{|n|^b} \sum_{k=0}^{n-1} f \circ T^k \to 0 \tag{57} \]
almost everywhere in X when n → ±∞.
Limit Theorems in the Stadium Billiard
Proof. We first estimate the decay of correlations of $\bar f_0$ for $\bar U_0$. We will use the notations of the proof of Lemma 5.3. We have
\[ \int \bar f_0 \cdot \bar f_0 \circ \bar U_0^{2n} = \int \bar f_0 \cdot E(\bar f_0 \circ \bar U_0^{n} \mid B_0) \circ \bar U_0^{n} + \int \bar f_0 \cdot \left( \bar f_0 \circ \bar U_0^{2n} - E(\bar f_0 \circ \bar U_0^{n} \mid B_0) \circ \bar U_0^{n} \right). \tag{58} \]
The contraction properties of $\bar U_0$ along stable manifolds and (53) give $|\bar f_0 \circ \bar U_0^{n}(x) - E(\bar f_0 \circ \bar U_0^{n} \mid B_0)(x)| \le C A(\bar U_0^{n} x) \lambda^{\alpha n}$. Hence, the second integral in (58) is at most
\[ \int |\bar f_0| \cdot A \circ \bar U_0^{2n}\, \lambda^{\alpha n} \le \|\bar f_0\|_{L^{2+\varepsilon_2}}\, \|A\|_{L^p}\, \lambda^{\alpha n}, \]
where p < 2 is chosen so that $\frac{1}{2+\varepsilon_2} + \frac{1}{p} = 1$. Hence, this term decays exponentially fast.

In the first integral of (58), the function $E(\bar f_0 \circ \bar U_0^{n} \mid B_0) \circ \bar U_0^{n}$ is $B_0$-measurable (i.e., constant along stable leaves). Hence, this integral is equal to
\[ \int \bar h \cdot E(\bar f_0 \circ \bar U_0^{n} \mid B_0) \circ \bar U_0^{n}. \tag{59} \]
Let $\bar h_n = E(\bar f_0 \circ \bar U_0^{n} \mid B_0)$; it is $B_0$-measurable and defines a function $h_n$ on the quotient $\Delta_0$. The integral (59) is then equal to
\[ \int_{\Delta_0} h \cdot h_n \circ U_0^{n} = \int_{\Delta_0} \hat U_0^{n} h \cdot h_n. \tag{60} \]
The $L^2$-norm of $h_n$ is bounded independently of n. By (55), (60) is exponentially small. This proves that $\int \bar f_0 \cdot \bar f_0 \circ \bar U_0^{2n}$ decays exponentially. In the same way, $\int \bar f_0 \cdot \bar f_0 \circ \bar U_0^{2n+1}$ decays exponentially.

Since the correlations of $\bar f_0$ decay exponentially fast and $\bar f_0 \in L^2$, [Kac96, Theorem 16] implies that $\frac{1}{n^b} \sum_{k=0}^{n-1} \bar f_0 \circ \bar U_0^k$ tends to zero almost everywhere when n → +∞, for all b > 1/2.

Now to see that $\frac{1}{n^b} \sum_{k=0}^{n-1} \bar f \circ \bar U^k$ tends to zero almost everywhere in $\bar\Delta$ when n → +∞, for all b > 1/2, we use [MT04, Lemma 2.1 (a)] which gives this convergence on $\bar\Delta_0$. However, by the ergodicity of $\bar U$, the set on which this convergence holds must have either full or zero measure. As $\bar\Delta_0$ has positive measure, we get this convergence almost everywhere on $\bar\Delta$. Finally, this implies the same for f in X. We have proved (57) for any b > 1/2 when n → +∞.

To deal with n → −∞, we go to the natural extension. It is sufficient to prove the result for $\bar f_0$ in $\bar\Delta_0$, since the previous reasoning still applies (using the fact that the natural extension is functorial, i.e., commutes with induction and projections). In the natural extension of $\bar\Delta_0$, we have $\int \bar f_0 \cdot \bar f_0 \circ \bar U_0^{-n} = \int \bar f_0 \circ \bar U_0^{n} \cdot \bar f_0$, which is exponentially small. Hence, [Kac96, Theorem 16] still applies and gives the desired result.

Remark 5.6. As $\mu_0(X) > 0$, we may apply [MT04, Lemma 2.1 (a)] just as we did in the proof above to see that Lemma 5.5 implies
\[ \frac{1}{|n|^b} \sum_{k=0}^{n-1} f_0 \circ T_0^k \to 0 \]
almost everywhere when n → ±∞, for any b > 1/2.
P. B´alint, S. Gou¨ezel
Proof of Theorem 1.4. The convergence (56), together with Lemma 5.5 and Theorem 5.1, implies (2). We still have to prove the zero variance statement. If $f_0 = \chi - \chi \circ T_0$ for some measurable function χ, then $S_n f_0/\sqrt{n} = (\chi - \chi \circ T_0^n)/\sqrt{n}$ tends in probability to 0, which implies σ = 0. Conversely, assume that σ = 0. The function $\bar f_0$ on the basis $\bar\Delta_0$ of the Young tower satisfies a central limit theorem with zero variance. The proof of Gordin's theorem then ensures the existence of a measurable function $\bar\chi_0$ such that $\bar f_0 = \bar\chi_0 - \bar\chi_0 \circ \bar U_0$, i.e., $\bar f_0$ is a coboundary on $\bar\Delta_0$. This implies that $\bar f$ is a coboundary on $\bar\Delta$, as follows: let $\bar\pi_0 : \bar\Delta \to \bar\Delta_0$ be the projection on the basis of the tower. Defining $\bar\chi : \bar\Delta \to \mathbb{R}$ by
\[ \bar\chi(x) = \bar\chi_0(\bar\pi_0 x) - \sum_{k=0}^{\omega(x)-1} \bar f(\bar U^k \bar\pi_0 x), \tag{61} \]
we have $\bar f = \bar\chi - \bar\chi \circ \bar U$. Since the function $\bar f = f \circ \pi_X$ is a coboundary, general results on coboundaries (see e.g. [Gou05a, Theorem 1.4]) ensure that f also is a coboundary on X for T. Finally, this implies that $f_0$ is a coboundary on $X_0$ for $T_0$, using a formula similar to (61).

5.3. Proof of Proposition 1.5. We work in the stadium billiard with $\ell = \ell_*$, for which the free flight $\tau_0$ satisfies a usual central limit theorem. Define a function $\tau : X \to \mathbb{R}$ by $\tau(x) = \sum_{k=0}^{\varphi_+(x)-1} \tau_0^*(T_0^k x)$. Since the function $\tau_0^*$ does not satisfy (P1), Lemma 2.5 does a priori not apply. Nevertheless, due to the geometric properties of the free flight, the function τ satisfies the following inequality: if x, y ∈ X are two points sliding n times along the semicircles, then $|\tau(x) - \tau(y)| \le C(d(x, y) + d(Tx, Ty))$. This estimate is sufficient to carry out the proofs of Lemmas 2.5 and 2.6. Hence, there exist two functions $\bar u : \bar\Delta \to \mathbb{R}$ and $g : \Delta \to \mathbb{R}$ such that $\tau \circ \pi_X = g \circ \pi_\Delta + \bar u - \bar u \circ \bar U$ on $\bar\Delta$, $\bar u$ is bounded on $\bar\Delta$ and g is Hölder continuous on Δ. Let $\Delta_0$ be the basis of the tower Δ, and let $g_0$ be the function induced by g on $\Delta_0$ (given by $g_0(x) = \sum_{k=0}^{\varphi_0(x)-1} g(U^k x)$, where $\varphi_0$ is the return time from $\Delta_0$ to itself).

Assume that σ = 0. By Theorem 1.4, this implies that $\tau_0^*$ is a coboundary. In turn, arguments similar to the end of the proof of Theorem 1.4 show that $g_0$ itself is a coboundary. Since g is Hölder continuous, $g_0$ satisfies the assumptions of [Gou05b, Theorem 1.1]. This theorem implies that the function $g_0$ is essentially bounded. Let us show that this is not the case.

The function τ is bounded from above: since $E(\tau_0) = 2$, the function τ is O(1) on the set of points bouncing n times between the segments. Moreover, on the set of points sliding n times along the circles, the function τ is equal to −2n + O(1). Since $\bar u$ is bounded, this implies that there exists a constant $C_1$ such that $g \le C_1$, and that g = −2n + O(1) on a set of measure at least $C/n^4$.

Let $A_n \subset \Delta$ be the set of points in the tower where $\varphi_0(\pi_0 x) \le n/(2C_1)$ (where $\pi_0 : \Delta \to \Delta_0$ is the projection on the basis) and $g \le -n$. Since $\mu\{\varphi_0(\pi_0 x) > n/(2C_1)\}$ is exponentially small in n, while $\mu\{g \le -n\} \ge C/n^4$, the set $A_n$ has nonzero measure for n large enough. If $y \in \pi_0(A_n)$, then $g_0(y) \le -n + \sum_{k=0}^{\varphi_0(y)-1} C_1 \le -n/2$. Since $\pi_0(A_n)$ has positive measure, this shows that the function $g_0$ is not bounded from below. This contradiction concludes the proof.
Acknowledgement. We are very grateful to D. Szász and T. Varjú for useful discussions and for their valuable remarks on earlier versions of the manuscript. This paper has grown out of discussions we had at the CIRM conference on multi-dimensional non-uniformly hyperbolic systems in Marseille in May 2004, and while S. G. visited the Institute of Mathematics of the BUTE in October 2004. The hospitality of both institutions, along with the financial support of Hungarian National Foundation for Scientific Research (OTKA), grants TS040719, T046187 and TS049835 is thankfully acknowledged.
A. Proof of Lemma 3.6

Let $U_0$ be the map induced by U on the basis $\Delta_0$ of the tower. Denote by ϕ the first return time on the basis, so that $U_0(x) = U^{\varphi(x)}(x)$. Note that ϕ(x) can also be defined for $x \in \Delta \setminus \Delta_0$ as the first hitting time of the basis. Let F be a finite subset of $\mathbb{N}$. Let $(n_i)_{i \in F}$ be positive integers. Let $K(F, n_i) = \{x \in \Delta_0 \mid \forall i \in F,\ \varphi(U_0^i x) = n_i\}$.

Lemma A.1. There exists a constant C such that, for all F and $n_i$ as above,
\[ \mu(K(F, n_i)) \le \prod_{i \in F} (C\rho^{n_i}). \]
Proof. The proof is by induction on max F , and the result is trivial when F = ∅. Write F = {i − 1 | i ∈ F, i 1} and, for i ∈ F , set ni = ni+1 . If 0 ∈ F , K(F, ni ) = U0−1 (K(F , ni )). Since U0 preserves µ and max F < max F , we get the result. Otherwise, 0 ∈ F . Then K(F, ni ) = U0−1 (K(F , ni )) ∩ {x ∈ 0 , ϕ(x) = n0 }. By bounded distortion, we get µ (K(F, ni )) Cµ (K(F , ni ))µ {x ∈ 0 , ϕ(x) = n0 } Cµ (K(F , ni ))ρ n0 . Lemma A.2. There exist C > 0 and θ < 1 such that, for all n ∈ N, τ n Cθ n . U −n 0
Proof. Let κ > 0 be very small (how small will be specified later in the proof). Then U −n 0 ⊂ {x ∈ | n (x) κn} ∪ {x ∈ | ϕ(x) n/2} ∪ {x ∈ | ϕ(x) < n/2, n (x) < κn}. On the first of these sets, τ n τ κn , whence the integral of τ n is exponentially small. The second of these sets has exponentially small measure. Finally, the last of these sets n/2 is contained in i=0 U −i n , where n = {x ∈ 0 |
ϕ(U0i x) n/2}.
0i κn
To conclude the proof of the lemma, it is sufficient to prove that the measure of n is exponentially small.
Take L ∈ N such that ∀n L, (Cρ)n ρ n/2 , where C is the constant given by Lemma A.1. For x ∈ n , let F (x) := {0 i κn | ϕ(U0i x) L}. Then n ϕ(U0i x) − L (1/2 − Lκ)n. 2 i∈F (x)
i∈F (x)
This implies that
n ⊂
F ⊂[0,κn] i∈F
By Lemma A.1, we get µ (n )
F ⊂[0,κn]
i∈F
κn k=0
2κn
κn k
K(F, ni ).
ni L ni (1/2−Lκ)n
$
(Cρ ni )
i∈F ni L ni (1/2−Lκ)n
(Cρ n0 ) . . . (Cρ nk−1 )
n0 ,...,nk−1 L ni (1/2−Lκ)n
ρ
ni /2
2κn
0k κn n0 ,...,nk−1 L ni (1/2−Lκ)n
For r ∈ N,
ρ
ni /2
n0 +···+nk−1 =r
ni = r} = ρ
r/2
ρ
ni /2
.
0k κn n0 ,...,nk−1 ∈N ni (1/2−Lκ)n
= ρ r/2 Card{n0 , . . . , nk−1 | (r + k)k r +k ρ r/2 . k k!
Hence,
µ (n ) 2κn
ρ r/2
0k κn r (1/2−Lκ)n
(r + k)k . k! κ
k
The sequence ur = ρ r/2 (r+k) satisfies uur+1 ρ := ρ 1/2 e 1/2−Lκ for all r (1/2 − k! r Lκ)n and k κn. If κ is small enough, ρ < 1, and we get " #k 1 κn (1/2−Lκ)n/2 (1/2 − Lκ)n + κn µ (n ) 2 ρ k! 1 − ρ 0k κn
The sequence
nk k!
2κn 1−ρ
ρ (1/2−Lκ)n/2
0k κn
nk . k!
is increasing for k n. Hence, we finally get µ (n )
2κn (1/2−Lκ)n/2 nκn . ρ (κn + 1) 1−ρ κn!
Using Stirling’s Formula, it is easy to check that this expression is exponentially small if κ is small enough. This concludes the proof. Proof of Lemma 3.6. Let θ be given by Lemma A.2. Choose α > 0 so that eεα θ < 1. Then
U −n 0 ⊂ {x ∈ | ω(x) αn} ∪ {x ∈ | ω(x) < αn} ∩ U −n 0 . Hence,
U −n 0
eεω τ n
ωαn
eεω + eεαn
U −n 0
τ n .
The first term is exponentially small since $e^{\varepsilon}\rho < 1$. Lemma A.2 and the definition of α also imply that the second term is exponentially small.
Communicated by G.Gallavotti
Commun. Math. Phys. 263, 513–533 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1516-1
Local Energy Statistics in Disordered Systems: A Proof of the Local REM Conjecture
Anton Bovier (1,2), Irina Kurkova (3)
1 Weierstraß-Institut für Angewandte Analysis und Stochastik, Mohrenstrasse 39, 10117 Berlin, Germany. E-mail: [email protected]
2 Institut für Mathematik, Technische Universität Berlin, Strasse des 17. Juni 136, 12623 Berlin, Germany
3 Laboratoire de Probabilités et Modèles Aléatoires, Université Paris 6, 4, place Jussieu, B.C. 188, 75252 Paris, Cedex 5, France. E-mail: [email protected]
Received: 29 March 2005 / Accepted: 13 September 2005 Published online: 23 February 2006 – © Springer-Verlag 2006
Abstract: Recently, Bauke and Mertens conjectured that the local statistics of energies in random spin systems with discrete spin space should, in most circumstances, be the same as in the random energy model. Here we give necessary conditions for this hypothesis to be true, which we show to be satisfied in wide classes of examples: short range spin glasses and mean field spin glasses of the SK type. We also show that, under certain conditions, the conjecture holds even if energy levels that grow moderately with the volume of the system are considered.
Research supported in part by the DFG in the Dutch-German Bilateral Research Group “Mathematics of Random Spatial Models from Physics and Biology” and by the European Science Foundation in the Programme RDSES.
1. Introduction
In a recent paper [BaMe], Bauke and Mertens have formulated an interesting conjecture regarding the behaviour of local energy level statistics in disordered systems. Roughly speaking, their conjecture can be formulated as follows. Consider a random Hamiltonian, $H_N(\sigma)$, i.e., a real-valued random function on some product space, $S^{\Lambda_N}$, where $S$ is a finite set, typically $S = \{-1, 1\}$, of the form
$H_N(\sigma) = \sum_{A \subset \Lambda_N} \Phi_A(\sigma),$ (1.1)
where $\Lambda_N$ are finite subsets of $\mathbb{Z}^d$ of cardinality, say, $N$. The sum runs over subsets, $A$, of $\Lambda_N$, and $\Phi_A$ are random local functions, typically of the form
$\Phi_A(\sigma) = J_A \prod_{x \in A} \sigma_x,$ (1.2)
where $J_A$, $A \subset \mathbb{Z}^d$, is a family of (typically independent) random variables, defined on some probability space, $(\Omega, \mathcal{F}, \mathbb{P})$, whose distribution is not too singular. In such a situation, for typical $\sigma$, $H_N(\sigma) \sim \sqrt{N}$, while $\sup_\sigma H_N(\sigma) \sim N$. Bauke and Mertens then ask the following question: Given a fixed number, $E$, what is the statistics of the values $N^{-1/2} H_N(\sigma)$ that are closest to this number, and how are configurations, $\sigma$, for which these good approximants of $E$ are realised, distributed on $S^{\Lambda_N}$? Their conjectured answer, which at first glance seems rather surprising, is quite simple: find $\delta_{N,E}$ such that $\mathbb{P}(|N^{-1/2} H_N(\sigma) - E| \le b\,\delta_{N,E}) \sim |S|^{-N} b$; then, the collection of points, $\delta_{N,E}^{-1} |N^{-1/2} H_N(\sigma) - E|$, over all $\sigma \in S^{\Lambda_N}$, converges to a Poisson point process on $\mathbb{R}_+$. Furthermore, for any finite $k$, the $k$-tuple of configurations, $\sigma^1, \sigma^2, \dots, \sigma^k$, where the $k$ best approximations are realised, is such that all of its elements have maximal Hamming distance between each other. In other words, the asymptotic behaviour of these best approximants of $E$ is the same as if the random variables $H_N(\sigma)$ were all independent Gaussian random variables with variance $N$, i.e., as if we were dealing with the random energy model (REM) [Der1]. Bauke and Mertens call this “universal REM like behaviour”. Mertens previously proposed such a conjecture in the special case of the number partitioning problem [Mer1]. In that case, the function $H_N$ is simply given by
$H_N(\sigma) = \sum_{i=1}^{N} X_i \sigma_i,$ (1.3)
with $X_i$ i.i.d. random variables uniformly distributed on $[0, 1]$, $\sigma_i \in \{-1, 1\}$, and one is interested in the distribution of energies near the value zero (which corresponds to an optimal partitioning of the $N$ random variables, $X_i$, into two groups such that their sum in each group is as similar as possible). In this case, his conjecture was proven to hold by Borgs, Chayes, and Pittel [BCP] and the same authors with Mertens [BCMP]. It should be noted that in this problem, one needs, of course, to take care of the symmetry of the Hamiltonian under the transformation $\sigma \to -\sigma$. An extension of this result in the spirit of the REM conjecture was proven recently in [BCMN], i.e., when the value zero is replaced by an arbitrary value, $E$. In [BK2] we generalised this result to the case of the k-partitioning problem, where the random function to be considered is vector-valued (consisting of the vector of differences between the sums of the random variables in each of the $k$ subsets of the partition). To be precise, we considered the special case where the subsets of the partition are required to have the same cardinality, $N/k$ (restricted k-partitioning problem). The general approach to the proof we developed in that paper sets the path towards the proof of the conjecture by Bauke and Mertens that we will present here. The universality conjecture suggests that correlations are irrelevant for the properties of the local energy statistics of disordered systems for energies near “typical energies”. On the other hand, we know that correlations must play a rôle for the extremal energies near the maximum of $H_N(\sigma)$. Thus, there are two questions beyond the original conjecture that naturally pose themselves: (i) assume we consider, instead of fixed $E$, $N$-dependent energy levels, say, $E_N = N^{\alpha} C$. How fast can we allow $E_N$ to grow for the REM-like behaviour to hold? and (ii) what type of behaviour can we expect once $E_N$ grows faster than this value?
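To make the number partitioning setting (1.3) concrete, the following is a minimal sketch, not taken from the paper, that enumerates all configurations for a small $N$ and returns the smallest rescaled distances of $H_N(\sigma)$ from zero. The function name `rescaled_energies`, the choice $N = 14$, and in particular the scale `delta`, which uses a Gaussian approximation of the density of $H_N$ at zero, are assumptions made for illustration only.

```python
import itertools
import math
import random

def rescaled_energies(xs, keep=5):
    """Smallest values of |H(sigma)| / delta for H(sigma) = sum_i x_i * sigma_i.

    sigma_1 is fixed to +1, which removes the sigma -> -sigma symmetry.
    delta is chosen (assumption of this sketch) via a Gaussian approximation
    of the density of H at 0, so that about b points fall below b.
    """
    n = len(xs)
    count = 2 ** (n - 1)                      # configurations modulo a global flip
    std = math.sqrt(sum(x * x for x in xs))   # standard deviation of H for uniform sigma
    delta = math.sqrt(math.pi / 2.0) * std / count
    dists = []
    for tail in itertools.product((-1, 1), repeat=n - 1):
        h = xs[0] + sum(x * s for x, s in zip(xs[1:], tail))
        dists.append(abs(h) / delta)
    dists.sort()
    return dists[:keep]

rng = random.Random(0)                        # fixed seed, illustrative only
xs = [rng.random() for _ in range(14)]
print(rescaled_energies(xs))
```

If the REM-like picture applies, the printed values should look like the first few points of a rate-one Poisson process on the half-line; a single small sample can of course only illustrate this, not confirm it.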
We will see that the answer to the first question depends on the properties of HN , and we will give an answer in models with Gaussian couplings. The answer to question (ii) requires a detailed understanding of HN (σ ) as a random process, and we will be able to give a complete answer only in the case of the GREM,
when $H_N$ is a hierarchically correlated Gaussian process. This will be discussed in a companion paper [BK3]. Our paper is organized as follows. In Chapter 2, we prove an abstract theorem that implies the REM-like conjecture under three hypotheses. This will give us some heuristic understanding of why and when such a conjecture should be true. In Chapter 3 we then show that the hypotheses of the theorem are fulfilled in two classes of examples: p-spin Sherrington-Kirkpatrick like models and short range Ising models on the lattice. In both cases we establish conditions on how fast $E_N$ can be allowed to grow, in the case when the couplings are Gaussian.
Note added. After this paper was submitted, Borgs et al. published an interesting preprint [BCMN2] where the following results were obtained: (i) In the p-spin SK models, with $p = 1, 2$, our growth conditions on $E_N$ are optimal, i.e. for $E_N \sim CN^{1/4}$ ($p = 1$), resp. $E_N \sim CN^{1/2}$ ($p = 2$), the REM conjecture cannot hold, at least if c is small enough. They also extended our results in these examples to the case of non-Gaussian interactions, provided they have some finite exponential moments.
2. Abstract Theorems
In this section we will formulate a general result that implies the REM property under some concise conditions that can be verified in concrete examples. This will also allow us to present the broad outline of the structure of the proof without having to bother with technical details. Our approach to the proof of the Mertens conjecture is based on the following theorem, which provides a criterion for Poisson convergence in a rather general setting.
Theorem 2.1. Let $V_{i,M} \ge 0$, $i \in \mathbb{N}$, be a family of non-negative random variables satisfying the following assumptions: for any $\ell \in \mathbb{N}$ and all sets of constants $b_j > 0$, $j = 1, \dots, \ell$,
$\lim_{M \uparrow \infty} \sum_{(i_1, \dots, i_\ell) \subset \{1, \dots, M\}} \mathbb{P}\big(\forall_{j=1}^{\ell}\, V_{i_j,M} < b_j\big) = \prod_{j=1}^{\ell} b_j,$ (2.1)
where the sum is taken over all possible sequences of different indices $(i_1, \dots, i_\ell)$. Then the point process
$\mathcal{P}_M = \sum_{i=1}^{M} \delta_{V_{i,M}},$ (2.2)
on $\mathbb{R}_+$, converges weakly in distribution, as $M \uparrow \infty$, to the standard Poisson point process, $\mathcal{P}$, on $\mathbb{R}_+$ (i.e., the Poisson point process whose intensity measure is the Lebesgue measure).
Remark. Theorem 2.1 was proven (in a more general form, involving vector valued random variables) in [BK2]. It is very similar in its spirit to an analogous theorem for the case of exchangeable variables proven in [BM] in an application to the Hopfield model. The rather simple proof in the scalar setting can be found in Chapter 13 of [Bo].
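As a sanity check of the criterion (2.1), consider the simplest case, which is an illustration added here and not an example from the paper: $V_{i,M} = M U_i$ with $U_i$ i.i.d. uniform on $[0,1]$. For distinct indices the events are independent with probability $b_j/M$ each, and there are $M!/(M-\ell)!$ ordered tuples of distinct indices, so the sum in (2.1) can be evaluated exactly. The helper name `factorial_moment_sum` is hypothetical.

```python
from math import perm, prod

def factorial_moment_sum(M, bs):
    """Sum in (2.1) for V_{i,M} = M * U_i, with U_i i.i.d. uniform on [0, 1].

    Assumes every b_j <= M, so that P(V_{i,M} < b_j) = b_j / M.
    perm(M, l) counts the ordered l-tuples of distinct indices in {1, ..., M}.
    """
    l = len(bs)
    return perm(M, l) * prod(b / M for b in bs)

bs = [0.5, 1.0, 2.0]
for M in (10, 100, 10_000):
    print(M, factorial_moment_sum(M, bs))
```

The printed values approach $b_1 b_2 b_3 = 1$ as $M$ grows, which is the hypothesis of Theorem 2.1 in this independent toy case.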
Naturally, we want to apply this theorem with $V_{i,M}$ given by $|N^{-1/2} H_N(\sigma) - E_N|$, properly normalised. We will now introduce a setting in which the assumptions of Theorem 2.1 are verified. Consider a product space $S^N$ where $S$ is a finite set. We define on $S^N$ a real-valued random process, $Y_N(\sigma)$. Assume for simplicity that
$\mathbb{E} Y_N(\sigma) = 0, \quad \mathbb{E}(Y_N(\sigma))^2 = 1.$ (2.3)
Define on $S^N \times S^N$
$b_N(\sigma, \sigma') \equiv \mathrm{cov}(Y_N(\sigma), Y_N(\sigma')).$ (2.4)
Let us also introduce on $S^N$ the Gaussian process, $Z_N$, that has the same mean and the same covariance matrix as $Y_N$. Let $G$ be the group of automorphisms on $S^N$, such that, for $g \in G$, $Y_N(g\sigma) = Y_N(\sigma)$, and let $F$ be the larger group, such that, for $g \in F$, $|Y_N(g\sigma)| = |Y_N(\sigma)|$. Let
$E_N = cN^{\alpha}, \quad c, \alpha \in \mathbb{R}, \quad 0 \le \alpha < 1/2,$ (2.5)
be a sequence of real numbers that is either a constant, $c \in \mathbb{R}$, if $\alpha = 0$, or converges to plus or minus infinity, if $\alpha > 0$. We will define sets $\Sigma_N$ as follows: If $c \neq 0$, we denote by $\Sigma_N$ the set of residual classes of $S^N$ modulo $G$; if $c = 0$, we let $\Sigma_N$ be the set of residual classes modulo $F$. We will assume throughout that $|\Sigma_N| > \kappa^N$, for some $\kappa > 1$. Set
$\delta_N = \sqrt{\pi/2}\; e^{E_N^2/2}\, |\Sigma_N|^{-1}.$ (2.6)
Note that for $\alpha < 1/2$, $\delta_N$ is exponentially small in $N$. $\delta_N$ is chosen such that, for any $b \ge 0$,
$\lim_{N \uparrow \infty} |\Sigma_N|\, \mathbb{P}(|Z_N(\sigma) - E_N| < b\,\delta_N) = b.$ (2.7)
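The normalisation in (2.6) and (2.7) can be checked numerically for a single standard Gaussian variable. The sketch below is illustrative only: the function names and the value $2^{19}$ used for $|\Sigma_N|$ are assumptions, and the Gaussian distribution function is computed from `math.erf`.

```python
from math import erf, exp, pi, sqrt

def gaussian_cdf(x):
    """Distribution function of a standard Gaussian."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def delta_N(E_N, size):
    """delta_N as in (2.6): sqrt(pi/2) * exp(E_N^2 / 2) / |Sigma_N|."""
    return sqrt(pi / 2.0) * exp(E_N ** 2 / 2.0) / size

def expected_count(E_N, size, b):
    """|Sigma_N| * P(|Z - E_N| < b * delta_N) for a standard Gaussian Z."""
    d = delta_N(E_N, size)
    return size * (gaussian_cdf(E_N + b * d) - gaussian_cdf(E_N - b * d))

size = 2 ** 19  # hypothetical |Sigma_N|, e.g. 20 spins modulo a global flip
for E_N in (0.0, 1.0, 2.0):
    print(E_N, expected_count(E_N, size, b=1.5))
```

For each energy level the result is close to $b = 1.5$, which is the content of (2.7): the window $b\,\delta_N$ is tuned so that, on average, about $b$ of the $|\Sigma_N|$ values fall into it.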
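For $\ell \in \mathbb{N}$, and any collection, $(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell}$, we denote by $B_N(\sigma^1, \dots, \sigma^\ell)$ the covariance matrix of $Y_N(\sigma^i)$, with elements
$b_{i,j}(\sigma^1, \dots, \sigma^\ell) \equiv b_N(\sigma^i, \sigma^j).$ (2.8)
Assumption A.
(i) Let $R^{\eta}_{N,\ell}$ denote the set
$R^{\eta}_{N,\ell} \equiv \{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell} : \forall_{1 \le i < j \le \ell}\ |b_N(\sigma^i, \sigma^j)| \le N^{-\eta}\}.$ (2.9)
Then there exists a continuous decreasing function, $\rho(\eta) > 0$, on $]\eta_0, \tilde\eta_0[$ (for some $\tilde\eta_0 \ge \eta_0 > 0$), and $\mu > 0$, such that
$|R^{\eta}_{N,\ell}| \ge \big(1 - \exp(-\mu(\eta) N^{\rho(\eta)})\big)\, |\Sigma_N|^{\ell}.$ (2.10)
(ii) Let $\ell \ge 2$, $r = 1, \dots, \ell - 1$. Let
$L^{\ell}_{N,r} = \{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell} : \forall_{1 \le i < j \le \ell}\ |Y_N(\sigma^i)| \neq |Y_N(\sigma^j)|,\ \mathrm{rank}(B_N(\sigma^1, \dots, \sigma^\ell)) = r\}.$ (2.11)
Then there exists $d_{r,\ell} > 0$, such that, for all $N$ large enough,
$|L^{\ell}_{N,r}| \le |\Sigma_N|^{r} e^{-d_{r,\ell} N}.$ (2.12)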
(iii) For any $\ell \ge 1$, any $r = 1, 2, \dots, \ell$, and any $b_1, \dots, b_\ell \ge 0$, there exist constants, $p_{r,\ell} \ge 0$ and $Q < \infty$, such that, for any $(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell}$ for which $\mathrm{rank}(B_N(\sigma^1, \dots, \sigma^\ell)) = r$,
$\mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < \delta_N b_i\big) \le Q\, \delta_N^{r} N^{p_{r,\ell}}.$ (2.13)
Theorem 2.2. Assume Assumptions A hold. Assume that $\alpha \in [0, 1/2[$ is such that, for some $\eta_1 \le \eta_2 \in\, ]\eta_0, \tilde\eta_0[$, we have:
$\alpha < \eta_2/2,$ (2.14)
$\alpha < \eta/2 + \rho(\eta)/2, \quad \forall \eta \in\, ]\eta_1, \eta_2[,$ (2.15)
and
$\alpha < \rho(\eta_1)/2.$ (2.16)
Furthermore, assume that, for any $\ell \ge 1$, any $b_1, \dots, b_\ell > 0$, and $(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_1}_{N,\ell}$,
$\mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < \delta_N b_i\big) = \mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < \delta_N b_i\big) + o(|\Sigma_N|^{-\ell}).$ (2.17)
Then, the point process,
$\mathcal{P}_N \equiv \sum_{\sigma \in \Sigma_N} \delta_{\{\delta_N^{-1} |Y_N(\sigma) - E_N|\}} \to \mathcal{P}$ (2.18)
converges weakly, as $N \uparrow \infty$, to the standard Poisson point process $\mathcal{P}$ on $\mathbb{R}_+$. Moreover, for any $\epsilon > 0$ and any $b \in \mathbb{R}_+$,
$\mathbb{P}\big(\forall N_0\ \exists N \ge N_0 : \exists \sigma, \sigma' : |b_N(\sigma, \sigma')| > \epsilon : |Y_N(\sigma) - E_N| \le |Y_N(\sigma') - E_N| \le \delta_N b\big) = 0.$ (2.19)
Remark. Before giving the proof of the theorem, let us comment on the various assumptions. (i) Assumption A (i) holds with some $\eta$ in any reasonable model, but the function $\rho(\eta)$ is model dependent. (ii) Assumptions A (ii) and (iii) are also apparently valid in most cases, but can be tricky sometimes. An example where (ii) proved difficult is the k-partitioning problem, with $k > 2$ [BK2]. (iii) Condition (2.17) is essentially a local central limit theorem. In the case $\alpha = 0$ it holds, if the Hamiltonian is a sum over independent random interactions, under mild decay assumptions on the characteristic function of the distributions of the interactions. Note that some such assumptions are obviously necessary, since if the random interactions take on only finitely many values, then the Hamiltonian will also take values on a lattice, whose spacings are not exponentially small, as would be necessary for the theorem to hold. Otherwise, if $\alpha > 0$, this will require further assumptions on the interactions. We will leave this problem open in the present paper. It is of course trivially verified, if the interactions are Gaussian.
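Conditions (2.14) to (2.16) determine how large $\alpha$ may be once $\rho(\eta)$ is known. The following sketch searches a grid of pairs $\eta_1 \le \eta_2$ for the largest admissible bound. It is only an illustration: the function `max_alpha` is hypothetical, the closed interval is used in place of the open one in (2.15) (which can only make the bound smaller), and the test function $\rho(\eta) = d(1 - 2\eta)$ on $]0, 1/2[$ is the form read off from (3.42) for the short-range models.

```python
def max_alpha(rho, eta_lo, eta_hi, steps=200):
    """Grid search for sup of min(eta2/2, inf_{[eta1,eta2]} (eta + rho(eta))/2, rho(eta1)/2).

    rho is a function on the open interval (eta_lo, eta_hi); the grid uses
    interior points only. Any alpha strictly below the returned value
    satisfies (2.14)-(2.16) for some grid pair eta1 <= eta2.
    """
    grid = [eta_lo + (eta_hi - eta_lo) * (i + 1) / (steps + 1) for i in range(steps)]
    best = 0.0
    for i, eta1 in enumerate(grid):
        running = float("inf")  # running minimum of eta/2 + rho(eta)/2 on [eta1, eta2]
        for eta2 in grid[i:]:
            running = min(running, eta2 / 2 + rho(eta2) / 2)
            best = max(best, min(eta2 / 2, running, rho(eta1) / 2))
    return best

for d in (1, 2, 3):
    print(d, max_alpha(lambda eta, d=d: d * (1 - 2 * eta), 0.0, 0.5))
```

For every $d \ge 1$ the printed value is just below $1/4$, consistent with the range $0 \le \alpha < 1/4$ stated for Gaussian couplings in Theorem 3.7.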
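Proof. We just have to verify the hypothesis of Theorem 2.1, for $V_{i,M}$ given by $\delta_N^{-1} |Y_N(\sigma) - E_N|$, i.e., we must show that
$\sum_{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell}} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < b_i \delta_N\big) \to b_1 \cdots b_\ell.$ (2.20)
We split this sum into the sums over the set $R^{\eta_1}_{N,\ell}$ and its complement. First, by the assumption (2.17),
$\sum_{(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_1}_{N,\ell}} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < b_i \delta_N\big) = \sum_{(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_1}_{N,\ell}} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < b_i \delta_N\big) + o(1).$ (2.21)
But, with $C(E_N) = \{x = (x_1, \dots, x_\ell) \in \mathbb{R}^{\ell} : \forall_{i=1}^{\ell}\ |E_N - x_i| \le \delta_N b_i\}$,
$\mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < b_i \delta_N\big) = \int_{C(E_N)} \frac{e^{-(z, B_N^{-1}(\sigma^1, \dots, \sigma^\ell) z)/2}}{(2\pi)^{\ell/2} \sqrt{\det(B_N(\sigma^1, \dots, \sigma^\ell))}}\, dz,$ (2.22)
where $B_N(\sigma^1, \dots, \sigma^\ell)$ is the covariance matrix defined in (2.8). Since $\delta_N$ is exponentially small in $N$, we see that, uniformly for $(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_1}_{N,\ell}$, the integral (2.22) equals
$(2\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-(\vec{E}_N, B^{-1}(\sigma^1, \dots, \sigma^\ell) \vec{E}_N)/2}\, (1 + o(1)),$ (2.23)
where we denote by $\vec{E}_N$ the vector $(E_N, \dots, E_N)$.
We treat separately the sum (2.21) taken over the smaller set, $R^{\eta_2}_{N,\ell} \subset R^{\eta_1}_{N,\ell}$, and the one over $R^{\eta_1}_{N,\ell} \setminus R^{\eta_2}_{N,\ell}$. Since, by (2.14), $\eta_2$ is chosen such that $E_N^2 N^{-\eta_2} \to 0$, by (2.17), (2.22), and (2.23), each term in the sum over $R^{\eta_2}_{N,\ell}$ equals
$(2\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-\frac{1}{2} \ell E_N^2 (1 + O(N^{-\eta_2}))}\, (1 + o(1)) = (b_1 \cdots b_\ell)\, |\Sigma_N|^{-\ell} (1 + o(1)),$ (2.24)
uniformly for $(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_2}_{N,\ell}$. Hence by Assumption A (i)
$\sum_{(\sigma^1, \dots, \sigma^\ell) \in R^{\eta_2}_{N,\ell}} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < b_i \delta_N\big) = |R^{\eta_2}_{N,\ell}|\, |\Sigma_N|^{-\ell} (b_1 \cdots b_\ell)(1 + o(1)) \to b_1 \cdots b_\ell.$ (2.25)
Now let us consider the remaining set $R^{\eta_1}_{N,\ell} \setminus R^{\eta_2}_{N,\ell}$ (if it is non-empty, i.e., if $\eta_1 < \eta_2$), and let us find $\eta_1 = \eta^0 < \eta^1 < \cdots < \eta^n = \eta_2$, such that
$\alpha < \eta^i/2 + \rho(\eta^{i+1})/2 \quad \forall i = 0, 1, \dots, n - 1,$ (2.26)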
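which is possible due to the assumption (2.15). Then let us split the sum over $R^{\eta_1}_{N,\ell} \setminus R^{\eta_2}_{N,\ell}$ into $n$ sums, each over $R^{\eta^i}_{N,\ell} \setminus R^{\eta^{i+1}}_{N,\ell}$, $i = 0, 1, \dots, n - 1$. By (2.17), (2.22), and (2.23), we have, uniformly for $(\sigma^1, \dots, \sigma^\ell) \in R^{\eta^i}_{N,\ell}$,
$\mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < b_i \delta_N\big) = (2\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-\frac{1}{2} \ell E_N^2 (1 + O(N^{-\eta^i}))}\, (1 + o(1)) \le C |\Sigma_N|^{-\ell} e^{N^{2\alpha - \eta^i}},$ (2.27)
for some constant $C < \infty$. Thus by Assumption A (i),
$\sum_{R^{\eta^i}_{N,\ell} \setminus R^{\eta^{i+1}}_{N,\ell}} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Z_N(\sigma^i) - E_N| < b_i \delta_N\big) \le C\, |\Sigma_N^{\otimes \ell} \setminus R^{\eta^{i+1}}_{N,\ell}|\, |\Sigma_N|^{-\ell} e^{N^{2\alpha - \eta^i}} \le C \exp\big(-\mu(\eta^{i+1}) N^{\rho(\eta^{i+1})} + N^{2\alpha - \eta^i}\big),$ (2.28)
that, by (2.26), converges to zero, as $N \to \infty$, for any $i = 0, 1, \dots, n - 1$. So the sum (2.21) over $R^{\eta_1}_{N,\ell} \setminus R^{\eta_2}_{N,\ell}$ vanishes.
Now we turn to the sum over collections, $(\sigma^1, \dots, \sigma^\ell) \notin R^{\eta_1}_{N,\ell}$. We distinguish the cases when $\det(B_N(\sigma^1, \dots, \sigma^\ell)) = 0$ and $\det(B_N(\sigma^1, \dots, \sigma^\ell)) \neq 0$. For the contributions from the latter case, using Assumptions A (i) and (iii), we get readily that
$\sum_{(\sigma^1, \dots, \sigma^\ell) \notin R^{\eta_1}_{N,\ell},\ \mathrm{rank}(B_N(\sigma^1, \dots, \sigma^\ell)) = \ell} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < \delta_N b_i\big) \le |\Sigma_N|^{\ell} e^{-\mu(\eta_1) N^{\rho(\eta_1)}}\, Q\, \delta_N^{\ell} N^{p_{\ell,\ell}} \le C N^{p_{\ell,\ell}} \exp\big(-\mu(\eta_1) N^{\rho(\eta_1)} + \ell E_N^2/2\big).$ (2.29)
The right-hand side of (2.29) tends to zero exponentially fast, if condition (2.16) is verified. Finally, we must deal with the contributions from the cases when the covariance matrix is degenerate, namely
$\sum_{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell},\ \mathrm{rank}(B_N(\sigma^1, \dots, \sigma^\ell)) = r} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < b_i \delta_N\big),$ (2.30)
for $r = 1, \dots, \ell - 1$. In the case $c = 0$, this sum is taken over the set $L^{\ell}_{N,r}$, since $\sigma$ and $\sigma'$ in $\Sigma_N$ are different, iff $|Y_N(\sigma)| \neq |Y_N(\sigma')|$, by definition of $\Sigma_N$. In the case $c \neq 0$, this sum is taken over $\ell$-tuples $(\sigma^1, \dots, \sigma^\ell)$ of different elements of $\Sigma_N$, i.e., such that $Y_N(\sigma^i) \neq Y_N(\sigma^j)$, for any $1 \le i < j \le \ell$. But for all $N$ large enough, all terms in this sum over $\ell$-tuples, $(\sigma^1, \dots, \sigma^\ell)$, such that $Y_N(\sigma^i) = -Y_N(\sigma^j)$, for some $1 \le i < j \le \ell$, equal zero, since the events $\{|Y_N(\sigma^i) - E_N| < b_i \delta_N\}$ and $\{|-Y_N(\sigma^i) - E_N| < b_j \delta_N\}$, with $E_N = cN^{\alpha}$, $c \neq 0$, are disjoint. Therefore (2.30) is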
reduced to the sum over $L^{\ell}_{N,r}$ in the case $c \neq 0$ as well. Then, by Assumptions A (ii) and (iii), it is bounded from above by
$|L^{\ell}_{N,r}|\, Q (\delta_N)^{r} N^{p_{r,\ell}} \le |\Sigma_N|^{r} e^{-d_{r,\ell} N}\, Q (\delta_N)^{r} N^{p_{r,\ell}} \le C e^{-d_{r,\ell} N} e^{r E_N^2/2} N^{p_{r,\ell}}.$ (2.31)
This bound converges to zero exponentially fast, since $E_N^2 = c^2 N^{2\alpha}$, with $\alpha < 1/2$. This concludes the proof of the first part of the theorem. The second assertion (2.19) is elementary: by (2.29) and (2.31), the sum of terms $\mathbb{P}(\forall_{i=1}^{2} : |Y_N(\sigma^i) - E_N| < \delta_N b)$ over all pairs, $(\sigma^1, \sigma^2) \in \Sigma_N^{\otimes 2} \setminus R^{\eta_1}_{N,2}$, such that $\sigma^1 \neq \sigma^2$, converges to zero exponentially fast. Thus (2.19) follows from the Borel-Cantelli lemma.
Finally, we remark that the results of Theorem 2.2 can be extended to the case when $\mathbb{E} Y_N(\sigma) \neq 0$, if $\alpha = 0$, i.e., $E_N = c$. Note that, e.g., the unrestricted number partitioning problem falls into this class. Let now $Z_N(\sigma)$ be the Gaussian process with the same mean and covariances as $Y_N(\sigma)$. Let us consider both the covariance matrix, $B_N$, and the mean of $Y_N$, $\mathbb{E} Y_N(\sigma)$, as random variables on the probability space $(\Sigma_N, \mathcal{B}_N, \mathbb{E}_\sigma)$, where $\mathbb{E}_\sigma$ is the uniform law on $\Sigma_N$. Assume that, for any $\ell \ge 1$,
$B_N(\sigma^1, \dots, \sigma^\ell) \xrightarrow{D} \mathrm{Id}, \quad N \uparrow \infty,$ (2.32)
where $\mathrm{Id}$ denotes the identity matrix, and
$\mathbb{E} Y_N(\sigma) \xrightarrow{D} D, \quad N \uparrow \infty,$ (2.33)
where $D$ is some random variable. Let
$\tilde\delta_N = \sqrt{\pi/2}\; K^{-1} |\Sigma_N|^{-1},$ (2.34)
where
$K \equiv \mathbb{E}\, e^{-(c - D)^2/2}.$ (2.35)
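Theorem 2.3. Assume that, for some $R > 0$, $|\mathbb{E} Y_N(\sigma)| \le N^{R}$, for all $\sigma \in \Sigma_N$. Assume that (2.10) holds for some $\eta > 0$ and that (ii) and (iii) of Assumptions A are valid. Assume that there exists a set, $Q_N \subset R^{\eta}_{N,\ell}$, such that (2.17) is valid for any $(\sigma^1, \dots, \sigma^\ell) \in Q_N$, and that $|R^{\eta}_{N,\ell} \setminus Q_N| \le |\Sigma_N|^{\ell} e^{-N^{\gamma}}$, with some $\gamma > 0$. Then, the point process
$\mathcal{P}_N \equiv \sum_{\sigma \in \Sigma_N} \delta_{\{\tilde\delta_N^{-1} |Y_N(\sigma) - E_N|\}} \to \mathcal{P}$ (2.36)
converges weakly to the standard Poisson point process $\mathcal{P}$ on $\mathbb{R}_+$.
Proof. We must prove again the convergence of the sum (2.20), that we split into three sums: the first over $Q_N$, the second over $R^{\eta}_{N,\ell} \setminus Q_N$, and the third over the complement of the set $R^{\eta}_{N,\ell}$. By assumption, (2.17) is valid on $Q_N$, and thus the terms of the first sum are reduced to
$\int_{\forall i = 1, \dots, \ell :\ |z_i - c| < \tilde\delta_N b_i} \frac{e^{-((\vec{z} - \vec{\mathbb{E}} Y_N(\sigma)),\, B_N^{-1}(\sigma^1, \dots, \sigma^\ell)(\vec{z} - \vec{\mathbb{E}} Y_N(\sigma)))/2}}{(2\pi)^{\ell/2} \sqrt{\det(B_N(\sigma^1, \dots, \sigma^\ell))}}\, d\vec{z} = (2\tilde\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-((\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)),\, B^{-1}(\sigma^1, \dots, \sigma^\ell)(\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)))/2}\, (1 + o(1)),$ (2.37)
with $\vec{c} \equiv (c, \dots, c)$, and $\vec{\mathbb{E}} Y_N(\sigma) \equiv (\mathbb{E} Y_N(\sigma^1), \dots, \mathbb{E} Y_N(\sigma^\ell))$, since $\tilde\delta_N$ is exponentially small and $|\mathbb{E} Y_N(\sigma)| \le N^{R}$. By definition of $\tilde\delta_N$, the quantities (2.37) are at most $O(|\Sigma_N|^{-\ell})$, while, by the estimate (2.10) and by the assumption on the cardinality of $R^{\eta}_{N,\ell} \setminus Q_N$, the number of $\ell$-tuples of configurations in $\Sigma_N^{\otimes \ell} \setminus R^{\eta}_{N,\ell}$ and in $R^{\eta}_{N,\ell} \setminus Q_N$ is exponentially smaller than $|\Sigma_N|^{\ell}$. Hence
$\sum_{(\sigma^1, \dots, \sigma^\ell) \in Q_N} \mathbb{P}\big(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < b_i \tilde\delta_N\big)$
$= \sum_{(\sigma^1, \dots, \sigma^\ell) \in Q_N} (2\tilde\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-((\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)),\, B^{-1}(\sigma^1, \dots, \sigma^\ell)(\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)))/2} \times (1 + o(1)) + o(1)$
$= \sum_{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell}} (2\tilde\delta_N/\sqrt{2\pi})^{\ell}\, (b_1 \cdots b_\ell)\, e^{-((\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)),\, B^{-1}(\sigma^1, \dots, \sigma^\ell)(\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)))/2} \times (1 + o(1)) + o(1)$
$= \frac{b_1 \cdots b_\ell}{|\Sigma_N|^{\ell} K^{\ell}} \sum_{(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell}} e^{-((\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)),\, B^{-1}(\sigma^1, \dots, \sigma^\ell)(\vec{c} - \vec{\mathbb{E}} Y_N(\sigma)))/2} \times (1 + o(1)) + o(1).$ (2.38)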
The last quantity converges to $b_1 \cdots b_\ell$, by the assumptions (2.32), (2.33) and (2.35). The sum of the probabilities, $\mathbb{P}(\forall_{i=1}^{\ell} : |Y_N(\sigma^i) - E_N| < \tilde\delta_N b_i)$, over all $\ell$-tuples of $R^{\eta}_{N,\ell} \setminus Q_N$, contains at most $|\Sigma_N|^{\ell} e^{-N^{\gamma}}$ terms, while, by Assumption A (iii) (and since, for any $(\sigma^1, \dots, \sigma^\ell) \in R^{\eta}_{N,\ell}$, the rank of $B_N(\sigma^1, \dots, \sigma^\ell)$ equals $\ell$), each term is at most of order $|\Sigma_N|^{-\ell}$, up to a polynomial factor. Thus this sum converges to zero. Finally, the sum of the same probabilities over the collections $(\sigma^1, \dots, \sigma^\ell) \in \Sigma_N^{\otimes \ell} \setminus R^{\eta}_{N,\ell}$ converges to zero, exponentially fast, by the same arguments as those leading to (2.29) and (2.31), with $\eta_1 = \eta$.
3. Examples
We will now show that the assumptions of our theorem are verified in a wide class of physically relevant models: 1) the Gaussian p-spin SK models, 2) SK-models with non-Gaussian couplings, and 3) short-range spin-glasses. In the last two examples we consider only the case $\alpha = 0$.
3.1. p-spin Sherrington-Kirkpatrick models, $0 \le \alpha < 1/2$. In this subsection we illustrate our general theorem in the class of Sherrington-Kirkpatrick models. Consider $S = \{-1, 1\}$:
$H_N(\sigma) = \frac{\sqrt{N}}{\sqrt{\binom{N}{p}}} \sum_{1 \le i_1 < \cdots < i_p \le N} J_{i_1, \dots, i_p}\, \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_p}$ (3.1)
1≤i1 δ. Changing variables s = Np |IN4 (σ 1 , . . . , σ )| ≤ Q2−N N p/2
−(−1) N p (1+o(1)) φ(sm )2 dsm .
(3.36)
s >ζ m=1
Assumption B made on φ(s) implies that φ(s) is aperiodic, and thus |φ(s)| < 1, for any s = 0. Moreover, for any ζ > 0, there exists h(ζ ) > 0, such that |φ(s)| < 1 − h(ζ ), for all s with |s| > ζ /. Therefore, the right-hand side of (3.36) does not exceed Q2
−N
N
p/2
(1 − h(ζ ))
2−(−1) N p (1+o(1))−2
φ(sm )2 dsm ,
(3.37)
s >η m=1
where the integral is finite again due to Assumption B. Therefore, $I_N^4(\sigma^1, \dots, \sigma^\ell)$ is exponentially smaller than $2^{-N\ell}$. This proves (2.17) on $Q_N$ and hence the theorem.
3.3. Short range spin glasses. As a final example, we consider short-range spin glass models. To avoid unnecessary complications, we will look at models on the d-dimensional torus, $\Lambda_N$, of length $N$. We consider Hamiltonians of the form
$H_N(\sigma) \equiv -N^{-d/2} \sum_{A \subset \Lambda_N} r_A J_A \sigma_A,$ (3.38)
where $\sigma_A \equiv \prod_{x \in A} \sigma_x$, $r_A$ are given constants, and $J_A$ are random variables. We will introduce some notation:
(a) Let $\mathcal{A}_N$ denote the set of all $A \subset \Lambda_N$, such that $r_A \neq 0$.
(b) For any two subsets, $A, B \subset \Lambda_N$, we say that $A \sim B$, iff there exists $x \in \Lambda_N$ such that $B = A + x$. We denote by $\mathcal{A}$ the set of equivalence classes of $\mathcal{A}_N$ under this relation.
We will assume that the constants, $r_A$, and the random variables, $J_A$, satisfy the following conditions:
(i) $r_A = r_{A+x}$, for any $x \in \Lambda_N$;
(ii) there exists $k \in \mathbb{N}$, such that any equivalence class in $\mathcal{A}$ has a representative $A \subset \Lambda_k$; we will identify the set $\mathcal{A}$ with a uniquely chosen set of representatives contained in $\Lambda_k$;
(iii) $\sum_{A \subset \Lambda_N} r_A^2 = N^d$;
(iv) $J_A$, $A \subset \mathbb{Z}^d$, are a family of independent random variables, such that
(v) $J_A$ and $J_{A+x}$ are identically distributed for any $x \in \mathbb{Z}^d$;
(vi) $\mathbb{E} J_A = 0$ and $\mathbb{E} J_A^2 = 1$, and $\mathbb{E} J_A^3 < \infty$;
(vii) for any $A \in \mathcal{A}$, the Fourier transform $\phi_A(s) \equiv \mathbb{E} \exp(is J_A)$ of $J_A$ satisfies $|\phi_A(s)| = O(|s|^{-1})$ as $|s| \to \infty$.
Observe that $\mathbb{E} H_N(\sigma) = 0$,
$b(\sigma, \sigma') \equiv N^{-d}\, \mathbb{E} H_N(\sigma) H_N(\sigma') = N^{-d} \sum_{A \subset \Lambda_N} r_A^2 \sigma_A \sigma'_A \le 1,$ (3.39)
where equality holds, if $\sigma = \sigma'$. Note that $Y_N(\sigma) = Y_N(\sigma')$ (resp. $Y_N(\sigma) = -Y_N(\sigma')$), if and only if, for all $A \in \mathcal{A}_N$, $\sigma_A = \sigma'_A$ (resp. $\sigma_A = -\sigma'_A$). E.g., in the standard Edwards-Anderson model, with nearest neighbor pair interaction, if $\sigma_x$ differs from $\sigma'_x$ on every second site, $x$, then $Y_N(\sigma) = -Y_N(\sigma')$, and if $\sigma' = -\sigma$, $Y_N(\sigma) = Y_N(\sigma')$. In general, we will consider two configurations, $\sigma, \sigma' \in S^{\Lambda_N}$, as equivalent, iff for all $A \in \mathcal{A}_N$, $\sigma_A = \sigma'_A$. We denote the set of these equivalence classes by $\Sigma_N$. We will assume in the sequel that there is a finite constant, $\Gamma \ge 1$, such that $|\Sigma_N| \ge 2^{N^d} \Gamma^{-1}$. In the special case of $c = 0$, the equivalence relation will be extended to include the case $\sigma_A = -\sigma'_A$, for all $A \in \mathcal{A}_N$. In most reasonable examples (e.g. whenever nearest neighbor pair interactions are included in the set $\mathcal{A}$), the constant $\Gamma \le 2$ (resp. $\Gamma \le 4$, if $c = 0$).
Theorem 3.7. Let $c \in \mathbb{R}$, and $\Sigma_N$ be the space of equivalence classes defined before. Let $\delta_N \equiv |\Sigma_N|^{-1} e^{c^2/2} \sqrt{\pi/2}$. Then the point process
$\mathcal{P}_N \equiv \sum_{\sigma \in \Sigma_N} \delta_{\{\delta_N^{-1} |H_N(\sigma) - c|\}},$ (3.40)
converges weakly to the standard Poisson point process on $\mathbb{R}_+$. If, moreover, the random variables $J_A$ are Gaussian, then, for any $c \in \mathbb{R}$, and $0 \le \alpha < 1/4$, with $\delta_N \equiv |\Sigma_N|^{-1} e^{N^{2\alpha} c^2/2} \sqrt{\pi/2}$, the point process
$\mathcal{P}_N \equiv \sum_{\sigma \in \Sigma_N} \delta_{\{\delta_N^{-1} |H_N(\sigma) - cN^{\alpha}|\}},$ (3.41)
converges weakly to the standard Poisson point process on $\mathbb{R}_+$.
Proof. We will now show that Assumptions A of Theorem 2.3 hold. First, the point (i) of Assumption A is verified due to the following proposition.
Proposition 3.8. Let $R^{\eta}_{N,\ell}$ be defined as in (2.9). Then, in the setting above, for all $0 \le \eta < \frac{1}{2}$,
$|R^{\eta}_{N,\ell}| \ge |\Sigma_N|^{\ell} \big(1 - e^{-hN^{d(1-2\eta)}}\big),$ (3.42)
with some constant $h > 0$.
Proof. Let $\mathbb{E}_\sigma$ denote the expectation under the uniform probability measure on $\{-1, 1\}^{\Lambda_N}$. We will show that there exists a constant, $K > 0$, such that, for any $\sigma'$, and any $0 \le \delta_N \le 1$,
$\mathbb{P}_\sigma(\sigma : b(\sigma, \sigma') > \delta_N) \le \exp(-K \delta_N^2 N^d).$ (3.43)
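A bound of the form (3.43) can be checked exactly in the simplest special case. The sketch below is an added illustration, not part of the proof: it assumes $d = 1$, a ring of $n$ sites, $r_A = 1$ for nearest neighbour pairs and $r_A = 0$ otherwise, and $\sigma' \equiv 1$, so that $b(\sigma, \sigma') = n^{-1} \sum_x \sigma_x \sigma_{x+1}$. The comparison constant $K = 1/2$ is a Hoeffding-type guess, not the constant of the proposition.

```python
from fractions import Fraction
from math import comb, exp

def overlap_tail(n, delta):
    """Exact P(b > delta) for b = (1/n) * sum_x sigma_x * sigma_{x+1} on a ring of n sites.

    A configuration with m broken bonds has b = (n - 2m) / n. On a ring m is
    even, and exactly 2 * C(n, m) of the 2^n configurations have m broken bonds.
    """
    total = Fraction(0)
    for m in range(0, n + 1, 2):
        if n - 2 * m > delta * n:
            total += Fraction(2 * comb(n, m), 2 ** n)
    return total

n, delta = 200, 0.2
print(float(overlap_tail(n, delta)), exp(-delta ** 2 * n / 2))
```

In this toy case the exact tail probability lies below $\exp(-\delta_N^2 N/2)$, in line with the Gaussian-type decay in $\delta_N^2 N^d$ asserted in (3.43).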
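Note that without loss, we can take $\sigma' \equiv 1$. We want to use the exponential Chebyshev inequality and thus need to estimate the Laplace transform
$\mathbb{E}_\sigma \exp\Big(t N^{-d} \sum_{A \in \Lambda_N} r_A^2 \sigma_A\Big).$ (3.44)
Let us assume for simplicity that $N = nk$ is a multiple of $k$, and introduce the sub-lattice, $\Lambda_{N,k} \equiv \{0, k, \dots, (n-1)k, nk\}^d$. Write
$\sum_{A \in \Lambda_N} r_A^2 \sigma_A = \sum_{x \in \Lambda_k} \sum_{y \in \Lambda_{N,k}} \sum_{A \in \mathcal{A}} r_{A+y+x}^2\, \sigma_{A+y+x} \equiv \sum_{x \in \Lambda_k} Z_x(\sigma),$ (3.45)
where
$Z_x(\sigma) = \sum_{y \in \Lambda_{N,k}} Y_{x,y}(\sigma)$ (3.46)
has the nice feature that, for fixed $x$, the summands
$Y_{x,y}(\sigma) \equiv \sum_{A \in \mathcal{A}} r_{A+y+x}^2\, \sigma_{A+y+x}$
are independent for different $y, y' \in \Lambda_{N,k}$ (since the sets $A + y + x$ and $A' + y' + x$ are disjoint for any $A, A' \in \Lambda_k$). Using the Hölder inequality repeatedly,
$\mathbb{E}_\sigma \exp\Big(t \sum_{x \in \Lambda_k} Z_x(\sigma)\Big) \le \prod_{x \in \Lambda_k} \Big[\mathbb{E}_\sigma e^{t k^d Z_x(\sigma)}\Big]^{k^{-d}} = \prod_{x \in \Lambda_k} \prod_{y \in \Lambda_{N,k}} \Big[\mathbb{E}_\sigma e^{t k^d Y_{x,y}(\sigma)}\Big]^{k^{-d}} = \Big[\mathbb{E}_\sigma e^{t k^d Y_{0,0}(\sigma)}\Big]^{N^d k^{-d}}.$ (3.47)
It remains to estimate the Laplace transform of $Y_{0,0}(\sigma)$,
$\mathbb{E}_\sigma \exp\big(t k^d Y_{0,0}(\sigma)\big) = \mathbb{E}_\sigma \exp\Big(t k^d \sum_{A \in \Lambda_k} r_A^2 \sigma_A\Big),$ (3.48)
and, since $\mathbb{E}_\sigma \sigma_A = 0$, using that $e^x \le 1 + x + \frac{x^2}{2} e^{|x|}$,
$\mathbb{E}_\sigma \exp\Big(t k^d \sum_{A \in \Lambda_k} r_A^2 \sigma_A\Big) \le \exp\Big(\frac{t^2}{2} k^{2d} \Big(\sum_{A \in \Lambda_k} r_A^2\Big)^2 e^{t k^d \sum_{A \in \Lambda_k} r_A^2}\Big) \equiv \exp\Big(\frac{t^2}{2} C e^{tD}\Big),$ (3.49)
so that
$\mathbb{E}_\sigma \exp\Big(t N^{-d} \sum_{x \in \Lambda_k} Z_x(\sigma)\Big) \le \exp\Big(N^{-d} \frac{t^2}{2} C' e^{t N^{-d} D}\Big),$ (3.50)
with constants, $C$, $C'$, $D$, that do not depend on $N$. To conclude the proof of the lemma, the exponential Chebyshev inequality gives
$\mathbb{P}_\sigma\big(b(\sigma, \sigma') > \delta_N\big) \le \exp\Big(-\delta_N t + \frac{t^2}{2} N^{-d} C' e^{t N^{-d} D}\Big).$ (3.51)
Choosing $t = \epsilon N^d \delta_N$, this gives
$\mathbb{P}_\sigma\big(b(\sigma, \sigma') > \delta_N\big) \le \exp\Big(-\epsilon\, \delta_N^2 N^d \big(1 - \epsilon\, C' e^{\epsilon \delta_N D}/2\big)\Big).$ (3.52)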
Choosing small enough, but independent of N , we obtain the assertion of the lemma. To verifyAssumptionsA (ii) and (iii), we need to introduce the matrix C = C(σ 1 , . . . , with columns and |AN | rows, indexed by the subsets A ∈ AN : the elements of each of its column are rA σA1 , rA σA2 , . . . , rA σA , so that C T C is the covariance matrix, BN (σ 1 , . . . , σ ), up to a multiplicative factor N d . Assumption (ii) is verified due to Proposition 3.3. In fact, let us reduce C to the matrix ˜ ˜ 1 , . . . , σ ) with columns σ 1 , σ 2 , . . . , σ , without the constants rA . Then, C = C(σ A A A exactly as in the case of p-spin SK models, by Proposition (3.3), for any (σ 1 , . . . , σ ) ∈ ˜ 1 , . . . , σ ) can contain at most 2r − 1 different columns. Hence, LN d ,r the matrix C(σ σ )
d
d
|LN d ,r | = O((2r − 1)N ) while | N |r ≥ (2N / )r . Assumption (iii) is verified as well, and its proof is completely analogous to that of Proposition 3.5. The key observation is that, again, the number of possible non-degenerate matrices C¯ r×r that can be obtained from Cp (σ 1 , . . . , σ ) is independent of N . But this is true since, by assumption, the number of different constants rA is N -independent. Finally, we define QN as follows. For any A ∈ A, let j η,A QN, = (σ 1 , . . . , σ ) : ∀1≤i<j ≤ rA2 σAi σA < |A|−1 N −η . (3.53) x∈Zd :x+A⊂N
Let us define QN =
η,A A∈A QN,
η
⊂ RN, . By Proposition (3.8), applied to a model
where |A| = 1, for any A ∈ A, we have |SN⊗ \ QN, | ≤ 2N exp(−hA N d(1−2η) ), with η some hA > 0. Hence, |RN, \QN | has cardinality smaller than | N | exp(−hN d(1−2η) ), with some h > 0. The verification of (2.17) on QN is analogous to the one in Theorem 3.4, using the analogue of Proposition 3.6. We only note a small difference in the analysis of the term IN4 , where we use the explicit construction of QN . We represent the corresponding generating function as the product of |A| terms over different equivalence classes of A, with representatives A ⊂ k , each term being x∈Zd :x+A∈N φ(N −d/2 rA )). Next, we use the fact that for any (σ 1 , . . . , σ ) ∈ Q each 1 × (t1 σx+A + · · · + t σx+A N of these |A| terms is a product of at least 2 − 1 (and of course at most 2 ) different η,A
d
terms, each taken to the power $|\mathcal{A}|^{-1}N^d 2^{-\ell}(1+o(1))$. This proves the first assertion of the theorem. The proof of the second assertion, i.e., the case $\alpha > 0$ with Gaussian variables $J_A$, is immediate from the estimates above and the abstract Theorem 2.2, in view of the fact that the condition (2.17) is trivially verified.

Acknowledgement. We would like to thank Stephan Mertens for interesting discussions.
References

[BFM] Bauke, H., Franz, S., Mertens, S.: Number partitioning as random energy model. J. Stat. Mech.: Theory and Experiment, page P04003 (2004)
[BaMe] Bauke, H., Mertens, S.: Universality in the level statistics of disordered systems. Phys. Rev. E 70, 025102(R) (2004)
[BCP] Borgs, C., Chayes, J., Pittel, B.: Phase transition and finite-size scaling for the integer partitioning problem. Random Structures Algorithms 19(3–4), 247–288 (2001)
[BCMP] Borgs, C., Chayes, J.T., Mertens, S., Pittel, B.: Phase diagram for the constrained integer partitioning problem. Random Structures Algorithms 24(3), 315–380 (2004)
[BCMN] Borgs, C., Chayes, J.T., Mertens, S., Nair, C.: Proof of the local REM conjecture for number partitioning. Preprint 2005, available at http://research.microsoft.com/~chayes/
[BCMN2] Borgs, C., Chayes, J.T., Mertens, S., Nair, C.: Proof of the local REM conjecture for number partitioning II: Growing energy scales. http://arvix.org/list/cond-mat/0508600, 2005
[Bo] Bovier, A.: Statistical mechanics of disordered systems. In: Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, to appear May 2006
[BK1] Bovier, A., Kurkova, I.: Derrida's generalised random energy models. I. Models with finitely many hierarchies. Ann. Inst. H. Poincaré Probab. Statist. 40(4), 439–480 (2004)
[BK2] Bovier, A., Kurkova, I.: Poisson convergence in the restricted k-partitioning problem. Preprint 964, WIAS, 2004, available at http://www.wias-berlin.de/people/files/publications.html, to appear in Random Structures Algorithms (2006)
[BK3] Bovier, A., Kurkova, I.: A tomography of the GREM: beyond the REM conjecture. Commun. Math. Phys. 263(2), 535–552 (2006)
[BKL] Bovier, A., Kurkova, I., Löwe, M.: Fluctuations of the free energy in the REM and the p-spin SK models. Ann. Probab. 30, 605–651 (2002)
[BM] Bovier, A., Mason, D.: Extreme value behaviour in the Hopfield model. Ann. Appl. Probab. 11, 91–120 (2001)
[Der1] Derrida, B.: Random-energy model: an exactly solvable model of disordered systems. Phys. Rev. B (3) 24(5), 2613–2626 (1981)
[Der2] Derrida, B.: A generalisation of the random energy model that includes correlations between the energies. J. Phys. Lett. 46, 401–407 (1985)
[Mer1] Mertens, S.: Phase transition in the number partitioning problem. Phys. Rev. Lett. 81(20), 4281–4284 (1998)
[Mer2] Mertens, S.: A physicist's approach to number partitioning. Theoret. Comput. Sci. 265(1–2), 79–108 (2001)

Communicated by M. Aizenman
Commun. Math. Phys. 263, 535–552 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1517-0
Communications in
Mathematical Physics
A Tomography of the GREM: Beyond the REM Conjecture

Anton Bovier(1,2), Irina Kurkova(3)

1 Weierstraß-Institut für Angewandte Analysis und Stochastik, Mohrenstrasse 39, 10117 Berlin, Germany. E-mail: [email protected]
2 Institut für Mathematik, Technische Universität Berlin, Strasse des 17. Juni 136, 12623 Berlin, Germany
3 Laboratoire de Probabilités et Modèles Aléatoires, Université Paris 6, 4, place Jussieu, B.C. 188, 75252 Paris, Cedex 5, France. E-mail: [email protected]

Received: 29 March 2005 / Accepted: 13 September 2005
Published online: 23 February 2006 – © Springer-Verlag 2006
Abstract: In a companion paper we proved that in a large class of Gaussian disordered spin systems the local statistics of energy values near levels $N^{1/2+\alpha}$ with $\alpha < 1/2$ are described by a simple Poisson process. In this paper we address the question of whether this is optimal, and what happens if $\alpha = 1/2$. We do this by analysing completely the Gaussian Generalised Random Energy Models (GREM). We show that the REM behaviour persists up to the level $\beta_c N$, where $\beta_c$ denotes the critical temperature. We show that, beyond this value, the simple Poisson process must be replaced by more and more complex mixed Poisson point processes.

Research supported in part by the DFG in the Dutch-German Bilateral Research Group "Mathematics of Random Spatial Models from Physics and Biology" and by the European Science Foundation in the Programme RDSES.

1. Introduction

In a companion paper [4] we proved (for a large class of models) a conjecture by Bauke and Mertens [1] on the universality of the local energy level statistics in disordered systems, the so-called local REM conjecture. While in its original form this conjecture concerns the distribution of the values, $H_N(\sigma)$, of the (random) Hamiltonian, $H_N$, near energies $E_N = E\sqrt{N}$ (where $N$ is the volume of the system), we could show that, at least in the case of Gaussian couplings, the result still holds for energies $E_N = EN^{1/2+\alpha}$, with $\alpha > 0$. In the case of short range spin glasses, it is true for all $\alpha < 1/4$, and in the case of the $p$-spin SK models it holds for $\alpha < 1/4$ if $p = 1$, and even for $\alpha < 1/2$, if $p \ge 2$. It is natural to ask whether these are true thresholds, and also what should happen for larger values of the energy. As we have noted in [4], the thresholds cannot be surpassed with the method of proof that was used there.(1)

(1) After this paper had been submitted, Borgs et al. [2] published a preprint that recovers our results on the $p$-spin model with $p = 1, 2$, and extends them to non-Gaussian couplings. They also prove, via estimates on moments, that if $E_N = cN^{3/4}$ and $p = 1$ [resp. $E_N = cN$ and $p = 2$] with $c$ small, convergence to a Poisson process cannot hold. No explicit analysis of what happens then is, however, given.

In this paper we hope to provide some
insight into these questions by studying a model that allows more explicit computations and hence lets us provide the full picture for all energies, the Generalised Random Energy Model (GREM) of Derrida. The result we obtain gives a somewhat extreme microcanonical picture of the GREM, exhibiting in a somewhat tomographic way the distribution of states in a tiny vicinity of any value of the energy.

Let us briefly recall the definition of the GREM. We consider parameters $\alpha_0 = 1 < \alpha_1, \dots, \alpha_n < 2$ with $\prod_{i=1}^n \alpha_i = 2$, and $a_0 = 0 < a_1, \dots, a_n < 1$ with $\sum_{i=1}^n a_i = 1$. Let $\Sigma_N = \{-1,1\}^N$ be the space of $2^N$ spin configurations $\sigma$. Let $X_{\sigma_1\cdots\sigma_l}$, $l = 1, \dots, n$, be independent standard Gaussian random variables indexed by configurations $\sigma_1\dots\sigma_l \in \{-1,1\}^{N\ln(\alpha_1\cdots\alpha_l)/\ln 2}$. We define the Hamiltonian of the GREM as $H_N(\sigma) \equiv \sqrt{N}X_\sigma$, with
$$X_\sigma \equiv \sqrt{a_1}\,X_{\sigma_1} + \cdots + \sqrt{a_n}\,X_{\sigma_1\cdots\sigma_n}. \tag{1.1}$$
Then $\mathrm{cov}(X_\sigma, X_{\sigma'}) = A(d_N(\sigma,\sigma'))$, where $d_N(\sigma,\sigma') = N^{-1}[\min\{i : \sigma_i \ne \sigma'_i\} - 1]$, and $A(x)$ is a right-continuous step function on $[0,1]$, such that, for any $i = 0, 1, \dots, n$, $A(x) = a_0 + \cdots + a_i$, for $x \in [\ln(\alpha_0\alpha_1\cdots\alpha_i)/\ln 2,\ \ln(\alpha_0\alpha_1\cdots\alpha_{i+1})/\ln 2)$. Set $J_0 \equiv 0$, and define, for $l > 0$,
$$J_l = \min\Big\{n \ge J > J_{l-1} : \frac{\ln(\alpha_{J_{l-1}+1}\cdots\alpha_J)}{a_{J_{l-1}+1}+\cdots+a_J} < \frac{\ln(\alpha_{J+1}\cdots\alpha_m)}{a_{J+1}+\cdots+a_m}\ \ \forall m \ge J+1\Big\}, \tag{1.2}$$
up to $J_k = n$. Then, the $k$ segments connecting the points $(a_0+\cdots+a_{J_l},\ \ln(\alpha_0\alpha_1\cdots\alpha_{J_l})/\ln 2)$, for $l = 0, 1, \dots, k$, form the concave hull of the graph of the function $A(x)$. Let
$$\bar a_l = a_{J_{l-1}+1} + a_{J_{l-1}+2} + \cdots + a_{J_l}, \qquad \bar\alpha_l = \alpha_{J_{l-1}+1}\alpha_{J_{l-1}+2}\cdots\alpha_{J_l}. \tag{1.3}$$
Then
$$\frac{\ln\bar\alpha_1}{\bar a_1} < \frac{\ln\bar\alpha_2}{\bar a_2} < \cdots < \frac{\ln\bar\alpha_k}{\bar a_k}. \tag{1.4}$$
Moreover, as is shown in Proposition 1.4 of [3], for any $l = 1, \dots, k$, and for any $J_{l-1}+1 \le i < J_l$, we have $\ln(\alpha_{J_{l-1}+1}\cdots\alpha_i)/(a_{J_{l-1}+1}+\cdots+a_i) \ge \ln(\bar\alpha_l)/\bar a_l$. Hence
$$\frac{\ln\bar\alpha_l}{\bar a_l} = \min_{j = J_{l-1}+1,\,J_{l-1}+2,\,\dots,\,n}\ \frac{\ln(\alpha_{J_{l-1}+1}\cdots\alpha_j)}{a_{J_{l-1}+1}+\cdots+a_j}. \tag{1.5}$$
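The recursion (1.2) and the coarse-grained parameters (1.3) can be evaluated mechanically. The following is a minimal sketch, not part of the original argument: the function name `grem_hull` and the list-based interface are illustrative assumptions, and the parameters are taken as plain floating-point lists `alpha` and `a` indexed from level 1.

```python
import math

def grem_hull(alpha, a):
    """Return (J, a_bar, alpha_bar) following (1.2) and (1.3).

    alpha, a: lists of the level parameters alpha_1..alpha_n and a_1..a_n
    (hypothetical interface; the paper only gives the formulas).
    """
    n = len(alpha)
    log_alpha = [math.log(x) for x in alpha]

    def slope(i, j):
        # ln(alpha_i ... alpha_j) / (a_i + ... + a_j), levels are 1-based
        return sum(log_alpha[i - 1:j]) / sum(a[i - 1:j])

    J = [0]
    while J[-1] < n:
        prev = J[-1]
        for cand in range(prev + 1, n + 1):
            # condition of (1.2); it is vacuous for cand == n
            if all(slope(prev + 1, cand) < slope(cand + 1, m)
                   for m in range(cand + 1, n + 1)):
                J.append(cand)
                break

    a_bar = [sum(a[J[l - 1]:J[l]]) for l in range(1, len(J))]
    alpha_bar = [math.prod(alpha[J[l - 1]:J[l]]) for l in range(1, len(J))]
    return J, a_bar, alpha_bar

# Two levels with equal alpha: the levels stay separate only if a_1 > a_2.
print(grem_hull([2 ** 0.5, 2 ** 0.5], [0.7, 0.3])[0])
print(grem_hull([2 ** 0.5, 2 ** 0.5], [0.3, 0.7])[0])
```

In the second call the two levels merge into a single block, which corresponds to a REM-like hull with $k = 1$.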
To formulate our results, we also need to recall from [3] (Lemma 1.2) the point process of Poisson cascades $\mathcal{P}^l$ on $\mathbb{R}^l$. It is best understood in terms of the following iterative construction. If $l = 1$, $\mathcal{P}^1$ is the Poisson point process on $\mathbb{R}^1$ with the intensity measure $K_1e^{-x}dx$. To construct $\mathcal{P}^l$, we place the process $\mathcal{P}^{l-1}$ on the plane of the first $l-1$ coordinates and through each of its points draw a straight line orthogonal to this plane. Then we put on each of these lines independently a Poisson point process with intensity measure $K_le^{-x}dx$. These points on $\mathbb{R}^l$ form the process $\mathcal{P}^l$. The constants $K_1, \dots, K_l > 0$ (that are different from 1 only in some degenerate cases) are defined in formula (1.14) of [3].
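The iterative construction can be mimicked numerically once every coordinate is cut off at a finite level $y$, since a Poisson process with intensity $Ke^{-x}dx$ has only finitely many points above $y$. The sketch below is an illustration under that truncation assumption (the cut-off box is not the truncation used in the paper); the names `ppp_exp` and `cascade` are hypothetical. It uses the change of variables $s = Ke^{-x}$, which maps the process to a rate-one Poisson process on $(0, Ke^{-y})$.

```python
import math
import random

def ppp_exp(K, y, rng):
    """Points above level y of a Poisson process on R with intensity
    K*exp(-x) dx, returned in decreasing order."""
    points, s, cutoff = [], 0.0, K * math.exp(-y)
    while True:
        s += rng.expovariate(1.0)      # next arrival of a rate-1 process
        if s >= cutoff:
            return points
        if s > 0.0:                    # guard against log(0)
            points.append(math.log(K) - math.log(s))

def cascade(Ks, y, rng):
    """Truncated Poisson cascade: every coordinate is restricted to (y, inf).
    Ks lists the constants K_1..K_l (assumed given)."""
    pts = [()]
    for K in Ks:
        # through each existing point, draw an independent process
        pts = [p + (x,) for p in pts for x in ppp_exp(K, y, rng)]
    return pts

rng = random.Random(0)
sample = cascade([1.0, 1.0], -2.0, rng)
print(len(sample))  # number of points of the truncated cascade in R^2
```

The expected number of points of the truncated cascade is $\prod_i K_ie^{-y}$, so lowering $y$ makes the sample grow quickly.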
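We will also need the following facts concerning $\mathcal{P}^l$ from Theorem 1.5 of [3]. Let $\gamma_1 > \gamma_2 > \cdots > \gamma_l > 0$. There exists a constant $h > 0$, such that, for all $y > 0$,
$$P\big(\exists (x_1,\dots,x_l) \in \mathcal{P}^l,\ \exists j = 1,\dots,l : \gamma_1x_1 + \gamma_2x_2 + \cdots + \gamma_jx_j > (\gamma_1+\cdots+\gamma_j)y\big) \le \exp(-hy). \tag{1.6}$$
Here and below we identify the measure $\mathcal{P}^l$ with its support, when suitable. Furthermore, for any $y \in \mathbb{R}$,
$$\#\{(x_1,\dots,x_l) \in \mathcal{P}^l : x_1\gamma_1 + \cdots + x_l\gamma_l > y\} < \infty \quad \text{a.s.} \tag{1.7}$$
Moreover, let $\beta > 0$ be such that $\beta\gamma_1 > \cdots > \beta\gamma_l > 1$. The integral
$$\Upsilon_l = \int_{\mathbb{R}^l} e^{\beta(\gamma_1x_1+\cdots+\gamma_lx_l)}\,\mathcal{P}^l(dx_1,\dots,dx_l) \tag{1.8}$$
is understood as $\lim_{y\to-\infty} I_l(y)$ with
$$I_l(y) = \int_{\substack{(x_1,\dots,x_l)\in\mathbb{R}^l:\ \exists i,\,1\le i\le l:\\ \gamma_1x_1+\cdots+\gamma_ix_i > (\gamma_1+\cdots+\gamma_i)y}} e^{\beta(\gamma_1x_1+\cdots+\gamma_lx_l)}\,\mathcal{P}^l(dx_1,\dots,dx_l) = \sum_{j=1}^{l}\ \int_{\substack{(x_1,\dots,x_l)\in\mathbb{R}^l:\ \forall i=1,\dots,j-1:\ \gamma_1x_1+\cdots+\gamma_ix_i \le (\gamma_1+\cdots+\gamma_i)y,\\ \gamma_1x_1+\cdots+\gamma_jx_j > (\gamma_1+\cdots+\gamma_j)y}} e^{\beta(\gamma_1x_1+\cdots+\gamma_lx_l)}\,\mathcal{P}^l(dx_1,\dots,dx_l). \tag{1.9}$$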
It is finite, a.s., by Proposition 1.8 of [3]. To keep the paper self-contained, let us recall how this fact can be established by induction starting from $l = 1$. The integral (1.8), in the case $l = 1$, is understood as $\lim_{y\to-\infty} I_1(y)$. Here $I_1(y) = \int_y^\infty e^{\beta\gamma_1x_1}\,\mathcal{P}^1(dx_1)$ is finite, a.s., since $\mathcal{P}^1$ contains a finite number of points on $[y,\infty[$, a.s. Furthermore, by [5] or Proposition 1.8 of [3], $\lim_{y\to-\infty} I_1(y)$ is finite, a.s., since $E\sup_{y'\le y}(I_1(y') - I_1(y))$ converges to zero exponentially fast, as $y \to -\infty$, provided that $\beta\gamma_1 > 1$. If $l \ge 1$, each term in the representation (1.9) is determined and finite, a.s., by induction. In fact, to see this for the $j$th term, given any realization of $\mathcal{P}^l$ in $\mathbb{R}^l$, take its projection on the plane of the first $j$ coordinates. Then by (1.7), there exists only a finite number of points $(x_1,\dots,x_j)$ of $\mathcal{P}^j$ such that $\gamma_1x_1 + \cdots + \gamma_jx_j > (\gamma_1+\cdots+\gamma_j)y$, a.s. Whenever the first $j$ coordinates of a point of $\mathcal{P}^l$ in $\mathbb{R}^l$ are fixed, the remaining $l-j$ coordinates are distributed as $\mathcal{P}^{l-j}$ on $\mathbb{R}^{l-j}$. Then the integral of the function $e^{\beta(\gamma_{j+1}x_{j+1}+\cdots+\gamma_lx_l)}$ over these coordinates is defined by induction and is finite, a.s., provided that $\beta\gamma_{j+1} > \cdots > \beta\gamma_l > 1$. Thus the $j$th term in (1.9) is the sum of an a.s. finite number of terms and each of them is a.s. finite. Finally, again by Proposition 1.8 of [3], $\lim_{y\to-\infty} I_l(y)$ is finite, a.s., since $E\sup_{y'\le y}(I_l(y') - I_l(y)) \to 0$ as $y \to -\infty$ exponentially fast, provided that $\beta\gamma_1 > \cdots > \beta\gamma_l > 1$.

Let us define the constants $d_l$, $l = 0, 1, \dots, k$, where $d_0 = 0$ and
$$d_l \equiv \sum_{i=1}^{l}\sqrt{\bar a_i\,2\ln\bar\alpha_i}. \tag{1.10}$$
Finally, we define the domains $D_l$, for $l = 0, \dots, k-1$, as
$$D_l \equiv \Big\{|y| < d_l + \sqrt{\frac{2\ln\bar\alpha_{l+1}}{\bar a_{l+1}}}\ \sum_{j=l+1}^{k}\bar a_j\Big\}. \tag{1.11}$$
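For concrete parameters the constants $d_l$ of (1.10) and the right-hand sides of (1.11) are easy to tabulate. The sketch below is illustrative only: `grem_thresholds` is a hypothetical helper, and the inputs `a_bar`, `alpha_bar` are assumed to be the already coarse-grained parameters of (1.3), ordered so that (1.4) holds.

```python
import math

def grem_thresholds(a_bar, alpha_bar):
    """Return (d, edges): d[l] is d_l from (1.10) for l = 0..k, and
    edges[l] is the bound on |y| defining D_l in (1.11) for l = 0..k-1."""
    k = len(a_bar)
    d = [0.0]
    for a, al in zip(a_bar, alpha_bar):
        d.append(d[-1] + math.sqrt(a * 2.0 * math.log(al)))
    edges = [
        d[l] + math.sqrt(2.0 * math.log(alpha_bar[l]) / a_bar[l]) * sum(a_bar[l:])
        for l in range(k)
    ]
    return d, edges

# Two-level example with a_bar = (0.7, 0.3) and alpha_bar = (sqrt 2, sqrt 2).
d, edges = grem_thresholds([0.7, 0.3], [2 ** 0.5, 2 ** 0.5])
print([round(x, 4) for x in d])
print([round(x, 4) for x in edges])
```

Under (1.4) the list `edges` is increasing, which is the nesting of the domains $D_l$, and its last entry coincides with $d_k$.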
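It is not difficult to verify that $D_0 \subset D_1 \subset \cdots \subset D_{k-1}$. We are now ready to formulate the main result of this paper.

Theorem 1.1. Let a sequence $c_N \in \mathbb{R}$ be such that $\limsup_{N\to\infty} c_N \in D_0$ and $\liminf_{N\to\infty} c_N \in D_0$. Then, the point process
$$\mathcal{M}^0_N = \sum_{\sigma\in\Sigma_N} \delta_{\,2^{N+1}(2\pi)^{-1/2}e^{-c_N^2N/2}\,\left|X_\sigma - c_N\sqrt{N}\right|} \tag{1.12}$$
converges to the Poisson point process with intensity measure the Lebesgue measure. Let, for $l = 1, \dots, k-1$, $c \in D_l \setminus \overline{D_{l-1}}$ (where $\overline{D_{l-1}}$ is the closure of $D_{l-1}$). Define
$$\tilde c_l = |c| - d_l, \qquad \beta_l = \frac{\tilde c_l}{\bar a_{l+1} + \cdots + \bar a_k}, \tag{1.13}$$
$$\gamma_i = \sqrt{\bar a_i/(2\ln\bar\alpha_i)}, \quad i = 1, \dots, l, \tag{1.14}$$
and
$$R_l(N) = \frac{2(\bar\alpha_{l+1}\cdots\bar\alpha_k)^N\exp(-N\tilde c_l\beta_l/2)}{\sqrt{2\pi(\bar a_{l+1}+\cdots+\bar a_k)}}\ \prod_{j=1}^{l}(4N\pi\ln\bar\alpha_j)^{-\beta_l\gamma_j/2}. \tag{1.15}$$
Then, the point process
$$\mathcal{M}^l_N = \sum_{\sigma\in\Sigma_N} \delta_{\,R_l(N)\,\left|\sqrt{a_1}X_{\sigma_1} + \cdots + \sqrt{a_n}X_{\sigma_1\dots\sigma_n} - c\sqrt{N}\right|} \tag{1.16}$$
converges to a mixed Poisson point process on $[0,\infty[$: given a realization of the random variable $\Upsilon_l$, its intensity measure is $\Upsilon_l\,dx$. The random variables $\Upsilon_l$ are defined in terms of the Poisson cascades $\mathcal{P}^l$ via
$$\Upsilon_l = \int_{\mathbb{R}^l} e^{\beta_l(\gamma_1x_1+\cdots+\gamma_lx_l)}\,\mathcal{P}^l(dx_1,\dots,dx_l). \tag{1.17}$$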
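To see how the quantities in Theorem 1.1 fit together, one can locate a given $c$ among the domains $D_l$ and evaluate (1.13), (1.14) and the logarithm of (1.15). This is a sketch under stated assumptions, not the authors' procedure: the helper names `regime` and `theorem_constants` are hypothetical, the inputs are the coarse-grained parameters of (1.3), boundary points of the domains (which the theorem excludes) are not treated specially, and $\ln R_l(N)$ is returned instead of $R_l(N)$ to avoid overflow.

```python
import math

def regime(c, a_bar, alpha_bar):
    """Smallest l with c in D_l, returned as (l, d_l); None if c lies in no D_l."""
    d = 0.0
    for l in range(len(a_bar)):
        log_al = math.log(alpha_bar[l])
        edge = d + math.sqrt(2.0 * log_al / a_bar[l]) * sum(a_bar[l:])
        if abs(c) < edge:
            return l, d
        d += math.sqrt(a_bar[l] * 2.0 * log_al)   # d_{l+1}, cf. (1.10)
    return None

def theorem_constants(c, N, a_bar, alpha_bar):
    """Return (l, beta_l, [gamma_1..gamma_l], ln R_l(N)) from (1.13)-(1.15)."""
    found = regime(c, a_bar, alpha_bar)
    if found is None:
        raise ValueError("c lies outside D_{k-1}")
    l, d_l = found
    log_al = [math.log(x) for x in alpha_bar]
    s = sum(a_bar[l:])                            # a_bar_{l+1} + ... + a_bar_k
    c_tilde = abs(c) - d_l
    beta = c_tilde / s
    gammas = [math.sqrt(a_bar[i] / (2.0 * log_al[i])) for i in range(l)]
    log_R = (math.log(2.0) + N * sum(log_al[l:]) - N * c_tilde * beta / 2.0
             - 0.5 * math.log(2.0 * math.pi * s)
             - sum(beta * g / 2.0 * math.log(4.0 * N * math.pi * la)
                   for g, la in zip(gammas, log_al[:l])))
    return l, beta, gammas, log_R

a_bar, alpha_bar = [0.7, 0.3], [2 ** 0.5, 2 ** 0.5]
print(theorem_constants(0.5, 100, a_bar, alpha_bar)[0])   # REM regime
print(theorem_constants(1.1, 100, a_bar, alpha_bar)[0])   # first mixed regime
```

For $l = 0$ the value of $\ln R_0(N)$ reduces to the logarithm of the scaling factor $2^{N+1}(2\pi)^{-1/2}e^{-c^2N/2}$ in (1.12), and for $l \ge 1$ the products $\beta_l\gamma_i$ exceed 1, as required for (1.17) to be finite.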
Remark. Note that $d_0 + \sqrt{\frac{2\ln\bar\alpha_1}{\bar a_1}}\sum_{j=1}^{k}\bar a_j = \sqrt{\frac{2\ln\bar\alpha_1}{\bar a_1}} = \beta_c$ (see [3]), the inverse critical temperature in the GREM. Thus the first part of the theorem asserts that the REM conjecture holds for exactly those energies that satisfy $E_N < N\beta_c$. It is tempting to conjecture that this model independent formulation of the result might be true more generall