Commun. Math. Phys. 227, 1 – 92 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Anderson Transitions for a Family of Almost Periodic Schrödinger Equations in the Adiabatic Case

Alexander Fedotov (1), Frédéric Klopp (2)

(1) Department of Mathematical Physics, St. Petersburg State University, 1, Ulianovskaja, 198904 St. Petersburg-Petrodvoretz, Russia. E-mail: [email protected]
(2) Département de Mathématique, Institut Galilée, L.A.G.A., UMR 7539 C.N.R.S, Université de Paris-Nord, Avenue J.-B. Clément, 93430 Villetaneuse, France. E-mail: [email protected]

Received: 2 July 2001 / Accepted: 13 November 2001
Abstract: This work is devoted to the study of a family of almost periodic one-dimensional Schrödinger equations. Using results on the asymptotic behavior of a corresponding monodromy matrix in the adiabatic limit, we prove the existence of an asymptotically sharp Anderson transition in the low energy region. More explicitly, we prove the existence of energy intervals containing only singular spectrum, and of other energy intervals containing absolutely continuous spectrum; the zones containing singular spectrum and those containing absolutely continuous spectrum are separated by asymptotically sharp transitions. The analysis may be viewed as utilizing a complex WKB method for adiabatic perturbations of periodic Schrödinger equations. The transition energies are interpreted in terms of phase space tunneling.

Résumé: This work is devoted to the study of a family of quasi-periodic Schrödinger equations in dimension 1. We define a monodromy matrix for this family. We study the asymptotic behavior of this matrix in the adiabatic case. To this end, we use a version of the complex WKB method for the periodic Schrödinger equation with an adiabatic perturbation. The study of the monodromy matrix allows us to prove the existence of Anderson transitions. More precisely, we prove the existence of energy intervals containing only singular spectrum, and of other intervals containing mostly absolutely continuous spectrum; the zones containing singular spectrum are separated from the zones containing absolutely continuous spectrum by asymptotically sharp mobility edges. These mobility edges are characterized by means of tunneling in phase space.
1. Introduction

In this paper, we study spectral properties of the ergodic family of Schrödinger equations
\[
H_{\phi,\varepsilon}\psi = -\frac{d^2}{dx^2}\psi(x) + \bigl(V(x-\phi) + \alpha\cos(\varepsilon x)\bigr)\psi(x) = E\psi(x), \qquad x \in \mathbb{R}. \tag{1.1}
\]
We assume that
• V is a locally square integrable, real valued, periodic function,
\[
V(x+1) = V(x), \qquad x \in \mathbb{R}, \tag{1.2}
\]
• α > 0 is a coupling constant,
• ε is a positive number such that 2π/ε is irrational,
• φ ∈ R is a parameter (“indexing the equations of the family”).

We note that, under our assumptions, the potential V(x − φ) + α cos(εx) is quasi-periodic. The goal of this paper is to study the family (1.1) at “low” energies. Some of these results were announced in [21, 22].

During the last 20 years, quasi-periodic Schrödinger operators have been the object of many studies both rigorous and numerical (see, e.g. [25, 29, 33, 28, 14] for references). Motivated by the physics of condensed matter, among the main goals was the understanding of their spectra. One of the questions that has drawn a lot of interest is the nature of the spectrum; this is one of the main issues dealt with in this paper.

One of the most remarkable phenomena exhibited by the spectrum of these operators is the so-called Anderson transition or “metal-insulator” transition. It was observed numerically (e.g. [27, 28]) that, for certain quasi-periodic Schrödinger operators, the spectrum is split into energy intervals where it is purely absolutely continuous and other energy intervals where it is singular. Roughly speaking, from the physicist’s point of view, the singular spectrum corresponds to a phase where the material is an insulator, and the absolutely continuous spectrum to a phase where the material is a conductor. The transitions from one of these phases to the other one are expected to be sharp and to happen at discrete energies called mobility edges.

Such transitions do not always exist; e.g. for highly symmetrical models like the Almost Mathieu equation or for small potentials, it is believed and proved in many cases that no such transitions happen, the spectral type being determined by parameters of this equation ([13, 30]). As for the existence of transitions, in [23], for a model similar to ours, it was proved that the bottom of the spectrum is pure point in the large coupling constant limit.
On the other hand, it is known that, for one dimensional quasi-periodic Schrödinger operators with analytic potentials, the spectrum is absolutely continuous for sufficiently high energies ([13]). So, different types of spectra may coexist. In the present paper, we define the monodromy matrix for the family of Eqs. (1.1), compute the asymptotics of a monodromy matrix in the adiabatic limit, i.e. as ε → 0, and use these results to study spectral properties of Eq. (1.1). More precisely, we show that, at low energies, the spectrum of the family (1.1) is located in exponentially small intervals and prove the coexistence of zones where the spectrum is singular and others where most of the spectrum is absolutely continuous. We also prove that the transitions between these regions are asymptotically sharp as ε → 0. The energies where these transitions occur are called asymptotic mobility edges. We give a simple description of these energies in terms of analytic objects naturally associated to Eq. (1.1). To the best
of our knowledge, this is the first instance of quasi-periodic Schrödinger operators for which such a coexistence has been proved outside of the large coupling regime and sharp transitions have been found. Also, in the present paper, we handle locally square integrable potentials, whereas in previous works analyticity of the potential has always been a crucial assumption.

Let us now describe our model more precisely. Consider the periodic Schrödinger operator
\[
(H_0\psi)(x) = -\frac{d^2}{dx^2}\psi(x) + V(x)\psi(x). \tag{1.3}
\]
In this paper,
• we assume that the first gap in the spectrum of H0 is open; denote it by (E2, E3);
• we restrict our study to a neighborhood of E1, the bottom of the spectrum, more precisely, to the energies satisfying
\[
E - \alpha < E_1 \quad\text{and}\quad E_1 < E + \alpha < E_2. \tag{1.4}
\]
The techniques that we present here can also be used to study more general adiabatic quasi-periodic perturbations of H0, i.e. we can replace the cosine by other real analytic potentials. They can also be used to study the situation at higher energies (see [17, 20, 16]).

Let us now briefly describe our results, the next section being devoted to the precise mathematical statements. Let E(κ) be the dispersion relation associated to H0 and consider the real and the complex iso-energy curves $\Gamma_{\mathbb{R}}$ and $\Gamma$ defined by
\[
\Gamma_{\mathbb{R}}:\quad E(\kappa) + \alpha\cos(\varphi) = E, \qquad \kappa, \varphi \in \mathbb{R}, \tag{1.5}
\]
\[
\Gamma:\quad E(\kappa) + \alpha\cos(\varphi) = E, \qquad \kappa, \varphi \in \mathbb{C}. \tag{1.6}
\]
The connected components of $\Gamma_{\mathbb{R}}$ are called the real branches of the iso-energy curve (1.6). The real iso-energy curve is 2π-periodic in the κ- and ϕ- (vertical and horizontal) directions. It is symmetric with respect to the ϕ-axis. The role of the real iso-energy curve for adiabatic problems is well known (see, for example, [5]). Under assumption (1.4), the curve has the topology shown in Fig. 1. In this figure, the real branches of $\Gamma_{\mathbb{R}}$ are represented by the full lines, and the dashed lines are complex loops in $\Gamma$.

Fig. 1. The contours on $\Gamma$

Define the actions $S_h$, $S_v$ and $\Phi$ by
\[
S_h = -i\oint_{\gamma_2}\kappa\,d\varphi, \qquad S_v = i\oint_{\gamma_3}\kappa\,d\varphi \qquad\text{and}\qquad \Phi = \oint_{\gamma_1}\kappa\,d\varphi. \tag{1.7}
\]
Here, $\gamma_1$ is a connected component of $\Gamma_{\mathbb{R}}$, $\gamma_2$ a loop in $\Gamma$ connecting $\gamma_1$ and $\gamma_1 - (2\pi, 0)$, and $\gamma_3$ a loop in $\Gamma$ connecting $\gamma_1$ and $\gamma_1 + (0, 2\pi)$. We discuss these loops in Sect. 2.6. These loops are oriented so that the integrals in (1.7) are positive.

We prove that, on the interval (1.4), the spectrum is situated in exponentially small (in 1/ε) intervals such that the distances between any two neighboring such intervals are of order ε; the “centers” of these intervals are given by the quantization condition $\cos(\Phi(E)/\varepsilon) = 0$. Moreover, if we set $S(E) = S_v(E) - S_h(E)$, then the subintervals where S(E) < 0 contain only singular spectrum, and the subintervals where S(E) > 0 contain absolutely continuous spectrum; actually, in the latter subintervals, most of the spectrum is absolutely continuous.

Finally, we study general properties of the function S(E). For small α, S takes positive values; hence, the bottom of the spectrum of the operator $H_{\phi,\varepsilon}$ still contains absolutely continuous spectrum for α small. On the contrary, for large α, the function S becomes negative, and thus, the bottom of the spectrum of the unperturbed operator becomes singular, the absolutely continuous spectrum disappears. For intermediate values of α, there are always one or several asymptotic mobility edges, i.e. points where the function S vanishes. These are the points where the transitions from absolutely continuous to singular spectrum (or vice versa) occur.

One can define two tunneling coefficients, a vertical one associated to $S_v$: $t_v = \exp(-S_v/\varepsilon)$, and a horizontal one associated to $S_h$: $t_h = \exp(-S_h/\varepsilon)$. We see that, roughly, if the vertical tunneling is larger than the horizontal one, the spectrum is singular; and if the horizontal is the larger one, the spectrum is absolutely continuous. For Harper’s equation, a heuristic leading to a similar interpretation has been developed in [42].
Fig. 2. The phase diagram (horizontal axis: α; vertical axis: E; the curve S0 separates the regions marked “ac spectrum” and “sing. spectrum”)
To illustrate our results, we describe an example where we have computed the asymptotic mobility edges numerically, i.e. we have solved S = 0 numerically. We take the simplest example of V, i.e. we take V to be a one-gap potential ([36]) associated to the periodic spectrum [E1, E2] ∪ [E3, +∞). In our case E1 = 0, E2 = 8. The value of E3 is computed so that the potential V does have period 1. In Fig. 2, we represent the phase diagram, i.e. the curve S0 = {(α, E) : S(E, α) = 0} of the asymptotic mobility edges (for the potential V(· − φ) + α cos(ε·)). The spectrum is represented vertically and the parameter α horizontally. The oblique rectangle delimits the region (α, E) satisfying (1.4). It comprises the bottom of the spectrum of $H_{\phi,\varepsilon}$. For α fixed, the domain marked by “ac spectrum” (resp. “sing. spectrum”) in Fig. 2 contains absolutely continuous (resp. singular) spectrum. α0 is the value of α at which mobility edges start to exist in our energy interval. When one turns the coupling constant α on, the lowest edge of the spectrum starts to “localize” first, though not immediately. When α > α∗, all the spectrum in the window we are considering is singular.

In a more general case, i.e. if, in (1.1), we replace the cosine by a more general potential, the phase diagram may be quite different (see [17]). Such phase diagrams were already obtained for quasi-periodic finite difference equations using numerical simulations ([27]).

Remark 1.1. In the description given above, we used the word “localized” very loosely to denote singular spectrum.

2. The Main Results

In this section, we introduce the central object of our study, the monodromy matrix, and present our main results.

2.1. Monodromy equation. We define the monodromy matrix and introduce the monodromy equation.

2.1.1. Monodromy matrix. For any φ fixed, let $\psi_{1,2}(x, \phi)$ be two linearly independent solutions of Eq. (1.1). We say that they form a consistent basis if their Wronskian is independent of φ and if these solutions are 1-periodic in φ, i.e.
\[
\psi_{1,2}(x, \phi + 1) = \psi_{1,2}(x, \phi), \qquad \forall x, \phi. \tag{2.1}
\]
The existence of a consistent basis is clear: it suffices to take the solutions with canonical Cauchy data at x = 0, i.e. $\psi_1(0, \phi) = 1$ and $\psi_1'(0, \phi) = 0$, and $\psi_2(0, \phi) = 0$ and $\psi_2'(0, \phi) = 1$.

Consider a consistent basis $(\psi_{1,2})$. As cosine is 2π-periodic, the functions $\psi_{1,2}(x + 2\pi/\varepsilon, \phi + 2\pi/\varepsilon)$ are solutions of Eq. (1.1). Therefore, one can write
\[
\begin{pmatrix} \psi_1(x + 2\pi/\varepsilon, \phi + 2\pi/\varepsilon) \\ \psi_2(x + 2\pi/\varepsilon, \phi + 2\pi/\varepsilon) \end{pmatrix}
= M(E, \phi) \begin{pmatrix} \psi_1(x, \phi) \\ \psi_2(x, \phi) \end{pmatrix}, \tag{2.2}
\]
where M(E, φ) is a 2 × 2 matrix with coefficients independent of x. The matrix M is called the monodromy matrix corresponding to the consistent basis $(\psi_{1,2})$. Note that if the potential in (1.1) and the solutions $\psi_{1,2}$ are independent of φ, then our definition becomes the standard definition of the monodromy matrix for the periodic Schrödinger equation (see e.g. [12, 41]). One immediately checks that, for any consistent basis, the monodromy matrix satisfies
\[
\det M(E, \phi) \equiv 1, \qquad M(E, \phi + 1) = M(E, \phi), \qquad \forall \phi. \tag{2.3}
\]
2.1.2. Monodromy equation. Set
\[
h = \frac{2\pi}{\varepsilon} \mod 1. \tag{2.4}
\]
Let M be a monodromy matrix corresponding to a consistent basis $(\psi_{1,2})$. The spectral analysis of (1.1) can be reduced to the investigation of the solutions of the monodromy equation
\[
F_{n+1} = M(E, \phi + nh)\,F_n, \qquad F_n \in \mathbb{C}^2, \qquad \forall n \in \mathbb{Z}. \tag{2.5}
\]
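As an illustration only, the recursion (2.5) can be iterated numerically once some 1-periodic unimodular matrix function is supplied. In the sketch below, `toy_M` is a hypothetical stand-in for M(E, φ) (it is not the monodromy matrix of (1.1)), and the values of ε and φ are arbitrary.

```python
import math
import numpy as np

def toy_M(phi, lam=0.3):
    """Hypothetical 1-periodic SL(2,R) matrix standing in for M(E, phi)."""
    c = 2.0 * lam * math.cos(2.0 * math.pi * phi)
    return np.array([[c, -1.0], [1.0, 0.0]])   # determinant is 1 for every phi

def iterate_monodromy(M, F0, phi, h, n_steps):
    """Return [F_0, ..., F_n] with F_{k+1} = M(phi + k*h) F_k, as in (2.5)."""
    F = [np.asarray(F0, dtype=float)]
    for k in range(n_steps):
        F.append(M(phi + k * h) @ F[-1])
    return F

eps = 0.1                                  # arbitrary illustrative value
h = (2.0 * math.pi / eps) % 1.0            # h = 2*pi/eps mod 1, cf. (2.4)
orbit = iterate_monodromy(toy_M, [1.0, 0.0], phi=0.2, h=h, n_steps=50)
```

Only the shift φ → φ + h between consecutive steps and the unimodularity of the matrix are taken from the text; everything else is a placeholder.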
Going from Eq. (1.1) to the monodromy equation is close to the monodromization transformation introduced in [7] to construct Bloch solutions of difference equations. In our case, it appears that the behavior of solutions of (1.1) for x → ±∞ repeats the behavior of solutions of the monodromy equation for n → ∓∞. And, it is well known that the spectral properties of one dimensional Schrödinger equations can be described in terms of the behavior of their solutions as x → ±∞ (see [24, 31]).

Let us formulate the precise statement. In the sequel, identifying the vector solutions of the monodromy Eq. (2.5) with functions $F : \mathbb{Z} \to \mathbb{C}^2$, we denote the values of F by $F_n$. One has

Theorem 2.1. Let $\psi_j(x, \phi)$, j = 1, 2, be a consistent basis solution of (1.1), locally bounded in (x, φ) together with their derivatives in x. Fix φ = φ0 ∈ R and consider solutions of the monodromy equation. Then, there exists a positive constant C such that, for any vector solution F of Eq. (2.5), there exists a unique solution f of (1.1) satisfying the estimates
\[
\frac{1}{C}\,\|F_{-n}\|_{\mathbb{C}^2} \le \left\| \begin{pmatrix} f(x + 2\pi n/\varepsilon, \phi) \\ f'(x + 2\pi n/\varepsilon, \phi) \end{pmatrix} \right\|_{\mathbb{C}^2} \le C\,\|F_{-n}\|_{\mathbb{C}^2}, \qquad \forall x \in [0, 2\pi/\varepsilon),\ n \in \mathbb{Z}, \tag{2.6}
\]
and reciprocally.

For both Eqs. (1.1) and (2.5), one can define the Lyapunov exponent (see e.g. [4, 8, 10, 37]). It is defined for almost every φ and independent of φ. Let Θ(E, φ) be the Lyapunov exponent at energy E for (1.1) and θ(E, φ) be the one for (2.5). Theorem 2.1 then immediately implies

Corollary 2.1. The Lyapunov exponents Θ(E, φ) and θ(E, φ) satisfy the relation
\[
\Theta(E, \phi) = \frac{\varepsilon}{2\pi}\,\theta(E, \phi). \tag{2.7}
\]
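The relation (2.7) can be exercised on a toy cocycle. The following sketch estimates the Lyapunov exponent of a recursion of the form (2.5) by renormalising the vector at each step, then rescales it by ε/(2π). The constant matrix `hyperbolic_M` is a hypothetical test case with known exponent log 2; it is not derived from (1.1).

```python
import math
import numpy as np

def lyapunov_exponent(M, phi, h, n_steps=2000):
    """Estimate the exponent of F_{n+1} = M(phi + n*h) F_n, renormalising F at each step."""
    F = np.array([1.0, 0.3])
    F /= np.linalg.norm(F)
    total = 0.0
    for n in range(n_steps):
        F = M(phi + n * h) @ F
        norm = np.linalg.norm(F)
        total += math.log(norm)    # accumulate the logarithmic growth
        F /= norm
    return total / n_steps

def hyperbolic_M(phi):
    """Hypothetical constant test matrix whose exponent is log 2."""
    return np.array([[2.0, 0.0], [0.0, 0.5]])

eps = 0.05                                        # arbitrary illustrative value
theta = lyapunov_exponent(hyperbolic_M, phi=0.0, h=(2 * math.pi / eps) % 1.0)
Theta = eps / (2 * math.pi) * theta               # rescaling of relation (2.7)
```

The factor ε/(2π) reflects that one step of the monodromy equation corresponds to a shift of x by 2π/ε.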
2.2. The adiabatic limit. We see that the spectral analysis of (1.1) reduces to the analysis of solutions of the monodromy equation (2.5). To compute a monodromy matrix, we use the adiabatic limit ε → 0. In the adiabatic case, the asymptotics of the monodromy matrices take a very simple form, so that one can speak of reducing (1.1) to a simple model difference equation (i.e. the respective monodromy equation). To describe the reduced model, we first recall some basic facts about one dimensional periodic Schrödinger operators and define some related objects.
2.3. Periodic Schrödinger operator. Consider the periodic Schrödinger operator (1.3). Its spectrum on $L^2(\mathbb{R})$, denoted by σ(H0), is absolutely continuous and consists of intervals $[E_1, E_2], [E_3, E_4], \dots, [E_{2n+1}, E_{2n+2}], \dots$ of the real axis such that
\[
E_1 < E_2 \le E_3 < E_4 \dots E_{2n} \le E_{2n+1} < E_{2n+2} \le \dots, \qquad E_n \to +\infty, \quad n \to +\infty.
\]
The above intervals are called the spectral bands, and the open intervals $(E_2, E_3), (E_4, E_5), \dots, (E_{2n}, E_{2n+1}), \dots$ are called the spectral gaps.

Denote the dispersion law associated to (1.3) by E(κ). The inverse function k(E) is called the Bloch quasi-momentum. It is a multi-valued analytic function; its branch points are the edges of the gaps and are of square root type. It is real on the spectral bands; on the spectral gaps, its real part is constant and its imaginary part does not vanish. For more details on this function, we refer to Sect. 5.
2.4. The complex momentum. Consider the function κ(ϕ) defined by the relation (1.6), i.e.
\[
\kappa(\varphi) = k(E - \alpha\cos(\varphi)). \tag{2.8}
\]
We call it the complex momentum. This function is also multi-valued and analytic. The function κ will play a crucial role in the determination of the asymptotics of the monodromy matrix.
2.5. The branch points of κ and the set arccos(R). Like the branch points of the Bloch quasi-momentum, all the branch points of κ are of square root type. They are given by the equation
\[
E - \alpha\cos(\varphi) = E_n, \qquad n \in \mathbb{N}. \tag{2.9}
\]
The set of the branch points is 2π-periodic in ϕ. If E ∈ R, it is symmetric with respect to the real line and, moreover, is situated on $\arccos(\mathbb{R}) = \mathbb{R} \cup \bigcup_{n\in\mathbb{Z}} (i\mathbb{R} + n\pi)$, the pre-image of R by the cosine.

Under condition (1.4), the complex momentum has a single branch point in (0, π). Denote it by $\varphi_1$; it is defined by $E_1 = E - \alpha\cos(\varphi_1)$ (see Fig. 3). The complex momentum also has non-real branch points. Let $\varphi_2$ and $\varphi_3$ be the branch points lying on the line $\pi + i\mathbb{R}_+$, closest to π, indexed so that $\mathrm{Im}\,\varphi_2 < \mathrm{Im}\,\varphi_3$; they satisfy (2.9) for n = 2, 3 (see Fig. 3).

Consider the map $\mathcal{E} : \varphi \mapsto E - \alpha\cos(\varphi)$. We denote by Z the pre-image of σ(H0), the spectrum of H0, and by G the pre-image of the spectral gaps of H0, i.e. of R \ σ(H0). Clearly, Z, G ⊂ arccos(R). The connected components of Z and G are separated by branch points of the complex momentum. All the branches of the complex momentum take real values on Z. And, the imaginary part of any branch of the complex momentum is non-zero on G.
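The branch points described above can be written in closed form, since cos(π + iy) = −cosh(y). The sketch below is an illustration under stated assumptions: E1 = 0 and E2 = 8 are the values of the example in the introduction, whereas E3 = 10, α = 3 and E = 2 are made-up placeholders chosen so that (1.4) holds.

```python
import cmath
import math

def branch_points(E, alpha, E1, E2, E3):
    """Branch points phi_1 in (0, pi) and phi_2, phi_3 on pi + i*R_+ solving (2.9)."""
    if not (E - alpha < E1 < E + alpha < E2):
        raise ValueError("(alpha, E) violates condition (1.4)")
    phi1 = math.acos((E - E1) / alpha)
    # cos(pi + i*y) = -cosh(y), so (2.9) reads E + alpha*cosh(y) = E_n on pi + i*R.
    phi2 = complex(math.pi, math.acosh((E2 - E) / alpha))
    phi3 = complex(math.pi, math.acosh((E3 - E) / alpha))
    return phi1, phi2, phi3

# E3 = 10.0 is a hypothetical value; the text does not give E3 explicitly.
phi1, phi2, phi3 = branch_points(E=2.0, alpha=3.0, E1=0.0, E2=8.0, E3=10.0)
```

Evaluating E − α cos(ϕ) at the three returned points with `cmath.cos` recovers E1, E2 and E3 respectively, up to rounding.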
2.6. The phases and tunneling coefficients. We now describe the main analytical objects needed to write out the asymptotics of the coefficients of the monodromy matrix: the phase $\Phi$ and the tunneling coefficients $t_h$ and $t_v$, in terms of contours in the complex plane of ϕ. There is a branch of the complex momentum taking values in $i\mathbb{R}_+$ on the interval $(-\varphi_1, \varphi_1)$, and values in [0, π] on the “cross” $[\varphi_1, 2\pi - \varphi_1] \cup [\overline{\varphi_2}, \varphi_2]$ (see Fig. 3 and Sect. 7.1 for more details). Denote this branch by $\kappa_*$. We define the actions
\[
S_h = -i\int_{-\varphi_1}^{\varphi_1} \kappa_*\,d\varphi \qquad\text{and}\qquad S_v = i\int_{\overline{\varphi_2}}^{\varphi_2} (\kappa_* - \pi)\,d\varphi, \tag{2.10}
\]
and the phase integral
\[
\Phi = \int_{\varphi_1}^{2\pi - \varphi_1} \kappa_*\,d\varphi. \tag{2.11}
\]
Let J be the interval defined by (2.16). In Sect. 9.1, we prove

Lemma 2.1. The actions and phase integrals have the following properties:
(1) $S_h$, $S_v$ and $\Phi$ take positive values on J;
(2) they are analytic in E in a neighborhood of J;
(3) $\Phi'(E) > 0$ for all E ∈ J;
(4) $S_v \le 2\pi\,\mathrm{Im}(\varphi_2)$ for E ∈ J.

We define tunneling coefficients
\[
t_h = \exp\Bigl(-\frac{1}{\varepsilon} S_h\Bigr), \qquad t_v = \exp\Bigl(-\frac{1}{\varepsilon} S_v\Bigr). \tag{2.12}
\]
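To get a feeling for the sizes involved, the horizontal action and the tunneling coefficient of (2.12) can be evaluated in a toy situation. The sketch below ASSUMES the free dispersion law k(E) = √E (so E1 = 0) in place of the true Bloch quasi-momentum; on (−ϕ1, ϕ1) the relevant branch is then i√(α cos ϕ − E), and the horizontal action becomes a real integral. This is a simplification for illustration, not the setting of the theorems (the free operator has no open gap).

```python
import math

def action_S_h(E, alpha, n=20000):
    """Midpoint-rule value of S_h = -i * int_{-phi1}^{phi1} kappa_* dphi.

    ASSUMPTION: k(E) = sqrt(E), E1 = 0, so kappa_* = i*sqrt(alpha*cos(phi) - E)
    on (-phi1, phi1) and S_h = int sqrt(alpha*cos(phi) - E) dphi.
    """
    phi1 = math.acos(E / alpha)
    step = 2.0 * phi1 / n
    total = 0.0
    for j in range(n):
        phi = -phi1 + (j + 0.5) * step
        total += math.sqrt(max(alpha * math.cos(phi) - E, 0.0))
    return total * step

def tunneling(S, eps):
    """Tunneling coefficient t = exp(-S/eps), cf. (2.12)."""
    return math.exp(-S / eps)

S_h = action_S_h(E=2.0, alpha=3.0)     # arbitrary illustrative parameters
t_h = tunneling(S_h, eps=0.1)
```

Because the action is positive and independent of ε, the coefficient decays exponentially as ε decreases.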
Due to point (1) in Lemma 2.1, the tunneling coefficients $t_h$ and $t_v$ are exponentially small in 1/ε. The actions and the phase integral can be written as contour integrals of a branch of κ along closed curves
\[
S_h = -\frac{i}{2}\oint_{\gamma_h} \kappa\,d\varphi, \qquad S_v = \frac{i}{2}\oint_{\gamma_v} \kappa\,d\varphi \qquad\text{and}\qquad \Phi = \frac{1}{2}\oint_{\gamma_p} \kappa\,d\varphi. \tag{2.13}
\]
The loops $\gamma_h$, $\gamma_v$ and $\gamma_p$ are shown in Fig. 3. We note here that the contours $\gamma_p$, $\gamma_h$ and $\gamma_v$ can be considered as projections on the complex plane of ϕ of contours $\gamma_1$, $\gamma_2$, $\gamma_3$ located in $\Gamma$ (see Fig. 1). This follows from the fact that any analytic branch of κ is single valued on each of the curves $\gamma_p$, $\gamma_h$ and $\gamma_v$ (see Sect. 9.1). This last property is related to the fact that all the branch points of κ are of square root type.

2.7. Asymptotics of the monodromy matrix. As the potential in Eq. (1.1) is real valued, we are able to choose a consistent basis so that the corresponding monodromy matrix has the form
\[
M(\phi, E) = \begin{pmatrix} a(\phi, E) & b(\phi, E) \\ \overline{b(\bar\phi, \bar E)} & \overline{a(\bar\phi, \bar E)} \end{pmatrix}
= \begin{pmatrix} a & b \\ b^* & a^* \end{pmatrix}. \tag{2.14}
\]
Fig. 3. The branch points and the contours in the complex plane of ϕ
Here and from now on, for $f : (z_1, z_2, \dots, z_n) \mapsto f(z_1, z_2, \dots, z_n)$ a function of complex variables, $f^*$ denotes the function $(z_1, z_2, \dots, z_n) \mapsto \overline{f(\bar z_1, \bar z_2, \dots, \bar z_n)}$. The set of GL(2, C)-valued functions having the form $\begin{pmatrix} a & b \\ b^* & a^* \end{pmatrix}$ is denoted by $\mathcal{M}$ (regardless of the variables, these being clear from the context). We introduce another notation that is used throughout the paper; for a : φ ↦ a(φ), a 1-periodic function, we write
\[
a = a_0 + \tilde a(\phi), \tag{2.15}
\]
where $a_0$ is the zeroth Fourier coefficient of a. Recall that E1 and E2 are the ends of the first spectral interval of the periodic Schrödinger equation (1.3). We assume that
\[
E - \alpha \le E_1 - \delta, \qquad E_1 + \delta < E + \alpha < E_2 - \delta. \tag{2.16}
\]
Here and below, δ denotes fixed positive constants independent of ε. This is just a strong version of condition (1.4). In Sect. 8, we prove

Theorem 2.2. Pick $Y \in (\mathrm{Im}\,\varphi_2, \mathrm{Im}\,\varphi_3)$. Pick E0 satisfying (2.16). Then, there exist η > 0 and ε0 > 0 such that, for 0 < ε < ε0, there exists a consistent basis such that M(E, φ), the corresponding monodromy matrix, is analytic in $\mathcal{V} = \{|E - E_0| < \eta\} \times \{|\mathrm{Im}\,\phi| \le Y/\varepsilon\}$, belongs to $\mathcal{M}$ and, in $\mathcal{V}$, its coefficients decomposed according to (2.15) admit the asymptotics
\[
\begin{aligned}
a_0 = a_0(E) &= \frac{1}{t_h}\,e^{i\Phi/\varepsilon}(1 + o(1)), &
\tilde a(\phi) = \tilde a(\phi, E) &= -\frac{t_v}{t_h}\,e^{2i\pi(\phi - \phi_0)}(1 + o(1)), \\
b_0 = b_0(E) &= \frac{i}{t_h}\,e^{i\Phi/\varepsilon}(1 + o(1)), &
\tilde b(\phi) = \tilde b(\phi, E) &= -i\,\frac{t_v}{t_h}\,e^{2i\pi(\phi - \phi_0)}(1 + o(1)).
\end{aligned} \tag{2.17}
\]
The functions $\Phi = \Phi(E)$, $t_h = t_h(E)$ and $t_v = t_v(E)$ are the phase and tunneling coefficients defined above. The function $\phi_0 = \phi_0(E)$ is real analytic and satisfies $\phi_0(E) = O(1)$. The asymptotics are uniform in (E, φ) in the set $\mathcal{V}$.
Remark 2.1. The consistent basis solutions are constructed as functions of the variables u = x − φ, ϕ = φ/ε and E. They are analytic in (E, ϕ) ∈ {|E − E0| < η} × {|Im ϕ| ≤ Y}.

The proof of Theorem 2.2 is based on a new asymptotic method developed in [19] to study adiabatic perturbations of periodic Schrödinger operators in dimension 1. This method is based on the analysis of adiabatic asymptotics in the complex plane of ϕ of solutions of equations of the form $-\psi''(x) + V(x - \varphi, \varepsilon x)\psi(x) = E\psi(x)$. In general, this method leads to computations similar to those typical for the standard complex WKB methods (see e.g. [15]). In our case, under condition (2.16), due to some natural symmetries, they are reminiscent of the computations made in [6] for the semi-classical analysis of Harper’s equation.
2.8. The spectral results.

2.8.1. The spectrum in the adiabatic limit. We begin with a general observation concerning the location of the spectrum in the adiabatic limit. Recall that, as $H_{\phi,\varepsilon}$ is quasi-periodic, its spectrum does not depend on φ (see [1]). Let $\Sigma_\varepsilon$ denote the spectrum of $H_{\phi,\varepsilon}$. One proves

Proposition 2.1. Let $\Sigma = \sigma(H_0) + \alpha\cos(\mathbb{R}) = \sigma(H_0) + [-\alpha, \alpha]$. Then, one has
• ∀ε ≥ 0, $\Sigma_\varepsilon \subset \Sigma$;
• for any K ⊂ Σ compact, there exists a constant C > 0 such that, for ε sufficiently small, one has
\[
\Sigma_\varepsilon \cap \bigl(E - C\varepsilon^{1/2}, E + C\varepsilon^{1/2}\bigr) \ne \emptyset, \qquad \forall E \in K.
\]

2.8.2. Location of the spectrum. Let J = J(δ) be the energy interval defined by (2.16). By Lemma 2.1, the function $\Phi(E)$ is monotonically increasing on J and its derivative does not vanish there. In J, consider the points $(E^{(l)})_{l\in\mathbb{N}}$ defined by
\[
\frac{1}{\varepsilon}\,\Phi(E^{(l)}) = \frac{\pi}{2} + \pi l, \qquad l \in \mathbb{N}. \tag{2.18}
\]
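The quantization condition (2.18) is a scalar root-finding problem for each l. The sketch below solves it by bisection for an increasing phase function; `toy_Phi` is a hypothetical monotone function chosen only for illustration (the true phase integral depends on V), and the interval [0, 1] and ε = 0.01 are arbitrary.

```python
import math

def quantized_energies(Phi, E_min, E_max, eps, tol=1e-12):
    """Solve Phi(E)/eps = pi/2 + pi*l on [E_min, E_max] for an increasing Phi, by bisection."""
    l_lo = math.ceil(Phi(E_min) / (math.pi * eps) - 0.5)
    l_hi = math.floor(Phi(E_max) / (math.pi * eps) - 0.5)
    energies = []
    for l in range(l_lo, l_hi + 1):
        target = eps * (math.pi / 2 + math.pi * l)
        a, b = E_min, E_max
        while b - a > tol:
            mid = 0.5 * (a + b)
            if Phi(mid) < target:
                a = mid
            else:
                b = mid
        energies.append(0.5 * (a + b))
    return energies

def toy_Phi(E):
    """Hypothetical increasing phase integral, used only for illustration."""
    return 1.0 + 2.0 * E + 0.1 * E ** 2

levels = quantized_energies(toy_Phi, E_min=0.0, E_max=1.0, eps=0.01)
```

For this toy phase, the number of solutions is of order 1/ε and neighboring solutions are separated by a distance comparable to ε, in line with the statements that follow (2.18).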
The number of these points is of order 1/ε; we denote the minimal and the maximal values of l for which (2.18) admits a solution in J by L1 and L2. For sufficiently small ε, the distances between the points $(E^{(l)})_{l\in\mathbb{N}}$ satisfy the inequalities $c_1\varepsilon \le E^{(l)} - E^{(l-1)} \le c_2\varepsilon$, l = L1 + 1, . . . , L2, where c1 and c2 are two positive constants independent of ε. One has

Theorem 2.3. There exists a collection of intervals $(I_l)_{L_1 \le l \le L_2}$, $I_l \subset J$, such that, for ε > 0 sufficiently small, one has
• $\Sigma_\varepsilon \cap J \subset \bigcup_{L_1 \le l \le L_2} I_l$,
• the interval $I_l$ lies in an o(ε)-neighborhood of $E^{(l)}$,
• the measure of $I_l$ satisfies
\[
|I_l| = 2\,\frac{\varepsilon\,\bigl(t_v(E^{(l)}) + t_h(E^{(l)})\bigr)}{\Phi'(E^{(l)})}\,(1 + o(1)). \tag{2.19}
\]
Moreover, if $dN_\varepsilon(E)$ denotes the density of states measure of $H_{\phi,\varepsilon}$ at energy E, then, one has
\[
\int_{I_l} dN_\varepsilon(E) = \frac{1}{2\pi}\,\varepsilon. \tag{2.20}
\]
Note that the intervals $(I_l)_l$ are exponentially small and separated by distances of order O(ε).

2.8.3. The nature of the spectrum. Set
\[
\lambda(E) = t_v(E)/t_h(E), \qquad S(E) = -\varepsilon\log\lambda(E) = S_v(E) - S_h(E). \tag{2.21}
\]
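The dichotomy announced in the introduction (singular spectrum where S < 0, mostly absolutely continuous spectrum where S > 0, with S = S_v − S_h) can be summarised in a few lines. The sketch below is illustrative only; the function names are hypothetical and the action values passed to them would have to come from an actual computation of S_v and S_h.

```python
import math

def coupling_lambda(S_v, S_h, eps):
    """lambda = t_v / t_h = exp(-(S_v - S_h)/eps), from the definitions (2.12)."""
    return math.exp(-(S_v - S_h) / eps)

def expected_spectral_type(S_v, S_h, delta=0.0):
    """Asymptotic spectral type predicted by the sign of S = S_v - S_h."""
    S = S_v - S_h
    if S > delta:
        return "absolutely continuous (mostly)"
    if S < -delta:
        return "singular"
    return "undetermined (near an asymptotic mobility edge)"
```

For instance, with the made-up values S_v = 2, S_h = 1 and ε = 0.1, the effective coupling is exp(−10), i.e. the vertical tunneling is much weaker than the horizontal one.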
For δ > 0, define the sets $J_\delta^- = \{E \in J;\ S(E) < -\delta\}$ and $J_\delta^+ = \{E \in J;\ S(E) > \delta\}$. If $J_\delta^+ \ne \emptyset$, then, for sufficiently small ε, the number of intervals $I_l$ lying in $J_\delta^+$ is of order O(1/ε). We prove

Theorem 2.4. Let $I \subseteq J_\delta^+$ be an interval, and let $\lambda_I = \exp\bigl(-\min_{E\in I} S(E)/\varepsilon\bigr)$. Pick σ ∈ (0, 1). There exists D ⊂ (0, 1), a set of Diophantine numbers such that
•
\[
\frac{\mathrm{mes}\,(D \cap (0, \varepsilon))}{\varepsilon} = 1 + O\bigl(\varepsilon\lambda_I^{\sigma}\bigr) \quad\text{when } \varepsilon \to 0. \tag{2.22}
\]
• For ε ∈ D sufficiently small, each of the intervals $I_l \subset I$ contains absolutely continuous spectrum, and for these intervals
\[
\frac{\mathrm{mes}\,(I_l \cap \Sigma_{ac})}{\mathrm{mes}\,(I_l)} = 1 + o(1). \tag{2.23}
\]
Here, $\Sigma_{ac}$ is the absolutely continuous spectrum for the family of Eqs. (1.1).

Let us now study the singular spectrum. As before, if $J_\delta^- \ne \emptyset$, then, for sufficiently small ε, the number of intervals $I_l$ lying in $J_\delta^-$ is of order O(1/ε). We prove

Theorem 2.5. For sufficiently small ε, each of the intervals $I_l \subset J_\delta^-$ contains only singular spectrum. Moreover, for $E \in J_\delta^-$, one has
\[
\Theta(E) \ge \frac{1}{2\pi}\bigl(S_h(E) - S_v(E)\bigr) + o(\varepsilon), \tag{2.24}
\]
where Θ is the Lyapunov exponent for the family of Eqs. (1.1).

Remark 2.2. In Theorems 2.4 and 2.5, we have fixed the value of δ independently of ε. The proofs show that one can actually take $\delta = \varepsilon^{\alpha}$, 0 < α < 1; in this case, the statements of Theorems 2.4 and 2.5 stay correct except that, in (2.22) and (2.24), the estimates of the error terms have to be modified.
2.8.4. Heuristics of the proof. Let us outline the basic heuristics guiding the proof of our results. If we omit the factors (1 + o(1)) in (2.17), the monodromy matrix takes the form
\[
M = M_0 + \lambda M_1, \qquad
M_0 = \begin{pmatrix} z_0 & i z_0 \\ -i\,\overline{z_0} & \overline{z_0} \end{pmatrix}, \qquad
M_1 = \begin{pmatrix} -\zeta & -i\zeta \\ i\zeta^{-1} & -\zeta^{-1} \end{pmatrix}, \qquad
z_0 = \frac{1}{t_h}\,e^{i\Phi/\varepsilon}, \qquad \zeta = e^{2i\pi(\phi - \phi_0)},
\]
where M0 is a matrix with coefficients independent of φ, and M1 is a first order trigonometric polynomial in φ. Now, consider the monodromy equation (2.5) for this matrix. The coefficient λ = λ(E) plays the role of the coupling constant. If λ is small, one expects that the term λM1 can be “omitted” using KAM theory ideas. As a result, in this case, the spectrum should be absolutely continuous. If λ is large, Herman’s idea ([26]) shows that the Lyapunov exponent of the solutions of the monodromy equation is positive. By Corollary 2.1, this implies the positivity of the Lyapunov exponent of the family (1.1). Hence, the spectrum is singular. Finally, since the coefficients of M0 are much bigger than λ, and since, by (2.3), the monodromy matrix is unimodular, the spectrum has to be located near the zeros of Tr(M0), i.e. near the points where $\cos(\Phi(E)/\varepsilon) = 0$.

2.9. The phase diagram. The transitions, possible in view of Theorems 2.4 and 2.5, actually occur as we shall see now. To this end, we study the dependence of $S(E) = S_v(E) - S_h(E)$ on E and on α. To underline the dependence on α, let us slightly change our notations and write $S_h(\alpha, E) = S_h(E)$, $S_v(\alpha, E) = S_v(E)$ and $S(\alpha, E) = S(E)$. The potential V is kept fixed. We work in the (α, E) domain
\[
\mathcal{S} = \{(\alpha, E) \in\, ]0, +\infty[\, \times \mathbb{R};\ E - \alpha \le E_1,\ E_1 \le E + \alpha \le E_2\} \tag{2.25}
\]
(compare with (1.4)). We define $\mathcal{S}_\pm = \{(\alpha, E) \in \mathcal{S};\ S(\alpha, E) \gtrless 0\}$, $\mathcal{S}_0 = \{(\alpha, E) \in \mathcal{S};\ S(\alpha, E) = 0\}$, $\alpha^* := \frac{E_2 - E_1}{2}$ and $E^* := \frac{E_2 + E_1}{2}$. The sets $\mathcal{S}_+$, $\mathcal{S}_0$ and $\mathcal{S}_-$ form a partition of $\mathcal{S}$. One has

Theorem 2.6. Assume that V satisfies the assumptions described in the introduction. Then,
• $\mathcal{S}_0$ is a compact analytic curve joining in $\mathcal{S}$ the point (α∗, E∗) to some point on the half-line E = E1 − α, α > 0.
• For any t ∈ [−1, 1], $\mathcal{S}_0$ intersects the half-line E = E1 + tα (α > 0) at exactly one point in $\mathcal{S}$.

The picture of the phase diagram one gets from Theorem 2.6 is essentially the one given in Fig. 4. Indeed, Theorem 2.6 implies that
• $\mathcal{S}_+$ is connected and there exists α0 > 0 such that $\mathcal{S} \cap \{\alpha < \alpha_0\} \subset \mathcal{S}_+$.
• $\mathcal{S}_-$ is connected and there exists α1 > α∗ > 0 such that $\mathcal{S} \cap \{\alpha > \alpha_1\} \subset \mathcal{S}_-$.

From Theorems 2.4 and 2.6, we deduce that, for α < α0 and for ε sufficiently small, most of the spectrum of $H_{\phi,\varepsilon}$ stays absolutely continuous (in the domain $\mathcal{S}$). This can be considered as an analogue of one of the results of [13]; there, it is proved that, for a small quasi-periodic perturbation of $-\frac{d^2}{dx^2}$, the spectrum is purely absolutely continuous; here, we are dealing with small perturbations of general periodic operators.

Fig. 4. An example of phase diagram

For α > α1, by
Theorem 2.5, for ε sufficiently small, the spectrum is singular (in the domain $\mathcal{S}$). For α0 < α < α1, there is at least one mobility edge. At the point (α∗, E∗), $\mathcal{S}_0$ has the slope
\[
\frac{d\alpha}{dE} = \frac{1 - c_{\mathrm{eff}}}{1 + c_{\mathrm{eff}}} \qquad\text{where}\qquad c_{\mathrm{eff}} = \frac{m_2}{m_1}. \tag{2.26}
\]
Here, $m_{1,2}$ are the effective masses defined by $|k(E_i + \eta) - k(E_i)| \sim m_i \cdot \sqrt{|\eta|}$, i = 1, 2, |η| ≪ 1. By general estimates on the effective mass (see e.g. [32]), one always has $c_{\mathrm{eff}} < 1$. Hence, the picture of the phase diagram that one gets near (α∗, E∗) is roughly the same as in Fig. 2. So, we see that, here, the singular spectrum grows with α for α close to α∗. Note that if, in (1.1), the cosine is replaced by a more general periodic potential, the derivative $\frac{d\alpha}{dE}$ can be negative. In this case, we have singular spectrum “coming from” the “center” of the spectral band of the unperturbed periodic operator.
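The geometry of the (α, E) domain (2.25) and of its corner (α∗, E∗) is elementary and can be checked directly. The sketch below uses the band edges E1 = 0, E2 = 8 of the example in the introduction; the function names are hypothetical helpers introduced for illustration.

```python
def corner_point(E1, E2):
    """Return (alpha*, E*) = ((E2 - E1)/2, (E2 + E1)/2), the endpoint of S_0 named in Theorem 2.6."""
    return (E2 - E1) / 2.0, (E2 + E1) / 2.0

def in_domain(alpha, E, E1, E2):
    """Membership test for the (alpha, E) domain (2.25)."""
    return alpha > 0 and E - alpha <= E1 and E1 <= E + alpha <= E2

alpha_star, E_star = corner_point(E1=0.0, E2=8.0)   # band edges of the introductory example
```

At the corner both inequalities E − α ≤ E1 and E + α ≤ E2 are saturated, which is why it is the point where the two boundary lines of the domain meet.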
2.10. Outline of the paper. In the next section, we prove Theorem 2.1 and therefore study the link between solutions of (1.1) and solutions of (2.5). Section 4 is devoted to the proof of the results stated in Sect. 2.8 when assuming Theorem 2.2. Sections 5, 6, 7 and 8 are devoted to the proof of the asymptotics of the monodromy matrix, Theorem 2.2. In Sect. 5, we recall some results on one dimensional periodic Schrödinger operators; in Sect. 6, we describe the adiabatic complex WKB method. Section 7 is devoted to the description of geometrical and analytical objects of the adiabatic complex WKB method when applied to Eq. (1.1). In Sect. 8, we compute the asymptotics of the monodromy matrix. In Sect. 9, we discuss the properties of the phase and action integrals; we also prove the results on the phase diagrams. In Sect. 10, following [40], we adapt Herman’s argument to estimate the Lyapunov exponent in the case when S(E) < 0. And Sect. 11 is devoted to a simple version of KAM needed to study the monodromy equation for small coupling.
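3. Monodromy Matrices and Monodromy Equation

In Sects. 2.1.1 and 2.1.2, we have introduced the monodromy matrix and the monodromy equation. We now prove

Theorem 3.1. Let $\psi_{1,2}$ be consistent basis solutions of (1.1), and let M be the corresponding monodromy matrix. Fix φ ∈ R. Then, for any solution χ of Eq. (2.5), there exists a unique solution of (1.1), say f, such that
\[
\begin{pmatrix} f(x + 2\pi n/\varepsilon, \phi) \\ f'(x + 2\pi n/\varepsilon, \phi) \end{pmatrix}
= A(x, \phi - nh) \cdot \sigma \cdot \chi_{-n}, \qquad \forall x \in \mathbb{R},\ n \in \mathbb{Z}, \tag{3.1}
\]
where
\[
A = \begin{pmatrix} \psi_1 & \psi_2 \\[2pt] \dfrac{d\psi_1}{dx} & \dfrac{d\psi_2}{dx} \end{pmatrix}, \qquad
\sigma = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
h = \frac{2\pi}{\varepsilon} \mod 1.
\]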
Reciprocally, for any f solution of (1.1), there exists a unique vector χ solution of (2.5) satisfying (3.1).

As the matrix A(x, φ) is periodic in φ and unimodular, Theorem 3.1 immediately implies Theorem 2.1.

Proof. Let us recall some elementary facts on difference equations of the form (2.5) where the matrix M is unimodular. Together with vector solutions of (2.5), we also consider matrix solutions, i.e. sequences of 2 × 2-matrices $\{\Upsilon_n\}_{n\in\mathbb{Z}}$ such that $\Upsilon_{n+1} = M(\phi + nh)\Upsilon_n$. A matrix solution ϒ of Eq. (2.5) is called fundamental if and only if $\det\Upsilon_n \equiv 1$ for all n ∈ Z. To construct a fundamental solution, we let $\Upsilon_0(\phi) = I$, ∀φ ∈ R, and then define $\Upsilon_n$ for all n just by means of the equations
\[
\Upsilon_{n+1}(\phi) = M(\phi + nh)\,\Upsilon_n(\phi), \qquad
\Upsilon_{n-1}(\phi) = M^{-1}(\phi + h(n-1))\,\Upsilon_n(\phi). \tag{3.2}
\]
This is possible as det M ≠ 0. As a result, one obtains a matrix solution of the difference equation (2.5). Since
\[
\det\Upsilon_{n+1}(\phi) = \det M(\phi + nh)\,\det\Upsilon_n(\phi) = \det\Upsilon_n(\phi), \qquad \forall n, \phi,
\]
this matrix solution is fundamental. Any vector solution of the monodromy equation, say χ, can be represented as
\[
\chi_n = \Upsilon_n(\phi)\,p, \qquad \forall n, \tag{3.3}
\]
where p is a constant vector. Moreover, for any constant vector p, formula (3.3) gives a vector solution of the monodromy equation.

Now, we prove Theorem 3.1. The proof consists of several steps.

1. Let (ψ1, ψ2) be a consistent basis of solutions of (1.1) and define $\psi = (\psi_1, \psi_2)^T$. Let ϒ(φ) be a fundamental solution of Eq. (2.5). Assuming that n ≥ 0, we compute
\[
\psi(x + 2\pi n/\varepsilon, \phi) = M\Bigl(\phi - \frac{2\pi}{\varepsilon}\Bigr) \cdots M\Bigl(\phi - \frac{2\pi n}{\varepsilon}\Bigr)\, \psi(x, \phi - 2\pi n/\varepsilon),
\]
\[
I = \Upsilon_0(\phi) = M\Bigl(\phi - \frac{2\pi}{\varepsilon}\Bigr) \cdots M\Bigl(\phi - \frac{2\pi n}{\varepsilon}\Bigr)\, \Upsilon_{-n}(\phi).
\]
Hence, as φ → ψ(x, φ) is 1-periodic, we get
\[
\psi(x + 2\pi n/\varepsilon, \phi) = \Upsilon_{-n}^{-1}(\phi)\, \psi(x, \phi - nh). \tag{3.4}
\]
2. Let A be the matrix defined in Theorem 3.1. As $\det \Upsilon_l \equiv 1$, one has $(\Upsilon_l^{-1})^T = -\sigma \Upsilon_l \sigma$; then, formula (3.4) implies the relation
\[
A(x + 2\pi n/\varepsilon, \phi) = -A(x, \phi - nh) \cdot \sigma \cdot \Upsilon_{-n}(\phi) \cdot \sigma. \tag{3.5}
\]
We have obtained this formula for n ≥ 0; one checks that it remains correct for n < 0.

3. Let χ be a vector solution of (2.5) for a given φ. Represent it in the form (3.3). Denote the components of the vector p in this representation by p1 and p2. The function
\[
f(x, \phi) = -p_2 \psi_1(x, \phi) + p_1 \psi_2(x, \phi) \tag{3.6}
\]
satisfies Eq. (1.1). Applying the left- and the right-hand sides of (3.5) to the vector −σp, we get the relation (3.1). This proves the first statement of Theorem 3.1.

4. To prove the second statement, we represent the solution f as a linear combination of the linearly independent solutions ψ1(x, φ) and ψ2(x, φ) as in (3.6). Then, we let p be the vector $(p_1, p_2)^T$ and remark that $\chi_n = \Upsilon_n(\phi_0)p$ is a solution of (2.5). Applying (3.5) to the vector −σp, we get (3.1). This completes the proof of Theorem 3.1. □

4. The Spectral Study

This section is devoted to the proofs of Proposition 2.1, and Theorems 2.3, 2.4 and 2.5 using the asymptotics of the monodromy matrix obtained in Theorem 2.2.

4.1. Convergence to the asymptotic spectrum. We prove Proposition 2.1. The first statement of Proposition 2.1 follows immediately from regular perturbation theory (see e.g. [3, 38]). We turn to the proof of the second statement. The spectrum of $H_{\phi,\varepsilon}$ is the same as the spectrum of the equation
\[
-\frac{d^2}{du^2}\psi(u) + (V(u) + \alpha\cos(\varepsilon u + \varphi))\psi(u) = E\psi(u), \quad \varphi = \varepsilon\phi.
\]
Pick ν ∈ (0, 1) and K ⊂ Σ a compact set. Pick E ∈ K. Then, there exists ϕ ∈ [0, 2π) and E0 ∈ σ(H0) such that E = E0 + α cos(ϕ). Hence, the equation H0 u = E0 u has a bounded (Bloch) solution, say u, continuous in (x, E0) ∈ R × σ(H0) together with its derivative in x, see, for example, [41]. Pick $\chi \in C_0^\infty(\mathbb{R})$ non negative and not identically vanishing, and put $\chi_\varepsilon(\cdot) = \varepsilon^{\nu/2}\chi(\varepsilon^\nu \cdot)$. One sets $u_\varepsilon = \chi_\varepsilon u$ and checks that
\[
\Bigl\| \Bigl( -\frac{d^2}{dx^2} + V - E_0 \Bigr) u_\varepsilon \Bigr\|_{L^2} \le C\varepsilon^{\nu}.
\]
The constant C is uniform in K. On the other hand, we compute
\[
\bigl\| [\cos(\varepsilon \cdot + \varphi) - \cos(\varphi)]\, u_\varepsilon \bigr\|_{L^2} \le C\varepsilon^{1-\nu}.
\]
So, we obtain $\tilde u \in L^2(\mathbb{R})$, $\|\tilde u\| = 1$, such that
\[
\| -\tilde u'' + (V + \alpha\cos(\varepsilon \cdot + \varphi) - E)\tilde u \| \le C(\varepsilon^{\nu} + \varepsilon^{1-\nu}).
\]
Hence, either $E \in \sigma(H_{\phi,\varepsilon})$ or $\|(H_{\phi,\varepsilon} - E)^{-1}\| \ge C/(\varepsilon^{\nu} + \varepsilon^{1-\nu})$. This implies the second statement of Proposition 2.1.
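The two norm estimates used in Sect. 4.1 can be checked by a short computation. The following worked step is only a sketch: it assumes that u and u' are bounded on R (as stated for the Bloch solution) and that supp χ ⊂ [−R, R], where R is a name introduced here for illustration.

```latex
% Since (H_0 - E_0) u = 0, only derivatives of the cut-off survive:
\[
\Bigl(-\frac{d^2}{dx^2} + V - E_0\Bigr)(\chi_\varepsilon u)
  = -\chi_\varepsilon''\, u - 2 \chi_\varepsilon'\, u' .
\]
% Scaling of \chi_\varepsilon(x) = \varepsilon^{\nu/2} \chi(\varepsilon^{\nu} x):
\[
\|\chi_\varepsilon\|_{L^2} = \|\chi\|_{L^2}, \qquad
\|\chi_\varepsilon'\|_{L^2} = \varepsilon^{\nu} \|\chi'\|_{L^2}, \qquad
\|\chi_\varepsilon''\|_{L^2} = \varepsilon^{2\nu} \|\chi''\|_{L^2},
\]
% hence
\[
\|(H_0 - E_0) u_\varepsilon\|_{L^2}
 \le \varepsilon^{2\nu} \|u\|_\infty \|\chi''\|_{L^2}
   + 2 \varepsilon^{\nu} \|u'\|_\infty \|\chi'\|_{L^2}
 \le C \varepsilon^{\nu}.
\]
% On the support of \chi_\varepsilon one has |x| \le R \varepsilon^{-\nu}, so
\[
|\cos(\varepsilon x + \varphi) - \cos\varphi| \le \varepsilon |x| \le R\, \varepsilon^{1-\nu},
\qquad
\bigl\|[\cos(\varepsilon\cdot + \varphi) - \cos\varphi]\, u_\varepsilon\bigr\|_{L^2}
 \le R\, \varepsilon^{1-\nu}\, \|u\|_\infty \|\chi\|_{L^2}.
\]
```

The two error terms are of the same order for ν = 1/2, which gives a quasi-mode accurate up to O(ε^{1/2}).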
4.2. Location of the spectrum. Here, we apply Theorem 2.2 to deduce the results of Theorem 2.3 on the location of the spectrum. We proceed as follows. First, we describe conditions guaranteeing that the monodromy equation has two linearly independent solutions rapidly decaying at infinity. Then, we use Theorem 3.1 to construct the corresponding solutions to Eq. (1.1). This describes the complement of the spectrum of (1.1). 4.2.1. Bloch solutions of difference equations. We need some results about solutions of difference equation of the form χ (φ + h) = M (φ) χ (φ),
φ ∈ R,
(4.1)
where h is a positive number and M (φ) ∈ SL (2, C) is a given 1-periodic, matrix-valued function. The results we formulate below can be found in [7]. The set of the vector solutions of (4.1) is a two-dimensional module over the ring of h-periodic functions. For any two vector solutions χ1 and χ2 of (4.1), the determinant of the matrix (χ1 , χ2 ) is h-periodic. The solutions χ1 and χ2 are linearly independent (over the ring of h-periodic functions) if and only if this determinant does not vanish. If χ1 and χ2 are linearly independent solutions, then, any other vector solution χ can be represented as a linear combination of χ1 and χ2 with h-periodic coefficients χ (φ) = a(φ)χ1 (φ) + b(φ)χ2 (φ),
a(φ + h) = a(φ),
b(φ + h) = b(φ),
φ ∈ R.
If φ → χ(φ) is a solution of (4.1), so is φ → χ(φ + 1). Assume, moreover, that χ satisfies χ(φ + 1) = u(φ)χ(φ) for φ ∈ R, where φ → u(φ) is an h-periodic function; then χ is called a Bloch solution, and the function u is its Floquet multiplier. If either |u(φ)| ≥ Const > 1 for all φ ∈ R or |u(φ)| ≤ Const < 1 for all φ ∈ R, the Bloch solution is called monotonous.

Remark 4.1. If |u(φ)| ≤ C < 1, then the solution χ decays exponentially as φ → +∞. Indeed, for any positive integer L, one has
\[
|\chi(\phi + L)| = \prod_{l=0}^{L-1} |u(\phi + l)|\; |\chi(\phi)| \le C^{L} |\chi(\phi)|.
\]
If, in addition, χ is bounded on the interval [0, 1[, then $|\chi(\phi)| \le C_1 e^{-C_2 \phi}$ with some positive constants C1 and C2. Similarly, if |u(φ)| ≥ C > 1, then χ decays exponentially as φ → −∞.

We finish this section by formulating a simple condition for the existence of the monotonous Bloch solutions. Let $M = ((M_{ij}))_{1 \le i,j \le 2}$ and set
\[
\rho(\phi) = \frac{M_{12}(\phi)}{M_{12}(\phi - h)}, \qquad
v(\phi) = M_{11}(\phi) + \rho(\phi)\, M_{22}(\phi - h). \tag{4.2}
\]
We also define
\[
\rho_- = \inf_{\phi\in\mathbb{R}} |\rho(\phi)|, \quad
\rho_+ = \sup_{\phi\in\mathbb{R}} |\rho(\phi)|, \quad
v_- = \inf_{\phi\in\mathbb{R}} |v(\phi)|, \quad
v_+ = \sup_{\phi\in\mathbb{R}} |v(\phi)|. \tag{4.3}
\]
For f : R → C a periodic continuous non-vanishing function on R, let the index of f, ind f, be the integer equal to the increment of $\frac{1}{2\pi}\arg f$ over one period. Clearly, ind f is independent of the period chosen to define it. One has

Proposition 4.1 ([7]). Assume that M is continuous in φ, and let M12 ≠ 0. If
\[
\rho_- > 0, \qquad \rho_+ < \Bigl(\frac{v_-}{2}\Bigr)^2, \tag{4.4}
\]
\[
\operatorname{ind} v = \operatorname{ind} \rho = 0, \tag{4.5}
\]
then Eq. (4.1) has two monotonous Bloch solutions χ± that satisfy det(χ+, χ−) = 1 for all φ ∈ R, that are continuous in φ and such that the Floquet multipliers u± of χ± satisfy the relation u+(φ)u−(φ) = 1.

Remark 4.2. As det(χ+, χ−) = 1, these solutions are linearly independent. As u+u− = 1, one of them decays at +∞, and the other one decays at −∞.

4.2.2. Monotonous Bloch solutions and the spectrum of (1.1). Fix an energy E. Let ψ1 and ψ2 form a consistent basis for the family of Eqs. (1.1). Let M(φ) be the corresponding monodromy matrix. Here, instead of the monodromy Eq. (2.5), we consider its continuous analog (4.1). Clearly, if χ(φ) is a vector solution of this equation, then the vectors
\[
\chi_n = \chi(\phi + nh), \quad n \in \mathbb{Z}, \tag{4.6}
\]
satisfy the monodromy equation (2.5). This and Theorem 2.1 imply

Lemma 4.1. Let ψ1,2 be locally bounded in (x, φ) ∈ R² together with their derivatives in x. If, for a given E, Eq. (4.1) has two linearly independent, locally bounded, monotonous Bloch solutions χ±, then E is in the resolvent set of $H_{\phi,\varepsilon}$ for any φ ∈ R.

Proof. Fix φ ∈ R. By formula (4.6), using χ±, construct two vector solutions $\chi_n^\pm$ of the monodromy equation (2.5). By Theorem 3.1, construct f±, two solutions of (1.1) satisfying the relation (3.1). Compute the Wronskian of f±. By (3.1), one has
\[
w(f^+, f^-) = \det A(x, \phi - nh)\, \det\sigma\, \det\bigl(\chi^+_{-n}, \chi^-_{-n}\bigr) = \det\bigl(\chi^+(\phi - nh), \chi^-(\phi - nh)\bigr).
\]
As χ± are linearly independent, the last determinant does not vanish, and f± are linearly independent solutions of (1.1). As χ± are monotonous and locally bounded, each of them exponentially decays either at +∞ or at −∞. As they are linearly independent, det(χ+, χ−) is a nonzero h-periodic function, and, therefore, one of them, say χ+, is exponentially decaying at +∞ and the second one, χ−, is exponentially decaying at −∞. So, we write $\|\chi^\pm(\phi)\| \le C_1 e^{\mp C_2 \phi}$ (C1, C2 > 0). This, and relations (4.6) and (3.1) imply that
\[
|f^\pm(x)| \le C_3 e^{\pm C_4 x}, \qquad
C_3 = C_1 \cdot \sup_{x\in[0,2\pi/\varepsilon],\ \phi\in[0,1]} \|A(x, \phi)\|, \qquad
C_4 = C_2 \frac{h\varepsilon}{2\pi}. \tag{4.7}
\]
Here, we have used the fact that ψ1,2 are 1-periodic in φ and locally bounded in (x, φ) together with their derivatives. As (1.1) has two linearly independent solutions f± satisfying estimates of the form (4.7), the energy E is in the resolvent set of $H_{\phi,\varepsilon}$ (see e.g. [9, 41]). □
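Lemma 4.1 is applied below through the hypotheses (4.4)–(4.5) of Proposition 4.1, which are easy to test numerically for a given 1-periodic matrix. The following is a minimal sketch, not the paper's method: the function names, the grid size and the toy matrix entries are assumptions made for illustration, and the entries are not those of the monodromy matrix of Theorem 2.2.

```python
import cmath
import math


def index(values):
    """Index (winding number) of a sampled closed curve that avoids 0.

    Sums the arg increments between consecutive samples; assumes the
    sampling is fine enough that each increment lies in (-pi, pi).
    """
    total = 0.0
    n = len(values)
    for j in range(n):
        total += cmath.phase(values[(j + 1) % n] / values[j])
    return round(total / (2 * math.pi))


def check_hypotheses(M11, M12, M22, h, samples=2000):
    """Test conditions (4.4)-(4.5) for rho, v of (4.2) on a grid in phi."""
    phis = [j / samples for j in range(samples)]
    rho = [M12(p) / M12(p - h) for p in phis]
    v = [M11(p) + r * M22(p - h) for p, r in zip(phis, rho)]
    rho_minus = min(abs(r) for r in rho)
    rho_plus = max(abs(r) for r in rho)
    v_minus = min(abs(x) for x in v)
    cond_44 = rho_minus > 0 and rho_plus < (v_minus / 2) ** 2
    cond_45 = index(v) == 0 and index(rho) == 0
    return cond_44 and cond_45


# Hypothetical toy entries (NOT the matrix of Theorem 2.2).
h = (math.sqrt(5) - 1) / 2
M12 = lambda p: 1 + 0.2 * cmath.exp(2j * math.pi * p)
big = lambda p: 3 + 0.1 * math.cos(2 * math.pi * p)
small = lambda p: 0.5

print(check_hypotheses(big, M12, big, h))      # large |v|: both conditions hold
print(check_hypotheses(small, M12, small, h))  # small |v|: condition (4.4) fails
```

A grid check of this kind can only suggest that the hypotheses hold; the infima and suprema in (4.3) are taken over all φ ∈ R.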
4.2.3. The functions ρ and v. To study the functions ρ and v defined in (4.2) for the monodromy matrix described in Theorem 2.2, we discuss the coefficients a and b of the monodromy matrix M. Recall that they are 1-periodic and that we have decomposed them as $a = a_0 + \tilde a(\phi)$ and $b = b_0 + \tilde b(\phi)$, where a0 and b0 are the zeroth Fourier coefficients of a and b. The asymptotics of $a_0$, $\tilde a$, $b_0$ and $\tilde b$ are given in Theorem 2.2. One proves

Lemma 4.2. For E in a constant neighborhood of E0, for φ in a constant neighborhood of R and for ε sufficiently small, one has
\[
\rho(\phi) = 1 - e^{-i\Phi/\varepsilon}\, t_v\, U(\phi)\,(1 - e^{-2\pi i h} + o(1)), \tag{4.8}
\]
\[
v(\phi) = F(E) - 2\lambda\cos(2\pi(\phi - \phi_0 - h)) - \lambda\cos\Bigl(\frac{\Phi}{\varepsilon}\Bigr) \cdot p(\phi, E) + o\bigl(\lambda e^{2\pi|\operatorname{Im}\phi_0|}\bigr), \tag{4.9}
\]
\[
p(\phi, E) = 2 e^{-i\Phi/\varepsilon}\, U(\phi)\,(1 - e^{-2\pi i h} + o(1)), \tag{4.10}
\]
where
\[
F = a_0 + a_0^*, \qquad U(\phi) = e^{2\pi i(\phi - \phi_0)}, \qquad \lambda = t_v/t_h.
\]
The asymptotics in (4.8), (4.9) and (4.10) are uniform, and the error terms are analytic in E.

Remark 4.3. In Lemma 4.2, we used the terminology "constant neighborhood"; here and in the sequel, this means a neighborhood independent of ε.

Proof. Using (2.17), we get
\[
\tilde b/b_0 = -t_v\, e^{-i\Phi(E)/\varepsilon}\, U(\phi)\,(1 + o(1)).
\]
Note that this ratio is small in a constant neighborhood of E0, and so, in this neighborhood, we can write
\[
\rho = \frac{1 + \tilde b(\phi)/b_0}{1 + \tilde b(\phi - h)/b_0}
= 1 + \frac{\tilde b(\phi)}{b_0} - \frac{\tilde b(\phi - h)}{b_0} + o\Bigl(\frac{\tilde b(\phi)}{b_0}\Bigr) + o\Bigl(\frac{\tilde b(\phi - h)}{b_0}\Bigr).
\]
Substituting into this representation the asymptotics of $\tilde b(\phi)/b_0$, one obtains the asymptotics of ρ. The analyticity in E of the error term follows from the analyticity of the error terms in the monodromy matrix asymptotics. This completes the proof of (4.8).

Recall that $\tilde a(\phi) = a(\phi) - a_0$. As $v(\phi) = a(\phi) + \rho(\phi)\, a^*(\phi - h)$, one has
\[
v(\phi) = F + \tilde a(\phi) + (\rho(\phi) - 1)\, a^*(\phi - h) + \tilde a^*(\phi - h).
\]
Using (4.8) and the asymptotics (2.17), we obtain (4.9)–(4.10). This completes the proof of Lemma 4.2. □

The function $F(E) = a_0(E) + a_0^*(E)$ plays an important role in the spectral analysis of Eq. (1.1). We call it the effective spectral parameter. One has

Lemma 4.3. The function F is real analytic in a constant neighborhood of E0 and, in this neighborhood, it admits the following uniform asymptotic representation
\[
F(E) = 2\, \frac{1 + g_1(E)}{t_h(E)} \cos\Bigl(\frac{1}{\varepsilon}\Phi(E) + g_2(E)\Bigr), \qquad g_{1,2} = o(1), \tag{4.11}
\]
where g1 and g2 are real analytic, and $g_{1,2}'(E),\ g_{1,2}''(E) = o(1)$.
Proof. The function F is real analytic. Theorem 2.2 implies that $a_0 = \frac{1}{t_h}\, e^{i\Phi/\varepsilon}\, e^{g(E)}$, g being analytic and satisfying the estimate g = o(1) in a constant neighborhood of E0. Let $\tilde g_1(E) = \frac{1}{2}\bigl(g(E) + \overline{g(\bar E)}\bigr)$ and $\tilde g_2(E) = \frac{1}{2i}\bigl(g(E) - \overline{g(\bar E)}\bigr)$. These two functions are real analytic and admit the estimates $\tilde g_{1,2} = o(1)$. Clearly,
\[
F(E) = 2\, \frac{e^{\tilde g_1(E)}}{t_h(E)} \cos\Bigl(\frac{1}{\varepsilon}\Phi(E) + \tilde g_2(E)\Bigr).
\]
This implies (4.11). The estimates on the derivatives of g1,2 follow, for example, from the Cauchy estimates for analytic functions. □

4.2.4. Location of the spectrum up to exponentially small errors. Now, we are ready to prove the part of Theorem 2.3 concerning the geometry of the spectrum. Our plan is to describe the set R of values of the spectral parameter E for which the monodromy matrix described in Theorem 2.2 satisfies the hypothesis of Proposition 4.1. By Lemma 4.1, the spectrum of $H_{\phi,\varepsilon}$ lies in the complement of R. Let V0 be the constant neighborhood of E0 where (2.17) and Lemmas 4.2 and 4.3 are valid. We study the location of the spectrum of (1.1) in J0 = V0 ∩ J. The asymptotics of ρ, v and F imply that the monodromy matrix described by Theorem 2.2 satisfies the assumptions of Proposition 4.1 if
\[
\Bigl|\cos\Bigl(\frac{1}{\varepsilon}\Phi(E) + g_2(E)\Bigr)\Bigr| \ge \mathrm{Const}\,(t_h + t_v), \tag{4.12}
\]
where g2 is the function from (4.11), and Const is independent of ε. Hence, the values of E satisfying this condition are outside of the spectrum of $H_{\phi,\varepsilon}$. By point (4) of Lemma 2.1, one has
\[
\Bigl|\frac{1}{\varepsilon}\Phi'(E) + g_2'(E)\Bigr| \ge \frac{C}{\varepsilon}, \quad E \in J_0. \tag{4.13}
\]
As the tunneling coefficients are exponentially small, inequality (4.13) implies that (4.12) is satisfied outside exponentially small intervals. These intervals are located in o(ε)-neighborhoods of the points $E^{(l)}$ defined by $\cos\bigl(\frac{1}{\varepsilon}\Phi(E^{(l)})\bigr) = 0$; the distances between these points are of order ε. Near each $E^{(l)}$, there is exactly one of these intervals. We call these intervals $(I_l)_{l\in L}$.

Let us discuss the intervals $(I_l)_{l\in L}$ in more detail. Fix an index l ∈ L. Recalling the definitions of $t_v(E)$ and $t_h(E)$, one sees that, up to a factor 1 + o(1), these functions are constant both on $I_l$ and in an o(ε)-neighborhood of $E^{(l)}$. For E ∈ $I_l$, by Lemmas 4.2 and 4.3, we have
\[
\rho = 1 + O(t_v(E^{(l)})), \qquad
v = F(E) - 2\lambda(E^{(l)})\cos\bigl(2\pi(\phi - \phi_0(E^{(l)}) - h)\bigr) + o(\lambda(E^{(l)})),
\]
\[
F(E) = 2\, \frac{1 + o(1)}{t_h(E^{(l)})} \cos\Bigl(\frac{1}{\varepsilon}\Phi(E) + g_2(E)\Bigr).
\]
Now we can define the intervals $(I_l)_{l\in L}$ more precisely; instead of (4.12), the interval $I_l$ situated in the o(ε)-vicinity of $E^{(l)}$ can be described by
\[
\Bigl|\cos\Bigl(\frac{1}{\varepsilon}\Phi(E) + g_2(E)\Bigr)\Bigr| \le \bigl(t_h(E^{(l)}) + t_v(E^{(l)})\bigr)(1 + o(1)),
\]
where o(1) only depends on ε. This implies that the length of the subinterval situated near $E^{(l)}$ is given by (2.19). Note that any interval $I_l$ contains precisely one zero of F(E). We have thus proved the statements of Theorem 2.3 on the location of the spectrum in a constant interval J0 ⊂ R containing E0. As E0 can be any point in J, this completes the proof of Theorem 2.3 up to the result on the integrated density of states.

4.3. Calculation of the integrated density of states. For ε/π ∉ Q, (1.1) is a metrically transitive family of equations (see e.g. [37]). Recall that Σ is the spectrum of $H_{\phi,\varepsilon}$.

4.3.1. Weyl solutions, Lyapunov exponents and the integrated density of states. We begin by recalling general results on the Lyapunov exponent (see [37, 39]). Consider the family of quasi-periodic equations (1.1). For any fixed value of the spectral parameter E such that Im E ≥ 0, E ∉ Σ, there is a unique solution $u^+$ to (1.1) such that $u^+(x) \in L^2(0, +\infty)$ and $u^+(0) = 1$; we call it the Weyl solution of (1.1). For almost all φ ∈ R, the Weyl solution satisfies the estimate
\[
|u^+(x, E, \phi)|^2 + |du^+(x, E, \phi)/dx|^2 = e^{-x(\Theta(E) + o(1))}, \quad x \to +\infty, \tag{4.14}
\]
where Θ(E) is positive and independent of φ and x. Θ(E) is the Lyapunov exponent of the family (1.1). The Lyapunov exponent is the real part of a function f : E → f(E) analytic in the upper half-plane Im E ≥ 0, E ∉ Σ, and such that the integrated density of states is equal to the limit
\[
N(E) = -\frac{1}{\pi} \lim_{\alpha \to +0} \operatorname{Im} f(E + i\alpha), \quad E \in \mathbb{R}.
\]
4.3.2. More on Bloch solutions of difference equations. Here, following [7], we discuss in more detail the monotonous Bloch solutions described in Proposition 4.1. By Remarks 4.1 and 4.2, in the case of Proposition 4.1, Eq. (4.1) has a continuous Bloch solution χ(φ) satisfying the estimate $\|\chi(\phi)\| \le C_1 e^{C_2 \phi}$ (C1,2 > 0). This solution can be constructed in the following way. Together with (4.1), consider the finite difference Riccati equation
\[
G(\phi) + \frac{\rho(\phi)}{G(\phi - h)} = v(\phi), \quad \phi \in \mathbb{R}, \tag{4.15}
\]
where v and ρ are defined by (4.2). Under condition (4.4), Eq. (4.15) has a continuous, 1-periodic solution that can be represented by the continued fraction
\[
G(\phi) = v(\phi) - \cfrac{\rho(\phi)}{v(\phi - h) - \cfrac{\rho(\phi - h)}{v(\phi - 2h) - \cfrac{\rho(\phi - 2h)}{\cdots}}}. \tag{4.16}
\]
This continued fraction converges uniformly in φ ∈ R. It satisfies
\[
|G(\phi) - v(\phi)| < \frac{v_-}{2} - \sqrt{\Bigl(\frac{v_-}{2}\Bigr)^2 - \rho_+}, \quad \phi \in \mathbb{R}, \tag{4.17}
\]
where v− and ρ+ are defined by (4.3). For φ ∈ R, define g(φ) = ln G(φ). Condition (4.5) guarantees that the function g can be defined as a 1-periodic continuous function of φ. If v depends analytically on some parameter E in a simply connected domain, then g also is analytic in E. Consider a continuous solution (not necessarily periodic) of the homological equation λ(φ + h) − λ(φ) = g(φ). For any irrational h, one has
\[
\lambda(\phi) = (\tilde\theta + o(1))\, \frac{\phi}{h}, \quad \phi \to +\infty, \quad \text{where } \tilde\theta = \int_0^1 \ln G(\phi)\, d\phi, \quad \operatorname{Re} \tilde\theta > 0. \tag{4.18}
\]
The function $\chi_1(\phi) = e^{\lambda(\phi)}$ is the first component of the Bloch solution χ mentioned at the beginning of this section. Note that one has
\[
G(\phi) = \frac{\chi_1(\phi + h)}{\chi_1(\phi)}. \tag{4.19}
\]
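The construction of Sect. 4.3.2 can be tried out numerically. The sketch below truncates the continued fraction (4.16) at a finite depth, then checks the Riccati equation (4.15) and the bound (4.17) at one point and approximates the integral in (4.18). The function names, the truncation depth, the grid size and the sample functions v and ρ are assumptions chosen for illustration; they are not taken from Theorem 2.2.

```python
import cmath
import math


def continued_fraction_G(v, rho, h, phi, depth=60):
    """Truncate (4.16) after `depth` levels and evaluate it at phi."""
    g = v(phi - depth * h)  # innermost level: the tail of the fraction is dropped
    for j in range(depth - 1, -1, -1):
        g = v(phi - j * h) - rho(phi - j * h) / g
    return g


def theta_tilde(v, rho, h, samples=400):
    """Riemann-sum approximation of the integral of ln G over one period, cf. (4.18)."""
    total = 0j
    for j in range(samples):
        total += cmath.log(continued_fraction_G(v, rho, h, j / samples))
    return total / samples


# Hypothetical 1-periodic data: rho_+ = 1.1 < (v_-/2)^2 = 1.5625, so (4.4) holds.
h = (math.sqrt(5) - 1) / 2
v = lambda p: 3 + 0.5 * math.cos(2 * math.pi * p)
rho = lambda p: 1 + 0.1 * math.sin(2 * math.pi * p)

phi = 0.3
G_here = continued_fraction_G(v, rho, h, phi)
G_back = continued_fraction_G(v, rho, h, phi - h)
riccati_residual = abs(G_here + rho(phi) / G_back - v(phi))  # equation (4.15)
bound_417 = 2.5 / 2 - math.sqrt((2.5 / 2) ** 2 - 1.1)         # right side of (4.17)
print(riccati_residual, abs(G_here - v(phi)), bound_417, theta_tilde(v, rho, h))
```

Under (4.4) each level of the fraction contracts, so the truncation error decreases geometrically with the depth; this is why a moderate depth suffices in this toy example.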
4.3.3. Calculation of the integrated density of states. Let us now come back to Eq. (1.1).

1. We start with a description of the relation between the integrated density of states and the continued fraction G. Let J be a domain in the half plane Im E ≥ 0, J ∩ Σ = ∅. Assume that, on J, one can construct a consistent basis such that the corresponding monodromy matrix is analytic in E ∈ J and satisfies the conditions of Proposition 4.1. Then, for all E ∈ J, one can construct the Bloch solution χ of (4.1) described in Sect. 4.3.2. The vector χ(φ + nh) satisfies the monodromy equation. Clearly, by Theorem 2.1, this solution corresponds to the Weyl solution $u^+$ of (1.1); the Lyapunov exponent Θ of (1.1) can be calculated by formula (2.7) with $\theta = \operatorname{Re}\tilde\theta$, $\tilde\theta$ being defined by (4.18). This leads to the formula
\[
\Theta(E) = \frac{\varepsilon}{2\pi}\, \operatorname{Re} \int_0^1 \ln G(\phi, E)\, d\phi, \quad E \in J, \tag{4.20}
\]
where G is the continued fraction (4.16). Since the Lyapunov exponent Θ is the real part of the analytic function f(E), and ln G is analytic in E ∈ J, one has
\[
f(E) = \frac{\varepsilon}{2\pi} \int_0^1 \ln G(\phi, E)\, d\phi + C, \quad E \in J,
\]
where C is a constant independent of E. Therefore, if γ is a continuous curve in J connecting a ∈ R to b ∈ R, then one has
\[
N(b) - N(a) = -\frac{\varepsilon}{2\pi^2} \int_0^1 \arg G(\phi, E)\, d\phi\, \Big|_\gamma. \tag{4.21}
\]
Here $f|_\gamma$ denotes the increment of f when going from a to b along γ. We use formula (4.21) to compute the increments of the integrated density of states along the intervals $I_l$, l = L1, . . . , L2, and, thus, prove (2.20).
2. Let us prove Theorem 2.3. Consider one of the intervals $I_l$, L1 ≤ l ≤ L2. Let E∗ be the zero of F(E) closest to $E^{(l)}$. By Sect. 4.2.4, we know that
\[
E_* - E^{(l)} = o(\varepsilon), \tag{4.22}
\]
that the interval $I_l$ contains E∗, and that the length of $I_l$ is exponentially small, $|I_l| \le e^{-C/\varepsilon}$. We fix two constants c1 and c2 so that 1 < c2 < c1 and choose
\[
J = \bigl\{ E \in \mathbb{C} \ \big|\ \operatorname{Im} E \ge 0,\ \varepsilon^{c_1} < |E - E_*| < \varepsilon^{c_2} \bigr\}.
\]
As $I_l$ is exponentially small, one has J ∩ $I_l$ = ∅. Moreover, all the other intervals containing some spectrum of $H_{\phi,\varepsilon}$ are either to the right or to the left of J. So, J is in the resolvent set of (1.1). Pick a and b in J ∩ R, a < E∗ < b, and let γ be the semi-circle connecting a and b in J. To prove the statement of Theorem 2.3 on the integrated density of states, it suffices to check the formula
\[
N(b) - N(a) = \frac{\varepsilon}{2\pi}. \tag{4.23}
\]
To prove (4.23), we first check

Lemma 4.4. For E ∈ J, the monodromy matrix described in Theorem 2.2 satisfies the conditions of Proposition 4.1.

So, we can calculate N(b) − N(a) using (4.21). To compute the right-hand side of (4.21), we use

Proposition 4.2. For the monodromy matrix from Theorem 2.2, one has
\[
\int_0^1 \arg G(\phi, E)\, d\phi\, \Big|_\gamma = \arg F(E)\big|_\gamma. \tag{4.24}
\]
As F is real analytic, one has $\arg F(E)|_\gamma \in \pi\mathbb{Z}$. To get the actual value of $\arg F(E)|_\gamma$, we use

Lemma 4.5. Fix c > 0. Then, for ε sufficiently small, one has
• \[ F'(E_*) = \pm 2\, \frac{\Phi'(E_*)}{\varepsilon\, t_h(E_*)}\, (1 + o(1)), \tag{4.25} \]
where the sign ± depends only on $I_l$;
• \[ F''(E) = O\Bigl(\frac{1}{\varepsilon^2\, t_h(E_*)}\Bigr), \quad \text{uniformly for } |E - E_*| \le \varepsilon^{c}. \tag{4.26} \]
By Lemma 2.8, one has Φ' ≥ Const > 0 on J; hence, Lemma 4.5 implies that, on γ, $F(E) = F'(E_*)(E - E_*)(1 + o(1))$. Therefore, $\arg F(E)|_\gamma = -\pi + o(1)$. As $\arg F(E)|_\gamma \in \pi\mathbb{Z}$, one has $\arg F(E)|_\gamma = -\pi$. This, (4.21) and Proposition 4.2 imply (4.23). This completes the proof of Theorem 2.3. □

3. We now prove Lemmas 4.4, 4.5 and Proposition 4.2.

Proof of Lemma 4.5. Formulas (4.25) and (4.26) follow from the representation (4.11). Clearly, for E = E∗, one has cos(Φ/ε + g2) = 0, and sin(Φ/ε + g2) = ±1. Therefore, one has
\[
F'(E_*) = \pm 2\, \frac{1 + g_1}{t_h}\, \bigl(\Phi'/\varepsilon + g_2'\bigr)\Big|_{E = E_*},
\]
and as, by Lemma 2.1, Φ'(E∗) > 0, we get (4.25). To estimate F'', one uses Lemma 4.3 and the estimates Φ'(E), Φ''(E) = O(1), $t_h'(E) = O(t_h/\varepsilon)$ and $t_h''(E) = O(t_h/\varepsilon^2)$, which follow from the definitions of the phase integral and the tunneling coefficients. □

Proof of Lemma 4.4. To this end, we need the following immediate corollary of Lemma 4.5.

Corollary 4.1. There exists C > 0 such that, for ε sufficiently small,
\[
\frac{1}{C}\, \frac{\varepsilon^{c_1 - 1}}{t_h(E_*)} \le |F| \le C\, \frac{\varepsilon^{c_2 - 1}}{t_h(E_*)}, \quad E \in J. \tag{4.27}
\]
Now, let us check that, for E ∈ J, the monodromy matrix described in Theorem 2.2 satisfies the assumptions of Proposition 4.1. By Lemma 4.2, uniformly in E ∈ J, one has
\[
\rho = 1 + O(t_v(E_*)), \qquad v = F(E) + O(\lambda(E_*)). \tag{4.28}
\]
The last formula for v and Corollary 4.1 imply that
\[
\frac{v - F}{F} = O\bigl(t_v(E_*)\, \varepsilon^{1 - c_1}\bigr), \quad E \in J. \tag{4.29}
\]
So, as $t_v$ is exponentially small, (v − F)/F is small. We check the assumptions of Proposition 4.1:
• by (4.28), the index of the periodic function ρ is zero;
• by (4.29), one has ind v = ind (F [1 + (v − F)/F]) = ind (1 + (v − F)/F) = 0; here, we have used the fact that F is independent of φ;
• by (4.28), one has
\[
\rho_\pm = 1 + O(t_v(E_*)), \quad E \in J; \tag{4.30}
\]
• by (4.29) and Corollary 4.1, one has $v_- = |F| \min_{\phi\in\mathbb{R}} |1 + (v - F)/F| \ge C$
therefore, 0 < ρ− and ρ+
As $\dfrac{\sqrt{-G}}{F/2} = \dfrac{\sqrt{4 - f^2}}{f}$, for E ∈ I∗, we can choose
\[
\eta(E) = -\arctan\Biggl(\frac{\sqrt{4 - f^2(E)}}{f(E)}\Biggr)\, \operatorname{sign} f'(E_*).
\]
Here, the square root is positive, and the branch of arctan is chosen so that, for E ∈ I∗, one has 0 ≤ η(E) ≤ π. With this choice, η(E) is continuous in E ∈ I∗. One computes
\[
\eta'(E) = \frac{f'(E)}{\sqrt{4 - f^2(E)}}\, \operatorname{sign} f'(E_*).
\]
Hence, η(E) is monotonically increasing on I∗ from 0 to π. Finally, we note that the relation F(E) = ν + ν∗ implies that
\[
2\cos\eta(E) = f(E). \tag{4.45}
\]
4.4.3. A new parameterization of the monodromy matrix. As d = 1 + O(λ²), on the interval I∗, the matrix D admits the representation
\[
D = D_0 + D_1, \qquad D_0 = \begin{pmatrix} e^{i\eta(E)} & 0 \\ 0 & e^{-i\eta(E)} \end{pmatrix}, \qquad D_1 = o(\lambda_*^2). \tag{4.46}
\]
Proposition 11.1 is applied to Eq. (11.3) with matrices D and A depending on a parameter η (see (11.2)). So, we shall consider the monodromy matrix as a function of η. As η(E) is monotonous, we can introduce the inverse function E(η) and consider all the objects as functions of η. Let us study E(η). Clearly, it is a monotonous continuous function mapping the interval [0, π] onto the interval I∗. Furthermore, (4.45) implies

Lemma 4.8. The function E(η) can be analytically continued to $V_{(0,\pi)}$, a constant neighborhood of the interval (0, π); and there exists C > 0 such that, in $V_{(0,\pi)}$, one has $|E(\eta) - E_*| \le C\varepsilon\, t_h(E_*)$, and
\[
E'(\eta) \sim -\frac{2\sin\eta}{F'(E_*)}. \tag{4.47}
\]
Proof. Fix C > 0. Let $D_C = \{|E - E_*| \le C\varepsilon\, t_h(E_*)\}$. Lemma 4.7 and formula (4.25) imply that
• there exists C1 > 0 such that, in the domain $D_C$, one has |f(E)| ≤ C1 for ε sufficiently small;
• f bijectively maps $D_C$ onto its image;
• if C is large enough, then $f(D_C)$ contains the interval (−2, 2);
• for E ∈ $D_C$, one has f'(E) ∼ F'(E∗).
The function E(η) can be constructed as $f^{-1}(2\cos\eta)$, where $f^{-1}$ is the inverse of f. The information collected on f implies Lemma 4.8. □

We finish this section with three immediate corollaries of Lemma 4.8.

Corollary 4.2. The eigenvalues ν and ν∗ of the matrix M0 are analytic in η in $V_{(0,\pi)}$; for η ∈ $V_{(0,\pi)}$, one has
\[
\nu = e^{i\eta} + o(\lambda_*^2), \qquad \nu^* = e^{-i\eta} + o(\lambda_*^2).
\]
Proof. The statement is a consequence of (4.44) and Lemmas 4.8 and 4.6. □
Corollary 4.3. det P and sin η / det P are analytic in η ∈ $V_{(0,\pi)}$; for η ∈ $V_{(0,\pi)}$, one has
\[
\det P \sim -2i\, \frac{e^{i\Phi(E_*)/\varepsilon}}{t_h(E_*)}\, \sin\eta.
\]
Proof. Corollary 4.3 is a consequence of formulas (4.38) and (4.44), Lemma 4.8 and the asymptotics of a0. □

Corollary 4.4. The matrix $D_1 = D - \operatorname{Diag}(e^{i\eta}, e^{-i\eta})$ is analytic in $V_{(0,\pi)}$; in $V_{(0,\pi)}$, one has $D_1 = O(\lambda_*^2)$.

Proof. Corollary 4.4 is a consequence of Corollary 4.2. □
4.4.4. Estimates of the matrices A1 and A2. Recall that the matrices A1 and A2 are defined in (4.40). We now estimate them assuming that (φ, η) ∈ R, where $R = \{|\operatorname{Im}\phi| \le r\} \times V_{(0,\pi)}$. Here, r is a fixed constant, independent of ε. There is one difficulty in estimating A1. Indeed, note that in R, one has
\[
|a_0|,\ |b_0| \sim 1/t_h(E_*), \qquad |\tilde a|,\ |\tilde b| \sim \lambda_*.
\]
These estimates are a consequence of Theorem 2.2 and Lemma 4.8. Furthermore, by Corollary 4.3, one has $\Bigl|\dfrac{1}{\det P}\Bigr| \sim \dfrac{t_h(E_*)}{2\sin\eta}$. Straightforward norm estimates yield only that $A_1 = O\Bigl(\dfrac{1}{t_h(E_*) \cdot \sin\eta}\Bigr)$. Thus, the norm of λ∗A1 is not necessarily small. Let us show that one has
\[
A_1 = O\Bigl(\frac{1}{\sin\eta}\Bigr), \quad (\phi, \eta) \in R. \tag{4.48}
\]
As A1 ∈ M, it suffices to estimate its coefficients (A1)11 and (A1)12. One computes
\[
(\tilde A_1)_{11} = -a_0 \tilde a^* F + b_0 \tilde b^* F - d\, \tilde a + a_0 \delta_1, \qquad
(\tilde A_1)_{12} = b_0^* \tilde a^* F - a_0^* \tilde b^* F + d\, \tilde b^* - b_0^* \delta_1,
\]
where
\[
\delta_1 = a_0 \tilde a^* + a_0^* \tilde a - b_0 \tilde b^* - b_0^* \tilde b \qquad \text{and} \qquad \tilde A_1 = \check P_0 M_1 P_0.
\]
Note that, by Lemma 4.6 and Lemma 4.8, for η ∈ $V_{(0,\pi)}$, one has d ∼ 1, and F ∼ 2 cos η (as $F = \nu + \nu^* = 2\sqrt{d(E)}\cos\eta$). So, to prove (4.48), it suffices to show that δ1 = O(λ∗). As M is 1-periodic, the equality det M ≡ 1 implies that all the non constant terms of the Fourier series of det M vanish. Therefore, one has
\[
a_0 \tilde a^* + a_0^* \tilde a - b_0 \tilde b^* - b_0^* \tilde b = -\bigl\{ \tilde a \tilde a^* - \tilde b \tilde b^* \bigr\},
\]
where {f} denotes the sum of all non constant terms of the Fourier series of a periodic function f. By (2.17), in R, $\tilde a$ and $\tilde b$ are O(λ∗). This implies that δ1 = O(λ∗), hence, (4.48).
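The identity for δ1 comes from expanding det M term by term. The worked step below is a sketch under the assumption that M has the form $\begin{pmatrix} a & b \\ b^* & a^* \end{pmatrix}$ with $a = a_0 + \tilde a$ and $b = b_0 + \tilde b$, as in Sect. 4.2.3; the overall sign of the right-hand side plays no role in the estimate of δ1.

```latex
\[
\det M = a a^* - b b^*
 = \underbrace{a_0 a_0^* - b_0 b_0^*}_{\text{constant}}
 + \underbrace{a_0 \tilde a^* + a_0^* \tilde a - b_0 \tilde b^* - b_0^* \tilde b}_{=\,\delta_1}
 + \bigl(\tilde a \tilde a^* - \tilde b \tilde b^*\bigr).
\]
% \delta_1 contains only non constant Fourier terms, so keeping the
% non constant part of the identity \det M \equiv 1 gives
\[
0 = \delta_1 + \bigl\{\tilde a \tilde a^* - \tilde b \tilde b^*\bigr\},
\qquad \text{hence} \qquad
\delta_1 = -\bigl\{\tilde a \tilde a^* - \tilde b \tilde b^*\bigr\} = O(\lambda_*^2).
\]
```

In particular δ1 = O(λ∗), which is all that is needed for (4.48).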
Estimating A2 is straightforward; one just uses norm estimates on the matrices composing A2 to get
\[
\|A_2\| = O\Bigl(\frac{1}{\sin\eta}\Bigr). \tag{4.49}
\]
So, we have proved

Lemma 4.9. The function sin η · A is analytic and bounded in (φ, η) ∈ R.

Proof. As A = A1 + A2, the boundedness follows from (4.48) and (4.49), and the analyticity follows from Corollary 4.3. □

4.4.5. Bounded solutions of the monodromy equation. Let us summarize the results obtained in Sect. 4.4.4 to check that one can apply Proposition 11.1 to the "continuous" monodromy equation (4.1) with the monodromy matrix described in Theorem 2.2. We have transformed the "continuous" monodromy equation into
\[
A(\phi + h) = \Bigl( D_0 + \frac{\lambda_*}{\sin\eta}\, \tilde A(\phi) \Bigr) A(\phi) \tag{4.50}
\]
with
\[
D_0 = \operatorname{Diag}(e^{i\eta}, e^{-i\eta}), \qquad \tilde A = \sin\eta\, (A + D_1/\lambda_*), \tag{4.51}
\]
where D1 = D − D0. The matrices D0 and $\tilde A$ belong to M. As det M ≡ 1, one has $\det\bigl(D_0 + \frac{\lambda_*}{\sin\eta}\tilde A\bigr) = 1$. By Corollary 4.4 and Lemma 4.9, the matrix $\tilde A$ is analytic and uniformly bounded in R; so, the norm $\|\tilde A\|_{r,(0,\pi)}$ is bounded by a constant uniformly in ε (for the definition of this norm, see (11.1)). We apply Proposition 11.1 to Eq. (4.50) except near energies E where sin(η(E)) = 0. To this end, we "cut" the ends of the interval η ∈ (0, π). Fix 0 < α < 1/2 and consider the interval $I_\alpha = (\lambda_*^\alpha, \pi - \lambda_*^\alpha)$. On this interval, the Lipschitz norm of the function $\frac{1}{\sin\eta}$ is bounded by $\mathrm{Const} \cdot \lambda_*^{-2\alpha}$, and thus, we can apply Proposition 11.1 with the "effective" coupling constant equal to $\lambda_*^{1-2\alpha}$. As a result, we get

Proposition 4.3. Fix 0 < α < 1/2 and σ < 1. Let σ' = (1 − 2α)σ. If ε is small enough, η is outside an exceptional subset of $I_\alpha$ of measure $O(\lambda_*^{\sigma'/2})$, and h satisfies the Diophantine condition (11.4) with λ = λ∗, then the monodromy equation (2.5) has bounded solutions.

4.4.6. Back to Eq. (1.1). Let us reformulate the conditions of Proposition 4.3 in terms of the initial equation (1.1). We use the notations $\lambda_I$ and S introduced in Theorem 2.4 and in (2.21). We let I = V0 ∩ R.

1. Let us study the set D of values of ε ∈ (0, 1) such that the Diophantine condition (11.4) is satisfied with $h = \frac{2\pi}{\varepsilon}$ mod 1 and λ = λ∗. Let us show that the set D possesses
the property (2.22). Recall that $\lambda_I = \exp(-S/\varepsilon)$, where $S = \min_{E\in I} S(E)$. So, λ∗ ≤ $\lambda_I$, and one has:
\[
\operatorname{mes}((0, \varepsilon) \setminus D) \le \sum_{n = L(\varepsilon)}^{\infty} \int_{H_\sigma(n)} \frac{2\pi}{(h + n)^2}\, dh, \qquad
H_\sigma(n) = \bigcup_{k=1}^{\infty} \bigcup_{l=0}^{k} \Bigl\{ h :\ \min |h - l/k| \le e^{-\frac{\sigma S n}{2\pi}} / k^3 \Bigr\},
\]
where L(ε) is equal to the integer part of 2π/ε. Therefore as, on $I \subset J_\delta^+$, S > δ > 0, we get
\[
\operatorname{mes}((0, \varepsilon) \setminus D) \le C \sum_{n = L(\varepsilon)}^{\infty} \frac{1}{n^2} \exp\Bigl(-\frac{\sigma n S}{2\pi}\Bigr) \le C \varepsilon^2 \lambda_I^{\sigma},
\]
where C denotes different positive constants independent of ε. This implies that the measure of D satisfies estimate (2.22).

2. By (4.43), the length |I∗| of the interval I∗ has the asymptotics $|I_*| \sim \frac{4}{|F'(E_*)|}$. As λ∗ = $t_v/t_h$ is small, by (2.19), one sees that $|I_l|$ has the same asymptotics, i.e. $|I_l|/|I_*| \sim 1$ when ε → 0.

3. Let $E_\infty = E(B)$, where the set B is the union of $(0, \lambda_*^\alpha)$, $(\pi - \lambda_*^\alpha, \pi)$ and the exceptional subset of $I_\alpha$ from Proposition 4.3, and the function E(η) is the inverse of η(E). Clearly, $\operatorname{mes}(E_\infty) = \int_B |E'(\eta)|\, d\eta$. This and estimates (11.6) and (4.47) imply that
\[
\operatorname{mes}(E_\infty) = \frac{1}{|F'(E_*)|} \cdot \bigl( O(\lambda_*^{\sigma'/2}) + O(\lambda_*^{2\alpha}) \bigr) = o(|I_l|).
\]

4. Let us combine points 1, 2 and 3. We see that, if ε is sufficiently small and belongs to the set D, Eq. (4.1) has bounded solutions for E ∈ I∗ outside some subset of I∗ of measure o(|I∗|). This means that the monodromy equation itself has bounded solutions for these values of E. Then, Corollary 2.1 implies that, for these energies, the Lyapunov exponent vanishes. Now, applying the Ishii–Pastur–Kotani Theorem, we see that the measure of the absolutely continuous spectrum of (1.1) situated on I∗ has the same asymptotics as the length of I∗. As $I_l$ contains this spectrum, and as $|I_l|/|I_*| \sim 1$, we get (2.23). As σ can be made as close to 1 as desired, this completes the proof of Theorem 2.4 for the interval I = V0 ∩ R. The theorem clearly remains valid for any fixed subinterval of the interval I and for any finite union of such intervals. This completes the proof of Theorem 2.4. □

4.5. Singular spectrum. Here, we prove Theorem 2.5. To this end, we estimate the Lyapunov exponent of (1.1) on $J_\delta^-$ using the asymptotics (2.17) of the monodromy matrix described in Theorem 2.2. Let $E_0 \in J_\delta^-$, and let V0 be a neighborhood of E0 as in Theorem 2.2. Recall that the asymptotics (2.17) are uniform in E in V0 and in φ in the strip |Im φ| ≤ Y/ε, and that Y > Im ϕ2(E0). Let $Y_1 = \frac{1}{2\pi} S_v(E)$. By point 4 of Lemma 2.1, one has Y1 < Im ϕ2(E) for E ∈ J.
Clearly, there is a constant real neighborhood V1 of E0, where Y − Y1 > δ for some positive constant δ. For E ∈ V1 and for φ in the strip {−Y/ε ≤ Im φ ≤ −(Y − δ)/ε}, the product $t_v e^{2\pi i\phi}$ is exponentially large. As Φ takes real values on J, see Lemma 2.1, the last observation and formula (2.17) imply that, for E ∈ V0 ∩ V1 and −Y/ε ≤ Im φ ≤ −(Y − δ)/ε, one has
\[
a = -\lambda e^{2i\pi(\phi - \phi_0)}(1 + o(1)), \qquad b/a = i + o(1), \qquad a^*/a = o(1), \qquad b^*/a = o(1).
\]
Therefore,
\[
M = \lambda e^{2i\pi(\phi - \phi_0)} \Biggl[ \begin{pmatrix} -1 & i \\ 0 & 0 \end{pmatrix} + o(1) \Biggr].
\]
Recall that λ is exponentially large on $J_\delta^-$. Let E belong to the constant interval $J_0 = J_\delta^- \cap V_0 \cap V_1$. Then, the monodromy matrix satisfies the assumptions of Proposition 10.1, and we get the following estimate for the Lyapunov exponent of the matrix cocycle associated to the pair (M, h): θ(M, h) ≥ log λ + o(1). The Lyapunov exponent Θ(E) for Eq. (1.1) is related to θ(M, h) by Corollary 2.1. Therefore,
\[
\Theta(E) \ge \frac{\varepsilon}{2\pi} \log\lambda + o(\varepsilon).
\]
Expressing the coupling constant in terms of the tunneling actions, $\log\lambda = \frac{1}{\varepsilon}(S_h(E) - S_v(E)) + o(1)$, we obtain (2.24). On the interval $J_0 \subset J_\delta^-$, the leading term in (2.24) is positive, hence, so is the Lyapunov exponent for sufficiently small ε. Therefore, by the Ishii–Pastur–Kotani Theorem, the spectrum of (1.1) situated on J0 is singular. Since E0 can be taken arbitrary in $J_\delta^-$, this completes the proof of Theorem 2.5.

5. Periodic Schrödinger Operators

In this section, we discuss the periodic Schrödinger operator (1.3), where V is a 1-periodic, real valued, $L^2_{loc}$-function. We collect all the results needed for this paper. The proofs of these results as well as more details can be found, for example, in [12, 34, 35, 41].

5.1. Bloch solutions. Let ψ be a solution of the equation
\[
H_0 \psi = -\frac{d^2}{dx^2}\psi(x) + V(x)\psi(x) = E\psi(x), \quad x \in \mathbb{R}, \tag{5.1}
\]
satisfying the relation
\[
\psi(x + 1) = \lambda\, \psi(x), \quad \forall x \in \mathbb{R}, \tag{5.2}
\]
with λ independent of x. Such a solution is called a Bloch solution, and the number λ is called the Floquet multiplier. Let us discuss the analytic properties of Bloch solutions. As in Sect. 2.3, we denote the spectral bands of the periodic Schrödinger equation by [E1 , E2 ], [E3 , E4 ], . . . , [E2n+1 , E2n+2 ], . . . . Consider G± , two copies of the complex plane E ∈ C cut along the spectral zones. Paste them together to get a Riemann surface with square root branch points. We call this Riemann surface G. One can construct a Bloch solution ψ(x, E) of Eq. (5.1) meromorphic on this Riemann surface. It can be normalized by the condition ψ(1, E) ≡ 1. The poles of this solution are located in the spectral gaps. More precisely, each spectral gap contains exactly one simple pole. It is located either on G+ or on G− . The position of the pole is independent of x. Outside of the edges of the spectrum, the two branches of ψ are linearly independent solutions of (5.1). Finally, we note that, in the spectral gaps, both branches of ψ are real valued functions of x, and, on the spectral bands, they differ only by complex conjugation.
5.2. The Bloch quasi-momentum. Consider the Bloch solution ψ(x, E). The corresponding Floquet multiplier λ(E) is analytic on G. Represent it in the form

$$\lambda(E) = \exp(ik(E)). \tag{5.3}$$
The function k(E) is the Bloch quasi-momentum.

5.2.1. The Bloch quasi-momentum as an analytic multi-valued function. The Bloch quasi-momentum is an analytic multi-valued function of E. It has the same branch points as ψ(x, E). Let D be a simply connected domain containing no branch point of the Bloch quasi-momentum. In D, one can fix an analytic single-valued branch of k, say k0. All the other single-valued branches of k that are analytic in E ∈ D are related to k0 by the formulae

$$k_{\pm,l}(E) = \pm k_0(E) + 2\pi l, \quad l \in \mathbb{Z}. \tag{5.4}$$
The Riemann surface of k is more complicated than the one for ψ. However, on the complex plane cut along the spectral gaps of the periodic operator, one can fix a single-valued analytic branch of k.

5.2.2. The main branch of the Bloch quasi-momentum. Consider C0, the complex plane cut along the real line from E1 to +∞. On C0, one can fix a single-valued analytic branch of the quasi-momentum by the condition

$$-ik_0(E) > 0, \quad E < E_1. \tag{5.5}$$
We call k0 the main branch of the Bloch quasi-momentum. The function k0 conformally maps C0 onto the upper half of the complex plane with some vertical slits beginning at the points πl, l ∈ Z, and having finite lengths. It is a bijection. In Fig. 5, we drew two curves in C0 and their images under the transformation E → k0(E).

[Fig. 5. The action of mapping k on some curves]

Consider k0 along the curve γ1. The quasi-momentum k0(E) is real and monotonically increasing along the spectral zones; along the spectral gaps, it takes complex values and its real part is constant; in particular, we have

$$k_0(E_1) = 0, \quad k_0(E_{2l} + i0) = k_0(E_{2l+1} + i0) = \pi l, \quad l = 1, 2, 3, \dots \tag{5.6}$$
If $E_{2n} < E_{2n+1}$, one says that the nth gap is open. Along any open gap, Im k0(E + i0) is a non-constant function with only one non-degenerate maximum. The values of the quasi-momentum k0 on the different sides of the cut E1 < E < +∞ are related to each other by the formula

$$k_0(E + i0) = -\overline{k_0(E - i0)}, \quad E_1 \le E. \tag{5.7}$$
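The properties (5.5), (5.6) and (5.7) can be checked by hand in the free case V ≡ 0. This case is used here only as an illustration: it is degenerate (all gaps are closed, and E1 = 0 is the only branch point), so it is not one of the situations studied in the paper.

```latex
% Free case V \equiv 0 (illustration only): the spectrum of H_0 is [0, +\infty), so E_1 = 0.
k_0(E) = \sqrt{E}, \qquad \operatorname{Im}\sqrt{E} > 0
\quad \text{on } \mathbb{C}_0 = \mathbb{C} \setminus [0, +\infty).
% (5.5): for E < 0 one has k_0(E) = i\sqrt{|E|}, hence
-i k_0(E) = \sqrt{|E|} > 0, \qquad E < 0.
% (5.6): the value at the bottom of the spectrum is
k_0(E_1) = k_0(0) = 0.
% (5.7): on the two sides of the cut, for E \ge 0,
k_0(E + i0) = \sqrt{E}, \qquad k_0(E - i0) = -\sqrt{E}.
```

Both boundary values are real here, so the two sides of the cut differ only by a sign, in agreement with (5.7).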
All the branch points of k0 are of square root type: let El be one of the branch points; then, in a sufficiently small neighborhood of El, k0 is analytic in $\sqrt{E - E_l}$, and

$$k_0(E) - k_0(E_l) = c_l\sqrt{E - E_l} + O(E - E_l), \quad c_l \ne 0. \tag{5.8}$$

The constants $m_l = |c_l|^2/2$ are called the effective masses associated to El.

5.3. Periodic components of the Bloch solution. Let D ⊂ C be a simply connected domain that does not contain any branch point of k. On D, we fix an analytic branch k of the Bloch quasi-momentum. There are two disjoint domains in G, denoted by D±, that project onto D. Define ψ± to be the restrictions of the Bloch solution ψ to D±. The functions ψ± are indexed so that k is the Bloch quasi-momentum of ψ+. The Bloch solutions ψ± can be represented in the form

$$\psi_\pm(x, E) = e^{\pm ik(E)x}\,p_\pm(x, E), \quad E \in D, \tag{5.9}$$
where p±(x, E) are functions periodic in x,

$$p_\pm(x + 1, E) = p_\pm(x, E), \quad \forall x \in \mathbb{R}. \tag{5.10}$$
We call p± the periodic components of ψ± with respect to the branch k of the Bloch quasi-momentum. One has

$$\int_0^1 p_+(t, E)\,p_-(t, E)\,dt = \int_0^1 \psi_+(t, E)\,\psi_-(t, E)\,dt = -ik'(E)\,w(E), \quad E \in D, \tag{5.11}$$

where w(E) = w(ψ+(x, E), ψ−(x, E)), and w(f(x), g(x)) = f′(x)g(x) − g′(x)f(x).

6. Main Theorem of the Complex WKB Method

In this section, we describe the complex WKB method for adiabatically perturbed periodic Schrödinger equations. The reader can find more details and the proofs of the results of this section in [19]. This method was developed for the asymptotic study of equations of the form

$$-\frac{d^2\psi}{dx^2}(x) + (V(x) + W(\varepsilon x))\psi(x) = E\psi(x), \quad x \in \mathbb{R}, \tag{6.1}$$
where V is a real valued 1-periodic function of x, and ε is a small parameter (fixed positive number). This method is designed to study exponentially small effects due to the complex tunneling. Usually, these effects are measured by exponentially small coefficients of certain transition matrices (scattering matrices, monodromy matrices, etc.) relating two distinguished bases of solutions. To calculate such exponentially small terms, one assumes that W is analytic and introduces an additional parameter ϕ into Eq. (6.1) so that it becomes

$$-\frac{d^2\psi}{dx^2}(x) + (V(x) + W(\varepsilon x + \varphi))\psi(x) = E\psi(x), \quad x \in \mathbb{R}. \tag{6.2}$$
The idea is that the terms that are exponentially small when ϕ is real may become dominant for complex values of ϕ, and that, having computed these terms for complex values of ϕ, one can try to recover their values for ϕ on the real axis. Of course, to realize these ideas, one has to have good enough control of the dependence on ϕ of the solutions of Eq. (6.2). There is no equation controlling this dependence, but there is a natural condition which replaces such an equation. We say that two solutions (ψ±) of (6.2) form a consistent basis if their Wronskian is independent of ϕ and if

$$\psi_\pm(x + 1, \varphi) = \psi_\pm(x, \varphi + \varepsilon), \quad \forall \varphi. \tag{6.3}$$
Equation (6.3) is called the consistency condition. To clarify this condition, we pass to the variables φ = ϕ/ε and t = x + φ. In terms of these variables, (6.2) takes the form

$$-\frac{d^2\tilde\psi}{dt^2} + (V(t - \phi) + W(\varepsilon t))\tilde\psi = E\tilde\psi, \tag{6.4}$$
and (6.3) just becomes the 1-periodicity of $\tilde\psi$ in φ, i.e.

$$\tilde\psi_\pm(t, \phi + 1) = \tilde\psi_\pm(t, \phi). \tag{6.5}$$
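The passage from the consistency condition (6.3) to the periodicity (6.5) can be checked in one line. The sketch below assumes that $\tilde\psi$ denotes ψ written in the new variables, that is $\tilde\psi(t, \phi) = \psi(t - \phi, \varepsilon\phi)$; this identification is implied by the change of variables but is not written out in the text.

```latex
% Potential term: with x = t - \phi and \varphi = \varepsilon\phi,
V(x) + W(\varepsilon x + \varphi) = V(t - \phi) + W(\varepsilon t).
% Periodicity in \phi, using (6.3) with x = t - \phi - 1 and \varphi = \varepsilon\phi:
\tilde\psi_\pm(t, \phi + 1)
  = \psi_\pm\bigl(t - \phi - 1,\ \varepsilon\phi + \varepsilon\bigr)
  = \psi_\pm\bigl((t - \phi - 1) + 1,\ \varepsilon\phi\bigr)
  = \psi_\pm\bigl(t - \phi,\ \varepsilon\phi\bigr)
  = \tilde\psi_\pm(t, \phi).
```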
This condition plays a crucial role for the asymptotic analysis of (6.2).

Remark 6.1. Note that if W = α cos(·), then Eq. (6.4) is nothing but Eq. (1.1), and condition (6.5) is precisely the condition (2.1) that we required to define the monodromy matrices. We use the complex WKB method to calculate the Fourier coefficients of the monodromy matrices.

Now, following [19], we shall describe the main constructions of the complex WKB method. Below, we assume that $V \in L^2_{loc}$, and that W is analytic in a neighborhood D(W) of the real line.

6.1. Complex momentum. The central analytic object of the complex WKB method is the complex momentum κ(ϕ). It is defined in terms of the Bloch quasi-momentum of the operator (5.1) by the formula

$$\kappa(\varphi) = k(E - W(\varphi)) \tag{6.6}$$

in D(W), the domain of analyticity of the function W.
The complex momentum κ is a multi-valued analytic function. Its branch points are related to the branch points of the quasi-momentum by the relations

$$E_l = E - W(\varphi), \quad l \in \mathbb{N}, \tag{6.7}$$
where El are the ends of the spectral zones of the operator H0. Let D be a simply connected domain containing no branch points of κ. Then, in D, one can fix an analytic branch κ0 of this function. By (5.4), all the other analytic branches are described by the formulas

$$\kappa_m^\pm = \pm\kappa_0 + 2\pi m, \tag{6.8}$$
where ± and m are indexing the branches.

6.2. Canonical domains. The notion of canonical domain is the main geometric notion of the complex WKB method. Throughout the paper, we say that a set is regular if it is in the domain of analyticity of W and contains no branch points of the complex momentum. We use the terms: regular domain, regular curve, regular point. When speaking about regular curves, we suppose in addition that they are smooth and connected. A piecewise smooth, connected curve γ ⊂ C is called vertical if it intersects the lines {Im ϕ = Const} at non-zero angles θ,

$$0 < \theta < \pi. \tag{6.9}$$
Thus, vertical lines are naturally parameterized by Im ϕ. Let γ be a regular, vertical curve. On γ, fix a continuous branch of the momentum κ. The curve γ is said to be canonical if, along γ,
• $\mathrm{Im}\int^{\varphi}\kappa\,d\varphi$ is strictly increasing with Im ϕ;
• $\mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi$ is strictly decreasing with Im ϕ.

From now on, in this section, ϕ1 and ϕ2 are two regular points such that Im ϕ1 < Im ϕ2.

Definition 6.1. Let K ⊂ D(W) be a regular, simply connected domain. On K, fix a continuous branch of the quasi-momentum, say κ. The domain K is called canonical if it is the union of curves canonical with respect to κ and connecting ϕ1 and ϕ2 located on ∂K.

Note that the boundary of K may contain branch points of the complex momentum.

6.3. Canonical Bloch solutions. Consider the periodic Schrödinger equation

$$-\frac{d^2\psi}{dx^2}(x) + V(x)\psi(x) = \mathcal{E}\psi(x), \quad \mathcal{E} = E - W(\varphi), \quad x \in \mathbb{R}. \tag{6.10}$$

Here, ϕ plays the role of a parameter. We need two linearly independent Bloch solutions of (6.10) that are analytic in ϕ in a given canonical domain. This property defines these solutions uniquely up to an analytic, non-vanishing factor depending on ϕ. We construct these solutions in terms of ψ(x, E), the Bloch solution of (5.1) meromorphic on the Riemann surface G (see Sect. 5.1), and its Bloch quasi-momentum k(E). Recall that k is defined modulo 2π. Pick $\varphi^0$, a regular point such that $k'(E - W(\varphi^0)) \ne 0$. Begin with a local construction in $V^0$, a small enough neighborhood of $\varphi^0$. One has,
• in $V^0$, we fix an analytic branch of κ(ϕ), the complex momentum;
• there are two different branches of the function ψ(x, E − W(ϕ)) meromorphic in ϕ ∈ $V^0$; they are linearly independent Bloch solutions of Eq. (6.10); we denote them by ψ±(x, ϕ) so that their respective Bloch quasi-momenta are equal to ±κ(ϕ);
• in $V^0$, fix an analytic branch of the function

$$q(\varphi) = \sqrt{k'(E - W(\varphi))}. \tag{6.11}$$

Consider p±(x, ϕ), the periodic components of the Bloch solutions ψ±(x, ϕ). One has

$$\psi_\pm(x, \varphi) = e^{\pm i\kappa(\varphi)x}\,p_\pm(x, \varphi), \quad p_\pm(x + 1, \varphi) = p_\pm(x, \varphi), \quad x \in \mathbb{R}. \tag{6.12}$$

Let

$$\omega_\pm(\varphi) = -\frac{\int_0^1 p_\mp(x, \varphi)\,\frac{\partial p_\pm}{\partial\varphi}(x, \varphi)\,dx}{\int_0^1 p_+(x, \varphi)\,p_-(x, \varphi)\,dx}. \tag{6.13}$$

The functions

$$A_\pm(x, \varphi) = q(\varphi)\,e^{\int^{\varphi}\omega_\pm(\varphi)\,d\varphi}\,\psi_\pm(x, \varphi) \tag{6.14}$$
are called the canonical Bloch solutions. In Sect. 3 of [19], we have proved

Lemma 6.1. The canonical Bloch solutions are analytic in any regular simply connected domain D containing $\varphi^0$.

Normalization point. We normalize the canonical Bloch solutions by integrating in (6.14) from a given point ϕ0, located in D or on its boundary. The point ϕ0 is called the normalization point.

Indexing of the canonical Bloch solutions. In D, fix κ1, a continuous branch of the complex momentum. The canonical Bloch solutions A± can be indexed by ± so that κ1(ϕ) is the Bloch quasi-momentum corresponding to the solution A+(x, ϕ).

The Wronskian of the canonical Bloch solutions. Here, we compute the Wronskian of A± normalized at ϕ0. We prove

Lemma 6.2. One has

$$w(A_+, A_-) = q^2(\varphi_0)\,w(\psi_+(x, \varphi_0), \psi_-(x, \varphi_0)) = i\int_0^1 \psi_+(x, \varphi_0)\,\psi_-(x, \varphi_0)\,dx.$$
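As a consistency check of Lemma 6.2, one can work out the free case V ≡ 0 by hand. This is only an illustration under stated assumptions: the case is degenerate and not treated in the paper, the Bloch solutions are taken as ψ±(x, E) = e^{±ikx} with k = √E (which differs from the normalization of Sect. 5.1 by constant factors whose product is 1), and the Wronskian convention is w(f, g) = f′g − g′f, as in (5.11).

```latex
% Free case V \equiv 0 (illustration only).
\kappa(\varphi) = \sqrt{E - W(\varphi)}, \qquad
p_\pm \equiv 1, \qquad \omega_\pm \equiv 0, \qquad
q^2(\varphi) = k'(E - W(\varphi)) = \frac{1}{2\kappa(\varphi)},
% so the canonical Bloch solutions carry the classical WKB amplitude:
A_\pm(x, \varphi) = \frac{1}{\sqrt{2\kappa(\varphi)}}\; e^{\pm i\kappa(\varphi)x}.
% Wronskians, with w(f, g) = f'g - g'f:
w(\psi_+, \psi_-) = i\kappa + i\kappa = 2i\kappa, \qquad
w(A_+, A_-) = q^2 \cdot 2i\kappa = i = i\int_0^1 \psi_+\psi_-\,dx .
```

Both representations of the lemma give the same value i, independently of ϕ.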
The representation of the Wronskian given in Lemma 6.2 shows that, for any regular ϕ0 such that $q^2(\varphi_0) = k'(E - W(\varphi_0)) \ne 0$, the canonical Bloch solutions are linearly independent. Note that, in this case, neither A+ nor A− can be identically zero for any given ϕ ∈ D. The second representation given by Lemma 6.2 allows us to choose as ϕ0 a branch point of the complex momentum. In this case, one has

$$w(A_+, A_-) = i\int_0^1 |\psi_+(x, \varphi_0)|^2\,dx \ne 0. \tag{6.15}$$
Indeed, at the ends of any spectral gap, the branches of the Bloch solution ψ(x, E) coincide and are real valued in x.

Proof of Lemma 6.2. The definition of the canonical Bloch solutions and of the functions ω± imply

$$w(A_+, A_-) = q^2(\varphi)\,\exp\left(-\int_{\varphi_0}^{\varphi}\frac{\partial}{\partial\varphi}\log\int_0^1 p_+(x, \varphi)\,p_-(x, \varphi)\,dx\,d\varphi\right) w(\psi_+(x, \varphi), \psi_-(x, \varphi)).$$

Relation (5.11) then gives both representations for the Wronskian given in Lemma 6.2. □
6.4. The main theorem of the WKB method. Now, we can formulate the main theorem of the complex WKB method for (6.2):

Theorem 6.1 ([19]). Fix X > 0. Fix E = E0 ∈ C. Let K be a bounded canonical domain for the family of Eqs. (6.2), and let κ be the branch of the complex momentum with respect to which K is canonical. For sufficiently small positive ε, there exists a consistent basis (f±) defined for x ∈ R and ϕ ∈ K and having the following properties:
• For any fixed x ∈ R, the functions f±(x, ϕ) are analytic in ϕ ∈ K.
• For −X ≤ x ≤ X and ϕ ∈ K, the functions f±(x, ϕ) have the asymptotic representations

$$f_\pm(x, \varphi) = e^{\pm\frac{i}{\varepsilon}\int^{\varphi}\kappa\,d\varphi}\,(A_\pm(x, \varphi) + o(1)), \quad \varepsilon \to 0. \tag{6.16}$$

Here, A± are the canonical Bloch solutions corresponding to the domain K, indexed so that κ(ϕ) is the Bloch quasi-momentum corresponding to the solution A+(x, ϕ).
• The error estimates in (6.16) may be differentiated once in x. Moreover, they are uniform in x ∈ [−X, X] and locally uniform in ϕ in the interior of K.

This theorem was proved in [19]. In the sequel, we normalize the solutions f± in (6.14) and in (6.16) by integrating from a point ϕ0 located either in K or on its boundary. When we need to indicate the point ϕ0 explicitly, we write f±(x, ϕ, ϕ0). One easily calculates the Wronskian of the solutions f±(x, ϕ, ϕ0):

$$w(f_+, f_-) = w(A_+, A_-) + o(1). \tag{6.17}$$
By Lemma 6.2, the solutions f± are linearly independent if ϕ0 is regular and $k'(E - W(\varphi_0)) \ne 0$, or if ϕ0 is one of the branch points of κ.

6.5. Dependence on the spectral parameter and admissible subdomains. To simplify the statement of Theorem 6.1, we have not considered the dependence of the solutions on the spectral parameter E. To describe this dependence, we introduce the notion of admissible subdomains of a canonical domain. Let K be a compact canonical domain and pick δ > 0. Let A be the complement in K of the δ-neighborhood of ∂K, the boundary of K. The set A is the δ-admissible subdomain of the canonical domain K. One proves
Proposition 6.1. In the setting of Theorem 6.1, the solutions f± are analytic in E in V0, a neighborhood of E0. For A, an admissible subdomain of K, there exists VA ⊂ V0, a neighborhood of E0, such that the asymptotics (6.16) is uniform in (ϕ, E, x) ∈ A × VA × [−X, X]. It can be differentiated once in x without losing its uniformity properties.

In [19], we have not discussed explicitly the dependence of the solutions f± on the spectral parameter E. In the proof of Theorem 6.1, all the estimates are locally uniform in E. To prove Proposition 6.1, one just has to follow the proof of Theorem 6.1 given in [19], keeping in mind the following additional observation:

Lemma 6.3. Let γ be a compact canonical curve for a given value E0 of the spectral parameter E. Then, there exist V1 ⊂ C, a neighborhood of E0, and a constant δ > 0 such that, for any E ∈ V1, the curve γ remains canonical, and along γ, one has

$$\frac{d}{dy}\,\mathrm{Im}\int^{\varphi}\kappa\,d\varphi > \delta \quad\text{and}\quad \frac{d}{dy}\,\mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi < -\delta, \quad y = \mathrm{Im}\,\varphi, \tag{6.18}$$

uniformly in E ∈ V1.

Proof. Since γ is canonical and compact, for E = E0, along γ, one has

$$\frac{d}{dy}\,\mathrm{Im}\int^{\varphi}\kappa\,d\varphi > \delta_0 \quad\text{and}\quad \frac{d}{dy}\,\mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi < -\delta_0, \quad y = \mathrm{Im}\,\varphi,$$

for some positive constant δ0. The branch points of the complex momentum depend continuously on the spectral parameter. So, γ being regular and compact, there exists a closed neighborhood of E0, say V0 ⊂ C, such that, for all E ∈ V0, the branch points of the complex momentum stay at a positive distance d from γ. The two derivatives in (6.18) depend continuously on (ϕ, E) ∈ γ × V0. So, for any constant δ < δ0, there is V1 ⊂ V0, a neighborhood of E0, such that, for E ∈ V1, one obtains the estimates (6.18). This completes the proof of Lemma 6.3. □

6.6. How to use the complex WKB method. Let us outline the ideas we use when applying the complex WKB method. Having described canonical domains and the corresponding bases of solutions with the standard asymptotic behavior, one tries to relate the bases corresponding to different canonical domains.
Pick two canonical domains $K^{1,2}$ and the associated bases of solutions constructed in Theorem 6.1; denote them by $(f_\pm^{1,2})$. Equation (6.2) being linear of second order, there exists a transfer matrix T12, depending only on ϕ, such that

$$\begin{pmatrix} f_+^2 \\ f_-^2 \end{pmatrix} = T_{12}(\varphi)\begin{pmatrix} f_+^1 \\ f_-^1 \end{pmatrix}, \quad \varphi \in K^1 \cap K^2.$$

The consistency condition (2.1) implies that T12 is ε-periodic in ϕ. Moreover, $(f_\pm^{1,2})$ being analytic in ϕ, T12 is also analytic. In $K^1 \cap K^2$, we can use the asymptotics (6.16) for both bases to obtain the asymptotics of T12. Using the ε-periodicity of T12, we see that these asymptotics are valid in a horizontal strip containing $K^1 \cap K^2$. Now, assume that we know the asymptotics of T12 near the boundary of a complex strip containing R. Then, we can easily get the asymptotics of the Fourier coefficients of T12 corresponding to the terms of the Fourier
series which are large at this boundary. This enables us to control the Fourier series terms of T12 which are of order exp(−Const/ε) on the real line. The consistency condition directly relates the x-behavior of the solution to its ϕ-behavior. This means that one can express the transition matrices appearing in the spectral study of the equation family (6.2) (for example, monodromy matrices) as products of transfer matrices.

6.7. Canonical domains. We finish the section on the complex WKB method by describing a simple general approach to “constructing” canonical domains. We use this approach in this paper. Below, we assume that D is a regular simply connected domain and that κ is a branch of the complex momentum analytic in D. The main elements of the construction are the pre-canonical lines. Pre-canonical lines are made of elementary lines that we discuss now.

6.7.1. Lines of Stokes type. Let γ ⊂ D be a smooth curve. We say that γ is a line of Stokes type with respect to κ if, along γ, either

$$\mathrm{Im}\int^{\varphi}\kappa\,d\varphi = \mathrm{Const} \quad\text{or}\quad \mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi = \mathrm{Const}.$$

6.7.2. Canonical lines. We identify C and R² in the usual way. For ϕ ∈ D, denote by S(ϕ) ⊂ C the cone situated between the vectors $\overline{\kappa(\varphi)}$ and $\overline{\kappa(\varphi)} - \pi$ and such that, for any vector z ∈ S(ϕ), one has Im(κ(ϕ)z) > 0 and Im((κ(ϕ) − π)z) < 0. The definition of a canonical line implies that a smooth vertical curve γ ⊂ D is canonical with respect to κ if and only if, for all ϕ ∈ γ, the vector t(ϕ) tangent to γ and oriented upwards belongs to S(ϕ).

6.7.3. Pre-canonical lines. Let γ ⊂ D be a vertical curve. The curve γ is a pre-canonical line if it consists of a finite union of bounded segments of either canonical lines or lines of Stokes type. One of the main properties of pre-canonical lines is that they can be used to construct canonical lines.

Proposition 6.2. Let γ be a pre-canonical curve. Denote the ends of γ by ϕa and ϕb.
For V ⊂ D, a neighborhood of γ, and Vϕa ⊂ D, a neighborhood of ϕa, there exists a canonical line in V connecting the point ϕb to a point in Vϕa.

Proof. The proof of Proposition 6.2 is done in two steps. The pre-canonical curve γ can contain segments of lines of Stokes type. In the first step, we “replace” these segments by nearby segments of canonical curves to prove that, near a given pre-canonical curve, there is a pre-canonical curve consisting only of segments of canonical curves. This curve is not yet canonical, since canonical curves are smooth. So, in the second step, we smooth out the constructed pre-canonical curve to get a canonical one. If the given pre-canonical curve contains no segments of lines of Stokes type, we do not need to make the first step. Otherwise, we use the following lemma:

Lemma 6.4. Let β be either a bounded vertical line of Stokes type or a bounded canonical line. Denote the ends of β by ϕa and ϕb. Let V ⊂ D be a neighborhood of β. For Vϕa ⊂ V, a neighborhood of ϕa, there exists Vϕb ⊂ V, a neighborhood of ϕb, such that, for each point ϕ2 ∈ Vϕb, there is a point ϕ1 ∈ Vϕa connected to ϕ2 by a curve canonical with respect to κ and staying in V.
Proof. If β is a canonical line, the result follows from the fact that any line C¹-close to β is also canonical. Let us turn to the case of a line of Stokes type. Assume that $\mathrm{Im}\int^{\varphi}\kappa\,d\varphi = \mathrm{Const}$ along β. The case when $\mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi = \mathrm{Const}$ along β is treated similarly. On V, define $I(\varphi) = \int_{\varphi_a}^{\varphi}\kappa\,d\varphi$, X(ϕ) = Re[I(ϕ)], and Y(ϕ) = Im[I(ϕ)]. Note that, along β, Y(ϕ) = 0, and I(β) is an interval [a, b] of R. As V does not contain any branch point of κ, we know that dI/dϕ = κ does not vanish in V. Moreover, along β, X(ϕ) is a strictly monotonous function of Im ϕ. This implies that there exist V′ ⊂ C, a neighborhood of β, and U ⊂ C, a neighborhood of the interval [a, b] ≡ X(β) ⊂ R, such that

$$\Phi : V' \to U, \quad \Phi(\varphi) = X(\varphi) + iY(\varphi),$$

is a diffeomorphism. Assume that X is increasing along β (the case where X is decreasing is treated similarly). Consider the smooth curves α ⊂ U along which Y is a monotonously increasing function of X,

$$Y = F_\alpha(X), \quad F_\alpha'(X) > 0, \quad X \in [a, b], \quad\text{and}\quad \|F_\alpha\|_{C^1([a,b])} \le \delta.$$

Along the curves $\Phi^{-1}(\alpha)$, the function $\mathrm{Im}\int^{\varphi}\kappa\,d\varphi$ is strictly increasing. As δ → 0, the curves $\Phi^{-1}(\alpha)$ are C¹-close to β. As β is vertical and as the function $\mathrm{Im}\int_{\varphi_a}^{\varphi}\kappa(\varphi)\,d\varphi$ is constant along β, $\mathrm{Im}\int_{\varphi_a}^{\varphi}(\kappa(\varphi) - \pi)\,d\varphi$ is decreasing along β. So, if δ is small enough, the curves $\Phi^{-1}(\alpha)$ are vertical, and $\mathrm{Im}\int^{\varphi}(\kappa - \pi)\,d\varphi$ is decreasing along $\Phi^{-1}(\alpha)$. Thus, the curves $\Phi^{-1}(\alpha)$ are canonical. This implies the statement of Lemma 6.4. □

Lemma 6.4 allows us to replace any given pre-canonical line with a neighboring pre-canonical line consisting only of segments of vertical canonical lines. This new pre-canonical curve begins at ϕa, one of the ends of β, and it connects it in V with a point in any given neighborhood Vb of ϕb, the other end of β.

To make the second step of the proof of Proposition 6.2, we have to smooth out the pre-canonical line $\hat\beta$ obtained in the first step. In fact, we do it only in small neighborhoods of the common ends of the segments of canonical lines making up $\hat\beta$. Let ϕ0 be one of these points, and let $\gamma_j \subset \hat\beta$, j = 1, 2, be the segments of canonical lines with the common end ϕ0. As in Sect. 6.7.2, define the cones S(ϕ) for ϕ ∈ V. There exists V0, a sufficiently small neighborhood of ϕ0, such that $S_0 \equiv \cap_{\varphi \in V_0} S(\varphi) \ne \emptyset$ and such that, in V0, both the tangent vectors to γ1 and the tangent vectors to γ2 belong to S0. We replace the part of $\hat\beta$ in V0 by a smooth curve so that the new curve $\hat\beta$ is smooth and the vectors tangent to this curve in V0 are in S0. We carry out this smoothing for all the common ends of the segments of canonical lines making up $\hat\beta$. The smoothed curve $\hat\beta$ is canonical. This completes the proof of Proposition 6.2. □

By Proposition 6.2, we can construct a canonical curve arbitrarily close to a given pre-canonical curve. But, in general, the ends of these curves cannot coincide. Let us single out one case where these curves can have common ends:

Lemma 6.5. Let γ be a pre-canonical curve.
Assume that it begins with γa and ends with γb , both segments of canonical curves. Denote by ϕa and ϕb those ends of these segments which are internal points of γ ordered so that Im ϕa < Im ϕb . Then, for any positive δ, there is a pre-canonical curve situated in the δ-neighborhood of γ , and containing γa and γb without the δ-neighborhood of ϕa and the δ-neighborhood of ϕb . Proof. If the curve γ consists only of segments of canonical curves, then one proves the lemma just by smoothing out the curve γ as in the proof of Proposition 6.2. Otherwise,
we apply Proposition 6.2 to the pre-canonical curve γ′, where γ′ is the part of γ between ϕa and ϕb. This allows us to construct a canonical curve γ″ arbitrarily close to γ′ and connecting ϕa to a point ϕ′b arbitrarily close to ϕb. But, if this point is sufficiently close to ϕb, then we can use the C¹-stability of canonical curves and slightly deform γb in the δ-neighborhood of ϕb to obtain a new canonical curve γ′b that ends at ϕ′b. Now, it suffices to smooth out the pre-canonical curve γa ∪ γ″ ∪ γ′b. This completes the proof of Lemma 6.5. □

6.8. Enclosing canonical domains. Let γ ⊂ D be a line canonical with respect to κ. Denote by ϕa and ϕb the ends of γ so that Im ϕa < Im ϕb. Let a domain K ⊂ D be a canonical domain corresponding to the triple ϕa, ϕb, κ. If γ ⊂ K, then K is called a canonical domain enclosing γ. As any line C¹-close to γ is canonical, one can always construct a canonical domain enclosing any given canonical curve. We shall call such canonical domains local. In practice, one constructs canonical domains by means of

Proposition 6.3. Let γ be a canonical line with respect to κ. Assume that K ⊂ D is a simply connected domain containing γ (without its ends). The domain K is a canonical domain enclosing γ if it is the union of pre-canonical lines each of which is obtained from γ by replacing some of its internal segments by a line pre-canonical with respect to κ.

Proof. It suffices to prove that, for any ϕ ∈ K, there is a pre-canonical curve β′ lying in K, connecting ϕ′d and ϕ′u, two internal points of γ, such that Im ϕ′u > Im ϕ′d and such that ϕ is an internal point of a segment of β′ which is a canonical line. Indeed, consider the pre-canonical curve going, first, along γ from ϕa to ϕ′d, then, along β′ from ϕ′d to ϕ′u, and, at last, along γ from ϕ′u to ϕb. Consider the segments of this curve from ϕa to ϕ and from ϕ to ϕb. The proof of Proposition 6.3 is then obtained by applying Lemma 6.5 (with δ small enough) to each of these segments.

We know that there is a pre-canonical curve β containing ϕ and connecting two internal points of γ. Let us modify this curve to construct the curve β′. The point ϕ divides β into two pre-canonical curves βu and βd connecting ϕ respectively to the points ϕu and ϕd of γ. We assume that βd is below βu. Begin with describing the necessary transformation of βu. We can continue βu somewhat beyond the point ϕu so that the new line remains pre-canonical. For this new line, we keep the old notation. By Proposition 6.2, in any neighborhood of βu, there is a canonical line β′u beginning at ϕ and ending in any given neighborhood of the other end of βu. We can choose these two neighborhoods sufficiently small so that β′u ⊂ K and so that β′u intersects γ at, say, ϕ′u. By construction, one has Im ϕ < Im ϕ′u. Similarly, starting from βd, we construct a canonical line β′d ⊂ K connecting ϕ to a point ϕ′d ∈ γ situated below ϕ. The pre-canonical line β′d ∪ β′u would be the one we need, but ϕ is the common end of the two canonical lines β′u and β′d. To finish the proof, we deform these curves so that ϕ becomes an internal point of one of them. To this end, we continue β′u somewhat beyond ϕ so that the new curve, still denoted by β′u, remains canonical. Then, using the C¹-stability of the canonical lines, we slightly deform β′d to a new canonical curve, still denoted by β′d, connecting ϕ′d to a point of β′u situated somewhat below ϕ. The pre-canonical line going along β′d from ϕ′d to β′u, and then along β′u to ϕ′u, is the one we need. This completes the proof of Proposition 6.3. □
7. Constructions of the Complex WKB Method for Eq. (1.1)

To use the complex WKB method, we rewrite (1.1) in terms of the variables u = x − φ and ϕ = εφ,

$$-\frac{d^2\psi}{du^2}(u) + (V(u) + \alpha\cos(\varepsilon u + \varphi))\psi(u) = E\psi(u), \quad u \in \mathbb{R}. \tag{7.1}$$
We describe the complex momentum and canonical domains for (7.1); in Sect. 8, these are used to compute the monodromy matrix.
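The change of variables leading to (7.1) can be written out explicitly. The sketch below assumes, as Remark 6.1 indicates, that Eq. (1.1) has the form (6.4) with W = α cos(·), the independent variable being called x; the exact statement of (1.1) is given in the introduction of the paper and is not reproduced here.

```latex
% Assumed form of (1.1), following Remark 6.1:
-\frac{d^2\psi}{dx^2} + \bigl(V(x - \phi) + \alpha\cos(\varepsilon x)\bigr)\psi = E\psi .
% Substituting x = u + \phi and \varphi = \varepsilon\phi:
V(x - \phi) = V(u), \qquad
\alpha\cos(\varepsilon x) = \alpha\cos(\varepsilon u + \varepsilon\phi) = \alpha\cos(\varepsilon u + \varphi),
% and d/dx = d/du, which gives (7.1), i.e. (6.2) with W = \alpha\cos(\cdot).
```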
7.1. Complex momentum. Properties of κ depend on the value of the spectral parameter E and on the lengths of the spectral zones and the spectral gaps of the periodic operator H0. Recall that we assume condition (2.16) to hold. Some general properties of the complex momentum were discussed in Sect. 6.1. In Sect. 2.5, we have stated some properties of the set of branch points of the complex momentum for Eq. (1.1). These properties are obvious consequences of the description of the Bloch quasi-momentum given in Sect. 5.2. Consider the strip {|Re ϕ| < π}. Cut it from −π to −ϕ1 and from ϕ1 to π. Denote the domain thus obtained by D0. It is regular and simply connected. In D0, one can fix a single-valued branch of the complex momentum by the conditions

$$\mathrm{Im}\,\kappa_0(0) > 0, \quad \mathrm{Re}\,\kappa_0(0) = 0. \tag{7.2}$$
Indeed, the domain D0 is mapped by E : ϕ → E − α cos ϕ onto the upper half of the complex plane. The branch κ0 is related to k0, the main branch of the Bloch quasi-momentum, by the formula

$$\kappa_0(\varphi) = k_0(E - \alpha\cos\varphi). \tag{7.3}$$
The branch κ0 (defined on D0) is called the main branch of the complex momentum. We now describe four properties of the main branch of the complex momentum; they follow from (7.3) and from the properties of k0 described in Sect. 5.2.2. The first two properties are

$$\kappa_0(-\varphi) = \kappa_0(\varphi), \tag{7.4}$$

$$\kappa_0(\overline{\varphi}) = -\overline{\kappa_0(\varphi)}. \tag{7.5}$$
Note that (7.4) is also valid for complex values of E. The domain {ϕ ∈ D0; Re ϕ ≥ 0} is mapped by κ0 onto the upper half plane with finite vertical cuts starting at the points nπ, n ∈ Z (see Fig. 6; in this figure, we have drawn two curves on the complex plane ϕ ∈ C, and their images by κ0(ϕ)). Finally, we consider the 2π-periodic curve γ going along the real line around the branch points as shown in Fig. 7. Continuing κ0 analytically along this curve, we get

$$\kappa_0(\varphi + 2\pi) = \kappa_0(\varphi), \quad \varphi \in \gamma. \tag{7.6}$$
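The normalization (7.2) and the symmetry of κ0 under ϕ → −ϕ and under complex conjugation of ϕ can be checked numerically in the free case V ≡ 0. The sketch below is only an illustration: the free case is degenerate (all gaps are closed and E1 = 0) and does not satisfy the assumptions of the paper. The names `k0_free` and `kappa0_free` and the parameter values are hypothetical. The main branch of √z with positive imaginary part and cut along [0, +∞) is written as i·sqrt(−z), using the principal square root of Python's `cmath` module.

```python
import cmath


def k0_free(z):
    """Main branch of the quasi-momentum for V = 0: sqrt(z) with Im >= 0."""
    # The principal sqrt has Re >= 0, so 1j * sqrt(-z) has Im >= 0
    # and its branch cut lies along the positive real axis of z.
    return 1j * cmath.sqrt(-z)


def kappa0_free(phi, E, alpha):
    """Complex momentum (7.3) in the free case: k0(E - alpha*cos(phi))."""
    return k0_free(E - alpha * cmath.cos(phi))


if __name__ == "__main__":
    E, alpha = 0.5, 1.0      # real E with E - alpha < E_1 = 0
    phi = 0.3 + 0.7j
    # (7.2): kappa0(0) is purely imaginary with positive imaginary part.
    print(kappa0_free(0.0, E, alpha))
    # Evenness in phi: this difference should vanish up to rounding.
    print(kappa0_free(-phi, E, alpha) - kappa0_free(phi, E, alpha))
    # Conjugation symmetry for real E: this sum should vanish up to rounding.
    print(kappa0_free(phi.conjugate(), E, alpha)
          + kappa0_free(phi, E, alpha).conjugate())
```

The last quantity is κ0 evaluated at the conjugate point plus the conjugate of κ0(ϕ); it vanishes only for real E, whereas the evenness in ϕ holds for complex E as well.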
[Fig. 6. The action of the mapping κ]

[Fig. 7. The period γ]
7.2. The Stokes lines. The definition of the Stokes lines is fairly standard (see e.g. [15]). The integral $\int\kappa\,d\varphi$ has the same branch points as the complex momentum. Let ϕ0 be one of them. Consider the curves beginning at ϕ0 described by

$$\mathrm{Im}\int_{\varphi_0}^{\varphi}(\kappa(\xi) - \kappa(\varphi_0))\,d\xi = 0. \tag{7.7}$$
These curves are the Stokes lines beginning at ϕ0. It follows from (6.8) that the Stokes line definition is independent of the choice of the branch of κ in (7.7). As the branch points of the complex momentum are of square root type, exactly three Stokes lines begin at any branch point. At this point, the angle between any two neighboring Stokes lines is equal to 2π/3. These Stokes lines may be finite: they connect pairs of finite branch points; they may also be infinite: they go from finite branch points to infinity. If ϕ0 is a branch point for κ, so is the point ϕ0 + 2π. The Stokes lines starting at ϕ0 + 2π are just the 2π-translates of the Stokes lines starting at ϕ0. Furthermore, the whole picture of the Stokes lines in the domain 0 ≤ Re ϕ ≤ 2π is symmetric with respect to the real line as well as with respect to the line Re ϕ = π. This is a consequence of the symmetries of the cosine. In Fig. 8, we have shown some of the Stokes lines; they are represented by dotted lines. Let us discuss them briefly.

[Fig. 8. The Stokes lines]

Consider the Stokes lines beginning at the branch point ϕ1. As κ0 is real on the interval [ϕ1, π], this interval is a part of the Stokes line beginning at the point ϕ1. There are two other Stokes lines beginning at this point. One of them is going upwards. We denote it by “a”. Consider the Stokes lines beginning at the branch point ϕ2. As κ0 − π is purely imaginary on the segment [ϕ2, ϕ3] of the line π + iR, this segment coincides with the Stokes line connecting the points ϕ2 and ϕ3. There are two other Stokes lines starting at ϕ2. We denote by “b” the Stokes line going to the left. As already noted, one of the Stokes lines beginning at ϕ3 coincides with the segment [ϕ2, ϕ3] of the line π + iR. Let “c” be the Stokes line starting at ϕ3 going up to the left. The global behavior of the Stokes lines “a”, “b” and “c” is described by

Lemma 7.1.
(1) The Stokes line “a” stays vertical; it does not intersect the lines iR and π + iR and stays between them.
(2) The Stokes line “b” intersects “a” above ϕ1; the segment between its beginning and the intersection is vertical and stays between iR and π + iR.
(3) After intersecting “a”, the Stokes line “b” goes downward staying vertical, not intersecting “a” anymore; it intersects either R or the Stokes line “d” symmetric to “a” with respect to iR.
(4) The Stokes line “c” stays vertical; it does not intersect the lines “a” and π + iR and stays between these two curves.

Proof. First, consider the Stokes line “a”. A Stokes line can become horizontal only at a point where Im κ = 0, i.e. at a point of the pre-image of a spectral band. Therefore, “a” stays vertical as long as it stays (strictly) between iR and π + iR. Assume that “a” goes upward between iR and π + iR and, then, intersects π + iR at ϕa. Denote the segment of “a” between ϕ1 and ϕa by $\tilde a$. By the definition of Stokes lines and as the interval [ϕ1, π] is a Stokes line, one has

$$0 = \mathrm{Im}\int_{\tilde a}\kappa_0(\zeta)\,d\zeta = \mathrm{Im}\int_{\pi}^{\varphi_a}\kappa_0(\zeta)\,d\zeta = \int_{\pi,\ \text{along } \pi + i\mathbb{R}}^{\varphi_a}\mathrm{Re}\,\kappa_0(\zeta)\,d\zeta. \tag{7.8}$$
But, on the line π + iR, one has Re κ0 (ζ ) > 0. So, the right-hand side of (7.8) is nonzero. Thus, the Stokes line “a” does not intersect π + iR and stays vertical if it does not intersect iR. So, now we assume that “a” intersects iR at ϕa . Denote the segment of “a” between ϕ1 and ϕa by a. ˜ Then, as κ0 ∈ iR on iR, 0 0 κ0 (ζ )dζ = Im κ0 (ζ )dζ = Im κ0 (ζ )dζ. 0 = Im a˜
ϕ1
ϕ1 , along R
The last integral is non-zero as Im κ0 > 0 on the real line between 0 and ϕ1. The above two observations prove point 1 of Lemma 7.1. Now, consider the Stokes line “b”. If the lines “b” and “a” do not intersect one another, then “b” intersects either the interval [ϕ1, π] or the segment [π, ϕ2] of the line π + iR. Assume that there are such intersections. Denote by ϕb the point where it happens for the first time. If ϕb ∈ [π, ϕ2], then, as κ0(ϕ2) = π,
\[
0 = \operatorname{Im}\int_{\varphi_b,\ \text{along “b”}}^{\varphi_2}(\kappa_0(\zeta)-\pi)\,d\zeta
  = \operatorname{Im}\int_{\varphi_b,\ \text{along }\pi+i\mathbb{R}}^{\varphi_2}(\kappa_0(\zeta)-\pi)\,d\zeta.
\]
As, along [π, ϕ2), one has 0 < κ0 < π, the right-hand side in the above formula is non-zero. If ϕb ∈ [ϕ1, π], one has
\[
0 = \operatorname{Im}\int_{\varphi_b,\ \text{along “b”}}^{\varphi_2}(\kappa_0(\zeta)-\pi)\,d\zeta
  = \operatorname{Im}\int_{\pi,\ \text{along }\pi+i\mathbb{R}}^{\varphi_2}(\kappa_0(\zeta)-\pi)\,d\zeta.
\]
As before, we see that the right-hand side in the last formula is non-zero. As a result, we see that “b” must intersect “a” at a point ϕab above R; it stays between “a” and π + iR before the first intersection. This proves point 2 of Lemma 7.1. To prove point 3, we need only to check that “b” can not intersect “a” once more before intersecting either “d” or R. Indeed, assume that it intersects “a”. Denote the intersection point by ζab; then
\[
0 = \operatorname{Im}\int_{\zeta_{ab},\ \text{along “b”}}^{\varphi_{ab}}(\kappa_0(\zeta)-\pi)\,d\zeta
  = \operatorname{Im}\int_{\zeta_{ab},\ \text{along “a”}}^{\varphi_{ab}}(\kappa_0(\zeta)-\pi)\,d\zeta
  = -\pi\operatorname{Im}(\varphi_{ab}-\zeta_{ab}),
\]
which is impossible. Check the last statement. Using almost the same argument as in the case of “a”, one proves that “c” can not intersect π + iR before intersecting iR. So, it suffices to prove that “c” stays to the right of “a”. Assume that the Stokes lines “a” and “c” intersect. Let ϕac be their first common point. Then,
\[
0 = \operatorname{Im}\int_{\varphi_1,\ \text{along “a”}}^{\varphi_{ac}}\kappa_0(\zeta)\,d\zeta
  = \operatorname{Im}\int_{\pi,\ \text{along }\pi+i\mathbb{R}}^{\varphi_2}\kappa_0\,d\zeta
  + \operatorname{Im}\int_{\varphi_2,\ \text{along “c”}}^{\varphi_{ac}}\kappa_0\,d\zeta.
\]
Now, using the definition of the Stokes line “c”, we transform the right-hand side into
\[
\int_{\pi,\ \text{along }\pi+i\mathbb{R}}^{\varphi_2}\operatorname{Re}\kappa_0\,d\operatorname{Im}\zeta
  + \pi\operatorname{Im}(\varphi_{ac}-\varphi_2).
\]
This expression is positive. So, “a” and “c” can not intersect. This implies point 4 and completes the proof of Lemma 7.1. □
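The local structure of Stokes lines at a branch point used above (three lines at mutual angles 2π/3) can be checked on a model. The sketch below assumes that the Stokes-line condition (7.7) has the form Im ∫ (κ − κ(ϕ0)) dζ = 0 and that, near ϕ0, κ(ϕ) − κ(ϕ0) behaves like c(ϕ − ϕ0)^{1/2}; the constant c, the radius r and the angle θ are notation introduced here for illustration only.

```latex
% Local model near a square-root branch point \varphi_0 (c \neq 0 is an assumed constant):
\[
  \kappa(\varphi)-\kappa(\varphi_0)\approx c\,(\varphi-\varphi_0)^{1/2}
  \quad\Longrightarrow\quad
  \int_{\varphi_0}^{\varphi}\bigl(\kappa(\zeta)-\kappa(\varphi_0)\bigr)\,d\zeta
  \approx \tfrac{2c}{3}\,(\varphi-\varphi_0)^{3/2}.
\]
% Write \varphi-\varphi_0 = r e^{i\theta} with r>0. The condition Im(...) = 0 becomes
\[
  \sin\!\bigl(\arg c+\tfrac32\,\theta\bigr)=0
  \quad\Longleftrightarrow\quad
  \theta=\tfrac23\,\bigl(n\pi-\arg c\bigr),\qquad n\in\mathbb{Z}.
\]
% Consecutive values of n give directions differing by 2\pi/3,
% so, modulo 2\pi, there are exactly three Stokes directions at \varphi_0.
```

In this model, changing the branch of the square root replaces c by −c, which shifts arg c by π and permutes the three directions without changing them.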
7.3. Canonical domains. Now we describe the canonical domains used in the next section to compute the monodromy matrix.
7.3.1. The domain K0. Consider the simply connected regular domain K0∞ corresponding to Fig. 9, part A. In this figure, the lines Im ∫^ϕ κ0 dζ = Const are represented by continuous curves, and the lines Im ∫^ϕ (κ0 − π)dζ = Const are shown as dotted curves. The boundary of K0∞ consists of Stokes lines. This domain exists for all E on the interval J defined by (1.4). On K0∞, we fix an analytic branch κ0 of the complex momentum as the analytic continuation of the main branch of the complex momentum. Consider its subdomain K0 corresponding to Fig. 9, part B. Its boundary contains the lines of Stokes type (with respect to κ0) passing through ζ1 and ζ2 (as shown in the same figure). Assuming that such a domain exists, we define the horizontal strip |Im ϕ| ≤ Y(K0), where K0 and K0∞ coincide. One has
Fig. 9. Canonical domains: K0∞ (part A) and K0 (part B)
Proposition 7.1. One can choose the points ζ1 and ζ2 so that the domain K0 exists and so that Y(K0) > Im ϕ3 is as large as desired. The domain K0 is canonical with respect to the branch κ0 and the points ζ1 and ζ2.
Proof. As usual, we identify R2 and C. The lines of Stokes type Im ∫^ϕ κ0 dζ = Const are integral curves of the vector field κ̄0(ϕ), and the lines of Stokes type Im ∫^ϕ (κ0 − π)dζ = Const are integral curves of the vector field κ̄0(ϕ) − π. First, choosing the points ζ1 and ζ2 in K0∞ properly, we show that there exists a domain K0 bounded by the lines of Stokes type as shown in Fig. 9, and that Y(K0) can be made as large as desired. The proof is split into a few steps.
1. Pick a point A on the Stokes line “c”, see Fig. 10. Recall that along “c”, Im ∫_{ϕ3}^ϕ (κ0 − π)dζ = 0. Consider γA, the line of Stokes type Im ∫_A^ϕ κ0 dζ = 0, passing through A. It is transversal to the Stokes line “c” at A. Show that, above A, it is vertical and stays between “a” and “c”. At A, γA is tangent to the vector κ̄0(A); in a sufficiently small neighborhood of A, above A, it goes up and stays to the left of “c” and to the right of “a”. Above A, it can not intersect “a” (at least, without intersecting “c”) as “a” is also a line of Stokes type along which Im ∫^ϕ κ0 dζ = const. Assume that γA intersects “c” above A. Denote the
Fig. 10. The construction of K0
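intersection point by ζi. Then,
\[
0 = \operatorname{Im}\int_{A,\ \text{along }\gamma_A}^{\zeta_i}\kappa_0\,d\zeta
  = \operatorname{Im}\int_{A,\ \text{along “c”}}^{\zeta_i}\kappa_0\,d\zeta
  = \pi\operatorname{Im}(\zeta_i-A),
\]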
which is impossible. So, γA stays between “a” and “c” which also implies that it is vertical above A.
2. We pick ζ2 on γA above A. Consider γB, the line of Stokes type Im ∫_{ζ2}^ϕ (κ0 − π)dζ = 0 passing through ζ2. At ζ2, it is transversal to γA and vertical. One shows that, below ζ2,
• γB goes down staying to the left of the lines γA, “c”, the segment [ϕ2, ϕ3] and the line “b”,
• then, it intersects either R or “d”, the Stokes line symmetric to “a” with respect to iR,
• it stays vertical between ζ2 and B, the intersection point with either R or “d”.
Indeed, being tangent to the vector κ̄0(ζ2) − π at ζ2, γB goes down from ζ2 and, in a sufficiently small neighborhood of ζ2, stays to the left of γA and to the right of “d”. Then, it stays vertical and goes down at least while between “d” and γA. Let R be the “rectangle” bounded by “d”, the line Im ϕ = Im ζ2, γA and the line Im ϕ = Im A. If γB leaves R via “d”, the second step is completed. If it leaves R via ζi, a point of γA, then
\[
\pi\operatorname{Im}(\zeta_i-\zeta_2)
  = \operatorname{Im}\int_{\zeta_2,\ \text{along }\gamma_B}^{\zeta_i}\kappa_0\,d\zeta
  = \operatorname{Im}\int_{\zeta_2,\ \text{along }\gamma_A}^{\zeta_i}\kappa_0\,d\zeta
  = 0,
\]
which is impossible. So, we are left with the case where it leaves R through a point of the line Im ϕ = Im A staying between “d” and γA . Now, consider “P”, the “polygon” bounded by the lines “d”, Im ϕ = Im A, “c”, the segment [ϕ2 , ϕ3 ] of the line π + iR, the line “b” and, possibly, R. Inside P , γB stays vertical and goes down. If γB leaves P via “d” or R, the second step is completed. This
is the case as it can not intersect the other lines forming the boundary of P since all of them are lines of Stokes type along which Im ∫^ϕ (κ0 − π)dζ = const. Note that, if B is real, then it is located between −ϕ1, the starting point of “d”, and ϕ1, the starting point of “a”.
3. Show that, by choosing Im ζ2 large enough, one can make B belong to “d” and Im B as large as desired. First assume that B ∈ R. Using the definitions of the lines of Stokes type, one computes
\[
\pi\operatorname{Im}(\zeta_2)
  = \operatorname{Im}\int_{B,\ \text{along }\gamma_B}^{\zeta_2}\kappa_0\,d\zeta
  = \operatorname{Im}\int_{B,\ \text{along }\mathbb{R}}^{\varphi_1}\kappa_0\,d\zeta
  + \operatorname{Im}\int_{\varphi_1,\ \text{in }K_0^\infty}^{A}\kappa_0\,d\zeta.
\]
The second term in the right-hand side is constant, and the first one increases with Im ζ2. So, if Im ζ2 is large enough, then B becomes a point of “d”. Assume B ∈ “d”. Along both “d” and γA, Im ∫^ϕ κ0 dζ = Const; so, one gets
\[
\pi\operatorname{Im}(\zeta_2-B)
  = \operatorname{Im}\int_{-\varphi_1,\ \text{in }K_0^\infty}^{A}\kappa_0\,d\zeta.
\]
So, Im B linearly increases with Im ζ2. This completes the third step.
4. One chooses the point ζ1 symmetric to ζ2 with respect to 0. As the cosine and, hence, κ0 are even (see (7.4)), the families of lines Im ∫^ϕ (κ0 − π)dζ = Const and Im ∫^ϕ κ0 dζ = Const are symmetric with respect to ϕ = 0. As a result, we see that the domain K0 exists and that Y(K0), the size of the horizontal strip where K0 coincides with K0∞, can be made as large as wanted. By means of Proposition 6.3, we show that K0 is canonical. The proof is again split into a few steps.
5. We say that the lines of a family (lv)v∈U fibrate a domain D if D is the disjoint union of the lines (lv)v∈U. We use the fact that the families of lines of Stokes type Im ∫^ϕ κ0 dζ = Const and Im ∫^ϕ (κ0 − π)dζ = Const both fibrate K0. Indeed, consider, for example, the first family. Pick a point ζ0 in K0. If there are two lines of Stokes type passing through ζ0, then ζ0 is a critical point for the harmonic function Im ∫_0^ϕ κ0 dζ. Then κ0(ζ0) = 0. This is possible only at a branch point of κ0 and there are no branch points inside of K0.
6. We construct a line γ ⊂ K0 connecting ζ1 and ζ2 and canonical with respect to κ0. To this end, we use Lemma 6.5 and make some preparations that we present now. First, in a neighborhood of ζ2, pick a segment γ2 of a vertical, say, straight line starting at ζ2 and going down between γA and γB. At ζ2, the last two lines are tangent respectively to the vectors κ̄0(ζ2) and κ̄0(ζ2) − π. So, at least, the part of γ2 situated in a small enough neighborhood of ζ2 is a canonical line. To keep the notations simple, we denote it also by γ2. The line γ1 starting at ζ1 and symmetric to γ2 with respect to 0 is also canonical. Describe a pre-canonical line β ⊂ K0 connecting a1, an internal point of γ1, to a2, an internal point of γ2. Consider σ2, the line of Stokes type containing a2 and satisfying the relation Im ∫_{a2}^ϕ (κ0 − π)dζ = 0.
As the lines of Stokes type Im ∫^ϕ (κ0 − π)dζ = const fibrate K0, choosing a2 close enough to γB, we get that (1) σ2 stays in the part of K0 where Im κ0 > 0 and, so, is vertical in K0; (2) the segment of σ2 going down from a2 intersects “d”. In particular, together with γB, σ2 intersects the line iR at a point b2 having a positive imaginary part. The pre-canonical line β is symmetric with respect to 0. It begins at a2, then, along σ2, it goes down to its intersection with iR, and then goes down along iR to the origin.
The line β̃ = γ1 ∪ β ∪ γ2 is pre-canonical. Once the pre-canonical line β̃ is found, Lemma 6.5 implies that, as close to β̃ as desired, there exists γ, a canonical line connecting ζ1 to ζ2 so that, in a sufficiently small neighborhood of ζ1, γ coincides with γ1, and, in a sufficiently small neighborhood of ζ2, it coincides with γ2. We choose γ so that it belongs to K0.
7. By means of Proposition 6.3, we show that K0 is a canonical domain enclosing γ (actually, it is the maximal canonical domain, but we do not need this fact). Pick a point ζ0 in K0. We need only to check that, in K0, there is a pre-canonical curve β connecting two internal points of γ and containing ζ0. We assume that Im ζ0 ≥ 0. Due to the symmetry of K0, β and γ1,2 with respect to the origin, for Im ζ0 < 0, the analysis is similar. There are four cases to be considered.
7a. First, we assume that ζ0 is to the left of “a”. Consider σ0, a line of Stokes type Im ∫_{ζ0}^ϕ κ0 dζ = 0, containing ζ0. As ζ0 is between “d” and “a”, and as the lines of Stokes type fibrate K0, the line σ0 stays between the lines “a” and “d”, and also between “ā” and “d̄”, the lines symmetric to “a” and “d” with respect to the real line. So, it stays vertical and has to intersect one of the lines γA and γB and one of the lines −γA and −γB, the symmetrics of γA and γB with respect to the origin. In fact, as both γA and −γA belong to the same family of lines of Stokes type as σ0, it intersects γB and −γB. To construct the needed pre-canonical curve, we pick c1 and c2, two internal points of σ0, so that Im c1 < Im ζ0 < Im c2. Consider α2, the line of Stokes type from the same family as γB and beginning at c2. As the lines of Stokes type fibrate K0, choosing c2 close enough to γB, we get that (1) α2 stays in the part of K0 where Im κ0 > 0, and, thus, is vertical; (2) the segment of α2 beginning at c2 and going up intersects γA. As γA is a part of the right boundary of K0, this segment intersects also γ.
Similarly, by properly choosing c1, one constructs α1, a vertical line of Stokes type in K0, connecting the point c1 to an internal point of γ situated below c1. The needed pre-canonical line β goes from this point of γ along α1 to σ0, then, along σ0, to α2, and, then, along α2 to an internal point of γ. The construction of β is illustrated by Fig. 10.
7b. Assume that ζ0 is situated to the right of or on “a” and above “b”. The construction of the pre-canonical curve corresponds to Fig. 11, part B.
7c. If ζ0 is to the left of the line π + iR and either below “b” or on “b”, we construct β as in Fig. 11, part C.
7d. If ζ0 is either to the right of the line π + iR or on this line, then we first construct a line β̃ as shown in Fig. 11, part D. This line contains δ1 and δ2, two segments of the line π + iR. As, on the line π + iR, 0 < κ0 < π, this line and its segments δ1 and δ2 are canonical. But the line β̃ is not yet pre-canonical: as, on π + iR, κ0 is real, the lines
Fig. 11. The pre-canonical line β
of Stokes type are horizontal (i.e. are not vertical) at the points of π + iR. To correct this, we use the C¹-stability of the canonical lines. We slightly deform the canonical line π + iR so as to get a pre-canonical line from its deformed segments δ1 and δ2 and β̃. We have shown that, for any point ζ0 ∈ K0, there exists a pre-canonical line β containing ζ0 as an internal point and connecting two internal points of the canonical line γ. As explained, this implies that K0 is canonical. This completes the proof. □
We finish this section by noting that the relations (7.6) and (7.5) imply that any domain obtained from K0 by means of 2π-translations and/or the reflection with respect to the real line is canonical.
7.3.2. The domain K1. Consider the simply connected regular domain K1∞ corresponding to Fig. 12, part A. The boundary of K1∞ consists of Stokes lines. This domain exists for all E in the interval J defined by (1.4). Let κ1 be the analytic continuation of κ0 from K0∞ to K1∞ through their common part. Consider the subdomain K1 ⊂ K1∞ corresponding to Fig. 12, part B. Its boundary contains the lines of Stokes type (with respect to κ1) passing through ζ1 and ζ2 (as shown in Fig. 12). Consider the horizontal strip |Im ϕ| ≤ Y(K1), where K1 and K1∞ coincide. One has
Proposition 7.2. One can choose the points ζ1 and ζ2 so that the domain K1 exists and so that Y(K1) > Im ϕ3 is as large as desired. The domain K1 is canonical with respect to the branch κ1 and the points ζ1 and ζ2.
The proof of this statement being analogous to the one of Proposition 7.1, we omit it.
Fig. 12. Canonical domains: K1∞ (part A) and K1 (part B)
We finish this section by noting that κ1 satisfies the following symmetry relations:
\[
\kappa_1(2\pi-\varphi)=\kappa_1(\varphi), \tag{7.9}
\]
\[
\kappa_1(\bar\varphi)=\overline{\kappa_1(\varphi)}. \tag{7.10}
\]
These relations follow from the definition of κ1. Consider A±1, the canonical Bloch solutions constructed for the domain K1 and indexed so that κ1 is the quasi-momentum of A+1. Discuss the corresponding functions ψ±(x, ϕ) = ψ±1(x, ϕ) and ω± = ω±1 (see Sect. 6.3). One has
\[
\psi_-^1(x,\bar\varphi)=\overline{\psi_+^1(x,\varphi)},\qquad
\psi_\pm^1(x,2\pi-\varphi)=\psi_\pm^1(x,\varphi). \tag{7.11}
\]
Indeed, the first symmetry follows from the discussion at the end of Subsect. 5.1 as the interval [ϕ1, 2π − ϕ1] is a connected component of the pre-image of a spectral band. The second symmetry holds as cos(2π − x) = cos(x). Furthermore, the definitions of ω±, see (6.13), and relations (7.11) imply that
\[
\omega_+^1(\bar\varphi)=\overline{\omega_-^1(\varphi)},\qquad
\omega_\pm^1(2\pi-\varphi)=-\omega_\pm^1(\varphi). \tag{7.12}
\]
7.3.3. Bloch solutions along periodic curves. We need the following simple observations. Consider the two branches of ψ(x, E − α cos ϕ), the Bloch solution of (6.10) with W(ϕ) = α cos ϕ. Consider them on the domain D0 where the main branch κ0 was defined. We index these branches by ψ±0(x, ϕ) so that their Bloch quasi-momenta are equal to ±κ0(ϕ). Continue ψ±0 analytically along the periodic curve γ described in Fig. 7. Then, along γ,
\[
\psi_\pm^0(x,\varphi+2\pi)=\psi_\pm^0(x,\varphi), \tag{7.13}
\]
\[
\omega_\pm^0(\varphi+2\pi)=\omega_\pm^0(\varphi), \tag{7.14}
\]
where ω±0 are constructed by (6.13) in terms of ψ±0. Equation (7.14) follows from the definition of the functions ω± and from (7.13). And (7.13) holds as (1) there are only two branches of the function ψ(x, E − α cos ϕ); (2) κ0 is the Bloch quasi-momentum of ψ+0; (3) κ0 is 2π-periodic along γ.
Recall that the derivative k0′(E) can vanish only inside the spectral bands of the periodic operator (1.3) (see Subsect. 5.2.2). So, on γ, we can fix an analytic branch of
\[
q_0=\sqrt{k_0'(E-\alpha\cos\varphi)}.
\]
Recall that k0′(E) ∈ −iR+ for E < E1 (see Subsect. 5.2.2). We fix the branch of q0 so that q0 ∈ e^{−iπ/4}R+ between ϕ1 and −ϕ1. One has
\[
q_0(\varphi+2\pi)=q_0(\varphi),\qquad \varphi\in\gamma. \tag{7.15}
\]
This relation easily follows from the analytic properties of the main branch of the Bloch quasi-momentum k0.
7.3.4. The domain K2. Define the domain K2 by K2 = K0 + 2π; by the remark concluding Sect. 7.3.1, the domain K2 is canonical. Let us discuss the solutions constructed by Theorem 6.1 on this domain. In K2, define κ2, ψ±2(x, ϕ), ω±2 and q2 as the analytic continuations of the functions κ0, ψ±0(x, E − α cos ϕ), ω±0 and q0 to K2 along the periodic curve γ corresponding to Fig. 13. Recall that the analytic continuations of κ0, ψ±0(x, E − α cos ϕ), ω±0 and q0 along γ are 2π-periodic. This implies
Lemma 7.2. The domain K2 is canonical with respect to κ2. Moreover, if f±0 are the consistent basis solutions constructed for K0 by Theorem 6.1 and normalized at the point ϕ = 0, then, on K2, the functions f±2(ϕ) = f±0(ϕ − 2π) have the same asymptotics as the consistent basis solutions constructed for K2 by Theorem 6.1 and normalized at the point ϕ = 2π.
Fig. 13. The three canonical domains (K0, K1 and K2)
8. The Proof of Theorem 2.2
This section is devoted to the proof of Theorem 2.2. We begin with an observation on the analyticity in ϕ of the solutions constructed in Theorem 6.1. They can be analytically continued outside of K. Indeed, let S(Y1, Y2) = {Y1 < Im ϕ < Y2} be the smallest strip containing the domain K. Fix 0 < ν < (Y2 − Y1)/2. Consider the domain Kν = {ϕ ∈ K; Y1 + ν < Im ϕ < Y2 − ν}, and its horizontal width,
\[
w=\sup_{\substack{\varphi,\varphi'\in K_\nu\\ \operatorname{Im}\varphi=\operatorname{Im}\varphi'}}|\varphi-\varphi'|.
\]
Clearly, w > 0. Assume that ε < w. Then, the functions f±, being defined for all x ∈ R and analytic in ϕ ∈ K, can be analytically continued in the whole strip {Y1 + ν < Im ϕ < Y2 − ν} using the consistency condition (6.3). To simplify the notations below, when speaking about the solutions f± constructed for a canonical domain K outside of this domain, we shall assume that ϕ ∈ {Y1 + ν < Im ϕ < Y2 − ν}.
8.1. The monodromy matrix. Let (f±0) be the solutions of (7.1) constructed by Theorem 6.1 for the canonical domain K0. To this basis, we associate the matrix M̃ defined by
\[
F^0(u,\varphi+2\pi)=\tilde M(\varphi)\,F^0(u,\varphi),\qquad
F^0=\begin{pmatrix} f_+^0\\ f_-^0\end{pmatrix}. \tag{8.1}
\]
In (8.1), changing the variables (u, ϕ) to the variables (x, φ) of the input Eq. (1.1), we see that M̃ is related to the monodromy matrix M by the formula M̃(ϕ) = M(φ), ϕ = εφ. The matrix M̃ is unimodular and ε-periodic (compare with (2.3)). We define the coefficients of the matrix M̃(ϕ) by
\[
\tilde M(\varphi)=\begin{pmatrix} m_{11}(\varphi) & m_{12}(\varphi)\\ m_{21}(\varphi) & m_{22}(\varphi)\end{pmatrix}.
\]
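We prove
Proposition 8.1. Under the assumptions of Theorem 2.2, there is a positive η such that M̃ is analytic in V = {|E − E0| < η} × {|Im φ| ≤ Y/ε} and, in V, its coefficients admit the uniform asymptotics
\begin{align*}
m_{11}(\varphi) &= t_h\,e^{i\Phi/\varepsilon}(1+o(1)),\\
m_{12}(\varphi) &= iG\Bigl[e^{i\Phi/\varepsilon}(1+o(1)) - t_v\,e^{\frac{2i\pi}{\varepsilon}(\varphi-\varphi_{(0)})}(1+o(1))\Bigr],\\
m_{21}(\varphi) &= -iG^{-1}\Bigl[e^{i\Phi/\varepsilon}(1+o(1)) - t_v\,e^{-\frac{2i\pi}{\varepsilon}(\varphi-\varphi_{(0)})}(1+o(1))\Bigr],\\
m_{22}(\varphi) &= \frac{1}{t_h}e^{-i\Phi/\varepsilon}(1+o(1)) + \frac{1}{t_h}e^{i\Phi/\varepsilon}(1+o(1))\\
&\quad - \frac{t_v}{t_h}\Bigl[e^{i\frac{2\pi}{\varepsilon}(\varphi-\varphi_{(0)})}(1+o(1)) + e^{-i\frac{2\pi}{\varepsilon}(\varphi-\varphi_{(0)})}(1+o(1))\Bigr].
\end{align*}
Here, the phase Φ, the coefficients th, tv are defined in (2.10) and (2.11), and
\[
G=\exp\Bigl(\int_0^{\varphi_1}(\omega_+-\omega_-)\,d\varphi\Bigr),\qquad
\varphi_{(0)}=-\frac{i\varepsilon}{2\pi}\int_{\varphi_1}^{\varphi_2}(\omega_--\omega_+)\,d\varphi-\pi, \tag{8.2}
\]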
where the integrals are taken along curves situated in K0, and the functions ω± are the ones defined for K0 in 7.3.1. The function ϕ(0) is real analytic.
We deduce Theorem 2.2 from Proposition 8.1 by changing the consistent basis; this is done in Sect. 8.6. The next sections are devoted to the proof of Proposition 8.1.
8.2. The canonical domains K0, K1, K2 and the associated transition matrices. It would be easy to compute the asymptotics of M̃ in a subset of K0 where we know the asymptotic behavior of both F0(u, ϕ + 2π) and F0(u, ϕ). Unfortunately, a quick look at Fig. 9, part B, shows us that K0 ∩ (K0 + 2π) = ∅. Therefore, we need to introduce additional canonical domains. The first domain we need is K1 (see Fig. 12 B and Subsect. 7.3.2). Let (f±1) be the consistent basis constructed by Theorem 6.1 in K1. As (f±0) and (f±1) are both bases of solutions of (7.1), we can write
\[
F^0(u,\varphi)=T_1(\varphi)\,F^1(u,\varphi),\qquad
F^1=\begin{pmatrix} f_+^1\\ f_-^1\end{pmatrix}, \tag{8.3}
\]
where the matrix T1 is independent of u. As (f±0) and (f±1) satisfy the consistency condition (6.3), the matrix T1 is ε-periodic. The second canonical domain we need is K2, see Sect. 7.3.4. Define f±2(ϕ) = f±0(ϕ − 2π). It is a consistent basis of solutions with standard asymptotic behavior on K2. As (f±2) and (f±1) are two bases of solutions of (7.1), we have
\[
F^1(u,\varphi)=T_2(\varphi)\,F^2(u,\varphi),\qquad
F^2=\begin{pmatrix} f_+^2\\ f_-^2\end{pmatrix}, \tag{8.4}
\]
where the matrix T2 is independent of u and ε-periodic in ϕ. Putting (8.3) and (8.4) together, we get
\[
F^0(u,\varphi)=T_1(\varphi)T_2(\varphi)F^2(u,\varphi)=T_1(\varphi)T_2(\varphi)F^0(u,\varphi-2\pi).
\]
Hence, the monodromy matrix reads
\[
\tilde M(\varphi)=T_1(\varphi+2\pi)\,T_2(\varphi+2\pi). \tag{8.5}
\]
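The passage to (8.5) is a shift of the argument by 2π. The short derivation below spells it out using only (8.1), (8.3), (8.4) and the definition f±2(ϕ) = f±0(ϕ − 2π); it introduces no notation beyond that of the text.

```latex
% From (8.3), (8.4) and f_\pm^2(\varphi)=f_\pm^0(\varphi-2\pi):
\[
  F^0(u,\varphi)=T_1(\varphi)\,T_2(\varphi)\,F^0(u,\varphi-2\pi).
\]
% Replace \varphi by \varphi+2\pi:
\[
  F^0(u,\varphi+2\pi)=T_1(\varphi+2\pi)\,T_2(\varphi+2\pi)\,F^0(u,\varphi).
\]
% Compare with (8.1). As f_+^0 and f_-^0 form a basis of solutions,
% the matrix relating F^0(u,\varphi+2\pi) to F^0(u,\varphi) is unique, hence
\[
  \tilde M(\varphi)=T_1(\varphi+2\pi)\,T_2(\varphi+2\pi).
\]
```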
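To prove Proposition 8.1, we study the matrices T1 and T2. We denote their coefficients as follows
\[
T_1(\varphi)=\begin{pmatrix} a_1(\varphi) & b_1(\varphi)\\ c_1(\varphi) & d_1(\varphi)\end{pmatrix},\qquad
T_2(\varphi)=\begin{pmatrix} a_2(\varphi) & b_2(\varphi)\\ c_2(\varphi) & d_2(\varphi)\end{pmatrix}. \tag{8.6}
\]
Below, when describing the asymptotics of the matrices T1 and T2 and when computing these asymptotics, the contour integrals of κj and ω±j, j = 0, 1, 2, are taken along curves in Kj in all the cases when no other choice is described explicitly. One proves
Proposition 8.2. Under the assumptions of Theorem 2.2, there is a positive η such that T1 is analytic in V = {|E − E0| < η} × {|Im φ| ≤ Y/ε} and, in V, its coefficients admit the uniform asymptotics
\begin{align}
a_1(\varphi) &= e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi+\int_0^{\pi}\omega_+^0\,d\varphi}(1+o(1)), \tag{8.7}\\
b_1(\varphi) &= e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\cdot o(e^{-\delta_1/\varepsilon}), \tag{8.8}\\
c_1(\varphi) &= -ie^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}
  \Bigl[e^{\frac{2i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi}\,
  e^{\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}(1+o(1)) \notag\\
&\qquad -e^{-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\,
  e^{\int_0^{\varphi_2}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_2}\omega_+^0\,d\varphi}\,
  e^{-\frac{2\pi i}{\varepsilon}(\varphi-\pi)}(1+o(1))\Bigr], \tag{8.9}\\
d_1(\varphi) &= e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi+\int_0^{\pi}\omega_-^0\,d\varphi}(1+o(1)). \tag{8.10}
\end{align}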
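In (8.8), δ1 is a positive constant.
and
Proposition 8.3. Under the assumptions of Theorem 2.2, there is a positive η such that T2 is analytic in V = {|E − E0| < η} × {|Im φ| ≤ Y/ε} and, in V, its coefficients admit the uniform asymptotics
\begin{align}
a_2(\varphi) &= e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi+\int_{\pi}^{2\pi}\omega_+^2\,d\varphi}(1+o(1)), \tag{8.11}\\
b_2(\varphi) &= ie^{-\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}
  \Bigl[e^{\frac{2i}{\varepsilon}\int_{\pi}^{2\pi-\varphi_1}\kappa_1\,d\varphi}\,
  e^{\int_{\pi}^{2\pi-\varphi_1}\omega_+^1\,d\varphi-\int_{2\pi}^{2\pi-\varphi_1}\omega_-^2\,d\varphi}(1+o(1)) \notag\\
&\qquad -e^{\frac{2i}{\varepsilon}\int_{\pi}^{\bar\varphi_2}(\kappa_1-\pi)\,d\varphi}\,
  e^{\int_{\pi}^{\bar\varphi_2}\omega_+^1\,d\varphi-\int_{2\pi}^{\bar\varphi_2}\omega_-^2\,d\varphi}\,
  e^{\frac{2\pi i}{\varepsilon}(\varphi-\pi)}(1+o(1))\Bigr], \tag{8.12}\\
c_2(\varphi) &= e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\cdot o(e^{-\delta_2/\varepsilon}), \tag{8.13}\\
d_2(\varphi) &= e^{-\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi+\int_{\pi}^{2\pi}\omega_-^2\,d\varphi}(1+o(1)). \tag{8.14}
\end{align}
In (8.13), δ2 is a positive constant.
The next sections are devoted to the proof of Propositions 8.2 and 8.3. As it is similar to the proof of Proposition 8.2, we do not give a detailed proof of Proposition 8.3; we just describe the starting points and partial results. From now on, C denotes different positive constants independent of ϕ and ε.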
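8.3. The asymptotics of T1. By definition, we have
\[
\begin{aligned}
a_1(\varphi) &= \frac{1}{w_1}\,w(f_+^0,f_-^1), &\qquad b_1(\varphi) &= -\frac{1}{w_1}\,w(f_+^0,f_+^1),\\
c_1(\varphi) &= \frac{1}{w_1}\,w(f_-^0,f_-^1), &\qquad d_1(\varphi) &= -\frac{1}{w_1}\,w(f_-^0,f_+^1),\\
w_1 &= w(f_+^1,f_-^1)=f_-^1\frac{d}{du}f_+^1-f_+^1\frac{d}{du}f_-^1 .
\end{aligned}
\tag{8.15}
\]
The solutions (f±j) are normalized at the points νj, j = 0, 1, where ν0 = 0, ν1 = π. For convenience, we recall the asymptotics of f±j:
\[
f_\pm^j(u,\varphi)=e^{\pm\frac{i}{\varepsilon}\int_{\nu_j}^{\varphi}\kappa_j\,d\varphi}\bigl(A_\pm^j(u,\varphi)+o(1)\bigr),\qquad \varphi\in K_j, \tag{8.16}
\]
\[
A_\pm^j(x,\varphi)=q_j(\varphi)\,e^{\int_{\nu_j}^{\varphi}\omega_\pm^j\,d\varphi}\,\psi_\pm^j(x,\varphi). \tag{8.17}
\]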
Recall that κ1, ψ±1, ω±1 and q1 are the analytic continuations of, respectively, κ0, ψ±0, ω±0 and q0 along the periodic curve γ from K0 to K1 (see Sect. 7.3.3). Note that, in view of Lemma 6.2, one has
\[
w_j=w(f_+^j,f_-^j)=q_j^2\,w(\psi_+^j,\psi_-^j)\big|_{\varphi=\nu_j}+o(1). \tag{8.18}
\]
These Wronskians are non-zero. Indeed,
• as νj are not branch points of the complex momentum, ψ+j and ψ−j are linearly independent;
• q² = k′(E − α cos ϕ) only vanishes at points in the pre-image of the spectral gaps ∪_{l=1}^∞ (E2l, E2l+1) with respect to the mapping ℰ : ϕ → E − α cos ϕ (see Sect. 5.2.2), and (νj)j=0,1 do not belong to this pre-image.
8.3.1. Three substrips. We assume that K0 and K1 are chosen so that Y(K0), Y(K1) > Im ϕ3; the definitions and properties of Y(K0) and Y(K1) are described in Sects. 7.3.1 and 7.3.2. We study the transition matrix T1 in the strip
\[
-Y_0<\operatorname{Im}\varphi<Y_0,\qquad Y_0=\operatorname{Im}\varphi_3 .
\]
When computing the asymptotics of T1, we divide this strip into three different smaller substrips called (I), (II) and (III) (see Fig. 14). Each of these strips requires a different type of computation.
8.3.2. Properties of the analytic objects of the complex WKB method in the substrips. In this section, we compare κj, ψ±j, ω±j and qj for j = 0 and j = 1 in each of the different strips (I), (II) and (III). In the strip (II), we have a non-empty intersection for K0 and K1. In the intersection, by definition, we have
\[
\kappa_1=\kappa_0,\qquad \psi_\pm^1=\psi_\pm^0,\qquad \omega_\pm^1=\omega_\pm^0,\qquad\text{and}\qquad q_1=q_0 . \tag{8.19}
\]
In the strip (I), consider the common boundary of K0 and K1 (see Fig. 14). It is the Stokes line beginning at ϕ = ϕ1 and going downwards. Along this line, we get
\begin{align}
\kappa_1(\varphi+0) &= -\kappa_0(\varphi-0), \tag{8.20}\\
\psi_\pm^1(u,\varphi+0) &= \psi_\mp^0(u,\varphi-0), \tag{8.21}\\
\omega_\pm^1(\varphi+0) &= \omega_\mp^0(\varphi-0), \tag{8.22}\\
q_1(\varphi+0) &= iq_0(\varphi-0), \tag{8.23}
\end{align}
where ϕ + 0 (resp. ϕ − 0) denotes the limit taken from the right (resp. left). These formulae hold as ϕ1 is a pre-image of E1 (the infimum of the first band of the periodic Schrödinger operator) by the mapping ℰ : ϕ → E − α cos ϕ. Indeed, formula (8.20) holds as ϕ1 is a branch point of the complex momentum, as ϕ1 is of square root type and as κ0(ϕ1) = 0. Formula (8.21) holds as ψ±j are just branches of the Bloch solution ψ(x, ℰ), ℰ = E − α cos ϕ, which has only two different branches, and as E1 is a branch point of this function. The relation for ω±j follows from (8.21) and the definition of ω±. The relation for q follows from the definition of q as E1 is a square root branch point of the Bloch quasi-momentum. We also notice that, as the imaginary part of the main branch κ0 is positive in K0 along the common boundary of K0 and K1, the relation (8.20) implies that, along this boundary, one has
\[
\operatorname{Im}\kappa_1(\varphi+0)=-\operatorname{Im}\kappa_0(\varphi-0)<0. \tag{8.24}
\]
Fig. 14. Going from K0 to K1
In the strip (III), the common boundary of K0 and K1 is the interval [ϕ2, ϕ3], that is, the Stokes line joining ϕ2 to ϕ3. Along [ϕ2, ϕ3], we have
\begin{align}
\kappa_1(\varphi+0) &= 2\pi-\kappa_0(\varphi-0), \tag{8.25}\\
\psi_\pm^1(u,\varphi+0) &= \psi_\mp^0(u,\varphi-0), \tag{8.26}\\
\omega_\pm^1(\varphi+0) &= \omega_\mp^0(\varphi-0), \tag{8.27}\\
q_1(\varphi+0) &= -iq_0(\varphi-0), \tag{8.28}
\end{align}
and
\[
\operatorname{Im}\kappa_1(\varphi+0)=-\operatorname{Im}\kappa_0(\varphi-0)<0. \tag{8.29}
\]
8.3.3. A continuation lemma. In the strips (I) and (III), the δ-admissible sub-domains of canonical domains K0 and K1 are separated by a distance 2δ. To get uniform asymptotics for f±0 in K1, we use
Lemma 8.1. Let ϕ−, ϕ+, ϕ0 be fixed points (see Figs. 16 and 17) such that
• Im ϕ− = Im ϕ+;
• there is no branch point of ϕ → κ(ϕ) on the interval [ϕ−, ϕ+];
• ϕ0 ∈ (ϕ−, ϕ+) and q(ϕ0) ≠ 0.
Fix a continuous branch of κ on [ϕ−, ϕ+]. Let f(u, ϕ), f±(u, ϕ) be solutions of (7.1) for ϕ ∈ [ϕ−, ϕ+] and u ∈ [−U, U] satisfying condition (6.3), and such that:
(1) for ϕ ∈ [ϕ−, ϕ0] when ε → 0,
\[
f(u,\varphi)=e^{\frac{i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}\bigl(A_+(u,\varphi)+o(1)\bigr),
\]
and the asymptotic is differentiable in u;
(2) for ϕ ∈ [ϕ−, ϕ+] when ε → 0,
\[
f_\pm(u,\varphi)=e^{\pm\frac{i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}\bigl(A_\pm(u,\varphi)+o(1)\bigr),
\]
and the asymptotic is differentiable in u.
Here, (A±) are canonical Bloch solutions with the Bloch quasi-momenta ±κ. Then,
• if Im(κ(ϕ)) > 0 for all ϕ ∈ [ϕ−, ϕ+], there exists C > 0 such that, for ε > 0 small enough,
\[
\Bigl|\frac{df}{du}(u,\varphi)\Bigr|+|f(u,\varphi)|\le Ce^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}|\operatorname{Im}\kappa|\,d\varphi},\qquad \varphi\in[\varphi_0,\varphi_+]; \tag{8.30}
\]
• if Im(κ(ϕ)) < 0 for all ϕ ∈ [ϕ−, ϕ+], then
\[
f(u,\varphi)=e^{\frac{i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}\bigl(A_+(u,\varphi)+o(1)\bigr),\qquad \varphi\in[\varphi_0,\varphi_+], \tag{8.31}
\]
and the asymptotic is differentiable in u.
Remark 8.1. This lemma reflects a heuristic that says that the WKB asymptotics of a solution stays valid as long as its leading term is exponentially increasing. A similar statement for a class of difference equations can be found in [6].
Proof. Note that w(f+, f−) = w(A+, A−)|ϕ=ϕ0 + o(1). As q(ϕ0) ≠ 0, by Lemma 6.2, the leading term in this formula is a non-zero constant, and f± are linearly independent. So, we can write
\[
f=a(\varphi)f_++b(\varphi)f_-, \tag{8.32}
\]
where
\[
a(\varphi)=\frac{w(f,f_-)}{w(f_+,f_-)}\qquad\text{and}\qquad b(\varphi)=\frac{w(f_+,f)}{w(f_+,f_-)} .
\]
By (6.3), a and b are ε-periodic in ϕ. As, on [ϕ−, ϕ0], the solutions f and f+ have the same asymptotics, one computes
\[
a(\varphi)=1+o(1). \tag{8.33}
\]
The coefficient a is ε-periodic in ϕ so that (8.33) holds on [ϕ−, ϕ+]. Let us now estimate the coefficient b. We start with the case when Im(κ(ϕ)) > 0 for all ϕ ∈ [ϕ−, ϕ+]. Then, on [ϕ−, ϕ0], we have
\[
|w(f_+,f)|\le C\Bigl|e^{\frac{2i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}\Bigr|\le Ce^{\frac{2}{\varepsilon}\int_{\varphi}^{\varphi_0}\operatorname{Im}(\kappa)\,d\varphi}.
\]
So, for ϕ ∈ [ϕ0 − ε, ϕ0], we get that |b(ϕ)| ≤ C; hence, by ε-periodicity, we have |b(ϕ)| ≤ C for ϕ ∈ [ϕ−, ϕ+]. On the other hand, on [ϕ0, ϕ+], we have
\[
|f_+|\le Ce^{-\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}\operatorname{Im}(\kappa)\,d\varphi},\qquad
|f_-|\le Ce^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}\operatorname{Im}(\kappa)\,d\varphi}.
\]
So that, by (8.32), we get (8.30) on [ϕ0, ϕ+]. If Im(κ(ϕ)) < 0 for all ϕ ∈ [ϕ−, ϕ+] then, for ϕ ∈ [ϕ−, ϕ0], we get |b(ϕ)| ≤ C |e^{(2i/ε) ∫_{ϕ0}^{ϕ} κ dϕ}|. Using this estimate for ϕ ∈ [ϕ−, ϕ− + ε] and the ε-periodicity, for ϕ ∈ [ϕ−, ϕ+], we get
\[
|b(\varphi)|\le Ce^{-\frac{2}{\varepsilon}\int_{\varphi_-}^{\varphi_0}|\operatorname{Im}(\kappa)|\,d\varphi}.
\]
Hence, by (8.32) and (8.33), on [ϕ0, ϕ+], we have
\begin{align*}
f &= a(\varphi)f_++b(\varphi)f_-\\
  &= e^{\frac{i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}
     \Bigl(A_+(u,\varphi)+O\Bigl(e^{-\frac{2i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi-\frac{2}{\varepsilon}\int_{\varphi_-}^{\varphi_0}|\operatorname{Im}(\kappa)|\,d\varphi}\Bigr)+o(1)\Bigr)\\
  &= e^{\frac{i}{\varepsilon}\int_{\varphi_0}^{\varphi}\kappa\,d\varphi}\bigl(A_+(u,\varphi)+o(1)\bigr).
\end{align*}
This proves (8.31) and completes the proof of Lemma 8.1. □
Remark 8.2. Assume that, for E = E0, (1) the functions f± are two solutions constructed by Theorem 6.1 on some canonical domain K′, (2) the interval [ϕ−, ϕ+] is in the δ-admissible subdomain of K′, (3) the function f is a solution constructed by Theorem 6.1 on a canonical domain K″, (4) the interval [ϕ−, ϕ0] is located in the δ-admissible subdomain of K″. Then, the proof of Lemma 8.1 shows that the asymptotics and estimates of f obtained by Lemma 8.1 are uniform in a constant neighborhood of E0.
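The mechanism behind the second case of Lemma 8.1 can be illustrated on a toy situation. The sketch below assumes a constant complex momentum κ ≡ κ* with Im κ* = −m < 0, a real interval [ϕ−, ϕ+], and amplitudes |A±| bounded above and below; κ*, m and these simplified bounds are illustrative assumptions, not objects taken from the paper.

```latex
% Toy case: \kappa\equiv\kappa_*, \operatorname{Im}\kappa_*=-m<0, real \varphi,
% and |A_\pm| bounded above and below (all assumptions made for illustration).
\[
  |f_\pm(u,\varphi)|\asymp e^{\pm m(\varphi-\varphi_0)/\varepsilon},
  \qquad
  |b(\varphi)|\le C\,e^{-2m(\varphi_0-\varphi_-)/\varepsilon}.
\]
% Hence, for \varphi\in[\varphi_0,\varphi_+]:
\[
  \frac{|b(\varphi)\,f_-(u,\varphi)|}{|f_+(u,\varphi)|}
  \le C\,e^{-2m(\varphi_0-\varphi_-)/\varepsilon}\,e^{-2m(\varphi-\varphi_0)/\varepsilon}
  = C\,e^{-2m(\varphi-\varphi_-)/\varepsilon}\longrightarrow 0
  \quad(\varepsilon\to0),
\]
% so f=a f_+ + b f_- keeps the asymptotics of the growing solution f_+.
```

In this toy case the exponentially small size of b, fixed on the left of ϕ0, is what makes the decaying solution f− negligible on the right of ϕ0.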
8.3.4. Complex values of E and admissible subdomains. Let E0 ∈ J be the point fixed in Theorem 2.2, and let K0 and K1 be the canonical domains corresponding to this value of E. Fix δ > 0. We describe the precise choice of δ later. To get asymptotics of T1 uniformly in E, we do not work with K0 and K1, but with their δ-admissible subdomains. More precisely, we use the asymptotics of f±j, j = 0, 1, given by Theorem 6.1 only for ϕ in these admissible subdomains. Then, in view of Proposition 6.1, these asymptotics are valid and uniform in E in a constant complex neighborhood of the point E0 fixed in Theorem 2.2. This guarantees that the asymptotics of the coefficients of T1 are valid and uniform in this complex neighborhood. Let Y be chosen as in Theorem 6.1. In the sequel, we fix δ sufficiently small so that δ < Y0 − Y = Im ϕ3 − Y. In the sequel, if no other choice is indicated explicitly, K0 and K1 denote the δ-admissible subdomains of K0 and K1.
8.3.5. Auxiliary canonical domains. To use the Continuation Lemma, Lemma 8.1, “to bridge” the gap between the admissible subdomains of K1 and K0 in the strip (III), we need two linearly independent solutions of Eq. (1.1) with standard asymptotic behavior in the strip (III) between these admissible subdomains. So, we introduce an auxiliary canonical domain $\tilde K_0$ corresponding to Fig. 15, part A. Let $\tilde\kappa_0$ be the analytic continuation of κ0 through the interval (ϕ2, ϕ3). The domain $\tilde K_0$ is canonical with respect to $\tilde\kappa_0$. We construct $\tilde K_0$ by means of Proposition 6.3. This construction being similar to (and much simpler than) the one of K0 done in Sect. 7.3.1, we only illustrate it in parts B and C of Fig. 15. Here, the dotted lines are lines of Stokes type $\mathrm{Im}\int(\tilde\kappa_0-\pi)\,d\varphi=\mathrm{Const}$, the thin continuous lines are lines of Stokes type $\mathrm{Im}\int\tilde\kappa_0\,d\varphi=\mathrm{Const}$, and the thick lines are canonical. Along the continuous lines Im ϕ decays as Re ϕ increases. Note that the domain $\tilde K_0$ can be constructed so that it comprises any given compact subinterval of (ϕ2, ϕ3).
Fig. 15. The local canonical domain (parts A, B and C)
Also, we need to apply Lemma 8.1 to bridge the gap between the admissible subdomains of K0 and K1 in the strip (I). In this case, the auxiliary canonical domain is $\overline{K_0}$, the domain symmetric to K0 with respect to the real line. We are now ready to start with the computations of the coefficients of T1.
8.3.6. The asymptotics of d1(ϕ). By (8.15), we need to compute the asymptotics of $w(f_-^0,f_+^1)$ when ε → 0.
Strip (I). In view of (8.24), Lemma 8.1 allows us to “continue” the asymptotics of $f_-^0$ from K0 to K1 across the common boundary of the canonical domains in the strip (I). The consistent basis used to apply Lemma 8.1 is the basis corresponding to the auxiliary canonical domain $\overline{K_0}$. Let the interval [ϕ−, ϕ+] be as in Fig. 16, i.e. the points ϕ− and ϕ0 are in the admissible subdomain of K0, the point ϕ+ is situated in the admissible subdomain of K1, and the point ϕb is on the common boundary of the canonical domains. Lemma 8.1 implies that the asymptotics (8.16) of $f_-^0$ stays valid on [ϕb, ϕ+]. We rewrite it in the form
$$f_-^0=e^{-\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi}\kappa_0\,d\varphi}\,(A_-^0+o(1)),$$
where the last integral is taken along [ϕ−, ϕ+]. By (8.16) and as κ1 = κ0 in the strip (II), and κ1 = −κ0 in the strip (I), for ϕ ∈ [ϕ−, ϕ+] ∩ K1, we can write
$$f_+^1=e^{\frac{i}{\varepsilon}\int_\pi^{\varphi_1}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi}\kappa_0\,d\varphi}\,(A_+^1+o(1)),$$
where the last integral is taken along [ϕ−, ϕ+]. So,
$$|w(f_-^0,f_+^1)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\cdot\bigl|e^{-\frac{2i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi}\bigr|\cdot\bigl|e^{-\frac{2i}{\varepsilon}\int_{\varphi_b}^{\varphi}\kappa_0\,d\varphi}\bigr|.$$
As the points ϕ1 and ϕb are on the Stokes line $\mathrm{Im}\int_{\varphi_1}^{\varphi}\kappa_0\,d\varphi=0$, one has $\bigl|e^{-\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi}\bigr|=1$.
Fig. 16. Continuation below ϕ1
Now, as ϕ has to be in an admissible subdomain of K1, we know that Re (ϕ − ϕb) ≥ δ. Assuming that δ < Re (ϕ − ϕb) ≤ δ + ε, we get
$$|w(f_-^0,f_+^1)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\,e^{C\delta/\varepsilon}.\tag{8.34}$$
By the ε-periodicity of the Wronskian, it holds everywhere on the line {Im (ϕ) = Im (ϕb)}. The estimate (8.34) is valid as long as the interval [ϕ−, ϕ+] stays off the δ-neighborhood of the branch points, and as long as its ends are in the admissible subdomains of K0 and K1. Hence, for −Y0 + δ ≤ Im ϕ ≤ −δ, we have
$$|d_1(\varphi)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\,e^{C\delta/\varepsilon}.\tag{8.35}$$
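The bookkeeping behind (8.34) can be made explicit. The following is only a sketch based on the two representations of $f_-^0$ and $f_+^1$ in Strip (I); the shorthand $I(a,b)$ is introduced here for convenience and is not notation from the paper, and the bound on the last factor assumes that $\kappa_0$ is bounded on the short segment between $\varphi_b$ and $\varphi$.

```latex
% Shorthand (an assumption of this sketch): I(a,b) = \int_a^b \kappa_0 \, d\varphi.
\begin{align*}
% sum of the exponents of f_-^0 and f_+^1 on [\varphi_b,\varphi_+]
&-\tfrac{i}{\varepsilon}\bigl[I(0,\varphi_1)+I(\varphi_1,\varphi_b)+I(\varphi_b,\varphi)\bigr]
 +\tfrac{i}{\varepsilon}\bigl[I(\pi,\varphi_1)-I(\varphi_1,\varphi_b)-I(\varphi_b,\varphi)\bigr] \\
&\qquad= -\tfrac{i}{\varepsilon}\,I(0,\pi)
 -\tfrac{2i}{\varepsilon}\,I(\varphi_1,\varphi_b)
 -\tfrac{2i}{\varepsilon}\,I(\varphi_b,\varphi),
\end{align*}
% because I(0,\varphi_1) - I(\pi,\varphi_1) = I(0,\varphi_1) + I(\varphi_1,\pi) = I(0,\pi).
% The middle term has modulus one, since \operatorname{Im} I(\varphi_1,\varphi_b) = 0
% on the Stokes line. For the last term, with |\varphi-\varphi_b| \le \delta+\varepsilon:
\[
\bigl|e^{-\frac{2i}{\varepsilon} I(\varphi_b,\varphi)}\bigr|
 = e^{\frac{2}{\varepsilon}\operatorname{Im} I(\varphi_b,\varphi)}
 \le e^{\frac{2}{\varepsilon}(\delta+\varepsilon)\sup|\kappa_0|}
 \le C\,e^{C\delta/\varepsilon}.
\]
```

Read this way, the factor $e^{C\delta/\varepsilon}$ in (8.34) and (8.35) is the price paid for evaluating the Wronskian at distance of order δ from the common boundary.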
This estimate is uniform along horizontal lines Im ϕ = C.
Strip (II). In this case, by (8.19), for ϕ ∈ K0 ∩ K1, we have
$$f_-^0=e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi+\int_0^\pi\omega_-^0\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_\pi^{\varphi}\kappa_0\,d\varphi}\,(A_-^1(u,\varphi)+o(1)).$$
Comparing this with the formula
$$f_+^1=e^{\frac{i}{\varepsilon}\int_\pi^{\varphi}\kappa_0\,d\varphi}\,(A_+^1(u,\varphi)+o(1)),$$
one easily obtains
$$w(f_-^0,f_+^1)=-e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\,e^{\int_0^\pi\omega_-^0\,d\varphi}\,\bigl(w(f_+^1,f_-^1)+o(1)\bigr).$$
Hence,
$$d_1(\varphi)=e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\,e^{\int_0^\pi\omega_-^0\,d\varphi}\,(1+o(1)).\tag{8.36}$$
By the ε-periodicity of the Wronskian, (8.36) is uniform along horizontal lines in the strip δ ≤ Im ϕ ≤ Im ϕ2 − δ.
Strip (III). In this region, we satisfy ourselves with an estimate on $w(f_-^0,f_+^1)$. In view of (8.29), by Lemma 8.1, we can continue the asymptotics of $f_-^0$ from [ϕ−, ϕ0] to the whole interval [ϕ−, ϕ+], see Fig. 17. The consistent basis needed to apply Lemma 8.1 is obtained by applying Theorem 6.1 for the auxiliary canonical domain $\tilde K_0$. This domain comprises a part of the common boundary of the canonical domains K0 and K1, i.e. of the segment (ϕ2, ϕ3) (see Figs. 14, 15 and 17). On the segment [ϕb, ϕ+], we have
$$f_-^0=e^{-\frac{i}{\varepsilon}\int_0^{\varphi_2}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_2}^{\varphi_b}\kappa_0\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi}\kappa_0\,d\varphi}\,(A_-^0+o(1)),\tag{8.37}$$
where the last integral is taken along [ϕ−, ϕ+]. In view of (8.19) and (8.25), we also have
$$f_+^1=e^{\frac{i}{\varepsilon}\int_\pi^{\varphi_2}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_2}^{\varphi}\kappa_0\,d\varphi}\,e^{\frac{i}{\varepsilon}(\varphi-\varphi_2)2\pi}\,(A_+^1+o(1)),\qquad\varphi\in[\varphi_-,\varphi_+]\cap K_1,\tag{8.38}$$
where, in the last integral, we integrate the analytic continuation of κ0 from K0 to K1 through (ϕ2, ϕ3), and this integral is taken along a curve in K1. Representations (8.37) and (8.38) imply
$$|d_1(\varphi)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\cdot\bigl|e^{-\frac{2i}{\varepsilon}\int_{\varphi_2}^{\varphi}(\kappa_0-\pi)\,d\varphi}\bigr|.$$
Fig. 17. Continuation through [ϕ2, ϕ3]
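Recall that, for ϕ ∈ [ϕ2, ϕ3], one has κ0(ϕ) = π + it, where t ≥ 0. Therefore, if |ϕ − ϕb| is of order δ, we obtain
$$|d_1(\varphi)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\,e^{C\delta/\varepsilon}.\tag{8.39}$$
Since d1 is ε-periodic, this estimate is valid uniformly along horizontal lines in the strip {Im ϕ2 + δ ≤ Im ϕ ≤ Y0 − δ} (recall that Y0 = Im ϕ3).
Uniform asymptotics. As d1(ϕ) is ε-periodic in ϕ and analytic in a strip {−Y0 ≤ Im ϕ ≤ Y0}, we can expand d1 in a Fourier series with exponentially decreasing coefficients
$$d_1(\varphi)=\sum_{n\in\mathbb{Z}}\delta_n\,e^{2i\pi n\frac{\varphi}{\varepsilon}}\quad\text{where}\quad\delta_n=\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi_0+\varepsilon}d_1(\varphi)\,e^{-2i\pi n\frac{\varphi}{\varepsilon}}\,d\varphi\tag{8.40}$$
for any ϕ0 ∈ {−Y0 ≤ Im ϕ ≤ Y0}. For n ≠ 0, we get
$$|\delta_n|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\bigr|\,e^{-2\pi|n|(Y_0-\delta)/\varepsilon}\,e^{C\delta/\varepsilon}.\tag{8.41}$$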
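The mechanism behind (8.41) is the usual contour shift for Fourier coefficients of a function that is periodic and analytic in a strip. The lines below are a sketch of that step only; the symbol $P$ for the common prefactor is introduced here for brevity and does not appear in the paper.

```latex
% P := \bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi \kappa_0\,d\varphi}\bigr|  (shorthand of this sketch).
% On the horizontal line \operatorname{Im}\varphi = \operatorname{Im}\varphi_0 one has
\[
\bigl|e^{-2i\pi n\varphi/\varepsilon}\bigr| = e^{2\pi n\,\operatorname{Im}\varphi_0/\varepsilon}.
\]
% n < 0: take \operatorname{Im}\varphi_0 = Y_0-\delta and use (8.39):
\[
|\delta_n| \le \sup_{\operatorname{Im}\varphi = Y_0-\delta}|d_1|\; e^{-2\pi|n|(Y_0-\delta)/\varepsilon}
 \le C\,P\,e^{C\delta/\varepsilon}\,e^{-2\pi|n|(Y_0-\delta)/\varepsilon}.
\]
% n > 0: take \operatorname{Im}\varphi_0 = -(Y_0-\delta) and use (8.35); the same bound follows.
% Consequently, for |\operatorname{Im}\varphi| \le Y and n \ne 0,
\[
\bigl|\delta_n e^{2i\pi n\varphi/\varepsilon}\bigr|
 \le C\,P\,e^{C\delta/\varepsilon}\,e^{-2\pi|n|(Y_0-\delta-Y)/\varepsilon}.
\]
```

So each mode with n ≠ 0 is exponentially small compared with the prefactor once δ is taken small enough that the two exponents combine to a negative quantity, which is the role of the smallness condition imposed on δ.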
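To prove this estimate for n < 0, one uses (8.39) and (8.40) with Im ϕ0 = Y0 − δ. In the case of n > 0, one uses estimate (8.35) and (8.40) with Im ϕ0 = −Y0 + δ. Furthermore, by means of the estimate (8.36) and (8.40) with Im ϕ0 = δ, we get
$$\delta_0=e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi}\,e^{\int_0^\pi\omega_-^0\,d\varphi}\,(1+o(1)).\tag{8.42}$$
Choose δ < min{2π/C, Y0 − Y}. Then, (8.41) and (8.42) imply
$$d_1(\varphi)=e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi+\int_0^\pi\omega_-^0\,d\varphi}\,(1+o(1)),$$
uniformly along horizontal lines in the strip {|Im ϕ| ≤ Y}.
8.3.7. An estimate for b1(ϕ). We only estimate b1 when ε → 0.
Strip (I). Let γ be the Stokes line beginning at ϕ = ϕ1 and bordering K0 and K1. In view of (8.24), Lemma 8.1 only gives us an estimate on $f_+^0$ when we cross γ along [ϕ−, ϕ+] (see Fig. 16). This estimate is
$$|f_+^0|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa_0\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi_0}\kappa_0\,d\varphi}\bigr|\,e^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_0|\,d\varphi},\qquad\varphi\in[\varphi_-,\varphi_+]\cap K_1.$$
Recall that κ0 = κ1 in the strip (II) and that, in the strip (I), one has κ0 = −κ1. Therefore,
$$|f_+^1|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_\pi^{\varphi_1}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi_b}\kappa_0\,d\varphi-\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi_0}\kappa_0\,d\varphi}\bigr|\,e^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_0|\,d\varphi},\qquad\varphi\in[\varphi_-,\varphi_+]\cap K_1.$$
Note that the first derivatives with respect to x of the solutions satisfy the same estimates. Therefore, for ϕ on [ϕb, ϕ+] and inside the admissible subdomain of K1, we obtain
$$|w(f_+^0,f_+^1)|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa_0\,d\varphi}\bigr|\,e^{\frac{2}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_0|\,d\varphi}.$$
We have used the fact that κ0 is real on [ϕ1, π]. Now, the only restriction we have on ϕ0 is that it has to be in the admissible subdomain of K0; hence, we choose |ϕ0 − ϕ| ∼ 2δ, where 2δ is the distance between the admissible subdomains of K0 and K1. So, for −Y0 + δ ≤ Im ϕ ≤ −δ, uniformly along horizontal lines, we have
$$|b_1(\varphi)|\le C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa_0\,d\varphi}\bigr|=C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|.\tag{8.43}$$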
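Strip (II). We do not need to study b1 here.
Strip (III). In view of (8.29), Lemma 8.1 gives us only an estimate on $f_+^0$ in K1. Using the notations of Fig. 17, we get
$$|f_+^0|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi+\frac{i}{\varepsilon}\int_\pi^{\varphi_2}\kappa_0\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_2}^{\varphi_b}\kappa_0\,d\varphi}\,e^{\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi_0}\kappa_0\,d\varphi}\bigr|\,e^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_0|\,d\varphi},\qquad\varphi\in[\varphi_-,\varphi_+]\cap K_1,\tag{8.44}$$
where the last integral is taken along [ϕ−, ϕ+]. We also have
$$|f_+^1|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_\pi^{\varphi_2}\kappa_1\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_2}^{\varphi_b}\kappa_1\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_b}^{\varphi_0}\kappa_1\,d\varphi}\bigr|\,e^{\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_1|\,d\varphi},\qquad\varphi\in[\varphi_-,\varphi_+]\cap K_1.\tag{8.45}$$
In the last two integrals, κ1 is the analytic continuation of the branch κ1 through [ϕ2, ϕ3]. The derivatives of $f_+^0$ and $f_+^1$ also satisfy (8.44) and (8.45). This and the relations (8.19) and (8.25) imply that
$$|w(f_+^0,f_+^1)|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|\,\bigl|e^{\frac{2i}{\varepsilon}\int_\pi^{\varphi_2}\kappa_0\,d\varphi+\frac{2\pi i}{\varepsilon}(\varphi_0-\varphi_2)}\bigr|\,e^{\frac{2}{\varepsilon}\int_{\varphi_0}^{\varphi}|\mathrm{Im}\,\kappa_0|\,d\varphi}.$$
The main branch κ0 is positive along [π, ϕ2] and Im(ϕ0 − ϕ2) > 0. Hence, $\mathrm{Im}\bigl(\int_\pi^{\varphi_2}\kappa_0\,d\varphi+\pi\int_{\varphi_2}^{\varphi_0}d\varphi\bigr)>0$. So, there exists δ1 > 0 (independent of ε and of δ) such that
$$|b_1(\varphi)|\le C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|\,e^{-\delta_1/\varepsilon},$$
uniformly along horizontal lines in the strip Im ϕ2 + δ ≤ Im ϕ ≤ Y0 − δ.
Uniform asymptotics. Using the same method as for d1, in the strip {|Im ϕ| ≤ Y}, we get
$$b_1(\varphi)=e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\cdot o\bigl(e^{-\delta_1/\varepsilon}\bigr).$$
8.3.8. The asymptotics of a1(ϕ). By (8.15), we need to compute the asymptotics of $w(f_+^0,f_-^1)$ when ε → 0. The computations being similar to those made for b1 and d1, we give only the results.
Strip (I). For −Y0 + δ ≤ Im ϕ ≤ −δ,
$$|a_1(\varphi)|\le C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|.$$
Strip (II). For 0 < δ ≤ Im ϕ ≤ Im ϕ2 − δ, we get
$$a_1(\varphi)=e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\,e^{\int_0^\pi\omega_+^0\,d\varphi}\,(1+o(1)).$$
Strip (III). For Im ϕ2 + δ ≤ Im ϕ ≤ Y0 − δ, we get
$$|a_1(\varphi)|\le C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|.$$
Uniform asymptotics. Estimating the Fourier coefficients of a1, uniformly in the strip {|Im ϕ| ≤ Y}, we get that
$$a_1(\varphi)=e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi+\int_0^\pi\omega_+^0\,d\varphi}\,(1+o(1)).$$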
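8.3.9. The asymptotics of c1(ϕ). Now, we need to compute the asymptotics of $w(f_-^0,f_-^1)$ when ε → 0.
Strip (I). In view of (8.20), we can use Lemma 8.1 to “continue” the asymptotic expansion for $f_-^0$ from K0 to K1 along a horizontal line (see Fig. 16). This allows us to get an asymptotic of c1. We use formulae (8.20)–(8.23). The asymptotic expansion for $f_-^0$ on [ϕb, ϕ+] is
$$f_-^0(u,\varphi)=e^{-\frac{i}{\varepsilon}\int_0^{\varphi}\kappa_0\,d\varphi}\cdot q_0\,e^{\int_0^{\varphi}\omega_-^0\,d\varphi}\,(\psi_-^0+o(1)).$$
Here, we integrate along a curve going from K0 to K1 below the point ϕ1; the functions κ0, $\omega_-^0$ and q0 are the analytic continuations of these functions along this curve. We use (8.20) and compute
$$\int_0^{\varphi}\kappa_0\,d\varphi=\int_0^{\varphi_1}\kappa_0\,d\varphi-\int_{\varphi_1}^{\varphi}\kappa_1\,d\varphi=\int_0^{\varphi_1}\kappa_0\,d\varphi-\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi-\int_{\pi}^{\varphi}\kappa_1\,d\varphi.\tag{8.46}$$
Similarly, by (8.22), one obtains
$$\int_0^{\varphi}\omega_-^0\,d\varphi=\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi+\int_{\pi}^{\varphi}\omega_+^1\,d\varphi.\tag{8.47}$$
Using (8.46), (8.47), (8.21) and (8.23), we get
$$f_-^0(u,\varphi)=-i\,e^{-\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa_0\,d\varphi+\frac{i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi+\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}\,e^{\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa_1\,d\varphi}\,\bigl(A_+^1(u,\varphi)+o(1)\bigr).$$
On the other hand, we know that
$$f_-^1(u,\varphi)=e^{-\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa_1\,d\varphi}\,(A_-^1(u,\varphi)+o(1)).$$
Therefore, one has
$$w(f_-^0,f_-^1)=-i\cdot e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\,e^{\frac{2i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi}\,e^{\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}\,(w_1+o(1)),$$
that is
$$c_1(\varphi)=-i\cdot e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\,e^{\frac{2i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi}\,e^{\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}\,(1+o(1)).\tag{8.48}$$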
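As a consistency check on the exponents in (8.48), one can follow the $\kappa$-integrals through the Wronskian. This is a sketch; the shorthand $I(a,b)$ is introduced here and is not the paper's notation.

```latex
% Shorthand of this sketch: I(a,b) = \int_a^b \kappa_0\,d\varphi, \quad
% J(\varphi) = \int_\pi^\varphi \kappa_1\,d\varphi.
\begin{align*}
% exponent of f_-^0 after applying (8.46):
-\tfrac{i}{\varepsilon}\int_0^\varphi \kappa_0\,d\varphi
 &= -\tfrac{i}{\varepsilon} I(0,\varphi_1) + \tfrac{i}{\varepsilon} I(\varphi_1,\pi)
    + \tfrac{i}{\varepsilon} J(\varphi), \\
% the phi-independent part, rewritten with I(0,\varphi_1) = I(0,\pi) - I(\varphi_1,\pi):
-\tfrac{i}{\varepsilon} I(0,\varphi_1) + \tfrac{i}{\varepsilon} I(\varphi_1,\pi)
 &= -\tfrac{i}{\varepsilon} I(0,\pi) + \tfrac{2i}{\varepsilon} I(\varphi_1,\pi).
\end{align*}
% The phi-dependent factor e^{+\frac{i}{\varepsilon}J(\varphi)} multiplies A_+^1, so the
% leading term of f_-^0 is a constant multiple of that of f_+^1, while f_-^1 carries
% e^{-\frac{i}{\varepsilon}J(\varphi)}. In the Wronskian of the two leading terms these
% exponentials cancel, which is why the leading term of c_1 in (8.48) has no \varphi dependence.
```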
This holds in the strip {−Y0 + δ ≤ Im ϕ ≤ −δ}, uniformly along horizontal lines.
Strip (II). We do not need any information on c1 in this region.
Strip (III). We use Lemma 8.1 to compute the asymptotics of c1. Let the segment [ϕ−, ϕ+] be as in Fig. 17. In view of (8.29), for ϕ ∈ [ϕ−, ϕ+], Lemma 8.1 implies that
$$f_-^0=e^{-\frac{i}{\varepsilon}\int_0^{\varphi}\kappa_0\,d\varphi}\,q_0\,e^{\int_0^{\varphi}\omega_-^0\,d\varphi}\,\bigl(\psi_-^0+o(1)\bigr).\tag{8.49}$$
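In (8.49), the integration contour first goes from 0 to ϕ− in K0, then to ϕ along [ϕ−, ϕ+]. Using (8.25), we rewrite the first integral in (8.49) in the form
$$\int_0^{\varphi}\kappa_0\,d\varphi=\int_0^{\pi}\kappa_0\,d\varphi+2\int_{\pi}^{\varphi_2}\kappa_0\,d\varphi+2\pi(\varphi-\varphi_2)-\int_{\pi}^{\varphi}\kappa_1\,d\varphi.$$
Also, by (8.27), we get
$$\int_0^{\varphi}\omega_-^0\,d\varphi=\int_0^{\pi}\omega_-^0\,d\varphi+\int_{\pi}^{\varphi_2}(\omega_-^0-\omega_+^0)\,d\varphi+\int_{\pi}^{\varphi}\omega_+^1\,d\varphi.$$
Combining these formulae with (8.26) and (8.28), we obtain
$$f_-^0=i\,e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\,e^{-\frac{2\pi i}{\varepsilon}(\varphi-\pi)}\,e^{\int_0^{\varphi_2}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_2}\omega_+^0\,d\varphi}\,e^{\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa_1\,d\varphi}\,\bigl(A_+^1+o(1)\bigr).$$
Using this in conjunction with
$$f_-^1=e^{-\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa_1\,d\varphi}\,(A_-^1+o(1)),$$
we get
$$c_1(\varphi)=i\,e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\,e^{-\frac{2\pi i}{\varepsilon}(\varphi-\pi)}\,e^{\int_0^{\varphi_2}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_2}\omega_+^0\,d\varphi}\,(1+o(1)).\tag{8.50}$$
This asymptotic is valid for Im ϕ2 + δ ≤ Im ϕ ≤ Y0 − δ uniformly along horizontal lines.
Uniform asymptotics. To get global information on c1, we let
$$c_1(\varphi)=\sum_{n\in\mathbb{Z}}\gamma_n\,e^{2i\pi n\frac{\varphi-\pi}{\varepsilon}},\qquad\gamma_n=\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi_0+\varepsilon}c_1(\varphi)\,e^{2i\pi n\frac{\pi-\varphi}{\varepsilon}}\,d\varphi.\tag{8.51}$$
To study γn with n > 0, we use (8.48) and (8.51) with Im ϕ0 = −Y0 + δ. This gives
$$|\gamma_n|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\bigr|\,e^{-\frac{2\pi}{\varepsilon}|n|Y_0}.$$
In the case of n = 0, by (8.48) and (8.51) with Im ϕ0 = −δ, we get
$$\gamma_0=-i\,e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi+\frac{2i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi}\,e^{\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}\,(1+o(1)).$$
For n < 0, we use the asymptotic obtained in the strip (III) and (8.51) with Im ϕ0 = Y0 − δ. We obtain
$$\gamma_{-1}=\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi_0+\varepsilon}c_1(\varphi)\,e^{-2i\pi\frac{\pi-\varphi}{\varepsilon}}\,d\varphi=i\,e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\,e^{\int_0^{\varphi_2}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_2}\omega_+^0\,d\varphi}\,(1+o(1)),$$
and, for n < −1, one has
$$|\gamma_n|=\Bigl|\frac{1}{\varepsilon}\int_{\varphi_0}^{\varphi_0+\varepsilon}c_1(\varphi)\,e^{2in\pi\frac{\pi-\varphi}{\varepsilon}}\,d\varphi\Bigr|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\bigr|\,e^{-\frac{2\pi}{\varepsilon}|n+1|(Y_0-\delta)}.$$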
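Putting all this together, we get for ϕ in the strip {|Im ϕ| ≤ Y} (recall that Y < Y0 − δ):
$$c_1(\varphi)=-i\,e^{-\frac{i}{\varepsilon}\int_0^{\pi}\kappa_0\,d\varphi}\Bigl(e^{\frac{2i}{\varepsilon}\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi}\,e^{\int_0^{\varphi_1}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_1}\omega_+^0\,d\varphi}\,(1+o(1))-e^{-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\,e^{\int_0^{\varphi_2}\omega_-^0\,d\varphi-\int_{\pi}^{\varphi_2}\omega_+^0\,d\varphi}\,e^{-\frac{2\pi i}{\varepsilon}(\varphi-\pi)}\,(1+o(1))\Bigr),$$
where o(1) is uniform in the strip {|Im ϕ| ≤ Y}. This ends the proof of Proposition 8.2.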
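The final formula for c1 keeps exactly two Fourier modes of the expansion (8.51). The following sketch indicates why the modes with n < −1 can be absorbed into the error of the n = −1 mode; the symbol $Q$ is a shorthand introduced here, and the statement is only the termwise comparison, not a replacement for the estimates in the text.

```latex
% Q := \bigl|e^{-\frac{i}{\varepsilon}\int_0^\pi\kappa_0\,d\varphi
%            -\frac{2i}{\varepsilon}\int_\pi^{\varphi_2}(\kappa_0-\pi)\,d\varphi}\bigr|
% (shorthand of this sketch). For n < 0 one has
\[
\bigl|e^{2i\pi n\frac{\varphi-\pi}{\varepsilon}}\bigr| = e^{2\pi|n|\operatorname{Im}\varphi/\varepsilon}.
\]
% The n = -1 mode has size of order Q\,e^{2\pi\operatorname{Im}\varphi/\varepsilon}.
% For n < -1, the bound on |\gamma_n| gives, for |\operatorname{Im}\varphi| \le Y,
\[
\frac{\bigl|\gamma_n e^{2i\pi n\frac{\varphi-\pi}{\varepsilon}}\bigr|}
     {Q\,e^{2\pi\operatorname{Im}\varphi/\varepsilon}}
 \le C\,e^{\frac{2\pi}{\varepsilon}(|n|-1)\operatorname{Im}\varphi}\,
        e^{-\frac{2\pi}{\varepsilon}(|n|-1)(Y_0-\delta)}
 \le C\,e^{-\frac{2\pi}{\varepsilon}(|n|-1)(Y_0-\delta-Y)} .
\]
% Since Y < Y_0-\delta, the right-hand side is exponentially small and summable in n.
```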
8.4. The asymptotics of T2. One has
$$a_2(\varphi)=\frac{1}{w_2}\,w(f_+^1,f_-^2),\qquad b_2(\varphi)=-\frac{1}{w_2}\,w(f_+^1,f_+^2),\tag{8.52}$$
$$c_2(\varphi)=\frac{1}{w_2}\,w(f_-^1,f_-^2),\qquad d_2(\varphi)=-\frac{1}{w_2}\,w(f_-^1,f_+^2),\tag{8.53}$$
$$w_2=w(f_+^2,f_-^2)=f_-^2\,\frac{d}{du}f_+^2-f_+^2\,\frac{d}{du}f_-^2.\tag{8.54}$$
These Wronskians are ε-periodic and, together with the solutions, analytic in ϕ.
8.4.1. Three substrips. The coefficients of T2 are computed in the strip −Y0 + δ ≤ Im ϕ ≤ Y0 − δ. Here, Y0 is the same as before, and δ is an arbitrarily small positive constant that is independent of ε. Again we divide this strip into three smaller strips denoted respectively by (I), (II) and (III), see Fig. 18.
Fig. 18. Going from K1 to K2
8.4.2. Properties of analytic objects. Recall that the branches κ2, $\psi_\pm^2$, $\omega_\pm^2$ and q2 are defined as analytic continuations of κ1, $\psi_\pm^1$, $\omega_\pm^1$ and q1 from K1 to K2 through their common part (or along the periodic curve described in Fig. 7 and Sect. 7.3.4). In the strip (II), K1 and K2 do intersect. In K1 ∩ K2, we have
$$\kappa_1=\kappa_2,\qquad\psi_\pm^1=\psi_\pm^2,\qquad\omega_\pm^1=\omega_\pm^2,\qquad q_2=q_1.$$
Consider the strip (I). Consider the common boundary of K2 and K1 in this strip (see Fig. 18). It is the Stokes line beginning at 2π − ϕ1 and going upwards. Along this line, one has
$$\kappa_1(\varphi-0)=-\kappa_2(\varphi+0),\qquad\psi_\pm^1(u,\varphi-0)=\psi_\mp^2(u,\varphi+0),\qquad\omega_\pm^1(\varphi-0)=\omega_\mp^2(\varphi+0),\qquad q_1(\varphi-0)=i\,q_2(\varphi+0),$$
and Im κ2(ϕ + 0) = −Im κ1(ϕ − 0) > 0. In the strip (III), the common boundary of K2 and K1 is a part of the interval [ϕ2, ϕ3]. This interval is the Stokes line joining ϕ2 to ϕ3. Along this line, one has
$$\kappa_1(\varphi-0)=2\pi-\kappa_2(\varphi+0),\qquad\psi_\pm^1(u,\varphi-0)=\psi_\mp^2(u,\varphi+0),\qquad\omega_\pm^1(\varphi-0)=\omega_\mp^2(\varphi+0),\qquad q_1(\varphi-0)=-i\,q_2(\varphi+0)$$
and Im κ2(ϕ + 0) = −Im κ1(ϕ − 0) > 0.
8.4.3. Auxiliary canonical domains. When applying Lemma 8.1 to compute T2, we use two auxiliary canonical domains: (1) the domain symmetric to $\tilde K_0$ with respect to the point π; (2) the domain $\overline{K_2}$ which is symmetric to K2 with respect to the real line and symmetric to the auxiliary canonical domain $\overline{K_0}$ with respect to the point π.
8.4.4. Partial results. Now, we list the asymptotics of the coefficients of T2 in the strips (I)–(III). As before, δ is a sufficiently small, fixed, positive constant such that δ < Y0 − Y, and Y is as in Theorem 2.2.
Coefficient a2. Both in the strip δ ≤ Im ϕ ≤ Y0 − δ and in the strip −Y0 + δ ≤ Im ϕ ≤ −Im ϕ2 − δ, we get the estimate
$$|a_2(\varphi)|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_{2\pi-\varphi_1}^{2\pi}\kappa_2\,d\varphi}\bigr|\,e^{C\delta/\varepsilon}.$$
It is uniform along horizontal lines. We get also the uniform asymptotics
$$a_2(\varphi)=e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\,e^{\int_{\pi}^{2\pi}\omega_+^2\,d\varphi}\,(1+o(1)),\qquad-\mathrm{Im}\,\varphi_2+\delta\le\mathrm{Im}\,\varphi\le-\delta.$$
Coefficient b2. We get two asymptotic formulae
$$b_2(\varphi)=i\,e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi-\varphi_1}\kappa_1\,d\varphi+\frac{i}{\varepsilon}\int_{2\pi}^{2\pi-\varphi_1}\kappa_2\,d\varphi+\int_{\pi}^{2\pi-\varphi_1}\omega_+^1\,d\varphi-\int_{2\pi}^{2\pi-\varphi_1}\omega_-^2\,d\varphi}\,(1+o(1)),$$
as δ ≤ Im ϕ ≤ Y0 − δ, and
$$b_2(\varphi)=-i\,e^{\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa_1-\pi)\,d\varphi-\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi+\int_{\pi}^{\varphi_2}\omega_+^1\,d\varphi+\int_{\varphi_2}^{2\pi}\omega_-^2\,d\varphi}\,e^{\frac{2\pi i}{\varepsilon}(\varphi-\pi)}\,(1+o(1)),$$
as −Y0 + δ < Im ϕ < −Im ϕ2 − δ. These asymptotics are uniform along horizontal lines.
Coefficient c2. We get only estimates uniform along the horizontal lines. If δ ≤ Im ϕ ≤ Y0 − δ, we have
$$|c_2(\varphi)|\le C\,e^{C\frac{\delta}{\varepsilon}}\,\bigl|e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\bigr|,$$
and, if −Y0 + δ ≤ Im ϕ ≤ −Im ϕ2 − δ, we have
$$|c_2(\varphi)|\le C\,\bigl|e^{\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\bigr|\,e^{-\delta_2/\varepsilon}.$$
Here, δ2 is a positive constant independent of ε and δ.
Coefficient d2. We get the estimates
$$|d_2(\varphi)|\le C\,\bigl|e^{-\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\bigr|\,e^{C\delta/\varepsilon}.$$
They are uniform along horizontal lines both in the strip δ ≤ Im ϕ ≤ Y0 − δ and in the strip −Y0 + δ ≤ Im ϕ ≤ −Im ϕ2 − δ. And, in the strip −Im ϕ2 + δ ≤ Im ϕ ≤ −δ, we get the uniform asymptotics
$$d_2(\varphi)=e^{-\frac{i}{\varepsilon}\int_{\pi}^{2\pi}\kappa_2\,d\varphi}\,e^{\int_{\pi}^{2\pi}\omega_-^2\,d\varphi}\,(1+o(1)).$$
Uniform asymptotics for the coefficients of T2 are obtained by analyzing their Fourier series. This leads to the formulae announced in Proposition 8.3.
8.5. The computation of $\tilde M$. Let K := K0 ∪ K1 ∪ K2. This domain is represented in Fig. 19. Note that it is cut along the Stokes lines shown by the bold lines. It is simply connected and regular. Continue analytically the main branch of the complex momentum from K0 to K. Denote the analytic continuation simply by κ. By definition, the functions κj, j = 0, 1, 2, are the restrictions of κ to Kj, i.e. $\kappa|_{K_j}=\kappa_j$. Similarly, we introduce the functions ω± single valued and analytic on K so that $\omega_\pm|_{K_j}=\omega_\pm^j$. These functions κ and ω± have the following symmetry properties:
$$\kappa(2\pi-\varphi)=\kappa(\varphi)\quad\text{and}\quad\omega_\pm(2\pi-\varphi)=-\omega_\pm(\varphi).\tag{8.55}$$
Equation (8.55) follows from the analogous properties of κ1 and $\omega_\pm^1$, see (7.9) and (7.12).
Fig. 19. The domain K
To get the asymptotics of the matrix $\tilde M$ described in Proposition 8.1, we proceed as follows:
• in the asymptotic formulae for T1 and T2, we replace κj and $\omega_\pm^j$, j = 1, 2, by κ and ω± respectively;
• we compute the product T1(ϕ + 2π)T2(ϕ + 2π) (compare with (8.5));
• in the formulae thus obtained, we simplify the integrals by means of the relations (8.55).
As a result, after a rather long, but elementary calculation, we get
$$m_{11}(\varphi)=\tilde t_h\cdot e^{\frac{i}{\varepsilon}\tilde\Phi}\Bigl(1+o(1)+o\bigl(e^{-\frac{i}{\varepsilon}\tilde\Phi}\,e^{-(c_1+c_2)/\varepsilon}\bigr)\Bigr),$$
$$m_{12}(\varphi)=iG\Bigl(e^{\frac{i}{\varepsilon}\tilde\Phi}\bigl(1+o(1)+o\bigl(e^{-\frac{i}{\varepsilon}\tilde\Phi}\,e^{-c_1/\varepsilon}\bigr)\bigr)-\tilde t_v\,e^{-i\theta+\frac{2\pi i}{\varepsilon}(\varphi+\pi)}\,(1+o(1))\Bigr),$$
$$m_{21}(\varphi)=-iG^{-1}\Bigl(e^{\frac{i}{\varepsilon}\tilde\Phi}\bigl(1+o(1)+o\bigl(e^{-\frac{3i}{2\varepsilon}\tilde\Phi}\,e^{-c_2/\varepsilon}\bigr)\bigr)-\tilde t_v\,e^{i\theta-\frac{2\pi i}{\varepsilon}(\varphi+\pi)}\,(1+o(1))\Bigr),$$
$$m_{22}(\varphi)=\frac{1}{\tilde t_h}\Bigl(e^{\frac{i}{\varepsilon}\tilde\Phi}(1+o(1))+e^{-\frac{i}{\varepsilon}\tilde\Phi}(1+o(1))+o\bigl(\tilde t_v^{\,2}\,e^{-\frac{i}{\varepsilon}\tilde\Phi}\bigr)\Bigr)-\frac{\tilde t_v}{\tilde t_h}\Bigl(e^{-i\theta+\frac{2\pi i}{\varepsilon}(\varphi+\pi)}(1+o(1))+e^{i\theta-\frac{2\pi i}{\varepsilon}(\varphi+\pi)}(1+o(1))\Bigr).$$
Here, c1, c2 > 0 and we have set
$$\tilde t_h=e^{\frac{2i}{\varepsilon}\int_0^{\varphi_1}\kappa\,d\varphi},\qquad\tilde t_v=e^{-\frac{i}{\varepsilon}\int_{\overline{\varphi_2}}^{\varphi_2}(\kappa-\pi)\,d\varphi},\qquad\tilde\Phi=\int_{\varphi_1}^{2\pi-\varphi_1}\kappa\,d\varphi,$$
$$G=\exp\Bigl(\int_0^{\varphi_1}(\omega_+-\omega_-)\,d\varphi\Bigr),\qquad\theta=i\int_{\varphi_1}^{\varphi_2}(\omega_+-\omega_-)\,d\varphi.\tag{8.56}$$
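The way the symmetry (8.55) collapses the integrals can be illustrated on the two simplest cases. This is a sketch under two assumptions that are not spelled out at this point of the text: that ϕ2 lies on the line π + iR (so that 2π − ϕ2 is its complex conjugate), and that the integral defining $\tilde t_v$ runs from the conjugate point $\overline{\varphi_2}$ to ϕ2.

```latex
% (a) The total phase over one period. By \kappa(2\pi-\varphi)=\kappa(\varphi),
%     the substitution \varphi \mapsto 2\pi-\varphi gives
\[
\int_{2\pi-\varphi_1}^{2\pi}\kappa\,d\varphi=\int_{0}^{\varphi_1}\kappa\,d\varphi,
\qquad\text{hence}\qquad
\int_0^{2\pi}\kappa\,d\varphi = 2\int_0^{\varphi_1}\kappa\,d\varphi+\tilde\Phi,
\qquad
e^{\frac{i}{\varepsilon}\int_0^{2\pi}\kappa\,d\varphi}=\tilde t_h\,e^{\frac{i}{\varepsilon}\tilde\Phi}.
\]
% (b) The vertical tunneling integral. On \pi+i\mathbb{R} one has 2\pi-\varphi=\overline{\varphi},
%     so the same substitution maps [\overline{\varphi_2},\pi] onto [\pi,\varphi_2]:
\[
\int_{\overline{\varphi_2}}^{\pi}(\kappa-\pi)\,d\varphi=\int_{\pi}^{\varphi_2}(\kappa-\pi)\,d\varphi,
\qquad\text{hence}\qquad
\tilde t_v=e^{-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi_2}(\kappa-\pi)\,d\varphi}.
\]
```

Item (a) is the combination that appears as the leading factor of $m_{11}$, and item (b) is the form in which $\tilde t_v$ can be compared with an action integral taken along the segment [π, ϕ2].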
All the integrals are taken along curves in K. We note that κ is real along the interval [ϕ1, 2π − ϕ1]. So, $\tilde\Phi$ is real for real E, and all the terms of the form $o(e^{\ldots})$ are in fact o(1) in a sufficiently small constant neighborhood of E0. This implies the asymptotic formulae of Proposition 8.1 if one checks that $\tilde t_h=t_h$, $\tilde t_v=t_v$ and $\tilde\Phi=\Phi$; this is done in Sects. 9.1.1 and 9.1.2.
8.5.1. Completing the proof of Proposition 8.1. To finish the proof of Proposition 8.1, we check that ϕ(0) is real for E ∈ R. Therefore, we prove that $\theta=i\int_{\varphi_1}^{\varphi_2}(\omega_+-\omega_-)\,d\varphi$ is real. One has
$$\overline{\theta}=-i\,\overline{\int_{\varphi_1}^{\varphi_2}(\omega_+-\omega_-)\,d\varphi}=-i\int_{\varphi_1}^{\overline{\varphi_2}}\bigl(\overline{\omega_+(\overline{\varphi})}-\overline{\omega_-(\overline{\varphi})}\bigr)\,d\varphi.$$
As $\overline{\omega_+(\overline{\varphi})}=\omega_-(\varphi)$ (see (7.12)), and $\int_{\overline{\varphi_2}}^{\varphi_2}\omega_\pm\,d\varphi=0$ (see (8.55)), one has $\overline{\theta}=i\int_{\varphi_1}^{\varphi_2}(\omega_+-\omega_-)\,d\varphi=\theta$. This completes the proof of Proposition 8.1. □
8.6. Getting a monodromy matrix of the form (2.14). To get such a monodromy matrix, we change the consistent basis. Recall that, using Theorem 6.1, we have constructed the solution $f_+^0(u,\varphi)$ with the “standard” asymptotics in K0. Denote it by f0. Consider the function $f_0^*(u,\varphi)=\overline{f_0(u,\overline{\varphi})}$. It satisfies Eq. (7.1) and the consistency condition (6.3). First, we compute T0(ϕ), the matrix of $(f_0,f_0^*)$ in the basis $(f_+^0,f_-^0)$. This matrix is ε-periodic and analytic. Then, we check that the Wronskian of f0 and $f_0^*$ is a nonvanishing constant. So, the basis $(f_0,f_0^*)$ is consistent; the monodromy matrix for this basis is the one described in Theorem 2.2. It is given by
$$M(\varphi)=T_0(\varphi+2\pi)\,\tilde M(\varphi)\,T_0(\varphi)^{-1},\tag{8.57}$$
where $\tilde M$ is the monodromy matrix for the basis $(f_+^0,f_-^0)$. To get the asymptotics of M, we use the asymptotics of T0 and $\tilde M$. As $f_0=f_+^0$, the matrix T0 has the form
$$T_0(\varphi)=\begin{pmatrix}1&0\\a_0&b_0\end{pmatrix},\tag{8.58}$$
where a0 and b0 are defined by $f_0^*=a_0f_+^0+b_0f_-^0$ and, thus, are given by the formulae:
$$a_0=\frac{w(f_-^0,f_0^*)}{w(f_-^0,f_+^0)}\quad\text{and}\quad b_0=\frac{w(f_0^*,f_+^0)}{w(f_-^0,f_+^0)}=\frac{w(f_0^*,f_0)}{w(f_-^0,f_+^0)}.\tag{8.59}$$
When computing a0 and b0, as in Sect. 8.5, we use the functions κ, ω±, ψ± and q defined by analytic continuation of κ0, $\omega_\pm^0$, $\psi_\pm^0$ and q0 from K0 to K = K0 ∪ K1 ∪ K2. As before, δ is a sufficiently small (but independent of ε) positive number. Let us first compute a0. First, note that $K_0\cap\overline{K_0}$ contains the rectangle R1 = {ϕ ∈ K0; |Im ϕ| ≤ Im ϕ3 − δ, |Re ϕ| ≤ δ}. In this rectangle, we know the asymptotic of f0. To get the one of $f_0^*$, we note that
$$\overline{\kappa(\overline{\varphi})}=-\kappa(\varphi),\qquad\overline{\psi_+(u,\overline{\varphi})}=\psi_+(u,\varphi),\qquad\overline{\omega_+(u,\overline{\varphi})}=\omega_+(u,\varphi),\qquad\overline{q(\overline{\varphi})}=i\,q(\varphi),\qquad\varphi\in R_1.\tag{8.60}$$
The first of these relations follows as κ is purely imaginary on (0, ϕ1). The second is valid as the branches of the Bloch solution ψ(x, E) are real for E < E1. The third one follows from the second one and the definitions of ω±. The last one follows from the choice of the branch of q0: $q_0\in e^{-i\pi/4}\mathbb{R}_+$ on (0, ϕ1), see Sect. 7.3.3. By (8.16), (8.17) and (8.60), we get
$$f_0^*(u,\varphi)=i\,e^{\frac{i}{\varepsilon}\int_0^{\varphi}\kappa\,d\varphi}\,(A_+^0(u,\varphi)+o(1)).\tag{8.61}$$
Comparing this result with the asymptotics of $f_0=f_+^0$, see (8.16), we obtain the uniform formula
$$a_0=i+o(1),\qquad\varphi\in R_1.\tag{8.62}$$
From the ε-periodicity of a0, we deduce that this asymptotics is valid and uniform in the whole strip {|Im ϕ| ≤ Im ϕ3 − δ}. We now turn to the asymptotics of b0. Therefore, we compute the asymptotics of f0 in K1 ∩ {−Im ϕ3 + δ ≤ Im ϕ ≤ −δ}. By definition of T1 and (8.6), we know that $f_0=a_1f_+^1+b_1f_-^1$. The asymptotics of a1, b1 and of $(f_+^1,f_-^1)$ are known in K1 ∩ {−Im ϕ3 + δ ≤ Im ϕ ≤ −δ} (see (8.16) and Proposition 8.2). So, we get that
$$\begin{aligned}f_0(u,\varphi)&=e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa\,d\varphi}\,e^{\int_0^{\pi}\omega_+\,d\varphi}\,e^{\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa\,d\varphi}\,(A_+^1(u,\varphi)+o(1))+o(1)\,e^{\frac{i}{\varepsilon}\int_0^{\pi}\kappa\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_{\pi}^{\varphi}\kappa\,d\varphi}\,(A_-^1(u,\varphi)+o(1))\\&=e^{\frac{i}{\varepsilon}\int_0^{\varphi}\kappa\,d\varphi}\Bigl(A_+^0(u,\varphi)+o(1)+o(1)\,e^{-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi}\kappa\,d\varphi}\,(A_-^1(u,\varphi)+o(1))\Bigr)\\&=e^{\frac{i}{\varepsilon}\int_0^{\varphi}\kappa\,d\varphi}\,(A_+^0(u,\varphi)+o(1)).\end{aligned}\tag{8.63}$$
Here, we have used the fact that the coefficient $c=e^{-\frac{2i}{\varepsilon}\int_{\pi}^{\varphi}\kappa\,d\varphi}$ is (exponentially) small. Indeed, the domain K1 is canonical. So, there is a canonical curve connecting ϕ with the uppermost point of K1. As Im ϕ < 0, it intersects the interval (ϕ1, 2π − ϕ1) which is a Stokes line. So, in the formula for c, we can integrate from ϕ to this interval along the canonical line, and, then, along this interval to π. Now, the definitions of the canonical lines and the Stokes lines imply the smallness of c. As a result of the computation (8.63), we see that the leading term of the asymptotics of f0 in K1 ∩ {−Im ϕ3 + δ ≤ Im ϕ ≤ −δ} is obtained by analytic continuation from K0 across the common part of K0 and K1. Consider the “rectangle” R2 = {ϕ1 + δ ≤ Re (ϕ) ≤ π − δ, δ ≤ |Im ϕ| ≤ Im ϕ3 − δ}. We can assume that it is contained in K0 ∪ K1. We have
$$f_0=e^{\frac{i}{\varepsilon}\int_0^{\varphi}\kappa\,d\varphi}\,(A_+^0(u,\varphi)+o(1)),\qquad\varphi\in R_2.\tag{8.64}$$
In R2 ∩ {Im ϕ < 0}, this follows from (8.63), and for Im ϕ > 0, this is just the asymptotics from (8.16). Now, we get from (8.64) the asymptotics of $f_0^*$ in R2. We note that
$$\overline{\kappa(\overline{\varphi})}=\kappa(\varphi),\qquad\overline{\psi_-(u,\overline{\varphi})}=\psi_+(u,\varphi),\qquad\overline{\omega_-(u,\overline{\varphi})}=\omega_+(u,\varphi),\qquad\overline{q(\overline{\varphi})}=q(\varphi),\qquad\varphi\in R_2.\tag{8.65}$$
The first of these relations follows as κ is real on (ϕ1, 2π − ϕ1). The second is valid as the branches of the Bloch solution ψ(x, E) differ by the complex conjugation on the spectral bands. The third one follows from the second one and the definitions of ω±. The last one follows from the choice of the branch of q0, see Sect. 7.3.3. The relations (8.65) and (8.60), and the asymptotics (8.64) imply that
$$\begin{aligned}f_0^*(u,\varphi)&=e^{\frac{i}{\varepsilon}\int_0^{\varphi_1}\kappa\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_{\varphi_1}^{\varphi}\kappa\,d\varphi}\,e^{\int_0^{\varphi_1}(\omega_+-\omega_-)\,d\varphi}\,(A_-^0(u,\varphi)+o(1))\\&=e^{\frac{2i}{\varepsilon}\int_0^{\varphi_1}\kappa\,d\varphi}\,e^{\int_0^{\varphi_1}(\omega_+-\omega_-)\,d\varphi}\,e^{-\frac{i}{\varepsilon}\int_0^{\varphi}\kappa\,d\varphi}\,(A_-^0(u,\varphi)+o(1)).\end{aligned}$$
The asymptotics is uniform for ϕ ∈ R2. Now, using the asymptotics of f0 and $f_0^*$, formula (8.2) defining G and formula (9.3) for Sh, we get
$$w(f_0^*,f_0)=t_h\,G\,w(f_-^0,f_+^0)\,(1+o(1)).\tag{8.66}$$
As the Wronskian is analytic and periodic in ϕ, this asymptotic is valid and uniform in the strip {|Im ϕ| ≤ Im ϕ3 − δ}. We see that the leading term of the Wronskian is independent of ϕ. The factor 1 + o(1) can depend on ϕ. Recall that in order to define the monodromy matrix, we need the Wronskian to be constant. To “correct” the situation, we slightly modify f0. Denote the factor 1 + o(1) from (8.66) by g. The factor g is ε-periodic in ϕ as the Wronskians are (the coefficients th and G are constant). Formula (8.66) shows that g is real analytic up to a constant factor g1 = 1 + o(1) (as $w(f_0^*,f_0)$ is real analytic, th and G are real, and $t_hG\,w(f_-^0,f_+^0)$ is independent of ϕ). So, we can (and we do) redefine f0 by dividing it by the real analytic and ε-periodic factor $\sqrt{g}/\sqrt{g_1}=1+o(1)$ so that $w(f_0^*,f_0)=t_hG\,w(f_-^0,f_+^0)\,g_1$ be constant. The new functions f0 and $f_0^*$ form a consistent basis and still have the “old” asymptotics in R1 and R2. So, for the new solutions, we get the “old” formulas for a0 and b0. Now, combining (8.58) with the asymptotics obtained for a0 and b0, we get
$$T_0(\varphi)=\begin{pmatrix}1&0\\i+o(1)&t_hG(1+o(1))\end{pmatrix},\qquad|\mathrm{Im}\,\varphi|\le\mathrm{Im}\,\varphi_3-\delta.$$
Being obtained by means of the asymptotics of $f_\pm^0$ (constructed by Theorem 6.1) and the asymptotics of a1 and b1 given by Proposition 8.2, the asymptotics of T0 stay uniform in a constant neighborhood of E = E0 ∈ J. As the monodromy matrix M is associated to the basis $(f_0,f_0^*)$, it takes the form
$$M(\varphi)=\begin{pmatrix}a(\varphi)&b(\varphi)\\b^*(\varphi)&a^*(\varphi)\end{pmatrix}.$$
Using the asymptotics of T0, of the monodromy matrix $\tilde M$ obtained in Proposition 8.1 and formula (8.57), one computes the coefficients a and b to obtain
$$a(\varphi)=\frac{1}{t_h}\Bigl(e^{i\Phi/\varepsilon}(1+o(1))-t_v\,e^{\frac{2i\pi}{\varepsilon}(\varphi-\varphi^{(0)})}(1+o(1))\Bigr),\qquad b(\varphi)=\frac{i}{t_h}\Bigl(e^{i\Phi/\varepsilon}(1+o(1))-t_v\,e^{\frac{2i\pi}{\varepsilon}(\varphi-\varphi^{(0)})}(1+o(1))\Bigr).\tag{8.67}$$
The asymptotics are valid uniformly in the strip |Im (ϕ)| ≤ Y if δ < Im (ϕ3) − Y (Y is as in Theorem 2.2). Now, to get the statement of Theorem 2.2, one passes to the variable φ = ϕ/ε. The asymptotic representations for a0, b0, $\tilde a$ and $\tilde b$ described in Theorem 2.2 follow from (8.67) and standard estimates of the Fourier coefficients. □
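The passage from $\tilde M$ to M in (8.57) only involves the first row of the conjugated matrix. Below is a sketch at the level of leading terms; it uses the lower triangular form (8.58) of T0, the leading values a0 ≈ i and b0 ≈ th G, the leading terms of m11 and m12 as they are read off from Sect. 8.5, and the equalities $\tilde t_h=t_h$, $\tilde t_v=t_v$, $\tilde\Phi=\Phi$. The abbreviation $X$ and the identification of $\varphi^{(0)}$ in the last line are inferred here from the formulas and are not quoted from the paper.

```latex
% Inverse of the lower triangular matrix (8.58):
\[
T_0^{-1}=\begin{pmatrix}1&0\\ -a_0/b_0 & 1/b_0\end{pmatrix},
\qquad\text{so the first row of } T_0(\varphi+2\pi)\,\tilde M\,T_0^{-1} \text{ is }
\Bigl(m_{11}-\tfrac{a_0}{b_0}\,m_{12},\ \tfrac{1}{b_0}\,m_{12}\Bigr).
\]
% Leading terms, with X := e^{\frac{i}{\varepsilon}\Phi}-t_v\,e^{-i\theta+\frac{2\pi i}{\varepsilon}(\varphi+\pi)}:
\begin{align*}
m_{11}&\approx t_h\,e^{\frac{i}{\varepsilon}\Phi}, & m_{12}&\approx iG\,X,
& a_0&\approx i, & b_0&\approx t_hG,\\
a&\approx t_h\,e^{\frac{i}{\varepsilon}\Phi}-\frac{i}{t_hG}\,iG\,X
   = t_h\,e^{\frac{i}{\varepsilon}\Phi}+\frac{X}{t_h}, &
b&\approx \frac{iG\,X}{t_hG}=\frac{i}{t_h}\,X.
\end{align*}
% Since t_h is small, a \approx X/t_h to leading order. The exponent in X has the form
% \frac{2\pi i}{\varepsilon}(\varphi-\varphi^{(0)}) when \varphi^{(0)}=\frac{\varepsilon\theta}{2\pi}-\pi.
```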
9. Properties of the Phase and Actions Integrals. The Phase Diagrams
First, we study properties of the phase integral Φ and the tunneling actions Sv and Sh. Then, we investigate the location of the asymptotic mobility edges and prove Theorem 2.6 and formula (2.26).
9.1. Properties of the action integrals.
9.1.1. Properties of the phase integral and tunneling coefficients. We begin with discussing the phase integral Φ. We prove its properties described by Lemma 2.1 and the representation (2.13) of Φ by a contour integral. The branch κ∗ chosen in Sect. 2.6 to define the phase integral is actually the branch κ defined in Sect. 8.5. Indeed, (1) κ coincides with κ0 on K0 and, in particular, on (−ϕ1, ϕ1). By (7.4) and as the main branch κ0 ∈ iR+ on (0, ϕ1), we see that κ ∈ iR+ on (−ϕ1, ϕ1). (2) κ coincides with κ1 on K1 and, in particular, on the cross $(\varphi_1,2\pi-\varphi_1)\cup(\overline{\varphi_2},\varphi_2)$. By (7.9), as κ1 coincides with the main branch κ0 on the part of this cross (ϕ1, π] ∪ [π, ϕ2), and as κ0 ∈ (0, π) there, we see that κ ∈ (0, π) on the cross. Now, let us prove the properties of Φ described in Lemma 2.1. The positivity of Φ is obvious. Check that Φ′(E) > 0 on J. Note that
$$\Phi=2\int_{\varphi_1}^{\pi}\kappa_0\,d\varphi,\tag{9.1}$$
where κ0 is the main branch of the complex momentum, and κ0(ϕ) = k0(E − α cos(ϕ)); here, k0 is the main branch of the Bloch quasi-momentum. Therefore,
$$\Phi'(E)=2\int_{\varphi_1}^{\pi}k_0'(E-\alpha\cos(\varphi))\,d\varphi.\tag{9.2}$$
Note that, since k0(E) has square root branch points at the ends of the spectral zones, the integral in (9.2) converges. As k0′(E) > 0 inside the first spectral zone, Eq. (9.2) implies the positivity of Φ′ inside J. The analyticity of Φ in E will be checked by means of the representation (2.13) of Φ by a contour integral taken along a closed curve γp corresponding to Fig. 3. Let us describe the choice of the branch of the complex momentum in (2.13) and prove (2.13). It suffices to assume that E is real and to consider contours γp symmetric with respect to the real axis and to the line π + iR. The branch of the complex momentum in (2.13) coincides with the main one on the part of the contour γp situated in the domain {0 < Re ϕ < π; Im ϕ > 0}. We have to show that it can be analytically continued to a branch continuous on γp.
Denote by a the right point of the intersection of γp and R. As we can continue κ on γp without this point, we need only to prove that the branch thus obtained is continuous at a. This follows from the relation κ(2π − ϕ) = κ(ϕ), ϕ ∈ γp. This relation itself follows from the fact that the main branch κ0 is real on the segments (π, ϕ2) and $(\pi,\overline{\varphi_2})$ of the line π + iR. Note that $\overline{\kappa(\overline{\varphi})}=-\kappa(\varphi)$, ϕ ∈ γp; it follows from (7.5). Now, it suffices to deform γp so that it goes around the interval [ϕ1, 2π − ϕ1] just along this interval. As the main branch is real along this interval, the last relation implies (2.13). The representation (2.13) implies the analyticity of Φ in E as γp can be a closed curve going around [ϕ1, 2π − ϕ1] and staying at a positive distance from the branch points. We have completed the analysis of Φ. The analysis of the tunneling actions Sv and Sh is done in the same way. In particular, one has
$$S_v(E)=2i\int_{\pi}^{\varphi_2}(\kappa_0(\varphi)-\pi)\,d\varphi,\qquad S_h(E)=-2i\int_0^{\varphi_1}\kappa_0(\varphi)\,d\varphi,\tag{9.3}$$
where, in the first integral, we integrate along the segment [π, ϕ2] of the line π + iR. Point 4 of Lemma 2.1 follows from the representation for Sv in (9.3) as, on the integration contour, 0 < κ0 < π.
9.1.2. Functions $\tilde\Phi$, $\tilde t_v$ and $\tilde t_h$. In view of (9.1) and (9.3), the functions $\tilde\Phi$, $\tilde t_v$ and $\tilde t_h$ defined by (8.56) satisfy the relations
$$\tilde\Phi=\Phi,\qquad\tilde t_v=t_v,\qquad\tilde t_h=t_h,$$
needed to complete the computations of the monodromy matrix in Sect. 8.5.
9.1.3. Dependence of the action integrals on α. We now study the actions Sh and Sv defined in (2.10) as functions of α and E. It will be convenient to change variables (α, E) → (α, t) so that E = E1 + tα, and to keep the notations Sh(α, t) and Sv(α, t) for the functions obtained from Sh(α, E) and Sv(α, E) by this change of variables. Representations (9.3) imply that
$$S_v(\alpha,t)=2i\int_{\pi}^{\varphi_2}\bigl(k_0(E_1+\alpha(t-\cos(\varphi)))-\pi\bigr)\,d\varphi,\qquad S_h(\alpha,t)=-2i\int_0^{\varphi_1}k_0(E_1+\alpha(t-\cos(\varphi)))\,d\varphi,\tag{9.4}$$
where k0 is the main branch of the Bloch quasi-momentum, see Sect. 5.2.2. The domain S defined by (2.25) becomes K := {(α, t); −1 ≤ t ≤ 1, 0 < α, (1 + t)α ≤ E2 − E1}. The formulae (9.4) show that the actions Sh(α, t) and Sv(α, t) are real analytic in (α, t) in int (K), the interior of K, and continuous at its boundary. One proves
Lemma 9.1. In int (K), one has
$$\frac{\partial S_v}{\partial\alpha}<0,\qquad\frac{\partial S_h}{\partial\alpha}>0,\qquad\text{and}\qquad\frac{\partial S}{\partial\alpha}=\frac{\partial(S_v-S_h)}{\partial\alpha}<0.\tag{9.5}$$
Fig. 20. The domain K
Proof of Lemma 9.1. Taking (9.4) into account, we compute
$$\partial_\alpha S_h=\frac{\partial S_h}{\partial\alpha}=-2i\int_0^{\varphi_1}k_0'(E_1+\alpha(t-\cos(\varphi)))\,(t-\cos(\varphi))\,d\varphi,\qquad\partial_\alpha S_v=\frac{\partial S_v}{\partial\alpha}=2i\int_{\pi}^{\varphi_2}k_0'(E_1+\alpha(t-\cos(\varphi)))\,(t-\cos(\varphi))\,d\varphi.$$
We notice that
• for ϕ ∈ (0, ϕ1) ⊂ R, we have E − α cos(ϕ) < E1; hence, −ik0′(E − α cos(ϕ)) < 0 (see Fig. 5). Furthermore, the inequality E − α cos(ϕ) < E1 is equivalent to t − cos(ϕ) < 0. Hence, ∂αSh ≥ 0. Moreover, ∂αSh = 0 if and only if ϕ1 = 0, i.e. if and only if t = 1.
• for ϕ ∈ (π, ϕ2) ⊂ {π + iR}, we have E1 < E − α cos(ϕ) < E2; hence, k0′(E − α cos(ϕ)) > 0 (see Fig. 5). On the other hand, E1 < E − α cos(ϕ) means that t − cos(ϕ) > 0. Hence, ∂αSv ≤ 0. Moreover, ∂αSv = 0 if and only if ϕ2 = π, i.e. if and only if (1 + t)α = E2 − E1.
This completes the proof of Lemma 9.1.
' &
9.2. The phase diagram. Now, we study the asymptotic mobility edges: we prove Theorem 2.6 and formula (2.26). We use the notations of Sect. 9.1.3.

9.2.1. The curve S = 0. Lemma 9.1 implies that, for every t ∈ [−1, 1], there exists at most one α such that (α, t) ∈ K and S(α, t) = 0. Let us study the behavior of S on the boundary of K. The boundary ∂K is made of four curves
(1) Γ0: α = 0, −1 ≤ t ≤ 1;
(2) Γ1: t = −1, 0 < α;
(3) Γ2: t = 1, 0 < α < α∗, α∗ = (E2 − E1)/2;
(4) Γ3: α∗ ≤ α, (1 + t)α = E2 − E1.
Note that in terms of α and t, the relations defining ϕ1 and ϕ2 have the form:
\[ \cos\varphi_1=t,\qquad \cos\varphi_2=t-\frac{E_2-E_1}{\alpha}. \tag{9.6} \]
Now, we can study S on the boundary of K.
• On Γ3, one has ϕ2 = π, ϕ1 ≥ 0, and ϕ1 = 0 only at (α∗, 1); hence, by (9.4), S < 0 except at the point (α∗, 1), where S = 0.
• Consider Sv and Sh near Γ0, i.e. for α small and t ∈ [−1, 1]. As usual, one has ϕ1 ∈ [0, π], and, by (9.4), Sh(α, t) stays bounded. On the other hand, uniformly for t ∈ [−1, 1], one has Sv(α, t) → +∞ as α → 0. Indeed, pick 0 < β < E2 − E1. For sufficiently small α, t − β/α < −1, and we can define ϕβ ∈ π + iR+ by cos ϕβ = t − β/α. Then, on the interval Iβ = {ϕ ∈ π + iR; 0 ≤ Im ϕ ≤ Im ϕβ}, one has k0(E1 + α(t − cos ϕ)) − π ≤ k0(E1 + β) − π < 0. Hence, by (9.4), one has Sv(α, t) ≥ 2(π − k0(E1 + β)) Im ϕβ. This implies that Sv(α, t) → +∞ as α → 0. Hence, for α sufficiently small, S is positive.
• The statements of Lemma 9.1 remain valid on Γ1.
• On Γ1, one has ϕ1 = π; and, when α → +∞, one has ϕ2 → π, hence, Sv → 0; on the other hand, Sh → +∞; so S is negative for α large enough.
• On Γ2, one has ϕ1 = 0 and Im ϕ2 > 0. So, on Γ2, S > 0.
As a conclusion of this discussion, we see that, for every t ∈ [−1, 1], there exists a unique α(t) such that S(α(t), t) = 0 and (α(t), t) ∈ K. For t = 1, α(t) = α∗; for t = −1, 0 < α(t) < ∞; for −1 < t < 1, (α(t), t) ∈ int(K). The analyticity of the curve t → α(t) is a consequence of the analyticity of S, Lemma 9.1 and the Local Inversion Theorem. This completes the proof of Theorem 2.6. □

9.2.2. Mobility edges near the point (α∗, 1). Consider the asymptotic mobility edge S in a neighborhood of the point (α∗, 1). When proving Theorem 2.6, we have seen that Sh(α∗, 1) = Sv(α∗, 1) = 0. Let (mi)i=1,2 be the effective masses corresponding to (Ei)i=1,2, the ends of the first spectral band of the periodic Schrödinger operator (see Sect. 5.2.2). One can obtain the Taylor formula for Sh and Sv at (α∗, 1):
\[ S_v(\alpha,t)\sim\frac{\pi}{2}\sqrt{\frac{2m_2}{\alpha^*}}\,\bigl(2(\alpha^*-\alpha)+\alpha^*(1-t)\bigr)\quad\text{and}\quad S_h(\alpha,t)\sim\frac{\pi}{2}\sqrt{2\alpha^*m_1}\,(1-t),\qquad \alpha\sim\alpha^*,\ t\sim1, \]
where we have omitted the standard error terms. As S = Sv − Sh, this implies (2.26).

10. Positive Lyapunov Exponent

Let (M(φ, ε))0 0 and α ∈ (0, 1) independent of ε and such that |α(ε)| ≤ α and |β(ε)| ≤ β;
• m(ε) = sup_{φ∈S1} ‖M1(φ, ε)‖ → 0 as ε → 0. Here, ‖·‖ is the matrix norm associated to the vector norm ‖v‖ = |v1| + |v2| for v = (v1, v2)^T ∈ C².
Then, there exist C > 0 and ε1 > 0 such that, if 0 < ε < ε1, one has
\[ \gamma>\log\lambda(\varepsilon)-Cm(\varepsilon). \tag{10.3} \]
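Proof. To prove this result, we follow the ideas of Herman ([26]) as extended by Sorets and Spencer ([40]). One computes
\[ M_0(\varepsilon)=U(\varepsilon)\cdot\begin{pmatrix}1&0\\0&\alpha(\varepsilon)\end{pmatrix}\cdot U(\varepsilon)^{-1},\quad\text{where}\quad U(\varepsilon)=\begin{pmatrix}1&\dfrac{-\beta(\varepsilon)}{1-\alpha(\varepsilon)}\\[2mm]0&1\end{pmatrix}. \]
Under our assumptions on M0, the family (U(ε))_{ε∈(0,ε0)} is uniformly bounded. Let n0 be as in Proposition 10.1. Put
\[ \tilde M(\phi,\varepsilon)=e^{-i2\pi n_0\phi}\cdot U(\varepsilon)^{-1}\cdot M(\phi,\varepsilon)\cdot U(\varepsilon). \tag{10.4} \]
Note that, in S1,
\[ \tilde M(\phi,\varepsilon)=\lambda(\varepsilon)\left[\begin{pmatrix}1&0\\0&\alpha(\varepsilon)\end{pmatrix}+\tilde M_1(\phi,\varepsilon)\right],\qquad \sup_{\phi\in S_1}\|\tilde M_1\|\le\mathrm{const}\cdot m(\varepsilon). \tag{10.5} \]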
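Denote by P̃N(φ, ε) the matrix cocycle associated to the pair (M̃, h),
\[ \tilde P_N(\phi,\varepsilon)=\tilde M(\phi+Nh,\varepsilon)\cdots\tilde M(\phi+h,\varepsilon)\cdot\tilde M(\phi,\varepsilon). \]
Hence, by (10.4) and (10.2), one has
\[ \gamma=\lim_{N\to+\infty}\frac1N\int_0^1\log\|\tilde P_N(\theta,\varepsilon)\|\,d\theta. \tag{10.6} \]
Define the function gN by
\[ g_N(\phi,\varepsilon)=\left\langle\begin{pmatrix}1\\0\end{pmatrix},\ \tilde P_N(\phi,\varepsilon)\begin{pmatrix}1\\0\end{pmatrix}\right\rangle, \]
where the angular brackets denote the scalar product in R². Then,
\[ \int_0^1\log\|\tilde P_N(\theta,\varepsilon)\|\,d\theta\ \ge\ \int_0^1\log|g_N(\theta,\varepsilon)|\,d\theta. \tag{10.7} \]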
Let us now recall a version of Jensen’s formula proved in [40].

Lemma 10.1 ([40]). Pick 0 < ρ < 1. Let f be an analytic function in the annulus A = {z ∈ C; ρ ≤ |z| ≤ 1} and continuous on A. Let f ≠ 0 on the boundary of A. Let (rj)j denote the roots of f inside A. Then, one has
\[ \int_0^1\log|f(e^{i2\pi\theta})|\,d\theta=\int_0^1\log|f(\rho e^{i2\pi\theta})|\,d\theta-\sum_{r_j\in A}\log|r_j|-\frac{1}{2i\pi}\int_{|z|=\rho}\frac{f'(z)}{f(z)}\,dz\cdot\log\rho. \tag{10.8} \]
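As a numerical illustration (not part of the original argument), the identity (10.8) can be checked on a simple polynomial. The sketch below assumes a hypothetical test function f with one root in the annulus and one root inside the inner disc; the helper names, the quadrature by equally spaced points and the chosen roots are all assumptions made for the example.

```python
import cmath
import math

def circle_average_log_abs(f, radius, n=4000):
    """Approximate the integral over theta in [0, 1] of log|f(radius * e^{2 pi i theta})|."""
    total = 0.0
    for j in range(n):
        z = radius * cmath.exp(2j * math.pi * j / n)
        total += math.log(abs(f(z)))
    return total / n

def winding_term(f, df, radius, n=4000):
    """Approximate (1 / (2 pi i)) times the contour integral of f'/f over |z| = radius."""
    total = 0j
    for j in range(n):
        z = radius * cmath.exp(2j * math.pi * j / n)
        # dz = 2 pi i z dtheta, so the factors 2 pi i cancel
        total += df(z) / f(z) * z
    return total / n

rho = 0.5
r_in_annulus = 0.8 * cmath.exp(1j)  # hypothetical root with rho < |r| < 1
r_inside = 0.2                      # hypothetical root with |r| < rho

def f(z):
    return (z - r_in_annulus) * (z - r_inside)

def df(z):
    return (z - r_in_annulus) + (z - r_inside)

# Left and right hand sides of (10.8) for this test function
lhs = circle_average_log_abs(f, 1.0)
rhs = (circle_average_log_abs(f, rho)
       - math.log(abs(r_in_annulus))
       - winding_term(f, df, rho).real * math.log(rho))
```

Here the last term of (10.8) counts the root lying inside the circle |z| = ρ, while the sum only involves the root located in the annulus.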
We let z = e^{2πiφ}. As gN is analytic and 1-periodic in φ ∈ S, the relation fN(z, ε) = gN(φ, ε) defines a function analytic in the annulus 1 ≥ |z| ≥ e^{−2πy1/ε}. We take ρ = e^{−2πy/ε}, where y satisfies y0 < y < y1, and apply Lemma 10.1 to fN(z, ε). Taking into account the fact that the contribution of the zeroes of fN to (10.8) is non-positive, we get
\[ \int_0^1\log|g_N(\theta,\varepsilon)|\,d\theta\ \ge\ \int_0^1\log|f_N(\rho e^{i2\pi\theta},\varepsilon)|\,d\theta+\frac{y}{i\varepsilon}\int_{|z|=\rho}\frac{f_N'(\zeta,\varepsilon)}{f_N(\zeta,\varepsilon)}\,d\zeta. \tag{10.9} \]
We are now left with estimating fN and fN′ on the circles Cy = {|z| = e^{−2πy/ε}}, y0 < y < y1. Let M̃1 be as in (10.5). We define inductively {ak, bk}, k ≥ 0, so that a0 = b0 = 0, and
\[ (1+b_k)\begin{pmatrix}1\\a_k\end{pmatrix}=\left[\begin{pmatrix}1&0\\0&\alpha(\varepsilon)\end{pmatrix}+\tilde M_1(\phi+h(k-1),\varepsilon)\right]\begin{pmatrix}1\\a_{k-1}\end{pmatrix},\qquad k\ge1. \tag{10.10} \]
Then, one checks that fN(z, ε) = λ(ε)^N (1 + b_{N+1}) · · · (1 + b1). We use

Lemma 10.2. For ε sufficiently small and z ∈ Cy, y0 ≤ y ≤ y1, one has
\[ |a_k|\le\sqrt{m(\varepsilon)}\quad\text{and}\quad|b_k|\le2m(\varepsilon),\qquad k\in\mathbb N. \tag{10.11} \]
Proof. Lemma 10.2 is proved inductively. Obviously, (10.11) holds for k = 0. Let us assume that it holds up to rank k. Equation (10.10) implies
\[ 1+b_{k+1}=1+\left\langle\begin{pmatrix}1\\0\end{pmatrix},\ \tilde M_1(\phi+hk,\varepsilon)\begin{pmatrix}1\\a_k\end{pmatrix}\right\rangle,\qquad (1+b_{k+1})\,a_{k+1}=\alpha(\varepsilon)a_k+\left\langle\begin{pmatrix}0\\1\end{pmatrix},\ \tilde M_1(\phi+hk,\varepsilon)\begin{pmatrix}1\\a_k\end{pmatrix}\right\rangle. \]
Using the assumptions on α(ε) and m(ε), one gets
\[ |a_{k+1}|\le\frac{1}{|1+b_{k+1}|}\bigl[\alpha|a_k|+m(\varepsilon)(1+|a_k|)\bigr]\quad\text{and}\quad|b_{k+1}|\le m(\varepsilon)(1+|a_k|). \tag{10.12} \]
Using the assumption on m(ε) and the induction assumption, for ε sufficiently small, one gets |b_{k+1}| ≤ 2m(ε) uniformly in k. Plugging this into the first inequality from (10.12), for ε sufficiently small, one obtains
\[ |a_{k+1}|\le\frac{1}{1-2m(\varepsilon)}\Bigl(\alpha\sqrt{m(\varepsilon)}+m(\varepsilon)\bigl(1+\sqrt{m(\varepsilon)}\bigr)\Bigr)\le\sqrt{m(\varepsilon)}, \]
uniformly in k. By induction, this completes the proof of Lemma 10.2. □
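The induction above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the perturbation matrices are drawn at random with norm at most m (in the paper they are the values of M̃1 along the orbit φ + hk), the coefficient α is taken constant and real, and the function names are hypothetical. It iterates the recursion (10.10) and records the largest values of |a_k| and |b_k|, to be compared with the bound 2m for b_k and the bound √m for a_k that the induction step yields when α√m + m(1 + √m) ≤ (1 − 2m)√m.

```python
import cmath
import math
import random

def random_perturbation(m, rng):
    """Random complex 2x2 matrix whose norm induced by |v1| + |v2|
    (the maximal column sum of moduli) is at most m."""
    cols = []
    for _ in range(2):
        total = m * rng.random()          # budget of this column, at most m
        share = rng.random()
        moduli = (total * share, total * (1 - share))
        cols.append([r * cmath.exp(2j * math.pi * rng.random()) for r in moduli])
    # cols[j][i] is the entry in row i, column j
    return [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]

def iterate(alpha, m, steps, seed=0):
    """Iterate (1 + b_k)(1, a_k)^T = [diag(1, alpha) + M_k](1, a_{k-1})^T
    and return the largest |a_k| and |b_k| encountered."""
    rng = random.Random(seed)
    a = 0j
    max_a = 0.0
    max_b = 0.0
    for _ in range(steps):
        M = random_perturbation(m, rng)
        one_plus_b = 1 + M[0][0] + M[0][1] * a
        a = (alpha * a + M[1][0] + M[1][1] * a) / one_plus_b
        max_a = max(max_a, abs(a))
        max_b = max(max_b, abs(one_plus_b - 1))
    return max_a, max_b

max_a, max_b = iterate(alpha=0.5, m=0.01, steps=10000)
```

With α = 0.5 and m = 0.01 the induction step closes, so both recorded maxima stay below their respective bounds for every sequence of admissible perturbations, not only for the random one used here.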
As an immediate consequence of (10.10) and Lemma 10.2, we get that, for any k, the functions ak and bk are analytic in φ in the strip {y0/ε ≤ Im φ ≤ y1/ε}. Hence, by the estimate (10.11) for bk and by the Cauchy estimates, the derivative of bk satisfies the estimate |bk′| ≤ Const · ε · m(ε) in some smaller strip {ỹ0/ε ≤ Im φ ≤ ỹ1/ε}. As bk is 1-periodic in φ, we can also consider it as an analytic function of z = e^{2πiφ}. Then
\[ \frac{db_k}{dz}=\frac{1}{2\pi iz}\frac{db_k}{d\phi}. \]
So, there exists C > 0 such that, if ρ = e^{−2πy/ε} for ỹ0 < y < ỹ1, one has
\[ \frac1\varepsilon\left|\int_{|z|=\rho}\frac{f_N'(z,\varepsilon)}{f_N(z,\varepsilon)}\,dz\right|\le\frac1\varepsilon\sum_{k=1}^N\left|\int_{|z|=\rho}\frac{b_k'(z)}{1+b_k(z)}\,dz\right|\le CNm(\varepsilon) \]
and
\[ \int_0^1\log|f_N(\rho e^{i2\pi\theta},\varepsilon)|\,d\theta=N\log\lambda(\varepsilon)+\sum_{k=1}^N\int_0^1\log|1+b_k(\rho e^{i2\pi\theta})|\,d\theta \]
≥ N log λ(ε) − CN m(ε). Plugging this into Eq. (10.9), one completes the proof of Proposition 10.1 by means of (10.7) and (10.6). □

11. A KAM Theory Construction

11.1. The model equation. Let φ ∈ C. Recall that a matrix function M(φ) ∈ GL(2, C) belongs to M if it is of the form
\[ \begin{pmatrix}a&b\\b^*&a^*\end{pmatrix}. \]
We consider 1-periodic functions in M depending on a parameter η in B, a Borel subset of R. For functions M that are analytic in the strip {|Im φ| ≤ r} and that depend on η ∈ B, we consider their Lipschitz norms
\[ \|M\|_{r,B}\equiv\sup_{|\mathrm{Im}\,\phi|\le r,\ \eta\in B}\|M(\phi,\eta)\|+\sup_{|\mathrm{Im}\,\phi|\le r,\ \eta,\eta'\in B,\ \eta\ne\eta'}\frac{\|M(\phi,\eta)-M(\phi,\eta')\|}{|\eta-\eta'|}. \tag{11.1} \]
Here, ‖·‖ is the matrix norm associated to the vector norm ‖v‖ = |v1| + |v2| for v = (v1, v2)^T ∈ C². If M does not depend on φ, we write ‖M‖_B instead of ‖M‖_{r,B}.
Let I be a bounded real interval. Let D and A in M be two matrix-valued functions of φ ∈ C such that
(1) the matrices A and D depend on a parameter η ∈ I,
\[ D=\operatorname{diag}(\exp(i\eta),\exp(-i\eta)),\qquad A=A(\phi,\eta); \tag{11.2} \]
(2) the matrix A(φ, η) is analytic in φ in a strip |Im φ| ≤ r and ‖A‖_{r,I} ≤ 1.
Let λ be a parameter taking positive values, and let h be a fixed positive number. Consider the equation
\[ A(\phi+h)=(D+\lambda A)A(\phi),\qquad\phi\in\mathbb R. \tag{11.3} \]
Our aim is to study matrix solutions of Eq. (11.3) for small values of λ. To this end, we use standard KAM theory ideas avoiding small neighborhoods of the KAM resonances (see [11, 2]). This allows us to construct solutions of (11.3) outside of some set of η of small measure. As usual for KAM methods, we impose a Diophantine condition on the number h. We fix σ ∈ (0, 1) and define
\[ H_\sigma=\left\{h\in(0,1);\ \min_{l\in\mathbb N}\left|h-\frac lk\right|\ge\frac{\lambda^\sigma}{k^3},\ k=1,2,3,\dots\right\}. \tag{11.4} \]
We will assume that h ∈ Hσ.

Remark 11.1. Consider the complement of Hσ in (0, 1). For λ < 1, it is contained in the union of open intervals I_{k,l},
\[ (0,1)\setminus H_\sigma\subset\bigcup_{k=1}^{\infty}\bigcup_{l=0}^{k}I_{k,l}, \]
centered at the points l/k and of length 2λ^σ/k³. Hence, one gets
\[ \mathrm{mes}\,\bigl((0,1)\setminus H_\sigma\bigr)\le\sum_{k=1}^{\infty}\frac{2\lambda^\sigma}{k^3}(k+1)=C\cdot\lambda^\sigma. \tag{11.5} \]
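To make the constant in (11.5) concrete, the sum can be evaluated numerically, and the condition (11.4) can be tested for finitely many denominators. The sketch below is only an illustration: the function names and the truncation parameter k_max are assumptions, and a finite test can of course only detect a violation of (11.4), not prove membership in Hσ.

```python
import math

def complement_measure_bound(lam, sigma, k_max=100000):
    """Partial sum of the series in (11.5): sum over k of (2 lam^sigma / k^3)(k + 1)."""
    return sum(2 * lam**sigma * (k + 1) / k**3 for k in range(1, k_max + 1))

def satisfies_condition(h, lam, sigma, k_max=1000):
    """Check |h - l/k| >= lam^sigma / k^3 for all k <= k_max.
    The minimum over l is attained at the integer nearest to h * k."""
    for k in range(1, k_max + 1):
        l = round(h * k)
        if abs(h - l / k) < lam**sigma / k**3:
            return False
    return True

# With lam = 1 the bound reduces to the constant C of (11.5),
# which equals 2 * (zeta(2) + zeta(3)), roughly 5.694.
constant = complement_measure_bound(1.0, 0.5)

golden = (math.sqrt(5) - 1) / 2
golden_ok = satisfies_condition(golden, lam=0.01, sigma=0.5)   # badly approximable number
rational_ok = satisfies_condition(0.5, lam=0.01, sigma=0.5)    # fails already at k = 2
```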
In the sequel, C denotes a constant depending only on I , σ and r (but not on η, φ, h, λ or A). We prove Proposition 11.1. Let h, A and D be chosen as above; assume that det(D + λA) ≡ 1. Then, there exists λ0 = λ0 (r, σ, I ) such that, for 0 < λ < λ0 , there is a Borel set ∞ ⊂ I , ∞ = ∞ (r, σ, I ), mes ∞ ≤ Cλσ/2 ,
(11.6)
such that, for η ∈ B∞ = I \ ∞, Eq. (11.3) has a solution of the form
\[ A(\phi,\eta)=U(\phi,\eta)\begin{pmatrix}e^{i\eta_\infty\cdot\phi/h}&0\\0&e^{-i\eta_\infty\cdot\phi/h}\end{pmatrix}, \]
where
• U is defined and analytic for |Im φ| < r/2 and satisfies ‖U − 1‖_{r/2, B∞} ≤ Cλ^{1−σ};
• η∞ is a real valued Lipschitz function of η that satisfies ‖η∞(η) − η‖_{B∞} ≤ Cλ.
11.2. The proof of Proposition 11.1. For a 1-periodic matrix
\[ M(\phi)=\begin{pmatrix}\alpha&\beta\\\beta^*&\alpha^*\end{pmatrix},\quad\text{we let}\quad\operatorname{Diag}M=\begin{pmatrix}\alpha_0&0\\0&\alpha_0^*\end{pmatrix}, \]
where α0 is the mean value of the periodic function α. Let τh be the shift operator defined by τh f(φ) = f(φ + h).

11.2.1. The induction procedure. One proves Proposition 11.1 inductively. First, let us explain the heuristics guiding the KAM induction. The aim is to transform equation (11.3) with small matrix λA into an equation A∞(φ + h) = D∞A∞(φ) with a constant diagonal matrix D∞. To this end, we search for a solution of (11.3) in the form A0 = (I + U1(φ))A1, where U1 is 1-periodic and small. This leads to the equation
\[ A_1(\phi+h)=(I+\tau_hU_1(\phi))^{-1}(D+\lambda A)(I+U_1(\phi))A_1(\phi). \]
As U1 and λA are small, one has
\[ (I+\tau_hU_1)^{-1}(D+\lambda A)(I+U_1)=D+(DU_1-\tau_hU_1D+\lambda A)+\dots, \]
where the dots denote lower order terms. So, if DU1 − τhU1D + λA = 0, then (11.3) is replaced by a new equation of the same form, namely, A1(φ + h) = (D + A1(φ))A1(φ). The new matrix A1 is smaller than λA. We prove that one can find a periodic solution U of the equation DU − τhUD + A = 0 if Diag A = 0.

Let us turn to the induction. First, we let
\[ D_0=D+\lambda\operatorname{Diag}A,\qquad A_0=\lambda A-\lambda\operatorname{Diag}A. \]
At the kth step of the induction, we define two matrix-valued functions Uk and Bk so that
\[ (I+\tau_hU_k(\phi))^{-1}(D_{k-1}+A_{k-1}(\phi))(I+U_k(\phi))=D_{k-1}+B_k(\phi), \tag{11.7} \]
and set
\[ D_k=D_{k-1}+\operatorname{Diag}B_k,\qquad A_k(\phi)=B_k(\phi)-\operatorname{Diag}B_k. \tag{11.8} \]
As Uk we take a periodic solution of the homological equation
\[ \tau_hU_k(\phi)D_{k-1}-D_{k-1}U_k(\phi)=A_{k-1}(\phi). \tag{11.9} \]
Then, by (11.7), one has
\[ B_k(\phi)=(I+\tau_hU_k(\phi))^{-1}A_{k-1}(\phi)\,U_k(\phi). \tag{11.10} \]
We check that Uk and Ak quickly converge to zero and that the product
\[ P(\phi)=\prod_{k=1}^{\infty}(I+U_k(\phi))=(I+U_1(\phi))\,(I+U_2(\phi))\,(I+U_3(\phi))\dots \tag{11.11} \]
converges to an invertible matrix. As a result, we obtain
\[ P^{-1}(\phi+h)(D+\lambda A(\phi))P(\phi)=D_\infty, \tag{11.12} \]
where D∞ is a constant diagonal matrix. This immediately implies that (11.3) has a solution of the form
\[ A(\phi)=P(\phi)\exp(\phi D_\infty/h). \tag{11.13} \]
We prove that this solution has all the properties announced in Proposition 11.1.
11.2.2. Homological equation and small denominators. Consider the homological equation (11.9). At each step of the induction procedure, Ak is 1-periodic in φ. Therefore, we look for 1-periodic solutions of (11.9). This leads to a small denominator problem. We write the homological equation as
\[ \tau_hU(\phi)\tilde D-\tilde DU(\phi)=\tilde A(\phi). \tag{11.14} \]
We prove

Lemma 11.1. Let Ã and D̃ belong to M, and let
• D̃ be diagonal, D̃ = diag(d̃, d̃∗), where d̃ does not depend on φ;
• Diag Ã ≡ 0;
• Ã be analytic in a strip |Im φ| ≤ r̃;
• Ã and D̃ be Lipschitz functions of a parameter η ∈ B (B, a Borel subset of R) satisfying ‖Ã‖_{r̃,B} < ∞, ‖D̃‖_B < ∞, and inf_{η∈B} |d̃| > 0.
If the number h satisfies the Diophantine condition (11.4), and the parameter η is outside of the set
k,l∈Z
λσ/2 η ∈ B : |α(η) − hk − l| < , max2 {1, k}
α(η) =
1 ˜ arg d(η), π (11.15)
then, there exists C0 > 0, a universal constant, and U ∈ M, a solution of the homological equation (11.14) analytic in the strip |Im φ| < r˜ and satisfying the estimate U r˜ −ρ, B\˜
˜ r˜ ,B D ˜ B A ≤ C0 ˜2 λσ ρ 5 inf η∈B |d|
˜ 2 D B ρ + (1 + ρ ) 1 + ˜2 inf η∈B |d| 2
5
∀ρ ∈ (0, r˜ ).
, (11.16)
Proof. We are looking for U = ((Uij))_{1≤i,j≤2} ∈ M. Let u = U11 and v = U12. Then, Eq. (11.14) is equivalent to the system of equations
\[ \tau_hu-u=a/\tilde d, \tag{11.17} \]
\[ e^{-2i\pi\alpha}\,\tau_hv-v=b/\tilde d, \tag{11.18} \]
where a = Ã11, b = Ã12 are the coefficients of Ã = ((Ãij))_{1≤i,j≤2}. To solve the first equation, consider the Fourier series of a, i.e. $a=\sum_{k\ne0}a_ke^{2i\pi k\phi}$. The zeroth term vanishes as Diag Ã = 0. The Lipschitz norm of ak is estimated by
\[ \|a_k\|_B\le e^{-2\pi\tilde r|k|}\,\|\tilde A\|_{\tilde r,B}. \tag{11.19} \]
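The 1-periodic solution of (11.17) is given by
\[ u=\frac{1}{\tilde d}\sum_{k\ne0}\frac{a_k\,e^{2\pi ik\phi}}{e^{2\pi ikh}-1}. \]
This representation and (11.19) imply that
\[ \|u\|_{\tilde r-\rho,B}\le\left\|\frac{1}{\tilde d}\right\|_B\|\tilde A\|_{\tilde r,B}\sum_{k\ne0}\frac{e^{-2\pi\rho|k|}}{|e^{2\pi ikh}-1|}\le\frac{\|\tilde A\|_{\tilde r,B}\,\|\tilde D\|_B}{\inf_{\eta\in B}|\tilde d|^2}\sum_{k\ne0}\frac{e^{-2\pi\rho|k|}}{|e^{2\pi ikh}-1|}. \tag{11.20} \]
As $\frac{1}{|e^{2\pi ikh}-1|}\le C\lambda^{-\sigma}k^2$ and $\sum_{k\ne0}k^2e^{-2\pi\rho|k|}\le C/\rho^3$, 0 < ρ, there exists C > 0, a universal constant, such that
\[ \|u\|_{\tilde r-\rho,B}\le C\,\frac{\|\tilde A\|_{\tilde r,B}\,\|\tilde D\|_B}{\lambda^\sigma\rho^3\inf_{\eta\in B}|\tilde d|^2},\quad\text{where }0<\rho<\tilde r. \tag{11.21} \]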
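The Fourier solution of (11.17) can be sketched in a few lines for a trigonometric polynomial right hand side. This is an illustration under assumptions: the coefficients a_k, the value of d̃ and the shift h below are arbitrary choices, and the helper names are hypothetical.

```python
import cmath
import math

def solve_shift_equation(coeffs, h, d):
    """Given the Fourier coefficients {k: a_k} (k != 0) of a, return those of the
    1-periodic solution u of u(phi + h) - u(phi) = a(phi) / d,
    namely u_k = a_k / (d * (e^{2 pi i k h} - 1))."""
    solution = {}
    for k, a_k in coeffs.items():
        if k == 0:
            raise ValueError("the mean value of a must vanish")
        solution[k] = a_k / (d * (cmath.exp(2j * math.pi * k * h) - 1))
    return solution

def evaluate(coeffs, phi):
    """Evaluate the trigonometric polynomial with coefficients {k: c_k} at phi."""
    return sum(c * cmath.exp(2j * math.pi * k * phi) for k, c in coeffs.items())

# Hypothetical data: a few nonzero modes, a unimodular d and an irrational shift
a = {1: 1 + 0.5j, -2: 0.3, 3: -0.2j}
h = (math.sqrt(5) - 1) / 2
d = cmath.exp(0.7j)
u = solve_shift_equation(a, h, d)

# Residual of the equation at a few sample points
residuals = [abs(evaluate(u, phi + h) - evaluate(u, phi) - evaluate(a, phi) / d)
             for phi in (0.0, 0.13, 0.5, 0.77)]
```

The denominators e^{2πikh} − 1 are the small divisors controlled by the Diophantine condition (11.4); for a shift h close to a rational l/k the corresponding coefficient u_k becomes large.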
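In terms of the Fourier coefficients of the function b, the 1-periodic solution of Eq. (11.18) can be written as
\[ v=\frac{1}{\tilde d}\sum_{k\in\mathbb Z}\frac{b_k\,e^{2\pi ik\phi}}{e^{2\pi i(kh-\alpha(\eta))}-1},\quad\text{and}\quad\|b_k\|_B\le e^{-2\pi\tilde r|k|}\,\|\tilde A\|_{\tilde r,B}. \]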
The function v is estimated by 1 1 −2πρ|k| ˜ vr˜ −ρ,B\˜ ≤ Ar˜ ,B e . 2πi(kh−α(η)) ˜ e − 1 B\˜ d B k∈Z On the other hand, one has 2 1 1 2πi(hk−α(η)) ≤ sup − 1B , e2πi(kh−α(η)) − 1 ˜ e2πi(kh−α(η)) − 1 e ˜ B\ η∈B\ and ˜ B ≤1+ e2πi(hk−α(η)) − 1B = e2πihk d˜ ∗ /d˜ − 1B ≤ 1 + d˜ ∗ /d
By (11.15), one also has 2 1 ≤ Cλ−σ max{1, k 4 }, sup 2πi(kh−α(η)) e − 1 ˜ η∈B\ k 4 e−2πρ|k| ≤ C/ρ 5 , 0 < ρ. k>0
and
˜ 2 D B . ˜2 inf η∈B |d|
Finally, one obtains vr˜ −ρ,B\˜
˜ r˜ ,B D ˜ B A ≤C ˜2 λσ ρ 5 inf η∈B |d|
˜ 2 D B 1+ ˜2 inf η∈B |d|
(1 + ρ 5 ),
0 < ρ < r˜ . (11.22)
Estimates (11.21) and (11.22) imply (11.16) with a universal constant. This completes the proof of Lemma 11.1. & ' 11.2.3. Induction procedure: Estimates for Ak , Uk and Dk . At each step of the induction procedure, we use Lemma 11.2 to construct Uk , a solution of the homological equation (11.9) for the matrices Ak−1 and Dk−1 . Then, (11.10) and (11.8) yield the matrices Ak and Dk . The price of the use of Lemma 11.2 is twofold: • first, we can construct Uk only outside some set of values of η; let Bk−1 be the set denoted by B in Lemma 11.1 when A˜ = Ak and D˜ = Dk . The set of “bad” values of η “thrown away” at the kth step is described by (11.15) with d˜ = (Dk−1 )11 ; denote this set by k ; then, one has λσ/2 k = η ∈ Bk−1 : |αk−1 (η) − hl − m| < , max(l 2 , 1) l,m∈Z
Bk−1 = I \
k−1
j ,
B0 = I,
(11.23)
j =1
where
αk (η) =
1 arg dk (η), π
dk = (Dk )11 ;
(11.24)
note that, for all k ≥ 0, one has B_{k+1} ⊂ B_k.
• second, by formula (11.16) we can control only ‖U‖_{r̃−ρ, B\˜}, but not ‖U‖_{r̃, B\˜}; so, if A_{k−1} was constructed in a strip |Im φ| ≤ r_{k−1}, then to get a good enough norm control, we have to consider Uk in a smaller strip |Im φ| ≤ rk, 0 < rk < r_{k−1}; we choose
\[ r_k=r-\sum_{l=1}^{k}\rho_l,\qquad r_0=r,\qquad\rho_l=\frac{r}{2^{l+1}}; \]
note that one has
\[ r_\infty\equiv r-\sum_{l=1}^{\infty}\rho_l=r/2. \tag{11.25} \]
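The choice of the widths in (11.25) is easy to tabulate. The short sketch below (the function name is hypothetical) computes the sequence r_k and shows that it decreases to r/2, in agreement with the closed form r_k = r/2 + r/2^{k+1} obtained by summing the geometric series.

```python
def strip_widths(r, steps):
    """Return [r_0, r_1, ..., r_steps] with r_0 = r and r_k = r_{k-1} - r / 2^(k+1)."""
    widths = [r]
    for l in range(1, steps + 1):
        widths.append(widths[-1] - r / 2 ** (l + 1))
    return widths

widths = strip_widths(1.0, 40)
```

For r = 1 the first values are 1, 0.75, 0.625, 0.5625, so half of the initial width is kept after infinitely many steps.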
We prove

Lemma 11.2. Let k ≥ 0. Assume that B_{k+1} is nonempty. Let C0 be the constant obtained in Lemma 11.1 and define
\[ S(k)=\sum_{l=0}^{k}\frac{2+l}{2^l},\qquad S=\lim_{k\to\infty}S(k),\qquad Q=2^5,\qquad C(r)=24C_0(10+r^2+r^5)/r^5. \]
Assume λ < 1/4 and
\[ \lambda^{1-\sigma}\,2C(r)\,Q^S<1/2. \tag{11.26} \]
Then, for k ≥ 0, one has
\[ \|B_{k+1}\|_{r_{k+1},B_{k+1}}\le\frac{1}{2C(r)}\,\lambda^\sigma\,Q^{2^kS(k)}\,F(k+1), \tag{11.27} \]
\[ \|U_{k+1}\|_{r_{k+1},B_{k+1}}\le\frac14\,Q^{k/2+1+2^{k-1}S(k)}\,F(k), \tag{11.28} \]
where $F(k)=\bigl(2C(r)\lambda^{1-\sigma}\bigr)^{2^k}$.

Proof. Equations (11.27) and (11.28) are proved inductively. To simplify the notations, we write ‖Ak‖, ‖Dk‖, ‖Uk‖ and ‖Bk‖ instead of ‖Ak‖_{Bk,rk}, ‖Dk‖_{Bk}, ‖Uk‖_{Bk,rk} and ‖Bk‖_{Bk,rk}. Let us check (11.27) – (11.28) for k = 0. One has
\[ \|D_0\|\le1+\lambda<5/4,\qquad\inf_{\eta\in B_0}|d_0|\ge1-\lambda>3/4,\qquad\|A_0\|\le2\lambda. \]
Using this and Lemma 11.1 with Ã = A0, D̃ = D0 and ρ = ρ1, one gets a very rough estimate
\[ \|U_1\|<\frac{20}{9}\,C_0\,\frac{\|A_0\|}{\lambda^\sigma\rho_1^5}\left[\rho_1^2+\Bigl(1+\frac{25}{9}\Bigr)(1+\rho_1^5)\right]<\frac{20}{9}\,C_0\,\|A_0\|\,\lambda^{-\sigma}Q^2\,\frac{r^2+10+r^5}{r^5}<\frac12\,Q^2C(r)\lambda^{1-\sigma}. \tag{11.29} \]
So (11.28) is proved for k = 0. To estimate ‖B1‖, we compare (11.29) with the second condition on λ; this implies that ‖U1‖ < 1/8 (since Q² < Q^S). Therefore, (11.10) implies that ‖B1‖ < (8/7)‖U1‖‖A0‖. And using (11.29) and ‖A0‖ ≤ 2λ, we see that ‖B1‖ < 2C(r)Q²λ^{2−σ}. This proves (11.27) for k = 0. Now, assume that (11.27) and (11.28) are valid for 1 ≤ l ≤ k. We check that (11.27) and (11.28) then hold for k + 1. Begin with checking that
\[ \|D_k\|\le3/2,\qquad\inf_{\eta\in B_k}|d_k|\ge1/2. \]
By (11.8), one has
\[ \|D_k\|\le\|D_k-D_{k-1}\|+\|D_{k-1}-D_{k-2}\|+\dots+\|D_1-D_0\|+\|D_0\|\le\sum_{j=0}^{k-1}\|B_{j+1}\|+\|D_0\|. \tag{11.30} \]
Using (11.27) to estimate this last sum, we obtain
\[ \sum_{j=0}^{k-1}\|B_{j+1}\|\le\sum_{j=0}^{k-1}\frac{1}{2C(r)}\,\lambda^\sigma\,Q^{2^jS(j)}\,F(j+1). \tag{11.31} \]
By (11.26), the terms in the right hand sum above are super exponentially decaying. So, it is roughly of order of the first term, and we need only a very rough estimate:
\[ Q^{2^jS(j)}F(j+1)\le\frac{\bigl(2C(r)\lambda^{1-\sigma}Q^S\bigr)^{2^{j+1}}}{Q^{2^jS}}\le\frac{\bigl(2C(r)\lambda^{1-\sigma}Q^S\bigr)^{2(j+1)}}{Q^S}\le\frac{\bigl(2C(r)\lambda^{1-\sigma}Q^S\bigr)^{2}}{Q^S}\cdot\frac{1}{2^{2j}}, \]
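where, in the last step, we have used (11.26). This and (11.31) imply that $\sum_{j=0}^{k-1}\|B_{j+1}\|$ 0 (depending on r, σ, I) such that, for sufficiently small λ, one has
\[ \|\alpha_{k+1}-\alpha_k\|_{B_{k+1}}\le C\lambda^\sigma(C\lambda^{1-\sigma})^{k+1},\qquad k\ge0, \tag{11.36} \]
\[ \|\alpha_k-\eta/\pi\|_{B_k}\le C\lambda,\qquad k\ge0. \tag{11.37} \]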
Proof. We write ‖·‖ instead of ‖·‖_{B_{k+1}}. Note that
\[ \alpha_{k+1}-\alpha_k=\frac1\pi\arg(1+w_k),\qquad w_k=\frac{d_{k+1}-d_k}{d_k}. \tag{11.38} \]
By (11.30) and (11.33), for sufficiently small λ, we get that |wk(η)| ≤ 1/2 for all η ∈ B_{k+1} and all k. This and (11.38) give ‖α_{k+1} − αk‖ ≤ C‖wk‖ for some C > 0. Using (11.34) and (11.38), we get ‖α_{k+1} − αk‖ ≤ C‖d_{k+1} − dk‖. The bound (11.33) implies (11.36). Estimate (11.37) follows from (11.36). Indeed,
\[ \|\alpha_k-\eta/\pi\|\le\|\alpha_k-\alpha_{k-1}\|+\dots+\|\alpha_1-\alpha_0\|+\|\alpha_0-\eta/\pi\|\le\lambda^\sigma(C\lambda^{1-\sigma})^k+\dots+\lambda^\sigma(C\lambda^{1-\sigma})+C\lambda\le C\lambda. \tag{11.39} \]
This completes the proof of Lemma 11.3. □
The function αk is defined only on Bk . It is more convenient to work with functions defined on I = B0 . Therefore, we extend αk to I in the following inductive way. For α0 , we have nothing to do. Suppose that, for 0 ≤ l ≤ k, αl have been extended to I . Consider the function αk+1 . We have to define it on I \ Bk+1 which is a countable set of open intervals. On each of these intervals, we define αk+1 so that the function αk+1 − αk is linear, and the function αk+1 remain continuous. To keep notations short, the extended functions are still called αk . Clearly, the continuation procedure does not change neither the Lipschitz norms of the differences αk+1 −αk , i.e. αk+1 − αk Bk+1 = αk+1 − αk B0 , nor the sets k (see (11.2.3)). Moreover, one checks that the estimate (11.37) remains valid with . Bk+1 replaced by . B0 . This follows from the fact that, in (11.39), we can change the sets Bk+1 , Bk , . . . , B1 by B0 = I . The idea of the estimate of ∞ is the following. As α0 is close to η/π , the measure of 0 is small. As long as αk is close to α0 , k almost coincides with 0 . When these sets are quite different, the measure of k becomes very small. We prove Lemma 11.4. Let fi , i = 1, 2, be two Lipschitz functions defined on an interval I such that, for some positive δ and δ1 , 0 < δ1 < 1, |f1 (η) − f2 (η)| ≤ δ ∀η ∈ I,
and
f1,2 − I dI ≤ δ1 < 1.
(11.40)
Fix c ∈ I . Let > 0. Define F1,2 = η ∈ I : |f1,2 (η) − c| ≤ ,
and δ0 = 2
1 − δ1 . 1 + δ1
Then, one has mes (F1,2 ) ≤ 2/(1 − δ1 ). Moreover, if δ < δ0 , then mes (F2 \ F1 ) ≤ 2δ/(1 − δ1 ), and if δ ≥ δ0 , then mes (F2 \ F1 ) ≤ 2/(1 − δ1 ). Proof. Let us extend f1 and f2 continuously outside of I so that they keep the properties (11.40). It suffices to prove the statements for these “new” f1 and f2 , and I = R. First, we note that (η − η )(1 − δ1 ) ≤ fj (η) − fj (η ) ≤ (η − η )(1 + δ1 ),
j = 1, 2.
So, fj are monotonous and Fj are intervals. Denote the left (resp. right) end of these − + intervals by η1,2 (resp. η1,2 ). Obviously, one has 2 2 + − ≤ η1,2 − η1,2 ≤ , 1 + δ1 1 − δ1
f1 (η1± ) = f2 (η2± ) = c±.
So, one has |η1± − η2± | ≤
|f1 (η1± ) − f1 (η2± )| |f2 (η2± ) − f1 (η2± )| δ = ≤ . 1 − δ1 1 − δ1 1 − δ1
Now, if δ < δ0 then 2/(1 + δ1 ) > δ/(1 − δ1 ); hence, the intersection of F1 and F2 is not empty, and mes (F2 \ F1 ) can be estimated by the sum |η1+ − η2+ | + |η1− − η2− |, that is, by 2δ/(1 − δ1 ). If δ ≥ δ0 , we use the trivial estimate mes (F2 \ F1 ) ≤ mes (F2 ) ≤ 2/(1 − δ1 ). This completes the proof of Lemma 11.4. & ' Now, let us discuss the sets k+1 . Each of these sets is contained in a countable union of open intervals Il,m (k), l, m ∈ Z defined by λσ/2 Il,m (k) = η ∈ I : |αk (η) − hl − m| < . max(1, l 2 ) In view of Lemma 11.4 and (11.37), if λ is sufficiently small, the length of the interval Il,m (k) can be bounded by Cλσ/2 / max{1, l 2 } uniformly in m and k. Fix k and l. On any Il,m (k), one has −1 − lh + αk ≤ m ≤ 1 − lh + αk ; so, the number of Il,m (k) that are not empty does not exceed M = maxη∈B0 αk − minη∈B0 αk + 3. In view of (11.37), we see that M is bounded uniformly in k and l. Note that this implies in particular that, for some C > 0, one has mes (1 ) ≤ Cλσ/2 . We can represent the set ∞ as ∞ = 1 ∪
(k+1 \ k ).
k≥1
Let us estimate the mes (k+1 \ k ). One has mes Il,m (k + 1) \ Il,m (k) . mes (k+1 \ k ) ≤ l∈Z m∈Z
To estimate mes Il,m (k + 1) \ Il,m (k) , we apply Lemma 11.4 with f1 = π αk and f2 = παk+1 . In view of (11.36)–(11.37), one then has δ = δ(k) = Cλσ (Cλ1−σ )k+1 , δ1 = Cλ. λσ/2 We choose = λσ/2 / max{1, l 2 }. Then, δ0 (l) = 2 1−Cλ 1+Cλ max{1,l 2 } . For sufficiently small λ, one has • mes Il,m (k + 1) \ Il,m (k) ≤ Cλσ (Cλ1−σ )k+1 if λσ (Cλ1−σ )k+1 < Cλσ/2 / l 2 , i.e. −σ/4 (Cλ1−σ )−(k+1)/2 , if l ≤ L, L = Cλ • mes Il,m (k + 1) \ Il,m (k) ≤ Cλσ/2 l −2 otherwise. This implies mes (k+1 \ k ) ≤ CLλσ (Cλ1−σ )k+1 +
C σ/2 ≤ Cλ3σ/4 (Cλ1−σ )(k+1)/2 . λ L
Summing the exponentially convergent series in k, we see that ∞ satisfies (11.6) and, hence, that B∞ is not empty. This completes the proof of Proposition 11.1. & ' Acknowledgements. This work was done while A.F. held a PAST professorship at Université Paris 13. F.K. gratefully acknowledges support of the European TMR network ERBFMRXCT960001. While working on these results, we had the pleasure of many interesting and helpful discussions with many friends and colleagues. In particular, we would like to thank M. Aizenman, S. Jitomirskaya, L. Pastur, B. Simon and Y. Sinaï.
References 1. Avron, J. and Simon, B.: Almost periodic Schrödinger operators, II. The integrated density of states. Duke Math. J., 50, 369–391, (1983) 2. Bellissard, J., Lima, R. and Testard, D.: Metal-insulator transition for theAlmost Mathieu model. Commun. Math. Phys. 88, 207–234 (1983) 3. Birman, M.Sh. and Solomjak, M.Z.: Spectral theory of selfadjoint operators in Hilbert space. Dordrecht: D. Reidel Publishing Co., 1987 Translated from the 1980 Russian original by S. Khrushchëv and V. Peller 4. Bougerol, P. and Lacroix, J.: Products of random matrices with applications to Schrödinger operators. Boston, MA: Birkhäuser Boston Inc., 1985 5. Buslaev, V.: Adiabatic perturbation of a periodic potential. Teor. Mat. Fiz. 58, 223–243 (1984) (in Russian) 6. Buslaev, V. and Fedotov, A.: The complex WKB method for Harper’s equation. Preprint, Mittag-Leffler Institute, Stockholm, 1993 7. Buslaev, V. and Fedotov, A.: Bloch solutions of difference equations. St Petersburg Math. J. 7, 561–594 (1996) 8. Carmona, R. and Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Basel: Birkhäuser, 1990 9. Coddington, E. and Levinson, N.: Theory of ordinary differential equations. New-York: McGraw-Hill, 1955 10. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin: Springer Verlag, 1987 11. Dinaburg, E.I. and Sina˘ı, Ja.G.: The one-dimensional Schrödinger equation with quasiperiodic potential. Funk. Anal. i Priložen. 9(4), 8–21 (1975) 12. Eastham, M.: The spectral theory of periodic differential operators. Edinburgh: Scottish Academic Press, 1973 13. Eliasson, L.H.: Floquet solutions for the 1-dimensional quasi-periodic Schrödinger equation. Commun. Math. Phys. 146, 447–482 (1992) 14. Eliasson, L.H.: Reducibility and point spectrum for linear quasi-periodic skew products. In: Proceedings of the ICM 1998,Berlin, Volume II, pp. 779–787, 1998 15. Fedoryuk, M.: Asymptotic analysis. Berlin: Springer Verlag, 1st edition, 1993 16. Fedotov, A. 
and Klopp., F.: On the singular spectrum of one dimensional quasi-periodic Schrödinger operators in the adiabatic limit. Preprint Universität Potsdam 17. Fedotov, A. and Klopp, F.: The spectrum of adiabatic quasi-periodic Schrödinger operators on the real line. In progress 18. Fedotov, A. and Klopp, F.: The monodromy matrix for a family of almost periodic equations in the adiabatic case. Preprint, Fields Institute, Toronto, 1997
19. Fedotov, A. and Klopp, F.: A complex WKB analysis for adiabatic problems. Asymptotic Anal. 27, 219–264 (2001) 20. Fedotov, A. and Klopp, F.: On the absolutely continuous spectrum of one dimensional quasi-periodic Schrödinger operators in the adiabatic limit. Preprint, Université Paris-Nord, 2001 21. Fedotov, A. and Klopp, F.: Coexistence of different spectral types for almost periodic Schrödinger equations in dimension one. In: Mathematical results in quantum mechanics (Prague, 1998). Basel: Birkhäuser, 1999, pp. 243–251 22. Fedotov, A. and Klopp, F.: Transitions d’Anderson pour des opérateurs de Schrödinger quasi-périodiques en dimension 1. In Seminaire: Équations aux Dérivées Partielles, 1998–1999. Palaiseau: École Polytech., 1999, pp. Exp. No. IV, 15 23. Fröhlich, J., Spencer, T. and Wittwer, P.: Localization for a class of one dimensional quasi-periodic Schrödinger operators. Commun. Math. Phys. 132, 5–25 (1990) 24. Gilbert, D. and Pearson, D.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. and its Appl. 128, 30–56 (1987) 25. Helffer, B., Kerdelhué, P. and Sjöstrand, J.: Le papillon de Hofstadter revisité. Mém. Soc. Math. France (N.S.) 43, 87 (1990) 26. Herman, M.-R.: Une méthode pour minorer les exposants de Lyapounov et quelques exemples montrant le caractère local d’un théorème d’Arnol’d et de Moser sur le tore de dimension 2. Comment. Math. Helv. 58 (3), 453–502 (1983) 27. Hiramoto, H. and Kohmoto, M.: Electronic spectral and wavefunction properties of one-dimensional quasi-periodic systems: A scaling approach. Int. J. Mod. Phys. B, 164 (3–4), 281–320 (1992) 28. Janssen, T.: Aperiodic Schrödinger operators. In: R. Moody, ed., The Mathematics of Long-Range Aperiodic Order. Dordrecht: Kluwer, 1997, pp. 269–306 29. Jitomirskaya, S.: Almost everything about the almost Mathieu operator. II. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge: Internat. Press 1995, pp. 
373–382 30. Jitomirskaya, S.Ya.: Metal-insulator transition for the almost Mathieu operator. Ann. of Math. (2) 150 (3), 1159–1175 (1999) 31. Jitomirskaya, S.Ya. and Last, Y.: Power law subordinacy and singular spectra. II. Line operators. Comm. Math. Phys. 211 (3), 643–658 (2000) 32. Kargaev, P. and Korotyaev, E.: Effective masses and conformal mappings. Commun. Math. Phys. 169, 597–625 (1995) 33. Last, Y.: Almost everything about the almost Mathieu operator. I. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge: Internat. Press, 1995, pp. 366–372 34. Marchenko, V. and Ostrovskii, I.: A characterization of the spectrum of Hill’s equation. Math. USSR Sbornik 26, 493–554 (1975) 35. McKean, H. and Trubowitz, E.: The spectrum of Hill’s equation. Invent. Math. 30, 217–274 (1975) 36. McKean, H. and van Moerbeke, P.: Hill’s operator and hyperelliptic function theory in the presence of infinitely many branch points. Comm. Pure Appl. Math. 29, (2), 143–226 (1976) 37. Pastur, L. and Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer-Verlag, 1992 38. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol I: Functional Analysis. New York: Academic Press, 1980 39. Simon, B.: Almost periodic Schrödinger operators: a review. Advances in Applied Mathematics 3, 463–490 (1982) 40. Sorets, E. and Spencer, T.: Positive Lyapunov exponents for Schrödinger operators with quasi-periodic potentials. Commun. Math. Phys. 142 (3), 543–566 (1991) 41. Titschmarch, E.C.: Eigenfunction expansions associated with second-order differential equations. Part II. Oxford: Clarendon Press, 1958 42. Wilkinson, M.: Tunnelling between tori in phase space. Phys. D 21 (2–3), 341–354 (1986) Communicated by M. Aizenman
Commun. Math. Phys. 227, 93 – 118 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
The Perturbation of the Quantum Calogero–Moser–Sutherland System and Related Results Yasushi Komori, Kouichi Takemura 1 Institute of Physics, University of Tokyo, Komaba, Tokyo 153-8902, Japan.
E-mail:
[email protected] 2 Department of Mathematical Sciences, Yokohama City University, 22-2 Seto, Kanazawa-ku,
Yokohama 236-0027, Japan. E-mail:
[email protected] Received: 30 May 2001 / Accepted: 27 November 2001
Abstract: The Hamiltonian of the trigonometric Calogero–Sutherland model coincides with a certain limit of the Hamiltonian of the elliptic Calogero–Moser model. In other words, the elliptic Hamiltonian is a perturbed operator of the trigonometric one. In this article we show the essential self-adjointness of the Hamiltonian of the elliptic Calogero–Moser model and the regularity (convergence) of the perturbation for the arbitrary root system. We also show the holomorphy of the joint eigenfunctions of the commuting Hamiltonians w.r.t. the variables (x1, . . . , xN) for the A_{N−1}-case. As a result, the algebraic calculation of the perturbation is justified.

1. Introduction

The Hamiltonian of the elliptic Calogero–Moser model is given as follows ([8]),
\[ H:=-\frac12\sum_{i=1}^{N}\frac{\partial^2}{\partial x_i^2}+\beta(\beta-1)\sum_{1\le i<j\le N}\wp(x_i-x_j), \tag{1.1} \]
where β is the coupling constant. This Hamiltonian reduces to that of the trigonometric Calogero–Sutherland model by setting τ → √−1 ∞, where τ is the ratio of two basic periods of the elliptic function. As for the trigonometric Calogero–Sutherland model, it is well known to specialists that its eigenstates are given by the Jack polynomials (or the A_{N−1}-Jacobi polynomials). So far, many researchers have studied the Jack polynomials and their q-deformed version, the Macdonald polynomials, and clarified various properties such as the orthogonality, the norms, the Pieri formula, the Cauchy formula, and the evaluations at (1, . . . , 1). The Calogero–Sutherland model is extended to those associated with simple Lie algebras. From this point of view the Hamiltonian (1.1) is called the A_{N−1}-type. Studies of these models are being developed by using their algebraic structures.
In contrast with the trigonometric models, the elliptic models are less investigated, and the spectrum and the eigenfunctions have not been sufficiently analyzed. There is, however, some important progress due to Felder and Varchenko. They clarify that the Bethe Ansatz works well for the $A_{N-1}$-type elliptic Calogero–Moser model ([1]). Although this method may have applications to the spectral problem, and indeed some partial results are obtained in the articles ([10, 11]), we employ another approach in the present article.

In this article, we add some knowledge of the elliptic Calogero–Moser model based on the analysis of the trigonometric model, as we explain below.

One topic is the essential self-adjointness of the elliptic Calogero–Moser model for an arbitrary root system. First we establish it for the trigonometric model by taking the Jacobi polynomials as its domain in the space of square integrable functions. We then obtain the elliptic version by perturbation.

A second topic is to obtain the eigenvalues and the eigenfunctions of the elliptic Calogero–Moser model for arbitrary root systems. There are at least two ways to do this. One is to use the Bethe Ansatz method, which is valid only for the $A_{N-1}$ case. From this viewpoint some results are obtained in ([10, 11]). The other is to use the well-developed perturbation theory, which we consider in this article. We regard the Hamiltonian of the elliptic Calogero–Moser model as a perturbation of the Calogero–Sutherland model in the parameter $p = \exp(2\pi\sqrt{-1}\,\tau)$. We have such abundant knowledge about the eigenvalues and the eigenfunctions of the Calogero–Sutherland model that we can apply the perturbation method. We then obtain the eigenvalues and the eigenfunctions as formal power series in $p$. In general, such formal power series do not converge. For example, consider the operator $H := -\frac{d^2}{dx^2} + x^2 + \alpha x^4$; then the formal power series in $\alpha$ of the eigenvalues and eigenfunctions diverge for any $\alpha \neq 0$. However, in our cases, the formal power series converge if $p$ is sufficiently small. The convergence is assured by the functional analytic method introduced by Kato and Rellich. Here convergence of the eigenfunctions is meant in the sense of the $L^2$-norm.

The other topics are the holomorphy of the eigenfunctions, the relationship with the higher commuting operators, and the construction of an elliptic analogue of the Jacobi polynomials; these are valid for the $A_{N-1}$ case. The Kato–Rellich method does not give the holomorphy a priori. We obtain the holomorphy by using several properties of the Jack (or the $A_{N-1}$-Jacobi) polynomials. Thanks to the holomorphy, the eigenspaces of the second-order Hamiltonian are compatible with the higher commuting operators. By considering the joint eigenfunctions of the Hamiltonian and the higher commuting operators, we see that an elliptic analogue of the $A_{N-1}$-Jacobi polynomials is well defined.

We remark that Langmann obtained an algorithm for constructing the eigenfunctions and eigenvalues as formal power series in $p$ ([5]). His algorithm would be closely related to ours, which is explained in Sect. 4.3.

The perturbation method has some merits compared with the Bethe Ansatz method. The perturbative calculation does not essentially depend on the coupling constant $\beta$, whereas the calculation in the Bethe Ansatz method depends strongly on $\beta$. In addition, the Bethe Ansatz method applies to the $A_{N-1}$ type with $\beta \in \mathbb{Z}_{>1}$, whereas the perturbation method may be valid for all types and the coupling constant does not need to be an integer.
Perturbation of Calogero–Moser–Sutherland System
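2. Jacobi Polynomials and Self-Adjointness

The Hamiltonians of the trigonometric and the elliptic Calogero–Sutherland models are respectively given by
\[ H_T := -\triangle + \sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2 \left( \frac{1}{4\sin^2(\langle\alpha,h\rangle/2)} - \frac{1}{12} \right), \tag{2.1} \]
\[ H_E := -\triangle + \sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2\, \wp(\langle\alpha,h\rangle), \tag{2.2} \]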
where the coupling constant $k_\alpha$ is real and invariant under the action of the Weyl group, $k_\alpha = k_{w\alpha}$, and $\wp(x) = \wp(x; \pi, \pi\tau)$ is the Weierstrass $\wp$ function. For our later convenience, we have subtracted a constant term from the original trigonometric Hamiltonian. $\triangle$ is the Laplacian on $T := \mathfrak{h}_{\mathbb{R}}/2\pi Q^\vee$. Using the variable $p = \exp(2\pi\sqrt{-1}\,\tau)$, we often write $H_E(p) = H_E$ in order to emphasize the dependence on $p$. In this notation, we have $H_T = H_E(0)$.

We first show that the Hamiltonian of the trigonometric model is defined on a dense subspace of $L^2(T, d\mu)^W$, where $\mu$ is the normalized Haar measure, and is essentially self-adjoint with respect to the inner product
\[ (f, g) := \int_T f \cdot \overline{g}\, d\mu. \tag{2.3} \]
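We denote by $\|\cdot\| := (\cdot,\cdot)^{1/2}$ the norm in $L^2(T, d\mu)$. We define $H_T$ and $H_E$ on $C^2(T)^W \cap D(V)$, where $D(V)$ denotes the domain of the multiplication operators in (2.1) and (2.2). Then we see that these operators are symmetric. If $k_\alpha \ge 2$, $H_T$ has a $C^2$-class $W$-invariant eigenfunction $\Delta$,
\[ H_T \Delta = E_0 \Delta, \tag{2.4} \]
where
\[ E_0 = (\rho(k)\,|\,\rho(k)) - e_0, \tag{2.5} \]
\[ e_0 = \frac{1}{12} \sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2, \tag{2.6} \]
\[ \rho(k) = \frac{1}{2} \sum_{\alpha\in R_+} k_\alpha \alpha, \tag{2.7} \]
\[ \Delta = \prod_{\alpha\in R_+} |\sin(\langle\alpha,h\rangle/2)|^{k_\alpha}. \tag{2.8} \]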
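Let $\mathbb{C}[P]$ be the polynomial ring of the weight lattice $P$. For each $\lambda$, let $e^\lambda$ denote the corresponding element, so that $e^\lambda e^\mu = e^{\lambda+\mu}$ and $e^0 = 1$. We also regard the element $e^\lambda$ as a function on $T$ by the rule $e^\lambda(\dot h) := e^{\sqrt{-1}\langle\lambda,h\rangle}$, where $\dot h \in T$ is the image of $h \in \mathfrak{h}_{\mathbb{R}}$. Let $m_\lambda$ for $\lambda \in P_+$ be the monomial symmetric functions
\[ m_\lambda := |W_\lambda|^{-1} \sum_{w\in W} e^{w\lambda} = \sum_{\mu\in W\lambda} e^\mu, \tag{2.9} \]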
where $W_\lambda$ denotes the stabilizer of $\lambda$ in $W$. The set $\{m_\lambda \mid \lambda \in P_+\}$ forms a basis of $\mathbb{C}[P]^W$. Define the partial order $\preceq$ in $P$ by
\[ \nu \preceq \mu \iff \mu - \nu \in Q_+. \tag{2.10} \]
Let $\|\cdot\|_\Delta$ and $(\cdot,\cdot)_\Delta$ denote the norm and the inner product in $L^2(T, \Delta^2 d\mu)$ respectively.

Definition 1 (Heckman-Opdam). There exists a family of polynomials $\{J_\mu \mid \mu \in P_+\}$ which forms a basis of $\mathbb{C}[P]^W$ and satisfies the following conditions:
\[ J_\mu = m_\mu + \sum_{\nu\prec\mu} u_{\mu\nu}\, m_\nu, \tag{2.11} \]
\[ (J_\mu, J_\nu)_\Delta = 0 \quad \text{if } \mu \neq \nu. \tag{2.12} \]
Let $H_0 := \Delta^{-1} H_T \Delta$. Then these polynomials are characterized by the operator $H_0$ as its eigenfunctions.

Proposition 2.1.
\[ H_0 J_\mu = E_\mu J_\mu, \tag{2.13} \]
where $E_\mu = (\mu+\rho(k)\,|\,\mu+\rho(k)) - e_0$.

It is well known that the normalized Jacobi polynomials $\tilde J_\lambda$ ($\lambda \in P_+$) form a complete orthonormal system in the space $L^2(T, \Delta^2 d\mu)^W$ with the inner product $(\cdot,\cdot)_\Delta$ if $k_\alpha \ge 0$. It follows that

Lemma 2.2. Assume $k_\alpha \ge 0$. Then $\mathcal{P} := \mathbb{C}[P]^W \Delta$ is a dense subspace in $L^2(T, d\mu)^W$.

Theorem 2.3. Assume $k_\alpha \ge 2$. Then $H_T$ is essentially self-adjoint on $\mathcal{P}$.

Proof. From Proposition 2.1 and $\mathcal{P} \subset C^2(T)^W \cap D(V)$ we see that $J_\lambda \Delta$ are the eigenfunctions of $H_T$. Then the theorem is obtained from Lemma 2.2, since it implies that the range of $(H_T \pm i)$ is dense.

If $0 < k_\alpha < 2$, then $\Delta \notin C^2(T)^W \cap D(V)$ and $\mathcal{P}$ is not an appropriate domain for $H_T$. However, Theorem 2.3 is generalized in the following sense in terms of the adjoint operator $H_T^*$:

Theorem 2.4. Assume $k_\alpha \ge 0$. Then $H_T^*|_{\mathcal{P}}$ is essentially self-adjoint on $\mathcal{P}$.

We rewrite $H_E(p) = \mathcal{W}(p) + H_T$, where $\mathcal{W}(p) = H_E(p) - H_T$ is a multiplication operator with
\[ \mathcal{W}(p)(h) := \sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2 \left( \wp(\langle\alpha,h\rangle) - \frac{1}{4\sin^2(\langle\alpha,h\rangle/2)} + \frac{1}{12} \right). \tag{2.14} \]
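By the formula (A.13), we see that
\[ \|\mathcal{W}(p)u\| \le \mathcal{W}(p)_{\max}\, \|u\|, \tag{2.15} \]
since the function $\mathcal{W}(p)(h)$ is a continuous function on $T$. This implies that $\mathcal{W}(p)$ is bounded. Hence we have $H_E^*(p) = \mathcal{W}(p)^* + H_T^*$.

Theorem 2.5 ([4]). Let $-1 < p < 1$ and $k_\alpha \ge 0$. Then $H_E^*(p)|_{\mathcal{P}}$ is essentially self-adjoint on $\mathcal{P}$.

Proof. The symmetry of the operator $\mathcal{W}(p)$ is trivial. Since a bounded symmetric perturbation preserves essential self-adjointness, we deduce that $\mathcal{W}(p) + H_T^*|_{\mathcal{P}}$ is essentially self-adjoint on $\mathcal{P}$.

In the next section, we abuse the symbols $H_T$ and $H_E(p)$ for $H_T^*|_{\mathcal{P}}$ and $H_E^*(p)|_{\mathcal{P}}$.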
3. Perturbation in the $L^2$-Space

In this section, we employ the variable $p$ with $|p| < 1$ instead of $\tau$ as the perturbation parameter and treat mainly the gauge-transformed Hamiltonian defined below with $k_\alpha > 0$. For a linear operator $T$, we denote by $D(T)$ its domain and by $R(T)$ its range.

3.1. The resolvent in the $L^2$ space. For a bounded linear operator $A$, we denote by $\|A\|$ the operator norm, i.e., $\|A\| := \sup_{\|v\|_\Delta = 1} \|Av\|_\Delta$. We set
\[ W(p) := \Delta^{-1}(H_E(p) - H_T)\Delta \;(= \mathcal{W}(p)), \tag{3.1} \]
\[ T(p) := \Delta^{-1} H_E(p) \Delta. \tag{3.2} \]
Then $T(p) = H_0 + W(p)$ is a closable operator on $L^2(T, \Delta^2 d\mu)^W$ with $D(T(p)) = \mathbb{C}[P]$, and in particular, if $-1 < p < 1$, $T(p)$ is an essentially self-adjoint operator. Here $W(p)$ is a bounded operator on $L^2(T, \Delta^2 d\mu)^W$ with an upper bound,
\[ \|W(p)\| \le W_{\max}(p) := 4 \sum_{n=1}^{\infty} \frac{n|p|^n}{1-|p|^n} \cdot \sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2, \tag{3.3} \]
which is monotonic with respect to $|p|$ and tends to $0$ as $p \to 0$. Let $\tilde T$ denote the closure of a closable operator $T$. Then $\tilde T(p)$ for $p \in (-1, 1)$ is the unique self-adjoint extension of $T(p)$. In particular, $\tilde H_0 = \tilde T(0)$ is the self-adjoint extension of $H_0$. $\tilde T(p)$ is a self-adjoint holomorphic family [3]. Notice that the spectrum of the operator $\tilde H_0$ is discrete. Let $\sigma(\tilde H_0)$ be the spectrum and let $\rho(\tilde H_0)$ be the resolvent set of the operator $\tilde H_0$. We have
\[ \sigma(\tilde H_0) = \{ (\lambda+\rho(k)\,|\,\lambda+\rho(k)) - e_0 \mid \lambda \in P_+ \}. \tag{3.4} \]
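The following proposition is obvious.

Proposition 3.1. For each $a \in \sigma(\tilde H_0)$, the corresponding eigenspace $\{ v \in L^2(T, \Delta^2 d\mu)^W \mid \tilde H_0 v = av \}$ is finite dimensional. For $\zeta \in \rho(\tilde H_0)$, the resolvent $(\tilde H_0 - \zeta)^{-1}$ is compact and $\|(\tilde H_0 - \zeta)^{-1}\| = (\mathrm{dist}(\zeta, \sigma(\tilde H_0)))^{-1}$. We have
\[ (\tilde H_0 - \zeta)^{-1} \sum_\lambda c_\lambda J_\lambda = \sum_\lambda (E_\lambda - \zeta)^{-1} c_\lambda J_\lambda, \tag{3.5} \]
where $\sum_\lambda c_\lambda J_\lambda \in L^2(T, \Delta^2 d\mu)^W$ and $\tilde H_0 J_\lambda = E_\lambda J_\lambda$.

The proof of the Kato-Rellich theorem also implies the compactness of the resolvent of $\tilde T(p)$ for $-1 < p < 1$. If $\|(\tilde H_0 - \zeta)^{-1} W(p)\| < 1$, then $(\tilde H_0 - \zeta)^{-1}(\tilde T(p) - \zeta) = 1 + (\tilde H_0 - \zeta)^{-1} W(p)$ has a bounded inverse by the Neumann series, and thus $\tilde T(p) - \zeta$ also has a bounded inverse,
\[ (\tilde T(p) - \zeta)^{-1} = \left( 1 + (\tilde H_0 - \zeta)^{-1} W(p) \right)^{-1} (\tilde H_0 - \zeta)^{-1} = \sum_{j=0}^{\infty} \left( -(\tilde H_0 - \zeta)^{-1} W(p) \right)^{j} (\tilde H_0 - \zeta)^{-1}. \tag{3.6} \]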
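As an informal illustration of (3.6), and not as part of the argument, the following sketch compares the Neumann series with a direct inversion in a two-dimensional toy model. The diagonal entries of `H0_DIAG`, the symmetric matrix `W` and the point `ZETA` are hypothetical stand-ins chosen only so that $\|(\tilde H_0 - \zeta)^{-1} W\| < 1$; they are not the operators of the paper.

```python
# Sketch: finite-dimensional check of the Neumann-series formula (3.6).
# All numerical values are illustrative assumptions.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent_direct(h0_diag, w, zeta):
    """(H0 + W - zeta)^{-1} computed by direct inversion."""
    t = [[w[i][j] + (h0_diag[i] - zeta if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    return inverse(t)


def resolvent_neumann(h0_diag, w, zeta, terms=60):
    """Partial sum of sum_j (-(H0 - zeta)^{-1} W)^j (H0 - zeta)^{-1}."""
    r0 = [[1.0 / (h0_diag[0] - zeta), 0.0], [0.0, 1.0 / (h0_diag[1] - zeta)]]
    x = [[-v for v in row] for row in matmul(r0, w)]  # x = -(H0 - zeta)^{-1} W
    power = [[1.0, 0.0], [0.0, 1.0]]                  # current power x^j
    acc = [[0.0, 0.0], [0.0, 0.0]]                    # running sum of x^j
    for _ in range(terms):
        acc = [[acc[i][j] + power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, x)
    return matmul(acc, r0)


H0_DIAG = [1.0, 4.0]               # toy spectrum of the unperturbed operator
W = [[0.1, 0.2], [0.2, -0.1]]      # toy bounded symmetric perturbation
ZETA = 2.5                         # point at distance 1.5 from the toy spectrum

direct = resolvent_direct(H0_DIAG, W, ZETA)
series = resolvent_neumann(H0_DIAG, W, ZETA)
max_diff = max(abs(direct[i][j] - series[i][j])
               for i in range(2) for j in range(2))
```

In this toy case the two resolvents agree to rounding error, which reflects the geometric convergence of the series when the smallness condition holds.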
In particular, if $\|(\tilde H_0 - \zeta)^{-1}\| < \|W(p)\|^{-1}$, then the bounded inverse of $\tilde T(p) - \zeta$ exists. The right-hand side of this expression implies that the resolvent of $\tilde T(p)$ is also compact. By the equality $\|(\tilde H_0 - \zeta)^{-1}\| = (\mathrm{dist}(\zeta, \sigma(\tilde H_0)))^{-1}$ and Eq. (3.4), the resolvent set $\rho(\tilde T(p))$ contains the complement of the union of the closed disks $\mathrm{dist}(\zeta, \sigma(\tilde H_0)) \le \|W(p)\|$.

Proposition 3.2. Let $T$ be a closed operator with the resolvent set $\rho(T)$. Let $\Gamma_1$, $\Gamma_2$ be circles which are contained in $\rho(T)$ and whose interiors are disjoint. We set
\[ P_i := -\frac{1}{2\pi\sqrt{-1}} \oint_{\Gamma_i} (T - \zeta)^{-1}\, d\zeta \quad (i = 1, 2). \]
Then we have $P_i^2 = P_i$ and $P_1 P_2 = P_2 P_1 = 0$.

Proposition 3.3. Let $P$, $Q$ be bounded operators subject to $P^2 = P$, $Q^2 = Q$ and $\|P - Q\| < 1$. Then we have $\mathrm{rank}\, P = \mathrm{rank}\, Q$.

Proof. For $u \in R(Q)$, we have $Pu = u + Pu - Qu$ due to $Qu = u$. Then
\[ \|Pu\| \ge (1 - \|P - Q\|)\, \|u\|, \tag{3.7} \]
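which implies that $P|_{R(Q)} : R(Q) \to R(PQ) \subset R(P)$ is one-to-one and $\mathrm{rank}\, P \ge \mathrm{rank}\, Q$. Similarly we have $\mathrm{rank}\, P \le \mathrm{rank}\, Q$ and hence $\mathrm{rank}\, P = \mathrm{rank}\, Q$.

Let $\Gamma \subset \rho(\tilde H_0)$ be a circle and let $r = \mathrm{dist}(\Gamma, \sigma(\tilde H_0))$. Then there exists $p_0 > 0$ such that for all $|p| < p_0$, $\|W(p)\| < r$ and thus $\Gamma \subset \rho(\tilde T(p))$. Notice $\|(\tilde H_0 - \zeta)^{-1}\| \le r^{-1}$ on $\Gamma$. Let
\[ P_\Gamma(p) := -\frac{1}{2\pi\sqrt{-1}} \oint_\Gamma (\tilde T(p) - \zeta)^{-1}\, d\zeta. \]
Then we have
\[ \|P_\Gamma(p) - P_\Gamma(0)\| \le \frac{1}{2\pi} \oint_\Gamma \left\| (\tilde T(p) - \zeta)^{-1} - (\tilde H_0 - \zeta)^{-1} \right\| |d\zeta| \tag{3.8} \]
\[ \le \frac{1}{2\pi} \oint_\Gamma \sum_{j=1}^{\infty} \|(\tilde H_0 - \zeta)^{-1}\|^{j+1} \|W(p)\|^{j}\, |d\zeta| \tag{3.9} \]
\[ \le \frac{1}{2\pi} \oint_\Gamma \frac{r^{-2}\|W(p)\|}{1 - r^{-1}\|W(p)\|}\, |d\zeta|. \tag{3.10} \]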
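The projections $P_\Gamma(p)$ can be explored numerically in a toy setting. The sketch below is only an illustration under stated assumptions: it approximates the contour integral by the trapezoidal rule for a hypothetical $2\times 2$ matrix `T_PERTURBED` standing in for $\tilde T(p)$, and checks that the result is an idempotent of rank one lying within distance $1$ of the unperturbed projection, as in Propositions 3.2 and 3.3.

```python
import cmath
import math

# Sketch: numerical Riesz projection for a 2x2 toy matrix.
# The matrices and the circle are illustrative assumptions.

def inverse(m):
    """Inverse of a 2x2 (possibly complex) matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def riesz_projection(t, center, radius, nodes=400):
    """Trapezoidal approximation of -(1/(2 pi i)) * oint (T - zeta)^{-1} d zeta."""
    acc = [[0j, 0j], [0j, 0j]]
    for m in range(nodes):
        phase = cmath.exp(2j * math.pi * m / nodes)
        zeta = center + radius * phase
        shifted = [[t[i][j] - (zeta if i == j else 0.0) for j in range(2)]
                   for i in range(2)]
        res = inverse(shifted)
        for i in range(2):
            for j in range(2):
                # d zeta = i * radius * phase * d theta, hence each node
                # contributes -(radius * phase / nodes) * resolvent.
                acc[i][j] -= res[i][j] * radius * phase / nodes
    return acc


T_UNPERTURBED = [[1.0, 0.0], [0.0, 4.0]]
T_PERTURBED = [[1.01, 0.02], [0.02, 3.99]]   # toy analogue of H0 + W(p)

p_zero = riesz_projection(T_UNPERTURBED, 1.0, 1.0)
p_pert = riesz_projection(T_PERTURBED, 1.0, 1.0)

trace = p_pert[0][0] + p_pert[1][1]
square = [[sum(p_pert[i][k] * p_pert[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
idempotent_defect = max(abs(square[i][j] - p_pert[i][j])
                        for i in range(2) for j in range(2))
# Frobenius norm, an upper bound for the operator norm of P(p) - P(0).
distance = math.sqrt(sum(abs(p_pert[i][j] - p_zero[i][j]) ** 2
                         for i in range(2) for j in range(2)))
```

Because the integrand is analytic and periodic on the circle, the trapezoidal rule converges very quickly here; the trace of the computed projection equals its rank, which stays equal to $1$.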
Fix $a_i \in \sigma(\tilde H_0)$. Since the set $\sigma(\tilde H_0)$ is discrete, we can choose a circle $\Gamma_i$ and $p_i > 0$ such that $\Gamma_i$ contains only one element $a_i$ inside it and $P_i(p) = P_{\Gamma_i}(p)$ satisfies $\|P_i(p) - P_i(0)\| < 1$ for $|p| < p_i$. By Propositions 3.2, 3.3, we see that $\mathrm{rank}\, P_i(p) = \mathrm{rank}\, P_i(0)$ and in particular, $P_i(p)$ is a degenerate (finite-rank) operator. By the proof of Proposition 3.3, we see that $V_i(p) := R(P_i(p))$ is spanned by the image of $V_i(0) = R(P_i(0))$, i.e., the eigenspace of $\tilde H_0$ with the eigenvalue $a_i$. We choose a basis of $V_i(p)$ as $\{ P_i(p)\tilde J_\lambda \mid \lambda \text{ such that } \tilde H_0 \tilde J_\lambda = a_i \tilde J_\lambda \}$, where $\tilde J_\lambda$ is the normalized Jacobi polynomial. One sees that $V_i(p)$ is a finite dimensional invariant subspace of $\tilde T(p)$ due to the commutativity of $P_i(p)$ and $\tilde T(p)$.

Lemma 3.4. The matrix elements of $\tilde T(p)|_{V_i(p)} : V_i(p) \to V_i(p)$ with respect to $P_i(p)\tilde J_\lambda$ are real-holomorphic functions of $p$.
Proof. We define the functions $c_\lambda^\mu(p)$ and $d_\lambda^\mu(p)$ by
\[ \tilde T(p) P_i(p) \tilde J_\lambda = \sum_\mu c_\lambda^\mu(p)\, P_i(p) \tilde J_\mu, \tag{3.11} \]
\[ P_i(0) P_i(p) \tilde J_\lambda = \sum_\mu d_\lambda^\mu(p)\, \tilde J_\mu. \tag{3.12} \]
Then we see that $P_i(0)\tilde T(p)P_i(p)|_{V_i(0)} : V_i(0) \to V_i(0)$ and $P_i(0)P_i(p)|_{V_i(0)} : V_i(0) \to V_i(0)$ are real-holomorphic. Equivalently, $\sum_\mu c_\lambda^\mu(p) d_\mu^\nu(p)$ and $d_\lambda^\mu(p)$ are real-holomorphic. By Proposition 3.3, $P_i(0)P_i(p)|_{V_i(0)}$, or the matrix $(d_\lambda^\mu(p))$, is invertible, which implies that $c_\lambda^\mu(p)$ is real-holomorphic.

The matrix $c(p) = (c_\lambda^\mu(p))$ is symmetric. It is known that if all the matrix elements of a symmetric operator on a finite dimensional vector space are real-holomorphic, then its eigenvalues and eigenvectors are real-holomorphic (see [3]). Hence we have

Proposition 3.5. The eigenvalues of $\tilde T(p)$ on $V_i(p)$ are real-holomorphic and coincide with $a_i$ when $p = 0$. The eigenfunctions are also real-holomorphic.

Summarizing, we obtain the following theorem.

Theorem 3.6. For each $a_i \in \sigma(\tilde H_0)$, there exists $p_i > 0$ such that for $-p_i < p < p_i$, the dimension of the eigenspace whose eigenvalues are included in $|\zeta - a_i| < W_{\max}(p)$ is equal to the dimension of the eigenspace of eigenvalue $a_i$. Moreover the eigenfunctions and the eigenvalues depend on $p$ real-holomorphically.

If the coupling constants $k_\alpha$ ($> 0$) are all rational numbers, we can estimate the eigenvalues uniformly. We explain this below. Suppose the $k_\alpha$ are all rational. Let $k_\alpha = k_{\alpha,\mathrm{num}}/k_{\mathrm{den}}$ be such that the $k_{\alpha,\mathrm{num}}$ are integers and $k_{\mathrm{den}}$ is a positive integer. Let $n$ be the minimal positive integer such that $(P|P) \subset \mathbb{Z}/n$. Then we see that the spectrum of $\tilde H_0$ is uniformly separated. To be more precise, if $a, b \in \sigma(\tilde H_0)$ and $a \neq b$, we have $|a - b| \ge 1/(n k_{\mathrm{den}})$. Hence if we take $p_0$ as
\[ W_{\max}(p_0) = \frac{1}{4 n k_{\mathrm{den}}}, \tag{3.13} \]
then there exists a set of circles $\Gamma_i$ such that for $|p| < p_0$, each $\Gamma_i \subset \rho(\tilde T(p))$ contains only one element $a_i \in \sigma(\tilde H_0)$ inside it, no two circles cross, $\|P_i(p) - P_i(0)\| < 1$, and every element of $\sigma(\tilde T(p))$ is contained inside some circle $\Gamma_i$. Therefore we have

Theorem 3.7. Suppose $k_\alpha \in \mathbb{Q}_{>0}$ and let $p_0$ be defined by (3.13). Then Theorem 3.6 holds for $p_i = p_0$. All eigenvalues of $\tilde T(p)$ on the $L^2(T, \Delta^2 d\mu)^W$ space are contained in $\bigcup_{a\in\sigma(\tilde H_0)} \{ \zeta \mid |\zeta - a| < W_{\max}(p) \}$ for $-p_0 < p < p_0$. All eigenfunctions are real-holomorphically connected to the eigenfunctions of $\tilde H_0$ as $p \to 0$.
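To get a feeling for the size of $p_0$ in (3.13), one can evaluate the bound (3.3) numerically and solve $W_{\max}(p_0) = 1/(4 n k_{\mathrm{den}})$ by bisection. The following is a minimal sketch, not a statement about any particular root system: the function names are ours, and the input `coupling_sum` stands for the quantity $\sum_{\alpha\in R_+} k_\alpha(k_\alpha-1)|\alpha|^2$, which is assumed to be positive and is supplied by hand.

```python
# Sketch: evaluate W_max(p) from (3.3) and solve (3.13) for p_0 by bisection.
# The sample inputs at the bottom are hypothetical.

def w_max(p, coupling_sum, tol=1e-16, max_terms=200000):
    """4 * sum_{n>=1} n |p|^n / (1 - |p|^n) * coupling_sum, truncated adaptively."""
    q = abs(p)
    if q >= 1.0:
        raise ValueError("the bound (3.3) requires |p| < 1")
    total = 0.0
    qn = 1.0
    for n in range(1, max_terms + 1):
        qn *= q
        term = n * qn / (1.0 - qn)
        total += term
        if term <= tol * total:   # remaining terms are negligible
            break
    return 4.0 * total * coupling_sum


def critical_p0(coupling_sum, n, k_den, iterations=100):
    """Bisection for W_max(p0) = 1 / (4 n k_den); W_max is increasing in |p|."""
    if coupling_sum <= 0:
        raise ValueError("this sketch assumes a positive coupling sum")
    target = 1.0 / (4.0 * n * k_den)
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if w_max(mid, coupling_sum) < target:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical sample data: coupling sum 4, n = 2, k_den = 1.
P0 = critical_p0(coupling_sum=4.0, n=2, k_den=1)
```

For these sample inputs $p_0$ is of order $10^{-2}$, which illustrates that the uniform estimate of Theorem 3.7 is a small-$p$ statement.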
4. $A_{N-1}$-Cases

4.1. In Sect. 3.1, we considered the spectral problem of the gauge-transformed Hamiltonian $\tilde T(p)$ in the $L^2(T, \Delta^2 d\mu)^W$ space and showed that the perturbation is holomorphic by use of the theory of Kato and Rellich. On the other hand, it is known that there is a commuting family of differential operators (e.g. (4.2) for the $A_{N-1}$ case) which commute with the Hamiltonian ([8, 7, 2]). In this section, we investigate the relationship between the functions obtained by applying the projections $P_i(p)$ and the commuting family of differential operators. As a result, we prove that the perturbation series obtained by the algorithmic calculation is not only square-integrable but also holomorphic with respect to the coordinate variables. For this purpose, we consider the spectral problem in the $C^\omega(T)^W$-space. In this section, we consider the $A_{N-1}$ cases.

4.2. We introduce some known results for the $A_{N-1}$ cases. We realize the $A_{N-1}$ root system in $\mathbb{R}^N$. Let $\{\epsilon_i\}_{i=1,\dots,N}$ be an orthonormal basis. The space $\mathfrak{h}^*$ is defined by $\mathfrak{h}^* := \{ h = \sum_{i=1}^N h_i \epsilon_i \mid \sum_{i=1}^N h_i = 0 \}$. The simple roots are $\{ \epsilon_i - \epsilon_{i+1} \mid i = 1, \dots, N-1 \}$. We set $x_i = (h|\epsilon_i)$. Let us recall the Hamiltonian of the elliptic Calogero–Moser model,
\[ \mathcal{H} := -\frac{1}{2} \sum_{i=1}^{N} \frac{\partial^2}{\partial x_i^2} + \beta(\beta-1) \sum_{1\le i<j\le N} \wp(x_i - x_j). \]
This system is integrable, i.e., there exist sufficiently many commuting operators. The existence and the explicit expressions are known in ([8, 7, 2]), etc. Here, we exhibit Hasegawa's expression, which will be used in the proof of Proposition 4.1. Later we will discuss the relationship between the expression of Ochiai–Oshima–Sekiguchi ([7]) and the one of Hasegawa ([2]). Following ([2]), we set ∂ β j ∈J ∂xj :(x) ∂ ˆ Hi := , (4.1) :(x) ∂xj |I |=i J ⊂I
j ∈I \J
Hi := :(x) Hˆ i :(x)−β (1 ≤ i ≤ N ), (4.2) where :(x) := 1≤i<j ≤N θ ((xi − xj )/2π ), θ(x) is the theta function defined in Sect. A.2. The operators Hˆ i , Hi , H act on the space of functions which are meromorphic except for the branches along xj − xk ∈ 2π(Z + Zτ ) (j = k). On this space, we have [Hˆ i1 , Hˆ i2 ] = [Hi1 , Hi2 ] = 0 (1 ≤ i1 , i2 ≤ N ) and [Hi , H] = 0 (1 ≤ i ≤ N ). ˜ := 1≤j 0. Let λ, µ ∈ P+ . It is known that the condition Eλ = Eµ for all i ∈ {1, . . . , N} is equivalent to λ = µ. In other words, the joint eigenvalue is nondegenerate. From now on we will discuss the symmetry (self-adjointness) of the higher commuting Hamiltonians. For this purpose, we will discuss the relationship between the expressions of the higher commuting Hamiltonians in ([7]) and the ones in ([2]). Following ([7, 9]), we introduce the operators Ik =
\[ \sum_{0\le j\le [k/2]} \frac{1}{2^j\, j!\, (k-2j)!} \sum_{\sigma\in W} \sigma\!\left( u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \frac{\partial}{\partial x_{2j+1}} \frac{\partial}{\partial x_{2j+2}} \cdots \frac{\partial}{\partial x_k} \right), \tag{4.8} \]
where $k = 1, \dots, N$, $W$ is the Weyl group of $A_{N-1}$-type (the $N$th symmetric group), $\sigma(f(x_1, \dots, x_N)) = f(x_{\sigma^{-1}(1)}, \dots, x_{\sigma^{-1}(N)})$ for $\sigma \in W$, and $u(x) = \beta(\beta-1)\wp(x)$. The domains of the operators $I_k$ ($k = 1, \dots, N$) are the same as the ones of $H_k$ ($k = 1, \dots, N$). By a straightforward calculation, we have $H_3 = I_3 + C I_1$ for some constant $C$. Applying Theorem 5.2 in ([9]), we obtain that the operators $H_k$ ($k = 1, \dots, N$) are expressed as polynomials in $I_1, I_2, \dots, I_N$.
Let $\mathbb{R}[I_1, I_2, \dots, I_N]$ be the polynomial ring generated by $I_1, I_2, \dots, I_N$ and let $\varsigma$ be the involution on $\mathbb{R}[I_1, I_2, \dots, I_N]$ such that $\varsigma F(x_1, \dots, x_N) = F(-x_1, \dots, -x_N)$. Then $\varsigma I_k = (-1)^k I_k$ and $\varsigma H_k = (-1)^k H_k$. Hence $H_k$ admits the expansion
\[ H_k = \sum_{j_1\le\cdots\le j_m} c_{j_1,\dots,j_m}\, I_{j_1} \cdots I_{j_m}, \tag{4.9} \]
where $c_{j_1,\dots,j_m} \in \mathbb{R}$ and if $k - (j_1 + \cdots + j_m) \notin 2\mathbb{Z}_{\ge 0}$ then $c_{j_1,\dots,j_m} = 0$. From a similar discussion, the operators $I_k$ admit the expansion
\[ I_k = \sum_{j_1\le\cdots\le j_m} \tilde c_{j_1,\dots,j_m}\, H_{j_1} \cdots H_{j_m}, \tag{4.10} \]
where $\tilde c_{j_1,\dots,j_m} \in \mathbb{R}$ and if $k - (j_1 + \cdots + j_m) \notin 2\mathbb{Z}_{\ge 0}$, then $\tilde c_{j_1,\dots,j_m} = 0$.

Lemma 4.3. We suppose $\beta > N$. For $f, g \in C^\omega(T)^W$, we have $(H^{(k)}(p)f, g)_\Delta = (-1)^k (f, H^{(k)}(p)g)_\Delta$ ($1 \le k \le N$).

Proof. Since $(f, g)_\Delta = \int_T f(x)\overline{g(x)}\,|\tilde\Delta|^2 d\mu$ and $H^{(k)}(p) = \tilde\Delta^{-1} H_k \tilde\Delta$, it is enough to show
\[ \int_T \bigl( H_k(f(x)|\tilde\Delta|) \bigr)\, \overline{g(x)}\,|\tilde\Delta|\, d\mu = (-1)^k \int_T f(x)|\tilde\Delta|\; \overline{H_k(g(x)|\tilde\Delta|)}\, d\mu. \]
We have the equality $\int_T h(x)\, d\mu = A \int_{0\le x_1,\dots,x_N\le 2\pi N} h(x)\, dx_1 \cdots dx_N$ for some non-zero constant $A$, which follows from the correspondence between the integration of the $\mathfrak{sl}_n$-invariant function and that of the $\mathfrak{gl}_n$. From this equality, the property (4.9), and the commutativity $[I_{j_1}, I_{j_2}] = 0$ ($1 \le j_1, j_2 \le N$), if we show
\[ \int_D \bigl( I_k(f(x)|\tilde\Delta|) \bigr)\, \overline{g(x)}\,|\tilde\Delta|\, dx = (-1)^k \int_D f(x)|\tilde\Delta|\; \overline{I_k(g(x)|\tilde\Delta|)}\, dx, \tag{4.11} \]
where $D = \{ (x_1, \dots, x_N) \in \mathbb{R}^N \mid 0 \le x_1, \dots, x_N \le 2\pi N \}$ and $dx = dx_1 dx_2 \cdots dx_N$, then we obtain Lemma 4.3.

If $\beta > N$ then the functions $f(x)|\tilde\Delta|$, $g(x)|\tilde\Delta|$ are of $C^N$-class. From the definition of $C^\omega(T)^W$ (4.3), we have the periodicity $f(x_1, \dots, x_l + 2\pi N, \dots, x_N) = f(x_1, \dots, x_l, \dots, x_N)$ ($1 \le l \le N$) for $f(x_1, \dots, x_N) \in C^\omega(T)^W$.

We set $\tilde f(x) = f(x)|\tilde\Delta|$ and $\tilde g(x) = g(x)|\tilde\Delta|$. The functions $\tilde f(x)$, $\tilde g(x)$ are smooth on $\mathbb{R}^N$ except for $x_i - x_j + 2\pi k = 0$ ($1 \le i \neq j \le N$, $k \in \mathbb{Z}$). The behavior of the functions $\tilde f(x)$, $\tilde g(x)$ around $x_i - x_j + 2\pi k = 0$ is $O(|x_i - x_j|^\beta)$, i.e. $\tilde f(x)/|x_i - x_j|^\beta$ and $\tilde g(x)/|x_i - x_j|^\beta$ are bounded around $x_i - x_j + 2\pi k = 0$. From the expression of $I_k$ (4.8), if we show
\[ \int_D \left( u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \frac{\partial}{\partial x_{2j+1}} \cdots \frac{\partial}{\partial x_k} \tilde f(x) \right) \overline{\tilde g(x)}\, dx = (-1)^k \int_D \tilde f(x)\; \overline{\left( u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \frac{\partial}{\partial x_{2j+1}} \cdots \frac{\partial}{\partial x_k} \tilde g(x) \right)}\, dx \tag{4.12} \]
for all $j$ such that $0 \le j \le [k/2]$, then we obtain (4.11) and Lemma 4.3.

The number $\beta$ satisfies $\beta > N \ge 2$. Though the function $u(x_{2l-1} - x_{2l}) = \beta(\beta-1)\wp(x_{2l-1} - x_{2l})$ ($l = 1, \dots, j$) has a double pole along $x_{2l-1} - x_{2l} + 2\pi k = 0$ ($k \in \mathbb{Z}$),
the integrands of (4.12) are bounded around $x_{2l-1} - x_{2l} + 2\pi k = 0$ by the properties $\tilde f(x) = O(|x_{2l-1} - x_{2l}|^2)$ and $\tilde g(x) = O(|x_{2l-1} - x_{2l}|^2)$ around $x_{2l-1} - x_{2l} + 2\pi k = 0$. Hence the singularities along $x_{2l-1} - x_{2l} + 2\pi k = 0$ ($l = 1, \dots, j$, $k \in \mathbb{Z}$) do not affect the integration. Since the integrands of (4.12) are continuous, we can replace the range of integration of both sides of (4.12) with $D'$, where $D' = \{ (x_1, \dots, x_N) \in D \mid x_{2l-1} - x_{2l} + 2\pi k \neq 0 \ (l = 1, \dots, j,\ k \in \mathbb{Z}) \}$. It is obvious that
\[ \int_{D'} \left( u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \frac{\partial}{\partial x_{2j+1}} \cdots \frac{\partial}{\partial x_k} \tilde f(x) \right) \overline{\tilde g(x)}\, dx = \int_{D'} \left( \frac{\partial}{\partial x_{2j+1}} \cdots \frac{\partial}{\partial x_k} \Bigl( u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \tilde f(x) \Bigr) \right) \overline{\tilde g(x)}\, dx. \tag{4.13} \]
By applying integration by parts repeatedly, we find that the r.h.s. of (4.13) is equal to
\[ (-1)^{k-2j} \int_{D'} u(x_1-x_2)\, u(x_3-x_4) \cdots u(x_{2j-1}-x_{2j})\, \tilde f(x)\, \frac{\partial}{\partial x_{2j+1}} \cdots \frac{\partial}{\partial x_k} \overline{\tilde g(x)}\, dx. \]
Here we used the periodicity under $x_l \to x_l + 2\pi N$ ($l = 2j+1, \dots, k$). Hence we obtain (4.12) and Lemma 4.3.

Proposition 4.4. We suppose $\beta \ge 0$. For $f, g \in C^\omega(T)^W$, we have $(H^{(k)}(p)f, g)_\Delta = (-1)^k (f, H^{(k)}(p)g)_\Delta$ ($1 \le k \le N$). In other words, the operators $(\sqrt{-1})^k H^{(k)}(p)$ are symmetric on the space $C^\omega(T)^W$.

Proof. It is trivial for the $\beta = 0$ case. We assume $\beta > 0$. Let $f(x) \in C^\omega(T)^W$. Then $H^{(k)}(p)f(x)$ is a polynomial in the parameter $\beta$ of degree at most $k$ and $H^{(k)}(p)f(x) \in C^\omega(T)^W$. We set $H^{(k)}(p)f(x) = \sum_{j=0}^{k} f_j(x)\beta^j$. Then $f_j(x) \in C^\omega(T)^W$ ($0 \le j \le k$), because $H^{(k)}(p)f(x) \in C^\omega(T)^W$ for all $\beta$. For $f(x)$ and $H^{(k)}(p)f(x)$, we set $\overline{f}(x) = \overline{f(x)}$ and $\overline{H^{(k)}(p)f}(x) = \sum_{j=0}^{k} \overline{f_j(x)}\,\beta^j$.
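We fix the functions $f(x), g(x) \in C^\omega(T)^W$. It is enough to show that the equations $(H^{(k)}(p)f, g)_\Delta - (-1)^k (f, H^{(k)}(p)g)_\Delta = 0$ hold for $\beta > 0$ and $1 \le k \le N$. We set
\[ T' = \Bigl\{ (x_1, \dots, x_N) \in \mathbb{R}^N \Bigm| \sum_{i=1}^{N} x_i = 0,\ 0 \le x_i - x_j \le 2\pi \ (1 \le i < j \le N) \Bigr\}, \]
\[ \mathring{T}' = \Bigl\{ (x_1, \dots, x_N) \in \mathbb{R}^N \Bigm| \sum_{i=1}^{N} x_i = 0,\ 0 < x_i - x_j < 2\pi \ (1 \le i < j \le N) \Bigr\}, \]
\[ h^{*}(x) = \bigl( H^{(k)}(p)f(x) \bigr)\, \overline{g(x)}\, |\tilde\Delta|^2 - (-1)^k f(x)\, \overline{\bigl( H^{(k)}(p)g(x) \bigr)}\, |\tilde\Delta|^2, \]
\[ h(x) = \bigl( H^{(k)}(p)f(x) \bigr)\, \overline{g}(x)\, \tilde\Delta^2 - (-1)^k f(x)\, \overline{H^{(k)}(p)g}(x)\, \tilde\Delta^2. \]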
Then the equation $(H^{(k)}(p)f, g)_\Delta - (-1)^k (f, H^{(k)}(p)g)_\Delta = 0$ is equivalent to $\int_T h^{*}(x)\, d\mu = 0$, where $T = \mathfrak{h}_{\mathbb{R}}/2\pi Q^\vee$. From the equation $\frac{1}{N!} \int_T h^{*}(x)\, d\mu = \int_{T'} h^{*}(x)\, d\mu$, it is sufficient to show $\int_{T'} h^{*}(x)\, d\mu = 0$.

We have $h(x) = h^{*}(x)$ on the domain $\mathring{T}'$, because $\sin((x_i - x_j)/2) > 0$ on $\mathring{T}'$ for $i < j$ and the branch of the function $\sin^\beta((x_i - x_j)/2)$ is chosen to be a positive real number. For $\beta \in \mathbb{C}$, the branch of the function $\sin^\beta((x_i - x_j)/2)$ ($i < j$) is canonically chosen by the relation $a^\beta = \exp(\beta \log a)$ for $a = \sin((x_i - x_j)/2) > 0$. Hence it is sufficient to show the equation $\int_{T'} h(x)\, d\mu = 0$ for $\beta > 0$.

From Lemma 4.3, $\int_{T'} h(x)\, d\mu = \int_{T'} h^{*}(x)\, d\mu = 0$ holds for $\beta > N$. From Proposition 4.1, we have $H^{(k)}(p)f(x) \in C^\omega(T)^W$ when $f(x) \in C^\omega(T)^W$ and $\beta \in \mathbb{C}$. Hence the integral $\int_{T'} h(x)\, d\mu$ is well-defined if $\mathrm{Re}\,\beta > 0$. We fix $\beta_0$ ($\mathrm{Re}\,\beta_0 > 0$). Since the function $h(x)$ is holomorphic in $\beta$ and the functions $h(x)$ and $\frac{\partial}{\partial\beta} h(x)$ are uniformly bounded in $(x, \beta) \in T' \times \{ \beta' \mid |\beta' - \beta_0| < \epsilon \}$ for some $\epsilon \in \mathbb{R}_{>0}$, the integral $\int_{T'} h(x)\, d\mu$ is also holomorphic at $\beta = \beta_0$ ($\mathrm{Re}\,\beta_0 > 0$) by Lebesgue's theorem. By the identity theorem, the equation $\int_{T'} h(x)\, d\mu = 0$ holds for $\beta$ such that $\mathrm{Re}\,\beta > 0$. Therefore we obtain the proposition.

4.3. Perturbation. We start with the general proposition related to the perturbation method.

Proposition 4.5. Let $\{v_1, v_2, \dots\}$ be linearly independent vectors in a vector space $V$ over $\mathbb{R}$. Let $H_i^{\{k\}}$ ($k \in \mathbb{Z}_{\ge 0}$, $i = 1, \dots, N$) be linear operators on $V$ such that $H_i^{\{k\}} v_j = \sum_{j' : \text{finite}} d_{j,j'}^{\{k\},i}\, v_{j'}$ for all $i, j, k$. We assume that there exist $E_j^{\{0\},i} \in \mathbb{R}$ such that $H_i^{\{0\}} v_j = E_j^{\{0\},i} v_j$ for all $i, j$, and that if $j_1 \neq j_2$ then there exists some $i$ such that $E_{j_1}^{\{0\},i} \neq E_{j_2}^{\{0\},i}$. Let $(\ ,\ )$ be an inner product on $V$ such that $(v_i, v_j) = \delta_{i,j}$. Let $H_i(p) := \sum_{k=0}^{\infty} H_i^{\{k\}} p^k$ be formal power series of linear operators and assume $[H_{i_1}(p), H_{i_2}(p)] = 0$ for all $i_1, i_2 \in \{1, \dots, N\}$ as formal power series in $p$. Then there exist formal power series of vectors
\[ v_j(p) = v_j + \sum_{k=1}^{\infty} \sum_{j' : \text{finite}} c_{j,j'}^{\{k\}}\, v_{j'}\, p^k, \tag{4.14} \]
such that $H_i(p) v_j(p) = E_j^i(p) v_j(p)$ and $(v_j(p), v_j(p)) = 1$, where $E_j^i(p) = E_j^{\{0\},i} + \sum_{k=1}^{\infty} E_j^{\{k\},i} p^k$ is a formal power series in $p$ and the equalities hold as formal power series in $p$. For each $j$, the normalized formal power series of the joint eigenfunction of the form (4.14) is unique.

Proof. We introduce variables $w_1, \dots, w_N$ and set
\[ H(w, p) := \sum_{i=1}^{N} w_i H_i(p), \qquad \sum_{j'} d_{j,j'}^{\{k\}}(w)\, v_{j'} := \sum_{i=1}^{N} w_i H_i^{\{k\}} v_j, \]
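\[ v_j(p) := v_j + \sum_{k=1}^{\infty} \sum_{j'} c_{j,j'}^{\{k\}}\, v_{j'}\, p^k, \]
\[ E_j(w, p) := \sum_{k=0}^{\infty} E_j^{\{k\}}(w)\, p^k = \sum_{i=1}^{N} E_j^{\{0\},i} w_i + \sum_{k=1}^{\infty} \sum_{i=1}^{N} E_j^{\{k\},i} w_i\, p^k. \]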
The numbers $d_{j,j'}^{\{k\}}(w)$ and $E_j^{\{0\},i}$ are given in advance. We investigate the conditions for the coefficients of the formal power series $v_j(p)$ and $E_j(w, p)$ to satisfy the following relations:
\[ H(w, p) v_j(p) = E_j(w, p) v_j(p), \qquad (v_j(p), v_j(p)) = 1. \tag{4.15} \]
We set $c_{j,j'}^{\{0\}} = \delta_{j,j'}$. By comparing the coefficients of $v_{j'} p^k$, we obtain that the conditions (4.15) are equivalent to the following relations:
\[ c_{j,j'}^{\{k\}} = \frac{ \sum_{k'=1}^{k} \Bigl( \sum_{j''} c_{j,j''}^{\{k-k'\}} d_{j'',j'}^{\{k'\}}(w) \Bigr) - \sum_{k'=1}^{k-1} c_{j,j'}^{\{k-k'\}} E_j^{\{k'\}}(w) }{ E_j^{\{0\}}(w) - E_{j'}^{\{0\}}(w) } \quad (j' \neq j), \tag{4.16} \]
\[ c_{j,j}^{\{k\}} = -\frac{1}{2} \sum_{k'=1}^{k-1} \sum_{j'} c_{j,j'}^{\{k'\}}\, c_{j,j'}^{\{k-k'\}}, \tag{4.17} \]
\[ E_j^{\{k\}}(w) = \sum_{k'=1}^{k} \sum_{j'} c_{j,j'}^{\{k-k'\}}\, d_{j',j}^{\{k'\}}(w) - \sum_{k'=1}^{k-1} c_{j,j}^{\{k-k'\}}\, E_j^{\{k'\}}(w). \tag{4.18} \]
We remark that the denominator of (4.16) is non-zero by the non-degeneracy condition. The numbers $c_{j,j'}^{\{k\}}$ and $E_j^{\{k\}}(w)$ are determined recursively and they exist uniquely. We have recursively that for each $j$ and $k$, $\#\{ j' \mid c_{j,j'}^{\{k\}} \neq 0 \}$ is finite, and the summations in (4.16–4.18) over the parameters $j'$ and $j''$ are indeed finite summations.

At this stage, the apparent expression of the coefficients $c_{j,j'}^{\{k\}}$ may depend on $w$. We show that the coefficients $c_{j,j'}^{\{k\}}$ do not depend on $w$. We denote $v_j(p)$ by $v_j(w, p)$. From the commutativity of $H(w, p)$ and $H(w', p)$ we have $H(w, p)(H(w', p) v_j(w, p)) = E_j(w, p)(H(w', p) v_j(w, p))$. Since the vector $H(w', p) v_j(w, p)$ admits the expansion $H(w', p) v_j(w, p) = E_j^{\{0\}}(w') v_j + O(p)$, we obtain the following relation from the uniqueness of the formal eigenvector:
\[ \frac{H(w', p) v_j(w, p)}{\sqrt{f(w, w', p)}} = v_j(w, p), \]
where $f(w, w', p) := (H(w', p) v_j(w, p), H(w', p) v_j(w, p))$ and $1/\sqrt{f(w, w', p)}$ is regarded as a formal power series in $p$ by the formula $(a_0^2 + \sum_{k=1}^{\infty} a_k p^k)^{-1/2} = a_0^{-1} \sum_{n=0}^{\infty} \binom{-1/2}{n} \bigl( a_0^{-2} \sum_{k=1}^{\infty} a_k p^k \bigr)^n$. Therefore we have $H(w', p) v_j(w, p) = \sqrt{f(w, w', p)}\, v_j(w, p)$. On the other hand we have $H(w', p) v_j(w', p) = E_j(w', p) v_j(w', p)$. By the uniqueness of the formal eigenvector whose leading term is $v_j$, we have $v_j(w, p) = v_j(w', p)$. Therefore the coefficients $c_{j,j'}^{\{k\}}$ do not depend on $w$.

From (4.18), we obtain recursively that the coefficients of the formal eigenvalue $E_j^{\{k\}}(w)$ are linear in $w_1, \dots, w_N$. Therefore the numbers $E_j^{\{k\},i}$ ($k \in \mathbb{Z}_{\ge 1}$) are determined appropriately.

Proposition 4.6. Proposition 4.5 is applicable to the $A_{N-1}$-type elliptic Calogero–Moser model by the following correspondence:

$H_i(p)$ ⇔ the commuting differential operator $H^{(i)}(p)$,
$v_j$ ⇔ the normalized Jacobi polynomial $\tilde J_\lambda$.

Proof. The finiteness of the summation $H_i^{\{k\}} v_j = \sum_{j'} d_{j,j'}^{\{k\},i} v_{j'}$ follows from Proposition 4.2 and the fact that the Jacobi polynomials form a basis of $\mathbb{C}[P]^W$. The non-degeneracy of the joint eigenvalues $E_j^{\{0\},i}$ follows from the non-degeneracy of the joint eigenvalues of the Jacobi polynomials.

Summarizing, we have an algorithm for computing the “formal” eigenvalues and “formal” eigenfunctions of the elliptic Calogero–Moser model of $A_{N-1}$-type by using the Jacobi polynomials. In the next subsection, we discuss the convergence.
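The recursion (4.16)–(4.18) is easy to run on a small example. The sketch below is a minimal illustration under simplifying assumptions: a single operator ($N = 1$, $w_1 = 1$), a finite-dimensional space with the standard basis playing the role of the $v_j$, and a hypothetical pair of matrices $H^{\{0\}}$, $H^{\{1\}}$ whose exact lowest eigenvalue $(1 - \sqrt{1 + 4p^2})/2$ is known in closed form. The function name `formal_eigenpair` is ours.

```python
import math

# Sketch of the recursion (4.16)-(4.18) for one operator H(p) = sum_k H^{k} p^k
# acting on a finite-dimensional space; all data below are illustrative.

def formal_eigenpair(h_terms, j, order):
    """Return (E, c) with E[k] = E_j^{k} and c[k][b] = c_{j,b}^{k} for k <= order.

    h_terms[k] is the matrix of H^{k} in the basis v_0, ..., v_{n-1};
    h_terms[0] must be diagonal with pairwise distinct diagonal entries.
    """
    n = len(h_terms[0])

    def d(k, a, b):
        # Coefficient of v_b in H^{k} v_a, i.e. d_{a,b}^{k}.
        return h_terms[k][b][a] if k < len(h_terms) else 0.0

    e0 = [h_terms[0][i][i] for i in range(n)]
    c = [[0.0] * n for _ in range(order + 1)]
    c[0][j] = 1.0
    energy = [0.0] * (order + 1)
    energy[0] = e0[j]
    for k in range(1, order + 1):
        # s[b] = sum_{k'=1}^{k} sum_{a} c_{j,a}^{k-k'} d_{a,b}^{k'}
        s = [sum(c[k - kp][a] * d(kp, a, b)
                 for kp in range(1, k + 1) for a in range(n))
             for b in range(n)]
        # (4.18): correction to the eigenvalue
        energy[k] = s[j] - sum(c[k - kp][j] * energy[kp] for kp in range(1, k))
        # (4.16): off-diagonal coefficients
        for b in range(n):
            if b != j:
                num = s[b] - sum(c[k - kp][b] * energy[kp] for kp in range(1, k))
                c[k][b] = num / (e0[j] - e0[b])
        # (4.17): diagonal coefficient fixed by the normalization
        c[k][j] = -0.5 * sum(c[kp][b] * c[k - kp][b]
                             for kp in range(1, k) for b in range(n))
    return energy, c


H_TERMS = [
    [[0.0, 0.0], [0.0, 1.0]],   # H^{0}: diagonal, non-degenerate
    [[0.0, 1.0], [1.0, 0.0]],   # H^{1}: symmetric coupling
]
ORDER = 10
P = 0.1
ENERGY, COEFFS = formal_eigenpair(H_TERMS, 0, ORDER)

approx_eigenvalue = sum(ENERGY[k] * P ** k for k in range(ORDER + 1))
exact_eigenvalue = (1.0 - math.sqrt(1.0 + 4.0 * P * P)) / 2.0
vector = [sum(COEFFS[k][b] * P ** k for k in range(ORDER + 1)) for b in range(2)]
norm_squared = sum(v * v for v in vector)
```

In this toy case the series converges for $|p| < 1/2$, so the truncated sum reproduces the exact eigenvalue to high accuracy and the truncated eigenvector is normalized up to the truncation error. For the model of the paper, convergence is the subject of Sect. 4.4.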
4.4. Analyticity and the higher commuting operators. In this subsection, we consider the spectral problem in the $C^\omega(T)^W$-space for the $A_{N-1}$ elliptic Calogero–Moser model. We assume $\beta > 1$. Since $T$ is compact, we have $C^\omega(T)^W \subset L^2(T, \Delta^2 d\mu)^W$. We show the holomorphy of the eigenfunctions which we have found on the $L^2(T, \Delta^2 d\mu)^W$ space in Sect. 3. After establishing the holomorphy of the eigenfunctions, we justify the convergence and the holomorphy of the joint eigenfunctions of the higher commuting operators obtained by the algorithmic calculation explained in Sect. 4.3. For this purpose, we need the following propositions.

Proposition 4.7. For each eigenvalue $a_i \in \sigma(\tilde H_0)$ and eigenfunction $\tilde J_\lambda$ of the Hamiltonian $\tilde H_0$ of the trigonometric model such that $\tilde H_0 \tilde J_\lambda = a_i \tilde J_\lambda$, there exists a positive number $p_i$ such that the function $P_i(p)\tilde J_\lambda$ is holomorphic in $(x_1, \dots, x_N, p)$ on the set $B_{p_i}$, where the operator $P_i(p)$ is the projection on the Hilbert space $L^2(T, \Delta^2 d\mu)^W$ which was defined in Sect. 3.1 and
\[ B_\epsilon = \{ (x_1, \dots, x_N, p) \in \mathbb{C}^N \times \mathbb{R} \mid |\mathrm{Im}\, x_j| < \epsilon \ (j = 1, \dots, N),\ -\epsilon < p < \epsilon \}. \tag{4.19} \]

Proposition 4.8. For every eigenvalue $a_i \in \sigma(\tilde H_0)$ and Jacobi polynomial $\tilde J_\lambda$, we have
\[ H^{(j)}(p) P_i(p) \tilde J_\lambda = P_i(p) H^{(j)}(p) \tilde J_\lambda \quad (j = 1, \dots, N), \tag{4.20} \]
when $|p|$ is sufficiently small.

We will prove Propositions 4.7 and 4.8 in the next section.
Remark. For the $A_1$ and $\beta \in \mathbb{Z}_{>1}$ cases, and the $A_2$ and $\beta = 2$ case, Proposition 4.7 is obvious from the construction of the eigenfunctions via the Bethe Ansatz method ([11]).

We fix the eigenvalue $a_i \in \sigma(\tilde H_0)$. From Propositions 3.1, 4.7, and 4.8, if $|p|$ is sufficiently small then the operators $H^{(j)}(p)$ act on the finite dimensional space $V_i(p)$, where
\[ V_i(p) = \sum_{\lambda \mid \tilde H_0 \tilde J_\lambda = a_i \tilde J_\lambda} \mathbb{C}\, P_i(p) \tilde J_\lambda, \tag{4.21} \]
and we have $V_i(p) \subset C^\omega(T)^W$.

From Proposition 4.4, the higher commuting operators $(\sqrt{-1})^j H^{(j)}(p)$ ($j = 1, \dots, N$) are symmetric both on the space $C^\omega(T)^W$ and on the finite dimensional space $V_i(p)$. The joint eigenvalues are real-holomorphic with respect to the parameter $p$, and the operators $(\sqrt{-1})^j H^{(j)}(p)$ are simultaneously diagonalizable in the space $V_i(p)$ if $|p|$ is sufficiently small and $p \in \mathbb{R}$. The joint eigenfunctions are holomorphic on the domain $B_\epsilon$ for sufficiently small $\epsilon \in \mathbb{R}_{>0}$.

Therefore the joint eigenfunction of $H^{(1)}(p), \dots, H^{(N)}(p)$ admits a holomorphic expansion in the variable $p$. Since the joint eigenvalues of $H^{(1)}(0), \dots, H^{(N)}(0)$ are distinct, the expansion is unique up to normalization (see Sect. 4.3). Hence the perturbation series obtained by the method introduced in Sect. 4.3 converges holomorphically and coincides with the eigenvalue and eigenfunction obtained by diagonalizing on the finite dimensional space $V_i(p)$. Summarizing, we have

Theorem 4.9. For the $A_{N-1}$ and $\beta > 1$ cases, the perturbation expansion of the commuting operators $H^{(1)}(p), \dots, H^{(N)}(p)$ which is performed in Sect. 4.3 converges holomorphically and defines an eigenfunction which is holomorphic when $|\mathrm{Im}\, x_j|$ ($j = 1, \dots, N$) and $|p|$ ($p \in \mathbb{R}$) are sufficiently small. The joint eigenvalue is holomorphic in the parameter $p$ for $|p| \ll 1$.

Remark. It was pointed out by Prof. T. Oshima that the real-holomorphy of the square-integrable eigenfunction $\psi(x)$ (i.e. $T(p)\psi(x) = E(p)\psi(x)$, $\psi(x) \in L^2(T, \Delta^2 d\mu)^W$) is also obtained by the following argument. From the ellipticity of the operator $T(p)$ and Weyl's lemma, we have the real-holomorphy of the eigenfunction $\psi(x)$ on the domain $\dot T = T \setminus T'$, where $T' := \{ (x_1, \dots, x_N) \in T \mid \exists (i \neq j),\ x_i = x_j \}$. Next we consider the analytic continuation of the function $\psi(x)$. The equation $T(p)\psi(x) = E(p)\psi(x)$ has regular singularities along $x_i = x_j$ ($i \neq j$), and the exponents at the singularity are $(0, -2\beta - 1)$.
It follows that the function $\psi(x)$ is holomorphic along $x_i - x_j = 0$ from the property $\psi(x) \in L^2$. Hence we have the real-holomorphy of $\psi(x)$ on $T$.

5. Proof of Propositions 4.7 and 4.8

In this section, we assume that the root system is of the $A_{N-1}$-type. For $\lambda \in P_+$ and $j = 1, \dots, N-1$, we set $m_\lambda = \sum_{\mu\in W\lambda} e^{\sqrt{-1}\langle\mu,h\rangle}$ and $e_{F_j} = m_{F_j}$, where $F_j$ is the $j$th fundamental weight. For $\lambda = \sum_{j=1}^{l} F_{i_j}$ ($l \in \mathbb{Z}_{\ge 0}$, $i_j \in \{1, \dots, N-1\}$ ($j = 1, \dots, l$)), we set $\tilde e_\lambda = \prod_{j=1}^{l} e_{F_{i_j}}$. Then we have $\tilde e_\lambda = e_{\lambda'}$ on $\mathfrak{h}^*$, where $e_\mu$ is the elementary symmetric function for the partition $\mu$ defined in Macdonald's book ([6], p. 20) and $\lambda'$ is the conjugate of $\lambda$.
Perturbation of Calogero–Moser–Sutherland System
We set t (x, p) :=
∞
tk (x)p k := ℘ (x) −
k=1
109
1 4 sin2 (x/2)
+
1 , 12
which converges uniformly on a strip around R × [−8, 8] for 0 ≤ 8 < 1. From the formula (A.13), we have tk (x) = −2 j (cos j x − 1).
(5.1)
j |k
Here, j |k means that the positive integer j is a divisor of k. Lemma 5.1. For a real number c such that c > 1, there exists a positive number a ' such that |tk (x)| < a ' ck for all x ∈ R. formula (5.1), we have |tk (x)| ≤ 4tk < Proof. Let tk be the sum of all divisors of k. By the 4k 2 . Since the convergence radius of the series k 2 p k is equal to 1, the convergence radius of the series tk p k is equal to or less than 1. Therefore we have the lemma. The W (p) defined in (3.1) has an expansion in terms of p given by W (p) = ∞operator (k) p k , where T (k) is the operator of multiplication by the function T (k) (h) := T k=1 ∞ (k) p k converges α∈R+ β(β − 1)tk ( α, h ). For each p ∈ (1, −1), the series k=1 T uniformly on hR . Proposition 5.2. For a real number c such that c > 1, there exists a positive number a such that T (k) ≤ ack . Proof. It follows from Lemma 5.1 and the inequality |T (k) f |2 2 dµ ≤ sup |T (k) (h)|2 |f |2 2 dµ. T
h∈T
T
(5.2)
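As a numerical sanity check of formula (5.1) and of the bound used in the proof of Lemma 5.1, the following Python sketch may be helpful. It evaluates $t_k(x)$ from (5.1), compares the power series $\sum_k t_k(x) p^k$ with the series appearing in (A.13), and tests $|t_k(x)| \leq 4\,(\text{sum of divisors of } k) \leq 4k^2$ on a grid. The function names, the truncation order and the grid size are illustrative assumptions and are not taken from the paper. Since the sum of divisors equals $k^2$ for $k = 1$, the sketch tests the non-strict form of the inequality.

```python
import math


def divisors(k):
    """Positive divisors of k (naive scan, adequate for small k)."""
    return [j for j in range(1, k + 1) if k % j == 0]


def t_k(k, x):
    """Coefficient t_k(x) = -2 * sum_{j | k} j * (cos(j x) - 1), as in (5.1)."""
    return -2.0 * sum(j * (math.cos(j * x) - 1.0) for j in divisors(k))


def series_in_p(x, p, terms=120):
    """Truncated power series sum_{k >= 1} t_k(x) p^k."""
    return sum(t_k(k, x) * p**k for k in range(1, terms + 1))


def lambert_form(x, p, terms=120):
    """Truncated series -2 * sum_{n >= 1} n p^n / (1 - p^n) * (cos(n x) - 1), cf. (A.13)."""
    return -2.0 * sum(
        n * p**n / (1.0 - p**n) * (math.cos(n * x) - 1.0)
        for n in range(1, terms + 1)
    )


def max_abs_t_k(k, samples=2000):
    """Maximum of |t_k(x)| over an equispaced grid on [0, 2*pi]."""
    return max(abs(t_k(k, 2.0 * math.pi * i / samples)) for i in range(samples + 1))


if __name__ == "__main__":
    x, p = 1.1, 0.3  # arbitrary sample point with |p| < 1
    print("power series :", series_in_p(x, p))
    print("Lambert form :", lambert_form(x, p))
    for k in (1, 6, 12):
        print(k, max_abs_t_k(k), 4 * sum(divisors(k)), 4 * k * k)
```

The agreement of the two series reflects the rearrangement $p^n/(1-p^n) = \sum_{m \geq 1} p^{nm}$, which is how (5.1) follows from (A.13).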
Lemma 5.3. The function α∈+ tk ( α, h ) admits the expansion, tk ( α, h ) = cµ m µ . α∈+
(5.3)
√ µ∈Q∩P+ ,|µ|≤ 2k
Proof. From formula (5.1), we have tk ( α, h ) = − ( j (ej α,h + e−j α,h − 2) = − 2j (mj θ − 1), α∈+
α∈+ j |k
j |k
where θ is the highest root of the root system AN−1 . Since |θ | = we have the lemma.
√
Sublemma 5.4. Let Jλ be the AN−1 -Jacobi polynomial. We have Jλ eFr = c¯ν Jν , ν∈P+ ,ν−λ∈{wFr |w∈W }
for some constants c¯ν .
2 and j θ ∈ Q ∩ P+ ,
(5.4)
Y. Komori, K. Takemura
Proof. This follows from the Pieri formula ([6], p. 332 and Sect. VI.10.).
Sublemma 5.5. Let l be a positive integer. Assume ij ∈ {1, . . . , N − 1} and wj ∈ W , (j = 1, . . . , l). We have | lj =1 wj (Fij )| ≤ | lj =1 Fij |. Proof. It is sufficient to show (λ + µ, λ + µ) ≥ (λ + w(µ), λ + w(µ)) for λ, µ ∈ P+ and w ∈ W . This inequality is equivalent to (λ, µ − w(µ)) ≥ 0. From the property µ − w(µ) ∈ Q+ , we have (λ, µ − w(µ)) ≥ 0. Sublemma 5.6. If λ, µ ∈ P+ and λ − µ ∈ Q+ , then we have |λ| ≥ |µ|. Proof. Immediate from the equality (λ, λ) − (µ, µ) = (λ − µ, λ + µ).
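Sublemma 5.7 ([6], p. 20). The monomial symmetric function $m_\lambda$ has the expansion

$$m_\lambda = \tilde e_\lambda + \sum_{\nu \in P_+,\ \lambda - \nu \in Q_+ \setminus \{0\}} \check c_\nu\, \tilde e_\nu, \qquad (5.5)$$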
for some constants $\check c_\nu$.

Lemma 5.8. We have the expansion

$$J_\lambda m_\mu = \sum_{\nu \in P_+,\ |\nu - \lambda| \leq |\mu|} \bar c_\nu J_\nu, \qquad (5.6)$$
for some constants $\bar c_\nu$.

Proof. First, we expand $m_\mu$ by using Sublemma 5.7. Then $J_\lambda m_\mu$ is expressed as a linear combination of $J_\lambda \tilde e_\nu$, where $\nu \in P_+$ and $\mu - \nu \in Q_+$. We set $\nu = \sum_{j=1}^{l} F_{i_j}$ ($l \in \mathbb{Z}_{\geq 0}$, $i_j \in \{1, \dots, N-1\}$ ($j = 1, \dots, l$)). We repeatedly apply Sublemma 5.4 to $J_\lambda \tilde e_\nu$. Then $J_\lambda \tilde e_\nu$ is expressed as a linear combination of $J_{\nu'}$, where $\nu' = \lambda + \sum_{j=1}^{l} w_j(F_{i_j})$ for some $w_j \in W$ ($j = 1, \dots, l$). From Sublemma 5.5, we have $|\nu' - \lambda| \leq |\sum_{j=1}^{l} F_{i_j}| = |\nu|$. Applying Sublemma 5.6 to µ and ν, we obtain Lemma 5.8.

Proposition 5.9. Let $|p| < 1$ and $\lambda \in P_+$. Write

$$\sum_{k=1}^{\infty} T^{(k)} p^k \tilde J_\lambda = \sum_{\mu \in P_+,\ \lambda - \mu \in Q} \tilde t_{\lambda,\mu} \tilde J_\mu,$$

where $\tilde J_\lambda$ is the normalized $A_{N-1}$-Jacobi polynomial. For each C such that $C > 1$ and $C|p| < 1$, there exists a number $C'' \in \mathbb{R}_{>0}$ such that $|\tilde t_{\lambda,\mu}| \leq C'' (C|p|)^{(|\lambda - \mu| + 1)/2\sqrt{2}}$ for all $\mu \in P_+$.

Proof. Since the normalized Jacobi polynomials form a complete orthonormal system with respect to the inner product ( , ), we have $\tilde t_{\lambda,\mu} = ((\sum_{k=1}^{\infty} T^{(k)} p^k) \tilde J_\lambda, \tilde J_\mu)$.
We √ fix λ, µ ∈ P+ . Let m be the smallest integer which is greater or equal to |λ − µ|/ 2. If k < m, then we have (T (k) p k J˜λ , J˜µ ) = 0 by Lemmas 5.3, 5.8 and the orthogonality. Therefore we have ∞ (k) k |t˜λ,µ | = (J˜µ , ( T p )J˜λ ) k=1 ∞ = (J˜µ , ( T (k) p k )J˜λ ) k=m ∞ k ˜ ˜ 2 = tk ( α, h )p Jλ Jµ dµ T k=m α∈+ ∞ k ≤ sup tk ( α, h )p |J˜λ J˜µ 2 |dµ h∈T k=m α∈ T + kα (kα − 1)N (N − 1) k ≤ tk |p| · |J˜λ |2 dµ |J˜µ |2 dµ 2 T T k≥m
kα (kα − 1)N (N − 1) ≤ tk |p|k . 2 k≥m
k Similarly, we have |t˜λ,λ | ≤ kα (kα −1)N(N−1) 2 k≥1 ntk |p| . Since the convergence radius of the series n tn p is equal to 1, we obtain that there √ exists a number C '' ∈ R>0 such that |t˜λ,λ | ≤ C '' (C|p|) and |t˜λ,µ | ≤ C '' (C|p|)|λ−µ|/ 2 for λ = µ. Hence we have the proposition. Proposition 5.10. Let suppose dist(ζ, σ (H˜ 0 )) ≥ D. Write D be a positive number. We −1 −1 ˜ ˜ ˜ ˜ (T (p)−ζ ) Jλ = µ tλ,µ Jµ , where (T (p)−ζ ) is defined in (3.6). For each λ ∈ P+ and C ∈ R>1 , there exists C ' ∈ R>0 and p0 ∈ R>0 which do not depend on ζ (but depend on D) such that tλ,µ satisfies |λ−µ| √ 2
|tλ,µ | ≤ C ' (C|p|) 2N
,
(5.7)
for all p, µ s.t. |p| < p0 and µ ∈ P+ . Proof. Let us recall that the operator (T˜ (p) − ζ )−1 is defined by the Neumann series (3.6). (k) p k ). From the We fix the number D(∈ R>0 ) and set X := (ζ − H˜ 0 )−1 ( ∞ k=1 T expansion (3.6) and Proposition 5.2, there exists a number p1 ∈ R>0 such that the inequality X < 1/2 holds for p (|p| < p1 ) and ζ (dist(ζ, σ (H˜ 0 )) > D). In this case, i )(H ˜ 0 − ζ )−1 = (T˜ (p) − ζ )−1 . We write X J˜λ = ˇ ˜ we have ( ∞ µ tλ,µ Jµ . i=0 X ' −1 ' ˜ For the series µ cµ J˜µ , write µ cµ J˜µ = (H˜ 0 − ζ ) µ cµ Jµ . We have |cµ | ≤ D −1 |cµ | for each µ. Combining with Proposition 5.9, we obtain that for each C such '' ∈ R that C > 1 and C+1 >0 which does not depend on ζ but D 2 p1 < 1, there exists C √ C+1 '' (|λ−µ|+1)/2 2 ˇ if |p| < p1 . such that |tλ,µ | ≤ C ( 2 |p|) To obtain Proposition 5.10, we use the method of majorants.
Z N We introduce the symbol eλ (λ ∈ ( N ) ) to avoid inaccuracies. We remark that Z N P+ ( N ) . We will apply the method of majorants for the formal series µ∈( Z )N cµ eµ N instead of µ∈P+ cµ eµ . ˜ by the rule For the formal series µ∈( Z )N cµ eµ , we define the partial ordering ≤ N (1) (2) (1) (2) ˜ cµ eµ ≤ cµ eµ ⇔ ∀µ, |cµ | ≤ |cµ |. Z N µ∈( N )
Z N µ∈( N )
(i)
Z N We will later consider the case that each cµ (µ ∈ ( N ) , i = 1, 2) is expressed as (2) the infinite sum. If one shows the absolute convergence of cµ for each µ, one has the (1) absolute convergence of cµ for each µ by the majorant. ˜ ˇ ˇ We set Xeλ = µ tλ,µ eµ , where the coefficients tλ,µ were defined by X Jλ = ˜ ˇ µ tλ,µ Jµ . i −1 ˜ ˜ Our goal is to show (5.7) for tλ,µ s.t. µ tλ,µ J˜µ = ( ∞ i=0 X )(H0 − ζ ) Jλ . Since (H˜ 0 − ζ )−1 J˜λ = (Eλ − ζ )−1 J˜λ and |(Eλ − ζ )−1 | ≤ D −1 , it is enough to show that there exists C I ∈ R>0 and p 0 ∈ R>0 which do not depend on ζ (but depend on D) such ∞ I is well-defined by I k that tλ,µ k=0 X eλ = tλ,µ eµ and satisfies |λ−µ| √ 2
I | ≤ C I (C|p|) 2N |tλ,µ
for all p, µ s.t. |p| < p0 and µ ∈ P+ . We set Y eλ := yλ,µ eµ := µ,λ−µ∈ZN
µ,λ−µ∈ZN
Zeλ :=
zλ,µ eµ :=
µ,λ−µ∈ZN
C
C
''
''
µ,λ−µ∈ZN
,
(5.8)
C+1 p 2 C+1 p 2
(|µ−λ|+1)/2√2
Ni=1 (|µi −λ√i |+1) 2N 2
eµ ,
eµ .
We have the inequality
˜ eλ ≤Ze ˜ λ. Xeλ ≤Y k Let k ∈ Z≥1 . If the coefficients of Z eλ w.r.t the basis {eµ } converge absolutely, then the coefficients of the series Xk eλ and Y k eλ are well-defined and we have ˜ k eλ ≤Z ˜ k eλ . X k eλ ≤Y From the equality Z k eλ = ν (1) ,... ,ν (k−1) zλ,ν (1) zν (1) ,ν (2) · · · zν (k−1) ,µ eµ and the property zλ,µ = z0,µ−λ , we have 1 Z k eλ = ··· √ (2π −1)N |s1 |=1 |sN |=1 µ,λ−µ∈ZN k λ −µ −1 νN λ1 −µ1 −1 z0,ν s1ν1 · · · sN s1 · · · sNN N ds1 · · · dsN eµ ν=(ν1 ,... ,νN )∈ZN
=
µ,λ−µ∈ZN
1 √ (2π −1)N
N i=1 |si |=1
ν∈Z
(C '' ) N 1
C +1 p 2
(|ν|+1) √ 2N 2
k λ −µi −1
siν si i
dsi eµ .
Perturbation of Calogero–Moser–Sutherland System 1√ 2
2N Set p˜ = ( C+1 2 p)
∞
113
, we have
Z k eλ
k=1
k N 1 λ −µ −1 '' N1 (|ν|+1) ν = (C ) p˜ si si i i dsi eµ √ N (2π −1) |s |=1 i k=1 µ,λ−µ∈ZN i=1 ν∈Z k N ∞ 1 1 λ −µ −1 '' (|ν|+1) ν ˜ (C ) N p˜ si si i i dsi eµ ≤ √ N (2π −1) |s |=1 i i=1 k=1 ν∈Z µ,λ−µ∈ZN = Zλ,µ eµ , ∞
µ,λ−µ∈ZN
where Zλ,µ =
1 N λ −µ −1 (C '' ) N (p˜ − p˜ 3 )si i i dsi 1 . √ 1 (2π −1)N i=1 |si |=1 (1 − ps ˜ i )(1 − ps ˜ i−1 ) − (C '' ) N (p˜ − p˜ 3 )
(5.9)
N N ∞ k k Remark that we used the inequality ∞ k=1 i=1 (ai ) ≤ i=1 k=1 (ai ) for 0 < q−q 3 |n|+1 n x = (1−qx)(1−qx −1 ) . The equality (5.9) a1 , . . . , aN < 1 and the formula n∈Z q makes sense for p˜ < p2 , where p2 is the positive number satisfying the inequalities 1
(C '' ) N |p2 −p23 | (1−p2 )2
1
< 1 and (C '' ) N p2 < 1. k Therefore each coefficient of ∞ k=1 Z eλ w.r.t the basis {eµ } converges absolutely. Hence the following inequality makes sense:
p2 < 1,
∞ k=0
˜ λ+ X k eλ ≤e
∞
˜ λ+ Z k eλ ≤e
k=1
Zλ,µ eµ .
µ,λ−µ∈ZN 1
˜ be the solution of the equation (1− ps)(1− ˜ ˜ −1 )−(C '' ) N (p˜ − p˜ 3 ) = 0 on Let s(p) ps ˜ < 1. Then s(p) ˜ is holomorphic in p˜ near 0 and admits the expansion s satisfying |s(p)| ˜ = p˜ + c2 p˜ 2 + · · · . We have s(p) 1 √
(2π −1)
1
(C '' ) N (p˜ − p˜ 3 )s n−1 ds |s|=1
1
˜ ˜ −1 ) − (C '' ) N (p˜ − p˜ 3 ) (1 − ps)(1 − ps
˜ (p)s( ˜ p) ˜ |n| , = pf
(5.10)
˜ is a holomorphic function defined near p˜ = 0. For the n ≥ 0 case, we have where f (p) ˜ For the n < 0 case, we the relation (5.10) by calculating the residue around s = s(p). ˜ need to change the variable s → s −1 and calculate the residue around s = s(p). k The coefficient of eµ on the series ∞ k=0 X eλ satisfying λ − µ ∈ Qhas to be zero ˜ |ν1 |+···+|νN | ≤ |s(p)| ˜ from the definition of X. By the inequality |s(p)|
2 ν12 +···+νN
, we
have ∞
˜ λ+ X k eλ ≤e
k=0
µ,λ−µ∈Q
N
˜ λ ˜ (p))s( ˜ ˜ |λi −µi | eµ ≤e (pf p)
i=1
+
˜ (p)) ˜ N s(p) ˜ |λ−µ| eµ . (pf
µ,λ−µ∈Q
˜ < p3 and p3 : a sufficiently small positive number. for |p|
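Combining the relation $\tilde p = \left(\frac{C+1}{2}\, p\right)^{\frac{1}{2N\sqrt{2}}}$, the inequality $\frac{C+1}{2} < C$, and the expansion $s(\tilde p) = \tilde p + c_2 \tilde p^2 + \dots$, we obtain (5.8) and the proposition.

Proposition 5.11. Let $a_i \in \sigma(\tilde H_0)$ and let $\Gamma_i$ be a circle in $\mathbb{C}$ which contains only one element $a_i$ of $\sigma(\tilde H_0)$ inside it. Let $\lambda \in P_+$ satisfy $\tilde H_0 \tilde J_\lambda = a_i \tilde J_\lambda$. We set $P_i(p) = -\frac{1}{2\pi\sqrt{-1}} \int_{\Gamma_i} (\tilde T(p) - \zeta)^{-1}\, d\zeta$ and write $P_i(p) \tilde J_\lambda = \sum_\mu s_{\lambda,\mu} \tilde J_\mu$. For each $C \in \mathbb{R}_{>1}$, there exist $C' \in \mathbb{R}_{>0}$ and $p_* \in \mathbb{R}_{>0}$ such that $s_{\lambda,\mu}$ satisfies

$$|s_{\lambda,\mu}| \leq C' (C|p|)^{\frac{|\lambda - \mu|}{2N\sqrt{2}}}, \qquad (5.11)$$

for all p, µ s.t. $|p| < p_*$ and $\mu \in P_+$.

Proof. Since the spectrum $\sigma(\tilde H_0)$ is discrete, there exists a positive number D such that $\inf_{\zeta \in \Gamma_i} \mathrm{dist}(\zeta, \sigma(\tilde H_0)) \geq D$. We write $(\tilde T(p) - \zeta)^{-1} \tilde J_\lambda = \sum_\mu t_{\lambda,\mu}(\zeta) \tilde J_\mu$. From Proposition 5.10, we obtain that for each $C \in \mathbb{R}_{>1}$, there exist $C_* \in \mathbb{R}_{>0}$ and $p_* \in \mathbb{R}_{>0}$, which do not depend on $\zeta$ ($\in \Gamma_i$), such that $|t_{\lambda,\mu}(\zeta)| \leq C_* (C|p|)^{\frac{|\lambda - \mu|}{2N\sqrt{2}}}$ for all p, µ s.t. $|p| < p_*$ and $\mu \in P_+$. Let L be the length of the circle $\Gamma_i$ and write $-\frac{1}{2\pi\sqrt{-1}} \int_{\Gamma_i} (\tilde T(p) - \zeta)^{-1}\, d\zeta\, \tilde J_\lambda = \sum_\mu s_{\lambda,\mu} \tilde J_\mu$. By integrating $\sum_\mu t_{\lambda,\mu}(\zeta) \tilde J_\mu$ over the circle $\Gamma_i$, we have

$$|s_{\lambda,\mu}| \leq \frac{L}{2\pi}\, C_* (C|p|)^{\frac{|\lambda - \mu|}{2N\sqrt{2}}}$$

for all p, µ s.t. $|p| < p_*$ and $\mu \in P_+$. Therefore we have Proposition 5.11.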
Proposition 5.12. Let $\mu \in P_+$ and $c_\mu$ be a number satisfying $|c_\mu| < a|p|^{b|\mu|}$ ($|\mu| > M$) for some $a, b > 0$ and $M \in \mathbb{Z}$. The function $\sum_\mu c_\mu \tilde J_\mu$ is holomorphic when $|\mathrm{Im}\, x_j|$ ($j = 1, \dots, N$) and $|p|$ are sufficiently small.

Proof. Since $z_i = e^{\sqrt{-1} x_i}$, it is enough to show that the function $\sum_\mu c_\mu \tilde J_\mu$ is holomorphic when $|p|$ is sufficiently small and $1/2 < |z_j| < 2$ ($j = 1, \dots, N$). We count roughly the number of the elements of $P_+$ of a given length. The rough estimate is given by $\#\{\lambda \in P_+ \mid (\lambda|\lambda) = l\} \leq (2lN)^N$. We will use this in the inequality (5.12). In the proof, we will use the notations and the results written in Sect. A.1. In Sect. A.1, there are parameters r and $C_0$. We fix $r = 2$ and $C_0 = 2$. There is another number A defined in Sect. A.1.
We have ≤ ˜ J c (z , . . . , z ) a|p|b|µ| |J˜µ (z1 , . . . , zN )| µ µ 1 N µ∈P+ ,|µ|≥M µ∈P+ ,|µ|≥M 1/2≤|zi |≤2 1/2≤|zi |≤2 √ √ √ ≤ a2(N−1) 2(µ|µ) 2 N(µ|µ) |p|b (µ|µ) µ∈P+ ,|µ|≥M
≤
√ √ 2+ N
aA(2nN )N (2(N−1)
|p|b )
√
(5.12)
n
n≥M,n∈Z/N
by the formulae (A.2), (A.11). √ √ (N−1) 2+ N |p|b < 1 then the bottom part of the inequality converges. We If 2 √ √ choose a positive number p0 which satisfies 2(N−1) 2+ N p0b < 1. Then the series | µ∈P+ ,|µ|≥M cµ J˜µ (z1 , . . . , zN )| is uniformly bounded and uniformly absolutely converges for |p| < p0 and 1/2 ≤ |zi | ≤ 2 (i = 1, . . . , N ). Since the functions J˜µ (z1 , . . . , zN ) are holomorphic, we have the holomorphy of the function ˜ µ cµ Jµ (z1 , . . . , zN ) by Weierstrass’s theorem. Combining Propositions 5.11 and 5.12, we have Proposition 4.7. From Propositions 5.10 and 5.12, the function (T˜ (p) − ζ )−1 J˜λ is real-holomorphic on (x1 , . . . , xN ) if |p| is sufficiently small. From Proposition 4.1, the operators H (j ) (p) (j = 1, . . . , N ) act well-definedly on the function (T˜ (p) − ζ )−1 J˜λ and we have H (j ) (p)(T˜ (p) − ζ )−1 J˜λ ∈ C ω (T )W . It follows from the commutativity of the operators T˜ (p) and H (j ) (p) (4.4) that H (j ) (p)J˜λ = H (j ) (p)(T˜ (p) − ζ )(T˜ (p) − ζ )−1 J˜λ = (T˜ (p) − ζ )H (j ) (p)(T˜ (p) − ζ )−1 J˜λ . Hence we have H (j ) (p)(T˜ (p) − ζ )−1 J˜λ = (T˜ (p) − ζ )−1 H (j ) (p)J˜λ . By integrating it on the variable ζ over the circle 4i , we have H (j ) (p)Pi (p)J˜λ = Pi (p)H (j ) (p)J˜λ . Therefore we have Proposition 4.8. A. Jack Polynomial and Special Functions A.1. Jack polynomial and AN−1 -Jacobi polynomial. We will see the relationship between the Jack polynomial and the AN−1 -Jacobi polynomial. Let MN be the set of partitions with at most N parts, i.e., MN := {λ = (λ1 , . . . , λN ) | λi − λi+1 ∈ Z≥0 , (i = 1, . . . , N − 1), λN ∈ Z≥0 }. We set M0N := {λ = (λ1 , . . . , λN ) ∈ MN | λN = 0}. The Jack polynomial JλI (z1 , . . . , zN ) (λ ∈ MN ) is a symmetric polynomial of variables (z1 , . . . , zN ) which is an eigenfunction of the gauge-transformed Hamiltonian H˜ 0 (4.6). Let mIλ be the monomial symmetric polynomial. The Jack polynomial admits the following expansion: uλµ mIµ , (A.1) JλI = mIλ + µ≺λ
where the dominant ordering of MN is given by λ µ ⇔ ij =1 λj ≤ ij =1 µj (i = N 1, . . . , N − 1), N j =1 λj = j =1 µj . We see the correspondence between the Jack polynomial and the AN−1 -Jacobi poly nomial. Let JλI (z1 , . . . , zN ) (λ ∈ MN ) be a Jack polynomial. We set |λ|I = N i=1 λi N and λ = i=1 (λi −|λ|I /N )8i . Then λ ∈ P+ , where P+ is the set of dominant weights of type AN−1 . The function (z1 · · · zN )−|λ|I /N JλI (z1 , . . . , zN ) is precisely the AN−1 -Jacobi polynomial Jλ . By this correspondence, the Jack polynomial JλI (z1 , . . . , zN ) (λ ∈ M0N ) corresponds with the AN−1 -Jacobi polynomial Jλ (λ ∈ P+ ) one-to-one. Let λ be an element in M0N and λ be the corresponding element in P+ . Since (λ|λ) ≥ (λ1 − |λ|I /N )2 + (|λ|I /N )2 ≥ (λ1 )2 /2 ≥
|λ|2I , 2(N−1)2
we have
|λ|I ≤ (N − 1) 2(λ|λ). Let us recall the Cauchy formula for the Jack polynomial. (1 − κXi Yj )−β = κ |λ|I JλI (X)JλI (Y )jλ−1 ,
(A.2)
(A.3)
λ∈MN
1≤i,j ≤N
where 0 ≤ jλ =
a(s) + βl(s) + 1 ≤ 1, a(s) + βl(s) + β
(A.4)
s∈λ
due to β ≥ 1. a(s) is the arm-length and l(s) is the leg-length. Since p. 379 of Macdonald’s book ([6]), we have JλI = mIλ + µ≺λ uλµ mIµ with uλµ > 0 if β > 0. Hence we have uλµ mµ , (A.5) Jλ = mλ + λ−µ∈Q+
with uλµ > 0. Let r be a real number greater than 1. √ N If 1/r < |zi | < r for all i then |mλ (z1 , . . . , zN )| ≤ r i=1 |(λ|8i )| mλ (1) ≤ r N(λ|λ) mλ (1). Therefore we have 0 ≤ |Jλ (z)| ≤ r
√ N(λ|λ)
Jλ (1)
on 1/r < |zi | < r for all i. By setting Xi = Yj = 1 in (A.3), we have 2 (1 − κ)−βN = κ n cn = κ |λ|I JλI (1)2 jλ−1 , n∈Z≥0
where cn = A such that
|λ|I =n
(A.7)
λ∈MN
4(βN 2 +n+1) . For each β ≥ 1 and C0 4(βN 2 +1)4(n+1) 2n 2 cn < A C0 for all n ∈ Z≥0 . Thus
JλI (1)2 ≤
(A.6)
> 1, there exists a positive number
JλI (1)2 jλ−1 < A2 C02n .
(A.8)
By the inequality (A.2), we have √ (N−1) 2(λ|λ)
|Jλ (1)| < AC0
.
(A.9)
The square of the norm of JλI is 4(ξi − ξj + β)4(ξi − ξj − β + 1) , 4(ξi − ξj )4(ξi − ξj + 1)
JλI 2 =
(A.10)
i<j
where ξi = λi + β(N − i). (See ([6], p. 383)) If β ≥ 1 then we have JλI 2 ≥ 1 because of the convexity of the function log 4(x). Therefore we have Jλ 2 ≥ 1. Generally we have for r > 1, max
1/r≤|zi |≤r
|J˜λ (z)| ≤
max
√ √ (N−1) 2(λ|λ) N(λ|λ)
1/r≤|zi |≤r
|Jλ (z)| ≤ AC0
r
,
(A.11)
where J˜λ (z) is the normalized AN−1 -Jacobi polynomial.
A.2. Special functions. We define some functions needed in this article:

$$\theta_1(x) := 2 \sum_{n=1}^{\infty} (-1)^{n-1} \exp\left(\tau\pi\sqrt{-1}\,(n - 1/2)^2\right) \sin (2n-1)\pi x, \qquad (A.12)$$

$$\theta(x) := \frac{\theta_1(x)}{\theta_1'(0)},$$

$$\wp(x; \omega_1, \omega_3) := \frac{1}{x^2} + \sum_{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \left( \frac{1}{(x + 2m\omega_1 + 2n\omega_3)^2} - \frac{1}{(2m\omega_1 + 2n\omega_3)^2} \right),$$

$$\wp(x) := \wp(x; \pi, \pi\tau).$$

We have

$$\wp(x) = \frac{1}{4 \sin^2(x/2)} - \frac{1}{12} - 2 \sum_{n=1}^{\infty} \frac{n p^n}{1 - p^n} (\cos nx - 1), \qquad (A.13)$$

where $p = \exp(2\tau\pi\sqrt{-1})$.

Acknowledgements. The authors would like to thank Prof. M. Kashiwara and Prof. T. Miwa for discussions and support. Thanks are also due to Dr. T. Koike and Prof. T. Oshima. They thank the referee for valuable comments. One of the authors (YK) is a Research Fellow of the Japan Society for the Promotion of Science.
References

1. Felder, G., Varchenko, A.: Three formulae for eigenfunctions of integrable Schrödinger operator. Comp. Math. 107, no. 2, 143–175 (1997)
2. Hasegawa, K.: Ruijsenaars' commuting difference operators as commuting transfer matrices. Commun. Math. Phys. 187, no. 2, 289–325 (1997)
3. Kato, T.: Perturbation theory for linear operators, Corrected printing of second ed., Berlin–Heidelberg: Springer-Verlag, 1980
4. Komori, Y.: Algebraic analysis of one-dimensional quantum many-body systems. Ph.D thesis, Univ. of Tokyo (2000)
5. Langmann, E.: Anyons and the elliptic Calogero–Sutherland model. Lett. Math. Phys. 54, no. 4, 279–289
6. Macdonald, I.G.: Symmetric functions and Hall polynomials. Second edition. New York: Oxford Science Publications, The Clarendon Press, Oxford University Press, 1995
7. Ochiai, H., Oshima, T., Sekiguchi, H.: Commuting families of symmetric differential operators. Proc. Japan Acad. Ser. A Math. Sci. 70, no. 2, 62–66 (1994)
8. Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, no. 6, 313–404 (1983)
9. Oshima, T., Sekiguchi, H.: Commuting families of differential operators invariant under the action of a Weyl group. J. Math. Sci. Univ. Tokyo 2, no. 1, 1–75 (1995)
10. Ruijsenaars, S.N.M.: Generalized Lamé functions. I. The elliptic cases. J. Math. Phys. 40, no. 3, 1595–1626 (1999)
11. Takemura, K.: On the eigenstates of the elliptic Calogero–Moser model. Lett. Math. Phys. 53, no. 3, 181–194 (2000)

Communicated by L. Takhtajan
Commun. Math. Phys. 227, 119 – 130 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Singular Spectrum of Lebesgue Measure Zero for One-Dimensional Quasicrystals
Daniel Lenz1,2
1 Institute of Mathematics, The Hebrew University, Jerusalem 91904, Israel
2 Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, 60054 Frankfurt, Germany. E-mail: [email protected]
Received: 3 July 2001 / Accepted: 11 December 2001
Abstract: The spectrum of one-dimensional discrete Schrödinger operators associated to strictly ergodic dynamical systems is shown to coincide with the set of zeros of the Lyapunov exponent if and only if the Lyapunov exponent exists uniformly. This is used to obtain Cantor spectrum of zero Lebesgue measure for all aperiodic subshifts with uniform positive weights. This covers, in particular, all aperiodic subshifts arising from primitive substitutions, including new examples such as the Rudin–Shapiro substitution. Our investigation is not based on trace maps. Instead it relies on an Oseledec type theorem due to A. Furman and a uniform ergodic theorem due to the author.

1. Introduction

This article is concerned with discrete random Schrödinger operators associated to minimal topological dynamical systems. This means we consider a family $(H_\omega)_{\omega \in \Omega}$ of operators acting on $\ell^2(\mathbb{Z})$ by

$$(H_\omega u)(n) \equiv u(n+1) + u(n-1) + f(T^n \omega) u(n), \qquad (1)$$

where $\Omega$ is a compact metric space, $T : \Omega \to \Omega$ is a homeomorphism and $f : \Omega \to \mathbb{R}$ is continuous. The dynamical system $(\Omega, T)$ is called minimal if every orbit is dense. For minimal $(\Omega, T)$, there exists a set $\Sigma \subset \mathbb{R}$ s.t.

$$\sigma(H_\omega) = \Sigma, \quad \text{for all } \omega \in \Omega, \qquad (2)$$

where we denote the spectrum of the operator H by $\sigma(H)$ (cf. [6, 36]).

This research was supported in part by THE ISRAEL SCIENCE FOUNDATION (Grant no. 447/99) and by the Edmund Landau Center for Research in Mathematical Analysis and Related Areas, sponsored by the Minerva Foundation (Germany).
We will be particularly interested in the case that $(\Omega, T)$ is a subshift over a finite alphabet $A \subset \mathbb{R}$. In this case $\Omega$ is a closed subset of $A^{\mathbb{Z}}$, invariant under the shift operator $T : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ given by $(Ta)(n) \equiv a(n+1)$, and f is given by $f : \Omega \to A \subset \mathbb{R}$, $f(\omega) \equiv \omega(0)$. Here, A carries the discrete topology and $A^{\mathbb{Z}}$ is given the product topology.

Operators associated to subshifts arise in the quantum mechanical treatment of quasicrystals (cf. [3, 40] for background on quasicrystals). Various examples of such operators have been studied in recent years. The main examples can be divided into two classes. These classes are given by primitive substitution operators (cf. e.g. [4, 5, 7, 11, 41, 42]) and Sturmian operators or, more generally, circle map operators (cf. e.g. [6, 12, 15, 16, 26, 27, 30]). A recent survey can be found in [14]. For these classes, and in fact for arbitrary operators associated to subshifts satisfying suitable ergodicity and aperiodicity conditions, one expects the following features:

(S) Purely singular spectrum;
(A) absence of eigenvalues;
(Z) Cantor spectrum of Lebesgue measure zero.

Note that (S) combined with (A) implies purely singular continuous spectrum, and note also that (S) is a consequence of (Z). Let us mention that (S) is by now completely established for all relevant subshifts due to recent results of Last–Simon [34] in combination with earlier results of Kotani [32]. For discussion of (A) and further details we refer the reader to the cited literature. The aim of this article is to investigate (Z) and to relate it to ergodic properties of the underlying subshifts.

The property (Z) has been investigated for several models by a number of authors: Following work by Bellissard–Bovier–Ghez [5], the most general result for primitive substitutions so far has been obtained by Bovier–Ghez [7]. They can treat a large class of substitutions which is given by an algorithmically accessible condition. The Rudin–Shapiro substitution does not belong to this class. For arbitrary Sturmian operators, Bellissard–Iochum–Scoppola–Testard established (Z) [6], thereby extending the work of Sütő in the golden mean case [41, 42]. A different approach, which recovers some of these results, is given in [13, 19].

A canonical starting point in the investigation of (Z) for subshifts is the fundamental result of Kotani [32] that the set $\{E \in \mathbb{R} : \gamma(E) = 0\}$ has Lebesgue measure zero if $(\Omega, T)$ is an aperiodic subshift. Here, γ denotes the Lyapunov exponent (precise definition given below). This reduces the problem (Z) to establishing the equality

$$\Sigma = \{E \in \mathbb{R} : \gamma(E) = 0\}. \qquad (3)$$

As do all other investigations of (Z) so far, our approach starts from (3). Unlike the earlier treatments mentioned above, our approach does not rely on the so-called trace maps. Instead, we present a new method, the cornerstones of which are the following:

(1) A strong type of Oseledec theorem by A. Furman [21].
(2) A uniform ergodic theorem for a large class of subshifts by the author [37].

This new setting allows us

(∗) to characterize validity of (3) for arbitrary strictly ergodic dynamical systems by an essentially ergodic property, viz. by uniform existence of the Lyapunov exponent (Theorem 1),
(∗∗) to present a large class of subshifts satisfying this property (Theorem 2).

Here, (∗) gives the new conceptual point of view of our treatment and (∗∗) gives a large class of examples. Put together, (∗) and (∗∗) provide a soft argument for (Z) for a large class of examples which contains, among other examples, all primitive substitutions.
The paper is organized as follows. In Sect. 2 we present the subshifts we will be interested in, introduce some notation and state our results. In Sect. 3, we recall results of Furman [21] and of the author [37] and adapt them to our setting. Section 4 is devoted to a proof of our results. Finally, in Sect. 5 we provide some further comments and discuss a variant of our main result.
2. Notation and Results

In this section we discuss basic material concerning topological dynamical systems and the associated operators and state our results.

As usual, a dynamical system is said to be strictly ergodic if it is uniquely ergodic (i.e. there exists only one invariant probability measure) and minimal. A minimal dynamical system is called aperiodic if there does not exist an $n \in \mathbb{Z}$, $n \neq 0$, and $\omega \in \Omega$ with $T^n \omega = \omega$.

As mentioned already, our main focus will be the case that $(\Omega, T)$ is a subshift over the finite alphabet $A \subset \mathbb{R}$. We will then consider the elements of $(\Omega, T)$ as double sided infinite words and use notation and concepts from the theory of words. In particular, we then associate to $\Omega$ the set W of words associated to $\Omega$, consisting of all finite subwords of elements of $\Omega$. The length $|x|$ of a word $x \equiv x_1 \dots x_n$ with $x_j \in A$, $j = 1, \dots, n$, is defined by $|x| \equiv n$. The number of occurrences of $v \in W$ in $x \in W$ is denoted by $\sharp_v(x)$.

We can now introduce the class of subshifts we will be dealing with. They are those satisfying uniform positivity of weights (PW) given as follows:

(PW) There exists a $C > 0$ with $\liminf_{|x| \to \infty} \frac{\sharp_v(x)}{|x|} |v| \geq C$ for every $v \in W$.
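To make the quantity in (PW) concrete, the following Python sketch computes $\sharp_v(x)|v|/|x|$ for a long finite word. As a purely illustrative assumption, it uses the Fibonacci substitution a → ab, b → a as an example of a primitive substitution, and it counts overlapping occurrences; neither choice is prescribed by the text, and the function names are hypothetical. A single long word only illustrates the weights and does not verify the lim inf over all of W.

```python
def fibonacci_word(iterations):
    """Iterate the substitution a -> ab, b -> a, starting from the word 'a'."""
    word = "a"
    for _ in range(iterations):
        word = "".join("ab" if ch == "a" else "a" for ch in word)
    return word


def occurrences(v, x):
    """Number of (possibly overlapping) occurrences of the word v in x."""
    return sum(1 for i in range(len(x) - len(v) + 1) if x[i:i + len(v)] == v)


def weight(v, x):
    """The quantity #_v(x) * |v| / |x| appearing in (PW)."""
    return occurrences(v, x) * len(v) / len(x)


def subwords(x, length):
    """Sorted list of all subwords of x with the given length."""
    return sorted({x[i:i + length] for i in range(len(x) - length + 1)})


if __name__ == "__main__":
    x = fibonacci_word(20)
    for length in (1, 2, 3):
        for v in subwords(x, length):
            print(v, round(weight(v, x), 4))
```

In this example every subword that occurs at all has a weight bounded well away from zero, which is the behaviour that (PW) demands uniformly in v.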
One might think of (PW) as a strong type of minimality condition. Indeed, minimality can easily be seen to be equivalent to $\liminf_{|x| \to \infty} |x|^{-1} \sharp_v(x) |v| > 0$ for every $v \in W$ [39]. The condition (PW) implies strict ergodicity [37]. The class of subshifts satisfying (PW) is rather large. By [37], it contains all linearly repetitive subshifts (see [20, 33] for definition and thorough study of linearly repetitive systems). Thus, it contains, in particular, all subshifts arising from primitive substitutions as well as all those Sturmian dynamical systems whose rotation number has bounded continued fraction expansion [20, 33, 38]. In our setting the class of subshifts satisfying (PW) appears naturally, as it is exactly the class of subshifts admitting a strong form of the uniform ergodic theorem [37]. Such a theorem in turn is needed to apply Furman's results (see below for details).

After this discussion of background from dynamical systems, we are now heading towards introducing key tools in spectral theoretic considerations, viz. transfer matrices and Lyapunov exponents. The operator norm $\|\cdot\|$ on the set of 2 × 2-matrices induces a topology on $GL(2, \mathbb{R})$ and $SL(2, \mathbb{R})$. For a continuous function $A : \Omega \to GL(2, \mathbb{R})$, $\omega \in \Omega$, and $n \in \mathbb{Z}$, we define the cocycle $A(n, \omega)$ by

$$A(n, \omega) \equiv \begin{cases} A(T^{n-1}\omega) \cdots A(\omega) & : n > 0 \\ \mathrm{Id} & : n = 0 \\ A^{-1}(T^{n}\omega) \cdots A^{-1}(T^{-1}\omega) & : n < 0. \end{cases}$$
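The three cases of the cocycle definition can be sketched in Python as follows. The base dynamics (a circle rotation), the matrix function `sample_A` and all helper names are hypothetical choices made only for this illustration. The sketch also allows one to check numerically the cocycle identity $A(n+m, \omega) = A(n, T^m\omega) A(m, \omega)$ and the relation $A(-n, T^n\omega) A(n, \omega) = \mathrm{Id}$, both of which follow directly from the definition.

```python
import math

IDENTITY = ((1.0, 0.0), (0.0, 1.0))
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # hypothetical rotation number


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def inverse(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))


def rotate(omega):
    """Sample homeomorphism T: rotation of the circle R/Z by ALPHA."""
    return (omega + ALPHA) % 1.0


def rotate_back(omega):
    """Inverse map T^{-1} of the rotation."""
    return (omega - ALPHA) % 1.0


def sample_A(omega):
    """Sample continuous SL(2, R)-valued function on the circle."""
    return ((2.0 + math.cos(2.0 * math.pi * omega), -1.0), (1.0, 0.0))


def cocycle(A, T, T_inv, n, omega):
    """A(n, omega) following the three cases n > 0, n = 0, n < 0."""
    result = IDENTITY
    point = omega
    if n > 0:
        for _ in range(n):  # left-multiply by A(T^k omega), k = 0, ..., n-1
            result = matmul(A(point), result)
            point = T(point)
    elif n < 0:
        for _ in range(-n):  # left-multiply by A^{-1}(T^{-k} omega), k = 1, ..., |n|
            point = T_inv(point)
            result = matmul(inverse(A(point)), result)
    return result


def close(a, b, tol=1e-8):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(cocycle(sample_A, rotate, rotate_back, 3, 0.2))
    print(cocycle(sample_A, rotate, rotate_back, -3, 0.2))
```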
By Kingman's subadditive ergodic theorem (cf. e.g. [31]), there exists $\Lambda(A) \in \mathbb{R}$ with

$$\Lambda(A) = \lim_{n \to \infty} \frac{1}{n} \log \|A(n, \omega)\|$$

for µ a.e. $\omega \in \Omega$ if $(\Omega, T)$ is uniquely ergodic with invariant probability measure µ. Following [21], we introduce the following definition.

Definition 1. Let $(\Omega, T)$ be strictly ergodic. The continuous function $A : \Omega \to GL(2, \mathbb{R})$ is called uniform if the limit $\Lambda(A) = \lim_{n \to \infty} \frac{1}{n} \log \|A(n, \omega)\|$ exists for all $\omega \in \Omega$ and the convergence is uniform on $\Omega$.

Remark 1. It is possible to show that uniform existence of the limit in the definition already implies uniform convergence. The author learned this from Furstenberg and Weiss [22]. They actually have a more general result. Namely, they consider a continuous subadditive cocycle $(f_n)_{n \in \mathbb{N}}$ on a minimal $(\Omega, T)$ (i.e. the $f_n$ are continuous real-valued functions on $\Omega$ with $f_{n+m}(\omega) \leq f_n(\omega) + f_m(T^n \omega)$ for all $n, m \in \mathbb{N}$ and $\omega \in \Omega$). Their result then gives that existence of $\phi(\omega) = \lim_{n \to \infty} n^{-1} f_n(\omega)$ for all $\omega \in \Omega$ implies constancy of φ as well as uniform convergence.

For spectral theoretic investigations a special type of $SL(2, \mathbb{R})$-valued function is relevant. Namely, for $E \in \mathbb{R}$, we define the continuous function $M^E : \Omega \to SL(2, \mathbb{R})$ by

$$M^E(\omega) \equiv \begin{pmatrix} E - f(T\omega) & -1 \\ 1 & 0 \end{pmatrix}. \qquad (4)$$

It is easy to see that a sequence u is a solution of the difference equation

$$u(n+1) + u(n-1) + (f(T^n \omega) - E) u(n) = 0 \qquad (5)$$

if and only if

$$\begin{pmatrix} u(n+1) \\ u(n) \end{pmatrix} = M^E(n, \omega) \begin{pmatrix} u(1) \\ u(0) \end{pmatrix}, \quad n \in \mathbb{Z}. \qquad (6)$$
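The equivalence of (5) and (6) can be checked numerically for $n \geq 0$ with a short Python sketch. The potential below (a two-valued coding of a circle rotation) and all function names are assumptions introduced for this illustration only; any bounded sequence $n \mapsto f(T^n\omega)$ could be used instead.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # hypothetical rotation number


def potential(n, theta=0.0):
    """Sample values V(n) = f(T^n omega) in {0, 1}, coded by a circle rotation."""
    return 1.0 if (n * ALPHA + theta) % 1.0 >= 1.0 - ALPHA else 0.0


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def transfer_step(E, k):
    """M^E(T^k omega) = [[E - f(T^{k+1} omega), -1], [1, 0]], cf. (4)."""
    return ((E - potential(k + 1), -1.0), (1.0, 0.0))


def transfer_matrix(E, n):
    """M^E(n, omega) = M^E(T^{n-1} omega) ... M^E(omega) for n >= 0."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for k in range(n):
        result = matmul(transfer_step(E, k), result)
    return result


def apply(m, vec):
    """Apply a 2x2 matrix to a vector (top, bottom)."""
    return (m[0][0] * vec[0] + m[0][1] * vec[1], m[1][0] * vec[0] + m[1][1] * vec[1])


def solve_forward(E, u0, u1, n_max):
    """Iterate (5) in the form u(n+1) = (E - V(n)) u(n) - u(n-1); returns u(0..n_max)."""
    u = [u0, u1]
    for n in range(1, n_max):
        u.append((E - potential(n)) * u[n] - u[n - 1])
    return u


if __name__ == "__main__":
    E, u0, u1 = 0.7, 1.0, -0.5
    u = solve_forward(E, u0, u1, 10)
    for n in range(10):
        print(n, apply(transfer_matrix(E, n), (u1, u0)), (u[n + 1], u[n]))
```

Each transfer step has determinant 1, so the product $M^E(n, \omega)$ stays in $SL(2, \mathbb{R})$ up to rounding.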
By the above considerations, $M^E$ gives rise to the average $\gamma(E) \equiv \Lambda(M^E)$. This average is called the Lyapunov exponent for the energy E. It measures the rate of exponential growth of solutions of (5). Our main result now reads as follows.

Theorem 1. Let $(\Omega, T)$ be strictly ergodic. Then the following are equivalent:
(i) The function $M^E$ is uniform for every $E \in \mathbb{R}$.
(ii) $\Sigma = \{E \in \mathbb{R} : \gamma(E) = 0\}$.
In this case the Lyapunov exponent $\gamma : \mathbb{R} \to [0, \infty)$ is continuous.

Remark 2. (a) As will be seen later on, $M^E$ is always uniform for E with $\gamma(E) = 0$ and for $E \in \mathbb{R} \setminus \Sigma$. From this point of view, the theorem essentially states that $M^E$ cannot be uniform for $E \in \Sigma$ with $\gamma(E) > 0$.
(b) Continuity of the Lyapunov exponent can easily be inferred from (ii) (though this does not seem to be in the literature). More precisely, continuity of γ on $\{E \in \mathbb{R} : \gamma(E) = 0\}$ is a consequence of subharmonicity. Continuity of γ on $\mathbb{R} \setminus \Sigma$ follows from the Thouless formula (see e.g. [10] for discussion of subharmonicity and the Thouless formula). Below, we will show that continuity of γ follows from (i), and this will be crucial in our proof of (i) ⇒ (ii).

Having studied (∗) of the introduction in the above theorem, we will now state our result on (∗∗).

Theorem 2. If $(\Omega, T)$ is a subshift satisfying (PW), then the function $M^E$ is uniform for each $E \in \mathbb{R}$.

Remark 3. (a) Uniformity of $M^E$ is rather unusual. This is, of course, clear from Theorem 1. Alternatively, it is not hard to see directly that it already fails for discrete almost periodic operators. More precisely, the almost Mathieu operator with coupling bigger than 2 has uniform positive Lyapunov exponent [24]. By a deterministic version of the theorem of Oseledec (cf. Theorem 8.1 of [34] for example), this would force pure point spectrum for all these operators, if $M^E$ were uniform on the spectrum. However, there are examples of such almost Mathieu operators without point spectrum [2, 29].
(b) The above theorem generalizes [18, 35], which in turn unified the work of Hof [25] on primitive substitutions and of Damanik and the author [17] on certain Sturmian subshifts.
(c) The theorem is a rather direct consequence of the subadditive theorem of [37].

The two theorems yield some interesting conclusions. We start with the following consequence of Theorem 1 concerning (Z). A proof is given in Sect. 4.

Corollary 2.1. Let $(\Omega, T)$ be an aperiodic strictly ergodic subshift. If $M^E$ is uniform for every $E \in \mathbb{R}$, then the spectrum $\Sigma$ is a Cantor set of Lebesgue measure zero.

As $\Sigma = \{E : \gamma(E) = 0\}$ holds for arbitrary Sturmian dynamical subshifts [6, 41] (cf. [19] as well), Theorem 1 immediately implies the following corollary.

Corollary 2.2. Let $(\Omega(\alpha), T)$ be a Sturmian dynamical system with rotation number α. Then $M^E$ is uniform for every $E \in \mathbb{R}$.

Remark 4. So far uniformity of $M^E$ for Sturmian systems could only be established for rotation numbers with bounded continued fraction expansion [17]. Moreover, the corollary is remarkable as a general type of uniform ergodic theorem actually fails as soon as the continued fraction expansion of α is unbounded [37, 38].

Theorem 1, Theorem 2 and Corollary 2.1 directly yield the following corollary.

Corollary 2.3. Let $(\Omega, T)$ be a subshift satisfying (PW). Then $\Sigma = \{E \in \mathbb{R} : \gamma(E) = 0\}$. If $(\Omega, T)$ is furthermore aperiodic, then $\Sigma$ is a Cantor set of Lebesgue measure zero.

Remark 5. For aperiodic $(\Omega, T)$ satisfying (PW), this gives an alternative proof of (S).

As discussed above, primitive substitutions satisfy (PW). As validity of (Z) for primitive substitutions has been a special focus of earlier investigations (cf. the discussion in Sect. 1 and Sect. 5), we explicitly state the following consequence of the foregoing corollary.

Corollary 2.4. Let $(\Omega, T)$ be aperiodic and associated to a primitive substitution. Then $\Sigma$ is a Cantor set of Lebesgue measure zero.
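For orientation, the finite-volume quantity $\frac{1}{n} \log \|M^E(n, \omega)\|$, whose limit defines $\gamma(E)$, can be estimated numerically. The following Python sketch is only an illustration under stated assumptions: it uses the Frobenius norm in place of the operator norm (the two are equivalent, so the limit is unaffected), renormalises the matrix product at every step to avoid overflow, and takes the potential as an arbitrary user-supplied function. The Sturmian-type sample potential and the coupling constant are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def frobenius(a):
    """Frobenius norm of a 2x2 matrix (equivalent to the operator norm)."""
    return math.sqrt(sum(entry * entry for row in a for entry in row))


def lyapunov_estimate(E, potential, n):
    """(1/n) * log of the Frobenius norm of M^E(n, omega), for n > 0."""
    m = ((1.0, 0.0), (0.0, 1.0))
    log_norm = 0.0
    for k in range(n):
        step = ((E - potential(k + 1), -1.0), (1.0, 0.0))
        m = matmul(step, m)
        norm = frobenius(m)
        # Factor out the norm so that the entries stay bounded.
        m = tuple(tuple(entry / norm for entry in row) for row in m)
        log_norm += math.log(norm)
    return log_norm / n


def sturmian_potential(n, coupling=1.0, alpha=(math.sqrt(5.0) - 1.0) / 2.0):
    """Hypothetical two-valued potential coded by a circle rotation."""
    return coupling if (n * alpha) % 1.0 >= 1.0 - alpha else 0.0


if __name__ == "__main__":
    for E in (-1.0, 0.0, 0.5, 1.0, 3.0):
        print(E, lyapunov_estimate(E, sturmian_potential, 5000))
```

For the zero potential the transfer matrices are constant, and at $E = 3$ the estimate approaches $\log\frac{3 + \sqrt{5}}{2}$, the logarithm of the larger eigenvalue; this case is a convenient check of the implementation. Finite-n values for quasiperiodic potentials should be read with care, since the theorems above concern the limit.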
3. Key Results

In this section, we present (consequences of) results of Furman [21] and of the author [37]. We start with some simple facts concerning uniquely ergodic systems. Define for a continuous $b : \Omega \to \mathbb{R}$ and $n \in \mathbb{Z}$ the averaged function $A_n(b) : \Omega \to \mathbb{R}$ by

$$A_n(b)(\omega) \equiv \begin{cases} n^{-1} \sum_{k=0}^{n-1} b(T^k \omega) & : n > 0 \\ 0 & : n = 0 \\ |n|^{-1} \sum_{k=1}^{|n|} b(T^{-k} \omega) & : n < 0. \end{cases} \qquad (7)$$

Moreover, for a continuous b as above and a finite measure µ on $\Omega$ we set $\mu(b) \equiv \int_\Omega b(\omega)\, d\mu(\omega)$. The following proposition is well known, see e.g. [43].

Proposition 3.1. Let $(\Omega, T)$ be uniquely ergodic with invariant probability measure µ. Let b be a continuous function on $\Omega$. Then the averaged functions $A_n(b)$ converge uniformly towards the constant function with value $\mu(b)$ for $|n|$ tending to infinity.

The following consequence of a result by A. Furman is crucial to our approach.

Lemma 3.2. Let $(\Omega, T)$ be strictly ergodic with invariant probability measure µ. Let $B : \Omega \to SL(2, \mathbb{R})$ be uniform with $\Lambda(B) > 0$. Then, for arbitrary $U \in \mathbb{C}^2 \setminus \{0\}$ and $\omega \in \Omega$, there exist constants $D, \kappa > 0$ such that $\|B(n, \omega) U\| \geq D \exp(\kappa |n|)$ holds for all $n \geq 0$ or for all $n \leq 0$. Here, $\|\cdot\|$ denotes the standard norm on $\mathbb{C}^2$.

Proof. Theorem 4 of [21] states that uniformity of B implies that (in the notation of [21]) either $\Lambda(B) = 0$ or B is continuously diagonalizable. As we have $\Lambda(B) > 0$, we infer that B is continuously diagonalizable. This means that there exist continuous functions $C : \Omega \to GL(2, \mathbb{R})$ and $a, d : \Omega \to \mathbb{R}$ with

$$B(1, \omega) = C(T\omega)^{-1} \begin{pmatrix} \exp(a(\omega)) & 0 \\ 0 & \exp(d(\omega)) \end{pmatrix} C(\omega).$$

By multiplication and inversion, this immediately gives

$$B(n, \omega) = C(T^n \omega)^{-1} \begin{pmatrix} \exp(n A_n(a)(\omega)) & 0 \\ 0 & \exp(n A_n(d)(\omega)) \end{pmatrix} C(\omega), \quad n \in \mathbb{Z}. \qquad (8)$$

As $C : \Omega \to GL(2, \mathbb{R})$ is continuous on the compact space $\Omega$, there exists a constant $\rho > 0$ with

$$0 < \rho \leq \|C(\omega)\|,\ |\det C(\omega)|,\ \|C^{-1}(\omega)\|,\ |\det C^{-1}(\omega)| \leq \frac{1}{\rho} < \infty, \quad \text{for all } \omega \in \Omega. \qquad (9)$$

In view of (8) and (9), exponential growth of terms such as $\|B(n, \omega) U\|$ will follow from suitable upper and lower bounds on $A_n(a)(\omega)$ and $A_n(d)(\omega)$ for large $|n|$. To obtain these bounds we proceed as follows. Assume without loss of generality $\mu(a) \geq \mu(d)$. By (9), (8) and Proposition 3.1, we then have

$$0 < \Lambda(B) = \Lambda(C(T\cdot)^{-1} B C) = \Lambda\left( \begin{pmatrix} \exp(a(\cdot)) & 0 \\ 0 & \exp(d(\cdot)) \end{pmatrix} \right) = \mu(a). \qquad (10)$$
Singular Spectrum of Lebesgue Measure Zero
Moreover, det B(ω) = 1 implies det B(n, ω) = 1 for all n ∈ Z. Thus, taking determinants, logarithms and averaging with 1/n in (8), we infer

$$0 = A_n(a)(\omega) + A_n(d)(\omega) + \frac{1}{n}\log\big|\det\big(C(T^n\omega)^{-1}C(\omega)\big)\big|.$$

Taking the limit n → ∞ in this equation and invoking (9) as well as Proposition 3.1, we obtain µ(a) = −µ(d). As µ(a) > 0 by (10), Proposition 3.1 then shows that there exists κ > 0, e.g. κ = ½ µ(a), such that for large |n| we have A_n(a)(ω) > κ and A_n(d)(ω) < −κ for all ω ∈ Ω. Now, the statement of the lemma is a direct consequence of (8) and (9).
Lemma 3.3. Let (Ω, T) be strictly ergodic. Let A : Ω → SL(2, R) be uniform. Let (A_n) be a sequence of continuous SL(2, R)-valued functions converging to A in the sense that d(A_n, A) ≡ sup_{ω∈Ω} ‖A_n(ω) − A(ω)‖ → 0, n → ∞. Then Λ(A_n) → Λ(A), n → ∞.

Proof. This is essentially a result of [21]. More precisely, Theorem 5 of [21] shows that Λ(A_n) converges to Λ(A) whenever the following holds: A is a uniform GL(2, R)-valued function and d(A_n, A) → 0 and d(A_n^{-1}, A^{-1}) → 0, n → ∞. Now, for functions A_n, A with values in SL(2, R), it is easy to see that d(A_n^{-1}, A^{-1}) → 0, n → ∞, if d(A_n, A) → 0, n → ∞. The proof of the lemma is finished.

Lemma 3.4. Let (Ω, T) be uniquely ergodic. Let A : Ω → GL(2, R) be continuous. Then the inequality lim sup_{n→∞} n^{-1} log ‖A(n, ω)‖ ≤ Λ(A) holds uniformly on Ω.

Proof. This follows from Corollary 2 of [21] (cf. Theorem 1 of [21] as well).
Finally, we need the following lemma providing a large supply of uniform functions if (Ω, T) is a subshift satisfying (PW).

Lemma 3.5. Let (Ω, T) be a subshift satisfying (PW). Let F : W → R satisfy F(xy) ≤ F(x) + F(y) (i.e. F is subadditive). Then the limit lim_{|x|→∞} F(x)/|x| exists.

Proof. This is just Proposition 4.2 of [37].
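To make the averaged functions A_n(b) of (7) and the uniform convergence asserted in Proposition 3.1 concrete, the following is a minimal numerical sketch. It is not part of the paper: the dynamical system (rotation of the circle by the golden-mean angle, which is uniquely ergodic with Lebesgue measure), the test function b and the helper names `rotate`, `averaged` and `sup_deviation` are all assumptions chosen for illustration.

```python
import math

# Assumed example system: rotation of the circle [0, 1) by the golden-mean angle.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def rotate(omega, k):
    """Return T^k(omega) for T(omega) = omega + ALPHA mod 1 (k may be negative)."""
    return (omega + k * ALPHA) % 1.0


def b(omega):
    """A continuous test function on the circle whose Lebesgue mean is 0."""
    return math.cos(2.0 * math.pi * omega)


def averaged(func, n, omega):
    """Averaged function A_n(func)(omega) following the three cases of (7)."""
    if n > 0:
        return sum(func(rotate(omega, k)) for k in range(n)) / n
    if n < 0:
        m = -n
        return sum(func(rotate(omega, -k)) for k in range(1, m + 1)) / m
    return 0.0


def sup_deviation(n, mean=0.0, grid=200):
    """Maximum of |A_n(b)(omega) - mean| over a finite grid of base points."""
    return max(abs(averaged(b, n, j / grid) - mean) for j in range(grid))


if __name__ == "__main__":
    for n in (10, 100, 1000, -1000):
        print(n, sup_deviation(n))
```

The grid maximum only approximates the supremum over all ω, so the printed numbers illustrate, rather than prove, the uniform decay of A_n(b) − µ(b) as |n| grows.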
4. Proofs of the Main Results

In this section, we use the results of the foregoing section to prove the theorems stated in Sect. 2. We start with some lemmas needed for the proof of Theorem 1.

Lemma 4.1. Let (Ω, T) be strictly ergodic. If M^E is uniform for every E ∈ R, then Σ = {E ∈ R : γ(E) = 0} and the Lyapunov exponent γ : R → [0, ∞) is continuous.
Proof. We start by showing continuity of the Lyapunov exponent. Consider a sequence (E_n) in R converging to E ∈ R. As the function M^E is uniform by assumption, by Lemma 3.3 it suffices to show that d(M^{E_n}, M^E) → 0, n → ∞. This is clear from the definition of M^E in (4).

Now, set Γ ≡ {E ∈ R : γ(E) = 0}. The inclusion Γ ⊂ Σ follows from general principles (cf. e.g. [10]). Thus, it suffices to show the opposite inclusion Σ ⊂ Γ. By (2), it suffices to show σ(H_ω) ⊂ Γ for a fixed ω ∈ Ω. Assume the contrary. Then there exists spectrum of H_ω in the complement Γ^c ≡ R \ Γ of Γ in R. As γ is continuous, the set Γ^c is open. Thus, spectrum of H_ω can only exist in Γ^c if spectral measures of H_ω actually give weight to Γ^c. By standard results on the generalized eigenfunction expansion [8], there then exists an E ∈ Γ^c admitting a polynomially bounded solution u ≠ 0 of (5). By (6), this solution satisfies (u(n + 1), u(n))^t = M^E(n, ω)(u(1), u(0))^t, n ∈ Z, where v^t denotes the transpose of v. By E ∈ Γ^c, we have Λ(M^E) ≡ γ(E) > 0. As M^E is uniform by assumption, we can thus apply Lemma 3.2 to M^E to obtain that ‖(u(n + 1), u(n))^t‖ is, at least, exponentially growing for large values of n or large values of −n. This contradicts the fact that u is polynomially bounded and the proof is finished.

Lemma 4.2. If (Ω, T) is uniquely ergodic, M^E is uniform for each E ∈ R with γ(E) = 0.

Proof. By det M^E(ω) = 1, we have 1 ≤ ‖M^E(n, ω)‖ and therefore 0 ≤ lim inf_{n→∞} n^{-1} log ‖M^E(n, ω)‖ ≤ lim sup_{n→∞} n^{-1} log ‖M^E(n, ω)‖. Now, the statement follows from Lemma 3.4.

The following lemma is probably well known. However, as we could not find it in the literature, we include a proof.

Lemma 4.3. If (Ω, T) is strictly ergodic, M^E is uniform with γ(E) > 0 for each E ∈ R \ Σ.

Proof. Let E ∈ R \ Σ be given. The proof will be split into four steps. Recall that Σ is the spectrum of H_ω for every ω ∈ Ω by (2) and thus E belongs to the resolvent set of H_ω for all ω ∈ Ω.

Step 1. For every ω ∈ Ω, there exist unique (up to a sign) normalized U(ω), V(ω) ∈ R² such that ‖M^E(n, ω)U(ω)‖ is exponentially decaying for n → ∞ and ‖M^E(n, ω)V(ω)‖ is exponentially decaying for n → −∞. The vectors U(ω), V(ω) are linearly independent. For fixed ω ∈ Ω they can be chosen to be continuous in a neighborhood of ω.

Step 2. Define the matrix C(ω) by C(ω) ≡ (U(ω), V(ω)). Then C(ω) is invertible and there exist functions a, b : Ω → R \ {0} such that

$$C(T\omega)^{-1} M^E(\omega) C(\omega) = \begin{pmatrix} a(\omega) & 0 \\ 0 & b(\omega) \end{pmatrix}. \tag{11}$$

Step 3. The functions |a|, |b|, ‖C‖, ‖C^{-1}‖ : Ω → R are continuous.

Step 4. M^E is uniform with γ(E) > 0.

Ad Step 1. This can be seen by standard arguments. Here is a sketch of the construction. Fix ω ∈ Ω and set u_0(n) ≡ (H_ω − E)^{-1} δ_0(n) and u_{−1}(n) ≡ (H_ω − E)^{-1} δ_{−1}(n), where δ_k, k ∈ Z, is given by δ_k(k) = 1 and δ_k(n) = 0, k ≠ n. By Combes–Thomas
arguments, see e.g. [10], the initial conditions (u_0(0), u_0(1)) and (u_{−1}(0), u_{−1}(1)) give rise to solutions of (5) which decay exponentially for n → ∞. It is easy to see that not both of these solutions can vanish identically. Thus, after normalizing, we find a vector U(ω) with the desired properties. The continuity statement follows easily from continuity of ω ↦ (H_ω − E)^{-1} x, for x ∈ ℓ²(Z). The construction for V(ω) is similar. Uniqueness follows by standard arguments from constancy of the Wronskian. Linear independence is clear as E is not an eigenvalue of H_ω.

Ad Step 2. The matrix C is invertible by linear independence of U and V. The uniqueness statements of Step 1 show that there exist functions a, b : Ω → R with M^E(ω)U(ω) = a(ω)U(Tω) and M^E(ω)V(ω) = b(ω)V(Tω). This easily yields (11). As the left hand side of this equation is invertible, the right hand side is invertible as well. This shows that a and b do not vanish anywhere.

Ad Step 3. Direct calculations show that the functions in question do not change if U(ω) or V(ω) or both are replaced by −U(ω) resp. −V(ω). By Step 1, such a replacement can be used to provide a version of V and U continuous around an arbitrary ω ∈ Ω. This gives the desired continuity.

Ad Step 4. As ‖C‖ and ‖C^{-1}‖ are continuous by Step 3 and Ω is compact, there exists a constant κ > 0 with κ ≤ ‖C(ω)‖, ‖C^{-1}(Tω)‖ ≤ κ^{-1} for every ω ∈ Ω. Thus, uniformity of M^E will follow from uniformity of ω ↦ C^{-1}(Tω) M^E(ω) C(ω), which in turn will follow by Step 2 from uniformity of

$$\omega \mapsto D(\omega) \equiv \begin{pmatrix} |a|(\omega) & 0 \\ 0 & |b|(\omega) \end{pmatrix}.$$

As |a| and |b| are continuous by Step 3 and do not vanish by Step 2, the functions ln |a|, ln |b| : Ω → R are continuous. The desired uniformity of D follows now by Proposition 3.1 (see the proof of Lemma 3.2 for similar reasoning). Positivity of γ(E) is immediate from Step 1.

A simple but crucial step in the proof of Theorem 2 is to relate the transfer matrices to subadditive functions. This will allow us to use Lemma 3.5 to show that the uniformity assumption of Lemma 3.2 and Lemma 3.3 holds for subshifts satisfying (PW). We proceed as follows. Let (Ω, T) be a strictly ergodic subshift and let E ∈ R be given. To the matrix valued function M^E we associate the function F^E : W → R by setting F^E(x) ≡ log ‖M^E(|x|, ω)‖, where ω ∈ Ω is arbitrary with ω(1) · · · ω(|x|) = x. It is not hard to see that this is well defined. Moreover, by submultiplicativity of the norm ‖·‖, we infer that F^E satisfies F^E(xy) ≤ F^E(x) + F^E(y).

Proposition 4.4. M^E is uniform if and only if the limit lim_{|x|→∞} F^E(x)/|x| exists.
Proof. This is straightforward. Now, we can prove the results stated in Sect. 2. Proof of Theorem 1. The implication (i)⇒(ii) is an immediate consequence of Lemma 4.1. This lemma also shows continuity of the Lyapunov exponent. The implication (ii)⇒(i) follows from Lemma 4.2 and Lemma 4.3.
Proof of Corollary 2.1. As Σ is closed and has no discrete points by general principles on random operators, the Cantor property will follow if Σ has measure zero. But this follows from the assumption and Theorem 1, as the set {E ∈ R : γ(E) = 0} has measure zero by the results of Kotani theory discussed in the introduction.

Proof of Theorem 2. This is immediate from Lemma 3.5 and Proposition 4.4.
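The passage from transfer matrices to a subadditive function on words can be sketched numerically. The code below is an illustration only and is not taken from the paper. It assumes the usual one-step transfer matrix ((E − v, −1), (1, 0)) for the difference equation u(n+1) + u(n−1) + v(n)u(n) = E u(n) (the definition (4) is not reproduced in this section), an arbitrary choice of potential values on the alphabet {a, b}, and the Fibonacci words as test words; all function names are hypothetical.

```python
import math

# Assumed potential values attached to the two letters of the alphabet.
POTENTIAL = {"a": 1.0, "b": 0.0}


def fibonacci_word(k):
    """k-th Fibonacci word: s_1 = 'a', s_2 = 'ab', s_{j+1} = s_j + s_{j-1}."""
    prev, cur = "a", "ab"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def transfer_matrix(E, word):
    """Product of the one-step matrices along the word, later letters on the left."""
    M = ((1.0, 0.0), (0.0, 1.0))
    for letter in word:
        M = matmul(((E - POTENTIAL[letter], -1.0), (1.0, 0.0)), M)
    return M


def operator_norm(M):
    """Largest singular value of a real 2x2 matrix."""
    s = sum(x * x for row in M for x in row)      # sum of squared singular values
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]     # product of singular values, up to sign
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * d * d, 0.0))) / 2.0)


def F(E, word):
    """Word function F^E(x) = log of the norm of the transfer matrix over x."""
    return math.log(operator_norm(transfer_matrix(E, word)))


if __name__ == "__main__":
    for k in range(1, 10):
        word = fibonacci_word(k)
        print(len(word), F(0.0, word) / len(word))
```

Subadditivity F(xy) ≤ F(x) + F(y) holds here because the matrix over xy is the product of the matrices over y and x and the operator norm is submultiplicative. The printed ratios F(x)/|x| only hint at the existence of the limit along one family of words; they do not replace the uniform statement of Lemma 3.5.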
5. Further Discussion

In this section we will present some comments on the results proven in the previous sections. As shown in the introduction and the proof of Theorem 1, the problem (Z) for subshifts can essentially be reduced to establishing the inclusion Σ ⊂ {E ∈ R : γ(E) = 0}. This has been investigated for various models by various authors [5–7, 13, 19, 42]. All these proofs rely on the same tool, viz. trace maps (see [1, 9] for study of trace maps as well). Trace maps are very powerful as they capture the underlying hierarchical structures. Besides being applicable in the investigation of (Z), trace maps are extremely useful because
• trace map bounds are an important tool to prove absence of eigenvalues.
Actually, most of the cited literature studies both (A) and (Z). In fact, (Z) can even be shown to follow from a strong version of (A) [19] (cf. [13] as well). While this makes the trace map approach to (Z) very attractive, it has two drawbacks:
• The analysis of the actual trace maps may be quite hard or even impossible.
• The trace map formalism only applies to substitution-like subshifts.
Thus, trace map methods cannot be expected to establish zero-measure spectrum in a generality comparable to the validity of the underlying Kotani result.

Let us now compare this with the method presented above. Essentially, our method has a complementary profile: it does not seem to give information concerning absence of eigenvalues. On the other hand, it only requires a weak ergodic type condition. This condition is met by subshifts satisfying (PW), and this class of subshifts contains all primitive substitutions. In particular, it gives information on the Rudin-Shapiro substitution, which so far had been unattainable. Moreover, quite likely, the condition (PW) will be satisfied for certain circle maps where (Z) could not be proven by other means.
All the same, it seems worthwhile pointing out that (PW) does not contain the class of Sturmian systems whose rotation number has an unbounded continued fraction expansion. This is in fact the only class known to satisfy (Z) (and much more [6, 12, 15–17, 27, 28, 41]) not covered by (PW). For this class, one can use the implication (ii) ⇒ (i) of Theorem 1 to conclude uniform existence of the Lyapunov exponent, as done in Corollary 2.2. Still, it seems desirable to give a direct proof of uniform existence of the Lyapunov exponent for these systems.

Finally, let us give the following strengthening of (the proof of) Theorem 1. It may be of interest whenever the strictly ergodic system is not a subshift.

Theorem 3. Let (Ω, T) be strictly ergodic. Then Σ = {E ∈ R : γ(E) = 0} ∪ {E ∈ R : M^E is not uniform}, where the union is disjoint.
Proof. The union is disjoint by Lemma 4.2. The inclusion “⊃” follows from Lemma 4.3. To prove the inclusion “⊂”, let E ∈ R with M^E uniform and γ(E) > 0 be given. By Lemma 3.3, we infer positivity of the Lyapunov exponent for all F ∈ R close to E. Moreover, by Theorem 4 of [21], for F ∈ R with γ(F) > 0, uniformity of M^F is equivalent to existence of an n ∈ N and a continuous C : Ω → GL(2, R) such that all entries of C(T^n ω)^{-1} M^F(n, ω) C(ω) are positive for all ω ∈ Ω. By uniformity of M^E this latter condition holds for M^E. By continuity of (F, ω) ↦ C(T^n ω)^{-1} M^F(n, ω) C(ω) and compactness of Ω, it must then hold for M^F as well whenever F is sufficiently close to E. These considerations prove existence of an open interval I ⊂ R containing E on which uniformity of the transfer matrices and positivity of the Lyapunov exponent hold (cf. top of p. 811 of [21] for related arguments). Now, replacing Γ^c with I, one can easily adapt the proof of Lemma 4.1 to obtain the desired inclusion.

Note added. After this work was completed, we learned about the very recent preprint “Measure Zero Spectrum of a Class of Schrödinger Operators” by Liu–Tan–Wen–Wu (mp-arc 01-189). They present a detailed and thorough analysis of trace maps for primitive substitutions. Based on this analysis, they establish (Z) for all primitive substitutions, thereby extending the approach developed in [5, 7, 9, 41].

Acknowledgements. This work was done while the author was visiting The Hebrew University, Jerusalem. The author would like to thank Y. Last for hospitality as well as for many stimulating conversations on a wide range of topics including those considered above. Enlightening discussions with B. Weiss are also gratefully acknowledged. Special thanks are due to H. Furstenberg for most valuable discussions and for bringing the work of A. Furman [21] to the author's attention. The author would also like to thank D. Damanik for an earlier collaboration on the topic of zero measure spectrum [19].
References

1. Allouche, J.-P., Peyrière, J.: Sur une formule de récurrence sur les traces de produits de matrices associés à certaines substitutions. C.R. Acad. Sci. Paris 302, 1135–1136 (1986)
2. Avron, J., Simon, B.: Almost periodic Schrödinger operators, II. The integrated density of states. Duke Math. J. 50, 369–391 (1983)
3. Baake, M.: A guide to mathematical quasicrystals. In: Quasicrystals, eds. J.-B. Suck, M. Schreiber, P. Häussler, Berlin: Springer, 1999
4. Bellissard, J.: Spectral properties of Schrödinger operators with a Thue-Morse potential. In: Number theory and physics, eds. J.-M. Luck, P. Moussa, M. Waldschmidt, Proceedings in Physics 47, Berlin: Springer, 140–150 (1989)
5. Bellissard, J., Bovier, A., Ghez, J.-M.: Spectral properties of a tight binding Hamiltonian with period doubling potential. Commun. Math. Phys. 135, 379–399 (1991)
6. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasicrystals. Commun. Math. Phys. 125, 527–543 (1989)
7. Bovier, A., Ghez, J.-M.: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 158, 45–66 (1993); Erratum: Commun. Math. Phys. 166, 431–432 (1994)
8. Berezanskii, J.M.: Expansions in eigenfunctions of self-adjoint operators. Transl. Math. Monographs 17, Am. Math. Soc. Providence, R.I. (1968)
9. Casdagli, M.: Symbolic dynamics for the renormalization map of a quasiperiodic Schrödinger equation. Commun. Math. Phys. 107, 295–318 (1986)
10. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser (1990)
11. Damanik, D.: Singular continuous spectrum for a class of substitution Hamiltonians. Lett. Math. Phys. 46, 303–311 (1998)
12. Damanik, D.: α-continuity properties of one-dimensional quasicrystals. Commun. Math. Phys. 192, 169–182 (1998)
13. Damanik, D.: Substitution Hamiltonians with bounded trace map orbits. J. Math. Anal. Appl. 249, 393–411 (2000)
14. Damanik, D.: Gordon-type arguments in the spectral theory of one-dimensional quasicrystals. In: Directions in Mathematical Quasicrystals, eds. M. Baake, R.V. Moody, CRM Monograph Series 13, Providence, RI: AMS, 2000, 277–305
15. Damanik, D., Killip, R., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity. Commun. Math. Phys. 212, 191–204 (2000)
16. Damanik, D., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, I. Absence of eigenvalues. Commun. Math. Phys. 207, 687–696 (1999)
17. Damanik, D., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, II. The Lyapunov exponent. Lett. Math. Phys. 50, 245–257 (1999)
18. Damanik, D., Lenz, D.: Linear repetitivity I., Subadditive ergodic theorems. To appear in Discr. Comput. Geom.
19. Damanik, D., Lenz, D.: Half-line eigenfunctions estimates and singular continuous spectrum of zero Lebesgue measure. Preprint
20. Durand, F.: Linearly recurrent subshifts have a finite number of non-periodic subshift factors. Ergod. Th. & Dynam. Sys. 20, 1061–1078 (2000)
21. Furman, A.: On the multiplicative ergodic theorem for uniquely ergodic systems. Ann. Inst. Henri Poincaré Probab. Statist. 33, 797–815 (1997)
22. Furstenberg, H., Weiss, B.: Private communication
23. Geerse, C., Hof, A.: Lattice gas models on self-similar aperiodic tilings. Rev. Math. Phys. 3, 163–221 (1991)
24. Herman, M.-R.: Une méthode pour minorer les exposants de Lyapunov et quelques exemples montrant le caractère local d’un théorème d’Arnold et de Moser sur le tore de dimension 2. Comment. Math. Helv. 58, 453–502 (1983)
25. Hof, A.: Some remarks on aperiodic Schrödinger operators. J. Stat. Phys. 72, 1353–1374 (1993)
26. Hof, A., Knill, O., Simon, B.: Singular continuous spectrum for palindromic Schrödinger operators. Commun. Math. Phys. 174, 149–159 (1995)
27. Jitomirskaya, S., Last, Y.: Power law subordinacy and singular spectra. I. Half-line operators. Acta Math. 183, 171–189 (1999)
28. Jitomirskaya, S., Last, Y.: Power law subordinacy and singular spectra. II. Line operators. Commun. Math. Phys. 211, 643–658 (2000)
29. Jitomirskaya, S., Simon, B.: Operators with singular continuous spectrum. III. Almost periodic Schrödinger operators. Commun. Math. Phys. 165, 201–205 (1994)
30. Kaminaga, M.: Absence of point spectrum for a class of discrete Schrödinger operators with quasiperiodic potential. Forum Math. 8, 63–69 (1996)
31. Katznelson, Z., Weiss, B.: A simple proof of some ergodic theorems. Israel J. Math. 34, 291–296 (1982)
32. Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1989)
33. Lagarias, J.C., Pleasants, P.A.B.: Repetitive Delone sets and quasicrystals. To appear in Ergod. Th. & Dynam. Sys.
34. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum for one-dimensional Schrödinger operators. Invent. Math. 135, 329–367 (1999)
35. Lenz, D.: Aperiodische Ordnung und gleichmässige spektrale Eigenschaften von Quasikristallen. Dissertation, Frankfurt/Main, Logos, Berlin (2000)
36. Lenz, D.: Random operators and crossed products. Mathematical Physics, Analysis and Geometry 2, 197–220 (1999)
37. Lenz, D.: Uniform ergodic theorems on subshifts over a finite alphabet. Ergod. Th. & Dynam. Syst. 22, 245–255 (2002)
38. Lenz, D.: Hierarchical structures in Sturmian dynamical systems. Preprint
39. Queffélec, M.: Substitution Dynamical Systems – Spectral Analysis. Lecture Notes in Mathematics, Vol. 1284. Berlin–Heidelberg–New York: Springer, 1987
40. Senechal, M.: Quasicrystals and geometry. Cambridge: Cambridge University Press, 1995
41. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987)
42. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure. J. Stat. Phys. 56, 525–531 (1989)
43. Walters, P.: An introduction to ergodic theory. Graduate Texts in Mathematics 79, Berlin: Springer, 1982

Communicated by B. Simon
Commun. Math. Phys. 227, 131 – 153 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Integrable Structure of the Dirichlet Boundary Problem in Two Dimensions
A. Marshakov1,2, P. Wiegmann3,4, A. Zabrodin5,2
1 Theory Department, Lebedev Physics Institute, Leninsky pr. 53, 117924 Moscow, Russia
2 ITEP, Bol. Cheremushkinskaya str. 25, 117259 Moscow, Russia
3 James Franck Institute and Enrico Fermi Institute of the University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA
4 Landau Institute for Theoretical Physics, Moscow, Russia
5 Institute of Biochemical Physics, Kosygina str. 4, 119991 Moscow, Russia
Received: 18 September 2001 / Accepted: 18 December 2001
Abstract: We study how the solution of the two-dimensional Dirichlet boundary problem for smooth simply connected domains depends upon variations of the data of the problem. We show that the Hadamard formula for the variation of the Dirichlet Green function under deformations of the domain reveals an integrable structure. The independent variables corresponding to the infinite set of commuting flows are identified with harmonic moments of the domain. The solution to the Dirichlet boundary problem is expressed through the tau-function of the dispersionless Toda hierarchy. We also discuss a degenerate case of the Dirichlet problem on the plane with a gap. In this case the tau-function is identical to the partition function of the planar large N limit of the Hermitian one-matrix model.

1. Introduction

The subject of the Dirichlet boundary problem in two dimensions [1] is a harmonic function in a domain of the complex plane bounded by a closed curve, with a given value on the boundary and continuous up to the boundary. The question we address in this paper is how the harmonic function in the bulk varies under a small deformation of the shape of the domain. Remarkably, this standard problem of complex analysis possesses an integrable structure [2, 3] which we intend to clarify further in this paper. It is described by a particular solution of an integrable hierarchy of partial differential equations known in the literature as the dispersionless Toda (dToda) hierarchy. Moreover, related integrable hierarchies arise in the context of 2D topological theories, and just the same solution to the dToda hierarchy emerges in the study of 2D quantum gravity [4, 5] (we do not elaborate on these relations in this paper).

Let D be a simply connected domain in the complex plane bounded by a smooth simple curve γ. The Dirichlet problem is to find a harmonic function u(z) in D such that it is continuous up to the boundary and equals a given function u_0(z) on the boundary.
The problem has a unique solution written in terms of the Green function G(z_1, z_2) of the Dirichlet boundary problem:

$$u(z) = -\frac{1}{2\pi}\oint_{\gamma} u_0(\xi)\,\partial_n G(z,\xi)\,|d\xi|, \tag{1.1}$$

where ∂_n is the normal derivative on the boundary with respect to the second variable, and the normal vector n always points into the domain where the Dirichlet problem is posed. Equivalently, the solution is represented as

$$u(z) = \frac{1}{\pi i}\oint_{\partial D} u_0(\xi)\,\partial_\xi G(z,\xi)\,d\xi,$$

where ∂D is understood as γ traversed anticlockwise with respect to the domain. The main object to study is, therefore, the Dirichlet Green function. It is uniquely determined by the following properties [1]:

(G1) The function G(z_1, z_2) − log |z_1 − z_2| is symmetric, bounded and harmonic everywhere in D in both arguments;
(G2) G(z_1, z_2) = 0 if any one of the variables belongs to the boundary.

The definition implies that G(z_1, z_2) is real and negative in D. The Green function can be written explicitly through a conformal map of the domain D onto some “reference” domain for which the Green function is known. A convenient choice is the unit disk. Let f(z) be any bijective conformal map of D onto the unit disk (or its complement), then

$$G(z_1,z_2) = \log\left|\frac{f(z_1)-f(z_2)}{f(z_1)\overline{f(z_2)}-1}\right|, \tag{1.2}$$

where the bar means complex conjugation. Such a map exists by virtue of the Riemann mapping theorem [1].

It thus suffices to study variations of the conformal map f(z) under deformations of the boundary. This problem was discussed in [2, 3], where it was shown that evolution of the conformal map under changing harmonic moments of the domain is given by the dToda integrable hierarchy. (A relation between conformal maps of slit domains and special solutions to some integrable equations of hydrodynamic type was earlier observed by Gibbons and Tsarev [6].) The study of the Dirichlet problem approaches this subject from another angle. Our starting point is the Hadamard variational formula [7]. It gives the variation of the Green function under small deformations of the domain in terms of the Green function itself:

$$\delta G(z_1,z_2) = \frac{1}{2\pi}\oint_{\gamma} \partial_n G(z_1,\xi)\,\partial_n G(z_2,\xi)\,\delta h(\xi)\,|d\xi|. \tag{1.3}$$

Here δh(ξ) is the thickness between the curve γ and the deformed curve, counted along the normal vector at the point ξ ∈ γ. We show that already this remarkable formula reflects all integrable properties of the Dirichlet problem.

A smooth closed curve γ (for simplicity, we may assume it to be analytic in order to have an easy sufficient justification of some arguments below) divides the complex plane into two parts having a common boundary: a compact interior domain D_int, and an exterior domain D_ext containing ∞. Correspondingly, one distinguishes interior and exterior Dirichlet problems. The main content of the paper is common to both of them. To stress this, we try to keep the notation uniform, calling the domain simply D. We will
show that (logarithms of) the tau-functions, introduced in [2, 3] and further studied in [8], for the interior and exterior problems are related to each other by a Legendre transform. The exterior Dirichlet problem makes sense when the interior domain degenerates into a segment (a plane with a gap). We will show that in this case a deformation problem is described by the dispersionless limit of the Toda chain hierarchy and discuss its relation to the planar limit of the Hermitian matrix model.

2. Deformations of the Boundary

Let D be a simply-connected domain in the extended complex plane bounded by a smooth simple curve γ. Consider a basis ψ_k(z), k ≥ 1, of holomorphic functions in D such that ψ_k(z_0) = 0 for some point z_0 ∈ D. We call z_0 the normalization point. The basis is assumed to be fixed and independent of the domain. For example, in the case of the interior problem one may assume, without loss of generality, that the origin is in D and set z_0 = 0, ψ_k(z) = z^k/k, while a natural choice for the exterior problem is z_0 = ∞ and ψ_k(z) = z^{−k}/k. Throughout the paper, these bases for interior and exterior problems are referred to as natural ones. Let t_k be moments of the domain D defined with respect to the basis ψ_k:

$$t_k = \frac{\kappa}{\pi}\int_D \psi_k(z)\,d^2z, \quad k = 1, 2, \ldots, \tag{2.1}$$

where κ = ± for the interior (exterior) problem. We also assume that the functions ψ_k for domains containing ∞ are integrable, or the integrals are properly regularized (see below). Besides, we denote by t_0 the area (divided by π) of the domain D in the case of the interior problem and that of the complementary (compact) domain in the case of the exterior problem:

$$t_0 = \begin{cases} \dfrac{1}{\pi}\displaystyle\int_D d^2z & \text{for compact domains}, \\[2mm] \dfrac{1}{\pi}\displaystyle\int_{\mathbb{C}\setminus D} d^2z & \text{for non-compact domains}. \end{cases}$$

Let us note that the moments (except for t_0) are in general complex. We call the quantities t_k, \bar t_k and t_0 harmonic moments of the domain D. The Stokes formula represents the harmonic moments as contour integrals

$$t_k = \frac{1}{2\pi i}\oint_{\gamma} \psi_k(z)\,\bar z\,dz, \quad k = 0, 1, 2, \ldots$$

(where we set ψ_0(z) = 1), providing, in particular, a regularization of possibly divergent integrals (2.1) in the case of the exterior problem. Throughout the paper the contour γ is traversed in the anticlockwise direction both for interior and exterior problems.

The basic fact of the theory of deformations of closed analytic curves is that the (in general complex) moments t_k supplemented by the real variable t_0 form a set of local coordinates in the space of smooth closed curves [9] (see also [10]). This means that under any small deformation of the domain the set {t_0, t_1, . . . } is subject to a small change and vice versa. More precisely, let γ(t) be a family of curves such that ∂_t t_k = 0 in some neighborhood of t = 0; then all the curves γ(t) coincide with γ = γ(0) in this neighborhood.
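As a concrete check of the contour-integral representation of the harmonic moments, the following sketch evaluates t_k for the interior of an ellipse in the natural basis ψ_k(z) = z^k/k (interior problem, κ = +, z_0 = 0). It is an illustration under stated assumptions, not part of the paper: the ellipse, its semi-axes and the helper names `ellipse` and `harmonic_moment` are hypothetical, and the integral is approximated by the trapezoidal rule in the curve parameter. For semi-axes a and b, evaluating the area integrals (2.1) directly gives t_0 = ab, t_1 = 0 and t_2 = ab(a² − b²)/8, which the contour integral should reproduce.

```python
import math


def ellipse(a, b):
    """Anticlockwise parametrization of an ellipse centred at the origin.

    Returns a function theta -> (z, dz/dtheta).
    """
    def curve(theta):
        z = complex(a * math.cos(theta), b * math.sin(theta))
        dz = complex(-a * math.sin(theta), b * math.cos(theta))
        return z, dz
    return curve


def harmonic_moment(k, curve, samples=2000):
    """Approximate t_k = (1 / (2 pi i)) * contour integral of psi_k(z) conj(z) dz.

    Uses psi_0 = 1 and psi_k = z**k / k for k >= 1 (natural interior basis).
    """
    total = 0j
    step = 2.0 * math.pi / samples
    for j in range(samples):
        z, dz = curve(j * step)
        psi = 1.0 if k == 0 else z ** k / k
        total += psi * z.conjugate() * dz
    return total * step / (2j * math.pi)


if __name__ == "__main__":
    curve = ellipse(2.0, 1.0)
    for k in range(4):
        print(k, harmonic_moment(k, curve))
```

Because the integrand is a trigonometric polynomial in the parameter, the equally spaced rule is accurate up to rounding here; for a general smooth curve it is only an approximation.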
Fig. 1. Action of the operator ∇(ξ) in the case of interior D_int (left) and exterior D_ext (right) domains. In our convention the bump always looks outside the (interior or exterior) domain
The family of differential operators

$$\nabla(z) = \partial_{t_0} + \sum_{k\ge 1}\left(\psi_k(z)\,\partial_{t_k} + \overline{\psi_k(z)}\,\partial_{\bar t_k}\right) \tag{2.2}$$

spans the complexified tangent space to the space of curves. They are invariant under change of variables in the following sense: let \tilde t_k be harmonic moments defined with respect to another basis, \tilde ψ_k, of holomorphic functions in D; then \tilde ∇(z) = ∇(z). Note that ∇(z_0) = ∂_{t_0} since ψ_k(z_0) = 0. The operator ∇(z) has a clear geometrical meaning described below.

Let us consider a special deformation of the domain obtained by adding to it an infinitesimal smooth bump (of an arbitrary form) with area ε located at the point ξ ∈ γ. Our convention is that ε > 0 if the bump looks outside the domain in which the Dirichlet problem is posed, as is shown in Fig. 1. Let A be any functional of a domain that depends on the harmonic moments only. The variation of such a functional in the leading order in ε is given by

$$\delta_{\epsilon(\xi)} A = \kappa\,\frac{\epsilon}{\pi}\,\nabla(\xi) A, \quad \xi \in \gamma. \tag{2.3}$$

Indeed, combining

$$\delta A = \partial_{t_0} A\,\delta t_0 + \sum_{k\ge 1}\left(\partial_{t_k} A\,\delta t_k + \partial_{\bar t_k} A\,\delta \bar t_k\right) \quad\text{and}\quad \delta t_k = \kappa\,\frac{1}{\pi}\int_{\text{bump}} \psi_k(z)\,d^2z = \kappa\,\frac{\epsilon}{\pi}\,\psi_k(\xi),$$

we obtain (2.3). So, the result of the action of the operator ∇(ξ) with ξ ∈ γ on A is proportional to the variation of the functional under attaching a bump at the point ξ.

To put it differently, we can say that the boundary value of the function ∇(z)A is determined by the l.h.s. of (2.3). For functionals A such that the series ∇(z)A converges everywhere in D up to the boundary, this remark gives a usable method to find the function ∇(z)A everywhere in the domain. This function is harmonic in D with the boundary value determined from (2.3). It is given by (1.1):

$$\nabla(z) A = -\frac{\kappa}{2\pi}\oint_{\gamma} |d\xi|\,\partial_n G(z,\xi)\,\frac{\pi}{\epsilon}\,\delta_{\epsilon(\xi)} A. \tag{2.4}$$

This gives the result of the action of the operator ∇(z) when the argument is anywhere in D.
For example, given any regular function f in a domain containing the interior domain D, set A_f = ∫_D f(z) d²z. We have: δ_{ε(ξ)} A_f = (ε/π) ∇(ξ) A_f = ε f(ξ), ξ ∈ γ. If the function f is harmonic in D, then ∇(z) A_f = π f(z) for any z ∈ D.

The subject of the deformation theory of the Dirichlet boundary problem is to compute ∇(z)G(z_1, z_2) through the conformal map or the Green function of the original domain. In the next section, we do this using the Hadamard variational formula.

3. Hadamard Variational Formula and Dispersionless Integrable Hierarchy

3.1. The Hadamard integrability condition. Variation of the Green function under small deformations of the domain is known due to Hadamard [7], see Eq. (1.3). Specialized to the particular case of attaching a bump of area ε, it reads:

$$\delta_{\epsilon(\xi)} G(z_1,z_2) = -\frac{\epsilon}{2\pi}\,\partial_n G(z_1,\xi)\,\partial_n G(z_2,\xi), \quad \xi \in \gamma. \tag{3.1}$$

To find how the Green function changes under a variation of the harmonic moments, we use (2.4) to employ the harmonic continuation procedure explained in the previous section. The harmonic function in D with a boundary value given by the Hadamard formula is

$$\nabla(z_3) G(z_1,z_2) = \frac{\kappa}{4\pi}\oint_{\gamma} \partial_n G(z_1,\xi)\,\partial_n G(z_2,\xi)\,\partial_n G(z_3,\xi)\,|d\xi|. \tag{3.2}$$

It is obvious from the r.h.s. of (3.2) that the result of the action of the operator ∇(z) on the Green function is harmonic and symmetric in all three arguments, i.e.,

$$\nabla(z_3) G(z_1,z_2) = \nabla(z_1) G(z_2,z_3). \tag{3.3}$$
This is our basic relation. It has the form of the integrability condition. In the rest of the paper we will draw consequences of this symmetry and underlying algebraic structures. Note also that despite the fact that the Green function vanishes on the boundary, its derivative (the l.h.s. of Eq. (3.3)) with respect to the deformation of the domain does not.

The basic equation (3.3) is a compressed form of an integrable hierarchy. To unfold it, let us separate holomorphic and antiholomorphic parts of this equation. Let $E$ be the exterior to the unit disk. Given a point $a \in D$, consider a bijective conformal map $f_a : D \to E$ such that $f_a(a) = \infty$. The Dirichlet Green function then is

$$G(a, z) = -\log|f_a(z)|. \tag{3.4}$$
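As a minimal illustration of (3.4), the sketch below assumes that $D$ is the unit disk, for which the Moebius map $f_a(z) = (1 - \bar a z)/(z - a)$ sends $a$ to infinity and the unit circle to itself. The choice of domain and the function names are assumptions made only for this example; the overall phase of $f_a$ plays no role in the modulus.

```python
import cmath
import math

def f_a(a: complex, z: complex) -> complex:
    """Moebius map of the unit disk onto the exterior of the unit disk, with f_a(a) = infinity.

    Assumption: D is the unit disk; this is not the general map (3.6).
    """
    return (1 - a.conjugate() * z) / (z - a)

def green(a: complex, z: complex) -> float:
    """Dirichlet Green function G(a, z) = -log|f_a(z)|, as in (3.4)."""
    return -math.log(abs(f_a(a, z)))

a, b = 0.3 + 0.2j, -0.5 + 0.1j

# G vanishes on the boundary |z| = 1 ...
boundary = [cmath.exp(2j * math.pi * k / 12) for k in range(12)]
max_boundary_value = max(abs(green(a, xi)) for xi in boundary)

# ... and is symmetric in its two arguments.
symmetry_gap = abs(green(a, b) - green(b, a))
```

Both quantities should vanish up to rounding, and `green(a, b)` is negative inside the disk because $|f_a| > 1$ there.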
Under a proper normalization of the map the integrability condition (3.3) becomes holomorphic:

$$\nabla(b)\log f_a(z) = \nabla(a)\log f_b(z) \tag{3.5}$$

for all $a, b, z \in D$. The following normalization will be convenient: the overall phase is chosen to be $\arg f_a(z_0) = \pi - \arg(z_0 - a)$ if $a \neq z_0$, where $z_0 \in D$ is the normalization point. If $a = z_0$ we set $\lim_{z \to z_0}(z - z_0)^2 f'(z)$ to be real and negative. Under these conditions

$$f_a(z) = \left(\frac{(\bar a - \bar z_0)\, f(a)}{(a - z_0)\, \overline{f(a)}}\right)^{1/2} \frac{f(z)\overline{f(a)} - 1}{f(a) - f(z)} \tag{3.6}$$
A. Marshakov, P. Wiegmann, A. Zabrodin
(for $a, z_0 \neq \infty$). In the vicinity of the point $a$ ($a \neq \infty$)

$$f_a(z) = \frac{r_a\, e^{i\omega(a, z_0)}}{z - a}\Bigl(1 + \sum_{k \ge 1} p_k(a)(z - a)^k\Bigr), \tag{3.7}$$
where the real constant $r_a$ is called the conformal radius of the domain [11] with respect to the point $a$, and $\omega$ is a phase determined from the normalization condition. In particular, $\omega(z_0, z_0) = 0$. Similarly, the map $f_a(z)$ can be defined in the case when either $a$ or $z_0$ lies at infinity.

To verify (3.5), we note that the holomorphic function $\nabla(b)\log f_a(z) - \nabla(a)\log f_b(z)$ is also antiholomorphic (in $z$) by virtue of (3.3), and thus must be a constant. Setting $z = z_0$ we find that the latter is zero:

$$\nabla(b)\log|f_a(z_0)| - \nabla(a)\log|f_b(z_0)| + i\nabla(b)\arg f_a(z_0) - i\nabla(a)\arg f_b(z_0) = 0$$

(the first two terms cancel due to (3.3); the last two vanish because the normalization does not depend on the shape of the domain).

3.2. Harmonic moments as commuting flows. Equation (3.5) suggests treating $\log f_a(z)$ as a generating function of commuting flows with respect to the spectral parameter $a$. The expansion

$$\log f_a(z) = H_0(z) + \sum_{k \ge 1}\Bigl(\psi_k(a)\,H_k(z) - \overline{\psi_k(a)}\,\tilde H_k(z)\Bigr) \tag{3.8}$$

defines generators $H_k$, $\tilde H_k$ of the commuting flows. Clearly, $H_0(z) = \log f(z)$. It implies evolution equations for $f(z)$,

$$\frac{\partial \log f(z)}{\partial t_k} = \frac{\partial H_k(z)}{\partial t_0}, \qquad \frac{\partial \log f(z)}{\partial \bar t_k} = -\frac{\partial \tilde H_k(z)}{\partial t_0},$$

and integrability conditions:

$$\frac{\partial H_j(z)}{\partial t_k} = \frac{\partial H_k(z)}{\partial t_j}, \qquad \frac{\partial \tilde H_j(z)}{\partial \bar t_k} = \frac{\partial \tilde H_k(z)}{\partial \bar t_j}, \qquad \frac{\partial \tilde H_j(z)}{\partial t_k} = -\frac{\partial H_k(z)}{\partial \bar t_j}. \tag{3.9}$$

The real part of (3.8) vanishes on the boundary (as it is the Dirichlet Green function); therefore, the boundary values of $\tilde H_k$ and $H_k$ are complex conjugate:

$$\tilde H_k(z) = \overline{H_k(z)}, \qquad z \in \gamma. \tag{3.10}$$
The structure of the integrable hierarchy becomes explicit if instead of functions of $z$ one passes to functions of its image $w$ under the map $f_{z_0}$: $w = f_{z_0}(z) \equiv f(z)$. Using the chain rule, one can write

$$\nabla(a)\log f_b(z) = \nabla(a)\log f_b(z(w))\Big|_{w} + \bigl(\nabla(a)\log f(z)\bigr)\, w\,\partial_w \log f_b.$$

In the last term we observe that $\nabla(a)\log f(z) = \partial_{t_0}\log f_a(z)$ (using (3.5) at $b = z_0$). Subtracting the same equality with $a$, $b$ interchanged, we come to the equation of zero-curvature type:

$$\nabla(a)\log f_b - \nabla(b)\log f_a - \{\log f_a, \log f_b\} = 0,$$
Integrable Structure of the Dirichlet Boundary Problem in Two Dimensions
where the Poisson brackets are defined as

$$\{f, g\} \equiv w\,\frac{\partial f}{\partial w}\frac{\partial g}{\partial t_0} - w\,\frac{\partial g}{\partial w}\frac{\partial f}{\partial t_0}$$

and $t_k$-derivatives are taken now at fixed $w$. Let $z(w)$ be the map inverse to $w = f(z)$. Equation (3.5) at $b = z_0$, being rewritten for the inverse map, has the form of a one-parametric family of evolution equations of the Lax type labeled by the spectral parameter $a$. They are

$$\nabla(a)z(w) = \{\log f_a(z(w)), z(w)\}. \tag{3.11}$$
We refer to them as deformation equations. The zero-curvature conditions ensure that these equations are consistent, i.e., flows with different values of the spectral parameter commute. The integrability conditions (3.9) in the new variable acquire the form of the zero-curvature equations

$$\begin{aligned}
\partial_{t_j} H_i(w) - \partial_{t_i} H_j(w) + \{H_i(w), H_j(w)\} &= 0,\\
\partial_{\bar t_j} \tilde H_i(w) - \partial_{\bar t_i} \tilde H_j(w) + \{\tilde H_j(w), \tilde H_i(w)\} &= 0,\\
\partial_{t_j} \tilde H_i(w) + \partial_{\bar t_i} H_j(w) + \{\tilde H_i(w), H_j(w)\} &= 0.
\end{aligned} \tag{3.12}$$

From (3.6) it is easy to see that $H_k$ are polynomials in $w$ while $\tilde H_k$ are polynomials in $w^{-1}$. Furthermore, for any basis such that $\psi_k(z) = O((z - z_0)^k)$ these polynomials are of degree $k$. In other words, the generators are meromorphic functions on the Riemann sphere with two marked points at $w = 0$ and $w = \infty$. This is a particular case of the universal Whitham hierarchy [12], known as the dispersionless Toda lattice. In [13], it is proved that the full set of zero-curvature conditions (3.12), together with the polynomial structure of the generators, already implies the existence of the Lax function and Lax-Sato equations.

For a particular choice of the basis, the Lax function can be expressed through the inverse conformal map. Consider the interior problem, set $z_0 = 0 \in D$ and fix the basis of holomorphic functions in $D$ to be the natural one: $\psi_k(z) = z^k/k$. From (3.7) and (3.8) it follows that $\tilde H_k$ are holomorphic in $D$ while $H_k$ are meromorphic with the $k$th order pole at $0$ so that $H_k = z^{-k} + O(1)$ as $z \to 0$. Combining these properties with the polynomial structure of the generators as functions of $w$, and taking into account (3.10), one gets
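$$H_k = \bigl(z^{-k}(w)\bigr)_{>0} + \frac12\bigl(z^{-k}(w)\bigr)_0, \qquad \tilde H_k = \bigl(\bar z^{-k}(w^{-1})\bigr)_{<0} + \frac12\bigl(\bar z^{-k}(w^{-1})\bigr)_0,$$

where $(f(w))_{>0}$ and $(f(w))_{<0}$ denote the parts of the Laurent series of $f$ with positive and negative powers of $w$, and $(f(w))_0$ its constant term. [...]

$$\tilde H_k(w) = \bigl(\tilde L^k(w)\bigr)_{<0} + \frac12\bigl(\tilde L^k(w)\bigr)_0$$

[...]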
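where the additional superscripts are set to distinguish between exterior and interior moments. Under our assumptions, these series converge in $D^{\rm int}$ and $D^{\rm ext}$ respectively. The functions $\Phi^{\rm ext}$ and $\partial_z \Phi^{\rm ext}$ are continuous at the boundary.

5.2. Integral formulas for the tau-function. Using the same strategy, we set $z$ in (5.1) to the boundary and interpret this formula as a result of the bump deformation of the domain. It is easy to check that the variation of

$$F^{\rm ext} = \frac{1}{2\pi}\int_{D^{\rm int}} d^2z\, \Phi^{\rm ext}(z) = -\frac{1}{\pi^2}\int_{D^{\rm int}} d^2z \int_{D^{\rm int}} d^2\zeta\, \log\left|\frac1z - \frac1\zeta\right| \tag{5.5}$$

is

$$\delta_{\epsilon(\xi)} F^{\rm ext} = \frac{\epsilon}{2\pi}\Phi^{\rm ext}(\xi) + \frac{1}{2\pi}\int_{D^{\rm int}} d^2z\, \delta_{\epsilon(\xi)}\Phi^{\rm ext}(z) = \frac{\epsilon}{2\pi}\Phi^{\rm ext}(\xi) - \frac{\epsilon}{\pi^2}\int_{D^{\rm int}} d^2z\, \log|z^{-1} - \xi^{-1}| = \frac{\epsilon}{\pi}\Phi^{\rm ext}(\xi).$$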
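Equation (5.5) presents the tau-function for the exterior Dirichlet problem as a double integral over the domain complementary to $D^{\rm ext}$ [8]. Similar arguments give the tau-function of the interior problem:

$$F^{\rm int} = \lim_{R \to \infty}\Bigl(-\frac{1}{\pi^2}\int_{D^{\rm ext}} d^2z \int_{D^{\rm ext}} d^2\zeta\, \log|z - \zeta| + C(R) - c(R)\,t_0\Bigr)$$

with $c(R)$ as in (5.3), and

$$C(R) = \frac{1}{\pi^2}\int_{|z|<R} d^2z \int_{|\zeta|<R} d^2\zeta\, \log|z - \zeta| = \frac14 R^4\,(2\log R^2 - 1).$$

[...]

$$\sum_{k>0}(2 - k)\bigl(t_k\,\partial_{t_k} F^{\rm ext} + \bar t_k\,\partial_{\bar t_k} F^{\rm ext}\bigr). \tag{5.11}$$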
These formulas reflect the scaling of moments as $z \to \lambda z$ with real $\lambda$: $t_k^{\rm ext} \to \lambda^{2+k} t_k^{\rm ext}$, $t_k^{\rm int} \to \lambda^{2-k} t_k^{\rm int}$. The logarithmic moment $v_0$, under the same rescaling, exhibits a more complicated behaviour: $v_0 \to \lambda^2 v_0 + t_0 \lambda^2 \log \lambda^2$. To get rid of the “anomaly term” $t_0^2$ one may modify the tau-function by subtracting $\frac14 t_0^2 \log t_0^2$.

6. Dirichlet Problem on the Plane with a Gap

Consider the case when the domain $D^{\rm int}$ shrinks to a segment of the real axis. Then the interior Dirichlet problem does not seem to make sense anymore but the exterior one is still well-posed: find a bounded harmonic function in the complex plane such that it equals a given function on the segment. The problem admits an explicit solution (see e.g. [24]). Possible variations of the data are a variation of the function on the segment and of the endpoints of the segment. The solution of this problem as well as its integrable structure may be obtained from the formulas for a smooth domain as a result of a singular limit when a smooth domain shrinks to the segment. The tau-function obtained in this way is a partition function of the Hermitian one-matrix model for the one-cut solution in the planar large $N$ limit [25].
6.1. Shrinking the domain: The limiting procedure. Let us consider a family of curves $\gamma(\varepsilon)$ obtained from a given curve $\gamma$ by rescaling of the $y$-axis as $y \to \varepsilon y$. (If the curve $\gamma$ is given by an equation $P(x, y) = 0$, then $\gamma(\varepsilon)$ is given by $P(x, y/\varepsilon) = 0$.) We are interested in the limit $\varepsilon \to 0$, $\gamma(0)$ being a segment of the real axis. We denote the endpoints by $\alpha$, $\beta$. Let $\Delta y(x)$ be the thickness of the domain bounded by the curve $\gamma(\varepsilon)$ at the point $x$ (see Fig. 2).
Fig. 2. A thin domain stretched along the real axis with thickness $\Delta y(x)$ shrinks into a segment with density $\rho(x)$
We introduce the function

$$\rho(x) = \lim_{\varepsilon \to 0}\frac{\Delta y(x)}{\varepsilon}.$$

It is easy to see that in the case of general position this function can be represented as

$$\rho(x) = \sqrt{(x - \alpha)(\beta - x)}\; M(x), \tag{6.1}$$

where $M(x)$ is a smooth function regular at the edges. So, instead of the space of contours $\gamma$ we have the space of real positive functions $\rho(x)$ of the form (6.1) with a finite support $[\alpha, \beta]$ (the endpoints of which are not fixed but may vary).

Let us use the first equation in (5.4) to define times as coefficients of the expansion of the potential $\Phi(x) = \Phi^{\rm ext}(x)$ generated by the thin domain with a uniform charge density (and with a point-like charge at the origin). In the leading order in $\varepsilon$ the rescaled potential is

$$\varphi(x) \equiv \lim_{\varepsilon \to 0}\frac{\Phi(x)}{\varepsilon} = -\frac{2}{\pi}\int_\alpha^\beta dx'\,\rho(x')\,\log\left|\frac1x - \frac1{x'}\right| = T_0 \log x^2 + \sum_{k \ge 1} T_k x^k. \tag{6.2}$$

Comparing this with (5.4), we get

$$\begin{aligned}
T_0 &= \lim_{\varepsilon \to 0}\varepsilon^{-1} t_0, \qquad T_k = \lim_{\varepsilon \to 0}\varepsilon^{-1}\bigl(t_k^{\rm ext} + \bar t_k^{\rm ext}\bigr), \quad k \ge 1,\ k \neq 2,\\
T_2 &= \lim_{\varepsilon \to 0}\varepsilon^{-1}\bigl(t_2^{\rm ext} + \bar t_2^{\rm ext} - 1\bigr).
\end{aligned} \tag{6.3}$$

Note that $T_0 = \frac{1}{\pi}\int_\alpha^\beta \rho(x)\,dx$, but similar integral representations for other times, $T_k = \frac{2}{\pi k}\int_\alpha^\beta \rho(x)\,x^{-k}\,dx$, which formally follow from (6.2), are ill-defined. On the other hand,
harmonic moments of the interior behave, in the scaling limit, as $t_k^{\rm int} = \varepsilon k^{-1}\mu_k + O(\varepsilon^2)$, where $\mu_k$ are well-defined moments of the function $\rho$ on the segment:

$$\mu_k = \frac{1}{\pi}\int_\alpha^\beta \rho(x)\,x^k\,dx. \tag{6.4}$$

Using the integral formula (5.5), it is now straightforward to find the scaling limit of the tau-function $F^{\rm ext}$ for the exterior problem. Taking into account (6.3), we introduce the function $F^{\rm cut}$ as follows:

$$F^{\rm ext}\Bigl(\varepsilon t_0;\ \varepsilon t_1,\ \tfrac12 + \varepsilon t_2,\ \varepsilon t_3, \dots;\ \varepsilon \bar t_1,\ \tfrac12 + \varepsilon \bar t_2,\ \varepsilon \bar t_3, \dots\Bigr) = \varepsilon^2 F^{\rm cut}(T_0; T_1, T_2, T_3, \dots) + O(\varepsilon^3), \tag{6.5}$$

where $T_0 = t_0$, $T_k = t_k + \bar t_k$. Since the second order derivatives of $F^{\rm ext}$ are invariant under the rescaling (6.5), the function $F^{\rm cut}$ obeys the Hirota equations (4.14)–(4.16) (as $F^{\rm ext}$ does) where one has to set $\partial_{t_k} = \partial_{\bar t_k} = \partial_{T_k}$. The latter means that $F^{\rm cut}$ is a solution of the reduced dToda hierarchy (see e.g. [12]). This sort of reduction is usually referred to as the dispersionless Toda chain. We conjecture that other types of reduction correspond, in the same way, to shrinking of $D^{\rm int}$ to slit domains of a more complicated form.

An integral representation for $F^{\rm cut}$ (obtained as a limit of (5.5)) reads:

$$F^{\rm cut} = -\frac{1}{\pi^2}\int_\alpha^\beta\!\!\int_\alpha^\beta \rho(x_1)\rho(x_2)\,\log|x_1^{-1} - x_2^{-1}|\,dx_1\,dx_2.$$

Represent this as $F^{\rm cut} = \frac{1}{2\pi}\int_\alpha^\beta \rho(x)\varphi(x)\,dx = \frac12\sum_{k \ge 0}\mu_k T_k$, where $\mu_k$ are defined in (6.4) and

$$\mu_0 = \frac{1}{\pi}\int_\alpha^\beta \rho(x)\,\log x^2\,dx.$$

It is clear from the limit of (5.9) that $\mu_k = \partial_{T_k} F^{\rm cut}$ for $k \ge 0$. Therefore, we obtain the relation

$$2F^{\rm cut} = \sum_{k \ge 0} T_k\,\partial_{T_k} F^{\rm cut}, \tag{6.6}$$

which means that $F^{\rm cut}$ is a homogeneous function of degree 2. This formula also means that the tau-function $F^{\rm cut}$ is “self-dual” under the Legendre transform with respect to $T_0, T_1, T_2, \dots$. However, the analog of the Legendre transform (5.10) we discussed in Sect. 5.3 does not include $T_0$, so the analog of the function $F^{\rm int}$ is the function

$$E^{\rm cut} = \sum_{k \ge 1} T_k\mu_k - F^{\rm cut} = F^{\rm cut} - T_0\frac{\partial F^{\rm cut}}{\partial T_0} = -\frac{1}{\pi^2}\int_\alpha^\beta\!\!\int_\alpha^\beta \rho(x_1)\rho(x_2)\,\log|x_1 - x_2|\,dx_1\,dx_2,$$

which is the electrostatic energy of the segment with the charge density $\rho$ on it, regarded as a function of the variables $T_0, \mu_1, \mu_2, \dots$. Properties of this function and its possible relation to the Hamburger 1D moment problem are to be further investigated.
6.2. Conformal maps. The conformal map $f : \mathbb{C} \setminus [\alpha, \beta] \to E$ from the plane with the gap onto the exterior $E$ of the unit disk is given by the explicit formula

$$f(z) = \frac{2z - \alpha - \beta + 2\sqrt{(z - \alpha)(z - \beta)}}{\beta - \alpha}.$$

All formulas which connect conformal maps and the tau-function remain true in this case, too. In particular, expanding $f(z)$ as $z \to \infty$, $f(z) = \frac{4z}{\beta - \alpha} - \frac{2(\alpha + \beta)}{\beta - \alpha} + O(z^{-1})$, we read from (4.9) formulas for the endpoints of the cut:

$$\beta - \alpha = 4\exp\Bigl(\frac12\,\partial^2_{T_0} F^{\rm cut}\Bigr), \qquad \beta + \alpha = 2\,\partial^2_{T_0 T_1} F^{\rm cut}.$$

The Green function is expressed through $f(z)$ by the same formula (1.2). Set $w = f(z)$, then the inverse map is

$$z(w) = \frac{\beta - \alpha}{4}\,(w + w^{-1}) + \frac{\beta + \alpha}{2}.$$

Clearly, $z(w) = \bar z(w^{-1})$, so the constraint on the Lax functions is now $\tilde L(w) = L(w)$, which signifies the reduction to the dispersionless Toda chain.
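The explicit map and its inverse can be checked numerically. The sketch below is only an illustration: the endpoints $\alpha = -1$, $\beta = 3$ are arbitrary sample values, and the square root is written as a product of two principal square roots, an implementation choice made here so that the branch cut lies exactly on $[\alpha, \beta]$ and the root behaves like $z$ at infinity.

```python
import cmath

def f(z: complex, alpha: float, beta: float) -> complex:
    """Map of the plane with the gap [alpha, beta] onto the exterior of the unit disk."""
    # Product of principal roots: the cut is [alpha, beta] and root ~ z as z -> infinity.
    root = cmath.sqrt(z - alpha) * cmath.sqrt(z - beta)
    return (2 * z - alpha - beta + 2 * root) / (beta - alpha)

def z_of_w(w: complex, alpha: float, beta: float) -> complex:
    """Inverse map z(w) = (beta - alpha)/4 * (w + 1/w) + (beta + alpha)/2."""
    return (beta - alpha) / 4 * (w + 1 / w) + (beta + alpha) / 2

alpha, beta = -1.0, 3.0  # sample endpoints (assumption for the example)

# Round trip f(z(w)) = w for points outside the unit disk.
samples = [2.0 + 0j, 1.5 * cmath.exp(2j), 3.0 * cmath.exp(-1j), -1.2 + 0j]
round_trip_error = max(abs(f(z_of_w(w, alpha, beta), alpha, beta) - w) for w in samples)

# A point on the cut is mapped to the unit circle.
modulus_on_cut = abs(f(complex(1.0, 0.0), alpha, beta))

# Leading behaviour at infinity: f(z) = 4z/(beta-alpha) - 2(alpha+beta)/(beta-alpha) + O(1/z).
z_far = 1.0e6 + 0j
asymptotic_gap = abs(f(z_far, alpha, beta) - (4 * z_far - 2 * (alpha + beta)) / (beta - alpha))
```

The round-trip error and the deviation of `modulus_on_cut` from 1 should be at rounding level, while `asymptotic_gap` decays like $1/|z|$.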
6.3. Relation to matrix models. In analogy with variations of the data of the Dirichlet problem discussed in Sect. 3, we can consider the following problem: given a set of $T_k$, $k \ge 0$, find the endpoints $\alpha$, $\beta$ of the segment and the density $\rho(x)$ as functions of these parameters. This problem has appeared in studies of the planar $N \to \infty$ limit of the Hermitian matrix model. In this case the function $\rho(x)$ is a density of eigenvalues [25]. For completeness, we recall the basic points.

Taking the derivative of (6.2), we get the equation which connects the set of $T_k$ with $\alpha$, $\beta$ and $\rho(x)$:

$$\frac{2}{\pi}\,\mathrm{v.p.}\!\int_\alpha^\beta \frac{\rho(x')\,dx'}{x' - x} = V'(x),$$

where

$$V(x) = \varphi(x) - T_0 \log x^2 = \sum_{k \ge 1} T_k x^k. \tag{6.7}$$

Consider the function

$$W(z) = \frac{1}{\pi}\int_\alpha^\beta \frac{\rho(x)\,dx}{z - x}.$$

It is analytic in the complex plane with the cut $[\alpha, \beta]$. As $z \to \infty$, it behaves as $W(z) = T_0 z^{-1} + O(z^{-2})$. On the cut, its boundary values are $W(x \pm i0) = -\frac12 V'(x) \mp i\rho(x)$. The function $W(z)$ is uniquely defined by these analytic properties. So, to find $\rho$ one should find the holomorphic function from its mean value on the cut. The result is given by the explicit formula

$$W(z) = \frac{\sqrt{(z - \alpha)(z - \beta)}}{4\pi i}\oint \frac{V'(\zeta)\,d\zeta}{(\zeta - z)\sqrt{(\zeta - \alpha)(\zeta - \beta)}}, \tag{6.8}$$
where the contour encircles the cut but not the point $z$. The endpoints of the cut are fixed by comparing the leading terms of (6.8) with the required asymptotics of $W(z)$ as $z \to \infty$. This leads to the hodograph-like formulas

$$\frac{1}{2\pi i}\oint \frac{V'(z)\,dz}{\sqrt{(z - \alpha)(z - \beta)}} = 0, \qquad \frac{1}{2\pi i}\oint \frac{z\,V'(z)\,dz}{\sqrt{(z - \alpha)(z - \beta)}} = -2T_0,$$

which implicitly determine $\alpha$, $\beta$ as functions of $T_k$.

It follows from the above that $F^{\rm cut}$ coincides with the one-cut free energy of the Hermitian one-matrix model with potential (6.7) in the planar large $N$ limit. (As a matter of fact, it was shown in [26] that the partition function of the matrix model at finite $N$ is the tau-function of the Toda chain hierarchy with dispersion.) Combining (6.6) with the $\varepsilon \to 0$ limit of the second relation in (5.11), we obtain

$$\sum_{k \ge 1} k\,T_k\,\partial_{T_k} F^{\rm cut} + T_0^2 = 0.$$

This identity is known as the Virasoro $L_0$-constraint on the dispersionless tau-function $F^{\rm cut}$ [26, 12]. Other Virasoro constraints can be obtained in the $\varepsilon \to 0$ limit from the $W_{1+\infty}$-constraints on the tau-function $F^{\rm ext}$. We do not discuss them here.

6.4. An example: Gaussian matrix model. If only the first three variables are nonzero (i.e., $T_0, T_1, T_2 \neq 0$), the function $F^{\rm cut}$ can be found explicitly. Consider a family of ellipses with axes $l$ and $\varepsilon s$ centered at some point $x_0$ on the real axis:

$$\frac{(x - x_0)^2}{l^2} + \frac{y^2}{\varepsilon^2 s^2} = \frac14.$$

The harmonic moments, as $\varepsilon \to 0$, are (see Appendix in ref. [3]): $t_0 = \frac14 \varepsilon s l$, $t_1^{\rm ext} = 2\varepsilon x_0 s l^{-1} + O(\varepsilon^2)$, $t_2^{\rm ext} = \frac12 - \varepsilon s l^{-1} + O(\varepsilon^2)$, and all other moments of the complement to the ellipse vanish. From (6.3) we read the values of the rescaled variables:

$$T_0 = \frac14(\beta - \alpha)\,s, \qquad T_1 = 2\,\frac{\beta + \alpha}{\beta - \alpha}\,s, \qquad T_2 = -\frac{2}{\beta - \alpha}\,s,$$

and all the rest are zero. Here we have expressed the times through the endpoints $\alpha = x_0 - \frac12 l$, $\beta = x_0 + \frac12 l$ of the segment and the extra parameter $s$. The density function is $\rho(x) = -T_2\sqrt{(\beta - x)(x - \alpha)}$. (Note that $T_2 < 0$.) Using the explicit form of the tau-function for an ellipse [3, 8],

$$F^{\rm ext}_{\rm ellipse} = \frac12 t_0^2 \log\frac{t_0}{1 - 4t_2\bar t_2} - \frac34 t_0^2 + t_0\,\frac{t_1\bar t_1 + t_1^2\bar t_2 + \bar t_1^2 t_2}{1 - 4t_2\bar t_2},$$

and the scaling procedure (6.5), we find

$$F^{\rm cut}(T_0, T_1, T_2, 0, 0, \dots) = \frac12 T_0^2 \log\frac{T_0}{-2T_2} - \frac34 T_0^2 - \frac{T_0 T_1^2}{4T_2}.$$

This is the expression for the free energy of the Gaussian matrix model in the planar large $N$ limit [25]. The same result can be obtained from the integral formula for $F^{\rm cut}$.
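The Gaussian free energy gives a convenient test case for the general relations of Sects. 6.1–6.3. The sketch below is a numerical illustration only: it uses central finite differences (step sizes chosen ad hoc) and arbitrary sample values of $\alpha$, $\beta$, $s$ to check the homogeneity relation (6.6), the $L_0$-constraint, and the endpoint formulas of Sect. 6.2.

```python
import math

def f_cut(T0: float, T1: float, T2: float) -> float:
    """Gaussian F^cut(T0, T1, T2); requires T0 > 0 and T2 < 0."""
    return (0.5 * T0**2 * math.log(T0 / (-2.0 * T2))
            - 0.75 * T0**2
            - T0 * T1**2 / (4.0 * T2))

def d1(func, point, i, h=1e-5):
    """Central finite-difference first derivative with respect to argument i."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def d2(func, point, i, j, h=1e-4):
    """Second derivative as a nested central difference."""
    return d1(lambda *p: d1(func, p, j, h), point, i, h)

# Sample data (assumptions for the example): endpoints and the parameter s.
alpha, beta, s = -1.0, 3.0, 0.7
T = (0.25 * (beta - alpha) * s,
     2.0 * (beta + alpha) * s / (beta - alpha),
     -2.0 * s / (beta - alpha))

grad = [d1(f_cut, T, k) for k in range(3)]

# (6.6): 2 F = sum_{k>=0} T_k dF/dT_k
euler_gap = sum(T[k] * grad[k] for k in range(3)) - 2.0 * f_cut(*T)

# L_0 constraint: sum_{k>=1} k T_k dF/dT_k + T_0^2 = 0
virasoro_gap = 1 * T[1] * grad[1] + 2 * T[2] * grad[2] + T[0] ** 2

# Endpoint formulas: beta - alpha = 4 exp(F_00 / 2), beta + alpha = 2 F_01
width = 4.0 * math.exp(0.5 * d2(f_cut, T, 0, 0))
center_sum = 2.0 * d2(f_cut, T, 0, 1)
```

With these sample values `width` and `center_sum` should reproduce $\beta - \alpha = 4$ and $\beta + \alpha = 2$ up to finite-difference error.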
Acknowledgements. We acknowledge useful discussions with A. Boyarsky, L. Chekhov, B. Dubrovin, P. Di Francesco, M. Mineev-Weinstein, V. Kazakov, A. A. Kirillov, I. Kostov, A. Polyakov, O. Ruchayskiy, and especially with I. Krichever and L. Takhtajan. P.W. also acknowledges a discussion with L. Takhtajan, who suggested using the Hadamard formula to prove the relation between the tau-function and the Dirichlet Green function. The work of A.M. and A.Z. was partially supported by CRDF grant RP1-2102 and RFBR grant No. 00-02-16477; P.W. and A.Z. have been supported by grants NSF DMR 9971332 and MRSEC NSF DMR 9808595. A.M. was partially supported by INTAS grant 97-0103 and grant for support of scientific schools No. 00-15-96566. A.Z. was partially supported by grant INTAS-99-0590 and grant for support of scientific schools No. 00-15-96557.
References

1. Hurwitz, A. and Courant, R.: Vorlesungen über allgemeine Funktionentheorie und elliptische Funktionen. Herausgegeben und ergänzt durch einen Abschnitt über geometrische Funktionentheorie. Berlin–Heidelberg–New York: Springer-Verlag, 1964 (Russian translation, adapted by M.A. Evgrafov: Theory of functions, Moscow: Nauka, 1968)
2. Mineev-Weinstein, M., Wiegmann, P.B. and Zabrodin, A.: Phys. Rev. Lett. 84, 5106 (2000), e-print archive: nlin.SI/0001007
3. Wiegmann, P.B. and Zabrodin, A.: Commun. Math. Phys. 213, 523 (2000), e-print archive: hep-th/9909147
4. Hanany, A., Oz, Y. and Plesser, R.: Nucl. Phys. B425, 150–172 (1994); Takasaki, K.: Commun. Math. Phys. 170, 101–116 (1995); Eguchi, T. and Kanno, H.: Phys. Lett. 331B, 330 (1994)
5. Daul, J.M., Kazakov, V.A. and Kostov, I.K.: Nucl. Phys. B409, 311–338 (1993); Bonora, L. and Xiong, C.S.: Phys. Lett. B347, 41–48 (1995)
6. Gibbons, J. and Tsarev, S.P.: Phys. Lett. 211A, 19–24 (1996); ibid 258A, 263 (1999)
7. Hadamard, J.: Mém. présentés par divers savants à l’Acad. sci., 33 (1908)
8. Kostov, I.K., Krichever, I.M., Mineev-Weinstein, M., Wiegmann, P.B. and Zabrodin, A.: τ-function for analytic curves. In: Random matrices and their applications, MSRI publications, Vol. 40, Cambridge: Academic Press, 2001, e-print archive: hep-th/0005259
9. Krichever, I.: Unpublished
10. Takhtajan, L.: Lett. Math. Phys. 56, 181–228 (2001), e-print archive: math.QA/0102164
11. Hille, E.: Analytic function theory, V. II: Ginn and Company, 1962
12. Krichever, I.M.: Funct. Anal Appl. 22, 200–213 (1989); Commun. Pure. Appl. Math. 47, 437 (1992), e-print archive: hep-th/9205110
13. Takebe, T.: Adv. Series in Math. Phys. 16 (1992), Proceedings of RIMS Research Project 1991, pp 923–940
14. Shiota, T.: Invent. Math. 83, 333–382 (1986)
15. Douglas, M.: Phys. Lett. B238, 176 (1990)
16. Mineev-Weinstein, M. and Zabrodin, A.: Proceedings of the Workshop NEEDS 99 (Crete, Greece, June 1999), e-print archive: solv-int/9912012
17. Kharchev, S., Marshakov, A., Mironov, A. and Morozov, A.: Mod. Phys. Lett. A8, 1047–1061 (1993), e-print archive: hep-th/9208046
18. Sato, M.: Soliton Equations and Universal Grassmann Manifold. Math. Lect. Notes Ser., Vol. 18, Sophia University, Tokyo (1984); Date, E., Jimbo, M., Kashiwara, M. and Miwa, T.: Transformation groups for soliton equations. In: Nonlinear Integrable Systems, eds. M. Jimbo and T. Miwa, Singapore: World Scientific, 1983
19. Gibbons, J. and Kodama, Y.: Proceedings of NATO ASI “Singular Limits of Dispersive Waves”, ed. N. Ercolani, London–New York: Plenum, 1994; Carroll, R. and Kodama, Y.: J. Phys. A: Math. Gen. A28, 6373 (1995)
20. Takasaki, K. and Takebe, T.: Rev. Math. Phys. 7, 743–808 (1995)
21. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Nucl. Phys. B352, 59 (1991)
22. Marshakov, A., Mironov, A. and Morozov, A.: Phys. Lett. B389, 43–52 (1996), e-print archive: hep-th/9607109
23. Boyarsky, A., Marshakov, A., Ruchayskiy, O., Wiegmann, P. and Zabrodin, A.: Phys. Lett. B515, 483–492 (2001), e-print archive: hep-th/0105260
24. Gakhov, F.: Boundary problems, Moscow: Nauka, 1977 (in Russian); Bitsadze, A.: Foundations of the theory of analytic functions of a complex variable, Moscow: Nauka, 1984 (in Russian)
25. Brézin, E., Itzykson, C., Parisi, G. and Zuber, J.-B.: Commun. Math. Phys. 59, 35 (1978)
26. Gerasimov, A., Marshakov, A., Mironov, A., Morozov, A. and Orlov, A.: Nucl. Phys. B357, 565–618 (1991)

Communicated by L. Takhtajan
Commun. Math. Phys. 227, 155 – 190 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
The Canonical Solutions of the Q-Systems and the Kirillov–Reshetikhin Conjecture

Atsuo Kuniba1, Tomoki Nakanishi2, Zengo Tsuboi3

1 Institute of Physics, University of Tokyo, Tokyo 153-8902, Japan. E-mail: [email protected]
2 Graduate School of Mathematics, Nagoya University, Nagoya 464-8602, Japan. E-mail: [email protected]
3 Graduate School of Mathematical Sciences, University of Tokyo, Tokyo 153-8914, Japan. E-mail: [email protected]

Received: 2 August 2001 / Accepted: 27 December 2001
Abstract: We study a class of systems of functional equations closely related to various kinds of integrable statistical and quantum mechanical models. We call them the finite and infinite Q-systems according to the number of functions and equations. The finite Q-systems appear as the thermal equilibrium conditions (the Sutherland–Wu equation) for certain statistical mechanical systems. Some infinite Q-systems appear as the relations of the normalized characters of the KR modules of the Yangians and the quantum affine algebras. We give two types of power series formulae for the unique solution (resp. the unique canonical solution) for a finite (resp. infinite) Q-system. As an application, we reformulate the Kirillov–Reshetikhin conjecture on the multiplicities formula of the KR modules in terms of the canonical solutions of Q-systems.
1. Introduction
In the series of works [K1, K2, KR], Kirillov and Reshetikhin studied the formal counting problem (the formal completeness) of the Bethe vectors of the XXX-type integrable spin chains, and they empirically reached a remarkable conjectural formula on the characters of a certain family of finite-dimensional modules of the Yangian $Y(\mathfrak{g})$. Let us formulate it in the following way.

Conjecture 1.1. Let $\mathfrak{g}$ be a complex simple Lie algebra of rank $n$. We set $y = (y_a)_{a=1}^n$, $y_a = e^{-\alpha_a}$ for the simple roots $\alpha_a$ of $\mathfrak{g}$. Let $Q_m^{(a)}(y)$ be the normalized $\mathfrak{g}$-character of the KR module $W_m^{(a)}(u)$ ($a = 1, \dots, n$; $m = 1, 2, \dots$; $u \in \mathbb{C}$) of the Yangian $Y(\mathfrak{g})$; and
$Q^\nu(y) := \prod_{(a,m)} \bigl(Q_m^{(a)}(y)\bigr)^{\nu_m^{(a)}}$. Then, the formula

$$Q^\nu(y)\prod_{\alpha \in \Delta_+}(1 - e^{-\alpha}) = \sum_{N = (N_m^{(a)})}\ \prod_{(a,m)} \binom{P_m^{(a)}(\nu, N) + N_m^{(a)}}{N_m^{(a)}}\,(y_a)^{m N_m^{(a)}}, \tag{1.1}$$

$$P_m^{(a)}(\nu, N) = \sum_{k=1}^{\infty} \nu_k^{(a)} \min(k, m) - \sum_{(b,k)} N_k^{(b)}\, d_a A_{ab} \min\Bigl(\frac{m}{d_b}, \frac{k}{d_a}\Bigr) \tag{1.2}$$

holds. Here, $A = (A_{ab})$ is the Cartan matrix of $\mathfrak{g}$, $d_a$ are coprime positive integers such that $(d_a A_{ab})$ is symmetric, $\Delta_+$ is the set of all the positive roots of $\mathfrak{g}$, and $\binom{a}{b} = \Gamma(a + 1)/\bigl(\Gamma(a - b + 1)\Gamma(b + 1)\bigr)$.

Remark 1.2. Due to the Weyl character formula, the series in the RHS of (1.1) should be a polynomial of $y$, and its coefficients are identified with the multiplicities of the $\mathfrak{g}$-irreducible components of the tensor product $\bigotimes_{(a,m)} W_m^{(a)}(u_m^{(a)})^{\otimes \nu_m^{(a)}}$, where $u_m^{(a)}$ are arbitrary.

Remark 1.3. There are actually two versions of Conjecture 1.1. The above one is the version in [HKOTY] which followed [K1, K2]. In the version in [KR], the binomial coefficients $\binom{a}{b}$ are set to be 0 if $a < b$; furthermore, the equality is claimed, not for the entire series in both sides of (1.1), but only for their coefficients of the powers $y^M$ “in the fundamental Weyl chamber”; namely, $M = (M_a)_{a=1}^n$ satisfies

$$\sum_{(a,m)} \nu_m^{(a)}\, m\,\Lambda_a - \sum_{a=1}^{n} M_a \alpha_a \in P_+, \tag{1.3}$$

where $\Lambda_a$ are the fundamental weights and $P_+$ is the set of the dominant integral weights of $\mathfrak{g}$. So far, it is not proved that the two conjectures are equivalent. Both conjectures are naturally translated into ones for the untwisted quantum affine algebras, which are extendable to the twisted quantum affine algebras [HKOTT]. In this paper, we refer to all these conjectures as the Kirillov–Reshetikhin conjecture. More comments and the current status of the conjecture will be given in Sect. 5.7.
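To make the structure of (1.1) and (1.2) concrete, the following sketch evaluates both sides in the simplest case $\mathfrak{g} = sl_2$ (rank 1, $A = (2)$, $d_1 = 1$). It assumes the well-known normalized characters $Q_m(y) = 1 + y + \dots + y^m$ for this case and uses the binomial coefficient in the [HKOTY] sense, i.e. as a polynomial in its upper argument; the function names are hypothetical.

```python
from fractions import Fraction

def binom(a: int, b: int) -> Fraction:
    """Generalized binomial coefficient a(a-1)...(a-b+1)/b! for integer a and b >= 0."""
    result = Fraction(1)
    for j in range(1, b + 1):
        result *= Fraction(a - b + j, j)
    return result

def multiplicity_vectors(total, largest=None):
    """Yield dicts {m: N_m} with N_m >= 1 and sum of m * N_m equal to total."""
    if largest is None:
        largest = total
    if total == 0:
        yield {}
        return
    for m in range(min(largest, total), 0, -1):
        for count in range(1, total // m + 1):
            for rest in multiplicity_vectors(total - m * count, m - 1):
                yield {m: count, **rest}

def rhs_coefficient(nu: dict, order: int) -> Fraction:
    """Coefficient of y^order on the right-hand side of (1.1) for sl_2."""
    total = Fraction(0)
    for N in multiplicity_vectors(order):
        term = Fraction(1)
        for m, N_m in N.items():
            # (1.2) with d_a = 1 and A_ab = 2
            P = (sum(v * min(k, m) for k, v in nu.items())
                 - 2 * sum(c * min(m, k) for k, c in N.items()))
            term *= binom(P + N_m, N_m)
        total += term
    return total

def lhs_coefficients(nu: dict, max_order: int) -> list:
    """Coefficients of (1 - y) * prod_m (1 + y + ... + y^m)^{nu_m} up to y^max_order."""
    poly = [1, -1]
    for m, power in nu.items():
        for _ in range(power):
            new = [0] * (len(poly) + m)
            for i, c in enumerate(poly):
                for j in range(m + 1):
                    new[i + j] += c
            poly = new
    poly = poly + [0] * (max_order + 1)
    return poly[: max_order + 1]

nu = {1: 2}  # two copies of the two-dimensional module
lhs = lhs_coefficients(nu, 5)
rhs = [rhs_coefficient(nu, M) for M in range(6)]
```

Here the left-hand side is $(1 - y)(1 + y)^2 = 1 + y - y^2 - y^3$, and the right-hand side reproduces the same coefficients through order $y^5$; note that individual terms of the sum are not positive, which is the feature distinguishing this version from the one in [KR].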
In [KR, K3], it was claimed that the $Q_m^{(a)}(y)$’s satisfy a system of equations

$$\bigl(Q_m^{(a)}(y)\bigr)^2 = Q_{m-1}^{(a)}(y)\,Q_{m+1}^{(a)}(y) + (y_a)^m \bigl(Q_m^{(a)}(y)\bigr)^2 \prod_{(b,k)} \bigl(Q_k^{(b)}(y)\bigr)^{G_{am,bk}}. \tag{1.4}$$

Here, $Q_0^{(a)}(y) = 1$, and $G_{am,bk}$ are the integers defined as

$$G_{am,bk} = \begin{cases}
-A_{ba}\,(\delta_{m,2k-1} + 2\delta_{m,2k} + \delta_{m,2k+1}) & d_b/d_a = 2,\\
-A_{ba}\,(\delta_{m,3k-2} + 2\delta_{m,3k-1} + 3\delta_{m,3k} + 2\delta_{m,3k+1} + \delta_{m,3k+2}) & d_b/d_a = 3,\\
-A_{ab}\,\delta_{d_a m,\, d_b k} & \text{otherwise}.
\end{cases} \tag{1.5}$$

See (4.22) for the original form of (1.4) in [KR, K3]. The relations (1.4) and (4.22) are often called the Q-system. The importance of the role of the Q-system to the formula
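For $\mathfrak{g} = sl_2$ one has $G_{1m,1k} = -2\delta_{m,k}$, so (1.4) reduces to $Q_m^2 = Q_{m-1}Q_{m+1} + y^m$. The short sketch below checks this with polynomial arithmetic, assuming the standard normalized characters $Q_m(y) = 1 + y + \dots + y^m$ (an assumption made for the illustration; polynomials are stored as coefficient lists).

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def q_char(m):
    """Normalized sl_2 character Q_m(y) = 1 + y + ... + y^m (Q_0 = 1)."""
    return [1] * (m + 1)

def q_system_holds(m):
    """Check Q_m^2 = Q_{m-1} Q_{m+1} + y^m, the sl_2 case of (1.4)."""
    lhs = poly_mul(q_char(m), q_char(m))
    rhs = poly_mul(q_char(m - 1), q_char(m + 1))
    rhs[m] += 1  # add the term y^m
    return lhs == rhs

checks = [q_system_holds(m) for m in range(1, 8)]
```

Every entry of `checks` is `True`, in agreement with the identity $(1 - y^{m+1})^2 - (1 - y^m)(1 - y^{m+2}) = y^m(1 - y)^2$.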
(1.1) was recognized in [K1, K2, KR], and more explicitly exhibited in [HKOTY, KN2]. In this paper we proceed one step further in this direction; we study Eq. (1.4) from a more general point of view, and give a characterization of the special power series solution in (1.1). For this purpose, we introduce finite and infinite Q-systems, where the former (resp. the latter) is a finite (resp. infinite) system of equations for a finite (resp. infinite) family of power series of the variable with finite (resp. infinite) components. Equation (1.4), which is an infinite system of equations with the variable with finite components, is regarded as an infinite Q-system with the specialization of the variable (a specialized Q-system). We show that every finite Q-system has a unique solution which has the same type of power series formula as (1.1) (Theorem 2.4). In contrast, infinite Q-systems and their specializations, in general, admit more than one solution. However, every infinite Q-system, or its specialization, has a unique canonical solution (Theorems 3.7 and 4.2), whose definition is given in Definition 3.5. The formula (1.1) turns out to be exactly the power series formula for the canonical solution of (1.4) (Theorem 4.3 and Proposition 4.9). Therefore, one can rephrase Conjecture 1.1 in a more intrinsic way as follows (Conjecture 5.5): The family $(Q_m^{(a)}(y))$ of the normalized $\mathfrak{g}$-characters of the KR modules is characterized as the canonical solution of (1.4). This is the main statement of the paper.

Interestingly, the finite Q-systems also appear in other types of integrable statistical mechanical systems. Namely, they appear as the thermal equilibrium condition (the Sutherland–Wu equation) for the Calogero–Sutherland model [S], as well as the one for the ideal gas of the Haldane exclusion statistics [W]. The properties of the solution of the finite Q-systems are studied in [A, AI, IA] from the point of view of the quasi-hypergeometric functions.
We expect that the study of the Q-system and its variations and extensions will be useful for the representation theory of the quantum groups, and for the understanding of the nature of the integrable models as well.

2. Finite Q-Systems

A considerable part of the results in this section can be found in the work by Aomoto and Iguchi [A, IA]. We present here a more direct approach. More detailed remarks will be given in Sect. 2.4.

2.1. Finite Q-systems. Throughout Sect. 2, let $H$ denote a finite index set. Let $w = (w_i)_{i \in H}$ and $v = (v_i)_{i \in H}$ be complex multivariables, and let $G = (G_{ij})_{i,j \in H}$ be a given complex square matrix of size $|H|$. We consider a holomorphic map $\mathcal{D} \to \mathbb{C}^H$, $v \mapsto w(v)$ with

$$w_i(v) = v_i \prod_{j \in H} (1 - v_j)^{-G_{ij}}, \tag{2.1}$$

where $\mathcal{D}$ is some neighborhood of $v = 0$ in $\mathbb{C}^H$. The Jacobian $(\partial w/\partial v)(v)$ is 1 at $v = 0$, so that the map $w(v)$ is bijective around $v = w = 0$. Let $v(w)$ be the inverse map around $v = w = 0$. Inverting (2.1), we obtain the following functional equation for $v_i(w)$’s:

$$v_i(w) = w_i \prod_{j \in H} \bigl(1 - v_j(w)\bigr)^{G_{ij}}. \tag{2.2}$$
By introducing new functions

$$Q_i(w) = 1 - v_i(w), \tag{2.3}$$

Eq. (2.2) is written as

$$Q_i(w) + w_i \prod_{j \in H} \bigl(Q_j(w)\bigr)^{G_{ij}} = 1. \tag{2.4}$$

From now on, we regard (2.4) as a system of equations for a family $(Q_i(w))_{i \in H}$ of power series of $w = (w_i)_{i \in H}$ with the unit constant terms (i.e., the constant terms are 1). Here, for any power series $f(w)$ with the unit constant term and any complex number $\alpha$, we mean by $(f(w))^\alpha \in \mathbb{C}[[w]]$ the $\alpha$th power of $f(w)$ with the unit constant term. We can easily reverse the procedure from (2.1) to (2.4), and we have

Proposition 2.1. The power series expansion of $Q_i(w)$ in (2.3) gives the unique family $(Q_i(w))_{i \in H}$ of power series of $w$ with the unit constant terms which satisfies (2.4).

Definition 2.2. The following system of equations for a family $(Q_i(w))_{i \in H}$ of power series of $w$ with the unit constant terms is called a (finite) Q-system: For each $i \in H$,

$$\prod_{j \in H} \bigl(Q_j(w)\bigr)^{D_{ij}} + w_i \prod_{j \in H} \bigl(Q_j(w)\bigr)^{G_{ij}} = 1, \tag{2.5}$$

where $D = (D_{ij})_{i,j \in H}$ and $G = (G_{ij})_{i,j \in H}$ are arbitrary complex matrices with $\det D \neq 0$. Equation (2.4), which is the special case of (2.5) with $D = I$ ($I$: the identity matrix), is called a standard Q-system.

It is easy to see that there is a one-to-one correspondence between the solutions of the Q-system (2.5) and the solutions of the standard Q-system

$$Q'_i(w) + w_i \prod_{j \in H} \bigl(Q'_j(w)\bigr)^{G'_{ij}} = 1, \qquad G' = G D^{-1}, \tag{2.6}$$

where the correspondence is given by

$$Q'_i(w) = \prod_{j \in H} \bigl(Q_j(w)\bigr)^{D_{ij}}, \tag{2.7}$$

$$Q_i(w) = \prod_{j \in H} \bigl(Q'_j(w)\bigr)^{(D^{-1})_{ij}}. \tag{2.8}$$

Therefore, from Proposition 2.1, we immediately have

Theorem 2.3. There exists a unique solution of the Q-system (2.5), which is given by (2.8), where $(Q'_i(w))_{i \in H}$ is the unique solution of the standard Q-system (2.6).
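The reduction in Theorem 2.3 can be followed numerically. The sketch below is an illustration under stated assumptions: $|H| = 2$, real sample matrices $D$, $G$ and a small real $w$ chosen arbitrarily, and a plain fixed-point iteration of (2.2) (which converges here because $w$ is small). It solves the standard system (2.6), maps back through (2.8), and evaluates the residual of (2.5).

```python
import math

def inverse_2x2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def solve_standard(G, w, iterations=200):
    """Solve Q_i + w_i prod_j Q_j^{G_ij} = 1 by iterating v_i = w_i prod_j (1 - v_j)^{G_ij}."""
    n = len(w)
    v = [0.0] * n
    for _ in range(iterations):
        v = [w[i] * math.prod((1.0 - v[j]) ** G[i][j] for j in range(n)) for i in range(n)]
    return [1.0 - vi for vi in v]

def solve_q_system(D, G, w):
    """Solve (2.5) via the standard system (2.6) with G' = G D^{-1} and the map (2.8)."""
    D_inv = inverse_2x2(D)
    Q_std = solve_standard(matmul(G, D_inv), w)
    return [math.prod(Q_std[j] ** D_inv[i][j] for j in range(2)) for i in range(2)]

def residuals(D, G, w, Q):
    """Left-hand side minus right-hand side of (2.5) for each i."""
    return [math.prod(Q[j] ** D[i][j] for j in range(2))
            + w[i] * math.prod(Q[j] ** G[i][j] for j in range(2)) - 1.0
            for i in range(2)]

# Sample data (assumptions for the example).
D = [[2.0, 1.0], [0.0, 1.0]]
G = [[1.0, -1.0], [0.5, 2.0]]
w = [0.05, 0.03]

Q = solve_q_system(D, G, w)
max_residual = max(abs(r) for r in residuals(D, G, w, Q))
```

The residual is at rounding level, and both components of `Q` are close to 1, as expected for the solution with unit constant term.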
2.2. Power series formulae. In what follows, we use the binomial coefficient in the following sense: For $a \in \mathbb{C}$ and $b \in \mathbb{Z}_{\ge 0}$,

$$\binom{a}{b} = \frac{\Gamma(a + 1)}{\Gamma(a - b + 1)\Gamma(b + 1)}, \tag{2.9}$$

where the RHS means the limit value for the singularities. We set $\mathcal{N} := (\mathbb{Z}_{\ge 0})^H$. For $D$, $G$ in (2.5) and $\nu = (\nu_i)_{i \in H} \in \mathbb{C}^H$, we define two power series of $w$,

$$K^\nu_{D,G}(w) = \sum_{N \in \mathcal{N}} K(D, G; \nu, N)\, w^N, \qquad w^N = \prod_{i \in H} w_i^{N_i}, \tag{2.10}$$

$$R^\nu_{D,G}(w) = \sum_{N \in \mathcal{N}} R(D, G; \nu, N)\, w^N \tag{2.11}$$

with the coefficients

$$K(D, G; \nu, N) = \prod_{i \in H(N)} \binom{P_i + N_i}{N_i}, \tag{2.12}$$

$$R(D, G; \nu, N) = \det_{H(N)}\bigl(F_{ij}\bigr) \prod_{i \in H(N)} \frac{1}{N_i}\binom{P_i + N_i - 1}{N_i - 1}, \tag{2.13}$$

where we set $H(N) = \{\, i \in H \mid N_i \neq 0 \,\}$ for each $N \in \mathcal{N}$,

$$P_i = P_i(D, G; \nu, N) := -\sum_{j \in H} \nu_j (D^{-1})_{ji} - \sum_{j \in H} N_j (G D^{-1})_{ji}, \tag{2.14}$$

$$F_{ij} = F_{ij}(D, G; \nu, N) := \delta_{ij} P_j + (G D^{-1})_{ij} N_j, \tag{2.15}$$

and $\det_{H(N)}$ is a shorthand notation for $\det_{i,j \in H(N)}$. In (2.12) and (2.13), $\det_\emptyset$ and $\prod_\emptyset$ mean 1; namely, $K^\nu_{D,G}(w)$ and $R^\nu_{D,G}(w)$ are power series with the unit constant terms. It is easy to check that both series converge for $|w_i| < |\gamma_i^{\gamma_i}/(\gamma_i + 1)^{\gamma_i + 1}|$, where $\gamma_i = -(G D^{-1})_{ii}$ and $z^z = \exp(z \log z)$ with the principal branch $-\pi < \mathrm{Im}(\log z) \le \pi$ chosen.

Now we state our main results in this section.

Theorem 2.4 (Power series formulae). Let $(Q_i(w))_{i \in H}$ be the unique solution of (2.5). For $\nu \in \mathbb{C}^H$, let $Q^\nu_{D,G}(w) := \prod_{i \in H} (Q_i(w))^{\nu_i}$. Then,

$$Q^\nu_{D,G}(w) = K^\nu_{D,G}(w)/K^0_{D,G}(w), \tag{2.16}$$

$$Q^\nu_{D,G}(w) = R^\nu_{D,G}(w). \tag{2.17}$$

The power series formulae for $Q_i(w)$ are obtained as special cases of (2.16) and (2.17) by setting $\nu = (\nu_j)_{j \in H}$ as $\nu_j = \delta_{ij}$. One may recognize that the first formula (2.16) is analogous to the formula (1.1), where the denominator $K^0_{D,G}(w)$ in (2.16) corresponds to the Weyl denominator in the LHS of (1.1). As mentioned in Sect. 1, the formula (1.1) is interpreted as the formal completeness of the XXX-type Bethe vectors. In the same sense, the second formula (2.17) is analogous to the formal completeness of the XXZ-type Bethe vectors in [KN1, KN2]. See Sect. 2.4 for more remarks.
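The two formulae of Theorem 2.4 can be compared coefficient by coefficient. The following sketch restricts to a single variable ($|H| = 1$, so $D$, $G$, $\nu$ are numbers and $\det_{H(N)} F = F$), uses exact rational arithmetic, and treats the binomial coefficient as a polynomial in its upper argument; the helper names and the sample parameters are assumptions made for the illustration.

```python
from fractions import Fraction

def binom(a: Fraction, b: int) -> Fraction:
    """Binomial coefficient a(a-1)...(a-b+1)/b! for rational a and integer b >= 0."""
    result = Fraction(1)
    for j in range(1, b + 1):
        result *= Fraction(a - b + j) / j
    return result

def k_coeff(D, G, nu, N):
    """K(D, G; nu, N) of (2.12) for |H| = 1."""
    P = -nu / D - N * G / D
    return binom(P + N, N)

def r_coeff(D, G, nu, N):
    """R(D, G; nu, N) of (2.13) for |H| = 1."""
    if N == 0:
        return Fraction(1)
    P = -nu / D - N * G / D
    F = P + G / D * N
    return F * binom(P + N - 1, N - 1) / N

def series_quotient(num, den):
    """Coefficients of the power series num/den (den[0] must be nonzero)."""
    out = []
    for n in range(len(num)):
        acc = num[n] - sum(out[k] * den[n - k] for k in range(n))
        out.append(acc / den[0])
    return out

def q_power_series(D, G, nu, order):
    """Return the coefficients of K^nu/K^0 and of R^nu up to w^order."""
    D, G, nu = Fraction(D), Fraction(G), Fraction(nu)
    k_nu = [k_coeff(D, G, nu, N) for N in range(order + 1)]
    k_0 = [k_coeff(D, G, Fraction(0), N) for N in range(order + 1)]
    ratio = series_quotient(k_nu, k_0)
    direct = [r_coeff(D, G, nu, N) for N in range(order + 1)]
    return ratio, direct

# Q + w Q^2 = 1: the solution is the generating function of signed Catalan numbers.
ratio, direct = q_power_series(1, 2, 1, 5)
```

For $D = 1$, $G = 2$, $\nu = 1$ both lists equal $1, -1, 2, -5, 14, -42$, the expansion of $Q(w) = (\sqrt{1 + 4w} - 1)/(2w)$.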
Example 2.5. Let |H | = 1. Then, (2.5) is an equation for a single power series Q(w), (Q(w))D + w(Q(w))G = 1,
(2.18)
where D = 0 and G are complex numbers, and the series (2.11) reads as ν (w) = RD,G
∞ ν Γ ((ν + N G)/D)(−w)N . D Γ ((ν + N G)/D − N + 1)N!
(2.19)
N=0
Equation (2.18) and the power series formula (2.19) are well known and have a very long history since Lambert (e.g. [B, pp. 306–307]).
Example 2.6. Consider the case $G = O$ in (2.5),
\[ \prod_{j \in H} (Q_j(w))^{D_{ij}} + w_i = 1. \tag{2.20} \]
This is easily solved as
\[ Q_i(w) = \prod_{j \in H} (1 - w_j)^{(D^{-1})_{ij}}, \tag{2.21} \]
and, therefore,
\[ Q^{\nu}_{D,O}(w) = \prod_{i \in H} (1 - w_i)^{\sum_{j \in H} \nu_j (D^{-1})_{ji}} = \prod_{i \in H} (1 - w_i)^{-P_i(D,O;\nu,N)}, \tag{2.22} \]
where $N \in \mathcal{N}$ is arbitrary. Using the binomial theorem
\[ (1 - x)^{-\beta - 1} = \sum_{N=0}^{\infty} \binom{\beta + N}{N} x^N, \tag{2.23} \]
one can directly check that
\[ Q^{\nu}_{D,O}(w) = \sum_{N \in \mathcal{N}} \prod_{i \in H(N)} \binom{P_i - 1 + N_i}{N_i} w_i^{N_i} = R^{\nu}_{D,O}(w), \tag{2.24} \]
\[ Q^{\nu}_{D,O}(w) = \frac{\prod_{i \in H} (1 - w_i)^{\sum_{j \in H} \nu_j (D^{-1})_{ji} - 1}}{\prod_{i \in H} (1 - w_i)^{-1}} = \frac{K^{\nu}_{D,O}(w)}{K^{0}_{D,O}(w)}. \tag{2.25} \]
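The binomial theorem (2.23) is the only analytic input in Example 2.6. The sketch below, with hypothetical helper names, evaluates the generalized binomial coefficients exactly and checks two consequences: the integer case $\beta = 2$, and the multiplicativity $(1-x)^{-1/2}(1-x)^{-1/2} = (1-x)^{-1}$ for a non-integer exponent.

```python
from fractions import Fraction

def gbinom(a, N):
    """Generalized binomial coefficient a(a-1)...(a-N+1)/N!."""
    out = Fraction(1)
    for k in range(N):
        out = out * (Fraction(a) - k) / (k + 1)
    return out

def inverse_power_coeffs(beta, order):
    """Coefficients of (1 - x)^(-beta-1) up to x^(order-1), by (2.23)."""
    return [gbinom(Fraction(beta) + N, N) for N in range(order)]

def mul_trunc(p, q, order):
    """Product of two coefficient lists, truncated at x^(order-1)."""
    out = [Fraction(0)] * order
    for i, pi in enumerate(p[:order]):
        for j, qj in enumerate(q[:order - i]):
            out[i + j] += pi * qj
    return out
```

With $\beta = P_i - 1$ the same coefficients are the factors $\binom{P_i - 1 + N_i}{N_i}$ appearing in (2.24).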
2.3. Proof of Theorem 2.4 and basic formulae. Theorem 2.4 is regarded as a particularly nice example of the multivariable Lagrange inversion formula (e.g. [G]) where all the explicit calculations can be carried through. Here, we present the most direct calculation based on the multivariable residue formula (the Jacobi formula in [G, Theorem 3]). We first remark that
Lemma 2.7. Let $G' = G D^{-1}$. For each $\nu \in \mathbb{C}^H$, let $\nu' \in \mathbb{C}^H$ with $\nu'_i = \sum_{j \in H} \nu_j (D^{-1})_{ji}$. Then,
\[ Q^{\nu}_{D,G}(w) = Q^{\nu'}_{I,G'}(w), \tag{2.26} \]
\[ K^{\nu}_{D,G}(w) = K^{\nu'}_{I,G'}(w), \qquad R^{\nu}_{D,G}(w) = R^{\nu'}_{I,G'}(w). \tag{2.27} \]
Canonical Solutions of the Q-Systems
Proof. The equality (2.26) is due to Theorem 2.3. The ones (2.27) follow from the fact $P_i(D, G; \nu, N) = P_i(I, G'; \nu', N)$.
By Lemma 2.7, we have only to prove Theorem 2.4 for the standard case $D = I$. Recall that (Proposition 2.1) $Q^{\nu}_{I,G}(w) = \prod_{i \in H} (1 - v_i(w))^{\nu_i}$, where $v = v(w)$ is the inverse map of (2.1). Thus, Theorem 2.4 follows from
Proposition 2.8 (Basic formulae). Let $v = v(w)$ be the inverse map of (2.1). Then, the power series expansions
\[ \det_H\Bigl( \frac{w_j}{v_i} \frac{\partial v_i}{\partial w_j}(w) \Bigr) \prod_{i \in H} (1 - v_i(w))^{\nu_i - 1} = K^{\nu}_{I,G}(w), \tag{2.28} \]
\[ \prod_{i \in H} (1 - v_i(w))^{\nu_i} = R^{\nu}_{I,G}(w) \tag{2.29} \]
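hold around $w = 0$.
Proof. The first formula (2.28). We evaluate the coefficient for $w^N$ in the LHS of (2.28) as follows:
\[ \begin{aligned}
&\operatorname{Res}_{w=0} \det\Bigl( \frac{\partial v}{\partial w}(w) \Bigr) \prod_{i \in H} \Bigl\{ (1 - v_i(w))^{\nu_i - 1} (v_i(w))^{-1} (w_i)^{1 - N_i - 1} \Bigr\}\, dw \\
&\quad = \operatorname{Res}_{v=0} \prod_{i \in H} \Bigl\{ \Bigl( v_i \prod_{j \in H} (1 - v_j)^{-G_{ij}} \Bigr)^{-N_i} (1 - v_i)^{\nu_i - 1} (v_i)^{-1} \Bigr\}\, dv \\
&\quad = \operatorname{Res}_{v=0} \prod_{i \in H} \Bigl\{ (1 - v_i)^{-P_i(I,G;\nu,N) - 1} (v_i)^{-N_i - 1} \Bigr\}\, dv \\
&\quad = \prod_{i \in H} \binom{P_i(I,G;\nu,N) + N_i}{N_i} = K(I, G; \nu, N),
\end{aligned} \]
where we used (2.23) to get the last line. Thus, (2.28) is proved.
The second formula (2.29). By a simple calculation, we have
\[ \det_H\Bigl( \frac{v_j}{w_i} \frac{\partial w_i}{\partial v_j}(v) \Bigr) \prod_{i \in H} (1 - v_i) = \det_H\bigl( \delta_{ij} + (-\delta_{ij} + G_{ij}) v_i \bigr) = \sum_{J \subset H} d_J \prod_{i \in J} v_i, \tag{2.30} \]
where $d_J := \det_J(-\delta_{ij} + G_{ij})$, and the sum is taken over all the subsets $J$ of $H$. Therefore, the LHS of (2.29) is written as ($\theta(\text{true}) = 1$ and $\theta(\text{false}) = 0$)
\[ \det_H\Bigl( \frac{w_j}{v_i} \frac{\partial v_i}{\partial w_j}(w) \Bigr) \sum_{J \subset H} d_J \prod_{i \in H} \Bigl\{ (1 - v_i(w))^{\nu_i - 1} v_i(w)^{\theta(i \in J)} \Bigr\}. \tag{2.31} \]
By a similar residue calculation as above, the coefficient for $w^N$ of (2.31) is evaluated as
\[ \begin{aligned}
&\sum_{J \subset H} d_J \operatorname{Res}_{v=0} \prod_{i \in H} \Bigl\{ (1 - v_i)^{-P_i(I,G;\nu,N) - 1} (v_i)^{-N_i + \theta(i \in J) - 1} \Bigr\}\, dv \\
&\quad = \sum_{J \subset H(N)} d_J \prod_{i \in H(N)} \binom{P_i(I,G;\nu,N) + N_i - \theta(i \in J)}{N_i - \theta(i \in J)} \\
&\quad = \sum_{J \subset H(N)} d_J \prod_{i \in J} N_i \prod_{i \in H(N) \setminus J} (P_i + N_i) \prod_{i \in H(N)} \frac{1}{N_i} \binom{P_i + N_i - 1}{N_i - 1} \\
&\quad = \det_{H(N)}\bigl( \delta_{ij} (P_j + N_j) + (-\delta_{ij} + G_{ij}) N_j \bigr) \prod_{i \in H(N)} \frac{1}{N_i} \binom{P_i + N_i - 1}{N_i - 1} \\
&\quad = R(I, G; \nu, N).
\end{aligned} \]
Thus, (2.29) is proved.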
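The last equality in (2.30), and the step that collects the sum over $J \subset H(N)$ into a single determinant, both rest on the principal-minor expansion $\det(\delta_{ij} + M_{ij} v_i) = \sum_{J} \det_J(M) \prod_{i \in J} v_i$. The following sketch checks this identity numerically for a small matrix; the function names and the sample data are illustrative assumptions, and the determinant routine is a naive Laplace expansion suitable only for small sizes.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 0:
        return Fraction(1)  # the empty determinant is 1, as for det over the empty set
    total = Fraction(0)
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def lhs(M, v):
    """det_H(delta_ij + M_ij v_i)."""
    n = len(M)
    return det([[Fraction(1 if i == j else 0) + M[i][j] * v[i] for j in range(n)]
                for i in range(n)])

def rhs(M, v):
    """Sum over subsets J of det_J(M) * prod_{i in J} v_i."""
    n = len(M)
    total = Fraction(0)
    for size in range(n + 1):
        for J in combinations(range(n), size):
            d_J = det([[Fraction(M[i][j]) for j in J] for i in J])
            prod = Fraction(1)
            for i in J:
                prod *= v[i]
            total += d_J * prod
    return total
```

In (2.30) the matrix is $M_{ij} = -\delta_{ij} + G_{ij}$, so that $\det_J(M) = d_J$.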
This completes the proof of Theorem 2.4.
Example 2.9. We say that the map $w(v)$ in (2.1) is lower-triangular if the matrix $G_{ij}$ is strictly lower-triangular with respect to a certain total order $\prec$ in $H$ (i.e., $G_{ij} = 0$ for $i \preceq j$). Let $w(v)$ be a lower-triangular map. Then,
\[ \det_H\Bigl( \frac{v_j}{w_i} \frac{\partial w_i}{\partial v_j}(v) \Bigr) = \det_H\Bigl( \delta_{ij} + \frac{G_{ij} v_j}{1 - v_j} \Bigr) = 1. \tag{2.32} \]
Thus, the formula (2.28) is simplified as
\[ \prod_{i \in H} (1 - v_i(w))^{\nu_i - 1} = K^{\nu}_{I,G}(w). \tag{2.33} \]
This type of formulae has appeared in [K1, K2, HKOTY]. Let us isolate the case $\nu = 0$ from (2.28), together with the formula (2.30), for later use:
Corollary 2.10 (Denominator formulae).
\[ K^{0}_{I,G}(w) = \det_H\Bigl( \frac{w_j}{v_i} \frac{\partial v_i}{\partial w_j}(w) \Bigr) \prod_{i \in H} (1 - v_i(w))^{-1}, \tag{2.34} \]
\[ K^{0}_{I,G}(w) = \Bigl( \det_H\bigl( \delta_{ij} (1 - v_i(w)) + G_{ij} v_i(w) \bigr) \Bigr)^{-1}. \tag{2.35} \]
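From (2.35) and the first formula of Theorem 2.4, we obtain
Corollary 2.11.
\[ Q^{\nu}_{I,G}(w) = \sum_{J \subset H} g_J\, K^{\nu + \delta_J}_{I,G}(w), \tag{2.36} \]
\[ g_J := \sum_{J' \subset H,\ |J'| = |J|} \operatorname{sgn}\begin{pmatrix} J & \bar{J} \\ J' & \bar{J}' \end{pmatrix} \det_{i \in J,\, j \in J'} (\delta_{ij} - G_{ij}) \det_{i \in \bar{J},\, j \in \bar{J}'} G_{ij}, \tag{2.37} \]
where $\delta_J = (\theta_i)_{i \in H}$, $\theta_i = 1$ if $i \in J$ and 0 otherwise, and $\bar{J} = H \setminus J$.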
From Corollary 2.11, one can easily reproduce the second formula of Theorem 2.4. We leave it as an exercise for the reader.
2.4. Remarks on related works.
i) The formal completeness of the Bethe vectors. In [K1, K2, HKOTY, KN1, KN2, KNT], the formal completeness of the XXX/XXZ-type Bethe vectors is studied. In the course of their analysis, several power series formulae in this section appeared in specialized/implicit forms. For example, Lemma 1 in [K1] is a special case of (2.33), Theorem 4.7 in [KN2] is a special case of Proposition 2.8, etc. From the current point of view, however, the relation between these power series formulae and the underlying finite Q-systems was not clearly recognized therein. As a result, these power series formulae and the infinite Q-systems were somewhat abruptly combined in the limiting procedure to obtain the power series formula for the infinite Q-systems. We are going to straighten out this logical entanglement, and make the logical structure more transparent by Theorem 2.4 and the forthcoming Theorems 3.10, 4.3, Proposition 4.9, and Conjecture 5.5.
ii) The ideal gas with Haldane statistics and the Sutherland–Wu equation. The series $K^{\nu}_{D,G}(w)$ has an interpretation of the grand partition function of the ideal gas with the Haldane exclusion statistics [W]. The finite Q-system appeared in [W] as the thermal equilibrium condition for the distribution functions of the same system. See also [IA] for another interpretation. The one variable case (2.18) also appeared in [S] as the thermal equilibrium condition for the distribution function of the Calogero–Sutherland model. As an application of our second formula in Theorem 2.4, we can quickly reproduce the “cluster expansion formula” in [I, Eq. (129)], which was originally calculated by the Lagrange inversion formula, as follows:
log Qi (w) = =
∂ ν RI,G (w) ν=0 ∂νi
det Fj k (I, G; 0, N )
H (N) N∈N j,k =i
j ∈H (N)
1 Pj (I, G; 0, N ) + Nj − 1 N w , Nj Nj − 1
(2.38)
where $\{Q_i(w)\}_{i \in H}$ is the solution of (2.4). The Sutherland–Wu equation also plays an important role for the conformal field theory spectra. (See [BS] and the references therein.)
iii) Quasi-hypergeometric functions. The series $K^{\nu}_{D,G}(w)$ is a special example of the quasi-hypergeometric functions by Aomoto and Iguchi [AI]; when $G_{ij}$ are all integers, it reduces to a general hypergeometric function of Barnes–Mellin type. A quasi-hypergeometric function satisfies a system of fractional differential equations and a system of difference-differential equations [AI]. It also admits an integral representation [A]. In particular, the integral representation for $K^{\nu}_{I,G}(w)$ reduces to a simple form ([A, Eq. (2.30)], [IA, Eq. (89)]); in our notation,
\[ K^{\nu}_{I,G}(w) = \frac{1}{(2\pi\sqrt{-1})^{|H|}} \int \prod_{i \in H} \Bigl\{ t_i^{\nu_i - 1} f_i(w, t)^{-1} \Bigr\}\, dt, \tag{2.39} \]
\[ f_i(w, t) := t_i - 1 + w_i \prod_{j \in H} t_j^{G_{ij}}, \tag{2.40} \]
where the integration is along a circle around $t_i = 1$ starting from $t_i = 0$ for each $t_i$. We see that $f_i(w, t) = 0$ is the standard Q-system (2.4). The integral (2.39) is easily evaluated by the Cauchy theorem as [A, Eq. (2.32)]
\[ K^{\nu}_{I,G}(w) = Q^{\nu}_{I,G}(w) \big/ \det_H\bigl( \delta_{ij} Q_i(w) + G_{ij} (1 - Q_j(w)) \bigr), \tag{2.41} \]
where $\{Q_i(w)\}_{i \in H}$ is the solution of (2.4). The formula (2.41) reproduces a version of the Lagrange inversion formula (the Good formula [G, Theorem 2]), and it is equivalent to the formulae (2.16), (2.30), and (2.34).
3. Infinite Q-Systems
3.1. Infinite Q-systems. Throughout Sect. 3, let $H$ be a countable index set. We fix an increasing sequence of finite subsets of $H$, $H_1 \subset H_2 \subset \cdots \subset H$, such that $\varinjlim H_L = H$. The result below does not depend on the choice of the sequence $\{H_L\}_{L=1}^{\infty}$. A natural choice is $H = \mathbb{N}$ and $H_L = \{1, \dots, L\}$. However, we introduce this generality to accommodate the situation we encounter in Sect. 4 (cf. (4.1)).
Let $w = (w_i)_{i \in H}$ be a multivariable with infinitely many components. For each $L \in \mathbb{N}$, let $w_L = (w_i)_{i \in H_L}$ be the submultivariable of $w$. The field $\mathbb{C}[[w_L]]$ of the power series of $w_L$ over $\mathbb{C}$ is equipped with the standard $X_L$-adic topology, where $X_L$ is the ideal of $\mathbb{C}[[w_L]]$ generated by the $w_i$ ($i \in H_L$). For $L < L'$, there is a natural projection $p_{LL'} : \mathbb{C}[[w_{L'}]] \to \mathbb{C}[[w_L]]$ such that $p_{LL'}(w_i) = w_i$ if $i \in H_L$ and 0 if $i \in H_{L'} \setminus H_L$. A power series $f(w)$ of $w$ is an element of the projective limit $\mathbb{C}[[w]] = \varprojlim \mathbb{C}[[w_L]]$ of the projective system
\[ \mathbb{C}[[w_1]] \leftarrow \mathbb{C}[[w_2]] \leftarrow \mathbb{C}[[w_3]] \leftarrow \cdots \tag{3.1} \]
with the induced topology. Let $p_L$ be the canonical projection $p_L : \mathbb{C}[[w]] \to \mathbb{C}[[w_L]]$, and $f_L(w_L)$ be the $L$th projection image of $f(w) \in \mathbb{C}[[w]]$; namely, $f_L(w_L) = p_L(f(w))$ and $f(w) = (f_L(w_L))_{L=1}^{\infty}$. Here are some basic properties of power series which we use below:
(i) We also present a power series $f(w)$ as a formal sum
\[ f(w) = \sum_{N \in \mathcal{N}} a_N w^N, \qquad a_N \in \mathbb{C}, \tag{3.2} \]
\[ \mathcal{N} = \{ N = (N_i)_{i \in H} \mid N_i \in \mathbb{Z}_{\ge 0}, \text{ all but finitely many } N_i \text{ are zero} \}, \tag{3.3} \]
(the definition of $\mathcal{N}$ is reset here for the infinite index set $H$) whose $L$th projection image is
\[ f_L(w_L) = \sum_{N \in \mathcal{N}_L} a_N w^N, \tag{3.4} \]
\[ \mathcal{N}_L = \{ N \in \mathcal{N} \mid N_i = 0 \text{ for } i \notin H_L \}. \tag{3.5} \]
(ii) For any power series $f(w)$ with the unit constant term and any complex number $\alpha$, the $\alpha$th power $(f(w))^{\alpha} := ((f_L(w_L))^{\alpha})_{L=1}^{\infty} \in \mathbb{C}[[w]]$ is uniquely defined and has the unit constant term again.
(iii) Let $f_i(w)$ ($i \in H$) be a family of power series and $f_{i,L}(w_L)$ be their $L$th projections. If their infinite product exists irrespective of the order of the product, we write it as $\prod_{i \in H} f_i(w)$. The product $\prod_{i \in H} f_i(w)$ exists if and only if $\prod_{i \in H} f_{i,L}(w_L)$ exists for each $L$; furthermore, if they exist, the latter is the $L$th projection of the former.
Definition 3.1. The following system of equations for a family $(Q_i(w))_{i \in H}$ of power series of $w$ with the unit constant terms is called an (infinite) Q-system: For each $i \in H$,
\[ \prod_{j \in H} (Q_j(w))^{D_{ij}} + w_i \prod_{j \in H} (Q_j(w))^{G_{ij}} = 1. \tag{3.6} \]
Here, $D = (D_{ij})_{i,j \in H}$ and $G = (G_{ij})_{i,j \in H}$ are arbitrary infinite-size complex matrices satisfying the following two conditions:
(D) The matrix $D$ is invertible, i.e., there exists a matrix $D^{-1}$ such that $D D^{-1} = D^{-1} D = I$.
(G') The matrix product $G' = G D^{-1}$ is well-defined.
When $D = I$, Eq. (3.6) is called a standard Q-system.
Remark 3.2. The condition (G') is rephrased as “for each $i$ and $k$, all but finitely many $G_{ij} (D^{-1})_{jk}$ ($j \in H$) are zero”. Similarly, the condition (D) implies that, for each $i$ and $k$, all but finitely many $D_{ij} (D^{-1})_{jk}$, $(D^{-1})_{ij} D_{jk}$ ($j \in H$) are zero. For the standard case, (D) is trivially satisfied, and (G') is satisfied for any complex matrix $G$.
Unlike the finite Q-systems, the uniqueness of the solution does not hold for the infinite Q-systems, in general. For instance, the following example admits infinitely many solutions.
Example 3.3. Let $H = \mathbb{N}$, and consider a Q-system,
\[ \frac{Q_{i-1}(w)\, Q_{i+1}(w)}{(Q_i(w))^2} + w_i = 1, \tag{3.7} \]
where $Q_0(w) = 1$. This can be easily solved as
\[ Q_i(w) = (Q_1(w))^{i} \prod_{j=1}^{i-1} (1 - w_j)^{i - j}, \tag{3.8} \]
where $Q_1(w)$ is an arbitrary series of $w$ with the unit constant term.
3.2. Canonical solution.
3.2.1. Solution of standard Q-system. First, we consider the standard case
\[ Q_i(w) + w_i \prod_{j \in H} (Q_j(w))^{G_{ij}} = 1. \tag{3.9} \]
Let $Q_{i,L}(w_L) := p_L(Q_i(w))$ be the $L$th projection image of $Q_i(w)$. Then, (3.9) is equivalent to a series of equations ($L = 1, 2, \dots$),
\[ Q_{i,L}(w_L) + p_L(w_i) \prod_{j \in H} (Q_{j,L}(w_L))^{G_{ij}} = 1, \tag{3.10} \]
which are further equivalent to
\[ Q_{i,L}(w_L) = 1, \qquad i \notin H_L, \tag{3.11} \]
\[ Q_{i,L}(w_L) + w_i \prod_{j \in H_L} (Q_{j,L}(w_L))^{G_{ij}} = 1, \qquad i \in H_L. \tag{3.12} \]
Namely, a standard infinite Q-system is an infinite family of standard finite Q-systems which is compatible with the projections (3.1). By Proposition 2.1, (3.12) uniquely determines $Q_{i,L}(w_L)$ for $i \in H_L$. Furthermore, so determined $(Q_{i,L}(w_L))_{L=1}^{\infty}$ belongs to $\mathbb{C}[[w]]$, again because of the uniqueness of the solution of (3.12). Therefore,
Proposition 3.4. There exists a unique solution $(Q_i(w))_{i \in H}$ of the standard Q-system (3.9), whose $L$th projections $Q_{i,L}(w_L) := p_L(Q_i(w))$ are determined by (3.11) and (3.12).
3.2.2. Canonical solution. As we have seen in Example 3.3, the uniqueness property does not hold for a general infinite Q-system (3.6). This is because, unlike the standard case, the $L$th projection of (3.6) is not necessarily a finite Q-system. The non-uniqueness property also implies that, unlike the finite case, (3.6) does not always reduce to the standard one,
\[ Q'_i(w) + w_i \prod_{j \in H} (Q'_j(w))^{G'_{ij}} = 1, \qquad G' = G D^{-1}. \tag{3.13} \]
In fact, the relations (2.7) and (2.8) are no longer equivalent due to the infinite products therein. However, the construction of a solution of a general Q-system from a standard one in Theorem 2.3 still works. We call the solution so obtained the canonical solution. Let us give a more intrinsic definition, however.
Definition 3.5. We say that a solution $(Q_i(w))_{i \in H}$ of the Q-system (3.6) is canonical if it satisfies the following condition:
(Inversion property): For any $i \in H$,
\[ \prod_{j \in H} \prod_{k \in H} (Q_k(w))^{(D^{-1})_{ij} D_{jk}} = Q_i(w). \tag{3.14} \]
Remark 3.6. The condition (3.14) is not trivial, because, in general, one cannot freely exchange the order of the infinite double product therein.
Theorem 3.7. There exists a unique canonical solution of the Q-system (3.6), which is given by
\[ Q_i(w) = \prod_{j \in H} (Q'_j(w))^{(D^{-1})_{ij}}, \tag{3.15} \]
where $(Q'_i(w))_{i \in H}$ is the unique solution of the standard Q-system (3.13).
Proof. First, we remark that the infinite product (3.15) exists, because its $L$th projection image reduces to the finite product
\[ Q_{i,L}(w_L) = \prod_{j \in H_L} (Q'_{j,L}(w_L))^{(D^{-1})_{ij}} \tag{3.16} \]
due to (3.11). Let us show that the family $(Q_i(w))_{i \in H}$ in (3.15) is a solution of the Q-system (3.6). With the substitution of (3.16), the $L$th projection image of the first term in the LHS of (3.6) is
\[ \prod_{j \in H} (Q_{j,L}(w_L))^{D_{ij}} = \prod_{j \in H} \prod_{k \in H_L} (Q'_{k,L}(w_L))^{D_{ij} (D^{-1})_{jk}} = \begin{cases} Q'_{i,L}(w_L) & i \in H_L, \\ 1 = Q'_{i,L}(w_L) & i \notin H_L. \end{cases} \tag{3.17} \]
In the second equality above, we exchanged the order of the products. It is allowed because the double product is a finite one (cf. Remark 3.2). The second term in the LHS of (3.6) can be calculated in a similar way as follows:
\[ \prod_{j \in H} (Q_{j,L}(w_L))^{G_{ij}} = \prod_{j \in H} \prod_{k \in H_L} (Q'_{k,L}(w_L))^{G_{ij} (D^{-1})_{jk}} = \prod_{k \in H_L} (Q'_{k,L}(w_L))^{G'_{ik}}. \tag{3.18} \]
From (3.17) and (3.18), we conclude that (3.6) reduces to (3.13). Furthermore, by (3.17), we have
\[ \prod_{j \in H} (Q_j(w))^{D_{ij}} = Q'_i(w). \tag{3.19} \]
Then, substituting (3.19) in (3.15), we obtain (3.14). Therefore, $(Q_i(w))_{i \in H}$ is a canonical solution of (3.6).
Next, we show the uniqueness. Suppose that $(Q_i(w))_{i \in H}$ is a canonical solution of (3.6). We define $Q'_i(w)$ as
\[ Q'_i(w) = \prod_{j \in H} (Q_j(w))^{D_{ij}}. \tag{3.20} \]
Then, by the inversion property (3.14), we have
\[ Q_i(w) = \prod_{j \in H} (Q'_j(w))^{(D^{-1})_{ij}}. \tag{3.21} \]
Also, by (3.6),
\[ Q'_{i,L}(w_L) = 1, \qquad i \notin H_L. \tag{3.22} \]
With (3.21) and (3.22), the same calculation as (3.18) shows that $(Q'_i(w))_{i \in H}$ is the (unique) solution of (3.13). Therefore, by (3.21), $Q_i(w)$ is unique.
Example 3.8. Let us find the canonical solution of the Q-system (3.7) in Example 3.3. We have
\[ D_{ij} = -2\delta_{ij} + \delta_{i,j-1} + \delta_{i,j+1}, \qquad (D^{-1})_{ij} = -\min(i, j). \tag{3.23} \]
Let $H_L = \{1, \dots, L\}$. By (3.20) and (3.22), the $L$th projection of the LHS of (3.14) equals
\[ \begin{aligned}
\prod_{j=1}^{L} \prod_{k=j-1}^{j+1} (Q_{k,L}(w_L))^{(D^{-1})_{ij} D_{jk}}
&= \prod_{k=1}^{L} \prod_{j=k-1}^{k+1} (Q_{k,L}(w_L))^{(D^{-1})_{ij} D_{jk}} \\
&\qquad \times (Q_{L+1,L}(w_L))^{(D^{-1})_{iL} D_{L,L+1}} (Q_{L,L}(w_L))^{-(D^{-1})_{i,L+1} D_{L+1,L}} \\
&= \prod_{k=1}^{L} (Q_{k,L}(w_L))^{\delta_{ik}}\, (Q_{L+1,L}(w_L))^{-\min(i,L)}\, (Q_{L,L}(w_L))^{\min(i,L+1)}.
\end{aligned} \tag{3.24} \]
Therefore, condition (3.14) reads
\[ Q_{i,L}(w_L) = \begin{cases} Q_{i,L}(w_L) \bigl( Q_{L,L}(w_L)/Q_{L+1,L}(w_L) \bigr)^{i} & i \le L, \\ Q_{L,L}(w_L) \bigl( Q_{L,L}(w_L)/Q_{L+1,L}(w_L) \bigr)^{L} & i \ge L + 1. \end{cases} \tag{3.25} \]
This is equivalent to
\[ Q_{i,L}(w_L) = Q_{L,L}(w_L), \qquad i \ge L + 1. \tag{3.26} \]
Using (3.8) and (3.26), one can easily obtain
\[ Q_1(w) = \prod_{j=1}^{\infty} (1 - w_j)^{-1}. \tag{3.27} \]
Therefore, the canonical solution of (3.7) is given by
\[ Q_i(w) = \prod_{j=1}^{\infty} (1 - w_j)^{-\min(i,j)}. \tag{3.28} \]
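Two facts used in Example 3.8 can be checked by integer arithmetic alone: that $-\min(i,j)$ inverts the matrix $D$ of (3.23), and that the product (3.28) satisfies (3.7). The sketch below does both by comparing exponents of $(1 - w_j)$; the function names are illustrative and not taken from the paper.

```python
def d_entry(i, j):
    """D_ij = -2 delta_ij + delta_{i,j-1} + delta_{i,j+1}, indices i, j >= 1."""
    return -2 * (i == j) + (i == j - 1) + (i == j + 1)

def d_inv_entry(i, j):
    """(D^{-1})_ij = -min(i, j), cf. (3.23)."""
    return -min(i, j)

def product_entry(i, k):
    """(D D^{-1})_ik; the sum over j has at most three nonzero terms."""
    return sum(d_entry(i, j) * d_inv_entry(j, k) for j in range(max(1, i - 1), i + 2))

def ratio_exponent(i, j):
    """Exponent of (1 - w_j) in Q_{i-1} Q_{i+1} / Q_i^2 for Q_i as in (3.28)."""
    e = lambda a: -min(a, j)  # exponent of (1 - w_j) in Q_a; min(0, j) = 0 encodes Q_0 = 1
    return e(i - 1) + e(i + 1) - 2 * e(i)
```

Since `ratio_exponent(i, j)` equals $\delta_{ij}$, the ratio $Q_{i-1} Q_{i+1}/Q_i^2$ is exactly $1 - w_i$, which is (3.7).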
3.3. Power series formula. Let $(Q_i(w))_{i \in H}$ be the canonical solution of (3.6), and $(Q'_i(w))_{i \in H}$ be the unique solution of the standard Q-system (3.13). For the matrix $D$ in (3.6), let $\nu(D)$ be the set of all $\nu = (\nu_i)_{i \in H}$ such that $\nu_i \in \mathbb{C}$ and, for each $i$, the sum $\sum_{j \in H} \nu_j (D^{-1})_{ji}$ exists (i.e., all but finitely many $\nu_j (D^{-1})_{ji}$ ($j \in H$) are zero). For each $\nu \in \nu(D)$, we define
\[ Q^{\nu}_{D,G}(w) := \prod_{i \in H} (Q_i(w))^{\nu_i} = \prod_{i \in H} \prod_{j \in H} (Q'_j(w))^{\nu_i (D^{-1})_{ij}}. \tag{3.29} \]
The last infinite product exists, because its $L$th projection image reduces to a finite product due to (3.11) and the definition of $\nu(D)$. For each $\nu \in \nu(D)$, let $\nu' = (\nu'_i) \in \nu(I)$, $\nu'_i = \sum_{j \in H} \nu_j (D^{-1})_{ji}$. Then, by (3.29), we have
\[ Q^{\nu}_{D,G}(w) = Q^{\nu'}_{I,G'}(w), \qquad G' = G D^{-1}. \tag{3.30} \]
It follows from (3.11) and (3.30) that
Lemma 3.9.
\[ p_L(Q^{\nu}_{D,G}(w)) = Q^{\nu'_L}_{I_L,G'_L}(w_L), \tag{3.31} \]
where the RHS is for the solution of the finite Q-system with the finite index set $H_L$, and $I_L = (\delta_{ij})_{i,j \in H_L}$, $G'_L = (G'_{ij})_{i,j \in H_L}$, $\nu'_L = (\nu'_i)_{i \in H_L}$ are the $H_L$-truncations of $I$, $G'$, $\nu'$, respectively.
For $D$, $G$ in (3.6) and $\nu \in \nu(D)$, we define the power series $K^{\nu}_{D,G}(w)$ and $R^{\nu}_{D,G}(w)$ by the superficially identical formulae (2.10)–(2.15) with $D$, $G$, $\nu$, $N$, etc., therein being replaced by the ones for the infinite index set $H$.
Theorem 3.10 (Power series formulae). For the canonical solution $(Q_i(w))_{i \in H}$ of (3.6) and $\nu \in \nu(D)$, let $Q^{\nu}_{D,G}(w)$ be the series in (3.29). Then,
\[ Q^{\nu}_{D,G}(w) = K^{\nu}_{D,G}(w)/K^{0}_{D,G}(w) = R^{\nu}_{D,G}(w). \tag{3.32} \]
Proof. By Theorem 2.4 and Lemma 3.9, it is enough to show that
\[ p_L(K^{\nu}_{D,G}(w)) = K^{\nu'_L}_{I_L,G'_L}(w_L), \qquad p_L(R^{\nu}_{D,G}(w)) = R^{\nu'_L}_{I_L,G'_L}(w_L). \tag{3.33} \]
By (3.2)–(3.5), (3.33) further reduces to the following equality:
\[ P_i(D, G; \nu, N) = P_i(I_L, G'_L; \nu'_L, N_L), \qquad N \in \mathcal{N}_L,\ i \in H_L, \tag{3.34} \]
where $N_L = (N_i)_{i \in H_L}$ is the $H_L$-truncation of $N$.
4. Q-Systems of KR Type
In this section, we introduce a class of infinite Q-systems which we call the Q-systems of KR type. This is a preliminary step towards the reformulation of Conjecture 1.1.
4.1. Specialized Q-systems. Throughout the section, we take the countable index set as
\[ H = \{1, \dots, n\} \times \mathbb{N} \tag{4.1} \]
for a given natural number $n$. We choose the increasing sequence $H_1 \subset H_2 \subset \cdots \subset H$ with $\varinjlim H_L = H$ as $H_L = \{1, \dots, n\} \times \{1, \dots, L\}$. Let $y = (y_a)_{a=1}^{n}$ be a multivariable with $n$ components.
Definition 4.1. The following system of equations for a family $(Q_m^{(a)}(y))_{(a,m) \in H}$ of power series of $y$ with the unit constant terms is called a specialized (infinite) Q-system: For each $(a, m) \in H$,
\[ \prod_{(b,k) \in H} (Q_k^{(b)}(y))^{D_{am,bk}} + (y_a)^m \prod_{(b,k) \in H} (Q_k^{(b)}(y))^{G_{am,bk}} = 1, \tag{4.2} \]
where the infinite-size complex matrices $D = (D_{am,bk})_{(a,m),(b,k) \in H}$ and $G = (G_{am,bk})_{(a,m),(b,k) \in H}$ satisfy the same conditions (D) and (G') as in Definition 3.1. A solution of (4.2) is called canonical if it satisfies the condition
\[ \prod_{(b,k) \in H} \prod_{(c,j) \in H} (Q_j^{(c)}(y))^{(D^{-1})_{am,bk} D_{bk,cj}} = Q_m^{(a)}(y). \tag{4.3} \]
Let $\mathbb{C}[[y]]$ be the field of power series of $y$ with the standard topology, $J_L$ be the ideal of $\mathbb{C}[[y]]$ generated by the $(y_a)^{L+1}$ ($a = 1, \dots, n$), and $\mathbb{C}[[y]]_L$ be the quotient $\mathbb{C}[[y]]/J_L$. We can identify $\mathbb{C}[[y]]$ with the projective limit of the projective system,
\[ \mathbb{C}[[y]]_1 \leftarrow \mathbb{C}[[y]]_2 \leftarrow \mathbb{C}[[y]]_3 \leftarrow \cdots. \tag{4.4} \]
Let $w = (w_m^{(a)})_{(a,m) \in H}$ be a multivariable, and let $w(y)$ be the map with
\[ w_m^{(a)}(y) = (y_a)^m. \tag{4.5} \]
The map (4.5) induces the maps $\psi_L$ and $\psi$ such that
\[ \begin{array}{ccc} \mathbb{C}[[w_L]] & \leftarrow & \mathbb{C}[[w]] \\ \psi_L \downarrow & & \downarrow \psi \\ \mathbb{C}[[y]]_L & \leftarrow & \mathbb{C}[[y]]. \end{array} \tag{4.6} \]
We call the image $\psi(f(w)) \in \mathbb{C}[[y]]$ the specialization of $f(w)$, and write it as $f(w(y))$. Explicitly, for $f(w)$ in (3.2),
\[ f(w(y)) = \sum_{M_1, \dots, M_n = 0}^{\infty} \Biggl( \sum_{\substack{N \in \mathcal{N} \\ \sum_{m=1}^{\infty} m N_m^{(a)} = M_a}} a_N \Biggr) \prod_{a=1}^{n} (y_a)^{M_a}. \tag{4.7} \]
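The specialization (4.7) simply regroups the monomials of $f(w)$ according to the weighted degrees $M_a = \sum_m m N_m^{(a)}$. A minimal sketch follows, assuming a hypothetical sparse representation in which a series is a dictionary from exponent data to coefficients; this data layout is not taken from the paper.

```python
from collections import defaultdict

def specialize(coeffs, n):
    """Apply w_m^(a) -> (y_a)^m to a power series given as {N: a_N}.

    Each exponent N is a tuple of pairs ((a, m), N_m^(a)) listing its nonzero entries;
    the result maps M = (M_1, ..., M_n) to the coefficient of prod_a (y_a)^(M_a).
    """
    out = defaultdict(int)
    for N, a_N in coeffs.items():
        M = [0] * n
        for (a, m), mult in N:
            M[a - 1] += m * mult  # w_m^(a) contributes degree m in y_a
        out[tuple(M)] += a_N
    return dict(out)
```

For example, with $n = 1$ the series $(w_1)^2 + 3 w_2 + w_1 w_2$ specializes to $4y^2 + y^3$, since both $(w_1)^2$ and $w_2$ map to $y^2$.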
Theorem 4.2. There exists a unique canonical solution of the specialized Q-system (4.2), which is given by the specialization $Q_m^{(a)}(y) = Q_m^{(a)}(w(y))$ of the canonical solution $(Q_m^{(a)}(w))_{(a,m) \in H}$ of the following Q-system:
\[ \prod_{(b,k) \in H} (Q_k^{(b)}(w))^{D_{am,bk}} + w_m^{(a)} \prod_{(b,k) \in H} (Q_k^{(b)}(w))^{G_{am,bk}} = 1. \tag{4.8} \]
Proof. Since the map $\psi$ is continuous, it preserves the infinite product. Therefore, the specialization of the canonical solution of (4.8) gives a canonical solution of (4.2). Let us show the uniqueness. By repeating the same proof for Theorem 3.7, the uniqueness is reduced to the one for the standard case $D = I$. Let us write (4.2) for $D = I$ as ($L = 1, 2, \dots$)
\[ Q_m^{(a)}(y) \equiv 1 \mod J_L, \qquad (a, m) \notin H_L, \tag{4.9} \]
\[ Q_m^{(a)}(y) + (y_a)^m \prod_{(b,k) \in H_L} (Q_k^{(b)}(y))^{G_{am,bk}} \equiv 1 \mod J_L, \qquad (a, m) \in H_L. \tag{4.10} \]
These equations uniquely determine $Q_m^{(a)}(y)$ mod $J_L$. Since $L$ is arbitrary, $Q_m^{(a)}(y)$ is unique.
By the specialization of Theorem 3.10, we immediately obtain
Theorem 4.3 (Power series formulae). Let $(Q_m^{(a)}(y))_{(a,m) \in H}$ be the canonical solution of the Q-system (4.2). Let $\mathcal{Q}^{\nu}_{D,G}(y) = \prod_{(a,m) \in H} (Q_m^{(a)}(y))^{\nu_m^{(a)}}$, $\nu \in \nu(D)$. Then,
\[ \mathcal{Q}^{\nu}_{D,G}(y) = \mathcal{K}^{\nu}_{D,G}(y)/\mathcal{K}^{0}_{D,G}(y) = \mathcal{R}^{\nu}_{D,G}(y), \tag{4.11} \]
where the series $\mathcal{K}^{\nu}_{D,G}(y) = K^{\nu}_{D,G}(w(y))$ and $\mathcal{R}^{\nu}_{D,G}(y) = R^{\nu}_{D,G}(w(y))$ are the specializations of the series in Theorem 3.10.
4.2. Convergence property. Let us consider the special case of the specialized Q-system (4.2) where the matrix $D$ and its inverse $D^{-1}$ are given by
\[ D_{am,bk} = -\delta_{ab} (2\delta_{mk} - \delta_{m,k+1} - \delta_{m,k-1}), \tag{4.12} \]
\[ (D^{-1})_{am,bk} = -\delta_{ab} \min(m, k). \tag{4.13} \]
Then, (4.2) is written in the form ($Q_0^{(a)}(y) = 1$)
\[ (Q_m^{(a)}(y))^2 = Q_{m-1}^{(a)}(y)\, Q_{m+1}^{(a)}(y) + (y_a)^m (Q_m^{(a)}(y))^2 \prod_{(b,k) \in H} (Q_k^{(b)}(y))^{G_{am,bk}}. \tag{4.14} \]
Proposition 4.4. A solution $(Q_m^{(a)}(y))$ of the specialized Q-system (4.14) is canonical if and only if it satisfies the following condition:
(Convergence property): For each $a$, the limit
\[ \lim_{m \to \infty} Q_m^{(a)}(y) \text{ exists in } \mathbb{C}[[y]]. \tag{4.15} \]
Proof. Let $(Q_m^{(a)}(y))_{(a,m) \in H}$ be a solution of (4.14). The same calculation as (3.24) in Example 3.8 shows that (4.3) is equivalent to the following equality for each $L$ (cf. (3.26)):
\[ Q_m^{(a)}(y) \equiv Q_L^{(a)}(y) \mod J_L, \qquad m \ge L + 1. \tag{4.16} \]
Clearly, condition (4.15) follows from condition (4.16). Conversely, assume condition (4.15). By (4.14), we have
\[ Q_m^{(a)}(y)/Q_{m-1}^{(a)}(y) \equiv Q_{m+1}^{(a)}(y)/Q_m^{(a)}(y) \mod J_L, \qquad (m \ge L + 1). \tag{4.17} \]
Because of (4.15), both sides of (4.17) are 1 mod $J_L$. Thus, we have $Q_m^{(a)}(y) \equiv Q_{m-1}^{(a)}(y)$ mod $J_L$ ($m \ge L + 1$). Therefore, (4.16) holds.
4.3. Q-system of KR type and denominator formula.
Definition 4.5. A specialized Q-system (4.2) is called a Q-system of KR (Kirillov–Reshetikhin) type if the matrices $D$ and $G$ further satisfy the following conditions:
(KR-I) The matrix $D$ and its inverse $D^{-1}$ are given by (4.12) and (4.13).
(KR-II) There exists a well-order $\prec$ in $H$ such that $G' = G D^{-1}$ has the form
\[ G'_{am,bk} = g_{ab}\, m \quad \text{for } (a, m) \preceq (b, k), \tag{4.18} \]
where $g_{ab}$ ($a, b = 1, \dots, n$) are integers with $\det_{1 \le a,b \le n} g_{ab} \neq 0$.
Example 4.6. Let $t_a > 0$ and $h_{ab}$ ($a, b = 1, \dots, n$) be real numbers such that $g_{ab} := h_{ab} t_b$ are integers and $\det h_{ab} \neq 0$. We define a well-order $\prec$ in $H$ as follows: $(a, m) \prec (b, k)$ if $t_b m < t_a k$, or if $t_b m = t_a k$ and $a < b$. Then,
\[ G'_{am,bk} = h_{ab} \min(t_b m, t_a k) \tag{4.19} \]
satisfies the condition (KR-II) with $g_{ab} = h_{ab} t_b$.
Let $x = (x_a)_{a=1}^{n}$ be a multivariable with $n$ components, and $y(x)$ be the map
\[ y_a(x) = \prod_{b=1}^{n} (x_b)^{-g_{ab}}, \tag{4.20} \]
where $g_{ab}$ are the integers in (4.18). We set
\[ Q_m^{(a)}(x) := (x_a)^m Q_m^{(a)}(y(x)), \tag{4.21} \]
which are Laurent series of $x$.
Proposition 4.7. The family $(Q_m^{(a)}(x))_{(a,m) \in H}$ satisfies a system of equations ($Q_0^{(a)}(x) = 1$),
\[ (Q_m^{(a)}(x))^2 = Q_{m-1}^{(a)}(x)\, Q_{m+1}^{(a)}(x) + (Q_m^{(a)}(x))^2 \prod_{(b,k) \in H} (Q_k^{(b)}(x))^{G_{am,bk}}. \tag{4.22} \]
Proof. By comparing (4.14) and (4.22), it is enough to prove the equality
\[ \sum_{k=1}^{\infty} G_{am,bk} (-k) = g_{ab}\, m. \tag{4.23} \]
Due to the condition (KR-II), for given $(a, m)$ and $b$, there is some number $L$ such that $G'_{am,bk} = g_{ab} m$ holds for any $k \ge L$. Then, for $k > L$, we have
\[ G_{am,bk} = \sum_{j=1}^{\infty} G'_{am,bj} D_{bj,bk} = g_{ab}\, m\, (-2 + 1 + 1) = 0. \tag{4.24} \]
Therefore, the LHS of (4.23) is evaluated as
\[ \sum_{k=1}^{L} \sum_{j=1}^{\infty} G'_{am,bj} D_{bj,bk} (-k) = (L + 1) G'_{am,bL} - L\, G'_{am,bL+1} = g_{ab}\, m. \tag{4.25} \]
Remark 4.8. The relation (4.22) is the original form of the Q-system in [K2, K3, KR], where the matrix $G$ is taken as (1.5). See also (5.14) and (5.16). Note that, in the second term of the RHS in (4.22), the factor $(Q_m^{(a)}(x))^2$ is cancelled by the factor in the product for $(b, k) = (a, m)$, because $G_{am,am} = -2$.
Proposition 4.9 (Denominator formula). Let $(Q_m^{(a)}(y))_{(a,m) \in H}$ be the canonical solution of the Q-system of KR type (4.14). Let $\mathcal{K}^{0}_{D,G}(x) := \mathcal{K}^{0}_{D,G}(y(x))$, where $\mathcal{K}^{0}_{D,G}(y)$ is the power series in (4.11). Then, the formula
\[ \mathcal{K}^{0}_{D,G}(x) = \det_{1 \le a,b \le n}\Bigl( \frac{\partial Q_1^{(a)}}{\partial x_b}(x) \Bigr) \tag{4.26} \]
holds.
A proof of Proposition 4.9 is given in Appendix A. In Conjecture 5.7, Proposition 4.9 will be used to identify $\mathcal{K}^{0}_{D,G}(x)$ for some $G$ with the Weyl denominators of the simple Lie algebras.
5. Q-Systems and the Kirillov–Reshetikhin Conjecture
In this section, we reformulate Conjecture 1.1 in terms of the canonical solutions of certain Q-systems of KR type (Conjecture 5.5). Then, we present several character formulae, all of which are equivalent to Conjecture 5.5.
5.1. Quantum affine algebras. We formulate Conjecture 1.1 in the following setting: Firstly, we translate the conjecture for the KR modules of the (untwisted) quantum affine algebra $U_q(X_n^{(1)})$, based on the widely-believed correspondence between the finite-dimensional modules of $Y(X_n)$ and $U_q(X_n^{(1)})$ (for the simply-laced case, see [V]). Secondly, we also include the twisted quantum affine algebra case, following [HKOTT].
First, we introduce some notations. Let $\mathfrak{g} = X_N$ be a complex simple Lie algebra of rank $N$. We fix a Dynkin diagram automorphism $\sigma$ of $\mathfrak{g}$ with $r = \operatorname{ord} \sigma$. Let $\mathfrak{g}_0$ be the $\sigma$-invariant subalgebra of $\mathfrak{g}$; namely,
\[ \begin{array}{c|cccccc} \mathfrak{g} & X_n & A_{2n} & A_{2n-1} & D_{n+1} & E_6 & D_4 \\ r & 1 & 2 & 2 & 2 & 2 & 3 \\ \mathfrak{g}_0 & X_n & B_n & C_n & B_n & F_4 & G_2 \end{array} \tag{5.1} \]
See Fig. 1. Let $A = (A_{ij})$ ($i, j \in I$) and $A' = (A'_{ij})$ ($i, j \in I_\sigma$) be the Cartan matrices of $\mathfrak{g}$ and $\mathfrak{g}_0$, respectively, where $I_\sigma$ is the set of the $\sigma$-orbits on $I$. We define the numbers $d_i$, $d'_i$, $C_i$, $C'_i$ ($i \in I$) as follows: $d_i$ ($i \in I$) are coprime positive integers such that $(d_i A_{ij})$ is symmetric; $d'_i$ ($i \in I_\sigma$) are coprime positive integers such that $(d'_i A'_{ij})$ is symmetric, and we set $d'_i = d'_{\pi(i)}$ ($i \in I$), where $\pi : I \to I_\sigma$ is the canonical projection; $C_i = r$ if $\sigma(i) = i$, and 1 otherwise; $C'_i = 2$ if $A_{i\sigma(i)} < 0$, and 1 otherwise. It immediately follows that $d_i = d'_i$ and $C_i = 1$ if $r = 1$; $d_i = 1$ if $r > 1$; $C'_i = 1$ if $X_N^{(r)} \neq A_{2n}^{(2)}$. It is easy to check the following relations: Set $\kappa_0 = 2$ if $X_N^{(r)} = A_{2n}^{(2)}$, and 1 otherwise. Then,
\[ \kappa_0 d_i \sum_{s=1}^{r} A_{i\sigma^s(j)} = d'_i A'_{\pi(i)\pi(j)}, \tag{5.2} \]
\[ \kappa_0 C_i d_i = C'_i d'_i. \tag{5.3} \]
[Figure: Dynkin diagrams of $(X_N, r)$ and $\mathfrak{g}_0$ for $(A_{2n}, 2)$ and $B_n$, $(A_{2n-1}, 2)$ and $C_n$, $(D_{n+1}, 2)$ and $B_n$, $(E_6, 2)$ and $F_4$, $(D_4, 3)$ and $G_2$.]
Fig. 1. The Dynkin diagrams of $X_N$ and $\mathfrak{g}_0$ for $r > 1$. The filled circles in $X_N$ correspond to the ones in $\mathfrak{g}_0$ which are short roots of $\mathfrak{g}_0$
For $q \in \mathbb{C}^{\times}$, we set $q_i = q^{\kappa_0 d_i}$, $q'_i = q^{d'_i}$, and $[n]_q = (q^n - q^{-n})/(q - q^{-1})$.
We use the “second realization” of the quantum affine algebra $U_q = U_q(X_N^{(r)})$ [D2, J] with the generators $X_{ik}^{\pm}$ ($i \in I$, $k \in \mathbb{Z}$), $H_{ik}$ ($i \in I$, $k \in \mathbb{Z} \setminus \{0\}$), $K_i^{\pm 1}$ ($i \in I$), and the central elements $c^{\pm 1/2}$. As far as finite-dimensional $U_q$-modules are concerned, we can set $c^{\pm 1/2} = 1$. Some of the defining relations in the quotient (the quantum loop algebra) $U_q/(c^{\pm 1/2} - 1)$ are presented below to fix notations (here we follow the convention in [CP2, CP3]):
\[ X_{\sigma(i)k}^{\pm} = \omega^k X_{ik}^{\pm}, \qquad H_{\sigma(i)k} = \omega^k H_{ik}, \qquad K_{\sigma(i)}^{\pm 1} = K_i^{\pm 1}, \tag{5.4} \]
\[ K_i X_{jk}^{\pm} K_i^{-1} = q^{\pm \kappa_0 d_i \sum_{s=1}^{r} A_{i\sigma^s(j)}} X_{jk}^{\pm}, \tag{5.5} \]
\[ [H_{ik}, X_{jl}^{\pm}] = \pm \frac{1}{k} \sum_{s=1}^{r} [k \kappa_0 d_i A_{i\sigma^s(j)}]_q\, \omega^{sk} X_{j,k+l}^{\pm}, \tag{5.6} \]
\[ [X_{ik}^{+}, X_{jl}^{-}] = \sum_{s=1}^{r} \delta_{\sigma^s(i),j}\, \omega^{sl}\, \frac{H_{i,k+l}^{+} - H_{i,k+l}^{-}}{q_i - q_i^{-1}}, \tag{5.7} \]
where $\omega = \exp(2\pi i/r)$, and $H_{ik}^{\pm}$ ($i \in I$, $k \in \mathbb{Z}$) are defined by
\[ \sum_{k=0}^{\infty} H_{i,\pm k}^{\pm} u^k = K_i^{\pm 1} \exp\Bigl( \pm (q_i - q_i^{-1}) \sum_{l=1}^{\infty} H_{i,\pm l} u^l \Bigr) \tag{5.8} \]
with $H_{ik}^{\pm} = 0$ ($\pm k < 0$).
Remark 5.1. In [CP3], there are some misprints which are relevant here. Namely, the relation $[H_{ik}, X_{jl}^{\pm}]$ should read (5.6) here; in Proposition 2.2 and Theorem 3.1 (ii), $q$ should read $q_i$ for such $i$ that $\sigma(i) \neq i$ and $a_{i\sigma(i)} = 0$ therein. We thank V. Chari for the correspondence concerning these points.
Let $V(\psi^{\pm})$ denote the irreducible $U_q$-module with a highest weight vector $v$ and the highest weight $\psi^{\pm} = (\psi_{ik}^{\pm})$, namely,
\[ X_{ik}^{+} v = 0, \tag{5.9} \]
\[ H_{ik}^{\pm} v = \psi_{ik}^{\pm} v, \qquad \psi_{ik}^{\pm} \in \mathbb{C}. \tag{5.10} \]
The following theorem gives the classification of the finite-dimensional $U_q$-modules:
Theorem 5.2 (Theorem 3.3 [CP2], Theorem 3.1 [CP3]). The $U_q(X_N^{(r)})$-module $V(\psi^{\pm})$ is finite-dimensional if and only if there exists an $N$-tuple of polynomials $(P_i(u))_{i \in I}$ with the unit constant terms such that
\[ \sum_{k=0}^{\infty} \psi_{ik}^{+} u^k = \sum_{k=0}^{\infty} \psi_{i,-k}^{-} u^{-k} = q_i^{C_i \deg P_i}\, \frac{P_i(q_i^{-2C_i} u^{C_i})}{P_i(u^{C_i})}, \tag{5.11} \]
where the first two terms are the Laurent expansions of the third term about $u = 0$ and $u = \infty$, respectively.
The polynomials $(P_i(u))_{i \in I}$ are called the Drinfeld polynomials of $V(\psi^{\pm})$. It follows from (5.3), (5.10), and (5.11) that
\[ K_i^{\pm 1} v = q_i^{\pm C_i \deg P_i} v = (q'_i)^{\pm C'_i \deg P_i} v. \tag{5.12} \]
5.2. The KR modules. We take an inclusion $\iota : I_\sigma \hookrightarrow I$ such that $\pi \circ \iota = \mathrm{id}$, and regard $I_\sigma$ as a subset of $I$. Let us label the set $I_\sigma$ with $\{1, \dots, n\}$. The Drinfeld polynomials (5.11) satisfy the relation $P_{\sigma(i)}(u) = P_i(\omega u)$ ($\sigma(i) \neq i$) by (5.4) and (5.8). Therefore, it is enough to specify the polynomials $P_i(u)$ only for those $i \in \{1, \dots, n\} \subset I$. We set $H = \{1, \dots, n\} \times \mathbb{N}$ as in (4.1).
Definition 5.3. For each $(a, m) \in H$ and $\zeta \in \mathbb{C}^{\times}$, let $W_m^{(a)}(\zeta)$ be the finite-dimensional irreducible $U_q$-module whose Drinfeld polynomials $P_b(u)$ ($b = 1, \dots, n$) are specified as follows: $P_b(u) = 1$ for $b \neq a$, and
\[ P_a(u) = \prod_{k=1}^{m} \bigl( 1 - \zeta q_a^{C_a (m + 2 - 2k)} u \bigr). \tag{5.13} \]
We call $W_m^{(a)}(\zeta)$ a KR (Kirillov–Reshetikhin) module.
By (5.2) and (5.5), we see that $X_{a0}^{\pm}$ and $K_a^{\pm 1}$ ($a = 1, \dots, n$) generate the subalgebra $U_q(\mathfrak{g}_0)$. It is well known that all $W_m^{(a)}(\zeta)$ ($\zeta \in \mathbb{C}^{\times}$) share the same $U_q(\mathfrak{g}_0)$-module structure. If we set $K_a^{\pm 1} = (q'_a)^{\pm H_a}$ and take the limit $q \to 1$, $X_{a0}^{\pm}$ and $H_a$ ($a = 1, \dots, n$) generate the Lie algebra $\mathfrak{g}_0$. Accordingly, $W_m^{(a)}(\zeta)$ is equipped with the $\mathfrak{g}_0$-module structure. We call its $\mathfrak{g}_0$-character the $\mathfrak{g}_0$-character of $W_m^{(a)}(\zeta)$. The $\mathfrak{g}_0$-highest weight of $W_m^{(a)}(\zeta)$, in the same sense as above, is $m C'_a \Lambda_a$ by (5.12) and (5.13).
5.3. The Kirillov–Reshetikhin conjecture. We define the matrix $G' = (G'_{am,bk})_{(a,m),(b,k) \in H}$ with the entry
\[ G'_{am,bk} = \frac{d_b}{C_b} \sum_{s=1}^{r} A_{b\sigma^s(a)} \min\Bigl( \frac{m}{d_b}, \frac{k}{d_a} \Bigr) \tag{5.14} \]
\[ \phantom{G'_{am,bk}} = \begin{cases} d_b A_{ba} \min\bigl( \frac{m}{d_b}, \frac{k}{d_a} \bigr) & r = 1, \\ \frac{1}{C'_b} A'_{ba} \min(m, k) & r > 1. \end{cases} \tag{5.15} \]
It follows from (5.15) and Example 4.6 that $G'$ satisfies the condition (KR-II) in Definition 4.5 with $g_{ab} = A'_{ba}/C'_b$. Below, we consider the Q-system of KR type with the matrix $G := G' D$, where $D$ is the matrix in (4.12). By using (A.6) of [KN2], the entry of $G$ is explicitly written as
\[ G_{am,bk} = \begin{cases} -\frac{1}{C'_b} A'_{ba} \delta_{m,k} & r > 1, \\ -A_{ba} (\delta_{m,2k-1} + 2\delta_{m,2k} + \delta_{m,2k+1}) & d_b/d_a = 2, \\ -A_{ba} (\delta_{m,3k-2} + 2\delta_{m,3k-1} + 3\delta_{m,3k} + 2\delta_{m,3k+1} + \delta_{m,3k+2}) & d_b/d_a = 3, \\ -A_{ab}\, \delta_{d_a m, d_b k} & \text{otherwise.} \end{cases} \tag{5.16} \]
Let $\alpha_a$ and $\Lambda_a$ ($a = 1, \dots, n$) be the simple roots and the fundamental weights of $\mathfrak{g}_0$. We set
\[ x_a = e^{C'_a \Lambda_a}, \qquad y_a = e^{-\alpha_a}. \tag{5.17} \]
Then, they satisfy the relation (4.20) for the above $g_{ab}$; namely,
\[ y_a = \prod_{b=1}^{n} x_b^{-A'_{ba}/C'_b}. \tag{5.18} \]
Definition 5.4. Let $Q_m^{(a)}(x)$ be the Laurent polynomial of $x = (x_a)_{a=1}^{n}$ representing the $\mathfrak{g}_0$-character of the KR module $W_m^{(a)}(\zeta)$. Then, $Q_m^{(a)}(y) := (x_a)^{-m} Q_m^{(a)}(x)|_{x = x(y)}$, where $x(y)$ is the inverse map of (5.18), is a polynomial of $y = (y_a)_{a=1}^{n}$ with the unit constant term. We call $Q_m^{(a)}(y)$ the normalized $\mathfrak{g}_0$-character of $W_m^{(a)}(\zeta)$.
Now we present a reformulation of Conjecture 1.1. This is the main statement of the paper.
Conjecture 5.5. Let $Q_m^{(a)}(y)$ be the normalized $\mathfrak{g}_0$-character of the KR module $W_m^{(a)}(\zeta)$ of $U_q(X_N^{(r)})$. Then, the family $(Q_m^{(a)}(y))_{(a,m) \in H}$ is characterized as the canonical solution of the Q-system of KR type (4.14) with $G$ given in (5.16).
Let $\mathcal{Q}^{\nu}(y) = \prod_{(a,m)} (Q_m^{(a)}(y))^{\nu_m^{(a)}}$ for $\nu \in \nu(D)$. By Theorem 4.3, Conjecture 5.5 is equivalent to
Conjecture 5.6 ([KN2]). The formulae
\[ \mathcal{Q}^{\nu}(y) = \mathcal{K}^{\nu}_{D,G}(y)/\mathcal{K}^{0}_{D,G}(y) = \mathcal{R}^{\nu}_{D,G}(y) \tag{5.19} \]
hold, where $\mathcal{K}^{\nu}_{D,G}(y)$ and $\mathcal{R}^{\nu}_{D,G}(y)$ are the power series in (4.11) with $D$ in (4.12) and $G$ in (5.16). Therefore, $\mathcal{R}^{\nu}_{D,G}(y)$ is a polynomial of $y$, and its coefficients are identified with the $\mathfrak{g}_0$-weight multiplicities of the tensor product $\bigotimes_{(a,m) \in H} W_m^{(a)}(\zeta_m^{(a)})^{\otimes \nu_m^{(a)}}$, where $\zeta_m^{(a)}$ are arbitrary.
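For $r = 1$, the passage from (5.15) to (5.16) is the product $G = G'D$ with $D$ as in (4.12), which acts on the index $k$ as a second difference. The sketch below checks the three $r = 1$ cases of (5.16) on a finite range of $(m, k)$. The function names are illustrative, the sample values of $A_{ba}$ are arbitrary test integers rather than actual Cartan matrix data, and $A_{ab}$ is recovered from the symmetry $d_a A_{ab} = d_b A_{ba}$.

```python
from fractions import Fraction

def g_prime_r1(A_ba, d_a, d_b, m, k):
    """G'_{am,bk} = d_b A_ba min(m/d_b, k/d_a), the r = 1 case of (5.15)."""
    return d_b * A_ba * min(Fraction(m, d_b), Fraction(k, d_a))

def g_from_product(A_ba, d_a, d_b, m, k):
    """G_{am,bk} = (G'D)_{am,bk} with D as in (4.12)."""
    gp = lambda j: g_prime_r1(A_ba, d_a, d_b, m, j)
    return gp(k - 1) - 2 * gp(k) + gp(k + 1)

def g_closed_form(A_ba, d_a, d_b, m, k):
    """The r = 1 cases of (5.16); A_ab is recovered from d_a A_ab = d_b A_ba."""
    if d_b == 2 * d_a:
        weights = {2 * k - 1: 1, 2 * k: 2, 2 * k + 1: 1}
        return -A_ba * weights.get(m, 0)
    if d_b == 3 * d_a:
        weights = {3 * k - 2: 1, 3 * k - 1: 2, 3 * k: 3, 3 * k + 1: 2, 3 * k + 2: 1}
        return -A_ba * weights.get(m, 0)
    A_ab = Fraction(d_b * A_ba, d_a)
    return -A_ab * (1 if d_a * m == d_b * k else 0)
```

In particular, for $d_a = d_b$ the closed form gives $G_{am,am} = -A_{aa} = -2$, the value used in Remark 4.8.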
Canonical Solutions of the Q-Systems
5.4. Equivalence to Conjecture 1.1. Let $\Delta_+^{g}$ denote the set of all the positive roots of $g$. Originally, Conjecture 5.5 is formulated for $X_N^{(r)} \neq A_{2n}^{(2)}$ as follows (cf. Conjecture 1.1):

Conjecture 5.7 ([K1, K2, HKOTY, HKOTT]). For $X_N^{(r)} \neq A_{2n}^{(2)}$, the formula

$$Q^\nu(y) = \frac{K^\nu_{D,G}(y)}{\prod_{\alpha\in\Delta_+^{g_0}} (1 - e^{-\alpha})} \tag{5.20}$$

holds, where $K^\nu_{D,G}(y)$ is the power series in (4.11) with $D$ in (4.12) and $G$ in (5.16). Therefore, $K^\nu_{D,G}(y)$ is a polynomial of $y$, and its coefficients are identified with the multiplicities of the $g_0$-irreducible components of the tensor product $\bigotimes_{(a,m)\in H} W_m^{(a)}(\zeta_m^{(a)})^{\otimes \nu_m^{(a)}}$, where $\zeta_m^{(a)}$ are arbitrary.
Proof of the equivalence between Conjectures 5.6 and 5.7 for $X_N^{(r)} \neq A_{2n}^{(2)}$. Suppose that Conjecture 5.7 holds. Then, setting $\nu = 0$ in (5.20), we have

$$K^0_{D,G}(y) = \prod_{\alpha\in\Delta_+^{g_0}} (1 - e^{-\alpha}). \tag{5.21}$$

Therefore, $Q^\nu(y) = K^\nu_{D,G}(y)/K^0_{D,G}(y)$ holds. Conversely, suppose that the family of the normalized $g_0$-characters $(Q_m^{(a)}(y))_{(a,m)\in H}$ is the canonical solution of (4.14). Then, the equality (5.21) follows from Proposition 4.9 and the lemma below.

Lemma 5.8. Let $g$ be a complex simple Lie algebra of rank $n$, and $\alpha_a$ and $\Lambda_a$ be the simple roots and the fundamental weights of $g$. We set $x_a = e^{\Lambda_a}$, $y_a = e^{-\alpha_a/k_a}$, where $k_a$ ($a = 1, \dots, n$) are 1 or 2. Suppose that $f_a(y)$ ($a = 1, \dots, n$) are polynomials of $y$ with the unit constant terms such that $f_a(x) = x_a f_a(y(x))$ are invariant under the action of the Weyl group of $g$. Then,

$$\det_{1\le a,b\le n}\Bigl(\frac{\partial f_a}{\partial x_b}(x)\Bigr) = \prod_{\alpha\in\Delta_+^{g}} (1 - e^{-\alpha}). \tag{5.22}$$

Proof. The proof is the same as the one for Lemma 8.6 in [HKOTY].

In the case $A_{2n}^{(2)}$, (5.21) does not hold under Conjecture 5.6, because the assumption in Lemma 5.8 is not satisfied by (5.17). We treat the case $A_{2n}^{(2)}$ separately below.
A. Kuniba, T. Nakanishi, Z. Tsuboi
Fig. 2. The Dynkin diagram of $A_{2n}^{(2)}$. The upper and lower labels respect the subalgebras $B_n$ and $C_n$, respectively.

5.5. The $A_{2n}^{(2)}$ case.
5.5.1. The $B_n$-character. For $A_{2n}^{(2)}$, $g_0 = B_n$. Let $\{1, \dots, n\}$ label $I_\sigma$ as the upper label in Fig. 2. Accordingly, $C_a = 1$ for $a = 1, \dots, n-1$, and $C_a = 2$ for $a = n$. We continue to set $y_a = e^{-\alpha_a}$ as in Sect. 5.3. We will show later, in (5.34) and (5.36), that under Conjecture 5.5 the following formula holds instead of the formula (5.21):

$$K^0_{D,G}(y) = \prod_{a=1}^{n}\Bigl(1 + \prod_{k=a}^{n} y_k\Bigr) \prod_{\alpha\in\Delta_+^{B_n}} (1 - e^{-\alpha}). \tag{5.23}$$

Therefore, Conjecture 5.5 for $X_N^{(r)} = A_{2n}^{(2)}$ is equivalent to

Conjecture 5.9. For $X_N^{(r)} = A_{2n}^{(2)}$, the formula

$$Q^\nu(y) = \frac{K^\nu_{D,G}(y) \prod_{a=1}^{n}\bigl(1 + \prod_{k=a}^{n} y_k\bigr)^{-1}}{\prod_{\alpha\in\Delta_+^{B_n}} (1 - e^{-\alpha})} \tag{5.24}$$

holds for the normalized $B_n$-characters of the KR-modules.
5.5.2. The $C_n$-character. As is well-known, $U_q(A_{2n}^{(2)})$ has a realization with the “Chevalley generators” $X_a^{\pm}$ and $K_a^{\pm 1}$ ($a = 0, \dots, n$) (e.g. [CP3, Proposition 1.1]). Among them, $X_a^{\pm}$ and $K_a^{\pm 1}$ ($a = 1, \dots, n$) are identified with $X_{a0}^{\pm}$, $K_a^{\pm 1}$ in (5.4)–(5.8), and generate the subalgebra $U_q(B_n)$. On the other hand, $X_a^{\pm}$ and $K_a^{\pm 1}$ ($a = 0, \dots, n-1$) generate the subalgebra $U_{q^2}(C_n)$. See Fig. 2. If we set $K_a = q_a^{H_a}$ ($a = 0, \dots, n-1$), where $q_0 = q^{d_0}$, $d_0 = 4$, then $X_a^{\pm}$ and $H_a$ ($a = 0, \dots, n-1$) generate the Lie algebra $C_n$ in the limit $q \to 1$. This provides $W_m^{(a)}(\zeta)$ with the $C_n$-module structure, by which the $C_n$-character of $W_m^{(a)}(\zeta)$ is defined.

Let $\dot\alpha_a$ and $\dot\Lambda_a$ ($a = 1, \dots, n$) be the simple roots and the fundamental weights labeled with the lower label in Fig. 2. By looking at the same $U_q$-module as $B_n$ and $C_n$-modules as above, a linear bijection $\phi : h^* \to \dot h^*$ is induced, where $h^*$ and $\dot h^*$ are the duals of the Cartan subalgebras of $B_n$ and $C_n$, respectively.

Lemma 5.10. Under the bijection $\phi$, we have the correspondence ($\dot\Lambda_0 = 0$):

$$C_a\Lambda_a \mapsto \dot\Lambda_{n-a} - \dot\Lambda_n, \tag{5.25}$$

$$\alpha_a \mapsto \begin{cases} \dot\alpha_{n-a} & a = 1, \dots, n-1 \\ -(\dot\alpha_1 + \cdots + \dot\alpha_{n-1} + \tfrac{1}{2}\dot\alpha_n) & a = n. \end{cases} \tag{5.26}$$
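Proof. It is obtained from the relations among $H_i$ and $\alpha_i$ for $A_{2n}^{(2)}$ [Kac]:

$$0 = c = \sum_{i=0}^{n} a_i^{\vee} H_i, \qquad 0 = \delta = \sum_{i=0}^{n} a_i \alpha_i, \tag{5.27}$$

where $(a_0^{\vee}, \dots, a_n^{\vee}) = (2, \dots, 2, 1)$ and $(a_0, \dots, a_n) = (1, 2, \dots, 2)$ for the upper label in Fig. 2.

Let $W(X_n)$ denote the Weyl group of $X_n$.

Lemma 5.11. There is an element $s \in W(C_n)$ which acts on $\dot h^*$ as follows:

$$\phi(C_a\Lambda_a) \mapsto \dot\Lambda_a \quad (a = 1, \dots, n), \tag{5.28}$$

$$\phi(\alpha_a) \mapsto \frac{1}{C_a}\dot\alpha_a \quad (a = 1, \dots, n). \tag{5.29}$$

Proof. We take the standard orthonormal basis $\varepsilon_a$ of $\dot h^*$. Let $s$ be the element such that $s : \varepsilon_a \mapsto -\varepsilon_{n-a+1}$. Then,

$$\dot\Lambda_{n-a} - \dot\Lambda_n = -(\varepsilon_{n-a+1} + \cdots + \varepsilon_n) \mapsto \varepsilon_1 + \cdots + \varepsilon_a = \dot\Lambda_a, \tag{5.30}$$

$$\dot\alpha_{n-a} = \varepsilon_{n-a} - \varepsilon_{n-a+1} \mapsto \varepsilon_a - \varepsilon_{a+1} = \dot\alpha_a \quad (a = 1, \dots, n-1), \tag{5.31}$$

$$-(\dot\alpha_1 + \cdots + \dot\alpha_{n-1} + \tfrac{1}{2}\dot\alpha_n) = -\varepsilon_1 \mapsto \varepsilon_n = \tfrac{1}{2}\dot\alpha_n. \tag{5.32}$$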
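The relations (5.30)–(5.32) can be checked mechanically in the orthonormal basis. The following Python sketch is an illustration only, not part of the original argument. It assumes the standard realization of $C_n$ (simple roots $\varepsilon_a - \varepsilon_{a+1}$ for $a < n$ and $2\varepsilon_n$, fundamental weights $\varepsilon_1 + \cdots + \varepsilon_a$); the function name `check_lemma_5_11` is hypothetical.

```python
from fractions import Fraction


def check_lemma_5_11(n):
    """Verify (5.28)-(5.29) for C_n, using the images stated in (5.25)-(5.26).

    Vectors are coefficient lists in the basis eps_1, ..., eps_n.
    """
    zero = [Fraction(0)] * n

    def eps(i):  # i is 1-based
        v = [Fraction(0)] * n
        v[i - 1] = Fraction(1)
        return v

    def add(u, v):
        return [p + q for p, q in zip(u, v)]

    def scale(c, v):
        return [c * p for p in v]

    def weight(a):  # fundamental weight of C_n (assumed realization), weight(0) = 0
        v = zero
        for i in range(1, a + 1):
            v = add(v, eps(i))
        return v

    def root(a):  # simple root of C_n (assumed realization)
        if a < n:
            return add(eps(a), scale(-1, eps(a + 1)))
        return scale(2, eps(n))

    def s(v):  # the map eps_a -> -eps_{n-a+1}
        out = [Fraction(0)] * n
        for a in range(1, n + 1):
            out[n - a] -= v[a - 1]
        return out

    for a in range(1, n + 1):
        c_a = 2 if a == n else 1
        # (5.25): phi(C_a Lambda_a) = weight(n-a) - weight(n)
        phi_weight = add(weight(n - a), scale(-1, weight(n)))
        if s(phi_weight) != weight(a):
            return False
        # (5.26): phi(alpha_a)
        if a < n:
            phi_root = root(n - a)
        else:
            total = scale(Fraction(1, 2), root(n))
            for i in range(1, n):
                total = add(total, root(i))
            phi_root = scale(-1, total)
        if s(phi_root) != scale(Fraction(1, c_a), root(a)):
            return False
    return True
```

Calling `check_lemma_5_11(n)` for a few small ranks confirms the stated images under these assumptions; it does not check that $s$ belongs to $W(C_n)$, which holds because $s$ is a signed permutation of the $\varepsilon_a$.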
According to (5.30)–(5.32), we set

$$x_a = e^{\dot\Lambda_a}, \qquad y_a = e^{-\dot\alpha_a/C_a}. \tag{5.33}$$

Then, the relation (5.18) is preserved, since $\phi$ and $s$ above are linear. Lemma 5.11 assures that the following definition is well-defined.

Definition 5.12. Let $Q_m^{(a)}(x)$ be the Laurent polynomial of $x = (x_a)_{a=1}^n$ representing the $C_n$-character of the KR module $W_m^{(a)}(\zeta)$. Then, $Q_m^{(a)}(y) := (x_a)^{-m} Q_m^{(a)}(x)|_{x=x(y)}$ is a polynomial of $y = (y_a)_{a=1}^n$ with the unit constant term. We call $Q_m^{(a)}(y)$ the normalized $C_n$-character of $W_m^{(a)}(\zeta)$.

Moreover, by Lemma 5.11 and the $W(C_n)$-invariance of the $C_n$-character of $W_m^{(a)}(\zeta)$, we have

Proposition 5.13. The normalized $B_n$-character and the normalized $C_n$-character of $W_m^{(a)}(\zeta)$ of $U_q(A_{2n}^{(2)})$ coincide as polynomials of $y$.

Thus, Conjecture 5.5 for the normalized $B_n$-characters of $A_{2n}^{(2)}$ is applied for the normalized $C_n$-characters as well. Furthermore, in contrast to the $B_n$ case, Lemma 5.8 is now applicable for (5.33). Therefore, under Conjecture 5.5, we have

$$K^0_{D,G}(y) = \prod_{\alpha\in\Delta_+^{C_n}} (1 - e^{-\alpha}). \tag{5.34}$$

Hence, we conclude that Conjecture 5.5 for $X_N^{(r)} = A_{2n}^{(2)}$ is also equivalent to
Conjecture 5.14 ([HKOTT]). For $X_N^{(r)} = A_{2n}^{(2)}$, the formula

$$Q^\nu(y) = \frac{K^\nu_{D,G}(y)}{\prod_{\alpha\in\Delta_+^{C_n}} (1 - e^{-\alpha})} \tag{5.35}$$

holds for the normalized $C_n$-characters of the KR-modules, where $y$ is specified as (5.33).

The following relation is easily derived from the explicit expressions of the Weyl denominators of $B_n$ and $C_n$ (e.g. [FH]):

$$\prod_{\alpha\in\Delta_+^{C_n}} (1 - e^{-\alpha}) = \prod_{a=1}^{n}\Bigl(1 + \prod_{k=a}^{n} y_k\Bigr) \prod_{\alpha\in\Delta_+^{B_n}} (1 - e^{-\alpha}), \tag{5.36}$$

where the equality holds under the following identifications: $y_a = e^{-\dot\alpha_a/C_a}$ for the LHS and $y_a = e^{-\alpha_a}$ for the RHS under the label in Fig. 2. From (5.34) and (5.36), we obtain (5.23).
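The identity (5.36) can be tested with exact rational arithmetic. The sketch below is illustrative and rests on assumptions not spelled out in the text: the standard realizations of the root systems (positive roots $\varepsilon_i \pm \varepsilon_j$ for $i < j$, together with $\varepsilon_i$ for $B_n$ and $2\varepsilon_i$ for $C_n$), and sample values $t_i$ standing for $e^{-\varepsilon_i}$. Under both identifications named after (5.36) one then has $y_k = t_k/t_{k+1}$ for $k < n$ and $y_n = t_n$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def weyl_denominators(t):
    """Return the products over positive roots of (1 - e^{-alpha}) for B_n and C_n.

    t[i] stands for e^{-eps_{i+1}}; all entries must be nonzero.
    """
    common = Fraction(1)
    for i, j in combinations(range(len(t)), 2):
        # roots eps_i - eps_j and eps_i + eps_j, shared by B_n and C_n
        common *= (1 - t[i] / t[j]) * (1 - t[i] * t[j])
    b_n = common
    c_n = common
    for t_i in t:
        b_n *= 1 - t_i        # short roots eps_i of B_n
        c_n *= 1 - t_i ** 2   # long roots 2 eps_i of C_n
    return b_n, c_n


def check_5_36(t):
    """Compare both sides of (5.36) at the sample point t."""
    n = len(t)
    y = [t[k] / t[k + 1] for k in range(n - 1)] + [t[n - 1]]
    factor = Fraction(1)
    for a in range(n):
        tail = Fraction(1)
        for k in range(a, n):
            tail *= y[k]      # equals t[a] after the loop
        factor *= 1 + tail
    b_n, c_n = weyl_denominators(t)
    return c_n == factor * b_n
```

A call such as `check_5_36([Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)])` evaluates both sides at one rational point. Agreement at sample points is a sanity check, not a proof.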
5.6. Characters for the rank $n$ subalgebras. The procedure to deduce the $C_n$-characters from the $B_n$-characters for $A_{2n}^{(2)}$ in Sect. 5.5 is also applicable to the $\dot g$-characters for any rank $n$ subalgebra $\dot g \neq g_0$ of $X_N^{(r)}$. (The characters of the lower rank subalgebras are obtained by their specializations.) Let us demonstrate how it works in two examples:

Case I. $X_N^{(r)} = B_n^{(1)}$, $g_0 = B_n$, $\dot g = D_n$.

Case II. $X_N^{(r)} = A_{2n-1}^{(2)}$, $g_0 = C_n$, $\dot g = D_n$.

Let $\alpha_a$ and $\Lambda_a$ (resp. $\dot\alpha_a$ and $\dot\Lambda_a$) ($a = 1, \dots, n$) be the simple roots and the fundamental weights of $g_0$ (resp. $\dot g$) labeled with the upper (resp. lower) label in Fig. 3. As in Sect. 5.5, a linear bijection $\phi : h^* \to \dot h^*$ is induced, where $h^*$ and $\dot h^*$ are the duals of the Cartan subalgebras of $g_0$ and $\dot g$, respectively.

Fig. 3. The Dynkin diagrams of $B_n^{(1)}$ and $A_{2n-1}^{(2)}$. The upper and lower labels respect the subalgebras $g_0$ and $\dot g$, respectively.
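Doing a similar calculation to Lemmas 5.10 and 5.11, we have

Lemma 5.15. Under the bijection $\phi$, we have the correspondence ($\dot\Lambda_0 = 0$):

Case I.

$$\Lambda_a \mapsto \begin{cases} \dot\Lambda_{n-a} - \dot\Lambda_n & a = 1, n \\ \dot\Lambda_{n-a} - 2\dot\Lambda_n & a = 2, \dots, n-1, \end{cases} \tag{5.37}$$

$$\alpha_a \mapsto \begin{cases} \dot\alpha_{n-a} & a = 1, \dots, n-1 \\ -\tfrac{1}{2}(2\dot\alpha_1 + \cdots + 2\dot\alpha_{n-2} + \dot\alpha_{n-1} + \dot\alpha_n) & a = n. \end{cases} \tag{5.38}$$

Case II.

$$\Lambda_a \mapsto \begin{cases} \dot\Lambda_{n-1} - \dot\Lambda_n & a = 1 \\ \dot\Lambda_{n-a} - 2\dot\Lambda_n & a = 2, \dots, n, \end{cases} \tag{5.39}$$

$$\alpha_a \mapsto \begin{cases} \dot\alpha_{n-a} & a = 1, \dots, n-1 \\ -(2\dot\alpha_1 + \cdots + 2\dot\alpha_{n-2} + \dot\alpha_{n-1} + \dot\alpha_n) & a = n. \end{cases} \tag{5.40}$$

Lemma 5.16. There is an element $s \in W(D_n)$ which acts on $\dot h^*$ as follows:

Case I.

$$\phi(\Lambda_a) \mapsto \begin{cases} \dot\Lambda_a & a = 1, \dots, n-2, n \\ \dot\Lambda_{n-1} + \dot\Lambda_n & a = n-1, \end{cases} \tag{5.41}$$

$$\phi(\alpha_a) \mapsto \begin{cases} \dot\alpha_a & a = 1, \dots, n-1 \\ \tfrac{1}{2}(-\dot\alpha_{n-1} + \dot\alpha_n) & a = n. \end{cases} \tag{5.42}$$

Case II.

$$\phi(\Lambda_a) \mapsto \begin{cases} \dot\Lambda_a & a = 1, \dots, n-2 \\ \dot\Lambda_{n-1} + \dot\Lambda_n & a = n-1 \\ 2\dot\Lambda_n & a = n, \end{cases} \tag{5.43}$$

$$\phi(\alpha_a) \mapsto \begin{cases} \dot\alpha_a & a = 1, \dots, n-1 \\ -\dot\alpha_{n-1} + \dot\alpha_n & a = n. \end{cases} \tag{5.44}$$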
Accordingly, we set

Case I.

$$x_a = e^{\dot\Lambda_a} \ (a = 1, \dots, n-2, n), \quad e^{\dot\Lambda_{n-1} + \dot\Lambda_n} \ (a = n-1), \tag{5.45}$$

$$y_a = e^{-\dot\alpha_a} \ (a = 1, \dots, n-1), \quad e^{(\dot\alpha_{n-1} - \dot\alpha_n)/2} \ (a = n). \tag{5.46}$$

Case II.

$$x_a = e^{\dot\Lambda_a} \ (a = 1, \dots, n-2), \quad e^{\dot\Lambda_{n-1} + \dot\Lambda_n} \ (a = n-1), \quad e^{2\dot\Lambda_n} \ (a = n), \tag{5.47}$$

$$y_a = e^{-\dot\alpha_a} \ (a = 1, \dots, n-1), \quad e^{\dot\alpha_{n-1} - \dot\alpha_n} \ (a = n). \tag{5.48}$$

Then, the relation (5.18) is preserved. Define the $\dot g$-characters of $W_m^{(a)}(\zeta)$ in the same way as Definition 5.12. Then, the normalized $g_0$-character and the normalized $\dot g$-character of $W_m^{(a)}(\zeta)$ coincide as polynomials of $y$. Thus, Conjecture 5.5 for the normalized $g_0$-characters is applied for the normalized $\dot g$-characters as well. So far, the situation is parallel to the $C_n$ case for $A_{2n}^{(2)}$. From now on, the situation is parallel to the $B_n$ case for $A_{2n}^{(2)}$. The following relations are easily derived from the explicit expressions of the Weyl denominators for $B_n$, $C_n$, $D_n$:
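$$\prod_{\alpha\in\Delta_+^{B_n}} (1 - e^{-\alpha}) = \prod_{a=1}^{n}\Bigl(1 - \prod_{k=a}^{n} y_k\Bigr) \prod_{\alpha\in\Delta_+^{D_n}} (1 - e^{-\alpha}), \tag{5.49}$$

$$\prod_{\alpha\in\Delta_+^{C_n}} (1 - e^{-\alpha}) = \prod_{a=1}^{n}\Bigl(1 - y_n^{-1}\prod_{k=a}^{n} y_k^2\Bigr) \prod_{\alpha\in\Delta_+^{D_n}} (1 - e^{-\alpha}), \tag{5.50}$$

where the equalities hold under the following identifications: (5.17) for the LHSs, (5.46) for the RHS of (5.49), (5.48) for the RHS of (5.50) under the label in Fig. 3. We conclude that Conjecture 5.5 for $B_n^{(1)}$ and $A_{2n-1}^{(2)}$ is equivalent to

Conjecture 5.17. (i) For $B_n^{(1)}$, the formula

$$Q^\nu(y) = \frac{K^\nu_{D,G}(y) \prod_{a=1}^{n}\bigl(1 - \prod_{k=a}^{n} y_k\bigr)^{-1}}{\prod_{\alpha\in\Delta_+^{D_n}} (1 - e^{-\alpha})} \tag{5.51}$$

holds for the $D_n$-characters of the KR-modules, where $y$ is specified as (5.46).

(ii) For $A_{2n-1}^{(2)}$, the formula

$$Q^\nu(y) = \frac{K^\nu_{D,G}(y) \prod_{a=1}^{n}\bigl(1 - y_n^{-1}\prod_{k=a}^{n} y_k^2\bigr)^{-1}}{\prod_{\alpha\in\Delta_+^{D_n}} (1 - e^{-\alpha})} \tag{5.52}$$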
holds for the $D_n$-characters of the KR-modules, where $y$ is specified as (5.48).

The manifest polynomial expressions of the numerators in the RHSs of (5.24), (5.51), and (5.52) for $Q_m^{(a)}(y)$ are available in [HKOTT] with some other examples.

5.7. Related works. Below we list the related works on Conjectures 1.1 and 5.5–5.7 mostly chronologically. However, the list is by no means complete. The series $K^\nu_{D,G}(y)$ in (5.20) admits a natural q-analogue called the fermionic formula. This is another fascinating subject, but we do not cover it here. See [BS, HKOTY, HKOTT] and references therein.

It is convenient to refer to the formula (5.20) with the binomial coefficient (2.9) as type I, and the ones with the binomial coefficient in Remark 1.3 as type II. (In the context of the XXX-type integrable spin chains, $N_m^{(a)}$ and $P_m^{(a)}$ represent the numbers of $m$-strings and $m$-holes of color $a$, respectively. Therefore one must demand $P_m^{(a)} \geq 0$, which implies that the relevant formulae are necessarily of type II.) The manifest expression of the decomposition of $Q_m^{(a)}$ such as

$$Q_1^{(2)} = \chi(\Lambda_2) + \chi(\Lambda_5) \tag{5.53}$$

is referred to as type III, where $\chi(\lambda)$ is the character of the irreducible $X_n$-module $V(\lambda)$ with highest weight $\lambda$. Since there is no essential distinction between these conjectured formulae for $Y(X_n)$ and $U_q(X_n^{(1)})$, we simply refer to both cases as $X_n$ below. At this moment, however, the proofs should be separately given for the nonsimply-laced case [V].

0 [Be]. Bethe solved the XXX spin chain of length $N$ by inventing what is later known as the Bethe ansatz and the string hypothesis. As a check of the completeness of his eigenvectors for the XXX Hamiltonian, he proved, in our terminology, the type II formula of $Q^\nu(y)$ with $\nu_m^{(1)} = N\delta_{m1}$ for $A_1$. See [F, FT] for a readable exposition in the framework of the quantum inverse scattering method.

1 [K1, K2]. Kirillov proposed and proved the type I formula of the irreducible modules $V(m\Lambda_a)$ for $A_1$ [K1] and $A_n$ [K2]. The idea of the use of the generating function and the Q-system, which is extended in the present paper, originates in this work.

2 [KKR]. Kerov et al. proposed and proved the type II formula for $A_n$ by the combinatorial method, where the bijection between the Littlewood-Richardson tableaux and the rigged configurations was constructed.

3 [D1]. Drinfeld claimed that $V(m\Lambda_a)$ can be lifted to a $Y(X_n)$-module, if the Kac label for $\alpha_a$ in $X_n^{(1)}$ is 1. These modules are often called the evaluation modules, and identified with some KR-modules. A method of proof is given in [C] for $U_q(X_n^{(1)})$. Therefore, the type III formula $Q_m^{(a)} = \chi(m\Lambda_a)$ holds for those $a$. Some of the corresponding R-matrices for the classical algebras, $X_n = A_n, B_n, C_n, D_n$, were obtained earlier in [KRS, R] by the reproduction scheme (also known as the fusion procedure) in the context of the algebraic Bethe ansatz method.

4 [OW]. Ogievetsky and Wiegmann proposed the type III formula of $Q_1^{(a)}$ for some $a$ for the exceptional algebras from the reproduction scheme.

5 [KR, K3].
Kirillov and Reshetikhin formulated the type II formula for any simple Lie algebra $X_n$. For that purpose, they vaguely introduced a family of $Y(X_n)$-modules, which we identify with the KR modules here. They proposed the type II formula for any $X_n$, and the Q-system and the type III formula for $X_n = B_n, C_n, D_n$. The Q-system for exceptional algebras $X_n$ was also proposed in [K3]. Due to the long-term absence of the proofs of the announced results by the authors, we regard these statements as conjectures at our discretion in this paper. See Remark 5.18 for a further remark.

Remark 5.18. Let $X_n = B_n, C_n, D_n$. Let $Q_m^{(a)}(x)$ and $Q_m^{(a)}(y)$ be the $X_n$-character and the normalized $X_n$-character of the “KR module” proposed in [KR]. Then, one can organize the conjectures in [KR] as follows:

(i) All $Q_m^{(a)}$'s are given by the type III formula in [KR].

(ii) The family $(Q_m^{(a)})_{(a,m)\in H}$ satisfies the Q-system (4.22) for $X_n$, and $Q_1^{(a)}$'s ($a = 1, \dots, n$) are given by the type III formula in [KR]. (Note that the Q-system (4.22), or equivalently (1.4), recursively determines all $Q_m^{(a)}$'s from the initial data $(Q_1^{(a)})_{a=1}^n$.)

(iii) Any power $Q^\nu$ is given by the type II formula.

As stated in [KR], one can certainly show the equivalence between (i) and (ii) without referring to the KR-modules themselves. See [HKOTY]. One can also confirm the equivalence between (i) and a weak version of (iii),

(iii’) All $Q_m^{(a)}$'s are given by the type II formula.

See [Kl] and Appendix A in [HKOTY]. The family $(Q_m^{(a)})_{(a,m)\in H}$ given by (i) satisfies the convergence property (4.15). Thus, (i), (ii), and (iii’) are all equivalent to

(iv) The family $(Q_m^{(a)})_{(a,m)\in H}$ is the canonical solution of the Q-system (1.4).

Therefore, as shown in Section 5.4 (also [KN2]), they are also equivalent to

(v) Any power $Q^\nu$ is given by the type I formula (1.1).

This is why we call Conjecture 1.1 the Kirillov-Reshetikhin conjecture. The equivalence between (iii) and the others has not been proved yet as we mentioned in Remark 1.3.
6 [CP1, CP2]. Chari and Pressley proved the type III formula of $Q_1^{(a)}$ in most cases for $Y(X_n)$ [CP1], and for $U_q(X_n^{(1)})$ [CP2], where the list is complete except for $E_7$ and $E_8$.

7 [Ku]. The type III formula of $Q_m^{(a)}$ was proposed for some $a$ for the exceptional algebras.

8 [Kl]. Kleber analyzed a combinatorial structure of the type II formula for the simply-laced algebras. In particular, it was proved that the type III formula of $Q_m^{(a)}$ and the corresponding type II formula are equivalent for $A_n$ and $D_n$.

9 [HKOTY, HKOTT]. Hatayama et al. gave a characterization of the type I formula as the solution of the Q-system which are $\mathbb{C}$-linear combinations of the $X_n$-characters with the property equivalent to the convergence property (4.15). Using it, the equivalence of the type III formula of $Q_m^{(a)}$ and the type I formula of $Q^\nu(y)$ for the classical algebras was shown [HKOTY]. In [HKOTT], the type I and type II formulae, and the Q-systems for the twisted algebras $U_q(X_N^{(r)})$ were proposed. The type III formula of $Q_m^{(a)}$ for $A_{2n}^{(2)}$, $A_{2n-1}^{(2)}$, $D_{n+1}^{(2)}$, $D_4^{(3)}$ was also proposed, and the equivalence to the type I formula was shown in a similar way to the untwisted case.

10 [KN1, KN2]. The second formula in Conjecture 5.6 was proposed and proved for $A_1$ [KN1] from the formal completeness of the XXZ-type Bethe vectors. The same formula was proposed for $X_n$, and the equivalence to the type I formula was proved [KN2]. The type I formula is formulated in the form (5.19), and the characterization of type I formula in [HKOTY] was simplified as the solution of the Q-system with the convergence property (4.15).

11 [C]. Chari proved the type III formula of $Q_m^{(a)}$ for $U_q(X_n^{(1)})$ for any $a$ for the classical algebras, and for some $a$ for the exceptional algebras.

12 [OSS]. Okado et al. constructed bijections between the rigged configurations and the crystals (resp. virtual crystals) corresponding to $Q^\nu(y)$, with $\nu_m^{(a)} = 0$ for $m > 1$, for $C_n^{(1)}$ and $A_{2n}^{(2)}$ (resp. $D_{n+1}^{(2)}$). As a corollary, the type II formula of those $Q^\nu(y)$ was proved for $C_n^{(1)}$ and $A_{2n}^{(2)}$.

Assembling all the above results and the implications among them, let us summarize the current status of the Kirillov-Reshetikhin conjectures into the following theorem. Here, we mention the results only for the quantum affine algebra $U_q(X_N^{(r)})$ case. Also, we exclude the isolated results only valid for small $m$.
Theorem 5.19. (i) Conjecture 5.5 and the type I formula of $Q^\nu(y)$ are valid for $A_n^{(1)}$, $B_n^{(1)}$, $C_n^{(1)}$, $D_n^{(1)}$.

(ii) The type II formula of $Q^\nu(y)$ is valid for $A_n^{(1)}$ and valid for those $\nu$ with $\nu_m^{(a)} = 0$ for $m > 1$ for $C_n^{(1)}$ and $A_{2n}^{(2)}$. The type II formula of $Q_m^{(a)}(y)$ is valid for the following $a$ in [C]: any $a$ for $B_n^{(1)}$, $C_n^{(1)}$, $D_n^{(1)}$; $a = 1, 6$ for $E_6^{(1)}$; $a = 7$ for $E_7^{(1)}$.

(iii) The type III formula of $Q_m^{(a)}$ is valid for all $a$ for $A_n^{(1)}$, $B_n^{(1)}$, $C_n^{(1)}$, $D_n^{(1)}$, and for those $a$ listed in [C] for $E_6^{(1)}$, $E_7^{(1)}$, $E_8^{(1)}$, $F_4^{(1)}$, $G_2^{(1)}$. The formula is found in [C].

A. The Denominator Formulae

We give a proof of Proposition 4.9. The proof is divided into three steps.

A.1. Step 1. Reduction of the denominator formula. In Steps 1 and 2, we consider the unspecialized infinite Q-system (4.8), and we assume that $D$ and $G$ satisfy the condition (KR-II) in Definition 4.5.

For a given positive integer $L$, let $H_L = \{1, \dots, n\} \times \{1, \dots, L\}$ be the finite subset of $H$ in Sect. 4.1. With multivariables $v_L = (v_m^{(a)})_{(a,m)\in H_L}$, $w_L = (w_m^{(a)})_{(a,m)\in H_L}$, $z_L = (z_m^{(a)})_{(a,m)\in H_L}$, we define the bijection $v_L \to w_L$ around $v = w = 0$ (cf. (2.1)) by

$$w_m^{(a)}(v_L) = v_m^{(a)} \prod_{(b,k)\in H_L} (1 - v_k^{(b)})^{-G_{am,bk}}, \tag{A.1}$$
and the bijection $v_L \to z_L$ around $v = z = 0$ by

$$z_m^{(a)}(v_L) = w_m^{(a)}(v_L) \prod_{(b,k)\in H_L} (1 - v_k^{(b)})^{g_{ab} m} \tag{A.2}$$

$$\phantom{z_m^{(a)}(v_L)} = v_m^{(a)} \prod_{(b,k)\in H_L} (1 - v_k^{(b)})^{-G_{am,bk} + g_{ab} m}, \tag{A.3}$$

where $g_{ab}$ is the one in (KR-II). Let us factorize the bijection $w_L \to v_L$ as $w_L \to z_L \to v_L$. The map $w_L \to z_L$ is described as

$$z_m^{(a)}(w_L) = w_m^{(a)} \prod_{b=1}^{n} (Q_b(w_L))^{-g_{ab} m}, \qquad Q_b(w_L) := \prod_{k=1}^{L} (1 - v_k^{(b)}(w_L))^{-1}. \tag{A.4}$$

By the assumption (KR-II) and the expression (A.3), the map $v_L \to z_L$ is lower-triangular in the sense of Example 2.9. Therefore, the following equality holds:

$$\det_{H_L}\Bigl(\frac{w_k^{(b)}}{z_m^{(a)}} \frac{\partial z_m^{(a)}}{\partial w_k^{(b)}}(w_L)\Bigr) = \det_{H_L}\Bigl(\frac{w_k^{(b)}}{v_m^{(a)}} \frac{\partial v_m^{(a)}}{\partial w_k^{(b)}}(w_L)\Bigr), \tag{A.5}$$

where $\det_{H_L}$ means the abbreviation of $\det_{(a,m),(b,k)\in H_L}$. We now simultaneously specialize the variables $w_L$ and $z_L$ with the variables $y = (y_a)_{a=1}^n$ and $u = (u_a)_{a=1}^n$ as (cf. (4.5))

$$w_m^{(a)} = w_m^{(a)}(y) = (y_a)^m, \qquad z_m^{(a)} = z_m^{(a)}(u) = (u_a)^m. \tag{A.6}$$

This specialization is compatible with (A.4) and the map $y \to u$,

$$u_a(y) = y_a \prod_{b=1}^{n} (q_b(y))^{-g_{ab}}, \qquad q_b(y) := Q_b(w_L(y)). \tag{A.7}$$
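Proposition A.1. Let $G_L = (G_{am,bk})_{(a,m),(b,k)\in H_L}$ be the $H_L$-truncation of $G$, $K^0_{I_L,G_L}(w_L)$ be the one in (2.34), and $K^0_{I_L,G_L}(y) := K^0_{I_L,G_L}(w_L(y))$ be its specialization by (A.6). Then, the formula (2.34) reduces to

$$K^0_{I_L,G_L}(y) = \det_{1\le a,b\le n}\Bigl(\frac{y_b}{u_a}\frac{\partial u_a}{\partial y_b}(y)\Bigr) \prod_{a=1}^{n} q_a(y). \tag{A.8}$$

Proof. Because of (A.5), it is enough to prove the equality

$$\det_{H_L}\Bigl(\frac{w_k^{(b)}}{z_m^{(a)}} \frac{\partial z_m^{(a)}}{\partial w_k^{(b)}}(w_L(y))\Bigr) = \det_{1\le a,b\le n}\Bigl(\frac{y_b}{u_a}\frac{\partial u_a}{\partial y_b}(y)\Bigr). \tag{A.9}$$

We remark that

$$y_a \frac{\partial}{\partial y_a} = \sum_{m=1}^{L} m w_m^{(a)} \frac{\partial}{\partial w_m^{(a)}}, \tag{A.10}$$

$$\det_{H_L}(\delta_{am,bk} + m\alpha_{abk}) = \det_{1\le a,b\le n}\Bigl(\delta_{ab} + \sum_{k=1}^{L} k\alpha_{abk}\Bigr), \tag{A.11}$$

where $\alpha_{abk}$ are arbitrary constants depending on $a$, $b$, $k$. Set

$$F_a(w_L) = \prod_{b=1}^{n} (Q_b(w_L))^{-g_{ab}}.$$

Then, (A.9) is obtained as

$$\begin{aligned}
(\mathrm{LHS}) &= \det_{H_L}\Bigl(\delta_{am,bk} + m w_k^{(b)} \frac{\partial}{\partial w_k^{(b)}} \log F_a(w_L(y))\Bigr) \\
&= \det_{1\le a,b\le n}\Bigl(\delta_{ab} + \sum_{k=1}^{L} k w_k^{(b)} \frac{\partial}{\partial w_k^{(b)}} \log F_a(w_L(y))\Bigr) \\
&= \det_{1\le a,b\le n}\Bigl(\delta_{ab} + y_b \frac{\partial}{\partial y_b} \log F_a(w_L(y))\Bigr) \\
&= \det_{1\le a,b\le n}\Bigl(\frac{y_b}{u_a}\frac{\partial u_a}{\partial y_b}(y)\Bigr),
\end{aligned}$$

where we used (A.4), (A.11), (A.10), and (A.7).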
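The determinant reduction (A.11) is the step that collapses the $nL \times nL$ determinant to an $n \times n$ one. It is an instance of the identity $\det(I + PA) = \det(I + AP)$ with $P_{(a,m),c} = m\delta_{ac}$ and $A_{c,(b,k)} = \alpha_{cbk}$. The sketch below checks it on random rational constants. It is illustrative only; the helper names `det` and `check_A11` and the choice of random test data are assumptions made here, not part of the paper.

```python
from fractions import Fraction
import random


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return result


def check_A11(n, L, seed=0):
    """Compare both sides of (A.11) for random constants alpha_{abk}."""
    rng = random.Random(seed)
    alpha = {
        (a, b, k): Fraction(rng.randint(-3, 3), rng.randint(1, 4))
        for a in range(n) for b in range(n) for k in range(1, L + 1)
    }
    index = [(a, m) for a in range(n) for m in range(1, L + 1)]
    # left-hand side: matrix indexed by H_L = {(a, m)}
    big = [
        [Fraction(int((a, m) == (b, k))) + m * alpha[a, b, k] for (b, k) in index]
        for (a, m) in index
    ]
    # right-hand side: n x n matrix with the sum over k carried out
    small = [
        [Fraction(int(a == b)) + sum(k * alpha[a, b, k] for k in range(1, L + 1))
         for b in range(n)]
        for a in range(n)
    ]
    return det(big) == det(small)
```

For instance, `check_A11(2, 3)` compares a $6 \times 6$ determinant with a $2 \times 2$ one. Exact fractions are used so that the comparison is not affected by rounding.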
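A.2. Step 2. Change of variables. We introduce the change of the variables $y$ and $u$ in (A.6) to $x = (x_a)_{a=1}^n$ and $q = (q_a)_{a=1}^n$ as

$$y_a(x) = \prod_{b=1}^{n} (x_b)^{-g_{ab}}, \qquad u_a(q) = \prod_{b=1}^{n} (q_b)^{-g_{ab}}. \tag{A.12}$$

Thus, if $f(y)$ is a power series of $y$, then $f(y(x))$ is a Laurent series of $x$ because of the assumption in (KR-II) that $g_{ab}$'s are integers. This specialization is compatible with (A.7) and the map $x \to q$,

$$q_a(x) = x_a q_a(y(x)). \tag{A.13}$$

Let us summarize all the maps and variables in a diagram:

$$\begin{array}{ccccc}
v & \xleftrightarrow{\ (A.2)\ } & z & \xleftrightarrow{\ (A.4)\ } & w \\
 & & \uparrow {\scriptstyle (A.6)} & & \uparrow {\scriptstyle (A.6)} \\
 & & u & \xleftrightarrow{\ (A.7)\ } & y \\
 & & \uparrow {\scriptstyle (A.12)} & & \uparrow {\scriptstyle (A.12)} \\
 & & q & \xleftrightarrow{\ (A.13)\ } & x
\end{array} \tag{A.14}$$

With these changes of variables, (A.8) becomes the Jacobian of $q(x)$:

Proposition A.2. Let $K^0_{I_L,G_L}(y)$ be the one in Proposition A.1, and let $K^0_{I_L,G_L}(x) := K^0_{I_L,G_L}(y(x))$. Then, the formula

$$K^0_{I_L,G_L}(x) = \det_{1\le a,b\le n}\Bigl(\frac{\partial q_a}{\partial x_b}(x)\Bigr) \tag{A.15}$$

holds.

Proof. By (A.12), we have

$$\det_{1\le a,b\le n}\Bigl(\frac{q_b}{u_a}\frac{\partial u_a}{\partial q_b}\Bigr) = \det_{1\le a,b\le n}\Bigl(\frac{x_b}{y_a}\frac{\partial y_a}{\partial x_b}\Bigr) = \det_{1\le a,b\le n}(-g_{ab}) \neq 0. \tag{A.16}$$

Using Proposition A.1, (A.13), and (A.16), we obtain

$$\begin{aligned}
K^0_{I_L,G_L}(x) &= \det_{1\le a,b\le n}\Bigl(\frac{y_b}{u_a}\frac{\partial u_a}{\partial y_b}(y(x))\Bigr) \prod_{a=1}^{n} q_a(y(x)) \\
&= \det_{1\le a,b\le n}\Bigl(\frac{x_b}{q_a}\frac{\partial q_a}{\partial x_b}(x)\Bigr) \prod_{a=1}^{n} q_a(y(x)) = \det_{1\le a,b\le n}\Bigl(\frac{\partial q_a}{\partial x_b}(x)\Bigr).
\end{aligned} \tag{A.17}$$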
A.3. Step 3. Denominator formula for the Q-systems for KR type. Now we are ready to prove Proposition 4.9; namely,

Proposition A.3. Let $K^0_{D,G}(x) := K^0_{D,G}(y(x))$, where $K^0_{D,G}(y)$ is the denominator in (4.11) for the Q-system of KR type (4.14). Then, the formula

$$K^0_{D,G}(x) = \det_{1\le a,b\le n}\Bigl(\frac{\partial Q_1^{(a)}}{\partial x_b}(x)\Bigr) \tag{A.18}$$

holds, where we set $Q_1^{(a)}(x) := x_a Q_1^{(a)}(y(x))$ for the canonical solution $(Q_m^{(a)}(y))_{(a,m)\in H}$ of (4.14).

Proof. We recall the following four facts:

Fact 1: By (3.33) and (4.6), we have

$$K^0_{D,G}(y) \equiv K^0_{I_L,G_L}(y) \mod J_L. \tag{A.19}$$

Fact 2: By Theorem 3.7 and the proof therein, the canonical solution $(Q_m^{(a)}(y))_{(a,m)\in H}$ of (4.14) and the solution $(\mathbf{Q}_m^{(a)}(y))_{(a,m)\in H}$ of the corresponding standard Q-system are related as

$$\mathbf{Q}_m^{(a)}(y) = \frac{Q_{m+1}^{(a)}(y)\, Q_{m-1}^{(a)}(y)}{(Q_m^{(a)}(y))^2}. \tag{A.20}$$

Fact 3: By Propositions 2.1, 3.4, and (4.6), the series $q_b(y)$ in (A.7) satisfies

$$q_a(y) \equiv \prod_{m=1}^{L} (\mathbf{Q}_m^{(a)}(y))^{-1} \mod J_L, \tag{A.21}$$

where $\mathbf{Q}_m^{(a)}(y)$ is the one in Fact 2. Note that $q_b(y)$ depends on $L$.

Fact 4: By the proof of Proposition 4.4, it holds that

$$Q_L^{(a)}(y) \equiv Q_{L+1}^{(a)}(y) \mod J_L. \tag{A.22}$$

Combining Facts 2–4, we immediately have $q_a(y) \equiv Q_1^{(a)}(y) \mod J_L$. Thus, $\lim_{L\to\infty} q_a(y) = Q_1^{(a)}(y)$ holds. Therefore, taking the limit $L \to \infty$ of (A.8) with the help of Fact 1, we obtain

$$K^0_{D,G}(y) = \det_{1\le a,b\le n}\Bigl(\frac{y_b}{U_a}\frac{\partial U_a}{\partial y_b}(y)\Bigr) \prod_{a=1}^{n} Q_1^{(a)}(y), \tag{A.23}$$

$$U_a(y) := y_a \prod_{b=1}^{n} (Q_1^{(b)}(y))^{-g_{ab}}. \tag{A.24}$$

The equality (A.18) is obtained from (A.23) in the same way as the proof of Proposition A.2.

Acknowledgement. We would like to thank V. Chari, G. Hatayama, A. N. Kirillov, M. Noumi, M. Okado, T. Takagi, and Y. Yamada for very useful discussions. We especially thank K. Aomoto for the discussion where we recognized the very close relation between the present work and his work, and also for pointing out the reference [G] to us.
References

[A] Aomoto, K.: Integral representations of quasi hypergeometric functions. In: Proc. of the International Workshop on Special Functions, Hong Kong, 1999, C. Dunkl et al. (eds), Singapore: World Scientific, pp. 1–15
[AI] Aomoto, K. and Iguchi, K.: On quasi-hypergeometric functions. Methods and Appli. of Anal. 6, 55–66 (1999)
[B] Berndt, B.C.: Ramanujan’s Notebooks, Part I. Berlin: Springer-Verlag
[Be] Bethe, H.A.: Zur Theorie der Metalle, I. Eigenwerte und Eigenfunktionen der linearen Atomkette. Z. Physik 71, 205–231 (1931)
[BS] Bouwknegt, P. and Schoutens, K.: Exclusion statistics in conformal field theory – generalized fermions and spinons for level-1 WZW theories. Nucl. Phys. B547, 501–537 (1999)
[C] Chari, V.: On the fermionic formula and the Kirillov–Reshetikhin conjecture. Intern. Math. Res. Notices 12, 629–654 (2001)
[CP1] Chari, V. and Pressley, A.: Fundamental representations of Yangians and singularities of R-matrices. J. reine angew. Math. 417, 87–128 (1991)
[CP2] Chari, V. and Pressley, A.: Quantum affine algebras and their representations, Canadian Math. Soc. Conf. Proc. 16, 59–78 (1995)
[CP3] Chari, V. and Pressley, A.: Twisted quantum affine algebras. Commun. Math. Phys. 196, 461–476 (1998)
[D1] Drinfeld, V.: Hopf algebras and the quantum Yang-Baxter equations. Sov. Math. Dokl. 32, 264–268 (1985)
[D2] Drinfeld, V.: A new realization of Yangians and quantum affine algebras. Sov. Math. Dokl. 36, 212–216 (1988)
[F] Faddeev, L.D.: Lectures on quantum inverse scattering method. In: Integrable systems (Tianjin, 1987), Nankai Lectures Math. Phys., Teaneck, NJ: World Sci. Publishing, 1990, pp. 23–70
[FT] Faddeev, L.D. and Takhtadzhyan, L.A.: The spectrum and scattering of excitations in the one-dimensional isotropic Heisenberg model. J. Sov. Math. 24, 241–246 (1984)
[FH] Fulton, W. and Harris, J.: Representation theory: a first course. New York: Springer-Verlag, 1991
[G] Gessel, I.M.: A combinatorial proof of the multivariable Lagrange inversion formula. J. Combin. Theory Ser. A 45, 178–195 (1987)
[HKOTT] Hatayama, G., Kuniba, A., Okado, M., Takagi, T. and Tsuboi, Z.: Paths, crystals and fermionic formulae, math.QA/0102113. To appear in Progr. in Math. Phys.
[HKOTY] Hatayama, G., Kuniba, A., Okado, M., Takagi, T. and Yamada, Y.: Remarks on fermionic formula. Contemporary Math. 248, 243–291 (1999)
[I] Iguchi, K.: Generalized Lagrange theorem and thermodynamics of a multispecies quasiparticle gas with mutual fractional exclusion statistics. Phys. Rev. B58, 6892–6911 (1998); Erratum: Phys. Rev. B 59, 10370 (1999)
[IA] Iguchi, K. and Aomoto, K.: Integral representation for the grand partition function in quantum statistical mechanics of exclusion statistics. Int. J. Mod. Phys.: B14, 485–506 (2000)
[J] Jing, N.-H.: On Drinfeld realization of quantum affine algebras. Ohio State Univ. Math. Res. Inst. Publ. 7, 195–206 (1998)
[Kac] Kac, V.G.: Infinite dimensional Lie algebras, 3rd edition. Cambridge: Cambridge University Press, 1990
[KKR] Kerov, S.V., Kirillov, A.N. and Reshetikhin, N.Yu.: Combinatorics, the Bethe ansatz and representations of the symmetric group. J. Sov. Math. 41, 916–924 (1988)
[K1] Kirillov, A.N.: Combinatorial identities and completeness of states for the Heisenberg magnet. J. Sov. Math. 30, 2298–3310 (1985)
[K2] Kirillov, A.N.: Completeness of states of the generalized Heisenberg magnet. J. Sov. Math. 36, 115–128 (1987)
[K3] Kirillov, A.N.: Identities for the Rogers dilogarithm function connected with simple Lie algebras. J. Sov. Math. 47, 2450–2459 (1989)
[KR] Kirillov, A.N. and Reshetikhin, N.Yu.: Representations of Yangians and multiplicity of occurrence of the irreducible components of the tensor product of representations of simple Lie algebras. J. Sov. Math. 52, 3156–3164 (1990)
[Kl] Kleber, M.: Combinatorial structure of finite-dimensional representations of Yangians: the simply-laced case. Int. Math. Res. Note 7, 187–201 (1997)
[KRS] Kulish, P., Reshetikhin, N.Yu. and Sklyanin, E.: Yang–Baxter equations and representation theory I. Lett. Math. Phys. 5, 393–403 (1981)
[Ku] Kuniba, A.: Thermodynamics of the $U_q(X_r^{(1)})$ Bethe ansatz system with q a root of unity. Nucl. Phys. B 389, 209–244 (1993)
[KN1] Kuniba, A. and Nakanishi, T.: The Bethe equation at q = 0, the Möbius inversion formula, and weight multiplicities: I. The sl(2) case. Prog. in Math. 191, 185–216 (2000)
[KN2] Kuniba, A. and Nakanishi, T.: The Bethe equation at q = 0, the Möbius inversion formula, and weight multiplicities: II. The $X_n$ case. math.QA/0008047. To appear in J. Algebra
[KNT] Kuniba, A., Nakanishi, T. and Tsuboi, Z.: The Bethe equation at q = 0, the Möbius inversion formula, and weight multiplicities: III. The $X_N^{(r)}$ case. Lett. Math. Phys. 59, 19–31 (2002)
[OW] Ogievetsky, E. and Wiegmann, P.: Factorized S-matrix and the Bethe ansatz for simple Lie groups. Phys. Lett. B 168, 360–366 (1986)
[OSS] Okado, M., Schilling, A. and Shimozono, M.: Virtual crystals and fermionic formulas of type $D_{n+1}^{(2)}$, $A_{2n}^{(2)}$, and $C_n^{(1)}$. math.QA/0105017
[R] Reshetikhin, N.Yu.: Integrable models of quantum 1-dimensional magnetics with O(n) and Sp(2k)-symmetry. Theoret. Math. Phys. 63, 555–569 (1985)
[S] Sutherland, B.: Quantum many-body problem in one dimension: Thermodynamics. J. Math. Phys. 12, 251–256 (1971)
[V] Varagnolo, M.: Quiver Varieties and Yangians. Lett. Math. Phys. 53, 273–283 (2000)
[W] Wu, Y.-S.: Statistical distribution for generalized ideal gas of fractional statistical particles. Phys. Rev. Lett. 73, 922–925 (1994)
Communicated by L. Takhtajan
Commun. Math. Phys. 227, 191 – 209 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Destruction of the Beating Effect for a Non-Linear Schrödinger Equation

Vincenzo Grecchi1, André Martinez1, Andrea Sacchetti2

1 Dipartimento di Matematica, Università degli Studi di Bologna, Piazza di Porta S. Donato 5, 40127 Bologna, Italy. E-mail: [email protected], [email protected]
2 Dipartimento di Matematica, Università degli Studi di Modena e Reggio Emilia, Via Campi 213/B, 41100 Modena, Italy. E-mail: [email protected]

Received: 31 May 2001 / Accepted: 23 January 2002
Abstract: We consider a non-linear perturbation of a symmetric double-well potential as a model for molecular localization. In the semiclassical limit, we prove the existence of a critical value of the perturbation parameter giving the destruction of the beating effect. This value is twice the one corresponding to the first bifurcation of the fundamental state. Here we make use of a particular projection operator introduced by G. Nenciu in order to extend to an infinite dimensional space some known results for a two-level system.

1. Introduction

As is well known, quantum double-well problems exhibit some characteristic features such as “splitting of the energy levels”, “delocalization” and “beating effect”. It is also known that certain molecules, e.g. ammonia NH3, are such that one of the nuclei (the nitrogen nucleus N in the case of ammonia), in the Born–Oppenheimer approximation, moves according to a double-well effective potential. The beating effect for such molecules, related to the periodic motion of a state passing from localization at one of the wells to localization at the other one, appears as an “inversion line” on the spectrum. For non-isolated molecules we have the “red shift” of the “inversion line”, and, if the ammonia gas is at a pressure large enough (about 2 atmospheres), the inversion line disappears and the N nucleus becomes localized: the well known pyramidal shape of the molecule (molecular structure) appears. Thus, we see classical behaviors of microscopical systems. The cause of this phenomenon should be the polarity of the pyramidal molecule, which polarizes the environment, so that the reaction field stabilizes the molecular structure.

We consider a standard model for molecular structure: a symmetric double-well potential with a non-linear perturbation [5]. In previous research [6, 7] a critical value of the perturbation parameter has been found giving a bifurcation of the fundamental state and new asymmetrical states.

The present research shows that for such a value of the parameter the dynamics is not qualitatively changed with respect to the unperturbed case; in particular the beating effect is unchanged. On the other hand, here we find another critical value of the parameter at which we have the destruction of the beating effect. In particular, beating motion exists for any value of the parameter smaller than the critical one; at the limit of the critical value, the period of this motion diverges; and for larger values of the parameter the beating effect is absent (see Theorem 2 and Corollary 1). Curiously enough, this second critical value of the parameter is nearly (exactly in the limit considered) twice the previous one. The factor 2 between the critical values of the parameter is explained by the similar role played by two different “energy” invariants belonging to the original problem and to the linearized one respectively.

Our work is based on the reduction of the problem to a bi-dimensional space in the semiclassical limit and it makes use of the known results about the dynamics of the reduced two level problem [13, 17].

The paper is organized in the following way. In Sect. 2 we describe the model and we give the main results. In Sect. 3 we prove the theorems. In particular in Sect. 3.2 the reduction of the time-dependent problem into a bi-dimensional space in the semiclassical limit is given. In Sect. 3.3 we recall some known results about the bi-dimensional problem, concerning the trajectories and the frequencies of the motion for different values of the parameter. Finally, in Sect. 3.4 we prove the stability result and the existence of the critical parameter in the full problem.

2. Description of the Model and Main Results

We consider here the time-dependent non-linear Schrödinger equation

$$\begin{cases} i\hbar \dfrac{\partial\psi}{\partial t} = H_0\psi + \epsilon f(x, \psi)\psi, \\ \psi(t, x)|_{t=0} = \psi^0(x), \end{cases} \qquad H_0\psi = -\frac{\hbar^2}{2m}\Delta\psi + V(x)\psi, \tag{1}$$

where $V(x)$, $x \in \mathbb{R}^n$, is a double-well symmetric potential:

$$V(x', -x_n) = V(x), \qquad x = (x', x_n) \in \mathbb{R}^n, \qquad x' = (x_1, \dots, x_{n-1}),$$

and

$$f(x, \psi) = \langle\psi, W\psi\rangle W(x), \tag{2}$$

where $\epsilon$ is a real parameter and $W \in C(\mathbb{R}^n)$ is a given real-valued, bounded, odd function:

$$W(x', -x_n) = -W(x), \qquad x = (x', x_n). \tag{3}$$

In such a case $W$ locally represents the position operator $x_n$ and Eq. (1) would describe the effect of the spontaneous symmetry breaking for a symmetric molecule [4–7, 12]. It is well known [2] that when the nonlinear term has a form given by (2) then we have the conservation of the energy defined below:

$$\mathcal{H} = \langle H_0\psi, \psi\rangle + \frac{1}{2}\epsilon\langle\psi, W\psi\rangle^2 = \langle H_0\psi^0, \psi^0\rangle + \frac{1}{2}\epsilon\langle\psi^0, W\psi^0\rangle^2.$$

Hereafter, $\langle\cdot, \cdot\rangle$ and $\|\cdot\|$ respectively denote the scalar product and the norm in the Hilbert space $L^2(\mathbb{R}^n)$.
Destruction of Beating Effect for Non-Linear Schrödinger Equation
Remark 1. If we consider the locally linear problem defined as
$$ i\hbar\frac{\partial\psi}{\partial t} = H^{\mathrm{lin}}\psi, $$
where $H^{\mathrm{lin}}\psi = H_0\psi + f(x,\psi^0)\psi$, $f(x,\psi^0) = \epsilon\,\langle\psi^0,W\psi^0\rangle W$, then we have the conservation of the energy defined as
$$ E = \langle\psi^0,H^{\mathrm{lin}}\psi^0\rangle = \langle H_0\psi^0,\psi^0\rangle + \epsilon\,\langle\psi^0,W\psi^0\rangle^2 . $$

Let $\sigma(H_0)$ be the spectrum of the self-adjoint realization of $H_0$ on the Hilbert space $L^2(\mathbb{R}^n,dx)$. We assume that the discrete spectrum of $H_0$ is not empty and let $E_+ < E_-$ be the two lowest eigenvalues of $H_0$, with associated normalized eigenvectors $\varphi_+$ and $\varphi_-$. It is well known [8, 14, 15] that, under very general conditions on $V$, the splitting between the first two eigenvalues, defined as $\omega = E_- - E_+$, has the following asymptotic behavior:
$$ \omega \sim e^{-C/\hbar}, \quad\text{as } \hbar\to0, \tag{4} $$
for some positive constant $C$ (hereafter $C$ denotes any generic positive constant). In the same limit we also have
$$ \varphi_\pm(x) \sim \frac{1}{\sqrt2}\left[\varphi_0(x)\pm\varphi_0(-x)\right], \quad\text{as } \hbar\to0, $$
where $\varphi_0(x)$ is a function localized within one well, for instance the right-hand one corresponding to positive values of $x_n$. We also assume that
$$ \mathrm{dist}\left[\{E_+,E_-\},\ \sigma(H_0)\setminus\{E_+,E_-\}\right] \sim C\hbar, \quad\text{as } \hbar\to0, \tag{5} $$
for some positive $C$. Now, let
$$ \varphi_R = \frac{1}{\sqrt2}(\varphi_+ + \varphi_-) \quad\text{and}\quad \varphi_L = \frac{1}{\sqrt2}(\varphi_+ - \varphi_-); $$
they are normalized functions such that
$$ \varphi_R(x)\sim\varphi_0(x) \quad\text{and}\quad \varphi_L(x)\sim\varphi_0(-x), \quad\text{as } \hbar\to0. \tag{6} $$
That is, $\varphi_R$, the so-called right-hand well wave-function, is localized within the right-hand well and $\varphi_L$, the so-called left-hand well wave-function, is localized within the other well. The solution of Eq. (1) can be written in the form
$$ \psi(t,x) = a_R(t)\varphi_R(x) + a_L(t)\varphi_L(x) + \psi_c(t,x), \qquad a_{R,L}(t)\in\mathbb{C}, \tag{7} $$
where $\psi_c = \Pi_c\psi$ is the projection of $\psi$ on the eigenspace orthogonal to the two-dimensional space spanned by $\varphi_R$ and $\varphi_L$; that is:
$$ \Pi_c = 1 - \langle\cdot,\varphi_R\rangle\varphi_R - \langle\cdot,\varphi_L\rangle\varphi_L . $$
It is well known that when the perturbation term $f$ is absent in Eq. (1), a state initially prepared on the two lowest states, that is $\psi_c^0\equiv0$, generically undergoes a beating motion between the two wells with period $4\pi\hbar/\omega$, and the expectation value
$$ \langle W\rangle^t = \langle\psi(t,\cdot),W(\cdot)\psi(t,\cdot)\rangle $$
oscillates within the interval $[-|w|,|w|]$, $w = \langle\varphi_R,W\varphi_R\rangle$. Now, we consider the effect of the perturbation $f$ on these beating motions in the semiclassical limit. In the following we assume that the perturbation strength $\epsilon$ is of the same order as the splitting and we introduce the non-linearity parameter defined as
$$ \mu = \frac{\epsilon c}{\omega} = O(1), \quad\text{as } \hbar\to0, \tag{8} $$
where $c = 2w^2 = 2\rho_0^2$, $w = \langle\varphi_R,W\varphi_R\rangle = \langle\varphi_+,W\varphi_-\rangle = \rho_0$; the choice of $\varphi_\pm$ can be made such that $\rho_0\in\mathbb{R}-\{0\}$. We state our main results:

Theorem 1. For any $\psi^0\in H^2(\mathbb{R}^n)$, Eq. (1) admits a unique solution $\psi\in C^1\left(\mathbb{R}_t;L^2(\mathbb{R}^n)\right)\cap C^0\left(\mathbb{R}_t;H^2(\mathbb{R}^n)\right)$ such that $\psi(0,x) = \psi^0(x)$. Moreover, for all $t\in\mathbb{R}$ we have that
$$ \|\psi(t,\cdot)\| = \|\psi^0(\cdot)\| . \tag{9} $$

Theorem 2. If $\psi_c^0 = \Pi_c\psi^0\equiv0$ and if
$$ \left|\frac{2|H-\Omega|}{\omega} - 1\right| > \delta, \qquad \Omega = \frac12(E_+ + E_-), $$
for some $\delta>0$ fixed and any $\hbar$ small enough, then there exist $\tau_B$ and a positive constant $C$ independent of $\hbar$ such that for any $\alpha<1$
$$ \left\|\psi\left(t+\frac{2\tau_B}{\tilde\omega},\cdot\right) - \psi(t,\cdot)\right\| = O(\tilde\omega^\alpha), \quad \forall t\in[0,t^\star], $$
for $\hbar$ small enough, where
$$ t^\star = \frac{\tau^\star}{\tilde\omega}\ln\frac{1}{\tilde\omega} \quad\text{and}\quad \tau^\star = (\alpha-1)/C . $$
In particular, the expectation value $\langle W\rangle^t$ is, up to an error of order $O(\tilde\omega^\alpha)$, a periodic function with pseudo-period $T = 2\tau_B/\tilde\omega$ and:

(i) if
$$ \frac{2|H-\Omega|}{\omega} < 1-\delta \tag{10} $$
for some $\delta>0$ and any $\hbar$ small enough, then there exists $t_0>0$ such that for any $K\in\mathbb{N}$ and $\eta>0$ fixed
$$ \langle W\rangle^t > 0, \qquad t_0+\eta+kT < t < t_0+(k+1/2)\,T-\eta, $$
and
$$ \langle W\rangle^t < 0, \qquad t_0+(k+1/2)\,T+\eta < t < t_0+(k+1)T-\eta $$
for any $k = 0,1,\dots,K$ and $\hbar$ small enough;

(ii) in contrast, if
$$ \frac{2|H-\Omega|}{\omega} > 1+\delta \tag{11} $$
for some $\delta>0$ and any $\hbar$ small enough, then $\langle W\rangle^t\neq0$, $\forall t\in[0,t^\star]$.

Remark 2. Condition (10) implies that $H\in(E_+,E_-)$ and condition (11) implies that $H\notin[E_+,E_-]$.

Remark 3. For an expression of the pseudo-period we refer to Sect. 3.3; in particular, $\tau_B$ is given by Eq. (45) in the case (10) and $\tau_B$ is given by Eq. (46) in the case (11).

Concerning the dynamics of a state initially prepared in one well, e.g. the right-hand one, we have:

Corollary 1 (Beating Destruction: The critical parameter). Let $\psi^0 = \psi_R$ and $\mu_\infty\neq\pm2$, where $\mu = \mu_\infty+o(1)$, as $\hbar\to0$. Then the state returns near the initial condition after a pseudo-period $T$ of order $1/\tilde\omega$; that is, for any $\alpha<1$ and any $K\in\mathbb{N}$ fixed:
$$ \|\psi(kT,\cdot)-\psi_R(\cdot)\| = O(\tilde\omega^\alpha), \quad\text{for any } k = 1,2,\dots,K. $$
Moreover:
(i) if $|\mu_\infty|<2$, then $\|\psi((k+1/2)T,\cdot)-\psi_L(\cdot)\| = O(\tilde\omega^\alpha)$, $k = 1,2,\dots,K$, and we have the beating motion between the two wells as in the unperturbed case;
(ii) if $|\mu_\infty|>2$, then $\langle W\rangle^t>0$, $\forall t\in[0,t^\star]$, that is, the state $\psi$ is localized within the right-hand well.

3. Proof of the Theorems

3.1. Proof of Theorem 1. We denote $\tilde\psi = e^{itH_0/\hbar}\psi$ and $\tilde W = e^{itH_0/\hbar}We^{-itH_0/\hbar}$. Then Eq. (1) is equivalent to:
$$ i\hbar\frac{\partial\tilde\psi}{\partial t} = F(\tilde\psi), \quad\text{where}\quad F(\tilde\psi) = \epsilon\,\langle\tilde\psi,\tilde W\tilde\psi\rangle\,\tilde W\tilde\psi \tag{12} $$
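satisfies the following Lipschitz-type estimate: for any $\tilde\psi_1,\tilde\psi_2\in L^2(\mathbb{R}^n)$ we have that
$$ \|F(\tilde\psi_1)-F(\tilde\psi_2)\| \le C\left(\|\tilde\psi_1\|^2+\|\tilde\psi_2\|^2\right)\|\tilde\psi_1-\tilde\psi_2\| \tag{13} $$
for some positive constant $C$. Therefore, for $\epsilon$ small enough, a local existence and uniqueness result follows from Cauchy's theorem. Moreover, for any solution $\tilde\psi$ of (12) we have that
$$ \frac{\partial\|\tilde\psi\|^2}{\partial t} = 2\,\Re\left\langle\frac{\partial\tilde\psi}{\partial t},\tilde\psi\right\rangle = 2\hbar^{-1}\,\Im\langle F(\tilde\psi),\tilde\psi\rangle = 0; $$
hence, $\|\tilde\psi\|$ is constant with respect to $t$. As a consequence, $\left\|\partial\tilde\psi/\partial t\right\|$ remains uniformly bounded on any open interval of time where it is defined and thus the global existence in time follows from standard arguments. Finally, if $\psi^0\in H^2(\mathbb{R}^n)$ one also has that $\tilde\psi\in C^\infty(\mathbb{R}_t;H^2(\mathbb{R}^n))$ and thus $\psi\in C^1\left(\mathbb{R}_t;L^2(\mathbb{R}^n)\right)\cap C^0\left(\mathbb{R}_t;H^2(\mathbb{R}^n)\right)$.

3.2. Reduction to a two-level system. Here, we prove a stability result which allows us to reduce the analysis of Eq. (1) to a bi-dimensional space. To this purpose we make use of some ideas contained in [10, 11] and [16]. Now, let
$$ \tilde\omega = \frac{\omega}{\hbar} \quad\text{and}\quad H_1 = \frac1\hbar H_0, \tag{14} $$
where $\dfrac{\epsilon}{\hbar\tilde\omega} = O(1)$ as $\hbar\to0$. We treat $\tilde\omega$ as a new semiclassical parameter. We have that:

Theorem 3. Let $\psi(t,x) = a_R(t)\varphi_R(x)+a_L(t)\varphi_L(x)+\psi_c(t,x)$, $a_{R,L}(t)\in\mathbb{C}$, $\psi_c = \Pi_c\psi$, be the solution of Eq. (1) satisfying the initial condition $\psi_c^0\equiv0$. Then there exists a positive constant $C$ such that
$$ \|H_0\psi_c\| \le C\tilde\omega e^{C\tilde\omega t}, \qquad \|\psi_c\| \le C\tilde\omega e^{C\tilde\omega t} \tag{15} $$
and
$$ \left|a_{R,L}(t) - e^{-i(E_++E_-)t/2\hbar}A_{R,L}(t\omega/2\hbar)\right| \le C\tilde\omega e^{C\tilde\omega t} \tag{16} $$
for $\hbar$ small enough and any $t\in\mathbb{R}^+$, where $A_{R,L}(\tau)$ are the solutions of the non-linear system
$$ \begin{cases} iA_R' = -A_L + 2\nu_0\rho_0A_R \\ iA_L' = -A_R - 2\nu_0\rho_0A_L \end{cases}, \qquad \begin{cases} A_{R,L}(0) = a_{R,L}(0) \\ |A_R(\tau)|^2+|A_L(\tau)|^2 = 1 \end{cases}, \tag{17} $$
where $'$ means the derivative with respect to $\tau$ and
$$ \nu_0 = \nu_0(\tau) = \frac{\epsilon}{\hbar\tilde\omega}\,\rho_0\left(|A_R(\tau)|^2-|A_L(\tau)|^2\right), \qquad \rho_0 = \langle\varphi_+,W\varphi_-\rangle . $$
In particular, for any $\alpha\in(0,1)$,
$$ \|H_0\psi_c\| \le C\tilde\omega^\alpha, \qquad \|\psi_c\| \le C\tilde\omega^\alpha \tag{18} $$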
and
$$ \left|a_{R,L}(t) - e^{-i(E_++E_-)t/2\hbar}A_{R,L}(t\omega/2\hbar)\right| \le C\tilde\omega^\alpha $$
for any $t\in[0,t^\star]$, $t^\star = (\tau^\star/\tilde\omega)\ln(1/\tilde\omega)$, $\tau^\star = (\alpha-1)/C$.

Proof. In order to prove the theorem we investigate the solution $\psi$ of (1) with initial data:
$$ \psi^0 = a_+^0\varphi_+ + a_-^0\varphi_-, \qquad |a_+^0|^2+|a_-^0|^2 = 1. \tag{19} $$
To this end, we make the change of time scale
$$ t\to\tau = \frac{\omega t}{2\hbar} = \frac{\tilde\omega t}{2}, $$
which transforms (1) into (for the sake of simplicity $\psi$ still denotes the solution of the new equation):
$$ \frac{i\tilde\omega}{2}\frac{\partial\psi}{\partial\tau} = H_1\psi + \frac{\epsilon}{\hbar}\langle\psi,W\psi\rangle W\psi . \tag{20} $$
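Our first aim is to construct an approximation of $\psi$ as $\tilde\omega\to0^+$. Let us define $\|\chi\|_0 = \|\chi\|$, $\chi\in L^2(\mathbb{R}^n)$, and $\|\chi\|_1 = \|\tilde H_1\chi\|$, $\chi\in D(\tilde H_1)$, where
$$ \tilde H_1 = H_1 + c_1\mathbf{1}, \qquad c_1 \text{ is such that } \tilde H_1\ge1, \tag{21} $$
and therefore $\|\chi\|_0\le\|\chi\|_1$ for any $\chi\in D(\tilde H_1)$. We start by proving the following lemma.

Lemma 1. Let $\psi$ be the solution of Eq. (20) with initial data (19). Let $j = 0$ or $j = 1$, and let $\varphi\in C^1(\mathbb{R}_\tau;L^2(\mathbb{R}^n))\cap C^0(\mathbb{R}_\tau;H^2(\mathbb{R}^n))$ be such that $\|\varphi(\tau,\cdot)\|\le C$ for some $C>0$ and any $\tau$, $\|\varphi(0,\cdot)-\psi^0(\cdot)\|_j = O(\tilde\omega)$, and such that
$$ \phi = \left[-\frac{i\tilde\omega}{2}\frac{\partial}{\partial\tau} + H_1 + \frac{\epsilon}{\hbar}\langle\varphi,W\varphi\rangle W\right]\varphi \tag{22} $$
satisfies
$$ \|\phi(\tau,\cdot)\|_j = O(\tilde\omega^2) \tag{23} $$
uniformly for $\tau\ge0$ and $\tilde\omega$ small enough. Then, there exists $C>0$ such that:
$$ \|\varphi(\tau,\cdot)-\psi(\tau,\cdot)\|_j \le C\tilde\omega e^{C\tau}, \quad \forall\tau\ge0. \tag{24} $$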
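Proof. In order to prove this lemma, first consider $j = 0$. Let us denote $\tilde\varphi = e^{2i\tau H_1/\tilde\omega}\varphi$, $\tilde\psi = e^{2i\tau H_1/\tilde\omega}\psi$, $\tilde\phi = e^{2i\tau H_1/\tilde\omega}\phi$, $u = \tilde\varphi-\tilde\psi$ and $\tilde W = e^{2i\tau H_1/\tilde\omega}We^{-2i\tau H_1/\tilde\omega}$. We have
$$ \frac{i\tilde\omega}{2}u' = \frac{\epsilon}{\hbar}\left[\langle\varphi,W\varphi\rangle\tilde W\tilde\varphi - \langle\psi,W\psi\rangle\tilde W\tilde\psi\right] - \tilde\phi $$
and therefore
$$ \left|\frac{\partial\|u\|^2}{\partial\tau}\right| = 2\left|\Re\langle u',u\rangle\right| = 4\left|\Im\left\langle\frac{\epsilon}{\hbar\tilde\omega}\left[\langle\varphi,W\varphi\rangle\tilde W\tilde\varphi - \langle\psi,W\psi\rangle\tilde W\tilde\psi\right] - \frac{1}{\tilde\omega}\tilde\phi,\ u\right\rangle\right| \le C\left(\|u\|^2+\tilde\omega\|u\|\right) \le C\left(\|u\|^2+\tilde\omega^2\right) \tag{25} $$
for any $\tau\ge0$ and for some constant $C>0$, because of (8), (23) and $ab\le\frac12a^2+\frac12b^2$ for any $a,b>0$. As a result it follows that
$$ \frac{\partial}{\partial\tau}\left(e^{-C\tau}\|u\|^2\right) \le Ce^{-C\tau}\tilde\omega^2, $$
and thus, since $\|u|_{\tau=0}\| = O(\tilde\omega)$:
$$ e^{-C\tau}\|u\|^2 \le C\tilde\omega^2 \tag{26} $$
for some $C>0$. Then (24) immediately follows.

Moreover, (24) is still true when we replace the usual norm $\|\chi\|$ by $\|\chi\|_1 = \|\tilde H_1\chi\|$, $\chi\in D(H_1)$, where $\tilde H_1 = H_1+c_1\mathbf{1}\ge1$ for some $c_1$. Indeed, let $\tilde\varphi$, $\tilde\psi$, $\tilde W$ and $\tilde\phi$ be as above and let now $u_1 = \tilde H_1(\tilde\varphi-\tilde\psi)$. Then
$$ \frac{i\tilde\omega}{2}u_1' = \frac{\epsilon}{\hbar}\left[\langle\varphi,W\varphi\rangle\tilde H_1\tilde W\tilde\varphi - \langle\psi,W\psi\rangle\tilde H_1\tilde W\tilde\psi\right] - \tilde H_1\tilde\phi $$
and
$$ \left|\frac{\partial\|u_1\|^2}{\partial\tau}\right| = 4\left|\Im\left\langle\frac{\epsilon}{\hbar\tilde\omega}\left[\langle\varphi,W\varphi\rangle\tilde H_1\tilde W\tilde\varphi - \langle\psi,W\psi\rangle\tilde H_1\tilde W\tilde\psi\right] - \frac{1}{\tilde\omega}\tilde H_1\tilde\phi,\ u_1\right\rangle\right| \le C\left(\|u_1\|^2+\tilde\omega\|u_1\|\right) \le C\left(\|u_1\|^2+\tilde\omega^2\right) $$
for some constant $C>0$, because of (8) and (23) and because $\tilde H_1W\tilde H_1^{-1}$ and $\tilde H_1^{-1}$ are bounded operators, uniformly with respect to $\tilde\omega$. As above, it follows that (26) holds with $u_1$ in place of $u$, from which (24) follows. □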
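Now, in order to prove Theorem 3 we explicitly construct a solution $\varphi$ satisfying the assumptions of Lemma 1. We re-write Eq. (22) as:
$$ \left[-\frac{i\tilde\omega}{2}\frac{\partial}{\partial\tau} + H_1 + \tilde\omega\nu W\right]\varphi = \phi, \tag{27} $$
where
$$ \nu = \nu(\tau) = \frac{\epsilon}{\hbar\tilde\omega}\langle\varphi,W\varphi\rangle, \qquad \frac{\epsilon}{\hbar\tilde\omega} = O(1), \quad\text{as } \hbar\to0, $$
and where $\varphi$ and $\phi$ must satisfy the conditions
$$ \begin{cases} \|\varphi(0,\cdot)-\psi^0(\cdot)\|_1 \le C\tilde\omega, \\ \|\varphi(\tau,\cdot)\| \le C, & \forall\tau\ge0, \\ \|\phi(\tau,\cdot)\|_1 \le C\tilde\omega^2, & \forall\tau\ge0, \end{cases} \tag{28} $$
for some $C>0$, where $\psi^0$ is given by (19). We denote by $\Pi_0 = 1-\Pi_c$ the orthogonal projection onto $\mathbb{C}\varphi_+\oplus\mathbb{C}\varphi_-$, that is:
$$ \Pi_0 = \frac{1}{2\pi i}\oint_\gamma(\zeta-H_1)^{-1}\,d\zeta, $$
where $\gamma$ is a simple complex loop encircling $\frac1\hbar E_+$, $\frac1\hbar E_-$, leaving the rest of $\sigma(H_1)$ in its exterior and such that (see (5))
$$ \mathrm{dist}(\gamma,\sigma(H_1)) \ge C, $$
for some constant $C>0$. We also set:
$$ \Pi_1 = \frac{1}{2\pi i}\oint_\gamma(\zeta-H_1)^{-1}W(\zeta-H_1)^{-1}\,d\zeta \tag{29} $$
and, for any $\nu\in C^1(\mathbb{R})$,
$$ \Pi_{\nu(\tau)} = \Pi_0 + \tilde\omega\nu(\tau)\Pi_1 . \tag{30} $$
From the definition, from (5) and (21), and since
$$ H_1\Pi_1 = -W\Pi_0 + \frac{1}{2\pi i}\oint_\gamma\zeta(\zeta-H_1)^{-1}W(\zeta-H_1)^{-1}\,d\zeta, $$
it follows that
$$ \|\Pi_1\chi\|_1 \le C\|\chi\|_1, \quad\text{for any } \chi\in D(H_1). \tag{31} $$
We look for a solution $\varphi$ of the linear equation (27) of the form
$$ \varphi(\tau) = \Pi_{\nu(\tau)}\left[b_+(\tau)\varphi_+ + b_-(\tau)\varphi_-\right]. \tag{32} $$
For such a choice of $\varphi$ and from the definition of $\nu$ we have
$$ \nu(\tau) = \frac{\epsilon}{\hbar\tilde\omega}\langle\varphi,W\varphi\rangle = \nu_0 + \frac{\epsilon}{\hbar\tilde\omega}\left[\tilde\omega\alpha(\tau)\nu(\tau) + \tilde\omega^2\beta(\tau)\nu^2(\tau)\right], \tag{33} $$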
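where
$$ \nu_0 = \frac{\epsilon}{\hbar\tilde\omega}\sum_{\ell,\ell'=1}^{2}\bar b_{s(\ell)}b_{s(\ell')}\langle\varphi_{s(\ell')},W\varphi_{s(\ell)}\rangle = \frac{2\epsilon}{\hbar\tilde\omega}\,\Re\left(b_+\bar b_-\right)\rho_0, $$
$s(1) = +$ and $s(2) = -$, because of (3), and $\alpha(\tau)$ and $\beta(\tau)$ are functions independent of $\tilde\omega$ given by:
$$ \alpha = \sum_{\ell,\ell'=1}^{2}b_{s(\ell')}\bar b_{s(\ell)}\,\alpha_{s(\ell'),s(\ell)}, \qquad \beta = \sum_{\ell,\ell'=1}^{2}b_{s(\ell')}\bar b_{s(\ell)}\,\beta_{s(\ell'),s(\ell)}, $$
where
$$ \alpha_{\pm,\pm} = \langle\varphi_\pm,W\Pi_1\varphi_\pm\rangle + \langle\Pi_1\varphi_\pm,W\varphi_\pm\rangle, \qquad \beta_{\pm,\pm} = \langle\Pi_1\varphi_\pm,W\Pi_1\varphi_\pm\rangle . $$
From this fact and since $\dfrac{\epsilon}{\hbar\tilde\omega} = O(1)$, it follows that $\|\varphi(\tau,\cdot)\|\le C$ and $\nu$ satisfies the following behavior:
$$ \nu,\ \nu' = O(1), \quad\text{uniformly w.r.t. } \tilde\omega>0 \text{ small enough and } \tau\ge0, \tag{34} $$
provided that the unknown functions $b_\pm$ and their first derivatives are bounded uniformly with respect to $\tau$ and $\tilde\omega$. Now, observe that
$$ \left[-\frac{i\tilde\omega}{2}\frac{\partial}{\partial\tau} + H_1 + \tilde\omega\nu W,\ \Pi_\nu\right] = K, $$
where
$$ K = -\frac{i\tilde\omega^2}{2}\nu'\Pi_1 + \tilde\omega\nu\left([H_1,\Pi_1]+[W,\Pi_0]\right) + \tilde\omega^2\nu^2[W,\Pi_1] $$
is such that $\|K\chi\|_1\le C\tilde\omega^2\|\chi\|_1$, because of (34), because $[H_1,\Pi_1]+[W,\Pi_0] = 0$ by definition of $\Pi_1$, and because $\tilde H_1W\tilde H_1^{-1}$ is a bounded operator. By inserting (32) into (27) we obtain that $b_+(\tau)$ and $b_-(\tau)$ must satisfy the following equation:
$$ \Pi_{\nu(\tau)}\{c_+\varphi_+ + c_-\varphi_-\} = \phi, \qquad c_\pm = -\frac{i\tilde\omega}{2}b_\pm' + \left(\frac{E_\pm}{\hbar}+\tilde\omega\nu W\right)b_\pm, \tag{35} $$
where
$$ b_\pm(0) = a_\pm^0+O(\tilde\omega), \qquad \phi = -K(b_+\varphi_++b_-\varphi_-), \qquad \|\phi(\tau,\cdot)\|_1\le C\tilde\omega^2, \ \forall\tau\ge0, $$
and $\nu = \dfrac{\epsilon}{\hbar\tilde\omega}\langle\varphi,W\varphi\rangle = \nu_0+O(\tilde\omega)$ has to satisfy (34).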
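Now, we have that
$$ \Pi_{\nu(\tau)}(W\varphi_\pm) = \Pi_0W\varphi_\pm + \tilde\omega\nu(\tau)\Pi_1W\varphi_\pm, $$
where
$$ \Pi_0W\varphi_\pm = \langle\varphi_+,W\varphi_\pm\rangle\varphi_+ + \langle\varphi_-,W\varphi_\pm\rangle\varphi_- $$
and
$$ \|\tilde\omega\nu(\tau)\Pi_1W\varphi_\pm\|_1 \le C\tilde\omega\|W\varphi_\pm\|_1 \le C\tilde\omega, $$
because of (31). Moreover, let
$$ (\zeta-H_1)^{-1}K_1 = (\zeta-H_1-\tilde\omega\nu W)^{-1} - (\zeta-H_1)^{-1} - (\zeta-H_1)^{-1}\tilde\omega\nu W(\zeta-H_1)^{-1}, $$
where
$$ K_1 = \left[1-\tilde\omega\nu W(\zeta-H_1)^{-1}\right]^{-1} - 1 - \tilde\omega\nu W(\zeta-H_1)^{-1} = \tilde\omega^2\nu^2\,W(\zeta-H_1)^{-1}W(\zeta-H_1)^{-1}\left[1-\tilde\omega\nu W(\zeta-H_1)^{-1}\right]^{-1} $$
is such that for any $\zeta\in\gamma$, $\|K_1\chi\|_1\le C\tilde\omega^2\|\chi\|_1$. From this it follows that
$$ \Pi_{\nu(\tau)} = \frac{1}{2\pi i}\oint_\gamma(\zeta-H_1-\tilde\omega\nu W)^{-1}\,d\zeta + K_2, $$
where $\|K_2\chi\|_1\le C\tilde\omega^2\|\chi\|_1$; hence we can write:
$$ \Pi_{\nu(\tau)}^2 = \Pi_{\nu(\tau)} + K_3, \qquad \|K_3\chi\|_1\le C\tilde\omega^2\|\chi\|_1 . $$
Therefore:
$$ \begin{aligned} \Pi_{\nu(\tau)}c_+\varphi_+ &= \Pi_{\nu(\tau)}\left[-\frac{i\tilde\omega}{2}b_+' + \left(\frac{E_+}{\hbar}+\tilde\omega\nu W\right)b_+\right]\varphi_+ \\ &= \Pi_{\nu(\tau)}^2\left[-\frac{i\tilde\omega}{2}b_+' + \left(\frac{E_+}{\hbar}+\tilde\omega\nu W\right)b_+\right]\varphi_+ + \phi_1 \\ &= \Pi_{\nu(\tau)}\left\{\left[-\frac{i\tilde\omega}{2}b_+' + \frac1\hbar E_+b_+\right]\varphi_+ + \tilde\omega\nu b_+\langle\varphi_-,W\varphi_+\rangle\varphi_-\right\} + \phi_2, \end{aligned} $$
where $\|\phi_\ell\|_1\le C\tilde\omega^2$, $\ell = 1,2$, and $\langle\varphi_+,W\varphi_+\rangle = 0$. Therefore, (35) can be re-written as:
$$ \Pi_{\nu(\tau)}\{d_+\varphi_+ + d_-\varphi_-\} = \phi_3, \qquad d_\pm = -\frac{i\tilde\omega}{2}b_\pm' + \frac1\hbar E_\pm b_\pm + \tilde\omega\nu\rho_0b_\mp, $$
where
$$ b_\pm(0) = a_\pm^0+O(\tilde\omega), \qquad \|\phi_3(\tau,\cdot)\|_1 = O(\tilde\omega^2), \ \forall\tau\ge0, $$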
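with $\rho_0 = \langle\varphi_+,W\varphi_-\rangle\in\mathbb{R}$. As a result it is enough to find $b_\pm$, bounded together with their first derivatives for any $\tau$, such that
$$ \begin{cases} -\dfrac{i\tilde\omega}{2}b_+' + \dfrac1\hbar E_+b_+ + \tilde\omega\nu_0\rho_0b_- = 0 \\[4pt] -\dfrac{i\tilde\omega}{2}b_-' + \dfrac1\hbar E_-b_- + \tilde\omega\nu_0\rho_0b_+ = 0 \\[4pt] b_\pm(0) = a_\pm^0, \quad \nu_0 = \dfrac{2\epsilon}{\hbar\tilde\omega}\,\Re\left(b_+\bar b_-\rho_0\right) \end{cases} \tag{36} $$
Setting
$$ a_R = \frac{1}{\sqrt2}(b_++b_-) \quad\text{and}\quad a_L = \frac{1}{\sqrt2}(b_+-b_-), $$
the system (36) becomes
$$ \begin{cases} -\dfrac{i\tilde\omega}{2}a_R' = -\dfrac{E_-+E_+}{2\hbar}a_R + \dfrac12\tilde\omega a_L - \tilde\omega\nu_0\rho_0a_R \\[4pt] -\dfrac{i\tilde\omega}{2}a_L' = -\dfrac{E_-+E_+}{2\hbar}a_L + \dfrac12\tilde\omega a_R + \tilde\omega\nu_0\rho_0a_L \\[4pt] \nu_0 = \dfrac{\epsilon}{\hbar\tilde\omega}\,\rho_0\left(|a_R|^2-|a_L|^2\right) \\[4pt] a_R(0) = \dfrac{1}{\sqrt2}(a_+^0+a_-^0) \ \text{and}\ a_L(0) = \dfrac{1}{\sqrt2}(a_+^0-a_-^0) \end{cases} \tag{37} $$
and we look for a solution of the form:
$$ a_R(\tau) = A_R(\tau)e^{-i(E_++E_-)\tau/\hbar\tilde\omega}, \qquad a_L(\tau) = A_L(\tau)e^{-i(E_++E_-)\tau/\hbar\tilde\omega}, $$
with $A_R$ and $A_L$ independent of $\tilde\omega$. Then (37) is transformed into the corresponding system:
$$ \begin{cases} iA_R' = -A_L + 2\nu_0\rho_0A_R \\ iA_L' = -A_R - 2\nu_0\rho_0A_L \end{cases}, \tag{38} $$
where
$$ \nu_0 = \frac{\epsilon}{\hbar\tilde\omega}\,\rho_0\left(|A_R|^2-|A_L|^2\right), \qquad A_{R,L}(0) = a_{R,L}(0). $$
It is easy to verify that $|A_L(\tau)|^2+|A_R(\tau)|^2 = 1$ since $\rho_0\in\mathbb{R}$; hence, the solutions $A_{R,L}(\tau)$ exist for any $\tau$ and they are bounded, together with their first derivatives, uniformly with respect to $\tau$ and $\tilde\omega$ small enough, since $\dfrac{\epsilon}{\hbar\tilde\omega} = O(1)$. Then (34) will be satisfied uniformly with respect to $\tilde\omega$ (actually (38) is independent of $\tilde\omega$). From these facts and by (30), (32) and Lemma 1, the solution of (19)–(20) satisfies the estimates (15) and (16). Theorem 3 is proved. □

3.3. Dynamics of the two-level system. In order to study the system of Eqs. (17) we re-write it in the form
$$ \begin{cases} iA_R' = -A_L + 2\mu|A_R|^2A_R \\ iA_L' = -A_R + 2\mu|A_L|^2A_L \end{cases}, \qquad \begin{cases} A_{R,L}(0) = a_{R,L}(0) \\ |A_R(\tau)|^2+|A_L(\tau)|^2 = 1 \end{cases}, \tag{39} $$
where $'$ denotes the derivative with respect to $\tau$, $c = 2\rho_0^2$, $\mu = \epsilon c/\omega$ plays the role of the parameter of non-linearity and we re-define $A_{R,L}(\tau)$ up to a phase factor, i.e. $A_{R,L}(\tau)\to A_{R,L}(\tau)e^{i2c\tau}$.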
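As a purely illustrative aside (not part of the original paper), the two-level system (39) can be integrated numerically to check two facts used in this section: the normalization $|A_R|^2+|A_L|^2 = 1$ is preserved, and the quantity $\mu(|A_R|^4+|A_L|^4) - 2\,\Re(A_R\bar A_L)$ is constant along solutions. The sketch below assumes a fixed-step classical Runge–Kutta scheme; the function names `rhs`, `rk4_step`, `integrate` and `two_level_energy` and the step size are hypothetical choices made for this example only.

```python
import math


def rhs(a_r, a_l, mu):
    """Right-hand side of system (39), written as A' = -i * (...)."""
    d_r = -1j * (-a_l + 2.0 * mu * abs(a_r) ** 2 * a_r)
    d_l = -1j * (-a_r + 2.0 * mu * abs(a_l) ** 2 * a_l)
    return d_r, d_l


def rk4_step(a_r, a_l, mu, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1r, k1l = rhs(a_r, a_l, mu)
    k2r, k2l = rhs(a_r + 0.5 * h * k1r, a_l + 0.5 * h * k1l, mu)
    k3r, k3l = rhs(a_r + 0.5 * h * k2r, a_l + 0.5 * h * k2l, mu)
    k4r, k4l = rhs(a_r + h * k3r, a_l + h * k3l, mu)
    a_r = a_r + h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    a_l = a_l + h * (k1l + 2.0 * k2l + 2.0 * k3l + k4l) / 6.0
    return a_r, a_l


def integrate(a_r, a_l, mu, tau_max, h=1e-3):
    """Return a list of (tau, A_R, A_L) samples on [0, tau_max]."""
    samples = [(0.0, a_r, a_l)]
    steps = int(round(tau_max / h))
    for k in range(1, steps + 1):
        a_r, a_l = rk4_step(a_r, a_l, mu, h)
        samples.append((k * h, a_r, a_l))
    return samples


def two_level_energy(a_r, a_l, mu):
    """Conserved quantity of (39): mu(|A_R|^4 + |A_L|^4) - 2 Re(A_R conj(A_L))."""
    quartic = abs(a_r) ** 4 + abs(a_l) ** 4
    return mu * quartic - 2.0 * (a_r * a_l.conjugate()).real


if __name__ == "__main__":
    # State initially localized in the right-hand well, mu = 1.5 (illustrative value).
    trajectory = integrate(1 + 0j, 0j, 1.5, 10.0)
    tau, a_r, a_l = trajectory[-1]
    print("tau =", tau)
    print("norm =", abs(a_r) ** 2 + abs(a_l) ** 2)
    print("energy drift =", two_level_energy(a_r, a_l, 1.5) - two_level_energy(1 + 0j, 0j, 1.5))
    # For mu = 0 the exact solution is A_R = cos(tau), A_L = i sin(tau).
    tau0, r0, _ = integrate(1 + 0j, 0j, 0.0, 3.0)[-1]
    print("linear case error =", abs(abs(r0) ** 2 - math.cos(tau0) ** 2))
```

For $\mu = 0$ the computed $|A_R|^2$ follows $\cos^2\tau$, which is the period-$\pi$ beating of the unperturbed problem.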
Remark 4 (Gross–Pitaevskii equation). If the perturbation term has the form
$$ f(x,\psi) = \epsilon\,|\psi(x)|^2W(x), \tag{40} $$
where $W(x)$ is a given real-valued even function: $W(x',-x_n) = W(x)$, then Eq. (1) takes the form of the Gross–Pitaevskii equation [1] and we have the conservation of the energy defined below:
$$ H = \langle H_0\psi,\psi\rangle + \frac12\,\epsilon\,\langle\psi^2,W\psi^2\rangle . $$
In particular, the same arguments given above prove that the two-level system for the Gross–Pitaevskii equation takes the form (39), where $c = \langle\varphi_R,W|\varphi_R|^2\varphi_R\rangle$ and where the function $W$ is such that this scalar product is defined.

In discussing two-level systems we characterize the states in terms of
$$ A_R(\tau) = p(\tau)e^{i\alpha(\tau)} \quad\text{and}\quad A_L(\tau) = q(\tau)e^{i\beta(\tau)}, \tag{41} $$
where $p$, $q$, $\alpha$ and $\beta$ are real-valued functions, $0\le p\le1$ and $0\le q\le1$. Because the common phase factor is redundant, the state can be described by means of a vector in an abstract Euclidean three-dimensional space with components $(p,\ q\cos(\beta-\alpha),\ q\sin(\beta-\alpha))$. In particular, from the normalization condition $p^2+q^2 = 1$, it belongs, in such a Euclidean space, to the surface of the sphere. Hence, in order to study the solution of the two-level system (39) we represent the surface of the sphere by means of a Mercator-type chart; that is, by means of two real coordinates $(P,z)$, where $P = p^2\in[0,1]$ is the square of the modulus of $A_R$ and $z = \alpha-\beta\in T = \mathbb{R}/2\pi\mathbb{Z} = [0,2\pi)$ belongs to the one-dimensional torus and represents the difference between the phases of $A_R$ and $A_L$ (see Ch. 13, [9]). We underline that this representation is singular at $P = 0$ and $P = 1$; in fact, for $P = 0$ (respectively $P = 1$) and any $z$ we have localization on the left-hand (respectively right-hand) well. If the non-linear term is absent in Eq. (39), then $P(\tau)$ is a periodic function with period $\pi$ and, if initially $P(0) = 0$ or $P(0) = 1$, then $P(\tau)$ periodically assumes the values 0 and 1. The system of equations (39) has been recently studied [13]. Here, we recall the most relevant results.

Lemma 2. Let $P(\tau) = p^2(\tau)$ and $z(\tau) = \alpha(\tau)-\beta(\tau)$; then $P(\tau)$ and $z(\tau)$ satisfy the following system of ordinary differential equations:
$$ \begin{cases} P' = 2\sqrt{P}\sqrt{1-P}\,\sin z \\[4pt] z' = (1-2P)\left[2\mu + \dfrac{\cos z}{\sqrt{P}\sqrt{1-P}}\right] \end{cases}. \tag{42} $$
Equations (42) have four stationary solutions
$$ \begin{aligned} &\text{(I)}\ \ (P = 1/2,\ z = 0), \\ &\text{(II)}\ \ (P = 1/2,\ z = \pi), \\ &\text{(III)}\ \ \left(P = \frac{1+\sqrt{1-1/\mu^2}}{2},\ z = \frac\pi2\left[1+\frac{|\mu|}{\mu}\right]\right), \ \text{if } |\mu|\ge1, \\ &\text{(IV)}\ \ \left(P = \frac{1-\sqrt{1-1/\mu^2}}{2},\ z = \frac\pi2\left[1+\frac{|\mu|}{\mu}\right]\right), \ \text{if } |\mu|\ge1, \end{aligned} $$
where, for $|\mu|<1$, (I) and (II) are center points while, for $\mu>1$ (respectively $\mu<-1$), the stationary solutions (I) (respectively (II)), (III) and (IV) are center points and the stationary solution (II) (respectively (I)) is a saddle point. Moreover, the function
$$ I = I(P,z,\mu) = \sqrt{P}\sqrt{1-P}\left[\mu\sqrt{P}\sqrt{1-P}+\cos z\right] \tag{43} $$
is an integral of motion and the dynamics of the two-level system, with initial condition $(P_0,z_0)$, can be described by means of the integral path defined by the implicit equation $I(P,z,\mu) = I(P_0,z_0,\mu)$.

In particular, we consider the following two behaviors:

[C1] $P(\tau)$ is a periodic continuous function, with given period $\tau_B$, such that $P(\tau) = \frac12$ for $\tau = \tilde\tau,\ \tilde\tau+\frac12\tau_B$, for some $\tilde\tau$, and $P(\tau)<\frac12$ and $P(\tau+\frac12\tau_B)>\frac12$ for any $\tau\in(\tilde\tau,\tilde\tau+\frac12\tau_B)$.

[C2] $P(\tau)$ is a periodic continuous function such that $P(\tau)\neq\frac12$ for any $\tau$.

We have that:

Lemma 3. Let $(P_0,z_0)\in[0,1]\times T$ be the initial state in the two-level representation. We have:

(i) if $|\mu|\le1$, then $P(\tau)$ has a time behavior of type C1 for any initial condition $(P_0,z_0)$, except those corresponding to the stationary solutions (I) and (II) (see Fig. 1);
[Figure 1: plot in the $(z,P)$ plane, with $z$ from $-3$ to $3$ on the horizontal axis and $P$ from $0$ to $1$ on the vertical axis.]

Fig. 1. Integral paths of the equation $I(P,z,\mu) = \tilde I$ for some values of $\tilde I$ and for $\mu = -\frac12$ fixed. The bold line represents the integral path of the beating motion, that is the transition from localization on a well to localization on the other one. Localization on the right-hand (respectively left-hand) well occurs at $P = 1$ (respectively $P = 0$) for any $z$
[Figure 2: plot in the $(z,P)$ plane, with $z$ from $-3$ to $3$ on the horizontal axis and $P$ from $0$ to $1$ on the vertical axis.]

Fig. 2. Integral paths of equation $I(P,z,\mu) = \tilde I$ for some values of $\tilde I$ and for $\mu = -\frac32$ fixed. We observe the stability of the beating motion (bold line) despite the appearance of the bifurcation of one fixed point. Broken lines represent the two separatrices; inside the region enclosed by these lines we have closed paths, around the asymmetrical stationary state originated from the bifurcation of the fundamental state, representing periodic oscillations within only one well
(ii) if $|\mu|>1$, let $D = D(\mu)$ be the bounded open set enclosed by the path with equation
$$ z = \frac\pi2\left[1+\frac{|\mu|}{\mu}\right] \pm \arccos\left[\frac{1+2\mu P(1-P)-\mu/2}{2\sqrt{P}\sqrt{1-P}}\right] \tag{44} $$
and containing the stationary solutions (III) and (IV); then for any $(P_0,z_0)\in D$, $(P_0,z_0)$ different from the stationary solutions (III) and (IV), $P(\tau)$ has a behavior of type C2; in contrast, if $(P_0,z_0)\notin\bar D$, where $\bar D$ denotes the closure of $D$, and $(P_0,z_0)$ is different from the stationary solution (I), then $P(\tau)$ has a behavior of type C1 (see Figs. 2 and 3).

Remark 5. Let $(P_0,z_0)$ be such that $P_0 = 0$ or $P_0 = 1$. Then $I(P_0,z_0,\mu) = 0$ and $(P_0,z_0)\in D$ if $|\mu|>2$, and $(P_0,z_0)\notin\bar D$ if $|\mu|<2$. Hence, for $|\mu|<2$ we observe the beating motion, such that $P(\tau)$ periodically assumes the values 0 and 1 (see the bold line in Fig. 2). The beating motion corresponds to the path with equation $I(P,z,\mu) = 0$, i.e. $\cos z = -\mu\sqrt{P}\sqrt{1-P}$; that is:
$$ z^{fb} = \frac\pi2\left[1-\frac{|\mu|}{\mu}\right] \pm \arccos\left[-|\mu|\sqrt{P}\sqrt{1-P}\right]. $$
In contrast, for $2<|\mu|$ the beating motion between the two wells is not possible (see the bold line in Fig. 3, where $\mu = -\frac52$); in particular, if initially $P(0) = 1$
[Figure 3: plot in the $(z,P)$ plane, with $z$ from $-3$ to $3$ on the horizontal axis and $P$ from $0$ to $1$ on the vertical axis.]

Fig. 3. Integral paths of equation $I(P,z,\mu) = \tilde I$ for some values of $\tilde I$ and for $\mu = -\frac52$ fixed. We observe the destruction of the beating motion. The trajectory (bold line) starting from the localization point corresponding to $P = 1$ (respectively $P = 0$) stays in the region $P>\frac12$ (respectively $P<\frac12$) and it encircles one asymmetrical stationary state originated from the bifurcation of the fundamental state
(respectively $P(0) = 0$) then during the motion we have that $P(\tau)>\frac12$ (respectively $P(\tau)<\frac12$) for any $\tau$.
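The threshold $|\mu| = 2$ described in Remark 5 can be observed numerically. The following sketch, which is an illustration added here and not taken from the paper, integrates system (39) from the right-hand well ($A_R(0) = 1$, $A_L(0) = 0$) and records the smallest value reached by $P = |A_R|^2$. It assumes a fixed-step Runge–Kutta integrator; the names `rhs`, `rk4_step` and `min_right_well_population`, the time horizon and the two sample values of $\mu$ are hypothetical choices. On the path $I = 0$ one has $\cos z = -\mu\sqrt{P(1-P)}$, so for $|\mu|>2$ the motion is confined to $P(1-P)\le1/\mu^2$; for $\mu = 2.5$ this gives $P\ge0.8$.

```python
def rhs(a_r, a_l, mu):
    """Right-hand side of system (39), written as A' = -i * (...)."""
    d_r = -1j * (-a_l + 2.0 * mu * abs(a_r) ** 2 * a_r)
    d_l = -1j * (-a_r + 2.0 * mu * abs(a_l) ** 2 * a_l)
    return d_r, d_l


def rk4_step(a_r, a_l, mu, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1r, k1l = rhs(a_r, a_l, mu)
    k2r, k2l = rhs(a_r + 0.5 * h * k1r, a_l + 0.5 * h * k1l, mu)
    k3r, k3l = rhs(a_r + 0.5 * h * k2r, a_l + 0.5 * h * k2l, mu)
    k4r, k4l = rhs(a_r + h * k3r, a_l + h * k3l, mu)
    a_r = a_r + h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    a_l = a_l + h * (k1l + 2.0 * k2l + 2.0 * k3l + k4l) / 6.0
    return a_r, a_l


def min_right_well_population(mu, tau_max=10.0, h=1e-3):
    """Smallest P = |A_R|^2 reached on [0, tau_max], starting from P(0) = 1."""
    a_r, a_l = 1 + 0j, 0j
    p_min = 1.0
    for _ in range(int(round(tau_max / h))):
        a_r, a_l = rk4_step(a_r, a_l, mu, h)
        p_min = min(p_min, abs(a_r) ** 2)
    return p_min


if __name__ == "__main__":
    # Below the threshold the state reaches the left-hand well (P close to 0);
    # above it the state stays in the region P > 1/2.
    for mu in (1.5, 2.5):
        print("mu =", mu, " min P =", min_right_well_population(mu))
```

The two sample values sit on either side of the critical value; values of $\mu$ very close to $\pm2$ would need a much longer time horizon, since the period diverges there.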
As a result of Lemma 3, it follows that we generically observe a periodic motion with period $\tau_B$ that depends on the parameter $\mu$ and on the initial condition $(P_0,z_0)$. In particular:

Lemma 4. If $(P_0,z_0)\notin\bar D$, where the set $D$ is defined in Lemma 3, then the beating motion between the two wells has period given by
$$ \tau_B = \tau_B(I,\mu) = 4\,\frac{E_K\left(\dfrac{\mu\sqrt{(\mu-4I+2)(\mu-4I-2)}}{\mu^2-\left(1+\sqrt{1+4\mu I}\right)^2}\right)}{\sqrt{\left(1+\sqrt{1+4\mu I}\right)^2-\mu^2}}, \tag{45} $$
where $I = I(P_0,z_0;\mu)$ and $E_K$ is the complete elliptic integral of the first kind.

We close this section with the following remarks.

Remark 6. If $(P_0,z_0)\in D$ then we have a periodic motion within one well with period:
$$ \tau_B = -2i\,\frac{\sqrt{x_2}}{\mu\sqrt{x_1}}\left[E_F\left(\frac{\mu\sqrt{x_1}}{x_2},\frac{x_2}{\mu\sqrt{x_1}}\right) - E_K\left(\frac{x_2}{\mu\sqrt{x_1}}\right)\right], \tag{46} $$
where $E_F$ denotes the incomplete elliptic integral of the first kind and where
$$ x_1 = \mu^2-4-8\mu I+16I^2 \quad\text{and}\quad x_2 = \mu^2-\left(\sqrt{1+4\mu I}+1\right)^2 . $$
Remark 7. The frequency $\nu^{fb} = 1/\tau^{fb}$ of the beating motion, corresponding to the value $I = 0$ of the integral of motion, depends on $|\mu|$, decreases monotonically and vanishes at $|\mu| = 2$; indeed, we have that
$$ \nu^{fb} = \frac{\sqrt{4-\mu^2}}{4E_K\left(i\mu/\sqrt{4-\mu^2}\right)}, $$
which is a monotone decreasing function of $\mu\in[0,2)$. From formulas (106.02) and (112.01) of [3], it follows that
$$ \nu^{fb} \sim \frac12\,\frac{1}{\ln\left(8/\sqrt{4-\mu^2}\right)} \quad\text{as } |\mu|\to2^- . $$
We remark also that the range of frequencies is given by $(\nu_{\min},\nu_{\max}]$, where
$$ \nu_{\min} = \begin{cases} \dfrac1\pi\sqrt{1-|\mu|}, & \text{if } |\mu|<1 \\ 0, & \text{if } |\mu|\ge1 \end{cases} \quad\text{and}\quad \nu_{\max} = \frac1\pi\sqrt{1+|\mu|}, \ \text{for any } \mu. $$
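For a quick numerical look at the beating frequency of Remark 7, the sketch below (an illustration added here, not part of the paper) evaluates $\nu^{fb}$ through the arithmetic-geometric mean. It relies on two standard identities that are assumptions of this example rather than statements of the paper: the imaginary-modulus transformation, which turns the formula above into $\nu^{fb} = 1/\left(2E_K(\mu/2)\right)$, and $E_K(k) = \pi/\left(2\,\mathrm{AGM}(1,\sqrt{1-k^2})\right)$. The function names are hypothetical.

```python
import math


def agm(a, b, max_iter=60):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * max(a, b):
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return 0.5 * (a + b)


def beating_frequency(mu):
    """nu^fb = 1 / (2 K(mu/2)) = AGM(1, sqrt(1 - mu^2/4)) / pi for |mu| < 2."""
    if abs(mu) >= 2.0:
        return 0.0  # no beating motion at or beyond the critical value
    return agm(1.0, math.sqrt(1.0 - 0.25 * mu * mu)) / math.pi


def beating_frequency_asymptotic(mu):
    """Leading behaviour 1 / (2 ln(8 / sqrt(4 - mu^2))) as |mu| -> 2 from below."""
    return 0.5 / math.log(8.0 / math.sqrt(4.0 - mu * mu))


if __name__ == "__main__":
    for mu in (0.0, 0.5, 1.0, 1.5, 1.9, 1.999):
        print(mu, beating_frequency(mu), beating_frequency_asymptotic(mu))
```

At $\mu = 0$ the routine returns $1/\pi$, matching the period $\pi$ of the unperturbed two-level system, and the two functions agree closely near $|\mu| = 2$.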
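In particular we observe that the interval $(\nu_{\min},\nu_{\max}]$ broadens as $\mu$ increases.

3.4. Beating destruction for large non-linearity. Now, we complete the proof of Theorem 2 and of the corollary. To this end we remark that the energy has the form
$$ H = \langle\psi,H_0\psi\rangle + \frac12\,\epsilon\,\langle\psi,W\psi\rangle^2, $$
where, in order to take into account the contribution due to the term $\psi_c$, we observe that
$$ \langle W\rangle^t = \langle\psi,W\psi\rangle = \left(|a_R|^2-|a_L|^2\right)\langle\varphi_R,W\varphi_R\rangle + R_1 + R_2, $$
where
$$ \begin{aligned} R_1 &= a_R\bar a_L\langle\varphi_R,W\varphi_L\rangle + \bar a_Ra_L\langle\varphi_L,W\varphi_R\rangle + |a_L|^2\left(\langle\varphi_L,W\varphi_L\rangle+\langle\varphi_R,W\varphi_R\rangle\right), \\ R_2 &= \langle\psi_c,W\psi_c\rangle + a_R\langle\varphi_R,W\psi_c\rangle + \bar a_R\langle\psi_c,W\varphi_R\rangle + a_L\langle\varphi_L,W\psi_c\rangle + \bar a_L\langle\psi_c,W\varphi_L\rangle, \end{aligned} $$
and
$$ \langle\psi,H_0\psi\rangle = \Omega\left(|a_R|^2+|a_L|^2\right) - \frac12\omega\left(a_R\bar a_L+a_L\bar a_R\right) + R_3, $$
where $R_3 = \langle\psi_c,H_0\psi_c\rangle$.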
From Theorem 3 we have that $a_R(t)$ and $a_L(t)$ are such that for any $\alpha<1$,
$$ 1-\left(|a_R(t)|^2+|a_L(t)|^2\right) = \|\psi_c(t,\cdot)\|^2 = O(\tilde\omega^{2\alpha}) $$
for any $t\in[0,(\tau^\star/\tilde\omega)\ln(1/\tilde\omega)]$, for some fixed $\tau^\star$, and
$$ \sup_{t\in[0,(\tau^\star/\tilde\omega)\ln(1/\tilde\omega)]}\Big|\,|a_{R,L}(t)|-|A_{R,L}(t\tilde\omega/2)|\,\Big| = O(\tilde\omega^\alpha), $$
where $A_{R,L}(\tau)$ are computed in Sect. 3.3. From (3) and since the wave-functions $\varphi_{R,L}$ are localized on just one well [8], it follows that
$$ R_1 = O(\omega), \quad\text{as } \hbar\to0, \tag{47} $$
for any $t\ge0$. Moreover, making use of Theorem 3, we have that
$$ R_2 = O(\tilde\omega^\alpha) \quad\text{and}\quad R_3 = O(\tilde\omega^{2\alpha}) \tag{48} $$
for any $t\in[0,(\tau^\star/\tilde\omega)\ln(1/\tilde\omega)]$ and for some $\tau^\star>0$. From these facts and from (41) it follows that for any $t\in[0,(\tau^\star/\tilde\omega)\ln(1/\tilde\omega)]$ we have
$$ \langle W\rangle^t = (2P-1)\langle\varphi_R,W\varphi_R\rangle + O(\tilde\omega^\alpha), $$
where $P = |A_R|^2$ is the periodic solution given in Lemma 3. Then $\langle W\rangle^t$ is, up to an error of order $O(\tilde\omega^\alpha)$, a periodic function with period $T$ given in Lemma 4. If we remark that
$$ H = \Omega + \frac14\omega\mu - \omega I(P,z,\mu) + O(\tilde\omega^{2\alpha}), $$
where we choose $\alpha>\frac12$, then we have that (10) implies that $(P_0,z_0)\notin\bar D$. From this fact and from the stability result the beating motion between the two wells follows. In contrast, (11) implies that $(P_0,z_0)\in D$; hence, the beating motion disappears. In particular, we observe that the energy corresponding to the beating motion with initial condition $P_0 = 0$ (or $P_0 = 1$) is such that $H^{fb}\approx\Omega+\frac14\mu\omega$. Hence, the beating motion disappears for $|\mu|>2$. Theorem 2 and the corollary are proved.

Acknowledgement. This work is partially supported by the Italian MURST and INDAM-GNFM. V.G. and A.M. are supported by the University of Bologna (funds for selected research topics). V.G. is also supported by the INFN.
References

1. D’Agosta, R., Malomed, B.A., Presilla, C.: Stationary solutions of the Gross–Pitaevskii equation with linear counterpart. Phys. Lett. A 275, 424–434 (2000)
2. Bourgain, J.: Global solutions of nonlinear Schrödinger equations. AMS - Coll. Publ. 46 (1999)
3. Byrd, P.F., Friedman, M.D.: Handbook of elliptic integrals for engineers and physicists. Berlin: Springer-Verlag, 1954
4. Claviere, P., Jona Lasinio, G.: Instability of tunneling and the concept of molecular structure in quantum mechanics: the case of pyramidal molecules and the enantiomer problem. Phys. Rev. A 33, 2245–2253 (1986)
5. Davies, E.B.: Symmetry breaking for a non-linear Schrödinger operator. Commun. Math. Phys. 64, 191–210 (1979)
6. Davies, E.B.: Nonlinear Schrödinger operators and molecular structure. J. Phys. A: Math. and Gen. 28, 4025–4041 (1995)
7. Grecchi, V., Martinez, A.: Non-linear Stark effect and molecular localization. Commun. Math. Phys. 166, 533–548 (1995)
8. Helffer, B., Sjöstrand, J.: Multiple wells in the semi-classical limit I. Comm. P.D.E. 9, 337–408 (1984)
9. Merzbacher, E.: Quantum Mechanics. New York: Wiley Int. Ed., 2nd edition, 1970
10. Nenciu, G.: Adiabatic theorem and spectral concentration. Commun. Math. Phys. 82, 121–135 (1981/82)
11. Nenciu, G.: Linear adiabatic theory, exponential estimates. Commun. Math. Phys. 152, 479–496 (1993)
12. Pratt, R.F.: Spontaneous deformation of hydrogen atom shape in an isotropic environment. J. Phys. France 49, 635–641 (1988)
13. Raghavan, S., Smerzi, A., Fantoni, S., Shenoy, S.R.: Coherent oscillations between two weakly coupled Bose–Einstein condensates: Josephson effects, π oscillations, and macroscopic quantum self-trapping. Phys. Rev. A 59, 620–633 (1999)
14. Simon, B.: Semi-classical analysis of low lying eigenvalues I. Ann. I.H.P. 38, 295–307 (1983)
15. Simon, B.: Semi-classical analysis of low lying eigenvalues II: tunneling. Ann. of Math. 120, 89–118 (1984)
16. Sjöstrand, J.: Projecteurs adiabatique du point de vue pseudo-différentiel. C.R. Acad. Sci. Paris 317, Série I, 217–220 (1993)
17. Vardi, A.: On the role of intermolecular interactions in establishing chiral stability. J. Chem. Phys. 112, 8743–46 (2000)

Communicated by A. Kupiainen
Commun. Math. Phys. 227, 211 – 241 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Weighted Bergman Kernels and Quantization
Miroslav Engliš
MÚ AV ČR, Žitná 25, 11567 Prague 1, Czech Republic. E-mail: [email protected]
Received: 29 December 2000 / Accepted: 14 December 2001
Abstract: Let $\Omega$ be a bounded pseudoconvex domain in $\mathbb{C}^N$, $\phi$, $\psi$ two positive functions on $\Omega$ such that $-\log\psi$, $-\log\phi$ are plurisubharmonic, and $z\in\Omega$ a point at which $-\log\phi$ is smooth and strictly plurisubharmonic. We show that as $k\to\infty$, the Bergman kernels with respect to the weights $\phi^k\psi$ have an asymptotic expansion
$$ K_{\phi^k\psi}(x,y) = \frac{k^N}{\pi^N\phi(x,y)^k\psi(x,y)}\sum_{j=0}^{\infty}b_j(x,y)\,k^{-j}, $$
for $x,y$ near $z$, where $\phi(x,y)$ is an almost-analytic extension of $\phi(x) = \phi(x,x)$ and similarly for $\psi$. Further, $b_0(x,x) = \det[-\partial^2\log\phi(x)/\partial x_j\partial\bar x_k]$. If in addition $\Omega$ is of finite type, $\phi$, $\psi$ behave reasonably at the boundary, and $-\log\phi$, $-\log\psi$ are strictly plurisubharmonic on $\Omega$, we obtain also an analogous asymptotic expansion for the Berezin transform and give applications to the Berezin quantization. Finally, for $\Omega$ smoothly bounded and strictly pseudoconvex and $\phi$ a smooth strictly plurisubharmonic defining function for $\Omega$, we also obtain results on the Berezin–Toeplitz quantization.

The author's research was supported by GA ČR grant no. 201/00/0208 and GA AV ČR grant A1019005.

Introduction

Let $\Omega$ be a domain in $\mathbb{C}^N$, $\rho$ a positive continuous function on $\Omega$, and $K_\rho$ the reproducing kernel of the weighted Bergman space $A^2(\Omega,\rho)$ of all holomorphic functions on $\Omega$ square-integrable with respect to the measure $\rho(z)\,dz$, $dz$ being the Euclidean volume element in $\mathbb{C}^N$; we call $K_\rho$ the weighted Bergman kernel corresponding to $\rho$, and for $\rho\equiv1$ we will speak simply of the Bergman kernel $K_\Omega$ of $\Omega$. The Berezin transform $B_\rho$ is the integral operator defined by
$$ B_\rho f(y) = \int_\Omega f(x)\,\frac{|K_\rho(x,y)|^2}{K_\rho(y,y)}\,\rho(x)\,dx \tag{1} $$
for all $y$ for which $K_\rho(y,y)\neq0$. In terms of the operator $M_f$ of multiplication by $f$ on the space $L^2(\Omega,\rho\,dz)$ this can be rewritten as
$$ B_\rho f(y) = \frac{\langle M_fK_\rho(\cdot,y),K_\rho(\cdot,y)\rangle}{\|K_\rho(\cdot,y)\|^2}, $$
from which it is immediate that the integral (1) converges, for instance, for any bounded measurable function $f$. The Berezin transform was first introduced by F. A. Berezin [Ber] in the context of quantization of Kähler manifolds. More specifically, let $\phi$ be a positive function on $\Omega$ such that $-\log\phi$ is strictly plurisubharmonic, and set
$$ g_{j\bar k} = \partial^2(-\log\phi)/\partial z_j\partial\bar z_k \tag{2} $$
and $\chi = \det(g_{j\bar k})$ (so that $ds^2 = g_{j\bar k}\,dz_j\,d\bar z_k$ is the Kähler metric with potential $-\log\phi$ and $\chi$ the corresponding volume density). For $\Omega$ a bounded symmetric domain in $\mathbb{C}^N$ and $\phi(z) = 1/K_\Omega(z,z)$ (so that $ds^2$ is the Bergman metric), Berezin showed that for all $m\ge1$
$$ K_{\phi^m\chi}(x,y) = p(m)\,\phi(x,y)^{-m}, \tag{3} $$
where $\phi(x,y) = 1/K_\Omega(x,y)$ is a function on $\Omega\times\Omega$ holomorphic in $x,\bar y$ such that $\phi(x,x) = \phi(x)$, and $p$ is a polynomial of degree $N$ which depends only on $\Omega$; and that
$$ B_{\phi^m\chi}f(y) = f(y) + \frac1m\tilde\Delta f(y) + O\left(\frac{1}{m^2}\right) \tag{4} $$
as $m\to\infty$, where $\tilde\Delta$ is the Laplace–Beltrami operator of the metric $ds^2$ on $\Omega$. Using (4), he was then able to construct a quantization procedure for mechanical systems whose phase-space is $\Omega$ with the Bergman metric. Later the present author showed that to get (4) it suffices that (3) holds only asymptotically as $m\to\infty$ in a certain sense, and used this to extend the range of applicability of Berezin's original procedure to all plane domains with the Poincaré metric [E1], to some complete Reinhardt domains in $\mathbb{C}^2$ with natural rotation-invariant Kähler metrics [E2], and finally to any strictly pseudoconvex domain $\Omega$ with real-analytic boundary and $\phi$ a real-analytic defining function for $\Omega$ such that $-\log\phi$ is strictly plurisubharmonic [E6]. In fact, [E6] even dealt with the more general setting of weights of the form $\rho = \phi^m\psi^M$ with $-\phi$, $-\psi$ two $C^\omega$ defining functions of a strictly pseudoconvex domain $\Omega$ such that $-\log\phi$, $-\log\psi$ are plurisubharmonic, $M$ fixed and $m\to\infty$. Then (3), with $\phi^m\psi^M$ in place of $\phi^m\chi$, holds asymptotically for $(x,y)$ near the diagonal, and (4) holds for any $f\in L^\infty(\Omega)$ which is smooth in a neighbourhood of $y$.

The aim of the present paper is to improve these results by relaxing the hypotheses of real-analyticity and of $\phi$, $\psi$ being defining functions. For a function $f$ on a domain in $\mathbb{C}^n$, we say that $f$ is almost analytic at $x = a$ if $\partial f/\partial\bar x_j$, $j = 1,\dots,n$, vanish at $a$ together with their partial derivatives of all orders. It is known that any $C^\infty$ function $\phi(x)$ possesses a (non-unique) almost analytic extension $\phi(x,y)$ such that $\phi(x,y)$ is almost-analytic in $x$ and $\bar y$ at all points of the diagonal $x = y$, and $\phi(x,x) = \phi(x)$.
Further, if $\phi(x)$ is real-valued, then the extension may be chosen so that $\phi(y,x) = \overline{\phi(x,y)}$ (just replace $\phi(x,y)$ by $\frac12\left(\phi(x,y)+\overline{\phi(y,x)}\right)$); in the sequel, we will always assume that an extension with this additional property has been chosen for a real-valued $\phi(x)$. We now have the following results.
Bergman Kernels
Theorem 1. Let $\Omega$ be a bounded pseudoconvex domain in $\mathbf{C}^N$, $\phi$, $\psi$ two bounded positive continuous functions on $\Omega$ such that $-\log\phi$, $-\log\psi$ are plurisubharmonic, and let $x_0 \in \Omega$ be a point in a neighbourhood of which $\phi$ and $\psi$ are $C^\infty$ and $-\log\phi$ is strictly plurisubharmonic. Fix an integer $M \ge 0$. Then there is a smaller neighbourhood $U$ of $x_0$ such that the asymptotic expansion
\[
K_{\phi^k\psi^M}(x,y) = \frac{k^N}{\pi^N\,\phi(x,y)^k\,\psi(x,y)^M}\,\sum_{j=0}^{\infty} b_j(x,y)\,k^{-j} \tag{5}
\]
holds uniformly for all $x, y \in U$ as $k \to \infty$, in the sense that for each $m > 0$,
\[
\sup_{x,y\in U}\,\Big|\,\phi(x)^{k/2}\phi(y)^{k/2}\,K_{\phi^k\psi^M}(x,y) - \frac{k^N\,\phi(x)^{k/2}\phi(y)^{k/2}}{\pi^N\,\phi(x,y)^k\,\psi(x,y)^M}\,\sum_{j=0}^{N+m-1} b_j(x,y)\,k^{-j}\,\Big| = O(k^{-m}) \tag{6}
\]
as $k \to \infty$. Here $\phi(x,y)$, $\psi(x,y)$ are fixed almost-analytic extensions of $\phi(x)$ and $\psi(x)$ to $U \times U$, respectively. The coefficients $b_j(x,y) \in C^\infty(U \times U)$ are almost-analytic at $x = y$, and their jets at a point $(x,x)$ on the diagonal depend only on the jets of $\phi$ and $\psi$ at $x$. In particular,
\[
b_0(x,x) = \det\Big[\partial\bar\partial \log\frac{1}{\phi(x)}\Big]. \tag{7}
\]

In the situation of the last theorem, consider the domain
\[
\tilde\Omega = \Big\{(z_1, z_2, z_3) \in \Omega \times \mathbf{C}^M \times \mathbf{C} :\ \frac{|z_3|^2}{\phi(z_1)} + \frac{|z_2|^2}{\psi(z_1)} < 1\Big\}. \tag{8}
\]
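As a sanity check of the shape of expansion (5) and of formula (7), one can look at a standard example that is not taken from this paper: the unit disc in $\mathbf{C}$ (so $N = 1$) with $\phi(z) = 1 - |z|^2$ and $M = 0$. There the monomials $z^n$ are orthogonal in the weighted space, the kernel on the diagonal has the closed form $\frac{k+1}{\pi}(1-|x|^2)^{-(k+2)}$, and so $b_0(x,x) = (1-|x|^2)^{-2}$, in agreement with (7). The sketch below sums the orthonormal-basis series numerically and compares it with that closed form; the function names are illustrative only.

```python
import math


def weighted_bergman_disc(r2, k, terms=400):
    """Diagonal value K(x, x) on the unit disc for the weight (1 - |z|^2)^k.

    r2 is |x|^2. The monomials z^n are orthogonal with squared norm
    pi * n! * k! / (n + k + 1)!, so
    K(x, x) = (1/pi) * sum_n |x|^(2n) * (n + k + 1)! / (n! * k!).
    """
    total = 0.0
    for n in range(terms):
        # (n + k + 1)! / (n! k!) written as (k + 1) * C(n + k + 1, n)
        coeff = (k + 1) * math.comb(n + k + 1, n)
        total += coeff * r2 ** n
    return total / math.pi


def closed_form(r2, k):
    """Closed form (k + 1) / pi * (1 - |x|^2)^(-(k + 2)) of the same kernel."""
    return (k + 1) / math.pi * (1.0 - r2) ** (-(k + 2))


def leading_ratio(r2, k):
    """pi * phi(x)^k * K(x, x) / k, which (5) predicts tends to b_0(x, x)."""
    return math.pi * (1.0 - r2) ** k * weighted_bergman_disc(r2, k) / k


if __name__ == "__main__":
    r2 = 0.25
    for k in (5, 50):
        print(k, leading_ratio(r2, k), (1.0 - r2) ** -2)
```

In this example the whole expansion terminates: the ratio equals $(1 + 1/k)\,b_0(x,x)$ exactly, so $b_1 = b_0$ and $b_j = 0$ for $j \ge 2$.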
Recall that for a domain $D$ in $\mathbf{C}^n$, a boundary point $z \in \partial D$ is called smooth if in some neighbourhood of $z$, $\partial D$ is a $C^\infty$-submanifold of $\mathbf{C}^n$; the domain $D$ is called smoothly bounded if it is bounded and all its boundary points are smooth. A smooth boundary point $z$ is said to be of finite type $\le m$ if there is no complex analytic variety passing through $z$ which has order of contact with $\partial D$ at $z$ bigger than $m$. (Thus, for instance, a strictly pseudoconvex boundary point is of type 2.) Finally, a smoothly bounded domain is said to be of finite type if all its boundary points are of finite type.

Theorem 2. Assume that the hypotheses of Theorem 1 are fulfilled, and that in addition $\tilde\Omega$ is smoothly bounded and of finite type. (This implies, in particular, that $\phi, \psi \in C^\infty(\Omega)$.¹) Then for any $f \in L^\infty(\Omega)$ which is $C^\infty$ in a neighbourhood of $x_0$, there is an asymptotic expansion
\[
B_{\phi^k\psi^M} f(y) = \sum_{j=0}^{\infty} Q_j f(y)\,k^{-j}, \tag{9}
\]
uniformly for all $y$ in a neighbourhood of $x_0$, where $Q_j$ are linear differential operators whose coefficients involve only the derivatives of $\phi$, $\psi$ at $y$ and $Q_0$ is the identity operator.

¹ But not necessarily $C^\infty(\overline\Omega)$!
M. Engliš
We remark that in the applications to the Berezin quantization, $-\log\phi$ is the potential of the Kähler metric, and thus is automatically strictly plurisubharmonic on all of $\Omega$. Strictly speaking, the nonnegative integer $M$ in Theorem 1 is redundant (one can rechristen $\psi^M$ to $\psi$ and take $M = 1$); we preferred to keep it since the same weights $\phi^k\psi^M$ appear in Theorem 2, where $M$ already seems to be essential (we do not know whether the hypotheses of Theorem 2 are also satisfied for the pair $\psi^M$, 1 if they are satisfied for $\psi$, $M$). We should also remark that the hypotheses of smooth boundedness and finite type of $\tilde\Omega$ are still not completely satisfactory: although they cover a lot of situations, they still exclude some important cases like, for instance, the Berezin quantization of a general (i.e. non-symmetric) domain with the Bergman or the Cheng–Yau metric; cf. Remark (2) after Theorem 11 below.

The whole approach can also be adapted to arbitrary Kähler manifolds in place of domains $\Omega$ in $\mathbf{C}^N$ [Pe], and sections of line bundles in place of functions. The function $\phi$ then defines the metric structure of the line bundle, and $\partial\bar\partial\log\phi$ is the corresponding curvature form. For compact Kähler manifolds and $-\log\phi$ strictly plurisubharmonic on all of $\Omega$ (i.e. the line bundle of strictly negative curvature) the analogue of Theorem 1 has been obtained independently by Zelditch [Ze] for $x = y$ and by Catlin [Ca] for general $x, y$; and the analogue of Theorem 2 in this setting was established by Karabegov and Schlichenmaier [KS].

In [BMS] and [Sch] the authors also obtain (still in the context of compact manifolds) somewhat stronger results concerning the Berezin–Toeplitz quantization, and we finish by observing that the same results can also be obtained in our noncompact situation. Recall that, quite generally, the Toeplitz operator $T_F^{(\rho)}$ with symbol $F \in L^\infty(\Omega)$ is the operator on $A^2(\Omega,\rho)$ given by the recipe
\[
T_F^{(\rho)} f(y) = \int_\Omega f(x)\,F(x)\,K_\rho(y,x)\,\rho(x)\,dx,
\]
or, equivalently,
\[
T_F^{(\rho)} f = P_\rho(Ff),
\]
where $P_\rho$ is the orthogonal projection of $L^2(\Omega,\rho)$ onto $A^2(\Omega,\rho)$. For simplicity, we state our result on the Berezin–Toeplitz quantization only for the weights which are of most interest to us, viz. $\rho = \phi^m\chi$ with $\chi = \det[-\partial\bar\partial\log\phi]$.

Theorem 3. Let $\Omega$ be a smoothly bounded strictly pseudoconvex domain in $\mathbf{C}^N$ and $-\phi$ a smooth defining function for $\Omega = \{\phi > 0\}$ such that $-\log\phi$ is strictly plurisubharmonic. Then:
(i) for any $f \in C^\infty(\overline\Omega)$, $\|T_f^{(\phi^m\chi)}\| \to \|f\|_\infty$ as $m \to \infty$;
(ii) there exist bilinear operators $C_j : C^\infty(\overline\Omega) \times C^\infty(\overline\Omega) \to C^\infty(\overline\Omega)$ ($j = 0, 1, 2, \dots$) such that for any $f, g \in C^\infty(\overline\Omega)$ and any integer $k$,
\[
\Big\|\,T_f^{(\phi^m\chi)}\,T_g^{(\phi^m\chi)} - \sum_{j=0}^{k} m^{-j}\,T_{C_j(f,g)}^{(\phi^m\chi)}\,\Big\| = O(m^{-k-1}) \tag{10}
\]
as $m \to \infty$. Further, $C_0(f,g) = fg$ and $C_1(f,g) - C_1(g,f) = i\{f,g\}$, the Poisson bracket of $f$ and $g$ with respect to the metric (2).
The result of the kind appearing in Theorem 3 was first obtained for $\Omega$ a domain in $\mathbf{C}$ with the Poincaré metric by Klimek and Lesniewski [KL], using uniformization techniques, and for $\Omega$ a bounded symmetric domain with the invariant (Bergman) metric and $\Omega = \mathbf{C}^n$ with the Euclidean metric by Borthwick, Lesniewski and Upmeier [BLU] and Coburn [Co], respectively, in both cases with the aid of the computational machinery available thanks to the specific nature of the domain and metric. For a compact manifold with an arbitrary Kähler metric, Theorem 3 was proved by Bordemann, Meinrenken and Schlichenmaier [BMS]. In our case (i) is a fairly straightforward consequence of Theorem 2, while (ii) follows, as in [BMS] and [Sch], from the Boutet de Monvel–Guillemin calculus of generalized Toeplitz operators [BG]; see also [Gu]. It turns out that $C_j$ are, in fact, differential operators (see Corollary 15); for compact Kähler manifolds, this was proved in [KS].

As in [E6], our method of proof of Theorem 1 is based on the analysis of the Bergman kernel $\tilde K$ of the Forelli–Rudin domain $\tilde\Omega$ (8) over $\Omega$; (5) is then obtained from Fefferman's asymptotic expansion of $\tilde K$ near the boundary. This is done in Sect. 2, after establishing some localization theorems for the Bergman kernel in Sect. 1. Theorem 2 is proved in Sect. 3, and its applications to quantization are described in Sect. 4. The Berezin–Toeplitz quantization is discussed in Sect. 5. At the end of each section we provide various remarks, comments on related developments, open problems, etc.

It is perhaps appropriate to point out briefly what are the new ingredients in Sects. 1–3 here against [E6]. In [E6], we proved a stronger assertion than (6) (recalled in (24) below) under stronger hypotheses on $\Omega$, $\phi$ and $\psi$. Here, by enhancing the treatment of the technical matters (cf. Lemmas 7 and 8 in Sect. 2), we prove the weaker assertion (6) under weaker hypotheses, and then show that (6) is still sufficient to yield the conclusion of Theorem 2 under the additional assumption of smooth boundedness and finite type of $\tilde\Omega$.

Throughout the paper, "psh" is an abbreviation for "plurisubharmonic".

1. Preliminaries

Our starting point is the following proposition, reproduced here from [E6] (see also [BFS]), which relates the weighted Bergman kernels $K_{\phi^k\psi^M}$ on $\Omega$ to the unweighted Bergman kernel of the domain $\tilde\Omega$.

Proposition 4. Let $\Omega$ be an arbitrary domain in $\mathbf{C}^N$ (it need not be bounded), $\phi$, $\psi$ two positive continuous functions on $\Omega$, $M$ a nonnegative integer, and $\tilde\Omega$ the domain defined by (8). Then the Bergman kernel $\tilde K := K_{\tilde\Omega}$ of $\tilde\Omega$ is given by
\[
\tilde K(z;t) = \sum_{k,l=0}^{\infty} \frac{(k+l+M+1)!}{k!\,l!\,\pi^{M+1}}\,K_{\psi^{l+M}\phi^{k+1}}(z_1,t_1)\,\langle z_2, t_2\rangle^{l}\,(z_3\bar t_3)^{k}. \tag{11}
\]
The series converges uniformly on compact subsets of $\tilde\Omega \times \tilde\Omega$.

Note that by the familiar criterion for Hartogs domains, $\tilde\Omega$ is pseudoconvex if and only if $\Omega$ is pseudoconvex and $-\log\phi$, $-\log\psi$ are psh.

Proof. Arguing as in [Lig], Proposition 0, shows that
\[
\tilde K(z;t) = \sum_{\alpha,\beta} K_{w_{\alpha\beta}}(z_1,t_1)\,z_2^{\beta}\,\bar t_2^{\,\beta}\,z_3^{\alpha}\,\bar t_3^{\,\alpha},
\]
where the summation is over all multiindices α ∈ N, β ∈ NM , N = {0, 1, 2, . . . }, and
wαβ (z1 ) =
(z2 , z3 ) :
= Since
|α|=k
|z2 |2 ψ(z1 )
+
|z3 |2 φ(z1 )
0}, $\phi$ is $C^\infty$ in a neighbourhood of $\overline\Omega$, $\nabla\phi \ne 0$ on $\partial\Omega$, and the Levi matrix $(-\partial^2\phi/\partial z_j\partial\bar z_k)$ is positive definite on the complex tangent space (the last condition is equivalent to the Monge–Ampère matrix in (13) below having $n$ positive and 1 negative eigenvalue, for any $z \in \partial\Omega$). Then there exist functions $a(x,y)$, $b(x,y)$, $\phi(x,y) \in C^\infty(\mathbf{C}^n \times \mathbf{C}^n)$ such that
(a) $a(x,y)$, $b(x,y)$, $\phi(x,y)$ are almost-analytic in $x, \bar y$ in the sense that $\partial\phi(x,y)/\partial\bar x$ and $\partial\phi(x,y)/\partial y$ have a zero of infinite order at $x = y$, and similarly for $a(x,y)$ and $b(x,y)$;
(b) $\phi(x,x) = \phi(x)$;
(c) for $x \in \partial\Omega$,
\[
a(x,x) = \frac{n!}{\pi^n}\,J[\phi](x) > 0, \tag{12}
\]
where $J[\phi]$ is the Monge–Ampère determinant
\[
J[\phi] = -\det\begin{pmatrix} -\phi & -\partial\phi/\partial\bar z_k \\ -\partial\phi/\partial z_j & -\partial^2\phi/\partial z_j\partial\bar z_k \end{pmatrix} \tag{13}
\]
whose positivity follows from the strong pseudoconvexity of $\partial\Omega$;
(d) the Bergman kernel of $\Omega$ is given by the formula
\[
K(x,y) = \frac{a(x,y)}{\phi(x,y)^{n+1}} + b(x,y)\,\log\phi(x,y) \tag{14}
\]
for $(x,y)$ in the region $\{|x - y| < \varepsilon,\ \operatorname{dist}(x,\partial\Omega) < \varepsilon\}$, where $\varepsilon > 0$ is sufficiently small;
(e) outside any such region the Bergman kernel is $C^\infty$ up to the boundary of $\Omega \times \Omega$;
(f) if the boundary $\partial\Omega$ is even real-analytic, then the functions $a(x,y)$, $b(x,y)$ and $\phi(x,y)$ can in fact be chosen to be holomorphic in $x, \bar y$ in a neighbourhood of the boundary diagonal $\{(x,x);\ x \in \partial\Omega\}$ in $\mathbf{C}^n$, and outside any such region the Bergman kernel is holomorphic in $x, \bar y$ in a neighbourhood of $\overline\Omega \times \overline\Omega$.
The original proofs in [Fef] and [BS] deal only with (a)–(e); part (f) is due to Kashiwara [Kas] and Bell [Be1]. Observe that if $\phi'(x,y)$ is another function satisfying (a) and (b), then $h = (\phi'/\phi) - 1$ vanishes at $x = y$ to an infinite order; thus (14) remains in force with $\phi'$ and $a' = (1+h)^{n+1}a - \phi'^{\,n+1}\,b\,\log(1+h)$ in the place of $\phi$ and $a$ (substitute $\phi = \phi'/(1+h)$ into (14)). It follows that even for any function $\phi(x,y)$ satisfying (a) and (b) there exist $a(x,y)$, $b(x,y)$ such that the conclusions (a)–(d) hold. This allows us to work with a convenient $\phi(x,y)$ in concrete situations later on: for instance, if $\phi(x)$ is of the form $|x_1|^2 +$ (a function of $x_2, \dots, x_n$), we can take $\phi(x,y) = x_1\bar y_1 +$ (a function of $x_2, \dots, x_n, \bar y_2, \dots, \bar y_n$).

We will find convenient the following two (probably well-known) "localization lemmas", which can be used to obtain a local variant of Fefferman's theorem (see [E6]).

Lemma 5. Let $\Omega_1 \subset \Omega$ be two bounded pseudoconvex domains and $U$ a neighbourhood of a point $x_0 \in \partial\Omega$ such that $U \cap \partial\Omega_1 = U \cap \partial\Omega$ and the piece of common boundary $U \cap \partial\Omega$ is smooth and strictly pseudoconvex. Then the difference $K_{\Omega_1}(x,y) - K_\Omega(x,y)$ is $C^\infty$ on $(U \cap \overline{\Omega_1}) \times (U \cap \overline{\Omega_1})$.

Lemma 6. Let $\Omega$ be a pseudoconvex domain (possibly unbounded) and $x_0 \in \partial\Omega$ a strictly pseudoconvex point of its boundary. Then there exists a bounded strictly pseudoconvex domain $\Omega_1 \subset \Omega$ such that $\partial\Omega$ and $\partial\Omega_1$ coincide in a neighbourhood of $x_0$. Further, if $x_0$ is a smooth boundary point, then $\Omega_1$ can be chosen to be smoothly bounded.

Proof of Lemma 5. For $\Omega$, $\Omega_1$ smoothly bounded and strictly pseudoconvex, this is the content of Lemma 1 on p. 6 in [Fef]. The local version given here follows in the same way by J. J. Kohn's local regularity theorems for the $\bar\partial$-operator and subelliptic estimates at $x_0$ ([Ko], Theorems 1.13 and 1.16) by the argument as on p. 469 in [Be2], cf. in particular the formula (2.1) there.

Proof of Lemma 6. Let $u$ be a defining function for $\Omega = \{u < 0\}$ strictly psh in a neighbourhood $B(x_0,\delta)$ of $x_0$ (see e.g. [Krn], Proposition 3.2.1). Choose a $C^\infty$ function $\theta : [0,1) \to \mathbf{R}_+$ such that $\theta \equiv 0$ on $[0,1/2]$, $\theta'' \ge 0$ on $[1/2,1)$ and $\theta(1-) = +\infty$. Set $\Omega_1 = \{x : u(x) + \theta(|x - x_0|^2/\delta^2) < 0\}$. Then $\Omega_1 \subset \Omega \cap B(x_0,\delta)$, $\partial\Omega_1$ coincides with $\partial\Omega$ in $B(x_0,\delta/2)$, and as $\theta'' \ge 0$, $\theta(|x - x_0|^2/\delta^2)$ is psh, so $\Omega_1$ is strictly pseudoconvex. Finally, if $u$ is $C^\infty$ in $B(x_0,\delta)$, then $\Omega_1$ is smoothly bounded.

It turns out that the boundedness hypothesis on $\Omega$ in Lemma 5 is, in fact, unnecessary: see [E7], Sect. 4, where also the full details of the proof can be found. The conclusion of the lemma fails, however, if $U \cap \partial\Omega = U \cap \partial\Omega_1$ is only assumed to be weakly pseudoconvex: for instance, take $\Omega = \{\max(|z_1|,|z_2|) < 1\} \subset \mathbf{C}^2$, $\Omega_1 = \{\max(|z_1|, 2|z_2|) < 1\}$, and $x_0 = (1,0)$. Similarly, the hypothesis that $\Omega$ be pseudoconvex cannot be dispensed with: an example is $\Omega_1 = \{z \in \mathbf{C}^2 : |z| < 2\}$, $\Omega = \Omega_1 \cup \{|z_1| < 3,\ 1 < |z_2| < 3\}$, $x_0 = (2,0)$. On the other hand, the hypothesis that $\Omega_1$ be pseudoconvex is not needed in the proof and can be omitted (but we won't have any use for this refinement in the sequel). A similar construction as in Lemma 6 was used by Bell [Be2] (cf. also the references therein).

Remark. There is also a "local version" of part (f) of Fefferman's theorem: namely, if $\Omega$ is bounded pseudoconvex and $z \in \partial\Omega$ is a strictly pseudoconvex and real-analytic boundary point (i.e. $\partial\Omega$ is a $C^\omega$-submanifold of $\mathbf{C}^n$ in some neighbourhood of $z$), then there exists a neighbourhood $U$ of $z$ and functions $a(x,y)$, $b(x,y)$ and $\phi(x,y)$ on $U \times U$,
holomorphic in $x, \bar y$, such that $-\phi(x,x)$ is a local defining function for $\Omega$ on $U$ and (12) and (14) hold. See [Kan], §9, in particular the Theorem on p. 94. (The author is obliged to M. Kashiwara and Gen Komatsu for this information.)

2. Weighted Bergman Kernels

We now use Fefferman's asymptotic expansion together with Proposition 4 to determine the asymptotics of $K_{\phi^m\psi^M}(z,z)$ as $z$ and $M$ are fixed and $m \to \infty$. Let us start with two technical lemmas.

Let $\phi(x,y)$ be a function in $C^\infty(\Omega \times \Omega)$ almost-analytic in $x, \bar y$ on the diagonal and such that $\phi(x,y) = \overline{\phi(y,x)}$, $\phi(x,x) =: \phi(x) > 0$, and $-\log\phi(x)$ is strictly psh at some point $x_0$. The last condition implies that there exists $c > 0$ and a small ball $U$ centered at $x_0$ such that
\[
\frac{\phi(x)\,\phi(y)}{|\phi(x,y)|^2} \le 1 - c\,|x - y|^2 \qquad \forall x, y \in U.
\]
Denote D = {(x, y, τ ) ∈ U × U × C : |τ |2
0 Wn ) = 0. −1 ˆ converges weakly to a ˆ Proof. By compactness, a subsequence of n1 n−1 i i=0 ( i (A)) probability measure µ˜ on A (where A is as in Lemma 3.1). Since the restrictions on ηi become milder and milder as i → ∞, µ˜ is an invariant measure for X . By construction, all the ˆ i are supported on supp µ, so we must have µ˜ = µ, for we know from Theorem A(1) that all the other ergodic invariant measures have their supports bounded away from supp µ. Let N = N (u0 ) be such that d(u0 , WN ) < ε, where ε < δ is a small positive number to be determined. Claim 2. For all k ≥ 0 and u ∈ WkN , ∃u ∈ W(k+1)N such that u − u < γ kN ε. Proof. The claim is true for k = 0 by choice of N . We prove it for k = 1: Let u0 ∈ WN be such that u0 − u0 < ε, and fix an arbitrary u ∈ WN . By definition, there exist ηi ∈ Kr−Mδγ i such that u = uN (η0 , · · · , ηN−1 ). We wish to use the proximity of u0 to u0 and the Matching Lemma to produce (η0 , · · · , ηN −1 ) with the property that ) ∈ W2N and uN − uN < εγ N . To obtain the first property, it uN (η0 , · · · , ηN−1 is necessary to have ηi ∈ Kr−Mδγ i+N for all i < N. We proceed as follows: since u0 − u0 < ε and η0 ∈ Kr−Mδ , ∃η0 ∈ Kr−Mδ+Mε such that u1 (η0 ) − u1 (η0 ) < εγ ; similarly ∃η1 ∈ Kr−Mδγ +Mεγ such that u2 (η0 , η1 ) − u1 (η0 , η1 ) < εγ 2 , and so on. (See the proof of Lemma 3.2.) Thus ηi ∈ Kr−Mγ i (δ−ε) , and assuming ε is sufficiently small that δγ N < (δ − ε), we have ηi ∈ Kr−Mδγ i+N . To prove the assertion for k = 2, we pick an arbitrary u ∈ W2N , which, by definition, is equal to vN from some v0 ∈ WN . Since we have shown that there exists v0 ∈ W2N with v0 − v0 < γ N ε, it suffices to ∈W 2N ε. repeat the argument above to obtain vN 3N with vN − vN < γ Claim 3. There exists k1 = k1 (u0 ) s.t. for k ≥ k1 , P kN (B|u) ≥ ˆ kN (B(u, ˜ 2ε˜ )) > 0 for all u ∈ H with u − u0 < δ. Proof. Let N (W, ε) denote the ε-neighborhood of W ⊂ H . 
If follows from Claim 2 that if NkN := N (WkN , 2ε ki=0 γ iN ), then NkN ⊂ N(k+1)N for all k. Moreover, the ergodicity of (X N , µ) together with an observation similar to that in Claim 1 shows that ˜ 4ε˜ ) = ∅ for large enough k. If the closure of ∪k NkN contains suppµ. Thus NkN ∩ B(u, ∞ iN ˜ 2ε˜ )) > 0. Now for u with u − u0 < δ, the entire 2ε i=1 γ < 4ε˜ , then ˆ kN (B(u, restricted distribution ˆ n starting from u0 can be coupled to a part of the (unrestricted) ˜ 2ε˜ )). distribution starting from u. Thus for sufficiently large n, P n (B|u) ≥ ˆ n (B(u,
N. Masmoudi, L.-S. Young (1)
(n)
To finish, we cover supp µ with a finite number of δ-balls centered at u0 , . . . , u0 , (i) (i) and choose N0 = kˆ1 Nˆ , where kˆ1 = maxi k1 (u0 ) and Nˆ = ;i N (u0 ). The lemma is ε ˜ ˜ 2 )), where ˆ N0 is the restricted distribution starting proved with α0 = mini ˆ N0 (B(u, (i) from u0 . From Lemma 3.2, we see that associated with each pair of points (u0 , u0 ) with u0 − u0 < δ, there is a cascade of matchings between un and un , leading to the definition of a measure-preserving map : : 9 := K 2r × Kr(1− 1 γ ) × Kr(1− 1 γ 2 ) × · · · → K N 2
2
with the property that for η ∈ 9, ui (η) − ui (:(η)) ≤ γ i u0 − u0
for all i ≤ n.
The main goal in the next proof is, in a sense, to extend : to all of K N by attempting repeatedly to match the orbits that have not yet been matched. Proof of Theorem A(2). We consider for simplicity the case N0 = 1. Let u0 , u0 ∈ supp µ, and let n and n denote the distributions of un and un respectively. We seek to define a measure-preserving map : : K N → K N and to estimate the difference between n and n by In := f d n − f d n ≤ |f (un (η)) − f (un (:(η)))| dν N (η) . Let B be a ball of diameter δ centered at some point in suppµ. By Lemma 3.3, P (B|u0 ) ≥ α0 , and P (B|u0 ) ≥ α0 . Matching u1 ∈ B to u1 ∈ B, we define a measurepreserving map :(1) : 9˜ 1 → K for some 9˜ 1 ⊂ K with |9˜ 1 | = α0 . This extends, by the Matching Lemma, to a measure-preserving map : : 91 = 9˜ 1 × 9 → K N . The map :|91 represents the cascade of future couplings initiated by :(1) . Suppose now that : has been defined on ∪k≤n 9k , where 9k is the set of η matched at step k. More precisely, 91 , 92 , · · · , 9n are disjoint subsets of K N , and each 9k is of the form 9k = 9˜ k × 9 for some 9˜ k ⊂ K k ; the matching of uk and uk in B that takes place at step k defines a map :(k) : 9˜ k → K k , while the cascade of future matchings initiated by :(k) results in the definition of : : 9˜ k × 9 → K N . We now explain how to ˜ n = K n \ ∪k≤n 9 (n) , where 9 (n) = 9˜ k × 9 (n−k−1) is the first n-factor define 9n+1 . Let G k k ˜ n ; the in 9k . Consider the restricted distribution ˜ n+1 defined by (η0 , · · · , ηn−1 ) ∈ G corresponding distribution ˜ n+1 is defined similarly. By Lemma 3.3, an α0 -fraction of these two distributions can be matched, defining an immediate matching :(n+1) : ˜ n × K and |9˜ n+1 | = α0 |G ˜ n |. Future couplings that result 9˜ n+1 → K n+1 with 9˜ n+1 ⊂ G (n+1) N ˜ define : : 9n+1 → K with 9n+1 = 9n+1 × 9. from : ˜ n ) decreases exponentially. 
This requires a little argument, for even We claim that ν n (G though at each step a fraction of α0 of what is left is matched, our matchings are “leaky”, (n) meaning not every orbit defined by a sequence in 9k can be matched to something ˜ n ), we write K N \∪k≤n 9k as the disjoint reasonable at the (n+1st ) step. To estimate ν n (G N ˜ union Gn ∪ Hn , where Gn = Gn × K . The dynamics of (Gn , Hn ) → (Gn+1 , Hn+1 ) are as follows: An α0 -fraction of Gn leaves Gn at the next step; of this part, a fraction of
Ergodic Theory of Infinite Dimensional Systems
;i≥0 (1 − 21 γ i )D (recall that D is the dimension of V ) goes into 9n+1 (see Lemma 3.2) while the rest goes into Hn+1 . At the same time, a fraction of Hn returns to Gn+1 . We claim that this fraction is bounded away from zero for all n. To see this, consider one 9k (n) (n) (n+1) at a time, and observe (from the definition of 9k ) that |(9k × K) \ 9k | ∼ const n−k ˜ |9k |γ . Combinatorial Lemma. Let a0 , b0 > 0, and suppose that an and bn satisfy recursively an+1 ≥ (1 − α0 )an + α1 bn
and
bn+1 ≤ (1 − α1 )bn + α0 an
for some $0 < \alpha_0, \alpha_1 < 1$. Then there exists $c > 0$ such that $a_n / b_n > c$ for all $n$.
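One way to see why such a bound is plausible (a sketch under our own reading, not the authors' argument): dividing the two recursive inequalities gives $a_{n+1}/b_{n+1} \ge g(a_n/b_n)$ with $g(r) = \frac{(1-\alpha_0)r + \alpha_1}{(1-\alpha_1) + \alpha_0 r}$, and $g$ is a monotone Möbius map on $r \ge 0$, so its values lie between $g(0) = \alpha_1/(1-\alpha_1)$ and $g(\infty) = (1-\alpha_0)/\alpha_0$, both positive. The helper names below are hypothetical; the code iterates the extremal case in which both inequalities are equalities and compares the ratios with the candidate constant $\min(a_0/b_0,\ g(0),\ g(\infty))$.

```python
def ratio_lower_bound(a0, b0, alpha0, alpha1):
    """Candidate constant c: min of the initial ratio, g(0) and g(infinity)."""
    g_at_zero = alpha1 / (1.0 - alpha1)
    g_at_infinity = (1.0 - alpha0) / alpha0
    return min(a0 / b0, g_at_zero, g_at_infinity)


def iterate_extremal(a0, b0, alpha0, alpha1, steps):
    """Iterate the recursion with equality in both relations.

    Returns the list of ratios a_n / b_n for n = 0, ..., steps.
    In this case a_n + b_n is conserved, so no under- or overflow occurs.
    """
    a, b = a0, b0
    ratios = [a / b]
    for _ in range(steps):
        a, b = ((1.0 - alpha0) * a + alpha1 * b,
                (1.0 - alpha1) * b + alpha0 * a)
        ratios.append(a / b)
    return ratios


if __name__ == "__main__":
    ratios = iterate_extremal(1.0, 50.0, 0.3, 0.2, 200)
    print(min(ratios), ratio_lower_bound(1.0, 50.0, 0.3, 0.2))
```

The check covers only the equality case; the general case follows from the same monotonicity observation applied to the inequalities.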
The proof of this purely combinatorial lemma is left as an exercise. We deduce from ˜ n ) ≤ Cβ n for some C > 0 and β < 1. it that inf n |Gn |/|Hn | > 0, which implies ν n (G n This in turn implies that |9n+1 | ≤ Cβ . Proceeding to the final count, we let f : supp µ → R be such that |f | < C1 and |f (u) − f (v)| < C1 u − vσ . Then In ≤
˜n G
|f (un (η0 , . . . , ηn−1 ))|dν n + +
k≤n
(n)
9k
(n) K n −:(n) (∪k≤n 9k )
|f (un (η0 , . . . , ηn−1 ))|dν n
|f (u( η0 , . . . , ηn−1 )) − f (un (:(n) (η0 , . . . , ηn−1 )))|dν n
≤ 2C1 · Cβ n +
(6)
Cβ k−1 · C1 (δγ n−k )σ
k≤n
≤ const n · [max(β, γ σ )]n ≤ const · τ n . Since these estimates are uniform for all pairs u0 , u0 , we obtain by integrating over u0 that fd n − f dµ ≤ const · τ n . Proof of Theorem B. We will prove, in the next paragraph, that assertion (2) in Theorem B holds for any invariant measure µ of X . From this (1) follows immediately: since (X , µ) is exponentially mixing, it is ergodic; and since µ is chosen arbitrarily, it must be the unique invariant measure. To prove the claim above, we pick arbitrary u0 ∈ H , u0 ∈ A, and compare their distributions n and n as we did in the proof of Theorem A(2). First, by waiting a suitable period, we may assume that n is supported in B(R0 ) (where R0 is as in (P2)). By condition (C) with ε0 = δ, where δ is as in Lemma 3.2, there is a set of controls of length N0 and having ν N0 -measure α0 for some α0 > 0 that steer the entire ball B(R0 ) into a set of diameter < δ. The estimate for | f d n − f d ˆ n | now proceeds as in Theorem A(2), with the use of these special controls taking the place of Lemma 3.3 to guarantee that an α0 -fraction of what is left is matched every N0 steps. Averaging u0 with respect to µ, we obtain the desired result.
3.3. Applications to PDEs: Proofs of Theorems 1 and 3. In this subsection, we prove the theorems related to PDEs stated in Sect. 2.1.

Proof of Theorem 1. We will prove that the abstract hypotheses (P1)–(P4) and (C) hold for the incompressible Navier–Stokes equation in $L^2$ for the type of noise specified. Let $S(u_0) = u(t = 1)$ where $u$ is the solution of the Navier–Stokes equation with initial data $u_0$, and let $u_k = S(u_{k-1}) + \eta_k$. Most of the computations below are classically known (see for instance [2, 14]); we include them for completeness.

We start by recalling a few properties of the Navier–Stokes equation in the 2-D torus. First, the following energy estimate holds for all $t > 0$:
\[
\frac12\,\|u(t)\|_{L^2}^2 + \nu\int_0^t \|\nabla u\|_{L^2}^2 = \frac12\,\|u_0\|_{L^2}^2. \tag{7}
\]
Since $\int u = 0$, we have the Poincaré inequality
\[
\|\nabla u\|_{L^2} \ge \|u\|_{L^2}. \tag{8}
\]
From (7) and (8), it follows that
\[
\|S(u)\|_{L^2} \le e^{-\nu}\,\|u\|_{L^2}; \tag{9}
\]
thus (P2) is satisfied by taking $R_0(a) > \frac{1}{1 - e^{-\nu}}\,a$. On the other hand, for any two solutions $u$ and $v$ with initial conditions $u_0$ and $v_0$, we have
\[
\begin{aligned}
\frac12\,\partial_t\|u - v\|_{L^2}^2 + \nu\,\|\nabla(u - v)\|_{L^2}^2
&\le \Big|\int (u - v)\cdot\nabla v\,(u - v)\Big| \\
&\le C\,\|\nabla v\|_{L^2}\,\|u - v\|_{L^2}\,\|u - v\|_{H^1} \\
&\le \frac{\nu}{2}\,\|u - v\|_{H^1}^2 + \frac{C}{\nu}\,\|\nabla v\|_{L^2}^2\,\|u - v\|_{L^2}^2.
\end{aligned} \tag{10}
\]
(Hölder and Sobolev inequalities are used to get the second line, and the Cauchy–Schwarz inequality is used to get the third.) Then, applying a Gronwall lemma, we get
\[
\|S(u_0) - S(v_0)\|_{L^2}^2 + \nu\int_0^1 \|(u - v)(s)\|_{H^1}^2\,ds \le C_R\,\|u_0 - v_0\|_{L^2}^2. \tag{11}
\]
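The threshold $\frac{1}{1 - e^{-\nu}}\,a$ appearing after (9) can be checked with a short scalar iteration. The sketch below assumes that $a$ denotes a bound on the norm of the kicks $\eta_k$ (our reading of the notation in (P2)), so that (9) gives $\|u_{k+1}\| \le e^{-\nu}\|u_k\| + a$; it iterates the worst case of this inequality. The function names are illustrative and do not come from the paper.

```python
import math


def radius_after_kicks(r0, nu, a, steps):
    """Worst-case L^2 radius after `steps` iterations of r -> exp(-nu) * r + a."""
    r = r0
    for _ in range(steps):
        r = math.exp(-nu) * r + a
    return r


def steps_to_enter(r0, nu, a, big_r0):
    """Number of steps until the worst-case radius is at most big_r0.

    Requires big_r0 > a / (1 - exp(-nu)); otherwise the worst-case
    iteration need not enter the ball and we refuse to loop forever.
    """
    limit = a / (1.0 - math.exp(-nu))
    if big_r0 <= limit:
        raise ValueError("big_r0 must exceed a / (1 - exp(-nu))")
    r, n = r0, 0
    while r > big_r0:
        r = math.exp(-nu) * r + a
        n += 1
    return n


if __name__ == "__main__":
    nu, a = 0.5, 1.0
    print(a / (1.0 - math.exp(-nu)), radius_after_kicks(100.0, nu, a, 200))
    print(steps_to_enter(100.0, nu, a, 3.0))
```

The worst-case radius converges geometrically to $a/(1 - e^{-\nu})$, which is why any $R_0(a)$ strictly above that value is eventually absorbing.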
Here and below, $C_R$ denotes a generic constant depending only on $R$, an upper bound on the $L^2$ norm of $u_0$, and on the viscosity $\nu$. (P1)(b) follows from (11). To prove that (P3) holds, we use (11), (7) and a Chebychev inequality to deduce the existence of a time $s$, $0 < s < 1$, such that $\nu\|(u - v)(s)\|_{H^1}^2 \le 4C_R\,\|u_0 - v_0\|_{L^2}^2$, $\nu\|u(s)\|_{H^1}^2 < 2R^2$ and $\nu\|v(s)\|_{H^1}^2 < 2R^2$. Combining these estimates with energy estimates in $H^1$ for $t > s$, namely,
\[
\frac12\,\|\nabla u(t)\|_{L^2}^2 + \nu\int_s^t \|\Delta u\|_{L^2}^2 = \frac12\,\|\nabla u(s)\|_{L^2}^2, \tag{12}
\]
\[
\frac12\,\|\nabla v(t)\|_{L^2}^2 + \nu\int_s^t \|\Delta v\|_{L^2}^2 = \frac12\,\|\nabla v(s)\|_{L^2}^2, \tag{13}
\]
\[
\begin{aligned}
\frac12\,\partial_t\|u - v\|_{H^1}^2 + \nu\,\|u - v\|_{H^2}^2
&\le \|u - v\|_{H^2}\,\|u - v\|_{H^1}\,\big(\|u\|_{H^2} + \|v\|_{H^2}\big) \\
&\le \frac{\nu}{4}\,\|u - v\|_{H^2}^2 + \frac{1}{\nu}\,\big(\|u\|_{H^2}^2 + \|v\|_{H^2}^2\big)\,\|u - v\|_{H^1}^2,
\end{aligned} \tag{14}
\]
integrating (14) between $s$ and 1 and using again a Gronwall lemma, we deduce easily that
\[
\|S(u_0) - S(v_0)\|_{H^1} \le C_R\,\|u_0 - v_0\|_{L^2}. \tag{15}
\]
For any $\gamma > 0$ and $R > 0$, we may take $N$ large enough that if $V_N := \operatorname{span}\{e_1, e_2, \dots, e_N\}$, then $C_R\,\|u\|_{L^2} \le \gamma\,\|u\|_{H^1}$ $\forall u \in V_N^\perp$. This together with (15) proves (P3). Finally, property (C) is satisfied by taking $\eta_i = 0$ for $1 \le i \le n_0$, where $n_0$ is large enough that $R\,e^{-\nu n_0} \le \varepsilon_0$ (see (9)). The product structure of the noise $\nu$¹ in property (P4)(b) holds because $\xi_{jk}$ in (2) are independent; the assumption on $(P_V)_*\nu$ holds because $b_j \ne 0$ for $1 \le j \le N$, where $N$ is as in (P3) and the law for $\xi_{jk}$ has density $\rho_j$.

Proof of Theorem 1'. We now prove (P1)–(P4) and (C) in $H^s$. To prove (P1)(b), we use the energy estimates
\[
\begin{aligned}
\frac12\,\partial_t\|u\|_{H^s}^2 + \nu\,\|u\|_{H^{s+1}}^2
&\le C\,\|u\|_{H^s}\,\|u\|_{H^{s+1}}\,\|u\|_{H^1} \\
&\le \frac{\nu}{2}\,\|u\|_{H^{s+1}}^2 + \frac{C}{\nu}\,\|u\|_{H^1}^2\,\|u\|_{H^s}^2,
\end{aligned} \tag{16}
\]
\[
\begin{aligned}
\frac12\,\partial_t\|u - v\|_{H^s}^2 + \nu\,\|u - v\|_{H^{s+1}}^2
&\le C\,\|u - v\|_{H^s}\,\|u - v\|_{H^{s+1}}\,\big(\|u\|_{H^{s+1}} + \|v\|_{H^{s+1}}\big) \\
&\le \frac{\nu}{2}\,\|u - v\|_{H^{s+1}}^2 + \frac{C}{\nu}\,\big(\|u\|_{H^{s+1}}^2 + \|v\|_{H^{s+1}}^2\big)\,\|u - v\|_{H^s}^2,
\end{aligned} \tag{17}
\]
and Gronwall's lemma between times 0 and 1. To prove (P3), we proceed as in the case of $L^2$, showing the existence of a time $\tau$, $0 < \tau < 1$, such that $\|(u - v)(\tau)\|_{H^{s+1}} \le 4C_R\,\|u_0 - v_0\|_{H^s}$ and $\|u(\tau)\|_{H^{s+1}}, \|v(\tau)\|_{H^{s+1}} \le 4C_R$, where $\|u_0\|_{H^s}, \|v_0\|_{H^s} < R$. Then using (16) and (17) with $s$ replaced by $s + 1$ and integrating between $\tau$ and 1, we deduce that
\[
\|S(u_0) - S(v_0)\|_{H^{s+1}} \le C_R\,\|u_0 - v_0\|_{H^s}, \tag{18}
\]
from which we obtain (P3). To prove (P2), we make use of the regularizing effect of the Navier–Stokes equation in 2-D
\[
\|S(u_0)\|_{H^s} \le C_s(\|u\|_{L^2}), \tag{19}
\]
where $C_s$ is a function depending only on $s$ (see [14]). Since $B_{H^s}(a) \subset B_{L^2}(a)$, we know from (P3) for $L^2$ that if $u_0 \in B_{L^2}(R)$, we have $u_n \in B_{L^2}(R_0)$ $\forall n \ge$ some $N_0$. Taking $R_s = C_s(R_0) + a$, we get that $u_n \in B_{H^s}(R_s)$ $\forall n \ge N_0$. To prove (C), we argue as in $L^2$, taking $\eta_i = 0$, $1 \le i \le n_0$, for large enough $n_0$ and appealing to the fact that $C_s(r) \to 0$ as $r \to 0$.

¹ We hope our dual use of the symbol $\nu$ as viscosity and as noise does not lead to confusion.
We remark that (P2) and (C) above can be proved directly without going through $L^2$. Next we move on to the real Ginzburg–Landau equation.

Proof of Theorem 3. For simplicity, we take $\nu = 1$. (a) We need to prove that there exist two disjoint stable sets $A_1$ and $A_{-1}$, stable in the sense that $\forall u \in A_{\pm1}$, $S(u) + \eta \in A_{\pm1}$ $\forall \eta \in K$. Let
\[
A_1 = \{u \in H,\ \|u - 1\|_{L^2} \le \beta\}, \tag{20}
\]
where $\beta$ is a constant to be determined. We recall for each $\phi \in \mathbf{R}$ the energy estimate
\[
\frac12\,\partial_t\|u - \phi\|_{L^2}^2 + \|\nabla(u - \phi)\|_{L^2}^2 + \int_T u(u - 1)(u + 1)(u - \phi)\,dx = 0. \tag{21}
\]
Substituting $\phi = 1$ in (21), we get
\[
\frac12\,\partial_t\|u - 1\|_{L^2}^2 + \|\nabla(u - 1)\|_{L^2}^2 \le -\int_T u(u + 1)(u - 1)^2\,dx. \tag{22}
\]
Now for any $\phi$ with $0 < \phi < 1$, we have
\[
\begin{aligned}
u(u + 1)(u - 1)^2 &\ge \phi(\phi + 1)(u - 1)^2 \quad \text{if } u \ge \phi \text{ or } u \le -1 - \phi, \\
u(u + 1)(u - 1)^2 &\ge -1 \quad \forall u.
\end{aligned} \tag{23}
\]
Hence
\[
\int_T u(u + 1)(u - 1)^2\,dx \ge \int_T \big(1_{\{u \ge \phi\}} + 1_{\{u \le -1 - \phi\}}\big)\,\phi(\phi + 1)(u - 1)^2 - \operatorname{meas}\{u \le \phi\}.
\]
Since the first term on the right side is $\ge \phi(\phi + 1)\,\|u - 1\|_{L^2}^2 - \int_T$
1{−1−φ 1 can be relaxed.
4. Dynamics with Negative Lyapunov Exponents 4.1. Formulation of abstract results. We consider a semi-group St on H and a Markov chain X defined by (I) or (II) in the beginning of Sect. 3.1. In order for Lyapunov exponents to make sense, we need to impose differentiability assumptions.
(P1') (a) $S(B(R))$ is compact $\forall R > 0$; (b) $S$ is $C^{1+\mathrm{Lip}}$, meaning for every $u \in H$, there exists a bounded linear operator $L_u : H \to H$ with the property that for all $h \in H$,
\[
\lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\,\{S(u + \varepsilon h) - S(u) - L_u(\varepsilon h)\} = 0 \tag{34}
\]
and $\forall R > 0$, $\exists M_R$ such that $\forall u, v \in B(R)$, $\|L_u - L_v\| \le M_R\,\|u - v\|$.

Since Lemma 3.1 clearly holds with (P1) replaced by (P1'), we let $A$ be as in Sect. 3.

Proposition 4.1. Assume (P1'), (P2) and (P4)(a), and let $\mu$ be an invariant measure for $X$. Then there is a measurable function $\lambda_1$ on $H$ with $-\infty \le \lambda_1 < \infty$ such that for $\mu$-a.e. $u_0$ and $\nu^{\mathbf{N}}$-a.e. $\eta = (\eta_0, \eta_1, \eta_2, \dots)$,
\[
\lim_{n \to \infty} \frac{1}{n}\,\log\|L_{u_{n-1}} \circ \cdots \circ L_{u_1} \circ L_{u_0}\| = \lambda_1(u_0).
\]
Moreover, $\lambda_1$ is constant $\mu$-a.e. if $(X, \mu)$ is ergodic.

This proposition follows from a direct application of the Subadditive Ergodic Theorem [6] together with the boundedness of $L_u$ on $A$ (see also Lemma 4.1 below). We will refer to the function or, in the ergodic case, number $\lambda_1$ as the top Lyapunov exponent of $(X, \mu)$. This section is concerned with the dynamics of $X$ when $\lambda_1 < 0$.

We begin by stating a result, namely Theorem C, which gives a general description of the dynamics when $\lambda_1 < 0$. This result, however, is not needed for our application to PDEs. The proof of Theorem 2 uses only Theorem D, which is independent of Theorem C.

Let $\mu$ be an invariant measure of $X$. Theorem C concerns the conditional measures of $\mu$ given the past. That is to say, we view $X$ as starting from time $-\infty$, i.e. consider $\dots, u_{-2}, u_{-1}, u_0, u_1, u_2, \dots$ defined by $u_{n+1} = Su_n + \eta_n$ $\forall n \in \mathbf{Z}$ where $\dots, \eta_{-2}, \eta_{-1}, \eta_0, \eta_1, \eta_2, \dots$ are $\nu$-i.i.d. Then for $\nu^{\mathbf{Z}}$-a.e. $\eta = (\dots, \eta_{-1}, \eta_0, \eta_1, \dots)$, the conditional probability of $\mu$ given $\eta^- := (\dots, \eta_{-2}, \eta_{-1})$ is well defined. We denote it by $\mu_\eta$.

Theorem C (Random sinks). Assume (P1'), (P2) and (P4)(a), and let $\mu$ be an ergodic invariant measure with $\lambda_1 < 0$. Then there exists $k_0 \in \mathbf{Z}^+$ such that for $\nu^{\mathbf{Z}}$-a.e. $\eta \in K^{\mathbf{Z}}$, $\mu_\eta$ is supported on exactly $k_0$ points of equal mass.

This result is well known for stochastic flows in finite dimensions (see [11]). In the next theorem we impose a condition slightly stronger than (C) in Sect. 3.1 to obtain the type of uniqueness result needed for Theorem 2.

(C') There exists $\hat u_0 \in H$ such that for all $\varepsilon_0 > 0$ and $R > 0$, there is a finite sequence of controls $\hat\eta_0, \dots, \hat\eta_n$ such that for all $u_0 \in B(R)$, if $u_{k+1} = Su_k + \hat\eta_k$ and $\hat u_{k+1} = S\hat u_k + \hat\eta_k$ for all $k < n$, then $\|u_n - \hat u_n\| < \varepsilon_0$.

For $u \in H$, we define the accessibility set $A(u)$ as follows: let $A_0(u) = \{u\}$, $A_n(u) = S(A_{n-1}(u)) + K$ for $n > 0$, and $A(u) = \cup_{n \ge 0} A_n(u)$.
Theorem D (Asymptotic uniqueness of solutions independent of initial condition). Assume (P1'), (P2), (P4)(a) and (C'). Suppose there is an ergodic invariant measure $\mu$ supported on $A(\hat u_0)$ for which $\lambda_1 < 0$. Then $\mu$ is the only invariant measure $X$ has, and the following holds for $\nu^{\mathbf{N}}$-a.e. $\eta = (\eta_0, \eta_1, \dots)$:
\[
\forall u_0, u_0' \in H, \qquad \|u_n(\eta) - u_n'(\eta)\| \le C e^{\lambda n} \qquad \forall n > 0,
\]
where $\lambda$ is any number $> \lambda_1$ and $C = C(u_0, u_0', \lambda)$.
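The conclusion of Theorem D can be illustrated on a deliberately simple toy model that is not taken from the paper: a scalar map $S(u) = \tfrac12 u + \tfrac14 \sin u$, whose derivative lies in $[\tfrac14, \tfrac34]$, driven by bounded kicks drawn uniformly from $[-1, 1]$. Since this $S$ contracts everywhere, two orbits subjected to the same kicks approach each other at rate at most $(3/4)^n$, which is the behaviour $\|u_n(\eta) - u_n'(\eta)\| \le C e^{\lambda n}$ with $\lambda = \log(3/4)$. In the theorem the contraction is only assumed on average along typical orbits, so this sketch shows the statement, not the difficulty.

```python
import math
import random


def S(u):
    """Toy time-one map; its derivative 0.5 + 0.25 * cos(u) lies in [0.25, 0.75]."""
    return 0.5 * u + 0.25 * math.sin(u)


def coupled_gap(u0, v0, steps, seed=0):
    """Run two orbits with the SAME kick sequence and record |u_n - v_n|."""
    rng = random.Random(seed)
    u, v = u0, v0
    gaps = [abs(u - v)]
    for _ in range(steps):
        eta = rng.uniform(-1.0, 1.0)  # one kick, shared by both orbits
        u, v = S(u) + eta, S(v) + eta
        gaps.append(abs(u - v))
    return gaps


if __name__ == "__main__":
    gaps = coupled_gap(10.0, -7.0, 40)
    for n in (0, 10, 20, 40):
        print(n, gaps[n], 0.75 ** n * gaps[0])
```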
Roughly speaking, Theorem D allows us to conclude that all the orbits are eventually "the same" once we know that the linearized flows along some orbits are contractive. This passage from a local to a global phenomenon is made possible by condition (C'), which in the abstract is quite special but is satisfied by a number of standard parabolic PDEs.

4.2. Proofs of abstract results (Theorems C and D). Let $A$ be the compact set in Lemma 3.1, and let $K$ denote the support of $\nu$ as before. We consider the dynamical system $F : K^{\mathbf{N}} \times A \to K^{\mathbf{N}} \times A$ defined by $F(\eta, u) = (\sigma\eta, S(u) + \eta_0)$, where $\eta = (\eta_0, \eta_1, \eta_2, \dots)$ and $\sigma$ is the shift operator, i.e. $\sigma(\eta_0, \eta_1, \eta_2, \dots) = (\eta_1, \eta_2, \dots)$. The following is straightforward.
Lemma 4.1. Let $\mu$ be an invariant measure of $X$ in the sense of Definition 3.1. Then $F$ preserves $\nu^{\mathbf{N}} \times \mu$, and $(F, \nu^{\mathbf{N}} \times \mu)$ is ergodic if and only if $(X, \mu)$ is ergodic in the sense of Definition 3.2.

Our next lemma relates the top Lyapunov exponent of a system, which describes the average infinitesimal behavior along its typical orbits, to the local behavior in neighborhoods of these orbits. A version applicable to our setting is contained in [13]. Let $B(u, \alpha) = \{v \in H,\ \|v - u\| < \alpha\}$.

Proposition 4.2 [13]. Let $\mu$ be an invariant measure, and assume that $\lambda_1 < 0$ $\mu$-a.e. Then given $\varepsilon > 0$, there exist measurable functions $\alpha, \gamma : K^{\mathbf{N}} \times A \to (0, \infty)$ and a measurable set $G \subset K^{\mathbf{N}} \times A$ with $(\nu^{\mathbf{N}} \times \mu)(G) = 1$ such that for all $(\eta, u_0) \in G$ and $v_0 \in B(u_0, \alpha(\eta, u_0))$,
\[
\|v_n(\eta) - u_n(\eta)\| < \gamma(\eta, u_0)\,e^{(\lambda_1 + \varepsilon)n} \qquad \forall n \ge 0.
\]

We first prove Theorem D, from which Theorem 2 is derived.

Proof of Theorem D. From (P2), it follows that we need only to consider initial conditions in $B(R_0)$. Fix $\varepsilon > 0$ and let $\alpha$ and $G$ be as in Proposition 4.2 for the dynamical system $(F, \nu^{\mathbf{N}} \times \mu)$. We make the following choices:

(1) Let $\alpha_0 > 0$ be a number small enough that $(\nu^{\mathbf{N}} \times \mu)\{\alpha > 2\alpha_0\} > \frac{99}{100}$. Covering the compact set $A(\hat u_0)$ with a finite number of $\frac12\alpha_0$-balls, we see that there exists $\tilde u_0 \in A(\hat u_0)$ such that
91 := {η ∈ K N : B(u˜ 0 , α0 ) ⊂ B(u, α(η, u)) for some u with (η, u) ∈ G} has positive ν N -measure.
(2) Since u˜ 0 ∈ A(uˆ 0 ), there is a sequence of controls (η˜ 0 , . . . , η˜ k−1 ) that puts uˆ 0 in B(u˜ 0 , 21 α0 ). Choose δ > 0 and 92 ⊂ K k with ν k (92 ) > 0 such that if u0 ∈ B(uˆ 0 , δ) and (η0 , . . . , ηk−1 ) ∈ 92 , then uk (η0 , . . . , ηk−1 ) ∈ B(u˜ 0 , α0 ). (3) Condition (C’) guarantees that there exists a sequence of controls (ηˆ 0 , . . . , ηˆ j −1 ) that puts the entire ball B(R0 ) inside B(uˆ 0 , 21 δ). Choose 93 ⊂ K j with ν j (93 ) > 0 such that every sequence (η0 , . . . , ηj −1 ) ∈ 93 puts B(R0 ) inside B(uˆ 0 , δ). Let 9 ⊂ K N be the set defined by {(η0 , . . . , ηj −1 ) ∈ 93 ; (ηj , . . . , ηj +k−1 ) ∈ 92 ; (ηj +k , ηj +k+1 , . . . ) ∈ 91 } . Clearly, ν N (9) > 0. The following holds for ν N -a.e. η : Fix η, and let Bn denote the nth image of B(R0 ) for this sequence of kicks. By the ergodicity of (σ, ν N ), there exists N such that σ N η ∈ 9. Choosing N ≥ N0 (R0 ), we have, by (P2), that BN ⊂ B(R0 ). The choice in (3) then guarantees that BN+j ⊂ B(uˆ 0 , δ), and the choice in (2) guarantees that BN+j +k ⊂ B(u˜ 0 , α0 ). By (1), BN+j +k ⊂ B(u, α(u, σ N+j +k η)) for some u with (σ N+j +k η, u) ∈ G. Proposition 4.2 then says that when subjected to the sequence of kicks defined by σ N+j +k η, all orbits with initial conditions in BN+j +k converge exponentially to each other as n → ∞. Hence this property holds for all orbits starting from B(R0 ) when subjected to η. Theorem D is proved. Proceeding to Theorem C, the measures µη defined in Sect. 4.1 are called the sample or empirical measures of µ. They have the interpretation of describing what one sees at time 0 given that the system has experienced the sequence of kicks η− = (· · · , η−2 , η−1 ). The characterization of µη in the next lemma is useful. We introduce the following notation: Let Sη0 : H → H be the map defined by Sη0 (u) = Su + η0 ; for a measure µ on H , Sη0 ∗ µ is the measure defined by (Sη0 ∗ µ)(E) = µ(Sη−1 E). 0 Lemma 4.2. Let µ be an invariant measure for X . 
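Then for $\nu^{\mathbb Z}$-a.e. $\eta = (\dots, \eta_{-2}, \eta_{-1}, \eta_0, \dots)$, $(S_{\eta_{-1}} S_{\eta_{-2}} \cdots S_{\eta_{-n}})_* \mu$ converges weakly to $\mu_\eta$. Proof. Fix a continuous function $\varphi : A \to \mathbb R$, and define $\varphi^{(n)} : K^{\mathbb Z} \to \mathbb R$ by
$$\varphi^{(n)}(\eta) = \int \varphi \, d\bigl((S_{\eta_{-1}} S_{\eta_{-2}} \cdots S_{\eta_{-n}})_* \mu\bigr) = \int \varphi\bigl(S_{\eta_{-1}} S_{\eta_{-2}} \cdots S_{\eta_{-n}}(u)\bigr)\, d\mu(u). \qquad (35)$$
Then $\varphi^{(n)}$ is $\mathcal B_{-1}^{-n}$-measurable, where $\mathcal B_{-1}^{-n}$ is the σ-algebra on $K^{\mathbb Z}$ generated by the coordinates $\eta_{-1}, \dots, \eta_{-n}$. Since $\int S_{\eta_{-n} *}\mu \, d\nu(\eta_{-n}) = \mu$, we have $E(\varphi^{(n)} \mid \mathcal B_{-1}^{-n+1}) = \varphi^{(n-1)}$. The martingale convergence theorem then tells us that $\varphi^{(n)}$ converges $\nu^{\mathbb Z}$-a.e. to a function measurable on $\mathcal B_{-1}^{-\infty}$. It suffices to carry out the argument above for a countable dense set of continuous functions $\varphi$.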
Lemma 4.3. Given δ > 0, ∃N = N (δ) ∈ Z+ such that for ν Z -a.e. η, there is a set Eη consisting of ≤ N points such that µη (Eη ) > (1 − δ). Proof. Let α and γ be the functions in Proposition 4.2 for the dynamical system (F, ν N × µ). Given δ > 0, we let α0 , γ0 > 0 be constants with the property that if G = {(η, u) : α(η, u) ≥ α0 , γ (η, u) ≤ γ0 }
and
9 = {η ∈ K N : µ{u : (η, u) ∈ G} > 1 − δ},
then ν N (9) > 1 − δ. Consider η ∈ K Z such that (i) µη = lim(Sη−1 Sη−2 · · · Sη−n )∗ µ and (ii) (η−n , η−n+1 , . . . ) ∈ 9 for infinitely many n > 0. By Lemma 4.2 and the ergodicity of (σ, ν Z ), we deduce that the set of η satisfying (i) and (ii) has full measure. We will show that the property in the statement of the lemma holds for these η. Fix a cover {B1 , . . . , BN } of A by (α0 /2)-balls, and let η be as above. We consider n arbitrarily large with (η−n , η−n+1 , . . . ) ∈ 9. For each i, 1 ≤ i ≤ N , such that Bi ∩ {u ∈ H : ((η−n , η−n+1 , . . . ), u) ∈ G} ≠ ∅, pick an arbitrary point u(i) in this set. Our choices of G and 9 ensure that µ(∪i B(u(i) , α0 )) > 1 − δ, and that the diameter of (Sη−1 Sη−2 · · · Sη−n )B(u(i) , α0 ) is ≤ γ0 α0 e^{(λ+ε)n} . We have thus shown that a set of µη -measure > 1 − δ is contained in ≤ N balls each with diameter ≤ γ0 α0 e^{(λ+ε)n} . The result follows by letting n → ∞. To prove Theorem C, we need to work with a version of (F, ν N × µ) that has a past. Let F˜ : K Z × A → K Z × A be such that F˜ : (η, u) ↦ (σ η, Sη0 u), and let ν Z ∗ µ be the measure which projects onto ν Z in the first factor and has conditional probabilities µη on η-fibers. That ν Z ∗ µ is F˜ -invariant follows immediately from Lemma 4.2. It is also easy to see that (F˜ , ν Z ∗ µ) is ergodic if and only if (F, ν N × µ) is. Proof of Theorem C. It follows from Lemma 4.3 that for ν Z -a.e. η, µη is atomic, with possibly a countable number of atoms. We now argue that there exists k0 ∈ Z+ such that for a.e. η, µη has exactly k0 atoms of equal mass. Let h(η) = sup_{u∈H} µη {u}. To see that h is a measurable function on K Z , let P (n) , n = 1, 2, · · · , be an increasing sequence of finite measurable partitions of A such that diam P (n) → 0 as n → ∞. Then for each P ∈ P (n) , η ↦ µη (P ) is a measurable function, as are hn := max_{P ∈P (n)} µη (P ) and h := lim_n hn . Observe that h(σ η) ≥ h(η), with > being possible in principle since Sη0 is not necessarily one-to-one. 
However, the measurability of h together with the ergodicity of (σ, ν Z ) implies that h is constant a.e. Let us call this value h0 . From the last lemma we know that h0 > 0. To finish, we let X = {(η, u) ∈ K Z × A : µη {u} = h0 }. Then X is a measurable set, (ν Z ∗ µ)(X) > 0 and F˜ −1 X ⊃ X. This together with the ergodicity of (F˜ , ν Z ∗ µ) implies that (ν Z ∗ µ)(X) = 1, which is what we want.
4.3. Application to PDEs: Proof of Theorem 2. Let $S_t$ be the semi-group generated by the (unforced) Navier–Stokes system, and let $S = S_1$. Lemma 4.4. $S$ is $C^{1+\mathrm{Lip}}$ in $H^2(\mathbb R^2)$. Proof. It is easy to see that $L_u$ is defined by $L_u w = \psi(1)$, where $\psi$ is the solution of the linear problem
$$\partial_t \psi + U\cdot\nabla\psi + \psi\cdot\nabla U - \nu\Delta\psi = -\nabla p, \qquad \psi(t=0) = w, \quad \operatorname{div}\psi = 0, \qquad (36)$$
where $U$ denotes the solution of the Navier–Stokes system with initial data $u$. That $L_u$ is linear, continuous and goes from $H^2$ to $H^2$ is obvious. To prove that (34) holds, let $U$ and $V$ be the solutions of the Navier–Stokes system with initial data $u$ and $u + \varepsilon w$ respectively. Then $y = V - U - \varepsilon\psi$ satisfies
$$\partial_t y + (U + \varepsilon\psi)\cdot\nabla y + y\cdot\nabla V + \varepsilon^2\,\psi\cdot\nabla\psi - \nu\Delta y = -\nabla p, \qquad y(t=0) = 0, \quad \operatorname{div}(y) = 0. \qquad (37)$$
By a simple computation, we get that $\|y(t=1)\|_{H^2} \le C(1 + \|w\|_{H^2}^2)\,\varepsilon^2$, where here and below $C$ denotes a constant depending only on the $H^2$ norm of $u$. To prove that $L_u$ is Lipschitz, i.e.,
$$\|(L_u - L_v)w\|_{H^2} \le C\,\|u - v\|_{H^2}\,\|w\|_{H^2}, \qquad (38)$$
we define $L_v w = \phi(1)$, where $\phi$ solves an equation analogous to (36) with $V$ in the place of $U$, $V$ being the solution with initial condition $v$. The desired estimate on $\|(\psi - \phi)(t=1)\|_{H^2}$ is obtained by subtracting this equation from (36). Remark 4.1. We observe here that the top Lyapunov exponent is negative if the noise is sufficiently small. We will show, in fact, that given any positive viscosity $\nu$, if $a$ (see Sect. 2.1 for definition) is small enough, then $S : H^2 \to H^2$ is a contraction on the ball of radius $\frac{\nu}{2C}$. Rewriting Eqs. (16) and (17) with $s = 2$, we have
$$\partial_t \|u\|_{H^2}^2 + \nu \|u\|_{H^3}^2 \le \frac{C^2}{\nu}\,\|u\|_{H^s}^4, \qquad (39)$$
$$\partial_t \|u - v\|_{H^2}^2 + \nu \|u - v\|_{H^3}^2 \le \frac{C^2}{\nu}\bigl(\|u\|_{H^3}^2 + \|v\|_{H^3}^2\bigr)\|u - v\|_{H^2}^2. \qquad (40)$$
For $u_0$ with $\|u_0\|_{H^2} \le \frac{\nu}{2C}$ and a noise with $a \le \frac{\nu}{2C}\,(1 - e^{-\nu/4})$, it follows from (39) and a Gronwall lemma that
$$\|S(u_0)\|_{H^2}^2 \le e^{-\nu/2}\left(\frac{\nu}{2C}\right)^2,$$
from which we obtain $\|u_1\|_{H^2} \le \frac{\nu}{2C}$. Moreover, from (39), we have that
$$\nu \int_0^1 \|u\|_{H^3}^2 \le \frac{\nu^3}{16 C^2},$$
so that if $v$ is another solution of the Navier–Stokes system with $\|v_0\|_{H^2} \le \frac{\nu}{2C}$, then (40) gives
$$\|u - v\|_{H^2}^2 \le \|u_0 - v_0\|_{H^2}^2\, e^{-\nu + \frac{\nu}{8}}.$$
Proof of Theorem 3. It suffices to check the hypotheses of Theorem D: (P1’) is proved in Lemma 4.4, and we explained in the proof of Theorem 1’ why (C’) holds with $\hat u_0 = 0$.
References
1. Bricmont, J., Kupiainen, A., Lefevere, R.: Exponential mixing for the 2D Navier–Stokes dynamics. 2000 preprint
2. Constantin, P., Foias, C.: Navier–Stokes equations. Chicago Lectures in Mathematics. Chicago, IL: University of Chicago Press, 1988, x+190 pp.
3. E, W., Mattingly, J., Sinai, Ya.G.: Gibbsian Dynamics and ergodicity for the stochastically forced 2D Navier–Stokes equation. 2000 preprint
4. Eckmann, J.-P., Hairer, M.: Uniqueness of the invariant measure for a stochastic PDE driven by degenerate noise. 2000 preprint
5. Flandoli, F., Maslowski, B.: Ergodicity of the 2D Navier–Stokes equations. NoDEA 1, 403–426 (1994)
6. Kingman, J.F.C.: Subadditive processes. Ecole d’été des Probabilités de Saint-Flour. Lecture Notes in Math. 539. Berlin–Heidelberg–New York: Springer, 1976
7. Kuksin, S., Shirikyan, A.: Stochastic dissipative PDEs and Gibbs measures. Commun. Math. Phys. 213, no. 2, 291–330 (2000)
8. Kuksin, S., Shirikyan, A.: On dissipative systems perturbed by bounded random kick-forces. 2000 preprint
9. Kuksin, S., Shirikyan, A.: A coupling approach to randomly forced nonlinear PDE’s. 2001 preprint
10. Kuksin, S., Piatnitski, A.L., Shirikyan, A.: A coupling approach to randomly forced nonlinear PDEs. II. 2001 preprint
11. Le Jan, Y.: Equilibre statistique pour les produits de diffeomorphismes aleatoires independants. Ann. Inst. Henri Poincare (Probabilites et Statistiques) 23, 111–120 (1987)
12. Mattingly, J.: Ergodicity of 2D Navier–Stokes Equations with random forcing and large viscosity. Commun. Math. Phys. 206, 273–288 (1999)
13. Ruelle, D.: Characteristic exponents and invariant manifolds in Hilbert space. Ann. Math. 115, 243–290 (1982)
14. Temam, R.: Navier–Stokes equations and nonlinear functional analysis. Second edition. CBMS-NSF Regional Conference Series in Applied Mathematics 66. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), 1995, xiv+141 pp.
15. Temam, R.: Infinite-dimensional dynamical systems in mechanics and physics. Second edition. Applied Mathematical Sciences 68. New York: Springer-Verlag, 1997, xxii+648 pp.
Communicated by P. Constantin
Commun. Math. Phys. 227, 483 – 514 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Random Wavelet Series Jean-Marie Aubry , Stéphane Jaffard Département de Mathématiques, Faculté des Sciences et Technologies, Université Paris XII, 61 avenue du Général de Gaulle, 94010 Créteil Cedex, France. E-mail:
[email protected];
[email protected] Received: 15 December 2000 / Accepted: 20 December 2001
Abstract: This paper concerns the study of functions which are known through the statistics of their wavelet coefficients. We first obtain sharp bounds on spectra of singularities and spectra of oscillating singularities, which are deduced from the sole knowledge of the wavelet histograms. Then we study a mathematical model which has been considered both in the contexts of turbulence and signal processing: random wavelet series, obtained by picking independently wavelet coefficients at each scale, following a given sequence of probability laws. The sample paths of the processes thus constructed are almost surely multifractal functions, and their spectrum of singularities and their spectrum of oscillating singularities are determined. The bounds obtained in the first part are optimal, since they become equalities in the case of random wavelet series. This allows us to derive a new multifractal formalism which has a wider range of validity than those that were previously proposed in the context of fully developed turbulence.
[Author footnotes: This work was performed while the author was at the Department of Mathematics, University of California at Davis. Also at the Institut Universitaire de France.]
1. Introduction The statistical study of fully developed turbulence started in 1941 with Kolmogorov [19]. Based on a dimensional analysis, he predicted that the scaling function $\tau(p)$ of the velocity field $v$, defined by $\int |v(x + \delta x) - v(x)|^p \, dx \sim |\delta x|^{\tau(p)}$, should be linear: $\tau(p) = p/3$. Subsequent experiments in wind tunnels showed that $\tau(p)$ is actually a strictly concave function, which is believed to be universal and central to
the understanding of turbulence (see for instance [10]). This problem was addressed by Kolmogorov and Obukhov [20] and Mandelbrot [23], among others. Frisch and Parisi [9] proposed an explanation which has received the name multifractal: they interpret the nonlinearity of $\tau(p)$ as the signature of the presence of several kinds of Hölder singularities. Suppose that the Hölder exponent of $v$ has value $h$ on a set of (Hausdorff) dimension $d(h)$. This function, called the spectrum of singularities, is linked to $\tau(p)$ by the multifractal formalism, here stated in dimension 1 (although turbulence is a 3D phenomenon, its data are accessed only through the one-dimensional trace of the velocity field, measured in wind tunnels by the hot wire technique):
$$d(h) = 1 + \inf_{p}\bigl(hp - \tau(p)\bigr).$$
According to this formula, a linear τ (p) would indeed lead to a spectrum of singularities supported by only one value h0 . It should be noted that this relation is only based on a heuristic argument, and is not necessarily true for arbitrary functions, see [11, 12, 15] and our discussion in Sect. 3.3. A more precise information than τ (p) would be given by the distribution of the velocity increments at all scales δx. Then, the knowledge of these distributions could allow to test the validity of more sophisticated models of turbulence. This way was explored by Castaing et al. [5], who proposed continuous cascade models for the p.d.f. of the increments. To our knowledge, however, one has not yet constructed a process satisfying Castaing’s continuous equation. One problem met is that the family of p.d.f. of the increments of any given function at all scales are interdependent, so that it is by no means obvious to determine if a given (continuous) family of p.d.f. are indeed p.d.f. of increments. One way to eliminate this problem has been proposed by Arneodo et al. [2]: instead of modeling p.d.f. of increments at all scales, they propose to model p.d.f. of wavelet coefficients on an orthonormal wavelet basis. The wavelet coefficients Cj,k = f (t)ψ(2j t − k)dt can be interpreted as smoothed increments of f at the scale 2−j (because ψ is well localized and has a vanishing integral); therefore, it is reasonable to assume that, for a given j , the collections of Cj,k should follow Castaing’s statistics when δx = 2−j . The random cascades of Arneodo et al. have some drawbacks: their correlations between scales have the rigid structure of the dyadic tree; nevertheless they allow to explore basic hypotheses on turbulence statistics. 
Note also that, if it is possible to estimate sequences of histograms of wavelet coefficients at all dyadic scales, it is impossible to determine the correlations between all wavelet coefficients, see [25], especially since the statistics are not Gaussian. Therefore, any model that incorporates specific correlations is in practice impossible to validate on real-life data. We will adopt a different point of view: dropping all correlations between wavelet coefficients, but making no assumption on their distributions (except for a general regularity hypothesis). We then have a fairly general model that can be fitted to any statistics (note that correlations with limited range can be added, but would not change our qualitative conclusions). The advantage is that, under this hypothesis, we can completely derive the spectrum of singularities d(h), as well as other quantities related to the local oscillations of the functions considered. Of course, this general model can be fitted to Castaing’s model (this will be discussed in [4]). Our results also give new information on models that are already used as Bayesian a priori for signal and image processing [27, 29].
Random wavelet series were first introduced and studied in [14] in a very simplified case, where the non-zero coefficients at a given scale take only one value. However, even in this extremely simple model, it turned out that the sample paths are multifractal. Our goal in this paper is to compute the properties that can be deduced from the knowledge of the wavelet coefficients p.d.f. at each scale. We will obtain upper bounds for d(h) based on histograms of the wavelet coefficients, which are sharper than the bounds obtained by the usual Fenchel–Legendre transform technique, and we will show that these upper bounds become equalities in the case of random wavelet series. We will actually go beyond this analysis; for, when the computation of this spectrum is involved, it often happens that the Hölder exponent is not precise enough, in the sense that it doesn’t take into account the local oscillations of the function. Indeed, a given Hölder exponent h at x0 allows for many different behaviors near x0 : for instance cusp-like singularities, such as |x − x0 |h or very oscillatory behaviors, such as
$$g_{h,\beta}(x) := |x - x_0|^{h} \sin\left(\frac{1}{|x - x_0|^{\beta}}\right) \qquad (1)$$
for β > 0. The functions gh,β are the most simple examples of chirps at x0 . In signal analysis, this notion is expected to give a model for functions whose “instantaneous frequency” increases fast at some time (see [16]). The oscillating singularity exponent β measures how fast the instantaneous frequency of (1) diverges at x0 . (We will give a precise definition of β for an arbitrary function in Sect. 2.2). Furthermore, such local oscillations can make the multifractal formalism wrong, see [1]. The introduction at each point of both exponents h and β thus has two advantages: first, it gives much more complete information on the pointwise behavior; second, it leads to extensions of the multifractal formalism which have a wider range of validity, as shown in [13]. In Theorem 2, we will also determine the spectrum of oscillating singularities d(h, β) of random wavelet series (i.e. the Hausdorff dimension of the set E(h, β) of points where a given sample path has Hölder exponent h and oscillating singularity exponent β). It should be noticed at this point that different functions with the same histograms of wavelet coefficients at each scale can have completely different spectra of singularities, see [11]; in other words, it is not only the histograms of wavelet coefficients which are important in multifractal analysis, but also the positions of these coefficients. Therefore no formula that gives the spectrum from these histograms can be valid in all generality. But we may expect that some formulas are “more valid than others”; indeed, we will show that, if the values of the coefficients are i.i.d. random variables, there exists an almost sure spectrum (which is a deterministic function); we will explicitly compute this spectrum and show that it differs from the spectra proposed up to now. This study will lead us to consider countable intersections and differences of some random fractals with interesting properties (Sect. 
5); they are related to the sets with large intersection previously introduced and studied by Falconer [8].
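As a small numerical illustration of the chirp (1) discussed above, the following Python sketch evaluates $g_{h,\beta}$ and locates its zeros near $x_0$. The function names `chirp` and `chirp_zero` and the parameter values are illustrative assumptions, not notation from the paper. The zeros sit at distances $(k\pi)^{-1/\beta}$ from $x_0$, so they accumulate at $x_0$, which is one way to see the "instantaneous frequency" diverging.

```python
import math

def chirp(x, x0, h, beta):
    """Evaluate g_{h,beta}(x) = |x - x0|^h * sin(1 / |x - x0|^beta), with value 0 at x0."""
    r = abs(x - x0)
    if r == 0.0:
        return 0.0
    return r ** h * math.sin(r ** (-beta))

def chirp_zero(k, beta):
    """Distance from x0 of the k-th zero (k >= 1): solves r^(-beta) = k * pi."""
    return (k * math.pi) ** (-1.0 / beta)

# Illustrative parameters (assumptions made for this sketch only).
x0, h, beta = 0.0, 0.5, 2.0
zeros = [chirp_zero(k, beta) for k in range(1, 6)]
gaps = [zeros[i] - zeros[i + 1] for i in range(len(zeros) - 1)]
```

In this sketch the list `gaps` is decreasing, reflecting oscillations that become faster as $x \to x_0$, while $|g_{h,\beta}(x)| \le |x - x_0|^h$ gives the Hölder-type envelope.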
2. Results Valid for All Functions Since we are interested in local properties of wavelet expansions, it is more convenient to work with periodic wavelets which are obtained by a periodization of a usual wavelet basis, see [6, 22], and are thus defined on the unit torus T := R/Z. Extensions to R and higher dimension are straightforward. We use a wavelet ψ in the Schwartz class such
that the periodized wavelets
$$2^{j/2}\,\psi_{j,k}(x) := 2^{j/2} \sum_{l \in \mathbb Z} \psi\bigl(2^j(x - l) - k\bigr), \qquad j \in \mathbb N,\ 0 \le k < 2^j,$$
form (together with the function $\varphi(x) := 1$) an orthonormal basis of $L^2(\mathbb T)$. We suppose that $\psi$ has “enough” vanishing moments, in a sense to be made precise later (Proposition 4.1). Thus any one-periodic function $f$ can be written
$$f(x) = \sum_{j,k} C_{j,k}\,\psi_{j,k}(x),$$
where the wavelet coefficients of $f$ are given by
$$C_{j,k} := 2^j \int_0^1 \psi_{j,k}(t)\, f(t)\, dt.$$
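2.1. Histograms of coefficients. Let us now define some quantities that will be pertinent in our study. For each $j$, let
$$N_j(\alpha) := \#\bigl\{k : |C_{j,k}| \ge 2^{-\alpha j}\bigr\}.$$
We note, for $\alpha \ge 0$,
$$\lambda(\alpha) := \limsup_{j \to +\infty} \frac{\log_2\bigl(N_j(\alpha)\bigr)}{j}, \qquad \rho(\alpha, \varepsilon) := \limsup_{j \to \infty} \frac{\log_2\bigl(N_j(\alpha + \varepsilon) - N_j(\alpha - \varepsilon)\bigr)}{j},$$
and
$$\rho(\alpha) := \inf_{\varepsilon > 0} \rho(\alpha, \varepsilon).$$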
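The counting function $N_j(\alpha)$ and the finite-scale ratios entering $\lambda$ and $\rho$ can be sketched in a few lines of Python. This is only an illustration at a single scale: the helper names `count_large` and `log_density` and the toy coefficient values are assumptions introduced here, and the limsup over $j$ and the infimum over $\varepsilon$ are not computed.

```python
import math

def count_large(coeffs, j, alpha):
    """N_j(alpha): number of coefficients at scale j with |C_{j,k}| >= 2^(-alpha*j)."""
    threshold = 2.0 ** (-alpha * j)
    return sum(1 for c in coeffs if abs(c) >= threshold)

def log_density(coeffs, j, alpha, eps):
    """Single-scale ratio log2(N_j(alpha+eps) - N_j(alpha-eps)) / j (or -inf if the count is 0)."""
    n = count_large(coeffs, j, alpha + eps) - count_large(coeffs, j, alpha - eps)
    return math.log2(n) / j if n > 0 else float("-inf")

# Toy data at scale j = 4 (2^4 = 16 made-up coefficients):
# 4 coefficients of size 2^(-j) and 12 of size 2^(-2j).
j = 4
coeffs = [2.0 ** (-1.0 * j)] * 4 + [2.0 ** (-2.0 * j)] * 12

n_at_1 = count_large(coeffs, j, 1.0)          # counts only the 4 larger coefficients
n_at_2 = count_large(coeffs, j, 2.0)          # counts all 16 coefficients
ratio_at_1 = log_density(coeffs, j, 1.0, 0.5)  # log2(4) / 4
```

With these toy values, the ratio near $\alpha = 1$ is $\log_2(4)/4 = 0.5$, matching the reading "about $2^{0.5 j}$ coefficients of size of order $2^{-j}$" at this one scale.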
(The reader should keep in mind the following heuristic interpretation: there are about $2^{\lambda(\alpha) j}$ coefficients larger than $2^{-\alpha j}$, and about $2^{\rho(\alpha) j}$ coefficients of size of order $2^{-\alpha j}$.) 2.2. Oscillating singularity exponents. We will study the sets of points where $f$ has a given Hölder exponent, and local oscillations as in (1). We first have to define a pointwise oscillation exponent. Two definitions have been used previously, and we will briefly discuss them in order to motivate our choice. If $f \in L^\infty(\mathbb R)$, denote by $f^{(-n)}$ an iterated primitive of $f$ of order $n$. A consequence of the oscillations of (1) near $x_0$ is that $g_{h,\beta}^{(-n)}$ is $C^{h + n(\beta+1)}(x_0)$ (the increase of regularity at $x_0$ is not 1 at each integration, as would be expected for an arbitrary function, but $\beta + 1$). This remark motivated the following definition introduced by Y. Meyer, see [16]. Definition. Let $h \ge 0$ and $\beta > 0$. A function $f \in L^\infty(\mathbb R)$ is a chirp of type $(h, \beta)$ at $x_0$ if, for every $n \ge 0$, $f$ can be written as $f = g_n^{(n)}$, where $g_n \in C^{h + n(1+\beta)}(x_0)$.
Such functions, under an additional regularity assumption (Definition 2), can be characterized by size estimates of their wavelet transform, see [16]. One immediately meets some difficulties when using this definition for experimental data. Indeed it is not stable when one adds to $f$ a function $g$ which is arbitrarily smooth, but not $C^\infty$: for instance $|x|^{1/3} \sin\frac{1}{x}$ is a chirp of type $\left(\frac13, 1\right)$ while $|x|^{1/3} \sin\frac{1}{x} + |x|^{h}$ (if $h > 1$ and $h \notin 2\mathbb Z$) is a chirp of type $\left(\frac13, 0\right)$, even if $h$ is chosen arbitrarily large. Nonetheless the strongest singularity at 0 is the chirp $x^{1/3} \sin\frac{1}{x}$, and one actually observes its oscillatory behavior after magnifying the graph enough near the origin, see [1]. Thus the local oscillations are not reflected in the chirp exponents since this function is a chirp of type $\left(\frac13, 0\right)$ at the origin, and the oscillation exponent $\beta$, if defined as above, is a very unstable quantity. This drawback can be avoided by introducing a slightly different definition of oscillating singularities which agrees with the definition of a chirp for functions such as (1), and has the required stability properties with respect to the addition of “smooth noise”. Consider
$$f_0(x) := |x - x_0|^{h}\, g\!\left(\frac{1}{|x - x_0|^{\beta}}\right) + O\bigl(|x - x_0|^{h'}\bigr), \qquad (2)$$
where $h' > h$; the first term describes the local behavior of $f_0$ near $x_0$, so that, if $x \mapsto \int_0^x g(t)\,dt \in L^\infty(\mathbb R)$, the oscillating singularity exponent at $x_0$ should be $(h, \beta)$. Let $h_t(x_0)$ denote the Hölder exponent of the fractional primitive of order $t$ at $x_0$ of a function $f$. More precisely, if $f$ is locally bounded, we denote by $h_t(x_0)$ the Hölder exponent at $x_0$ of the function
$$f_t := (\mathrm{Id} - \Delta)^{-t/2}(\phi f),$$
where $\phi$ is a $C^\infty$ compactly supported function satisfying $\phi(x_0) = 1$. In the case of the function $f_0$ defined by (2), for $t$ small enough, $h_t(x_0) = h + (1 + \beta)t$: the increase of pointwise Hölder regularity at $x_0$ after a fractional integration of very small order $t$ is $(1 + \beta)t$. This remark motivated the following definition of [1]. Definition 1. Let $f : \mathbb R^d \to \mathbb R$ be a locally bounded function and $x_0$ such that $h_t(x_0) < +\infty$. The oscillating singularity exponents of $f$ at $x_0$ are defined by
$$(h, \beta) := \left(h(x_0),\ \frac{\partial}{\partial t} h_t(x_0)\Big|_{t=0} - 1\right).$$
This definition makes sense because, for a given $x_0$, the function $t \mapsto h_t(x_0)$ is differentiable (with a possible derivative of $+\infty$, so that $\beta$ can be infinite), see [1]. The following proposition of [3] shows that this definition does adequately recapture the oscillatory behavior of (2). We first need to make a minimal regularity hypothesis. Definition 2. We say that a function $f$ is uniform Hölder if
$$f \in \bigcup_{r > 0} C^{r}(\mathbb T). \qquad (3)$$
This condition will also be necessary for our main theorems; see Appendix A for a more complete discussion.
Proposition 2.1. If $f$ is uniform Hölder, then for any $h < h(x_0)$ and any $\beta < \beta(x_0)$, it can be written
$$f(x) = |x - x_0|^{h}\, g\!\left(\frac{1}{|x - x_0|^{\beta}}\right) + r(x),$$
where $r(x) \in C^{\alpha}(x_0)$ for an $\alpha > h(x_0)$, and $g$ is indefinitely oscillating, i.e., $g$ has bounded primitives of all orders on $\mathbb R^+$ and on $\mathbb R^-$. Note that, in the first example of random wavelet series studied in [14], chirp exponents were considered; however it is not difficult to see that, in this very particular case, the chirp exponent is everywhere identical to the oscillating singularity exponent. In the general setting of the present paper, they usually differ; and, for the reasons discussed above, it makes more sense to determine oscillating singularity exponents. We will obtain two kinds of results concerning either spectra of singularities or spectra of oscillating singularities. The first kind of results will be a priori upper bounds deduced from $\rho(\alpha)$. These bounds hold for any function which is uniform Hölder. The other type of results will be specific to random wavelet series, and we will show that, in this case, the upper bounds become equalities. 2.3. General upper bounds for spectra. As soon as we talk about pointwise Hölder regularity, $f$ has to be locally bounded. This condition cannot be characterized by a condition on the modulus of wavelet coefficients (see [26]), so if we want to measure Hölder and oscillating singularity exponents with wavelet coefficients, we need to make a stronger assumption such as (3). This uniform Hölder regularity is not the weakest known condition on the modulus of wavelet coefficients that ensures local boundedness. In Appendix A, we present a weak uniform Hölder regularity which is sufficient for Proposition 2.1 to hold, as well as many of the results in the following sections. However, we shall also see that Theorems 1 and 2 cannot hold with only the weak hypothesis. 
This is why, in the following, (strong) uniform Hölder regularity is assumed, even though it is sometimes less than optimal. The following result will be proved in Sect. 4.2. Theorem 1. If $f$ is uniform Hölder, its spectrum of oscillating singularities satisfies, for all $h \ge 0$, $0 \le \beta < +\infty$,
$$d(h, \beta) \le (1 + \beta)\,\rho\!\left(\frac{h}{1 + \beta}\right). \qquad (4)$$
Its spectrum of singularities satisfies, for $h \ge 0$,
$$d(h) \le h \sup_{\alpha \in (0, h]} \frac{\rho(\alpha)}{\alpha} \qquad (5)$$
(with the convention that sup(∅) = −∞). Remark. If f is a given function, using two different wavelet bases might lead to two different functions ρ and hence to two different upper bounds for d(h), and similarly for d(h, β). More generally, if ρ depends on the wavelet basis, it would be important to determine the quantities that can be deduced from ρ and are “wavelet-invariant” (for instance, it is the case for the concave hull of ρ, because of its relation with η). We intend to consider the general problem in a forthcoming paper.
In order to relate this result to previously published ones, we recall the definition of the scaling function, which was first introduced by physicists in the context of fully developed turbulence, see [10]. Definition 3. The scaling function $\eta(p)$ is defined for $p > 0$ by any of the formulas
$$\eta(p) := \sup\bigl\{s : f \in B^{s/p,\infty}_{p,\mathrm{loc}}\bigr\} := \liminf_{j \to +\infty} -\frac{1}{j} \log_2\!\left(2^{-j} \sum_{k=0}^{2^j - 1} |C_{j,k}|^p\right) := \liminf_{j \to +\infty} -\frac{1}{j} \log_2\!\left(2^{-j} \int 2^{-\alpha p j} N_j(\alpha)\, d\alpha\right).$$
The first two definitions coincide because of the wavelet characterization of Besov spaces; the second and the third coincide because
$$\sum_{k=0}^{2^j - 1} |C_{j,k}|^p = \int 2^{-\alpha p j}\, dN_j(\alpha) = pj \int 2^{-\alpha p j} N_j(\alpha)\, d\alpha.$$
The following proposition, proved in Sect. 4.2, relates $\eta(p)$ to $\rho(\alpha)$. Note that absolutely no regularity assumption is made here (not even that $f$ is a function). Proposition 2.2. For any periodic tempered distribution $f$, we have
$$\eta(p) = \inf_{\alpha \ge 0} \bigl(\alpha p - \rho(\alpha) + 1\bigr). \qquad (6)$$
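Formula (6) is a Legendre-type transform, and a discrete version of it is easy to sketch in Python. In the snippet below, the density $\rho$ is replaced by a hypothetical finite table of values (any $\alpha$ absent from the table is treated as $\rho(\alpha) = -\infty$); the function name `eta` and the two example tables are assumptions made for illustration only.

```python
NEG_INF = float("-inf")

def eta(p, rho_table):
    """Discrete analogue of (6): minimum over tabulated alpha of alpha*p - rho(alpha) + 1.

    rho_table maps alpha -> rho(alpha); entries equal to -inf are skipped,
    since they cannot realize the infimum.
    """
    return min(alpha * p - value + 1.0
               for alpha, value in rho_table.items()
               if value > NEG_INF)

# Hypothetical example 1: all coefficients of size about 2^(-j/2), so rho(1/2) = 1.
mono = {0.5: 1.0}

# Hypothetical example 2: two populations of coefficients.
two = {0.25: 0.5, 0.75: 1.0}

eta_mono_3 = eta(3.0, mono)   # equals 0.5 * 3: linear scaling function
eta_two_1 = eta(1.0, two)     # both branches give 0.75
eta_two_2 = eta(2.0, two)     # the branch alpha = 0.25 wins
```

The first table gives a linear $\eta(p) = p/2$, while the second gives the concave function $\min(0.25\,p + 0.5,\ 0.75\,p)$, in line with the fact that (6) only sees the concave hull of $\rho$.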
Let us now deduce some implications of Theorem 1. If $f$ is uniform Hölder, it is shown in [15] that there exists a unique critical exponent $p_c$ such that $\eta(p_c) = 1$. We will see, after the proof of Theorem 1, that in this case (5) implies the classical bound
$$d(h) \le \inf_{p \ge p_c} \bigl(ph - \eta(p) + 1\bigr), \qquad (7)$$
proved in [11] (see also [15]). Nonetheless, (5) clearly yields a sharper estimate if ρ(α) is not concave. This remark is important since it shows a situation where strictly more information can be deduced from the histogram of the wavelet coefficients than from the scaling function, or from the set of Besov spaces to which the function considered belongs. As a consequence, a multifractal formalism obtained by claiming equality in (5) has a domain of validity which is strictly larger than for the classical multifractal formalism based on equality in (7) (this will be developed in Sect. 3.3). 3. Random Wavelet Series In this section, we study the random processes obtained by first choosing an (almost) arbitrary sequence of histograms of wavelet coefficients at each scale, and then drawing at random each wavelet coefficients at each scale inside the corresponding histogram, independently. We will see that, for this general class of random processes, the sample paths are multifractal, and the spectra almost surely satisfy equality in (5) and (4).
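3.1. Distribution of the coefficients. We shall use the following notation: bold symbols ($\boldsymbol\rho_j, \boldsymbol\rho, \boldsymbol\lambda, \dots$) denote deterministic quantities that are derived from the probability laws. Thin symbols ($N_j, \rho, \lambda, \dots$) denote empirical quantities, that are measured on one sample path. Let $(\Omega, \mathcal F, \mathbb P)$ be a probability space. We suppose that, at each scale $j$, the wavelet coefficients of the process are drawn independently with a given law; $\boldsymbol\rho_j$ will denote the common probability measure of the $2^j$ random variables $X_{j,k} := -\log_2(|C_{j,k}|)/j$ (the signs, or arguments, of the wavelet coefficients have no consequence for Hölder regularity, therefore, we do not need to make any assumption on them). The measure $\boldsymbol\rho_j$ thus satisfies
$$\mathbb P\bigl(|C_{j,k}| \ge 2^{-\alpha j}\bigr) = \boldsymbol\rho_j\bigl((-\infty, \alpha]\bigr) \quad\text{and}\quad \mathbb E\bigl(N_j(\alpha)\bigr) = 2^j \boldsymbol\rho_j([0, \alpha]).$$
We note, for $\alpha \ge 0$,
$$\boldsymbol\lambda(\alpha) := \limsup_{j \to +\infty} \frac{\log_2\bigl(2^j \boldsymbol\rho_j([0, \alpha])\bigr)}{j}, \qquad (8)$$
$$\boldsymbol\rho(\alpha, \varepsilon) := \limsup_{j \to +\infty} \frac{\log_2\bigl(2^j \boldsymbol\rho_j([\alpha - \varepsilon, \alpha + \varepsilon])\bigr)}{j}$$
and
$$\boldsymbol\rho(\alpha) := \inf_{\varepsilon > 0} \boldsymbol\rho(\alpha, \varepsilon). \qquad (9)$$
We call $\boldsymbol\rho(\alpha)$ the upper logarithmic density of the sequence $\boldsymbol\rho_j$. Definition 4. We say that
$$f := \sum_{j \in \mathbb N} \sum_{k=0}^{2^j - 1} C_{j,k}\, \psi_{j,k}$$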
is a Random Wavelet Series (R.W.S.) if there exists γ > 0 such that α < γ implies ρ(α) < 0. We assume from now on that this requirement is satisfied. The following propositions give the relationships between the quantities we defined, and show what are the “admissible” functions λ and ρ that can be obtained by (8) and (9). They will be proved in Sect. 4.3. Proposition 3.1. The function λ is nondecreasing and for all α, λ(α) ≤ 1; for any α < γ , λ(α) < 0. Conversely, for any function λ verifying these properties, there exists a R.W.S. such that (8) holds. Proposition 3.2. The function ρ is upper semi-continuous and for all α, ρ(α) ≤ 1; for any α < γ , ρ(α) < 0. Conversely, for any function ρ verifying these conditions, there exists a R.W.S. such that (9) holds.
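To make the construction concrete, here is a minimal Python sketch of one scale of a random wavelet series in a deliberately simplified setting, in the spirit of the single-value case attributed to [14] in the introduction. The specific law used here is an assumption of this sketch: each of the $2^j$ coefficients equals $2^{-\alpha j}$ with probability $2^{-\eta j}$ and is $0$ otherwise (here `eta` is just a sparsity parameter of the sketch, not the scaling function). Under this assumed law the expected count is $2^j \cdot 2^{-\eta j} = 2^{(1-\eta) j}$.

```python
import random

def sample_scale(j, alpha, eta, rng):
    """Draw the 2^j coefficients of scale j independently.

    Assumed law (for illustration): a coefficient equals 2^(-alpha*j)
    with probability 2^(-eta*j), and 0 otherwise.
    """
    prob = 2.0 ** (-eta * j)
    size = 2.0 ** (-alpha * j)
    return [size if rng.random() < prob else 0.0 for _ in range(2 ** j)]

rng = random.Random(0)           # fixed seed so the sketch is reproducible
j, alpha, eta = 12, 0.5, 0.5     # illustrative parameter choices
coeffs = sample_scale(j, alpha, eta, rng)

# Empirical N_j(alpha) on this sample path versus its expectation.
n_j = sum(1 for c in coeffs if c >= 2.0 ** (-alpha * j))
expected = 2.0 ** ((1.0 - eta) * j)
```

Comparing `n_j` with `expected` at increasing scales is one way to see, on a single sample path, the relation between empirical counts and the deterministic density described in this section.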
We note $\bar{\boldsymbol\lambda}$ the upper closure of $\boldsymbol\lambda$: the hypograph of $\bar{\boldsymbol\lambda}$ is the closure of the hypograph of $\boldsymbol\lambda$. $\bar{\boldsymbol\lambda}$ is the only càdlàg function which coincides with $\boldsymbol\lambda$ almost everywhere. Or, equivalently since $\boldsymbol\lambda$ is increasing,
$$\bar{\boldsymbol\lambda}(\alpha) := \lim_{\alpha' \to \alpha^+} \boldsymbol\lambda(\alpha').$$
Proposition 3.3. For all $\alpha \ge 0$,
$$\bar{\boldsymbol\lambda}(\alpha) = \sup_{\alpha' \le \alpha} \boldsymbol\rho(\alpha'). \qquad (10)$$
Conversely, if (10) holds, as well as the conditions specified in Propositions 3.1 and 3.2, there exists a R.W.S. such that both (8) and (9) hold. Remark 3.1. Since the proof of Proposition 3.3 uses only the definitions (8) and (9), it is easy to see that (10) also holds for the corresponding empirical quantities $\bar\lambda(\alpha)$ and $\rho(\alpha)$ of any function. Proposition 3.4. Let
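$$W = \Bigl\{\alpha : \forall \varepsilon > 0,\ \sum_{j \in \mathbb N} 2^j \boldsymbol\rho_j([\alpha - \varepsilon, \alpha + \varepsilon]) = +\infty\Bigr\}.$$
With probability one, for all $\alpha \ge 0$,
$$\rho(\alpha) = \begin{cases} \boldsymbol\rho(\alpha) & \text{if } \alpha \in W, \\ -\infty & \text{else} \end{cases} \qquad (11)$$
and
$$\lambda(\alpha) = \begin{cases} \boldsymbol\lambda(\alpha) & \text{if } \alpha \ge \inf(W), \\ -\infty & \text{else.} \end{cases} \qquad (12)$$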
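Corollary 3.5. A R.W.S. is almost surely uniform Hölder. Proof. From (11) and Definition 4, almost surely, $\rho(\alpha) = -\infty$ when $\alpha < \gamma$. This implies that, with a possible finite number of exceptions, $|C_{j,k}| \le 2^{-\gamma j}$; this is equivalent to $f \in C^{\gamma}(\mathbb T)$. Remark. It is easy to check that $\boldsymbol\rho(\alpha) > 0 \Rightarrow \alpha \in W$ and that $\boldsymbol\rho(\alpha) < 0 \Rightarrow \alpha \notin W$. In cases where $\boldsymbol\rho(\alpha) = 0 \Rightarrow \alpha \in W$, (11) boils down to:
$$\rho(\alpha) = \begin{cases} \boldsymbol\rho(\alpha) & \text{if } \boldsymbol\rho(\alpha) \ge 0, \\ -\infty & \text{else.} \end{cases}$$
Remark. Let $\alpha_0$ be defined by
$$\alpha_0 := \inf\Bigl\{\gamma : \sup_{[0,\gamma]} \boldsymbol\rho(\alpha) = \sup_{\mathbb R^+} \boldsymbol\rho(\alpha)\Bigr\}$$
(possibly $\alpha_0 = +\infty$). Since $\eta(p)$ is computed for positive values of $p$, the infimum in (6) can be computed on $[0, \alpha_0]$. We can thus replace $\rho(\alpha)$ by $\boldsymbol\rho(\alpha)$ in (6); in this case, because of (6), $\eta(p)$ depends only on the concave hull of $\boldsymbol\rho(\alpha)$ on $[0, \alpha_0]$; but, because of (10), the concave hulls of $\boldsymbol\rho(\alpha)$ and of $\boldsymbol\lambda(\alpha)$ coincide on $[0, \alpha_0]$; it follows that, almost surely,
$$\eta(p) = \inf_{\alpha \ge 0} \bigl(\alpha p - \boldsymbol\lambda(\alpha) + 1\bigr).$$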
J.-M. Aubry, S. Jaffard
3.2. Almost everywhere regularity and exact spectra. From now on, we suppose that ρ(α) takes a positive value for at least one value of α. Let W be defined as in Proposition 3.4, $h_{\min} := \inf(W)$ and
\[ h_{\max} := \Big( \sup_{\alpha > 0} \frac{\rho(\alpha)}{\alpha} \Big)^{-1}. \]
The supremum of ρ(α)/α on an interval (0, h] cannot be achieved at a point in the neighborhood of which λ(α) is constant, so
\[ \sup_{\alpha \in (0,h]} \frac{\rho(\alpha)}{\alpha} = \sup_{\alpha \in (0,h]} \frac{\bar\lambda(\alpha)}{\alpha}, \]
and, since $\bar\lambda$ is càdlàg, this supremum is clearly achieved. In particular, we will denote by $\tilde\alpha$ the largest value of α for which
\[ \sup_{\alpha \in (0,h_{\max}]} \frac{\bar\lambda(\alpha)}{\alpha} \]
is achieved (or $\tilde\alpha = 0$ if $h_{\max} = 0$).
Theorem 2. Let f be a random wavelet series. With probability one, f has the following properties:
• For almost every x,
\[ h_f(x) = h_{\max} \tag{13} \]
and
\[ \beta_f(x) = \frac{1}{\bar\lambda(\tilde\alpha)} - 1. \tag{14} \]
• The spectrum of oscillating singularities of f is defined for (h, β) in the rectangle $[h_{\min}, h_{\max}] \times \big[0, \frac{h}{h_{\min}} - 1\big]$, where
\[ d(h, \beta) = (1 + \beta)\, \rho\Big( \frac{h}{1+\beta} \Big). \tag{15} \]
• The spectrum d(h) is defined for $h \in [h_{\min}, h_{\max}]$, where
\[ d(h) = h \sup_{\alpha \in (0,h]} \frac{\rho(\alpha)}{\alpha} = h \sup_{\alpha \in (0,h]} \frac{\bar\lambda(\alpha)}{\alpha}. \tag{16} \]
Remark. The assertions (16) and (15) are stronger than stating that, for each h and β, d(h) or d(h, β) has almost surely a given value, which would not be sufficient to determine the spectrum of singularities and the spectrum of oscillating singularities of almost every sample path. By inspecting (16), it is clear that d(h) need not be concave, which shows another possible occurrence of the failure of the classical multifractal formalism (this will be developed in Sect. 3.3 below). The function $h \mapsto h \sup_{\alpha \in (0,h]} \frac{\rho(\alpha)}{\alpha}$ is increasing on $(0, h_{\max}]$ and takes the value d(h) = 1 for $h = h_{\max}$, which is compatible with (13). Similarly, using Formula (15), we see that
\[ d\Big( h_{\max}, \frac{1}{\bar\lambda(\tilde\alpha)} - 1 \Big) = \frac{1}{\bar\lambda(\tilde\alpha)}\, \rho\big( h_{\max} \bar\lambda(\tilde\alpha) \big) = \frac{1}{\bar\lambda(\tilde\alpha)}\, \rho(\tilde\alpha) = 1, \]
which is compatible with (14). Comparing Theorem 1 and Theorem 2, we see that both spectra d(h) and d(h, β) of random wavelet series take the largest possible values compatible with the bounds (5) and (4), which shows first that these bounds are optimal, and second that random wavelet series strive to have their Hölder singularities on sets of dimension as large as possible. Note finally that a random wavelet series satisfies the very natural property
\[ d(h) = \sup_{\beta \ge 0} d(h, \beta). \tag{17} \]
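As a numerical illustration of (15), (16) and (17), the following Python sketch evaluates both spectra for a hypothetical density ρ (a truncated parabola chosen only for illustration, not taken from the paper) and compares sup over β of d(h, β) with d(h) on a grid. The substitution α = h/(1+β) turns (1+β)ρ(h/(1+β)) into hρ(α)/α, which is why the two quantities should agree. The function names, the example ρ and the grid sizes are all assumptions of this sketch.

```python
import numpy as np

def rho(alpha):
    """Hypothetical example density (an assumption of this sketch):
    a parabola with maximum rho(1) = 1, set to -inf where it would be negative."""
    alpha = np.asarray(alpha, dtype=float)
    val = 1.0 - 4.0 * (alpha - 1.0) ** 2
    return np.where(val >= 0.0, val, -np.inf)

def d_spectrum(h, n=20001):
    """Grid approximation of d(h) = h * sup_{alpha in (0, h]} rho(alpha) / alpha, as in (16)."""
    alpha = np.linspace(h / n, h, n)
    return h * np.max(rho(alpha) / alpha)

def d_oscillating(h, beta):
    """d(h, beta) = (1 + beta) * rho(h / (1 + beta)), as in (15)."""
    beta = np.asarray(beta, dtype=float)
    return (1.0 + beta) * rho(h / (1.0 + beta))

if __name__ == "__main__":
    h_min = 0.5  # smallest alpha with rho(alpha) >= 0 for this example
    for h in (0.6, 0.75, 0.9):
        betas = np.linspace(0.0, h / h_min - 1.0, 20001)
        print(h, d_spectrum(h), np.max(d_oscillating(h, betas)))
```

For this example the two printed columns should coincide up to the grid resolution, in line with (17).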
3.3. Multifractal formalisms. A multifractal formalism is a formula which allows one, in many cases, to compute numerically the spectrum of singularities d(h) of a function, usually based on its wavelet coefficients. As an illustration, let us assume that we are given a function f with an upper logarithmic density of histograms ρ(α) as in Fig. 1 (and α < αmin ⇒ ρ(α) = −∞). Two main methods are used to derive a multifractal formalism. The first one consists in deducing formulas directly from “box-counting” arguments. This leads to the original Fenchel–Legendre transform formula of Frisch and Parisi (Fig. 2), which asserts that d(h) = d1(h), where
\[ d_1(h) := \inf_{p \ge 0} (hp - \eta(p) + 1). \]
An alternative formula, based on a large deviation-type argument, simply asserts that d(h) = ρ(h); let us sketch its justification. If the Hölder exponent at x0 is h and is determined by the wavelet coefficients such that the support of the corresponding wavelet includes x0 (this is called a “cusp-type” singularity at x0), then (18) implies that these wavelet coefficients are of the order of magnitude of $2^{-hj}$. If the corresponding dimension is d(h), we expect to find about $2^{d(h)j}$ such coefficients, but we know that there are about $2^{\rho(h)j}$ of them, hence the formula.
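The two formulas just described can be compared numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it takes a hypothetical non-concave density ρ (the maximum of two parabolas), computes η(p) as the infimum over α of αp − ρ(α) + 1 on a grid, and then the Fenchel–Legendre spectrum d1(h). The grids, the example ρ and the function names are choices made for this sketch only.

```python
import numpy as np

# Grids for alpha and p (sizes and ranges are arbitrary choices of this sketch).
alphas = np.linspace(0.0, 2.0, 2001)
ps = np.linspace(0.0, 20.0, 2001)

def rho(alpha):
    """Hypothetical non-concave density: maximum of two parabolas, -inf where negative."""
    alpha = np.asarray(alpha, dtype=float)
    bump1 = 0.5 - 8.0 * (alpha - 0.6) ** 2
    bump2 = 1.0 - 8.0 * (alpha - 1.2) ** 2
    val = np.maximum(bump1, bump2)
    return np.where(val >= 0.0, val, -np.inf)

rho_vals = rho(alphas)

def eta(p):
    """Grid version of eta(p) = inf_{alpha >= 0} (alpha * p - rho(alpha) + 1)."""
    return np.min(alphas * p - rho_vals + 1.0)

eta_vals = np.array([eta(p) for p in ps])

def d1(h):
    """Grid version of d1(h) = inf_{p >= 0} (h * p - eta(p) + 1)."""
    return np.min(h * ps - eta_vals + 1.0)

if __name__ == "__main__":
    for h in (0.6, 0.85, 1.2):
        print(h, float(rho(h)), d1(h))
```

In this example d1(h) lies above ρ(h) everywhere and strictly above it in the dip between the two bumps, which is the behaviour expected of an increasing concave hull.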
Fig. 1. Example of ρ(α) (horizontal axis: α, with αmin marked; vertical axis from 0 to 1)
Fig. 2. The Fenchel–Legendre transform spectrum d1(h) (horizontal axis: h; vertical axis from 0 to 1)
A less intuitive method for deriving a multifractal formalism consists first in obtaining sharp upper bounds for d(h), based either on “Besov-type information” (summed up in the scaling function η(p)) or based on wavelet histograms (in which case the information is summed up in the function ρ(α)). In both cases, the multifractal formalism asserts that, for a “large” class of functions, these upper bounds must be saturated; “large” meaning either in the sense of Baire categories if we only have a function space setting, or almost surely if we have a precise probabilistic model (which is the case in the present paper). This leads us to two additional possible multifractal formalisms. The first one states that d(h) = d2(h), where d2(h) is the quasi-sure spectrum (Fig. 3)
\[ d_2(h) := \inf_{p \ge p_c} (hp - \eta(p) + 1), \]
where $p_c$ is the only value of p for which η(p) = 1 (see [15]).
Fig. 3. The quasi-sure spectrum d2(h) (horizontal axis: h, with hc marked; vertical axis from 0 to 1)
The second one states that d(h) = d3(h), where d3(h) is the almost-sure spectrum studied in this paper (Fig. 4),
\[ d_3(h) := h \sup_{\alpha \in (0,h]} \frac{\rho(\alpha)}{\alpha}. \]
Fig. 4. The almost-sure spectrum d3(h) (horizontal axis: h, with hc marked; vertical axis from 0 to 1)
Let us now compare these different formulas. We saw that, for any Hölder function, d(h) ≤ d3 (h) ≤ d2 (h).
Of course, d1(h) ≤ d2(h) and ρ(h) ≤ d3(h). Finally, since η(p) is the Fenchel–Legendre transform of ρ(h), d1(h) is the increasing concave hull of ρ(h), so that ρ(h) ≤ d1(h). These are the only inequalities between spectra that hold in all generality; indeed d(h) can be larger than d1(h), see [15], ρ(h) can be smaller than d(h), as shown in the present paper, and it can be larger than d(h), as shown in [11]. Figure 5 recapitulates these inequalities (−→ means ≤ for any uniform Hölder function).
d(h) −→ d3(h) −→ d2(h)
ρ(h) −→ d1(h)
Fig. 5. Comparison between multifractal formalisms
Note that the inequality d3(h) ≤ d2(h) fits the classical “rule of thumb” which holds for Fourier series: quasi-all Fourier series display the worst possible regularity, whereas random Fourier series display more regularity, see [18] and [21]. Since quasi-sure results are expected to display the worst possible case (in terms of regularity) and almost-sure results, the best possible case, it is interesting to determine when they coincide. It is clearly the case if and only if ρ(α) is concave for α ≤ hc, where $h_c = \eta'(p_c)$ is the critical Hölder exponent introduced in [15] (indeed, for h ≥ hc, both spectra always coincide). Thus, when this additional condition holds, quasi-sure and almost-sure spectra coincide, which is a very strong mathematical argument in favor of the corresponding multifractal formalism.
4. First Proofs
In this section the propositions and theorems stated in Sects. 2 and 3 are proved. The proof of Theorem 2 will be finished in Sect. 5.
4.1. Regularity and wavelet coefficients. We first recall the characterizations of Hölder and oscillating singularity exponents using wavelet coefficients, see [1, 16].
Proposition 4.1. If f is uniform Hölder, the Hölder exponent of f at x0 is
\[ h_f(x_0) = \liminf_{j \to \infty} \inf_k \frac{\log(|C_{j,k}|)}{\log(2^{-j} + |k2^{-j} - x_0|)}, \tag{18} \]
provided that ψ has at least $h_f(x_0) + 1$ vanishing moments and continuous derivatives, each of them having fast decay. We call any sequence $(j_n, k_n)$ such that
\[ \frac{\log(|C_{j_n,k_n}|)}{\log(2^{-j_n} + |k_n 2^{-j_n} - x_0|)} \to h_f(x_0) \]
a minimizing sequence for the wavelet coefficients of f at x0. For β ≥ 0, let $Z_{x_0}(\beta) \subset \mathbb{N} \times \mathbb{Z}$ be the set of indices (j, k) such that $|k2^{-j} - x_0|^{1+\beta} \le 2^{-j}$. Note that $Z_{x_0}(\beta)$ is increasing with β. The following proposition is proved in [1].
Proposition 4.2. If f is uniform Hölder, its oscillating singularity exponent βf(x0) at x0 is the infimum of the β such that there exists a minimizing sequence in $Z_{x_0}(\beta)$.
Propositions 4.1 and 4.2 will now be reformulated in a way easier to use in our proofs. If α ≥ 0, let $F^j(\alpha) := \{k : |C_{j,k}| \ge 2^{-\alpha j}\}$ and if d ≤ 1, let
\[ E^j(\alpha, d) := \bigcup_{k \in F^j(\alpha)} \big( k2^{-j} - 2^{-dj},\ k2^{-j} + 2^{-dj} \big), \]
and
\[ E(\alpha, d) := \limsup_{j \to \infty} E^j(\alpha, d) := \bigcap_{j \in \mathbb{N}} \bigcup_{m \ge j} E^m(\alpha, d). \tag{19} \]
Since E(α, d) is increasing in α and decreasing in d, for all $x_0 \in \mathbb{T}$, $\gamma_{x_0}(\alpha) := \sup\{d,\ x_0 \in E(\alpha, d)\}$ defines an increasing function $\alpha \mapsto \gamma_{x_0}(\alpha)$, which is bounded by 1. Let $\bar\gamma_{x_0}$ be the upper closure of $\gamma_{x_0}$. Note that $\sup_\alpha \frac{\gamma_{x_0}(\alpha)}{\alpha} = \sup_\alpha \frac{\bar\gamma_{x_0}(\alpha)}{\alpha}$, and that the set of α for which $\sup_\alpha \frac{\bar\gamma_{x_0}(\alpha)}{\alpha}$ is attained is a non-empty compact; we denote by $\alpha(x_0)$ its maximum.
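Proposition 4.3. If f is uniform Hölder, for all x0,
\[ h_f(x_0) = \Big( \sup_{\alpha > 0} \frac{\gamma_{x_0}(\alpha)}{\alpha} \Big)^{-1} = \frac{\alpha(x_0)}{\bar\gamma_{x_0}(\alpha(x_0))} \tag{20} \]
and
\[ \beta_f(x_0) = \frac{1}{\bar\gamma_{x_0}(\alpha(x_0))} - 1. \tag{21} \]
We can also define $H(\alpha, d) := \{x \in \mathbb{T},\ \alpha(x) = \alpha,\ \bar\gamma_x(\alpha(x)) = d\}$; it has the property that
\[ x_0 \in H(\alpha, d) \iff h_f(x_0) = \frac{\alpha}{d} \ \text{ and } \ \beta_f(x_0) = \frac{1}{d} - 1. \tag{22} \]
Proof of Proposition 4.3. Let x0 be fixed and let $h(x_0) := \inf_\alpha \frac{\alpha}{\gamma_{x_0}(\alpha)}$. If $x_0 \in E(\alpha, d)$, there exist sequences $j_n \to +\infty$ and $k_n \in \{0, \dots, 2^{j_n} - 1\}$ such that
\[ |C_{j_n,k_n}| \ge 2^{-\alpha j_n} \ \text{ and } \ |k_n 2^{-j_n} - x_0| \le 2^{-d j_n}. \]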
Because of (18), it follows that $h_f(x_0) \le \frac{\alpha}{d}$, hence that $h_f(x_0) \le \frac{\alpha}{\gamma_{x_0}(\alpha)}$; since this holds for any α > 0, we obtain that $h_f(x_0) \le h(x_0)$. Conversely, let $h > h_f(x_0)$. Then there exist sequences $j_n \to +\infty$ and $k_n \in \{0, \dots, 2^{j_n} - 1\}$ such that
\[ |C_{j_n,k_n}| \ge (2^{-j_n} + |k_n 2^{-j_n} - x_0|)^h. \tag{23} \]
Let d be any accumulation point of the sequence $\frac{\log_2(|k_n 2^{-j_n} - x_0|)}{-j_n}$. If d ≤ 1, for any ε > 0, there exist infinitely many n such that
\[ 2^{-(d+\varepsilon) j_n} \le 2^{-j_n} + |k_n 2^{-j_n} - x_0| \le 2^{-(d-\varepsilon) j_n}. \tag{24} \]
Because of (23), it follows that $|C_{j_n,k_n}| \ge 2^{-j_n h(d+\varepsilon)}$, so that $x_0 \in E(h(d+\varepsilon), d-\varepsilon)$. Thus, for all ε > 0,
\[ \frac{h(d+\varepsilon)}{d-\varepsilon} \ge \inf_\alpha \frac{\alpha}{\gamma_{x_0}(\alpha)}; \]
it follows that $h \ge h(x_0)$. If d > 1, E(α, d) is not defined but then for infinitely many n, $|C_{j_n,k_n}| \ge 2^{-h j_n}$, so $x_0 \in E(h, 1)$ and $\gamma_{x_0}(h) \ge 1$, which implies $h \ge h(x_0)$. In any case, $h > h_f(x_0)$ implies $h \ge h(x_0)$, hence $h_f(x_0) \ge h(x_0)$.
Now let $\beta(x_0) := \frac{1}{\bar\gamma_{x_0}(\alpha(x_0))} - 1$ and let β ∈ (0, β(x0)). In order to show that βf(x0) ≥ β(x0), we need to show that there exists no minimizing sequence in $Z_{x_0}(\beta)$ (see Proposition 4.2). Suppose that $(j_n, k_n)$ is such a sequence, and let
\[ d = \liminf_{n \to \infty} \frac{\log_2(2^{-j_n} + |k_n 2^{-j_n} - x_0|)}{-j_n}; \]
since $(j_n, k_n)$ belongs to $Z_{x_0}(\beta)$,
\[ d \ge \frac{1}{1+\beta} > \frac{1}{1+\beta(x_0)}. \]
For any ε > 0, and infinitely many n, we have both
\[ \log_2(2^{-j_n} + |k_n 2^{-j_n} - x_0|) \ge -(d+\varepsilon) j_n \tag{25} \]
and
\[ \frac{\log_2(|C_{j_n,k_n}|)}{\log_2(2^{-j_n} + |k_n 2^{-j_n} - x_0|)} \le h(x_0) + \varepsilon, \tag{26} \]
hence $\log_2(|C_{j_n,k_n}|) \ge -(h(x_0)+\varepsilon)(d+\varepsilon) j_n$. In other words, as soon as α > h(x0)d, we have x0 ∈ E(α, d), which is a contradiction with the fact that $d > \frac{1}{1+\beta(x_0)}$.
If β > β(x0), $\bar\gamma_{x_0}(\alpha(x_0)) > \frac{1}{1+\beta}$. This implies that $x_0 \in E\big(\alpha(x_0) + \frac{1}{n}, \frac{1}{1+\beta}\big)$ for all n > 1, which means that there exist infinitely many j and k such that $|C_{j,k}| \ge 2^{-j(\alpha(x_0) + \frac{1}{n})}$ and $|k2^{-j} - x_0| \le 2^{-\frac{j}{1+\beta}}$. By taking $j_n \ge n$ among those, and the corresponding $k_n$, we constructed a minimizing sequence in $Z_{x_0}(\beta)$. With Proposition 4.2, this proves that βf(x0) ≤ β(x0).
4.2. Properties of histograms.
Proof of Proposition 2.2. Let α be fixed and ρ0 < ρ(α). For any ε > 0, there exists $j_n \to +\infty$ such that $N_{j_n}(\alpha+\varepsilon) - N_{j_n}(\alpha-\varepsilon) \ge 2^{\rho_0 j_n}$, and therefore
\[ 2^{-j_n} \sum_k |C_{j_n,k}|^p \ge 2^{-j_n}\, 2^{\rho_0 j_n}\, 2^{-(\alpha+\varepsilon) j_n p}, \]
so that η(p) ≤ (α + ε)p − ρ0 + 1. Since this holds for any ε > 0, any ρ0 < ρ(α) and any α ≥ 0, it follows that, for all p > 0,
\[ \eta(p) \le \inf_{\alpha \ge 0} (\alpha p - \rho(\alpha) + 1). \]
Conversely, let A > 0, B be larger than the order of the distribution f , and η > 0. For all α ∈ [−B, A], there exists ε > 0 such that for all ε ≤ ε, |ρ(α, ε ) − ρ(α)| ≤ η. Thus, there exist α1 , . . . , αN and ε1 , . . . , εN ≤ ε such that the intervals [αi − εi , αi + εi ] cover [−B, A], and ∀i, |ρ(αi , εi ) − ρ(αi )| ≤ η. Thus, for each (αi , εi ) there exists Ji such that ∀j ≥ Ji , Nj (αi + εi ) − Nj (αi − εi ) ≤ 2j 2(ρ(αi )+η)j .
(27)
Taking for J the maximum of the $J_i$, it follows that ∀j ≥ J,
\[ 2^{-j} \sum_k |C_{j,k}|^p \le 2^{-j} \sum_{i=1}^{N} 2^{-(\alpha_i - \varepsilon_i) j p}\, 2^{(\rho(\alpha_i)+\eta) j} + 2^{j}\, 2^{-A j p} \]
(the last term corresponds to wavelet coefficients smaller than $2^{-Aj}$). Thus ∀j ≥ J,
\[ 2^{-j} \sum_k |C_{j,k}|^p \le N\, 2^{(\varepsilon p + \eta) j}\, 2^{\sup_\alpha (-\alpha p + \rho(\alpha) - 1) j} + 2^{j}\, 2^{-A j p}. \]
Since ε and η can be chosen arbitrarily small and A arbitrarily large, it follows that, for any p > 0,
\[ \eta(p) \ge -\sup_{\alpha \ge 0} (-\alpha p + \rho(\alpha) - 1) = \inf_{\alpha \ge 0} (\alpha p - \rho(\alpha) + 1), \]
and Proposition 2.2 is proved.
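The identity of Proposition 2.2 can be checked on a toy histogram. The sketch below assumes a hypothetical function whose wavelet coefficients at scale j fall into three groups, with about $2^{\rho_i j}$ coefficients of size $2^{-\alpha_i j}$ in group i and all remaining coefficients equal to zero; the pairs $(\alpha_i, \rho_i)$ and the function names are inventions of this sketch. It compares the finite-scale quantity $-\frac{1}{j}\log_2\big(2^{-j}\sum_k |C_{j,k}|^p\big)$ with the infimum over groups of $\alpha_i p - \rho_i + 1$.

```python
import math

# Hypothetical histogram (assumption of this sketch): pairs (alpha_i, rho_i).
# At scale j there are floor(2^(rho_i * j)) coefficients of size 2^(-alpha_i * j).
HIST = [(0.5, 0.2), (0.8, 0.7), (1.2, 0.9)]

def eta_finite_scale(p, j):
    """-(1/j) * log2( 2^(-j) * sum_k |C_{j,k}|^p ), computed in the log domain."""
    # log2 of each group's contribution: log2(count) - alpha * j * p
    logs = [math.log2(math.floor(2.0 ** (r * j))) - a * j * p for a, r in HIST]
    m = max(logs)
    log_sum = m + math.log2(sum(2.0 ** (x - m) for x in logs))
    return -(log_sum - j) / j

def eta_legendre(p):
    """inf over the histogram of (alpha * p - rho + 1)."""
    return min(a * p - r + 1.0 for a, r in HIST)

if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0, 4.0):
        print(p, eta_finite_scale(p, 400), eta_legendre(p))
```

The gap between the two columns is at most of order $\log_2(3)/j$, since the sum of three terms is within a factor 3 of its largest term.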
We now obtain the upper bounds for spectra. To fix the notations, we recall that for s > 0 and ε > 0 (possibly ε = +∞),
\[ \mathcal{H}^s_\varepsilon(A) := \inf_{r \in R(A,\varepsilon)} \sum_{I \in r} |I|^s, \tag{28} \]
where R(A, ε) is the set of all countable coverings r of A with intervals I of lengths |I| ≤ ε. The s-Hausdorff (outer) measure is defined by
\[ \mathcal{H}^s(A) := \lim_{\varepsilon \to 0} \mathcal{H}^s_\varepsilon(A) \]
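and the Hausdorff dimension by $\dim_H(A) := \inf\{s,\ \mathcal{H}^s(A) = 0\}$.
Proof of Theorem 1. Let h > 0 be given. If f is uniform Hölder, the set of points x0 where hf(x0) = h can be expressed as
\[ \bigcup_{0 < \alpha \le h} H\Big(\alpha, \frac{\alpha}{h}\Big). \]
Let ν > 0 be fixed. We take α ∈ [ν, h], µ > 0, $s = h \frac{\rho(\alpha)}{\alpha} + \mu$, and $\varepsilon \in \big(0, \frac{\mu\alpha}{h+s}\big)$. By definition of ρ(α), there exist η ∈ (0, ε) and $j_0 \in \mathbb{N}$ such that for all $j \ge j_0$,
\[ N_j(\alpha+\eta) - N_j(\alpha-\eta) \le 2^{j(\rho(\alpha)+\varepsilon)}. \]
Let us define the auxiliary set
\[ E_\eta(\alpha, d) := \limsup_{j \to \infty} E^j_\eta(\alpha, d), \]
where $E^j_\eta(\alpha, d) := E^j(\alpha+\eta, d) \setminus E^j(\alpha-\eta, d)$. Clearly, for all $\gamma \in (\alpha - \frac{\eta}{2}, \alpha + \frac{\eta}{2})$,
\[ H\Big(\gamma, \frac{\gamma}{h}\Big) \subset E_\eta\Big(\alpha, \frac{\alpha-\eta}{h}\Big). \]
At scale $j \ge j_0$, $E^j_\eta\big(\alpha, \frac{\alpha-\eta}{h}\big)$ is covered by $N_j(\alpha+\eta) - N_j(\alpha-\eta)$ segments of size $2^{-j\frac{\alpha-\eta}{h}}$, and
\[ \big(N_j(\alpha+\eta) - N_j(\alpha-\eta)\big)\, 2^{-j \frac{\alpha-\eta}{h} s} \le 2^{j(\rho(\alpha)+\varepsilon - s\frac{\alpha-\eta}{h})} \le 2^{j(\rho(\alpha) - \frac{s\alpha}{h} + \varepsilon(1+\frac{s}{h}))}, \]
whose series converges. This proves that
\[ \dim_H \Big( \bigcup_{\gamma \in (\alpha-\frac{\eta}{2}, \alpha+\frac{\eta}{2})} H\Big(\gamma, \frac{\gamma}{h}\Big) \Big) \le s = h \frac{\rho(\alpha)}{\alpha} + \mu. \]
Taking a finite sub-covering of [ν, h] by the $(\alpha - \frac{\eta}{2}, \alpha + \frac{\eta}{2})$, we get
\[ \dim_H \Big( \bigcup_{\nu \le \alpha \le h} H\Big(\alpha, \frac{\alpha}{h}\Big) \Big) \le h \sup_{\nu \le \alpha \le h} \frac{\rho(\alpha)}{\alpha} + \mu. \tag{29} \]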
To conclude, we make µ → 0 and remark that this result stays valid when we take the union over ν in a sequence converging to 0. So finally, dimH
0 0 such that ρ(α, 2ε) ≤ ρ(α) + γ. On the other hand, as soon as |α − α′| < ε, $\rho_j([\alpha - 2\varepsilon, \alpha + 2\varepsilon]) \ge \rho_j([\alpha' - \varepsilon, \alpha' + \varepsilon])$, hence ρ(α, 2ε) ≥ ρ(α′, ε), so finally ρ(α′) ≤ ρ(α) + γ, so that ρ is upper semicontinuous. It is obvious that ρ(α) ≤ 1, and ρ(α) < 0 if α < γ is precisely the hypothesis made in Definition 4. We now prove the converse part. Let ω(j, k) :=
2j ρ(α)
sup √ , k+1/2 √ ] α∈[ k−1/2 j j
and ω(j, ˜ k) be defined by ω(j, ˜ k) :=
if ω(j, k) ≤ j 3 . else
0 ω(j, k)
Then let ρj be the measure supported on [j − 2 , j 2 ] and defined by 1
3
j 2−j ρj = 2 ω(j, k)δ √k + cj δj √j , j j 2
k=1
where $c_j$ is the positive constant such that the total mass of $\rho_j$ is 1. In this case also, one easily checks that the sequence $\rho_j$ has the properties announced.
Proof of Proposition 3.3. The upper semi-continuity of $\bar\lambda$ means that for all α, $\bar\lambda(\alpha) \ge \limsup_{\alpha' \to \alpha} \bar\lambda(\alpha')$. But for α′ > α, $\bar\lambda(\alpha') \ge \rho(\alpha)$, hence $\bar\lambda(\alpha) \ge \rho(\alpha)$. Since $\bar\lambda$ is increasing, this implies that $\bar\lambda(\alpha) \ge \sup_{\alpha' \le \alpha} \rho(\alpha')$.
For the other way, take ε > 0. By definition of ρ, for any α′ ≤ α there exist r(α′) > 0 and J(α′) such that ∀j ≥ J(α′),
\[ \rho_j([\alpha' - r(\alpha'), \alpha' + r(\alpha')]) \le 2^{j(\rho(\alpha')+\varepsilon)}. \tag{32} \]
From the covering of the compact set [0, α] by the open intervals (α′ − r(α′), α′ + r(α′)) we extract a finite sub-covering $B(\alpha_i, r(\alpha_i))$, 1 ≤ i ≤ n. Then, for j large enough,
\[ \rho_j([0, \alpha]) \le \sum_{i=1}^{n} 2^{j(\rho(\alpha_i)+\varepsilon)} \le n\, 2^{j(\sup_{\alpha' \le \alpha} \rho(\alpha') + \varepsilon)} \]
hence
\[ \bar\lambda(\alpha) \le \sup_{\alpha' \le \alpha} \rho(\alpha') + \varepsilon \tag{33} \]
and Proposition 3.3 follows. We now treat the converse part. If we took the same $\rho_j$ as in the proof of Proposition 3.2, then (8) would yield $\bar\lambda$ instead of λ. We need to take care of the discontinuities of λ. Let D(j) be defined as in the proof of Proposition 3.1, and ω(j, k) :=
supα∈[ k−1 √ , k+1 √ ] ρ(α) − 1 j
j
2j λ+ (α) − 2j λ− (α)
√ , √k ] = ∅, if D(j ) ∩ ( k−1 j j √ , √k ] if α ∈ D(j ) ∩ ( k−1 j j
.
The rest of the construction is the same as in the proof of Proposition 3.2, and one checks that the sequence $\rho_j$ has the properties announced.
Before proving Proposition 3.4, let us state a technical lemma.
Lemma 4.4. Let α < β be fixed. With probability 1, there are infinitely many wavelet coefficients satisfying
\[ 2^{-\beta j} \le |C_{j,k}| \le 2^{-\alpha j} \tag{34} \]
if and only if
\[ \sum_j 2^j \rho_j([\alpha, \beta]) = +\infty. \]
Proof. Suppose that $\sum_j 2^j \rho_j([\alpha, \beta]) < +\infty$. Then for j fixed, the probability that at least one wavelet coefficient satisfies (34) is $1 - (1 - \rho_j([\alpha, \beta]))^{2^j}$, which is of order $2^j \rho_j([\alpha, \beta])$ for j large enough. The $C_{j,k}$ being independent, by the Borel–Cantelli lemma, with probability 1, there can be only a finite number of wavelet coefficients satisfying (34). Conversely, suppose that the series diverges. If $\limsup_j 2^j \rho_j([\alpha, \beta]) > 0$ then the result is trivial; otherwise $2^j \rho_j([\alpha, \beta]) \to 0$ and we can use the same equivalence as above, to conclude again with the Borel–Cantelli lemma.
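The equivalence used in this proof, namely that $1 - (1-q)^{2^j}$ is of order $2^j q$ when $2^j q$ is small, can be checked numerically. In the sketch below q stands for the mass that $\rho_j$ gives to the interval; the function name and the sample values are assumptions of this sketch. The evaluation goes through `math.log1p` and `math.expm1` to avoid cancellation for very small q.

```python
import math

def prob_at_least_one(j, q):
    """Probability 1 - (1 - q)^(2^j) that at least one of 2^j independent
    coefficients falls in an interval of mass q, evaluated stably."""
    n = 2 ** j
    return -math.expm1(n * math.log1p(-q))

if __name__ == "__main__":
    for j, q in [(10, 1e-6), (20, 2.0 ** -30), (30, 1e-12)]:
        n = 2 ** j
        # Compare the exact probability with the first-order term 2^j * q.
        print(j, q, prob_at_least_one(j, q), n * q)
```

Whenever $2^j q \le 1$ the probability lies between $\frac{1}{2} 2^j q$ and $2^j q$, so the series of probabilities and the series $\sum_j 2^j q_j$ converge or diverge together once $2^j q_j \to 0$, which is what the Borel–Cantelli argument needs.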
We write $N_j(\alpha, \beta)$ the number of wavelet coefficients satisfying (34).
Proof of Proposition 3.4. Note that the set W is closed. $\mathbb{R} \setminus W$ being open, it is a countable union of open intervals $(\alpha_n, \beta_n)$. Let m ∈ N be such that $\frac{1}{m} < \frac{\beta_n - \alpha_n}{2}$ and let $\alpha \in [\alpha_n + \frac{1}{m}, \beta_n - \frac{1}{m}]$. Since α ∉ W, by Lemma 4.4, there exist $\epsilon(\alpha)$ and J(α) such that $j \ge J(\alpha) \Rightarrow N_j(\alpha - \epsilon(\alpha), \alpha + \epsilon(\alpha)) = 0$. From the covering of $[\alpha_n + \frac{1}{m}, \beta_n - \frac{1}{m}]$ with intervals $(\alpha - \epsilon(\alpha), \alpha + \epsilon(\alpha))$ we extract a finite sub-covering centered on $\alpha_1, \dots, \alpha_k$; we note $J(n, m) := \sup_{1 \le i \le k} J(\alpha_i)$. Thus for j ≥ J(n, m), $N_j(\alpha_n + \frac{1}{m}, \beta_n - \frac{1}{m}) = 0$ is an event of probability 1. Taking the intersection over n and m of all those events, we get that with probability 1, for all n, m, there exists J such that $N_j(\alpha_n + \frac{1}{m}, \beta_n - \frac{1}{m}) = 0$ for all j ≥ J; this is equivalent to saying that for all α ∉ W, the empirical density satisfies $\tilde\rho(\alpha) = -\infty$.
Let us now consider the α such that ρ(α) > 0. We denote by $G_j(a) = \rho_j((-\infty, a])$ the common distribution of the $2^j$ random variables $X_{j,k} := -\log_2(|C_{j,k}|)/j$. The standard way to study properties of the distribution of a large number of independent draws of a random variable is to reduce it to the case where this random variable is uniformly distributed on the interval [0, 1]; this is done by writing the random variables $X_{j,k}$ under the form $X_{j,k} = G_j^{-1}(\xi)$, where ξ is equidistributed on [0, 1]. Thus we now suppose that we have $n = 2^j$ independent draws $\xi^1, \dots, \xi^n$ of the random variable ξ. Let
\[ F_n(t) := \frac{1}{n} \#\{\xi^i : \xi^i \le t\}; \]
$F_n(t)$ is the empirical distribution function of the $(\xi^i)_{i=1,\dots,n}$. The uniform empirical process is defined by
\[ \alpha_n(t) := n^{1/2} (F_n(t) - t). \]
We will use independent copies of the empirical process for each $n = 2^j$. The increments of the empirical process can be estimated using the following result which is a particular case of Lemma 2.4 of Stute [30].
Lemma. There exist two positive constants $C_1$ and $C_2$ such that, if 0 < l < 1/8, nl ≥ 1 and $8 \le A \le C_1 \sqrt{nl}$,
\[ P\Big( \sup_{|t-s| \le l} |\alpha_n(t) - \alpha_n(s)| > A \sqrt{l} \Big) \le \frac{C_2}{l} e^{-\frac{A^2}{64}}. \]
For a fixed j and $n = 2^j$, we pick $l = \frac{(\log_2(n))^2}{n}$ and $A = C_1 \sqrt{nl} = C_1 \log_2(n)$. Therefore
\[ P\Big( \sup_{|t-s| \le \frac{(\log_2(n))^2}{n}} |\alpha_n(t) - \alpha_n(s)| > C_1 \frac{(\log_2(n))^2}{\sqrt{n}} \Big) \le \frac{C_2\, n}{(\log_2(n))^2} e^{-\frac{(C_1 \log_2(n))^2}{64}}. \]
But, if $|\alpha_n(t) - \alpha_n(s)| \le C_1 \frac{(\log_2(n))^2}{\sqrt{n}}$, the number N(t, s) of $\xi^k$ in the interval (s, t] satisfies
\[ \frac{1}{\sqrt{n}} |N(t, s) - n(t - s)| \le C_1 \frac{(\log_2(n))^2}{\sqrt{n}}, \]
so that $N(t, s) = n(t - s) + O(\log(n)^2)$, and there are between n(t − s)/2 and 2n(t − s) of the $\xi^k$ in any interval of length $\frac{(\log_2(n))^2}{n}$.
Coming back to the random variables $X_{j,k}$, it follows that, in any interval [a, b) satisfying $\rho_j([a, b)) \ge \frac{j^2}{2^j}$, there are between $2^{j-1} \rho_j([a, b))$ and $2^{j+1} \rho_j([a, b))$ of the $X_{j,k}$ with probability at least $1 - C_2\, 2^{j} j^{-2} \exp(-\frac{C_1^2 j^2}{64})$. Let $I_{j,k} := [k2^{-j}, (k+1)2^{-j})$, $k = 0, \dots, 2^j - 1$: with probability at least $1 - C_2\, 2^{2j} j^{-2} \exp(-\frac{C_1^2 j^2}{64})$, for any of the (at most $2^j$) intervals $I_{j,k}$ which satisfies $\rho_j(I_{j,k}) \ge \frac{j^2}{2^j}$,
\[ 2^{j-1} \rho_j(I_{j,k}) \le \#\{X_{j,k} \in I_{j,k}\} \le 2^{j+1} \rho_j(I_{j,k}). \]
Since the series $\sum_j 2^{2j} j^{-2} \exp(-\frac{C_1^2 j^2}{64})$ is convergent, by the Borel–Cantelli lemma, with probability 1 the above event happens for all j with only a finite number of exceptions. But if α ∈ [0, 1) and ρ(α) > 0, this α will be in infinitely many $I_{j,k}$ such that $\rho_j(I_{j,k}) \ge \frac{j^2}{2^j}$, hence $\tilde\rho(\alpha) = \rho(\alpha)$. The same proof is valid ∀n for α ∈ [n, n + 1), so finally, with probability 1, ρ(α) > 0 ⇒ $\tilde\rho(\alpha) = \rho(\alpha)$.
The last case we have to consider is when $\alpha \in W_0 := \{\alpha \in W,\ \rho(\alpha) = 0\}$. The previous reasoning provides only an upper bound: $\tilde\rho(\alpha) \le 0$. For $\epsilon > 0$, let $\{(\alpha_i - \epsilon, \alpha_i + \epsilon),\ i \in \mathbb{N}\}$ be a countable covering of $W_0$ such that for all i, $\alpha_i \in W_0$. Applying Lemma 4.4 to each of these intervals, we see that with probability 1, for all i, there are infinitely many wavelet coefficients such that $2^{-(\alpha_i + \epsilon) j} \le |C_{j,k}| \le 2^{-(\alpha_i - \epsilon) j}$. It follows that for all $\alpha \in W_0$, there are infinitely many wavelet coefficients such that $2^{-(\alpha + 2\epsilon) j} \le |C_{j,k}| \le 2^{-(\alpha - 2\epsilon) j}$, hence $\tilde\rho(\alpha) \ge 0$.
4.4. Almost everywhere regularity.
Proposition 4.5. Let f be a random wavelet series. The following events hold with probability one:
• For every x ∈ T,
\[ h(x) \le h_{\max}. \tag{35} \]
• For almost every x ∈ T, for all α ≥ 0,
\[ \gamma_x(\alpha) = \lambda(\alpha). \tag{36} \]
Proof. By definition of $h_{\max}$, for all ε > 0, there exists an $\alpha_1$ such that $\frac{\alpha_1}{\rho(\alpha_1)} - h_{\max} \le \varepsilon$. There exists also a sequence $j_n \to +\infty$ such that, with probability at least $1 - j_n^{-2}$, there are at least $2^{(\rho(\alpha_1) - \varepsilon) j_n}$ coefficients $C_{j_n,k}$ satisfying
\[ 2^{-(\alpha_1 + \varepsilon) j_n} \le |C_{j_n,k}| \le 2^{-(\alpha_1 - \varepsilon) j_n}. \tag{37} \]
If we condition by this event, for all $j_n \ge J$ (and J chosen large enough), the locations k of the coefficients satisfying (37) are picked at random among the $2^{j_n}$ possible locations. We can apply Lemma 1 of [14], which is a consequence of classical results (see [28]) concerning random coverings of the circle and which implies that, with probability 1, every x belongs to
\[ \limsup_{n \to \infty} \bigcup_k \Big[ \frac{k}{2^{j_n}} - 2^{-(\rho(\alpha_1) - 2\varepsilon) j_n},\ \frac{k}{2^{j_n}} + 2^{-(\rho(\alpha_1) - 2\varepsilon) j_n} \Big], \]
where the union is taken on the k verifying (37). It follows that, with probability one, for any ε > 0 and any x ∈ T,
\[ h_f(x) \le \frac{\alpha_1 + \varepsilon}{\rho(\alpha_1) - 2\varepsilon}, \]
therefore (35) holds.
Let us now prove the second part of Proposition 4.5. In the following, |E| denotes the Lebesgue measure of a measurable set E. Let α be a given positive number. Clearly, if λ(α) = −∞, then for all x ∈ T, γx(α) = −∞. If λ(α) = 0, then for almost every x, γx(α) = 0. If λ(α) > 0, let $\varepsilon \in \mathbb{Q}$, ε > 0. By the Borel–Cantelli lemma, |E(α, λ(α) + ε)| = 0, and applying Lemma 1 of [14] as above, almost surely |E(α, λ(α) − ε)| = 1. Therefore, if
\[ E(\alpha) := \bigcap_{\varepsilon \in \mathbb{Q},\, \varepsilon > 0} E(\alpha, \lambda(\alpha) - \varepsilon) \setminus E(\alpha, \lambda(\alpha) + \varepsilon), \]
almost surely, |E(α)| = 1. We can take α in a countable set $\mathcal{Q}$, which is defined as the set of α such that λ(α) > 0 and α is rational or λ has a discontinuity at α. Then, if
\[ E := \bigcap_{\alpha \in \mathcal{Q}} E(\alpha), \]
almost surely, |E| = 1, and by construction, ∀x ∈ E, ∀α ∈ $\mathcal{Q}$, γx(α) = λ(α). When α ∉ $\mathcal{Q}$, λ is continuous at α and by density of $\mathcal{Q}$ we get the same result.
Combined with Proposition 4.3, this result yields the almost everywhere Hölder and oscillating singularity exponents of f. We thus obtained one point of the spectra of f, namely
\[ d\Big( \frac{\tilde\alpha}{\bar\lambda(\tilde\alpha)} \Big) = d\Big( \frac{\tilde\alpha}{\bar\lambda(\tilde\alpha)}, \frac{1}{\bar\lambda(\tilde\alpha)} - 1 \Big) = 1. \]
For the other values of the spectra of f, we want to calculate the dimension of the sets H(α, d), defined by (22), which form a partition of T. These sets are random, because they depend on the repartition of the wavelet coefficients.
5. Random Fractals
5.1. Upper limit of random segments. Though the sets E(α, d), defined by (19), are not those that appear in the definition of the spectra, as an intermediate step, we need to determine their dimension and to show that they fall in the category of the “sets with large intersection”.
Proposition 5.1. For any α > 0, d ≥ λ(α), $\dim_H(E(\alpha, d)) \le \frac{\lambda(\alpha)}{d}$.
Proof. Let $s > \frac{\lambda(\alpha)}{d}$. Then
\[ \sum_{j \in \mathbb{N}} \sum_{k \in F^j(\alpha)} 2^{-sdj} < \infty, \]
and since $\bigcup_{j > j_0} E^j(\alpha, d)$ is a covering of E(α, d) for any $j_0 \in \mathbb{N}$, this implies that $\mathcal{H}^s(E(\alpha, d)) = 0$, and $\dim_H(E(\alpha, d)) \le s$.
Note that this bound depends only on the histogram of the wavelet coefficients, and not on the random process considered here. The lower bound will be specific to the model we consider, and uses the following result (Theorem 2 of [14]).
Proposition 5.2. Let $S_n^t := (x_n - s_n^t, x_n + s_n^t)$ for a sequence of $x_n \in \mathbb{T}$ and $s_n > 0$. We note
\[ B^t = \limsup_{n \to \infty} S_n^t. \]
If $|B^1| = 1$, then for all t ≥ 1, $\dim_H(B^t) \ge 1/t$.
Corollary 5.3. Almost surely, for all α > 0, d ≥ λ(α),
\[ \dim_H(E(\alpha, d)) \ge \frac{\lambda(\alpha)}{d}. \tag{38} \]
Proof. Thanks to Proposition 5.2, we only have to prove that almost surely, for all α, |E(α, λ(α))| = 1. By the Borel–Cantelli lemma, for each α this is true almost surely, following the fact that $\limsup_{j \to \infty} |E^j(\alpha, \lambda(\alpha))| = 1$, which implies that the sum of the lengths of the segments whose lim sup constitutes E(α, λ(α)) diverges. As in the proof of Proposition 4.5, we take α in the countable set $\mathcal{Q}$ which is the set of α such that λ(α) > 0 and α is rational or λ has a discontinuity at α. So (38) is true almost surely for all α ∈ $\mathcal{Q}$ and any d. If α ∉ $\mathcal{Q}$, λ is continuous at α; since E(α, d) is increasing in α and decreasing in d, (38) still holds by density of $\mathcal{Q}$.
5.2. Sets with large intersection. Actually, we have something more. We recall that the class of sets with large intersection $\mathcal{G}^s(\mathbb{T})$, defined in [8], is the class of $G_\delta$-sets (countable intersections of open sets) E ⊂ T such that $\dim_H \bigcap_{i \in \mathbb{N}} f_i(E) \ge s$ for all sequences of similarity transformations $(f_i)_{i \in \mathbb{N}}$. It is also the maximal class of $G_\delta$-sets of Hausdorff dimension at least s that is closed under countable intersections and similarities (the original setting was in R; we make the obvious modifications for working on T). This property is somewhat counter-intuitive, because one usually expects for “generic” sets E and F that $\dim_H(E \cap F) = \dim_H(E) + \dim_H(F) - 1$ (1 corresponds here to the dimension of T) (for precisions on what is meant by “generic”, and a sufficient condition for this relation to hold, see [24, 31]). On the contrary, if E and F belong to $\mathcal{G}^s(\mathbb{T})$, their intersection still has dimension s. With this notion, and without any extra hypothesis, the conclusion of Proposition 5.2 can be improved, as follows.
Proposition 5.4. Under the hypotheses of Proposition 5.2, $B^t \in \mathcal{G}^{1/t}(\mathbb{T})$.
Proof. According to Theorem B of [8], it suffices to prove that for any sequence of similarity transformations $(f_k)$, we have
\[ \dim_H \Big( \bigcap_{k \ge 1} f_k(B^t) \Big) \ge \frac{1}{t}. \tag{39} \]
The original proof of Proposition 5.2, which relies on the construction of a generalized Cantor set included in $B^t$, may be adapted to that purpose. We will prove the following lemma later.
Lemma 5.5. If $|\limsup_n S_n| = 1$, then for any interval I ⊂ T, for any c < 1, there exists a finite set of indices E ⊂ N such that for all n, n′ ∈ E, n ≠ n′, $S_n \cap S_{n'} = \emptyset$, $S_n \subset I$, and $\sum_{n \in E} |S_n| \ge c|I|$.
We apply this lemma to T with $c = \frac{1}{2}$, obtaining a set of indices $E_0$. The segments $S_n^t$, $n \in E_0$, form the first stage of our generalized Cantor set.
For each $n_0 \in E_0$, we apply Lemma 5.5 to $S_{n_0}^t$ with $c = \frac{1}{\sqrt{2}}$, obtaining a new set of indices $E_1(n_0)$. But notice that the segments $f_1(S_n)$, n ∈ N, can also be used in Lemma 5.5, because if $f_1$ is a similarity, $|\limsup_n f_1(S_n)| = 1$. We do this on each $S_n$, $n \in E_1(n_0)$, again with $c = \frac{1}{\sqrt{2}}$. At this point, we obtained a family of disjoint segments $S_n \cap f_1(S_{n'})$ filling (in Lebesgue measure) at least half of each of the $S_{n_0}^t$, $n_0 \in E_0$. The $(S_n \cap f_1(S_{n'}))^t$ form the second stage of the Cantor set. The construction of the next stages follows the same principle. At each step, we introduce a new similarity, so that at step k + 1 we obtain a family of disjoint segments $S_n \cap f_1(S_{n'}) \cap \dots \cap f_k(S_{n^{(k)}})$ filling at least half of the segments from the previous stage. The $(S_n \cap f_1(S_{n'}) \cap \dots \cap f_k(S_{n^{(k)}}))^t$ form the stage number k + 1. In the end, K is the intersection of all the stages. It is straightforward to check that $K \subset \bigcap_{k \ge 1} f_k(B^t)$, and, exactly as in [14], that the natural measure $\mu_K$ supported by K has the scaling property $\mu_K(I) \le C |I|^{1/t} \log(|I|)^2$ for any interval I ⊂ T, which proves that $\dim_H(K) \ge \frac{1}{t}$.
Proof of Lemma 5.5. We can suppose that the $|S_n|$ are taken in decreasing order. Let $n_1$ be the first index such that $S_{n_1} \subset I$, $n_2$ the first index such that $S_{n_2} \subset I \setminus S_{n_1}$, and so on. Suppose that
1. $x \in I \cap \limsup S_n$;
2. x is never covered by any of the $3S_{n_j}$, j ∈ N.
Then 1 implies that there exists m such that $x \in S_m$, and by 2 we know that around x there is enough space not covered by any of the $S_{n_j}$, $n_j \le m$, to fit $S_m$ (because their sizes are decreasing). So m would be eventually selected as one of the $n_j$, which is clearly in contradiction with 2.
As a consequence, $\big| \bigcup_j 3S_{n_j} \big| \to |I|$, so there exists a finite set of indices E(I) such that $\big| \bigcup_{n \in E(I)} 3S_n \big| \ge \frac{1}{2}|I|$, hence $\sum_{n \in E(I)} |S_n| \ge \frac{1}{6}|I|$. But we can apply the same argument to each of the segments composing $I \setminus \bigcup_{n \in E(I)} S_n$, obtaining a new family of disjoint segments. The total covering is now larger than $\big(\frac{1}{6} + \frac{1}{6}(1 - \frac{1}{6})\big)|I|$. This construction yields, after n steps, a disjoint covering larger than $u_n |I|$, where $u_0 = 0$ and $u_{n+1} = u_n + \frac{1}{6}(1 - u_n)$. Since $u_n \to 1$ when n → ∞, Lemma 5.5 is proved.
Corollary 5.6. Almost surely, for all α > 0, d ≥ λ(α), $E(\alpha, d) \in \mathcal{G}^{\lambda(\alpha)/d}(\mathbb{T})$.
The proof is similar to the proof of Corollary 5.3. In particular, Corollary 5.6 shows that the dimension of the difference of two such sets (which is needed below) cannot be deduced from the dimension of their intersection, because the latter is as big as the dimension of the smallest set. Nevertheless, Proposition 5.7. Let E ⊂ T be a fixed set and E = lim sup Si where the Si are open segments with centers uniformly and independently drawn on T, and i |Si | < ∞. Then almost surely dimH (E\E ) = dimH (E). Proof. If dimH (E) = 0, E has at least one point, which is almost surely not in E (the latter being of measure zero, by the Borel–Cantelli lemma). If dimH (E) > 0, for all 0 < s < dimH (E), Hs (E) = +∞, thus according to the Frostman lemma, there exists a non-zero measure µ with support in E, such that for all x ∈ T for all r > 0, µ(B(x, r)) ≤ r s . It is well known (see for instance [7]) that the existence of such a measure implies conversely that dimH (E) ≥ s. If this measure was with support in E\E , the lemma would thus be proved. Unfortunately, since F := T\E is not closed in T, the support of µF (the measure restricted to F ) is not necessarily included in E ∩ F , so we cannot use this measure directly. " However Fk := i≥k T\Si is compact, so µFk has its support in E ∩ Fk . We just need to prove that µ(Fk ) > 0. For i ≥ K, K large enough,
E µ(T\Si ) = µ(E)(1 − |Si |), n n # E µ = µ(E) T\Si (1 − |Si |), i=K
E(µ(FK )) = µ(E)
i=K ∞ # i=K
(1 − |Si |) > 0,
510
J.-M. Aubry, S. Jaffard
hence with non-zero probability, µ(FK ) > 0. The measure µFK allows thus to prove that dimH (E\E ) = dimH (E ∩ F ) ≥ s with a non-zero probability. But dimH (E\E ) is a tail random variable (it doesn’t depend on any finite number of the Si ), obeying Kolmogorov zero-one law, so almost surely dimH (E\E ) ≥ s. To conclude, we take s in a sequence si $ dimH (E). Remark. Here the set E is fixed and E is random. It follows that the same result holds with E random if its building process is independent from E . Proposition 5.7 can naturally be extended to the case where a finite number of Ei , i ∈ E, is subtracted from E. This is not true in general if E is countable; but in a more precise case, we can combine the ideas of Propositions 5.4 and 5.7. Proposition 5.8. Let B t be fixed as in Proposition 5.2, and E a random set as in Propo1 sition 5.7. Then almost surely B t \E ∈ G t (T). 1
Proof. Theorem B in [8] gives another characterization for B t ∈ G t (T): it is that for 1/t any dyadic segment I of size 2−j , we have H∞ (I ∩ B t ) = 2−j/t . Now we want to 1/t calculate H∞ (I ∩ B t \E ). With the notations of Proof 5.7, for all k ∈ N, ∞
# 1/t E H∞ (I ∩ B t ∩ Fk ) = 2−j/t (1 − |Si |), i=k
hence E(H∞ (I ∩ B t \E )) = 2−j/t ; since it is a tail variable, this means that with 1/t probability one, H∞ (I ∩ B t \E ) = 2−j/t . There are only countably many dyadic segments, so almost surely, this is true for all dyadic I . 1/t
This time, because the class of sets with large intersection $G^{1/t}(\mathbb{T})$ is stable under countable intersection, $B^t \setminus \bigcup_{i \in \mathbb{N}} E'_i$ is still in $G^{1/t}(\mathbb{T})$. We shall need this in the proof of Proposition 5.9 below.

5.3. Lower bound for the dimension of $H(\alpha, d)$.

Proposition 5.9. Almost surely, for all $\alpha > 0$, $d \ge \rho(\alpha)$,
\[ \dim_H(H(\alpha, d)) \ge \frac{\rho(\alpha)}{d}. \]
Before entering the proof, let us make a few remarks. As usual when we want to lower bound the dimension of a set with a rather non-geometric definition, we look for a geometrically simpler set that is small enough to fit in H (α, d), but large enough to obtain a lower bound for the Hausdorff dimension. One of the difficulties here is that we want an almost sure property for all α and d. To ensure this, we will make this “almost sure”, that is, the exceptional event of probability zero, depend only on parameters belonging to a countable set. The large intersection property, which forced us to make the detour by Propositions 5.7 and 5.8, actually saves us there, and here is why. We want to prove that a certain property (a lower bound for the dimension) is almost surely true for a whole family of sets depending on some real parameters.
Random Wavelet Series
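We will express these sets as a countable intersection of some other sets that depend only on rational parameters, and we prove for each of these sets, almost surely, the slightly stronger property that it belongs to a $G^s(\mathbb{T})$. This property is then almost sure for all the rational parameters, and we know that it is stable by countable intersection: it will thus be true, almost surely, for the real parameters.

Proof of Proposition 5.9. Let $\varepsilon_1, \varepsilon_2 > 0$ and
\[ E^j_{\varepsilon_1,\varepsilon_2}(\alpha, d) := E^j(\alpha + \varepsilon_2, d) \setminus E^j(\alpha - \varepsilon_1, d), \qquad E_{\varepsilon_1,\varepsilon_2}(\alpha, d) := \limsup_{j \to \infty} E^j_{\varepsilon_1,\varepsilon_2}(\alpha, d) \]
(this is just a complicated version of $E_\varepsilon(\alpha, d)$, but now we can take $\varepsilon_1$ and $\varepsilon_2$ independently). Using Proposition 5.4, we see that [almost surely, for any $\alpha > 0$, $d > \rho(\alpha)$, $\varepsilon_1 \in \mathbb{Q} + \alpha$ and $\varepsilon_2 \in \mathbb{Q} - \alpha$], we have $E_{\varepsilon_1,\varepsilon_2}(\alpha, d) \in G^s(\mathbb{T})$, where $s \ge \rho(\alpha)/d$ is the dimension of $E_{\varepsilon_1,\varepsilon_2}(\alpha, d)$. Let $d_0 \in (\rho(\alpha), d) \cap \mathbb{Q}$ and
\[ G_{\varepsilon_1,\varepsilon_2}(\alpha, d) := E_{\varepsilon_1,\varepsilon_2}(\alpha, d) \setminus \bigcup_{d' > d} E_{\varepsilon_1,\varepsilon_2}(\alpha, d'), \tag{40} \]
\[ G(\alpha, d) := \bigcap_{\varepsilon_1,\varepsilon_2 > 0} G_{\varepsilon_1,\varepsilon_2}(\alpha, d), \tag{41} \]
\[ \hat{G}(\alpha, d) := G(\alpha, d) \setminus \bigcup_{\alpha' < \alpha} E(\alpha', d_0), \tag{42} \]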
α ρ(α), ε1 ∈ Q + α and ε2 ∈ Q − α] in G s (T) as well. By stability of G s (T) under countable intersection, this implies in (40) that [almost surely, for any α > 0, d > ρ(α), ε1 ∈ Q + α and ε2 ∈ Q − α], Gε1 ,ε2 (α, d) ∈ G s (T) ρ(α)
and then in (41), that [almost surely, for any α > 0, d > ρ(α)], G(α, d) ∈ G d (T). Now if we take a close look at (42) and (43), each of the sets that are subtracted is independent from G(α, d), so by Proposition 5.8, [almost surely, for any α > 0, ρ(α) ˜ d > ρ(α)], G(α, d) is still in G d (T), and in particular its dimension is larger than ρ(α) d . ˜ ˜ To conclude, we only have to show that G(α, d) ⊂ H (α, d). Let x ∈ G(α, d) and ε > 0. 1. For ε1 , ε2 < ε, E(α + ε, d − ε) contains Eε1 ,ε2 (α, d), hence x ∈ E(α + ε, d − ε). 2. When ε and ε are small enough, x cannot belong to E(α − ε, d − ε αd + ε ) because in (42) we subtracted all E(α − ε, d0 ). 3. Suppose x ∈ E(α + ε, d + ε αd ). Then either (cases are non exclusive)
(a) there exist α < α and infinitely many (j, k) such that |k2−j − x| < 2−(d+ε α ) j and 2 2 |Cj k | ≥ 2−(α )j ; (b) there exist α ∈ (α, α + ε) and infinitely many (j, k) such that |k2−j − x| < j d 2−(d+ε α ) and 2−(α+ε)j ≤ 2 2 |Cj k | ≤ 2−(α )j ; (c) for all α < α < α , there exist infinitely many (j, k) such that |k2−j − xl < j d 2−(d+ε α ) and 2−(α )j ≤ 2 2 |Cj k | ≤ 2−(α )j . But 3(a) is ruled out by (42), 3(b) is ruled out by (43), and 3(c) by (40) and (41). So finally x ∈ E(α + ε, d + ε αd ). These three properties 1, 2, and 3 are sufficient (and indeed necessary) for x ∈ H (α, d): the proof is now complete. Proof of Theorem 2. First, (13) and (14) were already seen as a consequence of Propositions 4.1 and 4.5. Together with the upper bound given by Theorem 1, Proposition 5.9 proves directly (15). To prove (16), it suffices to remark that for all β, d(h) ≥ d(h, β), hence d(h) ≥ supβ d(h, β). So, if the upper bound in Theorem 1 is attained for d(h, β), the same is true for d(h).
A. Weak Hölder Uniform Regularity

As we mentioned in Sect. 2.3, the uniform Hölder regularity condition (3) is not the weakest hypothesis that ensures local boundedness, and thus that the Hölder and oscillating singularity exponents can be recovered from the modulus of the wavelet coefficients. In Propositions 2.1, 4.1, 4.2, 4.3, as well as for (4) in Theorem 1, it can be replaced by the following condition [16]:

Definition 5. We say that $f$ is a weak uniform Hölder function if for all $n > 0$, there exists $C_n$ such that for all $j > 0$,
\[ |C_{j,k}| \le \frac{C_n}{j^n}. \tag{44} \]
This is equivalent to the following uniform Hölder condition: for all $n \in \mathbb{N}$, there exists $C(n)$ such that for all $x, y$,
\[ |f(x) - f(y)| \le \frac{C(n)}{(1 + |\log(|x - y|)|)^n}. \]
If the R.W.S. hypothesis (Definition 4) is replaced by

(H) There exists $n_j \to \infty$ such that $\rho_j$ is supported in $\left[ \frac{n_j \log(j)}{j}, +\infty \right]$,

then almost every sample path of the random wavelet series is weak uniform Hölder. Also note that Propositions 3.1, 3.2 and 3.3 still hold if Definition 4 is replaced by (H) (the proofs were designed for this case). Unfortunately, we cannot make weak uniform Hölder regularity our basic hypothesis for this paper. Even for the deterministic upper bound in Theorem 1, (strong) uniform Hölder regularity is required for (5). The following example shows what can happen if $f$ is only weak uniform Hölder.
Example 1. At each scale j , take 2log(j ) equidistributed (up to round-off errors) wavelet 2 coefficients of size 2− log(j ) (the other coefficients being set equal to 0). Then ρ(α) = −∞ if α = 0, and ρ(0) = 0. But it is not difficult to see that f has everywhere a Hölder exponent equal to 0, so d(0) = 1. 3
However, (4) holds even if f is not uniform Hölder. Example 1 thus shows a situation where d(0) = 1, but for all 0 ≤ β < ∞, d(0, β) ≤ 0 (and d(0, +∞) = 1). Here is its random counterpart, showing that (strong) uniform Hölder regularity is also needed for Theorem 2. Example 2. Take
ρj := 2−j log(j )3 δ log(j )2 + 1 − 2−j log(j )3 δj log(j ) .
(45)
j
With probability one, the spectrum of singularities of f is reduced to d(0) = 1, whereas d(0, β) = 0 for any β ≥ 0 (and d(0, +∞) = 1). In this case, naturally, hmin = hmax = 0 so (16) doesn’t make sense. But the problem is more serious than that: if we add a uniform Hölder random wavelet series to (45), then hmax > 0 but still for almost every x, hf (x) = 0 (the big wavelet coefficients in (45) “mask” everything else), so (13) also fails in this case. However, uniform Hölder regularity may not be the weakest condition necessary for Theorems 1 and 2; this question is still open. References 1. Arneodo, A., Bacry, E., Jaffard, S., Muzy, J.-F.: Singularity spectrum of multifractal functions involving oscillating singularities. J. Fourier Anal. Appl. 4 (2), 159–174 (1998) 2. Arneodo, A., Bacry, Muzy, J.-F.: Random cascades on wavelet dyadic trees. J. Math. Phys. 39 (8), 4142– 4164 (1998) 3. Aubry, J.-M.: Representation of the singularities of a function. Appl. Comput. Harmonic Anal. 6, 282–286 (1999) 4. Aubry, J.-M., Jaffard, S.: Random wavelet series: Theory and applications. To be presented at Fractal 2002, Granada 5. Chillà, F., Peinke, J., Castaing, B.: Multiplicative process in turbulent velocity statistics: A simplified analysis. J. Phys. II France 6 (4), 455–460 (1996) 6. Daubechies, I.: Ten Lectures on Wavelets, Volume 61 of CBMS-NSF regional conference series in applied mathematics. SIAM, 1992 7. Falconer, K.J.: Fractal Geometry: Mathematical Foundations and Applications. Chichester: John Wiley & Sons, 1990 8. Falconer, K.J.: Sets with large intersection properties. J. Lond. Math. Soc. (2) 49, 267–280 (1994) 9. Frisch, U.: Fully developed turbulence and intermittency. In M. Ghil, ed., Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics, Volume 88, International School of Physics Enrico Fermi, Amsterdam: North-Holland, June 1983, pp. 71–88 10. Frisch, U.: Turbulence : The legacy of A.N. Kolmogorov. 
Cambridge: Cambridge University Press, 1995 11. Jaffard, S.: Multifractal formalism for functions Part I: Results valid for all functions. SIAM J. Math. Anal. 28(4), 944–970 (1997) 12. Jaffard, S.: Multifractal formalism for functions Part II: Selfsimilar functions. SIAM J. Math. Anal. 28 (4), 971–998 (1997) 13. Jaffard, S.: Oscillation spaces: Properties and applications to fractal and multifracal functions. J. Math. Phys. 39 (8), 4129–4141 (1998) 14. Jaffard, S.: Lacunary wavelet series. Ann. Appl. Probab. 10 (1), 313–329 (2000) 15. Jaffard, S.: On the Frish–Parisi conjecture. J. Math. Pures Appl. 79 (6), 525–552 (2000) 16. Jaffard, S., Meyer, Y.: Wavelet methods for pointwise regularity and local oscillations of functions. Mem. Amer. Math. Soc. 123, 587 (Sept. 1996)
17. Jaffard, S., Meyer, Y.: On the pointwise regularity of functions in critical Besov spaces. J. Funct. Anal. 175, 415–434 (2000) 18. Kahane, J.-P.: Some random series of functions. Cambridge: Cambridge University Press, 1985 19. Kolmogorov, A. N.: C. R. Acad. Sci. URSS 30, 301–305 (1941) 20. Kolmogorov, A. N.: J. Fluid Mech. 13, 82–85 (1962) 21. Körner, T.: Kahane’s Helson curve. J. Fourier Anal. Appl., pp. 325–346 (1995). Special Issue: Proceedings of the conference in honor of J.-P. Kahane 22. Lemarié, P.-G., Meyer, Y.: Ondelettes et bases hilbertiennes. Rev. Mat. Iber. 2 (1/2), 1–18 (1987) 23. Mandelbrot, B. B.: Intermittent turbulence in self similar cascades: Divergence of high moments and dimension of the carrier. J. Fluid Mech. 62, 331–358 (1974) 24. Mattila, P.: Geometry of Sets and Measures in Euclidean Spaces, Volume 44 of Cambridge studies in advanced mathematics. Cambridge: Cambridge University Press, 1995 25. Meneveau, C., Sreenivasan, K.: Measurement of f (α) from scaling of histograms and applications to dynamical systems and fully developed turbulence. Phys. Letters A 137, 103–112 (1989) 26. Meyer, Y.: Ondelettes et Opérateurs I : Ondelettes. Actualités Mathématiques. Paris: Hermann, 1990 27. Müller, P., Vidakovic, B. (editors): Bayesian Inference in Wavelet Based Models, Volume 141 of Lect. Notes Stat., Berlin–Heidelberg–New York: Springer-Verlag, 1999 28. Shepp, L.A.: Covering the circle with random arcs. Israel J. Math. 11, 328–345 (1972) 29. Simoncelli, E.: Bayesian denoising of visual images in the wavelet domain. Lect. Notes Stat. 141, 291–308 (1999) 30. Stute, W.: The oscillation behavior of empirical processes. Ann. Probab. 10, 86–107 (1982) 31. Tricot, C.: Two definitions of fractional dimension. Math. Proc. Cambridge Philos. Soc. 91, 57–74 (1982) Communicated by J. L. Lebowitz
Commun. Math. Phys. 227, 515 – 539 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Phase-Averaged Transport for Quasi-Periodic Hamiltonians

Jean Bellissard1,2, Italo Guarneri3,4,5, Hermann Schulz-Baldes6

1 Université Paul-Sabatier, 118 route de Narbonne, 31062 Toulouse, France
2 Institut Universitaire de France
3 Università dell'Insubria a Como, via Valleggio 11, 22100 Como, Italy
4 Istituto Nazionale per la Fisica della Materia, via Celoria 16, 20133 Milano, Italy
5 Istituto Nazionale di Fisica Nucleare, Sezione di Pavia, via Bassi 6, 27100 Pavia, Italy
6 University of California at Irvine, CA 92697, USA
Received: 30 May 2001 / Accepted: 2 January 2002
Abstract: For a class of discrete quasi-periodic Schrödinger operators defined by covariant representations of the rotation algebra, a lower bound on phase-averaged transport in terms of the multifractal dimensions of the density of states is proven. This result is established under a Diophantine condition on the incommensuration parameter. The relevant class of operators is distinguished by invariance with respect to symmetry automorphisms of the rotation algebra. It includes the critical Harper (almost-Mathieu) operator. As a by-product, a new solution of the frame problem associated with Weyl–Heisenberg–Gabor lattices of coherent states is given.

1. Introduction

This work is devoted to proving a lower bound on the diffusion exponents of a class of quasiperiodic Hamiltonians in terms of the multifractal dimensions of their density of states (DOS). The class of models involved describes the motion of a charged particle in a perfect two-dimensional crystal with 3-fold, 4-fold or 6-fold symmetry, submitted to a uniform irrational magnetic field. Irrationality means that the magnetic flux through each lattice cell is equal to an irrational number $\theta$ in units of the flux quantum. As shown by Harper [Har] in the specific case of a square lattice with nearest neighbor hopping, the Landau gauge allows one to reduce such models to a family of Hamiltonians each describing the motion of a particle on a 1D chain with quasiperiodic potential. The latter representation gives a strongly continuous family $H = (H_\omega)_{\omega \in \mathbb{T}}$ of selfadjoint bounded operators on the Hilbert space $\ell^2(\mathbb{Z})$ of the chain indexed by a phase $\omega \in \mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$. This family satisfies the covariance relation $T H_\omega T^{-1} = H_{\omega + 2\pi\theta}$ (here $T$ represents the operator of translation by one site along the chain). The phase-averaged diffusion exponents $\beta(q)$, $q > 0$, of $H$ are defined by:
\[ \int_{\mathbb{T}} d\omega \int_{-T}^{T} \frac{dt}{2T}\, \langle \phi | e^{\imath H_\omega t} |\hat{X}|^q e^{-\imath H_\omega t} | \phi \rangle \underset{T \uparrow \infty}{\sim} T^{q\beta(q)}, \]
where $\hat{X}$ denotes the position operator on the chain. The DOS of the family $H$ is the Borel measure $N$ defined by phase-averaging the spectral measure with respect to any site. Its generalized multifractal dimensions $D_N(q)$ for $q \ne 1$ are formally defined by
\[ \int_{\mathbb{R}} dN(E) \left( \int_{E-\varepsilon}^{E+\varepsilon} dN(E') \right)^{q-1} \underset{\varepsilon \downarrow 0}{\sim} \varepsilon^{(q-1) D_N(q)}. \]
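To make the formal definition of $D_N(q)$ concrete, here is a minimal sketch that evaluates a box-partition analogue of the scaling relation on a toy measure: the uniform measure on the level-$n$ approximation of the middle-third Cantor set. Everything in it is an assumption made for illustration. The Cantor measure is not the DOS of any operator discussed here, the fixed triadic boxes replace the centred windows $[E-\varepsilon, E+\varepsilon]$, and the function names are hypothetical.

```python
import math
from collections import Counter


def cantor_atoms(n):
    """Integer numerators m of the 2**n left endpoints m / 3**n of the level-n Cantor intervals."""
    atoms = [0]
    for _ in range(n):
        atoms = [3 * m + d for m in atoms for d in (0, 2)]
    return atoms


def generalized_dimension(n, k, q):
    """Box estimate of D(q) at scale eps = 3**(-k), for q != 1 and k <= n.

    Uses sum_i p_i**q ~ eps**((q - 1) * D(q)), where p_i is the mass of the
    i-th triadic box of size eps.
    """
    atoms = cantor_atoms(n)
    weight = 1.0 / len(atoms)          # each atom carries the same mass
    scale = 3 ** (n - k)               # integer width of a box of size 3**(-k)
    boxes = Counter(m // scale for m in atoms)
    partition_sum = sum((count * weight) ** q for count in boxes.values())
    eps = 3.0 ** (-k)
    return math.log(partition_sum) / ((q - 1) * math.log(eps))


if __name__ == "__main__":
    for q in (0.5, 2.0, 3.0):
        print(q, generalized_dimension(10, 5, q))
```

For this self-similar toy measure every $q \ne 1$ gives the same value $\log 2 / \log 3$; a genuinely multifractal measure would show a non-trivial dependence on $q$.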
A somewhat imprecise statement of the main result of this work is: whenever $\theta/2\pi$ is a Roth number [Her] (namely, for any $\varepsilon > 0$, there is $c > 0$ such that $|\theta/2\pi - p/q| \ge c/q^{2+\varepsilon}$ for all $p/q \in \mathbb{Q}$), and for the class of models mentioned above, the following inequality holds for all $0 < q < 1$:
\[ \beta(q) \ge D_N(1 - q). \tag{1} \]
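The Diophantine condition quoted above can be probed numerically for a given number. The sketch below scans denominators up to a cutoff and reports the smallest value of $q^{2+\varepsilon}\,|\alpha - p/q|$ found. It is only an illustration under stated assumptions: the choice $\alpha = \sqrt{2}$, the value $\varepsilon = 0.1$, the cutoff and the function name are hypothetical, and a finite scan in floating point cannot establish the Roth property.

```python
import math


def diophantine_constant(alpha, epsilon, q_max):
    """Smallest value of q**(2 + epsilon) * |alpha - p/q| over 1 <= q <= q_max.

    For each q only the nearest integer p to alpha * q is tested, since it
    minimises |alpha - p/q| for that denominator.
    """
    best = math.inf
    for q in range(1, q_max + 1):
        p = round(alpha * q)
        best = min(best, q ** (2.0 + epsilon) * abs(alpha - p / q))
    return best


if __name__ == "__main__":
    print(diophantine_constant(math.sqrt(2.0), 0.1, 2000))
```

A value that stays bounded away from zero as the cutoff grows is consistent with the lower bound $|\alpha - p/q| \ge c/q^{2+\varepsilon}$; for a Liouville number the same scan would be expected to drift towards zero.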
This result can be reformulated in terms of two-dimensional magnetic operators on the lattice and then gives an improvement of the general Guarneri–Combes–Last lower bound [Gua, Com, Las] by a factor 2. More precise definitions and statements will be given in Sect. 2. The inequality (1) has been motivated by work by Piéchon [Pie], who gave heuristic arguments and numerical support for β(q) = DN (1 − q) for q > 0, valid for the Harper model and the Fibonacci chain (for the latter case, a perturbative argument was also given). It was theoretically and numerically demonstrated by Mantica [Man] that the same exact relation between spectral and transport exponents is also valid for the Jacobi matrices associated with a Julia set. This result was rigorously proven in [GSB1, BSB]. For the latter operators, the DOS and the local density of states (LDOS) coincide. Numerous works [Gua, Com, Las, GSB2, GSB3, BGT] yield lower bounds on the quantum diffusion of a given wave packet in terms of the fractal properties of the corresponding LDOS. These rigorous lower bounds are typically not optimal as shown by numerical simulations [GM, KKKG]. Better lower bounds are obtained if the behaviour of generalized eigenfunctions is taken into account [KKKG]. Kiselev and Last have proven general rigorous bounds in terms of upper bounds for the algebraic decay of the eigenfunctions [KL]. However, in most models used in solid state physics, the Hamiltonian is a covariant strongly continuous family of self-adjoint operators [Bel] indexed by a variable which represents the phase or the configuration of disorder. The measure class of the singular part of the LDOS may sensitively depend on the phase [DS]. In addition, the multifractal dimensions are not even measure class invariants [SBB] (unlike the Hausdorff and packing dimensions). This raises concerns about the practical relevance of bounds based on multifractal dimensions of the LDOS in this context. 
The bound (1) has a threefold advantage: (i) it involves the DOS, which is phase-averaged; (ii) it does not require information about eigenfunctions; (iii) the exponent of phase-averaged transport is the one that determines the low temperature behaviour of the conductivity [SBB]. The present formulation uses the C∗ -algebraic framework introduced by one of the authors for the study of homogeneous models of solid state physics. While referring to [Bel, SBB] for motivations and details, in the opening Sect. 2 we briefly recall some of the basic notions. A precise statement of our main results is also given in Sect. 2, along with an outline of the logical structure of their proofs. In the subsequent sections we present more results and proofs.
2. Notations and Results

A number $\alpha \in \mathbb{R}$ is of Roth type if and only if, for any $\varepsilon > 0$, there is a constant $c_\varepsilon > 0$ such that for all rational numbers $p/q$ the following inequality holds:
\[ \left| \alpha - \frac{p}{q} \right| \ge \frac{c_\varepsilon}{q^{2+\varepsilon}}. \tag{2} \]
Most properties of numbers of Roth type can be found in [Her]. They form a set of full Lebesgue measure containing all irrational algebraic numbers (Roth's theorem). $\theta > 0$ will be called a Roth angle if $\theta/2\pi$ is a number of Roth type.

The rotation algebra $A_\theta$ [Rie] is the smallest $C^*$-algebra generated by two unitaries $U$ and $V$, such that $UV = e^{\imath\theta} VU$. It is convenient to set $W_\theta(m) = e^{-\imath\theta m_1 m_2/2}\, U^{m_1} V^{m_2}$, whenever $m = (m_1, m_2) \in \mathbb{Z}^2$. The $W_\theta(m)$'s are unitary operators satisfying $W_\theta(l) W_\theta(m) = e^{\imath(\theta/2)\, l \wedge m}\, W_\theta(l + m)$, where $l \wedge m = l_1 m_2 - l_2 m_1$. The unique trace on $A_\theta$ ($\theta/2\pi$ irrational) is defined by $T_\theta(W_\theta(m)) = \delta_{m,0}$. A strongly continuous action of the torus $\mathbb{T}^2$ on $A_\theta$ is given by $((k_1, k_2), W_\theta(m)) \in \mathbb{T}^2 \times A_\theta \mapsto e^{\imath(m_1 k_1 + m_2 k_2)} W_\theta(m)$. The associated $*$-derivations are denoted by $\delta_1, \delta_2$. For $n \in \mathbb{N}$, one says $A \in C^n(A_\theta)$ if $\delta_1^{m_1} \delta_2^{m_2} A \in A_\theta$ for all positive integers $m_1, m_2$ satisfying $m_1 + m_2 \le n$.

$A_\theta$ admits three classes of representations that will be considered in this work. The 1D-covariant representations form a faithful family $(\pi_\omega)_{\omega \in \mathbb{R}}$ of representations on $\ell^2(\mathbb{Z})$ defined by $\pi_\omega(U) = T$ and $\pi_\omega(V) = e^{\imath(\omega - \theta \hat{X})}$, where $T$ and $\hat{X}$ are the shift and the position operator respectively, namely
\[ T u(n) = u(n - 1), \qquad \hat{X} u(n) = n\, u(n), \qquad \forall u \in \ell^2(\mathbb{Z}). \]
It follows that $\pi_{\omega + 2\pi} = \pi_\omega$ (periodicity) and that $T \pi_\omega(\cdot) T^{-1} = \pi_{\omega + \theta}(\cdot)$ (covariance). Moreover $\omega \mapsto \pi_\omega(\cdot)$ is norm continuous. In the sequel, it will be useful to denote by $|n\rangle = u_n$ ($n \in \mathbb{Z}$) the canonical basis of $\ell^2(\mathbb{Z})$ defined by $u_n(n') = \delta_{n,n'}$. The 2D-representation (or the GNS-representation of $T_\theta$) is given by the magnetic translations on $\ell^2(\mathbb{Z}^2)$ (in symmetric gauge):
\[ \pi_{2D}(W_\theta(m)) \psi(l) = e^{\imath\theta\, m \wedge l/2}\, \psi(l - m), \qquad \psi \in \ell^2(\mathbb{Z}^2). \]
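The commutation relation $UV = e^{\imath\theta}VU$ and the product rule for the $W_\theta(m)$ can be checked on a finite-dimensional model. The sketch below uses $q \times q$ clock and shift matrices, which satisfy the relation only for rational $\theta = 2\pi p/q$; this is therefore an assumption for illustration and not one of the representations used in the paper, where $\theta/2\pi$ is irrational. All function names are hypothetical.

```python
import cmath
import math


def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def unitary_power(A, e):
    """A**e for an integer e; negative powers use the conjugate transpose (A is unitary)."""
    n = len(A)
    if e < 0:
        A = [[A[j][i].conjugate() for j in range(n)] for i in range(n)]
        e = -e
    result = identity(n)
    for _ in range(e):
        result = matmul(result, A)
    return result


def clock_and_shift(p, q):
    """Return theta = 2*pi*p/q with a clock matrix U and a cyclic shift V, so that U V = e^{i theta} V U."""
    theta = 2.0 * math.pi * p / q
    U = [[cmath.exp(1j * theta * k) if i == k else 0j for k in range(q)] for i in range(q)]
    V = [[1.0 + 0j if i == (k + 1) % q else 0j for k in range(q)] for i in range(q)]
    return theta, U, V


def weyl(theta, U, V, m):
    """W_theta(m) = exp(-i theta m1 m2 / 2) U^{m1} V^{m2}."""
    m1, m2 = m
    phase = cmath.exp(-1j * theta * m1 * m2 / 2.0)
    product = matmul(unitary_power(U, m1), unitary_power(V, m2))
    return [[phase * x for x in row] for row in product]


def wedge(l, m):
    return l[0] * m[1] - l[1] * m[0]


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def product_rule_defect(p, q, l, m):
    """Largest entry of W(l) W(m) - exp(i theta l^m / 2) W(l + m)."""
    theta, U, V = clock_and_shift(p, q)
    lhs = matmul(weyl(theta, U, V, l), weyl(theta, U, V, m))
    factor = cmath.exp(1j * theta * wedge(l, m) / 2.0)
    total = (l[0] + m[0], l[1] + m[1])
    rhs = [[factor * x for x in row] for row in weyl(theta, U, V, total)]
    return max_abs_diff(lhs, rhs)


if __name__ == "__main__":
    print(product_rule_defect(2, 7, (1, 2), (-2, 3)))
```

The printed defect should be at the level of rounding error, since the product rule follows from the commutation relation alone.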
The position operators on $\ell^2(\mathbb{Z}^2)$ are denoted by $(X_1, X_2)$. The Weyl representation $\pi_W$ acts on $L^2(\mathbb{R})$. Let $Q$ and $P$ denote the position and momentum operators defined by $Q\phi(x) = x\phi(x)$ and $P\phi = -\imath\, d\phi/dx$ whenever $\phi$ belongs to the Schwartz space $S(\mathbb{R})$. It is known that $Q$ and $P$ are essentially selfadjoint and satisfy the canonical commutation rule $[Q, P] = \imath \mathbf{1}$. Then $\pi_W$ is defined by
\[ \pi_W(U) = e^{\imath\sqrt{\theta} P}, \qquad \pi_W(V) = e^{\imath\sqrt{\theta} Q}. \]
For every $\theta > 0$, $\pi_W$ and $\pi_{2D}$ are unitarily equivalent and faithful. More results about $A_\theta$ are reviewed in Sect. 3.2.

The group $SL(2, \mathbb{Z})$ acts on $A_\theta$ through the automorphisms $\eta_S(W_\theta(m)) = W_\theta(Sm)$, $S \in SL(2, \mathbb{Z})$. $S$ is called a symmetry if $S \ne \pm 1$ and $\sup_{n \in \mathbb{N}} \|S^n\| < \infty$. Of special interest are the 3-fold, 4-fold and 6-fold symmetries
\[ S_3 = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}, \qquad S_4 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad S_6 = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix}, \]
respectively generating the symmetry groups of the hexagonal (or honeycomb), square and triangular lattices in dimension 2.

In this work, the Hamiltonian $H = H^*$ is an element of $A_\theta$. Of particular interest are Hamiltonians invariant under some symmetry $S \in SL(2, \mathbb{Z})$, that is $\eta_S(H) = H$. The most prominent among such operators is the (critical) Harper Hamiltonian on a square lattice $H_4 = U + U^{-1} + V + V^{-1}$. For the sake of concreteness, let us write out its covariant representations, using $\pi_\omega(V) = e^{\imath(\omega - \theta\hat{X})}$:
\[ \pi_\omega(H_4) u(n) = u(n + 1) + u(n - 1) + 2\cos(\omega - n\theta)\, u(n), \qquad u \in \ell^2(\mathbb{Z}). \]
Its Weyl representation is $\pi_W(H_4) = 2\cos(\sqrt{\theta} Q) + 2\cos(\sqrt{\theta} P)$. Further examples are the magnetic operator on a triangular lattice $H_6 = U + U^{-1} + V + V^{-1} + e^{-\imath\theta/2} UV + e^{-\imath\theta/2} U^{-1} V^{-1}$ as well as on a hexagonal lattice (which reduces to two triangular ones [Ram]).

For $H = H^* \in A_\theta$ let us introduce the notations $H_\omega = \pi_\omega(H)$ and $H_{2D} = \pi_{2D}(H)$. Its density of states (DOS) is the measure $N$ defined by (see, e.g., [Bel])
\[ \int_{\mathbb{R}} dN(E)\, f(E) = T_\theta(f(H)) = \langle 0 | f(H_{2D}) | 0 \rangle = \lim_{\Lambda \to \infty} \frac{1}{\Lambda} \mathrm{Tr}_\Lambda(f(H_\omega)), \qquad f \in C_0(\mathbb{R}). \tag{3} \]
Here $|0\rangle$ denotes the normalized state localized at the origin of $\mathbb{Z}^2$, $\mathrm{Tr}_\Lambda(A) = \sum_{n=1}^{\Lambda} \langle n | A | n \rangle$ and the last equality in (3) holds uniformly in $\omega$. For a Borel set $\Delta \subset \mathbb{R}$ and a Borel measure $\mu$, the family of generalized multifractal dimensions is defined by
\[ D^\pm_\mu(\Delta; q) = \frac{1}{1 - q} \lim_{T \to \infty}{}^{\!\pm}\; \frac{\log\left( \int_\Delta d\mu(E) \left( \int_\Delta d\mu(E') \exp(-(E - E')^2 T^2) \right)^{q-1} \right)}{\log(T)}, \tag{4} \]
where $\lim^+$ and $\lim^-$ denote $\limsup$ and $\liminf$ respectively. The Gaussian $\exp(-(E - E')^2 T^2)$ may be replaced by the indicator function on $[E - \frac{1}{T}, E + \frac{1}{T}]$ without changing the values of the generalized dimensions [GSB3, BGT].

Let now $H \in C^2(A_\theta)$. The diffusion exponents of $H_{2D}$ are defined by
\[ \beta^\pm_{2D}(H, \Delta; q) = \lim_{T \to \infty}{}^{\!\pm}\; \frac{\log(\langle M_{2D}(H, \Delta; q, \cdot) \rangle_T)}{q \log(T)}, \qquad q \in (0, 2], \tag{5} \]
where
\[ M_{2D}(H, \Delta; q, t) = \langle 0 | \chi_\Delta(H_{2D})\, e^{\imath H_{2D} t} (|X_1|^q + |X_2|^q) e^{-\imath H_{2D} t} \chi_\Delta(H_{2D}) | 0 \rangle, \tag{6} \]
and $\langle f(\cdot) \rangle_T$ denotes the average $\int_{-T}^{+T} dt\, f(t)/2T$ of a measurable function $t \in \mathbb{R} \mapsto f(t) \in \mathbb{R}$. The phase-averaged diffusion exponents $\beta^\pm_{1D}(H, \Delta; q)$ of the covariant family $(H_\omega)_{\omega \in \mathbb{R}}$ are defined as in (5) as growth exponents of
\[ M_{1D}(H, \Delta; q, t) = \int_0^{2\pi} \frac{d\omega}{2\pi}\, \langle 0 | \chi_\Delta(H_\omega)\, e^{\imath H_\omega t} |\hat{X}|^q e^{-\imath H_\omega t} \chi_\Delta(H_\omega) | 0 \rangle. \tag{7} \]
Because $H \in C^2(A_\theta)$ and $q \in (0, 2]$, $M_{2D}(H, \Delta; q, t)$ and $M_{1D}(H, \Delta; q, t)$ are finite. Moreover, $\beta^\pm_{2D}(H, \Delta; q)$ and $\beta^\pm_{1D}(H, \Delta; q)$ take values in the interval $[0, 1]$ as long as the boundary of $\Delta$ lies in gaps of $H$ [SBB].
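The covariance relation $T \pi_\omega(\cdot) T^{-1} = \pi_{\omega+\theta}(\cdot)$ can be checked directly on the critical Harper Hamiltonian acting on finitely supported sequences. The sketch below is an illustration under stated assumptions: sequences are stored as dictionaries from sites to values, the potential is taken as $2\cos(\omega - n\theta)$, which is the form implied by $\pi_\omega(V) = e^{\imath(\omega - \theta\hat{X})}$, and the function names, the test vector and the numerical values of $\theta$ and $\omega$ are hypothetical.

```python
import math


def shift(u):
    """(T u)(n) = u(n - 1): every entry moves one site to the right."""
    return {n + 1: v for n, v in u.items()}


def shift_inv(u):
    """(T^{-1} u)(n) = u(n + 1): every entry moves one site to the left."""
    return {n - 1: v for n, v in u.items()}


def harper(u, theta, omega):
    """(H u)(n) = u(n + 1) + u(n - 1) + 2 cos(omega - n theta) u(n)."""
    out = {}
    for n, v in u.items():
        out[n - 1] = out.get(n - 1, 0.0) + v  # u(n) enters (H u)(n - 1) as the right neighbour
        out[n + 1] = out.get(n + 1, 0.0) + v  # u(n) enters (H u)(n + 1) as the left neighbour
        out[n] = out.get(n, 0.0) + 2.0 * math.cos(omega - n * theta) * v
    return out


def max_difference(u, w):
    keys = set(u) | set(w)
    return max(abs(u.get(k, 0.0) - w.get(k, 0.0)) for k in keys)


def covariance_defect(u, theta, omega):
    """Largest entry of T H_omega T^{-1} u - H_{omega + theta} u."""
    lhs = shift(harper(shift_inv(u), theta, omega))
    rhs = harper(u, theta, omega + theta)
    return max_difference(lhs, rhs)


if __name__ == "__main__":
    theta = 2.0 * math.pi * (math.sqrt(5.0) - 1.0) / 2.0
    u = {0: 1.0, 1: -0.5, 3: 2.0}
    print(covariance_defect(u, theta, 0.3))
```

Replacing the potential by $2\cos(n\theta + \omega)$ would turn the relation into $T H_\omega T^{-1} = H_{\omega - \theta}$, which is why the sign convention matters when comparing with the covariance stated in the text.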
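Main Theorem. Let $\theta$ be a Roth angle and $H = H^* \in C^2(A_\theta)$.

(i) For any Borel subset $\Delta \subset \mathbb{R}$ and $q \in (0, 1)$,
\[ \beta^\pm_{2D}(H, \Delta; q) \ge D^\pm_N(\Delta; 1 - q). \tag{8} \]
(ii) Let $H$ be invariant under some symmetry $S \in SL(2, \mathbb{Z})$. Then, for any Borel subset $\Delta \subset \mathbb{R}$ and $q \in (0, 1)$,
\[ \beta^\pm_{1D}(H, \Delta; q) \ge D^\pm_N(\Delta; 1 - q). \tag{9} \]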
Remark 1. Existing lower bounds (inequalities proved in [GSB3, BGT]) yield $\beta^\pm_{2D}(H, \Delta; q) \ge \frac{1}{2} D^\pm_N(\Delta; 1/(1 + q))$, where the factor $\frac{1}{2}$ stems from the dimension 2 of physical space. In addition, $D^\pm_N(\Delta; 1 - q) \ge D^\pm_N(\Delta; 1/(1 + q))$, so inequality (8) substantially improves such bounds. The same is true of the inequality in Theorem 1 below which is actually the key to the bounds (8) and (9). This crucial improvement follows from an almost-sure estimate on the growth of the generalized eigenfunctions in the Weyl representation (cf. Proposition 4 below) which in turn follows from number-theoretic estimates. As in [KL], a control on the asymptotics of the generalized eigenfunctions then leads to an improved lower bound on the diffusion coefficients (here by a factor 2 at $q = 0$).

Remark 2. The bound (8) is of practical interest especially if $H$ is invariant under some symmetry. Non-symmetric Hamiltonians may lead to ballistic motion and absolutely continuous spectral measures (as is generically the case for the non-critical Harper Hamiltonian, see [Jit] and references therein). In this situation, the bound becomes trivial because both sides in (9) are equal to 1.

Remark 3. Numerical results [TK, RP] as well as the Thouless property [RP] support that $D_N(-1) = \frac{1}{2}$ in the case of the critical Harper Hamiltonian $H_4$ for Diophantine $\theta/(2\pi)$. According to (9), one thus expects $\beta_{1D}(H_4, \mathbb{R}; 2) \ge \frac{1}{2}$.

Remark 4. Numerical simulations by Piéchon [Pie] for the Harper model with some strongly incommensurate $\theta/(2\pi)$ indicate that (9) may actually be an exact estimate. Piéchon also gave a perturbative argument supporting the equality $\beta_{1D}(H; q) = D_N(1 - q)$ in the case of the Fibonacci Hamiltonian, and verified it numerically. The techniques of the present article do not apply to the Fibonacci model which has no phase-space symmetry.

Remark 5. Our proof forces $q \in (0, 1)$ (see Lemma 3). If $D^+_N(\Delta; q) = D^-_N(\Delta; q)$ for all $q \ne 1$, the large deviation technique of [GSB3] leads to (8) for all $q > 0$ (if $H \in C^\infty(A_\theta)$) and (9) for all $q \in (0, 2]$. Numerical results [TK, RP] suggest that the upper and lower fractal dimensions indeed coincide for Diophantine $\theta/(2\pi)$. This is unlikely for $\theta/(2\pi)$ Liouville (compare [Las]).
Remark 6. Two-sided time averages are used for technical convenience.

The main steps of the proof are summarized below. Associated with the symmetry $S$ there is a harmonic oscillator Hamiltonian $H_S$, invariant under $\eta_S$, with ground state $\phi_S \in S(\mathbb{R})$, see Sect. 3.3. In the case of $S_4$ (relevant to the critical Harper model) this is the conventional harmonic oscillator Hamiltonian $H_{S_4} = (P^2 + Q^2)/2$, and $\phi_S$ is the Gaussian state. Let $\rho_S$ be the spectral measure of $H_W = \pi_W(H)$ with respect to $\phi_S$.
Proposition 1. Let $\theta > 2\pi$. There are two positive constants $c_\pm$ such that for any Borel subset $\Delta \subset \mathbb{R}$,
\[ c_- N(\Delta) \le \rho_S(\Delta) = \langle \phi_S | \chi_\Delta(H_W) | \phi_S \rangle \le c_+ N(\Delta). \]
In particular, $N$ and $\rho_S$ have the same multifractal exponents.

The Hamiltonian $H_S$ will be used to study transport in phase space. Similarly to Eqs. (5) and (6), moments of the phase space distance and growth exponents thereof can be defined in the Weyl representation as follows:
\[ M_W(H, \Delta; q, t) = \langle \phi_S | \chi_\Delta(H_W)\, e^{\imath t H_W} H_S^{q/2} e^{-\imath t H_W} \chi_\Delta(H_W) | \phi_S \rangle, \]
\[ \beta^\pm_W(H, \Delta; q) = \lim_{T \to \infty}{}^{\!\pm}\; \frac{\log(\langle M_W(H, \Delta; q, \cdot) \rangle_T)}{q \log(T)}. \]
Proposition 2. Let $\theta > 2\pi$ and $H = H^* \in C^2(A_\theta)$. For $q \in (0, 2]$,
\[ \beta^\pm_W(H, \Delta; q) = \beta^\pm_{2D}(H, \Delta; q). \]
Proposition 3. Let $\theta > 2\pi$ and $H = H^* \in C^2(A_\theta)$ be invariant under $\eta_S$ for some symmetry $S \in SL(2, \mathbb{Z})$. Then
\[ \beta^\pm_W(H, \Delta; q) \le \beta^\pm_{1D}(H, \Delta; q), \qquad q \in (0, 2]. \]
Thanks to Propositions 1, 2 and 3 and since $\theta$ may be replaced by $\theta + 2\pi$ without changing the 1D and 2D-representations, the Main Theorem is a direct consequence of the following:

Theorem 1. Let $H = H^* \in C^2(A_\theta)$ and $\theta > 2\pi$ be a Roth angle. Then, for any Borel subset $\Delta \subset \mathbb{R}$,
\[ \beta^\pm_W(H, \Delta; q) \ge D^\pm_{\rho_S}(\Delta; 1 - q), \qquad \forall q \in (0, 1). \]
The proof of Theorem 1 will require two technical steps that are worth being mentioned here. The first one requires some notations. Given a symmetry S, let 6S be the projection onto the HW -cyclic subspace HS ⊂ H of φS . Using the spectral the(n) orem, there is an isomorphism between HS and L2 (R, dρS ). If (φS )n∈N denotes the (n) orthonormal basis of eigenstates of HS in H, let 7n,S (E) be the representative of 6S φS 2 in L (R, dρS ). Then: Proposition 4. Let H = H ∗ ∈ C 2 (Aθ ) and let θ be a Roth angle. Then for any > 0 there is c > 0 such that ∞
|7n,S (E)|2 e−δ(n+1/2) ≤ c δ −(1/2+) ,
∀0 < δ < 1,
ρS -a.e. E ∈ R.
n=0
Remark 7. This result is uniform (ρS -almost surely) with respect to the spectral parameter E and to δ. In particular, integrating over E with respect to ρS shows that N−1 (n) 2 1/2+ ). This is possible because of the following complen=0 6S φS = O(N mentary result proved in the Appendix:
Proposition 5. Let $H = H^* \in A_\theta$. Then $H_W$ has infinite multiplicity and no cyclic vector.

The second technical result concerns the so-called Mehler kernel of the Hamiltonian $H_S$, namely the integral kernel of the operator $e^{-tH_S}$ in the $Q$-representation:
\[ M_S(t; x, y) = \langle x | e^{-tH_S} | y \rangle. \tag{10} \]
Proposition 6. Let $\theta$ be a Roth angle. Then, for all $\varepsilon > 0$, as $t \downarrow 0$,
\[ \sup_{0 \le x \le 2\pi\theta^{-1/2},\; 0 \le y \le \theta^{1/2}}\; \sum_{m \in \mathbb{Z}^2} |M_S(t; x + 2\pi m_1 \theta^{-1/2}, y + \theta^{1/2} m_2)| = O(t^{-1/2 - \varepsilon}). \]
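3. Weyl's Calculus

This chapter begins with a review of basic facts about Weyl operators, the rotation algebra and implementation of symmetries therein. The formulas are well-known (e.g. [Per, Bel94]) and mainly given in order to fix notations, but for the convenience of the reader their proofs are nevertheless given in the Appendix. The chapter also contains a new and compact solution of the frame problem for coherent states (Sect. 3.4).

3.1. Weyl operators. Let $\mathcal{H}$ denote the Hilbert space $L^2(\mathbb{R})$. Given a vector $a = (a_1, a_2) \in \mathbb{R}^2$, the associated Weyl operator is defined by:
\[ \mathcal{W}(a) = e^{\imath(a_1 P + a_2 Q)} \;\Leftrightarrow\; \mathcal{W}(a)\psi(x) = e^{\imath a_1 a_2/2}\, e^{\imath a_2 x}\, \psi(x + a_1), \qquad \forall \psi \in \mathcal{H}. \tag{11} \]
The Weyl operators are unitaries, strongly continuous with respect to $a$ and satisfy
\[ \mathcal{W}(a)\mathcal{W}(b) = e^{\imath a \wedge b/2}\, \mathcal{W}(a + b), \qquad a \wedge b = a_1 b_2 - a_2 b_1. \tag{12} \]
The following weak-integral identities are verified in the Appendix:
\[ \langle \psi | \mathcal{W}(a)^{-1} | \psi \rangle\, \mathcal{W}(a) = \int_{\mathbb{R}^2} \frac{d^2 b}{2\pi}\, e^{\imath a \wedge b}\, \mathcal{W}(b) |\psi\rangle\langle\psi| \mathcal{W}(b)^{-1}, \tag{13} \]
\[ \mathcal{W}(b) |\psi\rangle\langle\psi| \mathcal{W}(b)^{-1} = \int_{\mathbb{R}^2} \frac{d^2 a}{2\pi}\, e^{\imath b \wedge a}\, \langle \psi | \mathcal{W}(a)^{-1} | \psi \rangle\, \mathcal{W}(a). \tag{14} \]
Applying (13) to $\phi$ and setting $a = 0$ leads to
\[ \phi = \int_{\mathbb{R}^2} \frac{d^2 b}{2\pi}\, \langle \psi | \mathcal{W}(b)^{-1} | \phi \rangle\, \mathcal{W}(b)\psi, \qquad \phi, \psi \in \mathcal{H}, \quad \|\psi\| = 1. \tag{15} \]
In particular, any non-zero vector in $\mathcal{H}$ is cyclic for the Weyl algebra $\{\mathcal{W}(a) \,|\, a \in \mathbb{R}^2\}$. If $\psi \in \mathcal{H}$, the map $a \in \mathbb{R}^2 \mapsto \langle \psi | \mathcal{W}(a) | \psi \rangle \in \mathbb{C}$ is continuous, tends to zero at infinity and belongs to $L^2(\mathbb{R}^2)$, whereas $\psi \in S(\mathbb{R})$ if and only if this map belongs to $S(\mathbb{R}^2)$.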
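3.2. The rotation algebra. The rotation algebra $A_\theta$, its representations $(\pi_\omega)_{\omega \in \mathbb{R}}$, $\pi_{2D}$ and $\pi_W$ as well as the tracial state $T_\theta$ and $*$-derivations $\delta_1, \delta_2$ were defined in Sect. 2. Here we give some complements, further definitions and the short proof of Proposition 5. The trace is faithful and satisfies the Fourier formula:
\[ A = \sum_{l \in \mathbb{Z}^2} a_l\, W_\theta(l), \qquad a_l = T_\theta(W_\theta(l)^{-1} A). \tag{16} \]
In addition,
\[ T_\theta(A) = \int_0^{2\pi} \frac{d\omega}{2\pi}\, \langle m | \pi_\omega(A) | m \rangle = \langle l | \pi_{2D}(A) | l \rangle, \qquad \forall A \in A_\theta,\ \forall m \in \mathbb{Z},\ \forall l \in \mathbb{Z}^2. \tag{17} \]
The $*$-derivations satisfy $\delta_j W_\theta(m) = \imath m_j W_\theta(m)$, $j = 1, 2$. It follows from (16) that $A \in C^\infty(A_\theta)$ if and only if the sequence of its Fourier coefficients is fast decreasing. If $A \in C^\infty(A_\theta)$ and $A$ is invertible in $A_\theta$, then $A^{-1} \in C^\infty(A_\theta)$. The position operator $(X_1, X_2)$ defined on the space $s(\mathbb{Z}^2)$ of Schwartz sequences in $\ell^2(\mathbb{Z}^2)$ forms a connection [Con] in the following sense:
\[ X_j(\pi_{2D}(A)\phi) = \pi_{2D}(\delta_j A)\phi + \pi_{2D}(A) X_j \phi, \qquad \forall A \in C^\infty(A_\theta),\ \phi \in s(\mathbb{Z}^2). \tag{18} \]
Similarly, if $(\nabla_1, \nabla_2)$ is defined on $S(\mathbb{R})$ by $\nabla_1 = -\imath Q/\sqrt{\theta}$, $\nabla_2 = \imath P/\sqrt{\theta}$, then
\[ \nabla_j(\pi_W(A)\psi) = \pi_W(\delta_j A)\psi + \pi_W(A)\nabla_j \psi, \qquad \forall A \in C^\infty(A_\theta),\ \psi \in S(\mathbb{R}). \tag{19} \]
Then $S(\mathbb{R})$ is exactly the set of $C^\infty$-elements of $\mathcal{H}$ with respect to $\nabla$. In particular, if $\psi \in S(\mathbb{R})$ and $A \in C^\infty(A_\theta)$, then $\pi_W(A)\psi \in S(\mathbb{R})$. For the Weyl representation, let us use the notations
\[ \pi_W(W_\theta(m)) = \mathcal{W}_\theta(m) := \mathcal{W}(\sqrt{\theta}\, m), \qquad \forall m \in \mathbb{Z}^2. \tag{20} \]
It can be seen as a direct integral of 1D-representations by introducing the family $(G_\omega)_{\omega \in \mathbb{R}}$ of transformations from $\mathcal{H}$ into $\ell^2(\mathbb{Z})$,
\[ (G_\omega \phi)(n) = \theta^{-1/4}\, \phi\!\left( \frac{\omega - n\theta}{\sqrt{\theta}} \right), \qquad \forall \phi \in \mathcal{H}. \tag{21} \]
Then a direct computation (given in the Appendix) shows that:
\[ \langle \phi | \pi_W(A) | \psi \rangle = \int_0^\theta d\omega\, \langle G_\omega \phi | \pi_\omega(A) | G_\omega \psi \rangle, \qquad A \in A_\theta,\ \phi, \psi \in \mathcal{H}. \tag{22} \]
In particular, $\|\phi\|^2 = \int_0^\theta d\omega\, \|G_\omega \phi\|^2_{\ell^2}$. The link between $\pi_W$ and $\pi_{2D}$ will be established in Sect. 4.2. It follows from a theorem by Rieffel [Rie] that the commutant of $\pi_W(A_\theta)$ is the von Neumann algebra generated by $\pi_W(A_{\theta'})$, where $\theta'/2\pi = 2\pi/\theta$ and $\pi_W(W_{\theta'}(l)) = \mathcal{W}_{\theta'}(l)$. The following result is proven in the Appendix: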
Proposition 7 (The generalized Poisson summation formula). Tψθ :=
Wθ (l)|ψψ|Wθ (l)−1 =
l∈Z2
θ ψ|Wθ (m)−1 |ψWθ (m). 2π 2
(23)
m∈Z
By Eq. (23), ψ ∈ S(R) implies Tψθ ∈ C ∞ (Aθ ). It follows immediately from Eq. (23) that, given ψ ∈ S(R), there is a positive element in Aθ , denoted Fψθ , such that Tψθ = (θ/2π)πW Fψθ . Moreover ψ|πW (A)|ψ = Tθ AFψθ ,
∀A ∈ Aθ .
(24)
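3.3. Symmetries. It is well-known that $S \in SL(2,\mathbb{R})$ can be uniquely decomposed into a torsion, a dilation and a rotation as follows:
$$S = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \kappa & 1 \end{pmatrix} \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \begin{pmatrix} \cos s & -\sin s \\ \sin s & \cos s \end{pmatrix},$$
with $\kappa = (ac+db)/(a^2+b^2)$, $\lambda = (a^2+b^2)^{1/2}$, $e^{\imath s} = (a - \imath b)(a^2+b^2)^{-1/2}$. Moreover, if $S \in SL(2,\mathbb{R})$, then there is a unitary transformation $F_S$ acting on $\mathcal{H}$ such that
$$\mathcal{W}(Sa) = F_S\,\mathcal{W}(a)\,F_S^{-1}, \qquad a \in \mathbb{R}^2, \tag{25}$$
as follows from the above decomposition together with the following result, the proof of which is deferred to the Appendix:

Proposition 8. For any $\kappa, \lambda, s \in \mathbb{R}$, $\lambda \neq 0$, up to a phase
$$F_{\left(\begin{smallmatrix} 1 & 0 \\ \kappa & 1 \end{smallmatrix}\right)} = e^{-\imath\kappa Q^2/2}, \qquad F_{\left(\begin{smallmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{smallmatrix}\right)} = e^{-\imath\ln(\lambda)(QP+PQ)/2}, \qquad F_{\left(\begin{smallmatrix} \cos s & -\sin s \\ \sin s & \cos s \end{smallmatrix}\right)} = e^{-\imath s(Q^2+P^2-1)/2}. \tag{26}$$
Note in particular that $F_S F_{S'} = z F_{SS'}$ for some $z \in \mathbb{C}$, $|z| = 1$. Furthermore, if $0 < s < \pi$,
$$F_{\left(\begin{smallmatrix} \cos s & -\sin s \\ \sin s & \cos s \end{smallmatrix}\right)}\phi(x) = \int_{\mathbb{R}} \frac{dy}{\sqrt{2\pi\sin s}}\; e^{\imath\left(\cos s\,(x^2+y^2) - 2xy\right)/2\sin s}\,\phi(y). \tag{27}$$
In the special case $s = \pi/2$, namely for the matrix $S_4$ (see Sect. 2), this gives the usual Fourier transform
$$F_{S_4}\phi(x) = \int_{\mathbb{R}} \frac{dy}{\sqrt{2\pi}}\; e^{-\imath xy}\,\phi(y). \tag{28}$$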
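The torsion-dilation-rotation decomposition stated at the beginning of this subsection can be checked numerically. The following is a minimal sketch, not part of the original argument; the function names `decompose` and `recompose` are illustrative, and the rotation angle is recovered from $e^{\imath s} = (a - \imath b)/\lambda$ via `atan2`.

```python
import math


def decompose(a, b, c, d):
    """Return (kappa, lam, s) for S = [[a, b], [c, d]] with det S = 1."""
    norm_sq = a * a + b * b
    kappa = (a * c + d * b) / norm_sq   # torsion parameter
    lam = math.sqrt(norm_sq)            # dilation parameter
    s = math.atan2(-b, a)               # e^{is} = (a - ib) / lam
    return kappa, lam, s


def recompose(kappa, lam, s):
    """Multiply torsion(kappa) * dilation(lam) * rotation(s)."""
    cs, sn = math.cos(s), math.sin(s)
    # dilation * rotation
    top = (lam * cs, -lam * sn)
    bottom = (sn / lam, cs / lam)
    # torsion acts by adding kappa times the first row to the second row
    return (top, (kappa * top[0] + bottom[0], kappa * top[1] + bottom[1]))


if __name__ == "__main__":
    params = decompose(2.0, 1.0, 3.0, 2.0)   # det = 2*2 - 1*3 = 1
    print(params)
    print(recompose(*params))
```

Running the script on the sample matrix with rows (2, 1) and (3, 2) should return that matrix again up to rounding errors.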
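For the case of the 3-fold and 6-fold symmetries $S_3$ and $S_6$, acting on a hexagonal or a triangular lattice (see Sect. 2), Eqs. (26) and (27) give
$$F_{S_3}\phi(x) = e^{\imath\pi/12}\int_{\mathbb{R}} \frac{dy}{\sqrt{2\pi}}\; e^{-\imath x(x+2y)/2}\,\phi(y), \qquad F_{S_6}\phi(x) = e^{-\imath\pi/12}\int_{\mathbb{R}} \frac{dy}{\sqrt{2\pi}}\; e^{-\imath y(2x-y)/2}\,\phi(y). \tag{29}$$
Now suppose that $S \in SL(2,\mathbb{R})$ satisfies $S^r = 1$ for some $r \in \mathbb{N}$, $r \geq 2$, and $S^n \neq 1$ for $0 < n < r$. It will be convenient to introduce the following operator acting on $\mathcal{H}$:
$$H_S = \frac{1}{2r}\sum_{n=0}^{r-1} F_S^n Q^2 F_S^{-n} = \frac{1}{2}\langle K|M_S|K\rangle, \qquad M_S = \frac{1}{r}\sum_{n=0}^{r-1} S^n|e_2\rangle\langle e_2|(S^t)^n,$$
where $K = (P, Q)$ and $\{e_1, e_2\}$ is the canonical basis of $\mathbb{R}^2$. Note that $H_{S_4} = (P^2 + Q^2)/2$. There is $0 \leq n \leq r-1$ such that $S^n e_2 \wedge e_2 \neq 0$, so $M_S$ is positive definite and can be diagonalized by a rotation:
$$M_S = \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix} \begin{pmatrix} \mu_S^+ & 0 \\ 0 & \mu_S^- \end{pmatrix} \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix}^{-1}.$$
Hence $H_S$ is unitarily equivalent to the harmonic oscillator Hamiltonian $(\mu_S^+ P^2 + \mu_S^- Q^2)/2$. Therefore,
$$H_S = \mu\sum_{n=0}^{\infty}\left(n + \frac{1}{2}\right)|\phi_S^{(n)}\rangle\langle\phi_S^{(n)}|, \qquad \mu = (\mu_S^+\mu_S^-)^{1/2}, \qquad \lambda = \left(\frac{\mu_S^+}{\mu_S^-}\right)^{1/4}, \tag{30}$$
where the $\phi_S^{(n)}$ are the eigenstates. The ground state is denoted $\phi_S \equiv \phi_S^{(0)}$.

Proposition 9. Up to a phase, the ground state is given by
$$\phi_S(x) = \left(\frac{\Re e(\sigma_S)}{\pi}\right)^{1/4} e^{-\sigma_S x^2/2}, \qquad \sigma_S = \frac{\mu_S^-\cos\gamma + \imath\,\mu_S^+\sin\gamma}{\mu_S^+\cos\gamma + \imath\,\mu_S^-\sin\gamma}, \tag{31}$$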
and the Mehler kernel (10) by − (x−y)
2 tanh (tµ)−1 +(x+y)2 tanh (tµ) 4(λ2 cos γ 2 +λ−2 sin γ 2 )
2
−2
) e ı(x 2 −y 2 ) 2sin (2γ 2)(λ −λ 4(λ cos γ +λ−2 sin γ 2 ) . e MS (t; x, y) = λ 2π sinh (tµ)(λ2 cos γ 2 + λ−2 sin γ 2 ) (32)
By construction, $F_S H_S F_S^* = H_S$, so that $F_S\phi_S = e^{\imath\delta_S}\phi_S$ for some phase $\delta_S$. Thus, it is possible to choose the phase of $F_S$ such that $F_S\phi_S = \phi_S$. Such is the case for $F_{S_i}$ in Eqs. (28) and (29). Recall from Sect. 2 that $\pm 1 \neq S \in SL(2,\mathbb{Z})$ is called a symmetry of $\mathcal{A}_\theta$ if $\sup_{n\in\mathbb{Z}}\|S^n\| < \infty$. Since the set of $M \in SL(2,\mathbb{Z})$ with $\|M\| \leq c$ is finite (for any $0 < c < \infty$), and since $S \neq \pm 1$, there is an integer $r \in \mathbb{N}^*$ such that $S^r = 1$ and $S^n \neq 1$ for $0 < n < r$. So the two eigenvalues are $\{e^{\pm\imath\varphi_S}\}$, with $r\varphi_S = 0 \pmod{2\pi}$ and $\varphi_S \neq 0, \pi$. In particular $\mathrm{Tr}(S) = 2\cos\varphi_S \in \mathbb{Z}$, implying $r \in \{3, 4, 6\}$ and $\varphi_S \in \{\pm\pi/3, \pm\pi/2, \pm 2\pi/3\}$. Any $S \in SL(2,\mathbb{Z})$ defines a $*$-automorphism $\eta_S$ of $\mathcal{A}_\theta$ through $\eta_S(W_\theta(m)) = W_\theta(Sm)$. According to the above, $\pi_W(\eta_S(W_\theta(m))) = F_S\,\pi_W(W_\theta(m))\,F_S^{-1}$.
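The restriction $r \in \{3, 4, 6\}$ can be illustrated by brute force. The sketch below enumerates integer matrices of determinant one with small entries and records the orders of those of finite order. The entry bound and the three sample matrices are illustrative choices and need not coincide with the matrices $S_3$, $S_4$, $S_6$ of Sect. 2.

```python
from itertools import product

IDENTITY = ((1, 0), (0, 1))


def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def order(s, max_order=12):
    """Smallest n >= 1 with s**n = 1, or None if there is none up to max_order."""
    power = s
    for n in range(1, max_order + 1):
        if power == IDENTITY:
            return n
        power = matmul(power, s)
    return None


def finite_orders(bound=2):
    """Orders of all finite-order elements of SL(2, Z) with |entries| <= bound."""
    found = set()
    for a, b, c, d in product(range(-bound, bound + 1), repeat=4):
        if a * d - b * c != 1:
            continue
        r = order(((a, b), (c, d)))
        if r is not None:
            found.add(r)
    return found


if __name__ == "__main__":
    # Orders 1 and 2 come from +1 and -1; the symmetries have order 3, 4 or 6.
    print(sorted(finite_orders()))
    print(order(((0, -1), (1, -1))), order(((0, -1), (1, 0))), order(((1, -1), (1, 0))))
```

The three sample matrices have traces $-1$, $0$ and $1$, matching $2\cos\varphi_S$ for $\varphi_S = 2\pi/3$, $\pi/2$ and $\pi/3$ respectively.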
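3.4. $\theta$-traces and $\theta$-frames.

Definition 1. A vector $\psi \in \mathcal{H}$ will be called $\theta$-tracial if $\langle\psi|\mathcal{W}_\theta(l)|\psi\rangle = \mathcal{T}_\theta(W_\theta(l)) = \delta_{l,0}$ for all $l \in \mathbb{Z}^2$. Equivalently, the family $(\mathcal{W}_\theta(l)\psi)_{l\in\mathbb{Z}^2}$ is orthonormal.

Using the commutation rules (12), it is possible to check that $\psi$ is $\theta$-tracial if and only if $\mathcal{W}(a)\psi$ is $\theta$-tracial for any $a \in \mathbb{R}^2$. It also follows from Eq. (23) that $\psi$ is $\theta$-tracial if and only if $T_\psi^{\theta'} = (\theta/2\pi)\mathbf{1}$. Such $\theta$-tracial states exist under the following condition:

Theorem 2. There is a $\theta$-tracial vector $\psi \in \mathcal{H}$ if and only if $\theta \geq 2\pi$. If $\theta > 2\pi$ there is a $\theta$-tracial vector in $\mathcal{S}(\mathbb{R})$. For $\theta \geq 2\pi$, denote by $\Pi_\psi$ the projection on the orthocomplement of the $\psi$-cyclic subspace $\overline{\pi_W(\mathcal{A}_\theta)\psi} \subset \mathcal{H}$. There is a projection $P_\psi \in \mathcal{A}_{\theta'}$ satisfying $\pi_W(P_\psi) = \Pi_\psi$ and $\mathcal{T}_{\theta'}(P_\psi) = 1 - 2\pi/\theta$. In particular, $\psi$ is also $\mathcal{A}_\theta$-cyclic for $\theta = 2\pi$.

Proof. If $\psi$ is $\theta$-tracial, then $(\theta/2\pi) = \langle\psi|T_\psi^{\theta'}|\psi\rangle = \sum_{l\in\mathbb{Z}^2}|\langle\mathcal{W}_{\theta'}(l)\psi|\psi\rangle|^2 \geq \|\psi\|^2 = 1$. If $\theta > 2\pi$, for $0 < \varepsilon < \min(2\pi, \theta - 2\pi)$, let $\phi$ be a $C^\infty$ function on $\mathbb{R}$ such that $0 \leq \phi \leq 1$, with support in $[0, 2\pi + \varepsilon]$, such that $\phi = 1$ on $[\varepsilon, 2\pi]$, and $\phi(x)^2 + \phi(x + 2\pi)^2 = 1$ whenever $0 \leq x \leq \varepsilon$. Using (22), $\phi$ is $\theta$-tracial (after normalization), and belongs to $\mathcal{S}(\mathbb{R})$. If $\theta = 2\pi$, the same argument holds with $\varepsilon = 0$. Then $\phi \in \mathcal{H}$, but it is not smooth anymore. Let $\psi$ be $\theta$-tracial. Exchanging the rôles of $\theta$ and $\theta'$, the Poisson summation formula implies
$$T_\psi^{\theta} = \sum_{m\in\mathbb{Z}^2} \mathcal{W}_\theta(m)|\psi\rangle\langle\psi|\mathcal{W}_\theta(m)^{-1} = \frac{2\pi}{\theta}\sum_{l\in\mathbb{Z}^2} \langle\psi|\mathcal{W}_{\theta'}(l)^{-1}|\psi\rangle\,\mathcal{W}_{\theta'}(l).$$
Hence $\Pi_\psi = \mathbf{1} - T_\psi^\theta$ is the desired orthogonal projection which, due to the r.h.s., is the Weyl representative of an element $P_\psi \in \mathcal{A}_{\theta'}$. Its trace is $\mathcal{T}_{\theta'}(P_\psi) = 1 - 2\pi/\theta$. If $\theta = 2\pi$, since the trace is faithful, $T_\psi^\theta = \mathbf{1}$, so that $\psi$ is cyclic. □

Definition 2. A vector $\psi \in \mathcal{H}$ is called a $\theta$-frame, if there are constants $0 < c < C < \infty$ such that $c\mathbf{1} \leq T_\psi^{\theta'} \leq C\mathbf{1}$.

This definition is in accordance with the literature ([Sei] and references therein) where the complete set $(\mathcal{W}_{\theta'}(l)\psi)_{l\in\mathbb{Z}^2}$ is called a frame. The principal interest of frames is due to the following: any vector $\phi \in \mathcal{H}$ can be decomposed as $\phi = T_\psi^{\theta'}(T_\psi^{\theta'})^{-1}\phi = \sum_{l\in\mathbb{Z}^2} c_l\,\mathcal{W}_{\theta'}(l)\psi$, where $c_l = \langle\psi|\mathcal{W}_{\theta'}(l)^*(T_\psi^{\theta'})^{-1}|\phi\rangle$. If $\psi \in \mathcal{S}(\mathbb{R})$ and $\phi \in \mathcal{S}(\mathbb{R})$, then $(c_l)_{l\in\mathbb{Z}^2} \in s(\mathbb{Z}^2)$. Further note that, if $\psi$ is a $\theta$-frame, then $\hat\psi = (\theta/2\pi)^{1/2}(T_\psi^{\theta'})^{-1/2}\psi$ is $\theta$-tracial. In addition, if $\psi \in \mathcal{S}(\mathbb{R})$ then $\hat\psi \in \mathcal{S}(\mathbb{R})$. The next result shows that so-called Weyl–Heisenberg or Gabor lattices constructed with a gaussian mother state are frames provided the volume of the chosen phase-space cell is sufficiently small. This was proved in [Sei], but the present proof is new and covers more general cases. Suppose $S \in SL(2,\mathbb{R})$ satisfies $S^r = 1$ for some $r$. Using the results of Sect. 3.3 and Eq. (11), it is possible to compute
$$\langle\phi_S|\mathcal{W}(a)|\phi_S\rangle = e^{-|a|_S^2/4}, \qquad |a|_S^2 = \frac{\mu_S^+ a_1^2 + \mu_S^- a_2^2}{\mu}. \tag{33}$$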
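Theorem 3. For $\theta > 2\pi$, $\phi_S$ is a $\theta$-frame in $\mathcal{S}(\mathbb{R})$.

Proof. The proof below is given for $\phi_0 \equiv \phi_{S_4}$, but the same strategy works for any $\phi_S$. Thanks to Poisson's formula (23) and Eq. (33), $\|T_{\phi_0}^{\theta'}\| \leq (\theta/2\pi)\sum_m e^{-\theta|m|^2/4}$. It is therefore enough to find a positive lower bound. Since $\pi_W$ is faithful, it is enough to show that $T_0 = \sum_m e^{-\theta|m|^2/4}\,W_\theta(m)$ is itself bounded from below in $\mathcal{A}_\theta$. Writing $\theta = 2\pi + \delta$ with $\delta > 0$, there is a $*$-isomorphism between $\mathcal{A}_\theta$ and the closed subalgebra of $\mathcal{A}_{2\pi}\otimes\mathcal{A}_\delta$ generated by $(W_{2\pi}(m)\otimes W_\delta(m))_{m\in\mathbb{Z}^2}$. It is enough to show that $\hat T_0 = \sum_m e^{-\theta|m|^2/4}\,W_{2\pi}(m)\otimes W_\delta(m)$ is bounded from below in $\mathcal{A}_{2\pi}\otimes\mathcal{A}_\delta$. $\mathcal{A}_{2\pi}$ is abelian and $*$-isomorphic to $C(\mathbb{T}^2)$, provided $W_{2\pi}(m)$ is identified with the map $\kappa = (\kappa_1, \kappa_2) \in \mathbb{T}^2 \mapsto (-1)^{m_1m_2}e^{\imath\kappa\cdot m} \in \mathbb{C}$. Hence it is enough to show that $\hat T_0(\kappa) = \sum_m (-1)^{m_1m_2}e^{-\theta|m|^2/4+\imath\kappa\cdot m}\,W_\delta(m)$ is bounded from below in $\mathcal{A}_\delta$ uniformly in $\kappa$. Since the Weyl representation is faithful, $W_\delta(m)$ can be replaced by $\mathcal{W}_\delta(m)$. Using Eq. (13) with $\psi = \phi_0$ and $a = \sqrt{\delta}\,m$, it is thus enough to show that
$$\tilde T_0(\kappa) = \int_{\mathbb{R}^2}\frac{d^2b}{2\pi}\; I(\kappa_1 + \sqrt{\delta}\,b_2,\ \kappa_2 - \sqrt{\delta}\,b_1)\;\mathcal{W}(b)|\phi_0\rangle\langle\phi_0|\mathcal{W}(b)^{-1},$$
where
$$I(\kappa) = \sum_{m\in\mathbb{Z}^2}(-1)^{m_1m_2}\,e^{-\pi|m|^2/2+\imath(\kappa\cdot m)}, \tag{34}$$
is bounded from below. Clearly the function $I$ is $2\pi$-periodic in both of its arguments. Hence, decomposing the integral into a sum of integrals over the shifted unit cell $C = [0, 2\pi)\times[0, 2\pi)$ and using $\mathcal{W}_{\delta'}(a) = \mathcal{W}(2\pi a/\sqrt{\delta})$ gives
$$\tilde T_0(\kappa) = \int_C \frac{d^2a}{2\pi\delta}\; I(a)\sum_{l\in\mathbb{Z}^2}\mathcal{W}_{\delta'}\!\left(l + \frac{a+\hat\kappa}{2\pi}\right)|\phi_0\rangle\langle\phi_0|\,\mathcal{W}_{\delta'}\!\left(l + \frac{a+\hat\kappa}{2\pi}\right)^{-1},$$
where $\hat\kappa = (\kappa_2, -\kappa_1)$. The Poisson summation formula applied to the summation over $m_1$ in (34) gives a sum over an index $n_1$. Changing summation indexes $n_2 = m_2 - n_1$ shows that $I(\kappa) = \sqrt{2}\,e^{-\kappa_1^2/2\pi}\,|f(\kappa_1 + \imath\kappa_2)|^2$, where $f$ is the holomorphic entire function given by $f(z) = \sum_{n\in\mathbb{Z}} e^{-\pi n^2 - nz}$. It can be checked that $f(z + 2\imath\pi) = f(z)$ and that $f(z + 2\pi) = e^{z+\pi}f(z)$. Moreover, using the Poisson summation formula, $f$ does not vanish on $\gamma$, the boundary of $C$ oriented clockwise. As $f$ has no poles, the number of zeros of $f$ within $C$ counted with their multiplicity is given by $\oint_\gamma df/2\imath\pi f$. Using the periodicity properties of $f$, this integral equals 1. Moreover, a direct calculation shows that the unique zero with multiplicity 1 of $f$ lies at the center $\pi(1+\imath)$ of $C$. Hence there is a constant $c_1 > 0$ such that $|f(\pi + \imath\pi + re^{\imath\varphi})| \geq c_1r^2$ for all $\varphi \in [0, 2\pi)$. Let $B_r$ denote the ball of size $r$ around $\pi(1+\imath)$. Replacing this shows
$$\tilde T_0(\kappa) \geq \frac{c_1r^2}{\delta}\left(\mathbf{1} - \int_{B_r}\frac{d^2a}{2\pi}\;\mathcal{W}_{\delta'}\!\left(\frac{a+\hat\kappa}{2\pi}\right)T_{\phi_0}^{\delta'}\,\mathcal{W}_{\delta'}\!\left(\frac{a+\hat\kappa}{2\pi}\right)^{-1}\right).$$
As $T_{\phi_0}^{\delta'} \leq c_2\mathbf{1}$, $\tilde T_0(\kappa) \geq \mathbf{1}\,c_1r^2(1 - c_2r^2/2)/\delta$. Choosing $r$ small enough, $\tilde T_0(\kappa)$ is bounded from below by a positive constant uniformly in $\kappa$. □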
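The properties of $f$ and $I$ used in the proof of Theorem 3 lend themselves to a quick numerical sanity check. The sketch below truncates both series at $|n| \leq 12$ (an arbitrary cutoff, adequate because the Gaussian factors decay very fast) and evaluates the two quasi-periodicity relations, the zero at $\pi(1+\imath)$, and the identity $I(\kappa) = \sqrt{2}\,e^{-\kappa_1^2/2\pi}|f(\kappa_1+\imath\kappa_2)|^2$ at one sample point. The names `f`, `big_i` and `N_MAX` are illustrative.

```python
import cmath
import math

N_MAX = 12  # truncation of the series; an assumption, not a value from the text


def f(z):
    """Truncated series f(z) = sum_n exp(-pi n^2 - n z)."""
    return sum(cmath.exp(-math.pi * n * n - n * z) for n in range(-N_MAX, N_MAX + 1))


def big_i(k1, k2):
    """Truncated double series I(kappa) of Eq. (34); returns the real part."""
    total = 0j
    for m1 in range(-N_MAX, N_MAX + 1):
        for m2 in range(-N_MAX, N_MAX + 1):
            sign = -1.0 if (m1 * m2) % 2 else 1.0
            gauss = math.exp(-math.pi * (m1 * m1 + m2 * m2) / 2.0)
            total += sign * gauss * cmath.exp(1j * (k1 * m1 + k2 * m2))
    return total.real


if __name__ == "__main__":
    z = 0.7 + 1.3j
    print(abs(f(z + 2j * math.pi) - f(z)))                          # periodicity defect
    print(abs(f(z + 2 * math.pi) - cmath.exp(z + math.pi) * f(z)))  # quasi-periodicity defect
    print(abs(f(math.pi * (1 + 1j))))                               # value at the claimed zero
    rhs = math.sqrt(2.0) * math.exp(-z.real ** 2 / (2 * math.pi)) * abs(f(z)) ** 2
    print(abs(big_i(z.real, z.imag) - rhs))                         # identity defect
```

All four printed numbers should be at the level of floating-point rounding.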
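4. Comparison Theorems

4.1. Proof of Proposition 1. For normalized $\phi \in \mathcal{H}$, $\rho_\phi$ denotes the spectral measure of $H_W$ relative to $\phi$. Proposition 1 is a corollary of the following result:

Theorem 4. For $\theta \geq 2\pi$, for any normalized $\theta$-frame $\phi \in \mathcal{H}$ and any Borel subset $\Delta$ of $\mathbb{R}$,
$$\frac{2\pi}{\theta}\,\big\|(T_\phi^{\theta'})^{-1}\big\|^{-1}\,\mathcal{N}(\Delta) \;\leq\; \rho_\phi(\Delta) \;\leq\; \frac{2\pi}{\theta}\,\big\|T_\phi^{\theta'}\big\|\,\mathcal{N}(\Delta). \tag{35}$$
Proof. Equation (24) leads to $\rho_\phi(\Delta) = \mathcal{T}_\theta(\chi_\Delta(H)F_\phi^\theta) \leq \|F_\phi^\theta\|\,\mathcal{N}(\Delta)$, and to
$$\mathcal{N}(\Delta) = \mathcal{T}_\theta\big(\chi_\Delta(H)F_\phi^\theta(F_\phi^\theta)^{-1}\big) \leq \rho_\phi(\Delta)\,\big\|(F_\phi^\theta)^{-1}\big\|.$$
Since $T_\phi^{\theta'} = (\theta/2\pi)\,\pi_W(F_\phi^\theta)$, the theorem follows. □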
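4.2. Proof of Proposition 2. Let $\theta > 2\pi$. The ground state $\phi_S$ of $H_S$ is a $\theta$-frame according to Theorem 3. Let $\psi_S = (\theta/2\pi)^{1/2}(T_{\phi_S}^{\theta'})^{-1/2}\phi_S$ be the associated $\theta$-tracial vector. Further set $\mathcal{H}_S = \overline{\pi_W(\mathcal{A}_\theta)\psi_S}$. In this section, $\pi_W$ denotes the restriction of the Weyl representation to $\mathcal{H}_S$. A unitary transformation $U : \mathcal{H}_S \to \ell^2(\mathbb{Z}^2)$ is defined by
$$(U\phi)(l) = \langle\psi_S|\mathcal{W}_\theta(l)^{-1}|\phi\rangle, \qquad \phi \in \mathcal{H}_S,\ l \in \mathbb{Z}^2.$$
Then $U\pi_W(A)U^* = \pi_{2D}(A)$ for all $A \in \mathcal{A}_\theta$. Moreover $U : \mathcal{S}(\mathbb{R})\cap\mathcal{H}_S \to s(\mathbb{Z}^2)$. As $U\psi_S = |0\rangle$,
$$M_W(H, \Delta; q, t) = \langle 0|\chi_\Delta(H_{2D})\,e^{\imath H_{2D}t}\,(UH_SU^*)^{q/2}\,e^{-\imath H_{2D}t}\,\chi_\Delta(H_{2D})|0\rangle.$$
Recall that $H_S$ is a polynomial of degree two in $Q$ and $P$. From (19) follows
$$UQU^* = -\theta^{1/2}X_1 + A_1, \qquad UPU^* = -\theta^{-1/2}X_2 + A_2,$$
where $\langle l|A_1|m\rangle = \langle\psi_S|\mathcal{W}_\theta(l-m)|Q\psi_S\rangle$ and $\langle l|A_2|m\rangle = \langle\psi_S|\mathcal{W}_\theta(l-m)|P\psi_S\rangle$. Because $\psi_S$, $Q\psi_S$ and $P\psi_S$ are in $\mathcal{S}(\mathbb{R})$, $A_1$ and $A_2$ are bounded operators. Using the standard operator inequalities $|AB| \leq \|A\||B|$ and $|A + B| \leq 2(|A| + |B|)$ and the commutation relation $[X_1, X_2] = 0$, it is now possible to deduce $M_W(H, \Delta; q, t) \leq c_1M_{2D}(H, \Delta; q, t) + c_2$ for two positive constants $c_1$ and $c_2$. An inequality $M_{2D}(H, \Delta; q, t) \leq c_1'M_W(H, \Delta; q, t) + c_2'$ is obtained similarly. This implies Proposition 2.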
4.3. Proof of Proposition 3.

Lemma 1. Let $Y_1, \ldots, Y_N$ be selfadjoint operators on $\mathcal{H}$ with common domain which satisfy $[Y_m, Y_n] = \imath c_{m,n}\mathbf{1}$. Then, if $c = \max_{m,n}(|c_{m,n}|) > 0$ and if $0 \leq \alpha \leq 1$,
$$\frac{1}{N}\sum_{n=1}^{N}|Y_n|^{2\alpha} \;\leq\; \left(\sum_{n=1}^{N}Y_n^2\right)^{\alpha} \;\leq\; \sum_{n=1}^{N}|Y_n|^{2\alpha} + 2N(N-1)c^\alpha. \tag{36}$$
Proof. For $\alpha = 0, 1$ both inequalities are trivial. For $0 < \alpha < 1$ the following identity holds:
$$A^\alpha = \frac{\sin(\pi\alpha)}{\pi}\int_0^\infty \frac{dv}{v^{1-\alpha}}\,\frac{A}{v+A}, \tag{37}$$
for a positive operator $A$. If $A = \sum_{n=1}^N Y_n^2$, then the left-hand inequality in (36) follows from $Y_n^2 \leq A$ and from the operator monotonicity of $A/(v+A) = 1 - v/(v+A)$. On the other hand
$$\frac{A}{v+A} = \sum_{n=1}^{N}\left(Y_n\frac{1}{v+A}Y_n + Y_n\left[Y_n, \frac{1}{v+A}\right]\right).$$
The first term of each summand is bounded by $Y_n^2/(v+Y_n^2)$. Noting $Y_n[Y_n, (v+A)^{-1}] = Y_n(v+A)^{-1}[A, Y_n](v+A)^{-1}$, and using the commutation rules for the $Y_n$'s, the second term in the r.h.s. is estimated by
$$\left\|-2\imath\sum_{m,n}c_{m,n}\,Y_n\frac{1}{v+A}Y_m\frac{1}{v+A}\right\| \leq 2\sum_{m,n}|c_{m,n}|\,\frac{1}{v+c_0},$$
where $c_0$ is the infimum of the spectrum of $A$. In the latter inequality $Y_n^2 \leq A$ has been used. By definition, there are $m, n$ such that $c_{m,n} = c > 0$ so that $Y_n^2 + Y_m^2 = (Y_m - \imath Y_n)(Y_m + \imath Y_n) + c\mathbf{1} \geq c\mathbf{1}$. Hence $c_0 \geq c$. Integrating over $v$, using Eq. (37), and remarking that $\sum_{m,n}|c_{m,n}| \leq N(N-1)c$ gives the result. □

If $S \in SL(2,\mathbb{Z})$ is a symmetry such that $S^r = 1$, the operators $Y_n = F_S^nQF_S^{-n}$ satisfy the hypothesis of Lemma 1, because calculating the derivative of (25) at $a = 0$ shows that each $Y_n$ is linear in $P$ and $Q$. Clearly $H_S = \frac{1}{2r}\sum_{n=1}^r Y_n^2$. If $H \in \mathcal{A}_\theta$ is $S$-invariant, then $H_S(t) = \frac{1}{2r}\sum_{n=1}^r F_S^nQ^2(t)F_S^{-n}$, where $A(t) = e^{\imath tH_W}Ae^{-\imath tH_W}$ whenever $A$ is an operator on $\mathcal{H}$. Therefore, if $0 \leq q \leq 2$, the inequality (36) leads to (with $\chi_\Delta = \chi_\Delta(H_W)$)
$$\langle\phi_S|\chi_\Delta H_S(t)^{q/2}\chi_\Delta|\phi_S\rangle \leq r(2r)^{-q/2}\langle\phi_S|\chi_\Delta|Q(t)|^q\chi_\Delta|\phi_S\rangle + 2r(r-1)\left(\frac{c}{2r}\right)^{q/2},$$
where $F_S\phi_S = \phi_S$ has been used. Proposition 3 is then a direct consequence of the definitions of the exponents $\beta^\pm_{1D}(H, \Delta; q)$, $\beta^\pm_W(H, \Delta; q)$ and of the following lemma:
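The integral representation (37) used in the proof of Lemma 1 can be tested in the simplest case of a positive scalar $A$. The sketch below substitutes $v = e^u$, which turns the integrand into $e^{\alpha u}A/(e^u + A)$, and applies the trapezoidal rule on a fixed window. The window and step size are illustrative assumptions, suitable only when $\alpha$ is not too close to 0 or 1.

```python
import math


def fractional_power(a, alpha, half_width=120.0, step=0.05):
    """Approximate a**alpha via Eq. (37) for a scalar a > 0 and 0 < alpha < 1."""
    # With v = exp(u): dv / v**(1 - alpha) = exp(alpha * u) du.
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        u = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0   # trapezoidal end-point weights
        total += weight * math.exp(alpha * u) * a / (math.exp(u) + a)
    return math.sin(math.pi * alpha) / math.pi * total * step


if __name__ == "__main__":
    for a, alpha in [(2.5, 0.3), (2.5, 0.5), (0.4, 0.7)]:
        print(a, alpha, fractional_power(a, alpha), a ** alpha)
```

Each line should show two values that agree to many digits, consistent with Eq. (37).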
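Lemma 2. Let $\phi \in \mathcal{S}(\mathbb{R})$, $\theta \geq 2\pi$ and $q \geq 0$. Then, there are two positive constants $c_0, c_1$ such that, for any element $B \in \mathcal{A}_\theta$,
$$\langle\phi|B_W^*|Q|^qB_W|\phi\rangle \leq c_0\int_0^{2\pi}\frac{d\omega}{2\pi}\,\langle 0|B_\omega^*|X|^qB_\omega|0\rangle + c_1,$$
where $B_W = \pi_W(B)$ and $B_\omega = \pi_\omega(B)$.

Proof. Definition (21) and identity (22) of Sect. 3.2 lead to
$$\langle\phi|B_W^*|Q|^qB_W|\phi\rangle = \theta^{(q-1)/2}\int_0^\theta d\omega\sum_{n,n'\in\mathbb{Z}}\overline{\phi\!\left(\frac{\omega - n\theta}{\sqrt{\theta}}\right)}\,\phi\!\left(\frac{\omega - n'\theta}{\sqrt{\theta}}\right)\langle n|K_\omega|n'\rangle,$$
with $K_\omega = B_\omega^*\,|(\omega/\theta) - X|^q\,B_\omega$. Since $K_\omega$ is a positive operator, the Schwarz inequality gives $|\langle n|K_\omega|n'\rangle| \leq (\langle n|K_\omega|n\rangle + \langle n'|K_\omega|n'\rangle)/2$. Both terms can be bounded similarly. The covariance property of $\pi_\omega$ (see Sect. 3.2) gives $\langle n|K_\omega|n\rangle = \langle 0|K_{\omega-n\theta}|0\rangle$. Since $\phi \in \mathcal{S}(\mathbb{R})$, summing up over $n'$ first, then over $n$, there are constants $C$, $c_1$ such that
$$\langle\phi|B_W^*|Q|^qB_W|\phi\rangle \leq C\int_{\mathbb{R}}dx\,|\phi(x)|\,\langle 0|K_{x\sqrt{\theta}}|0\rangle \leq C\int_{\mathbb{R}}dx\,|\phi(x)|\,\langle 0|B^*_{x\sqrt{\theta}}|X|^qB_{x\sqrt{\theta}}|0\rangle + c_1,$$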
q ), valid for q ≥ 0 and some suitable ≤ Cq (|x|q + |X| where the inequality |x − constant Cq , has been used. Thanks to the periodicity of πω , the r.h.s. of the latter estimate can be written as 2π ω − 2π n dω φ + c1 , q Bω |0 sup r.h.s. ≤ √ 0|Bω∗ |X| √ θ θ 0 0, 2 2 g Iα (T ) = E ∈ I T −α−1/ log(T ) ≤ dρ(E )e−(E−E ) T = ρ(BT (E)) ≤ T −α . I
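Then, for all $p \in [0, 1]$, there is $\alpha = \alpha(p, T)$ and a constant $c$ such that
$$\rho(I_\alpha(T)) \geq \frac{c\,T^{(p-1)\alpha}}{\log(T)}\int_I d\rho(E)\,\rho(B_T^g(E))^{p-1}.$$
Proof. Let $\kappa > 0$ and set $O_0 = \{E \in \mathrm{supp}(\rho)\,|\,\rho(B_T^g(E)) \leq T^{-\kappa}\}$. In addition, for $j = 1, \ldots, \kappa\log(T)$ let
$$O_j = \left\{E \in \mathrm{supp}(\rho)\,\middle|\,T^{-\kappa+(j-1)/\log(T)} \leq \rho(B_T^g(E)) \leq T^{-\kappa+j/\log(T)}\right\}.$$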
Then
$$\int d\rho(E)\,\rho(B_T^g(E))^{p-1} \leq \int_{O_0}d\rho(E)\,\rho(B_T^g(E))^{p-1} + \kappa\log(T)\max_{j=1\ldots\kappa\log(T)}\int_{O_j}d\rho(E)\,\rho(B_T^g(E))^{p-1}. \tag{38}$$
Let $j = j(T, p)$ be the index where the maximum is taken, and then set $\alpha = \alpha(T, p) = \kappa - j/\log(T)$. It only remains to show that the $O_0$-term is subdominant provided $\kappa$ is chosen sufficiently big. To do so, the support of $\rho$ is covered with intervals $(A_k)_{k=1\ldots K}$ of length $1/T$. Then $K \leq T|\mathrm{supp}(\rho)|$ (where $|A|$ denotes the diameter of $A$). If $a_k = \inf\{\rho(B_T^g(E))\,|\,E \in A_k\cap O_0\}$, then $a_k \leq T^{-\kappa}$ by definition of $O_0$. Moreover $\rho(B_T^g(E)) \geq \int_{A_k\cap O_0}d\rho(E')\,e^{-(E-E')^2T^2}$. In particular, if $E, E' \in A_k\cap O_0$, then $|E - E'|T \leq 1$ implying $\rho(B_T^g(E)) \geq e^{-1}\rho(A_k\cap O_0)$ and thus, $\rho(A_k\cap O_0) \leq e\,a_k$. Hence ($p - 1 \leq 0$):
$$\int_{O_0}d\rho(E)\,\rho(B_T^g(E))^{p-1} \leq \sum_{k\leq K}\rho(A_k\cap O_0)\,a_k^{p-1} \leq e\sum_{k\leq K}a_k^p \leq e\,T^{1-\kappa p}\,|\mathrm{supp}(\rho)|.$$
Hence choosing $\kappa = 2/p$, for example, provides a subdominant contribution in (38) such that (38) fulfills the desired bound. □

5.2. Proof of Proposition 4. This section is devoted to the proof of Proposition 4 assuming Proposition 6. Since $\hat U = e^{2\imath\pi Q/\sqrt{\theta}} = \mathcal{W}_{\theta'}(0, 1)$ commutes with $\pi_W(\mathcal{A}_\theta)$, it commutes, in particular, with $H_W$. Therefore the pair $(H_W, \hat U)$ has a joint spectrum contained in $\mathbb{R}\times\mathbb{T}$. Let $m_S$ denote the spectral measure of the pair relative to $\phi_S$ defined by
$$\int_{\mathbb{R}\times\mathbb{T}}dm_S(E, \eta)\,F(E, e^{\imath\eta}) = \langle\phi_S|F(H_W, \hat U)|\phi_S\rangle, \qquad \forall F \in C_0(\mathbb{R}\times\mathbb{T}).$$
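The marginal probabilities associated with $m_S$ are respectively $d\rho_S(E)$, the spectral measure of $H_W$, and $d\eta\,\|G_{\theta\eta/2\pi}\phi_S\|^2_{\ell^2}\,\theta/(2\pi)$ for $\eta \in \mathbb{T}$, the spectral measure of $\hat U$. Thanks to the Radon–Nikodym theorem, $m_S$ can be written either as
$$\int_{\mathbb{R}\times\mathbb{T}}dm_S(E, \eta)\,F(E, e^{\imath\eta}) = \frac{\theta}{2\pi}\int_0^{2\pi}d\eta\int_{\mathbb{R}}d\mu_{(\theta\eta/2\pi)}(E)\,F(E, e^{\imath\eta}), \tag{39}$$
(where $\mu_\omega$ is the spectral measure of $H_\omega$ relative to $G_\omega\phi_S$), or as
$$\int_{\mathbb{R}\times\mathbb{T}}dm_S(E, \eta)\,F(E, e^{\imath\eta}) = \int_{\mathbb{R}}d\rho_S(E)\int_0^{2\pi}d\nu_E(\eta)\,F(E, e^{\imath\eta}), \tag{40}$$
for some probability measure $\nu_E$ depending $\rho_S$-measurably upon $E$. Due to the spectral theorem, for every $n \in \mathbb{Z}$, there is a function $g_n(\omega, \cdot) \in L^2(\mathbb{R}, \mu_\omega)$ such that
$$\langle G_\omega\phi_S|f(H_\omega)|n\rangle = \int_{\mathbb{R}}d\mu_\omega(E)\,f(E)\,g_n(\omega, E). \tag{41}$$
In the following lemma, $\tilde g_n(\eta, E)$ stands for $\theta^{-1/4}g_n(\theta\eta/2\pi, E)$: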
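Lemma 4. Let $\psi \in \mathcal{S}(\mathbb{R})$. Then the representative in $L^2(\mathbb{R}, \rho_S)$ of the projection of $\psi$ on the $H_W$-cyclic component of $\phi_S$ is given by
$$\tilde\psi(E) = \int_0^{2\pi}d\nu_E(\eta)\sum_{n\in\mathbb{Z}}\tilde g_n(\eta, E)\,\psi\big((\eta - 2\pi n)\theta^{1/2}/2\pi\big).$$
Proof. $\tilde\psi$ is defined by $\langle\phi_S|f(H_W)|\psi\rangle = \int_{\mathbb{R}}d\rho_S(E)\,f(E)\,\tilde\psi(E)$ for every $f \in C_0(\mathbb{R})$. On the other hand, thanks to Eq. (22),
$$\langle\phi_S|f(H_W)|\psi\rangle = \int_0^\theta d\omega\,\langle G_\omega\phi_S|f(H_\omega)|G_\omega\psi\rangle = \sum_{n\in\mathbb{Z}}\int_0^\theta d\omega\,\langle G_\omega\phi_S|f(H_\omega)|n\rangle\,(G_\omega\psi)(n).$$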
Then, using the definition (41) of gn together with Eqs. (39) and (40), and changing from ω to η, gives the result. & ' Proof of Prop. 4. Let 1 ⊂ R be a Borel set and, for δ > 0, let Q(1, δ) be defined by ∞ Q(1, δ) = dρS (E) e−δ(n+1/2) |7n,S (E)|2 . 1
n=0 (n)
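Thanks to Lemma 4 applied to the eigenstates $\phi_S^{(n)}$ of $H_S$ (see Eq. (30)), it can be written as
$$Q(\Delta, \delta) = \int_\Delta d\rho_S\int d\nu_E(\eta)\,d\nu_E(\eta')\sum_{m,m'}\tilde g_m(\eta, E)\,\overline{\tilde g_{m'}(\eta', E)}\;\sum_{n=0}^{\infty}e^{-\delta(n+1/2)}\,\phi_S^{(n)}\big((\eta - 2\pi m)\theta^{1/2}/2\pi\big)\,\overline{\phi_S^{(n)}\big((\eta' - 2\pi m')\theta^{1/2}/2\pi\big)}.$$
The last sum on the r.h.s. of this identity reconstructs the Mehler kernel of Eq. (32) with $t = \delta/\mu$. It will be convenient to define
$$G_\delta(E; x) = \sum_{m'}\int d\nu_E(\eta')\,\big|M_S\big(\delta/\mu;\, x,\, (\eta' - 2\pi m')\theta^{1/2}/2\pi\big)\big|. \tag{42}$$
Since the Mehler kernel decays fast, this sum converges. Using the Schwarz inequality together with the symmetry $(m, \eta) \leftrightarrow (m', \eta')$, $Q(\Delta, \delta)$ can be bounded from above by
$$Q(\Delta, \delta) \leq \int_\Delta d\rho_S\int d\nu_E(\eta)\sum_m|\tilde g_m(\eta, E)|^2\,G_\delta\big(E; (\eta - 2\pi m)\theta^{1/2}/2\pi\big).$$
Thanks to Eqs. (39) and (40), and changing again from $\eta$ to $\omega$, this bound can be written as
$$Q(\Delta, \delta) \leq \int_0^\theta\frac{d\omega}{\theta^{1/2}}\sum_m\int_\Delta d\mu_\omega(E)\,|g_m(\omega, E)|^2\,G_\delta\big(E; (\omega - m\theta)/\theta^{1/2}\big).$$
If now $P_\omega$ is the projection on the $H_\omega$-cyclic component of $G_\omega\phi_S$ in $\ell^2(\mathbb{Z})$, the definition (41) of $g_m$ and the covariance lead to the following inequality:
$$\int d\mu_\omega(E)\,|g_m(\omega, E)|^2f(E) = \langle m|P_\omega f(H_\omega)P_\omega|m\rangle \leq \langle 0|f(H_{\omega-m\theta})|0\rangle,$$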
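valid for $f \in C_0(\mathbb{R})$, $f \geq 0$, because $H_\omega$ commutes with $P_\omega$ and the latter is a projection. Let then $\mu_\omega^{(0)}$ be the spectral measure of $H_\omega$ relative to the vector $|0\rangle$. The previous estimate implies
$$Q(\Delta, \delta) \leq \theta^{-1/2}\sum_m\int_0^\theta d\omega\int_\Delta d\mu^{(0)}_{\omega-m\theta}(E)\,G_\delta\big(E; (\omega - m\theta)/\theta^{1/2}\big) \leq \theta^{-1/2}\int_{-\infty}^{\infty}d\omega\int_\Delta d\mu^{(0)}_\omega(E)\,G_\delta\big(E; \omega/\theta^{1/2}\big).$$
Since $\mu_\omega^{(0)}$ is $2\pi$-periodic with respect to $\omega$, the latter integral can be decomposed into a sum over intervals of length $2\pi$ leading to the following estimate:
$$Q(\Delta, \delta) \leq \theta^{-1/2}\int_0^{2\pi}d\omega\int_\Delta d\mu^{(0)}_\omega(E)\sum_{k\in\mathbb{Z}}G_\delta\big(E; (\omega + 2\pi k)/\theta^{1/2}\big).$$
Definitions (17) of the trace on $\mathcal{A}_\theta$, (3) of the DOS and (42) of $G_\delta$ give
$$Q(\Delta, \delta) \leq \frac{2\pi}{\theta^{1/2}}\int_{\Delta\times[0,2\pi]}d\mathcal{N}(E)\,d\nu_E(\eta)\;\times\sum_{(k,m)\in\mathbb{Z}^2}\left|M_S\!\left(\delta/\mu;\ \frac{\omega + 2\pi k}{\theta^{1/2}},\ \frac{(\eta - 2\pi m)\theta^{1/2}}{2\pi}\right)\right|.$$
The result of Proposition 6 can now be used. Remarking that $\nu_E$ is a probability, and using the equivalence between $\rho_S$ and the DOS (Theorems 3 and 4 combined), the last estimate implies $Q(\Delta, \delta) \leq c_\varepsilon\,\rho_S(\Delta)\,\delta^{-(1/2+\varepsilon)}$, for some suitable constant $c_\varepsilon$. Since this inequality holds for all Borel subsets $\Delta$ of $\mathbb{R}$, Proposition 4 is proven. □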
5.3. Proof of Proposition 6. If $\alpha = \theta/2\pi \in [0, 1]$ is an irrational number, a rational approximant is a rational number $p/q$, with $p, q$ prime to each other, such that $|\alpha - p/q| < q^{-2}$. The continued fraction expansion $[a_1, \cdots, a_n, \cdots]$ of $\alpha$ [Her] provides an infinite sequence $p_n/q_n$ of such approximants, the principal convergents, recursively defined by $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = 0$, $q_0 = 1$ and $s_{n+1} = a_{n+1}s_n + s_{n-1}$ if $s = p, q$. It can be proved (see [Her], Prop. 7.8.3) that $\alpha$ is a number of Roth type (see Eq. (2) in Sect. 2) if and only if $\sum_{n=1}^\infty a_{n+1}/q_n^\varepsilon < \infty$ for all $\varepsilon > 0$.

The proof of Proposition 6 relies upon the so-called Denjoy–Koksma inequality [Her]. Let $\varphi$ be a periodic function on $\mathbb{R}$ with period 1, of bounded total variation $\mathrm{Var}(\varphi)$ over a period interval. Then (see [Her], Theorem 3.1)

Theorem (Denjoy–Koksma inequality). Let $\alpha \in [0, 1]$ be irrational and let $\varphi$ be a real valued function on $\mathbb{R}$ of period one. Then, if $p/q$ is a rational approximant of $\alpha$,
$$\left|\sum_{j=1}^{q}\varphi(x + j\alpha) - q\int_0^1dy\,\varphi(y)\right| \leq \mathrm{Var}(\varphi).$$
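The recursion for the principal convergents and the Denjoy–Koksma bound can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $\alpha = (\sqrt{5}-1)/2$ (all partial quotients equal to 1) and the periodized indicator of an interval of width 0.3, whose variation over a period is 2. Floating-point arithmetic limits the number of reliable partial quotients, so only the first twelve convergents are used.

```python
import math


def convergents(alpha, count):
    """Principal convergents (p_n, q_n), n = 1..count, of alpha in (0, 1)."""
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = 0, 1             # p_0, q_0
    x = alpha
    result = []
    for _ in range(count):
        a = math.floor(1.0 / x)          # next partial quotient
        x = 1.0 / x - a                  # Gauss map
        p_prev, p = p, a * p + p_prev    # s_{n+1} = a_{n+1} s_n + s_{n-1}
        q_prev, q = q, a * q + q_prev
        result.append((p, q))
    return result


def indicator(width):
    """Period-1 indicator of [0, width); its variation over a period is 2."""
    return lambda y: 1.0 if (y % 1.0) < width else 0.0


def birkhoff_deviation(phi, mean, alpha, q, x):
    """|sum_{j=1}^q phi(x + j alpha) - q * mean|, the l.h.s. of the inequality."""
    total = sum(phi(x + j * alpha) for j in range(1, q + 1))
    return abs(total - q * mean)


ALPHA = (math.sqrt(5.0) - 1.0) / 2.0
WIDTH = 0.3
CONVERGENTS = convergents(ALPHA, 12)
WORST = max(
    birkhoff_deviation(indicator(WIDTH), WIDTH, ALPHA, q, x)
    for _, q in CONVERGENTS
    for x in (0.0, 0.1, 0.37, 0.85)
)

if __name__ == "__main__":
    print(CONVERGENTS[:5])
    print(WORST)   # should not exceed Var(phi) = 2
```

The denominators produced are Fibonacci numbers, and the largest deviation found over the sampled starting points stays below the variation of the test function.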
Proposition 6 is a direct consequence of the definition of the Mehler kernel (see Eq. (32)) and of the following result:

Lemma 5. If $\delta > 0$, let $F_\delta$ be the function on $\mathbb{R}^2$ defined by $F_\delta(x, y) = \delta(x + y)^2 + \delta^{-1}(x - y)^2$. If $\alpha$ is a number of Roth type, then for any $a > 0$, $\varepsilon > 0$, there is $c_\varepsilon > 0$ such that
$$\sup_{x,y\in\mathbb{R}}\sum_{(k,m)\in\mathbb{Z}^2}e^{-aF_\delta(x+k,\,y+m\alpha)} \leq c_\varepsilon\,\delta^{-\varepsilon}, \qquad \forall\delta \in (0, 1).$$
Proof. Let $(x_0, y_0) \in \mathbb{R}^2$ be fixed and set $L = \{(x_0 + k, y_0 + m\alpha) \in \mathbb{R}^2\,|\,(k, m) \in \mathbb{Z}^2\}$. If $S(x_0, y_0) = \sum_{k,m}e^{-aF_\delta(x_0+k,\,y_0+m\alpha)}$, then $S$ is periodic of period 1 in $x_0$ and of period $\alpha$ in $y_0$. Therefore, it is enough to assume $0 \leq x_0 < 1$ and $0 \leq y_0 < 1$ (since $0 < \alpha < 1$). For $0 < \sigma < 1$ and for $j \in \mathbb{N}$, let $L_j$ be the set of points $(x, y) \in L$ for which $j^2\delta^{-\sigma} \leq F_\delta(x, y) < (j + 1)^2\delta^{-\sigma}$. Thus
$$S(x_0, y_0) \leq \sum_{j=0}^{\infty}e^{-aj^2\delta^{-\sigma}}\,|L_j|, \tag{43}$$
where $|A|$ denotes the number of points in $A$. $L_j$ is contained in an elliptic crown with axes along the two diagonals $x = \pm y$. In particular,
$$(x, y) \in L_j \;\Rightarrow\; \max\{|x|, |y|\} \leq (j + 1)\delta^{-(1+\sigma)/2} \ \text{ and } \ |x - y| \leq (j + 1)\delta^{(1-\sigma)/2}. \tag{44}$$
If $j \geq 1$, the number of points contained in $L_j$ can be estimated by counting the number of rectangular cells of sizes $(1, \alpha)$ centered at points of $L$ and meeting the elliptic crown. Since this crown is included inside the square $\max\{|x|, |y|\} \leq (j + 1)\delta^{-(1+\sigma)/2}$ it is enough to count such cells meeting this square. Such cells are all included inside the square $C = \{(x, y) \in \mathbb{R}^2\,|\,\max\{|x|, |y|\} \leq (j + 2)\delta^{-(1+\sigma)/2}\}$ (since $\delta \leq 1$). Hence the number of such cells is certainly dominated by the ratio of the area of $C$ to the area of each cell, namely
$$|L_j| \leq \frac{(j + 2)^2}{\alpha}\,\delta^{-(1+\sigma)}.$$
Therefore, the part of the sum in (43) coming from $j \geq 1$ converges to zero as $\delta \downarrow 0$. In particular, it is bounded by a constant $c_1$ that is independent of $(x_0, y_0)$. Thus, it is sufficient to consider the term $j = 0$ only. Let $\varphi$ be the function on $\mathbb{R}$ defined by $\varphi(x) = \sum_{k\in\mathbb{Z}}\chi_I(x + y_0 - x_0 + k)$, where $I$ is the interval $I = [-\delta^{(1-\sigma)/2}, \delta^{(1-\sigma)/2}] \subset \mathbb{R}$. It is a periodic function of period 1 with $\mathrm{Var}(\varphi) = 2$. Moreover, using (44) it can be checked easily that
$$S(x_0, y_0) \leq c_1 + \sum_{|m|<M}\varphi(m\alpha) \leq c_1 + \sum_{m=0}^{M-1}\big(\varphi(m\alpha) + \varphi(-m\alpha)\big),$$
provided $M \geq 3\delta^{-(1+\sigma)/2}/\alpha$. For indeed, $(x, y) \in L_0$ only if $|y_0 + m\alpha| \leq \delta^{-(1+\sigma)/2}$ for some $m \in \mathbb{Z}$. Let then $n \in \mathbb{N}$ be such that $q_n \leq M < q_{n+1}$, where the $p_n/q_n$'s are the principal convergents of $\alpha$. Replacing $M$ by $q_{n+1}$ in the r.h.s. gives an upper
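bound. By the Denjoy–Koksma inequality, the r.h.s. is therefore bounded from above by $c_1 + 4q_{n+1}\delta^{(1+\sigma)/2}$. Since $\alpha$ is a number of Roth type, $q_{n+1} \leq (a_{n+1} + 1)q_n \leq c_2\cdot q_n^{1+\sigma}$, thanks to Prop. 7.8.3 in [Her] (see above). It is important to notice that $c_2$ only depends upon $\alpha$ and the choice of the exponent $\sigma$. Collecting all inequalities gives
$$S(x_0, y_0) \leq c_1 + \frac{12\cdot c_2}{\alpha}\,\delta^{-2\sigma}.$$
Choosing $\sigma = \varepsilon/2$ and remarking that none of the constants on the r.h.s. depends on $(x_0, y_0)$ leads to the result. □

Appendix: Proofs of Various Results on Weyl Operators

Proof of Eqs. (13) and (14). Due to the polarization principle, (13) is equivalent to
$$\langle\phi|\mathcal{W}(a)|\phi\rangle\,\overline{\langle\psi|\mathcal{W}(a)|\psi\rangle} = \int_{\mathbb{R}^2}\frac{d^2b}{2\pi}\,e^{\imath a\wedge b}\,|\langle\phi|\mathcal{W}(b)|\psi\rangle|^2. \tag{45}$$
By inverse Fourier transform, (45) is equivalent to
$$|\langle\phi|\mathcal{W}(b)|\psi\rangle|^2 = \int_{\mathbb{R}^2}\frac{d^2a}{2\pi}\,e^{\imath b\wedge a}\,\langle\phi|\mathcal{W}(a)|\phi\rangle\,\overline{\langle\psi|\mathcal{W}(a)|\psi\rangle}, \tag{46}$$
which is equivalent to (14), so that it is sufficient to prove (45). Using (11),
$$\text{r.h.s. of (45)} = \int_{\mathbb{R}^2}\frac{db_1\,db_2}{2\pi}\int_{\mathbb{R}}dx\int_{\mathbb{R}}dy\;\overline{\phi(x)}\,\phi(y)\,\psi(x + b_1)\,\overline{\psi(y + b_1)}\;e^{\imath(b_2(x-y+a_1)-a_2b_1)}.$$
The integral over $b_2$ can be immediately evaluated by $\int_{\mathbb{R}}db_2\,e^{\imath b_2(x-y+a_1)} = 2\pi\,\delta(y - x - a_1)$. Thus the integration over $y$ is elementary. Changing variable from $b_1$ to $x' = x + b_1$ therefore gives
$$\text{r.h.s. of (45)} = \int_{\mathbb{R}}dx\;\overline{\phi(x)}\,\phi(x + a_1)\,e^{\imath a_2x+\imath\frac{a_1a_2}{2}}\int_{\mathbb{R}}dx'\;\psi(x')\,\overline{\psi(x' + a_1)}\,e^{-\imath a_2x'-\imath\frac{a_1a_2}{2}},$$
which is precisely the l.h.s. of (45). □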
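Proof of Eq. (22). It is sufficient to verify (22) for the generators $A = W_\theta(m)$, $m \in \mathbb{Z}^2$, of $\mathcal{A}_\theta$. For such $A$,
$$\text{r.h.s. of (22)} = \int_0^\theta\frac{d\omega}{\sqrt{\theta}}\sum_{n,l\in\mathbb{Z}}\overline{\phi\!\left(\frac{\omega - n\theta}{\sqrt{\theta}}\right)}\,\langle n|\pi_\omega(W_\theta(m))|l\rangle\,\psi\!\left(\frac{\omega - l\theta}{\sqrt{\theta}}\right).$$
As $\langle n|\pi_\omega(W_\theta(m))|l\rangle = e^{\imath\theta m_1m_2/2}\,e^{\imath(\omega-l\theta)m_2}\,\delta_{n,l+m_1}$, the sum over $n$ can be immediately computed, and the one over $l$ can be combined with the integral over $\omega$ in order to give
$$\text{r.h.s. of (22)} = \int_{\mathbb{R}}\frac{dx}{\sqrt{\theta}}\;\overline{\phi\!\left(\frac{x - m_1\theta}{\sqrt{\theta}}\right)}\,e^{\imath\frac{\theta m_1m_2}{2}}\,e^{\imath xm_2}\,\psi\!\left(\frac{x}{\sqrt{\theta}}\right).$$
Changing variable $y = (x - m_1\theta)/\sqrt{\theta}$ and identifying $\mathcal{W}(\sqrt{\theta}\,m)$ shows
$$\text{r.h.s. of (22)} = \int_{\mathbb{R}}dy\;\overline{\phi(y)}\,\big(\mathcal{W}(\sqrt{\theta}\,m)\psi\big)(y),$$
namely the l.h.s. of (22). □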
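Proof of Proposition 7. For $f \in \mathcal{S}(\mathbb{R}^2)$, let $\tilde f$ be its symplectic Fourier transform defined by ($l, m \in \mathbb{R}^2$):
$$\tilde f(l) = \int_{\mathbb{R}^2}\frac{d^2m}{2\pi}\,e^{\imath l\wedge m}f(m) \quad\Leftrightarrow\quad f(m) = \int_{\mathbb{R}^2}\frac{d^2l}{2\pi}\,e^{\imath m\wedge l}\tilde f(l).$$
Then the classical Poisson summation formula reads
$$\sum_{m\in\mathbb{Z}^2}f(m) = 2\pi\sum_{l\in\mathbb{Z}^2}\tilde f(2\pi l).$$
Setting $f(m) = \langle\phi|\mathcal{W}(\sqrt{\theta}\,m)|\phi\rangle\,\overline{\langle\psi|\mathcal{W}(\sqrt{\theta}\,m)|\psi\rangle}$, Eq. (46) leads to
$$\tilde f(2\pi l) = \frac{1}{\theta}\left|\langle\psi|\mathcal{W}\!\left(\frac{2\pi}{\sqrt{\theta}}\,l\right)|\phi\rangle\right|^2.$$
Inserting this into the Poisson summation formula and recalling the notation (20) gives (23). □

Proof of Eq. (24). By (16) and (20), $\pi_W(A) = \sum_{l\in\mathbb{Z}^2}a_l\,\mathcal{W}_\theta(l)$ with $a_l = \mathcal{T}_\theta(W_\theta(l)^{-1}A)$. Thus
$$\langle\psi|\pi_W(A)|\psi\rangle = \sum_{l\in\mathbb{Z}^2}a_l\,\langle\psi|\mathcal{W}_\theta(l)|\psi\rangle = \mathcal{T}_\theta\!\left(\sum_{l\in\mathbb{Z}^2}\langle\psi|\mathcal{W}_\theta(l)|\psi\rangle\,W_\theta(l)^{-1}A\right).$$
Comparing with the Poisson summation formula (23) shows (24). □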
Proof of Proposition 8. Because of the freedom of phase and relation (12), it is sufficient to verify all implementation formulas (25) for the Weyl operators $e^{\imath Q}$ and $e^{\imath P}$ or equivalently (on the domain of) their generators $Q$ and $P$. Concerning the first formula in (26), it thus follows from the identities
$$e^{-\imath\kappa Q^2/2}\,Q\,e^{\imath\kappa Q^2/2} = Q, \qquad e^{-\imath\kappa Q^2/2}\,P\,e^{\imath\kappa Q^2/2} = \kappa Q + P.$$
Next let us consider the dilations on $L^2(\mathbb{R})$ defined by $(D(a)\phi)(x) = \sqrt{e^a}\,\phi(e^ax)$. Its generator is computed by
$$\left.\frac{d}{da}(D(a)\phi)(x)\right|_{a=0} = \frac{\imath}{2}(QP + PQ)\phi(x),$$
so that for $a = -\ln(\lambda)$,
$$\big(e^{-\imath\ln\lambda\,(QP+PQ)/2}\phi\big)(x) = \sqrt{\frac{1}{\lambda}}\;\phi\!\left(\frac{x}{\lambda}\right).$$
This immediately allows us to verify
$$e^{-\imath\ln\lambda\,(QP+PQ)/2}\,Q\,e^{\imath\ln\lambda\,(QP+PQ)/2} = \frac{1}{\lambda}\,Q, \qquad e^{-\imath\ln\lambda\,(QP+PQ)/2}\,P\,e^{\imath\ln\lambda\,(QP+PQ)/2} = \lambda P,$$
which proves the second formula in (26). To prove the last one, we use the annihilation-creation operators $a = (Q - \imath P)/\sqrt{2}$ and $a^* = (Q + \imath P)/\sqrt{2}$. As $(P^2 + Q^2 - 1)/2 = a^*a$ and $e^{-\imath sa^*a}\,a\,e^{\imath sa^*a} = e^{\imath s}a$, the formula follows after decomposing $\mathcal{W}(a)$ into $a$ and $a^*$. Finally we search the integral kernel for $K = e^{-\imath sa^*a}$, notably $(K\phi)(x) = \int dy\,k(x, y)\phi(y)$. If $\phi_{S_4}^{(n)}$ are the Hermite functions, then $K\phi_{S_4}^{(n)} = e^{\imath sn}\phi_{S_4}^{(n)}$. Equivalently, $k$ has to satisfy $a_yk = e^{\imath s}a_x^*k$ and $K\phi^{(0)} = \phi^{(0)}$ (here the index on the $a$'s indicates with respect to which variable the operator acts). An Ansatz $k(x, y) = e^{-b(x^2+y^2)+cxy+d}$ leads to the integral kernel in (27). □

Proof of Proposition 9. Let us set
$$R = \begin{pmatrix}\cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma\end{pmatrix}, \qquad D = \begin{pmatrix}\lambda & 0 \\ 0 & \lambda^{-1}\end{pmatrix}.$$
Then, using the notations and formulas in Subsect. 3.3,
$$H_S = \frac{\mu}{2}\,\langle(RD)^tK|(RD)^tK\rangle = \mu\,F_RF_D\,H_{S_4}\,F_D^{-1}F_R^{-1}, \qquad \phi_S = F_RF_D\,\phi_{S_4}. \tag{47}$$
Now $\phi_{S_4}$ is known to be the normalized gaussian. Using the implementation formulas of Proposition 8, it is straightforward to calculate the gaussian integrals giving (31). The Mehler kernel $M_{S_4}(t; x, y)$ for $H_{S_4} = (P^2 + Q^2)/2$ is well-known (and can be read off (27) at imaginary time). Using (47) and the definition (10),
$$M_S(t; x, y) = \int_{\mathbb{R}}dx'\int_{\mathbb{R}}dy'\;\langle x|F_RF_D|x'\rangle\,M_{S_4}(t; x', y')\,\langle y'|F_D^{-1}F_R^{-1}|y\rangle.$$
The gaussian integrals herein give rise to (32). □

Let us conclude with the proof of the complementary result given in Sect. 2.

Proof of Proposition 5. The commutant $\mathcal{B}$ of the abelian C$^*$-algebra generated by $H_W$ contains the commutant of $\pi_W(\mathcal{A}_\theta)$, that is the von Neumann algebra $\pi_W(\mathcal{A}_{\theta'})''$ generated by $\pi_W(\mathcal{A}_{\theta'})$. As $\pi_W(\mathcal{A}_{\theta'})''$ is of type II$_1$ [Sak], there exist $*$-endomorphisms $\eta_q : \mathrm{Mat}_{q\times q} \to \mathcal{B}$ for every $q \in \mathbb{N}$ (here $\mathrm{Mat}_{q\times q}$ denotes the complex $q\times q$ matrices). According to the spectral theorem, $\mathcal{H}$ decomposes according to the multiplicity of $\pi_W(H)$: $\mathcal{H} = \oplus_{n\geq 1}L^2(X_n, \mu_n)\otimes\mathbb{C}^n \oplus L^2(X_\infty, \mu_\infty)\otimes\ell^2(\mathbb{N})$, where the $\mu_n$'s are positive measures with pairwise disjoint supports $X_n \subset \mathbb{R}$. In this representation, $\pi_W(H) = \oplus_{n\geq 1}\mathrm{Mult}(E)\otimes\mathbf{1}_n \oplus \mathrm{Mult}(E)\otimes\mathbf{1}_\infty$ (here $\mathrm{Mult}(E)$ denotes the multiplication by the identity on $\mathbb{R}$) and $\mathcal{B} = \oplus_{n\geq 1}L^\infty(X_n, \mu_n)\otimes\mathrm{Mat}_{n\times n} \oplus L^\infty(X_\infty, \mu_\infty)\otimes\mathcal{B}(\ell^2(\mathbb{N}))$. Let $P_n$ be the projection on $L^2(X_n, \mu_n)\otimes\mathbb{C}^n$. Then $P_n\mathcal{B}P_n = L^\infty(X_n, \mu_n)\otimes\mathrm{Mat}_{n\times n}$. Moreover $\phi_{n,x}(B) = P_nBP_n(x)$ defines a $*$-endomorphism from $\mathcal{B}$ to $\mathrm{Mat}_{n\times n}$ for $\mu_n$-almost all $x \in X_n$. Combining with $\eta_q$, one gets $*$-endomorphisms $\phi_{n,x}\circ\eta_q : \mathrm{Mat}_{q\times q} \to \mathrm{Mat}_{n\times n}$ for any $q$ satisfying $\phi_{n,x}\circ\eta_q(\mathbf{1}_q) = \mathbf{1}_n$. This is impossible for any $q > n$ so that $X_n = \emptyset$ for all $n \geq 1$. If $H_W$ had a cyclic vector, its spectrum would be simple. □
Acknowledgements. We would like to thank B. Simon, R. Seiler and S. Jitomirskaya for very useful comments. The work of H. S.-B. was supported by NSF Grant DMS-0070755 and the DFG Grant SCHU 1358/1-1. J.B. wants to thank the Institut Universitaire de France and the MSRI at Berkeley for providing support while this work was in progress.
Note added in proof. After this work was completed we learned that S. Tcheremchantsev in “Mixed lower bounds for quantum transport” extended Lemma 3 to arbitrary values of q. This implies that the Main Theorem (i) holds for all q > 0 and (ii) for q ∈ (0, 2].

References

[BSB] Barbaroux, J.-M., Schulz-Baldes, H.: Anomalous transport in presence of self-similar spectra. Annales I.H.P. Phys. Théo. 71, 539–559 (1999)
[BGT] Barbaroux, J.M., Germinet, F., Tcheremchantsev, S.: Nonlinear variation of diffusion exponents in quantum dynamics. C.R. Acad. Sci. Paris 330, série I, 409–414 (2000); Fractal Dimensions and the Phenomenon of Intermittency in Quantum Dynamics. Duke Math. J. 110, 161–193 (2001)
[Bel] Bellissard, J.: K-theory of C∗-algebras in solid state physics. In: Statistical Mechanics and Field Theory: Mathematical Aspects, Lecture Notes in Physics 257, edited by T. Dorlas, M. Hugenholtz, M. Winnink, Berlin: Springer-Verlag, 1986, pp. 99–156; Gap labelling theorems for Schrödinger operators. In: From Number Theory to Physics, Berlin: Springer, 1992, pp. 538–630
[Bel94] Bellissard, J.: Lipshitz continuity of gap boundaries for Hofstadter-like spectra. Commun. Math. Phys. 160, 599–613 (1994)
[Com] Combes, J.-M.: In: Differential Equations with Applications to Mathematical Physics. Ames, W.F., Harrell, E.M., Herod J.V., eds, Boston: Academic Press, 1993
[Con] Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
[DS] Deift, P., Simon, B.: Almost periodic Schrödinger operators. III. The absolutely continuous spectrum in one dimension. Commun. Math. Phys. 90, 389–411 (1983)
[Gua] Guarneri, I.: Spectral properties of quantum diffusion on discrete lattices. Europhys. Lett. 10, 95–100 (1989); On an estimate concerning quantum diffusion in the presence of a fractal spectrum. Europhys. Lett. 21, 729–733 (1993)
[GM] Guarneri, I., Mantica, G.: Multifractal Energy Spectra and Their Dynamical Implications. Phys. Rev. Lett. 73, 3379–3383 (1994)
[GSB1] Guarneri, I., Schulz-Baldes, H.: Upper bounds for quantum dynamics governed by Jacobi matrices with self-similar spectra. Rev. Math. Phys. 11, 1249–1268 (1999)
[GSB2] Guarneri, I., Schulz-Baldes, H.: Lower bounds on wave packet propagation by packing dimensions of spectral measures. Elect. J. Math. Phys. 5 (1999)
[GSB3] Guarneri, I., Schulz-Baldes, H.: Intermittent lower bound on quantum diffusion. Lett. Math. Phys. 49, 317–324 (1999)
[Har] Harper, P. G.: Single Band Motion of Conduction Electrons in a Uniform Magnetic Field. Proc. Phys. Soc. Lond. A 68, 874–878 (1955)
[Her] Herman, M. R.: Sur la conjugaison différentiable des difféomorphismes du cercle à des rotations. Publications I.H.E.S. 49, 5–233 (1979)
[Jit] Jitomirskaya, S.: Metal-Insulator Transition for the Almost Mathieu Operator. Ann. of Math. 150, 1159–1175 (1999)
[Las] Last, Y.: Quantum Dynamics and decomposition of singular continuous spectra. J. Funct. Anal. 142, 402–445 (1996)
[KKKG] Ketzmerick, R., Kruse, K., Kraut, S. and Geisel, T.: What determines the spreading of a wave packet?. Phys. Rev. Lett. 79, 1959–1962 (1997)
[KL] Kiselev, A., Last, Y.: Solutions, spectrum, and dynamics for Schrödinger operators on infinite domains. Duke Math. J. 102, 125–150 (2000)
[Man] Mantica, G.: Quantum intermittency in almost periodic systems derived from their spectral properties. Physica D 103, 576–589 (1997); Wave Propagation in Almost-Periodic Structures. Physica D 109, 113–127 (1997)
[Per] Perelomov, A.: Generalized Coherent States and Their Applications. Berlin: Springer, 1986
[Pie] Piéchon, F.: Anomalous Diffusion Properties of Wave Packets on Quasiperiodic Chains. Phys. Rev. Lett. 76, 4372–4375 (1996)
[Ram] Rammal, R.: Landau level spectrum of Bloch electron in a honeycomb lattice. J. Phys. France 46, 1345–1354 (1985)
[Rie] Rieffel, M. A.: C∗-algebras associated with irrational rotations. Pac. J. Math. 93, 415–429 (1981)
[RP] Rüdinger, A., Piéchon, F.: Hofstadter rules and generalized dimensions of the spectrum of Harper’s equation. J. Phys. A 30, 117–128 (1997)
[Sak] Sakai, S.: C∗-algebras and W∗-algebras. Berlin: Springer, 1971
[SBB] Schulz-Baldes, H., Bellissard, J.: Anomalous transport: A mathematical framework. Rev. Math. Phys. 10, 1–46 (1998)
[Sei] Seip, K.: Density theorems for sampling and interpolation in the Bargmann-Fock space I. J. Reine Angew. Math. 429, 91–106 (1992)
[TK] Tang, C., Kohmoto, M.: Global scaling properties of the spectrum for a quasiperiodic Schrödinger equation. Phys. Rev. B 34, 2041–2044 (1986)
Communicated by M. Aizenman
Commun. Math. Phys. 227, 541 – 550 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Invariance Properties of Induced Fock Measures for U(1) Holonomies J. M. Velhinho Departamento de Física, Universidade da Beira Interior, R. Marquês D’Ávila e Bolama, 6201-001 Covilhã, Portugal. E-mail:
[email protected] Received: 19 July 2001 / Accepted: 7 January 2002
Abstract: We study invariance properties of the measures in the space of generalized U(1) connections associated to Varadarajan’s r-Fock representations.

1. Introduction

Holonomies are the starting point for a rigorous approach to quantum gravity, often called “loop quantum gravity”, carried out throughout the last decade. It is based on Ashtekar’s formulation of general relativity as a gauge theory [As], loop variables [GT, RoSm], C∗-algebra techniques [AI2, Ba1] and integral and functional calculus in spaces of generalized connections [AL1, AL3, MM, Ba2] (an excellent review of both the fundamentals and the most recent developments in this field can be found in [T]). Since the early days of this approach, free Maxwell theory has been a preferred testing ground for new ideas, especially in what concerns the relation between background independent representations of holonomy algebras and the standard Fock representation for smeared fields [ARS, AI1, AR].

Recently, Varadarajan revisited this subject, and proposed a family of representations for a kinematical Poisson algebra of U(1) holonomies and certain functions of the electric fields [Va1, Va2]. Varadarajan’s work allowed the emergence of Fock states within the framework of generalized connections and is therefore a promising starting point to close the gap between non-perturbative loop quantum gravity states and low energy states [AL4] (see also [T] for a general discussion of the issue of semiclassical analysis in loop quantum gravity).

In the present work we study (quasi-)invariance and mutual singularity properties of the measures associated to Varadarajan’s representations. These are measures on the space $\overline{\mathcal{A}/\mathcal{G}}$ of generalized U(1) connections that can be obtained, by push-forward, from the standard Maxwell–Fock measure (see [AI2, AR, ARS] for previous work along these lines and also [AL4] for a projective construction of the measures).
We will show that the measures are singular with respect to each other and are singular with respect to the measure µ0 of Ashtekar and Lewandowski. This implies, in particular, that the Fock
states are not in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$ (but rather in an extension thereof [Va2]). It also follows that the measures are not quasi-invariant with respect to the natural action of $\overline{\mathcal{A}/\mathcal{G}}$ on itself, which is an obstruction to the quantization of the usual smeared electric fields. On the other hand, the measures on $\overline{\mathcal{A}/\mathcal{G}}$ inherit quasi-invariance properties directly related to the electric operators considered by Varadarajan.

This work is organized as follows. In Sect. 2 we review the Fock representation, whereas the loop approach is reviewed in Sect. 3. Varadarajan’s measures are presented in Sect. 4 and studied in Sect. 5, which contains our main results. We conclude with a brief discussion, in Sect. 6.

2. Smeared Fields and Fock Representation

This section briefly reviews some aspects of the Schrödinger representation of the usual Fock space for the Maxwell field, following [GV, ReSi, GJ, BSZ]. We use spatial coordinates $(x^a)$, $a = 1, 2, 3$, and units such that $c = \hbar = 1$. The Euclidean metric $\delta_{ab}$ in $\mathbb{R}^3$ is used to raise and lower indices whenever necessary.

As is well known, connections $A$ and electric fields $E$ do not give rise to well defined quantum operators. In the Fock framework, they are replaced by smeared versions $A(e) = \int A_a e^a\, d^3x$ and $E(\lambda) = \int \lambda_a E^a\, d^3x$, where $e$ belongs to the (nuclear) space $\mathcal{E}_\infty$ of smooth and fast decaying transverse vector fields and $\lambda$ to the (nuclear) space $(\mathcal{A}/\mathcal{G})_\infty$ of smooth and fast decaying transverse connections. The Poisson bracket between these basic observables is

$$ \{A(e), E(\lambda)\} = \int \lambda_a e^a\, d^3x, \tag{1} $$

to which correspond the Weyl relations:

$$ \mathcal{V}(\lambda)\,\mathcal{U}(e) = e^{i\int \lambda_a e^a\, d^3x}\; \mathcal{U}(e)\,\mathcal{V}(\lambda). \tag{2} $$

The usual Fock representation can be realized in the Hilbert space $L^2(\mathcal{E}^*_\infty, \mu_F)$, where $\mathcal{E}^*_\infty$ is the space of tempered distributional 1-forms (the topological dual of $\mathcal{E}_\infty$) and $\mu_F$ is the Gaussian measure defined by:

$$ \int_{\mathcal{E}^*_\infty} e^{i\phi(e)}\, d\mu_F(\phi) := \exp\left(-\frac{1}{4}\int e^a\, (-\Delta)^{-1/2} e^b\, \delta_{ab}\, d^3x\right), \tag{3} $$

where $\Delta$ is the Laplacian in $\mathbb{R}^3$. It is well known that $\mu_F$ is quasi-invariant with respect to the action of $(\mathcal{A}/\mathcal{G})_\infty$ on $\mathcal{E}^*_\infty$:

$$ \mathcal{E}^*_\infty \ni \phi \mapsto \phi + \lambda, \qquad \lambda \in (\mathcal{A}/\mathcal{G})_\infty, \tag{4} $$

where $\phi + \lambda \in \mathcal{E}^*_\infty$ is defined by

$$ (\phi + \lambda)(e) := \phi(e) + \int \lambda_a e^a\, d^3x, \qquad \forall e \in \mathcal{E}_\infty. \tag{5} $$

(Recall that a measure $\mu$ is quasi-invariant with respect to a group of transformations $T$ if the push-forward measure $T_*\mu$ has the same zero measure sets as $\mu$, $\forall T$, i.e. $(T_*\mu)(B) =$
$0$ if and only if $\mu(B) = 0$.) One therefore has a unitary representation $\mathcal{V}$ of the abelian group $(\mathcal{A}/\mathcal{G})_\infty$ as translations:

$$ \big(\mathcal{V}(\lambda)\psi\big)(\phi) = \sqrt{\frac{d\mu_{F,\lambda}}{d\mu_F}(\phi)}\;\psi(\phi - \lambda), \qquad \psi \in L^2(\mathcal{E}^*_\infty, \mu_F), \tag{6} $$

where $\mu_{F,\lambda}$ is the push-forward of the measure $\mu_F$ by the map (4) and $d\mu_{F,\lambda}/d\mu_F$ is the Radon-Nikodym derivative. (The existence of both the Radon-Nikodym derivative and of its inverse is equivalent to quasi-invariance.) A representation of the Weyl relations (2) is achieved with the following representation $\mathcal{U}$ of $\mathcal{E}_\infty$:

$$ \big(\mathcal{U}(e)\psi\big)(\phi) = e^{-i\phi(e)}\,\psi(\phi). \tag{7} $$

Since both representations $\mathcal{U}$ and $\mathcal{V}$ are continuous, the quantized fields $\hat{A}(e)$ and $\hat{E}(\lambda)$ can be identified with the generators of the one-parameter groups $\mathcal{U}(te)$ and $\mathcal{V}(t\lambda)$, respectively.

3. Holonomies and Haar Measure

This section briefly reviews the loop approach to the Maxwell field, following in essence the general framework for gauge theories with a compact (not necessarily abelian) group (see e.g. [T] and references therein). Notice, however, that the presentation of the uniform measure $\mu_0$ [AL1] and the quantization of electric fields [AL3] are considerably simpler in the U(1) case.

In the loop approach the configuration variables are (traces of) holonomies rather than smeared connections. Let us then consider U(1) holonomies

$$ T_\alpha(A) := e^{i\oint_\alpha A_a\, dx^a} \tag{8} $$

associated with piecewise analytic loops on $\mathbb{R}^3$. It is convenient to eliminate redundant loops, i.e. one identifies two loops $\alpha$ and $\beta$ such that $T_\alpha(A) = T_\beta(A)$ $\forall A$. Such classes of loops are called hoops. The set $\mathcal{HG}$ of all U(1) hoops is an abelian group under the natural composition of loops.

The set of holonomy functions $T_\alpha$, $\alpha \in \mathcal{HG}$, is an abelian ∗-algebra. The C∗ completion in the supremum norm is called the U(1) holonomy algebra $\mathcal{HA}$ [AI2, AL1]. It turns out that $\mathcal{HA}$ is isomorphic to the algebra of continuous functions on the space $\overline{\mathcal{A}/\mathcal{G}}$ of generalized connections, where $\overline{\mathcal{A}/\mathcal{G}}$ is the set of all group morphisms from the hoop group $\mathcal{HG}$ to U(1). In order to describe the isomorphism, let us consider the functions $\Theta_\alpha : \overline{\mathcal{A}/\mathcal{G}} \to U(1)$:

$$ \bar{A} \mapsto \Theta_\alpha(\bar{A}) := \bar{A}(\alpha), \tag{9} $$

where $\alpha \in \mathcal{HG}$ and $\bar{A}(\alpha)$ denotes evaluation. The space $\overline{\mathcal{A}/\mathcal{G}}$ is compact in the weakest topology such that all functions $\Theta_\alpha$ are continuous. It is a key result that $\overline{\mathcal{A}/\mathcal{G}}$ is homeomorphic to the spectrum of $\mathcal{HA}$ [MM, AL1, AL2], with the functions $\Theta_\alpha$ corresponding to $T_\alpha$.

(Cyclic) representations of $\mathcal{HA}$ are in 1-1 correspondence with positive linear functionals on $\mathcal{HA}$. By the above isomorphism, those are in turn in 1-1 correspondence with
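Borel measures in $\overline{\mathcal{A}/\mathcal{G}}$. Given a measure $\mu$, one thus has a representation $\Pi$ of $\mathcal{HA}$ in the Hilbert space $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu)$:

$$ \big(\Pi(T_\alpha)\psi\big)(\bar{A}) = \Theta_\alpha(\bar{A})\,\psi(\bar{A}), \qquad \forall \psi \in L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu). \tag{10} $$

The associated positive linear functional $\varphi$ is defined by:

$$ \varphi(T_\alpha) = \langle 1, \Pi(T_\alpha) 1 \rangle = \int_{\overline{\mathcal{A}/\mathcal{G}}} \Theta_\alpha\, d\mu. \tag{11} $$

In the U(1) case, $\overline{\mathcal{A}/\mathcal{G}}$ is a topological group [AL1, Ma] with multiplication

$$ (\bar{A}\bar{A}')(\alpha) = \bar{A}'(\alpha)\,\bar{A}(\alpha), \qquad \bar{A}, \bar{A}' \in \overline{\mathcal{A}/\mathcal{G}},\ \alpha \in \mathcal{HG}, \tag{12} $$

and inverse $\bar{A}^{-1}(\alpha) = \bar{A}(\alpha^{-1})$. Let us consider the Haar measure $\mu_0$ and the associated representation $\Pi_0$ of $\mathcal{HA}$ [AL1]. Since $\mu_0$ is invariant, we also have a unitary representation $\mathcal{V}_0$ of the group $\overline{\mathcal{A}/\mathcal{G}}$ in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$:

$$ \big(\mathcal{V}_0(\bar{A}')\psi\big)(\bar{A}) = \psi(\bar{A}'\bar{A}), \qquad \forall \psi \in L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0). \tag{13} $$

The representation $\mathcal{V}_0$ leads to smeared electric operators, as follows. For $\lambda \in (\mathcal{A}/\mathcal{G})_\infty$, let $\bar{A}_\lambda$ denote the element of $\overline{\mathcal{A}/\mathcal{G}}$ defined by holonomies, i.e. $\bar{A}_\lambda(\alpha) := T_\alpha(\lambda)$, $\forall \alpha \in \mathcal{HG}$. Restricting $\mathcal{V}_0$ to elements $\bar{A}_\lambda$ and the representation $\Pi_0$ to the functions $T_\alpha$, one obtains the commutation relations:

$$ \mathcal{V}_0(\lambda)\,\Pi_0(T_\alpha) = e^{i\oint_\alpha \lambda_a\, dx^a}\; \Pi_0(T_\alpha)\,\mathcal{V}_0(\lambda), \tag{14} $$

where $\mathcal{V}_0(\lambda) := \mathcal{V}_0(\bar{A}_\lambda)$. The action of $\mathcal{V}_0(\lambda)$ is particularly simple for the dense space of finite linear combinations of functions $\Theta_\alpha$:

$$ \mathcal{V}_0(\lambda)\,\Theta_\alpha = e^{i\oint_\alpha \lambda_a\, dx^a}\; \Theta_\alpha. \tag{15} $$

Let us consider the one-parameter unitary group $\mathcal{V}_0(t\lambda)$, $t \in \mathbb{R}$, and let $d\mathcal{V}_0(\lambda)$ be its self-adjoint generator. From (14) one finds the commutator:

$$ \big[d\mathcal{V}_0(\lambda), \Pi_0(T_\alpha)\big] = \oint_\alpha \lambda_a\, dx^a\; \Pi_0(T_\alpha), \tag{16} $$

showing that the operators $\Pi_0(T_\alpha)$ and $d\mathcal{V}_0(\lambda)$ give a quantization of the Poisson algebra of holonomies $T_\alpha$ and smeared electric fields $E(\lambda)$, in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$. In this representation, the states $\Theta_\alpha$ describe one-dimensional excitations of the electric field along loops, or electric flux "quanta", and are therefore called loop states [GT, RoSm]. These types of excitations are, of course, absent in Fock space. On the other hand, neither are the familiar Fock n-particle states or coherent states obviously related to loop states.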
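As a consistency check, the step from (14) to (16) can be spelled out. The following is only a formal sketch on a suitable dense domain: it writes $\Pi_0$ for the representation of the holonomy algebra associated with $\mu_0$, and it assumes the convention $\mathcal{V}_0(t\lambda) = e^{it\, d\mathcal{V}_0(\lambda)}$ for the generator, which the text does not state explicitly (the opposite convention would flip the sign of the commutator).

```latex
\begin{align*}
\mathcal{V}_0(t\lambda)\,\Pi_0(T_\alpha)
  &= e^{it\oint_\alpha \lambda_a dx^a}\,\Pi_0(T_\alpha)\,\mathcal{V}_0(t\lambda)
  && \text{(14) with } \lambda \to t\lambda \\
i\,d\mathcal{V}_0(\lambda)\,\Pi_0(T_\alpha)
  &= i\oint_\alpha \lambda_a dx^a\;\Pi_0(T_\alpha)
     + i\,\Pi_0(T_\alpha)\,d\mathcal{V}_0(\lambda)
  && \tfrac{d}{dt}\Big|_{t=0} \\
\big[d\mathcal{V}_0(\lambda),\,\Pi_0(T_\alpha)\big]
  &= \oint_\alpha \lambda_a dx^a\;\Pi_0(T_\alpha)
  && \text{which is (16).}
\end{align*}
```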
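4. r-Fock Measures

In this section we present Varadarajan’s r-Fock representations of the U(1) holonomy algebra $\mathcal{HA}$ from the measure theoretic point of view. Let us start with hoop form factors [ARS, AR, AI2]. Given a hoop $\alpha$, the form factor $X_\alpha$ is the transverse distributional vector field such that

$$ \int X^a_\alpha(x)\, A_a(x)\, d^3x = \oint_\alpha A_a\, dx^a, \qquad \forall A. \tag{17} $$

Consider the one-parameter family of functions on $\mathbb{R}^3$:

$$ f_r(x) = \frac{1}{2\pi^{3/2} r^3}\, e^{-x^2/2r^2}, \tag{18} $$

where $r > 0$. The smeared form factors are smooth and fast decaying transverse vector fields, i.e. elements of $\mathcal{E}_\infty$, defined by:

$$ X^a_{\alpha,r}(x) := \int f_r(y - x)\, X^a_\alpha(y)\, d^3y. \tag{19} $$

One thus has, for each $r$, a map $\alpha \mapsto X_{\alpha,r}$ from hoops to $\mathcal{E}_\infty$. Notice that the composition of hoops is preserved, i.e. $X_{\alpha\beta,r} = X_{\alpha,r} + X_{\beta,r}$ [AR].

Smeared form factors can be used to define measurable maps from the space of distributional connections $\mathcal{E}^*_\infty$ to $\overline{\mathcal{A}/\mathcal{G}}$. Consider then the family of maps $\Phi_r : \mathcal{E}^*_\infty \to \overline{\mathcal{A}/\mathcal{G}}$ given by $\phi \mapsto \bar{A}_{\phi,r}$, where

$$ \bar{A}_{\phi,r}(\alpha) := e^{i\phi(X_{\alpha,r})}, \qquad \forall \alpha \in \mathcal{HG}. \tag{20} $$

Since the σ-algebra of measurable sets in $\overline{\mathcal{A}/\mathcal{G}}$ is the smallest one such that all functions $\Theta_\alpha$ (9) are measurable, one sees that $\Phi_r$ is measurable if and only if the maps $\Theta_\alpha \circ \Phi_r : \mathcal{E}^*_\infty \to U(1)$ are measurable for all $\alpha \in \mathcal{HG}$, which is clearly true, since they can be obtained as a composition of measurable maps: $\phi \mapsto \phi(X_{\alpha,r}) \mapsto e^{i\phi(X_{\alpha,r})}$.

One can now use the maps $\Phi_r$ to push-forward the Fock measure $\mu_F$, thus obtaining a family of measures $\mu_r := (\Phi_r)_*\mu_F$ on $\overline{\mathcal{A}/\mathcal{G}}$. By definition

$$ \mu_r(B) = \mu_F(\Phi_r^{-1}B), \qquad \forall \text{ measurable set } B \subset \overline{\mathcal{A}/\mathcal{G}}. \tag{21} $$

Each of the measures $\mu_r$ provides us with a Hilbert space $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$ and a representation $\Pi_r$ of $\mathcal{HA}$. The associated positive linear functional $\varphi_r$ is:

$$ \varphi_r(T_\alpha) = \int_{\overline{\mathcal{A}/\mathcal{G}}} \Theta_\alpha(\bar{A})\, d\mu_r(\bar{A}) = \int_{\mathcal{E}^*_\infty} e^{i\phi(X_{\alpha,r})}\, d\mu_F(\phi). \tag{22} $$

Expression (22) shows that the representation $\Pi_r$ is the r-Fock representation considered by Varadarajan in [Va1], $\mu_r$ being the r-Fock measure in $\overline{\mathcal{A}/\mathcal{G}}$ whose existence was proved in [Va2].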
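The functional in (22) can be made explicit by inserting the smeared form factor into the defining relation (3) of the Gaussian measure. This is only a sketch: it writes $\mu_F$ for the Fock measure of Sect. 2 and uses the fact, stated after (18), that $X_{\alpha,r}$ is an element of the test function space, so that (3) applies with $X_{\alpha,r}$ as the smearing field.

```latex
\begin{align*}
\varphi_r(T_\alpha)
  &= \int_{\mathcal{E}^*_\infty} e^{i\phi(X_{\alpha,r})}\,d\mu_F(\phi)
  && \text{by (22)} \\
  &= \exp\Big(-\tfrac{1}{4}\int X^a_{\alpha,r}\,(-\Delta)^{-1/2}
       X^b_{\alpha,r}\,\delta_{ab}\,d^3x\Big)
  && \text{by (3) with smearing field } X_{\alpha,r}.
\end{align*}
```

Since $(-\Delta)^{-1/2}$ is a positive operator, the exponent is non-positive and $0 < \varphi_r(T_\alpha) \le 1$ for every hoop $\alpha$.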
5. Properties of the r-Fock Measures

The present section contains our main results. We show that the r-Fock measures $\mu_r$ are all mutually singular, and are singular with respect to the Haar measure $\mu_0$. We also study (quasi-)invariance properties of the r-Fock measures $\mu_r$ and their relation to the quantization of certain twice smeared electric fields introduced in [Va1].

Let Diff be the group of (analytic) diffeomorphisms of $\mathbb{R}^3$. The natural action of Diff on the (piecewise analytic) curves of $\mathbb{R}^3$ induces an action on the hoop group $\mathcal{HG}$:

$$ \mathcal{HG} \times \mathrm{Diff} \ni (\alpha, \varphi) \mapsto \varphi\alpha, \tag{23} $$

and therefore one has an action of Diff in $\overline{\mathcal{A}/\mathcal{G}}$, given by

$$ (\varphi^*\bar{A})(\alpha) = \bar{A}(\varphi\alpha), \qquad \varphi \in \mathrm{Diff},\ \bar{A} \in \overline{\mathcal{A}/\mathcal{G}},\ \alpha \in \mathcal{HG}. \tag{24} $$

It can be seen that the maps $\varphi^* : \overline{\mathcal{A}/\mathcal{G}} \to \overline{\mathcal{A}/\mathcal{G}}$ are continuous [AL1, AL2, Ba1]. The Haar measure $\mu_0$ is invariant under the action of Diff, since no background geometric structure is used in its definition [AL1]. The induced measures $\mu_r$, on the other hand, are not invariant, due to the appearance of the Euclidean metric $\delta_{ab}$ in the construction of the Fock measure $\mu_F$.

From now on we will restrict our attention to the Euclidean group, i.e., the subgroup of Diff of transformations that preserve the Euclidean metric. It is clear that the measures $\mu_r$ are invariant under these transformations, given the well known Euclidean invariance of the Fock measure. Besides being invariant, the Fock measure is moreover ergodic with respect to the action of the Euclidean group (see e.g. [BSZ, Ve2]), which means that the only invariant functions in $L^2(\mathcal{E}^*_\infty, \mu_F)$ are the constant functions. This ergodic property is shared by the measures $\mu_r$, since if an invariant and non-constant function $\psi$ were to exist in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$, then the pull-back $\psi \circ \Phi_r$ would define an invariant and non-constant function in $L^2(\mathcal{E}^*_\infty, \mu_F)$.

An important fact is that the Haar measure $\mu_0$ on $\overline{\mathcal{A}/\mathcal{G}}$ is also ergodic under the action of the Euclidean group, as follows from more general results proven in [MTV]. Thus, all measures $\mu_r$, $r \in \mathbb{R}^+$, and $\mu_0$ are invariant and ergodic under the action of the same group. From well known results in measure theory (see e.g. [Ya]), this is only possible if all these measures are mutually singular, meaning that each measure of the set $\{\mu_r, r \in \mathbb{R}^+\} \cup \{\mu_0\}$ is supported on a subset of $\overline{\mathcal{A}/\mathcal{G}}$ which has zero measure with respect to all the other measures (recall that a subset $X$ of a space $M$ is said to be a support for the measure $\mu$ on $M$ if any measurable subset $Y$ of the complement, $Y \subset X^c$, has measure zero). It is thus proven that

Theorem 1. The measures in the set $\{\mu_r, r \in \mathbb{R}^+\} \cup \{\mu_0\}$ are all singular with respect to each other.
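The measure theoretic fact invoked before Theorem 1 can be sketched as follows. This is the standard textbook argument, written here under stated assumptions and not taken from [Ya] or from the text: $\mu$ and $\nu$ are two distinct probability measures on the same measurable space, both invariant and ergodic under the same group $G$, and ergodicity is understood as in the text (invariant square integrable functions are constant). The argument needs the measures to be distinct, which the text takes for granted.

```latex
\begin{enumerate}
\item Lebesgue decomposition: $\nu = \nu_{ac} + \nu_s$ with
      $\nu_{ac} \ll \mu$ and $\nu_s \perp \mu$.
\item For $g \in G$ one has $g_*\mu = \mu$ and $g_*\nu = \nu$, hence
      $\nu = g_*\nu_{ac} + g_*\nu_s$ with $g_*\nu_{ac} \ll \mu$ and
      $g_*\nu_s \perp \mu$. By uniqueness of the decomposition,
      $g_*\nu_{ac} = \nu_{ac}$ and $g_*\nu_s = \nu_s$.
\item Since $\nu_{ac} \le \nu$, the density $h := d\nu_{ac}/d\nu$ exists,
      satisfies $0 \le h \le 1$, and is $G$-invariant $\nu$-a.e.
      Ergodicity of $\nu$ gives $h = t$ constant, i.e. $\nu_{ac} = t\,\nu$.
\item If $t = 0$, then $\nu = \nu_s \perp \mu$.
\item If $t > 0$, then $\nu = t^{-1}\nu_{ac} \ll \mu$, and $f := d\nu/d\mu$
      is $G$-invariant $\mu$-a.e. Its level sets $\{f > a\}$ are invariant
      up to null sets, so by ergodicity of $\mu$ they have measure $0$ or $1$,
      and $f$ is constant. Normalization forces $f = 1$, i.e. $\nu = \mu$,
      contradicting $\nu \neq \mu$.
\end{enumerate}
```

Hence two distinct invariant ergodic probability measures are mutually singular.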
Theorem 1 leads to the conclusion that none of the measures $\mu_r$ is quasi-invariant under the action of $\overline{\mathcal{A}/\mathcal{G}}$ on itself. This follows from the fact that $\overline{\mathcal{A}/\mathcal{G}}$ is a compact group, which implies that any quasi-invariant measure is in the equivalence class of the Haar measure, meaning that it must have the same zero measure sets (see e.g. [Ki, 9.1]). Thus

Corollary 1. The measures $\mu_r$, $r \in \mathbb{R}^+$, are not quasi-invariant.

We saw in Sect. 3 how the quantization of smeared electric fields can be obtained from a unitary representation of the group $\overline{\mathcal{A}/\mathcal{G}}$ in the Hilbert space $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$. From
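the corollary we conclude that such a unitary representation of $\overline{\mathcal{A}/\mathcal{G}}$ is not available in the Hilbert spaces $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$. One should thus look for the quantization of different functions of the electric fields. Varadarajan showed in [Va1] that certain “Gaussian-smeared smeared” electric fields can be consistently quantized in the r-Fock representations. In the remainder of this section we will relate the quantization of these functions to quasi-invariance properties of the r-Fock measures $\mu_r$. We will start by establishing the quasi-invariance properties, which, as expected, follow from the quasi-invariance of the Fock measure under the action (4).

Let us consider the restriction of the maps $\Phi_r$ (20) to $(\mathcal{A}/\mathcal{G})_\infty$, i.e., we consider the maps

$$ (\mathcal{A}/\mathcal{G})_\infty \ni \lambda \mapsto \bar{A}_{\lambda,r} \in \overline{\mathcal{A}/\mathcal{G}} \tag{25} $$

such that

$$ \bar{A}_{\lambda,r}(\alpha) = \exp\left(i\int \lambda_a(x)\, X^a_{\alpha,r}(x)\, d^3x\right), \qquad \forall \alpha \in \mathcal{HG}. \tag{26} $$

It is clear that $\bar{A}_{\lambda+\lambda',r} = \bar{A}_{\lambda,r}\bar{A}_{\lambda',r}$, and therefore the group $(\mathcal{A}/\mathcal{G})_\infty$ acts on the space $\overline{\mathcal{A}/\mathcal{G}}$ as a subgroup of the full group $\overline{\mathcal{A}/\mathcal{G}}$. Let us denote this action by $\Xi_r$:

$$ (\mathcal{A}/\mathcal{G})_\infty \times \overline{\mathcal{A}/\mathcal{G}} \ni (\lambda, \bar{A}) \mapsto \bar{A}_{\lambda,r}\bar{A}. \tag{27} $$

For any given $\lambda \in (\mathcal{A}/\mathcal{G})_\infty$, let $\mu_{\lambda,r}$ denote the push-forward of the measure $\mu_r$ by the map $\bar{A} \mapsto \bar{A}_{\lambda,r}\bar{A}$. The measure $\mu_{\lambda,r}$ is completely determined by the integrals of continuous functions $F(\bar{A})$:

$$ \int_{\overline{\mathcal{A}/\mathcal{G}}} F(\bar{A})\, d\mu_{\lambda,r}(\bar{A}) = \int_{\overline{\mathcal{A}/\mathcal{G}}} F(\bar{A}_{\lambda,r}\bar{A})\, d\mu_r(\bar{A}). \tag{28} $$

We need only consider the functions $\Theta_\alpha$ (9), and therefore the measure $\mu_{\lambda,r}$ is determined by the following map from $\mathcal{HG}$ to $\mathbb{C}$:

$$ \alpha \mapsto \int_{\overline{\mathcal{A}/\mathcal{G}}} (\bar{A}_{\lambda,r}\bar{A})(\alpha)\, d\mu_r(\bar{A}). \tag{29} $$

One gets from (12), (22) and (26):

$$ \int_{\overline{\mathcal{A}/\mathcal{G}}} (\bar{A}_{\lambda,r}\bar{A})(\alpha)\, d\mu_r(\bar{A}) = \exp\left(i\int \lambda_a(x)\, X^a_{\alpha,r}(x)\, d^3x\right) \int_{\mathcal{E}^*_\infty} e^{i\phi(X_{\alpha,r})}\, d\mu_F(\phi). \tag{30} $$

Recalling the action (4) of $(\mathcal{A}/\mathcal{G})_\infty$ on $\mathcal{E}^*_\infty$, one gets further:

$$ \int_{\overline{\mathcal{A}/\mathcal{G}}} (\bar{A}_{\lambda,r}\bar{A})(\alpha)\, d\mu_r(\bar{A}) = \int_{\mathcal{E}^*_\infty} e^{i(\phi+\lambda)(X_{\alpha,r})}\, d\mu_F(\phi) = \int_{\mathcal{E}^*_\infty} e^{i\phi(X_{\alpha,r})}\, d\mu_{F,\lambda}(\phi), \tag{31} $$

where the measure $\mu_{F,\lambda}$ is the push-forward of $\mu_F$ by the map $\phi \mapsto \phi + \lambda$ (4). Recalling also the arguments of Sect. 4, one sees easily that the measure $\mu_{\lambda,r}$ coincides with $(\Phi_r)_*\mu_{F,\lambda}$, the push-forward of $\mu_{F,\lambda}$ by the map $\Phi_r$ (20). Since the Fock measure $\mu_F$ is quasi-invariant under the action of $(\mathcal{A}/\mathcal{G})_\infty$, this is sufficient to prove that, for any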
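$r \in \mathbb{R}^+$, the measure $\mu_r$ is quasi-invariant with respect to the action $\Xi_r$ (27). For if $B \subset \overline{\mathcal{A}/\mathcal{G}}$ is such that $\mu_r(B) = 0$, we then have $\mu_F(\Phi_r^{-1}B) = 0$, by definition of the push-forward measure. The quasi-invariance of $\mu_F$ then shows that $\mu_{F,\lambda}(\Phi_r^{-1}B) = 0$, $\forall \lambda \in (\mathcal{A}/\mathcal{G})_\infty$, which in turn is equivalent to $\mu_{\lambda,r}(B) = 0$. Thus

Theorem 2. The measure $\mu_r$ is quasi-invariant with respect to the action $\Xi_r$, for any given $r$.

Using this result, we define, for any $r$, a natural unitary representation $\mathcal{V}_r$ of $(\mathcal{A}/\mathcal{G})_\infty$ in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$:

$$ \big(\mathcal{V}_r(\lambda)\psi\big)(\bar{A}) = \sqrt{\frac{d\mu_{\lambda,r}}{d\mu_r}}\;\psi(\bar{A}_{\lambda,r}\bar{A}), \qquad \lambda \in (\mathcal{A}/\mathcal{G})_\infty,\ \psi \in L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r), \tag{32} $$

where $d\mu_{\lambda,r}/d\mu_r$ is the Radon-Nikodym derivative. One can easily work out the commutation relations between $\mathcal{V}_r(\lambda)$ and the r-Fock representation $\Pi_r(T_\alpha)$ of holonomies:

$$ \mathcal{V}_r(\lambda)\,\Pi_r(T_\alpha) = \exp\left(i\int \lambda_a(x)\, X^a_{\alpha,r}(x)\, d^3x\right) \Pi_r(T_\alpha)\,\mathcal{V}_r(\lambda). \tag{33} $$

Let us consider the self-adjoint generator $d\mathcal{V}_r(\lambda)$ of the one-parameter unitary group $\mathcal{V}_r(t\lambda)$, $t \in \mathbb{R}$. Notice that the existence of $d\mathcal{V}_r(\lambda)$, or the continuity of the one-parameter group $\mathcal{V}_r(t\lambda)$, follows from the continuity of the representation $\mathcal{V}$ (6) in $L^2(\mathcal{E}^*_\infty, \mu_F)$. From (33) one obtains the following commutator:

$$ \big[d\mathcal{V}_r(\lambda), \Pi_r(T_\alpha)\big] = \int \lambda_a(x)\, X^a_{\alpha,r}(x)\, d^3x\;\; \Pi_r(T_\alpha). \tag{34} $$

This commutator is indeed the quantization of a given classical Poisson bracket, as realized by Varadarajan [Va1]. Consider then, for each $r$, the following functions of the electric field, parametrized by elements of $(\mathcal{A}/\mathcal{G})_\infty$:

$$ E^a(x) \mapsto E_r(\lambda) := \int \lambda_a(x) \left(\int f_r(x - y)\, E^a(y)\, d^3y\right) d^3x, \tag{35} $$

where $f_r$ is given by (18). The functions $E_r(\lambda)$ are referred to as “Gaussian-smeared smeared electric fields” in [Va1]. The Poisson bracket between these functions and the holonomies is

$$ \{T_\alpha, E_r(\lambda)\} = i\int \lambda_a(x)\, X^a_{\alpha,r}(x)\, d^3x\;\; T_\alpha, \tag{36} $$

showing that $d\mathcal{V}_r(\lambda)$ can be seen as the quantization in $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$ of the classical function $E_r(\lambda)$.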
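The bracket (36) can be checked directly from (1), (17), (19) and (35). The sketch below is formal: it introduces the auxiliary smeared field $\lambda^{(r)}$ (a name used only here), writes $e$ for the smearing field in the bracket (1), and assumes that (1) extends to the distributional form factor $X_\alpha$.

```latex
\begin{align*}
E_r(\lambda) &= \int \lambda^{(r)}_a(y)\,E^a(y)\,d^3y = E(\lambda^{(r)}),
  \qquad \lambda^{(r)}_a(y) := \int f_r(x-y)\,\lambda_a(x)\,d^3x
  && \text{by (35)} \\
T_\alpha(A) &= \exp\Big(i\int X^a_\alpha A_a\,d^3x\Big) = e^{iA(X_\alpha)}
  && \text{by (8), (17)} \\
\{T_\alpha, E_r(\lambda)\}
  &= i\,T_\alpha\,\{A(X_\alpha), E(\lambda^{(r)})\}
   = i\,T_\alpha \int \lambda^{(r)}_a(y)\,X^a_\alpha(y)\,d^3y
  && \text{by (1) with } e = X_\alpha \\
  &= i\,T_\alpha \int \lambda_a(x)
       \Big(\int f_r(y-x)\,X^a_\alpha(y)\,d^3y\Big)\,d^3x
  && f_r(x-y) = f_r(y-x) \\
  &= i \int \lambda_a(x)\,X^a_{\alpha,r}(x)\,d^3x\;T_\alpha
  && \text{by (19).}
\end{align*}
```

The same smeared form factor therefore appears in the classical bracket (36) and in the commutator (34).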
6. Discussion

Varadarajan’s use of smeared form factors allowed an embedding of distributional connections into $\overline{\mathcal{A}/\mathcal{G}}$, overcoming the fact that the natural embedding of connections is not extensible to distributions. This step is very welcome, since $\overline{\mathcal{A}/\mathcal{G}}$ can be seen as a common measurable space from which both Fock states and loop states can be defined. In particular, the r-Fock measures $\mu_r$ give natural images $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_r)$ of the Fock space [Va1]. The Fock states, however, cannot be regarded as elements of $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$, the kinematical Hilbert space for loop states, as a consequence of the mutual singularity between the Haar measure and the r-Fock measures, which we have proven.

Nevertheless, Fock states can be exhibited within the framework of loop states. In fact, Varadarajan [Va2] showed that the Fock states can be realized as elements of a natural extension of $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$, e.g. the dual $\mathrm{Cyl}^*$ of the space of cylinder functions in $\overline{\mathcal{A}/\mathcal{G}}$. Notice that such an extension from $L^2(\overline{\mathcal{A}/\mathcal{G}}, \mu_0)$ to $\mathrm{Cyl}^*$ is already required in loop quantum gravity, in order to solve the diffeomorphism constraint [ALMMT]. A suitable generalization of Varadarajan’s work to quantum gravity is, therefore, expected to produce an embedding of Minkowskian Fock-like states (describing “gravitons” of a semiclassical or low energy effective theory) into the space $\mathrm{Cyl}^*$ of non-perturbative physical loop states. These issues are currently under investigation (see [AL4] and also [T] for a more general approach to semiclassical analysis).

In these efforts, measure theory in $\overline{\mathcal{A}/\mathcal{G}}$ plays a relevant role, e.g. in the definition of quantum operators and in the analysis of the physical contents of the states. A good understanding of the r-Fock measures and their relation to the Haar measure may therefore be important to further developments.

In order to complement our measure theoretical results, we would like to conclude with a brief comment regarding topological aspects. Although the r-Fock measures are supported in irrelevant sets with respect to the Haar measure, it can be shown on the topological side that every conceivable support of an r-Fock measure is dense in $\overline{\mathcal{A}/\mathcal{G}}$.¹ The r-Fock measures $\mu_r$ are therefore faithful,² just like the Haar measure $\mu_0$ [AL1]. (Recall that a Borel measure is said to be faithful if every non-empty open set has non-zero measure, which is readily seen to be equivalent to the denseness of every conceivable support. Turning to representations, a measure in $\overline{\mathcal{A}/\mathcal{G}}$ is faithful if and only if the corresponding representation of the holonomy algebra $\mathcal{HA}$ is faithful.) The fact that the Haar measure and the r-Fock measures are all faithful and mutually singular means that one can find a family of mutually disjoint dense sets, each of which supports a different measure. Notice finally that dense sets in $\overline{\mathcal{A}/\mathcal{G}}$ that do not contribute to the Haar measure were already known, e.g. the set of smooth connections [MM] or a considerable extension of it given in [MTV]. In the present case, however, one has new measures, living on $\mu_0$-irrelevant sets.

Acknowledgements. I thank José Mourão, Jerzy Lewandowski, Madhavan Varadarajan and Roger Picken. This work was supported in part by PRAXIS/2/2.1/FIS/286/94 and CERN/P/FIS/40108/2000.

¹ The crucial result, pointed out by M. Varadarajan, is that $\Phi_r(\mathcal{E}^*_\infty)$ is dense. The denseness of smaller supports follows from the faithfulness of the Fock measure $\mu_F$ in $\mathcal{E}^*_\infty$ and the continuity of $\Phi_r$ with respect to an appropriate topology in $\mathcal{E}^*_\infty$.
² An independent proof using projective arguments is given in [AL4].
References

[As] Ashtekar, A.: Lectures on non-Perturbative Canonical Quantum Gravity. Singapore: World Scientific, 1991
[ARS] Ashtekar, A., Rovelli, C., Smolin, L. S.: Phys. Rev. D44, 1740 (1991)
[AR] Ashtekar, A., Rovelli, C.: Class. Quantum Grav. 9, 1121 (1992)
[AI1] Ashtekar, A., Isham, C. J.: Phys. Lett. B274, 393 (1992)
[AI2] Ashtekar, A., Isham, C. J.: Class. Quant. Grav. 9, 1433 (1992)
[AL1] Ashtekar, A., Lewandowski, J.: Representation Theory of Analytic Holonomy C∗-Algebras. In: Knots and Quantum Gravity, ed. J. Baez, Oxford: Oxford University Press, 1994
[AL2] Ashtekar, A., Lewandowski, J.: J. Math. Phys. 36, 2170 (1995)
[AL3] Ashtekar, A., Lewandowski, J.: J. Geom. Phys. 17, 191 (1995)
[AL4] Ashtekar, A., Lewandowski, J.: Class. Quant. Grav. 18, L117 (2001)
[ALMMT] Ashtekar, A., Lewandowski, J., Marolf, D., Mourão, J., Thiemann, T.: J. Math. Phys. 36, 6456 (1995)
[BSZ] Baez, J., Segal, I., Zhou, Z.: Introduction to Algebraic and Constructive Quantum Field Theory. Princeton: Princeton University Press, 1992
[Ba1] Baez, J.: Lett. Math. Phys. 31, 213 (1994)
[Ba2] Baez, J.: Diffeomorphism Invariant Generalized Measures on the Space of Connections Modulo Gauge Transformations. In: Proceedings of the Conference on Quantum Topology, ed. D. Yetter, Singapore: World Scientific, 1994
[GV] Gelfand, I.M., Vilenkin, N.: Generalized Functions. Vol. IV, New York: Academic Press, 1964
[GT] Gambini, R., Trias, A.: Phys. Rev. D23, 553 (1981)
[GJ] Glimm, J., Jaffe, A.: Quantum Physics. New York: Springer Verlag, 1987
[Ki] Kirillov, A.A.: Elements of the Theory of Representations. Berlin: Springer Verlag, 1975
[MM] Marolf, D., Mourão, J.M.: Commun. Math. Phys. 170, 583 (1995)
[MTV] Mourão, J.M., Thiemann, T., Velhinho, J.M.: J. Math. Phys. 40, 2337 (1999)
[Ma] Martins, J.: Mecânica Quântica em Espaços de Conexões. Instituto Superior Técnico, 2000. Unpublished
[ReSi] Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
[RoSm] Rovelli, C., Smolin, L.: Nucl. Phys. B331, 80 (1990)
[T] Thiemann, T.: Introduction to Modern Canonical Quantum General Relativity. To appear in “Living Reviews”
[Va1] Varadarajan, M.: Phys. Rev. D61, 104001 (2000)
[Va2] Varadarajan, M.: Phys. Rev. D64, 104003 (2001)
[Ve2] Velhinho, J.M.: Métodos Matemáticos em Quantização Canónica de Espaços de Fase não Triviais. Ph.D. Dissertation (Universidade Técnica de Lisboa, Instituto Superior Técnico, 2001)
[Ya] Yamasaki, Y.: Measures on Infinite Dimensional Spaces. Singapore: World Scientific, 1985

Communicated by H. Nicolai
Commun. Math. Phys. 227, 551 – 585 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Gauge Theoretical Equivariant Gromov–Witten Invariants and the Full Seiberg–Witten Invariants of Ruled Surfaces Ch. Okonek1 , A. Teleman2,3 1 Institut für Mathematik, Universität Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland.
E-mail:
[email protected] 2 LATP, CMI, Université de Provence, 39 Rue F. J. Curie, 13453 Marseille Cedex 13, France.
E-mail:
[email protected] 3 Faculty of Mathematics, University of Bucharest, Bucharest, Romania
Received: 22 February 2001 / Accepted: 16 January 2002
Abstract: Let $F$ be a differentiable manifold endowed with an almost Kähler structure $(J, \omega)$, $\alpha$ a $J$-holomorphic action of a compact Lie group $\hat{K}$ on $F$, and $K$ a closed normal subgroup of $\hat{K}$ which leaves $\omega$ invariant. The purpose of this article is to introduce gauge theoretical invariants for such triples $(F, \alpha, K)$. The invariants are associated with moduli spaces of solutions of a certain vortex type equation on a Riemann surface $\Sigma$. Our main results concern the special case of the triple $(\mathrm{Hom}(\mathbb{C}^r, \mathbb{C}^{r_0}), \alpha_{can}, U(r))$, where $\alpha_{can}$ denotes the canonical action of $\hat{K} = U(r) \times U(r_0)$ on $\mathrm{Hom}(\mathbb{C}^r, \mathbb{C}^{r_0})$. We give a complex geometric interpretation of the corresponding moduli spaces of solutions in terms of gauge theoretical quot spaces, and compute the invariants explicitly in the case $r = 1$. Proving a comparison theorem for virtual fundamental classes, we show that the full Seiberg–Witten invariants of ruled surfaces, as defined in [OT2], can be identified with certain gauge theoretical Gromov–Witten invariants of the triple $(\mathrm{Hom}(\mathbb{C}, \mathbb{C}^{r_0}), \alpha_{can}, U(1))$. We find the following formula for the full Seiberg–Witten invariant of a ruled surface over a Riemann surface of genus $g$:

$$ SW^{-\mathrm{sign}\langle c,[F]\rangle}_{X,(O_1,H_0)}(\mathfrak{c}) = 0, $$

$$ SW^{\mathrm{sign}\langle c,[F]\rangle}_{X,(O_1,H_0)}(\mathfrak{c})(l) = \mathrm{sign}\langle c,[F]\rangle \left\langle \sum_{i \ge \max(0,\, g - \frac{w_c}{2})}^{g} \frac{\Theta_c^i}{i!} \wedge l,\ l_{O_1} \right\rangle, $$
Partially supported by: EAGER – European Algebraic Geometry Research Training Network, contract No HPRN-CT-2000-00099 (BBW 99.0030), and by SNF, nr. 2000-055290.98/1.
where $[F]$ denotes the class of a fibre. The computation of the invariants in the general case $r > 1$ should lead to a generalized Vafa-Intriligator formula for “twisted” Gromov–Witten invariants associated with sections in Grassmann bundles.

1. Introduction

1.1. The general set up. Let $F$ be a differentiable manifold, $\omega$ a symplectic form on $F$, and $J$ a compatible almost complex structure. Let $\alpha$ be a $J$-holomorphic action of a compact Lie group $\hat{K}$ on $F$, and let $K$ be a closed normal subgroup of $\hat{K}$ which leaves $\omega$ invariant.

Put $K_0 := \hat{K}/K$, and let $\pi$ be the projection of $\hat{K}$ onto this quotient. We fix an invariant inner product on the Lie algebra $\hat{\mathfrak{k}}$ of $\hat{K}$ and denote by $\mathrm{pr}_{\mathfrak{k}} : \hat{\mathfrak{k}} \to \mathfrak{k}$ the orthogonal projection onto the Lie algebra of $K$.

The topological data of our moduli problem are a $K_0$-bundle $P_0$ on a compact oriented differentiable 2-manifold $\Sigma$, and an equivalence class $\mathfrak{c}$ of pairs $(\lambda, \hat{h})$ consisting of a $\pi$-morphism $\hat{P} \xrightarrow{\lambda} P_0$ and a homotopy class $\hat{h}$ of sections in the associated bundle $E := \hat{P} \times_{\hat{K}} F$. Two pairs $(\lambda, \hat{h})$, $(\lambda', \hat{h}')$ are equivalent if there exists an isomorphism $\hat{P} \to \hat{P}'$ over $P_0$ which maps $\hat{h}$ onto $\hat{h}'$.

The pair $(P_0, \mathfrak{c})$ should be regarded as the discrete parameter on which our moduli problem depends. It plays the same role as the data of a SU(2)- or a PU(2)-bundle in Donaldson theory, or the data of an equivalence class of Spin$^c$-structures in Seiberg–Witten theory.

For every representative $(\hat{P} \xrightarrow{\lambda} P_0, \hat{h})$ of $\mathfrak{c}$ we denote by $\Gamma^\lambda(\mathfrak{c}) \subset \Gamma(\Sigma, E)$ the union of all homotopy classes $\hat{h}'$ of sections in $E$ for which $(\lambda, \hat{h}') \in \mathfrak{c}$. This set $\Gamma^\lambda(\mathfrak{c})$ is the union of the homotopy classes in the orbit of $\hat{h}$ with respect to the action of the group $\pi_0(\mathrm{Aut}_{P_0}(\hat{P}))$ on the set $\pi_0(\Gamma(\Sigma, E))$. In other words, $\Gamma^\lambda(\mathfrak{c})$ is the saturation of $\hat{h}$ with respect to the $\mathrm{Aut}_{P_0}(\hat{P})$-action on the space of sections.

Now fix a representative $(\hat{P} \xrightarrow{\lambda} P_0, \hat{h})$ of $\mathfrak{c}$. Let $\mu$ be a $\hat{K}$-equivariant moment map for the restricted $K$-action $\alpha|_K$ on $F$, let $g$ be a metric on $\Sigma$, and let $A_0$ be a connection on $P_0$. The triple $p := (\mu, g, A_0)$ is the continuous parameter on which our moduli problem depends. It plays the role of the Riemannian metric on the base manifold in Donaldson theory [DK], or the role of the pair (Riemannian metric, self-dual form) in Seiberg–Witten theory [W2, OT1, OT2].

The orthogonal projection $\mathrm{pr}_{\mathfrak{k}}$ induces a bundle projection which we denote by the same symbol $\mathrm{pr}_{\mathfrak{k}} : \hat{P} \times_{\mathrm{ad}} \hat{\mathfrak{k}} \to \hat{P} \times_{\mathrm{ad}} \mathfrak{k}$. Since $\hat{K}$ acts $J$-holomorphically, a connection $\hat{A}$ in $\hat{P}$ defines an almost holomorphic structure $J_{\hat{A}}$ in the associated bundle $E$; $J_{\hat{A}}$ agrees with $J$ on the vertical tangent bundle $T_{E/\Sigma}$ of $E$ and it agrees with the holomorphic structure $J_g$ defined by $g$ on $\Sigma$ on the $\hat{A}$-horizontal distribution of $E$.

Our gauge group is $\mathcal{G} = \mathrm{Aut}_{P_0}(\hat{P}) \simeq \Gamma(\Sigma, \hat{P} \times_{\mathrm{Ad}} K)$, and it acts from the right on our configuration space

$$ \mathcal{A} := \mathcal{A}_{A_0}(\hat{P}, \lambda) \times \Gamma^\lambda(\mathfrak{c}). $$
Here AA0 (Pˆ , λ) is the affine space of connections Aˆ in Pˆ which project onto A0 via λ. ˆ ϕ) ∈ A we consider the equations For a pair (A, ϕ is JAˆ holomorphic (Vp ) 0. pr k #FAˆ + µ(ϕ) = These vortex type equations are obviously gauge invariant. The first condition of (Vp ) can be rewritten as ∂¯Aˆ ϕ = 0, where ∂¯Aˆ ϕ ∈ ( , #0,1 (ϕ ∗ (TE/ ))) is the (0, 1)-component of the derivative dAˆ ϕ ∈ ( , #1 (ϕ ∗ (TE/ ))). ˆ these equations were independently found and In the particular case where K = K, studied in [Mu1, CGS, G]. ˆ the moduli space of solutions of the equations (Vp ) Denote by M = Mp (λ, h) modulo gauge equivalence. Let A∗ be the open subspace of A consisting of irreducible pairs, i.e. of pairs with trivial stabilizer, and denote by M∗ the moduli space of irreducible solutions; M∗ can be ∗ regarded as a subspace of the infinite dimensional quotient B ∗ := A G of irreducible pairs. The space B ∗ becomes a Banach manifold after suitable Sobolev completions. The parameters p for which M∗ = M are called bad parameters, and the set of bad parameters is called the bad locus or the wall. Our purpose is to define invariants for triples (F, α, K) by evaluating certain tautological cohomology classes on the virtual fundamental class of moduli spaces M corresponding to good parameters, provided these spaces are compact (or have a canonical compactification) and possess a canonical virtual fundamental class. The invariants will depend on the choice of the discrete parameter (P0 , c), and a chamber C, i.e. a component of the complement of the bad locus in the space of continuous parameters. Some general ideas for the construction of Gromov–Witten type invariants associated with moduli spaces of solutions of vortex-type equations have been outlined in [CGS]; in [Mu2] such invariants are rigorously defined in the special case where F is compact Kähler and K = Kˆ = S 1 . 
Note that our program is fundamentally different: Our main construction begins with an important new idea, the parameter symmetry group $K_0 := \hat K/K$. This group, whose introduction was motivated by our previous work on Seiberg–Witten theory, leads to an essentially new set up and plays a crucial role in the following. Without it none of our main results could even be formulated.

Our first aim is to construct tautological cohomology classes on the infinite dimensional quotient $\mathcal B^*$. Note first that any section $\varphi \in \Gamma(\Sigma, E)$ can be regarded as a $\hat K$-equivariant map $\hat P \to F$. Therefore one gets a $\hat K$-equivariant evaluation map $\mathrm{ev} : \mathcal A^* \times \hat P \to F$, which is obviously $\mathcal G$-invariant. Let $\hat{\mathcal P} := \mathcal A^* \times_{\mathcal G} \hat P$ be the universal $\hat K$-bundle over $\mathcal B^* \times \Sigma$. The evaluation map descends to a $\hat K$-equivariant map $\Phi : \mathcal A^* \times_{\mathcal G} \hat P \to F$
Ch. Okonek, A. Teleman
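which can be regarded as the universal section in the associated universal $F$-bundle $\mathcal A^* \times_{\mathcal G} E$. Let $\Phi^* : H^*_{\hat K}(F, \mathbb Z) \to H^*(\mathcal B^* \times \Sigma, \mathbb Z)$ be the map induced by $\Phi$ in $\hat K$-equivariant cohomology. Using the same idea as in Donaldson theory, we define for every $c \in H^*_{\hat K}(F, \mathbb Z)$ and $\beta \in H_*(\Sigma)$ the element $\delta^c(\beta) \in H^*(\mathcal B^*, \mathbb Z)$ by

$$\delta^c(\beta) := \Phi^*(c)/\beta.$$

Recall that one has natural morphisms

$$H^*(BK_0, \mathbb Z) \xrightarrow{\ B(\pi)^*\ } H^*(B\hat K, \mathbb Z) \xrightarrow{\ \eta^*\ } H^*_{\hat K}(F, \mathbb Z),$$

which are induced by the natural maps

$$E\hat K \times_{\hat K} F \xrightarrow{\ \eta\ } B\hat K \xrightarrow{\ B(\pi)\ } BK_0.$$

Let $\hat\kappa : \Sigma \to B\hat K$ be a classifying map for the bundle $\hat P$, and let $\kappa_0 := B(\pi) \circ \hat\kappa$ be the corresponding classifying map for $P_0$. Denote by $\hat h^*$ the morphism $H^*_{\hat K}(F, \mathbb Z) \to H^*(\Sigma, \mathbb Z)$ defined by $\hat h$.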
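Proposition 1.1. The assignment $(c, \beta) \mapsto \delta^c(\beta)$ has the following properties:

1. It is linear in both arguments.

2. For any homogeneous elements $c \in H^*_{\hat K}(F, \mathbb Z)$, $\beta \in H_*(\Sigma, \mathbb Z)$ of the same degree, one has
$$\delta^c(\beta) = \langle \hat h^*(c), \beta\rangle \cdot 1_{H^0(\mathcal B^*, \mathbb Z)}.$$

3. For any homogeneous elements $c, c' \in H^*_{\hat K}(F, \mathbb Z)$, one has
$$\delta^{c \cup c'}([*]) = \delta^{c}([*]) \cup \delta^{c'}([*]).$$

4. For any homogeneous elements $c, c' \in H^*_{\hat K}(F, \mathbb Z)$, $\beta \in H_1(\Sigma, \mathbb Z)$ one has
$$\delta^{c \cup c'}(\beta) = (-1)^{\deg c'}\, \delta^{c}(\beta) \cup \delta^{c'}([*]) + \delta^{c}([*]) \cup \delta^{c'}(\beta).$$

5. Let $(\beta_i)_{1 \le i \le 2g(\Sigma)}$ be a basis of $H_1(\Sigma, \mathbb Z)$. Then for any homogeneous elements $c, c' \in H^*_{\hat K}(F, \mathbb Z)$ one has
$$\delta^{c \cup c'}([\Sigma]) = \delta^{c}([\Sigma]) \cup \delta^{c'}([*]) + \delta^{c}([*]) \cup \delta^{c'}([\Sigma]) - (-1)^{\deg c'} \sum_{i,j=1}^{2g(\Sigma)} \delta^{c}(\beta_i) \cup \delta^{c'}(\beta_j)\,(\beta_i \cdot \beta_j).$$

6. For every $c_0 \in H^*(BK_0, \mathbb Z)$ one has
$$\delta^{c \cup (\eta^* B(\pi)^* c_0)}(\beta) = \delta^{c}(\kappa_0^*(c_0) \cap \beta).$$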
The properties 1–5 follow from general properties of the slant product, whereas the last property follows from the natural identification $\hat{\mathcal P} \times_{\hat K} K_0 \simeq \mathcal B^* \times P_0$.

To every pair of homogeneous elements $c \in H^*_{\hat K}(F, \mathbb Z)$, $\beta \in H_*(\Sigma, \mathbb Z)$ satisfying $\deg c \ge \deg\beta$ we associate the symbol $\binom{c}{\beta}$, considered as an element of degree $\deg c - \deg\beta$. Let $\mathbb A = \mathbb A(F, \alpha, K, c)$ be the graded-commutative graded $\mathbb Z$-algebra which is generated by the symbols $\binom{c}{\beta}$, subject to the relations which correspond to the properties 1–6 in the proposition above. This algebra depends only on the homotopy type of our topological data.

The assignment $\binom{c}{\beta} \mapsto \delta^c(\beta)$ defines a morphism of graded-commutative $\mathbb Z$-algebras $\delta : \mathbb A \to H^*(\mathcal B^*, \mathbb Z)$.

Now fix a discrete parameter $(P_0, c)$ and choose a representant $(\hat P \xrightarrow{\lambda} P_0, \hat h)$ of $c$ as above. Choose a continuous parameter $p$ not on the wall. When the moduli space $\mathcal M_p(\lambda, \hat h)^*$ is compact and possesses a virtual fundamental class $[\mathcal M_p(\lambda, \hat h)^*]^{\mathrm{vir}}$, then this class defines an invariant

$$GGW_p^{(P_0, c)}(F, \alpha, K) : \mathbb A(F, \alpha, K, c) \longrightarrow \mathbb Z,$$

given by

$$GGW_p^{(P_0, c)}(F, \alpha, K)(a) := \langle \delta(a), [\mathcal M_p(\lambda, \hat h)^*]^{\mathrm{vir}} \rangle.$$
The 6 properties listed in the proposition above show that:

Remark 1.2. Let $G$ be a set of homogeneous generators of $H^*_{\hat K}(F, \mathbb Z)$, regarded as a graded $H^*(BK_0, \mathbb Z)$-algebra. Then $\mathbb A$ is generated as a graded $\mathbb Z$-algebra by elements of the form $\binom{c}{\beta}$ with $c \in G$, $\beta \in H_*(\Sigma, \mathbb Z)$, and $\deg c > \deg\beta$.

Suppose for example that we are in the simple situation where $\hat K$ splits as $\hat K = U(r) \times K_0$ and $F$ is contractible. In this case the graded algebra

$$H^*_{\hat K}(F, \mathbb Z) = H^*(B\hat K, \mathbb Z) = H^*(BU(r), \mathbb Z) \otimes H^*(BK_0, \mathbb Z)$$

is generated as a $H^*(BK_0, \mathbb Z)$-algebra by the universal Chern classes $c_i \in H^*(BU(r), \mathbb Z)$, $1 \le i \le r$, and one has a natural identification

$$\mathbb A \simeq \mathbb Z[u_1, \dots, u_r, v_2, \dots, v_r] \otimes \Lambda^*\Big[\bigoplus_{i=1}^{r} H_1(\Sigma, \mathbb Z)_i\Big]. \qquad (I)$$
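Here $u_i = \binom{c_i}{[*]}$, $v_i = \binom{c_i}{[\Sigma]}$ have degree $2i$ and $2i - 2$ respectively, whereas

$$H_1(\Sigma, \mathbb Z)_i := \Big\{ \tbinom{c_i}{\beta} \ \Big|\ \beta \in H_1(\Sigma, \mathbb Z) \Big\}$$

is a copy of $H_1(\Sigma, \mathbb Z)$ whose elements are homogeneous of degree $2i - 1$.

Note also that in the case $\hat K = K \times K_0$, $\hat P$ splits as the fibre product of a $K$-bundle $P$ and $P_0$, and the gauge group $\mathcal G$ can be identified with $\mathrm{Aut}(P)$. Similarly, the universal $\hat K$-bundle $\hat{\mathcal P}$ over $\mathcal B^* \times \Sigma$ splits as the fibre product of the universal $K$-bundle $\mathcal P := \mathcal A^* \times_{\mathcal G} P$ with the $K_0$-bundle $\mathrm{pr}_\Sigma^*(P_0)$. If $K = U(r)$, one can use this bundle to give a geometric interpretation of the images via $\delta$ of the classes $u_i$, $v_i$, $\binom{c_i}{\beta} \in H_1(\Sigma, \mathbb Z)_i$ defined above:

$$\delta(u_i) = c_i(\mathcal P)/[*], \qquad \delta(v_i) = c_i(\mathcal P)/[\Sigma], \qquad \delta\tbinom{c_i}{\beta} = c_i(\mathcal P)/\beta.$$

In the special case $r = 1$, one just gets $\mathbb A \simeq \mathbb Z[u] \otimes \Lambda^*(H_1(\Sigma, \mathbb Z))$. This shows that when $\hat K = S^1 \times K_0$ and $F$ is contractible, the gauge theoretical Gromov–Witten invariants can be described by an inhomogeneous form

$$GGW_p^{(P_0, c)}(F, \alpha, S^1) \in \Lambda^*(H^1(\Sigma, \mathbb Z))$$

setting

$$GGW_p^{(P_0, c)}(F, \alpha, S^1)(l) := \Big\langle \delta\Big(\sum_{j \ge 0} u^j \cup l\Big), [\mathcal M_p(\lambda, \hat h)^*]^{\mathrm{vir}} \Big\rangle$$

for any $l \in \Lambda^*(H_1(\Sigma, \mathbb Z))$. Here $(\lambda, \hat h)$ represents $c$ and $p$ is a good continuous parameter.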
1.2. Special cases. Twisted Gromov–Witten invariants. This is the special case K = {1}. Here the gauge group G is trivial, the moduli space M is the space of JA0 -holomorphic sections of the bundle E, and giving c is equivalent to fixing a homotopy class h0 of sections in P0 ×K0 F . The resulting invariants, when defined, should be regarded as twisted Gromov–Witten invariants, because we have replaced the space of F -valued maps on in the definition of the standard Gromov–Witten invariants ([Gr, LiT, R]) by the space of sections in a F -bundle P0 ×K0 F . These invariants are associated with the almost Kähler manifold F , the K0 -action, and they depend on the discrete parameter (P0 , h0 ) and the continuous parameter A0 . The invariants are defined on a graded algebra A(F, α, h0 ) obtained by applying the construction above in this special case. Note that even in the particular case when the bundle P0 is trivial, varying the parameter connection A0 provides interesting deformations of the usual Gromov–Witten moduli spaces. In some situations one can prove a transversality result with respect to the parameter A0 and then compute the standard Gromov–Witten invariants using a general parameter.
Equivariant symplectic quotients. This is the special case where the $K$-action on $\mu^{-1}(0)$ is free and $\mu$ is a submersion around $\mu^{-1}(0)$. In this case our data define a symplectic factorization problem, and one has a symplectic quotient $F_\mu := \mu^{-1}(0)/K$ with an induced compatible almost complex structure and an induced almost holomorphic $K_0$-action $\alpha_\mu$. When $K_0 \neq \{1\}$, the system $(F, \alpha, K, \mu)$ should be called symplectic factorization problem with additional symmetry, since the symplectic manifold $F$ was endowed with a larger symmetry than the Hamiltonian symmetry used in performing the symplectic factorization.

For any homotopy class $h_0$ of sections in $P_0 \times_{K_0} F_\mu$ one can consider the twisted Gromov–Witten invariants of the pair $(F_\mu, \alpha_\mu)$ corresponding to the parameters $(P_0, h_0)$ and $A_0$. One can associate to $h_0$ a class $c(h_0) = [\lambda, \hat h]$ as follows. We choose a section $\varphi_0 \in h_0$ regarded as a $K_0$-equivariant map $P_0 \to F_\mu$, put $\hat P := P_0 \times_{F_\mu} \mu^{-1}(0)$ endowed with the natural $\hat K$-action and the obvious morphism $\hat P \xrightarrow{\lambda} P_0$, and let $\hat h$ be the class of the section defined by the $\hat K$-equivariant map $(p_0, f) \mapsto f$. It is then an interesting and natural problem to compare the twisted Gromov–Witten invariants of the pair $(F_\mu, \alpha_\mu)$ with the gauge theoretical Gromov–Witten invariants of the initial triple $(F, \alpha, K)$ via the natural morphism $\mathbb A(F, \alpha, K, c(h_0)) \to \mathbb A(F_\mu, \alpha_\mu, h_0)$. In the non-twisted case $K_0 = \{1\}$, this problem was treated in [G, CGS].

1.3. Main results. In Sect. 2 we study the moduli spaces $\mathcal M_t(E, E_0, A_0)$ of solutions of the vortex type equations over Riemann surfaces $(\Sigma, g)$, associated with the triple $(\mathrm{Hom}(\mathbb C^r, \mathbb C^{r_0}), \alpha_{\mathrm{can}}, U(r))$ and the moment map $\mu_t(f) = \frac{i}{2} f^* \circ f - it\,\mathrm{id}$, $t \in \mathbb R$.

In Sect. 2.1 we introduce the gauge theoretical quot space $GQuot^{E}_{\mathcal E_0}$ of a holomorphic bundle $\mathcal E_0$ on a general compact complex manifold $X$. The space $GQuot^{E}_{\mathcal E_0}$ parametrizes the quotients of $\mathcal E_0$ with locally free kernels of fixed $C^\infty$-type $E$, and can be identified with the corresponding analytical quot space when $X$ is a curve. We prove a transversality result (Proposition 2.4) which states that, when $X$ is a curve, $GQuot^{E}_{\mathcal E_0}$ is smooth and has the expected dimension for an open dense set of holomorphic structures $\mathcal E_0$ in a fixed $C^\infty$-bundle $E_0$.

In Sect. 2.2 we use the Kobayashi–Hitchin correspondence for the vortex equation [B] over a Riemann surface $(\Sigma, g)$, to identify the irreducible part $\mathcal M^*_t(E, E_0, A_0)$ of $\mathcal M_t(E, E_0, A_0)$ with the gauge theoretical moduli space of $\frac{\mathrm{Vol}_g(\Sigma)}{2\pi} t$-stable pairs. The latter can be identified with a gauge theoretical quot space when $t$ is sufficiently large (Corollary 2.8, Proposition 2.9).

In Sect. 2.3 we prove transversality and compactness results for the moduli spaces $\mathcal M_t(E, E_0, A_0)$.

In Sect. 3 we introduce formally our gauge theoretical Gromov–Witten invariants for the triple $(\mathrm{Hom}(\mathbb C^r, \mathbb C^{r_0}), \alpha_{\mathrm{can}}, U(r))$ and prove an explicit formula in the abelian case $r = 1$. We define the invariants using Brussee's formalism of virtual fundamental classes associated with Fredholm sections [Br] applied to the sections cutting out the moduli spaces $\mathcal M^*_t(E, E_0, A_0)$. The comparison Theorem 3.2 states that one can alternatively
use the virtual fundamental class of the corresponding moduli space of stable pairs. This provides a complex geometric interpretation of our invariants. The results of Sect. 2, and a complex geometric description of the abelian quot spaces as complete intersections in projective bundles, enable us to explicitly compute the full invariant in the abelian case $r = 1$:

Theorem. Put $v = \chi(\mathrm{Hom}(L, E_0)) - (1 - g(\Sigma))$. The Gromov–Witten invariant $GGW_p^{(E_0, c_d)}(\mathrm{Hom}(\mathbb C, \mathbb C^{r_0}), \alpha_{\mathrm{can}}, U(1)) \in \Lambda^*(H^1(\Sigma, \mathbb Z))$ is given by the formula

$$GGW_p^{(E_0, c_d)}(\mathrm{Hom}(\mathbb C, \mathbb C^{r_0}), \alpha_{\mathrm{can}}, U(1))(l) = \Big\langle \sum_{i \ge \max(0,\, g(\Sigma) - v)}^{g(\Sigma)} \frac{(r_0 \Theta)^i}{i!} \wedge l,\ l_{O_1} \Big\rangle$$

for any $l \in \Lambda^*(H_1(\Sigma, \mathbb Z))$. Here $\Theta$ is the class in $\Lambda^2(H_1(\Sigma, \mathbb Z))$ given by the intersection form on $\Sigma$, and $l_{O_1}$ is the generator of $\Lambda^{2g(\Sigma)}(H^1(\Sigma, \mathbb Z))$ defined by the complex orientation $O_1$ of $H^1(\Sigma, \mathbb R)$.

As an application we give in Sect. 3.4 an explicit formula for the number of points in certain abelian quot spaces of expected dimension 0. This answers a classical problem in Algebraic Geometry. A generalisation of this result to the case $r > 1$ requires a wall-crossing formula for the non-abelian invariants.

The main result of Sect. 4 is a natural identification of the full Seiberg–Witten invariants of ruled surfaces with certain abelian gauge theoretical Gromov–Witten invariants. This result is a direct consequence of two important comparison theorems: The standard description of the effective divisors on a ruled surface $X := \mathbb P(\mathcal V_0) \xrightarrow{\pi} \Sigma$ over a curve identifies the Hilbert schemes of effective divisors on $X$ with certain quot schemes associated with symmetric powers of the 2-bundle $\mathcal V_0$ over $\Sigma$. In Sect. 4.1 we show that, if one replaces the Hilbert schemes and the quot schemes by their gauge theoretical analoga $GDou$, $GQuot$, one has

Theorem. For every $C^\infty$-line bundle $L$ on $\Sigma$, there is a canonical isomorphism of complex spaces

$$GDou(\pi^*(L) \otimes \mathcal O_{\mathbb P(V_0)}(n)) \simeq GQuot^{L^\vee}_{S^n(\mathcal V_0)}$$

which maps the virtual fundamental class $[GDou(\pi^*(L) \otimes \mathcal O_{\mathbb P(V_0)}(n))]^{\mathrm{vir}}$ to the virtual fundamental class $\big[GQuot^{L^\vee}_{S^n(\mathcal V_0)}\big]^{\mathrm{vir}}$.

On the other hand, the gauge theoretical Douady space on the left can be identified with the moduli space of monopoles on $X$ which corresponds to the $\pi^*(L) \otimes \mathcal O_{\mathbb P(V_0)}(n)$-twisted canonical $Spin^c$-structure. In Sect. 4.2, we show that this identification respects virtual fundamental classes too.

Combining all these results we see that the full Seiberg–Witten invariant of the ruled surface $X$ as defined in [OT2] can be identified with a corresponding gauge theoretical Gromov–Witten invariant for the triple $(\mathrm{Hom}(\mathbb C, \mathbb C^{n+1}), \alpha_{\mathrm{can}}, S^1)$. Using the explicit formula proven in Sect. 3, one gets an independent check of the universal wall-crossing formula for the full Seiberg–Witten invariant in the case $b_+ = 1$.

2. Moduli Spaces Associated with the Triple $(\mathrm{Hom}(\mathbb C^r, \mathbb C^{r_0}), \alpha_{\mathrm{can}}, U(r))$

Because of the very technical compactification problem, we will not introduce our gauge theoretical invariants formally in the general framework described in Sect. 1.1. Instead
we specialize to the case $K = U(r)$, $\hat K = U(r) \times U(r_0)$, and $F = \mathrm{Hom}(\mathbb C^r, \mathbb C^{r_0})$ endowed with the natural left $\hat K$-action. The $K$-action on $F$ has the following family of moment maps,

$$\mu_t(f) = \frac{i}{2} f^* \circ f - it\,\mathrm{id}, \qquad t \in \mathbb R,$$

which are all $\hat K$-equivariant. Since $F$ is contractible, one has only one homotopy class of sections in any fixed $F$-bundle. Hence in this case our topological data reduce to the data of a differentiable Hermitian bundle $E_0$ of rank $r_0$ and a class of differentiable Hermitian bundles $E$ of rank $r$. Therefore, when we fix the bundle $E_0$, the set of equivalence classes of pairs $(\lambda, \hat h)$ as above can be identified with $\mathbb Z$ via the map $E \mapsto \deg(E)$. We will denote the class corresponding to an integer $d$ by $c_d$.

Our moduli problem becomes now: Let $A_0$ be a fixed Hermitian connection in $E_0$ and let $\mathcal E_0$ be the associated holomorphic bundle. Classify all pairs $(A, \varphi)$ consisting of a Hermitian connection $A$ in $E$ and a $(A, A_0)$-holomorphic morphism $\varphi : E \to E_0$ such that the following vortex type equation is satisfied:

$$i\Lambda F_A - \frac{1}{2} \varphi^* \circ \varphi = -t\,\mathrm{id}_E.$$

Our first purpose is to show that, in a suitable chamber, the moduli space of solutions of this equation can be identified with a certain space of quotients of the holomorphic bundle $\mathcal E_0$. This remark will allow us later to describe the invariants explicitly in the abelian case $r = 1$.
2.1. Gauge theoretical quot spaces. Let E0 be a holomorphic bundle of rank r0 on a compact connected complex manifold X of dimension n, and fix a differentiable vector bundle E of rank r on X. There is a simple gauge theoretical way to construct a complex space GQuot E E0 parametrizing equivalence classes of pairs (E, ϕ) consisting of a holomorphic bundle E of C ∞ -type E and a sheaf monomorphism ϕ : E − 2π (n−1)!Volg (X) rk(E) . E0
Proof. Indeed, integrating the equation $(V_t^{A_0})$ over $X$ one finds

$$t > -\frac{2\pi}{(n-1)!\,\mathrm{Vol}_g(X)}\,\frac{\deg(E)}{\mathrm{rk}(E)}$$

when $(V_t^{A_0})$ has solutions with non-vanishing $\varphi$-component. Conversely, if $t > -\frac{2\pi}{(n-1)!\,\mathrm{Vol}_g(X)}\,\frac{\deg(E)}{\mathrm{rk}(E)}$, any solution $(A, \varphi)$ must have $\varphi \neq 0$, so it must be irreducible. Using the theorem we get an isomorphism $\mathcal M_t(E, E_0, A_0) \simeq \mathcal M^{\mathrm{st}}_\tau(E, \mathcal E_0)$, where $\tau > -\frac{\deg(E)}{\mathrm{rk}(E)}$. Since any non-trivial morphism defined on a holomorphic line bundle is generically injective, we see that $\mathcal M^{\mathrm{st}}_\tau(E, \mathcal E_0) = GQuot^{E}_{\mathcal E_0}$ if $\tau > -\frac{\deg(E)}{\mathrm{rk}(E)}$ and $r = 1$. □
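The integration step used in this proof can be spelled out as follows. This is only a sketch under normalizations that are assumptions here, since they are not fixed in this part of the text: $\deg(E) = \langle c_1(E) \cup [\omega_g]^{n-1}, [X]\rangle$ with $c_1(E)$ represented by $\frac{i}{2\pi}\mathrm{Tr}\,F_A$, $vol_g = \omega_g^n/n!$, $|\varphi|^2 = \mathrm{Tr}(\varphi^* \circ \varphi)$, and the vortex equation in the form $i\Lambda F_A - \frac12 \varphi^* \circ \varphi = -t\,\mathrm{id}_E$.

```latex
% Assumed conventions (not taken from the source):
%   deg(E) = < c_1(E) u [omega_g]^{n-1}, [X] >,  c_1(E) = [ (i/2pi) Tr F_A ],
%   vol_g = omega_g^n / n!,  |varphi|^2 = Tr(varphi^* o varphi),  r = rk(E).
\begin{align*}
% Step 1: take the trace of the vortex equation.
\mathrm{Tr}(i\Lambda F_A) - \tfrac12 |\varphi|^2 &= -t\,r, \\
% Step 2: integrate the curvature term, using
%         (\Lambda\eta)\,\omega^n/n! = \eta\wedge\omega^{n-1}/(n-1)!  for (1,1)-forms \eta.
\int_X \mathrm{Tr}(i\Lambda F_A)\, vol_g
  &= \int_X i\,\mathrm{Tr}(F_A) \wedge \frac{\omega_g^{\,n-1}}{(n-1)!}
   = \frac{2\pi}{(n-1)!}\,\deg(E), \\
% Step 3: integrate the whole traced equation over X.
\frac{2\pi}{(n-1)!}\,\deg(E) - \tfrac12 \|\varphi\|_{L^2}^2
  &= -t\, r\, \mathrm{Vol}_g(X), \\
% Step 4: solve for t.
t &= -\frac{2\pi}{(n-1)!\,\mathrm{Vol}_g(X)}\,\frac{\deg(E)}{r}
     + \frac{\|\varphi\|_{L^2}^2}{2\,r\,\mathrm{Vol}_g(X)} .
\end{align*}
```

Under these assumptions the last line shows that $t$ is strictly larger than the stated bound exactly when $\varphi \neq 0$, and equal to it when $\varphi = 0$, which is the dichotomy the proof relies on.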
In the non-abelian case, one has the following generalization of Corollary 2.8:

Proposition 2.9. There exists a constant $c(\mathcal E_0, E)$ such that, for all $\tau \ge c(\mathcal E_0, E)$ the following holds:
(i) For every $\tau$-semistable pair $(\mathcal E, \varphi)$, $\varphi$ is injective.
(ii) Every pair $(\mathcal E, \varphi)$ with $\varphi$ injective is $\tau$-stable.
(iii) There is a natural isomorphism $\mathcal M^{\mathrm{st}}_\tau(E, \mathcal E_0) = GQuot^{E}_{\mathcal E_0}$.
For all sufficiently large $t \in \mathbb R$, one has $\mathcal M_t(E, E_0, A_0) = \mathcal M^*_t(E, E_0, A_0)$ and a natural identification $\mathcal M_t(E, E_0, A_0) = GQuot^{E}_{\mathcal E_0}$.

Proof. (i) Note first that, if $\ker(\varphi) \neq 0$, the second inequality of the stability condition for $\mathcal F = \ker(\varphi)$ implies $\deg(\mathrm{im}(\varphi)) \ge d + \tau\,\mathrm{rk}(\ker(\varphi))$. But $\mathrm{im}(\varphi)$ is a non-trivial subsheaf of the fixed bundle $\mathcal E_0$, so one has an estimate of the form $\deg(\mathrm{im}(\varphi)) \le C(\mathcal E_0)$, where $C(\mathcal E_0) = \sup_{\mathcal G \subset \mathcal E_0} \deg(\mathcal G)$ [Ko]. Therefore, as soon as

$$\tau > c(\mathcal E_0, E) := \max_{1 \le i \le r-1} \frac{C(\mathcal E_0) - d}{i},$$

any $\tau$-semistable pair $(\mathcal E, \varphi)$ has an injective $\varphi$.

(ii) Suppose now that $\varphi$ is injective. The second part of the stability condition becomes empty, hence we only have to show that $\deg(\mathcal F) < d + \tau(r - \mathrm{rk}(\mathcal F))$ for all subsheaves $\mathcal F$ of $\mathcal E$ with $0 < \mathrm{rk}(\mathcal F) < r$. But if $\tau$ is larger than $c(\mathcal E_0, E)$, it follows that $d + \tau(r - s) > C(\mathcal E_0)$ for all $0 < s < r$. The inequality above is now automatically satisfied, since $\mathcal F$ can be regarded as a subsheaf of $\mathcal E_0$ via $\varphi$.

(iii) This follows directly from (i), (ii) and Definition 2.6. The last statement follows from (iii) and the fact that any solution with generically injective $\varphi$-component is irreducible. □

Corollary 2.8 shows that in the abelian case the moduli space $\mathcal M^*_t(E, E_0, A_0)$ is either empty or can be identified with a quot space. In the non-abelian case, the space of parameters $(t, g, A_0)$ has a chamber structure which can be very complicated. The wall in this parameter space consists of those points $(t, g, A_0)$ for which reducible solutions $(A, \varphi)$ appear in the moduli space $\mathcal M_t(E, E_0, A_0)$. Note that a solution $(A, \varphi)$ is reducible if and only if either $\varphi = 0$, or $A$ is reducible and $\varphi$ vanishes on an $A$-parallel summand of $E$. When the parameter $(t, g, A_0)$ crosses the wall, the corresponding moduli space changes by a "generalized flip" [Th1, Th2, OST].

Let $\mathbb E := \mathcal A^* \times_{\mathcal G} E$ be the universal complex bundle over $\mathcal B^* \times \Sigma$ associated with $E$. This bundle is the dual of the vector bundle $\mathcal P \times_{U(r)} \mathbb C^r$, where $\mathcal P$ is the universal $K$-bundle introduced in Sect. 1.1. In order to compute the gauge theoretical Gromov–Witten invariants we will need an explicit description of the restriction of this bundle to $\mathcal M^*_t(E, E_0, A_0) \times \Sigma$. The following proposition provides a complex geometric interpretation of this bundle via the isomorphism given by Corollary 2.8, Proposition 2.9.

Proposition 2.10. Suppose that $t$ is large enough so that the Kobayashi–Hitchin correspondence defines an isomorphism $\mathcal M^*_t(E, E_0, A_0) \simeq GQuot^{E}_{\mathcal E_0}$. Via this isomorphism the restriction of the universal bundle $\mathbb E$ to $\mathcal M^*_t(E, E_0, A_0) \times \Sigma$ can be identified with the kernel of the universal quotient $p_\Sigma^*(\mathcal E_0) \to \mathcal Q$ over $GQuot^{E}_{\mathcal E_0} \times \Sigma$.
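Returning to part (ii) of the proof of Proposition 2.9, the step from $\tau > c(\mathcal E_0, E)$ to the stability inequality can be written out explicitly. The following is a sketch; the abbreviation $s := \mathrm{rk}(\mathcal F)$ follows the proof, while the use of $\deg(\varphi(\mathcal F)) = \deg(\mathcal F)$ for injective $\varphi$ is an assumption made here to fill in the step "$\mathcal F$ can be regarded as a subsheaf of $\mathcal E_0$ via $\varphi$".

```latex
% s := rk(F) with 0 < s < r, so that i := r - s lies in {1, ..., r-1}.
% Assumption: for injective varphi, F and varphi(F) are isomorphic sheaves,
% hence deg(varphi(F)) = deg(F).
\begin{align*}
\tau > c(\mathcal E_0, E)
   = \max_{1 \le i \le r-1} \frac{C(\mathcal E_0) - d}{i}
  \ &\Longrightarrow\ \tau > \frac{C(\mathcal E_0) - d}{r - s}
     && (\text{take } i = r - s) \\
  &\Longrightarrow\ d + \tau (r - s) > C(\mathcal E_0)
     && (\text{multiply by } r - s > 0) \\
  &\Longrightarrow\ d + \tau (r - s) > \deg(\varphi(\mathcal F)) = \deg(\mathcal F)
     && (\varphi(\mathcal F) \subset \mathcal E_0).
\end{align*}
```

The last line is the inequality $\deg(\mathcal F) < d + \tau(r - \mathrm{rk}(\mathcal F))$ required for $\tau$-stability.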
2.3. Transversality and compactness for moduli spaces of vortices. We first prove a simple regularity result for moduli spaces of vortices over curves.

Proposition 2.11. Let $X$ be a curve.
(i) The moduli space $\mathcal M^*_t(E, E_0, A_0)$ is smooth of expected dimension in every point $[A, \varphi]$ with $\varphi$ generically surjective.
(ii) There is a dense second category set $C \subset \mathcal A(E_0)$ such that, for every $A_0 \in C$ and every $t \in \mathbb R$, the open part $\mathcal M_t(E, E_0, A_0)_{\mathrm{inj}} \subset \mathcal M^*_t(E, E_0, A_0)$, consisting of classes of pairs with generically injective $\varphi$-component, is smooth of expected dimension.

Proof. (i) Since the Kobayashi–Hitchin correspondence is an isomorphism of real analytic spaces, it suffices to study the regularity of the moduli space $\mathcal M^{\mathrm{st}}_\tau(E, \mathcal E_0)$ in the point $[\bar\partial_A, \varphi] \in \mathcal M^{\mathrm{st}}_\tau(E, \mathcal E_0)$ which corresponds to $[A, \varphi]$. The first differential

$$D^1_{\bar\partial_A, \varphi} : A^{1}\mathrm{End}(E) \times A^{0}\mathrm{Hom}(E, E_0) \to A^{0,1}\mathrm{Hom}(E, E_0)$$

in the elliptic complex associated with the $\tau$-stable pair $(\bar\partial_A, \varphi)$ is given by

$$D^1_{\bar\partial_A, \varphi}(\alpha, \phi) = \bar\partial_{A, A_0}\phi - \varphi \circ \alpha.$$

It suffices to see that, after suitable Sobolev completions, the first order differential operator $D^1_{\bar\partial_A, \varphi}$ is surjective. Let $\beta \in A^{0,1}\mathrm{Hom}(E, E_0)$ be $L^2$-orthogonal to $\mathrm{im}(D^1_{\bar\partial_A, \varphi})$. Note that the linear map $\mathrm{End}(\mathbb C^r) \to \mathrm{Hom}(\mathbb C^r, \mathbb C^{r_0})$ given by $A \mapsto \Phi \circ A$ is surjective when $\Phi$ is surjective. Therefore, as in the proof of Proposition 2.4, we find that $\beta$ vanishes as distribution, hence as a Sobolev section as well, on the open set where $\varphi$ is surjective. But since $\beta$ solves an elliptic second order system with scalar symbol, it follows that $\beta = 0$.

(ii) Note that $\mathcal M_t(E, E_0, A_0)_{\mathrm{inj}}$ can be identified via the Kobayashi–Hitchin correspondence with an open subspace of $GQuot^{E}_{\mathcal E_0}$. Therefore the statement follows from Proposition 2.4. □

Theorem 2.12. Let $(X, g)$ be a compact Kähler manifold of dimension $n$, $E$ and $E_0$ Hermitian bundles on $X$ of ranks $r$ and $r_0$ respectively. Suppose that either $n = 1$ or $r = 1$. Then the moduli spaces $\mathcal M_t(E, E_0, A_0)$ are compact for every $t \in \mathbb R$ and for every integrable Hermitian connection $A_0 \in \mathcal A(E_0)$. In particular, the moduli space $GQuot^{E}_{\mathcal E_0}$ is compact if $X$ is a curve or $\mathrm{rk}(E) = 1$.

Proof. The Hermite–Einstein type equation

$$i\Lambda F_A - \frac{1}{2}\varphi^* \circ \varphi = -t\,\mathrm{id}_E$$

implies

$$\mu(E) - \frac{(n-1)!}{4\pi r}\,\|\varphi\|^2 = -\frac{(n-1)!\,\mathrm{Vol}_g(X)}{2\pi}\, t.$$
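The Weitzenböck formula for holomorphic sections in the holomorphic Hermitian bundle $E^\vee \otimes E_0$ with Chern connection $B := A^\vee \otimes A_0$ yields

$$\begin{aligned}
i\Lambda\bar\partial\partial(\varphi, \varphi) &= (i\Lambda F_B(\varphi), \varphi) - |\partial_B \varphi|^2 \le ((i\Lambda F_{A_0}) \circ \varphi - \varphi \circ (i\Lambda F_A), \varphi) \\
&= (i\Lambda F_{A_0}(\varphi), \varphi) - (i\Lambda F_A, \varphi^* \circ \varphi) \\
&= (i\Lambda F_{A_0}(\varphi), \varphi) + t|\varphi|^2 - \frac{1}{2}|\varphi^* \circ \varphi|^2.
\end{aligned}$$

Notice that $|\varphi^* \circ \varphi|^2 \ge \frac{1}{r}|\varphi|^4$. Let $x_0$ be a point where the supremum of the function $|\varphi|^2$ is attained, and let $\lambda_M^{A_0}$ be the supremum of the highest eigenvalues of the Hermitian bundle endomorphism $i\Lambda F_{A_0}$. By the maximum principle we get

$$0 \le [i\Lambda\bar\partial\partial|\varphi|^2]_{x_0} \le (\lambda_M^{A_0} + t)\,|\varphi(x_0)|^2 - \frac{1}{2r}\,|\varphi(x_0)|^4.$$

Therefore we have the following a priori $C^0$-bound for the second component of a solution of $(V_t^{A_0})$:

$$\sup_X |\varphi|^2 \le \max(0,\ 2r(\lambda_M^{A_0} + t)).$$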
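The passage from the maximum principle inequality to the a priori $C^0$-bound is elementary; a short sketch follows. The abbreviations $m$ and $\lambda$ are introduced here only for readability and are not notation from the source.

```latex
% Abbreviations (introduced here for illustration only):
%   m := |varphi(x_0)|^2 = sup_X |varphi|^2,   lambda := lambda_M^{A_0}.
\begin{align*}
0 \;\le\; (\lambda + t)\, m - \frac{1}{2r}\, m^2
  \;=\; m \Big( (\lambda + t) - \frac{m}{2r} \Big).
\end{align*}
% Case m = 0: the bound  m <= max(0, 2r(lambda + t))  holds trivially.
% Case m > 0: dividing by m preserves the inequality, so
\begin{align*}
0 \;\le\; (\lambda + t) - \frac{m}{2r}
  \quad\Longrightarrow\quad m \;\le\; 2r(\lambda + t).
\end{align*}
% In both cases:  sup_X |varphi|^2 = m <= max(0, 2r(lambda + t)).
```

In particular, when $\lambda_M^{A_0} + t \le 0$ the second case cannot occur with $m > 0$ unless $\lambda_M^{A_0} + t = 0$ is excluded by strictness, so the bound forces $\varphi = 0$ whenever $\lambda_M^{A_0} + t < 0$.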
Now, if r = 1, one can bring A in Coulomb gauge with respect to a fixed connection A0 in E by a gauge transformation gA . Moreover, one can choose gA so that the projection of gA (A) − A0 on the kernel of the operator d + + d ∗ : iA1 (X) −→ i[(A0,2 (X) + A2,0 (X) + A0,0 (X)ωg ) ∩ A2 (X)] (which coincides with the harmonic space iH1 (X) in the Kählerian case) belongs to a fixed fundamental domain D of the lattice iH 1 (X, Z). Now standard bootstrapping arguments apply as in the case of the abelian monopole equations [KM]. If X is a curve, then the contraction operator # is an isomorphism, so one gets an a priori L∞ -bound for the curvature of the connection component. The result follows now from Uhlenbeck’s compactness theorems for connections with Lp -bound on the curvature [U]. $ % Corollary 2.13. Let X be a projective manifold endowed with an ample line bundle H , and let PL be the Hilbert polynomial of a locally free sheaf L of rank 1 with respect to PE −PL H . Then the analytic quot space Quot E0 0 is compact. Proof. Indeed, by Remark 2.3, the gauge theoretical quot space GQuot L E0 is an open PE0 −PL . But any torsion free sheaf on subspace of the underlying analytic space of Quot E0 ∞ X with Hilbert polynomial PL is a line bundle of C -type L, so that the open embedding PE −PL
0 GQuotL E0 0). Proof. Let C0 be the standard connection induced by the Levi-Civita connection in the line bundle KX−1 . Using the substitutions A := C0 ⊗ B ⊗2 with B ∈ A(M) and A =: ϕ + α ∈ A0 (M) ⊕ A0,2 (M), the configuration space of unknowns becomes A = A(M) × [A0 (M) ⊕ A0,2 (M)], and a pair (B, ϕ + α) solves the twisted monopole γ equation (SWβ M ) if −FA02 + α ⊗ ϕ¯ = 0,
∂¯B (ϕ) − i#∂B (α) = 0, ¯ = 0. i#g (FA + 2π iβ) + (ϕ ϕ¯ − ∗(α ∧ α)) We denote by A∗ the open subspace of A with non-trivial spinor component, and by B ∗ ∗ its quotient A G by the gauge group G = C ∞ (X, S 1 ). Let ei (B, ϕ, α), i = 1, . . . , 3 stand for the map of A defined by the left hand term of the i th equation above. This map induces a section ε i in a certain bundle H i over B ∗ which is associated with the principal G-bundle A∗ → B ∗ . γ The Seiberg–Witten moduli space Wβ M is the analytic subspace of B ∗ cut out by the Fredholm section ε = (ε1 , ε2 , ε3 ) in the bundle H := ⊕H i , and the virtual fundamental γ class [Wβ M ]vir is by definition the virtual fundamental class in the sense of Brussee [Br], associated with this section and the complex orientation data. We define a bundle morphism q : H → H 2 := A∗ ×G A0,2 (M) by 1 q(B,ϕ,α) (x 1 , x 2 , x 3 ) = ∂¯B x 2 + x 1 ϕ. 2 One easily checks that 1 q ◦ ε(B, ϕ, α) = ( |ϕ|2 + ∂¯B ∂¯B∗ )α. 2 Suppose now that (2c1 (M) − c1 (KX ) − [β]) ∪ [ωg ], [X] < 0. Integrating the third equation over X, one sees that any solution of the equations has a nontrivial ϕcomponent. The space ASW of solutions is therefore contained in the open subspace A◦ consisting of triples (B, ϕ, α) with ϕ = 0. But the operator ( 21 |ϕ|2 + ∂¯B ∂¯B∗ ) is invertible for ϕ = 0. It follows that on B ◦ := A◦ /G the section ε := q ◦ ε is regular around its vanishing locus Z(ε ), and that the submanifold Z(ε ) ⊂ B ◦ is just the submanifold cut out by the equation α = 0. One checks that q is a bundle epimorphism on B ◦ . Set H := ker q. The Associativity γ Property shows now that the virtual fundamental class [Wβ M ]vir can be identified with the virtual fundamental class associated with the Fredholm section ε := ε|Z(ε ) ∈ (Z(ε ), H |Z(ε ) ).
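The identity $q \circ \varepsilon(B, \varphi, \alpha) = (\frac12|\varphi|^2 + \bar\partial_B\bar\partial_B^*)\alpha$ stated above can be checked as follows. This is a sketch under assumptions that are standard on a Kähler surface but are not spelled out in this part of the text: the Kähler identity $[\Lambda, \partial_B] = i\bar\partial_B^*$, the vanishing $F_{C_0}^{0,2} = 0$ for the connection $C_0$ in $K_X^{-1}$ (so that $F_A^{0,2} = 2F_B^{0,2}$ for $A = C_0 \otimes B^{\otimes 2}$), and the convention $(\alpha \otimes \bar\varphi)\varphi = |\varphi|^2\alpha$.

```latex
% Assumed conventions (not taken verbatim from the source):
%   [Lambda, \partial_B] = i \bar\partial_B^*   (Kaehler identity),
%   F_{C_0}^{0,2} = 0,  hence  F_A^{0,2} = 2 F_B^{0,2}  for  A = C_0 (x) B^{(x) 2},
%   (alpha (x) \bar\varphi) \varphi = |\varphi|^2 alpha .
\begin{align*}
% The first two components of epsilon, read off from the monopole equations:
\varepsilon^1 &= -F_A^{0,2} + \alpha \otimes \bar\varphi, \qquad
\varepsilon^2 = \bar\partial_B \varphi - i\Lambda\partial_B \alpha
              = \bar\partial_B \varphi + \bar\partial_B^* \alpha, \\
% (since Lambda alpha = 0 for a (0,2)-form, the Kaehler identity gives
%  -i Lambda \partial_B alpha = \bar\partial_B^* alpha)
\bar\partial_B \varepsilon^2
  &= \bar\partial_B\bar\partial_B \varphi + \bar\partial_B\bar\partial_B^* \alpha
   = F_B^{0,2}\varphi + \bar\partial_B\bar\partial_B^* \alpha, \\
\tfrac12\, \varepsilon^1 \varphi
  &= -\tfrac12 F_A^{0,2}\varphi + \tfrac12 |\varphi|^2 \alpha
   = -F_B^{0,2}\varphi + \tfrac12 |\varphi|^2 \alpha, \\
% Adding the last two lines, the curvature terms cancel:
q \circ \varepsilon
  &= \bar\partial_B \varepsilon^2 + \tfrac12\, \varepsilon^1 \varphi
   = \Big( \tfrac12 |\varphi|^2 + \bar\partial_B\bar\partial_B^* \Big)\alpha .
\end{align*}
```

The cancellation of the $F_B^{0,2}\varphi$ terms is what makes $q \circ \varepsilon$ depend on $\alpha$ only through a non-negative operator, which is invertible when $\varphi \neq 0$.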
In other words, the virtual fundamental class of the Seiberg–Witten moduli space can be identified with the virtual fundamental class of the moduli space V( 2s −π#g β) (M) of ( 2s − π#g β)-vortices in M [OT1]. Recall that V( 2s −π#g β) (M) is defined as the space of equivalence classes of pairs (B, ϕ) ∈ A(M) × [A0 (M) \ {0}] satisfying the equations (−FA02 , ∂¯B ϕ) = 0, s 1 i#g FB + ϕ ϕ¯ + ( − π #g β) = 0. 2 2 Here the first equation is considered as taking values in the subspace 0,1 ¯ G1B,ϕ := {(u, v) ∈ A0,2 X ⊕ A (M)| ∂B v + uϕ = 0}.
0 More precisely, let C ∗ be the quotient C ∗ := A(M) × [A (M) \ {0}] G , and let G1 be the subbundle of the associated bundle
0,1 A(M) × [A0 (M) \ {0}] ×G [A0,2 X ⊕ A (M)] over C ∗ , whose fibre in [B, ϕ] is G1B,ϕ . Let G2 be the trivial bundle C ∗ × A0 (X) and G := G1 ⊕ G2 . The left-hand terms of the equations above define sections g i in the bundles Gi , and the section g = (g 1 , g 2 ) is Fredholm. So far we have shown that the virtual fundamental class of the Seiberg–Witten moduli space can be identified with the virtual fundamental class of the moduli space V( 2s −π#g β) (M) associated with the Fredholm section g and the complex orientations. To complete the proof, we have to identify the virtual fundamental class [V( 2s −π#g β) (M)]vir with the virtual fundamental class [GDou(M)]vir of the corresponding gauge theoretical Douady space. This is again an application of the general principle which states the Kobayashi–Hitchin-type correspondence between moduli spaces associated with Fredholm problems respects virtual fundamental classes. We proceed as in the proof of Theorem 3.2: Consider the exact sequence π 0 −→ G1 −→ G −→ G2 −→ 0
of bundles over C ∗ . The section g 2 = π ◦ g comes from a formal moment map, so one can show: 1. g 2 is regular around Z(g), 2. the natural map ρ : Z(g 2 ) → B¯ inj induces a bijection
Z(g) = V( 2s −π#g β) (M) → GDou(M), and is étale around Z(g). Using the notations of Sect. 4.1, one obtains a natural identification ρ ∗ (E) = G1 |Z(g 2 ) , and g 1 |Z(g 2 ) corresponds via this identification to the section s which defines the virtual fundamental class [GDou(M)]vir . The result follows now by applying again the Associativity Property of virtual fundamental classes. $ %
Recall from [OT2] that with any compact oriented 4-manifold $X$ with $b_+ = 1$ one can associate a full Seiberg–Witten invariant $SW^{\pm}_{X,(O_1, H_0)}(\mathfrak c) \in \Lambda^* H^1(X, \mathbb Z)$ which depends on an equivalence class $\mathfrak c$ of $Spin^c$-structures, an orientation $O_1$ of $H^1(X, \mathbb R)$, and a component $H_0$ of the hyperquadric $H$ of $H^2(X, \mathbb R)$ defined by the equation $x \cdot x = 1$. By the homotopy invariance of the virtual fundamental classes, one has

Remark 4.6. The Seiberg–Witten invariants defined in [OT2] using the generic regularity of Seiberg–Witten moduli spaces with respect to Witten's perturbation [W2], coincide with the Seiberg–Witten invariants defined in [Br] using the virtual fundamental classes of these moduli spaces.

Combining Theorems 2.8, 3.2, 3.8, 4.3, 4.5 we obtain

Corollary 4.7. Consider a ruled surface $X = \mathbb P(V_0)$ over the Riemann surface $\Sigma$ of genus $g$, and a class $\mathfrak c$ of $Spin^c$-structures on $X$. Let $c$ be the Chern class of the determinant line bundle of $\mathfrak c$ and let $w_c := \frac14(c^2 - 3\sigma(X) - 2e(X))$ be the index of $\mathfrak c$. Denote by $[F]$ the class of a fibre of $X$ over $\Sigma$, by $\Theta_c \in \Lambda^2(H_1(X, \mathbb Z)) = \Lambda^2(H^1(X, \mathbb Z))^\vee$ the element defined by

$$\Theta_c(a, b) := \frac12 \langle c \cup a \cup b, [X] \rangle,$$

and let $l_{O_1}$ be the generator of $\Lambda^{2g}(H^1(X, \mathbb Z))$ corresponding to $O_1$. The full Seiberg–Witten invariant of $X$ corresponding to $\mathfrak c$, the complex orientation $O_1$ of the cohomology space $H^1(X, \mathbb R)$, and the component $H_0$ of $H$ which contains the Kähler cone, is given by

$$SW^{\pm}_{X,(O_1, H_0)}(\mathfrak c) = 0$$

if $\langle c, [F] \rangle = 0$; when $\langle c, [F] \rangle \neq 0$, it is given by

$$SW^{-\mathrm{sign}\langle c, [F]\rangle}_{X,(O_1, H_0)}(\mathfrak c) = 0, \qquad
SW^{\mathrm{sign}\langle c, [F]\rangle}_{X,(O_1, H_0)}(\mathfrak c)(l) = \mathrm{sign}\langle c, [F]\rangle \Big\langle \sum_{i \ge \max(0,\, g - \frac{w_c}{2})}^{g} \frac{\Theta_c^{\,i}}{i!} \wedge l,\ l_{O_1} \Big\rangle.$$

Remarks 1. This result cannot be obtained directly using the Kobayashi–Hitchin correspondence for the Seiberg–Witten equations, because the Douady spaces of divisors on ruled surfaces are in general oversized, non-reduced, and they can contain components of different dimensions. Moreover, it is not clear at all whether one can achieve regularity by varying the holomorphic structure $\mathcal V_0$ in $V_0$. This shows that the quot spaces of the form $Quot^{L^\vee}_{S^n(\mathcal V_0)}$ are very special within the class of quot spaces $Quot^{L^\vee}_{\mathcal E_0}$ with $\mathcal E_0$ $C^\infty$-equivalent to $S^n(V_0)$. The theory of gauge theoretical Gromov–Witten invariants and the comparison Theorem 4.3 show that one can however compute the full Seiberg–Witten invariant of $X$ using a quot space $Quot^{L^\vee}_{\mathcal E_0}$ with $\mathcal E_0$ a general holomorphic bundle $C^\infty$-equivalent to $S^n(V_0)$, although such a quot space cannot be identified with a space of divisors of $X$.
2. The result provides an independent check of the universal wall-crossing formula for the full Seiberg–Witten invariant, proven in [OT2]. Note however that, in the formula given in [OT2], the sign in front of $u_c$, which corresponds to $\Theta_c$ above, is wrong. The error was pointed out to us by Markus Dürr, who also checked the corrected formula for a large class of elliptic surfaces [Dü].

References

[A] Aronszajn, N.: A unique continuation theorem for solutions of elliptic partial differential equations or inequalities of the second order. J. de Math. Pures et Appl. 9, 36, 235–249 (1957)
[BDW] Bertram, A., Daskalopoulos, G., Wentworth, R.: Gromov invariants for holomorphic maps from Riemann surfaces to Grassmannians. J. A. M. S. 9, No. 2, 529–571 (1996)
[B] Bradlow, S.B.: Special metrics and stability for holomorphic bundles with global sections. J. Diff. Geom. 33, 169–214 (1991)
[Br] Brussee, R.: The canonical class and the C∞ properties of Kähler surfaces. New York J. Math. 2, 103–146 (1996)
[CGS] Cieliebak, K., Gaio, A.R., Salamon, D.: J-holomorphic curves, moment maps, and invariants of Hamiltonian group actions. Preprint math/9909122
[DK] Donaldson, S., Kronheimer, P.: The Geometry of Four-Manifolds. Oxford: Oxford Univ. Press, 1990
[Dou] Douady, A.: Le problème des modules pour les variétés analytiques complexes. Sem. Bourbaki nr. 277, 17-eme annee 1964/65
[Dü] Dürr, M.: Virtual fundamental classes and Poincaré invariants. Zürich, in preparation
[G] Gaio, A.R.P.: J-holomorphic curves and moment maps. Ph. D. Thesis, University of Warwick, November 1999
[Gh] Ghione, F.: Quot schemes over a smooth curve. Indian Math. Soc. 48, 45–79 (1984)
[Gr] Gromov, M.: Pseudo-holomorphic curves in symplectic manifolds. Invent. Math. 82, 307–347 (1985)
[Ha] Hartshorne, R.: Algebraic geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1977
[HL] Huybrechts, D., Lehn, M.: Stable pairs on curves and surfaces. J. Alg. Geom. 4, 67–104 (1995)
[K] Kobayashi, S.: Differential geometry of complex vector bundles. Princeton, NJ: Princeton Univ. Press, 1987
[KM] Kronheimer, P., Mrowka, T.: The genus of embedded surfaces in the projective plane. Math. Res. Letters 1, 797–808 (1994)
[L] Lange, H.: Höhere Sekantenvarietäten und Vektorbündel auf Kurven. Manuscripta Math. 52, 63–80 (1985)
[Lin] Lin, T.R.: Hermitian–Yang–Mills–Higgs Metrics and stability for holomorphic vector bundles with Higgs Fields. Preprint, Rutgers University, New Brunswick, NJ
[LiT] Li, J., Tian, G.: Virtual moduli cycles and Gromov–Witten invariants of general symplectic manifolds. In: Topics in symplectic 4-manifolds (Irvine, CA, 1996), First Int. Press Lect. Ser., I, Cambridge, MA: Internat. Press, 1998, pp. 47–83
[LL] Lübke, M., Lupaşcu, P.: Isomorphy of the gauge theoretical and the deformation theoretical moduli space of simple holomorphic pairs. Preprint (2001)
[Lu] Lupaşcu, P.: Seiberg–Witten equations and complex surfaces. Ph. D. Thesis, Zürich University, 1998
[LO] Lübke, M., Okonek, Ch.: Moduli spaces of simple bundles and Hermitian-Einstein connections. Math. Ann. 276, 663–674 (1987)
[LT] Lübke, M., Teleman, A.: The Kobayashi–Hitchin correspondence. Singapore: World Scientific Publishing Co., 1995
[Mu1] Mundet i Riera, I.: A Hitchin–Kobayashi correspondence for Kaehler fibrations. J. Reine Angew. Math. 528, 41–80 (2000)
[Mu2] Mundet i Riera, I.: Hamiltonian Gromov–Witten invariants. prep. math. SG/0002121
[OST] Okonek, Ch., Schmitt, A., Teleman, A.: Master spaces for stable pairs. Topology 38, No 1, 117–139 (1998)
[OT1] Okonek, Ch., Teleman, A.: The Coupled Seiberg–Witten Equations, Vortices, and Moduli Spaces of Stable Pairs. Int. J. Math. 6, No. 6, 893–910 (1995)
[OT2] Okonek, Ch., Teleman, A.: Seiberg–Witten invariants for manifolds with b+ = 1, and the universal wall crossing formula. Int. J. Math. 7, No. 6, 811–832 (1996)
[Oxb] Oxbury, W. M.: Varieties of maximal line subbundles. Math. Proc. Cambridge Phil. Soc. 129, no. 1, 9–18 (2000)
[R] Ruan, Y.: Topological sigma model and Donaldson-type invariants in Gromov theory. Duke Math. J. 83, no. 2, 461–500 (1996)
[S] Serre, J. P.: Géométrie algébrique et Géométrie analytique. Ann. Inst. Fourier 6, 1–42 (1956)
[Sm] Smale, S.: An infinite dimensional version of Sard’s theorem. Am. J. Math. 87, 861–866 (1965)
[Su] Suyama, Y.: The analytic moduli space of simple framed holomorphic pairs. Kyushu J. Math. 50, 65–68 (1996)
[Th1] Thaddeus, M.: Stable pairs, linear systems and the Verlinde formula. Invent. Math. 117, 181–205 (1994)
[Th2] Thaddeus, M.: Geometric invariant theory and flips. JAMS 9, 691–725 (1996)
[U] Uhlenbeck, K.: Connections with Lp bounds on curvature. Commun. Math. Phys. 83, 31–42 (1982)
[W1] Witten, E.: The Verlinde algebra and the cohomology of the Grassmannian. In: Geometry, Topology and Physics, Conf. Proc. Lecture Notes Geom. Topology, IV, Cambridge, MA: Internat. Press, 1995, pp. 357–422
[W2] Witten, E.: Monopoles and four-manifolds. Math. Res. Letters 1, 769–796 (1994)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 227, 587 – 603 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Simulation of Topological Field Theories by Quantum Computers
Michael H. Freedman1, Alexei Kitaev1,⋆, Zhenghan Wang2
1 Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA
2 Indiana University, Dept. of Math., Bloomington, IN 47405, USA
Received: 4 May 2001 / Accepted: 16 January 2002
Abstract: Quantum computers will work by evolving a high tensor power of a small (e.g. two) dimensional Hilbert space by local gates, which can be implemented by applying a local Hamiltonian H for a time t. In contrast to this quantum engineering, the most abstract reaches of theoretical physics have spawned “topological models” having a finite dimensional internal state space with no natural tensor product structure and in which the evolution of the state is discrete, H ≡ 0. These are called topological quantum field theories (TQFTs). These exotic physical systems are proved to be efficiently simulated on a quantum computer. The conclusion is two-fold:
1. TQFTs cannot be used to define a model of computation stronger than the usual quantum model “BQP”.
2. TQFTs provide a radically different way of looking at quantum computation. The rich mathematical structure of TQFTs might suggest a new quantum algorithm.
1. Introduction

A topological quantum field theory (TQFT) is a mathematical abstraction, which codifies topological themes in conformal field theory and Chern–Simons theory. The strictly 2-dimensional part of a TQFT is called a topological modular functor (TMF). It (essentially) assigns a finite dimensional complex Hilbert space $V(\Sigma)$ to each surface $\Sigma$ and to any (self)-diffeomorphism h of a surface a linear (auto)morphism $V(h) : V(\Sigma) \to V(\Sigma')$. We restrict attention to unitary topological modular functors (UTMF) and show that a quantum computer can efficiently simulate transformations of any UTMF as a transformation on its computational state space. We should emphasize that both sides of our discussion are at present theoretical: the quantum computer which performs our simulation is also a mathematical abstraction, namely the quantum circuit model (QCM) [D,Y].

⋆ On leave from Landau Institute for Theoretical Physics, Moscow.
Very serious proposals exist for realizing this model, perhaps in silicon, e.g. [Ka], but we will not treat this aspect. There is a marked analogy between the development of the QCM from 1982 Feynman [Fey] to the present, and the development of recursive function theory in the 1930’s and 1940’s. At the close of the earlier period, “Church’s thesis” proclaimed the uniqueness of all models of (classical) calculation: recursive function theory, Turing machine, λ-calculus, etc. This result was refined in the 1960s, by showing that most “natural” models are polynomially equivalent to the Turing machine. The present paper can be viewed as supporting a similar status for QCM as the inherently quantum mechanical model of calculation. The modern reconsideration of computation is founded on the distinction between polynomial time and slower algorithms. Of course, all functions computed in the QCM can be computed classically, but probably not in comparable time. Assigning to an integer its factors, while polynomial time in QCM [Sh], is nearly exponential time, $\exp(O(n^{1/3}\,\mathrm{poly}(\log n)))$ (an empirical bound; the proved one is even worse), according to the most refined classical algorithms. The origin of this paper is in the thought [Fr] that, since ordinary quantum mechanics appears to confer a substantial speed-up over classical calculations, some principle borrowed from the early, string, universe might go still further. Each TQFT is an instance of this question since their discrete topological nature lends itself to translation into computer science. We answer here in the negative by showing that for a unitary TQFT, the transformations V(h) have a hidden poly-local structure.
Mathematically, V(h) can be realized as the restriction to an invariant subspace of a transformation $\prod g_i$ on the state space of a quantum computer, where each $g_i$ is a gate and the length of the composition is linear in the length of h as a word in the standard generators, “Dehn twists”, of the mapping class group = diffeomorphisms($\Sigma$)/identity component. Thus, we add evidence to the unicity of the QCM. Several variants and antecedents of QCM, including quantum Turing machines, have previously been shown equivalent (with and without environmental errors) [Y]. From a physical standpoint, the QCM derives from Schrödinger’s equation as described by Feynman [Fey] and Lloyd [Ll].

Let us introduce the model. Given a decision problem, the first or classical phase of the QCM is a classical program, which designs a quantum circuit to “solve” instances of the decision problem of length n. A quantum circuit is a composition $U_n$ of operators or gates $g_i \in U(2)$ or $U(4)$ taken from some fixed list of rapidly computable matrices,¹ e.g. having algebraic entries. The following short list suffices to efficiently approximate any other choice of gates [Ki]:
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad \text{and} \qquad
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\
0 & \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]
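As a quick sanity check on the gate list above, the following sketch (assuming NumPy, with the hypothetical names `NOT`, `PHASE`, `TWO_QUBIT`, `is_unitary` and `embed` chosen here for illustration) verifies that the three matrices are unitary and shows one way a gate can be extended by the identity to the full tensor power $(\mathbb{C}^2)^{\otimes k}$. The 4 × 4 matrix is entered as reconstructed above, so treat it as an assumption rather than a verified transcription.

```python
import numpy as np

s = 1 / np.sqrt(2)

# The three gates from the list above (the 4x4 one as reconstructed in the text).
NOT = np.array([[0, 1], [1, 0]], dtype=complex)
PHASE = np.array([[1, 0], [0, 1j]], dtype=complex)
TWO_QUBIT = np.array(
    [[1, 0, 0, 0],
     [0, s, s, 0],
     [0, s, -s, 0],
     [0, 0, 0, 1]],
    dtype=complex,
)


def is_unitary(g, tol=1e-12):
    """Check g^dagger g = identity up to a numerical tolerance."""
    return np.allclose(g.conj().T @ g, np.eye(g.shape[0]), atol=tol)


def embed(gate, first, k):
    """Extend a 1- or 2-qubit gate to k qubits.

    The gate acts on consecutive tensor factors starting at index `first`
    and as the identity on all other factors (a simplifying assumption:
    the text allows any one or two factors, not only adjacent ones).
    """
    m = gate.shape[0].bit_length() - 1  # number of qubits the gate touches
    left = np.eye(2 ** first)
    right = np.eye(2 ** (k - first - m))
    return np.kron(np.kron(left, gate), right)
```

For example, `embed(TWO_QUBIT, 1, 4)` is a 16 × 16 unitary acting nontrivially only on the second and third of four qubits.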
¹ The i-th digit of each entry should be computable in poly(i) time.

The gates are applied on some tensor power space $(\mathbb{C}^2)^{\otimes k(n)}$ of “k qubits” and model a local transformation on a system of k spin-1/2 particles. The gate g acts as the identity on all but one or two tensor factors, where it acts as a matrix as above. This is the middle or quantum phase of the algorithm. The final phase is to perform a local von Neumann measurement on a final state $\psi_{\mathrm{final}} = U_n(\psi_{\mathrm{initial}})$ (or a commuting family of the same) to extract a probabilistic answer to the decision problem. (The initial state $\psi_{\mathrm{initial}}$ must also be locally constructed.) In this phase, we could declare that observing a certain
eigenvalue with probability ≥ 2/3 means “yes”. We are interested only in the case where the classical phase of circuit design and the length of the designed circuit are both smaller than some polynomial in n. Decision problems which can be solved in this way are said to be in the computational class BQP: bounded-error quantum polynomial. The use of $\mathbb{C}^2$, the “qubit”, is merely a convenience; any decomposition into factors of bounded dimension gives an equivalent theory. We say U is a quantum circuit over $\mathbb{C}^p$ if all tensor factors have dimension p.

Following Lloyd [Ll], note that if a finite dimensional quantum system, say $(\mathbb{C}^2)^{\otimes k}$, evolves by a Hamiltonian H, it is physically reasonable to assert that H is poly-local,
\[ H = \sum_{\ell=1}^{L} H_\ell, \]
where the sum has $L \le \mathrm{poly}(k)$ terms and each $H_\ell = \tilde{H}_\ell \otimes \mathrm{id}$, where $\tilde{H}_\ell$ acts nontrivially only on a bounded number (often just two) of qubits and as the identity on the remaining tensor factors. Now setting Planck’s constant h = 1, the time evolution is given by Schrödinger’s equation: $U_t = e^{2\pi i t H}$, whereas gates can rapidly approximate [Ki] any local transformation of the form $e^{2\pi i t \tilde{H}_\ell}$. Only the nonabelian nature of the unitary group prevents us from approximating $U_t$ directly from $\prod_{\ell=1}^{L} e^{2\pi i t H_\ell}$. However, by the Trotter formula:
\[ \left( e^{A/n}\, e^{B/n} \right)^{n} = e^{A+B} + O\!\left(\frac{1}{n}\right), \]
where the error O is measured in the operator norm. Thus, there is a good approximation to $U_t$ as a product of gates:
\[ U_t = \left( e^{2\pi i \frac{t}{n} H_1} \cdots e^{2\pi i \frac{t}{n} H_L} \right)^{n} + L^2 \cdot O\!\left(\frac{1}{n}\right). \]
Because of the rapid approximation result of [Ki], in what follows we will not discuss quantum circuits restricted to any small generating set as in the example above; rather, we will permit any 2 × 2 or 4 × 4 unitary matrix with algebraic number entries to appear as a gate.

In contrast to the systems considered by Lloyd, the Hamiltonian in a topological theory vanishes identically, H = 0, so a different argument (the substance of this paper) is needed to construct a simulation. The reader may wonder how a theory with vanishing H can exhibit nontrivial unitary transformations. The answer lies in the Feynman path-integral approach to QFT. When the theory is constructed from a Lagrangian (functional on the classical fields of the theory) which only involves first derivatives in time, the Legendre transform is identically zero [At], but the theory may nevertheless have nontrivial global features, as in the Aharonov-Bohm effect.

Before defining the mathematical notions, we would make two comments. First, the converse to the theorem is also true. It has been shown recently [FLW] that a particular UTMF allows efficient simulation of the universal quantum computer. Second, we would like to suggest that the theorem may be viewed as a positive result for computation. Modular functors, because of their rich mathematical structure, may serve as a higher order language for constructing a new quantum algorithm. In [Fr], it is observed that the transformations of UTMF’s can readily produce state vectors whose coordinates are computationally difficult evaluations of the Jones and Tutte polynomials. The same is now known for the state vector of a quantum computer, but the question of whether any useful part of this information can be made to survive the measurement phase of quantum computation is open.
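The Trotter approximation used in the Introduction can be checked numerically. The sketch below (assuming NumPy; the three-qubit Hamiltonian, its coefficients, and the helper names `kron3`, `expm_hermitian`, `trotter` and `trotter_error` are illustrative choices, not taken from the paper) builds a poly-local $H = H_1 + H_2$ from two non-commuting two-qubit terms and compares $U_t = e^{2\pi i t H}$ with the product of local exponentials raised to the n-th power.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def kron3(a, b, c):
    """Tensor product of three single-qubit operators."""
    return np.kron(np.kron(a, b), c)


def expm_hermitian(h, t):
    """Return exp(2*pi*i*t*h) for a Hermitian matrix h via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(2j * np.pi * t * w)) @ v.conj().T


def trotter(terms, t, n):
    """(e^{2 pi i (t/n) H_1} ... e^{2 pi i (t/n) H_L})^n for a list of local terms."""
    step = np.eye(terms[0].shape[0], dtype=complex)
    for h in terms:
        step = step @ expm_hermitian(h, t / n)
    return np.linalg.matrix_power(step, n)


# Hypothetical poly-local Hamiltonian on 3 qubits: each term touches two qubits,
# and the two terms do not commute (Z and X anticommute on the middle qubit).
H1 = 0.3 * kron3(Z, Z, I2)
H2 = 0.2 * kron3(I2, X, X)
TERMS = [H1, H2]


def trotter_error(n, t=1.0):
    """Operator-norm distance between the exact evolution and the n-step product."""
    exact = expm_hermitian(H1 + H2, t)
    return np.linalg.norm(exact - trotter(TERMS, t, n), 2)
```

Evaluating `trotter_error(n)` for n = 10, 100, 1000 should show the error shrinking roughly in proportion to 1/n, in line with the $O(1/n)$ term in the formula.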
2. Simulating Modular Functors

We adopt the axiomatization of [Wa] or [T], to which we refer for details. Also see Atiyah [At], Segal [Se], and Witten [Wi]. A surface is a compact oriented 2-manifold with parameterized boundaries. Each boundary component has a label from a finite set $\mathcal{L} = \{1, a, b, c, \dots\}$ with involution $\hat{\ }$, $\hat{1} = 1$. In examples, labels might be representations of a quantum group up to a given level or positive energy representations of a loop group, or some other algebraic construct. Technically, to avoid projective ambiguities each surface $\Sigma$ is provided with a Lagrangian subspace $L \subset H_1(\Sigma; \mathbb{Q})$ and each diffeomorphism $f : \Sigma \to \Sigma'$ is provided with an integer “framing/signature”, so the dynamics of the theory is actually given by a central extension of the mapping class group. Since these extended structures are irrelevant to our development, we suppress them from the notation. We use the letter $\ell$ below to indicate a label set for all boundary components, or in some cases, those boundary components without a specified letter as label.

Definition 1. A unitary topological modular functor (UTMF) is a functor V from the category of (labeled surfaces with fixed boundary parameterizations, label preserving diffeomorphisms which commute with boundary parameterizations) to (finite dimensional complex Hilbert spaces, unitary transformations) which satisfies:
1. Disjoint union axiom: $V(Y_1 \sqcup Y_2, \ell_1 \sqcup \ell_2) = V(Y_1, \ell_1) \otimes V(Y_2, \ell_2)$.
2. Gluing axiom: let $Y_g$ arise from Y by gluing together a pair of boundary circles with dual labels, x glues to $\hat{x}$; then
\[ V(Y_g, \ell) = \bigoplus_{x \in \mathcal{L}} V\bigl(Y, (\ell, x, \hat{x})\bigr). \]
3. Duality axiom: reversing the orientation of Y and applying $\hat{\ }$ to labels corresponds to replacing V by $V^*$. Evaluation must obey certain naturality conditions with respect to gluing and the action of the various mapping class groups.
4. Empty surface axiom: $V(\emptyset) \cong \mathbb{C}$.
5. Disk axiom: $V_a = V(D, a) \cong \mathbb{C}$ if $a = 1$, and $0$ if $a \neq 1$.
6. Annulus axiom: $V_{a,b} = V\bigl(A, (a, b)\bigr) \cong \mathbb{C}$ if $a = \hat{b}$, and $0$ if $a \neq \hat{b}$.
7. Algebraic axiom: The basic data, the mapping class group actions and the maps F and S explained in the proof (from which V may be reconstructed if the Moore and Seiberg conditions are satisfied, see [MS] or [Wa] 6.4, 1–14), is algebraic over $\mathbb{Q}$ for some bases in $V_a$, $V_{a,a}$, and $V_{abc}$, where $V_{abc}$ denotes $V\bigl(P, (a, b, c)\bigr)$ for a (compact) 3-punctured sphere P. 3-punctured spheres are also called pants.

Comments.
(1) From the gluing axiom, V may be extended via dissection from simple pieces D, A, and P to general surfaces $\Sigma$. But $V(\Sigma)$ must be canonically defined: this looks quite difficult to arrange and it is remarkable that any nontrivial examples of UTMFs exist.
(2) The algebraic axiom is usually omitted, but holds for all known examples. We include it to avoid trivialities such as a UTMF where action by, say, a boundary twist is multiplication by a real number whose binary expansion encodes a difficult or even uncomputable function: e.g. the i-th bit is 0 iff the i-th Turing machine halts. If there are nontrivial parameter families of UTMF’s, such nonsensical examples must arise, although they could not be algebraically specified. In the context of bounded accuracy for the operation of diffeomorphisms V(h), Axiom 7 may be dropped (and simulation by bounded accuracy quantum circuits still obtained), but we prefer to work in the exact context since in a purely topological theory exactness is not implausible.
(3) Axiom 2 will be particularly important in the context of a pants decomposition of a surface $\Sigma$. This is a division of $\Sigma$ into a collection of compact surfaces P having the topology of 3-punctured spheres and meeting only in their boundary components, which we call “cuffs”.

Definition 2. A quantum circuit $U : (\mathbb{C}^p)^{\otimes k} \to (\mathbb{C}^p)^{\otimes k} =: W$ is said to simulate on W (exactly) a unitary transformation $\tau : S \to S$ if there is a $\mathbb{C}$-linear imbedding $i : S \subset (\mathbb{C}^p)^{\otimes k}$ invariant under U so that $U \circ i = i \circ \tau$. The imbedding is said to intertwine τ and U. We also require that i be computable on a basis in poly(k) time.

Since we prove efficient simulation of the topological dynamics for UTMFs V, it is redundant to dwell on “measurement” within V, but to complete the computational model, we can posit von Neumann type measurement with respect to any efficiently computable frame F in $V_{abc}$. The space $\mathbb{C}^p$ above, later denoted $X = \mathbb{C}^p$, is defined by $X := \bigoplus_{(a,b,c) \in \mathcal{L}^3} V_{abc}$ and the computational space $W := X^{\otimes k}$. We have set $S := V(\Sigma)$ and assumed $\Sigma$ is divided into k “pants”, i.e. Euler class $\chi(\Sigma) = -k$. Any frame F extends to a frame for $V(\Sigma)$ via the gluing axiom once a pants decomposition of $\Sigma$ is specified. Thus, measurement in V becomes a restriction of measurement in W. It may be physically more natural to restrict the allowable measurements on $V(\Sigma)$ to cutting along a simple closed curve γ and measuring the label which appears. Mathematically, this amounts to transforming to a pants decomposition with γ as one of its decomposition or “cuff” curves and then positing a Hermitian operator with eigenspaces equal to the summands of $V(\Sigma)$ corresponding under the gluing axiom to labels x on $\gamma^-$ and $\hat{x}$ on $\gamma^+$, $x \in \mathcal{L}$.

A labeled surface $(\Sigma, \ell)$ determines a mapping class group $M = M(\Sigma, \ell)$ = “isotopy classes of orientation preserving diffeomorphisms of $\Sigma$ preserving labels and commuting with boundary parameterization”. For example, in the case of an n-punctured sphere with all labels equal (distinct), M = SFB(n), the spherical framed braid group (M = PSFB(n), the pure spherical framed braid group). To prove the theorem below, we will need to describe a generating set S for the various M’s and within S chains of elementary moves which will allow us to prepare to apply any $s_2 \in S$ subsequent to having applied $s_1 \in S$. Each M is generated by Dehn-twists and braid-moves (see [B]). A Dehn-twist $D_\gamma$ is specified by drawing a simple closed curve (s.c.c.) γ on $\Sigma$, cutting along γ, twisting 2π to the right along γ and then regluing. A braid-move $B_\delta$ will occur only when a s.c.c. δ cobounds a pair of pants with two boundary components of $\Sigma$: if the labels of the boundary components are equal then $B_\delta$ braids them by a right π-twist. In the case that all labels are equal, there is a rather short list of D and B generators indicated in Fig. 1 below. Also sketched in Fig. 1 is a pants decomposition of diameter = $O(\log b_1(\Sigma))$,
[Fig. 1. Generating curves on Σ: the Dehn-twist curves γ, γ′ (the D_γ’s), the braid curves δ (the B_δ’s), and a punctured annulus A; cuffs of the pants decomposition are marked “c”.]
meaning the graph dual to the pants decomposition has diameter of order log of the first Betti number of $\Sigma$. The s.c.c. γ (δ) label Dehn (braid) generators $D_\gamma$ ($B_\delta$). Figure 1 contains a punctured annulus A; note that the composition of oppositely oriented Dehn twists along the two “long” components of ∂A, γ and γ′, yields a diffeomorphism which moves the puncture about the loop γ. The figure implicitly contains such an A for each (γ, p), where p is a preferred puncture. The γ curves come in three types: (1) the loops at the top of the handles, which are curves (“cuffs”) of the pants decomposition, (2) loops dual to type 1, and (3) loops running under adjacent pairs of handles (which cut through up to $O(\log b_1)$ many cuffs). (See Fig. 1, where cuffs are marked by a “c”.) Each punctured annulus A is determined as a neighborhood of a s.c.c. γ union an arc η from γ to p. To achieve general motions of p around $\Sigma$, we require these arcs to be “standard” so that for each p, $\pi_1(\bar{\Sigma}, p)$ is generated by $\{\eta \cdot \gamma \cdot \eta^{-1}\}$, where $\bar{\Sigma}$ = $\Sigma$ with punctures filled by disks, and the disk corresponding to p serving as a base point. This list of generators is only linear in the first Betti number of $\Sigma$.

In the presence of distinct labels, many of the $B_\delta$ are illegal (they permute unequal labels). In this case, quadratically many generators are required. Figure 2 displays the replacements for the B’s, and additional A’s and D’s. Figure 2 shows a collection of B’s sufficient to effect arbitrary braiding within each commonly-labeled subset of punctures, and a quadratically large collection of new Dehn curves {ε} allowing a full twist between any pair of distinctly labeled punctures. (If the punctures are arranged along a convex arc of the Euclidean cell in $\Sigma$, then each ε will be the boundary of a narrow neighborhood of the straight line segment joining pairs of dissimilarly labeled punctures.) Finally, there is a collection of punctured annuli, which enable one puncture $p_i$ from each label-constant subset to be carried around each free homotopy class from {γ} (respecting the previous generation condition for $\pi_1(\bar{\Sigma}, p_i)$). Thus for distinct labels the generating sets are built from curves of type γ, γ′, ε and δ by Dehn twists around γ, γ′, and ε, and braid moves around δ. Denote by ω any such curve: $\omega \in \Omega = \{\gamma\} \cup \{\gamma'\} \cup \{\varepsilon\} \cup \{\delta\}$.
[Fig. 2. Generators in the presence of distinct labels: braid moves B_δ, Dehn curves D_ε, the curve γ′, and punctured annuli A.]
Since various ω’s intersect, it is not possible to realize all ω simultaneously as cuffs in a pants decomposition. However, we can start with the “base point” pants decomposition D indicated in Fig. 1 (note γ of type (1) are cuffs in D, but γ of types (2) or (3) are not) and for any ω find a short path of elementary moves, F and S (defined below), to a pants decomposition $D_\omega$ containing ω as a cuff.

Lemma 2.1. Assume $\Sigma \neq S^2$, disk, or annulus, and D the standard pants decomposition sketched in Fig. 1. Any ω as above can be deformed through $O(\log b_1(\Sigma))$ F and S moves to a pants decomposition $D_\omega$ in which ω is a cuff.

We postpone the proof of the lemma and the definition of its terms until we are partly into the proof of the theorem and have some experience passing between pants decompositions.

Theorem 2.2. Suppose V is a UTMF and $h : \Sigma \to \Sigma$ is a diffeomorphism of length n in the standard generators for the mapping class group of $\Sigma$ described above (see Figs. 1 and 2). Then there are constants depending only on V, c = c(V) and p = p(V), such that $V(h) : V(\Sigma) \to V(\Sigma)$ is simulated (exactly) by a quantum circuit operating on “qupits” $\mathbb{C}^p$ of length $\le c \cdot n \cdot \log b_1(\Sigma)$.

The collection {cuffs} refers to the circles along which the pants decomposition decomposes $\Sigma$; the “seams” are additional arcs, three per pant, which cut the pant into two hexagons. Technically, we will need each pant in D to be parameterized by a fixed 3-punctured sphere, so these seams are part of the data in D; for simplicity, we choose seams to minimize the number of intersections with {ω}.

The theorem may be extended to cover a more general form of input. The original algorithm [L] which writes a $D_\alpha$, α a s.c.c., as a word in standard generators $D_\gamma$ is super-exponential. We define the combinatorial length of α, $\ell(\alpha)$, to be the minimum number of intersections, as we vary α by isotopy, of α with {cuffs} ∪ {seams}. The best upper bound (known to the authors) to the length L of $D_\alpha$ as a word in the mapping class group spanned by a fixed generating set is of the form $L(D_\alpha)$ < super-exponential function $f(\ell)$. For this reason, we consider as input V(h), where h is a composition of k Dehn twists on $\alpha_1, \dots, \alpha_k$ and j braid moves along $\beta_1, \dots, \beta_j$ in any order.
Then V(h) is costed as the sum of the combinatorial lengths of the simple closed curves needed to write h as Dehn twists and braid moves within the mapping class group,
\[ \ell(h) := \sum_{i=1}^{j} \ell(\beta_i) + \sum_{i=1}^{k} \ell(\alpha_i). \]
We obtain the following extension of the theorem.

Extension.² The map $h_* : V(\Sigma) \to V(\Sigma)$ is exactly simulated by a quantum circuit QC with length(QC) $\le 11\,\ell(h)$ composed of algebraic 1- and 2-qupit $\mathbb{C}^p$ gates.
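The cost ℓ(h) and the resulting bound on the circuit length are simple sums; a minimal sketch follows. The function names and the sample combinatorial lengths are hypothetical, chosen only to illustrate the arithmetic; computing ℓ(α) itself (a minimum over the isotopy class of α) is not attempted here.

```python
def combinatorial_cost(alpha_lengths, beta_lengths):
    """l(h): sum of the combinatorial lengths of the Dehn-twist curves (alpha)
    and of the braid-move curves (beta) used to write h."""
    return sum(beta_lengths) + sum(alpha_lengths)


def circuit_length_bound(alpha_lengths, beta_lengths, constant=11):
    """Upper bound on length(QC) from the Extension: constant * l(h), constant = 11."""
    return constant * combinatorial_cost(alpha_lengths, beta_lengths)
```

For instance, with two Dehn twists on curves of combinatorial length 3 and 5 and one braid move on a curve of length 2, ℓ(h) = 10 and the Extension guarantees a circuit of length at most 110.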
Pre-Proof. Some physical comments will motivate the proof. $V(\Sigma)$ are quantized gauge fields on $\Sigma$ (with a boundary condition given by labels $\ell$) and can be regarded as a finite dimensional space of internal symmetries. This is most clear when genus($\Sigma$) = 0: $\Sigma$ is a punctured sphere, the labeled punctures are “anyons” [Wil] and the relevant mapping class group is the braid group which moves the punctures around the surface of the sphere. An internal state $\psi \in V(\Sigma)$ is transformed to $U(b)\psi \in V(\Sigma)$ under the functorial representation of the braid group. For U(b) to be defined the braiding must be “complete” in the sense that the punctures (anyons) must return setwise to their initial position. Infinitesimally, the braiding defines a Hamiltonian H on $V(\Sigma) \otimes E$, where E is an infinite dimensional Hilbert space which encodes the position of the anyons. The projection of H into $V(\Sigma)$ vanishes, which is consistent with the general covariance of topological theories. Nevertheless, when the braid is complete, the evolution U of H will leave $V(\Sigma)$ invariant and it is $U|_{V(\Sigma)} = U$ which we will simulate. Anyons inherently reflect nonlocal entanglement, so it is not to be expected that $V(\Sigma)$ has any (natural) tensor decomposition, and none are observed in interesting examples. Thus, simulation of U as an invariant subspace of a tensor product $(\mathbb{C}^p)^{\otimes k}$ is the best result we can expect. The mathematical proof will loosely follow the physical intuition of evolution in a super-space by defining, in the braid case (identical labels and genus = 0), two distinct imbeddings “odd” and “even”,
\[ V(\Sigma) \underset{\text{even}}{\overset{\text{odd}}{\rightrightarrows}} (\mathbb{C}^p)^{\otimes k} = W, \]
and constructing the local evolution by gates acting on the target space. The imbeddings are named for the fact that in the usual presentation of the braid group, the odd (even) numbered generators can be implemented by restricting an action on W to $\mathrm{image}_{\text{odd}} V(\Sigma)$ ($\mathrm{image}_{\text{even}} V(\Sigma)$).
Proof. The case genus($\Sigma$) = 0 with all boundary components carrying identical labels (this contains the classical, uncolored Jones polynomial case [J, Wi]) is treated first. For any number q of punctures (q = 10 in the illustration) there are two systematic ways of dividing $\Sigma$ into pants (3-punctured spheres), along curves $\vec{\alpha} = \{\alpha_1, \dots, \alpha_{q-3}\}$ or along $\vec{\beta} = \{\beta_1, \dots, \beta_{q-3}\}$, so that a sequence of q F moves (6j-moves in physics notation) transforms $\vec{\alpha}$ to $\vec{\beta}$. Let $X = \bigoplus_{(a,b,c) \in \mathcal{L}^3} V_{abc}$ be the orthogonal sum of all sectors of the pants Hilbert space. Distributing ⊗ over ⊕, the tensor power $X^{\otimes(q-2)} := W$ is the sum over all labelings of the Hilbert space for $\sqcup (q-2)$ pants. Choosing parameterizations, W is identified with both the label sum space ($\Sigma$ cut along $\vec{\alpha}$) and sum ($\Sigma$ cut along $\vec{\beta}$). Now $\Sigma$ is assembled from the disjoint union by gluing along $\vec{\alpha}$ or $\vec{\beta}$, so the gluing axiom defines imbeddings $i(\vec{\alpha})$ and $i(\vec{\beta})$ of $V(\Sigma, \ell)$ as a direct summand of $X^{\otimes(q-2)} = W$.

² Lee Mosher has informed us that the existence of the linear bound $f(\ell)$ (but without control of the constants) follows at least for closed and single punctured surfaces from his two papers [M1] and [M2].
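To get a feeling for the sizes involved in the imbedding of $V(\Sigma, \ell)$ into $W = X^{\otimes(q-2)}$, the following sketch counts dimensions with the gluing axiom for a toy label set. Everything specific here is an assumption made for illustration: the two self-dual labels `"1"` and `"t"`, and the multiplicities `n_abc` (a Fibonacci-type rule in which $V_{abc}$ is one-dimensional unless exactly one of the three labels is nontrivial, which is consistent with the disk and annulus axioms). None of it is taken from the paper.

```python
from itertools import product

LABELS = ("1", "t")            # hypothetical label set, "1" is the trivial label
DUAL = {"1": "1", "t": "t"}    # hypothetical involution: both labels self-dual


def n_abc(a, b, c):
    """Hypothetical dim V_abc: 0 if exactly one label is nontrivial, else 1."""
    return 0 if (a, b, c).count("t") == 1 else 1


def dim_sphere(labels):
    """dim V(punctured sphere, labels), computed by repeatedly applying the
    gluing axiom: cut off a pair of pants carrying the first two punctures
    and sum over the label x on the new cuff."""
    if len(labels) == 3:
        return n_abc(*labels)
    a, b, *rest = labels
    return sum(n_abc(a, b, x) * dim_sphere((DUAL[x], *rest)) for x in LABELS)


# p = dim X, where X is the direct sum of V_abc over all label triples.
P = sum(n_abc(*abc) for abc in product(LABELS, repeat=3))


def dim_w(q):
    """dim W = p^(q-2) for a sphere with q punctures cut into q-2 pants."""
    return P ** (q - 2)
```

With these toy rules, p = 5, and for q = 10 punctures all labeled `"t"` the space $V(\Sigma, \ell)$ has dimension 34 inside a computational space W of dimension $5^8 = 390625$, which illustrates that $V(\Sigma, \ell)$ is a small direct summand rather than a tensor factor.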
[Fig. 3. The pants decompositions of the q-punctured sphere (q = 10) along α_1, …, α_5, … and along β_1, …, β_5, …, connected by successive groups of 3, 4, and 3 6j-moves.]
Consider the action of a braid move about $\alpha_i$. This acts algebraically as $\theta(\alpha_i)$ on a single X factor of W and as the identity on other factors. This action leaves $i(\vec{\alpha})\bigl(V(\Sigma, \ell)\bigr)$ invariant and can be thought of as a “qupit” gate: $\theta(\alpha_i) = V(\mathrm{braid}_{\alpha_i}) : X \to X$, where dim(X) = p. Similarly the action of $V(\mathrm{braid}_{\beta_i})$ leaves $i(\vec{\beta})\bigl(V(\Sigma, \ell)\bigr)$ invariant. It is well known [B] that the union of loops $\vec{\alpha} \cup \vec{\beta}$ determines a complete set of generators of the braid group. The general element ω, which we must simulate by an action on W, is a word in braid moves on α’s and β’s. Part of the basic data, implied by the gluing axiom for a UTMF, is a fixed identification between elementary gluings:
\[ F_{abcd} : \bigoplus_{x \in \mathcal{L}} V_{xab} \otimes V_{\hat{x}cd} \longrightarrow \bigoplus_{y \in \mathcal{L}} V_{ybc} \otimes V_{\hat{y}da} \]
[Fig. 4. The move F_{abcd} between the two decompositions of the 4-punctured sphere, with boundary labels a, b, c, d, into two pairs of pants.]
corresponding to the two decompositions of the 4-punctured sphere into two pairs of pants shown in Fig. 4 (the dotted lines are pant “seams”, the uncircled numbers indicate boundary components, the letters label boundary components, and the circled numbers order the pairs of pants). For each F, we choose an extension to a unitary map $\bar{F} : X \otimes X \to X \otimes X$. Then extend $\bar{F}$ to $\bar{\bar{F}}$ by tensoring with the identity on the q − 4 factors unaffected by F. Consider the composition of q F’s, extended to q $\bar{\bar{F}}$’s, corresponding to the q moves illustrated in the case q = 10 by Fig. 3. (For q > 10 imagine the drawings in Fig. 3 extended periodically.) These define a unitary transformation $T : W \to W$ with $T \circ i(\vec{\alpha}) = i(\vec{\beta})$. The word ω in the braid group can be simulated by τ on W, where τ is written as a composition of the unitary maps T, $T^{-1}$, $\theta(\alpha_i)$, and $\theta(\beta_j)$. For example, $\beta_5 \alpha_1 \beta_2^{-1} \alpha_1 \alpha_3$ would be simulated as
\[ \tau = T^{-1} \circ \theta(\beta_5) \circ T \circ \theta(\alpha_1) \circ T^{-1} \circ \theta(\beta_2^{-1}) \circ T \circ \theta(\alpha_1) \circ \theta(\alpha_3). \]
As described, τ has length ≤ 2q · length ω. The dependence on q can be removed by dividing $\Sigma$ into q/2 overlapping pieces $\Sigma_i$, each $\Sigma_i$ a union of 6 consecutive pants. Every loop of $\vec{\alpha} \cup \vec{\beta}$ is contained well within some piece $\Sigma_i$, so instead of moving between two fixed subspaces $i_\alpha(V)$ and $i_\beta(V) \subset W$, when we encounter a $\beta_j$ we do constantly many F operations to find a new pants decomposition modified locally to contain $\beta_j$. Then $\theta(\beta_j)$ may be applied and the F operations reversed to return to the α pants decomposition. The resulting simulation can be made to satisfy length τ ≤ 7 length ω. This completes the braid case with all bounding labels equal, an important case corresponding to the classical Jones polynomial [J].

Proof of Lemma. We have described the F-move on the 4-punctured sphere both geometrically and under the functor. The S-move is between two pants decompositions on the punctured torus $T^-$. (Filling in the puncture, a variant of S may act between two distinct annular decompositions of $T^2$. We suppress this case since, without topological parameter, there can be no computational complexity discussion over a single surface.) By [Li] or [HT], one may move between any two pants decompositions via a finite sequence of moves of three types: F, S, and a diffeomorphism M supported on the interior of a single pair of pants (see the Appendix of [HT]). To pass from D, our “base point” decomposition, to $D_\omega$, F and S moves alone suffice and the logarithmic count is a consequence of the log depth nest of cuff loops of D on the planar surface obtained by
[Fig. 5. The move S_a between two pants decompositions of the punctured torus with boundary label a; $V(S) : \bigoplus_{x \in \mathcal{L}} V_{a x \hat{x}} \longrightarrow \bigoplus_{y \in \mathcal{L}} V_{a y \hat{y}}$.]
cutting along type (1) γ curves. Below we draw examples of short paths of F and S moves taking D to a particular Dω . The logarithmic count is based on the proposition. Proposition 2.3. Let K be a trivalent tree of diameter = d and f be a move, which locally replaces { } and with { }, then any two leaves of K can be made adjacent by ≤ d moves of type f . (Here we consider abstract trees rather than ones imbedded in the plane.) Passing from K to a punctured sphere obtained by imbedding (K, univalent vertices) into ( 21 R 3 , R 2 ), thickening and deleting the boundary R 2 , the f move induces the previously defined F move. Some example of paths of F , S moves (Fig. 6). Continuation of the proof of the theorem. For the general on numerous case, we compute imbeddings of V () into W (rather than on two: iα V () and iβ V () as in the braid case). Each imbedding is determined by a pants decomposition and the imbedding changes (in principle) via the lemma every time we come to a new literal of the word ω. Recall that ω ∈ M, the mapping class group, is now written as a word in the letters (and their inverses) of type Dγ , Dγ , D" , and Bδ . Pick as a home base a fixed pants decomposition D0 corresponding to i0 V () ⊂ W . If the first literal is a twist or braid along the s.c.c. ω, then apply the lemma to pass through a sequence of F and S moves from D0 to D1 containing ω as a “cuff” curve. As in the braid case, choose extensions F and S to unitary automorphisms of W and applying V to the composition gives a transformation T1 of W such that i1 = T1 ◦ i0 , i1 being the inclusion V () → W associated with D 1 . Now execute the first literal ω1 of ω as a transformation θ(ω1 ), which −1 leaves i1 V () invariant and satisfies: θ(ωi ) ◦ i1 = i1 ◦ V (ω1 ). Finally apply T1 to return to the base inclusion i0 V () . The previous three steps can now be repeated for the second literal of ω: follow T1−1 ◦ θ(ω1 ) ◦ T1 by T2−1 ◦ θ(ω2 ) ◦ T2 . 
Continuing in this way, we construct a composition τ which simulates ω on $W$:
\[ \tau = T_n^{-1} \circ \theta(\omega_n) \circ T_n \circ \cdots \circ T_1^{-1} \circ \theta(\omega_1) \circ T_1. \]
M. H. Freedman, A. Kitaev, Z. Wang
Fig. 6. Examples of paths of F and S moves.
From Lemma 2.1, the length of this simulation by one-qupit gates (corresponding to S and $\theta(\omega_i)$) and two-qupit gates (corresponding to F moves) is proportional to $n = \mathrm{length}(\omega)$ and $\log b_1(\Sigma)$, where $p = \dim(X)$.

Proof of Extension. What is at issue is the number of preparatory moves to change the base point decomposition $D$ to $D_\gamma$ containing $\gamma = \alpha_i$ or $\beta_i$ as a cuff curve, $1 \le i \le k$ or $j$. We have defined the F and S moves rigidly, i.e. with specified action on the seams. This was necessary to induce a well defined action on the functor $V$. Because of this rigid choice, we must add one more move (an M move) to have a complete set of moves capable of moving between any two pants decompositions of a surface (compare [HT]). The M move is simply a Dehn twist supported in a pair of pants of the current pants decomposition; it moves the seams (compare Chapter 5 of [Wa]). Note that if M is a +1 Dehn twist in a s.c.c. ω then, under the functor, $V(M)$ is a restriction of $\theta(\omega)$ in the notation above. As in [HT], the cuff curves of $D$ may be regarded as level curves of a Morse function $f : \Sigma \to \mathbb{R}^+$, constant on boundary components, which we assume to have minimum complexity (= total number of critical points) satisfying this constraint. Isotope α (we drop the index) on Σ to have the smallest number of local maxima with respect to $f$ and to be disjoint from the critical points of $f$ on Σ. Now generically deform $f$ in a thin annular neighborhood of γ so that γ becomes a level curve. Consider the graphic $G$ of the deformation $f_t$, $0 \le t \le 1$. For regular $t$ the Morse function $f_t$ determines a pants decomposition: let the 1-complex $K$ consist of $\Sigma/\!\sim$, where $x \sim y$ if $x$ and $y$ belong to the same component of a level set of $f_t$, and let
Simulation of Topological Field Theories by Quantum Computers
Fig. 7. The three Cerf transitions between the −ε cuffs and the +ε cuffs across a double critical level: (1) a single F-move; (2) moves F ∘ M ∘ F; (3) moves M ∘ M ∘ F ∘ S ∘ F, where the 2 M moves adjust seams to standard position.
$L \subset K$ be the smallest complex to which $K$ collapses relative to endpoints associated to boundary components. For example in Fig. 8, the top tree does not collapse at all, while in the lower two trees the edge whose end is labeled “local max” is collapsed away. The preimage of one point from each intrinsic 1-cell of $L$ not containing a boundary point constitutes a collection of cuffs determining a pants decomposition $D_t$. For singular $t_0$, the decompositions $D_{t_0 - \varepsilon}$ and $D_{t_0 + \varepsilon}$ may differ or may agree up to isotopy. The only change in $D$ occurs when $t$ is a crossing point for index = 1 handles where the two critical points are on the same connected component of a level set $f_t^{-1}(r)$. There are essentially only three possible “Cerf-transitions” and they are expressible as a product of 1, 2, or 3 F and S moves together with braid moves whose number we will later bound from above. The Cerf
Fig. 8. Pulling γ down yields F ∘ F.
Fig. 9. Pulling γ′ down and cancelling the local max yields F ∘ M ∘ F.
transitions on $D$ are shown in Fig. 7, together with their representation as compositions of elementary moves. Critical points of $f|_\gamma$ become critical points of $f_t$ of the same index once the deformation has passed an initial $\varepsilon_0 > 0$, and before any saddle-crossings have occurred. Let $P$ be a pant from the decomposition induced by $f$ and $\delta \subset \gamma \cap P$ an arc. Applying the connectivity criterion of the previous paragraph, we can see that flattening a local maximum can affect at most the two cuff circles which δ meets, and these by the elementary Cerf transition shown in Fig. 8. If γ crosses the seam arcs then the transitions are of the Cerf type, precomposed with M-moves to remove these crossings as shown in Fig. 9. Dynamically, seam crossings by γ produce saddle connections in the Cerf diagram. The total number of these twists is bounded by length(γ). The number of flattening moves as above is less than or equal to $|\gamma \cap \text{cuffs}| \le \mathrm{length}(\gamma)$. The factor of 11 in the statement allows up to 5 F, S, and M moves for expressing each Cerf singularity which arises in passing from $D_\circ$ to $D_\gamma$ and the same factor of 5 to pass back from $D_\gamma$ to $D_\circ$ again, while saving at least one step to implement the twist or braid move along γ. This completes the proof of the extension.
We should emphasize that although we have adopted an “exact” model for the operation of the UTMF, faithful simulation as derived above does not depend on a perfectly accurate quantum circuit. Several authors have proved a threshold theorem ([Ki], [AB], and [KLZ]): if the rate of large errors acting on computational qubits (or qupits) is small enough, the size of ubiquitous error is small enough, and both are uncorrelated, then such a computational space may be made to simulate with probability $\ge \tfrac{2}{3}$ an exact quantum circuit of length $L$. The simulating circuit must exceed the exact circuit in both number of qubits and number of operations by a multiplicative factor $\le \mathrm{poly}(\log L)$.

3. Simulating TQFT’s

We conclude with a discussion about the three dimensional extension, the TQFT of a UTMF. In all known examples of TMF’s there is an extension to a TQFT, meaning that it is possible to assign a linear map $b_* : V(\Sigma) \to V(\Sigma')$ subject to several axioms ([Wa] and [T]) whenever Σ and Σ′ cobound a bordism $b$ (with some additional structure). The case of bordisms with a product structure is essentially the TMF part of the theory. Unitarity is extended to mean that if the orientation of the bordism $b$ is reversed to $\bar{b}$, we have $b_*^\dagger = (\bar{b})_*$. It is known that a TMF has at most one extension to a TQFT and conjectured that this extension always exists. Non-product bordisms correspond to some loss of information of the state. This can be understood by factoring the bordism into pieces consisting of a product union a 2-handle: $\Sigma \times I \cup h$. The 2-handle $h$ has the form $(D^2 \times I, \partial D^2 \times I)$ and is attached along the subspace $\partial D^2 \times I$. The effect of attaching the handle will be to “pinch” off an essential loop ω on Σ and so replace an annular neighborhood of ω by two disks, turning Σ into a simpler surface Σ′. It is an elementary consequence of the axioms that if $b = \Sigma \times I \cup h$ then $b_*$ is a projector as follows. Let $D$ be a pants decomposition containing ω as a dissection curve. There are two cases: (1) ω appears as the first and second boundary components of a single pant called $P_0$, or (2) ω appears as the first boundary component on two distinct pants called $P_1$ and $P_2$. Then
\[ V(\Sigma) = \bigoplus_{a, c \in L} V_{a \hat{a} c} \otimes V(\Sigma \setminus P_0, \text{ with label } c \text{ on } \partial_3 P_0), \qquad \text{case (1)}, \]
or
\[ V(\Sigma) = \bigoplus_{a, b, c, d, e \in L} V_{abc} \otimes V_{\hat{a} d e} \otimes V(\Sigma \setminus (P_1 \cup P_2), \text{ appropriate labels}), \qquad \text{case (2)}. \]
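In case (2), there may be a relation $b = \hat{c}$ and/or $d = \hat{e}$ depending on the topology of $D$. The map $b_*$ is obtained by extending linearly from the projections onto summands:
\[ \bigoplus_{a, c \in L} V_{a \hat{a} c} \longrightarrow V_{111} \cong V_1 \ \text{(canonically)}, \qquad \text{(case 1)} \]
or
\[ \bigoplus_{a, b, c, d, e \in L} V_{abc} \otimes V_{\hat{a} d e} \longrightarrow V_{1 b \hat{b}} \otimes V_{1 d \hat{d}} \cong V_{b \hat{b}} \otimes V_{d \hat{d}} \ \text{(canonically)}. \qquad \text{(case 2)} \]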
If the orientation on $b$ is reversed, the unitarity condition implies that $\bar{b}$ determines an injection onto a summand with a formula dual to the above. Thus, any bordism’s morphism can be systematically calculated. In quantum computation, as shown in [Ki], a projector corresponds to an intermediate binary measurement within the quantum phase of the computation, one outcome of which leads to cessation, and the other to continuation, of the quantum circuit’s operation. Call such a probabilistically abortive computation a partial computation on a partial quantum circuit. Formally, write the identity as a sum of two projectors, $\mathrm{id}_V = \Pi_0 + \Pi_1$, and let $U_0$ and $U_1$ be unitary operators on an ancillary space $A$ with $U_0|0\rangle = |0\rangle$ and $U_1|0\rangle = |1\rangle$. The unitary operator $\Pi_0 \otimes U_0 + \Pi_1 \otimes U_1$ on $V \otimes A$, when applied to $|v\rangle \otimes |0\rangle$, gives $|\Pi_0 v\rangle \otimes |0\rangle + |\Pi_1 v\rangle \otimes |1\rangle$, so continuing the computation only if the indicator $|0\rangle \in A$ is observed simulates the projection $\Pi_0$. It is clear that the proof of the theorem can be modified to simulate 2-handle attachments as well as Dehn twists and braid moves along s.c.c.’s ω to yield:

Scholium 3.1. Suppose $b$ is an oriented bordism from $\Sigma_0$ to $\Sigma_1$, where $\Sigma_i$ is endowed with a pants decomposition $D_i$. Let complexity($b$) be the total number of moves of four types (F, S, M, and attachment of a 2-handle to a dissection curve of a current pants decomposition) that are necessary to reconstruct $b$ from $(\Sigma_0, D_0)$ to $(\Sigma_1, D_1)$. Then there is a constant $c(V)$ depending on the choice of UTQFT, and $p(V)$ as before (for the TQFT’s underlying TMF), so that $b_* : V(\Sigma_0) \to V(\Sigma_1)$ is simulated (up to a non-topological factor of the form $\nu^{n_2}$, where $n_2$ is the number of 2-handles attached) by a partial quantum circuit over $\mathbb{C}^p$ of length $\le c \cdot \mathrm{complexity}(b)$.

In general, the difference between topological objects (such as $b_*$ or closed 3-manifold invariants) and quantum mechanical ones (the evolution and probability) is related to critical points of a Morse function.
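The simulation of a projector by a measured ancilla, as described before the Scholium, can be illustrated by a minimal sketch. Everything concrete here is an assumption chosen for illustration: $V = \mathbb{C}^2$, the projectors are onto the two basis vectors, and the function names are hypothetical. The two returned branches are the components of the output state tensored with the ancilla states $|0\rangle$ and $|1\rangle$.

```python
# Illustrative assumptions: V = C^2, Pi_0 = |0><0|, Pi_1 = |1><1|.
PI0 = [[1, 0], [0, 0]]
PI1 = [[0, 0], [0, 1]]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def norm_sq(vector):
    """Squared norm of a complex vector."""
    return sum(abs(x) ** 2 for x in vector)


def partial_step(v):
    """Apply Pi_0 (x) U_0 + Pi_1 (x) U_1 to |v> (x) |0>.

    Since U_0|0> = |0> and U_1|0> = |1>, the result is
    |Pi_0 v> (x) |0> + |Pi_1 v> (x) |1>; the two V-components are returned.
    """
    branch_ancilla_0 = mat_vec(PI0, v)
    branch_ancilla_1 = mat_vec(PI1, v)
    return branch_ancilla_0, branch_ancilla_1


v = [0.6, 0.8j]                      # a unit vector in V
keep, abort = partial_step(v)
p_continue = norm_sq(keep)           # probability that the ancilla reads |0>
print(keep, abort, p_continue)
```

Observing the ancilla in $|0\rangle$ leaves the (unnormalized) state $\Pi_0 v$, and the total norm of the two branches equals that of $v$, which reflects the unitarity of the combined operator.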
A similar phenomenon for links in $\mathbb{R}^3$ has been mentioned in [FKLW]. This subject will be addressed in detail in a forthcoming paper by S. Bravyi and A. Kitaev, “Quantum invariants of 3-manifolds and quantum computation”.

Acknowledgements. We would like to thank Greg Kuperberg and Kevin Walker for many stimulating discussions on the material presented here.
References
[At] Atiyah, M.: Topological quantum field theories. Publ. Math. IHES 68, 175–186 (1989)
[AB] Aharonov, D. and Ben-Or, M.: Fault-tolerant quantum computation with constant error. LANL e-print quant-ph/9611025
[B] Birman, J.: Braids, links, and mapping class groups. Ann. Math. Studies, Vol. 82
[D] Deutsch, D.: Quantum computational networks. Proc. Roy. Soc. London, A425, 73–90 (1989)
[Fey] Feynman, R.: Simulating physics with computers. Int. J. Theor. Phys. 21, 467–488 (1982)
[FKLW] Freedman, M.H., Kitaev, A., Larsen, M.J. and Wang, Z.: Topological Quantum Computation. LANL e-print quant-ph/0101025
[FLW] Freedman, M.H., Kitaev, A., Larsen, M.J. and Wang, Z.: A modular functor which is universal for quantum computation. LANL e-print quant-ph/0001108
[Fr] Freedman, M.H.: P/NP, and the quantum field computer. Proc. Natl. Acad. Sci., USA 95, 98–101 (1998)
[HT] Hatcher, A. and Thurston, W.: A presentation for the mapping class group of a closed orientable surface. Topology 19, no. 3, 221–237 (1980)
[J] Jones, V.F.R.: Hecke algebra representations of braid groups and link polynomial. Ann. Math. 126, 335–388 (1987)
[Ka] Kane, B.: A silicon-based nuclear spin quantum computer. Nature 393, 133–137 (1998)
[Ki] Kitaev, A.: Quantum computations: algorithms and error correction. Russian Math. Survey 52:61, 1191–1249 (1997)
[KLZ] Knill, E., Laflamme, R. and Zurek, W.: Threshold Accuracy for Quantum Computation. LANL e-print quant-ph/9610011, 20 pages, 10/15/96
[Li] Lickorish, W.: A representation of orientable, combinatorial 3-manifolds. Ann. Math. 76, 531–540 (1962)
[Ll] Lloyd, S.: Universal quantum simulators. Science 273, 1073–1078 (1996)
[M1] Mosher, L.: Mapping class groups are automatic. Ann. of Math. 142, 303–384 (1995)
[M2] Mosher, L.: Hyperbolic extensions of groups. J. of Pure and Applied Alg. 110, 305–314 (1996)
[MS] Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
[Se] Segal, G.: The definition of conformal field theory. Preprint (1999)
[Sh] Shor, P.W.: Algorithms for quantum computers: Discrete logarithms and factoring. Proc. 35th Annual Symposium on Foundations of Computer Science. Los Alamitos, CA: IEEE Computer Society Press, pp. 124–134
[T] Turaev, V.G.: Quantum invariant of knots and 3-manifolds. de Gruyter Studies in Math., Vol. 18
[Wa] Walker, K.: On Witten’s 3-manifold invariants. Preprint, 1991
[Wil] Wilczek, F.: Fractional statistics and anyon superconductivity. Teaneck, NJ: World Scientific Publishing Co., Inc., 1990
[Wi] Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351–399 (1989)
[Y] Yao, A.: Quantum circuit complexity. Proc. 34th Annual Symposium on Foundations of Computer Science. Los Alamitos, CA: IEEE Computer Society Press, pp. 352–361
Communicated by P. Sarnak
Commun. Math. Phys. 227, 605 – 622 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
A Modular Functor Which is Universal for Quantum Computation
Michael H. Freedman¹, Michael Larsen², Zhenghan Wang²
¹ Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA
² Indiana University, Dept. of Math., Bloomington, IN 47405, USA
Received: 4 May 2001 / Accepted: 18 February 2002
Abstract: We show that the topological modular functor from Witten–Chern–Simons theory is universal for quantum computation in the sense that a quantum circuit computation can be efficiently approximated by an intertwining action of a braid on the functor’s state space. A computational model based on Chern–Simons theory at a fifth root of unity is defined and shown to be polynomially equivalent to the quantum circuit model. The chief technical advance: the density of the irreducible sectors of the Jones representation has topological implications which will be considered elsewhere.
1. Introduction
The idea that computing with quantum mechanical systems might offer extraordinary advantages over ordinary “classical” computation has its origins in independent writings of Benioff [B], Manin [M] and Feynman [Fey]. Feynman explained that local “quantum gates”, the basis of his model, can efficiently simulate the evolution of any finite dimensional quantum system evolving under a local Hamiltonian Ht and by extension any renormalizable system. The details of this argument are (much clarified) in [Ll]. Topological quantum field theories (TQFTs), although possessing a finite dimensional Hilbert space, lack a Hamiltonian – the derivative of time evolution on which the Feynman– Lloyd argument is based. In [FKW], we provide a different argument for the poly-local nature of TQFTs showing that quantum computers efficiently simulate these as well. Here we give a converse to this simulation result. The Feynman–Lloyd argument is reversible, so we may summarize the situation as:
(1) finite dimensional local¹ quantum systems,
(2) quantum computers (meaning the quantum circuit model QCM [D, Y]),
(3) certain topological modular functors (TMFs).
Each can efficiently simulate the others. We wrote TMF above instead of TQFT as a matter of notation because we use only the conformal blocks and the action of the mapping class groups on these, not the general morphisms associated to 3-dimensional non-product bordisms. The two dimensional aspects of a (2 + 1)-dimensional TQFT are referred to as a TMF.

2. A Universal Quantum Computer

The strictly 2-dimensional part of a TQFT is called a topological modular functor (TMF). The most interesting examples of TMFs are given by the SU(2) Witten–Chern–Simons theory at roots of unity [Wi]. These examples are mathematically constructed in [RT] using quantum groups (see also [T, Wa]). A modular functor assigns to a compact surface Σ (with some additional structures detailed below) a complex vector space $V(\Sigma)$ and to a diffeomorphism of the surface (preserving structures) a linear map of $V(\Sigma)$. In the cases considered here $V(\Sigma)$ always has a positive definite Hermitian inner product $\langle\,,\rangle_h$ and the induced linear maps preserve $\langle\,,\rangle_h$, i.e. are unitary. The usual additional structures are fixed parameterizations of each boundary component, a labeling of each boundary component by an element of a finite label set $L$ with an involution $\hat{\ } : L \to L$, and a Lagrangian subspace $L$ of $H_1(\Sigma; \mathbb{Q})$ ([T, Wa]). Since our quantum computer is built from quantum-SU(2)-invariants of braiding, and the intersection pairing of a planar surface is 0, the Lagrangian subspace is $L = H_1(\Sigma; \mathbb{Q})$ and can be ignored. The parameterization of boundary components can also be dropped at the cost of losing the overall phase information in the system, which in any case is not physical. Mathematically this means that all unitaries should be regarded as projective.
In three dimensional terms, this parameterization becomes the framing of a “Wilson” loop and is essential to well definedness of the phase of the Jones–Witten invariants. In our context it may be neglected. The involution $\hat{\ }$ is simply the identity since the SU(2)-theory is self-dual. In fact, we can manage by only considering the SU(2)-Chern–Simons theory at $q = e^{2\pi i/r}$, $r = 5$, and so our label set will be the symbols {0, 1, 2, 3}, which are the quantum group analogs of the 0th, 1st, 2nd, and 3rd symmetric powers of the fundamental representation of SU(2) in $\mathbb{C}^2$. Note that in our notation, 0 labels the trivial representation, not 1. Since we are suppressing boundary parameterizations, we may work in the disk with $n$ marked points thought of as crushed boundary components. Because we only need the “uncolored theory” to make a universal model, each marked point is assigned the label 1, and the boundary of the disk is assigned the label 0. We consider the action of the braid group $B(n)$, which consists of diffeomorphisms of the disk which leave the $n$ marked points and the boundary set-wise invariant, modulo those isotopic to the identity leaving all marked points fixed. The braid group has the well-known presentation:
\[ B(n) = \{\sigma_1, \ldots, \sigma_{n-1} \mid \sigma_i \sigma_j \sigma_i^{-1} \sigma_j^{-1} = \mathrm{id} \text{ if } |i - j| > 1, \quad \sigma_i \sigma_j \sigma_i = \sigma_j \sigma_i \sigma_j \text{ if } |i - j| = 1\}, \]
where $\sigma_i$ is the half right twist of the $i$th marked point about the $(i+1)$st marked point.

¹ Local refers to the ubiquitous physical assumption that the Hamiltonian contains only $k$-body terms for $k \le$ some fixed $n$. Note that such Hamiltonians well approximate lattice models with interactions which decay exponentially.
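The braid relation $\sigma_1 \sigma_2 \sigma_1 = \sigma_2 \sigma_1 \sigma_2$ from the presentation of $B(3)$ can be checked numerically on a pair of unitary $2 \times 2$ matrices. The sketch below uses the braid matrices commonly quoted for Fibonacci anyons as a stand-in; this is an assumption for illustration only, since the text does not give explicit matrices for its representation, and the two may differ by phases and a choice of basis. All names are hypothetical.

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Assumed stand-in matrices (Fibonacci-anyon conventions, not from the text):
# R is diagonal, F is a real symmetric involution, sigma_2 = F R F.
R = [[cmath.exp(-4j * math.pi / 5), 0], [0, cmath.exp(3j * math.pi / 5)]]
F = [[1 / PHI, 1 / math.sqrt(PHI)], [1 / math.sqrt(PHI), -1 / PHI]]

SIGMA1 = R
SIGMA2 = matmul(F, matmul(R, F))
IDENTITY = [[1, 0], [0, 1]]

braid_relation_holds = close(matmul(SIGMA1, matmul(SIGMA2, SIGMA1)),
                             matmul(SIGMA2, matmul(SIGMA1, SIGMA2)))
sigma2_is_unitary = close(matmul(SIGMA2, dagger(SIGMA2)), IDENTITY)
print(braid_relation_holds, sigma2_is_unitary)
```

For $n = 3$ there are no generators with $|i - j| > 1$, so the far-commutativity relation is vacuous and only the relation above needs checking.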
To describe a fault-tolerant computational model “Chern–Simons 5” (CS5), we must deal with the usual errors arising from decoherence as well as a novel “qubit smearing error” resulting from imbedding the computational qubits within a modular functor super-space. To explain our approach we initially ignore all errors; in particular formula (1) below is a simplification valid only in the error-free context. In fact, it is within the bounds of physical realism to study “Exact Chern–Simons 5” (ECS5), a model in which it is assumed that no errors occur in the implementation of the Jones representation from the braid group to the modular functor $V$. This may seem strange given that the major focus of the field of quantum computation has, since 1995, been on fault tolerance. The point is that topology represents a potential alternative path toward computational stability. Topology can confer physical error correction where the traditional approach within qubit models is a kind of software error correction. By definition topological structures, such as braids, are usually discrete, so small variations do not risk confusing one type with another. The idea that the discreteness in topology can be used to protect quantum information first appears in [Ki1], though not yet in the context of a computational model. In that paper Kitaev uses perturbation theory to calculate an exponential decay, proportional to $e^{-\mathrm{const} \cdot L}$, $L$ a length scale, in the probability of one important source of error (tunneling of virtual excitations). Thus “ECS5 computation” might be implemented in practice by adjusting the length scale $L$ (in this context the distance at which punctures, physically anyons, must be kept separated) by a factor polylogarithmic in computation length. Perhaps a more likely implementation would be a hybrid scheme in which topology is used to reach the rather demanding threshold [P] required for software error correction.
In this case modular functors and the usual theory of fault tolerance must be fitted together. This is possible using the perspective in [AB], and an argument for this is sketched within the proof of Thm. 2.2. However, a comprehensive discussion of the interaction of the environment with topological degrees of freedom, and how computational stability can be achieved in this context, is beyond the scope of this article. In fact recent work [AHHH] suggests that earlier interaction models which assume an uncorrelated environment may be too naive. We expect that the best framework for this discussion has not yet been constructed.

The state space $S_k = (\mathbb{C}^2)^{\otimes k}$ of our quantum computer consists of $k$ qubits, that is the disjoint union of $k$ spin-$\tfrac{1}{2}$ systems, which can be described mathematically as the tensor product of $k$ copies of the state space $\mathbb{C}^2$ of the basic 2-level system, $\mathbb{C}^2 = \mathrm{span}(|0\rangle, |1\rangle)$. For each even integer $k$, we will choose an inclusion $i : S_k \to V(D^2, 3k \text{ marked points}) = V(D^2, 3k)$ and show how to use the action of the braid group $B(3k)$ on the modular functor $V$ to (approximately) induce the action of any poly-local unitary operator $U : S_k \to S_k$. That is, we will give an (in principle) efficient procedure for constructing a braid $b = b(U)$ so that
\[ i \circ U = V(b) \circ i. \tag{1} \]
To see that this allows us to simulate the QCM, we need to explain: (i) what we mean by the hypothesis “poly-local” on $U$, (ii) what “efficient” means, (iii) what the effect of the two types of errors is on line (1), and (iv) what measurement consists of within our model. We begin by explaining how to map $S_k$ into $V$ and how to perform 1 and 2 qubit gates. Let $D$ be the unit 2-dimensional disk and
\[ \left\{ \tfrac{11}{100k}, \tfrac{12}{100k}, \tfrac{13}{100k}, \tfrac{21}{100k}, \tfrac{22}{100k}, \tfrac{23}{100k}, \ldots, \tfrac{10k+1}{100k}, \tfrac{10k+2}{100k}, \tfrac{10k+3}{100k} \right\} \]
be a subset of $3k$ marked points on the $x$-axis. Without giving formulae, the reader should picture $k$ disjoint sub-disks $D_i$, $1 \le i \le k$, each containing one clump of 3 marked points in its interior (these will serve to support qubits in a manner explained below) and a further $\binom{k}{2}$ disks $D_{i,j}$, $1 \le i < j \le k$, containing $D_i$ and $D_j$, but with $D_{i,j} \cap D_l = \emptyset$ for $l \ne i$ or $j$ (which will allow 2-qubit gates). Strictly speaking, among the larger subdisks, we only need to consider $D_{i,i+1}$, $1 \le i$, $i + 1 < k$, and could choose a standard (linear) arrangement for these, but there is no cost in the exposition to considering all $D_{i,j}$ above, which will correspond in the model to letting any two qubits interact. Also, curiously, we will see that any of the numerous topologically distinct arrangements for the $\{D_{i,j}\}$ within $D$ may be selected without prejudice. Restricting to $q = e^{2\pi i/5}$, define $V_k^l$ to be the SU(2) Hilbert space of $k$ marked points in the interior with labels equal to 1 and label $l$ on $\partial D$. We need to understand the many ways in which $V_m^0$ arises via the “gluing axiom” ([Wa]) from smaller pieces. The axiom provides an isomorphism:
\[ V(X \cup_\gamma Y) \cong \bigoplus_{\text{all consistent labelings } l} V(X, l) \otimes V(Y, l), \tag{2} \]
where the notation has suppressed all labels not on the 1-manifold γ along which $X$ and $Y$ are glued. The sum is over all labelings of the components of γ satisfying the condition that matched components have equal labels. According to SU(2)-Chern–Simons theory [KL], for three-punctured spheres with boundary labels $a, b, c$, the Hilbert space $V_{abc} \cong \mathbb{C}$ if
\[ \text{(i) } a + b + c \text{ is even}, \quad \text{(ii) } a \le b + c,\ b \le a + c,\ c \le a + b \text{ (triangle inequalities)}, \quad \text{(iii) } a + b + c \le 2(r - 2); \tag{3} \]
and $V_{abc} \cong 0$ otherwise. The gluing axiom together with the above information allows an inductive calculation of $V_k^l$, where the superscript denotes the label on $\partial D$. We easily calculate that
\[ \dim V_3^1 = 2, \quad \dim V_3^3 = 1, \quad \dim V_6^0 = 5, \quad \dim V_6^2 = 8. \tag{4} \]
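The inductive calculation behind line (4) can be sketched in a few lines: start from a disk with one marked point and repeatedly glue on a pair of pants carrying one more point labeled 1, summing over the admissible labels on the cut. This is only an illustrative sketch; the function names are hypothetical, and the third triangle inequality is taken to be $b \le a + c$.

```python
R_LEVEL = 5                      # r = 5, so the labels are 0, 1, 2, 3
LABELS = range(R_LEVEL - 1)


def admissible(a, b, c, r=R_LEVEL):
    """True when V_abc is one-dimensional by conditions (i)-(iii) of (3)."""
    parity = (a + b + c) % 2 == 0
    triangle = a <= b + c and b <= a + c and c <= a + b
    bound = a + b + c <= 2 * (r - 2)
    return parity and triangle and bound


def dim_V(k, l):
    """dim V_k^l: k >= 1 interior points labeled 1, label l on the boundary."""
    # One marked point: the space is nonzero only if the boundary label is 1.
    dims = [1 if m == 1 else 0 for m in LABELS]
    for _ in range(k - 1):
        # Gluing axiom: sum over the label m on the cut curve.
        dims = [sum(dims[m] for m in LABELS if admissible(m, 1, new))
                for new in LABELS]
    return dims[l]


print(dim_V(3, 1), dim_V(3, 3), dim_V(6, 0), dim_V(6, 2))
```

Because only labels differing by one from the current boundary label are admissible when a point labeled 1 is added, the recursion amounts to counting walks on the path 0, 1, 2, 3.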
Line (4) motivates taking $V(D_i, \text{its 3 marked points and boundary all label 1}) =: V_i \cong \mathbb{C}^2$ as our fundamental unit of computation, the qubit. Note that when $V$ has only a lower index, $1 \le i \le k$, it denotes the qubit supported in the disk $D_i$. We fix the choice of an arbitrary “complementary vector” $v$ in the state space of $D \setminus \bigcup_{i=1}^{k} D_i$, that is $v \in V(D \setminus \bigcup_{i=1}^{k} D_i, \text{all boundary labels} = 1 \text{ except the label on the boundary of } D \text{ is } 0) =: V_{\mathrm{complement}}$. (To keep this space nontrivial, we have taken $k$ even.) Using $v$, the gluing axiom defines an injection:
\[ i_v : (\mathbb{C}^2)^{\otimes k} \cong \bigotimes_{i=1}^{k} V_i \xrightarrow{\ \otimes v\ } \Big( \bigotimes_{i=1}^{k} V_i \Big) \otimes V_{\mathrm{complement}} \xrightarrow{\ \text{as summand}\ } V_{3k}^0. \tag{5} \]
This composition $i_v$ determines the inclusion of the computational qubits within the modular functor $V_{3k}^0$. Observe in the calculation of line (9) below that the complementary vector $v$ will evolve to a different $v'$, but this will be irrelevant to the measurement which is made at the end of the computation. The reader familiar with [FKW] will notice that we use here a dual approach. In that paper, we imbedded the modular functor into a larger Hilbert space that is a tensor power; here we imbed a tensor power into the modular functor. The action of $B(3)$ on $D_i$ yields 1-qubit gates, whereas two qubit gates will be constructed using the action of the six strand braid group $B(6)$ on $D_{i,j}$. Supposing our quantum computer $S_k$ is in state $s$, a given $v$ as above determines a state $i_v(s) = s \otimes v \in V_{3k}^0$. Now suppose we wish to evolve $s$ by a 2-qubit gate $g \in PU(4)$ acting unitarily on $\mathbb{C}^2_i \otimes \mathbb{C}^2_j$ and by id on $\mathbb{C}^2_l$, $l \ne i$ or $j$. Using the gluing axiom (2) and the inclusion (5), we may write
\[ s = \sum_h t_h \otimes u_h, \tag{6} \]
where $\{t_h\}$ is a basis or partial basis for $V_i \otimes V_j \cong \mathbb{C}^2_i \otimes \mathbb{C}^2_j$ and $u_h \in \bigotimes_{l \ne i,j} \mathbb{C}^2_l$, so $s \otimes v = \sum_h (t_h \otimes u_h) \otimes v$. Decomposing along $\gamma = \partial D_{i,j}$, we may write $v = \alpha_0 \otimes \beta_0 + \alpha_2 \otimes \beta_2$, where $\alpha_\epsilon \in V(D_{i,j} \setminus (D_i \cup D_j), \epsilon \text{ on } \gamma)$, $\epsilon = 0$ or 2, and $\beta_\epsilon \in V(D \setminus (\bigcup_{l \ne i,j} D_l \cup D_{i,j}), \epsilon \text{ on } \gamma, \text{ and } 0 \text{ on } \partial D)$. Thus
\[ s \otimes v = \sum_h t_h \otimes u_h \otimes \alpha_0 \otimes \beta_0 + \sum_h t_h \otimes u_h \otimes \alpha_2 \otimes \beta_2. \tag{7} \]
An element of $B(6)$ applied to the 6 marked points in $D_i \cup D_j \subset D_{i,j}$ acts via a representation $\rho^0 \oplus \rho^2 =: \rho$ on $V^0(D_{i,j}, 6 \text{ pts}) \oplus V^2(D_{i,j}, 6 \text{ pts})$, where the superscript denotes the label appearing when the surface is cut along γ. In particular $B(6)$ acts on each factor $t_h \otimes \alpha_0$ and $t_h \otimes \alpha_2$ in (7). Note that $t_h \otimes \alpha_0$ belongs to the summand of $V^0(D_{i,j}, 6 \text{ pts})$ corresponding to boundary labels on $\partial(D_{i,j} \setminus (D_i \cup D_j))$ equal to 0, 1, 1. There is an additional 1-dimensional summand corresponding to boundary labels 0, 3, 3, with 0, 1, 3 and 0, 3, 1 excluded by the triangle inequality (ii) in (3) above. Similarly $t_h \otimes \alpha_2$ belongs to the summand of $V^2(D_{i,j}, 6 \text{ pts})$ with boundary labels 2, 1, 1. There are additional summands corresponding to (2, 1, 3) and (2, 3, 1), of dimension 2 each. Ideally we would find a braid $b = b(g) \in B(6)$ so that $\rho^0(b)(t_h \otimes \alpha_0) = g t_h \otimes \alpha_0$ and $\rho^2(b)(t_h \otimes \alpha_2) = g t_h \otimes \alpha_2$. Then referring to (7) we easily check that
\[ \rho(b)(s \otimes v) = \sum_h (g t_h) \otimes u_h \otimes v, \tag{8} \]
i.e. $\rho(b)$ implements the gate $g$ on the state space $S_k$ of our quantum computer. In practice there are two issues: (i) we cannot control the phase of the output of either $\rho^0$ or $\rho^2$, and (ii) these outputs will be only approximations of the desired gate $g$. The phase issue (i) leads to a change of the complementary vector $v \to v'$, as seen on line (9) below. This is harmless since ultimately we only measure the qubits.
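\[
\begin{aligned}
s \otimes v &= \sum_h t_h \otimes u_h \otimes \alpha_0 \otimes \beta_0 + \sum_h t_h \otimes u_h \otimes \alpha_2 \otimes \beta_2 \\
&\Downarrow \text{ gate} \\
\rho(b)(s \otimes v) &= \omega_0 \sum_h g t_h \otimes u_h \otimes \alpha_0 \otimes \beta_0 + \omega_2 \sum_h g t_h \otimes u_h \otimes \alpha_2 \otimes \beta_2 \\
&= \sum_h \omega_0\, g t_h \otimes u_h \otimes \alpha_0 \otimes \beta_0 + \sum_h \omega_2\, g t_h \otimes u_h \otimes \alpha_2 \otimes \beta_2 \\
&= \sum_h (g t_h \otimes u_h) \otimes (\omega_0\, \alpha_0 \otimes \beta_0 + \omega_2\, \alpha_2 \otimes \beta_2) \\
&=: \sum_h (g t_h \otimes u_h) \otimes v'.
\end{aligned}
\tag{9}
\]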
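The summand structure of $V^0(D_{i,j}, 6 \text{ pts})$ and $V^2(D_{i,j}, 6 \text{ pts})$ described above can be tabulated directly from the fusion conditions (3) and the dimensions $\dim V_3^1 = 2$, $\dim V_3^3 = 1$ of line (4). The following is a hedged sketch with hypothetical function names; it only counts dimensions and says nothing about the representation itself.

```python
def admissible(a, b, c, r=5):
    """Conditions (i)-(iii) of (3) for a three-punctured sphere at level r."""
    parity = (a + b + c) % 2 == 0
    triangle = a <= b + c and b <= a + c and c <= a + b
    bound = a + b + c <= 2 * (r - 2)
    return parity and triangle and bound


# Dimensions of the three-point spaces from line (4); labels 0 and 2 give 0.
DIM_THREE_POINTS = {1: 2, 3: 1}


def summands(outer_label):
    """Dimensions of the summands of V^l(D_ij, 6 pts), keyed by the labels
    (a, b) on the boundaries of D_i and D_j."""
    result = {}
    for a, dim_a in DIM_THREE_POINTS.items():
        for b, dim_b in DIM_THREE_POINTS.items():
            if admissible(outer_label, a, b):
                result[(a, b)] = dim_a * dim_b
    return result


for label in (0, 2):
    parts = summands(label)
    print(label, parts, sum(parts.values()))
```

The (1, 1) summand has dimension 4 in both cases and carries the two-qubit space; the remaining summands have total dimension 1 for outer label 0 and 4 for outer label 2, consistent with the totals 5 and 8 in line (4).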
The approximation issue is addressed by Theorem 2.1 below.

Theorem 2.1. There is a constant $C > 0$ so that for any positive ε and for all unitary $g : \mathbb{C}^2_i \otimes \mathbb{C}^2_j \to \mathbb{C}^2_i \otimes \mathbb{C}^2_j$, there is a braid $b_l$ of length $\le l$ in the generators $\sigma_i$ and their inverses $\sigma_i^{-1}$, $1 \le i \le n - 1$, so that:
\[ \| \omega_0 \rho^0(b_l) - g \oplus \mathrm{id}_1 \| + \| \omega_2 \rho^2(b_l) - g \oplus \mathrm{id}_4 \| \le \epsilon \tag{10} \]
for some unit complex numbers (phases) $\omega_i$, $i = 0, 2$, whenever ε satisfies
\[ l \le C \cdot \log^k(1/\epsilon) \quad \text{for } k \ge 2. \tag{11} \]
We use $\| \cdot \|$ to denote the operator norms, and the subscripts on id indicate the dimension of the orthogonal component in which we are trying not to act.

Proof. The main work in proving Theorem 2.1 is to show that the closure of the image of the representation $\rho : B(6) \to U(5) \times U(8)$ contains $SU(5) \times SU(8)$. Once this is accomplished the estimate (10) follows with some exponent $\ge 2$ from what is called the Solovay–Kitaev theorem [So, Ki2, KSV]. This is a rapid effective approximation theorem originally established in SU(2) with an exponent $> 2$, but in the last reference proved in SU($n$) for all $n$, with the same exponent $k \ge 2$. Also by [KSV] there is a $\log^2(1/\epsilon)$ time classical algorithm which can be used to construct the approximating braid $b_l$ as a word in $\{\sigma_i\}$ and $\{\sigma_i^{-1}\}$. The action $\rho(b)$ “approximately” executes the gate $g$ on $S_k$, but not in the usual sense of approximation, since the image of the state space $\rho(b)(i_v(S_k))$ is only approximately $i_v(S_k)$. This imprecision in the location of the computational qubits within a larger Hilbert space can be called “smearing”. We convert this “smearing of qubits” to errors of the type usually considered in the fault tolerant literature. After each $g$ is approximately executed by $\rho(b)$ we measure the labels around $\bigcup_{i=1}^{k} \partial D_i$ to project the new state $\rho(b)(s \otimes v)$ into the form $s' \otimes v'$, $s' \in S_k$, with probability $1 - O(\epsilon^2)$, $|s' - s| \le O(\epsilon)$. With probability $O(\epsilon^2)$ the label measurement around $\partial D_i$ does not yield one; in this case $V^1(D_i; 3 \text{ pts.}) \cong V_3^1 \cong \mathbb{C}^2$ has collapsed to $V_3^3 \cong \mathbb{C}$ and it is as if a qubit has been
“traced out” of our state space. More specifically, if the label 3 is measured on $\partial D_i$, we replace $V^3(D_i, \text{its 3 marked pts.})$ with a freshly cooled qubit $V^1(D', 3 \text{ pts.})$ with (say) a completely random initial state which we have been saving for such an occasion. The reader may picture dragging $D_i$ off to the edge of the disk $D$ and dragging the ancilla $D'$ in as its replacement (and then renaming $D'$ as $D_i$). The hypothesis that such ancillae are available is discussed below. The error model of [AB] is precisely suited to this situation; Aharonov and Ben-Or show in Chapter 8 that a calculation on the level of “logical” qubits can be kept precisely on track with a probability $\ge \tfrac{2}{3}$ provided the ubiquitous errors at the level of “physical” qubits are of norm $\le O(\epsilon)$ (even if they are systematic and not random) and the large errors (in our case tracing a qubit) have probability also $\le O(\epsilon)$ for some threshold constant $\epsilon > 0$. For this, and all other fault tolerant models, entropy must be kept at bay by ensuring a “cold” stream of ancillary $|0\rangle$’s. In the context of our model we must now explain both the role of measurement and ancilla. Given any essential simple closed curve γ on a surface Σ, the gluing formula reads:
\[ V(\Sigma) = \bigoplus_{l \in L} V(\Sigma_{\mathrm{cut}\,\gamma}, l) \tag{12} \]
so “measuring a label” means that we posit for every γ a Hermitian operator $H_\gamma$ with eigenvalues distinguishing the summands of the r.h.s. of (12) above. For a more comprehensive computational study, we would wish to posit that if γ has length $L$, then $H_\gamma$ can be computed in poly($L$) time. For the present purpose we only need that $H_\gamma$, $\gamma = \partial D_i$ or $\partial D_{i,j}$, can be computed in constant time. Beyond measuring labels, we hypothesize that there is some way of probing the quantum state of the smallest nontrivial building blocks in the theory. For us these are the $k$ qubits $\mathbb{C}^2_i \cong V^1_{3,i}$, $1 \le i \le k$, where the index $i$ refers to the qubit supported in $D_i$. Fix a basis $\{|0\rangle, |1\rangle\}$ for $V_3^1$ and posit for each $D_i$, $1 \le i \le k$, with label 1 on its boundary, an observable Hermitian operator $\sigma_z^i : V^1_{3,i} \to V^1_{3,i}$ which acts as the Pauli matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ in the fixed basis $\{|0\rangle, |1\rangle\}$ for that qubit. In concrete terms, this Pauli operator $\sigma_z^i$ has eigenvectors $|0\rangle$ and $|2\rangle$, where 0 and 2 are the two possible labels which can appear on the simple closed curve $\alpha_i \subset D_i$ which separates exactly two of the three punctures from $\partial D_i$. The Pauli matrix $\sigma_z^i$ might be implemented by first fusing a pair of the punctures in $D_i$ and then measuring the resulting particle type. This then is our repertoire of measurement: $H_\gamma$ is used to “unsmear physical qubits” after each gate, and the $\sigma_z$’s to read out the final state (according to the usual “von Neumann” statistical postulate on measurement) after the computation is completed. In fault tolerant models of computation it is essential to have available a stream of “freshly cooled” ancillary qubits. If these are present from the start of the computation, even if untouched, they will decohere from errors in employing the identity operator. In the physical realization of a quantum computer, unless stored zeros were extremely stable there would have to be some device (inherently not unitary!) for resetting ancillae to $|0\rangle$, e.g. a polarizing magnetic field.
As a theoretical matter, unbounded computation requires such resetting. As discussed near the beginning of this section, in a topological model such as $V(\Sigma)$ it is not unreasonable to postulate that $|0\rangle \in V^1_3 = V^1(D_i, \text{3 pts.})$ is stable if not involved in any gates. An alternative hypothesis is that there is some mechanism outside the system analogous to the polarizing magnetic field above which can “refrigerate” ancillae in the state $|0\rangle$ until they are to be used. We refer below to either of these as the “fresh ancilli” hypothesis. To correct the novel qubit smearing errors, we already encountered the need for ancilli which we took to be an easily maintained random
state $\rho = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$. Other uses of ancilli within fault tolerant schemes require a known pure state $|0\rangle$.
Let us now return to line (1). Let $U$ be the theoretical output of a quantum circuit $C$ of (i.e. composition of) gates to be executed on the physical qubit level so as to fault-tolerantly solve a problem instance of length $n$. We assume the problem is in BQP and that the above composition has length $\leq$ poly($n$). Actually, due to error, $C$ will output a completely positive trace preserving super-operator $O$, called a physical operator. Now simulate $C$ in the modular functor $V$ a gate at a time by a succession of braidings and $H_\gamma$-measurements. With regard to parallelism (necessary in all fault tolerant schemes), notice that disjoint 2-qubit gates can be performed simultaneously if $D_{i,j} \cap D_{i',j'} = \emptyset$. For example this can always be arranged in the linear QCM for gates acting in $D_{i,i+1}$ and $D_{j,j+1}$ provided $i + 1 \neq j$, $j + 1 \neq i$, and $i \neq j$, and even this model is known to be fault tolerant [AB]. From line (9), the complementary vector $v \in V_{\text{complement}}$ evolves probabilistically as the simulation progresses. Different $v$'s will occur as a tensor factor in a growing number of probabilistically weighted terms. However, the various $v$-factors are in the end inconsequential; they simply label a computational state (to be observed with some probability) and are never read by the output measurements $\sigma_z^i$.
We fix terminology and state the main theorems. QCM denotes the exact quantum circuit model. It is known that a quantum circuit operating in the presence of certain kinds of error can still simulate an exact QCM with only polylogarithmic cost in space and time. The basic error model permits gate error of arbitrary super-operator norm (to include identity gates) at some low rate, e.g. $\epsilon \approx 10^{-6}$ per operation site, but demands independence. This error model is enlarged (while retaining efficient simulability) in two ways in [AB] which are important to us here.
First (see line 2.6 [AB]), as long as the probability of these arbitrary errors, which include tracing a qubit, is dominated by the independent case along the “fault-path”, correlations are permitted. Second, small systematic errors are permitted everywhere in the model provided they are small enough, e.g. unitaries may have systematic error of, again, about $10^{-6}$. Let BQP denote the class of decision problems which can be solved with probability $\geq 3/4$ by an exact quantum circuit designed by a classical algorithm in time poly($L$), where $L$ is the length of the problem instance $M$. This same class can be solved in poly-time by a (slightly) error-prone QC.
Let CS5 denote the model of computation described in this section. It is based on the Chern–Simons theory of SU(2) at the fifth root of unity $q = e^{2\pi i/5}$. We review its structure here; a list of generating “braid gates” is given in Sect. 3. The functor is the Hilbert space $V^1_{3k}$; it contains $k$ qubits, $i_v : S_k \to V^1_{3k}$, and can be assigned a standard initial state $\alpha \in i_v(S_k)$. The $3k$-strand braid group $B(3k)$ acts unitarily by $\rho$ on $V^1_{3k}$ and a classical poly-time algorithm converts a circuit $C$ in the QCM to a word in $B(3k)$. Note that the braid group can be implemented in parallel (most of its generators commute) in imitation of that essential feature of quantum circuits. The model has two kinds of measurements $H_\gamma$ and $\sigma_z^i$, but only the latter is allowed in the exact version of the model ECS5. In CS5 we envision access to “fresh ancilli”; in ECS5 there is no need for these. The action $\rho(b)$ of the braid $b$ produces an evolution of $\alpha \otimes v \in V^1_{3k}$ to a probabilistic mixture of states $\gamma_l = \alpha_l \otimes v_l$ with probability $p_l$. Performing $\sigma_z^i$ measurements, $1 \leq i \leq k$, then samples $\gamma_l$ and observes only the $\alpha_l$ factor.
Classical poly($L$)-time post-processing of these $k$ observations can be permitted in the model but equivalently this step can be folded back into the quantum circuit phase to make the observation of $\sigma_z^1$ on the first qubit the one and only read-out.
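The success threshold $3/4$ in the definition of BQP above can be raised by independent repetition followed by a majority vote. The following is a minimal sketch of that standard amplification argument, not a construction taken from this paper; the function name `majority_success` and the restriction to an odd number of trials are illustrative assumptions.

```python
from math import comb

def majority_success(p, trials):
    """Probability that more than half of `trials` independent runs are correct,
    when each run is correct with probability p (trials is assumed odd)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials // 2 + 1, trials + 1)
    )

# Success probability of a majority vote over 1, 3 and 101 runs at p = 3/4.
single = majority_success(0.75, 1)
triple = majority_success(0.75, 3)
many = majority_success(0.75, 101)
```

With a single run the value is $3/4$, with three runs it is $27/32$, and it approaches 1 rapidly as the number of runs grows.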
Without error-correction no model, ECS5 included, can compute for very long if subjected to errors of any constant size or probability $> 0$. However we explicitly assume that CS5 faces the kinds of environmental error analyzed in [AB] in addition to its intrinsic “gate errors” (from the approximate output of the Solovay–Kitaev theorem) and qubit smearing errors inherent in the model. Specifically for some small $\delta > 0$ permit (1) $\delta$-small systematic errors in each operation $\sigma_i^{\pm}$ or identity and (2) a probability of large environmental errors, which is dominated by the probability of independent individual errors of probability $< \delta$ each.
Theorem 2.2. Given a problem in BQP and an instance $M$ of length $L$, a classical poly-time algorithm can convert the quantum circuit $C$ for $M$ into a braid $b \in B(3k)$. Implementing $\rho(b)$ on $V^1_{3k}$ and measuring $\sigma_z^1$ will correctly solve $M$ with probability $\geq 3/4$. The number of marked points to be braided (space $= 3k$) and the length of the braiding exceed the size of the original circuit $C$ by at most a multiplicative poly(log($L$)) factor. Taken in triples, the points support the “physical qubits” of the [AB] fault tolerant model. Thus CS5 provides a model which efficiently and fault tolerantly simulates the computations of QCM.
We note that the use of label measurements $H_\gamma$ introduces non-unitary steps in the middle of our simulation. As usual the probability $3/4$ is independent w.r.t. trials and so converges exponentially to 1 upon repetition of the entire procedure.
Proof. The proof relies heavily on Chapter 8 [AB] to reduce the QCM to a linear quantum circuit (with state space $S_k$) stable under a very liberal error model – one permitting small systematic errors plus
rare large but uncorrelated qubit errors or trace over a qubit. In the final state $\gamma = \sum_l p_l \gamma_l$, each $\gamma_l$ admits a tensor decomposition according to the geometry: $D = (\cup_i D_i) \cup (\text{complement})$, but along the $k$ boundary components $\cup_i \partial D_i$ all choices of labels 1 or 3 may appear. In writing $\gamma_l = \alpha_l \otimes v_l$ we must remember that associated to $l$ is an element $[l] \in \{1, 3\}^k$ which defines the subspace, the $[l]$-sector, of the modular functor in which $\gamma_l$ lies. All occurrences of the label 3 correspond to a $\mathbb{C}$ tensor factor, $\mathbb{C} \cong V^3_3 \cong V^3(D_i, \text{3 pts}) \subset V(D_i, \text{3 pts})$, whereas the label 1 corresponds to a $\mathbb{C}^2$ factor. Thus in the [AB] framework each label 3 corresponds to a “lost” or, according to our replacement procedure $D_i \longleftrightarrow D'$, a traced qubit. (Losing an occasional qubit from the computational space $S_k$ is the price we pay to “unsmear” $S_k$ within the modular functor.) Theorem 2.1 implies that for a braid length $= O(1/\epsilon^2)$ a qubit will be traced with probability $O(\epsilon^2)$ and if no qubit is lost the gate will be performed with error $O(\epsilon)$ on pure states. Factoring a mixed state as a probabilistic combination of pure states and passing the error estimate across the probabilities we see that for $\delta > 0$ sufficiently small, the $O(\epsilon)$ error bound holds with high probability on the observed $\gamma_l$. Thus for $\epsilon$ sufficiently small (estimated $\approx 10^{-6}$ [AB]), observing $\alpha_l$ amounts to sampling from an error prone implementation of the quantum circuit $C$. The error model is not entirely random in that the approximation procedure used to construct $b$ will have systematic biases. This implies that the $O(\epsilon)$ errors introduced in the functioning of each gate are not random and must be treated as “malicious”. The error model explained in Chapter 8 [AB] permits such small errors to be arbitrary as long as the large error, e.g. qubit losses, occurs with a probability dominated by a small constant independent of the qubit and the computational history. This is consistent with the assumptions on the CS5 model.
This completes the proof of Theorem 2.2 modulo the proof of the density Theorem 4.1. We now turn to the exact variant ECS5, in which we assume that all the braid groups act exactly (no error) on the modular functor V . The only difference in the algorithm
for modeling the QCM in ECS5 is the simplification that $H_\gamma$ measurements are not performed in the middle of the simulation, but only at the very end, prior to reading out the qubits $S_k$ with $\sigma_z^k$ measurements.
Theorem 2.3. There is an efficient and strictly unitary simulation of QCM by ECS5. Thus given a problem instance $M$ of length $L$ in BQP, there is a classical poly($L$) time algorithm for constructing a braid $b$ as a word of length poly($L$) in the generators $\sigma_i$, $1 \leq i \leq$ poly($L$). Let $k$ be another polynomial function of $L$. Applying $b$ to a standard initial state, $\psi_{\text{initial}} \in V^0(D, 3k)$, results in a state $\psi_{\text{final}} \in V^0(D, 3k)$, so that the results of $H_\gamma$ on $\partial D_i$ followed by $\sigma_z^i$ measurements on $\psi_{\text{final}}$ correctly solve the problem instance $M$ with probability $\geq 3/4$.
Proof. In the quantum circuit $C$ for $M$ (implied by the problem lying in BQP) count the number $n$ of gates to be applied. Use line (11) to approximate each gate $g$ by a braid $b$ of length $l$ so that the operator norm error $\|\rho(b) - g\|$ of the approximating gate will be less than $\epsilon n^{-1}$, for some fixed $\epsilon > 0$. The composition of $n$ braids which gate-wise simulate the quantum circuit introduces an error in operator norm $< \epsilon$. It follows that the approximation of the desired unitary by the braid results in a state $\psi'_{\text{final}}$ so that the absolute angle $|\angle(\psi_{\text{final}}, \psi'_{\text{final}})| \leq 2 \arcsin(\epsilon/2)$. The application of our two measurement steps will therefore return an answer nearly as reliable as the original quantum circuit $C$: the probability $p$ that the sequential measurements $H_\gamma$ and $\sigma_z^1$ (which is defined if and only if $H_\gamma$ projects to $V^1(D, \text{3 pts.})$) will give different results for $\psi_{\text{final}}$ and $\psi'_{\text{final}}$ is $\leq \sin(2 \arcsin(\epsilon/2)) < \epsilon$. So with probability $1 - p > 1 - \epsilon$ the final measurement $|0\rangle$ or $|1\rangle$ will be the same in the quantum circuit $C$ and the ECS5 model.
Remark. Theorems 2.2 and 2.3 are complementary.
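The two estimates used in the proof of Theorem 2.3 above can be sanity-checked numerically. The sketch below assumes, as in the proof, that each of the $n$ gates is approximated to operator-norm error $\epsilon/n$, so that the composite error is at most $\epsilon$ by the triangle inequality, and that two unit vectors at distance at most $\epsilon$ subtend an angle of at most $2\arcsin(\epsilon/2)$. The function names are ours and purely illustrative.

```python
import math

def composite_error_bound(per_gate_error, n_gates):
    """Triangle-inequality bound on the operator-norm error of a product of
    n_gates unitaries, each approximated to within per_gate_error."""
    return n_gates * per_gate_error

def angle_bound(eps):
    """Largest angle between two unit vectors whose distance is at most eps."""
    return 2.0 * math.asin(eps / 2.0)

def disagreement_bound(eps):
    """sin(2 arcsin(eps/2)) = eps * sqrt(1 - eps^2 / 4), which lies below eps."""
    return math.sin(angle_bound(eps))

# Example: 50 gates, each approximated to within eps / 50.
eps = 0.01
total = composite_error_bound(eps / 50, 50)
bound = disagreement_bound(eps)
```

The closed form $\sin(2\arcsin(\epsilon/2)) = \epsilon\sqrt{1 - \epsilon^2/4}$ makes the strict inequality $< \epsilon$ visible for every $0 < \epsilon \leq 2$.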
One provides additional fault tolerance – fault tolerance beyond what might be inherent in a topological model – but at the cost of introducing intermediate non-unitary steps (i.e. measurements). The other eschews intermediate measurements and so gives a strictly unitary simulation, but cannot confer additional fault tolerance. It is an interesting open technical problem whether fault tolerance and strict unitarity can be combined in a universal model of computation based on topological modular functors. Looking ahead to a possible implementation, however, intermediate measurements as in the fault tolerant model do not seem undesirable.
3. Jones’ Representation of the Braid Groups
A TMF gives a family of representations of the braid groups and mapping class groups. In this section, we identify the representations of the braid groups from the SU(2) modular functor at primitive roots of unity with the irreducible sectors of the representation discovered by Jones whose weighted trace gives the Jones polynomial of the closure link of the braid [J1, J2]. To prove universality of the modular functor for quantum computation, we only use this portion of the TMF. Therefore, we will focus on these representations. First let us describe the Jones representation of the braid groups explicitly following [We]. To do so, we need first to describe the representation of the Temperley–Lieb–Jones algebras $A_{\beta,n}$. Fix some integer $r \geq 3$ and $q = e^{2\pi i/r}$. Let $[k]$ be the quantum integer defined as $[k] =$
$\dfrac{q^{k/2} - q^{-k/2}}{q^{1/2} - q^{-1/2}}$. Note that $[-k] = -[k]$, and $[2] = q^{1/2} + q^{-1/2}$. Then $\beta := [2]^2 = q + \bar{q} + 2 = 4\cos^2(\pi/r)$. The algebras $A_{\beta,n}$ are the finite dimensional $C^*$-algebras generated by 1 and projectors $e_1, \cdots, e_{n-1}$ such that
1. $e_i^2 = e_i$, and $e_i^* = e_i$,
2. $e_i e_{i\pm 1} e_i = \beta^{-1} e_i$,
3. $e_i e_j = e_j e_i$ if $|i - j| \geq 2$,
and there exists a positive trace $\mathrm{tr} : \bigcup_{n=1}^{\infty} A_{\beta,n} \to \mathbb{C}$ such that $\mathrm{tr}(x e_n) = \beta^{-1} \mathrm{tr}(x)$ for all $x \in A_{\beta,n}$. The Jones representation of $A_{\beta,n}$ is the representation corresponding to the G.N.S. construction with respect to the above trace. An important feature of the Jones representation is that it splits as a direct sum of irreducible representations indexed by some 2-row Young diagrams, which we will refer to as sectors. A Young diagram $\lambda = [\lambda_1, \ldots, \lambda_s]$, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_s$, is called a $(2, r)$ diagram if $s \leq 2$ (at most two rows) and $\lambda_1 - \lambda_2 \leq r - 2$. Let $\Lambda_n^{(2,r)}$ denote all $(2, r)$ diagrams with $n$ nodes. Given $\lambda \in \Lambda_n^{(2,r)}$, let $T_\lambda^{(2,r)}$ be all standard tableaus $\{t\}$ with shape $\lambda$ satisfying the inductive condition which is the analogue of (iii) in (3): when $n, n-1, \ldots, 2, 1$ are deleted from $t$ one at a time, each tableau that appears is a tableau for some $(2, r)$ Young diagram. The representation of $A_{\beta,n}$ is a direct sum of irreducible representations $\pi_\lambda^{(2,r)}$ over all $(2, r)$ Young diagrams $\lambda$. The representation $\pi_\lambda^{(2,r)}$ for a fixed $(2, r)$ Young diagram $\lambda$ is given as follows: let $V_\lambda^{(2,r)}$ be the complex vector space with basis $\{v_t, t \in T_\lambda^{(2,r)}\}$. Given a generator $e_i$ in the Temperley–Lieb–Jones algebra and a standard tableau $t \in T_\lambda^{(2,r)}$, suppose $i$ appears in $t$ in row $r_1$ and column $c_1$, and $i + 1$ in row $r_2$ and column $c_2$. Denote by $d_{t,i} = c_1 - c_2 - (r_1 - r_2)$, $\alpha_{t,i} = \dfrac{[d_{t,i} + 1]}{[2][d_{t,i}]}$, and $\beta_{t,i} = \sqrt{\alpha_{t,i}(1 - \alpha_{t,i})}$. They are both non-negative real numbers and satisfy the equation $\alpha_{t,i} = \alpha_{t,i}^2 + \beta_{t,i}^2$. Then we define
$$\pi_\lambda^{(2,r)}(e_i)(v_t) = \alpha_{t,i} v_t + \beta_{t,i} v_{g_i(t)}, \qquad (13)$$
where $g_i(t)$ is the tableau obtained from $t$ by switching $i$ and $i + 1$ if $g_i(t)$ is in $T_\lambda^{(2,r)}$. If $g_i(t)$ is not in $T_\lambda^{(2,r)}$, then $\alpha_{t,i}$ is 0 or 1 given by its defining formula. This can occur in several cases. It follows that $\pi_\lambda^{(2,r)}(e_i)$ with respect to the basis $\{v_t\}$ is a matrix consisting of only $2 \times 2$ and $1 \times 1$ blocks. Furthermore, the $1 \times 1$ blocks are either 0 or 1, and the $2 \times 2$ blocks are
$$\begin{pmatrix} \alpha_{t,i} & \beta_{t,i} \\ \beta_{t,i} & 1 - \alpha_{t,i} \end{pmatrix}. \qquad (14)$$
The identity $\alpha_{t,i} = \alpha_{t,i}^2 + \beta_{t,i}^2$ implies that (14) is a projector. So all eigenvalues of $e_i$ are either 0 or 1. The Jones representation of the braid groups is defined by
$$\rho_{\beta,n}(\sigma_i) = q - (1 + q) e_i. \qquad (15)$$
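Formulas (13)–(15) can be exercised on the smallest nontrivial case. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes $r = 5$, $n = 3$ and the shape $[2, 1]$, orders the two standard tableaux as $(t_2, t_1)$ with $t_2$ having rows $(1\,3 \mid 2)$ and $t_1$ having rows $(1\,2 \mid 3)$ (our choice of ordering), and uses the fact that for $|q| = 1$ the quantum integer equals $\sin(\pi k/r)/\sin(\pi/r)$. All helper names are hypothetical.

```python
import cmath
import math

def quantum_integer(k, r):
    """[k] for q = exp(2*pi*i/r); on the unit circle it is sin(pi*k/r)/sin(pi/r)."""
    return math.sin(math.pi * k / r) / math.sin(math.pi / r)

def projector_block(d, r):
    """The 2x2 block (14) for d = d_{t,i}; assumes [d] is nonzero."""
    a = quantum_integer(d + 1, r) / (quantum_integer(2, r) * quantum_integer(d, r))
    b = math.sqrt(a * (1.0 - a))
    return [[a, b], [b, 1.0 - a]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def braid_matrix(e, q):
    """rho(sigma_i) = q - (1 + q) e_i, formula (15)."""
    return [[(q if i == j else 0.0) - (1.0 + q) * e[i][j] for j in range(2)]
            for i in range(2)]

def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

r = 5
q = cmath.exp(2j * math.pi / r)
beta = quantum_integer(2, r) ** 2
# e_1 is diagonal: alpha = 1 on t2 (d = 1) and alpha = 0 on t1 (d = -1).
e1 = [[1.0, 0.0], [0.0, 0.0]]
# e_2 mixes t2 and t1; for t2 the entry 2 is at (row 2, col 1), 3 at (row 1, col 2).
e2 = projector_block(-2, r)
s1, s2 = braid_matrix(e1, q), braid_matrix(e2, q)

identity = [[1.0, 0.0], [0.0, 1.0]]
projector_defect = max_abs_diff(matmul(e2, e2), e2)
tl_defect = max_abs_diff(matmul(matmul(e1, e2), e1),
                         [[x / beta for x in row] for row in e1])
unitarity_defect = max_abs_diff(matmul(s2, dagger(s2)), identity)
braid_defect = max_abs_diff(matmul(matmul(s1, s2), s1),
                            matmul(matmul(s2, s1), s2))
```

All four defects should vanish up to rounding; in addition the trace and determinant of `s2` are $q - 1$ and $-q$, so its eigenvalues are $-1$ and $q$.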
Combining (15) with the above representation of the Temperley–Lieb–Jones algebra, we get Jones’ representation of the braid groups, denoted still by $\rho_{\beta,n}$:
$$\rho_{\beta,n} : B_n \to A_{\beta,n} \to U(N_{\beta,n}),$$
where the dimension $N_{\beta,n} = \sum_{\lambda \in \Lambda_n^{(2,r)}} \dim V_\lambda^{(2,r)}$ grows asymptotically as $\beta^n$.
When $|q| = 1$, as we have seen already, Jones’ representation $\rho_{\beta,n}$ is unitary. To verify that $\rho(\sigma_i)\rho^*(\sigma_i) = 1$, note $\rho^*(\sigma_i) = \bar{q} - (1 + \bar{q}) e_i^*$. So we have $\rho(\sigma_i)\rho^*(\sigma_i) =$
$q\bar{q} + (1 + q)(1 + \bar{q}) e_i e_i^* - (1 + q) e_i - (1 + \bar{q}) e_i^* = 1$. We use the fact $e_i^* = e_i$ and $e_i^2 = e_i$ to cancel out the last 3 terms.
From the definition, $\rho_{\beta,n}$ also splits as a direct sum of representations over $(2, r)$ Young diagrams. A sector corresponding to a particular Young diagram $\lambda$ will be denoted by $\rho_{\lambda,\beta,n}$. Now we collect some properties about the Jones representation of the braid groups into the following:
Theorem 3.1. (i) For each $(2, r)$-Young diagram $\lambda$, the representation $\rho_{\lambda,\beta,n}$ is irreducible. (ii) The matrices $\rho_{\lambda,\beta,n}(\sigma_i)$ for $i = 1, 2$ generate an infinite subgroup of U(2) modulo center for $r \neq 3, 4, 6, 10$. (iii) Each matrix $\rho_{\lambda,\beta,n}(\sigma_i)$, $1 \leq i \leq n - 1$, has exactly two distinct eigenvalues $-1, q$. (iv) For the $(2, 5)$-Young diagram $\lambda = [4, 2]$, $n = 6$, the two eigenvalues $-1, q$ of every $\rho_{\lambda,\beta,6}(\sigma_i)$ have multiplicity 3 and 5 respectively.
The proofs of (i) and (ii) are in [J2]. For (iii), first note that the matrix $\rho_{\lambda,\beta,n}(\sigma_1)$ is a diagonal matrix with respect to the basis $\{v_t\}$ with only two distinct eigenvalues $-1, q$. Now (iii) follows from the fact that all braid generators $\sigma_i$ are conjugate to each other. For (iv), simply check the explicit matrix for $\rho_{\lambda,\beta,6}(\sigma_1)$ at the end of this section.
Now we identify the sectors of the Jones representation with the representations of the braid groups coming from the SU(2) Chern–Simons modular functor. The SU(2) Chern–Simons modular functor $CS_r$ of level $r$ has been constructed several times in the literature (for example, [RT, T, Wa, G]). Our construction of the modular functor $CS_r$ is based on skein theory [KL]. The key ingredient is the substitution of Jones–Wenzl idempotents for the intertwiners of the irreducible representations of quantum groups [RT, T, Wa]. This is the same SU(2) modular functor as constructed using quantum groups in [RT] (see [T]) which is regarded as a mathematical realization of the Witten–Chern–Simons theory.
All formulae we need for skein theory are summarized in Chapter 9 of [KL] with appropriate admissible conditions. Fix an integer $r \geq 3$. Let $A = \sqrt{-1} \cdot e^{-2\pi i/4r}$, $s = A^2$, and $q = A^4$. (Note the confusion caused by notations. The $q$ in [KL] is $A^2$ which is our $s$ here. But in Jones’ representation of the braid groups [J2], $q$ is $A^4$. In all formulae in [KL], $q$ should be interpreted as $s$ in our notation.) The label set $L$ of the modular functor $CS_r$ will be $\{0, 1, \ldots, r - 2\}$ and the involution is the identity. We are interested in a unitary modular functor and the one in [G] is not unitary. We claim that if we follow the same construction of [G] using our choice of $A$ and endow all state spaces of the modular functor with the following Hermitian inner product, the resulting modular functor $CS_r$ is unitary. The relevant Hilbert space structure has also been constructed earlier by others (e.g. in [KS, KSVo]).
Given a surface $\Sigma$, a pants decomposition of $\Sigma$ determines a basis of $V(\Sigma)$: each basis element is a tensor product of the basis elements of the constituent pants. The desired inner products are determined by axiom (2.14) [Wa] if we specify an inner product on each space $V_{abc}$. Our choice of $A$ makes all constants $S(a)$ appearing in the axiom (2.14) [Wa] positive. Consequently, positive definite Hermitian inner products on all spaces $V_{abc}$ determine a positive definite Hermitian inner product on $V(\Sigma)$. The vector space $V_{abc}$ of the three punctured sphere $P_{abc}$ is defined to be the skein space of the disk $D_{abc}$ enclosed by the seams of the punctured sphere $P_{abc}$. The numbering of the three punctures induces a numbering of the three boundary “points” of the disk $D_{abc}$ labeled by $\{a, b, c\}$. Suppose $t$ is a tangle on $D_{abc}$ in the skein space of $D_{abc}$, and let $\bar{t}$ be the tangle on $D_{abc}$ obtained by reflecting the disk $D_{abc}$ through the first
boundary point and the origin. Then the inner product $\langle\,,\rangle_h : V_{abc} \times V_{abc} \to \mathbb{C}$ is as follows: given two tangles $s$ and $t$ on $D_{abc}$, their product $\langle s, t \rangle_h$ is the Kauffman bracket evaluation of the resulting diagram on $S^2$ obtained by gluing the two disks with $s$ and $\bar{t}$ on them respectively, along their common boundaries with matching numberings. Extending $\langle\,,\rangle_h$ on the skein space of $D_{abc}$ linearly in the first coordinate and conjugate linearly in the second coordinate, we obtain a positive definite Hermitian inner product on $V_{abc}$. It is also true that the mapping class groupoid actions in the basic data respect this Hermitian product, and the fusion and scattering matrices $F$ and $S$ also preserve this product. So $CS_r$ is indeed a unitary modular functor.
This modular functor $CS_r$ defines representations of the central extension of the mapping class groups of labeled extended surfaces, in particular for $n$-punctured disks $D_n^m$ with all interior punctures labeled 1 and boundary labeled $m$. If $m \neq 1$, then the mapping class group is the braid group $B_n$. If $m = 1$, then the mapping class group is the spherical braid group $SB_{n+1} = M(0, n + 1)$. Recall that we suppress the issues of framing and central extension as they are inessential in our discussion. Also the representation of the mapping class groups coming from $CS_r$ will be denoted simply by $\rho_r$.
Theorem 3.2. Let $D_n^m$ be as above. (1) If $m + n$ is even, and $m \neq 1$, then $\rho_r$ is equivalent to the irreducible sector of the Jones representation $\rho_{\lambda,\beta,n}$ for the Young diagram $\lambda = [\frac{m+n}{2}, \frac{n-m}{2}]$ up to phase. (2) If $n$ is odd, and $m = 1$, then the composition of $\rho_r$ with the natural map $\iota : B_n \to SB_{n+1}$ is equivalent to the irreducible sector of the Jones representation $\rho_{\lambda,\beta,n}$ for the Young diagram $\lambda = [\frac{n+1}{2}, \frac{n-1}{2}]$ up to phase.
The equivalence of these two representations was first established in a non-unitary version [Fu]. A computational proof of this theorem can be obtained following [Fu].
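So we will be content with giving some examples for $r = 5$. To get a universal set of gates using these matrices, all we need is to realize the Solovay–Kitaev theorem by an algorithm for any prescribed precision [KSV, NC].
For the $(2, 5)$ Young diagram $\lambda = [2, 1]$, $n = 3$, with an appropriate ordering of the basis:
$$\rho_{[2,1],\beta,3}(\sigma_1) = \begin{pmatrix} -1 & 0 \\ 0 & q \end{pmatrix}, \qquad \rho_{[2,1],\beta,3}(\sigma_2) = \begin{pmatrix} \frac{q^2}{q+1} & -\frac{q\sqrt{[3]}}{q+1} \\ -\frac{q\sqrt{[3]}}{q+1} & -\frac{1}{q+1} \end{pmatrix},$$
where the quantum integer $[3] = q + \bar{q} + 1$.
For the $(2, 5)$ Young diagram $\lambda = [3, 3]$, $n = 6$, the representation is 5-dimensional. With an appropriate ordering of the basis, we have:
$$\rho_{[3,3],\beta,6}(\sigma_1) = \begin{pmatrix} -1 & & & & \\ & q & & & \\ & & -1 & & \\ & & & q & \\ & & & & q \end{pmatrix}, \qquad \rho_{[3,3],\beta,6}(\sigma_2) = \begin{pmatrix} \frac{q^2}{q+1} & -\frac{q\sqrt{[3]}}{q+1} & & & \\ -\frac{q\sqrt{[3]}}{q+1} & -\frac{1}{q+1} & & & \\ & & \frac{q^2}{q+1} & -\frac{q\sqrt{[3]}}{q+1} & \\ & & -\frac{q\sqrt{[3]}}{q+1} & -\frac{1}{q+1} & \\ & & & & q \end{pmatrix}.$$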
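The dimensions of these sectors can be cross-checked by counting the tableaux in $T_\lambda^{(2,r)}$. The sketch below assumes the inductive condition as stated earlier in this section: deleting the largest entry of a standard tableau must always leave a $(2, r)$ diagram, i.e. a two-row shape $[a, b]$ with $a \geq b$ and $a - b \leq r - 2$. The function name `count_tableaux` is ours.

```python
from functools import lru_cache

def count_tableaux(shape, r):
    """Number of standard tableaux of the two-row shape (top, bottom) all of
    whose intermediate shapes are (2, r) diagrams."""
    top, bottom = shape

    def is_2r_diagram(a, b):
        return a >= b >= 0 and a - b <= r - 2

    @lru_cache(maxsize=None)
    def count(a, b):
        if not is_2r_diagram(a, b):
            return 0
        if a == 0 and b == 0:
            return 1
        # The largest entry sits at the end of row 1 or row 2; remove it.
        from_row1 = count(a - 1, b) if a > 0 else 0
        from_row2 = count(a, b - 1) if b > 0 else 0
        return from_row1 + from_row2

    return count(top, bottom)

dim_21 = count_tableaux((2, 1), 5)         # 2
dim_33 = count_tableaux((3, 3), 5)         # 5
dim_42 = count_tableaux((4, 2), 5)         # 8
dim_42_large_r = count_tableaux((4, 2), 7) # 9
```

For the shape $[4, 2]$ the only tableau removed at $r = 5$ is the one whose first row is filled with $1, 2, 3, 4$ first, since it passes through the one-row shape $[4]$ with $\lambda_1 - \lambda_2 = 4 > r - 2$.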
For the $(2, 5)$ Young diagram $\lambda = [4, 2]$, $n = 6$, the representation is 8-dimensional. Here the inductive condition on basis elements makes one standard tableau illegal, so the representation is not 9-dimensional as it would be if $r > 5$. This is the restriction analogous to (iii) in (3) for the modular functor. With an appropriate ordering of the basis:
$$\rho_{[4,2],\beta,6}(\sigma_1) = \mathrm{diag}(-1, q, -1, q, -1, q, q, q).$$
4. A Density Theorem
In this section, we prove the density theorem.
Theorem 4.1. Let $\rho := \rho_{[3,3]} \oplus \rho_{[4,2]} : B_6 \to U(5) \times U(8)$ be the Jones representation of $B_6$ at the 5th root of unity $q = e^{2\pi i/5}$. Then the closure of the image of $\rho(B_6)$ in $U(5) \times U(8)$ contains $SU(5) \times SU(8)$.
By Theorem 3.2, this is the same representation $\rho := \rho^0 \oplus \rho^2 : B_6 \to U(5) \times U(8)$ in the SU(2) Chern–Simons modular functor at the 5th root of unity used in Sect. 2 to build a universal quantum computer. In the following, a key fact used is that the image matrix of each braid generator under the Jones representation has exactly two eigenvalues $\{-1, q\}$ whose ratio is not $\pm 1$. This strong restriction allows us to identify both the closed image and its representation.
Proof. First it suffices to show that the images of $\rho_{[3,3]}$ and $\rho_{[4,2]}$ contain SU(5) and SU(8), respectively. Supposing so, if $K = \overline{\rho(B_6)} \cap (SU(5) \times SU(8))$, then the two projections $p_1 : K \to SU(5)$ and $p_2 : K \to SU(8)$ are both surjective. Let $N_2$ (respectively $N_1$) be the kernel of $p_1$ (respectively $p_2$). Then $N_1$ (respectively $N_2$) can be identified as a normal subgroup of SU(5) (respectively SU(8)). By Goursat’s Lemma (p. 54, [La]), the image of $K$ in $SU(5)/N_1 \times SU(8)/N_2$ is the graph of some isomorphism $SU(5)/N_1 \cong SU(8)/N_2$. As the only nontrivial normal subgroups of SU($n$) are finite groups, this is possible only if $N_1 = SU(5)$ and $N_2 = SU(8)$. Therefore, $K = SU(5) \times SU(8)$. The proofs of the density for $\rho_{[3,3]}$ and $\rho_{[4,2]}$ are similar. So we prove both cases at the same time and give a separate argument for the more complicated case $\rho_{[4,2]}$ when necessary.
Let G be the closure of the image of ρ[3,3] (or ρ[4,2] ) in U(5) (or U(8)) which we will try to identify. By Theorem 3.1, G is a compact subgroup of U(m) (m = 5 or 8) of positive dimension. Denote by V the induced m-dimensional faithful, irreducible complex representation of G. The representation V is faithful since G is a subgroup of U(m). Let H be the identity component of G. What we actually show is that the derived group of H , Der(H ) = [H, H ], is actually SU(m). We will divide the proof into several steps. Claim 1. The restriction of V to H is an isotypic representation, i.e. a direct sum of several copies of a single irreducible representation of H .
Proof. As $G$ is compact, $V = \bigoplus_P V_P$, where $P$ runs through some irreducible representations of $H$, and $V_P$ is the direct sum of all the copies of $P$ contained in $V$. Since $H$ is a normal subgroup, and the braid generators $\sigma_i$ topologically generate $G$, the $\sigma_i$'s permute transitively the isotypic components $V_P$ [CR, Sect. 49]. If there is more than 1 such component, then some $\sigma_i$ acts nontrivially, so it must permute these blocks. Now we need a linear algebra lemma:
Lemma 4.2. Suppose $W$ is a vector space with a direct sum decomposition $W = \bigoplus_{i=1}^{n} W_i$, and there is a linear automorphism $T$ such that $T : W_i \to W_{i+1}$, $1 \leq i \leq n$, cyclically. Then the product of any eigenvalue of $T$ with any $n$th root of unity is still an eigenvalue of $T$.
Proof. Choose a basis of $W$ consisting of bases of $W_i$, $i = 1, 2, \ldots, n$. If $k$ is not a multiple of $n$, then $\mathrm{tr}\, T^k = 0$, as all diagonal entries are 0 with respect to the above basis. Let $\{\lambda_i\}$ be all eigenvalues of $T$. (They may repeat.) Consider all values of $\mathrm{tr}\, T^m = \sum_i \lambda_i^m$ ($m = 1, 2, \ldots$), which are sums of $m$th powers of all eigenvalues of $T$. These sums of $m$th powers of $\{\lambda_i\}$ are invariant if we simultaneously multiply all the eigenvalues $\{\lambda_i\}$ by an $n$th root of unity $\omega$: $\sum_i (\omega\lambda_i)^m = \omega^m \sum_i \lambda_i^m = \omega^m\, \mathrm{tr}\, T^m$, which is equal to $\mathrm{tr}\, T^m = \sum_i \lambda_i^m$ because when $m$ is not a multiple of $n$, they are both 0, and when $m$ is, $\omega^m = 1$. These values $\mathrm{tr}\, T^m$ uniquely determine the eigenvalues of $T$, and therefore the set of the eigenvalues of $T$ is invariant under multiplication by any $n$th root of unity.
Back to Claim 1, if there is more than one isotypic component, then some $\sigma_i$ will have an orbit of length at least 2. It is impossible to have an orbit of length 3 or more by the above lemma as this will lead to at least 3 eigenvalues. If the orbit is of length 2, then, as $\rho(\sigma_i)$ has only two eigenvalues $\{a, b\}$, by the lemma $\{-a, -b\}$ are also eigenvalues. It follows that $a = -b$, which is impossible since the ratio of the two eigenvalues $-1, q$ is not $\pm 1$.
Claim 2. The restriction of $V$ to $H$ is an irreducible representation.
Proof. By Claim 1, $V|_H$ has only one isotypic component. If $V|_H$ is reducible, then the isotypic component is a tensor product $V_1 \otimes V_2$, where $V_1$ is the irreducible representation of $H$ in the isotypic component and $V_2$ is a trivial representation of $H$ with $\dim V_2 \geq 2$. If $V_1$ is 1-dimensional, then $\rho(\sigma_i)$, $i = 1, 2$, generate a finite subgroup of U($m$) modulo center which is excluded by Theorem 3.1. So we have $\dim V_1 \geq 2$. Now we recall a fact in representation theory: a representation of a group $\rho : G \to GL(V)$ is irreducible if and only if the image $\rho(G)$ of $G$ generates the full matrix algebra $\mathrm{End}(V)$. As $V_1$ is an irreducible representation of $H$, the image $\rho(H)$ generates $\mathrm{End}(V_1) \otimes \mathrm{id}_2$, where the subscript of id indicates the tensor factor. As the elements $\sigma_i$ normalize $H$, they also normalize the subalgebra $\mathrm{End}(V_1) \otimes \mathrm{id}_2$ in $\mathrm{End}(V_1 \otimes V_2)$. Consequently they act as automorphisms of the full matrix algebra $\mathrm{End}(V_1)$. Any automorphism of a full matrix algebra is a conjugation by a matrix, so the braid generators $\sigma_i$ act via conjugation (up to a scalar multiple) as invertible matrices in $\mathrm{End}(V_1) \otimes \mathrm{id}_2$ modulo its centralizer. It is not hard to see the centralizer of $\mathrm{End}(V_1) \otimes \mathrm{id}_2$ in $\mathrm{End}(V_1 \otimes V_2)$ is $\mathrm{id}_1 \otimes \mathrm{End}(V_2)$.
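The first step in the proof of Lemma 4.2 above, the vanishing of $\mathrm{tr}\, T^k$ whenever $k$ is not a multiple of $n$, can be observed directly on a random example. The sketch below is illustrative only: it builds a real matrix that maps each block $W_i$ into $W_{i+1}$ cyclically, with random entries that are generically (not provably) invertible, and lists the traces of its powers. Helper names are hypothetical.

```python
import random

def cyclic_block_matrix(block_dim, n, seed=0):
    """Random real matrix on W = W_1 + ... + W_n (each of dimension block_dim)
    sending W_i into W_{i+1}, with W_n sent back into W_1."""
    rng = random.Random(seed)
    size = block_dim * n
    T = [[0.0] * size for _ in range(size)]
    for i in range(n):
        target = (i + 1) % n
        for row in range(block_dim):
            for col in range(block_dim):
                T[target * block_dim + row][i * block_dim + col] = rng.uniform(-1.0, 1.0)
    return T

def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def power_traces(T, max_power):
    """Return [tr T^1, ..., tr T^max_power]."""
    traces, P = [], T
    for _ in range(max_power):
        traces.append(sum(P[i][i] for i in range(len(P))))
        P = matmul(P, T)
    return traces

n = 3
traces = power_traces(cyclic_block_matrix(block_dim=2, n=n), max_power=6)
off_cycle = [traces[k - 1] for k in range(1, 7) if k % n != 0]
```

Only the entries of `traces` at powers 3 and 6 can be nonzero here; since the power sums determine the eigenvalues, this is what forces the spectrum to be invariant under multiplication by cube roots of unity.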
Therefore, the braid generators σi act via conjugation as invertible matrices in End(V1 )⊗ End(V2 ), i.e. they preserve the tensor decomposition. This is impossible by the following eigenvalue analysis. Consider a braid generator σi , its image ρ(σi ) is a tensor product of two matrices each of sizes at least 2. Since ρ(σi ) has only two eigenvalues, neither factor matrix can have 3 or more eigenvalues. If both factor matrices have two eigenvalues, the fact that ρ(σi ) has 2 eigenvalues in all implies that the ratio of these two eigenvalues is ±1 which is forbidden. If one factor matrix is trivial, then ρ(σi ) acts trivially on this
factor. As all braid generators are conjugate to each other, so the whole group G will act trivially on this factor which implies that V is a reducible representation of G. This case cannot happen either, as V is an irreducible representation of G. Claim 3. The derived group, Der(H ) = [H, H ], of H is a semi-simple Lie group, and the further restriction of V to Der(H ) is still irreducible. Proof. By Claim 2, V |H is a faithful, irreducible representation, so H is a reductive Lie group [V, Theorem 3.16.3]. It follows that the derived group of H is semi-simple. It also follows that the derived group and the center of H generate H . By Schur’s lemma, the center act by scalars. So V |Der(H ) is still irreducible. Claim 4. Every outer automorphism of Der(H ) has order 1, 2, or 3. First we recall a simple fact in representation theory. If V is an irreducible representation of a product group G1 × G2 , then V splits as an outer tensor product of irreducible representations of Gi , i = 1, 2. The restriction of V to G1 has only one isotypic component, and the restriction of V to G2 lies in the centralizer of the image of G1 . So the representation splits. Proof. It suffices to prove the same statement for the universal covering Der uc (H ) of Der(H ), as the automorphism group of Der(H ) is a subgroup of the automorphism group of Der uc (H ). For the 5-dimensional case: as 5 is a prime, Der uc (H ) is a simple group. Any outer automorphism of a simple Lie group is of order 1, 2, or 3. This follows from the fact that any outer automorphism of a simple Lie group is an outer automorphism of its Dynkin diagram together with the A-G classification of Dynkin diagrams [V]. For the 8-dimensional case, if Der uc (H ) is a simple group, it can be handled as above, so we need only to consider the split cases. 
If $\mathrm{Der}^{uc}(H)$ splits into two simple factors, then one factor must be SU(2): of all simply connected simple Lie groups, only SU(2) has a 2-dimensional irreducible representation. So the outer automorphism group is either $\mathbb{Z}_2$ when both factors are SU(2), or the same as the outer automorphism group of the other simple factor. Our claim holds. If there are three simple factors, they must all be SU(2). The outer automorphism group is the permutation group on three letters $S_3$. Again our claim is true.
Claim 5. For each braid generator $\sigma_i$, we can choose a corresponding element $\tilde{\sigma}_i$ lying in the derived group $\mathrm{Der}(H)$ which also has exactly two eigenvalues, whose ratio is not $\pm 1$. The multiplicity of each eigenvalue of $\tilde{\sigma}_i$ is the same as that of $\sigma_i$. (The choice of $\tilde{\sigma}_i$ is not unique, but its two eigenvalues have ratio $q$.)
Proof. Since $\mathrm{Der}(H)$ is still a normal subgroup of $G$, the braid generators $\sigma_i$ normalize $\mathrm{Der}(H)$, so they determine outer automorphisms of $\mathrm{Der}(H)$. By Claim 4, an outer automorphism of $\mathrm{Der}(H)$ is of order 1, 2, or 3. Hence $\sigma_i^6$ acts as an inner automorphism of $\mathrm{Der}(H)$. By Schur’s lemma, each $\sigma_i^6$ is the product of an element in $\mathrm{Der}(H)$ with a scalar, though the decomposition is not unique. Fix a choice for an element $\tilde{\sigma}_i$ in $\mathrm{Der}(H)$. Then it has exactly two desired eigenvalues.
To complete the proof of Theorem 4.1, we summarize our situation: we have a nontrivial semi-simple group $\mathrm{Der}^{uc}(H)$ with an irreducible unitary representation. Furthermore, it has a special element $x$ whose image under the representation has exactly two distinct eigenvalues whose ratio is not $\pm 1$.
For the 5-dimensional case, $\mathrm{Der}^{uc}(H)$ is a simple Lie group. Going through the list [MP] of pairs $(G, \omega)$, where $G$ is a simply connected Lie group and $\omega$ a dominant weight, the only possible 5-dimensional irreducible representations are as follows: rank = 1, $(SU(2), 4\omega_1)$; rank = 2, $(Sp(4), \omega_2)$ on p. 52 of [MP]; and rank = 4, $(SU(5), \omega_i)$, $i = 1, 4$, on p. 30. By examining the possible eigenvalues, we can exclude the first two cases as follows: for the first case, suppose $\alpha, \beta$ are the two eigenvalues of the above element $x$ in SU(2); then under the representation $4\omega_1$ the eigenvalues of the image of $x$ are $\alpha^i \beta^j$, $i + j = 4$, where $i$ and $j$ both are non-negative integers. The only possibility is two eigenvalues whose ratio is $\pm 1$. For the second case, since 5 is an odd number, any element in the image has a real eigenvalue. Other eigenvalues come in mutually reciprocal pairs. Again the only possibility is two eigenvalues whose ratio is $\pm 1$. Therefore, the only possible pair is the third case which gives $\mathrm{Der}^{uc}(H) = SU(5)$. As $V$ is a faithful representation of $\mathrm{Der}(H)$, the image of $\mathrm{Der}(H)$ is the same as that of $\mathrm{Der}^{uc}(H)$ which is SU(5).
The 8-dimensional case for $\rho_{[4,2]}$ is similar. By [MP], we see the possible pairs for simply connected simple groups are $(SU(2), 7\omega_1)$, $(SU(3), \omega_1 + \omega_2)$ on p. 26 of [MP], $(Spin(7), \omega_3)$ on p. 40, $(Sp(8), \omega_1)$ on p. 56, $(Spin(8), \omega_i)$, $i = 1, 3, 4$, on p. 66 and $(SU(8), \omega_i)$, $i = 1, 7$, on p. 36, where $\omega_i$ is the fundamental weight. The same eigenvalue analysis will exclude all but the $(SU(8), \omega_i)$ case. The proof follows the same pattern as above with the following novelties. Case 2 is the adjoint representation of SU(3): if the special element $x \in SU(3)$ has eigenvalues $\{\alpha, \beta, \gamma\}$, the image matrix of $x$ will have eigenvalue 1 with multiplicity 2 and all six pair-wise ratios of $\{\alpha, \beta, \gamma\}$, so they are $\pm 1$.
For Case 4, recall that if λ is an eigenvalue of a symplectic matrix, so is λ^(-1) with the same multiplicity; thus there are candidates for the special element x, but all such elements have the property that the multiplicity of both eigenvalues is 4. Notice that, by Theorem 3.1 (iv), the multiplicities of the two distinct eigenvalues in ρ(σ̃_i) are 3 and 5, respectively. Case 5 is done just as Case 4. This excludes all the unwanted simple groups. We also have to consider the product cases. For a product of two or three simple factors, the same analysis of eigenvalues as at the end of the proof of Claim 2 excludes them. Actually, there are only four cases here: SU(2) × SU(2), SU(2) × SU(4), SU(2) × Sp(4) and SU(2) × SU(2) × SU(2). This completes the proof of our density theorem.

Acknowledgement. We would like to thank Alexei Kitaev for conversations on our approach.
References
[AB] Aharonov, D. and Ben-Or, M.: Fault tolerant quantum computation with constant error. quant-ph/9906129
[AHHH] Alicki, R., Horodecki, M., Horodecki, P., Horodecki, R.: Dynamical description of quantum computing: Generic nonlocality of quantum noise. quant-ph/0105115
[B] Benioff, P.: The computer as a physical system: A microscopic quantum mechanical Hamiltonian model of computers as represented by Turing machines. J. Stat. Phys. 22(5), 563–591 (1980)
[CR] Curtis, C. and Reiner, I.: Representation theory of finite groups and associative algebras. Pure and Applied Math. Vol XI, New York: Interscience Publishers, 1962
[D] Deutsch, D.: Quantum computational networks. Proc. Roy. Soc. London A425, 73–90 (1989)
[Fey] Feynman, R.: Simulating physics with computers. Int. J. Theor. Phys. 21, 467–488 (1982)
[FKW] Freedman, M., Kitaev, A. and Wang, Z.: Simulation of topological field theories by quantum computers. quant-ph/0001071
[Fu] Funar, L.: On the TQFT representations of the mapping class groups. Pac. J. Math. 188, 251–274 (1999)
[G] Gelca, R.: Topological quantum field theory with corners based on the Kauffman bracket. Comment. Math. Helv. 72, 210–243 (1997)
[J1] Jones, V.F.R.: Hecke algebra representations of braid groups and link polynomials. Ann. Math. 126, 335–388 (1987)
[J2] Jones, V.F.R.: Braid groups, Hecke algebras and type II_1 factors. In: Geometric methods in operator algebras, Proc. of the US-Japan Seminar, Kyoto, July 1983
[KL] Kauffman, L. and Lins, S.: Temperley-Lieb recoupling theory and invariants of 3-manifolds. Ann. Math. Studies, Vol 134, Princeton, NJ: Princeton Univ. Press, 1994
[Ki1] Kitaev, A.: Fault-tolerant quantum computation by anyons. quant-ph/9707021, July (1997)
[Ki2] Kitaev, A.: Quantum computations: Algorithms and error correction. Russ. Math. Surv. 52(6), 1191–1249 (1997)
[KS] Karowski, M. and Schrader, R.: A combinatorial approach to topological quantum field theory and invariants of graphs. Commun. Math. Phys. 151, 355–402 (1992)
[KSVo] Karowski, M., Schrader, R. and Vogt, E.: Invariants of three manifolds, unitary representations of the mapping class groups and numerical calculations. Experiment. Math. 6, 312–352 (1997)
[KSV] Kitaev, A. Yu., Shen, A. and Vyalyi, M.: Classical and quantum computation. To be published by AMS, approx. 250 pages
[La] Lang, S.: Algebra, 2nd edition. Reading, MA: Addison–Wesley Publishing Company, 1984
[Ll] Lloyd, S.: Universal quantum simulators. Science 273, 1073–1078 (1996)
[M] Manin, Y.: Computable and uncomputable (in Russian). Moscow: Sovetskoye Radio, 1980
[MP] McKay, W. and Patera, J.: Tables of dimensions, indices, and branching rules for representations of simple Lie algebras. Lecture Notes in Pure and Applied Math. Vol 69, New York: Marcel Dekker, 1981
[NC] Nielsen, M. and Chuang, I.: Quantum computation and quantum information. Cambridge: Cambridge Univ. Press, 2000
[P] Preskill, J.: Fault tolerant quantum computation. quant-ph/9712048
[RT] Reshetikhin, N. and Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, no. 3, 547–597 (1991)
[T] Turaev, V.: Quantum invariants of knots and 3-manifolds. de Gruyter Studies in Math. Vol 18, 1994
[V] Varadarajan, V.S.: Lie groups, Lie algebras and their representations. Graduate Texts in Math. Vol. 102, Berlin–Heidelberg–New York: Springer-Verlag, 1984
[Wa] Walker, K.: On Witten's 3-manifold invariants. Preprint, 1991
[We] Wenzl, H.: Hecke algebras of type A_n and subfactors. Invent. Math. 92, 349–383 (1988)
[Wi] Witten, E.: Quantum field theory and the Jones polynomial. Comm. Math. Phys. 121, 351–399 (1989)
[Y] Yao, A.: Quantum circuit complexity. Proc. 34th Annual Symposium on Foundations of Computer Science, Los Alamitos, CA: IEEE Computer Society Press, pp. 352–361
Communicated by M. Aizenman