$n^*$. Thus the larger part of $\Omega$ is covered by the domains $D_n$, $n \le n^*$, which are determined by the inequalities
\[ D_1 = \{x : F(x) > 1\}, \qquad D_n = \{x : 1/(n-1)^{\alpha} > F(x) > 1/n^{\alpha}\} \quad \text{for } n > 1. \]
The number $n^*$ is chosen in such a way that the 'widths' of the domains $D_n$, $n \le n^*$, are much bigger than $1/\sqrt{\lambda}$. In fact the distance between $\tilde\Gamma_n$ and $\tilde\Gamma_{n-1}$ has order $O(n^{-(\alpha+1)})$ as $n \to \infty$. Thus this distance is greater than $C \ln^{2(\alpha+1)}\lambda/\sqrt{\lambda}$ if $n \le n^*$.
The remaining part $\Omega'$ of $\Omega$ is very close to the outer boundary of the domain $\Omega$. In fact, this part belongs to the (const. $\cdot\, L$)-neighborhood of $\partial\Omega_0$, where $L = (n^*)^{-\alpha} = O(\lambda^{-\alpha/(2(\alpha+1))} \ln^{2\alpha}\lambda)$. Let $x = x(t, x_0)$ be solution curves of the system
\[ \frac{dx}{dt} = \frac{\nabla F(x)}{|\nabla F(x)|^2}, \qquad x(0) = x_0 \in \partial\Omega_0. \tag{31} \]
MPAG003.tex; 17/08/1998; 14:53; p.13
S. MOLCHANOV AND B. VAINBERG
Then the domain $\Omega' = \Omega_0 \setminus \bigcup_{n \le n^*} D_n$ can be described by the inequalities $0 < t < (n^*)^{-\alpha}$, $x_0 \in \partial\Omega_0$, and the surfaces $\Gamma_n$, $n > n^*$, are given by the relations $t = n^{-\alpha}$, $x(t, x_0) \in \Omega_1$. We cut $\partial\Omega_0$ into small domains $u_s$ of 'size' $\delta$, and then split $\Omega'$ into domains $U_s$ for which $0 < t < (n^*)^{-\alpha}$, $x_0 \in u_s$. We specify $\delta$ and the domains $U_s$ below. We impose the Dirichlet boundary condition on all the new boundaries. Then from the mini-max principle it follows that
\[ N^-(\lambda) \ge \sum_{n \le n^*} N^-_{D_n}(\lambda) + N^-_{\Omega'}(\lambda), \qquad N^-_{\Omega'}(\lambda) \ge \sum_{s} N^-_{U_s}(\lambda). \tag{32} \]
We apply Theorem 1 in order to find $N^-_{D_n}(\lambda)$. Only the main terms of the asymptotics of $N^-_{D_n}(\lambda)$ contribute to the asymptotics of $N^-(\lambda)$. Since the asymptotic expansions given in Theorem 1 are uniform with respect to domains, we can estimate the sum of the remainders and show that this sum does not exceed the remainder term in the asymptotics of $N^-(\lambda)$. Indeed, from Theorem 1 it follows that there exist constants $C$ and $r$ independent of $n$ such that
\[ \bigl| N^-_{D_n}(\lambda) - (2\pi)^{-d} B_d |D_n| \lambda^{d/2} \bigr| \le C \lambda^{\frac{d-1}{2}} \ln\lambda, \qquad \lambda > r, \]
and therefore
\[ \Bigl| \sum_{n \le n^*} N^-_{D_n}(\lambda) - (2\pi)^{-d} B_d \lambda^{d/2} \sum_{n \le n^*} |D_n| \Bigr| \le C n^* \lambda^{\frac{d-1}{2}} \ln\lambda \le C \lambda^{\frac{d}{2} - \frac{\alpha}{2(\alpha+1)}} \ln^{-1}\lambda, \qquad \lambda > r. \tag{33} \]
In order to estimate the counting functions for the Dirichlet Laplacian in domains $U_s$ we use the new system of coordinates $(t, y)$ where $y = (y_1, \dots, y_{d-1})$ are local coordinates on $\partial\Omega_0$ and $t$ is the parameter along trajectories of the system (31). In the new coordinates the Laplacian becomes an elliptic operator with variable coefficients, but the domains $U_s$ have very simple geometrical shapes. We choose $\delta$ so small that we can fix the coefficients of the operator when studying the problem in each of these domains. Thus we reduce the problems in $U_s$ to problems for operators with constant coefficients, which can be solved by separation of variables. The first two terms of the asymptotics of the counting functions for these operators with constant coefficients contribute to the asymptotic expansion for $N^-(\lambda)$. On the other hand $\delta$ must not be too small, so that the measure of the new boundaries is not very large and the contribution from the new boundaries does not affect the main terms of the asymptotics of $N^-(\lambda)$. We choose $\delta = O(\lambda^{-\frac{1}{4(\alpha+1)}})$.
Now we specify the domains $u_s$ (and therefore, $U_s$). We introduce the local coordinates $y = (y_1, \dots, y_{d-1})$ on $\partial\Omega_0$ in a special way. We start with a 'triangulation' of the boundary $\partial\Omega_0$ of the domain $\Omega_0$, but we use a cube as a standard polyhedron instead of a simplex, i.e. we cut $\partial\Omega_0$ into a finite system of domains $Q_j \subset \partial\Omega_0$, $1 \le j \le m_0$, which are diffeomorphic to a cube $v$ of the unit size in 2
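\[ N^-(\lambda) \ge \sum_{n \le n^*} N^-_{D_n}(\lambda) + \sum_{s \le M_1} N^-_{U_s}(\lambda) + \sum_{M_1 < s \le M} N^-_{U_s}(\lambda), \qquad M = m_0 \delta^{1-d}. \tag{38} \]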
Now we show that the sum of $(d-1)$-dimensional measures of the domains $u_s$ with $M_2 < s \le M$ does not exceed $C\delta$:
\[ \sum_{M_2 < s \le M} |u_s| \le C\delta. \tag{39} \]
In particular, from here it follows that
\[ \Bigl| \bigcup_{s > M_2} u_s \setminus \Gamma \Bigr| \le C\delta, \tag{40} \]
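where $\Gamma = \partial\Omega_0 \cap \Omega_1$. To show (39) let us consider $\Gamma' = \partial\Omega_0 \cap \partial\Omega_1$ (the edge of $\Gamma$) and let $(\Gamma')_\varepsilon$ be the set of points $x$ on $\partial\Omega_0$ such that the distance $\rho(x, \Gamma') < \varepsilon$. Since $\partial\Omega_0$ and $\partial\Omega_1$ are transversal there exists a constant $A$ such that the trajectories of the problem (37) emitted from $\partial\Omega_0 \setminus (\Gamma')_\varepsilon$ with $\varepsilon = A(n^*)^{-\alpha}$ do not intersect $\partial\Omega_1$ when $0 \le t \le (n^*)^{-\alpha}$. Thus domains $u_s$ with $M_2 < s \le M$ have nonempty intersections with $(\Gamma')_\varepsilon$, $\varepsilon = A(n^*)^{-\alpha}$. If we also take into account that the domains $u_s$ are small (they are the images of cubes $v_s$ under the action of diffeomorphisms (34)) we get that $u_s$ with $M_2 < s \le M$ belong to $(\Gamma')_\varepsilon$ with $\varepsilon = A(n^*)^{-\alpha} + C\delta$. Since $\delta > (n^*)^{-\alpha}$ for $\lambda$ large enough (see (36) and (30)) we obtain (39).
Now we study the counting function $N^-_{U_s}(\lambda)$ for the Laplacian in small domains $U_s$ where local coordinates can be used. The first step is to write the Laplacian in $(t, y)$ coordinates. Let
\[ g_{i,j} = g_{i,j}(t, y) = \langle x_{y_i}, x_{y_j} \rangle = \sum_k (x_k)_{y_i} (x_k)_{y_j} \tag{41} \]
and let $[g^{i,j}] = [g^{i,j}(t, y)] = [g_{i,j}(t, y)]^{-1}$ be the inverse matrix. Let
\[ J = J(t, y) = \frac{1}{|\nabla F|} \sqrt{\det[g_{i,j}]}. \tag{42} \]
Then
\[ \Delta = P(t, y, \partial_t, \partial_y) = \frac{1}{J} \Bigl[ \frac{\partial}{\partial t} \Bigl( J |\nabla F|^2 \frac{\partial}{\partial t} \Bigr) + \frac{\partial}{\partial y_i} \Bigl( J g^{i,j} \frac{\partial}{\partial y_j} \Bigr) \Bigr]. \tag{43} \]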
The important feature of this formula is the absence of the mixed derivatives $\frac{\partial^2}{\partial t\,\partial y_i}$ in the right-hand side. This formula can be found in many books, but for the sake of completeness we shall prove it here. Let $z = (t, y)$ and let $A$ be the Jacobi matrix $A = [x_z] = \begin{pmatrix} x_t \\ x_y \end{pmatrix}$. Then $dx = A^* dz$ and $|dx|^2 = \langle A^* dz, A^* dz \rangle = \langle A A^* dz, dz \rangle$. Similarly $\nabla_x = A^{-1} \nabla_z$ and
\[ \Delta = (\nabla_x)^2 = \langle A^{-1} \nabla_z, A^{-1} \nabla_z \rangle = \langle \nabla_z, (A A^*)^{-1} \nabla_z \rangle + Q, \tag{44} \]
where $Q$ is an operator containing only the first order derivatives. Since $\Delta$ is a symmetric operator, it is a symmetric operator in the new coordinates with the dot product $\langle u, v \rangle = \int u v\, |\det A|\, dz$. Together with (44), this leads immediately to
\[ \Delta = \frac{1}{\det A} \langle \nabla_z, (\det A)(A A^*)^{-1} \nabla_z \rangle. \tag{45} \]
ON SPECTRAL ASYMPTOTICS FOR DOMAINS WITH FRACTAL BOUNDARIES
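Since $x_t$ and $x_y$ are orthogonal we have
\[ A A^* = \begin{pmatrix} |x_t|^2 & 0 \\ 0 & g_{i,j} \end{pmatrix} = \begin{pmatrix} |\nabla F|^{-2} & 0 \\ 0 & g_{i,j} \end{pmatrix}. \tag{46} \]
From here it follows that $\det A = J$. Together with (45) and (46) it proves (43). Thus
\[ N^-_{U_s}(\lambda) = N^-_{V_s}(\lambda), \tag{47} \]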
where $N^-_{V_s}(\lambda)$ is the counting function for the Dirichlet problem in $V_s$ for the operator (43). We would like to compare the eigenvalues of the Dirichlet problem in $V_s$ for operators $P(t, y, \partial_t, \partial_y)$ and $P(0, y_0, \partial_t, \partial_y)$, where $y_0$ is the center of the cube $v_s$ which is the base for $V_s$. We will use the following simple consequence of the mini-max principle. Let
\[ P_i = \frac{1}{b_i(z)} \langle \nabla_z, B_i(z) \nabla_z \rangle, \qquad z = (t, y) \in V, \quad i = 1, 2, \tag{48} \]
be two elliptic operators in a bounded domain $V$ such that $0 < b_1(z) \le b_2(z)$, the matrices $B_1(z)$, $B_2(z)$ are symmetric and positive, $B_1(z) \ge B_2(z)$. Let $\lambda_{i,1} < \lambda_{i,2} \le \lambda_{i,3} \le \cdots$ be eigenvalues of the Dirichlet problem for operators $P_i$ in $V$. Then $\lambda_{1,j} \ge \lambda_{2,j}$, $j = 1, 2, 3, \dots$. This assertion follows immediately from the fact that
\[ \lambda_{i,j} = \inf_{H_j} \sup_{f \in H_j} \frac{\int_V \langle B_i(z) \nabla f, \nabla f \rangle\, dz}{\int_V |f|^2 b_i(z)\, dz}, \]
where $H_j$ are $j$-dimensional subspaces of the space $H^{0,1}$. Here $f \in H^{0,1}$ if $f$ belongs to the Sobolev space $H^1(V)$ and $f = 0$ on $\partial V$. Thus if $N_i^-(\lambda)$ are the counting functions for the Dirichlet problem for operators $P_i$ in $V$, then
\[ N_1^-(\lambda) \le N_2^-(\lambda). \tag{49} \]
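Let us write operator $P$ (see (43)) in the form (48):
\[ P = \frac{1}{J(t, y)} \langle \nabla_z, B(t, y) \nabla_z \rangle, \qquad B = J \begin{pmatrix} |\nabla F|^2 & 0 \\ 0 & g^{i,j} \end{pmatrix}, \qquad z = (t, y) \in V_s, \]
where $0 < t < (n^*)^{-\alpha}$ and $|y - y_0| \le \delta$ for $z \in V_s$. From (30), (36) it follows that $\delta > (n^*)^{-\alpha}$, and therefore there is a constant $c$ such that
\[ J(t, y) \ge J(0, y_0)(1 - c\delta), \qquad B(t, y) \le B(0, y_0)(1 - c\delta)^{-1} \quad \text{as } (t, y) \in V_s. \tag{50} \]
This leads to (49) for the operators $P_1 = (1 - c\delta)^{-2} P(0, y_0, \partial_t, \partial_y)$, $P_2 = P(t, y, \partial_t, \partial_y)$. Thus
\[ N^-_{V_s}(\lambda) \ge N_s^-\bigl((1 - c\delta)^2 \lambda\bigr), \qquad \lambda > 1, \tag{51} \]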
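where $N_s^-(\lambda)$ is the counting function for the Dirichlet problem for the operator $P(0, y_0, \partial_t, \partial_y)$ in $V_s$. Together with (47) and (36) this gives
\[ N^-_{V_s}(\lambda) \ge N_s^-\Bigl( \bigl(1 - c\lambda^{-\frac{1}{4(\alpha+1)}}\bigr)^2 \lambda \Bigr), \qquad \lambda > 1. \tag{52} \]
Now we are going to study the counting function $N_s^-(\lambda)$ for the operator
\[ P(0, y_0, \partial_t, \partial_y) = (|\nabla F|^2)(0, y_0) \frac{\partial^2}{(\partial t)^2} + g^{i,j}(0, y_0) \frac{\partial^2}{\partial y_i\, \partial y_j} \tag{53} \]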
in $V_s$. First we consider the more complicated case when $s > M_1$. The variables $t$ and $y$ in the Dirichlet problem for operator (53) in the domain $V_s$ can be separated, i.e., the eigenvalues have the form $\tau_k + \nu_l$ where $\tau_k$ are the eigenvalues of the corresponding one-dimensional (Sturm–Liouville) problem for the operator $\sigma^2 \frac{d^2}{dt^2}$, $\sigma = |\nabla F|(0, y_0)$, and $\nu_l$ are the eigenvalues of the Dirichlet problem for the operator $\frac{\partial}{\partial y_i} g^{i,j}(0, y_0) \frac{\partial}{\partial y_j}$ in the cube $v_s$. Thus $N_s^-(\lambda)$ is the convolution
\[ N_s^-(\lambda) = \int_0^\lambda N^1(\lambda - \tau)\, dN^2(\tau) = \int_0^\lambda N^2(\lambda - \tau)\, dN^1(\tau) \tag{54} \]
of the counting functions (we denote them by $N^1(\lambda)$ and $N^2(\lambda)$, respectively) for the corresponding one-dimensional problem and the problem in the cube $v_s$. Let us specify the one-dimensional problem. Domain $V_s$ is sliced into thinner domains by the cuts $t = n^{-\alpha}$, $n > n^*$. Thus $N^1(\lambda)$ is the eigenvalue counting function for the problem
\[ \sigma^2 \frac{d^2 u}{dt^2} = \lambda u, \quad 0 < t < (n^*)^{-\alpha}, \quad t \ne n^{-\alpha}; \qquad u(0) = u(n^{-\alpha}) = 0, \quad n > n^*. \tag{55} \]
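As a concrete reading of problem (55), the following Python sketch counts eigenvalues interval by interval. It assumes the sign convention $-\sigma^2 u'' = \lambda u$ with Dirichlet conditions at every cut, so that an interval of length $\ell$ contributes the eigenvalues $(\sigma \pi k/\ell)^2$, $k = 1, 2, \dots$; the function name `count_problem_55` and its parameters are illustrative and do not come from the paper.

```python
import math


def count_problem_55(lam, sigma, alpha, n_star):
    """Count eigenvalues below lam on the intervals ((n+1)^-alpha, n^-alpha), n >= n_star.

    Assumption (not stated in this form in the text): on an interval of length
    ell the Dirichlet eigenvalues are (sigma * pi * k / ell)**2, so about
    floor(ell * sqrt(lam) / (pi * sigma)) of them lie below lam.
    """
    x = math.sqrt(lam) / (math.pi * sigma)
    total = 0
    n = n_star
    while True:
        ell = n ** -alpha - (n + 1) ** -alpha  # interval lengths decrease with n
        if ell * x < 1.0:
            break  # this interval and all shorter ones contribute nothing
        total += math.floor(ell * x)
        n += 1
    return total


if __name__ == "__main__":
    lam = (6.5 * math.pi) ** 2
    print(count_problem_55(lam, sigma=1.0, alpha=1.0, n_star=1))
```

Only finitely many intervals are long enough to carry an eigenvalue below a given $\lambda$, which is why the loop terminates; the short intervals that drop out are the source of the second, fractal term in the asymptotics.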
Problem (55) is the Sturm–Liouville problem on the set of the intervals $((n+1)^{-\alpha}, n^{-\alpha})$, $n > n^*$. Two terms of the asymptotic expansion of $N^1(\lambda)$ are found by Lapidus [9] in the case when $n^*$ does not depend on $\lambda$. The expansion is expressed through the Minkowski measure of the sequence of the points $n^{-\alpha}$, $n > n^*$, and it has the following form
\[ N^1(\lambda) = (\pi\sigma)^{-1} L \sqrt{\lambda} + \pi^{-\mu} \zeta(\mu) (\alpha/\sigma)^{\mu} \lambda^{\mu/2} + o(\lambda^{\mu/2}), \qquad \lambda \to \infty. \tag{56} \]
Here $L = (n^*)^{-\alpha}$ is the length of the set of the intervals in (55), $\mu = 1/(\alpha+1)$ is the Minkowski dimension of the end points of the intervals, $\zeta(\cdot)$ is the Riemann zeta-function, and the coefficient of $\lambda^{\mu/2}$ can be expressed through the Minkowski content of the sequence $\{n^{-\alpha}\}$. Since $0 < \mu < 1$ the value of the function $\zeta(\tau) = \sum_{j=1}^{\infty} j^{-\tau}$, $\operatorname{Re} \tau > 1$, at the point $\tau = \mu$ can be written in the form
\[ \zeta(\mu) = \frac{1}{\mu - 1} + \int_1^\infty \bigl( [x]^{-\mu} - x^{-\mu} \bigr)\, dx, \tag{57} \]
where $[x]$ is the integer part of $x$. One may verify that the Lapidus result and its proof remain valid if $n^* = n^*(\lambda)$ tends to infinity sufficiently slowly. In particular, formula (56) is valid when $n^*$ is given by (30). The standard Weyl formula is valid for the eigenvalue counting function $N^2(\lambda)$ for the operator $Q = g^{i,j}(0, y_0) \frac{\partial^2}{\partial y_i\, \partial y_j}$ in $v_s$:
\[ N^2(\lambda) = \frac{(2\pi)^{1-d} B_{d-1} |v_s|}{\sqrt{\det[g^{i,j}(0, y_0)]}}\, \lambda^{\frac{d-1}{2}} + O\bigl( \lambda^{\frac{d-2}{2}} \bigr) \quad \text{as } \lambda \to \infty, \tag{58} \]
where Bd−1 is the volume of the unit ball in |∂vs |, i,j i,j det[g (0, y0 )] det[g (0, y0 )] ρ ρ > ρ0 > √ , √ λmax λmin where λmax , λmin are the maximal and the minimal eigenvalues of [g i,j (0, y0 )]. In particular, from here it follows that (∂vs0 )ε (ε-neighborhood of ∂vs0 ) is contained in the image of (∂vs )ε/√λmin . Estimate (5) with r0 = 1 is valid for the cube v of the unit A = A0 . size, and therefore it is valid for vs with r0 = δ and with the same constant√ Then from the properties of the transformation T , it follows that for ε 6 λmin δ (∂vs )ε/√λ Aε|(∂vs )| min 0 6p |(∂vs )ε | 6 T (∂vs )ε/√λmin 6 p i,j det[g (0, y0 )] λmin det[g i,j ] Aε |∂v 0 |, 6 λmin s
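i.e. (5) is valid for $v_s'$ with $r_0 = \sqrt{\lambda_{\min}}\,\delta$ and $A = A_0/\lambda_{\min}$. It allows us to apply Theorem 1 to the Dirichlet Laplacian in $v_s'$. Taking into account the existence of constants $a$, $b$ independent of $y_0$ such that $b \ge \lambda_{\max} \ge \lambda_{\min} \ge a > 0$ we get (59), where $\lambda_0$ can be found from the inequality $\sqrt{\lambda} > 1/(\sqrt{a}\,\delta)$ (see (6)) and (36). Due to (36) we can replace $\ln(\delta\sqrt{\lambda})$ by $\ln\lambda$ in the right-hand side of (59). Using also the relations $|v_s| = \delta^{d-1}$, $|\partial v_s| = 2(d-1)\delta^{d-2}$ we get
\[ N^2(\lambda) = \frac{(2\pi)^{1-d} B_{d-1} |v_s|}{\sqrt{\det[g^{i,j}(0, y_0)]}}\, \lambda^{\frac{d-1}{2}} + n(\lambda); \qquad |n(\lambda)| \le C \delta^{d-2} \lambda^{\frac{d-2}{2}} \ln\lambda, \quad \lambda > \lambda_0, \tag{60} \]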
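where $C$ and $\lambda_0$ do not depend on $s$, and $B_{d-1}$ is the volume of the unit ball in M1 and
\[ \mu = \frac{1}{\alpha+1}, \qquad \sigma = |\nabla F|(0, y_0), \qquad A_{s,k} = \frac{k\, B\bigl(\frac{d+1}{2}, \frac{k}{2}\bigr) B_{d-1} |v_s|}{2 (2\pi)^{d-1} \sqrt{\det[g^{i,j}(0, y_0)]}}. \]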
It is important that here and in all formulae below the estimates of the remainders $O(\cdot)$ and $o(\cdot)$ are uniform with respect to $s$. We are going to express the coefficients in (64) through the volume $|U_s|$ of the domain $U_s \subset \Omega'$ and the $(d-1)$-dimensional measure of its base $u_s \subset \partial\Omega_0$. Recall that the Jacobian $\det A$, where $A = \bigl[ \frac{\partial x}{\partial (t, y)} \bigr]$, is equal to the function (42) (see (46)), and one can replace $\det[g_{i,j}]$ by $(\det[g^{i,j}])^{-1}$ in (42). Taking also into account that the diameter of the domain $U_s$ and the diameter of its image $V_s$ in $(t, y)$ coordinates do not exceed $C\delta$ (see the arguments used for (50)) we get
\[ \det A = \frac{1}{\sigma \sqrt{\det[g^{i,j}(0, y_0)]}} + O(\delta) \quad \text{for } (t, y) \in V_s, \]
and therefore
\[ |U_s| = \frac{|V_s|}{\sigma \sqrt{\det[g^{i,j}(0, y_0)]}} + O(\delta |V_s|). \tag{65} \]
Since $L = (n^*)^{-\alpha}$ (see (56)) is the height of the domain $V_s$ and $v_s$ is its base we can replace $|V_s|$ in (65) by $L|v_s|$ and then specify the remainder with the help of (36), the relation $|v_s| = \delta^{d-1}$ and (30). Thus $O(\delta |V_s|) = O(\delta^d L) = \delta^{d-1} o\bigl(\lambda^{-\frac{\alpha}{2(\alpha+1)}}\bigr)$, and from (65) it follows that
\[ \frac{L |v_s|}{\sigma \sqrt{\det[g^{i,j}(0, y_0)]}} = |U_s| + \delta^{d-1} o\bigl( \lambda^{\frac{\mu-1}{2}} \bigr), \qquad \lambda \to \infty. \tag{66} \]
Now if we also take into account that $B\bigl(\frac{d+1}{2}, \frac{1}{2}\bigr) B_{d-1} = B_d$ we can rewrite the first term in the right-hand side of (64) in the form
\[ (2\pi)^{-d} B_d |U_s| \lambda^{\frac{d}{2}} + \delta^{d-1} o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty. \]
In order to simplify the coefficient of the second term in the right-hand side of (64) we have to note first of all that from (37) it follows that $dS = (\det A) |\nabla F|\, dy$, where $dS$ is the $(d-1)$-dimensional measure of an element of the surface $\partial\Omega_0$. Thus
\[ \int_{u_s} \frac{1}{|\nabla F(0, y)|^{\mu}}\, dS = \int_{v_s} \frac{1}{|\nabla F(0, y)|^{\mu} \sqrt{\det[g^{i,j}(0, y)]}}\, dy. \]
From here similarly to (66) it follows that
\[ \frac{|v_s|}{\sigma^{\mu} \sqrt{\det[g^{i,j}(0, y_0)]}} = \int_{u_s} \frac{1}{|\nabla F(0, y)|^{\mu}}\, dS + \delta^{d-1} o(1), \qquad \lambda \to \infty. \]
We use this relation to simplify the second term in the right-hand side of (64). Since the last two terms in (64) can be written in the form $\delta^{d-1} o\bigl(\lambda^{\frac{d-1+\mu}{2}}\bigr)$, formula (64) implies
\[ N_s^-(\lambda) = (2\pi)^{-d} B_d |U_s| \lambda^{\frac{d}{2}} - a_s(d, \mu) \lambda^{\frac{d-1+\mu}{2}} + \delta^{d-1} o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty, \tag{67} \]
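where
\[ a_s(d, \mu) = - \frac{\mu\, B\bigl(\frac{d+1}{2}, \frac{\mu}{2}\bigr) B_{d-1}}{2^d \pi^{d-1+\mu}}\, \zeta(\mu)\, \alpha^{\mu} \int_{u_s} \frac{1}{|\nabla F(0, y)|^{\mu}}\, dS > 0 \]
(the constant is positive because $\zeta(\mu) < 0$). Now let us take the sum of equalities (67) with respect to all $s \in (M_1, M]$. Since the decay of the remainders $o(\cdot)$ in (67) is uniform with respect to $s$ and $M = m_0 \delta^{1-d}$ (see (38)), the sum of the remainders has the order $o\bigl(\lambda^{\frac{d-1+\mu}{2}}\bigr)$. From (40) it follows that
\[ \sum_{s > M_1} \int_{u_s} \frac{1}{|\nabla F(0, y)|^{\mu}}\, dS = \int_{\Gamma} \frac{1}{|\nabla F(0, y)|^{\mu}}\, dS + O(\delta). \]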
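The rewriting of the first term of (64) relies on the identity $B\bigl(\frac{d+1}{2}, \frac{1}{2}\bigr) B_{d-1} = B_d$ between the Beta function and the volumes of unit balls. A minimal numerical check in Python is sketched below; it assumes the standard closed forms $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ and $B_d = \pi^{d/2}/\Gamma(d/2 + 1)$, and the helper names are illustrative only.

```python
import math


def beta(a, b):
    """Euler Beta function via the Gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def unit_ball_volume(d):
    """Volume B_d of the unit ball in R^d (B_0 = 1 by convention)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def identity_holds(d, rel_tol=1e-12):
    """Check B((d+1)/2, 1/2) * B_{d-1} == B_d up to rounding."""
    lhs = beta((d + 1) / 2, 0.5) * unit_ball_volume(d - 1)
    return math.isclose(lhs, unit_ball_volume(d), rel_tol=rel_tol)


if __name__ == "__main__":
    for d in range(1, 9):
        print(d, identity_holds(d))
```

The identity is just the slicing formula $B_d = B_{d-1} \int_{-1}^{1} (1 - t^2)^{(d-1)/2}\, dt$ written with the Beta function.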
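Thus
\[ \sum_{s > M_1} N_s^-(\lambda) = (2\pi)^{-d} B_d \Bigl( \sum_{s > M_1} |U_s| \Bigr) \lambda^{\frac{d}{2}} - a(d, \mu) \lambda^{\frac{d-1+\mu}{2}} + o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty, \tag{68} \]
where
\[ a(d, \mu) = - \frac{\mu\, B\bigl(\frac{d+1}{2}, \frac{\mu}{2}\bigr) B_{d-1}}{2^d \pi^{d-1+\mu}}\, \zeta(\mu)\, \alpha^{\mu} \int_{\Gamma} \frac{1}{|\nabla F|^{\mu}}\, dS > 0. \tag{69} \]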
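The positivity of $a(d, \mu)$ in (69) rests on $\zeta(\mu) < 0$ for $0 < \mu < 1$, with $\zeta(\mu)$ given by (57). The following Python sketch evaluates (57) numerically: the integral is computed exactly on each interval $[n, n+1]$ and the remaining tail is approximated by the first two Euler–Maclaurin terms. The function name, the cut-off `n_terms` and the tail correction are choices made for this illustration, not part of the paper.

```python
def zeta_from_57(mu, n_terms=10000):
    """Evaluate zeta(mu), 0 < mu < 1, from the integral representation (57)."""
    if not 0.0 < mu < 1.0:
        raise ValueError("formula (57) is used here only for 0 < mu < 1")
    total = 1.0 / (mu - 1.0)
    for n in range(1, n_terms + 1):
        # Exact integral of [x]^-mu - x^-mu over [n, n + 1].
        total += n ** -mu - ((n + 1) ** (1.0 - mu) - n ** (1.0 - mu)) / (1.0 - mu)
    m = n_terms + 1
    # Euler-Maclaurin estimate of the integral over [m, infinity).
    total += 0.5 * m ** -mu + mu / 12.0 * m ** (-mu - 1.0)
    return total


if __name__ == "__main__":
    for mu in (0.25, 0.5, 0.75):
        print(mu, zeta_from_57(mu))
```

For $\mu = 1/2$, that is $\alpha = 1$, the result can be compared with the tabulated value $\zeta(1/2) \approx -1.4603545$.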
The asymptotic expansion for $N_s^-(\lambda)$ with $s \le M_1$ can be obtained similarly to (67). The only difference is that the one-dimensional Sturm–Liouville problem is now much simpler. It is the problem on the interval $(0, L)$, not on the system of intervals as in (55). The eigenvalue counting function $N^1(\lambda)$ for this problem can be found immediately, and it has the form (56), but without the middle term in the right-hand side (in fact, the remainder also can be specified). It leads to the analog of (67), but without the middle term in the right-hand side. Correspondingly (68) holds if the limits of the summations are changed to $s \le M_1$ and the middle term in the right-hand side is omitted. Together with (68) it gives the following result:
\[ \sum_{s \le M} N_s^-(\lambda) = (2\pi)^{-d} B_d |\Omega'| \lambda^{\frac{d}{2}} - a(d, \mu) \lambda^{\frac{d-1+\mu}{2}} + o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty. \tag{70} \]
Let us note that $|\Omega'|$ has the order $O((n^*)^{-\alpha}) = O\bigl(\lambda^{-\frac{\alpha}{2(\alpha+1)}} \ln^{2\alpha}\lambda\bigr)$ which is the 'thickness' of the domain $\Omega'$. Thus $\sum_{s \le M} N_s^-\bigl((1 - c\lambda^{-\frac{1}{4(\alpha+1)}})^2 \lambda\bigr)$ also has the form (70). Together with (52) this implies
\[ \sum_{s \le M} N^-_{V_s}(\lambda) \ge (2\pi)^{-d} B_d |\Omega'| \lambda^{\frac{d}{2}} - a(d, \mu) \lambda^{\frac{d-1+\mu}{2}} + o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty. \tag{71} \]
Together with (32) and (33) this gives the estimate for $N^-(\lambda)$ from below:
\[ N^-(\lambda) \ge (2\pi)^{-d} B_d |\Omega| \lambda^{\frac{d}{2}} - a(d, \mu) \lambda^{\frac{d-1+\mu}{2}} + o\bigl( \lambda^{\frac{d-1+\mu}{2}} \bigr), \qquad \lambda \to \infty. \tag{72} \]
Here $\mu = \frac{1}{1+\alpha}$ (see (56)), and $a(d, \mu)$ is given by (69) and (57). It is obvious that Theorem 7 will be proved if we get the same estimate for $N^-(\lambda)$ from above. As we mentioned in the beginning of the proof of Theorem 7, the estimate of $N^-(\lambda)$ from above can be proved in the same way as (72). Now we describe the changes which we need to make to get the estimate from above. First of all we impose the Neumann instead of the Dirichlet boundary condition on $\tilde\Gamma_n = \{x : F(x) = n^{-\alpha}\}$, $n \le n^*$. Then from the mini-max principle we have the following inequality instead of (32):
\[ N^-(\lambda) \le \sum_{n \le n^*} N^+_{D_n}(\lambda) + N_{\Omega'}(\lambda), \]
where $N_{\Omega'}(\lambda)$ is the counting function of the Laplacian in $\Omega'$ with the Dirichlet boundary condition on $\tilde\Gamma_n$, $n < n^*$, and the Neumann boundary condition on $\tilde\Gamma_{n^*}$. Then similarly to (33) (using the second assertion of Theorem 1 instead of the first one), we have
\[ \Bigl| \sum_{n \le n^*} N^+_{D_n}(\lambda) - (2\pi)^{-d} B_d \lambda^{d/2} \sum_{n \le n^*} |D_n| \Bigr| \le C \lambda^{\frac{d}{2} - \frac{\alpha}{2(\alpha+1)}} \ln^{-1}\lambda, \qquad \lambda > r. \]
In order to get an estimate for $N_{\Omega'}(\lambda)$ from above we could try to split $\Omega'$ into the set of domains $\{U_s\}$ which was used earlier and impose the Neumann boundary condition instead of the Dirichlet condition on all additional boundaries. However, this approach will not work because the boundaries of the bases $u_s$ of the domains $U_s$ are not smooth, and we will not be able to use Theorem 1 to get an analog of (64) in the case of the Neumann boundary conditions. Thus we use a covering of $\Omega'$ by domains $\tilde U_s$ with bases $\tilde u_s$ instead of the splitting of $\Omega'$. It is not difficult to construct a family of neighborhoods $v^h$, $0 < h < 1$, of the cube v ∈ n∗ (it is the case when V the domains of the third type are the cylinders in which some of the cuts are not complete, i.e. the cuts exist not for all values of $y \in \tilde u_s$ (if $\tilde V_s \cap \partial\Omega_1 \ne \emptyset$). We delete all the cuts in the third type of domains $\tilde V_s$, so they will have the same form as the domains of the first type (earlier we continued these cuts to get the estimate of $N^-(\lambda)$ from below). Then we get the analog of (38):
\[ N^-(\lambda) \le \sum_{n \le n^*} N^+_{D_n}(\lambda) + \sum_{s \le m_1} N_{\tilde U_s}(\lambda) + \sum_{m_1 < s \le M} N_{\tilde U_s}(\lambda), \]
where the middle term in the right-hand side corresponds to domains $\tilde U_s$ of the first and third type and the last term corresponds to domains $\tilde U_s$ of the second type (which are sliced by the cuts). After the change of the variables $x \to (t, y)$ we get the analog of (47):
\[ N_{\tilde U_s}(\lambda) = N_{\tilde V_s}(\lambda), \]
where $N_{\tilde V_s}(\lambda)$ is the counting function for the operator (43) in $\tilde V_s$ with the Dirichlet boundary condition on that part of the boundary where we had this condition in $x$ coordinates, and with the Neumann type boundary condition of the form $\langle \nabla_z, (A A^*)^{-1} \nu \rangle = 0$ on the other part of the boundary (where we had the usual Neumann boundary condition in $x$ coordinates). Here $\nu$ is the normal to the boundary of $\tilde V_s$.
Inequality (49) for the counting function holds for operators (48) when the Dirichlet boundary condition is imposed on some part of the boundary and the Neumann type boundary condition $\langle \nabla_z, B_i^{-1} \nu \rangle = 0$ holds on the remaining part of the boundary. Thus similarly to (52) we have:
\[ N_{\tilde V_s}(\lambda) \le N_s\Bigl( \bigl(1 + c\lambda^{-\frac{1}{4(\alpha+1)}}\bigr)^2 \lambda \Bigr), \qquad \lambda > 1, \]
where $N_s(\lambda)$ is the counting function for the operator (53) in $\tilde V_s$ with the Neumann type boundary condition of the form $\langle \nabla_z, [(A A^*)^{-1}(0, y_0)] \nu \rangle = 0$ on the lateral side and on the top of $\tilde V_s$ and with the Dirichlet boundary condition on the remaining part of $\partial \tilde V_s$. As earlier the variables $t$ and $y$ can be separated when we study $N_s(\lambda)$, and it leads to an analog of (54), (56) and (59):
\[ N_s(\lambda) = \int_0^\lambda N^1(\lambda - \tau)\, dN^2(\tau). \]
Here $N^1(\lambda)$ has the form (56) if $s > m_1$ or the same form without the middle term in the right-hand side if $s \le m_1$, and estimate (59) with $|\tilde v_s|$ and $|\partial \tilde v_s|$ instead of $|v_s|$ and $|\partial v_s|$ is valid for $N^2(\lambda)$. To get the expansion (56) for $N^1(\lambda)$, we have to note only that the change of the boundary condition (from the Dirichlet to the Neumann) in (55) at one point $t = (n^*)^{-\alpha}$ does not affect the main terms of the expansion (56). To get the estimate for $N^2(\lambda)$ we use the same linear transformation $T$ :

2. Acting by an isomorphism $u_1 \in \mathbb{F}^*_{q^{d(k_1, r)}}$ which keeps $\Delta(D_1)$ and the $k_1$-coefficient of $D_1$ invariant we obtain $\xi_{k_1}(D_2, E) = \xi_{k_2}(D_2, E) = 1$ for $D_2 = u_1^{-1} D_1 u_1$. Suppose that $m > 2$ and
\[ \xi_{k_1}(D_{m-1}) = \cdots = \xi_{k_{m-1}}(D_{m-1}) = 1 \]
MPAG012.tex; 4/09/1998; 11:15; p.5
IGOR YU. POTEMINE
for some Drinfeld module $D_{m-1} \cong D$ over $L$. We have
\[ \xi_{k_m}(D_{m-1}, E)^{(q^{d(k_1, \dots, k_{m-1}, r)} - 1)/(q^{d(k_1, \dots, k_m, r)} - 1)} = 1. \]
Acting by an isomorphism $u_m \in \mathbb{F}^*_{q^{d(k_1, \dots, k_{m-1}, r)}}$ which keeps $a_{k_1}, \dots, a_{k_{m-1}}$ and $\Delta(D_{m-1})$ invariant we obtain that $\xi_{k_m}(D_m, E) = 1$ for $D_m = u_m^{-1} D_{m-1} u_m$. Finally, we can find a Drinfeld module $D_l \cong D$ such that
\[ \xi_{k_1}(D_l, E) = \cdots = \xi_{k_l}(D_l, E) = 1. \]
In view of (2.9) and (2.10) we obtain that $D_l \cong D$ coincides with $E$ and this finishes the proof. □
By their definition the $J$-invariants are algebraic weakly modular functions ([Go], Def. 1.14).

3. Coarse Moduli Schemes and Canonical Compactification

Let $L$ be a separably closed $A$-field and $M$ an $A$-scheme. We denote by $M_L$ the scheme over $L$ obtained from $M$ by base change. Consider two contravariant functors from the category of $A$-schemes to the category of sets:
\[ D^r : A\text{-scheme } S \mapsto \{\text{isomorphism classes of Drinfeld modules of rank } r \text{ over } S\} \]
and
\[ h_M : A\text{-scheme } S \mapsto \operatorname{Hom}(S, M). \]
A scheme $M = M^r(1)$ is called the coarse moduli scheme of Drinfeld modules of rank $r$ if there exists a morphism of functors $f : D^r \to h_M$ such that
(1) $D^r(L) \simeq h_M(L)$ for any separably closed $A$-field $L$;
(2) for any $A$-scheme $N$ and for any morphism of functors $g : D^r \to h_N$ there exists a unique morphism $\chi : h_M \to h_N$ such that the following diagram
\[ \begin{array}{ccc} D^r & \xrightarrow{\ f\ } & h_M \\ & {\scriptstyle g}\searrow & \downarrow{\scriptstyle \chi} \\ & & h_N \end{array} \tag{3.1} \]
commutes. It follows from Theorem 2.2(ii) that $M_L^r(1)$ is the quotient of the variety $V^r$ given by the equations
\[ X_k^{(q^r - 1)/(q^{d(k, r)} - 1)} = j_k, \qquad 1 \le k \le r - 1, \tag{3.2} \]
MINIMAL TERMINAL Q-FACTORIAL MODELS
by the action of the finite group $\mathbb{F}^*_{q^r}/\mathbb{F}^*_q$ such that $\xi(X_k) = \xi^{q^k - 1} X_k$ for any $\xi \in \mathbb{F}^*_{q^r}/\mathbb{F}^*_q$. The variety $V^r$ is affine and toric; consequently, $M_L^r(1)$ is also an affine toric variety over $L$. Therefore $M_L^r(1)$ is the spectrum of an $L$-algebra generated by invariant monomials. Thus we have
\[ M_L^r(1) = \operatorname{Spec} L\bigl[ J^{\delta_1 \dots \delta_l}_{k_1 \dots k_l} \bigr]. \tag{3.3} \]
Using the descent of the ground field we obtain that $M^r(1)$ is the spectrum of an $A$-algebra generated by the same system of invariants. A more formal proof is given below.

THEOREM 3.1. The affine toric $A$-variety of relative dimension $r - 1$
\[ M^r(1) = \operatorname{Spec} A\bigl[ J^{\delta_1 \dots \delta_l}_{k_1 \dots k_l} \bigr] \tag{3.4} \]
is the coarse moduli scheme of Drinfeld $A$-modules of rank $r$.
Proof. In virtue of Theorem 2.2(iii) the isomorphism classes of Drinfeld modules of rank $r$ over $L$ correspond bijectively to the geometric $L$-points of $M^r(1)$. Thus, the condition (1) above is verified. One can define a natural transformation $f : D^r \to h_M$ in the following way. Let $\mathcal{L}$ be a line bundle over an $A$-scheme $S$ and $E$ a Drinfeld module of rank $r$ over $(S, \mathcal{L})$. Moreover, let $S = \bigcup S_i$ be a covering trivializing $\mathcal{L}$, $S_i = \operatorname{Spec} B_i$; then
\[ J^{\delta_1 \dots \delta_l}_{k_1 \dots k_l}(E) \big|_{S_i} = \gamma^{\delta_1 \dots \delta_l}_{k_1 \dots k_l}(E) \in B_i. \]
Define the morphism of $A$-algebras
\[ A\bigl[ J^{\delta_1 \dots \delta_l}_{k_1 \dots k_l} \bigr] \to B_i \]
by the specialization
\[ J^{\delta_1 \dots \delta_l}_{k_1 \dots k_l} = \gamma^{\delta_1 \dots \delta_l}_{k_1 \dots k_l}(E). \]
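This defines a morphism $S \to M^r(1)$, that is, a geometric $S$-point of $M^r(1)$. If $\lambda : S' \to S$ is a morphism of $A$-schemes and $E' = \lambda^*(E)$ then the diagram
\[ \begin{array}{ccc} E' & \longrightarrow & E \\ \downarrow & & \downarrow \\ S' & \xrightarrow{\ \lambda\ } & S \\ & {\scriptstyle f_{E'}}\searrow & \downarrow{\scriptstyle f_E} \\ & & M^r(1) \end{array} \]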
commutes. It means that the geometric $S'$-point of $M^r(1)$ defined by $f_{E'}$ coincides with the geometric $S'$-point defined by the composition $f_E \circ \lambda$. Consequently, the diagram
\[ \begin{array}{ccc} D^r(S) & \longrightarrow & h_M(S) \\ \downarrow & & \downarrow \\ D^r(S') & \longrightarrow & h_M(S') \end{array} \]
also commutes. We have proved that $f$ is a natural transformation. The universality of scheme (3.4) follows from geometric invariant theory. □

Let $n$ be some admissible ideal of $A$, that is, divisible by at least two prime divisors. Let $M^r(n)$ be the fine moduli scheme of Drinfeld modules with $n$-level structure (in the sense of Drinfeld) ([Dr], §5). It is known that $M^r(n)$ is a nonsingular affine $A$-variety of relative dimension $r - 1$ ([Dr], Cor. of Prop. 5.4). We have also the forgetful morphism $M^r(n) \to M^r(1)$. In virtue of ([KM], (7.1), (8.1)) $M^r(n)$ is a $\mathrm{PGL}(r, A/n)$-torsor (in the f.p.p.f. topology). Any geometric $A$-point $P$ of $M^r(1)$ defines a unique isomorphism class of Drinfeld modules over the separable closure $K^s$. Let $E_P$ be some Drinfeld $A$-module representing this class. Denote also by $d(E_P)$ the greatest common divisor of $r$ and all natural integers $k < r$ such that $j_k(E_P) \ne 0$. We have
\[ \operatorname{Aut}(E_P) = \mathbb{F}^*_{q^{d(E_P)}}. \tag{3.5} \]
A geometric $A$-point $P$ of $M^r(1)$ is called elliptic if $\operatorname{Aut}(E_P)$ strictly contains $\mathbb{F}^*_q$. Let $\operatorname{Sing}(M^r(1))$ and $\operatorname{Ell}(M^r(1))$ be the loci of singular points and elliptic points resp. Since $M^r(1)$ is the quotient of the non-singular variety $V^r$ by the finite cyclic group $\mathbb{F}^*_{q^r}/\mathbb{F}^*_q$, all the singularities are cyclic quotient singularities.

THEOREM 3.2. For $r > 2$ we have
\[ \operatorname{Sing}(M^r(1)) = \operatorname{Ell}(M^r(1)). \tag{3.6} \]
Proof. Let $Q$ be a geometric $A$-point of $M^r(n)$ over $P$. The inertia group is isomorphic to:
\[ I(Q/P) = \operatorname{Aut}(E_P)/\mathbb{F}^*_q. \tag{3.7} \]
If $\operatorname{Aut}(E_P) = \mathbb{F}^*_q$ then $I(Q/P)$ is trivial and $P$ is non-singular ([Oo], Th. 2.7). On the other hand, if $P$ is non-singular there are two possibilities by the theorem on 'purity of branch locus' ([Oo], Th. 2.7; [AK], Ch. 6, Th. 6.8). The morphism $M^r(n) \to M^r(1)$ is non-ramified at $P$ and $I(Q/P) = 1$, or $P$ is ramified in codimension 1. The second case is impossible because $\operatorname{Ell}(M^r(1))$ is of codimension strictly greater than 1 for $r > 2$. Indeed, if $\operatorname{Aut}(E_P) \ne \mathbb{F}^*_q$ then the $j_k(E_P)$-invariants with $k$ prime to $r$ are equal to zero by (3.5). □
Consider the following contravariant functor from the category of $A$-schemes to the category of sets:
\[ \overline{D}^r : A\text{-scheme } S \mapsto \{\text{isomorphism classes of Drinfeld modules of rank} \le r \text{ over } S\}. \]
The coarse moduli scheme $\overline{M}^r(1)$ of rational Drinfeld modules of rank $\le r$ is defined in the same manner as in the beginning of this section.

PROPOSITION 3.3 (cf. [Ka], 1.6). The weighted projective space
\[ \overline{M}^r(1) = \mathbb{P}_A(q - 1, q^2 - 1, \dots, q^r - 1) \]
is the coarse moduli scheme of rational Drinfeld modules of rank $\le r$.
Proof. The reasoning analogous to the proof of Theorem 3.1 shows that the affine subvariety of $\mathbb{P}_A(q - 1, q^2 - 1, \dots, q^r - 1)$ corresponding to the non-zero $k$th coordinate is the coarse moduli scheme of Drinfeld modules of rank $\le r$ with non-zero $j_k$-invariant. The gluing finishes the proof. □

COROLLARY 3.4. We have the following description of the cuspidal divisor:
\[ \operatorname{Cusp} \overline{M}^r(1) \overset{\mathrm{def}}{=} \overline{M}^r(1) \setminus M^r(1) = \bigcup_{1 \le k \le r-1} M^k(1). \]
COROLLARY 3.5. We have
\[ \operatorname{Sing} \overline{M}^r(1) = \operatorname{Ell}(\overline{M}^r(1)) \setminus \operatorname{Ell} M^2(1) = \bigcup_{3 \le k \le r-1} \operatorname{Ell}(M^k(1)). \]
4. Ramification of the j-Covering

By Theorem 2.2(ii) the $j$-invariant defines the finite flat covering of the affine space of dimension $r - 1$ by $M^r(1)$.

PROPOSITION 4.1. The finite flat covering
\[ j : M^r(1) \to \mathbb{A}^{r-1}_A \tag{4.1} \]
is étale over
\[ \mathbb{G}^{r-1}_{m,A} = \operatorname{Spec} A[j_1, \dots, j_{r-1}, j_1^{-1}, \dots, j_{r-1}^{-1}] \tag{4.2} \]
and tame. The degree of this covering is equal to:
\[ N = \prod_{i=2}^{r-1} \frac{q^r - 1}{q^{d(i, r)} - 1}. \tag{4.3} \]
Proof. It follows from Theorem 2.2(ii). □
Let $(i_1, \dots, i_s)$ be a multi-index such that $1 \le i_1 < \cdots < i_s \le r - 1$. We denote by $\mathbb{A}_A(i_1, \dots, i_s) \subset \mathbb{A}^{r-1}_A$ the affine subvariety generated by the coordinates $j_{i_1}, \dots, j_{i_s}$ and by $\mathbb{G}_{m,A}(i_1, \dots, i_s)$ the corresponding subtorus. We also denote by $M^r(1)[i_1, \dots, i_s]$ the subvariety of $M^r(1)$ corresponding to Drinfeld modules such that their coefficients different from $i_1, \dots, i_s$ are zero.

PROPOSITION 4.2. $M^r(1)$ is regular in relative codimension $r - \varphi(r) - 2$ where $\varphi(r)$ is the length of a maximal chain $(i_1, \dots, i_s)$ such that $d(i_1, \dots, i_s, r) > 1$. Furthermore,
\[ \operatorname{Sing}(M^r(1)) = \bigcup_{d(i_1, \dots, i_s, r) > 1} M^r(1)[i_1, \dots, i_s]. \tag{4.4} \]
Proof. The result follows immediately from (3.5) and (3.6). □
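The index set of the union in (4.4) can be listed mechanically. The Python sketch below enumerates the multi-indices $(i_1, \dots, i_s)$ with $d(i_1, \dots, i_s, r) > 1$, assuming that $d(\cdot)$ denotes the greatest common divisor, as in the definition of $d(E_P)$ before (3.5); the function name `singular_strata` is illustrative.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def singular_strata(r):
    """Multi-indices 1 <= i_1 < ... < i_s <= r - 1 with gcd(i_1, ..., i_s, r) > 1."""
    strata = []
    for s in range(1, r):
        for idx in combinations(range(1, r), s):
            # reduce(gcd, idx, r) folds gcd over the indices, starting from r.
            if reduce(gcd, idx, r) > 1:
                strata.append(idx)
    return strata


if __name__ == "__main__":
    for r in (4, 5, 6):
        print(r, singular_strata(r))
```

For $r = 4$ the only multi-index is $(2)$, and for prime $r$ the list is empty, so under this reading no non-empty multi-index contributes to (4.4).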
COROLLARY 4.3. If $r$ is a prime integer then $M^r(1)$ is regular outside of the origin.

PROPOSITION 4.4. The finite flat covering
\[ j(i_1, \dots, i_s) : M^r(1)[i_1, \dots, i_s] \to \mathbb{A}_A(i_1, \dots, i_s) \tag{4.5} \]
is étale over $\mathbb{G}_{m,A}(i_1, \dots, i_s)$ and tame. The degree of this covering is equal to:
\[ N(i_1, \dots, i_s) = \frac{(q^r - 1)^{s-1} (q^{d(i_1, \dots, i_s, r)} - 1)}{(q^{d(i_1, r)} - 1) \cdots (q^{d(i_s, r)} - 1)}. \tag{4.6} \]
In particular, $N(i_1) = 1$ for any $1 \le i_1 \le r - 1$.
Proof. The first part is analogous to the first part of Proposition 4.1 if we consider Drinfeld modules such that their coefficients different from $i_1, \dots, i_s$ are zero. It suffices therefore to prove (4.6). Notice that $N(i_1, \dots, i_s)$ is equal to the number of non-isomorphic Drinfeld modules with the same $j$-invariant such that their non-zero components are exactly $i_1, \dots, i_s$. According to Theorem 2.2(ii) one can suppose that the $k$-coefficients of such Drinfeld modules belong to $\mathbb{F}^*_{q^r}/\mathbb{F}^*_{q^{d(k, r)}}$. We reason by induction on $s$. If $s = 1$, via an isomorphism $u \in \mathbb{F}^*_{q^r}$ one can suppose that the $i_1$-coefficient is equal to 1. Thus, $N(i_1) = 1$. If $s = 2$ we put the $i_1$-coefficient equal to 1. The $i_2$-coefficient may be written as
\[ t^{k (q^{d(i_2, r)} - 1)}, \qquad 1 \le k \le (q^r - 1)/(q^{d(i_2, r)} - 1), \]
where $t$ is some generator of $\mathbb{F}^*_{q^r}$. On factorizing by the action of $\mathbb{F}^*_{q^{d(i_1, r)}}$ we obtain
\[ N(i_1, i_2) = \mathrm{g.c.d.}\Bigl( \frac{q^r - 1}{q^{d(i_1, r)} - 1}, \frac{q^r - 1}{q^{d(i_2, r)} - 1} \Bigr) = \frac{(q^r - 1)(q^{d(i_1, i_2, r)} - 1)}{(q^{d(i_1, r)} - 1)(q^{d(i_2, r)} - 1)}. \]
In general, we have
\[ N(i_1, \dots, i_s) = N(i_1, \dots, i_{s-1}) \cdot \mathrm{g.c.d.}\Bigl( \frac{q^r - 1}{q^{d(i_1, \dots, i_{s-1}, r)} - 1}, \frac{q^r - 1}{q^{d(i_s, r)} - 1} \Bigr) = N(i_1, \dots, i_{s-1}) \cdot \frac{(q^r - 1)(q^{d(i_1, \dots, i_s, r)} - 1)}{(q^{d(i_1, \dots, i_{s-1}, r)} - 1)(q^{d(i_s, r)} - 1)}. \]
The formula (4.6) is an immediate consequence. □
COROLLARY 4.5. The covering (4.1) is tamely ramified over $\mathbb{G}_{m,A}(i_1, \dots, i_s)$ of ramification index
\[ e(i_1, \dots, i_s) = \frac{N}{N(i_1, \dots, i_s)} \tag{4.7} \]
for any multi-index $(i_1, \dots, i_s)$. Therefore this covering is totally ramified over $\mathbb{A}_A(i_1)$ for any $1 \le i_1 \le r - 1$.

COROLLARY 4.6. If $r > 3$ is prime then
\[ N(i_1, \dots, i_s) = \Bigl( \frac{q^r - 1}{q - 1} \Bigr)^{s-1}; \qquad e(i_1, \dots, i_s) = \Bigl( \frac{q^r - 1}{q - 1} \Bigr)^{r-s-1} \tag{4.8} \]
for any multi-index $(i_1, \dots, i_s)$.

EXAMPLE 4.7. If $r = 4$ we have
\[ N = \frac{(q^4 - 1)^2}{(q^2 - 1)(q - 1)}; \qquad N(1, 2) = N(2, 3) = \frac{q^4 - 1}{q^2 - 1}; \qquad N(1, 3) = \frac{q^4 - 1}{q - 1} \]
and $\operatorname{Sing}(M^4(1)) = M^4(1)[2]$.
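The degrees in Example 4.7 can be recomputed from (4.3) and (4.6) with exact rational arithmetic. The Python sketch below does this under the assumption that $d(\cdot)$ is the greatest common divisor; the function names `degree_partial` and `degree_total` are hypothetical labels for (4.6) and (4.3).

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def d(*nums):
    """Greatest common divisor of all arguments."""
    return reduce(gcd, nums)


def degree_partial(q, r, idx):
    """N(i_1, ..., i_s) from formula (4.6)."""
    s = len(idx)
    numerator = (q**r - 1) ** (s - 1) * (q ** d(*idx, r) - 1)
    denominator = 1
    for i in idx:
        denominator *= q ** d(i, r) - 1
    return Fraction(numerator, denominator)


def degree_total(q, r):
    """N from formula (4.3)."""
    value = Fraction(1)
    for i in range(2, r):
        value *= Fraction(q**r - 1, q ** d(i, r) - 1)
    return value


if __name__ == "__main__":
    q, r = 2, 4
    print(degree_total(q, r), degree_partial(q, r, (1, 2)), degree_partial(q, r, (1, 3)))
```

With $q = 2$ and $r = 4$ this gives $N = 75$, $N(1, 2) = N(2, 3) = 5$ and $N(1, 3) = 15$, in agreement with the closed forms of the example, and (4.6) applied to the full multi-index $(1, \dots, r-1)$ reproduces (4.3).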
5. Rational Polyhedral Cone and Its Dual

For any $r > 3$ we fix some lattice $N^r$ of rank $(r - 1)$ and let $N^{r*} = \operatorname{Hom}_{\mathbb{Z}}(N^r, \mathbb{Z})$ be its dual. We write simply $N$ and $N^*$ if there is no confusion. There exists a natural correspondence between $(r - 1)$-dimensional rational strictly convex polyhedral cones in $N^*_{\mathbb{R}}$ and $(r - 1)$-dimensional affine toric varieties ([Da1], [Fu], [Od]).

THEOREM 5.1. The rational simplicial cone generated by the following vectors
\[ e_1^* = (1, 0, \dots, 0), \qquad e_k^* = \Bigl( \frac{q^k - 1}{q^{d(k, r)} - 1}, \underbrace{0, \dots, 0}_{k-2}, \frac{q^r - 1}{q^{d(k, r)} - 1}, 0, \dots, 0 \Bigr), \tag{5.1} \]
for $2 \le k \le r - 1$, is the dual rational polyhedral cone $\check\sigma$ of $M^r(1)$. The rational polyhedral cone $\sigma$ of $M^r(1)$ is generated by:
\[ e_1 = \Bigl( \frac{q^r - 1}{q - 1}, -q - 1, \dots, -\frac{q^k - 1}{q - 1}, \dots, -\frac{q^{r-1} - 1}{q - 1} \Bigr), \qquad e_k = (\underbrace{0, \dots, 0}_{k-1}, 1, \underbrace{0, \dots, 0}_{r-k-1}), \quad 2 \le k \le r - 1. \tag{5.2} \]
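The duality asserted in Theorem 5.1 can be checked numerically: each generator of $\sigma$ should pair to zero with all but one generator of $\check\sigma$ and positively with the remaining one. The Python sketch below builds both families for given $q$ and $r$, reading (5.1) as having the entry $(q^k - 1)/(q^{d(k,r)} - 1)$ in the first position and $(q^r - 1)/(q^{d(k,r)} - 1)$ in the $k$th position, and assuming $d(k, r) = \gcd(k, r)$. The function names are illustrative.

```python
from fractions import Fraction
from math import gcd


def dual_cone_generators(q, r):
    """Vectors e_1^*, ..., e_{r-1}^* of (5.1) as lists of Fractions."""
    gens = [[Fraction(1)] + [Fraction(0)] * (r - 2)]
    for k in range(2, r):
        denom = q ** gcd(k, r) - 1
        v = [Fraction(0)] * (r - 1)
        v[0] = Fraction(q**k - 1, denom)
        v[k - 1] = Fraction(q**r - 1, denom)
        gens.append(v)
    return gens


def cone_generators(q, r):
    """Vectors e_1, ..., e_{r-1} of (5.2) as lists of Fractions."""
    e1 = [Fraction(q**r - 1, q - 1)]
    e1 += [Fraction(-(q**k - 1), q - 1) for k in range(2, r)]
    gens = [e1]
    for k in range(2, r):
        v = [Fraction(0)] * (r - 1)
        v[k - 1] = Fraction(1)
        gens.append(v)
    return gens


def pairing_matrix(q, r):
    """Matrix of pairings <e_i, e_j^*>; duality means it is diagonal and positive."""
    primal = cone_generators(q, r)
    dual = dual_cone_generators(q, r)
    return [[sum(a * b for a, b in zip(e, es)) for es in dual] for e in primal]


if __name__ == "__main__":
    for row in pairing_matrix(2, 4):
        print([str(x) for x in row])
```

A diagonal pairing matrix with positive entries means that each $e_i$ is orthogonal to the facet of $\check\sigma$ spanned by the $e_j^*$ with $j \ne i$, which is the relation used at the end of the proof.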
Proof. We know that
$$M^r(1) = \operatorname{Spec} A\bigl[J^{\delta_1\dots\delta_l}_{k_1\dots k_l}\bigr].$$
For any $1\le k\le r-1$, let $J_k$ be an element verifying (3.2), i.e. such that
$$J_k^{(q^r-1)/(q^{d(k,r)}-1)} = j_k.$$
Using the transformations
$$U_1 = j_1,\qquad U_k = \frac{J_k}{J_1^{(q^k-1)/(q-1)}} \tag{5.3}$$
for $2\le k\le r-1$ ([Fu], Sect. 2.2), we obtain that
$$j_k = J_1^{\frac{(q^k-1)(q^r-1)}{(q-1)(q^{d(k,r)}-1)}}\, U_k^{\frac{q^r-1}{q^{d(k,r)}-1}} = U_1^{(q^k-1)/(q^{d(k,r)}-1)}\, U_k^{(q^r-1)/(q^{d(k,r)}-1)}.$$
There is a bijective correspondence between the integral points of this cone and the $J$-invariants. Indeed, any monomial of $U_1,\dots,U_{r-1}$ belonging to the cone $\check\sigma$ gives a $J$-invariant by the formula (5.3). On the other hand, the invariant $J^{\delta_1\dots\delta_l}_{k_1\dots k_l}$ verifying (B1) determines the monomial $U_1^{\delta_r}U_{k_1}^{\delta_1}\cdots U_{k_l}^{\delta_l}$. The cone $\sigma$ is obtained by taking suitable orthogonal vectors to the facets of $\check\sigma$. □
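The last step of the proof (each generator of $\sigma$ is orthogonal to a facet of $\check\sigma$) can be checked on examples. The following is a minimal sketch, assuming that $d(k,r)$ denotes $\gcd(k,r)$ and that the generators are exactly those listed in (5.1) and (5.2); the helper names are hypothetical. It verifies that the pairing matrix $\langle e_i, e_j^*\rangle$ is diagonal with positive diagonal entries.

```python
from math import gcd

def dual_generators(q, r):
    """Vectors e_1^*, ..., e_{r-1}^* of (5.1), assuming d(k, r) = gcd(k, r)."""
    gens = [[1] + [0] * (r - 2)]
    for k in range(2, r):
        d = gcd(k, r)
        v = [0] * (r - 1)
        v[0] = (q**k - 1) // (q**d - 1)
        v[k - 1] = (q**r - 1) // (q**d - 1)
        gens.append(v)
    return gens

def cone_generators(q, r):
    """Vectors e_1, ..., e_{r-1} of (5.2)."""
    e1 = [(q**r - 1) // (q - 1)] + [-((q**k - 1) // (q - 1)) for k in range(2, r)]
    gens = [e1]
    for k in range(2, r):
        v = [0] * (r - 1)
        v[k - 1] = 1
        gens.append(v)
    return gens

def pairing_matrix(q, r):
    """Matrix of scalar products <e_i, e_j^*>."""
    return [[sum(a * b for a, b in zip(e, es)) for es in dual_generators(q, r)]
            for e in cone_generators(q, r)]

def is_positive_diagonal(matrix):
    n = len(matrix)
    return all((matrix[i][j] > 0) if i == j else (matrix[i][j] == 0)
               for i in range(n) for j in range(n))
```

A positive diagonal pairing matrix means that each $e_i$ vanishes on the facet of $\check\sigma$ spanned by the $e_j^*$ with $j\ne i$ and is positive on $e_i^*$, which is the duality claimed for two simplicial cones.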
COROLLARY 5.2. The rational simplicial fan generated by the ray $(-1,0,\dots,0)$ and by the rays (5.2) is the rational polyhedral fan of $\overline{M}^r(1)$.

Proof. In virtue of Proposition 3.3 we have
$$\overline{M}^r(1) = \mathbb{P}_A(q-1,\ q^2-1,\ \dots,\ q^r-1) = \mathbb{P}_A\Bigl(1,\ \frac{q^2-1}{q-1},\ \dots,\ \frac{q^r-1}{q-1}\Bigr). \tag{5.4}$$
Therefore $\overline{M}^r(1)$ is the equivariant compactification of the affine space $\mathbb{A}^{r-1}_A$ ([Do], 1.2.4). Moreover, the weighted projective space (5.4) is the gluing of $r$ affine toric varieties with simplicial cones. Thus, we have to add only one ray to the cone $\sigma$ in order to form a fan of $\overline{M}^r(1)$. It is easy to see that adding the ray $(-1,0,\dots,0)$ we obtain the result. □

Remark. This result may also be deduced by applying ([Od], Th. 2.22) to the polytope corresponding to $\sigma$ (cf. [Do], 1.2.5).

Figure 1. Dual rational cone $\check\sigma$ of $M^3(1)$ (on the left) and of $M^4(1)$.
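The step "it is easy to see" can be illustrated with the standard criterion that the primitive ray generators $v_i$ of the fan of a weighted projective space with weights $w_i$ satisfy $\sum_i w_i v_i = 0$. The sketch below checks this relation for the rays (5.2) together with $(-1,0,\dots,0)$. The assignment of weights to rays (weight 1 to $e_1$, weight $(q^k-1)/(q-1)$ to $e_k$, weight $(q^r-1)/(q-1)$ to the added ray) is an assumption made for this illustration, and the function names are hypothetical.

```python
def fan_rays_and_weights(q, r):
    """Rays (5.2) plus (-1, 0, ..., 0), with the weights of (5.4) attached (assumed matching)."""
    def w(k):
        return (q**k - 1) // (q - 1)  # weight (q^k - 1)/(q - 1)

    e1 = [w(r)] + [-w(k) for k in range(2, r)]
    rays = [e1]
    for k in range(2, r):
        v = [0] * (r - 1)
        v[k - 1] = 1
        rays.append(v)
    rays.append([-1] + [0] * (r - 2))          # the added ray
    weights = [1] + [w(k) for k in range(2, r)] + [w(r)]
    return rays, weights

def weighted_sum(rays, weights):
    """Coordinates of sum_i w_i * v_i."""
    dim = len(rays[0])
    return [sum(wt * ray[j] for ray, wt in zip(rays, weights)) for j in range(dim)]
```

The fan has $r$ rays in a lattice of rank $r-1$, so exactly one linear relation is expected, and its coefficients reproduce the weights in (5.4).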
6. Minimal Terminal Q-Factorial Compactification

We shall now construct the minimal simplicial terminal subdivision of the cone $\sigma$ of $M^r(1)$. We suppose here that $r\ge 4$ and $q$ is big enough. The unique equivariant minimal smooth compactification of the coarse moduli surface $M^3(1)$ will be constructed in the next section.

We denote by $\mathrm{Sk}^1\sigma$ the set of the extremal rays of a simplicial cone $\sigma$ and by $l_\sigma$ a linear form on $N_{\mathbb{Q}}$ such that $l_\sigma(\mathrm{Sk}^1\sigma) = 1$. The convex polytope $\sigma\cap l_\sigma^{-1}[0,1]$ is called the shed of $\sigma$ and the convex polytope $\sigma\cap l_\sigma^{-1}(1)$ in codimension 1 is called the roof of the shed of $\sigma$ (cf. [Re], [BGS]). The shed (resp. the roof of the shed) of a fan $\Sigma$ is the union of the sheds (resp. of the roofs of the sheds) of its cones. A cone is terminal if its shed does not contain integral points distinct from its vertices. Finally, a fan is terminal if it is the union of terminal cones.

THEOREM 6.1. The consecutive star subdivisions centered in the rays
$$\bigl(q^{r-m-2}+q^{r-m-4}+q^{r-m-5}+\dots+q+1,\ \underbrace{0,\dots,0}_{m},\ -1,\ -q,\ -q^2-1,\ \dots,\ -q^{r-m-3}-q^{r-m-5}-q^{r-m-6}-\dots-q-1\bigr) \tag{6.1}$$
for $0\le m\le r-4$ (in ascending order) and in the ray $(1,0,\dots,0)$ define the unique minimal terminal Q-factorial equivariant model $M^r_{\min}(1)$ of $M^r(1)$.

Figure 2. Rational polyhedral cone of $M^3(1)$ (on the top) and of $M^4(1)$.

Proof. The Q-factoriality follows from the fact that star subdivisions are simplicial (cf. [Br], Sect. 4.2). We shall check the terminality of singularities. Let $\Sigma_{\min}$ denote the fan of $M^r_{\min}(1)$. The extremal rays of the cones of $\Sigma_{\min}$ are called terminal rays of $\sigma$. A point of the shed of $\sigma$ generating a terminal ray will be called a terminal point. The coordinates of terminal rays in the interior of the shed of $\sigma$ may be found by consecutive projections to the coordinates $\{e_1, e_{r-1}\}$ and $\{-e_{k+1}, e_k\}$ for $2\le k\le r-2$. The projection on $\{e_1, e_{r-1}\}$ defines the two-dimensional cone
$$\Bigl\langle (0,1),\ \Bigl(\frac{q^r-1}{q-1},\ \frac{1-q^{r-1}}{q-1}\Bigr)\Bigr\rangle \tag{6.2}$$
which is the rational cone of the surface $M^r(1)[1,r-1]$. The points $(lq+1,-l)$ for $0\le l<(q^{r-1}-1)/(q-1)$ are the only terminal points in the shed of this cone apart from the extremal rays (see Figure 3). They define the minimal desingularization of the surface $M^r(1)[1,r-1]$. A projection of the second type defines the two-dimensional fan
$$\Bigl\langle (-1,0),\ (0,1),\ \Bigl(\frac{q^{k+1}-1}{q-1},\ \frac{1-q^k}{q-1}\Bigr)\Bigr\rangle$$
which is the rational fan of the surface $\overline{M}^{k+1}(1)[1,k]$. The points $(lq+1,-l)$ for $0\le l<(q^k-1)/(q-1)$ and the point $(q,-1)$ are the only terminal points in the shed of this fan (apart from the extremal rays). These points give the minimal smooth compactification of the surface $M^{k+1}(1)[1,k]$ (see Figure 4).

Figure 3. Minimal desingularization of the surface $M^r(1)[1,r-1]$.

Figure 4. Minimal smooth compactification of the surface $M^{k+1}[1,k]$.

We obtain, consequently, that a point $(x_1,\dots,x_{r-1})$ distinct from the origin and lying strictly inside of the shed of $\sigma$ is terminal only if one of the following conditions is satisfied:
$$0\le -x_2\le q,\qquad x_{k+1} = x_kq-1\quad\text{and}\quad x_1 = 1-x_{r-1}q, \tag{6.3}$$
for $2\le k\le r-2$, or
$$x_2 = \dots = x_{m+1} = 0\ (\text{if } m\ge 1),\qquad x_{k+1} = x_kq-1\quad\text{and}\quad x_1 = 1-x_{r-1}q, \tag{6.4}$$
for $0\le m\le r-2$ and $m+2\le k\le r-2$, or finally
$$x_2 = \dots = x_{m+1} = 0\ (\text{if } m\ge 1),\qquad x_{m+2} = -1,\qquad x_{m+3} = -q,\qquad x_{k+1} = x_kq-1\quad\text{and}\quad x_1 = 1-x_{r-1}q \tag{6.5}$$
for $0\le m\le r-2$ and $m+4\le k\le r-2$. The relations (6.3) and (6.4) define the points
$$\bigl(lq^{r-2}+q^{r-3}+\dots+q+1,\ -l,\ -lq-1,\ \dots,\ -lq^{r-3}-q^{r-4}-\dots-q-1\bigr)$$
for $0\le l\le q+1$ and the points
$$\bigl(q^{r-m-2}+q^{r-m-3}+\dots+q+1,\ \underbrace{0,\dots,0}_{m},\ -1,\ -q-1,\ -q^2-q-1,\ \dots,\ -q^{r-m-3}-q^{r-m-4}-\dots-q-1\bigr)$$
respectively. These points, except for $(1,0,\dots,0)$, lie above the hyperplane in $N_{\mathbb{R}}$ passing through $e_1,\dots,e_{r-1}$ (see (5.2)), which is easy to prove by straightforward computation. Therefore these points do not belong to the shed of $\sigma$. The point $(q,0,\dots,0,-1)$ corresponding to $m = r-3$ in (6.5) does not belong to this shed either. Indeed, its projection on $\{e_1,e_{r-1}\}$, which is $(q,-1)$, does not belong to the shed of $M^r(1)[1,r-1]$ (see Figure 3). Thus, the point $(x_1,\dots,x_{r-1})$ lying strictly inside of the shed of $\sigma$ is terminal if and only if it is $(1,0,\dots,0)$ or
$$x_2 = \dots = x_{m+1} = 0\ (\text{if } m\ge 1),\qquad x_{m+2} = -1,\qquad x_{m+3} = -q,\qquad x_{k+1} = x_kq-1\quad\text{and}\quad x_1 = 1-x_{r-1}q \tag{6.6}$$
for $0\le m\le r-4$ and $m+4\le k\le r-2$. The variety $\widetilde{M}^r(1)$ obtained by the consecutive star subdivisions centered in these rays (in ascending order with respect to $m$) has the shed with concave roof along the internal walls (see [Re] for terminology). It follows from the Reid theorem ([Re], Th. 0.2) that it is a minimal model. Any other minimal model with terminal Q-factorial singularities has the same shed. The roof of this shed is strictly concave along the internal walls and, consequently, the constructed minimal model is unique. □

THEOREM 6.2. The consecutive star subdivisions of the rational polyhedral fan $\sigma$ of $\overline{M}^r(1)$ by the following rays
$$\bigl(q^{r-m-2}+q^{r-m-4}+q^{r-m-5}+\dots+q+1,\ \underbrace{0,\dots,0}_{m},\ -1,\ -q,\ -q^2-1,\ \dots,\ -q^{r-m-3}-q^{r-m-5}-q^{r-m-6}-\dots-q-1\bigr) \tag{6.7}$$
for $0\le m\le r-2$ (in ascending order) define the unique minimal terminal Q-factorial equivariant compactification $\overline{M}^r_{\min}(1)$ of $M^r(1)$.

Proof. It suffices to prove that the points (6.7) are the only terminal points strictly inside of the shed of $\sigma$ apart from the origin. As in the proof of the previous theorem take the consecutive projections to the coordinates $\{e_1,e_{r-1}\}$ and $\{-e_{k+1},e_k\}$ for $2\le k\le r-2$. These projections define the two-dimensional fans
$$\Bigl\langle (-1,0),\ (0,1),\ \Bigl(\frac{q^{k+1}-1}{q-1},\ \frac{1-q^k}{q-1}\Bigr)\Bigr\rangle$$
for $2\le k\le r-1$. We obtain, consequently, that a point distinct from the origin and lying strictly inside of the shed of $\sigma$ is terminal if and only if the following condition is satisfied:
$$x_2 = \dots = x_{m+1} = 0\ (\text{if } m\ge 1),\qquad x_{m+2} = -1,\qquad x_{m+3} = -q,\qquad x_{k+1} = x_kq-1\quad\text{and}\quad x_1 = 1-x_{r-1}q \tag{6.8}$$
for $0\le m\le r-2$ and $m+4\le k\le r-2$. These points without $(q,0,\dots,0,-1)$ (corresponding to $m = r-3$) define the unique minimal simplicial terminal subdivision of the rational cone of $M^r(1)$ by the previous theorem. The point $(q,0,\dots,0,-1)$ lies in the subcone $\langle(-1,0,\dots,0),e_1,\dots,e_{r-2}\rangle$ of the fan $\sigma$ (cf. (5.2)). The star subdivision centered in the corresponding ray together with the consecutive star subdivisions of Theorem 6.1 define a unique minimal terminal Q-factorial equivariant compactification of $M^r(1)$. □

Remark. We supposed that $q$ is sufficiently big. In fact, if $q\le r-3$ then the ray of (6.1) and of (6.7) corresponding to $m = 0$ does not necessarily belong to the shed of the cone $\sigma$ of $M^r(1)$.
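The closed forms (6.1) and (6.7) can be compared with the recursive description $x_{m+2} = -1$, $x_{m+3} = -q$, $x_{k+1} = x_kq-1$, $x_1 = 1-x_{r-1}q$. The sketch below is an illustration for $0\le m\le r-4$ only, with hypothetical function names. It assumes that the recursion is applied from $k = m+3$ onwards, since that starting index is what reproduces the coordinate $-q^2-1$ following $-q$ in the closed form.

```python
def ray_from_recursion(q, r, m):
    """Point with x_{m+2} = -1, x_{m+3} = -q, x_{k+1} = q*x_k - 1, x_1 = 1 - q*x_{r-1}."""
    x = [0] * (r - 1)                # x[j] stores the coordinate x_{j+1}
    x[m + 1] = -1                    # x_{m+2}
    x[m + 2] = -q                    # x_{m+3}
    for j in range(m + 3, r - 1):    # assumption: recursion applied from k = m + 3
        x[j] = q * x[j - 1] - 1
    x[0] = 1 - q * x[r - 2]
    return x

def gap_sum(q, j):
    """q^j + q^(j-2) + q^(j-3) + ... + q + 1 (the power q^(j-1) is skipped), j >= 1."""
    return q**j + sum(q**t for t in range(j - 1))

def ray_closed_form(q, r, m):
    """The ray written out as in (6.1) and (6.7)."""
    tail = [-gap_sum(q, j) for j in range(1, r - m - 2)]
    return [gap_sum(q, r - m - 2)] + [0] * m + [-1] + tail
```

For example, with $r = 5$ and $m = 0$ both functions return $(q^3+q+1,\ -1,\ -q,\ -q^2-1)$.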
7. Drinfeld Coarse Moduli Surface

7.1. EQUATIONS DEFINING $M^3(1)$

First of all, we shall construct a regular subdivision of the dual cone $\check\sigma$ (see Figure 5). Let $\chi^{u_0},\dots,\chi^{u_{q+1}}$ be the characters of the torus $T_N$ corresponding to the rays $u_0,\dots,u_{q+1}$ of the regular subdivision of $\check\sigma$. Using the property
$$\chi^{u}\chi^{u'} = \chi^{u+u'}$$
valid for any elements $u,u'\in N$ we deduce in our case that
$$\chi^{u_{q+1}}\chi^{u_{q-1}} = \chi^{(q+2)u_q},\qquad \chi^{u_{i+1}}\chi^{u_{i-1}} = \chi^{2u_i},\quad 1\le i\le q-1. \tag{7.1}$$

Figure 5. Regular subdivision of the dual cone of $M^3(1)$.
Denote $X_i = \chi^{u_i}$. We obtain that $M^3(1)$ is defined as a scheme over $\operatorname{Spec} A$ by the following $q$ equations:
$$X_q^{q+2} = X_{q+1}X_{q-1},\qquad X_i^2 = X_{i+1}X_{i-1},\quad 1\le i\le q-1, \tag{7.2}$$
where $X_{q+1} = j_2$ and
$$X_i = U_1U_2^i = j_1\Bigl(\frac{J_2}{J_1^{q+1}}\Bigr)^i = J_1^{q^2+q+1-i(q+1)}J_2^i \tag{7.3}$$
for $0\le i\le q$ in notations (5.3).

7.2. MINIMAL SMOOTH COMPACTIFICATION OF $M^3(1)$

In order to find a resolution of singularities of an affine toric variety it suffices to find a regular subdivision of the corresponding rational cone. In our case the minimal regular subdivision is given by Figure 3 since $M^3(1) = M^3(1)[1,2]$. The minimal resolution of singularities is given by a chain of blowing-ups $M^3_{\min}(1)\to M^3(1)$ at $T_N$-invariant centers. The exceptional divisor
$$E = C_1+C_2+\dots+C_{q+1}$$
is described by Figure 6. Here $C_1,\dots,C_{q+1}$ are rational curves with the following indices of self-intersection:
$$(C_1)^2 = -q-1,\qquad (C_i)^2 = -2\ \text{ for } i>1.$$
The minimal smooth compactification is represented by Figure 4 for $k = 2$. The rational polyhedral fan of $\overline{M}^3_{\min}(1)$ is the subdivision of the subfans corresponding to the Hirzebruch surfaces $H_q$ and $H_{q+1}$. Consequently, $\overline{M}^3_{\min}(1)$ may be obtained by a succession of blowing-ups at $T_N$-stable points of any of these surfaces.

We see further that the rational fan of $\overline{M}^3_{\min}(1)$ contains $d_2 = q+5$ two-dimensional regular subcones and $d_1 = q+5$ one-dimensional subcones. In particular, the Euler characteristic is equal to:
$$\chi\bigl(\overline{M}^3_{\min}(1)\bigr) = d_2 = q+5$$
([Fu], Ch. 4.3).
Figure 6. Exceptional divisor and its weighted graph.

Figure 7. Weighted circulated graph of $\overline{M}^3_{\min}(1)$.

Let $D_1,\dots,D_{q+5}$ be the irreducible invariant divisors on $\overline{M}^3_{\min}(1)$. They correspond to the rays (one-dimensional subcones) in Figure 4 for $k = 2$. It is known that
$$K = -\sum_{i=1}^{q+5}D_i$$
is the canonical divisor. Its self-intersection index is given by the formula:
$$(K)^2 = 12-d_2 = 7-q.$$
It is also possible to calculate the ($l$-adic) Betti numbers and the Poincaré polynomial using ([Fu], §4.5). We put $d_0 = 1$ (the number of zero-dimensional subcones). Then $\beta_3 = \beta_1 = 0$, $\beta_0 = \beta_4 = 1$ and $\beta_2 = d_1-2d_0 = q+3$. Furthermore, the Poincaré polynomial is equal to:
$$P_M(t) = \beta_4t^4+\beta_2t^2+\beta_0 = t^4+(q+3)t^2+1.$$
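The numerical invariants of this subsection can be recomputed from the fan. The sketch below is an illustration under stated assumptions: the rays are taken to be $(0,1)$, $(lq+1,-l)$ for $0\le l\le q+1$, $(q,-1)$ and $(-1,0)$, as read off from the description of Figure 4 with $k = 2$, and the function names are hypothetical. It uses two standard facts for a smooth complete toric surface with consecutive rays $v_{i-1}, v_i, v_{i+1}$: one has $v_{i-1}+v_{i+1} = a_iv_i$ with $D_i^2 = -a_i$, and $K^2 = \sum_i D_i^2 + 2d_1$ because neighbouring invariant divisors meet in one point.

```python
def compactified_fan(q):
    """Rays of the complete fan (assumed from Figure 4, k = 2), in clockwise order."""
    rays = [(0, 1)]
    rays += [(l * q + 1, -l) for l in range(q + 2)]   # l = 0, ..., q + 1
    rays += [(q, -1), (-1, 0)]
    return rays

def det(u, v):
    return u[0] * v[1] - u[1] * v[0]

def is_regular(rays):
    """Every pair of cyclically consecutive rays is a lattice basis (clockwise: det = -1)."""
    n = len(rays)
    return all(det(rays[i], rays[(i + 1) % n]) == -1 for i in range(n))

def self_intersections(rays):
    """D_i^2 = -a_i, where v_{i-1} + v_{i+1} = a_i * v_i."""
    n = len(rays)
    result = []
    for i in range(n):
        prev, cur, nxt = rays[i - 1], rays[i], rays[(i + 1) % n]
        sx, sy = prev[0] + nxt[0], prev[1] + nxt[1]
        a = sx // cur[0] if cur[0] != 0 else sy // cur[1]
        assert (a * cur[0], a * cur[1]) == (sx, sy)
        result.append(-a)
    return result

def canonical_square(rays):
    """K^2 = sum of D_i^2 plus 2 for each of the d_1 adjacent pairs."""
    return sum(self_intersections(rays)) + 2 * len(rays)
```

With these rays the divisors attached to $(lq+1,-l)$ for $0\le l\le q$ have self-intersections $-q-1,-2,\dots,-2$, in agreement with the values given for $C_1,\dots,C_{q+1}$.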
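7.3. ZETA-FUNCTION OF $\overline{M}^3_{\min}(1)$

First of all,
$$\operatorname{Card}\,\overline{M}^3_{\min}(1)(\mathbb{F}_{q^m}) = \beta_4q^{2m}+\beta_2q^m+\beta_0 = q^{2m}+(q+3)q^m+1.$$
We recall now that if $M$ is an $A$-variety of relative dimension $r-1$ and if the number of its geometric points over $\mathbb{F}_{q^{mn}}$ is equal to
$$\nu_n = \sum_{i=0}^{r-1}\mu_i(q^{mn})^i$$
then
$$\zeta(M/\mathbb{F}_{q^m},s) = \exp\Bigl(\sum_{n\ge 1}\frac{(q^m)^{-sn}\nu_n}{n}\Bigr) = \exp\Bigl(\sum_{n\ge 1}\sum_{i=0}^{r-1}\mu_i(q^m)^{in}\frac{(q^m)^{-sn}}{n}\Bigr) = \exp\Bigl(\sum_{i=0}^{r-1}\sum_{n\ge 1}\mu_i\frac{(q^m)^{(i-s)n}}{n}\Bigr)$$
$$= \prod_{i=0}^{r-1}\exp\Bigl(\mu_i\sum_{n\ge 1}\frac{(q^m)^{(i-s)n}}{n}\Bigr) = \prod_{i=0}^{r-1}\exp\bigl(-\mu_i\ln\bigl(1-q^{m(i-s)}\bigr)\bigr) = \prod_{i=0}^{r-1}\bigl(1-q^{m(i-s)}\bigr)^{-\mu_i}$$
(cf. [MP], Ch. 4, §1). In our case we have: $\mu_0 = \beta_0 = \mu_2 = \beta_4 = 1$, $\mu_1 = \beta_2 = q+3$. Consequently,
$$\zeta\bigl(\overline{M}^3_{\min}(1)/\mathbb{F}_{q^m},s\bigr) = \bigl(1-q^{-ms}\bigr)^{-1}\bigl(1-q^{m(1-s)}\bigr)^{-q-3}\bigl(1-q^{m(2-s)}\bigr)^{-1}.$$
In addition,
$$\zeta(M,s) = \prod_{m\ge 1}\ \prod_{\substack{\mathfrak{p}\in\operatorname{Specm}A\\ \deg\mathfrak{p}=m}}\zeta(M\otimes_A\mathbb{F}_{q^m},s) = \prod_{m\ge 1}\ \prod_{\substack{\mathfrak{p}\in\operatorname{Specm}A\\ \deg\mathfrak{p}=m}}\ \prod_{i=0}^{r-1}\bigl(1-q^{m(i-s)}\bigr)^{-\mu_i}$$
$$= \prod_{i=0}^{r-1}\ \prod_{m\ge 1}\ \prod_{\substack{\mathfrak{p}\in\operatorname{Specm}A\\ \deg\mathfrak{p}=m}}\bigl(1-q^{m(i-s)}\bigr)^{-\mu_i} = \prod_{i=0}^{r-1}\zeta_A(s-i)^{\mu_i},$$
where $\zeta_A(s) = (1-q^{1-s})^{-1}$ is the Dedekind zeta-function of $A = \mathbb{F}_q[T]$. In our case we have:
$$\zeta\bigl(\overline{M}^3_{\min}(1),s\bigr) = \zeta_A(s)\,\zeta_A(s-1)^{q+3}\,\zeta_A(s-2) = \Bigl[\bigl(1-q^{1-s}\bigr)\bigl(1-q^{2-s}\bigr)^{q+3}\bigl(1-q^{3-s}\bigr)\Bigr]^{-1}.$$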
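The passage from the exponential of the point-count series to the finite product can be checked numerically for the surface case. The sketch below is an illustration with hypothetical function names. It assumes a real value of $s$ large enough for the series to converge quickly (here $s = 4$) and truncates the sum after a fixed number of terms, so the comparison is approximate by construction.

```python
import math

def point_count(q, m, n):
    """nu_n = q^(2mn) + (q + 3) q^(mn) + 1, the number of points over F_{q^(mn)}."""
    size = q ** (m * n)
    return size**2 + (q + 3) * size + 1

def zeta_from_series(q, m, s, terms=60):
    """exp(sum_{n>=1} (q^m)^(-s n) nu_n / n), truncated after `terms` summands."""
    x = float(q) ** (-m * s)
    total = sum(x**n * point_count(q, m, n) / n for n in range(1, terms + 1))
    return math.exp(total)

def zeta_from_product(q, m, s):
    """prod_i (1 - q^(m(i - s)))^(-mu_i) with mu_0 = mu_2 = 1 and mu_1 = q + 3."""
    mu = {0: 1, 1: q + 3, 2: 1}
    value = 1.0
    for i, mu_i in mu.items():
        value *= (1.0 - float(q) ** (m * (i - s))) ** (-mu_i)
    return value
```

Calling both functions with the same arguments, for example `(2, 1, 4)`, should give values that agree up to floating-point rounding.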
Acknowledgements

I am very grateful to Michel Brion for valuable remarks, to Catherine Bouvier for helpful discussions concerning toric varieties and to Alexei Panchishkin for constant encouragement and advice. I am thankful also to Karsten Bücker for reading this text and correcting some spelling mistakes.

References

AK. Altman, A. and Kleiman, S.: Introduction to Grothendieck Duality Theory, Lect. Notes Math. 146, Springer-Verlag, 1970.
BGS. Bouvier, C. and Gonzalez-Sprinberg, G.: Système générateur minimal, diviseurs essentiels et G-désingularisations de variétés toriques, Tôhoku Math. J. 47 (1995), 125–149.
Br. Brion, M.: Variétés sphériques et théorie de Mori, Duke Math. J. 72(2) (1993), 369–404.
Da1. Danilov, V. I.: Geometry of toric varieties, Uspekhi Mat. Nauk 33(2) (1978), 83–134 (in Russian); Engl. transl.: Russian Math. Surveys 33(2) (1978), 97–154.
Da2. Danilov, V. I.: Birational geometry of toric 3-folds, Izv. Akad. Nauk SSSR, Ser. Mat. 46(5) (1982), 971–982 (in Russian); Engl. transl.: Math. USSR-Izv. 21 (1983), 269–280.
Do. Dolgachev, I.: Weighted projective varieties, Proc. of a Polish-North American Seminar, Vancouver, 1981, Lect. Notes Math. 956 (1982), 34–71.
Dr. Drinfeld, V. G.: Elliptic modules, Mat. Sbornik 94 (1974), 594–627 (in Russian); Engl. transl.: Math. USSR S. 23 (1974), 561–592.
Fu. Fulton, W.: Introduction to Toric Varieties, The William H. Roever Lectures in Geometry, Princeton University Press, 1993.
Ge1. Gekeler, E.-U.: Moduli for Drinfeld modules, in The Arithmetic of Function Fields, Walter de Gruyter, Berlin, New York, 1992, pp. 153–170.
Ge2. Gekeler, E.-U.: Satake compactification of Drinfeld modular schemes, in Proc. Conf. on p-adic Analysis held in Hengelhoef (Houthalen), Belgium, 1986, pp. 71–81.
Go. Goss, D.: π-adic Eisenstein series for function fields, Comp. Math. 1 (1980), 3–38.
KM. Katz, N. M. and Mazur, B.: Arithmetic Moduli of Elliptic Curves, Ann. of Math. Stud. 108, Princeton University Press, 1985.
Ka. Kapranov, M. M.: On cuspidal divisors on the modular varieties of elliptic modules, Izv. Akad. Nauk URSS, Ser. Mat. 51(3) (1987), 568–583 (in Russian); Engl. transl.: Math. USSR-Izv. 30(3) (1988), 533–547.
MP. Manin, Yu. I. and Panchishkin, A. A.: Number theory I, in A. N. Parshin and I. R. Shafarevich (eds.), Encyclopaedia of Math. Sciences 49, Springer-Verlag, 1995.
Mu. Mumford, D.: Geometric Invariant Theory, Springer-Verlag, 1965.
Od. Oda, T.: Convex Bodies and Algebraic Geometry, Ergebnisse der Math. 15, Springer-Verlag, 1988.
Oo. Oort, F.: Coarse and fine moduli spaces of algebraic curves and polarized Abelian varieties, in Sympos. Math. XXIV, Academic Press, London, New York, 1981.
Pi. Pink, R.: On compactification of Drinfeld moduli schemes, in: Moduli Spaces, Galois Representations and L-functions, Sūrikaisekikenkyūsho Kōkyūroku 884 (1994), 178–183 (Japanese).
Po. Potemine, I. Yu.: J-invariant et schémas grossiers des modules de Drinfeld, Séminaire de Théorie des Nombres, Caen, Fascicule de l'année 1994–95, 15 pp.
Re. Reid, M.: Decomposition of toric morphisms, Arithmetic and geometry, Papers dedicated to I. R. Shafarevich on the occasion of his 60th birthday, Birkhäuser, Progress in Math. 36 (1983), 395–418.
Mathematical Physics, Analysis and Geometry 1: 193–221, 1998. © 1998 Kluwer Academic Publishers. Printed in the Netherlands.
193
Stability Criteria for the Weyl m-Function

W. O. AMREIN
Department of Theoretical Physics, University of Geneva, CH-1211 Geneva 4, Switzerland

D. B. PEARSON*
Department of Mathematics, University of Hull, Cottingham Road, Hull, U.K.

(Received: 29 October 1997; in final form: 15 July 1998)

Abstract. This paper presents a new approach to spectral theory for the Schrödinger operator on the half-line. Solutions of nonlinear Riccati-type equations related to the Schrödinger equation at real spectral parameter λ are characterised by means of their clustering properties as λ is varied. A family of solutions exhibiting a so-called δ-clustering property is shown to imply precise estimates for the complex boundary value of the Weyl m-function and the spectral measure, and leads to an analysis of the absolutely continuous component of the spectral measure in terms of stability criteria for the corresponding Riccati equations.

Mathematics Subject Classifications (1991): 34B25, 47E05.

Key words: m-function, Schrödinger operator, spectrum.

1. Introduction

This paper presents a new approach to the spectral theory of the Schrödinger operator on the half-line, based on an analysis of the Weyl–Titchmarsh m-function and its boundary values. It is well known (see, for example, [1–5]) that the m-function, defined in terms of two solutions $u(\cdot,z)$, $v(\cdot,z)$ of the Schrödinger equation at complex spectral parameter $z$ as the coefficient $m(z)$ for which $u(\cdot,z)+m(z)v(\cdot,z)$ is square integrable over the half-line, is a Herglotz function (analytic in the upper half-plane with positive imaginary part), the boundary behaviour of which determines the spectral properties of the differential operator $-d^2/dx^2+V(x)$ in $L^2(0,\infty)$. Here we are assuming a real locally integrable potential $V$, and limit-point case at infinity; we refer the reader to [2, 6, 7] for a treatment of the Weyl limit-point/limit-circle theory and note that the limit-point case, which holds for any potential bounded at infinity and more generally for a wide range of unbounded potentials as well, is the case of physical interest in most applications to quantum mechanics and elsewhere, and allows one to dispense with a boundary condition at infinity.

* Partially supported by the Swiss National Science Foundation and by EPSRC.
VTEXVR PIPS No: 187754 (mpagkap:mathfam) v.1.15 MPAG013.tex; 5/11/1998; 8:38; p.1
194
W. O. AMREIN AND D. B. PEARSON
We do, however, need to impose a boundary condition at $x = 0$, and a one-parameter family of boundary conditions, parametrised by a real parameter $\alpha$ in the range $-\pi/2<\alpha\le\pi/2$, leads to a one-parameter family $\{m_\alpha\}$ of m-functions, and correspondingly a one-parameter family $T_\alpha = -d^2/dx^2+V(x)$ of Schrödinger operators in $L^2(0,\infty)$, each with its associated spectral properties. The case $\alpha = 0$, with $m(z)\equiv m_0(z)$, corresponds to the Schrödinger operator with Dirichlet boundary condition at $x = 0$; it should, however, be noted that it is usually necessary, in developing spectral theory for such operators, to deal with a family of operators $\{T_\alpha\}$ rather than just a single operator. Since the spectrum of each of these Schrödinger operators $T_\alpha$ is a subset of $\mathbb{R}$, and the spectral measure $\mu_\alpha$ is a measure on Borel subsets of $\mathbb{R}$, one may expect that in principle it is better to deal with the Schrödinger equation at real spectral parameter $\lambda$, rather than at complex spectral parameter $z$. This will have the additional advantage that we can then call upon the variety of methods (orthogonality properties, oscillation and separation theorems) which apply to solutions of real Sturm–Liouville equations. Various theoretical ideas and methods have been introduced, particularly in recent years, which allow one to pass from a treatment of the m-function as a function of a complex variable $z$ in the upper half-plane, to an analysis of the boundary values $m_+(\lambda)\equiv\lim_{\varepsilon\to 0+}m(\lambda+i\varepsilon)$, defined for almost all $\lambda\in\mathbb{R}$. This leads to a link between spectral behaviour and the asymptotic properties in the limit $x\to\infty$ of solutions $f(x,\lambda)$ of the Schrödinger equation at real spectral parameter $\lambda$.
As examples of such developments, we may cite the application of the notion of subordinacy, introduced first of all in [8–10], and recently developed still further in [11–14], as a powerful tool of spectral analysis, the treatment of absolutely continuous spectrum in [15, 16], using an asymptotic condition for the squared wave-function, recent results in [17–20] on the absolutely continuous spectrum, and new techniques for problems of singular spectra in [21, 22]. A novel feature of the approach presented here, applied in particular to a study of the absolutely continuous component of the spectral measure $\mu_\alpha$ of $T_\alpha$, is that it is based on an analysis of complex solutions of the Schrödinger equation at real spectral parameter $\lambda$. At first sight, this approach seems a little unusual, not to say perverse, since for $\lambda$ real, the solution space for the Schrödinger equation $-d^2f/dx^2+Vf = \lambda f$ is spanned at each $\lambda$ by just two solutions $u(\cdot,\lambda)$, $v(\cdot,\lambda)$, which may be taken to be real, and any complex solution $f$ will just be a complex linear combination of these two real solutions. Nevertheless, as is already suggested for example in [23, 24] and [25], complete spectral information cannot be extracted from a study of the asymptotics of two solutions $u$ and $v$ in isolation, but depends rather on a knowledge of their relative asymptotics, for example of their relative amplitudes and phases. This information appears to be encapsulated in a particularly crucial way, for spectral analysis, in the large $x$ asymptotics of complex solutions at real $\lambda$. It should also be noted that to consider complex solutions is equivalent to considering simultaneously a pair of solutions, which is very much in line with recent ideas, expressed for example in [26], which stress the analogy between the asymptotic analysis of the Schrödinger equation and the large time behaviour of dynamical systems. The current paper, in drawing on notions such as stability, recurrence, and clustering, is a continuation of this line of development.

Rather than dealing with complex solutions $f(x,\lambda) = Au(x,\lambda)+Bv(x,\lambda)$ per se ($A,B\in\mathbb{C}$, and dependent on $\lambda$), we consider instead, for such solutions, the ratio $h(x,\lambda) = f'(x,\lambda)/f(x,\lambda)$, where $f'$ denotes differentiation with respect to $x$. This function $h(x,\lambda)$ is thus a particular rational combination of $u$, $v$, $u'$ and $v'$ which contains more spectral information than, for example, the real functions $u'/u$ and $v'/v$ considered separately. It is well known, and follows easily as a consequence of the Schrödinger equation satisfied by $f(x,\lambda)$, that $h(x,\lambda)$ satisfies the nonlinear Riccati differential equation
$$\frac{d}{dx}h(x,\lambda) = V(x)-\lambda-(h(x,\lambda))^2. \tag{1}$$
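The Riccati equation (1) can be illustrated in the simplest situation. The sketch below is an illustration only: it assumes the free case $V = 0$ with $\lambda>0$, takes $u = \cos(\sqrt{\lambda}\,x)$ and $v = \sin(\sqrt{\lambda}\,x)$, forms the complex solution $f = u+cv$ for a non-real constant $c$ (so that $f$ has no real zeros), and compares a central-difference derivative of $h = f'/f$ with the right-hand side of (1). The function names and the step size are hypothetical choices.

```python
import cmath

def h(x, lam, c):
    """h = f'/f for f(x) = cos(kx) + c sin(kx), k = sqrt(lam); free case V = 0."""
    k = cmath.sqrt(lam)
    f = cmath.cos(k * x) + c * cmath.sin(k * x)
    df = k * (-cmath.sin(k * x) + c * cmath.cos(k * x))
    return df / f

def riccati_residual(x, lam, c, step=1e-5):
    """dh/dx - (V - lam - h^2) with V = 0, using a central difference for dh/dx."""
    dh = (h(x + step, lam, c) - h(x - step, lam, c)) / (2 * step)
    return dh - (0.0 - lam - h(x, lam, c) ** 2)
```

With $c = i$ the solution is $f = e^{i\sqrt{\lambda}x}$ and $h$ is the constant $i\sqrt{\lambda}$, the stationary solution of (1) for $V = 0$; for other non-real $c$ the residual should be small up to discretisation error.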
This equation is appropriate to the study of the m-function and spectral properties for $T = -d^2/dx^2+V(x)$ subject to Dirichlet boundary condition at $x = 0$; this is the special case $\alpha = 0$, and for general boundary condition we will have a related Riccati equation, of which the solutions $h_\alpha$ are related to $h$ by explicit rational transformations. The principal aim of this paper will be to show how the large $x$ asymptotics of families of solutions of the above Riccati equation, as $\lambda$ is varied, imply explicit bounds of $m_+(\lambda)$, the boundary value of the m-function (or of $m_\alpha$ if Neumann or other boundary conditions are imposed); these estimates of $m_+$ can be used to generate information about the spectral properties of the corresponding Schrödinger operator, in particular as relates to the absolutely continuous part of the spectrum. As an example which will provide a flavour of the kind of result we shall obtain, may be cited the following, which applies to all real-valued, locally integrable potentials in the limit-point case at infinity:

Let $h(\cdot,\lambda)$ be any (complex-valued) solution of the above Riccati equation, measurable as a function of $\lambda$ and satisfying, for all $x$ sufficiently large and for all $\lambda$ belonging to some finite interval $I$, the bound
$$|h(x,\lambda)-i\sqrt{\lambda}|<\delta\sqrt{\lambda},$$
where $\delta$ is a constant in the range $0<\delta<1/2$. Then, for almost all $\lambda\in I$, we have the estimates:
$$|h(0,\lambda)-m_+(\lambda)|\le\frac{\delta}{1-2\delta}\operatorname{Im}m_+(\lambda),$$
$$|\operatorname{Im}h(0,\lambda)-\operatorname{Im}m_+(\lambda)|\le\frac{\delta}{1-\delta}\operatorname{Im}m_+(\lambda).$$
These estimates allow us to deduce from the initial values $h(0,\lambda)$ of our given family of solutions $h(\cdot,\lambda)$ precise upper and lower bounds for the value of $\operatorname{Im}m_+(\lambda)$. Since $\pi^{-1}\operatorname{Im}m_+(\lambda)$ is the density function for the spectral measure $\mu_{\mathrm{a.c.}}$ of the absolutely continuous component for the Dirichlet Schrödinger operator in $L^2(0,\infty)$, we can deduce corresponding estimates for the spectral measure of the interval $I$ itself. We can also estimate, to order $\delta$, the value of the complex limit $m_+(\lambda)$ itself, and similar results apply to the m-function for all other boundary conditions at $x = 0$. A particular consequence of this result applies if the value of $\delta$ may be made arbitrarily small. For this to be so, we require $\lim_{x\to\infty}h(x,\lambda) = i\sqrt{\lambda}$, and such a solution must then satisfy the initial condition $h(0,\lambda) = m_+(\lambda)$ exactly. (It also follows, for general $l>0$, that $h(l,\lambda)$ is then the boundary value at $\lambda$ for the m-function of the Schrödinger operator acting in $L^2(l,\infty)$, with Dirichlet boundary condition at $x = l$.) For a wide class of short- and long-range potentials (for example in the cases $V\in L^1(0,\infty)$, or $V$ of bounded variation with $V\to 0$ at infinity), it is indeed the case that a solution $h(\cdot,\lambda)$ of the Riccati equation exists with $\lim_{x\to\infty}h(x,\lambda) = i\sqrt{\lambda}$, for any $\lambda>0$. Such solutions can then be used to determine the boundary values of the m-function and related spectral behaviour. It should, however, be noted that our results are extremely general, and apply to a far wider class of potentials, and under much weaker assumptions which do not require the convergence of $h(x,\lambda)$ as $x\to\infty$. The particular notion which we are able to isolate, and which seems to govern all spectral behaviour on the absolutely continuous part of the spectrum, is that of (recurrent) clustering. Roughly, a family $h(\cdot,\lambda)$ of solutions of the Riccati equation is said to be δ-clustering, for $\lambda$ in some set $S$, provided recurrently at a sequence of points $x = x_1,x_2,x_3,\dots$, with $x_j\to\infty$, solutions as $\lambda$ varies over $S$ are within distance of order $\delta$ of each other. The family is said to be clustering if $\delta$ can be made arbitrarily small. Precise definitions of these two concepts are given in Section 5. The main results of the paper, stated below as Theorem 1, provide estimates to order $\delta$ of the boundary value of the m-function, spectral density function, and spectral measure of a set, based on the hypothesis that a given family of solutions of the Riccati equation is δ-clustering, and imply that the only solutions having the clustering property must be subject to initial conditions $h(0,\lambda) = m_+(\lambda)$. All of these results are extended to arbitrary values of the boundary condition parameter $\alpha$. As a consequence, one can use the behaviour of a family of solutions of the Riccati equation, for $\lambda$ in some set $S$ and at an increasing sequence of points $\{x_j\}$, to derive precise bounds on the $m_+$-function, for $\lambda\in S$. It appears to us that the theoretical framework which leads to the derivation of such bounds provides a viable basis for a possible numerical approach to spectral analysis, in which results of the kind described here are coupled with ideas of interval analysis. Quite apart from such developments, we believe that the characterisation of the boundary values of
the m-function in terms of cluster properties of families of solutions of Riccati-type equations is of theoretical interest. The paper is organised as follows: Section 2 begins with an introduction to the general properties of Herglotz functions. Any Herglotz function has a unique representation ([27]) in terms of a corresponding measure ν on the Borel subsets of R. Just as is the case for Schrödinger operators, the spectral analysis of Herglotz functions is best carried out with a family {Fy }(y ∈ R) of Herglotz functions, rather than a single function. Given any Herglotz function F , one can define for almost all λ ∈ R a Cauchy measure ω(λ, ·). Given any Borel subset S of R, π ω(λ, S) is almost everywhere equal to the angle subtended by the set S at the boundary value F+ (λ) of F at λ. Thus the ω-measure carries important information relating to the boundary behaviour of the Herglotz function and has the added advantages, as compared to the measure ν, of both being a finite measure and of behaving well under various limiting operations. For the general background to the ω-measure and families of Herglotz functions, see [24]. In order to make full use of the ω-measure in spectral analysis, it is necessary to transfer between estimates of angles subtended by sets S at points w in the upper half-plane and estimates of the location of the points w themselves. Section 2 concludes with a general lemma which provides the necessary theoretical basis for doing this. In Section 3, we define the family of m-functions mα , as well as a related family of m-functions for the differential operator −d2 /dx 2 + V (x) acting in L2 of a finite interval [0, N] (see also [2, 25]). 
In this latter case, it is necessary to allow for complex boundary conditions at x = N, which may even be λ-dependent, and under these general conditions we prove in Lemma 2 a number of formulae relating averages over the parameter α of the spectral measures with integrals over λ of the corresponding ω-measures. These formulae allow us, by taking the limit N → ∞, to relate the ω-measures for differential operators acting in L2 (0, ∞) to ω-measures for operators in L2 (0, N) for N finite. The main idea of the proof of Lemma 2 is to make a change of variables between the parameter y for a general Herglotz family and the parameter α for a family of m-functions, and use a general spectral averaging formula for Herglotz functions to be found in [24]. Section 4 of the paper is concerned mainly with the Riccati equation and its solutions. Here we deal principally with the Riccati equation most appropriate to the case α = 0 for which the differential operator is subject to Dirichlet boundary condition. In Lemma 3 we derive some estimates which are used in the sequel and which relate to the question of stability with respect to changes in initial condition. In Section 5, we give precise definitions of the notions of δ-clustering and of clustering for families of solutions {h(·, λ)} of the Riccati equation, and illustrate these definitions with reference to some of the standard classes of potentials (L1 , bounded variation, and periodic). This is followed by the main theorem of the paper, which shows how the hypothesis of δ-clustering leads to estimates of the
boundary values of the m-function, the spectral density, the spectral measure and the ω-measure for Schrödinger operators. A corollary to Theorem 1 provides a characterisation of clustering families of solutions in terms of $m_+(\lambda)$, and these results are then extended to general (real) boundary conditions at $x = 0$.

2. Herglotz Functions

Given a Herglotz function $F$ (analytic in the upper half-plane with positive imaginary part), we have the representation ([27])
$$F(z) = a+bz+\int_{-\infty}^{\infty}\Bigl(\frac{1}{t-z}-\frac{t}{t^2+1}\Bigr)\,d\nu(t)\qquad(\operatorname{Im}z>0). \tag{2}$$
Here $a = \operatorname{Re}F(i)$, $b = \lim_{s\to+\infty}s^{-1}\operatorname{Im}F(is)$, and $\nu = d\nu(t)$ is the uniquely determined spectral measure corresponding to $F$. In terms of a real parameter $y$, we can define a one-parameter family of Herglotz functions $\{F_y(\cdot)\}$ by
$$F_y(z) = \frac{F(z)}{1-yF(z)}\qquad(\operatorname{Im}z>0). \tag{3}$$
We denote respectively by $a_y$, $b_y$ and $\nu_y$ the constants $a$, $b$ and the measure $\nu$ for the function $F_y$. For any $w$ in the upper half-plane, we can associate, as in [24], a Cauchy measure $|\cdot|_w$ by
$$|A|_w = \frac{1}{\pi}\int_A\frac{\operatorname{Im}w}{|t-w|^2}\,dt,$$
for any Borel subset $A$ of $\mathbb{R}$. Then $\pi|A|_w$ is the angle subtended at the point $w$ by the subset $A$ of $\mathbb{R}$. For almost all $\lambda\in\mathbb{R}$, we can define, for the Herglotz function $F$, a Cauchy measure $\omega(\lambda,\cdot)$ at $\lambda$ by
$$\omega(\lambda,S) = \lim_{\varepsilon\to 0+}|S|_{F(\lambda+i\varepsilon)}. \tag{4}$$
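$F$ has a boundary value $F_+(\lambda) = \lim_{\varepsilon \to 0+} F(\lambda + i\varepsilon)$ for almost all $\lambda \in \mathbb{R}$. The decomposition of the measure $\nu$ into its singular and absolutely continuous components is determined by the boundary behaviour of $F(\lambda + i\varepsilon)$; thus (see, for example, [28])
\[ \nu_s = \nu \restriction \Big\{ \lambda \in \mathbb{R} : \lim_{\varepsilon \to 0+} \operatorname{Im} F(\lambda + i\varepsilon) = \infty \Big\}; \]
\[ \nu_{a.c.} = \nu \restriction \Big\{ \lambda \in \mathbb{R} : \lim_{\varepsilon \to 0+} \operatorname{Im} F(\lambda + i\varepsilon) \text{ exists finitely} \Big\}. \]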
STABILITY CRITERIA FOR THE WEYL m-FUNCTION
For a given measurable subset $S$ of $\mathbb{R}$, we have $\omega(\lambda, S) = |S|_{F_+(\lambda)}$ for all $\lambda$ at which $F_+(\lambda)$ exists with $\operatorname{Im} F_+(\lambda) > 0$, and
\[ \omega(\lambda, S) = \begin{cases} 1 & (F_+(\lambda) \in S), \\ 0 & (F_+(\lambda) \notin S) \end{cases} \]
for almost all $\lambda$ at which $F_+(\lambda)$ exists with $\operatorname{Im} F_+(\lambda) = 0$. The following integral identity relates the ω-measure $\omega(\lambda, \cdot)$ for a given Herglotz function $F$ to the corresponding one-parameter family $\{\nu_y\}$ of measures:
\[ \int_S y^{-2} \nu_{y^{-1}}(A) \, dy = \int_A \omega(t, S) \, dt, \tag{5} \]
where $A$, $S$ are arbitrary measurable subsets of $\mathbb{R}$. (For a proof see [24].)

The ω-measure for a given Herglotz function $F$ may be used to investigate the boundary values $F_+(\lambda)$ of $F$. The following application of this idea will be developed in this paper and used to study the boundary behaviour of the Weyl–Titchmarsh m-function for a differential operator: Suppose $F_+(\lambda_0)$ exists at some $\lambda_0 \in \mathbb{R}$. Then Theorem 3 of [23] implies the result
\[ \lim_{\delta \to 0+} \frac{1}{2\delta} \int_{\lambda_0 - \delta}^{\lambda_0 + \delta} \omega(t, S) \, dt = \omega(\lambda_0, S), \tag{6} \]
for any measurable subset $S$ of $\mathbb{R}$. (In [24], this is stated under the hypothesis that $F_+(\lambda_0)$ is real, but the proof easily extends to the general case.) We shall apply this result, using a limiting argument together with detailed bounds for solutions of the appropriate differential equations, to estimate the ω-measure on the right hand side of (6) for general measurable sets $S$, often taken for convenience to be intervals. Since $\pi \omega(\lambda_0, S)$ is just the angle subtended by the set $S$ at the point $F_+(\lambda_0)$, this will lead to an estimate for the value of $F_+(\lambda_0)$, the boundary value of the Herglotz function/m-function.

The viability of such an approach to boundary values of Herglotz functions depends on the following fundamental question: What are the implications for the value of $\omega(\lambda_0, S)$ of a given estimate of the boundary value $F_+(\lambda_0)$, and conversely what consequences for the value of $F_+(\lambda_0)$ follow from detailed bounds on $\omega(\lambda_0, S)$ as $S$ is varied? An answer to this question relies on a study of the relationship between the location of points in the upper half-plane and estimates for the corresponding angles subtended by subsets of $\mathbb{R}$, and is provided by the following lemma. Observe in this connection and in the later analysis presented in this paper that an appropriate measure of the separation between two points $w_1$, $w_2$ in the upper half-plane is provided by $|w_1 - w_2| / \operatorname{Im} w_2$ rather than by $|w_1 - w_2|$. The proof is given in the Appendix.

LEMMA 1. Let $w_1$, $w_2$ be two complex numbers and denote by $\theta_1(S)$, $\theta_2(S)$ the angles subtended by a given measurable subset $S$ of $\mathbb{R}$ at $w_1$ and $w_2$, respectively. Then, for any $\delta$ with $0 < \delta < 1$,
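(i) $|w_1 - w_2| \le \delta \operatorname{Im} w_2 \;\Rightarrow\; |\theta_1(S) - \theta_2(S)| \le \delta^* \theta_2(S)$ for all $S \subseteq \mathbb{R}$, where $\delta^* = \delta/(1 - \delta)$.
(ii) $|\theta_1(S) - \theta_2(S)| \le \delta \theta_2(S)$ for all $S \subseteq \mathbb{R}$ $\;\Rightarrow\; |w_1 - w_2| \le \delta \operatorname{Im} w_2$. [This implication requires only $\delta > 0$ rather than $0 < \delta < 1$.]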
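Part (i) of Lemma 1 can be spot-checked numerically when $S$ is an interval. The following is only a sketch under stated assumptions, not part of the paper: the helper names (`angle_subtended`, `worst_ratio`) and the sampling ranges are illustrative choices, and a random search of this kind can support the inequality but cannot prove it.

```python
import math
import random


def angle_subtended(w: complex, a: float, b: float) -> float:
    """Angle subtended at w (Im w > 0) by the real interval [a, b].

    This equals pi * |[a, b]|_w, with |.|_w the Cauchy measure of Section 2.
    """
    u, v = w.real, w.imag
    return math.atan((b - u) / v) - math.atan((a - u) / v)


def worst_ratio(delta: float, trials: int = 2000, seed: int = 0) -> float:
    """Largest observed |theta1 - theta2| / theta2 over random samples
    satisfying the hypothesis |w1 - w2| <= delta * Im w2 of Lemma 1(i)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        w2 = complex(rng.uniform(-5.0, 5.0), rng.uniform(0.1, 5.0))
        radius = delta * w2.imag * rng.random()
        phi = rng.uniform(0.0, 2.0 * math.pi)
        w1 = w2 + radius * complex(math.cos(phi), math.sin(phi))
        a = rng.uniform(-10.0, 10.0)
        b = a + rng.uniform(0.01, 10.0)
        theta1 = angle_subtended(w1, a, b)
        theta2 = angle_subtended(w2, a, b)
        worst = max(worst, abs(theta1 - theta2) / theta2)
    return worst


if __name__ == "__main__":
    delta = 0.3
    print("worst observed ratio:", worst_ratio(delta))
    print("bound delta/(1-delta):", delta / (1.0 - delta))
```

The bound $\delta/(1 - \delta)$ is approached when $w_1$ lies directly below $w_2$ and $S$ is a short interval beneath both points, which is why random samples typically stay well inside it.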
3. m-Functions and their Properties

We consider the differential operator $T_\alpha = -d^2/dx^2 + V(x)$, acting in $L^2(0, \infty)$, subject to the boundary condition
\[ (\cos \alpha)\varphi + (\sin \alpha) \frac{d\varphi}{dx} = 0 \quad \text{at } x = 0. \tag{7} \]
Here $V$ is assumed real and locally integrable, with no further conditions imposed on the behaviour of $V$ at large distances, apart from the requirement that we are in the limit-point case at infinity. Associated with the differential expression $-d^2/dx^2 + V(x)$ is the differential equation
\[ -\frac{d^2 f(x, z)}{dx^2} + V(x) f(x, z) = z f(x, z), \tag{8} \]
where $z$ is a complex spectral parameter; we take for convenience $\operatorname{Im} z > 0$. In the case of real spectral parameter $\lambda$ we write the differential equation
\[ -\frac{d^2 f(x, \lambda)}{dx^2} + V(x) f(x, \lambda) = \lambda f(x, \lambda). \tag{8$'$} \]
Solutions $u_\alpha(\cdot, z)$, $v_\alpha(\cdot, z)$ of (8), and correspondingly $u_\alpha(\cdot, \lambda)$, $v_\alpha(\cdot, \lambda)$ of (8′), are defined subject to the initial conditions $u_\alpha(0, z) = \cos \alpha$, $v_\alpha(0, z) = -\sin \alpha$, $u_\alpha'(0, z) = \sin \alpha$, $v_\alpha'(0, z) = \cos \alpha$. The so-called Weyl–Titchmarsh m-function $m_\alpha(z)$ for the differential operator $T_\alpha$ is then uniquely defined, for $\operatorname{Im} z > 0$, by the condition that
\[ u_\alpha(\cdot, z) + m_\alpha(z) v_\alpha(\cdot, z) \in L^2(0, \infty). \tag{9} \]
In the case $\alpha = 0$, we shall write simply $u$ and $v$ for $u_0$, $v_0$ and $m(z)$ for $m_0(z)$; we then have $m(z) = f'(0, z)/f(0, z)$, where $f(\cdot, z)$ is any nontrivial solution of (8) for which $f(\cdot, z) \in L^2(0, \infty)$. The m-function $m_\alpha$ is a Herglotz function (i.e. $m_\alpha$ is analytic with positive imaginary part in the upper half-plane) having the dependence on $\alpha$ given by
\[ m_\alpha(z) = \frac{(\cos \alpha) m(z) - (\sin \alpha)}{(\cos \alpha) + (\sin \alpha) m(z)}. \]
We shall denote by $\mu_\alpha$ the spectral measure defined in terms of the Herglotz representation for $m_\alpha$ (cf. (2)).

We shall also need to consider the Herglotz function for the differential operator $-d^2/dx^2 + V(x)$ defined in $L^2(0, N)$, with boundary conditions at each end of the finite interval $[0, N]$. Such m-functions have often been considered (see for example [2, 25]), but here we deviate slightly from usual practice in imposing a complex boundary condition at the right-hand endpoint $x = N$. Thus, for any Herglotz function $\eta(\cdot)$, define the m-function $m^N_{\alpha,\eta}$ by the condition that
\[ f_\alpha(x, z) \equiv u_\alpha(x, z) + m^N_{\alpha,\eta}(z) v_\alpha(x, z) \tag{10} \]
satisfy at $x = N$ the condition $f_\alpha'(N, z) = \eta(z) f_\alpha(N, z)$ for $\operatorname{Im} z > 0$. Considering first the case $\alpha = 0$, and using the initial conditions for $u$ and $v$, we see that $m^N_{0,\eta}(z) = f'(0, z)/f(0, z)$, where $f(\cdot, z)$ is any (nontrivial) solution of the differential equation (8), subject to the prescribed condition at $x = N$. Since $\frac{d}{dx} \operatorname{Im}(f' \bar{f}) = -(\operatorname{Im} z)|f|^2$ by the standard Lagrange identity ([2]), we have, on integrating with respect to $x$ from $0$ to $N$ and using the condition that $\operatorname{Im}(f'(N, z) \bar{f}(N, z)) = (\operatorname{Im} \eta(z)) |f(N, z)|^2 > 0$, the result that $\operatorname{Im}(f'(0, z) \bar{f}(0, z)) > 0$. Hence $f$ cannot be zero at $x = 0$, and also
\[ \operatorname{Im} m^N_{0,\eta}(z) = \frac{\operatorname{Im}(f'(0, z) \bar{f}(0, z))}{|f(0, z)|^2} > 0. \]
It follows that $m^N_{0,\eta}$ is a Herglotz function. For $\alpha \ne 0$, the solution $f_\alpha$ used to determine $m^N_{\alpha,\eta}(z)$ must be a constant multiple of the solution $f$ used for $\alpha = 0$; hence
\[ m^N_{0,\eta}(z) = \frac{f'(0, z)}{f(0, z)} = \frac{f_\alpha'(0, z)}{f_\alpha(0, z)}, \]
which on substituting for $f_\alpha$, $f_\alpha'$ and using the initial conditions for $u_\alpha$, $v_\alpha$, implies
\[ m^N_{0,\eta}(z) = \frac{(\sin \alpha) + (\cos \alpha) m^N_{\alpha,\eta}(z)}{(\cos \alpha) - (\sin \alpha) m^N_{\alpha,\eta}(z)}. \tag{11} \]
Hence $m^N_{\alpha,\eta}(z)$ has the same $\alpha$ dependence,
\[ m^N_{\alpha,\eta}(z) = \frac{(\cos \alpha) m^N_{0,\eta}(z) - (\sin \alpha)}{(\cos \alpha) + (\sin \alpha) m^N_{0,\eta}(z)}, \tag{12} \]
as in the case of $m_\alpha(z)$. It follows easily that $m^N_{\alpha,\eta}$ is a Herglotz function for general $\alpha$. On combining Equation (11), with $\alpha$ replaced by $\beta$, and Equation (12), we have the equation
\[ m^N_{\alpha,\eta}(z) = \frac{(\cos(\alpha - \beta)) m^N_{\beta,\eta}(z) - \sin(\alpha - \beta)}{(\cos(\alpha - \beta)) + (\sin(\alpha - \beta)) m^N_{\beta,\eta}(z)}, \tag{13} \]
which relates these functions for different values of $\alpha$ and $\beta$. Using the Wronskian identity $u_\alpha v_\alpha' - u_\alpha' v_\alpha = 1$ at $x = N$, we may verify the explicit expression for the m-function
\[ m^N_{\alpha,\eta}(z) = \frac{u_\alpha(N, z) \eta(z) - u_\alpha'(N, z)}{-v_\alpha(N, z) \eta(z) + v_\alpha'(N, z)}. \tag{14} \]
This may be verified by checking the boundary condition at $x = N$ for the function $f_\alpha$ defined by (10), making use of the Wronskian identity.

Here we shall mainly be concerned with the special case in which $\eta$ is a constant function having positive imaginary part. More generally, we assume that the boundary value $\eta_+(\lambda) \equiv \lim_{\varepsilon \to 0+} \eta(\lambda + i\varepsilon)$ satisfies $\operatorname{Im} \eta_+(\lambda) > 0$ for almost all $\lambda \in \mathbb{R}$. In that case, for almost all $\lambda \in \mathbb{R}$ the function $m^N_{\alpha,\eta}(z)$ also has boundary value $m^{N+}_{\alpha,\eta}(\lambda)$ having strictly positive imaginary part; we have, in fact,
\[ \operatorname{Im} m^{N+}_{\alpha,\eta}(\lambda) = \frac{\operatorname{Im} \eta_+(\lambda)}{|v_\alpha(N, \lambda) \eta_+(\lambda) - v_\alpha'(N, \lambda)|^2}. \]
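The explicit expression (14) and the $\alpha$ dependence (12) can be illustrated in the simplest case $V \equiv 0$, where the standard solutions are known in closed form. The sketch below is an illustration under that assumption only; the function names and the sample values of $z$, $\eta$, $N$ and $\alpha$ are hypothetical choices and do not come from the paper.

```python
import cmath
import math


def standard_solutions(alpha: float, x: float, z: complex):
    """Return (u, u', v, v') at x for V = 0, where u = u_alpha and v = v_alpha
    satisfy u(0) = cos(alpha), u'(0) = sin(alpha),
            v(0) = -sin(alpha), v'(0) = cos(alpha)."""
    k = cmath.sqrt(z)
    u0, du0 = cmath.cos(k * x), -k * cmath.sin(k * x)   # u0(0) = 1, u0'(0) = 0
    v0, dv0 = cmath.sin(k * x) / k, cmath.cos(k * x)    # v0(0) = 0, v0'(0) = 1
    ca, sa = math.cos(alpha), math.sin(alpha)
    return (ca * u0 + sa * v0, ca * du0 + sa * dv0,
            -sa * u0 + ca * v0, -sa * du0 + ca * dv0)


def m_finite(alpha: float, N: float, z: complex, eta: complex) -> complex:
    """m-function on [0, N] with f'(N) = eta * f(N), following formula (14)."""
    u, du, v, dv = standard_solutions(alpha, N, z)
    return (u * eta - du) / (-v * eta + dv)


def m_from_alpha_zero(alpha: float, N: float, z: complex, eta: complex) -> complex:
    """The same quantity obtained from the alpha = 0 value through relation (12)."""
    m0 = m_finite(0.0, N, z, eta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return (ca * m0 - sa) / (ca + sa * m0)


if __name__ == "__main__":
    z, eta = 2.0 + 0.5j, 0.3 + 1.0j
    print(m_finite(0.7, 3.0, z, eta), m_from_alpha_zero(0.7, 3.0, z, eta))
    # For V = 0 the half-line m-function is i*sqrt(z); compare at large N.
    print(m_finite(0.0, 40.0, z, eta), 1j * cmath.sqrt(z))
```

For $V \equiv 0$ the $L^2(0, \infty)$ solution is $e^{i\sqrt{z}\,x}$, so the second comparison shows the finite-interval value approaching the half-line m-function $i\sqrt{z}$ as $N$ grows.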
An alternative characterisation of $m^{N+}_{\alpha,\eta}(\lambda)$ is by the condition that $f_\alpha(x, \lambda) \equiv u_\alpha(x, \lambda) + m^{N+}_{\alpha,\eta}(\lambda) v_\alpha(x, \lambda)$ satisfy at $x = N$ the λ-dependent complex boundary condition $f_\alpha'(N, \lambda) = \eta_+(\lambda) f_\alpha(N, \lambda)$. The following lemma extends to the case of complex boundary condition results already known ([25]) for real boundary condition, and will be the basis for the estimates which we shall carry out in Section 5.

LEMMA 2. Let $\mu^N_{\alpha,\eta}$ denote the spectral measure for the Herglotz function $m^N_{\alpha,\eta}$, and let $\omega^N_{\alpha,\eta}(\lambda, \cdot)$ denote the ω-measure for this Herglotz function, defined as in Equation (4). (Thus, for $S \subseteq \mathbb{R}$, $\pi \omega^N_{\alpha,\eta}(\lambda, S)$ is the angle subtended by the set $S$ at the point $m^{N+}_{\alpha,\eta}(\lambda)$.) Let $\mu_\alpha$ and $\omega_\alpha(\lambda, \cdot)$ denote respectively the spectral measure and ω-measure for $m_\alpha$, where $m_\alpha$ is the m-function for the differential operator $T_\alpha = -d^2/dx^2 + V(x)$ in $L^2(0, \infty)$ with boundary condition (7) at $x = 0$. Then for any Lebesgue measurable subsets $A$, $S$ of $\mathbb{R}$,
(i) \[ \int_S (1 + y^2)^{-1} \mu^N_{-\cot^{-1} y,\eta}(A) \, dy = \int_A \omega^N_{0,\eta}(t, S) \, dt, \]
(ii) \[ \lim_{N \to \infty} \int_S (1 + y^2)^{-1} \mu^N_{-\cot^{-1} y,\eta}(A) \, dy = \lim_{N \to \infty} \int_A \omega^N_{0,\eta}(t, S) \, dt = \int_A \omega_0(t, S) \, dt. \]
Proof. We start from the identity (5), which holds for arbitrary Herglotz functions $F$, and take the special case $F(z) = m^N_{0,\eta}(z)$. Equation (3) then implies
\[ F_{y^{-1}}(z) = \frac{y F(z)}{y - F(z)} = \frac{y \, m^N_{0,\eta}(z)}{y - m^N_{0,\eta}(z)}, \]
which on substituting $y = -\cot \alpha$ becomes
\[ F_{y^{-1}}(z) = \frac{(\cos \alpha) m^N_{0,\eta}(z)}{(\cos \alpha) + (\sin \alpha) m^N_{0,\eta}(z)}. \]
However, from the dependence (12) of $m^N_{\alpha,\eta}(z)$ on $\alpha$, we may verify that $F_{y^{-1}}(z) = (\cos^2 \alpha) m^N_{\alpha,\eta}(z) + \sin \alpha \cos \alpha$. Since, with $y = -\cot \alpha$, the Herglotz functions $F_{y^{-1}}$ and $(\cos^2 \alpha) m^N_{\alpha,\eta}$ differ by a constant, they must have the same spectral measures, so that $\nu_{y^{-1}} = (\cos^2 \alpha) \mu^N_{\alpha,\eta}$. With $y = -\cot \alpha$, we have $y^{-2} \cos^2 \alpha = \sin^2 \alpha = (1 + y^2)^{-1}$, so that
\[ \int_S y^{-2} \nu_{y^{-1}}(A) \, dy = \int_S (1 + y^2)^{-1} \mu^N_{-\cot^{-1} y,\eta}(A) \, dy, \]
implying (i) of the lemma.

The proof of (ii) follows closely the arguments of [25, p. 4074]. First fix $z$ in the upper half-plane. For this $z$, the value of $\eta(z)$ determines the boundary condition at $x = N$ for the function $f_\alpha$ in (10). Standard limit point/limit circle theory, with $\eta$ replaced by a real valued function, shows that in that case the set of points $m^N_{\alpha,\eta}(z)$ lie on a circle $C_{\alpha,N}(z)$ in the upper half-plane, the circle depending on the values of $\alpha$ and $N$, as well as on $z$. In our case, with $\eta$ a Herglotz function and hence $\operatorname{Im} \eta > 0$, a minor modification of this theory implies that these points lie in the open disc enclosed by $C_{\alpha,N}(z)$. For $N_1 > N_2$, the $N = N_1$ disc is contained in the $N = N_2$ disc. As $N \to \infty$ the disc shrinks to a single point, which is the point $m_\alpha(z)$. One may verify that convergence of $m^N_{\alpha,\eta}(z)$ to $m_\alpha(z)$ is uniform in $z$ over compact subsets of the upper half-plane. This implies convergence of the corresponding spectral measures for finite intervals $A$, thus
\[ \lim_{N \to \infty} \mu^N_{\alpha,\eta}(A) = \mu_\alpha(A) \tag{15} \]
provided neither endpoint of $A$ is a discrete point of the measure $\mu_\alpha$. For a given finite interval $A$, an endpoint can be a discrete point of $\mu_\alpha$ for at most one value of $\alpha$. We also have positive upper and lower bounds for $|m^N_{0,\eta}(i)|$ which, on using the $\alpha$ dependence of $m^N_{\alpha,\eta}$ given by (12), may be made uniform in $\alpha$ for $|m^N_{\alpha,\eta}(i)|$. Using the Herglotz representation as in (2) for $m^N_{\alpha,\eta}$ yields a uniform estimate
\[ \operatorname{Im} m^N_{-\cot^{-1} y,\eta}(i) = \int_{-\infty}^{\infty} \frac{d\mu^N_{-\cot^{-1} y,\eta}(t)}{t^2 + 1} \le \text{const.}, \tag{16} \]
provided the m-function has no linear term in $z$ in its representation. The coefficient of the linear term for $m^N_{\alpha,\eta}(z)$ is given by $b^N_{\alpha,\eta} = \lim_{s \to +\infty} s^{-1} \operatorname{Im} m^N_{\alpha,\eta}(is)$. Hence from Equation (13) we see that if $b^N_{\beta,\eta} \ne 0$ for some $\beta$ then $b^N_{\alpha,\eta} = 0$ for all $\alpha \ne \beta$. It follows that the estimate (16) is uniform in $y$, except for at most one possible value of $y$, for each value of $N$. For a given finite interval $A$, this leads to a bound
\[ \mu^N_{-\cot^{-1} y,\eta}(A) \le \text{const.}, \tag{17} \]
holding except for at most one value of $y$ for each $N$. Using now (15) and (17), we may apply the Lebesgue dominated-convergence theorem to obtain, for any finite interval $A$ and measurable subset $S$ of $\mathbb{R}$,
\[ \lim_{N \to \infty} \int_S (1 + y^2)^{-1} \mu^N_{-\cot^{-1} y,\eta}(A) \, dy = \int_S (1 + y^2)^{-1} \mu_{-\cot^{-1} y}(A) \, dy. \tag{18} \]
The equation extends readily to general measurable subsets $A$ of $\mathbb{R}$, using countable additivity. To complete the proof of (ii) of the lemma, it remains only to use Equation (12) of [25], which states that
\[ \int_S (1 + y^2)^{-1} \mu_{-\cot^{-1} y}(A) \, dy = \int_A \omega_0(t, S) \, dt. \]
□
REMARK 1. In relation to the proof of (ii) of Lemma 2, we point out that in fact the estimate (16) holds uniformly for all values of $y$. To see this, one has to show that the Herglotz coefficient $b^N_{\alpha,\eta}$ mentioned in the proof is zero for all $\alpha$. That this is so is a consequence of (12) and of the following asymptotic formula:
\[ s^{-1/2} m^N_{0,\eta}(is) = (-1 + i)/\sqrt{2} + O\big(s^{-1/2}\big) \]
as $s \to +\infty$ through real values; the proof of this formula uses the analogue of Equation (171) of [1] for the solution $f(x, is)$ of the Schrödinger equation (8) with complex boundary condition $f'(N, is)/f(N, is) = \eta(is)$. (See also [29] for the special case in which $\eta$ is a real constant.)

The following Corollary extends the results of Lemma 2 to the function $m^N_{\beta,\eta}$ for general values of $\beta$.

COROLLARY 1. For any Lebesgue measurable subsets $A$, $S$ of $\mathbb{R}$, we have
(i) \[ \int_S (1 + y^2)^{-1} \mu^N_{(-\cot^{-1} y + \beta),\eta}(A) \, dy = \int_A \omega^N_{\beta,\eta}(t, S) \, dt, \]
(ii) \[ \lim_{N \to \infty} \int_S (1 + y^2)^{-1} \mu^N_{(-\cot^{-1} y + \beta),\eta}(A) \, dy = \lim_{N \to \infty} \int_A \omega^N_{\beta,\eta}(t, S) \, dt = \int_A \omega_\beta(t, S) \, dt. \]
Proof. Again we start from the identity (5), taking in this case $F(z) = m^N_{\beta,\eta}(z)$. We have, then,
\[ F_{y^{-1}}(z) = \frac{y \, m^N_{\beta,\eta}(z)}{y - m^N_{\beta,\eta}(z)}, \]
which on setting $y = -\cot \alpha$ becomes $(\cos \alpha) m^N_{\beta,\eta}(z) / ((\cos \alpha) + (\sin \alpha) m^N_{\beta,\eta}(z))$. Substituting
\[ m^N_{\beta,\eta}(z) = \frac{(\cos \beta) m^N_{0,\eta}(z) - (\sin \beta)}{(\cos \beta) + (\sin \beta) m^N_{0,\eta}(z)} \]
from (12), we find, with $y = -\cot \alpha$,
\[ F_{y^{-1}}(z) = \frac{(\cos \alpha \cos \beta) m^N_{0,\eta}(z) - (\cos \alpha \sin \beta)}{(\cos(\alpha + \beta)) + (\sin(\alpha + \beta)) m^N_{0,\eta}(z)}. \]
Noting that
\[ m^N_{\alpha+\beta,\eta}(z) = \frac{(\cos(\alpha + \beta)) m^N_{0,\eta}(z) - \sin(\alpha + \beta)}{(\cos(\alpha + \beta)) + (\sin(\alpha + \beta)) m^N_{0,\eta}(z)} \]
and using standard trigonometric identities, it is straightforward to verify that $F_{y^{-1}}(z) = (\cos^2 \alpha) m^N_{\alpha+\beta,\eta}(z) + \sin \alpha \cos \alpha$. As in the proof of the lemma, this leads to an analogous relation between measures, giving here $\nu_{y^{-1}} = (\cos^2 \alpha) \mu^N_{\alpha+\beta,\eta}$. The proofs of (i) and (ii) of the corollary now follow similar lines to those of the corresponding results of Lemma 2, of which they are a generalisation. □

REMARK 2. We shall actually need the results of Lemma 2 and Corollary 1 in slightly greater generality, such that the complex boundary condition defined by $\eta$ is allowed to depend on the value of $N$. It may be verified that the proofs of (i) and (ii) may in each case be carried through in this more general case, with the role of the Herglotz function $\eta$ taken by a family $\{\eta_N\}$ of Herglotz functions.
4. The Riccati Equation

Given any solution $f(\cdot, z)$ of the Schrödinger equation (8) at complex spectral parameter $z$, such that $f(x, z) \ne 0$ for $x > 0$, we can define a corresponding function $h(x, z) = f'(x, z)/f(x, z)$ such that $h(\cdot, z)$ satisfies the well-known Riccati differential equation
\[ \frac{dh(x, z)}{dx} = V(x) - z - (h(x, z))^2. \tag{19} \]
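For completeness, the step from the Schrödinger equation (8) to the Riccati equation (19) is the following short computation, written here in LaTeX as a worked sketch; it uses only the definition $h = f'/f$ and the form $f'' = (V - z) f$ of (8).

```latex
\[
  \frac{dh}{dx}
  = \frac{d}{dx}\!\left(\frac{f'}{f}\right)
  = \frac{f''}{f} - \left(\frac{f'}{f}\right)^{2}
  = \frac{(V(x) - z)\, f}{f} - h^{2}
  = V(x) - z - h^{2}.
\]
```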
We assume here $\operatorname{Im} z > 0$. Given the value of the solution $h$ at $x = N$, for some $N > 0$, with $\operatorname{Im} h(N, z) > 0$, the solution is well defined and has positive imaginary part for all $x$ in the interval $[0, N]$. We can construct the solution explicitly in terms of any solution $f(\cdot, z)$ of the Schrödinger equation subject to the condition $f'(N, z)/f(N, z) = h(N, z)$; we have, in that case, for $0 \le x \le N$,
\[ \operatorname{Im} h(x, z) = \frac{1}{|f(x, z)|^2} \left\{ |f(N, z)|^2 \operatorname{Im} h(N, z) + \operatorname{Im} z \int_x^N |f(t, z)|^2 \, dt \right\}, \]
from which the positivity of the imaginary part is easily seen.
On the other hand, for $x > N$, positivity of the imaginary part of the solution $h(x, z)$ will not necessarily be preserved. In fact it is well known (and indeed may be proved from the above identity for $\operatorname{Im} h(x, z)$), that the only solution of the Riccati equation such that $\operatorname{Im} h(x, z) > 0$ for all $x > 0$ is the solution subject to the initial condition $h(0, z) = m(z)$, where $m$ is the Weyl–Titchmarsh m-function. For all other initial conditions, the solution $h(x, z)$ will either diverge at some finite positive value of $x$, or will have negative imaginary part for all $x$ sufficiently large.

In the case of the Riccati equation at real spectral parameter $\lambda$,
\[ \frac{dh(x, \lambda)}{dx} = V(x) - \lambda - (h(x, \lambda))^2, \tag{19$'$} \]
the situation is completely different. Here every solution subject to an initial condition satisfying $\operatorname{Im} h(0, \lambda) > 0$ will be well defined and have positive imaginary part for all $x > 0$. An explicit expression for the solution of the Riccati equation, subject to such an initial condition, is $h(x, \lambda) = f'(x, \lambda)/f(x, \lambda)$, where $f(x, \lambda)$ is given in terms of the standard solutions $u$, $v$ ($\equiv u_0, v_0$) of the Schrödinger equation defined in Section 3, by $f(x, \lambda) = u(x, \lambda) + h(0, \lambda) v(x, \lambda)$. From the Wronskian identity, we then have
\[ \operatorname{Im} h(x, \lambda) = \frac{\operatorname{Im} h(0, \lambda)}{|f(x, \lambda)|^2}, \]
exhibiting clearly the positivity of the imaginary part of the solution.

In this paper, we are particularly interested in the case in which the differential operator $T_\alpha = -d^2/dx^2 + V(x)$ has absolutely continuous spectrum, though not necessarily purely absolutely continuous spectrum. A support of the absolutely continuous part of $\mu_\alpha$ can be defined as the set of all $\lambda \in \mathbb{R}$ at which the boundary value $m_{\alpha+}(\lambda) \equiv \lim_{\varepsilon \to 0+} m_\alpha(\lambda + i\varepsilon)$ exists with strictly positive imaginary part. Equation (13) then implies that this set is in fact independent of $\alpha$. For $\lambda$ belonging to this set, a particularly significant solution of the Riccati equation is that solution which we shall denote by $h_+(x, \lambda)$, which satisfies the initial condition $h_+(0, \lambda) = m_+(\lambda) \equiv m_{0+}(\lambda)$. Thus
\[ h_+(x, \lambda) = \frac{u'(x, \lambda) + m_+(\lambda) v'(x, \lambda)}{u(x, \lambda) + m_+(\lambda) v(x, \lambda)} = \frac{f_+'(x, \lambda)}{f_+(x, \lambda)}, \]
where $f_+(x, \lambda)$ is the boundary value at $\lambda$ of the $L^2$ solution $f(x, z) = u(x, z) + m(z) v(x, z)$ of the Schrödinger equation at complex spectral parameter $z$. Just as $m_+(\lambda) = f_+'(0, \lambda)/f_+(0, \lambda)$, so
\[ m_+(N, \lambda) = h_+(N, \lambda) = \frac{f_+'(N, \lambda)}{f_+(N, \lambda)} \]
is the boundary value at $\lambda$ of the m-function $m(N, z)$ for the differential operator $-d^2/dx^2 + V(x)$, acting in $L^2(N, \infty)$ with Dirichlet boundary condition at $x = N$. Hence the single solution $h_+(x, \lambda)$, with the appropriate initial condition $h_+(0, \lambda) = m_+(\lambda)$, determines the boundary value of the m-function for the Dirichlet Hamiltonian in all intervals $[N, \infty)$ for $N > 0$.

The main purpose of this paper will be to identify criteria which will characterise the particular solution $h_+(x, \lambda)$ of the Riccati equation at real spectral parameter $\lambda$ which determines this family of m-functions (and hence also their related spectral measures, densities, and so on). Such criteria are to be found in an analysis of the clustering properties of solutions of the Riccati equation as the spectral parameter $\lambda$ is varied. As a preliminary to this analysis, to be carried out in the next section, the following result deals with a different but related question, that of stability with respect to changes in initial condition. As in Section 2, we estimate the separation of two complex numbers $w_1$, $w_2$ through a comparison of $|w_1 - w_2|$ with $\operatorname{Im} w_2$ (or $\operatorname{Im} w_1$) rather than a bound on the magnitude of $|w_1 - w_2|$.

LEMMA 3. Let $h_1(\cdot, \lambda)$, $h_2(\cdot, \lambda)$ be two solutions of the Riccati equation (19′), at real spectral parameter $\lambda$, over an interval $[0, N]$. Then, for any $\delta$ in the interval $0 < \delta < 1$,
\[ |h_1(0, \lambda) - h_2(0, \lambda)| \le \delta \operatorname{Im} h_2(0, \lambda) \;\Rightarrow\; |h_1(N, \lambda) - h_2(N, \lambda)| \le \frac{\delta}{1 - \delta} \operatorname{Im} h_2(N, \lambda). \]
Moreover,
\[ |h_1(N, \lambda) - h_2(N, \lambda)| \le \delta \operatorname{Im} h_2(N, \lambda) \;\Rightarrow\; |h_1(0, \lambda) - h_2(0, \lambda)| \le \frac{\delta}{1 - \delta} \operatorname{Im} h_2(0, \lambda). \]
Proof. We have already written down the solution of the Riccati equation subject to a given initial condition, from which we have, for any solution $h(\cdot, \lambda)$ over the interval,
\[ h(N, \lambda) = \frac{u'(N, \lambda) + h(0, \lambda) v'(N, \lambda)}{u(N, \lambda) + h(0, \lambda) v(N, \lambda)}. \]
For fixed $N$ and $\lambda$, the transformation which takes $h(0, \lambda)$ into $h(N, \lambda)$ is a so-called Möbius or fractional linear transformation of the upper half-plane, of the form $w \to (aw + b)/(cw + d)$, where $a$, $b$, $c$, $d$ are real and the Wronskian identity implies $ad - bc = 1$. For the properties of Möbius transformations, see for example [30]. For such a transformation, suppose that $w_1 \to \xi_1$ and $w_2 \to \xi_2$, where $|w_1 - w_2| \le \delta \operatorname{Im} w_2$, with $0 < \delta < 1$. Then
\[ |\xi_1 - \xi_2| = \left| \frac{a w_1 + b}{c w_1 + d} - \frac{a w_2 + b}{c w_2 + d} \right| = \frac{|w_1 - w_2|}{|c w_1 + d| \, |c w_2 + d|}, \]
and $\operatorname{Im} \xi_2 = \operatorname{Im} w_2 / |c w_2 + d|^2$. Hence
\[ \frac{|\xi_1 - \xi_2|}{\operatorname{Im} \xi_2} = \left| \frac{c w_2 + d}{c w_1 + d} \right| \frac{|w_1 - w_2|}{\operatorname{Im} w_2} \le \delta \left| \frac{c w_2 + d}{c w_1 + d} \right|. \]
However,
\[ \left| \frac{c w_1 + d}{c w_2 + d} \right| = \left| 1 + \frac{c (w_1 - w_2)}{c w_2 + d} \right| \ge 1 - \frac{|w_1 - w_2|}{|w_2 + d/c|} \ge 1 - \frac{\delta \operatorname{Im} w_2}{|w_2 + d/c|} \ge 1 - \delta, \]
provided $c \ne 0$. Hence
\[ \left| \frac{c w_2 + d}{c w_1 + d} \right| \le \frac{1}{1 - \delta}, \]
the inequality holding trivially in the case $c = 0$, and we have
\[ \frac{|\xi_1 - \xi_2|}{\operatorname{Im} \xi_2} \le \frac{1}{1 - \delta} \, \frac{|w_1 - w_2|}{\operatorname{Im} w_2} \le \frac{\delta}{1 - \delta}. \]
The first implication of the lemma now follows by taking the appropriate Möbius transformation for the Riccati equation across the interval $[0, N]$, with $w_1 = h_1(0, \lambda)$, $w_2 = h_2(0, \lambda)$ and $\xi_1 = h_1(N, \lambda)$, $\xi_2 = h_2(N, \lambda)$. To prove the second implication, observe that the inverse transformation to $\xi = (aw + b)/(cw + d)$ is a transformation of the same form, given by $w = (d\xi - b)/(a - c\xi)$, and repeat the previous argument. □

REMARK 3. The multiplicative constants $\delta/(1 - \delta)$ in the inequalities of the lemma are optimal, and rely on the bound $|(w_2 + x)/(w_1 + x)| \le 1/(1 - \delta)$ for all $x \in \mathbb{R}$, whenever $|w_1 - w_2| \le \delta \operatorname{Im} w_2$.

REMARK 4. One could use $|w_1 - w_2| / \sqrt{(\operatorname{Im} w_1)(\operatorname{Im} w_2)}$ rather than, say, $|w_1 - w_2| / \operatorname{Im} w_2$, as a measure of separation for complex numbers $w_1$, $w_2$ throughout this paper. This has the advantage of being symmetric between $w_1$ and $w_2$, and also it follows from the conservation of cross ratios ([30]) that this quantity is conserved by Möbius transformations. Nevertheless, in our view these advantages are outweighed by the relative simplicity in the estimates of angle subtended and ω-measures which we shall derive in the following section, through the use of the $|w_1 - w_2| / \operatorname{Im} w_2$ estimate. In the current context such alternative estimates differ in any case only to order $\delta^2$.
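The Möbius estimate used in the proof of Lemma 3, and the invariance of the symmetric separation mentioned in Remark 4, can both be spot-checked numerically. This is a minimal sketch with hypothetical helper names and arbitrary sampling ranges; it illustrates the two statements on random real coefficients with $ad - bc = 1$ and is not a substitute for the proof.

```python
import math
import random


def mobius(w: complex, a: float, b: float, c: float, d: float) -> complex:
    """Fractional linear map w -> (a w + b) / (c w + d)."""
    return (a * w + b) / (c * w + d)


def random_real_coefficients(rng: random.Random):
    """Real a, b, c, d with a d - b c = 1 (up to rounding)."""
    a = rng.uniform(0.2, 3.0) * rng.choice([-1.0, 1.0])
    b = rng.uniform(-3.0, 3.0)
    c = rng.uniform(-3.0, 3.0)
    d = (1.0 + b * c) / a
    return a, b, c, d


def symmetric_separation(w1: complex, w2: complex) -> float:
    """The quantity |w1 - w2| / sqrt(Im w1 * Im w2) of Remark 4."""
    return abs(w1 - w2) / math.sqrt(w1.imag * w2.imag)


def worst_case(delta: float, trials: int = 2000, seed: int = 1):
    """Return (largest |xi1 - xi2| / Im xi2, largest change in the
    symmetric separation) over random samples with |w1 - w2| <= delta Im w2."""
    rng = random.Random(seed)
    worst_ratio, worst_drift = 0.0, 0.0
    for _ in range(trials):
        a, b, c, d = random_real_coefficients(rng)
        w2 = complex(rng.uniform(-3.0, 3.0), rng.uniform(0.1, 3.0))
        radius = delta * w2.imag * rng.random()
        phi = rng.uniform(0.0, 2.0 * math.pi)
        w1 = w2 + radius * complex(math.cos(phi), math.sin(phi))
        xi1, xi2 = mobius(w1, a, b, c, d), mobius(w2, a, b, c, d)
        worst_ratio = max(worst_ratio, abs(xi1 - xi2) / xi2.imag)
        drift = abs(symmetric_separation(xi1, xi2) - symmetric_separation(w1, w2))
        worst_drift = max(worst_drift, drift)
    return worst_ratio, worst_drift


if __name__ == "__main__":
    delta = 0.3
    ratio, drift = worst_case(delta)
    print("worst ratio:", ratio, "bound:", delta / (1.0 - delta))
    print("largest change in symmetric separation:", drift)
```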
5. δ-Clustering Solutions of the Riccati Equation

The following definitions make precise the notion that a solution $h(\cdot, \lambda)$ (strictly speaking, a family of solutions) of the Riccati equation may to order $\delta$ be
asymptotically independent of $\lambda$, for $\lambda$ belonging to suitable sets $E$, and for a sequence of large values of $x$. For such a solution, we use the term “recurrently δ-clustering”, abbreviated for convenience to “δ-clustering”. We shall also make use of the terms “point of density” and “approximate continuity”. (See [31, 32].) A real number $\lambda_0$ is said to be a point of density of a measurable set $E \subseteq \mathbb{R}$ provided $\lim_{K \to 0+} |E \cap [\lambda_0 - K, \lambda_0 + K]| / 2K = 1$, where $|\cdot|$ stands for Lebesgue measure. A measurable function $F$ from $\mathbb{R}$ to $\mathbb{R}$ is said to be approximately continuous at a point $\lambda_0$ of its domain if, for any $\delta > 0$, $\lambda_0$ is a point of density of the set of $\lambda$ for which $|F(\lambda) - F(\lambda_0)| < \delta$. Thus, “point of density” expresses the idea that ‘most’ points near $\lambda_0$ belong to a given set $E$, and “approximate continuity” expresses the idea that $F(\lambda)$ is close to $F(\lambda_0)$ for ‘most’ points $\lambda$ near $\lambda_0$. Given a measurable set $E$, almost all $\lambda \in E$ will be points of density of $E$, and given a measurable function $F$, almost all $\lambda \in \operatorname{domain}(F)$ will be points of approximate continuity. We are now ready to define the notion of δ-clustering.

DEFINITION 1. Let $E \subseteq \mathbb{R}$ be measurable and let $\delta$ be a positive constant. We say that a family of solutions $h(\cdot, \lambda)$ of the Riccati equation
\[ \frac{d}{dx} h(x, \lambda) = V(x) - \lambda - (h(x, \lambda))^2; \qquad x \in [0, \infty), \ \lambda \in E \]
is δ-clustering on $E$ if there exists a function $H : [0, \infty) \to \mathbb{C}$, with $\operatorname{Im} H > 0$, such that
\[ \liminf_{x \to \infty} \, \sup_{\lambda \in E} \frac{|h(x, \lambda) - H(x)|}{\operatorname{Im} H(x)} < \delta. \tag{20} \]
The family $\{h(\cdot, \lambda)\}$ is said to be clustering at some $\lambda_0 \in \mathbb{R}$ if there is a measurable subset $E$ of $\mathbb{R}$, with all $\lambda \in E$ points of density of $E$ and $h(0, \lambda)$ approximately continuous at all $\lambda \in E$ as a function of $\lambda$, and such that for any $\delta > 0$ an open interval $I$ containing $\lambda_0$ can be found, with the solution $h(\cdot, \lambda)$ δ-clustering on $E \cap I$. The solution is said to be clustering on $E$ if it is clustering at all $\lambda \in E$.

REMARK 5.
Given a set $E$ and correspondingly a solution $h(\cdot, \lambda)$ of the Riccati equation for each $\lambda \in E$, we may try to minimise the value of $\delta$ in (20), by choosing the value of $H(x)$ at each $x > 0$ to minimise the supremum of $|h(x, \lambda) - H(x)| / \operatorname{Im} H(x)$ as $\lambda$ is varied over $E$. This minimisation, though possible in principle, is not usually practical, and in practice, in verifying the δ-clustering property, it is simpler (though not optimal) to take for example $H(x) = h(x, \lambda_0)$ for some fixed $\lambda_0 \in E$.

REMARK 6. The property of a solution to be δ-clustering on a set $E$ will hold if and only if a function $H$ and an increasing sequence $\{\ell_1, \ell_2, \ell_3, \ldots\}$ can be found, with $\ell_j \to \infty$, such that, for all $j = 1, 2, 3, \ldots$
\[ |h(\ell_j, \lambda) - H(\ell_j)| < \delta \operatorname{Im} H(\ell_j) \]
for all $\lambda \in E$. On the other hand, if the δ-clustering property fails for $E$, then for any $\delta'$ in the interval $0 < \delta' < \delta$ we have $\sup_{\lambda \in E} |h(x, \lambda) - H(x)| / \operatorname{Im} H(x) > \delta'$ for all sufficiently large $x$ and for all choices of the function $H$. In particular, we then have, for any fixed $\lambda_0 \in E$, and for large enough $x$, $|h(x, \lambda) - h(x, \lambda_0)| > \delta' \operatorname{Im} h(x, \lambda_0)$ for some $\lambda \in E$.

REMARK 7. The clustering property at a point $\lambda_0$ implies that a family of solutions clusters arbitrarily closely (i.e. to order $\delta$ with $\delta$ arbitrarily small) for $\lambda$ sufficiently close to $\lambda_0$. Since the inequality $|h(\ell, \lambda) - H(\ell)| < \delta \operatorname{Im} H(\ell)$, for all $\lambda \in E$, implies also $|h(\ell, \lambda) - h(\ell, \lambda_0)| < (2\delta/(1 - \delta)) \operatorname{Im} h(\ell, \lambda_0)$, for all $\lambda, \lambda_0 \in E$, in considering the clustering property at a point $\lambda_0$ one may take without loss of generality $H(x) = h(x, \lambda_0)$, and this is often a convenient choice of the $H$ function.

Before proceeding to the main results of this paper, it may be helpful to consider briefly the application of the term “δ-clustering” to some simple classes of potentials. The simplest case is that in which $V \equiv 0$. The Riccati equation then takes the form $dh/dx = -\lambda - h^2$, which has the solution $h = i\sqrt{\lambda}$ for any $\lambda > 0$. Note that $i\sqrt{\lambda}$ is the boundary value $m_+(\lambda)$ of the m-function, and that this solution is a constant function for each $\lambda > 0$ because $-d^2/dx^2$ has the same m-function (with Dirichlet boundary condition) as a differential operator in $L^2([\ell, \infty))$ for any $\ell > 0$.

For all initial conditions other than $h(0, \lambda) = i\sqrt{\lambda}$, such that $\operatorname{Im} h(0, \lambda) > 0$, the orbit of the solution $h(x, \lambda)$ as $x$ is varied, for fixed $\lambda > 0$, is a circle $|h|^2 + \lambda = \text{const.} \cdot \operatorname{Im} h$, with the point $i\sqrt{\lambda}$ in its interior. In the case of zero potential, it is relatively straightforward to determine whether a given solution of the Riccati equation is δ-clustering or not, since exact solutions of the equation may be written down, for arbitrary initial conditions.
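The circular orbits described above for $V \equiv 0$ can be seen directly from the explicit solution $h = f'/f$ with $f = u + h(0, \lambda) v$ given in Section 4. The sketch below assumes $V \equiv 0$ and $\lambda > 0$; the function names and the sample initial value are illustrative and not taken from the paper.

```python
import cmath


def riccati_solution_free(x: float, lam: float, h0: complex) -> complex:
    """Solution h(x, lam) of dh/dx = -lam - h^2 (V = 0, lam > 0) with
    h(0, lam) = h0, written as f'/f for f = u + h0 * v, where
    u = cos(kx), v = sin(kx)/k and k = sqrt(lam)."""
    k = cmath.sqrt(lam)
    f = cmath.cos(k * x) + h0 * cmath.sin(k * x) / k
    df = -k * cmath.sin(k * x) + h0 * cmath.cos(k * x)
    return df / f


def orbit_invariant(h: complex, lam: float) -> float:
    """The quantity (|h|^2 + lam) / Im h, constant along orbits when V = 0."""
    return (abs(h) ** 2 + lam) / h.imag


if __name__ == "__main__":
    lam, h0 = 2.0, 0.5 + 1.5j
    for x in (0.0, 0.3, 1.0, 2.7, 10.0):
        h = riccati_solution_free(x, lam, h0)
        print(x, h, orbit_invariant(h, lam))
```

Starting instead from $h(0, \lambda) = i\sqrt{\lambda}$ gives $f = e^{i\sqrt{\lambda}\,x}$, so the function returns the constant value $i\sqrt{\lambda}$, in agreement with the text.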
As an example, consider the solution $h(\cdot, \lambda)$ of the Riccati equation (with $V = 0$), subject to the initial condition $h(0, \lambda) = i\sqrt{\lambda_0}$, where $\lambda_0$ is an arbitrary positive number. For $0 < \delta < 1$, if $E$ is any closed subset of the interval $E' = \{\lambda : |\lambda - \lambda_0| < \delta \lambda_0\}$, we may verify for all $x > 0$ the estimate, for $\lambda \in E'$,
\[ \frac{|h(x, \lambda) - h(x, \lambda_0)|}{\operatorname{Im} h(x, \lambda_0)} = \frac{|\lambda - \lambda_0|}{\lambda_0} \left( 1 + \frac{\lambda}{\lambda_0} \cot^2\big(x\sqrt{\lambda}\big) \right)^{-1/2} < \delta, \]
and it follows that $h(\cdot, \lambda)$ is δ-clustering on $E$, with $H(x) = h(x, \lambda_0)$. This may not, however, be the optimal choice of $H$ to achieve δ-clustering, if we can vary $E$. Taking again $h(0, \lambda) = i\sqrt{\lambda_0}$, one may verify that any $\lambda$ in the interval $\lambda_0 (1 - \delta)/(1 + \delta) < \lambda < \lambda_0 (1 + \delta)/(1 - \delta)$, which is larger than the interval $E'$, will belong to an open interval $E$ on which $h(\cdot, \lambda)$ is δ-clustering, by suitable choice of $H$.

The above example also provides an illustration of the clustering property at $\lambda_0$; the solutions subject to initial condition $h(0, \lambda) = i\sqrt{\lambda_0}$ are clustering at $\lambda_0$ (and
at no other point), since by shrinking the interval $E$ containing $\lambda_0$ we can ensure the δ-clustering property on $E$ with an arbitrarily small value of $\delta$. In Theorem 1 below, we shall use the δ-clustering property to obtain an estimate of the closeness of $h(0, \lambda)$, the initial value of the solution, to $m_+(\lambda)$, the boundary value of the m-function, for values of $\lambda$ in a given set $E$. Roughly, this estimate will state that if the family $\{h(\cdot, \lambda)\}$ is δ-clustering on $E$ then $h(0, \lambda)$ is within distance of order $\delta$ of $m_+(\lambda)$, for almost all $\lambda \in E$. Although such estimates are of limited interest where $m_+(\lambda)$ is known exactly, we believe that the general result, which makes no special assumptions on the form of the potential, is of both theoretical and practical interest in the study of the m-function and its boundary values.

The class of potentials $V \in L^1(0, \infty)$ may be treated as a perturbation of the case of zero potential. By regarding the Riccati equation as a pair of coupled differential equations for the real and imaginary parts of $h$, and evaluating the derivative, one may verify the identity, for any solution $h(\cdot, \lambda)$,
\[ \frac{d}{dx} \left( \frac{|h(x, \lambda)|^2 + \lambda}{\operatorname{Im} h(x, \lambda)} \right) = \frac{2 V(x) \operatorname{Re} h(x, \lambda)}{\operatorname{Im} h(x, \lambda)} = \frac{2 V(x) \operatorname{Re} h(x, \lambda)}{|h(x, \lambda)|^2 + \lambda} \cdot \frac{|h(x, \lambda)|^2 + \lambda}{\operatorname{Im} h(x, \lambda)}. \tag{21} \]
For $V \in L^1(0, \infty)$, it follows that $(|h|^2 + \lambda) / \operatorname{Im} h$ converges to a limit as $x \to \infty$, for each $\lambda > 0$. Since $|h|^2 + \lambda \ge 2\sqrt{\lambda} \operatorname{Im} h$, with equality only at $h = i\sqrt{\lambda}$, this limit is at least $2\sqrt{\lambda}$. The limit may take this minimum value, in which case $h(x, \lambda) \to i\sqrt{\lambda}$, and the solution converges to the value of $m_+(\lambda)$ for the unperturbed problem $V \equiv 0$; or the limit may be larger, in which case the orbit of the solution asymptotically approaches a circle in the upper half-plane.
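The identity (21) can be checked by the computation outlined in the text. The following LaTeX sketch introduces the shorthand $p = \operatorname{Re} h$, $q = \operatorname{Im} h$ (these two symbols are notation assumed here for convenience, not used in the paper) and differentiates directly.

```latex
% Real and imaginary parts of h' = V - \lambda - h^2, with h = p + iq:
\[
  p' = V - \lambda - p^{2} + q^{2}, \qquad q' = -2pq .
\]
% Differentiate (|h|^2 + \lambda)/\operatorname{Im} h = (p^2 + q^2 + \lambda)/q:
\[
  \frac{d}{dx}\,\frac{p^{2} + q^{2} + \lambda}{q}
  = \frac{2pp' + 2qq'}{q} - \frac{(p^{2} + q^{2} + \lambda)\,q'}{q^{2}}
  = \frac{2p\,(V - \lambda - p^{2} + q^{2}) - 4pq^{2}}{q}
    + \frac{2p\,(p^{2} + q^{2} + \lambda)}{q}
  = \frac{2pV}{q} .
\]
```

All terms not containing $V$ cancel, which is also why the quantity is exactly constant along orbits when $V \equiv 0$.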
It will be a consequence of Theorem 1 below (which, however, is stated in much greater generality) that for the unique initial condition which yields the asymptotics $h(x, \lambda) \to i\sqrt{\lambda}$ we have $h(0, \lambda) = m_+(\lambda)$, where now $m_+(\lambda)$ is the boundary value of the m-function for the perturbed Dirichlet operator $-d^2/dx^2 + V(x)$ in $L^2(0, \infty)$; correspondingly, $h(\ell, \lambda)$ will then be the boundary value of the m-function for the Dirichlet operator $-d^2/dx^2 + V(x)$ in $L^2(\ell, \infty)$.

The case of a potential $V$ of bounded variation (assuming for convenience that $V \to 0$ as $x \to \infty$) may be treated in a similar way. Using the identity (21) and noting that $2 \operatorname{Re} h / \operatorname{Im} h = \frac{d}{dx}(1/\operatorname{Im} h)$, one may integrate the identity by parts and, using the estimate that $\operatorname{Im} h$ is bounded below by a positive constant for fixed $\lambda > 0$, deduce that again in this case $(|h|^2 + \lambda) / \operatorname{Im} h$ converges to a limit as $x \to \infty$. Hence for potentials of bounded variation we may identify, as in the $L^1$ case, a particular solution of the Riccati equation which leads to the asymptotic property $h(x, \lambda) \to i\sqrt{\lambda}$, and satisfying the initial condition $h(0, \lambda) = m_+(\lambda)$; the analysis may be extended without difficulty to potentials which are a sum of an $L^1$ component and a component of bounded variation, covering in this way a wide class of decaying potentials, both of short and long range. In all such cases, these particular solutions of the Riccati equation satisfying specific asymptotics are in fact precisely the clustering solutions we have already defined above.
An interesting class of nondecaying potentials in this context is provided by the class of periodic potentials. For a potential of period $T$, a family of solutions $h(\cdot, \lambda)$ of the Riccati equation will be clustering on a set $E$ if $h(\cdot, \lambda)$ is itself periodic, in the sense that $h(x + T, \lambda) = h(x, \lambda)$ for $\lambda \in E$; for such solutions we will then have $h(0, \lambda) = m_+(\lambda)$. The δ-clustering solutions may then roughly be characterised as those which stay within distance $O(\delta)$ of a periodic solution, at a sequence $\{\ell_j\}$ of points with $\ell_j \to \infty$.

As a typical application to a class of potentials tending to $-\infty$ at large distances, consider the one-parameter family of potentials $V = V_\beta(x) = -\beta x^2$ ($\beta > 0$). In that case, for $\lambda > 0$, the transformation
\[ s = \int_0^x (\lambda + \beta t^2)^{1/2} \, dt, \qquad g = (\lambda + \beta x^2)^{1/4} f \]
reduces the Schrödinger equation to the standard form
\[ \frac{d^2 g}{ds^2} + \{1 + R_\beta\} g = 0, \]
with $R_\beta \equiv R_\beta(s, \lambda)$ satisfying $\int_0^\infty |R_\beta(s, \lambda)| \, ds < \infty$. (See [33] for further details.) If $\lambda$ is allowed to be negative, say $\lambda \in [-\lambda_-, 0]$ for some fixed $\lambda_- > 0$, a similar transformation can be carried out for all $x > x_0$ sufficiently large, by taking the transformed variable $s$ to be $s = \int_{x_0}^x (\lambda + \beta t^2)^{1/2} \, dt$, with $x_0 > \sqrt{\lambda_-/\beta}$, so that $\lambda + \beta t^2 > 0$ on the range of integration. The ratio $k = (dg/ds)/g$ then satisfies a Riccati equation $dk/ds = -1 - R_\beta - k^2$, where, for a given solution $f(x, \lambda)$ of the Schrödinger equation, $h(x, \lambda)$ ($= (df/dx)/f$) is given in terms of $k(s, \lambda)$ by
\[ h(x, \lambda) = (\lambda + \beta x^2)^{1/2} k(s, \lambda) - \frac{\beta x}{2} (\lambda + \beta x^2)^{-1}. \]
As x and s tend to infinity, this enables us to pass easily from bounds on k, of the form | k(s, λ) − K(x) | < δ0, Im K(x) to corresponding bounds on h, of the form | h(x, λ) − H (x) | < δ, Im H (x) provided H and K are related by H = (λ0 + βx 2 )1/2K −
βx (λ0 + βx 2 )−1 , 2
for some fixed λ0 .
By a natural extension of Definition 1, the notion of δ-clustering may be applied to families of solutions $k(\cdot,\lambda)$ of the transformed Riccati equation, and $h(\cdot,\lambda)$ will be δ-clustering whenever the family $k(\cdot,\lambda)$ is $\delta'$-clustering for some $\delta' < \delta$. Moreover,
with $R_\beta(\cdot,\lambda) \in L^1(0,\infty)$, we may adapt the same techniques as for $L^1$ potentials to determine $\delta'$-clustering solutions for $k(\cdot,\lambda)$. We omit, here, the detailed consequences of this approach, but note that there is a solution $h(\cdot,\lambda)$ of the Riccati equation (19′) which is clustering on $\mathbb{R}$, with the asymptotic behaviour $h(x,\lambda) \sim i\beta^{1/2}x$ as $x \to \infty$, and that the solution $h_N(\cdot,\lambda)$ of (19′), subject to the condition
$$h_N(N,\lambda) = i(\lambda + \beta N^2)^{1/2} - \frac{\beta N}{2}(\lambda + \beta N^2)^{-1},$$
is δ-clustering with $\delta = \tanh I_N(\beta,\lambda)$, where
$$I_N(\beta,\lambda) = \frac{\beta}{4}\int_N^\infty \left|(3\beta x^2 - 2\lambda)(\lambda + \beta x^2)^{-5/2}\right| dx \qquad (N^2 > -\lambda/\beta).$$
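The size of the clustering parameter $\delta = \tanh I_N(\beta,\lambda)$ is easy to explore numerically. The following is a minimal sketch, not part of the original analysis: the function name `clustering_delta`, the substitution $x = N/u$ and the midpoint rule are illustrative choices. At $\lambda = 0$ the integral can be done by hand, $I_N = \tfrac{3}{8}\beta^{-1/2}N^{-2}$, which gives a convenient check.

```python
import math

def clustering_delta(beta, lam, N, n=20000):
    """Return (I_N, tanh I_N) for the potential V(x) = -beta * x**2.

    I_N = (beta/4) * integral_N^inf |(3 beta x^2 - 2 lam)(lam + beta x^2)^(-5/2)| dx.
    The half-line [N, inf) is mapped to (0, 1] by x = N/u (an illustrative
    choice) and the result is integrated with the midpoint rule.
    """
    if beta <= 0 or N <= 0 or N * N <= -lam / beta:
        raise ValueError("need beta > 0, N > 0 and N^2 > -lam/beta")

    def integrand(x):
        q = lam + beta * x * x
        return 0.25 * beta * abs(3.0 * beta * x * x - 2.0 * lam) * q ** -2.5

    step = 1.0 / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * step          # midpoint of the j-th subinterval of (0, 1]
        x = N / u
        total += integrand(x) * N / (u * u)   # dx = N / u^2 du
    I = total * step
    return I, math.tanh(I)

I, delta = clustering_delta(beta=4.0, lam=0.0, N=10.0)
print(I, delta)
```

For $N = 10$, $\beta = 4$, $\lambda = 0$ the closed form gives $I_N = 0.001875$, so δ is of order $2 \times 10^{-3}$ for these parameter values.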
The following theorem derives some consequences of the clustering property, and applies to all locally integrable potentials under the sole hypothesis that we are in the limit-point case at infinity.

THEOREM 1. Let V be any real-valued potential, integrable on compact subintervals of $[0,\infty)$, and in the limit-point case at infinity. Let E be a measurable subset of $\mathbb{R}$, with each $\lambda \in E$ a point of density of E. For $\lambda \in E$, let $h(\cdot,\lambda)$ satisfy the Riccati equation (19′) subject to initial conditions with $\mathrm{Im}\,h(0,\lambda) > 0$ and $h(0,\lambda)$ approximately continuous at all $\lambda \in E$. Suppose that the solution $h(\cdot,\lambda)$ is δ-clustering on E, for some δ in the interval $0 < \delta < 1/2$. Let $m(z)$ denote the m-function, with boundary value $m_+(\lambda)$, $\omega(\lambda,\cdot)$ the ω-measure, and µ the spectral measure, with absolutely continuous component $\mu_{a.c.}$, for the differential operator $-d^2/dx^2 + V(x)$ in $L^2(0,\infty)$ with Dirichlet boundary condition at $x = 0$. Then the function $h(0,\lambda)$ provides the following order δ estimates of $m_+(\lambda)$, $\omega(\lambda,S)$ and $\mu_{a.c.}(E)$, valid for almost all $\lambda \in E$, and holding in particular at all $\lambda \in E$ at which $m_+(\lambda)$ exists:
(i) $|h(0,\lambda) - m_+(\lambda)| \le \dfrac{\delta}{1-2\delta}\,\mathrm{Im}\,m_+(\lambda)$;
(ii) $|\pi^{-1}\psi(\lambda,S) - \omega(\lambda,S)| \le \dfrac{\delta}{1-2\delta}\,\omega(\lambda,S)$, where $\psi(\lambda,S)$ is the angle subtended by S at the point $h(0,\lambda)$;
(iii) $\left|\pi^{-1}\displaystyle\int_E \mathrm{Im}\,h(0,\lambda)\,d\lambda - \mu_{a.c.}(E)\right| \le \dfrac{\delta}{1-2\delta}\,\mu_{a.c.}(E)$.
Moreover, if E is an interval, the multiplicative constant on the right hand side of this last estimate may be improved to $\delta/(1-\delta)$, under the weaker hypothesis $0 < \delta < 1$, and we have the following additional estimate of $\frac{1}{\pi}\mathrm{Im}\,m_+(\lambda)$, the spectral density function:
(iv) $|\mathrm{Im}\,h(0,\lambda) - \mathrm{Im}\,m_+(\lambda)| \le \dfrac{\delta}{1-\delta}\,\mathrm{Im}\,m_+(\lambda)$.

Proof. Suppose $h(\cdot,\lambda)$ is δ-clustering on E, with $0 < \delta < 1/2$. Following Remark 6 after Definition 1, let $\{\ell_j\}$ be an increasing sequence, with $\ell_j \to \infty$,
such that $|h(\ell_j,\lambda) - H(\ell_j)| < \delta\,\mathrm{Im}\,H(\ell_j)$, for $\lambda \in E$. Let ℓ denote $\ell_j$ for some fixed j, and define the constant function $\eta = H(\ell)$. Now define the corresponding Herglotz function $m^N_{0,\eta}(z)$ as in Equation (10). Then the solution of the Riccati equation, which at $x = 0$ has the value $m^{N+}_{0,\eta}(\lambda)$, will have the value $H(\ell)$ at $x = \ell$. We also know that the solution, which at $x = 0$ has the value $h(0,\lambda)$, will have the value $h(\ell,\lambda)$ at $x = \ell$. Moreover, at $x = \ell$ we have the estimate $|h(\ell,\lambda) - H(\ell)| < \delta\,\mathrm{Im}\,H(\ell)$, which by Lemma 3 implies the estimate at $x = 0$
$$\left|h(0,\lambda) - m^{N+}_{0,\eta}(\lambda)\right| \le \frac{\delta}{1-\delta}\,\mathrm{Im}\,m^{N+}_{0,\eta}(\lambda), \tag{22}$$
for all $\lambda \in E$. We can now apply Lemma 1, which translates an estimate of the separation between two points in the upper half-plane into an estimate of the difference between the angles subtended by a given measurable subset S of $\mathbb{R}$ at these two points. Noting that in this case
$$\delta^* = \frac{\delta}{1-\delta}\bigg/\left(1 - \frac{\delta}{1-\delta}\right) = \frac{\delta}{1-2\delta},$$
we then have from the definitions of $\psi(\lambda,S)$ and $\omega^N_{0,\eta}(\lambda,S)$,
$$\left|\pi^{-1}\psi(\lambda,S) - \omega^N_{0,\eta}(\lambda,S)\right| \le \frac{\delta}{1-2\delta}\,\omega^N_{0,\eta}(\lambda,S),$$
for all $\lambda \in E$ and all measurable $S \subseteq \mathbb{R}$. Integrating over the set E now gives
$$\left|\int_E \pi^{-1}\psi(t,S)\,dt - \int_E \omega^N_{0,\eta}(t,S)\,dt\right| \le \frac{\delta}{1-2\delta}\int_E \omega^N_{0,\eta}(t,S)\,dt,$$
which from Lemma 2, on letting N tend to ∞ and taking note of Remark 2 at the end of Section 3, yields
$$\left|\int_E \pi^{-1}\psi(t,S)\,dt - \int_E \omega(t,S)\,dt\right| \le \frac{\delta}{1-2\delta}\int_E \omega(t,S)\,dt. \tag{23}$$
This inequality holds also with E replaced by $E_K \equiv E \cap [\lambda-K, \lambda+K]$, for any $\lambda \in E$ and $K > 0$. Take $\lambda \in E$ to be a point at which $m_+(\lambda)$ exists; this will be so for almost all $\lambda \in E$. By hypothesis, λ will be a point of density of E and a point of approximate continuity of $h(0,\lambda)$. Hence also $\psi(\lambda,S)$ will be approximately continuous at this point. So we have $\lim_{K\to 0+}\frac{1}{2K}\int_{E_K}\psi(t,S)\,dt = \psi(\lambda,S)$, and $\lim_{K\to 0+}\frac{1}{2K}\int_{E_K}\omega(t,S)\,dt = \omega(\lambda,S)$. We do not, here, need to assume that $E_K$ covers the whole of the interval $[\lambda-K, \lambda+K]$; since both $\pi^{-1}\psi(t,S)$ and $\omega(t,S)$ are bounded by 1, and λ is a point of density of E, the contributions to $\frac{1}{2K}\int_{\lambda-K}^{\lambda+K}\psi(t,S)\,dt$ and $\frac{1}{2K}\int_{\lambda-K}^{\lambda+K}\omega(t,S)\,dt$ from integrating over points not in E would in any case vanish in the limits $K \to 0+$. We may therefore conclude, on
setting $E = E_K$ in (23), dividing by 2K, and proceeding to the limit, that, at almost all $\lambda \in E$,
$$|\pi^{-1}\psi(\lambda,S) - \omega(\lambda,S)| \le \frac{\delta}{1-2\delta}\,\omega(\lambda,S). \tag{24}$$
This proves (ii) of the theorem. The proof of (i) now follows immediately from (ii) of Lemma 1, which allows us to proceed from an estimate of angles subtended to an estimate of distances of points in the upper half-plane.
Note that, under the hypotheses of the theorem, $m_+(\lambda)$ cannot be real for any $\lambda \in E$. If $m_+(\lambda)$ were real we should have $\omega(\lambda,S) = 0$ for any closed interval S with $m_+(\lambda) \notin S$. For such intervals S, (24) then implies $\psi(\lambda,S) = 0$, which is not possible with $\mathrm{Im}\,h(0,\lambda) > 0$. A similar argument shows that, for $\lambda \in E$, we cannot have $\lim_{\varepsilon\to 0+}\mathrm{Im}\,m(\lambda + i\varepsilon) = \infty$, since this would imply $\omega(\lambda,S) = 0$ for any finite interval S. Since the singular part $\mu_s$ of the measure µ is supported on the set of λ at which $\lim_{\varepsilon\to 0+}\mathrm{Im}\,m(\lambda + i\varepsilon) = \infty$, it follows that $\mu_s(E) = 0$, and hence that $\mu(E) = \mu_{a.c.}(E)$. Part (iii) of the theorem now follows from (i) and the fact that $\pi^{-1}\mathrm{Im}\,m_+(\lambda)$ is the density function for $\mu_{a.c.}$ [28].
The proof of the stronger version of (iii), under the hypothesis that E is an interval, follows from (15) on integrating the inequality (22) directly and noting that $\pi^{-1}\mathrm{Im}\,m^{N+}_{0,\eta}(\lambda)$ is the density function for the measure $\mu^N_{0,\eta}$. Inequality (iv) follows as before from a limiting argument. □

The following corollary is a straightforward consequence of the main theorem:

COROLLARY 2. Under the same hypotheses on the potential V as for Theorem 1, suppose $h(\cdot,\lambda)$ is a family of solutions of the Riccati equation (19′) which is clustering at $\lambda_0$. Then $h(\cdot,\lambda_0)$ satisfies the initial condition $h(0,\lambda_0) = m_+(\lambda_0)$, provided $m_+(\lambda_0)$ exists.

Proof. From Definition 1, we may assume the existence of a measurable set E, with $\lambda_0 \in E$, and a family $I_\delta$ of open intervals containing $\lambda_0$, such that $h(\cdot,\lambda)$ is δ-clustering on $E \cap I_\delta$. Then the estimate (i) of Theorem 1 holds at $\lambda = \lambda_0$ for all δ in the interval $0 < \delta < 1/2$. It follows immediately that $h(0,\lambda_0) = m_+(\lambda_0)$. □

Theorem 1 and its corollary provide a general criterion for distinguishing the solution $h_+(\cdot,\lambda)$ of the Riccati equation which agrees at $x = 0$ with the boundary value $m_+(\lambda)$ of the m-function $m(z)$ and at $x = \ell$ with the boundary value $m_+(\ell,\lambda)$ of the m-function for the differential operator acting in $L^2(\ell,\infty)$. Thus, $\{h_+(\cdot,\lambda)\}$ is clustering for each λ, and any family of solutions which is clustering at a point λ allows us to determine the boundary value at this point of the m-function for the differential operator in $L^2(\ell,\infty)$ for $\ell > 0$.

REMARK 8. One may easily convert (i) of the theorem into the specific bound $|m_+(\lambda) - h(0,\lambda)| \le (\delta/(1-3\delta))\,\mathrm{Im}\,h(0,\lambda)$ for $m_+(\lambda)$, provided $\delta < 1/3$. This
bound may be slightly improved by defining $\tilde h(\lambda) = \mathrm{Re}\,h(0,\lambda) + (i/(1-\delta^2))\,\mathrm{Im}\,h(0,\lambda)$. Then we have $|m_+(\lambda) - \tilde h(\lambda)| \le (\delta/(1-2\delta))\,\mathrm{Im}\,\tilde h(\lambda)$. Bounds similar to those above may also be obtained as a consequence of (ii), (iii) and (iv) of the theorem.

REMARK 9. Estimates (i)–(iv) of the theorem, based on the hypothesis of δ-clustering on a set E, are close to optimal, in the sense that any improvement in the values of the multiplicative constants $\delta/(1-2\delta)$ or $\delta/(1-\delta)$ can at best provide an order $\delta^2$ correction. To illustrate this point, consider the simple case $V(x) \equiv 0$, taking as before the solution subject to the initial condition $h(0,\lambda) = i\sqrt{\lambda_0}$. Taking $\lambda_0 = 1$ and $\delta = 1/10$, we can find an interval, containing the point $\lambda = 1$, on which $h(\cdot,\lambda)$ is δ-clustering. Part (iv) of the theorem then provides the following estimate for $m_+(1)$: $9/10 \le \mathrm{Im}\,m_+(1) \le 9/8$. These bounds on the spectral density function at $\lambda = 1$ provide an accuracy of up to 10%. On the other hand, still with $\delta = 1/10$ but taking a different family of solutions which leads to δ-clustering in a neighbourhood of $\lambda = 1$, namely $h(0,\lambda) = i\sqrt{\lambda_0}$ with $\lambda_0$ slightly larger than 9/11, we deduce the upper bound $\mathrm{Im}\,m_+(1) \le 1.018$. A third family of solutions with $\delta = 1/10$ and $\lambda_0$ slightly smaller than 11/9 gives rise to the lower bound $\mathrm{Im}\,m_+(1) \ge 0.995$. It is interesting to note, here, that two estimates to order δ, taken together, give rise to the single estimate $0.995 \le \mathrm{Im}\,m_+(1) \le 1.018$, which determines the spectral density at $\lambda = 1$ to order $\delta^2$. Note also that were it possible to use the δ-clustering hypothesis to derive the bound (iv) with δ (or even $\delta/(1-\tfrac12\delta)$) in place of $\delta/(1-\delta)$, the last stated inequalities for $\mathrm{Im}\,m_+(1)$ would be replaced in the case of the lower bound by an inequality that is in fact contradicted by the exact result $\mathrm{Im}\,m_+(1) = 1$. Hence $\delta/(1-\delta)$, or something like it, is really needed.

REMARK 10.
For the class of potentials Vβ (x) = −βx 2 (β > 0) the results of Theorem 1 lead to explicit bounds for m+ (λ), using the δ-clustering families mentioned earlier. As an example, one finds the interesting estimate m+ (λ) 1/2 −1 1/2 −1 √ 6 tanh cβ − i cosh cβ λ λ , λ valid for all λ > 0, where the constant c can be precisely determined. In the asymptotic limit λ → ∞, one may use the method of Harris ([34]) to write down a series for the solution of the Riccati equation, of which the first few terms give β 1 m+ (λ) = i 1 − 2 + o 2 . 4λ λ See [33] for another approach to the asymptotic expansion in inverse powers of λ. The δ-clustering families hN (·, λ) referred to earlier, taking as an example N = 10, β = 4, and making a crude estimate of the resulting integral, already lead to uniform estimates of m+ (λ) to within a tolerance 2×10−3 over the range | λ |< 65. These estimates can be further improved, either by controlling more precisely the
solution of the Riccati equation, or, in the case of large λ, making use of $\lambda \to \infty$ asymptotics. See [35] for the numerical computation of $m(z)$ for $\mathrm{Im}\,z > 0$, using repeated diagonalisation to control the large x asymptotics.

REMARK 11. Although we have dealt mainly with the case $\alpha = 0$, we can use Corollary 1 to cast the theory into a form which applies to general values of α. Let us define solutions $u^N_\alpha(\cdot,\lambda)$, $v^N_\alpha(\cdot,\lambda)$ of (8′), subject to the conditions at $x = N$
$$u^N_\alpha(N,\lambda) = \cos\alpha, \qquad v^N_\alpha(N,\lambda) = -\sin\alpha,$$
$$(u^N_\alpha)'(N,\lambda) = \sin\alpha, \qquad (v^N_\alpha)'(N,\lambda) = \cos\alpha.$$
Given a complex solution $f(\cdot,\lambda)$ of Equation (8′) with $\mathrm{Im}(f'/f) > 0$, we define a corresponding function $h_\alpha(\cdot,\lambda)$ by the equation
$$f(x,\lambda) = C^N_\alpha\{u^N_\alpha(x,\lambda) + h_\alpha(N,\lambda)\,v^N_\alpha(x,\lambda)\},$$
for all $x > 0$, $N > 0$. In the case $\alpha = 0$ we have $h(x,\lambda) \equiv h_0(x,\lambda) = f'(x,\lambda)/f(x,\lambda)$ and $h(\cdot,\lambda)$ then satisfies the Riccati equation (19′). For general α, $h_\alpha$ is related to h, as in Equations (11), (12), by
$$h(x,\lambda) = \frac{(\sin\alpha) + (\cos\alpha)\,h_\alpha(x,\lambda)}{(\cos\alpha) - (\sin\alpha)\,h_\alpha(x,\lambda)}, \qquad h_\alpha(x,\lambda) = \frac{(\cos\alpha)\,h(x,\lambda) - (\sin\alpha)}{(\cos\alpha) + (\sin\alpha)\,h(x,\lambda)}.$$
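The two formulas relating h and $h_\alpha$ are inverse Möbius transformations with real coefficients and unit determinant, so they map the upper half-plane to itself. A minimal numerical sketch of this, with hypothetical helper names `h_from_h_alpha` and `h_alpha_from_h` introduced only for illustration:

```python
import math

def h_from_h_alpha(h_alpha, alpha):
    """h = (sin a + cos a * h_alpha) / (cos a - sin a * h_alpha)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (s + c * h_alpha) / (c - s * h_alpha)

def h_alpha_from_h(h, alpha):
    """h_alpha = (cos a * h - sin a) / (cos a + sin a * h)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * h - s) / (c + s * h)

h = 0.3 + 1.2j            # any point with Im h > 0
alpha = 0.7               # any real boundary-condition angle
h_alpha = h_alpha_from_h(h, alpha)
print(h_alpha, h_from_h_alpha(h_alpha, alpha))
```

The round trip returns the starting value up to rounding, and $\mathrm{Im}\,h_\alpha > 0$ whenever $\mathrm{Im}\,h > 0$.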
We can substitute the latter expression into the Riccati equation for $h(\cdot,\lambda)$ to obtain the Riccati equation for $h_\alpha(\cdot,\lambda)$:
$$\frac{d}{dx}h_\alpha(x,\lambda) = [V(x) - \lambda]\,[(\cos\alpha) - (\sin\alpha)\,h_\alpha(x,\lambda)]^2 - [(\sin\alpha) + (\cos\alpha)\,h_\alpha(x,\lambda)]^2. \tag{25}$$
If, now, we define a family $h_\alpha(\cdot,\lambda)$ of solutions of (25) to be δ-clustering on E if there exists a function $H_\alpha$ such that $\liminf_{x\to\infty}\sup_{\lambda\in E}|h_\alpha(x,\lambda) - H_\alpha(x)|/\mathrm{Im}\,H_\alpha(x) < \delta$, then the proof of Theorem 1 will proceed as before, with the obvious changes of h replaced by $h_\alpha$, m by $m_\alpha$, µ by $\mu_\alpha$, and so on. By suitable choice of the function $H_\alpha$, we may then obtain estimates analogous to those of (i)–(iv) of Theorem 1, which will relate to properties of the absolutely continuous spectrum of $T_\alpha$. Since the transformation between $h_\alpha$ and $h_\beta$ is a Möbius transformation, we can also use Lemma 3 to show that if $h_\alpha(\cdot,\lambda)$ is δ-clustering then $h_\beta(\cdot,\lambda)$ is $\delta/(1-\delta)$-clustering. In particular, this shows that the property of a family of solutions being clustering is in fact independent of α.
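Returning to the numerical illustration in Remark 9: estimate (iv) of Theorem 1 can be inverted to give explicit bounds on $\mathrm{Im}\,m_+$ in terms of $\mathrm{Im}\,h(0,\lambda)$, namely $(1-\delta)\,\mathrm{Im}\,h \le \mathrm{Im}\,m_+ \le \frac{1-\delta}{1-2\delta}\,\mathrm{Im}\,h$ when $\delta < 1/2$. The sketch below reproduces the figures quoted there; the function name `density_bounds` is hypothetical, and the values $\lambda_0 = 9/11$ and $11/9$ are used as the limiting cases of "slightly larger" and "slightly smaller".

```python
from fractions import Fraction
import math

def density_bounds(im_h, delta):
    """Invert |Im h - Im m| <= c * Im m, with c = delta/(1 - delta).

    Returns (lower, upper) bounds for Im m. The upper bound needs c < 1,
    i.e. delta < 1/2.
    """
    if not (0 < delta < Fraction(1, 2)):
        raise ValueError("need 0 < delta < 1/2")
    c = delta / (1 - delta)
    return im_h / (1 + c), im_h / (1 - c)

# lambda_0 = 1, delta = 1/10: exact rational arithmetic
print(density_bounds(Fraction(1), Fraction(1, 10)))

# lambda_0 near 9/11 (upper bound) and near 11/9 (lower bound)
_, upper = density_bounds(math.sqrt(9 / 11), 0.1)
lower, _ = density_bounds(math.sqrt(11 / 9), 0.1)
print(lower, upper)
```

The first call returns the pair 9/10 and 9/8; the second pair of calls gives values that round to 0.995 and 1.018.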
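Appendix: Proof of Lemma 1

(i) The density function for the measure $S \mapsto \theta(S)$ corresponding to a point w in the upper half-plane is given (see p. 56 of [30]) by
$$q(t) = \frac{\mathrm{Im}\,w}{|t-w|^2} = \mathrm{Im}\,\frac{1}{t-w}.$$
Setting $q_j(t) = \mathrm{Im}\,1/(t-w_j)$ $(j = 1, 2)$, we shall derive the bound $|q_1(t) - q_2(t)| \le \delta^* q_2(t)$. The result will then follow from the inequality $|\theta_1(S) - \theta_2(S)| = |\int_S (q_1(t) - q_2(t))\,dt| \le \int_S |q_1(t) - q_2(t)|\,dt$. Assuming, then, $|w_1 - w_2| \le \delta\,\mathrm{Im}\,w_2$, we have
$$|q_1 - q_2| = \left|\mathrm{Im}\left(\frac{1}{t-w_1} - \frac{1}{t-w_2}\right)\right| = \left|\mathrm{Im}\,\frac{w_1 - w_2}{(t-w_1)(t-w_2)}\right| \le \frac{|w_1 - w_2|}{|t-w_1|\,|t-w_2|}.$$
With $q_2 = \mathrm{Im}\,w_2/|t-w_2|^2$, this gives
$$\left|\frac{q_1 - q_2}{q_2}\right| \le \frac{|w_1 - w_2|}{\mathrm{Im}\,w_2}\left|\frac{t-w_2}{t-w_1}\right| \le \delta\left|\frac{t-w_2}{t-w_1}\right|.$$
Now
$$\left|\frac{t-w_1}{t-w_2}\right| = \left|1 - \frac{w_1 - w_2}{t-w_2}\right| \ge 1 - \frac{|w_1 - w_2|}{|t-w_2|} \ge 1 - \frac{|w_1 - w_2|}{\mathrm{Im}\,w_2} \ge 1 - \delta.$$
Hence
$$\left|\frac{t-w_2}{t-w_1}\right| \le \frac{1}{1-\delta},$$
and we have
$$\left|\frac{q_1 - q_2}{q_2}\right| \le \frac{\delta}{1-\delta}$$
as required.

(ii) The density function q for the measure $S \mapsto \theta(S)$ may be expressed as
$$q(t) = \lim_{\varepsilon\to 0+}\frac{1}{2\varepsilon}\int_{t-\varepsilon}^{t+\varepsilon} q(x)\,dx = \lim_{\varepsilon\to 0+}\frac{1}{2\varepsilon}\,\theta(t-\varepsilon, t+\varepsilon).$$
Hence from the assumed inequality $(1-\delta)\theta_2(S) \le \theta_1(S) \le (1+\delta)\theta_2(S)$, valid in particular if S is an interval, we can derive, by a limiting argument for small intervals, the corresponding inequality for density functions
$$(1-\delta)\,q_2 \le q_1 \le (1+\delta)\,q_2,$$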
implying that
$$\left|\frac{q_1 - q_2}{q_2}\right| \le \delta.$$
To make full use of this inequality for the density functions, we first derive an identity for $(q_1 - q_2)/q_2$. To do so, we make the substitution $t = \mathrm{Re}\,w_1 + \alpha\,\mathrm{Im}\,w_1$, where α is an arbitrary real parameter. Then $|t-w_1|^2 = (1+\alpha^2)(\mathrm{Im}\,w_1)^2$, and setting $w_1 - w_2 = \rho e^{i\phi}$ (ρ, φ real with $\rho \ge 0$), we get
$$|t-w_2|^2 = |t - w_1 + \rho e^{i\phi}|^2 = |(\alpha - i)\,\mathrm{Im}\,w_1 + \rho\cos\phi + i\rho\sin\phi|^2 = (1+\alpha^2)(\mathrm{Im}\,w_1)^2 + \rho^2 + \rho\,\mathrm{Im}\,w_1\{2\alpha\cos\phi - 2\sin\phi\}.$$
With $q_j(t) = \mathrm{Im}\,w_j/|t-w_j|^2$, we have
$$\frac{q_1(t) - q_2(t)}{q_2(t)} = \frac{(\mathrm{Im}\,w_1)|t-w_2|^2 - (\mathrm{Im}\,w_2)|t-w_1|^2}{(\mathrm{Im}\,w_2)|t-w_1|^2},$$
and substituting the above expressions for $|t-w_1|^2$ and $|t-w_2|^2$ in terms of α results in
$$\frac{q_1 - q_2}{q_2} = \frac{(1+\alpha^2)(\mathrm{Im}\,w_1)^2 + \rho^2 + \rho\,\mathrm{Im}\,w_1\{2\alpha\cos\phi - 2\sin\phi\} - (\mathrm{Im}\,w_1)(\mathrm{Im}\,w_2)(1+\alpha^2)}{(\mathrm{Im}\,w_1)(\mathrm{Im}\,w_2)(1+\alpha^2)}.$$
The first and last terms in the numerator on the right hand side together give $(1+\alpha^2)(\mathrm{Im}\,w_1)\,\mathrm{Im}(w_1 - w_2) = (1+\alpha^2)\rho\,(\mathrm{Im}\,w_1)\sin\phi$, so that we now have
$$\frac{q_1 - q_2}{q_2} = \frac{\rho^2}{(\mathrm{Im}\,w_1)(\mathrm{Im}\,w_2)(1+\alpha^2)} + \frac{\rho}{\mathrm{Im}\,w_2}\left\{\frac{2\alpha}{1+\alpha^2}\cos\phi - \frac{1-\alpha^2}{1+\alpha^2}\sin\phi\right\}.$$
Now set $\alpha = \tan\frac{\gamma}{2}$ $(-\pi < \gamma < \pi)$, and use the trigonometric identities $\cos\gamma = (1-\alpha^2)/(1+\alpha^2)$, $\sin\gamma = 2\alpha/(1+\alpha^2)$, to give
$$\frac{q_1 - q_2}{q_2} = \frac{\rho^2\cos^2\frac{\gamma}{2}}{(\mathrm{Im}\,w_1)(\mathrm{Im}\,w_2)} + \frac{\rho}{\mathrm{Im}\,w_2}\sin(\gamma - \phi). \tag{26}$$
Since $(q_1(t) - q_2(t))/q_2(t)$ converges to a limit as $|t| \to \infty$ (i.e. as $\gamma \to \pm\pi$), Equation (26) also makes sense for $\gamma = \pm\pi$. By hypothesis, we now have $|(q_1 - q_2)/q_2| \le \delta$ for all $\gamma \in [-\pi, \pi]$. Setting $\gamma = \frac{\pi}{2} + \phi$ (mod 2π), one may deduce the inequality $\rho/\mathrm{Im}\,w_2 \le \delta$. (Equality can occur only in the case $\sin\phi = 1$, for which $w_1$ is vertically above $w_2$ in the complex plane.) Thus $|w_1 - w_2| \le \delta\,\mathrm{Im}\,w_2$, completing the proof of (ii) of the lemma. One may verify that this part of the lemma holds without the assumption $\delta < 1$.
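The density bound from part (i), $|q_1 - q_2| \le \frac{\delta}{1-\delta}\,q_2$ whenever $|w_1 - w_2| \le \delta\,\mathrm{Im}\,w_2$, can be spot-checked numerically. The following sketch is only an illustration under stated assumptions: the names `poisson_density` and `max_relative_gap`, the choice $w_2 = i$, $\delta = 0.2$ and the sampling grid are all arbitrary.

```python
import cmath

def poisson_density(t, w):
    """q(t) = Im w / |t - w|^2 = Im 1/(t - w) for w in the upper half-plane."""
    return w.imag / abs(t - w) ** 2

def max_relative_gap(w1, w2, ts):
    """Largest value of |q1(t) - q2(t)| / q2(t) over the sample points ts."""
    return max(
        abs(poisson_density(t, w1) - poisson_density(t, w2)) / poisson_density(t, w2)
        for t in ts
    )

delta = 0.2
w2 = 1j
ts = [k / 10.0 for k in range(-500, 501)]
# w1 directly below w2, on the circle |w1 - w2| = delta * Im w2
w1 = w2 + delta * w2.imag * cmath.exp(-1j * cmath.pi / 2)
print(max_relative_gap(w1, w2, ts), delta / (1 - delta))
```

For $w_1$ directly below $w_2$ the gap at $t = \mathrm{Re}\,w_2$ equals $\delta/(1-\delta)$, so in this configuration the constant in part (i) is attained.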
References
1. Titchmarsh, E. C.: Eigenfunction Expansions, Part I, 2nd edn., Oxford University Press, 1962.
2. Coddington, E. A. and Levinson, N.: Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.
3. Chaudhury, J. and Everitt, W. N.: On the spectrum of ordinary second order differential operators, Proc. Roy. Soc. Edinburgh Sect. A 68 (1968), 95–115.
4. Bennewitz, C. and Everitt, W. N.: Some remarks on the Titchmarsh–Weyl m-coefficient, in: A Tribute to Ake Pleijel, Proc. Pleijel Conference, University of Uppsala, 1980, pp. 49–108.
5. Eastham, M. S. P. and Kalf, H.: Schrödinger-type Operators with Continuous Spectra, Pitman, Boston, 1982.
6. Weyl, H.: Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen, Math. Ann. 68 (1910), 220–269.
7. Atkinson, F. V.: On the location of the Weyl circles, Proc. Roy. Soc. Edinburgh Sect. A 88 (1981), 345–356.
8. Gilbert, D. J.: PhD Thesis, University of Hull, 1984.
9. Gilbert, D. J. and Pearson, D. B.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators, J. Math. Anal. Appl. 128 (1987), 30–56.
10. Gilbert, D. J.: On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints, Proc. Roy. Soc. Edinburgh Sect. A 112 (1989), 213–229.
11. Stolz, G.: Bounded solutions and absolute continuity of Sturm–Liouville operators, J. Math. Anal. Appl. 169 (1992), 210–228.
12. Kiselev, A.: Absolutely continuous spectrum of one-dimensional Schrödinger operators and Jacobi matrices with slowly decreasing potentials, Comm. Math. Phys. 179 (1996), 377–400.
13. Jitomirskaya, S. and Last, Y.: Dimensional Hausdorff properties of singular continuous spectra, Phys. Rev. Lett. 76(11) (1996), 1765–1769.
14. Jitomirskaya, S. and Last, Y.: Power law subordinacy and singular spectra, in preparation.
15. Al-Naggar, I. and Pearson, D. B.: A new asymptotic condition for absolutely continuous spectrum of the Sturm–Liouville operator on the half line, Helv. Phys. Acta 67 (1994), 144–166.
16. Al-Naggar, I. and Pearson, D. B.: Quadratic forms and solutions of the Schrödinger equation, J. Phys. A 29 (1996), 6581–6594.
17. Last, Y. and Simon, B.: Eigenfunctions, transfer matrices and absolutely continuous spectrum of one-dimensional Schrödinger operators, Caltech Preprint, 1996.
18. Remling, C.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials, Caltech Preprint, 1997.
19. Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators, Proc. Amer. Math. Soc. 124 (1996), 3361–3369.
20. Christ, M. and Kiselev, A.: Absolutely continuous spectrum and Schrödinger operators with decaying potentials; some optimal results, MSRI Preprint.
21. Kiselev, A., Last, Y. and Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators, Caltech Preprint, 1997.
22. Simon, B. and Zhu, Y.: The Lyapunov exponents for Schrödinger operators with slowly oscillating potentials, J. Funct. Anal. 140 (1996), 541–556.
23. Simon, B. and Wolff, T.: Singular continuous spectrum under rank one perturbations and localisation for random Hamiltonians, Comm. Pure Appl. Math. 39 (1986), 75–90.
24. Pearson, D. B.: Value distribution and spectral theory, Proc. London Math. Soc. (3) 68 (1994), 127–144.
25. Pearson, D. B.: Value distribution and spectral analysis of differential operators, J. Phys. A 26 (1993), 4067–4080.
26. Pearson, D. B.: Sturm–Liouville theory, asymptotics, and the Schrödinger equation, in: D. Hinton and P. W. Schaefer (eds), Spectral Theory and Computational Methods of Sturm–Liouville Problems, M. Dekker Inc., New York, 1997, pp. 301–312.
27. Akhiezer, N. I. and Glazman, I. M.: Theory of Linear Operators in Hilbert Space, Pitman, London, 1981.
28. Pearson, D. B.: Quantum Scattering and Spectral Theory, Academic Press, London, 1988.
29. Everitt, W. N.: On a property of the m-coefficient of a second-order linear differential equation, J. London Math. Soc. (2) 4 (1972), 443–457.
30. Conway, J. B.: Functions of One Complex Variable, 2nd edn., Springer-Verlag, New York, 1978.
31. Saks, S.: Theory of the Integral, 2nd edn., Hafner Publishing Company, New York, 1937.
32. Munroe, M. E.: Measure and Integration, 2nd edn., Addison-Wesley, Reading, MA, 1971.
33. Eastham, M. S. P.: The asymptotic form of the spectral function in Sturm–Liouville problems with a large potential like $-x^c$ $(c \le 2)$, Proc. Roy. Soc. Edinburgh Sect. A 128 (1998), 37–45.
34. Harris, B. J.: The form of the spectral functions associated with Sturm–Liouville problems with continuous spectrum, Northern Illinois University Preprint, 1996.
35. Brown, B. M., Eastham, M. S. P., Evans, W. D. and Kirby, V. G.: Repeated diagonalisation and the numerical computation of the Titchmarsh–Weyl m(λ) function, Proc. Roy. Soc. London Ser. A 445 (1994), 113–126.
Mathematical Physics, Analysis and Geometry 1: 273–292, 1998. © 1998 Kluwer Academic Publishers. Printed in the Netherlands.
On a Counterexample Concerning Unique Continuation for Elliptic Equations in Divergence Form

NICULAE MANDACHE
Institute of Mathematics, Romanian Academy, P.O. box 1-764, 70700 Bucharest, Romania. e-mail: [email protected]

(Received: 1 October 1997; accepted: 4 November 1997)

Abstract. We construct a second order elliptic equation in divergence form in $\mathbb{R}^3$, with a nonzero solution which vanishes in a half-space. The coefficients are α-Hölder continuous of any order $\alpha < 1$. This improves a previous counterexample of Miller (1972, 1974). Moreover, we obtain coefficients which belong to a finer class of smoothness, expressed in terms of the modulus of continuity.

Mathematics Subject Classifications (1991): 35A05, 35J15 (35B60, 35K10).

Key words: elliptic equations, partial differential equations, unique continuation.
Introduction

The aim of this paper is to improve a counterexample by Keith Miller [3, 4]. Part of the results presented here belong to the author's Ph.D. thesis [2, Section 3.4]. The first to construct an elliptic second order equation for which the Cauchy problem does not have the uniqueness property was Pliš [5]. The first and zero order coefficients of his equation are smooth, but the leading coefficients are only α-Hölder continuous of any order $\alpha < 1$. This result is optimal, since for Lipschitz-continuous coefficients we always have uniqueness in the Cauchy problem (and for even stronger results, see [1]). Miller was concerned with the nonuniqueness in the Cauchy problem for the elliptic equation in divergence form
$$\sum_{i,j=1}^{n} \partial_i a_{ij} \partial_j u = 0, \tag{1}$$
and the backward nonuniqueness for the corresponding parabolic equation
$$\partial_t u = \sum_{i,j=1}^{n} \partial_i a_{ij} \partial_j u. \tag{2}$$
VTEXVR PIPS No: 155023 (mpagkap:mathfam) v.1.15 MPAG011.tex; 5/11/1998; 8:53; p.1
274
NICULAE MANDACHE
Here the matrix of coefficients $(a_{ij})$ is supposed real, symmetric, continuous and uniformly positive, i.e.,
$$\sum_{i,j=1}^{n} a_{ij} x_i x_j \ge C|x|^2, \qquad C > 0, \text{ for any } x \in \mathbb{R}^n.$$
The interesting aspects of Equations (1) and (2) lie in the fact that the first corresponds to symmetric operators in $L^2(\mathbb{R}^n)$ and the second is the evolution equation for such operators. They also have a physical meaning: (2) is the heat equation in a medium with specific heat 1 and with the thermal conductivity given by the matrix $(a_{ij})$. See [4] for further comments.
Our example is better than the one in [4] in the following ways:
(1) It allows optimal regularity by a precise choice of the parameters used in the construction. We obtain Hölder continuous coefficients of any order $\alpha < 1$, whereas in [4], Miller obtained only the order $\alpha = 1/6$. We also obtain a finer result: Suppose that $\omega: [0,\infty) \to [0,\infty)$ is concave, continuous, nondecreasing, $\omega(0) = 0$, $\omega(1) > 0$ and ω satisfies
$$\int_0^1 \frac{dt}{\omega(t)} < \infty.$$
Then we can choose the coefficients of our equation such that their moduli of continuity are majorized by ω.
(2) It is interesting to rephrase the problem as a system of inequalities for sequences of numbers. The inherent limits of the construction below suggest that the unique continuation property for Equation (1) might hold under the assumption that $a_{ij} \in W^{1,1}$.
(3) There is a simplification in the technical part which allows us to give explicit (though complicated) expressions of the coefficients.

THEOREM 1. There exist a smooth function u, smooth functions $b_{11}$, $b_{12}$, $b_{22}$, and continuous functions $d_1$, $d_2$ defined on $\mathbb{R}^3 \ni (t,x,y)$, with the following properties:
(i) u is the solution of the equation
$$\partial_t^2 u + \partial_x((b_{11} + d_1)\partial_x u) + \partial_y(b_{12}\partial_x u) + \partial_x(b_{12}\partial_y u) + \partial_y((b_{22} + d_2)\partial_y u) = 0. \tag{3}$$
(ii) There is a $T > 0$ such that $\operatorname{supp} u = (-\infty, T] \times \mathbb{R}^2$.
(iii) u, $b_{ij}$ and $d_i$ are periodic in x and in y with period 2π.
(iv) For any $t \in \mathbb{R}$, $u(t,\cdot,\cdot)$ satisfies the Neumann boundary condition on $(0,2\pi)^2$ with respect to Equation (3) (seen as an equation in the variables x and y).
(v) $d_1$ and $d_2$ do not depend on x and y and are Hölder continuous of order α for all $\alpha < 1$.
(vi) $\dfrac{1}{2} < \begin{pmatrix} d_1 + b_{11} & b_{12} \\ b_{12} & d_2 + b_{22} \end{pmatrix} < 2$ on $\mathbb{R}^3$.
Furthermore, there are also functions as above, satisfying conditions (i)–(vi) except that (3) is replaced with the parabolic equation
$$\partial_t u = \partial_x((b_{11} + d_1)\partial_x u) + \partial_y(b_{12}\partial_x u) + \partial_x(b_{12}\partial_y u) + \partial_y((b_{22} + d_2)\partial_y u). \tag{4}$$

REMARK. Equation (3) can be seen, given the periodicity condition (iii), as an abstract equation for an $L^2(\mathbb{R}^2/2\pi\mathbb{Z}^2)$-valued function: $u'' = A(t)u$. Here $A(t)$ is an elliptic operator on the torus, which is positive in $L^2(\mathbb{R}^2/2\pi\mathbb{Z}^2)$. Thus our theorem asserts the existence of an $A(t)$ such that the Cauchy problem for the above equation does not have the uniqueness property. The interesting aspect of point (iv) of the theorem is that the above A can be replaced with an elliptic selfadjoint operator on $L^2((0,2\pi)^2)$, with Neumann boundary condition.

Idea of the proof. We start from the operator $\Delta = \partial_t^2 + \Delta_{xy}$, and from its solutions $e^{-\lambda t}\cos\lambda x$ and $e^{-\lambda t}\cos\lambda y$. It is convenient to view the operator being constructed (appearing in (3)) as a perturbation of Δ. The above solutions of Δ decay with t; the bigger λ is, the faster the decay. We will 'glue' an infinite number of them, corresponding to the frequencies $\lambda = \lambda_k$, such that $\lambda_k \to \infty$ as $k \to \infty$. In this way, as $t \uparrow T$ the solution will be, for shorter and shorter intervals of time, proportional to $e^{-\lambda_k t}\cos\lambda_k x$, then to $e^{-\lambda_{k+1} t}\cos\lambda_{k+1} y$ and so on. In these intervals the equation is $\partial_t^2 u + \Delta_{xy} u = 0$. In the gaps between them, we will modify the coefficients so as to fit a prescribed solution, which passes smoothly from $e^{-\lambda_k t}\cos\lambda_k x$ to $e^{-\lambda_{k+1} t}\cos\lambda_{k+1} y$. Choosing suitable $\lambda_k$ and suitable lengths of the intervals and of the gaps, we obtain a smooth solution which vanishes in finite time. In fact the solution is also decaying in the gaps and we can choose intervals of length zero.
The first part of the proof consists of constructing generic functions $v, B_{ij}, D_i: [0,5a] \times \mathbb{R}^2 \to \mathbb{R}$, $i, j = 1, 2$, which describe the solution and the coefficients in a gap. They depend on the following parameters: $a > 0$ gives the length (in time) of the domain of definition, $\lambda > 1/a$ is the old frequency, $\lambda' > \lambda$ is the new frequency and $\rho \in (0, \lambda/\lambda')$ is a technical parameter. These functions satisfy the equality
$$\partial_t^2 v + \partial_x((B_{11} + D_1)\partial_x v) + \partial_x(B_{12}\partial_y v) + \partial_y(B_{12}\partial_x v) + \partial_y((B_{22} + D_2)\partial_y v) = 0 \tag{5}$$
on $[0,5a] \times \mathbb{R}^2$, and do the required job of gluing, i.e., there is an $\varepsilon > 0$ such that:
$B_{ij} = \delta_{ij}$, $D_i = 0$ for $t \in [0,\varepsilon) \cup (5a-\varepsilon, 5a]$,
$v(t,x,y) = e^{-t\lambda}\cos\lambda x$ for $t \in [0,\varepsilon)$,
$v(t,x,y)$ is proportional to $e^{-t\lambda'}\cos\lambda' y$ for $t \in (5a-\varepsilon, 5a]$.
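In the second stage of the proof we will construct the functions $u, b_{ij}, d_i: \mathbb{R}^3 \to \mathbb{R}$, which satisfy the conclusions of the theorem. This is done by putting together an infinite number of instances of this generic construction, with appropriate values for the parameters.

Construction of the generic $v$, $B_{ij}$, $D_i$. Let $\chi: \mathbb{R} \to [0,1]$ be a smooth function with $\chi(t) = 1$ in a neighbourhood of $[1,\infty)$ and $\chi(t) = 0$ in a neighbourhood of $(-\infty, 0]$. Each of the intervals $[(i-1)a, ia]$, with $i = 1, \ldots, 5$ (henceforth called steps) will have a precise job. We will describe the functions v, $B_{ij}$ and $D_i$ in each of them.
The first step serves to produce a smooth decay of $B_{22} + D_2$ from 1 to $\rho^2$:
$$v = e^{-\lambda t}\cos\lambda x, \quad B_{11} = B_{22} = 1, \quad B_{12} = D_1 = 0, \quad D_2 = \chi\!\left(\frac{t}{a}\right)(\rho^2 - 1). \tag{6}$$
Since v does not depend on y, the last term in the l.h.s. of (5) vanishes and therefore (5) is satisfied for arbitrary $D_2$.
The second step is the 'seed' step and serves to introduce a tiny component of the solution oscillating in y:
$$v = e^{-\lambda t}\cos\lambda x + \tilde c\,\chi\!\left(\frac{t-a}{a}\right)e^{-\rho\lambda' t}\cos\lambda' y. \tag{7}$$
The constant factor
$$\tilde c \overset{\text{def}}{=} e^{\frac{5a}{2}(\rho\lambda' - \lambda)} \tag{8}$$
serves to make the two components of the solution (one oscillating in x and one in y) of equal amplitude at $t = \frac{5a}{2}$, the middle of the third step. We put
$$B_{22} = 1, \qquad D_1 = 0, \qquad D_2 = \rho^2 - 1,$$
and we construct below $B_{11} \overset{\text{def}}{=} 1 + \tilde B$ and $B_{12}$. Equation (5) reads:
$$\lambda^2 e^{-\lambda t}\cos\lambda x + \tilde c\left[\frac{1}{a^2}\chi''\!\left(\frac{t-a}{a}\right) - \frac{2}{a}\chi'\!\left(\frac{t-a}{a}\right)\rho\lambda' + \chi\!\left(\frac{t-a}{a}\right)\rho^2\lambda'^2\right]e^{-\rho\lambda' t}\cos\lambda' y$$
$$+\ \partial_x\!\left((1 + \tilde B)\,\partial_x e^{-\lambda t}\cos\lambda x\right) + \partial_x\!\left(B_{12}\,\partial_y\,\tilde c\,\chi\!\left(\frac{t-a}{a}\right)e^{-\rho\lambda' t}\cos\lambda' y\right)$$
$$+\ \partial_y\!\left(B_{12}\,\partial_x e^{-\lambda t}\cos\lambda x\right) + \partial_y\!\left(\rho^2\,\partial_y\,\tilde c\,\chi\!\left(\frac{t-a}{a}\right)e^{-\rho\lambda' t}\cos\lambda' y\right) = 0.$$
After reductions:
$$\tilde c\left[\frac{1}{a^2}\chi''\!\left(\frac{t-a}{a}\right) - \frac{2}{a}\rho\lambda'\chi'\!\left(\frac{t-a}{a}\right)\right]e^{-\rho\lambda' t}\cos\lambda' y + \partial_x\!\left(-\tilde B e^{-\lambda t}\lambda\sin\lambda x\right)$$
$$+\ \partial_x\!\left(-B_{12}\,\tilde c\,\chi\!\left(\frac{t-a}{a}\right)e^{-\rho\lambda' t}\lambda'\sin\lambda' y\right) + \partial_y\!\left(-B_{12}\,e^{-\lambda t}\lambda\sin\lambda x\right) = 0.$$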
Simplifying this equation by e−λt and using the notation 1 00 t − a 2ρλ0 0 t − a (λ−ρλ0 )t χ(t) ˜ = c˜ e χ − χ , a2 a a a
(9)
we obtain the equivalent relation
t − a (λ−ρλ0)t 0 ˜ e ˜ sin λ0 y ∂x B12 + χ(t) ˜ cos λ y = λ∂x (B sin λx) + λ cχ a + λ sin λx∂y B12 . 0
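Then if we choose first $B_{12}$ from the above relation, $\tilde B\lambda\sin\lambda x$ has to be the primitive of some function (depending on $y$ and $t$ as parameters). But this is only possible if that function has zero integral from $k\pi/\lambda$ to $(k+1)\pi/\lambda$, in order to allow the primitive to have zeros at $x = k\pi/\lambda$. To this end we take
\[ B_{12}(t,x,y) = \tilde\chi(t)\,\frac{2\sin\lambda x\,\sin\lambda' y}{\lambda\lambda'}. \tag{10} \]
Then the above relation becomes:
\[
\lambda\,\partial_x(\tilde B\sin\lambda x) = \tilde\chi(t)\cos\lambda' y - \lambda\sin\lambda x\,\tilde\chi(t)\,\frac{2\sin\lambda x\;\lambda'\cos\lambda' y}{\lambda\lambda'} - \lambda'\,\tilde c\,\chi\Big(\frac{t-a}{a}\Big) e^{(\lambda-\rho\lambda')t}\sin\lambda' y\;\tilde\chi(t)\,\frac{2\lambda\cos\lambda x\,\sin\lambda' y}{\lambda\lambda'},
\]
and this yields further simplifying by $\lambda$:
\[
\partial_x(\tilde B\sin\lambda x) = \tilde\chi(t)\cos\lambda' y\,\frac{1-2\sin^2\lambda x}{\lambda} - \tilde c\,\chi\Big(\frac{t-a}{a}\Big) e^{(\lambda-\rho\lambda')t}\,\tilde\chi(t)\,\frac{2\cos\lambda x\,\sin^2\lambda' y}{\lambda},
\]
and since $\int (1-2\sin^2\lambda x)\,dx = \sin\lambda x\cos\lambda x/\lambda + C$, we obtain by integration from $0$ to $x$ with respect to $x$ and then simplification by $\sin\lambda x$:
\[ \tilde B(t,x,y) = \tilde\chi(t)\left[\frac{\cos\lambda' y\,\cos\lambda x}{\lambda^2} - \tilde c\,e^{(\lambda-\lambda'\rho)t}\,\chi\Big(\frac{t-a}{a}\Big)\frac{2\sin^2\lambda' y}{\lambda^2}\right]. \tag{11} \]
The third step has the coefficients
\[ B_{11} = 1, \qquad B_{12} = D_1 = 0, \qquad B_{22} = 1, \qquad D_2 = \rho^2 - 1 \]
and the solution is
\[ v = e^{-\lambda t}\cos\lambda x + \tilde c\,e^{-\rho\lambda' t}\cos\lambda' y. \]
This step serves to propagate the two components with different speeds. Although the second term (depending on $y$) has a space frequency $\lambda' > \lambda$, its decay rate is smaller than that of the term depending on $x$, since $\rho\lambda' < \lambda$.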
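As a sanity check on the second step, the choice (10) of $B_{12}$ and the resulting $\tilde B$ in (11) can be tested against the relation they were derived from. The sketch below is only an illustration: it freezes $t$, so that `chi_t` stands for the number $\tilde\chi(t)$ and `g` for the number $\tilde c\,\chi((t-a)/a)\,e^{(\lambda-\rho\lambda')t}$ (both names are assumptions introduced here), and it approximates the $x$ and $y$ derivatives by central differences.

```python
import math

def step2_residual(lam, lam_p, chi_t, g, x, y, h=1e-5):
    """Left side minus right side of the relation
        chi~ cos(l'y) = l d_x(B~ sin(l x)) + l' g sin(l'y) d_x B12 + l sin(l x) d_y B12
    at a frozen time, with B12 taken from (10) and B~ from (11).
    chi_t and g are plain numbers standing in for the time-dependent factors."""
    def b12(x, y):
        return chi_t * 2.0 * math.sin(lam * x) * math.sin(lam_p * y) / (lam * lam_p)

    def b_tilde(x, y):
        return chi_t * (math.cos(lam_p * y) * math.cos(lam * x)
                        - 2.0 * g * math.sin(lam_p * y) ** 2) / lam ** 2

    def f(x, y):  # the product B~ sin(lam x) that gets differentiated in x
        return b_tilde(x, y) * math.sin(lam * x)

    # central finite differences for the three derivatives that appear
    dx_f = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    dx_b12 = (b12(x + h, y) - b12(x - h, y)) / (2.0 * h)
    dy_b12 = (b12(x, y + h) - b12(x, y - h)) / (2.0 * h)

    lhs = chi_t * math.cos(lam_p * y)
    rhs = (lam * dx_f
           + lam_p * g * math.sin(lam_p * y) * dx_b12
           + lam * math.sin(lam * x) * dy_b12)
    return lhs - rhs

if __name__ == "__main__":
    print(step2_residual(3, 4, 0.7, 0.3, 0.37, 1.21))
```

The residual should vanish up to the finite-difference error for any sample point and any values of the frozen factors, since the identity is algebraic in $x$ and $y$.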
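The fourth step is symmetric to the second one and the construction is similar. Its purpose is to remove the component of $v$ oscillating in $x$, which has become small with respect to the other component.
\[ v = \chi\Big(\frac{4a-t}{a}\Big) e^{-\lambda t}\cos\lambda x + \tilde c\,e^{-\rho\lambda' t}\cos\lambda' y. \]
Here the roles of $x$ and $y$ have changed. We have
\[ B_{11} = 1, \qquad D_1 = 0, \qquad D_2 = \rho^2 - 1 \]
and $B_{12}$, $B_{22} \overset{\mathrm{def}}{=} 1 + \tilde{\tilde B}$ are computed below. Equation (5) gives
\[
\begin{aligned}
&\partial_t^2\Big(\chi\Big(\frac{4a-t}{a}\Big) e^{-\lambda t}\cos\lambda x + \tilde c\,e^{-\rho\lambda' t}\cos\lambda' y\Big) + \partial_x\Big(1\cdot\partial_x\,\chi\Big(\frac{4a-t}{a}\Big) e^{-\lambda t}\cos\lambda x\Big) \\
&\quad + \partial_y(B_{12}\,\partial_x v) + \partial_x(B_{12}\,\partial_y v) + \partial_y\Big(\big(\rho^2 + \tilde{\tilde B}\big)\,\partial_y\,\tilde c\,e^{-\rho\lambda' t}\cos\lambda' y\Big) = 0
\end{aligned}
\]
(we substituted the actual value of $v$ only in the terms which are subject to reductions) and after reduction we obtain
\[
\begin{aligned}
&\left[\frac{1}{a^2}\chi''\Big(\frac{4a-t}{a}\Big) + \frac{2\lambda}{a}\chi'\Big(\frac{4a-t}{a}\Big)\right] e^{-\lambda t}\cos\lambda x + \partial_y\Big(B_{12}\,\chi\Big(\frac{4a-t}{a}\Big) e^{-\lambda t}\,\partial_x\cos\lambda x\Big) \\
&\quad + \partial_x\big(B_{12}\,\tilde c\,e^{-\rho\lambda' t}\,\partial_y\cos\lambda' y\big) + \partial_y\big(\tilde{\tilde B}\,\tilde c\,e^{-\rho\lambda' t}\,\partial_y\cos\lambda' y\big) = 0.
\end{aligned}
\]
Simplifying by $\tilde c\,e^{-\rho\lambda' t}$ and using the notation
\[ \tilde{\tilde\chi}(t) = \frac{e^{(\rho\lambda'-\lambda)t}}{\tilde c}\left[\frac{1}{a^2}\chi''\Big(\frac{4a-t}{a}\Big) + \frac{2\lambda}{a}\chi'\Big(\frac{4a-t}{a}\Big)\right], \tag{12} \]
the relation becomes
\[ \tilde{\tilde\chi}(t)\cos\lambda x = \partial_y B_{12}\;\chi\Big(\frac{4a-t}{a}\Big)\frac{e^{(\rho\lambda'-\lambda)t}}{\tilde c}\,\lambda\sin\lambda x + \lambda'\sin\lambda' y\;\partial_x B_{12} + \partial_y\big(\tilde{\tilde B}\,\lambda'\sin\lambda' y\big). \]
We choose
\[ B_{12} = \tilde{\tilde\chi}(t)\,\frac{\sin\lambda x}{\lambda}\,\frac{2\sin\lambda' y}{\lambda'}, \tag{13} \]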
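and taking the second term in the r.h.s. to the left in the relation above we obtain the equivalent relation
\[ \tilde{\tilde\chi}\cos\lambda x\,(1 - 2\sin^2\lambda' y) = \partial_y\left(\tilde{\tilde B}\,\lambda'\sin\lambda' y + \tilde{\tilde\chi}(t)\,\frac{\sin\lambda x}{\lambda}\,\frac{2\sin\lambda' y}{\lambda'}\times\chi\Big(\frac{4a-t}{a}\Big)\frac{e^{(\rho\lambda'-\lambda)t}}{\tilde c}\,\lambda\sin\lambda x\right). \]
Since $\partial_y\big(\sin\lambda' y\cos\lambda' y/\lambda'\big) = 1 - 2\sin^2\lambda' y$, the following relation ensures that (5) is fulfilled for $t \in [3a, 4a]$ (after simplification by $\sin\lambda' y$):
\[ \tilde{\tilde\chi}\cos\lambda x\,\frac{\cos\lambda' y}{\lambda'} = \tilde{\tilde B}\,\lambda' + \tilde{\tilde\chi}(t)\,\frac{2\sin^2\lambda x}{\lambda'}\,\chi\Big(\frac{4a-t}{a}\Big)\frac{e^{(\rho\lambda'-\lambda)t}}{\tilde c}, \]
that is,
\[ \tilde{\tilde B} = \frac{\tilde{\tilde\chi}(t)}{\lambda'^2}\left[\cos\lambda x\cos\lambda' y - 2\chi\Big(\frac{4a-t}{a}\Big)\frac{e^{(\rho\lambda'-\lambda)t}}{\tilde c}\sin^2\lambda x\right]. \tag{14} \]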
The aim of the fifth step is to increase the coefficient $B_{22} + D_2$ from the value $\rho^2$ to $1$, in order to get back to the values $B_{11} = B_{22} = 1$ and $B_{12} = D_1 = D_2 = 0$ (this ensures the continuity of coefficients in the final construction). As in the previous steps, it is simpler first to choose $v$ and then construct the coefficients accordingly. Let us define $\chi_1(t) = \int_0^t \chi(s)\,ds$. Then we have $\chi_1(t) = t + \chi_1(1) - 1$ in a neighbourhood of $[1, \infty)$ and $\chi_1(t) = 0$ in a neighbourhood of $(-\infty, 0]$. The solution is
\[ v = \tilde c\cos\lambda' y\,\exp\Big(-\lambda'\rho t - \lambda'(1-\rho)\,a\,\chi_1\Big(\frac{t-4a}{a}\Big)\Big). \]
The coefficients are $B_{11} = B_{22} = 1$, $B_{12} = D_1 = 0$ and
\[ D_2 = -\frac{\partial_t^2 v}{\partial_y^2 v} - 1 = \frac{\partial_t^2 v}{\lambda'^2 v} - 1 = \Big(\rho + (1-\rho)\chi\Big(\frac{t-4a}{a}\Big)\Big)^2 - \frac{1-\rho}{a\lambda'}\,\chi'\Big(\frac{t-4a}{a}\Big) - 1. \tag{15} \]
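Formula (15) can be checked numerically by comparing it with $\partial_t^2 v/(\lambda'^2 v) - 1$ computed by a second difference. The sketch below makes one loud assumption: it replaces the smooth cutoff $\chi$ by the polynomial smoothstep $3s^2 - 2s^3$, which is only $C^1$ and is used here purely as a stand-in; all function names are hypothetical.

```python
import math

def chi(s):
    """Stand-in cutoff (assumption): 0 for s <= 0, 1 for s >= 1, smoothstep between."""
    if s <= 0.0:
        return 0.0
    if s >= 1.0:
        return 1.0
    return 3.0 * s * s - 2.0 * s ** 3

def chi_prime(s):
    """Derivative of the stand-in cutoff."""
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return 6.0 * s - 6.0 * s * s

def chi1(s):
    """Integral of chi from 0 to s; chi1(1) = 1/2 for this stand-in."""
    if s <= 0.0:
        return 0.0
    if s >= 1.0:
        return s - 0.5
    return s ** 3 - 0.5 * s ** 4

def log_time_factor(t, a, lam_p, rho):
    """Exponent of the fifth-step solution: -lam' rho t - lam' (1 - rho) a chi1((t - 4a)/a)."""
    return -lam_p * rho * t - lam_p * (1.0 - rho) * a * chi1((t - 4.0 * a) / a)

def d2_formula(t, a, lam_p, rho):
    """Right-hand side of (15)."""
    s = (t - 4.0 * a) / a
    return ((rho + (1.0 - rho) * chi(s)) ** 2
            - (1.0 - rho) / (a * lam_p) * chi_prime(s) - 1.0)

def d2_numeric(t, a, lam_p, rho, h=1e-4):
    """v_tt / (lam'^2 v) - 1 by a central second difference, normalised by v(t)."""
    p0 = log_time_factor(t, a, lam_p, rho)
    up = math.exp(log_time_factor(t + h, a, lam_p, rho) - p0)
    dn = math.exp(log_time_factor(t - h, a, lam_p, rho) - p0)
    return (up - 2.0 + dn) / (h * h * lam_p ** 2) - 1.0

if __name__ == "__main__":
    for t in (4.2, 4.5, 4.8):
        print(t, d2_formula(t, 1.0, 4.0, 0.8), d2_numeric(t, 1.0, 4.0, 0.8))
```

With this stand-in, `d2_formula` also reproduces the endpoint values stated in the text: $\rho^2 - 1$ at $t = 4a$ and $0$ at $t = 5a$.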
We will now eliminate one of our parameters. The constant $\rho$ is very sensitive in our construction; in fact $1-\rho^2$ is the order of magnitude of the coefficient $D_2$. In steps 2 and 4 there is an exponential factor in $\tilde\chi(t)$ and in $\tilde{\tilde\chi}(t)$, which will manage to make the coefficients $B_{ij}$ (more precisely, $B_{ij} - \delta_{ij}$) small at little expense. Therefore, since we have the restriction $\rho < \lambda/\lambda'$, which gives $1 - \rho^2 > 1 - (\lambda/\lambda')^2$, we cannot do better (modulo a multiplicative constant) than choose $\rho = (\lambda/\lambda')^2$. We then have $1-\rho^2 \approx 2\,(1-(\lambda/\lambda')^2)$ for $\lambda/\lambda'$ close to $1$. In order to keep formulas to a reasonable complexity we will continue to use the constant $\rho$, substituting $\lambda^2/\lambda'^2$ for it when needed. We can express the solution in a single formula:
\[
v_{a,\lambda,\lambda'}(t,x,y) = \chi\Big(\frac{4a-t}{a}\Big) e^{-\lambda t}\cos\lambda x + \tilde c\,\chi\Big(\frac{t-a}{a}\Big)\exp\Big(-\frac{\lambda^2}{\lambda'}\,t - \Big(1-\frac{\lambda^2}{\lambda'^2}\Big)\lambda' a\,\chi_1\Big(\frac{t-4a}{a}\Big)\Big)\cos\lambda' y. \tag{16}
\]
Let us notice here that
\[
v_{a,\lambda,\lambda'}(t,x,y) =
\begin{cases}
e^{-\lambda t}\cos\lambda x & \text{in the neighbourhood of } 0,\\
\alpha(a,\lambda,\lambda')\,e^{-\lambda'(t-5a)}\cos\lambda' y & \text{in the neighbourhood of } 5a.
\end{cases} \tag{17}
\]
The constant $\alpha(a,\lambda,\lambda')$ is given by (8) and (16) with $t = 5a$:
\[ \alpha(a,\lambda,\lambda') = \exp\Big(-5a(\lambda + \lambda^2/\lambda')/2 - (1-\lambda^2/\lambda'^2)\,\lambda' a\,\chi_1(1)\Big) \le e^{-5a\lambda/2}. \tag{18} \]
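Estimates for the derivatives. We now compute the size of the derivatives of $v$ and $B_{ij}$ constructed above. For $D_i$, only the first order derivative is needed in the proof of Theorem 1, and we give a bound for it. Let $k$, $l$ and $m$ be three natural numbers, $k + l + m > 0$. Then during the second step, $B_{11} = 1 + \tilde B$, where $\tilde B$ is given by (11) and we have
\[
\partial_t^k\partial_x^l\partial_y^m B_{11} = \partial_t^k\partial_x^l\partial_y^m \tilde B = \partial_t^k\tilde\chi(t)\;\partial_x^l\partial_y^m\,\frac{\cos\lambda' y\cos\lambda x}{\lambda^2} - \partial_t^k\Big(\tilde\chi(t)\,\tilde c\,e^{t(\lambda-\lambda'\rho)}\chi\Big(\frac{t-a}{a}\Big)\Big)\;\partial_x^l\partial_y^m\,\frac{2\sin^2\lambda' y}{\lambda^2}. \tag{19}
\]
The $k$th derivative of $\tilde\chi$ is (see (9))
\[
\partial_t^k\tilde\chi(t) = \tilde c\sum_{j=0}^{k}\binom{k}{j}\,\partial_t^j e^{(\lambda-\rho\lambda')t}\;\partial_t^{k-j}\left[\frac{1}{a^2}\chi''\Big(\frac{t-a}{a}\Big) - \frac{2\rho\lambda'}{a}\chi'\Big(\frac{t-a}{a}\Big)\right],
\]
and its absolute value is bounded by
\[
\tilde c\,e^{(\lambda-\rho\lambda')t}\sum_{j=0}^{k}\binom{k}{j}(\lambda-\rho\lambda')^j\left[\frac{1}{a^{k-j+2}}\Big|\chi^{(k-j+2)}\Big(\frac{t-a}{a}\Big)\Big| + \frac{2\rho\lambda'}{a^{k-j+1}}\Big|\chi^{(k-j+1)}\Big(\frac{t-a}{a}\Big)\Big|\right].
\]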
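Let us set $C_{\chi,k} = \sup_{i\le k,\ t\in\mathbb{R}} |\chi^{(i)}(t)|$. Using $\lambda > \rho\lambda'$ and recalling that $\tilde c = e^{-5a(\lambda-\rho\lambda')/2}$, we infer that
\[ \tilde c\,e^{t(\lambda-\rho\lambda')} \le e^{-a(\lambda-\rho\lambda')/2} \qquad\text{for any } t \in [a, 2a]. \]
Now we use $\lambda > 1/a$ and obtain:
\[
\begin{aligned}
\big|\partial_t^k\tilde\chi\big| &\le \tilde c\,e^{(\lambda-\rho\lambda')t}\sum_{j=0}^{k}\binom{k}{j}(\lambda-\rho\lambda')^j\Big(\frac{1}{a}\Big)^{k-j}\left[\frac{1}{a^2}C_{\chi,k+2} + \frac{2\rho\lambda'}{a}C_{\chi,k+1}\right] \\
&\le \tilde c\,e^{(\lambda-\rho\lambda')t}\Big(\lambda + \frac{1}{a}\Big)^k 3\lambda^2 C_{\chi,k+2} \le e^{-a(\lambda-\rho\lambda')/2}\cdot 3\cdot 2^k C_{\chi,k+2}\,\lambda^{k+2}.
\end{aligned} \tag{20}
\]
The same kind of computation will give
\[
\begin{aligned}
\Big|\partial_t^k\Big(\tilde\chi(t)\,\tilde c\,e^{(\lambda-\lambda'\rho)t}\chi\Big(\frac{t-a}{a}\Big)\Big)\Big|
&= \tilde c^{\,2}\,\Big|\sum_{i+j+h=k}\binom{k}{i\,j\,h}\,\partial_t^i e^{2(\lambda-\rho\lambda')t}\;\partial_t^j\left[\frac{1}{a^2}\chi''\Big(\frac{t-a}{a}\Big) - \frac{2\rho\lambda'}{a}\chi'\Big(\frac{t-a}{a}\Big)\right]\partial_t^h\chi\Big(\frac{t-a}{a}\Big)\Big| \\
&\le \tilde c^{\,2} e^{2(\lambda-\rho\lambda')t}\sum_{i+j+h=k}\binom{k}{i\,j\,h}(2\lambda-2\rho\lambda')^i(1/a)^j\left[\frac{1}{a^2}C_{\chi,k+2} + \frac{2\rho\lambda'}{a}C_{\chi,k+1}\right](1/a)^h C_{\chi,k} \\
&\le \tilde c^{\,2} e^{2(\lambda-\rho\lambda')t}\cdot 3\cdot\lambda^2\,C_{\chi,k+2}C_{\chi,k}\Big(2\lambda + \frac{2}{a}\Big)^k \\
&\le e^{-a(\lambda-\rho\lambda')}\,C_{\chi,k+2}^2\cdot 3\cdot 4^k\lambda^{k+2}.
\end{aligned} \tag{21}
\]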
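Now we can estimate the derivatives of $B_{11}$ (see (19)) using (20) and (21):
\[
\begin{aligned}
\big|\partial_t^k\partial_x^l\partial_y^m B_{11}(t,x,y)\big| &\le e^{-a(\lambda-\rho\lambda')/2}\cdot 3\cdot 2^k C_{\chi,k+2}\,\lambda^{k+2}\,\frac{\lambda^l\lambda'^m}{\lambda^2} + e^{-a(\lambda-\rho\lambda')}\,C_{\chi,k+2}^2\cdot 3\cdot 4^k\lambda^{k+2}\,\frac{2^{m+1}\lambda'^m}{\lambda^2} \\
&\le e^{-a(\lambda-\rho\lambda')/2}\,C'_{\chi,k,m}\,\lambda^{k+l}\lambda'^m.
\end{aligned} \tag{22}
\]
Here the constant $C'_{\chi,k,m}$ depends only on $\chi$, $k$ and $m$. For coefficient $B_{12}$ the computation is simpler and we obtain in view of (10) and using estimate (20) and $\lambda' > \lambda$:
\[
\big|\partial_t^k\partial_x^l\partial_y^m B_{12}(t,x,y)\big| \le e^{-a(\lambda-\rho\lambda')/2}\cdot 3\cdot 2^k C_{\chi,k+2}\,\lambda^{k+2}\,\frac{2\lambda^l\lambda'^m}{\lambda\lambda'} \le e^{-a(\lambda-\rho\lambda')/2}\,C''_{\chi,k,m}\,\lambda^{k+l}\lambda'^m. \tag{23}
\]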
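For the fourth step the estimate is similar. We use
\[ \frac{e^{t(\rho\lambda'-\lambda)}}{\tilde c} \le e^{-a(\lambda-\rho\lambda')/2} \qquad\text{for any } t \in [3a, 4a], \]
and obtain in view of (12) that
\[ \big|\partial_t^k\tilde{\tilde\chi}(t)\big| \le e^{-a(\lambda-\rho\lambda')/2}\cdot 3\cdot 2^k C_{\chi,k+2}\,\lambda^{k+2}. \]
The computation is made in the same way and we obtain that $B_{22}$ satisfies (22) (replace $B_{11}$ by $B_{22}$ and $[a, 2a]$ by $[3a, 4a]$) and $B_{12}$ satisfies (23) for any $t$ in $[3a, 4a]$ (and any $x, y \in \mathbb{R}$). Since $B_{ij} = \delta_{ij}$ during the first, third and fifth steps, we conclude from relations (22) and (23), that we have for any $t \in [0, 5a]$:
\[ \big|\partial_t^k\partial_x^l\partial_y^m B_{ij}(t,x,y)\big| \le e^{-a(\lambda-\rho\lambda')/2}\,C'''_{\chi,k,l,m}\,\lambda^{k+l}\lambda'^m. \tag{24} \]
Now we turn to the derivatives of $v$. We have from (16):
\[
\partial_t^k\partial_x^l\partial_y^m v = \partial_t^k\Big(\chi\Big(\frac{4a-t}{a}\Big)e^{-t\lambda}\Big)\,\partial_x^l\partial_y^m\cos\lambda x + \tilde c\,\partial_t^k\Big(\chi\Big(\frac{t-a}{a}\Big)e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\Big)\,\partial_x^l\partial_y^m\cos\lambda' y. \tag{25}
\]
We first take care of the $t$ derivatives. Using $t \ge 0$:
\[
\big|\partial_t^k\big(\chi(4-t/a)\,e^{-t\lambda}\big)\big| \le e^{-t\lambda}\sum_{j=0}^{k}\binom{k}{j}(1/a)^j C_{\chi,j}\,\lambda^{k-j} \le e^{-t\lambda}(2\lambda)^k C_{\chi,k} \le (2\lambda)^k C_{\chi,k}. \tag{26}
\]
By induction we prove the existence of a constant $\tilde C_{\chi,k}$, depending only on $\chi$ and $k$, such that
\[ \big|\partial_t^k e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\big| \le \tilde C_{\chi,k}\,\lambda'^k. \tag{27} \]
This is true for $k = 0$ since the exponent is negative. We prove that if (27) holds for $k = 0, 1, \ldots, m$, then it also holds, for a certain $\tilde C_{\chi,m+1}$, for $k = m + 1$. Indeed,
\[
\begin{aligned}
\big|\partial_t^{m+1} e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\big|
&= \big|\partial_t^m\big[(-\rho\lambda' - (1-\rho)\lambda'\chi(t/a-4))\,e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\big]\big| \\
&\le \lambda'\sum_{j=0}^{m}\binom{m}{j}\big|\partial_t^j(-\rho - (1-\rho)\chi(t/a-4))\big|\;\big|\partial_t^{m-j} e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\big| \\
&\le \lambda'\sum_{j=0}^{m}\binom{m}{j}(1/a)^j C_{\chi,j}\,\tilde C_{\chi,m-j}\,\lambda'^{m-j} \le \tilde C_{\chi,m+1}\,\lambda'^{m+1}.
\end{aligned}
\]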
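We used $\lambda' > 1/a$. Applying (27) we obtain:
\[
\big|\partial_t^k\big(\chi(t/a-1)\,e^{-\rho\lambda' t-(1-\rho)\lambda' a\chi_1(t/a-4)}\big)\big| \le \sum_{j=0}^{k}\binom{k}{j}(1/a)^j C_{\chi,j}\,\tilde C_{\chi,k-j}\,\lambda'^{k-j} \le \tilde{\tilde C}_{\chi,k}\,\lambda'^k. \tag{28}
\]
Using (25), (26) and (28), we conclude that
\[ \big|\partial_t^k\partial_x^l\partial_y^m v\big| \le \hat C_{\chi,k}\,\lambda'^{k+m}\lambda^l. \tag{29} \]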
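It remains to estimate the derivative of $D_i$. The function $D_1$ is identically $0$, and $D_2$ is constant during the second, the third and the fourth steps (i.e., on $[a, 4a]$). We have, in view of (6) and (15):
\[ |\partial_t D_2| \le C_{\chi,1}(1-\rho^2)/a \le 2C_{\chi,1}(1-\rho)/a \qquad\text{for any } t \in [0, a], \]
\[
\begin{aligned}
|\partial_t D_2| &= \left|\frac{2\rho(1-\rho)}{a}\chi'\Big(\frac{t-4a}{a}\Big) + \frac{2(1-\rho)^2}{a}\chi\Big(\frac{t-4a}{a}\Big)\chi'\Big(\frac{t-4a}{a}\Big) - \frac{1-\rho}{a^2\lambda'}\chi''\Big(\frac{t-4a}{a}\Big)\right| \\
&\le (1-\rho)\big(2C_{\chi,1}/a + 2C_{\chi,1}C_{\chi,0}/a + C_{\chi,2}/a\big) \le 5C_{\chi,2}(1-\rho)/a \qquad\text{for any } t \in [4a, 5a],
\end{aligned}
\]
and we can conclude that
\[ |\partial_t D_i| \le 5C_{\chi,2}\,(1-(\lambda/\lambda')^2)/a \qquad\text{for any } t \in [0, 5a]. \tag{30} \]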
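Boundary conditions. A function $u$ satisfies the Neumann boundary condition for Equation (1) in the open set $\Omega \subset \mathbb{R}^n$ if and only if
\[ \sum_{i,j=1}^{n} n_i\,a_{ij}\,\partial_j u(x) = 0 \qquad\text{for any } x \in \partial\Omega, \]
where $(n_1, \ldots, n_n)$ is the normal vector to $\partial\Omega$. We want our function $v$ to satisfy this condition for Equation (5), seen in the variables $x$ and $y$, in the open set $(0, 2\pi)\times(0, 2\pi)$. In this case, the above relation reads:
\[ (B_{11} + D_1)\,\partial_x v + B_{12}\,\partial_y v = 0 \qquad\text{on } \{0, 2\pi\}\times[0, 2\pi], \tag{31} \]
\[ B_{12}\,\partial_x v + (B_{22} + D_2)\,\partial_y v = 0 \qquad\text{on } [0, 2\pi]\times\{0, 2\pi\}. \tag{32} \]
We have
\[ \partial_x v = \chi(4-t/a)\,e^{-t\lambda}\,(-\lambda\sin\lambda x), \qquad \partial_y v = \tilde c\,\chi(t/a-1)\,e^{-t\rho\lambda'-(1-\rho)\lambda' a\chi_1(t/a-4)}\,(-\lambda'\sin\lambda' y). \]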
Since $B_{12}$ is a multiple of $\sin\lambda x\,\sin\lambda' y$ (see (10) and (13)), the conditions
\[ \lambda \in \mathbb{N}, \qquad \lambda' \in \mathbb{N} \tag{33} \]
are sufficient for ensuring the boundary conditions (31) and (32). These relations will imply that $u$, $b_{ij}$ and $d_i$ constructed below fulfill condition (iv) of Theorem 1. They satisfy also the periodicity condition (iii).

Proof of Theorem 1. Let $\{a_k\}_{k\ge 1}$ and $\{\lambda_k\}_{k\ge 1}$ be two sequences of positive numbers. We will suppose
\[ \sum_{j=1}^{\infty} a_j < \infty \qquad\text{and}\qquad 1/a_k < \lambda_k < \lambda_{k+1}. \tag{34} \]
We denote $T_k = \sum_{j=1}^{k-1} a_j$ for $k \ge 1$ and $T = \sum_{j=1}^{\infty} a_j$. The sequence $\{\rho_k\}_{k\ge 1}$ is defined by $\rho_k = \lambda_k^2/\lambda_{k+1}^2$. We postpone the choice of these sequences as much as we can, in order to first derive all the conditions they have to fulfill. We shall use the indices $a, \lambda, \lambda'$ for the functions $B_{ij}$ and $D_i$, with $i, j = 1, 2$ (similarly to (16)) since we will use them for different values of these parameters. Let $k_0 > 0$ be an even natural number, to be chosen later. We are ready for the definition of the functions $u$, $b_{ij}$ and $d_i$.
\[
u(t,x,y) =
\begin{cases}
e^{-(t-T_{k_0})\lambda_{k_0}}\cos\lambda_{k_0}x & \text{for all } t \in (-\infty, T_{k_0}],\\
c_k\,v_{a_k,\lambda_k,\lambda_{k+1}}(t-T_k, x, y) & \text{for } k \text{ even},\ \forall t \in [T_k, T_{k+1}],\ \forall k \ge k_0,\\
c_k\,v_{a_k,\lambda_k,\lambda_{k+1}}(t-T_k, y, x) & \text{for } k \text{ odd},\ \forall t \in [T_k, T_{k+1}],\ \forall k \ge k_0,\\
0 & \text{for all } t \in [T, \infty).
\end{cases} \tag{35}
\]
Here $c_k$ are constants which ensure the continuity (and therefore, the smoothness) of $u$. They are defined by the relations $c_{k_0} = 1$, $c_{k+1} = \alpha(a_k, \lambda_k, \lambda_{k+1})\,c_k$, where $\alpha(a, \lambda, \lambda')$ is defined by relation (18). We have therefore (see (18)):
\[ c_k \le \exp\Big(-\frac{5}{2}\sum_{j=k_0}^{k-1} a_j\lambda_j\Big). \tag{36} \]
The coefficients are
\[
b_{ij}(t,x,y) =
\begin{cases}
\delta_{ij} & \text{for any } t \notin [T_{k_0}, T),\\
B_{ij\;a_k,\lambda_k,\lambda_{k+1}}(t-T_k, x, y) & \text{for } t \in [T_k, T_{k+1}] \text{ with } k \text{ even},\\
B_{\bar\jmath\,\bar\imath\;a_k,\lambda_k,\lambda_{k+1}}(t-T_k, y, x) & \text{for } t \in [T_k, T_{k+1}] \text{ with } k \text{ odd},
\end{cases}
\]
for all $i, j = 1, 2$ with $i \le j$, where $\bar\imath = 3 - i$ and $\bar\jmath = 3 - j$. This inversion is necessary since the derivatives with respect to $x$ and $y$ – and therefore the coefficients – swap their roles in the odd intervals. The singular coefficients are defined in a similar manner:
\[
d_i(t,x,y) =
\begin{cases}
0 & \text{for any } t \notin [T_{k_0}, T),\\
D_{i\;a_k,\lambda_k,\lambda_{k+1}}(t-T_k, x, y) & \text{for } t \in [T_k, T_{k+1}] \text{ with } k \text{ even},\\
D_{\bar\imath\;a_k,\lambda_k,\lambda_{k+1}}(t-T_k, y, x) & \text{for } t \in [T_k, T_{k+1}] \text{ with } k \text{ odd}.
\end{cases}
\]
The above $u$, $b_{ij}$ and $d_i$ fulfill Equation (3): indeed, they are obtained by simple changes of variables (the translation $t \to T_k + t$ and the symmetry which reverses the roles of $x$ and $y$) from the functions satisfying (5). Notice that $B_{ij\;a,\lambda,\lambda'} = \delta_{ij}$ for $t$ in a neighbourhood of $0$ or in a neighbourhood of $5a$, and therefore $b_{ij}$ are smooth in $(\mathbb{R}\setminus\{T\})\times\mathbb{R}^2$. In order to obtain $b_{ij}$ being smooth at $t = T$ too, it is enough that all their derivatives are continuous and have the limit $0$ as $t \uparrow T$. In view of (24), we have for any $i, j = 1, 2$:
\[
\sup_{\substack{t\in[T_k,T_{k+1}]\\ x,y\in\mathbb{R}}}\big|\partial_t^p\partial_x^l\partial_y^m b_{ij}(t,x,y)\big| \le C'''_{\chi,p,l,m}\,e^{-a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2}\,\lambda_k^{p+l}\lambda_{k+1}^m
\]
and due to the monotony of $\{\lambda_k\}$ the following condition ensures that $b_{ij}$ are smooth on $\mathbb{R}^3$:
\[ \lim_{k\to\infty} e^{-a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2}\,\lambda_{k+1}^m = 0 \qquad\text{for any } m \in \mathbb{N}. \tag{37} \]
Note that if we suppose $d_i$ continuous, then $\lim_{k\to\infty}(1-\rho_k^2) = 0$, since $d_i$ takes the value $-(1-\rho_k^2)$ on a subset of $[T_k, T_{k+1}]$, for $i = 2$ for even $k$ and $i = 1$ for odd $k$, and $d_i = 0$ for $t \ge T$ for $i = 1, 2$. This implies that $\rho_k \to 1$, and since $\rho_k = \lambda_k^2/\lambda_{k+1}^2$, we have
\[ \lim_{k\to\infty}\lambda_k/\lambda_{k+1} = 1. \tag{38} \]
For the smoothness of $u$ we use relation (29), and obtain
\[ \big|\partial_t^p\partial_x^l\partial_y^m u(t,x,y)\big| \le c_k\,\hat C_{\chi,p}\,\lambda_k^{p+l}\lambda_{k+1}^m \qquad \forall k \ge k_0,\ \forall t \in [T_k, T_{k+1}],\ \forall x, y \in \mathbb{R}, \]
and in view of (36) a sufficient condition for the smoothness of $u$ is
\[ \lim_{k\to\infty}\exp\Big(-\frac{5}{2}\sum_{j=k_0}^{k-1} a_j\lambda_j\Big)\lambda_{k+1}^m = 0 \qquad\text{for any } m \in \mathbb{N}. \tag{39} \]
Due to relation (38), we can replace in the limit above $\lambda_{k+1}^m$ by $\lambda_k^m$ or, equivalently, take the sum under exponential from $k_0$ to $k$. We have
\[ -\frac{5}{2}\sum_{j=k_0}^{k} a_j\lambda_j \le -a_k\lambda_k/2 \le -a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2 \]
286
NICULAE MANDACHE
and therefore (39) is a consequence of (37). Since we will put conditions on $\lambda_k$ and $a_k$ that ensure the continuity of $d_1$ and $d_2$ (hence (38) holds), we will omit condition (39).

Continuity of $d_i$. We will prove that for $i \in \{1, 2\}$ we have:
\[ |d_i(t_1) - d_i(t_2)| \le 10\,C_{\chi,2}\sup_{k\ge k_0}\Big[\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t_1-t_2|/a_k)\Big], \qquad \forall t_1, t_2 \in \mathbb{R}. \tag{40} \]
In order to do so, we show that for any $t_1$ and $t_2$ there is a $k \ge k_0$ such that:
\[ |d_i(t_1) - d_i(t_2)| \le 10\,C_{\chi,2}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t_1-t_2|/a_k). \tag{41} \]
Since $\bigcup_{k=k_0}^{\infty}[T_k, T_{k+1}] = [T_{k_0}, T)$, there are three cases to treat:

(a) There is a $k \ge k_0$ such that $t_1, t_2 \in [T_k, T_{k+1}]$.
(b) One of the $t_i$ belongs to $\mathbb{R}\setminus[T_{k_0}, T)$.
(c) $t_1 \in [T_{k_1}, T_{k_1+1}]$ and $t_2 \in [T_{k_2}, T_{k_2+1}]$, with $k_1 \ne k_2$.

Case (a). Using the theorem of Cauchy, and the upper bound of the derivative of $d_i$ given by (30), we obtain:
\[ |d_i(t_1) - d_i(t_2)| \le |t_1-t_2|\;5C_{\chi,2}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)/a_k. \]
Using further that $t_1, t_2 \in [T_k, T_{k+1}] \Rightarrow |t_1-t_2| \le T_{k+1}-T_k = 5a_k$, we obtain
\[ |d_i(t_1) - d_i(t_2)| \le 5C_{\chi,2}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t_1-t_2|/a_k). \]
Case (b). Suppose that $t_1 \notin [T_{k_0}, T)$. Then $d_i(t_1) = 0$. If $t_2$ is also outside this interval, then $d_i(t_2) = d_i(t_1) = 0$ and there is nothing to prove. So, we may suppose that $t_2 \in [T_k, T_{k+1}]$, with $k \ge k_0$. Then one of $T_k$ and $T_{k+1}$ (let us denote it by $t_1'$) must lie between $t_1$ and $t_2$ (or equal $t_2$). Then $|t_1-t_2| \ge |t_1'-t_2|$ and since $d_i(T_k) = d_i(T_{k+1}) = 0$, we have $d_i(t_1') = 0 = d_i(t_1)$. Applying the case (a) to $t_1'$ and $t_2$ we obtain:
\[
|d_i(t_1) - d_i(t_2)| = |d_i(t_1') - d_i(t_2)| \le 5C_{\chi,2}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t_1'-t_2|/a_k) \le 5C_{\chi,2}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t_1-t_2|/a_k).
\]
Case (c). The method is similar to the one used in case (b). Suppose $t_1 \in [T_{k_1}, T_{k_1+1}]$ and $t_2 \in [T_{k_2}, T_{k_2+1}]$ with $k_1 \ne k_2$. By symmetry we may suppose that $t_1 < t_2$, hence $k_1 < k_2$. Let $t_1' = T_{k_1+1}$ and $t_2' = T_{k_2}$. Then we have $d_i(t_j') = 0$ for $j = 1, 2$ and
\[
\begin{aligned}
|d_i(t_1) - d_i(t_2)| &\le |d_i(t_1)| + |d_i(t_2)| = |d_i(t_1) - d_i(t_1')| + |d_i(t_2) - d_i(t_2')| \\
&\le 5C_{\chi,2}\big(1-\lambda_{k_1}^2/\lambda_{k_1+1}^2\big)\min(5, |t_1-t_1'|/a_{k_1}) + 5C_{\chi,2}\big(1-\lambda_{k_2}^2/\lambda_{k_2+1}^2\big)\min(5, |t_2-t_2'|/a_{k_2}) \\
&\le 10\,C_{\chi,2}\max_{j=1,2}\Big[\big(1-\lambda_{k_j}^2/\lambda_{k_j+1}^2\big)\min(5, |t_j-t_j'|/a_{k_j})\Big].
\end{aligned}
\]
The proof of (41) is complete. We turn back to Theorem 1, condition (v). In order to obtain Hölder continuous coefficients of any order $0 < \alpha < 1$ our sequences $\{a_k\}$ and $\{\lambda_k\}$ must satisfy (in view of (41)):
\[ \forall\alpha \in (0, 1)\ \exists C > 0 \text{ s.t. } \big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, |t|/a_k) \le C t^{\alpha}, \qquad \forall t > 0,\ \forall k \ge k_0. \]
Since the r.h.s. is concave and increasing, while the l.h.s. is linear on $[0, 5a_k]$ and constant on $[5a_k, \infty)$ and is continuous, it is enough to check the inequality at $t = 5a_k$. In this way we obtain the condition:
\[ \forall\alpha < 1\ \exists C > 0 \text{ s.t. } 1-\lambda_k^2/\lambda_{k+1}^2 \le C a_k^{\alpha}, \qquad \forall k \ge k_0. \]
Summarising, we need two sequences $\{a_k\}_{k\ge 1}$ and $\{\lambda_k\}_{k\ge 1}$ which must satisfy:

(α) $\sum_{1}^{\infty} a_k < \infty$ (the construction is to be achieved in finite time).
(β) $1/a_k < \lambda_k < \lambda_{k+1}$ (technical condition).
(γ) $\lambda_k \in \mathbb{N}$ (in order to ensure the $2\pi$-periodicity and the boundary conditions).
(δ) $\lim_{k\to\infty} e^{-a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2}\,\lambda_{k+1}^m = 0$ for any $m \in \mathbb{N}$ (to ensure the smoothness of $b_{ij}$ and implicitly that of $u$).
(ε) $\forall\alpha < 1\ \exists C > 0$ s.t. $1-\lambda_k^2/\lambda_{k+1}^2 \le C a_k^{\alpha}$ for any $k \ge k_0$ (the Hölder continuity of $d_1$, $d_2$, of any order $\alpha < 1$).

The following sequences satisfy all these conditions:
\[ \lambda_k = (k+1)^3, \qquad a_k = \big(k\ln^2(k+1)\big)^{-1}. \]
Condition (α) is easy to prove, and also (β), and (γ). We have for (δ):
\[
e^{-a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2}\,\lambda_{k+1}^m = \exp\Big(-\frac{(k+1)^3-(k+1)^6/(k+2)^3}{2k\ln^2(k+1)}\Big)(k+2)^{3m} = \exp\Big(-(k+1)^3\,\frac{3k^2+9k+7}{2(k+2)^3\,k\ln^2(k+1)}\Big)(k+2)^{3m}.
\]
The exponent is asymptotically
\[ -(k+1)^3\,\frac{3k^2+9k+7}{2(k+2)^3\,k\ln^2(k+1)} = -\big(1+O(1/k)\big)\,\tfrac{3}{2}\,k\ln^{-2}(k+1) \]
and therefore the whole expression above has limit zero as $k \to \infty$. For condition (ε) we have:
\[ 1-\lambda_k^2/\lambda_{k+1}^2 = 1-(k+1)^6/(k+2)^6 = \frac{6k^5+45k^4+\cdots+63}{(k+2)^6} \le Ck^{-1}, \qquad C > 0, \]
and since $\lim_{k\to\infty} k^{-1+\alpha}\ln^{2\alpha}(k+1) = 0$ for any $\alpha < 1$, we have
\[ \forall\alpha < 1\ \exists C_\alpha > 0 \text{ such that } 1-\lambda_k^2/\lambda_{k+1}^2 \le C_\alpha\,k^{-\alpha}\ln^{-2\alpha}(k+1). \]
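The conditions on the sequences $\lambda_k = (k+1)^3$, $a_k = (k\ln^2(k+1))^{-1}$ lend themselves to a quick numerical illustration. The sketch below is not a proof: it only evaluates the quantities in (α), (β), (δ) and (ε) on finite ranges, and the function names are introduced here for illustration.

```python
import math

def lam(k):
    """lambda_k = (k + 1)^3, an integer as required by condition (gamma)."""
    return (k + 1) ** 3

def a(k):
    """a_k = 1 / (k ln^2(k + 1))."""
    return 1.0 / (k * math.log(k + 1) ** 2)

def log_delta_term(k, m):
    """Logarithm of exp(-a_k (lam_k - lam_k^2 / lam_{k+1}) / 2) * lam_{k+1}^m,
    the quantity that condition (delta) requires to tend to zero."""
    gap = lam(k) - lam(k) ** 2 / lam(k + 1)
    return -a(k) * gap / 2.0 + m * math.log(lam(k + 1))

def holder_ratio(k, alpha):
    """(1 - lam_k^2 / lam_{k+1}^2) / a_k^alpha, which condition (epsilon)
    requires to stay bounded in k."""
    numerator = lam(k + 1) ** 2 - lam(k) ** 2   # exact integer arithmetic
    return (numerator / lam(k + 1) ** 2) / a(k) ** alpha

# Partial sum of the a_k, relevant for condition (alpha).
partial_sum = sum(a(k) for k in range(1, 100001))

if __name__ == "__main__":
    print("partial sum of a_k up to 10^5:", partial_sum)
    for k in (10 ** 4, 10 ** 5):
        print(k, log_delta_term(k, 5))
    for k in (10, 100, 1000):
        print(k, holder_ratio(k, 0.5))
```

For Hölder exponents close to 1 the ratio in (ε) keeps growing over a very long range of $k$ before the factor $k^{-1+\alpha}$ takes over, so a finite scan with such $\alpha$ would not show the decay; the example therefore uses $\alpha = 1/2$.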
It remains to choose $k_0$. We must ensure the uniform ellipticity of Equation (3), as required in point (vi) of the theorem. This is possible since the coefficients that we have constructed are uniformly continuous: $d_i$ and $b_{ij} - \delta_{ij}$ have compact support in the $t$ variable and are periodic in $x$ and $y$. Now, passing from a $k_0$ to a bigger $\tilde k_0$ has the only effect that these functions become zero for $t \in [T_{k_0}, T_{\tilde k_0}]$ and remain as they were for $t \in [T_{\tilde k_0}, T]$. Since they tend uniformly to zero as $t \uparrow T$, we can choose $k_0$ such that $|d_i| \le 1/18$ and $|b_{ij} - \delta_{ij}| \le 1/18$ and then
\[ \left\|\begin{pmatrix} b_{11}-1+d_1 & b_{12}\\ b_{12} & b_{22}-1+d_2 \end{pmatrix}\right\| \le 6\times 1/18 = 1/3 \]
and we obtain
\[ 1 - 1/3 \le \begin{pmatrix} b_{11}+d_1 & b_{12}\\ b_{12} & b_{22}+d_2 \end{pmatrix} \le 1 + 1/3. \]
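The ellipticity bound can be spot-checked by sampling. The following sketch draws the five perturbations $b_{11}-1$, $b_{22}-1$, $b_{12}$, $d_1$, $d_2$ uniformly from $[-1/18, 1/18]$ (the uniform distribution and the fixed seed are choices made here for illustration only) and records the extreme eigenvalues of the resulting symmetric $2\times 2$ matrix.

```python
import math
import random

def eigenvalues_sym2(p, q, r):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius

def sample_extreme_eigenvalues(n_samples=10000, bound=1.0 / 18.0, seed=0):
    """Smallest and largest eigenvalue seen over random coefficient matrices
    [[b11 + d1, b12], [b12, b22 + d2]] with |b_ij - delta_ij| <= bound, |d_i| <= bound."""
    rng = random.Random(seed)
    lowest, highest = float("inf"), float("-inf")
    for _ in range(n_samples):
        b11 = 1.0 + rng.uniform(-bound, bound)
        b22 = 1.0 + rng.uniform(-bound, bound)
        b12 = rng.uniform(-bound, bound)
        d1 = rng.uniform(-bound, bound)
        d2 = rng.uniform(-bound, bound)
        e_min, e_max = eigenvalues_sym2(b11 + d1, b12, b22 + d2)
        lowest, highest = min(lowest, e_min), max(highest, e_max)
    return lowest, highest

if __name__ == "__main__":
    print(sample_extreme_eigenvalues())
```

The sampled eigenvalues should stay inside $[2/3, 4/3]$, consistent with the two-sided bound above.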
The proof is complete. The construction for the parabolic problem (4) is similar to the one for the elliptic equation and will not be carried out here.

REMARK. From the condition $\lambda_k \to \infty$ we infer that
\[ \lim_{k\to\infty}\lambda_k^{-4} = \lambda_{k_0}^{-4}\prod_{k=k_0}^{\infty}\frac{\lambda_k^4}{\lambda_{k+1}^4} = \lambda_{k_0}^{-4}\prod_{k=k_0}^{\infty}\rho_k^2 = 0, \]
and since $\rho_k^2 \in (0, 1)$ for any $k$, we can pass to the infinite sum associated to the infinite product, and obtain from the relation above:
\[ \sum_{k=k_0}^{\infty}(1-\rho_k^2) = \infty. \]
Since in each of the intervals $[T_k, T_{k+1}]$ one of the functions $d_1$, $d_2$ takes the value $-(1-\rho_k^2)$ and gets back to the value $0$ at the end of the interval, the relation above implies that either $d_1$ or $d_2$ has unbounded variation. Thus, we cannot obtain $W^{1,1}$ coefficients in the construction above.

Professor N. Lerner raised the problem of the refinement of the result above, considering the continuity moduli of the coefficients. He asked in particular whether the results below hold. The following corollary is actually a corollary of the proof of Theorem 1.

COROLLARY 1. Let $\omega\colon [0, \infty) \to [0, \infty)$ be a continuous, nondecreasing and concave function such that $\omega(0) = 0$ and $\omega(1) > 0$. Suppose that
\[ \int_0^1 \frac{dt}{\omega(t)} < \infty. \tag{42} \]
Then there exist $u$, $b_{ij}$ and $d_i$, where $i, j = 1, 2$, satisfying all the conditions of Theorem 1, except (v), which is replaced by:
\[ |d_i(t_1) - d_i(t_2)| \le \omega(|t_1-t_2|), \qquad \forall t_1, t_2 \in \mathbb{R},\ i = 1, 2. \tag{43} \]
REMARK. If $f\colon \mathbb{R}^n \to \mathbb{R}$ then the modulus of continuity of $f$ is by definition the function
\[ \omega_f\colon [0, \infty) \to [0, \infty), \qquad \omega_f(t) = \sup_{|x_1-x_2|\le t}|f(x_1) - f(x_2)|. \]
It is easy to prove that $\omega_f$ is nondecreasing and satisfies the relation
\[ \omega_f(\alpha t_1 + (1-\alpha)t_2) \ge \tfrac{1}{2}\big(\alpha\,\omega_f(t_1) + (1-\alpha)\,\omega_f(t_2)\big), \qquad \forall t_1, t_2 \ge 0,\ \forall\alpha \in [0, 1]. \tag{44} \]
This shows that there is a concave function $\tilde\omega_f$, more precisely, $\tilde\omega_f(t) = \sup_{0\le t_1 \ldots}$ … 1 in the proof of Theorem 1, such that the conditions (α)–(δ) and the relation (43) are satisfied. We choose $\lambda_k = k^4$. Let
\[ \delta_k \overset{\mathrm{def}}{=} 1-\lambda_k^2/\lambda_{k+1}^2 = \frac{(k+1)^8 - k^8}{(k+1)^8}. \]
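We have to make some preparations in view of the construction of the sequence $\{a_k\}$. Let $a = \sup\{x \in [0, 1] \mid \omega(x) < \omega(1)\}$. Since $\omega(0) = 0$ and $\omega(1) > 0$, we have $a \in (0, 1]$. Then by continuity we have $\omega(a) = \omega(1)$ and the function $\omega\colon [0, a] \to [0, \omega(1)]$ is bijective. Indeed, suppose $0 \le x < y \le a$. Since $\omega$ is nondecreasing, $\omega(x) < \omega(a)$ by the definition of $a$. Using that $\omega$ is concave,
\[ \omega(y) \ge \frac{(a-y)\,\omega(x) + (y-x)\,\omega(a)}{a-x} > \frac{(a-y)\,\omega(x) + (y-x)\,\omega(x)}{a-x} = \omega(x), \]
which proves that $\omega$ is strictly increasing on $[0, a]$. We put
\[ a_k = \tfrac{1}{5}\,\omega^{-1}(50\,C_{\chi,2}\,\delta_k) \qquad\text{for any } k \ge k_0. \]
This requires that the argument of $\omega^{-1}$ lies in $[0, \omega(1)]$. To this end, we impose $50\,C_{\chi,2}\,\delta_{k_0} \le \omega(1)$. This relation is satisfied for $k_0$ big enough since $\delta_k \to 0$. Since $0 < a_{k_1} \le a_{k_0}$ for any $k_1 \ge k_0$, we obtain from the concavity of $\omega$:
\[ 50\,C_{\chi,2}\,\delta_{k_1} = \omega(5a_{k_1}) \ge \frac{(5a_{k_0} - 5a_{k_1})\,\omega(0) + 5a_{k_1}\,\omega(5a_{k_0})}{5a_{k_0}} = \frac{a_{k_1}}{a_{k_0}}\,50\,C_{\chi,2}\,\delta_{k_0} \]
and we infer
\[ \frac{a_{k_1}}{\delta_{k_1}} \le \frac{a_{k_0}}{\delta_{k_0}} \qquad\text{for all } k_1 \ge k_0. \tag{46} \]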
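Now we will check in order the conditions (α), (β), (γ) and (δ) stated at the end of the proof of Theorem 1. We first prove that $\sum a_i < \infty$. Using the monotony of $\omega$ and then relation (42):
\[
\begin{aligned}
\sum_{k=k_0}^{k_1}\frac{1}{\delta_k}(a_k - a_{k+1}) &= 50\,C_{\chi,2}\sum_{k=k_0}^{k_1}\frac{1}{50\,C_{\chi,2}\,\delta_k}(a_k - a_{k+1}) = 10\,C_{\chi,2}\sum_{k=k_0}^{k_1}\frac{1}{\omega(5a_k)}(5a_k - 5a_{k+1}) \\
&\le 10\,C_{\chi,2}\sum_{k=k_0}^{k_1}\int_{5a_{k+1}}^{5a_k}\frac{dt}{\omega(t)} \le 10\,C_{\chi,2}\int_0^{5a_{k_0}}\frac{dt}{\omega(t)} =: M.
\end{aligned}
\]
We will associate differently the terms in the first sum above, in order to obtain information about the series $\sum a_k$. We have
\[ \sum_{k=k_0}^{k_1}\frac{1}{\delta_k}(a_k - a_{k+1}) = \sum_{k=k_0}^{k_1-1}\Big(\frac{1}{\delta_{k+1}} - \frac{1}{\delta_k}\Big)a_{k+1} + \frac{a_{k_0}}{\delta_{k_0}} - \frac{a_{k_1+1}}{\delta_{k_1}} \]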
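and we obtain using (46):
\[ \sum_{k=k_0}^{k_1-1}\Big(\frac{1}{\delta_{k+1}} - \frac{1}{\delta_k}\Big)a_{k+1} \le M - \frac{a_{k_0}}{\delta_{k_0}} + \frac{a_{k_1+1}}{\delta_{k_1}} \le M \qquad\text{for any } k_1 \ge k_0. \]
Since $\{\delta_k\}$ is decreasing, $\big(\frac{1}{\delta_{k+1}} - \frac{1}{\delta_k}\big)a_{k+1} \ge 0$ for any $k \ge k_0$. We obtain that the series
\[ \sum_{k=k_0}^{\infty}\Big(\frac{1}{\delta_{k+1}} - \frac{1}{\delta_k}\Big)a_{k+1} \]
is convergent. It remains now to use the fact that
\[ \lim_{k\to\infty}\Big(\frac{1}{\delta_{k+1}} - \frac{1}{\delta_k}\Big) = 1/8 \tag{47} \]
and the positivity of $a_k$ to conclude that
\[ \sum_{k=k_0}^{\infty} a_k < \infty. \]
In order to show that relation (47) holds, we compute
\[ \frac{1}{\delta_k} = \frac{(k+1)^8}{(k+1)^8 - k^8} = \frac{k^8 + 8k^7 + O(k^6)}{8k^7 + 28k^6 + O(k^5)} = \frac{1}{8}\,\frac{k^8 + 8k^7 + O(k^6)}{k^7 + \tfrac{7}{2}k^6 + O(k^5)} = \frac{1}{8}\big(k + 9/2 + O(1/k)\big). \]
The proof of condition (α) is complete. Due to relation (45), we have $\omega^{-1}(t) \ge t^2$ for $t \in [0, \omega(1)]$ and in particular $5a_k = \omega^{-1}(50\,C_{\chi,2}\,\delta_k) \ge (50\,C_{\chi,2}\,\delta_k)^2$ for any $k \ge k_0$. Since $\delta_k = 8/k + O(1/k^2)$, we obtain the existence of a $C > 0$ such that
\[ a_k \ge Ck^{-2}. \tag{48} \]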
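The limit (47) and the expansion $\delta_k = 8/k + O(1/k^2)$ used above can be illustrated in exact rational arithmetic for $\lambda_k = k^4$. This is a small sketch; the helper names are introduced here.

```python
from fractions import Fraction

def delta(k):
    """delta_k = ((k + 1)^8 - k^8) / (k + 1)^8 as an exact fraction."""
    return Fraction((k + 1) ** 8 - k ** 8, (k + 1) ** 8)

def inv_delta_increment(k):
    """1/delta_{k+1} - 1/delta_k, which (47) claims tends to 1/8."""
    return 1 / delta(k + 1) - 1 / delta(k)

if __name__ == "__main__":
    for k in (10, 1000, 10 ** 6):
        print(k, float(inv_delta_increment(k)), float(k * delta(k)))
```

The first printed column approaches $1/8$ and the second approaches $8$ as $k$ grows.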
Choosing $k_0$ big enough, we obtain $1/a_k < k^4 = \lambda_k$ for any $k \ge k_0$ and condition (β) is fulfilled. Condition (γ) is obviously satisfied: $\lambda_k = k^4 \in \mathbb{N}$. We have from (48):
\[ e^{-a_k(\lambda_k-\lambda_k^2/\lambda_{k+1})/2}\,\lambda_{k+1}^m \le e^{-Ck^{-2}k^4(1-k^4/(k+1)^4)/2}\,(k+1)^{4m} \le e^{-Ck^2(4k^3/(k+1)^4)/2}\,(k+1)^{4m}. \]
The limit of the above expression is $0$ as $k \to \infty$ since the exponent is $-2Ck(1 + O(1/k))$, hence condition (δ) is satisfied. It remains to prove inequality (43). In order to do so it is enough to prove that
\[ 10\,C_{\chi,2}\sup_{k\ge k_0}\big(1-\lambda_k^2/\lambda_{k+1}^2\big)\min(5, t/a_k) \le \omega(t) \qquad\text{for any } t \in [0, \infty), \]
since (43) is then a consequence of (40). We will prove the inequality for each $k \ge k_0$:
\[ \omega(t) \ge 10\,C_{\chi,2}\,\delta_k\min(5, t/a_k). \]
We use the concavity of $\omega$ and the fact that it is nondecreasing. This implies that it is enough to prove the above inequality at the point $t = 5a_k$ where the r.h.s. passes from a linear function to a constant one. Indeed, suppose the inequality proved at $t = 5a_k$. Then the result is, on the one hand, because of the monotony of $\omega$, that the inequality holds in the interval $[5a_k, \infty)$. On the other hand, it is obviously true for $t = 0$ and from the concavity of $\omega$ it is true in the interval $[0, 5a_k]$. We have to check that $\omega(5a_k) \ge 10\,C_{\chi,2}\,\delta_k\cdot 5$; in fact by the definition of $a_k$ we have equality. The proof is complete. $\Box$
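To make the choice of $a_k$ concrete, here is a worked instance under explicit assumptions: the modulus $\omega(t) = \sqrt{t}$ (continuous, nondecreasing, concave, with $\int_0^1 dt/\omega(t) = 2 < \infty$), a hypothetical value $C_{\chi,2} = 1$, and $k_0 = 400$, picked so that $50\,C_{\chi,2}\,\delta_{k_0} \le \omega(1)$. None of these values comes from the text; they only illustrate how $a_k = \frac{1}{5}\omega^{-1}(50\,C_{\chi,2}\,\delta_k)$ makes the final inequality an equality at $t = 5a_k$.

```python
import math

C_CHI2 = 1.0   # hypothetical value of the constant C_{chi,2}
K0 = 400       # assumed starting index, chosen so that 50 * C * delta_{K0} <= omega(1)

def omega(t):
    """Example modulus of continuity (assumption): omega(t) = sqrt(t)."""
    return math.sqrt(t)

def omega_inv(s):
    """Inverse of the example modulus on [0, omega(1)]."""
    return s * s

def delta(k):
    """delta_k = 1 - lambda_k^2 / lambda_{k+1}^2 for lambda_k = k^4."""
    return 1.0 - (k / (k + 1.0)) ** 8

def a(k):
    """a_k = omega^{-1}(50 C delta_k) / 5, as in the proof of Corollary 1."""
    return omega_inv(50.0 * C_CHI2 * delta(k)) / 5.0

def modulus_lower_bound(t, k):
    """10 C delta_k min(5, t / a_k), the quantity that omega(t) must dominate."""
    return 10.0 * C_CHI2 * delta(k) * min(5.0, t / a(k))

if __name__ == "__main__":
    for k in (K0, 1000, 10 ** 4):
        t = 5.0 * a(k)
        print(k, modulus_lower_bound(t, k), omega(t))
```

For this particular $\omega$ one gets $a_k = 500\,\delta_k^2$, which behaves like $32000/k^2$, so the series $\sum a_k$ converges as condition (α) requires.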
Acknowledgements

I thank Professor Anne Boutet de Monvel for drawing my attention to the work of Miller. I am also indebted to Professor Vladimir Georgescu for valuable remarks on the paper.

References

1. Hörmander, L.: Uniqueness theorems for second order elliptic differential equations, Comm. Partial Differential Equations 8(1) (1983), 21–64.
2. Mandache, N.: Estimations dans les espaces de Hilbert et applications au prolongement unique, Thèse, Université Paris 7, 1994.
3. Miller, K.: Non-unique continuation for certain ode’s in Hilbert space and for uniformly parabolic and elliptic equations in self-adjoint divergence form, in: Symposium on Non-Well-Posed Problems and Logarithmic Convexity (Heriot-Watt Univ., Edinburgh, 1972), Lecture Notes in Math. 316, Springer, 1973, pp. 85–101.
4. Miller, K.: Non-unique continuation for uniformly parabolic and elliptic equations in self-adjoint divergence form with Hölder-continuous coefficients, Arch. Rational Mech. Anal. 54 (1974), 105–117.
5. Pliš, A.: On non-uniqueness in Cauchy problem for an elliptic second order differential operator, Bull. Acad. Polon. Sci. 11 (1963), 95–100.
Mathematical Physics, Analysis and Geometry 1: 293, 1999.
Editorial
It is with great sadness that we learned of the unexpected and untimely death on November 27, 1998, of Moshé Flato. Moshé was extremely supportive of the launching of Mathematical Physics, Analysis and Geometry and he provided much useful advice as to the development of our journal. We are extremely pleased that he agreed to be on our Editorial Board – despite his longstanding commitment as founding Editor of the journal Letters in Mathematical Physics. His creative energy and loyalty will be sorely missed. VLADIMIR MARCHENKO ANNE BOUTET de MONVEL HENRY McKEAN
Mathematical Physics, Analysis and Geometry 1: 295–312, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands.
Arnold’s Diffusion in Isochronous Systems ? G. GALLAVOTTI Università di Roma 1, Fisica, Italy (Received: 16 January 1998; in final form: 30 October 1998) Abstract. I discuss an illustration of a mechanism for Arnold’s diffusion following a non-variational approach, and an improvement of the related estimates for the diffusion time. Mathematics Subject Classifications (1991): 34C15, 34C29, 34C37, 58F30, 70H05. Key words: Arnold’s diffusion, homoclinic splitting, KAM.
1. Introduction Arnold’s diffusion was established in simple paradigmatic examples by Arnold [A]. Since that paper several methods aiming at extending its validity to more general systems have been developed: this was done either by following methods sometimes called “geometric methods” close to the original approach, [CG], [C], [M], or by “variational methods”, [Be], [Br]. In the approach [CG] one finds estimates, for the time necessary for a diffusion of O(1) in the space of the action variables, which are terribly big as functions of the size ε of the perturbation when it approaches 0 (their order is exp O(ε −1 )); the variational method instead gives better estimates, “fast”, ([Be], their orders is exp O(ε −1/2 )), and even very good, “polynomial”, ones ([Br], their order is O(ε −2 )). Recently remarkable progress has been made in the geometric approach via the papers [M] and the impressive “summa” [C], who have been able to recover not only the best variational results but to extend them to the cases discussed in [CG], greatly improving the bounds obtained there, and to many substantially new cases of applicative interest. The work [C] gives an extensive bibliography to which I refer. However, the subject is still presented at a very technical level, and the relation of the new methods with those in [CG] is not transparent. Here I first illustrate (Section 5) the method of [CG] by developing it with the aim of showing existence of diffusion. This may lead to a clarification of a method not appropriately quoted in the literature and which maintains its interest because of its relative simplicity, in spite of the better estimates coming from the quoted alternative methods. If explicit estimates are avoided one gains enormously in simplicity: this kind of approach was probably the one meant in [A] where the ? The first version of this paper is archived in: [email protected]#9709011.
problem was first posed and solved without bothering to give the (fairly obvious, see Section 5) details. What follows in Section 5 also applies to the Arnold’s case, but I prefer to illustrate it in a case that is even simpler. Furthermore I show (Section 6) that if a new idea is added to the method of [CG], then one can get a “fast” (still exponential) estimate for the drift time at least in the “isochronous” cases considered here, see (1.1) below. This bound is derived in detail and is conceptually independent of the other works. Consider Hamiltonians H with three degrees of freedom described by coordinates I ∈ R, A0 = (A01 , A02 ) ∈ R 2 and angles ϕ ∈ T 1 , α = (α1 , α2 ) ∈ T 2 : H = ω · A0 +
I2 def + g 2 (cos ϕ − 1) + εf (ϕ, α) = H0 + εf (ϕ, α), 2
(1.1)
where ω = (ω1 , ω2 ) ∈ R 2 is a vector with Diophantine constants C, τ , i.e., such that for all integer components vectors ν = (ν1 , ν2 ) it is |ω · ν|−1 6 C|ν|τ if ν 6= 0; the perturbationPf is supposed to be a (fixed) trigonometric polynomial of degree N : f (ϕ, α) = 06|ν| 0; the bound µ depends on ε, of course, and generically can be taken proportional to ε. Analytically we can write A0 + As (α, ϕ) and A + Au (α, ϕ) the parametric equation of the manifolds W u (A0 ) and W s (A), so representable at least for ϕ away from 0 or 2π , see (2.2) (e.g., see [CG] or [G3]). The functions Au , As do not depend on A because of isochrony:?? to see this note that the evolution equations for I , ϕ, α do not involve the A’s; explicit expressions for As , Au can be found in [G3] or [GGM]. The equation for the α value of an intersection point in W u (A) ∩ W s (A0 ) with def ϕ = π (say) is just Q(α) = As (α, π ) − Au (α, π ) = A0 − A, where usually Q(α) is called the splitting vector at α (and ϕ = π ). The angles between the tangent planes to W u (A) and W s (A) at the homoclinic intersection at ϕ = def π , α = 0 are related to the eigenvalues of the intersection matrix Dij =
? This is a simple special case of a property which becomes rather nontrivial in more interesting “anisochronous” Thirring models. Such models (see [T] and [G2, G3]) differ from (1.1) because of 1 A2 with K constant; then the average action A of a possible addition to H0 of an extra term 2K the motion on a torus is directly related to the gradient of the unperturbed Hamiltonian via ω = ∂A H0 (A), i.e., the frequencies are not “twisted” by the perturbation (a fact apparently “discovered” in [G3]). ?? Note that the parametric equation for the I variable needs not to be specified as it follows from the ones for the A’s via the energy conservation.
∂i Qj (α)|α=0 which is also the Jacobian of the implicit equation Q(α) = A0 − A near α = 0. It follows from the classical Melnikov theory of splitting (see for instance [G3]) that the eigenvalues of D generically have values of order O(ε) so that the angles between tangents to W u (A) and W s (A) at α = 0, ϕ = π will have size O(ε) (and detD = O(ε 2 )). The genericity condition is a very simple algebraic condition on the coefficients of the polynomial f and is easily verified in many examples: the very simplest being perhaps f (α, ϕ) = cos(α1 + ϕ) + cos(α2 + ϕ). The non-vanishing of the intersection matrix determinant, and its interpretation as Jacobian of the implicit equation for the heteroclinic intersections, implies that the latter exist always, as soon as the average actions of the tori are close enough (and the tori have the same energy, of course). u s (t), X− (t) the (6) One can define also the splitting in the ϕ variables: call X− values at time t of the ϕ coordinate of the point on the unstable or stable manifold which at time t = 0 has coordinates (Au (α, π ), α, I u (α, π ), π ) or s u (t) − X− (t), which also (As (α, π ), α, I s (α, π ), π ) and one sets 1(t) = X− depends on α. (7) Finally a definition: Let A0 , A1 , . . . , AN be a sequence such that |Aj − Aj +1 | is so small that W u (Aj ) ∩ W s (Aj +1 ) have a transversal heteroclinic intersection, in the above sense, with intersection angles > µ at ϕ = π . We call such a chain a heteroclinic chain or ladder. As remarked in (5) one finds generically and in most simple examples µ = O(ε), hence N = O(ε −1 ). We shall prove the following theorem (“Arnold’s diffusion” or “drift”): THEOREM 1. Let A0 , A1 , . . . , AN be a heteroclinic chain: for any δ > 0 there are trajectories starting within δ of T (A0 ) and arriving after a finite time Tdrift within δ of T (AN ). 
I shall give a complete proof of Theorem 1 (Section 5), again along the lines of [CG], for the sake of illustrating the simplicity of the method (due to Arnold). The purpose is to show the conceptual difference with respect to the variational approaches, which accounts for the impressive difference in the time scale of $T_{drift}$ compared with [Be, Br] or with the estimate in Theorem 2 below (see (6.9)). In Section 6 I give a more refined, yet very simple and detailed, proof yielding explicit and much better bounds (“fast”), although still far from the best in the literature.
4. Geometric Concepts
Let $2\kappa > 0$ be smaller than the radius of the disk in the $(p, q)$-plane where the functions in (2.2) are defined. We call $\kappa$ a “target parameter”. To visualize the geometry of the problem involving 2-dimensional tori and their 3-dimensional stable and unstable manifolds, in the 5-dimensional energy surface,
G. GALLAVOTTI
we shall need the following geometric objects:
(a) a point $X_i$, heteroclinic between $\mathcal{T}(A_i)$ and $\mathcal{T}(A_{i+1})$, which has local coordinates, see (2.2), $X_i = (A_i, \psi_i, 0, \kappa)$.
(b) the equations, at fixed $q = \kappa$, of the connected part of $W^s(A_{i+1})$ containing $X_i$, in the local coordinates near $\mathcal{T}(A_i)$; they will be written as:
$$Y_i(\psi) = (A^s_{i+1}(\psi), \psi, p^s_{i+1}(\psi), \kappa) \tag{4.1}$$
with $|\psi - \psi_i| < \zeta$ for some $\zeta > 0$ ($i$-independent): it is $A^s_{i+1}(\psi_i) = A_i$, $p^s_{i+1}(\psi_i) = 0$ because we require $Y_i(\psi_i) = X_i$. There are constants $F', F$ such that $|A^s_{i+1}(\psi) - A^s_{i+1}(\psi_i)|$ and $\max_{|\psi - \psi_i| = \mathrm{fixed}} |p^s_{i+1}(\psi)|$ are bounded, for $\zeta$ small enough, below by $F'|\psi - \psi_i|$ and above by $F|\psi - \psi_i|$; the constants $F', F$ have size bounded below by $O(\mu)$ (by the transversality assumption in the definition of heteroclinic chain, (7) of Section 3). Note that $W^s(A_{i+1})$ also contains a part with local equations $(A_{i+1}, \psi, p, 0)$ which should not be confused with the previous one described by the function $Y_i(\psi)$. This is more easily understood by looking at the meaning of the above objects in the original $(A, \alpha, I, \varphi)$ coordinates: in a way the first part of $W^s(A_{i+1})$ is close to $\varphi = 0$ and the second to $\varphi = 2\pi$. They can be close because of the periodicity, but they are conceptually quite different.
(c) a point $P_i = Y_i(\tilde\psi_i)$ with $|\tilde\psi_i - \psi_i| = r_i$, where $\tilde\psi_i, r_i$ will be determined recursively, and a neighborhood $B_i$:
$$B_i = \{|A - A^s_{i+1}(\psi)| < \kappa^2 r'_i,\ |\psi - \tilde\psi_i| < r'_i,\ |p - p^s_{i+1}(\psi)| < \kappa r'_i,\ q = \kappa\}, \tag{4.2}$$
where $r'_i < r_i$ are scales $< 1$, to be determined recursively. If $\bar g, 2\bar g$ are lower/upper bounds to $(1 + \gamma(x))g(x)$ for $|x| < 4\kappa^2$, the point $P_i$ evolves in a time $T_i \simeq \bar g^{-1} \log \kappa^{-1}$ into a point $X'_i$ near $\mathcal{T}(A_{i+1})$ which has local coordinates $X'_i = (A_{i+1}, \psi'_i, \kappa, 0)$. Note that any point $(A, \psi, p, q)$ will evolve, provided it does not exit the neighborhood where the local coordinates are defined, into a point of the form $(A, \psi + \omega T_{in}, q, p)$ after a time $T_{in} = -g(x)^{-1}(1 + \gamma(x))^{-1} \log (q p^{-1})$ if $x = pq$, because of the special hyperbolic form of the time evolution, see (2.1): we shall call this time the interchange time of the “last two coordinates” and we shall use it repeatedly. The choice of $B_0$ is rather arbitrary and we take $r_0 = \zeta$ ($\zeta$ is defined after (4.1)) and $r'_0 = \frac12 r_0$, choosing $\tilde\psi_0$ arbitrarily (at distance $r_0$ from $\psi_0$).
(d) The points $\xi$ of the set $B_i$ are mapped by the time evolution to points that, at the beginning at least, come close to $\mathcal{T}(A_{i+1})$ and in a time $\tau(\xi)$ acquire local coordinates near $\mathcal{T}(A_{i+1})$ with $p = \kappa$ exactly: the time $\tau(\xi)$ is of the order of $\bar g^{-1} \log \kappa^{-1}$.
If $S_t$ is the time evolution flow for the system (1.1) we write $S\xi = S_{\tau(\xi)}\xi$ (note that $S$ depends also on $i$). Then $S$ maps the set $B_i$ into a set $SB_i$ containing:
$$B'_i = \Big\{|A - A_{i+1}| < \frac{1}{E}\kappa^2 r'_i,\ |\psi - \psi'_i| < \frac{1}{E} r'_i,\ p = \kappa,\ |q| < \frac{\kappa}{E} r'_i\Big\} \tag{4.3}$$
because all the points in $B_i$ with $A = A^s_{i+1}(\psi)$, $p = p^s_{i+1}(\psi)$, $q = \kappa$ evolve (each taking its own time) to points with $A = A_{i+1}$, $p = \kappa$, $q = 0$ and $\psi$ close to $\psi'_i$, by the definitions. Here $E$ is a bound on the Jacobian matrix of $S$. The latter, being essentially a flow over a time $O(\bar g^{-1} \log \kappa^{-1})$, has derivatives bounded $i$-independently: since we suppose that $\varepsilon$ is “small enough” we could take $E = 1 + b\varepsilon$ for some $b > 0$ if, as is often the case, $|A_i - A_{i+1}| < O(\varepsilon)$.
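The interchange time introduced in item (c) can be illustrated on the simplest model of the local motion. The sketch below is only an illustration under stated assumptions: it takes the hyperbolic part of the flow to be exactly $p \to p e^{-\lambda t}$, $q \to q e^{\lambda t}$, with a constant rate $\lambda$ standing in for $g(x)(1 + \gamma(x))$ (constant along a trajectory because $x = pq$ is conserved). The function names `evolve` and `interchange_time` and all numerical values are hypothetical.

```python
import math


def evolve(p, q, rate, t):
    """Model hyperbolic flow: p contracts, q expands, and p*q stays constant."""
    return p * math.exp(-rate * t), q * math.exp(rate * t)


def interchange_time(p, q, rate):
    """T_in = -(1/rate) * log(q / p): time after which (p, q) becomes (q, p)."""
    return -math.log(q / p) / rate


# Hypothetical values, chosen only for illustration.
rate = 0.7            # stands in for g(x) * (1 + gamma(x)) at x = p * q
kappa = 0.1           # "target parameter"
p0, q0 = kappa, 1e-4  # a point with p = kappa and a small q

t_in = interchange_time(p0, q0, rate)
p1, q1 = evolve(p0, q0, rate, t_in)

print(f"T_in = {t_in:.4f}")
print(f"(p, q): ({p0}, {q0}) -> ({p1:.6g}, {q1:.6g})")
print(f"p*q before = {p0 * q0:.6g}, after = {p1 * q1:.6g}")
```

In this model the interchange time grows only logarithmically as $q_0 \to 0$, which is the behavior behind the estimates of order $\bar g^{-1} \log \kappa^{-1}$ quoted above.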
5. The [CG]-method of Proof of the Theorem
Consider the points $Y_{i+1}(\psi) \in W^s(A_{i+2})$ with coordinates $(A^s_{i+2}(\psi), \psi, p^s_{i+2}(\psi), \kappa)$. They will evolve backwards in time so that $A$ stays constant, $\psi$ evolves quasi-periodically hence “rigidly”, and $p^s_{i+2}(\psi)$ evolves to $\kappa$ while the $q$-coordinate evolves from $\kappa$ to $q = p^s_{i+2}(\psi)$ (because $pq$ stays constant, see (c) above). The time for this evolution is $T_\psi \simeq \bar g^{-1} \log \kappa|p^s_{i+2}(\psi)|^{-1} \to +\infty$ as $\psi \to \psi_{i+1}$.
Therefore there is a sequence $\psi_n \to \psi_{i+1}$ as $n \to +\infty$ such that $|p^s_{i+2}(\psi_n)| > 0$, $p^s_{i+2}(\psi_n) \to 0$, $A^s_{i+2}(\psi_n) \to A_{i+1}$ and $\psi_n - \omega T_{\psi_n} \to \psi'_i$ as $n \to \infty$, as a consequence of the Diophantine properties of $\omega$. So there is $\tilde\psi_{i+1} \stackrel{\rm def}{=} \psi_n$ with $n$ suitable and a point $P_{i+1} = (A^s_{i+2}(\tilde\psi_{i+1}), \tilde\psi_{i+1}, p^s_{i+2}(\tilde\psi_{i+1}), \kappa) \in W^s(A_{i+2})$ (actually infinitely many) which evolves, backwards in time, from $P_{i+1}$ to a point of $B'_i$.
Hence we can define $r_{i+1} = |\tilde\psi_{i+1} - \psi_{i+1}|$ and $r'_{i+1}$ small enough so that the backward motion of the points in $B_{i+1}$ enters in due time into $B'_i$. It follows that the set $B_i$ evolves in time so that all the points of $B_{i+1}$ are on trajectories of points of $B_i$, for all $i = 1, \ldots, N$. Hence all points of $B_N$ will be reached by points starting in $B_0$. This completes the proof.
All constants can be estimated explicitly, even though this is somewhat long and cumbersome, see [CG]. The result is an extremely large drift time $T_{drift}$ (namely the value at $N$ of a composition of $N$ exponentials! At least this is the estimate I get after correcting an error in Section 8 of [CG]: the error is minor but leads to substantially worse bounds). Nevertheless the estimate that comes out of the above scheme seems essentially optimal. And then the problem is: “how is it possible that by other methods (e.g., variational methods of [Be], [Br]) one can get far better estimates?” The above argument is quite close to the proof of the “obstruction property” in [C], p. 34: hence the latter work shows that the above analysis misses some key
idea that is exploited in the papers [M], [C]; perhaps the possibility of setting up a symbolic dynamics around the tori and exploiting it in the bounds. The difference with respect to the variational methods may be due to the fact that they are “less constructive”: less so than the above. The “fast drifting” trajectory exists but there seems to be no algorithm to determine it, not even the sequence of its “close encounters” with the invariant tori that generates drift, which is in fact preassigned in the above method. This certainly might account for a difference in the estimates. In fact the above construction is far too rigid: we require not only that drift takes place but also that it takes place via a path that closely visits a prescribed sequence of tori in an essentially predetermined way. In Section 6 a less constructive method is proposed and used to obtain bounds which, however, turn out to be still far from polynomial. Of course a better understanding of why the results are so different with the different methods is highly desirable: but my efforts to understand this point satisfactorily only led to the improvement in Section 6 below, which nevertheless has some interest as it introduces the notion of elastic heteroclinic chain, which I think might also be useful for the analysis of the anisochronous cases.
6. Fast Diffusion: Elastic Heteroclinic Chains
The following adds a new idea to the method presented in Section 5, allowing us to improve the super-exponential estimate of [CG] mentioned there. Below $\varepsilon$ will be fixed small enough, and $\bar g, 2\bar g$ will be lower and upper bounds, respectively, to $g(x)(1 + \gamma(x))$, see (2.1). Let $y$ be the curvilinear abscissa of a curve $y \to A(y)$, $y \in [0, y_{max}]$, in the “average action space” such that the tori $\mathcal{T}(A(y))$ have fixed energy.
Then, evaluating the energy at the homoclinic point $\alpha = 0$, $\varphi = \pi$ and using that the $I$ coordinates of the points on $W^u(A)$, $W^s(A)$ do not depend on $A$, because of isochrony, see (2.2), one sees that $\omega \cdot A(y)$ is constant, so that the line $y \to A(y)$ is parallel to $\omega^\perp = (\omega_2, -\omega_1)$. Hence $A(y) = A_0 + w^\perp y$ with $w^\perp \stackrel{\rm def}{=} \frac{\omega^\perp}{|\omega|}$ and $A'(y) \stackrel{\rm def}{=} \frac{dA(y)}{dy} \equiv w^\perp$.
Define $y \to A(y)$, $y \in [0, y_{max}]$, to be an elastic heteroclinic chain with flexibility parameters $\beta, \vartheta > 0$ and splitting $\mu$ if:
(i) for all $|y - y'| < \vartheta\mu$ there is a heteroclinic intersection between the stable and unstable manifolds of $\mathcal{T}(A(y))$ and $\mathcal{T}(A(y'))$ with splitting angles $> \mu$ at $\varphi = \pi$.
(ii) The intersection matrix $D \stackrel{\rm def}{=} \mu D_o$ at $\varphi = \pi$, $\alpha = 0$ verifies:
$$(w^\perp \cdot D_o^{-1} w^\perp) \stackrel{\rm def}{=} \beta \ne 0, \qquad w^\perp \stackrel{\rm def}{=} \frac{\omega^\perp}{|\omega|}. \tag{6.1}$$
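Condition (6.1) is straightforward to evaluate numerically once $\omega$ and $D_o$ are given. The following is a minimal sketch, assuming two-dimensional actions as in the text; the matrix `d_o`, the frequency vector and the starting action `a0` are made-up illustrative values, not quantities computed from the model (1.1), and the function names are hypothetical.

```python
import math


def perp_unit(omega):
    """Return w_perp = omega_perp / |omega| with omega_perp = (omega_2, -omega_1)."""
    w1, w2 = omega
    norm = math.hypot(w1, w2)
    return (w2 / norm, -w1 / norm)


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def flexibility_beta(omega, d_o):
    """beta = w_perp . (D_o^{-1} w_perp), the quantity required to be nonzero in (6.1)."""
    w = perp_unit(omega)
    inv = inverse_2x2(d_o)
    v = (inv[0][0] * w[0] + inv[0][1] * w[1],
         inv[1][0] * w[0] + inv[1][1] * w[1])
    return w[0] * v[0] + w[1] * v[1]


def chain_point(a0, omega, y):
    """A(y) = A_0 + w_perp * y, the straight line of average actions."""
    w = perp_unit(omega)
    return (a0[0] + w[0] * y, a0[1] + w[1] * y)


# Illustrative (assumed) data.
omega = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)
d_o = ((2.0, 0.5), (0.5, 1.0))
a0 = (0.3, -0.2)

beta = flexibility_beta(omega, d_o)
a1 = chain_point(a0, omega, 0.25)
# omega . (A(y) - A_0) should vanish: moving along the chain keeps omega . A fixed.
omega_dot_shift = omega[0] * (a1[0] - a0[0]) + omega[1] * (a1[1] - a0[1])

print("beta =", beta)
print("omega . (A(y) - A_0) =", omega_dot_shift)
```

The second printed quantity checks, in this toy setting, that $\omega \cdot A(y)$ does not change along the line $A(y) = A_0 + w^\perp y$.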
REMARKS. (a) The above definition is a special case of a natural more general definition relevant for higher dimensions and for anisochronous systems. For instance in the case of anisochronous systems, in which a term $A^2/2K$, with $K > 0$
constant, is added to (1.1), one has to require that $y \to A(y)$ is a simple rectifiable curve and that, uniformly in $y \in [0, y_{max}]$, (6.1) holds with $D$ replaced by the intersection matrix $D_y$, and $\omega$ replaced by $\omega(A(y)) = \omega + A(y)K^{-1}$. But in the anisochronous cases the condition that for all $y$ there is the torus $\mathcal{T}(A(y))$, called “no gap property”, is strongly restrictive and quite artificial (although it is verified in the example in [A], see also [P]). Below we consider, without further mention, only the isochronous models in (1.1) and in this case $D, D_o$ are $y$-independent because of isochrony, see (5) in Section 3. Condition (6.1) is a transversality property: in the case of (1.1) it holds generically (in the perturbation $f$ and for $\varepsilon$ small) and in this case it is a consequence of (i). Thus examples exist and are elementary, and generically $\mu = O(\varepsilon)$. A simple concrete example is provided by the already mentioned perturbation $f(\alpha, \varphi) = \cos(\alpha_1 + \varphi) + \cos(\alpha_2 + \varphi)$. Below we shall also suppose, without further mention, that $\mu = O(\varepsilon)$, i.e., that we consider a generic case. The greater generality of the above definition is meant to clarify a notion that might seem special for the isochronous case, and for future reference.
(b) Thus every sequence $y_0, y_1, \ldots, y_N$ with $|y_i - y_{i+1}| < \vartheta\mu$ is a heteroclinic chain in the sense of Section 3, and the theorem proved in Section 5 applies to it. An elastic heteroclinic chain with parameter $\vartheta$ is also elastic with parameter $\vartheta' < \vartheta$. Hence it will not be restrictive to suppose that $\vartheta$ is as small as needed.
(c) If $\vartheta$ is small enough so that the first order Taylor expansions of the splitting vector $Q(\alpha)$, see Section 3, (5), are “good” approximations, we deduce (by applying the implicit function theorem) that a heteroclinic intersection at $\varphi = \pi$ between $W^s(A(y))$ and $W^u(A(y + \delta))$ takes place at:
$$\alpha_y(\delta) = D_o^{-1} w^\perp \vartheta' + O(\vartheta'^2) \quad \text{for } \delta = \mu\vartheta',\ |\vartheta'| \le \vartheta,$$
$$\tfrac12 \beta|\tilde\vartheta| < |(\alpha_y(\delta') - \alpha_y(\delta'')) \cdot w^\perp| < 2\beta|\tilde\vartheta|, \quad \forall\, \delta' = \mu\vartheta',\ \delta'' = \mu\vartheta'' \tag{6.2}$$
for $\vartheta$ small enough, for $|\vartheta'|, |\vartheta''| < \vartheta$, and having set $\tilde\vartheta = \vartheta' - \vartheta''$.
(d) A geometrical consequence of (6.1), (6.2) is that when $y$ varies by $\delta$ (so that $A(y)$ varies in $R^2$ orthogonally to $\omega$ by $O(\delta)$), then the heteroclinic intersection $\alpha_y(\delta)$ between $W^s(A(y))$ and $W^u(A(y + \delta))$ is displaced away from 0 with a component in the direction orthogonal to $\omega$ of size $O(\delta\mu^{-1})$, provided $\delta\mu^{-1} = \vartheta'$ is small enough.
(e) The value $\varphi = \pi$ is not special in many respects and the same remains true if one looks at the displacement of the heteroclinic intersection at any other section located away from the tori by a fixed distance $\kappa > 0$, if $\varepsilon$ is small enough. In fact consider the intersection matrix $D(t)$ evaluated along the heteroclinic trajectory
at a time $t$ after the passage through $\varphi = \pi$. From the equations of motion its evolution is:
$$D(t) = D - \varepsilon \int_0^t \partial_{\alpha\varphi} f(\varphi(\tau), \omega\tau)\, \partial_\alpha \Delta(\tau)\, d\tau, \tag{6.3}$$
where $\Delta(t)$ denotes the splitting in the $\varphi$-coordinates (and $|\Delta(t)| < O(\varepsilon)$), defined in remark (6) of Section 3, and $\varphi(t)$ is the heteroclinic evolution of $\varphi$: hence $D(t) = D + O(\varepsilon^2)$ (while $D = O(\varepsilon)$) for $t$ bounded, by Melnikov's theory (see also (5.5) in [GGM]). In particular if we look at the $\psi$-coordinate $\psi_y(\delta)$ of the heteroclinic intersection point at $q = \kappa$, on the same heteroclinic trajectory, and compare it with the position of the homoclinic point $\psi_y(0)$ of $\mathcal{T}(A(y))$ at $q = \kappa$, then we can say that, for some constants $2b_1, 2b_0$ (the factor 2 is for later convenience), it is:
$$|w^\perp \cdot (\psi_y(\delta') - \psi_y(\delta''))| \in [2b_1\tilde\vartheta, 2b_0\tilde\vartheta] \tag{6.4}$$
with $\tilde\vartheta = (\delta' - \delta'')\mu^{-1}$; the constants $b_0, b_1$ depend on the constant $\kappa$ prefixed at the beginning of Section 6, and on $\beta$.
THEOREM 2. Suppose that $y \to A(y)$ is elastic in the above sense; then, fixed $a, b$, there exist heteroclinic chains $A_0 = A(y_0), A_1 = A(y_1), \ldots, A_N = A(y_N)$ with $y_0 = 0$, $y_N = y_{max}$ along which the drift time is $\bar g^{-1} e^{O(\mu^{-1})}$.
The estimates proceed by performing the construction of Section 5 without fixing a priori the heteroclinic chain: we construct it inductively, by trying to optimize (as well as we can) various choices. The proof below is divided into several elementary statements (propositions and lemmata), each of which is marked by a •.
Using the notations of Section 4, assume that $y_j$ have been constructed for $j \le i + 1$ together with $\tilde\psi_j, r_j, r'_j, B_j, \psi'_j, B'_j$ for $j \le i$, verifying $r_i < \vartheta$. We must define $y_{i+2}, \tilde\psi_{i+1}, r_{i+1}, r'_{i+1}$. The set $B_0$ is fixed as in the paragraph following (4.2) above, and $y_1 - y_0 = \mu\vartheta$, $r_0$, $r'_0 = \frac12 r_0$ are arbitrarily chosen (positive) and we also require $r_0 < \vartheta$ and $\vartheta$ small.
• 1. Let $E$ be as in Section 5 and let $E'$ be so large that if $T' = \bar g^{-1} \log E' E^{-1}$ the points $\omega t$, $t \in [0, T']$, fill the torus within $\frac12 b_1\vartheta$ (see (6.4) for the definition of $b_1$). This means that $E'$ is very large.*
* One can take $E' = E \exp O(C\vartheta^{-\tau})$, estimating by $O(C\delta^{-\tau})$ the time needed for a quasi-periodic rotation of the torus with vector $\omega$, Diophantine with constants $C, \tau$, to fill with lines parallel to $\omega$ and within $\delta$ the whole torus $T^2$. I discuss this estimate in Appendix A1 as an aside, since here only finiteness of $E'$ is required (a trivial fact): this gives me the chance of discussing a simple conjecture.
• 2. Let $X_{i+1}(y)$ be heteroclinic between $\mathcal{T}(A_{i+1})$ and $\mathcal{T}(A(y))$ for $y \in [y_{i+1} + \frac12\mu\vartheta, y_{i+1} + \mu\vartheta]$. We choose to look for $y_{i+2}$ among such $y$'s to be sure that every time $i$ increases by one unit then $y_i$ increases by $\frac12\mu\vartheta$ at least, so that after $N$ steps, with $N = O(\mu^{-1})$, we shall have reached the “upper extreme $y_{max}$ of the chain”. Let the local coordinates of $X_{i+1}(y)$ be $(A_{i+1}, \psi_{i+1}(y), 0, \kappa)$ (see Section 4 for the notations). Let, see (4.1):
$$Y_{i+1,y}(\psi) = (A^s_{i+2,y}(\psi), \psi, p^s_{i+2,y}(\psi), \kappa) \tag{6.5}$$
be the equation of $W^s(A(y))$ in the local coordinates around the torus $\mathcal{T}(A_{i+1})$ near $X_{i+1}(y)$. We may suppose that:
$$|A^s_{i+2,y}(\psi) - A_{i+1}|,\ |p^s_{i+2,y}(\psi)| < b_3\mu|\psi - \psi_{i+1}(y)| \tag{6.6}$$
for some $b_3$ of $O(1)$ and we may suppose $b_3 > 1$, for simplicity. This simply expresses the analyticity in $\psi$ and $\varepsilon$ of the stable manifold (note that $A^s_{i+2,y}(\psi) - A_{i+1}$ and $p^s_{i+2,y}(\psi)$ vanish at $\psi_{i+1}(y)$, i.e., at the heteroclinic point).
• 3. Suppose $r$ small: a first approximation to $\tilde\psi_{i+1}$ will be obtained by fixing $y$ at the left extreme $\bar y$ of its interval of variation (which is $[y_{i+1} + \frac12\mu\vartheta, y_{i+1} + \mu\vartheta]$) and by choosing a point $\psi_{i+1,\bar y,r}$ at distance $r$ from $\psi_{i+1}(\bar y)$ along a line $\Lambda$ on which we can be sure that $p^s_{i+2,\bar y}(\psi)$ does not vanish. For instance we can take the straight line $\Lambda_{60^\circ}$ at $60^\circ$ from the gradient $a_{i+2}(\bar y)$ of $p^s_{i+2,\bar y}(\psi)$ at $\psi = \psi_{i+1}(\bar y)$. In this way ($\cos 60^\circ = \frac12$):
$$\lambda \stackrel{\rm def}{=} |p^s_{i+2,\bar y}(\psi_{i+1,\bar y,r})| \simeq \frac12 \max_{|\psi - \psi_{i+1}(\bar y)| = r} |p^s_{i+2,\bar y}(\psi)| \tag{6.7}$$
and $\lambda \in [b_2\mu r, b_3\mu r]$, for some $b_3, b_2 = O(1) > 0$, by the assumption on the splitting angles (which implies that the modulus of the gradient $a_{i+2}(y) \stackrel{\rm def}{=} \partial_\psi p^s_{i+2,y}(\psi_{i+1}(y))$ is in $[b_2\mu, b_3\mu]$ for some constants $b_2, b_3 > 0$ of $O(1)$). The constants $b_2, b_3$ depend on the “target” parameter $\kappa$, fixed once and for all, see beginning of Section 4, and $b_3$ can be taken to be the same constant as in (6.6). Let $d = b_2/2b_3$.
• 4. To improve the approximation for $\tilde\psi_{i+1}$ note that as $r$ varies in the range $d\frac{r'_i}{4b_3E'} < r < \frac{r'_i}{4b_3E}$ the point $\psi_{i+1,\bar y,r}$ varies and $\lambda$ varies by a factor not smaller than $2E'/E$ by our definition of $d$. Hence the time $T(r)$ necessary in order that the backward evolution of the point $Y_{i+1,\bar y}(\psi_{i+1,\bar y,r}) = (A^s_{i+2,\bar y}(\psi_{i+1,\bar y,r}), \psi_{i+1,\bar y,r}, p^s_{i+2,\bar y}(\psi_{i+1,\bar y,r}), \kappa)$ interchanges the last two coordinates will vary by an amount $> T' = \bar g^{-1} \log E'/E = O(C\vartheta^{-\tau})$, see footnote * and comment (c) in Section 4.
• 5. This implies, by continuity and by the size of $T'$, that there will be a value $r(\bar y)$ such that the “backward motion” $\psi_{i+1}(\bar y) \to \psi_{i+1}(\bar y) - \omega t$ of duration $T(r(\bar y))$ has $\psi$-coordinate close to $\psi'_i$ within $\frac12 b_1\vartheta$ and on the line $\ell$ orthogonal to $\omega$ through $\psi'_i$.
We can also prefix on which side of $\psi'_i$ it will be. Remark that as $y$ increases past $\bar y$ the point $\psi_{i+1}(y)$ moves with a displacement having a nonzero component in the direction parallel to $\ell$ (by the second of (6.1) and the first of (6.2)). Therefore we shall choose the side so that the component of the displacement along $\ell$ is towards $\psi'_i$: this is convenient for reasons that will become clear below.
• 6. I now imagine varying $y$ in its interval of variation $[y_{i+1} + \frac12\mu\vartheta, y_{i+1} + \mu\vartheta]$ and select $r(y)$, hence $\psi_{i+1,y,r(y)}$, so that the time of interchange of the last two coordinates of the point $Y_{i+1,y}(\psi_{i+1,y,r(y)})$ does not change. The latter time is, by (2.1), $g(p\kappa)^{-1}(1 + \gamma(p\kappa))^{-1} \log \kappa|p|^{-1}$ if $p = p^s_{i+2,y}(\psi_{i+1,y,r(y)})$, because the motion is “exponential” and preserves the product of the last two coordinates (see (2.1)). Hence fixing the interchange time means determining $\psi = \psi_{i+1,y,r(y)}$ so that $p$ is constant. Although it might be clear that this can be done, I describe in some detail the way that I follow in Appendix A2, which also gives the quantitative information that:
$$\frac{d}{8b_3E'}\, r'_i < r(y) < \frac{1}{2b_3E}\, r'_i \tag{6.8}$$
and, therefore, the point $Y_{i+1,y}(\psi)$ remains in the neighborhood where the local coordinates are defined.
• 7. For each $y$ the point $\psi_{i+1}(y) - \omega T(r(\bar y))$ will either “fall short” or “long” of the line $\ell$ orthogonal to $\omega$ through $\psi'_i$: but “only by a length bounded by $\le 2|D_o^{-1}w^\perp|\vartheta$”. In fact by construction the vector $\psi_{i+1}(\bar y) - \omega T(r(\bar y))$ is exactly on the line $\ell$ and the maximum variation that $|\psi_{i+1}(y) - \psi_{i+1}(\bar y)|$ can undergo as $y$ varies by at most $\frac12\mu\vartheta$ is bounded by the first of (6.2).
• 8. Hence by a suitable rotation of the direction of the line $\Lambda_{60^\circ}$ along which we choose the point $\psi_{i+1,y,r(y)}$ we can change the size of $p^s_{i+2,y}$ by a factor $1 + O(\vartheta)$ and arrange that at the time $T_{in}$ when the last two coordinates are interchanged $\psi_{i+1}(y) - \omega T_{in}$ is exactly on the line $\ell$ orthogonal to $\omega$ through $\psi'_i$. In fact our choice of the line $\Lambda_{60^\circ}$ on which $\psi_{i+1,y,r(y)}$ is selected, neither orthogonal nor parallel to the gradient of $p_{i+2,y}(\psi)$, shows that we can change in this way $|p^s_{i+2,y}|$ by up to a factor about 2, i.e., by a factor $1 + O(\vartheta)$ if $\vartheta$ is small, and down to a factor 0.* So that, by continuity, we can find a line $\Lambda'$ slightly off $\Lambda_{60^\circ}$ by an angle of $O(\vartheta)$ and a point $\psi$ on it such that in its interchange time $T_{in}$ the (other) point $\psi_{i+1}(y)$ ends on the target line $\ell$.
* Note that $p^s_{i+2,y}(\psi)$ vanishes in correspondence of the heteroclinic point, i.e., at $\psi_{i+1}(y)$, as well as on a curve through the heteroclinic point value $\psi_{i+1}(y)$, by the transversality of the splitting and by the implicit function theorem.
We still call $\psi_{i+1,y,r(y)}$ the new (and final) choice of $\psi$ on $\Lambda'$. Note, however, that the point $\psi_{i+1,y,r(y)}$ will not end on the line $\ell$ but it will miss it by at most a distance $r(y) < r'_i/(2b_3E)$, because $\psi_{i+1,y,r(y)}$ is $r(y)$ apart from $\psi_{i+1}(y)$, which by the construction ends on $\ell$ at the interchange time $T_{in}$: the point reached on $\ell$ will be away by at most $b_1\vartheta$ from $\psi'_i$. The angle of the needed rotation will be of $O(\vartheta)$ off the line $\Lambda_{60^\circ}$ at $60^\circ$ to the gradient $\partial_\psi p_{i+2,y}(\psi_{i+1}(y))$, because the velocity of the quasi-periodic motion has size of order $O(1)$ (i.e., it is $|\omega|$) so that a time variation of up to $O(\vartheta)$ suffices for a displacement of $r(y) < \vartheta$ (recall that $r'_i < r_i < \vartheta$, as stipulated at the beginning).
• 9. As we vary $y$ we find, by continuity, a point $y^*$ such that $\psi_{i+1}(y^*) - \omega T_{in} = \psi'_i$ because $\psi_{i+1}(y)$ has a component along $w^\perp$ which varies by $b_1\vartheta$ at least, see (6.4), and in the right direction towards $\psi'_i$ by the above proposition • 5.
• 10. Setting $r^* = r(y^*)$ and $\tilde\psi_{i+1} = \psi_{i+1,r^*,y^*}$ we see that the evolution of $Y_{i+1,y^*}(\tilde\psi_{i+1})$ leads to a point which has $\psi$-coordinate close within $\frac{r'_i}{2b_3E}$ to the coordinate $\psi'_i$ of the point $X'_i = (A_{i+1}, \psi'_i, \kappa, 0)$ (around which the already inductively known set $B'_i$ is constructed, see (4.3)), because $\tilde\psi_{i+1}$ is within $r'_i/2b_3E$ of $\psi_{i+1}(y)$ by construction. Note that this is just a continuity statement: hence it is nonconstructive, as much as the other continuity arguments used above.
We set $r_{i+1} = r^* > d\frac{r'_i}{8b_3E'}$, $r'_{i+1} = \gamma r_{i+1}^2$ with $\gamma$ small enough and see that the points of $B_{i+1}$ defined by such parameters via (4.2) evolve backward in time to fall inside $B'_i$ at their last interchange time.
However the interchange time $T_{in}$ varies when $\psi$ varies in the disk of radius $r'_{i+1}$ centered at a point at distance $r_{i+1}$ from the heteroclinic point. And it is proportional to the logarithm of the inverse of $|p_{i+2}(\psi)|$; the latter is a function essentially linear in $\psi$.* Hence its variation is bounded by a factor proportional to $\log \frac{r_{i+1} + r'_{i+1}}{r_{i+1} - r'_{i+1}}$. The latter variation has size $O(\frac{r'_{i+1}}{r_{i+1}})$, which must be $< O(r'_i)$ because we want the backward evolution of the points in $B_{i+1}$ to be, at their interchange time, inside $B'_i$, and the velocity $\omega$ of the quasi-periodic motion is of $O(1)$. Hence if the time varies by $O(r'_i)$ the resulting displacement of the final value of $\psi$ will be of $O(r'_i)$: recalling that $r_{i+1} \in [\frac{d}{8b_3E'} r'_i, \frac{1}{2b_3E} r'_i]$ we get a quadratic recursion for the definition of the $r_i$: unavoidable in the above scheme.
* Because the point $\tilde\psi_{i+1}$ is still on a line $\Lambda'$ very close, by the last remark in the preceding proposition, to $\Lambda_{60^\circ}$, at $60^\circ$ to the gradient $\partial_\psi p_{i+2,y}(\psi_{i+1}(y^*))$.
• 11. We find, from the above proposition, that $r'_i, r_i = O((\Gamma r_0)^{2^i})$ for some $\Gamma$ (one can take $\Gamma = d\gamma/(8b_3E')$); so that the time $T_i = O(\bar g^{-1} \log r_i^{-1})$ necessary to hop one step along the chain is $O(\bar g^{-1} 2^i)$ and the time $T_{drift}$ for drifting along the chain is bounded above by $O(\bar g^{-1} 2^N)$:
$$T_{drift} \le \bar g^{-1}\, 2^{O(\mu^{-1})}. \tag{6.9}$$
Recalling that $\vartheta$ is fixed, if $\mu = O(\varepsilon)$ (generic) this is $\mathrm{const}\; e^{\mathrm{const}\, \varepsilon^{-1}}$.
REMARK. Therefore the exponential bound (6.9) is due to the rapid convergence to 0 of $r_i$, i.e., with a logarithm exponentially diverging, which arises from the quadratic recursion $r'_{i+1} = O(r_i'^2)$.
7. Concluding Remarks. Very Fast Diffusion?
For a review on diffusion see [L]: in this paper the possibility of estimates of size of an inverse power of $\varepsilon$ is proposed and discussed.
(1) The above non-variational proof gives results not directly comparable to the best known, [Be], [Br], based on a variational method and giving (in [Br]) a polynomial drift time of $O(\mu^{-2})$. The papers [Be], [Br] deal with Arnold's example, [A], i.e., with a different case. However, they make use in an essential way of the very similar structure of the model, i.e., of the fact that it admits a “gap-less” foliation into stable and unstable manifolds of invariant tori, see also [P]. It is hard to see how to improve the bounds of Section 6 in Arnold's example, if it is studied along the same lines. The recent works [M], [C] also lead very close to the estimates in [Be], [Br] and, if I understand them correctly, they should also apply to the cases treated here and give polynomial estimates: hence the difference between the sizes of the bounds obtained by our approach and the ones obtained via variational methods or via geometric methods alternative to the ones presented here remains (for me) a puzzle that I hope to understand in the future.
(2) It is worth stressing again that the methods of Section 5 apply every time there are “no gaps” around resonant tori and the homoclinic angles admit a nonzero lower bound: therefore they apply to the case in [A] with, in the notations of [A], $\mu = \varepsilon^c$ and $c$ large enough.
In the isochronous models they apply, immediately, to a variety of cases: a nontrivial one is the Hamiltonian (1.1) with $\omega = (\eta^a, \eta^{-1/2})$, $a > 0$, $\varepsilon = \mu\eta^c$ with $c$ large enough and, possibly, even a further “monochromatic, strong and rapid” perturbation $\beta f_0(\varphi, \lambda)$ like $\beta \cos(\lambda + \varphi)$ with $\beta = O(1)$. Consider only values of $\eta$ such that $|\omega \cdot \nu| > C\eta^d|\nu|^{-\tau}$ for all $0 \ne \nu \in Z^2$, and for some $C, d > 0$, see Section 2 in [GGM]. Then by using the results of [GGM] (Section 8) we see that if $\eta$ is fixed small enough the homoclinic splitting is analytic in $\beta$ for $|\beta| < O(\eta^{-1/2})$, while it does not vanish for $\beta$ small (i.e., $\beta = O(\eta^c)$), generically in $f$ (see [GGM], Section 6). Hence it is not 0 for all $\beta < 2$ (say) except, possibly, for finitely many values of $\beta$. This means that in such strongly perturbed systems ($\beta = O(1)$) one still has elastic heteroclinic chains of arbitrary length, see Section 8 of [GGM], and
therefore there is diffusion provable by the methods of Sections 5 and 6, except possibly in correspondence of finitely many values of $\beta$. Furthermore the $A$-independent (because of isochrony, see [GGM]) homoclinic angles can become large when $\beta, \mu$ approach their convergence radii and this gives us the possibility of “very fast” drift on time scales of $\sim O(1)$. In fact I think that the homoclinic splitting might be a monotonic function of $\varepsilon, \beta$ for interesting classes of perturbations.
(3) An advantage of the technique of Section 5 is its flexibility, which makes it immediately applicable, essentially without change, to anisochronous systems, see [CG] as corrected in [CV].
(4) Constructivity, even partial (see comments in Section 5), seems the key to understanding the huge difference between the results of Section 5 and the variational results, or those of Section 6 above: diffusion time bounds in an inverse power of $\varepsilon$ (in [Br] and Section 6) versus a super-exponential in the more constructive proposal in Section 5. A hint in this direction is provided by the bound in Section 6: by adding a new idea to the method of Section 5, i.e., of [CG], one can get a drift time estimate of $2^{O(\varepsilon^{-1})}$ instead of the super-exponential of [CG] and Section 5. But the theory now becomes less constructive: not even the sequence of close encounters with invariant tori is determined constructively, as continuity arguments are used.
(5) The methods of Section 5 and of Section 6 seem related to the “windowing” analysis in the early work [Ea] and in [M], [C], as pointed out to me by P. Lochak and J. Cresson.
(6) Finally only drift in phase space is discussed here: but it is clear that heteroclinic chains do not need to “advance” at each step (e.g., an $A$-coordinate need not increase systematically): we can use heteroclinic chains that advance and back up at our prefixed wish (e.g., randomly). In this sense, there is no difference between drift and diffusion.
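The way the quadratic recursion of Section 6 produces the exponential bound (6.9) can be made concrete with a small numerical sketch. It assumes the recursion in the idealized form $r_{i+1} = \Gamma r_i^2$ and hop times $T_i = \bar g^{-1} \log r_i^{-1}$; the constants, the chain length and the function names are hypothetical, and the logarithms are iterated directly so that the rapidly shrinking $r_i$ never underflow.

```python
import math


def log_inverse_scales(log_inv_r0, log_inv_gamma, n_steps):
    """Return [log(1/r_0), ..., log(1/r_N)] for the recursion r_{i+1} = Gamma * r_i**2.

    Taking logarithms gives log(1/r_{i+1}) = 2 * log(1/r_i) + log(1/Gamma).
    """
    scales = [log_inv_r0]
    for _ in range(n_steps):
        scales.append(2.0 * scales[-1] + log_inv_gamma)
    return scales


def drift_time(log_inv_r0, log_inv_gamma, n_steps, g_bar=1.0):
    """Sum of the hop times T_i = log(1/r_i) / g_bar over the whole chain."""
    return sum(log_inverse_scales(log_inv_r0, log_inv_gamma, n_steps)) / g_bar


# Hypothetical constants: r_0 = 0.1, Gamma = 0.01, chain of N = 20 steps.
L0, c, N = math.log(10.0), math.log(100.0), 20
scales = log_inverse_scales(L0, c, N)
total = drift_time(L0, c, N)

# Closed form of the same sum: log(1/r_i) = 2**i * (L0 + c) - c.
closed_form = (2 ** (N + 1) - 1) * (L0 + c) - (N + 1) * c

print("last hop time / previous hop time =", scales[N] / scales[N - 1])
print("total drift time (g_bar = 1)      =", total)
print("closed form                       =", closed_form)
```

Each hop time is about twice the previous one, so the total is dominated by the last hop and grows like $2^N$; with $N = O(\mu^{-1})$ this is the mechanism behind the form of (6.9).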
Acknowledgements
I am indebted to P. Lochak for stimulating comments and, in particular, to G. Gentile and V. Mastropietro for many discussions and help in revising the manuscript. This work is part of the research program of the European Network on: “Stability and Universality in Classical Mechanics”, #ERBCHRXCT940460.
Appendix A1. Filling Times of Quasi Periodic Motions: A Conjecture
Let $(\omega_1, \ldots, \omega_d) = \omega \in R^d$ be such that $|\omega \cdot \nu|^{-1} \le C|\nu|^\tau$. Let $\chi(x), \chi_\perp(x)$ be $C^\infty$-functions even and strictly positive for $|x| < \frac12\pi$, vanishing elsewhere and with integral 1. Let $\psi, \psi_0 \in T^d$ and $x(\psi) \stackrel{\rm def}{=} \varepsilon^{-(d-1)} \chi(\omega \cdot (\psi - \psi_0)/|\omega|) \cdot \chi(\varepsilon^{-1}|P^\perp(\psi - \psi_0)|)$, $P^\perp$ = orthogonal projection on the plane orthogonal to $\omega$.
The function $x$ can be naturally regarded as defined and periodic on $T^d$: if $\hat\chi(\sigma)$ is the Fourier transform of $\chi$ as a function on $R$ then the Fourier transform of $x$ is $\hat\chi(\nu^\parallel)\hat\chi(\varepsilon|\nu^\perp|)$, $\nu$ integer components vector, $\nu^\parallel = \omega \cdot \nu/|\omega|$, $\nu^\perp = P^\perp\nu$. The average $X = T^{-1}\int_0^T x(\omega t)\, dt$ is:
$$X = 1 + \sum_{\nu \ne 0} \hat x(\nu)\, e^{-i\psi_0 \cdot \nu}\, \frac{e^{i\omega \cdot \nu T} - 1}{T\, i\omega \cdot \nu} \ge 1 - \frac{2C}{T} \sum_{\nu \ne 0} \hat\chi(\nu^\parallel)\hat\chi(\varepsilon|\nu^\perp|)\, |\nu|^\tau. \tag{A1.1}$$
Since the last sum is bounded above by $b\varepsilon^{-(\tau+d-1)}$ the average $X$ is positive, e.g., $> \frac12$, for all $\psi_0$ if $T > 4bC\varepsilon^{-(\tau+d-1)}$. This means that for $T > 4bC\varepsilon^{-(\tau+d-1)} + \pi/|\omega|$, hence for $T > BC\varepsilon^{-(\tau+d-1)}$ with $B$ a suitable constant depending only on $d$, the torus will have been filled by the trajectory of any point within a distance $\varepsilon$. This proof is taken from (5), p. 111, of [G1]; see [BGW] for an alternative proof and a much stronger result (i.e., with $\tau$ replacing $\tau + d - 1$).
Of course the above estimate $T > O(\varepsilon^{-\tau-(d-1)})$ really deals with a quantity different from the minimum time of visit. It is an estimate of the minimum time beyond which all cylinders with height 1 (say) and basis of radius $\varepsilon$ have not only been visited but have been visited with a frequency that is, for all of them, larger than $\frac12$ of the asymptotic value (equal to $\varepsilon^{d-1}$): we can call the latter time the first large frequency of visit time. The difference between the two concepts explains the difference between the two estimates, which are equally good, i.e., alternative, for the purposes of our analysis (and both too detailed since we only need that the minimum time of visit is finite). And I conjecture that both are optimal: the first is optimal as an estimate of the first visit time and the second as an estimate of the first large frequency of visit time.
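The notion of a filling time can be explored numerically in the simplest case $d = 2$. The sketch below is a simplified discrete analogue, not the estimate (A1.1): it assumes that the flow $\omega t$ on $T^2$ is observed at its successive crossings of the section $\psi_1 = 0$, where it acts as a rotation of the circle by $\rho = \omega_2/\omega_1$ (in units of $2\pi$), and it counts how many crossings are needed before every point of the circle is within $\delta$ of one of them. The function names and the choice of a golden-mean rotation number are illustrative assumptions.

```python
import math


def max_gap(rho, n):
    """Largest gap on the unit circle left by the points k*rho mod 1, k = 0..n-1."""
    pts = sorted((k * rho) % 1.0 for k in range(n))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around gap
    return max(gaps)


def filling_steps(rho, delta):
    """Smallest n such that every point of the circle is within delta of some k*rho."""
    n = 1
    while max_gap(rho, n) > 2.0 * delta:
        n += 1
    return n


# Assumed rotation number: the golden mean, a standard Diophantine example.
rho = (math.sqrt(5.0) - 1.0) / 2.0

for delta in (0.05, 0.01, 0.002):
    n = filling_steps(rho, delta)
    print(f"delta = {delta}: {n} crossings, n * delta = {n * delta:.3f}")
```

In this analogue the corresponding flow time is the number of crossings multiplied by the return time $2\pi/\omega_1$ to the section, so the growth of the crossing count as $\delta$ decreases can be compared with the power laws discussed above.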
Appendix A2. Fixing the Time of Interchange
The differential condition on $\psi$ is $\frac{d}{dy}\big(p^s_{i+2,y}(\psi) - p^s_{i+2,y}(\psi_{i+1}(y))\big) = 0$ (having inserted $p^s_{i+2,y}(\psi_{i+1}(y)) = 0$ for convenience) or, if prime denotes $y$-differentiation:
$$0 = \partial_\psi p^s_{i+2,y}(\psi) \cdot (\psi' - \psi_{i+1}(y)') + \big(\partial_\psi p^s_{i+2,y}(\psi) - \partial_\psi p^s_{i+2,y}(\psi_{i+1}(y))\big) \cdot \psi_{i+1}(y)' + (\partial_y p^s_{i+2,y})(\psi) - (\partial_y p^s_{i+2,y})(\psi_{i+1}(y)), \tag{A2.1}$$
which means that $r(y) \stackrel{\rm def}{=} |\psi - \psi_{i+1}(y)|$ verifies $|r(y)'| < C_1 r(y)$ for some $C_1 > 0$ independent of $\mu$ because:
(1) all derivatives of $p^s_{i+2,y}$ with respect to the arguments $\psi, y$ are of order $\mu$.
(2) The vector $\psi - \psi_{i+1}(y)$ has the form $r(y)w_{60^\circ}(y)$ where $w_{60^\circ}(y)$ is the unit vector parallel to the axis forming $60^\circ$ with $\partial_\psi p^s_{i+2,y}(\psi_{i+1}(y))$, so that $(\psi - \psi_{i+1}(y))' = r'(y)w_{60^\circ}(y) + r(y)w'_{60^\circ}(y)$.
(3) $\psi_{i+1}(y)'$ has size $O(1)$; the second derivatives of $p^s_{i+2,y}(\psi)$ with respect to $\psi$ have size $O(\mu)$ and the derivative of $w_{60^\circ}(y)$, which can be computed by differentiating its expression (namely $\partial_\psi p^s_{i+2}(\psi_{i+1}(y))/|\partial_\psi p^s_{i+2}(\psi_{i+1}(y))|$), is of $O(1)$.
This fixes the $y$-derivatives of $r(y) \stackrel{\rm def}{=} |\psi - \psi_{i+1}(y)|$ to have size $O(r(y))$ so that the variation of $r(y)$, as $y$ varies in its interval of size $\frac12\mu\vartheta$ and starting at $\bar y = y_{i+1} + \frac12\mu\vartheta$, is bounded by $(e^{C_1 \frac12\mu\vartheta} - 1)\, r(\bar y) \le C_1\mu\vartheta\, r(\bar y)$ if $\vartheta C_1 < \frac12$. Hence $\frac{d}{8b_3E'} r'_i < r(y) < \frac{1}{2b_3E} r'_i$ and the point $Y_{i+1,y}(\psi)$ remains in the neighborhood where the local coordinates are defined.
References
[A] Arnold, V.: Instability of dynamical systems with several degrees of freedom, Sov. Mathematical Dokl. 5 (1966), 581–585.
[Be] Bessi, U.: An approach to Arnold's diffusion through the Calculus of Variations, Nonlinear Analysis, 1995.
[Br] Bernard, P.: Perturbation d'un hamiltonien partiellement hyperbolique, C.R. Academie des Sciences de Paris 323(1) (1996), 189–194.
[BGW] Bourgain, J., Golse, F. and Wennberg, S.: The ergodisation time for linear flows on tori: Application for kinetic theory, Preprint, 1995, to appear in Communications in Mathematical Physics.
[C] Cresson, J.: Symbolic dynamics for homoclinic partially hyperbolic tori and “Arnold diffusion”, Preprint of Institut de Mathématiques de Jussieu, 1997. And, mainly: Propriétés d'instabilité des systèmes Hamiltoniens proches de systèmes intégrables, Doctoral dissertation, L'Observatoire de Paris, Paris, 1997.
[CG] Chierchia, L. and Gallavotti, G.: Drift and diffusion in phase space, Annales de l'Institut Poincaré B 60 (1994), 1–144.
[CV] Chierchia, L. and Valdinoci, E.: A note on the construction of Hamiltonian trajectories along heteroclinic chains, to appear in Forum Mathematicum.
[DGJS] Delshams, S., Gelfreich, V. G., Jorba, A. and Seara, T. M.: Exponentially small splitting of separatrices under fast quasiperiodic forcing, Communications in Mathematical Physics 189 (1997), 35–72.
[Ea] Easton, R. W.: Orbit structure near trajectories biasymptotic to invariant tori, in R. Devaney, Z. Nitecki (eds.), Classical Mechanics and Dynamical Systems, Dekker, 1981, pp. 55–67.
[E] Eliasson, L. H.: Absolutely convergent series expansions for quasi-periodic motions, Mathematical Physics Electronic Journal 2 (1996).
[G1] Gallavotti, G.: The Elements of Mechanics, Springer, 1983.
[G2] Gallavotti, G.: Twistless KAM tori, Communications in Mathematical Physics 164 (1994), 145–156.
[G3] Gallavotti, G.: Twistless KAM tori, quasi flat homoclinic intersections, and other cancellations in the perturbation series of certain completely integrable Hamiltonian systems. A review, Reviews on Mathematical Physics 6 (1994), 343–411.
MPAG024.tex; 6/04/1999; 8:19; p.17
312
G. GALLAVOTTI
[G4]
[Ge]
[GG] [GGM]
[L] [Ea]
[M] [P]
[T]
Gallavotti, G.: Hamilton–Jacobi’s equation and Arnold’s diffusion near invariant tori in a priori unstable isochronous systems, Rendiconti del seminario matematico di Torino, in print; also in [email protected]#9710019. Gentile, G.: A proof of existence of whiskered tori with quasi flat homoclinic intersections in a class of almost integrable systems, Forum Mathematicum 7 (1995), 709–753. See also: Whiskered tori with prefixed frequencies and Lyapunov spectrum, Dynamics and Stability of Systems 10 (1995), 269–308. Gallavotti, G. and Gentile, G.: Majorant series convergence for twistless KAM tori, Ergodic Theory and Dynamical Systems 15 (1995), 857–869. Gallavotti, G., Gentile, G. and Mastropietro, V.: Pendulum: Separatrix splitting, Preprint, chao-dyn #9709004: this paper will appear with a different, more informative, title “Separatrix splitting for systems with three time scales”. And G. Gallavotti, G. Gentile and V. Mastropietro: Melnikov’s approximation dominance. Some examples, chao-dyn #9804043, in print in Reviews in Mathematical Physics. Lochak, P.: Arnold’s diffusion: A compendium of remarks and questions, Proceedings of 3DHAM, s’Agaro, 1995, in print. Easton, R. W.: Orbit structure near trajectories biasymptotic to invariant tori, in R. Devaney, Z. Nitecki (eds.), Classical Mechanics and Dynamical Systems, Dekker, 1981, pp. 55–67. Marco, J. P.: Transitions le long des chaines de tores invariants pour les systèmes hamiltoniens analytiques, Annales de l’Institut Poincaré 64 (1995), 205–252. Perfetti, P.: Fixed point theorems in the Arnol’d model about instability of the actionvariables in phase space, mp- [email protected], #97-478, 1997, in print in Discrete and Continuous Dynamical Systems. Thirring, W.: Course in Mathematical Physics, vol. 1, p. 133, Springer, Wien, 1983.
MPAG024.tex; 6/04/1999; 8:19; p.18
Mathematical Physics, Analysis and Geometry 1: 313–330, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands.
The Inverse Spectral Method for Colliding Gravitational Waves

A. S. FOKAS
Department of Mathematics, Imperial College, SW7 2BZ U.K.

L.-Y. SUNG
Department of Mathematics, University of South Carolina, Columbia, SC 29208, U.S.A.

D. TSOUBELIS
Department of Mathematics, University of Patras, 261 10 Patras, Greece

(Received: 2 February 1998; in final form: 18 February 1998)

Abstract. The problem of colliding gravitational waves gives rise to a Goursat problem in the triangular region $-1 \le x < y \le 1$ for a certain $2 \times 2$ matrix-valued nonlinear equation. This equation, which is a particular exact reduction of the vacuum Einstein equations, is integrable, i.e. it possesses a Lax pair formulation. Using the simultaneous spectral analysis of this Lax pair we study the above Goursat problem as well as its linearized version. It is shown that the linear problem reduces to a scalar Riemann–Hilbert problem, which can be solved in closed form, while the nonlinear problem reduces to a $2 \times 2$ matrix Riemann–Hilbert problem, which under certain conditions is solvable.

Mathematics Subject Classifications (1991): 83C35, 35Q20, 58F07, 65.

Key words: colliding gravitational waves, Ernst equation, boundary-value problem, inverse spectral method, Riemann–Hilbert problem, Goursat problem, Einstein equations.
1. Introduction

One of the most extensively studied problems in general relativity is the collision of two plane gravitational waves in a flat background. Assuming that the two approaching waves are known, it can be shown ([1] and appendix) that the problem of describing the interaction following the collision of the two waves is closely related to the following boundary value problem: Let $g(x, y)$ be a real, symmetric, $2 \times 2$ matrix-valued function of $x$ and $y$ for $(x, y) \in D$, where $D$ is the triangular region $D = \{(x, y) \in \mathbb{R}^2,\ -1 \le x < y \le 1\}$ depicted in Figure 1. Let subscripts denote partial derivatives. The function $g(x, y)$ solves the PDE

$$2(y - x)g_{xy} + g_x - g_y + (x - y)(g_x g^{-1} g_y + g_y g^{-1} g_x) = 0, \tag{1.1}$$

with the boundary conditions

$$g(-1, y) = g_1(y),\quad -1 < y \le 1; \qquad g(x, 1) = g_2(x),\quad -1 \le x < 1, \tag{1.2}$$

where the functions $g_1(y)$ and $g_2(x)$ are uniquely specified by the approaching waves.

Figure 1. The region $D = \{(x, y) \in \mathbb{R}^2 : -1 \le x < y \le 1\}$ corresponding to the Goursat problem defined by (1.1) and (1.2).

Equation (1.1), which is a particular exact reduction of the vacuum Einstein equations, is equivalent to the celebrated Ernst equation [2]. Belinsky and Zakharov [3] have shown that Equation (1.1) is integrable, in the sense that it admits the Lax pair formulation [4]

$$\frac{\partial\Psi}{\partial x} + \frac{2\kappa}{\kappa + x - y}\,\frac{\partial\Psi}{\partial\kappa} = \frac{(x - y)g_x g^{-1}}{\kappa + x - y}\,\Psi, \tag{1.3a}$$

$$\frac{\partial\Psi}{\partial y} + \frac{2\kappa}{\kappa + y - x}\,\frac{\partial\Psi}{\partial\kappa} = \frac{(y - x)g_y g^{-1}}{\kappa + y - x}\,\Psi, \tag{1.3b}$$

where $\Psi(x, y, \kappa)$ is a $2 \times 2$ matrix-valued function of the arguments indicated and $\kappa \in \mathbb{C}$. An alternative Lax pair of Equation (1.1) is [5, 6]

$$\frac{\partial\Psi}{\partial x} = \frac12\left(1 - \frac{(y - \lambda)^{1/2}}{(x - \lambda)^{1/2}}\right) g_x g^{-1}\,\Psi, \tag{1.4a}$$

$$\frac{\partial\Psi}{\partial y} = \frac12\left(1 - \frac{(x - \lambda)^{1/2}}{(y - \lambda)^{1/2}}\right) g_y g^{-1}\,\Psi, \tag{1.4b}$$

where $\lambda \in \mathbb{C}$.

For integrable equations it is usually possible to:
(i) Construct a large class of particular explicit solutions, using a variety of the so-called direct methods, such as Bäcklund transformations [7], the dressing method [8], the direct linearizing method [9], etc.
(ii) Investigate certain initial-value problems using the so-called inverse spectral method [10–12].

Solving an initial-value problem is more difficult than deriving particular solutions. The problem of colliding gravitational waves is a boundary-value problem and such problems are even more difficult than initial-value problems. Indeed, regarding the interaction of plane gravitational waves, although many classes of particular exact solutions have been
found (see [1, 13–17]), the initial-value problem has only been addressed by Hauser and Ernst [18, 19]. These authors did not investigate Equation (1.1) directly. Instead, they have shown that, in the particular case of gravitational waves, Equation (1.1) can be related to the equation

$$(y - x)G_{xy} + [G_x, G_y] = 0, \tag{1.5}$$

where $[\ ,\ ]$ denotes the matrix commutator. Equation (1.5) has been studied in [18, 19] using indirectly the fact that Equation (1.5) possesses the Lax pair

$$\frac{\partial\Psi}{\partial x} = \frac{G_x\Psi}{x - \lambda}, \qquad \frac{\partial\Psi}{\partial y} = \frac{G_y\Psi}{y - \lambda}, \tag{1.6}$$

where $\Psi(x, y, \lambda)$ is a $2 \times 2$ matrix-valued function of the arguments indicated.

In this paper we use the inverse spectral method to solve the boundary value problem defined by Equations (1.1) and (1.2) as well as the following linear boundary value problem: Let the matrix-valued function $\gamma(x, y)$, $(x, y) \in D$, satisfy the linear PDE

$$2(y - x)\gamma_{xy} + \gamma_x - \gamma_y = 0, \tag{1.7}$$

where $\gamma(-1, y)$ and $\gamma(x, 1)$ are given functions of $y$ and $x$ respectively. This boundary value problem can be considered as the small $g$ limit of the boundary value problem defined by Equations (1.1) and (1.2). Indeed, substituting $g = I + \varepsilon\gamma$ in Equation (1.1) (where $I$ is the identity matrix) and keeping only $O(\varepsilon)$ terms, Equation (1.1) becomes Equation (1.7).

We now state the main result of this paper:

THEOREM 1.1. Assume that the derivatives of $g_1(y)$ and of $g_2(x)$ are $C^2$ in $[-1, 1]$ and sufficiently small, and that $g_1(1) = g_2(-1) = I$, where $I$ is the $2 \times 2$ identity matrix. Then the Goursat problem defined by (1.1) and (1.2) has a unique $C^2$ classical solution in $D$. This solution can be obtained by solving the following Riemann–Hilbert problem for the $2 \times 2$ matrix-valued functions $\Psi$ and $\Phi$:

$$\Psi^+(x, y, \lambda) = \begin{cases} \Phi^-(x, y, \lambda), & x \le \lambda \le y, \\ \Psi^-(x, y, \lambda), & -\infty < \lambda \le -1,\ 1 \le \lambda < \infty, \\ \Psi^-(x, y, \lambda)\,G_l(\lambda), & -1 \le \lambda \le x, \\ \Psi^-(x, y, \lambda)\,G_r(\lambda), & y \le \lambda \le 1, \end{cases} \tag{1.8a}$$

$$\Phi^+(x, y, \lambda) = \begin{cases} \Psi^-(x, y, \lambda), & x \le \lambda \le y, \\ \Phi^-(x, y, \lambda), & -\infty < \lambda \le -1,\ 1 \le \lambda < \infty, \\ \Phi^-(x, y, \lambda)\,G_l(\lambda)^{-1}, & -1 \le \lambda \le x, \\ \Phi^-(x, y, \lambda)\,G_r(\lambda)^{-1}, & y \le \lambda \le 1, \end{cases} \tag{1.8b}$$

$$\lim_{\lambda\to\infty}\Psi = I, \qquad \lim_{\lambda\to\infty}\Phi = g, \tag{1.8c}$$

where

$$\Psi^\pm(x, y, \lambda) = \Psi(x, y, \lambda \pm i0), \qquad \Phi^\pm(x, y, \lambda) = \Phi(x, y, \lambda \pm i0), \qquad \lambda \in \mathbb{R},$$
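and the $2 \times 2$ matrix-valued functions $G_l(\lambda)$, $G_r(\lambda)$ are defined in terms of $g_1(y)$ and $g_2(x)$ as follows:

$$G_l(\lambda) = (L^-(\lambda, \lambda))^{-1} L^+(\lambda, \lambda), \qquad G_r(\lambda) = (R^-(\lambda, \lambda))^{-1} R^+(\lambda, \lambda), \tag{1.9}$$

where

$$L^\pm(x, \lambda) = I + \frac12\int_{-1}^{x}\left(1 \mp i\,\frac{(1 - \lambda)^{1/2}}{(\lambda - \xi)^{1/2}}\right)\frac{dg_2(\xi)}{d\xi}\,g_2^{-1}(\xi)\,L^\pm(\xi, \lambda)\,d\xi, \qquad -1 \le x \le \lambda, \tag{1.10}$$

$$R^\pm(y, \lambda) = I - \frac12\int_{y}^{1}\left(1 \pm i\,\frac{(\lambda + 1)^{1/2}}{(\eta - \lambda)^{1/2}}\right)\frac{dg_1(\eta)}{d\eta}\,g_1^{-1}(\eta)\,R^\pm(\eta, \lambda)\,d\eta, \qquad \lambda \le y \le 1. \tag{1.11}$$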
We conclude this introduction with some remarks.

(1) It can be shown that if $g \in \mathbb{R}$, and if $g(x, y)$ satisfies the equation $gCgC = \rho(x - y)^2$, where $\rho$ is a real constant and $C$ is a real, nonsingular, constant matrix, then Equation (1.1) is simply related to (1.5). This relationship is valid for the particular case of gravity. Thus the initial-value problem for the colliding gravitational waves can also be investigated by applying the inverse spectral method to Equations (1.6). The inverse spectral method for the Lax pair (1.6) involves the technical difficulty of analyzing eigenfunctions with Cauchy type singularities; a rigorous investigation of this problem remains open.

(2) The boundary value problem of Equation (1.7) mentioned above was first solved by Szekeres [20] using the classical Riemann function technique. The same problem was later solved by Hauser and Ernst using separation of variables and the Abel transform [21, 22]. The linear problem has also been discussed in [23]. It was emphasized in [24] that before solving a given nonlinear integrable equation, it is quite useful to use the inverse spectral method to solve the linearized version of this nonlinear equation. This is carried out in Section 3, using the fact that Equation (1.7) possesses the Lax pair

$$\frac{\partial\Psi}{\partial x} + \frac{\Psi}{2(x - \lambda)} = \frac{\gamma_x}{2(x - \lambda)}, \qquad \frac{\partial\Psi}{\partial y} + \frac{\Psi}{2(y - \lambda)} = \frac{\gamma_y}{2(y - \lambda)}. \tag{1.12}$$

(3) When analyzing a Lax pair, it is customary to study the two equations forming this pair independently. Indeed, one usually studies one of the two equations to formulate an inverse problem in terms of appropriate spectral data, and then one uses the second equation to determine the "evolution" of the spectral data. Actually, this philosophy is precisely the one used for solving linear equations. However, it turns out that for solving the boundary value problem (1.1) and (1.2), it is more convenient to study both equations forming the Lax pair simultaneously. This important insight was gained from the inverse spectral analysis of the linear
Equation (1.7). The simultaneous spectral analysis of the Lax pair has led to a unified transform method for solving initial-boundary-value problems for linear and for integrable nonlinear PDE's [25].

(4) The Riemann–Hilbert (RH) problem (1.8) has the technical difficulty that $\Phi(x, y, \infty) = g(x, y)$ is unknown. This difficulty can be bypassed by formulating an equivalent RH problem for some other sectionally analytic function $\mu$ (see Equation (3.16) for the relationship between $\Phi$, $\Psi$ and $\mu$). The function $\mu$ satisfies $\mu(x, y, \infty) = I$. Furthermore, it is shown in [27] that the RH problem satisfied by $\mu$ is solvable without a small norm assumption on $g_1(y)$ and $g_2(x)$, provided that they satisfy a certain symmetry condition. This is a consequence of the fact that there exists a topological vanishing lemma for this RH problem (see [27] for details). We emphasize that the solvability of the RH problem (1.8) is based on the proof presented in [27] of the solvability of the equivalent RH problem for $\mu$.

2. The Lax Pair Representation

PROPOSITION 2.1. Let $g(x, y)$ be a matrix-valued function belonging to $C^2(\mathbb{R}^2)$.
(i) The nonlinear Equation (1.1) is the compatibility condition of Equations (1.3), where $\Psi(x, y, \kappa)$ is a $2 \times 2$ matrix-valued function belonging to $C^2(\mathbb{R} \times \mathbb{C})$ and $\kappa \in \mathbb{C}$.
(ii) Equation (1.1) is also the compatibility condition of Equations (1.4).
(iii) Under the transformation

$$G_x = (x - y)g_x g^{-1}, \qquad G_y = (y - x)g_y g^{-1}, \tag{2.1}$$

Equation (1.1) becomes

$$2(y - x)G_{xy} + G_y - G_x + [G_x, G_y] = 0. \tag{2.2}$$

Proof. (i) and (iii). Let $\Psi$ satisfy

$$\Psi_x + \frac{2\kappa}{\kappa + x - y}\,\Psi_\kappa = \frac{A\Psi}{\kappa + x - y}, \tag{2.3a}$$

$$\Psi_y + \frac{2\kappa}{\kappa + y - x}\,\Psi_\kappa = \frac{B\Psi}{\kappa + y - x}, \tag{2.3b}$$

where $A(x, y)$ and $B(x, y) \in C^1(\mathbb{R}^2)$. It can be verified that the compatibility of Equations (2.3) yields

$$A_y = B_x, \tag{2.4a}$$

$$(x - y)(B_x + A_y) + A - B + [B, A] = 0. \tag{2.4b}$$

Indeed, if

$$D_1 := \partial_x + \frac{2\kappa}{\kappa + x - y}\,\partial_\kappa, \qquad D_2 := \partial_y + \frac{2\kappa}{\kappa + y - x}\,\partial_\kappa,$$
it is straightforward to show that

$$D_1(D_2\Psi) = D_2(D_1\Psi). \tag{2.5}$$

Then the compatibility of Equations (2.3) yields

$$D_2\!\left(\frac{A\Psi}{\kappa + x - y}\right) = D_1\!\left(\frac{B\Psi}{\kappa + y - x}\right),$$

which implies Equations (2.4). Integrating Equation (2.4a) it follows that $A = G_x$ and $B = G_y$. Then Equation (2.4b) becomes Equation (2.2). Using Equations (2.1) in Equation (2.2), Equation (1.1) follows. We note that the compatibility condition of Equations (2.1) is Equation (1.1) itself, thus the transformation (2.1) is well-defined.

(ii) It can be verified directly that the compatibility of Equations (1.4) is Equation (1.1). □

REMARK 2.1. Equation (2.2) is the compatibility condition of

$$\Psi_x = \frac12\left[1 - \left(\frac{y - \lambda}{x - \lambda}\right)^{1/2}\right] G_x\Psi, \tag{2.6a}$$

and

$$\Psi_y = \frac12\left[1 - \left(\frac{x - \lambda}{y - \lambda}\right)^{1/2}\right] G_y\Psi. \tag{2.6b}$$
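REMARK 2.2. It is straightforward to obtain the Lax pair (1.4) from the Lax pair (1.3): Indeed, one can introduce characteristic coordinates in (1.3) if and only if

$$\frac{\partial\kappa}{\partial x} = \frac{2\kappa}{\kappa + x - y}, \qquad \frac{\partial\kappa}{\partial y} = \frac{2\kappa}{\kappa + y - x}. \tag{2.7}$$

These equations are compatible since $\kappa_{xy} = 2\kappa/(\kappa^2 - (x - y)^2) = \kappa_{yx}$. Their solution is $\kappa^2 + 2\kappa(2\lambda - (x + y)) + (x - y)^2 = 0$, where $\lambda$ is a constant. Solving this quadratic for $\kappa$ and taking the root with the plus sign gives

$$\kappa = x + y - 2\lambda + 2(x - \lambda)^{1/2}(y - \lambda)^{1/2}. \tag{2.8}$$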
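The algebra of this remark can be checked numerically. The following Python sketch is an illustration added for that purpose, not part of the original derivation; the function names and the sample point are assumptions. It writes the root of the quadratic as the perfect square $((x - \lambda)^{1/2} + (y - \lambda)^{1/2})^2 = x + y - 2\lambda + 2(x - \lambda)^{1/2}(y - \lambda)^{1/2}$ and tests both the quadratic and the characteristic equations (2.7), the latter by central differences.

```python
import cmath


def kappa(x, y, lam):
    """Root of the quadratic, written as ((x - lam)^(1/2) + (y - lam)^(1/2))^2."""
    return (cmath.sqrt(x - lam) + cmath.sqrt(y - lam)) ** 2


def quadratic_residual(x, y, lam):
    """Left-hand side of kappa^2 + 2 kappa (2 lam - (x + y)) + (x - y)^2 = 0."""
    k = kappa(x, y, lam)
    return k * k + 2 * k * (2 * lam - (x + y)) + (x - y) ** 2


def characteristic_residuals(x, y, lam, h=1e-6):
    """Residuals of the two equations (2.7); derivatives by central differences."""
    k = kappa(x, y, lam)
    k_x = (kappa(x + h, y, lam) - kappa(x - h, y, lam)) / (2 * h)
    k_y = (kappa(x, y + h, lam) - kappa(x, y - h, lam)) / (2 * h)
    return k_x - 2 * k / (k + x - y), k_y - 2 * k / (k + y - x)


# Hypothetical sample point: (x, y) inside D and a non-real spectral parameter.
x0, y0, lam0 = -0.3, 0.5, 0.2 + 0.7j
print(abs(quadratic_residual(x0, y0, lam0)))
print([abs(res) for res in characteristic_residuals(x0, y0, lam0)])
```

Both printed residuals should be at the level of rounding and finite-difference error. A non-real $\lambda$ is used so that the sample point stays away from the branch cuts of the square roots.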
Using this equation it follows that

$$\frac{\kappa + x - y}{2} = (x - \lambda)^{1/2}\left[(x - \lambda)^{1/2} + (y - \lambda)^{1/2}\right],$$

and

$$\frac{\kappa + y - x}{2} = (y - \lambda)^{1/2}\left[(x - \lambda)^{1/2} + (y - \lambda)^{1/2}\right].$$

Substituting these expressions into the right-hand side of Equations (1.3), and using

$$x - y = (x - \lambda) - (y - \lambda) = \left[(x - \lambda)^{1/2} - (y - \lambda)^{1/2}\right]\left[(x - \lambda)^{1/2} + (y - \lambda)^{1/2}\right],$$

Equations (1.3) become Equations (1.4).

PROPOSITION 2.2. Let $g(x, y)$ satisfy Equation (1.1). Assume that

$$g \in \mathbb{R}, \qquad gCgC = \rho(x - y)^2, \tag{2.9}$$
where $\rho$ is a real constant, and $C$ is a real nonsingular constant matrix. Define $G(x, y)$ by

$$G = i\alpha gC + \beta fC, \qquad \alpha^2 = -\frac{1}{16\rho}, \quad \beta = -\frac{1}{4\rho\nu}, \quad \nu = \text{constant}, \tag{2.10}$$

where $f(x, y)$ is defined by

$$f_x = \frac{\nu\,gCg_x}{x - y}, \qquad f_y = \frac{\nu\,gCg_y}{y - x}. \tag{2.11}$$

Then $G(x, y)$ solves Equation (1.5).

Proof. Equation (2.9) implies

$$g_x CgC + gCg_x C = 2\rho(x - y), \tag{2.12}$$

$$g_y CgC + gCg_y C = 2\rho(y - x). \tag{2.13}$$

Using $g^{-1} = CgC/\rho(x - y)^2$, Equation (1.1) becomes

$$(x - y)\left(2g_{xy} - \frac{g_x CgCg_y}{\rho(x - y)^2} - \frac{g_y CgCg_x}{\rho(x - y)^2}\right) + g_y - g_x = 0.$$

Multiplying this equation by $gC$ and using Equations (2.12) to replace $gCg_x C$ and $gCg_y C$, it follows that

$$(x - y)(2gCg_{xy} + g_y Cg_x + g_x Cg_y) + gCg_x - gCg_y = 0. \tag{2.14}$$

This equation can be written as

$$\left(\frac{gCg_x}{x - y}\right)_y = \left(\frac{gCg_y}{y - x}\right)_x,$$

which shows that $f$ is well-defined by Equation (2.11).
Figure 2. The cut complex λ-plane used in Theorem 3.1.
Substituting $G = i\alpha gC + \beta fC$ into (1.5) one finds two equations. One of them is

$$(y - x)g_{xy} + \beta g_x Cf_y - \beta g_y Cf_x + \beta f_x Cg_y - \beta f_y Cg_x = 0.$$

Replacing $f_x$ and $f_y$ (see Equations (2.11)), $gCg_x C$ and $gCg_y C$ (see Equations (2.12)), and $CgC$ by $g^{-1}\rho(x - y)^2$, this equation becomes (1.1) if and only if $4\rho\beta\nu = -1$. The other equation is

$$\beta(y - x)f_{xy} - \alpha^2 g_x Cg_y + \alpha^2 g_y Cg_x + \beta^2 f_x Cf_y - \beta^2 f_y Cf_x = 0.$$

Replacing $f$, $gCg_x C$ and $gCg_y C$, this equation becomes (2.13) if and only if $4\alpha^2 = \nu\beta$. □
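3. The Spectral Theory of a Boundary Value Problem of the Ernst Equation

We first discuss the linear equation (1.7).

THEOREM 3.1. Let the matrix-valued function $\gamma(x, y)$, where $-1 \le x < y \le 1$, satisfy Equation (1.7). Let $\gamma(-1, y)$ and $\gamma(x, 1)$ be given differentiable functions of $y$ and $x$ respectively. The solution of this boundary value problem is given by

$$\gamma(x, y) = \gamma(-1, 1) + \frac{1}{\pi}\int_{-1}^{x}\frac{(1 - \lambda)^{1/2}}{(x - \lambda)^{1/2}(y - \lambda)^{1/2}}\,\hat\gamma_1(\lambda)\,d\lambda - \frac{1}{\pi}\int_{y}^{1}\frac{(\lambda + 1)^{1/2}}{(\lambda - x)^{1/2}(\lambda - y)^{1/2}}\,\hat\gamma_2(\lambda)\,d\lambda, \tag{3.1}$$

where

$$\hat\gamma_1(\lambda) = \int_{-1}^{\lambda}\frac{\frac{d}{dx}\gamma(x, 1)}{(\lambda - x)^{1/2}}\,dx, \qquad \hat\gamma_2(\lambda) = \int_{\lambda}^{1}\frac{\frac{d}{dy}\gamma(-1, y)}{(y - \lambda)^{1/2}}\,dy. \tag{3.2}$$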
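Formula (3.1) can be tested on a simple explicit solution. The Python sketch below is an illustration added here, not part of the original paper; the function names, the quadrature rule and the test case are all assumptions. It uses the scalar function $\gamma(x, y) = x + y$, which satisfies (1.7) and for which (3.2) gives the closed forms $\hat\gamma_1(\lambda) = 2(\lambda + 1)^{1/2}$ and $\hat\gamma_2(\lambda) = 2(1 - \lambda)^{1/2}$. The substitutions $\lambda = x - t^2$ and $\lambda = y + t^2$ remove the inverse-square-root endpoint singularities before a midpoint rule is applied.

```python
import math


def gamma_from_boundary_data(x, y, gamma_corner, g1_hat, g2_hat, n=4000):
    """Evaluate the right-hand side of (3.1) with a midpoint rule.

    gamma_corner is gamma(-1, 1); g1_hat and g2_hat are the transforms (3.2).
    """
    t1 = math.sqrt(x + 1.0)   # upper limit after lam = x - t^2
    t2 = math.sqrt(1.0 - y)   # upper limit after lam = y + t^2
    first = 0.0
    second = 0.0
    for k in range(n):
        t = (k + 0.5) * t1 / n
        lam = x - t * t
        first += 2.0 * math.sqrt(1.0 - lam) * g1_hat(lam) / math.sqrt(y - lam)
        t = (k + 0.5) * t2 / n
        lam = y + t * t
        second += 2.0 * math.sqrt(lam + 1.0) * g2_hat(lam) / math.sqrt(lam - x)
    return gamma_corner + (first * t1 / n - second * t2 / n) / math.pi


def g1_hat(lam):
    """Transform (3.2) of gamma(x, 1) = x + 1, whose derivative is 1."""
    return 2.0 * math.sqrt(lam + 1.0)


def g2_hat(lam):
    """Transform (3.2) of gamma(-1, y) = y - 1, whose derivative is 1."""
    return 2.0 * math.sqrt(1.0 - lam)


# gamma(-1, 1) = 0 for the test solution; the result should be close to x + y.
print(gamma_from_boundary_data(-0.3, 0.5, 0.0, g1_hat, g2_hat))
```

The midpoint rule is a deliberately simple choice; a Gauss–Jacobi rule adapted to the square-root endpoint behaviour would converge faster but is not needed for a sanity check.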
Proof. Let the function $(\lambda - \alpha)^{1/2}$, $\alpha \in \mathbb{R}$, be defined with respect to a branch cut from $-\infty$ to $\alpha$ (see Figure 2). The common solution of Equations (1.12) satisfying $\Psi(-1, 1, \lambda) = 0$ possesses two different representations,

$$\Psi = \frac{1}{(\lambda - x)^{1/2}}\int_{-1}^{x}\frac{\gamma_{x'}\,dx'}{2(\lambda - x')^{1/2}} - \frac{(\lambda + 1)^{1/2}}{(\lambda - x)^{1/2}(\lambda - y)^{1/2}}\int_{y}^{1}\frac{\gamma_{y'}(-1, y')}{2(\lambda - y')^{1/2}}\,dy', \tag{3.3a}$$

$$\Psi = -\frac{1}{(\lambda - y)^{1/2}}\int_{y}^{1}\frac{\gamma_{y'}\,dy'}{2(\lambda - y')^{1/2}} + \frac{(\lambda - 1)^{1/2}}{(\lambda - x)^{1/2}(\lambda - y)^{1/2}}\int_{-1}^{x}\frac{\gamma_{x'}(x', 1)}{2(\lambda - x')^{1/2}}\,dx'. \tag{3.3b}$$

Using $-1 \le x' \le x < y \le y' \le 1$, it follows that:

(i) If $\lambda > 1$, the square roots appearing in (3.3) have no jumps, hence $\Psi$ has no jumps.

(ii) If $\lambda < -1$, all the square roots have jumps, which however cancel, and hence $\Psi$ has no jumps.

(iii) If $-1 \le \lambda \le x$, then $(\lambda - y)^{1/2}$, $(\lambda - y')^{1/2}$, $(\lambda - 1)^{1/2}$, $(\lambda - x)^{1/2}$ have jumps, $(\lambda + 1)^{1/2}$ has no jump, and $(\lambda - x')^{1/2}$ has no jump if $\lambda > x'$ but has a jump if $\lambda < x'$. Thus, if the superscripts $+$ and $-$ denote the limit of $\Psi$ as $\lambda$ approaches the real axis from above and below respectively, Equation (3.3b) implies

$$\Psi^+ - \Psi^- = \frac{(1 - \lambda)^{1/2}}{i(x - \lambda)^{1/2}(y - \lambda)^{1/2}}\int_{-1}^{\lambda}\frac{\gamma_{x'}(x', 1)}{(\lambda - x')^{1/2}}\,dx', \qquad -1 \le \lambda \le x. \tag{3.4}$$

(iv) Similarly if $y \le \lambda \le 1$, Equation (3.3a) yields

$$\Psi^+ - \Psi^- = -\frac{(\lambda + 1)^{1/2}}{i(\lambda - x)^{1/2}(\lambda - y)^{1/2}}\int_{\lambda}^{1}\frac{\gamma_{y'}(-1, y')}{(y' - \lambda)^{1/2}}\,dy', \qquad y \le \lambda \le 1. \tag{3.5}$$
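The case analysis above rests on which square roots change sign across the real axis. The short Python sketch below is an added illustration, not part of the original proof; the helper names and sample values are assumptions. It relies on the fact that `cmath.sqrt` has its branch cut along the negative real axis, so `cmath.sqrt(lam - alpha)` realizes $(\lambda - \alpha)^{1/2}$ with a cut from $-\infty$ to $\alpha$, and it approximates the limits $\lambda \pm i0$ by a small imaginary offset.

```python
import cmath

EPS = 1e-12  # small offset standing in for the limits lam + i0 and lam - i0


def root(lam, alpha):
    """(lam - alpha)^(1/2) with the branch cut running from -infinity to alpha."""
    return cmath.sqrt(lam - alpha)


def jump(func, lam):
    """Value just above the real axis minus value just below it."""
    return func(lam + 1j * EPS) - func(lam - 1j * EPS)


x, y = -0.3, 0.5  # hypothetical point of D


def ratio(lam):
    """The ratio (lam - y)^(1/2) / (lam - x)^(1/2)."""
    return root(lam, y) / root(lam, x)


for lam in (-0.8, 0.1, 0.9):
    single = abs(jump(lambda z: root(z, x), lam))
    both = abs(jump(ratio, lam))
    print(lam, single, both)
```

For $\lambda < x$ the single root $(\lambda - x)^{1/2}$ jumps while the ratio does not, since numerator and denominator change sign together; for $x < \lambda < y$ only the ratio jumps; for $\lambda > y$ neither does.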
Thus $\Psi$ is a sectionally holomorphic function of $\lambda$, with jumps only in $[-1, x]$ and $[y, 1]$, given by Equations (3.4) and (3.5) respectively. Also Equations (3.3) imply that $\Psi = O(1/\lambda)$ as $\lambda \to \infty$, $\lambda_I \ne 0$. This information defines a Riemann–Hilbert problem [26] for $\Psi$. Its unique solution is given by

$$\Psi = -\frac{1}{2\pi}\int_{-1}^{x}\frac{(1 - \lambda')^{1/2}}{(x - \lambda')^{1/2}(y - \lambda')^{1/2}}\,\hat\gamma_1(\lambda')\,\frac{d\lambda'}{\lambda' - \lambda} + \frac{1}{2\pi}\int_{y}^{1}\frac{(\lambda' + 1)^{1/2}}{(\lambda' - x)^{1/2}(\lambda' - y)^{1/2}}\,\hat\gamma_2(\lambda')\,\frac{d\lambda'}{\lambda' - \lambda}, \qquad \lambda_I \ne 0. \tag{3.6}$$

Equation (3.3) implies that

$$\Psi = \frac{1}{2\lambda}\left[\gamma(x, y) - \gamma(-1, 1)\right] + O\!\left(\frac{1}{\lambda^2}\right), \qquad \lambda_I \ne 0,\ \lambda \to \infty. \tag{3.7}$$

Using Equation (3.6) to compute the $O(1/\lambda)$ term of $\Psi$ and comparing with Equation (3.7), Equation (3.1) follows.

The rigorous justification of the above formalism involves the following steps:
(i) Given $\gamma(-1, y)$ and $\gamma(x, 1)$ in $C^1$, Equations (3.2) define $\hat\gamma_1(\lambda)$ and $\hat\gamma_2(\lambda)$ in $C^1$.
(ii) Given $\hat\gamma_1(\lambda)$ and $\hat\gamma_2(\lambda)$ in $C^1$, define $\gamma(x, y)$ by Equation (3.1). Use a direct computation to show that $\gamma(x, y)$ satisfies Equation (1.7) and the given boundary conditions. □

Derivation of Theorem 1.1. We first assume that $g(x, y)$ exists and show that $g(x, y)$ can be obtained through the solution of the RH problem (1.8). We then discuss the rigorous justification of this construction without the a priori assumption of existence.

Let $\Psi(x, y, \lambda)$ be the unique matrix-valued function defined by

$$\Psi_x = \frac12\left(1 - \frac{(\lambda - y)^{1/2}}{(\lambda - x)^{1/2}}\right) g_x g^{-1}\,\Psi, \tag{3.8a}$$

$$\Psi_y = \frac12\left(1 - \frac{(\lambda - x)^{1/2}}{(\lambda - y)^{1/2}}\right) g_y g^{-1}\,\Psi, \tag{3.8b}$$

$$\Psi(-1, 1, \lambda) = I. \tag{3.8c}$$

$\Psi$ possesses the two different integral representations

$$\Psi = I + \begin{cases} \displaystyle\int_{-1}^{x}\left(1 - \frac{(\lambda - y)^{1/2}}{(\lambda - x')^{1/2}}\right) a(x', y)\,\Psi(x', y, \lambda)\,dx' - \int_{y}^{1}\left(1 - \frac{(\lambda + 1)^{1/2}}{(\lambda - y')^{1/2}}\right) b(-1, y')\,\Psi(-1, y', \lambda)\,dy', \\[2ex] \displaystyle -\int_{y}^{1}\left(1 - \frac{(\lambda - x)^{1/2}}{(\lambda - y')^{1/2}}\right) b(x, y')\,\Psi(x, y', \lambda)\,dy' + \int_{-1}^{x}\left(1 - \frac{(\lambda - 1)^{1/2}}{(\lambda - x')^{1/2}}\right) a(x', 1)\,\Psi(x', 1, \lambda)\,dx', \end{cases} \tag{3.9}$$

where $a(x, y)$ and $b(x, y)$ are defined by

$$a(x, y) = \frac12\,g_x g^{-1}, \qquad b(x, y) = \frac12\,g_y g^{-1}. \tag{3.10}$$
Note that $\Psi(x, 1, \lambda)$ and $\Psi(-1, y, \lambda)$ satisfy

$$\Psi(x, 1, \lambda) = I + \frac12\int_{-1}^{x}\left(1 - \frac{(\lambda - 1)^{1/2}}{(\lambda - x')^{1/2}}\right)(g_x g^{-1})(x', 1)\,\Psi(x', 1, \lambda)\,dx' \tag{3.11}$$

and

$$\Psi(-1, y, \lambda) = I - \frac12\int_{y}^{1}\left(1 - \frac{(\lambda + 1)^{1/2}}{(\lambda - y')^{1/2}}\right)(g_y g^{-1})(-1, y')\,\Psi(-1, y', \lambda)\,dy'. \tag{3.12}$$

Let $\Phi(x, y, \lambda)$ be the unique matrix-valued function defined by

$$\Phi_x = \frac12\left(1 + \frac{(\lambda - y)^{1/2}}{(\lambda - x)^{1/2}}\right) g_x g^{-1}\,\Phi, \tag{3.13a}$$

$$\Phi_y = \frac12\left(1 + \frac{(\lambda - x)^{1/2}}{(\lambda - y)^{1/2}}\right) g_y g^{-1}\,\Phi, \tag{3.13b}$$

$$\Phi(-1, 1, \lambda) = I. \tag{3.13c}$$

$\Phi(x, y, \lambda)$, $\Phi(x, 1, \lambda)$ and $\Phi(-1, y, \lambda)$ satisfy equations analogous to Equations (3.9), (3.11) and (3.12).

We now compute the jumps of $\Psi$.

(i) $\lambda < -1$ or $\lambda > 1$. The integral representations (3.9) imply that $\Psi^+ = \Psi^-$. Indeed, if $\lambda > 1$ none of the square roots appearing in (3.9) has a jump; if $\lambda < -1$ all of the square roots have a jump, which however cancel.

(ii) $-1 \le \lambda \le x$. Both $(\lambda - y)^{1/2}$ and $(\lambda - x)^{1/2}$ have a jump, hence $(\lambda - y)^{1/2}/(\lambda - x)^{1/2}$ has no jump and both $\Psi^+$ and $\Psi^-$ satisfy Equations (3.8). Thus

$$\Psi^+(x, y, \lambda) = \Psi^-(x, y, \lambda)\,G_l(\lambda).$$

In order to compute the matrix $G_l(\lambda)$ we evaluate this equation at $x = \lambda$ and $y = 1$,

$$G_l(\lambda) = (\Psi^-(\lambda, 1, \lambda))^{-1}\,\Psi^+(\lambda, 1, \lambda).$$

Let $L^\pm(s, \lambda) = \lim_{z \to \lambda \pm i0}\Psi(s, 1, z)$ for $-1 \le s \le \lambda$. Equation (3.11) yields

$$L^\pm(s, \lambda) = I + \frac12\int_{-1}^{s}\left(1 \mp i\,\frac{(1 - \lambda)^{1/2}}{(\lambda - s')^{1/2}}\right)(g_x g^{-1})(s', 1)\,L^\pm(s', \lambda)\,ds' \tag{3.14}$$

for $-1 \le s \le \lambda$ and $\Psi^\pm(\lambda, 1, \lambda) = L^\pm(\lambda, \lambda)$. We have thus established (1.10) and the first half of (1.9).
(iii) $y \le \lambda \le 1$. Both $(\lambda - y)^{1/2}$ and $(\lambda - x)^{1/2}$ have no jumps, hence $(\lambda - y)^{1/2}/(\lambda - x)^{1/2}$ has no jump and both $\Psi^+$ and $\Psi^-$ satisfy Equations (3.8). Thus

$$\Psi^+(x, y, \lambda) = \Psi^-(x, y, \lambda)\,G_r(\lambda).$$

In order to compute the matrix $G_r(\lambda)$ we evaluate this equation at $y = \lambda$ and $x = -1$,

$$G_r(\lambda) = (\Psi^-(-1, \lambda, \lambda))^{-1}\,\Psi^+(-1, \lambda, \lambda).$$

Let $R^\pm(t, \lambda) = \lim_{z \to \lambda \pm i0}\Psi(-1, t, z)$ for $\lambda \le t \le 1$. Equation (3.12) yields

$$R^\pm(t, \lambda) = I - \frac12\int_{t}^{1}\left(1 \pm i\,\frac{(\lambda + 1)^{1/2}}{(t' - \lambda)^{1/2}}\right)(g_y g^{-1})(-1, t')\,R^\pm(t', \lambda)\,dt' \tag{3.15}$$

for $\lambda \le t \le 1$ and $\Psi^\pm(-1, \lambda, \lambda) = R^\pm(\lambda, \lambda)$. We have thus established (1.11) and the second half of (1.9).

(iv) $x \le \lambda \le y$. The ratio $(\lambda - y)^{1/2}/(\lambda - x)^{1/2}$ has a jump, thus $\Psi^+$ and $\Phi^-$ satisfy the same system of integrable equations. Since $\Psi^+(-1, 1, \lambda) = I = \Phi^-(-1, 1, \lambda)$, we have $\Psi^+ = \Phi^-$.

The jumps of $\Phi$ can be computed in a similar way. Also note that, for $-1 \le x \le \lambda$ or $\lambda \le y \le 1$,

$$\Psi^\pm(x, 1, \lambda) = \Phi^\mp(x, 1, \lambda), \qquad \Psi^\pm(-1, y, \lambda) = \Phi^\mp(-1, y, \lambda).$$

Equations (3.9) and the analogous equation for $\Phi$ imply Equation (1.8c).

We now discuss the rigorous justification of the above construction:

(i) Equations (1.10) and (1.11) are Volterra integral equations. Thus if $g_1$ and $g_2 \in C^2$, the jump matrices $G_l$ and $G_r$ are well-defined.

(ii) It can be shown that the RH problem (1.8) has a unique global solution. This follows from the fact that this RH problem is simply related to a RH problem satisfied by the function $\mu(x, y, w)$ defined by

$$\mu(x, y, w) = \begin{cases} \Phi(x, y, f(x, y, w)), & \left|w - \frac12\right| \le \frac12, \\[1ex] \Psi(x, y, f(x, y, w)), & \left|w - \frac12\right| > \frac12, \end{cases} \tag{3.16}$$

where $\lambda = f(x, y, w)$ is the rational function defined by

$$\frac{\lambda - y}{\lambda - x} = \left(1 - \frac{1}{w}\right)^2. \tag{3.17}$$
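The role of the circle $|w - \frac12| = \frac12$ in (3.16) can be made concrete with a small numerical experiment. The Python sketch below is an added illustration, not part of the original text. It assumes that (3.17) reads $(\lambda - y)/(\lambda - x) = (1 - 1/w)^2$; the function name and sample values are hypothetical. Solving for $\lambda$ gives $\lambda = (y - k^2 x)/(1 - k^2)$ with $k = 1 - 1/w$, and points of the circle should be mapped to real $\lambda$ in $[x, y]$, the interval on which $\Psi^+ = \Phi^-$.

```python
import cmath


def spectral_map(x, y, w):
    """Solve (lam - y)/(lam - x) = (1 - 1/w)^2 for lam (assumed form of (3.17))."""
    k2 = (1 - 1 / w) ** 2
    return (y - k2 * x) / (1 - k2)


x0, y0 = -0.3, 0.5  # hypothetical point of D
for theta in (0.0, 0.7, 2.0, -2.5):
    w = 0.5 + 0.5 * cmath.exp(1j * theta)  # point on the circle |w - 1/2| = 1/2
    lam = spectral_map(x0, y0, w)
    print(theta, lam)
```

On the circle one has $1/w = 1 - i\tan(\theta/2)$, so $k$ is purely imaginary and $k^2 \le 0$; this is why the image is real and lies between $x$ and $y$. Under the same assumption, $w = \infty$ and $w = \frac12$ are both sent to $\lambda = \infty$, consistent with the two normalizations in (1.8c).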
The function $\mu$ satisfies $\mu(x, y, \infty) = I$. Furthermore, it turns out that the RH problem for $\mu$ satisfies a vanishing lemma [27], i.e., the homogeneous RH problem has only the zero solution, provided that $g_1$ and $g_2$ satisfy a certain symmetry condition.
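(iii) Using direct differentiation it can be shown that if $\Phi$ and $\Psi$ solve the RH problem (1.8), then

$$\Psi_x = \frac12\left(1 - \frac{(\lambda - y)^{1/2}}{(\lambda - x)^{1/2}}\right)\alpha(x, y)\,\Psi, \qquad \Psi_y = \frac12\left(1 - \frac{(\lambda - x)^{1/2}}{(\lambda - y)^{1/2}}\right)\beta(x, y)\,\Psi, \tag{3.18}$$

$$\Phi_x = \frac12\left(1 + \frac{(\lambda - y)^{1/2}}{(\lambda - x)^{1/2}}\right)\alpha(x, y)\,\Phi, \qquad \Phi_y = \frac12\left(1 + \frac{(\lambda - x)^{1/2}}{(\lambda - y)^{1/2}}\right)\beta(x, y)\,\Phi, \tag{3.19}$$

where $\alpha$ and $\beta$ are some $\lambda$-independent functions. Let $\Psi_1$ and $g$ be defined by

$$\Psi(x, y, \lambda) = I + \frac{\Psi_1(x, y)}{\lambda} + o\!\left(\frac{1}{\lambda}\right) \quad \text{as } \lambda \to \infty, \qquad \Phi(x, y, \lambda) = g(x, y) + o(1) \quad \text{as } \lambda \to \infty. \tag{3.20}$$

Then

$$(\Psi_1)_x = \frac{(y - x)}{4}\,\alpha(x, y), \qquad (\Psi_1)_y = \frac{(x - y)}{4}\,\beta(x, y), \tag{3.21}$$

$$\alpha(x, y) = g_x g^{-1}, \qquad \beta(x, y) = g_y g^{-1}. \tag{3.22}$$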
The compatibility condition of Equation (3.21) implies that $g$ solves the Ernst equation.

(iv) The proof that $g(x, y)$ satisfies $g(-1, y) = g_1(y)$ and $g(x, 1) = g_2(x)$ is given in [27].

(v) Equation (1.1) is invariant under

$$g \to gA, \qquad g \to \bar g, \qquad g \to g^T, \qquad g \to g^{-1},$$

where $A$ is a nonsingular matrix. Thus without loss of generality we can assume that $g(-1, 1) = I$. Furthermore, if $g_1(y)$ and $g_2(x)$ are real, symmetric, positive definite matrices, then the solution also has the same properties. □

Appendix. The Collision of Two Plane Gravitational Waves

The spacetime manifold representing the collision of plane gravitational waves in vacuum is characterized by the presence of two spacelike, commuting and hypersurface orthogonal Killing vector fields. This allows one to write the metric as

$$ds^2 = g_{ab}\,dx^a dx^b - 2f\,du\,dv, \qquad a, b = 1, 2, \tag{A.1}$$
where the $2 \times 2$ symmetric matrix function $g := (g_{ab})$ and the scalar function $f$ depend only on the null coordinates $u$, $v$, and satisfy the constraints

$$f(u, v) > 0, \qquad \det(g(u, v)) > 0. \tag{A.2}$$

Hence, one can introduce a pair of scalar functions $\alpha$ and $\Gamma$ such that

$$\det g = \alpha^2, \qquad \alpha(u, v) > 0 \qquad \text{and} \qquad f = \alpha^{-1/2}e^{2\Gamma}. \tag{A.3}$$

Thus, the matrix $g$ can be written as

$$g = \alpha S, \qquad \det S = 1, \tag{A.4}$$

and Equation (A.1) takes the form

$$ds^2 = \alpha S_{ab}\,dx^a dx^b - 2\alpha^{-1/2}e^{2\Gamma}\,du\,dv. \tag{A.5}$$

In this form the four degrees of freedom characterizing the geometry of the space-time manifold of plane gravitational waves are expressed by the two scalar functions $\alpha$ and $\Gamma$ and the unimodular, symmetric $2 \times 2$ matrix $S$. The two degrees of freedom incorporated in the latter can be expressed by a pair of real valued functions $F$ and $\omega$. Thus, $S$ can be written as

$$S = F^{-1}\begin{pmatrix} E\bar E & \omega \\ \omega & 1 \end{pmatrix}, \qquad E := F + i\omega, \tag{A.6}$$

where $\bar E$ denotes the complex conjugate of $E$. The functions $\alpha$, $\Gamma$ and $E$ are determined by solving the Einstein field equations in the vacuum, namely the system $R_{ij}(u, v) = 0$, $i, j = 1, 2, 3, 4$, where $R_{ij}$ is the Ricci tensor corresponding to Equation (A.5). The components of this system which do not vanish identically yield

$$\alpha_{uv} = 0, \tag{A.7}$$

$$F(2\alpha E_{uv} + \alpha_u E_v + \alpha_v E_u) = 2\alpha E_u E_v, \tag{A.8}$$

$$\Gamma_u = \frac12\,\frac{\alpha_{uu}}{\alpha_u} + \frac{\alpha}{\alpha_u}\left|\frac{E_u}{2F}\right|^2, \tag{A.9a}$$

$$\Gamma_v = \frac12\,\frac{\alpha_{vv}}{\alpha_v} + \frac{\alpha}{\alpha_v}\left|\frac{E_v}{2F}\right|^2, \tag{A.9b}$$

$$\Gamma_{uv} = -\mathrm{Re}\,\frac{E_u\bar E_v}{4F^2}. \tag{A.10}$$

The fundamental components of the above system of field equations are Equations (A.7) and (A.8). This follows from the fact that Equations (A.7) and (A.8)
Figure 3. The domain W = I ∪ II ∪ III ∪ IV of the (u, v)-plane corresponding to a space-time manifold which represents the collision of a pair of plane gravitational waves. Region I represents the initially flat (gravity free) domain into which the waves propagate. The two incoming pulses of gravitational radiation are represented by region II and III, respectively. Their interaction is represented by region IV, which corresponds to region D of Figure 1.
are the integrability conditions of Equations (A.9) and (A.10). Hence, given $\alpha$ and $E$, $\Gamma$ can be found by quadrature. The matrix equation

$$(\alpha g_u g^{-1})_v + (\alpha g_v g^{-1})_u = 0, \tag{A.11}$$

called the Ernst equation, is equivalent to the system of Equations (A.7) and (A.8). In particular, taking the trace of Equation (A.11) one finds Equation (A.7).

Let us now consider the following adjacent regions of the $(u, v)$-plane (see Figure 3), where $(u_0, v_0)$ is a pair of positive numbers,

$$\begin{aligned} \mathrm{I} &= \{(u, v) \in \mathbb{R}^2 : u \le 0,\ v \le 0\}, \\ \mathrm{II} &= \{(u, v) \in \mathbb{R}^2 : u \le 0,\ 0 \le v < v_0\}, \\ \mathrm{III} &= \{(u, v) \in \mathbb{R}^2 : 0 \le u < u_0,\ v \le 0\}, \\ \mathrm{IV} &= \{(u, v) \in \mathbb{R}^2 : 0 \le u < u_0,\ 0 \le v < v_0,\ \alpha(u, v) > 0\}. \end{aligned}$$

It will be assumed that the metric coefficients are continuous in the domain $W := \mathrm{I} \cup \mathrm{II} \cup \mathrm{III} \cup \mathrm{IV}$, with $\alpha(u, v) > 0$ for all $(u, v) \in W$, and $\alpha(u, v) = 0$ for $(u, v) \in \partial W$. Moreover, the same symbols I–IV will be used in the following for the corresponding regions of space-time. For example, II denotes the set $\mathrm{II} \times \mathbb{R}^2$, where $\mathbb{R}^2$ represents the extent of the ignorable coordinates $x^1$ and $x^2$.

Region I represents a domain free of gravity into which a pair of gravitational waves impinge from the left and from the right. The latter are represented by regions II and III, respectively. Thus, in region I the line element is given by

$$ds_{\mathrm{I}}^2 = -2\,du\,dv + (dx^1)^2 + (dx^2)^2. \tag{A.12}$$
In region II the metric coefficients depend only on $v$. They are specified by a given $u$-independent solution of the field equations (A.7)–(A.10). Similarly, the metric coefficients in region III depend only on $u$, and follow from a given $v$-independent solution of the same equations. By continuity, the given solutions in regions II and III determine the initial values of the metric coefficients in region IV, i.e., their values along the null hypersurfaces $u = 0$, $0 \le v < v_0$ and $0 \le u < u_0$, $v = 0$. Thus, taking into account the earlier remarks regarding the function $\Gamma$, one can formulate the problem associated with the process of colliding plane gravitational waves as follows.

Find $(\alpha(u, v), E(u, v))$ which: (i) satisfy Equations (A.7) and (A.8) in the interior of region IV, and (ii) take preassigned values along the boundary $\partial\mathrm{IV}$ of the above region, where

$$\partial\mathrm{IV} = \{(u, v) \in \mathbb{R}^2 : u = 0,\ 0 \le v < v_0\} \cup \{(u, v) \in \mathbb{R}^2 : 0 \le u < u_0,\ v = 0\}.$$

It is assumed that the boundary data sets $\{\alpha(0, v), \alpha(u, 0)\}$ and $\{E(0, v), E(u, 0)\}$ consist of functions which belong to the differentiability classes $C^2$ and $C^1$, respectively.

Following [21], let us introduce the functions $r$, $s$ defined by

$$r(u) := 1 - 2\alpha(u, 0), \qquad 0 \le u < u_0, \tag{A.13a}$$

$$s(v) := 2\alpha(0, v) - 1, \qquad 0 \le v < v_0. \tag{A.13b}$$

Then it is easily verified that the unique solution of Equation (A.7) in region IV which satisfies the given initial conditions is given by

$$\alpha(u, v) = \frac12\left[s(v) - r(u)\right]. \tag{A.14}$$
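A small worked example may help to see how (A.13) and (A.14) fit together. The Python sketch below is an added illustration, not taken from the original text; the quadratic boundary profiles are hypothetical data chosen only so that $\alpha(0, 0) = 1$, which is the value implied by the flat metric (A.12) and continuity. It builds $r$ and $s$, evaluates (A.14), and checks the boundary values and the vanishing of the mixed derivative required by (A.7).

```python
def alpha_u0(u):
    """Hypothetical boundary data alpha(u, 0), with alpha(0, 0) = 1."""
    return 1.0 - u * u


def alpha_0v(v):
    """Hypothetical boundary data alpha(0, v), with alpha(0, 0) = 1."""
    return 1.0 - v * v


def r(u):
    """Definition (A.13a)."""
    return 1.0 - 2.0 * alpha_u0(u)


def s(v):
    """Definition (A.13b)."""
    return 2.0 * alpha_0v(v) - 1.0


def alpha(u, v):
    """Formula (A.14)."""
    return 0.5 * (s(v) - r(u))


def mixed_difference(u, v, h=1e-3):
    """Second mixed difference approximating alpha_uv, which (A.7) sets to zero."""
    return (alpha(u + h, v + h) - alpha(u + h, v - h)
            - alpha(u - h, v + h) + alpha(u - h, v - h)) / (4 * h * h)


print(alpha(0.3, 0.0), alpha_u0(0.3))   # boundary value on v = 0
print(alpha(0.0, 0.4), alpha_0v(0.4))   # boundary value on u = 0
print(mixed_difference(0.3, 0.4))       # should vanish up to rounding
```

For these profiles $\alpha(u, v) = 1 - u^2 - v^2$, $r$ is increasing and $s$ is decreasing for positive arguments, and $\alpha > 0$ is the same statement as $r(u) < s(v)$.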
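It turns out that the field equations themselves determine a set of junction conditions along the null hypersurfaces $u = 0$ and $v = 0$. Following [21] these conditions can be written in the following form:

(i)
$$\frac{dr}{du}(u) > 0 \quad \text{for } 0 < u < u_0, \tag{A.15a}$$

$$\frac{ds}{dv}(v) < 0 \quad \text{for } 0 < v < v_0, \tag{A.15b}$$

(ii)
$$\frac{dr}{du}(0) = \frac{ds}{dv}(0) = 0. \tag{A.16}$$

(iii) The following limits exist:

$$\lim_{u \to 0^+}\left[\frac{\dfrac{d^2 r}{du^2}(u) - 4L(u, 0)}{2\,\dfrac{dr}{du}(u)}\right], \qquad \text{where } L := \alpha\left|\frac{E_u}{2F}\right|^2, \tag{A.17a}$$

$$\lim_{v \to 0^+}\left[\frac{\dfrac{d^2 s}{dv^2}(v) - 4K(0, v)}{2\,\dfrac{ds}{dv}(v)}\right], \qquad \text{where } K := \alpha\left|\frac{E_v}{2F}\right|^2. \tag{A.17b}$$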
Conditions (ii) and (iii), called colliding wave conditions by Hauser and Ernst, must be satisfied in order for a solution of the associated boundary-value problem to admit the interpretation of a colliding plane gravitational wave model. Condition (i), on the other hand, allows one to introduce a new pair of null coordinates x, y by setting x = r(u),
y = s(v).
(A.18)
These equations define a one-to-one, bicontinuous mapping of region IV of the (u, v)-plane onto the triangular region D = {(x, y) ∈ R² : −1 ≤ x < y ≤ 1} of the (x, y)-plane. In the new coordinate system α = ½(y − x), and Equation (A.11) becomes
\[ \left[ \tfrac{1}{2}(y - x)\, g_x g^{-1} \right]_y + \left[ \tfrac{1}{2}(y - x)\, g_y g^{-1} \right]_x = 0. \tag{A.19} \]
Thus, the boundary-value problem reduces to solving Equation (A.19), which is equivalent to Equation (1.1), in the interior of D for specified boundary data E(−1, y) and E(x, 1). Global aspects of this problem and the singularity structure of the corresponding space-time manifolds are discussed in [28, 29].

Acknowledgements
The authors wish to thank J. B. Griffiths for valuable discussions. D.T. gratefully acknowledges the hospitality of the Department of Mathematical Science, Loughborough University of Technology. This research was supported by Grant No MAJF2 from EPSRC.

References
1. Griffiths, J. B.: Colliding Plane Waves in General Relativity, Oxford University Press, 1991.
2. Ernst, F. J.: Phys. Rev. 168 (1968), 1415.
3. Belinsky, V. A. and Zakharov, V. E.: Integration of the Einstein equations by means of the inverse scattering problem technique and construction of exact soliton solutions, Sov. Phys. JETP 48 (1978), 985–994.
4. Lax, P. D.: Integrals of nonlinear equations of evolution and solitary waves, Comm. Pure Appl. Math. 21 (1968), 467–490.
5. Neugebauer, G.: Proc. Workshop on Gravitation, Magneto-Convection and Accretion (ed. B. Schmidt, H. U. Schmidt and H. C. Thomas), MPA/P2, Max-Planck-Institut für Physik und Astrophysik, Garching, Germany 38 (1989).
6. Manojlović, N. and Spence, B.: Integrals of motion in the two-Killing-vector reduction of general relativity, Nuclear Physics B423 (1994), 243–259.
7. Rogers, C. and Shadwick, W. F.: Bäcklund Transformations and Their Applications, Academic Press, 1982.
8. Zakharov, V. E. and Shabat, F. B.: A plan for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem. I, Funct. Anal. Appl. 8 (1974), 226–235; Integration of the nonlinear equations of mathematical physics by the method of the inverse scattering problem. II, J. Funct. Anal. Appl. 13 (1979), 166–173.
9. Fokas, A. S. and Ablowitz, M. J.: Linearization of the Korteweg de Vries and Painlevé II equations, Phys. Rev. Lett. 47 (1981), 1096–1100.
10. Ablowitz, M. J. and Segur, H.: Solitons and the Inverse Scattering Transform, SIAM, 1981.
11. Newell, A. C.: Solitons in Mathematics and Physics, SIAM, 1985.
12. Fokas, A. S. and Zakharov, V. E. (eds): Important Developments in Soliton Theory, Springer-Verlag, 1993.
13. Nutku, Y. and Halil, M.: Phys. Rev. Lett. 39 (1977), 1379.
14. Chandrasekhar, S. and Xanthopoulos, B. C.: The effect of sources on horizons that may develop when plane gravitational waves collide, Proc. Roy. Soc. A 414 (1987), 1–30.
15. Ferrari, V., Ibanez, I. and Bruni, M.: Colliding gravitational waves with non-collinear polarization: a class of soliton solutions, Phys. Lett. A122 (1987), 459–462; Colliding plane gravitational waves: a class of nondiagonal soliton solutions, Phys. Rev. D 36 (1987), 1053–1064.
16. Ernst, F. J., Garcia-Diaz, A. and Hauser, I.: Colliding gravitational plane waves with noncollinear polarization. III, J. Math. Phys. 29 (1988), 681–689.
17. Tsoubelis, D. and Wang, A. Z.: Asymmetric collision of gravitational plane waves: a new class of exact solutions, Gen. Rel. Grav. 21 (1989), 807–819.
18. Hauser, I. and Ernst, F. J.: Initial value problem for colliding gravitational plane waves. III, J. Math. Phys. 31 (1990), 871–881.
19. Hauser, I. and Ernst, F. J.: Initial value problem for colliding gravitational plane waves. IV, J. Math. Phys. 32 (1991), 198–209.
20. Szekeres, P.: Colliding plane gravitational waves, J. Math. Phys. 13 (1972), 286–294.
21. Hauser, I. and Ernst, F. J.: Initial value problem for colliding gravitational plane waves. I, J. Math. Phys. 30 (1989), 872–887.
22. Hauser, I. and Ernst, F. J.: Initial value problem for colliding gravitational plane waves. II, J. Math. Phys. 30 (1989), 2322–2336.
23. Yurtsever, U.: Structure of the singularities produced by colliding plane waves, Phys. Rev. D 38 (1988), 1706–1730.
24. Fokas, A. S. and Gel'fand, I. M.: Integrability of linear and nonlinear evolution equations and the associated nonlinear Fourier transforms, Lett. Math. Phys. 32 (1994), 189–210.
25. Fokas, A. S.: A unified transform method for solving linear and certain nonlinear PDEs, Proc. R. Soc. Lond. A 453 (1997), 1411–1443.
26. Ablowitz, M. J. and Fokas, A. S.: Complex Variables with Applications, Cambridge University Press, 1997.
27. Fokas, A. S. and Sung, L.-Y.: Preprint, 1999.
28. Penrose, R.: A remarkable property of plane waves in general relativity, Rev. Modern Phys. 37 (1965), 215–220; The geometry of impulsive gravitational waves, in General Relativity: Papers in honor of J. L. Synge (ed. L. O'Raifeartaigh), Oxford University Press (1972), 101.
29. Yurtsever, U.: Colliding almost-plane gravitational waves: colliding plane waves and general properties of almost-plane-wave spacetimes, Phys. Rev. D 37 (1988), 2803–2817; Singularities in the collisions of almost-plane gravitational waves, Phys. Rev. D 38 (1988), 1731–1740; Singularities and horizons in the collisions of gravitational waves, Phys. Rev. D 40 (1989), 329–359.
Mathematical Physics, Analysis and Geometry 1: 331–365, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands.
Product Cocycles and the Approximate Transitivity
VALENTIN YA. GOLODETS and ALEXANDER M. SOKHET
Institute for Low Temperature Physics and Engineering, Academy of Science, 46 Lenin Avenue, 310164 Kharkov, Ukraine
(Received: 25 June 1997; accepted: 25 March 1998)
Abstract. Some criteria of approximate transitivity in terms of Mackey actions and product cocycles are proved. The Mackey action constructed by an amenable type II or III transformation group G and a 1-cocycle ρ × α, where ρ is the Radon–Nikodym cocycle while α is an arbitrary 1-cocycle with values in a locally compact separable group A, is approximately transitive (AT) if and only if the pair (G, (ρ, α)) is weakly equivalent to a product odometer supplied with a product cocycle. Moreover, if the given AT action arises from the outset as the range of a type II action and a nontransient cocycle, then this cocycle turns out to be cohomologous to a θ-product cocycle. An example is constructed showing that it is necessary to consider the double Mackey actions, since they cannot be reduced to the single ones.
Mathematics Subject Classifications (1991): Primary 46L55; Secondary 28D15, 28D99.
Key words: ergodic theory, approximate transitivity, product cocycle, Mackey action.
Introduction
The class of approximately transitive (AT) actions was introduced by A. Connes and E. J. Woods [3] in connection with the characterization problem for the factors which are infinite tensor products of type I factors. These actions have turned out to be very interesting from the point of view of ergodic theory. Papers [14, 9, 10, 4, 5, 12] and some others were devoted to studying these actions. The result proved by A. Connes and E. J. Woods in [3] states that a type III0 hyperfinite factor is ITPFI if and only if its flow of weights is AT. As these factors appear as Krieger factors constructed by a product odometer, their result, translated into measure-theoretic language, means that an amenable ergodic transformation group is orbit equivalent to a product odometer if and only if its associated flow is AT. A 'pure ergodic' proof of this theorem was found by T. Hamachi [12]. Thus all AT flows received an exact characterization.
The natural way to generalize this result is to obtain a characterization of AT actions of arbitrary groups, not only of R. To do that, instead of the associated flow (also called the Poincaré flow) one needs to consider the associated action (also called the Mackey action) constructed by a given action and its 1-cocycle. Hence, two natural directions of generalization arise.
First, one can consider a type II transformation group G acting on a Lebesgue space and supply it with a 1-cocycle α ∈ Z¹(Ω, G; A) with values in a group A. One can try to prove that the pair (G, α) is weakly equivalent to a pair consisting of a measure-preserving product odometer and a product cocycle if and only if the associated action is AT. This situation is referred to below as the type II case for brevity.
Second, one can consider a type III transformation group G acting on a Lebesgue space and supply it with a 1-cocycle α with values in a l.c.s. group A. Construct a Mackey action by G and by the double cocycle (α, ρ), where ρ is the Radon–Nikodym cocycle, and prove that the pair (G, α) (or, which is the same, the pair (G, (α, ρ))) is weakly equivalent to a pair consisting of a product odometer and a product cocycle if and only if this double Mackey action is AT. This situation is referred to below as the type III case for brevity.
We have to comment here that it becomes natural to consider the double Mackey actions due to paper [1]. It was shown there that, for the type II case, two pairs are stably weakly equivalent if and only if their Mackey actions are metrically isomorphic, and for the type III case, that two pairs are weakly equivalent if and only if their double Mackey actions are metrically isomorphic. The result proved there for the case of an Abelian group A was then generalized in [7] to the case of any l.c.s. group A.
In this paper, both the type II case and the type III case are studied. The main result, Theorem 4.1 for the type III case and Theorem 5.1 for the type II case, states that a pair (G, α) is weakly equivalent to a product odometer with a product cocycle if and only if the Mackey action (the double one when G is of type III) is AT. Note that an important corollary of our result is that any AT action can be constructed as an associated action to an action of any prescribed type and its product cocycle.
Section 2 contains two technical criteria for the decomposability of a given pair (G, α) into an infinite product. The first of these is valid for both type II and type III transformation groups G, while the second is a corollary of the first applicable to the type III case. They are quite similar to Propositions 6 and 7 of [12], but here we deal with a more complicated situation than Hamachi did: we study not only an action but an action and a cocycle together. Sections 3 and 4 are devoted to the type III case. In Section 3, the countable transformation group G is introduced, and some auxiliary technical lemmas are proved. Note that these Lemmas 3.2–3.8 correspond to Lemmas 11–16 of [12] with the necessary complications. Then, in Section 4, we prove our main Theorem 4.1. The main idea dates back to Hamachi's proof: it turns out to be possible to treat an arbitrary cocycle in the same way as the Radon–Nikodym cocycle was treated there. Section 5 is devoted to the type II case. The new proof of our main Theorem 5.1 is presented here for the first time. Of course, historically the type II case was studied earlier than the type III case: this was done in [4] for the case of a discrete
group A, and in [5] and [9] for the general case. The short proof presented below replaces all the intricate technical considerations of these papers. (Here a curious situation arises. Though the type III case seems to be 'the general case', and the type II case seems to be the particular case in which the measure is invariant, the 'general' proof is nevertheless not valid for the 'particular' case of type II. For example, the main approximation Lemma 3.2 is invalid for the type II case. This happens because it is impossible to use partial transformations in the type II case. It is well known that any two measurable sets are equivalent for any given ergodic action of type III, but for type II transformations this statement is false, and hence we must present a separate proof for the type II case.)
Section 5 also contains a statement that is valid for the type II case only. Suppose that the given AT action was represented as a cocycle range from the very beginning. The main theorem implies that this cocycle is weakly equivalent to a product cocycle, but for the type II case we may sharpen this result and prove that it is not only weakly equivalent, but even cohomologous to a θ-product cocycle (Theorem 5.7).
Finally, in Section 6, we compare the double Mackey action constructed by (α, ρ) with the two single ones constructed by α and ρ, respectively. It is easy to see that when the double Mackey action is AT, the two single ones are also AT. But the converse statement is false, and we construct an appropriate example.
All our considerations are carried out in a more general setting than that of cocycles with values in Abelian groups, while we always keep in mind the Abelian case as the simplest and most natural one. The requirements for the group A where the cocycles take their values are formulated in Section 1.4. The results of this paper are included in the Ph.D. thesis of the second author [21].
There one can find a more detailed comparison of the Abelian case and the non-Abelian one.

1. Notation and Definitions

1.1. APPROXIMATE TRANSITIVITY
The following notion was introduced by A. Connes and E. J. Woods in [3]:
DEFINITION 1.1. An action of a group G on a Lebesgue space (Ω, B, µ) is called approximately transitive (AT) if for any ε > 0 and an arbitrary finite family $f_1, f_2, \dots, f_n \in L^1_+(\Omega, \mu)$ there exist a single function $f \in L^1_+(\Omega, \mu)$, elements $g_j \in G$, and coefficients $\lambda_{ij} > 0$ (here $i = 1, \dots, n$; $j = 1, \dots, N_i$) satisfying the inequality
\[ \Bigl\| f_i - \sum_{j=1}^{N_i} \lambda_{ij} \cdot f(g_j \omega) \cdot \frac{d\mu \circ g_j}{d\mu} \Bigr\|_1 < \varepsilon. \]
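As a sanity check on Definition 1.1, the following minimal sketch treats the simplest situation, a transitive action of a cyclic group on a finite set carrying a non-invariant measure, where the approximation can be made exact (error 0 for every ε). The set size `n`, the weights `w` and the helper names are assumptions chosen for illustration and do not come from the paper.

```python
from fractions import Fraction

# Toy space: Omega = Z/n with a strictly positive, non-invariant probability mu.
n = 5
w = [Fraction(k + 1, 15) for k in range(n)]   # mu({k}); the weights sum to 1


def transfer(f, j):
    """omega -> f(g_j omega) * (d mu∘g_j / d mu)(omega), where g_j omega = omega + j mod n."""
    return [f[(k + j) % n] * w[(k + j) % n] / w[k] for k in range(n)]


def l1_norm(h):
    """L^1(mu)-norm of a function on Omega."""
    return sum(abs(h[k]) * w[k] for k in range(n))


def approximate(fs):
    """Return a single f and the L^1 errors of the combinations approximating each f_i."""
    f = [Fraction(0)] * n
    f[0] = 1 / w[0]                            # normalised indicator of the point 0
    errors = []
    for fi in fs:
        approx = [Fraction(0)] * n
        for j in range(n):
            lam = fi[(-j) % n] * w[(-j) % n]   # coefficient lambda_{ij} >= 0
            approx = [a + lam * t for a, t in zip(approx, transfer(f, j))]
        errors.append(l1_norm([a - b for a, b in zip(fi, approx)]))
    return f, errors
```

Each translate `transfer(f, j)` is a point mass at the point −j, so every nonnegative function is an exact nonnegative combination of translates of the single function `f`. Genuinely AT (non-transitive) actions only admit such representations up to ε.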
There are a lot of reformulations of this definition, and the reader can find them in [3]. As the Radon–Nikodym derivative provides a one-to-one correspondence
between functions in $L^1_+(\Omega, \mu)$ and finite measures absolutely continuous with respect to µ, one can easily translate this definition into the language of approximation of a given finite family of measures by a single measure; in fact, this was the initial Connes–Woods formulation, but it will not be used below. Connes and Woods proved in [3], in particular, that the AT property implies ergodicity, and that a measure-preserving AT transformation has zero entropy. The main Connes–Woods result states that a type III hyperfinite factor is an infinite tensor product of type I factors if and only if its flow of weights is AT, and in this paper we intend to present a generalization of this result.
In the proof of Theorem 4.1 below we will use the following special reformulation of the definition of the AT property. Let A be a l.c.s. group, and consider a nonsingular joint action $(W_a, F_r)$ of the product group A × R, where a ∈ A, r ∈ R.
PROPOSITION 1.1. A nonsingular A × R-action $(W_a, F_r)$ is AT if for any ε > 0 and for any given finite family $f_1, f_2, \dots, f_n \in L^1_+(\Omega, \mu)$ there exist a function $f \in L^1_+(\Omega, \mu)$ and a finite collection of $r(i, j) \in \mathbb{R}$ and $a(i, j) \in A$ (here $1 \le j \le L_i$) such that
\[ \Bigl\| f_i - \sum_{j=1}^{L_i} \exp(-r(i, j)) \cdot f(F_{r(i,j)} W_{a(i,j)} \omega) \cdot \frac{d\mu \circ F_{r(i,j)} W_{a(i,j)}}{d\mu} \Bigr\|_1 < \varepsilon. \]
[...]
ε > 0 and any finite collection of partial transformations $g_1, g_2, \dots, g_n \in [G]^m_*$ there exists a single tower ζ satisfying the following two properties:
(a) $\mathrm{Dom}\, g_i, \mathrm{Im}\, g_i \in_{m,\varepsilon} \mathcal{B}(\zeta)$. Here $\mathcal{B}(\zeta)$ is the sub-sigma-algebra generated by all levels of ζ, and $\in_{m,\varepsilon}$ means that the set on the left-hand side can be approximated by a set from the right-hand side up to ε in the sense of the measure m of their symmetric difference.
(b) $m(\omega \in \mathrm{Dom}\, g_i : g_i \omega \in \mathrm{Orb}_\zeta(\omega)) > (1 - \varepsilon)\, m(\mathrm{Dom}\, g_i)$, where $1 \le i \le n$.
Following T. Hamachi [12], instead of the two words 'approximately finite' we shall use below the single word 'amenable'. (The reader can find the definition of
amenability in [22], for example, but it will not be used below. A. Connes, J. Feldman and B. Weiss proved in [2] that amenability is equivalent to approximate finiteness for free l.c.s. group actions.)

1.4. PRODUCT ODOMETER, PRODUCT COCYCLES, AND THE REQUIREMENTS FOR GROUP A
Let $\Omega_n$, n ∈ N, be a finite set $\{0, 1, \dots, R_n - 1\} \subset \mathbb{N}$. Let $m_n$ be a probability measure on $\Omega_n$ such that $m_n(k) > 0$, $0 \le k \le R_n - 1$. Take the infinite product space $\Omega_{pr} = \prod_{n=1}^{\infty} \Omega_n$ with the product measure $m_{pr} = \prod_{n=1}^{\infty} m_n$. The permutation $\lambda_n$ acts on $\Omega_n$ by $\lambda_n(k) = k + 1 \bmod R_n$. These $\lambda_n$ generate a free countable transformation group $G_{pr}$ on the space $\Omega_{pr}$, and this group $G_{pr}$ is called the product odometer.
To define a product cocycle, let us start from the Abelian case.
DEFINITION 1.5. Let $\alpha_{pr}$ be a 1-cocycle $\Omega_{pr} \times G_{pr} \to A$ with values in an Abelian group A such that $\alpha_{pr}(\omega, \lambda_n^l)$ ($0 \le l < R_n$) depends only on the nth coordinate of the point $\omega = (\omega_1, \omega_2, \dots, \omega_n, \dots) \in \Omega_{pr}$. A cocycle $\alpha_{pr}$ having such a form is usually called a product cocycle.
And now the general (non-Abelian) case. Suppose that A is an arbitrary l.c.s. group, not necessarily Abelian. In this case, a natural analogue of a product cocycle can be defined in the following way. Let $\Omega_{pr}$, $G_{pr}$, $\lambda_n$, etc., be as above.
DEFINITION 1.6. A cocycle $\alpha_{pr} \in Z^1(\Omega_{pr}, G_{pr}; A)$ is called a product cocycle if $\alpha_{pr}(\omega, \lambda_n^l)$ ($0 \le l < R_n$) depends only on the 1st, 2nd, ..., nth coordinates of the point $\omega = (\omega_1, \omega_2, \dots, \omega_n, \dots) \in \Omega_{pr}$.
This definition can be easily reformulated in the following form: α: Ω × G → A is a product cocycle if it possesses the following six properties:
(1) for each j ∈ N, there exists a partition $\{E^j_k, 0 \le k < p_j\}$ of Ω, that is, $E^j_{k_1} \cap E^j_{k_2} = \emptyset$ for $k_1 \ne k_2$, and $\bigcup_{k=0}^{p_j - 1} E^j_k = \Omega$;
(2) and there exists a type I transformation $T_j$ that permutes the sets $E^j_k$: $T_j^l \cdot E^j_k = E^j_{k+l \,(\mathrm{mod}\ p_j)}$, $T_j^{p_j} = \mathrm{id}$, so that
(3) $T_j^l \cdot E^i_k = E^i_k$ for all l, whenever $i \ne j$, and
(4) the group generated by $\{T_j\}_{j=1}^{\infty}$ coincides with [G];
(5) $\frac{d\mu \circ T_j}{d\mu}$ is equal to a constant on each $E^j_k$;
(6) and $\alpha(\omega, T_j)$ is a map from Ω to A measurable with respect to the σ-algebra generated by $\{E^j_k : k = 0, \dots, p_j - 1;\ j = 1, \dots, l\}$.
One easily sees that conditions (1)–(5) define the same object as the product odometer, while condition (6) requires that the cocycles have constant passage values on certain towers.
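The product odometer and an Abelian product cocycle can be sketched on a finite truncation of the product space. The snippet below is only an illustration under stated assumptions: the factor sizes `R`, the weights `m`, the functions `h` and all helper names are hypothetical, and the cocycle is built in coboundary form in each coordinate, which is one convenient way to guarantee both the cocycle identity and compatibility with the relation that the $R_n$-th power of $\lambda_n$ is the identity.

```python
from fractions import Fraction

# Hypothetical finite truncation of the product space: three coordinates.
R = [2, 3, 2]                        # sizes R_n of the factors Omega_n
m = [                                # strictly positive probability weights m_n(k)
    [Fraction(1, 3), Fraction(2, 3)],
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 5), Fraction(4, 5)],
]
h = [[0, 5], [1, -2, 7], [3, 3]]     # hypothetical integer-valued functions h_n on Omega_n


def rotate(omega, n, l=1):
    """Apply lambda_n^l: shift the n-th coordinate by l modulo R_n."""
    out = list(omega)
    out[n] = (out[n] + l) % R[n]
    return tuple(out)


def weight(omega):
    """Product measure of the cylinder set fixed by the listed coordinates."""
    total = Fraction(1)
    for n, k in enumerate(omega):
        total *= m[n][k]
    return total


def rn_derivative(omega, n, l=1):
    """Radon-Nikodym derivative of lambda_n^l at omega; only omega[n] matters."""
    return weight(rotate(omega, n, l)) / weight(omega)


def alpha(omega, n, l=1):
    """Z-valued product cocycle alpha(omega, lambda_n^l) depending on omega[n] only."""
    return h[n][(omega[n] + l) % R[n]] - h[n][omega[n]]
```

In this sketch both `rn_derivative` and `alpha` ignore every coordinate except the n-th, which is the defining feature in Definition 1.5; the Radon–Nikodym cocycle is itself of the same coboundary form with $h_n = \log m_n$.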
A l.c.s. group A where our cocycles take their values will be assumed everywhere below to satisfy the following requirements:
(1) A is an amenable l.c.s. group;
(2) A contains a countable amenable dense subgroup B (this property holds, for example, for any solvable Lie group); and
(3) the given cocycle α ∈ Z¹(Ω, G; A) is such that log Δ(α(ω, g)) is a coboundary, where Δ stands for the modular function of A (this property holds trivially when A is unimodular).
DEFINITION 1.7. When properties (1)–(3) are satisfied, we shall say that the group A is admissible.
Here is a simple and natural generalization of the notion of a product cocycle that is convenient for the cocycle classification problem.
DEFINITION 1.8. Let (Ω, B, m) be a Lebesgue space, and G an ergodic free countable transformation group acting on this space. Let α: Ω × G → A be a 1-cocycle. If there exists a measure-preserving orbit equivalence mapping $\theta: (\Omega, m) \to (\Omega_{pr}, m_{pr})$, $m \circ \theta = m_{pr}$, such that $\theta [G] \theta^{-1} = [G_{pr}]$ and
\[ \alpha(\theta^{-1}\omega, g) = \alpha_{pr}(\omega, \theta g \theta^{-1}), \quad \text{where } g \in [G], \]
then the cocycle α will be called below a θ-product cocycle. In fact, it is equivalent to a product cocycle $\alpha_{pr}$ with respect to certain equivalence relations.

2. Two Auxiliary Criteria of the Product Property

2.1. FIRST DECOMPOSITION CRITERION
PROPOSITION 2.1. Let a countable amenable group G act by ergodic transformations on a Lebesgue space (Ω, B, m). Suppose this action to be supplied with a cocycle α taking values in an amenable group A. The pair (G, α) is weakly equivalent to a pair consisting of a product odometer and a product cocycle if and only if for any ε, θ, σ > 0 and any partial transformations $g_1, g_2, \dots, g_n \in [G]^m_*$ there exist a finite measure P ∼ m, a cocycle β cohomologous to α, a function f intertwining β with α and a simple tower ζ with constant (β, P)-passage values such that:
\[ \mathrm{Dom}\, g_i,\ \mathrm{Im}\, g_i \in_{m,\varepsilon} \mathcal{B}(\zeta); \]
\[ m\bigl(\omega \in \mathrm{Dom}\, g_i \cap \mathrm{supp}(\zeta) : g_i\omega \in \mathrm{Orb}_\zeta(\omega)\bigr) > (1 - \varepsilon) \cdot m(\mathrm{Dom}\, g_i); \]
\[ \| P - m \|_{\mathrm{supp}(\zeta) \cap E} \overset{\mathrm{def}}{=} \int_{\mathrm{supp}(\zeta) \cap E} \Bigl| 1 - \frac{dP}{dm} \Bigr| \, dm < \varepsilon \]
(here $E = \bigcup_i (\mathrm{Dom}\, g_i \cup \mathrm{Im}\, g_i)$); and
\[ m\bigl(\omega \in \mathrm{supp}(\zeta) \cap E : \mathrm{dist}(e_A, f(\omega)) > \sigma\bigr) < \theta \cdot m(\mathrm{supp}(\zeta) \cap E). \]
REMARK. We may slightly modify the statement of this theorem by replacing the last inequality with an estimate like the following one:
\[ m\bigl(\omega \in \mathrm{supp}(\zeta) \cap E : f(\omega) \ne e_A\bigr) < \theta \cdot m(\mathrm{supp}(\zeta) \cap E). \]
Proof. With the aid of a standard routine calculation, it is easy to see that this condition is necessary. We shall only prove the nontrivial part of this statement, i.e., suppose that the condition written above is valid and prove that (G, α) really is weakly equivalent to a product odometer supplied with a product cocycle.
We may suppose that m(Ω) < ∞. Since G is amenable, there exists a nonsingular transformation T such that [T] = [G]. Take three sequences of positive numbers, denoted by $(\varepsilon_k)$, $(\sigma_k)$, and $(\theta_k)$, such that $\sum_{n=1}^{\infty} \varepsilon_n < m(\Omega)$, and $\sum_{n=1}^{\infty} \sigma_n$ and $\sum_{n=1}^{\infty} \theta_n$ converge. Form a sequence of sets $(A_n) \subset \mathcal{B}$ which is dense in $\mathcal{B}$ and such that each set appears in it infinitely often.
We shall prove by induction the existence of a sequence $(E_k)$ of measurable sets, where $E_1 = \Omega$, $E_{k+1} \subset E_k$, and a sequence $(Q_k)$ of measures on $E_k$, where $Q_k \sim m$, and a sequence $(\zeta_k)$ of simple towers, and a sequence $(\gamma_k)$ of cocycles cohomologous to α, satisfying the following conditions:
(a) $\zeta_k$ has constant $(Q_k, \gamma_k)$-passage values;
(b) $\mathrm{supp}(\zeta_k) = E_k$;
(c) the set $E_k$ is $\zeta_{k-1}$-invariant, and the tower $\zeta_k$ is a refinement of the tower $\zeta_{k-1}$ in the sense that $\zeta_k = \zeta_{k-1}|_{E_k} \otimes \eta_k$;
(d) $\frac{dQ_{k-1} \circ e_{r,s}}{dQ_{k-1}}(\omega) = \frac{dQ_k \circ e_{rt,st}}{dQ_k}(\omega)$, where $\omega \in e_s \cap E_k$, $e_{r,s} \in \zeta_{k-1}$, $e_{rt,st} \in \zeta_k$, and $\gamma_{k-1}(\omega, e_{r,s}) = \gamma_k(\omega, e_{rt,st})$;
(e) $m(E_{k-1} \setminus E_k) < \varepsilon_k$;
(f) $\exp(-\varepsilon_k) < \frac{dQ_k}{dQ_{k-1}}(\omega) < \exp(\varepsilon_k)$, where $\omega \in E_k$, and there exists a function $f_k$ intertwining $\gamma_{k-1}$ with $\gamma_k$ such that $m(\omega \in E_k : \mathrm{dist}(e_A, f_k(\omega)) > \sigma_k) < \theta_k \cdot m(E_k)$;
(g) $A_i \cap E_k \in_{m,\varepsilon_k} \mathcal{B}(\zeta_k)$, $1 \le i \le k$;
(h) $m(\omega \in E_k : T_{E_k}\omega \in \mathrm{Orb}_{\zeta_k}(\omega)) > (1 - \varepsilon_k) \cdot m(E_k)$.
Here $T_E$ means the induced transformation.
We set: $A_1 = \Omega$, $\zeta_1$ is trivial, $Q_1 = m$, $\gamma_1 = \alpha$. Now suppose that the sets $E_1 \supset E_2 \supset \dots \supset E_n$, the measures $Q_1, Q_2, \dots, Q_n$, the towers $\zeta_1, \zeta_2, \dots, \zeta_n$ and the cocycles $\gamma_1, \gamma_2, \dots, \gamma_n$ have already been constructed, while according to the construction
\[ \zeta_i = \eta_1 \otimes \eta_2 \otimes \dots \otimes \eta_i = \Bigl\{ e_{r_1 r_2 \cdots r_i,\, s_1 s_2 \cdots s_i} : r_1 r_2 \cdots r_i,\ s_1 s_2 \cdots s_i \in \prod_{j=1}^{i} \Lambda_j \Bigr\}. \]
Let us fix $r = r_1 r_2 \cdots r_n \in \prod_{j=1}^{n} \Lambda_j$. Let $\omega \in E_n$, and let $s = s_1 s_2 \cdots s_n$ and $t = t_1 t_2 \cdots t_n \in \prod_{j=1}^{n} \Lambda_j$ be such that $\omega \in e_t$ and $T_{E_n}\omega \in e_s$. Then it is possible to find $j = j(\omega) \in \mathbb{Z}$ such that $T_{E_n}\omega = e_{s,r} \cdot T_{e_r}^{j} \cdot e_{r,t}\,\omega$. Now we apply the assumption of this theorem. For any θ′, σ′, ε′ and ε″ > 0, and any N ∈ N we apply it to the partial transformations $T_{e_r}^{j}$, $-N \le j \le N$, and $\mathrm{id}|_{e_{r,s}(e_s \cap A_k)}$. This allows us to obtain a measure Q ∼ m, a cocycle γ ∼ α, and a simple tower $\eta_{n+1}$, $\mathrm{supp}(\eta_{n+1}) \subset e_r$, with constant (Q, γ)-passage values, such that
\[ e_{r,s}(e_s \cap A_k) \in_{m,\varepsilon'} \mathcal{B}(\eta_{n+1}), \quad 1 \le k \le n + 1,\ s \in \prod_{j=1}^{n} \Lambda_j; \]
\[ m\bigl(\omega : T_{e_r}^{j}\omega \in \mathrm{Orb}_{\eta_{n+1}}(\omega)\bigr) > (1 - \varepsilon'') \cdot m(e_r), \quad -N \le j \le N; \]
\[ m\bigl(e_r \setminus \mathrm{supp}(\eta_{n+1})\bigr) < \varepsilon'; \]
\[ \exp(-\varepsilon') < \frac{dQ}{dQ_n}(\omega) < \exp(\varepsilon'), \quad \omega \in \mathrm{supp}(\eta_{n+1}). \]
Besides, there exists a function $\varphi_n$ intertwining $\gamma_n$ with γ and satisfying the condition
\[ m\bigl(\omega \in \mathrm{supp}(\eta_{n+1}) : \mathrm{dist}(e_A, \varphi_n(\omega)) > \sigma'\bigr) < \theta' \cdot m(\mathrm{supp}(\eta_{n+1})). \]
Now we put $E_{n+1} = \mathrm{Orb}_{\zeta_n}(\mathrm{supp}(\eta_{n+1}))$, and construct the product tower $\zeta_{n+1} = \zeta_n|_{E_{n+1}} \otimes \eta_{n+1}$. The finite measure $Q_{n+1}$ on $E_{n+1}$ will be defined as $Q_q$, where $q = \bigl( \frac{dQ_n \circ e_{s,r}}{dQ_n} : s \in \prod_{j=1}^{n} \Lambda_j \bigr)$. The cocycle $\gamma_{n+1}$ will be defined as follows: set the function $f_{n+1}$ which intertwines it with $\gamma_n$ to be equal to $e_A$ outside of $\mathrm{supp}(\zeta_n)$, and for any $\omega \in e_r$, $s \in \Lambda_n$ set $f_{n+1}(e_{s,r}\omega) = f_n^{-1}(\omega, e_{s,r}) \cdot \varphi_n(\omega) \cdot f_n(\omega, e_{s,r})$.
Now we have to verify that $E_{n+1}$, $Q_{n+1}$, $\zeta_{n+1}$, $\gamma_{n+1}$ satisfy the conditions (a)–(h). This can be done rather straightforwardly, when $\varepsilon' \le \varepsilon_{n+1}$, $\sigma' \le \sigma_{n+1}$, ε″ is sufficiently small and N is sufficiently large. For example, let us check the second part of condition (f). We see that the values taken by $f_{n+1}$ on each $e_s$ reproduce the values taken by $\varphi_n$ on the fixed level $e_r$. Since $\zeta_{n+1}$ has constant $Q_{n+1}$-passage values, we obtain:
\[ Q_{n+1}\bigl(\omega \in E_{n+1} : \mathrm{dist}(e_A, f_{n+1}(\omega)) > \sigma'\bigr) = \mathrm{card}(\zeta_n) \cdot Q_{n+1}\bigl(\omega \in \mathrm{supp}(\eta_{n+1}) : \mathrm{dist}(e_A, \varphi_n(\omega)) > \sigma'\bigr). \]
Hence,
\[ m\bigl(\omega \in E_{n+1} : \mathrm{dist}(e_A, f_{n+1}(\omega)) > \sigma'\bigr) < \mathrm{card}(\zeta_n) \cdot \mathrm{Const}(Q_{n+1}) \cdot \theta' \cdot m(\mathrm{supp}(\eta_{n+1})). \]
Since the choice of θ′ is at our disposal, the condition being checked holds.
We see that $E = \bigcap_{k=1}^{\infty} E_k$ has positive measure. It is easy to show that, according to the Poincaré return lemma, for almost every ω ∈ E and for all but a finite number of k ∈ N we have $T_E\omega = T_{E_k}\omega$. Applying the Borel–Cantelli lemma to the sets where condition (h) is false, we obtain that for almost every ω ∈ E and for all but a finite number of k ∈ N, $T_E\omega = T_{E_k}\omega \in \mathrm{Orb}_{\zeta_k}(\omega)$. Following the first part of condition (f) we may define a function $F: E \to \mathbb{R}_+$ by
\[ F(\omega) = \prod_{k=2}^{\infty} \frac{dQ_k}{dQ_{k-1}}(\omega), \]
and a new measure µ ∼ m on E by
\[ d\mu(\omega) = \frac{F(\omega)\, dm(\omega)}{\int_E F(\omega)\, dm}. \]
Similarly, by the second part of condition (f) we may define a function Φ: E → A by
\[ \Phi(\omega) = \prod_{k=1}^{\infty} f_k(\omega). \]
This product converges in measure m (while the product defining F converges almost everywhere). There exists a subsequence of partial products converging almost everywhere and giving a pointwise definition of Φ(ω). This function allows us to construct a new cocycle β which is cohomologous to α and is intertwined with it by Φ. And now it is easy to check that the transformation $T_E$ acting on the space (E, µ) and supplied with the cocycle β satisfies conditions (1)–(6) of the reformulation of Definition 1.6; hence, β is a product cocycle. □
REMARK. By the same proof, the pair (G, ρ) (where ρ is the Radon–Nikodym cocycle of the measure m) turns out to be weakly equivalent to the pair consisting of the constructed product odometer and the (product) Radon–Nikodym cocycle of the product measure $\prod_{n=1}^{\infty} \nu_n$.

2.2. SECOND DECOMPOSITION CRITERION
PROPOSITION 2.2. Let G be a type III countable ergodic amenable group of nonsingular transformations on (Ω, B, m), supplied with a cocycle α with values in an amenable group A. The given pair (G, α) is weakly equivalent to the pair consisting of a product odometer and a product cocycle, if the following condition is valid:
for any finite measure P equivalent to m, for any cocycle β equivalent to α, and for any multiple tower $\sum_{i=1}^{n} \zeta_i$ with constant (P, β)-passage values, for any ε > 0 and σ > 0 there exist a finite measure Q ∼ m, a cocycle γ ∼ α, a simple tower ξ with constant (Q, γ)-passage values being a refinement of the given multiple tower, and a function f, $\gamma \overset{f}{\sim} \beta$, so that
\[ \| P - Q \|_{\bigcup_{i=1}^{n} \mathrm{supp}(\zeta_i)} \overset{\mathrm{def}}{=} \int_{\bigcup_{i=1}^{n} \mathrm{supp}(\zeta_i)} \Bigl| 1 - \frac{dP}{dQ} \Bigr| \, dQ < \varepsilon, \]
\[ m\Bigl(\omega \in \bigcup_{i=1}^{n} \mathrm{supp}(\zeta_i) : \mathrm{dist}(e_A, f(\omega)) > \sigma\Bigr) < \varepsilon \cdot m\Bigl(\bigcup_{i=1}^{n} \mathrm{supp}(\zeta_i)\Bigr). \]
REMARK. The condition formulated here is not only sufficient but also necessary. This will be shown later (see Corollary 4.3).
Proof. One must check the conditions of the previous criterion. Let ε > 0, $g_1, \dots, g_n \in [G]^m_*$. Since G is amenable, there exists a single tower $\zeta = \{e_{r,s} : r, s \in \Lambda\}$ such that ($1 \le i \le n$)
\[ \mathrm{Dom}\, g_i,\ \mathrm{Im}\, g_i \in_{m,\varepsilon} \mathcal{B}(\zeta), \]
\[ m\bigl(\omega \in \mathrm{Dom}\, g_i \cap \mathrm{supp}(\zeta) : g_i\omega \in \mathrm{Orb}_\zeta(\omega)\bigr) > (1 - \varepsilon) \cdot m(\mathrm{Dom}\, g_i). \]
Take an arbitrary level $e_r \in \zeta$ and divide it into a finite number of disjoint sets $A_j$, $0 \le j \le N$, such that for almost every $\omega \in A_j \subset e_r$, $1 \le j \le N$, and any s ∈ Λ,
\[ c_{s,j} \exp(-\varepsilon) < \frac{dm \circ e_{s,r}}{dm}(\omega) < c_{s,j} \exp(\varepsilon); \]
\[ \mathrm{dist}(a_{s,j}, \alpha(\omega, e_{s,r})) < \varepsilon, \qquad m(\mathrm{Orb}_\zeta(A_0)) < \varepsilon. \]
[...]
ε is contained in $\mathrm{Orb}_\zeta(A_0)$ and has a small measure. Now we apply the given condition to our criterion and see that there exist a measure Q equivalent to m and P, a cocycle γ cohomologous to α and β, and a single tower ξ refining $\sum_{i=1}^{N} \zeta_i$ with constant (Q, γ)-passage values satisfying the estimates written above. This implies that the condition of the previous criterion is true. □
REMARK. See the remark after the proof of Criterion 2.1.

3. Type III Case – Auxiliary Results

3.1. THE DOUBLE MACKEY ACTION
Let our amenable countable group G act freely on a Lebesgue space (Ω, B, m), and assume this action to be supplied with the Radon–Nikodym cocycle ρ and with one more cocycle α with values in an admissible group A. The pair (ρ, α) will be considered as a double cocycle. Define the product space Ω × A × R with the following measure:
\[ d\nu(\omega, a, u) = dm(\omega) \cdot da \cdot \exp(u)\, du; \]
here ω ∈ Ω, a ∈ A, u ∈ R. The natural projection maps from the product space onto Ω, A and R will be denoted by $\pi_\Omega$, $\pi_A$ and $\pi_{\mathbb{R}}$, respectively. The triple (ω, a, u) will sometimes be denoted by z. Introduce the skew action of the group G on this space:
\[ \tilde{g}(\omega, a, u) = \Bigl( g\omega,\ a \cdot \alpha(\omega, g),\ u + \log \frac{dm \circ g}{dm}(\omega) \Bigr). \]
Also, consider the following actions of A and R on the product space:
\[ T_t(\omega, a, u) = (\omega, a, u + t), \quad t \in \mathbb{R}, \]
\[ V_b(\omega, a, u) = (\omega, ba, u), \quad b \in A. \]
We see that $\tilde{g}$ and $V_b$ preserve the measure ν, while $T_t$ does not. These three actions commute.
Let $\tilde{G} = \{\tilde{g} : g \in G\}$. Consider the quotient space X of Ω × A × R by the σ-algebra of all $\tilde{G}$-invariant sets. Let π be the natural projection from Ω × A × R onto X. Take an arbitrary σ-finite measure µ on X which is equivalent to $\nu_0 \circ \pi^{-1}$, where $\nu_0$ is any finite measure equivalent to ν. Then we can write the following decomposition:
\[ \int_{\Omega \times A \times \mathbb{R}} k(\omega, a, u)\, d\nu(\omega, a, u) = \int_X d\mu(x) \cdot \int_{\pi(\omega, a, u) = x} k(\omega, a, u)\, d\nu(\omega, a, u \mid x) \]
for any $k \in L^1(\Omega \times A \times \mathbb{R}; \nu)$, where $d\nu(\omega, a, u \mid x)$ denote the uniquely defined σ-finite nonatomic $\tilde{G}$-invariant measures, satisfying
\[ \nu(\{(\omega, a, u) \in \Omega \times A \times \mathbb{R} : \pi(\omega, a, u) \ne x\} \mid x) = 0 \]
for almost every x ∈ X.
DEFINITION 3.1. Consider the quotient actions $F_t(\pi(\omega, a, u)) = \pi(T_t(\omega, a, u))$ and $W_b(\pi(\omega, a, u)) = \pi(V_b(\omega, a, u))$ on X of R and A, respectively. The joint action $(F_t, W_b)$ will be called the double Mackey action.
DEFINITION 3.2. Let Δ be a countable dense subgroup of R, and B a countable dense subgroup of A. Define the following countable nonsingular transformation group $\mathcal{G}$ on $(\Omega \times A \times \mathbb{R}, \mathcal{B} \times \mathcal{B}(A) \times \mathcal{B}(\mathbb{R}), \nu)$:
\[ \mathcal{G} = \{\tilde{g} \cdot T_\delta \cdot V_b : g \in G,\ \delta \in \Delta,\ b \in B\}. \]
It is easy to see that if G is amenable, countable and of type III, then $\mathcal{G}$ is orbit equivalent with G. This is not sufficient for our purposes, and we shall now prove the following:
PROPOSITION 3.1. The pair (G, (ρ, α)) is weakly equivalent to the pair $(\mathcal{G}, (\rho_1, \alpha_1))$, where $\rho_1$ and $\alpha_1$ are the following cocycles:
\[ \rho_1(\omega, a, u; \tilde{g}, V_b, T_t) = -t; \qquad \alpha_1(\omega, a, u; \tilde{g}, V_b, T_t) = b^{-1}. \]
Recall that (ρ, α) ∈ Z¹(Ω, G; R × A). Due to the latter definition, $\rho_1 \in Z^1(\Omega \times A \times \mathbb{R}, \tilde{G} \times B \times \Delta; \mathbb{R})$ and $\alpha_1 \in Z^1(\Omega \times A \times \mathbb{R}, \tilde{G} \times B \times \Delta; A)$.
Proof. Note that in the Abelian case the cocycle $\rho_1 \in Z^1(\Omega \times A \times \mathbb{R}, \tilde{G} \times B \times \Delta; \mathbb{R})$ defined to be equal to (−t) is exactly the Radon–Nikodym cocycle of the joint action $(\tilde{g}, T_t, V_b)$. This allowed us to apply the weak equivalence theorem proved in [1] to the Abelian group case, and, hence, to prove our statement, we only have to check that the Mackey action constructed by the pair (G, (ρ, α)) is isomorphic to the Mackey action constructed by the pair $(\mathcal{G}, (\rho_1, \alpha_1))$. The cited theorem of [1] was generalized by Golodets and Sinelshchikov [7] to state that if $G_1$, $G_2$ are free countable amenable transformation groups supplied with cocycles $\alpha_1$, $\alpha_2$ with values in a l.c.s.
group A, and ρ1 , ρ2 are the Radon– Nikodym cocycles of G1 , G2 , then the double Mackey actions associated with (Gi , (αi , ρi )) are isomorphic if and only if the pairs (Gi , αi ) are weakly equivalent. In the case of a unimodular group A we can use this result directly in the proof of
MPAG007.tex; 6/04/1999; 8:14; p.15
346
VALENTIN YA. GOLODETS AND ALEXANDER M. SOKHET
Proposition 3.1 when comparing $(G, \alpha)$ and $(\mathcal{G}, \alpha_1)$, but in the general case $\rho_1$ is not equal to the Radon–Nikodym cocycle of the action $\mathcal{G}$, and the result of [7] is inapplicable. However, when $\log \Delta(\alpha(\omega, g))$ is a coboundary, one can easily change the measure on $\Omega$ so that this case is reduced to the unimodular one. This explains the admissibility conditions for the group $A$ that ensure the existence of a countable amenable $B$ together with the possibility to apply the result of [7]. To prove our statement, we only have to check whether the Mackey action constructed by the pair $(G, (\rho, \alpha))$ is isomorphic to the Mackey action constructed by the pair $(\mathcal{G}, (\rho_1, \alpha_1))$. To construct the latter Mackey action, we write the following five pairwise commuting actions (below $\hat g$, $\hat V_b$, $\hat T_t$ stand for the skew product constructed by $\mathcal{G}$ and $(\rho_1, \alpha_1)$, while $\omega \in \Omega$, $g \in G$, $a, a_1, a_2 \in A$, $b \in B \subset A$, $u, u_1, u_2 \in \mathbb{R}$, $t \in \Delta \subset \mathbb{R}$).
(a) $\hat g(\omega, a, u, a_1, u_1) = (g\omega, a\alpha(\omega, g), u + \rho(\omega, g), a_1, u_1)$;
(b) $\hat V_b(\omega, a, u, a_1, u_1) = (\omega, ba, u, a_1 \cdot b^{-1}, u_1)$;
(c) $\hat T_t(\omega, a, u, a_1, u_1) = (\omega, a, u + t, a_1, u_1 - t)$;
(d) $V_{a_2}(\omega, a, u, a_1, u_1) = (\omega, a, u, a_2 \cdot a_1, u_1)$;
(e) $T_{u_2}(\omega, a, u, a_1, u_1) = (\omega, a, u, a_1, u_1 + u_2)$.
According to the definition, to construct the Mackey action in this case, we have to find the quotient space of $\Omega \times A \times \mathbb{R} \times A \times \mathbb{R}$ by the $\sigma$-algebra of all $(\hat g, \hat V_b, \hat T_t)$-invariant sets and then consider the quotient action of (d) and (e) in this space. Note that the condition $a \cdot a_1 = \mathrm{Const}$, for any given constant value belonging to $A$, picks out an invariant subset for the action $\hat V_b$, and $\hat V_b$ acts ergodically inside each of these subsets because of the density of $B$ in $A$. Similarly, the condition $u + u_1 = \mathrm{Const}$, for any given constant value belonging to $\mathbb{R}$, picks out an invariant subset for the action $\hat T_t$, and $\hat T_t$ acts ergodically inside each of these subsets because of the density of $\Delta$ in $\mathbb{R}$. This allows us to define the quotient space whose elements have the form $(\omega, \bar a, \bar u)$, where $\bar a = a \cdot a_1 = \mathrm{Const} \in A$ and $\bar u = u + u_1 = \mathrm{Const} \in \mathbb{R}$, together with the quotient actions
(a′) $\hat g(\omega, a, u) = (g\omega, a \cdot \alpha(\omega, g), u + \rho(\omega, g))$;
(d′) $V_{a_2}(\omega, a, u) = (\omega, a_2 \cdot a, u)$;
(e′) $T_{u_2}(\omega, a, u) = (\omega, a, u + u_2)$.
We see that this space can indeed be identified with the quotient space of $\Omega \times A \times \mathbb{R} \times A \times \mathbb{R}$ by the ergodic components of the actions (b) and (c), and the actions (a′), (d′) and (e′) are exactly the quotient actions of (a), (d), (e), respectively. But now we only have to note that the definition of the quotient action of (d′) and (e′) by the ergodic components of (a′) coincides with the construction of $F_t$ and $W_b$ verbatim. □
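The invariance step in the proof above can be checked mechanically in a toy model. The following is a minimal sketch, assuming an Abelian group written additively and modelled by exact rationals (so both $A$ and $\mathbb{R}$ are replaced by `Fraction` values); the names `Point`, `v_hat`, `t_hat`, `v_shift`, `t_shift` and `quotient_coords` are hypothetical and introduced only for illustration. The coordinate $\omega$ is omitted because the actions (b) to (e) do not move it.

```python
from fractions import Fraction
from typing import NamedTuple


class Point(NamedTuple):
    """Coordinates (a, u, a1, u1); omega is omitted because (b)-(e) fix it."""
    a: Fraction
    u: Fraction
    a1: Fraction
    u1: Fraction


def v_hat(b: Fraction, p: Point) -> Point:
    # Action (b), written additively: (a, a1) -> (b + a, a1 - b).
    return Point(p.a + b, p.u, p.a1 - b, p.u1)


def t_hat(t: Fraction, p: Point) -> Point:
    # Action (c): (u, u1) -> (u + t, u1 - t).
    return Point(p.a, p.u + t, p.a1, p.u1 - t)


def v_shift(a2: Fraction, p: Point) -> Point:
    # Action (d): a1 -> a2 + a1.
    return Point(p.a, p.u, a2 + p.a1, p.u1)


def t_shift(u2: Fraction, p: Point) -> Point:
    # Action (e): u1 -> u1 + u2.
    return Point(p.a, p.u, p.a1, p.u1 + u2)


def quotient_coords(p: Point) -> tuple:
    # The two quantities held constant in the proof: a * a1 (additively a + a1) and u + u1.
    return (p.a + p.a1, p.u + p.u1)


# Illustrative sample point and group elements (arbitrary values).
p = Point(Fraction(1, 2), Fraction(3), Fraction(-2, 5), Fraction(7, 4))
b, t = Fraction(5, 3), Fraction(-9, 8)
```

In this model `quotient_coords` is unchanged by `v_hat` and `t_hat`, while `v_shift` and `t_shift` translate it exactly as (d′) and (e′) prescribe. The sketch does not model ergodicity or the density of $B$ and $\Delta$; it only illustrates the algebra of the invariants.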
REMARK. The same argument implies, in particular, that the transformation groups G and G are orbit equivalent and the pairs (G, α) and (G, α1 ) are weakly equivalent. REMARK. An observation similar to this proposition was first made in [11]. 3.2.
THE MAIN APPROXIMATION LEMMA
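DEFINITION 3.3. Let $h \in [\mathcal{G}]^\nu_*$, and let $E \subset \operatorname{Dom} h$ be a measurable set. Let $f_h(x)$ and $f_E(x)$ denote the nonnegative integrable functions in $L^1(X, \mu)$ such that $f_h(x) = \nu(\operatorname{Im} h \mid x)$; $f_E(x) = \nu(E \mid x)$. Obviously $f_E = f_{\mathrm{id}|_E}$, $f_h = f_{\operatorname{Im} h}$, $\|f_E\|_1 = \nu(E)$.
LEMMA 3.2. Let $\varepsilon > 0$, let $E$ be a measurable subset of $\Omega \times A \times \mathbb{R}$, and let $f \in L^1(X, \mu)_+$ be such that $\|f - f_E\|_1 < \varepsilon$. Then there exist a measurable set $E_1 \subset \Omega \times A \times \mathbb{R}$ and a map $h \in [\mathcal{G}]^\nu_*$ from $E$ onto $E_1$ such that $f_{E_1} = f$, $\|\nu(h\,\cdot) - \nu(\cdot \cap E)\| \le \|f - f_E\| + \varepsilon < 2\varepsilon$, and $\nu(\{z \in E : \alpha_1(z, h) \neq e_A\}) < 2\varepsilon$.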
Note that the cocycle α1 is defined in Proposition 3.1. REMARK. Our proof almost reproduces the proof of Lemma 13 in [12], (which was presented there for a simpler case), but the basic idea of this proof dates back to Lemmas 5.9 and 6.4 in [3]. Proof. Decompose the space X into three disjoint subspaces X− , X0 , X+ in the following way: X− = {x ∈ X: f (x) < fE (x)}, X0 = {x ∈ X: f (x) = fE (x)}, X+ = {x ∈ X: f (x) > fE (x)}. When µ(X− ) = µ(X+ ) = 0 we may set h = id, E1 = E. Since ν(· | x) is an infinite σ -finite measure, it is possible to find a measurable subset E0 ⊂ × A × R such that ν(E0 | x) = f (x) for almost every x, E0 ∩ π −1 (X− ) ⊂ E ∩ π −1 (X− ), E0 ∩ π −1 (X0 ) = E ∩ π −1 (X0 ), E0 ∩ π −1 (X+ ) ⊃ E ∩ π −1 (X+ ).
Case (i). Let µ(X− ) > 0. In this case we can find some measurable sets E00 ⊂ E and F ⊂ × A × R such that E00 ∩ π −1 (X+ ∪ X0 ) = E0 ∩ π −1 (X+ ∪ X0 ), ν(E00 | x) < f (x)(= ν(E0 | x)), F ∩ E = ∅, ν(F | x) = ν(E0 | x) − ν(E00 | x), kf (x) − ν(E00 | x)k < ε/2. Since G is of type III, we can obtain a partial transformation h ∈ [G]ν∗ such that Dom h = E \ E00 , Im h = F . Then, extend h to the whole set E by setting h|E00 = id. Now we see that h maps E onto E00 ∪F . Denote E00 ∪F by E1 . Obviously, fh = fE1 = f and ν(z ∈ E : h·z 6= z) 6 ν(E\E00 ) 6 ν(E\E0 )+ν(E0 \E00 ) < 32 ε. Then kν(h·) − ν(id|E ·)k = kν(h|(E\E00 )∩π −1 (X− ) ·) − ν((E \ E00 ) ∩ π −1 (X− ) ∩ ·)k 6 ν(h((E \ E00 ) ∩ π −1 (X− ))) + ν((E \ E00 ) ∩ π −1 (X− )) Z 3 ε 3 6 ν(F | x) dµ(x) + ε 6 + ε = 2ε. 2 2 2 X− Case (ii). Let µ(X+ ) > 0. Similarly to the case considered above we can construct some measurable sets E00 ⊂ E, and F ⊂ × A × R such that E00 ∩ π −1 (X− ∪ X0 ) = E0 ∩ π −1 (X− ∪ X0 ), ν(E00 | x) < fE (x) for almost every x ∈ X+ , F ∩ E0 = ∅, ν(F | x) = ν(E | x) − ν(E00 | x), kν(F | x)k1 < ε/2. Since G is of type III, we can obtain a partial transformation h ∈ [G]ν∗ such that Dom h = E \ E00 , Im h = (E0 \ E) ∪ F . Note that the last union is disjoint. Define h|E00 = id. Then we obtain a map h which transforms E onto E1 = E00 ∪ F ∪ (E0 \ E); here E1 is represented as a disjoint union. Thus, we see that fh = fE00 + fF + fE0 − fE = fE0 = f, the set of those points where h differs from id is contained in E \ E0 and hence its measure is less than ε/2, and kν(h·) − ν(id|E ·)k = kν(h|(E\E00 )∩π −1 (X+ ) ·) − ν((E \ E00 ) ∩ π −1 (X+ ) ∩ ·)k 6 ν(h((E \ E00 ) ∩ π −1 (X+ ))) + ν((E \ E00 ) ∩ π −1 (X+ )) ε ε ε 6 ν(E0 \ E) + ν(F ) + < ε + + = 2ε. 2 2 2
3.3. PROPERTIES OF THE TRANSFORMATION GROUP $\mathcal{G}$
DEFINITION 3.4. Two sets $E, E' \in \mathcal{B}$ are called $\widetilde{G}$-Hopf equivalent if there exists a partial transformation $h \in [\widetilde{G}]_*$ such that $\operatorname{Dom} h = E$, $\operatorname{Im} h = E'$.
LEMMA 3.3. The correspondence $E \mapsto f_E$ between $\widetilde{G}$-Hopf equivalence classes in $\mathcal{B}$ and functions from $L^1_+(X, \mu)$ is bijective and additive and, moreover, $\|f_E - f_{E'}\| \le \nu(E \,\triangle\, E')$.
Proof. It is clear that for $E, E' \in \mathcal{B}$ such that $\nu(E) < \infty$ and $\nu(E') < \infty$, $E$ and $E'$ are $\widetilde{G}$-Hopf equivalent if and only if $\nu(E \mid x) = \nu(E' \mid x)$ for almost every $x$. Since each $\nu(\cdot \mid x)$ is an infinite and $\sigma$-finite measure, the map $E \mapsto f_E \in L^1_+(X, \mu)$ is onto. The additivity, i.e. that $f_{E \cup F} = f_E + f_F$ when $E$ and $F$ are disjoint, is trivial. The estimate $\|f_E - f_{E'}\| \le \nu(E \,\triangle\, E')$ follows from the definition of $\nu(\cdot \mid x)$. □
LEMMA 3.4. For any $\delta \in \Delta$, $b \in B$, $h \in [\mathcal{G}]^\nu_*$, $f \in L^1_+(X, \mu)$ we have:
$$f_{T_\delta^{-1} \cdot V_b^{-1} \cdot h}(x) = \exp(-\delta) \cdot f_h(F_\delta W_b x) \cdot \frac{d\mu \circ F_\delta W_b}{d\mu}(x).$$
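Proof. Let $g(x)$ be some function in $L^\infty(X, \mu)$. Then
$$\int_X g(x) \cdot f_{T_\delta^{-1} \cdot V_b^{-1} \cdot h}(x) \, d\mu(x)$$
(according to the definition of our measures)
$$= \int_X g(x) \, d\mu(x) \cdot \int_{\pi(\omega,a,u)=x} \chi_{T_\delta^{-1} V_b^{-1}(\operatorname{Im} h)}(\omega, a, u) \, d\nu(\omega, a, u \mid x)$$
(due to the construction of $\nu$)
$$= \int_\Omega dm(\omega) \int_{A \times \mathbb{R}} g(\pi(\omega, a, u)) \cdot \chi_{\operatorname{Im} h}(\omega, ab, u + \delta) \cdot \exp(u) \, du \, da$$
(changing variables in $\Omega \times A \times \mathbb{R}$)
$$= \exp(-\delta) \int_\Omega dm(\omega) \int_{A \times \mathbb{R}} g(\pi(\omega, ab^{-1}, u - \delta)) \, \chi_{\operatorname{Im} h}(\omega, a, u) \exp(u) \, du \, da$$
(according to the definition of our measures)
$$= \exp(-\delta) \int_X g(F_{-\delta} W_{b^{-1}} x) \, d\mu(x) \int_{\pi(\omega,a,u)=x} \chi_{\operatorname{Im} h}(\omega, a, u) \, d\nu(\omega, a, u \mid x)$$
$$= \int_X \exp(-\delta) \cdot g(x) \cdot f_h(F_\delta W_b \cdot x) \cdot \frac{d\mu \circ F_\delta W_b}{d\mu}(x) \, d\mu(x).$$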
LEMMA 3.5. Let $h \in [\mathcal{G}]^\nu_*$ and $f_i \in L^1_+(X, \mu)$ $(1 \le i \le N)$ be such that $f_h = \sum_{i=1}^N f_i$. Then there exist partial transformations $h_i \in [\mathcal{G}]^\nu_*$ such that
$$\operatorname{Dom} h = \bigcup_{i=1}^N \operatorname{Dom} h_i \ \text{(disjoint union)}, \qquad f_i = f_{h_i}, \qquad \nu(h\,\cdot) = \sum_{i=1}^N \nu(h_i\,\cdot),$$
and
$$\{z \in \operatorname{Dom} h : \alpha_1(z, h) \neq e_A\} = \bigcup_{i=1}^N \{z \in \operatorname{Dom} h_i : \alpha_1(z, h_i) \neq e_A\}.$$
Proof. Decompose the set $\operatorname{Im} h$ into a finite number of disjoint measurable sets $E_i$ $(1 \le i \le N)$ such that $\nu(E_i \mid x) = f_i(x)$ for almost every $x \in X$. Define partial transformations $h_i \in [\mathcal{G}]^\nu_*$ by $h_i(\omega, a, u) = h(\omega, a, u)$, where $(\omega, a, u) \in h^{-1}(E_i)$. Then it is easy to check that these $h_i$ satisfy the desired conditions. □
Let $\widetilde{B}$ denote the transformation group $\{V_b : b \in B\}$.
LEMMA 3.6. $[\widetilde{G} \times \widetilde{B}] = \{h \in [\mathcal{G}]_* : \nu(h\,\cdot) = \nu(\cdot)\}$.
Proof. We only have to prove that if $h \in [\mathcal{G}]_*$ is $\nu$-preserving then $h \in [\widetilde{G} \times \widetilde{B}]$. Since for almost every $(\omega, a, u) \in \operatorname{Dom} h$ we see that $h(\omega, a, u) = \tilde g \cdot V_b T_\delta(\omega, a, u)$ for some $g \in G$, $b \in B \subset A$, and $\delta \in \Delta \subset \mathbb{R}$, and since $\frac{d\nu \circ \tilde g \cdot V_b \cdot T_\delta}{d\nu} = \exp(\delta)$, the condition $\nu(h\,\cdot) = \nu(\cdot)$ implies $\delta = 0$. □
LEMMA 3.7. $[\widetilde{G}]$ coincides with the set of all $h \in [\mathcal{G}]_*$ possessing the following properties: $\nu(h\,\cdot) = \nu(\cdot)$ and for almost every point $z \in \Omega \times A \times \mathbb{R}$ there exists $g \in G$ such that $\pi(h \cdot z) = g\pi(z)$ and $\pi_A(h \cdot z) = \alpha(\pi(z), g) \cdot \pi_A(z)$.
Proof. It is evident that $h = \tilde g$ possesses these properties. Conversely, the condition $\nu(h\,\cdot) = \nu(\cdot)$, according to the previous lemma, implies that $h$ is (locally) equal to $\tilde g \cdot V_b$. The second condition is valid only when $b = e_A$. □
LEMMA 3.8. Let $h, h' \in [\mathcal{G}]^\nu_*$. The following statements are equivalent:
(a) there exists $v \in [\mathcal{G}]^\nu_*$ such that $\operatorname{Im} v = \operatorname{Dom} h$, $\nu(hv\,\cdot) = \nu(h'\,\cdot)$, $\operatorname{Dom} v = \operatorname{Dom} h'$, and for any $(\omega, a, u) \in \operatorname{Dom} h'$ there exists some $g \in G$ such that $g\pi(\omega, a, u) = g\omega = \pi(hvh'^{-1}(\omega, a, u))$ and $\pi_A(hvh'^{-1}(\omega, a, u)) = \alpha(\omega, g)$;
(b) the sets $\operatorname{Im} h$ and $\operatorname{Im} h'$ are $\widetilde{G}$-Hopf equivalent;
(c) $f_h = f_{h'}$.
Proof. The equivalence between (a) and (b) follows from Lemma 3.7; we only need to note that $v = h^{-1} g h'$, $g = hvh'^{-1}$. The equivalence between (b) and (c) was proved in Lemma 3.3. □
4. Type III Case – Proof of the Main Theorem
4.1. FORMULATION AND COMMENTARY
THEOREM 4.1. Let $G$ be a type III amenable ergodic countable group of nonsingular transformations on $(\Omega, \mathcal{B}, \mu)$, and let $\alpha$ be a 1-cocycle for this action with values in an admissible group $A$, and $\rho$ the Radon–Nikodym cocycle. The pair $(G, (\rho, \alpha))$ is weakly equivalent to a pair consisting of a product odometer and a product cocycle if and only if the double Mackey action $(F_t, W_b)$ is AT.
Proof of sufficiency. The if part of this theorem is rather well known. It follows from the fact that transitive actions are approximately transitive (for Abelian group actions it was shown in [8] and independently in [15]; for the general case, it was noted in [14]; see the complete proof in [10]?) and from the fact that a quotient action of an AT action is also AT ([3, Remark 2.4]). Indeed, the definitions of a product cocycle and a product odometer imply that the joint action $(\tilde g, T_t, V_b)$ is transitive and hence AT by a straightforward computation; but the double Mackey action is only its quotient action. □
A nontrivial part of the theorem is the fact that approximate transitivity of the double Mackey action implies that the given pair is weakly equivalent to a product odometer supplied with a product cocycle. It will be proved below. In view of Proposition 3.1, instead of the given pair $(G, (\rho, \alpha))$ we may consider the pair $(\mathcal{G}, (\rho_1, \alpha_1))$ and try to prove that this pair is weakly equivalent to a pair consisting of a product odometer and a product cocycle. To do so, we are going to apply the criterion proved in Proposition 2.2. Thus, let $Z = \sum_{i=1}^n \zeta_i$ be an arbitrary multiple tower for $\mathcal{G}$ with constant $(P, \beta)$-passage values, where $P \sim \nu$, $\beta \overset{h}{\sim} \alpha_1$. Our purpose is to construct $\xi$, $Q$, $\gamma$ as in Proposition 2.2. (See also the remarks after its proof and after the proof of Proposition 2.1.)
4.2. REDUCTION TO A PARTICULAR CASE
Take an arbitrary floor $e_{r(i)}$ from each $\zeta_i$. Consider the set $E = \bigcup_{i=1}^n e_{r(i)}$ and call it the base of $Z$.
? Even a stronger fact is valid, namely, the transitive actions not only are AT but also have the funny rank one. See [20].
LEMMA 4.2. In the further proof of Theorem 4.1 we may assume that P = ν on E and that βE = (α1 )E . This means that when the statement of our theorem turns out to be true for this particular case, it will imply its validity for the general case. Proof will be presented in two steps. First, we construct another multiple tower 0 Z instead of Z, together with another measure P 0 and another cocycle β 0 that will be used instead of P and β, respectively, and that really possess the properties P 0 = ν on E and βE0 = (α1 )E . Second, suppose that the theorem is true for this particular case. This means that we can find some ξ, Q0 and γ 0 as in Proposition 2.2 suitable for the triple (Z 0 , P 0 , β 0 ). The same single tower ξ together with Q and γ that we present here will be suitable for the initial case of the triple (Z, P , β). Step 1. For any δ > 0 we may decompose the sets er(i) into a finite number of disjoint sets Eij (0 6 j 6 J ) such that P (Ei0 ) < δ, ν(Ei0 ) < δ, dP (z) < cij exp(δ), dν dist(h(z), aij ) < δ/2. cij exp(−δ)
δ/2 < δ · ν supp(ζi ) . ν ω∈ i=1
i=1
Now we can construct a (uniquely defined) measure Q together with a (uniquely defined) cocycle γ in such a way that Q(E) = cij · Q0 (E), when E ⊂ Eij , 1 6 j 6 J ; Q(E) = Q0 (E), when E ⊂ Ei0 ; P the distributions of Q and Q0 relative to ni=1 ζi coincide; the function k(z) which intertwines γ with γ 0 is equal to aij on each Eij , 1 6 j 6 J , and to eA on Ei0 ; P (e) the distributions of γ and γ 0 relative to ni=1 ζi coincide.
(a) (b) (c) (d)
Then we see that ξ has constant (Q, γ )-passage values, and kQ − P kSni=1 supp(ζi ) 6 kQ − P kSn
i=1
6
n X J X
SJ j=1
Orbζi (Eij )
+ kQ − P kSni=1 Orbζi (Ei0 )
cij exp(δ) · δ 0 + 2δ
i=1 j =1
can be done as small as we need because the choice of δ, δ 0 is to hand. Moreover, note that f (z) = h−1 (z) · h0 (z) · f 0 (z) · k(z) intertwines β with γ , and n [ supp(ζi ) : dist(f (z), eA ) > δ ν z∈ i=1 00
6δ ·ν
[ n
[ n supp(ζi ) + ν Orbζi (Ai0 )
i=1
also can be made as small as we need.
i=1
2
Note that this lemma implies, in particular, that for any partial transformation v ∈ [G] such that Dom v ⊂ er(1) , Im v ⊂ er(i) and for z ∈ er(1) we may regard β(z, v) as being equal to α1 (z, v).
4.3. PROOF OF THE MAIN THEOREM
We know that the double Mackey action $(F_t, W_b)$ $(t \in \mathbb{R}, b \in A)$ is AT. Since for any $f \in L^1(X, \mu)$ the mapping $\mathbb{R} \times A \to L^1(X, \mu)$, $(t, b) \mapsto f(F_t W_b x) \cdot \frac{d\mu \circ F_t W_b}{d\mu}$ is continuous and since $\Delta$ and $B$ are dense subgroups in $\mathbb{R}$ and $A$, respectively, the restriction of the double Mackey action onto $\Delta \times B$ is also AT. This means that for the above chosen $e_{r(i)}$, $1 \le i \le n$, and any $\theta > 0$, we can find $f \in L^1_+(X, \mu)$ and a finite number of $\delta(i, l) \in \Delta \subset \mathbb{R}$, $b(i, l) \in B \subset A$, $1 \le l \le L_i$, $1 \le i \le n$, such that
$$\left\| f_{e_{r(i)}}(x) - \sum_{l=1}^{L_i} \exp(-\delta(i, l)) \cdot f(F_{\delta(i,l)} W_{b(i,l)} x) \, \frac{d\mu \circ F_{\delta(i,l)} W_{b(i,l)}}{d\mu}(x) \right\| < \theta$$
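for any $1 \le i \le n$. Here, as above, $f_{e_{r(i)}} = \nu(e_{r(i)} \mid x)$. Applying Lemma 3.2 $n$ times, to each set $e_{r(i)}$ and to the function
$$R_i(x) = \sum_{l=1}^{L_i} \exp(-\delta(i, l)) \cdot f(F_{\delta(i,l)} W_{b(i,l)} x) \, \frac{d\mu \circ F_{\delta(i,l)} W_{b(i,l)}}{d\mu}(x),$$
we obtain each time a partial transformation $h_i \in [\mathcal{G}]^\nu_*$ such that $\operatorname{Dom} h_i = e_{r(i)}$ (the set $\operatorname{Im} h_i$ will be denoted by $Z_i$), $f_{h_i}(x) = R_i(x)$, $\|\nu(h_i\,\cdot) - \nu(e_{r(i)} \cap \cdot)\| < 2\theta$, and $\nu(\{\omega \in e_{r(i)} : \alpha_1(\omega, h_i) \neq e_A\}) < 2\theta$, $1 \le i \le n$. Applying Lemma 3.5, for each $i$, to $h_i$ and $R_i(x)$, we obtain that there exist partial transformations $h_i^l \in [\mathcal{G}]^\nu_*$ together with the corresponding sets $Y_i^l = \operatorname{Dom} h_i^l$, $Z_i^l = \operatorname{Im} h_i^l$ so that
$$e_{r(i)} = \bigcup_{l=1}^{L_i} Y_i^l, \qquad f_{h_i^l} = f_{Z_i^l} = \exp(-\delta(i, l)) \cdot f(F_{\delta(i,l)} W_{b(i,l)} x) \, \frac{d\mu \circ F_{\delta(i,l)} W_{b(i,l)}}{d\mu}(x),$$
$$\nu(h_i\,\cdot) = \sum_{l=1}^{L_i} \nu(h_i^l\,\cdot), \qquad \text{and} \qquad \sum_{l=1}^{L_i} \nu(\{z \in Y_i^l : \alpha_1(z, h_i^l) \neq e_A\}) < 2\theta.$$
Now we can use Lemma 3.4. We see that
$$f_{h_i^l}(x) = \exp(-\delta(i, l)) \cdot f(F_{\delta(i,l)} W_{b(i,l)} x) \, \frac{d\mu \circ F_{\delta(i,l)} W_{b(i,l)}}{d\mu}(x)$$
$$= \exp(-\delta(i, l)) \cdot f\bigl(F_{\delta(1,1)} W_{b(1,1)} (F_{\delta(i,l)-\delta(1,1)} W_{b(1,1)^{-1} b(i,l)} x)\bigr) \cdot \frac{d\mu \circ F_{\delta(1,1)} W_{b(1,1)}}{d\mu}(F_{\delta(i,l)-\delta(1,1)} W_{b(1,1)^{-1} b(i,l)} x) \cdot \frac{d\mu \circ F_{\delta(i,l)-\delta(1,1)} W_{b(1,1)^{-1} b(i,l)}}{d\mu}(x)$$
$$= f_{h_1^1}(F_{\delta(i,l)-\delta(1,1)} W_{b(1,1)^{-1} b(i,l)} x) \cdot \frac{d\mu \circ F_{\delta(i,l)-\delta(1,1)} W_{b(1,1)^{-1} b(i,l)}}{d\mu}(x) \cdot \exp(-\delta(i, l) + \delta(1, 1))$$
$$= f_{T^{-1}_{\delta(i,l)-\delta(1,1)} \cdot V^{-1}_{b(1,1)^{-1} b(i,l)} \cdot h_1^1}(x).$$
This allows us to apply Lemma 3.8 to the partial transformations $h_i^l$ and $T^{-1}_{\delta(i,l)-\delta(1,1)} \cdot V^{-1}_{b(1,1)^{-1} b(i,l)} \cdot h_1^1 \in [\mathcal{G}]^\nu_*$ to obtain partial transformations $v_i^l \in [\mathcal{G}]^\nu_*$ such that $\operatorname{Dom} v_i^l = \operatorname{Dom} h_1^1 = Y_1^1$, $\operatorname{Im} v_i^l = \operatorname{Dom} h_i^l = Y_i^l$,
$$\nu(h_i^l v_i^l(\cdot)) = \nu(T^{-1}_{\delta(i,l)-\delta(1,1)} \cdot V^{-1}_{b(1,1)^{-1} b(i,l)} \cdot h_1^1(\cdot)) = \exp(-\delta(i, l) + \delta(1, 1)) \cdot \nu(h_1^1\,\cdot)$$
(by the definitions of $T$, $V$ and $\nu$), and $h_i^l \cdot v_i^l \cdot (h_1^1)^{-1} \cdot V_{b(1,1)^{-1} b(i,l)} \cdot T_{\delta(i,l)-\delta(1,1)} \in [\widetilde{G}]$.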
4.4. CONSTRUCTION OF THE TOWER $\xi$
Let $z \in e_s$, which is a floor of $\zeta_j$, and let $e_{r(j),s} z \in Y_j^m$. Define $e_{irl,jsm}(z) = e_{r,r(i)} \cdot v_i^l \cdot (v_j^m)^{-1} \cdot e_{r(j),s}(z)$. It is easy to see that $\{e_{irl,jsm} : 1 \le i, j \le n,\ 1 \le l \le L_i,\ 1 \le m \le L_j,\ r \in \Lambda_i,\ s \in \Lambda_j\}$ form a tower which we denote by $\xi$. The levels of $\xi$ are the sets $e_{r,r(i)} Y_i^l$. Obviously the simple tower $\xi$ makes a refinement of the given multiple tower $\sum_{i=1}^n \zeta_i$.
4.5. CONSTRUCTION OF THE MEASURE $Q$
Let $E \subset Y_i^l$. Let us define, for any $s \in \Lambda_i$,
$$Q(e_{s,r(i)} E) = \frac{P(e_s)}{P(e_{r(i)})} \cdot \nu(h_i^l E).$$
This means that the values of the Radon–Nikodym cocycle for $Q$ reproduce the corresponding values of the Radon–Nikodym cocycle for $P$ on each $e_{r,s}$ belonging to the initial multiple tower. Obviously $Q$ is equivalent to $\nu$ and $P$. Since we see that for any $z \in Y_i^l$,
$$\frac{dQ \circ v_i^l}{dQ}(z) = \frac{d\nu \circ h_i^l v_i^l}{d\nu \circ h_1^1}(z) = \exp(-\delta(i, l) + \delta(1, 1)),$$
$\xi$ has constant $Q$-passage values.
4.6. CONSTRUCTION OF THE COCYCLE $\gamma$
We will search for the appropriate γ in the following form: γ (ω, a, u; h) = f (ω, a, u) · β(ω, a, u; h) · f −1 (h(ω, a, u)). This ensures that γ is cohomologous to β and α. The intertwining function f can be found in the following form: f (er,r(i) z) = β −1 (z, er,r(i) ) · f (z) · β(z, er,r(i) ), where z ∈ er(i) and r ∈ 3i . This guarantees that the initial multiple tower has constant γ -passage values. Let f |Y11 ≡ eA , and for any z ∈ Y11 f (vil z) = α1 (z, hli vil )−1 · α1 (z, h11 ) · β(z, vil ). In other words, for z ∈ Y11 we have just defined that γ (z, vil ) = α1 (z, h11 )−1 · α1 (z, hli vil ). Let us compute this value and prove that it is a constant. Lemma 3.8 allows us to write (locally) hli · vil · (h11 )−1 · Vb(1,1)−1b(i,l) · Tδ(i,l)−δ(1,1) = g˜ il , −1 −1 l and hence, for any z2 ∈ Vb(1,1) −1b(i,l) · Tδ(i,l)−δ(1,1) · Yi ,
α1 (z2 , hli · vil · (h11 )−1 · Vb(1,1)−1b(i,l) · Tδ(i,l)−δ(1,1) ) = eA . By the definition of a cocycle, we obtain: α1 (z2 , Tδ(i,l)−δ(1,1) ) · α1 (z1 , hli · vil · (h11 )−1 · Vb(1,1)−1b(i,l) ) = eA , −1 l where z1 = Tδ(i,l)−δ(1,1) · z2 ∈ Vb(1,1) −1b(i,l) · Yi . As the first multiplier here is also equal to eA , we write:
α1 (z1 , hli · vil · (h11 )−1 · Vb(1,1)−1b(i,l) ) = eA . Using the definition of a cocycle again, we obtain: α1 (z1 , Vb(1,1)−1b(i,l) ) · α1 (z0 , hli · vil · (h11 )−1 ) = eA ,
where z0 = Vb(1,1)−1b(i,l) · z1 ∈ Yil . As the first multiplier here is known due to the construction of α1 , we write: α1 (z0 , hli · vil · (h11 )−1 ) = b(1, 1)−1 b(i, l) = Const. But the latter expression can be transformed in the following way: α1 (z0 , (h11 )−1 ) · α1 ((h11 )−1 z0 , hli · vil ) = b(1, 1)−1 b(i, l), and the first multiplier here is evidently equal to (α1 ((h11 )−1 z0 , h11 ))−1 . As a result of this, for z = (h11 )−1 z0 ∈ Y11 , we can write that α1 (z, h11)−1 · α1 (z, hli vil ) = b(1, 1)−1 · b(i, l). Thus, we have proved that ξ has constant γ -passage values. 4.7.
ESTIMATES FOR THE MEASURE $Q$
We have to estimate $\|Q - P\|_{\bigcup_{i=1}^n \operatorname{supp}(\zeta_i)}$. As $P = \nu$ on each $e_{r(i)}$ due to Lemma 4.2, it suffices to estimate $\|Q - \nu_q\|_{\bigcup_{i=1}^n \operatorname{supp}(\zeta_i)}$ (here $q$ is the distribution vector of $P$ relative to $\sum_{i=1}^n \zeta_i$). Since $P$ and $Q$ have the same distributions, the required estimate can be obtained as $\|Q(e_{r(i)} \cap \cdot) - \nu(e_{r(i)} \cap \cdot)\|$ multiplied by the number of levels of the given multiple tower. But we see that
$$\|Q(e_{r(i)} \cap \cdot) - \nu(e_{r(i)} \cap \cdot)\| = \left\| \sum_{l=1}^{L_i} \nu(h_i^l\,\cdot) - \nu(e_{r(i)} \cap \cdot) \right\| = \|\nu(h_i\,\cdot) - \nu(e_{r(i)} \cap \cdot)\| < 2\theta.$$
Since the choice of $\theta$ is at our disposal, $\|Q - P\|_{\bigcup_{i=1}^n \operatorname{supp}(\zeta_i)}$ can be made less than any given $\varepsilon > 0$.
4.8. ESTIMATES FOR THE COCYCLE $\gamma$. A COROLLARY
We have already constructed the function $f$ which intertwines $\beta$ with $\gamma$, and we must estimate the set of points where $f$ differs from $e_A$. As $\beta$ and $\gamma$ have the same distributions relative to $\sum_{i=1}^n \zeta_i$, to do so we have only to estimate $\nu(\{z \in Y_1^1 : f(v_i^l z) \neq e_A\})$. Then,
$$f(v_i^l z) = \alpha_1(v_i^l z, h_i^l)^{-1} \cdot \alpha_1(z, v_i^l)^{-1} \cdot \alpha_1(z, h_1^1) \cdot \beta(z, v_i^l).$$
The first and the third multipliers here differ from $e_A$ only inside the sets $\{z \in Y_i^l : \alpha_1(z, h_i^l) \neq e_A\}$ whose total measure is small. Outside these sets the product of the second and the fourth multipliers is equal to $e_A$ due to Lemma 4.2. This completes the proof of Theorem 4.1. □
COROLLARY 4.3. The sufficient condition for the product property proved in Proposition 2.2 is also a necessary condition.
Indeed, the double Mackey action for a product odometer and a product cocycle is AT. We have already checked in the proof of Theorem 4.1 that approximate transitivity implies the required condition of Proposition 2.2. □
5. Type II Case
5.1. PROOF OF THE MAIN THEOREM
THEOREM 5.1. Let G be a countable amenable transformation group of type II acting on the Lebesgue space (, B, m), and α: × G → A be a 1-cocycle of this action with values in an admissible group A. The pair (G, α) is stably weakly equivalent to the pair consisting of a product odometer and a product cocycle if and only if the associated action is AT. Proof. Consider the product space × A. Denote the product measure dm × da by dν. Let G be a transformation group acting on this space, defined as follows: G = {g˜ · Vb : g ∈ G, b ∈ B}, where B is a countable dense subgroup of A, and the actions g˜ and Vb , as usually, are defined by g(ω, ˜ a) = (g · ω, a · α(ω, g)), Vb (ω, a) = (ω, b · a), and the corresponding Mackey action will be denoted by Wb , as above. Assume this action to be supplied with a cocycle α1 : ( × A) × G → A: α1 (ω, a; g˜ · Vb ) = b−1 . It is rather easy to see that the pair (G, α) is stably weakly equivalent to the pair (G, α1 ). The proof can be done in the same manner as in Proposition 3.1, and the main idea is to use the fact that the Mackey actions associated with these pairs are isomorphic. So we have to prove that the pair (G, α1 ) is weakly equivalent to a product odometer and a product cocycle. We are going to apply Proposition 2.1. To do that, suppose that a finite collection of partial transformations h1 , h2 , . . . , hn ∈ [G]ν∗ is given. (Of course, they are νpreserving.) Denote Dom hi by Ei and Im hi by Fi ; Ei and Fi ⊂ × A. Our
purpose is to construct some $\mathcal{G}$-stack $\zeta$ and some cocycle $\beta \sim \alpha_1$ so that $\zeta$ would have constant $\beta$-passage values, the function $f$ intertwining $\beta$ with $\alpha_1$ would be close to $e_A$, and $E_i$ and $F_i \in_{\nu,\varepsilon} \mathcal{B}(\zeta)$. To do that, notice from the very beginning that we may suppose that there exist $a_i \in A$ such that $\nu(\{\omega \in E_i : \operatorname{dist}(a_i, \alpha_1(\omega, h_i)) > \sigma\}) < \varepsilon' \nu(E_i)$ for any pre-given $\sigma$, $\varepsilon'$. (If this is wrong, split $E_i$ into smaller sets.) Recall that $(X, \mu)$ is the quotient space of $\Omega \times A$ by the ergodic components of $\tilde g$. Let $f_i(x) = f_{E_i}(x) = \nu(E_i \mid x)$: $X \to \mathbb{R}_+$. Since the Mackey action $W_b$ on $X$ is approximately transitive, for any given $\varepsilon > 0$ it is possible to find $f \in L^1(X, \mu)$, $b(i, l) \in B$ and $\lambda_{i,l} \in \mathbb{Z}_+$ (here $i = 1, \ldots, n$; $l = 1, \ldots, L_i$) such that
$$\left\| f_i(x) - \sum_{l=1}^{L_i} \lambda_{i,l} \cdot f(W_{b(i,l)} \cdot x) \cdot \frac{d\mu \circ W_{b(i,l)}}{d\mu}(x) \right\|_1 < \varepsilon.$$
Hence, there exist sets $R_i \subset \Omega \times A$ such that $\nu(R_i \,\triangle\, E_i) < \varepsilon$ and
$$f_{R_i}(x) = \sum_{l=1}^{L_i} \lambda_{i,l} \cdot f(W_{b(i,l)} \cdot x) \cdot \frac{d\mu \circ W_{b(i,l)}}{d\mu}(x).$$
Note that it is possible to extend the given partial transformations $h_i$ so that they would be defined for $E_i \cup R_i$ and would simultaneously belong to $[\mathcal{G}]^\nu_*$. Decompose these sets $R_i$ into a disjoint union of sets $R_{i,l}$ so that $\bigcup_{l=1}^{L_i} R_{i,l} = R_i$ and
$$f_{R_{i,l}}(x) = \lambda_{i,l} \cdot f(W_{b(i,l)} \cdot x) \cdot \frac{d\mu \circ W_{b(i,l)}}{d\mu}(x)$$
and hence
$$\sum_{l=1}^{L_i} f_{R_{i,l}}(x) = f_{R_i}(x).$$
Since we might assume from the very beginning that $\lambda_{i,l}$ are positive integers, it is also possible to decompose $R_{i,l}$ into a disjoint union of sets $R_{i,l,j}$, where $j = 1, \ldots, \lambda_{i,l}$, so that $\bigcup_{j=1}^{\lambda_{i,l}} R_{i,l,j} = R_{i,l}$ and
$$f_{R_{i,l,j}}(x) = f(W_{b(i,l)} \cdot x) \cdot \frac{d\mu \circ W_{b(i,l)}}{d\mu}(x).$$
LEMMA 5.2. $f_{V_b^{-1} E}(x) = f_E(W_b x) \cdot \frac{d\mu \circ W_b}{d\mu}(x)$ for any measurable subset $E \subset \Omega \times A$.
Proof. This can be shown very easily. A more difficult but similar equality was proved in Lemma 3.4 above.
Applying this lemma, we immediately obtain:
$$f_{R_{i,l,j}}(x) = f(W_{b(i,l)} \cdot x) \cdot \frac{d\mu \circ W_{b(i,l)}}{d\mu}(x)$$
$$= f(W_{b(1,1)} \cdot W_{b(1,1)^{-1} b(i,l)} x) \cdot \frac{d\mu \circ W_{b(1,1)}}{d\mu}(W_{b(1,1)^{-1} b(i,l)} x) \cdot \frac{d\mu \circ W_{b(1,1)^{-1} b(i,l)}}{d\mu}(x)$$
$$= f_{R_{1,1,1}}(W_{b(1,1)^{-1} b(i,l)} x) \cdot \frac{d\mu \circ W_{b(1,1)^{-1} b(i,l)}}{d\mu}(x) = f_{V^{-1}_{b(1,1)^{-1} b(i,l)} R_{1,1,1}}(x).$$
Hence, the sets $R_{i,l,j}$ and $V^{-1}_{b(1,1)^{-1} b(i,l)} R_{1,1,1}$ not only have equal measures, but also the same conditional measures for a.e. $x$. Therefore, there exist partial transformations $v_{i,l,j}$ such that $\operatorname{Dom} v_{i,l,j} = V^{-1}_{b(1,1)^{-1} b(i,l)} R_{1,1,1}$, $\operatorname{Im} v_{i,l,j} = R_{i,l,j}$, and these $v_{i,l,j}$ belong not only to $[\mathcal{G}]^\nu_*$, but even to $[\widetilde{G}]^\nu_*$. Note that $\alpha_1(\cdot, v_{i,l,j}) = e_A$. Denote $h_i R_{i,l,j}$ by $S_{i,l,j}$. The desired stack $\zeta$ is now ready: it consists of the sets $R_{i,l,j}$ and $S_{i,l,j}$, and the sets $R_{i,l,j}$ are intertwined by the transformations $v_{i,l,j} \circ V^{-1}_{b(1,1)^{-1} b(i,l)} : R_{1,1,1} \to R_{i,l,j}$. It is evident that this stack provides a good approximation of the initial partial transformations $h_i$ in the sense of Proposition 2.1. Note that for $z \in R_{1,1,1}$ we have
$$\alpha_1\bigl(z, v_{i,l,j} \circ V^{-1}_{b(1,1)^{-1} b(i,l)}\big|_{R_{1,1,1}}\bigr) = b(1, 1)^{-1} b(i, l) = \mathrm{Const}.$$
Now we are ready to define the cocycle $\beta$. Let the intertwining function $f$ be equal to $e_A$ on each $R_i$. Further, we must have for $z \in R_{i,l,j}$ that $f(h_i z) = \alpha_1^{-1}(z, h_i) \cdot \beta(z, h_i)$, where $\beta(z, h_i)$ are not defined yet but must be constants. Since for $z \in E_i$ the values $\alpha_1(z, h_i)$ are close to $a_i$, let for $z \in R_i$ the values $\beta(z, h_i)$ be equal to these $a_i$. This defines the function $f$ completely and correctly together with $\beta$, and the stack $\zeta$ has constant $\beta$-passage values according to the definition. Finally,
$$\nu(\{z \in \operatorname{supp}(\zeta) : \operatorname{dist}(f(z), e_A) > \sigma\}) \le \nu\Bigl(\bigcup_{i=1}^n (h_i R_i \setminus h_i E_i)\Bigr) + \sum_{i=1}^n \nu(\{z \in E_i : \operatorname{dist}(f(z), e_A) > \sigma\}) \le n \cdot \varepsilon + n \cdot \varepsilon' \cdot \nu(E_i) \le n \cdot \varepsilon + \varepsilon'.$$
As $n$ is given, and the choice of $\varepsilon$, $\varepsilon'$ is at our disposal, the conditions of Proposition 2.1 are satisfied. □
5.2. A REPRESENTATION OF APPROXIMATELY TRANSITIVE GROUP ACTIONS AS A PRODUCT COCYCLE RANGE
Now we suppose an AT action to be given. The following is obvious: COROLLARY 5.3. Each admissible group AT action can be represented as a product cocycle range. Proof. Really, according to [6], each l.c.s. group action can be represented as cocycle range with the base action of any prescribed type. Now this statement follows directly from Theorem 5.1 (type II case) and Theorem 4.1 (type III case). 2 Here we are going to strengthen the result of Corollary 5.3 and to prove that if the initial AT action of an admissible group was from the very beginning represented as a Mackey action associated with a type II action and a cocycle, it is possible to choose a θ-product cocycle generating this action to be cohomologous to the initial one. Now we only know that these cocycles are weakly equivalent. So, we deal with the following situation: a type II G-action on (, m) supplied with cocycle α generates an AT Mackey action. The pair (G, α) is stably weakly equivalent to the pair consisting of a product odometer that we denote by Tpr and a product cocycle that we denote by αpr , and the space where Tpr acts will be denoted by (pr , mpr ). Note that our product odometer is of type II1 . Hence, two cases can arise: either G is also of type II1 and they are weakly equivalent, or G is of type II∞ and is weakly equivalent to the trivial expansion of our product odometer. In the case of type II1 actions, there exists a transformation θ: → pr that transforms [G] to [Gpr ] and α to a cocycle cohomologous to a product cocycle αpr . Note that mpr ◦ θ turns out to be an invariant probability measure on and hence mpr ◦ θ = m. Hence, α is cohomologous to a θ-product cocycle – see Definition 1.8. Now let us consider the type II∞ case in more detail. Introduce the trivial expansion of our product odometer and consider the product space pr × Z with product measure m0 = mpr × mZ . 
The following actions will be considered: g 0 (ω, z) = (gω, z), τ (ω, z) = (ω, z + 1), (here ω ∈ pr , z ∈ Z, g ∈ Gpr ) together with the following cocycle: α 0 (ω, z; g · τ n ) = α(ω, g). 0 The action associated with the pair (G0pr , αpr ), where G0pr = {g 0 τ n : g ∈ Gpr , n ∈ Z}, is the same as for the pair (Gpr , αpr ) because of [1, Proposition 2.3]. The weak equivalence relation is provided by θ: → pr × Z such that 0 (θω, θgθ −1 ) is cohomologous to α(ω, g), θ[G]θ −1 = [G0pr ], mpr ◦ θ ∼ m, and αpr where ω ∈ , g ∈ G. Note that m and mpr ◦ θ both are G-invariant infinite measures and hence differ by a constant multiplier: mpr ◦ θ = λ · m.
Now let us take the following set P ⊂ : P = θ −1 (pr × {0}). Then m(P ) = −1 1/λ. S For each i ∈ Z, let Pi = θ (pr × {i}). Obviously Pi are disjoint and i∈Z Pi = . So the space is represented in the form P ×Z by setting P ×{i} = Pi . For mP being the restriction of m on P , we can see that m = 1/λmP × mZ ; this follows from the fact that θ preserves the measure. Let τ1 = θ −1 ◦ τ ◦ θ; τ1 be an automorphism of the space = P × Z, it preserves the measure m and has the property τ1 (Pi ) = Pi+1 . LEMMA 5.4. There exists a cocycle β ∈ Z 1 (P × Z, G; A) cohomologous to α and taking its unit value at τ1 . Proof. We shall construct a function f (p, z) such that the cocycle β((p, z), g) = (f (p, z))−1 · α((p, z), g) · f (g · (p, z)) possesses the required property. Here g ∈ [G]. We put f (p, 0) = eA , f (p, z) = α((p, 0), τ1z )−1 . Then we immediately obtain β((p, 0), τ1z ) = (f (p, 0))−1 · α((p, 0), τ1z ) · f (p, z) = eA and hence β((p, z), τ1 ) = β((p, 0), τ1z )−1 · β((p, 0), τ1z+1 ) = eA .
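The telescoping argument in the proof of Lemma 5.4 can be illustrated in a toy model. The following is a minimal sketch, assuming the group $A$ is the integers written additively and that the cocycle is specified only through its values on the shift, $\alpha((p, k), \tau_1) = c(p, k)$; the names `alpha_tau_power`, `f`, `beta_tau` and `sample_c` are hypothetical, and `sample_c` is an arbitrary integer-valued function chosen for illustration, not data from the paper.

```python
def alpha_tau_power(c, p, z):
    """Value alpha((p, 0), tau1**z), built from alpha((p, k), tau1) = c(p, k)
    via the cocycle identity (additive notation)."""
    if z >= 0:
        return sum(c(p, k) for k in range(z))
    # For z < 0 the cocycle identity gives the negative of the sum over k = z, ..., -1.
    return -sum(c(p, k) for k in range(z, 0))


def f(c, p, z):
    # Transfer function from the proof: f(p, z) = alpha((p, 0), tau1**z)^{-1}.
    return -alpha_tau_power(c, p, z)


def beta_tau(c, p, z):
    # beta((p, z), tau1) = f(p, z)^{-1} * alpha((p, z), tau1) * f(p, z + 1).
    return -f(c, p, z) + c(p, z) + f(c, p, z + 1)


def sample_c(p, k):
    # Arbitrary integer-valued choice of alpha((p, k), tau1), for illustration only.
    return 3 * p - k * k + 2 * k
```

For every `p` and `z`, `beta_tau(sample_c, p, z)` evaluates to `0`, the additive unit, because consecutive values of `alpha_tau_power` differ by exactly `c(p, z)`. This reflects only the algebraic identity in the proof; measurability and the general (non-Abelian) case are not modelled.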
We see also from the proof that αP = βP . We are now ready to formulate the following result. THEOREM 5.5. An Abelian group AT action being a Mackey action associated with a type II action and a cocycle is also a range of a θ-product cocycle cohomologous to the initial one. Proof. Let us consider again the restriction of θ to P that maps P to pr and 0 do not depend on τ1 and τ , respectively, θ transforms βP mP to mpr . As β and αpr to a cocycle cohomologous to αpr . Hence, βP = αP is a θ-product cocycle. So, the existence of β and P proves our theorem. 2
6. The Double Mackey Action and Two Single Ones Let us now compare the double Mackey action considered in the above with two single ones. Namely, we intend to consider α and ρ separately; this allows us to introduce the following actions: g(ω, a) = (gω, a · α(ω, g)), Vb (ω, a) = (ω, a · b). They act on the product space × A, and, as usual, we define the quotient action of Vb by the ergodic components of g by Wb .
Besides, introduce two actions
$$g(\omega, u) = \Bigl(g\omega,\ u + \log \frac{dm \circ g}{dm}(\omega)\Bigr), \qquad T_t(\omega, u) = (\omega, u + t)$$
acting on the product space $\Omega \times \mathbb{R}$; and similarly we define the quotient action of $T_t$ by the ergodic components of $g$ by $F_t$. The relation between the double Mackey action $(F_t, W_b)$ considered in the above (see Definition 3.1) and the single Mackey actions $F_t$ and $W_b$ introduced here is not clear because the $\tilde g$-orbits and the orbits of the two skew actions of $g$ introduced in this section are very different. However, the following is true:
PROPOSITION 6.1. When the double Mackey action $(F_t, W_b)$ is AT, the two single Mackey actions $F_t$ and $W_b$ are also AT.
Proof. Indeed, when $(F_t, W_b)$ is AT, then $\alpha \times \rho$ is a product cocycle. According to the definition of product cocycles we see that $\alpha$, i.e. its component with values in $A$, as well as $\rho$, i.e. its component with values in $\mathbb{R}$, are both product cocycles. This implies the approximate transitivity of $W_b$ and $F_t$, respectively. □
Is the reverse statement true? The following example provides the negative answer to this question for the general case.
EXAMPLE 6.2. (We reproduce this construction from [5] with a little correction.) Let $\Omega = \{0, 1\}^{\mathbb{Z}}$ and $m$ be a product measure, $m = \prod_{i=1}^\infty m_i$, $m_i(0) = m_i(1) = 1/2$. Let $\theta$ be the Bernoulli transformation. Consider the space $X = \Omega \times \Omega$ with the product measure $\mu = m \times m$ and two measure-preserving automorphisms on this space: $\theta_1 = \theta \times \theta$, $\theta_2 = \mathrm{id} \times \theta$. Let $(Y, \nu)$ be a Lebesgue space with a $\sigma$-finite measure $\nu$. Let $S$ be any $\nu$-preserving ergodic transformation on $Y$, and let $u_1$, $u_2$ be two commuting automorphisms belonging to the normalizer of $[S]$ and possessing the property: $\nu \circ u_1 = \exp(\tau_1) \cdot \nu$, $\nu \circ u_2 = \exp(\tau_2) \cdot \nu$, where $\tau_1$ and $\tau_2$ are two rationally incommensurable numbers. Introduce a Lebesgue space $(X \times Y, \mu \times \nu)$ and construct the three following actions: $Q_1(x, y) = (\theta_1 x, u_1 y)$, $Q_2(x, y) = (\theta_2 x, u_2 y)$, $S_0(x, y) = (x, Sy)$.
These actions commute. They generate a full group, which we denote by G. Their joint action is of type III_1 because of the rational incommensurability, and is orbit equivalent to a Z-action. Define the cocycle $\alpha \in Z^1(X \times Y, G; \mathbb{Z})$ in the following way:

\[ \alpha(x, y; Q_1) = 0, \qquad \alpha(x, y; Q_2) = 1, \qquad \alpha(x, y; S_0) = 0. \]

The Mackey action constructed from G and α is trivial and hence AT. Indeed, it is easy to see that the skew action acts ergodically on the product space X × Y × Z. (In this case α is said to be a cocycle with dense range.) On the other hand, the Mackey action constructed from G and the Radon–Nikodym cocycle ρ of the measure µ × ν, i.e. the associated flow, is also trivial and hence AT. This follows easily from the fact that G is an action of type III_1.

Now consider the double Mackey action constructed from G and the double cocycle α × ρ. To do so, we write down the following five pairwise commuting actions:

\[ \tilde Q_1(x, y; z, t) = (\theta_1 x, u_1 y;\ z,\ t - \tau_1), \]
\[ \tilde Q_2(x, y; z, t) = (\theta_2 x, u_2 y;\ z + 1,\ t - \tau_2), \]
\[ \tilde S_0(x, y; z, t) = (x, Sy;\ z,\ t), \]
\[ \hat z_1(x, y; z, t) = (x, y;\ z - z_1,\ t), \]
\[ \hat t_1(x, y; z, t) = (x, y;\ z,\ t - t_1). \]

Here $z, z_1 \in \mathbb{Z}$ and $t, t_1 \in \mathbb{R}$. We have to construct the quotient space of $X \times Y \times \mathbb{Z} \times \mathbb{R}$ by the orbits of $(\tilde Q_1, \tilde Q_2, \tilde S_0)$ and obtain the quotient action of $(\hat z_1, \hat t_1)$ on this quotient space. Fix $y_0 \in Y$. It is easy to check that the set $X \times \{y_0\} \times \{0\} \times [0; \tau_1]$ can be identified with this quotient space. A straightforward computation shows that the Mackey action can be realized in this space in the following way:

\[ \hat z_1(x, t) = \bigl(\theta_2^{z_1} \cdot \theta_1^{[(t - z_1\tau_2)/\tau_1]} \cdot x,\ \{(t - z_1\tau_2)/\tau_1\} \cdot \tau_1\bigr), \]
\[ \hat t_1(x, t) = \bigl(\theta_1^{[(t - t_1)/\tau_1]} \cdot x,\ \{(t - t_1)/\tau_1\} \cdot \tau_1\bigr). \]
Here the brackets denote the integral part and the braces the fractional part of a real number. Finally, consider the σ-algebra containing all sets of the form $\Omega \times A \times [0; \tau_1]$, where A is a measurable subset of Ω. The quotient action of $(\hat z_1, \hat t_1)$ on the quotient space by this σ-algebra is essentially the action of Z on Ω generated by the Bernoulli (1/2, 1/2)-action. It has positive entropy, and it now follows from [3, Theorem 3.5 and Remark 2.4] that our double Mackey action is not AT. This completes the discussion of the example. In other words, although the cocycles α and ρ are separately isomorphic to product cocycles, these isomorphisms turn out to be incompatible.
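The realization of the Mackey action on $X \times [0; \tau_1]$ in Example 6.2 can be explored numerically. The sketch below is only an illustration under stated assumptions: a point x is tracked by the pair of exponents (a, b) in $x = \theta_1^a \theta_2^b x_0$ for a hypothetical base point $x_0$ (this is legitimate because $\theta_1$ and $\theta_2$ commute), and the values of `TAU1` and `TAU2` are arbitrary choices with irrational ratio, not values taken from the text. The function names are likewise hypothetical.

```python
import math

TAU1 = 1.0             # illustrative value of tau_1 (assumption)
TAU2 = math.sqrt(2.0)  # illustrative value of tau_2; TAU2 / TAU1 is irrational (assumption)


def z_hat(state, z1):
    """Apply z^_{z1} to (a, b, t), where x = theta_1^a theta_2^b x_0."""
    a, b, t = state
    k = math.floor((t - z1 * TAU2) / TAU1)  # integral part [(t - z1*tau2)/tau1]
    new_t = t - z1 * TAU2 - k * TAU1        # fractional part times tau_1
    return (a + k, b + z1, new_t)           # theta_1 gains exponent k, theta_2 gains z1


def t_hat(state, t1):
    """Apply t^_{t1} to (a, b, t), where x = theta_1^a theta_2^b x_0."""
    a, b, t = state
    k = math.floor((t - t1) / TAU1)         # integral part [(t - t1)/tau1]
    return (a + k, b, t - t1 - k * TAU1)


start = (0, 0, 0.3)
two_steps = z_hat(z_hat(start, 2), 3)       # z^_2 followed by z^_3
one_step = z_hat(start, 5)                  # z^_5 directly
```

With these sample values the two ways of applying a total shift of 5 in the Z-direction produce the same exponents and the same point of $[0; \tau_1)$ up to rounding, which is what one expects from a group action; the same kind of check can be made for the commutation of `z_hat` and `t_hat`.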
References

1. Bezuglyi, S. I. and Golodets, V. Ya.: Weak equivalence and the structures of cocycles of an ergodic automorphism, Publ. Res. Inst. Math. Sci. 27(4) (1991), 577–625.
2. Connes, A., Feldman, J. and Weiss, B.: An amenable equivalence relation is generated by a single transformation, Ergodic Theory Dynamical Systems 1 (1981), 431–450.
3. Connes, A. and Woods, E. J.: Approximately transitive flows and ITPFI factors, Ergodic Theory Dynamical Systems 5(2) (1985), 203–236.
4. Golodets, V. Ya. and Nessonov, N. I.: Approximately transitive actions and product cocycles of an ergodic automorphism, Preprint of Institute for Low Temperature Physics and Engineering No. 4, Kharkov, 1987.
5. Golodets, V. Ya. and Nessonov, N. I.: Approximately transitive actions of Abelian groups and product cocycles, Preprint of Institute for Low Temperature Physics and Engineering No. 20, Kharkov, 1991.
6. Golodets, V. Ya. and Sinelshchikov, S. D.: Amenable ergodic group actions and ranges of cocycles, Soviet Math. Dokl. 41 (1990), 523–526.
7. Golodets, V. Ya. and Sinelshchikov, S. D.: Classification and structure of cocycles of amenable ergodic equivalence relations, J. Funct. Anal. 121(2) (1994), 455–485.
8. Golodets, V. Ya. and Sokhet, A. M.: Ergodic actions of an Abelian group with discrete spectrum, and approximate transitivity, J. Soviet Math. 52(6) (1990), 3530–3533; translated from Teor. Funktsiĭ, Funktsional. Anal. i Prilozhen. 51 (1989), 117–122.
9. Golodets, V. Ya. and Sokhet, A. M.: A representation of approximately transitive group actions as a product cocycle range, Preprint of Institute for Low Temperature Physics and Engineering No. 2, Kharkov, 1991.
10. Golodets, V. Ya. and Sokhet, A. M.: Cocycles of type III transformation group and AT property for the double Mackey action, Preprint of the Erwin Schrödinger International Institute for Mathematical Physics, ESI 97, 1994.
11. Hamachi, T.: The normalizer group of an ergodic automorphism of type III and the commutant of an ergodic flow, J. Funct. Anal. 40(3) (1981), 387–403.
12. Hamachi, T.: A measure theoretical proof of the Connes–Woods theorem on AT flows, Pacific J. Math. 154(1) (1992), 67–85.
13. Hamachi, T. and Osikawa, M.: Ergodic groups of automorphisms and Krieger’s theorems, Sem. Math. Sci. 3 (1981), 113.
14. Hawkins, J. M.: Properties of ergodic flows associated to product odometer, Pacific J. Math. 141(2) (1990), 287–294.
15. Hawkins, J. M. and Robinson, E. A.: Approximately transitive (2) flows and transformations have simple spectrum, Lecture Notes Math. 1342 (1988), 261–280.
16. Hewitt, E. and Ross, K.: Abstract Harmonic Analysis, Vol. 1, Springer, Berlin, 1963; Vol. 2, Springer, Berlin, 1970.
17. Krieger, W.: On nonsingular transformations of a measure space I, II, Z. Wahrsch. verw. Gebiete 11 (1969), 83–97, 98–119.
18. Mackey, G. W.: Ergodic transformation groups with a pure point spectrum, Ill. J. Math. 8(2) (1964), 593–600.
19. Rohlin, V. A.: On the fundamental ideas of measure theory, Mat. Sb. 25(67) (1949), 107–150 (in Russian).
20. Skandalis, G. and Sokhet, A. M.: Transitive actions have funny rank one, to appear.
21. Sokhet, A. M.: Les actions approximativement transitives dans la théorie ergodique, Thèse de doctorat de l’Université Paris VII, soutenue le 26 juin 1997.
22. Zimmer, R. J.: Amenable ergodic group actions and an application to Poisson boundaries of random walks, Ann. Sci. École Norm. Sup. 11 (1978), 407–428.
Mathematical Physics, Analysis and Geometry 1: 367–373, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands.
The Quantum Commutator Algebra of a Perfect Fluid

M. D. ROBERTS
Flat 5, 17 Wetherby Gardens, London SW5 OJP, U.K.* e-mail: [email protected]

* From 1 Jan. 98: Department of Mathematics and Applied Mathematics, University of Cape Town, Rondebosch 7701, South Africa. e-mail: [email protected]

(Received: 2 December 1997; in final form: 15 December 1998)

Abstract. A perfect fluid is quantized by the canonical method. The constraints are found and this allows the Dirac brackets to be calculated. Replacing the Dirac brackets with quantum commutators formally quantizes the system. There is a momentum operator in the denominator of some coordinate quantum commutators. It is shown that it is possible to multiply throughout by this momentum operator. Factor ordering differences can result in a viscosity term. The resulting quantum commutator algebra is unusual.

Mathematics Subject Classifications (1991): 81S05, 81R10, 82B26, 83CC22.

Key words: quantum commutator algebra, perfect fluid.

1. Introduction

It has been known for some time [1] that a perfect fluid has a Lagrangian formulation. The Lagrangian is taken to be the pressure, and variations are achieved through an infinitesimal form of the first law of thermodynamics. A perfect fluid’s stress is described using the vector field comoving with the fluid. This vector field defines an absolute time for the system. Furthermore, this absolute time can then be used to define canonical momenta and canonical Hamiltonians. This is done here for the first time. There are equivalences between scalar fields and fluids [2]; more generally, the comoving vector field can be decomposed into scalar fields, resulting in a description of a perfect fluid employing only scalar fields. Previously, this decomposition has been investigated by choosing an ad hoc global time rather than absolute time and defining canonical momenta and other quantities with respect to the global time. Typically, the resulting theory is applied to cosmology [3, 4]. Once the constrained Hamiltonian has been calculated by the standard canonical method [7, 8], the Dirac brackets can be replaced by quantum commutators. The original motive for investigating this was to find a fluid generalization of the Higgs model [5, 9]. A quantum treatment is required to estimate the VEVs (quantum vacuum expectation values) of the scalar fields, which are related to the induced nonzero mass. The quantum commutator algebra is unusual, perhaps reflecting the structure of the scalar field decomposition of the comoving vector field [6, 10]. It is hoped that eventually the present theory will be applied to low temperature superfluids. To do this, it will probably be necessary to include a chemical potential term in the first law of thermodynamics (2.1).

2. Lagrangian and Hamiltonian Formulation of a Perfect Fluid’s Dynamics

A perfect fluid has a Lagrangian formulation in which the Lagrangian is the pressure p. Variation is achieved by using the first law of thermodynamics

\[ dp = n\,dh - nT\,ds, \tag{2.1} \]
where n is the particle number, T is the temperature, s is the entropy, and h is the enthalpy. The pressure and the density µ are related to the enthalpy and the particle number by

\[ p + \mu = nh. \tag{2.2} \]

In four dimensions, a vector can be decomposed into four scalars; however, the five-scalar decomposition

\[ hV_a = W_a = \phi_a + \sum_{(i)} \theta_{(i)} S_{(i)a}, \qquad V_a V^a = -1, \tag{2.3} \]

with (i) = 1, 2 is often used, because for i = 1, s and $\theta = \int T\,d\tau$ have interpretations as the entropy and the thermasy, respectively. From now on, the index (i) is suppressed, as it is straightforward to reinstate. There are other conventions for this scalar field decomposition, for example with a − instead of a + before the summed fields. The symbol q is used to denote an arbitrary scalar field, i.e., q = φ, θ or s. The coordinate space action is taken to be

\[ I = \int \sqrt{-g}\,p\,dx^4. \tag{2.4} \]

Replacing the first law with $dp = -nV_a\,dW^a - nT\,ds$, variation with respect to the metric and φ, θ, and s gives

\[ T_{ab} = (p + \mu)V_a V_b + p\,g_{ab}, \qquad (nV^a)_a = \overset{\circ}{n} + n\Theta = 0, \qquad \overset{\circ}{s} = 0, \qquad \overset{\circ}{\theta} = T, \tag{2.5} \]

respectively. Here $\Theta = V^a_{\ \cdot a}$ is the expansion of the vector field. The canonical momenta are given by $\Pi_i = \delta I/\delta\overset{\circ}{q}{}^i$ and are

\[ \Pi^\phi = -n, \qquad \Pi^\theta = 0, \qquad \Pi^s = -n\theta. \tag{2.6} \]
The Hamiltonian density is usually defined in terms of components of the canonical stress as $\theta^{t}_{\ \cdot t}$. In the present case, the canonical stress is not defined, so the metric stress $T^{a}_{\ \cdot b}$ is used instead; also, 4-vectors are used rather than components, resulting in

\[ H_d = V^a V^b T_{ab} = \mu. \tag{2.7} \]
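The contraction in (2.7) follows from (2.5) and the normalisation $V_a V^a = -1$, since $V^a V^b T_{ab} = (p + \mu)(V_a V^a)^2 + p\,V_a V^a = \mu$. A minimal numerical sketch of this step is given below; the flat Minkowski metric, the boost along one axis, and the function name `hamiltonian_density` are illustrative assumptions and do not come from the text.

```python
import math


def hamiltonian_density(p, mu, v):
    """Return V^a V^b T_ab for T_ab = (p + mu) V_a V_b + p g_ab.

    Assumptions (illustrative only): flat metric diag(-1, 1, 1, 1) and a
    fluid moving with speed v (|v| < 1) along the first spatial axis.
    """
    g = [-1.0, 1.0, 1.0, 1.0]                    # diagonal metric components g_aa
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    v_up = [gamma, gamma * v, 0.0, 0.0]          # V^a, normalised so that V_a V^a = -1
    v_dn = [g[a] * v_up[a] for a in range(4)]    # V_a = g_ab V^b
    total = 0.0
    for a in range(4):
        for b in range(4):
            g_ab = g[a] if a == b else 0.0
            t_ab = (p + mu) * v_dn[a] * v_dn[b] + p * g_ab
            total += v_up[a] * v_up[b] * t_ab
    return total


density = hamiltonian_density(p=0.3, mu=1.7, v=0.6)
```

For these sample values the contraction reproduces the density µ up to rounding, independently of the chosen pressure and velocity.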
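The standard Poisson bracket is

\[ \{A, B\} = \frac{\delta A}{\delta q_i}\frac{\delta B}{\delta \Pi^i} - \frac{\delta A}{\delta \Pi^i}\frac{\delta B}{\delta q_i}, \tag{2.8} \]

where i, which labels each field, is summed; the integral sign and measure have been suppressed and the variations are performed independently. When absolute time is used, Hamilton’s equations have an additional term in the expansion [12], explicitly

\[ \overset{\circ}{q} = \delta H_c/\delta \Pi, \qquad \overset{\circ}{\Pi} + \Theta\,\Pi = -\delta H_c/\delta q, \tag{2.9} \]

where $H_c$ is the canonical Hamiltonian $H_c = \int H_d \sqrt{-g}\,dx^4$. From (2.6), the momenta are constrained:

\[ \varphi_1 = \Pi^s - \theta\,\Pi^\phi, \qquad \varphi_2 = \Pi^\theta. \tag{2.10} \]

The initial Hamiltonian is $H_I = \Pi_i \overset{\circ}{q}{}^i - L$; replacing the dependent Π’s gives

\[ H_0 = \Pi^\phi(\overset{\circ}{\phi} + \theta\,\overset{\circ}{s}) - L, \tag{2.11} \]

and adding the constraints gives the Hamiltonian density

\[ H_\lambda = H_0 + \lambda^\alpha \varphi_\alpha, \qquad \lambda^1 = \overset{\circ}{s}, \qquad \lambda^2 = \overset{\circ}{\theta}, \]
\[ H_d = \Pi^\phi(\overset{\circ}{\phi} + \theta\,\overset{\circ}{s}) + \lambda^1(\Pi^s - \theta\,\Pi^\phi) + \lambda^2\,\Pi^\theta - L, \tag{2.12} \]

where the λ’s are the Lagrange multipliers. Substituting the values of the momenta, the Hamiltonian density is still weakly the fluid density. Using (2.9), the time evolution of any variable X is given by

\[ \frac{dX}{d\tau} = \frac{\partial X}{\partial \tau} + \{X, H_\lambda\} - \Theta\,\Pi^i\frac{\delta X}{\delta \Pi^i}. \tag{2.13} \]

Replacing the Hamiltonian density H by $H_\lambda$ and then holding the multipliers constant, so that

\[ \{X, H_\lambda\} = \{X, H_0\} + \lambda^\alpha\{X, \varphi_\alpha\}, \tag{2.14} \]

gives the time evolution

\[ \frac{dX}{d\tau} = \frac{\partial X}{\partial \tau} + \{X, H_0\} + \lambda^\alpha\{X, \varphi_\alpha\} - \Theta\,\Pi^i\frac{\delta X}{\delta \Pi^i} \tag{2.15} \]
\[ = \frac{\partial X}{\partial \tau} + \bigl(\overset{\circ}{\phi} + \theta(\overset{\circ}{s} - \lambda^1)\bigr)\frac{\delta X}{\delta \phi} + \lambda^1\frac{\delta X}{\delta s} + \lambda^2\frac{\delta X}{\delta \theta} \]
\[ \quad + \frac{\delta X}{\delta \Pi^\phi}\bigl((V^a\Pi^\phi)_a - \Theta\,\Pi^\phi\bigr) + \frac{\delta X}{\delta \Pi^\theta}\bigl((-\overset{\circ}{s} + \lambda^1)\Pi^\phi - \Theta\,\Pi^\phi\bigr) + \frac{\delta X}{\delta \Pi^s}\bigl((V^a\theta\,\Pi^\phi)_a - \Theta\,\Pi^s\bigr) \]
\[ \approx \frac{\partial X}{\partial \tau} + \overset{\circ}{\phi}\frac{\delta X}{\delta \phi} + \overset{\circ}{\theta}\frac{\delta X}{\delta \theta} - \overset{\circ}{n}\frac{\delta X}{\delta \Pi^\phi} - \overset{\circ}{n}\frac{\delta X}{\delta \Pi^\theta} - (\theta n)^{\circ}\frac{\delta X}{\delta \Pi^s}. \]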
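Letting X equal the constraints gives $d\varphi_\alpha/d\tau = 0$; this shows that there are no further constraints, so that the Dirac brackets can now be constructed. A quantity R(q, Π) is first class [8] if

\[ \{R, \varphi_\alpha\} \approx 0, \qquad \alpha = 1, 2, \tag{2.16} \]

otherwise it is second class. The $C_{\alpha\beta}$ matrix, cf. [11, p. 10], is

\[ C_{\alpha\beta} = \{\varphi_\alpha, \varphi_\beta\} = -i\sigma^2\,\Pi^\phi, \qquad C^{-1}_{\alpha\beta} = +i\sigma^2/\Pi^\phi, \tag{2.17} \]

where

\[ \sigma^2 = \begin{pmatrix} 0 & -i \\ +i & 0 \end{pmatrix} \]

is a Pauli matrix. The Dirac bracket is defined by

\[ \{A, B\}^* = \{A, B\} - \{A, \varphi_\alpha\}\,C^{-1}_{\alpha\beta}\,\{\varphi_\beta, B\}. \tag{2.18} \]

In the present case, evaluating (2.18) with the constraints (2.10) and the matrix (2.17) gives the Dirac bracket

\[ \{A, B\}^* = \{A, B\} + \frac{1}{\Pi^\phi}\frac{\delta B}{\delta \theta}\left(\frac{\delta A}{\delta s} - \theta\frac{\delta A}{\delta \phi} + \Pi^\phi\frac{\delta A}{\delta \Pi^\theta}\right) - \frac{1}{\Pi^\phi}\frac{\delta A}{\delta \theta}\left(\frac{\delta B}{\delta s} - \theta\frac{\delta B}{\delta \phi} + \Pi^\phi\frac{\delta B}{\delta \Pi^\theta}\right). \tag{2.19} \]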
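The construction (2.16)–(2.18) can be checked numerically in a finite-dimensional toy setting. The sketch below is only an illustration under stated assumptions: the fields and momenta are treated as six ordinary real variables rather than fields, functional derivatives are replaced by central finite differences, and all function names and the sample phase-space point are hypothetical. It evaluates the Poisson bracket (2.8), builds the matrix $C_{\alpha\beta}$ from the constraints (2.10), inverts it, and applies the definition (2.18).

```python
# Index convention (illustrative): x = (phi, theta, s, Pi_phi, Pi_theta, Pi_s)
PHI, THETA, S, PI_PHI, PI_THETA, PI_S = range(6)


def grad(f, x, h=1e-5):
    """Central-difference gradient of f at x (exact up to rounding for bilinear f)."""
    out = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        out.append((f(up) - f(dn)) / (2.0 * h))
    return out


def poisson(f, g, x):
    """Poisson bracket {f, g} = df/dq_i dg/dPi_i - df/dPi_i dg/dq_i, as in (2.8)."""
    df, dg = grad(f, x), grad(g, x)
    n = len(x) // 2
    return sum(df[i] * dg[n + i] - df[n + i] * dg[i] for i in range(n))


def constraint_1(x):
    return x[PI_S] - x[THETA] * x[PI_PHI]   # Pi_s - theta * Pi_phi, from (2.10)


def constraint_2(x):
    return x[PI_THETA]                      # Pi_theta, from (2.10)


CONSTRAINTS = (constraint_1, constraint_2)


def dirac(f, g, x):
    """Dirac bracket {f, g}* = {f, g} - {f, c_a} (C^-1)_ab {c_b, g}, as in (2.18)."""
    c = [[poisson(a, b, x) for b in CONSTRAINTS] for a in CONSTRAINTS]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = [[c[1][1] / det, -c[0][1] / det],
           [-c[1][0] / det, c[0][0] / det]]
    correction = sum(
        poisson(f, CONSTRAINTS[a], x) * inv[a][b] * poisson(CONSTRAINTS[b], g, x)
        for a in range(2) for b in range(2)
    )
    return poisson(f, g, x) - correction


def coord(i):
    """Coordinate function x -> x[i]."""
    return lambda x: x[i]


# Sample point on the constraint surface: Pi_theta = 0 and Pi_s = theta * Pi_phi.
point = [0.3, 0.7, -0.2, 2.0, 0.0, 1.4]
c_12 = poisson(constraint_1, constraint_2, point)          # expected: -Pi_phi
theta_s = dirac(coord(THETA), coord(S), point)             # expected: -1 / Pi_phi
phi_theta = dirac(coord(PHI), coord(THETA), point)         # expected: -theta / Pi_phi
```

At the sample point the off-diagonal entry of C equals $-\Pi^\phi$, in agreement with (2.17), and the brackets of θ with s and of φ with θ come out as $-1/\Pi^\phi$ and $-\theta/\Pi^\phi$. Under the reading that a commutator equals $i\hbar$ times the corresponding Dirac bracket, these are the values that appear with a momentum in the denominator after quantization.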
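Now

\[ H_\lambda = H_0 - \{H_0, \varphi_\alpha\}\,C^{-1}_{\alpha\beta}\,\varphi_\beta, \qquad \lambda_\beta = -\{H_0, \varphi_\alpha\}\,C^{-1}_{\alpha\beta}, \tag{2.20} \]

from which $H_\lambda$ given by (2.12) can be recovered with the correct λ’s.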
3. Quantization

To quantize a classical dynamical system the Dirac bracket is replaced by the commutator

\[ \{A, B\}^* \to \frac{1}{i\hbar}\,[\hat A\hat B - \hat B\hat A], \tag{3.1} \]

where ħ is Planck’s constant divided by 2π and the hat signifies that the variable is now an operator. There are various correspondence criteria which can be investigated, for example: as ħ → 0, there should be (a) the same time evolution, (b) the same stress, and (c) the first law should be recovered. Another correspondence criterion can be called the particle number criterion: the particle number n should bear a relation to the quantum particle number constructed from creation and destruction operators. An intermediate aim, between formal quantization achieved by replacing field and momenta Dirac brackets with commutators, and establishing contact with applications, is to produce a quantum perfect fluid. This could be obtained from brackets involving the numbered field, the angular momentum and so on, or from brackets involving a mixture of these and geometric objects. However, no progress has been made so far in finding a quantum perfect fluid, so attention is restricted to the implications of replacing brackets consisting solely of individual components of fields and momenta with commutators. Effecting the replacement of the 15 Dirac brackets between the fields and momenta, there are four nonvanishing commutators:

\[ \hat\Pi^\phi\hat\phi - \hat\phi\,\hat\Pi^\phi = -i\hbar, \qquad \hat\Pi^\theta\hat\theta - \hat\theta\,\hat\Pi^\theta = 0, \qquad \hat\Pi^s\hat s - \hat s\,\hat\Pi^s = -i\hbar, \]
\[ \hat\phi\,\hat\theta - \hat\theta\,\hat\phi = -\frac{i\hbar\,\hat\theta}{\hat\Pi^\phi}, \qquad \hat\theta\,\hat s - \hat s\,\hat\theta = -\frac{i\hbar}{\hat\Pi^\phi}, \qquad \hat\phi\,\hat s - \hat s\,\hat\phi = 0. \tag{3.2} \]

Two of these commutators have the operator $\hat\Pi^\phi$ in the denominator. This might not be well-defined. To avoid $\hat\Pi^\phi$ in the denominator we multiply by the operator $\hat\Pi^\phi$; using the first commutator of (3.2), it turns out that multiplying on the left and multiplying on the right are equivalent, so that

\[ [\hat\phi\,\hat\theta - \hat\theta\,\hat\phi]\,\hat\Pi^\phi = \hat\Pi^\phi\,[\hat\phi\,\hat\theta - \hat\theta\,\hat\phi] = -i\hbar\,\hat\theta, \]
\[ [\hat\theta\,\hat s - \hat s\,\hat\theta]\,\hat\Pi^\phi = \hat\Pi^\phi\,[\hat\theta\,\hat s - \hat s\,\hat\theta] = -i\hbar. \tag{3.3} \]

These results are in accord with the equations deduced if the Dirac brackets $\{q^i, q^j\Pi^k\}^*$ are replaced by commutators. Left and right multiplication by $\hat\Pi^k$ are also equivalent if anti-commutation rather than commutation is considered. The quantum Hamiltonian is

\[ \hat H_q = l_1\,\hat\Pi^\phi\overset{\circ}{\hat\phi} + l_2\,\overset{\circ}{\hat\phi}\,\hat\Pi^\phi - l_3\,\hat\Pi^\phi\hat\theta\,\overset{\circ}{\hat s} - l_4\,\hat\Pi^\phi\overset{\circ}{\hat s}\,\hat\theta - l_5\,\hat\theta\,\hat\Pi^\phi\overset{\circ}{\hat s} - l_6\,\hat\theta\,\overset{\circ}{\hat s}\,\hat\Pi^\phi - l_7\,\overset{\circ}{\hat s}\,\hat\Pi^\phi\hat\theta - l_8\,\overset{\circ}{\hat s}\,\hat\theta\,\hat\Pi^\phi - \hat p, \tag{3.4} \]
where the l’s are constants and obey $l_1 + l_2 = 1$, $l_3 + l_4 + l_5 + l_6 + l_7 + l_8 = 1$. Using the commutation relations, the quantum Hamiltonian (3.4) is

\[ \hat H_q = \hat\Pi^\phi(\overset{\circ}{\hat\phi} - \hat\theta\,\overset{\circ}{\hat s}) - i\hbar\,\Theta\,l - \hat p, \tag{3.5} \]

where $l = l_2 + l_4 + l_7 + l_8$ is called the ordering constant: it is of undefined size but can be taken to be of order unity. Because the Dirac bracket of p with anything vanishes, the commutators with $\hat p$ also vanish and $\hat p$ can be taken to be $p\,\mathbb{1}$, where $\mathbb{1}$ is the identity element. To investigate the algebraic implications of (3.2) and (3.3), label the six operators by v’s:

\[ v_1 = \hat\phi, \qquad v_2 = \hat s, \qquad v_3 = \hat\theta, \qquad v_4 = \hat\Pi^\phi, \qquad v_5 = \hat\Pi^s, \qquad v_6 = \hat\Pi^\theta. \tag{3.6} \]

$v_6$ commutes with everything and can be disregarded. Of the remaining commutators, only four are nonzero. In units ħ = 1, (3.2), (3.3) and (3.6) give the algebra

\[ v_4(v_3v_2 - v_2v_3) = -i, \qquad v_4(v_1v_3 - v_3v_1) = -iv_3, \qquad v_4v_1 - v_1v_4 = -i, \qquad v_5v_2 - v_2v_5 = -i. \tag{3.7} \]
This algebra does not seem to be realizable in terms of matrices and differential operators; the closest algebras are found in [11]. If a commutator is constructed with a time derivative of the field or momenta, the same algebra results but multiplied by a term in the expansion. Similarly, if m time derivatives occur, the algebra is multiplied by the expansion to the power of m.

Acknowledgements

I would like to thank Prof. T. W. B. Kibble for discussion of some of the points that occur. This work was supported in part by the South African Foundation for Research and Development (FRD).

References

1. Hargreaves, R.: Philos. Mag. 16 (1908), 436.
2. Tabensky, R. and Taub, A. H.: Comm. Math. Phys. 290 (1973), 61.
3. Tipler, F.: Phys. Rep. C 137 (1986), 231.
4. Lapchiniskii, V. G. and Rubakov, V. A.: Theoret. Math. Phys. 33 (1977), 1076.
5. Roberts, M. D.: Hadronic J. 20 (1997), 73–84.
6. Schutz, B.: Phys. Rev. D 4 (1971), 3559.
7. Dirac, P. A. M.: Lectures on Quantum Mechanics, Belfer Graduate School of Science, Yeshiva University, New York, 1963.
8. Hanson, A. J., Regge, T. and Teitelboim, C.: Constrained Hamiltonian Systems, Accademia Nazionale dei Lincei, Rome, 1976.
9. Roberts, M. D.: A generalized Higg’s model, Preprint.
10. Eckart, C.: Phys. Fluids 3 (1960), 421, Appendix.
11. Ohnuki, Y. and Kamefuchi, S.: Quantum Field Theory and Parastatistics, Springer-Verlag, Heidelberg, 1982.
12. Roberts, M. D.: An expansion term in Hamilton’s equations, Europhys. Lett. 45 (1999), 26–31.
Mathematical Physics, Analysis and Geometry 1: 375–376, 1999.
Contents of Volume 1

Volume 1 No. 1 1998
Editorial, v
L. BOUTET DE MONVEL and E. KHRUSLOV / Homogenization of Harmonic Vector Fields on Riemannian Manifolds with Complicated Microstructure, 1–22
ANDREI IFTIMOVICI / Hard-core Scattering for N-body Systems, 23–74
S. SINEL’SHCHIKOV and L. VAKSMAN / On q-Analogues of Bounded Symmetric Domains and Dolbeault Complexes, 75–100
Instructions for Authors, 101–106

Volume 1 No. 2 1998
ANTON BOVIER and VÉRONIQUE GAYRARD / Metastates in the Hopfield Model in the Replica Symmetric Regime, 107–144
S. MOLCHANOV and B. VAINBERG / On Spectral Asymptotics for Domains with Fractal Boundaries of Cabbage Type, 145–170
IGOR YU. POTEMINE / Minimal Terminal Q-Factorial Models of Drinfeld Coarse Moduli Schemes, 171–191

Volume 1 No. 3 1998
W. O. AMREIN and D. B. PEARSON / Stability Criteria for the Weyl m-Function, 193–221
M. A. FEDOROV and A. F. GRISHIN / Some Questions of the Nevanlinna Theory for the Complex Half-Plane, 223–271
NICULAE MANDACHE / On a Counterexample Concerning Unique Continuation for Elliptic Equations in Divergence Form, 273–292

Volume 1 No. 4 1998/1999
Editorial, 293
G. GALLAVOTTI / Arnold’s Diffusion in Isochronous Systems, 295–312
A. S. FOKAS, L.-Y. SUNG and D. TSOUBELIS / The Inverse Spectral Method for Colliding Gravitational Waves, 313–330
VALENTIN YA. GOLODETS and ALEXANDER M. SOKHET / Product Cocycles and the Approximate Transitivity, 331–365
M. D. ROBERTS / The Quantum Commutator Algebra of a Perfect Fluid, 367–373
Volume Contents, 375–376
Instructions for Authors, 377–382
Mathematical Physics, Analysis and Geometry 1: 377–382, 1999.
Mathematical Physics, Analysis and Geometry

INSTRUCTIONS FOR AUTHORS

EDITORS-IN-CHIEF
VLADIMIR A. MARCHENKO, B.I. Verkin Institute for Low Temperature Physics and Engineering, Academy of Sciences of Ukraine, Kharkov, Ukraine
ANNE BOUTET DE MONVEL, Université de Paris 7 – Denis Diderot, Institut de Mathématiques, Paris, France
HENRY McKEAN, New York University, Courant Institute of Mathematical Sciences, NY, U.S.A.

AIMS AND SCOPE
The journal will publish papers presenting new mathematical results in mathematical physics, analysis, and geometry, with particular reference to:
∗ mathematical problems of statistical physics, fluids, etc.
∗ complex function theory
∗ operators in function space, especially operator algebras
∗ ordinary and partial differential equations
∗ differential and algebraic geometry
Papers which are too abstract will be discouraged. Review articles on new mathematical results will be welcome.

MANUSCRIPT SUBMISSION
Kluwer Academic Publishers prefer the submission of manuscripts and figures in electronic form. The preferred storage medium for your electronic manuscript is a 3½ inch diskette. Please label your diskette properly, giving exact details on the name of the file(s), the operating system and software used. Always save your electronic manuscript in the wordprocessor format that you use. In general, use as few formatting codes as possible. For safety’s sake, you should always retain a backup copy of your file(s). After acceptance, please make absolutely sure that you send us the latest (i.e., revised) version of your manuscript, both as hard copy printout and on diskette.

Kluwer Academic Publishers prefer papers submitted in wordprocessing packages such as MS Word, WordPerfect, etc. running under operating systems MS DOS, Windows and Apple Macintosh, or in the file format LaTeX. Articles submitted in other software programs, as well as articles for conventional typesetting, can also be accepted.

For submission in LaTeX, Kluwer Academic Publishers have developed special LaTeX style files, KLUWER.STY (LaTeX 2.09) and KLUWER.CLS (LaTeX 2), which are used for all Kluwer journals, irrespective of the publication’s size or layout. The specific journal formatting is done later during the production process. KLUWER.STY and KLUWER.CLS are offered by a number of servers around the world. Unfortunately, these copies are often unauthorised and authors are strongly advised not to use them. Kluwer Academic Publishers can only guarantee the integrity of style files obtained directly from them. Authors can obtain KLUWER.STY and KLUWER.CLS and the accompanying instruction file KAPINS.TEX from the Kluwer Academic Publishers Information Service (KAPIS) at the following website: http://www.wkap.nl

Technical support on the usage of the style file is given via the e-mail address: [email protected]

For the purpose of reviewing, articles for publication should initially be submitted as hardcopy printout (4-fold) and on diskette to: Journals Editorial Office, Mathematical Physics, Analysis and Geometry, Kluwer Academic Publishers, P.O. Box 990, 3300 AZ Dordrecht, The Netherlands.

MANUSCRIPT PRESENTATION
The journal’s language is English. British English or American English spelling and terminology may be used, but either one should be followed consistently throughout the article.
Manuscripts should be printed or typewritten on A4 or US Letter bond paper, one side only, leaving adequate margins on all sides to allow reviewers’ remarks. Please double-space all material, including notes and references. Quotations of more than 40 words should be set off clearly, either by indenting the left-hand margin or by using a smaller typeface. Use double quotation marks for direct quotations and single quotation marks for quotations within quotations and for words or phrases used in a special sense.

Number the pages consecutively with the first page containing:
– running head (shortened title)
– article type (if applicable)
– title
– author(s)
– affiliation(s)
– full address for correspondence, including telephone and fax numbers and e-mail address
– the AMS Mathematics Subject Classifications (1991)

Abstract
Please provide a short abstract of 100 to 250 words. The abstract should not contain any undefined abbreviations or unspecified references.

Key Words
Please provide 5 to 10 key words or short phrases in alphabetical order.

Figures and Tables
In addition to hard copy printouts of figures, authors are encouraged to supply the electronic versions of figures in either Encapsulated PostScript or TIFF format. Many other formats, e.g., Microsoft Postscript, PiCT (Macintosh) and WMF (Windows), cannot be used and the hard copy will be scanned instead. Figures should be saved in separate files without their captions, which should be included with the text of the article. Files should be named according to DOS conventions, e.g. “figure1.eps”. For vector graphics, Encapsulated PostScript is the preferred format. Lines should not be thinner than 0.25 pts and in-fill patterns and screens should have a density of at least 10 percent. For bitmap graphics, TIFF is the preferred format. The following resolutions are optimal: black-and-white line figures – 1200 dpi; line figures with some gray or coloured lines – 600 dpi; photographs – 300 dpi; screen dumps – leave as is.

If no electronic versions of figures are available, submit only high-quality artwork that can be reproduced as is, i.e., without any part having to be redrawn or re-typeset. The letter size of any text in the figures must be large enough to allow for reduction. Photographs should be in black-and-white on glossy paper. If a figure contains colour, make absolutely clear whether it should be printed in black-and-white or in colour. Authors will be charged for reproducing figures in colour. Each figure and table should be numbered and mentioned in the text.
The approximate position of figures and tables should be indicated in the margin of the manuscript. On the reverse side of each figure, the name of the (first) author and the figure number should be written in pencil. Figures and tables should be placed at the end of the manuscript following the Reference section. Each figure and table should be accompanied by an explanatory legend. The figure legends should be grouped and placed on a separate page. Figures are not returned to the author unless specifically requested. In tables, footnotes are preferable to long explanatory material in either the heading or body of the table. Such explanatory footnotes, identified by superscript letters, should be placed immediately below the table.

Section Headings
First-, second-, third- and fourth-order headings should be clearly distinguishable.

Appendices
Supplementary material should be collected in an Appendix and placed before the Notes and Reference sections.

Notes
Please use endnotes only. Notes should be indicated by consecutive superscript numbers in the text and listed at the end of the article before the References. A source reference note should be indicated by an asterisk after the title. This note should be placed at the bottom of the first page.

Cross-Referencing
Please make optimal use of the cross-referencing features of your software package. Do not cross-reference page numbers. Cross-references should refer to, for example, section numbers, equation numbers, figure and table numbers. In the text, a reference identified by means of an author’s name should be followed by the date of the reference in parentheses and page number(s) where appropriate. When there are more than two authors, only the first author’s name should be mentioned, followed by ‘et al.’. If numbered references are concerned, the reference number should be enclosed within square brackets. In the event that an author cited has had two or more works published during the same year, the reference, both in the text and in the reference list, should be identified by a lower case letter like ‘a’ and ‘b’ after the date to distinguish the works.

Examples:
Winograd (1986, p. 204)
(Winograd, 1986a; Winograd, 1986b)
(Winograd, 1986; Flores et al., 1988)
(Bullen and Bennett, 1990)
Winograd [1]
Bullen and Bennett [2]
Acknowledgements
Acknowledgements of people, grants, funds, etc. should be placed in a separate section before the References.

References
References to books, journal articles, articles in collections and conference or workshop proceedings, and technical reports should be listed at the end of the paper in numbered order. Articles in preparation or articles submitted for publication, unpublished observations, personal communications, etc. should not be included in the reference list but should only be mentioned in the article text (e.g., T. Moore, personal communication).

References to books should include the author’s name; year of publication; title in full; page numbers where appropriate; publisher; place of publication. For example:

1. Popov, V. N.: Functional Integrals in Quantum Field Theory and Statistical Physics, D. Reidel, Dordrecht, 1989.

References to articles in an edited collection should include the author’s name; year of publication; article title; editor’s name; title of collection; first and last page numbers; publisher; place of publication. For example:

2. Evans, J. and Kawahigashi, Y.: Subfactors and conformal field theory, in: H. Araki, K. Ito, A. Kishimoto, and I. Ojima (eds), Quantum and NonCommutative Analysis, Kluwer Acad. Publ., Dordrecht, 1993, pp. 341–369.

References to articles in conference proceedings should include the author’s name; year of publication; article title; editor’s name (if any); title of proceedings; first and last page numbers; place and date of conference; publisher and/or organization from which the proceedings can be obtained; place of publication. For example:

3. Kramm, G., 1991: Numerical investigation of the dry deposition of reactive trace gases, in P. Borrel, P.M. Borrell, and W. Seiler (eds), Transport and Transformation of Pollutants in the Troposphere, Proc. EUROTRAC Symp. ’90, Frankfurt, 5 April, 1990, SPB Publ., The Hague, The Netherlands, pp. 155–157.

References to articles in periodicals should include the author’s name; year of publication; title of article; full or abbreviated title of periodical; volume number (issue number where appropriate); first and last page numbers. For example:

4. Schwartz, J. H.: Evidence for nonperturbative string symmetries, Lett. Math. Phys. 34 (1995), 309–317.

References to technical reports or doctoral dissertations should include the author’s name; year of publication; title of report or dissertation; institution; location of institution. For example:

5. Ramaroson, R. A., 1989: Modeling in one and three dimensions, PhD thesis, Paris VI.

PROOFS
Proofs will be sent to the corresponding author. One corrected proof, together with the original, edited manuscript, should be returned to the Publisher within three days of receipt by mail (airmail overseas).

OFFPRINTS
50 offprints of each article will be provided free of charge. Additional offprints can be ordered by means of an offprint order form supplied with the proofs.

PAGE CHARGES AND COLOUR FIGURES
No page charges are levied on authors or their institutions. Colour figures are published at the author’s expense only.

COPYRIGHT
Authors will be asked, upon acceptance of an article, to transfer copyright of the article to the Publisher. This will ensure the widest possible dissemination of information under copyright laws.

PERMISSIONS
It is the responsibility of the author to obtain written permission for a quotation from unpublished material, or for all quotations in excess of 250 words in one extract or 500 words in total from any work still in copyright, and for the reprinting of figures, tables or poems from unpublished or copyrighted material.

ADDITIONAL INFORMATION
Additional information can be obtained from Mathematical Physics, Analysis and Geometry, Science and Technology Division, Kluwer Academic Publishers, P.O. Box 17, 3300 AA Dordrecht, The Netherlands; fax 078-6932388; e-mail [email protected]