$1/(p-1)$. If $\| |\mu_1| + |\mu_2| \|_\infty < p - 1 < 1$, then $I - \mu_1 T - \mu_2 \overline{T}$ is injective on $L^p(\mathbb{C})$. For some $\mu_1, \mu_2$ the injectivity fails when $1 > \| |\mu_1| + |\mu_2| \|_\infty > p - 1$. Hence only the limiting situation $1 + k = p < 2$ remains open; consequently this case also remains open in the regularity theory of weakly quasi-regular mappings. We expect that injectivity holds here, too (cf. Section 7).

One obtains the surjectivity properties of the Beltrami operators as a direct consequence of the above results and their proofs. For completeness we state these as a separate theorem.

Theorem 7. Suppose $\mu_1, \mu_2 \in L^\infty$ with $k = \| |\mu_1| + |\mu_2| \|_\infty < 1$. Then the Beltrami operator $I - \mu_1 T - \mu_2 \overline{T}$ is surjective on $L^p(\mathbb{C})$ for all $1 + k < p < 1 + 1/k$. For every $p \le 1 + k$ and for every $p \ge 1 + 1/k$, the surjectivity fails on $L^p(\mathbb{C})$ for some $\mu_1, \mu_2$ with $k = \| |\mu_1| + |\mu_2| \|_\infty$.
34
ASTALA, IWANIEC, AND SAKSMAN
Finally, the results of this section can be interpreted as giving bounds for the spectral radius of the Beltrami operators. In general, no more can be said of the spectrum itself.

Theorem 8. Let $\mu \in L^\infty(\mathbb{C})$ be normalized so that $\|\mu\|_\infty = 1$. Suppose $2 < q < \infty$. Then the spectrum satisfies

$$\sigma\big(\mu T : L^q(\mathbb{C}) \to L^q(\mathbb{C})\big) \subset \{ z \in \mathbb{C} : |z| \le q - 1 \}. \tag{14}$$

Again this is optimal; there are compactly supported $\mu$ with $\|\mu\|_\infty = 1$ such that equality holds in the above spectral inclusion.

Additional results on the Beurling transform and related topics are contained in [10], [30], [13], [14], [15], [3], and [4]. We refer to [20] and the references therein for results on higher-dimensional counterparts. In the extremal case of the space $L^2$, the invertibility of the Beltrami operators and their counterparts leads to distortion functions bounded in BMO (bounded mean oscillation) (cf. [17]).

3. Jacobians of quasi-conformal maps as $A_p$-weights

The basic properties of planar quasi-conformal mappings that are used in the sequel can be found in [25]. In particular, recall that the mappings are a.e. differentiable with positive Jacobian $J(f) \equiv |\partial_z f|^2 - |\partial_{\bar z} f|^2$. We also use repeatedly the consequence that pointwise a.e. all of the quantities $|\partial_z f|^2$, $J(f)$, and $\|Df\|^2$ are comparable, with bounds depending only on the dilatation $K$. Throughout this paper the notation $B = B(x, r)$ stands for the open disk of center $x$ and radius $r$, while $|E|$ denotes the area of the measurable set $E \subset \mathbb{C}$.

For the basic properties of $A_p$-weights, we refer to [32]. Recall that the positive weight $w$ on $\mathbb{C}$ belongs to the class $A_p = A_p(\mathbb{C})$ if
$$|w|_{A_p} := \sup_B \left(\frac{1}{|B|}\int_B w\right)\left(\frac{1}{|B|}\int_B w^{-1/(p-1)}\right)^{p-1} < \infty. \tag{15}$$
Here $1 < p < \infty$ and the supremum is taken over all disks $B \subset \mathbb{C}$. The quantity $|w|_{A_p}$ is referred to as the $A_p$-norm of the weight $w$. According to a basic theorem of R. Coifman and C. Fefferman [9], regular Calderón-Zygmund operators, and the Beurling transform $T$ in particular, are bounded on the weighted spaces $L^p_w(\mathbb{C})$, normed by $\left(\int_{\mathbb{C}} |f|^p\, w \, dm\right)^{1/p}$, whenever $w \in A_p$. [...] 0. This gives
$$|E_t| \le t^{-1}\int_{E_t}\big(|f_z|^2 + |f_{\bar z}|^2\big) \le K t^{-1}\int_{E_t}\big(|f_z|^2 - |f_{\bar z}|^2\big) = K t^{-1}\,|f(E_t)| \le t^{-1}\, C_1(K)\,|E_t|^{1/K}.$$
Solving for $|E_t|$ from this estimate shows that $|E_t| \le C_2(K)\, t^{K/(1-K)}$; indeed, dividing by $|E_t|^{1/K}$ gives $|E_t|^{(K-1)/K} \le C_1(K)\, t^{-1}$. An easy computation then yields the claim. Finally, if $B$ is a quasidisk, the corollary is deduced by observing that the quasi-symmetry condition (16) gives a round disk $B_1$ such that $B \subset B_1$, $|B_1| \le C_2(K)|B|$, and $|f(B_1)| \le C_3(K)|f(B)|$.
37
BELTRAMI OPERATORS IN THE PLANE
Reverse Hölder estimates are closely related to the $A_p$-weights. This can be expressed in various ways; we make use of the following theorem.

Theorem 12. Let $f : \mathbb{C} \to \mathbb{C}$ be a $K$-quasi-conformal homeomorphism, and let $p \in [2, 1 + 1/k)$, where $k = (K-1)/(K+1)$. Denote

$$\omega = |J_f|^{1 - p/2}. \tag{18}$$

Then $\omega \in A_2$ with $|\omega|_{A_2} \le pC(K)/(1 + 1/k - p)$.

Proof. Let $B$ be a ball. Denote by $g = f^{-1}$ the inverse map. Then $g$ is $K$-quasi-conformal and we have $\omega = |J_g \circ g^{-1}|^{p/2 - 1}$. We apply Corollary 11 and a change of variables in order to have the estimate

$$\frac{1}{|B|}\int_B \omega = \frac{1}{|B|}\int_{f(B)} |J_g|^{p/2} \le \frac{pC_1(K)}{1 + 1/k - p}\left(\frac{|f(B)|}{|B|}\right)^{1 - p/2}.$$

Next, a second application of Corollary 11 yields

$$\frac{1}{|B|}\int_B \omega^{-1} = \frac{1}{|B|}\int_B |J_f|^{p/2 - 1} \le \frac{pC_1(K)}{1 + 1/k - (p - 2)}\left(\frac{|f(B)|}{|B|}\right)^{p/2 - 1}.$$

(If $0 \le p/2 - 1 \le 1$, one simply applies Hölder's inequality instead of Corollary 11.) These estimates and the definition of the $A_2$-class immediately give the theorem.

Remarks. The $A_2$-bounds of Theorem 12 are sufficient for our purposes in the sequel. However, if one wants to determine the optimal $A_r$-class for $\omega$ when the dilatation $K$ and the exponent $p$ are given, it is enough to observe that the previous arguments show for any disk $B$ and for any $K$-quasi-conformal map $f$ of $\mathbb{C}$ that

$$c_1(t, K)\,\frac{|f(B)|}{|B|} \le \left(\frac{1}{|B|}\int_B |J_f|^t\right)^{1/t} \le c_2(t, K)\,\frac{|f(B)|}{|B|}$$

whenever $(1/2)(1 - 1/k) < t < (1/2)(1 + 1/k)$, where as usual $k = (K-1)/(K+1)$. This follows from Corollary 11, with a change of variables when the exponent $t$ is negative. Therefore, reasoning as above one can show that $\omega = |J_f|^{1 - p/2}$ is contained in the class $A_r$ if

$$r > \frac{1 + k(p-1)}{1 + k} \quad \text{when } 2 \le p < 1 + \frac{1}{k},$$

and

$$r > \frac{1 - k(p-1)}{1 - k} \quad \text{when } 1 - \frac{1}{k} < p \le 2.$$
It is interesting to compare the $A_2$-results of Theorem 12 with the classical Helson-Szegő theorem (see [32, p. 226]). This theorem asserts that a weight $w$ on $\mathbb{R}$ belongs to $A_2$ if and only if one can write $w = e^{h_1 + H(h_2)}$, where $h_1, h_2 \in L^\infty(\mathbb{R})$ with $\|h_2\|_\infty < \pi/2$, and $H$ denotes the ordinary Hilbert transform. In our case it is not difficult to show that if we pass to the limits $k \to 0$ and $p \to \infty$ in such a way that $pk \to 1 - \varepsilon$, $\varepsilon > 0$, then the weight $\omega$ of Theorem 12 asymptotically approaches the expression $\omega = e^{(1-\varepsilon)\operatorname{Re}(T\mu)}$, $\|\mu\|_\infty = 1$. Moreover, the corresponding $A_2$-norms stay bounded. Consequently, we obtain a counterpart of the Helson-Szegő theorem where $H$ is replaced by the Beurling transform $T$. We present here a more direct proof of this result based on the corresponding asymptotic consequences of Theorem 9 that were given in [3].

Theorem 13. Let $\mu, \nu \in L^\infty(\mathbb{C})$ be such that $\|\mu\|_\infty < 1$. Then $e^{\nu + \operatorname{Re}(T\mu)} \in A_2$.

Proof. It is well known that a weight $w \in A_2$ if and only if

$$\sup_B \frac{1}{|B|}\int_B e^{\pm(\log w - (\log w)_B)} < \infty,$$

where $(\log w)_B$ denotes the mean value of $\log w$ over $B$. Thus in our case it is enough to prove that

$$\sup_B \frac{1}{|B|}\int_B e^{\operatorname{Re} T\mu - (\operatorname{Re}(T\mu))_B} < \infty \tag{19}$$

if $\|\mu\|_\infty < 1$. By the invariance properties of $T$, we may assume that $B = B(0, 1)$. It is clear that (19) follows from the distribution inequality

$$\Big|\Big\{ z \in B(0,1) : \big|\operatorname{Re}(T\mu(z)) - (\operatorname{Re}(T\mu))_{B(0,1)}\big| > t \Big\}\Big| < C e^{-t} \tag{20}$$

for normalized $\mu$ with $\|\mu\|_\infty = 1$. To prove (20) we write $\mu_1 = \mu\chi_{B(0,2)}$ and $\mu_2 = \mu - \mu_1$. Applying the simple estimate

$$\left|\frac{1}{(z - w)^2} - \frac{1}{w^2}\right| \le \frac{6}{|w|^3} \quad \text{for } |z| \le 1 \text{ and } |w| \ge 2,$$

we deduce that the oscillation of $T\mu_2$ on $B(0, 1)$ is bounded by a universal constant.
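The kernel estimate used for $T\mu_2$ is easy to probe numerically. The sketch below is only a sanity check under stated assumptions: the names `kernel_gap` and `sup_on_grid` and the sampling grid are illustrative choices, and by rotation invariance $w$ is taken on the positive real axis. On this grid the normalised difference $|w|^3\,|(z-w)^{-2} - w^{-2}|$ peaks at the value 6, attained at $z = 1$, $w = 2$.

```python
import cmath
import math


def kernel_gap(z, w):
    """Return |w|^3 * |1/(z - w)^2 - 1/w^2| for complex z and w."""
    return abs(w) ** 3 * abs(1 / (z - w) ** 2 - 1 / w ** 2)


def sup_on_grid(n_angle=180,
                radii_z=(0.25, 0.5, 0.75, 1.0),
                radii_w=(2.0, 2.5, 3.0, 5.0, 10.0)):
    """Largest value of kernel_gap over a polar grid with |z| <= 1, |w| >= 2."""
    best = 0.0
    for a in radii_z:
        for big_r in radii_w:
            for j in range(n_angle):
                theta = 2 * math.pi * j / n_angle
                z = cmath.rect(a, theta)      # point with |z| = a
                w = complex(big_r, 0.0)       # w on the positive real axis
                best = max(best, kernel_gap(z, w))
    return best


if __name__ == "__main__":
    print(sup_on_grid())
```

Any finite bound of this type suffices for the argument, since only the boundedness of the oscillation of $T\mu_2$ on $B(0,1)$ is used.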
Moreover, by [3, Corollary 5.1], we have that

$$\big|\{ z \in B(0,2) : |\operatorname{Re}(T\mu_1(z))| > t \}\big| < C e^{-t}.$$

Combining these facts yields (20) and proves the theorem.

Remarks. The assumption $\|\mu\|_\infty < 1$ above is necessary. For example, when $\mu = (z/\bar z)\chi_{B(0,1)}$, $\|\mu\|_\infty = 1$; but since $T\mu = (1 + 2\log|z|)\chi_{B(0,1)}$, $\exp(-T\mu)$ is not even locally integrable. Theorem 13 does not quite have a converse since the Fourier multiplier of $T$ is even; thus there are weights $w \in A_2(\mathbb{C})$ which cannot be written in the form $\log w = \nu + T\mu$ with $\mu, \nu \in L^\infty(\mathbb{C})$. This follows from Janson's theorem (see [21]).

4. Invertibility of the Beltrami operators

In this section we prove our main result, Theorem 1. In addition, we verify Theorem 8 and discuss in more detail the system of nonlinear first-order PDEs in the plane covered by Theorem 1.

The proof of Theorem 1 is based on a fundamental auxiliary result yielding a priori bounds for solutions of the linear Beltrami operator $I - \mu T$; we first establish this auxiliary result and later show how the general result can be reduced to it. Our strategy is to apply a quasi-conformal change of variables for solving the nonhomogeneous equation $f - \mu T f = g$. This leads us in a natural way to consider the Beurling transform on weighted $L^p$ spaces, with weights constructed from the Jacobians of quasi-conformal mappings.

Lemma 14. Assume that $\mu \in L^\infty(\mathbb{C})$ with $\|\mu\|_\infty \le k < 1$. Suppose also that $1 + k < p < 1 + 1/k$. Then the operator $I - \mu T$ is bounded below on $L^p(\mathbb{C})$; that is,

$$\|(I - \mu T)g\|_p \ge C_0 \|g\|_p, \qquad g \in L^p(\mathbb{C}). \tag{21}$$

Proof. Choose an arbitrary function $g$ on $\mathbb{C}$ such that

$$g \in C_0^\infty(\mathbb{C}) \quad \text{and} \quad \int_{\mathbb{C}} g = 0, \tag{22}$$

and write

$$h = g - \mu T g. \tag{23}$$

Since such functions $g$ are dense in $L^p(\mathbb{C})$, it is enough to prove the claim under the restriction (22). Furthermore, if $w = i\,\mathcal{F}^{-1}\big((\xi)^{-1}\mathcal{F}g(\xi)\big)$, that is, if $w$ is the
Cauchy transform of $g$, then $w \in C^\infty(\mathbb{C})$ since (22) clearly implies that $w$ is a rapidly decreasing Schwartz function on $\mathbb{C}$. Moreover, as $\partial_{\bar z} w = g$ and $\partial_z w = Tg$, we see that $w$ satisfies the nonhomogeneous Beltrami equation

$$\partial_{\bar z} w = \mu\, \partial_z w + h. \tag{24}$$

We have thus reduced (21) to showing that $\|\partial_{\bar z} w\|_p \le C_0^{-1}\|h\|_p$, that is, to proving an a priori bound for the differential equation (24).

Set $K = (1 + k)/(1 - k)$ so that $2K/(K + 1) < p < 2K/(K - 1)$. Choose a $K$-quasi-conformal homeomorphism $f : \mathbb{C} \to \mathbb{C}$ satisfying the Beltrami equation

$$f_{\bar z} = \mu f_z. \tag{25}$$
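The translation between the constant $k$ and the dilatation $K$ used here can be verified in exact arithmetic: with $K = (1+k)/(1-k)$ one has $2K/(K+1) = 1 + k$ and $2K/(K-1) = 1 + 1/k$, so the two ways of writing the admissible range of $p$ coincide. A minimal sketch, in which the function name `dilatation_constants` is an illustrative assumption:

```python
from fractions import Fraction


def dilatation_constants(k):
    """Return (K, lower, upper) with K = (1 + k)/(1 - k),
    lower = 2K/(K + 1) and upper = 2K/(K - 1)."""
    k = Fraction(k)
    if not 0 < k < 1:
        raise ValueError("need 0 < k < 1")
    big_k = (1 + k) / (1 - k)
    lower = 2 * big_k / (big_k + 1)
    upper = 2 * big_k / (big_k - 1)
    return big_k, lower, upper


if __name__ == "__main__":
    k = Fraction(1, 3)
    big_k, lower, upper = dilatation_constants(k)
    # lower equals 1 + k and upper equals 1 + 1/k
    print(big_k, lower, upper, 1 + k, 1 + 1 / k)
```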
The existence of $f$ is exactly the assertion of the measurable Riemann mapping theorem (see [25, Theorem V.1.3]). Thus, setting $u = w \circ f^{-1}$, we calculate

$$w_{\bar z} = (u_z \circ f)\, f_{\bar z} + (u_{\bar z} \circ f)\,\overline{f_z}, \qquad \mu w_z + h = \mu\big[(u_z \circ f)\, f_z + (u_{\bar z} \circ f)\,\overline{f_{\bar z}}\big] + h. \tag{26}$$

An application of (25) enables us to eliminate the terms involving $u_z$ and leads to

$$(u_{\bar z} \circ f)\,\overline{f_z} = \frac{h}{1 - |\mu|^2}. \tag{27}$$

Consequently,

$$\int_{\mathbb{C}} |(u_{\bar z} \circ f)\, f_z|^p \le (1 - k^2)^{-p}\int_{\mathbb{C}} |h|^p. \tag{28}$$

Next we use the $A_p$-properties of the derivatives $|f_z|$. Write $\omega = |f_z \circ f^{-1}|^{p-2}$, and note that $|f_{\bar z}|^2 \le |f_z|^2 \le (1 - k^2)^{-1} J_f$. A change of variables gives

$$\int_{\mathbb{C}} |(u_z \circ f)\, f_{\bar z}|^p \le \int_{\mathbb{C}} |(u_z \circ f)\, f_z|^p \le (1 - k^2)^{-1}\int_{\mathbb{C}} |u_z|^p\, \omega. \tag{29}$$

Here the weight $\omega$ is comparable to $(J_{f^{-1}})^{1 - p/2}$, and, according to Theorem 12, $(J_{f^{-1}})^{1 - p/2} \in A_2 \subset A_p$ for $2 \le p < 1 + 1/k$. For $1 + k < p \le 2$, observe that, by the definition (15), $\omega \in A_p$ if and only if $\omega^{-1/(p-1)} \in A_{p'}$, where $p' = p/(p - 1)$ is the conjugate exponent. As $\omega^{-1/(p-1)}$ is comparable to $(J_{f^{-1}})^{1 - p'/2} \in A_2 \subset A_{p'}$, $p' \ge 2$, we see that $\omega \in A_p$ for all $1 + k < p < 1 + 1/k$. We are now able to use the theorem of Coifman and Fefferman [9] on the weighted
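spaces $L^p(\omega)$. As the Beurling transform is a regular Calderón-Zygmund operator, we may apply the result in the form

$$\int_{\mathbb{C}} |u_z|^p\, \omega \le \lambda_\omega^p \int_{\mathbb{C}} |u_{\bar z}|^p\, \omega, \tag{30}$$

where $\lambda_\omega$ is the norm of $T$ on the Banach space $L^p_\omega$. Furthermore, $\lambda_\omega$ has a bound $\lambda_\omega \le \alpha(|\omega|_{A_p}) < \infty$ depending only on the $A_p$-norm of $\omega$. The estimates give us

$$\int_{\mathbb{C}} |u_z|^p\, \omega \le \lambda_\omega^p \int_{\mathbb{C}} |u_{\bar z}|^p\, \omega \le \lambda_\omega^p \int_{\mathbb{C}} |(u_{\bar z} \circ f)\, f_z|^p \le \frac{\lambda_\omega^p}{(1 - k^2)^p}\int_{\mathbb{C}} |h|^p.$$

We now put these estimates into (26) to obtain

$$\|g\|_p \equiv \|w_{\bar z}\|_p \le \|(u_z \circ f)\, f_{\bar z}\|_p + \|(u_{\bar z} \circ f)\,\overline{f_z}\|_p \le 2K^2 \lambda_\omega \|h\|_p. \tag{31}$$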
This shows that the operator $I - \mu T$ is bounded from below on $L^p(\mathbb{C})$ for $1 + k < p < 1 + 1/k$.

Remark 15. The above calculation gives the bound $C_0 \ge (2K^2\lambda_\omega)^{-1}$, with $\lambda_\omega = \|T\|_{L^p_\omega}$, for the constant $C_0$ in (21).

We now apply Lemma 14 to prove Theorem 1.

Proof of Theorem 1. Assume that the numbers $k$, $p$ and the function $H$ satisfy the assumptions of the theorem. We first establish the bi-Lipschitz property (12) for the map $\mathcal{B}$. By hypothesis, $\mathcal{B}$ maps the zero function to the zero function, and, moreover, for $g_1, g_2 \in L^p(\mathbb{C})$ there is the estimate

$$\|\mathcal{B}g_1 - \mathcal{B}g_2\|_{L^p(\mathbb{C})} \le \|g_1 - g_2\|_{L^p(\mathbb{C})} + k\,\|Tg_1 - Tg_2\|_{L^p(\mathbb{C})} \le (kC_p + 1)\,\|g_1 - g_2\|_{L^p(\mathbb{C})},$$

where $C_p$ denotes the norm of $T$ on $L^p(\mathbb{C})$. Hence $\mathcal{B}$ is a well-defined Lipschitz map.

Assume then that $g_1, g_2 \in L^p(\mathbb{C})$, and set $h = \mathcal{B}g_1 - \mathcal{B}g_2$. According to (11) we may write $H(z, Tg_1) - H(z, Tg_2) = \mu(z)(Tg_1 - Tg_2)$, where the function $\mu = \mu(g_1, g_2) : \mathbb{C} \to \mathbb{C}$ satisfies $\|\mu\|_\infty \le k < 1$. Hence the difference $g = g_1 - g_2$ solves the nonhomogeneous Beltrami equation $h = g - \mu T g$. We now invoke the a priori estimate (21), which shows that $\|g\|_{L^p(\mathbb{C})} \le C(k, p)\,\|h\|_{L^p(\mathbb{C})}$. These estimates together prove the required bi-Lipschitz estimate (12).

In order to obtain the invertibility of the nonlinear operator $\mathcal{B}$, we observe that since $\mathcal{B}$ is a bi-Lipschitz map it is enough to show that the image $\mathcal{B}(L^p(\mathbb{C}))$ is dense in $L^p(\mathbb{C})$. To that end, let $h_1 \in L^2(\mathbb{C})$ be arbitrary, and denote $\widetilde{\mathcal{B}}g = h_1 + H(z, Tg)$ for $g \in L^2(\mathbb{C})$. As $T$ is an isometry on $L^2(\mathbb{C})$, we obtain from condition (11) that $\widetilde{\mathcal{B}}$ is a strict contraction on $L^2(\mathbb{C})$. Hence the Banach fixed point theorem yields an element $g \in L^2(\mathbb{C})$ with $\widetilde{\mathcal{B}}g = g$, that is, $\mathcal{B}g = h_1$. Now assume that $h_1 \in L^2(\mathbb{C}) \cap L^p(\mathbb{C})$. The solution $g \in L^2(\mathbb{C})$ of $\mathcal{B}g = h_1$ also satisfies $g \in L^p(\mathbb{C})$ according to the bi-Lipschitz inequality (12). We thus infer that $\mathcal{B}(L^p(\mathbb{C}))$ contains the set $L^2(\mathbb{C}) \cap L^p(\mathbb{C})$, which is dense in $L^p(\mathbb{C})$. This completes the proof of the theorem.

Next we turn to Theorem 8. Theorem 3 immediately shows that for $\|\mu\|_\infty \le 1/(p - 1)$ the spectrum of $\mu T$ on $L^p(\mathbb{C})$ is contained in the closed unit disk. Thus one only needs to construct a Beltrami coefficient $\mu$ for which the spectral inclusion (14) holds as an equality.

Proof of Theorem 8. Let $p > 2$. As $\lambda - \mu T = T^{-1}(\lambda - T\mu)T$, it suffices to construct a compactly supported $\mu$ such that $\|\mu\|_\infty = k = 1/(p - 1)$ and
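$$\sigma\big(T\mu : L^p(\mathbb{C}) \to L^p(\mathbb{C})\big) = \overline{D}, \tag{32}$$

where $D = \{\lambda : |\lambda| < 1\}$. Let $\{\lambda_j\}_{j=1}^\infty$ be a dense subset of $D$. Choose disjoint balls $B_j = B(z_j, r_j)$ such that $r_j > 0$ and $B_j \subset D$ for each $j \ge 1$. We define $\mu$ by

$$\mu(z) = -\frac{1}{p - 1}\sum_{j=1}^\infty \lambda_j\, \frac{z - z_j}{\overline{z - z_j}}\, \chi_{B_j}(z).$$

In order to establish (32) it is enough to show that for each $j \ge 1$ the operator $\lambda_j - T\mu$ is not bounded from below on $L^p(\mathbb{C})$. Consider the auxiliary function

$$g_\alpha = \Big(1 + \frac{\alpha}{2}\Big)|z|^\alpha \chi_D,$$

where the parameter $\alpha$ takes values in the open interval $(-2/p, 0)$. Since $w = z|z|^\alpha \chi_D + (\bar z)^{-1}(1 - \chi_D)$ has derivatives

$$w_{\bar z} = \frac{\alpha}{2}\, z^2 |z|^{\alpha - 2}\chi_D - \frac{1}{\bar z^2}(1 - \chi_D), \qquad w_z = g_\alpha,$$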
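and $w \in W^{1,p}(\mathbb{C})$ for $\alpha \in (-2/p, 0)$, we have that $T^{-1}g_\alpha = w_{\bar z}$. Let us define

$$\nu = -\frac{1}{p - 1}\,\frac{z}{\bar z}\,\chi_D.$$

Then

$$T^{-1}g_\alpha - \nu g_\alpha = -\frac{1}{\bar z^2}(1 - \chi_D) + \frac{1}{2}\Big(\frac{2 + \alpha}{p - 1} + \alpha\Big) z^2 |z|^{\alpha - 2}\chi_D.$$

A direct computation in polar coordinates gives

$$\frac{\|T^{-1}g_\alpha - \nu g_\alpha\|_p^p}{\|g_\alpha\|_p^p} = \frac{c_\alpha + c_\alpha^{\,p}}{(1 + \alpha/2)^p}, \qquad c_\alpha = \frac{p\alpha + 2}{2(p - 1)},$$

which tends to zero as $\alpha \to (-2/p)^+$.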
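The norm ratio for $g_\alpha$ can be evaluated from the radial integrals $\int_D |z|^{\alpha p}\,dm = 2\pi/(\alpha p + 2)$ and $\int_{|z|>1} |z|^{-2p}\,dm = \pi/(p-1)$. The following sketch is a numerical illustration only; the function name `norm_ratio_p` is an assumption made here, and the code simply assembles these two integrals with the coefficient of $z^2|z|^{\alpha-2}\chi_D$ read off from the formula for $T^{-1}g_\alpha - \nu g_\alpha$.

```python
import math


def norm_ratio_p(p, alpha):
    """Return ||T^{-1} g - nu g||_p^p / ||g||_p^p for g = g_alpha.

    Uses closed-form radial integrals; requires p > 2 and -2/p < alpha < 0.
    """
    if not (p > 2 and -2.0 / p < alpha < 0):
        raise ValueError("need p > 2 and -2/p < alpha < 0")
    inside = 2 * math.pi / (alpha * p + 2)     # integral of |z|^(alpha p) over |z| < 1
    outside = math.pi / (p - 1)                # integral of |z|^(-2p) over |z| > 1
    c = 0.5 * (alpha + (2 + alpha) / (p - 1))  # coefficient on the unit disk
    numerator = outside + abs(c) ** p * inside
    denominator = (1 + alpha / 2) ** p * inside
    return numerator / denominator


if __name__ == "__main__":
    p = 4.0
    for alpha in (-0.4, -0.49, -0.499):
        print(alpha, norm_ratio_p(p, alpha))
```

The printed values decrease towards zero as $\alpha$ approaches $-2/p$, which is the behaviour the proof relies on.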
Next, we note that the support of the function $g_\alpha((z - z_j)/r_j)$ is contained in the closed ball $\overline{B_j}$. Applying the affine invariance of $T$, we may estimate

$$\frac{\|(\lambda_j - T\mu)\, g_\alpha((z - z_j)/r_j)\|_p}{\|g_\alpha((z - z_j)/r_j)\|_p} = |\lambda_j|\,\frac{\|g_\alpha - T(\nu g_\alpha)\|_p}{\|g_\alpha\|_p} \le |\lambda_j|\,\|T\|_{L^p(\mathbb{C}) \to L^p(\mathbb{C})}\,\frac{\|T^{-1}g_\alpha - \nu g_\alpha\|_p}{\|g_\alpha\|_p} \longrightarrow 0$$

as $\alpha \to (-2/p)^+$, by the preceding computation. This shows that $\lambda_j - T\mu$ is not bounded from below on $L^p(\mathbb{C})$ and therefore completes the proof of the theorem.

Proof of Theorem 4. If $p = 2$ we may take $\mu \equiv 1$ (cf. Proposition 1). If $p > 2$, let $\mu$ be the dilatation function of Theorem 8; $\|\mu\|_\infty = 1/(p - 1)$. As shown in the proof, $1 \in \sigma(T\mu) = \sigma(\mu T)$ on $L^p(\mathbb{C})$. Therefore neither $I - \mu T$ nor $I - T\mu$ is invertible on $L^p(\mathbb{C})$. The same holds on $L^{p'}(\mathbb{C})$ for the transposes $(I - \mu T)' = I - T\mu$, $(I - T\mu)' = I - \mu T$.

Remark. Example 17 shows that one can even choose a smooth $\mu$ in the proof of Theorem 4.

5. A Liouville theorem for mappings with finite distortion

We next establish a result of Liouville type for mappings of finite distortion. We apply it in the later sections in proving the injectivity for the Beltrami operators in various situations. However, since the result itself is of intrinsic interest, we formulate it in all dimensions $n$, $n \ge 2$.
Definition. A mapping $f : \mathbb{R}^n \to \mathbb{R}^n$ of Sobolev class $W^{1,n}_{\mathrm{loc}}(\mathbb{R}^n)$ is said to be of finite distortion if there is a measurable function $K : \mathbb{R}^n \to [1, \infty)$ such that

$$|Df(x)|^n \le K(x)\, J(x, f) \quad \text{for a.e. } x \in \mathbb{R}^n. \tag{33}$$

Above, $|Df(x)|$ stands for the norm of the differential $Df(x) : \mathbb{R}^n \to \mathbb{R}^n$ and $J = J(x, f) = \det(Df(x))$. The smallest such function $K$ is called the outer dilatation function of $f$. (For further information on this class of mappings, see, e.g., [17], [27], and their references.)

In what follows we consider dilatations that satisfy the integral bound

$$\left(\frac{1}{|B(0,R)|}\int_{B(0,R)} K^{n-1}(x)\, dx\right)^{1/(n-1)} \le K_\infty \tag{34}$$

for all sufficiently large balls $B(0, R) \subset \mathbb{R}^n$ centered at the origin. The constant $K_\infty$ is, of course, independent of the ball $B$. Combining the isoperimetric inequality with appropriate differential inequalities leads to the following theorem.

Theorem 16. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be of finite distortion, and assume that the outer dilatation $K$ of $f$ satisfies bound (34) for all $R > R_0 > 1$. If, moreover, $J(x, f) \in L^q(\mathbb{R}^n)$ with some $1 < q \le K_\infty/(K_\infty - 1)$, then $f$ is constant.

Proof. Denote $B = B(0, t)$ for $t > 0$, and abbreviate $J = J(x, f)$. By the isoperimetric inequality (see, e.g., [27, Lemma 3.4]) and Hölder's inequality, we obtain for almost all $t \ge 0$ that

$$\frac{1}{|B|}\int_B J \le \left(\frac{1}{|\partial B|}\int_{\partial B} |Df|^{n-1}\right)^{n/(n-1)} \le \left(\frac{1}{|\partial B|}\int_{\partial B} \frac{|Df|^n}{K}\right)\left(\frac{1}{|\partial B|}\int_{\partial B} K^{n-1}\right)^{1/(n-1)} \le \left(\frac{1}{|\partial B|}\int_{\partial B} J\right) h(t),$$

where we have denoted $h(t) = \big(\frac{1}{|\partial B|}\int_{\partial B} K^{n-1}\big)^{1/(n-1)}$. Consider the increasing function

$$\varphi(t) = \int_{B(0,t)} J(x, f)\, dx.$$
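The previous estimate may be written in the form

$$\varphi(t) \le \frac{t}{n}\, \varphi'(t)\, h(t) \quad \text{for a.e. } t > 0. \tag{35}$$

We first claim that

$$\varphi(t) = o\big(t^{n/K_\infty}\big) \quad \text{as } t \to \infty. \tag{36}$$

To see this, fix an arbitrary bounded and measurable set $E \subset \mathbb{R}^n$ and consider balls $B = B(0, t) \supset E$. Since $J \in L^q(\mathbb{R}^n)$ by assumption, we have

$$\varphi(t) = \int_B J = \int_{B \setminus E} J + \int_E J \le |B \setminus E|^{1 - 1/q}\left(\int_{B \setminus E} J^q\right)^{1/q} + |E|^{1 - 1/q}\left(\int_E J^q\right)^{1/q}.$$

Multiplying by $|B|^{-1 + 1/q}$ and letting $t$ grow to $\infty$, we conclude that

$$\limsup_{t \to \infty}\, t^{n(-1 + 1/q)}\varphi(t) \le |B(0,1)|^{1 - 1/q}\left(\int_{\mathbb{R}^n \setminus E} J^q\right)^{1/q},$$

which clearly yields (36) since the set $E$ was arbitrary and $1 - 1/q \le 1/K_\infty$.

In order to shorten further computations, we pass to new variables. Set

$$s = t^n, \qquad h(t) = K_\infty H(t^n), \qquad \text{and} \qquad \varphi(t) = \big(\psi(t^n)\big)^{1/K_\infty}.$$

Now (35) and (36) take the form

$$\psi(s) \le s\, \psi'(s)\, H(s) \quad \text{for a.e. } s > 0 \tag{37}$$

together with

$$\psi(s) = o(s) \quad \text{as } s \to \infty. \tag{38}$$

In addition, our assumption (34) on the outer dilatation shows that

$$K_\infty^{n-1} \ge \frac{n}{r^n \omega_{n-1}}\int_0^r \omega_{n-1}\, t^{n-1} h^{n-1}(t)\, dt = r^{-n} K_\infty^{n-1}\int_0^{r^n} H^{n-1}(s)\, ds.$$

An application of Hölder's inequality yields the simple bound

$$\frac{1}{s}\int_0^s H(u)\, du \le 1, \tag{39}$$

which is valid, say, for $s \ge s_0$. Let $a \ge s_0$ be arbitrary. It is enough to show that $\psi(a) = 0$. Since $\varphi$ is absolutely continuous, we may integrate (37) to deduce

$$\psi(a) \le \psi(s)\exp\left(-\int_a^s \frac{du}{uH(u)}\right).$$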
The desired conclusion follows from (38) as soon as we prove that

$$\int_a^s \frac{du}{uH(u)} \ge \log(s) - C(a).$$

Set $B(s) = \int_a^s H(u)\, du$. According to (39), $0 \le B(u) \le u$, and hence an application of the elementary inequality $H + 1/H \ge 2$, together with an integration by parts, leads to

$$\int_a^s \frac{du}{uH(u)} \ge 2\int_a^s \frac{du}{u} - \int_a^s \frac{H(u)\, du}{u} = 2\log\frac{s}{a} - \frac{1}{s}B(s) - \int_a^s \frac{B(u)}{u^2}\, du \ge 2\log\frac{s}{a} - 1 - \int_a^s \frac{du}{u} = \log(s) - \big(\log(a) + 1\big).$$

The proof is complete.

Remark. The above Liouville-type theorem is sharp in a quite strong sense. For every $n \ge 2$ and $K_\infty > 1$, there are $K_\infty$-quasi-conformal (hence nonconstant) mappings $f : \mathbb{R}^n \to \mathbb{R}^n$ such that

$$J(x, f) \in L^p(\mathbb{R}^n) \quad \text{for all } p > \frac{K_\infty}{K_\infty - 1}.$$

For example, the mapping $f(x) = x\chi_B + x|x|^{(1/K_\infty) - 1}\chi_{B^c}$, $B = \{x \in \mathbb{R}^n : |x| < 1\}$, satisfies all these requirements. Note also that the $K_\infty$-quasi-conformal map $g(x) = x|x|^{1/K_\infty - 1}$, $x \in \mathbb{R}^n$, satisfies

$$J(x, g) \in \text{weak-}L^q \quad \text{for } q = \frac{K_\infty}{K_\infty - 1}.$$
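The numbers in this Remark can be read off from the radial stretch $g(x) = x|x|^{a-1}$ with $a = 1/K_\infty$: its differential has singular value $a|x|^{a-1}$ in the radial direction and $|x|^{a-1}$ in the $n-1$ tangential directions, so $|Dg|^n/J = 1/a$, while $J = a|x|^{n(a-1)}$ has $J^q$ integrable near infinity exactly when $q(1-a) > 1$. A small sketch in exact arithmetic, where the function name `radial_stretch_data` is an illustrative assumption (the dimension $n$ drops out of both quantities):

```python
from fractions import Fraction


def radial_stretch_data(k_inf):
    """For g(x) = x |x|^(1/K - 1) with K = k_inf > 1, return
    (outer distortion |Dg|^n / J, critical integrability exponent of J)."""
    k_inf = Fraction(k_inf)
    if k_inf <= 1:
        raise ValueError("need K > 1")
    a = 1 / k_inf
    distortion = 1 / a            # ratio of |Dg|^n to J, independent of |x| and n
    critical_q = 1 / (1 - a)      # J^q integrable at infinity iff q > critical_q
    return distortion, critical_q


if __name__ == "__main__":
    print(radial_stretch_data(3))
```

The critical exponent returned equals $K_\infty/(K_\infty - 1)$, the endpoint appearing in Theorem 16.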
6. Beltrami operators with coefficients in VMO

For elliptic differential operators the regularity of the coefficients always reflects the local smoothness and growth properties of the solutions of the corresponding equations. For example, if the coefficient $\mu$ of the Beltrami equation $f_{\bar z} = \mu f_z$ is continuously differentiable, the same holds for $f$, and, in particular, the derivatives of $f$ are locally in $L^p$. However, the oscillations of the coefficients, even at one point, can destroy the $L^p$-regularity. The following example gives a $\mu \in C^\infty(\mathbb{C})$ such that a corresponding nonhomogeneous Beltrami equation does not have appropriate $L^p$-solutions; this
appears already at the borderline case $\|\mu\|_\infty = 1/(p - 1)$ of Theorem 3. Consequently, the behaviour of the coefficients near infinity also plays an important role in the global $L^p$-estimates.

Example 17. Let $p > 2$, and define

$$\mu(z) = -z^2\big(p + (p - 1)|z|^2\big)^{-1}.$$

Then $\mu \in C^\infty(\mathbb{C})$ and $\|\mu\|_\infty = 1/(p - 1)$. However, the operator $I - \mu T : L^p(\mathbb{C}) \to L^p(\mathbb{C})$ is not surjective. In particular, for some $\varphi \in L^p(\mathbb{C})$ the system $w_{\bar z} - \mu w_z = \varphi$ does not admit solutions $w \in E^p(\mathbb{C})$.

Proof. Notice first that the quasi-conformal diffeomorphism $f : \mathbb{C} \to \mathbb{C}$, $f(z) = z(1 + |z|^2)^{-1/p}$, satisfies the Beltrami equation $f_{\bar z} = \mu f_z$ and that the derivatives $f_z$ and $f_{\bar z}$ are in weak-$L^p(\mathbb{C})$ but not in $L^p(\mathbb{C})$. Consider the function $h$:

$$h(z) = z\big(1 + |z|^2\big)^{-1/p}\log^{-1/p}\big(1 + |z|^2\big).$$

Define $\varphi$ by the equation $h_{\bar z} - \mu h_z = \varphi$. It is straightforward to see that $\varphi \in L^p(\mathbb{C})$. Note, however, that $h_z, h_{\bar z} \notin L^p(\mathbb{C})$. We claim that the equation

$$F_{\bar z} - \mu F_z = \varphi \tag{40}$$
has no solution such that $F_z, F_{\bar z} \in L^p(\mathbb{C})$.

Assuming the contrary, we observe that $F - h \in W^{1,2}_{\mathrm{loc}}(\mathbb{C})$ solves the homogeneous Beltrami equation with complex dilatation $\mu$. This implies (see, e.g., [25, Theorem VI.2.2]) that $F(z) = h(z) + \Phi \circ f(z)$, where $\Phi$ is an entire analytic function. For large $|z|$ our construction shows that $|h(z)| \le |z|^{1 - 2/p} \le 2|f(z)|$. On the other hand, from the Sobolev imbedding theorem we deduce $|F(z)| \le C|z|^{1 - 2/p}$. Combining these bounds proves that $|\Phi(w)| \le (2C + 2)|w|$ for large $|w|$. According to the classical Liouville theorem, $\Phi(w) = c_1 w + c_2$ for some constants $c_1$ and $c_2$. Consequently, $F = h + c_1 f + c_2$. However, calculating the derivatives we obtain

$$F_{\bar z} = h_{\bar z} + c_1 f_{\bar z} = \frac{-z^2}{p\big(1 + |z|^2\big)^{1 + 1/p}}\left(c_1 + \frac{1 + \log\big(1 + |z|^2\big)}{\log^{1 + 1/p}\big(1 + |z|^2\big)}\right).$$

If here $c_1 \ne 0$, the constant term is dominant in the latter factor and then $F_{\bar z} \notin L^p(\mathbb{C})$. But if $c_1 = 0$, then $F_{\bar z} = h_{\bar z} \notin L^p(\mathbb{C})$. Therefore no choice of the constant $c_1$ yields the gradient of the solution in $L^p(\mathbb{C})$.

On the other hand, a global control of the regularity of the coefficient $\mu$, even in the very weak sense of vanishing mean oscillation, that is, $\mu \in \mathrm{VMO}(\mathbb{C})$, forces the operators $I - \mu T$ to become invertible on all $L^p$-spaces. The definition of the class VMO seems to vary slightly in the literature; hence let us fix $\mathrm{VMO}(\mathbb{C})$ as the closure of $C_0^\infty(\mathbb{C})$ in $\mathrm{BMO}(\mathbb{C})$. It is easy to see that this is equivalent to taking the closure of $C(\hat{\mathbb{C}})$ in BMO, where $\hat{\mathbb{C}}$ is the Riemann sphere.

First of all, the assumption $\mu \in \mathrm{VMO}(\mathbb{C})$ with $\|\mu\|_\infty < 1$ makes $I - \mu T : L^p(\mathbb{C}) \to L^p(\mathbb{C})$ a Fredholm operator with index zero, regardless of the value of $p \in (1, \infty)$. Since this is needed below, for the convenience of the reader we outline the proof (cf., e.g., [16, pp. 42-43]).

Fix $p \in (1, \infty)$. Notice first that Proposition 2 together with the spectral radius formula shows that for every $\varepsilon > 0$, $\|T^j\|_{p \to p} \le (1 + \varepsilon)^j$ as soon as $j \ge N = N(\varepsilon)$. By choosing $\varepsilon$ so that $\|\mu\|_\infty(1 + \varepsilon) < 1$, we deduce the existence of an integer $j_0$ such that

$$\|\mu^{j_0} T^{j_0}\|_{p \to p} < 1.$$

This implies that the operator $I - \mu^{j_0}T^{j_0}$ is invertible on $L^p(\mathbb{C})$. At this point one invokes the theorem of A. Uchiyama [33] stating that a commutator of a classical Calderón-Zygmund operator in $L^p(\mathbb{R}^n)$ and the multiplication by an element in VMO is a compact operator on each $L^p(\mathbb{R}^n)$, $p \in (1, \infty)$. Thus $\mu T - T\mu$ is compact, and by applying this repeatedly we see that the difference $K = \mu^{j_0}T^{j_0} - (\mu T)^{j_0}$ is a compact operator on $L^p(\mathbb{C})$. Writing $P = 1 + \mu T + (\mu T)^2 + \cdots + (\mu T)^{j_0 - 1}$, it follows that

$$(I - \mu T)P = P(I - \mu T) = I - (\mu T)^{j_0} = I - \mu^{j_0}T^{j_0} + K.$$

This shows that $I - \mu T$ is a Fredholm operator. Finally, the continuous homotopy $t \mapsto I - t\mu T$, $t \in [0, 1]$, shows that its index is zero.

In [16] it was already observed that for compactly supported $\mu$ the kernel of the operator $I - \mu T$ is trivial, and since the index is zero, the operator becomes invertible. We now extend this result by removing the restrictive assumption on the support of $\mu$.
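The algebraic step in the Fredholm argument above, $(I - A)P = P(I - A) = I - A^{j_0}$ for $P = I + A + \cdots + A^{j_0 - 1}$, holds in any algebra with unit. The sketch below checks it for small integer matrices as a finite-dimensional stand-in for $A = \mu T$; the helper names and the sample matrix are illustrative assumptions, not part of the original argument.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Matrix product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def telescoping_sides(a, j0):
    """Return ((I - A)P, P(I - A), I - A^j0) with P = I + A + ... + A^(j0-1)."""
    n = len(a)
    eye = identity(n)
    power = eye                         # holds A^m, starting with m = 0
    p_sum = [[0] * n for _ in range(n)]
    for _ in range(j0):
        p_sum = mat_add(p_sum, power)   # add A^m to the partial sum
        power = mat_mul(power, a)       # advance to A^(m+1)
    i_minus_a = mat_sub(eye, a)
    return mat_mul(i_minus_a, p_sum), mat_mul(p_sum, i_minus_a), mat_sub(eye, power)


if __name__ == "__main__":
    left, right, target = telescoping_sides([[0, 1], [2, 3]], 4)
    print(left == right == target)
```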
Proposition 18. Let $2 \le p < \infty$, and suppose that $\mu \in \mathrm{VMO}(\mathbb{C})$ satisfies $\|\mu\|_\infty < 1$. Then the operator $I - \mu T$ is injective on $L^p(\mathbb{C})$.

Proof. It involves no loss of generality to assume that $p > 2$. Showing that the kernel of the operator $I - \mu T : L^p(\mathbb{C}) \to L^p(\mathbb{C})$ is trivial amounts to proving that every solution of the Beltrami equation

$$f_{\bar z} = \mu f_z \tag{41}$$

with $f_z, f_{\bar z} \in L^p$ is constant. Assume therefore that $f$ has partial derivatives in $L^p$ and solves (41), $2 < p < \infty$. From the definition of VMO we get a constant $\mu_0$, $|\mu_0| < 1$, such that

$$\lim_{R \to \infty}\frac{1}{|B(0,R)|}\int_{B(0,R)} |\mu(z) - \mu_0|\, dm(z) = 0. \tag{42}$$

Next, consider the linear map $\phi(z) = z - \mu_0\bar z$, and write $g = f \circ \phi$ so that $g \in E^p(\mathbb{C})$ satisfies the Beltrami equation $g_{\bar z} = \tilde\mu\, g_z$, where

$$\tilde\mu = \frac{\mu \circ \phi - \mu_0}{1 - \overline{\mu_0}\,\mu \circ \phi}.$$

Clearly $\tilde\mu \in \mathrm{VMO}$ with $\|\tilde\mu\|_\infty \le k < 1$ for some $k$, and $\tilde\mu$ satisfies the relation analogous to (42), but with $\mu_0 = 0$ in this case. The outer dilatation function $K$, defined by $|Dg(z)|^2 \le K(z)J_g(z)$, satisfies the pointwise estimates

$$1 \le K(z) = 1 + 2|\tilde\mu(z)|\big(1 - |\tilde\mu(z)|\big)^{-1} \le 1 + \frac{2}{1 - k}\,|\tilde\mu(z)|.$$

Therefore

$$\lim_{R \to \infty}\frac{1}{|B(0,R)|}\int_{B(0,R)} K(z)\, dm(z) = 1,$$

and since $J_g \in L^q(\mathbb{C})$, $1 < q = p/2 < \infty$, by assumption, Theorem 16 implies that $g$ is constant. Therefore the kernel of $I - \mu T$ must be trivial.

Proof of Theorem 5. The above argument shows that for all $p \in (1, \infty)$ the operator $I - \mu T$ is Fredholm on $L^p$ with index zero, and by Proposition 18 it is also injective when $p \ge 2$. This proves that $I - \mu T$ is invertible for $2 \le p < \infty$. For the case $1 < p \le 2$, we note as in the proof of Theorem 3 that $I - T\mu = T(I - \mu T)T^{-1}$ is invertible on $L^{p'}(\mathbb{C})$, $2 \le p' < \infty$, and so is its transpose $(I - T\mu)' = I - \mu T$ on $L^p(\mathbb{C})$.

7. Injectivity in the borderline cases and weakly quasi-regular maps

One more aspect of the Beltrami operators we wish to consider is their injectivity.
We begin by studying their relations to the regularity theory of weakly quasi-regular mappings.

Lemma 19. Suppose that the Beltrami operator $I - \mu T : L^q(\mathbb{C}) \to L^q(\mathbb{C})$ is injective for some $1 < q \le 2$, where $\mu$ has compact support with $\|\mu\|_\infty < 1$. Then each weakly quasi-regular mapping $f \in W^{1,q}_{\mathrm{loc}}(\Omega)$ in a domain $\Omega \subset \mathbb{C}$ with complex dilatation $\mu|_\Omega$ is quasi-regular.

Proof. It is enough to prove that for each $\phi \in C_0^\infty(\Omega)$ the function $g = \phi f$ has partial derivatives in $L^2(\mathbb{C})$. According to our assumptions, $g_z, g_{\bar z} \in L^q(\mathbb{C})$ and we have the nonhomogeneous Beltrami equation

$$g_{\bar z} - \mu g_z = f(\phi_{\bar z} - \mu\phi_z) \in L^2(\mathbb{C}). \tag{43}$$

On the other hand, since $I - \mu T$ is invertible in $L^2(\mathbb{C})$, we can solve this equation also for a function $\tilde g$ whose derivatives are in $L^2(\mathbb{C})$; that is to say,

$$\tilde g_{\bar z} - \mu\tilde g_z = f(\phi_{\bar z} - \mu\phi_z), \qquad \tilde g_z, \tilde g_{\bar z} \in L^2(\mathbb{C}).$$

Furthermore, since $\phi$ and $\mu$ have compact support, it follows that $\tilde g_{\bar z} \in L^q(\mathbb{C})$ and $\tilde g_z = T\tilde g_{\bar z} \in L^q(\mathbb{C})$. Observe that

$$(g - \tilde g)_{\bar z} - \mu(g - \tilde g)_z = 0 \quad \text{in } L^q(\mathbb{C}).$$

The assumed uniqueness, that is, the injectivity of the Beltrami operator $I - \mu T$, now implies that $g = \tilde g$ modulo a constant, establishing the lemma.

On the general level of all weakly $K$-quasi-regular mappings, we obtain the following theorem.

Theorem 20. Let $k = (K - 1)/(K + 1)$, and suppose $1 < q \le 2$. Then the following are equivalent for any subdomain $\Omega \subset \mathbb{C}$:
(a) the operators $I - \mu T$ are injective on $L^q(\mathbb{C})$ whenever $\|\mu\|_\infty \le k$;
(b) every weakly $K$-quasi-regular map $f \in W^{1,q}_{\mathrm{loc}}(\Omega)$ is quasi-regular.

Proof. That (a) implies (b) follows from Lemma 19, since quasiregularity is a local property. Conversely, if $I - \mu T$ is not injective and $\|\mu\|_\infty \le k = (K - 1)/(K + 1)$, then $w - \mu T(w) = 0$ for some nonzero $w \in L^q(\mathbb{C})$. The Cauchy transform gives us a function $f \in W^{1,q}_{\mathrm{loc}}(\mathbb{C})$ satisfying the homogeneous Beltrami equation $f_{\bar z} - \mu f_z = 0$, with $w = f_{\bar z}$.
The function $f$ is now a weakly $K$-quasi-regular mapping of $\mathbb{C}$. If $f$ is strongly quasi-regular, that is, if $f \in W^{1,2}_{\mathrm{loc}}(\mathbb{C})$, then the basic methods of elliptic differential equations are applicable. Letting $q^*$ denote the Sobolev conjugate $q^* = 2q/(2 - q)$ of $q$ and using the Harnack inequality for the components of the function $f - f(0)$ (see, e.g., [11, Paragraph 6.2 and Theorem 14.39]), it follows that, for $|z| = r$ and for $R > 2r$,

$$|f(z) - f(0)|^{q^*} \le \frac{C}{|B(0,R)|}\int_{B(0,R)} |f(w) - f(0)|^{q^*}\, dm(w),$$

while the Sobolev imbedding theorem gives

$$\int_{B(0,R)} |f(w) - f(0)|^{q^*}\, dm(w) \le C_q\left(\int_{B(0,R)} \big(|f_z| + |f_{\bar z}|\big)^q\, dm(w)\right)^{q^*/q}.$$

Since the derivatives of $f$ are globally in $L^q(\mathbb{C})$ and $f$ is nonconstant, letting $R \to \infty$ yields a contradiction. Thus $f \notin W^{1,2}_{\mathrm{loc}}(\mathbb{C})$.

This example can also be used to construct a weakly $K$-quasi-regular mapping in $W^{1,q}_{\mathrm{loc}}(\Omega)$ which is not quasi-regular. To this effect, we cover the entire plane by the domains $\Omega_{\alpha,\beta} = \{\alpha z + \beta : z \in \Omega\}$, where $\alpha, \beta \in \mathbb{C}$. We see that at least one of the restrictions $f|_{\Omega_{\alpha,\beta}}$ is not quasi-regular. A change of variables finally gives a mapping for which (b) fails.

Examples of weakly quasi-regular mappings show that they can be discontinuous with bad singularities (see [19]). Let $D = \{z : |z| < 1\}$, and choose a countable number of disjoint disks $B(x_i, r_i) \subset D$ such that the measure $|D \setminus \cup_i B(x_i, r_i)| = 0$. Define

$$f(z) = r_i^{1 + 1/K}\,\frac{|z - x_i|^{1 - 1/K}}{z - x_i} + \overline{x_i}, \qquad z \in B(x_i, r_i); \tag{44}$$

on $D \setminus \cup_i B(x_i, r_i)$, set $f(z) = \bar z$, and for $|z| \ge 1$ let $f(z) = 1/z$. Even if $f$ has a singularity at each $x_i$, a calculation shows that $f$ is weakly $K$-quasi-regular and contained in all Sobolev classes $W^{1,q}_{\mathrm{loc}}(\mathbb{C})$ with $q < 2K/(K + 1)$ (see [19]).

On the other hand, Theorem 3 implies that the Beltrami operator $I - \mu T$ is injective on $L^q(\mathbb{C})$ if $1 + \|\mu\|_\infty < q < 1 + 1/\|\mu\|_\infty$. Combining these facts with Theorem 20 leads to the following theorem.

Theorem 21. Let $1 < K < \infty$. Then every weakly $K$-quasi-regular mapping, contained in a Sobolev space $W^{1,q}_{\mathrm{loc}}(\Omega)$ with $2K/(K + 1) < q \le 2$, is quasi-regular on $\Omega$. For each $q < 2K/(K + 1)$ there are weakly $K$-quasi-regular mappings $f \in W^{1,q}_{\mathrm{loc}}(\mathbb{C})$ which are not quasi-regular.
The regularity part of Theorem 21 was shown in [3] as a consequence of the area distortion estimates; here we emphasize the conceptual connection to the injectivity properties of the corresponding Beltrami operators $I - \mu T$. This approach becomes especially interesting at the borderline case $q = 2K/(K + 1)$.

We conjecture that all weakly $K$-quasi-regular mappings $f \in W^{1,q}_{\mathrm{loc}}$ with $q = 2K/(K + 1)$ are in fact quasi-regular. Equivalently, we expect that the Beltrami operators $I - \mu T$ are injective on $L^q(\mathbb{C})$ at the critical exponent $q = 1 + \|\mu\|_\infty$, $q < 2$. This open problem has interesting connections and reductions, which we now discuss before continuing with further injectivity results.

In particular, we are led to consider how the norm $\|T\|_{L^p_\omega}$ of the Beurling transform on the weighted spaces $L^p_\omega$ depends on the $A_p$-norm of the weight $\omega$. Examples suggest that $\|T\|_{L^p_\omega}$ depends linearly on $\|\omega\|_{A_p}$. But this is not yet known in general. It turns out that this would settle the borderline cases $q = 2K/(K + 1)$ in Theorem 21.

Proposition 22. Let $1 < K < \infty$, and assume that the Beurling transform satisfies the bounds

$$\|T\phi\|_{L^p_\omega} \le C(p)\,\|\omega\|_{A_p}\,\|\phi\|_{L^p_\omega} \tag{45}$$

for all $\omega \in A_p$ and $\phi \in L^p_\omega$, with a constant $C(p)$ depending only on $p$, $1 < p < \infty$. Then every weakly $K$-quasi-regular mapping $f \in W^{1,q}_{\mathrm{loc}}$ with $q = 2K/(K + 1)$ is quasi-regular.

Proof. Let $q < 2$. We begin with a duality argument. One has to show that if (45) is true, then $I - \mu T$ or $I - T\mu = T(I - \mu T)T^{-1}$ is injective on $L^q(\mathbb{C})$ for all $\mu$ with $q = 1 + \|\mu\|_\infty$. On the other hand, this is equivalent to showing that the transpose operator $(I - T\mu)' = I - \mu T : L^p(\mathbb{C}) \to L^p(\mathbb{C})$ has dense range in the weak topology of $L^p$, $1/q + 1/p = 1$.

Therefore fix $\phi \in C_0^\infty(\mathbb{C})$. According to Theorem 3, $\phi_\varepsilon = (I - (1 - \varepsilon)\mu T)^{-1}\phi \in L^p$ for each $0 < \varepsilon < 1$. Furthermore, $\phi_\varepsilon - \mu T\phi_\varepsilon = \phi - \varepsilon\mu T\phi_\varepsilon$. By the density of $C_0^\infty$ in $L^p$, it is enough to show that our assumptions imply that $\varepsilon\phi_\varepsilon$ tends weakly to zero as $\varepsilon \to 0$. For this we claim that if one has the linear dependence (45) on the $A_p$-norm, then one obtains

$$\big\|\big(I - (1 - \varepsilon)T\mu\big)^{-1}\big\|_{L^p(\mathbb{C}) \to L^p(\mathbb{C})} \le \frac{C(q)}{\varepsilon} \quad \text{for all small } \varepsilon > 0. \tag{46}$$
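If the estimate (46) could be proven, then it would give the uniform bound

$$\|\varepsilon\phi_\varepsilon\|_p \le C(p)\,\|\phi\|_p. \tag{47}$$

On the other hand, since $\|(1 - \varepsilon)T\mu\|_{L^2} \le (1 - \varepsilon)\|\mu\|_\infty = (1 - \varepsilon)/(p - 1)$,

$$\|\phi_\varepsilon\|_2 \le \frac{p - 1}{p - 2}\,\|\phi\|_2.$$

This shows that $\varepsilon\phi_\varepsilon \to 0$ in $L^2$, and combined with the uniform bound (47) it also implies that $\varepsilon\phi_\varepsilon \to 0$ weakly in $L^p(\mathbb{C})$.

Thus to conclude the proof we have to show that assumption (45) implies (46). This takes us back to our proof of Theorem 3; we must check how the quasi-conformal estimates there reflect on the norm estimates of the Beltrami operators. Indeed, let $f$ be the quasi-conformal homeomorphism with dilatation $(1 - \varepsilon)\mu$, and let $\omega_\varepsilon$ be the weight $\omega_\varepsilon = |f_z \circ f^{-1}|^{p - 2}$. According to Remark 15, the norm of the operator $(I - (1 - \varepsilon)\mu T)^{-1} : L^p(\mathbb{C}) \to L^p(\mathbb{C})$ is bounded by $2K^2\lambda_{\omega_\varepsilon}$, where $\lambda_{\omega_\varepsilon} = \|T\|_{L^p_{\omega_\varepsilon}}$ is the norm of the Beurling transform on $L^p_{\omega_\varepsilon}$. But as $\|\mu\|_\infty = 1/(p - 1)$, Theorem 12 shows that

$$\|\omega_\varepsilon\|_{A_p} \le \|\omega_\varepsilon\|_{A_2} \le \frac{pC(K)}{\varepsilon}, \qquad K = \frac{1 + \|\mu\|_\infty}{1 - \|\mu\|_\infty}.$$

Thus (46) is indeed a consequence of the linear dependence of $\|T\|_{L^p_{\omega_\varepsilon}}$ on $\|\omega_\varepsilon\|_{A_2}$.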
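The $1/\varepsilon$ rate in the last step comes from the denominator $1 + 1/k - p$ in Theorem 12, evaluated at the dilatation $k_\varepsilon = (1 - \varepsilon)/(p - 1)$ of $(1 - \varepsilon)\mu$: one finds $1 + 1/k_\varepsilon - p = (p - 1)\varepsilon/(1 - \varepsilon)$. A short exact-arithmetic check, in which the function name `theorem12_gap` is an illustrative assumption:

```python
from fractions import Fraction


def theorem12_gap(p, eps):
    """Return 1 + 1/k_eps - p for k_eps = (1 - eps)/(p - 1), with p > 2, 0 < eps < 1."""
    p, eps = Fraction(p), Fraction(eps)
    if not (p > 2 and 0 < eps < 1):
        raise ValueError("need p > 2 and 0 < eps < 1")
    k_eps = (1 - eps) / (p - 1)
    return 1 + 1 / k_eps - p


if __name__ == "__main__":
    p = Fraction(3)
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        gap = theorem12_gap(p, eps)
        # the gap is comparable to eps, so p*C(K)/gap grows like 1/eps
        print(eps, gap, gap / eps)
```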
Remarks. (a) The best general result known in the direction of (45) is due to S. Buckley [8] and yields for $\|T\|_{L^p_\omega}$ bounds of power type with exponent strictly bigger than 1. He also gave examples with a linear dependence on $\|\omega\|_{A_p}$. (b) The missing borderline case in the regularity of weakly quasi-regular mappings would also follow from the conjecture at (8). In fact, one only has to note that the injectivity of $I - \mu T$ on $L^q(\mathbb{C})$, $q = 1 + \|\mu\|_\infty$, is a consequence of (46) and that clearly the conjecture at (8) implies (46).

As a last related aspect on weak quasiregularity, we see that in the VMO-setting one again obtains strong regularity consequences; Theorem 5 and Lemma 19 have the following immediate corollary.

Corollary 23. Assume that $q > 1$ and $f \in W^{1,q}_{\mathrm{loc}}(\Omega)$ has dilatation $\mu$, and assume that $f$ is weakly quasi-regular. If $\mu$ is in $\mathrm{VMO}(\Omega)$, then $f$ is quasi-regular.

Let us then turn to the other aspects of injectivity of Beltrami operators. The case of the higher borderline exponent $p = 1 + 1/\|\mu\|_\infty$ reduces to Theorem 16.
Proposition 24. Suppose that $p > 2$ and $f : \mathbb{C} \to \mathbb{C}$ solves the equation $f_{\bar z} - \mu_1 f_z - \mu_2 \overline{f_z} = 0$, where $\big\||\mu_1| + |\mu_2|\big\|_\infty \le 1/(p-1)$. If both $f_z$, $f_{\bar z}$ are in $L^p(\mathbb{C})$, then $f$ is a constant.

Proof. Since $f$ satisfies the Beltrami equation $f_{\bar z} - \mu f_z = 0$ with $\mu = \mu_1 + (\overline{f_z}/f_z)\mu_2$, $\|\mu\|_\infty \le 1/(p-1)$, the outer dilatation function of $f$ is bounded by $K = p/(p-2)$. Then, as $p = 2K/(K-1)$, Theorem 16 implies the claim.

Next, the functions $f$ in (44) established noninjectivity for $p < 1 + \|\mu\|_\infty$ and $I - \mu T$. Similarly, if we let
$$g(z) = z\chi_D + z|z|^{-2k/(k+1)}(1 - \chi_D), \tag{48}$$
$D = \{z : |z| < 1\}$, then $g_z, g_{\bar z} \in L^p(\mathbb{C})$ for $p > 1 + (1/k)$ and $(I - \mu_1 T)g_{\bar z} = 0$ for $\mu_1 = -k(z/\bar z)\chi_{D^c}$. Combining these facts we conclude with the following proof.

Proof of Theorem 6. Theorem 16 and Proposition 24 show that $I - \mu_1 T - \mu_2 \overline{T}$ is injective on $L^p(\mathbb{C})$ if $1 + k < p \le 1 + 1/k$, $k = \big\||\mu_1| + |\mu_2|\big\|_\infty$. The noninjectivity part was established in (44) and (48).

Acknowledgments. The authors wish to thank Gaven Martin and Pekka Koskela for helpful discussions on the topics of this paper.

Note added in proof. St. Petermichl and A. Volberg [29] have been able to solve the question stated at the end of Section 1. They proved bound (45): for any $A_p$-weight $w$ the norm $\|T\|_{L^p_w}$ depends linearly on $\|w\|_{A_p}$, $p \ge 2$. By Proposition 22, this shows that the Beltrami operators $I - \mu T$ are injective on $L^q(\mathbb{C})$ for $q = 1 + \|\mu\|_\infty$. In a related work, F. Nazarov and A. Volberg [28] proved that the norm of the Beurling operator on $L^p(\mathbb{C})$ satisfies $\|T\|_{L^p(\mathbb{C}) \to L^p(\mathbb{C})} \le 2\max\{p - 1, 1/(p-1)\}$.
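The example (48) used in the proof of Theorem 6 can be checked numerically. The sketch below is only an illustration, not part of the paper: the function names are hypothetical, and the Wirtinger derivatives are approximated by central differences. It estimates the ratio $g_{\bar z}/g_z$ and compares it with $-k\,z/\bar z$ outside the unit disk and with $0$ inside.

```python
def g(z, k):
    """Example (48): the identity on the unit disk D, z*|z|**(-2k/(k+1)) outside."""
    if abs(z) < 1:
        return z
    return z * abs(z) ** (-2 * k / (k + 1))


def wirtinger(fn, z, h=1e-6):
    """Central-difference estimates of the Wirtinger derivatives (f_z, f_zbar) at z."""
    fx = (fn(z + h) - fn(z - h)) / (2 * h)            # partial derivative in x
    fy = (fn(z + 1j * h) - fn(z - 1j * h)) / (2 * h)  # partial derivative in y
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


def beltrami_coefficient(z, k):
    """Numerical ratio g_zbar / g_z at z (z should stay away from the circle |z| = 1)."""
    gz, gzbar = wirtinger(lambda w: g(w, k), z)
    return gzbar / gz


if __name__ == "__main__":
    k = 0.5
    for z in (2 + 1j, -3 + 0.5j, 0.3 + 0.2j):
        expected = -k * z / z.conjugate() if abs(z) > 1 else 0
        print(z, beltrami_coefficient(z, k), expected)
```

Outside the disk the computed coefficient has modulus $k$, which matches the role of $k$ as the $L^\infty$ bound in Theorem 6.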
References
[1] L. V. AHLFORS and L. BERS, Riemann’s mapping theorem for variable metrics, Ann. of Math. (2) 72 (1960), 385–404.
[2] L. V. AHLFORS and A. BEURLING, Conformal invariants and function-theoretic null-sets, Acta Math. 83 (1950), 101–129.
[3] K. ASTALA, Area distortion of quasiconformal mappings, Acta Math. 173 (1994), 37–60.
[4] R. BAÑUELOS and G. WANG, Sharp inequalities for martingales and applications to the Beurling-Ahlfors and Riesz transforms, Duke Math. J. 80 (1995), 575–600.
[5] B. V. BOJARSKI, Homeomorphic solutions of Beltrami systems (in Russian), Dokl. Akad. Nauk. SSSR (N.S.) 102 (1955), 661–664.
[6] B. V. BOJARSKI, “Quasiconformal mappings and general structure properties of systems of nonlinear equations elliptic in the sense of Lavrentiev” in Convegno sulle Transformazioni Quasiconformi e Questioni Connesse (Rome, 1974), Sympos. Math. 18, Academic Press, London, 1976, 485–499.
[7] B. BOJARSKI and T. IWANIEC, Quasiconformal mappings and non-linear elliptic equations in two variables I, II, Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 22 (1974), 473–478, 479–484.
[8] S. BUCKLEY, Estimates for operator norms on weighted spaces and reverse Jensen inequalities, Trans. Amer. Math. Soc. 340 (1993), 253–272.
[9] R. R. COIFMAN and C. FEFFERMAN, Weighted norm inequalities for maximal functions and singular integrals, Studia Math. 51 (1974), 241–250.
[10] F. W. GEHRING and E. REICH, Area distortion under quasiconformal mappings, Ann. Acad. Sci. Fenn. Ser. A I Math. 388 (1966), 1–15.
[11] J. HEINONEN, T. KILPELÄINEN, and O. MARTIO, Nonlinear Potential Theory of Degenerate Elliptic Equations, Oxford Math. Monogr., Oxford Univ. Press, New York, 1993.
[12] T. IWANIEC, “Quasiconformal mapping problem for general nonlinear systems of partial differential equations” in Convegno sulle Transformazioni Quasiconformi e Questioni Connesse (Rome, 1974), Sympos. Math. 18, Academic Press, London, 1976, 501–517.
[13] T. IWANIEC, Extremal inequalities in Sobolev spaces and quasiconformal mappings, Z. Anal. Anwendungen 1 (1982), 1–16.
[14] T. IWANIEC, The best constant in a BMO-inequality for the Beurling-Ahlfors transform, Michigan Math. J. 33 (1987), 387–394.
[15] T. IWANIEC, Hilbert transform in the complex plane and area inequalities for certain quadratic differentials, Michigan Math. J. 34 (1987), 407–434.
[16] T. IWANIEC, “$L^p$-theory of quasiregular mappings” in Quasiconformal Space Mappings, Lecture Notes in Math. 1508, Springer, Berlin, 1992.
[17] T. IWANIEC, P. KOSKELA, and G. MARTIN, Mappings of BMO-bounded distortion and Beltrami-type operators, preprint, Univ. of Jyväskylä, Jyväskylä, Finland, 1998.
[18] T. IWANIEC and A. MAMOURIAN, “On the first-order nonlinear differential systems with degeneration of ellipticity” in Proceedings of the Second Finnish-Polish Summer School in Complex Analysis (Jyväskylä, 1983), Bericht 28, Univ. Jyväskylä, Jyväskylä, Finland, 1984, 41–52.
[19] T. IWANIEC and G. MARTIN, Quasiregular mappings in even dimensions, Acta Math. 170 (1993), 29–81.
[20] T. IWANIEC and G. MARTIN, Riesz transforms and related singular integrals, J. Reine Angew. Math. 473 (1996), 25–57.
[21] S. JANSON, Characterizations of $H^1$ by singular integral transforms on martingales and $\mathbb{R}^n$, Math. Scand. 41 (1977), 140–152.
[22] M. A. LAVRENTIEV, A general problem of the theory of quasi-conformal representation of plane regions (in Russian), Mat. Sb. 21 (1947), 285–320.
[23] M. A. LAVRENTIEV, The fundamental theorem of the theory of quasi-conformal mappings of two-dimensional domains, Izvestia Acad. Sc. USSR 12 (1948).
[24] O. LEHTO, “Quasiconformal mappings and singular integrals” in Convegno sulle Transformazioni Quasiconformi e Questioni Connesse (Rome, 1974), Sympos. Math. 18, Academic Press, London, 1976, 429–453.
[25] O. LEHTO and K. VIRTANEN, Quasiconformal Mappings in the Plane, 2d ed., Grundlehren Math. Wiss. 126, Springer, New York, 1973.
[26] C. B. MORREY, On the solution of quasi-linear elliptic partial differential equations, Trans. Amer. Math. Soc. 43 (1938), 126–166.
[27] S. MÜLLER, T. QI, and B. S. YAN, On a new class of elastic deformations not allowing for cavitation, Ann. Inst. H. Poincaré Anal. Non Linéaire 11 (1994), 217–243.
[28] F. NAZAROV and A. VOLBERG, Heating of the Beurling operator and the estimates of its norm, preprint, 2000.
[29] ST. PETERMICHL and A. VOLBERG, Heating of the Beurling operator and the critical exponents for Beltrami equation, preprint, 2000.
[30] E. REICH, Some estimates for the two-dimensional Hilbert transform, J. Analyse Math. 18 (1967), 279–293.
[31] H. M. REIMANN and T. RYCHENER, Funktionen beschränkter mittlerer Oszillation, Lecture Notes in Math. 487, Springer, Berlin, 1975.
[32] E. M. STEIN, Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals, Princeton Math. Ser. 43, Princeton Univ. Press, Princeton, 1993.
[33] A. UCHIYAMA, On the compactness of operators of Hankel type, Tôhoku Math. J. (2) 30 (1978), 163–171.
Astala: Department of Mathematics, University of Jyväskylä, FIN-00014 Jyväskylä, Finland
Iwaniec: Department of Mathematics, Syracuse University, Syracuse, New York 13244, USA
Saksman: Department of Mathematics, University of Helsinki, Post Office Box 4 (Yliopistonkatu 5), FIN-00014 Helsinki, Finland; eero.saksman@helsinki.fi
GEOMETRY OF STATIONARY SETS FOR THE WAVE EQUATION IN $\mathbb{R}^n$: THE CASE OF FINITELY SUPPORTED INITIAL DATA
MARK L. AGRANOVSKY and ERIC TODD QUINTO

DUKE MATHEMATICAL JOURNAL, Vol. 107, No. 1, © 2001. Received 6 July 1999. Revision received 5 April 2000. 2000 Mathematics Subject Classification. Primary 35L05, 44A12, 35S30. Agranovsky’s work partially supported by grant number 408/97-3 from the Academy of Sciences of Israel. Quinto’s work partially supported by United States National Science Foundation grant numbers 9622947 and 9877155.

Abstract
We consider the Cauchy problem for the wave equation in the whole space $\mathbb{R}^n$, with initial data that are distributions supported on finite sets. The main result is a precise description of the geometry of the sets of stationary points of the solutions to the wave equation.

0. Introduction
The goal of this article is to clarify the structure of stationary sets of solutions to the wave equation (1.1) with initial data $f$ a distribution supported on a finite set in $\mathbb{R}^n$. Stationary sets are sets of points $x \in \mathbb{R}^n$ for which the solution to the wave equation is always zero. The problem has been solved in the plane (see [AQ1], [AQ2]) when $f$ is an arbitrary distribution of compact support (the theorems in [AQ1] and [AQ2] are stated for functions of compact support, but they are valid for $\mathcal{E}'(\mathbb{R}^2)$ and the proofs are the same), but not much is known in general. For distributions in the plane, the stationary sets have a very restrictive structure; they must be the union of a finite set and a Coxeter system of lines (lines through one point generated by a finite rotation group). This Coxeter set is contained in a translate of the zero set of a homogeneous harmonic polynomial, and it is conical about the point of intersection. Loosely speaking, we prove that a similar pattern occurs for finitely supported distributions in $\mathbb{R}^n$. Characterizing stationary sets in $\mathbb{R}^n$ for $n > 2$ is more difficult, and only partial results are known. It is known for compactly supported initial data in $\mathbb{R}^n$ that stationary sets are contained in the union of the zero sets of harmonic polynomials and algebraic
varieties of lower dimension, and it is conjectured that the harmonic polynomial can be assumed to be a translate of a homogeneous polynomial (see [AQ1], [AQ2]). It is shown in [ABK] for $f$ sufficiently integrable at infinity that stationary sets cannot have bounded closed components. In [A] and [AVZ] more precise analyses are given in $\mathbb{R}^n$ for stationary sets of lower dimension and conical stationary sets for $f$ with arbitrary growth.

1. Formulation of the problem and main results
We use the following notation:
$\mathcal{D}(\mathbb{R}^n)$: the space of all $C^\infty$-functions with compact support;
$\mathcal{D}'(\mathbb{R}^n)$: the space of distributions;
$\mathcal{E}'(\mathbb{R}^n)$: the space of compactly supported distributions;
$C^\infty_{\mathrm{rad}}(\mathbb{R}^n)$ and $\mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$: the subspaces of corresponding spaces, consisting of radial functions $f$, that is, $f(x) = f(|x|)$;
$\mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$: the space of all distributions in $\mathcal{E}'(\mathbb{R}^n)$ supported on finite sets;
$\mathbb{R}_+ = (0, \infty)$.

Let us consider the Cauchy problem for the wave equation in $\mathbb{R}^n$, $n \ge 2$:
$$u_{tt} = \Delta u, \quad u = u(x,t),\ x \in \mathbb{R}^n,\ t > 0, \qquad u|_{t=+0} = 0, \quad u_t|_{t=+0} = f, \tag{1.1}$$
with initial data $f \in \mathcal{D}'(\mathbb{R}^n)$. After extending $u$ to the half-space $t < 0$ by zero, the Cauchy problem (1.1) is equivalent to the equation $u_{tt} = \Delta u + \delta(t)f$ having a unique solution $u \in \mathcal{D}'(\mathbb{R}^{n+1})$, which is a $C^\infty$-function in the $t$-variable.

Definition 1.1. Let $f \in C^\infty(\mathbb{R}^n)$, and let $u$ be the (classical) solution for (1.1). Define the stationary set $S(f)$ as the set of time-invariant zeros of the solution $u$:
$$S(f) = \{x \in \mathbb{R}^n : u(x,t) = 0,\ t > 0\}. \tag{1.2}$$
To extend this definition to distributional solutions, we use regularization. Namely, if $f \in \mathcal{D}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$, the convolution $f * \varphi$ is smooth and $u * \varphi$ (convolution with respect to $x$) is in $C^\infty$ and solves (1.1) for the data $f * \varphi$.

Definition 1.2. For $f \in \mathcal{D}'(\mathbb{R}^n)$ define
$$S(f) = \bigcap_{\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)} S(f * \varphi), \tag{1.3}$$
where $S(f * \varphi)$ is defined by (1.2). For regular $f$, Definition 1.2 is consistent with Definition 1.1, as shown in Lemma 2.2. In fact, the set of all $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ in (1.3) can be replaced by a $\delta$-sequence $\varphi_n$, $\varphi_n \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$. The main question under consideration is the following problem.

Problem. Which sets $S \subset \mathbb{R}^n$ are stationary sets, $S = S(f)$, for some $f \in \mathcal{D}'(\mathbb{R}^n)$ ($f \in \mathcal{E}'(\mathbb{R}^n)$)?

Stationary sets characterize, to a certain extent, wave propagation. An important property is that any domain $\Omega \subset \mathbb{R}^n$, bounded by a stationary set, preserves energy; that is, $E(t) = \int_\Omega (u_t^2 + |\nabla u|^2)\,dx = \mathrm{const}$. In [AQ1] and [AQ2] this problem was solved in full for $n = 2$ (the membrane equation) and $f \in \mathcal{E}'(\mathbb{R}^2)$. There it was discovered that the stationary sets for this case are unions of finite sets and equiangular families of straight lines through one point (Coxeter system of lines).

In this article we describe stationary sets for the case of initial data with finite support, for arbitrary dimension $n$. We prove that, up to a low-dimensional component, the stationary sets are affine algebraic cones with a special geometry. A cone is understood to be a union of straight lines with a common point which is the vertex of the cone. We call a cone $K \subset \mathbb{R}^n$ $k$-flat with edge $L$, where $L$ is a $k$-dimensional plane in $\mathbb{R}^n$, if $K$ is a union of $(k+1)$-planes containing $L$. A union $\Sigma = H_1 \cup \cdots \cup H_q$ of hyperplanes $H_i \subset \mathbb{R}^n$ is called a Coxeter system of hyperplanes if $\Sigma$ is invariant with respect to any reflection $\sigma_i$ around the hyperplane $H_i$, $i = 1, \ldots, q$. The Coxeter group generated by the reflections $\sigma_1, \ldots, \sigma_q$ is denoted by $W(\Sigma)$. We call a polynomial $P$ in $\mathbb{R}^n$ with real coefficients a harmonic divisor if $P$ divides a nonzero harmonic polynomial. Zero sets of homogeneous harmonic divisors are called harmonic cones. For any set $F \subset \mathbb{R}^n$, the affine subspace spanned by $F$ is denoted by $\operatorname{span} F$. Our main result is the following theorem.

Theorem 1.3. Let $f$ be a distribution supported on a finite set, $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$, $f \ne 0$. If $S(f) \ne \emptyset$, then
(a) $S(f)$ is an algebraic variety in $\mathbb{R}^n$, contained in the zero set of a nonzero harmonic polynomial.
(b) After a suitable translation, the set $S(f)$ can be represented in the form $S(f) = S_0 \cup V$, where $V$ is an algebraic variety of $\operatorname{codim} V > 1$ and where $S_0$, assuming it is nonempty, is a harmonic cone, which is an $(n-1)$-dimensional real algebraic variety.
In addition, the following are true.
(c) The conical component $S_0$, in general, has the two components $S_0 = \Sigma \cup K$, where $K$ contains $\operatorname{supp} f$ but $\Sigma$ does not, $\Sigma \cap K \ne \emptyset$ provided both $\Sigma$ and $K$ are nonempty, $\Sigma$ is a Coxeter system, and $K$ is a $k$-flat harmonic cone with the edge $L = \operatorname{span}(\operatorname{supp} f)$, $k = \dim L \le n$. If $\operatorname{supp} f$ is a generic set, that is, if $k = n$, then $K = \emptyset$. If $k = n - 1$, then $K$ is a hyperplane and $\Sigma \cup K$ is a Coxeter system.
(d) If $\tilde\Sigma$ is the union of all hyperplanes contained in $S_0$, then $\tilde\Sigma$ is again a Coxeter system; the distribution $f$ is odd with respect to any reflection $\sigma \in W(\tilde\Sigma)$; that is, $f \circ \sigma = -f$; the sets $S_0$, $V$, $S(f)$, and $\operatorname{supp} f$ are $W(\tilde\Sigma)$-invariant.

Theorem 1.3 says that finite sets of point sources generate stationary sets that are necessarily algebraic varieties and that are either small (empty or low-dimensional) or, up to a low-dimensional component, are $(n-1)$-dimensional cones that, suitably translated, are determined by zeros of spatial harmonics. The geometry of the essential, conical part is as follows. If the set of points in $\operatorname{supp} f$ is generic, then the cone is a Coxeter system of hyperplanes. These stationary sets may appear only as a result of a Coxeter skew-symmetry of the initial data. If $\operatorname{supp} f$ lies in a proper affine subspace in $\mathbb{R}^n$, then another component may appear that is a cone containing $\operatorname{supp} f$. In the plane, for any compactly supported $f$, $S(f)$ is, up to a finite set, a Coxeter system of lines (see [AQ1]). However, in the plane the set $K$ in Theorem 1.3 would be a collection of lines and therefore a Coxeter system by (d).
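The planar picture behind Theorem 1.3 can be illustrated with a small check. The sketch below is an assumption-laden illustration, not the paper's construction: lines through the origin are represented by their angles modulo $\pi$, all function names are hypothetical, and the harmonic polynomial used is $\operatorname{Im}(x_1 + i x_2)^N$, whose zero set consists of $N$ equiangular lines. The code tests invariance of a family of lines under reflection in each of its members, which is the defining property of a Coxeter system.

```python
import math


def coxeter_angles(n_lines):
    """Angles (mod pi) of n_lines equiangular lines through the origin."""
    return [j * math.pi / n_lines for j in range(n_lines)]


def reflect_angle(beta, alpha):
    """Angle (mod pi) of the image of the line at angle beta under
    reflection in the line at angle alpha."""
    return (2 * alpha - beta) % math.pi


def is_coxeter_system(angles, tol=1e-9):
    """True if the union of the lines is invariant under reflection in each line."""
    def contains(a):
        # distance between angles measured modulo pi
        return any(min(abs(a - b), math.pi - abs(a - b)) < tol for b in angles)
    return all(contains(reflect_angle(b, a)) for a in angles for b in angles)


def harmonic_value(theta, n):
    """Value of the harmonic polynomial Im((x1 + i*x2)**n) at the unit vector with angle theta."""
    return (complex(math.cos(theta), math.sin(theta)) ** n).imag


if __name__ == "__main__":
    lines = coxeter_angles(5)
    print(is_coxeter_system(lines))                       # equiangular family
    print(is_coxeter_system([0.0, 0.3]))                  # generic pair of lines
    print(max(abs(harmonic_value(t, 5)) for t in lines))  # close to zero
```

A generic pair of lines fails the test because reflecting one line in the other produces a third line outside the family.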
A known problem in studying the wave equation is characterizing nodal sets (see [CH, Vol. I, Chap. 5, Sec. 5]), that is, zero sets of eigenfunctions of the Laplace operator or, equivalently, zero sets of time-harmonic solutions of the wave equation. This problem has been studied by many authors. Results on this subject mainly say that nodal sets are hypermanifolds with singularities and that the eigenfunctions cannot
vanish to high order on the nodal sets (see, e.g., [DF1], [DF2], [Ch], [Ba], [B1], [B2], and others). The problem under consideration has some relation to the description of nodal sets. Indeed, extending the solution $u(x,t)$ of (1.1) for $t < 0$ by $u(x,t) = -u(x,-t)$ and applying Fourier transform in $t$ to both sides of (1.1) yields $-\lambda^2 v(x,\lambda) = \Delta v(x,\lambda)$, where $v(x,\lambda)$ is the Fourier transform evaluated at arbitrary $\lambda \in \mathbb{R}$. Thus the stationary set $S(f)$ in (1.2) is just the intersection of nodal sets of all the eigenfunctions $v(\cdot,\lambda)$ which are, since the initial data $f$ has compact support, nonzero for an infinite number of values of $\lambda$. Therefore one deals with intersections of 1-parameter families of nodal sets, and one can expect more precise geometric descriptions of such sets. Indeed, let us compare our results with some known results on nodal sets. In dimension 2 (vibrating membrane), the structure of nodal sets is well understood. They consist of smooth arcs, called nodal lines, and isolated singular points where these arcs meet. Moreover, the arcs meeting at singular points form an equiangular configuration (see [B1] for the proof). The above mentioned results of [AQ1] and [AQ2] show that in the case of an infinite membrane and compactly supported initial velocity, the intersections of nodal sets that make up stationary sets are just straight lines with the equiangular configuration. Now, it is known that the nodal set of an eigenfunction of the Laplace-Beltrami operator on an $n$-dimensional Riemannian manifold consists of a smooth hypersurface and a singular part of dimension less than or equal to $n - 2$ (see [Ba], [Ch], [BM], [HS]). Theorem 1.3 specifies that, in the case of $\mathbb{R}^n$ and finitely supported initial data, the hypersurface part of the stationary set (intersection of nodal sets of eigenfunctions in the spectrum of the initial data) is an affine cone defined by zeros of a harmonic divisor. The plan of the present article is as follows.
Sections 2–6 are devoted to the proof of Theorem 1.3. First, in Section 2 we reduce the characterization of stationary sets to the investigation of injectivity sets for the spherical transform, properly defined on distributions. In Section 3 we study the microlocal properties of the spherical Radon transform and prove a support theorem for this transform. This is the key to establishing geometric symmetry conditions on $S(f)$ relative to $\operatorname{supp} f$. In Sections 4 and 5 we study the algebraic and geometric structure of the stationary set, and key points here are the properties of the zero sets of harmonic polynomials. In Section 6 we combine the information we have obtained and prove Theorem 1.3. In Section 7 we give sufficient conditions for a set in $\mathbb{R}^n$ to be a stationary set $S(f)$ for some $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$. Finally, in Section 8 we describe the geometry of $S(f)$ when $\operatorname{supp} f$ is a finite union of balls.
This article continues a series of works (see [AQ1], [AQ2], [AQ3], [ABK], [A], [AVZ], [AR]) started by [AQ1] and devoted to the description of injectivity sets for the spherical transform, stationary sets for the wave and heat equations, and related problems. Our initial interest in the problem was motivated by a problem in approximation theory posed in [LP] (cf. [AQ1], [AQ2]).

2. O(n)-averaging and the spherical transform
Let $f \in \mathcal{D}'(\mathbb{R}^n)$. For any element $k$ of the orthogonal group $O(n)$, the composition $f \circ k$ is well defined. Denote the $O(n)$-average of $f$ by
$$f^\# = \int_{k \in O(n)} (f \circ k)\,dk,$$
where $dk$ is the Haar measure on the group $O(n)$. Clearly, for any test function $\varphi \in \mathcal{D}(\mathbb{R}^n)$, the equation $\langle f^\#, \varphi\rangle = \langle f, \varphi^\#\rangle$ holds, and this can be taken as the definition of $f^\#$. Denote by $\tau_x$, $x \in \mathbb{R}^n$, the translation $\tau_x f(y) = f(x + y)$, and let $f_x^\# = (\tau_x f)^\#$. The operation $f_x^\#$ is just the radialization of $f$ with respect to the rotation group isotropic about $x$. Let $f \in C(\mathbb{R}^n)$. Define the spherical transform $C(\mathbb{R}^n) \ni f \mapsto \hat f \in C(\mathbb{R}^n \times \mathbb{R}_+)$ by
$$\hat f(x,r) = Rf(x,r) = \int_{|\theta|=1} f(x + r\theta)\,dA(\theta), \tag{2.1}$$
where $dA$ is the normalized surface measure on the unit sphere. Let $S(x,r) = \{y \in \mathbb{R}^n : |x - y| = r\}$ be the sphere centered at $x$ and of radius $r$. Then $Rf(x,r)$ is just the average of $f$ over $S(x,r)$. Because the spherical transform is a Fourier integral operator, it can be defined on distributions. It is easy to see that $f_x^\#$ and $\hat f$ are related by $f_x^\#(y) = \hat f(x, |y|)$.

Lemma 2.1. Let $f \in C(\mathbb{R}^n)$, and let $S(f)$ be defined by (1.2). Then
$$S(f) = \{x \in \mathbb{R}^n : \hat f(x,r) = 0 \text{ for all } r > 0\}$$
and also
$$S(f) = \{x \in \mathbb{R}^n : (f * u)(x) = 0 \text{ for all } u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)\}. \tag{2.2}$$
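The first characterization in Lemma 2.1 can be tried out numerically in the plane. The sketch below is illustrative only: the quadrature rule, the sample function `f_odd`, and all names are assumptions made for this example. It approximates the spherical mean (2.1) by an equally spaced rule on the circle and shows that, for a function odd with respect to the line $\{y_1 = 0\}$, the means over circles centered on that line vanish, while a center off the line gives a nonzero mean.

```python
import math


def spherical_mean(f, x, r, m=720):
    """Approximate (2.1) in the plane: the average of f over the circle S(x, r),
    using m equally spaced nodes (m even keeps the nodes mirror-symmetric)."""
    total = 0.0
    for j in range(m):
        t = 2 * math.pi * j / m
        total += f(x[0] + r * math.cos(t), x[1] + r * math.sin(t))
    return total / m


def f_odd(y1, y2):
    """Sample function (hypothetical), odd under the reflection y1 -> -y1."""
    return y1 * math.exp(-(y1 ** 2 + (y2 - 1.0) ** 2))


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        # center on the line of symmetry: means vanish up to rounding
        print(r, spherical_mean(f_odd, (0.0, 0.3), r))
    # center off the line: the mean is generally nonzero
    print(spherical_mean(f_odd, (0.5, 0.0), 1.0))
```

In the language of the lemma, every point of the line $\{y_1 = 0\}$ satisfies $\hat f(x, r) = 0$ for the sampled radii, which is consistent with reflection hyperplanes of odd data lying in $S(f)$.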
Proof. The right-hand sides in both formulas are the same. Indeed, we have in polar coordinates:
$$(f * u)(x) = \int_{\mathbb{R}^n} f(x - y)u(y)\,dy = \int_0^\infty \int_{|\theta|=1} f(x + r\theta)u(r\theta)r^{n-1}\,dr\,d\theta = \int_0^\infty \hat f(x,r)u(r)r^{n-1}\,dr.$$
Therefore $\hat f(x,\cdot) \equiv 0$ implies $(f * u)(x) = 0$. The opposite implication can be obtained by choosing a sequence $u_m(r)$ tending to $\delta(r - r_0)$; $r_0 > 0$ is an arbitrary fixed number. Thus, it suffices to prove the first identity. The Poisson-Kirchhoff formula (see, e.g., [He, Chap. I, Sec. 5.2, 7]) for the solution $u(x,t)$ of the Cauchy problem (1.1) yields
$$u(x,t) = \mathrm{const}\,(\partial_t)^{n-2}F(x,t),$$
where
$$F(x,t) = \int_0^t (t^2 - r^2)^{(n-3)/2}\, r\hat f(x,r)\,dr. \tag{2.3}$$
Therefore $\hat f(x,\cdot) \equiv 0$ implies $u(x,t) = 0$, $t > 0$; that is, $x \in S(f)$. Let us prove that the converse is also true. First, we show $F(x,t) = 0$ for all $t > 0$. For $n = 2$ we have $u(x,t) = \int_0^t (t^2 - r^2)^{-1/2} r\hat f(x,r)\,dr$, and hence $u(x,t) = 0$, $t > 0$, implies $\hat f(x,r) = 0$, $r \in \mathbb{R}_+$, due to invertibility of the Abel transform. If $n > 2$, then $u(x,t) = 0$, $t \in \mathbb{R}_+$, implies that $F(x,t)$ is a polynomial (in the $t$-variable) of degree less than $n - 2$. The change of variables $r = ts$ yields
$$F(x,t) = t^{n-1}\int_0^1 (1 - s^2)^{(n-3)/2}\, s\hat f(x,ts)\,ds,$$
and hence $F(x,t) = O(t^{n-1})$, $t \to 0$. As $\deg F < n - 2$, this is possible only if $F(x,t) = 0$ for all $t$. Since (2.3) can be reduced to an invertible Abel or Volterra equation of the first kind by successive differentiations, $\hat f = 0$ (see, e.g., [Q2, Th. A and B, Lem. 3.3, and p. 516] for this reduction).

Lemma 2.2. For $f \in C(\mathbb{R}^n)$, Definitions 1.1 and 1.2 coincide; that is,
$$S(f) = \bigcap_{\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)} S(f * \varphi).$$
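Proof. If $x \in S(f)$ according to Definition 1.1, then $(f * u)(x) = 0$ for any $u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$, due to Lemma 2.1. Let $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ be arbitrary. Then $((f * \varphi) * u)(x) = (f * (\varphi * u))(x) = 0$ because $\varphi * u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ as it is the convolution of two radial functions from $\mathcal{D}(\mathbb{R}^n)$. Again, by Lemma 2.1, we have $x \in S(f * \varphi)$. So $x \in S(f)$ in the sense of Definition 1.2. Conversely, if $x \in S(f * \varphi)$ for any $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$, then, by Lemma 2.1, $(f * \varphi * u)(x) = 0$ for all $u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$. Taking a $\delta$-sequence $\varphi_n$, we obtain $(f * u)(x) = 0$, and this means $x \in S(f)$ due to Lemma 2.1 and Definition 1.1.

Now we turn to the characterization of the set $S(f)$ in terms of $O(n)$-averaging, for the case $f \in \mathcal{D}'(\mathbb{R}^n)$.

Lemma 2.3. Let $f \in \mathcal{D}'(\mathbb{R}^n)$. Then
$$S(f) = \{x \in \mathbb{R}^n : f_x^\# = 0\}$$
and
$$S(f) = \{x \in \mathbb{R}^n : (f * u)(x) = 0,\ u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)\}.$$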
Proof. The second formula can be easily derived from the first one by verifying the identity on test functions. Let us prove the first formula. The following identity can be easily verified by acting on test functions:
$$(f * u)^\#_x = f_x^\# * u, \qquad u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n). \tag{2.4}$$
Now, if $x \in S(f)$, then by Definition 1.2, $x \in S(f * \varphi)$ for any $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$. Lemma 2.1 is applicable to the smooth function $f * \varphi$, and therefore $(f * \varphi)^\#_x = \widehat{(f * \varphi)}(x,\cdot) = 0$. Choosing the sequence $\varphi = \varphi_k$ converging to the $\delta$-function as $k \to \infty$, we obtain $f_x^\# = 0$. Conversely, if $f_x^\# = 0$, then we have from (2.4) and the fact the convolution is associative for distributions and functions of compact support (see [Ru, Th. 6.3.5]):
$$\big((f * \varphi) * u\big)^\#_x = \big(f * (\varphi * u)\big)^\#_x = f_x^\# * (\varphi * u)$$
for all $\varphi, u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$. Lemma 2.1 asserts that $x \in S(f * \varphi)$. Because of Definition 1.2 and the arbitrariness of $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$, this means that $x \in S(f)$.

Remark. It is clear from the proof of Lemma 2.3 that in Definition 1.2 one can take instead of all $\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ just any sequence $\varphi_k \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ tending to the $\delta$-function, as $k \to \infty$; that is,
$$S(f) = \bigcap_{\varphi_k \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)} S(f * \varphi_k), \qquad \varphi_k \to \delta.$$
3. Microlocal analysis and the support theorem
Microlocal analysis has been used to understand properties of Radon transforms, beginning with Guillemin’s seminal work (see [GS, pp. 336–337, 364–365]). Others (see, e.g., [Q3]) have used microlocal techniques to prove support theorems for generalized Radon transforms. We now give some of the basic conventions we use in the rest of the article.

Definition 3.1. Let $r \in \mathbb{R}_+$, $S \subset \mathbb{R}^n$, and $x \in S$. The point $x$ is called a regular point of $S$ if and only if there is a connected real-analytic hypersurface $A$ (an $(n-1)$-dimensional submanifold of $\mathbb{R}^n$) such that $x \in A \subset S$. We let $\operatorname{reg} S$ denote the regular points in $S$. Let $x$ be a regular point of $S$, and let $A$ be such an associated hypersurface ($x \in A \subset S$). Then we let $T_x$ denote the hyperplane tangent to $A$ at $x$. The points $y$ and $y'$ in $\mathbb{R}^n$ are said to be $T_x$-mirror if and only if they are reflections about $T_x$. If $y \in T_x$, then $y$ is its own mirror point and we say $y$ is self-mirror.

Note that the definition of regular point includes the case when $S$ itself is not a manifold at $x$. For example, using our definition, $(0,0)$ is a regular point of $S = \{(x,y) \in \mathbb{R}^2 : xy = 0\}$, and both the $x$-axis and the $y$-axis are “hypersurfaces” associated with $(0,0)$. Also, note that if $y \in S(x,r)$ then its $T_x$-mirror is also in $S(x,r)$. Let $X$ be a manifold, and let $f \in \mathcal{D}'(X)$. Let $H \subset X$ be a hypersurface, and let $y \in H \cap \operatorname{supp} f$. We say $f$ is zero locally near $y$ on one side of $H$ if and only if there is a connected neighborhood $U$ of $y$ such that $H$ divides $U$ into two disjoint open sets and $f$ is zero on one of those sets.

Theorem 3.2. Let $f \in \mathcal{D}'(\mathbb{R}^n)$, $f \ne 0$. We let $x_0 \in S = S(f)$ be a regular point. Let $A \subset S$ be a connected real-analytic hypersurface containing $x_0$, and let $T_{x_0}$ be the hyperplane tangent to $A$ at $x_0$. Let $y_0 \in \operatorname{supp} f \setminus \{x_0\}$, and assume $f$ is zero on one side of $S(x_0, |y_0 - x_0|)$ locally near $y_0$. Then the $T_{x_0}$-mirror point to $y_0$ must also be in $\operatorname{supp} f$.
This theorem has no conclusion if $y_0$ is self-mirror. The theorem (and some geometric arguments) can be used to prove a much stronger result if $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$ (see Section 5 and Theorem 6.1). Theorem 3.2 provides microlocal information about wavefront at mirror points in $T_{x_0}$. In Section 3.1 we prove the needed microlocal results for the spherical transform, and in Section 3.2 we prove the support theorem.

3.1. The microlocal analysis
The proof of the regularity theorem, Theorem 3.3, is similar to proofs in [AQ2] and [GrQ], and so it is sketched. In particular, we outline what must be done to show that $R$ is an elliptic Fourier integral operator with the given microlocal properties. We use the basic notation from [Ho1], [Ho2], and [Tr]. If $Y \subset X$ is a $C^2$ manifold, then the conormal bundle of $Y$, $N^*Y \subset T^*X$, is the set of covectors $\{(y,\xi) : y \in Y,\ \xi \in T_y^*X \text{ and } T_yY \subset \ker\xi\}$. The analytic wavefront set of a distribution $f \in \mathcal{D}'(\mathbb{R}^n)$ is a conical subset, $\mathrm{WF}_A(f)$, of the cotangent bundle $T^*\mathbb{R}^n$ consisting of “directions” in which $f$ is not real-analytic. This is defined either in terms of the very rapid decrease of localized Fourier transforms of $f$ (see [Tr, Def. 1.1, p. 243]) or in terms of exponential decrease of the Fourier-Bros-Iagolnitzer transform (see [Ho2, Th. 9.6.3]). For example, if $f$ is the characteristic function of an open set $D$ with real-analytic boundary, then $\mathrm{WF}_A(f)$ is the conormal bundle of the boundary of $D$, $N^*(\partial D)$. Theorem 3.3, a regularity theorem for the Radon transform $R$, is one of the keys to the proof of the support theorem. The hypotheses include an assumption on the vanishing of $f$ at certain points.

Theorem 3.3. Let $f \in \mathcal{D}'(\mathbb{R}^n)$, and let $x_0$ be a regular point of $S$. Let $A$ be a connected real-analytic hypersurface such that $x_0 \in A \subset S$. Assume $Rf$ is zero in an open neighborhood of $(x_0, r_0) \in S \times \mathbb{R}_+$. Let $(y;\xi) \in N^*S(x_0,r_0) \setminus 0$, and assume that $f$ is zero in a neighborhood of the $T_{x_0}$-mirror point to $y$. Then $(y;\xi) \notin \mathrm{WF}_A(f)$.

In general, Radon transforms do not detect singularities $\mathrm{WF}_A(f)$ unless they are conormal to the surface being integrated over.
This theorem implies that any singularity at $(y;\xi) \in N^*S(x_0,r_0)$ is detected by data $Rf$ as long as $f$ is zero (or just real-analytic) near the mirror point to $y$ on $S(x_0,r_0)$. In fact, the conclusion is true under microlocal hypotheses at the mirror point of $y$, $y_m$: if $N^*_{y_m}(S(x_0,r_0)) \cap \mathrm{WF}_A(f) = \emptyset$, then $N^*_y(S(x_0,r_0)) \cap \mathrm{WF}_A(f) = \emptyset$ (and the “outer” and “inner” conormal “directions” correspond). If $y$ is self-mirror ($y \in T_{x_0}$), then Theorem 3.3 gives no conclusion about $y$. In other cases, if $f$ is zero in a neighborhood of the $T_{x_0}$-mirror point to $y$, then the theorem gives specific directions above $y$ that are not in $\mathrm{WF}_A(f)$.
Theorem 3.3 is, at least morally, a microlocal version of a reflection principle of Courant and Hilbert. Their theorem says that if $f$ has zero integrals over all spheres with centers on the hyperplane $T_{x_0}$, then $f$ is an odd function about $T_{x_0}$. Also see Theorem 3.2 in this regard.

Proof. The proof is the $n$-dimensional generalization of the proof of the microlocal regularity theorem (see [AQ2, Lem. 4.3]). Locally above $A$ the incidence relation for $R$ is defined to be $Z = \{(y,x,r) \in \mathbb{R}^n \times A \times \mathbb{R}_+ : y \in S(x,r)\}$ (see [He]). The appropriate microlocal diagram (see [GS, pp. 364–365]; see also [Q1]) is
$$T^*(\mathbb{R}^n) \setminus 0 \ \xleftarrow{\ \pi_1\ }\ \Lambda = N^*Z \setminus 0 \ \xrightarrow{\ \pi_2\ }\ T^*(A \times \mathbb{R}_+) \setminus 0, \tag{3.1}$$
where the maps $\pi_1$ and $\pi_2$ are projections from $\Lambda \subset T^*(\mathbb{R}^n \times (A \times \mathbb{R}_+))$ onto the indicated factors. We must show that the map $\pi_2$ is close enough to being an injective immersion (the Bolker Assumption; see [GS, pp. 364–365], [Q1]) that the calculus of Fourier integral operators can be used to prove Theorem 3.3. Specifically, the goal is to prove that $\pi_2$ in (3.1) satisfies the following:

Covectors $(y,x,r;\xi,\eta) \in \Lambda$ and $(y',x,r;\xi',\eta) \in \Lambda$ have the same image under $\pi_2$ only if $y$ and $y'$ are $T_x$-mirror; $\pi_2$ is a local diffeomorphism except above points $(y,x,r)$ where $y \in S(x,r)$ is self-mirror. (3.2)
To this end, we first calculate $N^*Z$ in good coordinates. Points $(y,x,r) \in Z$ are determined by the equation $|y - x|^2 - r^2 = 0$, and the differential of this equation gives a basis of the fibers of $N^*Z$. Coordinates for $\Lambda = N^*Z \setminus 0$ are
$$S^{n-1} \times A \times \mathbb{R}_+ \times (\mathbb{R} \setminus 0) \to \Lambda, \qquad (\theta,x,r,\alpha) \mapsto \big(y,x,r;\ \alpha([r\theta]\,dy - [P_x(r\theta)]\,dx - r\,dr)\big), \tag{3.3}$$
where $y = r\theta + x$. Here, $(w_1,\ldots,w_n)\,dy = w_1\,dy_1 + \cdots + w_n\,dy_n$ is the covector in $T^*\mathbb{R}^n$ corresponding to $(w_1,\ldots,w_n) \in \mathbb{R}^n$, and the map $P_x : \mathbb{R}^n \to (T_x - x)$ is the orthogonal projection onto the hyperplane through the origin parallel to $T_x$. This hyperplane is, of course, $T_xA$, viewed as a subspace of $\mathbb{R}^n$. Finally, $P_x(r\theta)\,dx$ is the covector corresponding to the vector $P_x(r\theta)$.
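The role of the projection $P_x$ in (3.3) can be seen in a small computation. The sketch below is an illustration with hypothetical helper names, assuming the tangent hyperplane $T_x$ is given by a point $x$ and a unit normal vector. It checks that a point $y$ and its $T_x$-mirror lie on the same sphere about $x$ and have the same projection $P_x(y - x)$, so the tangential data alone cannot distinguish them.

```python
import math


def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(p * q for p, q in zip(a, b))


def sub(a, b):
    """Componentwise difference a - b."""
    return [p - q for p, q in zip(a, b)]


def project_onto_tangent(v, normal):
    """Orthogonal projection P_x v onto the hyperplane through the origin
    with the given unit normal (the hyperplane parallel to T_x)."""
    c = dot(v, normal)
    return [vi - c * ni for vi, ni in zip(v, normal)]


def mirror(y, x, normal):
    """Reflection of y about the hyperplane T_x through x with the given unit normal."""
    c = 2 * dot(sub(y, x), normal)
    return [yi - c * ni for yi, ni in zip(y, normal)]


if __name__ == "__main__":
    x = [1.0, 0.0, 0.0]
    normal = [0.0, 0.0, 1.0]
    y = [1.0, 2.0, 3.0]
    ym = mirror(y, x, normal)
    print(ym)
    # same radius, same tangential projection
    print(math.sqrt(dot(sub(y, x), sub(y, x))), math.sqrt(dot(sub(ym, x), sub(ym, x))))
    print(project_onto_tangent(sub(y, x), normal), project_onto_tangent(sub(ym, x), normal))
```

A self-mirror point, one with zero normal component, is fixed by `mirror`, which is the degenerate case singled out in (3.2).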
Equation (3.3) shows that $\pi_1$ and $\pi_2$ do not map to the zero sections, so $R$ is a Fourier integral operator associated to the Lagrangian manifold $\Lambda$ (see [Tr, Th. 2.1, p. 316]). This is one reason why $R$ can be evaluated on distributions. $R$ is real-analytic elliptic since the measure of integration for $R$, $dA$, is real-analytic and nowhere zero. The map $\pi_2$ is equivalent to the corresponding map in coordinates (3.3):
$$(\theta,x,r,\alpha) \ \xrightarrow{\ \tilde\pi\ }\ \big(x,r;\ -\alpha([P_x(r\theta)]\,dx + r\,dr)\big). \tag{3.4}$$
Since $x$ and $r$ are known from the image of $\tilde\pi$, $\tilde\pi$ determines $P_x(\theta)$, so $y = x + r\theta$ is known up to its $T_x$-mirror from $\pi_2$. The calculation that $\tilde\pi$ is a local diffeomorphism except at self-mirror points is left to the reader. (One can do it in local coordinates on $A$ if one likes.) Since (3.2) is essentially a local statement, this proves (3.2).

Now, assume $f$ is as in the hypotheses of Theorem 3.3. $R$ has been shown to be an analytic elliptic Fourier integral operator associated with $\Lambda$. The calculus of such operators implies the conclusion of Theorem 3.3 in the same way as in [AQ1]. The basic idea is as follows. Let $(y;\xi) \in N^*(S(x_0,r_0)) \setminus 0$, and assume $f$ is zero near the $T_{x_0}$-mirror point to $y$. Then by (3.2) no singularities above the $T_{x_0}$-mirror point $y'$ can cancel singularities above $y$. By the calculus of these operators, the only singularities of $f$ that are detectable from Radon data are those on $N^*(S(x_0,r_0))$ ($\mathrm{WF}_A(Rf) \subset \{((x_0,r),\eta) : (y,(x_0,r);\xi,-\eta) \in \Lambda \text{ for some } (y;\xi) \in \mathrm{WF}_A(f)\}$, and for fixed $(x_0,r_0)$ the only wavefront of $f$ that affects this are covectors in $N^*S(x_0,r_0)$). Furthermore, since $R$ is elliptic and $\pi_2$ maps only covectors above $y$ and $y'$ to the same point, the only singularities of $f$ on $N^*S(x_0,r_0)$ that can cancel singularities above $y$ are those above $y'$. So, since there are no singularities above $y'$, and since $Rf$ is real-analytic near $(x_0,r_0)$, then $(y;\xi) \notin \mathrm{WF}_A(f)$. The details of this argument are given in [AQ1] and [GrQ] (see also [SKK], [Ka]).

3.2. Proof of support theorem
The following theorem of L. Hörmander, T. Kawai, and M. Kashiwara (see [Ho2, Th. 8.5.6], [SKK]) is one key to the proof.

Lemma 3.4. Let $X$ be a real-analytic manifold, and let $f \in \mathcal{D}'(X)$ and $y \in \operatorname{supp} f$. Let $H$ be a $C^2$ hypersurface with $y \in H$. Assume $f$ is zero locally near $y$ on one side of $H$. If $(y;\xi) \in N^*(H) \setminus 0$, then $(y;\xi) \in \mathrm{WF}_A(f)$.
Under the assumptions of this lemma, $f$ cannot be real-analytic near $y$ because $y$ is a boundary point of $\operatorname{supp} f$. Lemma 3.4 is a strengthening of this simple observation because it provides specific wavefront directions above $y$ that must be in $\mathrm{WF}_A(f)$.
GEOMETRY OF STATIONARY SETS FOR THE WAVE EQUATION
Proof of Theorem 3.2. Assume the mirror point to $y_0$ in $T_{x_0}$ is not in $\operatorname{supp} f$. Let $r_0 = |y_0 - x_0|$. Then $y_0$ is a boundary point of $\operatorname{supp} f$ in $S(x_0, r_0)$. Furthermore, locally near $y_0$, $f$ is zero on one side of the sphere $S(x_0, r_0)$. Let $(y_0; \xi_0) \in N^* S(x_0, r_0) \setminus 0$. By Lemma 3.4, $(y_0; \xi_0) \in \mathrm{WF}_A(f)$. However, by the regularity theorem, Theorem 3.3, $(y_0; \xi_0) \notin \mathrm{WF}_A(f)$ since $Rf$ is zero on $A \times \mathbb{R}_+$, $A$ is a real-analytic hypersurface, and the $T_{x_0}$-mirror point to $y_0$ is not in $\operatorname{supp} f$. This contradiction proves the theorem.

4. Algebraic structure of $S(f)$ for $f \in \mathcal{E}'(\mathbb{R}^n)$
We need some properties of harmonic polynomials in $\mathbb{R}^n$. Let us introduce some notation. For a polynomial $P \in \mathbb{R}[x_1, \dots, x_n]$ we let $P^C \in \mathbb{C}[z_1, \dots, z_n]$ be its natural extension to $\mathbb{C}^n$. Denote $N(P) = P^{-1}(0)$, and let $N^C(P)$ be the complexification of the variety $N(P)$, $N^C(P) = (P^C)^{-1}(0)$. We have $N(P) = N^C(P) \cap \mathbb{R}^n$.

Lemma 4.1. Let $Q \ne 0$ be a harmonic divisor in $\mathbb{R}^n$ with real coefficients, and let
$$Q^C = A_1^C \cdots A_q^C$$
be the decomposition into a product of irreducible factors (over $\mathbb{C}$). Then the following are true.
(1) All polynomials $A_j^C$ are distinct and have real coefficients, so that the decomposition above also gives the irreducible decomposition in $\mathbb{R}[x_1, \dots, x_n]$.
(2) Any irreducible component $N_j = N(A_j)$ of the variety $N(Q)$ is an $(n-1)$-dimensional real algebraic variety in $\mathbb{R}^n$. The complement $\mathbb{R}^n \setminus N_j$ is disconnected.
(3) If $G \in \mathbb{R}[x_1, \dots, x_n]$ vanishes on an open subset $U \subset N_j$ of some irreducible component $N_j$, then $A_j$ divides $G$. In particular, $Q$ divides any polynomial with real coefficients vanishing on $N(Q)$.

Proof. To prove (1), let $(Q^C)^*(z) = \overline{Q^C(\bar z)}$ be the polynomial with complex conjugate coefficients. By assumption, $(Q^C)^* = Q^C$, and therefore if $(A_i^C)^* \ne A_i^C$ for some $i$, then both $(A_i^C)^*$ and $A_i^C$ are present in the decomposition of the polynomial $Q^C$. Then $Q(x)$, $x \in \mathbb{R}^n$, is divisible by $A_i(x) A_i^*(x) = |A_i(x)|^2$. This is impossible since $Q$ is a harmonic divisor and the Brelot-Choquet theorem [BC] states that no nonnegative polynomial in $\mathbb{R}^n$ can divide a nonzero harmonic polynomial. Thus $A_i^C = (A_i^C)^*$, $i = 1, \dots, q$, which means that all $A_i^C$, and therefore $A_i$, have real coefficients. In addition, this implies that all the polynomials $A_i$ are distinct, as otherwise $Q$ would have a nonnegative divisor $A_i^2$.
AGRANOVSKY AND QUINTO
We now prove (2). If some irreducible component $N_i = N(A_i)$ had lower dimension, then its complement $\mathbb{R}^n \setminus N_i$ would be connected. The polynomial $A_i$ would then preserve sign in $\mathbb{R}^n \setminus N_i$ and vanish on $N_i$. Since $A_i$ is a harmonic divisor, this is again impossible due to the Brelot-Choquet theorem.

To prove (3), denote for simplicity $N = N_j$. We can assume that $U$ consists only of smooth points of $N$ and that $U = B \cap N$, where $B = B(a, r)$ is a ball in $\mathbb{R}^n$. Denote by $\hat B$ the corresponding complex ball in $\mathbb{C}^n$, with the same center and radius, so that $B = \hat B \cap \mathbb{R}^n$. Then $\hat B \cap N^C$ is an open subset of the complex algebraic variety $N^C \subset \mathbb{C}^n$, and the complex polynomial $G^C$ vanishes on the $(n-1)$-dimensional real smooth submanifold $B \cap N = \hat B \cap N^C \cap \mathbb{R}^n$ of the complex manifold $\hat B \cap N^C$. Due to the uniqueness theorem for holomorphic functions, the polynomial $G^C$ vanishes on $\hat B \cap N^C$. Then $G^C$ vanishes on the Zariski closure of $\hat B \cap N^C$, which is exactly the irreducible component $N^C$. By Hilbert's Nullstellensatz, this implies that the defining irreducible polynomial $A_j^C$ divides $G^C$ and, as $A_j^C$ has real coefficients, $A_j$ divides $G$.

Lemma 4.2. Let $f \in \mathcal{E}'(\mathbb{R}^n)$, $f \ne 0$. Then
(a) the set $S(f)$ is an algebraic variety in $\mathbb{R}^n$ contained in the zero set of a nonzero harmonic polynomial;
(b) $S(f) = S_0 \cup V$, where $V$ is an algebraic variety of $\operatorname{codim} V > 1$, $S_0 = N(Q)$, and $Q$ is a harmonic divisor.

Proof. According to Lemma 2.3, $S(f) = \{x \in \mathbb{R}^n : (f * u)(x) = 0 \text{ for all } u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)\}$. Since $\operatorname{supp} f$ is compact, $u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$ can be replaced by any radial $C^\infty$-function. Let $u(y) = u_m(y) = |y|^{2m}$, $m = 0, 1, \dots$. The functions $\{u_m\}$ constitute a complete system in the space $C^\infty_{\mathrm{rad}}$ on any compact set; hence
$$S(f) = \{x \in \mathbb{R}^n : (f * u_m)(x) = 0,\ m \in \{0\} \cup \mathbb{N}\} = \bigcap_{m=0}^{\infty} N(P_m),$$
where $P_m = f * u_m$. By the Hilbert basis theorem (see [VW]), $S(f)$ is defined by a finite family of polynomials $P_m$:
$$S(f) = \bigcap_{m=m_0}^{M} N(P_m).$$
Here $m_0 = \min\{m : P_m \ne 0\}$. Let $P$ be the greatest common polynomial divisor (over $\mathbb{C}$) of all the polynomials $P_m^C$, $m = m_0, \dots, M$. Then the variety $N(P)$ is the union of all common irreducible
components of the varieties $N(P_m^C)$. Since the intersection of any two distinct irreducible components has, at any smooth point, complex codimension greater than 1, the intersection of all the complex algebraic varieties $N(P_m^C)$ is representable as the union of the zero variety of $P$, $N(P)$, and a variety of codimension greater than 1:
$$\bigcap_{m=m_0}^{M} N(P_m^C) = N(P) \cup W, \qquad \operatorname{codim}_{\mathbb{C}} W > 1.$$
Denote by $Q$ the restriction of $P$ to the real space $\mathbb{R}^n$, so that $N(Q) = N(P) \cap \mathbb{R}^n$. Then intersecting with the real space $\mathbb{R}^n$ yields
$$S(f) = \bigcap_{m=m_0}^{M} N(P_m) = \bigcap_{m=m_0}^{M} N(P_m^C) \cap \mathbb{R}^n = N(Q) \cup V,$$
where $V = W \cap \mathbb{R}^n$ and $\operatorname{codim}_{\mathbb{R}} V \ge \operatorname{codim}_{\mathbb{C}} W > 1$.

It remains to prove that $Q$ is a harmonic divisor. Since $\Delta |y|^{2m} = c_m |y|^{2(m-1)}$, $c_m = 2m(2m + n - 2)$, the Laplace operator acts on the polynomial $P_m$ by
$$\Delta P_m = \Delta(f * u_m) = f * \Delta u_m = c_m (f * u_{m-1}) = c_m P_{m-1}.$$
Therefore, $\Delta P_{m_0} = c_{m_0} P_{m_0 - 1} = 0$, and we conclude that $P_{m_0}$ is harmonic. Since $Q$ divides $P_{m_0}$, $Q$ is a harmonic divisor.

5. Geometric structure of $S(f)$ for $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$
From now on, we assume $f$ is a distribution supported on a finite set, $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$, and $f \ne 0$. The distribution $f$ is determined by a finite set of differential operators $D_i$, $i = 1, \dots, d$, with constant coefficients, and points $a_1, \dots, a_d \in \mathbb{R}^n$, and it acts on test functions $\varphi \in C^\infty(\mathbb{R}^n)$ by
$$\langle f, \varphi \rangle = \sum_{i=1}^{d} (D_i \varphi)(a_i).$$
The points $a_1, \dots, a_d$ constitute $\operatorname{supp} f$.

Definition. Let $x \in \mathbb{R}^n$, and let $a \in \operatorname{supp} f$. We say that $x$ simply touches $\operatorname{supp} f$ at the point $a$ if the sphere $S(x, |x - a|)$ intersects $\operatorname{supp} f$ only at the point $a$. We say that $x$ multiply touches $\operatorname{supp} f$ at the point $a$ if $S(x, |x - a|) \cap \operatorname{supp} f$ contains at least two points.

Clearly the set of simply touching points is open in $\mathbb{R}^n$ because $\operatorname{supp} f$ is finite.
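For a finite support the two notions in the definition can be tested directly. The following Python sketch is an illustration only: the helper names `sq_dist`, `sphere_hits`, and `touch_type` are hypothetical, the support points are assumed to be distinct, and integer or rational coordinates are assumed so that comparing squared distances is exact.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def sphere_hits(x, a, support):
    """Points of the finite set `support` lying on the sphere S(x, |x - a|)."""
    r2 = sq_dist(x, a)
    return [b for b in support if sq_dist(x, b) == r2]


def touch_type(x, a, support):
    """Return 'simple' if S(x, |x - a|) meets the support only at a, else 'multiple'."""
    if a not in support:
        raise ValueError("a must be a point of the support")
    return "simple" if len(sphere_hits(x, a, support)) == 1 else "multiple"


support = [(0, 0), (2, 0)]
# The sphere about (0, 1) through (0, 0) misses (2, 0).
print(touch_type((0, 1), (0, 0), support))
# The point (1, 1) is equidistant from both support points.
print(touch_type((1, 1), (0, 0), support))
```

In this example the multiply touching points are exactly those on the line $x_1 = 1$, the bisector of the two support points.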
Let $S_0$, $V$, $Q$ be as in Lemma 4.2, and let $A_i$, $i = 1, \dots, q$, be the irreducible divisors of the polynomial $Q$. All the polynomials $A_i$ have real coefficients due to Lemma 4.1.

Lemma 5.1. Let $N_j = N(A_j)$ be an irreducible component of the algebraic variety $S_0$. If $N_j$ contains a point $x_0$ that simply touches $\operatorname{supp} f$ at some point $a \in \operatorname{supp} f$, then the polynomial $A_j(x + a)$ is homogeneous and, correspondingly, $N_j$ is an affine cone with vertex $a$.

Proof. Denote for simplicity $A = A_j$, $N = N_j$. There exists a neighborhood $U \subset N$ of $x_0$ consisting of points simply touching $\operatorname{supp} f$ at the point $a$. We can take a smaller subset and assume that $U$ consists of smooth points of $N$. If $U$ and $\varepsilon > 0$ are sufficiently small, then each sphere $S(x, |x - a| + t)$, $x \in U$, $0 < |t| < \varepsilon$, contains no point from $\operatorname{supp} f$. Now we can apply Theorem 3.2 and conclude that the point $a$ must be self-mirror for the tangent planes $T_x(N)$, $x \in U$; that is, $a \in T_x(N)$ for each $x \in U$. Then the vector $x - a$ is orthogonal to the normal vector $\nabla A(x)$ to $T_x(N)$, and therefore the polynomial $G(x) = \langle x - a, \nabla A(x) \rangle$ vanishes for all $x \in U$. Lemma 4.1(3) implies that $A$ divides $G$ and, as $\deg G \le \deg A$, we have $G = \lambda A$, $\lambda \in \mathbb{R}$. Thus we obtain the Euler equation
$$\sum_{i=1}^{n} (x_i - a_i) \frac{\partial A}{\partial x_i}(x) = \lambda A(x),$$
which implies that the polynomial $A(x + a)$ is $\lambda$-homogeneous, where automatically $\lambda = \deg A$.

Corollary 5.2. If $f$ is supported at one point $a$, then $S_0$ is an affine (algebraic) cone with the vertex $a$.

Indeed, in this case every point $x$ of any irreducible component $N_j$ simply touches $\operatorname{supp} f$ at $a$, and therefore $S_0 = \bigcup_j N_j$ is a cone with respect to $a$.

Lemma 5.3. Let $N_j$ be an irreducible component of $S_0$. If there exists an open set $U \subset N_j$ consisting of points that multiply touch $\operatorname{supp} f$, then $N_j$ is a hyperplane, a bisector
between some two points $a, a' \in \operatorname{supp} f$.

Proof. Denote $N = N_j = N(A_j)$. Choosing $U$ smaller, one can assume that all points of $U$ are smooth points of $N$. Pick $x \in U$. By the assumption there exists a point $a \in \operatorname{supp} f$ such that the sphere $S(x, |x - a|)$ contains another point $a' \in \operatorname{supp} f$, $a' \ne a$. Then $|x - a| = |x - a'|$ and $\langle x, a - a' \rangle = \tfrac{1}{2}(|a|^2 - |a'|^2)$. Therefore, $x$ belongs to the hyperplane $H_{a,a'}$, which is the bisector between the points $a$ and $a'$. Thus, $U$ is contained in the (finite) union of the hyperplanes $H_{a,a'}$ and, as $U$ is an $(n-1)$-dimensional smooth manifold, $U$ is contained in one hyperplane $H_{a,a'}$. Then the entire irreducible component $N$ coincides with the hyperplane $H_{a,a'}$.

Proposition 5.4. Let $f \in \mathcal{D}'(\mathbb{R}^n)$, $f \ne 0$. If $S(f)$ contains hyperplanes $H_1, \dots, H_\ell$, then $S(f)$ contains the Coxeter system $\Sigma$ generated by the reflections $\sigma_j$ about $H_j$. The hyperplanes $H_1, \dots, H_\ell$ have a common point, and the Coxeter group $W(\Sigma)$ is finite. The distribution $f$ is odd with respect to reflections $\sigma \in W(\Sigma)$, $f \circ \sigma = -f$, and the sets $S(f)$ and $\operatorname{supp} f$ are $W(\Sigma)$-invariant.

Proof. Definition (1.3),
$$S(f) = \bigcap_{\varphi \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)} S(f * \varphi),$$
implies that it suffices to prove the proposition for regular $f$. By Lemma 2.1 the spherical transform $\hat f(x, r) = 0$ for all $x \in H_i$, $i = 1, \dots, \ell$, and $r > 0$. A reflection principle is proved in [CH, Vol. II, p. 699 ff.] (in a slightly different formulation) that states that if a function integrates to zero on all spheres centered on a hyperplane, then the function is odd about that hyperplane. This reflection principle shows that the function $f$ is odd about each hyperplane $H_i$, $f \circ \sigma_i = -f$, $i = 1, \dots, \ell$. Since $S(f \circ \sigma_i) = \sigma_i(S(f))$, $S(f)$ is $\sigma_i$-invariant, $i = 1, \dots, \ell$, and $S(f)$ contains the generated Coxeter system $\Sigma$. The skew-symmetry $f \circ \sigma = -f$ is preserved for the reflections $\sigma \in W(\Sigma)$. The Coxeter group $W(\Sigma)$ has a finite number of mirrors because $S(f)$ is an algebraic variety. Therefore the group $W(\Sigma)$ is finite because it is generated by reflections and has a finite set of mirrors (see [GB, Prop. 4.1.3]). By Kakutani's theorem (see [DS, Chap. 8, Sec. 10, Th. 8.]) the convex hull of any orbit contains a fixed point of the group $W(\Sigma)$. This point $a$ belongs to any hyperplane $H_i$ in $\Sigma$, and therefore $\Sigma$ is a cone with the vertex $a$.
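A small planar instance may help fix ideas. The Python sketch below is an illustration under stated assumptions, not part of the argument: the two generating mirrors (the lines $y = 0$ and $x = y$), the sample function $f(x, y) = xy(x^2 - y^2)$, and the helper names `mat_mul`, `generate_group`, `det2`, and `act` are all chosen here for the example. It closes the two reflections under multiplication, counts the group elements and the mirrors, and checks that $f$ changes sign under every reflection in the group.

```python
def mat_mul(A, B):
    """Product of two square integer matrices stored as tuples of tuples."""
    n = len(A)
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def generate_group(generators, limit=1000):
    """Closure of the generators under multiplication (assumes the group is finite)."""
    n = len(generators[0])
    identity = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = mat_mul(s, g)
            if h not in group:
                if len(group) >= limit:
                    raise RuntimeError("group is larger than the search limit")
                group.add(h)
                frontier.append(h)
    return group


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def act(M, p):
    """Apply the matrix M to the point p."""
    return tuple(sum(M[i][j] * p[j] for j in range(len(p))) for i in range(len(M)))


def f(p):
    """Sample harmonic polynomial, odd about the lines y = 0 and x = y."""
    x, y = p
    return x * y * (x * x - y * y)


S1 = ((1, 0), (0, -1))  # reflection about the line y = 0
S2 = ((0, 1), (1, 0))   # reflection about the line x = y

W = generate_group([S1, S2])
# In the plane, the orthogonal maps of determinant -1 are exactly the reflections.
mirrors = [g for g in W if det2(g) == -1]
samples = [(3, 1), (2, 5), (-4, 7)]
odd = all(f(act(g, p)) == -f(p) for g in mirrors for p in samples)
print(len(W), len(mirrors), odd)
```

Here the two generating mirrors meet at angle $\pi/4$, so the group has four mirrors in all (the two extra ones are $x = 0$ and $x = -y$), all passing through the common fixed point $0$.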
6. Proof of Theorem 1.3
Fix $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$, $f \ne 0$, and let $S(f)$, $S_0$, $V$, $Q$ be as above. We assume that $S_0 \ne \emptyset$. Theorem 1.3(a) is already proved in Lemma 4.2(a). Our next goal is to prove that $S_0$ is an affine cone. Let $Q = A_1 \cdots A_q$ be the irreducible decomposition of the polynomial $Q$, and let $S_0 = N_1 \cup \cdots \cup N_q$ be the corresponding decomposition of the algebraic variety $S_0 = N(Q)$ into irreducible components $N_j = N(A_j)$.

6.1. Structure of the component $K$
Denote by $K$ the union of all irreducible components $N_j$ such that $\operatorname{supp} f \subset N_j$. Let $a \in \operatorname{supp} f$. Then $a$ is a nonisolated point of $N_j$, as $\mathbb{R}^n \setminus N_j$ is disconnected due to Lemma 4.1. Take $x \in N_j$ so close to $a$ that $S(x, |x - a|) \cap \operatorname{supp} f = \{a\}$. The point $x$ simply touches $\operatorname{supp} f$ at the point $a$. Lemma 5.1 says that the polynomial $A_j(x + a)$ is homogeneous, and the component $N_j$ is a cone with the vertex $a$. Thus $N_j$ is conical about any point $a \in \operatorname{supp} f$. Then $N_j$ is a cone with the edge $L = \operatorname{span}(\operatorname{supp} f)$. This follows from geometric arguments, or it can be proved in terms of the defining polynomial $A_j$. Indeed, applying a rigid motion of $\mathbb{R}^n$, we can assume that $L = \mathbb{R}^k \times \{0\}$, $k = \dim L$. Then for any $a \in \operatorname{supp} f$ the Euler equation of Lemma 5.1 holds, which we write as
$$\langle x, \nabla A_j(x) \rangle - \lambda A_j(x) = \langle a, \nabla A_j(x) \rangle, \qquad x \in \mathbb{R}^n, \quad \lambda = \deg A_j.$$
By linearity, it is true for all $a \in L$. Substituting $a = 0$, we obtain that $A_j$ is homogeneous and also $\langle a, \nabla A_j(x) \rangle = 0$, $x \in \mathbb{R}^n$, which implies $(\partial A_j / \partial x_1)(x) = \cdots = (\partial A_j / \partial x_k)(x) = 0$, due to the arbitrariness of $a \in \mathbb{R}^k \times \{0\}$. If $k = n$, then $A_j = \mathrm{const} \ne 0$. ($A_j = 0$ would imply $Q = 0$, $S(f) = \mathbb{R}^n$, and, correspondingly, $f = 0$.) Therefore, $N_j = \emptyset$, and we obtain that $K = \emptyset$ if $k = n$. If $k = n - 1$, then $A_j(x) = \tilde A_j(x_n)$ and $N_j = \{x_n = 0\}$. If $k < n - 1$, then $A_j(x_1, \dots, x_n) = \tilde A_j(x_{k+1}, \dots, x_n)$ and $N_j = \mathbb{R}^k \times N(\tilde A_j)$. Since $\tilde A_j$ is homogeneous, $N(\tilde A_j)$ is a cone in $\mathbb{R}^{n-k}$. Then $N_j$ is a cone with the edge $L$, and we have proven the assertion about $K$ in (c).

6.2. Structure of the component $\Sigma$
Let $\Sigma$ be the union of all components $N_j$ which do not entirely contain $\operatorname{supp} f$. In other words, $\Sigma$ includes all the irreducible components of $S_0$ which are not in $K$. So $S_0 = \Sigma \cup K$. Let $N_j$ be such a component, and let $U$ be open in $N_j$ and consist of smooth points. Let $x \in U$ and $a \in \operatorname{supp} f \setminus N_j$. Then the sphere $S(x, |x - a|)$ contains another point from $\operatorname{supp} f$, since otherwise $x$ simply touches $\operatorname{supp} f$ at the point $a$ and $a \in N_j$ according to Lemma 5.1. Thus $U$ consists of points multiply touching $\operatorname{supp} f$, and $N_j$ is a bisector hyperplane between two points $a, a' \in \operatorname{supp} f \setminus N_j$, according to Lemma 5.3.
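The bisector hyperplane between two support points $a$ and $a'$ is $\{x : \langle x, a - a' \rangle = \tfrac12(|a|^2 - |a'|^2)\}$, and reflection about it exchanges $a$ and $a'$. The Python sketch below checks both facts on one example in $\mathbb{R}^3$. It is only an illustration: the helper names `dot`, `bisector`, `on_bisector`, `equidistant`, and `reflect` are hypothetical, and exact rational arithmetic is assumed.

```python
from fractions import Fraction


def dot(u, v):
    """Euclidean inner product of two tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def bisector(a, b):
    """Return (normal, offset) describing H = {x : <x, normal> = offset}."""
    normal = tuple(ai - bi for ai, bi in zip(a, b))
    offset = Fraction(dot(a, a) - dot(b, b), 2)
    return normal, offset


def on_bisector(x, a, b):
    """True if x lies on the bisector hyperplane of a and b."""
    normal, offset = bisector(a, b)
    return dot(x, normal) == offset


def equidistant(x, a, b):
    """True if |x - a| = |x - b| (compared through squared distances)."""
    da = tuple(xi - ai for xi, ai in zip(x, a))
    db = tuple(xi - bi for xi, bi in zip(x, b))
    return dot(da, da) == dot(db, db)


def reflect(x, a, b):
    """Reflect x about the bisector hyperplane of a and b."""
    normal, offset = bisector(a, b)
    t = Fraction(2) * (dot(x, normal) - offset) / dot(normal, normal)
    return tuple(xi - t * ni for xi, ni in zip(x, normal))


a, b = (1, 0, 0), (-1, 2, 0)
x = (0, 1, 7)
print(on_bisector(x, a, b), equidistant(x, a, b))
print(reflect(a, a, b) == b)
```

The same reflection is the map $\sigma$ under which, by Proposition 5.4, the distribution is odd whenever the bisector lies in the stationary set.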
We have proven that $\Sigma$ is a union of hyperplanes, $\Sigma = H_1 \cup \cdots \cup H_\ell$. Proposition 5.4 yields that $\Sigma$ is a Coxeter system. Also, $\Sigma \cup K$ is a Coxeter system if $\dim(\operatorname{span}(\operatorname{supp} f)) = n - 1$ and, correspondingly, $K$ is a hyperplane. This proves (c) and (d).

6.3. Conical structure of the set $S_0$
Now we want to prove that $S_0$ has conical structure as claimed in (b). We have $S_0 = \Sigma \cup K$, where $\Sigma$ is a cone with respect to any point in $H_1 \cap \cdots \cap H_\ell$, and $K$ is a cone with respect to any point in $L = \operatorname{span}(\operatorname{supp} f)$. To prove that $S_0$ is a cone, it suffices to prove that $H_1 \cap \cdots \cap H_\ell \cap L \ne \emptyset$. Proposition 5.4 claims that $\operatorname{supp} f$ is invariant under the action of the Coxeter group $W(\Sigma)$. Then the convex hull of $\operatorname{supp} f$ possesses the same invariance. The invariant set $\operatorname{conv\,hull}(\operatorname{supp} f)$ is compact, and Kakutani's theorem implies that there exists a $W(\Sigma)$-fixed point $a \in \operatorname{conv\,hull}(\operatorname{supp} f) \subset L$. Clearly, $a$ belongs to any hyperplane $H_i$, $i = 1, \dots, \ell$. Thus $S_0$ is an affine cone with the vertex $a \in \operatorname{conv\,hull}(\operatorname{supp} f)$.

Let $Q$ be the defining polynomial for $S_0$ introduced in Lemma 4.2, and let $E$ be a nonzero harmonic polynomial ($E = P_{m_0}$ in the proof of Lemma 4.2) having $Q$ as a divisor. Since $S_0$ is a cone with respect to $a$, $P(x) = Q(x + a)$ is homogeneous. (Lemma 5.1 says that all the irreducible factors of $P$ are homogeneous.) It remains for us to note that $P$ divides the harmonic polynomial $F(x) = E(x + a)$, and $S_0 = a + N(P)$. Since $P$ is homogeneous, $P$ divides any homogeneous term in the decomposition of $F$, and therefore $P$ is a divisor of some nonzero harmonic homogeneous polynomial in $\mathbb{R}^n$. This completes the proof of Theorem 1.3.

6.4. Geometric conditions on $S(f)$
We summarize the geometric essence of what we have proven in the following theorem. The proof involves the microlocal arguments we have developed using Theorem 3.2 as well as the algebraicity of the set $S(f)$, which is proved in Section 4 and enables us to analytically continue a locally conical set to a globally conical set.

Theorem 6.1 (Support theorem). Let $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$, $f \ne 0$. Assume $S = S(f) \ne \emptyset$. Assume there are regular points in $S$, and let $x_0 \in S$ be a regular point. Let $A$ be a connected real-analytic hypersurface in $\mathbb{R}^n$ such that $x_0 \in A \subset S$. Let $T_{x_0}$ be the hyperplane tangent to $A$ at $x_0$. There are two possibilities.
(a) For some $a_0 \in \operatorname{supp} f$, $a_0 \notin T_{x_0}$. In this case $A \subset T_{x_0} \subset S$ and $\operatorname{supp} f$ is symmetric about $T_{x_0}$. Furthermore, $f$ is odd about $T_{x_0}$.
(b) Or $\operatorname{supp} f \subset T_{x_0}$. In this case, near $x_0$, $S$ is conical about $L = \operatorname{span} \operatorname{supp} f$.
Precisely, $A$ generates a subset of $S$ that is conical with edge $L$. In this case, $k = \dim L < n$.

7. Sufficient conditions for stationary sets
Theorem 1.3 says that the stationary sets $S(f)$ for $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$ may consist of three parts: a low-dimensional variety $V$, a Coxeter system $\Sigma$, and a cone $K$ having all points in $\operatorname{span}(\operatorname{supp} f)$ as vertices. In addition, the union $S_0 = \Sigma \cup K$ must be a cone contained in the zero set of some shifted harmonic homogeneous polynomial, and the entire stationary set $S(f) = \Sigma \cup K \cup V$ must belong to the zero set of some nonzero harmonic polynomial (not necessarily homogeneous, if $V$ is not a cone with a common vertex with $S_0$).

Now the question is whether all the possibilities are realizable, namely, whether each of the sets $\Sigma$, $K$, $V$ and any unions of sets of these three types are the stationary sets $S(f)$ for some $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$ or, more generally, are contained in a stationary set. Below we give positive answers for the sets $\Sigma$, $K$, $V$, and $\Sigma \cup V$. The case of $\Sigma \cup K \cup V$, where each of the three sets is nonempty, remains unsolved.

Given a polynomial $G \in \mathbb{R}[x_1, \dots, x_n]$, denote by $T_G$ the distribution $\langle T_G, \varphi \rangle = (G(\partial)\varphi)(0)$, $\varphi \in \mathcal{D}(\mathbb{R}^n)$. Here $G(\partial) = G(\partial/\partial x_1, \dots, \partial/\partial x_n)$.

Lemma 7.1. Let $F$ be a homogeneous harmonic polynomial in $\mathbb{R}^n$, with complex coefficients and of degree $k$. Then $F(\partial)|x|^{2m} = d_{m,k} |x|^{2(m-k)} F(x)$ if $k \le m$, and $F(\partial)|x|^{2m} = 0$ if $k > m$. Here $d_{m,k} = 2^k m!/(m-k)!$.

Proof. The polynomial $F$ can be represented as a linear combination of polynomials $(c_1 x_1 + \cdots + c_n x_n)^k$, $k = \deg F$, with $c_i \in \mathbb{C}$, $c_1^2 + \cdots + c_n^2 = 0$. Thus it suffices to check the identity for such simple polynomials. This is done by straightforward computation.

The following theorem shows that the stationary set generated by a homogeneous distribution (of finite order) supported at a single point coincides with the common zeros of the iterated Laplacians of the symbol of the corresponding differential operator. Recall that, for a polynomial $P$, $N(P) = P^{-1}(0)$ is the zero set of that polynomial.

Theorem 7.2. For any homogeneous polynomial $G \in \mathbb{R}[x_1, \dots, x_n]$, we have
$$S(T_G) = \bigcap_{j \ge 0} N(\Delta^j G).$$
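Reading Theorem 7.2 as $S(T_G) = \bigcap_{j \ge 0} N(\Delta^j G)$, membership of a point in the stationary set can be tested by evaluating the finitely many nonzero iterated Laplacians of $G$. The following Python sketch does this for small examples. It is a minimal illustration, not the authors' method: polynomials are stored as dictionaries from exponent tuples to integer coefficients, and the helper names `diff`, `laplacian`, `iterated_laplacians`, `evaluate`, and `in_common_zero_set` are hypothetical.

```python
def diff(p, i):
    """Partial derivative of the polynomial p with respect to variable i."""
    out = {}
    for e, c in p.items():
        if e[i] > 0:
            e2 = e[:i] + (e[i] - 1,) + e[i + 1:]
            out[e2] = out.get(e2, 0) + c * e[i]
    return {e: c for e, c in out.items() if c != 0}


def laplacian(p, n):
    """Laplacian of p in n variables."""
    out = {}
    for i in range(n):
        for e, c in diff(diff(p, i), i).items():
            out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def iterated_laplacians(G, n):
    """Return [G, Laplacian of G, ...], stopping before the first zero polynomial."""
    chain = []
    p = dict(G)
    while p:
        chain.append(p)
        p = laplacian(p, n)
    return chain


def evaluate(p, x):
    """Value of the polynomial p at the point x."""
    total = 0
    for e, c in p.items():
        term = c
        for xi, ei in zip(x, e):
            term *= xi ** ei
        total += term
    return total


def in_common_zero_set(G, n, x):
    """True if every nonzero iterated Laplacian of G vanishes at x."""
    return all(evaluate(p, x) == 0 for p in iterated_laplacians(G, n))


cube = {(3, 0): 1}                  # G = x1^3 in R^2, with Laplacian 6*x1
saddle = {(2, 0): 1, (0, 2): -1}    # G = x1^2 - x2^2, harmonic
print(iterated_laplacians(cube, 2))
print(in_common_zero_set(cube, 2, (0, 5)), in_common_zero_set(saddle, 2, (1, 1)))
```

For the harmonic example the chain has length one, so the test reduces to $G(x) = 0$, in line with Corollary 7.3.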
Proof. According to Lemma 2.1, $S(T_G)$ is the set of common zeros of the convolutions $G(\partial)u = T_G * u$, where $u \in \mathcal{D}_{\mathrm{rad}}(\mathbb{R}^n)$. This set coincides with the common zeros of the polynomials $G(\partial)|x|^{2m}$, $m = 0, 1, \dots$, as the radial polynomials $|x|^{2m}$ form a complete system in $C^\infty_{\mathrm{rad}}$ on any compact set. Thus, $S(T_G) = \{x \in \mathbb{R}^n : G(\partial)|x|^{2m} = 0,\ m = 0, 1, \dots\}$. Represent $G$ in the form
$$G(x) = h_k(x) + |x|^2 h_{k-2}(x) + |x|^4 h_{k-4}(x) + \cdots, \qquad k = \deg G, \tag{7.1}$$
where $h_{k-2j}$ is a harmonic homogeneous polynomial of degree $k - 2j$. We have, for $m \ge j$,
$$\bigl(|x|^{2j} h_{k-2j}\bigr)(\partial)|x|^{2m} = h_{k-2j}(\partial)\Delta^j |x|^{2m} = c_m c_{m-1} \cdots c_{m-j+1}\, h_{k-2j}(\partial)|x|^{2(m-j)}, \qquad c_i = 2i(2i + n - 2).$$
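The two identities used in this computation, $\Delta |x|^{2m} = c_m |x|^{2(m-1)}$ and Lemma 7.1, can be spot-checked symbolically for small $n$, $m$, and $k$. The Python sketch below is a sanity check under stated assumptions, not a proof: polynomials are dictionaries from exponent tuples to integer coefficients, and all helper names (`clean`, `mul`, `scale`, `diff`, `apply_symbol`, `r2_power`, `c`, `d`, `check_laplacian_identity`, `check_lemma_7_1`) are hypothetical.

```python
from math import factorial


def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {e: co for e, co in p.items() if co != 0}


def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return clean(out)


def scale(p, s):
    """Multiply a polynomial by the number s."""
    return clean({e: s * co for e, co in p.items()})


def diff(p, i):
    """Partial derivative with respect to variable i."""
    out = {}
    for e, co in p.items():
        if e[i] > 0:
            e2 = e[:i] + (e[i] - 1,) + e[i + 1:]
            out[e2] = out.get(e2, 0) + co * e[i]
    return clean(out)


def apply_symbol(F, p):
    """Apply the constant-coefficient operator F(d/dx) to the polynomial p."""
    out = {}
    for e, co in F.items():
        q = p
        for i, order in enumerate(e):
            for _ in range(order):
                q = diff(q, i)
        for e2, c2 in q.items():
            out[e2] = out.get(e2, 0) + co * c2
    return clean(out)


def r2_power(n, m):
    """The polynomial |x|^(2m) in n variables."""
    r2 = {tuple(2 if j == i else 0 for j in range(n)): 1 for i in range(n)}
    out = {(0,) * n: 1}
    for _ in range(m):
        out = mul(out, r2)
    return out


def c(i, n):
    """The constant c_i = 2i(2i + n - 2)."""
    return 2 * i * (2 * i + n - 2)


def d(m, k):
    """The constant d_{m,k} = 2^k m!/(m-k)! of Lemma 7.1 (requires k <= m)."""
    return 2 ** k * factorial(m) // factorial(m - k)


def check_laplacian_identity(n, m):
    """Check that the Laplacian of |x|^(2m) equals c_m |x|^(2(m-1)), for m >= 1."""
    lhs = apply_symbol(r2_power(n, 1), r2_power(n, m))
    return lhs == scale(r2_power(n, m - 1), c(m, n))


def check_lemma_7_1(F, k, n, m):
    """Check Lemma 7.1 for a homogeneous harmonic F of degree k in n variables."""
    lhs = apply_symbol(F, r2_power(n, m))
    if k > m:
        return lhs == {}
    return lhs == scale(mul(r2_power(n, m - k), F), d(m, k))


xy = {(1, 1, 0): 1}                  # F = x1*x2 in R^3, degree 2
re_z3 = {(3, 0): 1, (1, 2): -3}      # F = x1^3 - 3*x1*x2^2 in R^2, degree 3
print(check_laplacian_identity(3, 2), check_lemma_7_1(xy, 2, 3, 3))
print(check_lemma_7_1(re_z3, 3, 2, 4), check_lemma_7_1(re_z3, 3, 2, 2))
```

The last call exercises the case $k > m$, where the lemma asserts that the result is the zero polynomial.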
This expression vanishes for $m < j$. We proceed by using Lemma 7.1:
$$G(\partial)|x|^{2m} = \sum_{j=k-m}^{[k/2]} a_{mj} |x|^{2(m+j-k)} h_{k-2j}(x),$$
where $a_{mj} = d_{m-j,\,k-2j}\, c_m c_{m-1} \cdots c_{m-j+1}$ and $2m \ge k$. We can assume that $x \ne 0$, and, as all the polynomials under consideration are homogeneous, we can take $|x| = 1$. Now take $m = k, k-1, \dots, [(k+1)/2]$, and consider the system of $k - [(k+1)/2] + 1 = [k/2] + 1$ linear equations
$$G(\partial)|x|^{2m} = \sum_{j=k-m}^{[k/2]} a_{mj} h_{k-2j}(x) = 0, \qquad k \ge m \ge \left[\frac{k+1}{2}\right],$$
for the $[k/2] + 1$ unknowns $h_{k-2j}(x)$, $j = 0, \dots, [k/2]$. We obtain a linear system with upper triangular matrix having nonzero diagonal entries and therefore conclude that the condition $G(\partial)|x|^{2m} = 0$ for all $m$ is equivalent to $h_k(x) = h_{k-2}(x) = \cdots = 0$.

In turn, the last equalities hold if and only if $G(x) = \Delta G(x) = \Delta^2 G(x) = \cdots = 0$. To check this, observe that
$$\Delta\bigl(|x|^{2j} h_{k-2j}\bigr) = c_j |x|^{2(j-1)} h_{k-2j} + 4j(k-2j)|x|^{2(j-1)} h_{k-2j} = \bigl(c_j + 4j(k-2j)\bigr)|x|^{2(j-1)} h_{k-2j}.$$
We have used here the Euler equation for the homogeneous polynomial $h_{k-2j}$ and also its harmonicity. Then, applying the iterated Laplacians to both sides of (7.1), we obtain
$$\Delta^s G(x) = \sum_{j \ge s} b_{s,j} |x|^{2(j-s)} h_{k-2j}(x),$$
where $b_{s,j} = \prod_{i=0}^{s-1} \bigl(c_{j-i} + 4(j-i)(k-2j)\bigr)$. The matrix $(b_{s,j})$ is again upper triangular and nondegenerate; hence $\Delta^s G(x) = 0$, $s = 0, 1, \dots$, is equivalent to $h_{k-2j}(x) = 0$, $j = 0, 1, \dots$. This completes the proof.

The following two corollaries prove that zero sets of harmonics are stationary sets of a homogeneous distribution supported at a single point and describe all such distributions.

Corollary 7.3. If $F$ is a homogeneous harmonic polynomial, then $N(F) = S(T_F)$.

Corollary 7.4. Let $F$ be a homogeneous harmonic polynomial, and let $G$ be a polynomial in $\mathbb{R}^n$. Then $N(F) \subset S(T_G)$ if and only if $F$ divides all the polynomials $G, \Delta G, \Delta^2 G, \dots$.

Proof. Since $F$ is homogeneous, it is easy to check that $N(F) \subset S(T_G)$ is equivalent to $N(F) \subset S(T_{G_m})$, where $G_m$ is any homogeneous term of $G$. In turn, by Theorem 7.2, this is equivalent to $G_m, \Delta G_m, \Delta^2 G_m, \dots$ vanishing on $N(F)$. Vanishing on the zeros of a harmonic polynomial is equivalent to divisibility (see Lemma 4.1); therefore all the homogeneous terms $G_m$, along with their iterated Laplacians, are divisible by $F$. This proves the corollary.

Finally, any low-dimensional real algebraic variety can be stationary for some solution of the wave equation with point-supported initial data and, moreover, any Coxeter system can be added.

Theorem 7.5 [A]. Let $V$ be an algebraic variety in $\mathbb{R}^n$, $\operatorname{codim} V > 1$. Let $\Sigma$ be either empty or a Coxeter system of hyperplanes. Then there exists a nontrivial polynomial $G \in \mathbb{R}[x_1, \dots, x_n]$ such that $\Sigma \cup V \subset S(T_G)$.

Remark. In order to prove that $N(F) \cup V$, where $F$ is a homogeneous harmonic polynomial and $\operatorname{codim} V > 1$, can be realized as a stationary set, it would be sufficient to prove, according to Corollary 7.4, that the set of homogeneous polynomials $G$ such that $F$ divides all $\Delta^s G$, $s = 0, 1, \dots$, is big enough to satisfy the additional condition on the low-dimensional part: $\Delta^s G|_V = 0$, $s = 0, 1, \dots$.
Because of what has been proven in Theorem 7.2, this means that all the harmonic homogeneous polynomials $h_{k-2j}$ in the decomposition (7.1) are divisible by $F$ and vanish on $V$. However, showing that the space of harmonic homogeneous polynomials $h$ divisible by a given harmonic $F$ is big enough turned out to be very nontrivial in $\mathbb{R}^n$ for $n > 2$ (cf. [A]). We do not even know whether this space is always infinite-dimensional or not.

8. The case of balls
Similar arguments can be used to prove a theorem similar to Theorem 1.3 if $\operatorname{supp} f$ is (roughly) the disjoint union of balls. Let $\mathcal{E}'_D(\mathbb{R}^n)$ be the set of distributions whose support is contained in the union of a finite number of disjoint closed balls and whose support contains the boundaries of these balls. Distributions in $\mathcal{E}'_D(\mathbb{R}^n)$ can be arbitrary inside each closed ball, but their support must contain the entire boundary of each ball. Our next theorem is the analogue of Theorem 6.1 for $\mathcal{E}'_D(\mathbb{R}^n)$.

Theorem 8.1 (Support theorem). Let $f \in \mathcal{E}'_D(\mathbb{R}^n)$, $f \ne 0$. Assume $S = S(f) \ne \emptyset$. Assume there are regular points in $S$, and let $x_0 \in S$ be a regular point and $x_0 \notin \operatorname{supp} f$. Let $A$ be a connected real-analytic hypersurface in $\mathbb{R}^n$ such that $x_0 \in A \subset S$. Let $T_{x_0}$ be the hyperplane tangent to $A$ at $x_0$. Let $C$ be the set of centers of the disks making up $\operatorname{supp} f$. There are two possibilities.
(a) For some $c_0 \in C$, $c_0 \notin T_{x_0}$. In this case, $A \subset T_{x_0} \subset S$ and $\operatorname{supp} f$ is symmetric about $T_{x_0}$. Furthermore, $f$ is odd about $T_{x_0}$.
(b) Or $C \subset T_{x_0}$. In this case, near $x_0$, $S$ is conical about $L = \operatorname{span} C$. Precisely, $A$ generates a subset of $S$ which is conical with edge $L$. In this case, $k = \dim L < n$.

Proof. The proof is similar to the proof of Theorem 6.1. Consider case (a). Let $r_0$ be the smallest radius such that $S(x_0, r_0)$ meets $\operatorname{supp} f$ on a disk $D_0$ not centered on $T_{x_0}$. This implies that the point $y_0$ of tangency of $S(x_0, r_0)$ and $D_0$ is not self-mirror. Then, by Theorem 3.2, its mirror point $y_1$ must also be in $\operatorname{supp} f$.
If this mirror point were in a disk in $\operatorname{supp} f$ centered on $T_{x_0}$, then $y_0$ would lie in the same disk by symmetry; but the disks in $\operatorname{supp} f$ are disjoint. So, by the choice of $r_0$, $y_0$ and $y_1$ must both be boundary points of disks $D_0$ and $D_1$ in $\operatorname{supp} f$ which are not centered on $T_{x_0}$. Assume the disk $D_j$ has center $c_j$ and radius $t_j$, $j = 0, 1$. We show that $t_0 = t_1$, that $T_{x_0} \subset S$, and that $f$ is odd about $T_{x_0}$. For each $x \in A \setminus D_0$, let $S(x)$ be the sphere centered at $x$ of smaller radius tangent to $\partial D_0$, and let $r_x$ be its radius. We claim that there is a neighborhood of $x_0$, $U \subset A$, such that for every point $x \in U$, the sphere $S(x)$, which is tangent to $D_0$, is also tangent to $D_1$. We prove it by contradiction. If $x_1 \in A$ is close enough to $x_0$ and $S(x_1)$ is tangent to
$D_0$ at a point $\tilde y_0$, then the mirror point $\tilde y_1$ to $\tilde y_0$ must be in $D_1$ by Theorem 3.2 and the assumption that the disks in $\operatorname{supp} f$ are all disjoint. Again, since these disks are disjoint (and perhaps by making $x_1$ closer to $x_0$), we can find an $r < r_{x_1}$ such that $S(x_1, r)$ does not meet $D_0$ but is tangent to $D_1$ at a point near $y_1$. Because $(x_1, r)$ is sufficiently close to $(x_0, r_0)$, the mirror point to the point of tangency on $D_1$ is not in $\operatorname{supp} f$. This contradiction to the support theorem, Theorem 3.2, proves the claim.

By the claim above, if $x \in U$ and $r > 0$ is such that $S(x, r)$ is tangent to $D_0$, then it is tangent to $D_1$. This means $x$ must satisfy the equation
$$|c_0 - x| - |c_1 - x| = t_0 - t_1. \tag{8.1}$$
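Equation (8.1) says that the sphere about $x$ touching $D_0$ from outside has the same radius as the one touching $D_1$: $|c_0 - x| - t_0 = |c_1 - x| - t_1$. The Python sketch below checks this on a planar example. It is an illustration only: the centers, radii, and sample points are chosen here for the example, the helper names `tangent_radius` and `satisfies_8_1` are hypothetical, and floating-point comparison with a tolerance is assumed to be adequate.

```python
from math import dist, isclose


def tangent_radius(x, center, t):
    """Radius of the sphere centered at x that touches the ball B(center, t) from outside."""
    return dist(x, center) - t


def satisfies_8_1(x, c0, t0, c1, t1, tol=1e-9):
    """Check |c0 - x| - |c1 - x| = t0 - t1 up to the tolerance tol."""
    return isclose(dist(c0, x) - dist(c1, x), t0 - t1, abs_tol=tol)


# Unequal radii: the solutions lie on one branch of the hyperbola 24*x^2 - y^2 = 24.
c0, t0 = (-5.0, 0.0), 3.0
c1, t1 = (5.0, 0.0), 1.0
for x in [(1.0, 0.0), (5.0, 24.0)]:
    print(satisfies_8_1(x, c0, t0, c1, t1),
          tangent_radius(x, c0, t0), tangent_radius(x, c1, t1))

# Equal radii: the solutions lie on the bisector hyperplane x = 0.
print(satisfies_8_1((0.0, 7.0), c0, 1.0, c1, 1.0))
```

With unequal radii the locus is curved, which is the case the proof goes on to exclude; with equal radii it degenerates to the bisector hyperplane of the two centers.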
First, assume $t_0 \ne t_1$. Equation (8.1) is the equation of a hyperboloid of two sheets, and so $U$ is an open set on a hyperboloid. We show that the entire hyperboloid of two sheets is contained in $S(f)$. Let $G(x) = 0$ be the second-order polynomial equation (with $G$ irreducible over $\mathbb{R}$) that defines the hyperboloid, and let $S(f) = N(Q) \cup V$ as in Lemma 4.2, where $Q$ is a harmonic divisor and $V$ is a low-dimensional variety. Clearly, the hypersurface $A$ and therefore the set $U$ belong to the $(n-1)$-dimensional part $S_0 = N(Q)$. Thus, the polynomial $G$ vanishes on an open subset of $N(Q)$; then $G$ vanishes on an open subset of some irreducible component of $N(Q)$. Now, by Lemma 4.1, $G$ must vanish on the entire component, and the irreducible factor defining the component is a divisor of $G$. As $G$ is itself irreducible, the component is just the quadric $N(G)$, and we have the entire two-sheeted hyperboloid $S_1$ in $N(Q)$. This case is eliminated by our next lemma.

Lemma 8.2. Let $S_1$ be a regular real-analytic surface (possibly disconnected). Assume that $S_1$ contains two points, $a$ and $b$, $a \ne b$, such that the segment $ab$ is perpendicular to the tangent planes $T_a$ and $T_b$ to $S_1$ at the points $a$ and $b$, respectively. Then the spherical Radon transform defined by (2.1) is injective on $S_1$.

This lemma is the $n$-dimensional generalization of [AQ2, Th. 4.1]. The proof in $\mathbb{R}^n$ is the exact analogue of the proof on [AQ2, pp. 397–398] because the microlocal properties of the Radon transform are analogous, as shown above, and because the geometry is the same.

This implies $t_0 = t_1$, and $U$ is an open set on a hyperplane. Now, using the fact that $S$ is an algebraic variety (see Lemma 4.2) allows us to infer that $T_{x_0} \subset S$. Finally, using the reflection principle [CH, Vol. II, p. 699 ff.], we see that $f$ is odd about $T_{x_0}$.

Case (b) is very similar to the case for finite support. Let $D_0$ be a disk in $\operatorname{supp} f$ with center $c_0 \in T_{x_0}$. We show that, near $x_0$, $A$ generates a subset of $S$ that is conical
about $c_0$. For $x \in A \setminus D_0$, let $S(x)$ be the sphere of smaller radius that is tangent to $\partial D_0$. Let $y_0$ be the unique point of intersection of $S(x_0)$ and $D_0$. Then, by construction, $y_0$ is self-mirror. By smoothness of $A$, we can find a neighborhood $U \subset A$ of $x_0$ and a neighborhood $V \subset \mathbb{R}^n$ of $y_0$ such that for each $x \in U$ the $T_x$-mirror point to the single point $S(x) \cap D_0$ is also in $V$. By perhaps making $U$ and $V$ smaller, we can assume $(V \cap \operatorname{supp} f) \subset D_0$. So, for each $x \in U$, the mirror point to $S(x) \cap D_0$ must be in $\operatorname{supp} f$ (see Theorem 3.2) and it must be in $D_0$. By convexity, this point is self-mirror. This implies that $c_0 \in T_x$ for all $x \in U$. As with the case of finite support, we see that $U$ generates a subset of $S$ that is conical about $c_0$. Since this is true for all $x \in A \setminus \operatorname{supp} f$ and all centers in $C$, $A \setminus \operatorname{supp} f$ generates a subset of $S$ that is conical about all points of $L$.

Theorem 8.1 can be used to prove a version of Theorem 1.3 at least for the part of $S(f)$ disjoint from $\operatorname{supp} f$. By analytic continuation, this gives information about the part of $S(f)$ in $\operatorname{supp} f$: the conical sets defining $S(f)$ continue into $\operatorname{supp} f$. As a result we obtain Theorem 1.3 for the case when $\operatorname{supp} f$ is the union of a finite number of disjoint balls. The geometry of the stationary set in this case is the same as that described in Theorem 1.3 for $f \in \mathcal{E}'_{\mathrm{fin}}(\mathbb{R}^n)$.

9. Concluding remarks
Theorem 1.3 asserts that for initial distributions with finite support, the essential $(n-1)$-dimensional part of the stationary set is a cone. From Section 7 we learn that this cone appears as the set of common zeros of spatial harmonics in the Fourier decomposition of the initial distribution. Correspondingly, this happens only when these harmonics simultaneously vanish on a large set (i.e., are coherent). More specifically, the cone may contain a system of Coxeter mirrors if the initial data (sources) admit a corresponding symmetry. In this case, vanishing of the solution of the wave equation on the mirrors is the result of cancellation of waves propagated by symmetric sources.

We expect that the stationary sets have a similar geometry for compactly supported initial data and, more generally, for distributions vanishing sufficiently fast at infinity. The main difficulty in proving that is obtaining the conical structure of the essential part of stationary sets. This was done in [AQ1] and [AQ2] for $n = 2$, by using symmetry (mirror points) of the support of the initial data, given by the support theorem (Theorem 3.2), and the simple structure of zero sets of harmonic polynomials of two variables. Lack of information about zero sets of harmonic polynomials of more than two variables is the main obstacle to extending our approach to describing stationary sets for compactly supported or rapidly decreasing initial data in $\mathbb{R}^n$. Nevertheless, we
hope to succeed using a deeper analysis of the algebraic and geometric structure of stationary sets and by refining the microlocal results that go into the proof of Theorem 1.3 to be valid more generally, such as for rapidly decreasing functions.

Note, in conclusion, that the conical structure is related to the rate of decay of the initial data at infinity and does not occur in general. For instance, radial time-harmonic solutions to the wave equation have stationary sets that are concentric spheres (with radii determined by the zeros of a Bessel function).

Acknowledgments. The authors would like to thank Carlos Berenstein and Eric Grinberg for useful comments as the article was being prepared and written. Conversations with Valery Volchkov at the beginning of this research were stimulating. Finally, the comments of the referee were insightful and detailed, and they are appreciated by the authors.
GEOMETRY OF STATIONARY SETS FOR THE WAVE EQUATION
Agranovsky: Department of Mathematics, Bar-Ilan University, 53000 Ramat-Gan, Israel; agranovs@macs.biu.ac.il
Quinto: Department of Mathematics, Tufts University, Medford, Massachusetts 02155 USA; [email protected]
A GENERALIZATION OF CONJECTURES OF BOGOMOLOV AND LANG OVER FINITELY GENERATED FIELDS
ATSUSHI MORIWAKI

Abstract
In this paper we prove a generalization of conjectures of Bogomolov and Lang in terms of an arithmetic Néron-Tate height pairing over a finitely generated field.

§0. Introduction
Let $K$ be a finitely generated field over $\mathbb{Q}$ with $d = \operatorname{tr.deg}_{\mathbb{Q}}(K)$, and let $\overline{B}$ be a big polarization of $K$. Let $A$ be an abelian variety over $K$, and let $L$ be a symmetric ample line bundle on $A$. In [4] we define the height pairing
\[ \langle\ ,\ \rangle_L^{\overline{B}} : A(\overline{K}) \times A(\overline{K}) \longrightarrow \mathbb{R} \]
assigned to $\overline{B}$ and $L$ with the following properties: $\langle x, x\rangle_L^{\overline{B}} \ge 0$ for all $x \in A(\overline{K})$, and equality holds if and only if $x \in A(\overline{K})_{\mathrm{tor}}$. For $x_1, \dots, x_l \in A(\overline{K})$, we denote $\det\bigl(\langle x_i, x_j\rangle_L^{\overline{B}}\bigr)$ by $\delta_L^{\overline{B}}(x_1, \dots, x_l)$. The purpose of this paper is to prove the following theorem, which gives an answer to B. Poonen’s question in [5].

Theorem 0.1
Let $\Gamma$ be a subgroup of finite rank in $A(\overline{K})$ (i.e., $\Gamma$ is a subgroup of $A(\overline{K})$ with $\dim_{\mathbb{Q}}(\Gamma \otimes \mathbb{Q}) < \infty$), and let $X$ be a subvariety of $A_{\overline{K}}$. Fix a basis $\{\gamma_1, \dots, \gamma_n\}$ of $\Gamma \otimes \mathbb{Q}$. If the set
\[ \{ x \in X(\overline{K}) \mid \delta_L^{\overline{B}}(\gamma_1, \dots, \gamma_n, x) \le \epsilon \} \]
is Zariski dense in $X$ for every positive number $\epsilon$, then $X$ is a translation of an abelian subvariety of $A_{\overline{K}}$ by an element of $\Gamma_{\mathrm{div}}$, where
\[ \Gamma_{\mathrm{div}} = \{ x \in A(\overline{K}) \mid nx \in \Gamma \text{ for some positive integer } n \}. \]

In the case where $d = 0$, Poonen [5] and S.-W. Zhang [8] proved the equivalent result independently. Our argument for the proof of Theorem 0.1 essentially follows Poonen’s ideas. The new point is that we remove the measure-theoretic argument from his original proof, so that it can be applied to our case. We note that Theorem 0.1 substantially includes Lang’s conjecture in the absolute form.

DUKE MATHEMATICAL JOURNAL, Vol. 107, No. 1, © 2001. Received 14 September 1999. Revision received 8 May 2000. 2000 Mathematics Subject Classification. Primary 11G35, 14G25, 14G40; Secondary 11G10, 14K15.
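Lang’s conjecture in the absolute form
Let $A$ be a complex abelian variety, let $\Gamma$ be a subgroup of finite rank in $A(\mathbb{C})$, and let $X$ be a subvariety of $A$. Then there are abelian subvarieties $C_1, \dots, C_n$ of $A$ and $\gamma_1, \dots, \gamma_n \in \Gamma$ such that
\[ \overline{X(\mathbb{C}) \cap \Gamma} = \bigcup_{i=1}^{n} (C_i + \gamma_i) \quad\text{and}\quad X(\mathbb{C}) \cap \Gamma = \bigcup_{i=1}^{n} \bigl(C_i(\mathbb{C}) + \gamma_i\bigr) \cap \Gamma. \]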
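The quantity $\delta_L^{\overline{B}}(\gamma_1, \dots, \gamma_n, x)$ in Theorem 0.1 is a Gram determinant, and for any inner product such a determinant equals the Gram determinant of $\gamma_1, \dots, \gamma_n$ times the squared distance from $x$ to their span. The following minimal sketch illustrates this; it assumes the standard inner product on $\mathbb{Q}^3$ as a stand-in for the height pairing, and the helper names `det`, `dot`, and `gram_delta` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def dot(u, v):
    """Standard inner product (an assumed stand-in for the height pairing)."""
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))


def gram_delta(vectors):
    """det(<x_i, x_j>), the analogue of delta(x_1, ..., x_l)."""
    return det([[dot(u, v) for v in vectors] for u in vectors])


g1, g2 = (1, 0, 0), (1, 1, 0)
inside = (3, -2, 0)   # lies in the span of g1, g2
outside = (3, -2, 2)  # at distance 2 from the span of g1, g2

print(gram_delta([g1, g2, inside]))
print(gram_delta([g1, g2, outside]))
print(gram_delta([g1, g2]) * 2 ** 2)
```

In this toy setting, a small value of the determinant means that the last vector is close to the span of the first ones, which is the sense in which the points in Theorem 0.1 are small relative to $\Gamma$.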
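§1. Review of arithmetic height functions over finitely generated fields
In this section we give a quick review of arithmetic height functions over finitely generated fields. For details, see [4].
Let $K$ be a finitely generated field over $\mathbb{Q}$ with $d = \operatorname{tr.deg}_{\mathbb{Q}}(K)$, and let $\overline{B} = (B; \overline{H}_1, \dots, \overline{H}_d)$ be a big polarization of $K$; that is, $B$ is a normal projective scheme over $\mathbb{Z}$ whose function field is $K$, and $\overline{H}_1, \dots, \overline{H}_d$ are nef and big $C^\infty$-hermitian line bundles on $B$. For the definition of nef and big $C^\infty$-hermitian line bundles, see [4, §2].
Let $X$ be a projective variety over $K$, and let $L$ be a line bundle on $X$. Let us consider a $C^\infty$-model $(\mathcal{X}, \overline{\mathcal{L}})$ of $(X, L)$ over $B$. Namely, $\mathcal{X}$ is a projective integral scheme over $B$ whose generic fiber over $B$ is $X$, and $\overline{\mathcal{L}}$ is a $C^\infty$-hermitian $\mathbb{Q}$-line bundle on $\mathcal{X}$ which gives rise to $L$ on the generic fiber of $\mathcal{X} \to B$. For $x \in X(\overline{K})$, let $\Delta_x$ be the closure of the image of $\operatorname{Spec}(\overline{K}) \xrightarrow{x} X \hookrightarrow \mathcal{X}$. Then we define the height of $x$ with respect to the polarization $\overline{B}$ and the $C^\infty$-model $(\mathcal{X}, \overline{\mathcal{L}})$ to be
\[ h^{\overline{B}}_{(\mathcal{X}, \overline{\mathcal{L}})}(x) = \frac{1}{[K(x) : K]} \, \widehat{\deg}\Bigl( \widehat{c}_1\bigl(\overline{\mathcal{L}}\big|_{\Delta_x}\bigr) \cdot \widehat{c}_1\bigl(\pi^*\overline{H}_1\big|_{\Delta_x}\bigr) \cdots \widehat{c}_1\bigl(\pi^*\overline{H}_d\big|_{\Delta_x}\bigr) \Bigr), \]
where $\pi : \mathcal{X} \to B$ is the canonical morphism. If $(\mathcal{X}', \overline{\mathcal{L}}')$ is another $C^\infty$-model of $(X, L)$, then there is a constant $C$ such that
\[ \bigl| h^{\overline{B}}_{(\mathcal{X}, \overline{\mathcal{L}})}(x) - h^{\overline{B}}_{(\mathcal{X}', \overline{\mathcal{L}}')}(x) \bigr| \le C \]
for all $x \in X(\overline{K})$. Thus, modulo the set of bounded functions, we can assign the unique height function $h_L^{\overline{B}} : X(\overline{K}) \to \mathbb{R}$ to $\overline{B}$ and $L$. Note that if $\sigma \in \operatorname{Gal}(\overline{K}/K)$, then $\Delta_x = \Delta_{\sigma(x)}$. Thus, $h_L^{\overline{B}}(\sigma(x)) = h_L^{\overline{B}}(x)$. The first important result is the following analogue of Northcott’s theorem for our height functions.

Theorem 1.1 [4, Theorem 4.3]
If $L$ is ample, then, for any number $M$ and any positive integer $e$, the set
\[ \{ x \in X(\overline{K}) \mid h_L^{\overline{B}}(x) \le M,\ [K(x) : K] \le e \} \]
is finite.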
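Let $A$ be an abelian variety over $K$, and let $L$ be a symmetric ample line bundle on $A$. Then, as with the usual height functions over a number field, there is the canonical height function $\hat{h}_L^{\overline{B}}$. Actually, if we fix a $C^\infty$-model $(\mathcal{A}, \overline{\mathcal{L}})$ of $(A, L)$ over $B$, then $\hat{h}_L^{\overline{B}}$ is given by
\[ \hat{h}_L^{\overline{B}}(x) = \lim_{n \to \infty} \frac{1}{n^2} \, h^{\overline{B}}_{(\mathcal{A}, \overline{\mathcal{L}})}(nx). \]
This gives rise to a quadratic form on $A(\overline{K})$; that is, if we set
\[ \langle x, y\rangle_L^{\overline{B}} = \frac{1}{2} \Bigl( \hat{h}_L^{\overline{B}}(x + y) - \hat{h}_L^{\overline{B}}(x) - \hat{h}_L^{\overline{B}}(y) \Bigr) \]
for $x, y \in A(\overline{K})$, then $\langle\ ,\ \rangle_L^{\overline{B}}$ is a bilinear form on $A(\overline{K})$. Concerning this bilinear form we have the following proposition.

Proposition 1.2 [4, §3.4]
(1) For all $x \in A(\overline{K})$, $\langle x, x\rangle_L^{\overline{B}} \ge 0$, and equality holds if and only if $x$ is a torsion point. Namely, $\langle\ ,\ \rangle_L^{\overline{B}}$ is positive definite on $A(\overline{K}) \otimes \mathbb{Q}$.
(2) If $f : A \to A'$ is a homomorphism of abelian varieties over $K$, and $L'$ is a symmetric ample line bundle on $A'$, then there is a positive number $a$ with
\[ \langle f(x), f(x)\rangle_{L'}^{\overline{B}} \le a \langle x, x\rangle_L^{\overline{B}} \]
for all $x \in A(\overline{K})$.

Remark 1.3
Proposition 1.2(2) holds even if $f$, $A'$, $L'$ are not defined over $K$. Let $K'$ be a finite extension field of $K$ such that $f$, $A'$, $L'$ are defined over $K'$. Let $\phi : B^{K'} \to B$ be the normalization of $B$ in $K'$. Then $\overline{B}^{K'} = (B^{K'}; \phi^*(\overline{H}_1), \dots, \phi^*(\overline{H}_d))$ gives rise to a big polarization of $K'$. Thus, there is a positive number $a'$ with
\[ \langle f(x), f(x)\rangle_{L'}^{\overline{B}^{K'}} \le a' \langle x, x\rangle_L^{\overline{B}^{K'}} \]
for all $x \in A(\overline{K})$. On the other hand, $\langle\ ,\ \rangle_L^{\overline{B}^{K'}} = [K' : K] \langle\ ,\ \rangle_L^{\overline{B}}$. Hence,
\[ \langle f(x), f(x)\rangle_{L'}^{\overline{B}^{K'}} \le a' [K' : K] \langle x, x\rangle_L^{\overline{B}} \]
for all $x \in A(\overline{K})$.

The crucial result for this note is the following solution of Bogomolov’s conjecture over finitely generated fields, which is a generalization of [6] and [7].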
Theorem 1.4 [4, Theorem 8.1]
Let $X$ be a subvariety of $A_{\overline{K}}$. If the set
\[ \{ x \in X(\overline{K}) \mid \hat{h}_L^{\overline{B}}(x) \le \epsilon \} \]
is Zariski dense in $X$ for every positive number $\epsilon$, then $X$ is a translation of an abelian subvariety of $A_{\overline{K}}$ by a torsion point.

§2. Small points with respect to a group of finite rank
The contents of this section are essentially due to Poonen [5]. We just deal with his ideas in a general situation.
Let $K$ be a finitely generated field over $\mathbb{Q}$ with $d = \operatorname{tr.deg}_{\mathbb{Q}}(K)$, and let $\overline{B}$ be a big polarization of $K$. Let $A$ be an abelian variety over $K$, and let $L$ be a symmetric ample line bundle on $A$. Let
\[ \langle\ ,\ \rangle_L^{\overline{B}} : A(\overline{K}) \times A(\overline{K}) \longrightarrow \mathbb{R} \]
be the height pairing associated with $\overline{B}$ and $L$ as in §1. Let $\Gamma$ be a subgroup of finite rank in $A(\overline{K})$; namely, $\Gamma$ is a subgroup of $A(\overline{K})$ with $\dim_{\mathbb{Q}}(\Gamma \otimes \mathbb{Q}) < \infty$. Note that
\[ \Gamma_{\mathrm{div}} = \{ x \in A(\overline{K}) \mid nx \in \Gamma \text{ for some positive integer } n \} \]
is also a subgroup of finite rank in $A(\overline{K})$.
A nonempty subset $S$ of $A(\overline{K})$ is said to be small with respect to $\Gamma$ if there is a decomposition $s = \gamma(s) + z(s)$ for each $s \in S$ with the following properties:
(a) $\gamma(s) \in \Gamma$ for all $s \in S$;
(b) for any $\epsilon > 0$ there is a finite proper subset $S'$ of $S$ such that $\langle z(s), z(s)\rangle_L^{\overline{B}} \le \epsilon$ for all $s \in S \setminus S'$.
In particular, a subset $S$ that is small with respect to $\{0\}$ is simply said to be small. Namely, a nonempty subset $S$ of $A(\overline{K})$ is small if and only if, for any positive number $\epsilon$, there is a finite proper subset $S'$ of $S$ with $\langle s, s\rangle_L^{\overline{B}} \le \epsilon$ for all $s \in S \setminus S'$. Note that in the above definition $S'$ is proper; that is, $S \setminus S' \ne \emptyset$. Let us begin with the following proposition.

Proposition 2.1
Let $S$ be a nonempty subset of $A(\overline{K})$, and let $\Gamma$ be a subgroup of finite rank in $A(\overline{K})$. Then we have the following.
(1) If $S$ is small with respect to $\Gamma$, then any infinite subset of $S$ is small with respect to $\Gamma$.
(2) We assume that $S$ is finite. Then $S$ is small (with respect to $\{0\}$) if and only if $S$ contains a torsion point.
(3) We assume that $S$ is infinite. Let $N$ be a positive integer, and let $[N]$ be the endomorphism of $A$ given by $[N](x) = Nx$. If $S$ is small with respect to $\Gamma$, then so is $[N](S)$.
(4) Let $\{x_n\}$ be a sequence in $A(\overline{K})$ with the following properties:
(4.1) if $n \ne m$, then $x_n \ne x_m$;
(4.2) each $x_n$ has a decomposition $x_n = \gamma_n + y_n$ with $\gamma_n \in \Gamma$;
(4.3) $\lim_{n \to \infty} \langle y_n, y_n\rangle_L^{\overline{B}} = 0$.
Then $\{x_n \mid n = 1, 2, \dots\}$ is small with respect to $\Gamma$.

Proof
(1) and (4) are obvious.
(2) Clearly, if $S$ contains a torsion point, then $S$ is small. We assume that $S$ is small. We set $\lambda = \min\{\langle s, s\rangle_L^{\overline{B}} \mid s \in S\}$. If $\lambda > 0$, then there is a finite proper subset $S'$ of $S$ such that $\langle s, s\rangle_L^{\overline{B}} < \lambda$ for all $s \in S \setminus S'$. This is a contradiction. Thus, $\lambda = 0$, which means that $S$ contains a torsion point.
(3) We fix a map $t : [N](S) \to S$ with $[N](t(s)) = s$ for all $s \in [N](S)$. Then we have a decomposition $s = [N](\gamma(t(s))) + [N](z(t(s)))$ for each $s \in [N](S)$. Clearly (a) in the definition of small sets is satisfied. Let $\epsilon$ be an arbitrary positive number. Then there is a finite subset $T$ of $S$ such that $\langle z(s), z(s)\rangle_L^{\overline{B}} \le \epsilon / N^2$ for all $s \in S \setminus T$. If we set $T' = \{s \in [N](S) \mid t(s) \in T\}$, then $T'$ is finite. Moreover,
\[ \bigl\langle [N](z(t(s))), [N](z(t(s))) \bigr\rangle_L^{\overline{B}} \le \epsilon \]
for all $s \in [N](S) \setminus T'$. Therefore, we have (b) in the definition of small sets.

Moreover, we have the following, which is a consequence of Bogomolov’s conjecture.

Theorem 2.2
Let $S$ be a small set of $A(\overline{K})$; that is, $S$ is small with respect to $\{0\}$. Then there are abelian subvarieties $C_1, \dots, C_r$, torsion points $c_1, \dots, c_r$, and finitely many nontorsion points $b_1, \dots, b_m$ such that
\[ \overline{S} = \bigcup_{i=1}^{r} (C_i + c_i) \cup \{b_1, \dots, b_m\}, \]
where $\overline{S}$ is the Zariski closure of $S$.

Proof
It is sufficient to show that a positive-dimensional irreducible component $X$ of $\overline{S}$ is a translation of an abelian subvariety of $A$ by a torsion point. Let $S'$ be the set of points of $S$ that are contained in $X(\overline{K})$. Then the Zariski closure of $S'$ is $X$. In particular, $S'$ is an infinite set, so that $S'$ is small by Proposition 2.1(1). Thus, $X$ is a translation of an abelian subvariety of $A$ by a torsion point by virtue of Theorem 1.4.
Let $S$ be a small subset with respect to $\Gamma$. For each $n \ge 2$, let us consider the homomorphism $\beta_n : A^n \to A^{n-1}$ given by $\beta_n(a_1, \dots, a_n) = (a_2 - a_1, a_3 - a_1, \dots, a_n - a_1)$. Let $F$ be a finite extension field of $K$ in $\overline{K}$. For $x \in A(\overline{K})$ we denote by $O_F(x)$ the orbit of $x$ under the Galois group $\operatorname{Gal}(\overline{K}/F)$. Noting that $O_F(s)^n \subseteq A(\overline{K})^n$, for a subset $T$ of $S$ we define $D_n(T, F)$ to be
\[ D_n(T, F) = \bigcup_{s \in T} \beta_n\bigl(O_F(s)^n\bigr). \]
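The construction of $D_n(T, F)$ can be tried out on a toy model. The sketch below is only an illustration under loud assumptions: the group $\mathbb{Z}/12\mathbb{Z}$ stands in for $A(\overline{K})$, and the "Galois orbits" are hypothetical, chosen by hand as cosets of the subgroup $\{0, 4, 8\}$; the names `beta` and `d_n` are not from the paper.

```python
from itertools import product

MOD = 12  # toy group Z/12Z, an assumed stand-in for A(K-bar)


def beta(point):
    """beta_n(a_1, ..., a_n) = (a_2 - a_1, ..., a_n - a_1), computed mod 12."""
    first = point[0]
    return tuple((a - first) % MOD for a in point[1:])


def d_n(orbits, n):
    """Union over the given orbits O of beta_n(O^n)."""
    image = set()
    for orbit in orbits:
        for point in product(sorted(orbit), repeat=n):
            image.add(beta(point))
    return image


# Hypothetical orbits: each one is a coset of the subgroup C = {0, 4, 8}.
orbits = [{1, 5, 9}, {2, 6, 10}]

print(sorted(d_n(orbits, 2)))  # differences inside an orbit
print(len(d_n(orbits, 3)))     # number of points of D_3
```

In this toy case $D_2$ is the subgroup $C = \{0, 4, 8\}$ and $D_3$ is all of $C^2$, which mirrors the shape $C^{n-1}$ that appears later in Theorem 2.5.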
We denote the Zariski closure of $D_n(T, F)$ by $\overline{D}_n(T, F)$. On $A^n$ we can give the height pairing associated with $\bigotimes_{i=1}^{n} p_i^*(L)$ and $\overline{B}$, where $p_i : A^n \to A$ is the projection to the $i$th factor. By abuse of notation, we denote this by $\langle\ ,\ \rangle_L^{\overline{B}}$.

Proposition 2.3
Let $f : A \to A'$ be a homomorphism of abelian varieties over $K$. Let $F$ be a finite extension field of $K$ in $\overline{K}$. We assume that there is a finitely generated subgroup $\Gamma_0$ of $\Gamma$ such that $\Gamma_0 \subseteq A(K)$ and $\Gamma_0 \otimes \mathbb{Q} = \Gamma \otimes \mathbb{Q}$. Then we have the following:
(a) $f^{n-1}(D_n(S, F))$ is small (with respect to $\{0\}$), where $f^{n-1} : A^{n-1} \to A'^{\,n-1}$ is the morphism given by $f^{n-1}(x_1, \dots, x_{n-1}) = (f(x_1), \dots, f(x_{n-1}))$;
(b) let $b_1, \dots, b_l$ be nontorsion points in $f^{n-1}(D_n(S, F))$; then there is a finite proper subset $S'$ of $S$ such that $b_i \notin f^{n-1}(D_n(S \setminus S', F))$ for all $i$.

Proof
Let $\sigma$, $\tau$ be elements of $\operatorname{Gal}(\overline{K}/F)$. Then $\sigma(\gamma(s)) - \tau(\gamma(s))$ is torsion because $n\gamma(s) \in \Gamma_0$ for some $n > 0$. Thus,
\[ \|\sigma(s) - \tau(s)\|_L^{\overline{B}} = \|\sigma(z(s)) - \tau(z(s))\|_L^{\overline{B}} \le 2\|z(s)\|_L^{\overline{B}}, \]
where $\|x\|_L^{\overline{B}} = \sqrt{\langle x, x\rangle_L^{\overline{B}}}$. Therefore,
\[ \|\beta_n(x)\|_L^{\overline{B}} \le 2\sqrt{n - 1}\, \|z(s)\|_L^{\overline{B}} \]
for all $x \in O_F(s)^n$. Let $L'$ be a symmetric ample line bundle on $A'$. Then, by Proposition 1.2(2) (or Remark 1.3), there is a positive constant $a$ with $\langle f(x), f(x)\rangle_{L'}^{\overline{B}} \le a \langle x, x\rangle_L^{\overline{B}}$ for all $x \in A(\overline{K})$. Thus,
\[ \bigl\| f^{n-1}(\beta_n(x)) \bigr\|_{L'}^{\overline{B}} \le 2\sqrt{a(n - 1)}\, \|z(s)\|_L^{\overline{B}} \tag{1} \]
for all $x \in O_F(s)^n$.
First, let us see (b). We set $\mu = \min\{\|b_i\|_{L'}^{\overline{B}} \mid i = 1, \dots, l\} > 0$. Then there is a finite proper subset $S'$ of $S$ with
\[ \|z(s)\|_L^{\overline{B}} < \frac{\mu}{2\sqrt{a(n - 1)}} \]
for all $s \in S \setminus S'$. Thus, by (1),
\[ \bigl\| f^{n-1}(\beta_n(x)) \bigr\|_{L'}^{\overline{B}} < \mu \]
for all $x \in \bigcup_{s \in S \setminus S'} O_F(s)^n$. Hence, $b_i \notin f^{n-1}(D_n(S \setminus S', F))$ for all $i$.
Next we consider (a). If $f^{n-1}(D_n(S, F))$ is infinite, then the assertion of (a) is obvious by (1). Otherwise, let $\{b_1, \dots, b_l\}$ be the set of all nontorsion points in $f^{n-1}(D_n(S, F))$. Then, by (b), we can find a finite proper subset $S'$ of $S$ with
\[ \emptyset \ne f^{n-1}\bigl(D_n(S \setminus S', F)\bigr) \subseteq f^{n-1}\bigl(D_n(S, F)\bigr) \setminus \{b_1, \dots, b_l\}. \]
Hence, $f^{n-1}(D_n(S, F))$ contains a torsion point. Therefore, $f^{n-1}(D_n(S, F))$ is small by Proposition 2.1(2).
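The inequality $\|\beta_n(x)\| \le 2\sqrt{n-1}\,\|z(s)\|$ used in the proof above only needs the triangle inequality and the fact that all conjugates of $z(s)$ have the same norm. The sketch below checks the analogous Euclidean statement numerically. It is an assumption-laden model, not the height pairing itself: points of an orbit are modeled as a common vector `gamma` plus vectors of one fixed length `r`, the norm on the product is the square root of the sum of squared coordinate norms, and the names `norm`, `beta`, and `worst_ratio` are hypothetical.

```python
import math
import random


def norm(v):
    """Euclidean norm, standing in for the norm coming from the height pairing."""
    return math.sqrt(sum(c * c for c in v))


def beta(points):
    """beta_n on a list of vectors: differences from the first one, concatenated."""
    first = points[0]
    out = []
    for p in points[1:]:
        out.extend(pc - fc for pc, fc in zip(p, first))
    return out


def worst_ratio(n, dim, trials, seed=0):
    """Largest observed value of ||beta_n(x)|| / r over random model orbits."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        gamma = [rng.uniform(-5, 5) for _ in range(dim)]
        r = rng.uniform(0.1, 2.0)  # common norm of the z-parts
        points = []
        for _ in range(n):
            direction = [rng.gauss(0, 1) for _ in range(dim)]
            scale = r / norm(direction)
            points.append([g + scale * d for g, d in zip(gamma, direction)])
        worst = max(worst, norm(beta(points)) / r)
    return worst


n = 4
print(worst_ratio(n, 3, 200), "<=", 2 * math.sqrt(n - 1))
```

Each of the $n - 1$ coordinates of $\beta_n(x)$ is a difference of two vectors of norm $r$, hence has norm at most $2r$, which gives the factor $2\sqrt{n-1}$.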
Let $S$ be a small subset with respect to $\Gamma$. From now on we assume the following:
(A) $S$ is infinite;
(B) there is a finitely generated subgroup $\Gamma_0$ of $\Gamma$ such that $\Gamma_0 \subseteq A(K)$ and $\Gamma_0 \otimes \mathbb{Q} = \Gamma \otimes \mathbb{Q}$.
Let $F$ be a finite extension field of $K$ in $\overline{K}$. A pair $(S, F)$ is said to be $n$-minimized if the following properties are satisfied:
(i) $\overline{D}_n(S', F') = \overline{D}_n(S, F)$ for any infinite subset $S'$ of $S$ and any finite extension field $F'$ of $F$ in $\overline{K}$ (recall that $\overline{D}_n(\cdot, \cdot)$ is the Zariski closure of $D_n(\cdot, \cdot)$);
(ii) $\overline{D}_n([N](S), F) = \overline{D}_n(S, F)$ for any positive integer $N$.
Note that $[N](O_F(s)) = O_F([N](s))$ for $s \in S$ and a positive integer $N$, so that $D_n([N](S), F) = [N](D_n(S, F))$. Therefore, (ii) is equivalent to saying that $[N](\overline{D}_n(S, F)) = \overline{D}_n(S, F)$ for any positive integer $N$. First let us consider the following proposition.

Proposition 2.4
(1) If we fix $n \ge 2$, then there are an infinite subset $T$ of $S$, a positive integer $N$, and a finite extension field $F$ of $K$ in $\overline{K}$ such that $([N](T), F)$ is $n$-minimized.
(2) Let $F$ be a finite extension field of $K$ in $\overline{K}$. Let $N$ be a positive integer, let $S'$ be an infinite subset of $[N](S)$, and let $F'$ be a finite extension field of $F$ in $\overline{K}$. If $(S, F)$ is $n$-minimized, then $\overline{D}_n(S', F') = \overline{D}_n(S, F)$.

Proof
(1) Let $F$ be a finite extension field of $K$ in $\overline{K}$. A pair $(S, F)$ is said to be weakly $n$-minimized if the above property (i) is satisfied. First we claim the following.

Claim 2.4.1
(a) If we fix $n \ge 2$, then there are an infinite subset $T$ of $S$ and a finite extension field $F$ of $K$ such that $(T, F)$ is weakly $n$-minimized.
(b) Let $F$ be a finite extension field of $K$ in $\overline{K}$. If $(S, F)$ is weakly $n$-minimized, then there are abelian subvarieties $C_1, \dots, C_r$ and torsion points $c_1, \dots, c_r$ such that
\[ \overline{D}_n(S, F) = \bigcup_{i=1}^{r} (C_i + c_i). \]
(c) Let $F$ be a finite extension field of $K$ in $\overline{K}$, and let $N$ be a positive integer. If $(S, F)$ is weakly $n$-minimized, then so is $([N](S), F)$.

Proof
(a) We set
\[ \Sigma = \bigl\{ \overline{D}_n(T, F) \bigm| T \text{ is an infinite subset of } S, \text{ and } F \text{ is a finite extension field of } K \text{ in } \overline{K} \bigr\}. \]
Then $\Sigma$ is a set of closed subsets of $A^{n-1}_{\overline{K}}$. Since $A^{n-1}_{\overline{K}}$ is a Noetherian space, there is a minimal element $\overline{D}_n(T, F)$ in $\Sigma$.
(b) By Theorem 2.2 there are abelian subvarieties $C_1, \dots, C_r$, torsion points $c_1, \dots, c_r$, and finitely many nontorsion points $b_1, \dots, b_m$ such that
\[ \overline{D}_n(S, F) = \bigcup_{i=1}^{r} (C_i + c_i) \cup \{b_1, \dots, b_m\}. \]
By virtue of Proposition 2.3(b) we can find a finite subset $T$ of $S$ such that
\[ \overline{D}_n(S \setminus T, K) \subseteq \bigcup_{i=1}^{r} (C_i + c_i) \subseteq \overline{D}_n(S, F). \]
Here $\overline{D}_n(S \setminus T, K) = \overline{D}_n(S, K)$. Thus, we get (b).
(c) Let $S_1$ be an infinite subset of $[N](S)$ and $F'$ a finite extension field of $F$ in $\overline{K}$. We take a subset $S'$ of $S$ with $[N](S') = S_1$. Then $\overline{D}_n(S', F') = \overline{D}_n(S, F)$. Thus, since $[N]$ is a finite and surjective morphism, we can see that
\[ \overline{D}_n(S_1, F') = \overline{D}_n([N](S'), F') = [N]\bigl(\overline{D}_n(S', F')\bigr) = [N]\bigl(\overline{D}_n(S, F)\bigr) = \overline{D}_n([N](S), F). \]
Hence, we have (c).

Let us start the proof of (1). By virtue of (a) there are an infinite subset $T$ of $S$ and a finite extension field $F$ of $K$ such that $(T, F)$ is weakly $n$-minimized. Hence, by (b), there are abelian subvarieties $C_1, \dots, C_r$ and torsion points $c_1, \dots, c_r$ such that
\[ \overline{D}_n(T, F) = \bigcup_{i=1}^{r} (C_i + c_i). \]
Let $N$ be a positive integer with $Nc_i = 0$ for all $i$. Then
\[ \overline{D}_n([N](T), F) = [N]\bigl(\overline{D}_n(T, F)\bigr) = \bigcup_{i=1}^{r} C_i. \]
Here we claim that $([N](T), F)$ is $n$-minimized. By (c), $([N](T), F)$ is weakly $n$-minimized. Moreover, for any positive integer $N'$,
\[ \overline{D}_n\bigl([N']([N](T)), F\bigr) = [N']\bigl(\overline{D}_n([N](T), F)\bigr) = [N']\Bigl(\bigcup_{i=1}^{r} C_i\Bigr) = \bigcup_{i=1}^{r} C_i = \overline{D}_n([N](T), F). \]
Thus, $([N](T), F)$ is $n$-minimized.
(2) Let $N$ be a positive integer, let $S'$ be an infinite subset of $[N](S)$, and let $F'$ be a finite extension field of $F$. By (c), $([N](S), F)$ is weakly $n$-minimized. Thus,
\[ \overline{D}_n(S', F') = \overline{D}_n([N](S), F) = \overline{D}_n(S, F). \]
Therefore, we get (2).

Finally, let us consider the following theorem, which is crucial for our paper.

Theorem 2.5
Let $F$ be a finite extension field of $K$ in $\overline{K}$. If $(S, F)$ is 2-minimized, then there is an abelian subvariety $C$ of $A_{\overline{K}}$ such that $\overline{D}_n(S, F) = C^{n-1}$ for all $n \ge 2$.

Proof
Let us begin with the following lemma.
Lemma 2.6
Let $F$ be a finite extension field of $K$ in $\overline{K}$, and let $C$ be an abelian subscheme of $A_F$ over $F$. We assume that there is a positive integer $e$ with the following property: for each $s \in S$ there is a subset $T(s)$ of $O_F(s) \times O_F(s)$ such that $\beta_2(T(s)) \subseteq C(\overline{K})$ and $\#(T(s)) \ge \#(O_F(s) \times O_F(s))/e$. Then there are a finite subset $S'$ of $S$ and a positive integer $N$ with $D_2([N](S \setminus S'), F) \subseteq C(\overline{K})$.

Proof
Let $\pi : A \to A/C$ be the natural homomorphism. Fix $s \in S$. Let $F'$ be a finite Galois extension of $F$ such that $F'$ contains $F(s)$. Then there is a natural surjective map
\[ \phi : \operatorname{Gal}(F'/F) \longrightarrow O_F(s), \]
whose fibers are cosets of the stabilizer of $s$. If we set $E(s) = (\phi \times \phi)^{-1}(T(s))$, then $\#(E(s)) \ge \#(\operatorname{Gal}(F'/F) \times \operatorname{Gal}(F'/F))/e$ and $\sigma(\pi(s)) = \tau(\pi(s))$ for all $(\sigma, \tau) \in E(s)$. Let $G_{\pi(s)}$ be the stabilizer of $\pi(s)$ under the action of $\operatorname{Gal}(F'/F)$, and let $R$ be the set of all $(\sigma, \tau) \in \operatorname{Gal}(F'/F) \times \operatorname{Gal}(F'/F)$ with $\sigma(\pi(s)) = \tau(\pi(s))$. Then we have
\[ \#(R) = \#\bigl(G_{\pi(s)}\bigr)\, \#\bigl(\operatorname{Gal}(F'/F)\bigr) \quad\text{and}\quad \#(R) \ge \frac{\#\bigl(\operatorname{Gal}(F'/F) \times \operatorname{Gal}(F'/F)\bigr)}{e}. \]
Thus, $[\operatorname{Gal}(F'/F) : G_{\pi(s)}] \le e$, which means that $[F(\pi(s)) : F] \le e$. Then, since $\pi(D_2(S, F))$ is small, by virtue of Northcott’s theorem (cf. Theorem 1.1), $\pi(D_2(S, F))$ is finite. By Proposition 2.3(b), there is a finite proper subset $S'$ of $S$ such that $\pi(D_2(S \setminus S', F))$ consists of torsion points. Hence, there is a positive integer $N$ such that $[N](\pi(D_2(S \setminus S', F))) = \{0\}$. Therefore, $D_2([N](S \setminus S'), F) \subseteq C(\overline{K})$.

Let us go back to the proof of Theorem 2.5. First, let us consider the case $n = 2$. By using Claim 2.4.1(b) we can find abelian subvarieties $C_1, \dots, C_e$ with
\[ \overline{D}_2(S, F) = \bigcup_{i=1}^{e} C_i \]
because $\overline{D}_2(S, F)$ is stable under the endomorphism $[N]$ for every positive integer $N$. Thus, in order to see $e = 1$, it is sufficient to find $C_i$, a positive integer $N_1$, an infinite subset $S_1$ of $S$, and a finite extension field $F_1$ of $F$ such that
\[ D_2([N_1](S_1), F_1) \subseteq C_i(\overline{K}). \]
Let $F_1$ be a finite extension field of $F$ such that the $C_i$’s are defined over $F_1$. For each $s \in S$, let $T_i(s)$ be the set of all elements $x \in O_{F_1}(s)^2$ with $\beta_2(x) \in C_i(\overline{K})$. We choose a map $\lambda : S \to \{1, \dots, e\}$ such that $\#(T_{\lambda(s)}(s))$ attains the maximal value in $\{\#(T_i(s)) \mid i = 1, \dots, e\}$. By the pigeonhole principle, there are $i \in \{1, \dots, e\}$ and an infinite subset $S'$ of $S$ with $\lambda(s) = i$ for all $s \in S'$. Then, for all $s \in S'$, $\beta_2(T_i(s)) \subseteq C_i(\overline{K})$ and $\#(T_i(s)) \ge \#(O_{F_1}(s)^2)/e$. Thus, by Lemma 2.6, there are an infinite subset $S_1$ of $S'$ and a positive integer $N_1$ with $D_2([N_1](S_1), F_1) \subseteq C_i(\overline{K})$.
From now on, we denote $C_i$ by $C$. Then $\overline{D}_2(S, F) = C$. Let us try to see that $\overline{D}_n(S, F) = C^{n-1}$ for all $n \ge 2$. Clearly, $\overline{D}_n(S, F) \subseteq C^{n-1}$. Thus it is sufficient to find a positive integer $N_2$, an infinite subset $S_2$ of $S$, and a finite extension field $F_2$ of $F$ such that
\[ \overline{D}_n([N_2](S_2), F_2) = C^{n-1}. \]
By Proposition 2.4(1) there are a positive integer $N_2$, an infinite subset $S_2$ of $S$, and a finite extension field $F_2$ of $F$ such that $([N_2](S_2), F_2)$ is $n$-minimized. Thus, as before, there are abelian subvarieties $G_1, \dots, G_l$ with $\overline{D}_n([N_2](S_2), F_2) = \bigcup_{j=1}^{l} G_j$. Moreover, replacing $F_2$ by a finite extension field of $F_2$, we may assume that $C$ and the $G_j$’s are defined over $F_2$. At this stage we would like to show that
\[ \overline{D}_n([N_2](S_2), F_2) = C^{n-1}. \]
In the same way as before, we can find $G_j$ (say, $G$) and an infinite subset $S'$ of $[N_2](S_2)$ such that for all $s \in S'$ there is a subset $T(s)$ of $O_{F_2}(s)^n$ with $\#(T(s)) \ge \#(O_{F_2}(s)^n)/l$ and $\beta_n(T(s)) \subseteq G(\overline{K})$. Let $C^{(q)} = 0 \times \cdots \times C \times \cdots \times 0$ be the $q$th factor of $C^{n-1}$, and let $G^{(q)} = G \cap C^{(q)}$ for $1 \le q \le n - 1$. Since $G \subseteq C^{n-1}$, it is sufficient to see the following claim to conclude the proof of our theorem.

Claim 2.6.1
$G^{(q)} = C^{(q)}$ for each $1 \le q \le n - 1$.

For each $t_1, \dots, t_q, t_{q+2}, \dots, t_n \in O_{F_2}(s)$ we set
\[ J(t_1, \dots, t_q, t_{q+2}, \dots, t_n) = \bigl\{ x \in O_{F_2}(s) \bigm| (t_1, \dots, t_q, x, t_{q+2}, \dots, t_n) \in T(s) \bigr\}. \]
We choose $s_1, \dots, s_q, s_{q+2}, \dots, s_n \in O_{F_2}(s)$ such that $\#(J(s_1, \dots, s_q, s_{q+2}, \dots, s_n))$ is maximal among $\{\#(J(t_1, \dots, t_q, t_{q+2}, \dots, t_n)) \mid t_1, \dots, t_q, t_{q+2}, \dots, t_n \in O_{F_2}(s)\}$. Then
\[ \#\bigl(J(s_1, \dots, s_q, s_{q+2}, \dots, s_n)\bigr)\, \#\bigl(O_{F_2}(s)\bigr)^{n-1} \ge \#(T(s)) \ge \frac{\#\bigl(O_{F_2}(s)^n\bigr)}{l}. \]
Thus, if we set $L(s) = J(s_1, \dots, s_q, s_{q+2}, \dots, s_n)$, then $\#(L(s)) \ge \#(O_{F_2}(s))/l$ and
\[ \beta_n(s_1, \dots, s_q, x, s_{q+2}, \dots, s_n) \in G(\overline{K}) \]
for all $x \in L(s)$. Therefore, for all $(x, x') \in L(s) \times L(s)$,
\[ \beta_n(0, \dots, 0, x - x', 0, \dots, 0) = \beta_n(s_1, \dots, s_q, x, s_{q+2}, \dots, s_n) - \beta_n(s_1, \dots, s_q, x', s_{q+2}, \dots, s_n) \in G(\overline{K}). \]
This means that $\beta_2(x, x') \in G^{(q)}(\overline{K})$ for all $(x, x') \in L(s) \times L(s)$ if we view $G^{(q)}$ as a subscheme of $A$. Here $\#(L(s) \times L(s)) \ge \#(O_{F_2}(s) \times O_{F_2}(s))/l^2$. By Lemma 2.6 there are an infinite subset $S''$ of $S'$ and a positive integer $N$ with $\overline{D}_2([N](S''), F_2) \subseteq G^{(q)}$, which implies that $G^{(q)} = C^{(q)}$ because $\overline{D}_2([N](S''), F_2) = C$ by Proposition 2.4(2).

As a corollary we have the following, which is not used in the later section.

Corollary 2.7
Let $F$ be a finite extension field of $K$ in $\overline{K}$. Then the following (1), (2), and (3) are equivalent:
(1) $(S, F)$ is $n$-minimized for all $n \ge 2$;
(2) $(S, F)$ is $n$-minimized for some $n \ge 2$;
(3) $(S, F)$ is 2-minimized.

Proof
It is sufficient to show that (2) ⇒ (3) and (3) ⇒ (1).
(2) ⇒ (3): By Proposition 2.4(1) there are an infinite subset $T$ of $S$, a positive integer $N_1$, and a finite extension field $F_1$ of $F$ in $\overline{K}$ such that $([N_1](T), F_1)$ is 2-minimized. Then, by Theorem 2.5, there is an abelian subvariety $C$ of $A_{\overline{K}}$ such that $\overline{D}_2([N_1](T), F_1) = C$ and $\overline{D}_n([N_1](T), F_1) = C^{n-1}$. Thus, by Proposition 2.4(2), $\overline{D}_n(S, F) = C^{n-1}$ because $(S, F)$ is $n$-minimized. For all $x, x' \in O_F(s)$ with $s \in S$,
\[ \beta_n(s, x, s, \dots, s) - \beta_n(s, x', s, \dots, s) = (x - x', 0, \dots, 0) \in C^{n-1}(\overline{K}). \]
Thus, $\beta_2(O_F(s)^2) \subseteq C(\overline{K})$ for all $s \in S$. Therefore, $\overline{D}_2(S, F) \subseteq C$. Let $S'$ be an infinite subset of $S$, and let $F'$ be a finite extension field of $K$. In order to see that $\overline{D}_2(S', F') = C$, we may assume that $S' \subseteq T$ and $F_1 \subseteq F'$. Then
\[ [N_1]\bigl(\overline{D}_2(S', F')\bigr) = \overline{D}_2([N_1](S'), F') = \overline{D}_2([N_1](T), F_1) = C. \]
Thus, $\overline{D}_2(S', F') = C$ because $\overline{D}_2(S', F') \subseteq C$. Hence, $(S, F)$ satisfies property (i) in the definition of “2-minimized.” Moreover, $[N](\overline{D}_2(S, F)) = [N](C) = C$ for all positive integers $N$. Therefore, $(S, F)$ is 2-minimized.
(3) ⇒ (1): By Theorem 2.5 there is an abelian subvariety $C$ of $A_{\overline{K}}$ such that $\overline{D}_n(S, F) = C^{n-1}$ for all $n \ge 2$. Fix $n \ge 2$.
By Proposition 2.4(1) there are an infinite subset $T$ of $S$, a positive integer $N_1$, and a finite extension field $F_1$ of $F$ in $\overline{K}$ such that $([N_1](T), F_1)$ is $n$-minimized. Since $([N_1](T), F_1)$ is 2-minimized and $\overline{D}_2([N_1](T), F_1) = C$, we have $\overline{D}_n([N_1](T), F_1) = C^{n-1}$ by Theorem 2.5. Let $S'$ be an infinite subset of $S$, and let $F'$ be a finite extension field of $K$. Let us see that $\overline{D}_n(S', F') = C^{n-1}$. For this purpose we may assume that $S' \subseteq T$ and $F_1 \subseteq F'$. Then
\[ [N_1]\bigl(\overline{D}_n(S', F')\bigr) = \overline{D}_n([N_1](S'), F') = \overline{D}_n([N_1](T), F_1) = C^{n-1}. \]
Thus, $\overline{D}_n(S', F') = C^{n-1}$ because $\overline{D}_n(S', F') \subseteq \overline{D}_n(S, F) = C^{n-1}$. Moreover, $[N](\overline{D}_n(S, F)) = [N](C^{n-1}) = C^{n-1}$ for all positive integers $N$. Therefore, $(S, F)$ is $n$-minimized.
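The counting step in the proof of Claim 2.6.1 is an averaging argument: if $T \subseteq O^n$ and $J$ is the largest fiber of $T$ obtained by fixing all coordinates except one, then $\#(J)\,\#(O)^{n-1} \ge \#(T)$. The following sketch checks this on random subsets of a small cube; the finite set `range(orbit_size)` is an assumed stand-in for the orbit $O_{F_2}(s)$, and the function names are hypothetical.

```python
from itertools import product
import random


def largest_fiber(subset, n, q):
    """Size of the largest fiber of `subset` when every coordinate except
    the one at 0-based index q (the (q+1)st entry) is held fixed."""
    counts = {}
    for point in subset:
        key = point[:q] + point[q + 1:]
        counts[key] = counts.get(key, 0) + 1
    return max(counts.values())


def check_fiber_bound(orbit_size, n, q, density, seed=0):
    """Return (bound holds, largest fiber size, size of the random subset)."""
    rng = random.Random(seed)
    cube = list(product(range(orbit_size), repeat=n))
    subset = [p for p in cube if rng.random() < density]
    if not subset:
        subset = [cube[0]]  # keep the subset nonempty
    j = largest_fiber(subset, n, q)
    return j * orbit_size ** (n - 1) >= len(subset), j, len(subset)


for seed in range(3):
    print(check_fiber_bound(orbit_size=4, n=3, q=1, density=0.3, seed=seed))
```

Since $T$ is the disjoint union of at most $\#(O)^{n-1}$ fibers, the largest one must contain at least the average number of points, which is how the bound $\#(L(s)) \ge \#(O_{F_2}(s))/l$ follows from $\#(T(s)) \ge \#(O_{F_2}(s)^n)/l$.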
§3. Proof of Theorem 0.1
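3.1. Preliminary of linear algebra
Let $V$ be a vector space over $\mathbb{R}$, and let $\langle\ ,\ \rangle$ be an inner product on $V$; that is, $\langle\ ,\ \rangle$ is a symmetric positive definite bilinear form on $V$. For a finite set of linearly independent vectors $\Lambda = \{v_1, \dots, v_n\}$, we define $\Lambda : V \times V \longrightarrow \mathbb{R}$ to be
\[ \Lambda(x, y) = \det \begin{pmatrix} \langle v_1, v_1\rangle & \cdots & \langle v_1, v_n\rangle & \langle v_1, y\rangle \\ \vdots & \ddots & \vdots & \vdots \\ \langle v_n, v_1\rangle & \cdots & \langle v_n, v_n\rangle & \langle v_n, y\rangle \\ \langle x, v_1\rangle & \cdots & \langle x, v_n\rangle & \langle x, y\rangle \end{pmatrix}. \]
Then we have the following proposition.

Proposition 3.1
(1) $\Lambda$ is a bilinear map.
(2) $\Lambda$ is symmetric and positive semidefinite.
(3) For all $v \in \operatorname{Span}($
d + 1, d ≥ 2. Moreover, we also establish the bound
\[ m(\lambda) \ge c\lambda^{(d - 1 - \delta)/(4l)}, \qquad \lambda \ge \lambda_0, \tag{1.1} \]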
with a positive constant $c = c(\delta)$, where $\delta = 0$ if $d \ne 1 \pmod 4$ and $\delta$ is an arbitrary positive number if $d = 1 \pmod 4$. Our proof of (1.1) is an extension of the idea from [4] to general dimensions $d$ and orders $l$. It is based on a very simple perturbation argument and uses only information on the band structure of the unperturbed operator $H_0$, which, in turn, is closely related to estimates for the number of lattice points in a ball of large radius. The improvement of Skriganov’s result from [16], [18] has become possible due to more precise information on this number-theoretic problem, which was not available when [18] was published (see Section 3 for details). One should point out that in the short note [20] by N. Yakovlev, under the same restriction $4l > d + 1$, the number of gaps was announced to be finite even for more
THE BETHE-SOMMERFELD CONJECTURE
general operators of the form $P_0 + V$ with an elliptic pseudodifferential operator $P_0$ with constant coefficients having a homogeneous convex symbol of order $2l$, and an arbitrary bounded periodic perturbation $V$. However, we have been able neither to find in the literature nor to reproduce in full Yakovlev’s proof of this claim. The second result (Theorem 2.2) shows that the condition $4l > d + 1$ can be relaxed if $V$ is the multiplication by a real-valued periodic function $V(\mathbf{x})$. Namely, under this assumption on $V$ the lower bound (1.1) remains true for all $l$ such that $6l > d + 2$, $d \ge 2$. Again, $l$ is not supposed to be an integer. Besides, for $d \ne 1 \pmod 4$ we establish estimate (1.1) for $6l = d + 2$, assuming that the potential $V$ is a trigonometric polynomial and that it is sufficiently small. The strategy of the proof follows [17], where the Bethe-Sommerfeld conjecture was justified for $l = 1$, $d = 3$ for the first time. It is a combination of arguments from number theory and perturbation theory. As in [17], we employ the connection of the multiplicity of overlapping $m(\lambda)$ with the counting function $N(\lambda; H(\mathbf{k}))$ of the operator $H(\mathbf{k}) = H_0(\mathbf{k}) + V$; $H_0(\mathbf{k}) = (-i\nabla + \mathbf{k})^2$, acting on the torus $\mathbb{R}^d/\Gamma$, with the quasi-momentum $\mathbf{k} \in \mathbb{R}^d$:
\[ m(\lambda) \ge \max_{\mathbf{k}} N\bigl(\lambda; H(\mathbf{k})\bigr) - \min_{\mathbf{k}} N\bigl(\lambda; H(\mathbf{k})\bigr). \]
Skriganov’s idea in [17] for $d = 3$ and $l = 1$ was to show the following.
(1) The multiplicity $m(\lambda)$ for the unperturbed operator $H_0(\mathbf{k})$ satisfies (1.1). This is done by proving appropriate bounds on the number of lattice points in the ball of radius $\rho = \lambda^{1/(2l)}$.
(2) Effectively, for each $\mathbf{k}$, the potential $V$ induces only a finite-dimensional perturbation whose dimension is less than the right-hand side of (1.1), that is,
\[ N\bigl(\lambda; H(\mathbf{k})\bigr) - N\bigl(\lambda; H_0(\mathbf{k})\bigr) = o\bigl(\lambda^{(d - 1 - \delta)/(4l)}\bigr). \tag{1.2} \]
Combining these two ingredients, Skriganov obtained (1.1) for the perturbed operator. We follow the same general plan but refine the second ingredient by observing that instead of an estimate pointwise in $\mathbf{k}$ it suffices to prove estimate (1.2) averaged in $\mathbf{k}$ (see Theorem 2.9). It is exactly this observation that allows us to simplify Skriganov’s argument and extend his result to arbitrary dimensions $d$ and arbitrary orders $l$ with $6l > d + 2$. Note also that our proof of the averaged estimate (1.2) does not require any facts from number theory, in contrast to [17]. For $l = 1$ Theorem 2.2 proves the Bethe-Sommerfeld conjecture in dimensions $d = 2, 3$ and, for small trigonometric polynomials $V$, also in the dimension $d = 4$. Thus, for the Schrödinger operator our theorem does not provide any new information in comparison with the known results from [13], [4], [17], and [6], and in the case $d = 4$ it is even less general than [6]. However, we consider this an important advantage of our approach over that of [6]; in order to prove Theorem 2.2, we do not
PARNOVSKI AND SOBOLEV
need any advanced techniques, such as microlocal and quasi-classical analysis, but use only elementary perturbation theory as our main tool. In fact, a more elaborate variant of our method allows us to handle the case $d = 4$, $l = 1$ in full generality. We plan to present this and other findings in a subsequent publication.
Notation. By bold lowercase letters we denote vectors in $\mathbb{R}^d$ and $\mathbb{Z}^d$, for example, $\mathbf{x} \in \mathbb{R}^d$, $\mathbf{m} \in \mathbb{Z}^d$. Bold uppercase letters $\mathbf{G}$, $\mathbf{F}$ are used for $d \times d$ constant positive definite matrices. The notations $\mathbf{a}\mathbf{b}$ and $\mathbf{a}\mathbf{G}\mathbf{b}$ stand for the scalar product in $\mathbb{R}^d$ and the quadratic form of the matrix $\mathbf{G}$, respectively. For any function $f \in L^1(\mathcal{O})$, $\mathcal{O} = [0, 2\pi)^d$, the Fourier transform is defined as follows:
\[ \hat{f}(\mathbf{m}) = \frac{1}{(2\pi)^{d/2}} \int_{\mathcal{O}} e^{-i\mathbf{m}\mathbf{x}} f(\mathbf{x})\, d\mathbf{x}. \]
By $C$ and $c$ (with or without indices) we denote various positive constants whose precise value is unimportant.
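The first ingredient described in the introduction, the spread of the unperturbed counting function over the quasi-momentum, can be explored numerically. The sketch below is a rough illustration under stated assumptions: it takes $\mathbf{G}$ to be the identity matrix, so that the eigenvalues of $H_0(\mathbf{k})$ on the torus are $|\mathbf{m} + \mathbf{k}|^{2l}$ for $\mathbf{m} \in \mathbb{Z}^d$, and it samples $\mathbf{k}$ only on a finite grid. The function names are hypothetical.

```python
import itertools
import math


def counting_function(lam, k, l=1.0):
    """N(lam; H_0(k)) for G = identity: the number of m in Z^d
    with |m + k|^(2l) <= lam, where d = len(k)."""
    d = len(k)
    rho = lam ** (1.0 / (2.0 * l))  # radius of the ball, rho = lam^(1/(2l))
    reach = int(math.ceil(rho)) + 1  # enough to cover every k in [0, 1)^d
    count = 0
    for m in itertools.product(range(-reach, reach + 1), repeat=d):
        if sum((mi + ki) ** 2 for mi, ki in zip(m, k)) <= rho * rho:
            count += 1
    return count


def overlap_lower_bound(lam, d, l=1.0, grid=8):
    """max_k N - min_k N over a grid of quasi-momenta in [0, 1)^d.
    Sampling can only underestimate the true spread."""
    values = [
        counting_function(lam, [i / grid for i in idx], l)
        for idx in itertools.product(range(grid), repeat=d)
    ]
    return max(values) - min(values)


print(counting_function(1.0, [0.0, 0.0]))
print(overlap_lower_bound(25.0, 2))
```

Because the grid maximum is at most the true maximum and the grid minimum is at least the true minimum, the returned value is a lower bound for the spread of the unperturbed counting function.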
2. Main result and preliminaries
2.1. Notation and main result
We are concerned with the spectrum of the operator
\[ H = H_0 + B, \qquad H_0 = H_0^{(l)} = (\mathbf{D}\mathbf{G}\mathbf{D})^l, \qquad \mathbf{D} = -i\nabla, \]
where $\mathbf{G}$ is a constant positive-definite $d \times d$-matrix, and $B$ is a bounded self-adjoint operator in $L^2(\mathbb{R}^d)$, periodic with respect to the lattice $\Gamma = (2\pi\mathbb{Z})^d$. By periodicity we mean that $B$ commutes with the family of unitary shifts by the vectors of the lattice $\Gamma$:
\[ B T_{\mathbf{m}} = T_{\mathbf{m}} B, \qquad (T_{\mathbf{m}} u)(\mathbf{x}) = u(\mathbf{x} + 2\pi\mathbf{m}), \qquad \mathbf{m} \in \Gamma. \]
The assumption that the lattice is cubic is not restrictive, as any lattice can be reduced to a cubic one by a suitable nondegenerate linear transformation that would affect only the matrix $G$. As $B$ is bounded, the operator $H$ is self-adjoint on the domain $D(H_0) = H^{2l}(\mathbb{R}^d)$. We use the following notation for the fundamental domains of the lattice $\Gamma$ and its dual lattice $\Gamma^\dagger = \mathbb{Z}^d$:
$$\mathcal{O} = [0, 2\pi)^d, \qquad \mathcal{O}^\dagger = [0, 1)^d.$$
Let us also introduce the torus $\mathbb{T}^d = \mathbb{R}^d/\Gamma$. To describe the spectrum of $H$, we use the Floquet decomposition of the operator $H$ (see [14]). We identify the space $L^2(\mathbb{R}^d)$
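with the direct integral
$$\mathfrak{G} = \int_{\mathcal{O}^\dagger} \mathfrak{H}\,dk, \qquad \mathfrak{H} = L^2(\mathcal{O}).$$
The identification is implemented by the Gelfand transform
$$(Uu)(x, k) = e^{-ikx} \sum_{m \in \mathbb{Z}^d} e^{-i2\pi km}\, u(x + 2\pi m), \qquad k \in \mathbb{R}^d, \tag{2.1}$$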
which is initially defined on functions from the Schwartz class and extends by continuity to a unitary mapping from $L^2(\mathbb{R}^d)$ onto $\mathfrak{G}$. The unitary operator $U$ reduces $T_m$ to the diagonal form:
$$U T_m U^{-1} = e^{i2\pi km}, \qquad \forall m \in \mathbb{Z}^d.$$
As $H_0$ and $B$ commute with all $T_m$'s, they are partially diagonalized by $U$ (see [14]). It is readily seen that
$$(U H_0 U^{-1} u)(\cdot, k) = H_0(k)u(\cdot, k), \qquad H_0(k) = \big((D + k)G(D + k)\big)^l, \qquad k \in \mathbb{R}^d,$$
with the domain $D(H_0(k)) = H^{2l}(\mathbb{T}^d)$. As far as $B$ is concerned, we have
$$(U B U^{-1} u)(\cdot, k) = B(k)u(\cdot, k), \qquad \text{a.a. } k \in \mathbb{R}^d,$$
with a measurable family of bounded self-adjoint operators $B(k)$. It follows from the definition (2.1) that
$$H_0(k + n) = E_n^* H_0(k) E_n, \qquad B(k + n) = E_n^* B(k) E_n, \qquad \forall n \in \mathbb{Z}^d, \tag{2.2}$$
where $E_n$ is the unitary operator in $\mathfrak{H}$ of multiplication by $\exp(ixn)$. Note that $\|B\| = \operatorname{ess\,sup}_k \|B(k)\|$. The family $H(k) = H_0(k) + B(k)$ realizes the decomposition of $H$ in the direct integral:
$$U H U^{-1} = \int_{\mathcal{O}^\dagger} H(k)\,dk.$$
From now on we always assume that the operator family $B(\cdot)$ is norm-continuous in $k \in \mathbb{R}^d$. Note that if $B$ is the multiplication by a real-valued function $V(x)$, then $B(k) \equiv V$ is trivially continuous in $k$. The spectra of all $H(k)$ consist of discrete eigenvalues $\lambda_j(k)$, $j = 1, 2, \dots$, that we arrange in nondecreasing order counting multiplicity. As $B(\cdot)$ depends on $k \in \mathbb{R}^d$ continuously, so do $\lambda_j(\cdot)$. By (2.2), $\lambda_j(\cdot)$ are periodic in $k$ with respect to the lattice $\Gamma^\dagger$. The images
$$\ell_j = \bigcup_{k \in \mathcal{O}^\dagger} \lambda_j(k)$$
of the functions λj are called spectral bands. The spectrum of the initial operator H
has the following representation: $\sigma(H) = \cup_j \ell_j$. The bands with distinct numbers may overlap. To characterize this overlapping we introduce the function $m(\lambda) = m(\lambda, B)$, called the multiplicity of overlapping, which is equal to the number of bands containing a given point $\lambda \in \mathbb{R}$,
$$m(\lambda) = \#\{j : \lambda \in \ell_j\},$$
and the overlapping function $\zeta(\lambda) = \zeta(\lambda, B)$, $\lambda \in \mathbb{R}$, defined as the maximal number $t$ such that the symmetric interval $[\lambda - t, \lambda + t]$ is entirely contained in one of the bands $\ell_j$:
$$\zeta(\lambda) = \max_j \max\{t : [\lambda - t, \lambda + t] \subset \ell_j\}.$$
These two quantities were first introduced by Skriganov (see, e.g., [18]). It is easy to see that $\zeta$ is a continuous function of $\lambda \in \mathbb{R}$. The main results of the paper are stated in the following two theorems. From now on we always use the following notation:
$$\delta = \delta_d = \begin{cases} 0, & d \ne 1 \pmod 4; \\ \text{arbitrary positive number}, & d = 1 \pmod 4. \end{cases} \tag{2.3}$$
Theorem 2.1
Let $d \ge 2$, let $l > 0$, and let $B$ be a periodic bounded self-adjoint operator such that $B(k)$ is norm-continuous in $k \in \mathbb{R}^d$. (1) If $4l > d + 1$, then there is a number $\lambda_l = \lambda_l(B, \delta) \in \mathbb{R}$ such that
$$m(\lambda) \ge c_0 \lambda^{(d-1)/4l - \delta}, \qquad \zeta(\lambda) \ge c_0 \lambda^{1 - (d+1)/4l - \delta} \tag{2.4}$$
for all $\lambda \ge \lambda_l$ with a constant $c_0$ independent of $B$. (2) If $d \ne 1 \pmod 4$ and $4l = d + 1$, then estimates (2.4) hold for sufficiently small $\|B\|$.
We emphasize that in Theorem 2.1 the perturbation $B$ is an arbitrary self-adjoint bounded periodic operator. As Theorem 2.2 shows, if one assumes that $B$ is a local operator (multiplication by a function), then the condition $4l > d + 1$ can be relaxed.
Theorem 2.2
Let $d \ge 2$, let $l > 0$, and let $B$ be the multiplication operator by a bounded periodic real-valued function $V$ such that
$$\int_{\mathcal{O}} V(x)\,dx = 0. \tag{2.5}$$
(1) Suppose that one of the following two conditions is fulfilled:
(i) $4l - 1 \le d < 6l - 2$ and $V \in H^\alpha(\mathbb{T}^d)$ with
$$2\alpha > d + \frac{(d - 1)(d + 1 - 4l)}{6l - (d + 2)}; \tag{2.6}$$
(ii) $d \ne 1 \pmod 4$, $4l = d + 1$, and $V$ is continuous.
Then there is a number $\lambda_l = \lambda_l(V, \delta) \in \mathbb{R}$ such that the estimates (2.4) hold for all $\lambda \ge \lambda_l$ with a constant $c_0$ independent of $V$.
(2) Suppose that $d \ne 1 \pmod 4$, $6l = d + 2$, and $V$ is a trigonometric polynomial. Then there are numbers $\lambda_l = \lambda_l(V) \in \mathbb{R}$ and $g_0 = g_0(V) > 0$ such that the functions $m(\lambda, gV)$ and $\zeta(\lambda, gV)$ satisfy estimates (2.4) for all $\lambda \ge \lambda_l$ and $|g| \le g_0$, with a constant $c_0$ independent of $V$.
Recall that a function $V$ is called a trigonometric polynomial if the set
$$\Theta = \{\theta \in \mathbb{Z}^d : \hat V(\theta) \ne 0\} \tag{2.7}$$
is finite. Note that $0 \notin \Theta$ in view of (2.5). For a finite set $\Theta$ the quantity
$$M = M(\Theta) = \sum_{\theta \in \Theta} |\theta|^{-1} \tag{2.8}$$
is finite.
Remark 2.3
Either of the estimates (2.4) implies that the spectrum of $H$ has no gaps on the semiaxis $[\lambda_l, \infty)$. It is also legitimate to ask whether the spectrum has any gaps at all if the perturbation $B$ is sufficiently small. As was found by Skriganov in [18], the answer to this question is closely connected with the properties of the overlapping function for the unperturbed operator, which we denote by $\zeta_0(\lambda)$. In Section 4 it is shown that $\zeta_0(\lambda)$ satisfies (2.4) for all $l > 0$ and all $d \ge 2$, provided that $\lambda$ is sufficiently large. In particular, $\zeta_0(\lambda)$ is strictly positive for large $\lambda$. According to Skriganov's results (see [18, Section 7]), this ensures that $\zeta_0$ is strictly positive for all $\lambda > 0$. Now a straightforward application of perturbation theory leads to the conclusion that for any given $\lambda_0$ there exists a number $v_0 = v_0(\lambda_0)$ such that the perturbed operator $H = H_0 + B$ does not have any gaps on the interval $(-\infty, \lambda_0]$ if $\|B\| \le v_0$. In combination with Theorems 2.1 and 2.2, this implies that the spectrum of $H$ has no gaps at all for sufficiently small perturbations $B$ satisfying the conditions of either of these theorems. Note also that the lower bound (2.4) for $\zeta_0$ improves the estimate $\zeta_0(\lambda) \ge c\lambda^{1 - d/2l}$, established in [18, Section 14].
Remark 2.4
Although both Theorems 2.1 and 2.2 proclaim estimates (2.4), there exists an important difference between the bounds for $\zeta$ under their conditions. If $4l > d + 1$, then $\zeta(\lambda) \to \infty$, $\lambda \to \infty$, while for $4l < d + 1$ the function $\zeta(\lambda)$ is allowed to tend to zero as $\lambda \to \infty$.
The proofs of Theorems 2.1 and 2.2 exploit the connection between the functions $m(\lambda)$, $\zeta(\lambda)$ and the counting functions
$$N(\lambda; H(k)) = \sum_{\lambda_j(k) \le \lambda} 1, \qquad n(\lambda; H(k)) = \sum_{\lambda_j(k) < \lambda} 1.$$
Denote [...]
Theorem 2.5
Let $l > 0$, let $d \ge 2$, and suppose that the perturbation $B$ is as in Theorem 2.1. Then
$$T(\rho; B) \le Cv\rho^{d - 2l}, \qquad v = \|B\|,$$
for all $\rho \ge 1$.
This elementary result is given here for methodological purposes only and is not used in the proofs of Theorems 2.1 and 2.2. It is convenient to postpone the proof of Theorem 2.5 until the end of Section 4. To obtain estimates with $\beta < d - 2l$, we assume, as in Theorem 2.2, that $B = V$ is a multiplication by a real-valued periodic function and consider separately three cases. The following notation is convenient:
$$\gamma = d - 2l - \beta, \qquad \nu = 6l + 2\beta - 2d - 1. \tag{2.11}$$
Condition 2.6
(1) The parameter $\beta \in (d - 3l + 1/2, d - 2l]$, $l > 1/2$, and the function $V \in H^\alpha(\mathbb{T}^d)$ with
$$\alpha > \frac{d}{2} + \frac{\gamma(d - 1)}{\nu}. \tag{2.12}$$
(2) If $d = 2$, then $\nu \le 1$.
Note that the conditions $\beta \ge d - 3l + 1/2$ and $\nu \ge 0$ are equivalent. The restriction $l > 1/2$ guarantees that the interval $(d - 3l + 1/2, d - 2l]$ is not empty. The next two cases deal with the endpoints of this interval.
Condition 2.7
The parameter $\beta = d - 2l$, $l > 1/2$, and the function $V$ is continuous.
Condition 2.8
The parameter $\beta = d - 3l + 1/2$, $l > 1/2$, and the function $V$ is a trigonometric polynomial.
Theorem 2.9
Let $d \ge 2$, let $\beta \le d - 2l$, and let $V$ be a real-valued periodic bounded function. (1) Suppose that either Condition 2.6 or 2.7 is satisfied. Then one has
$$\lim \rho^{-\beta}\, T(\rho; V) = 0, \qquad \rho \to \infty. \tag{2.13}$$
(2) Suppose that Condition 2.8 is fulfilled. Then, for any $\alpha > d/2$,
$$T(\rho; V) \le C\|V\|_{H^\alpha} M(\Theta)^{1/2} \rho^\beta \tag{2.14}$$
(with $M(\Theta)$ defined in (2.8)) for all sufficiently large $\rho$. The constant $C$ does not depend on $\rho$, $V$, or $\Theta$ but may depend on $\alpha$.
Theorem 2.2 is deduced from Theorem 2.9 in Section 4. The proof of Theorem 2.9 is completed in Sections 5 and 6.
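3. Integer points in the ellipsoid
3.1. Estimates
In this section we collect some facts from number theory which play a crucial role in the sequel. Let $\mathcal{C} \subset \mathbb{R}^d$ be a measurable set, and let $\mathcal{C}^{(k)}$, $k \in \mathcal{O}^\dagger$, be the family of sets obtained by shifting $\mathcal{C}$ by the vector $-k$; that is, let
$$\mathcal{C}^{(k)} = \{\xi \in \mathbb{R}^d : \xi + k \in \mathcal{C}\}. \tag{3.1}$$
The characteristic function of the set $\mathcal{C}$ is denoted by $\chi(\cdot\,; \mathcal{C})$. Denote by $\#(k; \mathcal{C})$ the number of integer points in $\mathcal{C}^{(k)}$; that is, let
$$\#(k; \mathcal{C}) = \sum_{m \in \mathbb{Z}^d} \chi(m + k; \mathcal{C}).$$
The following formula, with $\langle\,\cdot\,\rangle$ denoting the average over $k \in \mathcal{O}^\dagger$, is very useful in the sequel:
$$\langle \#(\mathcal{C}) \rangle = \operatorname{vol}(\mathcal{C}). \tag{3.2}$$
It follows from the relation
$$\int_{\mathcal{O}^\dagger} \sum_{m} \chi(m + k; \mathcal{C})\,dk = \int_{\mathbb{R}^d} \chi(\xi; \mathcal{C})\,d\xi.$$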
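The averaging identity (3.2) can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not part of the argument: it takes $d = 2$, the identity matrix in place of $G$ (so the set is a closed disk of radius $r$), and replaces the integral over the shift $k \in [0, 1)^2$ by a midpoint rule on an $n \times n$ grid. The function names are hypothetical.

```python
import math


def count_shifted(k, r):
    """Number of integer points m in Z^2 with |m + k| <= r (closed disk)."""
    kx, ky = k
    bound = int(math.ceil(r)) + 1  # |m_i| <= r + 1 suffices since 0 <= k_i < 1
    total = 0
    for mx in range(-bound, bound + 1):
        for my in range(-bound, bound + 1):
            if (mx + kx) ** 2 + (my + ky) ** 2 <= r * r:
                total += 1
    return total


def averaged_count(r, n=60):
    """Midpoint-rule average of count_shifted over k in [0,1)^2."""
    s = 0
    for i in range(n):
        for j in range(n):
            s += count_shifted(((i + 0.5) / n, (j + 0.5) / n), r)
    return s / (n * n)


r = 3.0
print(averaged_count(r), math.pi * r * r)  # the two numbers should be close
```

For a single shift the count fluctuates around the area, which is exactly the deviation studied below; only the average over all shifts reproduces the volume.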
We need an estimate for the number of integer points inside a (closed) ellipsoid determined by the matrix $G$. Precisely, for any $\rho > 0$, let $\mathcal{E}(\rho) = \mathcal{E}(\rho, F) \subset \mathbb{R}^d$ be the ellipsoid
$$\{\xi \in \mathbb{R}^d : |F\xi| \le \rho\}, \qquad F = G^{1/2},$$
and let $\mathcal{E}_0(\rho) = \mathcal{E}(\rho, I)$. There is a very simple connection between integer points in the ellipsoid and the eigenvalues of the unperturbed problem. Indeed, the eigenvalues of the operator $H_0(k)$ equal $|F(m + k)|^{2l}$, which ensures that
$$N(\rho^{2l}; H_0(k)) = \#(k; \mathcal{E}(\rho)), \qquad \rho \ge 0. \tag{3.3}$$
Now we can use known properties of the right-hand side to get information on the left-hand side. Precisely, we are interested in the behavior of the counting function $N(\rho^{2l}; H_0(k))$ as $\rho \to \infty$. Naturally, the leading order is given by the volume of the ellipsoid, which coincides with the average value of the counting function:
$$\langle N(\rho^{2l}; H_0) \rangle = \langle \#(\mathcal{E}(\rho)) \rangle = w_d \rho^d, \tag{3.4}$$
where
$$w_d = \frac{K_d}{\sqrt{\det G}}, \qquad K_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}.$$
($K_d$ is the volume of the unit ball in $\mathbb{R}^d$.) We need bounds on the averaged deviation of $N(\rho^{2l}; H_0(k))$ from its leading term. To state the result, introduce the notation
$$\sigma_p(\rho) = \big\langle |\#(\mathcal{E}(\rho)) - w_d \rho^d|^p \big\rangle, \qquad p > 0.$$
Theorem 3.1
(1) Lower bound. Let the number $\delta$ be as defined in (2.3). Then for all sufficiently big $\rho$ the following estimate holds:
$$\sigma_1(\rho) \ge C\rho^{(d-1)/2 - \delta}, \tag{3.5}$$
with a constant $C = C(d, G, \delta)$.
(2) Upper bounds. For all sufficiently big $\rho$ the following estimate holds:
$$\sigma_2(\rho) \le C\rho^{d-1}, \tag{3.6}$$
with a constant $C = C(d, G)$. Moreover, if $d = 1 \pmod 4$, then there exists a sequence $\rho_j \to \infty$, $j \to \infty$, such that
$$\sigma_2(\rho_j) \le C\rho_j^{d-1} (\ln \rho_j)^{(-1+\varepsilon)/d}, \tag{3.7}$$
where $\varepsilon > 0$ is arbitrary and $C = C(d, G, \varepsilon)$.
Note that for the proof of Theorem 2.9 we need only the lower bound (3.5). The upper bounds (3.6) and (3.7) are given here to demonstrate the accuracy of (3.5). Indeed, as $\sigma_1 \le \sqrt{\sigma_2}$, for $d \ne 1 \pmod 4$ we always have
$$c \le \rho^{-(d-1)/2} \sigma_1(\rho) \le C.$$
For the case $d = 1 \pmod 4$, estimate (3.7) shows that the lower bound (3.5) is precise in the sense that one cannot take $\delta = 0$. If $d \ne 1 \pmod 4$, then Theorem 3.1(1) can be easily derived using an argument due to B. Dahlberg and E. Trubowitz (see [4] and also [6]). In the case $d = 1 \pmod 4$, the lower bound (3.5) calls for more elaborate consideration and was previously unknown. For the sake of completeness, we provide the proof of (3.5) for both these cases. Also for completeness, we give a proof of the upper bound (3.6), which was obtained for the first time in [10]. Estimate (3.7) is new, and our proof is original. The proof of Theorem 3.1 is postponed until the end of this section.
Remark 3.2
Using a very simple argument based on the inequality $\|f\|_{L^1} \le \|f\|_{L^\infty}$ (see [18]), one can obtain from Theorem 3.1 useful $L^\infty$ estimates for the function $\#(k; \mathcal{E}(\rho))$. Indeed, suppose that $f = f(k)$ is a bounded function on $\mathcal{O}^\dagger$ with a zero average. Then
$$\int_{\mathcal{O}^\dagger} |f|\,dk = 2\int_{\mathcal{O}^\dagger} f_+\,dk = 2\int_{\mathcal{O}^\dagger} f_-\,dk, \qquad 2f_\pm = |f| \pm f,$$
so that
$$2\sup_k f \ge \int_{\mathcal{O}^\dagger} |f|\,dk, \qquad 2\inf_k f \le -\int_{\mathcal{O}^\dagger} |f|\,dk.$$
Remembering that the average value of $\#(k; \mathcal{E}(\rho)) - w_d\rho^d$ is zero, it is now straightforward to deduce from (3.5) that
$$\max_k \#(k; \mathcal{E}(\rho)) \ge w_d\rho^d + C\rho^{(d-1)/2 - \delta}, \qquad \min_k \#(k; \mathcal{E}(\rho)) \le w_d\rho^d - C\rho^{(d-1)/2 - \delta} \tag{3.8}$$
for sufficiently big $\rho$. These estimates are consistent with the result due to E. Hlawka [7]: $\#(0; \mathcal{E}(\rho)) - w_d\rho^d = \Omega(\rho^{(d-1)/2})$. We should also mention two papers [19], [20] where bounds (3.8) were announced to hold with $\delta = 0$ for all dimensions $d$. It is natural to ask whether estimates (3.8) are precise. A partial answer can be found in the book [2] by J. Beck and W. Chen, who studied discrepancies of distributions of discrete sets of points in compact convex bodies. They found lower and upper bounds on the discrepancies that are consistent with (3.8).
3.2. Technical lemmas
Before we proceed to the proof of Theorem 3.1, we need to establish two preparatory lemmas. For $t \in \mathbb{R}$, introduce the notation $\langle\!\langle t \rangle\!\rangle$ for the distance from the number $\pi^{-1}t$ to the nearest integer.
Lemma 3.3
Let $\Lambda \subset \mathbb{R}^d$ be a lattice. Then for any $\varepsilon > 0$ there exist numbers $\rho_0 > 0$ and $\alpha \in (0, 1/2)$ such that for any $\rho \ge \rho_0$ one can find an element $\boldsymbol{\beta} \in \Lambda$ with the properties $|\boldsymbol{\beta}| \le \rho^\varepsilon$ and $\langle\!\langle |\boldsymbol{\beta}|\rho \rangle\!\rangle \ge \alpha$.
Proof
Let $e_1, e_2 \in \Lambda$ be an arbitrary pair of basis vectors. Without loss of generality we can assume that $|e_1| = 1$. Introduce two integer parameters $n = n(\rho, \varepsilon)$ and $k_0 = k_0(\varepsilon)$ whose precise values are specified later. Consider the sequence of points $\boldsymbol{\beta}_k = ne_1 + ke_2$, $k = 0, 1, \dots, k_0$. Then the length of each $\boldsymbol{\beta}_k$ is given by
$$B(k) = |\boldsymbol{\beta}_k| = \sqrt{(n + pk)^2 + qk^2}, \qquad p = e_1 e_2, \quad q = |e_2|^2 - |e_1 e_2|^2.$$
We show that for any $\varepsilon > 0$ there are real numbers $\rho_0 > 0$, $\alpha \in (0, 1/2)$ and a positive integer $k_0$ such that for any $\rho \ge \rho_0$ one can find an $n \le \rho^\varepsilon$ and an integer $k \in [0, k_0]$ with the property
$$\langle\!\langle B(k)\rho \rangle\!\rangle \ge \alpha. \tag{3.9}$$
Suppose the converse, that is, that for some $\varepsilon > 0$ and any $\rho_0$, $\alpha \in (0, 1/2)$, $k_0$ there exists a $\rho \ge \rho_0$ such that for all $n \le \rho^\varepsilon$ and $k \in [0, k_0]$ the inequality $\langle\!\langle B(k)\rho \rangle\!\rangle < \alpha$ holds. Denote $B^{(1)}(k) = B(k + 1) - B(k)$, $k \in [0, k_0 - 1]$, and denote
$$B^{(m)}(k) = B^{(m-1)}(k + 1) - B^{(m-1)}(k), \qquad k \in [0, k_0 - m], \quad m = 1, 2, \dots, k_0.$$
Since $\langle\!\langle B(k)\rho \rangle\!\rangle < \alpha$, we then have
$$\langle\!\langle B^{(m)}(k)\rho \rangle\!\rangle < 2^m\alpha, \qquad \forall k \in [0, k_0 - m], \quad m = 1, 2, \dots, k_0. \tag{3.10}$$
To find a contradiction it is convenient to consider $B(k)$ as a function of the continuous variable $k \in [0, k_0]$. Let us show that there exist two infinite sequences, $m_j$ and $A_j \ne 0$, such that
$$\frac{d^{m_l}B(k)}{dk^{m_l}}\, n^{m_l - 1} = A_l + O(n^{-1}), \qquad n \to \infty, \quad \forall l, \tag{3.11}$$
uniformly in $k \in [0, k_0]$. Notice that
$$B(k) = n\tilde{B}\Big(\frac{k}{n}\Big), \qquad \tilde{B}(t) = \sqrt{(1 + pt)^2 + qt^2}.$$
As $q \ne 0$, the function $\tilde{B}$ is not a polynomial, so that the series
$$\tilde{B}(t) = \sum_{s=0}^{\infty} z_s t^s, \qquad z_s = z_s(p, q),$$
contains an infinite set of nonzero coefficients $z_s$. Denote the sequence of numbers $s$
for which $z_s \ne 0$ by $m_j$, and set $A_j = m_j!\,z_{m_j}$. Then
$$n^{m_j}\frac{d^{m_j}\tilde{B}(k/n)}{dk^{m_j}} = A_j + O(n^{-1})$$
uniformly in $k \in [0, k_0]$. This implies (3.11). It is clear that
$$B^{(m)}(k) = \int_k^{k+1}\int_{k_1}^{k_1+1}\cdots\int_{k_{m-1}}^{k_{m-1}+1} \frac{d^m B(k_m)}{dk_m^m}\,dk_m \cdots dk_1.$$
In view of (3.11),
$$B^{(m_j)}(k) = A_j n^{1 - m_j}\big(1 + O(n^{-1})\big), \qquad \forall k \in [0, k_0 - m_j].$$
Now let $j$ be the smallest integer such that $m_j \ge \varepsilon^{-1} + 2$. Define $k_0 = m_j$, and let
$$n = \big(2|A_j|\pi^{-1}\rho\big)^{1/(m_j - 1)} \le \rho_0^{-\varepsilon^2/(1+\varepsilon)}\big(2|A_j|\pi^{-1}\big)^{\varepsilon}\rho^{\varepsilon}.$$
Then $B^{(m_j)}(0)\rho \to \operatorname{sign}(A_j)\,\pi/2$, $\rho \to \infty$, so that $\langle\!\langle B^{(m_j)}(0)\rho \rangle\!\rangle = 1/2 + o(1)$, $\rho \to \infty$. Choosing $\alpha$ and $\rho_0$ so that $2^{m_j}\alpha < 1/2$ and $n \le \rho^\varepsilon$, we obtain a contradiction with (3.10). To complete the proof it remains to take $\boldsymbol{\beta} = \boldsymbol{\beta}_k = ne_1 + ke_2$ satisfying (3.9).
In order to prove (3.7), we use the following variant of Dirichlet's theorem on simultaneous rational approximations of real numbers (see, e.g., [15]).
Lemma 3.4
Let $\alpha_1, \alpha_2, \dots, \alpha_n$ be real numbers. Then for any positive integer $Q$ there exist integers $p_1, p_2, \dots, p_n, q$ with $Q \le q < Q^{n+1}$ such that
$$|\alpha_j q - p_j| < \frac{1}{Q}.$$
Proof
Throughout this proof we use the notation $\{x\} = x - [x]$ for the fractional part of $x$. Let $K := [0, 1)^n$ be the unit cube in $\mathbb{R}^n$. Let us cut $K$ into $Q^n$ smaller cubes, each with sides of length $Q^{-1}$; that is, $K = \cup_{j=1}^{Q^n} K_j$, where $K_j$ are small cubes numbered in an arbitrary way. Consider the points $a_\ell \in K$, $\ell = 0, \dots, Q^{n+1}$, given by the formula $a_\ell = (\{\ell\alpha_1\}, \dots, \{\ell\alpha_n\})$. There are $Q^{n+1} + 1$ such points, and hence the pigeonhole principle implies that there is a number $j_0$ such that the cube $K_{j_0}$ contains at least $Q + 1$ points $a_\ell$: $a_{\ell_0}, \dots, a_{\ell_Q} \in K_{j_0}$. We can also assume that $\ell_0 < \ell_1 < \cdots < \ell_Q$, so that $\ell_Q - \ell_0 \ge Q$. The fact that $a_{\ell_0}$ and $a_{\ell_Q}$ belong to the same cube $K_{j_0}$ implies that
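$$\big|\{\ell_Q\alpha_m\} - \{\ell_0\alpha_m\}\big| \le \frac{1}{Q}, \qquad m = 1, \dots, n.$$
This in turn means that
$$\big|(\ell_Q - \ell_0)\alpha_m - \big([\ell_Q\alpha_m] - [\ell_0\alpha_m]\big)\big| \le \frac{1}{Q}, \qquad m = 1, \dots, n.$$
Now we denote $q := \ell_Q - \ell_0$, $p_m := [\ell_Q\alpha_m] - [\ell_0\alpha_m]$, and the lemma is proved.
3.3. Proof of Theorem 3.1
Lower bounds. Denote
$$N(\rho, k) = \#(k; \mathcal{E}(\rho)).$$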
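As an aside on Lemma 3.4 above: its conclusion can be illustrated by a direct search. The sketch below is not the pigeonhole proof itself but a brute-force scan over the range $Q \le q \le Q^{n+1}$ that the proof produces; the function names are hypothetical, and floating-point arithmetic is an assumption that is harmless for the small values used here.

```python
import math


def dist_to_int(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))


def dirichlet_denominator(alphas, Q):
    """Smallest q in [Q, Q**(n+1)] with |alpha_j*q - p_j| < 1/Q for all j.

    Returns (q, [p_1, ..., p_n]) or None if the scan finds nothing.
    """
    n = len(alphas)
    for q in range(Q, Q ** (n + 1) + 1):
        if all(dist_to_int(a * q) < 1.0 / Q for a in alphas):
            return q, [round(a * q) for a in alphas]
    return None


# Simultaneous approximation of sqrt(2) and sqrt(3) with Q = 4.
print(dirichlet_denominator([math.sqrt(2), math.sqrt(3)], 4))
```

The pigeonhole argument guarantees that the scan succeeds; it gives no information on which $q$ in the range works, which is why the proof of (3.7) only uses the two-sided bound on $q$.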
Just as in [4] and [10], we easily conclude that, for any $b \in \Gamma$,
$$\hat{N}(\rho; b) = \int_{\mathcal{O}^\dagger} N(\rho, k)e^{ibk}\,dk = \frac{1}{\det F}\int_{|k| \le \rho} e^{i\boldsymbol{\beta}k}\,dk, \qquad \boldsymbol{\beta} = F^{-1}b,$$
and, in particular, $\hat{N}(\rho; 0) = w_d\rho^d$. Note that
$$\sigma_1(\rho) = \int_{\mathcal{O}^\dagger} \big|N(\rho, k) - \hat{N}(\rho; 0)\big|\,dk \ge |\hat{N}(\rho; b)|, \qquad \forall b \in \Gamma \setminus \{0\}. \tag{3.12}$$
Computing the Fourier coefficient, we obtain that
$$\det F\,\hat{N}(\rho; b) = (2\pi)^{d/2}\beta^{-d/2}\rho^{d/2}J_{d/2}(\rho\beta), \qquad \beta = |\boldsymbol{\beta}| \ne 0. \tag{3.13}$$
To prove (3.5) for $d \ne 1 \pmod 4$, we point out the following elementary property of Bessel functions:
$$|J_\nu(z)| + |J_\nu(2z)| \ge c_\nu z^{-1/2}, \qquad 2\nu \ne 1 \pmod 4, \tag{3.14}$$
for all sufficiently big $z > 0$. Indeed, the Bessel function has the asymptotics (see [1])
$$J_\nu(z) = -\sqrt{\frac{2}{\pi z}}\,g(z) + O(z^{-3/2}), \tag{3.15}$$
with
$$g(z) = \sin(z + a\pi), \qquad a = -\frac{2\nu - 1}{4}.$$
The required estimate is proved if we show that
$$|g(z)| + |g(2z)| \ge c, \qquad z \ge z_0, \tag{3.16}$$
for some $z_0 > 0$. The roots of $g(z)$ and $g(2z)$ are $-a\pi + \pi n$ and $-a\pi/2 + \pi m/2$,
$m, n \in \mathbb{Z}$, respectively. Since $a$ is not an integer, these roots never coincide. This proves (3.16) and (3.14). Now (3.14) and (3.12) immediately yield the required lower bound (3.5) for $d \ne 1 \pmod 4$. Suppose now that $d = 1 \pmod 4$. Then the asymptotics (3.15) gives that
$$\det F\,\hat{N}(\rho; b) = -(-1)^{(d-1)/4}\sqrt{\frac{2}{\pi}}\,(2\pi)^{d/2}\beta^{-(d+1)/2}\rho^{(d-1)/2}\sin(\rho\beta) + O\big(\rho^{(d-3)/2}\beta^{-(d+3)/2}\big), \qquad \beta = |\boldsymbol{\beta}|. \tag{3.17}$$
Choosing $\boldsymbol{\beta}$ as in Lemma 3.3, we see that $|\sin(\rho\beta)| = |\sin(\pi\langle\!\langle \rho\beta \rangle\!\rangle)| \ge c$, and hence $|\hat{N}(\rho; F\boldsymbol{\beta})| \ge c_\varepsilon\rho^{(d-1)/2 - (d+1)\varepsilon/2}$. Using (3.12), we obtain (3.5).
Upper bounds. By virtue of Parseval's identity and (3.13), we have
$$\sigma_2(\rho) = (2\pi)^d\rho^d \sum_{b \in \Gamma \setminus \{0\}} |F^{-1}b|^{-d}\,J_{d/2}^2\big(\rho|F^{-1}b|\big) \le C\rho^{d-1}\sum_{0 \ne m \in \mathbb{Z}^d} |m|^{-d-1} \le C\rho^{d-1}.$$
Here we have used the estimate $|J_\nu^2(z)| \le C|z|^{-1}$. This proves (3.6). Let us now prove (3.7). Let $j > 0$ be a natural number, and let $M = M_j \subset \mathbb{R}$ be the set
$$M_j = \big\{|F^{-1}m| : m \in \mathbb{Z}^d \setminus \{0\} \ \&\ |m| \le j\big\}.$$
It is clear that $n_j$, the number of elements in $M$, does not exceed $2^d j^d$. Applying Lemma 3.4 to the set $M$, we conclude that for any $Q > 0$ one can find a natural number $q = \rho = \rho_j$ such that $Q \le \rho_j < Q^{n_j + 1}$ and
$$\big|\sin\big(2\pi\rho|F^{-1}m|\big)\big| \le Q^{-1}, \qquad \forall |m| \le j. \tag{3.18}$$
Using again Parseval's identity and the asymptotics (3.17), we arrive at the estimate
$$\sigma_2(\rho_j) \le \frac{1}{\pi^2}\rho_j^{d-1}\sum_{0 < |m| \le j} |F^{-1}m|^{-(d+1)}\sin^2\big(2\pi\rho_j|F^{-1}m|\big) + C\rho_j^{d-1}\sum_{|m| > j}|m|^{-d-1} + O\big(\rho_j^{d-2}\big) \le C\rho_j^{d-1}\big(Q^{-2} + j^{-1}\big) + O\big(\rho_j^{d-2}\big).$$
Let $Q = j^{1/2}$. Then the right inequality in (3.18) and the estimate $n_j \le 2^d j^d$ imply that $j \ge c_\varepsilon(\ln\rho_j)^{(1-\varepsilon)/d}$, $\forall\varepsilon > 0$. Hence, the following upper bound holds:
$$\sigma_2(\rho_j) \le C\rho_j^{d-1}(\ln\rho_j)^{(-1+\varepsilon)/d}, \qquad \forall\varepsilon > 0.$$
It remains to observe that, in view of the left inequality (3.18), $\rho_j \to \infty$ as $j \to \infty$.
In conclusion, we recall that the number of points $\#(k; \mathcal{E}(\rho))$ also satisfies the well-known upper bound
$$\big|\#(k; \mathcal{E}(\rho)) - w_d\rho^d\big| \le C\rho^{d-2+2b}, \qquad b = \frac{1}{d+1}, \quad d \ge 2, \tag{3.19}$$
uniformly in $k \in \mathcal{O}^\dagger$ (see [7]). For $k = 0$ one can take a smaller value of $b$ (see [11]). Moreover, it is shown in [3] that $b = 0$ for $d \ge 9$. For discussion and further references we refer to [5]. While the lower bound from Theorem 3.1 is used to establish the lower bounds (2.4) for the functions $m(\lambda)$ and $\zeta(\lambda)$, the estimate (3.19) can be used to prove the following upper bounds for these quantities:
$$m(\lambda) \le C\lambda^{(d-2+2b)/2l}, \qquad \zeta(\lambda) \le C\lambda^{1-(1-b)/l}.$$
The proof of these estimates simply follows the lines of [18], and we do not go into detail.
4. Proofs of Theorems 2.1 and 2.2
Theorems 2.1 and 2.2 are deduced from the following lemma showing how to extract the information on the functions $m(\lambda)$ and $\zeta(\lambda)$ from the upper and lower bounds on the counting function.
Lemma 4.1
Let $l > 0$, and let $d \ge 2$. Let $B(k)$ be a family of bounded self-adjoint operators in $\mathfrak{H}$ depending continuously on $k \in \mathbb{R}^d$. Suppose that for all $\rho \ge \rho_0 > 0$ and some $\beta \in (0, d)$ the counting function $N(\rho^{2l}; H(k))$, $H(k) = H_0(k) + B(k)$, obeys the estimates
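$$\max_k N(\rho^{2l}; H(k)) \ge w_d\rho^d + C\rho^\beta, \qquad \min_k N(\rho^{2l}; H(k)) \le w_d\rho^d - C\rho^\beta. \tag{4.1}$$
Then the functions $m(\lambda)$ and $\zeta(\lambda)$ satisfy the lower bounds
$$m(\lambda) \ge c_0\lambda^{\beta/2l}, \qquad \zeta(\lambda) \ge c_0\lambda^{1-(d-\beta)/2l} \tag{4.2}$$
for all $\lambda \ge (2\rho_0)^{2l}$.
In the proof of this lemma and throughout the rest of the paper we use the following elementary two-sided estimates for the function $h_\pm(t) = (1 \pm t)^\gamma$, $0 \le t \le 1/2$:
$$1 \pm \mathbf{d}_l^{(\pm)}t \le h_\pm(t) \le 1 \pm \mathbf{d}_u^{(\pm)}t. \tag{4.3}$$
Here the constants $\mathbf{d}_l^{(\pm)}$ and $\mathbf{d}_u^{(\pm)}$ depend on $\gamma$ and are given by the formulae
$$\mathbf{d}_l^{(\pm)} = \gamma; \qquad \mathbf{d}_u^{(+)} = \gamma\Big(\frac{3}{2}\Big)^{\gamma-1}, \quad \mathbf{d}_u^{(-)} = \gamma\Big(\frac{1}{2}\Big)^{\gamma-1} \qquad \text{if } \gamma \ge 1;$$
$$\mathbf{d}_u^{(\pm)} = \gamma; \qquad \mathbf{d}_l^{(+)} = \gamma\Big(\frac{2}{3}\Big)^{1-\gamma}, \quad \mathbf{d}_l^{(-)} = \gamma 2^{1-\gamma} \qquad \text{if } \gamma < 1.$$
Proof of Lemma 4.1
According to (4.1) and (4.3), for all nonnegative $t \le \rho^{2l}/2$ we have
$$N_+(\rho^{2l} - t) \ge w_d(\rho^{2l} - t)^{d/2l} + C(\rho^{2l} - t)^{\beta/2l} \ge w_d\rho^d + C\rho^\beta - ct\rho^{d-2l}, \qquad \forall\rho \ge 2\rho_0.$$
Similarly,
$$N_-(\rho^{2l} + t) \le w_d(\rho^{2l} + t)^{d/2l} - C(\rho^{2l} + t)^{\beta/2l} \le w_d\rho^d - C\rho^\beta + ct\rho^{d-2l}, \qquad \forall\rho \ge \rho_0.$$
Now one concludes from (2.10) that
$$m(\rho^{2l}) \ge N_+(\rho^{2l}) - N_-(\rho^{2l}) \ge 2C\rho^\beta, \qquad \forall\rho \ge 2\rho_0,$$
and hence (4.2) holds for all $\lambda \ge \lambda_l = (2\rho_0)^{2l}$. This completes the proof of the lower bound for $m(\lambda)$. To estimate $\zeta(\lambda)$, write
$$N_+(\rho^{2l} - t) - N_-(\rho^{2l} + t) \ge 2C\rho^\beta - 2ct\rho^{d-2l}.$$
From the formula (2.9) one can now infer (4.2) for $\zeta(\lambda)$, $\lambda \ge (2\rho_0)^{2l}$.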
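The two-sided estimates (4.3), used repeatedly in the proof above, follow from the mean value theorem applied to $(1 \pm t)^\gamma$ on $0 \le t \le 1/2$. The sketch below checks them numerically for a few exponents. The constants coded here are the ones that the mean value theorem gives for $\gamma > 0$ and are stated as an assumption about the intended formulae; the function names are hypothetical.

```python
def bound_constants(gamma):
    """Return (dl_plus, du_plus, dl_minus, du_minus) for gamma > 0.

    They are meant to satisfy, for 0 <= t <= 1/2,
        1 + dl_plus*t  <= (1 + t)**gamma <= 1 + du_plus*t,
        1 - dl_minus*t <= (1 - t)**gamma <= 1 - du_minus*t.
    """
    if gamma >= 1:
        return (gamma, gamma * 1.5 ** (gamma - 1),
                gamma, gamma * 0.5 ** (gamma - 1))
    return (gamma * (2.0 / 3.0) ** (1 - gamma), gamma,
            gamma * 2.0 ** (1 - gamma), gamma)


def check_two_sided(gamma, steps=200, tol=1e-12):
    """Verify both chains of inequalities on a uniform grid in [0, 1/2]."""
    dl_p, du_p, dl_m, du_m = bound_constants(gamma)
    for i in range(steps + 1):
        t = 0.5 * i / steps
        h_plus, h_minus = (1 + t) ** gamma, (1 - t) ** gamma
        if not (1 + dl_p * t - tol <= h_plus <= 1 + du_p * t + tol):
            return False
        if not (1 - dl_m * t - tol <= h_minus <= 1 - du_m * t + tol):
            return False
    return True


print([check_two_sided(g) for g in (0.25, 0.5, 1.0, 1.5, 3.0)])
```

In the proof of Lemma 4.1 the estimates are applied with $\gamma = d/2l$ and $\gamma = \beta/2l$ and with $t$ replaced by $t\rho^{-2l}$, which is where the restriction $t \le \rho^{2l}/2$ comes from.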
4.1. Proof of main results
Here we complete the proof of Theorem 2.1 and show how to deduce Theorem 2.2 from Theorem 2.9.
Proof of Theorem 2.1
In view of (3.8) and the relation (3.3), the counting function $N(\rho^{2l}; H_0(k))$ of the unperturbed operator satisfies (4.1) with $\beta = (d-1)/2 - \delta$ (see (2.3) for the definition of $\delta$) for all sufficiently large $\rho > 0$. This fact, in combination with (4.3), implies that for any $\rho^{2l} \ge 4v$, $v = \max_k\|B(k)\|$, we have
$$N_+(\rho^{2l}; H(k)) \ge \max_k N(\rho^{2l} - v; H_0(k)) \ge w_d\rho^d + C\rho^\beta - cv\rho^{d-2l},$$
$$N_-(\rho^{2l}; H(k)) \le \min_k N(\rho^{2l} + v; H_0(k)) \le w_d\rho^d - C\rho^\beta + cv\rho^{d-2l}.$$
Under the condition $4l > d + 1 + \delta$, we have $\beta > d - 2l$, so these estimates yield the bounds (4.1) for $N(\rho^{2l}; H(k))$. Now Lemma 4.1 leads to Theorem 2.1(1). To prove Theorem 2.1(2), recall that if $d \ne 1 \pmod 4$ and $4l = d + 1$, then $\delta = 0$ and $\beta = d - 2l$. Again, the bounds (4.1) follow from the above inequalities, as $v$ is assumed to be sufficiently small.
Proof of Theorem 2.2
We use Theorem 2.9 with $\beta = (d-1)/2 - \delta$. Let us prove Theorem 2.2(1) first. To this end, observe that under condition (i) (resp., condition (ii)) of Theorem 2.2(1), Condition 2.6 (resp., 2.7) is fulfilled. Indeed, if $4l \le d + 1$, $6l > d + 2$, and $V \in H^\alpha(\mathbb{T}^d)$ with an $\alpha$ satisfying (2.6), then for sufficiently small $\delta$ we have $\beta \in (d - 3l + 1/2, d - 2l]$, and (2.12) is satisfied. Also, if $d = 2$, then $\nu = 6l - 4 \le 1$. Furthermore, if $d \ne 1 \pmod 4$, $4l = d + 1$, and $V$ is continuous, then $\beta = d - 2l$, $l > 1/2$. Consequently, Theorem 2.9(1) is applicable. According to (2.13) and (3.4),
$$\lim \rho^{-(d-1)/2 + \delta}\,\big|\langle N(\rho^{2l}; H)\rangle - w_d\rho^d\big| = 0, \qquad \rho \to \infty. \tag{4.4}$$
Hence the inequalities
$$\big\langle|N(\lambda; H) - \langle N(\lambda; H)\rangle|\big\rangle \ge \big\langle|N(\lambda; H_0) - w_d\rho^d|\big\rangle - \big\langle|N(\lambda; H) - N(\lambda; H_0)|\big\rangle - \big|\langle N(\lambda; H)\rangle - w_d\rho^d\big|, \qquad \lambda = \rho^{2l},$$
equality (2.13), and Theorem 3.1 yield that
$$\big\langle|N(\rho^{2l}; H) - \langle N(\rho^{2l}; H)\rangle|\big\rangle \ge c\rho^{(d-1)/2 - \delta}$$
for sufficiently large $\rho$. Applying the same argument as in Remark 3.2, we now see that
$$\max_k N(\rho^{2l}; H(k)) \ge \langle N(\rho^{2l}; H)\rangle + c\rho^{(d-1)/2 - \delta}, \qquad \min_k N(\rho^{2l}; H(k)) \le \langle N(\rho^{2l}; H)\rangle - c\rho^{(d-1)/2 - \delta}.$$
Referring again to (4.4), one concludes that the counting function of $H(k)$ satisfies (4.1) with $\beta = (d-1)/2 - \delta$. Now Lemma 4.1 implies Theorem 2.2(1). Let $d \ne 1 \pmod 4$, let $6l = d + 2$, and let $V$ be a trigonometric polynomial. As $\beta = d - 3l + 1/2$, the conditions of Theorem 2.9(2) are fulfilled, so that (2.14) holds. Assuming that the number $g_0 > 0$ is sufficiently small, one proves as above that the counting function $N(\rho^{2l}; H_0(k) + gV)$, $|g| \le g_0$, satisfies estimates (4.1) with $\beta = (d-1)/2$, which implies Theorem 2.2(2).
4.2. Proof of Theorem 2.5
Let $v = \|B\| = \max_k\|B(k)\|$ and $\lambda = \rho^{2l}$. By a straightforward perturbation argument
$$N(\lambda - v; H_0) \le N(\lambda; H_0 + B) \le N(\lambda + v; H_0).$$
Note that
$$\big\langle|N(\lambda \pm v; H_0) - N(\lambda; H_0)|\big\rangle = \big|\langle N(\lambda \pm v; H_0)\rangle - \langle N(\lambda; H_0)\rangle\big|.$$
By virtue of (3.4) and (4.3), the right-hand side does not exceed
$$w_d\big|(\rho^{2l} \pm v)^{d/2l} - \rho^d\big| \le Cv\rho^{d-2l}.$$
This completes the proof.
The rest of the paper is devoted to the proof of Theorem 2.9.
5. Reduction of the operator $H(k)$
The idea of the proof is to bring the operator $H(k)$, by a series of transformations, as close as possible to the unperturbed operator $H_0(k)$, controlling at each step the counting function $N(\lambda)$. A reduction of $H(k)$ is done in two steps that are described below. Although our ultimate goal is to prove Theorem 2.9, we do not need to assume that all its conditions are fulfilled for each intermediary result obtained in this section. In particular, the conclusions of Subsection 5.1 are true for an arbitrary bounded self-adjoint perturbation $B$ and do not need the locality of $B$. Let us introduce necessary notation. For any (measurable) set $\mathcal{C} \subset \mathbb{R}^d$, we denote by $P(\mathcal{C})$ the orthogonal projection in $\mathfrak{H} = L^2(\mathcal{O})$ onto the subspace spanned by the exponentials
$$\frac{1}{(2\pi)^{d/2}}e^{imx}, \qquad m \in \mathcal{C} \cap \mathbb{Z}^d.$$
As a rule, we also use the notation $P^{(k)}(\mathcal{C}) = P(\mathcal{C}^{(k)})$ (see (3.1) for the definition of
$\mathcal{C}^{(k)}$). From now on, slightly abusing notation, we use the symbol $N(\lambda; H_0(k)P^{(k)})$ to denote the counting function of the operator $H_0(k)$ restricted to the range of the projection $P^{(k)}$. In what follows we use a number of parameters depending on $\beta$, $l$, $d$. For the reader's convenience we list them all below, including the parameters $\nu$ and $\gamma$ defined in (2.11):
$$\nu = 6l + 2\beta - 2d - 1, \qquad \gamma = d - 2l - \beta, \qquad \eta = d - 4l - \beta + 1. \tag{5.1}$$
5.1. Step 1
Suppose that $B = B(k)$ is a family of bounded self-adjoint operators in $\mathfrak{H}$, norm-continuous in $k \in \mathbb{R}^d$. Denote $F = G^{1/2}$, and define the shell
$$\mathcal{S}(\rho, r) = \big\{\xi \in \mathbb{R}^d : \big||F\xi| - \rho\big| \le r\big\}, \qquad 0 < r \le \rho.$$
In the operator $H(k) = H_0(k) + B(k)$, we split $B$ into the sum of two operators:
$$W = W(\rho, r; k) = P^{(k)}BP^{(k)}, \qquad \tilde{W} = \tilde{W}(\rho, r; k) = Q^{(k)}BP^{(k)} + P^{(k)}BQ^{(k)} + Q^{(k)}BQ^{(k)},$$
where we have denoted
$$P^{(k)} = P\big(\mathcal{S}^{(k)}(\rho, r)\big), \qquad Q^{(k)} = I - P^{(k)}.$$
We are going to show that the counting function $N(\rho^{2l}; H(k))$ is determined by $N(\rho^{2l}; \mathcal{H})$ with the "effective" operator
$$\mathcal{H} = \mathcal{H}(\rho, r; k) = H_0(k) + W(\rho, r; k). \tag{5.2}$$
Sometimes, if necessary, we also reflect the dependence of $\mathcal{H}$ on the operator $B$ and write $\mathcal{H}(\rho, r; k, B)$.
Theorem 5.1
Let $B(k)$ be as above, and let $\|B\| = \max_k\|B(k)\| = v$. Let $\beta \in (d - 4l, d - 2l]$, $\omega > 0$ be arbitrary numbers, and let $\gamma$, $\eta$ be as defined in (5.1). Then there exist a constant $A > 0$ and a number $\rho_1 = \rho_1(v, \omega) > 0$ such that for $r = Av\omega^{-1}\rho^\eta$ and $\rho \ge \rho_1$ one has $\rho \ge 2r$, and
$$N\big(\rho^{2l} - v\omega\rho^{-\gamma}; \mathcal{H}(\rho, r; k)\big) \le N\big(\rho^{2l}; H(k)\big) \le N\big(\rho^{2l} + v\omega\rho^{-\gamma}; \mathcal{H}(\rho, r; k)\big) \tag{5.3}$$
uniformly in $k \in \mathcal{O}^\dagger$.
The proof of this theorem is very simple; it requires only some basic knowledge of perturbation theory.
Proof of Theorem 5.1
Denote for brevity $P = P^{(k)}$, $Q = Q^{(k)}$. Let us estimate the contribution of $\tilde{W}$ to the spectrum of $H$. For any $u \in L^2(\mathcal{O})$, we have
$$\big|(PBQu, u) + (QBPu, u)\big| \le v\omega\rho^{-\gamma}\|Pu\|^2 + v\omega^{-1}\rho^\gamma\|Qu\|^2.$$
Consequently,
$$H \le \big(\mathcal{H} + v\omega\rho^{-\gamma}\big)P + \big(H_0 + v\omega^{-1}\rho^\gamma + v\big)Q, \qquad H \ge \big(\mathcal{H} - v\omega\rho^{-\gamma}\big)P + \big(H_0 - v\omega^{-1}\rho^\gamma - v\big)Q. \tag{5.4}$$
Denote
$$H^\pm = \big(H_0 \pm v\omega^{-1}\rho^\gamma \pm v\big)Q.$$
It is clear that
$$N(\lambda; H^\pm) = N\big(\lambda \mp v\omega^{-1}\rho^\gamma \mp v; H_0Q\big), \qquad \lambda = \rho^{2l}. \tag{5.5}$$
By the definition of $Q$, the operator $H_0Q$ has no spectrum in the interval $I(\rho, r) = ((\rho - r)^{2l}, (\rho + r)^{2l})$. Assuming that $2r \le \rho$ and using (4.3), we also conclude that $H_0Q$ has no spectrum in the interval
$$\big(\rho^{2l} - \mathbf{d}r\rho^{2l-1},\ \rho^{2l} + \mathbf{d}r\rho^{2l-1}\big) \subset I(\rho, r), \qquad \mathbf{d} = \min\big\{\mathbf{d}_u^{(-)}, \mathbf{d}_l^{(+)}\big\}.$$
Set
$$r = 4\mathbf{d}^{-1}v\omega^{-1}\rho^{\gamma + 1 - 2l}.$$
Since $\gamma > 0$, the numbers $\rho^{2l} \mp v\omega^{-1}\rho^\gamma \mp v$ lie inside the above interval for all $\rho \ge \rho_1$ with a sufficiently large $\rho_1 = \rho_1(v, \omega)$. Consequently, the right-hand side of the equality (5.5) equals $N(\lambda; H_0Q)$. In combination with (5.4), this yields (5.3). Precisely, the first bound from (5.4) implies
$$N(\rho^{2l}; H) \ge N\big(\rho^{2l}; (\mathcal{H} + v\omega\rho^{-\gamma})P\big) + N\big(\rho^{2l}; H_0Q\big) \ge N\big(\rho^{2l} - v\omega\rho^{-\gamma}; \mathcal{H}P\big) + N\big(\rho^{2l} - v\omega\rho^{-\gamma}; H_0Q\big) = N\big(\rho^{2l} - v\omega\rho^{-\gamma}; \mathcal{H}(\rho, r; k)\big). \tag{5.6}$$
The upper bound is proved in the same way. It remains to check the validity of the assumption $\rho \ge 2r$ above. Notice that $\gamma + 1 - 2l = \eta$. In view of the condition $\beta > d - 4l$ the exponent $\eta$ is always strictly less than 1. Therefore, increasing, if necessary, the number $\rho_1 = \rho_1(v, \omega)$, we may indeed assume that $\rho \ge 2r$ for $\rho \ge \rho_1$.
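5.2. Step 2: Effective part of the local operator $H(k)$
In this step we have to assume that $B$ is the multiplication by a function $V = V(x)$. We begin by introducing a convenient representation for $V$. Write
$$V(x) = (2\pi)^{-d/2}\sum_{\theta \in \Theta}\hat{V}(\theta)e^{i\theta x} = \sum_{\theta \in \Theta} V_\theta, \qquad V_\theta(x) = (2\pi)^{-d/2}\operatorname{Re}\big(\hat{V}(\theta)e^{i\theta x}\big), \tag{5.7}$$
where $\Theta$ is defined in (2.7). Recall that $0 \notin \Theta$ in view of (2.5). The conditions on $V$, which are specified later, guarantee that the series (5.7) converges absolutely. For each $\theta \in \Theta$, decompose the shell $\mathcal{S} = \mathcal{S}(\rho, r)$, with $r$ as defined in Theorem 5.1, into two disjoint sets, $\mathcal{S} = H_\theta \cup \Omega_\theta$, depending on the scalar parameter $\sigma > 0$:
$$H_\theta = H_\theta(\rho, r, \sigma) = \big\{\xi \in \mathcal{S}(\rho, r) : |\theta G(\xi + \theta/2)| \le \sigma\rho^{\eta+1} \ \text{or}\ |\theta G(\xi - \theta/2)| \le \sigma\rho^{\eta+1}\big\}, \qquad \Omega_\theta = \Omega_\theta(\rho, r, \sigma) = \mathcal{S}(\rho, r) \setminus H_\theta(\rho, r, \sigma). \tag{5.8}$$
Lemma 5.2
Let $r$ be as defined in Theorem 5.1, and let $\rho \ge \rho_1(v, \omega)$. Then for any $\sigma \ge 3Av\omega^{-1}$ one has $P^{(k)}(\Omega_\theta)V_\theta P^{(k)}(\mathcal{S}) = 0$, $\forall\theta \in \Theta$.
Proof
It suffices to show that if for any $\xi \in \mathcal{S} = \mathcal{S}(\rho, r)$ the point $\xi \pm \theta$ also belongs to $\mathcal{S}$, then $\xi \pm \theta \in H_\theta$. Let $\xi \in \mathcal{S}$, and let $\xi \pm \theta \in \mathcal{S}$. As $\rho \ge 2r$, we have
$$\rho^2 - 3\rho r < |F\xi|^2 < \rho^2 + 3\rho r, \qquad \rho^2 - 3\rho r < |F(\xi \pm \theta)|^2 = |F\xi|^2 \pm 2\theta G(\xi \pm \theta/2) < \rho^2 + 3\rho r,$$
which implies that $|\theta G(\xi \pm \theta/2)| \le 3\rho r = 3Av\omega^{-1}\rho^{\eta+1}$ or that $\xi \pm \theta \in H_\theta(\rho, r, \sigma)$, $\forall\sigma \ge 3Av\omega^{-1}$.
This lemma immediately implies that
$$\mathcal{H} = H_0 + W^I, \qquad W^I = \sum_\theta P^{(k)}(H_\theta)V_\theta P^{(k)}(H_\theta), \tag{5.9}$$
if $\sigma \ge 3Av\omega^{-1}$. As the next theorem shows, the counting functions of $\mathcal{H}$ and $H_0$ are already sufficiently close, so that we do not need to reduce $\mathcal{H}$ any further.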
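Theorem 5.3
Suppose that $V$ is a periodic real-valued function satisfying (2.5). Let $\beta \le d - 2l$, and let $\lambda \in [\rho^{2l} - \kappa\rho^{-\gamma}, \rho^{2l} + \kappa\rho^{-\gamma}]$ with an arbitrary fixed $\kappa > 0$. (1) Suppose that $V$ is a trigonometric polynomial (5.7) and that $\Theta \subset \mathcal{E}_0(\rho/2)$ for $d = 2$. Then, for any $\alpha > d/2$,
$$\big\langle|N(\lambda; \mathcal{H}(\rho, r)) - N(\rho^{2l}; H_0)|\big\rangle \le C\rho^\beta\big(\kappa + \|V\|_{H^\alpha}\omega^{-1}M(\Theta)\rho^{-\nu}\big), \tag{5.10}$$
with $\nu$ and $M(\Theta)$ defined in (2.11) and (2.8), respectively. (2) Suppose that one of Conditions 2.6 or 2.7 is fulfilled. Then
$$\limsup\,\rho^{-\beta}\big\langle|N(\lambda; \mathcal{H}(\rho, r)) - N(\rho^{2l}; H_0)|\big\rangle \le C\kappa \tag{5.11}$$
as $\rho \to \infty$. The constant $C$ in (5.10), (5.11) is independent of $V$, $\kappa$, $\omega$.
Recall that under Condition 2.6 or 2.7 the parameter $\nu$ defined in (2.11) is strictly positive. Theorem 5.3 is proved in the next section.
6. Proofs of Theorems 5.3 and 2.9
To study the spectrum of $\mathcal{H}$ we first investigate an auxiliary problem.
6.1. Auxiliary problem
For a set $\mathcal{C} \subset \mathbb{R}^d$ and a number $g \in \mathbb{R}$, define on $\mathfrak{H}$ the operator
$$X(k) = X_g(k) = H_0(k) + gP^{(k)}(\mathcal{C}), \qquad k \in \mathcal{O}^\dagger. \tag{6.1}$$
Denote $\rho = \lambda^{1/2l}$, $\rho' = (\lambda - g)^{1/2l}$, $\lambda \ge |g|$. The study of the counting function $N(\lambda; X(k))$ involves the set
$$\mathcal{D}(\lambda) = \begin{cases} \big(\mathcal{E}(\rho) \setminus \mathcal{E}(\rho')\big) \cap \mathcal{C}, & g \ge 0; \\ \big(\mathcal{E}(\rho') \setminus \mathcal{E}(\rho)\big) \cap \mathcal{C}, & g < 0. \end{cases}$$
Denote
$$\omega(\lambda) = \operatorname{vol}\big(\mathcal{D}(\lambda)\big).$$
Lemma 6.1
Let $X(k)$ be as defined above. Then for any $\lambda = \rho^{2l} \ge |g|$,
$$\big\langle|N(\lambda; X) - \#(\mathcal{E}(\rho))|\big\rangle = \omega(\lambda). \tag{6.2}$$
Proof
Denote $P = P^{(k)}(\mathcal{C})$, $Q = I - P^{(k)}(\mathcal{C})$. Since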
X = H0 Q ⊕ P X P , it is clear that N λ; X(k) =N λ; H0 (k)Q + N λ; X(k)P =N λ; H0 (k)Q + N λ − g; H0 (k)P . Using the definition of P , it is straightforward to rewrite this formula as follows: N λ; X(k) = # k; C ' (λ) , (6.3) with C ' = C ' (λ) = E (ρ) \ C E (ρ ' ) ∩ C ,
ρ = λ1/2l , ρ ' = (λ − g)1/2l .
Let us rewrite C ' in a different form using the set D (λ) defined before the lemma: E (ρ) ∪ D (λ), g < 0; ' C = E (ρ) \ D (λ), g ≥ 0. As D (λ) ⊂ E (ρ), g ≥ 0, and D (λ) ∩ E (ρ) = ∅, g < 0, one can write # k; C ' (λ) = # k; E (ρ) + M(k; λ), M(k; λ) = ∓# k; D (λ) , ±g ≥ 0.
(6.4)
In view of (3.2), |M(λ)| = ω(λ), so that (6.4) and (6.3) lead to (6.2). Let us apply this lemma to the set ∪θ∈' Hθ (ρ, σ ), where the sets Hθ are defined in (5.8). corollary 6.2 Suppose that ' is a finite set and that ' ⊂ E0 (ρ/2) for d = 2. Let C = ∪θ∈' Hθ (ρ, σ ). Then for any λ ∈ [ρ 2l − κρ −γ , ρ 2l + κρ −γ ], κ > 0, one has
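\[
\bigl|N(\lambda; X_g) - \#(\mathcal{E}(\rho))\bigr| \le C\rho^{\beta}\bigl(\kappa + \sigma|g| M(\Theta)\rho^{-\nu}\bigr), \tag{6.5}
\]
where $\nu$ is defined in (2.11).

Proof. Let $\lambda = \tau^{2l}$ and $\tau_\pm = (\lambda \pm |g|)^{1/2l}$. It is obvious that
\[
\mathcal{D}(\lambda) \subset \cup_\theta \mathcal{D}_\theta(\lambda), \qquad \mathcal{D}_\theta(\lambda) = \bigl(\mathcal{E}(\tau_+) \setminus \mathcal{E}(\tau_-)\bigr) \cap H_\theta, \qquad \theta \in \mathbb{Z}^d \setminus \{0\}.
\]
Let us estimate $\omega(\lambda)$ first. By (4.3) and definition (5.8), we get for all $\lambda \ge 2|g|$ that
\[
\mathcal{D}_\theta(\lambda) \subset \bigl\{\xi \in \mathbb{R}^d : \tau - C|g|\tau^{1-2l} \le |F\xi| \le \tau + C|g|\tau^{1-2l},\ |\theta G(\xi + \theta/2)| \le \sigma\rho^{\eta+1} \ \text{or}\ |\theta G(\xi - \theta/2)| \le \sigma\rho^{\eta+1}\bigr\}.
\]
An elementary geometric argument shows that
\[
\operatorname{vol} \mathcal{D}_\theta(\lambda) \le C\sigma|g|\tau^{d-1-2l}\rho^{\eta+1}|\theta|^{-1} \le C|g|\sigma|\theta|^{-1}\rho^{d-2l+\eta}.
\]
If $d \ge 3$, then this is true without any restrictions on the finite set $\Theta$. If $d = 2$, then this is true under the condition $|\theta|\rho^{-1} \le c < 1$, $\forall\theta \in \Theta$, which is satisfied in view of the assumption $|\theta| \le \rho/2$. Since $\mathcal{D}(\lambda) \subset \cup_\theta \mathcal{D}_\theta(\lambda)$, we conclude that
\[
\omega(\lambda) \le \sum_\theta \operatorname{vol} \mathcal{D}_\theta(\lambda) \le C\sigma|g|\rho^{d-2l+\eta} M(\Theta),
\]
so that, by Lemma 6.1,
\[
\bigl|N(\lambda; X_g) - \#(\mathcal{E}(\tau))\bigr| \le C\sigma|g|\rho^{d-2l+\eta} M(\Theta) \tag{6.6}
\]
for all $\lambda$ satisfying the conditions of this corollary. Now note that for any positive $\rho$ and $\tau$ such that $|\rho^{2l} - \tau^{2l}| \le t$ for some $t \ge 0$ one has, in view of (3.4),
\[
\bigl|\#(\mathcal{E}(\rho)) - \#(\mathcal{E}(\tau))\bigr| \le Ct\rho^{d-2l}.
\]
Setting $t = \kappa\rho^{-\gamma}$, we obtain from this inequality and (6.6) that
\[
\bigl|N(\lambda; X) - \#(\mathcal{E}(\rho))\bigr| \le \bigl|N(\lambda; X) - \#(\mathcal{E}(\tau))\bigr| + \bigl|\#(\mathcal{E}(\tau)) - \#(\mathcal{E}(\rho))\bigr| \le C\sigma|g|\rho^{d-2l+\eta} M(\Theta) + C\kappa\rho^{\beta}.
\]
It remains to notice that, by (2.11), $d - 2l + \eta = 2d - 6l - \beta + 1 = -\nu + \beta$, which completes the proof.

6.2. Proof of Theorem 5.3

Assume that $V$ is a trigonometric polynomial. To apply Lemma 6.1 and Corollary 6.2, set first of all $\mathcal{C} = \cup_{\theta\in\Theta} H_\theta(\rho, \sigma)$ and $\sigma = 3Av\omega^{-1}$. By (5.9) we have
\[
H = H_0 + P^{(k)}(\mathcal{C})\, W^I P^{(k)}(\mathcal{C}).
\]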
Furthermore, by virtue of (5.7),
\[
-g \le W^I \le g, \qquad g = (2\pi)^{-d/2} \sum_{\theta\in\Theta} |\hat V(\theta)| \le C_\alpha \|V\|_{H^\alpha}, \qquad \forall\alpha > d/2.
\]
Consequently, $X_{-g}(k) \le H(k) \le X_g(k)$, where $X_g$ and $X_{-g}$ are operators of the form (6.1). Applying Corollary 6.2, we conclude that the counting functions of both operators $X_g$, $X_{-g}$ fulfill (6.5). This implies (5.10).

To prove Theorem 5.3(2), suppose first that Condition 2.6 is satisfied, and hence $\nu > 0$. Split $V$ into the sum $V = V_1 + V_2$ with
\[
(2\pi)^{d/2} V_1(x) = \sum_{|\theta|\le R} \hat V(\theta) e^{i\theta x}, \qquad (2\pi)^{d/2} V_2(x) = \sum_{|\theta|>R} \hat V(\theta) e^{i\theta x}
\]
with an $R > 0$. Using the property $V \in H^\alpha$, we see that
\[
\|V_2\|^2 \le (2\pi)^{-d}\Bigl(\sum_{|\theta|>R} |\hat V(\theta)|\Bigr)^2 \le \|V\|_{H^\alpha}^2 \sum_{|\theta|>R} |\theta|^{-2\alpha} \le C\|V\|_{H^\alpha}^2 R^{-2\alpha+d}.
\]
Assuming that
\[
R = \bigl(\|V\|_{H^\alpha}\kappa^{-1}\rho^{\gamma}\bigr)^{(\alpha-d/2)^{-1}}, \tag{6.7}
\]
it is straightforward to check that $\|V_2\| \le C\kappa\rho^{-\gamma}$. Therefore, by a simple perturbation argument,
\[
\bigl|N(\lambda; H(\rho, r; k, V)) - N(\rho^{2l}; H_0(k))\bigr| \le \sum_{\pm} \bigl|N(\rho^{2l} \pm C\kappa\rho^{-\gamma}; H(\rho, r; k, V_1)) - N(\rho^{2l}; H_0(k))\bigr|. \tag{6.8}
\]
Since $V_1$ is a trigonometric polynomial, we may try to use (5.10). However, in order to do so we need to check that if $d = 2$, then $R \le \rho/2$. As $\alpha$ obeys (2.6), we have $\gamma(\alpha - 1)^{-1} < \nu \le 1$. By definition (6.7), this implies that $R\rho^{-1} \to 0$ as $\rho \to \infty$. Therefore, applying (5.10) to the right-hand side of (6.8), we get
\[
\bigl|N(\lambda; H(\rho, r)) - N(\rho^{2l}; H_0)\bigr| \le C\rho^{\beta}\bigl(\kappa + \omega^{-1}\|V\|_{H^\alpha} M(\mathcal{E}_0(R) \setminus \{0\})\rho^{-\nu}\bigr) \tag{6.9}
\]
for sufficiently large $\rho$. By (2.6) the power of $\rho$ in the right-hand side of the inequality
\[
M(\mathcal{E}_0(R) \setminus \{0\}) \le CR^{d-1} = C\bigl(\|V\|_{H^\alpha}\kappa^{-1}\bigr)^{(d-1)(\alpha-d/2)^{-1}} \rho^{\gamma(d-1)(\alpha-d/2)^{-1}}
\]
is strictly less than $\nu$. Now (6.9) leads to (5.11).
Under Condition 2.7, Theorem 5.3 is proved in a similar way with the following minor alteration. As $V$ is continuous, it can be approximated by a trigonometric polynomial $V_1$ such that $\|V - V_1\| \le \kappa$. It remains to observe that if $\beta = d - 2l$, $l > 1/2$, then $\gamma = 0$, $\nu = 2l - 1 > 0$, and to follow the above proof.

6.3. Proof of Theorem 2.9

As $\nu \ge 0$ and $\beta \le d - 2l$, the parameter $\beta$ belongs to the interval $(d - 4l, d - 2l]$, so that Theorem 5.1 is applicable. It follows from (5.3) that
\[
N(\lambda_-; H(\rho, r; k)) \le N(\rho^{2l}; H(k)) \le N(\lambda_+; H(\rho, r; k)), \qquad \lambda_\pm = \rho^{2l} \pm v\omega\rho^{-\gamma}, \tag{6.10}
\]
where $v = \max_x |V(x)|$. Let Condition 2.8 be satisfied. As $\nu = 0$, it follows from (6.10) and (5.10) that
\[
|T(\rho, V)| = \bigl|N(\rho^{2l}; H) - N(\rho^{2l}; H_0)\bigr| \le C\rho^{\beta}\bigl(v\omega + \omega^{-1}\|V\|_{H^\alpha} M(\Theta)\bigr).
\]
Now (2.14) follows if we set $\omega = M(\Theta)^{1/2}$ and recall that $v \le \|V\|_{H^\alpha}$. Theorem 2.9(1) is a consequence of (5.11). Precisely, (6.10) and (5.11) with $\kappa = v\omega$ ensure that
\[
\limsup \rho^{-\beta}\bigl|N(\rho^{2l}; H) - N(\rho^{2l}; H_0)\bigr| \le Cv\omega, \qquad \rho \to \infty.
\]
Since $\omega > 0$ is arbitrary and the left-hand side does not depend on $\omega$, we obtain (2.13). The proof of Theorem 2.9 is complete.

As explained in Subsection 4.1, Theorem 2.9 leads to Theorem 2.2.
Parnovski Department of Mathematics, University College London, Gower Street, London WC1E 6BT, United Kingdom; [email protected]
Sobolev Centre for Mathematical Analysis and Its Applications, University of Sussex, Falmer, Brighton, BN1 9QH, United Kingdom; [email protected]
LINEAR EQUATIONS OVER $\mathbb{F}_p$ AND MOMENTS OF EXPONENTIAL SUMS

VSEVOLOD F. LEV

Abstract

Two of our principal results (in a simplified form) are as follows.

Theorem. For $p$ prime, the number of solutions of the equation
\[
c_1 a_1 + \dots + c_k a_k = \lambda, \qquad a_j \in A_j,
\]
where $c_j \in \mathbb{F}_p^\times$ and $\lambda \in \mathbb{F}_p$ are fixed coefficients, and the variables $a_j$ range over sets $A_j \subseteq \mathbb{F}_p$, does not exceed the number of solutions of the equation
\[
a_1 + \dots + a_k = 0, \qquad a_j \in \bar A_j,
\]
where the variables $a_j$ range over arithmetic progressions $\bar A_j \subseteq \mathbb{F}_p$ of cardinalities $|\bar A_j| = |A_j|$, balanced around zero.

This readily implies an integer version, which strengthens a result of R. Gabriel, G. Hardy, and J. Littlewood.

Theorem. Let $A$ be a set of $n = |A|$ residues modulo a prime $p$. For $z \in \mathbb{F}_p$, write
\[
S_A(z) = \sum_{a\in A} e^{2\pi i(az/p)}.
\]
Then for $\varepsilon > 0$ we have
\[
\#\bigl\{z \in \mathbb{F}_p^\times : |S_A(z)| > (1 - \varepsilon)n\bigr\} \le \frac{2\sqrt{6}}{\pi}\,\frac{p}{n}\,\varepsilon^{1/2}\bigl(1 + o(1)\bigr),
\]
provided $n \to \infty$ and $\varepsilon \to 0$. Equality is attained when $A$ is an arithmetic progression modulo $p$.
DUKE MATHEMATICAL JOURNAL Vol. 107, No. 2, © 2001. Received 24 August 1999. 2000 Mathematics Subject Classification. Primary 11B75; Secondary 42C20, 11D04, 11L07, 11P99, 11B25, 11D12. Author's work partially supported by the Edmund Landau Center for Research in Mathematical Analysis and Related Areas, sponsored by the Minerva Foundation (Germany).
This implies an integer version that is due to A. A. Yudin.

This paper consists of two parts, the latter of which depends on the results of the former. The first part aims to establish $\mathbb{F}_p$-analogs of a series of results due to Gabriel, Hardy, and Littlewood known as "rearrangement theorems." It is seen that the original integer versions follow easily from their counterparts in $\mathbb{F}_p$, but not vice versa. The second part deals with the moments and distribution of the absolute values of exponential sums in the field $\mathbb{F}_p$. Analogs modulo $p$ of two of Yudin's results concerning exponential sums, associated with a finite set of integers, are established. An extended discussion of the results obtained is presented in Sections 1 and 2; the proofs are given in Sections 3–6.

1. Historical background and summary of results: Inequalities, rearrangements, and linear equations over finite sets

The classical rearrangement theorems are the subject of [HLP, Chapter X]. We first quote and discuss the central results, changing somewhat the original notation in an attempt to simplify the presentation. We say that a function $\bar f : \mathbb{Z} \to \mathbb{R}_+$ is balanced if it satisfies either
\[
\bar f(0) \ge \bar f(1) \ge \bar f(-1) \ge \bar f(2) \ge \bar f(-2) \ge \cdots
\]
or
\[
\bar f(0) \ge \bar f(-1) \ge \bar f(1) \ge \bar f(-2) \ge \bar f(2) \ge \cdots.
\]
We write
\[
\delta(\bar f) =
\begin{cases}
-1 & \text{if } \bar f(i) < \bar f(-i) \text{ for some } i \in \mathbb{Z}_+,\\
1 & \text{if } \bar f(i) > \bar f(-i) \text{ for some } i \in \mathbb{Z}_+,\\
0 & \text{if } \bar f(i) = \bar f(-i) \text{ for any } i \in \mathbb{Z}.
\end{cases}
\]
For each function $f : \mathbb{Z} \to \mathbb{R}_+$ with a finite support, there exists a balanced function $\bar f : \mathbb{Z} \to \mathbb{R}_+$ which is a rearrangement of $f$ in the sense that $\bar f = f \circ \sigma$ (equivalently, $\bar f(i) = f(\sigma(i))$) for some bijection $\sigma : \mathbb{Z} \to \mathbb{Z}$. There are two ways to balance a given function $f$, unless it attains its maximal value an odd number of times and any other nonzero value an even number of times. In the latter case the corresponding balanced function $\bar f$ is said to be symmetrically decreasing.

Theorem A (see [G, Theorem 3]; see also [HLP, Theorem 376]). Suppose that the functions $f_j : \mathbb{Z} \to \mathbb{R}_+$ ($j = 1, \dots, k$) have finite support, and let $\bar f_j$ be balanced functions, corresponding to $f_j$. Then
\[
\sum_{\substack{i_1,\dots,i_k,\,i_1',\dots,i_k' \in \mathbb{Z}\\ i_1+\dots+i_k = i_1'+\dots+i_k'}} f_1(i_1)f_1(i_1') \cdots f_k(i_k)f_k(i_k')
\le
\sum_{\substack{i_1,\dots,i_k,\,i_1',\dots,i_k' \in \mathbb{Z}\\ i_1+\dots+i_k = i_1'+\dots+i_k'}} \bar f_1(i_1)\bar f_1(i_1') \cdots \bar f_k(i_k)\bar f_k(i_k').
\]
This theorem was applied by Gabriel, Hardy, and Littlewood to a problem in Fourier analysis.

Corollary 1 [G, Theorem 4]. Given finite trigonometric series
\[
F_j(x) = \sum_m f_j(m) e^{2\pi imx}, \qquad j = 1, \dots, k,
\]
let $g_j(m) = |f_j(m)|$, and suppose that $\bar g_j$ are balanced functions, corresponding to $g_j$. Define $G_j(x) = \sum_m \bar g_j(m) e^{2\pi imx}$ (so that $G_j$ are obtained by replacing the Fourier coefficients of $F_j$ with their absolute values and rearranging them). Then
\[
\int_0^1 |F_1(x)|^2 \cdots |F_k(x)|^2\,dx \le \int_0^1 |G_1(x)|^2 \cdots |G_k(x)|^2\,dx.
\]
As a particular case we have
\[
\int_0^1 |F(x)|^{2k}\,dx \le \int_0^1 |G(x)|^{2k}\,dx.
\]
The proof of Theorem A is based on the following theorem.

Theorem B (see [G, Theorem 2]; see also [HLP, Theorem 374]). Suppose that the functions $f_j : \mathbb{Z} \to \mathbb{R}_+$ ($j = 1, \dots, k$) have finite support, and let $\bar f_j$ be corresponding balanced functions. Assume that $\delta(\bar f_1) \ge 0$, $\delta(\bar f_2) \le 0$, and $\delta(\bar f_j) = 0$ for $j = 3, \dots, k$. Then
\[
\sum_{\substack{i_1,\dots,i_k \in \mathbb{Z}\\ i_1+\dots+i_k = 0}} f_1(i_1) \cdots f_k(i_k) \le \sum_{\substack{i_1,\dots,i_k \in \mathbb{Z}\\ i_1+\dots+i_k = 0}} \bar f_1(i_1) \cdots \bar f_k(i_k).
\]

This theorem is due to Gabriel [G]; it generalizes an earlier result of Hardy and Littlewood [HL] where all functions $\bar f_j$ (including those with $j = 1$ and $j = 2$) were required to be symmetrically decreasing. For $k = 2$, Theorem B becomes a restatement of the well-known fact that the "scalar product" of two nonnegative real sequences is maximized when these sequences are similarly ordered. In general, Theorem B establishes an inequality between two convolutions: $f_1 * \cdots * f_k(0) \le \bar f_1 * \cdots * \bar f_k(0)$. An important particular case, on which the proof of Theorem B in its full generality relies, is that of $f_j$ being the indicator functions of finite sets of integers $A_j$:
\[
f_j(i) =
\begin{cases}
1 & \text{if } i \in A_j,\\
0 & \text{if } i \notin A_j.
\end{cases}
\]
Theorem B reduces then to the following theorem.

Theorem C. Suppose that $A_j \subseteq \mathbb{Z}$ ($j = 1, \dots, k$) are finite nonempty sets of integers of cardinalities $n_j = |A_j|$ such that $n_3, \dots, n_k$ are odd. Define $\bar A_j$ to be arithmetic progressions of difference one, balanced around zero as follows:
\[
\bar A_1 = \Bigl[-\frac{n_1 - 1 - \delta_1}{2}, \frac{n_1 - 1 + \delta_1}{2}\Bigr], \qquad
\bar A_2 = \Bigl[-\frac{n_2 - 1 + \delta_2}{2}, \frac{n_2 - 1 - \delta_2}{2}\Bigr],
\]
\[
\bar A_j = \Bigl[-\frac{n_j - 1}{2}, \frac{n_j - 1}{2}\Bigr], \qquad j = 3, \dots, k,
\]
where for $j = 1, 2$ we put
\[
\delta_j =
\begin{cases}
0 & \text{if } n_j \text{ is odd},\\
1 & \text{if } n_j \text{ is even}.
\end{cases}
\]
Then the number of solutions of the equation $a_1 + \dots + a_k = 0$ in the variables $a_j \in A_j$ does not exceed the number of solutions of this equation in the variables $a_j \in \bar A_j$.

The major difficulty in converting Theorems A, B, and C into their $\mathbb{F}_p$-counterparts lies in the proof of a modulo $p$ version of the latter of the three theorems. Neither of the methods of [G] or [HLP] works in the new setting, and we have to employ a completely different argument. In fact, we prove a somewhat stronger result than just an analog of Theorem C, and we need an additional notation to formulate this result.
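The statement of Theorem C lends itself to a quick numerical experiment. The following is a minimal sketch, not part of the original argument: the helper names `count_zero_sums`, `balanced_block`, and `check_theorem_c` are hypothetical, and the random test range is an arbitrary choice. It builds the balanced progressions for $k = 3$ and compares the two solution counts.

```python
import random
from itertools import product


def count_zero_sums(sets):
    """Number of tuples (a_1, ..., a_k), a_j in sets[j], with a_1 + ... + a_k = 0."""
    return sum(1 for combo in product(*sets) if sum(combo) == 0)


def balanced_block(n, tilt):
    """Block of n consecutive integers balanced around zero.

    For even n the block cannot be symmetric: tilt=+1 puts the extra element
    on the positive side (the shape of A-bar_1 in Theorem C), tilt=-1 puts it
    on the negative side (the shape of A-bar_2). For odd n the tilt is ignored.
    """
    delta = 0 if n % 2 else 1
    lo = -((n - 1 - tilt * delta) // 2)
    hi = (n - 1 + tilt * delta) // 2
    return set(range(lo, hi + 1))


def check_theorem_c(trials=200, seed=0):
    """Random test of Theorem C for k = 3 (the third set has odd size)."""
    rng = random.Random(seed)
    universe = range(-6, 7)  # illustrative range, chosen arbitrarily
    for _ in range(trials):
        a1 = set(rng.sample(universe, rng.randint(1, 6)))
        a2 = set(rng.sample(universe, rng.randint(1, 6)))
        a3 = set(rng.sample(universe, rng.choice([1, 3, 5])))
        barred = [
            balanced_block(len(a1), +1),
            balanced_block(len(a2), -1),
            balanced_block(len(a3), 0),
        ]
        if count_zero_sums([a1, a2, a3]) > count_zero_sums(barred):
            return False
    return True


print(balanced_block(4, +1), balanced_block(4, -1))
print(check_theorem_c())
```

A random search of this kind can only fail to find a counterexample; it illustrates the statement and is no substitute for the proof.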
For $p \ge 3$ prime, we write $p' = (p - 1)/2$. By a balanced set of residues modulo $p$, we mean a nonempty block of consecutive residues of the form
\[
\bar A = [-\alpha, \alpha + \delta] \pmod p,
\]
where $\alpha \in [0, p']$ and $\delta \in \{0, \pm 1\}$ are integers. (We exclude from consideration the degenerate cases $\alpha = 0$, $\delta = -1$ and $\alpha = p'$, $\delta = 1$.) Obviously, $\delta$ is uniquely defined by the set $\bar A$, and we write $\delta = \delta(\bar A)$. If the cardinality $|\bar A|$ is odd, then necessarily $\delta(\bar A) = 0$; if $|\bar A|$ is even, there are two possibilities: either $\delta(\bar A) = 1$, or $\delta(\bar A) = -1$. If $A$ is a nonempty set of residues modulo $p$, then $\bar A$ is used for a balanced set of the same cardinality as $A$.

For $\lambda \in \mathbb{F}_p$ we denote by $N_\lambda(A_1, \dots, A_k)$ the number of representations $\lambda = a_1 + \dots + a_k$ with $a_j \in A_j$. Thus,
\[
N_\lambda(A_1, \dots, A_k) = \sum_{\substack{i_1,\dots,i_k \in \mathbb{F}_p\\ i_1+\dots+i_k = \lambda}} \chi_1(i_1) \cdots \chi_k(i_k) = \chi_1 * \cdots * \chi_k(\lambda)
\]
is the convolution of the characteristic functions $\chi_j$ of the sets $A_j$.

Given nonempty $A_1, \dots, A_k \subseteq \mathbb{F}_p$, we write $\{\nu_i(A_1, \dots, A_k)\}_{i=1}^{p}$ for the descending rearrangement of the sequence $\{N_\lambda(A_1, \dots, A_k)\}_{\lambda=-p'}^{p'}$; thus,
\[
\nu_1(A_1, \dots, A_k) = \max_{\lambda\in\mathbb{F}_p} N_\lambda(A_1, \dots, A_k), \qquad
\sum_{i=1}^{p} \nu_i(A_1, \dots, A_k) = \sum_{\lambda=-p'}^{p'} N_\lambda(A_1, \dots, A_k) = |A_1| \cdots |A_k|,
\]
and the number of indices $i$ such that $\nu_i(A_1, \dots, A_k) > 0$ is the cardinality of the sumset $A_1 + \dots + A_k$. Furthermore, we write $(A_1, \dots, A_k) \prec (B_1, \dots, B_k)$ if $A_1, \dots, A_k$ and $B_1, \dots, B_k$ are two systems of sets such that
\[
\sum_{i=1}^{m} \nu_i(A_1, \dots, A_k) \le \sum_{i=1}^{m} \nu_i(B_1, \dots, B_k) \tag{1}
\]
for $m = 1, \dots, p$. In particular, when $m = 1$, this implies
\[
\max_{\lambda\in\mathbb{F}_p} N_\lambda(A_1, \dots, A_k) \le \max_{\lambda\in\mathbb{F}_p} N_\lambda(B_1, \dots, B_k).
\]

Example. For $p = 7$, consider $A_1 = \{0, 2\}$, $A_2 = \{1, 2, 6\}$, $\bar A_1 = \{0, 1\}$, and $\bar A_2 = \{-1, 0, 1\}$. Then
\[
\{\nu_i(A_1, A_2)\}_{i=1}^{7} = (2, 1, 1, 1, 1, 0, 0), \qquad A_1 + A_2 = \{1, 2, 3, 4, 6\},
\]
\[
\{\nu_i(\bar A_1, \bar A_2)\}_{i=1}^{7} = (2, 2, 1, 1, 0, 0, 0), \qquad \bar A_1 + \bar A_2 = \{-1, 0, 1, 2\},
\]
whence $(A_1, A_2) \prec (\bar A_1, \bar A_2)$.
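The example above can be reproduced mechanically. The sketch below uses hypothetical helper names (`rep_counts`, `nu`, `precedes`) that are not taken from the paper; it computes the representation counts $N_\lambda$, their descending rearrangement $\nu_i$, and tests relation (1) by comparing partial sums.

```python
from itertools import product


def rep_counts(sets, p):
    """counts[lam] = number of ways to write lam = a_1 + ... + a_k (mod p), a_j in sets[j]."""
    counts = [0] * p
    for combo in product(*sets):
        counts[sum(combo) % p] += 1
    return counts


def nu(sets, p):
    """Descending rearrangement of the representation counts."""
    return sorted(rep_counts(sets, p), reverse=True)


def precedes(sets_a, sets_b, p):
    """True if every partial sum of nu(sets_a) is at most that of nu(sets_b), as in (1)."""
    total_a = total_b = 0
    for x, y in zip(nu(sets_a, p), nu(sets_b, p)):
        total_a += x
        total_b += y
        if total_a > total_b:
            return False
    return True


p = 7
A1, A2 = {0, 2}, {1, 2, 6}
B1, B2 = {0, 1}, {-1, 0, 1}  # the balanced sets of the example

print(nu([A1, A2], p))                 # [2, 1, 1, 1, 1, 0, 0]
print(nu([B1, B2], p))                 # [2, 2, 1, 1, 0, 0, 0]
print(precedes([A1, A2], [B1, B2], p))  # True
```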
We can now formulate our version of Theorem C.

Theorem 1. Suppose that $A_j \subseteq \mathbb{F}_p$ ($j = 1, \dots, k$) are nonempty sets of residues modulo a prime $p \ge 3$, and let $\bar A_j$ be corresponding balanced sets. Then
\[
(A_1, \dots, A_k) \prec (\bar A_1, \dots, \bar A_k).
\]
Moreover, assume that $|\delta(\bar A_1) + \dots + \delta(\bar A_k)| \le 1$. Then, writing for brevity $N_\lambda$ for $N_\lambda(\bar A_1, \dots, \bar A_k)$, we have
(i) if $0 \le \delta(\bar A_1) + \dots + \delta(\bar A_k) \le 1$, then
\[
N_0 \ge N_1 \ge N_{-1} \ge N_2 \ge N_{-2} \ge \cdots \ge N_{p'} \ge N_{-p'};
\]
(ii) if $-1 \le \delta(\bar A_1) + \dots + \delta(\bar A_k) \le 0$, then
\[
N_0 \ge N_{-1} \ge N_1 \ge N_{-2} \ge N_2 \ge \cdots \ge N_{-p'} \ge N_{p'}.
\]
That is, if $\bar\lambda_1, \dots, \bar\lambda_m$ are the first $m$ terms of either the sequence $(0, 1, -1, 2, -2, \dots, p', -p')$ or the sequence $(0, -1, 1, -2, 2, \dots, -p', p')$ (depending on the sign of the sum $\delta(\bar A_1) + \dots + \delta(\bar A_k)$), and if $\lambda_1, \dots, \lambda_m$ are any $m$ pairwise distinct residues, then
\[
N_{\lambda_1}(A_1, \dots, A_k) + \dots + N_{\lambda_m}(A_1, \dots, A_k) \le N_{\bar\lambda_1}(\bar A_1, \dots, \bar A_k) + \dots + N_{\bar\lambda_m}(\bar A_1, \dots, \bar A_k).
\]

Remark. Clearly, the first assertion of Theorem 1 remains valid when $\bar A_j$ are arbitrary arithmetic progressions sharing a common difference (not necessarily balanced around zero and of difference one).

Compared to Theorem C, we were able to replace the asymmetrical requirement on $\delta(\bar A_j)$ (and thus on the parity of $|A_j|$) by the weaker and more aesthetic assumption $|\delta(\bar A_1) + \dots + \delta(\bar A_k)| \le 1$. At the same time the conclusion of Theorem 1 is considerably stronger than just $N_0(A_1, \dots, A_k) \le N_0(\bar A_1, \dots, \bar A_k)$, claimed by Theorem C.

Theorem 1 is an $\mathbb{F}_p$-analog of [L, Theorem 2 and Lemma 1]. We show in Section 3 that the first assertion of Theorem 1 is essentially equivalent to a well-known result of J. Pollard [P], and we obtain the second assertion as a consequence of the first one.

Following the argument of Hardy and Littlewood (or that of Gabriel), one can easily derive from Theorem 1 the following analog of Theorem B.
Theorem 2. Suppose that $f_j : \mathbb{F}_p \to \mathbb{R}_+$ ($j = 1, \dots, k$) is any system of functions, and let $\bar f_j$ be corresponding balanced functions. Assume that $\delta(\bar f_1) \ge 0$, $\delta(\bar f_2) \le 0$, and $\delta(\bar f_j) = 0$ for $j = 3, \dots, k$. Then
\[
\sum_{\substack{i_1,\dots,i_k \in \mathbb{F}_p\\ i_1+\dots+i_k = 0}} f_1(i_1) \cdots f_k(i_k) \le \sum_{\substack{i_1,\dots,i_k \in \mathbb{F}_p\\ i_1+\dots+i_k = 0}} \bar f_1(i_1) \cdots \bar f_k(i_k).
\]
(Though we have never given a formal definition of a balanced function with $\mathbb{F}_p$ as a domain, the reader would not have a problem extrapolating the definition for functions defined on $\mathbb{Z}$.)

From Theorem 2 we deduce an $\mathbb{F}_p$-version of Theorem A; this is somewhat tricky, though it requires no principally new ideas compared to [HLP] and [G].

Theorem 3. Suppose that $f_j : \mathbb{F}_p \to \mathbb{R}_+$ ($j = 1, \dots, k$) is any system of functions, and let $\bar f_j$ be corresponding balanced functions. Then
\[
\sum_{\substack{i_1,\dots,i_k,\,i_1',\dots,i_k' \in \mathbb{F}_p\\ i_1+\dots+i_k = i_1'+\dots+i_k'}} f_1(i_1)f_1(i_1') \cdots f_k(i_k)f_k(i_k')
\le
\sum_{\substack{i_1,\dots,i_k,\,i_1',\dots,i_k' \in \mathbb{F}_p\\ i_1+\dots+i_k = i_1'+\dots+i_k'}} \bar f_1(i_1)\bar f_1(i_1') \cdots \bar f_k(i_k)\bar f_k(i_k').
\]
We prove Theorems 2 and 3 in Section 4.

Remark. Theorem 1 implies Theorem C; to see this, given integer sets $A_1, \dots, A_k$, choose $p$ large enough, and consider the canonical images of $A_j$ in $\mathbb{F}_p$. Similarly, Theorem 2 implies Theorem B, and Theorem 3 implies Theorem A. However, no sufficiently general and powerful method is known to derive modulo $p$ results from their integer counterparts.

Immediate from Theorem 3 is the following corollary.

Corollary 2. Given trigonometric series
\[
F_j(x) = \sum_{m=-p'}^{p'} f_j(m) e^{2\pi i(mx/p)}, \qquad j = 1, \dots, k,
\]
let $g_j(m) = |f_j(m)|$, and suppose that $\bar g_j$ are balanced functions, corresponding to $g_j$. Define
\[
G_j(x) = \sum_{m=-p'}^{p'} \bar g_j(m) e^{2\pi i(mx/p)}
\]
(so that $G_j$ are obtained by replacing the Fourier coefficients of $F_j$ with their absolute values and rearranging them). Then
\[
\sum_{x=-p'}^{p'} |F_1(x)|^2 \cdots |F_k(x)|^2 \le \sum_{x=-p'}^{p'} |G_1(x)|^2 \cdots |G_k(x)|^2.
\]
As a particular case we have
\[
\sum_{x=-p'}^{p'} |F(x)|^{2k} \le \sum_{x=-p'}^{p'} |G(x)|^{2k}.
\]
2. Historical background and summary of results: Moments and large values of exponential sums

For a set $A \subseteq \mathbb{F}_p$ of $n = |A|$ residues modulo a prime $p$, let
\[
S(z) = \sum_{a\in A} e^{2\pi i(az/p)} \qquad (z \in \mathbb{F}_p)
\]
denote the exponential sums of $A$. The number of solutions of the linear equation
\[
a_1 + \dots + a_k = \lambda, \qquad a_j \in A, \tag{2}
\]
is
\[
\frac{1}{p} \sum_{z=0}^{p-1} S^k(z) e^{-2\pi i(\lambda z/p)},
\]
and for $k$ large the terms that determine the behavior of this sum are those with $|S(z)|$ large. In [L] we gave an estimate for the number of such $z$, which allowed us to estimate the number of solutions of (2). Now, equipped with Theorem 1, which establishes a precise bound for the number of solutions of equations of type (2), we can reverse the procedure and obtain a sharp estimate for the number of $z$ such that $|S(z)|$ is large. In doing so we follow in general the approach of Yudin [Y], who solved the parallel problem for integer sets; the technical details are quite different, however.

It is convenient to measure the number of large exponential sums by the function $T(\varphi)$, defined by
\[
T(\varphi) := \#\bigl\{z \in \mathbb{F}_p^\times : |S(z)| > n\cos\varphi\bigr\}, \qquad 0 \le \varphi \le \frac{\pi}{2}.
\]
Thus, $T(\varphi)$ is a nondecreasing, piecewise constant function such that $T(0) = 0$ and $T(\pi/2) = p - 1$. The reason for introducing $T(\varphi)$ is that it has a nice additivity property, established in the following lemma.

Lemma 1. Suppose that $\varphi_1, \varphi_2 \ge 0$ and $\varphi_1 + \varphi_2 \le \pi/2$. Then
\[
T(\varphi_1 + \varphi_2) \ge \min\{T(\varphi_1) + T(\varphi_2),\ p - 1\} \tag{3}
\]
for any nonempty $A \subseteq \mathbb{F}_p$.

This lemma follows easily from an observation of Yudin and was first used in [L]. For the benefit of the reader we reproduce its proof (which is just several lines long) in Section 6. From Lemma 1 we derive the following property of the function $T(\varphi)$.

Lemma 2. Suppose that $n = |A| \ge 4$. Then for any $0 < \varphi, \varphi_0 \le \pi/2$ we have
\[
T(\varphi) \ge \Bigl\lfloor \frac{\varphi}{\varphi_0} \Bigr\rfloor T(\varphi_0).
\]
We note that the integer part brackets cannot be dropped; indeed, the function $T(\varphi)/\varphi$ is not monotonic.

Remark. The proof of Lemma 2 would be immediate if we could omit the $p - 1$ term in (3) and rewrite it as $T(\varphi_1 + \varphi_2) \ge T(\varphi_1) + T(\varphi_2)$. By induction we would then have $T(j\varphi_0) \ge jT(\varphi_0)$, provided $j\varphi_0 \le \pi/2$, and we would choose $j = \lfloor\varphi/\varphi_0\rfloor$. It can be shown, however, that this "strong" form of (3) is incorrect.

Lemma 2 is implicit in [L]; here we give an explicit formulation and an independent explicit proof. We establish two estimates for the number of large sums $S(z)$. The first is indirect but may turn out to be more useful in certain applications.

Theorem 4. For any set $A \subseteq \mathbb{F}_p$ of $n = |A| \ge 3$ residues modulo a prime $p$ and any even integer $k \ge 2$, we have
\[
\sum_{z=1}^{p-1} |S(z)|^k \le \sqrt{\frac{6}{\pi k}}\; n^{k-1} p\,\bigl(1 + n^{-2}\bigr)\bigl(1 + 4k^{-1}\bigr).
\]
(In fact, we give a somewhat better remainder term, provided $k \ge 16$; see the proof in Section 6.) It is interesting to compare the estimate of Theorem 4 with the trivial estimate $\sum |S(z)|^k \le n^{k-1}p$.

The second is a direct estimate.

Theorem 5. For any set $A \subseteq \mathbb{F}_p$ of $n = |A| \ge 4$ residues modulo a prime $p$ and any $\varphi \in [0, \pi/6]$, we have
\[
T(\varphi) \le \frac{2\sqrt{3}}{\pi}\,\frac{p}{n}\,\varphi\,\bigl(1 + n^{-2}\bigr)\bigl(1 + 2\varphi^{2/3}\bigr).
\]
In Section 5 we consider the case of an arithmetic progression $A = \{0, \dots, n - 1\}$ and show that Theorems 4 and 5 are best possible, save for the remainder terms.
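The objects of this section are easy to experiment with. The sketch below is illustrative only: the function names and the sample set are hypothetical, not taken from the paper. It recovers the number of solutions of (2) from the exponential sums, compares it with a brute-force count, and tests inequality (3) of Lemma 1 on a grid of angles. The grid step is chosen so that exact ties $|S(z)| = n\cos\varphi$, which floating-point arithmetic could misjudge, are avoided.

```python
import cmath
import math
from itertools import product


def exp_sum(A, z, p):
    """S(z) = sum over a in A of exp(2*pi*i*a*z/p)."""
    return sum(cmath.exp(2j * math.pi * a * z / p) for a in A)


def solutions_via_sums(A, k, lam, p):
    """Number of solutions of a_1 + ... + a_k = lam (mod p), from the exponential sums."""
    total = sum(exp_sum(A, z, p) ** k * cmath.exp(-2j * math.pi * lam * z / p)
                for z in range(p))
    return round(total.real / p)


def solutions_brute_force(A, k, lam, p):
    """The same count obtained by enumerating all k-tuples."""
    return sum(1 for combo in product(A, repeat=k) if (sum(combo) - lam) % p == 0)


def T(A, phi, p):
    """Number of nonzero z with |S(z)| > n*cos(phi)."""
    n = len(A)
    return sum(1 for z in range(1, p) if abs(exp_sum(A, z, p)) > n * math.cos(phi))


def lemma1_holds_on_grid(A, p, step=0.07):
    """Check T(phi1 + phi2) >= min(T(phi1) + T(phi2), p - 1) for grid angles."""
    grid = [j * step for j in range(int((math.pi / 2) / step) + 1)]
    for phi1 in grid:
        for phi2 in grid:
            if phi1 + phi2 <= math.pi / 2:
                if T(A, phi1 + phi2, p) < min(T(A, phi1, p) + T(A, phi2, p), p - 1):
                    return False
    return True


A, p = [0, 1, 3, 7], 13  # arbitrary sample set and prime
print(solutions_via_sums(A, 3, 5, p), solutions_brute_force(A, 3, 5, p))
print(lemma1_holds_on_grid(A, p))
```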
3. Pollard's theorem and the proof of Theorem 1

Theorem D (Pollard, [P, Theorem 1]). Suppose that $A_j \subseteq \mathbb{F}_p$ ($j = 1, \dots, k$) are nonempty sets of residues modulo a prime $p \ge 3$, and let $\bar A_j$ be arithmetic progressions modulo $p$ sharing a common difference and such that $|A_j| = |\bar A_j|$. Write $\nu_i = \nu_i(A_1, \dots, A_k)$, $\bar\nu_i = \nu_i(\bar A_1, \dots, \bar A_k)$, and define
\[
h(t) := \sum_{i=1}^{\infty} \min\{\nu_i, t\}, \qquad \bar h(t) := \sum_{i=1}^{\infty} \min\{\bar\nu_i, t\}, \qquad t \ge 0.
\]
Then
\[
h(t) \ge \bar h(t)
\]
for any nonnegative real $t$.

Remark. For $k = 2$, this theorem is better known in an apparently different but equivalent form, namely: if $K_i$ stands for the number of elements $c \in A_1 + A_2$ that have at least $i$ representations as $c = a_1 + a_2$ ($a_j \in A_j$), then for any integer $1 \le t \le \min\{|A_1|, |A_2|\}$ we have
\[
K_1 + \dots + K_t \ge t\min\{|A_1| + |A_2| - t,\ p\}.
\]

The first assertion of Theorem 1 follows immediately from Pollard's theorem and our next lemma.
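The $k = 2$ form quoted in the remark can be checked exhaustively for a small prime. The sketch below is an illustration with hypothetical helper names; it uses the identity $K_1 + \dots + K_t = \sum_c \min\{N_c, t\}$, which is the function $h(t)$ of Theorem D evaluated at an integer $t$.

```python
from itertools import combinations


def nonempty_subsets(p):
    """All nonempty subsets of {0, ..., p-1}, as tuples."""
    for size in range(1, p + 1):
        for combo in combinations(range(p), size):
            yield combo


def rep_counts(A, B, p):
    """counts[c] = number of representations c = a + b (mod p), a in A, b in B."""
    counts = [0] * p
    for a in A:
        for b in B:
            counts[(a + b) % p] += 1
    return counts


def pollard_holds(A, B, p):
    """Check K_1 + ... + K_t >= t * min(|A| + |B| - t, p) for 1 <= t <= min(|A|, |B|)."""
    counts = rep_counts(A, B, p)
    for t in range(1, min(len(A), len(B)) + 1):
        # K_1 + ... + K_t counts each c once for every i <= min(N_c, t).
        k_sum = sum(min(c, t) for c in counts)
        if k_sum < t * min(len(A) + len(B) - t, p):
            return False
    return True


p = 7
subsets = list(nonempty_subsets(p))
print(all(pollard_holds(A, B, p) for A in subsets for B in subsets))
```

For $t = 1$ the inequality reduces to the Cauchy–Davenport bound $|A_1 + A_2| \ge \min\{|A_1| + |A_2| - 1, p\}$.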
Lemma 3. Let $\nu_1 \ge \nu_2 \ge \cdots$ and $\bar\nu_1 \ge \bar\nu_2 \ge \cdots$ be two sequences of nonnegative real numbers, such that $\nu_1 + \nu_2 + \dots = \bar\nu_1 + \bar\nu_2 + \cdots$, and such that only a finite number of elements of each sequence are distinct from zero. For nonnegative real $t$, define $h(t)$ and $\bar h(t)$ as in Theorem D. Then the two following assertions are equivalent:
(a) $\nu_1 + \dots + \nu_m \le \bar\nu_1 + \dots + \bar\nu_m$ for any $m = 1, 2, \dots$;
(b) $h(t) \ge \bar h(t)$ for any $t \ge 0$.

Proof. Put $\sigma = \nu_1 + \nu_2 + \dots = \bar\nu_1 + \bar\nu_2 + \cdots$, so that both $h(t)$ and $\bar h(t)$ are continuous, nondecreasing functions, satisfying $h(0) = \bar h(0) = 0$, $h(\nu_1) = \bar h(\bar\nu_1) = \sigma$, and $\sigma = \max h = \max \bar h$.

(i) We first show that (a) implies (b). We assume that $t \le \nu_1$, as otherwise $h(t) = \sigma \ge \bar h(t)$. Thus, $t \le \bar\nu_1$ also, and there exist indices $i, j \ge 1$ such that
\[
\nu_{i+1} \le t \le \nu_i, \qquad \bar\nu_{j+1} \le t \le \bar\nu_j,
\]
whence
\[
h(t) = \sigma - \nu_1 - \dots - \nu_i + it, \qquad \bar h(t) = \sigma - \bar\nu_1 - \dots - \bar\nu_j + jt.
\]
What we want to prove is, therefore,
\[
\nu_1 + \dots + \nu_i - it \le \bar\nu_1 + \dots + \bar\nu_j - jt,
\]
and we show that, moreover,
\[
\bar\nu_1 + \dots + \bar\nu_i - it \le \bar\nu_1 + \dots + \bar\nu_j - jt. \tag{4}
\]
Indeed, if $i \le j$, then (4) is equivalent to
\[
(j - i)t \le \bar\nu_{i+1} + \dots + \bar\nu_j,
\]
which is evident in view of $t \le \bar\nu_j \le \dots \le \bar\nu_{i+1}$. Otherwise, $i \ge j$ and (4) is equivalent to
\[
(i - j)t \ge \bar\nu_{j+1} + \dots + \bar\nu_i,
\]
which follows from $t \ge \bar\nu_{j+1} \ge \dots \ge \bar\nu_i$.

(ii) We now use induction by $m$ to show that (b) implies (a). First we have $\sigma = \bar h(\bar\nu_1) \le h(\bar\nu_1) \le \sigma$, whence $h(\bar\nu_1) = \sigma$, and therefore $\nu_1 \le \bar\nu_1$. Next, assuming $\nu_1 + \dots + \nu_m \le \bar\nu_1 + \dots + \bar\nu_m$ for some $m \ge 1$, we show that
\[
\nu_1 + \dots + \nu_{m+1} \le \bar\nu_1 + \dots + \bar\nu_{m+1}. \tag{5}
\]
Indeed, this is obvious if $\nu_{m+1} \le \bar\nu_{m+1}$, while otherwise $\nu_{m+1} \ge \bar\nu_{m+1}$, and then
\[
\begin{aligned}
\sigma - \bar\nu_1 - \dots - \bar\nu_m + m\bar\nu_{m+1} &= \bar h(\bar\nu_{m+1}) \le h(\bar\nu_{m+1})\\
&= (m + 1)\bar\nu_{m+1} + \sum_{j=m+2}^{\infty} \min\{\nu_j, \bar\nu_{m+1}\} \le (m + 1)\bar\nu_{m+1} + \sum_{j=m+2}^{\infty} \nu_j\\
&= (m + 1)\bar\nu_{m+1} + \sigma - \nu_1 - \dots - \nu_{m+1},
\end{aligned}
\]
from which (5) follows.

To complete the proof of Theorem 1, it remains to show how its first assertion implies the second one. We use induction by $k$. The case $k = 2$ is easy, and we consider $k \ge 3$. For definiteness we suppose that $0 \le \delta(\bar A_1) + \dots + \delta(\bar A_k) \le 1$. Moreover, permuting the indices if necessary, we can assume that $\delta(\bar A_k) \ge 0$, and then
\[
\bigl|\delta(\bar A_1) + \dots + \delta(\bar A_{k-1})\bigr| \le 1. \tag{6}
\]
For $m \in [1, p]$, let $\lambda_1, \dots, \lambda_m \in \mathbb{F}_p$ be any pairwise distinct residues modulo $p$. We put $\Lambda = \{-\lambda_1, \dots, -\lambda_m\}$, and we define $\bar\lambda_1, \dots, \bar\lambda_m$ to be the first $m$ elements of the sequence $(0, 1, -1, 2, -2, \dots, p', -p')$, so that $\bar\Lambda := \{-\bar\lambda_1, \dots, -\bar\lambda_m\}$ is balanced and
\[
\delta(\bar\Lambda) \le 0, \qquad \bigl|\delta(\bar A_k) + \delta(\bar\Lambda)\bigr| \le 1. \tag{7}
\]
Moreover, if $\delta(\bar A_k) = 0$, then $\delta(\bar A_1) + \dots + \delta(\bar A_{k-1}) \ge 0$
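and $\delta(\bar A_k) + \delta(\bar\Lambda) \le 0$, and if $\delta(\bar A_k) = 1$, then $\delta(\bar A_1) + \dots + \delta(\bar A_{k-1}) \le 0$ and $\delta(\bar A_k) + \delta(\bar\Lambda) \ge 0$.

By (6), (7), the latter observation, the first assertion of Theorem 1, and the induction hypothesis, we have
\[
\begin{aligned}
\sum_{i=1}^{m} N_{\lambda_i}(\bar A_1, \dots, \bar A_k)
&= N_0(\bar A_1, \dots, \bar A_k, \Lambda)
= \sum_{\substack{u,v \in \mathbb{F}_p\\ u+v=0}} N_u(\bar A_1, \dots, \bar A_{k-1})\, N_v(\bar A_k, \Lambda)\\
&\le \sum_{i=1}^{p} \nu_i(\bar A_1, \dots, \bar A_{k-1})\, \nu_i(\bar A_k, \Lambda)\\
&= \sum_{m=1}^{p-1} \bigl(\nu_m(\bar A_1, \dots, \bar A_{k-1}) - \nu_{m+1}(\bar A_1, \dots, \bar A_{k-1})\bigr) \sum_{i=1}^{m} \nu_i(\bar A_k, \Lambda)
+ \nu_p(\bar A_1, \dots, \bar A_{k-1}) \sum_{i=1}^{p} \nu_i(\bar A_k, \Lambda)\\
&\le \sum_{m=1}^{p-1} \bigl(\nu_m(\bar A_1, \dots, \bar A_{k-1}) - \nu_{m+1}(\bar A_1, \dots, \bar A_{k-1})\bigr) \sum_{i=1}^{m} \nu_i(\bar A_k, \bar\Lambda)
+ \nu_p(\bar A_1, \dots, \bar A_{k-1}) \sum_{i=1}^{p} \nu_i(\bar A_k, \bar\Lambda)\\
&= \sum_{i=1}^{p} \nu_i(\bar A_1, \dots, \bar A_{k-1})\, \nu_i(\bar A_k, \bar\Lambda)\\
&= \sum_{\substack{u,v \in \mathbb{F}_p\\ u+v=0}} N_u(\bar A_1, \dots, \bar A_{k-1})\, N_v(\bar A_k, \bar\Lambda)\\
&= N_0(\bar A_1, \dots, \bar A_k, \bar\Lambda)
= \sum_{i=1}^{m} N_{\bar\lambda_i}(\bar A_1, \dots, \bar A_k),
\end{aligned}
\]
which is to be proven.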
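Both assertions of Theorem 1 can be tested by exhaustive search for small parameters. The following sketch is not part of the proof and uses hypothetical helper names: it checks the majorization $(A_1, A_2) \prec (\bar A_1, \bar A_2)$ over all pairs of nonempty subsets of $\mathbb{F}_5$, and the orderings (i) and (ii) for all admissible pairs and triples of balanced sets modulo 7.

```python
from itertools import combinations, product


def rep_counts(sets, p):
    """counts[lam] = number of representations lam = a_1 + ... + a_k (mod p)."""
    counts = [0] * p
    for combo in product(*sets):
        counts[sum(combo) % p] += 1
    return counts


def balanced(n, delta, p):
    """Residues of the block [-alpha, alpha + delta] with n elements.

    delta must be 0 for odd n and +1 or -1 for even n.
    """
    alpha = (n - 1 - delta) // 2
    return {x % p for x in range(-alpha, alpha + delta + 1)}


def delta_choices(n):
    return [0] if n % 2 else [1, -1]


def majorized(sets_a, sets_b, p):
    """Relation (1): partial sums of the sorted counts of sets_a never exceed those of sets_b."""
    nu_a = sorted(rep_counts(sets_a, p), reverse=True)
    nu_b = sorted(rep_counts(sets_b, p), reverse=True)
    total_a = total_b = 0
    for x, y in zip(nu_a, nu_b):
        total_a += x
        total_b += y
        if total_a > total_b:
            return False
    return True


def ordered(sets, sign, p):
    """N_0 >= N_sign >= N_(-sign) >= N_(2 sign) >= ... for the given system of sets."""
    counts = rep_counts(sets, p)
    order = [0]
    for j in range(1, (p - 1) // 2 + 1):
        order += [sign * j, -sign * j]
    values = [counts[lam % p] for lam in order]
    return all(x >= y for x, y in zip(values, values[1:]))


def check_first_assertion(p=5):
    subsets = [c for size in range(1, p + 1) for c in combinations(range(p), size)]
    for A1, A2 in product(subsets, repeat=2):
        for d1 in delta_choices(len(A1)):
            for d2 in delta_choices(len(A2)):
                bars = [balanced(len(A1), d1, p), balanced(len(A2), d2, p)]
                if not majorized([A1, A2], bars, p):
                    return False
    return True


def check_second_assertion(p=7, k=3):
    for sizes in product(range(1, p + 1), repeat=k):
        for deltas in product(*(delta_choices(n) for n in sizes)):
            total = sum(deltas)
            if abs(total) > 1:
                continue
            sets = [balanced(n, d, p) for n, d in zip(sizes, deltas)]
            if total >= 0 and not ordered(sets, +1, p):
                return False
            if total <= 0 and not ordered(sets, -1, p):
                return False
    return True


print(check_first_assertion(), check_second_assertion(k=2), check_second_assertion(k=3))
```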
4. Proofs of Theorems 2 and 3

Proof of Theorem 2. We follow the method of Hardy and Littlewood [HL]; Gabriel [G] gives an alternative approach. The idea is to represent $\bar f_j$ as linear combinations of indicator functions of balanced subsets of $\mathbb{F}_p$, yielding corresponding representations of $f_j$. Indeed, we can write
\[
\bar f_j(i) = \sum_{m=1}^{p} c_{jm}\bar\chi_{jm}(i), \qquad j = 1, \dots, k,
\]
where $c_{jm}$ are nonnegative coefficients and $\bar\chi_{jm}$ are the indicator functions of some balanced $\bar A_{jm} \subseteq \mathbb{F}_p$. Moreover, the conditions imposed on $\bar f_j$ imply that $\bar A_{jm}$ can be chosen to satisfy
\[
\delta(\bar A_{1m}) \ge 0, \qquad \delta(\bar A_{2m}) \le 0, \qquad \delta(\bar A_{jm}) = 0 \ \text{for } 3 \le j \le k.
\]
Accordingly, we have
\[
f_j(i) = \sum_{m=1}^{p} c_{jm}\chi_{jm}(i), \qquad j = 1, \dots, k,
\]
where $\chi_{jm}$ are the indicator functions of some $A_{jm}$ such that $|A_{jm}| = |\bar A_{jm}|$. Now
\[
\sum_{i_1+\dots+i_k=0} f_1(i_1) \cdots f_k(i_k) = \sum_{m_1,\dots,m_k=1}^{p} c_{1m_1} \cdots c_{km_k} \sum_{i_1+\dots+i_k=0} \chi_{1m_1}(i_1) \cdots \chi_{km_k}(i_k),
\]
and the inner sum at the right is simply the number of solutions of the equation $i_1 + \dots + i_k = 0$ in the variables $i_j \in A_{jm_j}$. By Theorem 1 this number of solutions can only increase if all $A_{jm_j}$ are replaced by $\bar A_{jm_j}$; that is, the inner sum can only increase if all $\chi_{jm_j}$ are replaced by $\bar\chi_{jm_j}$. The result follows.

To prove Theorem 3 we rewrite its assertion as
\[
\sum_{m_1+\dots+m_k=0} F_1(m_1) \cdots F_k(m_k) \le \sum_{m_1+\dots+m_k=0} \bar F_1(m_1) \cdots \bar F_k(m_k),
\]
where we put
\[
F_j(m) = \sum_{i-i'=m} f_j(i)f_j(i') \qquad\text{and}\qquad \bar F_j(m) = \sum_{i-i'=m} \bar f_j(i)\bar f_j(i'). \tag{8}
\]
Notice that, for $j = 1, \dots, k$,
\[
F_j(0) = \max F_j(m), \qquad F_j(m) = F_j(-m) \quad (m \in \mathbb{F}_p), \tag{9}
\]
and similarly,
\[
\bar F_j(0) = \max \bar F_j(m), \qquad \bar F_j(m) = \bar F_j(-m) \quad (m \in \mathbb{F}_p) \tag{10}
\]
for any $j = 1, \dots, k$. We need several lemmas.

Lemma 4. For $j = 1, \dots, k$, the functions $\bar F_j$ defined by (8) are symmetrically decreasing:
\[
\bar F_j(0) \ge \bar F_j(1) = \bar F_j(-1) \ge \cdots \ge \bar F_j(p') = \bar F_j(-p').
\]

Proof. For a nonempty set $A \subseteq \mathbb{F}_p$ of odd cardinality, let $\bar A$ be the corresponding balanced set, and let $\chi = \chi_A$ and $\bar\chi = \chi_{\bar A}$ be the indicator functions of $A$ and $\bar A$, respectively.
Then $\bar\chi$ is the balanced rearrangement of $\chi$, and we have
\[
\sum_m \bar F_j(m)\chi(m) = \sum_{i'-i+m=0} \bar f_j(i)\bar f_j(i')\chi(m) = \sum_{i'+i+m=0} \bar f_j(-i)\bar f_j(i')\chi(m)
\]
\[
\text{(by Theorem 2)} \quad \le \sum_{i'+i+m=0} \bar f_j(-i)\bar f_j(i')\bar\chi(m) = \sum_m \bar F_j(m)\bar\chi(m).
\]
From this and (10) the assertion follows.

For two functions $g, h : \mathbb{F}_p \to \mathbb{R}_+$ we write $g \prec h$ if for any system of $m \ge 1$ pairwise distinct residues $\lambda_1, \dots, \lambda_m \in \mathbb{F}_p$ there exists another system of pairwise distinct residues $\mu_1, \dots, \mu_m \in \mathbb{F}_p$ such that
\[
g(\lambda_1) + \dots + g(\lambda_m) \le h(\mu_1) + \dots + h(\mu_m).
\]
(This agrees with the notation $(A_1, \dots, A_k) \prec (B_1, \dots, B_k)$ if we consider the latter as an abbreviation for $N(A_1, \dots, A_k) \prec N(B_1, \dots, B_k)$.)

Lemma 5. For $j = 1, \dots, k$ we have $F_j \prec \bar F_j$, where the functions $F_j$ and $\bar F_j$ are defined by (8).

Proof. For $A$, $\bar A$, $\chi$, and $\bar\chi$ as in the proof of Lemma 4, we have
\[
\sum_m F_j(m)\chi(m) = \sum_{i'-i+m=0} f_j(i)f_j(i')\chi(m) = \sum_{i'+i+m=0} f_j(-i)f_j(i')\chi(m)
\]
\[
\text{(by Theorem 2)} \quad \le \sum_{i'+i+m=0} \bar f_j(-i)\bar f_j(i')\bar\chi(m) = \sum_m \bar F_j(m)\bar\chi(m).
\]
The assertion follows in view of (9) and (10).

Lemma 6. If $g_1, \dots, g_k : \mathbb{F}_p \to \mathbb{R}_+$ are symmetrically decreasing, then the convolution $G = g_1 * \cdots * g_k$ is symmetrically decreasing also.

Proof. Let $A$, $\bar A$, $\chi$, and $\bar\chi$ be as in the proofs of Lemmas 4 and 5, except that we do not require the cardinality of $A$ to be odd, and, accordingly, there can be a freedom of choice of $\bar A$. We have
254
VSEVOLOD F. LEV
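\[
\begin{aligned}
\sum_m G(m)\chi(m) &= \sum_{i_1+\cdots+i_k+m=0} g_1(i_1)\cdots g_k(i_k)\chi(-m) \\
&\le \sum_{i_1+\cdots+i_k+m=0} g_1(i_1)\cdots g_k(i_k)\bar\chi(-m) = \sum_m G(m)\bar\chi(m),
\end{aligned}
\]
where the inequality holds by Theorem 2. It follows that $G$ is symmetrically decreasing.

lemma 7
Let $g, g_1, g_2 : \mathbb{F}_p \to \mathbb{R}_+$ be symmetrically decreasing functions, and suppose that $g_1 \prec g_2$. Then
\[ \sum_{m\in\mathbb{F}_p} g(m)g_1(m) \le \sum_{m\in\mathbb{F}_p} g(m)g_2(m). \]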
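Proof
We have
\[
\begin{aligned}
\sum_{m=-p'}^{p'} g(m)g_1(m) &= \sum_{m=0}^{p'-1} \bigl(g(m) - g(m+1)\bigr) \sum_{i=-m}^{m} g_1(i) + g(p') \sum_{i=-p'}^{p'} g_1(i) \\
&\le \sum_{m=0}^{p'-1} \bigl(g(m) - g(m+1)\bigr) \sum_{i=-m}^{m} g_2(i) + g(p') \sum_{i=-p'}^{p'} g_2(i) \\
&= \sum_{m=-p'}^{p'} g(m)g_2(m).
\end{aligned}
\]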
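Lemmas 6 and 7 can be sanity-checked numerically on small examples. The sketch below is only an illustration under stated assumptions: a function on $\mathbb{F}_p$ is stored as a list indexed by residues $0, \dots, p-1$, the helper names (`sym_dec`, `is_sym_dec`, `convolve`, `precedes`) are hypothetical, and the relation $g_1 \prec g_2$ is tested through the equivalent comparison of the sums of the $m$ largest values.

```python
def sym_dec(p, levels):
    """Build g on F_p with g(m) = g(-m) = levels[|m|] for |m| <= (p-1)/2.

    `levels` must be non-increasing, so g is symmetrically decreasing.
    """
    half = (p - 1) // 2
    assert len(levels) == half + 1
    assert all(levels[i] >= levels[i + 1] for i in range(half))
    g = [0] * p
    for m in range(half + 1):
        g[m % p] = levels[m]
        g[(-m) % p] = levels[m]
    return g


def is_sym_dec(g):
    """Check g(0) >= g(1) = g(-1) >= ... >= g(p') = g(-p')."""
    p = len(g)
    half = (p - 1) // 2
    symmetric = all(g[m] == g[(-m) % p] for m in range(p))
    decreasing = all(g[m] >= g[m + 1] for m in range(half))
    return symmetric and decreasing


def convolve(g, h):
    """Cyclic convolution (g * h)(m) = sum over i of g(i) h(m - i) on F_p."""
    p = len(g)
    return [sum(g[i] * h[(m - i) % p] for i in range(p)) for m in range(p)]


def precedes(g1, g2):
    """g1 < g2 in the sense of the text: top-m sums of g1 never exceed those of g2."""
    top1 = top2 = 0
    for a, b in zip(sorted(g1, reverse=True), sorted(g2, reverse=True)):
        top1 += a
        top2 += b
        if top1 > top2:
            return False
    return True


def pairing(g, h):
    """The sum over m in F_p of g(m) h(m)."""
    return sum(a * b for a, b in zip(g, h))
```

For instance, with $p = 7$ one can convolve three such functions and confirm that the result is again symmetrically decreasing, or compare `pairing(g, g1)` with `pairing(g, g2)` for a pair with `precedes(g1, g2)`.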
Now we can prove Theorem 3.

Proof of Theorem 3
Let $\sigma_j : \mathbb{Z} \to \mathbb{Z}$ ($j = 1, \dots, k$) be bijections of $\mathbb{Z}$ such that $F_j \circ \sigma_j$ are balanced and thus symmetrically decreasing in view of (9). By Theorem 2 we have
\[
\begin{aligned}
\sum_{m_1+\cdots+m_k=0} F_1(m_1)\cdots F_k(m_k) &\le \sum_{m_1+\cdots+m_k=0} F_1(\sigma_1(m_1))\cdots F_k(\sigma_k(m_k)) \\
&= \sum_{m_1} F_1(\sigma_1(m_1)) \sum_{m_2+\cdots+m_k=-m_1} F_2(\sigma_2(m_2))\cdots F_k(\sigma_k(m_k)),
\end{aligned}
\]
and we notice that both $F_1 \circ \sigma_1$ and the inner sum, considered as a function of $m_1$, are symmetrically decreasing (the latter by Lemma 6). Next, $F_1 \circ \sigma_1 \prec \bar F_1$ by Lemma 5, and therefore by Lemmas 4 and 7 the right-hand side does not exceed
\[
\begin{aligned}
&\sum_{m_1} \bar F_1(m_1) \sum_{m_2+\cdots+m_k=-m_1} F_2(\sigma_2(m_2))\cdots F_k(\sigma_k(m_k)) \\
&\qquad = \sum_{m_2} F_2(\sigma_2(m_2)) \sum_{m_1+m_3+\cdots+m_k=-m_2} \bar F_1(m_1) F_3(\sigma_3(m_3))\cdots F_k(\sigma_k(m_k)).
\end{aligned}
\]
In a similar vein, replacing $F_2(\sigma_2(m_2))$ by $\bar F_2(m_2)$, we can only increase this latter expression, and continuing in this way, we see that
\[
\sum_{m_1+\cdots+m_k=0} F_1(m_1)\cdots F_k(m_k) \le \sum_{m_1+\cdots+m_k=0} F_1(\sigma_1(m_1))\cdots F_k(\sigma_k(m_k)) \le \sum_{m_1+\cdots+m_k=0} \bar F_1(m_1)\cdots \bar F_k(m_k),
\]
which is to be proven.

5. Large exponential sums: The case of an arithmetic progression
Let $A_0 = \{0, \dots, n-1\} \pmod p$, so that the corresponding sums $S_0(z)$ satisfy
\[ |S_0(z)| = \left| \frac{\sin \pi(nz/p)}{\sin \pi(z/p)} \right| \qquad (z = 1, \dots, p-1). \tag{11} \]
We write
\[ G(x) = \frac{\sin x}{x} \qquad (0 < x \le \pi), \]
and we define $G(0) = 1$ by continuity; thus, the inverse function $G^{-1}(x)$ is a continuous, monotonically decreasing function on $[0, 1]$, satisfying $G^{-1}(0) = \pi$, $G^{-1}(1) = 0$.

lemma 8
For $\kappa \in (0.5, 1)$ and $n \ge 2$, the equation
\[ \frac{|\sin(nx)|}{\sin x} = \kappa n \tag{12} \]
has exactly one solution $x = x(n, \kappa)$ on the interval $(0, \pi/2)$, and this solution satisfies
\[ x = \frac{1}{n}\, G^{-1}(\kappa) \left(1 + \frac{\theta}{n^2}\right), \]
where $\theta = \theta(n, \kappa) \in (0, 1)$.

Proof
As the function $f(x) = \sin(nx)/\sin x$ decreases on $(0, \pi/n)$ and $f(0+) = n > n\kappa > 0 = f(\pi/n)$, equation (12) has precisely one solution on $(0, \pi/n)$. Moreover,
if $x \in (\pi/n, \pi/2)$, then
\[ f(x) \le \left(\frac{2}{\pi}\, x\right)^{-1} \le \frac{n}{2} < \kappa n, \]
which shows that $(\pi/n, \pi/2)$ contains no solutions. We now estimate the solution $x = x(n, \kappa) \in (0, \pi/n)$. Since (12) implies $G(nx) = \kappa G(x) < \kappa$, we have $nx > G^{-1}(\kappa)$, whence $x > n^{-1} G^{-1}(\kappa)$. To obtain the upper bound, we first note that
\[ G(nx) = \kappa G(x) > \kappa \left(1 - \frac{x^2}{6}\right). \tag{13} \]
Next we let $\mu = n/(n^2 + 1)$, and we show that
\[ G(nx) \le \left(1 - \frac{x^2}{6}\right) G((n - \mu)x). \tag{14} \]
Notice that (14) along with (13) yield
\[ G((n-\mu)x) > \kappa, \qquad (n-\mu)x < G^{-1}(\kappa), \qquad x < \frac{1}{n-\mu}\, G^{-1}(\kappa) = \frac{1}{n}\left(1 + \frac{1}{n^2}\right) G^{-1}(\kappa), \]
and it remains to prove (14). To this end we rewrite it as
\[ G((n-\mu)x) - G(nx) \ge \frac{x^2}{6}\, G((n-\mu)x), \tag{15} \]
and we observe that
\[ G((n-\mu)x) - G(nx) \ge \mu x \bigl(-G'((n-\mu)x)\bigr). \]
Letting $\xi = (n-\mu)x$ and using the inequality
\[ \sin\xi \ge \xi\cos\xi + \tfrac{1}{3}\,\xi^2 \sin\xi \]
(valid for any $\xi \in [0, \pi/2]$), we get
\[
\begin{aligned}
G((n-\mu)x) - G(nx) &\ge \mu x\, \frac{\sin\xi - \xi\cos\xi}{\xi^2} \\
&\ge \frac{1}{3}\, \mu x \sin\xi \\
&= 2\mu(n-\mu)\, \frac{x^2}{6}\, G(\xi) \\
&\ge \frac{x^2}{6}\, G(\xi),
\end{aligned}
\]
which proves (15).
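The statement of Lemma 8 can be probed numerically. The following sketch is an illustration rather than part of the proof: it assumes plain bisection is accurate enough, and the helper names (`G_inv`, `solution`, `theta`) are hypothetical. It solves (12) on $(0, \pi/n)$, inverts $G$ on $[0, \pi]$, and recovers $\theta = n^2\bigl(nx/G^{-1}(\kappa) - 1\bigr)$.

```python
import math


def G(x):
    """G(x) = sin(x)/x, extended by G(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def bisect_decreasing(f, lo, hi, target, iters=200):
    """Solve f(x) = target for a function f that is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def G_inv(kappa):
    """Inverse of G on [0, pi], where G decreases from 1 to 0."""
    return bisect_decreasing(G, 0.0, math.pi, kappa)


def solution(n, kappa):
    """The solution of sin(n x)/sin(x) = kappa * n on (0, pi/n)."""
    f = lambda x: math.sin(n * x) / math.sin(x)
    return bisect_decreasing(f, 1e-12, math.pi / n, kappa * n)


def theta(n, kappa):
    """theta defined by x = (1/n) G^{-1}(kappa) (1 + theta / n^2)."""
    x = solution(n, kappa)
    return n * n * (n * x / G_inv(kappa) - 1.0)
```

For example, `theta(2, 0.75)` should land strictly between 0 and 1, in line with the lemma; for $n = 2$ the equation reduces to $\cos x = \kappa$, which gives an independent check of `solution`.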
corollary 3
For $0 < \varphi < \pi/3$, the number of $z \in \mathbb{F}_p^\times$ such that $|S_0(z)| > n\cos\varphi$ is
\[ T_0(\varphi) = 2\lceil Z \rceil - 2, \qquad \text{where} \quad Z = \frac{p}{\pi n}\, G^{-1}(\cos\varphi) \left(1 + \frac{\theta}{n^2}\right), \quad \theta \in (0, 1). \]
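Corollary 3 can be compared with a direct count. The sketch below is a hedged illustration: it assumes the reading $T_0(\varphi) = 2\lceil Z\rceil - 2$ of the corollary, evaluates $|S_0(z)|$ through (11), and brackets $T_0$ between the values obtained with $\theta = 0$ and $\theta = 1$. The function names are hypothetical.

```python
import math


def G(x):
    """G(x) = sin(x)/x, extended by G(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def G_inv(kappa, iters=200):
    """Inverse of G on [0, pi] by bisection (G is decreasing there)."""
    lo, hi = 0.0, math.pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(mid) > kappa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def T0_bruteforce(p, n, phi):
    """Count z in 1..p-1 with |S_0(z)| > n cos(phi), using formula (11)."""
    bound = n * math.cos(phi)
    count = 0
    for z in range(1, p):
        s0 = abs(math.sin(math.pi * n * z / p) / math.sin(math.pi * z / p))
        if s0 > bound:
            count += 1
    return count


def T0_bounds(p, n, phi):
    """Values of 2*ceil(Z) - 2 at theta = 0 and at theta = 1."""
    z_low = p / (math.pi * n) * G_inv(math.cos(phi))
    z_high = z_low * (1 + 1 / n ** 2)
    return 2 * math.ceil(z_low) - 2, 2 * math.ceil(z_high) - 2
```

With $p = 101$, $n = 10$, and $\varphi = 0.5$, both bounds and the direct count agree, since only $z = \pm 1, \pm 2$ give $|S_0(z)| > n\cos\varphi$.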
We next estimate the function $G^{-1}(x)$.

lemma 9
We have
\[ \sqrt{6(1-x)} \le G^{-1}(x) \le \sqrt{6(1-x)} \left(1 + \frac{1}{4}(1-x)\right), \]
provided $7/15 \le x \le 1$. (In fact, the lower bound holds true for all $x \in [0, 1]$.)

Proof
(i) Given $x \in [0, 1]$, we define $z = \sqrt{6(1-x)}$. Then
\[ G(z) \ge 1 - \frac{z^2}{6} = x, \qquad G^{-1}(x) \ge z = \sqrt{6(1-x)}. \]
(ii) To prove the upper bound, we use the inequality
\[ G(z) \le 1 - \frac{z^2}{6} + \frac{z^4}{120} \qquad (z \ge 0). \tag{16} \]
Given $x \in [7/15, 1]$, we define $z \in [0, 2]$ by
\[ x = 1 - \frac{z^2}{6} + \frac{z^4}{120}, \tag{17} \]
and we get then
\[ 6(1-x) = z^2\left(1 - \frac{z^2}{20}\right) \ge \frac{4}{5}\, z^2, \]
whence
\[ \frac{z^2}{20} \le \frac{3}{8}(1-x). \]
The latter inequality and (17) give
\[ 6(1-x) = z^2\left(1 - \frac{z^2}{20}\right) \ge z^2\left(1 - \frac{3}{8}(1-x)\right) \ge \frac{z^2}{1 + (1/2)(1-x)}, \]
\[ z^2 \le 6(1-x)\left(1 + \frac{1}{2}(1-x)\right). \tag{18} \]
Now by (16), (17), and (18),
\[ G^{-1}(x) \le z \le \sqrt{6(1-x)}\, \sqrt{1 + \frac{1}{2}(1-x)} \le \sqrt{6(1-x)} \left(1 + \frac{1}{4}(1-x)\right). \]

6. Large exponential sums: Proofs of Theorems 4 and 5
We first prove several auxiliary lemmas.

Proof of Lemma 1
The observation of Yudin, mentioned in Section 2, is that
\[ |S(z_1 + z_2)| > n\cos(\varphi_1 + \varphi_2), \tag{19} \]
provided
\[ |S(z_1)| > n\cos\varphi_1, \qquad |S(z_2)| > n\cos\varphi_2, \tag{20} \]
and
\[ 0 \le \varphi_1,\ \varphi_2,\ \varphi_1 + \varphi_2 \le \frac{\pi}{2}. \]
(See [Y, Lemma 1].) Consider the subsets of $\mathbb{F}_p$,
\[ E(\varphi) := \{ z \in \mathbb{F}_p : |S(z)| > n\cos\varphi \}, \]
so that $T(\varphi) = |E(\varphi)| - 1$. What (19) and (20) mean is that
\[ E(\varphi_1) + E(\varphi_2) \subseteq E(\varphi_1 + \varphi_2), \]
provided $0 \le \varphi_1, \varphi_2, \varphi_1 + \varphi_2 \le \pi/2$. By the Cauchy-Davenport theorem we have then
\[ |E(\varphi_1 + \varphi_2)| \ge \min\{ |E(\varphi_1)| + |E(\varphi_2)| - 1,\ p \}, \]
and the result follows.

Proof of Lemma 2
We assume $0 < \varphi_0 < \varphi \le \pi/2$, as otherwise the assertion is trivial. Let $j = \lfloor \varphi/\varphi_0 \rfloor$; then $0 < j\varphi_0 \le \varphi \le \pi/2$ and, by Lemma 1,
\[ T(\varphi) \ge T(j\varphi_0) \ge \min\{ jT(\varphi_0),\ p - 1 \}. \]
Thus, the desired estimate follows unless
\[ \left\lfloor \frac{\varphi}{\varphi_0} \right\rfloor T(\varphi_0) \ge p, \tag{21} \]
and we suppose (21) to hold for the rest of the proof. We define
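\[ l := \left\lfloor \frac{p-1}{T(\varphi_0)} \right\rfloor \ge 1, \tag{22} \]
so that $lT(\varphi_0) \le p - 1$, whence, by (21),
\[ l \le \left\lfloor \frac{\varphi}{\varphi_0} \right\rfloor - 1, \tag{23} \]
and therefore,
\[ l\varphi_0 < \varphi \le \frac{\pi}{2}. \tag{24} \]
Now by Lemma 1, (22), and (24), $T(i\varphi_0) \ge iT(\varphi_0)$ for $i = 1, \dots, l$, and it follows that there exist $l$ pairwise disjoint residue sets $E_1, \dots, E_l \subseteq \mathbb{F}_p^\times$ such that $|E_1| = \cdots = |E_l| = T(\varphi_0)$ and $|S(z)| > n\cos(i\varphi_0)$ for $z \in E_i$. We conclude that
\[
\begin{aligned}
n(p-n) = \sum_{z=1}^{p-1} |S(z)|^2 &\ge n^2 \bigl( \cos^2\varphi_0 + \cdots + \cos^2(l\varphi_0) \bigr) T(\varphi_0) \\
&= n^2 \left( \frac{1}{2}\, l + \frac{1}{4}\, \frac{\sin(2l+1)\varphi_0 - \sin\varphi_0}{\sin\varphi_0} \right) T(\varphi_0).
\end{aligned}
\tag{25}
\]
Also, by (23), $(2l+1)\varphi_0 \le 2\varphi - \varphi_0 \le \pi - \varphi_0$, whence
\[ \sin(2l+1)\varphi_0 \ge \sin\varphi_0. \tag{26} \]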
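From (25) and (26),
\[ p - n \ge \frac{1}{2}\, n l T(\varphi_0), \tag{27} \]
\[ \frac{p-n}{T(\varphi_0)} \ge \frac{n}{2}\, l \ge 2l \ge 2\left\lfloor \frac{p-n}{T(\varphi_0)} \right\rfloor, \]
\[ \left\lfloor \frac{p-n}{T(\varphi_0)} \right\rfloor \le \left\{ \frac{p-n}{T(\varphi_0)} \right\}, \]
\[ T(\varphi_0) > p - n. \]
The latter inequality, however, contradicts (27).

In the course of the proof of Theorems 4 and 5, we need estimates of the Euler beta function
\[ B(u, v) = \int_0^1 t^{u-1} (1-t)^{v-1}\, dt. \]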
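lemma 10
For $k > 0$,
\[ B\left(k, \frac{3}{2}\right) < \frac{\sqrt{\pi}}{2}\, k^{-3/2}, \qquad B\left(k, \frac{5}{2}\right) < \frac{3\sqrt{\pi}}{4}\, k^{-5/2}. \]

Proof
We have
\[
\begin{aligned}
B\left(k, \frac{3}{2}\right) &= \int_0^1 t^{1/2} \left( -\frac{1}{k}(1-t)^k \right)' dt \\
&= \frac{1}{2k} \int_0^1 t^{-1/2} (1-t)^k\, dt \\
&< \frac{1}{2k} \int_0^1 t^{-1/2} e^{-kt}\, dt \\
&= \frac{1}{2}\, k^{-3/2} \int_0^k \tau^{-1/2} e^{-\tau}\, d\tau \\
&< \frac{1}{2}\, k^{-3/2}\, \Gamma\left(\frac{1}{2}\right) \\
&= \frac{\sqrt{\pi}}{2}\, k^{-3/2}.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
B\left(k, \frac{5}{2}\right) &= \int_0^1 t^{3/2} \left( -\frac{1}{k}(1-t)^k \right)' dt \\
&= \frac{3}{2k} \int_0^1 t^{1/2} (1-t)^k\, dt \\
&< \frac{3}{2k} \int_0^1 t^{1/2} e^{-kt}\, dt \\
&= \frac{3}{2}\, k^{-5/2} \int_0^k \tau^{1/2} e^{-\tau}\, d\tau \\
&< \frac{3}{2}\, k^{-5/2}\, \Gamma\left(\frac{3}{2}\right) \\
&= \frac{3\sqrt{\pi}}{4}\, k^{-5/2}.
\end{aligned}
\]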
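The two bounds of Lemma 10 are easy to spot-check with the standard identity $B(u, v) = \Gamma(u)\Gamma(v)/\Gamma(u+v)$. The sketch below is a numerical illustration only; the function names are hypothetical, and the log-gamma function is used merely to avoid overflow for large $k$.

```python
import math


def beta(u, v):
    """Euler beta function via B(u, v) = Gamma(u) Gamma(v) / Gamma(u + v)."""
    return math.exp(math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v))


def lemma10_holds(k):
    """Check both inequalities of Lemma 10 for a given k > 0."""
    first = beta(k, 1.5) < math.sqrt(math.pi) / 2 * k ** -1.5
    second = beta(k, 2.5) < 3 * math.sqrt(math.pi) / 4 * k ** -2.5
    return first and second
```

Both bounds are asymptotically sharp as $k \to \infty$, so the margin shrinks for large $k$; the check still passes comfortably in double precision for moderate values such as $k = 1000$.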
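We prove Theorems 4 and 5 together.

Proof of Theorems 4 and 5
Using partial summation, we write
\[
\begin{aligned}
\sum_{z=1}^{p-1} |S(z)|^k &= k n^k \int_0^{\pi/2} T(\varphi) \cos^{k-1}\varphi \sin\varphi\, d\varphi, \\
\sum_{z=1}^{p-1} |S_0(z)|^k &= k n^k \int_0^{\pi/2} T_0(\varphi) \cos^{k-1}\varphi \sin\varphi\, d\varphi,
\end{aligned}
\tag{28}
\]
and we denote the integrals in the right-hand sides by $I$ and $I_0$, respectively. Since $|S(0)| = |S_0(0)|$, by Corollary 2 we have
\[ I \le I_0 \tag{29} \]
for $k$ even. We first estimate $I_0$. By Corollary 3 and Lemmas 9 and 10,
\[
\begin{aligned}
I_0 &\le \frac{2p}{\pi n}\left(1 + \frac{1}{n^2}\right) \int_0^{\pi/3} G^{-1}(\cos\varphi) \cos^{k-1}\varphi \sin\varphi\, d\varphi + \cos^{k-2}\frac{\pi}{3} \int_{\pi/3}^{\pi/2} T_0(\varphi) \cos\varphi \sin\varphi\, d\varphi \\
&\le \frac{2p}{\pi n}\left(1 + \frac{1}{n^2}\right) \int_{1/2}^{1} t^{k-1} G^{-1}(t)\, dt + 2^{2-k} \int_0^{\pi/2} T_0(\varphi) \cos\varphi \sin\varphi\, d\varphi \\
&\le \frac{2\sqrt{6}\, p}{\pi n}\left(1 + \frac{1}{n^2}\right) \left( \int_{1/2}^{1} t^{k-1}(1-t)^{1/2}\, dt + \frac{1}{4} \int_{1/2}^{1} t^{k-1}(1-t)^{3/2}\, dt \right) + 2^{2-k}\, \frac{n(p-n)}{2n^2} \\
&< \sqrt{\frac{6}{\pi}}\, \frac{p}{n}\, k^{-3/2} \left(1 + \frac{1}{n^2}\right)\left(1 + \frac{3}{8k}\right) + 2^{1-k}\, \frac{p}{n}.
\end{aligned}
\]
Using elementary calculus, one can easily derive the two following estimates:
\[ I_0 \le \sqrt{\frac{6}{\pi}}\, \frac{p}{n}\, k^{-3/2} \left(1 + \frac{1}{n^2}\right)\left(1 + \frac{4}{k}\right), \qquad k \ge 2, \]
and
\[ I_0 \le \sqrt{\frac{6}{\pi}}\, \frac{p}{n}\, k^{-3/2} \left(1 + \frac{1}{n^2}\right)\left(1 + \frac{2}{5k}\right), \qquad k \ge 16. \tag{30} \]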
The first estimate, in view of (28), proves Theorem 4, and we proceed with the proof of Theorem 5. Fix ϕ0 ∈ (0, π/6]. Our plan is to apply Lemma 2 to show that a large value of T (ϕ0 ) results in a large value of I , while I cannot be “too large” by (29) and
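(30). Indeed, for $k \ge 16$ even, by Lemma 2 and using the Stirling formula, we have
\[
\begin{aligned}
I &\ge \int_0^{\pi/2} \left\lfloor \frac{\varphi}{\varphi_0} \right\rfloor T(\varphi_0) \cos^{k-1}\varphi \sin\varphi\, d\varphi \\
&\ge \frac{T(\varphi_0)}{\varphi_0} \int_0^{\pi/2} \varphi \left( -\frac{1}{k} \cos^k\varphi \right)' d\varphi - T(\varphi_0) \int_0^{\pi/2} \left( -\frac{1}{k} \cos^k\varphi \right)' d\varphi \\
&= \frac{1}{k}\, \frac{T(\varphi_0)}{\varphi_0} \int_0^{\pi/2} \cos^k\varphi\, d\varphi - \frac{1}{k}\, T(\varphi_0) \\
&= \frac{1}{k}\, \frac{T(\varphi_0)}{\varphi_0}\, 2^{-k} \binom{k}{k/2} \frac{\pi}{2} - \frac{1}{k}\, T(\varphi_0) \\
&\ge \sqrt{\frac{\pi}{2}}\, k^{-3/2}\, \frac{T(\varphi_0)}{\varphi_0}\, e^{-1/3k} - \frac{1}{k}\, T(\varphi_0) \\
&= \sqrt{\frac{\pi}{2}}\, k^{-3/2}\, \frac{T(\varphi_0)}{\varphi_0} \left( 1 - \sqrt{\frac{2}{\pi}}\, e^{1/3k} \varphi_0 \sqrt{k} \right) e^{-1/3k} \\
&> \sqrt{\frac{\pi}{2}}\, k^{-3/2}\, \frac{T(\varphi_0)}{\varphi_0} \left( 1 - \frac{5}{6}\, \varphi_0 \sqrt{k} \right) e^{-1/3k}.
\end{aligned}
\]
Along with (29) and (30), this yields
\[ \sqrt{\frac{\pi}{2}}\, k^{-3/2}\, \frac{T(\varphi_0)}{\varphi_0} \left( 1 - \frac{5}{6}\, \varphi_0 \sqrt{k} \right) e^{-1/3k} < \sqrt{\frac{6}{\pi}}\, \frac{p}{n}\, k^{-3/2} \left(1 + \frac{1}{n^2}\right)\left(1 + \frac{2}{5k}\right). \]
Using the inequality
\[ \left(1 + \frac{2}{5k}\right) e^{1/3k} < 1 + \frac{3}{4k} \qquad (k \ge 16) \]
and replacing for brevity $\varphi_0$ by $\varphi$, we get
\[ T(\varphi) < \frac{2\sqrt{3}}{\pi}\, \frac{p}{n}\, \varphi \left(1 + \frac{1}{n^2}\right) \frac{1 + 3/(4k)}{1 - (5/6)\varphi\sqrt{k}}. \tag{31} \]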
We now recall the estimate of [L, Lemma 2]: T (ϕ)
1− ϕ 2+ ϕ 6 6 2 6 2 9
The result now follows from (31) and (32).
References
[G] R. M. GABRIEL, The rearrangement of positive Fourier coefficients, Proc. London Math. Soc. (2) 33 (1932), 32–51.
[HL] G. H. HARDY and J. E. LITTLEWOOD, Notes on the theory of series, VIII: An inequality, J. London Math. Soc. 3 (1928), 105–110.
[HLP] G. H. HARDY, J. E. LITTLEWOOD, and G. PÓLYA, Inequalities, 2d ed., Cambridge Math. Lib., Cambridge Univ. Press, Cambridge, 1988.
[L] V. F. LEV, On the number of solutions of a linear equation over finite sets, J. Comb. Theory Ser. A 83 (1998), 251–267.
[P] J. M. POLLARD, Addition properties of residue classes, J. London Math. Soc. (2) 11 (1975), 147–152.
[Y] A. A. YUDIN, “The measure of the large values of the modulus of a trigonometric sum” (in Russian) in Number-Theoretic Studies in the Markov Spectrum and in the Structural Theory of Set Addition (in Russian), Kalinin. Gos. Univ., Moscow, 1973, 163–171.
Institute of Mathematics, Hebrew University, Jerusalem 91904, Israel; [email protected]
A DUALITY OF MACDONALD-KOORNWINDER POLYNOMIALS AND ITS APPLICATION TO INTEGRAL REPRESENTATIONS KATSUHISA MIMACHI
Abstract We give a formula representing a duality of Macdonald-Koornwinder polynomials. Using this formula, an integral representation of Macdonald-Koornwinder polynomials is derived, a special case of which is the conjectural formula stated in [22]. We also present the corresponding formula to Heckman and Opdam’s Jacobi polynomials of type BCm .
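1. Introduction
In 1987, a new integral representation of Jacobi polynomials
\[ P_n^{(\alpha,\beta)}(z) = {}_2F_1\left( \begin{matrix} -n,\ \alpha+\beta+n+1 \\ \alpha+1 \end{matrix} ; \frac{1-z}{2} \right) \tag{1.1} \]
was discovered by K. Aomoto [1]. The function of $z$ represented by
\[ \int_{-1}^{1} \cdots \int_{-1}^{1} \prod_{1\le i\le n} (x_i - z)\, \Phi(x)\, dx_1 \cdots dx_n \tag{1.2} \]
and
\[ \Phi(x) = \prod_{1\le i\le n} (1 - x_i)^{\alpha-1} (1 + x_i)^{\beta-1} \prod_{1\le i<j\le n} |x_i - x_j|^{2\gamma} \]
turns out to be $P_n^{(\alpha/\gamma-1,\ \beta/\gamma-1)}(z)$, up to a constant factor. Note that $\Phi$ is the integrand of the Selberg integral [28]:
\[ \int_{-1}^{1} \cdots \int_{-1}^{1} \Phi(x)\, dx_1 \cdots dx_n = 2^{(\alpha+\beta-1)n + \gamma n(n-1)} \prod_{j=0}^{n-1} \frac{\Gamma(1 + \gamma(1+j))\, \Gamma(\alpha + j\gamma)\, \Gamma(\beta + j\gamma)}{\Gamma(1+\gamma)\, \Gamma(\alpha + \beta + (n+j-1)\gamma)}. \]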
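The right-hand side of the Selberg integral is straightforward to evaluate numerically, which gives a quick consistency check against cases where the left-hand side is elementary. The sketch below is an illustration under the assumption that the product formula reads as displayed above (with Gamma factors); the function name `selberg_rhs` is hypothetical.

```python
import math


def selberg_rhs(n, alpha, beta, gamma):
    """Right-hand side of the Selberg integral over [-1, 1]^n, via log-gamma."""
    log_total = ((alpha + beta - 1) * n + gamma * n * (n - 1)) * math.log(2)
    for j in range(n):
        log_total += (
            math.lgamma(1 + gamma * (1 + j))
            + math.lgamma(alpha + j * gamma)
            + math.lgamma(beta + j * gamma)
            - math.lgamma(1 + gamma)
            - math.lgamma(alpha + beta + (n + j - 1) * gamma)
        )
    return math.exp(log_total)
```

For $n = 2$ and $\alpha = \beta = \gamma = 1$ the integrand is $(x_1 - x_2)^2$, whose integral over $[-1,1]^2$ is $8/3$; for $n = 2$, $\alpha = \beta = 1$, $\gamma = 2$ the integrand is $(x_1 - x_2)^4$, with integral $64/15$. For $n = 1$ the formula reduces to $2^{\alpha+\beta-1} B(\alpha, \beta)$.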
DUKE MATHEMATICAL JOURNAL c 2001 Vol. 107, No. 2, Received 11 October 1999. Revision received 1 June 2000. 2000 Mathematics Subject Classification. Primary 33E52; Secondary 33C70.
It is crucial that the formula be available for the general exponent $\gamma$; the formula in the special case where $\gamma = 1$ has been known in the theory of orthogonal polynomials, explained, for example, in the textbook by G. Szegő [29] (see also [18, Chapter 17]). If we change the product $\prod_{i=1}^{n} (x_i - z)$ of (1.2) into $\prod_{1\le i\le n,\ 1\le j\le m} (x_i - z_j)$, the corresponding integral turns out to be the special case of Heckman and Opdam’s Jacobi polynomials associated with the root system of type $BC_m$. It was implied by combining two results of J. Kaneko [12] and of R. Beerends and E. Opdam [3].

On the other hand, Askey-Wilson polynomials are known as the most general orthogonal polynomials in the single variable case in the sense that various sets of orthogonal polynomials (Wilson polynomials, continuous Hahn polynomials, Jacobi polynomials, Laguerre polynomials, Hermite polynomials, and so on) can be derived as limiting cases from Askey-Wilson polynomials (see [2], [13]). Similarly, Macdonald-Koornwinder polynomials are general orthogonal polynomials in several variables in the sense that various sets of orthogonal polynomials (Macdonald polynomials associated with the root system of classical (nonexceptional) type [15], [16], multivariable Wilson polynomials (see [8]), Heckman and Opdam’s Jacobi polynomials (see [10]), and so on) can be derived as limiting cases from Macdonald-Koornwinder polynomials (see [6], [8], [14]). The corresponding formula to (1.2) in the case of Askey-Wilson polynomials was obtained in our previous work [22], where we conjectured the corresponding formula to (1.2) in the case of Macdonald-Koornwinder polynomials and suggested the existence of some duality behind it. The purpose of the present paper is to clarify these formulas.

A formula representing a duality of Macdonald-Koornwinder polynomials is given in Theorem 2.1, an integral representation of Macdonald-Koornwinder polynomials is given in Theorem 2.2, and in Corollary 2.3 the conjectural formula stated above is presented. For results related to Theorem 2.1, we refer the reader to [26, Section 6]. Furthermore, as a variant of the formulas above, the corresponding formulas for Jacobi polynomials associated with the root system of type $BC_m$ due to G. Heckman and E. Opdam are given in Theorems 4.1 and 4.2 and in Corollary 4.3. It is noteworthy that a special case of Corollary 4.3 recovers the formula (1.2) by Aomoto (see also [20], [23], [25], [24]). Throughout this article, we consider $q$ as a real number satisfying $0 < q < 1$ and $t$ as $q^k$, where $k \in \mathbb{N}$.

2. Macdonald-Koornwinder polynomials
We begin by recalling some fundamental facts. For basic references, we refer the reader to [14] and [17]. A partition $\lambda$ is a sequence $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_n)$ of nonnegative integers in decreasing order: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$. The sum of the $\lambda_i$ is the weight of $\lambda$, denoted by $|\lambda|$. Given a partition $\lambda$, we define the conjugate partition $\lambda' = (\lambda'_1, \lambda'_2, \dots, \lambda'_n)$ by $\lambda'_i = \mathrm{Card}\{j ;\ \lambda_j \ge i\}$. On partitions, the dominance (or natural) ordering (associated with the root system of type $BC_n$) is defined by
\[ \lambda \ge \mu \iff \lambda_1 + \cdots + \lambda_i \ge \mu_1 + \cdots + \mu_i \quad \text{for all } i \ge 1. \]
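The two combinatorial notions just introduced, the conjugate partition and the dominance ordering, can be made concrete in a few lines. The sketch below is only an illustration: partitions are plain Python lists in decreasing order, shorter partitions are padded with zeros, and the function names are hypothetical.

```python
def conjugate(lam):
    """Conjugate partition: the i-th part is Card{ j : lam_j >= i }."""
    parts = [part for part in lam if part > 0]
    if not parts:
        return []
    # parts[0] is the largest part, so the conjugate has parts[0] parts.
    return [sum(1 for part in parts if part >= i) for i in range(1, parts[0] + 1)]


def dominates(lam, mu):
    """Dominance order: every partial sum of lam is >= that of mu."""
    length = max(len(lam), len(mu))
    lam = list(lam) + [0] * (length - len(lam))
    mu = list(mu) + [0] * (length - len(mu))
    total_lam = total_mu = 0
    for a, b in zip(lam, mu):
        total_lam += a
        total_mu += b
        if total_lam < total_mu:
            return False
    return True
```

For example, the conjugate of $(4, 2, 1)$ is $(3, 2, 1, 1)$, and $(3, 1) \ge (2, 2)$ in the dominance ordering while the reverse comparison fails.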
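The diagram of a partition $\lambda$ is defined as the set of points $(i, j) \in \mathbb{Z}^2$ such that $1 \le j \le \lambda_i$. If $\lambda$ and $\mu$ are partitions, we write $\lambda \supseteq \mu$ to mean that the diagram of $\lambda$ contains the diagram of $\mu$, that is, that $\lambda_i \ge \mu_i$ for all $i \ge 1$. Let $\mathbb{C}[x^{\pm 1}] = \mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}]$ be the ring of Laurent polynomials in $n$ variables, $x = (x_1, \dots, x_n)$, which can be regarded as the group algebra of the weight lattice $P = \oplus_{1\le i\le n} \mathbb{Z}\varepsilon_i$ associated with the root system of type $BC_n$. The Weyl group $W = \mathbb{Z}_2^n \rtimes S_n$ of type $BC_n$ acts naturally on the ring $\mathbb{C}[x^{\pm 1}]$. The subring of all $W$-invariants is denoted by $\mathbb{C}[x^{\pm 1}]^W$. For $f = \sum_\beta f_\beta x^\beta \in \mathbb{C}[x^{\pm 1}]$, we define
\[ \bar f = \sum_\beta f_\beta x^{-\beta}, \]
and we let $[f]_1$ denote the constant term of $f$. An inner product is defined by
\[ \langle f, g \rangle = \frac{1}{2^n n!}\, [f \bar g \Delta]_1 \]
for $f, g \in \mathbb{C}[x^{\pm 1}]$, with
\[ \Delta = \Delta(x) = \prod_{\substack{1\le i<j\le n \\ \epsilon_1, \epsilon_2 = \pm 1}} \frac{(x_i^{\epsilon_1} x_j^{\epsilon_2}; q)_\infty}{(t x_i^{\epsilon_1} x_j^{\epsilon_2}; q)_\infty} \prod_{\substack{1\le j\le n \\ \epsilon = \pm 1}} \frac{(x_j^{2\epsilon}; q)_\infty}{(a x_j^{\epsilon}, b x_j^{\epsilon}, c x_j^{\epsilon}, d x_j^{\epsilon}; q)_\infty}, \]
where $(a_1, \dots, a_m; q)_\infty = (a_1; q)_\infty \cdots (a_m; q)_\infty$ and $(a; q)_\infty = \prod_{j\ge 0} (1 - a q^j)$. Then there is a unique family of polynomials $P_\lambda(x) = P_\lambda(x; a, b, c, d; q, t) \in \mathbb{C}[x^{\pm 1}]^W$ indexed by the partitions $\lambda = (\lambda_1, \dots, \lambda_n)$ such that (1) Pλ (x) = mλ (x) + µ0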
αi (ξ ) 0 small enough, the reduced spaces, Wc+3 projective spaces obtained by reducing (1.10) at c + 3 and c − 3 by the linear action of the circle group H . + − The reduced spaces Wc+3 and Wc−3 are symplectic suborbifolds of Mc+3 and Mc−3 , and the “flip-flop” theorem asserts the following.
theorem 1.6.2 + The blow-up of Mc+3 along Wc+3 is diffeomorphic as a G1 -manifold to the blow-up − of Mc−3 along Wc−3 . Remarks. (1) The “blowing-up” referred to here is a symplectic blow-up in the sense of Gromov. (2) This theorem can be refined to describe how the symplectic structures of these two blow-ups are related (see [GS2]). (3) This result is due to Guillemin and Sternberg and to Godinho. An analogous
297
1-SKELETA AND EQUIVARIANT COHOMOLOGY
result for complex manifolds (with GIT (geometric invariant theory) reduction playing the role of symplectic reduction) can be found in [BP]. To see how the GKM 1-skeleton of Mc+3 is related to the GKM skeleton of Mc−3 , we must find out how GKM-skeleta are affected by blowing up. Consider the simplest case of a blow-up. Let M be a GKM manifold, let p be a point of M G , let Tp M =
d i=1
Tpαi
be the decomposition of Tp M into weight spaces, and let Xi , i = 1, . . . , d, be the embedded GKM 2-spheres at p with Tp Xi = Tpαi . As a complex vector space, Tp M is d-dimensional, and each of these weight spaces is 1-dimensional. Now blow up M at p. As an abstract set, this blow-up is a disjoint union of the projective space CP (Tp M) and M − {p}. The action of G on the first of these sets has exactly d fixed points. A fixed point, pi , corresponds to each subspace Tpαi of Tp M, and each pair of fixed points pi and pj are joined by an embedded 2-sphere, α the projective line in CP (Tp M) corresponding to the subspace Tpαi ⊕ Tp j of Tp M. In addition, each of the 2-spheres Xi is unaffected when we blow it up at p, but instead of joining p to a fixed point qi in M − {p}, it now joins pi to qi . For the blow-up of M along a G-invariant symplectic submanifold W 2r , the story is essentially the same. As an abstract set, the blow-up is the disjoint union of the projectivized normal bundle of W and M − W . Thus each fixed point p of G in W gets replaced in the blow-up by d − r new fixed points in the projectivized normal space to W at p. If is the GKM graph of M and 1 is the GKM graph of W , then, just as above, the GKM graph of the blow-up is obtained from and 1 by replacing each vertex of 1 by a complete graph on d − r vertices, one vertex for each edge of − 1 at p (see Figure 4 in Section 2.2.1). − . By Theorem This description is particularly simple if M is Mc−3 and W is Wc−3 − 1.6.1, Wc−3 is just a twisted projective space of dimension r − 1, r being the index of the fixed point p, so its graph is the complete graph on r vertices r . Hence, after blowing up, it gets replaced by the graph r × d−r . Similarly, the GKM + + is d−r , and when we blow up Mc+3 along Wc+3 , it gets replaced by graph of Wc+3 d−r × r . 
Thus, as one passes through the critical value c, the following scenario takes place: (1) r gets blown up to r × d−r ; (2) r × d−r gets “flip-flopped” to d−r × r ;
298
GUILLEMIN AND ZARA
(3) d−r × r gets blown down to d−r . To complete the description of this transition, we must still describe how this flip-flop process affects the connections and the axial functions on these graphs. This, too, we postpone until Section 2.3.2. 1.7. Equivariant cohomology Let HG (M) be the equivariant cohomology ring of M with complex coefficients. From the inclusion map i : M G → M one gets a transpose map in cohomology i ∗ : HG (M) −→ HG M G , (1.11) and we describe in this section some simple necessary conditions for an element of HG (M G ) to be in the image of this map. Since M G is a finite set, HG M G = (1.12) HG ({p}), p ∈ M G ; however, HG ({p}) is the polynomial ring, S(g∗ ), so the right-hand side of (1.12) is the ring N S(g∗ ), N = #M G . (1.13) It is useful to keep track of the fact that each summand of (1.13) corresponds to a fixed point by identifying (1.13) with the ring (1.14) Maps V , S(g∗ ) . Let e ∈ E , and let g∗e be the quotient of g∗ by the 1-dimensional subspace {cαe ; c ∈ C}. From the projection ρe : g∗ → g∗e one gets an epimorphism of rings ρe : S(g∗ ) −→ S g∗e . (1.15) (Since g∗e = g∗e¯ , we use the notation g∗e and ρe for unoriented edges, as well.) theorem 1.7.1 A necessary condition for an element, φ, of the ring (1.14) to be in the image of the map (1.11) is that for every edge, e, of it satisfies the compatibility condition ρe φp = ρe φq ,
(1.16)
p and q being the vertices of e, and φp and φq the elements of S(g∗ ) assigned to them by the map φ : V → S(g∗ ). Proof The right- and left-hand sides of (1.16) are the pullbacks to p and q of the image of
$\phi$ under the map $H_G(M) \to H_K(M)$, where $K = \exp \ker \alpha_e$ and $\mathfrak{k} = \mathfrak{g}_e$. Since $p$ and $q$ belong to the same connected component of $M^K$, the pullbacks coincide.

Let us denote by $H(\Gamma, \alpha)$ (or simply by $H(\Gamma)$ when the choice of $\alpha$ is clear) the subring of (1.14) consisting of those elements that satisfy the compatibility condition (1.16); we call $H(\Gamma, \alpha)$ the cohomology ring of $(\Gamma, \alpha)$. By Theorem 1.7.1 the map (1.11) factors through $H(\Gamma, \alpha)$ to give a ring homomorphism
\[ i^* : H_G(M) \longrightarrow H(\Gamma, \alpha). \tag{1.17} \]
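For elements of degree one, the compatibility condition (1.16) has a very concrete form: if each $\phi_p$ is a linear form in $\mathfrak{g}^*$, then $\rho_e \phi_p = \rho_e \phi_q$ says exactly that $\phi_p - \phi_q$ is a multiple of $\alpha_e$. The sketch below checks this on a hypothetical example, a square graph with a rank-2 torus and axial weights $(1,0)$, $(0,1)$, $(-1,0)$, $(0,-1)$ on its four edges. The vertex names, the coordinates, and the function names are assumptions made for illustration, and only linear classes are handled.

```python
def cross(u, v):
    """2x2 determinant; it vanishes exactly when u and v are proportional."""
    return u[0] * v[1] - u[1] * v[0]


def satisfies_compatibility(phi, edges):
    """Check (1.16) for linear classes.

    phi   : dict vertex -> linear form (a, b) in coordinates on g* = R^2
    edges : list of (p, q, alpha_e) with alpha_e the weight on the edge
    """
    for p, q, alpha in edges:
        diff = (phi[p][0] - phi[q][0], phi[p][1] - phi[q][1])
        if cross(diff, alpha) != 0:
            return False
    return True


# Hypothetical square graph: four vertices joined in a cycle.
square_edges = [
    ("A", "B", (1, 0)),
    ("B", "C", (0, 1)),
    ("C", "D", (-1, 0)),
    ("D", "A", (0, -1)),
]

# Assigning to each vertex its position in the plane gives a compatible class.
positions = {"A": (-1, -1), "B": (1, -1), "C": (1, 1), "D": (-1, 1)}
```

Constant maps are always compatible, in line with the remark that $S(\mathfrak{g}^*)$ sits inside $H(\Gamma, \alpha)$, while a map that changes along an edge in a direction transverse to its weight is rejected.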
This homomorphism also has a bit of additional structure. The constant maps of $V$ into $S(\mathfrak{g}^*)$ obviously satisfy condition (1.16), so that $S(\mathfrak{g}^*)$ is a subring of $H(\Gamma, \alpha)$. Also, from the constant map $M \to \mathrm{pt}$ one gets a transpose map $H_G(\mathrm{pt}) \to H_G(M)$, mapping $S(\mathfrak{g}^*)$ into $H_G(M)$, and it is easy to see that (1.17) is a morphism of $S(\mathfrak{g}^*)$-modules. One of the main theorems of [GKM] asserts that the homomorphism (1.17) is frequently an isomorphism. More explicitly, recall that if $K$ is a subgroup of $G$ there is a forgetfulness map $H_G(M) \to H_K(M)$, and, in particular, for $K = \{e\}$, there is a map
\[ H_G(M) \longrightarrow H(M). \tag{1.18} \]

Definition 1.7.1
$M$ is equivariantly formal if (1.18) is surjective.

There are many alternative equivalent definitions of equivariant formality. For instance, for every compact $G$-manifold,
\[ \sum \dim H^i(M) \ge \sum \dim H^i(M^G), \tag{1.19} \]
and $M$ is equivariantly formal if and only if the inequality is equality. Thus for GKM manifolds, $M$ is equivariantly formal if the sum of its Betti numbers is equal to the cardinality of $M^G$. In other words, we have the following theorem.

theorem 1.7.2
For a GKM manifold, equivariant formality is equivalent to
\[ \sum b_i(M) = \sum b_i(\Gamma). \tag{1.20} \]

For instance, if the action of $G$ on $M$ is Hamiltonian, then $b_i(M) = b_i(\Gamma)$, so (1.20) is trivially satisfied. A less trivial example of (1.20) is the action of the Cartan subgroup of $G_2$ on the 6-sphere $G_2/SU(3)$. For this example we show in Section 1.9 that $b_2(\Gamma) = b_4(\Gamma) = 1$, $b_0(M) = b_6(M) = 1$ and that all the other Betti numbers are
zero. The theorem of Goresky, Kottwitz, and MacPherson that we allude to above asserts that the following is true. theorem 1.7.3 If M is equivariantly formal, the map (1.17) is a bijection. In other words, if M is equivariantly formal, the equivariant cohomology ring of M is isomorphic to the cohomology ring H (, α) of the GKM 1-skeleton (, α). Recently, a number of relatively simple proofs of this theorem have been given. For example, a proof of N. Berline and M. Vergne [BV] is based on localization ideas, and, in the Hamiltonian case, there is a very simple Morse-theoretic proof by S. Tolman and J. Weitsman [TW2]. Theorem 1.7.2 is, as we mentioned, just one of many alternative criteria for equivariant formality. Another is the following theorem. theorem 1.7.4 M is equivariantly formal if, as S(g∗ )-modules, HG (M) H (M) ⊗ S g∗ . Thus if (, α) is the GKM 1-skeleton of M, one gets from this the following result. theorem 1.7.5 If M is equivariantly formal, H (, α) is a free module over S(g∗ ) with b2i (M) generators in dimension 2i. One of the questions that we address in the second part of this paper is the following: When is the graph-theoretical analogue of this theorem true with the b2i (M)’s replaced by the b2i ()’s? From the examples in Section 1.9, we see that even for GKM-skeleta this theorem is not true with the b2i (M)’s replaced by the b2i ()’s. However, we show that one can make this substitution providing has the properties described in Theorems 1.4.4 and 1.5.1. 1.8. The Kirwan map Let M be a compact Hamiltonian G-manifold, let H be a circle subgroup of G, and let f : M → R be the H -moment mapping. If c is a regular value of f , then the reduced space Mc = f −1 (c)/H is a Hamiltonian G1 -space, with G1 = G/H , and one can define a morphism in cohomology
Kc : HG (M) −→ HG1 (Mc )
(1.21)
as follows. Let Z = f −1 (c). Since c is a regular value of f , the action of H on Z is locally free, so there is an isomorphism in cohomology, HG (Z) −→ HG1 (Mc ) (cf. [GS3, Sec. 4.6]), and the map (1.21) is just the composition of this with the restriction map HG (M) −→ HG (Z). The homomorphism (1.21) is called the Kirwan map, and a fundamental result of Kirwan [Ki] is the following theorem. theorem 1.8.1 The map (1.21) is surjective. One way of proving this theorem is to use the flip-flop theorem of Section 1.6. Let c1 and c2 be regular values of f , and suppose that there is just one critical point p of f with c1 < f (p) < c2 . Assume by induction that Kirwan’s theorem is true for c1 and prove it for c2 . The flip-flop theorem says that Mc2 is obtained from Mc1 by a blow-up followed by a blow-down, and to see what effect these operations have on cohomology, one makes use of the following theorem (see [McD]). theorem 1.8.2 Let M be a compact Hamiltonian G-manifold, and let W be a G-invariant symplectic submanifold of M. If β : M # → M is the symplectic blow-up of M along W and W # = β −1 (W ) is its singular locus, then there is a short exact sequence in cohomology 9 (1.22) 0 −→ HG (M) −→ HG M # −→ HG W # −→ 0, 9
the first arrow being β ∗ , the second being restriction, and HG (W # ) being the quotient, H (W # )/β ∗ H (W ). Suppose now that M satisfies the hypotheses of Theorem 1.5.1. Then both M and Mc are GKM spaces. Let (, α) and (c , αc ) be their GKM-skeleta. By Theorem 1.7.3, HG (M) is isomorphic to H (, α) and HG1 (Mc ) is isomorphic to H (c , αc ); so from (1.21) we get the Kirwan map Kc : H (, α) −→ H c , αc . We show that there is a purely graph-theoretical description of this map. Recall that
an element φ of H (, α) is a map of V to S(g∗ ) which, for every edge e ∈ E , satisfies the compatibility condition (1.16), p = i(e) and q = t (e) being the vertices of e. Suppose that f (p) < c < f (q). Then e corresponds to a vertex pce of c . Moreover, if h is the Lie algebra of H , then g1 = g/h; thus there is a map g∗1 → g∗ that can be composed with the map ρe : g∗ → g∗e to give a bijection, g∗1 → g∗e , and an inverse bijection, g∗e → g∗1 . This, in turn, induces an isomorphism of rings, γe : S(g∗e ) −→ S g∗1 , and hence we get an element γe ρe φp = γe ρe φq of S(g∗1 ). theorem 1.8.3 The value of Kc φ at the vertex pce of c is γe ρe φp . Proof Let a be the element of HG (M) whose restriction to M G is φ. Let Xe be the embedded 2-sphere in M corresponding to e, and let ae ∈ HG (Xe ) be the restriction of a to Xe . Then the 1-point manifold {pce } is the reduction of Xe at c with respect to H . Therefore it suffices to check that γe ρe φp is the image of ae under the Kirwan map HG (Xe ) −→ HG1 pce . 1.9. Examples We describe in this section two examples of GKM manifolds for which the Betti numbers b2i (M) do not coincide with the combinatorial Betti numbers b2i ().
Figure 3. T 2 action on S 6
Example 1.9.1. G2 / SU(3) The space in Figure 3 is topologically just the standard 6-sphere. Moreover, if p is the identity coset, the isotropy representation of SU(3) on Tp is the standard representation of SU(3) on C3 , and hence the complex structure on Tp given by the
identification Tp C3 extends to a G2 -invariant almost complex structure on S 6 . Let T 2 be the Cartan subgroup of G2 . We show that the action of T 2 on S 6 is a GKM action, determine the GKM 1-skeleton, and compute its combinatorial Betti numbers. Recall that G2 is by definition the group of automorphisms of the nonassociative 8-dimensional algebra of Cayley numbers and that there is an intrinsic description of the almost complex structure on S 6 which makes use of algebraic proprieties of the Cayley numbers. (For details, see [KN, pp. 139–140].) In this description an element of the Cayley numbers is identified with an element, x = (z1 , z2 , w1 , w2 ), of C4 , and S 6 is realized as the unit sphere in the real subspace, z1 = −z1 ; that is, S 6 = x ∈ C4 ; |z1 |2 + |z2 |2 + |w1 |2 + |w2 |2 = 1, z1 = −z1 . Let α and β be basis vectors for the weight lattice of T 2 . Then the action of T 2 on the Cayley algebra defined by eiθ · z1 , z2 , w1 , w2 −→ z1 , ei(α+β)(θ) z2 , e−iα(θ) w1 , eiβ(θ) w2 is an action by automorphisms (see [Ja]), and it clearly leaves S 6 fixed, which implies that the induced action of T 2 on S 6 preserves the almost complex structure. Let p = (i, 0, 0, 0) and q = (−i, 0, 0, 0) be the fixed points of this induced action. If we identify Tp S 6 with 0, z2 , w1 , w2 ; z2 , w1 , w2 ∈ C ⊂ C4 , then the almost complex structure at p is Jp 0, z2 , w1 , w2 = 0, iz2 , iw1 , −iw2 . Hence, identifying Tp S 6 with C3 by 0, z2 , w1 , w2 −→ z2 , w1 , w2 , we deduce that the weights of the induced representation of T 2 on Tp S 6 are α + β, −α, and −β. Similarly, for the representation of T 2 on Tq S 6 , the weights are −α − β, α, and β. The GKM graph of this T 2 -space consists of the two vertices, p and q, linked by three edges (see Figure 3), labeled by the weights α +β, −α, and −β, and along every edge the connection swaps the remaining two edges. 
For every ξ ∈ P , the oriented graph (, oξ ) has cycles, and its combinatorial Betti numbers are b0 = b3 = 0, b1 = b2 = 1. Remark. Note that, by Theorem 1.7.2, G2 / SU(3) is equivariantly formal, so, in spite of the fact that the Betti numbers do not coincide with the combinatorial Betti numbers, still H (, α) = HT 2 (S 6 ).
Example 1.9.2. The n-fold equivariant ramified cover of $S^2 \times S^2$
In Example 1.9.1, every $\xi$-orientation of the GKM graph had cycles. We next describe an example of a GKM manifold whose GKM graph does have an acyclic $\xi$-orientation but for which the combinatorial Betti numbers are different from the topological Betti numbers. The 4-manifold $W = S^2 \times S^2 = \mathbb{C}P^1 \times \mathbb{C}P^1$ is a toric variety whose moment polytope $\square_4$ is the square in $\mathbb{R}^2$ with vertices at $(1, 1)$, $(-1, 1)$, $(-1, -1)$, and $(1, -1)$. Let $\phi : W \to \square_4$ be the moment map, and let $\psi : \mathbb{R}^2 \to \mathbb{R}^2$ be the map
\[ (x, y) = x + iy \longmapsto (x + iy)^n. \]
The preimage of $\square_4$ under this map is a regular curved polygon $\square_{4n}$ with $4n$ sides. The fiber product of $W$ and $\square_{4n}$,
\[ M = \{ (p, z) \in W \times \square_{4n} ;\ \phi(p) = \psi(z) \}, \tag{1.23} \]
is a connected compact manifold, with maps
\[ \pi : M \longrightarrow W, \qquad \pi(p, z) = p, \]
and
\[ \gamma : M \longrightarrow \square_{4n}, \qquad \gamma(p, z) = z. \]
Moreover, if we let $T^2$ act trivially on $\square_{4n}$ and act on $W$ by its given action, we get from (1.23) an action of $T^2$ on $M$ which makes the “fiber product” diagram
\[
\begin{array}{ccc}
M & \xrightarrow{\ \gamma\ } & \square_{4n} \\
{\scriptstyle \pi} \downarrow & & \downarrow {\scriptstyle \psi} \\
W & \xrightarrow{\ \phi\ } & \square_4
\end{array}
\tag{1.24}
\]
$T^2$-equivariant. Let $M^\# = M - \gamma^{-1}(0)$ and $W^\# = W - \phi^{-1}(0)$.

lemma 1.9.1
The map $\pi : M^\# \to W^\#$ is an $n$-to-1 covering map.

Proof
This follows from (1.24) and the fact that $\psi : \square_{4n} - \{0\} \to \square_4 - \{0\}$ is an $n$-to-1 covering map.

corollary 1.9.1
There is a $T^2$-invariant complex structure on $M^\#$.

Let $\square_{4n}^0$ be the interior of $\square_{4n}$, and let $M_0 = \gamma^{-1}(\square_{4n}^0)$.
lemma 1.9.2 There is a T 2 -equivariant diffeomorphism M0 → T 2 × ✷04n that intertwines γ and the map pr2 : T 2 × ✷04n → ✷04n . Proof Let ✷04 be the interior of ✷4 , and let W0 = φ −1 (✷04 ). Since W is a toric variety, there is a T 2 -equivariant diffeomorphism W0 −→ T 2 × ✷04 which intertwines φ and the map pr2 : T 2 × ✷04n → ✷04n . So the lemma follows from the description (1.23) of M. Let us use this diffeomorphism to pull back the complex structure on M0 ∩ M # to T 2 × (✷04n − {0}). We show that by modifying this structure, if necessary, on a small neighborhood of T 2 ×{0}, we can extend it to a T 2 -invariant almost complex structure on T 2 × ✷04n . Since the tangent bundle of T 2 × ✷04n is trivial, a T 2 -invariant almost complex structure on T 2 × ✷04n is simply a map J : ✷04n −→ GL(4, R)+ / GL(2, C).
(1.25)
So to prove this assertion we must show that the map J0 : ✷04n − {0} −→ GL(4, R)+ / GL(2, C)
(1.26)
associated with the complex structure on T 2 × (✷04n − {0}) can be modified slightly on a small disk D about the origin so that it is extendable over ✷04n . However, GL(4, R)+ / GL(2, C) is homotopy equivalent to the 2-sphere SO(4)/U (2), so the restriction of J0 to ∂D is a map J0 : S 1 → S 2 , and since S 2 is simply connected, this map extends over the interior. Pulling this almost complex structure back to M0 , we conclude with the following theorem. theorem 1.9.1 There exists a T 2 -invariant almost complex structure on M. We now show that M is a GKM manifold and compute its combinatorial Betti numbers. From the fact that W is a toric variety one easily proves the following lemma. lemma 1.9.3 The 1-skeleton of W is W − W0 and is the union of four 2-spheres with GKM graph 4 = ∂ ✷4 . Moreover, its axial function is the function that assigns to the edges of 4
the weights $\alpha_1 = (1, 0)$, $\alpha_2 = (0, 1)$, $\alpha_3 = (-1, 0)$, and $\alpha_4 = (0, -1)$ (starting from the bottom edge and proceeding counterclockwise). Since $T^2$ acts freely on $M_0$ and since $\pi : M - M_0 \to W - W_0$ is an $n$-to-1 covering map, the lemma implies the following theorem.
theorem 1.9.2 The 1-skeleton of $M$ is $M - M_0$, and it is the union of $4n$ 2-spheres with GKM graph $\Gamma_{4n} = \partial\square_{4n}$. Its axial function is the pullback of the axial function of $W$ by the map $\psi : \partial\square_{4n} \to \partial\square_4$. In particular, $b_{2i}(\Gamma_{4n}) = n\, b_{2i}(\Gamma_4)$, so that $b_0(\Gamma_{4n}) = b_4(\Gamma_{4n}) = n$ and $b_2(\Gamma_{4n}) = 2n$.
By a simple Mayer-Vietoris type computation, with $M$, $M_0$, and $M - M_0$, it is easy to compute the “honest” Betti numbers of $M$ and to show that
$$b_0(M) = b_4(M) = 1 \quad\text{and}\quad b_2(M) = 4n - 2$$
and also that the odd Betti numbers are zero. Thus, in particular, by Theorem 1.7.2, $M$ is equivariantly formal and $H_G(M) = H(\Gamma, \alpha)$.
1.10. Edge-reflecting polytopes
Let $\Delta$ be an edge-reflecting polytope, and let $\Gamma$ be its 1-skeleton, the graph consisting of the vertices and edges of $\Delta$. The edge-reflecting property enables one to define a connection on $\Gamma$ as follows. Let $p$ and $p'$ be adjacent vertices of $\Gamma$, and let $e$ be the edge joining $p$ to $p'$. If $e_i$ is an edge joining $p$ to a vertex $q_i \neq p'$, then there exists, by the edge-reflecting property, a unique edge $e_i'$, joining $p'$ to another vertex $q_i' \neq p$, such that $p$, $p'$, $q_i$, and $q_i'$ are coplanar. The correspondence $e_i \leftrightarrow e_i'$ and $e \leftrightarrow \bar e$ defines a bijective map $\theta_e : E_p \to E_{p'}$, and the collection of these maps is, by definition, a connection on $\Gamma$. We can also define an axial function $\alpha : E_\Gamma \to \mathbb{R}^n$ by attaching to each oriented edge $e$ the vector
$$\alpha_e = \overrightarrow{pq},$$
where $p = i(e)$ and $q = t(e)$ are the endpoints of $e$. The triple $(\Gamma, \theta, \alpha)$ does not quite satisfy the properties described in Theorem 1.1.2. It does satisfy the first four of them, but it only satisfies a somewhat weaker version of the fifth, namely,
$$\alpha_{e_i'} = \lambda_{i,e}\,\alpha_{e_i} + c_{i,e}\,\alpha_e \quad\text{with } \lambda_{i,e} > 0. \tag{1.27}$$
Proof Condition (1.27) is just a restatement of the assumption that $e_i$, $e_i'$, and $e$ are coplanar; the positivity of $\lambda_{i,e}$ is a consequence of the convexity of $\Delta$. If $\lambda_{i,e}$ were not positive, $e$ would be in the interior of the intersection of $\Delta$ with the plane spanned by $e_i$ and $e_i'$.
Remarks. (1) We call $(\Gamma, \alpha)$ the GKM 1-skeleton of $\Delta$.
(2) For edge-reflecting polytopes, Theorem 1.1.2(1) can be replaced by the much stronger statement: For every $p \in V_\Gamma$, the vectors $\alpha_e$, $e \in E_p$, are $n$-independent; that is, for every sequence $1 \le i_1 < i_2 < \cdots < i_n \le d$, the vectors $\alpha_{i_1}, \alpha_{i_2}, \ldots, \alpha_{i_n}$ are linearly independent.
(3) Moreover, if $(\Gamma, \alpha)$ is the GKM 1-skeleton of an edge-reflecting polytope, it satisfies both the “no-cycle” condition of Theorem 1.4.2 and the “zeroth Betti number” condition of Theorem 1.4.4.
1.11. Grassmannians as GKM manifolds
Let $G$ be the $n$-torus $(S^1)^n$, and let $\tau_0$ be the representation of $G$ on $\mathbb{C}^n$ given by
$$\tau_0(e^{i\theta})\, z = (e^{i\theta_1} z_1, \ldots, e^{i\theta_n} z_n).$$
We denote by $v_i$, $i = 1, \ldots, n$, the standard basis vectors of $\mathbb{C}^n$ and by $\alpha_i$, $i = 1, \ldots, n$, the weights of $\tau_0$ associated with these basis vectors. Thus, identifying $\mathfrak{g}$ with $\mathbb{R}^n$,
$$\alpha_i(\xi) = \xi_i \quad\text{for } \xi = (\xi_1, \ldots, \xi_n) \in \mathfrak{g}. \tag{1.28}$$
From the action, $\tau_0$, we get an induced action, $\tau$, of $G$ on the Grassmannian $Gr_k(\mathbb{C}^n)$. We prove that this is a GKM action by proving the following theorem.
theorem 1.11.1 The fixed points of $\tau$ are in one-to-one correspondence with the $k$-element subsets of $\{1, \ldots, n\}$. For the fixed point $p = p_S$, corresponding to the subset $S$, the isotropy representation of $G$ on $T_p$ has weights
$$\alpha_j - \alpha_i, \quad i \in S,\ j \in S^c. \tag{1.29}$$
Proof The fixed point $p_S$ corresponding to $S$ is the subspace $V_S$ of $\mathbb{C}^n$ spanned by $\{v_i ;\ i \in S\}$. Therefore the tangent space at $p_S$ is
$$\operatorname{Hom}_{\mathbb{C}}(V_S, V_{S^c}), \tag{1.30}$$
with basis vectors $\{v_j \otimes v_i^* ;\ i \in S,\ j \in S^c\}$, and the weights associated with these basis vectors are (1.29).
We leave the following theorem as an easy exercise.
theorem 1.11.2 If $n \ge 4$, the weights (1.29) are 3-independent.
In particular, $\tau$ is a GKM action, as claimed. Let $\Gamma$ be its GKM graph. The following theorem gives a description of the $G$-invariant 2-spheres that correspond to the edges of this graph.
theorem 1.11.3 Let $S$ and $S'$ be $k$-element subsets of $\{1, \ldots, n\}$ with $\#(S \cap S') = k - 1$. Define $S_1 = S \cap S'$ and $S_2 = S \cup S'$, and let $X_{S,S'}$ be the set of all $k$-dimensional subspaces $V$ of $\mathbb{C}^n$ such that
$$V_{S_1} \subset V \subset V_{S_2}. \tag{1.31}$$
Then $X_{S,S'}$ is a $G$-invariant embedded 2-sphere containing $p_S$ and $p_{S'}$, and all $G$-invariant embedded 2-spheres containing $p_S$ are of this form.
Proof From the identification of $X_{S,S'}$ with the projective space $\mathbb{CP}(V_{S_2}/V_{S_1})$, one sees that $X_{S,S'}$ is an embedded 2-sphere. Moreover, since $V_S$ and $V_{S'}$ satisfy (1.31), this sphere contains $p_S$ and $p_{S'}$. To prove the last assertion, note that the tangent space to $X_{S,S'}$ at $p_S$ is
$$\operatorname{Hom}_{\mathbb{C}}(V_S/V_{S_1},\ V_{S_2}/V_S). \tag{1.32}$$
Thus, if $\{i\} = S - S_1$ and $\{j\} = S_2 - S$, this tangent space has $v_i^* \otimes v_j$ as basis vector with weight $\alpha_j - \alpha_i$. Thus the tangent spaces to these spheres account for all the weights on the list (1.29).
From the result above we get the following description of the graph, $\Gamma$.
theorem 1.11.4 The vertices of the graph, $\Gamma$, are in one-to-one correspondence with the $k$-element subsets, $S$, of $\{1, \ldots, n\}$ via the map $S \mapsto p_S$; two vertices $p_S$ and $p_{S'}$ are adjacent if $\#(S \cap S') = k - 1$.
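The description of the graph in Theorem 1.11.4, together with the weights (1.29), is concrete enough to tabulate. The following is a minimal sketch, not part of the original argument: the function names `johnson_graph` and `axial_weight` are hypothetical, and the weight $\alpha_j - \alpha_i$ is encoded as a vector in $\mathbb{R}^n$ using the identification (1.28).

```python
from itertools import combinations

def johnson_graph(n, k):
    """Vertices are the k-element subsets of {1..n}; S and T are adjacent
    when #(S & T) = k - 1 (Theorem 1.11.4)."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    edges = {S: [T for T in vertices if len(S & T) == k - 1] for S in vertices}
    return vertices, edges

def axial_weight(S, T, n):
    """Weight alpha_j - alpha_i of the oriented edge S -> T, as a vector in R^n,
    where {i} = S - T and {j} = T - S."""
    (i,) = S - T  # index leaving S
    (j,) = T - S  # index entering
    w = [0] * n
    w[j - 1] += 1
    w[i - 1] -= 1
    return tuple(w)

# Example: Gr_2(C^4); every vertex should have valence k(n - k) = 4.
vertices, edges = johnson_graph(4, 2)
valences = {len(nbrs) for nbrs in edges.values()}
```

Each vertex has valence $k(n-k)$, matching the number of pairs $(i, j) \in S \times S^c$, and reversing an edge negates its weight.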
The graph we just described is called the Johnson graph and is a familiar object in graph theory (see, e.g., [BCN]). The axial function, $\alpha$, and the connection, $\theta$, are easy to decipher from the results above. Let $e$ be an oriented edge joining the vertex $p_S = i(e)$ to the vertex $p_{S'} = t(e)$. Then, by (1.32),
$$\alpha_e = \alpha_j - \alpha_i, \tag{1.33}$$
with $\{i\} = S - S'$ and $\{j\} = S' - S$; these identities determine the axial function, $\alpha$. As for the connection, $\theta$, we note that since the axial function, $\alpha$, has the 3-independence property of Theorem 1.11.2, there is a unique connection on $\Gamma$ which is compatible with $\alpha$ in the sense that $\alpha$ and $\theta$ satisfy the properties of Theorem 1.1.2. Thus all we have to do is to produce a connection that satisfies these hypotheses, and we leave it to the reader to check that the following connection does. Let $p = p_S$ and $p' = p_{S'}$ be adjacent vertices with $\{i\} = S - S'$ and $\{j\} = S' - S$, and let $e$ be the oriented edge joining $p$ to $p'$. By Theorem 1.11.1, the set of edges, $E_p$, can be identified with the set of pairs, $(i, j) \in S \times S^c$, and $E_{p'}$ can be identified with the set of pairs, $(i', j') \in S' \times (S')^c$. Define $\theta_e : E_p \to E_{p'}$ to be the map that sends
$$\begin{aligned} (k, l) &\text{ to } (k, l) && \text{if } k \neq i \text{ and } l \neq j, \\ (i, l) &\text{ to } (j, l) && \text{if } l \neq j, \\ (k, j) &\text{ to } (k, i) && \text{if } k \neq i, \\ (i, j) &\text{ to } (j, i), \end{aligned} \tag{1.34}$$
and let $\theta$ be the connection consisting of all these maps.
We next discuss some Morse-theoretic properties of the Johnson graph. Since the Grassmannian is a coadjoint orbit of $SU(n)$, $(\Gamma, \alpha)$ has the no-cycle property described in Theorem 1.4.2. However, it is also easy to verify this directly. For every fixed point $p = p_S$, let
$$\alpha_S = \sum_{i \in S} \alpha_i, \tag{1.35}$$
and note that if $e$ is an oriented edge that joins $p_S$ to $p_{S'}$, then
$$\alpha_e = \alpha_{S'} - \alpha_S. \tag{1.36}$$
As in Section 1.3, let $P$ be the set of polarizing elements of $\mathfrak{g}$: $\xi \in P$ if and only if $\alpha_e(\xi) \neq 0$ for all $e \in E_\Gamma$. By (1.28) and (1.33),
$$\xi = (\xi_1, \ldots, \xi_n) \in P \iff \xi_i \neq \xi_j \text{ for all } i \neq j, \tag{1.37}$$
and by (1.36) it is clear that if $\xi \in P$, then the function
$$\phi^\xi : V_\Gamma \to \mathbb{R}, \qquad \phi^\xi(p_S) = \alpha_S(\xi), \tag{1.38}$$
is $\xi$-compatible.
A particularly apposite choice of $\xi$ is $\xi_i = i$, $i = 1, \ldots, n$. We claim that for this choice of $\xi$ we have the following theorem.
theorem 1.11.5 The function $\phi = \phi^\xi$ is self-indexing modulo an additive constant:
$$\phi(p_S) = \operatorname{index}(p_S) + \frac{k(k+1)}{2}. \tag{1.39}$$
Proof The index of $p_S$ is the number of edges $e \in E_{p_S}$ with $\alpha_e(\xi) < 0$; alternatively, it is the number of pairs $(i, j) \in S \times S^c$ with $\alpha_j(\xi) - \alpha_i(\xi) < 0$, which is the same as $j - i < 0$. Let $i_1 < i_2 < \cdots < i_k$ be the elements of $S$. The number of elements $j \in S^c$ with $j < i_1$ is $i_1 - 1$; the number of elements $j \in S^c$ with $j < i_2$ is $i_2 - 2$, and so on. Therefore the number of pairs $(i, j) \in S \times S^c$ with $j < i$ is $(i_1 + \cdots + i_k) - k(k+1)/2 = \phi(p_S) - k(k+1)/2$.
We conclude this description of the Johnson graph by saying a few words about the cohomology ring $H(\Gamma, \alpha)$. Let us introduce a partial ordering on $V_\Gamma$ by decreeing that for adjacent vertices $p$ and $p'$,
$$p \prec p' \iff \phi(p) < \phi(p'), \tag{1.40}$$
and, more generally, for any pair of vertices p and p , p ≺ p if there exists a sequence of adjacent vertices p = p0 ≺ p1 ≺ · · · ≺ pr = p . We prove in Section 2.4.3 (as a special case of a more general theorem) that H (, α) is a free module over S(g∗ ), with generators {τp ; p ∈ V } uniquely characterized by the following two properties: (1) (1/2) deg(τp ) = index(p); (2) the support of τp is contained in the set Fp = q ∈ V ; p ≺ q . (1.41) To reconcile this result with classical results of B. Kostant, S. Kumar, and others on the cohomology ring of the Grassmannian, we also give in Section 3.3 an alternative description of τp in terms of the Hecke algebra of divided difference operators; for this we need an alternative description of the ordering (1.40). One property of the Johnson graph which we have not yet commented on is that it is a symmetric graph. Given two pairs of adjacent vertices (p, p ) and (q, q ), one can find a permutation σ ∈ Sn with σ (p) = p and σ (q) = q . We claim that the partial ordering (1.40) is equivalent to the so-called Bruhat order on V (see [Hu]).
theorem 1.11.6 If $p$ and $p'$ are vertices of $\Gamma$, then $p \prec p'$ if and only if there exists a sequence of elementary reflections
$$\sigma_i : i \longleftrightarrow i + 1 \tag{1.42}$$
with $i = i_1, \ldots, i_m$ such that
$$m = \operatorname{index}(p') - \operatorname{index}(p), \tag{1.43}$$
$$p' = \sigma_{i_m} \circ \cdots \circ \sigma_{i_1}(p), \tag{1.44}$$
and such that $\phi$ is strictly increasing along the sequence of adjacent vertices
$$p_k = \sigma_{i_k} \circ \cdots \circ \sigma_{i_1}(p), \qquad k = 1, \ldots, m. \tag{1.45}$$
For a proof of this, see, for instance, [GHZ].
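The self-indexing identity (1.39) of Theorem 1.11.5 can be checked by brute force for small $n$ and $k$. The sketch below is illustrative only: the helper names are hypothetical, and `index_distribution` merely counts vertices by index (on the assumption, not restated in this section, that the combinatorial Betti numbers of Section 1.3 are these counts).

```python
from itertools import combinations

def index(S, n):
    """Number of pairs (i, j) in S x S^c with j < i, i.e. the number of edges
    at p_S with alpha_e(xi) < 0 for the choice xi_i = i."""
    complement = set(range(1, n + 1)) - set(S)
    return sum(1 for i in S for j in complement if j < i)

def phi(S):
    """phi(p_S) = alpha_S(xi) = sum of the elements of S when xi_i = i."""
    return sum(S)

def check_self_indexing(n, k):
    """Verify phi(p_S) = index(p_S) + k(k+1)/2 for every k-subset S."""
    return all(phi(S) == index(S, n) + k * (k + 1) // 2
               for S in combinations(range(1, n + 1), k))

def index_distribution(n, k):
    """counts[i] = number of vertices p_S of index i."""
    counts = [0] * (k * (n - k) + 1)
    for S in combinations(range(1, n + 1), k):
        counts[index(S, n)] += 1
    return counts
```

For $Gr_2(\mathbb{C}^4)$ the distribution of indices is $1, 1, 2, 1, 1$, which agrees with the even Betti numbers of that Grassmannian.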
2. Abstract 1-skeleta
2.1. Abstract 1-skeleta
If one strips the manifold scaffolding from GKM theory, one gets a graph-like object that we call an abstract 1-skeleton. Let $\mathfrak{g}^*$ be an arbitrary $n$-dimensional vector space.
Definition 2.1.1 An abstract 1-skeleton is a triple consisting of a $d$-valent graph, $\Gamma$ (with $V_\Gamma$ as vertices and $E_\Gamma$ as oriented edges), a connection, $\theta$, on the “tangent bundle” of $\Gamma$,
$$\pi : E_\Gamma \to V_\Gamma, \qquad \pi(e) = i(e) \ \text{(the initial vertex of } e\text{)},$$
and an axial function,
$$\alpha : E_\Gamma \to \mathfrak{g}^*,$$
satisfying the following axioms.
(A1) For every $p \in V_\Gamma$, the vectors $\{\alpha_e ;\ e \in E_p = \pi^{-1}(p)\}$ are pairwise linearly independent.
(A2) If $e$ is an oriented edge of $\Gamma$, and $\bar e$ is the same edge with its orientation reversed, there exist positive numbers, $m_e$ and $m_{\bar e}$, such that
$$m_{\bar e}\,\alpha_{\bar e} = -m_e\,\alpha_e. \tag{2.1}$$
(A3) Let $e \in E_\Gamma$, $p = i(e)$, and $p' = t(e)$. Let $e_i$, $i = 1, \ldots, d$, be the elements of $E_p$, and let $e_i'$, $i = 1, \ldots, d$, be their images with respect to $\theta_e$ in $E_{p'}$. Then
$$\alpha_{e_i'} = \lambda_{i,e}\,\alpha_{e_i} + c_{i,e}\,\alpha_e \quad\text{with } \lambda_{i,e} > 0 \text{ and } c_{i,e} \in \mathbb{R}. \tag{2.2}$$
Remarks. (1) We denote an abstract 1-skeleton by $(\Gamma, \alpha)$.
(2) Axioms (A1)–(A3) imply that $\theta_e(e) = \bar e$.
(3) There is a natural notion of equivalence for axial functions. Let $\theta$ be a connection on a graph $\Gamma$, and let $\alpha$ and $\alpha'$ be axial functions. We say that $\alpha$ and $\alpha'$ are equivalent axial functions if for every oriented edge $e$,
$$\alpha'_e = \lambda_e\,\alpha_e, \quad\text{with } \lambda_e > 0. \tag{2.3}$$
(4) We can always replace an axial function, $\alpha$, by an equivalent axial function for which the constants $m$ in (2.1) are 1; that is, we can assume
$$\alpha_{\bar e} = -\alpha_e. \tag{2.4}$$
(5) One can define the Betti numbers, $b_{2i}(\Gamma)$, and the cohomology ring, $H(\Gamma, \alpha)$, of an abstract 1-skeleton exactly as in Sections 1.3 and 1.7. (It is easy to check, by the way, that in our proof of the well-definedness of $b_{2i}(\Gamma)$ (Theorem 1.3.1) we can replace Theorem 1.1.2(5) by the somewhat weaker hypothesis (2.2).)
(6) It is also clear that the definitions of $b_{2i}(\Gamma)$ and of $H(\Gamma, \alpha)$ are unchanged if we replace $\alpha$ by an equivalent axial function, $\alpha'$.
(7) Let $\mathfrak{g}^*$ be, as in the first part of this paper, the dual of the Lie algebra of an $n$-dimensional torus, $G$. Suppose that, for every $e \in E_\Gamma$, $\alpha_e$ is an element of the weight lattice of $G$. We say that the abstract 1-skeleton $(\Gamma, \alpha)$ is an abstract GKM 1-skeleton if the $m$'s in (2.1) and the $\lambda$'s in (2.2) are all equal to 1; that is,
$$\alpha_{\bar e} = -\alpha_e \tag{2.5}$$
and
$$\alpha_{e_i'} = \alpha_{e_i} + c_{i,e}\,\alpha_e, \tag{2.6}$$
and the $c_{i,e}$'s are integers. We show in Section 3.1 that every abstract GKM 1-skeleton is actually the GKM 1-skeleton of a GKM manifold.
Definition 2.1.2 We say that an axial function, $\alpha$, is 3-independent if, for every $p \in V_\Gamma$, the vectors $\{\alpha_e ;\ e \in E_p\}$ are 3-independent in the sense of Theorem 1.5.1.
It is clear that if $\alpha$ and $\alpha'$ are equivalent axial functions and one of them is 3-independent, the other is as well. The hypothesis of 3-independence is frequently invoked in this section. It enables us to blow up and blow down abstract 1-skeleta and, by mimicking Theorem 1.5.1, to define an analogue of symplectic reduction for abstract 1-skeleta. It also rules out the existence of 2-cycles in $\Gamma$ (such as the three 2-cycles exhibited in Figure 3).
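To make the independence hypothesis concrete, the following sketch tests whether a finite list of vectors is $k$-independent, reading this (as in the remark on $n$-independence in Section 1.10) as: every $k$ of the vectors are linearly independent. The function names are hypothetical, and the rank is computed by exact Gaussian elimination over the rationals so that no floating-point tolerance is needed.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of integer or rational vectors, by exact row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_k_independent(vectors, k):
    """True when every k-element subfamily is linearly independent
    (vacuously true when there are fewer than k vectors)."""
    return all(rank(subset) == k for subset in combinations(vectors, k))

# Weights alpha_j - alpha_i at p_S for Gr_2(C^4), S = {1, 2}, as in (1.29).
weights = [(-1, 0, 1, 0), (-1, 0, 0, 1), (0, -1, 1, 0), (0, -1, 0, 1)]
```

For these four weights the test reports 3-independence, in line with Theorem 1.11.2, but not 4-independence, since the four vectors satisfy one linear relation.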
proposition 2.1.1 If $\alpha$ is 3-independent, every pair of adjacent vertices in $\Gamma$ is connected by a unique unoriented edge.
Proof Suppose that there are two distinct oriented edges, $e$ and $e_1$, from $p$ to $p'$; let $e' = \theta_{e_1}(e) \in E_{p'}$. Since $\alpha_{\bar e} = -\alpha_e$ and $\alpha_{e'} = \lambda\alpha_e + c\,\alpha_{e_1}$, with $\lambda > 0$, it follows that $e' \neq \bar e$. Thus the vectors $\alpha_{e'}$, $\alpha_{\bar e}$, and $\alpha_{\bar e_1}$ are distinct and coplanar, which contradicts the 3-independence of $\alpha$ at $p'$.
Another useful consequence of 3-independence is the following proposition.
proposition 2.1.2 If $\mathfrak{h}$ is a codimension-2 subspace of $\mathfrak{g}$, the graph $\Gamma_{\mathfrak{h}}$ is 2-valent.
Finally, if $\alpha$ is 3-independent, the compatibility conditions between $\theta$ and $\alpha$ imposed by Axiom (A3) determine $\theta$.
proposition 2.1.3 The connection, $\theta$, is the only connection on $\Gamma$ satisfying (2.2).
The GKM-theorem asserts that if $M$ is a GKM manifold and is equivariantly formal, then $H_G(M)$ is isomorphic to $H(\Gamma, \alpha)$. In particular, $H(\Gamma, \alpha)$ is a free module over the ring $S(\mathfrak{g}^*)$ with $b_{2i}(M)$ generators in dimension $2i$. In the second part of this paper we attempt to ascertain to what extent this theorem is true for abstract 1-skeleta with $b_{2i}(M)$ replaced by $b_{2i}(\Gamma)$. The examples we have encountered in Section 1.9 already give us some inkling of what to expect. This assertion is unlikely to be true if for some admissible orientation of $\Gamma$ (see Section 1.4) there exist oriented closed paths or if for some subspace $\mathfrak{h}$ of $\mathfrak{g}$ the totally geodesic subgraph $\Gamma_{\mathfrak{h}}$ of $\Gamma$ has fewer connected components than predicted by its combinatorial Betti number. This motivates the following definition.
Definition 2.1.3 The abstract 1-skeleton $(\Gamma, \alpha)$ is noncyclic if
(NCA1) for some vector $\xi$ in the set (1.5), $(\Gamma, \alpha)$ is $\xi$-acyclic; that is, the oriented graph $(\Gamma, o_\xi)$ has no closed paths;
(NCA2) for every codimension-2 subspace, $\mathfrak{h}$, of $\mathfrak{g}$ and for every connected component, $\Gamma_0$, of $\Gamma_{\mathfrak{h}}$,
$$b_0(\Gamma_0) = 1. \tag{2.7}$$
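Condition (NCA1) is a finite check once $\xi$ is fixed. The sketch below assumes, since the definition of $o_\xi$ lives in Section 1.4 and is not restated here, that $o_\xi$ orients each edge $e$ in the direction in which $\alpha_e(\xi) > 0$; under that assumption it tests for oriented closed paths with a standard topological sort (Kahn's algorithm). The function name and the data layout (a dictionary of weights keyed by oriented edges) are hypothetical.

```python
def is_xi_acyclic(vertices, oriented_edges, alpha, xi):
    """Return True when the orientation by the sign of alpha_e(xi) has no
    directed closed path. alpha[(p, q)] is the weight of the edge p -> q."""
    def pairing(a):
        return sum(x * y for x, y in zip(a, xi))

    out = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for (p, q) in oriented_edges:
        value = pairing(alpha[(p, q)])
        if value == 0:
            raise ValueError("xi is not polarizing: alpha_e(xi) = 0 on some edge")
        if value > 0:  # assumed convention: keep the direction with alpha_e(xi) > 0
            out[p].append(q)
            indegree[q] += 1

    # Kahn's algorithm: the orientation is acyclic iff every vertex gets removed.
    queue = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for w in out[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return removed == len(vertices)

# A triangle with weights tau_j - tau_i (anticipating Example 2.1.1).
tau = {0: (0, 0), 1: (1, 0), 2: (0, 1)}
verts = [0, 1, 2]
edges = [(i, j) for i in verts for j in verts if i != j]
alpha = {(i, j): tuple(b - a for a, b in zip(tau[i], tau[j])) for (i, j) in edges}
```

With $\xi = (1, 2)$ the triangle is oriented along increasing $\tau(\xi)$ and is therefore acyclic; an artificial assignment of weights that orients the three edges cyclically fails the test.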
For the definition of $o_\xi$, see Section 1.4. Also recall that, by Theorem 1.4.1, $\xi$-acyclicity implies the existence of a function $f : V_\Gamma \to \mathbb{R}$ that is $\xi$-compatible. In the remainder of Section 2, by 1-skeleton we mean an abstract 1-skeleton.
Examples
Example 2.1.1. The complete 1-skeleton
In this example the vertices of $\Gamma$ are the elements of the $N$-element set $V = \{p_1, \ldots, p_N\}$, and each pair of elements, $(p_i, p_j)$, $i \neq j$, is joined by an edge. We denote by $e = p_i p_j$ the oriented edge that joins $p_i = i(e)$ to $p_j = t(e)$. Thus the set of oriented edges is just the set
$$\{p_i p_j ;\ 1 \le i, j \le N,\ i \neq j\},$$
and its fiber over $p_i$ is
$$E_i = \{p_i p_j ;\ 1 \le j \le N,\ i \neq j\}.$$
A connection, $\theta$, is defined by maps, $\theta_{ij} : E_i \to E_j$, where
$$\theta_{ij}(p_i p_k) = \begin{cases} p_j p_i & \text{if } k = j, \\ p_j p_k & \text{if } k \neq i, j. \end{cases}$$
(Note that this connection is invariant under all permutations of the vertices.) Let $\tau : V \to \mathfrak{g}^*$ be any function such that $\tau_1, \ldots, \tau_N$ are 3-independent; then the function, $\alpha$, given by
$$\alpha_{p_i p_j} = \tau_j - \tau_i, \tag{2.8}$$
is an axial function compatible with $\theta$. We call $\tau : V \to S^1(\mathfrak{g}^*)$ the generating class of $\Gamma$. The following theorem describes the additive structure of the cohomology ring of $(\Gamma, \alpha)$. If $\dim(\mathfrak{g}^*) = n$, let
$$\lambda_k = \lambda_{k,n} = \begin{cases} \dim S^k(\mathfrak{g}^*) & \text{if } k \ge 0, \\ 0 & \text{if } k < 0. \end{cases} \tag{2.9}$$
theorem 2.1.1 If $(\Gamma, \alpha)$ is the complete 1-skeleton with $N$ vertices and generating class $\tau$, then
$$H^{2m}(\Gamma, \alpha) \cong \bigoplus_{k=0}^{N-1} S^{m-k}(\mathfrak{g}^*)\,\tau^k$$
for every $m \ge 0$. In particular,
$$\dim H^{2m}(\Gamma, \alpha) = \sum_{k=0}^{N-1} \lambda_{m-k}. \tag{2.10}$$
Proof The generating class $\tau \in H^2(\Gamma, \alpha)$ satisfies the relation
$$\tau^N = \sigma_1(\tau_1, \ldots, \tau_N)\,\tau^{N-1} - \sigma_2(\tau_1, \ldots, \tau_N)\,\tau^{N-2} + \cdots, \tag{2.11}$$
where $\sigma_k(\tau_1, \ldots, \tau_N) \in S^k(\mathfrak{g}^*)$ is the $k$th elementary symmetric polynomial in $\tau_1, \ldots, \tau_N$. We show that every element $f \in H^{2m}(\Gamma, \alpha)$ can be written uniquely as
$$f = \sum_{k=0}^{N-1} f_{m-k}\,\tau^k, \tag{2.12}$$
with $f_{m-k} \in S^{m-k}(\mathfrak{g}^*)$ if $k \le m$ and $f_{m-k} = 0$ if $k > m$. For $m = 0$ the statement is obvious. Assume $m > 0$, and let
$$g_m = (-1)^{N+1}\,\tau_1 \cdots \tau_N \sum_{i=1}^{N} \frac{f(p_i)}{\tau_i \prod_{j \neq i}(\tau_i - \tau_j)}. \tag{2.13}$$
A priori, $g_m$ is an element in the field of fractions of $S(\mathfrak{g}^*)$. Since $f \in H(\Gamma, \alpha)$, $\tau_i - \tau_j$ divides $f(p_i) - f(p_j)$ for all $i \neq j$, and hence all the factors in the denominator of $g_m$ are canceled, so that $g_m \in S^m(\mathfrak{g}^*)$. Moreover, from (2.13) it follows that $f(p_i) - g_m \equiv 0$ on $\tau_i \equiv 0$; therefore there exists $h_i \in S^{m-1}(\mathfrak{g}^*)$ such that
$$f(p_i) = g_m + \tau_i h_i, \qquad \forall i = 1, \ldots, N.$$
Since
$$f(p_i) - f(p_j) = (\tau_i - \tau_j)h_i + \tau_j(h_i - h_j), \qquad \forall i \neq j,$$
it follows that $\tau_i - \tau_j$ divides $h_i - h_j$ for all $i \neq j$; that is, the function $h : V_\Gamma \to S^{m-1}(\mathfrak{g}^*)$, given by $h(p_i) = h_i$, satisfies the compatibility conditions and is therefore an element of $H^{2(m-1)}(\Gamma, \alpha)$. Thus
$$f = g_m + \tau h, \tag{2.14}$$
with $h \in H^{2(m-1)}(\Gamma, \alpha)$. From the induction hypothesis, $h$ can be uniquely written as
$$h = h_{m-1} + h_{m-2}\,\tau + \cdots + h_{m-N}\,\tau^{N-1}. \tag{2.15}$$
Introducing (2.15) in (2.14) and using (2.11), we deduce that $f$ can be written in the form (2.12), and the uniqueness follows from the nondegeneracy of the Vandermonde determinant with entries $(\tau_i^k)$. If $m < k$, then the corresponding $f_{m-k}$ would have negative degree; so the only possibility is that it is zero.
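The dimension count (2.10) is easy to tabulate. The sketch below is only an illustration with hypothetical function names; it uses the standard fact that $\dim S^k(\mathfrak{g}^*)$, the number of degree-$k$ monomials in $n$ variables, equals $\binom{k+n-1}{n-1}$, and it assumes $n \ge 1$.

```python
from math import comb

def lam(k, n):
    """lambda_{k,n} of (2.9): dim S^k(g*) when dim g* = n, and 0 for k < 0."""
    return comb(k + n - 1, n - 1) if k >= 0 else 0

def dim_H(m, N, n):
    """dim H^{2m}(Gamma, alpha) for the complete 1-skeleton on N vertices,
    computed from formula (2.10)."""
    return sum(lam(m - k, n) for k in range(N))

# Triangle (N = 3) with dim g* = 2: dimensions in degrees 0, 2, 4.
dims = [dim_H(m, 3, 2) for m in range(3)]
```

For the triangle with $n = 2$ this gives $1, 3, 6$. The value $3$ in degree 2 agrees with a direct count: a map $V \to S^1(\mathfrak{g}^*)$ has $6$ free coefficients, and each of the three edges imposes one linear condition.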
Remark. This example is associated with a simple (but very important) GKM action, the action of T N on CP N−1 . Example 2.1.2. Subskeleta Let 0 be an r-valent subgraph of which is totally geodesic in the sense of Definition 1.4.2. Then the restriction of θ and α to 0 defines a connection, θ0 , and axial function, α0 , on 0 . We call (0 , α0 ) a subskeleton of (, α). Associated to the subskeleton is the notion of normal holonomy. Let 0 be a totally geodesic subgraph of , and let θ0 be the connection on 0 induced by θ. For a vertex p ∈ V0 , let Ep0 be the fiber of E0 over p, and let Np = Ep − Ep0 . For every loop γ in 0 based at p, the map σγ preserves the decomposition Ep = Ep0 ∪ Np and thus induces a permutation σγ0 of Np . Let Hol⊥ (, 0 , p) be the subgroup of #(Np ) ⊂ #(Ep ) generated by the permutations σγ0 , for all loops γ included in 0 and based at p. Again, if p1 and p2 are connected by a path in 0 , then Hol⊥ (, 0 , p1 ) and Hol⊥ (, 0 , p2 ) are isomorphic by conjugacy; so, if 0 is connected, we can define the normal holonomy group, Hol⊥ (0 , ), of 0 in to be Hol⊥ (, 0 , p) for any p ∈ 0 . We also say that 0 has trivial normal holonomy in if Hol⊥ (0i , ) is trivial for every connected component 0i of 0 . Example 2.1.3. Product 1-skeleta Let (i , αi ), i = 1, 2, be a di -valent 1-skeleta. The vertices of the product graph = 1 × 2 are the pairs, (p, q), p ∈ V1 and q ∈ V2 . Two vertices, (p, q) and (p , q ), are joined by an edge if either p = p and q and q are joined by an edge in 2 or q = q and p and p are joined by an edge in 1 . (If p is joined to p by several edges, each of these edges corresponds to an edge joining (p, q) to (p , q). We are, however, a bit careless about this fact in the paragraph below and denote (oriented) edges by the pairs of adjacent vertices they join.) If θ1 and θ2 are connections on 1 and 2 , the product connection, θ, on 1 × 2 is defined by θ(p,q;p ,q) = θp,p × (Id)q,q
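and
$$\theta_{(p,q;\,p,q')} = (\mathrm{Id})_{p,p} \times \theta_{q,q'},$$
and one can construct an axial function on $\Gamma$ compatible with $\theta$ by defining
$$\alpha\big((p, q), (p', q')\big) = \begin{cases} \alpha_1(p, p') & \text{if } q = q' \text{ and } (p, p') \in E_{\Gamma_1}, \\ \alpha_2(q, q') & \text{if } p = p' \text{ and } (q, q') \in E_{\Gamma_2}. \end{cases}$$
Then $(\Gamma, \alpha)$ is a $(d_1 + d_2)$-valent 1-skeleton that we call the direct product of $\Gamma_1$ and $\Gamma_2$.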
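The product construction can be spelled out for simple graphs stored as adjacency dictionaries. This is a hedged sketch: the names `product_graph` and `product_axial` are hypothetical, edges are denoted by the pairs of vertices they join (so multiple edges are ignored, as in the text), and the connection is not modelled.

```python
def product_graph(adj1, adj2):
    """Adjacency of the product: (p, q) ~ (p', q) when p ~ p' in the first
    factor, and (p, q) ~ (p, q') when q ~ q' in the second."""
    adj = {}
    for p in adj1:
        for q in adj2:
            adj[(p, q)] = [(p2, q) for p2 in adj1[p]] + [(p, q2) for q2 in adj2[q]]
    return adj

def product_axial(alpha1, alpha2, adj1, adj2):
    """Axial function of the product: alpha1 on edges moving in the first
    factor, alpha2 on edges moving in the second."""
    alpha = {}
    for p in adj1:
        for q in adj2:
            for p2 in adj1[p]:
                alpha[((p, q), (p2, q))] = alpha1[(p, p2)]
            for q2 in adj2[q]:
                alpha[((p, q), (p, q2))] = alpha2[(q, q2)]
    return alpha

# Two copies of a single edge, with weights (+-1, 0) and (0, +-1).
segment = {0: [1], 1: [0]}
a1 = {(0, 1): (1, 0), (1, 0): (-1, 0)}
a2 = {(0, 1): (0, 1), (1, 0): (0, -1)}
square = product_graph(segment, segment)
square_alpha = product_axial(a1, a2, segment, segment)
```

The product of two 1-valent segments is a 2-valent square whose edge weights are $(\pm 1, 0)$ and $(0, \pm 1)$, the same four weights that appear in Lemma 1.9.3.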
Example 2.1.4. Twisted products One can extend the results we have just described to twisted products. Let 0 and be two graphs, and let ψ : E0 → Aut( ) be a map such that ψ(e) = (ψ(e)) ¯ −1 for every oriented edge e. We define the twisted product of 0 and , = 0 ×ψ , as follows. The set of vertices of this new graph is V = V0 × V . Two vertices p1 = (p01 , p1 ) and p2 = (p02 , p2 ) are joined by an edge if and only if (1) p01 = p02 and p1 , p2 are joined by an edge of (these edges are called vertical) or (2) p01 is joined with p02 by an edge, e ∈ E0 , and p2 = ψ(e)(p1 ) (these edges are called horizontal). For a vertex p = (p0 , p ) ∈ V we denote by Eph the set of oriented horizontal edges issuing from p and by Epv the set of oriented vertical edges issuing from p. Then the projection π : V = V0 × V −→ V0 induces a bijection dπp : Eph → Ep0 . Let q0 ∈ V0 , and let F(0 , q0 ) be the fundamental group of 0 , that is, the set of all loops based at q0 . Every such loop, γ , induces an element ψγ ∈ Aut( ) by composing the automorphisms corresponding to its edges. Now let G be the subgroup of Aut( ) which is the image of the morphism G : F(0 , q0 ) −→ Aut( ),
G(γ ) = ψγ .
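If $\theta_0$ is a connection on $\Gamma_0$ and $\theta'$ is a $G$-invariant connection on $\Gamma'$, we get a connection on the twisted product as follows. If we require this connection to take horizontal edges to horizontal edges and vertical edges to vertical edges, we must decide how these horizontal and vertical components are related at adjacent points of $V_\Gamma$. For the horizontal component, the relation is simple. By assumption, the map
$$d\pi_p : E^h_p \to E_{p_0}, \qquad\text{where } p = (p_0, p'),$$
is a bijection; so if $p_1$ and $p_2$ are adjacent points on the same fiber above $p_0$,
$$\theta_{p_1 p_2} = d\pi_{p_2}^{-1} \circ d\pi_{p_1} : E^h_{p_1} \to E^h_{p_2} \cong E_{p_0}.$$
If $p_1 = (p_{01}, p_1')$ and $p_2 = (p_{02}, p_2')$ are adjacent points on different fibers,
$$\theta_{p_1 p_2} = d\pi_{p_2}^{-1} \circ \theta_{p_{01} p_{02}} \circ d\pi_{p_1},$$
so that the following diagram commutes:
$$\begin{array}{ccc} E^h_{p_1} & \xrightarrow{\ \theta_{p_1 p_2}\ } & E^h_{p_2} \\ \big\downarrow{\scriptstyle d\pi_{p_1}} & & \big\downarrow{\scriptstyle d\pi_{p_2}} \\ E_{p_{01}} & \xrightarrow{\ \theta_{p_{01} p_{02}}\ } & E_{p_{02}} \end{array}$$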
For the vertical component the relation is nearly as simple. Let p0 ∈ V0 , and let γ be a path in 0 from q0 to p0 . Then ψ induces an automorphism Gγ : π −1 (q0 ) −→ π −1 (p0 ) V , and we use Gγ to define a connection θp 0 on π −1 (p0 ). Since θ is G-invariant, this new connection is actually independent of γ . Thus if p1 and p2 are adjacent points on the fiber above p0 , the connection on along p1 p2 is induced on vertical edges by θp 0 . On the other hand, if p1 and p2 are adjacent points on different fibers and p0i = π(pi ), then ψp01 p02 induces a map Epv 1 → Epv 2 that is the connection on vertical edges. Note that if G is trivial, then the twisted product is actually a direct product. The axial functions on which we encounter in Section 2.2 are not, as a rule, of the product form described in Example 2.1.3. They do, however, satisfy α(e) = α0 (e0 ),
e0 = dπp (e),
(2.16)
at all p ∈ V and e ∈ Eph , α0 being a given axial function on 0 . Example 2.1.5. Fibrations An important example of twisted products is given by fibrations. Let and 0 be graphs of valence d and d0 , and let V and V0 be their vertex sets. Definition 2.1.4 A morphism of into 0 is a map f : V → V0 with the property that if p and q are adjacent points in , then either f (p) = f (q) or f (p) and f (q) are adjacent points in 0 . Let p ∈ V , p0 = f (p), let Epv be the set of oriented edges, pq, with f (p) = f (q), and let Eph = Ep − Epv . (One can regard Eph as the “horizontal” component of Ep and Epv as the “vertical” component.) By definition there is a map dfp : Ep − Epv −→ Ep0
(2.17)
which we call the derivative of $f$ at $p$.
Definition 2.1.5 A morphism, $f$, is a submersion at $p$ if the map, $df_p$, is bijective. If $df$ is bijective at all points of $V$, then we simply say that $f$ is a submersion.
theorem 2.1.2 Let $\Gamma$ and $\Gamma_0$ be connected, and let $f$ be a submersion. Then
(1) $f$ is surjective;
(2) for every $p \in V_0$, the set $V_p = f^{-1}(p)$ is the vertex set of a subgraph of valence $r = d - d_0$;
(3) if $p, q \in V_0$ are adjacent, there is a canonical bijective map
$$K_{p,q} : V_p \to V_q \tag{2.18}$$
defined by
$$K_{p,q}(p') = q' \iff p' \text{ and } q' \text{ are adjacent}; \tag{2.19}$$
(4) in particular, the cardinality of $V_p$ is the same for all $p$.
We leave the proofs of these assertions as easy exercises. Note by the way that the map (2.18) satisfies $K_{p,q}^{-1} = K_{q,p}$.
Definition 2.1.6 The submersion $f$ is a fibration if the map (2.18) preserves adjacency; that is, two points in $V_p$ are adjacent if and only if their images in $V_q$ are adjacent.
Let $f$ be a fibration, let $p_0$ be a base point in $V_0$, and let $p$ be any other point. Let $\Gamma_{p_0}$ and $\Gamma_p$ be the subgraphs of $\Gamma$ (of valence $r$) whose vertices are the points of $V_{p_0}$ and $V_p$. For every path $p_0 \to p_1 \to \cdots \to p_N = p$ in $\Gamma_0$ joining $p_0$ to $p$, there is a holonomy map
$$K_{p_{N-1}, p_N} \circ \cdots \circ K_{p_0, p_1} : V_{p_0} \to V_p \tag{2.20}$$
that preserves adjacency. Hence all the graphs $\Gamma_p$ are isomorphic (and, in particular, isomorphic to $\Gamma' := \Gamma_{p_0}$). Thus $\Gamma$ can be regarded as a twisted product of $\Gamma_0$ and $\Gamma'$. Moreover, for every closed path
$$\gamma : p_0 \to p_1 \to \cdots \to p_N = p_0,$$
there is a holonomy map
$$K_\gamma : V_{p_0} \to V_{p_0}. \tag{2.21}$$
It is clear that if this map is the identity for all $\gamma$, this twisted product is a direct product; that is, $\Gamma \cong \Gamma_0 \times \Gamma'$. From (2.21) one gets a homomorphism of the fundamental group of $\Gamma_0$ into $\mathrm{Aut}(\Gamma')$. Let $G$ be its image. Given a connection on $\Gamma_0$ and a $G$-invariant connection on $\Gamma'$, one gets, by the construction described in Example 2.1.5, a connection on $\Gamma$.
From now on, unless specified otherwise, we assume that $(\Gamma, \alpha)$ is 3-independent. Also, frequently we refer to the 1-skeleton $(\Gamma, \alpha)$ simply as $\Gamma$.
2.2. The blow-up operation
2.2.1. The blow-up of a 1-skeleton
Let $(\Gamma, \alpha)$ be a $d$-valent 1-skeleton, and let $\Gamma_0$ be a subskeleton of valence $d_0$ and covalence $s = d - d_0$. We define in this section a new $d$-valent 1-skeleton, $(\Gamma^\#, \alpha^\#)$, which we call the blow-up of $\Gamma$ along $\Gamma_0$; we also define a blowing-down map $\beta : V_{\Gamma^\#} \to V_\Gamma$, which is a morphism of graphs in the sense of Definition 2.1.4. The singular locus of this blowing-down map can be described as a twisted product of $\Gamma_0$ and a complete 1-skeleton on $s$ vertices, and $\Gamma^\#$ itself is obtained from this singular locus by gluing it to the complement of $\Gamma_0$ in $\Gamma$. The details are as follows (see Figure 4).
[Figure 4. Blow-up]
Let $V_0$ and $V$ be the vertices of $\Gamma_0$ and $\Gamma$, and let $V_2 = V - V_0$. For each $p_i \in V_0$, let $\{q_{ia} ;\ a = d_0 + 1, \ldots, d\}$ be the set of points in $V_2$ which are adjacent to $p_i$. Define a new set of vertices, $N_{p_i} = \{p_{ia} ;\ a = d_0 + 1, \ldots, d\}$, with one new vertex, $p_{ia}$, for each edge $p_i q_{ia}$. For each pair of adjacent points, $p$ and $q$, in $V_0$, the holonomy map $\theta_{p,q} : E_p \to E_q$ induces a map $K_{p,q} : N_p \to N_q$. Let $V_1$ be the disjoint union of the $N_p$'s, and let $f : V_1 \to V_0$ be the map that sends $N_p$ to $p$. We make $V_1$ into a graph, $\Gamma_1$, by decreeing that two points, $p'$ and $q'$ of $V_1$, are adjacent if and only if $f(p') = f(q')$ or $p = f(p')$ and $q = f(q')$ are adjacent and $q' = K_{p,q}(p')$. It is clear that this notion of adjacency defines a graph, $\Gamma_1$, of valence $d - 1$, and that $f$ is a fibration in the sense of Example 2.1.5. Let $p_0$ be a base point in $V_0$. The subgraph $\Gamma' = f^{-1}(p_0)$ is a complete graph on $s$ vertices; therefore we can equip it with the connection described in Example 2.1.1. This connection is invariant under all the automorphisms of $\Gamma'$, so we can, as in Example 2.1.5, take its twisted product with the connection on $\Gamma_0$ to get a connection on $\Gamma_1$. To define an axial function on $\Gamma_1$ which is compatible with this connection, we have to assume that the axial function, $\alpha$, on $\Gamma$ satisfies a GKM hypothesis of type (2.6). Fortunately, however,
(1) we have to make this assumption only for the edges of $\Gamma$ normal to $\Gamma_0$; that is, we only have to assume that $\alpha$ satisfies the condition
$$\alpha_{p_i q_{ia}} - \alpha_{p_j q_{jb}} \ \text{is a multiple of}\ \alpha_{p_i p_j} \tag{2.22}$$
for every edge, $p_i p_j$, of $\Gamma_0$, where $p_j q_{jb} = \theta_{p_i p_j}(p_i q_{ia})$;
(2) in the blow-up-blow-down construction in Section 2.3.2 in which we apply the construction that we are about to describe, the hypotheses (2.22) are satisfied.
Consider positive numbers $(n_{ia})$ such that
$$n_{ia} = n_{jb} \tag{2.23}$$
if $p_{ia}$ and $p_{jb}$ are joined by a horizontal edge. We can define an axial function, $\alpha'$, on $\Gamma_1$ as follows. On horizontal edges of $\Gamma_1$, which are of the form $p_{ia} p_{jb}$, we require that $\alpha'$ be defined by (2.16); that is, $\alpha'_{p_{ia} p_{jb}} = \alpha_{p_i p_j}$. On vertical edges that are of the form $e = p_{ia} p_{ib}$, we define $\alpha'$ by
$$\alpha'_{p_{ia} p_{ib}} = \alpha_{p_i q_{ib}} - \frac{n_{ib}}{n_{ia}}\,\alpha_{p_i q_{ia}}.$$
We now define $\Gamma^\#$. Its vertices are the set $V^\# = V_1 \sqcup V_2$, and we define adjacency in $V^\#$ as follows.
(1) Two points, $p'$ and $q'$, in $V_1$, are adjacent if they are adjacent in $\Gamma_1$.
(2) Two points, $p$ and $q$, in $V_2$, are adjacent if they are adjacent in $\Gamma$.
(3) Consider a point $p' = p_{ia} \in N_{p_i} \subset V_1$. By definition, $p'$ corresponds to a point $q_{ia} \in V_2$, which is adjacent to $p_i \in V$. Join $p' = p_{ia}$ to $q_{ia}$.
Then $\Gamma^\#$ is a graph of valence $d$ and the blowing-down map $\beta : V^\# \to V$ is defined to be equal to $f$ on $V_1$ and to the identity map on $V_2$. We define an axial function, $\alpha^\#$, by letting $\alpha^\# = \alpha'$ on edges of type 1 and $\alpha^\# = \alpha$ on edges of type 2. Thus it remains to define $\alpha^\#$ on edges of type 3. Let $p_{ia} q_{ia}$ be such an edge. Then we define
$$\alpha^\#_{q_{ia} p_{ia}} = \alpha_{q_{ia} p_i} \quad\text{and}\quad \alpha^\#_{p_{ia} q_{ia}} = \frac{1}{n_{ia}}\,\alpha_{p_i q_{ia}}.$$
We define a connection, $\theta^\#$, on $\Gamma^\#$, by letting $\theta^\#$ be equal to $\theta'$ on edges of type 1 and equal to $\theta$ on edges of type 2. Thus it remains to define $\theta^\#$ along edges of type 3. Let $p_i$ be a vertex of $V_0$, and let $q_{ia} \in V_2$ be an adjacent vertex. Then there is a holonomy map
$$\theta_{p_i q_{ia}} : E_{p_i} \to E_{q_{ia}}. \tag{2.24}$$
Moreover, one can identify $E_{p_i}$ with $E_{p_{ia}}$ as follows. If $p_j$ is a vertex of $\Gamma_0$ adjacent to $p_i$, then there is a unique vertex, $p_{jb}$, sitting over $p_j$ in $\Gamma_1$ and adjacent in $\Gamma_1$ to $p_{ia}$, by (2.19). If $q_{ib}$ is a vertex of $\Gamma$ adjacent to $p_i$ but not in $\Gamma_0$, then, by definition, it corresponds to an element, $p_{ib}$, of $N_{p_i}$. Thus we can join it to $p_{ia}$ by an edge of type 1, or, if $q_{ib} = q_{ia}$, by an edge of type 3. Composing the map (2.24) with this identification of $E_{p_{ia}}$ with $E_{p_i}$, we get a holonomy map $\theta^\#_{p_{ia} q_{ia}} : E_{p_{ia}} \to E_{q_{ia}}$.
Then $(\Gamma^\#, \alpha^\#)$ is a $d$-valent 1-skeleton, called the blow-up of $\Gamma$ along $\Gamma_0$. There exists a blowing-down map $\beta : \Gamma^\# \to \Gamma$, obtained by collapsing all $p_{ia}$'s to the corresponding $p_i$. The preimage of $\Gamma_0$ under $\beta$, called the singular locus of $\beta$, is a $(d - 1)$-valent subskeleton of $\Gamma^\#$. A particularly important case occurs when $\Gamma_0$ has trivial normal holonomy in $\Gamma$. In this case the singular locus is naturally isomorphic to the direct product of $\Gamma_0$ with $\Gamma'$, which is a complete 1-skeleton on $s = d - d_0$ vertices. Define $\tau : V_{\Gamma^\#} \to S^1(\mathfrak{g}^*)$ by
$$\tau(v) = \begin{cases} 0 & \text{if } v \in V_{\Gamma^\#} - V_{\Gamma_0^\#}, \\[2pt] \dfrac{1}{n_{ia}}\,\alpha_{p_i q_{ia}} & \text{if } v = p_{ia}. \end{cases} \tag{2.25}$$
Then $\tau \in H^2(\Gamma^\#, \alpha^\#)$ and is called the Thom class of $\Gamma_0^\#$ in $\Gamma^\#$.
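At the level of graphs, the simplest instance of this construction is the blow-up along a single vertex, where $\Gamma_0 = \{p\}$, $d_0 = 0$, $s = d$, and no holonomy enters. The following sketch covers only that special case and ignores the axial function and the connection; the function names and the labelling of the new vertices $p_a$ as pairs `("N", q_a)` are hypothetical conventions chosen for the example.

```python
def blow_up_vertex(adj, p):
    """Graph-level blow-up of a simple d-valent graph at the vertex p.
    adj maps each vertex to the list of its neighbours."""
    new = {}
    for q in adj[p]:
        # Type 1: the fibre N_p is a complete graph on the new vertices.
        # Type 3: the new vertex attached to the edge p q is joined to q.
        new[("N", q)] = [("N", r) for r in adj[p] if r != q] + [q]
    for v in adj:
        if v == p:
            continue
        # Type 2: adjacency inside V - {p} is inherited; an edge to p is
        # redirected to the new vertex sitting on that edge.
        new[v] = [w if w != p else ("N", v) for w in adj[v]]
    return new

def blow_down(v, p):
    """Blowing-down map: collapse every new vertex to p, identity elsewhere."""
    return p if isinstance(v, tuple) and v and v[0] == "N" else v

# Complete graph on 4 vertices, blown up at vertex 0.
K4 = {i: [j for j in range(4) if j != i] for i in range(4)}
B = blow_up_vertex(K4, 0)
```

Blowing up the complete graph on four vertices at one vertex yields a 3-valent graph on six vertices (the graph of a triangular prism), so the valence $d = 3$ is preserved as the construction requires.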
2.2.2. The cohomology of the singular locus
Let $\Gamma$ be a 1-skeleton, let $\Gamma_0$ be a subskeleton of covalence $s$, let $\Gamma^\#$ be the blow-up of $\Gamma$ along $\Gamma_0$, and let $\Gamma_0^\#$ be the singular locus of $\Gamma^\#$, as defined in Section 2.2.1. Let
$$\beta : \Gamma_0^\# \to \Gamma_0 \tag{2.26}$$
be the blowing-down map. One has an inclusion, $H(\Gamma) \to H(\Gamma^\#)$, and an element $f \in H(\Gamma^\#)$ is the image of an element of $H(\Gamma)$ if and only if it is constant on the fibers of $\beta$. Let $\tau \in H^2(\Gamma^\#)$ be the Thom class of $\Gamma_0^\#$ in $\Gamma^\#$, and let $\tau_0 \in H^2(\Gamma_0^\#)$ be the restriction of $\tau$ to $\Gamma_0^\#$.
lemma 2.2.1 Every element $f \in H^{2m}(\Gamma_0^\#)$ can be written uniquely as
$$f = \sum_{k=0}^{s-1} \tau_0^k\, f_{m-k}, \tag{2.27}$$
with $f_{m-k} \in H^{2(m-k)}(\Gamma_0)$ if $k \le m$ and zero otherwise.
Proof For $p \in V_0$, let $N_p = \beta^{-1}(p)$ be the fiber over $p$. By definition, $N_p$ is a complete 1-skeleton with $s$ vertices for which a generating class is the restriction of $\tau_0$ to $N_p$. If $f \in H^{2m}(\Gamma_0^\#)$, then $h$, the restriction of $f$ to $N_p$, is an element of $H^{2m}(N_p)$, and hence, by (2.12),
$$h = \sum_{k=0}^{s-1} f_{m-k}(p)\,\tau_0^k,$$
where $f_{m-k}(p) \in S^{m-k}(\mathfrak{g}^*)$ if $k \le m$ and is zero otherwise. To get (2.27) we need to show that the maps $f_k : V_0 \to S^k(\mathfrak{g}^*)$ are in $H^{2k}(\Gamma_0)$. Let $p_i, p_j \in V_0$ be joined by an edge. If $q_{ia}$, $a = d_0 + 1, \ldots, d$, are the neighbors of $p_i$ not in $\Gamma_0$, the connection along the edge $p_i p_j$ transforms the edges, $p_i q_{ia}$, into edges, $p_j q_{ja}$, $a = d_0 + 1, \ldots, d$, modulo some relabeling. Then $p_{ia}$ and $p_{ja}$ are joined by an edge in $\Gamma_0^\#$, which implies that $\alpha_{p_i p_j}$ divides $f(p_{ia}) - f(p_{ja})$ and hence that
$$f(p_{ia}) - f(p_{ja}) \equiv 0 \mod \tau_0(p_{ia}) - \tau_0(p_{ja}).$$
From
$$f(p_{ia}) - f(p_{ja}) = \sum_{k=0}^{s-1} f_{m-k}(p_j)\big(\tau^k(p_{ia}) - \tau^k(p_{ja})\big) + \sum_{k=0}^{s-1} \big(f_{m-k}(p_i) - f_{m-k}(p_j)\big)\,\tau^k(p_{ia}),$$
we deduce that, for every $a = d_0 + 1, \ldots, d$,
$$\sum_{k=0}^{s-1} \big(f_{m-k}(p_i) - f_{m-k}(p_j)\big)\,\tau^k(p_{ia}) \equiv 0 \mod \alpha_{p_i p_j}. \tag{2.28}$$
Since $\tau(p_{ia}) - \tau(p_{ib})$ is not a multiple of $\alpha_{p_i p_j}$ for $a \neq b$ (recall that $(\Gamma, \alpha)$ is assumed to be 3-independent; see the comment at the end of Section 2.1), relations (2.28) imply that every term $f_{m-k}(p_i) - f_{m-k}(p_j)$ is a multiple of $\alpha_{p_i p_j}$, which means that $f_{m-k} \in H^{2(m-k)}(\Gamma_0)$ if $k \le m$ or is zero otherwise.

2.2.3. The cohomology of the blow-up

We can now determine the additive structure of the cohomology ring of the blown-up 1-skeleton. The following identity is a graph-theoretic version of the exact sequence (1.22).

Theorem 2.2.1. There is an isomorphism
\[ H^{2m}(\Gamma^{\#}) \simeq H^{2m}(\Gamma) \oplus \bigoplus_{k=1}^{s-1} H^{2(m-k)}(\Gamma_0). \]
Proof. We show that every element $f \in H^{2m}(\Gamma^{\#})$ can be written uniquely as
\[ f = g + \sum_{k=1}^{s-1} \tau^k f_{m-k}, \tag{2.29} \]
with $g \in H^{2m}(\Gamma)$, $f_{m-k} \in H^{2(m-k)}(\Gamma_0)$ if $1 \le k \le m$ and zero if $k > m$. The restriction, $h$, of $f$ to $\Gamma_0^{\#}$, is an element of $H^{2m}(\Gamma_0^{\#})$, and, therefore, from Lemma 2.2.1 it follows that
\[ h = f_m + \sum_{k=1}^{s-1} \tau_0^k f_{m-k}. \]
But then
\[ g = f - \sum_{k=1}^{s-1} \tau_0^k f_{m-k} \]
is constant along fibers of $\beta$, implying that $g \in H^{2m}(\Gamma)$. Hence $f$ can be written as in (2.29). If
\[ f = g' + \sum_{k=1}^{s-1} \tau^k f'_{m-k} \]
is another decomposition of $f$, then $g - g'$ is supported on $\Gamma_0^{\#}$ and, therefore,
\[ 0 = g - g' + \sum_{k=1}^{s-1} \tau_0^k \bigl(f_{m-k} - f'_{m-k}\bigr), \]
which, from the uniqueness of (2.27), implies that $g = g'$ and $f_{m-k} = f'_{m-k}$ for all $k$'s.
2.3. Reduction

2.3.1. The reduced 1-skeleton

Let $(\Gamma, \alpha)$ be a d-valent noncyclic (in the sense of Definition 2.1.3) 1-skeleton, and let $\varphi : V \to \mathbb{R}$ be an injective function that is $\xi$-compatible for some polarizing vector $\xi$. The image of $\varphi$ is called the set of critical values of $\varphi$, and its complement in $\mathbb{R}$ is called the set of regular values. For each regular value, $c$, we construct a new $(d-1)$-valent 1-skeleton $(\Gamma_c, \alpha^c)$. This new 1-skeleton is called the reduced 1-skeleton of $(\Gamma, \alpha)$ at $c$. The construction we are about to describe is motivated by the geometric description of reduction in Theorem 1.5.1 (see Figure 5). (In the remainder of this section, we use the notation $(p, q)$ for an unoriented edge joining $p$ and $q$, and we use the notation $pq$ for an oriented edge with initial vertex $p$ and terminal vertex $q$.)

Consider the cross section of $\Gamma$ at $c$, consisting of all edges $(p_0, p_i)$ of $\Gamma$ such that $\varphi(p_i) < c < \varphi(p_0)$; to each such edge we associate a vertex $v_i$ of a new graph, denoted by $\Gamma_c$. Let $r$ be the index of $p_0$ and $s = d - r$. Let the other $d - 1$ oriented edges issuing from $p_0$ be denoted by $p_0 q_a$, $a = 1, \dots, d$, $a \neq i$, and let $\mathfrak{h}_a$ be the annihilator in $\mathfrak{g}$ of the 2-dimensional linear subspace of $\mathfrak{g}^*$ generated by $\alpha_{p_0 p_i}$ and $\alpha_{p_0 q_a}$. The connected component of $\Gamma_{\mathfrak{h}_a}$ that contains $p_0$, $p_i$, and $q_a$ has Betti number equal to 1; therefore there exists exactly one more edge $e_{ia} = (p_{ia}, q_{ai})$ in this component which crosses the $c$-level of $\varphi$, that is, for which $\varphi(p_{ia}) < c < \varphi(q_{ai})$. If $v_{ia}$ is the vertex of $\Gamma_c$ corresponding to $e_{ia}$, we add an edge connecting $v_i$ to $v_{ia}$. The axial function on the oriented edge $v_i v_{ia}$ is
\[ \alpha^c_{v_i v_{ia}} = \alpha_{p_0 q_a} - \frac{\alpha_{p_0 q_a}(\xi)}{\alpha_{p_0 p_i}(\xi)}\,\alpha_{p_0 p_i}. \tag{2.30} \]
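As a quick sanity check of (2.30), the following sketch treats covectors as coordinate tuples and $\xi$ as a vector, using exact rational arithmetic. The function names `pair` and `reduced_axial` and the sample numbers are illustrative assumptions, not notation from the text. The point is only that the covector produced by (2.30) vanishes on $\xi$.

```python
from fractions import Fraction

def pair(alpha, xi):
    """Evaluate the covector alpha on the vector xi (coordinate pairing)."""
    return sum(Fraction(a) * Fraction(x) for a, x in zip(alpha, xi))

def reduced_axial(alpha_qa, alpha_pi, xi):
    """Formula (2.30): alpha_qa - (alpha_qa(xi) / alpha_pi(xi)) * alpha_pi."""
    ratio = pair(alpha_qa, xi) / pair(alpha_pi, xi)
    return tuple(Fraction(a) - ratio * Fraction(b)
                 for a, b in zip(alpha_qa, alpha_pi))

# Hypothetical sample data: alpha_pi points "down" (negative on xi),
# alpha_qa points "up" (positive on xi).
xi = (1, 2, 3)
alpha_pi = (-1, 0, -1)   # alpha_pi(xi) = -4
alpha_qa = (2, 1, 0)     # alpha_qa(xi) = 4
alpha_c = reduced_axial(alpha_qa, alpha_pi, xi)
```

With these sample values the ratio in (2.30) is $-1$, so `alpha_c` is simply the sum of the two covectors, and pairing it with `xi` gives zero.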
This axial function takes values in $\mathfrak{g}^*_\xi$, the annihilator of $\xi$ in $\mathfrak{g}^*$. The reduced 1-skeleton has two connections: an “up” connection and a “down” connection. Along the edge $v_i v_{ia}$, the “down” connection of the reduced 1-skeleton is defined as follows. Let $v_i v_{ib}$ be another edge at $v_i$, corresponding to an edge $p_0 q_b$. Let $s_1 = q_b$, and let $t_2 s_2, \dots, t_{k+1} s_{k+1}$ be the edges obtained by transporting $t_1 s_1$ along the path $t_1, t_2, \dots, t_{k+1}$. The edge $t_{k+1} s_{k+1}$ corresponds to a neighbor, $v_{iab}$, of $v_{ia}$, and we define the “down” connection on the edge $v_i v_{ia}$ by requiring that it send $v_i v_{ib}$ to $v_{ia} v_{iab}$. The “up” connection is defined similarly, except that, instead of transporting $t_1 s_1$ along the bottom path in Figure 5, we transport it along the top path from $t_1$ to $t_{k+1}$.

Figure 5. Reduction (vertex labels: $q_b = s_1$, $q_a = t_0$, $p_0 = t_1$, $p_i = t_2$, $p_{ia} = t_k$, $q_{ai} = t_{k+1}$)

Theorem 2.3.1. The reduced 1-skeleton at $c$ is a $(d-1)$-valent 1-skeleton. If $(\Gamma, \alpha)$ is $l$-independent, then $(\Gamma_c, \alpha^c)$ is $(l-1)$-independent.

Proof. Let $v_i$ be a vertex of $\Gamma_c$, corresponding to the edge $e = (p_0, p_i)$ of $\Gamma$, and let $v_{ia}$ be a neighbor of $v_i$, corresponding to the edge $e_{ia} = (p_{ia}, q_{ai})$ and obtained as above by using the edge $e_a = p_0 q_a$. Let $t_0 = q_a$, $t_1 = p_0$, $t_2 = p_i$, \dots, $t_k = p_{ia}$, $t_{k+1} = q_{ai}$ be the path that connects $q_a$ and $q_{ai}$, crosses the $c$-level, and is contained in the 2-dimensional subskeleton of $\Gamma$ generated by $p_0$, $p_i$, and $q_a$ (see Figure 5). It is clear that $\alpha^c_{v_i v_{ia}}$ is a positive multiple of $\iota_\xi(\alpha_{t_1 t_0} \wedge \alpha_{t_1 t_2})$, where $\iota_\xi$ is the interior product with $\xi$. Axiom (2.2) implies that, for every $j$,
\[ \alpha_{t_j t_{j-1}} \wedge \alpha_{t_j t_{j+1}} \quad\text{and}\quad \alpha_{t_{j-1} t_{j-2}} \wedge \alpha_{t_{j-1} t_j} \]
are positive multiples of each other and, hence, that $\iota_\xi(\alpha_{t_j t_{j-1}} \wedge \alpha_{t_j t_{j+1}})$ is a positive multiple of $\iota_\xi(\alpha_{t_{j-1} t_{j-2}} \wedge \alpha_{t_{j-1} t_j})$. Therefore $\alpha^c_{v_i v_{ia}}$ is a negative multiple of $\alpha^c_{v_{ia} v_i}$, so $\alpha^c$ satisfies Definition 2.1.1(A2).

We show that $(\Gamma_c, \alpha^c)$ satisfies Definition 2.1.1(A3). Note that
\[ \alpha^c_{v_i v_{ib}} = \frac{\iota_\xi(\alpha_{t_1 t_2} \wedge \alpha_{t_1 s_1})}{\alpha_{t_1 t_2}(\xi)} \quad\text{and that}\quad \alpha^c_{v_{ia} v_{iab}} = \frac{\iota_\xi(\alpha_{t_{k+1} t_k} \wedge \alpha_{t_{k+1} s_{k+1}})}{\alpha_{t_{k+1} t_k}(\xi)}. \]
A direct computation shows that
\[ \frac{\iota_\xi(\alpha_{t_j t_{j+1}} \wedge \alpha_{t_j s_j})}{\alpha_{t_j t_{j+1}}(\xi)} - \frac{\iota_\xi(\alpha_{t_j t_{j-1}} \wedge \alpha_{t_j s_j})}{\alpha_{t_j t_{j-1}}(\xi)} \]
is a multiple of $\alpha^c_{v_i v_{ia}}$; if $\alpha_{t_{j+1} s_{j+1}} = \lambda_j \alpha_{t_j s_j} + c_j \alpha_{t_j t_{j+1}}$, then
\[ \frac{\iota_\xi(\alpha_{t_{j+1} t_j} \wedge \alpha_{t_{j+1} s_{j+1}})}{\alpha_{t_{j+1} t_j}(\xi)} = \lambda_j\, \frac{\iota_\xi(\alpha_{t_j t_{j+1}} \wedge \alpha_{t_j s_j})}{\alpha_{t_j t_{j+1}}(\xi)}, \]
and eliminating the intermediary terms, we see that
\[ \alpha^c_{v_{ia} v_{iab}} - \lambda\,\alpha^c_{v_i v_{ib}} \ \text{is a multiple of}\ \alpha^c_{v_i v_{ia}}, \tag{2.31} \]
with $\lambda = \lambda_k \cdots \lambda_1 > 0$. Definition 2.1.1(A1), as well as the statement about the $(l-1)$-independence, follows at once from the fact that the value of $\alpha^c$ at the vertex $v_i$ of $\Gamma_c$ is a linear combination of two values of $\alpha$ at a vertex $p_0$ of $\Gamma$, one of which is fixed, namely, $\alpha_{p_0 p_i}$.

2.3.2. Passage over a critical value

In this section we describe what happens to the reduced 1-skeleton at $c$ as $c$ varies; it is clear that if there is no critical value between $c$ and $c'$, then the two reduced 1-skeleta are identical. Suppose, therefore, that there exists exactly one critical value in the interval $(c, c')$ and that it is attained at the vertex $p_0$. Let $r$ be the index of $p_0$, and let $s = d - r$.

Theorem 2.3.2. $(\Gamma_{c'}, \alpha^{c'})$ is obtained from $(\Gamma_c, \alpha^c)$ by a blowing-up of $\Gamma_c$ along a complete subskeleton with $r$ vertices followed by a blowing-down along a complete subskeleton with $s$ vertices.

Proof. The modifications from $\Gamma_c$ to $\Gamma_{c'}$ are due to the edges that cross one level but not the other one; but these are exactly the edges with one endpoint $p_0$. Let $p_0 p_i$, $i = 1, \dots, r$, be the edges of $\Gamma$ with initial vertex $p_0$ that point downward, and let $p_0 q_a$, $a = r+1, \dots, d$, be the edges that point upward. Let $v_i$ be the vertex of $\Gamma_c$ associated to $(p_0, p_i)$, and let $w_a$ be the vertex of $\Gamma_{c'}$ associated to $(p_0, q_a)$, for $i = 1, \dots, r$ and $a = r+1, \dots, d$. The $v_i$'s are the vertices of a complete subskeleton $\Gamma_c^0$ of $\Gamma_c$, and the $w_a$'s are the vertices of a complete subskeleton $\Gamma_{c'}^0$ of $\Gamma_{c'}$, $\Gamma_c^0$ having trivial normal holonomy with respect to the “up” connection on $\Gamma_c$, and $\Gamma_{c'}^0$ having trivial normal holonomy with respect to the “down” connection on $\Gamma_{c'}$. (This can be shown as follows; see Figure 6. The “normal bundle” to $v_i$ is the same for all $i$'s; namely, it can be identified with the set of edges $p_0 q_a$. Moreover, the holonomy map associated with $v_i v_j$ is by definition just the identity map on this set of edges.)

Figure 6. Evolution
Consider $\mathfrak{h}_{ia}$, the annihilator of the 2-dimensional subspace of $\mathfrak{g}^*$ generated by $\alpha_{p_0 p_i}$ and $\alpha_{p_0 q_a}$; the connected component of $\Gamma_{\mathfrak{h}_{ia}}$ that contains $p_0$, $p_i$, and $q_a$ contains exactly one edge $e_{ia} = (p_{ia}, q_{ia})$ that crosses both the $c$-level and the $c'$-level. To this edge corresponds a neighbor $v_{ia}$ of $v_i$ in $\Gamma_c$ and a neighbor $w_{ai}$ of $w_a$ in $\Gamma_{c'}$. Let $\mu > 0$, and for all $i = 1, \dots, r$, $a = r+1, \dots, d$, define
\[ n_{ia} = \mu\,\alpha_{p_0 q_a}(\xi) > 0, \tag{2.32} \]
\[ n_{ai} = -\mu\,\alpha_{p_0 p_i}(\xi) > 0. \tag{2.33} \]
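Denote
\[ \tau_i = -\frac{\alpha_{p_0 p_i}}{\mu\,\alpha_{p_0 p_i}(\xi)} = \frac{\alpha_{p_0 p_i}}{n_{ai}} \quad\text{and}\quad \tau_a = \frac{\alpha_{p_0 q_a}}{\mu\,\alpha_{p_0 q_a}(\xi)} = \frac{\alpha_{p_0 q_a}}{n_{ia}}. \]
Then we have
\[ \alpha^c_{v_i v_j} = \alpha_{p_0 p_j} - \frac{\alpha_{p_0 p_j}(\xi)}{\alpha_{p_0 p_i}(\xi)}\,\alpha_{p_0 p_i} = n_{aj}(\tau_j - \tau_i), \]
\[ \alpha^{c'}_{w_a w_b} = \alpha_{p_0 q_b} - \frac{\alpha_{p_0 q_b}(\xi)}{\alpha_{p_0 q_a}(\xi)}\,\alpha_{p_0 q_a} = n_{ib}(\tau_b - \tau_a), \]
\[ \alpha^c_{v_i v_{ia}} = \alpha_{p_0 q_a} - \frac{\alpha_{p_0 q_a}(\xi)}{\alpha_{p_0 p_i}(\xi)}\,\alpha_{p_0 p_i} = n_{ia}(\tau_a + \tau_i), \]
\[ \alpha^{c'}_{w_a w_{ai}} = \alpha_{p_0 p_i} - \frac{\alpha_{p_0 p_i}(\xi)}{\alpha_{p_0 q_a}(\xi)}\,\alpha_{p_0 q_a} = n_{ai}(\tau_i + \tau_a). \]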
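The four identities above are pure linear algebra, so they can be checked numerically. The sketch below does this with exact fractions for one hypothetical choice of $\xi$, $\mu$ and edge covectors; all sample values and helper names (`pair`, `comb`, `reduce_at`, `tau`) are assumptions made for illustration. Note that $n_{ia}$ depends only on $a$ and $n_{ai}$ only on $i$, which is why the helpers `n_up` and `n_down` take a single covector.

```python
from fractions import Fraction as F

def pair(alpha, xi):
    """Evaluate the covector alpha on the vector xi."""
    return sum(F(a) * F(x) for a, x in zip(alpha, xi))

def comb(c1, v1, c2, v2):
    """Return the linear combination c1*v1 + c2*v2 of two covectors."""
    return tuple(F(c1) * F(a) + F(c2) * F(b) for a, b in zip(v1, v2))

def reduce_at(alpha, fixed, xi):
    """alpha - (alpha(xi) / fixed(xi)) * fixed, the pattern of (2.30)."""
    return comb(1, alpha, -pair(alpha, xi) / pair(fixed, xi), fixed)

# Hypothetical data: two downward edges (negative on xi), two upward edges.
xi, mu = (1, 1, 1), F(1, 2)
a_pi, a_pj = (-1, 0, -1), (0, -3, 1)
a_qa, a_qb = (2, 1, 0), (1, 0, 4)

n_up = lambda a_q: mu * pair(a_q, xi)       # n_{ia} of (2.32)
n_down = lambda a_p: -mu * pair(a_p, xi)    # n_{ai} of (2.33)
tau = lambda a, n: tuple(F(x) / n for x in a)

t_i, t_j = tau(a_pi, n_down(a_pi)), tau(a_pj, n_down(a_pj))
t_a, t_b = tau(a_qa, n_up(a_qa)), tau(a_qb, n_up(a_qb))

identities = [
    reduce_at(a_pj, a_pi, xi) == comb(n_down(a_pj), t_j, -n_down(a_pj), t_i),
    reduce_at(a_qb, a_qa, xi) == comb(n_up(a_qb), t_b, -n_up(a_qb), t_a),
    reduce_at(a_qa, a_pi, xi) == comb(n_up(a_qa), t_a, n_up(a_qa), t_i),
    reduce_at(a_pi, a_qa, xi) == comb(n_down(a_pi), t_i, n_down(a_pi), t_a),
]
```

Each entry of `identities` corresponds, in order, to one of the four displayed formulas.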
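Let $\Gamma_c^0$ be the subskeleton of $\Gamma_c$ with vertices $\{v_1, \dots, v_r\}$. Along the edge $v_i v_j$, the connection transports the edge $v_i v_{ia}$ to $v_j v_{ja}$. Noting that $n_{ia} = n_{ja}$ and that
\[ \alpha^c_{v_j v_{ja}} - \alpha^c_{v_i v_{ia}} = \frac{n_{ia}}{n_{aj}}\,\alpha^c_{v_i v_j}, \]
that is, that $\Gamma_c$ satisfies (2.22) on edges of $\Gamma_c$ normal to $\Gamma_c^0$, we can define the blow-up $\Gamma_c^{\#}$ of $\Gamma_c$ along $\Gamma_c^0$ by means of the positive numbers $n_{ia}$, $i = 1, \dots, r$ and $a = r+1, \dots, d$. The singular locus, $\Gamma_0^{\#}$, is, as a graph, a product of two complete graphs, $\Delta_r \times \Delta_s$; each vertex $z_{ia}$ corresponds to a pair $(v_i, w_a)$, and the blow-down map $\beta : \Gamma_0^{\#} \to \Gamma_c^0$ sends $z_{ia}$ to $v_i$. There are edges connecting $z_{ia}$ with $v_{ia}$, $z_{ia}$ with $z_{ib}$, and $z_{ia}$ with $z_{ja}$, for all distinct $i, j = 1, \dots, r$ and $a, b = r+1, \dots, d$. The values of the axial function $\alpha^{\#}$ on these edges are
\[ \alpha^{\#}_{z_{ia} v_{ia}} = \frac{1}{n_{ia}}\,\alpha^c_{v_i v_{ia}} = \tau_a + \tau_i = -\frac{\mu}{n_{ia} n_{ai}}\,\iota_\xi(\alpha_{p_0 p_i} \wedge \alpha_{p_0 q_a}), \]
\[ \alpha^{\#}_{z_{ia} z_{ib}} = \alpha^c_{v_i v_{ib}} - \frac{n_{ib}}{n_{ia}}\,\alpha^c_{v_i v_{ia}} = n_{ib}(\tau_b - \tau_a) = \frac{\mu}{n_{ia}}\,\iota_\xi(\alpha_{p_0 q_a} \wedge \alpha_{p_0 q_b}), \]
\[ \alpha^{\#}_{z_{ia} z_{ja}} = \alpha^c_{v_i v_j} = n_{aj}(\tau_j - \tau_i) = -\frac{\mu}{n_{ai}}\,\iota_\xi(\alpha_{p_0 p_i} \wedge \alpha_{p_0 p_j}). \]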
While $\alpha^{\#}_{z_{ia} v_{ia}}$ and $\alpha^{\#}_{z_{ia} z_{ib}}$ are not collinear (since $\tau_a$, $\tau_b$, and $\tau_i$ are independent), and neither are $\alpha^{\#}_{z_{ia} v_{ia}}$ and $\alpha^{\#}_{z_{ia} z_{ja}}$, it may happen that $\alpha^{\#}_{z_{ia} z_{ib}}$ and $\alpha^{\#}_{z_{ia} z_{ja}}$ are collinear. We can, however, circumvent this problem by means of the following lemma (which we leave as an easy exercise).

Lemma 2.3.1. Let $\omega_1, \omega_2, \omega_3, \omega_4 \in \mathfrak{g}^*$ be 3-independent, and suppose that for some $\xi \in \mathfrak{g}$, $\iota_\xi(\omega_1 \wedge \omega_2)$ and $\iota_\xi(\omega_3 \wedge \omega_4)$ are collinear. Then the 2-planes generated by $\{\omega_1, \omega_2\}$ and $\{\omega_3, \omega_4\}$ intersect in a line. Moreover, if $\{\omega_0\}$ is a basis for this line, then $\omega_0(\xi) = 0$.

Therefore $\alpha^{\#}_{z_{ia} z_{ib}}$ and $\alpha^{\#}_{z_{ia} z_{ja}}$ are collinear precisely when $\xi$ belongs to a predetermined hyperplane; so we can ensure 2-independence for $\Gamma_c^{\#}$ by avoiding a finite number of such hyperplanes.

Definition 2.3.1. An element $\xi \in \mathfrak{g}$ is called generic for the 1-skeleton $(\Gamma, \alpha)$ if for every vertex, $p$, and every quadruple of distinct edges $e_1$, $e_2$, $e_3$, and $e_4$ in $E_p$, the vectors $\iota_\xi(\alpha_{e_1} \wedge \alpha_{e_2})$ and $\iota_\xi(\alpha_{e_3} \wedge \alpha_{e_4})$ are linearly independent.

If $\Gamma$ has valence 3, then every element is generic. In general, for every element, $\xi$, of $\mathcal{P}$ and every neighborhood of $\xi$ in $\mathcal{P}$, there exists a generic element, $\xi'$, in that neighborhood, such that the orientations $o_\xi$ and $o_{\xi'}$ are the same, and the reduced skeleta corresponding to $\xi$ and $\xi'$ have isomorphic underlying graphs.

We now return to the proof of Theorem 2.3.2. For a generic $\xi \in \mathfrak{g}$, the blow-up $\Gamma_c^{\#}$ can still be defined. Note that
\[ \frac{1}{n_{ia}}\,\alpha^c_{v_i v_{ia}} = \frac{1}{n_{ai}}\,\alpha^{c'}_{w_a w_{ai}} \quad\text{and}\quad \alpha^c_{v_i v_{ib}} - \frac{n_{ib}}{n_{ia}}\,\alpha^c_{v_i v_{ia}} = \alpha^{c'}_{w_a w_b}. \]
These relations imply that $\Gamma_c^{\#}$ is the same as the blow-up, $\Gamma_{c'}^{\#}$, of $\Gamma_{c'}$ along $\Gamma_{c'}^0$ using the $n_{ai}$'s. Therefore, for generic $\xi$, the passage from $\Gamma_c$ to $\Gamma_{c'}$ is equivalent to a blow-up from $\Gamma_c$ to $\Gamma^{\#} = \Gamma_c^{\#}$ followed by a blow-down from $\Gamma^{\#} = \Gamma_{c'}^{\#}$ to $\Gamma_{c'}$.

2.3.3. The changes in cohomology

We now describe how the cohomology changes as one passes from $\Gamma_c$ to $\Gamma_{c'}$. For this we use a variant of Theorem 2.2.1. This theorem itself cannot be applied directly since the reduced 1-skeleta might not be 3-independent. However, the proof of Lemma 2.2.1 is valid up to assertion (2.28) without this assumption, and beyond this point it suffices to assume that $\xi$ is generic. Combining Theorems 2.2.1 and 2.1.1, we conclude that
\[ \begin{aligned} \dim H^{2m}(\Gamma^{\#}) &= \dim H^{2m}(\Gamma_c) + \sum_{k=1}^{s-1} \dim H^{2(m-k)}(\Delta_r) \\ &= \dim H^{2m}(\Gamma_c) + \sum_{k=1}^{s-1} \sum_{l=0}^{r-1} \lambda_{m-k-l} \\ &= \dim H^{2m}(\Gamma_c) + \sum_{k=1}^{s-1} \sum_{l=1}^{r-1} \lambda_{m-k-l} + \sum_{k=1}^{s-1} \lambda_{m-k}. \end{aligned} \]
Therefore
\[ \dim H^{2m}(\Gamma^{\#}) = \dim H^{2m}(\Gamma_c) + \sum_{k=1}^{r-1} \sum_{l=1}^{s-1} \lambda_{m-k-l} + \sum_{k=1}^{s-1} \lambda_{m-k} \]
and
\[ \dim H^{2m}(\Gamma^{\#}) = \dim H^{2m}(\Gamma_{c'}) + \sum_{l=1}^{s-1} \sum_{k=1}^{r-1} \lambda_{m-k-l} + \sum_{l=1}^{r-1} \lambda_{m-l}, \]
which imply that
\[ \dim H^{2m}(\Gamma_{c'}) = \dim H^{2m}(\Gamma_c) + \sum_{k=1}^{s-1} \lambda_{m-k} - \sum_{k=1}^{r-1} \lambda_{m-k}. \tag{2.34} \]
(Note that all $\lambda_j$'s in the displays above are $\lambda_{j,n-1}$'s, since the dimension of $\mathfrak{g}^*_\xi$ is $n - 1$.) Since
\[ \lambda_{a,n} = \sum_{j=0}^{a} \lambda_{j,n-1}, \tag{2.35} \]
equality (2.34) can be written as
\[ \dim H^{2m}(\Gamma_{c'}) = \dim H^{2m}(\Gamma_c) + \lambda_{m-r,n} - \lambda_{m-s,n}. \tag{2.36} \]
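The passage from (2.34) to (2.36) is a telescoping argument based on (2.35), and it can be checked mechanically. The sketch below assumes that $\lambda_{m,n}$ denotes the dimension of the space of degree-$m$ polynomials in $n$ variables, that is, $\binom{m+n-1}{n-1}$ for $m \ge 0$ and zero for $m < 0$; this reading of definition (2.9), which lies outside this section, is an assumption, as are the function names.

```python
from math import comb

def lam(m, n):
    """Assumed meaning of lambda_{m,n}: dim of degree-m polynomials in n variables."""
    return comb(m + n - 1, n - 1) if m >= 0 else 0

def correction_2_34(m, r, s, n):
    """Correction term in (2.34), where lambda_j stands for lambda_{j,n-1}."""
    return (sum(lam(m - k, n - 1) for k in range(1, s))
            - sum(lam(m - k, n - 1) for k in range(1, r)))

def correction_2_36(m, r, s, n):
    """Correction term in (2.36)."""
    return lam(m - r, n) - lam(m - s, n)

def check_2_35(a, n):
    """Identity (2.35): lambda_{a,n} equals the sum of lambda_{j,n-1} for j = 0..a."""
    return lam(a, n) == sum(lam(j, n - 1) for j in range(a + 1))
```

Under this assumption both sums in (2.34) telescope to $\lambda_{m-1,n} - \lambda_{m-s,n}$ and $\lambda_{m-1,n} - \lambda_{m-r,n}$ respectively, which is why the two correction terms agree for every $m$, $r \ge 1$, $s \ge 1$ and $n \ge 2$.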
2.4. The additive structure of $H(\Gamma, \alpha)$

2.4.1. Symplectic cutting

In this section we use the results above to draw some conclusions about $H(\Gamma, \alpha)$ itself. We do this by mimicking, in our graph-theoretic setting (see Figure 7), the symplectic cutting operation of E. Lerman [Le]. Let $L$ be the “edge” graph, with two vertices labeled zero and 1 and one edge connecting them. Consider $L^+ = (L, \alpha^+)$, with the axial function given by $\alpha^+_{01} = 1$, $\alpha^+_{10} = -1$, and $L^- = (L, \alpha^-)$, with the axial function $\alpha^-_{01} = -1$, $\alpha^-_{10} = 1$. Here $\alpha^{\pm} : E_L \to \mathbb{R}^* \simeq \mathbb{R}$. For $\mathbb{R}$ we have the basis $\{1\}$, and for its dual $\mathbb{R}^*$ we have the basis $\{1\}$. For both these axial functions, $1 \in \mathbb{R}$ is polarizing. Finally, let $\varphi_0^{\pm} : V_L \to \mathbb{R}$ be given by $\varphi_0^{\pm}(0) = 0$, $\varphi_0^{\pm}(1) = \pm 1$. Then $\varphi_0^{\pm}$ is $o_1$-compatible for $\alpha^{\pm}$.

Let $(\Gamma, \alpha)$ be a 1-skeleton that is 3-independent and noncyclic in the sense of Definition 2.1.3. Let $\varphi : V \to \mathbb{R}$ be $\xi$-compatible for some $\xi \in \mathcal{P}$, and choose $a > \varphi_{\max} - \varphi_{\min} > 0$. For $c \in (\varphi_{\min}, \varphi_{\max})$, let $(\Gamma_c, \alpha^c)$ be the reduced 1-skeleton of $\Gamma$ at $c$. Consider the product 1-skeleton $(\Gamma \times L^+, \alpha \times \alpha^+)$, with
\[ \alpha \times \alpha^+ : E_{\Gamma \times L} \longrightarrow \mathfrak{g}^* \oplus \mathbb{R}^* \simeq (\mathfrak{g} \oplus \mathbb{R})^*. \]
This 1-skeleton is also 3-independent and noncyclic, and the function
\[ I_+(p, t) = \varphi(p) + a\varphi_0^+(t) = \varphi(p) + at \]
is $(\xi, 1)$-compatible. Define $\Gamma_{\varphi \le c}$ to be the reduced 1-skeleton of $(\Gamma \times L^+, \alpha \times \alpha^+)$ at $I_+ = c$. The vertices of $\Gamma_{\varphi \le c}$ correspond to two types of edges of $\Gamma \times L^+$:
(1) $((p_i, 0), (p_j, 0))$, with $\varphi(p_i) < c < \varphi(p_j)$;
(2) $((p_i, 0), (p_i, 1))$, with $\varphi(p_i) < c$.
Let $v_{ij}$ be the vertex of $\Gamma_{\varphi \le c}$ corresponding to an edge $((p_i, 0), (p_j, 0))$, and let $w_i$ be the vertex corresponding to an edge $((p_i, 0), (p_i, 1))$. The neighbors of $w_i$ are of two types:
(1) $w_k$ if $(p_i, p_k)$ is an edge of $\Gamma$ and $\varphi(p_k) < c$;
(2) $v_{ij}$ if $(p_i, p_j)$ is an edge of $\Gamma$ and $\varphi(p_j) > c$.
As for neighbors of $v_{ij}$, apart from $w_i$, they are precisely the neighbors of $v_{ij}$ in the reduced 1-skeleton $\Gamma_c$. This $\Gamma_c$ sits inside $\Gamma_{\varphi \le c}$ as the subgraph with vertices $v_{ij}$.

Figure 7. Cutting

Using (2.30) we deduce that the axial function of $\Gamma_{\varphi \le c}$, which we denote by $\beta^+$, is given by
\[ \beta^+_{w_i w_k} = \alpha_{p_i p_k} - \alpha_{p_i p_k}(\xi) \cdot 1, \]
\[ \beta^+_{v_{ij} w_i} = -\frac{1}{\alpha_{p_j p_i}(\xi)}\,\alpha_{p_j p_i} + 1, \]
\[ \beta^+_{v_{ij} v_{hr}} = \alpha_{p_j p_a} - \frac{\alpha_{p_j p_a}(\xi)}{\alpha_{p_j p_i}(\xi)}\,\alpha_{p_j p_i} = \alpha^c_{v_{ij} v_{hr}}. \]
The axial function $\beta^+$ takes values in $(\mathfrak{g} \oplus \mathbb{R})^*_{(\xi,1)} \subset \mathfrak{g}^* \oplus \mathbb{R}^*$. However, there is a natural isomorphism $\mathfrak{g}^* \to (\mathfrak{g} \oplus \mathbb{R})^*_{(\xi,1)}$, given by
\[ \sigma \longmapsto (\sigma, -\sigma(\xi) \cdot 1). \tag{2.37} \]
So we can regard $\beta^+$ as taking values in $\mathfrak{g}^*$, and, as such, it is given by
\[ \beta^+_{w_i w_k} = \alpha_{p_i p_k}, \qquad \beta^+_{v_{ij} w_i} = -\frac{1}{\alpha_{p_j p_i}(\xi)}\,\alpha_{p_j p_i}, \qquad \beta^+_{v_{ij} v_{hr}} = \alpha^c_{v_{ij} v_{hr}}. \]
Similarly, one can define $\Gamma_{\varphi \ge c}$ as the reduced 1-skeleton of $(\Gamma \times L^-, \alpha \times \alpha^-)$ at $I_- = c$, where
\[ I_-(p, t) = \varphi(p) + a\varphi_0^-(t) = \varphi(p) - at \]
is $(\xi, 1)$-compatible. Note that if $\xi$ is generic, then $(\xi, 1)$ is generic as well.

2.4.2. The dimension of $H(\Gamma, \alpha)$

We now apply (2.34) several times to suitably chosen 1-skeleta to get the following result, which, in some sense, implies the main results of this article.

Theorem 2.4.1. Let $(\Gamma, \alpha)$ be a d-valent 1-skeleton that is 3-independent and noncyclic. Then
\[ \dim H^{2m}(\Gamma, \alpha) = \sum_{k=0}^{d} b_{2k}(\Gamma)\,\lambda_{m-k,n}, \tag{2.38} \]
the $\lambda$'s being defined by (2.9).
Proof. Let $\xi \in \mathfrak{g}$ be a generic element of $\mathcal{P}$, let $\varphi : V \to \mathbb{R}$ be $\xi$-compatible, and let $\Gamma_{\varphi \le c}$ be the 1-skeleton defined in the previous section. If there is only one vertex $(p, t)$ with $I_+(p, t) < c_0$, this 1-skeleton is $\Delta_{d+1}$, the complete 1-skeleton with $d + 1$ vertices, and if $\varphi_{\max} < c_1 < a + \varphi_{\min}$, this 1-skeleton is just $(\Gamma, \alpha)$. Therefore, by studying the change in the cohomology of $\Gamma_{\varphi \le c}$ as $c$ varies, we can determine the additive structure of $H(\Gamma, \alpha)$ from the additive structure of $H(\Delta_{d+1})$.

Let $c_0 < a < b < c_1$ be such that there is exactly one vertex $p \in V$ with $a < I_+(p, t) < b$. If the index of $p$ in $\Gamma$ is $\sigma(p) = r$, then the index of $(p, 0)$ in $\Gamma \times L^+$ is also $r$. Note that since the zeroth Betti number of $\Gamma$ is 1, $r$ cannot be zero. Also, in this case, $1 \le s = d + 1 - r < d + 1$. Thus we can apply (2.34) to obtain
\[ \dim H^{2m}(\Gamma_{\varphi \le b}) = \dim H^{2m}(\Gamma_{\varphi \le a}) + \sum_{k=1}^{d-r} \lambda_{m-k} - \sum_{k=1}^{r-1} \lambda_{m-k}. \]
Adding together these changes, we get
\[ \begin{aligned} \dim H^{2m}(\Gamma, \alpha) &= \dim H^{2m}(\Delta_{d+1}) + \sum_{\sigma(p) > 0} \Biggl( \sum_{k=1}^{d-\sigma(p)} \lambda_{m-k} - \sum_{k=1}^{\sigma(p)-1} \lambda_{m-k} \Biggr) \\ &= \lambda_m + \sum_{\sigma(p) \ge 0} \Biggl( \sum_{k=1}^{d-\sigma(p)} \lambda_{m-k} - \sum_{k=1}^{\sigma(p)-1} \lambda_{m-k} \Biggr). \end{aligned} \]
The minimum value for $k$ is 1, and the maximum is $d$; $\lambda_{m-k}$ appears in the first sum when $\sigma(p) \le d - k$ and in the second one when $\sigma(p) \ge k + 1$. Therefore
\[ \dim H^{2m}(\Gamma, \alpha) = \lambda_m + \sum_{k=1}^{d} \Biggl[ \sum_{l=0}^{d-k} b_{2l}(\Gamma) - \sum_{l=k+1}^{d} b_{2l}(\Gamma) \Biggr] \lambda_{m-k}. \]
Because of the relations $b_{2d-2l}(\Gamma) = b_{2l}(\Gamma)$ (see (1.8)), the expression in brackets reduces to $b_{2k}(\Gamma)$, and therefore
\[ \dim H^{2m}(\Gamma, \alpha) = \sum_{k=0}^{d} b_{2k}(\Gamma)\,\lambda_{m-k,n}. \]
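The last step of the proof uses only the symmetry $b_{2d-2l} = b_{2l}$: reindexing the second sum by $l \mapsto d - l$ turns it into $\sum_{l=0}^{d-k-1} b_{2l}$, so the bracket collapses to $b_{2(d-k)} = b_{2k}$. The sketch below checks this on sample sequences; the list encoding `betti[l]` for $b_{2l}$, the function names, and the sample sequences themselves are illustrative assumptions.

```python
def bracket(betti, k):
    """sum_{l=0}^{d-k} b_{2l} - sum_{l=k+1}^{d} b_{2l}, with betti[l] = b_{2l}."""
    d = len(betti) - 1
    return sum(betti[: d - k + 1]) - sum(betti[k + 1:])

def bracket_reduces_to_b2k(betti):
    """True if the bracket equals b_{2k} for every k = 1, ..., d."""
    d = len(betti) - 1
    return all(bracket(betti, k) == betti[k] for k in range(1, d + 1))

symmetric_example = [1, 1, 2, 1, 1]    # satisfies b_{2d-2l} = b_{2l}
asymmetric_example = [1, 2, 3]         # violates the symmetry
```

For a sequence without the symmetry the reduction generally fails, which shows that relation (1.8) is really needed at this point.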
2.4.3. Generators for $H(\Gamma, \alpha)$

We can sharpen the result above by constructing a set of generators for $H(\Gamma, \alpha)$ with nice support conditions. Let $\varphi : V \to \mathbb{R}$ be $\xi$-compatible for $\xi \in \mathcal{P}$. For $p \in V$, let $F_p$ be the flow-out of $p$, that is, the set of vertices of the oriented graph $(\Gamma, o_\xi)$ which can be reached by a positively oriented path starting from $p$.

Theorem 2.4.2. If $p \in V$ is a vertex of index $r$, then there exists an element $\tau_p \in H^{2r}(\Gamma, \alpha)$, with the following properties:
(1) $\tau_p$ is supported on $F_p$;
(2) $\tau_p(p) = \prod \alpha_e$, the product over edges $e \in E_p$ with $\alpha_e(\xi) < 0$.

Proof. We first recall a construction used in [GZ, Sec. 2.9]. For every regular value $c \in \mathbb{R}$, let $H_c(\Gamma, \alpha)$ be the subring of those maps $f \in H(\Gamma, \alpha)$ that are supported on the set $\varphi \ge c$. Now consider regular values $c$, $c'$ such that there is exactly one vertex, $p$, satisfying $c < \varphi(p) < c'$. Let $r = \sigma(p)$ be the index of $p$, and let $\alpha_1, \dots, \alpha_r$ be the values of the axial function on the edges pointing down from $p$. Consider the restriction map
\[ H_c^{2m}(\Gamma, \alpha) \longrightarrow S^m(\mathfrak{g}^*), \qquad f \longmapsto f(p). \]
The image of this map is contained in $\alpha_1 \cdots \alpha_r S^{m-r}(\mathfrak{g}^*)$, and the kernel is $H_{c'}^{2m}(\Gamma, \alpha)$, so we have an exact sequence
\[ 0 \longrightarrow H_{c'}^{2m}(\Gamma, \alpha) \longrightarrow H_c^{2m}(\Gamma, \alpha) \longrightarrow \alpha_1 \cdots \alpha_r S^{m-r}(\mathfrak{g}^*). \tag{2.39} \]
Therefore
\[ \dim H_c^{2m}(\Gamma, \alpha) - \dim H_{c'}^{2m}(\Gamma, \alpha) \le \lambda_{m-r}. \tag{2.40} \]
So when we go from $c < \varphi_{\min}$ to $c > \varphi_{\max}$ and add together the inequalities (2.40), we get the inequality
\[ \dim H^{2m}(\Gamma, \alpha) \le \sum_{r=0}^{d} b_{2r}(\Gamma)\,\lambda_{m-r}. \tag{2.41} \]
But we proved that (2.41) is actually an equality, so all the inequalities (2.40) are equalities, which means that the right arrow in (2.39) is surjective. In particular, when 2σ (p) (, α) with m = r(= σ (p)), there exists an element τp ∈ Hc αe , τp (p) = α1 · · · αr = e∈Ep ,αe (ξ ) c,
if φ(q) < c, if φ(q) > c.
It is clear that f = h+ + h− and that h± ∈ H (, α). By a slight modification of the proof of Theorem 2.4.4, one can prove the following corollary. corollary 2.5.1 The dimension of ker{Kc : H 2m () → H 2m (c )} is 2m dim Hc2m λm−σ (q),n + λm−d+σ (p),n , + () + dim Hc− () = φ(q)>c
φ(p) c
and
$H_c^-(\Gamma) = \{ h \in H(\Gamma, \alpha);\ h \text{ is supported on } \varphi < c \}$.
2.5.3. The surjectivity of the Kirwan map

We finally prove the graph-theoretic analogue of Theorem 1.8.1.

Theorem 2.5.3. For generic $\xi \in \mathcal{P}$, the Kirwan map $K_c$ is surjective.

Proof. We show that $K_c : H(\Gamma, \alpha) \to H(\Gamma_c, \alpha^c)$ is surjective by a dimension count, using induction on the number of vertices $p \in V$ that lie under the level $\varphi = c$. To start, assume there is only one such vertex, $p$. Then $p$ is a minimum of $\varphi$ and the reduced space $\Gamma_c$ is a complete 1-skeleton with $d$ vertices, $v_1, \dots, v_d$, one for each edge $e_i$ issuing from $p$. Let $f \in H^{2m}(\Gamma_c, \alpha^c)$. Then, by (2.12), there exists $f_0 \in S(\mathfrak{g}^*) \subset H(\Gamma, \alpha)$ such that $\rho_{\alpha_{e_i}}(f_0) = f(v_i)$ for all $i$'s. Hence $K_c(f_0) = f$.

Let us assume now that the map, $K_c$, is surjective at the level $a$, and let us choose a regular value $b > a$ such that there is exactly one vertex, $p$, with $a < \varphi(p) < b$. Let $r = \sigma(p)$ and $s = d - r$. We have two exact sequences
\[ 0 \longrightarrow \ker(K_a) \longrightarrow H^{2m}(\Gamma, \alpha) \xrightarrow{\ K_a\ } H^{2m}(\Gamma_a, \alpha^a) \longrightarrow 0 \tag{2.44} \]
and
\[ 0 \longrightarrow \ker(K_b) \longrightarrow H^{2m}(\Gamma, \alpha) \xrightarrow{\ K_b\ } H^{2m}(\Gamma_b, \alpha^b). \]
The last arrow in (2.44) is surjective because of our inductive assumption. By Corollary 2.5.1,
\[ \dim \ker(K_b) - \dim \ker(K_a) = \lambda_{m-s,n} - \lambda_{m-r,n}, \tag{2.45} \]
and, by (2.45) and (2.36),
\[ \begin{aligned} \dim \ker(K_b) + \dim H^{2m}(\Gamma_b) &= \dim \ker(K_a) + \dim H^{2m}(\Gamma_a) + (\lambda_{m-s,n} - \lambda_{m-r,n}) + (\lambda_{m-r,n} - \lambda_{m-s,n}) \\ &= \dim H^{2m}(\Gamma, \alpha). \end{aligned} \]
This proves that
\[ \dim(\operatorname{im}(K_b)) = \dim H^{2m}(\Gamma) - \dim \ker(K_b) = \dim H^{2m}(\Gamma_b) \]
and hence that $K_b$ is surjective.
3. Applications

3.1. The realization theorem

Recall that an abstract 1-skeleton, $(\Gamma, \alpha)$, is an abstract GKM 1-skeleton if $\alpha$ satisfies (2.5) and (2.6) and if the constants $c_{i,e}$ in (2.6) are integers. In this section we prove that all such abstract 1-skeleta can be realized as the GKM-skeleta of GKM spaces.

Theorem 3.1.1. If $(\Gamma, \alpha)$ is an abstract GKM 1-skeleton, there exists a complex manifold $M$ and a GKM action of $G$ on $M$ for which $(\Gamma, \alpha)$ is its GKM 1-skeleton.

Remarks. (1) The manifold $M$ which we construct below is not compact, and there does not appear to be a canonical compactification of it. For some interesting noncanonical compactifications of it, see [GKT]. (2) The manifold $M$ is also not equivariantly formal, but it does have the property that the canonical map of $H_G(M)$ into $H(\Gamma, \alpha)$ is surjective.

Proof of Theorem 3.1.1. Our construction of $M$ involves three steps. First we construct the $\mathbb{CP}^1$'s corresponding to the edges of $\Gamma$; then, for each of these $\mathbb{CP}^1$'s, we construct a tubular neighborhood of it in $M$. Then we construct $M$ itself by gluing these tubular neighborhoods together.
Let $\rho_\alpha$ be the 1-dimensional representation of $G$ with weight $\alpha$, and let $V_\alpha \simeq \mathbb{C}$ be the vector space on which this representation lives. Let $G$ act on $V_\alpha \oplus \mathbb{C}$ by acting by $\rho_\alpha$ on the first factor and by the trivial representation on the second factor. This action induces an action of $G$ on the projectivization
\[ X_\alpha = \mathbb{CP}^1 = \mathbb{P}(V_\alpha \oplus \mathbb{C}). \tag{3.1} \]
The points $q = [1 : 0]$ and $p = [0 : 1]$ are the two fixed points of this action, and there are equivariant bijective maps
\[ V_\alpha \longrightarrow X_\alpha - \{q\}, \qquad c \longmapsto [c : 1], \tag{3.2} \]
and
\[ V_{-\alpha} \longrightarrow X_\alpha - \{p\}, \qquad c \longmapsto [1 : c]. \tag{3.3} \]
The equivariance of (3.3) follows from the fact that, for $\xi \in \mathfrak{g}$,
\[ [e^{i\alpha(\xi)} : c] = [1 : e^{-i\alpha(\xi)} c]. \]
We denote by $L_\alpha$ the tautological line bundle over $X_\alpha$. By definition, the fiber of $L_\alpha$ over $[c_1 : c_2]$ is the 1-dimensional subspace of $V_\alpha \oplus \mathbb{C}$ spanned by $(c_1, c_2)$, so, from the action of $G$ on $V_\alpha \oplus \mathbb{C}$, one gets an action of $G$ on $L_\alpha$ lifting the action of $G$ on $X_\alpha$. The fiber of $L_\alpha$ over $q$ is $V_\alpha$, so, in particular, the following is true.

Lemma 3.1.1. The weight of the isotropy representation of $G$ on $(L_\alpha)_p$ is zero and on $(L_\alpha)_q$ is $\alpha$.

The mapping
\[ [c : 1] \longmapsto (c, 1) \tag{3.4} \]
defines a holomorphic section of $L_\alpha$ over $X_\alpha - \{q\}$ and, hence, a holomorphic trivialization of the restriction of $L_\alpha$ to $X_\alpha - \{q\}$. The vector space $V_\alpha \oplus \mathbb{C} \simeq \mathbb{C}^2$ can be equipped with the $G$-invariant Hermitian form
\[ |z|^2 = |z_1|^2 + |z_2|^2, \tag{3.5} \]
and, since the restriction of this form to each subspace of $\mathbb{C}^2$ defines a Hermitian form on this subspace, we get from this form a $G$-invariant Hermitian structure on $L_\alpha$. Now let $\alpha_i$ and $\alpha_i'$ be weights of $G$ with $\alpha_i' - \alpha_i = m\alpha$, $m$ being an integer, and let $L_i$ be the line bundle
\[ L_i = L_\alpha^m \otimes V_{\alpha_i}. \tag{3.6} \]
From the action of $G$ on $L_\alpha$ and on $V_{\alpha_i}$, one gets an action of $G$ on this line bundle lifting the action of $G$ on $X_\alpha$. The following is a corollary of Lemma 3.1.1 and of the existence of the Hermitian structure (3.5) on $L_\alpha$ and of the trivialization (3.4) of $L_\alpha$ over $X_\alpha - \{q\}$.

Lemma 3.1.2. The weight of the isotropy representation of $G$ on $(L_i)_p$ is $\alpha_i$ and on $(L_i)_q$ is $\alpha_i'$. In addition, $L_i$ has a $G$-invariant Hermitian structure and a nonvanishing holomorphic section, $s_i : X_\alpha - \{q\} \to L_i$, which transforms under the action of $G$ according to the weight $\alpha_i$. In particular, the restriction of $L_i$ to $X_\alpha - \{q\}$ is isomorphic to the trivial bundle over $X_\alpha - \{q\}$ with fiber $V_{\alpha_i}$.

Let us now return to the problem of constructing a manifold $M$ with $(\Gamma, \alpha)$ as its GKM 1-skeleton. Let $e$ be an oriented edge of $\Gamma$, and let $p$, $p'$, $e_i$, and $e_i'$ be as in (2.6) with $e_d = e$ and $e_d' = \bar{e}$. Let $X_e = X_\alpha$ with $\alpha = \alpha_e$, and let $L_i$ be the line bundle constructed above with $\alpha_i = \alpha_{e_i}$ and $\alpha_i' = \alpha_{e_i'}$. The $X_e$'s are our candidates for the $G$-invariant $\mathbb{CP}^1$'s in $M$, and the vector bundle
\[ N_e = \bigoplus_{i=1}^{d-1} L_i \tag{3.7} \]
is our candidate for the normal bundle of $X_e$ in $M$. Thus, a candidate for a tubular neighborhood of $X_e$ in $M$ is a convex neighborhood of the zero section in $N_e$, for example, the disk bundle
\[ U_e^{\varepsilon} = \{ (x, v_1, \dots, v_{d-1});\ x \in X_e,\ v_i \in (L_i)_x,\ |v_i| < \varepsilon \}. \tag{3.8} \]
We construct $M$ by starting with the disjoint union
\[ \bigsqcup_e U_e^{\varepsilon} \tag{3.9} \]
over all edges $e$ of $\Gamma$ and by making the following obvious identifications. Let $N_{e,p}$ be the restriction of the bundle $N_e$ to $X_e - \{q\}$. By Lemma 3.1.2 and by (3.2), one gets a $G$-equivariant bijective map
\[ \gamma_e : N_{e,p} \longrightarrow T_p M, \tag{3.10} \]
where $T_p M$ is by definition the sum
\[ \bigoplus_{i=1}^{d} V_{\alpha_i}. \tag{3.11} \]
If $e$ and $e'$ are edges of $\Gamma$ meeting at $p$, we identify the points $u \in U^{\varepsilon}_{e,p}$ and $u' \in U^{\varepsilon}_{e',p}$ in the set (3.9) if $\gamma_e(u) = \gamma_{e'}(u')$.
It is easy to check that if we quotient (3.9) by the equivalence relation defined by these identifications (with $\varepsilon$ small), we get a manifold $M$ with the properties listed in the theorem.

3.2. A deformation problem

We return to the example of Section 1.10: the 1-skeleton, $\Gamma$, of an edge-reflecting d-valent convex polytope embedded into an n-dimensional space $\mathfrak{g}^*$ by $I : \Gamma \to \mathfrak{g}^*$, the axial function, $\alpha$, of $\Gamma$ being given by $\alpha_{pq} = I(q) - I(p)$. Fix a vector $\xi \in \mathcal{P}$, and let $\varphi : V \to \mathbb{R}$ be given by
\[ \varphi(p) = \langle I(p), \xi \rangle \qquad \forall p \in V. \tag{3.12} \]
Then $\varphi$ is $\xi$-compatible and the zeroth Betti number of $\Gamma$ is 1, and since every 2-face of $\Gamma$ is convex, $\Gamma$ is noncyclic. Moreover, $\Gamma$ is 3-independent if $\dim \mathfrak{g}^* \ge 3$. (See the comments at the end of Section 1.10.) Assume that there exists a lattice $\mathbb{Z}_G^*$ in $\mathfrak{g}^*$ such that the edges of $\Gamma$ are scalar multiples of rational vectors. If we try to deform $\Gamma$ by changing its vertices such that the above property is preserved, we are led to the following definition.

Definition 3.2.1. A function $f : V \to \mathfrak{g}^*$ is called a rational deformation of $I$ if there exists $\varepsilon > 0$ such that, for every $t \in [0, \varepsilon)$, the map $I_t : V \to \mathfrak{g}^*$, given by
\[ I_t(p) = I(p) + t f(p), \qquad \forall p \in V, \]
is an embedding of $V$ into $\mathfrak{g}^*$ and, for every edge $e = (p, q)$ of $\Gamma$, $I_t(q) - I_t(p)$ is a positive multiple of $\alpha_{p,e}$.

Theorem 3.2.1. For a given embedding $I$, the space of rational deformations is $H^2(\Gamma, \alpha)$.

Proof. Let $f$ be a rational deformation. Then
\[ I_t(q) - I_t(p) = I(q) - I(p) + t\bigl(f(q) - f(p)\bigr); \tag{3.13} \]
since $\alpha_{pq}$ divides both $I_t(q) - I_t(p)$ and $I(q) - I(p)$, it follows that it divides $f(q) - f(p)$ as well, which means that $f \in H^2(\Gamma, \alpha)$.

Conversely, if $f \in H^2(\Gamma, \alpha)$, then (3.13) implies that $I_t(q) - I_t(p)$ is a multiple of $\alpha_{pq}$, and, since $I(q) - I(p) = \alpha_{pq}$, for $t$ small enough, it is a positive multiple. We can choose $\varepsilon$ small enough for all edges, which proves that $f$ is a rational deformation.

By (2.38),
\[ \dim H^2(\Gamma, \alpha) = b_2(\Gamma)\lambda_0 + b_0(\Gamma)\lambda_1 = b_2(\Gamma) + n. \tag{3.14} \]
Every translation is a rational deformation, and $n$ in (3.14) is the contribution of these “trivial” deformations to the space of deformations of $\Gamma$. Hence the nontrivial rational deformations are those corresponding to $b_2(\Gamma)$. By Theorem 2.4.4, these deformations are linear combinations of Thom classes, $\tau_p$, for $p$ of index 1.

Example 3.2.1. Consider an octahedron embedded in $\mathbb{R}^3$; then $b_2(\Gamma) = 1$. All rational deformations are obtained by composing translations with the homothety $p \in V \mapsto (1 - t)p$. In Figure 8 the numbers next to vertices indicate the index with respect to the height function. The condition on 2-planes amounts to saying that 04, 22, and 13 intersect in a point.

Figure 8. Rational deformation of the octahedron (vertex indices: 4, 3, 2, 2, 1, 0)
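Formula (2.38), and its special case (3.14), can be evaluated for the octahedron directly from the vertex indices shown in Figure 8. The sketch below assumes that $b_{2k}(\Gamma)$ counts the vertices of index $k$ and that $\lambda_{m,n} = \binom{m+n-1}{n-1}$ for $m \ge 0$ (zero otherwise); both conventions are defined outside this section and are assumptions here, as are the function names.

```python
from math import comb

def lam(m, n):
    """Assumed meaning of lambda_{m,n}: dim of degree-m polynomials in n variables."""
    return comb(m + n - 1, n - 1) if m >= 0 else 0

def betti_from_indices(indices, d):
    """Assumed convention: b_{2k} is the number of vertices of index k, k = 0..d."""
    return [sum(1 for r in indices if r == k) for k in range(d + 1)]

def dim_H(betti, m, n):
    """Formula (2.38): sum over k of b_{2k} * lambda_{m-k,n}."""
    return sum(b * lam(m - k, n) for k, b in enumerate(betti))

# Octahedron: valence d = 4, embedded in dimension n = 3, vertex indices from Figure 8.
octahedron_betti = betti_from_indices([0, 1, 2, 2, 3, 4], d=4)
deformation_space_dim = dim_H(octahedron_betti, m=1, n=3)
```

With these conventions the Betti sequence is $1, 1, 2, 1, 1$ and the dimension in degree 2 is $b_2 + n = 1 + 3 = 4$, matching the description of the deformations as translations together with one homothety.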
3.3. Schubert polynomials

For the Grassmannians, the classes $\tau_p$ of Theorem 2.4.2 have an alternative description in terms of Schubert polynomials (see [BGG], [LS], [Dem], [KK], [Mac], [BH], [Fu1], et al.). This description involves the Hecke algebra of divided difference operators, an algebra that is intrinsically associated to every compact semisimple Lie group $K$. Let $G$ be the Cartan subgroup of $K$, and let $W = N(G)/G$ be the Weyl group. As a group of transformations of $\mathfrak{g}$, $W$ is generated by simple reflections. Moreover, to each reflection $\sigma \in W$ corresponds a unique positive root $\alpha = \alpha_\sigma \in \mathfrak{g}^*$ with $\alpha(\sigma\xi) = -\alpha(\xi)$ for all $\xi \in \mathfrak{g}$. In particular, $\sigma$ leaves fixed the hyperplane $\alpha(\xi) = 0$. The divided difference operator $D_\sigma : S(\mathfrak{g}^*) \to S(\mathfrak{g}^*)$ is the operator defined by
\[ D_\sigma(f) = \frac{f - \sigma f}{\alpha_\sigma}. \tag{3.15} \]
(Notice that since $\sigma f(\xi) = f(\sigma(\xi)) = f(\xi)$ for $\xi$ on the hyperplane $\alpha(\xi) = 0$, the left-hand side of (3.15) is an element of $S(\mathfrak{g}^*)$, that is, a polynomial function on $\mathfrak{g}$.) The Hecke algebra of divided difference operators $\mathcal{D}$ is the algebra generated by the $D_\sigma$'s and the operators “multiplication by $f$” for $f \in S(\mathfrak{g}^*)$. We note that if $g \in S(\mathfrak{g}^*)^W$, then
\[ D_\sigma(gf) = \frac{gf - \sigma(gf)}{\alpha_\sigma} = g\,\frac{f - \sigma f}{\alpha_\sigma} = g D_\sigma f. \]
Hence, if $D \in \mathcal{D}$, then
\[ D(gf) = g\,Df, \tag{3.16} \]
so the algebra $\mathcal{D}$ acts on $S(\mathfrak{g}^*)$ as morphisms of $S(\mathfrak{g}^*)^W$-modules. More generally, if $M_0$ is an $S(\mathfrak{g}^*)^W$-module and
\[ M = M_0 \otimes_{S(\mathfrak{g}^*)^W} S(\mathfrak{g}^*), \tag{3.17} \]
then one can make $M$ into a $\mathcal{D}$-module by setting
\[ D(m \otimes f) = m \otimes (Df). \tag{3.18} \]
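The operator (3.15) and the identity $D_\sigma(gf) = g D_\sigma f$ can be illustrated in the simplest possible setting: two variables $x, y$, the reflection that swaps them, and the root $\alpha = x - y$. This rank-one setup, the sample polynomials, and the name `divided_difference` are assumptions chosen for illustration; the sketch evaluates the quotient pointwise with exact fractions instead of dividing polynomials symbolically.

```python
from fractions import Fraction

def divided_difference(f):
    """Return D f for the swap (x, y) -> (y, x) with root alpha = x - y.

    The result is evaluated pointwise, so it is only defined where x != y.
    """
    def Df(x, y):
        x, y = Fraction(x), Fraction(y)
        return (f(x, y) - f(y, x)) / (x - y)
    return Df

f = lambda x, y: x**2 * y              # sample polynomial; here D f = x*y
g = lambda x, y: x + y                 # symmetric, hence invariant under the swap
gf = lambda x, y: g(x, y) * f(x, y)

sample_points = [(2, 5), (1, 3), (-4, 7)]
```

For $f = x^2 y$ one has $f - \sigma f = xy(x - y)$, so $D f = xy$ is again a polynomial, and multiplying by the symmetric $g$ commutes with $D$ at every sample point.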
(In view of (3.16), this is a well-defined operator on $M$.) Now let $M$ be a $K$-manifold. From the constant map $M \to \mathrm{pt}$ one gets a map in cohomology $H_K(\mathrm{pt}) \to H_K(M)$, and, since
\[ H_K(\mathrm{pt}) = S(\mathfrak{k}^*)^K = S(\mathfrak{g}^*)^W, \]
this map makes $H_K(M)$ into a module over $S(\mathfrak{g}^*)^W$. For the following result, see, for instance, [GS3, Chap. 6].
theorem 3.3.1 The G-equivariant cohomology ring of M is related to the K-equivariant cohomology ring of M by the following ring-theoretic identity: HG (M) = HK (M) ⊗S(g∗ )W S(g∗ ).
(3.19)
Therefore, by (3.17) and (3.18) we conclude the following theorem.

Theorem 3.3.2. The $G$-equivariant cohomology ring $H_G(M)$ is canonically a module over $\mathcal{D}$.

Suppose now that $M$ is a GKM manifold and is equivariantly formal. Then, by Theorem 1.7.3, $H_G(M) \cong H(\Gamma, \alpha)$, so we can transport this $\mathcal{D}$-module structure to $H(\Gamma, \alpha)$. In particular, we get an action of the divided difference operator $D_\sigma$ on $H(\Gamma, \alpha)$.

Theorem 3.3.3. Let $f : V \to S(\mathfrak{g}^*)$ be a map that satisfies the compatibility conditions (1.16) (in other words, that belongs to $H(\Gamma, \alpha)$). Then, for $p \in V$,
$$(D_\sigma f)(p) = \frac{f(p) - \sigma f(\sigma^{-1} p)}{\alpha_\sigma}. \tag{3.20}$$

Remarks. (1) Since $W$ is, by definition, $N(G)/G$, it acts on the fixed point set $M^G$. Since $M^G = V$, the expression $\sigma^{-1} p$ in (3.20) is unambiguously defined. (2) Since $W$ acts on $S(\mathfrak{g}^*)$ by ring automorphisms, the ring automorphism $\sigma$, applied to the element $f(\sigma^{-1} p) \in S(\mathfrak{g}^*)$ in (3.20), is also unambiguously defined. (3) For a proof of Theorem 3.3.3, see [GHZ].

We now return to Section 1.11 and the Bruhat structure of the Johnson graph. Let $\phi$ be the Morse function (1.39), and let $p \in V$ be a vertex of $\Gamma$ of index $r$. Let $p_0$ be the unique maximum of $\phi$; that is, let $p_0 = p_S$, $S = \{l + 1, \dots, n\}$, with $l = n - k$, and consider the product
$$\prod_{e \in E_{p_0}} \alpha_e = \prod_{i \le k < j} (\alpha_i - \alpha_j). \tag{3.21}$$
By Theorem 2.4.2, the Thom class $\tau_{p_0}$ is the map $\tau_{p_0} : V \to S(\mathfrak{g}^*)$ that takes the value (3.21) at $p_0$ and zero everywhere else. Now let $\sigma_{i_1}, \dots, \sigma_{i_s}$ ($s = kl - r$) be the elementary reflections with the properties described in Theorem 1.11.6. By applying the operator $D_{\sigma_{i_1}} \circ \cdots \circ D_{\sigma_{i_s}}$ to $\tau_{p_0}$, one gets a cohomology class $\tau_p$, which is of degree $r = kl - s$ and which, by (1.44), (1.45), and (3.20), is supported on $F_p$. Moreover, it is easy to see that $\tau_p(p) = \prod \alpha_e$, the product being over the downward pointing edges $e \in E_p$. Thus, by Theorem 2.4.3 and (1.39), the class $\tau_p$ obtained in this way is the Thom class of $p$.

Remark. One can regard the $\tau_p$'s as being a doubly indexed family of polynomials $f_{p,q} = \tau_p(q)$, indexed by pairs $(p, q)$ of vertices of $\Gamma$ with $p \prec q$. These polynomials, which are called double Schubert polynomials, have been studied extensively by S. Billey, M. Haiman, R. Stanley, S. Fomin, A. Kirillov, W. Jockusch, and others. (For a succinct and engrossing account of what is known about these polynomials, we recommend, as collateral reading, the beautiful monograph Young Tableaux by William Fulton, which has just been published by Cambridge University Press.)

Acknowledgments.
We would like to express our thanks to several of our colleagues for helping us to understand some of the key motivating examples in this subject: David Vogan for furnishing us with an enlightening example of a GKM action of $T^2$ on $S^6$; Viktor Ginzburg, Yael Karshon, and Sue Tolman for furnishing us with an equally enlightening example of a GKM action of $T^2$ on the $n$-fold equivariant ramified cover of $S^2 \times S^2$; Mark Goresky for pointing out to us the connection between the GKM theory of toric varieties and Stanley-Reisner theory; Werner Ballmann for making us aware of the fact that, for the Grassmannian, GKM theory reduces to studying an object that graph theorists call the Johnson graph; Sara Billey for helping us to understand the tie-in between our theory of Thom classes for this graph and the standard Schubert calculus; Rebecca Goldin and Tara Holm for helping us work out the details of this example (in Sections 1.11 and 3.3); and Ethan Bolker for his beautiful observation (see Section 1.3, Theorem 1.3.1) that the Betti numbers of a GKM 1-skeleton are well defined, independent of the choice of an admissible orientation. Last, but not least, we would like to thank the referee of this paper for a superb refereeing job.
References

[At] M. F. ATIYAH, Convexity and commuting Hamiltonians, Bull. London Math. Soc. 14 (1982), 1–15.
[AB] M. F. ATIYAH and R. BOTT, The moment map and equivariant cohomology, Topology 23 (1984), 1–28.
[BV] N. BERLINE and M. VERGNE, Classes caractéristiques équivariantes. Formule de localisation en cohomologie équivariante, C. R. Acad. Sci. Paris Sér. I Math. 295 (1982), 539–541.
[BGG] I. N. BERNŠTEĬN, I. M. GELFAND, and S. I. GELFAND, Schubert cells, and the cohomology of the spaces G/P, Russian Math. Surveys 28 (1973), 1–26.
[BB] A. BIALYNICKI-BIRULA, Some theorems on actions of algebraic groups, Ann. of Math. (2) 98 (1973), 480–497.
[Bi] S. BILLEY, Kostant polynomials and the cohomology ring for G/B, Duke Math. J. 96 (1999), 205–224.
[BH] S. BILLEY and M. HAIMAN, Schubert polynomials for the classical groups, J. Amer. Math. Soc. 8 (1995), 443–482.
[Bo] A. BOREL, Seminar on Transformation Groups, Ann. of Math. Stud. 46, Princeton Univ. Press, Princeton, 1960.
[BP] M. BRION and C. PROCESI, "Action d'un tore dans une variété projective" in Operator Algebras, Unitary Representations, Enveloping Algebras, and Invariant Theory (Paris, 1989), Progr. Math. 92, Birkhäuser, Boston, 1990, 509–539.
[BCN] A. E. BROUWER, A. M. COHEN, and A. NEUMAIER, Distance-Regular Graphs, Ergeb. Math. Grenzgeb. (3) 18, Springer, Berlin, 1989.
[CW] H. CRAPO and W. WHITELEY, Spaces of stresses, projections and parallel drawings for spherical polyhedra, Beiträge Algebra Geom. 35 (1994), 259–281.
[Del] T. DELZANT, Hamiltoniens périodiques et images convexes de l'application moment, Bull. Soc. Math. France 116 (1988), 315–339.
[Dem] M. DEMAZURE, Désingularisation des variétés de Schubert généralisées, Ann. Sci. École Norm. Sup. (4) 7 (1974), 53–88.
[Fu1] W. FULTON, Flags, Schubert polynomials, degeneracy loci, and determinantal formulas, Duke Math. J. 65 (1992), 381–420.
[Fu2] W. FULTON, Young Tableaux: With Applications to Representation Theory and Geometry, London Math. Soc. Stud. Texts 35, Cambridge Univ. Press, Cambridge, 1997.
[GKT] V. GINZBURG, Y. KARSHON, and S. TOLMAN, in preparation.
[Go] L. GODINHO, Circle actions on symplectic manifolds, Ph.D. dissertation, State University of New York at Stony Brook, 1999.
[GKM] M. GORESKY, R. KOTTWITZ, and R. MACPHERSON, Equivariant cohomology, Koszul duality, and the localization theorem, Invent. Math. 131 (1998), 25–83.
[GHZ] V. GUILLEMIN, T. HOLM, and C. ZARA, GKM-theory on homogeneous manifolds, in preparation.
[GS1] V. GUILLEMIN and S. STERNBERG, Convexity properties of the moment mapping, Invent. Math. 67 (1982), 491–513.
[GS2] V. GUILLEMIN and S. STERNBERG, Birational equivalence in the symplectic category, Invent. Math. 97 (1989), 485–522.
[GS3] V. GUILLEMIN and S. STERNBERG, Supersymmetry and Equivariant de Rham Theory, Math. Past Present, Springer, Berlin, 1999.
[GZ] V. GUILLEMIN and C. ZARA, Equivariant de Rham theory and graphs, Asian J. Math. 3 (1999), 49–76.
[Hu] J. HUMPHREYS, Reflection Groups and Coxeter Groups, Cambridge Stud. Adv. Math. 29, Cambridge Univ. Press, Cambridge, 1990.
[Ja] N. JACOBSON, Cayley numbers and normal simple Lie algebras of type G, Duke Math. J. 5 (1939), 775–783.
[Ki] F. C. KIRWAN, Cohomology of Quotients in Symplectic and Algebraic Geometry, Math. Notes 31, Princeton Univ. Press, Princeton, 1984.
[Kl] A. A. KLYACHKO, Equivariant bundles over toric varieties, Math. USSR-Izv. 35 (1990), no. 2, 337–375.
[KN] S. KOBAYASHI and K. NOMIZU, Foundations of Differential Geometry, Vol. II, Interscience Tracts in Pure Appl. Math. 15, Interscience, New York, 1969.
[KK] B. KOSTANT and S. KUMAR, The nil Hecke ring and cohomology of G/P for a Kac-Moody group G, Adv. in Math. 62 (1986), 187–237.
[LS] A. LASCOUX and M.-P. SCHÜTZENBERGER, Schubert polynomials and the Littlewood-Richardson rule, Lett. Math. Phys. 10 (1985), 111–124.
[Le] E. LERMAN, Symplectic cuts, Math. Res. Lett. 2 (1995), 247–258.
[LT] E. LERMAN and S. TOLMAN, Hamiltonian torus actions on symplectic orbifolds and toric varieties, Trans. Amer. Math. Soc. 349 (1997), 4201–4230.
[LLY] B. H. LIAN, K. LIU, and S.-T. YAU, "Mirror principle: A survey" in Current Developments in Mathematics (Cambridge, Mass., 1998), International Press, Boston, 1998, 35–65.
[Mac] I. G. MACDONALD, "Schubert polynomials" in Surveys in Combinatorics (Guildford, U.K., 1991), London Math. Soc. Lecture Note Ser. 166, Cambridge Univ. Press, Cambridge, 1991, 73–99.
[McD] D. MCDUFF, Examples of simply-connected symplectic non-Kählerian manifolds, J. Differential Geom. 20 (1984), 267–277.
[Me] E. MEINRENKEN, Symplectic surgery and the Spin$^c$-Dirac operator, Adv. Math. 134 (1998), 240–277.
[St] R. P. STANLEY, Combinatorics and Commutative Algebra, Progr. Math. 41, Birkhäuser, Boston, 1983.
[TW1] S. TOLMAN and J. WEITSMAN, The cohomology rings of abelian symplectic quotients, preprint, http://www.arXiv.org/abs/math.DG/9807173.
[TW2] S. TOLMAN and J. WEITSMAN, "On the cohomology rings of Hamiltonian T-spaces" in Northern California Symplectic Geometry Seminar, Amer. Math. Soc. Transl. Ser. 2 196, Amer. Math. Soc., Providence, 1999, 251–258.
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
LONG-WAVE SHORT-WAVE RESONANCE FOR NONLINEAR GEOMETRIC OPTICS

THIERRY COLIN and DAVID LANNES
Abstract

The aim of this paper is to study oscillatory solutions of nonlinear hyperbolic systems in the framework developed during the last decade by J.-L. Joly, G. Métivier, and J. Rauch. Here we focus mainly on rectification effects, that is, the interaction of oscillations with a mean field created by the nonlinearity. A real interaction can occur only under some geometric conditions described in [JMR1] and [L1] that are generally not satisfied by the physical models except in the 1-dimensional case. We introduce here a new type of ansatz that allows us to obtain rectification effects under weaker assumptions. We obtain a new class of profile equations and construct solutions for a subclass. Finally, the stability of the asymptotic expansion is proved in the context of Maxwell-Bloch-type systems.
1. Introduction

1.1. Motivations

In the study of solutions to nonlinear hyperbolic systems, many nonlinear effects can be observed. In optics, they are linked to a nonlinear response of the medium and therefore to the intensity of the incoming light. The more intense it is, the sooner these nonlinear effects occur. This physical phenomenon encountered in optics occurs in all nonlinear hyperbolic systems; the scale of the appearance of the nonlinear effects is in inverse proportion to the size of the solution. For instance, for a semilinear hyperbolic problem
$$L^\varepsilon(\partial_x) u^\varepsilon := \partial_t u^\varepsilon + \sum_{j=1}^{d} A_j \partial_j u^\varepsilon + \frac{L_0}{\varepsilon} u^\varepsilon = f(u^\varepsilon, u^\varepsilon),$$
DUKE MATHEMATICAL JOURNAL Vol. 107, No. 2, © 2001. Received 6 January 2000. Revision received 13 June 2000. 2000 Mathematics Subject Classification. Primary 35B34, 35B40, 35C20, 35L60.
nonlinear effects occur at times $O(1)$ if $u^\varepsilon$ is of size $O(1)$ and at times $O(1/\varepsilon)$ if its size is $O(\varepsilon)$. We investigate here phenomena that occur for diffractive times $O(1/\varepsilon)$, and we are interested in one particular nonlinear effect called rectification, which means the creation of a mean field by the asymptotic nonlinear interaction of oscillating modes. It is a nonlinear interaction between the zero frequency (long waves) and nonzero frequencies (short waves). In the first studies of rectification for times $O(1/\varepsilon)$ (in [JMR1] for the nondispersive case and in [L1] for the dispersive case), it has been shown that it can occur only if the tangent cone $\mathcal{C}^0$ to the characteristic variety $\mathcal{C}$ at $(0, 0)$ contains a hyperplane tangent to $\mathcal{C}$. Such a condition is quite strong and seems to exclude all physical examples, unless we are in space dimension 1, since $\mathcal{C}^0$ is then a union of straight lines. But even in this case, the nonlinear coupling that should appear between the mean field and the oscillating modes remains equal to zero, as computations show (see [L1]). Such a phenomenon belongs to the transparency phenomena first mentioned by P. Donnat (cf. [D]) and extensively studied in [JMR2].

As said above, nonlinear effects are linked to the amplitude of the solutions we study. Since rectification does not occur at times $O(1/\varepsilon)$ when dealing with "normal" solutions of size $O(\varepsilon)$ to transparent problems, it is therefore natural to seek "abnormal" solutions of size $O(1)$. We said above that in this case nonlinear effects should occur at times $O(1)$, but because of transparency they occur only at a diffractive scale. It has been proved in this case (see [C]) that the approximate solution given by geometric optics must satisfy a Davey-Stewartson-type system that couples the leading oscillating term of the ansatz with the leading nonoscillating term. There is therefore a nonlinear interaction between the oscillating and mean modes, but this Davey-Stewartson-type interaction is due to the algebraic structure of the system and not to asymptotic effects coming from the long-time interaction of different modes travelling at the same velocity. Hence, the nonlinear interaction in Davey-Stewartson systems cannot be called rectification. In fact, the study made in [C] remains valid as long as rectification does not occur. It has been proved indeed that the classical rectification condition (i.e., $\mathcal{C}^0$ contains a hyperplane tangent to $\mathcal{C}$) is a singular case for the Davey-Stewartson system of [C]. This is not surprising since in this case the system is "more nonlinear" (we then have to add the rectification effects to the Davey-Stewartson nonlinear effects), and the solutions are therefore more likely to explode. There is also another singular case for this Davey-Stewartson system, which occurs when there exists a tangent plane to $\mathcal{C}$ also tangent to $\mathcal{C}^0$. This condition is close to, but weaker than, the rectification condition. Here again, one can think that the Davey-Stewartson system becomes singular because of rectification effects.
Here lies the motivation of this paper. We want to observe rectification effects, but we are confronted by two opposite situations where their study is not possible. On the one hand (see [JMR1], [L1]), the amplitude $O(\varepsilon)$ of the solutions is too small and, because of transparency effects, rectification effects do not occur for observation times $O(1/\varepsilon)$. On the other hand (in [C]), the amplitude $O(1)$ of the solutions is too big and when rectification effects occur, solutions explode. It is therefore natural to consider solutions at an intermediate scale $O(\sqrt{\varepsilon})$ and to investigate the two cases which are singular in [C].

(i) There is in $\mathcal{C}^0$ a tangent hyperplane to $\mathcal{C}$. As said above, the only physically interesting case is when the space dimension is 1. We then seek approximate solutions $u^\varepsilon$ of the form
$$u^\varepsilon(t, y) = \sqrt{\varepsilon}\, U^\varepsilon\Big(\varepsilon t, t, y, \frac{\tau t + \eta y}{\varepsilon}\Big),$$
where $U^\varepsilon(T, t, y, \theta)$ is periodic in $\theta$. The scale $\varepsilon t$ is the diffractive scale, while $(t, y)$ is the scale of geometric optics and $(\tau t + \eta y)/\varepsilon$ the fast oscillating scale.

(ii) $\mathcal{C}$ has a tangent plane $P$ also tangent to $\mathcal{C}^0$. The 1-dimensional case is the same as above, so that we consider only space dimension $d \ge 2$. The situation is here a bit different since if we seek approximate solutions $u^\varepsilon$ as above, we know, thanks to [JMR1] and [L1], that there is no rectification effect (since the rectification condition is not fulfilled). Denoting $y = (y_1, \dots, y_d)$ and assuming that $P$ is tangent to $\mathcal{C}^0$ along the first coordinate, a first idea is to consider approximate solutions of the form
$$u^\varepsilon(t, y) = \sqrt{\varepsilon}\, U^\varepsilon\Big(\varepsilon t, t, y_1, \frac{\tau t + \eta \cdot y}{\varepsilon}\Big),$$
which brings us back to case (i). This is not satisfactory since we lose the dependence on $y_{II} := (y_2, \dots, y_d)$, and hence we can only study the rectification effects that occur along the first coordinate. Our problem is then quite similar to what happens when choosing the amplitude. We are indeed confronted by two opposite situations. On the one hand, rectification does not occur fast enough to be described with a dependence on $y_{II}$ of the same scale as the dependence on $y_1$. On the other hand, taking a dependence on $y_{II}$ slower or of the same scale as the diffractive scale, we miss part of it. Thus, we introduce a new scale and seek $u^\varepsilon$ under the form
$$u^\varepsilon(t, y) = \sqrt{\varepsilon}\, U^\varepsilon\Big(\varepsilon t, \sqrt{\varepsilon}\, y_{II}, t, y_1, \frac{\tau t + \eta \cdot y}{\varepsilon}\Big).$$

Throughout this paper, we investigate case (ii). The associated condition is called the long-wave short-wave resonant condition. It is easy to see that the situation described in case (i) can easily be deduced from it. We show that multidimensional nontrivial rectification occurs in this case. Since the long-wave short-wave resonance condition is likely to occur for physical systems, we suspect that this study gives a
good framework to observe experimentally rectification effects. In [L2], we follow formally the theory exposed here to study rectification effects for water waves, but, unfortunately, we cannot apply directly the result of this paper to the Euler equations with free surface. Note that the system found in [L2] is also derived in the book by C. and P.-L. Sulem [Su].

1.2. Setting up the problem

We consider here a general class of hyperbolic quadratic systems. More precisely, we seek approximate solutions of size $\sqrt{\varepsilon}$ to
$$L^\varepsilon(\partial_x) u^\varepsilon := \partial_t u^\varepsilon + \sum_{j=1}^{d} A_j \partial_j u^\varepsilon + \frac{L_0}{\varepsilon} u^\varepsilon = f(u^\varepsilon, u^\varepsilon), \tag{1}$$
where the $A_j$ are $N \times N$ symmetric real matrices, while $L_0$ is skew-symmetric. We assume that the mapping $(u, v) \in \mathbb{C}^{2N} \mapsto f(u, v) \in \mathbb{C}^N$ is bilinear. Throughout this paper, $t \in \mathbb{R}^+$ denotes the time variable and $y := (y_1, \dots, y_d) \in \mathbb{R}^d$ denotes the space variables; we also write $x := (t, y) \in \mathbb{R}^{1+d}$. For $(\tau, \eta) \in \mathbb{R}^{1+d}$, we introduce
$$L(\tau, \eta) = \tau I + \sum_{j=1}^{d} A_j \eta_j + \frac{L_0}{i},$$
as well as
$$A_{II}(\eta) := \sum_{j=2}^{d} A_j \eta_j.$$
The set of all $\beta := (\tau, \eta) \in \mathbb{R}^{1+d}$ such that $\det(L(\beta)) = 0$ is the characteristic variety of $L$ and is denoted by $\mathcal{C}_L$. If $\beta$ is a smooth point of $\mathcal{C}_L$, we denote by $\eta \mapsto \tau(\eta)$ a local parametrization of $\mathcal{C}_L$ in a neighborhood of $\beta$. For $\beta = (\tau, \eta) \in \mathcal{C}_L$, we finally denote by $\pi(\beta)$ the orthogonal projector on $\ker(\tau I + A(\eta) + L_0/i)$, and we denote by $L(\beta)^{-1}$ the partial inverse of $L(\beta)$. We also need to consider the characteristic variety $\mathcal{C}^0$ of the operator $\pi(0) L^\varepsilon \pi(0)$, which is the tangent cone to $\mathcal{C}_L$ at $(0, 0)$ (see [L1]).

As said above, we look for approximate solutions to (1) whose size is $O(\sqrt{\varepsilon})$, that is, intermediate between the normal size $O(\varepsilon)$ and the large size $O(1)$ used when deriving the Davey-Stewartson systems (see [C]). The leading oscillating term oscillates with a phase $\beta \cdot x/\varepsilon$, where $\beta$ satisfies the following assumption.

Assumption 1. $\beta$ is a smooth point of $\mathcal{C}_L$, and $2\beta$ is not on $\mathcal{C}_L$.
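To make these definitions concrete, here is a minimal numerical sketch in Python. The toy system is an illustrative assumption of ours and is not taken from the paper: one space dimension, $A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $L_0 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, for which $\det L(\tau, \eta) = \tau^2 - \eta^2 - 1$, so that the characteristic variety is the hyperbola $\tau^2 = \eta^2 + 1$. The function names are hypothetical.

```python
import math

def symbol(tau, eta):
    """L(tau, eta) = tau*I + A1*eta + L0/i for the toy system
    A1 = [[0, 1], [1, 0]], L0 = [[0, -1], [1, 0]] (illustrative assumption)."""
    # L0/i = [[0, i], [-i, 0]] because -1/i = i and 1/i = -i.
    return [[tau, eta + 1j], [eta - 1j, tau]]

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def on_variety(tau, eta, tol=1e-12):
    """True when (tau, eta) lies on the characteristic variety det L = 0."""
    return abs(det2(symbol(tau, eta))) < tol

eta = 0.75
tau = math.sqrt(eta**2 + 1.0)   # one branch of the parametrization eta -> tau(eta)

beta_on_variety = on_variety(tau, eta)
# Assumption 1 asks that 2*beta is NOT on the variety.
double_on_variety = on_variety(2 * tau, 2 * eta)
print(beta_on_variety, double_on_variety)
```

For this toy system the check of Assumption 1 succeeds at every point of the hyperbola, since $\det L(2\beta) = 4\tau^2 - 4\eta^2 - 1 = 3$. Note that here $L(0) = L_0/i$ is invertible, so $\pi(0) = 0$; the example therefore only illustrates $\mathcal{C}_L$ and Assumption 1, not the tangent cone $\mathcal{C}^0$.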
The short-wave long-wave resonance condition we have mentioned above, and which corresponds to the singular case for the Davey-Stewartson system with which we are dealing, states the following.

Assumption 2 (Long-wave short-wave resonance). We say that we have a long-wave short-wave resonance if the tangent space $P$ to $\mathcal{C}_L$ at $\beta$ is the tangent space to $\mathcal{C}^0$ at $(0, 0)$.

Under this assumption, the intersection of $P$ and $\mathcal{C}^0$ is a straight line passing through the origin. We denote by $\beta^0$ the point of this line with vertical coordinate equal to 1: $\beta^0 := (1, \eta^0) = (1, \eta^0_1, \dots, \eta^0_d)$. When $\mathcal{C}_L$ is of revolution, then $\eta$, $\eta^0$, and $\nabla\tau(\eta)$ are necessarily colinear. In the general case, this is no longer true, but we have the following proposition.

Proposition 1. If $\nabla\tau(\eta) \cdot \eta^0 \neq 0$, then we can be brought back to the case where $\nabla\tau(\eta)$ and the contact direction $\eta^0$ are colinear.

The proof of this proposition is postponed to Section 6.1. In order to be in this framework, we make the following assumption, satisfied by all the physical examples we have encountered.

Assumption 3. One has $\nabla\tau(\eta) \cdot \eta^0 \neq 0$.

Convention. Under the above assumption, we can assume from now on that $\eta^0 = (\eta^0_1, 0, \dots, 0)$ and $\nabla\tau(\eta) = (\partial_1\tau(\eta), 0, \dots, 0)$.

1.3. The ansatz

In diffractive optics (see [D], [JMR1], [L1], [C]), ansatzes with three scales are used, and the approximate solutions are therefore of the form
$$u^\varepsilon(x) = \varepsilon^p\, U\Big(\varepsilon, \varepsilon t, x, \frac{\beta \cdot x}{\varepsilon}\Big),$$
where the profile $U(\varepsilon, T, x, \theta)$ is periodic in $\theta$. The scale $O(1/\varepsilon)$ is the fast scale associated to the oscillations, and the intermediate scale $O(1)$ is the scale of geometric optics, that is, the scale for which propagation of oscillations along rays furnishes a good approximation. The last scale $O(\varepsilon)$ is the
slow scale we have to introduce in order to take into account the diffractive modifications one has to make to the non-space-time dispersive propagation along rays.

As said in the introduction, we introduce here a fourth scale $O(\sqrt{\varepsilon})$ in order to take into account the rectification effects in the transverse directions. Still supposing that $\eta^0$ and $-\nabla\tau(\eta)$ are along the first coordinate, we seek approximate solutions of the form
$$u^\varepsilon(x) = \sqrt{\varepsilon}\, U\Big(\varepsilon, \varepsilon t, \sqrt{\varepsilon}\, y_{II}, t, y_1, \frac{\beta \cdot x}{\varepsilon}\Big), \tag{2}$$
where $y_1$ is thus the direction of $\eta^0$ and $\nabla\tau(\eta)$, and $y_{II} := (y_2, \dots, y_d)$. The profile $U(\varepsilon, T, Y, t, y_1, \theta)$ is chosen of the form
$$U(\varepsilon, T, Y, t, y_1, \theta) := \big(U_1 + \sqrt{\varepsilon}\, U_2 + \varepsilon\, U_3 + \varepsilon^{3/2} U_4 + \varepsilon^2 U_5\big)(\varepsilon, T, Y, t, y_1, \theta), \tag{3}$$
where the $U_i$ are smooth functions of their arguments and are periodic in $\theta$. Since the above expansion is used for times $O(1/\varepsilon)$, we have to control the growth of the profiles in $t$. In order for the correctors $U_i$, $i = 2, \dots, 5$, to remain smaller than the leading term $U_1$ for such times, we must have $U_2 = o(\sqrt{t})$, $U_3 = o(t)$, $U_4 = o(t^{3/2})$, and $U_5 = o(t^2)$. We impose the following stronger conditions.

• The first corrector $U_2$ remains bounded,
$$\exists C > 0, \qquad \sup_{t \in \mathbb{R}^+} \big\| U_2(\cdot, \cdot, t, \cdot, \cdot) \big\|_{L^\infty([0,T] \times \mathbb{R}^d_{Y,y_1} \times \mathbb{T})} \le C. \tag{4}$$

• The other correctors $U_i$, $i = 3, 4, 5$, satisfy the sublinear growth condition introduced in [JMR1],
$$\lim_{t \to \infty} \frac{1}{t} \big\| U_i(\cdot, \cdot, t, \cdot, \cdot) \big\|_{L^\infty([0,T] \times \mathbb{R}^d_{Y,y_1} \times \mathbb{T})} = 0, \qquad i = 3, 4, 5. \tag{5}$$
1.4. Outline of the results In Section 2 we derive the profile equations using the techniques of geometrical optics. However, the size of the solution considered here is too big to allow a standard derivation, and we have to make a transparency assumption. To our knowledge (see [JMR2], [L1]), this assumption is satisfied by all the physical systems of the form (1). The profile equations found in this case are given in Theorem 1. In particular, one can notice that the evolution equations of the oscillating and mean modes are coupled. In Section 3 we assume the existence of a solution to the profile equations and prove a few properties of the approximate solutions associated to these profiles. In Proposition 7 we show that the residual that one obtains when plugging these approximate solutions into (1) is small. Section 4 is devoted to the study of a particular subclass of systems (see (1)),
the Maxwell-Bloch systems. This class of problems has been extensively studied in [JMR2]. Under a strong transparency assumption, we prove that the nonlinearity appearing in the evolution equation of the mean mode vanishes. In this case, the existence of a solution to the profile equations is proved. Moreover, we prove in Theorem 2 that the associated approximate solutions are stable, that is, remain close to an exact solution of (1). The 1-dimensional case is another framework in which we can prove the existence of a solution to the profile equations, as we show in Section 5. Finally, we prove in Section 6 an existence theorem for the profile equations in two dimensions, without making the strong transparency assumption. Though the system we consider in this version is simplified with regard to the profile equations given in Theorem 1, it is of particular interest since this is the system obtained by Sulem and Sulem [Su] when studying the long-wave short-wave resonance for water waves.
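2. Derivation of the equations

2.1. Equations for the profiles

As usual in geometric optics, we expand $L^\varepsilon u^\varepsilon - f(u^\varepsilon, u^\varepsilon)$ (where $u^\varepsilon$ is the approximate solution given by equations (2) and (3)) in powers of $\varepsilon$. One finds
$$
\begin{aligned}
L^\varepsilon u^\varepsilon - f(u^\varepsilon, u^\varepsilon) = \Big[\, & \varepsilon^{-1/2}\, iL(\beta D_\theta) U_1 + \varepsilon^{0}\, iL(\beta D_\theta) U_2 \\
& + \varepsilon^{1/2} \big( iL(\beta D_\theta) U_3 + (\partial_t + A_1 \partial_{y_1}) U_1 \big) \\
& + \varepsilon^{1} \big( iL(\beta D_\theta) U_4 + (\partial_t + A_1 \partial_{y_1}) U_2 + A_{II}(\partial_Y) U_1 - f(U_1, U_1) \big) \\
& + \varepsilon^{3/2} \big( iL(\beta D_\theta) U_5 + (\partial_t + A_1 \partial_{y_1}) U_3 + A_{II}(\partial_Y) U_2 + \partial_T U_1 - 2 f(U_1, U_2) \big) \\
& + \varepsilon^{2} \big( (\partial_t + A_1 \partial_{y_1}) U_4 + A_{II}(\partial_Y) U_3 + \partial_T U_2 - f(U_2, U_2) - 2 f(U_1, U_3) \big) \\
& + \varepsilon^{5/2} R^\varepsilon \,\Big]_{T = \varepsilon t,\; Y = \sqrt{\varepsilon}\, y_{II},\; \theta = \beta \cdot x/\varepsilon}
\end{aligned}
\tag{6}
$$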
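where we recall that $A_{II}(\partial_Y) := \sum_{j=2}^{d} A_j \partial_{Y_j}$ and where $D_\theta := \partial_\theta / i$. We want to choose profiles $U_i$ in order to cancel the first terms in the above expansion. This yields the following profile equations:
$$iL(\beta D_\theta) U_1 = 0, \tag{7}$$
$$iL(\beta D_\theta) U_2 = 0, \tag{8}$$
$$iL(\beta D_\theta) U_3 + (\partial_t + A_1 \partial_{y_1}) U_1 = 0, \tag{9}$$
$$iL(\beta D_\theta) U_4 + (\partial_t + A_1 \partial_{y_1}) U_2 + A_{II}(\partial_Y) U_1 - f(U_1, U_1) = 0, \tag{10}$$
$$iL(\beta D_\theta) U_5 + (\partial_t + A_1 \partial_{y_1}) U_3 + A_{II}(\partial_Y) U_2 + \partial_T U_1 - 2 f(U_1, U_2) = 0, \tag{11}$$
and
$$(\partial_t + A_1 \partial_{y_1}) U_4 + A_{II}(\partial_Y) U_3 + \partial_T U_2 - f(U_2, U_2) - 2 f(U_1, U_3) = 0. \tag{12}$$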
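The power counting behind (6) and the profile equations (7)–(12) can be checked mechanically. The short Python sketch below is only a bookkeeping aid under stated assumptions: the profile $U_k$ enters the ansatz (2)–(3) with amplitude $\varepsilon^{k/2}$, each operator shifts the power of $\varepsilon$ by a fixed amount, and the cross terms of the bilinear map are grouped as $2f(U_j, U_k)$, following the way (6) is written. The string labels are hypothetical names chosen for display.

```python
from fractions import Fraction
from collections import defaultdict

half = Fraction(1, 2)

# Amplitude of U_k in the ansatz (2)-(3): sqrt(eps) * eps^((k-1)/2) = eps^(k/2).
amplitude = {k: k * half for k in range(1, 6)}

# Power of eps contributed by each linear operator acting on the profile.
operators = {
    "iL(beta D_theta)": Fraction(-1),  # fast phase beta.x/eps and the term L0/eps
    "d_t + A1 d_y1": Fraction(0),      # geometric-optics scale (t, y1)
    "A_II(d_Y)": half,                 # transverse scale Y = sqrt(eps) * y_II
    "d_T": Fraction(1),                # diffractive scale T = eps * t
}

terms = defaultdict(list)
for name, shift in operators.items():
    for k, amp in amplitude.items():
        terms[amp + shift].append(f"{name} U{k}")

# Bilinear term: f(U_j, U_k) sits at eps^((j+k)/2); cross terms are counted twice.
for j in range(1, 6):
    for k in range(j, 6):
        coeff = "" if j == k else "2"
        terms[(j + k) * half].append(f"-{coeff}f(U{j},U{k})")

# Orders eps^(-1/2), ..., eps^2 reproduce equations (7)-(12) in turn.
for order in sorted(terms):
    if order <= 2:
        print(f"eps^({order}):", " + ".join(terms[order]))
```

Orders above $\varepsilon^2$ are not printed: in (6) they are gathered in the remainder $\varepsilon^{5/2} R^\varepsilon$.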
2.2. Algebraic analysis of equations (7)–(12)

For the principal term of our ansatz, we choose
$$U_1(T, Y, t, y_1, \theta) := U_{11}(T, Y, t, y_1) e^{i\theta} + \text{c.c. (complex conjugate)},$$
which means that we exclude nonoscillating terms from the principal term. This is realistic since the nonoscillating terms, which are created by rectification effects, cannot reach the same amplitude as the main oscillating terms. In order to deduce conditions on $U_{11}$ from (7)–(12) and throughout this section, we need the following algebraic lemma.

Lemma 1. Let $a, b \in \mathbb{C}^N$, and let $\beta \in \mathbb{R}^{1+d}$. The following two assertions are then equivalent:
(i) $L(\beta) a = b$,
(ii) $\pi(\beta) b = 0$ and $(I - \pi(\beta)) a = L(\beta)^{-1} b$.

Thanks to this lemma, equation (7) is then equivalent to the polarization condition
$$\pi(\beta) U_{11} = U_{11}. \tag{13}$$
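As an illustration of Lemma 1, the following Python sketch checks the implication (ii) ⇒ (i) numerically on a toy example. The 2 × 2 system used here ($A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $L_0 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, one space dimension) is our own assumption and does not come from the paper; for it, $L(\beta)$ is Hermitian with eigenvalues $0$ and $2\tau$ when $\tau^2 = \eta^2 + 1$, so the partial inverse is simply $(I - \pi(\beta))/(2\tau)$.

```python
import math

def matvec(m, x):
    """Product of a 2 x 2 matrix with a vector of length 2."""
    return [m[i][0] * x[0] + m[i][1] * x[1] for i in range(2)]

eta = 0.75
tau = math.sqrt(eta**2 + 1.0)            # beta = (tau, eta) is on the variety
L = [[tau, eta + 1j], [eta - 1j, tau]]   # L(beta) for the toy system (assumption)

# The kernel of L(beta) is spanned by v; pi is the orthogonal projector onto it.
v = [-(eta + 1j), complex(tau)]
norm2 = sum(abs(c) ** 2 for c in v)
pi = [[v[i] * v[j].conjugate() / norm2 for j in range(2)] for i in range(2)]

# Partial inverse: zero on the kernel, inverse of L(beta) on its orthogonal
# complement. Here the only nonzero eigenvalue is 2*tau.
L_inv = [[((1.0 if i == j else 0.0) - pi[i][j]) / (2 * tau) for j in range(2)]
         for i in range(2)]

# Choose b with pi b = 0, then build a from assertion (ii) of Lemma 1.
c = [1.0 + 0j, 2.0 - 1j]
pc = matvec(pi, c)
b = [c[k] - pc[k] for k in range(2)]             # b = (I - pi) c
a_perp = matvec(L_inv, b)                        # (I - pi) a = L(beta)^{-1} b
a = [a_perp[k] + 3.0 * v[k] for k in range(2)]   # the kernel part of a is free

residual = max(abs(x - y) for x, y in zip(matvec(L, a), b))   # |L(beta) a - b|
pi_b = max(abs(x) for x in matvec(pi, b))                     # |pi(beta) b|
print(residual < 1e-12, pi_b < 1e-12)
```

The arbitrary multiple of `v` added to `a` reflects the fact that Lemma 1 determines only $(I - \pi(\beta)) a$; the component $\pi(\beta) a$ is left free, which is why the polarized parts of the profiles have to be fixed by further equations.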
Contrary to what has been done for $U_1$, we allow nonoscillating terms for the first corrector $U_2$. We take therefore
$$U_2(T, Y, t, y_1, \theta) := U_{20}(T, Y, t, y_1) + U_{21}(T, Y, t, y_1) e^{i\theta} + \text{c.c.}$$
We first decompose (8) into its Fourier modes and then apply Lemma 1 to find that (8) is equivalent to
$$\pi(\beta) U_{21} = U_{21} \tag{14}$$
and
$$\pi(0) U_{20} = U_{20}. \tag{15}$$
Pursuing our analysis, we now want to find necessary conditions from (9). We search a $U_3$ of the form
$$U_3(T, Y, t, y_1, \theta) := U_{30}(T, Y, t, y_1) + U_{31}(T, Y, t, y_1) e^{i\theta} + \text{c.c.},$$
so that the nonoscillating Fourier coefficient of (9) reads $L(0) U_{30} = 0$. Thanks to Lemma 1, this is equivalent to
$$\pi(0) U_{30} = U_{30}. \tag{16}$$
The first mode of the Fourier expansion of (9) reads
$$iL(\beta) U_{31} + (\partial_t + A_1 \partial_{y_1}) U_{11} = 0.$$
Using Lemma 1 and (13), this is equivalent to the following two equations:
$$\pi(\beta) (\partial_t + A_1 \partial_{y_1}) \pi(\beta) U_{11} = 0 \tag{17}$$
and
$$(I - \pi(\beta)) U_{31} = iL(\beta)^{-1} (\partial_t + A_1 \partial_{y_1}) \pi(\beta) U_{11};$$
that is, since $L(\beta)^{-1} \pi(\beta) = 0$,
$$(I - \pi(\beta)) U_{31} = iL(\beta)^{-1} A_1 \partial_{y_1} \pi(\beta) U_{11}. \tag{18}$$

Since (10) is nonlinear quadratic, we have to look for a $U_4$ with the second harmonic
$$U_4(T, Y, t, y_1, \theta) := U_{40}(T, Y, t, y_1) + U_{41}(T, Y, t, y_1) e^{i\theta} + \text{c.c.} + U_{42}(T, Y, t, y_1) e^{2i\theta} + \text{c.c.}$$
With the same method as above and using (15), we obtain the following equivalent equations to (10):
$$\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{20} = 2 \pi(0) f\big(\pi(\beta) U_{11}, \pi(\beta) U_{11}\big) \tag{19}$$
and
$$(I - \pi(0)) U_{40} = iL(0)^{-1} A_1 \partial_{y_1} \pi(0) U_{20} - 2i L(0)^{-1} f\big(\pi(\beta) U_{11}, \pi(\beta) U_{11}\big), \tag{20}$$
as far as the nonoscillating mode is concerned, and
$$\pi(\beta) (\partial_t + A_1 \partial_{y_1}) \pi(\beta) U_{21} + \pi(\beta) A_{II}(\partial_Y) \pi(\beta) U_{11} = 0 \tag{21}$$
and
$$(I - \pi(\beta)) U_{41} = iL(\beta)^{-1} A_1 \partial_{y_1} \pi(\beta) U_{21} + iL(\beta)^{-1} A_{II}(\partial_Y) U_{11} \tag{22}$$
for the first oscillating mode, and finally
$$U_{42} = -iL(2\beta)^{-1} f(U_{11}, U_{11}) \tag{23}$$
for the second harmonic, since $L(2\beta)$ is invertible thanks to Assumption 1.
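Since (11) is also nonlinear quadratic, we look for a $U_5$ of the same kind as $U_4$,
$$U_5(T, Y, t, y_1, \theta) := U_{50}(T, Y, t, y_1) + U_{51}(T, Y, t, y_1) e^{i\theta} + \text{c.c.} + U_{52}(T, Y, t, y_1) e^{2i\theta} + \text{c.c.},$$
and we obtain the following equivalent conditions:
$$\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{30} + \pi(0) A_{II}(\partial_Y) \pi(0) U_{20} = 4 \pi(0) f\big(\pi(\beta) U_{11}, \pi(\beta) U_{21}\big) \tag{24}$$
and
$$(I - \pi(0)) U_{50} = iL(0)^{-1} A_1 \partial_{y_1} \pi(0) U_{30} + iL(0)^{-1} A_{II}(\partial_Y) U_{20} - 4i L(0)^{-1} f(U_{11}, U_{21}), \tag{25}$$
as far as the nonoscillating mode is concerned, and
$$\pi(\beta) (\partial_t + A_1 \partial_{y_1}) U_{31} + \pi(\beta) A_{II}(\partial_Y) \pi(\beta) U_{21} + \partial_T \pi(\beta) U_{11} = 2 \pi(\beta) f(U_{11}, U_{20}) \tag{26}$$
and
$$(I - \pi(\beta)) U_{51} = iL(\beta)^{-1} (\partial_t + A_1 \partial_{y_1}) U_{31} + iL(\beta)^{-1} A_{II}(\partial_Y) U_{21} - 2i L(\beta)^{-1} f(U_{11}, U_{20}) \tag{27}$$
for the first order term of the Fourier expansion. The second harmonic $U_{52}$ is obtained in the same way as $U_{42}$,
$$U_{52} = -2i L(2\beta)^{-1} f(U_{11}, U_{21}). \tag{28}$$
Equation (26) involves $U_{31}$, which can be split under the form $U_{31} = \pi(\beta) U_{31} + (I - \pi(\beta)) U_{31}$. Plugging this decomposition into (26) and using the expression of $(I - \pi(\beta)) U_{31}$ given by (18) yields
$$
\begin{aligned}
\partial_T \pi(\beta) U_{11} & + i \pi(\beta) A_1 \partial_{y_1} L(\beta)^{-1} A_1 \partial_{y_1} \pi(\beta) U_{11} + \pi(\beta) (\partial_t + A_1 \partial_{y_1}) \pi(\beta) U_{31} \\
& + \pi(\beta) A_{II}(\partial_Y) \pi(\beta) U_{21} = 2 \pi(\beta) f\big(\pi(\beta) U_{11}, \pi(0) U_{20}\big).
\end{aligned}
\tag{29}
$$
We finally consider (12). In fact, we do not solve it entirely, but only its projection onto the range of $\pi(0)$. The equation thus obtained reads, thanks to (15)–(16),
$$
\begin{aligned}
\pi(0) (\partial_t + A_1 \partial_{y_1}) U_{40} & + \pi(0) A_{II}(\partial_Y) \pi(0) U_{30} + \pi(0) \partial_T \pi(0) U_{20} \\
& = \pi(0) \Big[ f\big(\pi(0) U_{20}, \pi(0) U_{20}\big) + 2 f\big(\pi(\beta) U_{21}, \pi(\beta) U_{21}\big) \Big] + 4 \pi(0) f\big(\pi(\beta) U_{11}, U_{31}\big).
\end{aligned}
\tag{30}
$$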
2.3. The transparency condition

Without any additional information, the equations found in the above section cannot be solved. We recall indeed that the scaling of our solutions is bigger than the normal scaling, so that the nonlinear effects should occur too soon to allow a study over large times. As said in the introduction, these nonlinear effects do not occur in many cases, provided that the following transparency condition is fulfilled.

Assumption 4 (Transparency). For any $a, b \in \mathbb{C}^N$, one has $\pi(0) f\big(\pi(\beta) a, \pi(\beta) b\big) = 0$.

Under this assumption, (19) becomes linear,
$$\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{20} = 0, \tag{31}$$
and so does (24), which reads
$$\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{30} + \pi(0) A_{II}(\partial_Y) \pi(0) U_{20} = 0. \tag{32}$$
We finally consider (30). Under the transparency assumption, it reads
$$
\begin{aligned}
\partial_T \pi(0) U_{20} & + \pi(0) (\partial_t + A_1 \partial_{y_1}) U_{40} + \pi(0) A_{II}(\partial_Y) \pi(0) U_{30} \\
& = \pi(0) f\big(\pi(0) U_{20}, \pi(0) U_{20}\big) + 4 \pi(0) f\big(\pi(\beta) U_{11}, (I - \pi(\beta)) U_{31}\big).
\end{aligned}
$$
We can now use the expression of $(I - \pi(\beta)) U_{31}$ given by (18), decompose $U_{40}$ under the form $U_{40} = \pi(0) U_{40} + (I - \pi(0)) U_{40}$, and use the expression of $(I - \pi(0)) U_{40}$ given by (20) to find
$$
\begin{aligned}
\partial_T \pi(0) U_{20} & + i \pi(0) A_1 \partial_{y_1} L(0)^{-1} A_1 \partial_{y_1} \pi(0) U_{20} + \pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{40} + \pi(0) A_{II}(\partial_Y) \pi(0) U_{30} \\
& = \pi(0) f\big(\pi(0) U_{20}, \pi(0) U_{20}\big) + 4 \pi(0) f\big(\pi(\beta) U_{11}, iL(\beta)^{-1} A_1 \partial_{y_1} \pi(\beta) U_{11}\big) \\
& \quad + 2i \pi(0) A_1 \partial_{y_1} L(0)^{-1} f\big(\pi(\beta) U_{11}, \pi(\beta) U_{11}\big).
\end{aligned}
\tag{33}
$$
2.4. Transport at the group velocity In this section, we review some of the profiles which are transported at the group velocity, since these profiles play the essential part in the asymptotic study. The first proposition we give is a simple consequence of the classical property of transport along rays.
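Proposition 2. When $\beta$ is a smooth point of $\mathcal{C}_L$ and under Proposition 1, one has
$$\pi(\beta) (\partial_t + A_1 \partial_{y_1}) \pi(\beta) = \big(\partial_t - \partial_1\tau(\eta) \partial_{y_1}\big) \pi(\beta) \quad \text{and} \quad \pi(\beta) A_{II}(\partial_Y) \pi(\beta) = 0.$$

Proof. It is known that for all $j$ one has $\pi(\beta) A_j \pi(\beta) = -\partial_j\tau(\eta) \pi(\beta)$. Since in the present case we have $\partial_j\tau(\eta) = 0$ when $j \ge 2$, the results of the proposition follow.

Using this proposition, together with (17) and (21), yields
$$\big(\partial_t - \partial_1\tau(\eta) \partial_{y_1}\big) \pi(\beta) U_{11} = 0 \quad \text{and} \quad \big(\partial_t - \partial_1\tau(\eta) \partial_{y_1}\big) \pi(\beta) U_{21} = 0, \tag{34}$$
so that both $\pi(\beta) U_{11}$ and $\pi(\beta) U_{21}$ are transported at the group velocity, since we recall that the group velocity reads $-\nabla\tau(\eta) = -(\partial_1\tau(\eta), 0, \dots, 0)$ in our coordinates. We finally prove that a component of $\pi(0) U_{20}$ also travels at the group velocity. We recall that, thanks to (31), one has $\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) U_{20} = 0$. Since $\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0)$ is a hyperbolic symmetric operator of dimension 1, we can decompose it under the form
$$\pi(0) (\partial_t + A_1 \partial_{y_1}) \pi(0) = \sum_{j=1}^{p} (\partial_t + v_j \partial_{y_1}) \pi^j(0), \tag{35}$$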
where the vj are the distinct eigenvalues of π(0)A1 π(0) and where π j (0) is the associated orthogonal projector defined on the range of π(0). Each component π j (0)U20 of π(0)U20 is therefore transported at the velocity vj , with respect to the variables t and y1 . The following lemma says that one of these components is tranported at the group velocity. lemma 2 The group velocity −∂1 τ (η) is an eigenvalue of π(0)A1 π(0). Proof The vector (1, −τ (η)) is by definition normal to the tangent plane P to CL at β. We recall that β 0 = (1, η0 ) is on the contact line between P and C 0 ; thanks to
363
LONG-WAVE SHORT-WAVE RESONANCE
Assumption 2, we thus know that $(1, -\tau'(\eta))$ is also normal to $C^0$ at $\beta^0$. Denoting by $\tau^0(\eta)$ a local parametrization of $C^0$ in a neighborhood of $\beta^0$, we have therefore $(\tau^0)'(\eta^0) = \tau'(\eta)$ and $\tau^0(\eta^0) = 1$. But since $C^0$ is conic, $\tau^0$ is homogeneous of degree 1, and Euler's formula yields $\tau^0(\eta^0) = (\tau^0)'(\eta^0)\cdot\eta^0$. It follows that $1 = \tau'(\eta)\cdot\eta^0$. Since in our coordinates we have $\eta^0 = (\eta_1^0, 0, \dots, 0)$, this last equality reads $1 = \partial_1\tau(\eta)\eta_1^0$, and $\beta^0$ thus reads $\beta^0 = (\partial_1\tau(\eta)\eta_1^0, \eta_1^0, 0, \dots, 0)$. We have therefore $L(\beta^0) = \eta_1^0(\partial_1\tau(\eta) + A_1)$. Since $\beta^0 \in C^0$, the endomorphism $\pi(0)L(\beta^0)\pi(0)$ is not invertible on the range of $\pi(0)$, and hence neither is $\pi(0)(\partial_1\tau(\eta) + A_1)\pi(0)$, thanks to the expression just found for $L(\beta^0)$. This means that $-\partial_1\tau(\eta)$ is an eigenvalue of $\pi(0)A_1\pi(0)$, and the lemma is thus proved.

Convention
In other words, the lemma says that there exists $j$ such that $v_j = -\partial_1\tau(\eta)$. Up to a change of indices, we suppose from now on that $v_1 = -\partial_1\tau(\eta)$, so that $\pi^1(0)U_{20}$ travels at the group velocity.

2.5. Averaging
We now use the average projectors introduced in [L1] to obtain new equations that describe the asymptotic behavior of the solution for long times. We first recall the definition of the average projector in the case in which we are interested.

Definition 1 (Average projector)
Let $T(\partial_x) := \partial_t + v\partial_{y_1}$ be a transport operator. The average projector associated to $T$ is the operator $G_T$ defined on smooth functions on $\mathbb{R}^2_{t,y_1}$ as
\[ G_T w(t, y_1) = \lim_{h\to\infty} \frac{1}{h}\int_0^h w(t+s, y_1+vs)\,ds \]
when this limit exists. If $v = -\partial_1\tau(\eta)$ and if the function $G_T w$ exists, it is denoted by $\langle w\rangle$. The following proposition gives the properties of $G_T$ which we need in this paper.

Proposition 3
(i) Let $T(\partial_x) = \partial_t + v\partial_{y_1}$.
• If $w$ is a smooth function of $(t, y_1) \in \mathbb{R}^2$ such that $T(\partial_x)w = 0$, then we have $G_T w = w$;
COLIN and LANNES
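• If $w$ has a sublinear growth, that is, if $\lim_{t\to\infty}(1/t)\|w(t,\cdot)\|_\infty = 0$, then $G_T T(\partial_x)w = 0$.
(ii) Let $v_1 \neq v_2$, $T_1(\partial_x) := \partial_t + v_1\partial_{y_1}$, and $T_2(\partial_x) := \partial_t + v_2\partial_{y_1}$. If $w$ is such that $T_1(\partial_x)w = 0$, then $G_{T_2}w = 0$.
(iii) Let $T_1(\partial_x) := \partial_t + v_1\partial_{y_1}$, $T_2(\partial_x) := \partial_t + v_2\partial_{y_1}$, and $T(\partial_x) := \partial_t + v\partial_{y_1}$, and suppose that $T_1(\partial_x)w_1 = T_2(\partial_x)w_2 = 0$. Then $G_T f(w_1, w_2) = 0$ unless $v = v_1 = v_2$, in which case $G_T f(w_1, w_2) = f(w_1, w_2)$.
(iv) If $w$ has a sublinear growth, $\langle w\rangle$ is well defined, and $v \neq -\partial_1\tau(\eta)$, then one has
\[ \langle(\partial_t + v\partial_{y_1})w\rangle = (\partial_1\tau(\eta) + v)\,\partial_{y_1}\langle w\rangle. \]

Proof
We only prove (iv) since all the other assertions of the proposition can be found in [L1]. One has
\[
\begin{aligned}
\frac{1}{h}\int_0^h (\partial_t + v\partial_{y_1})w(t+s, y-\partial_1\tau(\eta)s)\,ds
&= \frac{1}{h}\int_0^h \Bigl[\partial_s\bigl(w(t+s, y-\partial_1\tau(\eta)s)\bigr) + (v+\partial_1\tau(\eta))\,\partial_{y_1}w(t+s, y-\partial_1\tau(\eta)s)\Bigr]ds \\
&= \frac{1}{h}\bigl[w(t+h, y-\partial_1\tau(\eta)h) - w(t,y)\bigr] + \frac{1}{h}\int_0^h (v+\partial_1\tau(\eta))\,\partial_{y_1}w(t+s, y-\partial_1\tau(\eta)s)\,ds \\
&= \frac{1}{h}\bigl[w(t+h, y-\partial_1\tau(\eta)h) - w(t,y)\bigr] + (v+\partial_1\tau(\eta))\,\partial_{y_1}\frac{1}{h}\int_0^h w(t+s, y-\partial_1\tau(\eta)s)\,ds.
\end{aligned}
\]
Since $w$ has a sublinear growth, the first of these two terms tends to zero when $h \to \infty$. The second of these terms tends toward $(\partial_1\tau(\eta) + v)\partial_{y_1}\langle w\rangle$ since $\langle w\rangle$ is well defined. The assertion of the proposition is thus proved.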
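The averaging properties of Proposition 3(i) and (ii) can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the finite horizon `h` stands in for the limit $h \to \infty$, and the velocities `v1`, `v2`, the test profile `w`, and the helper `average_projector` are all illustrative assumptions.

```python
import math

def average_projector(w, v, t, y, h=2000.0, n=200_000):
    """Approximate G_T w(t, y) = (1/h) * int_0^h w(t+s, y+v*s) ds (midpoint rule).

    The finite horizon h replaces the limit h -> infinity, so the result is
    only accurate up to an error of order 1/h."""
    ds = h / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += w(t + s, y + v * s)
    return total * ds / h

v1, v2 = 1.0, -0.5  # two distinct transport velocities (illustrative values)

def w(t, y):
    # A profile transported at velocity v1: (d_t + v1 d_y) w = 0.
    return math.cos(y - v1 * t)

t0, y0 = 0.3, 0.7
same = average_projector(w, v1, t0, y0)   # averaging along the velocity of w
other = average_projector(w, v2, t0, y0)  # averaging along a different velocity
print(same, w(t0, y0), other)
```

Averaging along the velocity of $w$ reproduces $w(t_0, y_0)$, while averaging along a different velocity gives a value that decays like $1/h$, in line with (i) and (ii).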
We first use these results to solve (32), which reads
\[ \pi(0)(\partial_t + A_1\partial_{y_1})\pi(0)U_{30} + \pi(0)A_{II}(\partial_Y)\pi(0)U_{20} = 0. \]
The solution to this equation is not unique; the following lemma gives the most natural one.

Lemma 3
As a solution to (32), one can take $\pi(0)U_{20} = \pi^1(0)U_{20}$ and
\[ \pi(0)U_{30} = \langle\pi(0)U_{30}\rangle = -\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta) + v_j}\,\partial_{y_1}^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \]
(where the $v_j$, $j \ge 2$, are the eigenvalues of $\pi(0)A_1\pi(0)$ distinct from $-\partial_1\tau(\eta)$).

Proof
Using decomposition (35), equation (32) reads
\[ \sum_{j=1}^{p}(\partial_t + v_j\partial_{y_1})\pi^j(0)U_{30} + \pi(0)A_{II}(\partial_Y)\pi(0)U_{20} = 0, \]
with $v_1 = -\partial_1\tau(\eta)$ and $v_j \neq v_1$ for $j \ge 2$. We also recall that $\pi(0)U_{20} = \sum_{j=1}^{p}\pi^j(0)U_{20}$ with $(\partial_t + v_j\partial_{y_1})\pi^j(0)U_{20} = 0$ for all $j$, so that
\[ \sum_{j=1}^{p}(\partial_t + v_j\partial_{y_1})\pi^j(0)U_{30} + \pi(0)A_{II}(\partial_Y)\sum_{j=1}^{p}\pi^j(0)U_{20} = 0. \]
Multiplying this equation on the left by $\pi^j(0)$, with $1 \le j \le p$, yields
\[ (\partial_t + v_j\partial_{y_1})\pi^j(0)U_{30} + \pi^j(0)A_{II}(\partial_Y)\sum_{k=1}^{p}\pi^k(0)U_{20} = 0. \]
Let us introduce the operator $T_j(\partial_x) := \partial_t + v_j\partial_{y_1}$. Since we impose that $U_{30}$ has a sublinear growth, we can apply the average projector $G_{T_j}$ to the above equation and use Proposition 3 to find
\[ \pi^j(0)A_{II}(\partial_Y)\pi^j(0)U_{20} = 0. \tag{36}$_j$ \]
When $j \ge 2$, the operator $\pi^j(0)A_{II}(\partial_Y)\pi^j(0)$ is in general not equal to zero, so that we take $\pi^j(0)U_{20} = 0$ as a solution to (36)$_j$.
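The projection step used above relies only on the spectral decomposition of a symmetric matrix into orthogonal eigenprojectors. The following is a minimal sketch on a hypothetical $2 \times 2$ symmetric matrix standing in for $\pi(0)A_1\pi(0)$; the matrix entries and the helpers `eig_projectors_2x2` and `matmul` are assumptions made for illustration only.

```python
import math

def eig_projectors_2x2(a, b, d):
    """Eigenvalues and orthogonal eigenprojectors of the real symmetric
    matrix [[a, b], [b, d]], assuming its two eigenvalues are distinct."""
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    if rad == 0.0:
        raise ValueError("eigenvalues coincide; this formula does not apply")
    v = [mean + rad, mean - rad]
    A = [[a, b], [b, d]]
    projectors = []
    for j in range(2):
        vk = v[1 - j]  # the other eigenvalue
        # P_j = (A - v_k I) / (v_j - v_k)
        P = [[(A[r][c] - (vk if r == c else 0.0)) / (v[j] - vk)
              for c in range(2)] for r in range(2)]
        projectors.append(P)
    return v, projectors

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

A = [[2.0, 1.0], [1.0, -1.0]]  # hypothetical stand-in for pi(0) A_1 pi(0)
v, P = eig_projectors_2x2(2.0, 1.0, -1.0)

# Spectral decomposition: A = v_1 P_1 + v_2 P_2.
recon = [[v[0] * P[0][r][c] + v[1] * P[1][r][c] for c in range(2)]
         for r in range(2)]
# Left multiplication by P_1 isolates the first component: P_1 A = v_1 P_1.
PA = matmul(P[0], A)
# Distinct eigenprojectors annihilate each other: P_1 P_2 = 0.
cross = matmul(P[0], P[1])
print(v, recon, cross)
```

In this toy setting, left multiplication by one projector keeps exactly one transport velocity, which is the mechanism that separates the components $\pi^j(0)U_{30}$ in the proof.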
When $j = 1$, things are different since $\pi^1(0)A_{II}(\partial_Y)\pi^1(0) = 0$, as we now prove. This is done in two steps.
(i) One has $\ker \pi(0)(\partial_1\tau(\eta)I + A_1)\pi(0) = \ker \pi(0)(I + A_1\eta_1^0)\pi(0)$. Indeed, one has $\ker \pi(0)(\partial_1\tau(\eta)I + A_1)\pi(0) = \ker \pi(0)(\partial_1\tau(\eta)\eta_1^0 I + A_1\eta_1^0)\pi(0)$, and $1 = \partial_1\tau(\eta)\eta_1^0$, as we have seen in the proof of Lemma 2.
(ii) As in the proof of Lemma 2, denote by $\tau^0(\eta)$ a local parametrization of $C^0$ in a neighborhood of $\beta^0$. Denote by $\pi^0(\eta)$ the orthogonal projector on $\ker \pi(0)L(\tau^0(\eta), \eta)\pi(0)$. Thanks to (i), we know that $\pi^0(\eta^0) = \pi^1(0)$. We thus have
\[ \pi^1(0)A_j\pi^1(0) = -\partial_j\tau^0(\eta^0)\,\pi^1(0), \]
and since $(\tau^0)'(\eta^0) = \tau'(\eta) = (\partial_1\tau(\eta), 0, \dots, 0)$, we have $\pi^1(0)A_j\pi^1(0) = 0$ for all $j \ge 2$, and therefore $\pi^1(0)A_{II}(\partial_Y)\pi^1(0) = 0$, as wanted.
Therefore, (36)$_1$ does not impose any condition, so that the choice of $\pi(0)U_{20} = \pi^1(0)U_{20}$ is free.
Before giving an expression for $\pi(0)U_{30}$, first remark that if $U_{20}$ is regular enough, $\pi^j(0)U_{30}$, for $j \ge 2$, is a sum of regular functions that travel at velocity $v_1$ or $v_j$, so that $\langle\pi^j(0)U_{30}\rangle$ exists. Thanks to Proposition 3, applying $G_{T_1}$ on (36)$_j$ yields
\[ (\partial_1\tau(\eta) + v_j)\,\partial_{y_1}\langle\pi^j(0)U_{30}\rangle = -\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20}. \]
It is then easy to see that the function given in the lemma indeed solves (32).

Remark 1
As one can see in the proof of Lemma 3, the solution given by the lemma is not the only possible one, but it is the simplest and most natural.

We now use the average projector to obtain two new equations equivalent to (29). We recall that $\pi(\beta)U_{11}$, $\pi(\beta)U_{21}$, and $\pi^1(0)U_{20}$ are transported at the group velocity and are therefore left invariant by the action of the average projector. Using also Proposition 2 and the fact that $U_{31}$ has a sublinear growth, one then finds, after applying the average projector to (29), that this equation is equivalent to the couple of equations
\[ \partial_T\pi(\beta)U_{11} + i\pi(\beta)A_1\partial_{y_1}L(\beta)^{-1}A_1\partial_{y_1}\pi(\beta)U_{11} = 2\pi(\beta)f\bigl(\pi(\beta)U_{11}, \pi^1(0)U_{20}\bigr) \tag{37} \]
and
\[ \pi(\beta)(\partial_t + A_1\partial_{y_1})\pi(\beta)U_{31} = 2\pi(\beta)f\Bigl(\pi(\beta)U_{11}, \sum_{j=2}^{p}\pi^j(0)U_{20}\Bigr). \]
Since one has $\pi(0)U_{20} = \pi^1(0)U_{20}$ from Lemma 3, this last equation reads
\[ \pi(\beta)(\partial_t + A_1\partial_{y_1})\pi(\beta)U_{31} = 0, \tag{38} \]
which is equivalent to saying that $\pi(\beta)U_{31}$ is transported at the group velocity, thanks to Proposition 2.
In the evolution equation for $U_{11}$ given by (37), the corrector $U_{20}$ also appears, and we therefore need another profile equation in order to determine $U_{11}$ and $U_{20}$. This second equation is derived from (33). In fact, we do not solve (33), but only its spectral component on the range of $\pi^1(0)$. It is obtained by multiplying (33) on the left by $\pi^1(0)$. Using decomposition (35) and Lemma 3, this reads
\[
\begin{aligned}
&\partial_T\pi^1(0)U_{20} + i\pi^1(0)A_1\partial_{y_1}L(0)^{-1}A_1\partial_{y_1}\pi^1(0)U_{20} + (\partial_t - \partial_1\tau(\eta)\partial_{y_1})\pi^1(0)U_{40} + \pi^1(0)A_{II}(\partial_Y)\pi(0)U_{30} \\
&\quad = \pi^1(0)f\bigl(\pi^1(0)U_{20}, \pi^1(0)U_{20}\bigr) + 4\pi^1(0)f\bigl(\pi(\beta)U_{11}, iL(\beta)^{-1}A_1\partial_{y_1}\pi(\beta)U_{11}\bigr) + 2i\pi^1(0)A_1\partial_{y_1}L(0)^{-1}f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr).
\end{aligned}
\]
Since $U_{20}$, $U_{30}$, and $U_{11}$ travel at the group velocity and since we impose that $U_{40}$ has a sublinear growth, we can see by applying the average projector $G$ to this equation that it is equivalent to
\[
\begin{aligned}
&\partial_T\pi^1(0)U_{20} + i\pi^1(0)A_1\partial_{y_1}L(0)^{-1}A_1\partial_{y_1}\pi^1(0)U_{20} - \pi^1(0)A_{II}(\partial_Y)\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\partial_{y_1}^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \\
&\quad = \pi^1(0)f\bigl(\pi^1(0)U_{20}, \pi^1(0)U_{20}\bigr) + 4\pi^1(0)f\bigl(\pi(\beta)U_{11}, iL(\beta)^{-1}A_1\partial_{y_1}\pi(\beta)U_{11}\bigr) + 2i\pi^1(0)A_1\partial_{y_1}L(0)^{-1}f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr)
\end{aligned}
\tag{39}
\]
and
\[ (\partial_t - \partial_1\tau(\eta)\partial_{y_1})\pi^1(0)U_{40} = 0. \]
Equation (39) is the coupled equation on $U_{11}$ and $U_{20}$ for which we were looking.

2.6. The evolution system for $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20}$
We simplify here (37) and (39), which yields a system on $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20}$ that is easier to handle. We first need the following proposition.

Proposition 4
(i) One has
\[ \pi(\beta)A_1\partial_{y_1}L(\beta)^{-1}A_1\partial_{y_1}\pi(\beta) = \frac{1}{2}\partial_1^2\tau(\eta)\,\pi(\beta)\,\partial_{y_1}^2 \]
and
\[ \pi^1(0)A_1\partial_{y_1}L(0)^{-1}A_1\partial_{y_1}\pi^1(0) = 0. \]
(ii) For any $a, b \in \mathbb{R}^N$, we have $\pi^1(0)f(\pi^1(0)a, \pi^1(0)b) = 0$.
(iii) The first quadratic term in $U_{11}$ in (39) is a derivative:
\[ 4\pi^1(0)f\bigl(\pi(\beta)U_{11}, iL(\beta)^{-1}A_1\partial_{y_1}\pi(\beta)U_{11}\bigr) = -2i\partial_{y_1}\pi^1(0)f\bigl(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr). \]

Proof
(i) The first assertion of this point is very classical and can, for instance, be found in [DJMR]. We now prove the second assertion. For any $\beta_1 := (\tau, \eta_1) \in \mathbb{R}^2$, introduce $L_I(\tau, \eta_1) := \tau I + A_1\eta_1 + L_0/i$. The associated characteristic variety $C_I$ is parametrized by $(\tau_I^j(\eta_1))_{j=1,\dots,r}$, where, up to a change of indices, $\tau_I^1, \dots, \tau_I^s$ denotes the $\tau_I^j$ such that $\tau_I^j(0) = 0$. Since we are in dimension 1, the $\tau_I^j$ are analytic functions (cf. [K]) and are odd when $j \le s$ (because $A_1$ and $L_0$ are real). We also denote by $\pi_I^j(\eta_1)$ the projector on $\ker L_I(\tau_I^j(\eta_1), \eta_1)$ when $\beta_1$ is smooth. These functions can be analytically extended to $\mathbb{R}$ (cf. [K]).
(a) We prove here that $\pi_I^j(0) = \pi^j(0)$ for $1 \le j \le s$ (where $\pi_I^j(0)$ denotes the analytic extension of $\pi_I^j(\eta_1)$ to zero). We know that the characteristic variety $C_I^0$ defined as $\{(\tau, \eta_1),\ \det(\pi(0)(\tau I + A_1\eta_1 + L_0/i)\pi(0)) = 0\}$ is the tangent cone to $C_I$ at $(0,0)$ (see [L1]); as we are in space dimension 1, it is a union of straight lines. But, thanks to decomposition (35), we can write
\[ \pi(0)A_1\pi(0) = \sum_{j=1}^{p} v_j\pi^j(0), \]
so that
\[ C_I^0 = \{\tau + v_j\eta_1 = 0,\ j = 1, \dots, p\}. \tag{40} \]
On the other hand, since $C_I$ is the union of the analytic curves $\tau_I^j$, the tangent cone $C_I^0$ is given by
\[ C_I^0 = \{\tau - (\tau_I^j)'(0)\eta_1 = 0,\ j = 1, \dots, s\}. \tag{41} \]
Thanks to (40) and (41), we know that $p = s$ and that, up to a change of indices, $v_j = -(\tau_I^j)'(0)$. We also have
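\[ \pi_I^j(\eta_1)\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr) = 0, \qquad \forall\eta_1. \]
Differentiating this equality yields
\[ (\pi_I^j)'(\eta_1)\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr) + \pi_I^j(\eta_1)\bigl((\tau_I^j)'(\eta_1) + A_1\bigr) = 0. \tag{42} \]
Taking the limit of this equality when $\eta_1 \to 0$ and multiplying on the right by $\pi_I^j(0)$ yields
\[ \pi_I^j(0)A_1\pi_I^j(0) = -(\tau_I^j)'(0)\,\pi_I^j(0). \]
Since $-(\tau_I^j)'(0) = v_j$, this means that $\pi_I^j(0)$ is the eigenprojector associated to the eigenvalue $v_j$ of $\pi_I^j(0)A_1\pi_I^j(0)$, and therefore $\pi_I^j(0) = \pi^j(0)$.
(b) We now introduce
\[ \varphi(\eta_1) = \frac{\bigl(\pi_I^1(\eta_1) + \dots + \pi_I^p(\eta_1)\bigr)A_1\bigl(\pi_I^1(\eta_1) + \dots + \pi_I^p(\eta_1)\bigr) - \pi(0)A_1\pi(0)}{\eta_1}, \]
and we prove that
\[ \lim_{\eta_1\to0}\varphi(\eta_1) = -\sum_{j=1}^{p}(\tau_I^j)'(0)(\pi_I^j)'(0) + \sum_{j,k,\,k\neq j}\bigl((\tau_I^k)'(0) - (\tau_I^j)'(0)\bigr)(\pi_I^j)'(0)\pi_I^k(0). \tag{43} \]
We know that $\pi_I^j(\eta_1)A_1\pi_I^j(\eta_1) = -(\tau_I^j)'(\eta_1)\pi_I^j(\eta_1)$ and, as we have seen in (a), that $\pi(0)A_1\pi(0) = -\sum_{j=1}^{p}(\tau_I^j)'(0)\pi_I^j(0)$. We therefore have
\[
\begin{aligned}
\varphi(\eta_1) &= \sum_{j=1}^{p}\frac{(\tau_I^j)'(0) - (\tau_I^j)'(\eta_1)}{\eta_1}\,\pi_I^j(\eta_1) + \sum_{j=1}^{p}(\tau_I^j)'(0)\,\frac{\pi_I^j(0) - \pi_I^j(\eta_1)}{\eta_1} + \sum_{k\neq j}\frac{\pi_I^j(\eta_1)A_1\pi_I^k(\eta_1)}{\eta_1} \\
&:= A + B + C.
\end{aligned}
\]
One then has
• $A \to -\sum_j(\tau_I^j)''(0)\pi_I^j(0)$ when $\eta_1 \to 0$, and therefore $A \to 0$ since the $\tau_I^j$ are odd for $j \le p$;
• $B \to -\sum_j(\tau_I^j)'(0)(\pi_I^j)'(0)$ when $\eta_1 \to 0$;
• $C \to \sum_{j\neq k}\bigl((\tau_I^k)'(0) - (\tau_I^j)'(0)\bigr)(\pi_I^j)'(0)\pi_I^k(0)$ when $\eta_1 \to 0$.
In order to prove this last result, we first multiply (42) on the right by $\pi_I^k(\eta_1)$, for $k \neq j$,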
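\[ (\pi_I^j)'(\eta_1)\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr)\pi_I^k(\eta_1) + \pi_I^j(\eta_1)A_1\pi_I^k(\eta_1) = 0, \]
and thus,
\[ \bigl(\tau_I^j(\eta_1) - \tau_I^k(\eta_1)\bigr)(\pi_I^j)'(\eta_1)\pi_I^k(\eta_1) + \pi_I^j(\eta_1)A_1\pi_I^k(\eta_1) = 0. \]
We have therefore
\[ \sum_{k\neq j}\pi_I^j(\eta_1)A_1\pi_I^k(\eta_1) = \sum_{k\neq j}\bigl(\tau_I^k(\eta_1) - \tau_I^j(\eta_1)\bigr)(\pi_I^j)'(\eta_1)\pi_I^k(\eta_1). \]
We just have to divide this equality by $\eta_1$ and take the limit when $\eta_1 \to 0$ to obtain the desired result. Since $\varphi(\eta_1) = A + B + C$, equality (43) is proved.
(c) Let us introduce $\Pi(\eta_1) := \pi_I^1(\eta_1) + \dots + \pi_I^p(\eta_1)$. One has $\Pi(0) = \pi(0)$, and $\Pi$ is an analytic function of $\eta_1$; we prove here the equality
\[ L(0)^{-1}A_1\pi(0) + (I - \pi(0))\Pi'(0) = 0. \tag{44} \]
In order to prove this relation, first notice that
\[ \prod_{j=1}^{p}\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr)\Pi(\eta_1) = 0. \]
Differentiating this equality with respect to $\eta_1$ yields
\[
\begin{aligned}
&\sum_{k=1}^{p}\prod_{j<k}\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr)\bigl((\tau_I^k)'(\eta_1) + A_1\bigr)\prod_{j>k}\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr)\Pi(\eta_1) \\
&\qquad + \prod_{j=1}^{p}\Bigl(\tau_I^j(\eta_1) + A_1\eta_1 + \frac{1}{i}L_0\Bigr)\Pi'(\eta_1) = 0.
\end{aligned}
\]
Taking the limit of this expression when $\eta_1 \to 0$ yields
\[ \sum_{k=1}^{p}\prod_{j<k}\Bigl(\frac{1}{i}L_0\Bigr)\bigl((\tau_I^k)'(0) + A_1\bigr)\prod_{j>k}\Bigl(\frac{1}{i}L_0\Bigr)\Pi(0) + \prod_{j=1}^{p}\Bigl(\frac{1}{i}L_0\Bigr)\Pi'(0) = 0; \]
that is, since $L_0\Pi(0) = 0$ and $p = s$,
\[ \Bigl(\frac{L_0}{i}\Bigr)^{s-1}A_1\Pi(0) + \Bigl(\frac{L_0}{i}\Bigr)^{s}\Pi'(0) = 0. \]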
Multiplying this equality on the left by $(L(0)^{-1})^s$ then yields equality (44).
(d) We prove here that
\[ \lim_{\eta_1\to0}\varphi(\eta_1) = -2\pi(0)A_1L(0)^{-1}A_1\pi(0) - \Pi'(0)\sum_j(\tau_I^j)'(0)\pi_I^j(0) - \sum_j(\tau_I^j)'(0)\pi_I^j(0)\,\Pi'(0). \tag{45} \]
Indeed, one has
\[ \varphi(\eta_1) = \frac{\Pi(\eta_1) - \pi(0)}{\eta_1}\,A_1\Pi(\eta_1) + \pi(0)A_1\,\frac{\Pi(\eta_1) - \pi(0)}{\eta_1}, \]
and therefore
\[ \lim_{\eta_1\to0}\varphi(\eta_1) = \Pi'(0)A_1\Pi(0) + \Pi(0)A_1\Pi'(0). \]
But thanks to (c), it is easy to see that
\[
\begin{aligned}
\Pi'(0)A_1\Pi(0) &= \Pi'(0)\bigl(I - \Pi(0)\bigr)A_1\Pi(0) + \Pi'(0)\Pi(0)A_1\Pi(0) \\
&= -\Pi(0)A_1L(0)^{-1}A_1\Pi(0) - \Pi'(0)\sum_j(\tau_I^j)'(0)\pi_I^j(0).
\end{aligned}
\]
We just have to transpose this equality to find $\Pi(0)A_1\Pi'(0)$, and thus (45) is proved.
(e) Thanks to (43) and (45), we find an expression for $\pi(0)A_1L(0)^{-1}A_1\pi(0)$. It is then easy to see that if we multiply this expression on both sides by $\pi^1(0)$, we find zero, so that the second assertion of point (i) of the proposition is proved.
(ii) Thanks to what we have seen in the proof of (i), we can write
\[ \pi^1(0)f\bigl(\pi^1(0)a, \pi^1(0)b\bigr) = \lim_{\eta_1\to0}\pi_I^1(0)f\bigl(\pi_I^1(\eta_1)a, \pi_I^1(\eta_1)b\bigr). \]
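Since $\pi_I^1(\eta_1) = \pi(\beta_1)$, with $\beta_1 := (\tau_I^1(\eta_1), \eta_1, 0, \dots, 0) \in \mathbb{R}^{1+d}$, the right-hand side of the above equation is equal to zero, thanks to Assumption 4, and the result follows.
(iii) The proof of this point can be found in [C].

We have thus proved the following theorem.

Theorem 1
Suppose that $u^\varepsilon$, given by
\[ u^\varepsilon(x) = \sqrt{\varepsilon}\,U\Bigl(\varepsilon, \varepsilon t, \sqrt{\varepsilon}\,y_{II}, t, y_1, \frac{\beta\cdot x}{\varepsilon}\Bigr), \]
with $U := U_1 + \sqrt{\varepsilon}\,U_2 + \varepsilon U_3 + \varepsilon^{3/2}U_4 + \varepsilon^2U_5$, is the approximate solution to (1) given by geometric optics. If $U_1 = U_{11}e^{i\theta} + \text{c.c.}$ and $U_2 = U_{20} + U_{21}e^{i\theta} + \text{c.c.}$, then one has
\[ \pi(\beta)U_{11} = U_{11}, \qquad U_{21} = 0, \qquad\text{and}\qquad \pi(0)U_{20} = U_{20}. \]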
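Moreover, $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20} = \pi(0)U_{20}$ are transported at the group velocity $-\partial_1\tau(\eta)$, that is,
\[ (\partial_t - \partial_1\tau(\eta)\partial_{y_1})\pi(\beta)U_{11} = (\partial_t - \partial_1\tau(\eta)\partial_{y_1})\pi^1(0)U_{20} = 0, \]
and must also satisfy
\[ \partial_T\pi(\beta)U_{11} + \frac{i}{2}\partial_1^2\tau(\eta)\,\partial_{y_1}^2\pi(\beta)U_{11} = 2\pi(\beta)f\bigl(\pi(\beta)U_{11}, \pi^1(0)U_{20}\bigr) \tag{46} \]
and
\[
\begin{aligned}
&\partial_T\pi^1(0)U_{20} - \pi^1(0)A_{II}(\partial_Y)\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\partial_{y_1}^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \\
&\quad = -2i\partial_{y_1}\pi^1(0)f\bigl(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) + 2i\pi^1(0)A_1\partial_{y_1}L(0)^{-1}f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr).
\end{aligned}
\tag{47}
\]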
3. The approximate solution $u^\varepsilon$ and its properties

3.1. The leading terms of the ansatz
We want to know the leading term $U_1$ of ansatz (3). We have seen that $U_1 = U_{11}e^{i\theta} + \text{c.c.}$ and that $U_{11}(T, Y, t, y_1)$ satisfies the polarization condition $U_{11} = \pi(\beta)U_{11}$, together with the transport equation $(\partial_t - \partial_1\tau(\eta)\partial_{y_1})U_{11} = 0$, so that $U_{11}(T, Y, t, y_1)$ may be written under the form $U_{11}(T, Y, \zeta)$, where $\zeta := y_1 + t\,\partial_1\tau(\eta)$. The second term of the ansatz reads $U_2 = U_{20} + U_{21}e^{i\theta} + \text{c.c.}$, and its nonoscillating mode $U_{20}$ satisfies the polarization condition $U_{20} = \pi^1(0)U_{20}$ together with the same transport equation as $U_{11}$, so that we can also write $U_{20}(T, Y, t, y_1)$ under the form $U_{20}(T, Y, \zeta)$.
We have seen that the slow evolutions of $U_{11}$ and $U_{20}$ are coupled by (46) and (47). Such a system presents many difficulties. In this paper, we assume that it admits sufficiently regular solutions and pursue the analysis.

Assumption 5
Let $U_{11}^0 = \pi(\beta)U_{11}^0$ and $U_{20}^0 = \pi^1(0)U_{20}^0$ be in $H^\infty(\mathbb{R}^d_{Y,\zeta})$. There exist a $T > 0$, an integer $s$ sufficiently large, and a unique couple of profiles $U_{11}, U_{20} \in C([0,T]; H^s(\mathbb{R}^d_{Y,\zeta}))$ satisfying
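\[
(S)\quad
\begin{cases}
\partial_T\pi(\beta)U_{11} + \dfrac{i}{2}\partial_1^2\tau(\eta)\,\partial_\zeta^2\pi(\beta)U_{11} = 2\pi(\beta)f\bigl(\pi(\beta)U_{11}, \pi^1(0)U_{20}\bigr), \\[2mm]
\partial_T\pi^1(0)U_{20} - \pi^1(0)A_{II}(\partial_Y)\displaystyle\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\partial_\zeta^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \\[2mm]
\qquad = -2i\partial_\zeta\pi^1(0)f\bigl(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) + 2i\pi^1(0)A_1L(0)^{-1}\partial_\zeta f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr),
\end{cases}
\]
together with the polarization conditions
\[ U_{11} = \pi(\beta)U_{11} \qquad\text{and}\qquad U_{20} = \pi^1(0)U_{20} \]
and with the initial conditions
\[ U_{11}\big|_{T=0} = U_{11}^0 \qquad\text{and}\qquad U_{20}\big|_{T=0} = U_{20}^0. \]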
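The reduction to the single variable $\zeta = y_1 + t\,\partial_1\tau(\eta)$ rests on the fact that any smooth function of $\zeta$ solves the transport equation $(\partial_t - \partial_1\tau(\eta)\partial_{y_1})w = 0$. The following is a minimal finite-difference sanity check; the constant `c` (standing in for $\partial_1\tau(\eta)$), the test `profile`, and the helper `transport_residual` are illustrative assumptions, not quantities taken from the paper.

```python
import math

c = 0.8  # stands in for d_1 tau(eta); illustrative value only

def profile(zeta):
    # Arbitrary smooth test profile of the single variable zeta.
    return math.exp(-zeta ** 2) * math.sin(3.0 * zeta)

def u(t, y):
    # A function of (t, y_1) that depends only on zeta = y_1 + c * t.
    return profile(y + c * t)

def transport_residual(t, y, h=1e-5):
    """Central-difference approximation of (d_t - c d_y) u at (t, y)."""
    du_dt = (u(t + h, y) - u(t - h, y)) / (2.0 * h)
    du_dy = (u(t, y + h) - u(t, y - h)) / (2.0 * h)
    return du_dt - c * du_dy

residuals = [abs(transport_residual(t, y))
             for t in (0.0, 0.5, 2.0) for y in (-1.0, 0.2, 1.5)]
print(max(residuals))
```

The residual vanishes up to discretization and rounding error, which is why the profiles $U_{11}$ and $U_{20}$ can be regarded as functions of $(T, Y, \zeta)$ only.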
Remark 2
In Section 4 we show that this assumption holds for Maxwell-Bloch systems satisfying a strong transparency condition. We also prove that this assumption is satisfied in the 1-dimensional case in Section 5. Finally, we give in Section 6 an existence theorem for a simplified system arising also in the study of water waves.

Under this assumption, the profiles $U_{11}$ and $U_{20}$ may be determined, and the other terms follow, as we now see.

3.2. Corrector terms of the ansatz
In this section, we suppose that $U_{11}$ and $U_{20}$ are known, and we construct the missing terms of ansatz (3) in accordance with the equations found in Section 2.
The leading term $U_1$ is already known since $U_1 = U_{11}e^{i\theta} + \text{c.c.}$, but we still have to find $U_{21}$ to determine the first corrector $U_2$. The only conditions found so far on $U_{21}$ are the polarization condition (14) and the transport equation (34). We can therefore take $U_{21} = 0$.
The second corrector $U_3$ reads $U_3 = U_{30} + U_{31}e^{i\theta} + \text{c.c.}$ The nonoscillating component satisfies the polarization condition (16), that is, $U_{30} = \pi(0)U_{30}$, and is therefore given by Lemma 3. The component $\pi(\beta)U_{31}$ of the oscillating mode must only satisfy the transport equation (38) and can therefore be taken equal to zero. The component $(I - \pi(\beta))U_{31}$ is given in terms of $U_{11}$ by (18).
For the corrector $U_4 = U_{40} + U_{41}e^{i\theta} + \text{c.c.} + U_{42}e^{2i\theta} + \text{c.c.}$, we obtain similarly $(I - \pi(0))U_{40}$, thanks to (20), and we can take $\pi(0)U_{40} = 0$. The component $(I - \pi(\beta))U_{41}$ of the first oscillating mode is given by (22), and we can take $\pi(\beta)U_{41} = 0$. The second harmonic is found using (23).
Finally, for the last corrector $U_5 = U_{50} + U_{51}e^{i\theta} + \text{c.c.} + U_{52}e^{2i\theta} + \text{c.c.}$, we obtain $(I - \pi(0))U_{50}$, thanks to (25), and we can take $\pi(0)U_{50} = 0$. The component $(I - \pi(\beta))U_{51}$ is given by (27), while $\pi(\beta)U_{51}$ can also be taken equal to zero. The second harmonic is given by (28) and is therefore equal to zero, since $U_{21} = 0$.
All the components of the ansatz (3),
\[ U(\varepsilon, T, Y, t, y_1, \theta) := \bigl(U_1 + \sqrt{\varepsilon}\,U_2 + \varepsilon U_3 + \varepsilon^{3/2}U_4 + \varepsilon^2U_5\bigr)(\varepsilon, T, Y, t, y_1, \theta), \]
are therefore known, once Assumption 5 is made. The dependence on $t$ and $y_1$ of all these profiles is in fact a dependence on $\zeta = y_1 + \partial_1\tau(\eta)t$ since they are all transported at the group velocity. We now give explicitly the expression of the ansatz we have found:
\[
\begin{aligned}
U(\varepsilon, T, Y, \zeta, \theta) ={}& \pi(\beta)U_{11}(T, Y, \zeta)e^{i\theta} + \text{c.c.} + \sqrt{\varepsilon}\,\pi^1(0)U_{20}(T, Y, \zeta) \\
&+ \varepsilon\Bigl[\sum_{j=2}^{p}\frac{-1}{\partial_1\tau(\eta)+v_j}\,\partial_\zeta^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} + iL(\beta)^{-1}A_1\partial_\zeta\pi(\beta)U_{11}e^{i\theta} + \text{c.c.}\Bigr] \\
&+ \varepsilon^{3/2}\Bigl[iL(0)^{-1}A_1\partial_\zeta\pi^1(0)U_{20} - 2iL(0)^{-1}f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) \\
&\qquad\quad + iL(\beta)^{-1}A_{II}(\partial_Y)\pi(\beta)U_{11}e^{i\theta} + \text{c.c.} - iL(2\beta)^{-1}f(U_{11}, U_{11})e^{2i\theta} + \text{c.c.}\Bigr] \\
&+ \varepsilon^{2}\Bigl[-iL(0)^{-1}A_1\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \\
&\qquad\quad + \Bigl(-L(\beta)^{-1}\bigl(\partial_1\tau(\eta) + A_1\bigr)L(\beta)^{-1}A_1\partial_\zeta^2\pi(\beta)U_{11} - 2iL(\beta)^{-1}f(U_{11}, U_{20})\Bigr)e^{i\theta} + \text{c.c.}\Bigr].
\end{aligned}
\]
3.3. Properties of ansatz (3)
Now that we have found the ansatz we were looking for, we give a few properties. The first one concerns regularity.

Proposition 5
If $U_{11}$ and $U_{20}$ are in $C([0,T]; H^s(\mathbb{R}^d_{Y,\zeta}))$ as asserted in Assumption 5, then all the Fourier coefficients $U_{ij}$, $i = 1, \dots, 5$ and $j = 0, \dots, 2$, are in $C([0,T]; H^{s-2}(\mathbb{R}^d_{Y,\zeta}))$.
Proof
Thanks to the expression of $U$ given above, the only difficulty is to prove that
\[ \pi(0)U_{30} = -\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\partial_\zeta^{-1}\pi^j(0)A_{II}(\partial_Y)\pi^1(0)U_{20} \]
is in $C([0,T]; H^{s-2}(\mathbb{R}^d_{Y,\zeta}))$. The crucial point is that the nonlinearity of the second equation of (S) is a derivative with respect to $\zeta$. If $\pi^1(0)U_{20}$ and $\pi(\beta)U_{11}$ solve (S), then $W := \partial_\zeta^{-1}\pi^1(0)U_{20}$ solves
\[
\begin{aligned}
&\partial_TW - \pi^1(0)A_{II}(\partial_Y)\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\partial_\zeta^{-1}\pi^j(0)A_{II}(\partial_Y)W \\
&\quad = -2i\pi^1(0)f\bigl(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) + 2i\pi^1(0)A_1L(0)^{-1}f\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr).
\end{aligned}
\]
Since the right-hand side of this equation is in $C([0,T]; H^s(\mathbb{R}^d_{Y,\zeta}))$ (if $s$ is large enough), $W$ is also in this space. Since $\pi(0)U_{30}$ reads
\[ \pi(0)U_{30} = -\sum_{j=2}^{p}\frac{1}{\partial_1\tau(\eta)+v_j}\,\pi^j(0)A_{II}(\partial_Y)W, \]
it is therefore in $C([0,T]; H^{s-1}(\mathbb{R}^d_{Y,\zeta}))$.

We now prove that the corrector term $\sqrt{\varepsilon}\,U_2 + \dots + \varepsilon^2U_5$ remains smaller than the leading term $U_1$ for times $O(1/\varepsilon)$. In order to do this, we show that the boundedness condition (4) and the sublinear growth conditions (5) are satisfied.

Proposition 6
The profile $U_2$ satisfies the boundedness condition (4):
\[ \exists C > 0, \qquad \sup_{t\in\mathbb{R}_+}\bigl\|U_2(\cdot, \cdot, t, \cdot, \cdot)\bigr\|_{L^\infty([0,T]\times\mathbb{R}^d_{Y,y_1}\times\mathbb{T})} \le C. \]
The other correctors $U_i$, $i = 3, 4, 5$, also satisfy this boundedness condition, so that the sublinear growth condition (5) is a fortiori satisfied.

Proof
We recall that $U_2 = \pi^1(0)U_{20}$, so that the fact that $U_2$ satisfies the boundedness condition (4) is a mere consequence of Assumption 5 if $s$ is large enough. Thanks to the expressions already given, it is also easy to see that the other correctors $U_i$, $i = 3, 4, 5$, are also bounded.
Remark 3
(1) As seen in Proposition 6, all the profiles are bounded, so that the sublinear growth condition may seem too strong. But it was not a priori obvious that this would be the case. What happens here is that all the profiles considered travel at the velocity $-\partial_1\tau(\eta)$, while sublinear growth occurs when other velocities are present. To be more precise, if $w_1$ and $w_2$ are two functions such that $(\partial_t - \partial_1\tau(\eta)\partial_{y_1})w_1 = w_2$ and $\langle w_2\rangle = 0$, then $w_1$ has a sublinear growth. In [L1], we can find a right-hand side $w_2$ that travels at a different velocity than $-\partial_1\tau(\eta)$. One then has $\langle w_2\rangle = 0$ but $w_2 \neq 0$, and $w_1$ has therefore a sublinear growth but is not bounded. In this paper, the right-hand side $w_2$ is always equal to zero, so that $w_1$ is bounded.
(2) The fact that all the profiles are bounded suggests an improvement of the precision of our approximation, as we see in the next sections.

3.4. Estimate for the residual
In this section, we prove that the approximate solution (defined thanks to the ansatz we have found) is almost a solution of problem (1) since it provides a small residual. We first give a regularity result for the residual.

Lemma 4
To the approximate solution $u^\varepsilon = \sqrt{\varepsilon}\,U(\varepsilon, \varepsilon t, \sqrt{\varepsilon}\,y_{II}, t, y_1, \beta\cdot x/\varepsilon)$ corresponds the residual
\[ L^\varepsilon(\partial_x)u^\varepsilon - f(u^\varepsilon, u^\varepsilon) = k^\varepsilon(x), \]
which may be written under the form
\[ k^\varepsilon(x) = K\Bigl(\varepsilon, \varepsilon t, \sqrt{\varepsilon}\,y_{II}, y_1 + \partial_1\tau(\eta)t, \frac{\beta\cdot x}{\varepsilon}\Bigr), \]
with
\[ K(\varepsilon, T, Y, \zeta, \theta) = \sum_{j=-4}^{4}K_j(\varepsilon, T, Y, \zeta)e^{ij\theta}, \]
and the $K_j$ are in $C([0,T]; H^{s-4}(\mathbb{R}^d_{Y,\zeta}))$ if $U_{11}$ and $U_{20}$ are in $H^s$ as asserted by Assumption 5.

Proof
The proof of this lemma is straightforward, once we have proved that the derivatives $\partial_TU_1$, $\partial_TU_2$, $\partial_TU_3$, $\partial_TU_4$, and $\partial_TU_5$ which appear in the residual are in $C([0,T]; H^{s-4}(\mathbb{R}^d_{Y,\zeta}))$.
This is clear for $\partial_TU_1$, thanks to the first equation of (S). We have already seen in the proof of Proposition 5 that $W = \partial_\zeta^{-1}\pi^1(0)U_{20}$ is in $C([0,T]; H^s(\mathbb{R}^d_{Y,\zeta}))$. Thanks to the second equation of (S), $\partial_TU_2$ is thus in $C([0,T]; H^{s-2}(\mathbb{R}^d_{Y,\zeta}))$. Differentiating the second equation of (S) with respect to $T$ and using the same method as in the proof of Proposition 5 then yields that $\partial_T\partial_\zeta^{-1}U_{20}$ is in $C([0,T]; H^{s-2}(\mathbb{R}^d_{Y,\zeta}))$. Thanks to the expression given by Lemma 3, we can then conclude that $\partial_TU_3$ is in $C([0,T]; H^{s-3}(\mathbb{R}^d_{Y,\zeta}))$. The proof that $\partial_TU_4$ and $\partial_TU_5$ are in $C([0,T]; H^{s-4}(\mathbb{R}^d_{Y,\zeta}))$ is left to the reader.

Knowing in which spaces these objects live, we can give estimates on the residual.

Proposition 7
(i) The Fourier coefficients of the profile $K$ of the residual satisfy
\[ \|K_j\|_{L^\infty([0,T], H^{s-4}(\mathbb{R}^d_{Y,\zeta}))} = O(\varepsilon^2), \qquad\text{for } j = -4, \dots, 4. \]
(ii) We have a better estimate for the component $\pi^1(0)K_0$ of the nonoscillating mode:
\[ \|\pi^1(0)K_0\|_{L^\infty([0,T], H^{s-4}(\mathbb{R}^d_{Y,\zeta}))} = O(\varepsilon^{5/2}). \]
Proof This proposition is a direct consequence of the method we have used to find our approximate solution, since we have cancelled the terms of expansion (6) up to the power ε3/2 . We also have cancelled the component polarized along π 1 (0) of the term in ε2 , which yields the improvement stated in (ii). Remark. (i) If the profiles U3 , U4 , and U5 had a sublinear growth instead of being bounded, then we would have Kj = o(ε) and π 1 (0)K0 = o(ε3/2 ) instead of O(ε 2 ) and O(ε 5/2 ), respectively. (ii) Proposition 7(ii) is of crucial importance in the proof of the stability result of the next section. 4. The case of Maxwell-Bloch systems In the previous section, we proved that our approximate solution uε is almost a solution of problem (1). But the most important thing is to prove that uε remains close to the exact solution uε . Such a stability property is very difficult to prove because of resonances (see [JMR2]). The general case remains at the moment out of reach, and, as done in [C], we limit ourselves to a smaller class of problems than
those of type (1). Under a strong transparency assumption we also prove that the nonlinearity in System (S) vanishes, so that Assumption 5 can be proved in this case.

4.1. General Maxwell-Bloch systems
We now look for solutions of size $O(\sqrt{\varepsilon})$ to systems of the form
\[
\begin{cases}
\partial_tu^\varepsilon + \displaystyle\sum_{j=1}^{d_1}A_j\partial_{y_j}u^\varepsilon + \frac{L_0}{\varepsilon}u^\varepsilon = f(u^\varepsilon, v^\varepsilon), \\[2mm]
\partial_tv^\varepsilon + \displaystyle\sum_{j=1}^{d_2}B_j\partial_{y_j}v^\varepsilon + \frac{M_0}{\varepsilon}v^\varepsilon = g(u^\varepsilon, u^\varepsilon),
\end{cases}
\tag{48}
\]
where $A_j$ and $B_j$ denote symmetric real-valued matrices, while $L_0$ and $M_0$ are skew-symmetric. The mappings $f$ and $g$ are bilinear mappings and $g$ is symmetric. For $(\tau, \eta) \in \mathbb{R}^{1+d}$, we recall that
\[ L(\tau, \eta) = \tau I + \sum_{j=1}^{d_1}A_j\eta_j + \frac{L_0}{i}, \]
as well as
\[ A_{II}(\eta) := \sum_{j=2}^{d_1}A_j\eta_j. \]
We similarly define
\[ M(\tau, \eta) = \tau I + \sum_{j=1}^{d_2}B_j\eta_j + \frac{M_0}{i}, \]
as well as
\[ B_{II}(\eta) := \sum_{j=2}^{d_2}B_j\eta_j. \]
The set of all $\beta := (\tau, \eta) \in \mathbb{R}^{1+d}$ such that $\det(L(\beta)) = 0$ is the characteristic variety of $L$ and is denoted by $C_L$. Similarly, $C_M$ denotes the characteristic variety of $M$. For any $\eta \in \mathbb{R}^d$, we denote by $(-\tau_L^l(\eta))_{l=1,\dots,p_1}$ the eigenvalues of $A(\eta) + L_0/i$ and by $(-\tau_M^l(\eta))_{l=1,\dots,p_2}$ those of $B(\eta) + M_0/i$, thus providing a parametrization of $C_L$ and $C_M$. Up to a renumbering, we can suppose that $\beta = (\tau_L^1(\eta), \eta)$. We also denote by $\pi_L(\beta)$ and $\pi_M(\beta)$ the orthogonal projectors on $\ker(\tau I + A(\eta) + L_0/i)$ and $\ker(\tau I + B(\eta) + M_0/i)$, respectively, and we denote by $L(\beta)^{-1}$ and $M(\beta)^{-1}$ the partial inverses of $L(\beta)$ and $M(\beta)$. Similarly, $\pi_L(0)$ and $\pi_M(0)$ are the orthogonal projectors on the kernel of $L(0)$ and $M(0)$, and $L(0)^{-1}$ and $M(0)^{-1}$ their partial inverses.
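We finally denote by $C_L^0$ and $C_M^0$ the characteristic varieties of the operators $\pi_L(0)L^\varepsilon(\partial_x)\pi_L(0)$ and $\pi_M(0)M^\varepsilon(\partial_x)\pi_M(0)$, respectively. Thanks to Lemma 2, we know that $\pi_M(0)B_1\pi_M(0)$ admits $-\partial_1\tau_L^1(\eta)$ as an eigenvalue. The associated eigenprojector is denoted by $\pi_M^1(0)$, while the projectors associated to the other eigenvalues $v_j$ are denoted $\pi_M^j(0)$, $j \ge 2$.
Assumption 1 on the choice of $\beta$ and Assumption 2 on the long-wave short-wave resonance are replaced in this new framework by the following assumption.

Assumption 6
(i) The point $\beta$ of $C_L$ is smooth and $2\beta \notin C_L$; neither $\beta$ nor $2\beta$ is in $C_M$.
(ii) The plane $P$ tangent to $C_L$ at $\beta$ is tangent to $C_M^0$ at $(0,0)$.

As systems like (48) are a subclass of systems like (1), all the results proved above remain valid. In particular, we can construct approximate solutions $(u^\varepsilon, v^\varepsilon)$ to (48) under the form
\[ u^\varepsilon(x) = U_a^\varepsilon\Bigl(\varepsilon t, \sqrt{\varepsilon}\,y_{II}, y_1 + \partial_1\tau(\eta)t, \frac{\beta\cdot x}{\varepsilon}\Bigr), \qquad v^\varepsilon(x) = V_a^\varepsilon\Bigl(\varepsilon t, \sqrt{\varepsilon}\,y_{II}, y_1 + \partial_1\tau(\eta)t, \frac{\beta\cdot x}{\varepsilon}\Bigr), \]
where the profiles $U_a^\varepsilon$ and $V_a^\varepsilon$ are given by the formulas
\[
\begin{aligned}
U_a^\varepsilon(T, Y, \zeta, \theta) ={}& \sqrt{\varepsilon}\,\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} + \varepsilon^{3/2}\,iL(\beta)^{-1}A_1\partial_\zeta\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} \\
&+ \varepsilon^{2}\,iL(\beta)^{-1}A_{II}(\partial_Y)\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} \\
&+ \varepsilon^{5/2}\Bigl(-L(\beta)^{-1}\bigl(\partial_1\tau_L(\eta) + A_1\bigr)L(\beta)^{-1}A_1\partial_\zeta^2\pi_L(\beta)U_{11} + 2iL(\beta)^{-1}f(U_{11}, V_{20})\Bigr)e^{i\theta} + \text{c.c.}
\end{aligned}
\]
and
\[
\begin{aligned}
V_a^\varepsilon(T, Y, \zeta, \theta) ={}& \varepsilon\,\pi_M^1(0)V_{20} - \varepsilon^{3/2}\sum_{j=2}^{p}\frac{1}{\partial_1\tau_L(\eta)+v_j}\,\partial_\zeta^{-1}\pi_M^j(0)B_{II}(\partial_Y)\pi_M^1(0)V_{20} \\
&+ \varepsilon^{2}\Bigl[iM(0)^{-1}B_1\partial_\zeta\pi_M^1(0)V_{20} - 2iM(0)^{-1}g\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) - iM(2\beta)^{-1}g(U_{11}, U_{11})e^{2i\theta} + \text{c.c.}\Bigr] \\
&- \varepsilon^{5/2}\,iM(0)^{-1}B_1\sum_{j=2}^{p}\frac{1}{\partial_1\tau_L(\eta)+v_j}\,\pi_M^j(0)B_{II}(\partial_Y)\pi_M^1(0)V_{20}.
\end{aligned}
\]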
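Remark 4
One can notice that $U_a^\varepsilon = O(\sqrt{\varepsilon})$ while $V_a^\varepsilon = O(\varepsilon)$, so that $\tilde u^\varepsilon$ and $\tilde v^\varepsilon$ defined as $u^\varepsilon = \sqrt{\varepsilon}\,\tilde u^\varepsilon$ and $v^\varepsilon = \varepsilon\tilde v^\varepsilon$ are of size $O(1)$. Instead of looking for solutions of size $(O(\sqrt{\varepsilon}), O(\varepsilon))$ to (48), we could therefore look for solutions $\tilde u^\varepsilon$ and $\tilde v^\varepsilon$ of size $O(1)$ to
\[
\begin{cases}
\partial_t\tilde u^\varepsilon + \displaystyle\sum_{j=1}^{d_1}A_j\partial_{y_j}\tilde u^\varepsilon + \frac{L_0}{\varepsilon}\tilde u^\varepsilon = \varepsilon f(\tilde u^\varepsilon, \tilde v^\varepsilon), \\[2mm]
\partial_t\tilde v^\varepsilon + \displaystyle\sum_{j=1}^{d_2}B_j\partial_{y_j}\tilde v^\varepsilon + \frac{M_0}{\varepsilon}\tilde v^\varepsilon = g(\tilde u^\varepsilon, \tilde u^\varepsilon).
\end{cases}
\tag{49}
\]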
Such a system belongs to the general class of Maxwell-Bloch systems introduced and studied in [JMR2].

4.2. A stability result
Assumption 4 is not strong enough to allow the proof of a stability result; that is why we introduce a strong transparency condition, as in [JMR2] and [C]. This strong transparency condition is satisfied by the physical Maxwell-Bloch systems, and we also prove that if it is satisfied then the nonlinearity of the second equation of (S) vanishes, so that Assumption 5 can be proved.

Assumption 7 (Strong transparency condition)
There exists $C > 0$ such that for all $\eta$, $\eta'$, and $\eta''$ in $\mathbb{R}^d$, all $1 \le j, k \le p_1$, and $1 \le l \le p_2$, and all $a, b \in \mathbb{C}^N$, one has
\[ \bigl|\pi_M(\beta'')g\bigl(\pi_L(\beta)a, \pi_L(\beta')b\bigr)\bigr| \le C|a||b|\,\bigl|\tau_L^j(\eta) + \tau_L^k(\eta') - \tau_M^l(\eta'')\bigr|, \]
where $\beta := (\tau_L^j(\eta), \eta)$, $\beta' := (\tau_L^k(\eta'), \eta')$, and $\beta'' := (\tau_M^l(\eta''), \eta'')$.

Remark 5
It is straightforward to see that Assumption 4 can be deduced from Assumption 7.

The following proposition asserts that, under Assumption 7, the nonlinearity of the second equation of (S) vanishes and that Assumption 5 can therefore be proved.

Proposition 8
Suppose that Assumption 7 is satisfied; then
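(i) one has
\[ -\pi_M^1(0)g\bigl(\partial_1\pi_L(\beta)U_{11}, \pi(\beta)U_{11}\bigr) + \pi_M^1(0)A_1L(0)^{-1}g\bigl(\pi(\beta)U_{11}, \pi(\beta)U_{11}\bigr) = 0; \]
(ii) the system (S) reads
\[
\begin{cases}
\partial_TU_{11} + \dfrac{i}{2}\partial_1^2\tau(\eta)\,\partial_\zeta^2U_{11} = 2\pi_L(\beta)f\bigl(\pi_L(\beta)U_{11}, \pi_M^1(0)U_{20}\bigr), \\[2mm]
\partial_TU_{20} - \pi_M^1(0)A_{II}(\partial_Y)\displaystyle\sum_{j=2}^{p}\frac{1}{\partial_1\tau_L(\eta)+v_j}\,\partial_\zeta^{-1}\pi_M^j(0)A_{II}(\partial_Y)\pi_M^1(0)U_{20} = 0,
\end{cases}
\]
so that Assumption 5 is satisfied.

Proof
(i) Let $\alpha$ be in a neighborhood of zero in $\mathbb{R}$, and take here $\beta = (\tau_L^1(\eta + (\alpha/2, 0)), \eta + (\alpha/2, 0))$, $\beta' = (\tau_L^1(-\eta + (\alpha/2, 0)), -\eta + (\alpha/2, 0))$, and $\beta'' = (\tau_M^1((\alpha, 0)), (\alpha, 0))$. Expanding $\pi_M(\beta'')g(\pi_L(\beta')a, \pi(\beta)b)$ with respect to $\alpha$ near zero yields, for all $a$ and $b$ in $\mathbb{C}^N$,
\[
\begin{aligned}
\pi_M(\beta'')g\bigl(\pi_L(\beta')a, \pi(\beta)b\bigr) ={}& \pi_M^1(0)g\bigl(\pi_L(\beta)a, \pi_L(-\beta)b\bigr) \\
&+ \alpha\Bigl[\tfrac{1}{2}\pi_M^1(0)g\bigl(\partial_1\pi_L(\beta)a, \pi_L(-\beta)b\bigr) - \tfrac{1}{2}\pi_M^1(0)g\bigl(\pi_L(\beta)a, \partial_1\pi_L(-\beta)b\bigr) \\
&\qquad + (\pi_M^1)'(0)g\bigl(\pi(\beta)a, \pi(-\beta)b\bigr)\Bigr] + o(\alpha).
\end{aligned}
\]
The leading term of this expansion vanishes, thanks to Assumption 4. Using the fact that $\pi(-\beta) = \pi(\beta)$ and taking $b = a$ therefore yields
\[ \pi_M(\beta'')g\bigl(\pi_L(\beta')a, \pi(\beta)a\bigr) = \alpha\Bigl[\pi_M^1(0)g\bigl(\partial_1\pi_L(\beta)a, \pi_L(\beta)a\bigr) + (\pi_M^1)'(0)g\bigl(\pi(\beta)a, \pi(\beta)a\bigr)\Bigr] + o(\alpha). \tag{50} \]
Now, replacing $\eta$ by $\eta + (\alpha/2, 0)$, introducing $\eta' = -\eta + (\alpha/2, 0)$ and $\eta'' = (\alpha, 0)$, and expanding $\tau_L^1(\eta) + \tau_L^1(\eta') - \tau_M^1(\eta'')$, yields
\[ \tau_L^1(\eta) + \tau_L^1(\eta') - \tau_M^1(\eta'') = 0 + \alpha\bigl(\partial_1\tau_L^1(\eta) - \partial_1\tau_M^1(0)\bigr) + o(\alpha), \]
but, thanks to Assumption 6, we have $\partial_1\tau_L^1(\eta) = \partial_1\tau_M^1(0)$, so that
\[ \tau_L^1(\eta) + \tau_L^1(\eta') - \tau_M^1(\eta'') = o(\alpha). \tag{51} \]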
Thanks to Assumption 7, we know that
\[ \frac{\bigl|\pi_M(\beta'')g\bigl(\pi_L(\beta')a, \pi(\beta)a\bigr)\bigr|}{\bigl|\tau_L^1(\eta) + \tau_L^1(\eta') - \tau_M^1(\eta'')\bigr|} \]
must remain bounded for all $\alpha$. Equations (50)–(51) say that this is possible if and only if
\[ \pi_M^1(0)g\bigl(\partial_1\pi_L(\beta)a, \pi_L(\beta)a\bigr) + (\pi_M^1)'(0)g\bigl(\pi(\beta)a, \pi(\beta)a\bigr) = 0. \tag{52} \]
We now prove that this condition gives the one given in point (i) of the proposition. As in the proof of Proposition 4, we write that, for all $\alpha$ in a neighborhood of zero,
\[ \pi_M^1(\alpha)\Bigl(\tau_L^1(\alpha) + A_1\alpha + \frac{L_0}{i}\Bigr) = 0. \]
Differentiating this equality with respect to $\alpha$ and taking the limit $\alpha \to 0$ yields
\[ (\pi_M^1)'(0)\frac{L_0}{i} + \pi_M^1(0)\bigl((\tau_L^1)'(0)I + A_1\bigr) = 0, \]
and multiplying on the right by $L(0)^{-1}$ thus gives
\[ (\pi_M^1)'(0)\bigl(I - \pi_M(0)\bigr) + \pi_M^1(0)A_1L(0)^{-1} = 0. \]
Therefore, one has
\[ (\pi_M^1)'(0)\bigl(I - \pi_M(0)\bigr)g\bigl(\pi(\beta)a, \pi(\beta)a\bigr) = -\pi_M^1(0)A_1L(0)^{-1}g\bigl(\pi(\beta)a, \pi(\beta)a\bigr); \]
that is, since $\pi_M(0)g(\pi(\beta)a, \pi(\beta)a) = 0$,
\[ (\pi_M^1)'(0)g\bigl(\pi(\beta)a, \pi(\beta)a\bigr) = -\pi_M^1(0)A_1L(0)^{-1}g\bigl(\pi(\beta)a, \pi(\beta)a\bigr). \tag{53} \]
Equations (52)–(53) then prove the desired result.
(ii) It is a straightforward consequence of point (i) that system (S) takes the form given in the proposition. It is also easy to check that Assumption 5 holds in this case.

We can now prove a stability result for those Maxwell-Bloch systems.

Theorem 2
Let $U_{11}^0 = \pi_L(\beta)U_{11}^0$ and $V_{20}^0 = \pi_M^1(0)V_{20}^0$ be in $H^\infty(\mathbb{R}^d_{Y,\zeta})$, and suppose Assumptions 5, 6, and 7 are satisfied. Then there exists $T_{\max} > 0$ such that
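(i) for all $0 < T < T_{\max}$, there exists a unique smooth exact solution $(\mathbf{u}^\varepsilon, \mathbf{v}^\varepsilon)$, defined on $[0, T/\varepsilon] \times \mathbb{R}^d$, to problem (48) with initial conditions
\[ \mathbf{u}^\varepsilon|_{t=0}(y) = \varepsilon^{1/2}U_{11}^0\bigl(0, \sqrt{\varepsilon}\,y_{II}, y_1\bigr)e^{i\eta\cdot y/\varepsilon} + \text{c.c.} \]
and
\[ \mathbf{v}^\varepsilon|_{t=0}(y) = \varepsilon V_{20}^0\bigl(0, \sqrt{\varepsilon}\,y_{II}, y_1\bigr); \]
(ii) we can write $\mathbf{u}^\varepsilon$ and $\mathbf{v}^\varepsilon$ under the form
\[ \mathbf{u}^\varepsilon(x) = \varepsilon^{1/2}\,\mathbf{U}^\varepsilon\Bigl(\varepsilon t, \sqrt{\varepsilon}\,y_{II}, t, y_1, \frac{\beta\cdot x}{\varepsilon}\Bigr) \]
and
\[ \mathbf{v}^\varepsilon(x) = \varepsilon\,\mathbf{V}^\varepsilon\Bigl(\varepsilon t, \sqrt{\varepsilon}\,y_{II}, t, y_1, \frac{\beta\cdot x}{\varepsilon}\Bigr), \]
with $\mathbf{U}^\varepsilon$ and $\mathbf{V}^\varepsilon$ bounded in $C([0,T]; H^s(\mathbb{R}^d\times\mathbb{T}))$, and we have
\[ \Bigl\|\mathbf{U}^\varepsilon - \frac{1}{\sqrt{\varepsilon}}U_a^\varepsilon\Bigr\|_{C([0,T];H^s(\mathbb{R}^d\times\mathbb{T}))} + \Bigl\|\mathbf{V}^\varepsilon - \frac{1}{\varepsilon}V_a^\varepsilon\Bigr\|_{C([0,T];H^s(\mathbb{R}^d\times\mathbb{T}))} = o(1). \]
In particular,
\[ \frac{1}{\sqrt{\varepsilon}}\bigl\|\mathbf{u}^\varepsilon - u^\varepsilon\bigr\|_{L^\infty([0,T/\varepsilon]\times\mathbb{R}^d\times\mathbb{T})} + \frac{1}{\varepsilon}\bigl\|\mathbf{v}^\varepsilon - v^\varepsilon\bigr\|_{L^\infty([0,T/\varepsilon]\times\mathbb{R}^d\times\mathbb{T})} = o(1). \]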
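Proof
Existence on a small time interval (depending on $\varepsilon$) is given by general theorems. It is therefore sufficient to obtain some bounds in $H^s$ for the solution in order to prove the existence part of the theorem. Call $M^\varepsilon = \mathbf{U}^\varepsilon - (1/\sqrt{\varepsilon})U_a^\varepsilon$ and $N^\varepsilon = \mathbf{V}^\varepsilon - (1/\varepsilon)V_a^\varepsilon$. Then $M^\varepsilon$ and $N^\varepsilon$ satisfy
\[
\begin{aligned}
&\Bigl(\partial_T + \frac{1}{\varepsilon}A_1\partial_\zeta + \frac{1}{\sqrt{\varepsilon}}A_{II}(\partial_Y) + \frac{1}{\varepsilon}\partial_1\tau(\eta)\partial_\zeta + i\frac{L(\beta D_\theta)}{\varepsilon^2}\Bigr)M^\varepsilon \\
&\quad = f(M^\varepsilon, N^\varepsilon) + f\Bigl(M^\varepsilon, \frac{1}{\varepsilon}V_a^\varepsilon\Bigr) + f\Bigl(\frac{1}{\sqrt{\varepsilon}}U_a^\varepsilon, N^\varepsilon\Bigr) + \frac{R^\varepsilon}{\varepsilon^{3/2}}
\end{aligned}
\tag{54}
\]
and
\[
\begin{aligned}
&\Bigl(\partial_T + \frac{1}{\varepsilon}B_1\partial_\zeta + \frac{1}{\sqrt{\varepsilon}}B_{II}(\partial_Y) + \frac{1}{\varepsilon}\partial_1\tau(\eta)\partial_\zeta + i\frac{M(\beta D_\theta)}{\varepsilon^2}\Bigr)N^\varepsilon \\
&\quad = \frac{1}{\varepsilon}g(M^\varepsilon, M^\varepsilon) + \frac{2}{\varepsilon}g\Bigl(M^\varepsilon, \frac{1}{\sqrt{\varepsilon}}U_a^\varepsilon\Bigr) + \frac{S^\varepsilon}{\varepsilon^2},
\end{aligned}
\tag{55}
\]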
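where, thanks to Proposition 7,
\[ |R^\varepsilon|_{L^\infty([0,T];H^s)} = O(\varepsilon^2) \quad\text{and}\quad |S^\varepsilon|_{L^\infty([0,T];H^s)} = O(\varepsilon^2). \]
Following [JMR2] and [C], we perform the change of functions
\[ P = e^{-(\varepsilon A_1\partial_\zeta + \varepsilon^{3/2}A_{II}(\partial_Y) + \varepsilon\partial_1\tau(\eta)\partial_\zeta + iL(\beta D_\theta))(T/\varepsilon^2)}M^\varepsilon := S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big)M^\varepsilon \]
and
\[ Q = e^{-(\varepsilon B_1\partial_\zeta + \varepsilon^{3/2}B_{II}(\partial_Y) + \varepsilon\partial_1\tau(\eta)\partial_\zeta + iM(\beta D_\theta))(T/\varepsilon^2)}N^\varepsilon := S_2^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big)N^\varepsilon. \]
Note that this kind of group has also been used in [S], [Gr], [BMN], and [Ga]. The equations satisfied by $P$ and $Q$ are written
\[ \begin{aligned} \partial_T P ={}& S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) f\Big(S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)P,\ S_2^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)Q\Big) + \frac{1}{\varepsilon}S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) f\Big(S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)P,\ V_a^\varepsilon\Big) \\ &+ S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) f\Big(\frac{1}{\sqrt{\varepsilon}}U_a^\varepsilon,\ S_2^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)Q\Big) + S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big)\frac{R^\varepsilon}{\varepsilon^{3/2}} \end{aligned} \tag{56} \]
and
\[ \begin{aligned} \partial_T Q ={}& \frac{1}{\varepsilon}S_2^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) g\Big(S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)P,\ S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)P\Big) \\ &+ \frac{2}{\varepsilon}S_2^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) g\Big(S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)P,\ \frac{1}{\sqrt{\varepsilon}}U_a^\varepsilon\Big) + S_2^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big)\frac{S^\varepsilon}{\varepsilon^2}. \end{aligned} \tag{57} \]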
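As $S_1^\varepsilon$ and $S_2^\varepsilon$ are unitary groups on all the Sobolev spaces $H^s$, we just have to find estimates on $P$ and $Q$ in $L^\infty([0,T]; H^s(\mathbb{R}^n_{\zeta,Y} \times \mathbb{T}_\theta))$ for a $T > 0$. Denoting by $|\cdot|_T$ the norm associated to this space, (56) yields
\[ |P|_T \le CT\big(|P|_T|Q|_T + |P|_T + |Q|_T + O(\varepsilon^{1/2})\big) + |P(T=0)|_{H^s}, \tag{58} \]
where we have used the fact that
\[ \Big|\frac{U_a^\varepsilon}{\sqrt{\varepsilon}}\Big|_T \le C, \qquad \Big|\frac{V_a^\varepsilon}{\varepsilon}\Big|_T \le C, \qquad\text{and}\qquad |R^\varepsilon|_T = O(\varepsilon^2). \]
Moreover, one also has
\[ |\partial_T P|_T \le C\big(|P|_T|Q|_T + |P|_T + |Q|_T + O(\varepsilon^{1/2})\big). \tag{59} \]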
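The case of (57) is more delicate. One has
\[ \begin{aligned} Q ={}& Q(T=0) + \frac{1}{\varepsilon}\int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big) g\Big(S_1^\varepsilon\Big(-\frac{s}{\varepsilon^2}\Big)P(s),\ S_1^\varepsilon\Big(-\frac{s}{\varepsilon^2}\Big)P(s)\Big)\,ds \\ &+ \frac{2}{\varepsilon}\int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big) g\Big(S_1^\varepsilon\Big(-\frac{s}{\varepsilon^2}\Big)P,\ \frac{U_a^\varepsilon(s)}{\sqrt{\varepsilon}}\Big)\,ds + \int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big)\frac{S^\varepsilon}{\varepsilon^2}\,ds \\ ={}& Q(T=0) + I_1 + I_2 + I_3. \end{aligned} \]
We now estimate $I_1$, $I_2$, and $I_3$ separately.

• Estimate of $I_1$

We use the spectralization of the groups $S_1^\varepsilon$ and $S_2^\varepsilon$ as follows. Denote by $m$, $\xi_1$, and $\xi_{II}$ the Fourier dual variables of $\theta$, $\zeta$, and $Y$, respectively, and introduce $\pi_L^l(\eta) := \pi_L(\tau_L^l(\eta), \eta)$ and $\pi_M^l(\eta) := \pi_M(\tau_M^l(\eta), \eta)$. We then have
\[ S_1^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) = \sum_l \pi_L^l\big(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})\big)\, e^{-i[m\tau + \varepsilon\partial_1\tau(\eta)\xi_1 - \tau_L^l(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II}))](T/\varepsilon^2)} \]
and
\[ S_2^\varepsilon\Big(\frac{T}{\varepsilon^2}\Big) = \sum_l \pi_M^l\big(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})\big)\, e^{-i[m\tau + \varepsilon\partial_1\tau(\eta)\xi_1 - \tau_M^l(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II}))](T/\varepsilon^2)}. \]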
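Denoting by $\mathcal{F}(I_1)(m,\xi)$ the Fourier transform of $I_1$, we have therefore
\[ \begin{aligned} \mathcal{F}(I_1)(m,\xi) = \frac{1}{\varepsilon}\int_0^T \sum_p \sum_l \int & \pi_M^l\big(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})\big)\, e^{-i[m\tau + \varepsilon\partial_1\tau(\eta)\xi_1 - \tau_M^l(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II}))](s/\varepsilon^2)} \\ & \times g\Big( \pi_L^l\big(\eta p + (\varepsilon\eta_1, \varepsilon^{3/2}\eta_{II})\big)\, e^{i[p\tau + \varepsilon\partial_1\tau(\eta)\eta_1 - \tau_L^l(\eta p + (\varepsilon\eta_1, \varepsilon^{3/2}\eta_{II}))](s/\varepsilon^2)}\, \widehat{P}_p(\eta), \\ & \qquad \pi_L^l\big(\eta(m-p) + (\varepsilon(\xi_1-\eta_1), \varepsilon^{3/2}(\xi_{II}-\eta_{II}))\big) \\ & \qquad \times e^{i[(m-p)\tau + \varepsilon\partial_1\tau(\eta)(\xi_1-\eta_1) - \tau_L^l(\eta(m-p) + (\varepsilon(\xi_1-\eta_1), \varepsilon^{3/2}(\xi_{II}-\eta_{II})))](s/\varepsilon^2)}\, \widehat{P}_{m-p}(\xi-\eta) \Big)\, d\eta\, ds, \end{aligned} \]
and thus
\[ \begin{aligned} \mathcal{F}(I_1)(m,\xi) = \frac{1}{\varepsilon}\int_0^T \sum_p \sum_l \int & e^{i[\tau_M^l(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})) - \tau_L^l(\eta p + (\varepsilon\eta_1, \varepsilon^{3/2}\eta_{II})) - \tau_L^l(\eta(m-p) + (\varepsilon(\xi_1-\eta_1), \varepsilon^{3/2}(\xi_{II}-\eta_{II})))](s/\varepsilon^2)} \\ & \times \pi_M^l\big(\eta m + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})\big)\, g\Big( \pi_L^l\big(\eta p + (\varepsilon\eta_1, \varepsilon^{3/2}\eta_{II})\big)\widehat{P}_p(\eta), \\ & \qquad \pi_L^l\big(\eta(m-p) + (\varepsilon(\xi_1-\eta_1), \varepsilon^{3/2}(\xi_{II}-\eta_{II}))\big)\widehat{P}_{m-p}(\xi-\eta) \Big)\, d\eta\, ds. \end{aligned} \]
Integrating by parts and using Assumption 7 yields
\[ \begin{aligned} \big|\mathcal{F}(I_1)(m,\xi)\big| \le{}& C\varepsilon \int_0^T \sum_p \int \big|\widehat{P}_p(\eta)\big|\,\big|\partial_T\widehat{P}_{m-p}(\xi-\eta)\big|\, d\eta\, ds \\ &+ C\varepsilon \sum_p \int \big|\widehat{P}_p(\eta)\big|\,\big|\widehat{P}_{m-p}(\xi-\eta)\big|\, d\eta\,(T) + C\varepsilon \sum_p \int \big|\widehat{P}_p(\eta)\big|\,\big|\widehat{P}_{m-p}(\xi-\eta)\big|\, d\eta\,(0). \end{aligned} \]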
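It follows that
\[ |I_1|_T \le C\varepsilon T\,|P|_T\,|\partial_T P|_T + C\varepsilon\big(|P|_T^2 + |P(T=0)|_{H^s}^2\big). \tag{60} \]

• Estimate of $I_2$

Recall that
\[ U_a^\varepsilon(T,Y,\zeta,\theta) = \sqrt{\varepsilon}\,\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} + \varepsilon^{3/2}\, iL(\beta)^{-1}A_1\partial_\zeta\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} + \varepsilon^2\, iL(\beta)^{-1}A_{II}(\partial_Y)\pi_L(\beta)U_{11}e^{i\theta} + \text{c.c.} + O(\varepsilon^{5/2}). \]
As in [C], introduce
\[ \overline{U}_{11}^\varepsilon = \mathbf{1}_{\{|(\partial_\zeta,\partial_Y)| \le 1/\sqrt{\varepsilon}\}}\,U_{11}. \]
The following lemma is a direct consequence of the decay properties associated to the regularity of $U_{11}$.

Lemma 5
The difference between $U_{11}$ and $\overline{U}_{11}^\varepsilon$ is controlled by
\[ \big|\overline{U}_{11}^\varepsilon - U_{11}\big|_T \le C\varepsilon^2\,|U_{11}|_{L^\infty(0,T;H^{s+4})}. \]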
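The mechanism behind Lemma 5 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it works in one space dimension on a periodic grid, uses a Gaussian as a hypothetical stand-in for $U_{11}$, and the helper names `sobolev_norm` and `cutoff_error` are not taken from the paper. On the removed frequencies $|\xi| > 1/\sqrt{\varepsilon}$ one has $1 \le \varepsilon^2(1+|\xi|^2)^2$, which is why four extra derivatives buy a factor $\varepsilon^2$.

```python
import numpy as np

def sobolev_norm(u_hat, xi, s):
    """Discrete H^s norm computed from (unnormalized) Fourier coefficients."""
    return np.sqrt(np.sum((1.0 + xi**2) ** s * np.abs(u_hat) ** 2))

def cutoff_error(eps, s=1.0, n=4096, length=40.0):
    """Return (|U - U_cut|_{H^s}, eps^2 * |U|_{H^{s+4}}) for a Gaussian test profile."""
    x = np.linspace(-length / 2, length / 2, n, endpoint=False)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=length / n)
    u_hat = np.fft.fft(np.exp(-x**2))          # hypothetical smooth profile
    high = np.abs(xi) > 1.0 / np.sqrt(eps)     # frequencies removed by the cutoff
    err = sobolev_norm(np.where(high, u_hat, 0.0), xi, s)
    bound = eps**2 * sobolev_norm(u_hat, xi, s + 4)
    return err, bound

for eps in (0.5, 0.1, 0.02):
    err, bound = cutoff_error(eps)
    print(f"eps={eps}: error={err:.3e}, eps^2 * H^(s+4) norm={bound:.3e}")
```

In this toy setting the inequality holds with constant $C = 1$, mode by mode; the constant in the lemma itself depends on the norms used in the paper.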
As $\beta$ is a smooth point of $\mathcal{C}_L$, there exists a local parametrization $\eta \mapsto \tau_L^{l_0}(\eta)$ defined on a neighborhood of $\eta$ such that $\tau_L^{l_0}(\eta) = \tau$. We denote by $\pi_L^{l_0}(\eta)$ the associated spectral projector. Thanks to [C], we know that, for all $j$,
\[ iL(\beta)^{-1}A_j\,\pi_L^{l_0}(\eta) = -i\partial_j\pi_L^{l_0}(\eta), \]
so that
ε 1 l √ Uaε = πL0 η + ε∂ζ , ε3/2 ∂Y U 11 eiθ + c. c. +ε2 Tε ε
with | T ε |T ≤ C ε
l
since πL0 (ξ ) is smooth near ξ = η and since the spectrum of U 11 is included in √ |ξ | ≤ 1/ ε. We can therefore write ε 2 T ε s s l0 ε 3/2 S g S1 − 2 P (s), πL η + ε∂ζ , ε ∂Y U 11 + c. c. ds I2 = ε 0 2 ε2 ε T √ s s S2ε 2 g S1ε − 2 P (s), Tε ds + ε ε ε 0 = I21 + I22 . It is clear that |I22 |T ≤ CT
√
ε|P |T .
(61)
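Now remark that
\[ \tau_L^{l_0}\big(\eta + \varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big) = \tau_L^{l_0}(\eta) + \varepsilon\xi_1\partial_1\tau(\eta) + O(\varepsilon^2) \]
since $\partial_{II}\tau(\eta) = 0$. Defining
\[ \widehat{W}_{11}^\varepsilon = e^{i[\tau_L^{l_0}(\eta + \varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})) - \tau_L^{l_0}(\eta) - \varepsilon\xi_1\partial_1\tau(\eta)](T/\varepsilon^2)}\, \pi_L^{l_0}\big(\eta + (\varepsilon\xi_1, \varepsilon^{3/2}\xi_{II})\big)\, \widehat{\overline{U}}{}_{11}^\varepsilon, \]
we thus obtain
\[ |W_{11}^\varepsilon|_T \le C \qquad\text{and}\qquad |\partial_T W_{11}^\varepsilon|_T \le C. \tag{62} \]
We can write
\[ \pi_L^{l_0}\big(\eta + (\varepsilon\partial_\zeta, \varepsilon^{3/2}\partial_Y)\big)\overline{U}_{11}^\varepsilon = S_1^\varepsilon\Big(-\frac{T}{\varepsilon^2}\Big)W_{11}^\varepsilon \]
and
\[ I_{21} = \frac{2}{\varepsilon}\int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big) g\Big(S_1^\varepsilon\Big(-\frac{s}{\varepsilon^2}\Big)P(s),\ S_1^\varepsilon\Big(-\frac{s}{\varepsilon^2}\Big)W_{11}^\varepsilon\Big)\,ds. \]
It follows that $I_{21}$ has the same form as $I_1$, and an integration by parts yields, using (62),
\[ |I_{21}|_T \le C\varepsilon T\big(|\partial_T P|_T + |P|_T\big) + C\varepsilon\big(|P|_T + |P(T=0)|_{H^s}\big). \tag{63} \]
It follows from (61) and (63) that
\[ |I_2|_T \le CT\sqrt{\varepsilon}\big(|\partial_T P|_T + |P|_T\big) + C\varepsilon\big(|P|_T + 1\big). \tag{64} \]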
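• Estimate of $I_3$

We first recall that, thanks to Proposition 7, we have $S^\varepsilon/\varepsilon^2 = S_1^\varepsilon + O(\varepsilon)$ and
\[ S_1^\varepsilon = S_{10}^\varepsilon(T,\zeta,Y) + S_{12}^\varepsilon e^{2i\theta} + \text{c.c.}, \]
as well as $\pi_M^1(0)S_{10}^\varepsilon = 0$, thanks to Proposition 7(ii). We now introduce the notation
\[ I_3^j = \int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big) S_{1j}^\varepsilon e^{ij\theta}\,ds. \]
As $S_{1j}^\varepsilon$ is smooth enough, we have
\[ \big|\mathbf{1}_{\{|(\partial_\zeta,\partial_Y)| \ge 1/\sqrt{\varepsilon}\}}\,S_{1j}^\varepsilon\big|_T \le C\varepsilon, \]
and thus
\[ |I_3^j|_T \le \Big|\int_0^T S_2^\varepsilon\Big(\frac{s}{\varepsilon^2}\Big)\mathbf{1}_{\{|(\partial_\zeta,\partial_Y)| \le 1/\sqrt{\varepsilon}\}}\,S_{1j}^\varepsilon e^{ij\theta}\,ds\Big|_T + C\varepsilon. \]
For $j = 2$, since $2\beta$ is not in the characteristic variety of $M$, an integration by parts yields
\[ |I_3^2|_T \le C\varepsilon^2 + C\varepsilon \le C\varepsilon. \tag{65} \]
For $j = 0$, one has
\[ \mathcal{F}(I_3^0) = \sum_l \int_0^T e^{-i(\varepsilon\partial_1\tau(\eta)\xi_1 - \tau_M^l(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})))(s/\varepsilon^2)}\, \pi_M^l\big(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big)\, \mathbf{1}_{\{|(\partial_\zeta,\partial_Y)| \le 1/\sqrt{\varepsilon}\}}\,\widehat{S}_{10}^\varepsilon(s,\xi)\,ds + O(\varepsilon) := \sum_l \mathcal{F}\big(I_3^0(l)\big) + O(\varepsilon). \]
We may encounter three cases.

(i) We have $\tau_M^l(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})) \to \tau_M^l(0) \ne 0$ when $\varepsilon$ tends towards zero. In this case, an integration by parts yields
\[ \Big|\sum_{l,\ \tau_M^l(0) \ne 0} I_3^0(l)\Big|_T \le C\varepsilon^2. \]
(ii) We have $\tau_M^l(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})) \sim \varepsilon\,\partial_1\tau_M^l(0)\xi_1$ and $\partial_1\tau_M^l(0) \ne \partial_1\tau(\eta)$. In this case, the phase does not vanish except in a neighborhood of zero, and a standard argument yields
\[ \Big|\sum_{l,\ \partial_1\tau_M^l(0) \ne \partial_1\tau(\eta)} I_3^0(l)\Big|_T = o(1). \]
(iii) We have $\tau_M^l(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})) \sim \varepsilon\,\partial_1\tau(\eta)\xi_1$. In this case, we cannot expect anything from the phase; however, we have the following lemma.

Lemma 6
If $\partial_1\tau_M^l(0) = \partial_1\tau(\eta)$, then we have
\[ \lim_{\varepsilon\to 0} \pi_M^l\big(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big)\big(1 - \pi_M^1(0)\big) = 0. \]

Proof
First recall that $\pi_M^1(0)$ is the spectral projector of $\pi_M(0)B_1\pi_M(0)$ associated to the eigenvalue $-\partial_1\tau(\eta)$. The mapping
\[ \varepsilon \longmapsto \pi_M^l\big(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big) \]
is analytic and bounded for $\varepsilon$ small enough and $\varepsilon \ne 0$. Thanks to [K], we can therefore extend this function analytically to zero. We denote by $\pi_M^l(0)$ the value of this extension.

By definition of $\pi_M^l(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II}))$, one has
\[ \Big(\tau_M^l\big(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big) + \varepsilon B_1\xi_1 + \varepsilon^{3/2}B_{II}(\xi_{II}) + \frac{M_0}{i}\Big)\pi_M^l\big(\varepsilon(\xi_1, \sqrt{\varepsilon}\,\xi_{II})\big) = 0. \tag{66} \]
Multiplying this expression on the left by $\pi_M(0)$, dividing it by $\varepsilon$, and finally taking the limit when $\varepsilon \to 0$ yields
\[ \pi_M(0)\big(\xi_1\partial_1\tau_M^l(0) + B_1\xi_1\big)\pi_M^l(0) = 0. \]
As we have $\partial_1\tau_M^l(0) = \partial_1\tau(\eta)$, we can conclude that
\[ \pi_M(0)B_1\pi_M^l(0) = -\partial_1\tau(\eta)\,\pi_M(0)\pi_M^l(0). \]
We just have to prove that the range of $\pi_M^l(0)$ is contained in the range of $\pi_M(0)$ to complete the proof. But taking the limit when $\varepsilon \to 0$ in (66) yields $M_0\pi_M^l(0) = 0$, which proves the desired result.

It follows that $I_3^0 \to 0$ when $\varepsilon \to 0$, and we thus obtain
\[ \lim_{\varepsilon\to 0} I_3 = 0. \tag{67} \]
It follows from (60), (64), and (67) that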
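\[ |Q|_T \le |Q(T=0)|_{H^s} + C\varepsilon T\,|P|_T\,|\partial_T P|_T + C\varepsilon\big(|P|_T^2 + |P(T=0)|_{H^s}^2\big) + C\sqrt{\varepsilon}\,T\big(|\partial_T P|_T + |P|_T\big) + C\varepsilon\big(|P|_T + 1\big) + o(1). \tag{68} \]
Thanks to (58), (59), and (68), we can end the proof of the theorem as in [C].

5. The 1-dimensional case

We consider in this section 1-dimensional problems that belong to the general class (1). They read
\[ \partial_t u^\varepsilon + A_1\partial_1 u^\varepsilon + \frac{L_0}{\varepsilon}u^\varepsilon = f(u^\varepsilon, u^\varepsilon). \tag{69} \]
As said in the introduction, one seeks in this case approximate solutions to this system under the form
\[ u^\varepsilon(x) = \sqrt{\varepsilon}\;U\Big(\varepsilon, \varepsilon t, t, y_1, \frac{\beta\cdot x}{\varepsilon}\Big), \]
with
\[ U(\varepsilon, T, t, y_1, \theta) := \big(U_1 + \sqrt{\varepsilon}\,U_2 + \varepsilon U_3 + \varepsilon^{3/2}U_4 + \varepsilon^2 U_5\big)(\varepsilon, T, t, y_1, \theta). \]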
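We have also seen that the long-wave short-wave resonance condition reduces in this case to the usual rectification condition. The study of this 1-dimensional case can easily be deduced from the multidimensional study made in the previous sections. The following theorem gives the evolution equations that the leading terms of the ansatz must satisfy in order for the approximate solution $u^\varepsilon$ to be a good approximation of the exact solution.

Theorem 3
Suppose that $u^\varepsilon$, given by
\[ u^\varepsilon(x) = \sqrt{\varepsilon}\;U\Big(\varepsilon, \varepsilon t, t, y_1, \frac{\beta\cdot x}{\varepsilon}\Big) \]
with
\[ U := U_1 + \sqrt{\varepsilon}\,U_2 + \varepsilon U_3 + \varepsilon^{3/2}U_4 + \varepsilon^2 U_5, \]
is the approximate solution to (69) given by geometric optics. If $U_1 = U_{11}e^{i\theta} + \text{c.c.}$ and $U_2 = U_{20} + U_{21}e^{i\theta} + \text{c.c.}$, then one has
\[ \pi(\beta)U_{11} = U_{11}, \qquad U_{21} = 0, \qquad\text{and}\qquad \pi(0)U_{20} = U_{20}. \]
Moreover, $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20} = \pi(0)U_{20}$ are transported at the group velocity $-\partial_1\tau(\eta)$, that is,
\[ \big(\partial_t - \partial_1\tau(\eta)\partial_{y_1}\big)\pi(\beta)U_{11} = \big(\partial_t - \partial_1\tau(\eta)\partial_{y_1}\big)\pi^1(0)U_{20} = 0, \]
and must also satisfy
\[ \partial_T\pi(\beta)U_{11} + \frac{i}{2}\partial_1^2\tau(\eta)\,\partial_{y_1}^2\pi(\beta)U_{11} = 2\pi(\beta)f\big(\pi(\beta)U_{11}, \pi^1(0)U_{20}\big) \tag{70} \]
and
\[ \partial_T\pi^1(0)U_{20} = -2i\,\partial_{y_1}\pi^1(0)f\big(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\big) + 2i\,\pi^1(0)A_1\partial_{y_1}L(0)^{-1}f\big(\pi(\beta)U_{11}, \pi(\beta)U_{11}\big). \tag{71} \]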
The system that $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20}$ must solve is simpler than the system (S) found in the multidimensional case since the dispersive term $\partial_\xi^{-1}$ disappears. The system found here can be solved, so that we do not need to make an assumption like Assumption 5. Since the dependence of $\pi(\beta)U_{11}$ and $\pi^1(0)U_{20}$ on $t$ and $y_1$ is made through $\zeta := y_1 + t\,\partial_1\tau(\eta)$, we can write $\pi(\beta)U_{11}(T,t,y_1)$ and $\pi^1(0)U_{20}(T,t,y_1)$ under the forms $\pi(\beta)U_{11}(T,\zeta)$ and $\pi^1(0)U_{20}(T,\zeta)$. We then have the following theorem.

Theorem 4
Let $U_{11}^0 = \pi(\beta)U_{11}^0$ and $U_{20}^0 = \pi^1(0)U_{20}^0$ be in $H^s(\mathbb{R}_\zeta)$ for $s \ge 0$. There exists a $T > 0$ and a unique couple of profiles $U_{11}, U_{20} \in C([0,T]; H^s(\mathbb{R}_\zeta))$ satisfying
\[ (S_1)\quad \begin{cases} \partial_T\pi(\beta)U_{11} + \dfrac{i}{2}\partial_1^2\tau(\eta)\,\partial_\zeta^2\pi(\beta)U_{11} = 2\pi(\beta)f\big(\pi(\beta)U_{11}, \pi^1(0)U_{20}\big), \\[1ex] \partial_T\pi^1(0)U_{20} = -2i\,\partial_\zeta\pi^1(0)f\big(\partial_1\pi(\beta)U_{11}, \pi(\beta)U_{11}\big) + 2i\,\pi^1(0)A_1L(0)^{-1}\partial_\zeta f\big(\pi(\beta)U_{11}, \pi(\beta)U_{11}\big), \end{cases} \]
together with the polarization conditions
\[ U_{11} = \pi(\beta)U_{11} \qquad\text{and}\qquad U_{20} = \pi^1(0)U_{20}, \]
and with the initial conditions
\[ U_{11}|_{T=0} = U_{11}^0 \qquad\text{and}\qquad U_{20}|_{T=0} = U_{20}^0. \]
Proof
The theorem is proved if we can establish an existence/uniqueness result for a general system of the form
\[ \begin{cases} \partial_T u + i\lambda\partial_\zeta^2 u = f_1(u,v), \\ \partial_T v = \partial_\zeta f_2(u,u), \end{cases} \tag{72} \]
where $\lambda \in \mathbb{R}\setminus\{0\}$, $f_1$ and $f_2$ are two bilinear mappings, and $u$ and $v$ are vector-valued functions defined on $[0,T] \times \mathbb{R}_\zeta$. This system is completed with the initial conditions
\[ u(T=0) = u_0 \in H^2(\mathbb{R}) \qquad\text{and}\qquad v(T=0) = v_0 \in H^1(\mathbb{R}). \]
A direct proof using Picard iterates cannot yield the result for a system like (72) since we must deal with the loss of a derivative because of the term $\partial_\zeta$ in front of the right-hand side of the second equation. In order to overcome this difficulty, we use a technique introduced in [OT] for the Zakharov equations. We thus introduce the system
\[ \begin{cases} \partial_T w + i\lambda\partial_\zeta^2 w = f_1(w,v) + f_1\Big(u_0 + \displaystyle\int_0^T w,\ \partial_T v\Big), \\[1ex] \partial_T v = \partial_\zeta f_2(u,u), \\[1ex] \big(\partial_\zeta^2 - 1\big)u = \dfrac{i}{\lambda}w - u_0 - \displaystyle\int_0^T w - \dfrac{i}{\lambda}f_1\Big(u_0 + \displaystyle\int_0^T w,\ v\Big), \end{cases} \tag{73} \]
together with the initial conditions
\[ v(T=0) = v_0 \qquad\text{and}\qquad w(T=0) = w_0 := -i\lambda\partial_\zeta^2 u_0 + f_1(u_0,v_0) \in L^2(\mathbb{R}). \]
This system is formally obtained by differentiating the first equation in (72) with respect to $T$ and introducing $w = \partial_T u$; the third equation is the first equation of (72) solved for $\partial_\zeta^2 u$, with $u$ replaced by $u_0 + \int_0^T w$. The problem due to the loss of derivatives has disappeared from this new formulation. The third equation in (73) gives $u$ in terms of $v$ and $w$, thanks to an elliptic inversion. Using the expression of $u$ thus found, the first two equations of (73) are written in terms of $v$ and $w$. It is easy to show, using classical Picard iterates, that this system of two equations on $v$ and $w$ admits a unique solution $(v,w) \in C([0,T]; H^1(\mathbb{R}) \times L^2(\mathbb{R}))$, for a $T > 0$, satisfying $(v,w)(T=0) = (v_0,w_0)$. Once $v$ and $w$ are known, we can find $u$, thanks to the formula
\[ u = \big(\partial_\zeta^2 - 1\big)^{-1}\Big(\frac{i}{\lambda}w - u_0 - \int_0^T w - \frac{i}{\lambda}f_1\Big(u_0 + \int_0^T w,\ v\Big)\Big). \]
The system (73) thus admits a unique solution $(u,v,w) \in C([0,T]; H^2(\mathbb{R}) \times H^1(\mathbb{R}) \times L^2(\mathbb{R}))$ such that $(v,w)(T=0) = (v_0,w_0)$. The proof of the theorem will therefore be complete once we have proved that $u \in C^1([0,T]; L^2(\mathbb{R}))$ with $\partial_T u = w$, and that $u(T=0) = u_0$.

Differentiating the third equation in (73) with respect to $T$, one gets
\[ \big(\partial_\zeta^2 - 1\big)\partial_T u = \frac{i}{\lambda}\partial_T w - w - \frac{i}{\lambda}\partial_T f_1\Big(u_0 + \int_0^T w,\ v\Big). \tag{74} \]
But thanks to the first equation of (73), one has
\[ \big(\partial_\zeta^2 - 1\big)w = \frac{i}{\lambda}\partial_T w - w - \frac{i}{\lambda}\partial_T f_1\Big(u_0 + \int_0^T w,\ v\Big), \]
so that we can conclude that $\partial_T u(T) = w(T)$ in $H^{-2}(\mathbb{R})$. But it is easy to see, thanks to (74), that $\partial_T u$ is in $C([0,T]; L^2(\mathbb{R}))$, so that $u \in C^1([0,T]; L^2(\mathbb{R}))$.
Using the third equation of (73) and the initial conditions associated to this system, one gets $u(T=0) = u_0$, and the proof of the theorem is thus complete.

Theorem 4 gives the leading oscillating term and the leading nonoscillating term of the ansatz. As was done previously for the multidimensional case, we can completely determine our ansatz thanks to these two profiles. Here again, a stability property for the approximate solution $u^\varepsilon$ can be proved, but only in the case of systems of the form (48).
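The elliptic inversion used in the proof of Theorem 4 has a simple Fourier description: $(\partial_\zeta^2 - 1)^{-1}$ is the multiplier $-1/(1+\xi^2)$, which gains two derivatives and is what places $u$ in $H^2$ when the right-hand side is in $L^2$. The following sketch is only an illustration on a periodic grid; the function name `invert_dzz_minus_one` and the Gaussian test data are assumptions made for the example, not objects from the paper.

```python
import numpy as np

def invert_dzz_minus_one(g, length):
    """Solve (d^2/dzeta^2 - 1) u = g on a periodic grid via the multiplier -1/(1 + xi^2)."""
    n = g.size
    xi = 2 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.fft.ifft(-np.fft.fft(g) / (1.0 + xi**2))

length, n = 40.0, 1024
zeta = np.linspace(-length / 2, length / 2, n, endpoint=False)
u_exact = np.exp(-zeta**2) * (1 + 0.5j)   # hypothetical complex-valued profile
g = (4 * zeta**2 - 3) * u_exact           # equals u'' - u for this Gaussian
u = invert_dzz_minus_one(g, length)
max_error = np.max(np.abs(u - u_exact))
print(f"max reconstruction error: {max_error:.2e}")
```

The multiplier never vanishes, so no compatibility condition on the right-hand side is needed, in contrast with the operator $\partial_1^{-1}$ discussed in Section 6.2.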
6. About Proposition 1 and Assumption 5

6.1. Proof of Proposition 1

More precisely, we prove the following proposition.

Proposition 9
Suppose that Assumptions 1 and 2 are satisfied, and assume that $\tau'(\eta)\cdot\eta^0 \ne 0$. Then there exists a problem $(\widetilde{1})$ in one-to-one correspondence with problem (1) and for which the contact direction and the group speed are colinear.

Proof
We can always suppose that $\beta^0$ as defined in the introduction is of the form $\beta^0 = (1, \eta_1^0, 0, \ldots, 0)$. Let $P = (p_{jk})$ be an invertible matrix; to any function $u(t,y)$ we associate the function $\widetilde{u}$ defined as
\[ \widetilde{u}(t,y) := u\big(t, P^{-1}y\big). \]
Then, if $u$ solves (1), that is, if $L^\varepsilon(\partial_x)u + f(u,u) = 0$, then $\widetilde{u}$ solves $(\widetilde{1})$,
\[ \widetilde{L}^\varepsilon(\partial_x)\widetilde{u} + f(\widetilde{u},\widetilde{u}) = 0, \]
where
\[ \widetilde{L}^\varepsilon(\partial_x) := \partial_T + \sum_{j=1}^d \widetilde{A}_j\partial_j + \frac{L_0}{\varepsilon} \qquad\text{and}\qquad \widetilde{A}_j := \sum_{k=1}^d p_{jk}A_k. \]
We also introduce the operators $\widetilde{L}(\beta)$ and $\widetilde{\pi}(\beta)$ which are linked to $(\widetilde{1})$ and whose definition is straightforward. To $\beta$ we also associate $\widetilde{\beta}$ defined as $\widetilde{\beta} := (\tau, (P^{-1})^T\eta)$.
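(a) We prove here that $\widetilde{\pi}(\widetilde{\beta}) = \pi(\beta)$. Indeed, one has
\[ \begin{aligned} \widetilde{L}(\widetilde{\beta}) &= \tau + \sum_{j=1}^d \widetilde{A}_j\widetilde{\eta}_j + \frac{L_0}{i} = \tau + \sum_{j=1}^d\sum_{k=1}^d p_{jk}A_k\widetilde{\eta}_j + \frac{L_0}{i} \\ &= \tau + \sum_{k=1}^d\Big(\sum_{j=1}^d p_{jk}\widetilde{\eta}_j\Big)A_k + \frac{L_0}{i} = \tau + \sum_{k=1}^d \big(Pe_k\cdot\widetilde{\eta}\big)A_k + \frac{L_0}{i}, \end{aligned} \]
where $(e_1, \ldots, e_d)$ denotes the canonical basis of $\mathbb{R}^d$. Since $\widetilde{\eta} = (P^{-1})^T\eta$, one has $Pe_k\cdot\widetilde{\eta} = \eta_k$, and therefore
\[ \widetilde{L}(\widetilde{\beta}) = \tau + \sum_{j=1}^d A_j\eta_j + \frac{L_0}{i}; \]
that is, $\widetilde{L}(\widetilde{\beta}) = L(\beta)$. The kernels of these matrices are therefore the same, and thus we have $\widetilde{\pi}(\widetilde{\beta}) = \pi(\beta)$.

(b) Denoting by $\widetilde{\tau}(\eta)$ a parametrization of $\mathcal{C}_{\widetilde{L}}$, we now prove that $\widetilde{\tau}'(\widetilde{\beta}) = (\partial_1\tau(\beta), 0, \ldots, 0)$. We know that
\[ \widetilde{\pi}(\widetilde{\beta})\widetilde{A}_j\widetilde{\pi}(\widetilde{\beta}) = -\partial_j\widetilde{\tau}(\widetilde{\beta})\,\widetilde{\pi}(\widetilde{\beta}), \]
which, thanks to the result of (a), reads
\[ \pi(\beta)\widetilde{A}_j\pi(\beta) = -\partial_j\widetilde{\tau}(\widetilde{\beta})\,\pi(\beta). \]
We now specify the matrix $P$. Denoting by $l_j$ its row vectors, we take $l_1 = e_1$, and for $(l_2, \ldots, l_d)$ we take any basis of the hyperplane orthogonal to $\tau'(\beta)$. Since we have supposed that $\tau'(\beta)\cdot\eta^0 \ne 0$, that is, that $\tau'(\beta)\cdot e_1 \ne 0$, $P$ is invertible. We then have $\pi(\beta)\widetilde{A}_1\pi(\beta) = \pi(\beta)A_1\pi(\beta)$, and since this last quantity is equal to $-\partial_1\tau(\beta)\pi(\beta)$, we can conclude that $\partial_1\widetilde{\tau}(\widetilde{\beta}) = \partial_1\tau(\beta)$. When $j \ge 2$, one has
\[ \pi(\beta)\widetilde{A}_j\pi(\beta) = \sum_{k=1}^d p_{jk}\,\pi(\beta)A_k\pi(\beta) = -\sum_{k=1}^d p_{jk}\,\partial_k\tau(\beta)\,\pi(\beta) = -\big(l_j\cdot\tau'(\beta)\big)\pi(\beta) = 0 \]
since $(l_j)_{j\ge 2}$ is a basis of the hyperplane orthogonal to $\tau'(\beta)$.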
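The choice of $P$ made in step (b) can be checked numerically. The sketch below is an illustration under assumptions: the gradient vector is a hypothetical numerical value standing in for $\tau'(\beta)$, the helper name `build_P` is not from the paper, and the basis of the orthogonal hyperplane is taken orthonormal (via a singular value decomposition) although any basis would do. It verifies that $P\,\tau'(\beta) = (\partial_1\tau(\beta), 0, \ldots, 0)$ and that $P$ is invertible when the first component of the gradient is nonzero.

```python
import numpy as np

def build_P(grad_tau):
    """Rows: l_1 = e_1, then an orthonormal basis of the hyperplane orthogonal to grad_tau."""
    g = np.asarray(grad_tau, dtype=float)
    if abs(g[0]) < 1e-14:
        raise ValueError("need grad_tau . e_1 != 0 for P to be invertible")
    d = g.size
    e1 = np.zeros(d)
    e1[0] = 1.0
    # For a 1 x d matrix, rows 1..d-1 of Vt span the orthogonal complement of g.
    _, _, vt = np.linalg.svd(g.reshape(1, d))
    return np.vstack([e1, vt[1:]])

grad_tau = np.array([2.0, -1.0, 0.5])   # hypothetical value of tau'(beta)
P = build_P(grad_tau)
transformed = P @ grad_tau              # components l_j . tau'(beta)
print(transformed, np.linalg.det(P))
```

With this construction the determinant of $P$ equals, up to sign, the first component of the normalized gradient, which makes the role of the hypothesis $\tau'(\beta)\cdot e_1 \ne 0$ explicit.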
We therefore have $\partial_j\widetilde{\tau}(\widetilde{\beta}) = 0$ for $j \ge 2$, so that $\widetilde{\tau}'(\widetilde{\beta}) = (\partial_1\tau(\beta), 0, \ldots, 0)$.

(c) Denoting by $\widetilde{\mathcal{C}}^0$ the tangent cone at $(0,0)$ to $\mathcal{C}_{\widetilde{L}}$, we now prove that the tangent plane $\widetilde{\mathcal{P}}$ to $\mathcal{C}_{\widetilde{L}}$ at $\widetilde{\beta}$ is also tangent to $\widetilde{\mathcal{C}}^0$ at $\widetilde{\beta}^0$. Thanks to the results of (b), we know that the vector $\vec{n} := (1, -\partial_1\tau(\beta), 0, \ldots, 0)$ is normal to $\widetilde{\mathcal{P}}$. We thus have to show that it is also normal to $\widetilde{\mathcal{C}}^0$ at $\widetilde{\beta}^0$. With arguments similar to those used in (a), we can prove that
\[ (\tau,\eta) \in \mathcal{C}^0 \iff \big(\tau, (P^{-1})^T\eta\big) \in \widetilde{\mathcal{C}}^0, \]
so that if $\tau^0(\eta)$ is a parametrization of $\mathcal{C}^0$, then $\widetilde{\tau}^0(\eta) := \tau^0(P^T\eta)$ is a parametrization of $\widetilde{\mathcal{C}}^0$. We have therefore
\[ (\widetilde{\tau}^0)'(\eta^0) = (\tau^0)'\big(P^T\eta^0\big)P^T = (\tau^0)'(\eta^0)P^T, \]
since $\eta^0 = e_1$. But Assumption 2 says that $(\tau^0)'(\eta^0) = \tau'(\beta)$, so that one has $(\widetilde{\tau}^0)'(\eta^0) = \tau'(\beta)P^T$, and hence $\partial_j\widetilde{\tau}^0(\eta^0) = \tau'(\beta)\cdot l_j$ for all $j$. Thanks to the definition of the $l_j$, this yields
\[ (\widetilde{\tau}^0)'(\eta^0) = \big(\partial_1\tau(\beta), 0, \ldots, 0\big), \]
and therefore $\vec{n} = (1, -\partial_1\tau(\beta), 0, \ldots, 0)$ is normal to $\widetilde{\mathcal{C}}^0$ at $\widetilde{\beta}^0$, as we wanted to prove.

(d) We have thus proved that for the problem $(\widetilde{1})$, Assumptions 1 and 2 remain true, and that Proposition 1 is also satisfied.

6.2. An existence theorem

In Assumption 5 we supposed the existence and uniqueness of a regular solution to the coupled problem (S) which gives the leading terms of our approximate solution. We have not yet proved this existence/uniqueness theorem, but we give here an existence theorem for a simplified version of system (S) which also appears in the study of water waves (see [L2], [Su]). This system reads
\[ (T)\quad \begin{cases} i\partial_t u + \partial_1^2 u = u\,\partial_1 v, \\ \partial_t v + \partial_1^{-1}\partial_2^2 v = -|u|^2, \end{cases} \]
where $\partial_1$ and $\partial_2$ denote the partial derivative with respect to the first and the second space coordinate, respectively. We want $v$ to be real valued, while $u$ may take complex values.
The second equation does not make sense since the operator $\partial_1^{-1}\partial_2^2$ does not act on distributions. However, the integral equation (used in Theorem 9),
\[ v = e^{-\partial_1^{-1}\partial_2^2 t}v_0 - \int_0^t e^{-\partial_1^{-1}\partial_2^2(t-s)}|u|^2(s)\,ds, \]
makes sense since the group $e^{-\partial_1^{-1}\partial_2^2 t}$ acts on every Sobolev space $H^s$, and for $u \in L^\infty(0,T; L^2)$, $|u|^2$ lies in $L^\infty(0,T; H^s)$ for some negative $s$. This system may be seen as a simplified version of (S) in space dimension equal to 2 where $u$ plays the role of $\pi(\beta)U_{11}$, and $\partial_1 v$ plays the role of $\pi^1(0)U_{20}$. Throughout this section, the Fourier dual variables of $y_1$ and $y_2$ are denoted by $\xi_1$ and $\xi_2$, respectively.

6.2.1. The regularized problem

In order to define a regularized problem associated to (T), we introduce, for any $\mu > 0$, the operator $\partial_\mu$, whose symbol is given by
\[ i\,\frac{\mu + \xi_1^2}{\xi_1}. \]
The operator $\partial_\mu^{-1}$ is therefore given by the symbol
\[ -i\,\frac{\xi_1}{\mu + \xi_1^2}, \]
which is also used to regularize the KP equation in $\mathbb{R}^2$ (see [IMS]). In the following lemma, we give some of the properties of these operators.

Lemma 7
(i) $\partial_\mu$ and $\partial_\mu^{-1}$ are antiadjoint.
(ii) If $\varphi$ is a real-valued function, then $\partial_\mu\varphi$ and $\partial_\mu^{-1}\varphi$ are also real valued.

Proof
(i) This is a consequence of the fact that the symbols of $\partial_\mu$ and $\partial_\mu^{-1}$ are purely imaginary.
(ii) It follows from the fact that these symbols are also odd.
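Both properties of Lemma 7 can be observed on a discretized version of $\partial_\mu^{-1}$. The sketch below is an illustration only: it works in one variable on a periodic grid, the function name `apply_inv_d_mu`, the value of $\mu$, and the Gaussian test functions are assumptions made for the example. An odd number of grid points is used so that every nonzero frequency has a partner of opposite sign.

```python
import numpy as np

def apply_inv_d_mu(f, mu, length):
    """Apply the Fourier multiplier -i*xi/(mu + xi^2) (symbol of the regularized inverse)."""
    n = f.size
    xi = 2 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.fft.ifft(-1j * xi / (mu + xi**2) * np.fft.fft(f))

length, n, mu = 40.0, 1023, 0.1          # odd n avoids an unpaired Nyquist mode
y1 = np.linspace(-length / 2, length / 2, n, endpoint=False)
f = np.exp(-y1**2)                        # hypothetical real-valued test functions
g = np.exp(-(y1 - 1.0) ** 2)
Af, Ag = apply_inv_d_mu(f, mu, length), apply_inv_d_mu(g, mu, length)

imag_part = np.max(np.abs(Af.imag))       # (ii): real input gives real output
antisymmetry_defect = abs(np.sum(Af.real * g) + np.sum(f * Ag.real))  # (i)
print(imag_part, antisymmetry_defect)
```

For $\mu > 0$ the multiplier is bounded and vanishes at $\xi_1 = 0$, which is the regularizing effect exploited in this section; at $\mu = 0$ it reduces to the singular symbol of $\partial_1^{-1}$.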
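We can now define the regularized problem. For $\epsilon > 0$ and $\mu > 0$,
\[ (T_{\epsilon,\mu})\quad \begin{cases} i\partial_t u + \partial_1^2 u = u\,\partial_1 v, \\ \partial_t\big(1 + \epsilon\Delta^2\big)v + \partial_\mu^{-1}\partial_2^2 v = -|u|^2. \end{cases} \]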
The end of this section is devoted to the proof of the following theorem.

Theorem 5
(i) Let $(u_0,v_0) \in L^2 \times H^{5/2}(\mathbb{R}^2)$. There exists a unique solution $(u,v) \in C(\mathbb{R}; L^2(\mathbb{R}^2) \times H^{5/2}(\mathbb{R}^2)) \cap C^1(\mathbb{R}; H^{-2}(\mathbb{R}^2) \times H^{5/2}(\mathbb{R}^2))$ of $(T_{\epsilon,\mu})$ with initial values $(u,v)(t=0) = (u_0,v_0)$.
(ii) If $(u_0,v_0) \in H^2 \times H^5$, then $(u,v) \in C(\mathbb{R}; H^2 \times H^5) \cap C^1(\mathbb{R}; L^2 \times H^7)$.

Proof
Solving $(T_{\epsilon,\mu})$ in the spaces given in the theorem is equivalent to solving the two integral equations
\[ u = S_1(t)u_0 - i\int_0^t S_1(t-s)\,u\,\partial_1 v(s)\,ds \tag{75} \]
and
\[ v = S_2(t)v_0 - \int_0^t S_2(t-s)\big(1 + \epsilon\Delta^2\big)^{-1}|u|^2(s)\,ds, \tag{76} \]
where
\[ S_1(t) := e^{i\partial_1^2 t} \qquad\text{and}\qquad S_2(t) := e^{-\partial_\mu^{-1}\partial_2^2(1+\epsilon\Delta^2)^{-1}t} \]
are two unitary groups on $L^2$. For $(u,v) \in C(\mathbb{R}; L^2 \times H^{5/2}(\mathbb{R}^2))$, let us introduce
\[ \mathcal{C}(u,v) = \big(\mathcal{C}_1(u,v), \mathcal{C}_2(u,v)\big) \]
with
\[ \mathcal{C}_1(u,v) = S_1(t)u_0 - i\int_0^t S_1(t-s)\,u\,\partial_1 v(s)\,ds \tag{77} \]
and
\[ \mathcal{C}_2(u,v) = S_2(t)v_0 - \int_0^t S_2(t-s)\big(1 + \epsilon\Delta^2\big)^{-1}|u|^2(s)\,ds. \tag{78} \]
We also introduce the space $X_T := C([0,T]; L^2(\mathbb{R}^2) \times H^{5/2}(\mathbb{R}^2))$ and consider its natural norm
\[ |(u,v)|_{X_T} := \max\big(|u|_{L^\infty([0,T];L^2)},\ |v|_{L^\infty([0,T];H^{5/2})}\big). \]
For any $R > 0$, we also denote by $B_R$ the ball of $X_T$ with radius $R$. We can now state the following lemma.

Lemma 8
Let $R := 2\max(|u_0|_{L^2}, |v_0|_{H^{5/2}})$.
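There exists $T_1 > 0$ such that, for all $T \le T_1$, the mapping $\mathcal{C}$ maps $B_R$ into itself.

Proof
One has
\[ \begin{aligned} |\mathcal{C}_1(u,v)|_{L^\infty([0,T];L^2)} &\le |u_0|_{L^2} + T\,|u\,\partial_1 v|_{L^\infty([0,T];L^2)} \\ &\le |u_0|_{L^2} + T\,|u|_{L^\infty([0,T];L^2)}\,|\partial_1 v|_{L^\infty([0,T];L^\infty)} \\ &\le |u_0|_{L^2} + C_1T\,|u|_{L^\infty([0,T];L^2)}\,|v|_{L^\infty([0,T];H^{5/2})}, \end{aligned} \tag{79} \]
since $\partial_1 v \in H^{3/2}(\mathbb{R}^2) \subset L^\infty$. We also have
\[ \big|(1-\Delta)^{5/4}\mathcal{C}_2(u,v)\big|_{L^\infty([0,T];L^2)} \le |v_0|_{H^{5/2}} + T\,\big|(1-\Delta)^{5/4}\big(1+\epsilon\Delta^2\big)^{-1}|u|^2\big|_{L^\infty([0,T];L^2)}, \]
but
\[ \big|(1-\Delta)^{5/4}\big(1+\epsilon\Delta^2\big)^{-1}|u|^2\big|_{L^\infty([0,T];L^2)} \le C\,\big|(1-\Delta)^{-3/4}|u|^2\big|_{L^\infty([0,T];L^2)} \]
and
\[ \big||u|^2\big|_{H^{-1-\alpha}} \le C\,\big||u|^2\big|_{L^1} \]
for any $\alpha > 0$. Taking $\alpha = 1/2$ thus yields
\[ \big||u|^2\big|_{H^{-3/2}} \le C\,\big||u|^2\big|_{L^1} = C\,|u|_{L^2}^2. \]
We have therefore
\[ \big|(1-\Delta)^{5/4}\mathcal{C}_2(u,v)\big|_{L^\infty([0,T];L^2)} \le |v_0|_{H^{5/2}} + C_2T\,|u|_{L^\infty([0,T];L^2)}^2. \tag{80} \]
With $R = 2\max(|u_0|_{L^2}, |v_0|_{H^{5/2}})$ and $(u,v) \in B_R$, equation (79) yields
\[ |\mathcal{C}_1(u,v)|_{L^\infty([0,T];L^2)} \le \frac{R}{2} + C_1TR^2, \]
and (80) yields
\[ |\mathcal{C}_2(u,v)|_{L^\infty([0,T];H^{5/2})} \le \frac{R}{2} + C_2TR^2. \]
With $T_1 = \min(1/(2C_1R), 1/(2C_2R))$ and $T \le T_1$, we have therefore $|\mathcal{C}(u,v)|_{X_T} \le R$, and Lemma 8 is thus proved.

We now prove another lemma before pursuing the proof of the theorem.

Lemma 9
There exists $T_2 > 0$ such that, for all $T \le T_2$, $\mathcal{C}$ is a contraction on the ball $B_R$ of $X_T$.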
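Before turning to the proof, the arithmetic behind the choice of $T_1$ in Lemma 8 can be spelled out in a few lines. This is a minimal sketch under assumptions: the numerical values of $C_1$, $C_2$, and $R$ are hypothetical, and the function names are introduced only for this illustration.

```python
def self_map_time(c1, c2, radius):
    """T_1 = min(1/(2 C_1 R), 1/(2 C_2 R)), as in the proof of Lemma 8."""
    return min(1.0 / (2.0 * c1 * radius), 1.0 / (2.0 * c2 * radius))

def image_bounds(c1, c2, radius, t):
    """Right-hand sides R/2 + C_i T R^2 of the two estimates for (u, v) in B_R."""
    return (radius / 2 + c1 * t * radius**2,
            radius / 2 + c2 * t * radius**2)

c1, c2, radius = 3.0, 5.0, 2.0            # hypothetical constants
t1 = self_map_time(c1, c2, radius)
b1, b2 = image_bounds(c1, c2, radius, t1)
print(t1, b1, b2)
```

For any $T \le T_1$ each term $C_iTR^2$ is at most $R/2$, so both components of $\mathcal{C}(u,v)$ stay in the ball of radius $R$.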
Proof
Let $(u,v)$ and $(\tilde{u},\tilde{v})$ in $X_T$ be such that $(u,v)(t=0) = (\tilde{u},\tilde{v})(t=0) = (u_0,v_0)$. One has
\[ \mathcal{C}_1(u,v) - \mathcal{C}_1(\tilde{u},\tilde{v}) = -i\int_0^t S_1(t-s)\big(u\,\partial_1 v - \tilde{u}\,\partial_1\tilde{v}\big)\,ds, \]
so that
\[ \begin{aligned} |\mathcal{C}_1(u,v) - \mathcal{C}_1(\tilde{u},\tilde{v})|_{L^\infty([0,T];L^2)} &\le T\big(|(u-\tilde{u})\,\partial_1 v|_{L^\infty([0,T];L^2)} + |\tilde{u}(\partial_1 v - \partial_1\tilde{v})|_{L^\infty([0,T];L^2)}\big) \\ &\le T\big(C_1|u-\tilde{u}|_{L^\infty([0,T];L^2)}|v|_{L^\infty([0,T];H^{5/2})} + C_1|\tilde{u}|_{L^\infty([0,T];L^2)}|v-\tilde{v}|_{L^\infty([0,T];H^{5/2})}\big). \end{aligned} \]
If $(u,v) \in B_R$ and $(\tilde{u},\tilde{v}) \in B_R$, one then has
\[ |\mathcal{C}_1(u,v) - \mathcal{C}_1(\tilde{u},\tilde{v})|_{L^\infty([0,T];L^2)} \le 2C_1TR\,|(u,v) - (\tilde{u},\tilde{v})|_{X_T}, \tag{81} \]
and one can show in the same way that
\[ |\mathcal{C}_2(u,v) - \mathcal{C}_2(\tilde{u},\tilde{v})|_{L^\infty([0,T];H^{5/2})} \le 2C_2TR\,|(u,v) - (\tilde{u},\tilde{v})|_{X_T}, \tag{82} \]
and the lemma is thus proved if we take $T_2 = 1/(4C_2R)$.

Thanks to those two lemmas, the proof of the following proposition is straightforward.

Proposition 10
For all $(u_0,v_0) \in L^2 \times H^{5/2}$, there exists a unique maximal solution $(u,v) \in C([0,T_{\max}); L^2 \times H^{5/2})$ to $(T_{\epsilon,\mu})$ such that $(u,v)(t=0) = (u_0,v_0)$. Moreover, if $T_{\max} < \infty$, then
\[ |u|_{L^2}(t) + |v|_{H^{5/2}}(t) \longrightarrow \infty \quad\text{when } t \longrightarrow T_{\max}. \]
Once the next proposition is shown, the proof of Theorem 5(i) will be complete.

Proposition 11
One has $T_{\max} = +\infty$ (where $T_{\max}$ is defined in Proposition 10), and for all $t \in \mathbb{R}$, one has
\[ \int_{\mathbb{R}^2}|u|^2(t) = \int_{\mathbb{R}^2}|u_0|^2. \]

Proof
Let $(u,v)$ be as given by Proposition 10. We have
\[ i\partial_t u + \partial_1^2 u = u\,\partial_1 v, \tag{83} \]
and $u \in C([0,T_{\max}), L^2) \cap C^1([0,T_{\max}), H^{-2})$. Let $\rho_\alpha(y_1,y_2)$ be a regularizing sequence defined on $\mathbb{R}^2_y$. We then take the convolution product of $\rho_\alpha$ and (83). The $L^2$ scalar product of each term of the equation thus obtained with $\rho_\alpha * u$ is well defined. Taking the imaginary part yields
\[ \frac{1}{2}\partial_t\int|\rho_\alpha * u|^2 = \operatorname{Im}\int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}. \]
Integrating this equality with respect to the time variable $t$ then yields
\[ |\rho_\alpha * u|_2^2(t) - |\rho_\alpha * u_0|_2^2 = 2\int_0^t \operatorname{Im}\int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}. \]
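But since $u \in C([0,T_{\max}), L^2)$, we have $\rho_\alpha * u(t) \to u(t)$ for all $t$ when $\alpha \to 0$. Moreover, one has $u\,\partial_1 v \in C([0,T_{\max}), L^2)$, so that $\rho_\alpha * (u\,\partial_1 v)(t) \to u\,\partial_1 v(t)$ for all $t$. We have therefore
\[ \int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}\,(t) \longrightarrow \int|u|^2\partial_1 v(t) \]
when $\alpha \to 0$, and thus, since $v$ is real valued,
\[ \operatorname{Im}\int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}\,(t) \longrightarrow 0. \]
We now prove a domination property. One has
\[ \Big|\operatorname{Im}\int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}\,(t)\Big| \le |\rho_\alpha * (u\,\partial_1 v)|_{L^2}\,|\rho_\alpha * u|_{L^2} \le |u\,\partial_1 v|_{L^2}\,|u|_{L^2} \le R^3, \]
with $R$ such that $(u,v)$ is in the ball $B_R$ of $X_T$. Thanks to Lebesgue's dominated convergence theorem, we have therefore
\[ \int_0^t \operatorname{Im}\int\big(\rho_\alpha * (u\,\partial_1 v)\big)\,\overline{\rho_\alpha * u}\,ds \longrightarrow 0 \]
when $\alpha \to 0$, and we have thus proved that $|u|_{L^2}(t) = |u_0|_{L^2}$ for all $t$. Moreover, inequality (80) applied to $\mathcal{C}_2(u,v) = v$ yields, for all $T < T_{\max}$,
\[ |v|_{L^\infty([0,T];H^{5/2})} \le |v_0|_{H^{5/2}} + C_2T\,|u|_{L^\infty([0,T];L^2)}^2 = |v_0|_{H^{5/2}} + C_2T\,|u_0|_{L^2}^2. \]
Therefore, if $T_{\max} < \infty$, we have
\[ |v|_{L^\infty([0,T_{\max});H^{5/2})} \le |v_0|_{H^{5/2}} + C_2T_{\max}|u_0|_{L^2}^2 \qquad\text{and}\qquad |u|_{L^\infty([0,T_{\max});L^2)} = |u_0|_{L^2}, \]
which is in contradiction with the blow-up condition of Proposition 10. We have therefore $T_{\max} = +\infty$, and the proposition is thus proved.

We now prove Theorem 5(ii), which concerns the regularity of the solutions. Let $(u_0,v_0)$ be in $H^2 \times H^5$. Solving the Cauchy problem in $H^2 \times H^5$ locally in time does not raise any difficulty, and we omit the proof. It remains to show that the result is valid globally in time. Thanks to Theorem 5(i), we know that we can find a continuous function $C(t)$ such that $|v|_{H^{5/2}}(t) \le C(t)$ for all $t$. From (77) we deduce
\[ |u|_{H^1}(t) \le |u_0|_{H^1} + \int_0^t |u\,\partial_1 v|_{H^1}(s)\,ds. \tag{84} \]
But one has $\partial(u\,\partial_1 v) = \partial u\,\partial_1 v + u\,\partial\partial_1 v$ and
\[ |\partial u\,\partial_1 v|_{L^2} \le |\partial u|_{L^2}\,|\partial_1 v|_{L^\infty} \le |u|_{H^1}\,C(t); \tag{85} \]
we also have
\[ |u\,\partial\partial_1 v|_{L^2} \le |u|_{L^4}\,|\partial\partial_1 v|_{L^4} \le \mathrm{Cst}\,|u|_{H^{1/2}}\,|\partial\partial_1 v|_{H^{1/2}} \le \mathrm{Cst}\,|u|_{H^{1/2}}\,|v|_{H^{5/2}} \le \mathrm{Cst}\,C(t)\,|u|_{H^1}. \tag{86} \]
Thanks to (84)–(86), we have
\[ |u|_{H^1}(t) \le |u_0|_{H^1} + \mathrm{Cst}\int_0^t C(s)\,|u|_{H^1}(s)\,ds, \]
so that Gronwall's lemma yields the existence of a continuous function $D(t)$ such that $|u|_{H^1}(t) \le D(t)$. From (77) we also deduce
\[ |u|_{H^2}(t) \le |u_0|_{H^2} + \int_0^t |u\,\partial_1 v|_{H^2}(s)\,ds. \tag{87} \]
But one has $\partial^2(u\,\partial_1 v) = \partial^2 u\,\partial_1 v + 2\,\partial u\,\partial\partial_1 v + u\,\partial^2\partial_1 v$ and
\[ |\partial^2 u\,\partial_1 v|_{L^2} \le |\partial^2 u|_{L^2}\,|\partial_1 v|_{L^\infty} \le \mathrm{Cst}\,C(t)\,|u|_{H^2}; \tag{88} \]
we also have
\[ |\partial u\,\partial_1\partial v|_{L^2} \le |\partial u|_{L^4}\,|\partial_1\partial v|_{L^4} \le \mathrm{Cst}\,|\partial u|_{H^{1/2}}\,|\partial_1\partial v|_{H^{1/2}} \le \mathrm{Cst}\,|\partial u|_{H^{1/2}}\,|v|_{H^{5/2}} \le \mathrm{Cst}\,C(t)\,|u|_{H^2} \tag{89} \]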
and
\[ |u\,\partial^2\partial_1 v|_{L^2} \le |u|_{L^2}\,|\partial^2\partial_1 v|_{L^\infty} \le \mathrm{Cst}\,|u|_{L^2}\,|\partial^2\partial_1 v|_{H^2} \le \mathrm{Cst}\,|u_0|_{L^2}\,|v|_{H^5}. \tag{90} \]
Thanks to (87)–(90), we have
\[ |u|_{H^2}(t) \le |u_0|_{H^2} + \mathrm{Cst}\int_0^t \big(C(s)\,|u|_{H^2}(s) + |v(s)|_{H^5}\big)\,ds. \tag{91} \]
From (78) we deduce
\[ |v|_{H^5} \le |v_0|_{H^5} + \int_0^t \big||u|^2\big|_{H^1}(s)\,ds, \tag{92} \]
and we have $\partial|u|^2 = 2\operatorname{Re}(\bar{u}\,\partial u)$ and
\[ |u\,\partial u|_{L^2} \le |u|_{L^4}\,|\partial u|_{L^4} \le \mathrm{Cst}\,|u|_{H^{1/2}}\,|\partial u|_{H^{1/2}} \le \mathrm{Cst}\,|u|_{L^2}^{1/2}\,|u|_{H^1}^{1/2}\,|u|_{H^{3/2}}, \]
so that $|u\,\partial u|_{L^2} \le \mathrm{Cst}\,\sqrt{D(t)}\,|u|_{H^2}$. From (92) we then deduce
\[ |v|_{H^5} \le |v_0|_{H^5} + \mathrm{Cst}\int_0^t \sqrt{D(s)}\,\big(|u|_{H^2}(s) + 1\big)\,ds. \tag{93} \]
Equations (91) and (93), together with Gronwall's lemma, yield that $|v|_{H^5} + |u|_{H^2} \le E(t)$, where $E(t)$ is a continuous function. It is now easy to conclude the proof of the theorem.

Remark 6
(i) Since $|v|_{H^5}$ and $|u|_{H^2}$ control $|v|_{W^{1,\infty}}$ and $|u|_\infty$, we can easily obtain results for more regular solutions. One has, for instance, a solution in $H^3 \times H^6$.
(ii) In the above proof, we found two constants $C_1(T)$ and $C_2(T)$ such that
\[ |(u,v)|_{L^\infty([0,T];H^2\times H^5)} \le C_1(T) \qquad\text{and}\qquad |(u,v)|_{L^\infty([0,T];L^2\times H^{5/2})} \le C_2(T). \]
These constants $C_1(T)$ and $C_2(T)$ depend on $T$, $\epsilon$, $u_0$, and $v_0$ but not on $\mu$.

We now prove the following theorem, which deals with the continuity of the solutions given by Theorem 5 with respect to the parameter $\mu$.

Theorem 6
(i) We take here $\mu = 0$. If $(u_0,v_0) \in L^2 \times H^{5/2}$, then there exists a unique solution $(u,v) \in C(\mathbb{R}; L^2 \times H^{5/2})$ to the integral equations (75)–(76) such that $(u,v)(t=0) = (u_0,v_0)$. Moreover, if $(u_0,v_0) \in H^2 \times H^5$, then we also have $(u,v) \in C(\mathbb{R}; H^2 \times H^5)$.
(ii) Let $(u_0,v_0) \in L^2 \times H^{5/2}$ (resp., $H^2 \times H^5$ or $H^3 \times H^6$), and let $(u^\mu,v^\mu)$ be the solution of (75)–(76) such that $(u,v)(t=0) = (u_0,v_0)$, with $\mu \ge 0$. Then the mapping
\[ \mathbb{R}_+ \longrightarrow C\big(\mathbb{R}; L^2 \times H^{5/2}\big)\ \ (\text{resp., } H^2 \times H^5 \text{ or } H^3 \times H^6), \qquad \mu \longmapsto (u^\mu,v^\mu) \]
is continuous.

Proof
(i) The proof made for Theorem 5 remains valid. The only difference is that we cannot use the partial differential equation satisfied by $v$ because of the operator $\partial_1^{-1}$, but we do not need it.
(ii) We consider here the case $L^2 \times H^{5/2}$. We write the integral equations (75)–(76) for $\mu$ and $\mu_0 \ge 0$:
\[ \begin{cases} u^\mu = S_1(t)u_0 - i\displaystyle\int_0^t S_1(t-s)\,u^\mu\partial_1 v^\mu(s)\,ds, \\[1ex] v^\mu = S_2^\mu(t)v_0 - \displaystyle\int_0^t S_2^\mu(t-s)\big(1+\epsilon\Delta^2\big)^{-1}|u^\mu|^2(s)\,ds \end{cases} \]
and
\[ \begin{cases} u^{\mu_0} = S_1(t)u_0 - i\displaystyle\int_0^t S_1(t-s)\,u^{\mu_0}\partial_1 v^{\mu_0}(s)\,ds, \\[1ex] v^{\mu_0} = S_2^{\mu_0}(t)v_0 - \displaystyle\int_0^t S_2^{\mu_0}(t-s)\big(1+\epsilon\Delta^2\big)^{-1}|u^{\mu_0}|^2(s)\,ds. \end{cases} \]
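Subtracting those two systems yields on the one hand
\[ \begin{aligned} |u^\mu - u^{\mu_0}|_{L^2} &\le \int_0^t |u^\mu\partial_1 v^\mu - u^{\mu_0}\partial_1 v^{\mu_0}|_{L^2}(s)\,ds \\ &\le \int_0^t \big(|u^\mu|_{L^2}\,|\partial_1 v^\mu - \partial_1 v^{\mu_0}|_{L^\infty} + |\partial_1 v^{\mu_0}|_{L^\infty}\,|u^\mu - u^{\mu_0}|_{L^2}\big)(s)\,ds. \end{aligned} \]
We have seen that $|u^\mu|_{L^2} = |u_0|_{L^2}$ and $|v^{\mu_0}|_{H^{5/2}}(t) \le C(t)$, where $C(t)$ is a continuous function of $t$ which does not depend on $\mu$. We have therefore
\[ |u^\mu - u^{\mu_0}|_{L^2} \le \mathrm{Cst}\int_0^t \big(|v^\mu - v^{\mu_0}|_{H^{5/2}}(s) + C(s)\,|u^\mu - u^{\mu_0}|_{L^2}(s)\big)\,ds. \tag{94} \]
One has on the other hand
\[ \begin{aligned} |v^\mu - v^{\mu_0}|_{H^{5/2}} \le{}& \big|\big(S_2^\mu(t) - S_2^{\mu_0}(t)\big)v_0\big|_{H^{5/2}} + \int_0^t \big|\big(S_2^\mu(t-s) - S_2^{\mu_0}(t-s)\big)\big(1+\epsilon\Delta^2\big)^{-1}|u^{\mu_0}|^2(s)\big|_{H^{5/2}}\,ds \\ &+ \int_0^t \big|\big(1+\epsilon\Delta^2\big)^{-1}\big(|u^{\mu_0}|^2 - |u^\mu|^2\big)\big|_{H^{5/2}}\,ds \\ \le{}& \big|\big(S_2^\mu(t) - S_2^{\mu_0}(t)\big)v_0\big|_{H^{5/2}} + \int_0^t \big|\big(S_2^\mu(t-s) - S_2^{\mu_0}(t-s)\big)|u^{\mu_0}|^2(s)\big|_{H^{-3/2}}\,ds \\ &+ \int_0^t \big||u^{\mu_0}|^2 - |u^\mu|^2\big|_{H^{-3/2}}(s)\,ds. \end{aligned} \]
But we know that
\[ \big||u^{\mu_0}|^2 - |u^\mu|^2\big|_{H^{-3/2}} \le \mathrm{Cst}\,\big||u^{\mu_0}|^2 - |u^\mu|^2\big|_{L^1} \le \mathrm{Cst}\,\big(|u^{\mu_0}|_{L^2} + |u^\mu|_{L^2}\big)\,|u^{\mu_0} - u^\mu|_{L^2} \le \mathrm{Cst}\,|u^{\mu_0} - u^\mu|_{L^2}, \]
so that
\[ |v^\mu - v^{\mu_0}|_{H^{5/2}} \le \big|\big(S_2^\mu(t) - S_2^{\mu_0}(t)\big)v_0\big|_{H^{5/2}} + \mathrm{Cst}\int_0^t |u^{\mu_0} - u^\mu|_{L^2} + \int_0^t \big|\big(S_2^\mu(t-s) - S_2^{\mu_0}(t-s)\big)|u^{\mu_0}|^2\big|_{H^{-3/2}}\,ds. \tag{95} \]
We introduce
\[ m_1(t) := \int_0^t \big|\big(S_2^\mu(t-s) - S_2^{\mu_0}(t-s)\big)|u^{\mu_0}|^2(s)\big|_{H^{-3/2}}\,ds, \]
and we want to prove that $m_1(t) \to 0$ when $\mu \to \mu_0$. Denoting by $iP_\mu$ the symbol of $\partial_\mu$, we have
\[ \mathcal{F}\Big(\big(S_2^\mu(t-s) - S_2^{\mu_0}(t-s)\big)|u^{\mu_0}|^2(s)\Big)(\xi_1,\xi_2) = \Big(e^{-iP_\mu(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2(t-s)} - e^{-iP_{\mu_0}(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2(t-s)}\Big)\,\mathcal{F}\big(|u^{\mu_0}|^2\big)(\xi), \]
and it is clear that the right-hand side tends toward zero for almost every $\xi_1$, $\xi_2$, and $t$ when $\mu \to \mu_0$. Since the integrand that appears in the definition of $m_1(t)$ is dominated by $\big||u^{\mu_0}|^2\big|_{H^{-3/2}}(s) \in L^1_{\mathrm{loc}}(\mathbb{R})$, we can conclude, thanks to Lebesgue's dominated convergence theorem, that
\[ m_1(t) \longrightarrow 0 \quad\text{in } L^\infty_{\mathrm{loc}}(\mathbb{R}) \quad\text{when } \mu \longrightarrow \mu_0. \]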
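We now introduce
\[ m_2(t) := \big|\big(S_2^\mu(t) - S_2^{\mu_0}(t)\big)v_0\big|_{H^{5/2}}, \]
and we want to prove that it also tends towards zero when $\mu \to \mu_0$. We have
\[ \begin{aligned} \mathcal{F}\Big(\big(S_2^\mu(t) - S_2^{\mu_0}(t)\big)v_0\Big) &= \Big(e^{-iP_\mu(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t} - e^{-iP_{\mu_0}(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t}\Big)\,\widehat{v}_0(\xi) \\ &= \Big(e^{-iP_\mu(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t} - e^{-iP_{\mu_0}(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t}\Big)\,\mathbf{1}_{\{|\xi|\le\alpha\}}\mathbf{1}_{\{|\xi_1|\ge\beta\}}\,\widehat{v}_0(\xi) \\ &\quad + (\cdots)\big(1 - \mathbf{1}_{\{|\xi|\le\alpha\}}\mathbf{1}_{\{|\xi_1|\ge\beta\}}\big)\,\widehat{v}_0(\xi) \\ &= f_\mu(t,\xi) + g_\mu(t,\xi). \end{aligned} \]
Let $\gamma > 0$, and choose $\alpha > 0$ sufficiently big and $\beta > 0$ sufficiently small to have
\[ \big|\big(1+|\xi|^2\big)^{5/4}g_\mu(t,\xi)\big|_{L^2} \le \gamma. \tag{96} \]
With the same $\alpha$ and $\beta$, one has
\[ \big|\big(1+|\xi|^2\big)^{5/4}f_\mu(t,\xi)\big|_{L^2}^2 = \int |\widehat{v}_0(\xi)|^2\big(1+|\xi|^2\big)^{5/2}\,\Big|e^{-iP_{\mu_0}(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t} - e^{-iP_\mu(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t}\Big|^2\,\mathbf{1}_{\{|\xi|\le\alpha\}}\mathbf{1}_{\{|\xi_1|\ge\beta\}}\,d\xi. \]
For $|\xi| \le \alpha$ and $|\xi_1| \ge \beta$, one has, for $t \le T$,
\[ \Big|e^{-iP_{\mu_0}(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t} - e^{-iP_\mu(\xi_1)(1+\epsilon|\xi|^4)^{-1}\xi_2^2 t}\Big|^2 \le C(\alpha,\beta)\,T^2\,|\mu - \mu_0|^2, \]
so that it is easy to see that
\[ \big|\big(1+|\xi|^2\big)^{5/4}f_\mu(t,\xi)\big|_{L^2}^2 \le \gamma^2 \]
if $\mu$ and $\mu_0$ are close enough. Together with (96), this yields
\[ m_2(t) \longrightarrow 0 \quad\text{in } L^\infty_{\mathrm{loc}}(\mathbb{R}) \quad\text{when } \mu \longrightarrow \mu_0. \]
Equation (95) thus is written
\[ |v^\mu - v^{\mu_0}|_{H^{5/2}}(t) \le m_1(t) + m_2(t) + \mathrm{Cst}\int_0^t |u^\mu - u^{\mu_0}|_{L^2}(s)\,ds. \]
Using (94) and Gronwall's lemma with $m_1(t) + m_2(t) \to 0$ as $\mu \to \mu_0$ in $L^\infty_{\mathrm{loc}}(\mathbb{R})$ yields
\[ |u^\mu - u^{\mu_0}|_{L^2} + |v^\mu - v^{\mu_0}|_{H^{5/2}} \longrightarrow 0 \quad\text{in } L^\infty_{\mathrm{loc}}(\mathbb{R}) \quad\text{as } \mu \longrightarrow \mu_0, \]
and the proof is thus complete.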
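6.2.2. Energy estimates

We first prove a few energy estimates linked to the regularized problem $(T_{\epsilon,\mu})$, for $\mu > 0$. These estimates are very similar to those obtained by Ph. Laurençot [La] for the 1-dimensional problem.

Theorem 7
Let $(u_0,v_0) \in H^3 \times H^6$. Then the solution $(u,v) \in C(\mathbb{R}; H^3 \times H^6)$ given by Theorem 5 for $\mu > 0$ satisfies
(i)
\[ \int_{\mathbb{R}^2}|u|^2(t) = \int_{\mathbb{R}^2}|u_0|^2; \]
(ii)
\[ \int_{\mathbb{R}^2}\Big(|\partial_1 u|^2 + |u|^2\partial_1 v + \frac{1}{2}\big|\partial_2\big(\partial_\mu^{-1}\partial_1\big)^{1/2}v\big|^2\Big) = \int_{\mathbb{R}^2}\Big(|\partial_1 u_0|^2 + |u_0|^2\partial_1 v_0 + \frac{1}{2}\big|\partial_2\big(\partial_\mu^{-1}\partial_1\big)^{1/2}v_0\big|^2\Big); \]
(iii)
\[ \int_{\mathbb{R}^2}\Big(\big|\big(1+\epsilon\Delta^2\big)^{1/2}\partial_1 v\big|^2 + 2i\,u\,\partial_1\bar{u}\Big) = \int_{\mathbb{R}^2}\Big(\big|\big(1+\epsilon\Delta^2\big)^{1/2}\partial_1 v_0\big|^2 + 2i\,u_0\,\partial_1\bar{u}_0\Big); \]
(iv)
\[ \int_{\mathbb{R}^2}\big|\big(1+\epsilon\Delta^2\big)^{1/2}v\big|^2 = \int_{\mathbb{R}^2}\big|\big(1+\epsilon\Delta^2\big)^{1/2}v_0\big|^2 - 2\int_0^t\int_{\mathbb{R}^2}|u|^2v(s)\,ds. \]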
Proof (i) Taking the imaginary part of the L2 product of the first equation of (T>,µ ) with u yields ∂t
|u|2 = 0,
and the result follows. (ii) Taking the real part of the L2 product of the first equation of (T>,µ ) with ∂t u yields 1 − ∂t |∂1 u|2 = ∂t uu∂1 v 2 1 = ∂t |u|2 ∂1 v 2 1 1 = ∂t |u|2 ∂1 v − |u|2 ∂t ∂1 v, 2 2
407
LONG-WAVE SHORT-WAVE RESONANCE
and therefore ∂t
|∂1 u| + ∂t 2
|u| ∂1 v = 2
|u|2 ∂t ∂1 v.
(97)
The second equation of (T>,µ ) may be written under the form
so that
(
−1 −1 ∂t v + ∂µ−1 ∂22 1 + >?2 v = − 1 + >?2 |u|2 ,
(98)
∂1 (98)|u|2 reads −1 ∂1 ∂t v|u|2 + ∂µ−1 ∂1 ∂22 1 + >?2 v|u|2 = 0,
(99)
since (1 + >?2 )−1 ∂1 is antiadjoint. ( We now compute (98)∂22 ∂µ−1 ∂1 v and find 1/2 2 −1 1 v = − 1 + >?2 |u|2 ∂22 ∂µ−1 ∂1 v. − ∂t ∂2 ∂µ−1 ∂1 2
(100)
Since (1 + >?2 )−1 is self-adjoint, (99)–(100) yield 1/2 2 1 2 v , ∂1 ∂t v|u| = − ∂t ∂2 ∂µ−1 ∂1 2 so that plugging this equation in (97) yields 1/2 2 1 ∂t |∂1 u|2 + |u|2 ∂1 v + ∂2 ∂µ−1 ∂1 v = 0, 2 R2 and the result follows. (iii) Taking the L2 product of the second equation of (T>,µ ) with ∂12 v yields 1/2 2 1 − ∂t 1 + >?2 ∂1 v = − |u|2 ∂12 v = u∂1 u∂1 v + u∂1 u∂1 v. 2 One then takes the expressions on u∂1 v and u∂1 v given by the first equation of (T>,µ ) and plugs them into the above equation, and thus one obtains 1/2 2 1 ∂1 v = ∂1 u − i∂t u + ∂12 u + ∂1 u i∂t u + ∂12 u − ∂t 1 + >?2 2 = i ∂1 u∂t u − ∂1 u∂t u + 0 = i ∂1 u∂t u + u∂t ∂1 u = i∂t u∂1 u,
COLIN and LANNES
so that
\[ \partial_t\Big( \int \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v\big|^2 + 2i\int u\,\partial_1\bar{u} \Big) = 0, \]
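and the result follows.
(iv) Taking the $L^2$ product of the second equation of $(T_{\varepsilon,\mu})$ with $v$ reads
\[ \partial_t\int \big|(1+\varepsilon\Delta^2)^{1/2} v\big|^2 = -2\int |u|^2 v, \]
which yields the result.

The following corollary gives energy estimates associated to the solutions of $(T_{\varepsilon,0})$.

corollary 1
We take here $\mu = 0$. The solution $(u, v)$ given in this case by Theorem 6 satisfies
(i) $\int_{\mathbb{R}^2} |u|^2(t) = \int_{\mathbb{R}^2} |u_0|^2$;
(ii) $\int_{\mathbb{R}^2} |\partial_1 u|^2 + |u|^2\partial_1 v + \frac{1}{2}|\partial_2 v|^2 = \int_{\mathbb{R}^2} |\partial_1 u_0|^2 + |u_0|^2\partial_1 v_0 + \frac{1}{2}|\partial_2 v_0|^2$;
(iii) $\int_{\mathbb{R}^2} \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v\big|^2 + 2i\,u\,\partial_1\bar{u} = \int_{\mathbb{R}^2} \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v_0\big|^2 + 2i\,u_0\,\partial_1\bar{u}_0$;
(iv) $\int_{\mathbb{R}^2} \big|(1+\varepsilon\Delta^2)^{1/2} v\big|^2 = \int_{\mathbb{R}^2} \big|(1+\varepsilon\Delta^2)^{1/2} v_0\big|^2 - 2\int_0^t\!\int_{\mathbb{R}^2} |u|^2 v(s)\,ds$.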
Proof
This corollary is a consequence of Theorem 7 and of the continuity of the flow with respect to the parameter $\mu$.

Remark 7
The results of the corollary cannot be obtained directly without treating the case $\mu > 0$. Indeed, the estimates cannot be done directly on $(T_{\varepsilon,0})$ since $\partial_1^{-1}\partial_2^2 v$ and $\partial_t v$ are not distributions.

6.2.3. Finding bounds independent of $\varepsilon$
Useful inequalities
We first give two useful inequalities that we use throughout this section.
lemma 10
If $u$ and $\partial_1 u$ are in $L^2(\mathbb{R}^2)$, then
\[ \int \sup_{y_1\in\mathbb{R}} |u|^2(y_2)\,dy_2 \le 2|u|_2\,|\partial_1 u|_2. \]
Proof
Since for any function $f \in H^1(\mathbb{R})$ one has $|f|_\infty \le \sqrt{2}\,|f|_2^{1/2}|f'|_2^{1/2}$, we can write
\[ \sup_{y_1\in\mathbb{R}} |u|^2(y_2) \le 2|u|_{2,y_1}(y_2)\,|\partial_1 u|_{2,y_1}(y_2). \]
Integrating this inequality with respect to $y_2$ and using the Cauchy-Schwarz inequality then yields the result.

lemma 11
If $v$ and $\partial_2 v$ are in $L^2(\mathbb{R}^2)$, then
\[ \sup_{y_2}\Big(\int v^2(y_1,y_2)\,dy_1\Big)^{1/2} \le \sqrt{2}\,|v|_2^{1/2}|\partial_2 v|_2^{1/2}. \]
Proof
Let $\psi$ be defined as
\[ \psi : y_2 \longmapsto \Big(\int_{\mathbb{R}} v^2(y_1,y_2)\,dy_1\Big)^{1/2}. \]
One has $\psi \in L^2(\mathbb{R})$ and $|\psi|_2 = |v|_2$. Moreover, we have
\[ \psi'(y_2) = \frac{\int v\,\partial_2 v\,dy_1}{\big(\int v^2(y_1,y_2)\,dy_1\big)^{1/2}}, \]
so that $|\psi'(y)| \le \big(\int |\partial_2 v|^2\,dy_1\big)^{1/2}$ by Cauchy-Schwarz. One has therefore $\psi \in H^1(\mathbb{R})$ and $|\psi'|_2 \le |\partial_2 v|_2$, and thus
\[ |\psi|_{\infty,y_2} \le \sqrt{2}\,|v|_2^{1/2}|\partial_2 v|_2^{1/2}. \]

Local bounds in time, for small initial data
The following theorem gives useful bounds independent of $\varepsilon$.

theorem 8
We take here $\mu = 0$, and we let $T > 0$. There exists $\varepsilon_0 > 0$, and there exist $\lambda > 0$ and $C > 0$ independent of $\varepsilon$, such that
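if $(u_0, v_0) \in H^3 \times H^6$ is such that
\[ |u_0|_2^2 + |\partial_1 u_0|_2^2 + |v_0|_{H^1}^2 \le \lambda, \]
then the solution $(u, v)$ of $(T_{\varepsilon,0})$ such that $(u, v)(t = 0) = (u_0, v_0)$, given by Theorem 6, satisfies
\[ |u|_{L^\infty([0,T];L^2)} + |\partial_1 u|_{L^\infty([0,T];L^2)} + |v|_{L^\infty([0,T];H^1)} \le C. \]
Proof
We first introduce the quantity $N_1$ defined as
\[ N_1 := \int |\partial_1 u_0|^2 + |u_0|^2\partial_1 v_0 + \frac{1}{2}|\partial_2 v_0|^2. \]
Thanks to Corollary 1(ii), one has
\[ |\partial_1 u|_2^2 + \frac{1}{2}|\partial_2 v|_2^2 \le N_1 + \Big|\int |u|^2\partial_1 v\Big| \le N_1 + 2\int |u|\,|\partial_1 u|\,|v|, \tag{101} \]
but one also has
\begin{align*}
2\int |u|\,|\partial_1 u|\,|v|
 &= 2\iint |u|\,|\partial_1 u|\,|v|\,dy_1\,dy_2 \\
 &\le 2\int |u|_{\infty,y_1}(y_2)\int |\partial_1 u|\,|v|\,dy_1\,dy_2 \\
 &\le 2\int |u|_{\infty,y_1}(y_2)\Big(\int |\partial_1 u|^2\,dy_1\Big)^{1/2}\Big(\int |v|^2\,dy_1\Big)^{1/2}dy_2 \\
 &\le 2\sup_{y_2}\Big(\int |v|^2\,dy_1\Big)^{1/2}\int |u|_{\infty,y_1}\Big(\int |\partial_1 u|^2\,dy_1\Big)^{1/2}dy_2 \\
 &\le 2\sup_{y_2}\Big(\int |v|^2\,dy_1\Big)^{1/2}\Big(\int |u|_{\infty,y_1}^2\,dy_2\Big)^{1/2}|\partial_1 u|_2 \\
 &\le 4|v|_2^{1/2}|\partial_2 v|_2^{1/2}|u|_2^{1/2}|\partial_1 u|_2^{3/2}, \tag{102}
\end{align*}
the last inequality being a consequence of Lemmas 10 and 11. Thanks to (101), we have therefore
\[ |\partial_1 u|_2^2 + \frac{1}{2}|\partial_2 v|_2^2 \le N_1 + 4|v|_2^{1/2}|\partial_2 v|_2^{1/2}|u_0|_2^{1/2}|\partial_1 u|_2^{3/2}, \tag{103} \]
since for all $t$, $|u|_2(t) = |u_0|_2$. It is also a consequence of (102) that
\[ N_1 \le |\partial_1 u_0|_2^2 + \frac{1}{2}|\partial_2 v_0|_2^2 + 4|v_0|_2^{1/2}|\partial_2 v_0|_2^{1/2}|u_0|_2^{1/2}|\partial_1 u_0|_2^{3/2}. \tag{104} \]
Thanks to Corollary 1(iv), one also has
\[ \big|(1+\varepsilon\Delta^2)^{1/2} v\big|_2^2 = \big|(1+\varepsilon\Delta^2)^{1/2} v_0\big|_2^2 - 2\int_0^t\!\int |u|^2 v, \]
where one can write
\[ \int |u|^2 v = \iint u\,\bar{u}\,v\,dy_1\,dy_2, \]
so that
\begin{align*}
\Big|\int |u|^2 v\Big|
 &\le \int |u|_{\infty,y_1}\int |u|\,|v|\,dy_1\,dy_2 \\
 &\le \int |u|_{\infty,y_1}\Big(\int |u|^2\,dy_1\Big)^{1/2}\Big(\int |v|^2\,dy_1\Big)^{1/2}dy_2 \\
 &\le \sup_{y_2}\Big(\int |v|^2\,dy_1\Big)^{1/2}\int |u|_{\infty,y_1}\Big(\int |u|^2\,dy_1\Big)^{1/2}dy_2 \\
 &\le \sup_{y_2}\Big(\int |v|^2\,dy_1\Big)^{1/2}\Big(\int |u|_{\infty,y_1}^2\,dy_2\Big)^{1/2}|u|_2 \\
 &\le 2|v|_2^{1/2}|\partial_2 v|_2^{1/2}|u_0|_2^{3/2}|\partial_1 u|_2^{1/2},
\end{align*}
where the last inequality is a consequence of Lemmas 10 and 11 and of the conservation of the $L^2$ norm of $u$. We have therefore
\[ \big|(1+\varepsilon\Delta^2)^{1/2} v\big|_2^2 \le \big|(1+\varepsilon\Delta^2)^{1/2} v_0\big|_2^2 + 4\int_0^t |v|_2^{1/2}|\partial_2 v|_2^{1/2}|u_0|_2^{3/2}|\partial_1 u|_2^{1/2}, \tag{105} \]
so that, for $T > 0$ and $t \le T$, we have
\[ \big|(1+\varepsilon\Delta^2)^{1/2} v\big|_2^2 \le \big|(1+\varepsilon\Delta^2)^{1/2} v_0\big|_2^2 + 4T\,|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{3/2}|\partial_1 u|^{1/2}, \]
where the norm $|\cdot|_{L^\infty([0,T];L^2)}$ is denoted by $|\cdot|$. Since $|v|_2 \le |(1+\varepsilon\Delta^2)^{1/2} v|_2$, and since for $\varepsilon$ small enough we have $|(1+\varepsilon\Delta^2)^{1/2} v_0|_2^2 \le 2|v_0|_2^2$, we have
\[ |v|^2 \le 2|v_0|_2^2 + 4T\,|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{3/2}|\partial_1 u|^{1/2}. \tag{106} \]
Taking the sup in time in (103) and summing with (106) then yields
\[ |\partial_1 u|^2 + \frac{1}{2}|\partial_2 v|^2 + |v|^2 \le N_1 + 2|v_0|_2^2 + 4|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{1/2}|\partial_1 u|^{3/2} + 4T\,|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{3/2}|\partial_1 u|^{1/2}. \tag{107} \]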
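We now use the Young inequality $abcd \le (1/4)(a^4 + b^4 + c^4 + d^4)$ with $a = 4T|u_0|_2^{3/2}$, $b = |v|^{1/2}$, $c = |\partial_2 v|^{1/2}$, and $d = |\partial_1 u|^{1/2}$ to obtain
\[ \frac{3}{4}|\partial_1 u|^2 + \frac{1}{4}|\partial_2 v|^2 + \frac{3}{4}|v|^2 \le N_1 + 2|v_0|_2^2 + 4|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{1/2}|\partial_1 u|^{3/2} + \frac{1}{4}\big(4T|u_0|_2^{3/2}\big)^4. \tag{108} \]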
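The bookkeeping behind (108) is short enough to write out. The sketch below only reuses the quantities named in the text; the coefficients on the left-hand side are what remains after the three quadratic terms produced by the Young inequality are absorbed.

```latex
% Sketch of the absorption step from (107) to (108).
% With a = 4T|u_0|_2^{3/2}, b = |v|^{1/2}, c = |\partial_2 v|^{1/2}, d = |\partial_1 u|^{1/2}:
\begin{align*}
abcd &= 4T\,|v|^{1/2}|\partial_2 v|^{1/2}|u_0|_2^{3/2}|\partial_1 u|^{1/2}
        \quad\text{(the last term of (107))},\\
abcd &\le \tfrac14\Big( \big(4T|u_0|_2^{3/2}\big)^4 + |v|^2 + |\partial_2 v|^2 + |\partial_1 u|^2 \Big),
        \qquad \big(4T|u_0|_2^{3/2}\big)^4 = 256\,T^4|u_0|_2^6 .
\end{align*}
% Moving the three quadratic terms to the left-hand side of (107) gives the coefficients
%   1 - 1/4 = 3/4  for |\partial_1 u|^2,
%   1/2 - 1/4 = 1/4  for |\partial_2 v|^2,
%   1 - 1/4 = 3/4  for |v|^2,
% and leaves the constant 64 T^4 |u_0|_2^6 on the right-hand side.
```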
We now use another Young inequality, $abcd \le (1/8)(a^8 + b^8 + 3c^{8/3} + 3d^{8/3})$ with $a = |v|^{1/2}$, $b = |\partial_2 v|^{1/2}$, $c = |u_0|_2^{1/2}$, and $d = |\partial_1 u|^{3/2}$ to obtain
\[ |\partial_1 u|^2 + |\partial_2 v|^2 + |v|^2 \le \mathrm{Cst}\,\big(N_1 + |v_0|_2^2 + T^4|u_0|_2^6\big) + \mathrm{Cst}\,\big(|u_0|_2^{4/3} + |v|^4 + |\partial_2 v|^4 + |\partial_1 u|^4\big). \]
Introducing $f := |v|^2 + |\partial_1 u|^2 + |\partial_2 v|^2$, we obtain from the above equation
\[ f \le \mathrm{Cst}\,\big(N_1 + T^4|u_0|_2^6 + |v_0|_2^2 + |u_0|_2^{4/3}\big) + \mathrm{Cst}\,f^2, \]
which is of the form $\alpha X^2 - X + \beta \ge 0$. We want to choose $\alpha$ and $\beta$ such that the trinomial $\alpha X^2 - X + \beta$ has two distinct real roots. We want therefore $1 - 4\alpha\beta > 0$, which reads
\[ \mathrm{Cst}\,\big(N_1 + T^4|u_0|_2^6 + |v_0|_2^2 + |u_0|_2^{4/3}\big) < \frac{1}{4}. \tag{109} \]
For $\lambda > 0$ small enough, it is a consequence of (104) that if
\[ |u_0|_2^2 + |\partial_1 u_0|_2^2 + |v_0|_{H^1}^2 \le \lambda, \]
then condition (109) is satisfied, and we denote by $X_0 < X_1$ the two roots. Since for all $t$ such that $0 \le t \le T$ one has $\alpha f(t)^2 - f(t) + \beta \ge 0$, one has either $f(t) < X_0$ or $f(t) > X_1$ for all $t \le T$. We are in the first case if $f(0) < X_0$ and in the second otherwise. In order to have an upper bound for $f(t)$, we therefore want to have $f(0) < X_0$, which is the case if $2\alpha f(0) - 1 < 0$, that is, if
\[ \mathrm{Cst}\,\big(|v_0|_2^2 + |\partial_1 u_0|_2^2 + |\partial_2 v_0|_2^2\big) < 1, \]
which is satisfied if the $\lambda$ defined above is small enough. One then has for all $t \le T$,
\[ \big(|v|^2 + |\partial_1 u|^2 + |\partial_2 v|^2\big)(t) \le X_0 = \frac{1 - \sqrt{1 - \alpha\beta}}{2} \le \frac{1}{2}. \tag{110} \]
We now want a bound for $|\partial_1 v|_2$; one has
\[ |\partial_1 v|_2^2 \le \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v\big|_2^2 \le \Big|\int \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v\big|^2 + 2i\,u\,\partial_1\bar{u}\Big| + 2\Big|\int u\,\partial_1\bar{u}\Big|, \]
and using Corollary 1(iii) yields
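\[ |\partial_1 v|_2^2 \le \Big|\int \big|(1+\varepsilon\Delta^2)^{1/2}\partial_1 v_0\big|^2 + 2i\,u_0\,\partial_1\bar{u}_0\Big| + 2|u|_{L^2}|\partial_1 u|_{L^2}. \]
For $\varepsilon$ small enough, one has therefore
\[ |\partial_1 v|_2^2 \le 2|\partial_1 v_0|_{L^2}^2 + 2\Big|\int u_0\,\partial_1\bar{u}_0\Big| + 2|u|_{L^2}|\partial_1 u|_{L^2}. \]
Since $f(t) \le 1/2$ for all $t \le T$, we can conclude that $|\partial_1 v|_2 \le \mathrm{Cst}$. This inequality, together with (110), proves the theorem.

6.2.4. Conclusion
Throughout this section, we denote by $(u^\varepsilon, v^\varepsilon)$ the solution to $(T_{\varepsilon,0})$ given by Theorem 6. Thanks to Theorem 8, we can consider a subsequence, still denoted by $(u^\varepsilon, v^\varepsilon)$, such that
\begin{align*}
u^\varepsilon &\rightharpoonup u \quad\text{in } L^\infty([0,T];L^2) \text{ weak } *,\\
\partial_1 u^\varepsilon &\rightharpoonup \partial_1 u \quad\text{in } L^\infty([0,T];L^2) \text{ weak } *,\\
v^\varepsilon &\rightharpoonup v \quad\text{in } L^\infty([0,T];H^1) \text{ weak } *.
\end{align*}
We want to prove that $(u, v)$ solves (T). We first give a compactness result for $v^\varepsilon$.

lemma 12
If $|u_0|^2 \in H^1$, then one has $v^\varepsilon \to v$ strongly in $L^\infty([0,T];L^2_{\mathrm{loc}})$.
Proof
Multiplying the first equation of $(T_{\varepsilon,0})$ by $\bar{u}^\varepsilon$ and taking the imaginary part yields
\[ \frac{1}{2}\partial_t|u^\varepsilon|^2 + 2\,\mathrm{Im}\big(\partial_1^2 u^\varepsilon\,\bar{u}^\varepsilon\big) = 0, \]
and therefore
\[ |u^\varepsilon|^2 = |u_0|^2 - 4\partial_1\int_0^t \mathrm{Im}\big(\partial_1 u^\varepsilon\,\bar{u}^\varepsilon\big)(s)\,ds, \]
since $\partial_1 u^\varepsilon\,\partial_1\bar{u}^\varepsilon$ is real. Introduce now
\[ U^\varepsilon := \int_0^t \mathrm{Im}\big(\partial_1 u^\varepsilon\,\bar{u}^\varepsilon\big)(s)\,ds, \]
so that
\[ v^\varepsilon = e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}t}v_0 - \int_0^t e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}(t-s)}(1+\varepsilon\Delta^2)^{-1}\big(|u_0|^2 - 4\partial_1 U^\varepsilon\big)\,ds. \]
We also introduce
\[ V^\varepsilon := e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}t}v_0 - \int_0^t e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}(t-s)}(1+\varepsilon\Delta^2)^{-1}|u_0|^2\,ds \]
and
\[ W^\varepsilon := 4\int_0^t e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}(t-s)}(1+\varepsilon\Delta^2)^{-1}\partial_1 U^\varepsilon(s)\,ds, \]
so that $v^\varepsilon = V^\varepsilon + W^\varepsilon$. As soon as $v_0 \in H^1$ and $|u_0|^2 \in H^1$, we have $V^\varepsilon$ bounded in $L^\infty([0,T];H^1)$ and $V^\varepsilon \to V$ in $L^\infty([0,T];H^1)$ when $\varepsilon \to 0$, where
\[ V := e^{-\partial_1^{-1}\partial_2^2 t}v_0 - \int_0^t e^{-\partial_1^{-1}\partial_2^2(t-s)}|u_0|^2\,ds. \]
Since $v^\varepsilon$ and $V^\varepsilon$ are bounded in $L^\infty([0,T];H^1)$, then so is $W^\varepsilon = v^\varepsilon - V^\varepsilon$. Moreover, one has
\[ \partial_t W^\varepsilon = 4(1+\varepsilon\Delta^2)^{-1}\partial_1 U^\varepsilon(t) - 4\int_0^t e^{-\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}(t-s)}(1+\varepsilon\Delta^2)^{-2}\partial_2^2 U^\varepsilon\,ds. \]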
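But the sequence $U^\varepsilon$, as defined above, is bounded in $L^\infty([0,T];L^1)$ and therefore in $L^\infty([0,T];H^{-3/2})$, so that $\partial_t W^\varepsilon$ is bounded in $L^\infty([0,T];H^{-7/2})$. It follows that $W^\varepsilon$ is strongly compact in $L^\infty([0,T];L^2_{\mathrm{loc}})$, and the lemma is thus proved.

The following lemma says that $(u, v)$ solves the first equation of (T).

lemma 13
The functions $u$ and $v$ solve $i\partial_t u + \partial_1^2 u = u\,\partial_1 v$.
Proof
We know that
\[ i\partial_t u^\varepsilon + \partial_1^2 u^\varepsilon = u^\varepsilon\,\partial_1 v^\varepsilon, \]
which is equivalent to
\[ i\partial_t u^\varepsilon + \partial_1^2 u^\varepsilon = \partial_1(u^\varepsilon v^\varepsilon) - \partial_1 u^\varepsilon\,v^\varepsilon. \]
But since $v^\varepsilon \to v$ strongly in $L^\infty([0,T];L^2_{\mathrm{loc}})$ and $u^\varepsilon$ and $\partial_1 u^\varepsilon$ converge weakly in $L^\infty([0,T];L^2)$, we can take the limit in the above equation; that is,
\[ i\partial_t u + \partial_1^2 u = \partial_1(uv) - \partial_1 u\,v, \]
which yields the result of the lemma.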
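The limit in the product terms of this proof is the usual pairing of a weakly-$*$ convergent factor with a strongly convergent one. A sketch of that step, written for the term $u^\varepsilon v^\varepsilon$ and tested against an arbitrary compactly supported smooth function $\varphi$ (the test function is an assumption of this illustration, not notation from the text):

```latex
% Sketch: weak-* times strong convergence, tested against \varphi \in C_c^\infty((0,T)\times\mathbb{R}^2).
\begin{align*}
\int_0^T\!\!\int \big(u^\varepsilon v^\varepsilon - uv\big)\varphi
 &= \int_0^T\!\!\int u^\varepsilon\,(v^\varepsilon - v)\,\varphi
  + \int_0^T\!\!\int (u^\varepsilon - u)\,v\,\varphi ,\\
\Big|\int_0^T\!\!\int u^\varepsilon\,(v^\varepsilon - v)\,\varphi\Big|
 &\le |u^\varepsilon|_{L^\infty([0,T];L^2)}\;
      \big|(v^\varepsilon - v)\varphi\big|_{L^1([0,T];L^2)} \longrightarrow 0 ,
\end{align*}
% the first factor being bounded by Theorem 8 and the second tending to zero by the
% strong L^2_loc convergence of Lemma 12 (\varphi has compact support).
% The second integral tends to zero because v\varphi \in L^1([0,T];L^2) and
% u^\varepsilon \rightharpoonup u weak-* in L^\infty([0,T];L^2).
% The term \partial_1 u^\varepsilon\, v^\varepsilon is handled in the same way.
```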
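In order to prove a strong compactness result for $u^\varepsilon$, we need the following lemma.

lemma 14
(i) One has $u\,\partial_1 v \in L^\infty([0,T];L^1_{y_2}(L^2_{y_1}))$.
(ii) Let $u_0 \in L^1_{y_2}(L^2_{y_1})$ and $f \in L^\infty([0,T];L^1_{y_2}(L^2_{y_1}))$. Then the solution $w$ of
\[ i\partial_t w + \partial_1^2 w = f, \qquad w(0, y) = u_0(y), \]
is in $C([0,T];L^1_{y_2}(L^2_{y_1}))$.
Proof
(i) One has
\[ \int |u\,\partial_1 v|^2\,dy_1 \le |u|_{\infty,y_1}^2\int |\partial_1 v|^2\,dy_1, \]
and thus
\[ \Big(\int |u\,\partial_1 v|^2\,dy_1\Big)^{1/2} \le |u|_{\infty,y_1}\Big(\int |\partial_1 v|^2\,dy_1\Big)^{1/2}, \]
so that
\[ \int\Big(\int |u\,\partial_1 v|^2\,dy_1\Big)^{1/2}dy_2 \le \Big(\int |u|_{\infty,y_1}^2\,dy_2\Big)^{1/2}|\partial_1 v|_2 \le \sqrt{2}\,|u|_2^{1/2}|\partial_1 u|_2^{1/2}|\partial_1 v|_2, \]
thanks to Lemma 10, and the proof is thus complete.
(ii) The function $w$ is written
\[ w = e^{i\partial_1^2 t}u_0 - i\int_0^t e^{i\partial_1^2(t-s)}f(s)\,ds, \]
but since
\[ \int \big|e^{i\partial_1^2 t}u_0\big|^2\,dy_1 = \int |u_0|^2\,dy_1, \]
the function $t \mapsto e^{i\partial_1^2 t}u_0$ is in $C([0,T];L^1_{y_2}(L^2_{y_1}))$. The proof does not differ for the component of $w$ concerning the second member $f$.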
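The isometry used in part (ii) is the Plancherel identity in the variable $y_1$ alone. The sketch below assumes the Fourier convention $\hat{g}(\xi_1) = \int e^{-i y_1\xi_1} g(y_1)\,dy_1$, which is a choice made for this illustration; under it $\partial_1^2$ becomes multiplication by $-\xi_1^2$.

```latex
% Sketch, for almost every fixed y_2; \hat{\ } denotes the Fourier transform in y_1 only
% (convention assumed for this illustration).
\begin{align*}
\widehat{e^{i\partial_1^2 t}u_0}(\xi_1, y_2) &= e^{-i\xi_1^2 t}\,\hat{u}_0(\xi_1, y_2),
 \qquad \big|e^{-i\xi_1^2 t}\big| = 1,\\
\int \big|e^{i\partial_1^2 t}u_0\big|^2\,dy_1
 &= \frac{1}{2\pi}\int \big|e^{-i\xi_1^2 t}\hat{u}_0\big|^2\,d\xi_1
  = \frac{1}{2\pi}\int |\hat{u}_0|^2\,d\xi_1
  = \int |u_0|^2\,dy_1 .
\end{align*}
% Taking square roots and integrating in y_2 shows that the group preserves the
% L^1_{y_2}(L^2_{y_1}) norm; applying Minkowski's inequality to the Duhamel term gives
%   |w(t)|_{L^1_{y_2}(L^2_{y_1})}
%     \le |u_0|_{L^1_{y_2}(L^2_{y_1})} + \int_0^t |f(s)|_{L^1_{y_2}(L^2_{y_1})}\,ds .
```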
We can now state a compactness result for $u^\varepsilon$.

proposition 12
Let $u_0 \in L^1_{y_2}(L^2_{y_1})$. Then $u^\varepsilon \to u$ strongly in $L^2([0,T] \times \mathbb{R}^2)$.
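Proof
Thanks to Lemma 14(i) and (ii), we know that the weak limit $u$ of $u^\varepsilon$ is in $C([0,T];L^1_{y_2}(L^2_{y_1}))$. We now introduce a regularizing sequence $\rho_\alpha(y_1)$ of $\mathbb{R}_{y_1}$, and we consider
\[ \partial_t\int_{\mathbb{R}} |\rho_\alpha * u|^2\,dy_1 = 2\,\mathrm{Re}\int \overline{\rho_\alpha * u}\;\rho_\alpha * \partial_t u\,dy_1. \]
We know, thanks to Lemma 13, that $\partial_t u - i\partial_1^2 u = -i\,u\,\partial_1 v$, so that
\begin{align*}
\partial_t\int_{\mathbb{R}} |\rho_\alpha * u|^2\,dy_1
 &= 2\,\mathrm{Re}\int \overline{\rho_\alpha * u}\,\big(i\partial_1^2\rho_\alpha * u - i\rho_\alpha * (u\,\partial_1 v)\big)\,dy_1 \\
 &= 2\,\mathrm{Im}\int \rho_\alpha * (u\,\partial_1 v)\;\overline{\rho_\alpha * u}\,dy_1.
\end{align*}
But for almost every $y_2$ and $t$ we have $u\,\partial_1 v \in L^2_{y_1}$ (because $u \in H^1_{y_1} \subset L^\infty_{y_1}$). We therefore have $\rho_\alpha * (u\,\partial_1 v) \to u\,\partial_1 v$ in $L^2_{y_1}$ when $\alpha \to 0$. Moreover, for almost all $y_2$, $u(\cdot, y_2) \in L^2_{y_1}$, and therefore $\rho_\alpha * u \to u$ in $L^2_{y_1}$. We have therefore
\[ g_\alpha(y_2, t) := 2\,\mathrm{Im}\int \rho_\alpha * (u\,\partial_1 v)\;\overline{\rho_\alpha * u}\,dy_1 \longrightarrow 0 \]
almost everywhere in $y_2$ and $t$. But we also have
\[ g_\alpha(y_2, t) = -2\,\mathrm{Im}\int \rho_\alpha * (\partial_1 u\,v)\;\overline{\rho_\alpha * u}\,dy_1 - 2\,\mathrm{Im}\int \rho_\alpha * (uv)\;\overline{\rho_\alpha * \partial_1 u}\,dy_1, \]
so that
\[ |g_\alpha(y_2, t)| \le 4|\partial_1 u|_{2,y_1}|v|_{2,y_1}|u|_{\infty,y_1} := g(y_2, t). \]
We have
\begin{align*}
\int_0^T\!\!\int_{\mathbb{R}} g(y_2, t)\,dy_2\,dt
 &\le 4\int_0^T |\partial_1 u|_2\Big(\int |u|_{\infty,y_1}^2\,dy_2\Big)^{1/2}\sup_{y_2}|v|_{2,y_1}\,dt \\
 &\le 8\int_0^T |\partial_1 u|_2\,|u|_2^{1/2}|\partial_1 u|_2^{1/2}|v|_2^{1/2}|\partial_2 v|_2^{1/2}\,dt,
\end{align*}
thanks to Lemmas 10 and 11, and thus, by Theorem 8,
\[ \int_0^T\!\!\int_{\mathbb{R}} g(y_2, t)\,dy_2\,dt \le \mathrm{Cst}\,T. \]
We have therefore a domination condition on $g_\alpha$. Since we have also seen that $g_\alpha \to 0$ almost everywhere in $y_2$ and $t$, we can conclude, thanks to Lebesgue's dominated convergence theorem, that $g_\alpha \to 0$ in $L^1([0,T] \times \mathbb{R})$. We have therefore
\[ \partial_t\int_{\mathbb{R}} |u|^2\,dy_1 = 0, \]
and therefore
\[ \int_{\mathbb{R}} |u|^2\,dy_1 = \mathrm{Cst}. \]
We now prove that this constant is equal to $\int_{\mathbb{R}} |u_0|^2\,dy_1$. As we have $u \in C([0,T];L^1_{y_2}(L^2_{y_1}))$, we have
\[ \int\Big(\int |u - u_0|^2\,dy_1\Big)^{1/2}dy_2 \longrightarrow 0 \quad\text{as } t \longrightarrow 0, \]
and therefore
\[ \int |u - u_0|^2\,dy_1 \longrightarrow 0 \]
when $t \to 0$ almost everywhere in $y_2$. Hence, we have $\int_{\mathbb{R}} |u|^2\,dy_1 \to \int_{\mathbb{R}} |u_0|^2\,dy_1$ almost everywhere in $y_2$. The constant $\int_{\mathbb{R}} |u|^2\,dy_1$ is therefore equal to $\int_{\mathbb{R}} |u_0|^2\,dy_1$. Integrating this relation with respect to $y_2$ yields $|u|_2 = |u_0|_2$. We recall that we also have $|u^\varepsilon|_2 = |u_0|_2$, so that $u^\varepsilon$ converges weakly towards $u$, and it converges also in $L^2$ norm. We can therefore conclude that $u^\varepsilon \to u$ strongly in $L^2([0,T] \times \mathbb{R}^2)$, and the proposition is thus proved.

Remark 8
Thanks to the compactness properties of $(u^\varepsilon, v^\varepsilon)$, given by Lemma 12 and Proposition 12, Theorem 8 remains valid with initial values $(u_0, v_0) \in L^2 \times H^1$ instead of $H^3 \times H^6$. One just has to consider regularizations of these initial values and then take the limit.

Thanks to Proposition 12, we can now take the limit in the expression which gives $v^\varepsilon$,
\[ v^\varepsilon = e^{\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}t}v_0 - \int_0^t e^{\partial_1^{-1}\partial_2^2(1+\varepsilon\Delta^2)^{-1}(t-s)}(1+\varepsilon\Delta^2)^{-1}|u^\varepsilon|^2(s)\,ds, \]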
and we state the following theorem.

theorem 9
Let $(u_0, v_0)$ be two functions such that
• $u_0$ and $\partial_1 u_0$ are in $L^2$, $|u_0|^2 \in H^1$, and $u_0 \in L^1_{y_2}(L^2_{y_1})$;
• $v_0 \in H^1$.
Let $T > 0$. If $|u_0|_2 + |\partial_1 u_0|_2 + |v_0|_{H^1}$ is small enough, then there exists $(u, v)$ such that
\[ i\partial_t u + \partial_1^2 u = u\,\partial_1 v, \qquad v = e^{\partial_1^{-1}\partial_2^2 t}v_0 - \int_0^t e^{\partial_1^{-1}\partial_2^2(t-s)}|u|^2(s)\,ds \]
and
\[ u \in C([0,T];L^2), \quad \partial_1 u \in L^\infty([0,T];L^2), \quad u(0, y_1, y_2) = u_0(y_1, y_2), \quad v \in L^\infty([0,T];H^1) \cap C([0,T];L^2_{\mathrm{loc}}). \]
Recall that the integral equation for $v$ used in this result makes sense since the group $e^{\partial_1^{-1}\partial_2^2 t}$ acts on every Sobolev space $H^s$, and, for $u \in L^\infty(0,T;L^2)$, $|u|^2$ lies in $L^\infty(0,T;H^s)$ for some negative $s$.

Acknowledgments. The authors want to thank J.-L. Joly and G. Métivier for fruitful discussions about this work.
References
[BMN] A. BABIN, A. MAHALOV, and B. NICOLAENKO, Global splitting, integrability and regularity of 3D Euler and Navier-Stokes equations for uniformly rotating fluids, European J. Mech. B Fluids 15 (1996), 291–300.
[C] T. COLIN, Rigorous derivation of the nonlinear Schrödinger equation and Davey-Stewartson systems from quadratic hyperbolic systems, preprint, Université Bordeaux I, 1999.
[D] P. DONNAT, Equations hyperboliques semi-linéaires dispersives, cours à l'Ecole Polytechnique, 1995.
[DJMR] P. DONNAT, J.-L. JOLY, G. MÉTIVIER, and J. RAUCH, "Diffractive nonlinear geometric optics" in Séminaire sur les équations aux dérivées partielles (1995–1996), Sémin. Équ. Dériv. Partielles, École Polytech., Palaiseau, 1996, exp. no. 17.
[Ga] I. GALLAGHER, Applications of Schochet's methods to parabolic equations, J. Math. Pures Appl. (9) 77 (1998), 989–1054.
[Gr] E. GRENIER, Oscillatory perturbations of the Navier-Stokes equations, J. Math. Pures Appl. (9) 76 (1997), 477–498.
[IMS] P. ISAZA, J. MEJIA, and V. STALLBOHM, Local solution for the Kadomtsev-Petviashvili equation in $\mathbb{R}^2$, J. Math. Anal. Appl. 196 (1995), 566–587.
[JMR1] J.-L. JOLY, G. MÉTIVIER, and J. RAUCH, Diffractive nonlinear geometric optics with rectification, Indiana Univ. Math. J. 47 (1998), 1167–1241.
[JMR2] J.-L. JOLY, G. MÉTIVIER, and J. RAUCH, Transparent nonlinear geometric optics and Maxwell-Bloch equations, J. Differential Equations 166 (2000), 175–250.
[K] T. KATO, Perturbation Theory for Linear Operators, Grundlehren Math. Wiss. 132, Springer, New York, 1966.
[L1] D. LANNES, Dispersive effects for nonlinear geometrical optics with rectification, Asymptot. Anal. 18 (1998), 111–146.
[L2] D. LANNES, Quelques phénomènes d'interaction d'ondes en optique non linéaire, thèse, Université Bordeaux I, 1999.
[La] PH. LAURENÇOT, On a nonlinear Schrödinger equation arising in the theory of water waves, Nonlinear Anal. 24 (1995), 509–527.
[OT] T. OZAWA and Y. TSUTSUMI, Existence and smoothing effect of solutions for the Zakharov equations, Publ. Res. Inst. Math. Sci. 28 (1992), 329–361.
[S] S. SCHOCHET, Fast singular limits of hyperbolic PDEs, J. Differential Equations 114 (1994), 476–512.
[Su] C. SULEM and P.-L. SULEM, The Nonlinear Schrödinger Equation: Self-Focusing and Wave Collapse, Appl. Math. Sci. 139, Springer, New York, 1999.
Mathématiques Appliquées de Bordeaux, Université Bordeaux 1 et Centre National de la Recherche Scientifique Unité Mixte de Recherche 5466, 351 Cours de la Libération, 33405 Talence Cedex, France; [email protected], [email protected]
COMPLETE PROPERLY EMBEDDED MINIMAL SURFACES IN $\mathbb{R}^3$
TOBIAS H. COLDING and WILLIAM P. MINICOZZI II

Abstract
In this short paper, we apply estimates and ideas from [CM4] to study the ends of a properly embedded complete minimal surface $\Sigma^2 \subset \mathbb{R}^3$ with finite topology. The main result is that any complete properly embedded minimal annulus that lies above a sufficiently narrow downward sloping cone must have finite total curvature.

DUKE MATHEMATICAL JOURNAL © 2001 Vol. 107, No. 2. Received 23 February 1999. Revision received 26 June 2000.
2000 Mathematics Subject Classification. Primary 53C21, 53C43.
Colding's work partially supported by National Science Foundation grant number DMS-9803253 and an Alfred P. Sloan Research Fellowship.
Minicozzi's work supported by National Science Foundation grant number DMS-9803144 and an Alfred P. Sloan Research Fellowship.

In this short paper, we apply estimates and ideas from [CM4] to study the ends of a properly embedded complete minimal surface $\Sigma^2 \subset \mathbb{R}^3$ with finite topology. For such surfaces, each end has a representative $E$ that is a properly embedded minimal annulus. If $E$ has finite total curvature, it is asymptotic to either a plane or half of a catenoid. On the other hand, the helicoid provides the only known example of an end with finite topology and infinite total curvature. Clearly, no representative for the end of the helicoid can be disjoint from an end of a plane or catenoid. The main result of this paper is that any complete properly embedded minimal annulus that lies above a sufficiently narrow downward sloping cone must have finite total curvature. This is closely related to a result of P. Collin [Co] described below.

In [HoMe], D. Hoffman and W. Meeks proved that at most two ends of $\Sigma$ can have infinite total curvature. Further, they conjectured that if $\Sigma$ as above has at least two ends, then it must have finite total curvature (the so-called finite total curvature conjecture; see [Me]). If $\Sigma$ has at least two ends, then there is either an end of a plane or a catenoid disjoint from $\Sigma$ (see [HoMe, Lemma 5]). Therefore, to prove the finite total curvature conjecture, it suffices to show that a properly embedded minimal annular end $E$ that lies above the bottom half of a catenoid has finite total curvature. In this direction, Meeks and H. Rosenberg [MeR] showed that if $\Sigma$ has at least two ends, then $\Sigma$ is conformally equivalent to a compact Riemann surface with finitely many points removed. Using this, they showed that an annular end $E$ arising in the finite total curvature conjecture lies in a half-space. In fact, they showed that $E$ is either asymptotically planar (with finite total curvature) or satisfies the hypotheses of the so-called generalized Nitsche conjecture.

conjecture (Generalized Nitsche conjecture (see [MeR]); Collin's theorem [Co])
For $t \ge 0$, let $P_t = \{x_3 = t\} \subset \mathbb{R}^3$. If $E \subset \{x_3 \ge 0\}$ is a properly embedded minimal annulus, $\partial E \subset P_0$, and $E \cap P_t$ is a simple closed curve for all $t > 0$, then $E$ has finite total curvature.

Collin proved this conjecture in [Co], thereby showing, using [MeR], that, for properly embedded complete minimal surfaces with at least two ends, finite topology is equivalent to finite total curvature. An example of a properly immersed minimal cylinder in $\mathbb{R}^3$ with infinite total curvature was constructed by Rosenberg and E. Toubiana [RT]; this example shows that embeddedness is a necessary hypothesis in both the finite total curvature and generalized Nitsche conjectures.

Let $x_1, x_2, x_3$ be the standard coordinates on $\mathbb{R}^3$. Given $\varepsilon \in \mathbb{R}$, $C_\varepsilon \subset \mathbb{R}^3$ denotes the cone
\[ \Big\{ x_3 > \varepsilon\sqrt{x_1^2 + x_2^2} \Big\}. \tag{1} \]
$C_0$ is a half-space and $C_\varepsilon$ is convex if and only if $\varepsilon \ge 0$. Given $a = (a_1, a_2, a_3) \in \mathbb{R}^3$, $\mathrm{Cat}(a)$ is the vertical catenoid centered at $a$ given by
\[ \mathrm{Cat}(a) = \big\{ \cosh^2(x_3 - a_3) = (x_1 - a_1)^2 + (x_2 - a_2)^2 \big\}. \tag{2} \]
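The definition (2) can be unpacked in terms of the horizontal distance to the axis, which makes the logarithmic growth of the catenoid explicit. In the sketch below, $r$ is notation introduced only for this illustration, and the cone is read as $C_\delta = \{x_3 > \delta\sqrt{x_1^2 + x_2^2}\}$.

```latex
% Sketch: r := \sqrt{(x_1-a_1)^2 + (x_2-a_2)^2} is notation introduced here.
\begin{align*}
x \in \mathrm{Cat}(a) &\iff r = \cosh(x_3 - a_3) \;(\ge 1),\\
|x_3 - a_3| &= \cosh^{-1}(r) = \log\!\big(r + \sqrt{r^2 - 1}\big) \le \log(2r).
\end{align*}
% Consequences:
%  * no point of Cat(a) has r < 1, so the solid cylinder {r < 1} misses Cat(a);
%  * the height |x_3 - a_3| grows like log r, hence for any \delta > 0 it is
%    eventually smaller than \delta r, and C_\delta \cap Cat(a) is bounded.
```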
Note that $\{(x_1 - a_1)^2 + (x_2 - a_2)^2 < 1\} \cap \mathrm{Cat}(a) = \emptyset$, $B_1 \cap \mathrm{Cat}(s, 0, 100) = \emptyset$ if $|s| \le 10$, and $C_\delta \cap \mathrm{Cat}(a)$ is bounded if $\delta > 0$. Our main result is the following theorem.

theorem
There exists $\varepsilon > 0$ such that any complete properly embedded minimal annular end $E \subset C_{-\varepsilon}$ has finite total curvature.

The generalized Nitsche conjecture follows directly since $\{x_3 > 0\} = C_0 \subset C_{-\varepsilon}$. In fact, since the height of the catenoid grows logarithmically, the finite total curvature conjecture follows directly from our theorem.

Before we proceed with the proof, we need to recall the following estimate, which is a special case of [CM4, Theorem 6.4]. There exists $\varepsilon_0 > 0$ such that the following holds. Let $y \in \mathbb{R}^3$, $r_0 > 0$, and
\[ \Sigma^2 \subset B_{2r_0}(y) \cap \{x_3 > x_3(y)\} \subset \mathbb{R}^3 \tag{3} \]
MINIMAL SURFACES IN EUCLIDEAN SPACE
be a compact embedded minimal disk with $\partial\Sigma \subset \partial B_{2r_0}(y)$. For any connected component $\Sigma'$ of $B_{r_0}(y) \cap \Sigma$ with $B_{\varepsilon_0 r_0}(y) \cap \Sigma' \ne \emptyset$,
\[ \sup_{\Sigma'} |A_{\Sigma'}|^2 \le r_0^{-2}. \tag{4} \]
As in [CM4], if (3) holds for $\Sigma$ and $B_{\varepsilon_0 r_0}(y) \cap \Sigma \ne \emptyset$, then we say that $\Sigma$ satisfies an $(\varepsilon_0, r_0)$-effective one-sided Reifenberg condition at $y$.

The following lemma is standard; we include a proof for completeness. (This lemma also follows directly from [CM4, Theorem 6.4], mentioned above.)

lemma
With $E$ as in the theorem above and any $\delta > 0$, there exists a sequence of points $y_j \in E \setminus C_\delta$ with $|y_j| \to \infty$.
Proof
If this fails for $\delta > 0$, we may rescale to get $\partial E \subset B_1$ and $E \setminus B_1(0) \subset C_\delta$. Since $E$ is proper and $\delta > 0$, a curve $\sigma_0 \subset E$ connects $B_1$ to $\{x_3 = 200\}$. Since $\delta > 0$, there exists $s_0 < 0$ such that $\cosh^{-1}(t) + s_0 < \delta t$ for all $t \ge 1$; hence $\mathrm{Cat}(0, 0, s_0) \cap \bar{E} = \emptyset$. By the strong maximum principle, there cannot be a first $s > s_0$ with $\mathrm{Cat}(0, 0, s) \cap \bar{E} \ne \emptyset$ (since $\partial E \subset B_1$). In particular, $E \subset \{x_1^2 + x_2^2 < 1\}$. Again by the strong maximum principle, there cannot be a first $0 < s \le 10$, so that $\mathrm{Cat}(s, 0, 100) \cap \bar{E} \ne \emptyset$. This gives the contradiction since $\mathrm{Cat}(10, 0, 100)$ separates $B_1$ from $\{x_3 = 200\}$ in $\{x_1^2 + x_2^2 < 1\}$.

In the proof of our theorem, we use intrinsic balls in addition to the usual extrinsic balls $B_s(y)$. The intrinsic ball in $E$ of radius $s$ about a point $y \in E$ is denoted as $\mathcal{B}_s(y)$.

Proof of the theorem
By elementary topology, there is a connected curve $\beta \subset C_{-\varepsilon} \setminus E$ with one endpoint in $\partial E$ and the other at the origin. Since the statement of the theorem is scale-invariant, we may suppose that
\[ \partial E \cup \beta \subset B_1 \cap C_{-\varepsilon}. \tag{5} \]
The above lemma yields a sequence of points $y_j \in E \setminus C_\varepsilon$ with $4 < |y_j| \to \infty$. By Sard's theorem, we may choose a sequence $r_j$ with
\[ |y_j| - 2 \le 6r_j \le |y_j| - 1 \tag{6} \]
such that $\partial B_{r_j}(y_j)$ and $\partial B_{2r_j}(y_j)$ intersect $E$ transversely. For each $j$, we define a function $f_j$ on $\mathbb{R}^3$ by
2 2 fj (x) = 2 x · yj − 2|yj | − |yj |x − 2yj ,
(7)
and we note that $f_j$ is superharmonic on any minimal surface. Let $\Sigma_j$ denote the connected component of $B_{2r_j}(y_j) \cap E$ containing $y_j$. If $\Sigma_j$ is not a disk, then there exists a minimal annulus in $E$ with boundary components $\eta_{j1} \subset B_1$ and $\eta_{j2} \subset \partial B_{2r_j}(y_j)$. Since $f_j$ is positive on $\eta_{j1}$ and $\eta_{j2}$, the maximum principle implies $f_j > 0$ on this annulus. This is a contradiction since $\eta_{j1}$ and $\eta_{j2}$ are in different connected components of $\{f_j > 0\}$, and hence $\Sigma_j$ is a disk. By construction, we have $x_3(y_j) \le 12\varepsilon r_j$ and
\[ B_{2r_j}(y_j) \cap \Sigma_j \subset B_{2r_j}(y_j) \cap C_{-\varepsilon} \subset \{x_3 \ge -14\varepsilon r_j\}. \tag{8} \]
We conclude that $\Sigma_j$ satisfies a $(26\varepsilon, r_j)$-effective one-sided Reifenberg condition, and hence the curvature estimate (4) applies. In particular, if $26\varepsilon \le \varepsilon_0$, then
\[ \sup_{\Sigma_{j1}} |A_{\Sigma_{j1}}|^2 \le r_j^{-2}, \tag{9} \]
where $\Sigma_{j1}$ is the connected component of $B_{r_j}(y_j) \cap \Sigma_j$ containing $y_j$. The curvature estimate (9) allows us to apply the Harnack inequality (see [ChY]) to the positive harmonic function $x_3 + 14\varepsilon r_j$ on $\mathcal{B}_{r_j}(y_j) \subset \Sigma_{j1}$. We get that
\[ \sup_{\mathcal{B}_{3r_j/4}(y_j)} \big(x_3 + 14\varepsilon r_j\big) \le C_h\,\big(x_3 + 14\varepsilon r_j\big)(y_j) \le 26\varepsilon\,C_h\,r_j, \tag{10} \]
where $C_h$ comes from the Harnack estimate. Applying the gradient estimate, we have
\[ \sup_{\mathcal{B}_{5r_j/8}(y_j)} |\nabla x_3| \le C_g\,26\varepsilon\,C_h, \tag{11} \]
where $C_g$ comes from the gradient estimate (see [ChY]). For $\varepsilon > 0$ small, (11) implies that $\mathcal{B}_{5r_j/8}(y_j)$ is a graph with small gradient over $\{x_3 = 0\}$. In particular, there is a point
\[ y_{j1} \in \partial B_{|y_j|} \cap \big\{ (x_1 - x_1(y_j))^2 + (x_2 - x_2(y_j))^2 = r_j^2/4 \big\} \cap \mathcal{B}_{5r_j/8}(y_j). \tag{12} \]
By (10), if $26\varepsilon\,C_h \le \varepsilon_0$, we may now apply the preceding argument with $y_{j1}$ in place of $y_j$. After iterating this at most $48\pi + 1$ times, we go entirely around the cylinder
\[ \partial B_{|y_j|} \cap \big\{ -\varepsilon|y_j| \le x_3 \le C_h^{48\pi+1}\,6\varepsilon|y_j| \big\}. \tag{13} \]
If $\varepsilon > 0$ is small enough, then the effective one-sided Reifenberg condition applies in this entire cylinder. Therefore, so long as $\varepsilon$ is sufficiently small, iterating this gives a curve
\[ \gamma_j \subset \partial B_{|y_j|} \cap \big\{ -\varepsilon|y_j| \le x_3 \le C_h^{48\pi+1}\,6\varepsilon|y_j| \big\} \cap E, \tag{14} \]
so that $\gamma_j$ is graphical over $\{x_3 = 0\}$. Since $\gamma_j$ is embedded, it either spirals indefinitely or is closed and linked with the $x_3$-axis. Since $E$ is proper, $\gamma_j$ is compact and hence must be closed.

Let $E_j \subset E$ be the connected component of $B_{|y_j|} \cap E$ containing $\partial E$. By the maximum principle, $E_j$ is an annulus and the other components of $B_{|y_j|} \cap E$ are disks. We show next that $\partial E_j = \partial E \cup \sigma_j$, where
\[ \sigma_j \subset \partial B_{|y_j|} \cap \big\{ -\varepsilon|y_j| \le x_3 \le C_h^{48\pi+1}\,6\varepsilon|y_j| \big\}. \tag{15} \]
This is immediate if $\gamma_j \subset E_j$ by (14). On the other hand, if $\gamma_j \not\subset E_j$, then the component of $B_{|y_j|} \cap E$ bounded by $\gamma_j$ must be a disk $F_j \subset E$. In this case, (14) and the maximum principle imply that
\[ F_j \subset \big\{ -\varepsilon|y_j| \le x_3 \le C_h^{48\pi+1}\,6\varepsilon|y_j| \big\}. \tag{16} \]
Since $\partial F_j$ is graphical and $F_j$ is a disk, (16) implies that $F_j$ separates $\partial C_{-\varepsilon}$ and $\{x_3 > C_h^{48\pi+1}(6\varepsilon|y_j|)\}$ in $B_{|y_j|}$. Since $E$ is embedded and $F_j \cap \beta = \emptyset$, we conclude that (15) holds.

As above, (15) allows us to apply the estimates of [CM4] to prove that $\sigma_j$ is graphical with uniformly bounded gradient. Therefore, the length of $\sigma_j$ is at most $2\pi(1 + C_0)|y_j|$ for a uniform constant $C_0$. By the isoperimetric inequality for doubly connected minimal surfaces (see [OsS]),
\[ \mathrm{Area}(E_j) \le \frac{C_1}{4\pi}|y_j|^2, \tag{17} \]
where $C_1 < (2\pi + 2C_0\pi + |\partial E|)^2$ is independent of $j$. By a standard argument, for instance, a slight variation of [CM4, Theorem 3.9] (to allow for a fixed compact boundary term),
\[ \int_{E_{j0}} |A|^2 \le C_2, \tag{18} \]
where $C_2 = C_2(C_1) < \infty$ and where $E_{j0}$ is the connected component of $B_{r_j} \cap E_j$ which contains $\partial E$. Since $C_2$ is independent of $j$ and since the $E_{j0}$ exhaust $E$ as $j \to \infty$, (18) implies that
\[ \int_E |A|^2 \le C_2. \tag{19} \]
This completes the proof. We note that there are various other ways to obtain (19) from (17), but the argument given seems to be the most elementary.
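The constant in the area bound (17) can be traced explicitly. The sketch below assumes the isoperimetric inequality of [OsS] in the form $4\pi\,\mathrm{Area} \le (\text{total boundary length})^2$ for doubly connected minimal surfaces, and uses only the length bound on $\sigma_j$ and the normalization $|y_j| > 4$ from the proof.

```latex
% Sketch: assumes the isoperimetric inequality in the form 4\pi\,\mathrm{Area} \le L^2,
% where L is the total length of the boundary \partial E_j = \partial E \cup \sigma_j.
\begin{align*}
L &= |\partial E| + |\sigma_j|
   \le |\partial E| + 2\pi(1 + C_0)\,|y_j|
   \le \big(2\pi + 2\pi C_0 + |\partial E|\big)\,|y_j|
   \qquad (\text{since } |y_j| > 4 > 1),\\
\mathrm{Area}(E_j) &\le \frac{L^2}{4\pi}
   \le \frac{\big(2\pi + 2\pi C_0 + |\partial E|\big)^2}{4\pi}\,|y_j|^2 .
\end{align*}
% The bound is quadratic in |y_j| with a constant independent of j, which is the
% area growth needed to bound the total curvature.
```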
References
[ChY] S. Y. CHENG and S. T. YAU, Differential equations on Riemannian manifolds and their geometric applications, Comm. Pure Appl. Math. 28 (1975), 333–354.
[CM1] T. H. COLDING and W. P. MINICOZZI II, Convergence of embedded minimal surfaces without area bounds in three-manifolds, C. R. Acad. Sci. Paris Sér. I Math. 327 (1998), 765–770.
[CM2] T. H. COLDING and W. P. MINICOZZI II, "Embedded minimal surfaces without area bounds in 3-manifolds" in Geometry and Topology (Aarhus, 1998), Contemp. Math. 258, Amer. Math. Soc., Providence, 2000, 107–120.
[CM3] T. H. COLDING and W. P. MINICOZZI II, Minimal Surfaces, Courant Lect. Notes Math. 4, New York Univ., New York, 1999.
[CM4] T. H. COLDING and W. P. MINICOZZI II, Convergence and compactness of minimal surfaces without density bounds, I: Partial regularity and convergence, preprint, 2000.
[CM5] T. H. COLDING and W. P. MINICOZZI II, Convergence and compactness of minimal surfaces without density bounds, II: Compactness and removable singularities, in preparation.
[CM6] T. H. COLDING and W. P. MINICOZZI II, Convergence and compactness of minimal surfaces without density bounds, III: Morse index bounds and applications to topology, in preparation.
[Co] P. COLLIN, Topologie et courbure des surfaces minimales proprement plongées de $\mathbb{R}^3$, Ann. of Math. (2) 145 (1997), 1–31.
[HoMe] D. HOFFMAN and W. H. MEEKS III, The asymptotic behavior of properly embedded minimal surfaces of finite topology, J. Amer. Math. Soc. 2 (1989), 667–682.
[Me] W. H. MEEKS III, "The geometry, topology, and existence of periodic minimal surfaces" in Differential Geometry: Partial Differential Equations on Manifolds (Los Angeles, 1990), Proc. Sympos. Pure Math. 54, Part 1, ed. R. Greene and S. T. Yau, Amer. Math. Soc., Providence, 1993, 333–374.
[MeR] W. H. MEEKS III and H. ROSENBERG, The geometry and conformal structure of properly embedded minimal surfaces of finite topology in $\mathbb{R}^3$, Invent. Math. 114 (1993), 625–639.
[OsS] R. OSSERMAN and M. SCHIFFER, Doubly-connected minimal surfaces, Arch. Rational Mech. Anal. 58 (1975), 285–307.
[RT] H. ROSENBERG and E. TOUBIANA, A cylindrical type complete minimal surface in a slab of $\mathbb{R}^3$, Bull. Sci. Math. (2) 111 (1987), 241–245.
Colding
Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, New York 10012, USA; [email protected]
Minicozzi
Department of Mathematics, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, USA; [email protected]
LAGRANGIAN SUBBUNDLES AND CODIMENSION 3 SUBCANONICAL SUBSCHEMES
DAVID EISENBUD, SORIN POPESCU, and CHARLES WALTER

Abstract
We show that a Gorenstein subcanonical codimension 3 subscheme $Z \subset X = \mathbb{P}^N$, $N \ge 4$, can be realized as the locus along which two Lagrangian subbundles of a twisted orthogonal bundle meet degenerately and conversely. We extend this result to singular $Z$ and all quasi-projective ambient schemes $X$ under the necessary hypothesis that $Z$ is strongly subcanonical in a sense defined below. A central point is that a pair of Lagrangian subbundles can be transformed locally into an alternating map. In the local case our structure theorem reduces to that of D. Buchsbaum and D. Eisenbud [6] and says that $Z$ is Pfaffian. We also prove codimension 1 symmetric and skew-symmetric analogues of our structure theorems.

DUKE MATHEMATICAL JOURNAL © 2001 Vol. 107, No. 3. Received 4 July 1999. Revision received 17 July 2000.
2000 Mathematics Subject Classification. Primary 14M07, 13D02; Secondary 14M12, 14F05.
Authors' work partially supported by the National Science Foundation.

0. Introduction
Smooth subvarieties of small codimension $Z \subset X = \mathbb{P}^N$ have been extensively studied in recent years, especially in relation to R. Hartshorne's conjecture that a smooth subvariety of sufficiently small codimension in $\mathbb{P}^N$ is a complete intersection. Although the conjecture remains open, any smooth subvariety $Z$ of small codimension in $\mathbb{P}^N$ is known, by a theorem of W. Barth, M. Larsen, and S. Lefschetz, to have the weaker property that it is subcanonical in the sense that its canonical class is a multiple of its hyperplane class. More generally, a subscheme $Z$ of a nonsingular Noetherian scheme $X$ is said to be subcanonical if $Z$ is Gorenstein and its canonical bundle is the restriction of a bundle on $X$. There is a natural generalization to an arbitrary (possibly singular) scheme $X$ (see below).

In this paper we give a structure theorem for subcanonical subschemes of codimension 3 in $\mathbb{P}^N$ and generalize it to subcanonical subschemes of codimension 3 in an arbitrary quasi-projective scheme $X$ satisfying a mild extra cohomological condition (strongly subcanonical subschemes). The construction works even without the quasi-projective hypothesis.

There are well-known theorems describing the local structure of Gorenstein subschemes of nonsingular Noetherian schemes in codimensions less than or equal to 3. In codimensions 1 and 2 all Gorenstein subschemes are locally complete intersections. These results have been globalized. If $X$ is nonsingular, any $Z \subset X$ of codimension 1 is the zero locus of a section of a line bundle, while a subcanonical $Z \subset X$ of codimension 2 is the zero locus of a section of a rank 2 vector bundle if a certain obstruction in cohomology vanishes (as explained below). In both cases $\mathcal{O}_Z$ has a symmetric resolution by locally free $\mathcal{O}_X$-modules.

In codimension 3 both the local and the global cases become more complicated. Locally, a Gorenstein subscheme of codimension 3 need not be a locally complete intersection. Rather, Buchsbaum and Eisenbud [6] showed that such a subscheme is cut out locally by the submaximal Pfaffians of an alternating matrix appearing in a minimal free resolution. C. Okonek [29] asked whether this local result could be generalized to show that codimension 3 subcanonical schemes are cut out by the Pfaffians of an alternating map of vector bundles. C. Walter [35] gave a positive answer to Okonek's question in $\mathbb{P}^n$ under a mild additional hypothesis but left open the question of whether this hypothesis is always satisfied. In our paper [14] we will show that not every subcanonical subscheme of codimension 3 in $\mathbb{P}^n$ is Pfaffian, settling Okonek's question negatively. In the present paper we show that a different way of looking at the Pfaffian construction does generalize and gives the desired structure theorem for all subcanonical subschemes of codimension 3. (The question as to which subschemes are Pfaffian can be answered in the derived Witt group of P. Balmer [2]; see Walter [36].)
In this paper a closed subscheme Z ⊂ X of a Noetherian scheme is called subcanonical of codimension d if it satisfies the following two conditions.
(A) The subscheme Z is relatively Cohen-Macaulay of codimension d in X; that is, $\mathcal{E}xt^i_{\mathcal{O}_X}(\mathcal{O}_Z, \mathcal{O}_X) = 0$ for all $i \neq d$.
(B) There exists a line bundle L on X such that the relative canonical sheaf $\omega_{Z/X} := \mathcal{E}xt^d_{\mathcal{O}_X}(\mathcal{O}_Z, \mathcal{O}_X)$ is isomorphic to the restriction of $L^{-1}$ to Z.
These conditions are not enough for the Serre correspondence in codimension 2 nor for our structure theorem in codimension 3. Condition (B) asserts the existence of an isomorphism
$$\eta : \mathcal{O}_Z \xrightarrow{\ \sim\ } \omega_{Z/X}(L) = \mathcal{E}xt^d_{\mathcal{O}_X}(\mathcal{O}_Z, L),$$
which one can think of as an $\eta \in H^0(\mathcal{E}xt^d_{\mathcal{O}_X}(\mathcal{O}_Z, L)) = \operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_Z, L)$. In the Yoneda Ext, this η defines a class of “resolutions of $\mathcal{O}_Z$ by coherent sheaves”:
$$0 \to L \to F_{d-1} \to \cdots \to F_1 \to F_0 \to \mathcal{O}_Z \to 0.$$
LAGRANGIAN SUBBUNDLES AND SUBCANONICAL SUBSCHEMES
In our structure theorem we require such resolutions with $F_0, \dots, F_{d-1}$ locally free. Thus we need the following condition:
(C) The $\mathcal{O}_X$-module $\mathcal{O}_Z$ is of finite local projective dimension (necessarily equal to the codimension d).
This condition holds automatically if the ambient scheme X is nonsingular. We also need $F_0 = \mathcal{O}_X$, which means that we want $\eta \in \operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_Z, L)$ to lift to $\operatorname{Ext}^{d-1}_{\mathcal{O}_X}(I_Z, L)$. Since these two groups are joined by a map in the long exact sequence obtained by applying $\operatorname{Ext}^*_{\mathcal{O}_X}(-, L)$ to the short exact sequence $0 \to I_Z \to \mathcal{O}_X \to \mathcal{O}_Z \to 0$, we see that the lifting exists if and only if $\eta \in \operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_Z, L)$ goes to zero in $\operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_X, L) \cong H^d(X, L)$. We are thus led to the following condition:
(D) The isomorphism class $\eta \in \operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_Z, L)$ of (2) goes to zero under the map
$$\operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_Z, L) \to \operatorname{Ext}^d_{\mathcal{O}_X}(\mathcal{O}_X, L) = H^d(X, L)$$
induced by the surjection $\mathcal{O}_X \to \mathcal{O}_Z$.
Condition (D) holds automatically if $H^d(X, L) = 0$. This is the case if $X = \mathbb{P}^n$ with n ≥ d + 1 or if X is an affine scheme. In addition, if the ambient scheme X is a Gorenstein variety over a field k, then condition (D) can be put into a dual form that looks more natural. For in that case Z ⊂ X is subcanonical (of dimension r) if and only if it is Cohen-Macaulay and there exists a line bundle M on X such that $\omega_Z \cong M|_Z$, and condition (D) holds if and only if the following composite map vanishes:
$$H^r(X, M) \xrightarrow{\ \mathrm{rest}\ } H^r(Z, M|_Z) \xrightarrow[\cong]{\ \eta\ } H^r(Z, \omega_Z) \xrightarrow{\ \mathrm{tr}\ } k. \tag{1}$$
In these terms we may give the following central definition of this paper.
Definition 0.1.
A subscheme Z ⊂ X is strongly subcanonical if it satisfies conditions (A)–(D).
The Serre construction shows that a subscheme of codimension 2 is the zero locus of a section of a rank 2 vector bundle if and only if it is strongly subcanonical. (P. Griffiths and J. Harris [20, Proposition 1.33], J. Vogelaar [34, Theorem 2.1], and C. Bănică and M. Putinar [3, §2.1] state variants of condition (D) explicitly.) Our main results show that a codimension 3 subscheme Z of a quasi-projective scheme X is strongly subcanonical if and only if it can be expressed as an appropriate “Lagrangian degeneracy scheme,” defined as follows. Let V be a vector bundle on X of even rank 2n equipped with a nonsingular quadratic form q with values in a line bundle L. Let E and F be a pair of Lagrangian subbundles of (V , q) (i.e., totally
isotropic subbundles of rank n). It is then well known that $\dim[E(x) \cap F(x)]$ is locally constant modulo 2. Now suppose that m is an integer such that $\dim[E(x) \cap F(x)] \equiv m \pmod 2$ for all x ∈ X. Then there is a degeneracy locus that as a set is given by
$$Z_m(E, F)_{\mathrm{red}} := \{\, x \in X \mid \dim_{k(x)}[E(x) \cap F(x)] \ge m \,\}.$$
In §2 we define a scheme structure on this set in roughly the following manner. Using the data E, F ⊂ (V, q), one defines a composite map
$$\lambda : E \to V \cong V^*(L) \to F^*(L)$$
such that $\ker(\lambda(x)) = E(x) \cap F(x)$ for all x ∈ X. Even if E ≅ F, the map λ may not be alternating, but (perhaps after modifying λ slightly to make it have even rank everywhere) we show that it is possible to find local bases in which the matrix of λ is alternating (see Proposition 2.3). Although these alternating matrices do not glue together, they are sufficiently compatible that we can define $Z_m(E, F)$ as the locus defined by their Pfaffians of order rk(E) − m + 2. This scheme structure is natural in the sense that in a suitably generic setting it is reduced, and it is stable under base change. The following structure theorem for strongly subcanonical codimension 3 subschemes collects the main results of this paper.
Theorem 0.2.
Let X be a quasi-projective scheme over a Noetherian ring, and let Z ⊂ X be a closed subscheme of grade 3. The following conditions are equivalent.
(a) Z ⊂ X is strongly subcanonical.
(b) There exists a twisted orthogonal bundle (V, q) and Lagrangian subbundles E, F ⊂ (V, q), with $\dim_{k(x)}[E(x) \cap F(x)]$ odd for all x ∈ X, such that $Z = Z_3(E, F)$.
(c) There exists a vector bundle F, a line bundle L, and a Lagrangian subbundle E of the hyperbolic bundle $F \oplus F^*(L)$ such that the composite map $\lambda : E \to F \oplus F^*(L) \twoheadrightarrow F^*(L)$ has kernel of odd rank and such that $Z = Z_3(E, F^*(L))$.
(d) Z has symmetrically quasi-isomorphic locally free resolutions
$$\begin{array}{ccccccccccccc}
0 & \to & L & \to & H & \xrightarrow{\ \psi\ } & G & \to & \mathcal{O}_X & \to & \mathcal{O}_Z & \to & 0 \\
 & & \| & & \downarrow{\scriptstyle\phi} & & \downarrow{\scriptstyle\phi^*} & & \| & & \downarrow{\scriptstyle\eta,\ \cong} & & \\
0 & \to & L & \to & G^*(L) & \xrightarrow{\ -\psi^*\ } & H^*(L) & \to & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L) & \to & 0
\end{array}$$
where L is a line bundle on X, and $\phi^*\psi : E \to E^*(L)$ is an alternating map.
The structure theorem is proved in several parts (see Theorems 3.1, 4.1, 6.1). One way to look at the structure theorem is as follows. The existence of the symmetric isomorphism
$$\eta : \mathcal{O}_Z \xrightarrow{\ \sim\ } \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L) \tag{2}$$
means that there should be a symmetric isomorphism in the derived category from the locally free resolution of $\mathcal{O}_Z$,
$$0 \to L \xrightarrow{\ q\ } E \xrightarrow{\ \psi\ } G \xrightarrow{\ p\ } \mathcal{O}_X \to \mathcal{O}_Z \to 0, \tag{3}$$
into its twisted shifted dual. In general, morphisms in the derived category are complicated objects involving homotopy classes of maps and a calculus of fractions. Nevertheless, in Theorem 6.1 we show that there exist such locally free resolutions of $\mathcal{O}_Z$ (which depend on the choice of η) for which the symmetric isomorphism in the derived category is induced by a symmetric chain map which is a quasi-isomorphism and therefore becomes an isomorphism in the derived category. Okonek’s Pfaffian subschemes correspond to situations where this quasi-isomorphism is an isomorphism.
The philosophy that (skew)-symmetric sheaves should have locally free resolutions that are (skew)-symmetric up to quasi-isomorphism is also pursued in [15] and [36]. The former deals primarily with methods for constructing explicit locally free resolutions for (skew)-symmetric sheaves on $\mathbb{P}^n$. The latter studies the obstructions (in Balmer’s derived Witt groups; see [2]) to the existence of a genuinely (skew)-symmetric resolution.
The results of this paper give a full characterization of codimension 3 subcanonical subschemes. In [14] we use this machinery to construct various geometric examples of subcanonical subschemes of codimension 3 which are not Pfaffian.
Porteous-type formulas for the fundamental classes of degeneracy loci for skew-symmetric maps $\phi : E \to E^*(L)$ were found by Harris and L. Tu [21], T. Józefiak, A. Lascoux, and P. Pragacz [23], and Pragacz [32]. Harris asked for similar formulas for degeneracy loci related to pairs of Lagrangian subbundles, and they were provided by W. Fulton [16], [17] and by Pragacz and J. Ratajski [33] (see Fulton and Pragacz [18] for more details). (A scheme structure on these degeneracy loci and their generalizations with isotropic flag conditions can be defined in a manner similar to (12).) Fulton and Pragacz (see [18, §9.4]) also ask whether one can find “natural” resolutions for the structure sheaves of these kinds of symmetric and skew-symmetric
degeneracy loci. From such a resolution one can read off formulas in $K_0(X)$. Theorem 3.1 provides an explicit answer in one simple case.
Structure of the paper
In §§1 and 2 we review basic facts about Lagrangian subbundles of twisted orthogonal bundles and define the scheme structure on the degeneracy loci $Z_m(E, F)$. In §3 we prove that Lagrangian degeneracy loci of codimension 3 are strongly subcanonical (see Theorem 3.1). In §4 we discuss “split” Lagrangian degeneracy loci (see Theorem 4.1), which are often more practical for constructing codimension 3 subcanonical subschemes. The computation of local equations for these degeneracy loci is discussed in §5. In §6 we complete the proof of the structure Theorem 0.2 by showing that strongly subcanonical subschemes of codimension 3 are split Lagrangian degeneracy loci (see Theorem 6.1). In §§7 and 8 we discuss at length various examples of codimension 3 subcanonical subschemes, particularly the case of points in $\mathbb{P}^3$. Further examples can be found in our paper [14]. Finally, in §9 we prove codimension 1 symmetric and skew-symmetric analogues of all previous results. In particular, we state G. Casnati and F. Catanese’s structural result (see [7, Remark 2.2]) and give an example of a self-linked threefold of degree 18 in $\mathbb{P}^5$ which does not have a symmetric resolution because the parity condition fails.
1. Quadratic forms on vector bundles
In this section we recall the basic definitions of twisted orthogonal bundles and Lagrangian subbundles. The definitions and results can be found in many standard references, such as Fulton and Pragacz [18, Chapter 6], M.-A. Knus [25], and S. Mukai [26, §1].
Quadratic forms
Suppose that V is a finite-dimensional vector space over a field k. (We impose no restrictions on k; it may have characteristic 2 and need not be algebraically closed.) A quadratic form on V is a homogeneous quadratic polynomial in the linear forms on V, that is, a member $q \in S^2(V^*)$.
The symmetric bilinear form b : V × V → k associated to q is given by the formula
$$b(x, y) := q(x + y) - q(x) - q(y). \tag{4}$$
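As an illustration of formula (4), the following sketch computes the bilinear form attached to the form q = x1x3 + x2x4 on k^4 (the same form that appears in the example of §2) and recovers the hyperbolic Gram matrix. The helper names `q`, `b`, and `gram` are ours, and integer coordinates stand in for a general field; the final loop checks the identity b(x, x) = 2q(x), which is the reason b alone does not determine q in characteristic 2.

```python
from itertools import product

def q(x):
    # Illustrative quadratic form q(x) = x1*x3 + x2*x4 on k^4 (integer coordinates).
    return x[0] * x[2] + x[1] * x[3]

def b(x, y):
    # Symmetric bilinear form associated to q by formula (4).
    s = tuple(xi + yi for xi, yi in zip(x, y))
    return q(s) - q(x) - q(y)

# Gram matrix of b in the standard basis e1, ..., e4.
basis = [tuple(1 if i == j else 0 for i in range(4)) for j in range(4)]
gram = [[b(e, f) for f in basis] for e in basis]
print(gram)

# b(x, x) = 2 q(x) for every x, so b(x, x) vanishes identically in characteristic 2.
diagonal_identity = all(b(x, x) == 2 * q(x) for x in product(range(-2, 3), repeat=4))
print(diagonal_identity)
```

The Gram matrix has the block shape (0 I; I 0), so b is a perfect pairing and q is nondegenerate in the sense defined next.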
The quadratic form q is nondegenerate if b is a perfect pairing. Now suppose that V is a locally free sheaf of constant finite rank over a scheme X. A quadratic form on V with values in a line bundle L is a global section q of
$S^2(V^*) \otimes L$. Such a quadratic form is nonsingular if the induced symmetric bilinear form is a perfect pairing. Equivalently, a quadratic form q on V is nonsingular if for each point x ∈ X the induced quadratic form q(x) on the fiber vector space V(x) is nondegenerate. A twisted orthogonal bundle on X is a vector bundle V equipped with a nonsingular quadratic form q with values in some line bundle L.
Lagrangian subbundles
If V is a vector space of even dimension 2n equipped with a nondegenerate quadratic form, then a Lagrangian subspace E ⊂ (V, q) is a subspace of V of dimension n such that $q|_E \equiv 0$. If the characteristic is not equal to 2, then E ⊂ (V, q) is Lagrangian if and only if $E = E^{\perp} := \{x \in V \mid b(x, y) = 0 \text{ for all } y \in E\}$. In characteristic 2 this condition is necessary but not sufficient for q to vanish on E, that is, for E to be Lagrangian. Similarly, a Lagrangian subbundle E ⊂ (V, q) of a twisted orthogonal bundle of even rank 2n is a subbundle (with locally free quotient sheaf) of rank n such that $q|_E \equiv 0$. The following result is well known (cf. N. Bourbaki [5, §6, exercice 18(d)], D. Mumford [27], Mukai [26, Proposition 1.6]).
Proposition 1.1.
If E and F are Lagrangian subbundles of a twisted orthogonal bundle over a scheme X, then the function on X given by $x \mapsto \dim_{k(x)}[E(x) \cap F(x)]$ is locally constant modulo 2.
Hyperbolic bundles
If F is any vector bundle of constant rank, and L is any line bundle, then $F \oplus F^*(L)$ may be endowed with the hyperbolic quadratic form $q_h(e \oplus \alpha) := \alpha(e)$ with values in L. (This $q_h$ is bilinear on $F \times F^*(L)$ but quadratic on $F \oplus F^*(L)$.) The associated hyperbolic symmetric bilinear form has matrix $\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$. We use the following notation for graph subbundles. If ψ : A → B and α : B → A are morphisms of vector bundles, then we write
$$\Gamma_\psi := \operatorname{im}\Bigl(A \xrightarrow{\left(\begin{smallmatrix} 1 \\ \psi \end{smallmatrix}\right)} A \oplus B\Bigr), \qquad \Gamma_\alpha := \operatorname{im}\Bigl(B \xrightarrow{\left(\begin{smallmatrix} \alpha \\ 1 \end{smallmatrix}\right)} A \oplus B\Bigr).$$
These graphs are to be regarded as subbundles of A ⊕ B.
Lemma 1.2.
A subbundle $E \subset (F \oplus F^*(L), q_h)$ is a Lagrangian subbundle complementary to the direct summand $F^*(L)$ if and only if there is an alternating map $\zeta : F \to F^*(L)$ such that $E = \Gamma_\zeta$.
Any Lagrangian subbundle of a twisted orthogonal bundle over an affine scheme has a Lagrangian complement (cf. [25, Remark I.5.5.4]), although this is not always true over a general scheme. However, if a Lagrangian subbundle F ⊂ (V, q) has a Lagrangian complement M, then the symmetric bilinear form induces a natural isomorphism $M \cong F^*(L)$. This defines an isometry
$$\phi_{F,M} : (V, q) \xrightarrow{\ \cong\ } \bigl(F \oplus F^*(L), q_h\bigr) \tag{5}$$
which is the identity on F and which identifies the complementary Lagrangian subbundles F, M ⊂ V with the two direct summands of $F \oplus F^*(L)$. Lemma 1.2 has the following corollary.
Corollary 1.3.
If E, F ⊂ (V, q) are Lagrangian subbundles with a common Lagrangian complement M, then there is an alternating map $\zeta : F \to F^*(L)$ such that $\phi_{F,M}(E) = \Gamma_\zeta$.
2. Locally alternating maps and Lagrangian degeneracy loci
In this section we show how to use Corollary 1.3 to define scheme-theoretic degeneracy loci for pairs of Lagrangian subbundles of a twisted orthogonal bundle which generalize the degeneracy loci for alternating maps defined by ideals of Pfaffians. We also show how to turn a pair of Lagrangian subbundles into a locally alternating map. Several steps are required in order to make sure that common Lagrangian complements exist locally and to show that our degeneracy loci are independent of the choice of common Lagrangian complement. Our scheme structure defined by local equations coincides with that given by a universal construction in C. De Concini and Pragacz [11].
Existence of local common Lagrangian complements
The result we need is standard if the residue field is infinite, but if the residue field is very small, care is required. We are interested in when two Lagrangian subbundles E, F of a twisted orthogonal bundle (V, q) have a common Lagrangian complement locally. Recall that an even-dimensional quadratic vector space (V(x), q(x)) has two families of Lagrangian subspaces, that two Lagrangian subspaces can have a common Lagrangian complement only if they lie in the same family, and that this is measured by the dimension of the intersection modulo 2. Hence in order for E and F to have a common Lagrangian complement, we must have $\dim_{k(x)}[E(x) \cap F(x)] \equiv \operatorname{rk}(E) \pmod 2$. We now show that locally this condition is also sufficient.
Proposition 2.1.
Let E, F ⊂ (V, q) be Lagrangian subbundles of a twisted orthogonal bundle on a scheme X. Suppose that $\dim_{k(x)}[E(x) \cap F(x)] \equiv \operatorname{rk}(E) \pmod 2$ for all x ∈ X. Then any x ∈ X has a neighborhood U over which $E|_U$ and $F|_U$ have a common Lagrangian complement $M_U$.
Proof.
It is easy to see that any common Lagrangian complement at x extends to a common Lagrangian complement in a neighborhood U. The existence at x of such a complement is standard if the residue field is infinite. The following lemma proves the existence of such a complement in general.
Lemma 2.2.
Suppose that q is a nondegenerate quadratic form on an even-dimensional vector space V and that U, U′ ⊂ (V, q) are two Lagrangian subspaces such that $\dim(U \cap U') \equiv \dim(U) \pmod 2$. Then there exists a Lagrangian subspace L ⊂ (V, q) complementary to U and to U′.
Proof.
Let K = U ∩ U′. Then $U = U^{\perp} \subset K^{\perp}$, and similarly $U' \subset K^{\perp}$. On dimensional grounds, we must indeed have $U + U' = K^{\perp}$. As a result, U/K and U′/K are complementary Lagrangian subspaces of $K^{\perp}/K$. Moreover, by hypothesis they are even-dimensional. Let $f_1, \dots, f_{2m}$ be a system of vectors in U mapping onto a basis of U/K. Since U/K and U′/K are complementary Lagrangian subspaces of $K^{\perp}/K$, the symmetric bilinear form b associated to q induces a perfect pairing between them. So there exists a system of vectors $g_1, \dots, g_{2m}$ in U′ such that $b(f_i, g_j) = \delta_{ij}$ for all i, j. Let N be the subspace spanned by the $f_i$ and $g_j$. Then $q|_N$ is nondegenerate, so there is an orthogonal direct sum decomposition $V = N \oplus N^{\perp}$ such that $q|_N$ and $q|_{N^{\perp}}$ are both nondegenerate. Moreover, $K \subset (N^{\perp}, q|_{N^{\perp}})$ is a Lagrangian subspace, for which there exists a complementary Lagrangian subspace P by our previous remarks. Let $p_1, \dots, p_r$ be a basis of P. One may now check that
$$f_1 + g_2,\ f_2 - g_1,\ f_3 + g_4,\ f_4 - g_3,\ \dots,\ f_{2m-1} + g_{2m},\ f_{2m} - g_{2m-1},\ p_1, \dots, p_r$$
form a basis for a Lagrangian subspace L ⊂ (V, q) complementary to both U and U′.
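The final check in the proof of Lemma 2.2 can be carried out by brute force in the smallest case. The sketch below is only an illustration under our own assumptions: V = k^4 with q = x1x3 + x2x4, U spanned by f1 = e1, f2 = e2, U′ spanned by g1 = e3, g2 = e4 (so K = 0, m = 1, and b(f_i, g_j) = δ_ij), and k one of the prime fields F_2, F_3, F_5. It verifies that the span of f1 + g2 and f2 − g1 is totally isotropic of dimension 2 and meets U and U′ only in zero. The function names are ours.

```python
from itertools import product

def q(x, p):
    # Hyperbolic form q = x1*x3 + x2*x4 over the prime field F_p.
    return (x[0] * x[2] + x[1] * x[3]) % p

def span(gens, p):
    # All F_p-linear combinations of the generators, as a set of tuples.
    vecs = set()
    for coeffs in product(range(p), repeat=len(gens)):
        v = tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % p for i in range(4))
        vecs.add(v)
    return vecs

def check_lemma_construction(p):
    f1, f2 = (1, 0, 0, 0), (0, 1, 0, 0)   # basis of U
    g1, g2 = (0, 0, 1, 0), (0, 0, 0, 1)   # basis of U', dual to f1, f2 under b
    zero = (0, 0, 0, 0)
    U = span([f1, f2], p)
    U_prime = span([g1, g2], p)
    # Candidate complement from the proof: span of f1 + g2 and f2 - g1.
    l1 = tuple((a + b) % p for a, b in zip(f1, g2))
    l2 = tuple((a - b) % p for a, b in zip(f2, g1))
    L = span([l1, l2], p)
    isotropic = all(q(v, p) == 0 for v in L)
    return (isotropic and len(L) == p * p
            and L & U == {zero} and L & U_prime == {zero})

results = {p: check_lemma_construction(p) for p in (2, 3, 5)}
print(results)
```

The point of pairing f1 with g2 and f2 with g1 (rather than f_i with g_i) is that q(f_i + g_i) = b(f_i, g_i) = 1, so the naive choice would not be isotropic.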
The following example shows that Lemma 2.2 does not always extend to three Lagrangian subspaces. Suppose that k = Z/2Z, that $V = k^4$, and that $q = x_1x_3 + x_2x_4$. Let U, U′, and U″ be the Lagrangian subspaces given by $x_1 = x_2 = 0$, by $x_3 = x_4 = 0$, and by $x_1 + x_4 = x_2 + x_3 = 0$, respectively. Each subspace is of dimension 2, and each pair of subspaces has intersection of dimension zero. But there is no Lagrangian subspace of V which is simultaneously complementary to U, U′, and U″.
Locally alternating maps
Suppose that f : E → (V, q) and g : F → (V, q) are Lagrangian subbundles of a twisted orthogonal bundle. Consider the composite map
$$\lambda : E \xrightarrow{\ f\ } V \xrightarrow[\cong]{\ \beta\ } V^*(L) \xrightarrow{\ g^*\ } F^*(L), \tag{6}$$
where $\beta : V \to V^*(L)$ is the isomorphism induced by the quadratic form q. In the special case of Lemma 1.2, λ is the alternating map $\zeta : F \to F^*(L)$. In the general case, the rank of λ may not be even, and thus λ may not be locally alternating. But this is the only obstruction; when the rank of λ is even, we show that λ is locally alternating, and we show how to reduce to the even rank case. We may assume that X is connected. Then λ is either everywhere of even rank or everywhere of odd rank because the kernel of $\lambda(x) : E(x) \to F^*(L)(x)$ is $E(x) \cap F(x)$, which is of constant rank modulo 2 by Proposition 1.1. If λ is everywhere of odd rank, then replace the Lagrangian subbundles E, F of V by the Lagrangian subbundles $E_1 := E \oplus \mathcal{O}_X$ and $F_1 := F \oplus L$ of the orthogonal bundle $V_1 := V \oplus \mathcal{O}_X \oplus L$. This replaces λ by
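$$\lambda_1 : E_1 = E \oplus \mathcal{O}_X \xrightarrow{\left(\begin{smallmatrix} \lambda & 0 \\ 0 & 1 \end{smallmatrix}\right)} F^*(L) \oplus \mathcal{O}_X = F_1^*(L). \tag{7}$$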
The rank of $\lambda_1$ is everywhere even, but its kernel and cokernel are the same as those of λ. Notice also that $E_1(x) \cap F_1(x) = E(x) \cap F(x)$ for all x ∈ X. Thus by replacing λ by $\lambda_1$ if necessary, we can reduce to the case where the rank is everywhere even.
Proposition 2.3.
The following are equivalent:
(a) the rank of λ is everywhere even;
(b) $\dim_{k(x)}[E(x) \cap F(x)] \equiv \operatorname{rk}(E) \pmod 2$ for all x ∈ X;
(c) λ is locally alternating; that is, there exists a cover of X by open subsets U and isomorphisms $\iota_U : F|_U \cong E|_U$ such that the compositions $\lambda|_U \circ \iota_U$ are alternating.
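The example over Z/2Z given earlier in this section, and the parity condition in (b), can be checked by enumerating all Lagrangian subspaces of (k^4, q = x1x3 + x2x4). The sketch below is a brute-force illustration with names of our own choosing; the third subspace is taken to be x1 + x4 = x2 + x3 = 0, on which q vanishes identically over Z/2Z. It lists the totally isotropic planes and looks for one meeting each of the three given subspaces only in zero.

```python
from itertools import product, combinations

def q(x):
    # q = x1*x3 + x2*x4 over the field with two elements.
    return (x[0] * x[2] + x[1] * x[3]) % 2

vectors = list(product(range(2), repeat=4))
zero = (0, 0, 0, 0)

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def lagrangian_planes():
    # Every 2-dimensional subspace of F_2^4 is {0, v, w, v+w}; keep those on which q vanishes.
    found = set()
    nonzero = [v for v in vectors if v != zero]
    for v, w in combinations(nonzero, 2):
        plane = frozenset({zero, v, w, add(v, w)})
        if all(q(x) == 0 for x in plane):
            found.add(plane)
    return found

lags = lagrangian_planes()
U0 = frozenset(v for v in vectors if v[0] == 0 and v[1] == 0)          # x1 = x2 = 0
U1 = frozenset(v for v in vectors if v[2] == 0 and v[3] == 0)          # x3 = x4 = 0
U2 = frozenset(v for v in vectors
               if (v[0] + v[3]) % 2 == 0 and (v[1] + v[2]) % 2 == 0)   # x1+x4 = x2+x3 = 0

def complementary(a, b):
    return len(a & b) == 1  # only the zero vector in common

common_to_all = [L for L in lags if all(complementary(L, U) for U in (U0, U1, U2))]
common_to_first_two = [L for L in lags if complementary(L, U0) and complementary(L, U1)]
print(len(lags), len(common_to_all), len(common_to_first_two))
```

The enumeration is consistent with the two-family picture: the three subspaces form one family, any two of them have the third as common complement (as Lemma 2.2 predicts), and no fourth member is available.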
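Proof.
The equivalence of (a) and (b) follows from the fact that $\ker \lambda(x) = E(x) \cap F(x)$. The implication (c) ⇒ (a) is standard. To prove (a) ⇒ (c), use Proposition 2.1 to cover X by open subsets U over each of which $E|_U$, $F|_U$ have a common Lagrangian complement $M_U$. Then by Corollary 1.3 there exist alternating maps $\zeta_U : F|_U \to F^*(L)|_U$ such that $\phi_{F|_U, M_U}(E|_U) = \Gamma_{\zeta_U}$. Let $\alpha : E|_U \cong \Gamma_{\zeta_U}$ be the isomorphism induced by $\phi_{F|_U, M_U}$, and let $\pi_1 : \Gamma_{\zeta_U} \cong F|_U$ and $\pi_2 : \Gamma_{\zeta_U} \to F^*(L)|_U$ be the two projections from $\Gamma_{\zeta_U} \subset (F \oplus F^*(L))|_U$:
$$\begin{array}{ccccc}
E|_U & \xrightarrow[\cong]{\ \alpha\ } & \Gamma_{\zeta_U} & \xrightarrow[\cong]{\ \pi_1\ } & F|_U \\
 & {\scriptstyle\lambda|_U}\searrow & \downarrow{\scriptstyle\pi_2} & \swarrow{\scriptstyle\zeta_U} & \\
 & & F^*(L)|_U & &
\end{array} \tag{8}$$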
Then $\iota_U := (\pi_1 \circ \alpha)^{-1}$ is an isomorphism such that $\zeta_U = \lambda|_U \circ \iota_U$ is alternating.
Independence of the common Lagrangian complement
Unfortunately, the construction that makes λ locally alternating depends on choices of local common Lagrangian complements. We now look at what happens if we replace one choice by another.
Lemma 2.4.
Let E, F ⊂ (V, q) be Lagrangian subbundles, and let M and N both be common Lagrangian complements to E and F. Suppose that the map $\phi_{F,M}$ of (5) sends E to the graph of $\zeta : F \to F^*(L)$ and sends N to the graph of $h : F^*(L) \to F$. Then
(a) the maps ζ and h are alternating;
(b) the map $u := 1 - h\zeta$ and its transpose $u^* = 1 - \zeta h$ are both invertible;
(c) the isometry $\phi_{F,N}$ sends E to the graph of the morphism $\zeta u^{-1} = (u^{-1})^*(\zeta - \zeta h \zeta)u^{-1}$.
Proof.
Part (a) follows from Corollary 1.3.
(b) Since N and E are complementary, their images $\Gamma_\zeta, \Gamma_h \subset F \oplus F^*(L)$ are also complementary. Thus the map
$$F \oplus F^*(L) \xrightarrow{\left(\begin{smallmatrix} 1 & h \\ \zeta & 1 \end{smallmatrix}\right)} F \oplus F^*(L)$$
is an isomorphism. This is equivalent to the composition
$$\begin{pmatrix} 1 & -h \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & h \\ \zeta & 1 \end{pmatrix} = \begin{pmatrix} 1 - h\zeta & 0 \\ \zeta & 1 \end{pmatrix}$$
being invertible or to $u = 1 - h\zeta$ being invertible. It follows that $u^* = 1 - \zeta h$ is also invertible.
(c) We now have isometries
$$F \oplus F^*(L) \xleftarrow[\cong]{\ \phi_{F,M}\ } V \xrightarrow[\cong]{\ \phi_{F,N}\ } F \oplus F^*(L). \tag{9}$$
The left-to-right composition is the identity on the first summand F and sends $\Gamma_h$ (corresponding to N on the left) onto the second summand $F^*(L)$ (corresponding to N on the right) compatibly with the hyperbolic quadratic form on $F \oplus F^*(L)$. Consequently, the left-to-right composition is $\begin{pmatrix} 1 & -h \\ 0 & 1 \end{pmatrix}$. To find the image of E on the right, one first goes to the left (where its image is $\Gamma_\zeta$) and then applies the left-to-right composition. Therefore the image of E on the right is the image of the composite
$$F \xrightarrow{\left(\begin{smallmatrix} 1 \\ \zeta \end{smallmatrix}\right)} F \oplus F^*(L) \xrightarrow[\text{left-to-right}]{\left(\begin{smallmatrix} 1 & -h \\ 0 & 1 \end{smallmatrix}\right)} F \oplus F^*(L),$$
where the first map corresponds to $\mathrm{left}|_E : E \to F \oplus F^*(L)$ under the isomorphism E ≅ F. The above composite map is
$$\begin{pmatrix} 1 & -h \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ \zeta \end{pmatrix} = \begin{pmatrix} u \\ \zeta \end{pmatrix} = \begin{pmatrix} 1 \\ \zeta u^{-1} \end{pmatrix} u,$$
so the image of E on the right is $\Gamma_{\zeta u^{-1}}$. But
$$\zeta u^{-1} = (u^{-1})^* u^* \zeta u^{-1} = (u^{-1})^*(\zeta - \zeta h \zeta)u^{-1},$$
and this completes the proof.
Degeneracy loci
Let F be a vector bundle of constant rank on a scheme X, let L be a line bundle, and let $\zeta : F \to F^*(L)$ be an alternating map. If k ≥ 0 is an integer and if m := rk(F) − 2k, then the degeneracy locus
$$Z_m(\zeta) := \{\, x \in X \mid \operatorname{rk}(\zeta(x)) \le \operatorname{rk}(F) - m = 2k \,\} \tag{10}$$
has codimension at most m(m − 1)/2, its “expected” value. The natural scheme structure on $Z_m(\zeta)$ is defined locally by the ideal $\mathrm{Pf}_{2k+2}(\zeta)$ generated by the (2k + 2)×(2k + 2) Pfaffians of the alternating map ζ. These loci have been studied notably in Harris and Tu [21] and from a different point of view in Okonek [29] and Walter [35]. We need to extend this notion to the case of a locally alternating map, as described in Proposition 2.3. Let E, F ⊂ (V, q) be Lagrangian subbundles of a twisted orthogonal bundle, let m be an integer such that $m \equiv \dim_{k(x)}[E(x) \cap F(x)] \pmod 2$
for all x ∈ X, and let
$$Z_m(E, F) := \{\, x \in X \mid \dim_{k(x)}[E(x) \cap F(x)] \ge m \,\}. \tag{11}$$
The fundamental classes of these loci are discussed in Fulton and Pragacz [18, Chapter 6], where they are given as polynomials in the Chern classes of E, F, and L. We now define a scheme structure on these Lagrangian degeneracy loci. Replacing λ by the $\lambda_1$ of (7) if necessary, we may assume that λ is everywhere of even rank. By Proposition 2.3, λ is then locally alternating; that is, there exists a cover of X by open subsets U and isomorphisms $\iota_U : F|_U \cong E|_U$ such that the compositions $\zeta_U = \lambda|_U \circ \iota_U$ are alternating. The scheme $Z_m(E, F)|_U$ is then defined by
$$I_{Z_m(E,F)}|_U := \mathrm{Pf}_{\operatorname{rk}(E) - m + 2}(\zeta_U). \tag{12}$$
Since $Z_m(E, F)|_U$ is the degeneracy locus of an alternating map, its codimension is at most m(m − 1)/2. Now the construction of the maps $\zeta_U$ in Proposition 2.3 depends on the choice of a common Lagrangian complement to $E|_U$ and $F|_U$. Nevertheless, the local degeneracy loci $Z_m(E, F)|_U$ are independent of this choice and therefore glue together to form a scheme $Z_m(E, F)$ because of Lemma 2.4 and the following lemma.
Lemma 2.5.
Let F be a vector bundle of constant rank, and let L be a line bundle over a scheme X. If $\zeta : F \to F^*(L)$ and $h : F^*(L) \to F$ are alternating maps such that $u := 1 - h\zeta$ is invertible, then $\mathrm{Pf}_{2k}(\zeta) = \mathrm{Pf}_{2k}(\zeta - \zeta h \zeta)$ for all integers k.
Proof.
It is enough to prove the lemma in the case where $L = \mathcal{O}_X$ and where X is universal. So let r := rk(F), and let $R := \mathbb{Z}[X_{ij}, Y_{ij}]$ be the polynomial ring in the independent variables $X_{ij}$, $Y_{ij}$ (1 ≤ i < j ≤ r). Let ζ and h be the r×r matrices with coefficients in R given by
$$\zeta_{ij} := \begin{cases} X_{ij} & \text{if } i < j, \\ 0 & \text{if } i = j, \\ -X_{ji} & \text{if } i > j, \end{cases} \qquad h_{ij} := \begin{cases} Y_{ij} & \text{if } i < j, \\ 0 & \text{if } i = j, \\ -Y_{ji} & \text{if } i > j, \end{cases}$$
let $u := 1 - h\zeta$, let $\delta := \det(u)$, and set $R_\delta := R[1/\delta]$. Thus $X := \operatorname{Spec}(R_\delta)$. We have to show that the two ideals
$$I := \mathrm{Pf}_{2k}(\zeta), \qquad J := \mathrm{Pf}_{2k}(\zeta - \zeta h \zeta)$$
in $R_\delta$ coincide. However, we may notice the following three facts.
The ideal $I \subset R_\delta$ is prime. This is because the ideal of $\mathbb{Z}[X_{ij}]$ generated by the 2k×2k Pfaffians of ζ is prime (cf. S. Abeasis and A. Del Fra [1, §3]), so its extensions to R (which is a polynomial algebra over $\mathbb{Z}[X_{ij}]$) and to $R_\delta$ are also prime.
There is an involution of $R_\delta$ exchanging I and J. Since R is a polynomial algebra over Z in variables that are the entries of ζ and h, one can specify a morphism $f : R \to R_\delta$ by specifying alternating matrices f(ζ) and f(h). Thus we may define f by
$$f(\zeta) := \zeta - \zeta h \zeta = \zeta u = u^* \zeta, \qquad f(h) := -u^{-1} h (u^*)^{-1}.$$
One computes that $f(u) = u^{-1}$, so f(δ) is the invertible element $1/\delta \in R_\delta$. Hence f extends uniquely to a morphism $f : R_\delta \to R_\delta$. One checks that f(f(ζ)) = ζ and that f(f(h)) = h, so f is an involution. Since f exchanges ζ and ζ − ζhζ, it exchanges I and J.
The ideals I and J define the same algebraic subset of $\operatorname{Spec}(R_\delta)$. This is equivalent to showing that a morphism $g : R_\delta \to K$, with K a field, factors through $R_\delta/I$ if and only if it factors through $R_\delta/J$. But giving such a g is equivalent to giving alternating matrices g(ζ) and g(h) with coefficients in K such that g(u) = 1 − g(h)g(ζ) is invertible. Such a g factors through $R_\delta/I$ if and only if rk[g(ζ)] < 2k, and it factors through $R_\delta/J$ if and only if rk[g(ζ − ζhζ)] = rk[g(ζ)g(u)] < 2k. Since g(u) is invertible, the two conditions are equivalent.
These three facts show that (in the generic case) I and J are prime ideals defining the same algebraic subsets. This proves that I and J are equal in the generic case and therefore equal in all cases. This proves the lemma.
The definition of $Z_m(E, F)$ generalizes the definition of $Z_m(\zeta)$.
Lemma 2.6.
Let $\zeta : F \to F^*(L)$ be an alternating map of vector bundles, and let $\Gamma_\zeta(F) \subset F \oplus F^*(L)$ be its graph. For any m such that m ≡ rk(F) (mod 2), the degeneracy loci $Z_m(\zeta)$ and $Z_m(F, \Gamma_\zeta(F))$ are identical schemes.
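The matrix identities behind Lemmas 2.4 and 2.5 can be sanity-checked on a concrete pair of alternating matrices. The sketch below is an illustration only: the 4×4 rational matrices ζ and h are our own arbitrary choice (with u = 1 − hζ invertible), and the helpers `alternating`, `mul`, `sub`, `rank_and_det`, and `pf4` are hypothetical names. It checks that ζ − ζhζ = ζu = u*ζ is again alternating, that it has the same rank as ζ, and that the squares of the 4×4 Pfaffians differ by the unit det(u).

```python
from fractions import Fraction

def alternating(upper, r):
    # r x r alternating matrix with the given strictly upper-triangular entries.
    m = [[Fraction(0)] * r for _ in range(r)]
    for (i, j), v in upper.items():
        m[i][j] = Fraction(v)
        m[j][i] = -Fraction(v)
    return m

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(r):
    return [[Fraction(int(i == j)) for j in range(r)] for i in range(r)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rank_and_det(a):
    # Gaussian elimination over the rationals.
    m = [row[:] for row in a]
    n, rank, det = len(m), 0, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(rank, n) if m[r][col] != 0), None)
        if pivot is None:
            det = Fraction(0)
            continue
        if pivot != rank:
            m[pivot], m[rank] = m[rank], m[pivot]
            det = -det
        det *= m[rank][col]
        for r in range(rank + 1, n):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank, det

def pf4(a):
    # Pfaffian of a 4 x 4 alternating matrix.
    return a[0][1] * a[2][3] - a[0][2] * a[1][3] + a[0][3] * a[1][2]

zeta = alternating({(0, 1): 1, (0, 2): 1, (2, 3): 2}, 4)
h = alternating({(0, 1): 1, (1, 3): 1}, 4)
u = sub(identity(4), mul(h, zeta))        # u = 1 - h zeta
u_star = sub(identity(4), mul(zeta, h))   # u* = 1 - zeta h
w = sub(zeta, mul(mul(zeta, h), zeta))    # w = zeta - zeta h zeta

det_u = rank_and_det(u)[1]
is_alternating = transpose(w) == [[-x for x in row] for row in w] and all(w[i][i] == 0 for i in range(4))
print(det_u, is_alternating, rank_and_det(w)[0] == rank_and_det(zeta)[0])
```

Over a field this only confirms the set-theoretic part of the argument (equal ranks); the equality of ideals is the content of Lemma 2.5 itself.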
Finally, we may verify that our scheme-theoretic degeneracy loci do not change if we invert the order of our pair of Lagrangian subbundles; that is, Zm (E , F ) = Zm (F , E ) for all m ≡ rk(E ) (mod 2). Essentially, if MU is a common Lagrangian complement of E |U and of F |U which leads to a ζU as in (12) such that Zm (E , F )|U =
$Z_m(\zeta_U)$, then a computation similar to Lemma 2.4(c) leads to a natural identification $Z_m(F, E)|_U = Z_m(-\zeta_U)$. We leave the details to the reader.
In this paper our interest is in the locus $Z := Z_3(E, F)$ in the case when it has codimension 3, the largest possible (and expected) value. The fundamental classes computed in [17], [16], and [18] agree with the scheme structures introduced here, which coincide with the scheme structures defined in [11].
All the results of this section have analogues for pairs of Lagrangian subbundles of twisted symplectic bundles. Degeneracy loci for such pairs are the degeneracy loci of locally symmetric maps. The symplectic case is slightly simpler than the orthogonal case because one does not need to worry about the parity of m or of $\dim_{k(x)}[E(x) \cap F(x)]$. (There is no symplectic analogue of Proposition 1.1.) The details are left to the reader.
3. Lagrangian degeneracy loci are strongly subcanonical
We now prove the implication (b) ⇒ (a) of our main Theorem 0.2.
Theorem 3.1.
Suppose that (V, q) is a twisted orthogonal bundle over a locally Noetherian scheme X with values in a line bundle L. Suppose that E, F ⊂ (V, q) are Lagrangian subbundles such that $\dim_{k(x)}[E(x) \cap F(x)]$ is odd for all x ∈ X. Write $L_{E,F,V} := \det(E) \otimes \det(F) \otimes \det(V)^{-1}$. Suppose that the submaximal minors of the composite map $\lambda : E \to V \cong V^*(L) \twoheadrightarrow F^*(L)$ generate an ideal sheaf I of grade 3 (the expected value). Then the ideal sheaf of the closed subscheme (cf. (12)),
$$Z = Z_3(E, F) = \{\, x \in X \mid \dim_{k(x)}[E(x) \cap F(x)] \ge 3 \,\},$$
has grade 3 and satisfies $I_Z^2 = I$. The sheaf $\mathcal{O}_Z$ has locally free resolutions
$$0 \to L_{E,F,V} \to E(M) \xrightarrow{\ \lambda\ } F^*(L \otimes M) \to \mathcal{O}_X \to \mathcal{O}_Z \to 0, \tag{13a}$$
$$0 \to L_{E,F,V} \to F(M) \xrightarrow{\ -\lambda^*\ } E^*(L \otimes M) \to \mathcal{O}_X \to \mathcal{O}_Z \to 0, \tag{13b}$$
with M a line bundle such that $M^{\otimes 2} \cong L_{E,F,V} \otimes L^{-1}$. Moreover, the natural isomorphism between (13b) and the dual of (13a) defines an isomorphism
$$\eta : \mathcal{O}_Z \xrightarrow{\ \cong\ } \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L_{E,F,V}) =: \omega_{Z/X}(L_{E,F,V}),$$
with respect to which Z is strongly subcanonical of codimension 3 in X (cf. Definition 0.1).
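Corollary 3.2.
If, in the situation of the theorem, X is locally Gorenstein, then so is Z, and $\omega_Z \cong \omega_X(L_{E,F,V}^{-1})|_Z$.
The statement of the theorem remains true even if X is not Noetherian, provided one defines grade as in J. Eagon and D. Northcott [12] and Northcott [28]. The only difference in the proofs is that one uses the non-Noetherian generalizations of the Buchsbaum-Eisenbud structure theorems found in these references.
Proof of Theorem 3.1.
Let f : E → V and g : F → V be the inclusions, and let N = E ∩ F be the kernel in the natural sequence
$$0 \to N \xrightarrow{\left(\begin{smallmatrix} i \\ j \end{smallmatrix}\right)} E \oplus F \xrightarrow{\ (f\ \ -g)\ } V.$$
If $\beta : V \xrightarrow{\ \sim\ } V^*(L)$ is the isomorphism induced by the quadratic form q, and $\lambda := g^*\beta f$, then we get a commutative diagram
$$\begin{array}{ccccccc}
 & & E & \xrightarrow{\ \lambda\ } & F^*(L) & & \\
 & {\scriptstyle i}\nearrow & {\scriptstyle f}\searrow & & \nearrow{\scriptstyle g^*\beta} & \searrow{\scriptstyle j^*} & \\
N & & & V & & & N^{-1} \otimes L \\
 & {\scriptstyle j}\searrow & {\scriptstyle g}\nearrow & & \searrow{\scriptstyle f^*\beta} & \nearrow{\scriptstyle i^*} & \\
 & & F & \xrightarrow{\ \lambda^*\ } & E^*(L) & &
\end{array} \tag{14}$$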
Since the diagonals are short exact sequences, the kernels of λ and of λ* are both equal to N. In addition, fi = gj. We claim that N is a line bundle and that the complexes
$$0 \to N \xrightarrow{\ i\ } E \xrightarrow{\ \lambda\ } F^*(L) \xrightarrow{\ j^*\ } N^{-1} \otimes L, \tag{15a}$$
$$0 \to N \xrightarrow{\ j\ } F \xrightarrow{\ -\lambda^*\ } E^*(L) \xrightarrow{\ i^*\ } N^{-1} \otimes L \tag{15b}$$
are exact and are locally free resolutions of $\mathcal{O}_Z(N^{-1} \otimes L)$ for the subscheme $Z = Z_3(E, F) \subset X$ of grade 3, with $I_Z^2 = I$. We prove these claims locally by making λ locally alternating and applying the Buchsbaum-Eisenbud structure theorem [6]. Now the vector bundles E and F may be of even or odd rank. If the rank is even, we use the same trick as in (7) and replace λ by
$$E \oplus \mathcal{O}_X \xrightarrow{\left(\begin{smallmatrix} \lambda & 0 \\ 0 & 1 \end{smallmatrix}\right)} F^*(L) \oplus \mathcal{O}_X$$
without changing the kernel and cokernel of λ. Thus we may assume that E and F are of odd rank. By hypothesis, $\dim_{k(x)}[E(x) \cap F(x)]$ is also odd for all x ∈ X. Therefore, by Proposition 2.3, λ is locally alternating; that is, X is covered by open subsets U over which there are isomorphisms $\iota_U : F|_U \cong E|_U$ such that $\zeta_U = \lambda|_U \circ \iota_U$ is alternating. Thus we see that our complexes (15a) and (15b) are locally isomorphic to complexes
$$0 \to N|_U \xrightarrow{\ j|_U\ } F \xrightarrow{\ \zeta_U\ } F^*(L) \xrightarrow{\ j^*|_U\ } (N^{-1} \otimes L)|_U \tag{16}$$
such that $\zeta_U$ is alternating with kernel $j|_U = \iota_U^{-1} \circ i|_U$ in the notation of diagram (14). Now F is of odd rank, $\zeta_U$ is alternating, and the ideal I generated by its submaximal minors is of grade 3. So the Buchsbaum-Eisenbud structure theorem [6] applies. Therefore the kernel $N|_U$ is a line bundle, the map $j|_U$ is given by the submaximal Pfaffians of $\zeta_U$, and the complex (16) is exact and is a resolution of $\mathcal{O}_Z(N^{-1} \otimes L)|_U$. We can also identify the ideal sheaf I generated by the submaximal minors of λ with $I_Z^2$. This works because on U the sheaf $I_Z|_U$ is generated by the submaximal Pfaffians $p_1, \dots, p_n$ of the alternating map $\zeta_U$, while $I|_U$ is generated by the submaximal minors. Since the (i, j)th submaximal minor is $\pm p_i p_j$ (see [6, appendix]), we do indeed get $I_Z^2|_U = I|_U$. This verifies our claims. Now we set $M := N \otimes L^{-1}$ and twist. We get two dual resolutions:
$$0 \to M^{\otimes 2} \otimes L \to E(M) \xrightarrow{\ \lambda\ } F^*(L \otimes M) \to \mathcal{O}_X \to \mathcal{O}_Z \to 0, \tag{17a}$$
$$0 \to M^{\otimes 2} \otimes L \to F(M) \xrightarrow{\ -\lambda^*\ } E^*(L \otimes M) \to \mathcal{O}_X \to \mathcal{O}_Z \to 0. \tag{17b}$$
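The relation between submaximal minors and submaximal Pfaffians used in the proof above can be illustrated on a single matrix. The sketch below takes an arbitrary 5×5 integer alternating matrix of our own choosing (standing in for $\zeta_U$), computes the Pfaffians $p_i$ of the 4×4 matrices obtained by deleting row and column i, and checks that every 4×4 minor equals $p_i p_j$ up to sign and that the vector with entries $(-1)^i p_i$ lies in the kernel. The function names are ours, and the check is numerical evidence for one example, not a proof.

```python
def pfaffian(a):
    # Pfaffian by expansion along the first row (zero for odd size).
    n = len(a)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[a[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * a[0][j] * pfaffian(minor)
    return total

def det(a):
    # Determinant by Laplace expansion along the first row.
    n = len(a)
    if n == 0:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(n))

def delete(a, i, j):
    # Remove row i and column j.
    return [[a[r][c] for c in range(len(a)) if c != j] for r in range(len(a)) if r != i]

n = 5
upper = {(0, 1): 1, (0, 2): 2, (0, 3): -1, (0, 4): 3, (1, 2): 0,
         (1, 3): 4, (1, 4): 1, (2, 3): 2, (2, 4): -2, (3, 4): 5}
zeta = [[0] * n for _ in range(n)]
for (i, j), v in upper.items():
    zeta[i][j], zeta[j][i] = v, -v

# Submaximal Pfaffians p_i: delete row and column i.
p = [pfaffian(delete(zeta, i, i)) for i in range(n)]

minors_are_products = all(abs(det(delete(zeta, i, j))) == abs(p[i] * p[j])
                          for i in range(n) for j in range(n))
kernel_vector = [(-1) ** i * p[i] for i in range(n)]
in_kernel = all(sum(zeta[r][c] * kernel_vector[c] for c in range(n)) == 0 for r in range(n))
print(p, minors_are_products, in_kernel)
```

In this example the ideal of 4×4 minors is generated by the products $p_i p_j$, which is the local form of the identity $I_Z^2 = I$.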
The alternating product of the determinant line bundles in each resolution is trivial, and therefore M ⊗2 ⊗ L ∼ = LE ,F ,V . Let us now verify that Z ⊂ X satisfies conditions (A)–(D) of Definition 0.1. The duality between the two resolutions of OZ shows that Z ⊂ X is relatively Cohen∼ Macaulay of codimension 3. The duality also induces an isomorphism η : OZ −→ E xt 3O (OZ , LE ,F ,V ), making Z ⊂ X subcanonical. Clearly OZ is of finite local X projective dimension. Moreover, η is the Yoneda extension class of (17a) and is thus the image of the class in Ext 2OX (IZ , M ⊗2 ⊗ L) of 0 −→ M ⊗2 ⊗ L −→ E (M) −→ F ∗ (L ⊗ M) −→ IZ −→ 0. So Z ⊂ X is strongly subcanonical with respect to η. 4. Degeneracy loci for split Lagrangian subbundles In this section we discuss Lagrangian degeneracy loci in the “split” case where the bundles are E , F ⊂ (F ⊕ F ∗ (L), qh ). We prove the implications (c) ⇔ (d) ⇒
EISENBUD, POPESCU, AND WALTER
(a) of our main Theorem 0.2. We discuss the relation between this case and Pfaffian subschemes, and we show how the general case can be transformed into the split case. In [14] we use this split case to construct non-Pfaffian subcanonical subschemes of codimension 3 in $\mathbb{P}^n$.

Theorem 4.1
Let $\mathcal{F}$ be a vector bundle of rank $n$, and let $L$ be a line bundle on a locally Noetherian scheme $X$. Let $\mathcal{F} \oplus \mathcal{F}^*(L)$ be the hyperbolic twisted orthogonal bundle. (a) Suppose that
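\[
\mathcal{E} \xrightarrow{\ \binom{\psi}{\phi}\ } \mathcal{F} \oplus \mathcal{F}^*(L) \tag{18}
\]
is a Lagrangian subbundle such that $\dim_{k(x)}[\mathcal{E}(x) \cap \mathcal{F}^*(L)(x)]$ is odd for all $x \in X$. Let $L_{\mathcal{E},\mathcal{F}} := \det(\mathcal{E}) \otimes \det(\mathcal{F})^{-1}$. If the sheaf of ideals $\mathcal{I}$ generated by the submaximal minors of $\psi$ is of grade 3 (the expected value), then the ideal sheaf of the closed subscheme
\[
Z = Z_3(\mathcal{E}, \mathcal{F}^*(L)) = \{ x \in X \mid \dim_{k(x)}[\mathcal{E}(x) \cap \mathcal{F}^*(L)(x)] \ge 3 \}
\]
has grade 3 and satisfies $\mathcal{I}_Z^2 = \mathcal{I}$. There is a commutative diagram with exact rows
\[
\begin{array}{ccccccccccc}
0 & \to & L_{\mathcal{E},\mathcal{F}} & \to & \mathcal{E}(\mathcal{M}) & \xrightarrow{\ \psi\ } & \mathcal{F}(\mathcal{M}) & \to & \mathcal{O}_X & \to & \mathcal{O}_Z \\
 & & \| & & \big\downarrow{\scriptstyle \phi} & & \big\downarrow{\scriptstyle \phi^*} & & \| & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} \\
0 & \to & L_{\mathcal{E},\mathcal{F}} & \to & \mathcal{F}^*(L \otimes \mathcal{M}) & \xrightarrow{\ -\psi^*\ } & \mathcal{E}^*(L \otimes \mathcal{M}) & \to & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L_{\mathcal{E},\mathcal{F}})
\end{array}
\tag{19}
\]
with $\mathcal{M}$ a line bundle on $X$ such that $\mathcal{M}^{\otimes 2} \cong L^{-1} \otimes L_{\mathcal{E},\mathcal{F}}$ and with $\phi^*\psi$ alternating. Moreover, $\omega_{Z/X} \cong L_{\mathcal{E},\mathcal{F}}^{-1}|_Z$, and $Z$ is strongly subcanonical of codimension 3 in $X$ with respect to $\eta$. (b) Conversely, given a subscheme $Z$ with locally free resolutions as in (19) and with $\phi^*\psi$ alternating, then $\mathcal{E} \subset \mathcal{F} \oplus \mathcal{F}^*(L)$ is a Lagrangian subbundle, and thus $Z \subset X$ is strongly subcanonical of codimension 3.

Corollary 4.2
If, in the situation of the theorem, $X$ is locally Gorenstein, then so is $Z$, and $\omega_Z \cong \omega_X(L_{\mathcal{E},\mathcal{F}}^{-1})|_Z$.

Proof of Theorem 4.1
The only things we need to show for part (a) which do not already follow from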
LAGRANGIAN SUBBUNDLES AND SUBCANONICAL SUBSCHEMES
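Theorem 3.1 are that (19) commutes and that $\phi^*\psi$ is alternating. But $\mathcal{E}$ is a totally isotropic subbundle, so any local section $e \in \Gamma(U, \mathcal{E})$ satisfies
\[
0 = q_h\bigl(\psi(e) \oplus \phi(e)\bigr) = \langle \psi(e), \phi(e) \rangle = \langle \phi^*\psi(e), e \rangle. \tag{20}
\]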
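Equation (20) can be unpacked as follows. This is a sketch under an assumed convention, not spelled out in the text: the hyperbolic form is taken to be $q_h(x \oplus \xi) = \xi(x)$, and $\phi^*$ is taken to be defined by $(\phi^* x)(e) = \phi(e)(x)$.

```latex
% Assumed conventions (not stated explicitly in the text):
%   q_h(x \oplus \xi) = \xi(x) on F \oplus F^*(L),  (\phi^* x)(e) = \phi(e)(x).
\begin{align*}
q_h\bigl(\psi(e)\oplus\phi(e)\bigr) &= \phi(e)\bigl(\psi(e)\bigr) = \bigl(\phi^*\psi(e)\bigr)(e) = 0 ,\\
0 = \bigl(\phi^*\psi(e+e')\bigr)(e+e') &= \bigl(\phi^*\psi(e)\bigr)(e') + \bigl(\phi^*\psi(e')\bigr)(e) .
\end{align*}
% The second line reads \phi^*\psi + \psi^*\phi = 0, that is, \phi^*\psi = -\psi^*\phi.
```

The first line is the vanishing on the diagonal that defines an alternating map; the second, obtained by polarization, is the relation $\phi^*\psi = -\psi^*\phi$ between the two composites $\mathcal{E} \to \mathcal{E}^*(L)$.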
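Thus $\phi^*\psi$ is alternating, and the central part of diagram (19) commutes. The rest is easy and left to the reader. For part (b) the fact that (19) is a quasi-isomorphism implies that $\mathcal{E}$ is a subbundle of $\mathcal{F} \oplus \mathcal{F}^*(L)$. It is totally isotropic by the same calculation (20).

Pfaffian subschemes
Okonek’s Pfaffian subschemes (see [29]) are the special case of the construction of Theorem 4.1 with $\mathcal{E} = \mathcal{F}^*(L)$ and $\phi = 1$. For if
\[
\mathcal{E} \xrightarrow{\ \binom{\psi}{1}\ } \mathcal{E}^*(L) \oplus \mathcal{E}
\]
is a Lagrangian subbundle, then $\psi$ is alternating by Lemma 1.2 or by (20). So in this case the two resolutions of (19) reduce to
\[
0 \longrightarrow L \otimes \mathcal{M}^{\otimes 2} \longrightarrow \mathcal{E}(\mathcal{M}) \xrightarrow{\ \psi\ } \mathcal{E}^*(L \otimes \mathcal{M}) \longrightarrow \mathcal{O}_X \longrightarrow \mathcal{O}_Z \longrightarrow 0,
\]
with $\psi$ alternating. Thus $Z \subset X$ is one of Okonek’s Pfaffian subschemes.

From nonsplit to split bundles
When the orthogonal bundle $(\mathcal{V}, q)$ of Theorem 3.1 does not split as $\mathcal{F} \oplus \mathcal{F}^*(L)$, we cannot fill in the diagram (19) with direct arrows:
\[
\begin{array}{ccccccccccc}
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \xrightarrow{\ i\ } & \mathcal{E}(\mathcal{M}) & \xrightarrow{\ \lambda\ } & \mathcal{F}^*(L \otimes \mathcal{M}) & \xrightarrow{\ j^*\ } & \mathcal{O}_X & \to & \mathcal{O}_Z \\
 & & & & & & & & & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} \\
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \xrightarrow{\ j\ } & \mathcal{F}(\mathcal{M}) & \xrightarrow{\ -\lambda^*\ } & \mathcal{E}^*(L \otimes \mathcal{M}) & \xrightarrow{\ i^*\ } & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L_{\mathcal{E},\mathcal{F},\mathcal{V}})
\end{array}
\tag{21}
\]
Nevertheless, by modifying the orthogonal bundle and its Lagrangian subbundles, we can usually realize the same degeneracy locus as a Lagrangian degeneracy locus of a split orthogonal bundle. We know two strategies to accomplish this under different hypotheses. One is to apply the converse structure theorem (Theorem 6.1). The other strategy works whenever the quadratic form $q \in \Gamma(X, (S^2\mathcal{V}^*)(L))$ is the image of an $\alpha \in \Gamma(X, (\mathcal{V}^* \otimes \mathcal{V}^*)(L))$, for instance, if $2 \in \Gamma(X, \mathcal{O}_X)^{\times}$. Then the orthogonal direct sum $(\mathcal{V}, q) \perp$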
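$(\mathcal{V}, -q)$ is hyperbolic because of the inverse isometries
\[
(\mathcal{V}, q) \perp (\mathcal{V}, -q)
\;\begin{array}{c}
\xrightarrow{\ \left(\begin{smallmatrix} 1 & 1 \\ \alpha & -\alpha^* \end{smallmatrix}\right)\ } \\[4pt]
\xleftarrow[\ \left(\begin{smallmatrix} \beta^{-1}\alpha^* & \beta^{-1} \\ \beta^{-1}\alpha & -\beta^{-1} \end{smallmatrix}\right)\ ]{}
\end{array}\;
\mathcal{V} \oplus \mathcal{V}^*(L)
\]
with $\beta := \alpha + \alpha^*$ the nonsingular symmetric bilinear form associated to $q$. The composite map $\mathcal{E} \oplus \mathcal{F} \to (\mathcal{V}, q) \perp (\mathcal{V}, -q) \cong \mathcal{V} \oplus \mathcal{V}^*(L)$, or, more explicitly,
\[
\mathcal{E} \oplus \mathcal{F} \xrightarrow{\ \left(\begin{smallmatrix} f & g \\ \alpha f & -\alpha^* g \end{smallmatrix}\right)\ } \mathcal{V} \oplus \mathcal{V}^*(L),
\]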
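embeds $\mathcal{E} \oplus \mathcal{F}$ as a Lagrangian subbundle of the hyperbolic bundle $\mathcal{V} \oplus \mathcal{V}^*(L)$. We may then fill in the diagram (21) with a sequence of quasi-isomorphisms going in both directions:
\[
\begin{array}{ccccccccc}
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \to & \mathcal{E}(\mathcal{M}) & \xrightarrow{\ g^*\beta f\ } & \mathcal{F}^*(L \otimes \mathcal{M}) & \to & \mathcal{O}_X \\
 & & \| & & \big\uparrow{\scriptstyle (1\ 0)} & & \big\uparrow{\scriptstyle g^*\beta} & & \| \\
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \to & (\mathcal{E} \oplus \mathcal{F})(\mathcal{M}) & \xrightarrow{\ (f\ g)\ } & \mathcal{V}(\mathcal{M}) & \to & \mathcal{O}_X \\
 & & \| & & \big\downarrow{\scriptstyle (\alpha f\ \ -\alpha^* g)} & & \big\downarrow{\scriptstyle \binom{f^*\alpha^*}{-g^*\alpha}} & & \| \\
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \to & \mathcal{V}^*(L \otimes \mathcal{M}) & \xrightarrow{\ \binom{-f^*}{-g^*}\ } & (\mathcal{E}^* \oplus \mathcal{F}^*)(L \otimes \mathcal{M}) & \to & \mathcal{O}_X \\
 & & \| & & \big\uparrow{\scriptstyle \beta g} & & \big\uparrow{\scriptstyle \binom{1}{0}} & & \| \\
0 & \to & L_{\mathcal{E},\mathcal{F},\mathcal{V}} & \to & \mathcal{F}(\mathcal{M}) & \xrightarrow{\ -f^*\beta g\ } & \mathcal{E}^*(L \otimes \mathcal{M}) & \to & \mathcal{O}_X
\end{array}
\]
Another way of looking at this is as follows. Let $P$ and $Q$ denote the first two lines of the last diagram. If one has $\mathcal{V} \cong \mathcal{F} \oplus \mathcal{F}^*(L)$ as in Theorem 4.1, then the chain map of that theorem is induced by a twisted shifted nonsingular quadratic form on the chain complex $P$ given by a chain map $D_2(P) \to L_{\mathcal{E},\mathcal{F},\mathcal{V}}[3]$. In general, no such chain map exists, but if we can lift $q$ to $\alpha$ as above, then there is a pair of chain maps
\[
D_2 P \xleftarrow{\ \sim\ } D_2 Q \longrightarrow L_{\mathcal{E},\mathcal{F},\mathcal{V}}[3] \tag{22}
\]
with the first arrow a quasi-isomorphism. This means that in Theorem 3.1 we are also dealing with a sort of twisted shifted nonsingular quadratic form on $P$, but only in the derived category.

5. Local equations for the degeneracy locus

Let $Z \subset X$ be a subcanonical subscheme of codimension 3 which is a split Lagrangian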
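degeneracy locus as in diagram (19). We give two strategies for computing equations that define this degeneracy locus locally. The first is based on the idea of using a common Lagrangian complement to make $\lambda$ alternating as in Proposition 2.3. The second is based on finding standard local forms for a pair of Lagrangian submodules.

Strategy 1: Alternating homotopies
We start with a pair of Lagrangian subbundles $\mathcal{E}$ and $\mathcal{F}^*(L)$ of a twisted orthogonal bundle. We may reduce to the case where the rank of both $\mathcal{E}$ and $\mathcal{F}$ is odd using (7). Also, since we are working locally, we may assume that the orthogonal bundle splits as $\mathcal{F} \oplus \mathcal{F}^*(L)$ because over an affine scheme any Lagrangian subbundle has a Lagrangian complement (see, e.g., Knus [25, Remark I.5.5.4]). Then locally $\mathcal{E}$ and $\mathcal{F}^*(L)$ have a common Lagrangian complement $M$ according to Proposition 2.1. This $M$ is necessarily the graph of an alternating map $h : \mathcal{F} \to \mathcal{F}^*(L)$ by Lemma 1.2. We use this $h$ as an alternating local homotopy to transform the commutative diagram on the left below into the one on the right:
\[
\begin{array}{ccc}
\mathcal{E} & \xrightarrow{\ \psi\ } & \mathcal{F} \\
{\scriptstyle \phi}\big\downarrow & {\scriptstyle h}\swarrow & \big\downarrow{\scriptstyle \phi^*} \\
\mathcal{F}^*(L) & \xrightarrow{\ -\psi^*\ } & \mathcal{E}^*(L)
\end{array}
\qquad\qquad
\begin{array}{ccc}
\mathcal{E} & \xrightarrow{\ \psi\ } & \mathcal{F} \\
{\scriptstyle \phi - h\psi}\big\downarrow & & \big\downarrow{\scriptstyle \phi^* + \psi^* h} \\
\mathcal{F}^*(L) & \xrightarrow{\ -\psi^*\ } & \mathcal{E}^*(L)
\end{array}
\tag{23}
\]
Then $\phi - h\psi$ is an isomorphism because it is the projection of $\mathcal{E}$ onto $\mathcal{F}^*(L)$ along their common complement $M$. The dual map $\phi^* + \psi^* h$ is also an isomorphism. Thus diagram (19), with a symmetric quasi-isomorphism between the resolutions of $\mathcal{O}_Z$, can be modified locally by an alternating homotopy to get a diagram (valid locally) with a symmetric isomorphism from the resolution into its dual:
\[
\begin{array}{ccccccccccc}
0 & \to & L_{\mathcal{E},\mathcal{F}} & \to & \mathcal{E}(\mathcal{M}) & \xrightarrow{\ \psi\ } & \mathcal{F}(\mathcal{M}) & \to & \mathcal{O}_X & \to & \mathcal{O}_Z \\
 & & \| & & \big\downarrow{\scriptstyle \phi - h\psi} & & \big\downarrow{\scriptstyle \phi^* + \psi^* h} & & \| & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} \\
0 & \to & L_{\mathcal{E},\mathcal{F}} & \to & \mathcal{F}^*(L \otimes \mathcal{M}) & \xrightarrow{\ -\psi^*\ } & \mathcal{E}^*(L \otimes \mathcal{M}) & \to & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L_{\mathcal{E},\mathcal{F}})
\end{array}
\]
Thus locally $\mathcal{O}_Z$ has a symmetric resolution
\[
0 \longrightarrow L_{\mathcal{E},\mathcal{F}} \longrightarrow \mathcal{E}(\mathcal{M}) \xrightarrow{\ \mu\ } \mathcal{E}^*(L \otimes \mathcal{M}) \longrightarrow \mathcal{O}_X \longrightarrow \mathcal{O}_Z \longrightarrow 0,
\]
where $\mu = \phi^*\psi + \psi^* h \psi$ is alternating (and is essentially the map $\mu_M$ of (16) and Proposition 2.1). The submaximal Pfaffians of $\mu$ give local equations for the degeneracy locus $Z$. The choice of another common Lagrangian complement $M$ gives a different alternating homotopy $h$, and vice versa. These calculations are similar to Lemma 2.4.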
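The claim that $\mu$ is alternating follows in two lines from the two alternating maps already in play. The sketch below assumes only what the text states: $\phi^*\psi$ is alternating by (20), and $h$ is alternating, meaning $\langle h(x), x \rangle = 0$ for every local section $x$ of $\mathcal{F}$.

```latex
% Assumptions: \phi^*\psi is alternating (equation (20)); h is alternating,
% i.e. \langle h(x), x \rangle = 0 for every local section x of F.
\begin{align*}
\mu &= (\phi^* + \psi^* h)\,\psi = \phi^*\psi + \psi^* h\,\psi ,\\
\langle \mu(e), e \rangle
  &= \langle \phi^*\psi(e), e \rangle + \langle h(\psi(e)), \psi(e) \rangle
   = 0 + 0 = 0 .
\end{align*}
```

Here $\mu$ is the composite of $\psi$ with the isomorphism $\phi^* + \psi^* h$, which is why replacing the target $\mathcal{F}(\mathcal{M})$ by $\mathcal{E}^*(L \otimes \mathcal{M})$ turns the top row of the local diagram into a symmetric resolution.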
Strategy 2: Standard local forms
Suppose that $R$ is a commutative local ring with maximal ideal $\mathfrak{m}$ and residue field $k := R/\mathfrak{m}$. Let $F$ be a free $R$-module of finite rank, and equip $F \oplus F^*$ with the hyperbolic quadratic form. Suppose that $E \subset F \oplus F^*$ is a Lagrangian submodule, that is, a totally isotropic direct summand of rank equal to that of $F$. Let $\psi : E \to F$ and $\phi : E \to F^*$ be the two components of the inclusion.

Lemma 5.1
In the above situation there exist bases of $E$ and $F$ and a dual basis of $F^*$ in which the matrices of $\psi$ and $\phi$ are of the form
\[
\psi = \begin{pmatrix} \beta & 0 \\ 0 & I \end{pmatrix}, \qquad \phi = \begin{pmatrix} I & 0 \\ 0 & \gamma \end{pmatrix}
\]
with the blocks in the two matrices of the same size, and with $\beta$ and $\gamma$ alternating.

Proof
We begin by choosing bases for $E$ and $F$ and the dual basis of $F^*$, so that we can treat $\psi$ and $\phi$ as matrices. Since $E$ is a direct summand of $F \oplus F^*$, the columns of the total matrix $\binom{\psi}{\phi}$ are linearly independent even modulo $\mathfrak{m}$. Moreover, by (20), $\phi^*\psi$ is an alternating matrix because $E \subset F \oplus F^*$ is a totally isotropic submodule. We now begin a series of row and column operations on $\psi$ and $\phi$ which puts them into the required form. The column operations (resp., row operations) correspond to changes of basis of $E$ (resp., of $F$ and $F^*$) and to the action of invertible matrices $P$ (resp., $Q$) on $\psi$ and $\phi$ via $\psi \mapsto Q^{-1}\psi P$ and $\phi \mapsto Q^*\phi P$. Choose a maximal invertible minor of $\phi$. After row and column operations, we can assume that the corresponding submatrix is an identity block lying in the upper left corner of $\phi$ and that the blocks below and to the right of it are zero. Thus we can assume that
\[
\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{pmatrix}, \qquad \phi = \begin{pmatrix} I & 0 \\ 0 & \delta \end{pmatrix},
\]
where the blocks of the two matrices are of the same size, the on-diagonal blocks are square, and the coefficients of $\delta$ lie in $\mathfrak{m}$. Since
\[
\phi^*\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \delta^*\psi_{21} & \delta^*\psi_{22} \end{pmatrix}
\]
is alternating, we see that all the coefficients of $\psi_{12}$ also lie in $\mathfrak{m}$. Hence all the coefficients in the last block of columns of $\binom{\psi}{\phi}$ lie in $\mathfrak{m}$ except those in $\psi_{22}$.
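The step placing the coefficients of $\psi_{12}$ in $\mathfrak{m}$ can be made explicit. This is a short sketch using only the block forms of the proof; nothing beyond them is assumed.

```latex
% Blocks as in the proof of Lemma 5.1; the entries of \delta lie in \mathfrak{m}.
\[
\phi^*\psi
= \begin{pmatrix} I & 0 \\ 0 & \delta^* \end{pmatrix}
  \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{pmatrix}
= \begin{pmatrix} \psi_{11} & \psi_{12} \\ \delta^*\psi_{21} & \delta^*\psi_{22} \end{pmatrix},
\qquad
\psi_{12} = -(\delta^*\psi_{21})^* = -\psi_{21}^*\,\delta .
\]
% Each entry of \psi_{21}^* \delta is an R-linear combination of entries of \delta.
```

Since the off-diagonal blocks of an alternating matrix are negative transposes of each other, every coefficient of $\psi_{12}$ is a combination of coefficients of $\delta$ and therefore lies in $\mathfrak{m}$.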
Since these columns must be linearly independent modulo $\mathfrak{m}$, it follows that $\psi_{22}$ must be invertible. Applying a new set of column operations to $\phi$ and $\psi$, we may assume that
\[
\psi = \begin{pmatrix} \psi_{11} & \varepsilon \\ \psi_{21} & I \end{pmatrix} \qquad\text{and that}\qquad \phi = \begin{pmatrix} I & 0 \\ 0 & \gamma \end{pmatrix}.
\]
Moreover, $\phi^*\psi$ remains alternating, which actually means that $\psi_{11}$ and $\gamma$ are alternating, and $\varepsilon = -\psi_{21}^*\gamma$. A final set of row and column operations using the matrices
\[
Q = \begin{pmatrix} I & -\psi_{21}^*\gamma \\ 0 & I \end{pmatrix} \qquad\text{and}\qquad P = \begin{pmatrix} I & 0 \\ -\psi_{21} & I \end{pmatrix}
\]
puts $\psi$ and $\phi$ into the form required by the lemma.

Corollary 5.2
Let $R$ be a commutative local ring with residue field $k$, let $F$ be a free $R$-module of odd rank, and let $E \subset F \oplus F^*$ be a Lagrangian submodule such that $\dim_k[(E \otimes k) \cap (F^* \otimes k)]$ is odd. Let $\psi : E \to F$ and $\phi : E \to F^*$ be the two components of the inclusion. (a) The determinant of $\phi$ is of the form $\det \phi = af^2$ with $a$ invertible. (b) If $\det(\phi)$ is not a zero-divisor, and if $\psi$ degenerates along an ideal $I$ of height and grade 3 (as expected), then this ideal is $I = (\operatorname{Pf}(\phi^*\psi) : f)$, where $\operatorname{Pf}(\phi^*\psi)$ is the ideal generated by the submaximal Pfaffians of $\phi^*\psi$ and where $f$ is as in part (a).

Proof
(a) We put the matrices of $\phi$ and of $\psi$ in the special form of Lemma 5.1, and we set $f := \operatorname{Pf}(\gamma)$. The determinant of the matrix of $\phi$ is then $f^2$. Consequently, the determinant of the matrix of $\phi$ with respect to any bases of $E$ and $F$ is of the form $af^2$, with $a$ an invertible element of $R$ coming from the determinants of the change-of-basis matrices. (b) Using the special forms for $\phi$ and $\psi$ given in Lemma 5.1, we find that $I$ is generated by the submaximal Pfaffians $p_1, \dots, p_{2s+1}$ of $\beta$, while the ideal $\operatorname{Pf}(\phi^*\psi)$ is generated by $fp_1, \dots, fp_{2s+1}$. Since we suppose $\det(\phi)$ and therefore $f$ are not zero-divisors, this gives (b).

6. Subcanonical subschemes are Lagrangian degeneracy loci

In this section we prove the implication (a) ⇒ (d) of our main Theorem 0.2. Taken together with Theorems 3.1 and 4.1, this proves the main theorem because the implication (c) ⇒ (b) is trivial.

Theorem 6.1
Let $A$ be a Noetherian ring, and let $X \subset \mathbb{P}^N_A$ be a locally closed subscheme. If $Z \subset X$ is a codimension 3 strongly subcanonical subscheme (cf.
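Definition 0.1), then there exist vector bundles $\mathcal{E}$ and $\mathcal{G}$, a line bundle $L$ on $X$, and an embedding of $\mathcal{E}$ as a Lagrangian subbundle of the twisted hyperbolic bundle $\mathcal{G} \oplus \mathcal{G}^*(L)$ such that $Z = Z_3(\mathcal{E}, \mathcal{G}^*(L))$ and $\mathcal{O}_Z$ has symmetrically quasi-isomorphic locally free resolutions
\[
\begin{array}{ccccccccccccc}
0 & \to & L & \to & \mathcal{E} & \xrightarrow{\ \psi\ } & \mathcal{G} & \to & \mathcal{O}_X & \to & \mathcal{O}_Z & \to & 0 \\
 & & \| & & \big\downarrow{\scriptstyle \phi} & & \big\downarrow{\scriptstyle \phi^*} & & \| & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} & & \\
0 & \to & L & \to & \mathcal{G}^*(L) & \xrightarrow{\ -\psi^*\ } & \mathcal{E}^*(L) & \to & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L) & \to & 0
\end{array}
\tag{24}
\]
with $\phi^*\psi : \mathcal{E} \to \mathcal{E}^*(L)$ an alternating map.

We need the following two lemmas in the proof of the theorem.

Lemma 6.2
Let $A$ be a Noetherian ring, let $X \subset \mathbb{P}^N_A$ be a locally closed subscheme, let $\mathcal{F}$, $\mathcal{G}$ be coherent sheaves on $X$, let $\mathcal{M}$ be a vector bundle on $X$, and let $p > 0$. (a) If $\xi \in \operatorname{Ext}^p_{\mathcal{O}_X}(\mathcal{F}, \mathcal{G})$, then there exists a vector bundle $\mathcal{E}$ on $X$ and a surjection $f : \mathcal{E} \twoheadrightarrow \mathcal{F}$ such that the pullback class $f^*\xi \in \operatorname{Ext}^p_{\mathcal{O}_X}(\mathcal{E}, \mathcal{G})$ vanishes. (b) If $\zeta \in \operatorname{Ext}^p_{\mathcal{O}_X}(\Lambda^2\mathcal{M}, \mathcal{G})$, then there exists a surjection of vector bundles $\mathcal{P} \twoheadrightarrow \mathcal{M}$ such that the pullback of $\zeta$ to $\operatorname{Ext}^p_{\mathcal{O}_X}(\Lambda^2\mathcal{P}, \mathcal{G})$ vanishes.

Proof
(a) Extending $\mathcal{F}$ to a coherent sheaf on the closure $\overline{X} \subset \mathbb{P}^N_A$ and then applying Serre’s Theorem A, we see that there exists a surjection of the form $g : \mathcal{O}_X(-n)^r \twoheadrightarrow \mathcal{F}$. Pulling back gives us a class $g^*\xi \in H^p(X, \mathcal{G}(n))^r$. Let $R$ be the homogeneous coordinate ring of $\overline{X}$, and let $I \subset R$ be the homogeneous ideal of strictly positive degree elements vanishing on the closed subset $\overline{X} \setminus X$. Extend $\mathcal{G}$ to a coherent sheaf on $\overline{X}$, and let $G$ be a finitely generated graded $R$-module whose associated sheaf is this extension of $\mathcal{G}$. Then $H^p_*(X, \mathcal{G})^r \cong H^{p+1}_I(G)^r$. Consequently, $g^*\xi$, as a member of a local cohomology module, is annihilated by some power $I^m$ of $I$. A finite set of homogeneous generators of $I^m$ gives surjections $\bigoplus_i R(-a_i) \twoheadrightarrow I^m$ and $\bigoplus_i \mathcal{O}_X(-a_i) \twoheadrightarrow \mathcal{O}_X$ such that the pullback of $g^*\xi$ along the induced map $\bigoplus_i \mathcal{O}_X(-a_i - n)^r \twoheadrightarrow \mathcal{O}_X(-n)^r$ vanishes. (b) For the same reasons as in part (a), $\zeta$ is killed by some power of $I$. For convenience we assume that the same $I^m$ as in part (a) kills $\zeta$. Let $\bigoplus_i \mathcal{O}_X(-a_i) \twoheadrightarrow \mathcal{O}_X$ be the surjection used in part (a). Then the surjection $\bigoplus_i \mathcal{M}(-a_i) \twoheadrightarrow \mathcal{M}$ kills $\zeta$ because the exterior square factors as $\Lambda^2\bigl(\bigoplus_i \mathcal{M}(-a_i)\bigr) \twoheadrightarrow \bigoplus_{i \le j} (\Lambda^2\mathcal{M})(-a_i - a_j) \twoheadrightarrow \Lambda^2\mathcal{M}$.

Lemma 6.3 (Serre; see [30, Lemma 5.1.2])
Let $A$ be a Noetherian local ring, and let $M$ be a finitely generated $A$-module of projective dimension at most 1. Suppose that $\zeta \in \operatorname{Ext}^1_A(M, A)$ corresponds to the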
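extension $0 \to A \to N \to M \to 0$. Then $N$ is a free $A$-module if and only if $\zeta$ generates the $A$-module $\operatorname{Ext}^1_A(M, A)$.

Proof of Theorem 6.1
By hypothesis, $\eta$ lifts to a class in $\operatorname{Ext}^2_{\mathcal{O}_X}(\mathcal{I}_Z, L)$. By Lemma 6.2(a) there exists a vector bundle $\mathcal{M}$ and a surjection and kernel $0 \to \mathcal{K} \to \mathcal{M} \to \mathcal{I}_Z \to 0$ such that $\eta$ lifts further to a class $\zeta \in \operatorname{Ext}^1_{\mathcal{O}_X}(\mathcal{K}, L)$. This defines an extension $0 \to L \to \mathcal{E} \to \mathcal{K} \to 0$. Attaching these extensions gives an acyclic complex
\[
0 \longrightarrow L \longrightarrow \mathcal{E} \longrightarrow \mathcal{M} \longrightarrow \mathcal{O}_X \longrightarrow \mathcal{O}_Z \longrightarrow 0. \tag{25}
\]
We claim that $\mathcal{E}$ is locally free. Our reasoning is as follows. Since the local projective dimension of $\mathcal{O}_Z$ is at most 3, the local projective dimension of $\mathcal{K}$ is at most 1. By Lemma 6.3, $\mathcal{E}$ is locally free if $\zeta$ generates the sheaf $\mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{K}, L)$. Moreover, the sheaves $\mathcal{O}_Z$, $\mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L)$, and $\mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{K}, L)$ are all isomorphic, and their respective global sections $1$, $\eta$, and $\zeta$ correspond under these isomorphisms:
\[
\begin{array}{ccc}
\zeta \in \operatorname{Ext}^1_{\mathcal{O}_X}(\mathcal{K}, L) & \longrightarrow & H^0\bigl(\mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{K}, L)\bigr) \\
\big\downarrow & & \big\downarrow{\scriptstyle \cong} \\
\eta \in \operatorname{Ext}^3_{\mathcal{O}_X}(\mathcal{O}_Z, L) & \xrightarrow{\ \cong\ } & H^0\bigl(\mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L)\bigr) \\
 & & \big\uparrow{\scriptstyle \cong} \\
 & & H^0(\mathcal{O}_Z) \ni 1
\end{array}
\]
Since $1$ generates $\mathcal{O}_Z$, the section $\zeta$ generates $\mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{K}, L)$. Thus $\mathcal{E}$ is locally free. The complex
\[
\mathcal{A} : 0 \longrightarrow L \longrightarrow \mathcal{E} \longrightarrow \mathcal{M} \longrightarrow \mathcal{O}_X \longrightarrow 0 \tag{26}
\]
is now a locally free resolution of $\mathcal{O}_Z$. As in [6] and [35], we try to make this into a commutative associative differential graded algebra resolution of $\mathcal{O}_Z$ by constructing a map $D_2(\mathcal{A}) \to \mathcal{A}$ from the divided square covering the identity in degree zero (the three leftmost vertical arrows are dotted):
\[
\begin{array}{ccccccccccc}
\cdots & \to & \mathcal{M}(L) \oplus D_2\mathcal{E} & \to & L \oplus (\mathcal{E} \otimes \mathcal{M}) & \to & \mathcal{E} \oplus \Lambda^2\mathcal{M} & \to & \mathcal{M} & \to & \mathcal{O}_X \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \| & & \| \\
\cdots & \to & 0 & \to & L & \to & \mathcal{E} & \to & \mathcal{M} & \to & \mathcal{O}_X
\end{array}
\tag{27}
\]
Now $\Lambda^2\mathcal{M}$ maps into the kernel $\mathcal{K}$ of $\mathcal{M} \to \mathcal{O}_X$. Hence the first problem in filling in the dotted arrows above is to carry out a lifting of $\Lambda^2\mathcal{M} \to \mathcal{K}$ to $\mathcal{E}$:
\[
\begin{array}{ccccccccc}
 & & & & & & \Lambda^2\mathcal{M} & & \\
 & & & & & \swarrow & \big\downarrow & & \\
0 & \to & L & \to & \mathcal{E} & \to & \mathcal{K} & \to & 0
\end{array}
\]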
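The obstruction to carrying out the lifting is a class $\zeta \in \operatorname{Ext}^1_{\mathcal{O}_X}(\Lambda^2\mathcal{M}, L)$. There is no reason for this class to vanish. So the liftings sought in (27) need not exist. But there is a way around this. By Lemma 6.2(b) there is a surjection from another vector bundle $\mathcal{G} \twoheadrightarrow \mathcal{M}$ such that the pullback of $\zeta$ to $\operatorname{Ext}^1_{\mathcal{O}_X}(\Lambda^2\mathcal{G}, L)$ vanishes. We now redo the construction of the complex and get commutative diagrams with exact rows and columns:
\[
\begin{array}{ccccccccc}
 & & & & 0 & & 0 & & \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & \mathcal{R} & = & \mathcal{R} & & \\
 & & & & \downarrow & & \downarrow & & \\
0 & \to & L & \to & \mathcal{F} & \to & \mathcal{K}' & \to & 0 \\
 & & \| & & \downarrow & \square & \downarrow & & \\
0 & \to & L & \to & \mathcal{E} & \to & \mathcal{K} & \to & 0 \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & 0 & & 0 & &
\end{array}
\qquad
\begin{array}{ccccccccc}
 & & 0 & & 0 & & & & \\
 & & \downarrow & & \downarrow & & & & \\
 & & \mathcal{R} & = & \mathcal{R} & & & & \\
 & & \downarrow & & \downarrow & & & & \\
0 & \to & \mathcal{K}' & \to & \mathcal{G} & \to & \mathcal{I}_Z & \to & 0 \\
 & & \downarrow & & \downarrow & & \| & & \\
0 & \to & \mathcal{K} & \to & \mathcal{M} & \to & \mathcal{I}_Z & \to & 0 \\
 & & \downarrow & & \downarrow & & & & \\
 & & 0 & & 0 & & & &
\end{array}
\]
This allows us to construct a new complex
\[
\mathcal{B} : 0 \longrightarrow L \longrightarrow \mathcal{F} \xrightarrow{\ \psi\ } \mathcal{G} \longrightarrow \mathcal{O}_X \longrightarrow 0.
\]
One sees easily that $\mathcal{R}$ and therefore $\mathcal{F}$ are also vector bundles. But this time the composite map $\Lambda^2\mathcal{G} \to \mathcal{K}' \to \mathcal{K}$ lifts to $\mathcal{E}$ since the obstruction is the class in $\operatorname{Ext}^1_{\mathcal{O}_X}(\Lambda^2\mathcal{G}, L)$ which we got to vanish using Lemma 6.2(b). Since the square marked with the $\square$ is cartesian, we get a lifting $\Lambda^2\mathcal{G} \to \mathcal{F}$. The other liftings
\[
\begin{array}{ccccccccccc}
\cdots & \to & \mathcal{G}(L) \oplus D_2\mathcal{F} & \to & L \oplus (\mathcal{F} \otimes \mathcal{G}) & \to & \mathcal{F} \oplus \Lambda^2\mathcal{G} & \to & \mathcal{G} & \to & \mathcal{O}_X \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \| & & \| \\
\cdots & \to & 0 & \to & L & \to & \mathcal{F} & \to & \mathcal{G} & \to & \mathcal{O}_X
\end{array}
\tag{28}
\]
now occur automatically. We therefore get a chain map $D_2\mathcal{B} \to \mathcal{B}$ that makes $\mathcal{B}$ into a commutative associative differential graded algebra with divided powers. We now claim that having this differential graded algebra structure gives us all the properties we want and puts us into the situation of Theorem 4.1. Indeed, as in [6], the multiplication gives pairings $\mathcal{B}_i \otimes \mathcal{B}_{3-i} \to \mathcal{B}_3 = L$ and therefore maps $\mathcal{B}_i \to \mathcal{B}_{3-i}^*(L)$. These maps are compatible with the differential, and, as a result, the following diagram commutes:
\[
\begin{array}{ccccccccccccc}
0 & \to & L & \to & \mathcal{F} & \xrightarrow{\ \psi\ } & \mathcal{G} & \to & \mathcal{O}_X & \to & \mathcal{O}_Z & \to & 0 \\
 & & \| & & \big\downarrow{\scriptstyle \phi} & & \big\downarrow{\scriptstyle \phi^*} & & \| & & \big\downarrow{\scriptstyle \eta} & & \\
0 & \to & L & \to & \mathcal{G}^*(L) & \xrightarrow{\ -\psi^*\ } & \mathcal{F}^*(L) & \to & \mathcal{O}_X & \to & \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_Z, L) & \to & 0
\end{array}
\]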
The top row is exact by construction, and the bottom row is exact because it is the dual of the top row which is a resolution of a sheaf of grade 3. Since $\eta$ is an isomorphism, one sees that
\[
0 \longrightarrow \mathcal{F} \xrightarrow{\ \binom{\psi}{\phi}\ } \mathcal{G} \oplus \mathcal{G}^*(L) \xrightarrow{\ (\phi^*\ \ \psi^*)\ } \mathcal{F}^*(L) \longrightarrow 0
\]
is exact. Thus $\mathcal{F}$ embeds in $\mathcal{G} \oplus \mathcal{G}^*(L)$ as a subbundle that is totally isotropic for the hyperbolic symmetric bilinear form on $\mathcal{G} \oplus \mathcal{G}^*(L)$. The subbundle $\mathcal{F}$ is even totally isotropic for the hyperbolic quadratic form since the restriction of this form to local sections of $\mathcal{F}$ is the function $e \mapsto \langle \phi(e), \psi(e) \rangle$, and this function vanishes because the composite map from diagram (28),
\[
D_2\mathcal{F} \longrightarrow \mathcal{F} \otimes \mathcal{G} \longrightarrow L, \qquad f \otimes f \longmapsto f \otimes \psi(f) \longmapsto \langle \phi(f), \psi(f) \rangle,
\]
factors through zero and hence vanishes identically. Thus $\mathcal{F}$ is a Lagrangian subbundle of $\mathcal{G} \oplus \mathcal{G}^*(L)$. This completes the proof.

7. Points in $\mathbb{P}^3$

In this and the following section we discuss several classes of examples which satisfy some or all of the conditions (A)–(D) of the definition of a strongly subcanonical subscheme; thus Theorems 0.2 and 6.1 may apply. Additional geometric applications and examples can be found in [14]. Okonek [29, p. 429] has shown that any reduced set of points in $\mathbb{P}^3$ is Pfaffian. By carefully analyzing the constructions of Theorem 6.1, we describe Pfaffian resolutions of locally Gorenstein zero-dimensional subschemes in $\mathbb{P}^3$ (see Remark 7.4). For a locally Gorenstein zero-dimensional subscheme $Z \subset \mathbb{P}^3_k$ over a field $k$, there are many isomorphisms $\eta : \mathcal{O}_Z \xrightarrow{\sim} \omega_Z(t)$. Which triples $(Z, \omega_{\mathbb{P}^3}(t), \eta)$ satisfy all the conditions of Definition 0.1, and which do not? In particular (and this is the only condition that causes trouble), when does the image of $\eta$ in $H^3(\mathbb{P}^3, \omega_{\mathbb{P}^3}(t))$ vanish? We use the following notation (see, e.g., [13]). Let $I \subset R := k[x_0, x_1, x_2, x_3]$ be the homogeneous ideal of $Z$, let $A := R/I$ be its homogeneous coordinate ring, and let $\omega_A := \operatorname{Ext}^3_R(A, R(-4))$ be its canonical module. Note that $\eta \in H^0_*(\omega_Z) \supset \omega_A$. Also, if $M$ is a graded $R$-module, then let $M^\vee$ be its dual as a graded $k$-vector space, endowed with the natural dual $R$-module structure.
Proposition 7.1
Let $Z \subset \mathbb{P}^3$ be a locally Gorenstein subscheme of dimension zero, and let $\eta : \mathcal{O}_Z \xrightarrow{\sim} \omega_Z(t)$ be an isomorphism. Then the triple $(Z, \omega_{\mathbb{P}^3}(t), \eta)$ is subcanonical and satisfies conditions (A)–(C) of Definition 0.1, and it satisfies condition (D) if and only if $\eta \in \omega_A$.

Proof
The map $\eta : \mathcal{O}_Z \to \omega_Z(t)$ may be identified with an element of $\operatorname{Ext}^3_{\mathcal{O}_{\mathbb{P}^3}}(\mathcal{O}_Z, \omega_{\mathbb{P}^3}(t)) \cong H^0(\mathcal{O}_Z(-t))^\vee$. The subscheme $Z \subset \mathbb{P}^3$ satisfies condition (D) for $\eta$ if and only if $\eta$ is in the image of $\operatorname{Ext}^2_{\mathcal{O}_{\mathbb{P}^3}}(\mathcal{I}_Z, \omega_{\mathbb{P}^3}(t)) \cong H^1(\mathcal{I}_Z(-t))^\vee$. Local duality and Serre duality give identifications
\[
\omega_A := \operatorname{Ext}^3_R(A, R(-4)) \cong H^1_{\mathfrak{m}}(A)^\vee \cong H^1_*(\mathcal{I}_Z)^\vee
\]
and $H^0_*(\omega_Z) \cong H^0_*(\mathcal{O}_Z)^\vee$ which are compatible with the inclusions. So $\eta$ satisfies Definition 0.1(D) if and only if $\eta \in \omega_A$.

Theorem 7.2
Let $Z \subset \mathbb{P}^3$ be a locally Gorenstein subscheme of dimension zero, and let $\eta \in H^0(\omega_Z(t))$. Suppose that (a) $\eta$ generates the sheaf $\omega_Z$, (b) $\eta \in \omega_A$, and (c) if $t = -2\ell$ is even, then the following nondegenerate symmetric bilinear form on $H^0(\mathcal{O}_Z(\ell))$ is metabolic (i.e., contains a Lagrangian subspace):
\[
H^0(\mathcal{O}_Z(\ell)) \times H^0(\mathcal{O}_Z(\ell)) \longrightarrow H^0(\mathcal{O}_Z(2\ell)) \xrightarrow{\ \eta\ } H^0(\omega_Z) \xrightarrow{\ \mathrm{tr}\ } k. \tag{29}
\]
Then there exists a locally free resolution
\[
0 \longrightarrow \mathcal{O}_{\mathbb{P}^3}(t-4) \longrightarrow \mathcal{F}^*(t-4) \xrightarrow{\ \psi\ } \mathcal{F} \longrightarrow \mathcal{O}_{\mathbb{P}^3} \longrightarrow \mathcal{O}_Z \longrightarrow 0 \tag{30}
\]
with $\psi$ alternating and $\mathcal{I}_Z$ generated by the submaximal Pfaffians of $\psi$ and such that the Yoneda extension class of (30) is $\eta \in \operatorname{Ext}^3_{\mathcal{O}_{\mathbb{P}^3}}(\mathcal{O}_Z, \mathcal{O}_{\mathbb{P}^3}(t-4)) \cong H^0(\omega_Z(t))$. Conversely, if there exists a locally free resolution of $\mathcal{O}_Z$ as in (30) with $\psi$ alternating, then its Yoneda extension class $\eta$ satisfies conditions (a), (b), and (c).

In order for the symmetric bilinear form (29) to be metabolic, it is necessary for $\deg(Z)$ to be even. If the base field $k$ is closed under square roots, this is also sufficient. In any case, the conditions of the theorem always hold if $t$ is large and odd and $\eta$ is general. This proves the following result, which was proven for reduced sets of points by Okonek [29, p. 429].

Corollary 7.3
A zero-dimensional subscheme of $\mathbb{P}^3$ is Pfaffian if and only if it is locally Gorenstein.
Proof of Theorem 7.2
We show how to start the proof. But we stop when we reach the point where it becomes identical to the proof of the main result of [35]. Suppose that $Z$, $t$, $\eta$ satisfy conditions (a), (b), and (c) of Theorem 7.2. Condition (a) implies that the map $\eta : \mathcal{O}_Z \to \omega_Z(t)$ is an isomorphism. So $\eta$ and Serre duality induce a symmetric perfect pairing
\[
H^0_*(\mathcal{O}_Z) \times H^0_*(\mathcal{O}_Z) \xrightarrow{\ \mathrm{mult}\ } H^0_*(\mathcal{O}_Z) \xrightarrow{\ \eta\ } H^0_*(\omega_Z(t)) \xrightarrow{\ \mathrm{tr}\ } k(t) \tag{31}
\]
that pairs $H^0(\mathcal{O}_Z(n))$ with $H^0(\mathcal{O}_Z(-n-t))$ for all $n$. Condition (c) implies that $H^0_*(\mathcal{O}_Z)$ contains a Lagrangian submodule $M$ for this symmetric perfect pairing. Indeed, if $t$ is odd, one can pick $M := \bigoplus_{n > -t/2} H^0(\mathcal{O}_Z(n))$. If $t$ is even, then there exists a Lagrangian subspace $W \subset H^0(\mathcal{O}_Z(-t/2))$, and one can pick $M := W \oplus \bigoplus_{n > -t/2} H^0(\mathcal{O}_Z(n))$. The two submodules $A \subset H^0_*(\mathcal{O}_Z)$ and $\omega_A \subset H^0_*(\omega_Z)$ are orthogonal complements of each other under the Serre duality pairing (see, e.g., [13]). Hence condition (b) (that $\eta \in \omega_A$) implies that $\eta A \subset \omega_A$ and therefore that $A = \omega_A^\perp \subset (\eta A)^\perp$. Now the orthogonal complement of $\eta A \subset H^0_*(\omega_Z)$ under the Serre duality pairing corresponds to the orthogonal complement of $A \subset H^0_*(\mathcal{O}_Z)$ under our pairing (31). So condition (b) implies that $A \subset A^\perp$. In other words, $A \subset H^0_*(\mathcal{O}_Z)$ is sub-Lagrangian. It now follows that there exists a Lagrangian submodule $L$ such that $0 \subset A \subset L = L^\perp \subset A^\perp \subset H^0_*(\mathcal{O}_Z)$. For instance, pick $L := A + (M \cap A^\perp)$ (cf. Knus [25, Lemma I.6.1.2]). One easily checks that $A_n = (A^\perp)_n = H^0(\mathcal{O}_Z(n))$ for $n \gg 0$ and that $A_n = (A^\perp)_n = 0$ for $n \ll 0$. Consequently, $A^\perp/A$ is of finite length. It has an induced nondegenerate symmetric bilinear form, and it has a Lagrangian submodule $L/A$. We now claim that we can construct a locally free resolution
\[
0 \longrightarrow \mathcal{O}_{\mathbb{P}^3}(t-4) \xrightarrow{\ \alpha\ } \mathcal{F}^*(t-4) \xrightarrow{\ \psi\ } \mathcal{F} \xrightarrow{\ \beta\ } \mathcal{O}_{\mathbb{P}^3} \longrightarrow \mathcal{O}_Z \longrightarrow 0
\]
with $\psi$ alternating and such that $H^1_*(\mathcal{F}) \cong L/A$, and $H^2_*(\mathcal{F}) = 0$. Moreover, $\beta$ induces a surjection $H^0_*(\mathcal{F}) \twoheadrightarrow H^0_*(\mathcal{I}_Z)$. Different pieces of the resolution contribute different pieces of the cohomology module $H^0_*(\mathcal{O}_Z)$. The submodule $A$ is contributed by $\operatorname{coker} H^0_*(\beta)$; the piece $L/A$ is contributed by $H^1_*(\mathcal{F})$; the piece $A^\perp/L$ is contributed by $H^2_*(\mathcal{F}^*(t-4))$; and the piece $H^0_*(\mathcal{O}_Z)/A^\perp$ is contributed by $\ker H^3_*(\alpha)$. The construction of this resolution and the verification of its properties can be done using the Horrocks correspondence by the same method as in [35]. It is quite long, and we omit the details.

Remark 7.4
The graded module $A^\perp/A$ above can be thought of as the “intermediate cohomology”
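or deficiency module of $(Z, t, \eta)$. To emphasize the dependence of this module on $\eta$, one could write it as $(\eta A)^\perp/A$, where $(\eta A)^\perp \subset H^0_*(\mathcal{O}_Z)$ means the orthogonal complement of $\eta A \subset H^0_*(\omega_Z)$ with respect to the Serre duality pairing. Now $(\eta A)^\perp/A$ is dual to $(\omega_A/\eta A)$, and it is also self-dual with a shift. Consequently, if $\eta \in \omega_A$ is of degree $t$, then the corresponding deficiency module is
\[
\frac{(\eta A)^\perp}{A} \cong \left(\frac{\omega_A}{\eta A}\right)^{\vee} \cong \frac{\omega_A}{\eta A}(t).
\]
In Theorem 7.2 we split the deficiency module in half, and we put a Lagrangian subhalf in $\mathcal{F}$ and the quotient half in $\mathcal{F}^*(t-4)$. The Pfaffian resolutions of $\mathcal{O}_Z$ are thus classified up to symmetric homotopy equivalence by pairs $(\eta, L/A)$ with $\eta \in \omega_A$ generating the sheaf $\omega_Z$, and with $L/A \subset (\eta A)^\perp/A$ a Lagrangian submodule. An alternative strategy for dealing with this deficiency module is to construct a diagram of the form of (19) in Theorem 4.1 (in which we write $\mathcal{O} := \mathcal{O}_{\mathbb{P}^3}$ to try to stay inside the margins):
\[
\begin{array}{ccccccccccc}
0 & \to & \mathcal{O}(t-4) & \to & \mathcal{G} & \xrightarrow{\ \psi\ } & \bigoplus \mathcal{O}(-a_i) & \to & \mathcal{O} & \to & \mathcal{O}_Z \\
 & & \| & & \big\downarrow{\scriptstyle \phi} & & \big\downarrow{\scriptstyle \phi^*} & & \| & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} \\
0 & \to & \mathcal{O}(t-4) & \to & \bigoplus \mathcal{O}(a_i + t - 4) & \xrightarrow{\ -\psi^*\ } & \mathcal{G}^*(t-4) & \to & \mathcal{O} & \to & \omega_Z(t)
\end{array}
\tag{32}
\]
with $\bigoplus \mathcal{O}(-a_i)$ corresponding to a minimal set of generators of the homogeneous ideal of $Z$, with $H^2_*(\mathcal{G}) \cong (\eta A)^\perp/A$, the deficiency module, and with $H^1_*(\mathcal{G}) = 0$. We now give examples both of Pfaffian resolutions as in (30) which split the deficiency module, and of resolutions as in (32) in the form of Theorem 4.1 which gather the deficiency module up in one piece.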
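Example 1: One point
Consider a single rational point $Q$. Its geometry is simple, but we can make its algebra complicated. The canonical module of $Q$ is $\omega_A \cong \bigoplus_{n \ge 1} H^0(\omega_Q(n))$. If we pick a nonzero $\eta$ of degree 1, then it generates $\omega_A$, and its deficiency module vanishes. The constructions described above both lead unsurprisingly to the Koszul resolution
\[
0 \longrightarrow \mathcal{O}_{\mathbb{P}^3}(-3) \longrightarrow \mathcal{O}_{\mathbb{P}^3}(-2)^{\oplus 3} \longrightarrow \mathcal{O}_{\mathbb{P}^3}(-1)^{\oplus 3} \longrightarrow \mathcal{O}_{\mathbb{P}^3} \longrightarrow \mathcal{O}_Q \longrightarrow 0.
\]
However, if we let $\eta \in \omega_A$ be a nonzero element of degree 2, then the deficiency module is $k$ concentrated in degree $-1$, and the construction (32) yields a diagram
\[
\begin{array}{ccccccccccc}
0 & \to & \mathcal{O}_{\mathbb{P}^3}(-2) & \to & \Omega^2_{\mathbb{P}^3}(1) & \to & \mathcal{O}_{\mathbb{P}^3}(-1)^{\oplus 3} & \to & \mathcal{O}_{\mathbb{P}^3} & \to & \mathcal{O}_Q \\
 & & \| & & \big\downarrow & & \big\downarrow & & \| & & {\scriptstyle\cong}\big\downarrow{\scriptstyle \eta} \\
0 & \to & \mathcal{O}_{\mathbb{P}^3}(-2) & \to & \mathcal{O}_{\mathbb{P}^3}(-1)^{\oplus 3} & \to & \Omega_{\mathbb{P}^3}(1) & \to & \mathcal{O}_{\mathbb{P}^3} & \to & \omega_Q(2)
\end{array}
\]
More generally, if we let $\eta \in \omega_A$ be a nonzero element of degree $t$, then the deficiency module is $\bigoplus_{n=-(t-1)}^{-1} H^0(\mathcal{O}_Q(n))$, and the construction (32) yields
\[
\begin{array}{ccccccccccc}
0 & \to & \mathcal{O}_{\mathbb{P}^3}(t-4) & \to & \mathcal{F}_t^*(t-4) & \to & \mathcal{O}_{\mathbb{P}^3}(-1)^{\oplus 3} & \to & \mathcal{O}_{\mathbb{P}^3} & \to & \mathcal{O}_Q \\
 & & \| & & \big\downarrow & & \big\downarrow & & \| & & \big\downarrow \\
0 & \to & \mathcal{O}_{\mathbb{P}^3}(t-4) & \to & \mathcal{O}_{\mathbb{P}^3}(t-3)^{\oplus 3} & \to & \mathcal{F}_t & \to & \mathcal{O}_{\mathbb{P}^3} & \to & \mathcal{O}_Q
\end{array}
\]
with $\mathcal{F}_t$ a rank 3 locally free sheaf which is the sheafification of the kernel of the presentation of the deficiency module
\[
0 \longrightarrow \mathcal{F}_t \longrightarrow \mathcal{O}_{\mathbb{P}^3} \oplus \mathcal{O}_{\mathbb{P}^3}(t-2)^{\oplus 3} \longrightarrow \mathcal{O}_{\mathbb{P}^3}(t-1) \longrightarrow 0.
\]
If one lets $\eta \in \omega_A$ be a nonzero element of degree 3, then applying the methods of Theorem 7.2 yields a resolution that one recognizes as the Koszul complex associated to the zero locus of a section of the rank 3 bundle $T_{\mathbb{P}^3}(-1)$:
\[
0 \longrightarrow \mathcal{O}_{\mathbb{P}^3}(-1) \longrightarrow \Omega^2_{\mathbb{P}^3}(2) \longrightarrow \Omega_{\mathbb{P}^3}(1) \longrightarrow \mathcal{O}_{\mathbb{P}^3} \longrightarrow \mathcal{O}_Q \longrightarrow 0.
\]

Example 2: Three points
If $Z$ is the union of three noncollinear rational points, then the module $\omega_A$ has two generators of degree zero, and $Z$ is not arithmetically Gorenstein. If we pick a general $\eta \in \omega_A$ of degree zero, then the deficiency module is $k$, concentrated in degree zero, and the construction (32) yields a diagram (in which we again write $\mathcal{O} := \mathcal{O}_{\mathbb{P}^3}$ in order to simplify the notation):
\[
\begin{array}{ccccccccccc}
0 & \to & \mathcal{O}(-4) & \to & \mathcal{O}(-3) \oplus \Omega^2_{\mathbb{P}^3} & \to & \mathcal{O}(-2)^{\oplus 3} \oplus \mathcal{O}(-1) & \to & \mathcal{O} & \to & \mathcal{O}_Z \\
 & & \| & & \big\downarrow & & \big\downarrow & & \| & & \big\downarrow \\
0 & \to & \mathcal{O}(-4) & \to & \mathcal{O}(-3) \oplus \mathcal{O}(-2)^{\oplus 3} & \to & \Omega_{\mathbb{P}^3} \oplus \mathcal{O}(-1) & \to & \mathcal{O} & \to & \mathcal{O}_Z
\end{array}
\]
If we pick a general $\eta \in \omega_A$ of degree 1, then the deficiency module $(\omega_A/\eta A)(1)$ is of length 4, concentrated in degrees zero and $-1$, and the methods of Theorem 7.2 yield a symmetric resolution (with alternating middle map $\psi$):
\[
0 \longrightarrow \mathcal{O}(-3) \longrightarrow \Omega^2_{\mathbb{P}^3}(1)^{\oplus 2} \oplus \mathcal{O}(-2) \xrightarrow{\ \psi\ } \Omega_{\mathbb{P}^3}^{\oplus 2} \oplus \mathcal{O}(-1) \longrightarrow \mathcal{O} \longrightarrow \mathcal{O}_Z \longrightarrow 0.
\]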
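As a consistency check on the last symmetric resolution, the ranks and first Chern classes of its terms can be added up. The computation below is a sketch: it reads the two middle terms as $\Omega^2_{\mathbb{P}^3}(1)^{\oplus 2} \oplus \mathcal{O}(-2)$ and $\Omega_{\mathbb{P}^3}^{\oplus 2} \oplus \mathcal{O}(-1)$, and it assumes the standard values of rank and $c_1$ for $\Omega_{\mathbb{P}^3}$ and $\Omega^2_{\mathbb{P}^3}$, none of which are stated in the text.

```latex
% Assumed standard facts on P^3 (not stated in the text):
%   rk \Omega = rk \Omega^2 = 3,  c_1(\Omega) = -4,  c_1(\Omega^2) = -8,
%   and c_1(E(k)) = c_1(E) + k \cdot rk(E).
\begin{align*}
\text{ranks:}\quad & 1 - 7 + 7 - 1 = 0, \qquad 7 = 2\cdot 3 + 1 \text{ is odd},\\
c_1\bigl(\Omega^2(1)^{\oplus 2}\oplus\mathcal{O}(-2)\bigr) &= 2(-8+3) - 2 = -12,\\
c_1\bigl(\Omega^{\oplus 2}\oplus\mathcal{O}(-1)\bigr) &= 2(-4) - 1 = -9,\\
c_1(\mathcal{O}) - (-9) + (-12) - c_1\bigl(\mathcal{O}(-3)\bigr) &= 0 + 9 - 12 + 3 = 0 .
\end{align*}
```

Both alternating sums vanish, as they must for a resolution of the structure sheaf of a zero-dimensional scheme, and the middle terms have odd rank 7, as an alternating map with a codimension 3 Pfaffian locus requires. The value $c_1 = -12$ also agrees with $c_1(\mathcal{F}^*(-3)) = 9 - 21$ for $\mathcal{F} = \Omega^{\oplus 2} \oplus \mathcal{O}(-1)$, matching the shape (30) with $t = 1$.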
8. Some weakly subcanonical subschemes

In this section we give some examples of weakly subcanonical subschemes. These are examples of subschemes $Z \subset X$ which satisfy conditions (A) and (B) of the definition of a strongly subcanonical subscheme but fail one or both of conditions (C) and (D). Thus the Serre construction (in codimension 2) and Theorem 6.1 (in codimension 3) fail for these subschemes.

A weakly subcanonical curve
We construct a subcanonical curve $C \subset \mathbb{P}^1 \times \mathbb{P}^n$ for $n \ge 2$ which fails the lifting condition (D) of Definition 0.1. Let $C$ be a nonsingular projective curve of genus 2 over an algebraically closed field $k$, let $P$ be one of its Weierstrass points, and let $D$ be a divisor of degree 4 on $C$. A base-point-free pencil in the linear system of divisors $|D|$ defines a map $f : C \to \mathbb{P}^1$, and a base-point-free net in $|D + P|$ defines a map $g : C \to \mathbb{P}^2$. Composing $g$ with a linear embedding $\mathbb{P}^2 \hookrightarrow \mathbb{P}^n$ gives a map $h : C \to \mathbb{P}^n$. Let $i := (f, h) : C \to \mathbb{P}^1 \times \mathbb{P}^n$. If the linear systems are chosen sufficiently generally, then $i$ is an embedding. The restriction to $C$ of a line bundle $\mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^n}(a, b)$ is $\mathcal{O}_C((a + b)D + bP)$. So the canonical bundle $\omega_C \cong \mathcal{O}_C(2P)$ is the restriction of $\mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^n}(-2, 2)$. If the class of $D - 4P$ in $\operatorname{Pic}^0(C)$ is not torsion, then $\mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^n}(-2, 2)$ is the only line bundle on $\mathbb{P}^1 \times \mathbb{P}^n$ whose restriction is $\omega_C$. Hence the subcanonical curve $C \subset \mathbb{P}^1 \times \mathbb{P}^n$ definitely fails such structure theorems as the Serre construction or Theorem 6.1 if the lifting condition (D) of Definition 0.1 fails for the isomorphism $\eta : \omega_C \cong \mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^n}(-2, 2)|_C$. By (1) this failure is equivalent to the nonvanishing of the composite map
\[
H^1\bigl(\mathbb{P}^1 \times \mathbb{P}^n, \mathcal{O}(-2, 2)\bigr) \xrightarrow{\ \mathrm{rest}\ } H^1\bigl(C, \mathcal{O}(-2, 2)|_C\bigr) \xrightarrow[\ \cong\ ]{\ \eta\ } H^1(C, \omega_C) \xrightarrow[\ \cong\ ]{\ \mathrm{tr}\ } k. \tag{33}
\]
Now the image of $g : C \to \mathbb{P}^2$ is a singular quintic plane curve. If we resolve the singularities, then $g$ factors as an embedding followed by the blowdown $C \hookrightarrow \widetilde{\mathbb{P}}^2 \to \mathbb{P}^2$. The composite map of (33) now factors through the diagram
\[
\begin{array}{ccc}
H^1\bigl(\mathbb{P}^1 \times \mathbb{P}^n, \mathcal{O}(-2, 2)\bigr) & & \\
\big\downarrow & & \\
H^1\bigl(\mathbb{P}^1 \times \mathbb{P}^2, \mathcal{O}(-2, 2)\bigr) \cong H^1\bigl(\mathbb{P}^1 \times \widetilde{\mathbb{P}}^2, \mathcal{O}(-2, 2)\bigr) \cong k^6 & \xrightarrow{\ \alpha\ } & k \cong H^1(C, \omega_C) \\
\big\downarrow{\scriptstyle \beta} & \nearrow{\scriptstyle \gamma} & \\
H^1\bigl(\mathbb{P}^1 \times C, \mathcal{O}(-2, 2(D + P))\bigr) \cong k^9 & &
\end{array}
\]
The lifting condition (D) fails if and only if $\alpha$ is surjective and hence if and only if $\operatorname{im}(\beta) \not\subset \ker(\gamma)$. Now $\gamma$ is part of the long exact sequence of cohomology for
\[
0 \longrightarrow \mathcal{O}_{\mathbb{P}^1 \times C}(-3, D + 2P) \longrightarrow \mathcal{O}_{\mathbb{P}^1 \times C}(-2, 2D + 2P) \longrightarrow \omega_C \longrightarrow 0.
\]
So $\gamma$ is surjective, and $\ker(\gamma) \subset k^9$ is a hyperplane. The map $g : C \to \mathbb{P}^2$ is defined using a 3-dimensional subspace $U_3 \subset V_4 := H^0(C, \mathcal{O}_C(D + P))$. The complete linear system embeds $C \hookrightarrow \mathbb{P}^3$ as a curve of degree 5 and genus 2 contained in a unique quadric surface $Q$. Then $\beta$ is the natural map from $S^2 U_3 \cong k^6$ to $S^2 V_4/\langle Q \rangle \cong k^9$. Now we have a range of choices for the subspace $U_3 \subset V_4$ which vary in a Zariski open subset of $\mathbb{P}^3 = \mathbb{P}(V_4^*)$. Hence we have a family of possible subspaces $S^2 U_3 \subset S^2 V_4$ whose different members are not all contained in any fixed hyperplane of $S^2 V_4$. So if we choose a general $U_3 \subset V_4$, then $S^2 U_3 = \operatorname{im}(\beta)$ is not contained in the hyperplane $\ker(\gamma) \subset S^2 V_4/\langle Q \rangle$. In that case, $C \subset \mathbb{P}^1 \times \mathbb{P}^n$ is a subcanonical curve that fails the lifting condition (D).

Singular points
Examples can easily be given of subcanonical subschemes $Z \subset X$ which are not covered by our construction because the finite projective dimension condition (C) of Definition 0.1 breaks down. This may happen at the same time that the lifting condition (D) breaks down, or it may happen independently. If (D) holds but (C) breaks down, $\mathcal{O}_Z$ still has resolutions fitting into diagrams such as (19) of Theorem 4.1, except that $\mathcal{E}$ or $\mathcal{F}$ or both are not locally free. For instance, if $X \subset \mathbb{P}^4$ is a singular hypersurface of degree $d$, and $P \in X$ is a singular point, then $P$ is indeed subcanonical, but condition (C) fails because $\mathcal{O}_P$ is of infinite local projective dimension over $\mathcal{O}_X$. There exist isomorphisms $\eta : \mathcal{O}_P \cong \mathcal{E}xt^3_{\mathcal{O}_X}(\mathcal{O}_P, \mathcal{O}_X(\ell))$ for all $\ell \in \mathbb{Z}$, but these satisfy condition (D) if and only if $\ell \ge d - 4$.
Similarly, if D is a line in P5 , and if Y ⊂ P5 is a hypersurface containing D which is singular in at least one point of D, then condition (C) fails for D ⊂ Y , but all the other conditions hold (since H 3 (Y, ωY (2)) = 0). So although D ⊂ Y may be obtained as a degeneracy locus of a pair of Lagrangian subsheaves of a twisted orthogonal bundle on Y , at least one of the Lagrangian subsheaves is not locally free. A nonseparated example We now give an example where there is no real choice about the η (because H 0 (ωZ ) = k and there are no twists), where conditions (A)–(C) hold, but where condition (D) fails. The real reason for the failure in this example is that we are doing something silly on a nonseparated scheme. But the interesting thing is that the cohomological obstruction (D) is able to detect our misbehavior.
Let $X$ be the nonseparated scheme consisting of two copies of $\mathbb{A}^3$ glued together along $\mathbb{A}^3 - \{0\}$. In other words, $X$ is $\mathbb{A}^3$ with the origin doubled up. Let $P \in X$ be one of the two origins. It is a subcanonical subscheme of $X$ of codimension 3 of finite local projective dimension; that is, it satisfies conditions (A)–(C) of Definition 0.1. We claim that it does not satisfy condition (D). The problem is to compute the map
\[ \operatorname{Ext}^3_{\mathcal{O}_X}(\mathcal{O}_P, \mathcal{O}_X) \to \operatorname{Ext}^3_{\mathcal{O}_X}(\mathcal{O}_X, \mathcal{O}_X) = H^3(X, \mathcal{O}_X). \tag{34} \]
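We use the following notation: $U', U'' \subset X$ are the two copies of $\mathbb{A}^3$; for $\alpha = 1, 2, 3$, let $U_\alpha \subset X$ be the open locus where $x_\alpha \neq 0$; let $U_{\alpha\beta} := U_\alpha \cap U_\beta$; and let $U_{123} := U_1 \cap U_2 \cap U_3$. For any inclusion $i$ of an affine open subscheme $U \subset X$, we denote by $i_! \mathcal{O}_U$ the extension by zero of $\mathcal{O}_U$ to all of $X$. We use the same letter $i_!$ whatever the $U$. Then $\mathcal{O}_X$ and $\mathcal{O}_P$ have resolutions of the form
[Diagram: resolutions by sheaves of the form $i_! \mathcal{O}_U$, with first term $i_! \mathcal{O}_{U_{123}}$; the remaining entries are not recoverable from the source.]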
$(r - 1)^2 (g - 1)$, resp.). If $h$ denotes the greatest common divisor of $r$ and $e$, our main result is then formulated as the following theorem.

Theorem. $|\mathcal{L}^p|$ is base point free on $SU_X(r, e)$ for
\[ p \ge \max\left\{ \frac{(r + 1)^2}{4r}\, h,\ \frac{r^2}{4s}\, h \right\}. \]

The number $s$ is an invariant associated to the space $SU_X(r, e)$, which is defined precisely in Section 4. We restrict here to saying that always $s \ge h$, so the bound in the theorem is at most quadratic in the rank $r$, no worse than $(r + 1)^2/4$. The theorem substantially improves the existing bounds mentioned above. Most importantly, beyond the concrete numerology the main improvement is that the present bound is independent of the genus of the curve, as it is natural to expect. The idea is again to
HILBERT SCHEMES AND EFFECTIVE BASE POINT FREENESS
471
make effective the method of G. Faltings [6], but the technique involved is substantially different from those used in [12] or [8], the key ingredient being precisely the result on Hilbert schemes described above. Two cases that have traditionally been under intensive study are that of degree zero bundles and that of bundles of degree $\pm 1$ (or, more generally, $e \equiv 0 \pmod{r}$ and $e \equiv \pm 1 \pmod{r}$). In the first case the bound that we obtain is quadratic in $r$, but somewhat surprisingly in the second case it is linear.

Corollary. (i) $|\mathcal{L}^p|$ is base point free on $SU_X(r)$ for $p \ge (r + 1)^2/4$. (ii) $|\mathcal{L}^p|$ is base point free on $SU_X(r, 1)$ and $SU_X(r, -1)$ for $p \ge r - 1$.

In fact, in the first case one can do slightly better for $r$ even (see Section 4). Furthermore, making use of the notion of Verlinde bundles introduced in [18], we find similar effective bounds for the base point freeness of linear series on $U_X(r, e)$ (see also [18, Sections 5 and 6]). The result (Theorem 5.3) can be formulated as the following theorem.

Theorem. Let $F$ be a vector bundle of rank $r/h$ and degree $(r/h)(g - 1) - e/h$ on $X$, and let $\Theta_F$ be the corresponding generalized theta divisor on $U_X(r, e)$. Then $|p\Theta_F|$ is base point free for
\[ p \ge \max\left\{ \frac{(r + 1)^2}{4r}\, h,\ \frac{r^2}{4s}\, h \right\}. \]

In fact, this statement is a special case of a result about linear series of a more general type (cf. Theorem 5.9). In a different direction we follow Le Potier's idea in [12, Section 3] to observe that the bound given in the main theorem improves substantially the analogous bound for the global generation of multiples of the Donaldson determinant line bundle on moduli spaces of semistable sheaves on surfaces, the independence of the genus being again the crucial fact. In particular, this bounds the dimension of a projective space that is an ambient space for an embedding of the moduli space of $\mu$-semistable sheaves.
In the case of rank 2 (and degree zero) sheaves, this space in turn is known to be homeomorphic to the Donaldson-Uhlenbeck compactification of the moduli space of anti-self-dual connections in gauge theory.

Theorem. Let $(X, \mathcal{O}_X(1))$ be a polarized smooth projective surface, and let $L$ be a line bundle
472
MIHNEA POPA
on $X$. Let $M = M_X(r, L, c_2)$ be the moduli space of semistable sheaves of rank $r$, fixed determinant $L$, and second Chern class $c_2$ on $X$, and denote $n = \deg(X) = \mathcal{O}_X(1)^2$ and $d = n[r^2/2]$. If $\mathcal{D}$ is the Donaldson determinant line bundle on $M$, then $\mathcal{D}^{\otimes p}$ is globally generated for $p \ge d \cdot (r + 1)^2/4$ divisible by $d$.

The paper is organized as follows. In Section 1 we review a few basic facts about generalized theta divisors on moduli spaces of vector bundles in the context of our problem. In Section 2 we turn our attention to the dimension bounds for Hilbert schemes of coherent quotients of a given vector bundle. This section is of a somewhat different flavor from the rest of the paper and can be read independently. Section 3 treats the special case of rank 2 vector bundles. We prove a well-known result of M. Raynaud [20] by a method intended to be a toy version of the proof of the general theorem in the subsequent section. Section 4 contains the proof of the main base point freeness result on $SU_X(r, e)$, while Section 5 treats the case of linear series on $U_X(r, e)$. There we also formulate some questions about optimal bounds in arbitrary rank, which, for example, in degree zero follow from our results in the case of rank 2 and rank 3 vector bundles. Section 6 is devoted to a brief treatment of the abovementioned application to moduli spaces of sheaves on surfaces.

1. Background

The underlying idea for studying linear series on the moduli space $SU_X(r, e)$ has its roots in the paper of Faltings [6], where a construction of the moduli space based on theta divisors is given. A very nice introduction to the subject is provided in [2]. Fix $r$ and $e$, and denote $h = \gcd(r, e)$, $r_1 = r/h$, and $e_1 = e/h$. Consider a vector bundle $F$ of rank $pr_1$ and degree $p(r_1(g - 1) - e_1)$. Generically such a choice determines (cf. [5, Section 0.2]) a theta divisor $\Theta_F$ on $SU_X(r, e)$, supported on the set
\[ \Theta_F = \{ E \mid h^0(E \otimes F) \neq 0 \}. \]
All the divisors $\Theta_F$ for $F \in U_X(pr_1, p(r_1(g - 1) - e_1))$ belong to the linear system $|\mathcal{L}^p|$, where $\mathcal{L}$ is the determinant line bundle. We have the following well-known lemma.

Lemma 1.1. $E \in SU_X(r, e)$ is not a base point for $|\mathcal{L}^p|$ if there exists a vector bundle $F$ of rank $pr_1$ and degree $p(r_1(g - 1) - e_1)$ such that $h^0(E \otimes F) = 0$.

It is easy to see that such an $F$ must necessarily be semistable (cf. [12, Lemma 2.5]). It is also a simple consequence of the existence of Jordan-Hölder filtrations that one has to check the condition in Lemma 1.1 only for $E$ stable. We sketch the proof for convenience.
Lemma 1.2. If for any stable bundle $V$ of rank $r' \le r$ and slope $e/r$ there exists $F \in U_X(pr_1, p(r_1(g - 1) - e_1))$ such that $h^0(V \otimes F) = 0$, then the same is true for every $E \in SU_X(r, e)$.

Proof. Assume that $E$ is strictly semistable. Then it has a Jordan-Hölder filtration $0 = E_0 \subset E_1 \subset \cdots \subset E_n = E$ such that $E_i/E_{i-1}$ are stable for $i \in \{1, \ldots, n\}$ and $\mu(E_i/E_{i-1}) = e/r$. By assumption there exist $F_i \in U_X(pr_1, p(r_1(g - 1) - e_1))$ such that $h^0(E_i/E_{i-1} \otimes F_i) = 0$, and so if we denote
\[ \Theta_{E_i/E_{i-1}} := \{ F \mid h^0(E_i/E_{i-1} \otimes F) \neq 0 \} \subset U_X(pr_1, p(r_1(g - 1) - e_1)), \]
this is a proper subset for every $i$. It is clear that any
\[ F \in U_X(pr_1, p(r_1(g - 1) - e_1)) - \bigcup_{i=1}^{n} \Theta_{E_i/E_{i-1}} \]
satisfies $h^0(E \otimes F) = 0$.

We also record a simple lemma that is useful in Section 4. It is most certainly well known, but we sketch the proof for convenience (cf. also [21, Lemma 1.1]).

Lemma 1.3. Consider a sheaf extension $0 \to F \to E \to G \to 0$. If $E$ is stable, then $h^0(G^* \otimes F) = 0$.

Proof. Assuming the contrary, there is a nonzero morphism $G \to F$. Composing this with the maps $E \to G$ to the left and $F \to E$ to the right, we obtain a nontrivial endomorphism of $E$, which contradicts the stability assumption.

As a final remark, note that we are always slightly abusing the notation by using vector bundles instead of S-equivalence classes. This is harmless since it is easily seen that it is enough to check the assertions for any representative of the equivalence class.

2. An upper bound on the dimension of Hilbert schemes

The goal of this section is to prove a result (see Theorem 2.2) giving an upper bound
on the dimension of the Hilbert schemes of coherent quotients of fixed rank and degree of a given vector bundle, optimal at least in the case corresponding to line subbundles. For the general theory of Hilbert schemes the reader can consult, for example, [12, Section 4]. Concretely, fix a vector bundle $E$ of rank $r$ and degree $e$ on $X$, and denote by $\mathrm{Quot}_{r-k,e-d}(E)$ the Hilbert scheme of coherent quotients of $E$ of rank $r - k$ and degree $e - d$. We can (and do) identify $\mathrm{Quot}_{r-k,e-d}(E)$ with the set of subsheaves of $E$ of rank $k$ and degree $d$. Consider also
\[ d_k := \max\{ \deg(F) \mid F \subset E,\ \operatorname{rk}(F) = k \} \]
and
\[ M_k(E) = \{ F \mid F \subset E,\ \operatorname{rk}(F) = k,\ \deg(F) = d_k \}, \]
the set of maximal subbundles of rank $k$. Clearly any $F \in M_k(E)$ has to be a vector subbundle of $E$. Note that the number $e - d_k$ is exactly the minimal degree of a quotient bundle of $E$ of rank $r - k$. By [14, Section 2] we have the following basic result.

Proposition 2.1. The following hold and are equivalent: (i) $\dim M_k(E) \le k(r - k)$; (ii) for any $x \in X$ and any $k$-dimensional subspace $W \subset E(x)$ of the fiber of $E$ at $x$, there are at most finitely many $F \in M_k(E)$ such that $F(x) = W$.

Part (i) above thus gives an upper bound for the dimension of the Hilbert scheme in the case $d = d_k$. The next result is a generalization in the case of arbitrary degree $d$, which turns out to give an optimal result (see Example 2.13).

Theorem 2.2. With the notation above we have
\[ \dim \mathrm{Quot}_{r-k,e-d}(E) \le k(r - k) + (d_k - d)\,k(r - k + 1). \]

Remark 2.3. To avoid any confusion we emphasize here that the notation is slightly different from that used in the introduction in the sense that we are replacing $k$ by $r - k$, $d$ by $e - d$, and $f_k$ by $e - d_k$. This is done for consistency in rewriting everything in terms of subbundles, but note that the statement is exactly the same.

The proof proceeds by induction on the difference $d_k - d$. In order to perform this induction we have to use a special case of the notion of elementary transformation
along a zero-dimensional subscheme of arbitrary length. We call this construction simply elementary transformation since there is no danger of confusion.

Definition 2.4. Let $\tau$ be a zero-dimensional subscheme of $X$ supported on the points $P_1, \ldots, P_s$. An elementary transformation of $E$ along $\tau$ is a vector bundle $E'$ defined by a sequence of the form
\[ 0 \to E' \to E \xrightarrow{\ \phi\ } \tau \to 0, \]
where the morphism $\phi$ is determined by giving surjective maps $\phi_i : E_{P_i} \to \mathbb{C}^{a_i}_{P_i}$ induced by specifying $a_i$ distinct hyperplanes in $E(P_i)$ (whose intersection is the kernel of $\phi_i$), $\forall i \in \{1, \ldots, s\}$. We call $m = a_1 + \cdots + a_s$ the length of $\tau$, and we call $a_i$ the weight of $P_i$.

Let us briefly remark that this is not the most general definition since we are imposing a condition on the choice of hyperplanes. We prefer to work with this notion because it is sufficient for our purposes and allows us to avoid some technicalities. Note, though, that the space parametrizing these transformations is not compact. One could equally well work with the general definition, when the hyperplanes could come together, and obtain a compact parameter space, which can be shown to be irreducible. (It is basically a Hilbert scheme of rank zero quotients of fixed length.) In fact, it is an immediate observation that the elementary transformations of $E$ of length $m$, in the sense of the definition above, are parametrized by $Y := (\mathbb{P}E)_m - \Delta$, where $(\mathbb{P}E)_m$ is the $m$th symmetric product of the projective bundle $\mathbb{P}E$ and $\Delta$ is the union of all its diagonals. There is an obvious forgetful map $\pi : Y \to X_m$, where $X_m$ is the $m$th symmetric product of the curve $X$. We denote by $Y_m \subset (\mathbb{P}E)_m - \Delta$ the open subset $(\mathbb{P}E)_m - \pi^{-1}(\delta)$, where $\delta$ is the union of the diagonals in $X_m$. Its points correspond to the elementary transformations of length $m$ supported at $m$ distinct points of $X$.

Definition 2.5. Let $V$ be a subbundle of $E$. An elementary transformation
\[ 0 \to E' \to E \xrightarrow{\ \phi\ } \tau \to 0 \]
is said to preserve $V$ if the inclusion $V \subset E$ factors through the inclusion $E' \subset E$.

Lemma 2.6. If $E'$ is determined by the hyperplanes $H_i^1, \ldots, H_i^{a_i} \subset E(P_i)$ for $i \in \{1, \ldots, s\}$ and
\[ V_i := \bigcap_{j=1}^{a_i} H_i^j, \]
then $V$ is preserved by $E'$ if and only if $V(P_i) \subset V_i$, $\forall i \in \{1, \ldots, s\}$.
Proof. We have an induced diagram
\[
\begin{array}{ccccccccc}
 & & & & V & & & & \\
 & & & & \downarrow & \searrow^{\alpha} & & & \\
0 & \to & E' & \to & E & \xrightarrow{\ \phi\ } & \tau & \to & 0
\end{array}
\]
where $\alpha$ is the composition of $\phi$ with the inclusion of $V$ in $E$. It is clear that $E'$ preserves $V$ if and only if $\alpha$ is identically zero. The lemma follows then easily from the definitions.

In general, it is important to know the dimension of the set of elementary transformations of a certain type preserving a given subbundle. This is given by the following simple lemma.

Lemma 2.7. Let $V \subset E$ be a subbundle of rank $k$. Consider the set of elementary transformations of $E$ along a zero-dimensional subscheme of length $m$ belonging to an irreducible subvariety $W$ of $X_m - \delta$ that preserve $V$:
\[ Z_V := \{ E' \mid V \subset E' \} \subset \pi^{-1}(W), \]
where $\pi : Y_m \to X_m$ is the natural projection. Then $Z_V$ is irreducible of dimension $m(r - k - 1) + \dim W$.

Proof. An elementary transformation at $m$ points $x_1, \ldots, x_m$ is given by a choice of hyperplanes $H_i \subset E(x_i)$ for each $i$. By the previous lemma such a transformation preserves $V$ if and only if $V(x_i) \subset H_i$ for all $i$. We have a natural diagram
\[
\begin{array}{ccc}
Z_V & \xrightarrow{\ i\ } & \pi^{-1}(W) \\
 & {\scriptstyle p}\searrow & \downarrow{\scriptstyle \pi} \\
 & & W
\end{array}
\]
where $\pi$ is the restriction to $\pi^{-1}(W)$ and $p$ is the composition of $\pi$ with the inclusion $i$ of $Z_V$ in $\pi^{-1}(W)$. For $D = x_1 + \cdots + x_m \in W$, we have
\[ p^{-1}(D) \cong \{ (H_1, \ldots, H_m) \mid V(x_i) \subset H_i \subset E(x_i),\ \forall i = 1, \ldots, m \} \cong \mathbb{P}^{r-k-1} \times \cdots \times \mathbb{P}^{r-k-1}, \]
where the product is taken $m$ times. So $p^{-1}(D)$ is irreducible of dimension $m(r - k - 1)$, and this gives that $Z_V$ is irreducible of dimension $m(r - k - 1) + \dim W$.

The following proposition is the key step in running the inductive argument. It computes "how fast" we can eliminate all the maximal subbundles of $E$ while preserving a fixed nonmaximal subbundle.

Proposition 2.8. Let $V \subset E$ be a subbundle of rank $k$ and degree $d$, not maximal. Then if $m \ge r - k + 1$, there exists an elementary transformation of length $m$,
\[ 0 \to E' \to E \to \tau \to 0, \]
such that $V \subset E'$, but $F \not\subset E'$ for any $F \in M_k(E)$. In other words, $E'$ preserves $V$ but does not preserve any maximal $F$. If we fix a point $P \in X$, then $\tau$ can be chosen to have weight $m - 1$ at $P$ and weight 1 at a generic point $Q \in X$.

Proof. Fix a point $P \in X$. We can consider an elementary transformation of $E$ of length $r - k$, supported only at $P$,
\[ 0 \to E' \to E \to \tau \to 0, \]
such that $\operatorname{Im}(E'(P) \to E(P)) = V(P)$. Then, as in Lemma 2.6, the only maximal subbundles $F$ that are preserved by this transformation are exactly those such that $F(P) = V(P)$. By Proposition 2.1 this implies that at most a finite number of $F$'s can be preserved. If none of the maximal subbundles actually survive in $E'$, then any further transformation at one point would do. Otherwise, clearly for a generic $Q \in X$ we have $F(Q) \neq V(Q)$ for all the $F$'s that are preserved, and so we can choose a hyperplane $V(Q) \subset H \subset E(Q)$ such that $F(Q) \not\subset H$ for any such $F$. The elementary transformation of $E'$ at $Q$ corresponding to this hyperplane satisfies then the required property.

Remark 2.9. (1) It can definitely happen that all the maximal subbundles are killed by elementary transformations of length less than $r - k + 1$ which preserve $V$. In any case, as already suggested in the proof above, performing further elementary transformations obviously does not change the property that we are interested in, so $r - k + 1$ is a bound that works in all situations.
(2) By Lemma 2.7 the set $Z_V$ of all elementary transformations of length $m$ preserving $V$ is irreducible of dimension $m(r - k)$. On the other hand, the condition of preserving at least one maximal subbundle is closed, so once the proposition above is true for one elementary transformation, it applies for an open subset of $Z_V$.

Finally we have all the ingredients necessary to prove the theorem. To simplify the formulations it is convenient to introduce the following ad hoc definition.

Definition 2.10. An irreducible component $Q \subset \mathrm{Quot}_{r-k,e-d}(E)$ is called nonspecial if its generic point corresponds to a locally free quotient of $E$ and special if all its points correspond to nonlocally free quotients. For any $Q$, denote by $Q_0$ the open subset parametrizing locally free quotients, and consider
\[ \mathrm{Quot}^0_{r-k,e-d}(E) := \bigcup_{Q} Q_0. \]

Proof of Theorem 2.2. Denote by $Q$ an irreducible component of $\mathrm{Quot}_{r-k,e-d}(E)$. (Recall that we are thinking now of this Hilbert scheme as parametrizing subsheaves of rank $k$ and degree $d$.) The first step is to observe that it is enough to prove the statement when $Q$ is nonspecial. To see this, note that every nonsaturated subsheaf $F \subset E$ determines a commutative diagram with exact rows
\[
\begin{array}{ccccccccc}
0 & \to & F & \to & E & \to & G & \to & 0 \\
 & & \downarrow & & \| & & \downarrow & & \\
0 & \to & F' & \to & E & \to & G' & \to & 0
\end{array}
\]
in which $F \to F'$ is injective with cokernel $\tau$ and $G \to G'$ is surjective with kernel $\tau$. Here $F'$ is the saturation of $F$, $G'$ is a quotient vector bundle, and $\tau$, the torsion subsheaf of $G$, is a nontrivial zero-dimensional subscheme, say, of length $a$. We can stratify the set of all such $F$'s according to the value of the parameter $a$, which obviously runs over a finite set. If we denote by $\{F\}_a$ the subset corresponding to a fixed $a$, this then gives
\[ \dim \{F\}_a \le \dim \mathrm{Quot}^0_{r-k,e-d-a}(E) + ka. \]
The right-hand side is clearly less than $k(r - k) + (d_k - d)\,k(r - k + 1)$ if we assume that the statement of the theorem holds for $\mathrm{Quot}^0_{r-k,e-d-a}(E)$. Let us then restrict to the case when $Q$ is a nonspecial component. The proof goes by induction on $d_k - d$. If $d_k = d$, the statement is exactly the content of Proposition 2.1. Assume now that $d_k > d$ and that the statement holds for all the pairs where this difference is smaller. Recall that $Q_0 \subset Q$ denotes the open subset corresponding to vector subbundles, and fix $V \in Q_0$. Then by Proposition 2.8 there exists an elementary transformation
\[ 0 \to E' \to E \to \tau \to 0 \]
of length $r - k + 1$ such that $V \subset E'$, but $F \not\subset E'$ for any $F \in M_k(E)$. Then necessarily $d_k(E') < d_k(E) = d_k$ (consider the saturation in $E$ of a maximal subbundle of $E'$), and so $d_k(E') - d < d_k - d$. This means that we can apply the inductive hypothesis for any nonspecial component of the set of subsheaves of rank $k$ and degree $d$ of $E'$. To this end, consider the correspondence
\[ W = \{ (V, E') \mid V \subset E',\ F \not\subset E',\ \forall F \in M_k(E) \} \subset Q_0 \times Y_{r-k+1}, \]
with its two projections $p_1 : W \to Q_0$ and $p_2 : W \to Y_{r-k+1}$. By Lemma 2.7 and Remark 2.9(2), for any $V \in Q_0$, the fiber $p_1^{-1}(V)$ is a (quasi-projective irreducible) variety of dimension $(r - k + 1)(r - k)$, and so
\[ \dim W = \dim Q_0 + (r - k + 1)(r - k). \tag{1} \]
On the other hand, for $E' \in \operatorname{Im}(p_2)$, the inductive hypothesis implies that
\[ \dim p_2^{-1}(E') \le k(r - k) + (d_k(E') - d)\,k(r - k + 1) \le k(r - k) + (d_k - d - 1)\,k(r - k + 1), \]
and since $\dim Y_{r-k+1} = r(r - k + 1)$, we have
\[ \dim W \le r(r - k + 1) + k(r - k) + (d_k - d - 1)\,k(r - k + 1). \tag{2} \]
Combining (1) and (2), we get
\[ \dim Q_0 \le k(r - k) + (d_k - d)\,k(r - k + 1), \]
and of course the same holds for $Q = \overline{Q_0}$. This completes the proof.

The formulation and the proof of the theorem give rise to a few natural questions, and we address them in the following examples.
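
The arithmetic behind "combining (1) and (2)" can be written out in full. The following display is only a sketch of that step, using nothing beyond (1), (2), and the notation of the proof:

```latex
\begin{align*}
\dim Q_0 &= \dim W - (r-k+1)(r-k) \\
         &\le r(r-k+1) - (r-k)(r-k+1) + k(r-k) + (d_k - d - 1)\,k(r-k+1) \\
         &= k(r-k+1) + k(r-k) + (d_k - d - 1)\,k(r-k+1) \\
         &= k(r-k) + (d_k - d)\,k(r-k+1).
\end{align*}
```

The cancellation $r(r-k+1) - (r-k)(r-k+1) = k(r-k+1)$ is what supplies the one extra unit of $d_k - d$ needed to close the induction.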
Example 2.11. It is easy to construct special components of Hilbert schemes. For example, consider for any $X$ the Hilbert scheme of quotients of $\mathcal{O}_X^{\oplus 2}$ of rank 1 and degree 1. There certainly exist such quotients that have torsion, such as
\[ \mathcal{O}_X^{\oplus 2} \to \mathcal{O}_X \oplus \mathcal{O}_P \to 0, \]
where $P$ is any point of $X$, but for obvious cohomological reasons there can be no sequence of the form
\[ 0 \to L^{-1} \to \mathcal{O}_X^{\oplus 2} \to L \to 0 \]
with $\deg(L) = 1$. So in this case there are actually no nonspecial components.

Example 2.12. Going one step further, there may exist special components whose dimension is greater than that of any of the nonspecial ones. Note, though, that the proof shows that in this case the bound cannot be optimal. To see an example, consider quotients of $\mathcal{O}_X^{\oplus 2}$ of rank 1 and degree $d$ with $1 \le d \le g - 2$ on a nonhyperelliptic curve $X$. Any such locally free quotient $L$ gives a sequence
\[ 0 \to L^{-1} \to \mathcal{O}_X^{\oplus 2} \to L \to 0, \]
and so the dimension of any component of the Hilbert scheme containing it is bounded above by $h^0(L^{\otimes 2})$. Now Clifford's theorem says that $h^0(L^{\otimes 2}) \le d + 1$, but our choices make the equality case impossible, so in fact $h^0(L^{\otimes 2}) \le d$. On the other hand, consider an effective divisor $D$ of degree $d$. Then a point in the same Hilbert scheme is determined by a natural sequence
\[ 0 \to \mathcal{O}_X(-D) \to \mathcal{O}_X^{\oplus 2} \to \mathcal{O}_X \oplus \mathcal{O}_D \to 0, \]
and it is not hard to see that the dimension of the Hilbert scheme at this point is equal to $d + 1$. (Essentially $d$ parameters come from $D$, and one comes from the sections of $\mathcal{O}_X^{\oplus 2}$.) This gives then a special component whose dimension is greater than that of any nonspecial one.

Example 2.13. More significantly, the bound given in the theorem is optimal. Consider for this a line bundle $L$ of degree 4 on a curve $X$ of genus 2 and a generic extension
\[ 0 \to \mathcal{O}_X \to E \to L \to 0. \]
By standard arguments one can see that such an extension must be stable. Since $\mu(E) = 2$, by the classical theorem of M. Nagata [15] we get that $d_1(E) := \max_{M \subset E} \deg(M) = 1$, and so, for the sequence above, $d_1 - d = 1$. The theorem then tells us that the dimension of any component of the Hilbert scheme containing the given quotient is bounded above by 3. But on the other hand, $h^1(L) = 0$, so this gives a smooth point and the dimension of the component is $h^0(L)$, which by Riemann-Roch is exactly 3.

This example turns out to be a special case of a general pattern, as suggested by M. Teixidor. In fact, in [21] it is shown that whenever $E$ is a generic stable bundle, the invariant $d_k$ is the largest integer $d$ that makes the expression $ke - rd - k(r - k)(g - 1)$ nonnegative (cf. also [4]). Also, the dimension of the Hilbert scheme can be computed exactly in this case (see [21, Theorem 0.2]), and, for example, under the numerical assumptions above it is precisely equal to 3. Thus in fact for every generic stable bundle of rank 2 and degree 4 on a curve of genus 2, we have equality in the theorem. Much more generally, it can be seen that for any $r$ and $g$ equality is satisfied for a generic stable bundle as long as $d_1$ satisfies a certain numerical condition.

Sketch of an alternative proof of Theorem 2.2

The proof of Theorem 2.2 can be slightly modified toward a more natural and compact form. We choose to follow the longer approach because it emphasizes very clearly what the phenomenon involved is, but below we would also like to briefly sketch this parallel argument, which grew out of conversations with I. Coandă. We use the same notation as above. There exists a natural specialization map
\[ X \times \mathrm{Quot}^0_{r-k,e-d}(E) \to G_{r-k}(E), \]
where $G_{r-k}(E)$ is the Grassmann bundle of $(r - k)$-dimensional quotients of the fibers of $E$. Of course in the case $d = d_k$, $\mathrm{Quot}^0_{r-k,e-d}(E)$ is compact, and the morphism above is finite. Fix now $P \in X$ and $w \in G_{r-k}(E(P))$ a point corresponding to a quotient $E(P) \to W \to 0$. The choice of $P$ determines a map
\[ \phi : \mathrm{Quot}^0_{r-k,e-d}(E) \to G_{r-k}(E(P)), \]
and we would like to bound the dimension of $\phi^{-1}(w)$.
There is a natural induced sequence
\[ 0 \to F \to E \to W \otimes \mathbb{C}_P \to 0, \]
and it is not hard to see that $\phi^{-1}(w)$ embeds as an open subset in $\mathrm{Quot}^0_{r-k,e-d-k}(F)$. Every locally free quotient of $F$ has degree greater than or equal to $e - d_k - k$, and there are at most a finite number of quotients having precisely this degree. (They come exactly from the minimal degree quotients of $E$ having fixed fiber $W$ at $P$.) Let $G_1, \ldots, G_m$ be these quotients, sitting in exact sequences
\[ 0 \to F_i \to F \to G_i \to 0. \]
The variety $Y := \mathbb{P}F - \bigcup_{i=1}^{m} \mathbb{P}G_i$ parametrizes then the one-point elementary transformations of $F$ that do not preserve any of the $F_i$'s. Consider the natural incidence
\[ Z \subset \mathrm{Quot}^0_{r-k,e-d-k}(F) \times Y \]
parametrizing the pairs $(F \to Q \to 0,\ F')$, where $F'$ is an elementary transformation in $Y$, and $Q$ is not preserved as a quotient of $F'$. (In other words, the corresponding kernel is preserved.) The fiber of $Z$ over $F \to Q \to 0$ is isomorphic to $\mathbb{P}Q \cap Y$, and so
\[ \dim Z = \dim \mathrm{Quot}^0_{r-k,e-d-k}(F) + r - k. \]
On the other hand, the fiber of $Z$ over $F' \in Y$ is $\mathrm{Quot}^0_{r-k,e-d-k-1}(F')$. Now for $F'$ the minimal degree of a quotient of rank $r - k$ is smaller, and hence inductively as before
\[ \dim \mathrm{Quot}^0_{r-k,e-d-k-1}(F') \le k(r - k) + (d_k - d - 1)\,k(r - k + 1). \]
This immediately implies that
\[ \dim \mathrm{Quot}^0_{r-k,e-d-k}(F) \le k(r - k) + (d_k - d - 1)\,k(r - k + 1) + k. \]
As this consequently holds for every fiber of the map $\phi$, we conclude that
\[ \dim \mathrm{Quot}^0_{r-k,e-d}(E) \le k(r - k) + (d_k - d)\,k(r - k + 1), \]
which finishes the alternative proof.

3. Warm-up for effective base point freeness: The case of $SU_X(2)$

In this section we give a very simple proof of a theorem that first appeared in [20] (see also [8]). It completely takes care of the case of $SU_X(2)$. Although the specific technique (based on the Clifford theorem for line bundles) is different from the methods that are used in Section 4 to prove the main result, the general computational idea already appears here in a particularly transparent form. This is the reason for including the proof.

Theorem 3.1. The linear system $|\mathcal{L}|$ on $SU_X(2)$ has no base points.

Proof. Recall from Lemmas 1.1 and 1.2 that the statement of the theorem is equivalent to the following fact: for any stable bundle $E \in SU_X(2)$, there exists a line bundle $L \in \mathrm{Pic}^{g-1}(X)$ such that $H^0(E \otimes L) = 0$. This is certainly an open condition, and it is sufficient to prove that the algebraic set
\[ \{ L \in \mathrm{Pic}^{g-1}(X) \mid H^0(E \otimes L) \neq 0 \} \subset \mathrm{Pic}^{g-1}(X) \]
has dimension strictly less than $g$.
A nonzero map $E^* \to L$ comes together with a diagram of the form
\[
\begin{array}{ccccc}
E^* & \to & M & \to & 0 \\
 & \searrow & \downarrow & & \\
 & & L & &
\end{array}
\]
where $M$ is just the image in $L$. Then we have $M = L(-D)$ for some effective divisor $D$. Since $E$ is stable, the degree of $M$ can vary from 1 to $g - 1$, and we want to count all these cases separately. So for $m = 1, \ldots, g - 1$, consider the following algebraic subsets of $\mathrm{Pic}^{g-1}(X)$:
\[ A_m := \{ L \in \mathrm{Pic}^{g-1}(X) \mid \exists\, 0 \neq \phi : E^* \to L \text{ with } M = \operatorname{Im}(\phi),\ \deg(M) = m \}. \]
The claim is that $\dim A_m \le g - 1$ for all such $m$. Then of course $A_1 \cup \cdots \cup A_{g-1} \subsetneq \mathrm{Pic}^{g-1}(X)$, and any $L$ outside this union satisfies our requirement.

To prove the claim, denote by $\mathrm{Quot}_{1,m}(E)$ the Hilbert scheme of coherent quotients of $E$ of rank 1 and degree $m$. The set of line bundle quotients $E^* \to M \to 0$ of degree $m$ is a subset of $\mathrm{Quot}_{1,m}(E)$. On the other hand, every $L \in A_m$ can be written as $L = M(D)$, with $M$ as above and $D$ effective of degree $g - 1 - m$. This gives the obvious bound
\[ \dim A_m \le \dim \mathrm{Quot}_{1,m}(E) + g - 1 - m. \]
To bound the dimension of the Hilbert scheme in question, fix an $M$ as before and consider the exact sequence that it determines:
\[ 0 \to M^* \to E \to M \to 0. \]
Note that the kernel is isomorphic to $M^*$ since $E$ has trivial determinant. Now we use the well-known fact from deformation theory that
\[ \dim \mathrm{Quot}_{1,m}(E) \le h^0(M^{\otimes 2}). \]
To estimate $h^0(M^{\otimes 2})$, one uses all the information provided by Clifford's theorem. The initial bound that it gives is $h^0(M^{\otimes 2}) \le m + 1$. (Note that $\deg(M) \le g - 1$.) If actually $h^0(M^{\otimes 2}) \le m$, then we immediately get $\dim A_m \le g - 1$ as required. On the other hand, if $h^0(M^{\otimes 2}) = m + 1$, by the equality case in Clifford's theorem (see, e.g., [1, Chapter III, Section 1]), one of the following must hold: $M^{\otimes 2} \cong \mathcal{O}_X$, or $M^{\otimes 2} \cong \omega_X$, or $X$ is hyperelliptic and $M^{\otimes 2} \cong m \cdot g^1_2$. The first case is impossible since $\deg(M) > 0$. In the second case $M$ is a theta characteristic, and we are done either by the fact that these are a finite number or by other overlapping cases. The third case can also happen only for a finite number of $M$'s, and if we are not in any of the other cases, then of course $\dim A_m \le g - 1 - m < g - 1$. This concludes the proof of the theorem.
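
The dimension count of the proof fits on one line. The display below is only a restatement of the inequalities already used, written for the case in which Clifford's bound is not attained, so that $h^0(M^{\otimes 2}) \le m$:

```latex
\[
\dim A_m \;\le\; \dim \mathrm{Quot}_{1,m}(E) + (g-1-m)
         \;\le\; h^0(M^{\otimes 2}) + (g-1-m)
         \;\le\; m + (g-1-m) \;=\; g-1 \;<\; g \;=\; \dim \mathrm{Pic}^{g-1}(X).
\]
```

In the equality cases of Clifford's theorem only finitely many $M$ occur, so only the term $g - 1 - m$ coming from the choice of $D$ contributes.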
Remark 3.2. Note that the key point in the proof above is the ability to give a convenient upper bound on the dimension of certain Hilbert schemes. This essentially is the main ingredient in the general result proved in Section 4, and the needed estimate was provided in Section 2.

4. Base point freeness for pluritheta linear series on $SU_X(r, e)$

Using the dimension bound given in Section 2, we are now able to prove the main result of this paper, namely, an effective base point freeness bound for pluritheta linear series on $SU_X(r, e)$. The proof is computational in nature, and the roots of the main technique involved have already appeared in Theorem 3.1. Let $r \ge 2$ and $e$ be arbitrary integers, and let $h = \gcd(r, e)$, $r = r_1 h$, and $e = e_1 h$. For the statement it is convenient to introduce another invariant of the moduli space. If $E \in SU_X(r, e)$ and $1 \le k \le r - 1$, define $s_k(E) := ke - rd_k$, where $d_k$ is the maximum degree of a subbundle of $E$ of rank $k$ (cf. [11]). Note that if $E$ is stable, one has $s_k(E) \ge h$ and we can further define
\[ s_k = s_k(r, e) := \min_{E \text{ stable}} s_k(E) \quad\text{and}\quad s = s(r, e) := \min_{1 \le k \le r-1} s_k. \]
Clearly $s \ge h$, and it is also an immediate observation that $s(r, e) = s(r, -e)$.

Theorem 4.1. The linear series $|\mathcal{L}^p|$ on $SU_X(r, e)$ is base point free for
\[ p \ge \max\left\{ \frac{(r + 1)^2}{4r}\, h,\ \frac{r^2}{4s}\, h \right\}. \]

Remark 4.2. Note that the bound given in the theorem is always either a quadratic or a linear function in the rank $r$. It should also be said right away that although this bound works uniformly, in almost any particular situation one can do a little better. Unfortunately there does not seem to be a better uniform way to express it, but we comment more on this at the end of the section (cf. Remark 4.5).

Proof of Theorem 4.1. Let us denote for simplicity $M := \max\{((r + 1)^2/4r)h,\ (r^2/4s)h\}$. Since the problem depends only on the residues of $e$ modulo $r$, there is no loss of generality in looking only at $SU_X(r, -e)$ with $0 \le e \le r - 1$.
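
To see how the two terms of $M$ compare, here is a small worked case. It assumes only the definitions above and the inequality $s \ge h$. When $e \equiv 0 \pmod{r}$ we have $h = \gcd(r, e) = r$, and then:

```latex
\[
h = r: \qquad
\frac{(r+1)^2}{4r}\, h = \frac{(r+1)^2}{4},
\qquad
\frac{r^2}{4s}\, h \;\le\; \frac{r^2}{4} \;<\; \frac{(r+1)^2}{4}
\quad\Longrightarrow\quad
M = \frac{(r+1)^2}{4}.
\]
```

This agrees with the bound $p \ge (r+1)^2/4$ stated for $SU_X(r)$ in the introduction. The same two inequalities, together with $h \le r$, give $M \le (r+1)^2/4$ for every $e$.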
The statement of the theorem is implied by the following assertion, as described in Lemmas 1.1 and 1.2: For any stable $E \in SU_X(r, -e)$ and any $p \ge M$, there is an $F \in U_X(pr_1, p(r_1(g - 1) + e_1))$ such that $h^0(E \otimes F) = 0$. Fix a stable bundle $E \in SU_X(r, -e)$. Note that only in this proof, as opposed to
the rest of the paper, $e$ in fact denotes the degree of $E^*$ and not that of $E$. If for some $F \in U_X(pr_1, p(r_1(g - 1) + e_1))$ there is a nonzero map $\phi : E^* \to F$, then this comes together with a diagram of the form
\[
\begin{array}{ccccc}
E^* & \to & V & \to & 0 \\
 & {\scriptstyle \phi}\searrow & \downarrow & & \\
 & & F & &
\end{array}
\]
where the vector bundle $V$ is the image of $\phi$. The idea is essentially to count all such diagrams assuming that the rank and degree of $V$ are fixed and to see that the $F$'s involved in at least one of them cover only a proper subset of the whole moduli space. Denote as before by $\mathrm{Quot}_{k,d}(E^*)$ the Hilbert scheme of quotients of $E^*$ of rank $k$ and degree $d$, and for any $1 \le k \le r$ and any $d$ in the suitable range (given by the stability of $E$ and $F$), consider its subset:
\[ A_{k,d} := \{ V \in \mathrm{Quot}_{k,d}(E^*) \mid \exists\, F \in U_X(pr_1, p(r_1(g - 1) + e_1)),\ \exists\, 0 \neq \phi : E^* \to F \text{ with } V = \operatorname{Im}(\phi) \}. \]
The theorem on Hilbert schemes stated in the introduction then gives us the dimension estimate
\[ \dim A_{k,d} \le k(r - k) + (d - f_k)(k + 1)(r - k), \tag{3} \]
where $f_k = f_k(E^*)$ is the minimum possible degree of a quotient bundle of $E^*$ of rank $k$ (which is the same as $-d_k$). Define now the following subsets of $U_X(pr_1, p(r_1(g - 1) + e_1))$:
\[ U_{k,d} := \{ F \mid \exists\, V \in A_{k,d} \text{ with } V \subset F \} \subset U_X(pr_1, p(r_1(g - 1) + e_1)). \]
The elements of $U_{k,d}$ are all the $F$'s that appear in diagrams as above for fixed $k$ and $d$. The claim is that
\[ \dim U_{k,d} < (pr_1)^2 (g - 1) + 1, \]
which would imply that $U_{k,d} \subsetneq U_X(pr_1, p(r_1(g - 1) + e_1))$. Assuming that this is true, and since $k$ and $d$ run over a finite set, any $F \in U_X(pr_1, p(r_1(g - 1) + e_1)) - \bigcup_{k,d} U_{k,d}$ satisfies the desired property that $h^0(E \otimes F) = 0$, which gives the statement of the theorem.

It is easy to see, and in fact a particular case of the computation below, that in the case $k = r$ (i.e., $V = E^*$), $U_{k,d}$ has dimension exactly $(pr_1)^2 (g - 1)$. Let us concentrate then on proving the claim above for $1 \le k \le r - 1$. Note that the inclusions $V \subset F$ appearing in the definition of $U_{k,d}$ are valid in general only at the sheaf level. Any such inclusion determines an exact sequence
\[ 0 \to V \to F \to G \to 0, \tag{4} \]
where $G = G' \oplus \tau_a$, with $G'$ locally free and with $\tau_a$ a zero-dimensional subscheme
of length $a$. We stratify $U_{k,d}$ by the subsets
\[
U^a_{k,d} := \{ F \mid F \text{ given by an extension of type (4)} \} \subset U_{k,d},
\]
where $a$ runs over the obvious allowable finite set of integers. A simple computation shows that $G'$ has rank $pr_1 - k$ and degree $p(r_1(g-1)+e_1) - d - a$. Denote by $T^a_{k,d}$ the set of all vector bundles $G'$ that are quotients of some $F \in U_X(pr_1, p(r_1(g-1)+e_1))$. These can be parametrized by a relative Hilbert scheme (see, e.g., [13, Section 8.6]) over (an étale cover of) $U_X(pr_1, p(r_1(g-1)+e_1))$, and so they form a bounded family. We invoke a general result, proved in [3, Lemma 4.1 and Remark 4.2], saying that the dimension of such a family is always at most what we get if we assume that the generic member is stable. Thus we get the bound
\[
\dim T^a_{k,d} \le (pr_1 - k)^2(g-1) + 1.
\]
Now we only have to compute the dimension of the family of all possible extensions of the form (4) when $V$ and $G'$ are allowed to vary over $A_{k,d}$ and $T^a_{k,d}$, respectively, and $\tau_a$ varies over the symmetric product $X_a$. Any such extension induces a diagram
\[
\begin{array}{ccccccccc}
 & & & & 0 & & 0 & & \\
 & & & & \downarrow & & \downarrow & & \\
0 & \to & V & \to & V' & \to & \tau_a & \to & 0 \\
 & & \downarrow{\scriptstyle\cong} & & \downarrow & & \downarrow & & \\
0 & \to & V & \to & F & \to & G & \to & 0 \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & G' & \cong & G' & & \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & 0 & & 0 & &
\end{array}
\]
If we denote by $A^a_{k,d}$ the set of isomorphism classes of vector bundles $V'$ that are (inverse) elementary transformations of length $a$ of vector bundles in $A_{k,d}$, then we have the obvious
\[
\dim A^a_{k,d} \le \dim A_{k,d} + ka.
\]
On the other hand, any $F$ is obtained as an extension of a bundle in $T^a_{k,d}$ by a bundle in $A^a_{k,d}$. Denote by $U \subset A^a_{k,d} \times T^a_{k,d}$ the open subset consisting of pairs $(V', G')$ such that there exists an extension
\[
0 \longrightarrow V' \longrightarrow F \longrightarrow G' \longrightarrow 0
\]
with $F$ stable. Note that by Lemma 1.3 for any such pair we have $h^0({G'}^* \otimes V') = 0$, and so by Riemann-Roch $h^1({G'}^* \otimes V')$ is constant, given by
\[
h^1({G'}^* \otimes V') = 2kpr_1(g-1) - k^2(g-1) + kpe_1 - pr_1 d - pr_1 a. \tag{5}
\]
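As a check on (5), the Riemann-Roch computation can be written out as follows. This is only a sketch under the numerical data stated in the text: the kernel bundle has rank $k$ and degree $d + a$ (an inverse elementary transformation of length $a$ of a bundle of degree $d$), and the locally free quotient has rank $pr_1 - k$ and degree $p(r_1(g-1)+e_1) - d - a$. The letters $W$ and $Q$ are names introduced here only for this computation.

```latex
% W: kernel bundle, rank k, degree d + a
% Q: locally free quotient, rank pr_1 - k, degree p(r_1(g-1)+e_1) - d - a
\begin{align*}
\deg(Q^*\otimes W) &= (pr_1-k)(d+a) - k\bigl(p(r_1(g-1)+e_1)-d-a\bigr) \\
                   &= pr_1 d + pr_1 a - kpr_1(g-1) - kpe_1, \\
\chi(Q^*\otimes W) &= \deg(Q^*\otimes W) - k(pr_1-k)(g-1) \\
                   &= pr_1 d + pr_1 a - 2kpr_1(g-1) + k^2(g-1) - kpe_1, \\
h^1(Q^*\otimes W)  &= h^0(Q^*\otimes W) - \chi(Q^*\otimes W) = -\chi(Q^*\otimes W) \\
                   &= 2kpr_1(g-1) - k^2(g-1) + kpe_1 - pr_1 d - pr_1 a .
\end{align*}
```

The last line uses the vanishing of $h^0$ and agrees term by term with the right-hand side of (5).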
In this situation it is a well-known result (see, e.g., [19, (2.4)] or [10, Section 4]) that there exists a universal space of extension classes $\mathbb{P}(U) \to U$ whose dimension is computed by the formula
\[
\dim \mathbb{P}(U) = \dim A^a_{k,d} + \dim T^a_{k,d} + h^1({G'}^* \otimes V') - 1.
\]
There is an obvious forgetful map
\[
\mathbb{P}(U) \longrightarrow U_X(pr_1, p(r_1(g-1)+e_1))
\]
whose image is exactly $U^a_{k,d}$. Thus by putting together all the inequalities above we obtain
\begin{align*}
\dim U^a_{k,d} &\le \dim A^a_{k,d} + \dim T^a_{k,d} + h^1({G'}^* \otimes V') - 1 \\
&\le k(r-k) + (d-f_k)(k+1)(r-k) + ka + (pr_1-k)^2(g-1) + 1 \\
&\qquad + 2kpr_1(g-1) - k^2(g-1) + kpe_1 - pr_1 d - pr_1 a - 1 \\
&\le k(r-k) + (d-f_k)(k+1)(r-k) + (pr_1)^2(g-1) + kpe_1 - pr_1 d + ka - pr_1 a \\
&\le k(r-k) + (d-f_k)(k+1)(r-k) + (pr_1)^2(g-1) + kpe_1 - pr_1 d,
\end{align*}
where the last inequality is due to the obvious fact that $k \le r-1 < pr_1$ if $p \ge M$. Since $a$ runs over a finite set, to conclude the proof of the claim it is enough to see that $\dim U^a_{k,d} \le (pr_1)^2(g-1)$. By the inequality above this is true if
\[
p(r_1 d - ke_1) \ge k(r-k) + (d-f_k)(k+1)(r-k)
\]
or, equivalently, if
\[
p(rd - ke) \ge k(r-k)h + (d-f_k)(k+1)(r-k)h
\]
for any $k$ and $d$. This can be rewritten in the following more manageable form:
\[
p\bigl(r(d-f_k) + rf_k - ke\bigr) \ge k(r-k)h + (d-f_k)(k+1)(r-k)h.
\]
The first case to look at is $d = f_k$, when we should have $p(rf_k - ke) \ge k(r-k)h$, and this should hold for every $k$. But clearly $rf_k - ke = s_{r-k}(E^*) = s_k(E)$, defined above in terms of maximal subbundles, and $h \mid s_k(E)$. Since $E$ is stable, we then have $s_k(E) \ge h$ for all $k$, and so $s \ge h$ as mentioned before. Note, though, that in
general one cannot do better (cf. Remark 4.6). In any case this says that the inequality $p \ge (r^2/4s)h$ must be satisfied (which would certainly hold if $p \ge r^2/4$). When $d > f_k$, it is convenient to collect together all the terms containing $d - f_k$. The last inequality above then reads
\[
(d-f_k)\bigl(pr - (k+1)(r-k)h\bigr) + p\,s_k(E) \ge k(r-k)h.
\]
For $p$ as before it is then sufficient to have $pr \ge (k+1)(r-k)h$, which again by simple optimization is satisfied for $p \ge ((r+1)^2/4r)h$. Concluding, the desired inequality holds as long as $p \ge M$.

The most important instances of this theorem are the cases of vector bundles of degree zero (more generally, $d \equiv 0 \pmod r$) and degree $1$ or $-1$ (more generally, $d \equiv \pm 1 \pmod r$). In the second situation the moduli space in question is smooth. It is somewhat surprising that the results obtained in these cases have different orders of magnitude.

Corollary 4.3. $|\mathcal{L}^p|$ is base point free on $SU_X(r)$ for $p \ge (r+1)^2/4$.

Proof. This is clear since $h = r$.

Corollary 4.4. $|\mathcal{L}^p|$ is base point free on $SU_X(r,1)$ and $SU_X(r,-1)$ for $p \ge r-1$.

Proof. Note that by duality it suffices to prove the claim for one of the moduli spaces, say, $SU_X(r,-1)$. In this case $h = 1$, and for any $E \in SU_X(r,-1)$, $s_{r-k}(E^*) = rf_k - k \ge r - k$. Following the proof of the theorem we see thus that it suffices to have
\[
p \ge \max\Bigl\{ r-1, \frac{(r+1)^2}{4r} \Bigr\},
\]
which is equal to $r-1$ for $r \ge 3$. For $r = 2$ one can slightly improve the last inequality in the proof of the theorem (actually this is true whenever $r$ is even) to see that $p = 1$ already works.

Remark 4.5. Corollary 4.4 can be seen as a generalization of the well-known fact that $|\mathcal{L}|$ is base point free on $SU_X(2,1)$. Also, as already noted in its proof, the general bound obtained in the theorem can be slightly improved in each particular case, due to the
fact that the two optimization problems do not simultaneously have integral solutions. Thus, for example, if $r$ is even, the proof of the theorem actually gives that $|\mathcal{L}^p|$ is base point free on $SU_X(r)$ for $p \ge r(r+2)/4$.

Remark 4.6. As already noted, in any given numerical situation the bound given by the theorem is either linear or quadratic in the rank $r$. One may thus hope that at least in the case $h = 1$ (i.e., $r$ and $e$ coprime) a closer study of the number $s(r,e)$ might always produce by this method a linear bound. Examples show, though, that this is not the case: one can take $r = 4l$, $e = 2l-1$, $k = 2l+1$, and $f_k = l$ (this works for special vector bundles) for a positive integer $l$, which implies $s = s_k = 1$.

5. Effective base point freeness on $U_X(r,e)$ and some conjectures

The deformation theoretic methods used in [12] and [8] allow one to prove results similar to Theorem 4.1 for pluritheta linear series on $U_X(r,e)$ (with some extra effort due to the fact that in this case one has to control the determinant of the complementary vector bundle). Since the method used in this paper is of a different nature, a generalization along those lines is not immediately apparent. Instead we propose the formalism of Verlinde bundles, which we developed in [18]. This comes with the advantage that it applies automatically as soon as one has results on $SU_X(r,e)$ and also suggests what the optimal bounds should be. Moreover, the method equally applies to other linear series on $U_X(r,e)$, as we see shortly.

Fix a generic vector bundle $F \in U_X(r_1, r_1(g-1) - e_1)$, where as usual $r_1 = r/h$ and $e_1 = e/h$. To it we can associate the theta divisor $\Theta_F$ on $U_X(r,e)$ supported on the set
\[
\Theta_F = \{ E \in U_X(r,e) \mid h^0(E \otimes F) \ne 0 \}.
\]
Fix also $L \in \mathrm{Pic}^e(X)$. The $(r,e,k)$-Verlinde bundle $E_{r,e,k}$ associated to these choices is defined as
\[
E_{r,e,k} = E^{F,L}_{r,e,k} := \pi_{L*}\,\mathcal{O}_U(k\Theta_F)
\]
(cf. [18, Section 6]), where $\pi_L$ is the composition
\[
\pi_L : U_X(r,e) \xrightarrow{\ \det\ } \mathrm{Pic}^e(X) \xrightarrow{\ \otimes L^{-1}\ } J(X).
\]
This is a vector bundle on $J(X)$ of rank equal to the Verlinde number $h^0(SU_X(r,e), \mathcal{L}^k)$. The following results are proved in [18].

Theorem 5.1 [18, Theorems 6.4 and 5.3]. $\mathcal{O}_U(k\Theta_F)$ is globally generated on $U_X(r,e)$ as long as $\mathcal{L}^k$ is globally generated on $SU_X(r,e)$ and $E_{r,e,k}$ is globally generated. Moreover, $\mathcal{O}_U(k\Theta_F)$ is not globally generated for $k \le h$.
Proposition 5.2 [18, Proposition 5.2]. $E_{r,e,k}$ is globally generated for $k \ge h+1$, and this bound is optimal.

By using Theorem 4.1 we immediately obtain the following bound, where $s$ is the invariant defined in Section 4.

Theorem 5.3. $\mathcal{O}_U(k\Theta_F)$ is globally generated on $U_X(r,e)$ for
\[
k \ge \max\Bigl\{ \frac{(r+1)^2}{4r}\,h,\ \frac{r^2}{4s}\,h \Bigr\}.
\]

In fact, the theorem is a special case of the more general Theorem 5.9, which we treat at the end of the section. Right now it is interesting to see how these bounds relate to possible optimal bounds and to discuss some conjectures and questions in this direction. Given the shape of the result, we carry out this discussion in the case of $SU_X(r)$ and $SU_X(r,\pm 1)$, based on the results of Corollaries 4.3 and 4.4. A similar analysis can be applied to any other case, but we do not give any details here.

We begin by looking at degree zero vector bundles, where global generation is attained for $k \ge (r+1)^2/4$, with the improvement by Theorem 3.1 in the case of rank 2, when $k \ge 1$ suffices. In view of Theorem 5.1, the bound in Theorem 5.3 is optimal in the case of rank 2 and rank 3 vector bundles.

Corollary 5.4. Let $N \in \mathrm{Pic}^{g-1}(X)$. Then
(i) $\mathcal{O}_U(3\Theta_N)$ is globally generated on $U_X(2,0)$;
(ii) $\mathcal{O}_U(4\Theta_N)$ is globally generated on $U_X(3,0)$.

This could be seen as a natural extension of the classical fact that $\mathcal{O}_J(2\Theta_N)$ is globally generated on $J(X) \cong U_X(1,0)$. In the presence of this evidence it is natural to conjecture that this is indeed the case for any rank.

Conjecture 5.5. For any $r \ge 1$, $\mathcal{O}_U(k\Theta_N)$ is globally generated on $U_X(r,0)$ for $k \ge r+1$.

This is the best that one can hope for, and there is a possibility that it might be a little too optimistic or, in other words, that Corollary 5.4 might be an accident of low values of a quadratic function. On the other hand, if that is the case, the theorem should be very close to being optimal. Turning to $SU_X(r)$, in [17, Section 3] we showed that, granting the strange duality conjecture, the optimal bound for the global
generation of $\mathcal{L}^k$ should also go up as we increase the rank $r$. The underlying reason (without specifying the actual numbers) is the following: assume that we are given a vector bundle $E$ such that $h^0(E \otimes \xi) \ne 0$ for all $\xi \in \mathrm{Pic}^0(X)$ (for examples of such $E$, see [20], [17], or [18]). If we choose some complementary bundle $F$ (i.e., $\chi(E \otimes F) = 0$) of rank $t$, then a theorem of H. Lange and of S. Mukai and F. Sakai (see [10], [14]) asserts that $F$ admits a line subbundle of degree greater than or equal to $\mu(F) - g + g/t = g/t - 1 - \mu(E)$. For small $t$ (with respect to $r$), in most examples mentioned above it happens that this number is positive, which automatically implies that $h^0(E \otimes F) \ne 0$ for all such $F$. This prevents the global generation of a certain multiple of $\mathcal{L}$ depending on the rank of $F$. The case of rank 2 vector bundles in Theorem 3.1 suggests, though, that we could ask for a slightly better result than for $U_X(r,0)$, but unfortunately further evidence is still missing.

Conjecture/Question 5.6. Is $\mathcal{L}^k$ globally generated on $SU_X(r)$ for $k \ge r-1$?

Note also that in view of Theorem 5.1 and Proposition 5.2 any positive answer in the range $\{r-1, r, r+1\}$ would imply the optimal Conjecture 5.5. In the case of $SU_X(r,\pm 1)$, Corollary 4.4 and Theorem 5.1 give that $\mathcal{O}_U(k\Theta_F)$ is globally generated for $k \ge \max\{2, r-1\}$, while $\mathcal{O}_U(\Theta_F)$ cannot be. We thus obtain again optimal bounds for rank 2 and rank 3 vector bundles.

Corollary 5.7. $\mathcal{O}_U(2\Theta_F)$ is globally generated on $U_X(2,1)$ and $U_X(3,\pm 1)$.

Note also that for all the examples of special vector bundles constructed in [20], [17], and [18] we have $h \ne 1$; therefore theoretically an optimal bound that does not depend on the rank $r$ is still possible. It is natural to ask if the best possible result always holds.

Question 5.8. Is $\mathcal{O}_U(2\Theta_F)$ on $U_X(r,\pm 1)$, and so also $\mathcal{L}^2$ on $SU_X(r,\pm 1)$, globally generated? More generally, is this true whenever $r$ and $d$ are coprime?

We conclude the section with a generalization of Theorem 5.3.
For simplicity we present it only in the degree zero case, but the extension to other degrees is immediate. Recall from [5, Theorem C] that for $N \in \mathrm{Pic}^{g-1}(X)$,
\[
\mathrm{Pic}(U_X(r,0)) \cong \mathbb{Z} \cdot \mathcal{O}(\Theta_N) \oplus \det{}^* \mathrm{Pic}(J(X)).
\]
The method provided by the Verlinde bundles allows one to study effective global generation for "mixed" line bundles of the form $\mathcal{O}(k\Theta_N) \otimes \det^* L$ with $L \in \mathrm{Pic}(J(X))$. Concretely we have the following cohomological criterion (assume $r \ge 2$).

Theorem 5.9. The line bundle $\mathcal{O}(k\Theta_N) \otimes \det^* L$ is globally generated if $k \ge (r+1)^2/4$ and if
\[
h^i\bigl(\mathcal{O}_J((kr - r^2)\Theta_N) \otimes L^{\otimes r^2} \otimes \alpha\bigr) = 0, \quad \forall i > 0,\ \forall \alpha \in \mathrm{Pic}^0(J(X)).
\]

Proof. By the projection formula, for every $i > 0$ we have
\[
R^i \det{}_*\bigl(\mathcal{O}_U(k\Theta_N) \otimes \det{}^* L\bigr) \cong R^i \det{}_* \mathcal{O}_U(k\Theta_N) \otimes L = 0.
\]
Also, the restriction of $\mathcal{O}_U(k\Theta_N) \otimes \det^* L$ to any fiber $SU_X(r,\xi)$ of the determinant map is isomorphic to $\mathcal{L}^k$ and so globally generated for $k \ge (r+1)^2/4$. It is a simple consequence of general machinery, described in [18, Proposition 5.1], that in these conditions the statement holds as soon as $\det_*(\mathcal{O}_U(k\Theta_N) \otimes \det^* L) \cong E_{r,k} \otimes L$ is globally generated on $J(X)$, where $E_{r,k}$ is a simplified notation for $E_{r,0,k}$. To study this we make use, as in [18], of a cohomological criterion for global generation of vector bundles on abelian varieties due to G. Pareschi [16, Theorem 2.1]. In our particular setting it says that $E_{r,k} \otimes L$ is globally generated if there exists some ample line bundle $A$ on $J(X)$ such that
\[
h^i\bigl(E_{r,k} \otimes L \otimes A^{-1} \otimes \alpha\bigr) = 0, \quad \forall i > 0,\ \forall \alpha \in \mathrm{Pic}^0(J(X)).
\]
We choose $A$ to be $\mathcal{O}_J(\Theta_N)$, where $\Theta_N$ is the theta divisor on $J(X)$ associated to $N$. The cohomology vanishing that we need is true if it holds for the pullback of $E_{r,k} \otimes L \otimes \mathcal{O}_J(-\Theta_N) \otimes \alpha$ by any finite cover of $J(X)$. But recall from [18, Lemma 2.3] that $r_J^* E_{r,k} \cong \bigoplus \mathcal{O}_J(kr\Theta_N)$, where $r_J$ is the multiplication by $r$. Since $r_J^* L \equiv L^{\otimes r^2}$, via pulling back by $r_J$ the required vanishing certainly holds if
\[
h^i\bigl(\mathcal{O}_J((kr - r^2)\Theta_N) \otimes L^{\otimes r^2} \otimes \alpha\bigr) = 0, \quad \forall i > 0,\ \forall \alpha \in \mathrm{Pic}^0(J(X)).
\]

Corollary 5.10. If $l \in \mathbb{Z}$, $\mathcal{O}(k\Theta_N) \otimes \det^* \mathcal{O}_J(l\Theta_N)$ is globally generated for
\[
k \ge \max\Bigl\{ r + 1 - lr,\ \frac{(r+1)^2}{4} \Bigr\}.
\]
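The first entry of the maximum in Corollary 5.10 can be traced back to the vanishing condition of Theorem 5.9. The following is a hedged sketch of that step rather than the author's own argument: it takes $L = \mathcal{O}_J(l\Theta_N)$ and uses the standard fact that an ample line bundle on an abelian variety, twisted by any element of $\mathrm{Pic}^0$, has no higher cohomology.

```latex
% Sketch: L = O_J(l \Theta_N) inserted into the condition of Theorem 5.9.
\begin{align*}
\mathcal{O}_J\bigl((kr-r^2)\Theta_N\bigr)\otimes L^{\otimes r^2}
  &\cong \mathcal{O}_J\bigl((kr - r^2 + l r^2)\Theta_N\bigr)
   = \mathcal{O}_J\bigl(r\,(k - r + lr)\,\Theta_N\bigr), \\
r\,(k - r + lr) > 0
  &\iff k > r - lr
   \iff k \ge r + 1 - lr \quad (k,\ l,\ r \in \mathbb{Z},\ r \ge 2).
\end{align*}
% For such k the line bundle is a positive multiple of the ample class \Theta_N,
% so h^i vanishes for all i > 0 and all \alpha in Pic^0(J(X)).
```

Together with the requirement $k \ge (r+1)^2/4$ from Theorem 5.9, this gives the stated maximum.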
6. An application to surfaces à la Le Potier

Another, in some sense algorithmic, application of the effective bound in Theorem 4.1 can be given following the paper of Le Potier [12]. By a simple use of a restriction theorem due to H. Flenner [7] (cf. also [13, Section 11]), Le Potier shows that effective results for the determinant bundle $\mathcal{L}$ induce effective results for the Donaldson determinant line bundles on moduli spaces of semistable sheaves on surfaces. For the appropriate definitions and basic results, the reader can consult [9, Section 8]. Using the uniform bound $k \ge (r+1)^2/4$ that works on every moduli space $SU_X(r,d)$, the result can be formulated as the following theorem.

Theorem 6.1. Let $(X, \mathcal{O}_X(1))$ be a polarized smooth projective surface, and let $L$ be a line bundle on $X$. Let $M = M_X(r, L, c_2)$ be the moduli space of semistable sheaves of rank $r$, fixed determinant $L$, and second Chern class $c_2$ on $X$, and denote $n = \deg(X) = \mathcal{O}_X(1)^2$ and $d = n[r^2/2]$. If $\mathcal{D}$ is the Donaldson determinant line bundle on $M$, then $\mathcal{D}^{\otimes p}$ is globally generated for $p \ge d \cdot (r+1)^2/4$ divisible by $d$.

Note that it is not true that $\mathcal{D}$ is ample, which accounts for the formulation of the theorem. The significance of the map to projective space given by some multiple of $\mathcal{D}$ is well known. Its image is the moduli space of $\mu$-semistable sheaves, and in the rank 2 and degree zero case this is homeomorphic to the Donaldson-Uhlenbeck compactification of the moduli space of anti-self-dual connections in gauge theory, the map realizing the transition between the Gieseker and Uhlenbeck points of view (see, e.g., [9, Section 8.2] for a survey). Better bounds for the global generation of the Donaldson line bundle thus limit the dimension of an ambient projective space for this moduli space. The main improvement brought by the results in the present paper comes from the fact that our results are not influenced by the genus of the curve given by Flenner's theorem. Effectively that reduces the bound given in [12, Section 3.2] by an order of 4, namely, from a polynomial of degree 8 in the rank $r$ to a polynomial of degree 4.

Sketch of proof of Theorem 6.1 (cf. [12, Theorem 3.6]). In analogy with the curve situation, given $E \in M$, the problem is to find a complementary 1-dimensional sheaf $F$ on $X$ such that $h^1(E \otimes F) = 0$. Flenner's theorem says that there exists a smooth curve $C$ belonging to the linear series $|\mathcal{O}_X(d)|$ such that $E|_C$ is semistable. By Theorem 4.1, on $C$ one can find for any $k \ge (r+1)^2/4$ a vector bundle $V$ of rank $kr_1$ such that $h^1(E \otimes V) = 0$. The $F$ that we are looking for is obtained by considering $V$ as a 1-dimensional sheaf on $X$, and a simple computation shows that if $p = dk$, this gives the global generation of $\mathcal{D}^{\otimes p}$.
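To make the numerical content of Theorem 6.1 concrete, the following small sketch computes the least admissible exponent $p$. It is an illustration only: the function names are hypothetical, and it assumes that the bracket in $d = n[r^2/2]$ denotes the integer part.

```python
def curve_bound(r: int) -> int:
    """Smallest integer k with k >= (r + 1)^2 / 4 (exact integer ceiling)."""
    return -(-(r + 1) ** 2 // 4)


def donaldson_bound(r: int, n: int) -> int:
    """Smallest p divisible by d with p >= d * (r + 1)^2 / 4.

    Assumption: d = n * floor(r^2 / 2), reading [.] as the integer part.
    Since p = d * k with k an integer, the condition reduces to
    k >= (r + 1)^2 / 4, so the least p is d * curve_bound(r).
    """
    if r < 2 or n < 1:
        raise ValueError("expected rank r >= 2 and degree n >= 1")
    d = n * (r * r // 2)
    return d * curve_bound(r)


if __name__ == "__main__":
    # Print the threshold for a few ranks on a surface of degree n = 1.
    for rank in range(2, 7):
        print(rank, donaldson_bound(rank, 1))
```

The product of $d \approx n r^2/2$ and $(r+1)^2/4$ is what makes the bound a polynomial of degree 4 in $r$, as stated in the text.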
Remark 6.2. Depending on the values of the invariants involved, this bound may sometimes be improved to a polynomial of degree 3 in $r$, according to the precise statement of Theorem 4.1.

Acknowledgments. I would especially like to thank my advisor, R. Lazarsfeld, whose support and suggestions have been decisive to this work. I am also indebted to I. Coandă, I. Dolgachev, W. Fulton, M. Roth, and M. Teixidor i Bigas for very valuable discussions. In particular, numerous conversations with M. Roth had a significant influence on Section 2.
References

[1] E. ARBARELLO, M. CORNALBA, P. A. GRIFFITHS, and J. HARRIS, Geometry of Algebraic Curves, Vol. I, Grundlehren Math. Wiss. 267, Springer, New York, 1985.
[2] A. BEAUVILLE, "Vector bundles on curves and generalized theta functions: Recent results and open problems" in Current Topics in Algebraic Geometry (Berkeley, 1991/93), Math. Sci. Res. Inst. Publ. 28, Cambridge Univ. Press, Cambridge, 1995, 17–33.
[3] L. BRAMBILA-PAZ, I. GRZEGORCZYK, and P. E. NEWSTEAD, Geography of Brill-Noether loci for small slopes, J. Algebraic Geom. 6 (1997), 645–669.
[4] L. BRAMBILA-PAZ and H. LANGE, A stratification of the moduli space of vector bundles on curves, J. Reine Angew. Math. 494 (1998), 173–187.
[5] J.-M. DREZET and M. S. NARASIMHAN, Groupe de Picard des variétés des modules de fibrés semi-stables sur les courbes algébriques, Invent. Math. 97 (1989), 53–94.
[6] G. FALTINGS, Stable G-bundles and projective connections, J. Algebraic Geom. 2 (1993), 507–568.
[7] H. FLENNER, Restrictions of semistable bundles on projective varieties, Comment. Math. Helv. 59 (1984), 635–650.
[8] G. HEIN, On the generalized theta divisor, Beiträge Algebra Geom. 38 (1997), 95–98.
[9] D. HUYBRECHTS and M. LEHN, The Geometry of Moduli Spaces of Sheaves, Aspects Math. E 31, Vieweg, Braunschweig, Germany, 1997.
[10] H. LANGE, Universal families of extensions, J. Algebra 83 (1983), 101–112.
[11] H. LANGE, Zur Klassifikation von Regelmannigfaltigkeiten, Math. Ann. 262 (1983), 447–459.
[12] J. LE POTIER, "Module des fibrés semi-stables et fonctions thêta" in Moduli of Vector Bundles (Sanda, 1994; Kyoto, 1994), ed. M. Maruyama, Lecture Notes in Pure and Appl. Math. 179, Dekker, New York, 1996, 83–101.
[13] J. LE POTIER, Lectures on Vector Bundles, trans. A. Maciocia, Cambridge Stud. Adv. Math. 54, Cambridge Univ. Press, Cambridge, 1997.
[14] S. MUKAI and F. SAKAI, Maximal subbundles of vector bundles on a curve, Manuscripta Math. 52 (1985), 251–256.
[15] M. NAGATA, On self-intersection number of a section on a ruled surface, Nagoya Math. J. 37 (1970), 191–196.
[16] G. PARESCHI, Syzygies of abelian varieties, J. Amer. Math. Soc. 13 (2000), 651–664, http://www.ams.org/jams.
[17] M. POPA, On the base locus of the generalized theta divisor, C. R. Acad. Sci. Paris Sér. I Math. 329 (1999), 507–512.
[18] M. POPA, Verlinde bundles and generalized theta linear series, preprint, http://www.arXiv.org/abs/math.AG/0002017.
[19] S. RAMANAN, The moduli space of vector bundles over an algebraic curve, Math. Ann. 200 (1973), 69–84.
[20] M. RAYNAUD, Sections des fibrés vectoriels sur une courbe, Bull. Soc. Math. France 110 (1982), 103–125.
[21] B. RUSSO and M. TEIXIDOR i BIGAS, On a conjecture of Lange, J. Algebraic Geom. 8 (1999), 483–496.
Department of Mathematics, University of Michigan, 525 East University, Ann Arbor, Michigan 48109-1109, USA; Institute of Mathematics of the Romanian Academy
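QUASI-LINEAR PARABOLIC SYSTEMS IN DIVERGENCE FORM WITH WEAK MONOTONICITY

NORBERT HUNGERBÜHLER

Abstract
We consider the initial and boundary value problem for the quasi-linear parabolic system
\begin{align*}
\frac{\partial u}{\partial t} - \operatorname{div} \sigma\bigl(x, t, u(x,t), Du(x,t)\bigr) &= f && \text{on } \Omega \times (0,T), \\
u(x,t) &= 0 && \text{on } \partial\Omega \times (0,T), \\
u(x,0) &= u_0(x) && \text{on } \Omega
\end{align*}
for a function $u : \Omega \times [0,T) \to \mathbb{R}^m$ with $T > 0$. Here, $f \in L^{p'}(0,T; W^{-1,p'}(\Omega; \mathbb{R}^m))$ for some $p \in (2n/(n+2), \infty)$, and $u_0 \in L^2(\Omega; \mathbb{R}^m)$. We prove existence of a weak solution under classical regularity, growth, and coercivity conditions for $\sigma$ but with only very mild monotonicity assumptions.

1. Introduction
On a bounded open domain $\Omega \subset \mathbb{R}^n$, we consider the initial and boundary value problem for the quasi-linear parabolic system
\begin{align}
\frac{\partial u}{\partial t} - \operatorname{div} \sigma\bigl(x, t, u(x,t), Du(x,t)\bigr) &= f && \text{on } \Omega \times (0,T), \tag{1} \\
u(x,t) &= 0 && \text{on } \partial\Omega \times (0,T), \tag{2} \\
u(x,0) &= u_0(x) && \text{on } \Omega \tag{3}
\end{align}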
for a function $u : \Omega \times [0,T) \to \mathbb{R}^m$, $T > 0$. Here, $f \in L^{p'}(0,T; W^{-1,p'}(\Omega; \mathbb{R}^m))$ for some $p \in (2n/(n+2), \infty)$, $u_0 \in L^2(\Omega; \mathbb{R}^m)$, and $\sigma$ satisfies the conditions (P0)–(P2) below. A feature of the Young measure technique we use is that we can treat a class of problems for which the classical monotone operator methods developed by M. Višik [23], G. Minty [21], F. Browder [5], H. Brézis [3], J.-L. Lions [20], and others do not apply. The reason for this is that $\sigma$ does not need to satisfy the strict monotonicity condition of a typical Leray-Lions operator. The tool we use in order to prove the needed compactness of approximating solutions is Young measures. The methods are inspired by [6] and [12].

DUKE MATHEMATICAL JOURNAL, Vol. 107, No. 3, © 2001. Received 22 December 1999. Revision received 10 July 2000. 2000 Mathematics Subject Classification. Primary 35K55.

To fix some notation, let $\mathbb{M}^{m\times n}$ denote the real vector space of $m \times n$ matrices equipped with the inner product $M : N = M_{ij} N_{ij}$ (with the usual summation convention).

The following notion of monotonicity plays a role in part of the exposition. Instead of assuming the usual pointwise monotonicity condition for $\sigma$, we also use a weaker, integrated version of monotonicity which is called quasi monotonicity (see [6]). The definition is phrased in terms of gradient Young measures. Note, however, that although quasi monotonicity is "monotonicity in integrated form," the gradient $D\eta$ of a quasi-convex function $\eta$ is not necessarily quasi-monotone.

Definition 1. A function $\eta : \mathbb{M}^{m\times n} \to \mathbb{M}^{m\times n}$ is said to be strictly $p$-quasi-monotone if
\[
\int_{\mathbb{M}^{m\times n}} \bigl(\eta(\lambda) - \eta(\bar\lambda)\bigr) : (\lambda - \bar\lambda)\, d\nu(\lambda) > 0
\]
for all homogeneous $W^{1,p}$-gradient Young measures $\nu$ with center of mass $\bar\lambda = \langle \nu, \mathrm{id} \rangle$ which are not a single Dirac mass.

A simple example is the following. Assume that $\eta$ satisfies the growth condition $|\eta(F)| \le C|F|^{p-1}$ with $p > 1$ and the structure condition
\[
\int_\Omega \bigl(\eta(F + \nabla\phi) - \eta(F)\bigr) : \nabla\phi \, dx \ge c \int_\Omega |\nabla\phi|^p \, dx
\]
for a constant $c > 0$ and for all $\phi \in C_0^\infty(\Omega)$ and all $F \in \mathbb{M}^{m\times n}$. Then $\eta$ is strictly $p$-quasi-monotone. This follows easily from the definition if one uses that for every $W^{1,p}$-gradient Young measure $\nu$ there exists a sequence $\{Dv_k\}$ generating $\nu$ for which $\{|Dv_k|^p\}$ is equiintegrable (see [9], [14]).

Now, we state our main assumptions.

(P0) (Continuity) We assume that $\sigma : \Omega \times (0,T) \times \mathbb{R}^m \times \mathbb{M}^{m\times n} \to \mathbb{M}^{m\times n}$ is a Carathéodory function; that is, $(x,t) \mapsto \sigma(x,t,u,F)$ is measurable for every $(u,F) \in \mathbb{R}^m \times \mathbb{M}^{m\times n}$, and $(u,F) \mapsto \sigma(x,t,u,F)$ is continuous for almost every $(x,t) \in \Omega \times (0,T)$.

(P1) (Growth and coercivity) There exist $c_1 \ge 0$, $c_2 > 0$, $\lambda_1 \in L^{p'}(\Omega \times (0,T))$, $\lambda_2 \in L^1(\Omega \times (0,T))$, $\lambda_3 \in L^{(p/\alpha)'}(\Omega \times (0,T))$, $0 < \alpha < p$, such that
\begin{align*}
|\sigma(x,t,u,F)| &\le \lambda_1(x,t) + c_1\bigl(|u|^{p-1} + |F|^{p-1}\bigr), \\
\sigma(x,t,u,F) : F &\ge -\lambda_2(x,t) - \lambda_3(x,t)|u|^\alpha + c_2 |F|^p.
\end{align*}

(P2) (Monotonicity) We assume that $\sigma$ satisfies one of the following conditions:
(a) for all $(x,t) \in \Omega \times (0,T)$ and all $u \in \mathbb{R}^m$, the map $F \mapsto \sigma(x,t,u,F)$ is a $C^1$-function and is monotone; that is, $(\sigma(x,t,u,F) - \sigma(x,t,u,G)) : (F - G) \ge 0$ for all $(x,t) \in \Omega \times (0,T)$, $u \in \mathbb{R}^m$, and $F, G \in \mathbb{M}^{m\times n}$;
(b) there exists a function $W : \Omega \times (0,T) \times \mathbb{R}^m \times \mathbb{M}^{m\times n} \to \mathbb{R}$ such that $\sigma(x,t,u,F) = (\partial W/\partial F)(x,t,u,F)$, and $F \mapsto W(x,t,u,F)$ is convex and a $C^1$-function for all $(x,t) \in \Omega \times (0,T)$ and all $u \in \mathbb{R}^m$;
(c) $\sigma$ is strictly monotone; that is, $\sigma$ is monotone, and $(\sigma(x,t,u,F) - \sigma(x,t,u,G)) : (F - G) = 0$ implies $F = G$;
(d) $\sigma(x,t,u,F)$ is strictly $p$-quasi-monotone in $F$.

The Carathéodory condition (P0) ensures that $\sigma(x,t,u(x,t),U(x,t))$ is measurable on $\Omega \times (0,T)$ for measurable functions $u : \Omega \times (0,T) \to \mathbb{R}^m$ and $U : \Omega \times (0,T) \to \mathbb{M}^{m\times n}$ (see, e.g., [25]). (P1) states standard growth and coercivity conditions. They are used in the construction of approximate solutions by a Galerkin method and when we pass to the limit. The strict monotonicity condition (c) in (P2) ensures existence of weak solutions of the corresponding parabolic systems by standard methods. However, the main point is that we do not require strict monotonicity or monotonicity in the variables $(u,F)$ in (a), (b), or (d), as it is usually assumed in previous work (see, e.g., [2], [4], [16], [18], [17], [19], and the references therein). We prove the following result.

Theorem 2. If $\sigma$ satisfies the conditions (P0)–(P2) for some $p \in (2n/(n+2), \infty)$, then the parabolic system (1)–(3) has a weak solution $u \in L^p(0,T; W_0^{1,p}(\Omega))$ for every $f \in L^{p'}(0,T; W^{-1,p'}(\Omega))$ and every $u_0 \in L^2(\Omega)$.

Remark. The result for case (d) in (P2) answers, in particular, a question by J. Frehse [10].

2. Choice of the Galerkin base
Let $s \ge 1 + n(1/2 - 1/p)$. Then $W_0^{s,2}(\Omega) \subset W_0^{1,p}(\Omega)$.
For $\zeta \in L^2(\Omega)$, we consider the linear bounded map
\[
\phi : W_0^{s,2}(\Omega) \longrightarrow \mathbb{R}, \qquad v \longmapsto (\zeta, v)_{L^2},
\]
where $(\cdot,\cdot)_{L^2}$ denotes the inner product of $L^2$. By the Riesz representation theorem there exists a unique $K\zeta \in W_0^{s,2}(\Omega)$ such that
\[
\phi(v) = (\zeta, v)_{L^2} = (K\zeta, v)_{W^{s,2}} \quad \text{for all } v \in W_0^{s,2}(\Omega).
\]
The map $L^2 \to L^2$, $\zeta \mapsto K\zeta$, is linear, symmetric, bounded, and (due to the compact embedding $W_0^{s,2}(\Omega) \subset L^2(\Omega)$) compact. Moreover, since
\[
(\zeta, K\zeta)_{L^2} = (K\zeta, K\zeta)_{W^{s,2}} \ge 0,
\]
the operator $K$ is (strictly) positive. Hence, there exists an $L^2$-orthonormal base $W := \{w_1, w_2, \dots\}$ of eigenvectors of $K$ and positive real eigenvalues $\lambda_i$ with $Kw_i = \lambda_i w_i$. This, in particular, means that $w_i \in W_0^{s,2}(\Omega)$ for all $i$ and that, for all $v \in W_0^{s,2}(\Omega)$,
\[
\lambda_i (w_i, v)_{W^{s,2}} = (Kw_i, v)_{W^{s,2}} = (w_i, v)_{L^2}. \tag{4}
\]
Notice that the functions $w_i$ are therefore orthogonal also with respect to the inner product of $W^{s,2}(\Omega)$. In fact, for $i \ne j$, we get, by choosing $v = w_j$ in (4),
\[
0 = \frac{1}{\lambda_i} (w_i, w_j)_{L^2} = (w_i, w_j)_{W^{s,2}}.
\]
Notice also that, by choosing $v = w_i$ in (4),
\[
1 = \|w_i\|_{L^2}^2 = (w_i, w_i)_{L^2} = \lambda_i (w_i, w_i)_{W^{s,2}} = \lambda_i \|w_i\|_{W^{s,2}}^2.
\]
Thus, $\tilde W = \{\tilde w_1, \tilde w_2, \dots\}$, with $\tilde w_i := \sqrt{\lambda_i}\, w_i$, is an orthonormal set for $W_0^{s,2}(\Omega)$. Actually, $\tilde W$ is a basis for $W_0^{s,2}(\Omega)$. To see this, observe that, for arbitrary $v \in W_0^{s,2}(\Omega)$, the Fourier series
\[
s_n(v) := \sum_{i=1}^n (\tilde w_i, v)_{W^{s,2}}\, \tilde w_i \longrightarrow \tilde v \quad \text{in } W_0^{s,2}(\Omega)
\]
converges to some $\tilde v$. On the other hand, we have
\[
s_n(v) = \sum_{i=1}^n (w_i, v)_{L^2}\, w_i \longrightarrow v \quad \text{in } L^2(\Omega)
\]
and, by the uniqueness of the limit, $\tilde v = v$.

We need below the $L^2$-orthonormal projector $P_k : L^2 \to L^2$ onto $\operatorname{span}(w_1, w_2, \dots, w_k)$, $k \in \mathbb{N}$. Of course, the operator norm $\|P_k\|_{\mathcal{L}(L^2, L^2)} = 1$. But notice also that $\|P_k\|_{\mathcal{L}(W^{s,2}, W^{s,2})} = 1$ since, for $u \in W_0^{s,2}(\Omega)$,
\[
P_k u = \sum_{i=1}^k (w_i, u)_{L^2}\, w_i = \sum_{i=1}^k (\tilde w_i, u)_{W^{s,2}}\, \tilde w_i.
\]
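A one-dimensional example may help to see relation (4) at work. The setting below is an assumption made only for illustration and is not taken from the text: $\Omega = (0,\pi)$, $n = 1$, $s = 1$ (which satisfies $s \ge 1 + n(1/2 - 1/p)$ when $p \le 2$), and the inner product $(u,v)_{W^{1,2}} = \int_0^\pi (u'v' + uv)\,dx$.

```latex
% Illustrative assumption: \Omega = (0,\pi), s = 1,
% (u,v)_{W^{1,2}} := \int_0^\pi (u'v' + uv)\,dx.
\begin{align*}
w_i(x) &= \sqrt{2/\pi}\,\sin(ix), \qquad -w_i'' = i^2 w_i, \qquad w_i(0) = w_i(\pi) = 0, \\
(w_i, v)_{W^{1,2}} &= \int_0^\pi \bigl(w_i' v' + w_i v\bigr)\,dx
   = \int_0^\pi \bigl(-w_i'' + w_i\bigr) v\,dx
   = (1 + i^2)\,(w_i, v)_{L^2}
   \quad \text{for } v \in W_0^{1,2}(0,\pi), \\
\lambda_i &= \frac{1}{1+i^2}, \qquad
\tilde w_i = \sqrt{\lambda_i}\,w_i, \qquad
\|\tilde w_i\|_{W^{1,2}}^2 = \lambda_i (1+i^2)\,\|w_i\|_{L^2}^2 = 1 .
\end{align*}
```

In this example the eigenfunctions of $K$ are the usual sine basis, which is orthonormal in $L^2(0,\pi)$ and, after rescaling by $\sqrt{\lambda_i}$, orthonormal in $W_0^{1,2}(0,\pi)$, exactly as in the general construction.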
3. Galerkin approximation
We make the following ansatz for approximating solutions of (1)–(3):
\[
u_k(x,t) = \sum_{i=1}^k c_{ki}(t)\, w_i(x),
\]
where $c_{ki} : [0,T) \to \mathbb{R}$ are supposed to be measurable bounded functions. Each $u_k$ satisfies the boundary condition (2) by construction in the sense that $u_k \in L^p(0,T; W_0^{1,p}(\Omega))$. We take care of the initial condition (3) by choosing the initial coefficients $c_{ki}(0) := (u_0, w_i)_{L^2}$ such that
\[
u_k(\cdot, 0) = \sum_{i=1}^k c_{ki}(0)\, w_i(\cdot) \longrightarrow u_0 \quad \text{in } L^2(\Omega) \text{ as } k \to \infty. \tag{5}
\]
We try to determine the coefficients $c_{ki}(t)$ in such a way that for all $k \in \mathbb{N}$ the system of ordinary differential equations
\[
(\partial_t u_k, w_j)_{L^2} + \int_\Omega \sigma(x,t,u_k,Du_k) : Dw_j \, dx = \langle f(t), w_j \rangle \tag{6}
\]
(with $j \in \{1, 2, \dots, k\}$) is satisfied in the sense of distributions. In (6), $\langle \cdot, \cdot \rangle$ denotes the dual pairing of $W^{-1,p'}(\Omega)$ and $W_0^{1,p}(\Omega)$. Now, we fix $k \in \mathbb{N}$ for the moment. Let $0 < \varepsilon < T$ and $J = [0, \varepsilon]$. Moreover, we choose $r > 0$ large enough such that the set $B_r(0) \subset \mathbb{R}^k$ contains the vector $(c_{k1}(0), \dots, c_{kk}(0))$, and we set $K = \overline{B_r(0)}$. Observe that, by (P0), the function
\[
F : J \times K \longrightarrow \mathbb{R}^k, \qquad
(t, c_1, \dots, c_k) \longmapsto \Bigl( \langle f(t), w_j \rangle - \int_\Omega \sigma\Bigl(x, t, \sum_{i=1}^k c_i w_i, \sum_{i=1}^k c_i Dw_i\Bigr) : Dw_j \, dx \Bigr)_{j=1,\dots,k}
\]
is a Carathéodory function. Moreover, each component $F_j$ may be estimated on $J \times K$ by
\[
|F_j(t, c_1, \dots, c_k)| \le \|f(t)\|_{W^{-1,p'}} \|w_j\|_{W_0^{1,p}}
+ \Bigl( \int_\Omega \Bigl| \sigma\Bigl(x, t, \sum_{i=1}^k c_i w_i, \sum_{i=1}^k c_i Dw_i\Bigr) \Bigr|^{p'} dx \Bigr)^{1/p'} \Bigl( \int_\Omega |Dw_j|^p \, dx \Bigr)^{1/p}. \tag{7}
\]
Using the growth condition in (P1), the right-hand side of (7) can be estimated in such a way that
\[
|F_j(t, c_1, \dots, c_k)| \le C(r,k)\, M(t) \tag{8}
\]
uniformly on $J \times K$, where $C(r,k)$ is a constant that depends on $r$ and $k$, and where $M \in L^1(J)$ (independent of $j$, $k$, and $r$). Thus, the Carathéodory existence result on ordinary differential equations (see, e.g., [13]) applied to the system
\begin{align}
c_j'(t) &= F_j\bigl(t, c_1(t), \dots, c_k(t)\bigr), \tag{9} \\
c_j(0) &= c_{kj}(0) \tag{10}
\end{align}
(for $j \in \{1, \dots, k\}$) ensures existence of a distributional, continuous solution $c_j$ (depending on $k$) of (9)–(10) on a time interval $[0, \varepsilon')$, where $\varepsilon' > 0$, a priori, may depend on $k$. Moreover, the corresponding integral equation
\[
c_j(t) = c_j(0) + \int_0^t F_j\bigl(\tau, c_1(\tau), \dots, c_k(\tau)\bigr)\, d\tau \tag{11}
\]
holds on $[0, \varepsilon')$. Then $u_k := \sum_{j=1}^k c_j(t) w_j$ is the desired (short-time) solution of (6) with initial condition (5).

Now, we want to show that the local solution constructed above can be extended to the whole interval $[0,T)$ independent of $k$. As a word of warning we should mention that the solution need not be unique.

The first thing we want to establish is a uniform bound on the coefficients $|c_{ki}(t)|$. Since (6) is linear in $w_j$, it is allowable to use $u_k$ as a test function in equation (6) in place of $w_j$. This gives, for an arbitrary time $\tau$ in the existence interval,
\[
\underbrace{\int_0^\tau (\partial_t u_k, u_k)_{L^2}\, dt}_{=:\,I}
+ \underbrace{\int_0^\tau \int_\Omega \sigma(x,t,u_k,Du_k) : Du_k \, dx\, dt}_{=:\,II}
= \underbrace{\int_0^\tau \langle f(t), u_k \rangle\, dt}_{=:\,III}.
\]
For the first term, we have
\[
I = \tfrac12 \|u_k(\cdot,\tau)\|_{L^2(\Omega)}^2 - \tfrac12 \|u_k(\cdot,0)\|_{L^2(\Omega)}^2.
\]
Using the coercivity in (P1) for the second term, we obtain
\[
II \ge -\|\lambda_2\|_{L^1(\Omega\times(0,T))} - \|\lambda_3\|_{L^{(p/\alpha)'}(\Omega\times(0,T))} \|u_k\|_{L^p(\Omega\times(0,\tau))}^\alpha + c_2 \|u_k\|_{L^p(0,\tau;W_0^{1,p}(\Omega))}^p.
\]
For the third term, we finally get
\[
III \le \|f\|_{L^{p'}(0,T;W^{-1,p'}(\Omega))} \|u_k\|_{L^p(0,\tau;W_0^{1,p}(\Omega))}.
\]
The combination of these three estimates gives
\[
\bigl| (c_{ki}(\tau))_{i=1,\dots,k} \bigr|_{\mathbb{R}^k}^2 = \|u_k(\cdot,\tau)\|_{L^2(\Omega)}^2 \le \bar C
\]
for a constant $\bar C$ that is independent of $\tau$ (and of $k$). Now, let
\[
\Lambda := \bigl\{ t \in [0,T) : \text{there exists a weak solution of (9)–(10) on } [0,t) \bigr\}.
\]
$\Lambda$ is nonempty since we proved local existence above.
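The way the three estimates for $I$, $II$, and $III$ combine into the uniform bound can be sketched as follows. This is a hedged outline, not the author's text: it assumes $p > 1$, writes $X := \|u_k\|_{L^p(0,\tau;W_0^{1,p}(\Omega))}$, uses a Poincaré constant $C_P$ (introduced here) to bound $\|u_k\|_{L^p(\Omega\times(0,\tau))} \le C_P X$, and uses $\|u_k(\cdot,0)\|_{L^2(\Omega)} \le \|u_0\|_{L^2(\Omega)}$, which holds because $u_k(\cdot,0)$ is the $L^2$-orthogonal projection of $u_0$.

```latex
% X := \|u_k\|_{L^p(0,\tau;W_0^{1,p}(\Omega))};  C_P: Poincare constant (assumption).
\begin{align*}
\tfrac12\|u_k(\cdot,\tau)\|_{L^2(\Omega)}^2 + c_2 X^p
 &\le \tfrac12\|u_0\|_{L^2(\Omega)}^2 + \|\lambda_2\|_{L^1}
   + \|\lambda_3\|_{L^{(p/\alpha)'}}\,C_P^{\alpha}\,X^{\alpha}
   + \|f\|_{L^{p'}(0,T;W^{-1,p'}(\Omega))}\,X, \\
\|\lambda_3\|_{L^{(p/\alpha)'}}\,C_P^{\alpha}\,X^{\alpha}
 &\le \tfrac{c_2}{4}X^p + C_1
   \qquad (\text{Young's inequality, since } 0<\alpha<p), \\
\|f\|_{L^{p'}(0,T;W^{-1,p'}(\Omega))}\,X
 &\le \tfrac{c_2}{4}X^p + C_2
   \qquad (\text{Young's inequality, since } p>1), \\
\tfrac12\|u_k(\cdot,\tau)\|_{L^2(\Omega)}^2 + \tfrac{c_2}{2}X^p
 &\le \tfrac12\|u_0\|_{L^2(\Omega)}^2 + \|\lambda_2\|_{L^1} + C_1 + C_2 .
\end{align*}
```

The constants $C_1$ and $C_2$ depend only on the data ($\lambda_3$, $f$, $c_2$, $p$, $\alpha$, $C_P$), so the right-hand side is independent of $\tau$ and of $k$, which is the bound by $\bar C$ stated above, together with a bound on $u_k$ in $L^p(0,T;W_0^{1,p}(\Omega))$.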
Moreover, 3 is an open set. To see this, let t ∈ 3, and let 0 < τ1 < τ2 ≤ t. Then, by (11) and (8), we have τ2 τ2 Fj τ, ck1 (τ ), . . . , ckk (τ ) dτ ≤ C C, M(t) dτ. ckj (τ1 ) − ckj (τ2 ) ≤ ¯ k τ1
τ1
Since M ∈ L1 (0, T ), this implies that τ → ckj (τ ) is uniformly continuous. Thus, we can restart to solve (6) at time t with initial data limτ t uk (τ ) and hence get a solution of (9)–(10) on [0, t + 0). Finally, we prove that 3 is also closed. To see this, we consider a sequence τi t, τi ∈ 3. Let ckj,i denote the solution of (9)–(10) we constructed on [0, τi ], and define ckj,i (τ ) if τ ∈ [0, τi ], c˜kj,i (τ ) := ckj,i (τi ) if τ ∈ (τi , t). The sequence {ckj,i }i is bounded and equicontinuous on [0, t), as seen above. Hence, by the Arzela-Ascoli theorem, a subsequence (again denoted by c˜kj,i (τ )) converges uniformly in τ on [0, t) to a continuous function ckj (τ ). Using the Lebesgue convergence theorem in (11), it is now easy to see that ckj (τ ) solves (9) on [0, t). Hence, t ∈ 3, and thus 3 is indeed closed. And as claimed, it follows that 3 = [0, T ). 4. Compactness of the Galerkin approximation By testing equation (6) by uk in place of wj , we obtain, as in Section 3, that the sequence {uk }k is bounded in 1,p L∞ 0, T ; L2 () ∩ Lp 0, T ; W0 () . Therefore, by extracting a suitable subsequence that is again denoted by uk , we may assume ∗ uk 4 u in L∞ 0, T ; L2 () , 1,p uk 4 u in Lp 0, T ; W0 () . At this point, the idea is to use J.-P. Aubin’s lemma in order to prove compactness of the sequence {uk } in an appropriate space. Technically, this is achieved by the following lemma, which is slightly more flexible than, for example, the version in [20, Chapter 1, Section 5.2] or in [22]. lemma 3 Let B, B0 , and B1 be Banach spaces, B0 and B1 reflexive. Let i : B0 → B be a compact linear map, and let j : B → B1 be an injective-bounded linear operator. For T finite and 1 < pi < ∞, i = 0, 1,
\[
W := \Big\{ v \;\Big|\; v \in L^{p_0}(0,T;B_0),\ \frac{d}{dt}(j \circ i \circ v) \in L^{p_1}(0,T;B_1) \Big\}
\]
is a Banach space under the norm $\|v\|_{L^{p_0}(0,T;B_0)} + \big\|\tfrac{d}{dt}(j \circ i \circ v)\big\|_{L^{p_1}(0,T;B_1)}$. Then if $V \subset W$ is bounded, the set $\{ i \circ v \mid v \in V \}$ is precompact in $L^{p_0}(0,T;B)$.

The proof of Lemma 3 is given in Appendix A.

Now, we apply Lemma 3 to the following case: $B_0 := W_0^{1,p}(\Omega)$, $B := L^q(\Omega)$ (for some $q$ with $2 < q < p^* := np/(n-p)$ if $p < n$ and $2 < q < \infty$ if $p \ge n$), and $B_1 := (W_0^{s,2}(\Omega))'$. Since we assume that $p \in (2n/(n+2), \infty)$, we have the following chain of continuous injections:
\[
B_0 \overset{i}{\hookrightarrow} B \overset{i_0}{\hookrightarrow} L^2(\Omega) \overset{\gamma}{\cong} \big(L^2(\Omega)\big)' \overset{i_1}{\hookrightarrow} B_1. \tag{12}
\]
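The role of the assumption $p > 2n/(n+2)$ in the chain (12) can be checked by a one-line computation. The sketch below is not part of the original argument; it assumes $p < n$ (for $p \ge n$ every finite $q$ is admissible) and the usual Sobolev exponent $p^* = np/(n-p)$.

```latex
\begin{align*}
p^* = \frac{np}{n-p} > 2
  &\iff np > 2(n-p) \\
  &\iff p\,(n+2) > 2n \\
  &\iff p > \frac{2n}{n+2}.
\end{align*}
```

So the interval $(2, p^*)$ is nonempty exactly under the standing assumption on $p$, which is what makes a choice of $q$ with $2 < q < p^*$ possible.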
Here, $L^2(\Omega) \cong (L^2(\Omega))'$ is the canonical isomorphism $\gamma$ of the Hilbert space $L^2(\Omega)$ and its dual. For $i : B_0 \to B$ we take simply the injection mapping, and for $j : B \to B_1$ we take the concatenation of injections and the canonical isomorphism given by (12), that is, $j := i_1 \circ \gamma \circ i_0$. Then, as stated at the beginning of this section, $\{u_k\}_k$ is a bounded sequence in $L^p(0,T;B_0)$. Observe that the time derivative $(d/dt)(j \circ i \circ u_k)$ is, according to (6), given by
\[
\frac{d}{dt}(j \circ i \circ u_k) : [0,T) \longrightarrow B_1 = \big(W_0^{s,2}(\Omega)\big)', \qquad
t \longmapsto \Big( \phi \longmapsto -\int_\Omega \sigma(x,t,u_k,Du_k) : D(P_k\phi)\,dx + \langle f(t), P_k\phi \rangle \Big).
\]
(We recall that the projection operators $P_k$ are self-adjoint with respect to the $L^2$ inner product.) Now we claim that indeed $\{\partial_t (j \circ i \circ u_k)\}_k$ is a bounded sequence in $L^{p'}(0,T;(W_0^{s,2}(\Omega))')$. Namely, we have by the growth condition in (P1) that
\[
\begin{aligned}
&\Big| \int_0^T\!\!\int_\Omega -\sigma(x,t,u_k,Du_k) : D(P_k\phi)\,dx\,dt + \langle f, P_k\phi \rangle \Big| \\
&\qquad \le C \Big( \|\lambda_1\|_{L^{p'}((0,T)\times\Omega)} + \|u_k\|^{p-1}_{L^p(0,T;W_0^{1,p}(\Omega))} + \|f\|_{L^{p'}(0,T;W^{-1,p'}(\Omega))} \Big) \|P_k\phi\|_{L^p(0,T;W_0^{1,p}(\Omega))},
\end{aligned}
\tag{13}
\]
and the claim follows since
\[
\|P_k\phi\|_{L^p(0,T;W_0^{1,p}(\Omega))} \le \|P_k\phi\|_{L^p(0,T;W_0^{s,2}(\Omega))} \le \|\phi\|_{L^p(0,T;W_0^{s,2}(\Omega))}.
\]
In the last inequality we used the remark at the end of Section 2. Hence, from Lemma 3, we may conclude that there exists a subsequence, which we still denote by $u_k$, having the property that
\[
u_k \longrightarrow u \quad \text{in } L^p\big(0,T;L^q(\Omega)\big) \text{ for all } q < p^* \text{ and in measure on } \Omega \times (0,T).
\]
Notice that, in order to have the strong convergence simultaneously for all $q < p^*$, the usual diagonal sequence procedure applies. For further use, we note that from (13) we can conclude that $\partial_t u$ (or rather $\partial_t (j \circ i \circ u)$) is an element of the space $L^{p'}(0,T;W^{-1,p'}(\Omega))$. (This follows easily from the fact that the set $\{\phi \in L^p(0,T;W_0^{1,p}(\Omega)) : \exists\, k \in \mathbb{N} \text{ such that } P_k\phi = \phi\}$ is dense in $L^p(0,T;W_0^{1,p}(\Omega))$, as proved in Appendix B.) See also Appendix C. Recall that the space
\[
\big\{ u \in L^p\big(0,T;W_0^{1,p}(\Omega)\big) : \partial_t (j \circ i \circ u) \in L^{p'}\big(0,T;W^{-1,p'}(\Omega)\big) \big\}
\]
is continuously embedded in $C^0([0,T];L^2(\Omega))$. Hence, we have that $u \in C^0([0,T];L^2(\Omega))$ after possible modification of $u$ on a Lebesgue zero-set of $[0,T]$. This gives $u(t,\cdot) \in L^2(\Omega)$ a pointwise interpretation for all $t \in [0,T]$ and allows in particular the statement that $u(t,\cdot)$ attains its initial value
\[
u(\cdot,0) = u_0 \tag{14}
\]
continuously in $L^2(\Omega)$ (see Appendix C for a proof of (14)). At this point we mention that, in the case when $\sigma$ depends only on $t$ and in a strictly quasi-monotonic way on $Du$, a quite simple proof gives the existence result. This is carried out in Appendix D. However, to obtain the general result stated in Theorem 2, some more work is needed in order to pass to the limit.

5. The Young measure generated by the Galerkin approximation

The sequence (or at least a subsequence) of the gradients $Du_k$ generates a Young measure $\nu_{(x,t)}$, and since $u_k$ converges in measure to $u$ on $\Omega \times (0,T)$, the sequence $(u_k, Du_k)$ generates the Young measure $\delta_{u(x,t)} \otimes \nu_{(x,t)}$ (see, e.g., [11]). Now, we collect some facts about the Young measure $\nu$ in the following proposition.

Proposition 4
The Young measure $\nu_{(x,t)}$ generated by the sequence $\{Du_k\}_k$ has the following properties:
(i) $\nu_{(x,t)}$ is a probability measure on $\mathbb{M}^{m\times n}$ for almost all $(x,t) \in \Omega \times (0,T)$;
(ii) $\nu_{(x,t)}$ satisfies $Du(x,t) = \langle \nu_{(x,t)}, \mathrm{id} \rangle$ for almost every $(x,t) \in \Omega \times (0,T)$;
(iii) $\nu_{(x,t)}$ has finite $p$th moment for almost all $(x,t) \in \Omega \times (0,T)$;
(iv) $\nu_{(x,t)}$ is a homogeneous $W^{1,p}$-gradient Young measure for almost all $(x,t) \in \Omega \times (0,T)$.
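Proof
(i) The first observation is simple. To see that $\nu_{(x,t)}$ is a probability measure on $\mathbb{M}^{m\times n}$ for almost all $(x,t) \in \Omega \times (0,T)$, it suffices to recall the fact that $Du_k$ is a bounded sequence in $L^1(\Omega \times (0,T))$ and to use the fundamental theorem in [1].

(ii) As we stated at the beginning of Section 4, $\{Du_k\}_k$ is bounded in $L^p(0,T;L^p(\Omega))$, and we may assume that
\[
Du_k \rightharpoonup Du \quad \text{in } L^p\big(0,T;L^p(\Omega)\big).
\]
On the other hand, it follows that the sequence $\{Du_k\}_k$ is equiintegrable on $\Omega \times (0,T)$, and hence, by the Dunford-Pettis theorem (see, e.g., [7]), the sequence is sequentially weakly precompact in $L^1(\Omega \times (0,T))$, which implies that
\[
Du_k \rightharpoonup \langle \nu_{(x,t)}, \mathrm{id} \rangle \quad \text{in } L^1\big(0,T;L^1(\Omega)\big).
\]
Hence, we have $Du(x,t) = \langle \nu_{(x,t)}, \mathrm{id} \rangle$ for almost every $(x,t) \in \Omega \times (0,T)$.

(iii) The next thing we have to check is that $\nu_{(x,t)}$ has finite $p$th moment for almost all $(x,t) \in \Omega \times (0,T)$. To see this, we choose a cutoff function $\eta \in C_0^\infty(B_{2\alpha}(0);\mathbb{R}^m)$ with $\eta = \mathrm{id}$ on $B_\alpha(0)$ for some $\alpha > 0$. Then the sequence
\[
D(\eta \circ u_k) = (D\eta)(u_k)\,Du_k
\]
generates a probability Young measure $\nu^\eta_{(x,t)}$ on $\Omega \times (0,T)$ with finite $p$th moment; that is,
\[
\int_{\mathbb{M}^{m\times n}} |\lambda|^p \,d\nu^\eta_{(x,t)}(\lambda) < \infty
\]
for almost all $(x,t) \in \Omega \times (0,T)$. Now, for $\phi \in C_0^\infty(\mathbb{M}^{m\times n})$ we have
\[
\phi\big(D(\eta \circ u_k)\big) \rightharpoonup \langle \nu^\eta_{(x,t)}, \phi \rangle = \int_{\mathbb{M}^{m\times n}} \phi(\lambda)\,d\nu^\eta_{(x,t)}(\lambda)
\]
weakly in $L^1(\Omega \times (0,T))$. Rewriting the left-hand side, we have also (see, e.g., [11])
\[
\phi\big((D\eta)(u_k)\,Du_k\big) \rightharpoonup \int_{\mathbb{M}^{m\times n}} \phi\big(D\eta(u(x,t))\,\lambda\big)\,d\nu_{(x,t)}(\lambda).
\]
Hence,
\[
\nu^\eta_{(x,t)} = \nu_{(x,t)} \quad \text{if } |u(x,t)| < \alpha.
\]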
Since $\alpha$ was arbitrary, it follows that indeed $\nu_{(x,t)}$ has finite $p$th moment for almost all $(x,t) \in \Omega \times (0,T)$.

(iv) Finally, we have to show that $\{\nu_{(x,t)}\}_{x \in \Omega}$ is for almost all $t \in (0,T)$ a $W^{1,p}$-gradient Young measure. To see this, we take a quasi-convex function $q$ on $\mathbb{M}^{m\times n}$ with $q(F)/|F| \to 1$ as $F \to \infty$. Then we fix $x \in \Omega$, $\delta \in (0,1)$ and use inequality (1.21) from [15, Lemma 1.6] with $u$ replaced by $u_k(x,t)$, with $a := u(x,t) - Du(x,t)x$, and with $X := Du(x,t)$. Furthermore, we choose $r > 0$ such that $B_r(x) \subset \Omega$. Observe that the singular part of the distributional gradient vanishes for $u_k$ and that, after integrating the inequality over the time interval $[t_0 - \varepsilon, t_0 + \varepsilon] \subset (0,T)$, we get
\[
\begin{aligned}
&\int_{t_0-\varepsilon}^{t_0+\varepsilon}\!\int_{B_r(x)} q\big(Du_k(y,t)\big)\,dy\,dt
+ \frac{1}{(1-\delta)r} \int_{t_0-\varepsilon}^{t_0+\varepsilon}\!\int_{B_r(x)\setminus B_{\delta r}(x)} \big|u_k(y,t) - u(x,t) - Du(x,t)(y-x)\big|\,dy\,dt \\
&\qquad \ge |B_{\delta r}(x)| \int_{t_0-\varepsilon}^{t_0+\varepsilon} q\big(Du(x,t)\big)\,dt.
\end{aligned}
\]
Letting $k$ tend to infinity in the inequality above, we obtain
\[
\begin{aligned}
&\int_{t_0-\varepsilon}^{t_0+\varepsilon}\!\int_{B_r(x)}\!\int_{\mathbb{M}^{m\times n}} q(\lambda)\,d\nu_{(y,t)}(\lambda)\,dy\,dt
+ \frac{1}{(1-\delta)r} \int_{t_0-\varepsilon}^{t_0+\varepsilon}\!\int_{B_r(x)\setminus B_{\delta r}(x)} \big|u(y,t) - u(x,t) - Du(x,t)(y-x)\big|\,dy\,dt \\
&\qquad \ge |B_{\delta r}(x)| \int_{t_0-\varepsilon}^{t_0+\varepsilon} q\big(Du(x,t)\big)\,dt.
\end{aligned}
\]
Now, we let $\varepsilon \to 0$ and $r \to 0$ and use the differentiability properties of Sobolev functions (see, e.g., [8]) and obtain that, for almost all $(x,t_0) \in \Omega \times (0,T)$,
\[
\int_{\mathbb{M}^{m\times n}} q(\lambda)\,d\nu_{(x,t_0)}(\lambda) \ge \frac{|B_{\delta r}(x)|}{|B_r(x)|}\, q\big(Du(x,t_0)\big).
\]
Since $\delta \in (0,1)$ was arbitrary, we conclude that Jensen's inequality holds true for $q$ and the measure $\nu_{(x,t)}$ for almost all $(x,t) \in \Omega \times (0,T)$. Using the characterization of $W^{1,p}$-gradient Young measures of [14] (e.g., in the form of [15, Theorem 8.1]), we conclude that in fact $\{\nu_{(x,t)}\}_{x \in \Omega}$ is a $W^{1,p}$-gradient Young measure on $\Omega$ for almost all $t \in (0,T)$. By the localization principle for gradient Young measures, we conclude then that $\nu_{(x,t)}$ is a homogeneous $W^{1,p}$-gradient Young measure for almost all $(x,t) \in \Omega \times (0,T)$.

6. A parabolic div-curl inequality

In this section, we prove a parabolic version of a “div-curl lemma” (see also [6, Lemma 11]), which is the key ingredient for passing to the limit in the approximating equations and for proving that the weak limit $u$ of the Galerkin approximations $u_k$ is indeed a solution of (1)–(3). Let us consider the sequence
\[
I_k := \big(\sigma(x,t,u_k,Du_k) - \sigma(x,t,u,Du)\big) : \big(Du_k - Du\big),
\]
and let us prove that its negative part $I_k^-$ is equiintegrable on $\Omega \times (0,T)$. To do this, we write $I_k$ in the form
\[
\begin{aligned}
I_k &= \sigma(x,t,u_k,Du_k) : Du_k - \sigma(x,t,u_k,Du_k) : Du - \sigma(x,t,u,Du) : Du_k + \sigma(x,t,u,Du) : Du \\
&=: II_k + III_k + IV_k + V_k.
\end{aligned}
\]
The sequences $II_k^-$ and $V_k^-$ are easily seen to be equiintegrable by the coercivity condition in (P1). Then, to see equiintegrability of the sequence $III_k$, we take a measurable subset $S \subset \Omega \times (0,T)$ and write
\[
\begin{aligned}
\int_S \big|\sigma(x,t,u_k,Du_k) : Du\big|\,dx\,dt
&\le \Big(\int_S |\sigma(x,t,u_k,Du_k)|^{p'}\,dx\,dt\Big)^{1/p'} \Big(\int_S |Du|^p\,dx\,dt\Big)^{1/p} \\
&\le C \Big(\int_S \big(|\lambda_1(x,t)|^{p'} + |u_k|^p + |Du_k|^p\big)\,dx\,dt\Big)^{1/p'} \Big(\int_S |Du|^p\,dx\,dt\Big)^{1/p}.
\end{aligned}
\]
The first integral is uniformly bounded in $k$ (see Section 4). The second integral is arbitrarily small if the measure of $S$ is chosen small enough. A similar argument gives the equiintegrability of the sequence $IV_k$. Having established the equiintegrability of $I_k^-$, it follows by the Fatou lemma [6, Lemma 6] that
\[
X := \liminf_{k\to\infty} \int_0^T\!\!\int_\Omega I_k\,dx\,dt
\ge \int_0^T\!\!\int_\Omega\!\int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda) : (\lambda - Du)\,d\nu_{(x,t)}(\lambda)\,dx\,dt. \tag{15}
\]
On the other hand, we now see that $X \le 0$. According to Mazur's theorem (see, e.g., [24, Theorem 2, p. 120]), there exists a sequence $v_k$ in $L^p(0,T;W_0^{1,p}(\Omega))$ where each $v_k$ is a convex linear combination of $\{u_1, \dots, u_k\}$ such that $v_k \to u$ in $L^p(0,T;W_0^{1,p}(\Omega))$. In particular, $v_k(t,\cdot) \in \operatorname{span}(w_1, w_2, \dots, w_k)$ for all $t \in [0,T]$. Now, we have
\[
\begin{aligned}
X &= \liminf_{k\to\infty} \int_0^T\!\!\int_\Omega \sigma(x,t,u_k,Du_k) : (Du_k - Du)\,dx\,dt \\
&= \liminf_{k\to\infty} \Big( \int_0^T\!\!\int_\Omega \sigma(x,t,u_k,Du_k) : (Du_k - Dv_k)\,dx\,dt + \int_0^T\!\!\int_\Omega \sigma(x,t,u_k,Du_k) : (Dv_k - Du)\,dx\,dt \Big) \\
&\le \liminf_{k\to\infty} \Big( \Big(\int_0^T\!\!\int_\Omega |\sigma(x,t,u_k,Du_k)|^{p'}\,dx\,dt\Big)^{1/p'} \|v_k - u\|_{L^p(0,T;W^{1,p}(\Omega))} \\
&\qquad\qquad\quad + \langle f, u_k - v_k \rangle - \int_0^T\!\!\int_\Omega (u_k - v_k)\,\partial_t u_k\,dx\,dt \Big).
\end{aligned}
\tag{16}
\]
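Observe that $u_k - v_k \in \operatorname{span}(w_1, w_2, \dots, w_k)$, which allows us to use (6) in the inequality above. The first factor in the first term in (16),
\[
\Big(\int_0^T\!\!\int_\Omega |\sigma(x,t,u_k,Du_k)|^{p'}\,dx\,dt\Big)^{1/p'},
\]
is uniformly bounded in $k$ by the growth condition in (P1) and the bound for $u_k$ in $L^p(0,T;W^{1,p}(\Omega))$ (see Section 4). The second factor, $\|v_k - u\|_{L^p(0,T;W^{1,p}(\Omega))}$, converges to zero for $k \to \infty$ by construction of the sequence $v_k$. Hence, the first term in (16) vanishes in the limit. The second term in (16), $\langle f, u_k - v_k \rangle$, converges to zero since $u_k - v_k \rightharpoonup 0$ in $L^p(0,T;W^{1,p}(\Omega))$. Finally, for the last term in (16), we have
\[
\begin{aligned}
-\int_0^T\!\!\int_\Omega (u_k - v_k)\,\partial_t u_k\,dx\,dt
&= -\int_0^T\!\!\int_\Omega \tfrac12\,\partial_t u_k^2\,dx\,dt + \int_0^T\!\!\int_\Omega v_k\,\partial_t u_k\,dx\,dt \\
&= -\tfrac12 \|u_k(\cdot,T)\|^2_{L^2(\Omega)} + \tfrac12 \|u_k(\cdot,0)\|^2_{L^2(\Omega)} + \int_0^T\!\!\int_\Omega v_k\,\partial_t u_k\,dx\,dt.
\end{aligned}
\tag{17}
\]
Concerning the last term in (17), we claim that for $k \to \infty$ we have
\[
\int_0^T\!\!\int_\Omega v_k\,\partial_t u_k\,dx\,dt \longrightarrow \int_0^T\!\!\int_\Omega u\,\partial_t u\,dx\,dt = \tfrac12 \|u(\cdot,T)\|^2_{L^2(\Omega)} - \tfrac12 \|u_0\|^2_{L^2(\Omega)}. \tag{18}
\]
To see this, let $\varepsilon > 0$ be given. Then there exists $M$ such that for all $l \ge m \ge M$ we have
(i) $\big|\int_0^T\!\int_\Omega (u - v_m)\,\partial_t u\,dx\,dt\big| \le \varepsilon$. This is possible since $\partial_t (j \circ i \circ u) \in L^{p'}(0,T;W^{-1,p'}(\Omega))$ and $v_m \to u$ in $L^p(0,T;W_0^{1,p}(\Omega))$.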
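(ii) $\big|\int_0^T\!\int_\Omega (v_l - v_m)\,\partial_t u_l\,dx\,dt\big| \le \varepsilon$. This is possible by (13) since $v_l - v_m \in \operatorname{span}(w_1, \dots, w_l)$ for all fixed $t \in (0,T)$.

Now, we fix $m \ge M$ and choose $m_0 \ge m$ such that, for all $l \ge m_0$,
\[
\Big|\int_0^T\!\!\int_\Omega v_m\,(\partial_t u_l - \partial_t u)\,dx\,dt\Big| \le \varepsilon.
\]
This is possible since $\partial_t u_l \overset{*}{\rightharpoonup} \partial_t u$ in $L^{p'}(0,T;(W_0^{s,2}(\Omega))')$. Combination yields, for all $l = l(\varepsilon) \ge m_0(\varepsilon)$,
\[
\begin{aligned}
\Big|\int_0^T\!\!\int_\Omega v_l\,\partial_t u_l\,dx\,dt - \int_0^T\!\!\int_\Omega u\,\partial_t u\,dx\,dt\Big|
&\le \Big|\int_0^T\!\!\int_\Omega (v_l - v_m)\,\partial_t u_l\,dx\,dt\Big| + \Big|\int_0^T\!\!\int_\Omega v_m\,(\partial_t u_l - \partial_t u)\,dx\,dt\Big| \\
&\quad + \Big|\int_0^T\!\!\int_\Omega (v_m - u)\,\partial_t u\,dx\,dt\Big| \le 3\varepsilon.
\end{aligned}
\]
This establishes (18). On the other hand, since $\{u_k\}_k$ is bounded in $L^\infty(0,T;L^2(\Omega))$, we have (after extraction of a further subsequence if necessary) that $u_k(\cdot,T) \rightharpoonup u(\cdot,T)$ in $L^2(\Omega)$ (see Appendix C for a proof). Hence,
\[
\liminf_{k\to\infty} \|u_k(\cdot,T)\|_{L^2(\Omega)} \ge \|u(\cdot,T)\|_{L^2(\Omega)}. \tag{19}
\]
By construction of $u_k$, we also have
\[
\lim_{k\to\infty} \|u_k(\cdot,0)\|_{L^2(\Omega)} = \|u_0\|_{L^2(\Omega)}. \tag{20}
\]
Using (19), (20), and (18) in (17), we conclude
\[
\liminf_{k\to\infty} \Big( -\int_0^T\!\!\int_\Omega (u_k - v_k)\,\partial_t u_k\,dx\,dt \Big) \le 0.
\]
This establishes $X \le 0$, and we infer from (15) that the following “div-curl inequality” holds.

Lemma 5
The Young measure $\nu_{(x,t)}$ generated by the gradients $Du_k$ of the Galerkin approximations $u_k$ has the property that
\[
\int_0^T\!\!\int_\Omega\!\int_{\mathbb{M}^{m\times n}} \big(\sigma(x,t,u,\lambda) - \sigma(x,t,u,Du)\big) : (\lambda - Du)\,d\nu_{(x,t)}(\lambda)\,dx\,dt \le 0. \tag{21}
\]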
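The step from (15) and $X \le 0$ to (21) is short, and the following sketch spells it out. It uses only Proposition 4(ii), namely $\langle \nu_{(x,t)}, \mathrm{id} \rangle = Du(x,t)$, and the fact that $\nu_{(x,t)}$ is a probability measure; the notation $\mathbb{M}^{m\times n}$ for the matrix space is a typographical assumption.

```latex
% Since Du(x,t) does not depend on \lambda and \nu_{(x,t)} has barycenter Du(x,t):
\[
\int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,Du) : (\lambda - Du)\, d\nu_{(x,t)}(\lambda)
  = \sigma(x,t,u,Du) : \big( \langle \nu_{(x,t)}, \mathrm{id} \rangle - Du \big) = 0 .
\]
% Subtracting this vanishing term under the integral in (15) gives
\[
\int_0^T\!\!\int_\Omega\!\int_{\mathbb{M}^{m\times n}}
  \big( \sigma(x,t,u,\lambda) - \sigma(x,t,u,Du) \big) : (\lambda - Du)\,
  d\nu_{(x,t)}(\lambda)\, dx\, dt
  \;\le\; X \;\le\; 0 .
\]
```

This is why (15) can be stated without the term $\sigma(x,t,u,Du)$ while (21) contains it.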
7. Passage to the limit

We start with the easiest case.

Case (d)
Suppose that $\nu_{(x,t)}$ is not a Dirac mass for $(x,t)$ in a set $M \subset \Omega \times (0,T)$ of positive Lebesgue measure $|M| > 0$. Then, by the strict $p$-quasi-monotonicity of $\sigma(x,t,u,\cdot)$ and by the fact that $\nu_{(x,t)}$ is a homogeneous $W^{1,p}$-gradient Young measure (see Section 5) for almost all $(x,t) \in \Omega \times (0,T)$, we have, for a.e. $(x,t) \in M$,
\[
\int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda) : \lambda\,d\nu_{(x,t)}(\lambda)
> \int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda)\,d\nu_{(x,t)}(\lambda) : \underbrace{\int_{\mathbb{M}^{m\times n}} \lambda\,d\nu_{(x,t)}(\lambda)}_{=\,Du(x,t)}.
\]
Hence, by integrating over $\Omega \times (0,T)$, we get, together with Lemma 5,
\[
\begin{aligned}
\int_0^T\!\!\int_\Omega \int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda)\,d\nu_{(x,t)}(\lambda) : Du(x,t)\,dx\,dt
&\ge \int_0^T\!\!\int_\Omega\!\int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda) : \lambda\,d\nu_{(x,t)}(\lambda)\,dx\,dt \\
&> \int_0^T\!\!\int_\Omega \int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda)\,d\nu_{(x,t)}(\lambda) : Du(x,t)\,dx\,dt,
\end{aligned}
\]
which is a contradiction. Hence, we have $\nu_{(x,t)} = \delta_{Du(x,t)}$ for almost every $(x,t) \in \Omega \times (0,T)$. From this it follows that $Du_k \to Du$ on $\Omega \times (0,T)$ in measure for $k \to \infty$ (see, e.g., [11]), and thus $\sigma(x,t,u_k,Du_k) \to \sigma(x,t,u,Du)$ almost everywhere on $\Omega \times (0,T)$ (up to extraction of a further subsequence). Since, by the growth condition in (P1), $\sigma(x,t,u_k,Du_k)$ is equiintegrable, it follows that $\sigma(x,t,u_k,Du_k) \to \sigma(x,t,u,Du)$ in $L^1(\Omega \times (0,T))$ by the Vitali convergence theorem. Now, we take a test function $w \in \bigcup_{i\in\mathbb{N}} \operatorname{span}(w_1, \dots, w_i)$ and $\phi \in C_0^\infty([0,T])$ in (6) and integrate over the interval $(0,T)$ and pass to the limit $k \to \infty$. The resulting equation is
\[
\int_0^T\!\!\int_\Omega \partial_t u\,\phi(t)\,w(x)\,dx\,dt + \int_0^T\!\!\int_\Omega \sigma(x,t,u,Du) : Dw(x)\,\phi(t)\,dx\,dt = \langle f, \phi w \rangle
\]
for arbitrary $w \in \bigcup_{i\in\mathbb{N}} \operatorname{span}(w_1, \dots, w_i)$ and $\phi \in C^\infty([0,T])$. By density of the linear span of these functions in $L^p(0,T;W^{1,p}(\Omega))$, this proves that $u$ is in fact a weak solution. Hence, the theorem follows in case (d).

Now, we prepare the proof of Theorem 2 in the remaining cases as follows. Observe that the integrand in (21) is nonnegative by monotonicity. Thus, it follows from Lemma 5 that the integrand must vanish almost everywhere with respect to the product measure $d\nu_{(x,t)} \otimes dx \otimes dt$. Hence, we have that, for almost all $(x,t) \in \Omega \times (0,T)$,
\[
\big(\sigma(x,t,u,\lambda) - \sigma(x,t,u,Du)\big) : (\lambda - Du) = 0 \quad \text{on } \operatorname{spt} \nu_{(x,t)} \tag{22}
\]
and thus
\[
\operatorname{spt} \nu_{(x,t)} \subset \big\{ \lambda \;\big|\; \big(\sigma(x,t,u,\lambda) - \sigma(x,t,u,Du)\big) : (\lambda - Du) = 0 \big\}. \tag{23}
\]
Now, we proceed with the proof in the single cases (a), (b), and (c) of (P2). We start with the simplest case (c).

Case (c)
By strict monotonicity, it follows from (22) or (23) that $\nu_{(x,t)} = \delta_{Du(x,t)}$ for almost all $(x,t) \in \Omega \times (0,T)$, and hence $Du_k \to Du$ in measure on $\Omega \times (0,T)$. The rest of the proof is identical to the proof for case (d).

Case (b)
We start by showing that, for almost all $(x,t) \in \Omega \times (0,T)$, the support of $\nu_{(x,t)}$ is contained in the set where $W$ agrees with the supporting hyperplane $L := \{(\lambda, W(x,t,u,Du) + \sigma(x,t,u,Du) : (\lambda - Du))\}$ in $Du(x,t)$; that is, we want to show that
\[
\operatorname{spt} \nu_{(x,t)} \subset K_{(x,t)} = \big\{ \lambda \in \mathbb{M}^{m\times n} : W(x,t,u,\lambda) = W(x,t,u,Du) + \sigma(x,t,u,Du) : (\lambda - Du) \big\}.
\]
If $\lambda \in \operatorname{spt} \nu_{(x,t)}$, then, by (23),
\[
(1-\tau)\big(\sigma(x,t,u,Du) - \sigma(x,t,u,\lambda)\big) : (Du - \lambda) = 0 \quad \text{for all } \tau \in [0,1]. \tag{24}
\]
On the other hand, by monotonicity, we have for $\tau \in [0,1]$ that
\[
0 \le (1-\tau)\big(\sigma(x,t,u,Du + \tau(\lambda - Du)) - \sigma(x,t,u,\lambda)\big) : (Du - \lambda). \tag{25}
\]
Subtracting (24) from (25), we get
\[
0 \le (1-\tau)\big(\sigma(x,t,u,Du + \tau(\lambda - Du)) - \sigma(x,t,u,Du)\big) : (Du - \lambda) \tag{26}
\]
for all $\tau \in [0,1]$. But, by monotonicity, in (26) also the reverse inequality holds, and we may conclude that
\[
\big(\sigma(x,t,u,Du + \tau(\lambda - Du)) - \sigma(x,t,u,Du)\big) : (\lambda - Du) = 0 \tag{27}
\]
for all $\tau \in [0,1]$, whenever $\lambda \in \operatorname{spt} \nu_{(x,t)}$. Now, it follows from (27) that
\[
\begin{aligned}
W(x,t,u,\lambda) &= W(x,t,u,Du) + \int_0^1 \sigma\big(x,t,u,Du + \tau(\lambda - Du)\big) : (\lambda - Du)\,d\tau \\
&= W(x,t,u,Du) + \sigma(x,t,u,Du) : (\lambda - Du),
\end{aligned}
\]
as claimed.
By the convexity of $W$ we have $W(x,t,u,\lambda) \ge W(x,t,u,Du) + \sigma(x,t,u,Du) : (\lambda - Du)$ for all $\lambda \in \mathbb{M}^{m\times n}$, and thus $L$ is a supporting hyperplane for all $\lambda \in K_{(x,t)}$. Since the mapping $\lambda \mapsto W(x,t,u,\lambda)$ is by assumption continuously differentiable, we obtain
\[
\sigma(x,t,u,\lambda) = \sigma(x,t,u,Du) \quad \text{for all } \lambda \in K_{(x,t)} \supset \operatorname{spt} \nu_{(x,t)}, \tag{28}
\]
and thus
\[
\bar\sigma := \int_{\mathbb{M}^{m\times n}} \sigma(x,t,u,\lambda)\,d\nu_{(x,t)}(\lambda) = \sigma(x,t,u,Du). \tag{29}
\]
Now consider the Carathéodory function
\[
g(x,t,u,p) = \big|\sigma(x,t,u,p) - \bar\sigma(x,t)\big|.
\]
The sequence $g_k(x,t) = g(x,t,u_k(x,t),Du_k(x,t))$ is equiintegrable, and thus
\[
g_k \rightharpoonup \bar g \quad \text{weakly in } L^1\big(\Omega \times (0,T)\big)
\]
and the weak limit $\bar g$ is given by
\[
\begin{aligned}
\bar g(x,t) &= \int_{\mathbb{R}^m \times \mathbb{M}^{m\times n}} \big|\sigma(x,t,\eta,\lambda) - \bar\sigma(x,t)\big|\,d\delta_{u(x,t)}(\eta) \otimes d\nu_{(x,t)}(\lambda) \\
&= \int_{\operatorname{spt} \nu_{(x,t)}} \big|\sigma(x,t,u(x,t),\lambda) - \bar\sigma(x,t)\big|\,d\nu_{(x,t)}(\lambda) = 0
\end{aligned}
\]
by (28) and (29). Since $g_k \ge 0$, it follows that
\[
g_k \longrightarrow 0 \quad \text{strongly in } L^1\big(\Omega \times (0,T)\big).
\]
This again suffices to pass to the limit in the equation, and the proof of case (b) is finished.

Case (a)
We claim that in this case for almost all $(x,t) \in \Omega \times (0,T)$ the following identity holds for all $\mu \in \mathbb{M}^{m\times n}$ on the support of $\nu_{(x,t)}$:
\[
\sigma(x,t,u,\lambda) : \mu = \sigma(x,t,u,Du) : \mu + \big(\nabla\sigma(x,t,u,Du)\,\mu\big) : (Du - \lambda), \tag{30}
\]
where $\nabla$ is the derivative with respect to the third variable of $\sigma$. Indeed, by the monotonicity of $\sigma$ we have, for all $\tau \in \mathbb{R}$,
\[
\big(\sigma(x,t,u,\lambda) - \sigma(x,t,u,Du + \tau\mu)\big) : (\lambda - Du - \tau\mu) \ge 0,
\]
whence, by (22),
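\[
\begin{aligned}
-\sigma(x,t,u,\lambda) : (\tau\mu) &\ge -\sigma(x,t,u,Du) : (\lambda - Du) + \sigma(x,t,u,Du + \tau\mu) : (\lambda - Du - \tau\mu) \\
&= \tau \Big( \big(\nabla\sigma(x,t,u,Du)\,\mu\big) : (\lambda - Du) - \sigma(x,t,u,Du) : \mu \Big) + o(\tau).
\end{aligned}
\]
The claim follows from this inequality since the sign of $\tau$ is arbitrary. Since the sequence $\sigma(x,t,u_k,Du_k)$ is equiintegrable, its weak $L^1$-limit $\bar\sigma$ is given by
\[
\begin{aligned}
\bar\sigma &= \int_{\operatorname{spt} \nu_{(x,t)}} \sigma(x,t,u,\lambda)\,d\nu_{(x,t)}(\lambda) \\
&= \int_{\operatorname{spt} \nu_{(x,t)}} \sigma(x,t,u,Du)\,d\nu_{(x,t)}(\lambda) + \big(\nabla\sigma(x,t,u,Du)\big)^t \int_{\operatorname{spt} \nu_{(x,t)}} (Du - \lambda)\,d\nu_{(x,t)}(\lambda) \\
&= \sigma(x,t,u,Du),
\end{aligned}
\]
where we used (30) in this calculation, together with the fact that $\int (Du - \lambda)\,d\nu_{(x,t)}(\lambda) = 0$ by Proposition 4(ii). This finishes the proof of case (a) and hence of Theorem 2.

Remark. Notice that in case (a) we have $\sigma(x,t,u_k,Du_k) \rightharpoonup \sigma(x,t,u,Du)$, in case (b) we have $\sigma(x,t,u_k,Du_k) \to \sigma(x,t,u,Du)$ in $L^1(\Omega \times (0,T))$, and in case (c) we even have $Du_k \to Du$ in measure on $\Omega \times (0,T)$ as $k \to \infty$.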
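To make the monotonicity hypotheses used in cases (b) and (c) concrete, here is a standard model that is not taken from the paper: the $p$-Laplacian-type map $\sigma(F) = |F|^{p-2}F$ with potential $W(F) = \tfrac1p |F|^p$. The sketch only checks strict monotonicity of this $\sigma$; the symbols $F$, $G$ and the choice of $W$ are illustrative assumptions, and the remaining hypotheses (P0) and (P1) are not verified here.

```latex
% Illustrative model only: W and \sigma below are not taken from the paper.
\[
W(F) = \tfrac1p\,|F|^p, \qquad
\sigma(F) = \frac{\partial W}{\partial F}(F) = |F|^{p-2}F, \qquad 1 < p < \infty .
\]
% For nonzero F, G in \mathbb{M}^{m\times n}, using F:G \le |F|\,|G| :
\begin{align*}
\big(\sigma(F) - \sigma(G)\big) : (F - G)
  &= |F|^p + |G|^p - \big(|F|^{p-2} + |G|^{p-2}\big)\, F:G \\
  &\ge |F|^p + |G|^p - |F|^{p-1}|G| - |G|^{p-1}|F| \\
  &= \big(|F|^{p-1} - |G|^{p-1}\big)\big(|F| - |G|\big) \;\ge\; 0 .
\end{align*}
% Equality forces |F| = |G| and F:G = |F|\,|G|, that is, F = G.
% If G = 0, the left-hand side is simply |F|^p, which is positive for F \ne 0.
```

For a map of this kind the set in (23) reduces to the single point $\lambda = Du$, which is the mechanism behind $\nu_{(x,t)} = \delta_{Du(x,t)}$ in case (c).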
Appendices

Appendix A

Here we give the proof of the modified lemma of Aubin, that is, Lemma 3.

Proof of Lemma 3
Let $\tilde B_0 := j(i(B_0)) \subset B_1$ be the Banach space equipped with the norm
\[
\|\tilde x\|_{\tilde B_0} := \inf_{\substack{x \in B_0 \\ j \circ i(x) = \tilde x}} \|x\|_{B_0},
\]
and let $\tilde B := j(B) \subset B_1$ be the Banach space equipped with the norm $\|x\|_{\tilde B} := \|j^{-1}(x)\|_B$. (We recall that $j$ is supposed to be injective.) Now, we consider a bounded sequence $\{v_n\}_n$ in $W$. Let $\tilde v_n := j \circ i \circ v_n$. Then $\{\tilde v_n\}_n$ is bounded in
\[
\tilde W := \Big\{ \tilde v \;\Big|\; \tilde v \in L^{p_0}\big(0,T;\tilde B_0\big),\ \frac{d\tilde v}{dt} \in L^{p_1}\big(0,T;B_1\big) \Big\},
\]
and by the usual Aubin lemma (see [20, Chapter 1, Section 5.2]) it follows that there exists a subsequence $\tilde v_\nu$ that converges strongly in $L^{p_0}(0,T;\tilde B)$. By isometry of $B$ and $\tilde B$, the claim follows.
Appendix B

Let $u$ be an arbitrary function in $L^p(0,T;W_0^{1,p}(\Omega))$. We want to construct a sequence $v_k \in L^p(0,T;W_0^{1,p}(\Omega))$ which has the following properties:
(i) $v_k \to u$ in $L^p(0,T;W_0^{1,p}(\Omega))$;
(ii) $v_k(t) \in \operatorname{span}(w_1, w_2, \dots, w_k)$ for $0 \le t \le T$.

To construct the sequence $\{v_k\}_k$, we take $\varepsilon > 0$ (with the intention to let $\varepsilon \to 0$) and a standard mollifier $\delta_\eta$ in space-time. The function $u$ is extended by zero outside $\Omega \times [0,T] \subset \mathbb{R}^{n+1}$. Choosing $\eta > 0$ small enough, we may achieve that
\[
\|u * \delta_\eta - u\|_{L^p(0,T;W^{1,p}(\Omega))} < \varepsilon.
\]
Now, for a smooth function $\phi \in C^\infty(\bar\Omega \times [0,T])$ and $j \in \mathbb{N}$, let
\[
Q_j(\phi)(x,t) := \phi\Big(x, i\,\frac{T}{j}\Big) \quad \text{if } t \in \Big[ i\,\frac{T}{j}, (i+1)\frac{T}{j} \Big)
\]
denote the step function approximation of $\phi$ in time. We fix $j \in \mathbb{N}$ large enough such that we have
\[
\|u * \delta_\eta - Q_j(u * \delta_\eta)\|_{L^p(0,T;W^{1,p}(\Omega))} < \varepsilon.
\]
Finally, we choose $k$ large enough such that
\[
\|Q_j(u * \delta_\eta) - P_k \circ Q_j(u * \delta_\eta)\|_{L^p(0,T;W^{1,p}(\Omega))} < \varepsilon,
\]
where (as before) $P_k$ denotes the $W^{s,2}(\Omega)$-projection onto $\operatorname{span}(w_1, w_2, \dots, w_k)$. (Notice that this is possible since $t \mapsto Q_j(u * \delta_\eta)$ takes only finitely many values on $[0,T]$.) Combination yields
\[
\|u - P_k \circ Q_j(u * \delta_\eta)\|_{L^p(0,T;W^{1,p}(\Omega))} < 3\varepsilon,
\]
and hence the sequence $v_k = P_{k(\varepsilon)} \circ Q_{j(\varepsilon)}(u * \delta_{\eta(\varepsilon)})$ for $\varepsilon \to 0$ is a sequence with the properties (i)–(ii).
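Appendix C

Here, we want to prove that
\[
u_k(\cdot,T) \rightharpoonup u(\cdot,T) \quad \text{weakly in } L^2(\Omega)
\]
and that $u(\cdot,0) = u_0$. Since $\{u_k\}_k$ is bounded in $L^\infty(0,T;L^2(\Omega))$, it is clear that, for a (not relabeled) subsequence,
\[
u_k(\cdot,T) \rightharpoonup z \quad \text{weakly in } L^2(\Omega),
\]
and we have to show $z = u(\cdot,T)$. To shorten the notation, we write from now on $u(T)$ instead of $u(\cdot,T)$, and so on. In order to prove the claim, note that (again, after a possible choice of a further subsequence)
\[
-\operatorname{div} \sigma(x,t,u_k,Du_k) \rightharpoonup \chi \quad \text{weakly in } L^{p'}\big(0,T;W^{-1,p'}(\Omega)\big).
\]
Now, we claim that, for arbitrary $\psi \in C^\infty([0,T])$ and $v \in W_0^{1,p}(\Omega)$,
\[
\int_\Omega z\,\psi(T)\,v\,dx - \int_\Omega u_0\,\psi(0)\,v\,dx = \langle f - \chi, \psi v \rangle + \int_0^T\!\!\int_\Omega \psi'\,v\,u\,dx\,dt. \tag{31}
\]
Since $\bigcup_{n\in\mathbb{N}} \operatorname{span}(w_1, \dots, w_n)$ is dense in $W_0^{1,p}(\Omega)$, it suffices to verify (31) for $v \in \operatorname{span}(w_1, \dots, w_n)$. Then, by testing (6) by $v\psi$, we have, for $m \ge n$,
\[
\underbrace{\int_0^T\!\!\int_\Omega \partial_t u_m\,v\,\psi\,dx\,dt}_{=\ \int_\Omega u_m(T)\psi(T)v\,dx \,-\, \int_\Omega u_m(0)\psi(0)v\,dx \,-\, \int_0^T\!\int_\Omega u_m\,v\,\psi'\,dx\,dt} + \int_0^T\!\!\int_\Omega \sigma(x,t,u_m,Du_m) : Dv\,\psi\,dx\,dt = \langle f, v\psi \rangle.
\]
Then (31) follows by letting $m$ tend to infinity. By choosing $\psi(0) = \psi(T) = 0$ in (31), we have, in particular,
\[
\langle f - \chi, \psi v \rangle = -\int_0^T\!\!\int_\Omega \psi'\,v\,u\,dx\,dt = \int_0^T\!\!\int_\Omega \psi\,v\,u'\,dx\,dt,
\]
and hence
\[
u' + \chi = f.
\]
Using this and (31) we have, on the other hand,
\[
\begin{aligned}
\int_\Omega z\,\psi(T)\,v\,dx - \int_\Omega u_0\,\psi(0)\,v\,dx
&= \langle u', \psi v \rangle + \int_0^T\!\!\int_\Omega \psi'\,v\,u\,dx\,dt \\
&= \Big[ \int_\Omega u\,\psi\,v\,dx \Big]_0^T \\
&= \int_\Omega u(T)\,\psi(T)\,v\,dx - \int_\Omega u(0)\,\psi(0)\,v\,dx.
\end{aligned}
\tag{32}
\]
Choosing $\psi(T) = 0$, $\psi(0) = 1$ in (32), we obtain that $u(0) = u_0$, and choosing $\psi(T) = 1$, $\psi(0) = 0$, we get $u(T) = z$, as claimed.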
Appendix D

In this section, we want to assume that $\sigma$ does not depend on $x$ and $u$, and we want to replace condition (P2) by the following more classical quasi-monotonicity condition:
(P2$'$) For all fixed $t \in [0,T)$, the map $\sigma(t,F)$ is strictly quasi-monotone in the variable $F$.

Here, by strictly quasi-monotone, we mean the following.

Definition 6
A function $\eta : \mathbb{M}^{m\times n} \to \mathbb{M}^{m\times n}$ is said to be strictly quasi-monotone if there exist constants $c > 0$ and $r > 0$ such that
\[
\int_\Omega \big(\eta(Du) - \eta(Dv)\big) : (Du - Dv)\,dx \ge c \int_\Omega |Du - Dv|^r\,dx
\]
for all $u, v \in W_0^{1,p}(\Omega)$.

We want to prove the following theorem.

Theorem 7
If $\sigma(t,Du)$ satisfies conditions (P0), (P1), and (P2$'$) for some $p \in (2n/(n+2), \infty)$, then the parabolic system (1)–(3) has a weak solution $u \in L^p(0,T;W_0^{1,p}(\Omega))$ for every $f \in L^{p'}(0,T;W^{-1,p'}(\Omega))$ and every $u_0 \in L^2(\Omega)$.

Since in this case we do not have to deal with $x$- and $u$-dependence of $\sigma$, the following simple proof is possible.

Proof
Let $u_k$ and $v_k$ be constructed as in the proof of Theorem 2. Then, by using $u_k - v_k$ as a test function in (6), we obtain
\[
\begin{aligned}
\langle f, u_k - v_k \rangle - \int_0^T\!\!\int_\Omega (u_k - v_k)\,\partial_t u_k\,dx\,dt
&= \int_0^T\!\!\int_\Omega \sigma(t,Du_k) : (Du_k - Dv_k)\,dx\,dt \\
&= \int_0^T\!\!\int_\Omega \big(\sigma(t,Du_k) - \sigma(t,Dv_k)\big) : (Du_k - Dv_k)\,dx\,dt \\
&\quad + \int_0^T\!\!\int_\Omega \sigma(t,Dv_k) : (Du_k - Dv_k)\,dx\,dt.
\end{aligned}
\tag{33}
\]
The first term on the left of (33), $\langle f, u_k - v_k \rangle$, converges to zero as $k \to \infty$ since $u_k - v_k \rightharpoonup 0$ in $L^p(0,T;W_0^{1,p}(\Omega))$. For the second term on the left of (33), we have seen in Section 6 that
\[
\liminf_{k\to\infty} \Big( -\int_0^T\!\!\int_\Omega (u_k - v_k)\,\partial_t u_k\,dx\,dt \Big) \le 0.
\]
The last term on the right of (33) converges to zero for $k \to \infty$ since $\sigma(t,Dv_k) \to \sigma(t,Du)$ in $L^{p'}(0,T;L^{p'}(\Omega))$ (at least for a subsequence) and $Du_k - Dv_k \rightharpoonup 0$ in $L^p(0,T;L^p(\Omega))$. We conclude that
\[
o(1) = \int_0^T\!\!\int_\Omega \big(\sigma(t,Du_k) - \sigma(t,Dv_k)\big) : (Du_k - Dv_k)\,dx\,dt \ge c \int_0^T\!\!\int_\Omega |Du_k - Dv_k|^r\,dx\,dt.
\]
This implies $Du_k \to Du$ in measure for a suitable subsequence. The rest of the proof is as in case (d) in Section 7.

Acknowledgments. This work was carried out while the author was at the Max Planck Institute for Mathematics in the Sciences in Leipzig. I am most grateful to Professors Stefan Müller and Eberhard Zeidler for their support and for providing and maintaining such a stimulating place. I would like to thank Professors Müller and Michael Struwe for their continuing interest in this work. I am especially grateful to Professor Jens Frehse for initiating my interest in parabolic systems with weak monotonicity assumptions. Finally, I would like to thank Jan Kristensen and Mikhail Sytchev for inspiring discussions and many useful hints.
References

[1] J. M. BALL, “A version of the fundamental theorem for Young measures” in PDEs and Continuum Models of Phase Transitions (Nice, 1988), Lecture Notes in Phys. 344, Springer, Berlin, 1989, 207–215.
[2] L. BOCCARDO and F. MURAT, Almost everywhere convergence of the gradients of solutions to elliptic and parabolic equations, Nonlinear Anal. 19 (1992), 581–597.
[3] H. BRÉZIS, Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert, North-Holland Math. Stud. 5, North-Holland, Amsterdam, Notas Mat. 50, American Elsevier, New York, 1973.
[4] H. BRÉZIS and F. E. BROWDER, Strongly nonlinear parabolic initial-boundary value problems, Proc. Nat. Acad. Sci. U.S.A. 76 (1979), 38–40.
[5] F. E. BROWDER, “Existence theorems for nonlinear partial differential equations” in Global Analysis (Berkeley, Calif., 1968), Proc. Sympos. Pure Math. 16, Amer. Math. Soc., Providence, 1970, 1–60.
[6] G. DOLZMANN, N. HUNGERBÜHLER, and S. MÜLLER, Non-linear elliptic systems with measure-valued right hand side, Math. Z. 226 (1997), 545–574.
[7] N. DUNFORD and J. T. SCHWARTZ, Linear Operators, Parts I–III, Wiley Classics Lib., Wiley, New York, 1988.
[8] L. C. EVANS and R. F. GARIEPY, Measure theory and fine properties of functions, Stud. Adv. Math., CRC Press, Boca Raton, Fla., 1992.
[9] I. FONSECA, S. MÜLLER, and P. PEDREGAL, Analysis of concentration and oscillation effects generated by gradients, SIAM J. Math. Anal. 29 (1998), 736–756, http://epubs.siam.org/sam-bin/dbq/toclist/SIMA
[10] J. FREHSE, private communication, 1999.
[11] N. HUNGERBÜHLER, A refinement of Ball's theorem on Young measures, New York J. Math. 3 (1997), 48–53, http://nyjm.albany.edu:8000/nyjm.html
[12] N. HUNGERBÜHLER, Quasilinear elliptic systems in divergence form with weak monotonicity, New York J. Math. 5 (1999), 83–90, http://nyjm.albany.edu:8000/nyjm.html
[13] E. KAMKE, Das Lebesgue-Stieltjes-Integral, Teubner, Leipzig, 1960.
[14] D. KINDERLEHRER and P. PEDREGAL, Gradient Young measures generated by sequences in Sobolev spaces, J. Geom. Anal. 4 (1994), 59–90.
[15] J. KRISTENSEN, Lower semicontinuity in spaces of weakly differentiable functions, Math. Ann. 313 (1999), 653–710.
[16] R. LANDES, On weak solutions of quasilinear parabolic equations, Nonlinear Anal. 9 (1985), 887–904.
[17] R. LANDES, A note on strongly nonlinear parabolic equations of higher order, Differential Integral Equations 3 (1990), 851–862.
[18] R. LANDES and V. MUSTONEN, A strongly nonlinear parabolic initial-boundary value problem, Ark. Mat. 25 (1987), 29–40.
[19] R. LANDES and V. MUSTONEN, On parabolic initial-boundary value problems with critical growth for the gradient, Ann. Inst. H. Poincaré Anal. Non Linéaire 11 (1994), 135–158.
[20] J.-L. LIONS, Quelques méthodes de résolution des problèmes aux limites non linéaires, Dunod, Gauthier-Villars, Paris, 1969.
[21] G. J. MINTY, Monotone (nonlinear) operators in Hilbert space, Duke Math. J. 29 (1962), 341–346.
[22] J. SIMON, Compact sets in the space $L^p(0,T;B)$, Ann. Mat. Pura Appl. (4) 146 (1987), 65–96.
[23] M. I. VIŠIK, Solubility of boundary-value problems for quasi-linear parabolic equations of higher orders (in Russian), Mat. Sb. (N.S.) 59 (101) (1962) suppl., 289–325.
[24] K. YOSIDA, Functional Analysis, 6th ed., Grundlehren Math. Wiss. 123, Springer, Berlin, 1980.
[25] E. ZEIDLER, Nonlinear Functional Analysis and Its Applications, II/B: Nonlinear Monotone Operators, Springer, New York, 1990.

Department of Mathematics, University of Alabama at Birmingham, 452 Campbell Hall, 1300 University Boulevard, Birmingham, Alabama 35294-1170, USA; [email protected]; current: Institut de Mathématiques, Université de Fribourg, Pérolles, 1700 Fribourg, Switzerland
ON PRODUCTS OF HARMONIC FORMS D. KOTSCHICK
Abstract
We prove that manifolds admitting a Riemannian metric for which products of harmonic forms are harmonic satisfy strong topological restrictions, some of which are akin to properties of flat manifolds. Others are more subtle and are related to symplectic geometry and Seiberg-Witten theory. We also prove that a manifold admits a metric with harmonic forms whose product is not harmonic if and only if it is not a rational homology sphere.

DUKE MATHEMATICAL JOURNAL, Vol. 107, No. 3, © 2001. Received 18 April 2000. Revision received 6 July 2000. 2000 Mathematics Subject Classification. Primary 53C25, 57R57, 58A14, 57R17. Author's work partially supported by the European Contract Human Potential Programme, Research Training Network European Differential Geometry Endeavour contract number HPRN-CT-2000-00101.

1. Introduction

On a general Riemannian manifold, wedge products of harmonic forms are not usually harmonic. But there are some examples where this does happen, such as in compact globally symmetric spaces. For these the harmonic forms coincide with the invariant ones, and the latter are clearly closed under products (see [2, pp. 10–13]). D. Sullivan [8] observed that “There are topological obstructions for M to admit a metric in which the product of harmonic forms is harmonic.” The reason Sullivan gave is that if the product of harmonic forms is harmonic, then the rational homotopy type is a formal consequence of the cohomology ring. Therefore manifolds that are not formal in this sense cannot admit a metric for which the products of harmonic forms are harmonic. This motivates the following definition.

Definition 1
A Riemannian metric is called (metrically) formal if all wedge products of harmonic forms are harmonic. A closed manifold is called geometrically formal if it admits a formal Riemannian metric.

Thus geometric formality implies formality in the sense of Sullivan. Compact globally symmetric spaces are metrically formal, as are arbitrary Riemannian metrics on rational homology spheres. Further examples can be generated by taking products, because the product of two formal metrics is again formal. In this paper we describe a number of elementary topological obstructions for geometric formality of closed oriented manifolds. These obstructions are independent of formality in the sense of rational homotopy theory and are often nonzero on formal manifolds. The simplest obstruction is the product of the first Betti number and the Euler characteristic. In small dimensions these elementary obstructions are strong enough to imply the following theorem.

Theorem 2
If $M$ is a closed oriented geometrically formal manifold of dimension less than or equal to 4, then $M$ has the real cohomology algebra of a compact globally symmetric space.

It is, however, not true that $M$ is a globally symmetric space, even up to homotopy. We give many examples of this in dimensions 3 and 4. We also give examples of 4-manifolds that do have the real cohomology algebra of a compact symmetric space but are not geometrically formal. This is detected by some more subtle obstructions coming from symplectic geometry and Seiberg-Witten gauge theory. The pattern of the arguments presented here is that metric formality is a weakening of a reduction of holonomy. For example, it implies that harmonic forms have constant length, though it does not imply that they are parallel. Nevertheless, the more harmonic forms there are, the stronger the constraints. In Section 7 we prove that every manifold that is not a rational homology sphere admits a nonformal Riemannian metric. This paper was motivated by an example pointed out by D. Toledo, which we describe in Section 2 and which is related to joint work in progress of the author and H. Endo. Further impetus came from a question posed by D. Huybrechts and U.
Semmelmann concerning products of harmonic forms on Calabi-Yau manifolds (see [4]).

2. Motivation: Dimension 2

Let us consider first the case of a closed oriented surface $\Sigma$. If its genus is 0 or 1, then there are globally symmetric Riemannian metrics. If the genus of $\Sigma$ is greater than or equal to 2, there are nontrivial harmonic 1-forms for all metrics, but every 1-form has zeros. In this case every wedge product of 1-forms also has zeros but for cohomological reasons cannot vanish identically in all cases. The only harmonic 2-forms are the constant multiples of the Riemannian volume form, so there cannot be any formal Riemannian metric. This proves Theorem 2 in the 2-dimensional case.

The argument above, pointed out to the author by D. Toledo, shows that harmonicity of products of harmonic forms can fail on compact locally rather than globally symmetric spaces, contradicting a statement in [3, p. 158]. Note that $\Sigma$ is formal in the sense of Sullivan but that the nonvanishing of $b_1(\Sigma) \cdot \chi(\Sigma)$ obstructs geometric formality. On the sphere every metric is formal because there are no interesting harmonic forms. However, when there are enough harmonic forms, the harmonicity of their products is a restriction on the metric enforcing rigidity.

Theorem 3
Every formal Riemannian metric on the 2-torus is flat.

Proof
Let $g$ be a formal metric on $T^2$, and let $\alpha$ be a nontrivial harmonic 1-form. Then $*\alpha$ and
\[
\alpha \wedge *\alpha = |\alpha|^2\,d\mathrm{vol}_g \tag{1}
\]
are also harmonic, and so $\alpha$ has constant length. In particular, it has no zeros. It follows that $\alpha(p)$ and $*\alpha(p)$ span $T_p^*T^2$ for all $p \in T^2$. As $|a|$ is constant for every constant linear combination $a$ of $\alpha$ and $*\alpha$, the Bochner formula
\[
\Delta(a) = \nabla^*\nabla a + \mathrm{Ric}(a), \tag{2}
\]
together with the expression of $\Delta((1/2)|a|^2)$ in terms of the rough Laplacian $\nabla^*\nabla a$ (cf. [11, p. 306]), allows us to compute
\[
0 = \Delta\Big(\tfrac12 |a|^2\Big) = g\big(\nabla^*\nabla a, a\big) - |\nabla a|^2 = -g\big(\mathrm{Ric}(a), a\big) - |\nabla a|^2.
\]
This shows that the (Ricci) curvature is everywhere nonpositive. But by the Gauss-Bonnet theorem this implies that $g$ is flat.

3. Elementary obstructions

Let $M$ be a closed oriented manifold of dimension $n$, and let $g$ be a formal Riemannian metric on $M$. As usual, we extend $g$ to spaces of differential forms.

Lemma 4
The inner product of any two harmonic $k$-forms is a constant function. In particular, the length of any harmonic form is constant.
Proof. That the length of any harmonic form is constant follows from equation (1). The more general statement follows by polarisation.

Lemma 5. Suppose $\alpha_1, \dots, \alpha_m$ are orthogonal harmonic $k$-forms. Then
$$\alpha = \sum_{i=1}^{m} f_i \alpha_i$$
is harmonic if and only if the functions $f_i$ are all constant.

Proof. If $\alpha$ is harmonic, then $g(\alpha, \alpha_i) = f_i |\alpha_i|^2$ is constant by Lemma 4. Using that the length of $\alpha_i$ is also constant by Lemma 4, we conclude that $f_i$ is constant. The converse is trivial.

Lemma 4 implies that harmonic forms that are linearly independent at some point are linearly independent everywhere. Systems of linearly independent harmonic forms can be orthonormalised using constant coefficients. We can now generalise the discussion in Section 2 to higher dimensions.

Theorem 6. Suppose the closed oriented manifold $M^n$ is geometrically formal. Then
(1) the real Betti numbers of $M$ are bounded by $b_k(M) \le b_k(T^n)$;
(2) if $n = 4m$, then $b_{2m}^{\pm}(M) \le b_{2m}^{\pm}(T^n)$;
(3) the first Betti number $b_1(M) \ne n - 1$.

Proof. Fix a formal Riemannian metric on $M$. It follows from the above lemmas that the number of linearly independent harmonic $k$-forms is at most the rank of the vector bundle $\Lambda^k$. Similarly, when the dimension is $4m$, the number of self-dual or anti-self-dual harmonic forms in the middle dimension is bounded by the rank of $\Lambda^{2m}_{\pm}$. Suppose now that $\alpha_1, \dots, \alpha_{n-1}$ are linearly independent harmonic 1-forms. Then $*(\alpha_1 \wedge \cdots \wedge \alpha_{n-1})$ is also a harmonic 1-form and is linearly independent of $\alpha_1, \dots, \alpha_{n-1}$. Thus $b_1(M) \ge n - 1$ implies $b_1(M) = n$.

There is an uncanny similarity here with the classification of flat Riemannian manifolds (see [10]), which satisfy all the conclusions of Theorem 6. We can push this further.
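As a small illustration of how conditions (1) and (3) of Theorem 6 act as a filter on Betti numbers, the following sketch tests a Betti vector against them. The function names `violated_conditions` and `allowed_b1` are hypothetical helpers introduced here, not part of the paper, and condition (2) is left out because it needs the splitting of the middle Betti number.

```python
from math import comb


def violated_conditions(betti):
    """Return which of conditions (1), (3) of Theorem 6 fail.

    `betti` is the list [b_0, ..., b_n] of a closed oriented n-manifold.
    Condition (2) is not tested (it needs b_{2m}^+ and b_{2m}^-).
    """
    n = len(betti) - 1
    violated = []
    # (1): b_k(M) <= b_k(T^n), and b_k(T^n) is the binomial coefficient C(n, k)
    if any(b > comb(n, k) for k, b in enumerate(betti)):
        violated.append(1)
    # (3): b_1(M) != n - 1
    if n >= 1 and betti[1] == n - 1:
        violated.append(3)
    return violated


def allowed_b1(n):
    """Values of b_1 not excluded by conditions (1) and (3) in dimension n."""
    return [b for b in range(n + 1) if b != n - 1]


print(violated_conditions([1, 4, 1]))     # genus-2 surface
print(violated_conditions([1, 3, 3, 1]))  # 3-torus
print(allowed_b1(3), allowed_b1(4))
```

The lists returned by `allowed_b1` for n = 3 and n = 4 are the sets of first Betti numbers that appear in the case analysis of Sections 4 and 5.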
Theorem 7. Suppose the closed oriented manifold $M^n$ is geometrically formal. If $b_1(M) = k$, then there is a smooth submersion $\pi : M \to T^k$, for which $\pi^*$ is an injection of cohomology algebras. In particular, if $b_1(M) = n$, then $M$ is diffeomorphic to $T^n$. In this case every formal Riemannian metric is flat.

Proof. Fix a formal Riemannian metric $g$ on $M$. We consider the Albanese or Jacobi map $\pi : M \to T^k$ given by integration of harmonic 1-forms. As the harmonic 1-forms have constant lengths and inner products, $\pi$ is a submersion. It induces an isomorphism on $H^1$, and products of linearly independent harmonic 1-forms are never zero but are harmonic because the metric is formal. In the case $b_1(M) = n$, we conclude that $M$ is a covering of $T^n$ and is therefore a torus itself. Every formal metric on $T^n$ must be flat because it admits an orthonormal framing by harmonic 1-forms.

In the case $b_1(M) = 1$, we have a partial converse to Theorem 7.

Theorem 8. Let $M$ be a closed oriented $n$-manifold that fibers smoothly over $S^1$. If $b_1(M) = 1$ and $b_k(M) = 0$ for $1 < k < n - 1$, then $M$ is geometrically formal.

Proof. Suppose $M$ fibers over $S^1$ with fiber $F$ and monodromy diffeomorphism $\varphi : F \to F$. By Moser's lemma we may assume that $\varphi$ preserves a volume form $\Omega$ on $F$, so that its pullback to $F \times \mathbb{R}$ descends to $M$ as a closed form that is a volume form along the fibers. We can find a Riemannian metric on $M$ for which $*\Omega = \alpha$ is the closed 1-form defining the fibration over $S^1$ and has constant length. Then $\alpha$ and $\Omega$ generate the harmonic forms in degree 1 and $n - 1$, and their product is harmonic.

Flat manifolds satisfy further topological restrictions; for example, their Euler characteristics vanish. In our present context we have the following theorem.

Theorem 9. Suppose the closed oriented manifold $M^n$ is geometrically formal.
(1) If $b_k(M) \ne 0$, then $e(\Lambda^k) = 0$; and
(2) if $n = 4m$ and $b_{2m}^{\pm}(M) \ne 0$, then $e(\Lambda^{2m}_{\pm}) = 0$.
In particular, the Euler characteristic of $M$ vanishes if $b_1(M) \ne 0$.
Proof This follows from the obstruction-theory definition of the Euler class and the fact that every nontrivial harmonic form has no zeros because of (1). 4. Dimension 3 If M is a closed oriented geometrically formal 3-manifold, then by Theorem 6 we have b1 (M) ∈ {0, 1, 3}. If the first Betti number is maximal, then Theorem 7 says that M is the 3-torus. At the other extreme, if the first Betti number is zero, then M is a rational homology sphere. Clearly every metric on every such manifold is formal. Thus the only interesting case is that of first Betti number 1. Then the real cohomology algebra is that of the globally symmetric space S 2 ×S 1 , so that Theorem 2 is proved in the 3-dimensional case. Theorems 7 and 8 imply the following corollary. corollary 10 Let M be a closed oriented 3-manifold with b1 (M) = 1. Then M is geometrically formal if and only if it fibers over S 1 . This includes many nonsymmetric manifolds. Thurston has proved that every 3-manifold that fibers over S 1 carries a unique locally homogeneous geometry. It is not clear whether the induced metric is formal, even when the first Betti number is 1. 5. Dimension 4 If M is a closed oriented geometrically formal 4-manifold, then by Theorem 6 we have b1 (M) ∈ {0, 1, 2, 4}. If the first Betti number is maximal, then Theorem 7 says that M is the 4-torus. 5.1. First Betti number equals 2 In this case the Euler characteristic vanishes by Theorem 9, and b2 (M) = 2. Fix a formal Riemannian metric g. If α and β are harmonic 1-forms generating H 1 (M), then they are pointwise linearly independent. Therefore ω = α ∧ β is a nonzero harmonic 2-form with square zero. Thus the intersection form of M is indefinite, and we conclude that b2+ = b2− = 1. This means that the real cohomology ring of M is the same as that of the globally symmetric space S 2 × T 2 . We know from Theorem 7 that M fibers over T 2 . The above discussion shows that the fiber is nontrivial in homology. 
There are many examples of such manifolds other than S 2 × T 2 . If N is any 3-manifold with b1 (N) = 1 which fibers over the circle, then the product M = N × S 1
is a 4-manifold with the real cohomology ring of $S^2 \times T^2$. By Corollary 10 it is geometrically formal because we can take a product metric that on $N$ is the formal metric constructed in the proof of Theorem 8.

5.2. First Betti number equals 1
If the first Betti number is 1, the Euler characteristic vanishes by Theorem 9, and therefore $b_2(M) = 0$. In this case $M$ has the real cohomology algebra of the globally symmetric space $S^3 \times S^1$. Theorems 7 and 8 imply the following corollary.

Corollary 11. Let $M$ be a closed oriented 4-manifold with $b_1(M) = 1$ and $b_2(M) = 0$. Then $M$ is geometrically formal if and only if it fibers over $S^1$.

This includes many nonsymmetric manifolds. The simplest example is a product of $S^1$ with a rational homology 3-sphere that is not symmetric.

5.3. First Betti number equals 0
From Theorem 6 we know $b_2^{\pm} \le 3$. If $b_2^+ > 0$, then there are nontrivial self-dual harmonic forms. By (1) they have no zeros and so define almost complex structures compatible with the orientation of $M$. It follows that $b_2^+$ is odd. Similarly, if $b_2^- > 0$, then there are almost complex structures compatible with the reversed orientation of $M$ and $b_2^-$ must be odd. Suppose now that $b_2^+(M) = 3$. Then the self-dual harmonic forms trivialise $\Lambda^2_+$, and each defines an almost complex structure with trivial first Chern class (because the pointwise orthogonal complement of each in $\Lambda^2_+$ is trivial). Thus
$$0 = c_1^2(M) = 2\chi(M) + 3\sigma(M) = 4 + 5b_2^+ - b_2^- = 19 - b_2^- ,$$
which contradicts $b_2^- \le 3$. Therefore $b_2^+ = 3$ is not possible, and similarly $b_2^- = 3$ is not possible either. Thus the only possible values for $b_2^{\pm}$ are 0 and 1, and all combinations occur for the globally symmetric spaces $S^4$, $\mathbb{CP}^2$, $\overline{\mathbb{CP}^2}$, and $S^2 \times S^2$. This finally completes the proof of Theorem 2 in the 4-dimensional case. Any other example must have the same real cohomology ring as one of the above. In fact other examples exist for each cohomological type. In the case of $S^4$, any rational homology 4-sphere will do.
In the case of $\mathbb{CP}^2$, there is the Mumford surface [7], an algebraic surface of the form $\mathbb{C}H^2/\Gamma$ with the same rational cohomology as $\mathbb{CP}^2$. The Kähler form is of course harmonic; it generates the cohomology, and its square is harmonic. Reversing the orientation of the Mumford surface, we obtain an example with the cohomology ring of $\overline{\mathbb{CP}^2}$. Finally, in the case of $S^2 \times S^2$, there is also a locally Hermitian symmetric algebraic surface $M$ of general type with the same real cohomology ring, due to Kuga (cf. [1, p. 237]). In this case, $M$ is of the form
$(H^2 \times H^2)/\Gamma$, and the harmonic forms are generated by a self-dual and an anti-self-dual harmonic 2-form (with respect to the locally symmetric metric). These are Kähler forms for $M$ and $\overline{M}$, respectively, and are therefore parallel and have harmonic products.

6. Obstructions from symplectic geometry
In this section we discuss relations between harmonicity of products of harmonic forms on 4-manifolds on the one hand and existence of symplectic structures on the other. This leads to further obstructions to geometric formality. Let $M$ denote a closed oriented 4-manifold with a Riemannian metric $g$. Suppose that $b_2^+(M) > 0$. Then there is a nontrivial $g$-self-dual harmonic 2-form $\omega$. If the product
$$\omega \wedge \omega = \omega \wedge *\omega = |\omega|^2 \, d\mathrm{vol}_g$$
is harmonic, then $\omega$ has constant length and in particular has no zeros. It is then a symplectic form on $M$ compatible with the orientation, and $g$ is an almost Kähler metric. There are 4-manifolds for which the elementary obstructions of Section 3 vanish but which are not geometrically formal because they do not admit any symplectic structure.

Example 12. Let $X$ be $\mathbb{CP}^2$, $S^2 \times S^2$, or the Kuga or Mumford surface. Let $N$ be a rational homology 4-sphere whose fundamental group has a nontrivial finite quotient. Then $M = X \# N$ has the real cohomology ring of the geometrically formal manifold $X$ but is not itself geometrically formal because it does not admit any symplectic structure by the result of [6].

There is another application of the relationship between harmonicity of products of harmonic forms and symplectic structures. Namely, we can show that on certain manifolds all products of certain harmonic forms are nonharmonic. This is obviously much stronger than geometric nonformality. For an example, consider the smooth manifold $M$ underlying a complex K3 surface. Then $M$ is simply connected with $b_2^+ = 3$ and $b_2^- = 19$. The elementary considerations in Section 5 already show that $M$ is not geometrically formal. We can sharpen this as follows.
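The elementary considerations of Section 5 can be replayed numerically for the K3 surface. The sketch below only redoes the arithmetic of Section 5.3 (Euler characteristic, signature, and the quantity $2\chi + 3\sigma$); the function names are hypothetical and introduced here for illustration.

```python
def chi_and_sigma(b1, b2_plus, b2_minus):
    """Euler characteristic and signature of a closed oriented 4-manifold."""
    chi = 2 - 2 * b1 + b2_plus + b2_minus
    sigma = b2_plus - b2_minus
    return chi, sigma


def c1_squared(b1, b2_plus, b2_minus):
    """The number 2*chi + 3*sigma, equal to c_1^2 of an almost complex structure."""
    chi, sigma = chi_and_sigma(b1, b2_plus, b2_minus)
    return 2 * chi + 3 * sigma


# K3 surface: b1 = 0, b2+ = 3, b2- = 19
print(chi_and_sigma(0, 3, 19))
print(c1_squared(0, 3, 19))

# With b1 = 0 and b2+ = 3, c_1^2 = 19 - b2-, so c_1^2 = 0 forces b2- = 19,
# which is incompatible with the bound b2- <= 3 from Theorem 6.
print([b for b in range(0, 4) if c1_squared(0, 3, b) == 0])
```

For K3 the identity $c_1^2 = 0$ is satisfied, so it is the bound $b_2^- \le 3$ of Theorem 6 that fails.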
Proposition 13. Let g be an arbitrary Riemannian metric on the K3 surface M. If α is a g-anti-self-dual harmonic 2-form, then it must have a zero. In particular, the wedge product
$\alpha \wedge \beta$ is not harmonic for any $\beta$ unless it vanishes identically. For example, if $\alpha$ is nontrivial, then $\alpha \wedge \alpha$ is not harmonic.

Proof. Suppose $\alpha$ is nontrivial and anti-self-dual. We have
$$\alpha \wedge \alpha = -\alpha \wedge *\alpha = -|\alpha|^2 \, d\mathrm{vol}_g ,$$
which is harmonic if and only if the norm of $\alpha$ is constant. If it is constant, it must be a nonzero constant, and then the above equation shows that $\alpha$ is a symplectic form inducing the opposite (noncomplex) orientation on $M$. In particular, $\overline{M}$ must have nontrivial Seiberg-Witten invariants (see [9]). But the K3 surface contains smoothly embedded $(-2)$-spheres, which become $(+2)$-spheres when the orientation is reversed, showing that all the Seiberg-Witten invariants of $\overline{M}$ vanish (see [5]).

Remark 14. The vanishing of the Seiberg-Witten invariants for $\overline{M}$ can also be proved without appealing to the vanishing theorem for spheres of positive self-intersection. For a scalar-flat Calabi-Yau metric, the Seiberg-Witten equations on $\overline{M}$ have no solution, though they do have a (unique) solution on $M$.

This can be generalised quite substantially. If $M$ has an indefinite intersection form, there are both self-dual and anti-self-dual harmonic forms for all metrics. If the square of such a form is harmonic, it is a symplectic form on $M$, respectively, $\overline{M}$. But by the results of [5], manifolds that are symplectic for both choices of orientation are quite rare. Thus the above proof generalises to many cases to show that for all metrics on certain 4-manifolds, all self-dual and/or all anti-self-dual harmonic forms must have zeros and nonharmonic squares. This generalises the existence of zeros of harmonic 1-forms on surfaces (cf. Section 2). In the case of complex surfaces, [5, Theorem 1] implies the following theorem.

Theorem 15. Let $M$ be a compact complex surface of general type. Assume one of the following conditions holds:
(1) $K_M$ is not ample, or
(2) $c_1^2(M)$ is odd, or
(3) the signature $\sigma(M)$ is negative, or is zero and $M$ is not uniformised by the polydisk.
Then, for every Riemannian metric g on M, all g-anti-self-dual harmonic 2-forms have zeros and nonharmonic squares.
In the first case the argument is the same as in the proof of Proposition 13 because ampleness of $K_M$ only fails if there are rational curves of negative self-intersection. Note that we only need smoothly rather than holomorphically embedded spheres, so one can replace condition (1) by a weaker assumption.

7. General existence of nonformal metrics
Having seen that only very few manifolds are geometrically formal, we now show that even these tend to also have nonformal metrics. The 2-dimensional case of the following result was already proved in Section 2.

Theorem 16. A closed oriented manifold admits a nonformal Riemannian metric if and only if it is not a rational homology sphere.

Proof. It is clear that every metric on every rational homology sphere is formal, because there are no nontrivial harmonic forms. Conversely, assume that $M$ is a manifold with a nonzero Betti number $b_k(M)$ for $0 < k < \dim(M)$. Let $g$ be a Riemannian metric that has a positive curvature operator on an open set, say, a ball $B \subset M$, and assume it is formal. If $\alpha$ is a nontrivial $g$-harmonic $k$-form, then $\alpha \wedge *\alpha = |\alpha|^2 \, d\mathrm{vol}_g$ shows that $\alpha$ has constant length. Therefore the Bochner-Weitzenböck formula
$$\Delta(\alpha) = \nabla^*\nabla\alpha + \mathcal{R}(\alpha) \tag{3}$$
for $k$-forms allows us to compute, as in the proof of Theorem 3,
$$0 = \tfrac{1}{2}\Delta|\alpha|^2 = g(\nabla^*\nabla\alpha, \alpha) - |\nabla\alpha|^2 = -g(\mathcal{R}(\alpha), \alpha) - |\nabla\alpha|^2 .$$
Here the term $\mathcal{R}$ is positive on $B$ because there the curvature operator is positive. Thus $\alpha$ vanishes identically on $B$. As $\alpha$ is harmonic, the unique continuation principle implies that $\alpha$ vanishes on all of $M$, contradicting the assumption that $\alpha$ is nontrivial.
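The step from positivity of the curvature term to the vanishing of the form on $B$ can be written out pointwise. The following is a sketch under the assumption, not stated in this form in the proof, that positivity of the curvature operator on $B$ yields a function $c > 0$ on $B$ with $g(\mathcal{R}(\alpha), \alpha) \ge c\,|\alpha|^2$:

```latex
\begin{align*}
0 &= -g(\mathcal{R}(\alpha), \alpha) - |\nabla\alpha|^2
   && \text{(since } |\alpha| \text{ is constant)} \\
  &\le -c\,|\alpha|^2 - |\nabla\alpha|^2
   && \text{(on } B, \text{ with } c > 0 \text{ assumed as above)} \\
  &\le 0 .
\end{align*}
% Both terms on the right are nonpositive and sum to zero,
% so |\alpha|^2 = 0 and |\nabla\alpha|^2 = 0 at every point of B.
```

Since $|\alpha|$ is constant on all of $M$, this already forces $\alpha = 0$ everywhere; the unique continuation principle invoked in the proof gives the same conclusion.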
Remark 17 The above proof shows that there is an open set of nonformal metrics in the space of all Riemannian metrics (with the C ∞ -topology, say) on any manifold that is not a rational homology sphere.
References
[1] W. BARTH, C. PETERS, and A. VAN DE VEN, Compact Complex Surfaces, Ergeb. Math. Grenzgeb. (3) 4, Springer, Berlin, 1984.
[2] B. A. DUBROVIN, A. T. FOMENKO, and S. P. NOVIKOV, Modern Geometry - Methods and Applications, Part III: Introduction to Homology Theory, Grad. Texts in Math. 124, Springer, Berlin, 1990.
[3] P. A. GRIFFITHS and J. W. MORGAN, Rational Homotopy Theory and Differential Forms, Prog. Math. 16, Birkhäuser, Boston, 1981.
[4] D. HUYBRECHTS, Products of harmonic forms and rational curves, preprint, http://www.arXiv.org/abs/math.AG/0003202
[5] D. KOTSCHICK, Orientations and geometrisations of compact complex surfaces, Bull. London Math. Soc. 29 (1997), 145–149.
[6] D. KOTSCHICK, J. W. MORGAN, and C. H. TAUBES, Four-manifolds without symplectic structures but with nontrivial Seiberg-Witten invariants, Math. Res. Lett. 2 (1995), 119–124.
[7] D. MUMFORD, An algebraic surface with K ample, (K^2) = 9, p_g = q = 0, Amer. J. Math. 101 (1979), 233–244.
[8] D. SULLIVAN, "Differential Forms and the Topology of Manifolds" in Manifolds (Tokyo, 1973), ed. A. Hattori, Univ. Tokyo Press, Tokyo, 1975, 37–49.
[9] C. H. TAUBES, The Seiberg-Witten invariants and symplectic forms, Math. Res. Lett. 1 (1994), 809–822.
[10] M. WAGNER, Über die Klassifikation flacher Riemannscher Mannigfaltigkeiten, Diplomarbeit, Universität Basel, 1997.
[11] H. H. WU, The Bochner technique in differential geometry, Math. Rep. 3 (1988), i–xii, 289–538.
Mathematisches Institut, Universität München, Theresienstrasse 39, 80333 München, Germany; [email protected]
ON PROPERLY EMBEDDED MINIMAL SURFACES WITH THREE ENDS
FRANCISCO MARTÍN and MATTHIAS WEBER
Abstract
We classify all complete embedded minimal surfaces in $\mathbb{R}^3$ with three ends of genus $g$ and at least $2g + 2$ symmetries. The surfaces in this class are the Costa-Hoffman-Meeks surfaces that have $4g + 4$ symmetries in the case of a flat middle end. The proof consists of using the symmetry assumptions to deduce the possible Weierstrass data and then studying the period problems in all cases. To handle the 1-dimensional period problems, we develop a new general method to prove convexity results for period quotients. The 2-dimensional period problems are reduced to the 1-dimensional case by an extremal length argument.

1. Introduction
The Costa-Hoffman-Meeks surfaces are the only known properly embedded minimal surfaces in $\mathbb{R}^3$ with three ends. These are 1-parameter families of complete embedded minimal surfaces of genus $k - 1$ for each integer $k > 1$. The following theorem summarizes all that we know about the existence of these surfaces.

Theorem 1 [6], [10]. For every $k \ge 2$, there is a 1-parameter family $M_{k,x}$, $x \ge 1$, of embedded minimal surfaces of genus $k - 1$ and total curvature $-4\pi(k + 1)$. The surfaces $M_{k,1}$ have two catenoid ends and one flat end. The surfaces $M_{k,x}$, $x > 1$, have all three catenoid ends. The Riemann surface $\overline{M}_{k,x}$ is given by
$$\overline{M}_{k,x} = \Bigl\{ (z, w) \in \overline{\mathbb{C}}^2 : w^k = -c_x z^{k-1} (z - x)\Bigl(z + \frac{1}{x}\Bigr) \Bigr\}, \qquad c_x = \bigl(x + x^{-1}\bigr)^{-1} .$$
The Weierstrass data are
$$G = \rho\, \frac{w}{mz + 1}, \qquad dh = \frac{z\,(m + z^{-1})\, dz}{(z - x)(z + x^{-1})}$$
DUKE MATHEMATICAL JOURNAL © 2001 Vol. 107, No. 3, Received 7 June 1999. Revision received 25 July 2000. 2000 Mathematics Subject Classification. Primary 53A10; Secondary 53C42. Martín's research partially supported by Dirección General de Investigación Científica y Técnica grant number PB97-0785. Weber's research partially supported by Sonderforschungsbereich 256 Bonn.
for suitable constants $\rho$ and $m$ depending on $x$ (when $x = 1$, $m(1) = 0$). If $x = 1$, the catenoid ends are located at $(x, 0)$ and $(-x^{-1}, 0)$, and the flat end is $(\infty, \infty)$. If $x > 1$, these three points correspond to catenoid ends. See Figure 1 for a picture of $M_{4,1}$. These surfaces have a rather large symmetry group. In the case of a middle planar end, it is the dihedral group of order $4k$ generated by a rotation about the $x_3$-axis by angle $\pi/k$ followed by a reflection in the $(x_1, x_2)$-plane, and a reflection in the $(x_1, x_3)$-plane. In the case of a middle catenoid end, the symmetry group is isomorphic to the dihedral group with $2k$ elements and is generated by a rotation about the $x_3$-axis by angle $2\pi/k$, and a reflection in the $(x_1, x_3)$-plane.
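The orders of the two symmetry groups just described can be checked by brute force. The sketch below is an illustration only: it encodes an orthogonal map preserving the $x_3$-axis as a triple `(a, c, s)` acting on $(\zeta, t) \in \mathbb{C} \times \mathbb{R}$ by $\zeta \mapsto \omega^a \zeta$ (or $\omega^a \bar\zeta$ if `c = 1`) and $t \mapsto s\,t$, with $\omega = e^{i\pi/k}$. This encoding and the function names are assumptions made here, not notation from the paper.

```python
def compose(g, h, k):
    """Return the triple of the map 'apply h first, then g'."""
    a1, c1, s1 = g
    a2, c2, s2 = h
    # complex conjugation turns omega^a2 into omega^(-a2)
    sign = -1 if c1 else 1
    return ((a1 + sign * a2) % (2 * k), (c1 + c2) % 2, s1 * s2)


def generate(generators, k):
    """Closure of the generators under composition (a finite group)."""
    identity = (0, 0, 1)
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for gen in generators:
            new = compose(gen, g, k)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group


k = 4
rotoreflection = (1, 0, -1)  # rotation by pi/k, then reflection in the (x1, x2)-plane
rotation = (2, 0, 1)         # rotation by 2*pi/k
mirror = (0, 1, 1)           # reflection in the (x1, x3)-plane

print(len(generate([rotoreflection, mirror], k)))  # flat middle end
print(len(generate([rotation, mirror], k)))        # catenoid middle end
```

The script only counts group elements; it does not verify the isomorphism type.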
Figure 1. The surface M4,1
In [8] D. Hoffman and H. Karcher ask whether one can show that these are the only properly embedded, 3-ended, minimal surfaces with at least this number of symmetries. In this paper, we give an affirmative answer to this question. More precisely, we prove the following theorem.

Main Theorem. Let $M$ be a properly embedded minimal surface in $\mathbb{R}^3$ with three ends, genus $g = k - 1$, $k > 2$, and at least $2k$ symmetries. Then the symmetry group is one of the two dihedral groups described above, and $M$ is one of the Costa-Hoffman-Meeks surfaces.

In case $g = 1$, the corresponding result, without any symmetry assumptions, was obtained by C. Costa in [6]. The proof of the main theorem follows from the results obtained in Sections 2–4. In fact, Section 2 is devoted to describing the possible Weierstrass representations of
surfaces satisfying the symmetry assumptions. For this, we use similar techniques as developed by Hoffman and W. Meeks in [9]. As a consequence of this analysis, we obtain three possible Weierstrass representations for such surfaces. To decide the existence of these surfaces, we need to study the period problems in each case. The first one corresponds to the Costa-Hoffman-Meeks examples, and these examples have been discussed in [8]. The second one corresponds to the so-called Horgan surface (see [15]) with arbitrary dihedral symmetry group. Here, the period problem is 1-dimensional, and the nonexistence of the Horgan surface with 2-fold dihedral symmetry has already been proven in [15]. Unfortunately, the method of the simple proof does not extend to the general case, and we use a new technique to prove convexity properties of period quotients. This method relies crucially on the second-order differential equation of hypergeometric type satisfied by the periods as functions of the moduli. Using this differential equation, we obtain a differential inequality for the second derivative of period quotients which can be integrated to an inequality for the period quotients themselves. Finally, the third family consists of Weierstrass data describing surfaces of genus 3 with the same symmetry group as the Costa torus with a flat end. Here, the period problem is 2-dimensional. Using an extremal length argument, we are able to reduce this to the second case, concluding also the nonexistence of this family. The second and third case families consist of Weierstrass data describing rather plausible surfaces whose existence is not obstructed by known results or by the (unproven) Hoffman-Meeks conjecture, which states that, for embedded minimal surfaces of finite total curvature, the number of ends can be at most the genus plus 2. 
For surfaces of the second type, one can even produce convincing numerical pictures because the period problem can be solved to an arbitrary accuracy. For the third case, such a surface would arise from adding two handles to Costa's surface as indicated in Figure 2. (The second additional handle is symmetrically placed below the planar end and therefore hidden.) It is remarkable that such a handle addition is possible in cases with more ends, as shown in [19].

2. From geometric assumptions to Weierstrass data
The aim of this section is to derive the possible Weierstrass representations for the surfaces which satisfy the hypothesis of the main theorem. Let $X : M \to \mathbb{R}^3$ be a proper minimal embedding of a surface of genus $k - 1$, $k > 2$, with at least three topological ends in 3-dimensional euclidean space. As a consequence of P. Collin's theorem, $M$ has finite total curvature. (See [5] or [4] for an alternative proof.)
Figure 2. Such a minimal surface does not exist
R. Osserman's theorem in [13] states that $M$ is conformally equivalent to $\overline{M} - \{E_1, E_2, E_3\}$, where $\overline{M}$ is a compact Riemann surface of genus $k - 1$. The removed points correspond to the ends, and Osserman proved that the Weierstrass data $(G, dh)$ of $X$ extend meromorphically to $\overline{M}$. Recall that $G$ is the stereographic projection of the Gauss map. As $X : M \to \mathbb{R}^3$ is an embedding, $\mathbb{R}^3 - X(M)$ consists of two connected components. Therefore, up to a rigid motion, $G$ must alternate between zero and $\infty$ on the ends of $M$, ordered by height in $\mathbb{R}^3$. So we can assume without loss of generality that $G(E_1) = G(E_3) = 0$, $G(E_2) = \infty$, and that $E_1$ is the highest end, $E_2$ is the middle end, and $E_3$ is the lowest end. By the maximum principle, the top and bottom ends must be catenoid ends. We denote by $\mathrm{Iso}(M)$ the isometry group of $M$, and we denote by $\mathrm{Sym}(M)$ the subgroup of $\mathrm{Iso}(M)$ consisting of isometries that are the restriction of a rigid motion in $\mathbb{R}^3$ leaving $X(M)$ invariant. H. Choi, Meeks, and B. White [3] have proved that a properly embedded minimal surface $M$ in $\mathbb{R}^3$ which has more than one end is minimally rigid. In particular, any intrinsic isometry of $M$ extends to an isometry of $\mathbb{R}^3$. We assume that $\Delta = \mathrm{Sym}(M) = \mathrm{Iso}(M)$ has at least $2k$ elements. A symmetry of $X(M)$ induces in a natural way a conformal automorphism of $M$ which extends to $\overline{M}$ leaving the set $\{E_1, E_2, E_3\}$ invariant. Using Hurwitz's theorem, $\mathrm{Sym}(M)$ is finite. Thus, after fixing a suitable origin, $\mathrm{Sym}(M)$ is a finite group of orthogonal linear transformations of $\mathbb{R}^3$. Furthermore, since the normal vectors at the ends are vertical, $\Delta$ leaves the $x_3$-axis invariant. In what follows we do not distinguish between a symmetry $F \in \Delta$ of $M$ and the induced conformal automorphism $F|_{X(M)}$. Observe that any symmetry in $\Delta$ leaves the set $\{E_1, E_3\}$ invariant and fixes the point $E_2$. Let $\Delta_0$ be the subgroup of holomorphic transformations in $\Delta$, and denote by $\mathcal{R} \subseteq \Delta_0$ the cyclic subgroup of rotations around the $x_3$-axis. It is obvious that
$$[\Delta : \Delta_0] \le 2, \qquad [\Delta_0 : \mathcal{R}] \le 2. \tag{1}$$
Denote by $R$ the rotation around the $x_3$-axis with the smallest positive angle in $\Delta_0$. Then $\mathcal{R} = \langle R \rangle$. In this setting, F. López, D. Rodríguez, and F. Martín have proved the following theorem.

Theorem 2 [11]. If $\mathrm{ord}(R) \ge k$, then there exists $x \ge 1$ such that, up to conformal transformations and rigid motions in $\mathbb{R}^3$, $M = M_{k,x}$ and $X = X_{m(x)}$ is the minimal embedding described in Theorem 1.

Let $D$ be the conformal disk in $\overline{M}$ centered at $E_2$ which is invariant under $\Delta_0$. Then $\{F|_D : F \in \Delta_0\}$ is a finite group of biholomorphisms of the disk fixing the origin. Hence this group is cyclic, generated by some automorphism $J$. As a symmetry, $J$ is either a rotation around the $x_3$-axis or a rotation around the $x_3$-axis followed by a symmetry with respect to the $(x_1, x_2)$-plane. Hence in the rest of this section we can assume that $J$ is the generator of $\Delta_0$ corresponding either to a rotation around the $x_3$-axis by an angle of $2\pi/\mathrm{ord}(J)$ or to a rotation around the $x_3$-axis by an angle of $2\pi/\mathrm{ord}(J)$ followed by a symmetry with respect to the $(x_1, x_2)$-plane, where $\mathrm{ord}(J)$ is the order of $J$ and $\mathrm{ord}(J) = |\Delta_0|$. Note that either $R = J$ or $R = J^2$. We introduce the following notation. For any point $Q \in M$ denote by
$$I(Q) = \{ F \in \Delta_0 : F(Q) = Q \}$$
the isotropy group of $Q$ in $\Delta_0$, and denote by $\mu(Q) = |I(Q)|$ the cardinality of $I(Q)$. We also denote by $\mathrm{orb}(Q) = \{Q, J(Q), \dots, J^{|\Delta_0| - 1}(Q)\}$ the orbit of $Q$ under $\Delta_0$. Notice that $\mathrm{orb}(Q)$ has $|\Delta_0|/\mu(Q)$ elements. If $E_1$, $E_2$, $E_3$ are all catenoid ends, then $\mathrm{Sym}(M)$ does not contain any rotation followed by a symmetry; that is, $R = J$. Hence, from Theorem 2 we get the following corollary.

Corollary 1 [11]. If $X : M \to \mathbb{R}^3$ has three catenoid ends and if $\mathrm{Sym}(M)$ contains $2k$ elements or more, then up to natural transformations $X = X_{m(x)}$, $x > 1$.

Therefore we only need to deal with the following case: $R = J^2$, $E_2$ is a flat end, and $E_1$ and $E_3$ are catenoid ends.
Thus $\Delta_0$ is generated by a rotation followed by a symmetry. From the Riemann-Hurwitz formula, we obtain
$$4 - 2k = |\Delta_0|\,\chi\bigl(\overline{M}/\Delta_0\bigr) - \Bigl( 2|\Delta_0| - 3 + \sum_{Q \in M} \bigl(\mu(Q) - 1\bigr) \Bigr).$$
Since $|\Delta_0| \ge k$, we deduce $\chi(\overline{M}/\Delta_0) > 0$, and so $\overline{M}/\Delta_0$ is a sphere and in fact $\chi(\overline{M}/\Delta_0) = 2$. Using this in the above formula, we get
$$\sum_{Q \in M} \bigl(\mu(Q) - 1\bigr) = 2k - 1. \tag{2}$$
Let $\mathrm{orb}(Q_1), \dots, \mathrm{orb}(Q_s)$ be the different nontrivial orbits of $\Delta_0$ on $M$ (i.e., $\mu(Q_i) > 1$, $i = 1, \dots, s$; if $Q \in M - \bigcup_{i=1}^{s} \mathrm{orb}(Q_i)$, then $\mu(Q) = 1$). If we let $m_i = |\Delta_0|/\mu(Q_i)$, $i = 1, \dots, s$, then (2) gives
$$\sum_{i=1}^{s} \bigl( |\Delta_0| - m_i \bigr) = 2k - 1. \tag{3}$$
Since $|\Delta_0|$ is even, at least one of the numbers $m_i$ is odd. On the other hand, if $m_i$ is odd, then $J^{m_i}$ is a rotation around the $x_3$-axis followed by a symmetry with respect to the $(x_1, x_2)$-plane. Thus in this case $J^{m_i}$ only fixes the origin of $\mathbb{R}^3$. As $X$ is an embedding, there is at most one point of $M$ mapped by $X$ into the origin, and so only one $m_i$ is odd and $m_i = 1$. Up to reindexing we can assume that $m_1 = 1$. Therefore $m_i$ is even, $i \ge 2$, and $J^{m_i}$ is a rotation, $i \ge 2$. If $m_i > 2$, $i \in \{2, \dots, s\}$, then $Q_i$ and $J^2(Q_i)$ are two different points lying in $\mathrm{orb}(Q_i)$. Since $J^{m_i}$ is a rotation around the $x_3$-axis, $X(Q_i)$ and $X(J^2(Q_i))$ lie in the $x_3$-axis. Moreover, $J^2$ is a rotation around the $x_3$-axis too, and thus $X(Q_i) = X(J^2(Q_i))$, which contradicts the fact that $X$ is an embedding. Hence $m_i = 2$ for all $i \ge 2$, and (3) becomes
$$|\Delta_0| + (s - 1)\bigl(|\Delta_0| - 2\bigr) = 2k.$$
If $s = 1$, we get $|\Delta_0| = 2k$; and by Theorem 2 the surface corresponds to the unique element of the Costa-Hoffman-Meeks family with a flat end. If $s > 1$, using the Riemann-Hurwitz formula once again, we get
$$|\mathcal{R}|\,\chi\bigl(\overline{M}/\mathcal{R}\bigr) = \chi(\overline{M}) + (3 + 2s - 1)\bigl(|\mathcal{R}| - 1\bigr) = 4 - 2k + 2(s + 1)\bigl(|\mathcal{R}| - 1\bigr).$$
As $s > 1$, we obtain $\chi(\overline{M}/\mathcal{R}) = 2$. So the above equality gives
$$\frac{k}{2} \le |\mathcal{R}| = 1 + \frac{k - 1}{s} \le \frac{k + 1}{2}.$$
This leaves us with two possibilities.
(A) $|\mathcal{R}| = (k + 1)/2$. In this case, $k - 1$ is even and $s = 2$. Then we have three points in $M$ which are mapped by $X$ into the $x_3$-axis: $Q_1$, $Q_2$, and $Q_3 = J(Q_2)$.
(B) $|\mathcal{R}| = k/2$. This implies $k - 1 = 3$ and $s = 3$. Now, there are five points in $M$ which are mapped by $X$ into the $x_3$-axis: $Q_1$, $Q_2$, $Q_3$, $Q_4 = J(Q_2)$, and $Q_5 = J(Q_3)$.
Observe that in both cases any symmetry of $M$ fixes the point $Q_1$. Moreover, any symmetry of $M$ which fixes the ends must fix the points $Q_i$, $i \ge 2$, and any symmetry of $M$ which interchanges the catenoid ends must interchange the points $Q_i$ and $J(Q_i)$, $i \ge 2$. At this point we focus on the branched covering $z : \overline{M} \to \overline{M}/\mathcal{R} \equiv \overline{\mathbb{C}}$. Up to Möbius transformations we can assume in both cases that $z(E_2) = \infty$, $z(Q_1) = 0$, and $z(Q_2) = 1$. From (1) and the hypothesis $|\Delta| \ge 2k$, we see that $[\Delta : \Delta_0] = 2$, $[\Delta_0 : \mathcal{R}] = 2$. So $\Delta/\mathcal{R}$ is a group of conformal transformations of $\overline{\mathbb{C}}$ isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$. Hence it follows that $\Delta/\mathcal{R} = \{\mathcal{R}, f, g, h\}$, where
(1) $f$ is a holomorphic involution fixing zero and $\infty$, so $f(z) = -z$ and $z(J(Q_2)) = -1$;
(2) $g$ is an antiholomorphic involution fixing the real axis, and so $g(z) = \bar{z}$;
(3) $h(z) = (f \circ g)(z) = -\bar{z}$.
Taking these facts into account, we deduce that:
(A) $\lambda = z(E_1) \in \mathbb{R} - \{0, 1\}$ and $z(E_3) = -\lambda$;
(B) $\lambda = z(E_1) \in \mathbb{R} - \{0, 1\}$, $z(E_3) = -\lambda$, $\mu = z(Q_3) \in \mathbb{R} - \{0, 1, \lambda\}$, and $z(Q_5) = -\mu$.
We now investigate the behavior of the Gauss map at an end and at a point on the $x_3$-axis in the presence of the rotational symmetry $R$. Each catenoid end still produces a simple zero of $G$. However, the flat end must correspond to a pole of order $jm + 1$, where $j$ is a positive integer and $m$ is the order of $R$. On the other hand, if $Q$ is a point on the $x_3$-axis, then the order of $Q$ as zero or pole of $G$ is $jm - 1$ for a positive integer $j$ (for details, see [2]).
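The dichotomy between cases (A) and (B) comes from integrality in the relation $|\mathcal{R}| = 1 + (k-1)/s$ together with the bound $k/2 \le |\mathcal{R}|$. A minimal sketch that enumerates the admissible pairs follows; the function name `admissible_pairs` is hypothetical, and the search only uses these two conditions with $k > 2$ and $s > 1$.

```python
def admissible_pairs(k_max):
    """List (k, s, order) with k > 2, s > 1, order = 1 + (k - 1)/s an integer
    and k/2 <= order, where `order` stands for the order of the rotation group."""
    pairs = []
    for k in range(3, k_max + 1):
        for s in range(2, k):  # s divides k - 1, hence s <= k - 1
            if (k - 1) % s != 0:
                continue
            order = 1 + (k - 1) // s
            if 2 * order >= k:  # the lower bound k/2 <= order
                pairs.append((k, s, order))
    return pairs


for k, s, order in admissible_pairs(12):
    case = "A" if 2 * order == k + 1 else "B"
    print(f"k={k:2d}  s={s}  |R|={order}  case {case}")
```

Only $s = 2$ with $k$ odd (case A) and the single pair $k = 4$, $s = 3$ (case B) survive, in agreement with the two possibilities listed above.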
Using these arguments and the fact that $\deg(G) = k + 1$, we are able to deduce that the divisor associated to $G$ as a meromorphic function on $\overline{M}$ is (in each case)
(A) $(G) = \dfrac{E_1 \cdot E_3 \cdot Q_2^{(k-1)/2} \cdot Q_3^{(k-1)/2}}{E_2^{(k+3)/2} \cdot Q_1^{(k-1)/2}}$,
(B) $(G) = \dfrac{E_1 \cdot E_3 \cdot Q_1 \cdot Q_3 \cdot Q_5}{E_2^{3} \cdot Q_2 \cdot Q_4}$.
Next we write down the divisor for $dh$ on $\overline{M}$. The behavior of $dh$ is determined by the rule that, on $M$, $(1/G)\,dh$ is holomorphic with zeros precisely at the poles of $G$ but with double order. Recalling that the poles and zeros of $(1/G)\,dh$ can lie only at the poles and zeros of $G$ and that their order is determined by the requirement that the first and second Weierstrass forms have poles of order 2, we obtain
MART´IN AND WEBER
540 (k−1)/2
(A) (dh) = Q1
5
(k−1)/2
· Q2
(k−1)/2
· Q3
(k−1)/2
· E2
/(E1 · E3 ),
(B) (dh) = E2 · i=1 Qi /(E1 · E3 ). We now use techniques similar to those introduced by Hoffman and Meeks in [9] to obtain the two possible conformal structures for M. Case A Define N = M − {Q1 , Q2 , Q3 }, and denote n = (k + 1)/2. Then z|N : N → C − {0, ±1, ±λ} is an n-fold unbranched cyclic covering. Moreover, the conformal structure on N determines that of M. We may determine z|N as follows. Recall that R is the generator of R corresponding to counterclockwise rotation around the x3 -axis by an angle of 2π/n. Let αi , i = 1, . . . , 5, be a counterclockwise circuit around 0, −1, 1, −λ, and λ, respectively, and let αi be its lift to N . The endpoints of αi differ by a deck transformation of the form R ki , 0 ≤ ki ≤ n − 1, i = 1, . . . , 5. The choice of R and the fact that we have oriented M with downward-pointing normals at E1 , E3 , Q2 , and Q3 implies that R has rotation number π/n at E1 and E3 and rotation number −π/n at Q2 and Q3 . Using similar arguments for Q1 , R has rotation number π/n at Q1 . Hence k1 ≡ k4 ≡ k5 ≡ 1 mod(n) and k2 ≡ k3 ≡ −1 mod(n). The numbers k1 , . . . , k5 determine the induced map from the fundamental group '1 (C − {0, ±1, ±λ}) onto Zn whose kernel corresponds to z∗ ('1 (N)) ⊂ '1 (C − {0, ±1, ±λ}). Any n-fold cyclic covering of C − {0, ±1, ±λ} is equivalent to z|N if the associated representation has the same kernel. In particular, the cyclic covering defined by the z-projection of 2 2
\[ \Bigl\{ (z, w) \in \bigl(\mathbb{C} - \{0, \pm 1, \pm\lambda\}\bigr) \times \bigl(\mathbb{C} - \{0\}\bigr) : w^n = \frac{z(z^2 - \lambda^2)}{z^2 - 1} \Bigr\} \]
is equivalent to $z|_N$. The extension of this covering to the Riemann surface
\[ \overline{M}^n_\lambda = \Bigl\{ (z, w) \in (\mathbb{C} \cup \{\infty\})^2 : w^n = \frac{z(z^2 - \lambda^2)}{z^2 - 1} \Bigr\} \]
is conformally equivalent to $z : \overline{M} \to \overline{M}/\mathcal{R}$. In particular, $\overline{M} = \overline{M}^n_\lambda$, $E_1 = (\lambda, 0)$, $E_2 = (\infty, \infty)$, $E_3 = (-\lambda, 0)$, $Q_1 = (0, 0)$, $Q_2 = (1, \infty)$, and $Q_3 = (-1, \infty)$. Furthermore, the conformal transformation R corresponds to $R(z, w) = (z, e^{2\pi i/n} w)$. We denote $M^n_\lambda = \overline{M}^n_\lambda - \{E_1, E_2, E_3\}$. Taking into account the expressions for (G) and (dh) that we have obtained before, we deduce that
\[ dh = \sigma\, \frac{dz}{z^2 - \lambda^2}, \qquad G = \rho\, \frac{z^2 - \lambda^2}{w^{n-1}}, \]
where, up to scaling and rigid motion, $\rho \in \mathbb{R}^*$ and $\sigma \in \mathbb{C}^*$, |σ| = 1. The 1-form dh is the pullback of a differential on $\mathbb{C}$. So the period problem reduces to imposing that
its residues at the ends are real. This leads to σ = ±1. Up to a symmetry, we can assume σ = 1. For λ, we have to consider the two cases λ ∈ (0, 1) and λ ∈ (1, ∞). The former can be easily excluded as follows (see also Section 3). Assume the contrary, that is, λ < 1. Denote by S : M nλ → M nλ the anticonformal map S(z, w) = (z, w). It is obvious that S ∗ (,1 , ,2 , ,3 ) = (,1 , −,2 , ,3 ), which means that S induces in X(M) a reflection in the plane x2 = 0. Let γ1 , γ2 , and γ3 be the lifts to M nλ of the segments in the z-plane: [1, ∞[, [0, λ[, and [−1, −λ[, respectively, and label .i = X(γi ), i = 1, 2, 3. Using the above reasoning, .i is contained in x2 = 0, i = 1, 2, 3. On the other hand, one has that • .1 is a curve connecting the X(Q2 ) (in the x3 -axis) with the flat end; • .2 is another divergent curve connecting X(Q1 ) = (0, 0, 0) with the catenoid end E1 ; • .3 connects the point X(Q3 ) and the catenoid end E3 . These facts lead to /( 3i=1 .i ) ≥ 2 (see Figure 3), which is contrary to the embeddedness. So we conclude that λ is in ]1, +∞[.
Figure 3. The two possible situations for the curves .i , i = 1, 2, 3
Case B Recall that, in this case, the genus of M is 3 and ord(R) = 2. Using similar arguments as in case A, we deduce that M is conformally equivalent to
\[ \overline{M}_{\lambda\mu} = \bigl\{ (z, w) \in (\mathbb{C} \cup \{\infty\})^2 : w^2 = z(z^2 - 1)(z^2 - \lambda^2)(z^2 - \mu^2) \bigr\}, \]
and the Weierstrass data are
\[ G = \rho\, \frac{w}{z^2 - 1}, \qquad dh = \sigma\, \frac{dz}{z^2 - \lambda^2}, \]
where $\rho \in \mathbb{R}^*$ and $\sigma \in \mathbb{C}^*$, |σ| = 1. The period conditions for dh tell us that σ is real and, once again, we can suppose that σ = 1.

3. Preliminary discussion of the period problem

In this section, we discuss the period condition on the candidate Weierstrass data from the last section. The period condition ensures that the potentially multiple-valued surface parametrization is in fact single-valued. It can be stated as follows. For any closed cycle γ on X, we need
\[ \operatorname{Re} \int_\gamma dh = 0, \tag{4} \]
\[ \int_\gamma G\, dh = \overline{\int_\gamma \frac{1}{G}\, dh}. \tag{5} \]
Our goal is to show that, in both cases A and B, the period problem cannot be solved. As we mentioned at the end of Section 2, the period condition on dh is equivalent to the requirement that σ is real. We assume without loss of generality that σ = 1. The discussion of the second condition is much more difficult. In case A, it requires a detailed analysis of hypergeometric functions, which is postponed to Section 4. In case B, the period problem appears to be more difficult than in case A, as it depends on one additional parameter. Surprisingly, it can in fact be reduced to case A by a simple extremal length argument, which is done in Section 5. For both discussions, it is helpful to consider certain flat polygonal domains associated naturally to the Weierstrass forms G dh and (1/G) dh. These domains were introduced in [18] and [19] and were also used in [16]. We illustrate the concept in case A where the Weierstrass forms, restricted to the upper half-plane in the z-sphere, are given by
\[ G\, dh = \rho\, \frac{dz}{w^{n-1}} = \rho\, z^{1/n-1} (z^2 - 1)^{1-1/n} (z^2 - \lambda^2)^{1/n-1}\, dz, \]
\[ \frac{1}{G}\, dh = \frac{1}{\rho}\, \frac{w^{n-1}\, dz}{(z^2 - \lambda^2)^2} = \frac{1}{\rho}\, z^{1-1/n} (z^2 - 1)^{1/n-1} (z^2 - \lambda^2)^{-1-1/n}\, dz. \]
We now look at the maps $\mathbb{H} \to \mathbb{C}$ defined by
\[ z \mapsto \int^z G\, dh \qquad \text{and} \qquad z \mapsto \int^z \frac{1}{G}\, dh. \]
From the Schwarz-Christoffel formula (see [12]), we see that these holomorphic maps map the upper half-plane conformally onto a (possibly branched and unbounded) euclidean polygon bounding a disk. The zeros and poles of the integrands, that is, 0, ±1, ±λ, and ∞, are mapped to the vertices of these domains, and the angles are determined explicitly by the exponents of the factors. Figure 4 shows the two image domains in case A with n = 2 and λ > 1. Here and in the following figures, the left domain corresponds to G dh, and the right domain corresponds to (1/G) dh.

Figure 4. Case Aa
The polygonal arc is the image of the real axes, the arrow(s) indicate(s) the orientation, and the domains are always to the left of these polygonal boundaries. Interior branched points are indicated by a fat dot. Such branched domains can be realized by taking a domain for each boundary component, slitting these domains with slits emanating from the branched points, and glueing them together. As the minimal surface patch corresponding to the upper half-plane is the conformal image of the upper half-plane, the above domains are other conformally correct representations of this surface patch. Moreover, some of the surface symmetries are visible; surface isometries that preserve the absolute value of G dh become automorphisms of the new flat surface metric |G dh|, and their fixed point sets become straight lines. In the particular case of this example, the reflections about the vertical coordinate planes decompose the minimal surface into four simply connected isometric pieces, each the image of the upper half-plane. The symmetry lines become just the polygonal segments. There is an additional assumed symmetry, namely, a rotation about a horizontal line on the surface, also preserving |G dh|. This symmetry becomes the reflectional symmetry of the domains. Our convention here is that the figures are symmetric with respect to the (y = −x)-diagonal. This can always be achieved by a rotation. So the (y = −x)-axes takes over the role of the imaginary axes, and hence conjugation means reflection at the (y = x)-axes.
The periods of the meromorphic 1-forms G dh and 1/G dh can now be read off from these domains as displacement vectors between parallel edges, because integrating (say) G dh over a closed cycle is nothing but developing the curve using the flat metric. The period condition on G dh and 1/G dh requires that their respective periods are complex conjugate. Under the chosen normalization of the Schwarz-Christoffel maps, this means that corresponding period vectors (i.e., displacement vectors between the same pair of parallel edges) in G dh and 1/G dh must be conjugate by a reflection at the (y = x)-diagonal. Definition 1 Two period domains as above are called conjugate if all corresponding period vectors are symmetric with respect to reflection at the (y = x)-axes. Given a conjugate pair of domains, it is not clear a priori that they come from the same modulus, that is, that the modulus λ is the same number in both Schwarz-Christoffel maps. If this is the case, we call the two domains conformal. Definition 2 Two period domains as above are called conformal if there is a conformal biholomorphism mapping vertices to corresponding vertices. Using this terminology, we can rephrase the period problem as follows. theorem 3 The period problem in case A or B has a solution if and only if there is a pair of conjugate and conformal domains which corresponds to the respective Weierstrass data. Using this observation, we can give an alternative argument to show that, in case A, λ ∈ (0, 1) is impossible. Here the domains look like those in Figure 5. Notice how the angle geometry of this domain imposes restrictions on the periods. The period vector corresponding to the cycle connecting (0, λ) to (1, ∞) has to point up in the G dh domain because the branched point cannot lie outside the domain and the period vector points left in the 1/G dh domain by the angles. 
This contradicts the conjugacy requirement for the period vectors (recalling that the real axes in each of our figures is always the (y = x)-diagonal). In the case of higher dihedral symmetry, we obtain the kind of domains shown in Figure 6.
Figure 5. Case Ab
The periods can be read off as before. Notice that the period vectors are always parallel to edges of one type. Again, the imaginary axis becomes the symmetry diagonal.

Figure 6. Case Aa, n > 2
For case B, we have to distinguish six different cases corresponding to the relative position of 0, 1, λ, µ. Fortunately, all but two drop out by the same type of argument as before (see Figures 7–12). Cases Bb, Bc, Bd, Bf are impossible because the period vectors have inconsistent signs. To ensure conjugacy of the domains, the lowest edge of the 1/G dh domain needs to be the leftmost edge of the G dh domain. Similarly, the ordering of the horizontal edges from bottom to top 1/G dh imposed by the geometry of the domain has to persist as an ordering from left to right in the G dh domain. Using this, together with the assumed reflectional symmetry of the domains, leads in all these cases to domains that are geometrically impossible. Cases Ba and Be are dealt with in Section 5. This concludes our preliminary discussion of the period problem.
Figure 7. Case Ba: 0 < 1 < λ < µ

Figure 8. Case Bb: 0 < 1 < µ < λ

Figure 9. Case Bc: 0 < µ < 1 < λ

Figure 10. Case Bd: 0 < µ < λ < 1

Figure 11. Case Be: 0 < λ < 1 < µ

Figure 12. Case Bf: 0 < λ < µ < 1
4. Convexity properties of period quotients In this section, we study period quotients of meromorphic 1-forms as functions of one real modulus λ ∈ (1, ∞). Such period quotients arise naturally in existence and classification problems for minimal surfaces (see [17]). The reason for this is that the second period condition in (5) asks for complex conjugate period vectors, and often this can be decided by just asking for proportional period vectors. In situations with only one free essential parameter, this reduces to period quotient functions. However, usually it is a delicate problem to find good estimates for the transcendental functions arising this way. The difficulties have two different causes. First, the periods forming the fraction are integrals over different cycles on the Riemann surface, giving no idea how to compare the integrals. Second, despite the fact that the integrands are real 1-forms, very often the cycles are curves in some complex domain which cannot be replaced by better estimable integrals along segments on the real axes. Both of these problems are overcome here by proving a much stronger statement than needed. The ultimate goal here is to show that two such period quotients can be equal only for λ = 1. We do this by proving that one of these functions is convex, the other is concave, and their graphs touch at λ = 1. The reason why it is actually simpler to control second derivatives than the functions themselves lies in the fact that period integrals often satisfy linear differential equations. In our case we obtain a formula relating the second derivative of a period quotient to the coefficients of the differential equation and a term involving only one period and its derivative. The latter term can easily be estimated using direct computations. All the discussions in this section are directly concerned with case A. 
Hence we consider the meromorphic 1-forms from case A:
\[ \omega_1 = z^{1/n-1} (z^2 - 1)^{1-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz, \]
\[ \omega_2 = z^{1-1/n} (z^2 - 1)^{-1+1/n} (\lambda^2 - z^2)^{-1/n-1}\, dz, \]
which are single-valued on the Riemann surface X defined by the algebraic equation
\[ w^n = \frac{z(z^2 - \lambda^2)}{z^2 - 1}. \]
As explained in Section 2, we can assume that λ > 1. The principal goal is to prove the following theorem.

theorem 4 There is no complex number $\rho \in \mathbb{C}^*$ such that we have, for all closed cycles γ on X,
\[ \int_\gamma \rho\, \omega_1 = \overline{\int_\gamma \frac{1}{\rho}\, \omega_2}. \]
To prove this, we first reduce this statement to a period quotient statement for real integrals. Observe that $\omega_1$ can be integrated safely on the intervals [0, 1] and [1, λ]. This does not hold for $\omega_2$ (see Figure 13), and we take care of this by the following lemma.

Figure 13. Period computations
lemma 1 We have
\[ \int_\gamma \omega_2 = -\frac{1}{2(\lambda^2 - 1)\lambda^{3/n}} \int_{\gamma^*} \omega_3, \]
where
\[ \omega_3 = z^{1/n-1} (z^2 - 1)^{-1/n} (\lambda^2 - z^2)^{1/n}\, dz \]
and where γ ∗ denotes the cycle obtained from γ by applying the conformal transformation S(z, w) = (λ/z, λ3/n /w). Proof Using first partial integration, we obtain 1 1 n dz d = . ω2 = ω2 + 2 − 1) 2−1 w zw 2(λ 2 λ γ γ γ So we have
γ
ω2 =
S∗ γ
S
∗
dz zw
1 = 2 2 λ − 1 λ3/n
γ∗
ω3 .
(6)
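lemma 2 We have
\[ \int_0^1 \omega_2 = \frac{1}{2(\lambda^2 - 1)\lambda^{3/n}} \Bigl( \int_0^1 |\omega_3| - \int_1^\lambda |\omega_3| \Bigr), \]
\[ \int_1^\lambda \omega_2 = \frac{1}{2(\lambda^2 - 1)\lambda^{3/n}} \int_1^\lambda \omega_3. \]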
Proof The second equality is a consequence of the substitution v = λ/z and the symmetries of the integrand. The nonobvious part is the first equality. Let a(s) and b(s) be the oriented simple closed curves in the z-plane illustrated in Figure 14. We assume that a(0) ∈ ]1, λ[ and b(0) ∈ ]−1, 0[. Let α(s) and β(s) be the unique lifts of a(s) and b(s) to $\overline{M}^n_\lambda$, respectively, satisfying arg(w(β(0))) = arg(w(α(0))) = π/n.

Figure 14. The curves a(s) and b(s) in the z-plane
Consider the antiholomorphic involution T (z, w) = S(z, w) = (λ/z, λ3/n /w). Analytic continuation of w along a(s) and b(s) and elementary topological arguments give (7) T∗ (α) = α + β. So, using (6) and (7), one has 1 dz = 2 ω2 = T∗ ω3 zw 2 λ − 1 λ3/n α+β α T∗ α
1 |w| |w| dz + dz z −λ 0 z 1 λ −2π i/n |w| |w| 1 dz − dz . −1 e = 2 z z 2 λ − 1 λ3/n 0 1
−2π i/n 1 = 2 e −1 3/n 2 λ −1 λ
−1
Now introduce the period integrals
\[ f_1(\lambda) = \int_0^1 z^{1/n-1} (1 - z^2)^{1-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz, \]
\[ f_2(\lambda) = \int_1^\lambda z^{1/n-1} (z^2 - 1)^{1-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz, \]
\[ g_1(\lambda) = \int_0^1 z^{1/n-1} (1 - z^2)^{-1/n} (\lambda^2 - z^2)^{1/n}\, dz, \]
\[ g_2(\lambda) = \int_1^\lambda z^{1/n-1} (z^2 - 1)^{-1/n} (\lambda^2 - z^2)^{1/n}\, dz \]
and the associated period quotients
\[ pq_f(\lambda) = \frac{f_2(\lambda)}{f_1(\lambda)}, \qquad pq_g(\lambda) = \frac{g_2(\lambda)}{g_1(\lambda) - g_2(\lambda)}. \]
Later we need the following lemma.

lemma 3 We have
\[ g_1(\lambda) > g_2(\lambda), \qquad g_1'(\lambda) > g_2'(\lambda). \]
Proof The first inequality is a simple consequence of Lemma 2. In fact, the substitution v = λ/z carries the cycle from zero to 1 into a cycle homotopic to the difference of the cycles from zero to 1 and 1 to λ, so that this difference can be evaluated as an integral over a positive integrand. The only critical issue here is that the integrals need to converge. We can apply a similar strategy for the second inequality. Here we compute −1/n 2 1/n 2λ ∂ ω3 = v 1/n−1 1 − v 2 dv λ − v2 ∂λ γ ∗ n γ∗ −1/n 1/n−1 2λ λ2 1/n−1 λ λ λ2 =− dz 1− 2 λ2 − 2 n γ z z z z2 1/n−1 2 −1/n 2λ3−3/n =− z1−1/n z2 − 1 dz. z − λ2 n γ The last integral converges, and the same reasoning about the cycles as above implies that g1 (λ) > g2 (λ). We now come to the main theorem of this section. Lemma 4 shows that this is a sharp estimate. In other words, for λ close to 1, one gets a minimal surface with its periods almost closed.
theorem 5 For every λ > 1, we have $pq_f(\lambda) > pq_g(\lambda)$.

Assuming Theorem 5, Theorem 4 follows easily. Theorem 5 is proven as a corollary of Lemma 4 and Proposition 1.

lemma 4 The period quotient functions satisfy
\[ pq_f(1) = pq_g(1) = 0, \qquad pq_f'(1) = \frac{n - 1}{n^2 \sin(\pi/n)}\, \pi, \qquad pq_g'(1) = \frac{1}{n^2 \sin(\pi/n)}\, \pi. \]
Proof The first statement follows instantly from the definitions. For the others, we first deal with $pq_f'(1)$. For this we need to evaluate $f_2'(1)$, and to compute the derivative we have to ensure that the integral converges. This is done using the substitution x = λ/z:
\[ f_2 = \int_1^\lambda z^{1/n-1} (z^2 - 1)^{1-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz = \lambda^{3/n-2} \int_1^\lambda x^{-1/n-1} (x^2 - 1)^{1/n-1} (\lambda^2 - x^2)^{1-1/n}\, dx. \]
Hence (using x = (λ − 1)y + 1)
\[ f_2'(1) = 2\Bigl(1 - \frac{1}{n}\Bigr) \lim_{\lambda \to 1} \int_1^\lambda x^{-1/n-1} (x^2 - 1)^{1/n-1} (\lambda^2 - x^2)^{-1/n}\, dx \]
\[ = 2\Bigl(1 - \frac{1}{n}\Bigr) \lim_{\lambda \to 1} \int_0^1 (\lambda - 1) \bigl((\lambda - 1)y + 1\bigr)^{-1-1/n} \bigl(((\lambda - 1)y + 1)^2 - 1\bigr)^{1/n-1} \bigl(\lambda^2 - ((\lambda - 1)y + 1)^2\bigr)^{-1/n}\, dy \]
\[ = 2\Bigl(1 - \frac{1}{n}\Bigr) \int_0^1 (2y)^{1/n-1} (2 - 2y)^{-1/n}\, dy = \Bigl(1 - \frac{1}{n}\Bigr) \frac{\pi}{\sin(\pi/n)}. \]
More easily, $f_1(1) = n$. This implies the statement about $pq_f'(1)$. Concerning $pq_g'(1)$, one easily evaluates $g_2(1) = 0$ and $g_1(1) = n$. Then it follows that $pq_g'(1) = (1/n)\, g_2'(1)$. To evaluate
$g_2'(1)$, we proceed as before (using z = (λ − 1)y + 1):
\[ g_2'(1) = \frac{2}{n} \lim_{\lambda \to 1} \int_1^\lambda z^{1/n-1} (z^2 - 1)^{-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz = \frac{1}{n} \lim_{\lambda \to 1} \int_0^1 y^{-1/n} (1 - y)^{1/n-1}\, dy = \frac{\pi}{n \sin(\pi/n)}. \]
Now we have the following crucial proposition.

proposition 1 For n ≥ 3, the period quotient functions satisfy
\[ \frac{pq_f''}{pq_f'} > 0, \qquad \frac{pq_g''}{pq_g'} < 0. \]
This proposition is proven in two steps. First we deduce a linear differential equation of second order for the periods. Then we use the important Lemma 6 to compute the second derivative of the period quotients. To begin, consider a path γ in $\mathbb{C}$ such that $z^a (z - 1)^b (z - \lambda)^c$ does not change its value when continued analytically along γ, and define
\[ F_\gamma(a, b, c; \lambda) := \int_\gamma z^a (z - 1)^b (z - \lambda)^c\, dz. \]
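This function is well known to satisfy a hypergeometric differential equation.

proposition 2 F satisfies the second-order differential equation $F'' = AF + BF'$ with
\[ A = -\frac{(a + b + c + 1)\, c}{\lambda(\lambda - 1)}, \qquad B = \frac{(a + b + 2c)\lambda - a - c}{\lambda(\lambda - 1)}. \]
The proof of this proposition is classical and can be found in [12].

Example 1 We obtain that
\[ \tilde f_\gamma(\lambda) = \int_\gamma z^{1/(2n)-1} (1 - z)^{1-1/n} (\lambda - z)^{1/n-1}\, dz \]
satisfies
\[ \tilde f_\gamma'' = \frac{n - 1}{2n^2 \lambda(\lambda - 1)}\, \tilde f_\gamma - \frac{4n - 3}{2n\lambda}\, \tilde f_\gamma' \]
and that
\[ \tilde g_\gamma(\lambda) = \int_\gamma z^{1/(2n)-1} (1 - z)^{-1/n} (\lambda - z)^{1/n}\, dz \]
satisfies
\[ \tilde g_\gamma'' = -\frac{2n - 3}{2n\lambda}\, \tilde g_\gamma' - \frac{1}{2n^2 \lambda(\lambda - 1)}\, \tilde g_\gamma. \]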
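Now we transform this information into our notation, as shown in the following lemma.

lemma 5 Suppose that $\tilde F$ satisfies
\[ \tilde F'' = A \tilde F + B \tilde F'. \]
Then $F = \tilde F \circ \varphi$ satisfies
\[ F'' = (A \circ \varphi)\, \varphi'^2\, F + \Bigl( (B \circ \varphi)\, \varphi' + \frac{\varphi''}{\varphi'} \Bigr) F'. \]
Proof We compute
\[ F' = (\tilde F' \circ \varphi)\, \varphi', \]
\[ F'' = (\tilde F'' \circ \varphi)\, \varphi'^2 + (\tilde F' \circ \varphi)\, \varphi'' = \bigl( (A \circ \varphi)(\tilde F \circ \varphi) + (B \circ \varphi)(\tilde F' \circ \varphi) \bigr) \varphi'^2 + (\tilde F' \circ \varphi)\, \varphi'' = (A \circ \varphi)\, F\, \varphi'^2 + \Bigl( (B \circ \varphi)\, \varphi' + \frac{\varphi''}{\varphi'} \Bigr) F'. \]
This lemma is used only in the following case.

corollary 2 If $\tilde F(\lambda)$ satisfies $\tilde F'' = A \tilde F + B \tilde F'$, then $F(\lambda) = \tilde F(\lambda^2)$ satisfies
\[ F''(\lambda) = 4\lambda^2 A(\lambda^2)\, F(\lambda) + \Bigl( 2\lambda B(\lambda^2) + \frac{1}{\lambda} \Bigr) F'(\lambda). \]
Example 2 Using the $\tilde f_\gamma$ from Example 1, we obtain that
\[ f_\gamma(\lambda) = \tilde f_\gamma(\lambda^2) \tag{8} \]
satisfies
\[ f_\gamma'' = \frac{2(n - 1)}{n^2 (\lambda^2 - 1)}\, f_\gamma - 3\, \frac{n - 1}{n\lambda}\, f_\gamma' \]
and that
\[ g_\gamma(\lambda) = \tilde g_\gamma(\lambda^2) \]
satisfies
\[ g_\gamma'' = -\frac{n - 3}{n\lambda}\, g_\gamma' - \frac{2}{n^2 (\lambda^2 - 1)}\, g_\gamma. \]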
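The coefficients quoted in Examples 1 and 2 can be rechecked mechanically from Proposition 2 and Corollary 2. The following is a minimal sketch in exact rational arithmetic; the function names and the dictionary keys are illustrative choices made here, not notation from the paper. It assumes the exponents a = 1/(2n) − 1, and (b, c) = (1 − 1/n, 1/n − 1) for the f-integrand or (−1/n, 1/n) for the g-integrand, as read off from Example 1.

```python
from fractions import Fraction as Fr


def prop2_numerators(a, b, c):
    """Proposition 2 with the common denominator lam*(lam - 1) stripped off.

    Returns (alpha, beta1, beta0) such that
        A = alpha / (lam*(lam - 1)),
        B = (beta1*lam + beta0) / (lam*(lam - 1)).
    """
    alpha = -(a + b + c + 1) * c
    beta1 = a + b + 2 * c
    beta0 = -(a + c)
    return alpha, beta1, beta0


def example_coefficients(n, kind):
    """Coefficients of Examples 1 and 2; kind is "f" or "g" (hypothetical labels)."""
    a = Fr(1, 2 * n) - 1
    if kind == "f":
        b, c = 1 - Fr(1, n), Fr(1, n) - 1
    else:
        b, c = -Fr(1, n), Fr(1, n)
    alpha, beta1, beta0 = prop2_numerators(a, b, c)
    # In both examples b = -c, so beta1*lam + beta0 = beta1*(lam - 1)
    # and B collapses to kappa/lam with kappa = beta1.
    assert beta0 == -beta1
    kappa = beta1
    # Corollary 2 with F(lam) = F~(lam^2):
    #   4 lam^2 A(lam^2)       = (4*alpha) / (lam^2 - 1)
    #   2 lam B(lam^2) + 1/lam = (2*kappa + 1) / lam
    return {
        "ex1_F": alpha,          # times 1/(lam*(lam - 1))
        "ex1_dF": kappa,         # times 1/lam
        "ex2_F": 4 * alpha,      # times 1/(lam^2 - 1)
        "ex2_dF": 2 * kappa + 1  # times 1/lam
    }


if __name__ == "__main__":
    for n in range(2, 10):
        f = example_coefficients(n, "f")
        g = example_coefficients(n, "g")
        assert f["ex1_F"] == Fr(n - 1, 2 * n * n)
        assert f["ex1_dF"] == -Fr(4 * n - 3, 2 * n)
        assert g["ex1_F"] == -Fr(1, 2 * n * n)
        assert g["ex1_dF"] == -Fr(2 * n - 3, 2 * n)
        assert f["ex2_F"] == Fr(2 * (n - 1), n * n)
        assert f["ex2_dF"] == -Fr(3 * (n - 1), n)
        assert g["ex2_F"] == -Fr(2, n * n)
        assert g["ex2_dF"] == -Fr(n - 3, n)
    print("Examples 1 and 2 agree with Proposition 2 and Corollary 2")
```

The check also makes visible why the coefficient of the first derivative has no pole at λ = 1 in these examples: since b = −c, the numerator of B is a multiple of λ − 1.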
The following lemma shows how we want to use the differential equation to obtain convexity statements.

lemma 6 Suppose that $F_1$, $F_2$ satisfy
\[ F'' = AF + BF'. \]
Then the quotient $pq_F = F_2/F_1$ satisfies
\[ \frac{pq_F''}{pq_F'} = B - 2\, \frac{F_1'}{F_1}. \]
Proof The proof consists of the first step in the standard proof for the formula of the Schwarzian derivative of the quotient w (see [12]). Differentiating $F_2 = pq_F\, F_1$ twice gives
\[ F_2' = pq_F'\, F_1 + pq_F\, F_1', \qquad F_2'' = pq_F''\, F_1 + 2\, pq_F'\, F_1' + pq_F\, F_1''. \]
Using the differential equation for $F_i$ and substituting $pq_F\, F_1$ for $F_2$ gives
\[ B \cdot pq_F'\, F_1 = pq_F''\, F_1 + 2\, pq_F'\, F_1', \]
and the claim follows.

Proof of Proposition 1 and Theorem 5. The idea is to show the convexity properties of the period quotients by using Lemma 6 and estimating the integrals explicitly. For the first quotient, denote $r = pq_f = f_2/f_1$. Then (using $\lambda^2 - z^2 < \lambda^2$) we obtain
\[ \frac{r''}{r'} = -3\, \frac{n - 1}{n\lambda} - 2\, \frac{f_1'}{f_1} = -3\, \frac{n - 1}{n\lambda} + 2\Bigl(1 - \frac{1}{n}\Bigr) 2\lambda\, \frac{\int_0^1 z^{1/n-1} (1 - z^2)^{1-1/n} (\lambda^2 - z^2)^{1/n-2}\, dz}{\int_0^1 z^{1/n-1} (1 - z^2)^{1-1/n} (\lambda^2 - z^2)^{1/n-1}\, dz} > \frac{n - 1}{n\lambda} > 0. \]
Using the asymptotic estimates (see Lemma 4), one obtains that r is strictly convex and increasing. Similarly, for the second quotient, denote $r = pq_g = g_2/(g_1 - g_2)$. Then (using Lemma 3) we get, for n ≥ 3,
\[ \frac{r''}{r'} = -\frac{n - 3}{n\lambda} - 2\, \frac{g_1' - g_2'}{g_1 - g_2} < 0, \]
proving concavity of $pq_g$. This concludes the proof of Proposition 1. For n ≥ 3, Proposition 1, together with Lemma 4, implies Theorem 5 instantly. The remaining case n = 2 was treated differently by Weber in [15].

5. Extremal length argument

In this section, we use an extremal length argument to reduce the 2-dimensional case to the 1-dimensional case. Recall that, from the discussion in Section 3, we are left with two possible candidates of Weierstrass data, represented by the pair of domains in Figures 7 and 11.

theorem 6 There are no surfaces of type Ba or Be.

Proof Suppose there is a surface of type Ba. Then there is a pair of domains

0, and Soundararajan [S] improved this for all k ≥ 2 by showing that $g_k(T) \ge 2$. Gonek [G] proved that, on RH, $g_k(T) \ge 1$ for −1/2 < k < 0. Conrey and Ghosh [CG3] also gave conjectural improvements in the lower bound for the interval 1 < k < 2, assuming their conjecture that θ = 1 is permissible in (20) and (21). One rationale for studying $g_k(T)$ for nonintegral k is that if $g_k$ exists and is
meromorphic as a function of k, then it can be identified from its values for small real k. In fact, the other components in the conjectural formula (19) for Ik (T ) are 2 known to be entire functions of order 2. This is clearly the case for logk T and 1/ (1+k 2 ), and it was proved for ak by Conrey and Ghosh [CG3]. It would therefore be interesting to know how gk behaves as k → ∞. The conjectural result (22) suggests that gk grows at least like a function of order 1. We see below that the function wk (η) 2 in Conjecture 4 is (1 + η)k , so that 2 2T D 2 1 + it dt 2k 2 ak T logk 2 T . k,T 2 1 + k2 T Using this in (6) and dropping the second term, which is positive and probably much 2 larger than the first, we deduce that gk (T ) 2k as k → ∞ through the integers. Thus, Conjecture 4 implies that if gk exists, it grows at least as fast as a function of order 2. Probably it grows no faster than this. To see why, take M = N in (6) to obtain 2k 2 2T 2T 1 ζ Dk,N 1 + it dt dt ∼ 2 + it 2 2 T
with N =
T k/2 .
T
According to (9), the contribution to this of the “diagonal” terms is
\[ 2T \sum_{n \le N} \frac{d_k(n)^2}{n} \sim \frac{2 a_k}{\Gamma(k^2 + 1)}\, T \log^{k^2} N. \]
But we expect this to be larger than the entire mean value, as it is when $N \ll T$ by Montgomery and Vaughan’s mean value theorem and when $T \ll N \ll T^2$ by Conjecture 4. This reasoning suggests that $g_k \le 2 (k/2)^{k^2}$. Thus, we believe that
\[ 2^{k^2} \ll g_k \ll 2 \Bigl( \frac{k}{2} \Bigr)^{k^2}. \]
This is consistent with the conjecture of Keating and Snaith referred to above, which is that
\[ g_k = \Gamma(1 + k^2) \lim_{N \to \infty} N^{-k^2} \prod_{j=1}^{N} \frac{\Gamma(j)\, \Gamma(j + 2k)}{\Gamma(j + k)^2}. \]
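For positive integers k the Keating and Snaith product can be evaluated directly, which gives a quick sanity check on the formula as reconstructed here. The sketch below is illustrative only and its function names are invented for this example. It uses the fact that, for integer k, each factor Γ(j)Γ(j + 2k)/Γ(j + k)² equals the finite product of (j + k + i)/(j + i) over 0 ≤ i < k, so the limit telescopes to the closed form (k²)! · ∏_{j=0}^{k−1} j!/(j + k)!. The numerical routine truncates the limit at a finite N and is therefore only approximate, with relative error of order k³/N.

```python
import math
from fractions import Fraction


def g_closed_form(k):
    """Exact value (k^2)! * prod_{j=0}^{k-1} j!/(j+k)! for a positive integer k."""
    value = Fraction(math.factorial(k * k))
    for j in range(k):
        value *= Fraction(math.factorial(j), math.factorial(j + k))
    return value


def g_partial_limit(k, N):
    """Gamma(1+k^2) * N^(-k^2) * prod_{j<=N} Gamma(j)Gamma(j+2k)/Gamma(j+k)^2.

    For integer k each Gamma ratio is prod_{i<k} (j+k+i)/(j+i); logarithms are
    summed to avoid overflow.
    """
    log_product = 0.0
    for j in range(1, N + 1):
        for i in range(k):
            log_product += math.log((j + k + i) / (j + i))
    return math.exp(math.lgamma(1 + k * k) - k * k * math.log(N) + log_product)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, g_closed_form(k), g_partial_limit(k, 100_000))
```

For k = 1 and k = 2 the exact values are 1 and 2, matching the classical mean value constants for the second and fourth moments.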
By Stirling’s formula, we easily see that this implies
\[ g_k = \Bigl( \frac{k}{4 e^{1/2}} \Bigr)^{k^2 (1 + o(1))} \tag{23} \]
as k → ∞. With conjectural estimates for $g_k$ in hand, we can approach the question of the maximal order of the zeta-function. Define
\[ m_T = \max_{0 \le t \le T} \Bigl| \zeta\Bigl( \frac{1}{2} + it \Bigr) \Bigr|, \]
and for convenience write L = log T. On RH, it is known that
\[ m_T \ll \exp\Bigl( C_u \frac{L}{\log L} \Bigr) \tag{24} \]
for some positive constant $C_u$. On the other hand, it follows from work of Montgomery [M] that if RH is true, then
\[ m_T \gg \exp\Bigl( C_l \sqrt{\frac{L}{\log L}} \Bigr) \tag{25} \]
with $C_l = 1/20$. Subsequently, Balasubramanian and Ramachandra [BR] eliminated the need for RH in the lower bound, and Balasubramanian [B] increased the constant to $C_l = 0.5305\ldots$. (The constant is quoted as “3/4” in his paper, but Soundararajan has pointed out that there is an error in the computation of max D(7) there; it seems to be larger by a factor of $\sqrt{2}$ than it should be.) The wide disparity between the upper and lower bounds here appears in several other problems as well, and for the same reasons. For example, on RH it is known that S(T) = (1/π) arg ζ(1/2 + iT) satisfies
\[ S(T) \ll \frac{L}{\log L}, \tag{26} \]
and also (see [M]) that there exists a sequence of values of T → ∞ such that
\[ S(T) \gg \sqrt{\frac{L}{\log L}}. \]
On the 1-line, the disparity appears as a factor of 2. Namely, RH implies that
\[ |\zeta(1 + it)| \lesssim 2 e^{\gamma} \log\log t \tag{27} \]
as t → ∞, while unconditionally there exists a sequence of t → ∞ for which
\[ |\zeta(1 + it)| \gtrsim e^{\gamma} \log\log t. \]
The q-analogue of this asserts that if the generalized Riemann hypothesis is true, then
\[ |L(1, \chi)| \lesssim 2 e^{\gamma} \log\log q \tag{28} \]
for every primitive character χ (mod q), whereas unconditionally there is a sequence of q → ∞ such that
\[ |L(1, \chi_q)| \gtrsim e^{\gamma} \log\log q, \]
with $\chi_q$ a quadratic, primitive character (mod q). (See D. Shanks [Sh] for a discussion of his extensive numerical work on this question.)
We can obtain lower bounds for $m_T$ directly from lower bounds for $I_k(T)$ by observing that
\[ m_T \ge \Bigl( \frac{1}{T} \int_0^T \Bigl| \zeta\Bigl( \frac{1}{2} + it \Bigr) \Bigr|^{2k} dt \Bigr)^{1/2k}. \]
To estimate the right-hand side of this, we require an estimate for $a_k$ in addition to our conjectural estimates for $g_k$. To this end we prove the following proposition.

proposition Let $a_k$ be defined as above. Then we have
\[ \log a_k = -k^2 \log\bigl( 2 e^{\gamma} \log k \bigr) + o(k^2) \]
as k → ∞. With more work and assuming RH, we could obtain the much more precise result
log 2 log ak = −k 2 log log k + log 2eγ + log 1 + log k ∞ 1 log(J (iw)) − w 2 /4 log(J0 (iw)) 0 2
dw +
dw + 8k w 3 log 4k 2 /w 2 w 3 log 4k 2 /w 2 1 0
$+\, O(k^{1+\epsilon})$, where $J_0$ is the Bessel function of the first kind of order zero. The integrals can then be expanded further to give an asymptotic expansion in decreasing powers of log k.

We now calculate a lower bound for $m_T$ assuming a lower bound for $g_k$ of the form $(Ak + B)^{k^2}$. By Stirling’s formula, we have
\[ m_T \gg \Bigl( \frac{g_k a_k L^{k^2}}{\Gamma(1 + k^2)} \Bigr)^{1/2k} \approx \Bigl( \frac{(Ak + B) L e^{1-\gamma}}{2 k^2 \log k} \Bigr)^{k/2}. \tag{29} \]
It is not difficult to see that when A = 0, the right-hand side is maximized (as a function of k) essentially by taking $k^2 \log k = (B/2e^{1+\gamma}) L$. Then log k ∼ (1/2) log L, and we have
\[ m_T \ge e^{k} \ge \exp\Bigl( \sqrt{\frac{B L}{e^{1+\gamma} \log L}} \Bigr). \]
Note that if B = 2, then $(B/e^{1+\gamma})^{1/2} = 0.64\ldots$, and if B = 1, it equals 0.45…. Of course, this assumes we have uniformity in k out to $\sqrt{L}$. On the other hand, if B = 0 in (29), so that $g_k$ has the form suggested by Keating and Snaith, then we find that the maximum is attained when k log k is near $(A/2e^{1/2+\gamma}) L$. This implies that log k ∼ log L and that
\[ m_T \ge \exp\Bigl( \frac{A L}{4 e^{\gamma} \log L} \Bigr). \]
Thus, if $g_k$ grows as suggested by (23) and if (29) holds uniformly for $k \ll L$, then (24) is closer to the truth than (25). Similarly, (26) and (27) would reflect the true order of S(T) and |ζ(1 + it)|, respectively. Analogously, this suggests that (28) reflects the correct maximal size of |L(1, χ)|. This is at odds with the usual view in these questions, which is that the true order is most likely to be near the lower bounds. See the forthcoming paper of A. Granville and Soundararajan [GS], where one may find similar computations and an asymptotic evaluation of $\sum_n d_k(n)^2/n^2$ which resembles the result of the above proposition.
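The two numerical constants quoted for the case A = 0 are simple to reproduce. The snippet below is a small arithmetic check, not part of the argument; the helper name is invented here and the value of Euler's constant γ is hard-coded because the Python standard library does not provide it.

```python
import math

# Euler's constant gamma (hard-coded to double precision).
EULER_GAMMA = 0.5772156649015329


def exponent_constant(B):
    """Return (B / e^(1 + gamma))^(1/2), the constant in front of sqrt(L / log L)."""
    return math.sqrt(B / math.exp(1.0 + EULER_GAMMA))


if __name__ == "__main__":
    for B in (2, 1):
        print(B, exponent_constant(B))
```

Both values exceed the constant 0.5305… only in the case B = 2, which is the comparison the text is implicitly making with the unconditional lower bound (25).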
1. The conjectural formula for $D_k(x, h)$

In this section we sketch the derivation of the form and properties of $m_k(x, h)$, the conjectural main term for
\[ D_k(x, h) = \sum_{n \le x} d_k(n)\, d_k(n + h). \]
We assume that (a, q) = 1 and that the main term for the sum $\sum_{n \le x} d_k(n) e(an/q)$ can be written in the form $(1/q) \int_0^x P_k(y, q)\, dy$ independently of a. Then applying the δ-method of Duke, Friedlander, and Iwaniec [DFI] in exactly the same way they do in the case k = 2, but ignoring all error terms, immediately leads to
\[ m_k(x, h) = \sum_{q=1}^{\infty} \frac{c_q(h)}{q^2}\, P_k(x, q)^2, \tag{30} \]
where $c_q(h) = \sum_{d \mid q,\, d \mid h} d\, \mu(q/d)$ is Ramanujan’s sum. Substituting this expression for $c_q(h)$ into the right-hand side, changing the order of summation, which is justified by absolute convergence once we have established the bound for $P_k(x, q)$ in (38), and relabeling variables, we find that
\[ m_k(x, h) = \sum_{d \mid h} \frac{f_k(x, d)}{d}, \]
where
\[ f_k(x, d) = \sum_{q=1}^{\infty} \frac{\mu(q)}{q^2}\, P_k(x, qd)^2. \]
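The divisor-sum expression for Ramanujan's sum used above can be compared against its definition as an exponential sum over reduced residues. The following sketch does this for small moduli; the function names are illustrative and the trial-division Möbius function is only meant for small arguments.

```python
import cmath
from math import gcd


def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def ramanujan_sum_divisors(q, h):
    """c_q(h) as the sum of d * mu(q/d) over common divisors d of q and h."""
    g = gcd(q, h)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)


def ramanujan_sum_exponential(q, h):
    """c_q(h) as the sum of e(b*h/q) over 1 <= b <= q with gcd(b, q) = 1."""
    return sum(
        cmath.exp(2j * cmath.pi * b * h / q)
        for b in range(1, q + 1)
        if gcd(b, q) == 1
    )


if __name__ == "__main__":
    for q in range(1, 25):
        for h in range(1, 25):
            assert abs(ramanujan_sum_exponential(q, h) - ramanujan_sum_divisors(q, h)) < 1e-9
    print("divisor formula matches the exponential sum for q, h < 25")
```

Taking h = 1 in the divisor formula leaves only d = 1, so c_q(1) = μ(q); this special case is the one used later for the Gauss sum of the principal character.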
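Next we need to determine an explicit expression for $P_k(x, q)$. To do this we consider the generating function
\[ D_k\Bigl(s, \frac{a}{q}\Bigr) = \sum_{n=1}^{\infty} d_k(n)\, e\Bigl(\frac{an}{q}\Bigr) n^{-s} \tag{31} \]
with (a, q) = 1 and σ > 1. Now for any integer m we have
\[ e\Bigl(\frac{m}{q}\Bigr) = \sum_{d \mid m,\, d \mid q} \varphi\Bigl(\frac{q}{d}\Bigr)^{-1} \sum_{\chi \ (\mathrm{mod}\ q/d)} \tau(\bar\chi)\, \chi\Bigl(\frac{m}{d}\Bigr), \]
where the inner sum is over all characters χ to the modulus q/d and where $\tau(\chi) = \sum_{b \ (\mathrm{mod}\ q)} \chi(b) e(b/q)$ is Gauss’s sum. Using this to replace the exponential in (31) and rearranging the resulting sums, we obtain
\[ D_k\Bigl(s, \frac{a}{q}\Bigr) = q^{-s} \sum_{d \mid q} \varphi(d)^{-1} d^{s} \sum_{\chi \ (\mathrm{mod}\ d)} \chi(a)\, \tau(\bar\chi) \sum_{m=1}^{\infty} d_k\Bigl(\frac{qm}{d}\Bigr) \chi(m)\, m^{-s}. \]
Now if $r = \prod_p p^{\alpha}$, we see that
\[ \sum_{m=1}^{\infty} d_k(rm)\, \chi(m)\, m^{-s} = \prod_{p \mid r} \frac{\sum_{j=0}^{\infty} d_k(p^{j+\alpha})\, \chi(p^j)\, p^{-js}}{\sum_{j=0}^{\infty} d_k(p^{j})\, \chi(p^j)\, p^{-js}} \times \prod_{p} \sum_{j=0}^{\infty} d_k(p^{j})\, \chi(p^j)\, p^{-js} = g_k(s, r, \chi)\, L^k(s, \chi), \tag{32} \]
say. Thus, for σ > 1 we have
\[ D_k\Bigl(s, \frac{a}{q}\Bigr) = q^{-s} \sum_{d \mid q} \varphi(d)^{-1} d^{s} \sum_{\chi \ (\mathrm{mod}\ d)} \chi(a)\, \tau(\bar\chi)\, g_k\Bigl(s, \frac{q}{d}, \chi\Bigr) L^k(s, \chi). \]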
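This provides a meromorphic continuation of $D_k(s, a/q)$ to the whole complex plane and shows that its only possible pole in σ > 0 occurs at s = 1 and is due to the principal character $\chi_d^{(0)}$ (mod d) for each d dividing q. Thus, the singular part of $D_k(s, a/q)$ is the same as that of
\[ q^{-s} \sum_{d \mid q} \varphi(d)^{-1} d^{s}\, \chi_d^{(0)}(a)\, \tau\bigl(\chi_d^{(0)}\bigr) \sum_{m=1}^{\infty} d_k\Bigl(\frac{qm}{d}\Bigr) \chi_d^{(0)}(m)\, m^{-s}. \]
Now
\[ \tau\bigl(\chi_d^{(0)}\bigr) = \sum_{\substack{b \ (\mathrm{mod}\ d) \\ (b, d) = 1}} e\Bigl(\frac{b}{d}\Bigr) = c_d(1) = \mu(d), \]
and
\[ \chi_d^{(0)}(m) = \sum_{e \mid m,\, e \mid d} \mu(e), \]
so we find that this equals
\[ q^{-s} \sum_{d \mid q} \frac{\mu(d)}{\varphi(d)}\, d^{s} \sum_{e \mid d} \mu(e)\, e^{-s} \sum_{n=1}^{\infty} d_k\Bigl(\frac{qen}{d}\Bigr) n^{-s}. \tag{33} \]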
We can use (32) to express the sum over n here in terms of the zeta-function. Taking χ equal to $\chi_1^{(0)}$, the principal character (mod 1), and writing $g_k(s, r) = g_k(s, r, \chi_1^{(0)})$, we deduce from (32) that
\[ \sum_{n=1}^{\infty} d_k(rn)\, n^{-s} = g_k(s, r)\, \zeta^k(s), \]
where
\[ g_k(s, r) = \prod_{p \mid r} \bigl(1 - p^{-s}\bigr)^{k} \sum_{j=0}^{\infty} d_k\bigl(p^{j+\alpha}\bigr) p^{-js}. \]
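The factorization of the Dirichlet series of $d_k(rn)$ can be tested numerically in the smallest nontrivial case. The sketch below takes k = 2, r = 2, s = 2, for which the local factor works out to 2 − 2^{−s} = 1.75, and compares a truncated sum of d(2n)/n² with 1.75·ζ(2)². Everything here is illustrative: the function names are hypothetical, the series for the local factor is truncated after a fixed number of terms, and the left-hand side is truncated at N, so agreement is only expected up to a tail of size roughly (log N)/N.

```python
import math


def d_k_prime_power(k, j):
    """d_k(p^j) = C(k + j - 1, j), independent of the prime p."""
    return math.comb(k + j - 1, j)


def g_k_local(k, s, p, alpha, terms=200):
    """(1 - p^-s)^k * sum_{j >= 0} d_k(p^(j + alpha)) p^(-j s), truncated."""
    x = p ** (-s)
    series = sum(d_k_prime_power(k, j + alpha) * x ** j for j in range(terms))
    return (1 - x) ** k * series


def divisor_counts(limit):
    """Sieve returning a list with counts[m] = d(m) for 0 <= m <= limit."""
    counts = [0] * (limit + 1)
    for i in range(1, limit + 1):
        for m in range(i, limit + 1, i):
            counts[m] += 1
    return counts


def compare(N=100_000):
    """Return (truncated sum of d(2n)/n^2, g_2(2, 2) * zeta(2)^2)."""
    d = divisor_counts(2 * N)
    lhs = sum(d[2 * n] / (n * n) for n in range(1, N + 1))
    rhs = g_k_local(2, 2, 2, 1) * (math.pi ** 2 / 6) ** 2
    return lhs, rhs


if __name__ == "__main__":
    print(compare())
```

The truncated sum always falls slightly below the right-hand side because all omitted terms are positive.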
Inserting this into (33), we find that the singular part of $D_k(s, a/q)$ is identical to that of $q^{-s} \zeta^k(s)\, G_k(s, q)$, where
\[ G_k(s, q) = \sum_{d \mid q} \frac{\mu(d)}{\varphi(d)}\, d^{s} \sum_{e \mid d} \frac{\mu(e)}{e^{s}}\, g_k\Bigl(s, \frac{qe}{d}\Bigr), \]
and we note that this is independent of a. From this and Perron’s formula, we see that $(1/q) \int_0^x P_k(y, q)\, dy$, the main term for $\sum_{n \le x} d_k(n) e(an/q)$, should be given by
\[ \frac{1}{2\pi i} \int_{|s-1|=1/8} \zeta(s)^k\, G_k(s, q)\, \frac{(x/q)^{s}}{s}\, ds. \]
Thus, differentiating with respect to x, we find that
\[ P_k(x, q) = \frac{1}{2\pi i} \int_{|s-1|=1/8} \zeta(s)^k\, G_k(s, q) \Bigl(\frac{x}{q}\Bigr)^{s-1} ds. \tag{34} \]
On changing s to s + 1, we obtain (15). It only remains to prove (18). However, before doing this we derive a formula that we feel is interesting in its own right, even though we do not require it here. Define
\[ D_k(s, h) = \sum_{n=1}^{\infty} d_k(n)\, d_k(n + h)\, n^{-s}. \]
Then
so we see that
1 mk (x, h) = 2πi
|s−1|=1/8
Dk (s, h)
xs ds, s
mk (x, h) = ress=1 Dk (s, h)x s−1 .
On the other hand, since Dk (s, 1/q) and q −s ζ k (s)Gk (s, q) have the same singular part at s = 1, from (34) we have that
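\[ \frac{1}{q}\, P_k(x, q) = \operatorname{res}_{s=1} D_k\Bigl(s, \frac{1}{q}\Bigr) x^{s-1}. \]
It therefore follows from (30) that
\[ \operatorname{res}_{s=1} D_k(s, h)\, x^{s-1} = \sum_{q=1}^{\infty} c_q(h) \Bigl( \operatorname{res}_{s=1} D_k\Bigl(s, \frac{1}{q}\Bigr) x^{s-1} \Bigr)^{2}. \]
This is the formula referred to above.

We now proceed to the proof of (18). From (16) and (17), we see that $g_k(s, 1) = G_k(s, 1) = 1$ and that, for α ≥ 1,
\[ G_k(s, p^{\alpha}) = \Bigl(1 - \frac{1}{p}\Bigr)^{-1} \bigl( g_k(s, p^{\alpha}) - p^{s-1} g_k(s, p^{\alpha - 1}) \bigr). \]
Using this with (17) and the easily proven identity
\[ d_k(p^{\beta}) = d_{k-1}(p^{\beta}) + d_k(p^{\beta - 1}), \tag{35} \]
we obtain
\[ G_k(s, p^{\alpha}) = \Bigl(1 - \frac{1}{p}\Bigr)^{-1} \Bigl(1 - \frac{1}{p^{s}}\Bigr)^{k} \sum_{j=0}^{\infty} \Bigl( d_{k-1}(p^{\alpha + j}) + \bigl(1 - p^{s-1}\bigr) d_k(p^{\alpha + j - 1}) \Bigr) p^{-js}. \tag{36} \]
It is also not difficult to show, for example by induction on β, that
\[ d_l(p^{\alpha + \beta}) \le d_l(p^{\alpha})\, d_l(p^{\beta}). \tag{37} \]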
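Both (35) and (37) are statements about binomial coefficients, since $d_k(p^j) = \binom{k+j-1}{j}$ counts the ways of writing $p^j$ as an ordered product of k factors. A brute-force check over a small range is sketched below; the function names and the ranges are arbitrary choices for illustration, and (35) is only tested for k ≥ 2 so that $d_{k-1}$ is a genuine divisor function.

```python
from math import comb


def d_k(k, j):
    """d_k(p^j): number of ordered factorizations of p^j into k factors."""
    return comb(k + j - 1, j)


def check_identities(max_k=8, max_exp=12):
    """Verify (35) for 2 <= k <= max_k and (37) for 1 <= l <= max_k."""
    for k in range(2, max_k + 1):
        for beta in range(1, max_exp + 1):
            # (35): Pascal's rule in disguise.
            assert d_k(k, beta) == d_k(k - 1, beta) + d_k(k, beta - 1)
    for l in range(1, max_k + 1):
        for alpha in range(max_exp + 1):
            for beta in range(max_exp + 1):
                # (37): submultiplicativity on powers of a single prime.
                assert d_k(l, alpha + beta) <= d_k(l, alpha) * d_k(l, beta)
    return True


if __name__ == "__main__":
    print(check_identities())
```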
Therefore, we have that
$$|G_k(s,p^{\alpha})| \le \Big(1-\frac{1}{p}\Big)^{-1}\Big|1-\frac{1}{p^{s}}\Big|^{k}\Big(d_{k-1}(p^{\alpha})\big(1-p^{-\sigma}\big)^{-k+1} + d_k(p^{\alpha-1})\big(1-p^{-\sigma}\big)^{-k}\big|1-p^{s-1}\big|\Big).$$
Now we assume that $p^{\alpha}\,\|\,q$, and we restrict $s$ to the circle $|s-1| = A_1/(\log qx)$. Then there exist positive constants $A_2$ and $A_3$ such that
$$\Big(1-\frac{1}{p}\Big)e^{-(A_2\log p)/(p\log qx)} \le \Big|1-\frac{1}{p^{s}}\Big| \le \Big(1-\frac{1}{p}\Big)e^{(A_2\log p)/(p\log qx)}$$
and
$$\big|1-p^{s-1}\big| \le A_3\,\frac{\log p}{\log qx}.$$
Hence, for some positive constant $A_4$ we have
$$|G_k(s,p^{\alpha})| \le e^{(A_4 k\log p)/(p\log qx)}\Big(d_{k-1}(p^{\alpha}) + d_k(p^{\alpha-1})\Big(1-\frac{1}{p}\Big)^{-1}\frac{A_3\log p}{\log qx}\Big).$$
We now use the simple formula
$$d_k(p^{\alpha-1}) = \frac{\alpha}{k-1}\,d_{k-1}(p^{\alpha}),$$
and we find that
$$|G_k(s,p^{\alpha})| \le d_{k-1}(p^{\alpha})\,e^{(A_4 k\log p)/(p\log qx)}\Big(1 + A_5\,\frac{\alpha\log p}{k\log qx}\Big) \le d_{k-1}(p^{\alpha})\,e^{(B_k\log p^{\alpha})/(\log qx)},$$
where $B_k$ depends only on $k$. It follows that
$$|G_k(s,q)| \le d_{k-1}(q)\,e^{(B_k\log q)/(\log qx)} \ll_k d_{k-1}(q).$$
Since $\zeta(s) \ll |s-1|^{-1}$ near $s=1$, if we use this in (34) and shrink the path of integration to the circle $|s-1| = A_1/(\log qx)$, we deduce that
$$P_k(x,q) \ll d_{k-1}(q)\log^{k-1} qx. \tag{38}$$
Finally, we use this bound in (14), separate the resulting divisor functions by means of (37), and recall that $d \le x$ to obtain
$$f_k(x,d) \ll d_{k-1}^{2}(d)\log^{2k-2} x,$$
which is (18).

2. The generating function $F_k(x,z)$

In applying Conjecture 3 in Section 3, it turns out that what we actually require is the behavior of the generating function
$$F_k(x,z) = \sum_{d=1}^{\infty}\frac{f_k(dx,d)}{d^{z+1}}$$
near $z=0$. By (14) and (15), we see that
$$F_k(x,z) = \frac{1}{(2\pi i)^2}\int_{|s|=1/8}\int_{|w|=1/8}\zeta^{k}(s+1)\,\zeta^{k}(w+1)\,H_k(z,s,w)\,x^{s+w}\,ds\,dw, \tag{39}$$
where
$$H_k(z,s,w) = \sum_{d=1}^{\infty}\frac{1}{d^{1+z}}\sum_{q=1}^{\infty}\frac{\mu(q)\,G_k(s+1,dq)\,G_k(w+1,dq)}{q^{2+s+w}}.$$
If we define
$$h_k(p^{\alpha}) = h_k(s,w,p^{\alpha}) = G_k(s+1,p^{\alpha})\,G_k(w+1,p^{\alpha}) - G_k(s+1,p^{\alpha+1})\,G_k(w+1,p^{\alpha+1})\,p^{-2-s-w}, \tag{40}$$
then from the multiplicativity of $G_k(s+1,q)$ we see that the sum over $q$ in the definition of $H_k(z,s,w)$ equals
$$\prod_{p^{\alpha}\|d}\frac{h_k(p^{\alpha})}{h_k(p^{0})}\;\prod_{p} h_k(p^{0}).$$
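The first product defines a multiplicative function of $d$, so we find that
$$H_k(z,s,w) = \prod_{p} h_k(p^{0})\sum_{d=1}^{\infty}\frac{1}{d^{1+z}}\prod_{p^{\alpha}\|d}\frac{h_k(p^{\alpha})}{h_k(p^{0})} = \prod_{p} h_k(p^{0})\prod_{p}\sum_{\alpha=0}^{\infty}\frac{h_k(p^{\alpha})/h_k(p^{0})}{p^{\alpha(1+z)}} = \prod_{p}\sum_{\alpha=0}^{\infty}\frac{h_k(p^{\alpha})}{p^{\alpha(1+z)}}. \tag{41}$$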
Now $G_k(s+1,1) = 1$, and, for $\alpha \ge 1$,
$$G_k(s+1,p^{\alpha}) = d_{k-1}(p^{\alpha}) + d_k(p^{\alpha-1})\big(1-p^{s}\big) + O_k\big(p^{-1-\sigma}\big) + O_k\big(p^{-1}\big)$$
by (36). Thus, writing $\Re s = \sigma$, $\Re w = u$, and $\Re z = x$, we see that
$$h_k(1) = 1 + O_k\big(p^{-2+|\sigma|+|u|}\big),$$
$$h_k(p^{\alpha}) = \Big(d_{k-1}(p^{\alpha}) + d_k(p^{\alpha-1})\big(1-p^{s}\big)\Big)\Big(d_{k-1}(p^{\alpha}) + d_k(p^{\alpha-1})\big(1-p^{w}\big)\Big) + O_k\big(p^{-1+|\sigma|+|u|}\big),$$
and
$$\sum_{\alpha=0}^{\infty}\frac{h_k(p^{\alpha})}{p^{\alpha(1+z)}} = 1 + \frac{(k-p^{s})(k-p^{w})}{p^{1+z}} + O_k\big(p^{-2+|\sigma|+|u|+|x|}\big),$$
where the function bounded by the $O$-term in the last line is analytic in the region $|\sigma|+|u|+|x| < 2$. When we use this with (41), we obtain
$$H_k(z,s,w) = \prod_{p}\Big(1 + \frac{k^{2}}{p^{1+z}} - \frac{k}{p^{1+z-s}} - \frac{k}{p^{1+z-w}} + \frac{1}{p^{1+z-s-w}} + \cdots\Big) = \zeta^{k^{2}}(1+z)\,\zeta^{-k}(1+z-s)\,\zeta^{-k}(1+z-w)\,\zeta(1+z-s-w)\,H_k^{*}(z,s,w),$$
where
$$H_k^{*}(z,s,w) = \prod_{p}\Big(1-\frac{1}{p^{1+z}}\Big)^{k^{2}}\Big(1-\frac{1}{p^{1+z-s}}\Big)^{-k}\Big(1-\frac{1}{p^{1+z-w}}\Big)^{-k}\Big(1-\frac{1}{p^{1+z-s-w}}\Big)\sum_{\alpha=0}^{\infty}\frac{h_k(p^{\alpha})}{p^{\alpha(1+z)}}$$
is analytic for $|\sigma|+|u|+|x| < 1$. Combining this with (39), we find that
$$F_k(x,z) = \frac{1}{(2\pi i)^2}\int_{|s|=1/8}\int_{|w|=1/8}\frac{\zeta(s+1)^{k}\,\zeta(w+1)^{k}\,\zeta(z+1)^{k^{2}}\,\zeta(z+1-s-w)\,x^{s+w}}{\zeta(z+1-s)^{k}\,\zeta(z+1-w)^{k}}\,H_k^{*}(z,s,w)\,ds\,dw. \tag{42}$$
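Finally, we evaluate $H_k^{*}(0,0,0)$, as this is also required in Section 3. By (40) and the definition of $H_k^{*}(z,s,w)$, we have
$$H_k^{*}(0,0,0) = \prod_{p}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\sum_{\alpha=0}^{\infty}\frac{G_k^{2}(1,p^{\alpha}) - G_k^{2}(1,p^{\alpha+1})\,p^{-2}}{p^{\alpha}}. \tag{43}$$
By (36), $G_k(1,p^{\alpha}) = (1-1/p)^{k-1}\sum_{j=0}^{\infty} d_{k-1}(p^{\alpha+j})\,p^{-j}$. Hence, factoring the numerator in the sum over $\alpha$ as a difference of two squares, we obtain
$$G_k^{2}(1,p^{\alpha}) - G_k^{2}(1,p^{\alpha+1})\,p^{-2} = \Big(1-\frac{1}{p}\Big)^{2k-2} d_{k-1}(p^{\alpha})\Big(d_{k-1}(p^{\alpha}) + 2\sum_{j=0}^{\infty}\frac{d_{k-1}(p^{\alpha+j+1})}{p^{j+1}}\Big).$$
We therefore see that the typical factor in (43) equals
$$\Big(1-\frac{1}{p}\Big)^{k^{2}-1}\Big(\sum_{\alpha=0}^{\infty}\frac{d_{k-1}^{2}(p^{\alpha})}{p^{\alpha}} + 2\sum_{\alpha=1}^{\infty}\frac{d_{k-1}(p^{\alpha})}{p^{\alpha}}\sum_{l=0}^{\alpha-1} d_{k-1}(p^{l})\Big).$$
Since $\sum_{q|n} d_{k-1}(q) = d_k(n)$, the sum over $l$ equals $d_k(p^{\alpha-1})$, so this is
$$\Big(1-\frac{1}{p}\Big)^{k^{2}-1}\Big(1 + \sum_{\alpha=1}^{\infty}\frac{\big(d_{k-1}(p^{\alpha}) + d_k(p^{\alpha-1})\big)^{2} - d_k^{2}(p^{\alpha-1})}{p^{\alpha}}\Big).$$
We now use (35) to see that this is
$$\Big(1-\frac{1}{p}\Big)^{k^{2}-1}\Big(\sum_{\alpha=0}^{\infty}\frac{d_k^{2}(p^{\alpha})}{p^{\alpha}} - \sum_{\alpha=1}^{\infty}\frac{d_k^{2}(p^{\alpha-1})}{p^{\alpha}}\Big) = \Big(1-\frac{1}{p}\Big)^{k^{2}}\sum_{\alpha=0}^{\infty}\frac{d_k^{2}(p^{\alpha})}{p^{\alpha}}.$$
Inserting this into (43) and comparing with (10), we deduce that $H_k^{*}(0,0,0) = a_k$.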
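The Euler-factor computation above can be confirmed numerically. The sketch below evaluates the local factor of (43) directly from $G_k(1,p^{\alpha}) = (1-1/p)^{k-1}\sum_{j} d_{k-1}(p^{\alpha+j})p^{-j}$ and compares it with $(1-1/p)^{k^2}\sum_{\alpha} d_k^2(p^{\alpha})p^{-\alpha}$. It assumes $d_k(p^r)=\binom{k+r-1}{r}$ and truncates every infinite sum at a fixed number of terms; the function names and the truncation length are illustrative choices of ours.

```python
from math import comb


def d(k: int, r: int) -> int:
    """d_k(p^r) = C(k + r - 1, r), with d_0 the indicator of r = 0."""
    if r < 0:
        return 0
    if k == 0:
        return 1 if r == 0 else 0
    return comb(k + r - 1, r)


def G_at_one(k: int, p: int, alpha: int, terms: int = 200) -> float:
    """Truncation of G_k(1, p^alpha) = (1 - 1/p)^(k-1) sum_j d_{k-1}(p^(alpha+j)) / p^j."""
    series = sum(d(k - 1, alpha + j) / p**j for j in range(terms))
    return (1 - 1 / p) ** (k - 1) * series


def local_factor_from_43(k: int, p: int, terms: int = 200) -> float:
    """Local factor of (43): (1 - 1/p)^((k-1)^2) * sum_alpha [G^2 - G_next^2 / p^2] / p^alpha."""
    total = 0.0
    for alpha in range(terms):
        g0 = G_at_one(k, p, alpha, terms)
        g1 = G_at_one(k, p, alpha + 1, terms)
        total += (g0 * g0 - g1 * g1 / p**2) / p**alpha
    return (1 - 1 / p) ** ((k - 1) ** 2) * total


def local_factor_of_a_k(k: int, p: int, terms: int = 200) -> float:
    """(1 - 1/p)^(k^2) * sum_alpha d_k(p^alpha)^2 / p^alpha, truncated."""
    return (1 - 1 / p) ** (k * k) * sum(d(k, r) ** 2 / p**r for r in range(terms))


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        for p in (2, 3, 5):
            print(k, p, local_factor_from_43(k, p), local_factor_of_a_k(k, p))
```

For $k=2$ and $p=2$ both expressions equal $3/4$, which can also be checked by hand since $G_2(1,2^{\alpha}) = 1$ for every $\alpha$.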
3. Conjecture 4

A precise version of the mean value formula we require to derive the formula in Conjecture 4 is given in [GG]. However, we adopt a simpler heuristic approach that leads to the same formula more quickly. Recall that
$$D_{k,N}(s) = \sum_{n=1}^{N}\frac{d_k(n)}{n^{s}}$$
and that we are to estimate
$$I(T) = I_{k,N}(T) = \int_{T}^{2T}\Big|D_{k,N}\Big(\frac12+it\Big)\Big|^{2}\,dt.$$
The reason we integrate from $T$ to $2T$ rather than from zero to $T$ is that for $t$ near zero the integrand is very large, on the order of $N^{1/2}$, so the mean square for small $t$ is about $N$. Since $N$ can be as large as $T^{2}$, this would dominate, and so obscure, the behavior of the mean value away from the real axis. We let
$$J(T) = J_{k,N}(T) = \frac12\int_{-T}^{T}\Big|D_{k,N}\Big(\frac12+it\Big)\Big|^{2}\,dt, \tag{44}$$
and then we obtain our estimate for $I(T)$ via the formula
$$I(T) = J(2T) - J(T). \tag{45}$$
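For a short Dirichlet polynomial the quantity $J(T)$ in (44) can be computed exactly, since squaring out and integrating term by term gives a diagonal part $T\sum_{n\le N} d_k(n)^2/n$ plus an off-diagonal sum of terms $d_k(m)d_k(n)(mn)^{-1/2}\sin(T\log(m/n))/\log(m/n)$. The following sketch compares that closed form with a direct Simpson quadrature of (44). The parameter values, step count, and function names are illustrative assumptions of ours, not taken from the paper.

```python
import math


def divisor_function(k: int, n: int) -> int:
    """d_k(n): number of ways to write n as an ordered product of k positive integers."""
    if k == 1:
        return 1
    return sum(divisor_function(k - 1, n // a) for a in range(1, n + 1) if n % a == 0)


def mean_square_by_quadrature(k: int, N: int, T: float, steps: int = 20000) -> float:
    """(1/2) * integral over [-T, T] of |D_{k,N}(1/2 + it)|^2 dt by composite Simpson.

    `steps` must be even.
    """
    coeffs = [divisor_function(k, n) / math.sqrt(n) for n in range(1, N + 1)]
    logs = [math.log(n) for n in range(1, N + 1)]

    def integrand(t: float) -> float:
        re = sum(c * math.cos(t * lg) for c, lg in zip(coeffs, logs))
        im = sum(c * math.sin(t * lg) for c, lg in zip(coeffs, logs))
        return re * re + im * im

    h = 2 * T / steps
    total = integrand(-T) + integrand(T)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(-T + j * h)
    return 0.5 * total * h / 3


def mean_square_closed_form(k: int, N: int, T: float) -> float:
    """Diagonal term plus off-diagonal sine terms obtained by termwise integration."""
    d = [divisor_function(k, n) for n in range(1, N + 1)]
    diagonal = T * sum(d[n - 1] ** 2 / n for n in range(1, N + 1))
    off_diagonal = 0.0
    for m in range(1, N + 1):
        for n in range(1, N + 1):
            if m != n:
                lg = math.log(m / n)
                off_diagonal += d[m - 1] * d[n - 1] / math.sqrt(m * n) * math.sin(T * lg) / lg
    return diagonal + off_diagonal


if __name__ == "__main__":
    print(mean_square_by_quadrature(2, 6, 10.0), mean_square_closed_form(2, 6, 10.0))
```

The agreement is exact up to quadrature error; the heuristic content of the section lies entirely in how the off-diagonal sum is subsequently approximated for large $N$.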
Squaring out and integrating term-by-term in (44), we find that
$$J(T) = T\sum_{n=1}^{N}\frac{d_k(n)^{2}}{n} + \sum_{m\ne n}\frac{d_k(m)\,d_k(n)}{(mn)^{1/2}}\,\frac{\sin\big(T\log(m/n)\big)}{\log(m/n)} = J_d(T) + J_o(T), \tag{46}$$
say. Using the symmetry in $m$ and $n$ and writing $m = n+h$, we find that
$$J_o = 2\sum_{n=1}^{N}\sum_{h=1}^{N-n}\frac{d_k(n)\,d_k(n+h)}{n\sqrt{1+h/n}}\,\frac{\sin\big(T\log(1+h/n)\big)}{\log(1+h/n)}.$$
Now we make several approximations that are justified in [GG] for the weighted version of this formula. Namely, we replace $\log(1+h/n)$ by $h/n$, $\sqrt{1+h/n}$ by $1$, and $d_k(n)\,d_k(n+h)$ by $m_k'(n,h)$ from Conjecture 3. Finally, the sum over $n$ can be replaced by an integral, and the sum over $h$ can be extended to $\infty$. This leads to
$$J_o \sim 2\int_{0}^{N}\sum_{h=1}^{\infty}\frac{m_k'(x,h)}{h}\,\sin\frac{Th}{x}\,dx.$$
By (13), the right-hand side equals
$$2\int_{0}^{N}\sum_{h=1}^{\infty}\frac{1}{h}\sum_{d|h}\frac{f_k(x,d)}{d}\,\sin\frac{Th}{x}\,dx = 2\int_{0}^{N}\sum_{d=1}^{\infty}\frac{f_k(x,d)}{d^{2}}\sum_{h=1}^{\infty}\frac{\sin(Thd/x)}{h}\,dx = -2\pi\int_{0}^{N}\sum_{d=1}^{\infty}\frac{f_k(x,d)}{d^{2}}\Big(\Big\{\frac{Td}{2\pi x}\Big\} - \frac12\Big)\,dx,$$
where
$$\{x\} = \begin{cases} x - [x] & \text{if } x \text{ is not an integer},\\ \tfrac12 & \text{if } x \text{ is an integer}.\end{cases}$$
The "$-1/2$" term leads to a large contribution if $N$ is large, but it is independent of $T$ and so disappears when we take the difference $J(2T) - J(T) = I(T)$. Thus, we may express $J_o$ as
$$J_o(T) = J_{o,1}(T) + J_{o,2}, \tag{47}$$
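The passage from the sum over $h$ to the bracket $\{\cdot\}$ rests on the classical sawtooth expansion $\sum_{h\ge 1}\sin(2\pi h y)/h = -\pi(\{y\}-\tfrac12)$, applied with $y = Td/(2\pi x)$ so that $\sin(Thd/x) = \sin(2\pi h y)$. A minimal numerical sketch of this expansion follows; the function names and the number of terms are our own illustrative choices.

```python
import math


def bracket(y: float) -> float:
    """{y} as defined in the text: y - [y] off the integers, and 1/2 at integers."""
    floor_y = math.floor(y)
    if y == floor_y:
        return 0.5
    return y - floor_y


def partial_sine_sum(y: float, terms: int) -> float:
    """Partial sum of sum_{h >= 1} sin(2 pi h y) / h."""
    return sum(math.sin(2 * math.pi * h * y) / h for h in range(1, terms + 1))


def sawtooth_closed_form(y: float) -> float:
    """Limit of the sine series: -pi * ({y} - 1/2)."""
    return -math.pi * (bracket(y) - 0.5)


if __name__ == "__main__":
    for y in (0.3, 2.75, 1.0):
        print(y, partial_sine_sum(y, 200000), sawtooth_closed_form(y))
```

The convergence is only conditional and slows down as $y$ approaches an integer, which is one reason the rigorous treatment in [GG] works with a weighted version of the formula.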
where $J_{o,1}(T)$ is the part with "$\{\ \}$" and where $J_{o,2}$ is the part with "$-1/2$". We now make the change of variable $y = Td/(2\pi x)$ in $J_{o,1}$, and we find that
$$J_{o,1}(T) = -2\pi\int_{0}^{N}\sum_{d=1}^{\infty}\frac{f_k(x,d)}{d^{2}}\Big\{\frac{Td}{2\pi x}\Big\}\,dx = -T\sum_{d=1}^{\infty}\frac{1}{d}\int_{(Td)/(2\pi N)}^{\infty} f_k\Big(\frac{Td}{2\pi y},d\Big)\{y\}\,\frac{dy}{y^{2}}.$$
We split the sum over $d$ into $d \le 2\pi N/T$ and $d > 2\pi N/T$. From our estimate for $f_k$ in (18), we find that the contribution to $J_{o,1}$ from the upper range of $d$ is
$$\ll T\sum_{d\gg N/T}\frac{\tau_{k-1}(d)^{2}}{d}\,\frac{N}{Td}\,L^{2k-2} \ll T L^{(k-1)^{2}-1+2k-2} = T L^{k^{2}-2}.$$
In the lower range of $d$, we split the integral over $y$ into two ranges: $Td/2\pi N \le y < 1$ and $y \ge 1$. The contribution from the upper range of $y$ is
$$\ll T\sum_{d\ll N/T}\frac{\tau_{k-1}(d)^{2}}{d}\,L^{2k-2} \ll T L^{k^{2}-1}.$$
Hence, since $\{y\} = y$ for $0 < y < 1$, we see that
$$J_{o,1}(T) = -T\sum_{d\le N/T}\frac{1}{d}\int_{(Td)/(2\pi N)}^{1} f_k\Big(\frac{Td}{2\pi y},d\Big)\frac{dy}{y} + O\big(T L^{k^{2}-1}\big) = -T\int_{T/(2\pi N)}^{1}\sum_{d\le yN/T}\frac{f_k(Td/2\pi y,d)}{d}\,\frac{dy}{y} + O\big(T L^{k^{2}-1}\big).$$
We now use the expression in (42) for the generating function $F_k(x,z)$ of $f_k(dx,d)$. In doing this we retain only the first term in the Laurent expansion at zero of the various factors in the integrand, and we find that
$$J_{o,1}(T) \sim -a_k T\int_{T/N}^{1}\frac{1}{(2\pi i)^{3}}\int_{|z|=1/2}\int_{|s|=1/8}\int_{|w|=1/8}\frac{(T/y)^{s+w-z}\,N^{z}\,(z-s)^{k}(z-w)^{k}}{z^{k^{2}+1}\,(z-s-w)\,s^{k}w^{k}}\,ds\,dw\,dz\,\frac{dy}{y}.$$
To evaluate this we make the substitutions $s \to sz$ and $w \to wz$, and we carry out the integration in $z$. We then find that
$$J_{o,1}(T) \sim -\frac{a_k T}{\Gamma(k^{2})}\int_{T/N}^{1}\frac{1}{(2\pi i)^{2}}\int_{|s|=1/16}\int_{|w|=1/16}\frac{\big((s+w-1)\log(T/y) + \log N\big)^{k^{2}-1}(1-s)^{k}(1-w)^{k}}{(1-s-w)\,s^{k}w^{k}}\,ds\,dw\,\frac{dy}{y}. \tag{48}$$
Next we write $N = T^{1+\eta}$, where $0 \le \eta \le 1$, and we make the substitution $y = T^{\alpha(1+\eta)-\eta}$. Then $y = 1$ corresponds to $\alpha = 1 - 1/(1+\eta)$, and $y = T/N$ corresponds to $\alpha = 0$. Also,
$$(s+w-1)\log\frac{T}{y} + \log N = \log N\,\big(1 + (s+w-1)(1-\alpha)\big)$$
and
$$\frac{dy}{y} = \log N\,d\alpha.$$
Thus, the asymptotic expression for (48) becomes
$$J_{o,1}(T) \sim \frac{a_k}{\Gamma(k^{2})}\,(1+\eta)^{k^{2}}\,T L^{k^{2}} M_k,$$
where
$$M_k = -\frac{1}{(2\pi i)^{2}}\int_{0}^{1-1/(1+\eta)}\int_{|s|=1/16}\int_{|w|=1/16}\frac{\big(1 + (s+w-1)(1-\alpha)\big)^{k^{2}-1}(1-s)^{k}(1-w)^{k}}{(1-s-w)\,s^{k}w^{k}}\,ds\,dw\,d\alpha.$$
We observe for future reference that $M_k \to 0$ as $k \to \infty$. In fact, since
$$1 + (s+w-1)(1-\alpha) = \alpha + (s+w)(1-\alpha), \tag{49}$$
we find that
$$|M_k| \le \Big(1 - \frac{1}{\eta+1} + \frac18\Big)^{k^{2}-1} 15^{2k} \ll \Big(\frac45\Big)^{k^{2}}$$
as $k \to \infty$. Next, we expand $\big(1 + (s+w-1)(1-\alpha)\big)^{k^{2}-1}$ into powers of $(s+w-1)$, and we find that
$$\big(1 - (1-s-w)(1-\alpha)\big)^{k^{2}-1} = \sum_{n=0}^{k^{2}-1}\binom{k^{2}-1}{n}(-1)^{n}(1-s-w)^{n}(1-\alpha)^{n}.$$
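Thus,
$$M_k = -\int_{0}^{1-1/(1+\eta)}\sum_{n=0}^{k^{2}-1}(-1)^{n}\gamma_k(n)(1-\alpha)^{n}\,d\alpha,$$
where
$$\gamma_k(n) = \frac{1}{(2\pi i)^{2}}\int_{|s|=1/16}\int_{|w|=1/16}\frac{(1-s-w)^{n-1}(1-s)^{k}(1-w)^{k}}{s^{k}w^{k}}\,ds\,dw.$$
Now
$$\gamma_k(0) = \frac{1}{(2\pi i)^{2}}\int_{|s|=1/16}\int_{|w|=1/16}\frac{(1-s)^{k}(1-w)^{k}}{(1-s-w)\,s^{k}w^{k}}\,ds\,dw$$
$$= \sum_{i=0}^{k}\sum_{j=0}^{k}\binom{k}{i}\binom{k}{j}(-1)^{i+j}\frac{1}{(2\pi i)^{2}}\int_{|s|=1/16}\int_{|w|=1/16}\frac{ds\,dw}{(1-s-w)\,s^{i}w^{j}}$$
$$= \sum_{i=0}^{k}\sum_{j=0}^{k}\binom{k}{i}\binom{k}{j}(-1)^{i+j}\sum_{m=0}^{\infty}\frac{1}{(2\pi i)^{2}}\int_{|s|=1/16}\int_{|w|=1/16}\frac{(s+w)^{m}}{s^{i}w^{j}}\,ds\,dw$$
$$= \sum_{i=0}^{k}\sum_{j=0}^{k}\binom{k}{i}\binom{k}{j}(-1)^{i+j}\binom{i+j-2}{i-1}.$$
Actually, this can be simplified to $\gamma_k(0) = k$, but this fact is not necessary to proceed. In a similar manner we find that
$$\gamma_k(n) = (-1)^{n}\sum_{i=0}^{k}\sum_{j=0}^{k}\binom{k}{i}\binom{k}{j}\binom{n-1}{i-1,\;j-1,\;n-i-j+1}.$$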
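The simplification $\gamma_k(0) = k$ mentioned above is easy to confirm for small $k$ directly from the double sum. The sketch below does so, assuming the usual convention that $\binom{i+j-2}{i-1}$ vanishes when $i = 0$ or $j = 0$; the helper names are ours.

```python
from math import comb


def binomial(n: int, r: int) -> int:
    """Binomial coefficient that is taken to vanish unless 0 <= r <= n."""
    if n < 0 or r < 0 or r > n:
        return 0
    return comb(n, r)


def gamma_k_zero(k: int) -> int:
    """Double sum sum_{i,j} C(k,i) C(k,j) (-1)^(i+j) C(i+j-2, i-1) for gamma_k(0)."""
    total = 0
    for i in range(k + 1):
        for j in range(k + 1):
            sign = -1 if (i + j) % 2 else 1
            total += sign * comb(k, i) * comb(k, j) * binomial(i + j - 2, i - 1)
    return total


if __name__ == "__main__":
    print([gamma_k_zero(k) for k in range(1, 9)])
```

One way to see why the value is $k$: expanding $1/(1-s-w) = \sum_{m\ge 0}(sw)^{m}\big((1-s)(1-w)\big)^{-m-1}$ shows that $\gamma_k(0)$ is the coefficient of $s^{k-1}w^{k-1}$ in $\sum_{m}(sw)^{m}\big((1-s)(1-w)\big)^{k-m-1}$, and each of the terms $m = 0,\dots,k-1$ contributes exactly $1$.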
Finally, integrating with respect to $\alpha$, we obtain
$$M_k = -\sum_{n=0}^{k^{2}-1}(-1)^{n}\gamma_k(n)\,\frac{1-(\eta+1)^{-(n+1)}}{n+1}.$$
Combining this with (47) and (49), we obtain
$$J_o(2T) - J_o(T) = J_{o,1}(2T) - J_{o,1}(T) \sim -\frac{a_k}{\Gamma(k^{2})}\,(1+\eta)^{k^{2}}\,T L^{k^{2}}\sum_{n=0}^{k^{2}-1}(-1)^{n}\gamma_k(n)\,\frac{1-(\eta+1)^{-(n+1)}}{n+1}. \tag{50}$$
Also, from (9) we have
$$J_d(2T) - J_d(T) \sim \frac{a_k}{\Gamma(k^{2}+1)}\,(1+\eta)^{k^{2}}\,T L^{k^{2}}.$$
It now follows from (45), (46), (50), and this that
$$I_{k,N} = \int_{T}^{2T}\Big|D_{k,N}\Big(\frac12+it\Big)\Big|^{2}\,dt \sim w_k(\eta)\,\frac{a_k}{\Gamma(k^{2}+1)}\,T L^{k^{2}},$$
where
$$w_k(\eta) = (1+\eta)^{k^{2}}\Big(1 - \sum_{n=0}^{k^{2}-1}\binom{k^{2}}{n+1}\gamma_k(n)\big(1-(1+\eta)^{-(n+1)}\big)\Big).$$
This is Conjecture 4.

4. The sixth and eighth power moment conjectures

We first note an alternative expression for $a_k$. Conrey and Ghosh [CG3] have shown that $a_k = a_{1-k}$. Thus, by (10),
$$a_k = \prod_{p}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\sum_{r=0}^{\infty}\frac{d_{1-k}^{2}(p^{r})}{p^{r}}.$$
Since $d_k(p^{r}) = \binom{k+r-1}{r} = (-1)^{r}\binom{-k}{r}$, we see that $d_{1-k}(p^{r}) = (-1)^{r}\binom{k-1}{r}$, so we have that
$$a_k = \prod_{p}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\sum_{r=0}^{k-1}\binom{k-1}{r}^{2}p^{-r}. \tag{51}$$
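The relation $a_k = a_{1-k}$ can be tested prime by prime. The sketch below compares the local factor $(1-1/p)^{k^2}\sum_{r\ge 0} d_k(p^r)^2 p^{-r}$, which we assume is the form of (10) since that equation lies outside this section, with the finite local factor in (51). It also checks $d_{1-k}(p^{r}) = (-1)^{r}\binom{k-1}{r}$ using the product formula for $\binom{k+r-1}{r}$ extended to integers $k \le 0$. All function names and the truncation length are illustrative.

```python
from math import comb, factorial


def d_general(k: int, r: int) -> int:
    """C(k + r - 1, r) = k (k+1) ... (k+r-1) / r!, valid for every integer k."""
    numerator = 1
    for i in range(r):
        numerator *= k + i
    # A product of r consecutive integers is divisible by r!, so this is exact.
    return numerator // factorial(r)


def local_factor_series(k: int, p: int, terms: int = 200) -> float:
    """(1 - 1/p)^(k^2) * sum_{r >= 0} d_k(p^r)^2 / p^r, truncated after `terms` terms."""
    series = sum(comb(k + r - 1, r) ** 2 / p**r for r in range(terms))
    return (1 - 1 / p) ** (k * k) * series


def local_factor_finite(k: int, p: int) -> float:
    """(1 - 1/p)^((k-1)^2) * sum_{r=0}^{k-1} C(k-1, r)^2 / p^r, as in (51)."""
    series = sum(comb(k - 1, r) ** 2 / p**r for r in range(k))
    return (1 - 1 / p) ** ((k - 1) ** 2) * series


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 5):
        for p in (2, 3, 5):
            print(k, p, local_factor_series(k, p), local_factor_finite(k, p))
```

For example, with $k = 3$ and $p = 2$ both factors equal $13/64 = 0.203125$.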
This is the expression for $a_k$ we have used in (3) and Conjecture 1. By (6) we expect that
$$\int_{T}^{2T}\Big|\zeta\Big(\frac12+it\Big)\Big|^{6}\,dt \sim \int_{T}^{2T}\Big(\Big|D_{3,T^{1+\eta}}\Big(\frac12+it\Big)\Big|^{2} + \Big|D_{3,T^{2-\eta}}\Big(\frac12+it\Big)\Big|^{2}\Big)\,dt$$
for any $\eta$ with $0 \le \eta \le 1$. Using Conjecture 4 and adding the results together for $T/2$, $T/4$, $\dots$, we obtain
$$I_3(T) = \int_{0}^{T}\Big|\zeta\Big(\frac12+it\Big)\Big|^{6}\,dt \sim \big(w_3(\eta) + w_3(1-\eta)\big)\,\frac{a_3}{9!}\,T L^{9}.$$
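For the right-hand side to make sense, the bracket $w_3(\eta) + w_3(1-\eta)$ must not depend on $\eta$. The following sketch checks this in exact rational arithmetic for the degree-nine polynomial $w_3$, and also evaluates the degree-sixteen polynomial $w_4$ at $\eta = 1$. The coefficient lists are hard-coded assumptions. In particular, the last three coefficients of $w_3$ are taken as $-180$, $9$, $-2$, which is what the constant value $42$ requires.

```python
from fractions import Fraction

# Coefficients of w_3 and w_4 in increasing powers of eta (assumed, see lead-in).
W3 = [1, 9, 36, 84, 126, -630, 588, -180, 9, -2]
W4 = [1, 16, 120, 560, 1820, 4368, 8008, 11440, 12870, 11440,
      -152152, 179088, -78260, 14000, -1320, 16, -3]


def evaluate(coefficients, eta):
    """Evaluate a polynomial with the given ascending coefficients by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coefficients):
        result = result * eta + c
    return result


def symmetric_sum_w3(eta):
    """w_3(eta) + w_3(1 - eta)."""
    eta = Fraction(eta)
    return evaluate(W3, eta) + evaluate(W3, 1 - eta)


if __name__ == "__main__":
    print([symmetric_sum_w3(Fraction(i, 7)) for i in range(8)])
    print(evaluate(W4, Fraction(1)))
```

Up to $\eta^{4}$ the coefficients of $w_3$ agree with those of $(1+\eta)^{9}$, and similarly $w_4$ agrees with $(1+\eta)^{16}$ up to $\eta^{9}$, which gives a quick sanity check on the lower-order terms.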
From the formula for $w_k(\eta)$ in the conjecture, we calculate that
$$w_3(\eta) = 1 + 9\eta + 36\eta^{2} + 84\eta^{3} + 126\eta^{4} - 630\eta^{5} + 588\eta^{6} - 180\eta^{7} + 9\eta^{8} - 2\eta^{9},$$
and it is not difficult to verify that $w_3(\eta) + w_3(1-\eta) = 42$ for all $\eta$. This identity provides compelling evidence for the sixth power moment conjecture of Conrey and Ghosh. Similarly, we calculate that
$$w_4(\eta) = 1 + 16\eta + 120\eta^{2} + 560\eta^{3} + 1820\eta^{4} + 4368\eta^{5} + 8008\eta^{6} + 11440\eta^{7} + 12870\eta^{8} + 11440\eta^{9} - 152152\eta^{10} + 179088\eta^{11} - 78260\eta^{12} + 14000\eta^{13} - 1320\eta^{14} + 16\eta^{15} - 3\eta^{16}.$$
We then find that $w_4(1) = 12012$, and this leads to
$$I_4(T) \sim 24024\,\frac{a_4}{16!}\,T L^{16},$$
which is Conjecture 1.

5. Proof of the proposition

To prove the proposition we work from the expression for $a_k$ given in (51). Using this, we first prove an upper bound for $a_k$. Write $a_k = B_{-}B_{+}$, where $B_{-}$ is the part of the product over the primes less than or equal to $2k^{2}$ and where $B_{+}$ is the part over the primes greater than $2k^{2}$. Then
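$$B_{-} \le \prod_{p\le 2k^{2}}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\Big(1 + \frac{\binom{k-1}{1}}{p^{1/2}} + \frac{\binom{k-1}{2}}{p} + \frac{\binom{k-1}{3}}{p^{3/2}} + \cdots\Big)^{2}$$
$$\le (1+o(1))\Big(\frac{e^{-\gamma}}{\log 2k^{2}}\Big)^{(k-1)^{2}}\prod_{p\le 2k^{2}}\Big(1+\frac{1}{p^{1/2}}\Big)^{2(k-1)}$$
$$\le e^{o(k^{2})}\Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}}\exp\Big(2(k-1)\sum_{p\le 2k^{2}}\frac{1}{p^{1/2}}\Big)$$
$$= \Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}} e^{o(k^{2})}\,e^{O(k^{2}/\log k)} = \Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}} e^{o(k^{2})}.$$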
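Next, it is easy to show that
$$\binom{k-1}{r}^{2} \le \binom{(k-1)^{2}}{r}$$
for $r = 0, 1, 2, \dots$ and $k = 1, 2, 3, \dots$, so we see that
$$B_{+} \le \prod_{p>2k^{2}}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\Big(1+\frac{1}{p}\Big)^{(k-1)^{2}} = \prod_{p>2k^{2}}\Big(1-\frac{1}{p^{2}}\Big)^{(k-1)^{2}} \le 1.$$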
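The binomial inequality used here, and the resulting bound on each local factor of $B_{+}$, can be spot-checked as follows. This is a finite check under our own choice of ranges, with hypothetical function names; the local factor is the one appearing in (51).

```python
from math import comb


def squared_binomial_is_dominated(n: int, r: int) -> bool:
    """Check C(n, r)^2 <= C(n^2, r); here n plays the role of k - 1."""
    return comb(n, r) ** 2 <= comb(n * n, r)


def b_plus_local_factor(k: int, p: int) -> float:
    """(1 - 1/p)^((k-1)^2) * sum_{r=0}^{k-1} C(k-1, r)^2 / p^r, the factor of (51) at p."""
    series = sum(comb(k - 1, r) ** 2 / p**r for r in range(k))
    return (1 - 1 / p) ** ((k - 1) ** 2) * series


if __name__ == "__main__":
    print(all(squared_binomial_is_dominated(n, r)
              for n in range(0, 13) for r in range(0, n + 1)))
    # 101 and 103 are primes exceeding 2 k^2 for every k <= 6.
    print([b_plus_local_factor(k, 101) for k in range(1, 7)])
```

The inequality itself has a short combinatorial explanation: a pair of $r$-subsets of an $n$-set determines, after sorting both, $r$ distinct cells of an $n \times n$ grid, and this assignment is injective.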
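Thus, we find that
$$a_k \le \Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}} e^{o(k^{2})}.$$
Now we deduce a lower bound. First we have
$$B_{-} \ge \prod_{p\le 2k^{2}}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}} = (1+o(1))\Big(\frac{e^{-\gamma}}{\log 2k^{2}}\Big)^{(k-1)^{2}} = \Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}} e^{o(k^{2})}.$$
Also, since $(1-x)^{n} \ge 1 - nx$ for $0 < x < 1$, we have
$$B_{+} \ge \prod_{p>2k^{2}}\Big(1-\frac{1}{p}\Big)^{(k-1)^{2}}\Big(1+\frac{(k-1)^{2}}{p}\Big) \ge \prod_{p>2k^{2}}\Big(1-\frac{(k-1)^{2}}{p}\Big)\Big(1+\frac{(k-1)^{2}}{p}\Big) = \prod_{p>2k^{2}}\Big(1-\frac{(k-1)^{4}}{p^{2}}\Big) = \exp\Big(\sum_{p>2k^{2}}\log\Big(1-\frac{(k-1)^{4}}{p^{2}}\Big)\Big).$$
Since $\log(1-x) \ge -2x$ for $0 \le x \le 0.8$, and $(k-1)^{4}/p^{2} \le 0.8$ for $p > 2k^{2}$, this is greater than or equal to
$$\exp\Big(-2\sum_{p>2k^{2}}\frac{(k-1)^{4}}{p^{2}}\Big) \ge \exp\Big(-O\Big(\frac{k^{4}}{k^{2}\log k}\Big)\Big) = e^{o(k^{2})}.$$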
Thus, we find that
$$a_k \ge \Big(\frac{e^{-\gamma}}{2\log k}\Big)^{k^{2}} e^{o(k^{2})}.$$
Since the upper and lower bounds are the same, the proposition follows.

Acknowledgment. S. M. Gonek wishes to express his sincere gratitude to the American Institute of Mathematics for its generous support and hospitality while he was working on this paper.
References

[B] R. BALASUBRAMANIAN, On the frequency of Titchmarsh's phenomenon for ζ(s), IV, Hardy-Ramanujan J. 9 (1986), 1–10.
[BR] R. BALASUBRAMANIAN and K. RAMACHANDRA, On the frequency of Titchmarsh's phenomenon for ζ(s), III, Proc. Indian Acad. Sci. Sect. A 86 (1977), 341–351.
[CG1] J. B. CONREY and A. GHOSH, On mean values of the zeta-function, Mathematika 31 (1984), 159–161.
[CG2] J. B. CONREY and A. GHOSH, "Mean values of the Riemann zeta-function, III" in Proceedings of the Amalfi Conference on Analytic Number Theory (Maiori, Italy, 1989), Univ. Salerno, Salerno, Italy, 1992, 35–59.
[CG3] J. B. CONREY and A. GHOSH, A conjecture for the sixth power moment of the Riemann zeta-function, Internat. Math. Res. Notices 1998, 775–780.
[DFI] W. DUKE, J. B. FRIEDLANDER, and H. IWANIEC, A quadratic divisor problem, Invent. Math. 115 (1994), 209–217.
[GG] D. A. GOLDSTON and S. M. GONEK, Mean value theorems for long Dirichlet polynomials and tails of Dirichlet series, Acta Arith. 84 (1998), 155–192.
[G] S. M. GONEK, On negative moments of the Riemann zeta-function, Mathematika 36 (1989), 71–88.
[Go] A. GOOD, Approximate Funktionalgleichungen und Mittelwertsätze für Dirichletreihen, die Spitzenformen assoziiert sind, Comment. Math. Helv. 50 (1975), 327–361.
[GS] A. GRANVILLE and K. SOUNDARARAJAN, The distribution of values of L(1, χ), preprint.
[HL] G. H. HARDY and J. E. LITTLEWOOD, Contributions to the theory of the Riemann zeta-function and the theory of the distribution of primes, Acta Math. 41 (1918), 119–196.
[HB1] D. R. HEATH-BROWN, The fourth power moment of the Riemann zeta function, Proc. London Math. Soc. (3) 38 (1979), 385–422.
[HB2] D. R. HEATH-BROWN, Fractional moments of the Riemann zeta function, J. London Math. Soc. (2) 24 (1981), 65–78.
[I] A. E. INGHAM, Mean-value theorems in the theory of the Riemann zeta-function, Proc. London Math. Soc. (2) 27 (1926), 273–300.
[Iv] A. IVIĆ, "The general additive divisor problem and moments of the zeta-function" in New Trends in Probability and Statistics, Vol. 4: Analytic and Probabilistic Methods in Number Theory (Palanga, Lithuania, 1996), VSP, Utrecht, Netherlands, 1997, 69–89.
[KS] J. KEATING and N. SNAITH, Random matrix theory and some zeta-function moments, lecture at Erwin Schrödinger Institute, Vienna, Sept. 1998.
[M] H. L. MONTGOMERY, Extreme values of the Riemann zeta function, Comment. Math. Helv. 52 (1977), 511–518.
[MV] H. L. MONTGOMERY and R. C. VAUGHAN, The large sieve, Mathematika 20 (1973), 119–134.
[R] K. RAMACHANDRA, Some remarks on the mean value of the Riemann zeta function and other Dirichlet series, II, Hardy-Ramanujan J. 3 (1980), 1–24.
[Sh] D. SHANKS, "Systematic examination of Littlewood's bounds on L(1, χ)" in Analytic Number Theory (St. Louis, 1972), Proc. Sympos. Pure Math. 24, Amer. Math. Soc., Providence, 1973, 267–283.
[S] K. SOUNDARARAJAN, Mean-values of the Riemann zeta-function, Mathematika 42 (1995), 158–174.
Conrey: Department of Mathematics, Oklahoma State University, Stillwater, Oklahoma 74078-0613, USA; current: American Institute of Mathematics, 360 Portage Avenue, Palo Alto, California 94306, USA

Gonek: Department of Mathematics, University of Rochester, Rochester, New York 14627, USA