Applied and Numerical Harmonic Analysis

Series Editor
John J. Benedetto, University of Maryland

Editorial Advisory Board
Akram Aldroubi, Vanderbilt University
Douglas Cochran, Arizona State University
Ingrid Daubechies, Princeton University
Hans G. Feichtinger, University of Vienna
Christopher Heil, Georgia Institute of Technology
Murat Kunt, Swiss Federal Institute of Technology, Lausanne
James McClellan, Georgia Institute of Technology
Wim Sweldens, Lucent Technologies Bell Laboratories
Michael Unser, Swiss Federal Institute of Technology, Lausanne
Martin Vetterli, Swiss Federal Institute of Technology, Lausanne
M. Victor Wickerhauser, Washington University
for all rationals p/q
(called "badly approximable" numbers), and at the same time, they are the most uniformly distributed nα mod 1 sequences. The extremely simple cyclic group structure of the units in the corresponding quadratic fields makes the technicalities much simpler.
Local Case: lattice points in tilted hyperbola segments. Consider first an inhomogeneous Pell inequality like
—c_
0 is
A
-+
I
A 2
/2 du
a constant depending on c only—see (5).
What is the intuition behind Theorem 1.1? Well, write
j= 1.2
N,
< that is, gj(co) is the number of the integral solutions n E N of(2) satisfying The key observation is that function g1 (w) resembles the j-th Rademacher n function, so the sum
4
J.Beck
as a function of cv
[0, 1), behaves like a sum of
N independent Bernoulli
variables (6)
Let us go back to Theorem 1.1. What can we say about the sequence of those n's which satisfy inequality (7) below (note that —00 0, for almost every cv.
holds for infinitely many values of n. Similarly,
+ holds for infinitely many values of n as well.
— r)loglogn
Lattice Point Problems
5
What we actually formulate next is the "ultimate" Kolmogorov–Erdős form, which contains Khintchine's form as a straightforward corollary.
Theorem 1.3. ([Be3]) Let φ(n) be an arbitrary positive increasing function of n. For almost every ω,
+ holds for infinitely many n's if and only if
(9)
the series diverges.
Exactly the same holds for the other inequality e'T)
0 there are infinitely many, and what is more, = (c) E [0, 1) such that
continuum many "divergence points"
•
>
urn sup
n-÷oo
fl
c
(12)
and
c
0.
(26)
Note that Halász [Ha] and independently Tijdeman and Wagner [T-W] proved far-reaching generalizations of (26) for arbitrary sequences (not just for the nα mod 1 sequences). Sós's theorem can be interpreted as a global "extra large discrepancy" result. In other words, it is a sort of global version of Theorem 1.5.
Simultaneous Case. The main difficulty is that the theory of continued fractions does not seem to extend to higher dimensions. This is the reason why many of the most natural problems, like the famous Littlewood conjecture, are still completely open and hopeless. Here we just mention one result which extends a 70-year-old theorem of Khintchine [Kh1] to higher dimensions (see [Be2]). Khintchine proved, by using the theory of continued fractions, that for almost every α, the usual interval discrepancy of the nα mod 1 sequence is between $\log n \cdot (\log\log n)^{1\pm\varepsilon}$. We proved, by using a completely different Fourier analysis approach, that for almost every $\alpha = (\alpha_1, \ldots, \alpha_k) \in \mathbb{R}^k$, the usual box discrepancy of the $(n\alpha_1, \ldots, n\alpha_k)$ mod 1 sequence is between $(\log n)^k \cdot (\log\log n)^{1\pm\varepsilon}$. Similarly to Khintchine, we in fact proved a precise "convergence-divergence criterion," i.e., a Borel–Cantelli type theorem. The proof is more than 40 pages long, so we just say two things: the starting point is Poisson's summation formula, and non-trivial combinatorics is employed to guarantee "cancellations" in some exponential sums.
Note that one can easily formulate the analogues of the above-mentioned "one-dimensional" probabilistic local and global theorems in the simultaneous case. However, we do not believe in any normal limit distribution type result in higher dimensions. The reason is that the same notorious difficulty appears here as in Littlewood's intractable conjecture $\liminf_{n\to\infty} n\,\|n\alpha\|\,\|n\beta\| = 0$ for any pair α, β of reals.
Note that there were some earlier attempts to give a systematic study of probabilistic methods in Diophantine approximations (see e.g. Kac [Ka] and Kemperman [Kem]), but they were discussing nα (mod 1) for almost every α only, and did not say anything about concrete sequences like n√2 (mod 1).
2 Continued fractions and quadratic fields
In §1 we studied the local and the global behaviors of the sequence nα (mod 1) for specific values of α (like √2 or other quadratic irrationals). So far it has not made any difference whether α was √2 or √3. But there is a substantial difference between the arithmetics of the real quadratic fields Q(√2) and Q(√3): the corresponding fundamental units have different norms (namely −1 and 1). In other words, in contrast to the equation $x^2 - 2y^2 = -1$, which has infinitely many integral solutions, the other equation $x^2 - 3y^2 = -1$ does not have an integer solution. This simple fact plays an important role in what follows.
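The contrast between the two Pell-type equations can be checked by machine. The following is a minimal Python sketch (the function names are ours, chosen for illustration). It generates solutions of x² − 2y² = −1 by repeatedly multiplying 1 + √2 by the norm-one unit 3 + 2√2, and it uses a congruence to see why x² − 3y² = −1 has no solution: modulo 3 the left side is just x², and a square is never ≡ −1 (mod 3).

```python
def solutions_x2_minus_2y2_eq_minus1(count):
    """Return the first `count` positive solutions (x, y) of x^2 - 2y^2 = -1."""
    sols = []
    x, y = 1, 1  # 1 + sqrt(2) has norm 1 - 2 = -1
    while len(sols) < count:
        sols.append((x, y))
        # multiply x + y*sqrt(2) by 3 + 2*sqrt(2) = (1 + sqrt(2))^2, a unit of norm +1
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return sols


def solvable_mod(d, target, modulus):
    """True if x^2 - d*y^2 = target has a solution modulo `modulus`."""
    return any((x * x - d * y * y - target) % modulus == 0
               for x in range(modulus) for y in range(modulus))


if __name__ == "__main__":
    print(solutions_x2_minus_2y2_eq_minus1(4))
    # no solution mod 3 means no integer solution of x^2 - 3y^2 = -1
    print(solvable_mod(3, -1, 3))
```

The recurrence never terminates, which is the "infinitely many solutions" half of the statement; the modular test is a one-line obstruction for the other half.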
Lattice points in axis-parallel right-angled triangles. Consider the classical Diophantine series
$$S_\alpha(n) = \sum_{k=1}^{n} \left(\{k\alpha\} - \frac{1}{2}\right) \qquad (1)$$
where α is a fixed irrational, and as usual, {...} stands for the fractional part. Of
is a lattice point problem in disguise. ber of lattice points in a half-open right-angled triangle
counts the numn) with vertices (n + (i.e., a is the slope). It is half-open in the sense 0), (n + (0, 0), (n + that we don't count the number of lattice points on the horizontal side (located on the horizontal axis). Because of this it is more natural to speak about sub-triangle n) which is obtained from n) by removing the intersection n) C y with the horizontal strip 0 1/2. Sa(n) is the number of lattice points inside triangle n) minus the area of this triangle. course
The series $S_\alpha(n)$ has been thoroughly discussed by Hardy and Littlewood, Hecke, Ostrowski, Behnke, and more recently by Sós and others. They concentrated on the maximum fluctuations as n → ∞. We focus on the typical fluctuations and present an elegant central limit theorem for individual α's, including the class of quadratic irrationals. In particular for α = √2 this central limit theorem is as follows, see [Be3].
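To make the series concrete, here is a small Python sketch that evaluates the sum of ({kα} − 1/2) over 1 ≤ k ≤ n for the illustrative choice α = √2 (this choice and the helper names are our assumptions). It computes the sum in two ways: directly from the fractional parts, and through the identity S = αn(n + 1)/2 − Σ⌊kα⌋ − n/2, in which Σ⌊kα⌋ is the number of lattice points (k, m) with 1 ≤ k ≤ n and 1 ≤ m < kα. The integer part ⌊k√2⌋ is obtained exactly as the integer square root of 2k².

```python
from math import isqrt, sqrt


def floor_k_sqrt2(k):
    """Exact value of floor(k * sqrt(2)), namely the integer square root of 2*k^2."""
    return isqrt(2 * k * k)


def S_direct(n):
    """Sum of ({k*sqrt(2)} - 1/2) for k = 1..n, from the fractional parts."""
    alpha = sqrt(2.0)
    return sum((k * alpha - floor_k_sqrt2(k)) - 0.5 for k in range(1, n + 1))


def S_via_lattice_count(n):
    """Same sum, written as (area-like term) - (lattice point count) - n/2."""
    alpha = sqrt(2.0)
    lattice_points = sum(floor_k_sqrt2(k) for k in range(1, n + 1))
    return alpha * n * (n + 1) / 2 - lattice_points - n / 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, S_direct(n), S_via_lattice_count(n))
```

Floating point is adequate only for moderate n; for very large n the second form suffers from cancellation and exact arithmetic would be needed.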
Theorem 2.1.
N
There is a positive absolute constant c such that I
-
ç2ir I
21
dfi > C3 . diam(A).
Jo
Remarks. It seems a natural idea to try to prove the Chord Lemma by using the following well-known formula:
$$\int_0^{2\pi} \rho_A(\beta)\,d\beta = \text{perimeter}(A),$$
where $\rho_A$ denotes the radius of curvature of A, and β is the direction of the normal vector. If A is a "nice" ellipse-like region, then $\ell_A(\beta) \approx 2\sqrt{2\rho_A(\beta)}$, hence
$$\int_0^{2\pi} \ell_A^2(\beta)\,d\beta \approx 8\int_0^{2\pi} \rho_A(\beta)\,d\beta = 8 \cdot \text{perimeter}(A),$$
and the Chord Lemma follows. Unfortunately, in general for an arbitrary convex region A, the functions $\ell_A(\beta)$ and $\sqrt{\rho_A(\beta)}$ can have quite different orders of magnitude, and this natural approach breaks down (at least we couldn't make any use of it). The difficulty is that the constants $c_1$, $c_3$ have to be independent of A, so the problem is mainly combinatorial rather than analytic. We were unable to find a simple proof (our proof was 15 pages long) and we challenge the reader to find a simple one.
The third key ingredient of the proofs of Theorems 3.1 and 3.2 was a number-theoretic lemma, in fact a non-conventional Diophantine approximation lemma. The standard problem of Diophantine approximation is to solve an inequality like $\|n\alpha\| < \varepsilon$, where α is a given real number, ε > 0 is "small" and n (what we are looking for) is a non-zero integer ($\|x\|$ denotes, as usual, the distance of a real x from the nearest integer). The most well-known example is Dirichlet's theorem.
Dirichlet's approximation theorem. For any real numbers 0 < α < 1 and N ≥ 1, there is an integer n = n(α, N) such that 1 ≤ n ≤ N and $\|n\alpha\| \le 1/N$.
Note that if one requires n to be in the "upper half" interval N/2 ≤ n ≤ N, then we cannot guarantee a "small" $\|n\alpha\|$ for every 0 < α < 1. For example, let α = 1/(2N); then $\|n\alpha\| \ge 1/4$ for every N/2 ≤ n ≤ N. However, if we extend the set of integers n to the larger set of quadratic numbers of type $\sqrt{k^2 + \ell^2}$, where k and ℓ are integers, then for every α there is a "multiplier" $\sqrt{k^2 + \ell^2}$ such that $N/2 \le \sqrt{k^2 + \ell^2} \le N$ and $\|\sqrt{k^2 + \ell^2} \cdot \alpha\|$ is "small." Since the set of numbers $\sqrt{k^2 + \ell^2}$, where k and ℓ are integers, is not periodic, it does make sense to extend the unit interval 0 < α < 1 to a longer one; in fact, the longer the interval, the more difficult it is to prove anything. Now we are ready to formulate our non-conventional Diophantine approximation lemma.
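Dirichlet's theorem and the failure of the "upper half" variant are easy to test numerically. The sketch below is a hedged illustration in Python with exact rational arithmetic; the helper names and the sample values of α are our own choices. It searches 1 ≤ n ≤ N for a multiplier with ‖nα‖ ≤ 1/N, and then checks that α = 1/(2N) keeps ‖nα‖ at least 1/4 throughout N/2 ≤ n ≤ N.

```python
from fractions import Fraction


def dist_to_nearest_int(x):
    """Distance of the rational number x from the nearest integer."""
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def dirichlet_multiplier(alpha, N):
    """Smallest n in 1..N with ||n*alpha|| <= 1/N, or None if there is none."""
    for n in range(1, N + 1):
        if dist_to_nearest_int(n * alpha) <= Fraction(1, N):
            return n
    return None


def upper_half_minimum(alpha, N):
    """Minimum of ||n*alpha|| over the integers n with N/2 <= n <= N."""
    return min(dist_to_nearest_int(n * alpha) for n in range((N + 1) // 2, N + 1))


if __name__ == "__main__":
    alpha = Fraction(355, 113) - 3  # a sample rational in (0, 1)
    print([dirichlet_multiplier(alpha, N) for N in range(1, 11)])
    print(upper_half_minimum(Fraction(1, 40), 20))  # alpha = 1/(2N) with N = 20
```

Rational α keeps every comparison exact; for irrational α one would use floating point and accept rounding near the threshold 1/N.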
Diophantine Lemma. Let γ > 0 be an arbitrarily large but fixed real. Then there exist an "exponent" δ = δ(γ) > 0 and a "threshold" $c_0 = c_0(\gamma)$
I
Compare Roth's theorem to the following fundamental result in Ramsey theory.
Van der Waerden's theorem on short arithmetic progressions. For any integer k there exists an N such that for any partition of the integers 1, 2, ..., N into two sets $S_1$ and $S_2$, there exists an arithmetic progression P = {a, a + d, a + 2d, ..., a + (k − 1)d} of length k which is entirely contained in either $S_1$ or in $S_2$. In other words,
$$\big|\,|P \cap S_1| - |P \cap S_2|\,\big| = |P| = k.$$
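For very small k the theorem can be verified by exhaustive search. The following Python sketch (our own illustration, not part of the original argument) checks the case k = 3: every 2-coloring of {1, ..., 9} contains a monochromatic 3-term arithmetic progression, while some 2-coloring of {1, ..., 8} does not.

```python
from itertools import product


def has_mono_ap(coloring, k):
    """coloring[i] is the color (0 or 1) of the integer i + 1."""
    N = len(coloring)
    for a in range(N):
        for d in range(1, N):
            if a + (k - 1) * d >= N:
                break  # the progression would leave {1, ..., N}
            if len({coloring[a + j * d] for j in range(k)}) == 1:
                return True
    return False


def every_coloring_has_mono_ap(N, k):
    """Brute force over all 2^N colorings of {1, ..., N}."""
    return all(has_mono_ap(c, k) for c in product((0, 1), repeat=N))


if __name__ == "__main__":
    print(every_coloring_has_mono_ap(8, 3), every_coloring_has_mono_ap(9, 3))
```

The search space doubles with each extra integer, so this approach is only feasible for tiny k; it says nothing about how N grows with k.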
Though these two theorems have pretty much the same structure, there is a fundamental difference: in Van der Waerden's theorem the discrepancy is as large as possible. On the other hand, the length k of the "monochromatic" arithmetic progression in terms of the length N of the underlying interval [1, N] is very short: it is known that log N > k > log log log log log log N. The six times iterated logarithm lower bound is a very recent extremely complicated analytic result of the 1998 Fields-medalist Gowers (before that nobody could prove a lower estimate with a bounded number of iterations of the logarithm). Next consider the following basic theorem from geometric discrepancy theory.
Schmidt's theorem on rectangles. Let P be an arbitrary set of N points in the unit square. (i) Then there exists an axis-parallel rectangle A in the unit square such that > (ii) There exists a tilted rectangle B in the unit square such that
P fl BI — N area(B)I >
Both statements (i) and (ii) are the best possible. We mention that circles have roughly the same discrepancy as tilted rectangles, and even for the class of all possible convex sets in the unit square the discrepancy is less than $N^{1/3+\varepsilon}$ [Be4] (this result is best possible). Is there any natural geometric shape which has "extra large" discrepancy? Well, the answer is yes if we change the normalization, and instead of taking N-element
point sets in the unit square, we consider infinite point sets of density one in the whole plane.
Extra Large Discrepancy Problem. Let A be an arbitrarily large real number. Does there exist a set S of area A such that for any point set P of density one on the plane there is a congruent copy S' of S such that
$$|S' \cap P| > (1 + c)A, \qquad (1)$$
and there is another congruent copy S'' of S such that
$$|S'' \cap P| < (1 - c)A, \qquad (2)$$
where c > 0 is an absolute constant independent of A, S, P?
We managed to give a positive answer to this question (see [Be3]). We proved that hyperbola segments like
SA={(x,y)EIR2:
(3)
satisfy requirements (1) and (2). The proof uses a version of the Roth–Halász orthogonal function method from discrepancy theory.
What is so special about hyperbolas? Well, the shape of the hyperbola "resembles" a lacunary Fourier series with Hadamard gap condition. For these gap series a well-known theorem of S. Sidon states that the maximum is "almost as big as possible." More precisely, the maximum is bigger than a small absolute constant multiple of the sum of the absolute values of the Fourier coefficients (which is naturally the absolute limit). Lacunary Fourier series with Hadamard gap condition behave like "random series." (For a similar idea, see (4) in §2 and the argument after it.) Sidon's theorem corresponds to the "extra large deviation" in the following sense. Tossing a fair coin n times, it is extremely unlikely to have n heads, or even more than 51% heads, but the probability of the event is positive, which means that on a very long run we will have n consecutive heads. (The whole §1 was devoted to similar probabilistic heuristics in "lattice point counting".)
Open Problem 1. Does there exist a set S in the plane such that its translates alone satisfy requirements (1) and (2), respectively?
We can answer the analogous question for 2-colorings of the lattice points $\mathbb{Z}^2$. Given any 2-coloring $f : \mathbb{Z}^2 \to \{-1, +1\}$ of the lattice points and given an arbitrarily large positive real number A, there is a plane set S of area A such that
$$\sup_{v \in \mathbb{R}^2} \Big| \sum_{n \in \mathbb{Z}^2 \cap (S + v)} f(n) \Big| > cA. \qquad (4)$$
Here S is a tilted copy of the hyperbola segment SA with slope (the slope can be any other quadratic irrational), S + v is the translate of S by the vector v, and c > 0 is a universal constant.
Note that (4) is an "almost Van der Waerden" theorem for translated copies without magnification. This is a "new phenomenon" because in Ramsey theory magnification is inevitable (i.e., it plays an absolutely crucial role in the proofs).
Open Problem 2. Does there exist a convex set S such that some of its congruent copies S' and S'' satisfy requirements (1) and (2), respectively?
It is not hard to see that a positive answer to Open Problem 2 implies the positive solutions of two famous unsolved problems in geometry.
Danzer's Conjecture. (early 1950s) Let P be an infinite point set on the plane with the property that any convex set of area one contains at least one point of P. Then the density of P is infinite.
"Dual Danzer." Let P be an infinite point set of density one on the plane. Then for any arbitrarily large n, there is a convex set of area one which contains (at least) n points of P. We cannot help mentioning here an old "extra large fluctuation" type problem of Erdös and Turán in combinatorial number theory.
Conjecture. (1930s) Let $A = \{a_1, a_2, \ldots\} \subset \mathbb{N}$ be an infinite sequence of integers. Let $f_A(n)$ denote the number of solutions of $n = a_i + a_j$. If $f_A(n) \ge 1$ for all n ∈ ℕ, then $f_A(n)$ cannot be bounded from above. What is more, the inequality $f_A(n) > c\log n$ holds for infinitely many n's.
This conjecture was motivated by the highly irregular behaviors of the well-known arithmetical functions
$$d(n) = \sum_{n = xy} 1 \quad\text{and}\quad r(n) = \sum_{n = x^2 + y^2} 1$$
(the "divisor function" and the "divisor function for Gaussian integers").
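The irregular behavior of these two functions is visible already in a short table. The Python sketch below is illustrative only and fixes conventions that the text does not spell out: d(n) counts factorizations n = xy with positive integers x, y, and r(n) counts ordered pairs of integers (x, y), signs included, with n = x² + y².

```python
from math import isqrt


def d(n):
    """Number of ordered pairs of positive integers (x, y) with x * y = n."""
    return sum(1 for x in range(1, n + 1) if n % x == 0)


def r(n):
    """Number of ordered integer pairs (x, y), signs included, with x^2 + y^2 = n."""
    count = 0
    m = isqrt(n)
    for x in range(-m, m + 1):
        y_squared = n - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:
            count += 1 if y == 0 else 2  # y and -y are distinct unless y = 0
    return count


if __name__ == "__main__":
    for n in range(1, 31):
        print(n, d(n), r(n))
```

Both functions jump between small and large values with no visible pattern: r(n) is 0 at n = 3, 6, 7 yet 12 at n = 25, and d(n) is 2 at every prime yet 8 at n = 24 and n = 30.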
Answering a 20-year-old problem of S. Sidon, in 1954 Erdős proved the existence of an infinite sequence of integers $A = \{a_1, a_2, a_3, \ldots\} \subset \mathbb{N}$ such that for all n
log n
0 log n
is impossible. Erdös offered $500 for a first solution to any of the last two conjectures.
5 Appendix: a number-theoretic lemma
Ineffective Proof of the Diophantine Lemma. It is well known that if p is a prime ≡ 1 (mod 4), then p is the sum of two squares: $p = a^2 + b^2$ (in fact, the representation is unique, but we don't need that). Now let $p_1 = 5$, $p_2 = 13$, $p_3 = 17$, ..., $p_h$ be the first h primes ≡ 1 (mod 4). The value of h = h(γ) will be specified later (note in advance that h = [4γ + 2] is a good choice). Let $\alpha \in [1/10, N^{\gamma}]$. Applying Dirichlet's approximation theorem for each j = 1, ..., h, we obtain an integer $n_j$ with 1
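The list of primes used at the start of this proof, together with their representations as sums of two squares, can be produced by a few lines of Python. This is a hedged illustration; the helper names are ours and the search is deliberately naive.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small primes used here."""
    if n < 2:
        return False
    return all(n % q for q in range(2, isqrt(n) + 1))


def primes_1_mod_4(h):
    """The first h primes congruent to 1 modulo 4."""
    found, n = [], 5
    while len(found) < h:
        if is_prime(n):
            found.append(n)
        n += 4  # stay inside the residue class 1 mod 4
    return found


def two_squares(p):
    """Return (a, b) with a <= b and a^2 + b^2 = p, or None if no such pair exists."""
    for a in range(1, isqrt(p // 2) + 1):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return (a, b)
    return None


if __name__ == "__main__":
    for p in primes_1_mod_4(8):
        print(p, two_squares(p))
```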
O}.
The spherical Radon transform (1.1) integrating f over "great circles" is substituted for the totally geodesic Radon transform
CA. Berenstein and B. Rubin
40
$$\hat f(\xi) = \int_{\xi} f(x)\,dm(x). \qquad (1.8)$$
This assigns to a sufficiently good function f on $\mathbb{H}^n$ a collection of integrals of f over d-dimensional totally geodesic submanifolds ξ of $\mathbb{H}^n$. Each ξ represents a section of $\mathbb{H}^n$ by a (d + 1)-dimensional plane through the origin. We assume the general case 1 ≤ d ≤ n − 1. An analog of the cosine transform (1.2) associated to (1.8) has the form
(
JØn
d(x,
f(x)
(1.9)
$d(x, \xi)$ being the geodesic distance between x and ξ. This transform was introduced in [R10].
In the present paper we shall focus on the Radon transform (1.8). It was introduced by Helgason [H1] for smooth compactly supported f. In a series of papers by Berenstein and Casadio Tarabusi [BC1]–[BC3], Ishikawa [Ish], Fridman et al. [F1], [F2], Kurusa [K1], [K2], and Lissianoi and Ponomarev [LP], the Radon transform $\hat f$ was studied from different points of view for smooth and rapidly decreasing (or compactly supported) functions f. For such good functions the following inversion formulas are known:
f(x) =
(1.10)
f(x) =
(1.11)
f(x)
=
c[(
)j
—
(1.12)
are polynomials of the Beltrami—Laplace operator, $ is a cerand denotes the average over all at tain convolution operator. = cosh' (v I), Here
ing to
from x, and (f)" stands for the dual Radon transform of / correspondwith v = I. Formulae (1.10) (ford even) and(1.12) are due to Helgason
[H41. They hold pointwise for each x E IHI". The formula (1.11) belongs to Berenis understood in the stein and Casadio Tarabusi [BC 1], [BC2I (in these papers sense of distributions).
In the present article we consider more general classes of functions, namely, the space $L^p(\mathbb{H}^n)$ and the space of continuous functions. In the case d = n − 1, mapping properties of $\hat f$ for $f \in L^p(\mathbb{H}^n)$ were studied by Strichartz [Str]. Note that for $f \in L^p$ formulae (1.10), (1.12) are invalid. The point is that the expression $(\hat f)^\vee$ is the integral of a potential type on a noncompact space (see (4.53) and Lemma 4). Thus, in general, (1.10) and (1.12) cannot be applied pointwise. Of course, they can be treated in the sense of distributions, but this method does not reflect the nature of the problem. Although (1.12) is presented through the left-sided fractional integral over a finite interval, the derivation of (1.12) heavily relies on the semigroup property $I_-^{\alpha} I_-^{\beta}\psi = I_-^{\alpha+\beta}\psi$ of the right-sided fractional integrals
Radon Transform
$$(I_-^{\alpha}\psi)(s) = \frac{1}{\Gamma(\alpha)} \int_s^{\infty} \psi(t)\,(t - s)^{\alpha - 1}\,dt$$
over an infinite interval (this becomes clear if we change variables). For generic $f \in L^p$, a formal application of the semigroup property may lead to divergent integrals or unnatural restriction on p. It is possible to get around this difficulty by utilizing Marchaud's fractional derivative [R1] instead of the Riemann–Liouville one; see [R9] for details.
In our previous paper [BR], it was proved that for d = n − 1 and $f \in L^p(\mathbb{H}^n)$, the Radon transform $\hat f(\xi)$ exists for almost all ξ if and only if 1 ≤ p < (n − 1)/(n − 2). We also obtained explicit inversion formulae for $\hat f$ in terms of the corresponding continuous wavelet-like transforms. Below we obtain more general results for all 1 ≤ d ≤ n − 1 by making use of new ideas and different techniques.
The paper is organized as follows. Section 2 contains basic definitions and preliminaries. In Section 3 we prove that for $f \in L^p(\mathbb{H}^n)$, $\hat f$ exists a.e. if and only if 1 ≤ p < (n − 1)/(d − 1), and obtain estimates of the Solmon type (cf. [So] for the Euclidean case). These estimates have the form
if
c
is the relevant weight function. In Section 4 we introduce analytic families of intertwining fractional integrals where
(K'1f)(x) = Yn.a
f
f(y)(sinhd(x,
(R'1p)(x) = yn,d(a) f
sinhd(x,
The limiting case α = 0 in (1.13) (for smooth f) corresponds to the well-known relation $(\hat f)^\vee = K^d f$ due to Helgason ([H4], p. 92, formula (28)). The equality (1.13) plays a crucial role. Since the analytic family $\{K^\alpha\}$ contains the identity operator (for α = 0), (1.13) implies the inversion formula on a formal level. This formula is given precise sense in Section 5.
The list of references is not complete. More information can be found in the cited papers and books.
2 Preliminaries
Let $E^{n,1}$, n ≥ 2, be the real pseudo-Euclidean space of points $x = (x_1, \ldots, x_{n+1})$ with the inner product
$$[x, y] = -x_1y_1 - \cdots - x_ny_n + x_{n+1}y_{n+1}. \qquad (2.14)$$
A hyperbolic space X is interpreted as the "upper" sheet of the two-sheeted hyperboloid
$$X = \mathbb{H}^n = \{x \in E^{n,1} : [x, x] = 1,\ x_{n+1} > 0\}. \qquad (2.15)$$
We denote by Ξ
the set of d-dimensional totally geodesic submanifolds ξ ⊂ X, where 1 ≤ d ≤ n − 1. Let
$$\mathbb{R}^{d+1} = \mathbb{R}e_{n-d+1} \oplus \cdots \oplus \mathbb{R}e_{n+1},$$
$e_1, \ldots, e_{n+1}$ being coordinate unit vectors. In the following, $x_0 = (0, \ldots, 0, 1)$ and $\xi_0 = \mathbb{H}^n \cap \mathbb{R}^{d+1} = \mathbb{H}^d$ denote the origins in X and Ξ respectively; $G = SO_0(n, 1)$ is the identity component of the pseudo-orthogonal group O(n, 1) preserving the bilinear form (2.14); K = SO(n) and $H = SO(n - d) \times SO_0(d, 1)$ are the stationary subgroups of $x_0$ and $\xi_0$ respectively, so that X = G/K, Ξ = G/H. One can write
$$f(x) \equiv f(gK), \qquad \varphi(\xi) \equiv \varphi(gH), \qquad g \in G.$$
The geodesic distance between x and y in X is defined by $d(x, y) = \cosh^{-1}[x, y]$.
We have to ensure consistent normalization of various invariant measures throughout the paper. Given a rotation group SO(k), the corresponding invariant measure dγ will be normalized by $\int_{SO(k)} d\gamma = 1$, so that for the relevant unit sphere $S^{k-1}$ we have
$$\int_{S^{k-1}} f(\omega)\,d\omega = \sigma_{k-1}\int_{SO(k)} f(\gamma\omega)\,d\gamma, \qquad \sigma_{k-1} = |S^{k-1}|.$$
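The hyperboloid model and its distance function are easy to experiment with. The following Python sketch is an illustration under stated assumptions: it takes the bilinear form with the sign convention that makes [x, x] = 1 on the upper sheet (negative signs on the first n coordinates), and the helper names are ours. It builds points at a prescribed geodesic distance from the origin (0, ..., 0, 1) and recovers that distance from the inverse hyperbolic cosine of [x, y].

```python
import math


def bilinear(x, y):
    """[x, y] = -x_1 y_1 - ... - x_n y_n + x_{n+1} y_{n+1}."""
    return -sum(a * b for a, b in zip(x[:-1], y[:-1])) + x[-1] * y[-1]


def geodesic_distance(x, y):
    """d(x, y) = arccosh [x, y]; the max() guards against rounding just below 1."""
    return math.acosh(max(1.0, bilinear(x, y)))


def point_at_distance(direction, t):
    """Point of the upper sheet at distance t from the origin, along a unit vector in R^n."""
    return [math.sinh(t) * u for u in direction] + [math.cosh(t)]


if __name__ == "__main__":
    origin = [0.0, 0.0, 1.0]          # the origin of the hyperbolic plane (n = 2)
    x = point_at_distance([1.0, 0.0], 0.5)
    y = point_at_distance([1.0, 0.0], 2.0)
    print(bilinear(x, x), geodesic_distance(origin, y), geodesic_distance(x, y))
```

Points on the same geodesic through the origin have distance equal to the difference of their parameters, which is the addition formula for the hyperbolic cosine in disguise.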
An invariant measure dg on G will be normalized by the equality
$$\int_G f(gK)\,dg = \int_X f(x)\,dx \qquad (2.16)$$
where dx stands for the usual Riemannian measure on X = G/K [BR], [VK]. Each point x ∈ X can be represented in hyperbolic bispherical coordinates as
$$x = \eta\cosh\theta + \omega\sinh\theta, \qquad (2.17)$$
so that
$$dx = d\nu(\theta)\,d\eta\,d\omega, \qquad d\nu(\theta) = (\sinh\theta)^{n-d-1}(\cosh\theta)^{d}\,d\theta, \qquad (2.18)$$
dη and dω being Riemannian measures on $\mathbb{H}^d$ and $S^{n-d-1}$ respectively (see [VK], pp. 12, 23). Owing to (2.17),
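$$\int_X f(x)\,dx = \int_0^\infty d\nu(\theta) \int_{\mathbb{H}^d} d\eta \int_{S^{n-d-1}} f(\eta\cosh\theta + \omega\sinh\theta)\,d\omega \qquad (2.19)$$
$$= \sigma_{n-d-1} \int_0^\infty d\nu(\theta) \int_H f(h g_\theta x_0)\,dh,$$
where dh is a product measure on $H = SO(n - d) \times SO_0(d, 1)$ (normalized as above) and
$$g_\theta = \begin{bmatrix} \cosh\theta & 0 & \sinh\theta \\ 0 & I_{n-1} & 0 \\ \sinh\theta & 0 & \cosh\theta \end{bmatrix}. \qquad (2.20)$$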
Once the invariant measures dg on G and dh on H are determined, we can define an invariant measure dξ = d(gH) on Ξ = G/H normalized by
$$\int_G f(g)\,dg = \int_{G/H} d(gH) \int_H f(gh)\,dh \qquad (2.21)$$
(see [H3], p. 91).
The following statement gives precise meaning to this measure.
Lemma 1. If $\varphi \in L^1(\Xi)$
f
then
dv(8).
= Cn_d_1 £
(2.22)
Proof. Let π : G → G/H be the canonical projection, and let $f_1(g)$ be a positive function on G so that
=
I fi(gy)dy 0 and 8 E [0. 2jr], let Xr.6 denote the characteristic function
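of the rotated square A(r, θ). Consider the function
$$F = \chi_{r,\theta} * (dZ_0 - N\,d\mu_0), \qquad (2.1)$$
where f * g denotes the convolution of the functions f and g, so that for every $x \in \mathbb{R}^2$,
$$F(x) = \int_{\mathbb{R}^2} \chi_{r,\theta}(x - y)\,(dZ_0(y) - N\,d\mu_0(y)).$$
Note that the rotated square A(r, θ) is symmetric across the origin, and so
$$x - y \in A(r, \theta) \iff y - x \in A(r, \theta) \iff y \in A(r, \theta, x).$$
It follows that
$$\int_{\mathbb{R}^2} \chi_{r,\theta}(x - y)\,(dZ_0(y) - N\,d\mu_0(y)) = Z_0(A(r, \theta, x)) - N\mu_0(A(r, \theta, x)),$$
and therefore
$$F(x) = Z_0(A(r, \theta, x)) - N\mu_0(A(r, \theta, x)) = D_0(A(r, \theta, x)) \qquad (2.2)$$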
represents the discrepancy of the part of A(r, θ, x) in the unit square.
We now appeal to the theory of Fourier transforms. Let $L^1(\mathbb{R}^2)$ denote the set of all measurable complex-valued functions f that are absolutely integrable over $\mathbb{R}^2$, with Fourier transform $\hat f$ defined for every $t \in \mathbb{R}^2$ by
$$\hat f(t) = \frac{1}{2\pi}\int_{\mathbb{R}^2} f(x)\,e^{-\mathrm{i}x\cdot t}\,dx.$$
It is well known that for any two functions $f, g \in L^1(\mathbb{R}^2)$, we have $f * g \in L^1(\mathbb{R}^2)$ and the Fourier transforms $\hat f$ and $\hat g$ satisfy
$$\widehat{f * g} = 2\pi\,\hat f\,\hat g. \qquad (2.3)$$
Let $L^2(\mathbb{R}^2)$ denote the set of all measurable complex-valued functions f that are square integrable over $\mathbb{R}^2$. Then the Parseval–Plancherel theorem states that for every function $f \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2)$, the Fourier transform $\hat f \in L^2(\mathbb{R}^2)$ and satisfies
$$\int_{\mathbb{R}^2} |f(x)|^2\,dx = \int_{\mathbb{R}^2} |\hat f(t)|^2\,dt. \qquad (2.4)$$
For every $t \in \mathbb{R}^2$, we write
2ir f —k--
e_"t dDo(x) =
JR2
e_tXt(dZo(x)
Then it follows from (2.1) and (2.3)—(2.5) that
—
(2.5)
Irregularities of Point Distribution
f
=
f
=
f
63
(2.6)
Note that the measure $D_0 = Z_0 - N\mu_0$, and hence the function in (2.5), is determined by the point distribution Q and has nothing to do with the rotated squares A(r, θ). On the other hand, the characteristic function $\chi_{r,\theta}$ is determined by the rotated square A(r, θ) and has nothing to do with the point distribution Q. In other words, the identity (2.6) represents a separation of measure and geometry as a result and at the expense of passing over to the corresponding Fourier transforms.
In lower bound proofs, the point distributions Q are arbitrary, so we have very little control over the measure $D_0 = Z_0 - N\mu_0$. However, we need only the following estimate on the trivial error arising from the gaps between successive integers.
Lemma 2.1.
Suppose that a measurable set B 0
SZ0(B+x).
so that
Z0(B + x) — N,io(B + x)I
8Z0(B + x).
Note that this last inequality is trivial if $Z_0(B + x) = 0$. It follows that on writing $p - B = \{p - y : y \in B\}$ and $\chi_{p-B}$ for its characteristic function, we have
f IZo(B
= 82
+x) —
12 R P€Q
=
— B) =
>
pEQ
The main part of the proof is therefore to study the characteristic functions $\chi_{r,\theta}$ and their Fourier transforms $\hat\chi_{r,\theta}$. Ideally, we would like an inequality of the type
64
W.W.L. Chen
r IZ.0(t)12
s
However, this makes use of only one rotated square A(r, θ), with no extra rotation or contraction. For any parameter q > 0, we consider instead an average
$$\omega_q(t) = \frac{2}{q}\int_{q/2}^{q}\int_{-\pi/4}^{\pi/4} |\hat\chi_{r,\theta}(t)|^2\,d\theta\,dr.$$
We have the following amplification result, which we shall use to blow up the trivial error obtained in Lemma 2.1.
Lemma 2.2.
Suppose that 0 < p 0 and —,r/4 8 First of all, it is easy to show that for evezy t = (ft. t2) R2, we have
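$$\hat\chi_{r,\theta}(t) = \hat\chi_r(t_1\cos\theta + t_2\sin\theta,\ -t_1\sin\theta + t_2\cos\theta), \qquad (2.9)$$
where $\chi_r$ denotes the characteristic function of the square $A(r) = A(r, 0) = [-r, r]^2$. Furthermore, for every $u = (u_1, u_2) \in \mathbb{R}^2$, we have
$$\hat\chi_r(u) = \frac{2\sin(ru_1)\sin(ru_2)}{\pi u_1 u_2}. \qquad (2.10)$$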
Lemma 2.2 follows easily from the result below.
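Formula (2.10) can be sanity-checked numerically. The Python sketch below is a hedged illustration: it assumes the Fourier transform is normalized with the factor 1/(2π) in front of the integral (the normalization under which Parseval's identity carries no constant in two dimensions), and it approximates the integral over the square by the midpoint rule. Because the square is symmetric, the imaginary part vanishes and the double integral factorizes into two one-dimensional cosine integrals.

```python
import math


def chi_hat_closed_form(r, u1, u2):
    """Right-hand side of (2.10); requires u1 and u2 to be non-zero."""
    return 2 * math.sin(r * u1) * math.sin(r * u2) / (math.pi * u1 * u2)


def chi_hat_numeric(r, u1, u2, steps=2000):
    """(1/(2*pi)) times the integral of exp(-i x.u) over [-r, r]^2, by the midpoint rule."""
    h = 2 * r / steps

    def one_dim(u):
        # integral of cos(x * u) over [-r, r]; the sine part cancels by symmetry
        return h * sum(math.cos((-r + (j + 0.5) * h) * u) for j in range(steps))

    return one_dim(u1) * one_dim(u2) / (2 * math.pi)


if __name__ == "__main__":
    r, u1, u2 = 1.3, 0.7, 2.1
    print(chi_hat_closed_form(r, u1, u2), chi_hat_numeric(r, u1, u2))
```

With a different normalization of the transform the two values would differ by a constant factor, so the agreement also serves as a check on the assumed convention.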
Lemma 2.3.
Uniformly for all non-zero t e wq(t)
mm
{q4,
we have (2.11)
.
Proof. Note that in view of the integration over θ in the definition of $\omega_q(t)$, it suffices to show that uniformly for all $t = (t_1, t_2) \in \mathbb{R}^2$ satisfying $t_1 > 0$ and $t_2 = 0$, we have
min{q4. Using (2.9) and (2.10), we have
(.Oq(tI,O) x
sin2(r:1 cos 8)sin2(rt1 sin8) q q/2
8 sin2 8
d9d,.
Since —,r/4
2.
We shall refer to them here as Walsh
functions. For any £ E N0, the Walsh function wt(x), where
€(O,1,... is defined for every real number x
1}.
[0. 1) of the form (6.4) by
= = for every real number z. As in the case p = 2, the collection E No} of Walsh functions gives rise to an orthonormal basis of the Hubert space L2([0, 1]), and so there is a theory of Fourier—Walsh series base p. In particular, for any fixed x E [0, 1), the characteristic function X(O.x)(Y) of the interval [0, x) where ep(z)
lwe
£
has the Fourier—Walsh expansion xl0.x)(Y) where = x and the analogues of (6.3) are given by Price [19]. Using this and the abbreviation P for the point set one can show that the discrepancy
function
D(P; B(x)] =
XB(x)(P)
pEP
1(0.1)
XB(x)(Y) dy 2
has Fourier—Walsh expansion
D[P;
B(x)J
peP e1=ot2=o ((i
which can be approximated by the finite series pIP_)
Dh[P; B(x)J = p€P
e1=0t2=O
\PEP
j
/
Recall that the Walsh functions are characters of the group P.
orthogonality relationship
so
that we have the
if(e1.t2) 0
E
otherwise,
where $P^\perp$ is the orthogonal dual to the group P; see, for example, Lidl and Niederreiter [16]. Hence
(e1
One would like to square this expression and integrate with respect to $x = (x_1, x_2)$ over the unit square $[0, 1]^2$. Unfortunately, the Fourier–Walsh coefficients (6.5), for $(\ell_1, \ell_2) \in P^\perp \setminus \{(0, 0)\}$, are not orthogonal in $L^2([0, 1]^2)$ in general.
In Chen and Skriganov [9], it is shown that as long as the prime p is chosen large enough, there exist groups P of elements in the square $[0, 1]^2$, in the spirit of van der Corput, such that the Fourier–Walsh coefficients (6.5) are quasi-orthonormal in $L^2([0, 1]^2)$. Indeed, they are able to establish Theorem 1(ii) for arbitrary dimensions with explicitly constructed point sets.
More recently, Chen and Skriganov [10] have shown that, in fact, as long as the prime p is chosen large enough, there exist groups P of elements in the square $[0, 1]^2$, in the spirit of van der Corput, such that the Fourier–Walsh coefficients (6.5) are orthogonal in $L^2([0, 1]^2)$, so that
(ei,t2)EP'\((O.O)}
J
10.11
7 Roth's orthogonal function method
In this section, we sketch Roth's proof of Theorem 1(i). Corresponding to every distribution P of N points in the unit square $[0, 1]^2$, Roth creates an auxiliary function F(x) = F[P; x] such that, writing D(x) for D[P; B(x)], we have the inequalities
F(x)D(x)dx>> logN and
(7.2)
F(x)I2dx > 1.
The inequality (7.1) follows immediately, in view of (7.3) and (7.4).
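8 A Haar wavelet approach
Let φ(x) denote the characteristic function of the interval [0, 1), so that
$$\varphi(x) = \begin{cases} 1 & \text{if } 0 \le x < 1, \\ 0 & \text{otherwise.} \end{cases}$$
Let $\vartheta(x) = \varphi(2x) - \varphi(2x - 1)$ for every x ∈ ℝ, so that
$$\vartheta(x) = \begin{cases} 1 & \text{if } 0 \le x < 1/2, \\ -1 & \text{if } 1/2 \le x < 1, \\ 0 & \text{otherwise.} \end{cases}$$
For every n, k ∈ ℤ and x ∈ ℝ, write
$$\varphi_{n,k}(x) = 2^{n/2}\varphi(2^n x - k) \quad\text{and}\quad \vartheta_{n,k}(x) = 2^{n/2}\vartheta(2^n x - k).$$
Note that for every $n \in \mathbb{N}_0$ and $k = 0, 1, 2, \ldots, 2^n - 1$, the function $\varphi(2^n x - k)$ denotes the characteristic function of the interval $[2^{-n}k, 2^{-n}(k + 1)) \subset [0, 1)$. It is well known that an orthonormal basis for the space $L^2([0, 1])$ is given by the collection of functions
$$\vartheta_{n,k}(x), \qquad n \in \mathbb{N}_0 \text{ and } k = 0, 1, 2, \ldots, 2^n - 1,$$
together with the function φ(x). This is known as the wavelet basis for $L^2([0, 1])$; see, for example, Daubechies [11] or Meyer [17].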
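The orthonormality claimed here can be confirmed for the first few levels by direct computation. The sketch below is a minimal Python illustration; the names `phi`, `theta` and `theta_nk` are our stand-ins for the characteristic function, the mother wavelet and its rescaled translates. Since all functions involved are constant on dyadic cells of length 2⁻⁸ when n ≤ 3, sampling at cell midpoints evaluates each inner product exactly, up to floating-point rounding.

```python
def phi(x):
    """Characteristic function of the interval [0, 1)."""
    return 1.0 if 0 <= x < 1 else 0.0


def theta(x):
    """Mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return phi(2 * x) - phi(2 * x - 1)


def theta_nk(n, k):
    """Rescaled translate x -> 2^(n/2) * theta(2^n x - k)."""
    return lambda x: 2 ** (n / 2) * theta(2 ** n * x - k)


def inner(f, g, m=8):
    """Inner product on [0, 1] by midpoint sampling on 2^m equal cells."""
    cells = 2 ** m
    return sum(f((j + 0.5) / cells) * g((j + 0.5) / cells) for j in range(cells)) / cells


def gram_error(max_n=3):
    """Largest deviation of the Gram matrix of the basis functions from the identity."""
    basis = [phi] + [theta_nk(n, k) for n in range(max_n + 1) for k in range(2 ** n)]
    worst = 0.0
    for i, f in enumerate(basis):
        for j, g in enumerate(basis):
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(inner(f, g) - target))
    return worst


if __name__ == "__main__":
    print(gram_error())
```

This checks orthonormality only; completeness of the system is the part that needs the theory cited in the text.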
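Let us now extend this to two dimensions. For every $\mathbf{n} = (n_1, n_2)$ and $\mathbf{k} = (k_1, k_2)$ in $\mathbb{Z}^2$ and every $x = (x_1, x_2)$ in $\mathbb{R}^2$, write
$$\Theta_{\mathbf{n},\mathbf{k}}(x) = \vartheta_{n_1,k_1}(x_1)\,\vartheta_{n_2,k_2}(x_2).$$
Then an orthonormal basis for $L^2([0, 1]^2)$ is given by the collection of functions
$$\Theta_{\mathbf{n},\mathbf{k}}(x), \qquad \mathbf{n} \in \mathbb{N}_0^2,\ k_1 = 0, 1, 2, \ldots, 2^{n_1} - 1 \text{ and } k_2 = 0, 1, 2, \ldots, 2^{n_2} - 1,$$
together with the two collections of functions
$$\vartheta_{n_1,k_1}(x_1)\,\varphi(x_2), \qquad n_1 \in \mathbb{N}_0 \text{ and } k_1 = 0, 1, 2, \ldots, 2^{n_1} - 1,$$
$$\varphi(x_1)\,\vartheta_{n_2,k_2}(x_2), \qquad n_2 \in \mathbb{N}_0 \text{ and } k_2 = 0, 1, 2, \ldots, 2^{n_2} - 1,$$
and the function $\varphi(x_1)\varphi(x_2)$. This is usually known as the rectangular wavelet basis for $L^2([0, 1]^2)$.
We now give an alternative proof of Theorem 1(i), due to Pollington [18]. First of all, note that the discrepancy function D(x) = D[P; B(x)] can be written in the form $D(x) = Z(x) - Nx_1x_2$, where
$$Z(x) = \sum_{p \in P} \chi_{[0,x_1)}(p_1)\,\chi_{[0,x_2)}(p_2) = \sum_{p \in P} \chi_{(p_1,1)}(x_1)\,\chi_{(p_2,1)}(x_2),$$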
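where $\chi_S(x)$ denotes the characteristic function of the set S.
We now make use of the rectangular wavelet basis for $L^2([0, 1]^2)$. For every $\mathbf{n} = (n_1, n_2) \in \mathbb{N}_0^2$ and every $\mathbf{k} = (k_1, k_2)$, where $k_1 = 0, 1, 2, \ldots, 2^{n_1} - 1$ and $k_2 = 0, 1, 2, \ldots, 2^{n_2} - 1$, consider the wavelet coefficients
$$a_{\mathbf{n},\mathbf{k}} = \int_{[0,1]^2} Nx_1x_2\,\Theta_{\mathbf{n},\mathbf{k}}(x)\,dx \quad\text{and}\quad b_{\mathbf{n},\mathbf{k}} = \int_{[0,1]^2} Z(x)\,\Theta_{\mathbf{n},\mathbf{k}}(x)\,dx.$$
It is easy to see that
$$a_{\mathbf{n},\mathbf{k}} = N\left(\int_0^1 x_1\vartheta_{n_1,k_1}(x_1)\,dx_1\right)\left(\int_0^1 x_2\vartheta_{n_2,k_2}(x_2)\,dx_2\right).$$
A simple calculation gives
$$\int_0^1 x\,\vartheta_{n,k}(x)\,dx = 2^{n/2}\int_0^1 x\,\vartheta(2^n x - k)\,dx = -\frac{1}{2^{3n/2+2}}.$$
It follows that writing $|\mathbf{n}| = n_1 + n_2$, we have
$$a_{\mathbf{n},\mathbf{k}} = \frac{N}{2^{|\mathbf{n}|+4}\,2^{|\mathbf{n}|/2}}.$$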
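On the other hand, we have
$$b_{\mathbf{n},\mathbf{k}} = \sum_{p \in P}\left(\int_{p_1}^1 \vartheta_{n_1,k_1}(x_1)\,dx_1\right)\left(\int_{p_2}^1 \vartheta_{n_2,k_2}(x_2)\,dx_2\right).$$
Note that the only non-zero contributions to $b_{\mathbf{n},\mathbf{k}}$ come from those $p \in B_{\mathbf{n},\mathbf{k}}$, the support of $\Theta_{\mathbf{n},\mathbf{k}}$. If $p \in B_{\mathbf{n},\mathbf{k}}$, then $2^{-n_i}k_i \le p_i < 2^{-n_i}(k_i + 1)$, so $[2^{n_i}p_i] = k_i$, for both i = 1, 2. A simple calculation now shows that if $2^{-n}k \le p < 2^{-n}(k + 1)$, then
$$\int_p^1 \vartheta_{n,k}(x)\,dx = \frac{1}{2^{n/2}}\int_{\{2^n p\}}^1 \vartheta(y)\,dy = -\frac{\|2^n p\|}{2^{n/2}},$$
where for every z ∈ ℝ, {z} and ‖z‖ denote respectively the fractional part of z and the distance of z to the nearest integer. It follows that
$$b_{\mathbf{n},\mathbf{k}} = \frac{1}{2^{|\mathbf{n}|/2}}\sum_{p \in B_{\mathbf{n},\mathbf{k}}} \|2^{n_1}p_1\|\,\|2^{n_2}p_2\|.$$
Combining the above, we then have the wavelet coefficients
$$b_{\mathbf{n},\mathbf{k}} - a_{\mathbf{n},\mathbf{k}} = \int_{[0,1]^2} D(x)\,\Theta_{\mathbf{n},\mathbf{k}}(x)\,dx = \frac{1}{2^{|\mathbf{n}|/2}}\left(\sum_{p \in B_{\mathbf{n},\mathbf{k}}} \|2^{n_1}p_1\|\,\|2^{n_2}p_2\| - \frac{N}{2^{|\mathbf{n}|+4}}\right).$$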
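Note in particular that the functions $\Theta_{\mathbf{n},\mathbf{k}}$ form a subcollection of the rectangular basis for $L^2([0, 1]^2)$. It follows from Parseval's identity that
$$\int_{[0,1]^2} |D(x)|^2\,dx \ge \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty} \frac{1}{2^{|\mathbf{n}|}} \sum_{k_1=0}^{2^{n_1}-1}\sum_{k_2=0}^{2^{n_2}-1} \left(\sum_{p \in B_{\mathbf{n},\mathbf{k}}} \|2^{n_1}p_1\|\,\|2^{n_2}p_2\| - \frac{N}{2^{|\mathbf{n}|+4}}\right)^2.$$
To complete the proof, we now choose n so that $2N \le 2^n < 4N$. Then for every fixed $\mathbf{n}$ satisfying $|\mathbf{n}| = n$, at least half of the rectangles $B_{\mathbf{n},\mathbf{k}}$ do not contain any point of P, so that
$$\sum_{p \in B_{\mathbf{n},\mathbf{k}}} \|2^{n_1}p_1\|\,\|2^{n_2}p_2\| = 0.$$
It follows that
$$\int_{[0,1]^2} |D(x)|^2\,dx \ge \sum_{\substack{n_1 \ge 0,\ n_2 \ge 0 \\ |\mathbf{n}| = n}} \frac{1}{2^n} \cdot 2^{n-1}\left(\frac{N}{2^{n+4}}\right)^2 \gg n + 1 \gg \log N.$$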
References
1. Beck, J., Irregularities of distribution, Acta Math. 159 (1987), 1–49.
2. Beck, J. and Chen, W.W.L., Note on irregularities of distribution II, Proc. London Math. Soc. 61 (1990), 251–272.
3. Beck, J. and Chen, W.W.L., Irregularities of point distribution relative to convex polygons III, J. London Math. Soc. 56 (1997), 222–230.
4. Brandolini, L., Colzani, L., and Travaglini, G., Average decay of Fourier transforms and integer points in polyhedra, Ark. Mat. 35 (1997), 253–275.
5. Brandolini, L., Iosevich, A., and Travaglini, G., Planar convex bodies, Fourier transform, lattice points and irregularities of distribution, Trans. American Math. Soc. 355 (2003), 3513–3535.
6. Brandolini, L., Rigoli, M., and Travaglini, G., Average decay of Fourier transforms and geometry of convex sets, Rev. Mat. Iberoam. 14 (1998), 519–560.
7. Chen, W.W.L., On irregularities of distribution II, Quart. J. Math. Oxford 34 (1983), 257–279.
8. Chen, W.W.L. and Skriganov, M.M., Davenport's theorem in the theory of irregularities of point distribution, Zapiski Nauch. Sem. POMI 269 (2000), 339–353.
9. Chen, W.W.L. and Skriganov, M.M., Explicit constructions in the classical mean squares problem in irregularities of point distribution, J. Reine Angew. Math. 545 (2002), 67–95.
10. Chen, W.W.L. and Skriganov, M.M., Orthogonality and digit shifts in the classical mean squares problem in irregularities of point distribution (preprint).
11. Daubechies, I., Ten Lectures on Wavelets, SIAM, 1992.
12. Davenport, H., Note on irregularities of distribution, Mathematika 3 (1956), 131–135.
13. Davenport, H., A note on diophantine approximation II, Mathematika 11 (1964), 50–58.
14. Fine, N.J., On the Walsh functions, Trans. American Math. Soc. 65 (1949), 373–414.
15. Halton, J.H. and Zaremba, S.K., The extreme and L2 discrepancies of some plane sets, Monatsh. Math. 73 (1969), 316–328.
16. Lidl, R. and Niederreiter, H., Finite Fields, Addison-Wesley, 1983.
17. Meyer, Y., Wavelets and Operators, Cambridge University Press, 1992.
18. Pollington, A.D., Haar wavelets and irregularities of distribution (manuscript).
19. Price, J.J., Certain groups of orthonormal step functions, Canadian J. Math. 9 (1957), 413–425.
20. Roth, K.F., On irregularities of distribution, Mathematika 1 (1954), 73–79.
21. Roth, K.F., On irregularities of distribution III, Acta Arith. 35 (1979), 373–384.
22. Roth, K.F., On irregularities of distribution IV, Acta Arith. 37 (1980), 67–75.
Spectral Structure of Sets of Integers
Ben Green
Trinity College, Cambridge CB2 1TQ, England
[email protected]
Summary. Let A be a small subset of a finite abelian group, and let R be the set of points at which its Fourier transform is large. A result of Chang states that R has a great deal of additive structure. We give a statement and a proof of this result and discuss some applications of it. Finally, we discuss some related open questions.
1 Introduction, notation and definitions
Harmonic analysis has been used to great effect in additive number theory for more than 150 years. In this article we will look at one specific theme that has received attention of late. This is the principle that the large values of the Fourier transform of a small set have a great deal of structure.
We begin by introducing a small amount of notation which is necessary for the discussion. Throughout this paper $N$ will be a large prime number, and we will write $\mathbb{Z}_N$ for the additive group of residues modulo $N$. If $E = \{e_1, \dots, e_k\} \subseteq \mathbb{Z}_N$ we write $\mathrm{Span}(E)$ for the set of all sums $\sum_{i=1}^{k} \varepsilon_i e_i$ with $\varepsilon_i \in \{-1, 0, 1\}$. We will write $\omega_N = e^{2\pi i/N}$. Often the subscript $N$ will be suppressed, as the value of $N$ will be clear from the context. If $f : \mathbb{Z}_N \to \mathbb{C}$ is a function and $r \in \mathbb{Z}_N$ then we define the Fourier transform of $f$ at $r$ by
$$\hat{f}(r) = \sum_{x \in \mathbb{Z}_N} f(x)\,\omega^{rx}.$$
We will adopt the convenient notational practice of identifying sets with their characteristic functions.
Much of what we have to say can be generalised to arbitrary finite abelian groups. However in this article we will eschew such generality and discuss instead the group ZN and, occasionally, the group
2 Chang's structure theorem
In a recent paper [5] of Chang the following result is stated.²
Theorem 1 (Chang). Let $\rho, \alpha \in [0, 1]$, let $A \subseteq \mathbb{Z}_N$ be a set of size $\alpha N$ and let $R \subseteq \mathbb{Z}_N$ be the set of all $r \ne 0$ for which $|\hat{A}(r)| \ge \rho|A|$. Then there is a set $E \subseteq \mathbb{Z}_N$ with $|E| \le 2\rho^{-2}\log(1/\alpha)$ such that $R \subseteq \mathrm{Span}(E)$.
It is convenient to give a name to the situation covered by this theorem. Thus if $A, R \subseteq \mathbb{Z}_N$ and if $\rho \in (0, 1)$ then we say that $A$ is $\rho$-large at $R$ if $|\hat{A}(r)| \ge \rho|A|$ for all $r \in R$.
Theorem 1 is an extremely interesting result. Parseval's theorem implies that the set $R$ has size at most $\rho^{-2}\alpha^{-1}$, but for small $\alpha$ this is much bigger than the size of $E$ guaranteed by Chang's result. Theorem 1 may thus be viewed as saying that the "large spectrum" of a small set is very highly structured. There are already two rather different applications of this result in combinatorial number theory. The first, in Chang's original paper [5], concerns Freiman's theorem on sets with small sumsets. The second, due to the author [7], concerns arithmetic progressions in sumsets. We will discuss this application in §6. In [5] Theorem 1 is derived from an inequality of Rudin. We will describe a proof of this result in the next two sections. A rather different proof was shown to us by I.Z. Ruzsa (personal communication), an account of which may be found in [10] (2). In §5 we give the deduction of Theorem 1.
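To make the comparison with Parseval's theorem concrete, the following Python sketch computes the large spectrum of a small example set by brute force and checks the bound $|R| \le \rho^{-2}\alpha^{-1}$. The modulus $N = 101$, the set $A = \{0, \dots, 9\}$, the value $\rho = 1/2$ and all function names are illustrative assumptions and are not taken from [5].

```python
import cmath

def fourier_transform(f, r, N):
    """Return sum_x f(x) * omega^(r*x), where omega = exp(2*pi*i/N)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return sum(f[x] * omega ** ((r * x) % N) for x in range(N))

def large_spectrum(A, N, rho):
    """All nonzero r in Z_N with |A^(r)| >= rho * |A| (brute force)."""
    indicator = [1 if x in A else 0 for x in range(N)]
    threshold = rho * len(A)
    return [r for r in range(1, N)
            if abs(fourier_transform(indicator, r, N)) >= threshold]

# Hypothetical example data: a short interval in Z_101.
N = 101
A = set(range(10))
rho = 0.5

R = large_spectrum(A, N, rho)
alpha = len(A) / N
parseval_bound = 1 / (rho ** 2 * alpha)  # |R| <= rho^-2 * alpha^-1
print(len(R), parseval_bound)
```

An interval is a highly structured set, so its large spectrum is concentrated near $r = 0$; the point of Theorem 1 is that some additive structure of this kind persists for arbitrary small sets.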
3 An inequality of Rudin
The main sources for this discussion were [12] and [15]. Let us begin by stating the inequality of Rudin that interests us. We say that a set $\Lambda = \{\lambda_1, \dots, \lambda_m\} \subseteq \mathbb{Z}_N$ is dissociated³ if the only solution to the equation
$$\varepsilon_1\lambda_1 + \cdots + \varepsilon_m\lambda_m = 0$$
with $\varepsilon_i \in \{-1, 0, 1\}$ is $\varepsilon_1 = \cdots = \varepsilon_m = 0$. In the statement of Rudin's inequality, $\Lambda$ will be assumed to be dissociated and we will regard $\Lambda$ and $\mathbb{Z}_N$ as finite measure spaces $(M_1, \mu_1)$ and $(M_2, \mu_2)$ respectively. $\mu_1$ will be the counting measure, so that $\mu_1(M_1) = |\Lambda|$, while $\mu_2$ will be the normalised counting measure, which means that $\mu_2(M_2) = 1$. Write $B(M_i)$ for the space of functions on $M_i$.
² Chang's paper seems to be the first place where this result is explicitly stated. However, similar ideas can be found in an earlier paper of Bourgain [4], and the whole circle of ideas perhaps originated with Rudin [15]. We will discuss Rudin's inequality later in the paper.
³ The reader should be aware that various slightly different definitions will be encountered in the literature.
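The definition of a dissociated set can be tested directly for small examples. The sketch below is a brute-force check over all sign patterns in $\{-1, 0, 1\}^m$; the function name and the sample sets are illustrative assumptions, and the exponential running time makes it suitable only for very small $m$.

```python
from itertools import product

def is_dissociated(elements, N):
    """Return True if the only solution of
    eps_1*l_1 + ... + eps_m*l_m = 0 (mod N) with eps_i in {-1, 0, 1}
    is the trivial one eps_1 = ... = eps_m = 0."""
    for signs in product((-1, 0, 1), repeat=len(elements)):
        if any(signs) and sum(s * e for s, e in zip(signs, elements)) % N == 0:
            return False
    return True

# Powers of two admit no nontrivial signed relation of this size modulo 101.
print(is_dissociated([1, 2, 4, 8], 101))
# The relation 5 + 7 - 12 = 0 shows that {5, 7, 12} is not dissociated.
print(is_dissociated([5, 7, 12], 101))
```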
Proposition 1 (Rudin). Let $T : B(M_1) \to B(M_2)$ be the linear map that sends a sequence $(a_n)_{n \in \Lambda} \in B(M_1)$ to the function $f(x) = \sum_{n \in \Lambda} a_n\omega^{nx}$. Then for any $p > 2$ we have the bound
$$\|T\|_{2 \to p} \le (144p)^{1/2}$$
on the norm of the operator $T$.
Written out in full, this means that
$$\|f\|_p = \Big(N^{-1}\sum_{x}\Big|\sum_{n \in \Lambda} a_n\omega^{nx}\Big|^p\Big)^{1/p} \le (144p)^{1/2}\Big(\sum_{n \in \Lambda}|a_n|^2\Big)^{1/2}.$$
The formulation we have used in Proposition 1 is, perhaps, more suggestive.
Observe that $(\sum_{n \in \Lambda}|a_n|^2)^{1/2}$ is equal to $\|f\|_2$. The inequality may, therefore, be interpreted as a statement to the effect that the $L^2$ and $L^p$ norms of a function whose spectrum is dissociated are comparable. In the next few paragraphs we show that Rudin's inequality is true, on average, for modified versions of $f$ in which the $a_n$ have been subjected to random and independent changes of sign. This may seem like a curious thing to do, so we offer some motivation at the end of the section. Suppose then that $X_n$, $n \in \Lambda$, are independent Bernoulli random variables taking values in $\{\pm 1\}$ and let us consider the random function
$$X(x) = \sum_{n \in \Lambda} a_nX_n\omega^{nx}.$$
A sensible way to estimate $\mathbb{E}\|X\|_p^p$ is to write
$$\mathbb{E}\|X\|_p^p = N^{-1}\sum_{x \in \mathbb{Z}_N}\int_0^\infty \mathbb{P}\big(|X(x)| \ge t^{1/p}\big)\,dt, \tag{1}$$
recalling the availability of certain large deviation inequalities associated with the names of Bernstein, Chernoff and Hoeffding. The following is a typical example.
Proposition 2. Let $Z_1, \dots, Z_n$ be independent complex-valued random variables with zero means and with $|Z_i| \le a_i$ for all $i = 1, \dots, n$. Let $t$ be a positive real number. Then
IP'(IZi +
+
i)
See, for example, [10] (1). Substituting into (1) gives an expression which may be evaluated explicitly as
A short calculation using a sharp form of Stirling's formula then yields iai
(2)
This is all very well, but there is no reason to suppose that the behaviour of $f$ should be linked in any way to that of the random function $X$. The dissociativity of $\Lambda$ is exactly what provides such a link, a fact that we shall endeavour to explain now. We begin with the observation that the norm of $f(x)$ is the same as that of
$$f(x + \theta) = \sum_{n \in \Lambda} a_n\omega^{n\theta}\omega^{nx}$$
for any $\theta \in \mathbb{Z}_N$ that we may care to select. Suppose that for any choice of a sign function $\varepsilon : \Lambda \to \{\pm 1\}$ we could find a $\theta$ with $\omega^{n\theta} \approx \varepsilon_n$ for all $n \in \Lambda$ (we will not be precise about what we mean by the approximate symbol $\approx$ here). Now (2) implies that there is a specific choice of $\varepsilon$ for which
$$\Big\|\sum_{n \in \Lambda}\varepsilon_n a_n\omega^{nx}\Big\|_p \le C\sqrt{p}\,\Big(\sum_{n \in \Lambda}|a_n|^2\Big)^{1/2} \tag{3}$$
with an absolute constant $C$. The corresponding $\theta$ would then allow us to recover an inequality of the desired form for $f$. Now whether or not one can find such a $\theta$ is related to issues of simultaneous diophantine approximation. Observe that if there is a "small" linear relation amongst the elements of $\Lambda$ (say, for example, $\{5, 7, 12\} \subseteq \Lambda$) then such a $\theta$ need not exist. One can prove using Fourier analysis that this is necessary and sufficient; that is to say, if there are no small linear relations then $\theta$ can always be found, whatever the choice of signs $\varepsilon_n$. The phrase "no small linear relations" turns out to mean that $\Lambda$ is linearly independent over a set such as $\{-D, -D + 1, \dots, D\}$ where $D$ depends on $|\Lambda|$. Unfortunately this is a stronger condition than just dissociativity, but when it does hold, $f$ models the $L^p$ behaviour of the randomised sum $X$ very closely. It turns out however that dissociativity is exactly what we need to make a different approach to the comparison of $f$ and $X$ work.
4 Riesz products and Young's inequality
It is convenient to have a notation for twisted versions of $f$ like those we encountered in (3). If $\varepsilon : \Lambda \to \{\pm 1\}$ is a sign function then write
$$f_\varepsilon(x) = \sum_{n \in \Lambda}\varepsilon_n a_n\omega^{nx}.$$
Write
(x) for the Riesz product
= flEA
Claim.
Wehavef=
Proof of claim. This can be established by a fairly straightforward computation. We have
= 2N'
f6 *
fl (1 + y mEA
+
(4)
flEA
Multiplying out the product and changing the order of summation, one is confronted with a weighted sum of terms of the form (5)
where the n.n are distinct elements of A and m E A. The dissociativity of A implies that such a sum is zero unless r = 1, s = 0 and m = n in which case it equals N. It is easy to see that the weight attached to (5) in this case (in the expanded version of (4)) is
= and the claim follows quickly. Now the Riesz product p6 is non-negative, and so lip6 is simply N — p, (x). This sum may easily be calculated by expanding out another product and using dissociativity, and it turns out that pp6 h = 2. Thus by Young's inequality and the claim we have
liflip =
* Pellp lifE IlpilPE hi
= 2
any choice of sign function e. Now (2) implies that there is a
specific choice of e for which 1/2
lifE lip
12)
Thus 1/2
huh,' < and Proposition 1 follows immediately.
5 Completion of the proof of Chang's theorem
In this section we derive Theorem 1 from Proposition 1. It turns out that the dual form of Proposition 1 is easier to work with in this context. This takes the form
$$\|T^*\|_{p' \to 2} \le (144p)^{1/2}, \tag{7}$$
where $p'$ is the dual exponent of $p$. Here $T^* : B(M_2) \to B(M_1)$ is the adjoint of $T$, which is easily seen to be given by
$$T^*f(n) = N^{-1}\sum_{x \in \mathbb{Z}_N} f(x)\,\omega^{-nx}$$
for $n \in \Lambda$. Now recall that we are interested in a set $A \subseteq \mathbb{Z}_N$ with cardinality $\alpha N$, and we have written $R$ for the set of all $r \in \mathbb{Z}_N$ for which $|\hat{A}(r)| \ge \rho|A|$. We wish to show that $R$ has lots of structure, and we do this by proving that it does not contain a very large unstructured subset. To this end let $\Lambda$ be a maximal dissociated subset of $R$ and apply (7) with $p = \log(1/\alpha)$ and $f$ equal to the characteristic function of $A$. It is easy to check that
IIT*A112 = N1 and that
It follows immediately that
IAI 2k(bk cos kO — ak sink9) k=()
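and it follows from the above formulas for $P(K)$ and $A(K)$ combined with Parseval's equation that
$$P(K) = 2\pi a_0, \qquad A(K) = \pi a_0^2 - \frac{\pi}{2}\sum_{k=2}^{\infty}(k^2 - 1)(a_k^2 + b_k^2),$$
and therefore
$$P(K)^2 - 4\pi A(K) = 2\pi^2\sum_{k=2}^{\infty}(k^2 - 1)(a_k^2 + b_k^2). \tag{1}$$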
Since the right hand side of this equation is evidently not negative, we obtain again the isoperimetric inequality. Moreover, if the equality sign holds then $h(\theta) = a_0 + a_1\cos\theta + b_1\sin\theta$ and this is the support function of a circular disc of radius $a_0$ centered at $(a_1, b_1)$. However, the above equation actually contains a stronger statement than the ordinary isoperimetric inequality. To elaborate on this, two more concepts have to be introduced. First one needs a kind of a distance concept that measures the deviation of two convex domains from each other. Among the many possibilities of defining such a concept, the distance based on the $L^2$-metric in function spaces is most suited for work that involves Fourier series. If $M$ and $N$ are two convex domains, and $h_M$, $h_N$ the respective support functions, this distance is defined by
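$$\delta(M, N) = \|h_M - h_N\|.$$
Note that $\delta$ is indeed a metric on the set of all convex domains and that it is invariant under rigid motions of $\mathbb{R}^2$. Parseval's equation shows that, if
$$h_M(\theta) \sim a_0 + \sum_{k=1}^{\infty}(a_k\cos k\theta + b_k\sin k\theta),$$
$$h_N(\theta) \sim c_0 + \sum_{k=1}^{\infty}(c_k\cos k\theta + d_k\sin k\theta),$$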
100 Years of Fourier Series and Spherical Harmonics in Convexity
then
$$\delta(M, N)^2 = 2\pi(a_0 - c_0)^2 + \pi\sum_{k=1}^{\infty}\big((a_k - c_k)^2 + (b_k - d_k)^2\big).$$
To introduce the second concept consider the distance between a convex domain $K$ and a circular disc, say $D$, of radius $r$ centered at a point $q = (q_1, q_2)$. Since $D$ has the support function $r + q_1\cos\theta + q_2\sin\theta$ it follows that
$$\delta(K, D)^2 = 2\pi(a_0 - r)^2 + \pi\big((a_1 - q_1)^2 + (b_1 - q_2)^2\big) + \pi\sum_{k=2}^{\infty}(a_k^2 + b_k^2).$$
If $r$ and $q$ are considered to be variable, this relation shows that $\delta(K, D)$ as a function of $D$ is minimal exactly if $D$ has radius $a_0 = \frac{1}{2\pi}\int_0^{2\pi}h(\theta)\,d\theta = P(K)/2\pi$ and is centered at the point with coordinates
$$a_1 = \frac{1}{\pi}\int_0^{2\pi}h(\theta)\cos\theta\,d\theta, \qquad b_1 = \frac{1}{\pi}\int_0^{2\pi}h(\theta)\sin\theta\,d\theta.$$
This point is called the Steiner point of $K$. Thus one may say that the best approximating circular disc of a convex domain $K$ is the disc of radius $P(K)/2\pi$ that is centered at the Steiner point of $K$. This disc will be referred to as the Steiner disc of $K$ and will be denoted by $B(K)$. Hence
$$\delta(K, B(K))^2 = \pi\sum_{k=2}^{\infty}(a_k^2 + b_k^2).$$
As an immediate consequence of this relation and (1) one finds that
$$P(K)^2 - 4\pi A(K) \ge 6\pi\,\delta(K, B(K))^2.$$
This is evidently a strengthened form of the ordinary isoperimetric inequality. It not only shows that the equality sign in the isoperimetric inequality implies that $K$ is circular, but it actually provides an explicit and sharp estimate of the distance of $K$ from a circular disc if the value of the isoperimetric deficit, i.e. of $P(K)^2 - 4\pi A(K)$, is given. Results of this kind are referred to as stability results and have been actively investigated in recent years.
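As a small numerical illustration of this stability estimate, the following Python sketch evaluates the perimeter, the area and the distance to the Steiner disc directly from the Fourier coefficients of a support function, assuming the expansion $h(\theta) \sim a_0 + \sum_{k \ge 1}(a_k\cos k\theta + b_k\sin k\theta)$. The sample coefficients and the function names are hypothetical; the coefficients are chosen small enough that $h + h'' > 0$, so that $h$ is indeed the support function of a convex domain.

```python
import math

def perimeter(a0):
    """P(K) = 2*pi*a_0."""
    return 2 * math.pi * a0

def area(a0, coeffs):
    """A(K) = pi*a_0^2 - (pi/2) * sum_k (k^2 - 1) * (a_k^2 + b_k^2).
    coeffs maps k >= 1 to the pair (a_k, b_k)."""
    return math.pi * a0 ** 2 - (math.pi / 2) * sum(
        (k * k - 1) * (a * a + b * b) for k, (a, b) in coeffs.items())

def steiner_distance_sq(coeffs):
    """delta(K, B(K))^2 = pi * sum_{k >= 2} (a_k^2 + b_k^2)."""
    return math.pi * sum(a * a + b * b
                         for k, (a, b) in coeffs.items() if k >= 2)

# Hypothetical example: h(t) = 1 + 0.05*cos(2t) + 0.01*sin(3t).
a0 = 1.0
coeffs = {2: (0.05, 0.0), 3: (0.0, 0.01)}

deficit = perimeter(a0) ** 2 - 4 * math.pi * area(a0, coeffs)
stability_bound = 6 * math.pi * steiner_distance_sq(coeffs)
print(deficit, stability_bound)
```

The two printed numbers are close because the example is dominated by the $k = 2$ term, for which the factor $k^2 - 1 = 3$ makes the estimate sharp.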
3 Spherical harmonics
It is now time to look at the situation in the Euclidean space $\mathbb{R}^n$ with $n$ being an arbitrary integer greater than 1. Let $S^{n-1}$ denote the $(n-1)$-dimensional unit sphere in $\mathbb{R}^n$ (centered at $o$) and let $\Delta$ denote the Laplace operator in $\mathbb{R}^n$, i.e., $\Delta = \partial^2/\partial x_1^2 + \cdots + \partial^2/\partial x_n^2$. Furthermore, if $F$ and $G$ are two bounded integrable functions on $S^{n-1}$ their inner product is defined by $\langle F, G\rangle = \int_{S^{n-1}}F(u)G(u)\,d\sigma(u)$, where $d\sigma(u)$ refers to the area differential on $S^{n-1}$. As usual, $F$ and $G$ are said to be orthogonal if $\langle F, G\rangle = 0$, and the norm of $F$ is defined by $\|F\| = \langle F, F\rangle^{1/2}$.
H. Groemer
First let us consider the appropriate generalization of Fourier series in the higher dimensional setting. This generalization is suggested by the following considerations. Letting
$$x_1 = \cos\theta, \qquad x_2 = \sin\theta$$
and using the well-known formulas for $\cos k\theta$ and $\sin k\theta$ one finds
$$\cos k\theta = x_1^k - \binom{k}{2}x_1^{k-2}x_2^2 + \binom{k}{4}x_1^{k-4}x_2^4 - \cdots,$$
$$\sin k\theta = \binom{k}{1}x_1^{k-1}x_2 - \binom{k}{3}x_1^{k-3}x_2^3 + \cdots.$$
One easily shows that these polynomials are harmonic, that is, they are homogeneous and vanish under the application of the Laplace operator. Thus, the terms of the Fourier series, i.e. $\cos k\theta$ and $\sin k\theta$, are the restrictions of harmonic polynomials to the unit circle. This fact motivates the following definition: A spherical harmonic of dimension $n$ and order $k$ is the restriction to $S^{n-1}$ of a harmonic polynomial of degree $k$ in $n$ variables. Obvious examples of spherical harmonics of dimension $n$ are the respective restrictions to $S^{n-1}$ of the constant functions and the linear functions $c_1x_1 + c_2x_2 + \cdots + c_nx_n$.
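The polynomial expressions for $\cos k\theta$ and $\sin k\theta$ can be checked numerically. The Python sketch below builds them as the real and imaginary parts of $(x_1 + ix_2)^k$ expanded by the binomial theorem, compares them with the trigonometric functions, and estimates the Laplacian by central differences. The function names, the step size and the test point are illustrative assumptions.

```python
import math

def cos_k_poly(k, x1, x2):
    """Real part of (x1 + i*x2)^k: equals cos(k*theta) on the unit circle."""
    return sum((-1) ** (j // 2) * math.comb(k, j) * x1 ** (k - j) * x2 ** j
               for j in range(0, k + 1, 2))

def sin_k_poly(k, x1, x2):
    """Imaginary part of (x1 + i*x2)^k: equals sin(k*theta) on the unit circle."""
    return sum((-1) ** ((j - 1) // 2) * math.comb(k, j) * x1 ** (k - j) * x2 ** j
               for j in range(1, k + 1, 2))

def laplacian(p, x1, x2, h=1e-3):
    """Central-difference approximation of the Laplacian of p at (x1, x2)."""
    return (p(x1 + h, x2) + p(x1 - h, x2) + p(x1, x2 + h) + p(x1, x2 - h)
            - 4 * p(x1, x2)) / h ** 2

theta = 0.7
x1, x2 = math.cos(theta), math.sin(theta)
for k in range(6):
    print(k, cos_k_poly(k, x1, x2) - math.cos(k * theta),
          sin_k_poly(k, x1, x2) - math.sin(k * theta))

# Approximately zero: the degree 5 polynomial is harmonic away from the circle too.
print(laplacian(lambda a, b: cos_k_poly(5, a, b), 0.3, -0.4))
```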
Let me now list some important properties of n-dimensional spherical harmonics.
(a) The maximum number of linearly independent spherical harmonics of order $k$ is
$$N(n, k) = \frac{2k + n - 2}{k + n - 2}\binom{k + n - 2}{n - 2}.$$
(The quotient $(2k + n - 2)/(k + n - 2)$ is supposed to be 1 if $n = 2$ and $k = 0$.) For example, $N(n, 0) = 1$ and there is essentially only one spherical harmonic of order 0, namely the constant one. If $k = 1$, then $N(n, 1) = n$ and the most natural choice of $n$ linearly independent spherical harmonics will be the coordinates $u_i$ considered as functions of $u = (u_1, \dots, u_n) \in S^{n-1}$. Note that these $n$ spherical harmonics are mutually orthogonal.
(b) Any two spherical harmonics of different order are orthogonal.
(c) Using the standard procedure of orthogonalization one can find for any $k \ge 0$ an orthogonal set of $N(n, k)$ non-zero spherical harmonics of order $k$.
(d) There exists an orthogonal sequence $H_0, H_1, \dots$ of non-vanishing spherical harmonics of non-decreasing orders such that for each $k \ge 0$ it contains $N(n, k)$ terms of order $k$. Any such sequence is complete. The latter fact can be described as follows: If $F$ is any bounded integrable function on $S^{n-1}$ and if one associates with such a sequence the 'Fourier series'
$$F \sim \sum_{j=0}^{\infty}a_jH_j, \qquad a_j = \langle F, H_j\rangle/\|H_j\|^2,$$
then Parseval's equation
$$\|F\|^2 = \sum_{j=0}^{\infty}a_j^2\|H_j\|^2$$
is valid. In most cases it is convenient to collect all terms $a_jH_j$ of given order $k$ into one expression, say $Q_k$, and write
$$F \sim \sum_{k=0}^{\infty}Q_k,$$
where we now have
$$\|F\|^2 = \sum_{k=0}^{\infty}\|Q_k\|^2.$$
The latter relation will also be referred to as Parseval's equation.
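The dimension count in property (a) is easy to tabulate. The following Python sketch evaluates $N(n, k)$ in integer arithmetic, treating the quotient as 1 when $n = 2$ and $k = 0$ as stipulated above; the function name is an illustrative assumption.

```python
from math import comb

def num_spherical_harmonics(n, k):
    """Maximum number N(n, k) of linearly independent spherical harmonics
    of dimension n >= 2 and order k >= 0."""
    if n == 2 and k == 0:
        return 1  # the quotient (2k+n-2)/(k+n-2) is read as 1 here
    # The product is always divisible by k + n - 2, so // is exact.
    return (2 * k + n - 2) * comb(k + n - 2, n - 2) // (k + n - 2)

for n in (2, 3, 4):
    print(n, [num_spherical_harmonics(n, k) for k in range(6)])
```

For $n = 2$ this gives 1, 2, 2, 2, ..., matching the constant together with the pairs $\cos k\theta$, $\sin k\theta$, and for $n = 3$ it gives the odd numbers $2k + 1$.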
To illustrate these statements consider the case $n = 2$. As the orthogonal sequence $H_0, H_1, \dots$ one chooses the functions $1, \cos\theta, \sin\theta, \cos 2\theta, \sin 2\theta, \dots$ and this results in the classical Fourier series. In this case the difference between $\sum a_jH_j$ and $\sum Q_k$ is that in the latter case the terms $\cos k\theta$ and $\sin k\theta$ are combined into one term, as it is customary to do. For $n = 3$ one has $N(3, k) = 2k + 1$ and the situation is more complicated. Letting $(\theta, \varphi)$ denote the usual spherical coordinates of a point on $S^2$, the corresponding $2k + 1$ spherical harmonics are
$$(\sin\varphi)^jP_k^{(j)}(\cos\varphi)\cos j\theta \quad (j = 0, 1, \dots, k),$$
$$(\sin\varphi)^jP_k^{(j)}(\cos\varphi)\sin j\theta \quad (j = 1, \dots, k).$$
(If $j = 0$ and $\varphi = 0$, then $(\sin\varphi)^j$ is supposed to be 1.) Here $P_k^{(j)}$ denotes the $j$-th derivative of the $k$-th Legendre polynomial $P_k$. The corresponding series development, known as the Laplace series, is
$$F \sim \sum_{k=0}^{\infty}\sum_{j=0}^{k}(\sin\varphi)^jP_k^{(j)}(\cos\varphi)(a_{kj}\cos j\theta + b_{kj}\sin j\theta).$$
4 Convex bodies in $\mathbb{R}^n$
Most of the geometric concepts discussed in connection with the euclidean plane can be naturally generalized to the situation in $\mathbb{R}^n$, where the dimension $n$ is assumed to be given and fixed ($n \ge 2$). A 'plane' is always meant to be a hyperplane, i.e. $(n-1)$-dimensional. A convex body is defined as a closed and bounded convex subset of $\mathbb{R}^n$ with interior points. Similarly as in the two-dimensional case, the support function $h(u)$ of a given convex body $K$ is the directed distance of the origin $o$ to the support plane (tangent plane) of $K$ of direction $u$. Hence $u$ is orthogonal to the support plane and points in the direction of the half-space determined by the plane that does not contain $K$; $h(u)$ is negative if the support plane separates $o$ and $K$. In a more condensed form this can be expressed by the formula
$$h(u) = \sup\{x \cdot u : x \in K\}$$
where $x \cdot u$ indicates the usual dot product for vectors in $\mathbb{R}^n$. With the support function available, one may define the distance between two convex bodies $M$ and $N$ with respective support functions $h_M$ and $h_N$ by
$$\delta(M, N) = \|h_M - h_N\|.$$
Similarly as before, $\delta$ is a metric on the set of all convex bodies in $\mathbb{R}^n$ that is invariant under rigid motions.
It is not so obvious what in $\mathbb{R}^n$ the analogues of area and perimeter are supposed to be. Certainly volume $V(K)$ and surface area $S(K)$ come to mind, but a closer look reveals that volume and surface area are only a small part of the whole picture. For example, if $n = 2$ a change of notation in our previous formula for the perimeter in terms of the support function shows that $P(K) = \int_{S^1}h(u)\,d\sigma(u)$. So, wouldn't it be more natural to introduce as the $n$-dimensional analogue of the perimeter the expression $\int_{S^{n-1}}h(u)\,d\sigma(u)$? Actually, this expression has a simple geometric interpretation. Noting that the sum $w(u) = h(u) + h(-u)$ is the width of $K$ in the direction $u$, i.e. the distance between two parallel support planes of $K$ orthogonal to $u$, one may define the mean width of $K$ by
$$\bar{w}(K) = \frac{1}{\sigma_n}\int_{S^{n-1}}w(u)\,d\sigma(u),$$
where $\sigma_n$ denotes the surface area of $S^{n-1}$. Hence, in $\mathbb{R}^2$ the perimeter is (except for a constant factor) the mean width and it might therefore be more fitting to look at the mean width as the proper generalization of the perimeter in the $n$-dimensional case.
The classical approach to a better understanding of this situation is via Steiner's theorem on parallel bodies. To describe it, let $K_r$ denote the parallel body at distance $r$ of $K$, that is, the set of all points in $\mathbb{R}^n$ that are within distance at most $r$ of $K$. In other words, $K_r$ is the convex body with support function $h(u) + r$. Then, Steiner's theorem, in its setting for $\mathbb{R}^n$, says that there exist $n + 1$ numbers $W_i^n(K)$, depending on $n$ and $K$ only, such that
$$V(K_r) = W_0^n(K) + \binom{n}{1}W_1^n(K)r + \cdots + \binom{n}{n}W_n^n(K)r^n.$$
For example, if $n = 2$ then
$$A(K_r) = A(K) + P(K)r + \pi r^2.$$
(This is obvious for convex polygons and can be extended to arbitrary convex domains by approximation.) Thus, if $n = 2$ then
$$W_0^2(K) = A(K), \qquad W_1^2(K) = \tfrac{1}{2}P(K), \qquad W_2^2(K) = \pi.$$
If $n = 3$ then
$$W_0^3(K) = V(K), \qquad W_1^3(K) = \tfrac{1}{3}S(K), \qquad W_2^3(K) = \tfrac{2\pi}{3}\bar{w}(K), \qquad W_3^3(K) = \tfrac{4\pi}{3}.$$
For arbitrary $n$ it can be shown that
$$W_0^n(K) = V(K), \qquad W_1^n(K) = \tfrac{1}{n}S(K), \qquad W_{n-1}^n(K) = \tfrac{\kappa_n}{2}\bar{w}(K), \qquad W_n^n(K) = \kappa_n,$$
where $\kappa_n$ denotes the volume of the $n$-dimensional unit ball $B^n$. The numbers $W_i^n(K)$ are often called the quermassintegrals of $K$, apparently since they appeared first in the German literature under the name of Quermaßintegrale and nobody has been able to translate this rather clumsy German word. In addition to the above definition based on Steiner's theorem, there are many other ways to define them. For example they can be defined axiomatically by some very basic properties or as special cases of what are called mixed volumes. Aside from a constant factor, $W_k^n(K)$ can also be defined as the mean value of the $(n-k)$-dimensional volume of the orthogonal projections of $K$ onto all $(n-k)$-dimensional linear subspaces of $\mathbb{R}^n$. For this reason the numbers $W_k^n(K)$ are sometimes referred to as the mean projection measures or simply as the projection measures of $K$. It also is possible to express them in terms of integrals over certain curvature functions of the boundary of $K$. In any case, these quermassintegrals (or constant multiples of them) play a very important role in the theory of convex bodies. Note that always $W_n^n(K) = \kappa_n$ and that for all $k$ one has
$W_k^n(B^n) = \kappa_n$.
5 Inequalities and spherical harmonics
Since in Section 2 it has been shown that for two-dimensional convex domains the Fourier expansion of the support function led to an interesting sharpened form of the isoperimetric inequality, it is natural to try to prove a similar result for the $n$-dimensional case. Thus, let us assume that $h$ is the support function of the convex body $K$ and that
$$h \sim \sum_{k=0}^{\infty}Q_k$$
is the corresponding expansion of $h$ in terms of $n$-dimensional spherical harmonics, where $Q_k$ has order $k$. In particular, $Q_0 = c_0$ and $Q_1 = c_1u_1 + \cdots + c_nu_n$, where $c_0, c_1, \dots, c_n$ are constants and $u = (u_1, \dots, u_n) \in S^{n-1}$. Since $c_0 = \langle h(u), 1\rangle/\|1\|^2$ it follows that $c_0 = \bar{w}(K)/2$. In the case when $K$ is an $n$-dimensional ball of radius $r$ centered at the point $p = (p_1, \dots, p_n)$, one finds that the corresponding expansion is
$$r + p_1u_1 + \cdots + p_nu_n.$$
H. Groemer
From this, Parseval's equation, and the fact that
Br(pfl2 =
an(Co — r)2
—
112
= +
p1)2 +
it follows that —
+
IIQklI2.
Similarly as in the two-dimensional case, if p and r are considered to be variable this shows that ö(K. Br(p)) is minimal exactly if r = CO = and Hence one can state that the best approximating ball of a convex body K has radius and center (Cl c,,), where = h(u)u1da(u). This ball is called the Steiner ball of K and will be denoted by B(K). The center z(K) of B(K) is called the Steiner point of K. Hence the expansion of the support function of K can be written in the form
h(u)
+ z(K) . u +
2
Let us now return to the isoperimetric inequality. If $n = 2$, then from the present point of view, this is an inequality between two successive mean projection measures, namely $W_0^2$ and $W_1^2$. In fact, it can be written in the form
$$W_1^2(K)^2 - \pi W_0^2(K) \ge 0.$$
So, generalizing the proof that led to this inequality, one can expect to arrive at an analogous inequality between two successive $W_i^n$. The most interesting inequality would be between $W_0^n$ (volume) and $W_1^n$ (surface area). Unfortunately, for $n > 2$ the formulas for volume and surface area in terms of the support function are not suitable for work with spherical harmonics. One has to look at the other end of the chain, that is, at a possible inequality between $W_{n-1}^n$ (mean width) and $W_{n-2}^n$. This is feasible because of the following two assertions. First, it can be shown that
$$W_{n-2}^n(K) = \frac{1}{n}\|h\|^2 + \frac{1}{n(n-1)}\langle h, \Delta_o h\rangle,$$
where $\Delta_o$ denotes the Laplace–Beltrami operator on $S^{n-1}$. It is defined for functions $F$ on $S^{n-1}$ as the ordinary Laplace operator of the function obtained by extending $F$ from $S^{n-1}$ to $\mathbb{R}^n \setminus \{o\}$ so that it is constant on each ray starting at $o$, and subsequently restricting the resulting function to $S^{n-1}$. In other words, $\Delta_oF$ is the restriction of $\Delta F(x/|x|)$ to $S^{n-1}$, where $|\cdot|$ indicates the norm for points (vectors) of $\mathbb{R}^n$. Second, if $Q$ is an $n$-dimensional spherical harmonic of order $k$, one can prove that $\Delta_oQ = -k(n + k - 2)Q$. (For example, if $n = 2$ this is essentially the fact that $(\cos k\theta)'' = -k^2\cos k\theta$ and $(\sin k\theta)'' = -k^2\sin k\theta$.) From these two observations and the general version of Parseval's equation it follows that $W_{n-2}^n(K)$ can be expressed in terms of the expansion of $h$ as
$$W_{n-2}^n(K) = -\frac{1}{n(n-1)}\sum_{k=0}^{\infty}(k - 1)(n + k - 1)\|Q_k\|^2.$$
Combined with the earlier remark that $Q_0 = \bar{w}(K)/2 = W_{n-1}^n(K)/\kappa_n$ one obtains
$$W_{n-1}^n(K)^2 - \kappa_nW_{n-2}^n(K) = \frac{\kappa_n}{n(n-1)}\sum_{k=2}^{\infty}(k - 1)(n + k - 1)\|Q_k\|^2,$$
and this leads immediately to the inequality
$$W_{n-1}^n(K)^2 - \kappa_nW_{n-2}^n(K) \ge \frac{(n+1)\kappa_n}{n(n-1)}\,\delta(K, B(K))^2. \tag{2}$$
For $n = 2$ this is the previously discussed version of the isoperimetric inequality. If $n = 3$ there results an 'isoperimetric' inequality between the mean width and the surface area of $K$, namely
$$\pi\bar{w}(K)^2 - S(K) \ge 2\,\delta(K, B(K))^2.$$
With the term $\delta(K, B(K))$ replaced by 0 this inequality has already been obtained by Minkowski, and Hurwitz has shown that spherical harmonics can be used to prove it. Note that these inequalities again contain stability statements; they not only show that $K$ must be a ball if the left hand side vanishes, but they also provide information on the distance of $K$ from a ball if the size of the left hand side is given. Although this approach for proving geometric inequalities does not yield an inequality between $V(K)$ and $S(K)$, it can be used in conjunction with some classical inequalities to prove stability results for inequalities involving any pair $W_i^n$, $W_j^n$. For example, if (2) is combined with the known inequality one obtains for $n \ge 3$
K,1W_3(K)2
n+l n(n
—
(K)2
8(K, B(K))2.
Repeated application of this method leads to an n-dimensional version of the standard isoperimetric inequality with a stability term, namely n+1 —
(n — 1)
8(K. B(K))2.
If $n = 2$ this becomes the isoperimetric inequality discussed at the end of Section 2. I wish to mention that recently B. Fuglede has found a different approach for proving strong inequalities of this kind that does not use the expansion of the support function but rather of the radial function. It involves a long chain of intricate estimates.
6 Bodies of constant width and the Radon transformation
Almost everything I have said so far concerned geometric inequalities, and I may have left the impression that this is the only subject in the theory of convex sets that is amenable to the application of spherical harmonics. To dispel this impression I will now discuss some results that are of a very different nature. The original source of these results is a short paper of Minkowski [13] published in 1906. Its title is 'On convex bodies of constant width.' Actually this is a translation; the original was written in Russian, the only paper of Minkowski written in this language. In this article Minkowski showed how three-dimensional spherical harmonics can be used to solve an interesting problem regarding projections of convex bodies of constant width. To describe this theorem of Minkowski a few preliminary remarks are necessary.
Assume that $K$ is a convex body in $\mathbb{R}^n$ with $n \ge 3$. As one might expect, $K$ is said to be of constant width, if its width $w(u) = h(u) + h(-u)$, considered as a function of $u$, is constant. Let $u^\perp$ denote the $(n-1)$-dimensional linear subspace of $\mathbb{R}^n$ that is orthogonal to $u$, and let $K_u$ denote the orthogonal projection of $K$ onto $u^\perp$. Obviously, $K_u$ can be viewed as a convex body in $\mathbb{R}^{n-1}$, and if $K$ is of constant width $w_0$ then every $K_u$ is also of constant width $w_0$.
Assume now that $n = 3$. The perimeter of $K_u$ is called the girth of $K$ in the direction $u$. Since, as pointed out earlier, the perimeter is $\pi$ times the mean width it follows that the girth of $K$ does not depend on $u$. In other words, bodies of constant width are bodies of constant girth. Minkowski's theorem is the converse of this statement: Convex bodies of constant girth are bodies of constant width. Let me sketch how Minkowski proved this theorem. The assumption that $K$ has constant girth means that for some constant $c$ and all $u$
$$P(K_u) = c.$$
If a point $q \in S^2$ is designated as 'pole,' then, if a cartesian coordinate system is selected so that the positive $z$-axis intersects $S^2$ at the point $q$, every $u \in S^2$ is determined by its spherical coordinates $\theta$, $\varphi$ with respect to this choice of the coordinate system. Hence, the support function $h(u)$ of $K$ can be interpreted as a function of $\theta$ and $\varphi$, and because of this possibility one may write $h(\theta, \varphi)$ instead of $h(u)$. Let now
$$h(u) \sim \sum_{k=0}^{\infty}Q_k(u)$$
or, equivalently,
$$h(\theta, \varphi) \sim \sum_{k=0}^{\infty}Q_k(\theta, \varphi)$$
be the representation of $h$ in terms of spherical harmonics. From the explicit representation of three-dimensional spherical harmonics mentioned before it follows that
$$Q_k(u) = Q_k(\theta, \varphi) = \sum_{j=0}^{k}(\sin\varphi)^jP_k^{(j)}(\cos\varphi)(a_{kj}\cos j\theta + b_{kj}\sin j\theta),$$
where $P_k^{(j)}$ denotes the $j$-th derivative of the Legendre polynomial of order $k$. In particular, if $\varphi = 0$, then
$$Q_k(q) = Q_k(\theta, 0) = a_{k0}P_k(1) = a_{k0},$$
and if $\varphi = \pi/2$ then
$$\int_0^{2\pi}Q_k(\theta, \pi/2)\,d\theta = 2\pi a_{k0}P_k(0).$$
Furthermore, the perimeter $P(K_q)$ of $K_q$ is
$$P(K_q) = \int_0^{2\pi}h(\theta, \pi/2)\,d\theta.$$
Combining these facts one finds
$$P(K_q) = 2\pi\sum_{k=0}^{\infty}P_k(0)Q_k(q).$$
Since there were no restrictions in the selection of $q$, this equality can be viewed as the expansion in terms of spherical harmonics of $P(K_q)$ considered as a function of $q$. However, since $K$ is of constant girth, this function is actually constant. Since for even $k$ one has $P_k(0) \ne 0$, it follows that $Q_k$ must vanish identically for all even $k > 0$. Hence,
$$h(u) + h(-u) = 2Q_0,$$
which shows that $K$ is of constant width.
Focusing on the essential features of this proof and generalizing to $n$ dimensions ($n \ge 3$) one can summarize the situation as follows. Let $F$ be a function on $S^{n-1}$ and let $S(u)$ be the $(n-2)$-dimensional sphere that is the intersection of $S^{n-1}$ with $u^\perp$. Then one can define a new function on $S^{n-1}$, say $\hat{F}$, by the relation
$$\hat{F}(u) = \frac{1}{\sigma_{n-1}}\int_{S(u)}F(v)\,d\sigma^u(v),$$
where $d\sigma^u$ refers to the surface area measure on $S(u)$. This new function $\hat{F}$ is called the (spherical) Radon transform of $F$ and is often denoted by $\mathcal{R}(F)$. Thus, what Minkowski did (in the case $n = 3$) is to show that if the original function $F$ is even, then it is uniquely determined by its transformed function $\mathcal{R}(F)$. Minkowski is usually not given proper credit for this achievement, although Radon in his famous paper [14] has acknowledged the priority of Minkowski's work. Since the appearance of Minkowski's paper, a huge body of literature concerning the Radon transformation has been created. In accordance with my present objectives I restrict myself here to a discussion of the relationship between spherical harmonics and the Radon transformation and to some applications of the Radon transformation in the theory of convex sets.
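The step in Minkowski's argument that relies on the values of the ordinary Legendre polynomials at 0 can be checked numerically. The following Python sketch evaluates $P_k$ by the classical three-term recurrence $(m+1)P_{m+1}(x) = (2m+1)xP_m(x) - mP_{m-1}(x)$, which is an assumption brought in for the illustration and not part of the text above; the function name is hypothetical as well.

```python
def legendre_at(k, x):
    """Ordinary Legendre polynomial P_k(x) via the three-term recurrence
    (m+1) P_{m+1} = (2m+1) x P_m - m P_{m-1}, with P_0 = 1 and P_1 = x."""
    p_prev, p = 1.0, x
    if k == 0:
        return p_prev
    for m in range(1, k):
        p_prev, p = p, ((2 * m + 1) * x * p - m * p_prev) / (m + 1)
    return p

# P_k(0) vanishes for odd k and is non-zero for even k; P_k(1) = 1.
for k in range(9):
    print(k, legendre_at(k, 0.0), legendre_at(k, 1.0))
```

Since $P_k(0) \ne 0$ for even $k$, a constant value of $P(K_q)$ forces $Q_k = 0$ for all even $k > 0$, while the odd-order terms are not constrained because their coefficients $P_k(0)$ vanish.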
Before I say more about this topic and other transformations of this kind, let me show how the injectivity (for even functions) of the Radon transformation can be used to prove a generalization of Minkowski's result.
Let $K$ and $L$ be two convex bodies in $\mathbb{R}^n$ ($n \ge 3$) and let $w_K$, $w_L$ denote the respective width functions of $K$ and $L$. Clearly, the function $F = w_K - w_L$ is even, and the difference of the mean widths of $K_u$ and $L_u$ is given by
$$\frac{1}{\sigma_{n-1}}\int_{S(u)}F(v)\,d\sigma^u(v).$$
Hence, if this integral vanishes then the injectivity of the Radon transformation shows that $F = 0$. Calling $K$ and $L$ equiwide if for every direction they have the same width, one can express this result by stating that if the projections of two convex bodies on any plane have the same mean width they must be equiwide. Letting $L$ be a ball one obtains again Minkowski's theorem (generalized to $\mathbb{R}^n$).
7 The Funk–Hecke theorem and some of its consequences

As indicated in the previous section, spherical harmonics are quite useful in dealing with the Radon transformation. But they are also effective means of dealing with several other spherical integral transformations that have interesting geometric applications. An essential tool for work in this area is a relation called the Funk–Hecke theorem. It can be formulated as follows. Let $\Phi$ be a bounded integrable function on $[-1, 1]$ and $Q$ a spherical harmonic of dimension $n$ and order $k$. Then
$$\int_{S^{n-1}} \Phi(u \cdot v)\, Q(v)\, d\sigma(v) = \alpha_{n,k}(\Phi)\, Q(u)$$
with
$$\alpha_{n,k}(\Phi) = \sigma_{n-1} \int_{-1}^{1} \Phi(t)\, P_k^n(t)\, (1 - t^2)^{(n-3)/2}\, dt,$$
where $P_k^n$ is the Legendre polynomial of dimension $n$ and degree $k$. Since Legendre polynomials of higher dimensions are not commonly known objects, let me just mention that they have properties very similar to those of the ordinary Legendre polynomials. For example, they may be defined by the generalized Rodrigues formula
$$P_k^n(x) = \frac{(-1)^k}{2^k (a+1)(a+2)\cdots(a+k)}\, (1 - x^2)^{-a}\, \frac{d^k}{dx^k} (1 - x^2)^{a+k},$$
where $a = (n-3)/2$, and they satisfy the recurrence relation
$$(k + n - 2)\, P_{k+1}^n(x) - (2k + n - 2)\, x\, P_k^n(x) + k\, P_{k-1}^n(x) = 0$$
and the second order differential equation
$$(1 - x^2)\, \frac{d^2 P_k^n(x)}{dx^2} - (n-1)\, x\, \frac{d P_k^n(x)}{dx} + k(k + n - 2)\, P_k^n(x) = 0.$$
For $n = 3$ these are well-known relations of the ordinary Legendre polynomials. The Funk–Hecke theorem is not difficult to prove; one starts with polynomials and proceeds to the general case using approximations. It can be applied when dealing with spherical integral transformations of the type
$$G(u) = \int_{S^{n-1}} \Phi(u \cdot v)\, F(v)\, d\sigma(v).$$
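The properties of the Legendre polynomials of dimension $n$ mentioned above can be checked mechanically. The following is a minimal sketch, not taken from the original text: it assumes the three-term recurrence $(k+n-2)P_{k+1}^n = (2k+n-2)\,x\,P_k^n - k\,P_{k-1}^n$ with $P_0^n = 1$ and $P_1^n = x$, builds the polynomials in exact rational arithmetic, and tests whether each of them satisfies the second order differential equation. The helper names (`legendre`, `ode_residual`, and so on) are illustrative only.

```python
from fractions import Fraction

def poly_mul_x(p):
    """Multiply a polynomial (coefficients in increasing degree) by x."""
    return [Fraction(0)] + list(p)

def poly_add(p, q):
    """Add two polynomials given as coefficient lists."""
    m = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (m - len(p))
    q = list(q) + [Fraction(0)] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    """Multiply a polynomial by the rational constant c."""
    return [Fraction(c) * a for a in p]

def poly_diff(p):
    """Differentiate a polynomial; the zero polynomial is returned as [0]."""
    return [i * a for i, a in enumerate(p)][1:] or [Fraction(0)]

def legendre(n, kmax):
    """Return [P_0, ..., P_kmax] for dimension n >= 2 via the assumed recurrence."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, kmax):
        # (k + n - 2) P_{k+1} = (2k + n - 2) x P_k - k P_{k-1}
        rhs = poly_add(poly_scale(2 * k + n - 2, poly_mul_x(polys[k])),
                       poly_scale(-k, polys[k - 1]))
        polys.append(poly_scale(Fraction(1, k + n - 2), rhs))
    return polys[:kmax + 1]

def ode_residual(n, k, p):
    """Coefficients of (1 - x^2) p'' - (n - 1) x p' + k (k + n - 2) p."""
    d1 = poly_diff(p)
    d2 = poly_diff(d1)
    term1 = poly_add(d2, poly_scale(-1, poly_mul_x(poly_mul_x(d2))))
    term2 = poly_scale(-(n - 1), poly_mul_x(d1))
    term3 = poly_scale(k * (k + n - 2), p)
    return poly_add(poly_add(term1, term2), term3)

if __name__ == "__main__":
    for n in (3, 4, 5):
        for k, p in enumerate(legendre(n, 5)):
            ok = all(c == 0 for c in ode_residual(n, k, p))
            print(f"n={n}, k={k}: differential equation satisfied: {ok}")
```

For $n = 3$ the recurrence reduces to the familiar one for the ordinary Legendre polynomials, so `legendre(3, 2)[2]` is the coefficient list of $(3x^2 - 1)/2$.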
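Aside from the Radon transformation, of particular interest for geometric applications are the cosine transformation and the hemispherical transformation, which are defined, respectively, by
$$C(F)(u) = \int_{S^{n-1}} |u \cdot v|\, F(v)\, d\sigma(v), \qquad H(F)(u) = \int_{S^{n-1}} \tau(u \cdot v)\, F(v)\, d\sigma(v),$$
where $\tau(x) = 1$ if $x > 0$ and $\tau(x) = 0$ if $x \le 0$. The Radon transformation can be obtained as a limit of transformations of this type: if $\Phi_\varepsilon(x) = 1$ for $|x| \le \varepsilon$ and $\Phi_\varepsilon(x) = 0$ otherwise, then
$$R(F)(u) = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon\, \sigma_{n-1}} \int_{S^{n-1}} \Phi_\varepsilon(u \cdot v)\, F(v)\, d\sigma(v).$$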
One of the important features of the Funk–Hecke theorem is that it enables one to explicitly determine the expansion of the transformed function in terms of spherical harmonics if the corresponding expansion of the original function is known. For example, if $F \sim \sum_k Q_k$ then $C(F) \sim \sum_k \lambda_{n,k} Q_k$, where $\lambda_{n,k} = 0$ if $k$ is odd, and, up to sign and a positive factor depending only on $n$,
$$\lambda_{n,k} = \frac{1 \cdot 3 \cdot 5 \cdots (k-3)}{(n+1)(n+3)\cdots(n+k-1)}$$
if $k$ is even. Of particular importance is the fact that $\lambda_{n,k} \ne 0$ if $k$ is even. Similar representations can be found for $H(F)$ and $R(F)$ with the result that $H(F) \sim \sum_k \mu_{n,k} Q_k$ with $\mu_{n,k} \ne 0$ if $k$ is odd, and that the coefficients in the corresponding expansion of $R(F)$ are $\ne 0$ if $k$ is even. Hence, the transformations $C$ and $R$ are injective for even functions, and $H$ is injective for odd functions. Moreover, if $F$ is even and $G$ is another even function on $S^{n-1}$ then careful estimations of the explicit representation of the respective coefficients $\lambda_{n,k}$ allow one to derive estimates of the type
$$\|F - G\| \le \gamma\, \|C(F) - C(G)\|^{\beta}$$
with a positive exponent $\beta$, where $\gamma$ depends only on $n$, $\langle F, \Delta_o F \rangle$, and $\langle G, \Delta_o G \rangle$. This fact is of importance for proving stability results that will be mentioned below. Remember that if $h$ is the support function of a convex body $K$, then $\langle h, \Delta_o h \rangle = n(n-1) W_{n-2}(K) - (n-1) \|h\|^2$, which shows that $|\langle h, \Delta_o h \rangle|$ has an upper bound depending only on $n$ and the size of $K$. Corresponding statements hold for the transformations $H$ and $R$.
8 Geometric applications of the transformations $R$, $H$, and $C$

I will now discuss some geometric problems that can be solved using the analytic assertions of the preceding section. Historically of great importance for this topic are the papers of Funk [3], [4]. Applications of the Radon transformation have already been discussed, but here is another example. Let $K$ be a convex body in $\mathbb{R}^n$ ($n \ge 3$) that is centrally symmetric with respect to some point $p$. Clearly, if $K$ is a ball then every intersection of $K$ with a plane through $p$ has the same $(n-1)$-dimensional volume. To determine if the converse of this statement is true, let $u \in S^{n-1}$ and let $H(u)$ denote the plane through $p$ that is orthogonal to $u$. Then the property that $K \cap H(u)$ has constant volume can be expressed by the relation
$$\int_{S(u)} r(v)^{n-1}\, d\sigma(v) = c$$
with some constant $c$ and $r(v)$ denoting the radial distance from $p$ in the direction $v$ to the boundary of $K$. Equivalently, this can be stated as $R(r^{n-1}) = c$. Since $K$ was assumed to be symmetric, the function $r^{n-1}$ is even, and the injectivity statement regarding the Radon transformation shows that $r^{n-1}$, and therefore $r$ itself, must be constant. Thus we obtain the theorem that a centrally symmetric convex body $K$ must be a ball if every plane through the center of $K$ intersects the body in a set of constant volume. This theorem can be generalized in various directions. For example, one may consider two centrally symmetric convex bodies with the property that for every direction the corresponding planes through the respective centers of these bodies intersect in sets of equal volume. Then it follows that the two bodies must be translates of each other. Letting one of the bodies be a ball, one obtains again the previous theorem.
Similarly as in the case of geometric inequalities one can formulate a corresponding stability problem: How much does $K$ deviate from a ball if the volume of all pertinent sections is close to a constant? Using estimates of the type mentioned at the end of the previous section one can show that if the volumes of the sections differ by at most $\varepsilon$ ($\varepsilon \le 1$) then there is a ball $B$ such that $\delta(K, B) \le \gamma\, \varepsilon^{\beta}$ with a positive exponent $\beta$ and with $\gamma$ depending only on $n$ and the inradius and circumradius of $K$.
As an application of the hemispherical transformation consider the following problem. Let $K$ be a convex body in $\mathbb{R}^n$ ($n \ge 2$). Clearly, if $K$ is centrally symmetric with respect to a point $p$, then every plane through $p$ divides $K$ into two parts of equal volume. Let us consider the converse of this theorem. Thus, $K$ is assumed to have the property that there is a point $p$ such that every plane through $p$ divides $K$ into two parts of equal volume. Using the radial function $r$ and the function $\tau$ that has been employed in the definition of the hemispherical transformation, one can reformulate this condition by writing
$$\int_{S^{n-1}} \tau(u \cdot v)\, r(v)^n\, d\sigma(v) = \int_{S^{n-1}} \tau(u \cdot v)\, r(-v)^n\, d\sigma(v),$$
or, equivalently, by stating that $H(r(v)^n - r(-v)^n) = 0$. Since the function $r(v)^n - r(-v)^n$ is odd, it follows that $r(v) = r(-v)$, which shows that $K$ is symmetric with
respect to $p$. It is also possible to prove various generalizations and corresponding stability versions of this result. Moreover, one can show an analogous assertion with the volume replaced by the surface area.

The most prominent example illustrating an application of the cosine transformation concerns projections of convex bodies. To describe this matter let again $K_u$ signify the orthogonal projection of $K$ onto $u^\perp$. If $V_{n-1}(K_u)$ signifies the volume of the $(n-1)$-dimensional body $K_u$ it is not difficult to see that $V_{n-1}(K_u)$, considered as a function of $u$, does not determine $K$ uniquely. In fact, if $n = 2$ then any two convex domains $K$, $L$ of the same constant width have the property that $V_1(K_u) = V_1(L_u)$, and for arbitrary $n$ typical examples are provided by non-spherical convex bodies of constant brightness, where $V_{n-1}(K_u)$ is called the 'brightness' of $K$ in the direction $u$. The problem I want to discuss here is whether centrally symmetric convex bodies are uniquely determined (up to translations) by their brightness. More precisely, if $K$ and $L$ are two centrally symmetric convex bodies such that $V_{n-1}(K_u) = V_{n-1}(L_u)$ (for all $u \in S^{n-1}$), does this imply that $L$ is a translate of $K$? To transform this problem into an analytical problem that can be solved by the tools on hand, let us first note that (under suitable regularity assumptions)
$$V_{n-1}(K_u) = \frac{1}{2} \int_{\partial K} |v \cdot u|\, ds(v),$$
where $ds(v)$ indicates the area differential at the point $x(v)$ on the boundary $\partial K$ of $K$ determined by the support plane corresponding to the outer normal vector $v$ of $K$. This integral can also be written in the form
$$\frac{1}{2} \int_{S^{n-1}} |u \cdot v|\, R_K(v)\, d\sigma(v),$$
where $1/R_K(v)$ is the Gaussian curvature at the point $x(v)$. Hence, if $V_{n-1}(K_u) = V_{n-1}(L_u)$ then $C(R_K) = C(R_L)$. Since, due to the assumed central symmetry, $R_K$ and $R_L$ are even functions it follows that $R_K = R_L$, that is, for every direction $K$ and $L$ have the same Gaussian curvature. This, as already shown by Minkowski, implies that $L$ and $K$ are translates of each other. One can also formulate a corresponding stability problem. But this problem is much more difficult than the stability problems mentioned in the previous examples, and a satisfactory result has been found only in recent times.
9 Rotors and related bodies

Let $T$ be a convex polytope and $K$ a convex body in $\mathbb{R}^n$. Assume that $K$ is inscribed in $T$ (or, as one may also say, $T$ is circumscribed about $K$). That means that $K \subset T$ and that each ($(n-1)$-dimensional) face of $T$ contains a point of $K$. The body $K$ is said to be a rotor in $T$ if $K$ can be arbitrarily rotated while remaining inscribed in $T$. More precisely, for every rotation $\rho$ there should exist a translation vector $q_\rho$ so that $\rho K + q_\rho$ is inscribed in $T$. Most results that have been found concerning rotors have been proved by the use of spherical harmonics. In fact, this is a topic where spherical harmonics are apparently unavoidable. It turns out that the situation is completely different depending on whether $n = 2$ or $n > 2$.

Assume first that $K$ is a convex domain in $\mathbb{R}^2$ that is inscribed in a convex polygon $T$, and that $T$ is equiangular, meaning that all interior angles of $T$ are equal. Instead of subjecting $K$ to a rotation suppose that, equivalently, $T$ is rotated. To describe the situation more formally assume that $T$ has $m$ sides and is rotated about $o$ through an angle $\theta$ resulting in a polygon $T'$. Let $T^\theta$ denote the polygon circumscribed about $K$ with sides parallel to the respective sides of $T'$. Clearly, $K$ is a rotor in $T$ exactly if every $T^\theta$ is congruent to $T$. But let us first characterize those domains $K$ for which all polygons $T^\theta$ have the same perimeter. Doing some elementary geometry and assuming that $T$ has one side that is perpendicular to the $x$-axis, one finds that the perimeter of $T^\theta$ can be expressed in terms of the support function $h$ of $K$ by the formula
$$P(T^\theta) = 2 \tan\frac{\pi}{m} \sum_{j=0}^{m-1} h\Big(\theta + \frac{2\pi j}{m}\Big).$$
Thus, all polygons $T^\theta$ have the same perimeter exactly if the right hand side of this equality is constant. Using again the Fourier series
$$h(\theta) \sim \sum_{k=0}^{\infty} (a_k \cos k\theta + b_k \sin k\theta)$$
and substituting this into the previous formula for $P(T^\theta)$ one finds that all circumscribed equiangular $m$-gons of $K$ have the same perimeter if and only if
$$a_m = a_{2m} = a_{3m} = \cdots = 0, \qquad b_m = b_{2m} = b_{3m} = \cdots = 0.$$
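The perimeter criterion can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the perimeter formula $P(T^\theta) = 2\tan(\pi/m)\sum_{j=0}^{m-1} h(\theta + 2\pi j/m)$ as given, represents $h$ by finitely many Fourier coefficients, and measures how much the perimeter varies with $\theta$. The function names and the sample coefficients are hypothetical; the coefficients are chosen small enough that $h + h'' > 0$, so that $h$ is the support function of a convex domain.

```python
import math

def support(theta, a0, coeffs):
    """h(theta) = a0 + sum_k (a_k cos(k theta) + b_k sin(k theta)).

    coeffs maps an order k >= 1 to the pair (a_k, b_k).
    """
    return a0 + sum(a * math.cos(k * theta) + b * math.sin(k * theta)
                    for k, (a, b) in coeffs.items())

def perimeter(theta, m, a0, coeffs):
    """Perimeter of the circumscribed equiangular m-gon rotated by theta."""
    s = sum(support(theta + 2 * math.pi * j / m, a0, coeffs) for j in range(m))
    return 2 * math.tan(math.pi / m) * s

def perimeter_spread(m, a0, coeffs, samples=360):
    """Difference between the largest and smallest sampled perimeter."""
    values = [perimeter(2 * math.pi * i / samples, m, a0, coeffs)
              for i in range(samples)]
    return max(values) - min(values)

if __name__ == "__main__":
    m = 4
    # Only orders 3 and 5 are present, none of them a multiple of m.
    print(perimeter_spread(m, 1.0, {3: (0.05, 0.0), 5: (0.0, 0.02)}))
    # Order 4 is a multiple of m, so the perimeter is expected to vary.
    print(perimeter_spread(m, 1.0, {4: (0.05, 0.0)}))
```

In the first case the spread vanishes up to rounding error, in agreement with the condition on the coefficients of order $m, 2m, 3m, \ldots$; in the second case the perimeter oscillates with $\cos 4\theta$.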
Following a similar line of reasoning one can deduce that all circumscribed equiangular $m$-gons of $K$ are regular exactly if $a_k = 0$ and $b_k = 0$ whenever $k$ is not congruent to 0, 1, or $-1$ modulo $m$. Since $K$ is evidently a rotor in a regular $m$-gon if all equiangular circumscribed $m$-gons are regular and have the same perimeter, one obtains a necessary and sufficient condition that $K$ is a rotor in a regular $m$-gon. Excluding $a_0$ one can summarize the patterns of the Fourier coefficients characterizing these properties as follows:

Condition that all equiangular circumscribed $m$-gons have the same perimeter:
$$a_1, \ldots, a_{m-1}, 0, a_{m+1}, \ldots, a_{2m-1}, 0, a_{2m+1}, \ldots, a_{3m-1}, 0, a_{3m+1}, \ldots$$
$$b_1, \ldots, b_{m-1}, 0, b_{m+1}, \ldots, b_{2m-1}, 0, b_{2m+1}, \ldots, b_{3m-1}, 0, b_{3m+1}, \ldots$$

Condition that all equiangular circumscribed $m$-gons are regular:
$$a_1, 0, \ldots, 0, a_{m-1}, a_m, a_{m+1}, 0, \ldots, 0, a_{2m-1}, a_{2m}, a_{2m+1}, 0, \ldots, 0, a_{3m-1}, a_{3m}, a_{3m+1}, 0, \ldots$$
$$b_1, 0, \ldots, 0, b_{m-1}, b_m, b_{m+1}, 0, \ldots, 0, b_{2m-1}, b_{2m}, b_{2m+1}, 0, \ldots, 0, b_{3m-1}, b_{3m}, b_{3m+1}, 0, \ldots$$

Condition that $K$ is a rotor in a regular $m$-gon:
$$a_1, 0, \ldots, 0, a_{m-1}, 0, a_{m+1}, 0, \ldots, 0, a_{2m-1}, 0, a_{2m+1}, 0, \ldots, 0, a_{3m-1}, 0, a_{3m+1}, 0, \ldots$$
$$b_1, 0, \ldots, 0, b_{m-1}, 0, b_{m+1}, 0, \ldots, 0, b_{2m-1}, 0, b_{2m+1}, 0, \ldots, 0, b_{3m-1}, 0, b_{3m+1}, 0, \ldots$$

Hence, every regular polygon has infinitely many incongruent rotors. (Note that the non-zero coefficients cannot be chosen arbitrarily since the resulting function $h$ must be the support function of a convex domain. But if these coefficients are small enough in relationship to $a_0$ this is always the case.) It is worth mentioning, as one can infer from these patterns, that there are convex domains such that all circumscribed equiangular $m$-gons are regular but not of the same size. All these results (with some unnecessary regularity restrictions) were found by Meissner [11] in 1909.

Several years after Meissner's work, Fujiwara [2] studied rotors in arbitrary polygons. I quickly mention here the pertinent major results. If $T$ is a triangle with all angles rational multiples of $\pi$ then it has infinitely many (non-congruent) rotors. The Fourier coefficients determined by the support function form a similar pattern as indicated above for rotors in regular polygons. (This is not surprising since in this case the three sides of the triangle contain three sides of a regular polygon.) If $T$ has an angle that is not a rational multiple of $\pi$ then $T$ has no non-circular rotor.

This assertion about rotors in triangles can be used to establish a necessary and sufficient condition that an arbitrary polygon $T$ has non-circular rotors. It can be formulated as follows: $T$ has non-circular rotors if and only if it either is a rhombus
or else is a polygon that has an inscribed circular disc, and all its angles are rational multiples of $\pi$.

Let me now mention some results concerning the three-dimensional case. First I discuss rotors in (regular) tetrahedra. The problem is to find out whether tetrahedra have non-spherical rotors, and, if possible, to describe the support function of all such rotors. To determine if a given convex body $K$ in $\mathbb{R}^3$ with support function $h$ is such a rotor, it is again advantageous to assume that $K$ is fixed and to investigate if all circumscribed tetrahedra are of the same size. Let $T$ be such a tetrahedron with faces $F_0, F_1, F_2, F_3$ and consider rotations $\rho$ of $T$ that leave the plane containing $F_0$ invariant. If $u_0, u_1, u_2, u_3$ are the four respective directions that are orthogonal to these faces then
$$h(u_0) + h(\rho u_1) + h(\rho u_2) + h(\rho u_3) = 4r,$$
where $r$ is the inradius of $T$. This condition has to be combined with the expansion
$$h(u) \sim \sum_{k=0}^{\infty} Q_k(u).$$
But to arrive at concrete results one has to introduce spherical coordinates, with $u_0$ indicating the '$z$-axis,' and use the explicit representation
$$h(u) = h(\theta, \varphi) \sim \sum_{k=0}^{\infty} \sum_{j=0}^{k} (a_{kj} \cos j\theta + b_{kj} \sin j\theta)\, P_k^j(\cos\varphi).$$
Then one obtains after some technical manipulations the condition
$$\sum_{k=0}^{\infty} Q_k(u_0)\, (3 P_k(-1/3) + 1) = 4r.$$
Since $u_0$ can be arbitrary, it follows that for $k > 0$ only those $Q_k$ can appear for which $3 P_k(-1/3) = -1$. It can be shown that this happens only if $k = 1$, $2$, or $5$. Noting also that $Q_0 = r$ one finds that if $K$ is a rotor in $T$ then its support function must be of the form $r + Q_1 + Q_2 + Q_5$. It turns out that this condition is also sufficient. This is shown by investigating the explicit representation of functions of this type. This investigation reveals that $Q_5$ may appear because of a peculiar property of the fifth Legendre polynomial, namely the fact that $P_5'''(-1/3) = 0$. Since an octahedron can be viewed as an intersection of two tetrahedra, and since an octahedron has parallel faces, it follows that $K$ is a rotor in an octahedron exactly if it is both a rotor in a tetrahedron and of constant width. Because the expansion in spherical harmonics of the support function of a body of constant width cannot contain any even terms (except $Q_0$), it follows that $K$ is a rotor in an octahedron exactly if its support function is of the form $r + Q_1 + Q_5$. Thus, to make a provoking statement emphasizing the relationship between analysis and geometry, one may say that the reason why octahedra have non-spherical rotors is that $P_5'''(-1/3) = 0$. It can be shown that icosahedra and dodecahedra have no non-spherical rotors. Obviously,
$K$ is a rotor in a cube exactly if it is of constant width. These properties of rotors in the Platonic polyhedra were found by Meissner [12] in 1918.

What can be said about non-spherical rotors in other polyhedra in $\mathbb{R}^3$? The answer (but not the proof) is simple: They don't exist. There is of course the trivial exception of convex bodies of constant width. They are obviously rotors in any parallelepiped with the property that all pairs of parallel faces are in planes of equal distance. Finally, addressing the problem of rotors in $\mathbb{R}^n$ with $n > 3$ one can show that the only polytopes that permit non-spherical rotors are regular simplexes and parallelotopes of the analogous type as the parallelepipeds just mentioned. This result was proved by Schneider [15] in 1971. Actually, Schneider considers not only rotors in polytopes but also rotors in unbounded polytopal sets. For example, one may remove one face from a polytope that has a non-spherical rotor and this results in a polytopal set that has the same convex body as a rotor. For instance, convex bodies of constant width are rotors in a 'slab' bounded by two parallel planes. If such polytopal sets are permitted (and trivial situations excluded) the results are essentially the same as in the case of ordinary polytopes except that in $\mathbb{R}^3$ an additional possibility appears. It is a strange polytopal cone with a square cross section. Schneider's work on rotors in $\mathbb{R}^n$ is technically rather involved and depends on intricate properties of spherical harmonics that have not been discussed here. It is one of the most sophisticated applications of spherical harmonics for geometric purposes that has been produced during the past 100 years.
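The numerical facts about the ordinary Legendre polynomials that enter the discussion of rotors in tetrahedra can be confirmed in exact rational arithmetic. The following is a minimal sketch and not part of the original argument: it builds $P_k$ from the standard recurrence $(k+1)P_{k+1} = (2k+1)\,x\,P_k - k\,P_{k-1}$, searches a finite range of $k$ (here $1 \le k \le 30$, an arbitrary cut-off) for solutions of $3P_k(-1/3) = -1$, and evaluates derivatives of $P_5$ at $-1/3$. The helper names are illustrative.

```python
from fractions import Fraction

def legendre_coeffs(kmax):
    """Ordinary Legendre polynomials P_0..P_kmax as exact coefficient lists
    (coefficients in increasing degree)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, kmax):
        # (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        xpk = [Fraction(0)] + polys[k]
        prev = polys[k - 1] + [Fraction(0)] * (len(xpk) - len(polys[k - 1]))
        polys.append([((2 * k + 1) * a - k * b) / (k + 1)
                      for a, b in zip(xpk, prev)])
    return polys[:kmax + 1]

def evaluate(p, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x ** i for i, c in enumerate(p))

def derivative(p, order=1):
    """Coefficient list of the derivative of the given order."""
    for _ in range(order):
        p = [i * c for i, c in enumerate(p)][1:]
    return p

if __name__ == "__main__":
    P = legendre_coeffs(30)
    x = Fraction(-1, 3)
    print([k for k in range(1, 31) if 3 * evaluate(P[k], x) == -1])
    print(evaluate(derivative(P[5], 3), x))
```

Within the searched range the only solutions are $k = 1, 2, 5$, and the third derivative of $P_5$ vanishes at $-1/3$, while its first derivative does not.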
10 Remarks on the literature concerning applications of Fourier series and spherical harmonics in convexity

A survey of the material discussed here and many other pertinent results, together with references (but no proofs), has been published in the Handbook of Convex Geometry [7]. The reader who is interested not only in the results but also in the proofs may consult my book [8]. For the general theory of convex sets the classic reference is the book of Bonnesen and Fenchel [1], which also exists now in English translation. A more modern and comprehensive work is the book by Schneider [16]. Regarding the general area of the stability of geometric inequalities, mentioned in Sections 2 and 3, see my survey paper [6]. The problem area regarding the recovery of properties of convex bodies from intersections with planes or from projections (as discussed in Section 8) now forms part of the area called 'geometric tomography.' The standard reference for this topic is the book [5] of Gardner.

Acknowledgements. I wish to thank the organizers of the Workshop on Fourier Analysis and Convexity, Professors L. Brandolini, L. Colzani, A. Iosevich, and G. Travaglini, for inviting me to present the lectures on the subject of this paper. I am also grateful to Professors G. D. Chakerian and L. Wallen for suggesting various improvements in the original manuscript.
References

[1] Bonnesen, T. and Fenchel, W., Theorie der konvexen Körper, Ergebn. d. Math., Bd. 3, Springer Verlag, 1934. (Engl. transl.: Theory of Convex Bodies, BCS Assoc., 1987.)
[2] Fujiwara, M., Über die einem Vielecke eingeschriebenen und umdrehbaren konvexen geschlossenen Kurven, Sci. Rep. Tôhoku Univ. 4 (1915), 43–55.
[3] Funk, P., Über Flächen mit lauter geschlossenen geodätischen Linien, Math. Ann. 74 (1913), 278–300.
[4] Funk, P., Über eine geometrische Anwendung der Abelschen Integralgleichung, Math. Ann. 77 (1915), 129–135.
[5] Gardner, R., Geometric Tomography, Cambridge University Press, 1995.
[6] Groemer, H., Stability of geometric inequalities. In Handbook of Convex Geometry (Section 1.4), North Holland Publ., 1993.
[7] Groemer, H., Fourier series and spherical harmonics in convexity. In Handbook of Convex Geometry (Section 4.8), North Holland Publ., 1993.
[8] Groemer, H., Geometric Applications of Fourier Series and Spherical Harmonics, Cambridge University Press, 1996.
[9] Hurwitz, A., Sur le problème des isopérimètres, C. R. Acad. Sci. Paris 132 (1901), 401–403; Math. Werke, 1. Bd., Birkhäuser, 1932, pp. 490–491.
[10] Hurwitz, A., Sur quelques applications géométriques des séries de Fourier, Ann. Sci. École Normale Sup. (3) 19 (1902), 357–408; Math. Werke, 1. Bd., Birkhäuser, 1932, pp. 509–554.
[11] Meissner, E., Über die Anwendung von Fourier-Reihen auf einige Aufgaben der Geometrie und Kinematik, Vierteljahresschrift d. naturforschenden Gesellsch. Zürich 54 (1909), 309–329.
[12] Meissner, E., Über die durch reguläre Polyeder nicht stützbaren Körper, Vierteljahresschrift d. naturforschenden Gesellsch. Zürich 63 (1918), 544–551.
[13] Minkowski, H., On convex bodies of constant width (Russian), Mat. Sb. 25 (1904–1906), 505–508. (German transl.: Über die Körper konstanter Breite, Ges. Abh. 2. Bd., 277–279, Teubner, 1911.)
[14] Radon, J., Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, Verh. Sächs. Akad. Leipzig, Math.-Phys. Kl. 69 (1917), 262–277.
[15] Schneider, R., Gleitkörper in konvexen Polytopen, J. Reine Angew. Math. 248 (1971), 193–220.
[16] Schneider, R., Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993.
Fourier Analytic Methods in the Study of Projections and Sections of Convex Bodies

A. Koldobsky$^1$, D. Ryabogin$^2$, and Artem Zvavitch$^3$

1 Department of Mathematics, University of Missouri, Columbia, MO 65211, USA
  [email protected]
2 Department of Mathematics, Kansas State University, Manhattan, KS 66506, USA
  [email protected]
3 Department of Mathematics, Kent State University, Kent, OH 44442, USA
  zvavitch@math.kent.edu
Summary. A Fourier analytic approach to sections and projections of convex bodies has recently been developed and led to several results, including unified analytic solutions to the Busemann–Petty and Shephard problems, characterizations of intersection and projection bodies, extremal sections and projections of certain classes of bodies. The idea is to express certain geometric properties of convex bodies in terms of the Fourier transform, and then use methods of Fourier analysis to solve geometric problems. In this article, we outline the main features of this approach emphasizing similarities between results for sections and projections.

It was noticed long ago that many results on sections and projections are dual to each other, although methods used in the proofs are quite different and don't use the duality of underlying structures directly. In the paper [KRZ], the authors attempted to start a unified approach connecting sections and projections, which may eventually explain these mysterious connections. The idea is to use the recently developed Fourier analytic approach to sections of convex bodies (a short description of this approach can be found in [K5]) as a prototype of a new approach to projections. The first results seem to be quite promising. The crucial role in the Fourier approach to sections belongs to certain formulas connecting the volume of sections with the Fourier transform of powers of the Minkowski functional. An analog of these formulas for the case of projections was found in [KRZ] and connects the volume of projections to the Fourier transform of the curvature function. This formula was applied in [KRZ] to give a new proof of the result of Barthe and Naor on the extremal projections of $\ell_p$-balls with $p > 2$, which is similar to the proof of the result on the extremal sections of $\ell_p$-balls with $0 < p < 2$ in [K3]. Another application is to the Shephard problem, asking whether bodies with smaller hyperplane projections necessarily have smaller volume. The problem was solved independently by Petty and Schneider, and the answer is affirmative in dimension two and negative in dimensions three and higher. The paper [KRZ] gives a new Fourier analytic solution to this problem that essentially follows the Fourier analytic solution to the Busemann–Petty problem (the section counterpart of Shephard's problem) from [K1]. The transition in the Busemann–Petty problem occurs between dimensions four and five. In Section 4, we show that the transition in both problems has the same explanation based on similar Fourier analytic characterizations of intersection and projection bodies.

The goal of this survey is to bring together certain aspects of the Fourier approaches to sections and projections, in order to emphasize the similarities between the results and the proofs. We do not include the proofs and refer the reader to [K5] and [KRZ] for complete proofs, other related results and references.
1 Volume and the Fourier transform

We start with the necessary notation and definitions. The Minkowski functional of a convex body $K$ is defined by
$$\|x\|_K = \min\{a \ge 0 : x \in aK\}.$$
Observe that $K = \{x : \|x\|_K \le 1\}$. If $\chi$ is the indicator function of the interval $[-1, 1]$ then
$$\mathrm{Vol}_{n-1}(K \cap \xi^\perp) = \int_{x \in \mathbb{R}^n:\ x \cdot \xi = 0} \chi(\|x\|_K)\, dx.$$
Passing to polar coordinates in the hyperplane $x \cdot \xi = 0$, we get the polar formula for the volume of sections:
$$\mathrm{Vol}_{n-1}(K \cap \xi^\perp) = \frac{1}{n-1} \int_{S^{n-1} \cap \xi^\perp} \|\theta\|_K^{-n+1}\, d\theta.$$
It is important to note that the right-hand side of the above formula is the spherical Radon transform of $\|\cdot\|_K^{-n+1}$.

The surface area measure $S(K, \cdot)$ of a convex body $K$ in $\mathbb{R}^n$ is defined as follows: for every Borel set $E \subset S^{n-1}$, $S(K, E)$ is equal to the Lebesgue measure of the part of the boundary where normal vectors belong to $E$ (see, for example [Ga3], page 351). The well-known Cauchy formula ([Ga3], page 361) expresses the volume of projections of the body $K$ as the cosine transform of the surface area measure:
$$\mathrm{Vol}_{n-1}(K|\theta^\perp) = \frac{1}{2} \int_{S^{n-1}} |\theta \cdot v|\, dS(K, v), \qquad \theta \in S^{n-1}.$$
For our needs, it is enough to consider bodies with absolutely continuous surface area measures. A convex body $K$ is said to have the curvature function
$$f_K(\cdot) : S^{n-1} \to \mathbb{R},$$
if its surface area measure $S(K, \cdot)$ is absolutely continuous with respect to the Lebesgue measure $\sigma_{n-1}$ on $S^{n-1}$, and
$$\frac{dS(K, \cdot)}{d\sigma_{n-1}} = f_K(\cdot) \in L_1(S^{n-1}).$$
We also recall that $f_K(\cdot)$ is the reciprocal Gauss curvature, viewed as a function of the unit normal vector (see [Sc2], page 419). In the next section, we apply this property to compute the volume of projections of $\ell_p$-balls.

By expressing the volume of sections and projections in terms of the spherical Radon and cosine transform, one can reduce geometric problems to the study of these transforms. The Fourier analytic approach to sections and projections is based on relating these transforms to the Fourier transform first, and then applying methods of harmonic analysis to solve geometric problems. In many cases we operate with the Fourier transform of distributions. We denote by $\mathcal{S}$ the space of rapidly decreasing infinitely differentiable functions (test functions) on $\mathbb{R}^n$ with values in $\mathbb{C}$. By $\mathcal{S}'$ we denote the space of distributions over $\mathcal{S}$. Every locally integrable real-valued function $f$ on $\mathbb{R}^n$ with power growth at infinity represents a distribution acting by integration: for every $\varphi \in \mathcal{S}$, $\langle f, \varphi \rangle = \int_{\mathbb{R}^n} f(x) \varphi(x)\, dx$. The Fourier transform of a distribution $f$ is defined by $\langle \hat{f}, \varphi \rangle = \langle f, \hat{\varphi} \rangle$ for every test function $\varphi$.

The following formula expressing the volume of hyperplane sections in terms of the Fourier transform was proved in [K3] by using a connection between the Radon and Fourier transforms of homogeneous distributions: Let $K$ be an origin symmetric star body in $\mathbb{R}^n$ and let $\xi \in S^{n-1}$. Then
$$\mathrm{Vol}_{n-1}(K \cap \xi^\perp) = \frac{1}{\pi(n-1)} \big(\|x\|_K^{-n+1}\big)^{\wedge}(\xi). \tag{3}$$
An analog of this formula for the volume of projections was proved in [KRZ] by relating the cosine and Fourier transforms of homogeneous distributions: Let $K$ be a convex origin symmetric body in $\mathbb{R}^n$ with an absolutely continuous surface area measure. Then
$$\mathrm{Vol}_{n-1}(K|\theta^\perp) = -\frac{1}{\pi}\, \widehat{f_K}(\theta), \qquad \forall \theta \in S^{n-1}. \tag{4}$$
Here $f_K(x) = |x|^{-n-1} f_K(x/|x|)$, $x \in \mathbb{R}^n \setminus \{0\}$, is the extension of $f_K(x)$, $x \in S^{n-1}$, to a homogeneous function of degree $-n-1$. In the case of general convex bodies one has to use the extended surface area measure $S_e(K)$ in place of the curvature function. An interested reader may find a brief description of this and other related concepts in the Appendix at the end of the paper.
2 Extremal sections and projections of $\ell_p$-balls

It has been known for some time that Fourier analytic formulas for the volume of sections are useful in the study of extremal sections of certain bodies. The first formula of this kind, relating the volume of sections of the unit cube $Q_n = [-1/2, 1/2]^n$ to the Fourier transform, was known to Pólya [Po]:
$$\mathrm{Vol}_{n-1}(Q_n \cap \xi^\perp) = \frac{1}{\pi} \int_{-\infty}^{\infty} \prod_{k=1}^{n} \frac{\sin(t\xi_k)}{t\xi_k}\, dt. \tag{5}$$
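Formula (5) lends itself to a quick numerical sanity check. The sketch below is an illustration only, under the assumption that (5) reads $\mathrm{Vol}_{n-1}(Q_n \cap \xi^\perp) = \frac{1}{\pi}\int_{-\infty}^{\infty} \prod_k \frac{\sin(t\xi_k)}{t\xi_k}\,dt$. It approximates the integral by Simpson's rule on a truncated interval for $n = 3$ and $\xi = (1,1,1)/\sqrt{3}$, where the central section of the unit cube is a regular hexagon of side $\sqrt{2}/2$ and area $3\sqrt{3}/4$. The function names, the truncation point and the step count are arbitrary choices.

```python
import math

def sinc(x):
    """sin(x)/x with the removable singularity at 0 filled in."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def cube_section_volume(xi, t_max=1000.0, steps=200_000):
    """Approximate (1/pi) * integral over R of prod_k sinc(t * xi_k) dt.

    The integrand is even, so Simpson's rule is applied on [0, t_max]
    and the result is doubled. `steps` must be even. The truncation is
    adequate for n >= 3 non-zero coordinates, where the integrand decays
    at least like t**(-3).
    """
    h = t_max / steps

    def f(t):
        return math.prod(sinc(t * x) for x in xi)

    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return 2.0 * (total * h / 3.0) / math.pi

if __name__ == "__main__":
    xi = (1 / math.sqrt(3),) * 3
    print(cube_section_volume(xi))   # numerical value of the integral
    print(3 * math.sqrt(3) / 4)      # area of the hexagonal section
```

The two printed numbers agree to several decimal places. For directions with only two non-zero coordinates the integrand decays more slowly, and a much larger truncation interval would be needed.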
This formula has many applications, the most remarkable of them being the result of Ball [Ba1] that the maximal volume of hyperplane sections of the cube (in every dimension) is $\sqrt{2}$ and is attained at $\xi = (1/\sqrt{2}, 1/\sqrt{2}, 0, \ldots, 0)$. It is worth mentioning that finding the smallest hyperplane section of the cube is much easier and does not involve the Fourier transform (the minimal section is the one parallel to the face, as was first proved by Hadwiger [Ha]). An analog of formula (5) for $\ell_p$-balls
$$B_p^n = \{x \in \mathbb{R}^n : \|x\|_p = (|x_1|^p + \cdots + |x_n|^p)^{1/p} \le 1\}$$
was established by Meyer and Pajor [MP] for $1 \le p \le 2$ using probabilistic methods:
$$\mathrm{Vol}_{n-1}(B_p^n \cap \xi^\perp) = \frac{p}{\pi(n-1)\,\Gamma((n-1)/p)} \int_0^{\infty} \prod_{k=1}^{n} \gamma_p(t\xi_k)\, dt, \tag{6}$$
where $\gamma_p$ is the Fourier transform of the function $\exp(-|\cdot|^p)$ on $\mathbb{R}$. It was shown in [K3] that the latter formula works for all $p \in (0, \infty)$ and is a direct consequence of formula (3). In particular, this formula allows us to find the extremal sections of $B_p^n$ when
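$0 < p < 2$. Barthe and Naor [BN] found the extremal projections of $B_p^n$ for $p > 2$. The result is "dual" to that for sections: for every $\theta \in S^{n-1}$,
$$\mathrm{Vol}_{n-1}(B_p^n|\theta_{\min}^\perp) \le \mathrm{Vol}_{n-1}(B_p^n|\theta^\perp) \le \mathrm{Vol}_{n-1}(B_p^n|\theta_{\max}^\perp),$$
where $\theta_{\min} = (1, 0, \ldots, 0)$ and $\theta_{\max} = (1/\sqrt{n}, \ldots, 1/\sqrt{n})$. The proof in [BN] is based on a formula similar to (6), formula (7), which for every $\theta \in S^{n-1}$ expresses $\mathrm{Vol}_{n-1}(B_p^n|\theta^\perp)$ through an integral in $t$ of an expression formed from the product over $k = 1, \ldots, n$ of the values of a function $\gamma$ at $t\theta_k$; here $1/p + 1/p^* = 1$ and $\gamma$ is the Fourier transform of a function on $\mathbb{R}$ determined by $p^*$. The proof of this formula in [BN] uses probabilistic arguments.

The rest of the proof in [BN] is similar to that for sections: for $p > 2$, the function $\gamma(\sqrt{\cdot})$ is log-convex on $[0, \infty)$, which together with (7) immediately implies the result. It was shown in [KRZ] that formula (7) follows directly from (4), which makes the proof for projections similar ("dual") to that for sections.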
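In fact, $f_K$ is the reciprocal Gauss curvature of $K$, viewed as a function of the unit normal vector. Thus, computing the Gauss curvature of $B_p^n$ one gets (see [KRZ]):
$$f_{B_p^n}(\theta) = (p^* - 1)^{n-1}\, \|\theta\|_{p^*}^{(n-1) - np^*} \prod_{k=1}^{n} |\theta_k|^{p^* - 2},$$
so (7) can be proved by computing the Fourier transform of the extension of this function to a homogeneous function of degree $-n-1$.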
3 The Busemann–Petty and Shephard problems

The Shephard problem (see [Sh]) reads as follows. Let $K$, $L$ be convex symmetric bodies in $\mathbb{R}^n$ and suppose that, for every $\theta \in S^{n-1}$,
$$\mathrm{Vol}_{n-1}(K|\theta^\perp) \le \mathrm{Vol}_{n-1}(L|\theta^\perp).$$
Does it follow that $\mathrm{Vol}_n(K) \le \mathrm{Vol}_n(L)$? The problem was solved independently by Petty [P] and Schneider [Sc1], who showed that the answer is affirmative if $n \le 2$ and negative if $n \ge 3$. Ball [Ba2] proved that it is necessary to multiply $\mathrm{Vol}_n(L)$ by $\sqrt{n}$ to make the answer affirmative in all dimensions. A version of the Shephard problem with lower dimensional projections was solved by Goodey and Zhang [GZ].

One of the main steps in the solution of the Shephard problem is a connection to projection bodies found by Schneider [Sc1]. Recall that an origin symmetric convex body $L$ in $\mathbb{R}^n$ is called a projection body if there exists another convex body $K$ so that the support function of $L$ in every direction is equal to the volume of the hyperplane projection of $K$ to this direction: for every $\theta \in S^{n-1}$,
$$h_L(\theta) = \mathrm{Vol}_{n-1}(K|\theta^\perp).$$
The support function $h_L(\theta) = \max_{x \in L} (\theta, x)$ is equal to the dual norm $\|\theta\|_{L^*}$, where $L^*$ stands for the polar body of $L$. Schneider [Sc1] discovered that if $L$ is a projection body then the answer to the Shephard problem is affirmative for every $K$, and, on the other hand, if $K$ is not a projection body one can perturb it to construct a body $L$ giving together with $K$ a counterexample. Therefore, the answer to the Shephard problem in $\mathbb{R}^n$ is affirmative if and only if every symmetric convex body in $\mathbb{R}^n$ is a projection body.

The section counterpart of Shephard's problem is the Busemann–Petty problem, posed in 1956 (see [BP]). Suppose that $K$ and $L$ are origin symmetric convex bodies in $\mathbb{R}^n$ such that $\mathrm{Vol}_{n-1}(K \cap H) \le \mathrm{Vol}_{n-1}(L \cap H)$ for every central hyperplane $H$ in $\mathbb{R}^n$. Does it follow that $\mathrm{Vol}_n(K) \le \mathrm{Vol}_n(L)$? The answer is affirmative if $n \le 4$ and negative if $n \ge 5$. The solution appeared as the result of a sequence of papers [LR], [Ba1], [Gi], [Bo], [Lu], [Pa], [Ga1], [Ga2], [Zh2], [K4], [K5], [Zh3], [GKS] (see [Zh3] and [GKS] for historical details).

The class of intersection bodies introduced by Lutwak [Lu] in 1988 plays the same role in the solution of the Busemann–Petty problem as projection bodies in the solution to Shephard's problem. Let $K$ and $L$ be symmetric star bodies in $\mathbb{R}^n$. We say that $K$ is the intersection body of $L$ if the radius of $K$ in every direction is equal to the $(n-1)$-dimensional volume of the central hyperplane section of $L$ perpendicular to this direction, i.e. for every $\xi \in S^{n-1}$,
$$\|\xi\|_K^{-1} = \mathrm{Vol}_{n-1}(L \cap \xi^\perp).$$
A more general class of intersection bodies can be defined as the closure of intersection bodies of star bodies in the radial metric $d(K, L) = \sup_{\xi \in S^{n-1}} \big| \|\xi\|_K^{-1} - \|\xi\|_L^{-1} \big|$. Lutwak [Lu] found the following connection between intersection bodies and the Busemann–Petty problem (the original result of Lutwak was slightly improved in [Ga1] and [Zh1]): if $K$ is an intersection body then the answer to the BP problem is affirmative for every $L$, and, on the other hand, if $L$ is not an intersection body one can perturb it to construct a body $K$ giving together with $L$ a counterexample. Therefore, the answer to the Busemann–Petty problem in $\mathbb{R}^n$ is affirmative if and only if every symmetric convex body in $\mathbb{R}^n$ is an intersection body.

We would like to mention several facts concerning intersection and projection bodies, which can be found in [K4], [K2]: The unit ball of any $n$-dimensional subspace of $L_p$ with 0 … 0, such that
$\Lambda = \bigcup_{j=1}^{J} (\alpha_j \mathbb{Z} + \beta_j)$. That is, tiling sets for compactly supported tiles in dimension 1 are finite unions of complete arithmetic progressions.
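As a small illustration of this structure, the sketch below checks numerically that one specific tile tiles the line with a union of two complete arithmetic progressions. The tile (the indicator of $[0,1) \cup [2,3)$), the set $4\mathbb{Z} \cup (4\mathbb{Z}+1)$ and all function names are assumptions chosen for the example; they do not come from the text.

```python
from fractions import Fraction

def tile(x):
    """Illustrative tile f: indicator of [0,1) U [2,3) (an assumption, not from the text)."""
    return 1 if (0 <= x < 1) or (2 <= x < 3) else 0

def tiling_level(x, progressions, reach=10):
    """Sum of f(x - lam) over lam in the union of the progressions alpha*Z + beta.

    The sum is truncated to |k| <= reach, which is exact for the sample
    points used below because f has compact support.
    """
    return sum(tile(x - (alpha * k + beta))
               for alpha, beta in progressions
               for k in range(-reach, reach + 1))

# Lambda = 4Z U (4Z + 1): a finite union of complete arithmetic progressions.
LAMBDA = [(4, 0), (4, 1)]
SAMPLES = [Fraction(n, 7) for n in range(-70, 71)]  # exact rational sample points

levels = {tiling_level(x, LAMBDA) for x in SAMPLES}
levels_single = {tiling_level(x, [(4, 0)]) for x in SAMPLES}
print("levels seen with 4Z U (4Z+1):", levels)
print("levels seen with 4Z alone:", levels_single)
```

A constant level over all sample points is consistent with a tiling; dropping one of the two progressions leaves gaps, which shows up as more than one level.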
The Idempotent theorem, the Bohr group and Meyer's theorem
This extreme structure is, in the end, a consequence of P.J. Cohen's idempotent theorem on a general abelian group [Coh59].
Theorem 4. (Cohen, 1959) If $\mu \in M(G)$ is a finite measure on a locally compact abelian group G, such that $\widehat{\mu}$ takes only finitely many values, then, for any such value c, the set $S = \{\gamma \in \widehat{G} : \widehat{\mu}(\gamma) = c\}$ belongs to the open coset ring of $\widehat{G}$. The (open) coset ring is defined below.
Definition 4. (The coset ring of a group) The coset ring of an abelian group G is the smallest collection of subsets of G which is closed under finite unions, finite intersections and complements and which contains all cosets of G. For a topological group G the smallest ring of subsets of G which contains all open cosets is called the open coset ring of G.
Cohen's theorem therefore says that S can be constructed with finitely many set-theoretic operations from the open cosets of $\widehat{G}$. The group $\widehat{G}$ is called the dual group of G and is the group of continuous characters on G, that is, the group of all continuous group homomorphisms $G \to \mathbb{C}$ with the group operation being the pointwise multiplication. It can be proved that $\widehat{\widehat{G}}$ is isomorphic
M.N. Kolountzakis
(as a topological group) with G (Pontryagin duality) and that is compact if and only Some dual group pairs are the following: if G is discrete. Further, = x (Z", Td). (Z, T), (R, R), (Zn, Zn), (Rd. If is a finite measure on G its Fourier transform is a continuous function on G defined by JG
the integration carried out with respect to the essentially unique translation invariant measure on G called the Haar measure. For example, when G = R the Haar measure = is the Lebesgue measure and (The reader should consult [R62) for the basic definitions and facts about Fourier analysis on locally compact abelian groups.) We do not use Cohen's theorem directly, but rather a consequence of it discovered by Y. Meyer [Mey7OJ.
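Theorem 5. (Meyer, 1970) Let $\Lambda \subseteq \mathbb{R}^d$ be a discrete set and $\delta_\Lambda$ be the Radon measure
$$\delta_\Lambda = \sum_{\lambda \in \Lambda} c_\lambda \delta_\lambda, \quad c_\lambda \in S,$$
where $S \subseteq \mathbb{C} \setminus \{0\}$ is a finite set. Suppose that $\delta_\Lambda$ is tempered, and that $\widehat{\delta_\Lambda}$ is a Radon measure on $\mathbb{R}^d$ which satisfies
$$|\widehat{\delta_\Lambda}|(B_R(0)) \le C_1 R^d, \quad \text{as } R \to \infty,$$
where $C_1 > 0$ is a constant. Then, for each $s \in S$, the set
$$\Lambda_s = \{\lambda \in \Lambda : c_\lambda = s\}$$
is in the coset ring of $\mathbb{R}^d$.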
Proof. Let E Cr(B1 (0)), = 1, so that its Fourier transform satisfies for all a > 0. For positive integers n define the functions
=
> n we have (using the
fact
that
as
The Study of Translational Thing with Fourier Analysis 11Z1(B2k+I(O)
\
j
2(k-4-l)d
Cn
C1,
which, together with (7), shows that the sequence 11Z1(R") is bounded.
Notice also that = c, if x A and is 0 otherwise. This is a consequence of the fact that A is discrete and the support of shrinks to 0. We now use the following properties of Rd, the Bohr compactification of R", a locally compact abelian group. 1. R'1 is the dual group of the d-dimensiona] Euclidean space with the discrete topology. Therefore Rd is a compact group that is the dual group of a discrete group. 2. R" C Rd as topological spaces and Rd is dense in Identifying the continuous functions on Rd with bounded continuous functions on Rd we get that
C(Rd) ç
L00(Rti)
is a Banach space inclusion.
Since the measures are unifonnly bounded they act on all bounded continuous functions on Ri', and consequently also on all continuous functions on Rd. That is, they constitute a uniformly bounded family of linear functionals on C(Rd). By the Banach—Alaoglu theorem there exists a measure v on Rd such that for every call it again I E C(R') there is a subsequence of such that —+
v(f),
as n —+ Co.
Applying this with each character of W' in place off we obtain that
D(x)= lim
if—xEA,
and IsO otherwise. the finite range —S. By Theorem 4 the set —A, and Since thus A, belongs to the open coset ring of has the discrete topology the open coset ring is the same as the coset ring of
Since we need to know what kind of sets the elements of the coset ring of Rd are, we use the following general theorem [KOOa], which says that discrete elements of the coset ring can always be constructed from discrete cosets using finitely many unions, intersections and complementations.
Theorem 6. (Kolountzakis, 2000) Let G be a topological abelian group and let $\mathcal{R}$ be the least ring of sets which contains the discrete cosets of G. Then $\mathcal{R}$ contains all discrete elements of the coset ring of G.
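In dimension 1 this implies the following result by Rosenthal [Ros66].
Theorem 7. (Rosenthal, 1966) The elements of the coset ring of $\mathbb{R}$ which are discrete in the usual topology of $\mathbb{R}$ are precisely the sets of the form
$$F \,\triangle\, \bigcup_{j=1}^{J} (\alpha_j \mathbb{Z} + \beta_j),$$
where $F \subseteq \mathbb{R}$ is finite, $J \in \mathbb{N}$, $\alpha_j > 0$ and $\beta_j \in \mathbb{R}$ ($\triangle$ denotes symmetric difference).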
Getting structure in dimension 1
In this section we prove Theorem 3. Assume that $\Lambda \subseteq \mathbb{R}$ is a set of bounded density and that $f + \Lambda = \ell\mathbb{R}$ for a function $f \in L^1$ of compact support, contained in, say, $(-A, A)$. We will use (2), so the first thing to do is to obtain information on the set
$$Z(\widehat{f}) = \{\widehat{f} = 0\}.$$
We look at the Fourier transform of f defined on the complex numbers
$$\widehat{f}(z) = \int_{\mathbb{R}} f(x) e^{-2\pi i z x}\, dx, \quad (z \in \mathbb{C}).$$
Since f is supported in $(-A, A)$ it follows that $\widehat{f}$ is entire so that $Z(\widehat{f})$ is a discrete subset of $\mathbb{R}$. Furthermore $\widehat{f}$ satisfies the growth bound
$$|\widehat{f}(z)| \le \int_{-A}^{A} |f(x)| e^{2\pi |x| |z|}\, dx \le \|f\|_1 e^{2\pi A |z|}.$$
If N(T) counts the number of zeros of $\widehat{f}$ in the disk $\{z : |z| \le T\}$, an application of Jensen's formula gives
$$\limsup_{T \to \infty} \frac{N(T)}{T} < \infty.$$
Write B for the discrete set $\{0\} \cup Z(\widehat{f})$ so that by (2) the tempered distribution $\widehat{\delta_\Lambda}$ is supported on B. It is well known, and easy to prove, that a tempered distribution supported at a single point b is necessarily a finite linear combination of derivatives of $\delta_b$, and the same proof gives
$$\widehat{\delta_\Lambda} = \sum_{b \in B} P_b(\partial)\, \delta_b.$$
Here
$$P_b(\partial) = \sum_{j=0}^{N} c_j^{(b)} \partial^j$$
is a differential polynomial operator applied on the Dirac point mass at b. (The degree N can be taken the same for all $b \in B$ as any tempered distribution has finite degree. This is not used below.)
Fig. 6. Picking out the distribution at b by applying it on $\phi(t(x - b))$. For large t the other points of set B are left out and the behavior at b is isolated.
Step 1 All $P_b$ are constants (hence $\widehat{\delta_\Lambda}$ is locally a measure)
Focus on a single $b \in B$ and let $\phi$ be a smooth function of compact support. Examine the quantity $I(t) = \widehat{\delta_\Lambda}(\phi(t(x - b)))$, $(t \to \infty)$, as shown in Figure 6. For large t this equals
(4(t(x
—
b))) =
— b)))
= j=1 Choose
=
(— I)' to get the above expression equal to
j=1 Next we will bound the growth of 1(t).
g(x) =
— b)),
By duality
I1(t)I =
=
=
tM':
C+C..ii n=jtj
= O(../i). We used the bounded density
of A for the convergence of the sum
and the
fact that
= for any M > 0 we wish. We took M = 3/2. Since 1(t) cannot even grow linearly it follows that the degree N is zero and we can now write
$$\widehat{\delta_\Lambda} = \sum_{b \in B} c_b \delta_b$$
for some constants $c_b$.
Step 2 The coefficients $c_b$ are uniformly bounded.
To prove this we are just a bit more careful in the last estimate and now use a $\phi$ which is 1 at 0. For large t then
$$c_b = \widehat{\delta_\Lambda}(\phi(t(x - b))),$$
and one can get a bound for this by duality which does not involve t at all, using the exponent M = 2 instead of M = 3/2 in (9).
Step 3 Use of Meyer's theorem
Now the crucial condition
$$|\widehat{\delta_\Lambda}|((-R, R)) \le CR$$
in Meyer's theorem holds (remember there is a linear number of zeros and at each one we have a bounded mass), hence, by Rosenthal's Theorem 7,
$$\Lambda = F \,\triangle\, \bigcup_{j=1}^{J} (\alpha_j \mathbb{Z} + \beta_j)$$
for some real numbers $\alpha_j > 0$, $\beta_j$ and finite set F.
Step 4 F is empty
Otherwise $\widehat{\delta_\Lambda}$ would have a continuous part, a trigonometric polynomial due to F. But it cannot have such a continuous part as its support is discrete.
Open Problem 1 Is the main theorem true if f is only supposed to be in $L^1$ but not of compact support? What if f is an indicator function?
1.4 Structure of some polygonal tilings in dimension 2
The one-dimensional tiling problem treated in the previous section is very particular. One cannot expect this rigid structure in higher dimensions. For example, even when the tile is a square in two dimensions, one cannot expect every tiling by it to be fully periodic, in the sense of possessing a period lattice of full rank. One can, after all, make vertical columns of squares which can be shifted vertically within themselves, arbitrarily, preserving the tiling property (see Figure 1 (a)). It is clear that there is no horizontal period here, in general. One might suspect that there is always, no matter what the tile, at least one period, but this phenomenon, if true, must happen only in dimension 2. In dimension 3 one can construct cube tilings with no periods at all. First make horizontal layers of cubes, some of which have no period along the x-axis and some others having no period along the y-axis. Consider these tiled slabs as rigid bodies and move each of them by an arbitrary horizontal vector, thereby destroying all vertical periods as well.
Open Problem 2 If $E \subseteq \mathbb{R}^2$, is it true that in any tiling $E + \Lambda = \mathbb{R}^2$ the set $\Lambda$ must possess at least one period vector?
The main difficulty in dimensions 2 and higher is that the zero set of $\widehat{f}$ is not a discrete set any more, at least under no set of reasonable assumptions about f (such as compact support was in dimension 1). Therefore, from our basic condition (2) one obtains that $\widehat{\delta_\Lambda}$ is supported, in general, on a subset of the plane, which, under some reasonable assumptions, is a collection of submanifolds of codimension 1. The structure of such distributions is of course much richer than that of distributions supported at points, and this is the main source of difficulty, at least compared with the one-dimensional problem. In this section we will show the following result [K00a] in two dimensions.
Theorem 8. (Kolountzakis, 2000) Suppose that P is a symmetric convex polygon in the plane which tiles (multiply) with the multiset $\Lambda$:
$$P + \Lambda = m\mathbb{R}^2,$$
at some integer level m. If P is not a parallelogram then $\Lambda$ is a finite union of two-dimensional lattices.
The convexity assumption here is only used to guarantee that each edge direction appears in the polygon exactly twice. For a more general theorem see [K00a].
If one tries to use (2) directly, one encounters the problems mentioned above, mainly the fact that the zero set of $\widehat{\chi_P}$ is not discrete, but rather a one-dimensional set.
Let $e_1$ and $e_2$ be two edges of the polygon P of the same direction u. By the symmetry of P they have the same length. We can then write (here $e_1$ and $e_2$ are viewed as point sets in $\mathbb{R}^2$ and r as a vector)
$$e_2 = e_1 + r,$$
for some $r \in \mathbb{R}^2$. (For each set A and vector x we write $A + x = \{a + x : a \in A\}$.)
Fig. 7. The measure $\mu_u$ supported on two parallel edges of the polygon, $e_1$ and $e_2$, with opposite sign on each edge.
Let then $\mu_u$ be the measure which is equal to arc length on $e_1$ and negative arc length on $e_2$ (see Figure 7). Since every part of a translate of $e_1$ in the tiling $P + \Lambda$ has to be cancelled by part of a copy of $e_2$ it follows that
$$\sum_{\lambda \in \Lambda} \mu_u(x - \lambda)$$
is the zero measure in $\mathbb{R}^2$. It is also intuitively obvious that the vanishing of the above measure for all relevant directions u (i.e. those appearing as edge directions) also implies tiling at some integer level. So a convex symmetric polygon P tiles multiply with a multiset $\Lambda$ if and only if for each pair e and e + r of parallel edges of P
$$\sum_{\lambda \in \Lambda} \mu_e(x - \lambda) = 0, \quad (10)$$
where $\mu_e$ is the measure in $\mathbb{R}^2$ that is arc length on e and negative arc length on e + r. Condition (10) then becomes $\mu_e * \delta_\Lambda = 0$ or, taking Fourier transforms (arguing as in §1.2),
$$\mathrm{supp}\, \widehat{\delta_\Lambda} \subseteq Z(\widehat{\mu_e}),$$
for all edge directions e.
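The shape of the zero set $Z(\widehat{\mu_e})$
Here we study the zero set of $\widehat{\mu_e}$ and determine its structure. We first calculate $\widehat{\mu_e}$ in the particular case when e is parallel to the x-axis, for simplicity. Let $\mu \in M(\mathbb{R}^2)$ be the measure defined by duality by
$$\mu(\phi) = \int_{-1/2}^{1/2} \phi(x, 0)\, dx.$$
That is, $\mu$ is the arc length on the line segment joining the points $(-1/2, 0)$ and $(1/2, 0)$. Calculation gives
$$\widehat{\mu}(\xi, \eta) = \frac{\sin \pi \xi}{\pi \xi}.$$
Notice that $\widehat{\mu}(\xi, \eta) = 0$ is equivalent to $\xi \in \mathbb{Z} \setminus \{0\}$. If $\mu_L$ is the arc length measure on the line segment joining $(-L/2, 0)$ and $(L/2, 0)$ we have
$$\widehat{\mu_L}(\xi, \eta) = \frac{\sin \pi L \xi}{\pi \xi}, \quad Z(\widehat{\mu_L}) = \{(\xi, \eta) : \xi \in L^{-1}\mathbb{Z} \setminus \{0\}\}.$$
Write $r = (a, b)$ and let $\mu_{L,r}$ be the measure which is the arc length on the segment joining $(-L/2, 0)$ and $(L/2, 0)$ translated by r/2 and the negative arc length on the same segment translated by -r/2. That is, we have
$$\mu_{L,r} = \mu_L * (\delta_{r/2} - \delta_{-r/2}),$$
and, taking Fourier transforms, we get
$$\widehat{\mu_{L,r}}(\xi, \eta) = -2i\, \frac{\sin \pi L \xi}{\pi \xi}\, \sin \pi(a\xi + b\eta).$$
Define $u = (a, b)/(a^2 + b^2)$ and $v = (1/L, 0)$. It follows that ($u^\perp$ is a unit vector orthogonal to u)
$$Z(\widehat{\mu_{L,r}}) = (\mathbb{Z}u + \mathbb{R}u^\perp) \cup ((\mathbb{Z} \setminus \{0\})v + \mathbb{R}v^\perp).$$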
(Each of the two summands in the union above corresponds to one of the factors in the formula for $\widehat{\mu_{L,r}}$.) This is a set of straight lines of direction $u^\perp$ spaced by $|u|$ and containing 0 plus a similar set of lines of direction $v^\perp$, spaced by $|v|$ and containing zero. However in the latter set of parallel lines the straight line through 0 has been removed (see Figure 8). We state this as a theorem for later use, formulated in a coordinate-free way.
Definition 5. (Geometric inverse of a vector) The geometric inverse of a non-zero vector $u \in \mathbb{R}^2$ is the vector
$$u^* = \frac{u}{|u|^2}.$$
Theorem 9. Let e and e + r be two parallel line segments (translated by r, of magnitude and direction described by e, symmetric with respect to 0). Let also $\mu_{e,r}$ be the measure which charges e with its arc length and e + r with its negative arc length. Then
$$Z(\widehat{\mu_{e,r}}) = (\mathbb{Z}r^* + \mathbb{R}r^{*\perp}) \cup ((\mathbb{Z} \setminus \{0\})e^* + \mathbb{R}e^{*\perp}).$$
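The description of the zero set can be probed numerically. The sketch below evaluates the closed form of the Fourier transform of the measure that is arc length on a horizontal segment of length L shifted by r/2 and negative arc length on the same segment shifted by -r/2, under the convention $\widehat{\mu}(\xi) = \int e^{-2\pi i \langle x, \xi\rangle}\, d\mu(x)$, which is an assumption about the normalization. The function names and the sample values L = 2, r = (1, 3) are hypothetical.

```python
import cmath
import math

def mu_hat(xi, eta, L, a, b):
    """Fourier transform of mu_{L,r} with r = (a, b), assuming the
    e^{-2 pi i <x, xi>} convention: segment factor times shift factor."""
    seg = L if xi == 0 else math.sin(math.pi * L * xi) / (math.pi * xi)
    phase = math.pi * (a * xi + b * eta)
    return seg * (cmath.exp(-1j * phase) - cmath.exp(1j * phase))

def geometric_inverse(v):
    """u* = u / |u|^2 for a non-zero planar vector."""
    n2 = v[0] ** 2 + v[1] ** 2
    return (v[0] / n2, v[1] / n2)

def perp(v):
    """A vector orthogonal to v (not normalized; only its direction matters)."""
    return (-v[1], v[0])

L, r = 2.0, (1.0, 3.0)
u = geometric_inverse(r)   # lines Z u + R u_perp
v = (1.0 / L, 0.0)         # lines (Z \ {0}) v + R v_perp

first_family = [abs(mu_hat(n * u[0] + t * perp(u)[0], n * u[1] + t * perp(u)[1], L, *r))
                for n in range(-3, 4) for t in (-1.3, 0.4, 2.7)]
second_family = [abs(mu_hat(m * v[0] + t * perp(v)[0], m * v[1] + t * perp(v)[1], L, *r))
                 for m in (-2, -1, 1, 2) for t in (-1.3, 0.4, 2.7)]
off_zero_set = abs(mu_hat(0.0, 0.1, L, *r))  # on the removed line through 0
print(max(first_family), max(second_family), off_zero_set)
```

Values on the two families of lines should vanish up to rounding, while the sample point on the removed line through the origin should not.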
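Fig. 8. The zero set $Z(\widehat{\mu_{L,r}}) = (\mathbb{Z}u + \mathbb{R}u^\perp) \cup ((\mathbb{Z} \setminus \{0\})v + \mathbb{R}v^\perp)$, with $u = (a, b)/(a^2 + b^2)$ and $v = (1/L, 0)$.
Completion of the argument
The intersection of all the relevant $Z(\widehat{\mu_e})$ is easily shown to be a discrete set, except when P is a parallelogram.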
To conclude the argument we show that (a) the tempered distribution $\widehat{\delta_\Lambda}$ is locally a measure, and (b) the point masses of $\widehat{\delta_\Lambda}$ are uniformly bounded. This is accomplished using the following two theorems.
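Theorem 10. Suppose that $\Lambda \subseteq \mathbb{R}^d$ is a multiset with density $\rho$, $\delta_\Lambda = \sum_{\lambda \in \Lambda} \delta_\lambda$, and that $\widehat{\delta_\Lambda}$ is a measure in a neighborhood of 0. Then $\widehat{\delta_\Lambda}(\{0\}) = \rho$.
Proof. Take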
C°° of compact support with
= lim = lirn
ÔÄ
= AEA
= where, for fixed and large T > 0,
ne?I
AGQ1,
=
1.
We have
5A. and
flEZd. Since A has density p ii follows that for each E > 0 we can choose T large enough
so that for all n
Afl with
Iön
E.
For each n and
PIQnt(l we have
A
=
+
Hence
with IrxI s CT:—'
+ rx)
=
AEQ,,
=
urn
E
+
nEZd
+ urn
t_" nEZd
= urn S1 + urn S2. t-+oo
We have
Si
E
—
The first sum in (13) is a Riemann sum for p IRa
=
p and the second is a Riemann
0. Define —
if 1
the indicator function of R is equal It follows that, with this assignment for the to the function .. . 8d)*(x) of (30) and tiling follows from the previous discussion.
Most likely the extended cubes with an intersection of even codimension do not tile, at least not for general side lengths. This is clear in dimension 2 and it is conceivable that some combinatorial argument could easily show this in any dimension. The Fourier analysis approach does not seem to be very helpful when one tries to disprove that something is a translational tile.
Open Problem 4 In the setting of Theorem 18 prove that when the codimension is even then the set $Q \cup R$ is not a tile.
2.3 The Steinhaus tiling problem
The original, two-dimensional case
Steinhaus [Mos81, problem 59] asked whether there is a planar set S which, no matter how translated and rotated, always contains exactly one point with integer coordinates.
Definition 6. (Steinhaus property) A set $S \subseteq \mathbb{R}^2$ has the Steinhaus property if for every $x \in \mathbb{R}^2$ and for every rotation
$$A_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
we have
$$\#(\mathbb{Z}^2 \cap (A_\theta S + x)) = 1, \quad (33)$$
where $A_\theta S + x = \{A_\theta s + x : s \in S\}$.
Sierpiński [Sie59] first proved that a set which is bounded and either open or closed cannot have the Steinhaus property. Croft [Cro82] and Beck [Bec89] proved the same of any set which is bounded and measurable. (Croft's approach is more direct and geometric. Beck uses Fourier analysis.) Ciucu [Ciu96] shows that any Steinhaus set must have an empty interior, without assuming boundedness. Several variations of the problem have been investigated by Komjáth [Kom92] from a rather different point of view, where one places a different subgroup of the plane in place of $\mathbb{Z}^2$. Very recently it was shown by Jackson and Mauldin [JM02] that Steinhaus sets do indeed exist. But the construction there does not furnish measurable such sets and it is precisely under the assumption of measurability that we study the existence problem for Steinhaus sets here, using Fourier analysis. To begin, notice that the question of Steinhaus can be rephrased as follows:
(a) Is there a set E which tiles the plane if translated at any rotated copy of $\mathbb{Z}^2$?
(b) Or, is there a common set of coset representatives (fundamental domain) of all groups $R_\theta \mathbb{Z}^2$ in the group $\mathbb{R}^2$?
We only care for measurable Steinhaus sets (if they exist) so tiling, above, is to be interpreted in the almost everywhere sense, as it is normally interpreted throughout this survey. As first noticed by Beck [Bec89], the Steinhaus question in the form (a) is equivalent to asking if there exists a measurable set $E \subseteq \mathbb{R}^2$, of measure 1, such that the Fourier transform of its indicator function vanishes on all circles of the plane which are centered at the origin and pass through some point of the integer lattice $\mathbb{Z}^2$. This is so since for a set to have the Steinhaus property it must tile the plane when translated by any rotation of $\mathbb{Z}^2$ (this alone implies of course that $|E| = 1$). These sets are lattices, hence this is equivalent to $\widehat{\chi_E}$ vanishing on all these lattices, which are self-dual. The union of these rotated lattices is precisely the set of circles mentioned above. We state this as a theorem.
Theorem 19. A measurable set $E \subseteq \mathbb{R}^2$ is simultaneously a tile for all rotations of $\mathbb{Z}^2$ if and only if it has measure 1 and its Fourier transform vanishes on all circles with center at the origin and radius of the form $\sqrt{m^2 + n^2}$, with $m, n \in \mathbb{N}$, not both 0.
It is now easy to see that such sets cannot be bounded, if they exist. Indeed, the restriction onto any line L through 0 of $\widehat{\chi_E}$ is nothing but the one-dimensional Fourier transform of the function $\chi_E$ projected onto L, i.e., of the function
$$f(t) = \int_{L^\perp} \chi_E(tu + s)\, ds,$$
where u is a unit vector on L and $L^\perp$ is the line through 0 which is orthogonal to L. But if E is bounded the function f(t) has compact support, hence $\widehat{f}(t) = \widehat{\chi_E}(tu)$ is an entire function of exponential type, and, as such, it should have at most $C \cdot R$ zeros in the interval $(-R, R)$, where $C > 0$ is a constant. (See the discussion in §1.3.) However, the number of zeros of $\widehat{f}$ in this interval is twice the number of circles out to radius R, or, in other words, twice the number of integers expressible as a sum of two integer squares and of size up to $R^2$. But this number is almost quadratic in R. It is a well-known result of Landau [Fri82] that it is $\sim C R^2 / \sqrt{\log R}$.
With a more careful and quantitative approach along similar lines, but not using entire functions, it was then proved by the author [K96] that any set E with the Steinhaus property must be large at infinity:
J
dx = oo, for any a>
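The counting claim behind the unboundedness argument (the number of integers up to $R^2$ that are sums of two squares grows almost quadratically in R, hence faster than any linear bound $C \cdot R$) can be explored with a short script. This is only a numerical sketch; the function name and the sample radii are assumptions made for the example.

```python
import math

def count_sums_of_two_squares(limit):
    """Count integers n with 1 <= n <= limit that can be written as a^2 + b^2."""
    representable = [False] * (limit + 1)
    a = 0
    while a * a <= limit:
        b = a
        while a * a + b * b <= limit:
            representable[a * a + b * b] = True
            b += 1
        a += 1
    return sum(representable[1:])  # skip n = 0

for R in (10, 50, 100, 200):
    circles = count_sums_of_two_squares(R * R)  # circles of positive radius <= R
    # The ratio to R keeps growing, while the Landau-normalized ratio
    # circles * sqrt(log R) / R^2 varies much more slowly.
    print(R, circles, circles / R, circles * math.sqrt(math.log(R)) / R ** 2)
```

The third column growing without bound is the numerical face of the contradiction with the linear zero count of an entire function of exponential type.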
With much more care the following theorem was obtained in [KW99] by the author and T. Wolff.
Theorem 20. (Kolountzakis and Wolff, 1997) If $E \subseteq \mathbb{R}^2$ is a measurable Steinhaus set then $\int_E |x|^\alpha\, dx = \infty$, for all $\alpha > 46/27$.
The number 46/27 comes from the best known estimate for the circle problem. This is the problem where one asks for the best upper estimates on the error term E(R) (as $R \to \infty$) in the expression
$$N(R) = \pi R^2 + E(R),$$
where N(R) is the number of integer lattice points in the disk $\{|x| \le R\} \subseteq \mathbb{R}^2$. Even if the conjectured best possible upper bound $E(R) = O(R^{1/2 + \epsilon})$ is proved, the estimate for the Steinhaus tiling problem in Theorem 20 would only become true for all $\alpha > 1$. So it appears that if one is going to disprove the existence of measurable Steinhaus sets in dimension 2 one needs a rather different approach. This seems to be the state of knowledge for the two-dimensional case.
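For readers who want to see the circle problem concretely, the following sketch counts lattice points in the disk of integer radius R and prints the error term $E(R) = N(R) - \pi R^2$ next to $\sqrt{R}$ for scale. The function names are hypothetical and the restriction to integer radii is a simplifying assumption of the example.

```python
import math

def lattice_points_in_disk(R):
    """N(R): number of integer points (x, y) with x^2 + y^2 <= R^2, for integer R >= 0."""
    total = 0
    for x in range(-R, R + 1):
        y_max = math.isqrt(R * R - x * x)  # largest |y| allowed in this column
        total += 2 * y_max + 1
    return total

def circle_error(R):
    """E(R) = N(R) - pi R^2."""
    return lattice_points_in_disk(R) - math.pi * R * R

for R in (10, 100, 1000):
    print(R, lattice_points_in_disk(R), circle_error(R), math.sqrt(R))
```

Exact integer arithmetic via `math.isqrt` avoids rounding issues on the boundary circle, which matters because boundary points are exactly what the error term is sensitive to.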
The problem in dimension $d \ge 3$
The Steinhaus problem generalizes very naturally to any dimension. One asks for a set $E \subseteq \mathbb{R}^d$ such that no matter what orthogonal linear transformation you apply to it, it still tiles $\mathbb{R}^d$ when translated by $\mathbb{Z}^d$. With precisely the same reasoning as before, one is looking for a measurable set of measure 1 such that the Fourier transform of its indicator function vanishes on all spheres centered at the origin that contain some integer lattice point. It is because of the fact that we know precisely which numbers are representable as sums of three squares that the following result [KW99] holds.
Theorem 21. (Kolountzakis and Wolff, 1997) If $f \in L^1(\mathbb{R}^d)$, $d \ge 3$, and $\widehat{f}$ vanishes on all spheres centered at the origin through some lattice point, then f is a.e. equal to a continuous function. In particular there are no measurable Steinhaus sets in dimension $d \ge 3$.
Here we show an alternative way [KP02] of proving that there are no sets with the Steinhaus property in dimension $d \ge 3$. We emphasize though that Theorem 21 is much stronger than Theorem 22 given below. See also some related results of Mauldin and Yingst [MY02].
Theorem 22. (Kolountzakis and Papadimitrakis, 2000) There are no measurable Steinhaus sets in dimension $d \ge 3$.
Proof. In any dimension d write B for the union of all spheres centered at the origin that go through at least one lattice point. The point 0 is included in B. Assume from now on that the set E is a Steinhaus set in dimension d. Suppose now that we can find a lattice $\Lambda^* \subseteq B$ with $\det \Lambda^*$ not an integer. Since $\widehat{\chi_E}$ vanishes on $\Lambda^* \setminus \{0\}$ it follows that $E + \Lambda$ is a tiling at level $\ell = |E| \times \mathrm{dens}\, \Lambda = 1 \times \det \Lambda^*$, which is not an integer. This is a contradiction as, obviously, any set may only tile at an integral level. Looking at the quadratic form $\langle A^T A x, x \rangle$ for each lattice $\Lambda^* = A\mathbb{Z}^d$ we summarize the above observations in the following lemma.
Lemma 6. Suppose there exists a positive definite quadratic form $Q(x) = Q(x_1, \ldots, x_d) = \langle Bx, x \rangle$ such that for all integral $x_1, \ldots, x_d$ its value is the sum of d integer squares, and the determinant of Q, $\det B$, is not the square of an integer. Then there are no Steinhaus sets in dimension d.
The case $d \ge 4$: Consider the symmetric $4 \times 4$ matrix B with 1 on the diagonal and 1/2 everywhere else. The matrix B is positive definite (its eigenvalues are 1/2, 1/2, 1/2 and 5/2) and its determinant is 5/16. It defines the quadratic form
$$Q(x) = \sum_{i=1}^{4} x_i^2 + \sum_{i > j} x_i x_j,$$
which is obviously integer valued and has non-square determinant. Furthermore, every non-negative integer may be written as a sum of four squares (Lagrange). From Lemma 6 it follows that there are no Steinhaus sets for d = 4. We easily see that this extends to all higher dimensions by taking as our matrix the identity in one corner of which sits the $4 \times 4$ matrix B described above.
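The numerical claims about the matrix B (determinant 5/16, eigenvalues 1/2 and 5/2, and the explicit expression for the quadratic form) are easy to confirm with exact rational arithmetic. The sketch below does so; the helper names are hypothetical and the elimination routine is a generic one, not something taken from [KP02].

```python
from fractions import Fraction
from itertools import product

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def form_from_matrix(matrix, x):
    """<Bx, x>."""
    return sum(xi * yi for xi, yi in zip(x, apply(matrix, x)))

def form_explicit(x):
    """sum_i x_i^2 + sum_{i>j} x_i x_j."""
    n = len(x)
    return sum(v * v for v in x) + sum(x[i] * x[j] for i in range(n) for j in range(i))

half = Fraction(1, 2)
B = [[Fraction(1) if i == j else half for j in range(4)] for i in range(4)]

print("det B =", determinant(B))
print("B(1,1,1,1) =", apply(B, [1, 1, 1, 1]))      # eigenvector for 5/2
print("B(1,-1,0,0) =", apply(B, [1, -1, 0, 0]))    # eigenvector for 1/2
agree = all(form_from_matrix(B, list(x)) == form_explicit(x)
            for x in product(range(-2, 3), repeat=4))
print("forms agree on a small grid:", agree)
```

Because the determinant 5/16 is not the square of an integer, the hypothesis of Lemma 6 on the determinant is met for this form.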
The case d = 3: The determinant of the form that appears in the following theorem is $2 \cdot 11 \cdot 6$, which is not a square, hence there are no Steinhaus sets in dimension 3.
Theorem 23. For each $x, y, z \in \mathbb{Z}$ the number
$$Q(x, y, z) = 2x^2 + 11y^2 + 6z^2$$
is a sum of three integer squares.
Proof. Suppose this is false and that there are $(x_0, y_0, z_0) \ne (0, 0, 0)$ such that
(a) $Q(x_0, y_0, z_0)$ is not a sum of three squares, and
(b) $x_0^2 + y_0^2 + z_0^2$ is minimal.
From (a), and the well-known characterization of those natural numbers that cannot be written as a sum of three squares, we have that
$$Q(x_0, y_0, z_0) = 4^\nu (8k + 7), \quad \nu \ge 0,\ k \ge 0.$$
If all $x_0, y_0, z_0$ are even, we have $\nu \ge 1$, and, setting $x_0 = 2x_1$, $y_0 = 2y_1$ and $z_0 = 2z_1$, we obtain that $Q(x_1, y_1, z_1)$ is not a sum of three squares, which contradicts the minimality of the initial triple $(x_0, y_0, z_0)$. We conclude that at least one of $x_0, y_0, z_0$ is odd.
Case No. 1: $\nu = 0$. Then $Q(x_0, y_0, z_0) \equiv 7 \bmod 8$. But the quadratic residues mod 8 are 0, 1 and 4, and one checks by examining all the possibilities that Q is never 7 mod 8.
Case No. 2: $\nu = 1$. Then $Q(x_0, y_0, z_0) = 32k + 28$. Hence $y_0$ is even, say $y_0 = 2y_1$. We get
$$x_0^2 + 22y_1^2 + 3z_0^2 = 16k + 14,$$
from which we conclude that $x_0$ and $z_0$ are odd, $x_0 = 2x_1 + 1$, $z_0 = 2z_1 + 1$. Substitution gives
$$4x_1^2 + 4x_1 + 1 + 22y_1^2 + 12z_1(z_1 + 1) + 3 = 16k + 14,$$
$$2x_1(x_1 + 1) + 11y_1^2 + 6z_1(z_1 + 1) + 2 = 8k + 7,$$
$$2x_1(x_1 + 1) + 11y_1^2 + 6z_1(z_1 + 1) \equiv 5 \bmod 8.$$
But $m(m + 1) \equiv 0$ or 2 or 4 or 6 mod 8, for $m \in \mathbb{Z}$, hence, by applying this to the first and last term in the above sum and checking all possibilities, we get a contradiction.
Case No. 3: $\nu \ge 2$. As in Case No. 2: $y_0 = 2y_1$, $z_0 = 2z_1 + 1$, $x_0 = 2x_1 + 1$. Hence
$$2x_1(x_1 + 1) + 11y_1^2 + 6z_1(z_1 + 1) + 2 = 4^{\nu - 1}(8k + 7), \quad \nu - 1 \ge 1.$$
So $y_1$ is even, $y_1 = 2y_2$, which gives
$$x_1(x_1 + 1) + 22y_2^2 + 3z_1(z_1 + 1) + 1 = 2 \cdot 4^{\nu - 2}(8k + 7),$$
a contradiction as the left-hand side is odd while the right-hand side is even.
We point out here that the actual quadratic form was only found by a semi-automated computer search. See [MY02] for a more systematic study of the method.
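A brute-force sanity check of Theorem 23 on a finite box is straightforward and gives a flavor of what a computer search can test. The sketch below is not the search used by the authors, whose details are not given here; it relies on Legendre's three-square criterion (a non-negative integer fails to be a sum of three squares exactly when it has the form $4^\nu(8k+7)$), and all function names are hypothetical.

```python
def is_sum_of_three_squares(n):
    """Legendre's criterion: n >= 0 is a sum of three squares
    unless n = 4^v * (8k + 7)."""
    if n < 0:
        return False
    if n == 0:
        return True
    while n % 4 == 0:
        n //= 4
    return n % 8 != 7

def Q(x, y, z):
    """The ternary form of Theorem 23."""
    return 2 * x * x + 11 * y * y + 6 * z * z

def counterexamples(bound):
    """Triples 0 <= x, y, z <= bound where Q is not a sum of three squares.
    Non-negative triples suffice because Q is even in each variable."""
    return [(x, y, z)
            for x in range(bound + 1)
            for y in range(bound + 1)
            for z in range(bound + 1)
            if not is_sum_of_three_squares(Q(x, y, z))]

print("counterexamples found in the box [0, 15]^3:", counterexamples(15))
```

An empty list is of course no proof; it only confirms that the finite box contains no counterexample, in line with the theorem.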
It is also shown in [KP02] that the method shown above cannot be applied in dimension 2 to show the non-existence of measurable sets with the Steinhaus property.
Theorem 24. (Kolountzakis and Papadimitrakis, 2002) Any positive-definite binary quadratic form whose values are always sums of two integer squares must have a determinant which is the square of an integer.
2.4 Multi-lattice tiles A "finite" Steinhaus problem The Steinhaus question essentially asks if there is a set in the plane which is simultaneously a translational tile for each translation set in the collection
fR9Z2:
0 define the approximate identity *5(x) = E_d*(x/E). Let fE
=
I
and for
ifr5f,
which has rapid decay. IC + A is a tiling. That is, we show that the convolution First we show that (f * is a constant. Let 4' be any Schwartz function. Then .fE
*
= f(8A(Ø(X)) =
is a Schwartz function whose support intersects supp at 0, since, for small enough 0,
The function
supp4,f5 c Hence, for each Schwartz function 4,
IC *5A(4,) =
c
(suppf)(
ç
only
which implies
= fE(O)SA((O}), a.e. (x).
*
We also have that If(x — A)I is finite ac. (see the remark following the definition of tiling), hence, for almost eveiy x E Rd f(x — A) —
If(x —
— A)I =
AEA
—
—
AEA
which tends to 0 as E —+ 0. This proves
f(x —
A)
= 7(0)
a.e. (x).
Convex spectral bodies must be symmetric
Proof of Theorem 32. Write K = which is a symmetric, open convex set. — Assume that A) is a spectral pair. We can clearly assume that 0 E A. It follows that + A isa tiling and hence that A has uniformly bounded density, has density equal to I and SA((O)) = 1. By Theorem 33 (with I = it follows that = *
I
c
(0) U Kc.
LetH = K/2andwrite
f is supported in
and has non-negative Fourier transform
We have
fRd1=f(0=volH 1(0)=fgdf=(voIH)2. By the Brunn—Minkowski inequality for any convex body voR2,
vol
with equality only in the case of symmetric non-symmetric it follows that
volH>1.
Since
Q has been assumed to be
1> consider
p>
1
1
\1/d
g(x) = f(x/p)
which is supported properly inside K, and has
g(O) = f(O) = vol
= pd(voljj)2.
H, f g =
Since supp g is properly contained in K Theorem 35 implies that I + A is a tiling = g(O) = vol H. However, the value oflat 0 is fg = at level dens A = pd(vol H)2 > vol H, and, since i> 0 and us continuous, this is a contradiction. I
fi
3.3 The spectra of the cube
In this section we prove the following [IP98, LRW00, K00b].
Theorem 36. (Iosevich and Pedersen, 1998, Lagarias, Reeds and Wang, 1998, Kolountzakis, 1999) Let $Q = (-1/2, 1/2)^d$ be the unit cube in $\mathbb{R}^d$ and $\Lambda \subseteq \mathbb{R}^d$. Then
$$\Lambda \text{ is a spectrum of } Q \iff Q + \Lambda = \mathbb{R}^d.$$
This had been proved earlier by Jorgensen and Pedersen [JP99] for $d \le 3$.
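A lemma for two different tiles
The following simple result is rather unexpected. It is intuitively clear when $\Lambda$ is a periodic set but it is, perhaps, surprising that it holds without any assumptions on the set $\Lambda$.
Lemma 8. If $f, g \ge 0$, $\int f(x)\, dx = \int g(x)\, dx = 1$ and both $f + \Lambda$ and $g + \Lambda$ are packings of $\mathbb{R}^d$, then $f + \Lambda$ is a tiling if and only if $g + \Lambda$ is a tiling.
Proof. We first show that, under the assumptions of the theorem,
$$f + \Lambda \text{ tiles } -\mathrm{supp}\, g \implies g + \Lambda \text{ tiles } -\mathrm{supp}\, f. \quad (49)$$
Indeed, if $f + \Lambda$ tiles $-\mathrm{supp}\, g$ then
$$1 = \int g(-x) \sum_{\lambda \in \Lambda} f(x - \lambda)\, dx = \sum_{\lambda \in \Lambda} \int g(-x) f(x - \lambda)\, dx,$$
which, after the change of variable $y = -x + \lambda$, gives
$$1 = \int f(-y) \sum_{\lambda \in \Lambda} g(y - \lambda)\, dy.$$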
Fig. 13. Packing of set B, the parallelogram above the shaded triangle, with motions A. The shaded triangle is not covered.
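This in turn implies, since $\sum_{\lambda \in \Lambda} g(y - \lambda) \le 1$, that
$$\sum_{\lambda \in \Lambda} g(y - \lambda) = 1$$
for a.e. $y \in -\mathrm{supp}\, f$.
To complete the proof of the theorem, notice that if $f + \Lambda$ is a tiling of $\mathbb{R}^d$ and $a \in \mathbb{R}^d$ is arbitrary, then both $f + \Lambda$ and $g(x - a) + \Lambda$ are packings and $f + \Lambda$ tiles $-\mathrm{supp}\, g(x - a) = -\mathrm{supp}\, g - a$. We conclude that $g(x - a) + \Lambda$ tiles $-\mathrm{supp}\, f$, or $g + \Lambda$ tiles $-\mathrm{supp}\, f - a$. Since $a \in \mathbb{R}^d$ is arbitrary we conclude that $g + \Lambda$ tiles $\mathbb{R}^d$.
Example. Use Lemma 8 to prove that there is no measurable non-negative function f that tiles with $\Lambda = \mathbb{Z}^d \setminus \{0\}$ (or even minus a set of lower density 0, such as a line). Try to prove this otherwise.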
Failure of the lemma for non-translational tiling Suppose we study tiling where all rigid motions of the tile, and not just translations, are allowed. The analogue of the tiling set then is a set A of rigid motions. For x E Rd and A a rigid motion we denote by A(x) the action of A on x. The following theorem shows that our Lemma 8 is very particular to translations.
Theorem 37. There are two polygons A and B in R2 of the same area and a set of rigid motions A such that both collections {A(A): A E A) and {A(B): A E A) are packings but only one of them is a tiling.
Proof. Take $A = (-1/2, 1/2)^2$ and B to be the parallelogram with vertices $(-1/2, -1/2)$, $(1/2, 0)$, $(1/2, 1)$ and $(-1/2, 1/2)$. Take the set of rigid motions
to be the set of translations by Z2 modified as follows: instead of translating by the elements (0, k), k El,
and define fE by
fE = *€ *xof 121 12), where is a smooth, positive-definite approximate identity supported in BE/z(0). One can easily prove the following proposition. ..+ in V. Ifgn -+ g in L2 then (For the proof just notice the identity
(or fE =
I
1g12 —
=
jg
—g,*)),
integrate and use the triangle and Cauchy—Schwarz inequalities.) Since *E * xrz xcz in L2 (dominated convergence) we have (Parseval) that ...÷ in L2 and, using the proposition above, that in -+ finL'. We also have that
The assumptions of Theorem 38 are therefore satisfied. Combining Theorems 33 and 38 with the above observations we obtain the following characterization of tiling by the function The special form of this function allows us to drop any conditions that are otherwise needed regarding the order (how many derivatives it involves) of the tempered distribution 8A• Theorem 39. Let Q be a bounded open set, A a discrete set in Rd, and A = EAEA Then + A is a tiling (fond only (f A has un(formly bounded density and
=(0}.
Proof of Theorem 36. By a simple calculation we get some
= Suppose first that Q
+A= ç (0) U
is a non-zero integer}
c
(2Q)c.
From Theorem 33 it follows that
c (0) U (Q
— Q)C
and from Theorem 39 we deduce that A is a spectrum of Q.
Conversely assume that A is a spectrum of Q, so that
+ A = Rd. It follows
I and > 0 on Q — Q. that (Q — Q) fl (A — A) = (0) as we have But this means that we have a packing Q + A S Rd. However, A is a tiling set, 12, because it is a spectrum, and there is another object that tiles with A, namely (that is, 1). It follows from Lemma 8 that and this object has the same integral as
Q+A=R"isalsoatiling,aswehadtoprove. 3.4
U
A proof (which just makes it) that the disk Is not spectral
Here we present a proof of why the disk D =
(xi 0 the support of the terms in (2.5) is not small enough to be majorized by standard maximal operators as in the continuous case. However by a similar error estimate one can further reduce the size of the intervals to 0 — that is to the case when e = 0. According to (2.5) one can write M = M,o + and consider the corresponding maximal operators and M'1q separately. Notice that the multiplier mj.o(O) =
is the same as the one appearing in the continuous case, and thus it is majorized by the Hardy—Littlewood maximal operator on 12(Z). Similarly M1q can be majorized by (a sum of) standard maximal operators which act on 12(qZ). To see
this one decomposes the integers (mod q) and sum in each residue class separately first. This is the point at which the arithmetic nature of the problem appears as the size of Gaussian sums becomes essential. To be more precise, let 0 = /3 + a/q for some fixed rational a/q, where 1/31 One writes n := qn + v, v = q and 1
+a/q) =
+ v)
+ ej(/3) = mjq(fi)G(a, q) + eJ(13).
= (2.6)
Here $m_{j,q}$ can be considered as a multiplier on $\ell^2(q\mathbb{Z})$, and the corresponding maximal operator is bounded by $1/q$ (there are just $2^{2j}/q$ elements of $q\mathbb{Z}$ in the support). However the size of the Gaussian sum is about $\sqrt{q}$, giving us a gain of about $q^{-1/2}$ at each rational. For the error one has $|e_j(\beta)| \lesssim 2^{-j/2}$, say, because of the small size of $\beta$, and thus it can be handled again by a square function estimate. The actual proof is technically more involved. One cannot simply add the norms of the maximal operators but first has to group the denominators $q$ into classes $C$, to each class define $Q = \prod_{q \in C} q$, and then decompose $\mathbb{Z}$ modulo $Q$ and assign to it a maximal operator.
Discrete Maximal Functions and Ergodic Theorems
This will be explained together with the passage to the ergodic theorem later in the non-commutative settings; see also [B1].
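As a numerical illustration of the arithmetic factor discussed above, the following Python sketch evaluates the quadratic Gauss sum over the residues $v$ mod $q$ and checks that the phase $e^{2\pi i a n^2/q}$ depends only on $n$ mod $q$, which is what makes the splitting $n = q n_1 + v$ useful. The function names and the normalization by $1/q$ are assumptions made for this illustration; the text's own normalization of $G(a, q)$ may differ.

```python
import cmath
import math


def gauss_sum(a: int, q: int) -> complex:
    """Normalized quadratic Gauss sum (1/q) * sum_{v mod q} exp(2 pi i a v^2 / q).

    The exponent is reduced mod q with integer arithmetic before converting
    to a float, so no precision is lost for large a or v.
    """
    total = sum(cmath.exp(2j * math.pi * ((a * v * v) % q) / q) for v in range(q))
    return total / q


def phase_depends_only_on_residue(a: int, q: int, n: int, tol: float = 1e-6) -> bool:
    """Check that exp(2 pi i a n^2 / q) is unchanged when n is replaced by n mod q."""
    lhs = cmath.exp(2j * math.pi * a * n * n / q)
    rhs = cmath.exp(2j * math.pi * a * (n % q) ** 2 / q)
    return abs(lhs - rhs) < tol
```

For an odd prime $q$ and $a$ not divisible by $q$, the normalized sum has modulus $q^{-1/2}$ (equivalently, the unnormalized sum has modulus $\sqrt{q}$), which is the gain available at each rational.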
2.2 Diophantine equations

If $P(m_1, \dots, m_d)$ is a positive definite polynomial with integer coefficients, then a fundamental problem in number theory is to determine asymptotically the number of integer solutions of the corresponding diophantine equation $P(m) = N$ as $N \to \infty$. A strikingly general result of Birch says that this is possible if $P$ is also homogeneous of degree $k$ and depends essentially on exponentially many variables w.r.t. its degree [Bi]. The precise condition is
$$d - \dim V_P > (k-1)2^k, \qquad (2.7)$$
where $V_P = \{z \in \mathbb{C}^d : P'(z) = 0\}$ is called the complex singular variety of $P$. We will refer to polynomials satisfying all the above conditions as non-degenerate forms.

Suppose there is a commuting family of measure-preserving transformations $T = (T_1, \dots, T_d)$ acting on a finite measure space $(X, \mu)$. Then for each $N$ and $x \in X$ the family $T$ maps the solution set $S_N = \{m \in \mathbb{Z}^d : P(m) = N\}$ into $X$. Indeed let
$$\Omega_{N,x} = \{T^m x = T_1^{m_1} \cdots T_d^{m_d} x : m = (m_1, \dots, m_d) \in S_N\}.$$
Our next theorem says that the sets $\Omega_{N,x}$ become equi-distributed on $X$ as $N \to \infty$ for almost every $x \in X$ if the family $T$ is fully ergodic [M]. For a family $T = (T_1, \dots, T_d)$ this means that for each $q$ and $F \in L^2(X)$: if $T_1^q F = \dots = T_d^q F = F$ then $F$ is a constant. Notice that this means that the family $T^q = \{T_1^q, \dots, T_d^q\}$ is ergodic for each $q$.
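To make the distinction between ergodicity and full ergodicity concrete, here is a toy sketch (not taken from the text) with a single transformation, the shift $x \mapsto x + 1$ on the finite space $\mathbb{Z}/q\mathbb{Z}$ with counting measure. On a finite set the invariant functions are exactly those that are constant on orbits, so a map is ergodic precisely when it has a single orbit. The helper names are hypothetical.

```python
def orbit_count(step: int, q: int) -> int:
    """Number of orbits of the map x -> x + step on Z/qZ."""
    seen = set()
    count = 0
    for x in range(q):
        if x in seen:
            continue
        count += 1
        y = x
        # Follow the orbit of x until it closes up.
        while y not in seen:
            seen.add(y)
            y = (y + step) % q
    return count


def is_ergodic(step: int, q: int) -> bool:
    """On a finite set with counting measure: ergodic iff there is exactly one orbit."""
    return orbit_count(step, q) == 1
```

The shift itself is ergodic, but its $q$-th power is the identity, so this system is ergodic without being fully ergodic. An irrational rotation of the circle, by contrast, is fully ergodic, since every power of it is again an irrational rotation.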
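Theorem 2. Let $T$ be a fully ergodic family and let $P$ be an integral non-degenerate form. Then one has for $F \in L^2(X)$ and for a.e. $x \in X$:
$$\lim_{N \to \infty} \frac{1}{|S_N|} \sum_{m \in S_N} F(T^m x) = \int_X F\,d\mu, \qquad (2.8)$$
and also the averages on the left side converge in the $L^2$ norm:
$$\Big\| \frac{1}{|S_N|} \sum_{m \in S_N} F(T^m x) - \int_X F\,d\mu \Big\|_{L^2} \to 0 \quad \text{as } N \to \infty. \qquad (2.9)$$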
A special case is when $X = \mathbb{T}^d$ is a torus and $T_j$ is the shift by $\alpha_j$ in the $j$-th coordinate, where the $\alpha_j$ are irrational numbers. Then
$$\Omega_{N,0} = \{(n_1\alpha_1, \dots, n_d\alpha_d) : P(n) = N\}.$$
A. Magyar
Note that here the averages are taken over disjoint sets and, assuming only the ergodicity of the family $T$, the averages in (2.8) may not even converge in the $L^2$ norm. As before the crucial point is to prove the $\ell^2$ boundedness of the corresponding discrete maximal operator. In the simplest case, when $P(n) = n_1^2 + \dots + n_d^2$, this is the discrete analogue of Stein's spherical maximal function: $S^* f = \sup_N |S_N f|$, where
$$S_N f(m) = \frac{1}{r_d(N)} \sum_{|n|^2 = N} f(m - n),$$
and $r_d(N)$ is the number of ways of writing $N$ as a sum of $d$ squares. It can be proved that for $d > 4$ the operator is bounded on $\ell^p(\mathbb{Z}^d)$ exactly when $p > d/(d-2)$, and this is sharp. For $d = 4$ one might expect that it is bounded on $\ell^p$ for $p > 2$, at least when the supremum is taken over odd values of $N$; however this remains an open question.
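The normalizing factor $r_d(N)$ is easy to compute for small arguments. The sketch below counts representations by recursion on the last coordinate, and also evaluates the threshold in Birch's condition, assuming it has the form $(k-1)2^k$. For the sum of $d$ squares the degree is $k = 2$ and the gradient vanishes only at the origin, so $\dim V_P = 0$ and the condition reads $d > 4$, in agreement with the range of dimensions quoted above. The function names are illustrative and not taken from the text.

```python
from functools import lru_cache
from math import isqrt


@lru_cache(maxsize=None)
def r(d: int, N: int) -> int:
    """Number of n in Z^d with n_1^2 + ... + n_d^2 = N (signs and order counted)."""
    if N < 0:
        return 0
    if d == 0:
        return 1 if N == 0 else 0
    bound = isqrt(N)
    # Fix the last coordinate k and count representations of the remainder.
    return sum(r(d - 1, N - k * k) for k in range(-bound, bound + 1))


def birch_threshold(k: int) -> int:
    """Assumed right-hand side (k - 1) * 2^k of the non-degeneracy condition."""
    return (k - 1) * 2 ** k
```

For example, $r_2(5)$ counts the eight points $(\pm 1, \pm 2)$, $(\pm 2, \pm 1)$, and for odd $N$ the values of $r_4(N)$ agree with Jacobi's formula $8\sigma(N)$.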
The asymptotic formula. One of the key elements of the proof is an asymptotic formula for the Fourier transform of the solution sets, i.e. for the following exponential sums:
=
,
(2.10)
E
Ii" = R"IZ" is the flat torus. Let us introduce the measure dap(x) = where dSp(x) denotes the Euclidean surface area measure of the level sur}äce P(x) 1, and IP'(x)I is the magnitude of the gradient of the form P. and its Fourier Here
transform:
=
f
(2.11) P(x)=I}
Now we can state the following.
Lemma 2. Let P (m) be a positive integral non-degenerate form of degree k; then there exists > 0, s.t.
K(q,l,
=
—s/q))
q=1 !EZ"
+ Here
is
.
(2.12)
where
a smooth cut-offfuncrion.
The approximation formula (2.12) means that the Fourier transform (of the indicator function) of the solution set $P(m) = N$ is asymptotically a sum over all rational points of pieces of the Fourier transform of a surface measure on the level set $P(x) = N$, multiplied by arithmetic factors and shifted by rationals. We sketch below how to derive formula (2.12) and how to use it to prove the mean ergodic theorem. Let M = N and let be a smooth cut-off function on Rd s.t. = 1 for P(x)
H(qmi +
=
—
1/q,
where
=
M"f
(2.13)
stands for the Fourier transform of H(x, resembling (2.12)
=
This already gives an approximation
G(a, 1.
— 1/q)
+ Error,
q5M'. (oq)=I
,and G(a,1,q) is the normalized JN(')) = exponential sum obtained by collecting the terms depending on s. Notice that the gradient of phase CD(x) = Mx —I/q) in (2.13) is at least MI_E for Thus using partial integration —11>> 1 on the support —1) by making a small error. one can insert the factors Next one uses a uniform estimate for the integral where
codEm Vp
IHOi,
+ MkJpI)
(k_12k
•
(2.14)
This estimate, which is uniform in is far from obvious even in the non-singular case when Vp = 0. It follows from a Weyl-type estimate for exponential sums and a comparison of the integral to these sums it would be desirable to obtain such estimates directly. It allows one to extend the integration in to the whole real line by which is smaller than the main term. making an error One can evaluate the integral in the sense of distributions as follows: INO?)
=
f
f
P(x)=N
=d&p,N(17) = Nd/k_I
(2.15)
Here the third equality is an oscillatory integral expression for the measure $d\sigma_{P,N}$ supported on the surface $P(x) = N$, and the last equality follows by scaling. If we put together these transformations and extend the summation in $q$, which is possible by using standard estimates for the exponential sums $G(a, l, q)$, we get the asymptotic formula (2.12).
The Mean Ergodic Theorem. We illustrate the use of this formula by proving the $L^2$ convergence of the averages in (2.9). First notice that by substituting $\xi = 0$ into (2.12) we get $r_P(N) = |S_N| \approx N^{d/k-1}$, and one can easily show the following.

Proposition 1. Let $\xi \notin \mathbb{Q}^n$, that is assume that $\xi$ has at least one irrational coordinate. Then one has
$$\lim_{N \to \infty} \frac{1}{r_P(N)}\, |\hat{\sigma}_{P,N}(\xi)| = 0. \qquad (2.16)$$
Indeed because of the factors — I) there is at most one non-zero term in the / summation for each q. After normalizing with the factor each term is bounded by
K(q. 1. N) = q_d
I. q) 0 the sum in q is bounded by E. However for fixed q < q( the non-zero term can be for say q estimated by dñ( (N if N is large enough w.r.t. where II) IIqE — Il > 0 because Q". Here we used the uniform decay of IIqEII = the Fourier transform of the measure dap. which can be derived from (2.14). So as N —* oc each term in q individually tends to zero, and this proves (2.16). Now let (X. be a probability measure space, and let T = (T1 . . be a family of commuting, measure-preserving and invertible transformations. By the Spectral theorem there exists a positive Borel measure on the torus fl", s.t. H
.
(P(T1
T,,)f. f) =
for every polynomial P(:1
Zn). where
p(s) =
i,,) =
I
(2.17)
J n.
and $\langle \cdot, \cdot \rangle$ denotes the inner product on $L^2(X, \mu)$. We recall two basic facts.

i) For $r \in \mathbb{T}^n$, $\nu_f(r) > 0$ if and only if $r$ is a joint eigenvalue of the shifts $T_j$ (i.e. there exists $g \in L^2(X)$ s.t. $T_j g = e^{2\pi i r_j} g$ for each $j$).

ii) If the family $T = (T_1, \dots, T_n)$ is ergodic, then $\nu_f(0) = |\langle f, 1 \rangle|^2$ (i.e. $= |\int_X f\,d\mu|^2$).

We first observe that the full ergodicity is in fact a condition on the joint spectrum of the shifts $T_j$.
Proposition 2. Suppose the family $T = (T_1, \dots, T_n)$ is ergodic. Then it is fully ergodic if and only if $\nu_f(r) = 0$ for every $r \in \mathbb{Q}^n$, $r \neq 0$.

To see this, suppose $\nu_f(l/q) > 0$ for some $l$ s.t. $l/q \neq 0$. Then there exists $g \in L^2(X)$, $g \neq$ constant, since $T_j g = e^{2\pi i l_j/q} g$ for all $j$. But then $T_j^q g = g$ for all $j$ but $g \neq$ constant. On the other hand suppose that $T_j^q g = g$ for all $j$, for some $g \neq$ constant. Then the functions $g_s$, for $s \in \mathbb{Z}^n/q\mathbb{Z}^n$, defined by
$$g_s = \sum_{m \in \mathbb{Z}^n/q\mathbb{Z}^n} e^{-2\pi i\, m \cdot s/q}\, T^m g$$
are joint eigenfunctions with eigenvalues $s_j/q$. They cannot vanish for all $s \neq 0$ (mod $q$), because then one would have $T_j g = g$ for every $j$, as can be seen easily by expressing $T_j g$ in terms of the functions $g_s$. This proves the proposition.
Proof of the Mean Ergodic Theorem. We start rewriting the statement in the form IISNf — (1.
=
— (f. 1H2 =f
rp(N)—
The point is that $\nu_f(\mathbb{Q}^n \setminus \{0\}) = 0$ by the full ergodicity condition; moreover the integrand tends to zero pointwise on the irrationals by Lemma 2, and is majorized by 1. It follows from the Lebesgue dominated convergence theorem that the integral also tends to 0 as $N \to \infty$. This proves the theorem.

The proof of the $\ell^2$ boundedness of the associated discrete maximal function and that of the pointwise ergodic theorem are much more involved. Both use the approximation formula by looking at the main term as a weighted sum of averaging operators over the level surfaces $P(x) = N$. The associated maximal operators are bounded on $\mathbb{R}^d$ and, by using a general transference argument, one can pass the estimates to $\mathbb{Z}^d$. Another difficulty is that one cannot seem to use square function arguments, and the maximal operators associated with the error terms must be bounded by directly using the structure of the operators. The interested reader can consult [MSW] for the case of spheres and [M] for the general case.
3 Polynomial averages on the discrete Heisenberg group

In this section we sketch the proof of the simplest non-commutative analogue of Bourgain's pointwise ergodic theorem. As before the key tool is to show the $\ell^2$ boundedness of the associated maximal function. In doing so, one has to reformulate some of the basic estimates, such as the Weyl summation, into an operator-valued setting. As the Fourier transform of the central variables is used, the method is somewhat limited. At the end of the section we indicate the type of more general results which seem to be obtainable via these arguments.
3.1 The discrete maximal function on H01
= ((m,l) E
where n = (n i
nd), m
Led' = We
x Z : (n,k).(m,1) = (n+m,k+1+n2•mI}
=
(m i, m2) and
m i denotes the scalar product in
remark that H0, is isomorphic to the standard Heisenberg group
productlaw(n,k).(m,l) = (n+m,k+l-4-nom},wherenom =
with the —ni
Thus the maximal theorem transfers from one group to the other. Let $p(n) : \mathbb{Z}^{2d} \to \mathbb{Z}$ be an integral polynomial of degree at most 2. Consider the family of surfaces $S_N = \{(n, p(n)) : |n_i| \le N,\ \forall\, 1 \le i \le 2d\}$. Then one has the following theorem.
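The product law written above with the term $n_2 \cdot m_1$ can be checked mechanically. The sketch below implements that law on triples $(n_1, n_2, k)$ with $n_1, n_2 \in \mathbb{Z}^d$ and verifies associativity, inverses, and that two generators commute up to a central element. The encoding of group elements as Python tuples and the helper names are assumptions made for illustration only.

```python
import random


def dot(u, v):
    """Scalar product of two integer vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def mul(g, h):
    """Product (n, k) . (m, l) = (n + m, k + l + n2 . m1), with n = (n1, n2)."""
    (n1, n2, k), (m1, m2, l) = g, h
    return (
        tuple(a + b for a, b in zip(n1, m1)),
        tuple(a + b for a, b in zip(n2, m2)),
        k + l + dot(n2, m1),
    )


def inv(g):
    """Inverse of (n1, n2, k) under the product law above."""
    n1, n2, k = g
    return (tuple(-a for a in n1), tuple(-a for a in n2), -k + dot(n2, n1))


def random_element(d, bound=5):
    """Random group element with coordinates in [-bound, bound]."""
    vec = lambda: tuple(random.randint(-bound, bound) for _ in range(d))
    return (vec(), vec(), random.randint(-bound, bound))
```

With $d = 1$, the elements `e = ((1,), (0,), 0)` and `f = ((0,), (1,), 0)` satisfy `mul(f, e) == mul(mul(e, f), z)` for the central element `z = ((0,), (0,), 1)`.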
Theorem 3. Let $f \in \ell^2(H^d)$ and define the averages and the corresponding maximal function by
$$S_N f(h) = \frac{1}{|S_N|} \sum_{g \in S_N} f(g \cdot h), \qquad S^* f(h) = \sup_{N > 0} |S_N f(h)|.$$
Then one has
$$\|S^* f\|_{\ell^2(H^d)} \le C_{p,d}\, \|f\|_{\ell^2(H^d)}.$$

Note that if $h = (m, l)$ then
$$S_N f(m, l) = (2N+1)^{-2d} \sum_{|n_i| \le N} f(n + m,\ l + n \circ m + p(n)).$$
As in the commutative case it is enough to consider dyadic values $N = 2^j$ and smoothed averages, which after a change of variables $n \to n - m$ (using $m \circ m = 0$) look like
$$M_j f(m, l) = 2^{-2jd} \sum_{n} \varphi_j(n - m)\, f(n,\ l + n \circ m + p(n - m)). \qquad (3.2)$$
Suppose now that H01 acts on a probability measure space (X, via measure-preserving invertible transformations T1. ... using S as its generators. This means that they satisfy the commutation relations:
[Tj,Tj+d]=S
if Ij—iIld.
and
(3.3)
where I denotes the identity on X. Indeed from (3.3) it is easy to see that = Tm+nSk+t+n2.ml. We call the action of H0, on X fully ergodic if = S'F = F implies for each q and F E L2(X) one has that = = that F is constant. Now we can formulate the following.
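Theorem 4. Let $F \in L^2(X)$, $S_N \subset H^d$ defined as above. Assume that the action of $H^d$ on $X$ is fully ergodic. Then one has
$$\lim_{N \to \infty} \frac{1}{|S_N|} \sum_{g \in S_N} F(g \cdot x) = \int_X F\,d\mu \quad \text{for a.e. } x \in X. \qquad (3.4)$$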
Let us remark that, similarly to Theorem 1, the averages on the left side of (3.4) converge a.e. to some function in $L^2(X)$ without assuming full ergodicity. However in our approach the essential part is to prove (3.4).
The discrete maximal function on Hd. By taking the Fourier transform in the central variable one has 1)
f
=
(3.5)
where
J(O)(n) = f(n,e) = k
and
Tfg(m) =
om+1*,_m))
—
where ço1(n) =
g(n).
(3.6)
One sees immediately that
are chosen large enough. By subadditivity of the maximal opand erator the left side of (3.19) is majorized by a finite sum of the L2-norm square of maximal operators: Ma,qf = maxJk<J 2. Let — a/q and w1(fi) are disjoint for 1k — fl supports of
=
+
Accordingly one has the decomposition
Mj.ajqf =
Bj.a/qfic_I.a/q,
(3.20)
—a/q))(Tf(0))(m)dO.
(3.21)
+ Mj.a/qfk_1.a/q
—
where
fk.a/q(n, fi) = f(n. and
Bj,aiqf(Pfl.I)
=
J
We remark that the P boundedness of both maximal operators
already been proved in Section 3.1. By the essential disjointness the functions fk,a/q one has
K LO
Let $W$ be a convex compact set in $\mathbb{R}^n$ and let $f$ be a "good" function on $W$. We want to discuss the behavior of the integrals (everywhere in the paper $x \cdot y$ means a scalar product of two vectors in $\mathbb{R}^n$)
$$I_f(x) = \int_W f(y)\, e^{i x \cdot y}\, dy$$
as $|x| \to \infty$. By the "behavior" we mean the asymptotics and estimates of the integral $I_f(x)$ for large $|x|$.
Of course, everything is defined by the function $f$ and the set $W$. Our main interest is the influence of the geometric properties of the set $W$ on the integral $I_f(x)$. The function $f$ can be assumed to be very smooth, and the most interesting case is $f \equiv 1$. The results are well known for compact sets $W$ whose boundary is smooth enough and has everywhere positive Gaussian curvature [K], [H]. Let us note that with the increase of the dimension the demands on the smoothness of the boundary increase as well. It is very annoying that these conditions are actually needed only for the proof of the result: the asymptotic formulas themselves and the estimates do not include the smoothness characteristics of the boundary. We are interested in whether or not it is possible to say something meaningful about the behavior of the integral $I_f(x)$
A.N.Podkorytov
when $|x| \to +\infty$, with minimal assumptions of convexity and compactness on the set $W$. Unfortunately, our achievements are not very significant. Some results are proved only in the two-dimensional case. In this paper, we describe these results as well as the ideas that have led to them [P], [O], [B-T].

How is it possible to study the integral $I_f(x)$ as $|x| \to \infty$? Let us first look at how the corresponding result is obtained in the simplest, one-dimensional case. The first step is obvious, it is the integration by parts:
$$I_\varphi(t) = \int_a^b \varphi(s)\, e^{its}\, ds = \frac{1}{it}\, \varphi(s)\, e^{its} \Big|_{s=a}^{s=b} - \frac{1}{it} \int_a^b \varphi'(s)\, e^{its}\, ds.$$
It gives an estimate $I_\varphi(t) = O(1/t)$ for any smooth function $\varphi$, and therefore $I_{\varphi'}(t) = O(1/t)$. Hence,
$$I_\varphi(t) = \frac{\varphi(b)\, e^{itb} - \varphi(a)\, e^{ita}}{it} + O\Big(\frac{1}{t^2}\Big).$$
This means that the asymptotic for $I_\varphi(t)$ is defined by the values of the function $\varphi$ at the endpoints of the interval (we assume that these values differ from zero, otherwise we must repeat integration by parts). Doing the same in the two-dimensional case, we obtain
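$$I_f(x) = \frac{1}{i|x|^2}\, x \cdot \Big( \int_{\partial W} f(y)\,\nu(y)\, e^{i x \cdot y}\, ds - \int_W \nabla f(y)\, e^{i x \cdot y}\, dy \Big).$$
Here $\nu(y)$ is the outer unit normal vector to $\partial W$ at the point $y$, and $\nabla f(y) = (f'_{y_1}(y), f'_{y_2}(y))$ is the gradient of the function $f$. The fact that a vector-valued function appears under the integral sign instead of a scalar one changes nothing. For this, we do not need any smoothness of the boundary since, by the convexity of $W$, the normal to the boundary does exist and is continuous everywhere except for at most a countable number of points. This implies a simple estimate $I_f(x) = O(1/|x|)$. Applying it to the integral of the gradient, we obtain
$$I_f(x) = \frac{1}{i|x|^2}\, x \cdot \int_{\partial W} f(y)\,\nu(y)\, e^{i x \cdot y}\, ds + O\Big(\frac{1}{|x|^2}\Big).$$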
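The one-dimensional estimate obtained by integration by parts can be checked numerically. The sketch below compares $\int_a^b \varphi(s) e^{its}\,ds$ with the endpoint term $(\varphi(b)e^{itb} - \varphi(a)e^{ita})/(it)$. The quadrature (a composite Simpson rule) and the function names are illustrative choices, not taken from the text.

```python
import cmath


def simpson(g, a, b, n):
    """Composite Simpson rule with n (even) subintervals for a complex-valued g."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3


def oscillatory_integral(phi, a, b, t, n=20000):
    """Numerical value of the integral of phi(s) * exp(i t s) over [a, b]."""
    return simpson(lambda s: phi(s) * cmath.exp(1j * t * s), a, b, n)


def endpoint_term(phi, a, b, t):
    """Leading term (phi(b) e^{itb} - phi(a) e^{ita}) / (i t) from integration by parts."""
    return (phi(b) * cmath.exp(1j * t * b) - phi(a) * cmath.exp(1j * t * a)) / (1j * t)
```

A second integration by parts bounds the difference between the two quantities by $(|\varphi'(a)| + |\varphi'(b)| + \int_a^b |\varphi''|)/t^2$. For the test amplitude $\varphi(s) = 1 + s^2$ on $[0, 1]$ this bound equals $4/t^2$.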
One can expect, like in the one-dimensional case, that the main contribution to the integral $I_f(x)$ is given by the values of our function on the boundary. But here we run across another essential difference, the infinite number of boundary points. It becomes necessary to take into account their joint contribution, i.e., to study the line integral. It is not clear a priori why it decreases more slowly than the integral of the gradient of the function (of course, in the most interesting case $f \equiv 1$ it is obvious, because $\nabla f \equiv 0$). But it appears that the typical speed of decreasing of the line integral is $O(|x|^{-1/2})$. Typical does not mean that it is always so. In order to understand what is crucial here, let us consider the integral
Asymptotic of the Fourier Transform for a Convex Body
$$J(x) = \int_{\partial W} F(y)\, e^{i x \cdot y}\, ds,$$
where $F$ is a scalar- or vector-valued function. Let $[a, b]$ be the projection of $W$ onto the line passing through the origin and the vector $x$ ($a$ and $b$ depend on $x$). The boundary $\partial W$ splits into two graphs of convex and concave functions on $[a, b]$, which we denote $N$ and $M$ respectively. Then $J(x) = J_M(x) + J_N(x)$, where $J_M(x)$ is the line integral over $M$, and $J_N(x)$ is the line integral over $N$. Let us consider the first of them. Let $y_M(u)$, $u \in [a, b]$, be the point of $M$ whose projection on the interval $[a, b]$ is equal to $u$. Then
$$J_M(x) = \int_a^b F(y_M(u))\, \sqrt{1 + (M')^2(u)}\; e^{i|x|u}\, du.$$
The exponential function is a high frequency oscillating factor. If the points $y_M(a)$ and $y_M(b)$ are not corner points (vertices) of the boundary, then $M'(a) = -M'(b) = +\infty$. This is true for most of the directions of the vector $x$, if $W$ is not a polygon. Hence in the case when $F(y_M(a)) \neq 0$ and $F(y_M(b)) \neq 0$, the main contribution to the integral $J_M(x)$ is given by small neighborhoods of only two points $y(x) = y_M(b)$ and $y(-x) = y_M(a)$ (these are the points where the vectors $x$ and $-x$ are outer normals to $\partial W$). In order to obtain a quantitative estimate of this contribution and an asymptotic formula for the integral $J_M(x)$, it is necessary to know how fast the factor $\sqrt{1 + (M')^2(u)}$ tends to infinity at the endpoints of the interval $[a, b]$. It is clear that this depends on how smoothly the tangent approaches the boundary at the points $y(x)$ and $y(-x)$. In the classical case of a smooth boundary with everywhere nonzero curvature, the order of tangency is equal to two at every point of the boundary. It is easy to check that in this case
$$\sqrt{1 + (M')^2(u)} \sim \frac{1}{\sqrt{2(b-u)\,K(y(x))}} \quad (u \to b), \qquad \sqrt{1 + (M')^2(u)} \sim \frac{1}{\sqrt{2(u-a)\,K(y(-x))}} \quad (u \to a),$$
where $K(y)$ is the curvature of $\partial W$ at the point $y$, $y \in \partial W$.
Now it is not difficult to obtain an asymptotic formula for JM(X). The integral
JN(x) can be investigated in the same way. Finally, we obtain the classical result for continuous function F and compact W with smooth boundary a w, and the curvature K distinct from zero at every point:
J(x)
=
=
Here
A(x) =
F(y(x)) .JK(y(x))
+ A(—x) +
Therefore, at least in the classical case, the line integral $J(x)$ decreases as $|x|^{-1/2}$ when $|x| \to \infty$. What would occur if we drop the assumptions of smoothness and nondegeneracy of the boundary? It is easy to understand that the presence of the corner points on the boundary does not influence the upper estimate for $J(x)$. On the contrary, if $y(x)$ is a corner point, then $K(y(x)) = \infty$, and therefore the contribution of this point to the integral vanishes. On the other hand, when $K(y(x)) = 0$ or $K(y(-x)) = 0$, the estimate of the integral is significantly worse. This can be seen by direct calculations. Indeed, let, for example, $W_p = \{y = (y_1, y_2) \in \mathbb{R}^2 : |y_1|^p + |y_2|^p \le 1\}$ for large $p > 0$. In particular, one can hardly expect to obtain a formula for $J(x)$ with uniform estimate of the remainder $O(1/|x|)$. However, it is obvious that $K(y(x)), K(y(-x)) \neq 0$ for most values of the vector $x$. From this discussion, we see that if the boundary of the compact set has positive (maybe infinite) curvature at points $y(x)$ and $y(-x)$, then we can hope to prove the following equality:
$$J(x) = \int_{\partial W} F(y)\, e^{i x \cdot y}\, ds = \sqrt{\frac{2\pi}{|x|}}\, \big(A(x) + A(-x) + \varepsilon(x)\big),$$
where $\varepsilon(x) \to 0$ as $|x| \to \infty$, but $x/|x|$ is fixed.

However, two natural questions appear. The first (mainly formal and not difficult) question is what should we mean by the curvature of a nonsmooth boundary? The second (much more important and difficult) question is how large can the value $\varepsilon(x)$ be when the vector $x$ rotates? Very often it is not enough to know that this value is small for a fixed direction of the vector $x$.

In order to formulate the answers to these questions it is convenient to use the polar coordinate system $x = r e_\varphi$, where $r = |x|$ and $e_\varphi = (\cos\varphi, \sin\varphi)$. The asymptotic formula for the integral $J(x)$ shows that, instead of the curvature $K(y)$, it would be more convenient to use the reciprocal value, i.e., the curvature radius $1/K(y)$. The curvature equals the speed of the rotation of the normal vector, i.e., it is the derivative of the angle of the slope of the normal with respect to the length of the arc. Therefore, the radius of curvature is nothing else than the derivative of the inverse function. This observation allows us to define the radius of curvature of an arbitrary, not necessarily smooth, convex curve. For this, to an arbitrary angle $\varphi$ we assign a point $y_\varphi \in \partial W$ such that the unit vector $e_\varphi$ is an outer normal vector to $\partial W$, i.e., $(y - y_\varphi) \cdot e_\varphi \le 0$ for every $y \in W$. It is clear that $y_\varphi = y(r e_\varphi)$ for every $r > 0$. For simplicity, we can assume that the compact set $W$ is strictly convex. Then the point $y_\varphi$ is defined uniquely for every angle $\varphi$. Otherwise, there exists an at most countable set of angles $\varphi$ where the unit vector $e_\varphi$ is orthogonal to a nondegenerate segment contained in $\partial W$. We may avoid defining the point $y_\varphi$ for these exceptional directions at all, or we can take the middle point of the segment as $y_\varphi$. Let $\sigma(\varphi)$ be the length of the boundary arc between $y_0$ and $y_\varphi$ measured in the positive direction. Since the function $\sigma$ is increasing, by the Lebesgue theorem the derivative $\rho(\varphi) = \sigma'(\varphi)$ exists for almost every direction $\varphi$. Thus, the geometric meaning of the derivative is the radius of curvature of $\partial W$ at the point $y_\varphi$.

Let us note two essential features of this definition. First, in fact the radius of curvature is put into correspondence not with a boundary point, but with a direction $\varphi$. The matter is that every corner point of the boundary is realized as $y_\varphi$ with $\varphi$ in an interval. This observation is not very essential because it deals with at most a countable set of points. What is essential is the second feature of the accepted definition. When we say "the radius of curvature exists almost everywhere," we do not mean the linear measure on $\partial W$ induced by the plane Lebesgue measure in a natural way. What we mean is another measure on $\partial W$ generated by the Gaussian mapping. Namely, the measure of the boundary arc is equal to the angle by which the line of support rotates when the "tangent" point is moving along the measured arc. For example, if $W$ is a polygon then this measure is concentrated in its vertices. This allows us (without additional assumptions of smoothness and nondegeneracy of the boundary) to obtain the following result. For almost all directions of the vector $x = r e_\varphi$, the following relation holds:
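$$J(r e_\varphi) = \int_{\partial W} F(y)\, e^{i r e_\varphi \cdot y}\, ds = \sqrt{\frac{2\pi}{r}}\, \Big( F(y_\varphi)\sqrt{\rho(\varphi)}\; e^{i(r e_\varphi \cdot y_\varphi - \pi/4)} + F(y_{\varphi+\pi})\sqrt{\rho(\varphi+\pi)}\; e^{i(r e_\varphi \cdot y_{\varphi+\pi} + \pi/4)} + \varepsilon_\varphi(r) \Big),$$
where $\varepsilon_\varphi(r) \to 0$ as $r \to +\infty$.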
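The measure generated by the Gaussian mapping is easy to compute for a polygon: it places at each vertex a mass equal to the exterior (turning) angle there, and the total mass is $2\pi$. A small sketch, assuming the vertices of a convex polygon are listed counterclockwise (the function name is hypothetical):

```python
import math


def exterior_angles(vertices):
    """Turning angle of the outer normal at each vertex of a convex polygon.

    `vertices` is a list of (x, y) pairs in counterclockwise order. The angle at
    vertex i is the angle between the incoming and outgoing edge directions.
    """
    n = len(vertices)
    angles = []
    for i in range(n):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        ux, uy = x1 - x0, y1 - y0  # incoming edge
        vx, vy = x2 - x1, y2 - y1  # outgoing edge
        # atan2(cross, dot) is the signed angle from the incoming to the outgoing edge.
        angles.append(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return angles
```

For a square every vertex carries mass $\pi/2$, while the open edges carry no mass at all, although they carry all of the arc length.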
We do not need the continuity of $F$ everywhere. It is enough to assume only the boundedness of its variation on $\partial W$. Such a weakening of assumptions about the function $F$ is important in the study of the line integral $\int_{\partial W} f(y)\,\nu(y)\, e^{i x \cdot y}\, ds$ that appears after the integration by parts of the integral $I_f(x)$. Here the vector-valued function $F(y) = f(y)\nu(y)$ can be discontinuous because of "jumps" of the normal vector in the case of a nonsmooth boundary. The answer to the question of how large the remainder $\varepsilon$ can be for different $\varphi$ (i.e., under the rotation of the vector $x$) is provided by the inequality
J dip —÷ 0 as r —÷ +00. In particular, it implies that How do we prove the boundedness of the L2-norm of the remainder? p(ip) is a derivative of a monotone function,
£ p(ip)dip
Lw =
the arc
Therefore, the boundedness of the $L^2$-norm of the remainder is equivalent to the possibility of the following estimate of the average over all rotations of the integral $J(r e_\varphi)$:
$$\int_0^{2\pi} |J(r e_\varphi)|^2\, d\varphi \le \frac{C_{W,F}}{r}.$$
In many analogous questions estimates are more important than asymptotic formulas. Therefore we will briefly explain how this inequality is obtained. First of all, it can be reduced to a certain geometric inequality. Indeed, it is easy to check that
$$|J(r e_\varphi)| \le C_F \Big( l\big(\varphi, \tfrac{1}{r}\big) + l\big(\varphi + \pi, \tfrac{1}{r}\big) \Big), \qquad (*)$$
where $l(\varphi, \varepsilon)$ is the length of the chord that appears when one shifts the line of support at the point $y_\varphi$ by $\varepsilon$ toward the set $W$. We must check that the order of the quadratic means (over all rotations) of the length of the chord is less than or equal to the corresponding order for the unit circle:
$$\int_0^{2\pi} l^2(\varphi, \varepsilon)\, d\varphi \le C_W\, \varepsilon.$$
Unfortunately, we failed to find a natural geometric demonstration of this inequality. Instead, we check a slightly stronger inequality
c which implies the necessary inequality with the help of the integral Minkowski inwe inscribe a broken line (generally equality. In order to estimate the average
speaking, not closed; the first and the last of its segments can intersect each other) into aW. The i-th segment has length E) (the angles are strictly increasing). e))2 over the i-th arc The crucial fact used in the proof is that the integral of is majorized by
e) +
e) +
e)).
In conclusion, we briefly discuss the estimate of the maximal function J*((p) = sup r >0
The estimates of J* for sets with sufficiently smooth boundary are known [RI ],[R2],[SI.
Inequality (*) shows that it suffices to obtain estimates for another maximal function *
,_
I
l\
= sup = sup —) r r>0 Here the crucial result is the following inequality [B-I]: 8
sup
S
a(*) —
(recall that a is the length of the arc of a w between the points yo and y9,). This inequality is exact; it becomes an equality for a circle. But the following property is more important. By the F. Riesz theorem,
E [0. 2ir]
I
—a(0)
> t)
215
the arc IengthaW
=
Hence
const >
In particular.
IJ*(co)Ipdgo
for $p > 1$, but has an infinite Fourier transform on every point on the hyperplane $\{\xi \in \mathbb{R}^n : \xi_1 = 0\}$. On the other hand, from the Hausdorff-Young inequality we see that $\hat{f}$ lies in the Lebesgue space $L^{p'}(\mathbb{R}^n)$, where $1/p + 1/p' = 1$. Thus $\hat{f}$ can be meaningfully restricted to every set $S$ of positive measure. This leaves open the question of what happens to sets $S$ that have zero measure but are not contained in hyperplanes. In 1967 Stein made the surprising discovery that when such sets contain sufficient "curvature," one can indeed restrict the Fourier transform of $L^p$ functions for certain $p > 1$. This led to the restriction problem [46]: for which sets $S \subseteq \mathbb{R}^n$ and which $1 \le p \le 2$ can the Fourier transform of an $L^p(\mathbb{R}^n)$ function be meaningfully restricted?

There are of course infinitely many such sets to consider, but we shall focus our attention here on sets $S$ that are hypersurfaces² or compact subsets of hypersurfaces. In particular, we shall be interested³ in the sphere
$$S_{sphere} := \{\xi \in \mathbb{R}^n : |\xi| = 1\}, \qquad (1)$$
the paraboloid

2 For surfaces of lower dimension, see [17], [44], [38], [1]; for fractal sets in $\mathbb{R}$, see [39], [45], [38]; for surfaces in finite field geometries, see [40]; for the restriction theory of the prime numbers, see [26].
3 It is easy to see, using the symmetries of the Fourier transform, that the restriction problem for a set $S$ is unaffected by applying any translations or invertible linear transformations to the set $S$, so we can place the sphere, paraboloid, and cone in their standard forms (1), (2), (3) without loss of generality.
Recent Progress on Restriction
$$S_{parab} := \{\xi \in \mathbb{R}^n : \xi_n = \tfrac12 |\underline{\xi}|^2\}, \qquad (2)$$
and the cone
$$S_{cone} := \{\xi \in \mathbb{R}^n : \xi_n = |\underline{\xi}|\}, \qquad (3)$$
where $\xi = (\underline{\xi}, \xi_n) \in \mathbb{R}^{n-1} \times \mathbb{R} = \mathbb{R}^n$, and we always take $n \ge 2$ to avoid trivial situations. These three surfaces are model examples of hypersurfaces with curvature,⁴ though of course the cone differs from the sphere and paraboloid in that it has one vanishing principal curvature. These three hypersurfaces also enjoy a large group of symmetries (the orthogonal group, the parabolic scaling and Galilean groups, and the Poincaré group, respectively). Also, these three hypersurfaces are related via the Fourier transform to solutions to certain familiar PDEs, namely the Helmholtz equation, Schrödinger equation, and wave equation; however we will not focus on applications to PDEs in this paper.
3 Restriction estimates and extension estimates

Let $S$ be a compact subset (but with non-empty interior) of one of the above surfaces $S_{sphere}$, $S_{parab}$, $S_{cone}$. We endow $S$ with a canonical measure $d\sigma$. For the sphere, this is surface measure; for the paraboloid, it is the pullback of the $(n-1)$-dimensional Lebesgue measure $d\underline{\xi}$ under the projection map $\xi \mapsto \underline{\xi}$, while for the cone the pullback of $d\underline{\xi}/|\underline{\xi}|$ is the most natural measure (as it is Lorentz invariant). In order to restrict the Fourier transform of an $L^p(\mathbb{R}^n)$ function to $S$, it will suffice to prove an a priori "restriction estimate" of the form
$$\|\hat{f}|_S\|_{L^q(S; d\sigma)} \le C_{p,q,S}\, \|f\|_{L^p(\mathbb{R}^n)} \qquad (4)$$
for all Schwartz functions $f$ and some $1 \le q \le \infty$, since one can then use density arguments to obtain a continuous restriction operator from $L^p(\mathbb{R}^n)$ to $L^q(S; d\sigma)$ which extends the map $f \mapsto \hat{f}|_S$ for Schwartz functions. When the set $S$ has sufficient symmetry (e.g. if $S$ is the sphere), this implication can in fact be reversed, using Stein's maximal principle [48]: if there is no bound of the form⁵ (4), then one can construct functions $f \in L^p(\mathbb{R}^n)$ whose Fourier transform is infinite almost everywhere in $S$.

We will tend to think of $\mathbb{R}^n$ as representing "physical space," whose elements will be denoted by names such as $x$ and $y$, while $S$ lives in "frequency space," whose elements will be denoted by names such as $\xi$ or $\omega$. For the PDE applications it is sometimes convenient to think of $\mathbb{R}^n$ as a spacetime $\mathbb{R} \times \mathbb{R}^{n-1} = \{(t, x) : t \in \mathbb{R},\ x \in \mathbb{R}^{n-1}\}$

4 One could also consider cylinders such as $S^{k-1} \times \mathbb{R}^{n-k} \subset \mathbb{R}^n$, but it turns out that the restriction theory for these surfaces is identical to that of the sphere $S^{k-1}$ inside $\mathbb{R}^k$.
5 Indeed, it suffices for the weak-type estimate from $L^p$ to $L^{q,\infty}$ to fail. See [48]; similar ideas arise in the factorization theory of Nikishin and Pisier.
T. Tao
(with the frequency space thus becoming spacetime frequency space $\{(\tau, \xi) : \tau \in \mathbb{R},\ \xi \in \mathbb{R}^{n-1}\}$), but we will avoid doing so here.

It is thus of interest to see for which sets $S$ and which exponents $p$ and $q$ one has estimates of the form (4); henceforth we assume our functions $f$ to be Schwartz. We denote by $R_S(p \to q)$ the statement that (4) holds for all $f$. From our previous remarks we thus see that $R_S(1 \to q)$ holds for all $1 \le q \le \infty$, while $R_S(2 \to q)$ fails for all $1 \le q \le \infty$; the interesting question is then what happens for intermediate values of $p$. If $S$ is compact, then an estimate of the form $R_S(p \to q)$ will automatically imply an estimate $R_S(\tilde{p} \to \tilde{q})$ for all $\tilde{p} \le p$ and $\tilde{q} \le q$ by the Sobolev and Hölder inequalities. Thus the aim is to increase the size of $p$ and $q$ for which $R_S(p \to q)$ holds by as much as possible.

A simple duality argument shows that the estimate (4) is equivalent to the "extension estimate"
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(\mathbb{R}^n)} \le C_{p,q,S}\, \|F\|_{L^{q'}(S; d\sigma)} \qquad (5)$$
for all smooth functions $F$ on $S$, where $(F\,d\sigma)^\vee$ is the inverse Fourier transform of the measure $F\,d\sigma$:
$$(F\,d\sigma)^\vee(x) := \int_S F(\xi)\, e^{2\pi i\, \xi \cdot x}\, d\sigma(\xi).$$
Indeed, the equivalence of (4) and (5) follows from Parseval's identity
$$\int_{\mathbb{R}^n} (F\,d\sigma)^\vee(x)\, \overline{f(x)}\, dx = \int_S F(\xi)\, \overline{\hat{f}(\xi)}\, d\sigma(\xi)$$
and duality. If we use $R^*_S(q' \to p')$ to denote the statement that the estimate (5) holds, then $R^*_S(q' \to p')$ is thus equivalent to $R_S(p \to q)$.

Note that because $F$ is smooth, it is possible to use the principle of stationary phase (see e.g. [47]) to obtain asymptotics for $(F\,d\sigma)^\vee$. However, such asymptotics depend very much on the smooth norms of $F$, not just on the $L^{q'}(S)$ norm, and so do not imply estimates of the form (5) (although they can be used to provide counterexamples). Thus one can think of extension estimates as a more general way than stationary phase to control oscillatory integrals, applicable in situations where the amplitude function has magnitude bounds but no smoothness properties.

The extension formulation (5) also highlights the connection between this problem and PDEs. For instance, consider a solution $u(t, x) : \mathbb{R} \times \mathbb{R}^n \to \mathbb{C}$ to the free Schrödinger equation
$$i u_t + \Delta u = 0$$
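with initial data $u(0, x) = u_0(x)$. This has the explicit solution
$$u(t, x) = \int_{\mathbb{R}^n} \hat{u}_0(\xi)\, e^{2\pi i (x \cdot \xi - 2\pi t |\xi|^2)}\, d\xi,$$
or equivalently
$$u = (F\,d\sigma)^\vee,$$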
where $d\sigma$ is a (weighted) surface measure on the paraboloid $\{(\tau, \xi) \in \mathbb{R} \times \mathbb{R}^{n-1} : \tau = -2\pi|\xi|^2\}$ and $F$ is the function $\hat u_0$ restricted to the paraboloid. Thus, estimates of the form $R^*_S(q' \to p')$ when $S$ is the paraboloid in $\mathbb{R} \times \mathbb{R}^{n-1}$ are used to control certain spacetime norms of solutions to the free Schrödinger equation. Somewhat similar connections exist between the cone (3) (in $\mathbb{R} \times \mathbb{R}^{n-1}$) and solutions to the wave equation $u_{tt} - \Delta u = 0$, or between the sphere (1) and solutions to the Helmholtz equation $\Delta u + u = 0$. We will not pursue these connections further here, but see for instance [49] and the (numerous) papers descended from that paper. (Some other connections between restriction estimates and PDE-type estimates are summarized in [62] and the references therein; for the Helmholtz equation, see for instance [4].)
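As a sanity check on the link between the free Schrödinger equation and the paraboloid, one can test a single plane wave. The computation below is only a sketch: it assumes the normalization $i\,\partial_t u + \Delta u = 0$ and the Fourier convention $e^{2\pi i x\cdot\xi}$, and the symbols $\xi$, $\tau$ for the spatial and temporal frequencies are notation chosen here. A different normalization would change the constant in front of $|\xi|^2$ but not the parabolic shape of the frequency support.

```latex
% Assumed normalization: i u_t + \Delta u = 0, plane wave e^{2\pi i (x\cdot\xi + t\tau)}.
\begin{align*}
u(t,x) &= e^{2\pi i (x\cdot\xi + t\tau)}, \qquad \xi \in \mathbb{R}^{n-1},\ \tau \in \mathbb{R},\\
i\,\partial_t u &= i\,(2\pi i \tau)\,u = -2\pi\tau\,u,\\
\Delta u &= (2\pi i)^2 |\xi|^2\,u = -4\pi^2|\xi|^2\,u,\\
i\,\partial_t u + \Delta u = 0 &\iff \tau = -2\pi|\xi|^2 .
\end{align*}
```

So, under these assumptions, the spacetime Fourier transform of a free solution is supported on a paraboloid, which is why extension estimates for the paraboloid control spacetime norms of $u$.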
4 Necessary conditions

We will use the extension formulation (5) to develop some necessary conditions in order for $R^*_S(q' \to p')$ to hold. First of all, by setting $F \equiv 1$ we clearly see that we must have $(d\sigma)^\vee \in L^{p'}(\mathbb{R}^n)$ as a necessary condition. In the case of the sphere (1), the Fourier transform $(d\sigma)^\vee(x)$ decays in magnitude like $(1+|x|)^{-(n-1)/2}$ (as can be seen either by stationary phase or by the asymptotics of Bessel functions), and so we obtain the necessary condition$^6$ $p' > 2n/(n-1)$, or equivalently $p < 2n/(n+1)$. The analogous condition for the cone is $p' > 2(n-1)/(n-2)$.

Let $F$ be a smooth function on $S$ with an $L^\infty$ norm of at most 1. Since $F\,d\sigma$ is pointwise dominated by $d\sigma$, it seems intuitive that $(F\,d\sigma)^\vee$ should be "smaller" than $(d\sigma)^\vee$. Thus one should expect the above necessary conditions to in fact be sufficient to obtain the estimate $R^*_S(\infty \to p')$. For completely general sets $S$, this assertion is essentially the Hardy-Littlewood majorant conjecture; it is true when $p'$ is an even integer by direct calculation using Plancherel's theorem, but is false for other values of $p'$ (a "logarithmic" failure was established by Bachelis in the 1970s; a more recent "polynomial" failure has been established independently by Mockenhaupt and Schlag (private communication) and Green and Ruzsa (private communication); see [38] for further discussion). However, it may still be that the majorant conjecture is true for "non-pathological" sets $S$ such as the sphere, paraboloid, and cone.

Another necessary condition comes from the Knapp example [63], [49]. In the case of the sphere or paraboloid, we sketch the example as follows. Let $R \gg 1$. Then, by a Taylor expansion of the surface $S$ around any interior point $x_0$, we see that the surface $S$ contains a "cap" $K \subset S$ centered at $x_0$ of diameter$^7$ $\sim 1/R$ and surface measure $\sim R^{-(n-1)}$ which is contained inside a disk $D$ of radius $\sim 1/R$ and thickness $\sim 1/R^2$, oriented perpendicular to the unit normal of $S$ at $x_0$. Let $F$ be the characteristic function of this cap $K$ (one can smooth $F$ out if desired, but this does not affect the final necessary condition), and let $T$ be the dual tube to the disk $D$, i.e. a tube centered at the origin of length $\sim R^2$ and thickness $\sim R$ oriented in the direction of the unit normal to $S$ at $x_0$. Then $(F\,d\sigma)^\vee$ has magnitude $\sim \sigma(K) \sim R^{-(n-1)}$ on a large portion of $T$ (this is basically because for a large portion of points $x$ in $T$, the phase function is essentially constant on $K$). In particular, we have
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(\mathbb{R}^n)} \gtrsim |T|^{1/p'} R^{-(n-1)} \sim R^{(n+1)/p'} R^{-(n-1)}.$$
Since $\|F\|_{L^{q'}(S;\,d\sigma)} \sim R^{-(n-1)/q'}$, we thus see that we need the necessary condition
$$\frac{n+1}{p'} \le \frac{n-1}{q}$$
in order for $R^*_S(q' \to p')$ to hold. (In the case of compact subsets of the paraboloid with non-empty interior, one can obtain the same necessary condition using the parabolic scaling $(\underline{\xi}, \xi_n) \mapsto (\lambda\underline{\xi}, \lambda^2 \xi_n)$. For the full (non-compact) paraboloid, one can improve this to $\frac{n+1}{p'} = \frac{n-1}{q}$.)

In the case of the cone, we can lengthen the cap $K$ in the null direction (so that it now has measure $\sim R^{-(n-2)}$ and lives in a "plate" of length $\sim 1$, width $\sim 1/R$ and thickness $\sim 1/R^2$), which eventually leads to the stronger necessary condition $\frac{n}{p'} \le \frac{n-2}{q}$; as before, this can be strengthened to $\frac{n}{p'} = \frac{n-2}{q}$ if one is considering the full cone (3) and not just compact subsets of it with non-empty interior.

6 There does not seem to be any hope for any weak-type endpoint estimate at $p' = 2n/(n-1)$; see [5].
7 We use $X \lesssim Y$ or $X = O(Y)$ to denote an estimate of the form $X \le CY$, where $C$ depends on $S$, $p$, $q$, but not on functions such as $f$, $F$, or on parameters such as $R$. We use $X \sim Y$ to denote the estimate $X \lesssim Y \lesssim X$.
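The exponent count behind the Knapp example for the sphere or paraboloid can be written out as follows. This is a sketch with all constants suppressed; it assumes the cap $K$ has surface measure $\sigma(K) \sim R^{-(n-1)}$ and that the dual tube $T$ has length $\sim R^2$ and width $\sim R$ in the remaining $n-1$ directions.

```latex
% Knapp example for the sphere or paraboloid in R^n (constants suppressed).
\begin{align*}
\sigma(K) &\sim R^{-(n-1)}, \qquad |T| \sim R^{2}\cdot R^{\,n-1} = R^{\,n+1},\\
\|(F\,d\sigma)^\vee\|_{L^{p'}(\mathbb{R}^n)} &\gtrsim \sigma(K)\,|T|^{1/p'} \sim R^{-(n-1)+(n+1)/p'},\\
\|F\|_{L^{q'}(S;\,d\sigma)} &= \sigma(K)^{1/q'} \sim R^{-(n-1)/q'},\\
\text{(5) for all } R \gg 1 &\implies -(n-1) + \frac{n+1}{p'} \le -\frac{n-1}{q'}
\iff \frac{n+1}{p'} \le (n-1)\Bigl(1-\frac{1}{q'}\Bigr) = \frac{n-1}{q}.
\end{align*}
```

Repeating the count for the cone with $\sigma(K) \sim R^{-(n-2)}$ and a dual region of volume $\sim R^{n}$ gives the same computation with $n$ replaced by $n-1$.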
One can formulate a Knapp counterexample for any smooth hypersurface; the necessary conditions obtained this way become stronger as the surface becomes flatter, and in the extreme case where the surface is infinitely flat (e.g. when it is a hyperplane), there are no estimates.

The restriction conjecture for the sphere, paraboloid, and cone then asserts that the above necessary conditions are in fact sufficient. In other words, for compact subsets of the sphere and paraboloid the conjecture asserts that $R^*_S(q' \to p')$ holds when $p' > 2n/(n-1)$ and $\frac{n+1}{p'} \le \frac{n-1}{q}$, while for compact subsets of the cone the conditions become $p' > 2(n-1)/(n-2)$ and $\frac{n}{p'} \le \frac{n-2}{q}$ (i.e. they match the numerology of the sphere and paraboloid in one lower dimension). This conjecture has been solved for the paraboloid and sphere in two dimensions, and for the cone in up to four dimensions; see Tables 1 and 2 for a more detailed summary of progress on this problem.

The restriction problems for the three surfaces are related; the sharp restriction conjecture for the sphere would imply the sharp restriction estimate for the paraboloid, because one can parabolically rescale the sphere to approach the paraboloid; see [52]. Also, using the method of descent, one can link the restriction conjecture for the cone in $\mathbb{R}^{n+1}$ with the restriction conjecture for the sphere or paraboloid in $\mathbb{R}^n$, although the connection here is not as tight (see [55] for some further discussion).
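To see how the scaling condition translates into the form used for the exponent ranges in the tables, one can solve it for $q'$. The lines below are a worked rearrangement rather than a new result; they assume the condition $\frac{n+1}{p'} \le \frac{n-1}{q}$ for the sphere and paraboloid, together with $p' > 2n/(n-1)$, and use that $q \mapsto q'$ is decreasing on $(1, \infty)$.

```latex
% Note: p' > 2n/(n-1) gives (n-1)p'/(n+1) > 2n/(n+1) > 1, so the dual exponent below is well defined.
\begin{align*}
\frac{n+1}{p'} \le \frac{n-1}{q}
&\iff q \le \frac{(n-1)p'}{n+1}
\iff q' \ge \Bigl(\frac{(n-1)p'}{n+1}\Bigr)' ,\\
n=2:&\quad p' > 4, \qquad q' \ge (p'/3)',\\
n=3:&\quad p' > 3, \qquad q' \ge (p'/2)'.
\end{align*}
```

For the cone the same rearrangement of $\frac{n}{p'} \le \frac{n-2}{q}$ gives $q' \ge ((n-2)p'/n)'$.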
Dimension Range of p and q
n=
q' =
2
p'
2,
Stein 1967 Zygmund 1974 [71] (best possible)
8
(p'/3)'; p' >
4
n=3
Stein 1967
p'
q' 2
Tomas 1975 [63] Stein 1975; Sjölin Bourgain 1991 [6]
4
p'
1975
Wolff 1995 [64]
p' > q'
Moyua, Vargas, Vega 1996 [42] Tao, Vargas, Vega 1998 [60] Tao, Vargas, Vega 1998 [60] Tao, Vargas 2000 [61] Tao, Vargas 2000 [61] Tao 2003 [59] (conjectured)
—
p' >
4
—
4
—
(p'/2)': p' p'
q'
p' >
q' 2
n >3
4
3
Tomas 1975 [63] q' > ((n - 1)p'/(n + 1))'; p' > 2(n+1) ((n - 1)p'/(n + 1))'; p' 2 Stein 1975
q', I">
Bourgain 1991 [6]
—
q'
Wolff 1995 [64]
Moyua, Vargas, Vega 1996 [42]
2
p' > ((n — l)p'/(n + I))'; p' >
2(n+2)
Tao 2003 [59] (conjectured)
Table 1. Known results on the restriction problem $R_S(p \to q)$ (or $R^*_S(q' \to p')$) for the sphere and for compact subsets of the paraboloid. (For the whole paraboloid, restrict the exponents to the range $q' = ((n-1)p'/(n+1))'$.)

Dimension  Range of p and q
n = 3      $q' \ge (p'/3)'$; $p' \ge 6$                      Strichartz 1977 [49]
           $q' \ge (p'/3)'$; $p' > 4$                        Barcelo 1985 [2] (best possible)
n = 4      $q' \ge (p'/2)'$; $p' \ge 4$                      Strichartz 1977 [49]
           $q' \ge (p'/2)'$; $p' > 3$                        Wolff 2000 [69] (best possible)
n > 4      $q' \ge ((n-2)p'/n)'$; $p' \ge 2n/(n-2)$          Strichartz 1977 [49]
           $q' \ge ((n-2)p'/n)'$; $p' > 2(n+2)/n$            Wolff 2000 [69]
           $q' \ge ((n-2)p'/n)'$; $p' > 2(n-1)/(n-2)$        (conjectured)

Table 2. Known results on the restriction problem $R_S(p \to q)$ (or $R^*_S(q' \to p')$) for compact subsets of the cone. (For the whole cone, restrict the exponents to the range $q' = ((n-2)p'/n)'$.)
5 Local restriction estimates
We now begin discussing some of the tools used to prove the above restriction theorems. The first key idea is to reduce the study of global restriction theorems (where the physical space variable is allowed to range over all of $\mathbb{R}^n$) to that of local restriction theorems (where the physical space variable is constrained to lie in a ball).

More precisely, for any exponents $p$, $q$, and any $\alpha \ge 0$, let $R_S(p \to q; \alpha)$ denote the statement that the localized restriction estimate
$$\|\hat f|_S\|_{L^q(S;\,d\sigma)} \le C_{p,q,S,\alpha}\, R^{\alpha}\, \|f\|_{L^p(B(x_0,R))} \qquad (6)$$
holds for any radius $R \ge 1$, any ball $B(x_0, R) := \{x \in \mathbb{R}^n : |x - x_0| \le R\}$ of radius $R$, and any test function $f$ supported in $B(x_0, R)$. Note that the center $x_0$ of the ball is irrelevant since one can translate $f$ by an arbitrary amount without affecting the magnitude of $\hat f$. Observe that estimates for lower $\alpha$ immediately imply estimates for higher $\alpha$ (keeping $p$, $q$, $S$ fixed), since $R \ge 1$. Also, the local estimate $R_S(p \to q; 0)$ is clearly equivalent to the global estimate $R_S(p \to q)$, by letting $R \to \infty$ and applying a limiting argument. Finally, it is easy to prove estimates of this type for very large $\alpha$; for instance, for smooth compact hypersurfaces $S$ one has the estimate $R_S(p \to q; n/p')$ just from the Hölder inequality
$$\|\hat f\|_{L^\infty(\mathbb{R}^n)} \le \|f\|_{L^1(B(x_0,R))} \lesssim R^{n/p'}\, \|f\|_{L^p(B(x_0,R))}.$$
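The trivial estimate with $\alpha = n/p'$ can be unpacked step by step. In this sketch $\sigma(S)$ denotes the total surface measure of the compact hypersurface $S$ and $|B(x_0,R)| = c_n R^n$ the volume of the ball; both symbols are introduced here for illustration only.

```latex
% f is supported in B(x_0,R); S is compact, so \sigma(S) is finite.
\begin{align*}
\|\hat f\|_{L^q(S;\,d\sigma)}
&\le \sigma(S)^{1/q}\,\|\hat f\|_{L^\infty(\mathbb{R}^n)}
\le \sigma(S)^{1/q}\,\|f\|_{L^1(B(x_0,R))}\\
&\le \sigma(S)^{1/q}\,|B(x_0,R)|^{1/p'}\,\|f\|_{L^p(B(x_0,R))}
= C_{p,q,S}\,R^{n/p'}\,\|f\|_{L^p(B(x_0,R))}.
\end{align*}
```

The first step uses only the finiteness of $\sigma(S)$, the second the pointwise bound $|\hat f(\xi)| \le \|f\|_{L^1}$, and the third Hölder's inequality on the ball.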
Thus the aim is to lower the value of $\alpha$ from the trivial value of $\alpha = n/p'$, toward the ultimate aim of $\alpha = 0$, at least when $p$ and $q$ lie inside the conjectured range of the restriction conjecture. (For other $p$ and $q$, the canonical counterexamples will give some non-zero lower bound on $\alpha$.)

By duality, the local restriction estimate $R_S(p \to q; \alpha)$ is equivalent to the local extension estimate $R^*_S(q' \to p'; \alpha)$, which asserts that
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(B(x_0,R))} \le C_{p,q,S,\alpha}\, R^{\alpha}\, \|F\|_{L^{q'}(S;\,d\sigma)} \qquad (7)$$
for all smooth functions $F$ on $S$, all $R \ge 1$, and all balls $B(x_0, R)$.
The uncertainty principle suggests that since the spatial variable has now been localized to scale $R$, the frequency variable can be safely blurred to scale $1/R$. In the case where $S$ is a smooth compact hypersurface, this is indeed correct; the estimate (6) is equivalent to the estimate

one will eventually be able to obtain the estimate $R^*_S(q' \to p'; \varepsilon)$ for any $\varepsilon > 0$, at which point we can use epsilon removal lemmas to obtain a global restriction estimate.

We still have to obtain the estimate $R^*_S(q' \to p'; \alpha - \varepsilon)$ from the inductive hypothesis $R^*_S(q' \to p'; \alpha)$. We first describe a somewhat oversimplified version of the main idea as follows. We have to prove an estimate of the form
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(B(0,R))} \le C_{p,q,S,\alpha}\, R^{\alpha-\varepsilon}\, \|F\|_{L^{q'}(S)}$$
for some $F$ on $S$ and $R \gg 1$, which we now fix. Introduce a scale $r$ which is slightly smaller than $R$. Then by the induction hypothesis $R^*_S(q' \to p'; \alpha)$ applied to scale $r$, we have
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(B(x_0,r))} \le C_{p,q,S,\alpha}\, r^{\alpha}\, \|F\|_{L^{q'}(S)}$$
for any ball $B(x_0, r)$. Thus we can already prove the desired estimate on smaller balls $B(x_0, r)$. More generally, we can prove
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(\bigcup_j B(x_j,r))} \le C_{p,q,S,\alpha}\, R^{\alpha-\varepsilon}\, \|F\|_{L^{q'}(S)}$$
on any union $\bigcup_j B(x_j, r)$ of smaller balls, as long as the number of balls involved is not too large (e.g. at most $O((\log R)^C)$ for some absolute constant $C$).

As a rough first approximation, the idea of Wolff is to identify the "bad" balls $B(x_j, r)$ on which the function $(F\,d\sigma)^\vee$ "concentrates." The choice of these balls will of course depend on $F$. These balls can be dealt with using the induction hypothesis, and it then remains to verify the restriction estimate on the exterior of these bad balls:
$$\|(F\,d\sigma)^\vee\|_{L^{p'}(B(0,R) \setminus \bigcup_j B(x_j,r))} \le C_{p,q,S,\alpha}\, R^{\alpha-\varepsilon}\, \|F\|_{L^{q'}(S)}.$$
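The claim that a small number of balls of radius $r$ costs only a negligible factor can be made concrete. The sketch below rests on assumptions that are not fixed by the text: the smaller scale is taken to be $r = R^{1-\delta}$ for some small $\delta > 0$, the number of balls is $N \le (\log R)^{C_0}$, and $C$ stands for the constant in the induction hypothesis.

```latex
% Assumptions (illustrative): r = R^{1-\delta}, N \le (\log R)^{C_0} balls, p' finite.
\begin{align*}
\|(F\,d\sigma)^\vee\|_{L^{p'}(\bigcup_{j} B(x_j,r))}^{p'}
&\le \sum_{j=1}^{N} \|(F\,d\sigma)^\vee\|_{L^{p'}(B(x_j,r))}^{p'}
\le N\,\bigl(C\,r^{\alpha}\,\|F\|_{L^{q'}(S)}\bigr)^{p'},\\
\|(F\,d\sigma)^\vee\|_{L^{p'}(\bigcup_{j} B(x_j,r))}
&\le C\,N^{1/p'}\,R^{(1-\delta)\alpha}\,\|F\|_{L^{q'}(S)}
\le C\,(\log R)^{C_0/p'}\,R^{-\delta\alpha}\,R^{\alpha}\,\|F\|_{L^{q'}(S)} .
\end{align*}
```

Under these assumptions, any $\varepsilon < \delta\alpha$ works, since $(\log R)^{C_0/p'} R^{-\delta\alpha} \le C_\varepsilon R^{-\varepsilon}$ for $R \ge 1$; the logarithmic loss from counting balls is absorbed by the polynomial gain from passing to the smaller scale.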
The above description of Wolff's argument was something of an oversimplification for two reasons; firstly, Wolff is working in the bilinear setting rather than the linear setting, and secondly, the balls $B(x_j, r)$ turn out to depend not only on the original function $F$, but on the wave packet decomposition associated to $F$. Let us ignore the first reason for the moment, and clarify the second.

On the ball $B(0, R)$, one can obtain a wave packet decomposition of the form
$$(F\,d\sigma)^\vee = \sum_T c_T\, \psi_T.$$
Because the argument of Wolff dealt with the cone, the wave packet decomposition here is slightly different from that discussed in the previous section, in two respects: firstly, the tubes $T$ are oriented on "light rays" normal to the cone $S$ instead of pointing in general directions, and secondly, the internal structure of the wave packet $\psi_T$ is more interesting than just the product of a plane wave and a bump function, being decomposable into "plates." We will, however, gloss over this technical issue. For simplicity, let us suppose that the constants $c_T$ behave like a characteristic function; more precisely, there is some collection $\mathbf{T}$ of tubes such that $c_T = c$ for $T \in \mathbf{T}$ and $c_T = 0$ otherwise. (The general case can be reduced to this case via a dyadic pigeonholing argument, which costs a relatively small factor of $\log R$.) Then we have
$$(F\,d\sigma)^\vee(x) = c \sum_{T \in \mathbf{T}} \psi_T(x).$$
The idea now is to allow each wave packet $\psi_T$ to be able to "exclude" a single ball $B_T$ of the slightly smaller radius $r$. In other words, one divides $(F\,d\sigma)^\vee$ into two pieces, a "localized" piece
$$c \sum_{T \in \mathbf{T}} \psi_T(x)\, \chi_{B_T}(x)$$
and the "global" piece
$$c \sum_{T \in \mathbf{T}} \psi_T(x)\, (1 - \chi_{B_T}(x)).$$
One then tries to control the localized piece using the induction hypothesis, and then handle the non-localized piece using the strategy of the previous section.
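In the linear setting, this strategy does not quite work, because the localized pieces cannot be adequately controlled by the induction hypothesis. However, in the bilinear setting, when one is trying to prove an estimate of the form
$$\|(F_1\,d\sigma_1)^\vee (F_2\,d\sigma_2)^\vee\|_{L^{p'/2}(B(0,R))} \le C_{p,q,S,\alpha}\, R^{\alpha-\varepsilon}\, \|F_1\|_{L^{q'}(S_1)}\, \|F_2\|_{L^{q'}(S_2)},$$
then one can decompose
$$(F_j\,d\sigma_j)^\vee = c_j \sum_{T_j \in \mathbf{T}_j} \psi_{T_j}$$
for $j = 1, 2$, and allow each tube $T_j$ to exclude a single$^{13}$ ball $B_{T_j}$ of radius $r$. We can then split the bilinear expression
$$\sum_{T_1 \in \mathbf{T}_1} \sum_{T_2 \in \mathbf{T}_2} \psi_{T_1} \psi_{T_2}$$
into a local piece
$$\sum_{T_1 \in \mathbf{T}_1} \sum_{T_2 \in \mathbf{T}_2} \psi_{T_1} \psi_{T_2}\, \chi_{B_{T_1} \cap B_{T_2}}$$
(where both tubes $T_1$ and $T_2$ are excluding $x$), and a global piece
$$\sum_{T_1 \in \mathbf{T}_1} \sum_{T_2 \in \mathbf{T}_2} \psi_{T_1} \psi_{T_2}\, (1 - \chi_{B_{T_1} \cap B_{T_2}}).$$

13 Actually, in Wolff's argument there are $O((\log R)^C)$ such balls excluded, but this is a minor technical detail.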
The local piece turns out to be easily controllable by the inductive hypothesis, so it
remains to control the global piece.
The key point is to prevent too many of the tubes $T_1$ and $T_2$ from interacting with each other. This is done by selecting the balls $B_{T_1}$, $B_{T_2}$ strategically. Roughly speaking, for each tube $T_1$, we choose $B_{T_1}$ to be the ball which contains as many intersections of the form $T_1 \cap T_2$ as possible; the ball $B_{T_2}$ is chosen similarly. The effect of this choice is that any point $x$ which lies in a large number of tubes in $\mathbf{T}_1$ and in $\mathbf{T}_2$ simultaneously is likely to be placed primarily in the local part of the bilinear expression, and not in the global part.

With this choice of the excluding balls $B_{T_1}$, $B_{T_2}$, Wolff was able to obtain satisfactory control on the number of times tubes $T_1$ from $\mathbf{T}_1$ would intersect tubes $T_2$ from $\mathbf{T}_2$. The key geometric observation is as follows. Suppose that many tubes $T_1$ in $\mathbf{T}_1$ were going through a common point $x_0$; since the tubes $T_1$ are constrained to be oriented along light rays, these tubes must then align on a "light cone." Now consider a tube $T_2$ from $\mathbf{T}_2$; this tube is of course transverse to all the tubes $T_1$ considered above, and furthermore is transverse to the light cone that the tubes $T_1$ lie on. It can either pass near $x_0$ or stay far away from $x_0$. In the first case it turns out that the joint contribution of the tubes $T_1$ and $T_2$ will largely lie in the local part of the bilinear expression and thus be manageable. In the second case we see from transversality that the tube $T_2$ can only intersect a small number of tubes $T_1$. Thus if there is too much intersection among tubes in $\mathbf{T}_1$, then there will be fairly sparse intersection between those tubes $T_1$ and tubes $T_2$ in $\mathbf{T}_2$.

This geometric fact was exploited via combinatorial arguments in [69]. When combined with some local $L^2$ arguments from [37] to handle the fine-scale oscillations, and the induction-on-scales argument, this fact was used to obtain the near-optimal bilinear restriction theorem $R^*_{S_1,S_2}(2 \times 2 \to q)$ for $q > \frac{n+2}{n}$. (The endpoint $q = \frac{n+2}{n}$, corresponding to the Machedon-Klainerman conjecture, was then obtained in [55] by refining the above argument.)
9 Adapting Wolff's argument to the paraboloid

The above argument of Wolff [69], which yielded the optimal bilinear $L^2$ restriction theorem for the cone, relied on a key fact about the cone: all the tubes passing through a common point $x_0$ were restricted to lie on a hypersurface (specifically, the cone with vertex at $x_0$). This property does not hold for the paraboloid, since in this setting the tubes can point in arbitrary directions. Nevertheless, it is possible to recover this hypersurface property by exploiting a little more structure at fine scales, and more precisely by squeezing one "dimension" of gain out of (18), thus obtaining the optimal bilinear $L^2$ restriction theorem for the paraboloid (and in fact also for the sphere, by a slight modification of the argument); this was achieved in [59]. We sketch the main idea of that paper here. As in the last section, we can reduce matters to estimating a quantity such as
—
> ij€T
for some I If(flI > so that the analogue of (3) fails. The easy argument of this last example does not work if we consider a convex body $B$ with smooth boundary. This problem has been studied in [13] and [11], where it has been shown that an analogue of (3) exists in $\mathbb{R}^3$ only under additional geometric assumptions. We go back to the case of a polygon, for which it is easy to estimate the lengths of the chords A(p , 0) defined in Figure 6 and then deduce (2). As another application of (3), one easily obtains the well-known bound
$$|\hat\chi_B(\xi)| \le c\,|\xi|^{-3/2},$$
which is true for every convex planar body B, the boundary of which is smooth with everywhere positive curvature. It can be shown that (2) is best possible. Indeed, for any convex planar body B, we have
Jo
>0.
(7)
The proof of (7) depends on a modification of an argument used by Yudin ([48]) in the study of Lebesgue constants (see [11]). Note that a connection between dO and f12 dx seems to be natural (see also [32]).
The bound in (2) is also essentially best possible in a perhaps more subtle sense. If we assume $P = P_N$ to be a polygon with $N$ sides, contained in the unit disc, then the above argument (divergence theorem + splitting the integration according to the sides) shows that
Average Decay of the Fourier Transform
Fig. 8
$$\int_0^{2\pi} |\hat\chi_P(\rho\Theta)|\, d\theta \le c\,N\,\rho^{-2} \log\rho, \qquad (8)$$
where now $c$ is independent of $N$. We show that (8) is essentially best possible, since for any $\varepsilon > 0$ we cannot replace $N$ in the above RHS by $N^{1-\varepsilon}$. Indeed, assume $P_N$ is a regular $N$-gon inscribed in the unit disc $D$ and consider $\hat\chi_{D \setminus P_N}(\rho\Theta)$. Here the parallel section function $f(s)$ is the sum of the lengths of the two small segments obtained after removing the chord of the polygon from the chord of the disc in Figure 8. The set $D \setminus P_N$ is the union of $N$ "lunes." Let us number them counterclockwise and let us consider only the first $[N/2]$ lunes for simplicity. Figure 9 shows that the contribution of the $k$th lune ($1 \le k \le [N/2]$) to the total variation
of the parallel section function f(s) is
Adding on k and using the theorem on the Fourier transform of a function of bounded variation we get, uniformly in 9,
1logN On the other hand (l)unplies
Jo
IXD(PNO)I dO =
for a suitable diverging sequence (e small). Then
—I
(2lrpN)I
?
—3/2
C2PN
We can choose PN so that PN
G. Travaglini
e
0' Fig. 9
f2Jf
t2,r
J0
IIPNpNO)Idoa
IXD(PN8)I dO —
IXD\PN(PN9)
cjN3'2 — In order to end the proof, let us assume that K = K(N) satisfies
K(N)p2 log p.
IXPN(P8)I dO Jo
Then choosing again p = PN
N
The bound in (9) came out as a very partial answer to a question raised a few years ago by A. Koldobsky. The above result has not been useful to him, but it was later one of the basic ingredients in [11].

2.2 Circular means for planar convex bodies

Let us consider a convex planar body $B$ and more general $L^p$ circular means:
$$\left( \int_0^{2\pi} |\hat\chi_B(\rho\Theta)|^p\, d\theta \right)^{1/p}.$$
Arguing as in the L' case, one can prove the following best possible bound for a
polygon P when I
2 can be explained in the
following way.
When $p < 2$ the above result essentially says that polygons have $p$-order of decay equal to $1 + 1/p$, while for all other bodies we have $3/2$, i.e. they behave like discs. Observe that if $B$ is not a polygon, then its (piecewise smooth) boundary must contain an arc with positive curvature. Hence we get sharp uniform order of decay $3/2$ on a positive interval of $\theta$'s, and this (together with (10)) implies the average $3/2$ over $[0, 2\pi)$ when $p \le 2$. The situation is different when $p > 2$, where (consider for a moment the extreme case $p = \infty$) the flat points of the boundary are the relevant ones. Here one can produce a scaling between the polygons and the disc by constructing suitable convex bodies containing a piece of the graph of the function $x \mapsto |x|^\gamma$ ($\gamma > 2$) within their boundaries. The above dichotomy is no longer valid for arbitrary convex bodies, where the average decay describes global geometric properties of $B$, as we shall see in the following two sections.

2.3 Inscribed polygons
Given a planar convex body $B$ we consider (as in [34], [41] or [36]) the following inscribed polygon. Choose any chord at distance $\delta$ from the boundary (as in Figure 6) and name it $\ell_1$. Let us move counterclockwise constructing a finite sequence of consecutive chords (each one at distance $\delta$ from the boundary) until we reach $\ell_1$ again. Then, if necessary, we replace the last chord by one consecutive to $\ell_1$. In this way we get a polygon inscribed in $B$ and we denote it by $P^B_\delta$ (once $\delta$ is given, this polygon is uniquely determined up to the choice of the starting point, and this latter turns out to be irrelevant). Let $M^B_\delta$ be the number of sides of $P^B_\delta$. It is known (see [41]) that $M^B_\delta \le c\,\delta^{-1/2}$. Observe that $M^B_\delta \approx \delta^{-1/2}$ whenever $\partial B$ contains a piece of a curved arc, while, on the other hand, $M^B_\delta \approx 1$ if $B$ is a polygon. We have the following (see [11]).
Fig. 11
Theorem 4. Let $B$ be a convex planar body and consider the inscribed polygon $P^B_{\rho^{-1}}$
(see FIgure 11). LetO_ 0, Ixa(p8)1d8 > 0.
limsup p—,+oo
JO
The first step in the proof of Theorem 4 is to prove the inequality
ç2r
I
Jo
which shows that the polygon
Jo
Ipa (p0) dO,
is a good substitute for B while studying
$\hat\chi_B(\rho\Theta)$. We are therefore reduced to estimating the average decay for a polygon with no more than $c\rho^{\alpha}$ sides. In order to get (15) we then apply the trivial estimate (8) with $\rho^{\alpha}$ in place of $N$. At this point one should expect to have obtained a poor result. The second part of the statement of Theorem 4 shows that it is not so. The counterexample follows the idea which has been used to prove (9).

2.4 Measuring the image of the Gauss map

We can get similar results starting from a different point of view. We think of $B$ as close to a polygon when its boundary has relatively few normals. We then recall that,
by convexity, at every point of a B there is a left and a right tangent, therefore a left and a right outward normal. Let C [0. 2,r) be the whole set of the directions is the image of the appearing as a left or right outward normal. In other words, (generalized) Gauss map. We say that a B has "few" or "many" normals according to We measure this set in a fractal way by defining its s-neighborhood the "size" of
<sj
= fo and assuming an inequality of the form
If B is a disc, we need d = 1. On the other hand, we can choose d = 0 if and only if B is a polygon with finitely many sides. As an intermediate example, let us consider a polygon with infinitely many sides, such that the set of the normal directions to its sides is = (y > 0). Then it is not difficult to show that 5y/(y+l) We have the following essentially sharp result (see [I 1]).
Theorem 5. Let 0