Preface
This book grew out of my attempt in August 1998 to compare Carleson's and Fefferman's proofs of the pointwise convergence of Fourier series with Lacey and Thiele's proof of the boundedness of the bilinear Hilbert transform. I started with Carleson's paper and soon realized that my summer vacation would not suffice to understand Carleson's proof. Bit by bit I began to understand it. I was impressed by the breathtaking proof and started to give a detailed exposition that would be understandable by someone who, like me, was not a specialist in harmonic analysis. I have been working on this project for almost two years and lectured on it at the University of Seville from February to June 2000. Thus, this book is meant for graduate students who want to understand one of the great achievements of the twentieth century.

This is the first exposition of Carleson's theorem about the convergence of Fourier series in book form. It differs from the two previous sets of lecture notes, one by Mozzochi [38] and the other by Jørsboe and Mejlbro [26], in that our exposition points out the motivation of every step in the proof. Since its publication in 1966, the theorem has acquired a reputation of being an isolated result, very technical, and not profitable to study. There have also been many attempts to obtain the result by simpler methods. To this day it is the proof that gives the finest results about the maximal operator of Fourier series.

The Carleson analysis of the function, one of the fundamental steps of the proof, has an interesting musical interpretation. A sound wave consists of a periodic variation of pressure occurring around the equilibrium pressure prevailing at a particular time and place. The sound signal f is the variation of the pressure as a function of time. The Carleson analysis gives the score of a musical composition given the sound signal f. The Carleson analysis can be carried out at different levels.
Obviously the above assertion is true only if we consider an adequate level. Carleson’s proof has something that reminds me of living organisms. The proof is based on many choices that seem arbitrary. This happens also in living organisms. An example is the error in the design of the eyes of the vertebrates. The photoreceptors are situated in the retina, but their outputs emerge on the wrong side: inside the eyes. Therefore the axons must finally
be packed in the optic nerve that exits the eye through the so-called blind spot. But so many fibers (125 million light-sensitive cells) cannot pass through a small spot. Hence evolution has solved the problem by packing another layer of neurons inside the eye, neurons that have rich interconnections with the photoreceptors and with each other. These neurons process the information before it is sent to the brain; hence the number of axons that must leave the eye is substantially reduced (one million axons in each optic nerve). The incoming light must traverse these neurons to reach the photoreceptors, hence evolution had the added problem of making them transparent.

We have tried to arrange the proof so that these things do not happen, so that arbitrary selections do not obscure the idea of the proof. We have had the advantage of the text processor TeX, which has allowed us to rewrite without much pain. (We hope that no signs of these rewritings remain.)

By the way, the eyes and the ears process information in totally different ways, and Carleson's proof follows the ear more than the eye. What the neurons inside the eye are doing is solving the problem: how must I compress the information to send images using the least possible number of bits? A problem for which wavelets are being used today.

I would like this book to be a commentary on Carleson's paper. Therefore we give the Carleson-Hunt theorem following Carleson's paper more than Hunt's. The chapter on the maximal operator of Fourier series S*f gives the first exposition of the consequences of the Carleson-Hunt theorem. Some of the results appear here for the first time.

I wish to express my thanks to Fernando Soria and to N. Yu. Antonov for sending me their papers and their comments about the consequences of the Carleson-Hunt theorem.
My thanks go also to some members of the Department of Mathematical Analysis of the University of Seville, especially to Luis Rodríguez-Piazza, who showed me the example contained in chapter XIII.
Table of Contents

Preface
Introduction
About the notation

Part I. Fourier series and Hilbert Transform

1. Hardy-Littlewood maximal function
   1.1 Introduction
   1.2 Weak Inequality
   1.3 Differentiability
   1.4 Interpolation
   1.5 A general inequality

2. Fourier Series
   2.1 Introduction
   2.2 Dirichlet
   2.3 Fourier Series of Continuous Functions
   2.4 Banach continuity principle
   2.5 Summability
   2.6 The Conjugate Function
   2.7 The Hilbert transform on R
   2.8 The conjecture of Luzin

3. Hilbert Transform
   3.1 Introduction
   3.2 Truncated operators on L2(R)
   3.3 Truncated operators on L1(R)
   3.4 Interpolation
   3.5 The Hilbert Transform
   3.6 Maximal Hilbert Transform

Part II. The Carleson-Hunt Theorem

4. The Basic Step
   4.1 Introduction
   4.2 Carleson maximal operator
   4.3 Local norms
   4.4 Dyadic Partition
   4.5 Some definitions
   4.6 Basic decomposition
   4.7 The first term
   4.8 Notation α/β
   4.9 The second term
   4.10 The third term
   4.11 First form of the basic step
   4.12 Some comments about the proof
   4.13 Choosing the partition Πα. The norm |f|α
   4.14 Basic theorem, second form

5. Maximal inequalities
   5.1 Maximal inequalities for Δ(Π, x)
   5.2 Maximal inequalities for H_I* f

6. Growth of Partial Sums
   6.1 Introduction
   6.2 The seven trick
   6.3 The exceptional set
   6.4 Bound for the partial sums

7. Carleson Analysis of the Function
   7.1 Introduction
   7.2 A musical interlude
   7.3 The notes of f
   7.4 The set X
   7.5 The set S

8. Allowed pairs
   8.1 The length of the notes
   8.2 Well situated notes
   8.3 The length of well situated notes
   8.4 Allowed pairs
   8.5 The exceptional set

9. Pair Interchange Theorems
   9.1 Introduction
   9.2 Choosing the shift m
   9.3 A bound of ‖f‖α
   9.4 Selecting an allowed pair

10. All together
   10.1 Introduction
   10.2 End of proof

Part III. Consequences

11. Some spaces of functions
   11.1 Introduction
   11.2 Decreasing rearrangement
   11.3 The Lorentz spaces Lp,1(μ) and Lp,∞(μ)
   11.4 Marcinkiewicz interpolation theorem
   11.5 Spaces near L1(μ)
   11.6 The spaces L log L(μ) and Lexp(μ)

12. The Maximal Operator of Fourier series
   12.1 Introduction
   12.2 Maximal operator of Fourier series
   12.3 The distribution function of S*f
   12.4 The operator S* on the space L∞
   12.5 The operator S* on the space L(log L)2
   12.6 The operator S* on the space Lp
   12.7 The maximal space Q
   12.8 The theorem of Antonov

13. Fourier transform on the line
   13.1 Introduction
   13.2 Fourier transform

References
Comments
Subject Index
Introduction
The origin of Fourier series is the 18th century study by Euler and Daniel Bernoulli of the vibrating string. Bernoulli took the point of view, suggested by physical considerations, that every function can be expanded in a trigonometric series. At that time the prevalent idea was that such an expansion implied differentiability properties, and that it was therefore not possible in general. The question could not be settled then: an answer depended on what is understood by a function, a concept that was not clarified until the 20th century. The first positive results were given in 1829 by Dirichlet, who proved that the expansion is valid for every continuous function with a finite number of maxima and minima.

A great portion of the mathematics of the first part of the 20th century was motivated by the convergence of Fourier series. For example, Cantor's set theory has its origin in the study of this convergence. Also Lebesgue's measure theory owes its success to its application to Fourier series. Luzin, in 1913, while considering the properties of the Hilbert transform, conjectured that every function in L2[−π, π] has an a.e. convergent Fourier series. Kolmogorov, in 1923, gave an example of a function in L1[−π, π] with an a.e. divergent Fourier series. A. P. Calderón (in 1959) proved that if the Fourier series of every function in L2[−π, π] converges a.e., then

m{x : sup_n |S_n f(x)| > y} ≤ C ‖f‖₂² / y².
For many people the belief in Luzin's conjecture was destroyed; it seemed too good to be true. So it was a surprise when Carleson, in 1966, proved Luzin's conjecture. The next year Hunt proved the a.e. convergence of the Fourier series of every f ∈ Lp[−π, π] for 1 < p ≤ ∞. Kolmogorov's example is in fact a function in L log log L with a.e. divergent Fourier series. Hunt proved that every function in L(log L)2 has an a.e. convergent Fourier series. Sjölin, in 1969, sharpened this result: every function in the space L log L log log L has an a.e. convergent Fourier series. The last result in this direction is that of Antonov (in 1996), who proved the
same for functions in L log L(log log log L). Also, there are some quasi-Banach spaces of functions with a.e. convergent Fourier series given by Soria in 1985. In the other direction, the results of Kolmogorov were sharpened by Chen in 1969, giving functions in L(log log L)^(1−ε) with a.e. divergent Fourier series. Recently, Konyagin (1999) has obtained the same result for the space Lϕ(L) whenever ϕ satisfies ϕ(t) = o(√(log t / log log t)).

Apart from the proof of Carleson, there have been two others. First, the one by Fefferman in 1973. He says: "However, our proof is very inefficient near L1. Carleson's construction can be pushed down as far as L log L(log log L), but our proof seems unavoidably restricted to L(log L)^M for some large M." Then we have the recent proof of Lacey and Thiele; they are not as explicit as Fefferman, but what they prove (as far as I know) is limited to the case p = 2. The proof of Lacey and Thiele is based on ideas from Fefferman's proof; also, Fefferman's proof has been very important, since it inspired these two authors in their magnificent proof of the boundedness of the bilinear Hilbert transform. The trees and forests that appear in these proofs have some resemblance to the notes of the function and the allowed pairs of Carleson's proof, introduced in chapters eight and nine of this book, but to understand these relationships better will be matter for another book.

The aim of this book is the exposition of the principal result about the convergence of Fourier series, that is, the Carleson-Hunt Theorem. The book has three parts. The first part gives a review of some results needed in the proof and consists of three chapters. In the first chapter we give a review of the Hardy-Littlewood maximal function. We prove that this operator transforms Lp into Lp for 1 < p ≤ ∞. The differentiation theorem allows one to see the great success we obtain with a pointwise convergence problem by applying the idea of the maximal function.
This makes it reasonable to consider the maximal operator S*f(x) = sup_n |S_n(f, x)| in the problem of convergence of Fourier series. In chapter two we give elementary results about Fourier series. We see the relevance of the conjugate function and explain the elements on which Luzin, in 1913, founded his conjecture about the convergence of Fourier series of L2 functions. We also present Dini's and Jordan's tests of convergence, in conformity with the law: s/he who does not know these criteria must not read Carleson's proof. The properties of the Hilbert transform needed in the proof of Carleson's Theorem are treated in chapter three.

In the second part we give the exposition of the Carleson-Hunt Theorem. The basic idea of the proof is the following. Our aim is to bound what we call Carleson integrals

p.v. ∫_{−π}^{π} (e^{in(x−t)} / (x − t)) f(t) dt.
To this end we consider a partition Π of the interval [−π, π] into subintervals, one of them I(x) containing the point x, and write the integral as

p.v. ∫_{−π}^{π} (e^{in(x−t)} / (x − t)) f(t) dt
  = p.v. ∫_{I(x)} (e^{in(x−t)} / (x − t)) f(t) dt
  + Σ_{J∈Π, J≠I(x)} [ ∫_J (e^{in(x−t)} f(t) − M_J) / (x − t) dt + M_J ∫_J dt / (x − t) ],

where M_J is the mean value of e^{in(x−t)} f(t) on the interval J. The last sum can be conveniently bounded, so that, in fact, we have changed the problem of bounding the first integral to the analogous problem for the integral on I(x). After a change of scale we see that we have a similar integral, but the number n of cycles in the exponent has decreased. Therefore we can repeat the reasoning.

With this procedure we obtain the theorem that S_n(f, x) = o(log log n) a.e., for every f ∈ L2[−π, π]. We think that to understand the proof of Carleson's theorem it is important to start with this theorem, because this is how the proof was generated. Only after we have understood this proof can we understand the very clever modifications that Carleson devised to obtain his theorem. The next three chapters are dedicated to this end. The first deals with the actual bound of the second and third terms, and the problem of how we must choose the partition to optimize these bounds. In Chapter five we prove that the bounds are good, with the exception of sets of controlled measure. Then, in Chapter six, given f ∈ L2[−π, π], y > 0, N ∈ ℕ, and ε > 0, we define a measurable set E with m(E) < Aε and such that

sup_{0≤n≤N/4} | p.v. ∫_{−π}^{π} (e^{in(x−t)} / (x − t)) f(t) dt | ≤ C (‖f‖₂ / √ε) log N

off the set E. From this estimate we obtain the desired conclusion that S_n(f, x) = o(log log n) a.e. These three chapters follow Carleson's paper, where instead of f ∈ L2 he assumed that |f|(log⁺|f|)^(1+δ) ∈ L1, reaching the same conclusion. Since we shall obtain further results, we have taken the simpler hypothesis that f ∈ L2. In fact, our motivation to include the proof is to allow the reader to understand the modifications contained in the next five chapters. The logarithmic term appears in the above proof because every time we apply the basic procedure, we must put apart in the set E a small subset where the bound is not good. We have to put in a term log N in order to obtain a controlled measure.
In fact, we are considering all pairs (n, J) formed by a dyadic subinterval J of [−π, π], and the number of cycles of the Carleson integral. If we consider the procedure of chapters four to six, then we suspect
that we do not need all of these pairs. This is the basic observation on which all the clever reasoning of Carleson is founded. In chapter seven we determine which pairs are needed. Carleson made an analysis of the function to detect which pairs these are. If we think of f as the sound signal of a piece of music, then this analysis can be seen as a process to derive from f the score of this piece of music. In this chapter we define the set Qj of notes of f at level j. In chapter eight we define the set Rj of allowed pairs. This is an enlargement of the set of notes of f, made so that we can achieve two objectives. The principal objective is that if α = (n, J) is a pair such that α ∈ Rj, then the sound of the notes of f (at level j) whose duration contains J is essentially a single note or a rest. This is very important because if we consider a Carleson integral Cα f(x) with this pair, then we have a candidate note, the sound of f, that is an allowed pair and therefore can be used in the basic procedure of chapter four. Chapter nine is the most difficult part of the proof. In it we see how, given an arbitrary Carleson integral Cα f(x), we can obtain an allowed pair ξ such that we can apply the procedure of chapter four, and a change of frequency, to bound this integral. In chapter ten we apply all this machinery to prove the basic inequality of Theorem 10.2.

The last part of the book is dedicated to deriving some consequences of the proof of Carleson-Hunt. First, in chapter eleven we prove a version of the Marcinkiewicz interpolation theorem and give the definition and first properties of the spaces that we shall need in chapter twelve. In particular, we study a class of spaces near L1(μ) that play a prominent role. We prove that they are atomic spaces, a fact that allows very neat proofs in the following chapter. In Chapter twelve we study the maximal operator S*f of Fourier series.
In it we give detailed and explicit versions of Hunt's theorem, with improved constants. We end the chapter by defining two quasi-Banach spaces, Q and QA, of functions with almost everywhere convergent Fourier series. These spaces improve the known results of Sjölin, Soria and Antonov, and the proofs are simpler. In the last chapter we consider the Fourier transform on R. We consider the problem of when we can obtain the Fourier transform of a function f ∈ Lp(R) by the formula

f̂(x) = lim_{a→+∞} ∫_{−a}^{a} f(t) e^{−2πixt} dt.
We prove by an example (Example 13.2) that our results are optimal.
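The central object of this introduction is the maximal operator of the partial sums, S*f(x) = sup_n |S_n(f, x)|. As a quick numerical illustration (a hypothetical sketch of ours, not from the book: the quadrature, the sample function and all names are assumptions), one can compute the partial sums of an L2 function directly from the definition and watch both the convergence S_n(f, x) → f(x) and the finiteness of the truncated supremum:

```python
import numpy as np

# Hypothetical sketch: approximate the Fourier coefficients of a 2*pi-periodic
# f by a Riemann sum, then form the partial sums S_n(f, x) and the truncated
# maximal operator S*_N f(x) = max_{0 <= n <= N} |S_n(f, x)|.

def fourier_coefficients(f, N, m=4096):
    t = np.linspace(-np.pi, np.pi, m, endpoint=False)
    ft = f(t)
    return {j: np.mean(ft * np.exp(-1j * j * t)) for j in range(-N, N + 1)}

def partial_sums(f, N, x):
    c = fourier_coefficients(f, N)
    s = c[0] + 0j
    sums = [s]
    for n in range(1, N + 1):
        s = s + c[n] * np.exp(1j * n * x) + c[-n] * np.exp(-1j * n * x)
        sums.append(s)                    # sums[n] = S_n(f, x)
    return sums

f = lambda t: np.sign(np.sin(t))          # a square wave, an L2[-pi, pi] function
x, N = 1.0, 50
sums = partial_sums(f, N, x)
error = abs(sums[N] - f(x))               # S_N(f, x) is close to f(x)
sup_N = max(abs(s) for s in sums)         # truncated maximal operator at x
```

At a point of continuity the partial sums converge, so the running supremum stabilizes at a finite value; near the jump of the square wave it would instead record the Gibbs overshoot.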
1. Hardy-Littlewood maximal function
1.1 Introduction

What Carleson proved in 1966 was Luzin's conjecture of 1913, and this proof depended on many results obtained in the fifty years since the conjecture was stated. In this chapter we make a rapid exposition of one of these prerequisites. We can also see one of the best ideas: taking a maximal operator when one wants to prove pointwise convergence. The convergence result obtained is simple: the differentiability of the definite integral. This permits one to observe one of the pieces of Carleson's proof without any technical problems.

Given a function f ∈ L1(R) we ask about the differentiability properties of the definite integral

F(x) = ∫_{−∞}^{x} f(t) dt.

This is equivalent to the question of whether there exists

lim_{h→0} (F(x + h) − F(x)) / h = lim_{h→0} (1/h) ∫_{x}^{x+h} f(t) dt.

When we are confronted with questions of convergence it is advisable to study the corresponding maximal function. Here this is

sup_h (1/h) | ∫_{x}^{x+h} f(t) dt |.

An analogous result in dimension n will be

f(x) = lim_{Q→x} (1/|Q|) ∫_Q f(t) dt,    (1.1)

where Q denotes a cube of center x and side h, and we write Q → x to express that the side h → 0⁺. In the one-dimensional case we have Q = [x − h, x + h]; this difference ([x − h, x + h] instead of [x, x + h]) has no consequence, as we will see. For every locally integrable function f: Rn → C, we put
J.A. de Reyna: LNM 1785, pp. 3–10, 2002. © Springer-Verlag Berlin Heidelberg 2002
Mf(x) = sup_Q (1/|Q|) ∫_Q |f(t)| dt,

where the supremum is taken over all cubes Q ⊂ Rn with center x. Mf is the Hardy-Littlewood maximal function.
1.2 Weak inequality

First observe that given f locally integrable, the function Mf: Rn → [0, +∞] is measurable. In fact for every positive real number α the set {Mf(x) > α} is open, because given x ∈ Rn with Mf(x) > α there exists a cube Q with center at x and such that

(1/|Q|) ∫_Q |f(t)| dt > α.

We only have to observe that the function

y ↦ (1/|Q|) ∫_{y+Q} |f(t)| dt

is continuous. If f ∈ Lp(Rn), with 1 < p < +∞, we shall show that Mf ∈ Lp(Rn). However, for p = 1 this is no longer true. What we can say is only that Mf belongs to weak-L1. That is to say,

m{Mf(x) > α} ≤ c_n ‖f‖₁ / α.
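The weak inequality can be watched in action on a grid. The following is a hypothetical one-dimensional discretization of ours (not from the book): intervals centered at each grid point play the role of the cubes, and we test the inequality with the constant c₁ = 2 · 3 = 6 that the covering lemma below provides for d = 1.

```python
import numpy as np

# Hypothetical 1-D discretization: approximate
#   Mf(x) = sup_Q (1/|Q|) * integral_Q |f(t)| dt
# over grid intervals centered at x, then test the weak (1,1) inequality
#   m{Mf > alpha} <= c_1 * ||f||_1 / alpha  on a sample function.

def maximal_function(values, dx):
    n = len(values)
    # cs[k] = integral of |f| over the first k grid cells
    cs = np.concatenate(([0.0], np.cumsum(np.abs(values)) * dx))
    out = np.zeros(n)
    for i in range(n):
        best = 0.0
        for r in range(1, n):                      # interval half-length r*dx
            lo, hi = max(0, i - r), min(n, i + r + 1)
            best = max(best, (cs[hi] - cs[lo]) / ((hi - lo) * dx))
        out[i] = best
    return out

dx = 0.05
x = np.arange(-5, 5, dx)
f = np.exp(-x ** 2)                 # an integrable sample function
Mf = maximal_function(f, dx)
norm1 = np.sum(np.abs(f)) * dx      # ||f||_1, about sqrt(pi)
alpha = 0.5
measure = np.sum(Mf > alpha) * dx   # m{Mf > alpha}
weak_bound = 6 * norm1 / alpha      # c_1 = 2 * 3^1 from the covering lemma
```

Besides the weak bound, the sketch also exhibits the pointwise domination Mf ≥ |f| (up to discretization error), which is the trivial half of the story.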
The proof is really wonderful. The set where Mf(x) > α is covered by cubes where the mean of |f| is greater than α. If this set has a big measure, we shall have plenty of these cubes. Then we can select a big pairwise disjoint subfamily, and this implies that the norm of f is big. The most delicate point of this proof is that at which we select the disjoint cubes. This is accomplished by the following covering lemma.

Lemma 1.1 (Covering lemma) Let Rd be endowed with some norm, and let c_d = 2 · 3^d. If A ⊂ Rd is a non-empty set of finite exterior measure, and U is a covering of A by open balls, then there is a finite subfamily of disjoint balls B₁, ..., B_n of U such that

c_d Σ_{j=1}^{n} m(B_j) ≥ m*(A).
Proof. We can assume that A is measurable, because if it were not, there would exist an open set G ⊃ A with m(G) finite and such that U would be a covering of G. Now, assuming that A is measurable, there exists a compact set K ⊂ A with m(K) ≥ m(A)/2. Select a finite subcovering of K, say the one given by the balls U₁, U₂, ..., U_m. Assume that these balls are ordered with decreasing radii. Then we select the balls B_j in the following way. First B₁ = U₁ is the greatest of them all. Then B₂ is the first ball in the sequence of the U_j that is disjoint from B₁, if there is one; otherwise we put n = 1. Then B₃ will be the first ball of the U_j that is disjoint from B₁ ∪ B₂. We continue in this way until every ball of the sequence U_j has non-empty intersection with some B_k.

Now we claim that K ⊂ ∪_{k=1}^{n} 3B_k. In fact we know that K ⊂ ∪_{j=1}^{m} U_j. Hence for every x ∈ K, there is a first j such that x ∈ U_j. If this U_j is equal to some B_k, obviously we have x ∈ B_k ⊂ 3B_k. Otherwise U_j intersects some B_k = U_s. Selecting the minimum such k, it must be that s < j, for otherwise we would have selected U_j instead of B_k in our process. So the radius of the ball B_k is greater than or equal to that of U_j. It follows that U_j ⊂ 3B_k. Therefore

(1/2) m(A) ≤ m(K) ≤ 3^d Σ_{j=1}^{n} m(B_j),

and the construction implies that these balls are disjoint.
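The selection in the proof is a greedy procedure, and it is short enough to state as code. A hypothetical sketch in dimension one, with intervals in place of balls (all helper names are ours):

```python
# Hypothetical sketch of the selection in the covering lemma, d = 1: "balls"
# are intervals (c - r, c + r). Scanning a finite cover by decreasing radius,
# keep each interval disjoint from all those already kept; the lemma then
# guarantees the tripled kept intervals 3*B_k cover the original union.

def disjoint(b1, b2):
    (c1, r1), (c2, r2) = b1, b2
    return abs(c1 - c2) >= r1 + r2

def greedy_selection(balls):
    picked = []
    for b in sorted(balls, key=lambda b: -b[1]):   # decreasing radii
        if all(disjoint(b, p) for p in picked):
            picked.append(b)
    return picked

def in_tripled(point, picked):
    return any(abs(point - c) < 3 * r for c, r in picked)

cover = [(0.0, 1.0), (0.5, 0.8), (1.5, 0.6), (2.5, 0.4), (-1.2, 0.5)]
picked = greedy_selection(cover)
# sample points of every original interval lie in some tripled kept interval
covered = all(in_tripled(c + s * 0.99 * r, picked)
              for c, r in cover for s in (-1.0, 0.0, 1.0))
```

The invariant is exactly the one used in the proof: a discarded interval meets a kept interval of radius at least as large, so it is swallowed by that kept interval tripled.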
Lemma 1.2 (Hardy and Littlewood) If f ∈ L1(Rd) then Mf satisfies, for each α > 0, the weak inequality

m{x ∈ Rd | Mf(x) > α} ≤ c_d ‖f‖₁ / α.
Proof. Let A = {x ∈ Rd | Mf(x) > α}; it is an open set. We do not know yet that it has finite measure, so we consider A_n = A ∩ B_n, where B_n is a ball of radius n and center 0. Now each x ∈ A_n has Mf(x) > α; hence there exists an open cube Q with center at x and such that

(1/|Q|) ∫_Q |f(t)| dt > α.    (1.2)

Now cubes are balls for the norm ‖·‖_∞ on Rd. So we can apply the covering lemma to obtain a finite set of disjoint cubes (Q_j)_{j=1}^{m} such that every one of them satisfies (1.2), and

m(A_n) ≤ c_d Σ_{j=1}^{m} m(Q_j).

Therefore we have

m(A_n) ≤ (c_d / α) Σ_{j=1}^{m} ∫_{Q_j} |f(t)| dt.

Since the cubes are disjoint,

m(A_n) ≤ c_d ‖f‖₁ / α.

Taking limits as n → ∞, we obtain the desired bound.
1.3 Differentiability

As an application we desire to obtain (1.1). In fact we can prove something more. It is not only that at almost every point x ∈ Rd we have

lim_{Q→x} (1/|Q|) ∫_Q (f(t) − f(x)) dt = 0,

but that we have

lim_{Q→x} (1/|Q|) ∫_Q |f(t) − f(x)| dt = 0.

A point where this is true is called a Lebesgue point of f.

Theorem 1.3 (Differentiability Theorem) Let f: Rd → C be a locally integrable function. There exists a subset Z ⊂ Rd of null measure such that every x ∉ Z is a Lebesgue point of f. That is,

lim_{Q→x} (1/|Q|) ∫_Q |f(t) − f(x)| dt = 0.

Proof. Whether x is a Lebesgue point of f or not depends only on the values of f in a neighborhood of x. So we can reduce to the case of f integrable. Also, the result is true for a dense set in L1(Rd). In fact if f is continuous, given x and ε > 0, there is a neighborhood of x such that |f(t) − f(x)| < ε. Hence if Q denotes a cube with a sufficiently small side we have

(1/|Q|) ∫_Q |f(t) − f(x)| dt ≤ ε.

Hence, for a continuous function f, every point is a Lebesgue point.

Now we can observe for the first time how the maximal function intervenes in pointwise convergence matters. We define the operator Ω: if f ∈ L1(Rd),

Ωf(x) = limsup_{Q→x} (1/|Q|) ∫_Q |f(t) − f(x)| dt.
Note that Ωf(x) ≤ Mf(x) + |f(x)|. Now our objective is to prove that Ωf(x) = 0 almost everywhere. Fix ε > 0. Since the continuous functions are dense in L1(Rd), we obtain a continuous ϕ ∈ L1(Rd) such that ‖f − ϕ‖₁ < ε. By the triangle inequality

Ωf(x) ≤ Ωϕ(x) + Ω(f − ϕ)(x) = Ω(f − ϕ)(x) ≤ M(f − ϕ)(x) + |f(x) − ϕ(x)|.

Hence for every α > 0 we have

{Ωf(x) > α} ⊂ {M(f − ϕ)(x) > α/2} ∪ {|f(x) − ϕ(x)| > α/2}.

Now we use the weak inequality for the Hardy-Littlewood maximal function and the Chebyshev inequality for |f − ϕ|:

m{Ωf(x) > α} ≤ 2c_d ‖f − ϕ‖₁ / α + 2 ‖f − ϕ‖₁ / α ≤ C_d ε / α.

Since this inequality is true for every ε > 0, we deduce m{Ωf(x) > α} = 0. And this is true for every α > 0, hence Ωf(x) = 0 almost everywhere.
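The averages that Ω measures are easy to compute numerically. A hypothetical sketch of ours (function and quadrature are assumptions): at a point of continuity the averages shrink with h, while at a jump they stay bounded away from zero.

```python
import numpy as np

# Hypothetical quadrature sketch: the Lebesgue-point average
#   (1/|Q|) * integral_Q |f(t) - f(x)| dt,   Q = [x - h, x + h],
# tends to 0 as h -> 0 at a Lebesgue point, but not at a jump of f.

def lebesgue_average(f, x, h, m=20001):
    t = np.linspace(x - h, x + h, m)
    return np.mean(np.abs(f(t) - f(x)))

f = lambda t: np.where(t < 0, -1.0, 1.0)   # a jump at 0

avg_cont = lebesgue_average(f, 1.0, 1e-3)  # x = 1 is a continuity point
avg_jump = lebesgue_average(f, 0.0, 1e-3)  # x = 0: half the interval sits at distance 2
```

At x = 0 the average stays near 1 no matter how small h is, so x = 0 is not a Lebesgue point of this f; every point of continuity is.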
Since this inequality is true for every ε > 0, we deduce m{Ωf (x) > α} = 0. And this is true for every α > 0, hence Ωf (x) = 0 almost everywhere. As an example we prove that
x
F (x) =
f (t) dt −∞
is differentiable at every Lebesgue point of f . We assume that f is integrable. For h > 0 1 x+h F (x + h) − F (x) − f (x) = f (t) − f (x) dt. h h x Hence
F (x + h) − F (x) 1 x+h f (t) − f (x) dt − f (x) ≤ h h x x+h 2 f (t) − f (x) dt. ≤ 2h x−h
If x is a Lebesgue point of f we know that the limit when h → 0 is equal to zero. An analogous procedure proves the existence of the left-hand limit at x.
1.4 Interpolation

At one extreme, p = 1, the maximal function Mf satisfies a weak inequality. At the other extreme, p = +∞, it is obvious from the definition that if f ∈ L∞(Rd), then

‖Mf‖_∞ ≤ ‖f‖_∞.

An idea of Marcinkiewicz permits us to interpolate between these two extremes.

Theorem 1.4 For every f ∈ Lp(Rd), 1 < p < +∞, we have

‖Mf‖_p ≤ C_d (p/(p−1)) ‖f‖_p.

Proof. For every α > 0 we decompose f as f = f χ_A + f χ_{Rd∖A}, where A = {|f| > α}. Then Mf ≤ α + M(f χ_A). Consequently

m{Mf > 2α} ≤ m{M(f χ_A) > α} ≤ (c_d / α) ∫_{Rd} |f| χ_{{|f|>α}} dm.

The proof depends on a judicious use of this inequality. In particular observe that we have used a different decomposition of f for every α. We have the following chain of inequalities:

‖Mf‖_p^p = p ∫_0^{+∞} t^{p−1} m{Mf > t} dt ≤ p ∫_0^{+∞} t^{p−1} (2c_d / t) ∫_{Rd} |f| χ_{{|f|>t/2}} dm dt.

Applying Fubini's theorem, this is

2c_d p ∫_{Rd} |f(x)| ∫_0^{+∞} t^{p−2} χ_{{|f(x)|>t/2}} dt dx = (2c_d p/(p−1)) ∫_{Rd} (2|f(x)|)^{p−1} |f(x)| dx = (2^p c_d p/(p−1)) ‖f‖_p^p.

It is easy to see that (p/(p−1))^{1/p} is equivalent to p/(p−1). Hence we obtain our claim about the norm.

In the case p = 1 the best we can say is the weak inequality. For example, if ‖f‖₁ > 0, then Mf is not integrable. In spite of this, we shall need in the proof of Carleson's theorem a bound of the integral of the maximal function on a set of finite measure; it is a consequence of the weak inequality.
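The backbone of the interpolation argument is the layer-cake identity ‖g‖_p^p = p ∫₀^∞ t^{p−1} m{|g| > t} dt, applied above with g = Mf. A hypothetical numerical check of ours (sample function, grid, and tolerances are assumptions):

```python
import numpy as np

# Hypothetical discretization: check the layer-cake identity
#   ||g||_p^p = p * integral_0^inf t^(p-1) * m{|g| > t} dt
# for a sample g; this identity opens the chain of inequalities above.

dx = 1e-3
x = np.arange(-4, 4, dx)
g = np.exp(-np.abs(x))              # sample function with |g| <= 1
p = 3.0

lhs = np.sum(np.abs(g) ** p) * dx   # ||g||_p^p on the truncated line

ts = np.arange(dx, 1.0, dx)         # t > 1 contributes nothing since |g| <= 1
dist = np.array([np.sum(np.abs(g) > t) * dx for t in ts])   # m{|g| > t}
rhs = p * np.sum(ts ** (p - 1) * dist) * dx
```

Both sides approximate the same number, so trading the Lp norm for the distribution function, and then applying Fubini, loses nothing; that is precisely the move made in the proof of Theorem 1.4.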
Proposition 1.5 For every function f ∈ L1(Rd) and every measurable set B ⊂ Rd,

∫_B Mf(x) dx ≤ m(B) + 2c_d ∫_{Rd} |f(x)| log⁺ |f(x)| dx.

Proof. Let m_B be the measure m_B(M) = m(B ∩ M). We have

∫_B Mf(x) dx = ∫_0^{+∞} m_B{Mf(x) > t} dt.

Now we have two inequalities: m_B{Mf(x) > t} ≤ m(B), and the weak inequality. The point of the proof is to use the weak inequality adequately. For every α we have f = f χ_A + f χ_{Rd∖A}, where A = {|f(x)| > α}. Therefore Mf ≤ α + M(f χ_A), and {Mf(x) > 2α} ⊂ {M(f χ_A)(x) > α}. It follows that

m{Mf(x) > 2α} ≤ (c_d / α) ∫_{{|f(x)|>α}} |f(x)| dx.

Hence

∫_B Mf(x) dx ≤ m(B) + 2 ∫_1^{+∞} (c_d / t) ∫_{{|f(x)|>t}} |f(x)| dx dt.

Therefore by Fubini's theorem

∫_B Mf(x) dx ≤ m(B) + 2c_d ∫_{Rd} |f(x)| log⁺ |f(x)| dx.
1.5 A general inequality
The Hardy-Littlewood maximal function can be used to prove many theorems of pointwise convergence. This and many other applications of these functions derive from the following inequality.

Theorem 1.6. Let $\varphi\colon \mathbf{R}^d \to \mathbf{R}$ be a positive, radial, decreasing, and integrable function. Then for every $f \in L^p(\mathbf{R}^d)$ and $x \in \mathbf{R}^d$ we have
\[ |\varphi * f(x)| \le C_d\,\|\varphi\|_1\, Mf(x), \]
where $C_d$ is a constant depending only on the dimension, and equal to 1 for d = 1.
Proof. We say that $\varphi$ is radial if there is a function $u\colon [0,+\infty) \to \mathbf{R}$ such that $\varphi(x) = u(|x|)$ for every $x \in \mathbf{R}^d$; a radial function $\varphi$ is decreasing if u is decreasing. The function u is measurable, hence there is an increasing sequence of simple functions $(u_n)$ such that $u_n(t)$ converges to $u(t)$ for every $t \ge 0$. In this case, since u is decreasing, it is possible to choose each $u_n$ of the form
\[ u_n(t) = \sum_{j=1}^{N} h_j\,\chi_{[0,t_j]}(t), \]
where $0 < t_1 < t_2 < \cdots < t_N$, $h_j > 0$, and the natural number N depends on n. Now the proof is straightforward. Let $\varphi_n(x) = u_n(|x|)$. By the monotone convergence theorem
\[ |\varphi * f(x)| \le \varphi * |f|(x) = \lim_n \varphi_n * |f|(x). \]
Therefore
\[ \varphi_n * |f|(x) = \sum_{j=1}^{N} h_j\int_{B(x,t_j)} |f(y)|\,dy. \]
We can replace the ball $B(x,t_j)$ by the cube $Q(x,t_j)$ with center x and side $2t_j$; the quotient between the volumes of the ball and the cube is bounded by a constant. Thus
\[ \varphi_n * |f|(x) \le \sum_{j=1}^{N} h_j\, m\bigl(Q(x,t_j)\bigr)\,Mf(x) \le C_d\,\|\varphi\|_1\, Mf(x), \]
since $\sum_j h_j\, m(Q(x,t_j)) \le C_d\,\|\varphi_n\|_1 \le C_d\,\|\varphi\|_1$.
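In dimension d = 1 the inequality reads $|\varphi * f(x)| \le \|\varphi\|_1\, Mf(x)$. The sketch below (my own, with an assumed Gaussian $\varphi$ and indicator f, and the centered maximal averages computed on a grid) illustrates it numerically:

```python
import math

def f(t):  # test function: indicator of [0, 1]
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def phi(t):  # positive, radial (even), decreasing on [0, inf), integrable
    return math.exp(-t * t)

h = 0.01
grid = [k * h for k in range(-500, 501)]          # sample points in [-5, 5]
phi_l1 = sum(phi(t) for t in grid) * h            # ||phi||_1 ~ sqrt(pi)

def conv(x):
    """Riemann sum for (phi * f)(x)."""
    return sum(phi(x - t) * f(t) for t in grid) * h

def Mf(x):
    """Centered maximal averages (1/2r) int_{x-r}^{x+r} |f|, r = h, ..., 5."""
    best = 0.0
    for k in range(1, 501):
        avg = sum(f(x + j * h) for j in range(-k, k + 1)) * h / (2 * k * h)
        best = max(best, avg)
    return best

# The bound |phi * f(x)| <= ||phi||_1 Mf(x), with a little discretization slack:
checks = [conv(x) <= 1.01 * phi_l1 * Mf(x) for x in (-2.0, 0.0, 0.5, 2.0)]
```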
2. Fourier Series
2.1 Introduction
Let $f\colon \mathbf{R} \to \mathbf{C}$ be a $2\pi$-periodic function, integrable on $[-\pi,\pi]$. The Fourier series of f is the series
\[ \sum_{j=-\infty}^{+\infty} a_j e^{ijt} \tag{2.1} \]
where the Fourier coefficients $a_j$ are defined by
\[ a_j = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-ijt}\,dt. \tag{2.2} \]
These coefficients are denoted $\hat f(j) = a_j$.
These series had been considered in the eighteenth century by Daniel Bernoulli, Euler, Lagrange, etc. They knew that if a function is given by the series (2.1), the coefficients can be calculated by (2.2). They also knew many examples. Bernoulli, studying the movement of a string fixed at its extremes, gave the expression
\[ y(x,t) = \sum_{j=1}^{\infty} a_j \sin\frac{j\pi x}{\ell}\,\cos j\rho t \]
for its position, where $\ell$ is the length of the string and the coefficient $\rho$ depends on its physical properties. In 1753 Euler noticed a paradoxical implication: the initial position of the string would be given by
\[ f(x) = \sum_{j=1}^{\infty} a_j \sin\frac{j\pi x}{\ell}. \]
At this time curves were classified as continuous, if they were defined by a formula, and geometrical, if they could be drawn with the hand. It was thought that the first ones were locally determined, while the movement of the hand was not determined by the first stroke. Bernoulli believed that the representation of an arbitrary function was possible.
J.A. de Reyna: LNM 1785, pp. 11–29, 2002. © Springer-Verlag Berlin Heidelberg 2002
Fourier affirmed in his book Théorie analytique de la chaleur (1822) that the development was valid in the general case. This topic is connected with the definition of the concept of function.
2.2 Dirichlet Kernel
The convergence of the series (2.1) was considered by Dirichlet in 1829. He proved that the series converges to $\bigl(f(x+0)+f(x-0)\bigr)/2$ for every piecewise continuous and piecewise monotone function. This was later superseded by the results of Dini and Jordan. To prove these results we first consider a result of Riemann.

Proposition 2.1 (Riemann-Lebesgue lemma). If $f\colon \mathbf{R} \to \mathbf{C}$ is $2\pi$-periodic and integrable on $[-\pi,\pi]$, then
\[ \lim_{|j|\to\infty} \hat f(j) = 0. \]

Proof. If we change variables $u = t + \pi/j$ in the integral (2.2), the exponential changes sign. Hence we have
\[ \hat f(j) = \frac{1}{4\pi}\int_{-\pi}^{\pi} f(t)\,e^{-ijt}\,dt - \frac{1}{4\pi}\int_{-\pi}^{\pi} f\Bigl(t-\frac{\pi}{j}\Bigr)e^{-ijt}\,dt. \]
For a continuous function f it follows that $\lim |\hat f(j)| = 0$. For a general f we approximate it in $L^1$ norm by a continuous function.
To study pointwise convergence we consider the partial sums
\[ S_n(f,x) = \sum_{j=-n}^{n} \hat f(j)\,e^{ijx}. \]
Since every coefficient has an integral expression, we obtain an integral form for the partial sum of the Fourier series,
\[ S_n(f,x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(x-t)\,f(t)\,dt, \]
where the function $D_n$, the Dirichlet kernel, is given by
\[ D_n(t) = \sum_{j=-n}^{n} e^{ijt} = \frac{\sin\bigl(n+\frac12\bigr)t}{\sin t/2}. \]
It follows that $f \mapsto S_n(f,x)$ is a continuous linear form defined on $L^1[-\pi,\pi]$. The function $D_n$ is $2\pi$-periodic, with integral equal to 1, but $\|D_n\|_1$ and $\|D_n\|_\infty$ are not uniformly bounded. With the integral expression of the partial sums we can obtain the two basic conditions for pointwise convergence.
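The closed form of $D_n$ and its normalization are easy to check numerically. The sketch below (my own illustration) compares the trigonometric sum with $\sin\bigl(n+\frac12\bigr)t/\sin(t/2)$ and verifies that the mean value of $D_n$ over a period is 1.

```python
import math, cmath

def dirichlet(n, t):
    """D_n(t) = sum_{j=-n}^{n} e^{ijt} (real-valued)."""
    return sum(cmath.exp(1j * j * t) for j in range(-n, n + 1)).real

n = 8
ts = [0.001 + 0.05 * k for k in range(120)]  # points in (0, 2*pi), away from 0
max_err = max(abs(dirichlet(n, t)
                  - math.sin((n + 0.5) * t) / math.sin(t / 2)) for t in ts)

# (1/2pi) int_{-pi}^{pi} D_n(t) dt = 1: only the j = 0 term survives.
m = 4096
h = 2 * math.pi / m
mean = sum(dirichlet(n, -math.pi + (k + 0.5) * h)
           for k in range(m)) * h / (2 * math.pi)
```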
[Figure: the Dirichlet kernel $D_8(t)$ on $[-\pi,\pi]$.]
Theorem 2.2 (Dini's test). If $f \in L^1[-\pi,\pi]$ and
\[ \int_0^{\pi} |f(x+t)+f(x-t)-2f(x)|\,\frac{dt}{t} < +\infty, \]
then the Fourier series of f at the point x converges to f(x).

Proof. The difference $S_n(f)-f$ can be written as
\[ S_n(f,x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(t)\bigl(f(x-t)-f(x)\bigr)\,dt = \frac{1}{2\pi}\int_0^{\pi} D_n(t)\bigl(f(x+t)+f(x-t)-2f(x)\bigr)\,dt. \]
Since $2\sin t/2 \sim t$, the Riemann-Lebesgue lemma proves that this difference tends to 0.
Theorem 2.3 (Jordan's test). If $f \in L^1[-\pi,\pi]$ is of bounded variation on an open interval that contains x, then the Fourier series at x converges to $\bigl(f(x+0)+f(x-0)\bigr)/2$.

Proof. The proof is based on the fact that, although the norms $\|D_n\|_1$ are not bounded, the integrals
\[ \int_0^{\delta} D_n(t)\,dt \]
are uniformly bounded in n and $\delta$. (This can be proved replacing the Dirichlet kernel by the equivalent $\sin\bigl(n+\frac12\bigr)t\big/(t/2)$ and then applying a change of variables.)
Without loss of generality we can assume that x = 0; also we can assume that f is increasing on a neighborhood of 0. We must prove
\[ \lim_n \frac{1}{2\pi}\int_0^{\pi} D_n(t)\bigl(f(t)+f(-t)\bigr)\,dt = \bigl(f(0+)+f(0-)\bigr)/2. \]
By symmetry it suffices to prove
\[ \lim_n \frac{1}{2\pi}\int_0^{\pi} D_n(t)\,f(t)\,dt = f(0+)/2. \]
Finally we can assume that $f(0+) = 0$. Choose $\delta > 0$ such that $0 \le f(t) < \varepsilon$ for every $0 < t < \delta$. We decompose the integral into two parts, one on $[0,\delta]$ and the other on $[\delta,\pi]$. We apply to the first integral the second mean value theorem, which states that if g is continuous and f monotone on $[a,b]$, there exists $c \in [a,b]$ such that
\[ \int_a^b f(t)g(t)\,dt = f(b-)\int_c^b g(t)\,dt + f(a+)\int_a^c g(t)\,dt. \]
Therefore
\[ \int_0^{\pi} D_n(t)\,f(t)\,dt = f(\delta-)\int_{\eta}^{\delta} D_n(t)\,dt + \int_{\delta}^{\pi} D_n(t)\,f(t)\,dt. \]
The second integral converges to 0 by the Riemann-Lebesgue lemma, and the first is less than $C\varepsilon$ by the property of the Dirichlet kernel that we have noted.
We see that these conditions only depend on the values of f in an arbitrarily small neighborhood of x. This is a general fact, known as the Riemann localization principle: the convergence of the Fourier series to f(x) only depends on the values of f in a neighborhood of x. This is clear from the expression of $S_n(f,x)$ as an integral and the Riemann-Lebesgue lemma. It is surprising, because each $\hat f(j)$ depends on all the values of f.
The two criteria given are independent. If $f(t) = 1/|\log(t/2\pi)|$ and $g(t) = t^{\alpha}\sin(1/t)$ for $0 < t < \pi$, with $0 < \alpha < 1$, then f satisfies Jordan's condition but not Dini's test at the point t = 0. On the other hand, g satisfies only Dini's test.
2.3 Fourier series of continuous functions
The convergence conditions that we have proved show that the Fourier series of a differentiable function converges pointwise to the function. This is not true for continuous functions: Du Bois-Reymond constructed a continuous function whose Fourier series is divergent at one point. This also follows from the Banach-Steinhaus theorem. We consider $T_n(f) = S_n(f,0)$ as a linear operator on the space of continuous functions on $[-\pi,\pi]$ that take the same values at the extremes. By the Banach-Steinhaus theorem, $\sup_n \|T_n\| < +\infty$ if and only if for every $f \in \mathcal{C}(\mathbf{T})$ we have $\sup_n |T_n(f)| < +\infty$. But an easy calculation shows that
\[ \|T_n\| = \|D_n\|_1 = \frac{4}{\pi^2}\log n + O(1). \]
The numbers $L_n = \|D_n\|_1$ are called the Lebesgue constants. Their order is computed as follows:
\[ L_n = \frac{2}{2\pi}\int_0^{\pi} \Bigl|\frac{\sin(n+1/2)t}{\sin(t/2)}\Bigr|\,dt = \frac{2}{\pi}\int_0^{\pi} \Bigl|\frac{\sin(n+1/2)t}{t}\Bigr|\,dt + O(1) \]
\[ = \frac{2}{\pi}\int_0^{(2n+1)\pi/2} \frac{|\sin u|}{u}\,du + O(1) = \frac{2}{\pi}\sum_{k=0}^{n-1}\int_{k\pi}^{(k+1)\pi} \frac{|\sin u|}{u}\,du + O(1) \]
\[ = \frac{2}{\pi}\sum_{k=0}^{n-1}\int_0^{\pi} \frac{\sin u}{k\pi+u}\,du + O(1) = \frac{2}{\pi}\sum_{k=1}^{n-1}\frac{1}{k\pi+\xi_k}\int_0^{\pi}\sin u\,du + O(1) = \frac{4}{\pi^2}\log n + O(1). \]
Notice the following corollary.

Corollary 2.4. If $f \in L^\infty[-\pi,\pi]$, then $|S_n(f,x)| \le \bigl(\frac{4}{\pi^2}\log n + C\bigr)\|f\|_\infty$.

The following theorem is more difficult. In its proof we need an expression of the Dirichlet kernel that plays an important role in Carleson's theorem.
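The asymptotics $L_n = \frac{4}{\pi^2}\log n + O(1)$ can be checked numerically. In the sketch below (my own illustration) the difference $L_n - \frac{4}{\pi^2}\log n$ stabilizes near a constant ($\approx 1.27$) as n grows.

```python
import math

def lebesgue_constant(n, m=100000):
    """L_n = (1/2pi) int_{-pi}^{pi} |sin((n+1/2)t)/sin(t/2)| dt,
    computed with the midpoint rule on [0, pi] (the integrand is even)."""
    h = math.pi / m
    s = sum(abs(math.sin((n + 0.5) * ((k + 0.5) * h))
                / math.sin((k + 0.5) * h / 2)) for k in range(m))
    return s * h / math.pi

gaps = [lebesgue_constant(n) - (4 / math.pi ** 2) * math.log(n)
        for n in (50, 200, 800)]
```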
Theorem 2.5 (Hardy). If $f \in L^1[-\pi,\pi]$, then at every Lebesgue point x of f
\[ \lim_{n\to\infty} \frac{S_n(f,x)}{\log n} = 0. \]
Furthermore, if f is continuous on an open interval I, the convergence is uniform on every closed $J \subset I$.

Proof. The Dirichlet kernel can be written as
\[ D_n(t) = \frac{\sin(n+1/2)t}{\sin t/2} = 2\,\frac{\sin nt}{t} + \cos nt + \Bigl(\frac{1}{\tan t/2} - \frac{2}{t}\Bigr)\sin nt. \]
The last two terms are bounded uniformly in n and t. Therefore
\[ D_n(t) = 2\,\frac{\sin nt}{t} + \varphi_n(t), \qquad |t| < \pi, \tag{2.3} \]
and there is an absolute constant $0 < C < +\infty$ such that $\|\varphi_n\|_\infty \le C$. This expression will play a role in Carleson's theorem. Now we have
\[ \Bigl| S_n(f,x) - \frac{1}{\pi}\int_{-\pi}^{\pi} f(x-t)\,\frac{\sin nt}{t}\,dt \Bigr| \le c\int_{-\pi}^{\pi} |f(t)|\,dt. \]
It follows that for every f in $L^1[-\pi,\pi]$
\[ |S_n(f,x)| \le c\|f\|_1 + \Bigl|\frac{1}{\pi}\int_{-\pi}^{\pi} f(x-t)\,\frac{\sin nt}{t}\,dt\Bigr|. \]
The function $\sin t/t$ has integrals uniformly bounded on intervals. It follows that
\[ |S_n(f,x)| \le C + \Bigl|\frac{1}{\pi}\int_0^{\pi} \bigl(f(x+t)+f(x-t)-2f(x)\bigr)\,\frac{\sin nt}{t}\,dt\Bigr|. \]
Let $\varphi_x(t)$ denote the function $f(x+t)+f(x-t)-2f(x)$. If x is a Lebesgue point of f, the primitive $\Phi(t)$ of $|\varphi_x(t)|$ satisfies $\Phi(t) = o(t)$ when $t \to 0$. With these notations
\[ |S_n(f,x)| \le C + \frac{n}{\pi}\int_0^{1/n} |\varphi_x(t)|\,dt + \frac{1}{\pi}\int_{1/n}^{\pi} t^{-1}|\varphi_x(t)|\,dt = C + \frac{n}{\pi}\,\Phi\Bigl(\frac{1}{n}\Bigr) + \frac{1}{\pi}\Bigl[\Phi(t)\,t^{-1}\Bigr]_{1/n}^{\pi} + \frac{1}{\pi}\int_{1/n}^{\pi} \Phi(t)\,t^{-2}\,dt. \]
It follows easily that $S_n(f,x)/\log n \to 0$ since, x being a Lebesgue point of f, we have $\Phi(t) = o(t)$.
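The uniform bound $\|\varphi_n\|_\infty \le C$ in (2.3) is easy to test numerically; the sketch below (my own illustration) evaluates $D_n(t) - 2\sin(nt)/t$ on a grid and finds a supremum below 2, consistent with the fact that the bounded part is $\cos nt + (\cot(t/2) - 2/t)\sin nt$.

```python
import math

def dirichlet(n, t):
    return math.sin((n + 0.5) * t) / math.sin(t / 2)

def phi_n(n, t):
    """phi_n(t) = D_n(t) - 2 sin(nt)/t, claimed uniformly bounded for |t| < pi."""
    return dirichlet(n, t) - 2.0 * math.sin(n * t) / t

ts = [1e-4 + k * (math.pi - 2e-4) / 2000 for k in range(2001)]
sup = max(abs(phi_n(n, t)) for n in (1, 5, 20, 100) for t in ts)
```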
This result is the best possible in the following sense: for every sequence $(\lambda_n)$ such that $\lambda_n^{-1}\log n \to +\infty$, there exists a continuous function f such that $|S_n(f,x)| > \lambda_n$ for infinitely many natural numbers n.
Some properties of the trigonometrical system follow from the fact that it is an orthonormal set of functions. For example, D. E. Menshov and H. Rademacher proved that a series of orthonormal functions $\sum_j c_j\varphi_j$ converges almost everywhere if $\sum_j |c_j\log j|^2 < +\infty$. This is also the best result for general orthonormal systems; in particular, D. E. Menshov in 1923 proved that there exist series $\sum_j c_j\varphi_j$ divergent almost everywhere and such that $\sum_j |c_j|^2 < +\infty$. Therefore Carleson's theorem is a property of the trigonometrical system that depends on the natural ordering of this system.
The result of D. E. Menshov and H. Rademacher is easily proved. In fact it follows from a general result that relates the order of $S_n(f,x)$ and the convergence of certain series.

Theorem 2.6. Assume that $(\lambda_n)$ is an increasing sequence of positive real numbers such that for every $g \in L^2[-\pi,\pi]$
\[ \lim_{n\to+\infty} \frac{S_n(g,x)}{\lambda_{n+1}} = 0, \qquad \text{a.e.} \]
Then if $f \in L^1[-\pi,\pi]$ satisfies $\sum_j |\hat f(j)\lambda_{|j|}|^2 < +\infty$, we have
\[ f(x) = \lim_n S_n(f,x), \qquad \text{a.e.} \]

Proof. By the Riesz-Fischer theorem there exists a function $g \in L^2[-\pi,\pi]$ such that $\hat g(j) = \hat f(j)\lambda_{|j|}$. Comparing Fourier coefficients we derive the equality
\[ S_n(f,x) = \sum_{k=0}^{n} \Bigl(\frac{1}{\lambda_k} - \frac{1}{\lambda_{k+1}}\Bigr) S_k(g,x) + \frac{S_n(g,x)}{\lambda_{n+1}}. \]
By our hypothesis about the functions in $L^2[-\pi,\pi]$, we deduce that the character of the sequence $S_n(f,x)$ coincides with that of the series
\[ \sum_{k=0}^{\infty} \Bigl(\frac{1}{\lambda_k} - \frac{1}{\lambda_{k+1}}\Bigr) S_k(g,x) = \sum_{k=0}^{\infty} h_k(x). \]
But as a series in $L^2[-\pi,\pi]$ we have $\sum_k \|h_k\|_2 < +\infty$. Therefore the series converges a.e.
2.4 Banach continuity principle
The hypothesis that $\lim_n S_n(g,x)/\lambda_{n+1} = 0$ a.e. in Theorem 2.6 can be replaced by $\sup_n |S_n(g,x)/\lambda_{n+1}| < +\infty$ a.e. This is a general fact due to Banach. It reduces the problem of a.e. convergence of Fourier series to proving the pointwise boundedness of the maximal operator
\[ \sup_n |S_n(f,x)|. \]
To prove these assertions we need some knowledge about the space of measurable functions $L^0[-\pi,\pi]$. It is a metric space with distance
\[ d(f,g) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{|f-g|}{1+|f-g|}\,dm. \]
This is a complete metric vector space. A sequence $(f_n)$ converges to 0 if and only if it converges to 0 in measure; that is to say, for every $\varepsilon > 0$ we have $\lim_n m\{|f_n| > \varepsilon\} = 0$.
Consider now a sequence $(T_n)$ of linear operators $T_n\colon L^p[-\pi,\pi] \to L^0[-\pi,\pi]$. We assume that each $T_n$ is continuous in measure; therefore for every $(f_k)$ with $\|f_k\|_p \to 0$ and every $\varepsilon > 0$ we have $m\{|T(f_k)| > \varepsilon\} \to 0$ (and this is true for every $T = T_n$). Observe that if $T_n f(x)$ converges a.e., then the maximal operator $T^* f(x) = \sup_n |T_n f(x)|$ is finite a.e.
The principle of Banach is a sort of uniform boundedness principle: the continuity in measure of a sequence of operators and the almost everywhere finiteness of the maximal operator imply the continuity at 0 in measure of the maximal operator.

Theorem 2.7 (Banach's continuity principle). Assume that for every $f \in L^p[-\pi,\pi]$ we have $T^* f(x) < +\infty$ a.e. on $[-\pi,\pi]$. Then there exists a decreasing function $C(\alpha)$, defined for every $\alpha > 0$, such that $\lim_{\alpha\to+\infty} C(\alpha) = 0$ and such that
\[ m\{T^* f(x) > \alpha\|f\|_p\} \le C(\alpha), \qquad \text{for every } f \in L^p[-\pi,\pi]. \]

Proof. Fix a positive real number $\varepsilon > 0$. For every natural number n let $F_n$ be the set of $f \in L^p[-\pi,\pi]$ such that $m\{T^* f(x) > n\} \le \varepsilon$. The set $F_n$ is closed in $L^p[-\pi,\pi]$. To prove this, consider $f \notin F_n$; then $m\{T^* f(x) > n\} > \varepsilon$. It follows that there exists N such that
\[ m\Bigl\{\sup_{1\le k\le N} |T_k f(x)| > n\Bigr\} > \varepsilon. \]
Then there exists $\delta > 0$ such that
\[ m\Bigl\{\sup_{1\le k\le N} |T_k f(x)| > n+\delta\Bigr\} > \varepsilon + \delta. \]
By the continuity in measure of the operators $T_k$, taking $\delta$ smaller if necessary, for every g with $\|f-g\|_p < \delta$ we have
\[ m\{|T_k(f-g)(x)| > \delta\} < \delta/2^k, \qquad 1 \le k \le N. \]
Let Z be the union of the exceptional sets $\{|T_k(f-g)(x)| > \delta\}$, $1 \le k \le N$; then $m(Z) < \delta$. Also we have
\[ \{T^* g(x) > n\} \cup Z \supset \Bigl\{\sup_{1\le k\le N} |T_k f(x)| > n+\delta\Bigr\}. \]
Therefore it follows that $m\{T^* g(x) > n\} > \varepsilon$. That is, the set $L^p[-\pi,\pi]\setminus F_n$ is open.
Now our hypothesis about the finiteness of $T^* f$ implies that
\[ L^p[-\pi,\pi] = \bigcup_n F_n. \]
By Baire's category theorem there is some $n \in \mathbf{N}$ such that $F_n$ has a nonempty interior. That is, there exist $f_0 \in F_n$ and $\delta > 0$ such that $f_0 + \delta g \in F_n$ for every g with $\|g\|_p \le 1$. Thus $m\{T^*(f_0+\delta g) > n\} \le \varepsilon$. Then, since $2\delta\,T^*g \le T^*(f_0+\delta g) + T^*(f_0-\delta g)$,
\[ m\{T^* g > 2n/\delta\} \le m\{T^*(f_0+\delta g) > n\} + m\{T^*(f_0-\delta g) > n\} \le 2\varepsilon. \]
Therefore for every $g \in L^p[-\pi,\pi]$
\[ m\{T^* g > (2n/\delta)\|g\|_p\} \le 2\varepsilon. \]
Hence, if we define $C(\alpha) = \sup_g m\{T^* g > \alpha\|g\|_p\}$, the function $C(\alpha)$ is decreasing and satisfies $\lim_{\alpha\to+\infty} C(\alpha) = 0$.
This principle is completed with the fact that, under the same hypothesis about $(T_n)$, the set of $f \in L^p[-\pi,\pi]$ for which the limit $\lim_n T_n f(x)$ exists a.e. is closed in $L^p[-\pi,\pi]$. To prove this, define the operator
\[ \Omega(f)(x) = \limsup_{n,m} |T_n f(x) - T_m f(x)|. \]
It is clear that $\Omega f \le 2T^* f$. Therefore $m\{\Omega f(x) > \alpha\|f\|_p\} \le C(\alpha/2)$. For every function $\varphi$ such that the limit $\lim_n T_n\varphi(x)$ exists a.e., we have $\Omega\varphi = 0$ and $\Omega(f-\varphi) = \Omega f$. It follows that
\[ m\{\Omega f(x) > \alpha\|f-\varphi\|_p\} \le C(\alpha/2). \]
Now let f be in the closure of the set of such functions $\varphi$. Take $\alpha = 1/\varepsilon$ and $\|f-\varphi\|_p < \varepsilon^2$. We obtain $m\{\Omega f(x) > \varepsilon\} \le C(1/2\varepsilon)$. It follows easily that $m\{\Omega f(x) > 0\} = 0$.
2.5 Summability
As we have said, Du Bois-Reymond constructed a continuous function whose Fourier series diverges at some point. Lipót Fejér proved, when he was 19 years old, that in spite of this we can recover a continuous function from its Fourier series. Recall that if a sequence converges, then the sequence of arithmetic means of its terms converges to the same limit. Fejér considered the mean values of the partial sums
\[ \sigma_n(f,x) = \frac{1}{n+1}\sum_{j=0}^{n} S_j(f,x). \]
We have an integral expression for these mean values,
\[ \sigma_n(f,x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(x-t)\,f(t)\,dt, \]
where $F_n$ is the Fejér kernel:
\[ F_n(t) = \frac{1}{n+1}\sum_{j=0}^{n} D_j(t) = \sum_{j=-n}^{n} \Bigl(1 - \frac{|j|}{n+1}\Bigr)e^{ijt}. \]
There is another expression for $F_n$. We substitute the value of the Dirichlet kernel; then we need to sum the terms $\sin\bigl(j+\frac12\bigr)t$. This sum is the imaginary part of
\[ \sum_{j=0}^{n} e^{i(2j+1)t/2} = e^{it/2}\sum_{j=0}^{n} e^{ijt} = \frac{e^{i(n+1)t}-1}{2i\sin(t/2)}, \]
thus
\[ \sum_{j=0}^{n} \sin\frac{2j+1}{2}\,t = \frac{1-\cos(n+1)t}{2\sin(t/2)} = \frac{\sin^2\bigl((n+1)t/2\bigr)}{\sin(t/2)}; \]
it follows that
\[ F_n(t) = \frac{1}{n+1}\biggl(\frac{\sin\bigl((n+1)t/2\bigr)}{\sin(t/2)}\biggr)^2. \tag{2.4} \]
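Formula (2.4) and the positivity of $F_n$ are easy to confirm numerically (a sketch of my own):

```python
import math, cmath

def fejer_sum(n, t):
    """F_n(t) = sum_{j=-n}^{n} (1 - |j|/(n+1)) e^{ijt}."""
    return sum((1 - abs(j) / (n + 1)) * cmath.exp(1j * j * t)
               for j in range(-n, n + 1)).real

def fejer_closed(n, t):
    """Closed form (2.4)."""
    return (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2 / (n + 1)

ts = [0.01 + 0.05 * k for k in range(125)]           # inside (0, 2*pi)
err = max(abs(fejer_sum(7, t) - fejer_closed(7, t)) for t in ts)
positive = all(fejer_sum(7, t) >= -1e-12 for t in ts)
```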
We thus have that $F_n$ is a positive function, $\|F_n\|_1 = 1$, and for every $\delta > 0$ we have, uniformly on $\delta \le |t| \le \pi$, that $\lim_n F_n(t) = 0$. With more generality we define a summability kernel to be a sequence $(k_n)$ of periodic functions such that:
(i) $\displaystyle \frac{1}{2\pi}\int_{-\pi}^{\pi} k_n(t)\,dt = 1$.
(ii) $\|k_n\|_1 \le C$.
(iii) For every $\delta > 0$,
\[ \lim_{n\to+\infty} \frac{1}{2\pi}\int_{\delta<|t|<\pi} |k_n(t)|\,dt = 0. \]

Theorem 2.8. If $(k_n)$ is a summability kernel and f is a continuous $2\pi$-periodic function, then $k_n * f$ converges uniformly to f.

Proof. By (i) we can write $k_n * f(x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\bigl(f(x-t)-f(x)\bigr)k_n(t)\,dt$. For every $\delta > 0$ we decompose the integral into two parts, one over $\{|t| < \delta\}$ and the other over $\{\delta < |t| < \pi\}$. The first is small by the continuity of f and property (ii) of the kernel; the second is small by (iii).
Observe that the same proof shows us the convergence at every point of continuity of f, for a measurable bounded f. Since $(F_n)$ is a summability kernel and $F_n * f$ is a trigonometrical polynomial for every f, it follows that these polynomials are dense in $\mathcal{C}(\mathbf{T})$.
Now for $f \in L^p[-\pi,\pi]$ the function $t \mapsto G(t) = \|f(\cdot+t)-f(\cdot)\|_p$ is a continuous $2\pi$-periodic function. We have
\[ \|k_n * f - f\|_p \le \frac{1}{2\pi}\int_{-\pi}^{\pi} \|f(\cdot-t)-f(\cdot)\|_p\,|k_n(t)|\,dt. \]
We can apply the previous argument to this integral to conclude that it tends to 0, since G is continuous and $G(0) = 0$. Thus if f is continuous and $2\pi$-periodic, $\sigma_n(f,x)$ converges uniformly to f, and $\sigma_n(f)$ converges to f in $L^p[-\pi,\pi]$ if $f \in L^p[-\pi,\pi]$.
Another important example is that of the Poisson kernel. This kernel appears when we consider the Fourier series of f as the boundary values of a complex function defined on the open unit disc. If $f \in L^1[-\pi,\pi]$, the series
\[ \sum_{j=0}^{+\infty} \hat f(j)\,z^j + \sum_{j=1}^{+\infty} \hat f(-j)\,\bar z^j \]
converges on the open unit disc and defines a complex harmonic function u(z). Then
\[ u(re^{i\theta}) = \sum_{j=-\infty}^{+\infty} \hat f(j)\,r^{|j|} e^{ij\theta} = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta-t)\,f(t)\,dt, \]
where the Poisson kernel $P_r(\theta)$ is defined as
\[ P_r(\theta) = \sum_{j=-\infty}^{+\infty} r^{|j|} e^{ij\theta} = \frac{1-r^2}{1-2r\cos\theta+r^2}. \tag{2.5} \]
It is easy to check that $P_r(\theta)$ is a summability kernel. (Here the variable r takes the role of n, but this is a minor difference.)
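Identity (2.5) can be checked by truncating the series (a sketch of my own):

```python
import math

def poisson_series(r, theta, nterms=200):
    """P_r(theta) = sum_j r^{|j|} e^{ij theta}, truncated at |j| < nterms."""
    return 1.0 + 2.0 * sum(r ** j * math.cos(j * theta)
                           for j in range(1, nterms))

def poisson_closed(r, theta):
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

err = max(abs(poisson_series(r, th) - poisson_closed(r, th))
          for r in (0.3, 0.7, 0.9)
          for th in (0.0, 0.5, 1.5, 3.0))
```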
[Figure: the Poisson kernel $P_r(t)$ on $[-\pi,\pi]$ for two values of r.]
Then we have
Proposition 2.9. If $f \in L^p[-\pi,\pi]$, $1 \le p < +\infty$, or $p = +\infty$ and f is continuous with $f(\pi) = f(-\pi)$, we have $\lim_{r\to 1^-} \|P_r * f - f\|_p = 0$.

Now we consider $f \in L^1[-\pi,\pi]$ and ask about the a.e. convergence of $F_n * f(x)$ or $P_r * f(x)$ to f(x). Obviously there exist some subsequences $F_{n_k} * f$ and $P_{r_k} * f$ that converge a.e. to f. The function $u(re^{i\theta}) = P_r * f(\theta)$ is harmonic on the unit disc; then what we want is a theorem of radial convergence of a harmonic function to $\lim_{r\to 1^-} u(re^{i\theta})$. The first theorem of this type is due to Fatou in 1905: a bounded and analytic function on the open unit disc has radial limits at almost every point of the boundary. We prefer to give a proof like that of the differentiation theorem, one that can also be extended to the case of $\sigma_n(f,x)$.

Theorem 2.10 (Fatou). Let $f \in L^1[-\pi,\pi]$. For almost every point $x \in [-\pi,\pi]$ we have
\[ \lim_{r\to 1^-} P_r * f(x) = f(x), \qquad \lim_n \sigma_n(f,x) = f(x). \]

Proof. The principal part is to prove bounds for the maximal operators
\[ P^* f(x) = \sup_{0<r<1} |P_r * f(x)|, \qquad F^* f(x) = \sup_n |\sigma_n(f,x)|. \]
Let $f^\circ\colon \mathbf{R} \to \mathbf{C}$ be equal to 0 for $|x| > 2\pi$ and to the periodic extension of f for $|x| < 2\pi$, and let $P_r^\circ$ be equal to 0 for $|\theta| > \pi$ and to $P_r(\theta)$ for $|\theta| < \pi$. Then we can write
\[ P_r * f(x) = P_r^\circ * f^\circ(x), \qquad |x| < \pi. \]
In the same way we can define the function $F_n^\circ$ so that
\[ \sigma_n(f,x) = F_n * f(x) = F_n^\circ * f^\circ(x), \qquad |x| < \pi. \]
Since $P_r^\circ$ is a radial function that is decreasing for $x > 0$ and its integral (with respect to $dm/2\pi$) is equal to 1, we have $|P_r^\circ * f^\circ(x)| \le Mf^\circ(x)$. Thus $P^* f(x) \le Mf^\circ(x)$ for every $|x| < \pi$. The Fejér kernel is not decreasing, but $\sin(t/2) > t/4$ for $0 < t < \pi$; therefore
\[ F_n^\circ(t) \le \begin{cases} \dfrac{1}{n+1}\Bigl(\dfrac{(n+1)t/2}{t/4}\Bigr)^2 \le 16(n+1), & \text{if } |t| \le \dfrac{1}{n+1}, \\[2mm] \dfrac{1}{n+1}\Bigl(\dfrac{1}{t/4}\Bigr)^2 = \dfrac{16}{(n+1)t^2}, & \text{if } \dfrac{1}{n+1} < |t| < \pi. \end{cases} \]
Thus $F_n^\circ$ is bounded by a radial function that is decreasing for $t > 0$ and has integrals uniformly bounded. It follows that
\[ F^* f(x) \le C\,Mf^\circ(x). \]
Now the proof of the a.e. pointwise convergence follows as in the differentiability theorem.
2.6 The conjugate function
Since $(e^{int})_{n\in\mathbf{Z}}$ is a complete orthonormal system in the space $L^2[-\pi,\pi]$, we have Parseval's equality
\[ \sum_{j\in\mathbf{Z}} |\hat f(j)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt. \]
It follows that
\[ \lim_n \|f - S_n(f)\|_2 = 0. \]
In fact for every $1 < p < +\infty$ we have
\[ \lim_n \|f - S_n(f)\|_p = 0 \tag{2.6} \]
for every $f \in L^p[-\pi,\pi]$. In the case p = 1 this is no longer true. In fact, if $F_N$ and $D_n$ are the Fejér and Dirichlet kernels, then $S_n(F_N) = D_n * F_N = \sigma_N(D_n)$. Therefore, by Fejér's theorem, $\lim_N S_n(F_N) = D_n$ in $L^1[-\pi,\pi]$. Since $\|F_N\|_1 = 1$, it follows that the norm of $S_n$ on $L^1$ is $\ge L_n$, and we know that $L_n \sim \log n$.
On the other hand, to prove (2.6) it suffices to prove that the norms of the partial sum operators $S_n\colon L^p[-\pi,\pi] \to L^p[-\pi,\pi]$ are uniformly bounded. In fact, since the trigonometrical polynomials are dense in $L^p[-\pi,\pi]$, given $\varepsilon > 0$ we find a polynomial $P_\varepsilon$ such that $\|f - P_\varepsilon\|_p < \varepsilon$. Then if n is greater than the degree of $P_\varepsilon$,
\[ \|f - S_n(f)\|_p \le \|f - P_\varepsilon\|_p + \|S_n(P_\varepsilon) - S_n(f)\|_p \le \varepsilon + C\varepsilon. \]
The uniform boundedness of these norms was proved by M. Riesz in 1928. He considered the operator defined on the space of trigonometrical polynomials as
\[ R\Bigl(\sum_j a_j e^{ijt}\Bigr) = \sum_{j\ge 0} a_j e^{ijt}. \]
It is clear that R is a continuous projection on $L^2[-\pi,\pi]$. What is remarkable about R is that we have the relationship
\[ S_n(f) = e^{-int} R\bigl(e^{int} f\bigr) - e^{i(n+1)t} R\bigl(e^{-i(n+1)t} f\bigr). \]
Then the uniform boundedness of the norms of $S_n$ follows if R can be extended to a continuous operator on $L^p[-\pi,\pi]$.
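The relationship between $S_n$ and R acts transparently on Fourier coefficients: multiplying by $e^{imt}$ shifts them, and R discards the negative frequencies. The following sketch (my own, with coefficients stored in a dict indexed by frequency) verifies the identity.

```python
def shift(c, m):
    """Coefficients of e^{imt} f, when c maps frequency -> coefficient of f."""
    return {j + m: a for j, a in c.items()}

def R(c):
    """Riesz projection: keep frequencies j >= 0."""
    return {j: a for j, a in c.items() if j >= 0}

def S(c, n):
    """Partial sum S_n: keep frequencies |j| <= n."""
    return {j: a for j, a in c.items() if abs(j) <= n}

def sub(c1, c2):
    out = dict(c1)
    for j, a in c2.items():
        out[j] = out.get(j, 0) - a
        if out[j] == 0:
            del out[j]
    return out

f = {j: j + 10 for j in range(-6, 7)}   # arbitrary nonzero coefficients
n = 3
lhs = S(f, n)
rhs = sub(shift(R(shift(f, n)), -n), shift(R(shift(f, -(n + 1))), n + 1))
# lhs == rhs: S_n(f) = e^{-int} R(e^{int} f) - e^{i(n+1)t} R(e^{-i(n+1)t} f)
```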
The operator R is related to the conjugate harmonic function. Consider a power series
\[ \sum_{n>0} (a_n + ib_n)\,z^n. \]
Its real and imaginary parts for $z = e^{it}$ are
\[ u = \sum_{n>0} (a_n\cos nt - b_n\sin nt); \qquad v = \sum_{n>0} (a_n\sin nt + b_n\cos nt). \]
We say that $v = \tilde u$ is the conjugate series to u. The operator H that sends u to v must satisfy
\[ H(\cos nt) = \sin nt; \qquad H(\sin nt) = -\cos nt. \]
It is the same to say $H(e^{int}) = -i\,\mathrm{sgn}(n)\,e^{int}$. The operators R and H are related by
\[ Rf(\theta) + e^{i\theta} R\bigl(e^{-i\theta} f\bigr)(\theta) = f(\theta) + iHf(\theta). \]
It follows that the operator H extends to a continuous operator from $L^2[-\pi,\pi]$ to $L^2[-\pi,\pi]$. In the next chapter we shall study the operator H; for the time being we obtain some expressions for this operator. Let $f \in L^1[-\pi,\pi]$, with Fourier series $\sum_j \hat f(j)\,e^{ijx}$. We shall call
\[ \sum_j (-i)\,\mathrm{sgn}(j)\,\hat f(j)\,e^{ijx} \]
the conjugate series. It is clear that when $f \in L^2[-\pi,\pi]$, this conjugate series is the Fourier series of Hf. We can express the partial sums of the conjugate series as a convolution
\[ \tilde S_n(f,x) = \sum_{j=-n}^{n} (-i)\,\mathrm{sgn}(j)\,\hat f(j)\,e^{ijx} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\tilde D_n(x-t)\,dt. \]
Here $\tilde D_n$ is the conjugate of the Dirichlet kernel,
\[ \tilde D_n(t) = 2\sum_{j=1}^{n} \sin jt = \frac{\cos t/2 - \cos(n+1/2)t}{\sin t/2}. \]
And we have a condition of convergence similar to that of Dini.
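The closed form of the conjugate Dirichlet kernel checks out numerically (a sketch of my own):

```python
import math

def conj_dirichlet_sum(n, t):
    return 2.0 * sum(math.sin(j * t) for j in range(1, n + 1))

def conj_dirichlet_closed(n, t):
    return (math.cos(t / 2) - math.cos((n + 0.5) * t)) / math.sin(t / 2)

err = max(abs(conj_dirichlet_sum(n, t) - conj_dirichlet_closed(n, t))
          for n in (1, 4, 9)
          for t in [0.05 + 0.1 * k for k in range(60)])
```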
Theorem 2.11 (Pringsheim convergence test). Let $f \in L^1[-\pi,\pi]$ be a $2\pi$-periodic function and $x \in [-\pi,\pi]$ such that
\[ \int_0^{\pi} |f(x+t) - f(x-t)|\,\frac{dt}{t} < +\infty; \]
then the conjugate series converges at the point x.

Proof. Since $\tilde D_n(t)$ is an odd function,
\[ \tilde S_n(f,x) = \frac{1}{2\pi}\int_0^{\pi} \bigl(f(x-t) - f(x+t)\bigr)\tilde D_n(t)\,dt. \]
Then the Riemann-Lebesgue lemma and the expression for $\tilde D_n(t)$ end the proof.
We also get that
\[ \lim_n \tilde S_n(f,x) = \frac{1}{2\pi}\int_0^{\pi} \frac{f(x-t) - f(x+t)}{\tan t/2}\,dt. \]
It follows easily that, under the hypothesis of the theorem, the principal value exists and
\[ \lim_n \tilde S_n(f,x) = \frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{f(x-t)}{\tan t/2}\,dt. \]
We see that the Hilbert transform of a differentiable function is given by
\[ Hf(x) = \frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{f(x-t)}{\tan t/2}\,dt. \]
In 1913 Luzin proved that the principal value exists and equals Hf(x) a.e. for every $f \in L^2[-\pi,\pi]$. Later, in 1919, Privalov proved that the principal value exists a.e. for every $f \in L^1[-\pi,\pi]$.
2.7 The Hilbert transform on R
In the following chapter we will study the Hilbert transform. It is convenient to perform this study on R instead of on the torus. Almost all of what we have said for Fourier series has an analogue on R. For $f \in L^1(\mathbf{R})$ the Fourier transform is defined as
\[ \hat f(x) = \int_{-\infty}^{+\infty} f(t)\,e^{-2\pi itx}\,dt. \]
This is analogous to the Fourier coefficients $\hat f(j)$. The partial sums of the Fourier series are replaced by
\[ S_a(f,x) = \int_{-a}^{a} \hat f(\xi)\,e^{2\pi i\xi x}\,d\xi = \int f(t)\,D_a(x-t)\,dt, \]
where the Dirichlet kernel is replaced by
\[ D_a(t) = \frac{\sin 2\pi at}{\pi t}. \]
And the Fejér sums are replaced by
\[ \sigma_a(f,x) = \frac{1}{a}\int_0^{a} S_t(f,x)\,dt = \int f(x-\xi)\,F_a(\xi)\,d\xi, \]
where the analogue of the Fejér kernel is
\[ F_a(\xi) = \frac{1}{a}\Bigl(\frac{\sin\pi\xi a}{\pi\xi}\Bigr)^2. \]
The role of the unit disc is taken by the half-plane $y > 0$ of $\mathbf{R}^2$. If $f \in L^1(\mathbf{R})$, we define on this half-plane the analytic function
\[ F(z) = \frac{1}{\pi i}\int_{-\infty}^{+\infty} \frac{f(t)}{t-z}\,dt. \]
When f is a real function, the real and imaginary parts of F are given by
\[ u(x,y) = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{y\,f(t)}{(x-t)^2+y^2}\,dt; \qquad v(x,y) = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{(x-t)\,f(t)}{(x-t)^2+y^2}\,dt. \]
For a general f the functions u and v defined by these integrals are conjugate harmonic functions. The Hilbert transform is defined as
\[ Hf(x) = \frac{1}{\pi}\,\mathrm{p.v.}\int_{-\infty}^{+\infty} \frac{f(t)}{x-t}\,dt. \]
The study of this transform is equivalent to that of the transform on the torus. In fact, given $f \in L^1[-\pi,\pi]$, if we define $f^\circ\colon \mathbf{R} \to \mathbf{C}$ as 0 for $|x| > 2\pi$ and equal to the periodic extension of f when $|x| < 2\pi$, then for every $|x| < \pi$
\[ H_{\mathbf{T}} f(x) = \frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{f^\circ(x-t)}{\tan t/2}\,dt. \]
Therefore
\[ H_{\mathbf{T}} f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f^\circ(x-t)\Bigl(\frac{1}{\tan t/2} - \frac{2}{t}\Bigr)\,dt + \frac{1}{\pi}\,\mathrm{p.v.}\int_{-\infty}^{+\infty} \frac{f^\circ(x-t)}{t}\,dt - \frac{1}{\pi}\int_{|t|>\pi} \frac{f^\circ(x-t)}{t}\,dt. \]
If we designate by $H_{\mathbf{R}}$ and $H_{\mathbf{T}}$ the two transforms, we have
\[ |H_{\mathbf{T}} f(x) - H_{\mathbf{R}} f^\circ(x)| \le C\|f\|_1, \]
and results for either transform can be transferred to the other.
2.8 The conjecture of Luzin
When Luzin published his paper in 1913, he knew the result of Fatou: the Poisson integral
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1-r^2}{1-2r\cos(t-x)+r^2}\,f(t)\,dt \]
converges a.e. to f(x) for every $f \in L^1[-\pi,\pi]$. He also knew the F. Riesz and E. Fischer result: given $\sum_{n=1}^{\infty}(a_n^2+b_n^2) < +\infty$, there exists a function $f \in L^2[-\pi,\pi]$ with Fourier series $\sum_{n=1}^{\infty} a_n\cos nx + b_n\sin nx$. He deduced that, with the same hypotheses, there exists also the conjugate function $g \in L^2[-\pi,\pi]$ with Fourier series $\sum_{n=1}^{\infty} -b_n\cos nx + a_n\sin nx$.
Then the coefficients $a_n$ and $b_n$ have integral representations in terms of f and also in terms of g. The analytic function $\sum_{n=1}^{\infty}(a_n-ib_n)(re^{ix})^n$ has two representations:
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{(1-r^2)\,f(t)}{1-2r\cos(t-x)+r^2}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{2r\,g(t)\sin(t-x)}{1-2r\cos(t-x)+r^2}\,dt. \]
Fatou's result gives then that the second integral also converges a.e. to f(x) when $r \to 1^-$. He proved then the following result:

Theorem 2.12. Let $f \in L^2[-\pi,\pi]$ and let g be its conjugate function; then, with $\eta = 1-r$,
\[ \lim_{r\to 1^-}\biggl(\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{2r\,g(t)\sin(t-x)}{1-2r\cos(t-x)+r^2}\,dt - \frac{1}{2\pi}\int_{\eta<|x-t|<\pi} \frac{g(t)}{x-t}\,dt\biggr) = 0 \]
for almost every x.

We shall prove that the principal value exists a.e. for every $f \in L^1(\mathbf{R})$ and also the bound $\|H^* f\|_p \le C_p\|f\|_p$, where $H^* f$ denotes the maximal operator
\[ H^* f(x) = \sup_{\varepsilon>0}\,\Bigl|\int_{|x-t|>\varepsilon} \frac{f(t)}{x-t}\,dt\Bigr|. \]
To obtain the fine result of Sjölin — that $S_n(f,x)$ converges a.e. when
\[ \int_{-\pi}^{\pi} |f(t)|\,\log^+|f(t)|\,\log^+\log^+|f(t)|\,dt < +\infty \]
— it is necessary to estimate the constant $C_p$ in the previous inequality.
3.2 Truncated operators on $L^2(\mathbf{R})$
We begin by studying the truncated operators
\[ H_\varepsilon f(x) = K_\varepsilon * f(x) = \int_{|x-t|>\varepsilon} \frac{f(t)}{x-t}\,dt. \]
Here $K_\varepsilon$ denotes the function equal to 1/t for $|t| > \varepsilon$ and 0 otherwise. As $K_\varepsilon \in L^p(\mathbf{R})$ for every $1 < p \le +\infty$, the convolution is defined for every $f \in L^p(\mathbf{R})$, $1 \le p < +\infty$, and $K_\varepsilon * f$ is a continuous and bounded function.
J.A. de Reyna: LNM 1785, pp. 31–44, 2002. © Springer-Verlag Berlin Heidelberg 2002
3. Hilbert Transform
We want to prove that the operator $H_\varepsilon\colon L^p(\mathbf{R}) \to L^p(\mathbf{R})$ is bounded by a constant that does not depend on $\varepsilon$. To achieve this we apply the interpolation of operators.
First consider the case p = 2. The Fourier transform is an isometry of $L^2(\mathbf{R})$. Also, since $H_\varepsilon f$ is a convolution, we have $\widehat{H_\varepsilon f} = \widehat{K_\varepsilon}\cdot\hat f$. Hence the norm of $H_\varepsilon$ is equal to the norm of the operator that sends $g \in L^2$ to $\widehat{K_\varepsilon}\,g$. In particular $H_\varepsilon$ is bounded if $\widehat{K_\varepsilon}$ is in $L^\infty$, and $\|H_\varepsilon\| = \|\widehat{K_\varepsilon}\|_\infty$.
Hence the result for p = 2 is reduced to the calculation of $\widehat{K_\varepsilon}$. Since $K_\varepsilon$ is not integrable, we must calculate its transform as a limit of the functions
\[ \widehat{K_{\varepsilon,R}}(x) = \int_{\varepsilon<|t|<R} \frac{e^{-2\pi ixt}}{t}\,dt = -2i\int_{\varepsilon}^{R} \frac{\sin 2\pi xt}{t}\,dt, \]
where the limit is taken in $L^2(\mathbf{R})$. As the pointwise limit of these functions when $R \to +\infty$ exists, it is equal to the limit in $L^2(\mathbf{R})$. Hence
\[ \widehat{K_\varepsilon}(x) = -2i\int_{\varepsilon}^{+\infty} \frac{\sin 2\pi xt}{t}\,dt = -2i\int_{y}^{+\infty} \frac{\sin t}{t}\,dt, \qquad y = 2\pi x\varepsilon, \]
and it is easy to see that these integrals are uniformly bounded by an absolute constant.
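The uniform bound on $\int_y^{+\infty}(\sin t/t)\,dt$, which gives $\|\widehat{K_\varepsilon}\|_\infty \le C$, can be tested numerically. In the sketch below (my own; the infinite tail is truncated at an assumed cutoff of 2000, which introduces an $O(1/2000)$ error) all values stay below $\pi/2$ plus a small error.

```python
import math

def sine_integral_tail(y, upper=2000.0, m=200000):
    """Midpoint-rule value of int_y^upper sin(t)/t dt; the neglected tail
    beyond `upper` is O(1/upper)."""
    h = (upper - y) / m
    return sum(math.sin(y + (k + 0.5) * h) / (y + (k + 0.5) * h)
               for k in range(m)) * h

tails = [sine_integral_tail(y) for y in (0.01, 1.0, 10.0, 100.0)]
bound = max(abs(v) for v in tails)   # stays below pi/2 ~ 1.5708 (+ error)
```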
3.3 Truncated operators on $L^1(\mathbf{R})$
It is not true that $H_\varepsilon$ maps $L^1(\mathbf{R})$ into $L^1(\mathbf{R})$. What can be proved is only a weak type inequality. If $f \in L^p(\mu)$, we have the relation
\[ \mu\{x \in X : |f(x)| > t\} \le \|f\|_p^p\, t^{-p}. \]
A function that satisfies an inequality of the type
\[ \mu\{x \in X : |f(x)| > t\} \le \frac{C^p}{t^p} \]
is not necessarily contained in $L^p(\mu)$; we say it is in weak $L^p$. The best constant $C = \|f\|_p^*$ that satisfies the above inequality is called the weak $L^p$ norm of f. Also, an operator T, defined on every $f \in L^p(\mu)$, is said to be of type (p,q) if $\|Tf\|_q \le C\|f\|_p$, and of weak type (p,q) if
\[ \nu\{y \in Y : |Tf(y)| > t\} \le C^q\|f\|_p^q/t^q. \]
It is the same to say that $\|Tf\|_q^* \le C\|f\|_p$. Then what we shall prove is that the operator $H_\varepsilon$ is of weak type (1,1). We give a proof that can be extended to more general operators. It is based on the so-called Calderón-Zygmund decomposition.
Theorem 3.1 (Calderón-Zygmund decomposition). Let $f \in L^1(\mathbf{R})$ and a positive real number $\alpha$ be given. There exists a decomposition $f = g + b$ (a good and a bad function) with the following properties. There exists an open set $\Omega = \bigcup_j Q_j$, where the $Q_j$ are nonoverlapping open intervals, such that $\int_{Q_j} b(t)\,dt = 0$ for every j. On $F = \mathbf{R}\setminus\Omega$ the function g is bounded by $\alpha$. On every $Q_j$ the function g is constant and equal to the mean value of f on $Q_j$, and
\[ \alpha \le \frac{1}{|Q_j|}\int_{Q_j} |f(t)|\,dt \le 2\alpha. \]

Proof. We consider the line to be decomposed into disjoint intervals on which the mean value of |f| is less than $\alpha$; this can be achieved taking large intervals. Now we subdivide each of these intervals into two equal intervals. For each one we calculate the mean value of |f|. If one of these mean values is greater than $\alpha$, we take the corresponding interval as one of the $Q_j$. We continue the process, dividing those intervals on which the mean value is less than $\alpha$. Finally we set $\Omega$ equal to the union of the selected intervals, and
\[ b(x) = f(x)\chi_\Omega(x) - \sum_j \Bigl(\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\Bigr)\chi_{Q_j}(x), \qquad g(x) = f(x)\chi_F(x) + \sum_j \Bigl(\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\Bigr)\chi_{Q_j}(x). \]
Now it is easy to prove that these functions satisfy all our conditions. The $Q_j$ are disjoint by construction. For almost every point $x \in F$, the complement of $\Omega$, there is a sequence of intervals $J_n$ with $x \in J_n$ such that the mean value of |f| on every one is less than $\alpha$; by the differentiation theorem $|f(x)| \le \alpha$ if x is a Lebesgue point of f. Every $Q_j$ is the half of an interval J where the mean value of |f| is less than $\alpha$. Therefore
\[ \frac{1}{|Q_j|}\int_{Q_j} |f(t)|\,dt \le \frac{2}{|J|}\int_J |f(t)|\,dt \le 2\alpha. \]
With this decomposition we can prove:

Theorem 3.2 (Kolmogorov). For every $f \in L^1(\mathbf{R})$ and $\alpha > 0$,
\[ m\{x \in \mathbf{R} : |H_\varepsilon f(x)| > \alpha\} \le C\,\frac{\|f\|_1}{\alpha}. \tag{3.1} \]
3. Hilbert Transform
Proof. Let $f = g + b$ be the Calderón–Zygmund decomposition at level $\alpha$, and let $\Omega = \bigcup_j Q_j$ be the corresponding open set. We know that $g$ and $b$ are in $L^1$, since
$$\|g\|_1 = \int_F |f|\,dm + \sum_j \Bigl|\frac{1}{|Q_j|}\int_{Q_j} f\,dm\Bigr|\,m(Q_j) \le \|f\|_1.$$
Therefore
$$\{x \in \mathbb{R} : |H_\varepsilon f| > 2\alpha\} \subset \{x \in \mathbb{R} : |H_\varepsilon g| > \alpha\} \cup \{x \in \mathbb{R} : |H_\varepsilon b| > \alpha\}.$$
But $g$ is also in $L^2$, since $|g| \le \alpha$ on $F$ and $|g| \le 2\alpha$ on $\Omega$:
$$\|g\|_2^2 \le \alpha\int_F |f|\,dm + 2\alpha\int_\Omega |f|\,dm \le 2\alpha\,\|f\|_1.$$
It follows that
$$m\{x \in \mathbb{R} : |H_\varepsilon g| > \alpha\} \le \frac{\|H_\varepsilon g\|_2^2}{\alpha^2} \le C\,\frac{2\alpha\|f\|_1}{\alpha^2} \le A\,\frac{\|f\|_1}{\alpha}.$$
Now we start with the bad function. First observe that if $G = \bigcup_j 2Q_j$, we have
$$m(G) \le 2\sum_j m(Q_j) \le \frac{2}{\alpha}\sum_j \int_{Q_j} |f|\,dm \le \frac{2}{\alpha}\,\|f\|_1.$$
Therefore
$$m\{x \in \mathbb{R} : |H_\varepsilon b(x)| > \alpha\} \le m(G) + m\{x \in \mathbb{R}\smallsetminus G : |H_\varepsilon b(x)| > \alpha\}.$$
To obtain the corresponding inequality we calculate
$$\int_{\mathbb{R}\smallsetminus G} |H_\varepsilon b(x)|\,dx \le \sum_j \int_{\mathbb{R}\smallsetminus G} \Bigl|\int_{Q_j} K_\varepsilon(x-t)\,b(t)\,dt\Bigr|\,dx,$$
where we have used that $b$ vanishes on $F$. Since the integral of $b$ over every $Q_j$ is zero, we get
$$\int_{\mathbb{R}\smallsetminus G} |H_\varepsilon b(x)|\,dx \le \sum_j \int_{\mathbb{R}\smallsetminus G} \Bigl|\int_{Q_j} \bigl(K_\varepsilon(x-t) - K_\varepsilon(x-t_j)\bigr)\,b(t)\,dt\Bigr|\,dx,$$
where $t_j$ denotes the center of $Q_j$. Thus, by Fubini's theorem, the last integral is at most
$$\sum_j \int_{Q_j} \Bigl(\int_{|x-t_j|>2|t-t_j|} \bigl|K_\varepsilon(x-t) - K_\varepsilon(x-t_j)\bigr|\,dx\Bigr)\,|b(t)|\,dt.$$
Now we claim: the integral in $x$ is bounded by an absolute constant $B$. Therefore
$$\int_{\mathbb{R}\smallsetminus G} |H_\varepsilon b(x)|\,dx \le B\sum_j \int_{Q_j} |b(t)|\,dt \le 2B\,\|f\|_1.$$
Consequently
$$m\{x \in \mathbb{R}\smallsetminus G : |H_\varepsilon b(x)| > \alpha\} \le \frac{2B\,\|f\|_1}{\alpha}.$$
We only have to collect the results obtained. It remains to prove the claim; this will be done in the following proposition. In what follows it will be convenient to use the Iverson–Knuth notation: if $P(x)$ is a condition that can be true or false for every $x$, then $[P(x)]$ is by definition equal to $1$ if $P(x)$ is true and $0$ if $P(x)$ is false. In other words, $[P(x)]$ is the characteristic function of the set $\{x \in \mathbb{R} : P(x)\}$. The claim will be a consequence of the following fact:

Proposition 3.3 For every $a \in \mathbb{R}$ and $\varepsilon > 0$ there exists an even function $\psi\colon \mathbb{R} \to [0,+\infty)$, decreasing on $[0,+\infty)$, such that
$$\bigl|K_\varepsilon(t+a) - K_\varepsilon(t)\bigr|\,[\,|t| > 2|a|\,] \le \psi(x) \quad \text{for every } |t| \ge |x|,$$
and whose integral is bounded by an absolute constant: $\int \psi(t)\,dt < C$.
Proof. We have
$$\bigl|K_\varepsilon(t+a) - K_\varepsilon(t)\bigr|\,[\,|t|>2|a|\,] \le \bigl|K(t+a) - K(t)\bigr|\,[\,|t+a|>\varepsilon\,]\,[\,|t|>2|a|\,] + |K(t)|\,\bigl|[\,|t+a|>\varepsilon\,] - [\,|t|>\varepsilon\,]\bigr|\,[\,|t|>2|a|\,].$$
That is, for $\varepsilon < 3|a|$ this is
$$\le \frac{|a|}{|t|\,(|t|-|a|)}\,[\,|t|>2|a|\,] + \frac{1}{|t|}\,[\,2|a| \le |t| \le 4|a|\,],$$
and for $\varepsilon \ge 3|a|$ it is
$$\le \frac{|a|}{|t|\,(|t|-|a|)}\,[\,|t|>2|a|\,] + \frac{1}{|t|}\,\Bigl[\,\frac{2}{3}\varepsilon \le |t| \le \frac{4}{3}\varepsilon\,\Bigr].$$
Now let
$$\psi_1(a,t) = \begin{cases} |a|/\bigl(|t|\,(|t|-|a|)\bigr) & \text{if } |t| > 2|a|,\\ 1/(2|a|) & \text{if } |t| \le 2|a|,\end{cases} \qquad \psi_2(\varepsilon,t) = \begin{cases} |t|^{-1} & \text{if } 2\varepsilon/3 \le |t| \le 4\varepsilon/3,\\ 3/(2\varepsilon) & \text{if } |t| \le 2\varepsilon/3,\\ 0 & \text{if } |t| > 4\varepsilon/3.\end{cases}$$
Then we can take $\psi(t) = \psi_1(a,t) + \psi_2(3|a|,t)$ in case $\varepsilon < 3|a|$, and $\psi(t) = \psi_1(a,t) + \psi_2(\varepsilon,t)$ when $\varepsilon \ge 3|a|$. It is clear that this function satisfies all our conditions.

This proposition will be needed again in the proof of Cotlar's inequality (Theorem 3.7).
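The stopping-time construction in the proof of Theorem 3.1 is easy to experiment with. The following is a minimal numerical sketch on sampled data; the dyadic grid, the function `cz_decompose` and the test data are our own illustration, not notation from the text:

```python
# Illustrative dyadic Calderon-Zygmund decomposition of sampled f at level alpha.
# Intervals are halved while the mean of |f| stays <= alpha; when a half exceeds
# alpha it becomes a stopping interval Q_j, where g is the mean of f.
def cz_decompose(samples, alpha):
    """Return (g, b, intervals) for samples of f; f = g + b pointwise."""
    n = len(samples)
    g = list(samples)
    intervals = []

    def mean_abs(a, b):
        return sum(abs(x) for x in samples[a:b]) / (b - a)

    def split(a, b):
        # Invariant: the mean of |f| on [a, b) is <= alpha, as in the proof.
        if b - a == 1:
            return
        m = (a + b) // 2
        for c, d in ((a, m), (m, b)):
            if mean_abs(c, d) > alpha:
                intervals.append((c, d))
                mv = sum(samples[c:d]) / (d - c)
                for i in range(c, d):
                    g[i] = mv          # g equals the mean of f on each Q_j
            else:
                split(c, d)

    split(0, n)
    b = [samples[i] - g[i] for i in range(n)]
    return g, b, intervals

# A single spike forces one stopping interval around index 32.
f = [0.1] * 64
f[32] = 10.0
g, b, Qs = cz_decompose(f, 1.0)
```

The assertions one can check reflect exactly the properties of Theorem 3.1: the mean of $|f|$ on each $Q_j$ lies in $(\alpha, 2\alpha]$, $b$ has zero mean on each $Q_j$, and $|g| \le 2\alpha$ everywhere.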
3.4 Interpolation

Assume that $T$ is a linear operator defined on some space of measurable functions that contains $L^{p_0}(\mu)$ and $L^{p_1}(\mu)$ for some $1 \le p_1 < p_0 \le +\infty$. Then $T$ is also defined on every $L^p(\mu)$ with $p \in [p_1, p_0]$. This is true because we can decompose every function $f \in L^p(\mu)$ as $f = f_0 + f_1$, where, with $A = \{t \in X : |f(t)| < 1\}$, we define
$$f_0 = f\,\chi_A \quad\text{and}\quad f_1 = f - f_0.$$
Then $|f_0| \le 1$ and $|f_0| \le |f|$, so $f_0 \in L^\infty(\mu)$ and $f_0 \in L^p(\mu)$; this implies $f_0 \in L^{p_0}(\mu)$. Analogously $|f_1| \le |f|$, and since $f_1$ vanishes where $|f| < 1$ we have $|f_1|^{p_1} \le |f|^p$; therefore $f_1 \in L^{p_1}(\mu)$. Then it is clear that $T(f) = T(f_0) + T(f_1)$ is well defined. The interpolation theorem gives a quantitative version of this observation.

We will prove the Marcinkiewicz interpolation theorem in Chapter 11, and we apply it now; the reader may read that chapter at this point, or just the proof of Theorem 11.10 and the definitions needed in it.

Proposition 3.4 For every $1 < p < +\infty$ the operator $H_\varepsilon\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is continuous, and $\|H_\varepsilon\|_p \le Cp^2/(p-1)$.

Proof. We have proved that $H_\varepsilon$ is of weak type $(1,1)$ and of strong type $(2,2)$. Therefore, applying Marcinkiewicz's Theorem 11.10, for $1 < p \le 2$ we obtain
$$\|H_\varepsilon f\|_p \le \frac{Cp}{(p-1)(2-p)}\,\|f\|_p.$$
Values of $p > 2$ are conjugate to values of $p < 2$, and it is easy to see that $\|H_\varepsilon\|_p = \|H_\varepsilon\|_{p'}$. This follows from Fubini's theorem: if $f \in L^p$ and $g \in L^{p'}$ we have
$$\int_{\mathbb{R}} \Bigl(\int_{|x-t|>\varepsilon} \frac{f(t)}{x-t}\,dt\Bigr) g(x)\,dx = \int_{\mathbb{R}} \Bigl(\int_{|x-t|>\varepsilon} \frac{g(x)}{x-t}\,dx\Bigr) f(t)\,dt.$$
This implies that $\|H_\varepsilon\|_p \le Cp^2/(p-1)$ whenever $|p-2| > 1/2$. (For $1 < p < 3/2$ this is true, and the bound can be written in the symmetric form $C\,p\,p'$.) We only have to prove that the norm $\|H_\varepsilon\|_p$ is uniformly bounded for $|p-2| \le 1/2$, and this can be done by applying the interpolation theorem again, this time between $p_1 = 4/3$ and $p_0 = 4$.
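The splitting $f = f_0 + f_1$ used at the start of this section can be checked on toy data. This is a discrete sketch of ours, with $p = 2$ and $p_1 = 1$:

```python
# The splitting of Section 3.4: f0 = f on {|f| < 1}, f1 = f elsewhere.
# Then |f0| <= 1 (so f0 lies in the larger exponent space), while f1 is 0
# where |f| < 1, hence |f1|^{p1} <= |f1|^p <= |f|^p pointwise.
vals = [0.5, 3.0, 0.25, 7.0, 0.9]
f0 = [v if abs(v) < 1 else 0.0 for v in vals]
f1 = [v - u for v, u in zip(vals, f0)]
p, p1 = 2.0, 1.0
dominated = all(abs(v1) ** p1 <= abs(v) ** p + 1e-12 for v1, v in zip(f1, vals))
```

The point of the check: the bounded part controls the $L^{p_0}$ norm and the big part controls the $L^{p_1}$ norm, which is exactly why $T(f_0) + T(f_1)$ makes sense.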
3.5 The Hilbert transform

Now we are in a position to prove that for every $1 < p < +\infty$ there is a bounded operator $H\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$. We shall prove that if $1 < p < +\infty$ and $f \in L^p(\mathbb{R})$, then the limit $\lim_{\varepsilon\to 0^+} H_\varepsilon f$ exists, the limit being taken in the space $L^p(\mathbb{R})$. In the case $p = 1$, $H_\varepsilon f$ is only in weak $L^1$, so we have to modify the reasoning slightly. We shall need a dense subset where the limits exist.

Proposition 3.5 Let $\varphi$ be an infinitely differentiable function of compact support. For every $1 < p < +\infty$ the limit
$$H\varphi = \lim_{\varepsilon\to 0^+} H_\varepsilon\varphi$$
exists in $L^p(\mathbb{R})$. Moreover, for every $x$ the limit $H\varphi(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon\varphi(x)$ exists.

Proof. Observe that for $0 < \delta < \varepsilon$
$$H_\varepsilon\varphi(x) - H_\delta\varphi(x) = -\int_{\delta<|t|<\varepsilon} \frac{\varphi(x-t)}{t}\,dt = -\int_{\delta<|t|<\varepsilon} \frac{\varphi(x-t) - \varphi(x)}{t}\,dt,$$
since $\int_{\delta<|t|<\varepsilon} dt/t = 0$. Since $|\varphi(x-t) - \varphi(x)| \le \|\varphi'\|_\infty\,|t|$, the last integral is at most $2\|\varphi'\|_\infty(\varepsilon-\delta)$ in absolute value, uniformly in $x$, and the Cauchy condition follows.
For $f$ in weak $L^1$ put $\|f\|_{1,\infty} = \sup_{\alpha>0} \alpha\,m\{x : |f(x)| > \alpha\}$. This is not a norm, but it satisfies
$$\|af\|_{1,\infty} = |a|\cdot\|f\|_{1,\infty}, \qquad \|f+g\|_{1,\infty} \le 2\|f\|_{1,\infty} + 2\|g\|_{1,\infty}.$$
Therefore $L^{1,\infty}(\mathbb{R})$ is a vector space. Also, for $f \in L^1(\mathbb{R})$ we have $\|f\|_{1,\infty} \le \|f\|_1$. Sometimes we call $L^{1,\infty}(\mathbb{R})$ the weak $L^1$ space. This quasi-norm allows us to define a topology on $L^{1,\infty}(\mathbb{R})$, where a basis of neighborhoods of $f$ is given by the sets $f + B(0,\varepsilon)$, with $B(0,\varepsilon) = \{g \in L^{1,\infty}(\mathbb{R}) : \|g\|_{1,\infty} < \varepsilon\}$. Now, for $f \in L^1(\mathbb{R})$ and a smooth $\varphi$,
$$\|H_\varepsilon f - H_\delta f\|_{1,\infty} \le 3\|H_\varepsilon\varphi - H_\delta\varphi\|_{1,\infty} + 3\|H_\varepsilon(f-\varphi)\|_{1,\infty} + 3\|H_\delta(f-\varphi)\|_{1,\infty} \le C_\varphi\,|\varepsilon-\delta| + C\|f-\varphi\|_1.$$
Thus we are able to verify the Cauchy condition for $H_\varepsilon f$ in the space $L^{1,\infty}(\mathbb{R})$. All that is needed now is to prove that the space $L^{1,\infty}(\mathbb{R})$ is complete.

Proposition 3.6 Let $(f_n)$ be a Cauchy sequence in $L^{1,\infty}(\mathbb{R})$. Then there exist a subsequence $(f_{n_k})$ and a measurable function $f \in L^{1,\infty}(\mathbb{R})$ such that $f_{n_k}$ converges a.e. to $f$ and $\lim_n \|f_n - f\|_{1,\infty} = 0$.

Proof. The proof is the same as that of the Riesz–Fischer theorem. We select the subsequence $g_k = f_{n_k}$ in such a way that $\|g_{k+1} - g_k\|_{1,\infty} < 4^{-k}$. If $Z_k = \{|g_{k+1} - g_k| > 2^{-k}\}$, we have $m(Z_k) \le 2^k\cdot 4^{-k} = 2^{-k}$. Therefore $Z = \bigcap_N \bigcup_{k>N} Z_k$ is of measure zero, and for every $x \notin Z$ there exists $N$ such that
for every $n > N$, $|g_{n+1}(x) - g_n(x)| \le 2^{-n}$. It follows that the sequence $(g_n)$ converges a.e. to some measurable function $f$. To prove that $\lim_n \|f_n - f\|_{1,\infty} = 0$ it is convenient to observe that for every finite sequence of functions $(h_j)$ in $L^{1,\infty}(\mathbb{R})$ we have
$$\Bigl\|\sum_{j=1}^N h_j\Bigr\|_{1,\infty} \le \sum_{j=1}^N 2^j\,\|h_j\|_{1,\infty}.$$
Therefore we have $\|g_{n+k} - g_n\|_{1,\infty} \le 4^{1-n}$. That is,
$$m(A_k(\alpha)) = m\{|g_{n+k} - g_n| > \alpha\} \le \frac{1}{4^{n-1}\alpha}.$$
Since $\{|f - g_n| > \alpha\} \subset Z \cup \bigcup_{k=N}^{\infty} A_k(\alpha)$, it follows easily that
$$m\{|f - g_n| > \alpha\} \le 4^{1-n}\alpha^{-1}.$$
Therefore $\|f - g_n\|_{1,\infty} \to 0$.
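The quasi-norm $\|f\|_{1,\infty}$ is easy to compute exactly on a discrete measure space. A sketch of ours, using the harmonic sequence as the classic example of a function in weak $L^1$ whose $L^1$ norm grows:

```python
# Weak-L1 quasi-norm ||f||_{1,oo} = sup_{t>0} t * m{|f| > t} on a discrete
# space where each point carries unit mass.  The sup over t is attained at
# one of the finitely many jump points of the distribution function.
def weak_l1(vals, mass=1.0):
    best = 0.0
    for t in sorted({abs(v) for v in vals}):
        best = max(best, t * mass * sum(1 for v in vals if abs(v) >= t))
    return best

harmonic = [1.0 / k for k in range(1, 1001)]   # f(k) = 1/k
q = weak_l1(harmonic)                          # stays 1 for every length n
q_simple = weak_l1([2.0, 0.5])
```

For $f(k) = 1/k$ every level $t = 1/k$ gives $t\cdot m\{|f| \ge t\} = 1$, so the quasi-norm is $1$ no matter how many points are taken, while $\sum 1/k$ diverges.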
Now we can define the Hilbert transform $Hf$ for every $f \in L^p(\mathbb{R})$. For $1 < p < +\infty$, $Hf \in L^p(\mathbb{R})$; moreover $H\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is a bounded linear operator and
$$\|Hf\|_p \le C\,\frac{p^2}{p-1}\,\|f\|_p, \qquad 1 < p < +\infty.$$
In the case $p = 1$ the linear operator $H$ is defined but takes values in $L^{1,\infty}$; in particular, for every $f \in L^1(\mathbb{R})$, $\|Hf\|_{1,\infty} \le C\|f\|_1$. This follows from the corresponding Kolmogorov theorem about $H_\varepsilon$ and from the fact that, for some sequence $(\varepsilon_n)$, $H_{\varepsilon_n} f$ converges a.e. to $Hf$.
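A truncated transform $H_\varepsilon f$ can also be evaluated numerically for a function whose transform is known in closed form. A sanity check of ours (the midpoint rule, the window $[-5,5]$ and the grid size are our own discretization choices):

```python
import math

# Truncated Hilbert transform  H_eps f(x) = ∫_{|x-t|>eps} f(t)/(x-t) dt
# for f the indicator of [-1, 1].  At x = 2 the truncation is inactive and
# the exact value is ∫_{-1}^{1} dt/(2-t) = log 3.
def trunc_hilbert(f, x, eps, a=-5.0, b=5.0, n=200000):
    h = (b - a) / n
    s = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        if abs(x - t) > eps:
            s += f(t) / (x - t) * h
    return s

indicator = lambda t: 1.0 if -1.0 <= t <= 1.0 else 0.0
val = trunc_hilbert(indicator, 2.0, 0.1)
```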
3.6 Maximal Hilbert transform

In the proof of Carleson's theorem we need the inequality $\|H_*f\|_p \le Bp\,\|f\|_p$, valid for every $p \ge 2$ and every $f \in L^p(\mathbb{R})$, where
$$H_*f(x) = \sup_{\varepsilon>0} |H_\varepsilon f(x)|$$
is the maximal Hilbert transform. Historically, this result was obtained to give a real-variable proof of the pointwise convergence
$$Hf(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon f(x), \qquad \text{a.e.}$$
The proof which we shall give of this result is based on the following bound.
Theorem 3.7 (Cotlar's inequality) Let $f \in L^p(\mathbb{R})$, $1 < p < +\infty$. Then
$$H_*f(x) \le A\bigl(\mathcal{M}(Hf)(x) + \mathcal{M}f(x)\bigr).$$
Therefore
$$\|H_*f\|_p \le Bp\,\|f\|_p, \qquad p \ge 2.$$

Proof. Fix $a \in \mathbb{R}$ and $\varepsilon > 0$. We want to prove
$$|H_\varepsilon f(a)| \le A\bigl(\mathcal{M}(Hf)(a) + \mathcal{M}f(a)\bigr).$$
We notice that
$$H_\varepsilon f(a) = \int_{|t-a|>\varepsilon} \frac{f(t)}{a-t}\,dt = \int_{\mathbb{R}} \frac{f_2(t)}{a-t}\,dt = Hf_2(a),$$
where $f = f_1 + f_2$ and $f_2$ is equal to $f$ on $|t-a| > \varepsilon$ and equal to $0$ on the interval $2J = \{t : |a-t| \le \varepsilon\}$. This equality makes sense in spite of the fact that we have defined $H$ only as an operator from $L^p(\mathbb{R})$ to $L^p(\mathbb{R})$: since $f_2$ is null on the interval $2J$, at every point $x$ of the interval $J = \{t : |a-t| \le \varepsilon/2\}$ the value $H_\eta f_2(x)$ does not depend on $\eta$ for $\eta < \varepsilon/2$. Therefore $Hf_2$ equals $H_\eta f_2$ on $J$ and is a continuous function on this interval. We can bound the oscillation of $Hf_2$ on $J$:
$$|Hf_2(x) - Hf_2(a)| \le \int \bigl|K_\eta(x-t) - K_\eta(a-t)\bigr|\,|f_2(t)|\,dt.$$
Now we recall that there exists an even function $\psi(x)$, decreasing in $|x|$, with integral bounded by an absolute constant, such that
$$\bigl|K_\eta(x-t) - K_\eta(a-t)\bigr|\,[\,|a-t| > 2|x-a|\,] \le \psi(a-t)$$
(cf. Proposition 3.3). It follows that
$$|Hf_2(x) - Hf_2(a)| \le \int_{\mathbb{R}} \psi(a-t)\,|f_2(t)|\,dt \le C\,\mathcal{M}f_2(a) \le C\,\mathcal{M}f(a),$$
by the general inequality satisfied by the Hardy–Littlewood maximal function. Collecting our results we have $|H_\varepsilon f(a)| \le C\,\mathcal{M}f(a) + |Hf_2(x)|$ for every $x \in J$. Therefore
$$|H_\varepsilon f(a)| \le C\,\mathcal{M}f(a) + |Hf(x)| + |Hf_1(x)| \tag{3.2}$$
for almost every $x \in J$. What we have achieved is the liberty to choose $x \in J$. Now we use a probabilistic argument to show that there is some point where $|Hf(x)|$ and $|Hf_1(x)|$ are simultaneously bounded. By the definition of the Hardy–Littlewood maximal function,
$$\frac{1}{|J|}\int_J |Hf(t)|\,dt \le \mathcal{M}(Hf)(a).$$
The probability that a function exceeds three times its mean is less than $1/3$; therefore
$$m\{t \in J : |Hf(t)| < 3\mathcal{M}(Hf)(a)\} \ge \tfrac{2}{3}|J|.$$
For the other term we use the weak inequality for $Hf_1$:
$$m\{t \in J : |Hf_1(t)| > \alpha\} \le C\,\frac{\|f_1\|_1}{\alpha} \le 2C\,\frac{\mathcal{M}f(a)}{\alpha}\,|J|.$$
Therefore, in the same way, we obtain
$$m\{t \in J : |Hf_1(t)| \le 6C\,\mathcal{M}f(a)\} \ge \tfrac{2}{3}|J|.$$
Hence there is some point $x \in J$ such that, simultaneously,
$$|Hf(x)| < 3\mathcal{M}(Hf)(a) \quad\text{and}\quad |Hf_1(x)| < 6C\,\mathcal{M}f(a).$$
Finally, we arrive at
$$|H_\varepsilon f(a)| \le C\,\mathcal{M}f(a) + 3\mathcal{M}(Hf)(a) + 6C\,\mathcal{M}f(a) \le A\bigl(\mathcal{M}f(a) + \mathcal{M}(Hf)(a)\bigr).$$
Now we can prove the bound on the norms. Since $p > 1$, applying the known results about the Hardy–Littlewood maximal function and the Hilbert transform,
$$\|H_*f\|_p \le A\|\mathcal{M}(Hf)\|_p + A\|\mathcal{M}f\|_p \le \frac{Cp}{p-1}\|Hf\|_p + \frac{Cp}{p-1}\|f\|_p \le C\,\frac{p^3}{(p-1)^2}\,\|f\|_p + \frac{Cp}{p-1}\,\|f\|_p.$$
For $p > 2$ it follows that $\|H_*f\|_p \le Cp\,\|f\|_p$.

The inequality that we obtain near $p = 1$ is not sharp. We shall need the sharp estimate to obtain the result of Sjölin. We shall also give the corresponding result for $p = 1$, which is needed in the proof of the almost everywhere convergence of $H_\varepsilon f(x)$ to $Hf(x)$ for $f \in L^1(\mathbb{R})$. The problem at $p = 1$ lies in Cotlar's inequality: $\mathcal{M}(Hf)$ may be infinite at every point if $Hf$ is not in $L^1(\mathbb{R})$. Hence we modify the inequality, taking $\mathcal{M}(|Hf|^{1/2})$ instead of $\mathcal{M}(Hf)$.
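The pigeonhole step in the proof is just Markov's inequality: a nonnegative function exceeds three times its mean on less than a third of the interval, so the complementary "good" set keeps measure at least $2|J|/3$. A quick check on arbitrary sample data of our own:

```python
import math

# Markov's inequality step: the set {h >= 3*mean(h)} has at most 1/3 of the
# total mass, strictly less here since the data are not concentrated.
data = [abs(math.sin(7 * k)) + 0.01 for k in range(300)]
mean = sum(data) / len(data)
good_fraction = sum(1 for v in data if v < 3 * mean) / len(data)
```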
Theorem 3.8 (Modified Cotlar inequality) Let $f \in L^1(\mathbb{R})$. Then
$$H_*f(x) \le A\Bigl(\bigl\{\mathcal{M}(|Hf|^{1/2})(x)\bigr\}^2 + \mathcal{M}f(x)\Bigr).$$

Proof. The proof is the same as that of Cotlar's inequality, until we reach inequality (3.2). By the definition of the Hardy–Littlewood maximal function,
$$\frac{1}{|J|}\int_J |Hf|^{1/2}\,dm \le \mathcal{M}(|Hf|^{1/2})(a).$$
Therefore
$$m\bigl\{t \in J : |Hf(t)| < \{3\mathcal{M}(|Hf|^{1/2})(a)\}^2\bigr\} \ge \tfrac{2}{3}\,m(J).$$
The rest of the proof is the same as before: we obtain that there is some point $x \in J$ where, simultaneously, $|Hf(x)|^{1/2} < 3\mathcal{M}(|Hf|^{1/2})(a)$ and $|Hf_1(x)| \le 6C\,\mathcal{M}f(a)$.

Now we can prove that $H_*f$ is in weak $L^1$ when $f \in L^1(\mathbb{R})$. In fact, if $T\colon L^1(\mathbb{R}) \to L^{1,\infty}(\mathbb{R})$, then $\mathcal{M}(Tf)$ is not, in general, in $L^{1,\infty}(\mathbb{R})$; but, as Kolmogorov observed (see the next proposition), $\mathcal{M}(|Tf|^{1/2})$ is in $L^{2,\infty}(\mathbb{R})$, with $\|\mathcal{M}(|Tf|^{1/2})\|_{2,\infty}^2 \le 2c_1\|T\|\,\|f\|_1$. Given the modified Cotlar inequality, this proves that $H_*f$ is in $L^{1,\infty}(\mathbb{R})$ for every $f \in L^1(\mathbb{R})$: indeed, $\mathcal{M}(|Hf|^{1/2}) \in L^{2,\infty}(\mathbb{R})$ gives us
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Hf|^{1/2})\}^2 > \alpha\bigr\} \le C\,\frac{\|f\|_1}{\alpha}.$$

Proposition 3.9 (Kolmogorov) Let $T$ be an operator such that, for every $f \in L^1(\mathbb{R})$, $\|Tf\|_{1,\infty} \le \|T\|\,\|f\|_1$. Then for every $\alpha > 0$ and $f \in L^1(\mathbb{R})$
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > \alpha\bigr\} \le c\,\|T\|\,\frac{\|f\|_1}{\alpha}.$$

Proof. For every measurable function $g\colon \mathbb{R} \to \mathbb{C}$ we have $g = g\,[\,|g| \le \alpha\,] + g\,[\,|g| > \alpha\,]$. Therefore $\{\mathcal{M}g > 2\alpha\} \subset \{\mathcal{M}(g\,[\,|g|>\alpha\,]) > \alpha\}$, and by the lemma of Hardy and Littlewood
$$m\{\mathcal{M}g > 2\alpha\} \le \frac{c_1}{\alpha}\int_{|g|>\alpha} |g|\,dm.$$
We apply this inequality to our function:
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > 4\alpha\bigr\} \le \frac{c_1}{\sqrt{\alpha}}\int_{|Tf|^{1/2}>\sqrt{\alpha}} |Tf|^{1/2}\,dm.$$
Now we follow the traditional path, expressing the integral through the distribution function:
$$\frac{c_1}{\sqrt{\alpha}}\int_{|Tf|^{1/2}>\sqrt{\alpha}} |Tf|^{1/2}\,dm = \frac{c_1}{\sqrt{\alpha}}\Bigl(\sqrt{\alpha}\,m\{|Tf| > \alpha\} + \frac{1}{2}\int_\alpha^{+\infty} t^{-1/2}\,m\{|Tf| > t\}\,dt\Bigr),$$
and now, by the hypothesis on $T$,
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > 4\alpha\bigr\} \le c_1\,\|T\|\,\|f\|_1\Bigl(\frac{1}{\alpha} + \frac{1}{2\sqrt{\alpha}}\int_\alpha^{+\infty} t^{-3/2}\,dt\Bigr) = \frac{2c_1\,\|T\|\,\|f\|_1}{\alpha}.$$
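The distribution-function identity used in the computation above (the layer-cake formula) can be verified exactly on a discrete measure space, where $m\{|g|>t\}$ is a step function of $t$. Toy data of our own:

```python
# Layer-cake identity:  ∫_{|g|>s} |g| dm = s*m{|g|>s} + ∫_s^∞ m{|g|>t} dt,
# on a space with unit point masses; the right-hand integral is a finite sum
# over the jump points of the distribution function.
gvals = [0.2, 1.5, 2.5, 0.7, 3.0]
s0 = 1.0
meas = lambda t: sum(1 for v in gvals if v > t)
lhs = sum(v for v in gvals if v > s0)
jumps = sorted([s0] + [v for v in gvals if v > s0])
rhs = s0 * meas(s0) + sum((jumps[i + 1] - jumps[i]) * meas(jumps[i])
                          for i in range(len(jumps) - 1))
```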
Theorem 3.10 For every $f \in L^1(\mathbb{R})$ and $\alpha > 0$
$$m\{x \in \mathbb{R} : H_*f(x) > \alpha\} \le C\,\frac{\|f\|_1}{\alpha},$$
where $C$ is an absolute constant. Therefore for every $f \in L^p(\mathbb{R})$
$$\|H_*f\|_p \le B\,\frac{p^2}{p-1}\,\|f\|_p, \qquad 1 < p < +\infty. \tag{3.3}$$
Proof. By the modified Cotlar inequality,
$$\{H_*f > 2\alpha\} \subset \bigl\{\{\mathcal{M}(|Hf|^{1/2})\}^2 > \alpha/A\bigr\} \cup \{\mathcal{M}f > \alpha/A\}.$$
Hence the previous theorem, applied to $H$, gives us the weak inequality. Now we know that there is a constant $C$ such that $\|Hf\|_4 \le C\|f\|_4$. We apply Marcinkiewicz's theorem to get
$$\|H_*f\|_p \le \frac{C}{p-1}\,\|f\|_p, \qquad 1 < p < 2.$$
Recall that we have proved $\|H_*f\|_p \le Bp\,\|f\|_p$ for $2 \le p < +\infty$. These two inequalities prove (3.3). Finally we arrive at
Theorem 3.11 (Pointwise convergence) For every $f \in L^p(\mathbb{R})$, where $1 \le p < +\infty$,
$$Hf(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon f(x) \qquad \text{a.e.}$$

Proof. The proof is almost the same as that of the differentiation theorem. Put
$$\Omega f(x) = \limsup_{\varepsilon\to 0^+} H_\varepsilon f(x) - \liminf_{\varepsilon\to 0^+} H_\varepsilon f(x).$$
What we have to prove is that $\Omega f(x) = 0$ a.e. For every smooth function $\varphi$ of compact support it is easily shown that $\Omega\varphi(x) = 0$ for all $x \in \mathbb{R}$. We also know that $\Omega f(x) \le 2H_*f(x)$ for all $x \in \mathbb{R}$. These two facts combine in the following way:
$$m\{\Omega f > \alpha\} = m\{\Omega(f-\varphi) > \alpha\} \le m\{H_*(f-\varphi) > \alpha/2\}.$$
Therefore, by the results we have proved about $H_*$, it follows that
$$m\{\Omega f > \alpha\} \le \Bigl(\frac{C\,\|f-\varphi\|_p}{\alpha}\Bigr)^p.$$
By the density of the smooth functions in $L^p(\mathbb{R})$ it follows that $m\{\Omega f > \alpha\} = 0$ for every $\alpha > 0$; therefore $\Omega f(x) = 0$ a.e., as we wanted to prove.
Part Two The Carleson–Hunt Theorem
In this part we study the proof of the Carleson–Hunt theorem (Theorem 12.8, p. 172). To prove the almost everywhere convergence of a Fourier series one considers the maximal operator $S^*(f,x) = \sup_n |S_n(f,x)|$. Instead of this we prefer to consider the Carleson maximal operator, defined by replacing the Dirichlet kernel by $e^{int}/t$; in equation (2.3) we have seen that this is almost the same. The Carleson maximal operator is
$$C^*f(x) = \sup_{n\in\mathbb{Z}} \Bigl|\,\mathrm{p.v.}\!\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt\Bigr|.$$
This step follows the observation of Luzin. We want to prove that $C^*f$ is in $L^p(-\pi/2,\pi/2)$ when $f \in L^p(-\pi,\pi)$. By the interpolation theorems we must bound the measure of the set where $C^*f(x) > y$. Therefore we are interested in bounding the Carleson integrals
$$\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt.$$
The main idea of the proof is to decompose the interval $I$ into subintervals, one of them, $I(x)$, containing $x$ (indeed $x \in I(x)/2$). Then we have
$$\mathrm{p.v.}\int_I \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt = \mathrm{p.v.}\int_{I(x)} \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt + \sum_J \int_J \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt.$$
Clearly, the first term gives the main contribution. It is almost a new Carleson integral, the only difference being that, in general, the number of cycles will not be an integer; we shall see that we can replace it by a genuine Carleson integral. The other terms can be conveniently bounded. To this end we write
$$\int_J \frac{e^{i\lambda(x-t)}f(t)}{x-t}\,dt = \int_J \frac{e^{i\lambda(x-t)}f(t) - M}{x-t}\,dt + \int_J \frac{M}{x-t}\,dt,$$
where $M$ is the mean value of $e^{i\lambda(x-t)}f(t)$ on $J$. Now we can benefit from the fact that the integral of $e^{i\lambda(x-t)}f(t) - M$ over $J$ is $0$, and put
$$\int_J \frac{e^{i\lambda(x-t)}f(t) - M}{x-t}\,dt = \int_J \bigl(e^{i\lambda(x-t)}f(t) - M\bigr)\Bigl(\frac{1}{x-t} - \frac{1}{x-t_J}\Bigr)\,dt,$$
J.A. de Reyna: LNM 1785, pp. 47–49, 2002. © Springer-Verlag Berlin Heidelberg 2002
where $t_J$ is the center of $J$. All these terms can be bounded by weak local norms
$$\|f\|_{(n,J)} = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\,\frac{1}{|J|}\,\Bigl|\int_J f(t)\,\exp\Bigl(-2\pi i\Bigl(n+\frac{j}{3}\Bigr)\frac{t}{|J|}\Bigr)\,dt\Bigr|.$$
Hence, given $\alpha = (n,J)$, $\|f\|_\alpha$ is a mean value of absolute values of (generalized) Fourier coefficients of $f$. Observe now that the first term is of the type with which we started, only that the new integral is simpler because it corresponds to fewer cycles, though a noninteger number of them. We can change to a new integral with an integer number of cycles, in such a way that the difference is again bounded by the local norm. We iterate this process until we obtain a Carleson integral with $0$ cycles; that is, we arrive at a Hilbert transform, which is easily estimated. The remainders that appear in these steps are bounded except on an exceptional set; the larger we allow the bound, the smaller this exceptional set can be made. To bound the number of steps we start with $n \le 2^N$. In this way we arrive naturally at the $\log\log n$ result: for every interval that appears in the process we must add a fraction to the exceptional set, and in the process we consider every dyadic interval of length $\le 2\pi/2^N$, which gives a contribution proportional to $N$; we must compensate with $\log N$ in the allowed bound.

Chapter 4 contains the basic step. This is a precise formulation of how we choose the partition of the interval of a given Carleson integral, and of the bounds we obtain. It also contains the theorem on change of frequency, which gives a good bound for the difference between two Carleson integrals on the same interval but with different frequencies; this theorem plays an important role in the second part of the proof. Chapter 5 gives the bounds of the terms that appear in the basic step. Finally, Chapter 6 defines the exceptional set and combines the previous bounds to obtain the result $S_n(f,x) = o(\log\log n)$ a.e. for every $f \in L^2[-\pi,\pi]$. Then we start the proof of the Carleson–Hunt theorem, a refined version of the previous theorem.
An analysis of the previous proof shows the reason for the $\log\log n$ term: we have used all the pairs $\alpha = (n,I)$ in the proof. But if we think about the process of selecting the central interval $I(x)$, we notice that in it the local norm must be relatively large. Therefore we need to select a set $S$ of allowed pairs and ensure that we use only them in the inductive steps. If we start with a Carleson integral (perhaps one corresponding to an allowed pair) and carry out the procedure of Chapter 4, we reach a Carleson integral that in general does not correspond to an allowed pair; therefore we must change to another one, with a controllable error.
On the way to the definition of the allowed pairs, Carleson gives a very clever analysis of the function $f$. This analysis can be regarded as writing the score of a piece of music. Given a level $b_j y^{p/2}$ (the intensity of the least intense note), we start from the four equal intervals into which the interval $I$ is divided. On each one we compute the Fourier series of $f$ and retain only those terms greater than the chosen level; these form the polynomial $P_j$. Then each of these intervals is divided into two halves, on each of which we compute the Fourier series of $f - P_j$, and we proceed in the same way. This procedure yields a set of pairs $Q$ that we call the notes of $f$; it will be the starting point for the definition of the allowed pairs.

In Chapters 4, 5 and 6 we apply the principal idea of the proof in a straightforward way, obtaining that for every $f \in L^2$ the partial sums satisfy $S_n(f,x) = o(\log\log n)$. This reasoning is relatively easy to follow. The kernel of the proof is in Chapters 7 to 10, where the previous argument is refined to remove the $\log\log n$ term.
4. The Basic Step
4.1 Introduction

The proof of the Carleson theorem is based on a new method of estimating partial sums of Fourier series. We replace the Dirichlet integral by the singular integral
$$C_{(n,I)}f(x) = \mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt.$$
The new method consists in applying repeatedly a basic step that we are going to analyze in this chapter. Given a partition $\Pi$ of the interval $I = [-\pi,\pi]$, we decompose the integral as
$$C_{(n,I)}f(x) = \mathrm{p.v.}\int_{I(x)} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt + \sum_{J\in\Pi,\,J\not\subset I(x)} \int_J \frac{E_\Pi f(t)}{x-t}\,dt + \sum_{J\in\Pi,\,J\not\subset I(x)} \int_J \frac{e^{in(x-t)}f(t) - E_\Pi f(t)}{x-t}\,dt, \tag{4.1}$$
where $I(x)$ is an interval containing $x$ that is a union of some members of $\Pi$, and $E_\Pi f$ is the conditional expectation of $e^{in(x-t)}f(t)$ with respect to $\Pi$. The principal point of the basic step is the careful choice of $I(x)$ and $\Pi$, so that we have good control of the last two sums. The first term is an integral of the same type, which we can treat in a similar way. First we define local norms with which we can give adequate bounds for the last two terms; later we will prove the precise bounds and give the final form of the basic step in Theorem 4.13.
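The conditional expectation $E_\Pi$ appearing in (4.1) simply replaces a function on each cell of the partition $\Pi$ by its mean value there. A discrete sketch of ours (cells given as index ranges, data arbitrary; in the text the operator is applied to $e^{in(x-t)}f(t)$ rather than to a real sample):

```python
# Conditional expectation with respect to a finite partition: on each cell,
# E_Pi f is constant, equal to the mean of f over the cell, and the total
# integral over each cell is preserved.
fvals = [1.0, 3.0, 2.0, 6.0, 4.0, 4.0]
cells = [(0, 2), (2, 4), (4, 6)]
Ef = fvals[:]
for a, b in cells:
    mv = sum(fvals[a:b]) / (b - a)
    for i in range(a, b):
        Ef[i] = mv
```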
4.2 Carleson maximal operator

Let $I \subset \mathbb{R}$ be a bounded interval and $n \in \mathbb{Z}$. For every $f \in L^1(I)$ we consider the singular integral
$$\mathrm{p.v.}\int_I \frac{e^{2\pi i n(x-t)/|I|}}{x-t}\,f(t)\,dt. \tag{4.2}$$
If $x$ is contained in the interval $I/2$, with the same center as $I$ and half its length, we call (4.2) a Carleson integral. Here we have made an arbitrary selection: $x \in I/2$. Every condition $x \in \theta I$, with $\theta \in (0,1)$, would be adequate; but if we allow $x$ to be near the endpoints of $I$, even the simplest Carleson integral would be unbounded. For example, to bound
$$\sup_{a,b>0}\Bigl|\,\mathrm{p.v.}\!\int_{-a}^{b}\frac{dt}{t}\Bigr| = \sup\Bigl|\log\frac{b}{a}\Bigr|$$
we need $M^{-1} < b/a < M$; this is the condition $0 \in \theta(-a,b)$ with $\theta = (M-1)/(M+1) \in (0,1)$. Although this selection is arbitrary, many details of the proof will depend on the selection made here.

The set of pairs $\mathcal{P} = \{(n,I) : n \in \mathbb{Z},\ I \text{ a bounded interval of } \mathbb{R}\}$ will play a central role in what follows. We use Greek letters to denote the elements of $\mathcal{P}$. Given a pair $\alpha = (n,I)$ we denote $I$ by $I(\alpha)$ and $n$ by $n(\alpha)$; we call $|I| = |I(\alpha)|$ the length of $\alpha$ and write it $|\alpha|$. Then for every $\alpha \in \mathcal{P}$ and $x \in I(\alpha)/2$ we put
$$C_\alpha f(x) = \mathrm{p.v.}\int_{I(\alpha)} \frac{e^{2\pi i n(\alpha)(x-t)/|\alpha|}}{x-t}\,f(t)\,dt.$$
Given $f \in L^1(I)$ and $\alpha \in \mathcal{P}$ with $I(\alpha) = I$, $C_\alpha f(x)$ is defined for almost every $x \in I(\alpha)/2$. We shall study the maximal Carleson operator
$$C_I^*f(x) = \sup_{I(\alpha)=I,\ n(\alpha)\in\mathbb{Z}} |C_\alpha f(x)|.$$
This is a measurable function with values in $[0,+\infty]$, since it is a supremum of a countable family of measurable functions. We shall prove that it is a bounded operator from $L^p(I)$ to $L^p(I/2)$ for every $1 < p < +\infty$.

The following lemma gives us some practice with Carleson integrals, and it will be needed in the proof of Carleson's theorem:

Lemma 4.1 There exists an absolute constant $A > 0$ such that if $x \in I(\gamma)/2$, then for every $\lambda \in \mathbb{R}$, $|C_\gamma(e^{i\lambda t})(x)| \le A$.

Proof. Write $I = I(\gamma)$. By definition, if $\lambda(\gamma) = 2\pi n(\gamma)/|\gamma|$,
$$C_\gamma(e^{i\lambda t})(x) = \mathrm{p.v.}\int_I \frac{e^{i\lambda(\gamma)(x-t)}}{x-t}\,e^{i\lambda t}\,dt.$$
Hence, if $\omega = \lambda(\gamma) - \lambda$,
$$|C_\gamma(e^{i\lambda t})(x)| = \Bigl|\,\mathrm{p.v.}\!\int_I \frac{e^{i\omega(x-t)}}{x-t}\,dt\Bigr|.$$
4.3 Local norms
We change variables, letting $u = x - t$, and see that
$$|C_\gamma(e^{i\lambda t})(x)| = \Bigl|\,\mathrm{p.v.}\!\int_{-a}^{b} \frac{e^{i\omega u}}{u}\,du\Bigr|.$$
Since $x \in I(\gamma)/2$ we have $0 \in J/2$ for $J = [-a,b]$; this condition is equivalent to $1/3 < b/a < 3$. Therefore
$$|C_\gamma(e^{i\lambda t})(x)| = \Bigl|\,\mathrm{p.v.}\!\int_{-a}^{a} \frac{e^{i\omega u}}{u}\,du + \int_a^b \frac{e^{i\omega u}}{u}\,du\Bigr| \le \Bigl|\int_{-a}^{a} \frac{\sin\omega u}{u}\,du\Bigr| + \log 3.$$
Furthermore $\int_{-x}^{x} (\sin t/t)\,dt$ is bounded, thus proving our lemma.
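The uniform bound of Lemma 4.1 can be observed numerically. A sketch of ours, with $a = 1$, $b = 2$ (so $1/3 < b/a < 3$); the step size and the sample frequencies are our own choices:

```python
import cmath, math

# Principal value  p.v. ∫_{-a}^{b} e^{iωu}/u du  via a grid symmetric about 0:
# each positive midpoint u < a is paired with -u, so the singular parts cancel
# pairwise, which is exactly how the principal value is defined.
def pv_exp_integral(omega, a=1.0, b=2.0, h=1e-4):
    s = 0.0 + 0.0j
    k = 0
    while (k + 0.5) * h < b:
        u = (k + 0.5) * h
        s += cmath.exp(1j * omega * u) / u * h
        if u < a:                      # mirror point -u lies inside [-a, 0)
            s += cmath.exp(-1j * omega * u) / (-u) * h
        k += 1
    return abs(s)

pv_vals = [pv_exp_integral(w) for w in (0.0, 1.0, 5.0, 25.0, 100.0)]
```

At $\omega = 0$ the value is exactly $\log(b/a) = \log 2$, and for all frequencies the values stay below a single constant, as the lemma asserts.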
4.3 Local norms

The basic procedure is the reduction of one Carleson integral to another, simpler, one. In order to bound the difference between them we shall need norms associated to pairs $\alpha \in \mathcal{P}$.

First we associate to every pair $\alpha \in \mathcal{P}$ a function $e_\alpha \in L^2(\mathbb{R})$ supported on the interval $I(\alpha)$:
$$e_\alpha(x) = \exp\Bigl(2\pi i\,\frac{n(\alpha)}{|\alpha|}\,x\Bigr)\chi_{I(\alpha)}(x) = e^{i\lambda(\alpha)x}\,\chi_{I(\alpha)}(x).$$
The function $e_\alpha$ is a localized wave train. It is localized in the time interval $I(\alpha)$ and has angular frequency $\lambda(\alpha) = 2\pi n(\alpha)/|I(\alpha)|$; the number $|\alpha| = |I(\alpha)|$ is the duration, and $n(\alpha)$ the total number of cycles, of the wave train. The functions $e_\alpha$ with $I(\alpha) = I$ fixed form an orthogonal system, and every function $f$ supported on $I$ can be developed in a series of these functions, convergent in $L^2(\mathbb{R})$:
$$f = \sum_{I(\alpha)=I} \frac{\langle f, e_\alpha\rangle}{|\alpha|}\,e_\alpha.$$
Observe that
$$\frac{\langle f, e_\alpha\rangle}{|\alpha|} = \frac{1}{|\alpha|}\int_{I(\alpha)} f(t)\,\exp\Bigl(-2\pi i\,\frac{n(\alpha)}{|\alpha|}\,t\Bigr)\,dt.$$
We define the local norm $\|f\|_\alpha$ as
$$\|f\|_\alpha = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\,\frac{1}{|I|}\,\Bigl|\int_I f(t)\,\exp\Bigl(-2\pi i\Bigl(n(\alpha)+\frac{j}{3}\Bigr)\frac{t}{|I|}\Bigr)\,dt\Bigr|.$$
Here $c$ is chosen so that $\sum_{j\in\mathbb{Z}} c/(1+j^2) = 1$; hence $\|f\|_\alpha$ is a mean value of absolute values of (generalized) Fourier coefficients of $f$. One motivation for this definition is that with this norm we can control the integrals $\int_I f(t)\varphi(t)\,dt$ when the function $\varphi$ is twice continuously differentiable on $I$. We shall show this in Theorem 4.3 below.

Proposition 4.2 Let $\varphi \in C^2[a,b]$, $\delta = b - a$. For every $x \in [a,b]$ we have
$$\varphi(x) = \sum_{n\in\mathbb{Z}} c_n\,\exp\Bigl(2\pi i\,\frac{n}{3\delta}\,x\Bigr),$$
where the coefficients satisfy $(1+n^2)|c_n| \le A\bigl(\|\varphi\|_\infty + \delta^2\|\varphi''\|_\infty\bigr)$ for every $n \in \mathbb{Z}$.

Proof. First assume that $\delta = 1$. We can extend $\varphi$ to a twice continuously differentiable function $\tilde\varphi$ of period $3$ defined on $\mathbb{R}$. We can assume that $\|\tilde\varphi\|_\infty$, $\|\tilde\varphi'\|_\infty$ and $\|\tilde\varphi''\|_\infty$ are bounded by $\|\varphi\|_\infty + \|\varphi''\|_\infty$. The Fourier series of $\tilde\varphi$ is
$$\tilde\varphi(x) = \sum_n c_n\,\exp\Bigl(2\pi i\,\frac{n}{3}\,x\Bigr), \quad\text{where}\quad c_n = \frac{1}{3}\int_0^3 \tilde\varphi(t)\,\exp\Bigl(-2\pi i\,\frac{n}{3}\,t\Bigr)\,dt = \int_0^1 \tilde\varphi(3t)\,e^{-2\pi i n t}\,dt.$$
Now, integration by parts leads, for $n \ne 0$, to
$$c_n = \frac{9}{(2\pi i n)^2}\int_0^1 \tilde\varphi''(3t)\,e^{-2\pi i n t}\,dt,$$
so that for some absolute constant $A$ and every $n \in \mathbb{Z}$ we get $(1+n^2)|c_n| \le A(\|\varphi\|_\infty + \|\varphi''\|_\infty)$. In the general case a change of scale leads to the inequality $(1+n^2)|c_n| \le A(\|\varphi\|_\infty + \delta^2\|\varphi''\|_\infty)$.

Theorem 4.3 Let $f \in L^1(I)$, $\varphi \in C^2(I)$ and $\alpha = (n,I) \in \mathcal{P}$. For some absolute constant $B$ we have
$$\frac{1}{|\alpha|}\Bigl|\int_I e^{2\pi i n(x-t)/|\alpha|}\,f(t)\,\varphi(t)\,dt\Bigr| \le B\bigl(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty\bigr)\,\|f\|_\alpha.$$

Proof. We apply Proposition 4.2 to $\varphi$ and obtain
4.3 Local norms
$$\varphi(x) = \sum_{j\in\mathbb{Z}} c_j\,\exp\Bigl(2\pi i\,\frac{j}{3}\,\frac{x}{|\alpha|}\Bigr),$$
where $(1+j^2)|c_j| \le A\bigl(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty\bigr)$. Hence we have
$$\frac{1}{|\alpha|}\Bigl|\int_I e^{2\pi i n(x-t)/|\alpha|}\,f(t)\,\varphi(t)\,dt\Bigr| \le \sum_{j\in\mathbb{Z}} |c_j|\,\frac{1}{|\alpha|}\Bigl|\int_I f(t)\,\exp\Bigl(-2\pi i\Bigl(n - \frac{j}{3}\Bigr)\frac{t}{|\alpha|}\Bigr)\,dt\Bigr| \le B\bigl(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty\bigr)\,\|f\|_\alpha.$$

We shall need a bound for $\|f\|_\alpha$ when $f$ is an exponential, and we give it here:

Proposition 4.4 There exists an absolute constant $C$ such that for every $\omega \in \mathbb{R}$ and $\alpha \in \mathcal{P}$ we have
$$\|e^{2\pi i\omega x}\|_\alpha \le 1 \quad\text{and}\quad \|e^{2\pi i\omega x}\|_\alpha \le \frac{C}{\bigl|\omega|\alpha| - n(\alpha)\bigr|}.$$
Proof. Since $|e^{2\pi i\omega x}| = 1$ and $\|f\|_\alpha$ is a mean value of the integrals
$$\frac{1}{|\alpha|}\Bigl|\int_{I(\alpha)} f(t)\,\exp\Bigl(-2\pi i\Bigl(n(\alpha)+\frac{j}{3}\Bigr)\frac{t}{|\alpha|}\Bigr)\,dt\Bigr|, \tag{4.3}$$
the first inequality is obvious. For the second inequality we calculate (4.3) and obtain
$$\frac{|e^{2\pi i(A-j/3)} - 1|}{2\pi|A - j/3|}, \quad\text{where } A = \omega|\alpha| - n(\alpha).$$
Hence we have
$$\|e^{2\pi i\omega x}\|_\alpha \le \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\,\frac{|e^{2\pi i(A-j/3)}-1|}{2\pi|A-j/3|} \le \frac{1}{2\pi}\sum_{|j|/3\le|A|/2} \frac{c}{1+j^2}\,\frac{4}{|A|} + \sum_{|j|/3>|A|/2} \frac{c}{1+j^2} \le \frac{2}{\pi|A|} + \frac{M}{|A|}$$
(in the first group of terms $|A-j/3| \ge |A|/2$, while in the second we use $|e^{i\theta}-1| \le |\theta|$ together with the decay of $(1+j^2)^{-1}$). Therefore
$$\|e^{2\pi i\omega x}\|_\alpha \le \frac{(2/\pi)+M}{|A|}.$$
For technical reasons it will be convenient to replace $A = \omega|\alpha| - n(\alpha)$ by $|\omega|\,|\alpha| - n(\alpha)$. Since $\|e^{2\pi i\omega x}\|_\alpha \le 1$ always, this presents no problem.
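The local norm of an exponential, and the two bounds of Proposition 4.4, can be checked numerically. A sketch of ours for $\alpha = (n, I)$ with $I = [0,1]$; the truncation of the sum over $j$ and the grid size are our own choices:

```python
import cmath, math

# ||f||_alpha: a (1+j^2)^{-1}-weighted mean of |generalized Fourier
# coefficients| at frequencies n + j/3, with the integrals approximated by
# midpoint sums.  For f(t) = e^{2 pi i omega t} the norm is <= 1, and small
# when omega is far from n.
def local_norm(f, n, jmax=40, m=2000):
    c = 1.0 / sum(1.0 / (1 + jj * jj) for jj in range(-jmax, jmax + 1))
    total = 0.0
    for jj in range(-jmax, jmax + 1):
        coef = sum(f((k + 0.5) / m)
                   * cmath.exp(-2j * math.pi * (n + jj / 3) * (k + 0.5) / m)
                   for k in range(m)) / m
        total += c / (1 + jj * jj) * abs(coef)
    return total

wave = lambda t: cmath.exp(2j * math.pi * 25 * t)
norm_resonant = local_norm(wave, 25)   # n(alpha) matches the frequency
norm_far = local_norm(wave, 0)         # n(alpha) far from the frequency
```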
Proposition 4.5 There exists an absolute constant $B > 0$ such that, for every interval $K$ and every $\omega \in \mathbb{R}$, there exists $k \in \mathbb{N}$ such that for $\kappa = (k,K)$ we have $\|e^{2\pi i\omega x}\|_\kappa \ge B$. We can choose $k = \lfloor |\omega|\cdot|K| \rfloor$.

Proof. As in the proof of Proposition 4.4 we obtain
$$\|e^{2\pi i\omega x}\|_\kappa = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\,\frac{|e^{2\pi i(A-j/3)}-1|}{2\pi|A-j/3|} \ge \frac{c}{1+j_0^2}\,\frac{|e^{2\pi i(A-j_0/3)}-1|}{2\pi|A-j_0/3|}.$$
Out of every three consecutive values of $j$, at least one satisfies $|e^{2\pi i(A-j/3)}-1| > 1$. Hence we can choose $j_0$ such that $|A - j_0/3| < 1$ and $|e^{2\pi i(A-j_0/3)}-1| > 1$. For such $j_0$ we have
$$\|e^{2\pi i\omega x}\|_\kappa \ge \frac{C}{1 + 9(1+|A|)^2}.$$
This is greater than $B$ if we choose $|A| < 1$. Since $A = |\omega|\cdot|K| - n(\kappa)$, this goal is achieved taking $k = n(\kappa) = \lfloor |\omega|\cdot|K| \rfloor$.

Note that for $\alpha = (n,I)$ and $f \in L^p(I)$ we have
$$\|f\|_\alpha \le \|f\|_{L^p(I)}. \tag{4.4}$$
Here and elsewhere, for $f \in L^p(I)$ we put
$$\|f\|_{L^p(I)} = \Bigl(\frac{1}{|I|}\int_I |f|^p\,dm\Bigr)^{1/p}, \qquad \|f\|_p = \Bigl(\int_I |f|^p\,dm\Bigr)^{1/p}.$$
Hence $\|f\|_{L^p(I)}$ always denotes the $L^p(I)$ norm with respect to normalized Lebesgue measure.
4.4 Dyadic partition

Given an interval $I = [a,b]$, the central point $c$ of $I$ divides it into two intervals of half length, which we denote by $I_0 = [a,c]$ and $I_1 = [c,b]$.

fig. 4.1 (the points $a$, $d$, $c$, $e$, $b$ of the interval $I$)

Analogously, given $I_1 = [c,b]$ we obtain the intervals $(I_1)_0 = I_{10} = [c,e]$ and $I_{11} = [e,b]$, where $e$ is the central point of the interval $I_1$.
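The indexing of dyadic subintervals by binary words is easy to implement exactly. A sketch of ours for $I = [0,1]$, using rational endpoints to avoid any floating-point ambiguity; it checks, for example, that the intervals contiguous to $I_{00101}$ are its brother $I_{00100}$ and $I_{00110}$:

```python
from fractions import Fraction

# I_u for a binary word u: each bit halves the interval, a '1' selecting the
# right half.  Two intervals are contiguous when they have the same length
# and exactly one common endpoint.
def interval(word):
    a, length = Fraction(0), Fraction(1)
    for bit in word:
        length /= 2
        if bit == "1":
            a += length
    return a, a + length

def contiguous(u, v):
    (a, b), (c, d) = interval(u), interval(v)
    return b - a == d - c and (b == c or d == a)

ok_brother = contiguous("00101", "00100")
ok_neighbor = contiguous("00101", "00110")
ok_far = contiguous("00101", "00111")
```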
4.4 Dyadic partition
The same process defines the intervals $I_u$ for every word $u \in \{0,1\}^*$. We call these the dyadic intervals generated from $I$. Every dyadic interval $I_u$ has two sons, $I_{u0}$ and $I_{u1}$, and every dyadic interval has a brother; for example, the brother of $I_{00101}$ is $I_{00100}$. But, in general, a dyadic interval has two contiguous intervals, where by contiguous we understand an interval of the same length with exactly one point in common. For example, it is easy to see that the contiguous intervals of $I_{00101}$ are its brother $I_{00100}$ and $I_{00110}$. We also speak of the grandsons of $I$: the four intervals $I_{00}$, $I_{01}$, $I_{10}$ and $I_{11}$.

We are dealing with the operator $C_I^*\colon L^p(I) \to L^p(I/2)$; $I$ always denotes this interval, and the dyadic intervals are those that can be written as $I_u$ with $u \in \{0,1\}^*$. It must be noticed, however, that $I$ is an arbitrary interval, so every concept defined for $I$ can be applied to any other interval; for example, we can speak of dyadic intervals with respect to $J$.

Smoothing intervals. In general the union of two contiguous dyadic intervals is not a dyadic interval; such a union we shall call a smoothing interval. They play a prominent role in the proof. All dyadic intervals are smoothing intervals, since each can be written as the union of its two sons, but there are smoothing intervals that are not dyadic: for example, the interval $[d,e] = I_{01} \cup I_{10}$ is a smoothing interval that is not dyadic. Given an interval $I$ we denote by $I/2$ its middle half, that is, $I/2 = I_{01} \cup I_{10}$. Given the interval $I$, we denote by $\mathcal{P}_I$ the set of pairs $(n,J)$ where $n \in \mathbb{Z}$ and $J$ is a smoothing interval with respect to $I$.

Dyadic points. The endpoints of the dyadic intervals (with respect to $I$) are called dyadic points, and the set of all dyadic points is denoted by $D$. $D$ is a countable set, hence of measure $0$.

Proposition 4.6 Given $I$ and $x \in I/2$ that is not a dyadic point, for every $n = 0, 1, 2, \ldots$ there is exactly one smoothing interval $I_n$ of length $|I|/2^n$ such that $x \in I_n/2$. We also have $I = I_0 \supset I_1 \supset I_2 \supset \cdots$.

Proof. It is clear that for $n = 0$ there is only one smoothing interval, $I$ itself. Now assume that there is only one smoothing interval $J = I_n = [a,b]$ with $x \in I_n/2$; then $x \in (d,e)$ (refer to figure 4.1). Since $x \notin D$, there is one and only one of the intervals $J_{010}$, $J_{011}$, $J_{100}$, $J_{101}$ that contains $x$. In each case we check that there is only one smoothing interval with the required conditions: $I_{n+1}$ will be, respectively, $[a,c]$, $[d,e]$, $[d,e]$, or $[c,b]$. We observe that in every case $I_{n+1} \subset I_n$.

Choosing $I(x)$. We shall consider partitions $\Pi$ of a smoothing interval $J$ in which every member of $\Pi$ is a dyadic interval generated from $I$, of
length $\le |J|/4$. We always consider closed intervals, and when we speak of partitions we disregard the endpoints of the intervals; we also call two intervals disjoint when their interiors are disjoint. We assume given a Carleson integral $C_\alpha f(x)$ and a dyadic partition $\Pi$ of $I = I(\alpha)$ in which every $J \in \Pi$ has length $|J| \le |I|/4$. $I(x)$ will be an interval, a union of some members of $\Pi$, such that $x \in I(x)/2$, so that the first term in decomposition (4.1) is almost a Carleson integral. We must also choose $I(x)$ so as to obtain a good bound for the other members of decomposition (4.1), for example
$$\int_J \frac{E_\Pi f(t)}{x-t}\,dt,$$
where $x \notin J \in \Pi$. Therefore we shall choose $I(x)$ so that $|x-t|$ is of the same order as $|J|$ for every $t \in J$. The next proposition guarantees that all these conditions can be attained.

Proposition 4.7 Let $x \in I/2$ and let $\Pi$ be a dyadic partition of the smoothing interval $I$, with intervals of length $\le |I|/4$. Then there exists a smoothing interval $I(x)$ such that:
(a) $x \in I(x)/2$.
(b) $|I(x)| \le |I|/2$.
(c) $I(x)$ is a union of intervals of $\Pi$.
(d) One of the two sons of $I(x)$ is a member of $\Pi$.
(e) For every $J \in \Pi$ such that $J \not\subset I(x)$ we have $d(x,J) \ge |J|/2$.
(f) Each smoothing interval $J$ with $I(x) \subset J \subset I$ and $x \in J/2$ is a union of intervals of the partition $\Pi$.

Proof. We consider the smoothing intervals $J \cup J'$, unions of an interval $J \in \Pi$ and a contiguous interval $J'$, such that $x \in (J \cup J')/2$. (For example, if $x \in J_0 \in \Pi$, we can choose $J_0'$ contiguous to $J_0$ such that $J_0 \cup J_0'$ satisfies these conditions.) Now let $I(x)$ be such an interval $J \cup J'$ of maximum length. Then (a) and (d) are satisfied by construction, and (b) follows from the hypothesis that every $J \in \Pi$ has length $\le |I|/4$.

(c) Let $I(x) = J \cup J'$ with $J \in \Pi$; $J$ and $J'$ are dyadic intervals.
fig. 4.2
Two dyadic intervals either have intersection of measure zero or one is contained in the other. Hence if J′ is not a union of intervals of Π, there must be an element H ∈ Π such that J′ ⊊ H. Since H and J are members of Π, they are essentially disjoint. Let H′ be the dyadic interval of the same length as H that contains J. As J and J′ are contiguous, H and H′ are also contiguous.
Then K = H ∪ H′ satisfies x ∈ K/2 and |K| > |I(x)|. This contradicts the definition of I(x).

(e) Let J ∈ Π be such that J ∩ I(x) is of measure zero. If d(x, J) < |J|/2, there exists J′ contiguous to J with x ∈ (J ∪ J′)/2. By definition this implies I(x) ⊃ J ∪ J′, which contradicts the hypothesis that J ∩ I(x) is of measure zero.

(f) Let J be such a smoothing interval and let K be a son of J. Two dyadic intervals are disjoint or one is contained in the other. Since K is dyadic, it follows that K is the union of the intervals L ∈ Π with L ⊂ K, unless there is L ∈ Π with L ⊋ K. The case L ⊋ K is impossible, because if L ⊋ K, since x ∈ J/2 we have

    d(x, L) ≤ d(x, K) ≤ |K|/2 ≤ (1/4)|L|.

Therefore there is L′ contiguous to L such that x ∈ (L ∪ L′)/2. Also |L ∪ L′| > 2|K| ≥ |I(x)|, which contradicts the definition of I(x).

The condition (e) says that every interval J ∈ Π with J ⊄ I(x) is contained in a security interval J′ of length 2|J| such that x ∉ J′. This will play a role in the bound (4.15).
4.5 Some definitions

The remainder terms in (4.1) will be bounded using two functions. The first is a modification of the maximal Hilbert transform.

Definition 4.8 Given f ∈ L¹(I), we define H*_I f(x), the maximal Hilbert dyadic transform on I, by

    H*_I f(x) = sup_K | p.v. ∫_K f(t)/(x−t) dt |,

where the supremum is over the intervals K = J ∪ J′ that are the union of two contiguous dyadic intervals such that x ∈ K/2.
The second function will be needed to bound the third term in (4.1).

Definition 4.9 Given a finite partition Π of I by intervals J_k of length δ_k and center t_k, we define the function

    Δ(Π, x) = Σ_k δ_k² / ((x − t_k)² + δ_k²).
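For concreteness, here is a small numerical sketch of Δ(Π, x) — my own illustration, not part of the book. Every summand δ_k²/((x − t_k)² + δ_k²) lies in (0, 1], and only the intervals J_k close to x contribute appreciably; for a partition of [0, 1] into equal dyadic intervals the sum stays of moderate size.

```python
# Sketch (not from the book): Delta(Pi, x) of Definition 4.9 for a
# partition of [0, 1] into 2**m equal dyadic intervals.

def delta_function(partition, x):
    """Delta(Pi, x); partition is a list of (center t_k, length delta_k)."""
    return sum(d * d / ((x - t) ** 2 + d * d) for t, d in partition)

def uniform_dyadic_partition(m):
    """Split [0, 1] into 2**m equal intervals; return (center, length) pairs."""
    d = 2.0 ** -m
    return [((k + 0.5) * d, d) for k in range(2 ** m)]

Pi = uniform_dyadic_partition(6)
x = 0.3
value = delta_function(Pi, x)
# every summand is at most 1, so Delta never exceeds the number of intervals
assert all(d * d / ((x - t) ** 2 + d * d) <= 1.0 for t, d in Pi)
assert 0.0 < value <= len(Pi)
```

Partitions mixing many different scales are the ones for which Δ can grow, which is what Theorem 5.1 below quantifies.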
In the future reasoning a pair α = (n, I) (with some other elements that now are of no consequence) will determine a dyadic partition Π_α of I(α). Hence we shall denote by Δ_α(x) the corresponding function Δ(Π_α, x). We shall also use H*_α f(x) to denote the maximal Hilbert dyadic transform H*_{I(α)} E_α f(x), where the function E_α f(x) denotes E(e^{iλ(α)(x−t)} f(t), Π_α), the expectation of the function e^{iλ(α)(x−t)} f(t) with respect to the partition Π_α. Hence if z ∈ J ∈ Π,

    E_α f(z) = |J|⁻¹ ∫_J e^{iλ(α)(x−t)} f(t) dt.

(In order not to be pedantic, the notations Δ_α and H*_α f do not mention the partition Π_α.) In the sequel we analyze every term of the decomposition (4.1). This will allow us to decide how to choose our partition Π_α.
4.6 Basic decomposition

The basic step in the proof of Carleson's Theorem is a decomposition of a Carleson integral C_α f(x) into three parts associated with a dyadic partition Π of I = I(α). We assume that every J ∈ Π has measure ≤ |I|/4; then Proposition 4.7 gives us an interval I(x) such that x ∈ I(x)/2. The decomposition is given by

    C_α f(x) = p.v. ∫_I e^{iλ(α)(x−t)}/(x−t) · f(t) dt
             = p.v. ∫_{I(x)} e^{iλ(α)(x−t)}/(x−t) · f(t) dt
               + ∫_{I∖I(x)} E_α f(t)/(x−t) dt
               + ∫_{I∖I(x)} (e^{iλ(α)(x−t)} f(t) − E_α f(t))/(x−t) dt.    (4.5)

We shall transform the first term into a Carleson integral that can be considered simpler than C_α f(x). The second and third terms can be bounded in terms of the functions H*_α f(x) and Δ_α(x) = Δ(Π_α, x). In the following sections we obtain the relevant bounds. These are collected together in Theorem 4.11, which is a preliminary version of the basic step. Then we choose a partition in order to optimize the bound of Theorem 4.11. With the selected partition Π we formulate a new form of the basic step in Theorem 4.13.

In what follows we will try to bound Carleson integrals C_α f(x). It must be understood that we put C_α f(x) = +∞ if the principal value is not defined.
With this convention it can be noticed that our bounds are also correct in this case.
4.7 The first term

In fact, the first integral in (4.5) is almost a Carleson integral. We need β = (m, I(x)) ∈ P such that the difference between C_β f(x) and the first integral is small. Now these are

    p.v. ∫_{I(x)} e^{iλ(α)(x−t)}/(x−t) · f(t) dt,    p.v. ∫_{I(x)} e^{iλ(β)(x−t)}/(x−t) · f(t) dt.

They will be equal if λ(α) = λ(β), that is,

    2π n/|I| = 2π m/|I(x)|.    (4.6)

In general it is not possible to choose m ∈ Z such that (4.6) holds. Hence we choose

    m = ⌊n |I(x)|/|I|⌋.    (4.7)

Now that we have chosen a convenient β = (m, I(x)) ∈ P, we must bound the difference

    p.v. ∫_{I(x)} e^{iλ(α)(x−t)}/(x−t) · f(t) dt − C_β f(x).
Here, for the first time, we want to change the frequency of a Carleson integral. These changes are governed by

Theorem 4.10 (Change of frequency) Let α, β be two pairs with I(α) = I(β) = J and x ∈ J/2. If |n(α) − n(β)| ≤ M with M > 1, then

    |C_β f(x) − C_α f(x)| ≤ B M³ ‖f‖_α,    (4.8)

where B is some absolute constant.

Remark. It is not necessary that n(β) ∈ Z.

Proof. We apply Theorem 4.3. First observe that

    |C_β f(x) − C_α f(x)| = | ∫_J (e^{i(λ(β)−λ(α))(x−t)} − 1)/(x−t) · e^{iλ(α)(x−t)} f(t) dt |.

Let λ(β) − λ(α) = L; then

    |C_β f(x) − C_α f(x)| = |L| |α| · (1/|J|) | ∫_J (e^{iL(x−t)} − 1)/(L(x−t)) · e^{iλ(α)(x−t)} f(t) dt |.

The function (e^{it} − 1)/t defined on R is in C^∞(R), and so it is easy to see that if φ(t) = (e^{iL(x−t)} − 1)/(L(x−t)), then ‖φ‖_∞ + |α|² ‖φ″‖_∞ ≤ C(1 + L²|α|²). Since |L| |α| = 2π|n(β) − n(α)|, Theorem 4.3 gives us

    |C_β f(x) − C_α f(x)| ≤ C |L| |α| (1 + |L|²|α|²) ‖f‖_α ≤ B M³ ‖f‖_α.

If we apply this to our case it follows that

    | p.v. ∫_{I(x)} e^{iλ(α)(x−t)}/(x−t) · f(t) dt − C_β f(x) | ≤ C ‖f‖_β.    (4.9)
4.8 Notation α/β

We have related the first term in (4.5) to a Carleson integral C_β f(x). The process to obtain β from α and I(x) will be used very often. Hence given α ∈ P and an interval J ⊂ I(α) we define α/J ∈ P by

    α/J = (m, J),  where m = ⌊n(α) |J|/|α|⌋.

We choose α/J so that e_{α/J} represents more or less the same musical note as e_α, but of duration J, as far as this is possible. Observe that by definition

    0 ≤ (λ(α) − λ(α/J)) |J| < 2π.    (4.10)

Also, given α and β ∈ P such that I(β) ⊂ I(α), we define α/β = α/I(β). For future reference we notice the following relation:

    λ(α) |β| / 2π = n(α/β) + h,  where 0 ≤ h < 1,    (4.11)

valid whenever α and β ∈ P are such that I(β) ⊂ I(α). This follows from the identity

    λ(α) |β| / 2π = n(α) |β| / |α|.
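The frequency gap (4.10) can be checked numerically. The sketch below is mine, not the book's, and assumes lengths of the form |I|/2^s as in the text: since λ(α) = 2π n(α)/|α|, the quantity (λ(α) − λ(α/J))|J| equals 2π times the fractional part of n(α)|J|/|α|.

```python
import math

# Sketch: for alpha = (n, I) the frequency is lambda = 2*pi*n/|I|, and
# alpha/J = (floor(n*|J|/|I|), J).  We verify the bound (4.10):
#   0 <= (lambda(alpha) - lambda(alpha/J)) * |J| < 2*pi.

def frequency(n, length):
    return 2.0 * math.pi * n / length

def restrict(n, len_I, len_J):
    """n(alpha/J) = floor(n * |J| / |I|)."""
    return math.floor(n * len_J / len_I)

for n in range(200):
    for s in range(5):                 # |J| = |I| / 2**s, with |I| = 1
        len_I, len_J = 1.0, 2.0 ** -s
        m = restrict(n, len_I, len_J)
        gap = (frequency(n, len_I) - frequency(m, len_J)) * len_J
        assert 0.0 <= gap < 2.0 * math.pi
```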
If we have I(α) ⊃ J ⊃ K and all are smoothing intervals, we have

    α/K = (α/J)/K.    (4.12)

In fact, we have

    n(α/K) = ⌊n |K|/|α|⌋,    n((α/J)/K) = ⌊n(α/J) |K|/|J|⌋ = ⌊ ⌊n |J|/|α|⌋ |K|/|J| ⌋.

All the lengths are of type |I|/2^s. Hence all we have to prove is

    ⌊ ⌊n/2^k⌋ / 2^l ⌋ = ⌊ n / 2^{k+l} ⌋.

This is easy if we think in binary. With our new notations, Proposition 4.5 says that there exists an absolute constant B > 0 such that

    ‖e^{iλ(δ)t}‖_{δ/K} ≥ B,  for every K ⊂ I(δ).
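The binary-digit argument for the floor identity can also be checked by brute force; this little script (mine, not the book's) verifies that dropping k low binary digits and then l more is the same as dropping k + l digits at once.

```python
# Verify floor(floor(n / 2**k) / 2**l) == floor(n / 2**(k+l)).
# In binary this is just the shift identity n >> k >> l == n >> (k + l).
for n in range(2048):
    for k in range(6):
        for l in range(6):
            assert (n >> k) >> l == n >> (k + l)
```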
4.9 The second term

This term is really easy to bound. We have

    ∫_{I∖I(x)} E_α f(t)/(x−t) dt = p.v. ∫_I E_α f(t)/(x−t) dt − p.v. ∫_{I(x)} E_α f(t)/(x−t) dt.

Now these two intervals I and I(x) are of the form used in Definition 4.8 of the maximal Hilbert dyadic transform. Hence we have

    | ∫_{I∖I(x)} E_α f(t)/(x−t) dt | ≤ 2 H*_α f(x).    (4.13)
4.10 The third term

We use here the fact that the numerator has a vanishing integral on every J ∈ Π. In this way we can change the first order singularity into a second order singularity, which is easier to handle. Observe that by Proposition 4.7, I ∖ I(x) is the union of some members of the partition Π. Hence we can write

    ∫_{I∖I(x)} (e^{iλ(α)(x−t)} f(t) − E_α f(t))/(x−t) dt = Σ_{J_k} ∫_{J_k} (e^{iλ(α)(x−t)} f(t) − E_α f(t))/(x−t) dt,

where we are summing over those J_k ∈ Π 'disjoint' from I(x). Let t_k be the center of J_k. We have
    1/(x−t_k) = 1/(x−t) − (t−t_k)/((x−t)(x−t_k)).

Now, the integral of e^{iλ(α)(x−t)} f(t) − E_α f(t) on every J_k is zero, so that

    ∫_{I∖I(x)} (e^{iλ(α)(x−t)} f(t) − E_α f(t))/(x−t) dt
      = Σ_{J_k} ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · e^{iλ(α)(x−t)} f(t) dt
        − Σ_{J_k} ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · E_α f(t) dt.    (4.14)

We want to reduce the third term to the function Δ_α(x). First consider the second part of (4.14). We recall that from Proposition 4.7 we know that d(x, J_k) ≥ δ_k/2, where δ_k = |J_k|, for every J_k that appears in (4.14). Hence if t ∈ J_k, since δ_k ≤ |x − t_k| it follows that

    |(x−t)(x−t_k)| ≥ |x−t_k|² − (δ_k/2)|x−t_k| ≥ (1/2)|x−t_k|² ≥ (1/4)|x−t_k|² + (1/4)δ_k².    (4.15)

By definition E_α f is constant on every J_k. Hence we have

    | ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · E_α f(t) dt | ≤ 2 |E_α f(t_k)| ∫_{J_k} δ_k/((x−t_k)² + δ_k²) dt
                                                    = 2 δ_k²/((x−t_k)² + δ_k²) · |E_α f(t_k)|.

Now, as every term in the definition 4.9 of Δ(Π, x) is positive, we conclude that

    Σ_{J_k} | ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · E_α f(t) dt | ≤ 2 Σ_{J_k ∈ Π} δ_k²/((x−t_k)² + δ_k²) · |E_α f(t_k)|.    (4.16)
We are now summing over all J_k ∈ Π_α. Now we bound the first term in (4.14). Here we use Theorem 4.3 again. The procedure is similar to the one we used to bound the first term. Let β_k = α/J_k. We have

    ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · e^{iλ(α)(x−t)} f(t) dt
      = (1/|J_k|) ∫_{J_k} [ |J_k|(t−t_k)/((x−t)(x−t_k)) · e^{i(λ(α)−λ(β_k))(x−t)} ] e^{iλ(β_k)(x−t)} f(t) dt.

Now we apply Theorem 4.3 to every integral. Applying (4.15),

    ‖ |J_k|(t−t_k)/((x−t)(x−t_k)) · e^{i(λ(α)−λ(β_k))(x−t)} ‖_∞ ≤ 2 δ_k²/((x−t_k)² + δ_k²).

The second derivative with respect to t of |J|(t−t_k) e^{iM(x−t)} / ((x−t)(x−t_k)) is

    − 2|J| iM e^{iM(x−t)} / ((x−t)(x−t_k)) + 2|J| e^{iM(x−t)} / ((x−t)²(x−t_k))
    − |J| M² (t−t_k) e^{iM(x−t)} / ((x−t)(x−t_k)) − 2|J| iM (t−t_k) e^{iM(x−t)} / ((x−t)²(x−t_k))
    + 2|J| (t−t_k) e^{iM(x−t)} / ((x−t)³(x−t_k)).

By (4.10), |J| M ≤ 2π; and by (4.15), with |J_k| = δ_k, we have |x−t| > δ_k/2 and |t−t_k| < δ_k/2. We obtain that δ_k² ‖φ″‖_∞ is bounded by

    δ_k² [8π + 32π + 32] / ((x−t_k)² + δ_k²).

All this gives us

    | Σ_{J_k} ∫_{J_k} (t−t_k)/((x−t)(x−t_k)) · e^{iλ(α)(x−t)} f(t) dt | ≤ C Σ_{J_k} δ_k²/((x−t_k)² + δ_k²) · ‖f‖_{β_k},    (4.17)

where C is an absolute constant. To compare (4.16) and (4.17) we observe that

    |E_α f(t_k)| ≤ C ‖f‖_{β_k}.    (4.18)

In fact, we have

    |E_α f(t_k)| = | (1/|J_k|) ∫_{J_k} e^{iλ(α)(x−t)} f(t) dt |
                 = | (1/|J_k|) ∫_{J_k} e^{i(λ(α)−λ(β_k))(x−t)} e^{iλ(β_k)(x−t)} f(t) dt |.

We are now in position to apply Theorem 4.3. Here we use (4.10) again and finally obtain |E_α f(t_k)| ≤ C ‖f‖_{β_k} for some absolute constant C. Therefore the third term in (4.5) is bounded by
    D Σ_{J_k} δ_k²/((x−t_k)² + δ_k²) · ‖f‖_{β_k}.

Now, as every summand in the definition of Δ_α(x) is positive, we can write

    |third term in (4.5)| ≤ D sup_{J_k} ‖f‖_{β_k} · Δ_α(x).    (4.19)
4.11 First form of the basic step

Theorem 4.11 Let C_α f(x) be a Carleson integral and Π = Π_α a dyadic partition of I = I(α). Assume that every J ∈ Π has measure ≤ |I|/4. Let I(x) be the interval defined in Proposition 4.7, and β = α/I(x). Then

    |C_α f(x) − C_β f(x)| ≤ C ‖f‖_β + 2 H*_α f(x) + D sup_{J_k} ‖f‖_{α/J_k} · Δ_α(x),    (4.20)

where C and D are absolute constants.

Observe that by (4.9) we have

    |C_α f(x) − C_β f(x)| ≤ C ‖f‖_β + Σ_k | ∫_{J_k} e^{iλ(α)(x−t)}/(x−t) · f(t) dt |.

We have introduced here the conditional expectation E_α f(x), and then we have replaced (x−t)⁻¹ by (t−t_k)(x−t)⁻¹(x−t_k)⁻¹. This is very convenient. For example, if we apply Theorem 4.3 directly we only obtain the bound ≤ C Σ_k ‖f‖_{α/J_k}.
4.12 Some comments about the proof

1. In what sense can we say that C_β f(x) is a 'simpler' Carleson integral? First, we can say that we pass from C_α f(x) to C_β f(x) where, if n(α) > 0, then n(β) ≤ n(α). Thus we can expect to obtain n(β) = 0. We can restrict the study to real functions because, with f real, we have

    C_{−α} f(x) = \overline{C_α f(x)},    (4.21)

if α = (n, I) and −α = (−n, I) with n ∈ N. Hence we can assume that n(α) is positive. Another case in which we can consider only values n(α) > 0 is when we study functions f with |f| = χ_A (for some measurable set A). In this case f̄ is of the same nature and
    C_{−α} f̄(x) = \overline{C_α f(x)}.

This is not all we have to say about question 1, but it is what we can say now.

2. We want to prove that the Carleson maximal operator is bounded from L^p(I) to L^p(I/2) (1 < p < +∞). But we can use the interpolation theorems to reduce the problem to proving the weak inequality

    m{C*_I f(x) > y} ≤ A_p ‖f‖_p^p / y^p.

Hence we would like, given f ∈ L^p(I) and given y > 0, to define E with m(E) < A_p ‖f‖_p^p / y^p and such that for every x ∈ I/2 ∖ E and α ∈ P with I(α) = I we have |C_α f(x)| < y. In fact, we will first construct, given f ∈ L^p(I), y > 0 and N ∈ N, a subset E_N with m(E_N) < A_p ‖f‖_p^p / y^p, such that for every x ∈ I/2 ∖ E_N and α ∈ P with I(α) = I and 0 ≤ n(α) < 2^N we will have |C_α f(x)| < y. Then

    {C*_I f > y} ⊂ ∪_N {x ∈ I/2 : |C_α f(x)| > y, 0 ≤ n(α) < 2^N, I(α) = I},

and since A_N = {x ∈ I/2 : |C_α f(x)| > y, 0 ≤ n(α) < 2^N, I(α) = I} is an increasing sequence of sets, we will have

    m({C*_I f > y}) = lim_N m(A_N) ≤ A_p ‖f‖_p^p / y^p.

Technical reasons will force us to replace 2^N by θ2^N (θ an absolute constant) in the above reasoning.

3. Another important point about the proof is that we shall use the basic step repeatedly. Therefore, given a Carleson integral C_α f(x) with I(α) = I, 0 ≤ n(α) < 2^N and x ∉ E_N, we shall obtain a sequence (α_j)_{j=1}^s in P with α_1 = α. Then we will have

    |C_α f(x)| ≤ Σ_{j=1}^{s−1} |C_{α_j} f(x) − C_{α_{j+1}} f(x)| + |C_{α_s} f(x)|.    (4.22)
We shall apply the basic step to every difference and will arrange things so that n(αs ) = 0.
4.13 Choosing the partition Π_α. The norm |f|_α

From now on we will consider a Carleson integral C_α f(x) where 0 ≤ n(α) < 2^N and I(α) = J, but we will not assume that J = I. If we want to apply the basic step to this integral, what selection of the partition Π will be good? The intervals of the partition Π will be dyadic with respect to J = I(α) and of length less than |J|/4. What we want, in view of (4.20), is to have control of ‖f‖_{α/J_k} for every J_k ∈ Π. Hence we put b_j = 2 · 2^{−2^j}, and we will assume that for the intervals J_{00}, J_{01}, J_{10}, and J_{11} we have

    ‖f‖_{α/J_{00}}, ‖f‖_{α/J_{01}}, ‖f‖_{α/J_{10}}, ‖f‖_{α/J_{11}} < y b_{j−1}.    (4.23)

We obtain the partition Π_α by a process of subdivision. We start with the four grandsons of J = I(α), of which we assume (4.23). Then at every stage of the process we take some interval K and we subdivide it into its two sons K_0 and K_1 if they satisfy the condition ‖f‖_{α/K_0}, ‖f‖_{α/K_1} < y b_{j−1}. If they do not, we consider K to be one of the intervals of the partition. Since this process can be infinite, we also stop the division if |K| ≤ |I|/2^N and consider such K to be in the partition. As we need to consider the condition (4.23), we define for every α = (n, J) ∈ P

    |f|_α = sup{ ‖f‖_{α/J_{00}}, ‖f‖_{α/J_{01}}, ‖f‖_{α/J_{10}}, ‖f‖_{α/J_{11}} }.    (4.24)

It is important to see that this construction and the definition of I(x) in Proposition 4.7 imply that either |I(x)| = 2|I|/2^N or |f|_β ≥ y b_{j−1}. Now we have another answer to question 1: if we start with y b_j ≤ |f|_α < y b_{j−1}, we arrive at |f|_β ≥ y b_{j−1}. We go from level j to a lesser level. It is true that a lesser level means here a greater norm |f|_β, but it also means a smaller number of cycles n(β) ≤ n(α), and we will arrange things so that we arrive at C_{α_s} f with n(α_s) = 0. We also have a good bound of the differences |C_{α_j} f(x) − C_{α_{j+1}} f(x)| in (4.22).

We will say a Carleson integral C_α f(x) is of level j ∈ N if y b_j ≤ |f|_α < y b_{j−1}.

In the construction that follows we assume given f ∈ L¹(I), a Carleson integral C_α f(x), a natural number N, and a real number y > 0; where α = (n, J) with 0 ≤ n < 2^N, and J, being the union of two dyadic intervals
with respect to I, has length |J| > 4|I|/2^N. We also assume that |f|_α < b_{j−1} y for some natural number j (not necessarily the level of C_α f(x)). Our objective is to select a convenient dyadic partition Π of J so that we can apply Theorem 4.11. We consider now the set of dyadic intervals J_u with respect to J such that |J|/4 ≥ |J_u| ≥ |I|/2^N. For example, in the first four rows of figure 4.3 we have represented these intervals.
fig. 4.3

For every one of these intervals we determine whether it satisfies the condition

    ‖f‖_{α/J_u} < y b_{j−1}.    (4.25)
(In the figure we have painted in black the intervals that, hypothetically, do not satisfy this condition.) Now the interval J_u is a member of the partition Π if either it is of length |J_u| = |I|/2^N and it, all its ancestors, and their brothers satisfy the condition (4.25) (as J_{10011} in the example of the figure), or it, all its ancestors, and their brothers satisfy the condition (4.25) but one of its sons does not satisfy this condition (as J_{011} in the example). (The fifth row of the figure is a representation of the partition Π in the case we are handling.)

Finally, observe that according to Proposition 4.7 the interval I(x) is always the union of one of the intervals of Π and a contiguous interval of the same length. Therefore |I(x)| = 2|I|/2^N or some of the four grandsons of I(x) will not satisfy the condition (4.25). (In the figure, assuming that x is in the interval J_{10001}, the interval I(x) is represented in the sixth row. In this case I(x) is not a dyadic interval.)

In our example, J_{10011} is a member of the partition Π, since its length is just |I|/2^N and it, its ancestors J_{1001}, J_{100}, J_{10}, and their brothers J_{10010}, J_{1000}, J_{101}, J_{11} satisfy the condition (4.25). Also J_{011} is a member of the partition since it, its ancestor J_{01}, and their brothers J_{010}, J_{00} satisfy the condition (4.25), but one of its sons does not, in this case J_{0111}.
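The subdivision process that produces Π_α can be sketched as follows. This is my own runnable pseudocode, not the book's: `local_norm(K)` is a stand-in for the norm ‖f‖_{α/K} (here simply supplied by the caller), the threshold plays the role of y b_{j−1}, and `min_length` plays the role of |I|/2^N.

```python
# Sketch of the subdivision producing Pi_alpha (assumptions: local_norm
# simulates ||f||_{alpha/K}; threshold = y*b_{j-1}; min_length = |I|/2**N).
# Intervals are (a, b) pairs; we start from the four grandsons of J and
# keep splitting while both sons have small norm.

def build_partition(J, local_norm, threshold, min_length):
    a, b = J
    quarter = (b - a) / 4.0
    stack = [(a + i * quarter, a + (i + 1) * quarter) for i in range(4)]
    partition = []
    while stack:
        a_k, b_k = stack.pop()
        if b_k - a_k <= min_length:
            partition.append((a_k, b_k))     # reached the minimal scale
            continue
        mid = (a_k + b_k) / 2.0
        sons = [(a_k, mid), (mid, b_k)]
        if all(local_norm(S) < threshold for S in sons):
            stack.extend(sons)               # both sons small: subdivide
        else:
            partition.append((a_k, b_k))     # a son has large norm: keep K
    return sorted(partition)

parts = build_partition((0.0, 1.0), lambda K: 0.0,
                        threshold=1.0, min_length=1.0 / 16)
assert len(parts) == 16                      # everything split to length 1/16
assert abs(sum(b - a for a, b in parts) - 1.0) < 1e-12
```

With a norm that is small everywhere the process splits all the way down to the minimal scale; a large norm on some son freezes its father, exactly as in the text.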
4.14 Basic theorem, second form

Now that we have a good selection of the partition Π_α we can give a better version of the basic step. We shall need the following comparison between the two norms ‖·‖_α and |·|_α.

Proposition 4.12 There is a constant C > 0 such that for every f ∈ L²(J) and α = (n, J),

    ‖f‖_α ≤ C |f|_α.

Proof. Let |J| = δ, and denote by K the grandsons of J. Then

    ‖f‖_α = Σ_j c/(1+j²) · (1/δ) | ∫_J f(t) exp(−2πi (n + j/3) t/δ) dt |.

Let δ′ = δ/4. We can write

    ‖f‖_α ≤ Σ_K Σ_j c/(1+j²) · (1/(4δ′)) | ∫_K f(t) exp(−2πi (n/4 + j/12) t/δ′) dt |,

where K runs over the grandsons of J. Let n = 4m + r, where r = 0, 1, 2 or 3, and put 4j + s instead of j, where s = 0, 1, 2 or 3. Then ‖f‖_α equals

    Σ_K Σ_s Σ_j c/(1+(4j+s)²) · (1/(4δ′)) | ∫_K f(t) exp(−2πi (m + j/3 + r/4 + s/12) t/δ′) dt |.

By Proposition 4.2 we have for t ∈ K

    exp(2πi (r/4 + s/12) t/δ′) = Σ_ℓ c_ℓ exp(2πi (ℓ/3) t/δ′),

where (1 + ℓ²)|c_ℓ| ≤ B (and where c_ℓ depends on K, r, and s). Now, writing F(m, k, K) = (1/δ′) | ∫_K f(t) exp(−2πi (m + k/3) t/δ′) dt |, we have ‖f‖_α bounded by

    Σ_K Σ_s Σ_{j,ℓ} c/(1+(4j+s)²) · |c_ℓ| · (1/(4δ′)) | ∫_K f(t) exp(−2πi (m + (j+ℓ)/3) t/δ′) dt |
      ≤ Σ_K Σ_s Σ_{j,ℓ} C/(1+j²) · |c_ℓ| · |F(m, j+ℓ, K)|
      ≤ C Σ_K Σ_{j,k} (1+k²)/((1+j²)(1+(k−j)²)) · |F(m, k, K)|/(1+k²).

Now observe that
    (1+k²)/((1+j²)(1+(k−j)²)) = [ (1+k²)/(1+j²) + (1+k²)/(1+(k−j)²) ] · 1/(2 + j² + (j−k)²)
                               ≤ 2 [ 1/(1+j²) + 1/(1+(k−j)²) ].

Hence we have

    ‖f‖_α ≤ 2C Σ_K Σ_{j,k} [ 1/(1+j²) + 1/(1+(k−j)²) ] · |F(m, k, K)|/(1+k²)
          ≤ D Σ_K Σ_k |F(m, k, K)|/(1+k²) ≤ D Σ_K ‖f‖_{α/K} ≤ 4D |f|_α.
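The elementary inequality that drives this last estimate can be checked numerically; the sketch below is mine, not the book's, and verifies the convolution-type bound over a range of integers.

```python
# Check (1 + k**2) / ((1 + j**2) * (1 + (k - j)**2))
#         <= 2 * (1/(1 + j**2) + 1/(1 + (k - j)**2))
# for integers j, k, as used in the proof of Proposition 4.12.
for j in range(-50, 51):
    for k in range(-50, 51):
        lhs = (1 + k * k) / ((1 + j * j) * (1 + (k - j) ** 2))
        rhs = 2.0 * (1.0 / (1 + j * j) + 1.0 / (1 + (k - j) ** 2))
        assert lhs <= rhs
```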
In the same way we can prove that

    ‖f‖_α ≤ C sup{ ‖f‖_{α/J_0}, ‖f‖_{α/J_1} },  J = I(α).    (4.26)

Now we can formulate the basic step with all its ingredients.

Theorem 4.13 (Basic Step) Let ξ ∈ P_I and x ∈ I(ξ)/2, and assume that |f|_ξ < y b_{j−1}. Given N ∈ N, let Π_ξ and I(x) be the corresponding partition of I(ξ) and the interval defined in Proposition 4.7. Let J be a smoothing interval such that I(x) ⊂ J ⊂ I(ξ) and x ∈ J/2. Assume that |ξ| ≥ 4|I|/2^N. Then we have

    | C_{ξ/J} f(x) − C_{ξ/I(x)} f(x) | ≤ C y b_{j−1} + 2 H*_ξ f(x) + D y b_{j−1} Δ_ξ(x),    (4.27)

where C and D are absolute constants.

Proof. The condition |ξ| ≥ 4|I|/2^N assures us that we can apply the procedure to obtain Π_ξ. We have seen that the selection of I(x) implies that J is a union of some members of Π_ξ, and by (4.26), ‖f‖_{ξ/J} ≤ C y b_{j−1}. The same reasoning gives ‖f‖_{ξ/I(x)} ≤ C y b_{j−1}. Now observe that

    C_{ξ/J} f(x) = p.v. ∫_J e^{2πi n(ξ/J)(x−t)/|J|} / (x−t) · f(t) dt,

where n(ξ/J) = ⌊n(ξ) |J|/|I(ξ)|⌋. By a change of frequency, we have

    | C_{ξ/J} f(x) − p.v. ∫_J e^{2πi n(ξ)(x−t)/|I(ξ)|} / (x−t) · f(t) dt | ≤ B ‖f‖_{ξ/J} ≤ C y b_{j−1}.
In spite of the possibility that J ≠ I(ξ), we can use the partition Π_ξ, as in the basic decomposition (4.1), for the integral

    p.v. ∫_J e^{2πi n(ξ)(x−t)/|I(ξ)|} / (x−t) · f(t) dt.    (4.28)

We obtain a representation of the integral in (4.28) as

    p.v. ∫_{I(x)} e^{iλ(ξ)(x−t)} / (x−t) · f(t) dt + ∫_{J∖I(x)} E_ξ f(t)/(x−t) dt
      + ∫_{J∖I(x)} (e^{iλ(ξ)(x−t)} f(t) − E_ξ f(t)) / (x−t) dt.

For the first term we obtain, as in (4.9) and by a change of frequency,

    | First term − C_{ξ/I(x)} f(x) | ≤ C ‖f‖_{ξ/I(x)} ≤ C y b_{j−1}.

The second can be bounded as in (4.13) by

    | Second term | ≤ 2 H*_ξ f(x).

For the third term we must use the fact that J ∖ I(x) can be written as a union of intervals from Π_ξ. Then we proceed as in Theorem 4.11. In this case we obtain a sum of some of the terms of Δ_ξ(x) instead of all the terms. Since these terms are all positive, this sum is less than Δ_ξ(x). Finally we obtain, as in (4.19),

    | Third term | ≤ D sup_{J_k} ‖f‖_{β_k} Δ_ξ(x) ≤ D y b_{j−1} Δ_ξ(x).
This finishes the proof.
5. Maximal Inequalities
In this chapter we give two inequalities to bound the two terms Δ_ξ(x) and H*_ξ f(x) that arise in the basic step.
5.1 Maximal inequality for Δ(Π, x)

Theorem 5.1 There are some absolute constants A and B > 0 such that for every finite partition Π of the interval J ⊂ R by intervals, we have

    m{x ∈ J : Δ(Π, x) > y} / |J| ≤ A e^{−By}.    (5.1)

Proof. First recall that if the intervals J_k of Π have center t_k and length δ_k, we have defined the function Δ(Π, x) as

    Δ(Π, x) = Σ_k δ_k² / ((x − t_k)² + δ_k²).
Let g: R → [0, +∞) be a bounded and measurable function. We can define a harmonic function on the upper half-plane by convolution with the Poisson kernel:

    u(x, y) = (1/π) ∫_R y/((x−t)² + y²) · g(t) dt = P_y ∗ g(x).

Hence we have

    ∫ Δ(Π, t) g(t) dt = Σ_k π δ_k u(t_k, δ_k) = Σ_k π δ_k P_{δ_k} ∗ g(t_k).

If we assume g to be positive, Lemma 5.2 gives

    P_{δ_k} ∗ g(t_k) ≤ (2/δ_k) ∫_{J_k} P_{δ_k} ∗ g(t) dt.

Therefore

    ∫ Δ(Π, t) g(t) dt ≤ 2π Σ_k ∫_{J_k} P_{δ_k} ∗ g(t) dt.

J.A. de Reyna: LNM 1785, pp. 73–76, 2002. © Springer-Verlag Berlin Heidelberg 2002
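The computation behind the first identity — each summand of Δ integrates against g to π δ_k P_{δ_k} ∗ g(t_k) — reduces, for g ≡ 1, to ∫_R δ²/((t − t₀)² + δ²) dt = πδ. A quick quadrature check (my sketch, not part of the book):

```python
import math

# For g = 1 the identity reduces to
#   integral over R of delta**2 / ((t - t0)**2 + delta**2) dt = pi * delta.
# We approximate the integral by a midpoint Riemann sum on a wide window.

def summand_integral(t0, delta, half_width=1000.0, n=200_000):
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        t = t0 - half_width + (i + 0.5) * step
        total += delta * delta / ((t - t0) ** 2 + delta * delta) * step
    return total

approx = summand_integral(t0=0.3, delta=0.25)
assert abs(approx - math.pi * 0.25) < 1e-3
```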
By the general inequality for the Hardy–Littlewood maximal function (cf. Theorem 1.6), we get P_{δ_k} ∗ g(t) ≤ Mg(t). Hence

    ∫_J Δ(Π, t) g(t) dt ≤ 2π ∫ Mg(t) dt.

Now, we have seen (cf. Proposition 1.5) that

    ∫_B Mf(x) dx ≤ m(B) + 2c₁ ∫ |f(x)| log⁺ |f(x)| dx;

hence

    ∫ Δ(Π, t) g(t) dt ≤ c|J| + c ∫_R |g(t)| log⁺ |g(t)| dt.

Now we put g(t) = e^{y/2c} χ_{{Δ(Π,·)>y}}(t) and obtain

    y e^{y/2c} m{t : Δ(Π, t) > y} ≤ ∫ Δ(Π, t) g(t) dt ≤ c|J| + c e^{y/2c} (y/2c) m{t : Δ(Π, t) > y}.

Hence

    m{t : Δ(Π, t) > y} / |J| ≤ (2c/y) e^{−y/2c}.
Since the left member is less than or equal to 1, we obtain the desired bound.

We prove now the following lemma:

Lemma 5.2 Let g ∈ L^∞(R) be a positive function and u(x, y) = P_y ∗ g(x). Then for every interval J ⊂ R with center a and length y we have

    u(a, y) ≤ (2/|J|) ∫_J u(x, y) dx.

Proof. Without loss of generality we can assume that a = 0. So we want to prove

    (1/π) ∫_R y/(t² + y²) · g(−t) dt ≤ (2/|J|) ∫_J (1/π) ∫_R y/(t² + y²) · g(x − t) dt dx
                                     = (2/|J|) ∫_J (1/π) ∫_R y/((x + t)² + y²) · g(−t) dt dx.

But this follows from

    y/(t² + y²) ≤ (2/|J|) ∫_J y/((x + t)² + y²) dx = (2/y) ∫_{−y/2}^{y/2} y/((x + t)² + y²) dx.    (5.2)

Changing the variable we see that (5.2) is equivalent to

    1/(u² + 1) ≤ 2 ∫_{−1/2}^{1/2} dx/((x + u)² + 1).

And we see that for every ξ with |ξ| < 1/2 we have

    1/(u² + 1) ≤ 2/((ξ + u)² + 1).
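The final inequality of the lemma can also be checked in closed form, since the integral of dx/((x + u)² + 1) over (−1/2, 1/2) is arctan(u + 1/2) − arctan(u − 1/2). A numeric sanity check (mine, not the book's):

```python
import math

# Check 1/(u**2 + 1) <= 2 * (atan(u + 1/2) - atan(u - 1/2)),
# the closed form of the last inequality in the proof of Lemma 5.2.
for i in range(-1000, 1001):
    u = i / 50.0                       # u ranges over [-20, 20]
    lhs = 1.0 / (u * u + 1.0)
    rhs = 2.0 * (math.atan(u + 0.5) - math.atan(u - 0.5))
    assert lhs <= rhs
```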
5.2 Maximal inequality for H*_I f

In order to prove an inequality of the same type as (5.1) for H*_I f, we first relate this maximal function to the ordinary maximal Hilbert transform H*f and the maximal function of Hardy and Littlewood.

Proposition 5.3 Let 1 ≤ p < +∞, f ∈ L^p(R), and let I ⊂ R be a bounded interval. Then for every x ∈ I we have

    H*_I f(x) ≤ 2 H*f(x) + 6 Mf(x).    (5.3)

Proof. Observe that in the definition of H*_I f(x) we consider only the restriction of f to the interval I. Let K ⊂ I be such that x ∈ K/2. Let J ⊂ K be an interval with center x and of maximal length, and L ⊃ K an interval with center x but of minimal length. We can write

    | p.v. ∫_K f(t)/(x−t) dt | ≤ | p.v. ∫_J f(t)/(x−t) dt | + | ∫_{K∖J} f(t)/(x−t) dt |.

The first term is equal to | Hf(x) − p.v. ∫_{R∖J} f(t)/(x−t) dt | and consequently is bounded by 2 H*f(x). To bound the second term, observe that if t ∈ K ∖ J, then |x − t| ≥ d(x, R ∖ K) ≥ |K|/4. Hence

    | ∫_{K∖J} f(t)/(x−t) dt | ≤ (4/|K|) ∫_{K∖J} |f(t)| dt ≤ (4|L|/|K|) · (1/|L|) ∫_L |f(t)| dt.

From x ∈ K/2 and the definition of L it follows that |L| ≤ 6|K|/4. Hence we have obtained (5.3).

Theorem 5.4 There are absolute constants A and B > 0 such that for every f ∈ L^∞(I) and every y > 0,

    m{H*_I f(x) > y} / |I| ≤ A e^{−By/‖f‖_∞}.    (5.4)
Proof. By homogeneity we can assume that ‖f‖_∞ = 1. We shall also consider f as the restriction to I of a function vanishing on R ∖ I. By Proposition 5.3, since ‖Mf‖_∞ ≤ 1, we have {H*_I f(x) > y} ⊂ {2 H*f(x) > y/2} if y > 12. Now we want to bound

    ∫_I e^{A H*f(t)} dt.

We have seen that the maximal Hilbert transform satisfies

    ‖H*f‖_p ≤ B p ‖f‖_p,  2 < p < +∞.

Hence

    ∫_I ( e^{A H*f(t)} − A H*f(t) ) dt ≤ |I| + Σ_{n=2}^{+∞} (Aⁿ/n!) Bⁿ nⁿ |I|.

If we choose A small enough (A only depending on B), we arrive at

    ∫_I ( e^{A H*f(t)} − A H*f(t) ) dt ≤ C|I|.

It follows that

    m{H*f(x) > y/4} ( e^{Ay/4} − Ay/4 ) ≤ C|I|.

Therefore, if y > y₀ (y₀ depending only on A), we have

    (1/2) e^{Ay/4} m{H*_I f(x) > y} ≤ C|I|.

Hence we have proved (5.4) for y > y₀. Now, changing A, we can forget the restriction on y.
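The series Σ_{n≥2} (AB)ⁿ nⁿ/n! converges precisely when ABe < 1, since nⁿ/n! grows like eⁿ/√(2πn) by Stirling's formula; that is what "A small enough" means here. A numeric sketch (mine; the value of B below is a made-up stand-in for the constant of the L^p bound):

```python
import math

# Sketch: sum_{n>=2} (A*B)**n * n**n / n! converges when A*B*e < 1.
# B is a hypothetical value for the constant in ||H* f||_p <= B p ||f||_p.

def term(A, B, n):
    # computed in log space so that n**n and n! never overflow
    return math.exp(n * math.log(A * B) + n * math.log(n) - math.lgamma(n + 1))

def partial_sum(A, B, terms):
    return sum(term(A, B, n) for n in range(2, terms + 2))

B = 4.0
A = 1.0 / (2.0 * math.e * B)          # makes A*B*e = 1/2 < 1
# the tail past 200 terms is negligible, i.e. the series converges
assert abs(partial_sum(A, B, 400) - partial_sum(A, B, 200)) < 1e-12
```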
6. Growth of Partial Sums
6.1 Introduction

As a first indication that the basic step is powerful, we give a theorem about the partial sums of the Fourier series of a function f ∈ L²[0, 2π]. This can be regarded as a toy example of the techniques involved in Carleson's theorem. This example will also justify our procedure in the proof of Carleson's Theorem. The principal result in this chapter will be that if f ∈ L²[0, 2π], then S_n(f, x) = o(log log n) almost everywhere.

As we have seen, to bound S_n(f, x) we must bound the corresponding Carleson integral. Hence our first objective will be to bound sup_α |C_α f(x)|, where the supremum is taken over all pairs α with I(α) = I and |n(α)| < θ2^N. Each time we apply the basic step we lessen the values of n(α) and |α|; the role of θ is to assure that we arrive at n(α) = 0 before we arrive at |α| < 4|I|/2^N. In this chapter θ = 1/4, but later, in the proof of the Carleson Theorem, we shall need another value of θ. We will consider that f is real, so that by (4.21) we can assume 0 ≤ n(α) < θ2^N.

Given ε > 0, we shall construct a set E = ∪_{N=2}^∞ E_N ⊂ I with m(E) < Aε and such that for every x ∈ I/2 ∖ E we control sup_{0≤n(α)<θ2^N} |C_α f(x)|. The construction depends on f, N, and a number y > 0; later we will determine y big enough so that m(E) < Aε.

6.3 The exceptional set

The first component of the exceptional set is S* = ∪_J 7J, where the union is taken over all the dyadic intervals J such that

    (1/|J|) ∫_J |f|² dm ≥ y².    (6.1)
The definition of S* is given so that: for every α ∈ P_I such that x ∈ I(α)/2 but x ∉ S*, we have |f|_α ≤ y. In fact, for every grandson J of I(α), we have

    (1/|J|) ∫_J |f|² dm < y².

(In the other case x ∈ I(α) ⊂ 7J ⊂ S*.) Therefore for every grandson J of I(α) we have ‖f‖_{α/J} < y. Hence |f|_α < y.

Two dyadic intervals are disjoint or one is contained in the other. It follows that there exists a sequence (J_n) of disjoint dyadic intervals such that every J_n satisfies (6.1), and every other J that satisfies (6.1) is contained in one of the J_n. Hence

    m(S*) = m( ∪_n 7J_n ) ≤ 7 Σ_n m(J_n) ≤ (7/y²) ∫_I |f|² dm ≤ (7/y²) ‖f‖₂².

Let α ∈ C be such that I(α) ⊄ S*; then |f|_α < y. Hence |f|_α = 0 or there exists j ∈ N such that y b_j ≤ |f|_α < y b_{j−1}. With α, j, y and N we obtain a partition Π_α of I(α) as in Section 4.13. Now, for such α, we define T_N(α) = ∅ if |f|_α = 0 or, in the other case,

    T_N(α) = { x ∈ I(α) : H*_α f(x) > M y b_{j−1} log(√N/b_j) },    (6.2)

where M > 0 and y > 0 will be determined later. Observe also that √N > b_j always. The norm ‖E_α f‖_∞ has been bounded in (4.18) in terms of the local norms of f on the intervals of the partition; hence we have the bound ‖E_α f‖_∞ ≤ C sup_k ‖f‖_{β_k} < C y b_{j−1}. Therefore, by the maximal inequality obtained in (5.4) for the maximal Hilbert dyadic transform,

    m(T_N(α)) ≤ A|α| exp( −(BM/C) log(√N/b_j) ) ≤ A|α| b_j⁶/N³ ≤ A|α| |f|_α² / (y² N³),

if we choose M in such a way that BM/C = 6. Let T_N be the union of the sets T_N(α). To bound m(T_N) we need the following:

Lemma 6.1 For every f ∈ L²(J),

    Σ_{n∈Z} |f|²_{α_n} ≤ 16 · (1/|J|) ∫_J |f|² dm,

where α_n = (n, J) for every n ∈ Z.
Now we have

    m(T_N) = Σ m(T_N(α)) ≤ Σ_J Σ_{α=(n,J)} A|J| y⁻² N⁻³ |f|²_{α_n} ≤ 16A y⁻² N⁻³ Σ_J ∫_J |f|² dm.

Here J can be any smoothing interval of I whose length is ≥ 4|I|/2^N. Summing first over all J with the same length |I|/2^r,

    m(T_N) ≤ 16A y⁻² N⁻³ Σ_{r=0}^{N−2} 2 ∫_I |f|² dm ≤ 32A ‖f‖₂² / (y² N²).    (6.3)
The set U_N will also be the union of U_N(α), for the same α's. Put

    U_N(α) = { x ∈ I(α) : Δ_α(x) > (M/C) log(√N/b_j) },

where M, C and y are the same constants that appear in (6.2). By the corresponding maximal inequality of Theorem 5.1 we obtain

    m(U_N) = Σ m(U_N(α)) ≤ Σ_J Σ_n A|J| y⁻² N⁻³ |f|²_{α_n}.

The same reasoning we used before proves the inequality

    m(U_N) ≤ 32A ‖f‖₂² / (y² N²).

Finally, the last component V will be the set

    V = { x ∈ I : H*_I f(x) > y }.

By the relation, proved in Proposition 5.3, between the maximal Hilbert dyadic transform, the maximal Hilbert transform and the maximal Hardy–Littlewood function, the function H*f is of weak type (2, 2). Therefore we have

    m(V) ≤ C ‖f‖₂² / y².

Observe that given f ∈ L²(I), N ∈ N, and y > 0, we have constructed a measurable set E_N = S* ∪ T_N ∪ U_N ∪ V ⊂ I such that for E = ∪_{N=2}^∞ E_N we have

    m(E) ≤ m(S*) + m(V) + Σ_{N=2}^∞ m(T_N ∪ U_N) ≤ A ‖f‖₂² / y²,

where A denotes an absolute constant.

Proof of Lemma 6.1. First, we prove the inequality
    Σ_{n∈Z} ‖f‖²_{α_n} ≤ (1/|J|) ∫_J |f|² dm.

There exists (x_n)_{n∈Z} ∈ ℓ²(Z) with Σ_n |x_n|² = 1 such that

    ( Σ_{n∈Z} ‖f‖²_{α_n} )^{1/2} = Σ_n x_n ‖f‖_{α_n} = Σ_{j,n} x_n · c/(1+j²) · (1/|J|) | ∫_J f(t) exp(−2πi (n + j/3) t/|J|) dt |.

Hence, if we denote by y_{3n+j} the quantity (1/|J|) | ∫_J f(t) exp(−2πi (n + j/3) t/|J|) dt |, we have

    Σ_{n∈Z} |y_{3n}|² = Σ_{n∈Z} |y_{3n+1}|² = Σ_{n∈Z} |y_{3n+2}|² = |J|⁻¹ ∫ |f(t)|² dt.

Therefore

    ( Σ_{n∈Z} ‖f‖²_{α_n} )^{1/2} = c Σ_j (1/(1+j²)) Σ_n x_n y_{3n+j}
      ≤ ( |J|⁻¹ ∫_J |f|² dm )^{1/2} Σ_j c/(1+j²) = ( |J|⁻¹ ∫_J |f|² dm )^{1/2}.

Now we denote by K the grandsons of J and we have

    Σ_n |f|²_{α_n} ≤ Σ_K Σ_n ‖f‖²_{(⌊n/4⌋, K)} ≤ 4 Σ_K Σ_m ‖f‖²_{(m, K)}
      ≤ 4 Σ_K (1/|K|) ∫_K |f|² dm ≤ (16/|J|) Σ_K ∫_K |f|² dm = (16/|J|) ∫_J |f|² dm.
6.4 Bound for the partial sums

Proposition 6.2 Let f ∈ L²(I) and ε > 0 be given. There exists a set E of measure m(E) < Aε such that for every N > 2 and x ∈ I/2 ∖ E,

    sup_{0≤k<θ2^N} |C_{(k,I)} f(x)| ≤ B (‖f‖₂/√ε) log N.    (6.4)

Proof. Choosing y = ‖f‖₂/√ε, we have constructed in the previous section a set E such that m(E) < Aε. Consider the set C′ of those pairs α such that I(α) is a smoothing interval of length |α| > 4|I|/2^N and 0 ≤ n(α)|I|/|α| < θ2^N. (This is a subset of the set of pairs C that we have used in Section 6.1.) The set C′ is defined so that for every pair α = (k, I) with 0 ≤ k < θ2^N and every smoothing interval J ⊂ I of length > 4|I|/2^N we have α/J ∈ C′.
Now choose x ∈ I/2 ∖ E. For every Carleson integral C_α f(x) appearing in (6.4) we have α ∈ C′. Assume that we have a Carleson integral C_α f(x) with α ∈ C′. Since x ∉ S* and x ∈ I(α)/2, we have |f|_α < y. Hence there is a well defined level j ∈ N such that y b_j ≤ |f|_α < y b_{j−1}. (Or f = 0 a.e. on I(α) and there is nothing to prove.) We are in position to apply the procedure of Section 4.13 to obtain a dyadic partition Π_α of I(α) with intervals J of length |I|/2^N ≤ |J| ≤ |I|/4. Then Proposition 4.7 gives us a smoothing interval I(x). This interval I(x) is the union of an interval J₀ of Π_α and a contiguous interval. If |J₀| = |I|/2^N, then |I(x)| = 2|I|/2^N. Otherwise there is a son K of J₀ (hence a grandson of I(x)) such that ‖f‖_{α/K} ≥ y b_{j−1}. Therefore we have |I(x)| = 2|I|/2^N or |f|_{α/I(x)} ≥ y b_{j−1}.

Put β = α/I(x). We are going to prove that we have a good bound for |C_α f(x) − C_β f(x)|. Also, either β ∈ C′ or we have a good bound for C_β f(x). When |I(x)| ≤ 4|I|/2^N we have n(β) = 0. In fact,

    n(β) = ⌊n(α) |I(x)|/|α|⌋,  and  n(α) |I(x)|/|α| < θ2^N · |I(x)|/|I| ≤ θ2^N · (4/2^N) = 1.

Since x ∉ V we will have |C_β f(x)| ≤ y, a bound that is good enough for us. In the second case, |I(x)| = |I(β)| > 4|I|/2^N and

    n(β) |I|/|β| = ⌊n(α) |β|/|α|⌋ · |I|/|β| ≤ n(α) |I|/|α| < θ2^N.

Therefore β ∈ C′. Also, by Proposition 4.7, x ∈ I(β)/2. If |f|_β = 0, we obtain C_β f(x) = 0, and so we have a good bound of C_β f(x). Otherwise there exists k ∈ N such that y b_k ≤ |f|_β < y b_{k−1}. The construction of Π_α and the fact that |f|_β ≥ y b_{j−1} imply that j > k. The basic step gives us in all cases (taking J = I(α))

    |C_α f(x) − C_β f(x)| ≤ C y b_{j−1} + 2 H*_α f(x) + D y b_{j−1} Δ_α(x).

Now since x ∉ E_N this is

    ≤ C y b_{j−1} + 2M y b_{j−1} log(√N/b_j) + D y b_{j−1} (M/C) log(√N/b_j) ≤ C y b_{j−1} log(√N/b_j).

Now either C_β f(x) = 0, or n(β) = 0, or we are in position to apply the same procedure with β instead of α. In the last case we also have a level 1 ≤ k < j, so that in a finite number of steps we must arrive at n(β) = 0 or C_β f(x) = 0. Then we obtain

    |C_α f(x)| ≤ y + Σ_{j∈N} C y b_{j−1} log(√N/b_j) ≤ B (log N) y.
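The convergence of this last sum rests on the doubly exponential decay of b_j = 2 · 2^{−2^j}. A small numerical sketch (mine, not the book's; terms past j = 8 are astronomically small and are omitted) confirms that Σ_j b_{j−1} log(√N/b_j) stays below a fixed multiple of log N:

```python
import math

# b_j = 2 * 2**(-2**j) decays doubly exponentially, so
#   sum_j b_{j-1} * log(sqrt(N)/b_j)
# is at most a constant multiple of log N.  We check a few values of N.

def b(j):
    return 2.0 * 2.0 ** -(2.0 ** j)

def level_sum(N, jmax=8):
    return sum(b(j - 1) * math.log(math.sqrt(N) / b(j))
               for j in range(1, jmax + 1))

for N in (4, 100, 10 ** 6, 10 ** 12):
    assert level_sum(N) <= 5.0 * math.log(N)
```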
Proposition 6.3 Let f ∈ L²(I). Then

    sup_{0≤|k|≤n} |C_{(k,I)} f(x)| = o(log log n),  a.e. on I/2.