Gerrit van Dijk
Distribution Theory
Convolution, Fourier Transform, and Laplace Transform
De Gruyter Graduate Lectures

Mathematics Subject Classification 2010: 35D30, 42A85, 42B37, 46A04, 46F05, 46F10, 46F12

Author: Prof. Dr. Gerrit van Dijk, Universiteit Leiden, Mathematisch Instituut, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands. E-mail: [email protected]

ISBN 978-3-11-029591-7, e-ISBN 978-3-11-029851-2

Library of Congress Cataloging-in-Publication Data: A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek: the Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2013 Walter de Gruyter GmbH, Berlin/Boston
Cover image: Fourier transformation of a sawtooth wave. © TieR0815
Typesetting: le-tex publishing services GmbH, Leipzig
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper. Printed in Germany.

www.degruyter.com
Preface

The mathematical concept of a distribution originates from physics. It was first used by O. Heaviside, a British engineer, in his theory of symbolic calculus, and then by P. A. M. Dirac around 1920 in his research on quantum mechanics, in which he introduced the delta function (or delta distribution). The foundations of the mathematical theory of distributions were laid by S. L. Sobolev in 1936, while in the 1950s L. Schwartz gave a systematic account of the theory. The theory of distributions has numerous applications and is extensively used in mathematics, physics and engineering. In the early stages of the theory one used the term generalized function rather than distribution, as is still reflected in the term delta function and in the title of some textbooks on the subject.

This book is intended as an introduction to distribution theory, as developed by Laurent Schwartz. It is aimed at an audience of advanced undergraduates or beginning graduate students. It is based on lectures I have given at Utrecht and Leiden University. Student input has strongly influenced the writing, and I hope that this book will help students to share my enthusiasm for the beautiful topics discussed. Starting with the elementary theory of distributions, I proceed to convolution products of distributions, Fourier and Laplace transforms, tempered distributions, summable distributions and applications. The theory is illustrated by several examples, mostly beginning with the case of the real line and then followed by examples in higher dimensions. In our view this is a justified and practical approach: it helps the reader to become familiar with the subject. A moderate number of exercises are added, with hints to their solutions. There is relatively little expository literature on distribution theory compared to other topics in mathematics, but there is a standard reference [10], and also [6]. I have mainly drawn on [9] and [5].
The main prerequisites for the book are elementary real, complex and functional analysis and Lebesgue integration. In the later chapters we shall assume familiarity with some more advanced measure theory and functional analysis, in particular with the Banach–Steinhaus theorem. The emphasis is however on applications, rather than on the theory. For terminology and notations we generally follow N. Bourbaki. Sections with a star may be omitted at ﬁrst reading. The index will be helpful to trace important notions deﬁned in the text. Thanks are due to my colleagues and students in The Netherlands for their remarks and suggestions. Special thanks are due to Dr J. D. Stegeman (Utrecht) whose help in developing the ﬁnal version of the manuscript has greatly improved the presentation. Leiden, November 2012
Gerrit van Dijk
Contents

Preface  V
1  Introduction  1
2  Definition and First Properties of Distributions  3
2.1  Test Functions  3
2.2  Distributions  4
2.3  Support of a Distribution  6
3  Differentiating Distributions  9
3.1  Definition and Properties  9
3.2  Examples  10
3.3  The Distributions x₊^{λ−1} (λ ≠ 0, −1, −2, ...)*  12
3.4  Exercises  14
3.5  Green's Formula and Harmonic Functions  14
3.6  Exercises  20
4  Multiplication and Convergence of Distributions  22
4.1  Multiplication with a C∞ Function  22
4.2  Exercises  23
4.3  Convergence in D′  23
4.4  Exercises  24
5  Distributions with Compact Support  26
5.1  Definition and Properties  26
5.2  Distributions Supported at the Origin  27
5.3  Taylor's Formula for Rⁿ  27
5.4  Structure of a Distribution*  28
6  Convolution of Distributions  31
6.1  Tensor Product of Distributions  31
6.2  Convolution Product of Distributions  33
6.3  Associativity of the Convolution Product  39
6.4  Exercises  39
6.5  Newton Potentials and Harmonic Functions  40
6.6  Convolution Equations  42
6.7  Symbolic Calculus of Heaviside  45
6.8  Volterra Integral Equations of the Second Kind  47
6.9  Exercises  49
6.10  Systems of Convolution Equations*  50
6.11  Exercises  51
7  The Fourier Transform  52
7.1  Fourier Transform of a Function on R  52
7.2  The Inversion Theorem  54
7.3  Plancherel's Theorem  56
7.4  Differentiability Properties  57
7.5  The Schwartz Space S(R)  58
7.6  The Space of Tempered Distributions S′(R)  60
7.7  Structure of a Tempered Distribution*  61
7.8  Fourier Transform of a Tempered Distribution  63
7.9  Paley–Wiener Theorems on R*  65
7.10  Exercises  68
7.11  Fourier Transform in Rⁿ  69
7.12  The Heat or Diffusion Equation in One Dimension  71
8  The Laplace Transform  74
8.1  Laplace Transform of a Function  74
8.2  Laplace Transform of a Distribution  75
8.3  Laplace Transform and Convolution  76
8.4  Inversion Formula for the Laplace Transform  79
9  Summable Distributions*  82
9.1  Definition and Main Properties  82
9.2  The Iterated Poisson Equation  83
9.3  Proof of the Main Theorem  84
9.4  Canonical Extension of a Summable Distribution  85
9.5  Rank of a Distribution  87
10  Appendix  90
10.1  The Banach–Steinhaus Theorem  90
10.2  The Beta and Gamma Function  97
11  Hints to the Exercises  102
References  107
Index  109
1 Introduction

Differential equations appear in several forms. One has ordinary differential equations and partial differential equations, equations with constant coefficients and with variable coefficients. Equations with constant coefficients are relatively well understood. If the coefficients are variable, much less is known. Let us consider the singular differential equation of the first order

x u′ = 0 .  (∗)

Though the equation is defined everywhere on the real line, classically a solution is only given for x > 0 and x < 0. In both cases u(x) = c with c a constant, possibly different for x > 0 and for x < 0. In order to find a global solution, we consider a weak form of the differential equation. Let ϕ be a C¹ function on the real line, vanishing outside some bounded interval. Then equation (∗) can be rephrased as

⟨x u′, ϕ⟩ = ∫_{−∞}^{∞} x u′(x) ϕ(x) dx = 0 .

Applying partial integration, we get

⟨x u′, ϕ⟩ = −⟨u, (xϕ)′⟩ = −⟨u, ϕ + x ϕ′⟩ = 0 .

Take u(x) = 1 for x ≥ 0, u(x) = 0 for x < 0. Then we obtain

⟨x u′, ϕ⟩ = −∫₀^∞ [ϕ(x) + x ϕ′(x)] dx = −∫₀^∞ ϕ(x) dx + ∫₀^∞ ϕ(x) dx = 0 .
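The vanishing of ⟨xu′, ϕ⟩ for this choice of u can also be checked numerically. The sketch below is illustrative only: the bump-shaped test function and the grid are ad hoc choices, anticipating the test functions of Chapter 2.

```python
import numpy as np

# bump-shaped test function (an ad hoc choice) and its derivative
def phi(x):
    out = np.zeros_like(x)
    m = np.abs(x) < 1
    out[m] = np.exp(-1.0 / (1.0 - x[m] ** 2))
    return out

def dphi(x):
    out = np.zeros_like(x)
    m = np.abs(x) < 1
    out[m] = np.exp(-1.0 / (1.0 - x[m] ** 2)) * (-2.0 * x[m] / (1.0 - x[m] ** 2) ** 2)
    return out

# <x u', phi> = -int_0^oo (phi + x phi') dx, computed by the trapezoid rule
x = np.linspace(0.0, 1.0, 200001)
g = phi(x) + x * dphi(x)
val = -np.sum(0.5 * (g[1:] + g[:-1])) * (x[1] - x[0])
print(abs(val))  # close to 0
```

The two halves of the integrand cancel exactly after integration by parts, and the quadrature reproduces this cancellation to high accuracy.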
Call this function H; it is known as the Heaviside function. We see that we obtain the following (weak) global solutions of the equation:

u(x) = c₁ H(x) + c₂ ,

with c₁ and c₂ constants. Observe that we get a two-dimensional solution space. One can show that these are all weak solutions of the equation. The functions ϕ are called test functions. Of course one can narrow the class of test functions to C^k functions with k > 1, vanishing outside a bounded interval. This is certainly useful if the order of the differential equation is greater than one. It would be nice if we could assume that the test functions are C∞ functions, vanishing outside a bounded interval. But then there is really something to show: do such functions exist? The answer is yes (see Chapter 2). Therefore we can set up a nice theory of global solutions. This is important in several branches of mathematics and physics. Consider, for example, a point mass in R³ with force field having potential V = 1/r ,
r being the distance function. It satisfies the partial differential equation ΔV = 0 outside 0. To include the origin in the equation, one writes it as

ΔV = −4π δ ,

with δ the functional given by ⟨δ, ϕ⟩ = ϕ(0). It is thus the desire for global equations and global solutions that leads one to develop a theory of (weak) solutions. This theory is known as distribution theory. It has several applications also outside the theory of differential equations. To mention one: in the representation theory of groups, a well-known concept is the character of a representation. This is perfectly defined for finite-dimensional representations. If the dimension of the space is infinite, the concept of distribution character can take over that role.
2 Deﬁnition and First Properties of Distributions Summary In this chapter we show the existence of test functions, deﬁne distributions, give some examples and prove their elementary properties. Similar to the notion of support of a function we deﬁne the support of a distribution. This is a rather technical part, but it is important because it has applications in several other branches of mathematics, such as differential geometry and the theory of Lie groups.
Learning Targets Understanding the deﬁnition of a distribution. Getting acquainted with the notion of support of a distribution.
2.1 Test Functions

We consider the Euclidean space Rⁿ, n ≥ 1, with elements x = (x₁, ..., xₙ). One defines ‖x‖ = √(x₁² + ··· + xₙ²), the length of x.

Let ϕ be a complex-valued function on Rⁿ. The closure of the set of points {x ∈ Rⁿ : ϕ(x) ≠ 0} is called the support of ϕ and is denoted by Supp ϕ.

For any n-tuple k = (k₁, ..., kₙ) of nonnegative integers kᵢ one defines the partial differential operator D^k as

D^k = (∂/∂x₁)^{k₁} ··· (∂/∂xₙ)^{kₙ} = ∂^{|k|} / (∂x₁^{k₁} ··· ∂xₙ^{kₙ}) .

The number |k| = k₁ + ··· + kₙ is called the order of the partial differential operator. Note that order 0 corresponds to the identity operator. Of course, in the special case n = 1 we simply have the differential operators d^k/dx^k (k ≥ 0).

A function ϕ : Rⁿ → C is called a C^m function if all partial derivatives D^k ϕ of order |k| ≤ m exist and are continuous. The space of all C^m functions on Rⁿ will be denoted by E^m(Rⁿ). In practice, once a value of n ≥ 1 is fixed, this space will be simply denoted by E^m. A function ϕ : Rⁿ → C is called a C∞ function if all its partial derivatives D^k ϕ exist and are continuous. A C∞ function with compact support is called a test function. The space of all C∞ functions on Rⁿ will be denoted by E(Rⁿ), the space of test functions on Rⁿ by D(Rⁿ). In practice, once a value of n ≥ 1 is fixed, these spaces will be simply denoted by E and D respectively.

It is not immediately clear that nontrivial test functions exist. The requirement of being C∞ is easily met (for example, every polynomial function is C∞), but the additional requirement of compact support is the difficult one. See the following example however.
Example of a Test Function
First take n = 1, the real line. Let ϕ be defined by

ϕ(x) = exp(−1/(1 − x²))  if |x| < 1 ,
ϕ(x) = 0  if |x| ≥ 1 .

Then ϕ ∈ D(R). To see this, it is sufficient to show that ϕ is infinitely many times differentiable at the points x = ±1 and that all derivatives at x = ±1 vanish. After performing a translation, this amounts to showing that the function f defined by f(x) = e^{−1/x} (x > 0), f(x) = 0 (x ≤ 0) is C∞ at x = 0 and that f^{(m)}(0) = 0 for all m = 0, 1, 2, .... This easily follows from the fact that lim_{x↓0} e^{−1/x}/x^k = 0 for all k = 0, 1, 2, ....

For arbitrary n ≥ 1 we denote r = ‖x‖ and take

ϕ(x) = exp(−1/(1 − r²))  if r < 1 ,
ϕ(x) = 0  if r ≥ 1 .

Then ϕ ∈ D(Rⁿ).

The space D is a complex linear space, even an algebra. More generally, if ϕ ∈ D and ψ is a C∞ function, then ψϕ ∈ D. It is easily verified that Supp(ψϕ) ⊂ Supp ϕ ∩ Supp ψ.

We define an important convergence principle in D.

Definition 2.1. A sequence of functions ϕⱼ ∈ D (j = 1, 2, ...) converges to ϕ ∈ D if the following two conditions are satisfied:
(i) the supports of all ϕⱼ are contained in a fixed compact set, not depending on j;
(ii) for any n-tuple k of nonnegative integers the functions D^k ϕⱼ converge uniformly to D^k ϕ (j → ∞).
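The defining properties of the bump function in the example above are easy to probe numerically; the following sketch (sample points chosen arbitrarily) evaluates ϕ and the decisive limit e^{−1/x}/x^k → 0.

```python
import numpy as np

# the one-dimensional bump function of the example (vectorized sketch)
def phi(x):
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    m = np.abs(x) < 1
    out[m] = np.exp(-1.0 / (1.0 - x[m] ** 2))
    return out

print(phi([0.0])[0])          # exp(-1), the maximum of phi
print(phi([1.0, 1.5, 2.0]))   # identically 0 outside (-1, 1)

# the decisive limit: e^{-1/x} / x^k -> 0 as x -> 0+, for every fixed k
for k in (1, 5, 10):
    x0 = 1e-3
    print(k, np.exp(-1.0 / x0) / x0 ** k)
```

The exponential decay overwhelms every negative power of x, which is exactly why all one-sided derivatives of e^{−1/x} vanish at 0.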
2.2 Distributions

We can now define the notion of a distribution. A distribution on the Euclidean space Rⁿ (with n ≥ 1) is a continuous complex-valued linear function defined on D(Rⁿ), the linear space of test functions on Rⁿ. Explicitly, a function T : D(Rⁿ) → C is a distribution if it has the following properties:
a. T(ϕ₁ + ϕ₂) = T(ϕ₁) + T(ϕ₂) for all ϕ₁, ϕ₂ ∈ D,
b. T(λϕ) = λ T(ϕ) for all ϕ ∈ D and λ ∈ C,
c. if ϕⱼ tends to ϕ in D, then T(ϕⱼ) tends to T(ϕ).
Instead of the term linear function, the term linear form is often used in the literature.
The set D′ of all distributions is itself a linear space: the sum T₁ + T₂ and the scalar product λT are defined by

⟨T₁ + T₂, ϕ⟩ = ⟨T₁, ϕ⟩ + ⟨T₂, ϕ⟩ ,  ⟨λT, ϕ⟩ = λ ⟨T, ϕ⟩

for all ϕ ∈ D. One usually writes ⟨T, ϕ⟩ for T(ϕ), which is convenient because of the double linearity.

Examples of Distributions
1.
A function on Rⁿ is called locally integrable if it is integrable over every compact subset of Rⁿ. Clearly, continuous functions are locally integrable. Let f be a locally integrable function on Rⁿ. Then T_f, defined by

⟨T_f, ϕ⟩ = ∫_{Rⁿ} f(x) ϕ(x) dx  (ϕ ∈ D) ,

is a distribution on Rⁿ, a so-called regular distribution. Observe that T_f is well defined because of the compact support of the functions ϕ. One also writes in this case ⟨f, ϕ⟩ for ⟨T_f, ϕ⟩. Because of this relationship between T_f and f, it is justified to call distributions generalized functions, which was customary at the start of the theory of distributions. Clearly, if the function f is continuous, then the relationship is one-to-one: if T_f = 0 then f = 0. Indeed:

Lemma 2.2. Let f be a continuous function satisfying ⟨f, ϕ⟩ = 0 for all ϕ ∈ D. Then f is identically zero.

Proof. Let f satisfy ⟨f, ϕ⟩ = 0 for all ϕ ∈ D and suppose that f(x₀) ≠ 0 for some x₀. Then we may assume that Re f(x₀) ≠ 0 (otherwise consider i·f), and even Re f(x₀) > 0. Since f is continuous, there is a neighborhood V of x₀ where Re f > 0. If ϕ ∈ D, ϕ ≥ 0, ϕ = 1 near x₀ and Supp ϕ ⊂ V, then Re[∫ f(x) ϕ(x) dx] > 0, hence ⟨f, ϕ⟩ ≠ 0. This contradicts the assumption on f.

We shall see later on, in Section 6.2, that this lemma remains true for general locally integrable functions: if T_f = 0 then f = 0 almost everywhere.

2.
The Dirac distribution δ: ⟨δ, ϕ⟩ = ϕ(0) (ϕ ∈ D). More generally, for a ∈ Rⁿ, ⟨δ_(a), ϕ⟩ = ϕ(a). One sometimes uses the terminology δ-function and writes

⟨δ, ϕ⟩ = ∫_{Rⁿ} ϕ(x) δ(x) dx ,  ⟨δ_(a), ϕ⟩ = ∫_{Rⁿ} ϕ(x) δ_(a)(x) dx

for ϕ ∈ D. But clearly δ is not a regular function.
3. For any n-tuple k of nonnegative integers and any a ∈ Rⁿ,

⟨T, ϕ⟩ = D^k ϕ(a)  (ϕ ∈ D)

is a distribution on Rⁿ.
We return to the definition of a distribution and in particular to the continuity property. An equivalent definition is the following:

Proposition 2.3. A distribution T is a linear function on D such that for each compact subset K of Rⁿ there exist a constant C_K and an integer m with

|⟨T, ϕ⟩| ≤ C_K sup_{|k|≤m} ‖D^k ϕ‖_∞

for all ϕ ∈ D with Supp ϕ ⊂ K.

Proof. Clearly any linear function T that satisfies the above inequality is a distribution. Let us show the converse by contradiction. If a distribution T does not satisfy such an inequality, then for some compact set K and for all C and m, e.g. for C = m (m = 1, 2, ...), we can find a function ϕ_m ∈ D with ⟨T, ϕ_m⟩ = 1, Supp ϕ_m ⊂ K, |D^k ϕ_m| ≤ 1/m if |k| ≤ m. Clearly ϕ_m tends to zero in D as m tends to infinity, but ⟨T, ϕ_m⟩ = 1 for all m. This contradicts the continuity condition of T.

If m can be chosen independent of K, then T is said to be of finite order. The smallest possible m is called the order of T. If, in addition, C_K can be chosen independent of K, then T is said to be a summable distribution (see Chapter 9). In the above examples T_f, δ and δ_(a) are of order zero; in Example 3 above the distribution T is of order |k|, hence equal to the order of the partial differential operator.
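For a regular distribution the order-zero estimate of Proposition 2.3 can be illustrated numerically. In the sketch below, the grid, the compact set K = [−2, 2] and the choice f(x) = |x| are all illustrative assumptions; ⟨T_f, ϕ⟩ is approximated by the trapezoid rule and compared with C_K sup|ϕ|, where C_K = ∫_K |f|.

```python
import numpy as np

# T_f for f(x) = |x| on K = [-2, 2], evaluated by the trapezoid rule
x = np.linspace(-2.0, 2.0, 40001)
dx = x[1] - x[0]

def phi(t):
    # bump test function with Supp(phi) = [-1, 1], contained in K
    out = np.zeros_like(t)
    m = np.abs(t) < 1
    out[m] = np.exp(-1.0 / (1.0 - t[m] ** 2))
    return out

def pair(f_vals, phi_vals):
    # <T_f, phi> = int f * phi dx
    g = f_vals * phi_vals
    return np.sum(0.5 * (g[1:] + g[:-1])) * dx

f = np.abs(x)          # |x| is locally integrable
p = phi(x)
val = pair(f, p)

# order-zero bound: |<T_f, phi>| <= C_K * sup|phi| with C_K = int_K |f|
C_K = np.sum(0.5 * (np.abs(f)[1:] + np.abs(f)[:-1])) * dx
print(abs(val) <= C_K * p.max())                 # the bound holds
print(np.isclose(pair(f, 2.0 * p), 2.0 * val))   # linearity in phi
```

The bound involves no derivatives of ϕ, which is exactly what "order zero" means.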
2.3 Support of a Distribution

Let f be a nonvanishing continuous function on Rⁿ. The support of f was defined in Section 2.1 as the closure of the set of points where f does not vanish. If now ϕ is a test function with support in the complement of the support of f, then ⟨f, ϕ⟩ = ∫_{Rⁿ} f(x) ϕ(x) dx = 0. It is easy to see that this complement is the largest open set with this property. Indeed, if O is an open set such that ⟨f, ϕ⟩ = 0 for all ϕ with Supp ϕ ⊂ O, then, similarly to the proof of Lemma 2.2, f = 0 on O, hence Supp f ∩ O = ∅. These observations permit us to define the support of a distribution in a similar way. We begin with two important lemmas; their proofs are rather technical, but they are of great value.

Lemma 2.4. Let K be a compact subset of Rⁿ. Then for any open set O containing K there exists a function ϕ ∈ D such that 0 ≤ ϕ ≤ 1, ϕ = 1 on a neighborhood of K, and Supp ϕ ⊂ O.
Proof. Let 0 < a < b and consider the function f on R given by

f(x) = exp(1/(x − b) − 1/(x − a))  if a < x < b ,
f(x) = 0  elsewhere.

Then f is C∞ and the same holds for

F(x) = ∫_x^b f(t) dt / ∫_a^b f(t) dt .

Observe that F(x) = 1 if x ≤ a and F(x) = 0 if x ≥ b. The function ψ on Rⁿ given by ψ(x₁, ..., xₙ) = F(x₁² + ··· + xₙ²) is C∞, equal to 1 for r² ≤ a and zero for r² ≥ b. Let B′ ⊂ B be two different concentric balls in Rⁿ. By performing a linear transformation in Rⁿ if necessary, we can now construct a function ψ in D that is equal to 1 on B′ and zero outside B.

Now consider K. Finitely many balls B₁, ..., B_m suffice to cover K, and we can arrange it such that ∪_{i=1}^m Bᵢ ⊂ O. One can even manage in such a way that the balls B′₁, ..., B′_m, which are concentric with B₁, ..., B_m but have half the radius, cover K as well. Let ψᵢ ∈ D be such that ψᵢ = 1 on B′ᵢ and zero outside Bᵢ. Set

ψ = 1 − (1 − ψ₁)(1 − ψ₂) ··· (1 − ψ_m) .

Then ψ ∈ D, 0 ≤ ψ ≤ 1, ψ is equal to 1 on a neighborhood of K, and Supp ψ ⊂ O.
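The first step of the proof is easy to visualize numerically. The sketch below (with the illustrative choice a = 1, b = 2, and the form of f as reconstructed above, which should be taken as an assumption) builds F by quadrature and checks its plateau behavior.

```python
import numpy as np

a, b = 1.0, 2.0   # illustrative choice of 0 < a < b

def f(t):
    # smooth, positive on (a, b), vanishing to all orders at a and b
    out = np.zeros_like(t)
    m = (t > a) & (t < b)
    out[m] = np.exp(1.0 / (t[m] - b) - 1.0 / (t[m] - a))
    return out

grid = np.linspace(a, b, 20001)
fg = f(grid)
total = np.sum(0.5 * (fg[1:] + fg[:-1])) * (grid[1] - grid[0])

def F(x):
    # F(x) = (int_x^b f) / (int_a^b f): 1 below a, 0 above b, smooth between
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    s = np.linspace(x, b, 20001)
    fs = f(s)
    return np.sum(0.5 * (fs[1:] + fs[:-1])) * (s[1] - s[0]) / total

print(F(0.5), F(2.5), F(1.5))  # 1.0, 0.0, and a value strictly between
```

Composing such an F with x₁² + ··· + xₙ² then gives the radial cutoff ψ used in the proof.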
Lemma 2.5 (Partition of unity). Let O₁, ..., O_m be open subsets of Rⁿ, K a compact subset of Rⁿ, and assume K ⊂ ∪_{i=1}^m Oᵢ. Then there are functions ϕᵢ ∈ D with Supp ϕᵢ ⊂ Oᵢ such that ϕᵢ ≥ 0, Σ_{i=1}^m ϕᵢ ≤ 1, and Σ_{i=1}^m ϕᵢ = 1 on a neighborhood of K.

Proof. Select compact subsets Kᵢ ⊂ Oᵢ such that K ⊂ ∪_{i=1}^m Kᵢ. By the previous lemma there are functions ψᵢ ∈ D with Supp ψᵢ ⊂ Oᵢ, ψᵢ = 1 on a neighborhood of Kᵢ, 0 ≤ ψᵢ ≤ 1. Set

ϕ₁ = ψ₁ ,  ϕᵢ = ψᵢ (1 − ψ₁) ··· (1 − ψᵢ₋₁)  (i = 2, ..., m) .

Then Σ_{i=1}^m ϕᵢ = 1 − (1 − ψ₁) ··· (1 − ψ_m), which is indeed equal to 1 on a neighborhood of K.

Let T be a distribution. Let O be an open subset of Rⁿ such that ⟨T, ϕ⟩ = 0 for all ϕ ∈ D with Supp ϕ ⊂ O. Then we say that T vanishes on O. Let U be the union of all such open sets O. Then U is again open, and from Lemma 2.5 it follows that T vanishes on U. Thus U is the largest open set on which T vanishes. Therefore we can now give the following definition.
Definition 2.6. The support of the distribution T on Rⁿ, denoted by Supp T, is the complement of the largest open set in Rⁿ on which T vanishes. Clearly, Supp T is a closed subset of Rⁿ.

Examples
1. Supp δ_(a) = {a}.
2. If f is a continuous function, then Supp T_f = Supp f.

The following proposition is useful.
Proposition 2.7. Let T be a distribution on Rⁿ of finite order m. Then ⟨T, ψ⟩ = 0 for all ψ ∈ D for which the derivatives D^k ψ with |k| ≤ m vanish on Supp T.

Proof. Let ψ ∈ D be as in the proposition and assume that Supp ψ is contained in a compact subset K of Rⁿ. Since the order of T is finite, equal to m, one has

|⟨T, ϕ⟩| ≤ C_K sup_{|k|≤m} ‖D^k ϕ‖_∞

for all ϕ ∈ D with Supp ϕ ⊂ K, C_K being a positive constant. Applying Lemma 2.4, choose for any ε > 0 a function ϕ_ε ∈ D such that ϕ_ε = 1 on a neighborhood of K ∩ Supp T and such that sup |D^k(ϕ_ε ψ)| < ε for all partial differential operators D^k with |k| ≤ m. Then

|⟨T, ψ⟩| = |⟨T, ϕ_ε ψ⟩| ≤ C_K sup_{|k|≤m} ‖D^k(ϕ_ε ψ)‖_∞ ≤ C_K (m + 1)ⁿ ε .

Since this holds for all ε > 0, it follows that ⟨T, ψ⟩ = 0.
Further Reading Every book on distribution theory contains of course the deﬁnition and ﬁrst properties of distributions discussed in this chapter. See, e. g. [9], [10] and [5]. Schwartz’s book [10] contains a large amount of additional material.
3 Differentiating Distributions Summary In this chapter we deﬁne the derivative of a distribution and show that any distribution can be differentiated an arbitrary number of times. We give several examples of distributions and their derivatives, both on the real line and in higher dimensions. The principal value and the ﬁnite part of a distribution are introduced. In higher dimensions we derive Green’s formula and study harmonic functions only depending on the radius r , both as classical functions outside the origin and globally as distributions. These functions play a prominent role in physics, when studying the potential of a ﬁeld of a point mass or of an electron.
Learning Targets Understanding that any distribution can be differentiated an arbitrary number of times. Learning about the principal value and the ﬁnite part of a distribution. Deriving a formula of jumps. What are harmonic functions and how do they behave as distributions?
3.1 Definition and Properties

If f is a continuously differentiable function on R and ϕ ∈ D(R), then it is easily verified by partial integration, using that ϕ has compact support, that

⟨df/dx, ϕ⟩ = ∫_{−∞}^{∞} f′(x) ϕ(x) dx = −∫_{−∞}^{∞} f(x) ϕ′(x) dx = −⟨f, dϕ/dx⟩ .

In a similar way one has for a continuously differentiable function f on Rⁿ (n ≥ 1),

⟨∂f/∂xᵢ, ϕ⟩ = −⟨f, ∂ϕ/∂xᵢ⟩

for all i = 1, 2, ..., n and all ϕ ∈ D(Rⁿ). The following definition is then natural for a general distribution T on Rⁿ: ⟨∂T/∂xᵢ, ϕ⟩ = −⟨T, ∂ϕ/∂xᵢ⟩. Notice that ∂T/∂xᵢ is again a distribution. To summarize:

Definition 3.1. Let T be a distribution on Rⁿ. Then the i-th partial derivative of T is defined by

⟨∂T/∂xᵢ, ϕ⟩ = −⟨T, ∂ϕ/∂xᵢ⟩  (ϕ ∈ D(Rⁿ)) .
One clearly has

∂²T/∂xᵢ∂xⱼ = ∂²T/∂xⱼ∂xᵢ  (i, j = 1, ..., n)

because the same is true for functions in D(Rⁿ). Let k be an n-tuple of nonnegative integers. Then one has

⟨D^k T, ϕ⟩ = (−1)^{|k|} ⟨T, D^k ϕ⟩  (ϕ ∈ D(Rⁿ)) .

We may conclude that a distribution can be differentiated an arbitrary number of times. In particular any locally integrable function is C∞ as a distribution. Also the δ-function can be differentiated an arbitrary number of times.

Let D = Σ_{|k|≤m} a_k D^k be a differential operator with constant coefficients a_k ∈ C. Then the adjoint or transpose ᵗD of D is by definition ᵗD = Σ_{|k|≤m} (−1)^{|k|} a_k D^k. One clearly has ᵗ(ᵗD) = D and

⟨DT, ϕ⟩ = ⟨T, ᵗDϕ⟩  (ϕ ∈ D(Rⁿ))

for any distribution T. If ᵗD = D we say that D is self-adjoint. The Laplace operator Δ = Σ_{i=1}^n ∂²/∂xᵢ² is self-adjoint.
3.2 Examples

As before, we begin with some examples on the real line.

1. Let Y(x) = 0 if x < 0 and Y(x) = 1 if x ≥ 0. The function Y is called the Heaviside function, named after Oliver Heaviside (1850–1925), a self-taught British electrical engineer, mathematician, and physicist. One has Y′ = δ. Indeed,

⟨Y′, ϕ⟩ = −⟨Y, ϕ′⟩ = −∫₀^∞ ϕ′(x) dx = ϕ(0) = ⟨δ, ϕ⟩  (ϕ ∈ D) .
2. For all m = 0, 1, 2, ... one has ⟨δ^{(m)}, ϕ⟩ = (−1)^m ϕ^{(m)}(0) (ϕ ∈ D). This is easy.
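Example 1 can be verified by quadrature. In the sketch below, the bump test function is shifted by the illustrative amount c = 0.3 so that 0 lies inside its support, and we check that −∫₀^∞ ϕ′(x) dx reproduces ϕ(0).

```python
import numpy as np

c = 0.3  # illustrative shift so that 0 lies inside Supp(phi)

def phi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2))
    return out

def dphi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2)) * (-2.0 * u[m] / (1.0 - u[m] ** 2) ** 2)
    return out

x = np.linspace(0.0, c + 1.0, 200001)   # covers Supp(phi) intersected with [0, oo)
g = -dphi(x)
lhs = np.sum(0.5 * (g[1:] + g[:-1])) * (x[1] - x[0])
print(lhs, phi(np.array([0.0]))[0])     # both equal phi(0)
```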
3. Let us generalize Example 1 to more general functions and higher derivatives. Consider a function f : R → C that is m times continuously differentiable for x ≠ 0 and such that

lim_{x↓0} f^{(k)}(x)  and  lim_{x↑0} f^{(k)}(x)

exist for all k ≤ m. Let us denote the difference between these limits (the jump) by σ_k, thus

σ_k = lim_{x↓0} f^{(k)}(x) − lim_{x↑0} f^{(k)}(x) .

Call f′, f″, ... the distributional derivatives of f and {f′}, {f″}, ... the distributions defined by the ordinary derivatives of f on R\{0}. For example, if f = Y, then f′ = δ and {f′} = 0. One easily verifies by partial integration that

f′ = {f′} + σ₀ δ ,
f″ = {f″} + σ₀ δ′ + σ₁ δ ,
...
f^{(m)} = {f^{(m)}} + σ₀ δ^{(m−1)} + σ₁ δ^{(m−2)} + ··· + σ_{m−1} δ .

So, for example,

[Y(x) cos x]′ = −Y(x) sin x + δ ,  [Y(x) sin x]′ = Y(x) cos x .
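The first line of the formula of jumps can be tested numerically for f(x) = Y(x) cos x, where σ₀ = 1 and {f′} = −Y sin x. The sketch below (the shift c = 0.3 and the grid are illustrative choices) compares −⟨Y cos x, ϕ′⟩ with ⟨−Y sin x, ϕ⟩ + ϕ(0).

```python
import numpy as np

c = 0.3

def phi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2))
    return out

def dphi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2)) * (-2.0 * u[m] / (1.0 - u[m] ** 2) ** 2)
    return out

def trap(g, x):
    return np.sum(0.5 * (g[1:] + g[:-1])) * (x[1] - x[0])

x = np.linspace(0.0, c + 1.0, 200001)
lhs = -trap(np.cos(x) * dphi(x), x)                           # <f', phi> = -<f, phi'>
rhs = -trap(np.sin(x) * phi(x), x) + phi(np.array([0.0]))[0]  # <{f'}, phi> + sigma_0 * phi(0)
print(lhs, rhs)   # agree
```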
4. The distribution pv(1/x). We now come to an important notion in distribution theory, the principal value of a function. We restrict ourselves to the function f(x) = 1/x and define pv(1/x) (the principal value of 1/x) by

⟨pv(1/x), ϕ⟩ = lim_{ε↓0} ∫_{|x|≥ε} ϕ(x)/x dx  (ϕ ∈ D) .

By a well-known result from calculus, we know that lim_{x↓0} x^ε log x = 0 for any ε > 0. Therefore log |x| is a locally integrable function and thus defines a regular distribution. Keeping this in mind and using partial integration we obtain for any ϕ ∈ D

⟨pv(1/x), ϕ⟩ = −lim_{ε↓0} ∫_{|x|≥ε} log |x| ϕ′(x) dx = −∫_{−∞}^{∞} log |x| ϕ′(x) dx ,

and we see that pv(1/x) is a distribution, since

pv(1/x) = (log |x|)′

in the sense of distributions.

5.
Partie finie. We now introduce another important notion, the so-called partie finie (or finite part) of a function. Again we restrict to particular functions, on this occasion to powers of 1/x; the notion of partie finie occurs when we differentiate the principal value of 1/x. Let us define for ϕ ∈ D

⟨Pf(1/x²), ϕ⟩ = lim_{ε↓0} [ ∫_{−∞}^{−ε} ϕ(x)/x² dx + ∫_ε^{∞} ϕ(x)/x² dx − 2ϕ(0)/ε ] .

Though this is a rather complicated expression, it occurs in a natural way when we compute [pv(1/x)]′ (do it!). It turns out that

[pv(1/x)]′ = −Pf(1/x²) .

Thus Pf(1/x²) is a distribution. Keeping this in mind, let us define

Pf x^{−n} = −(1/(n − 1)) (d/dx) Pf x^{−(n−1)}  (n = 2, 3, ...)

and we set Pf(1/x) = pv(1/x). For an alternative definition of Pf x^{−n}, see [10], Section II, 2:26.
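Both pv(1/x) and Pf(1/x²) can be probed numerically. The sketch below (bump test function, grid sizes and the shift c = 0.3 are illustrative assumptions) checks pv(1/x) = (log |x|)′ and the pairing identity ⟨Pf(1/x²), ϕ⟩ = ⟨pv(1/x), ϕ′⟩, the latter being a restatement of [pv(1/x)]′ = −Pf(1/x²).

```python
import numpy as np

c = 0.3  # illustrative shift so that phi(0) != 0

def phi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2))
    return out

def dphi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2)) * (-2.0 * u[m] / (1.0 - u[m] ** 2) ** 2)
    return out

def trap(g, x):
    return np.sum(0.5 * (g[1:] + g[:-1])) * (x[1] - x[0])

def pv(psi):
    # <pv(1/x), psi> written symmetrically: int_0^oo (psi(t) - psi(-t))/t dt
    t = np.linspace(1e-9, 1.5, 400001)
    return trap((psi(t) - psi(-t)) / t, t)

# check pv(1/x) = (log|x|)': compare with -int log|x| phi'(x) dx
x = np.linspace(-1.5, 1.5, 800000)      # even point count, so x = 0 is avoided
reg = -trap(np.log(np.abs(x)) * dphi(x), x)
print(pv(phi), reg)                     # agree

# check <Pf(1/x^2), phi> = <pv(1/x), phi'>:
# Pf written symmetrically, with the analytic tail beyond Supp(phi) added back
t = np.linspace(1e-6, 1.5, 400001)
phi0 = phi(np.array([0.0]))[0]
pf = trap((phi(t) + phi(-t) - 2.0 * phi0) / t ** 2, t) - 2.0 * phi0 / 1.5
print(pf, pv(dphi))                     # agree
```

The symmetric rewriting removes the 1/x and 1/x² singularities from the integrands, which is what makes a naive quadrature workable here.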
3.3 The Distributions x₊^{λ−1} (λ ≠ 0, −1, −2, ...)*

Let us define for λ ∈ C with Re λ > 0,

⟨x₊^{λ−1}, ϕ⟩ = ∫₀^∞ x^{λ−1} ϕ(x) dx  (ϕ ∈ D) .  (3.1)

Since x^{λ−1} is locally integrable on (0, ∞) for Re λ > 0, we can regard x₊^{λ−1} as a distribution. Expanding x^{λ−λ₀} = e^{(λ−λ₀) log x} (x > 0) into a power series, we obtain for any (small) ε > 0 the inequality

|x^{λ−1} − x^{λ₀−1}| ≤ x^{−ε} |λ − λ₀| |x^{λ₀−1}| |log x|  (0 < x ≤ 1)

for all λ with |λ − λ₀| < ε. A similar inequality holds for x ≥ 1, with −ε replaced by ε in the power of x. One then easily sees, by using Lebesgue's theorem on dominated convergence, that λ ↦ ⟨x₊^{λ−1}, ϕ⟩ is a complex analytic function for Re λ > 0. Applying partial integration we obtain

(d/dx) x₊^λ = λ x₊^{λ−1}

if Re λ > 0. Let us now define the distribution x₊^{λ−1} for all λ ≠ 0, −1, −2, ... by choosing a nonnegative integer k such that Re λ + k > 0 and setting

x₊^{λ−1} = (1/(λ(λ+1)···(λ+k−1))) (d/dx)^k x₊^{λ+k−1} .

Explicitly one has

⟨x₊^{λ−1}, ϕ⟩ = ((−1)^k/(λ(λ+1)···(λ+k−1))) ∫₀^∞ x^{λ+k−1} (d^k ϕ/dx^k) dx  (ϕ ∈ D) .  (3.2)

It is clear, using partial integration, that this definition agrees with equation (3.1) when Re λ > 0, and that it is independent of the choice of k provided Re λ + k > 0. It also gives an analytic function of λ on the set

Ω = {λ ∈ C : λ ≠ 0, −1, −2, ...} .
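That (3.2) reproduces (3.1) for Re λ > 0 can be checked numerically. The sketch below (λ = 1/2, k = 1, an off-center bump test function: all illustrative choices) compares ∫₀^∞ x^{λ−1} ϕ dx with −(1/λ) ∫₀^∞ x^λ ϕ′ dx.

```python
import numpy as np

c = 0.3  # illustrative shift of the bump test function

def phi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2))
    return out

def dphi(x):
    out = np.zeros_like(x)
    u = x - c
    m = np.abs(u) < 1
    out[m] = np.exp(-1.0 / (1.0 - u[m] ** 2)) * (-2.0 * u[m] / (1.0 - u[m] ** 2) ** 2)
    return out

def trap(g, x):
    return np.sum(0.5 * (g[1:] + g[:-1])) * (x[1] - x[0])

lam = 0.5   # Re(lam) > 0, so (3.1) converges; we take k = 1 in (3.2)

# (3.1), with the substitution x = t^2 to tame the x^(lam-1) singularity
t = np.linspace(0.0, np.sqrt(1.3), 400001)
lhs = trap(2.0 * t ** (2.0 * lam - 1.0) * phi(t ** 2), t)

# (3.2) with k = 1: (-1/lam) * int_0^oo x^lam phi'(x) dx
x = np.linspace(0.0, 1.3, 400001)
rhs = (-1.0 / lam) * trap(x ** lam * dphi(x), x)
print(lhs, rhs)  # agree
```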
Note that (d/dx) x₊^λ = λ x₊^{λ−1} for all λ ∈ Ω. The excluded points λ = 0, −1, −2, ... are simple poles of x₊^{λ−1}. In order to compute the residues at these poles, we replace k by k + 1 in equation (3.2), and obtain

Res_{λ=−k} ⟨x₊^{λ−1}, ϕ⟩ = lim_{λ→−k} (λ + k) ⟨x₊^{λ−1}, ϕ⟩
= ((−1)^{k+1}/((−k)(−k+1)···(−1))) ∫₀^∞ (d^{k+1}ϕ/dx^{k+1})(x) dx = (d^kϕ/dx^k)(0)/k! .

Hence Res_{λ=−k} ⟨x₊^{λ−1}, ϕ⟩ = ⟨(−1)^k δ^{(k)}/k!, ϕ⟩ (ϕ ∈ D).

The gamma function Γ(λ), defined by

Γ(λ) = ∫₀^∞ e^{−x} x^{λ−1} dx ,
is also defined and analytic for Re λ > 0. This can be shown as above, with ϕ replaced by the function e^{−x}. We refer to Section 10.2 for a detailed account of the gamma function. Using partial integration, one obtains Γ(λ + 1) = λ Γ(λ) for Re λ > 0. This functional equation permits us, similarly to the above procedure, to extend the gamma function to an analytic function on the whole complex plane minus the points λ = 0, −1, −2, ..., thus to Ω. Furthermore, the gamma function has simple poles at these points, with residue at the point λ = −k equal to (−1)^k/k!. A well-known formula for the gamma function is the following:

Γ(λ) Γ(1 − λ) = π / sin πλ .

We refer again to Section 10.2. From this formula we may conclude that Γ(λ) vanishes nowhere on C\Z. Since Γ(k) = (k − 1)! for k = 1, 2, ..., we see that Γ(λ) vanishes at no point of Ω. Define now E_λ ∈ D′ by

E_λ = x₊^{λ−1}/Γ(λ)  (λ ∈ Ω) ,
E₋ₖ = δ^{(k)}  (k = 0, 1, 2, ...) .

Then λ ↦ ⟨E_λ, ϕ⟩ (ϕ ∈ D) is an entire function on C. Note that

(d/dx) E_λ = E_{λ−1}

for all λ ∈ C.
3.4 Exercises

Exercise 3.2. Let Y be the Heaviside function, defined in Section 3.2, Example 1, and let λ ∈ C. Prove that in distributional sense the following equations hold:

(d/dx − λ)[Y(x) e^{λx}] = δ ,
(d²/dx² + ω²)[Y(x) sin(ωx)/ω] = δ ,
(d^m/dx^m)[Y(x) x^{m−1}/(m − 1)!] = δ  for m a positive integer.

Exercise 3.3. Determine, in distributional sense, all derivatives of |x|.

Exercise 3.4. Find a distribution of the form F(t) = Y(t) f(t), with f a two times continuously differentiable function, satisfying the distribution equation

a (d²F/dt²) + b (dF/dt) + c F = m δ′ + n δ

with a, b, c, m, n complex constants. Now consider the particular cases
(i) a = c = 1; b = 2; m = n = 1,
(ii) a = 1; b = 0; c = 4; m = 1; n = 0,
(iii) a = 1; b = 0; c = −4; m = 2; n = 1.

Exercise 3.5. Define the following generalizations of the partie finie:

Pf ∫₀^∞ ϕ(x)/x dx = lim_{ε↓0} [ ∫_ε^∞ ϕ(x)/x dx + ϕ(0) log ε ] ,
Pf ∫₀^∞ ϕ(x)/x² dx = lim_{ε↓0} [ ∫_ε^∞ ϕ(x)/x² dx − ϕ(0)/ε + ϕ′(0) log ε ] .

These expressions define distributions, denoted by pv(Y(x)/x) and Pf(Y(x)/x²). Show this and determine the derivative of the first expression.
3.5 Green’s Formula and Harmonic Functions We continue with examples in Rn for n ≥ 1.
Green’s Formula and Harmonic Functions
1.
15
Green’s formula. Let S be a hypersurface F (x1 , . . . , xn ) = 0 in Rn . The function F is assumed to be a C 1 function such that grad F does not vanish at any point of S . Such a hypersurface is called regular. Therefore, in each point of S a tangent surface exists and with a normal vector on it. Choose at each x ∈ S as normal vector n(x) =
n(x) = grad F(x) / ‖grad F(x)‖ .

Clearly n(x) depends continuously on x. Given x ∈ S and e ∈ Rⁿ with (e, n(x)) ≠ 0, the line x + t e does not belong to S for small t ≠ 0. Verify it! Notice also that S has Lebesgue measure equal to zero.

Let f be a C² function on Rⁿ\S such that for each x ∈ S and each partial derivative D^k f on Rⁿ\S (|k| ≤ 2) the limits

lim_{y→x, F(y)>0} D^k f(y)  and  lim_{y→x, F(y)<0} D^k f(y)

exist. Observe that F increases in the direction of n and decreases in the direction of −n. The difference of the two limits will be denoted by σ^k:

σ^k(x) = lim_{y→x, F(y)>0} D^k f(y) − lim_{y→x, F(y)<0} D^k f(y)  (x ∈ S) .

Assume that σ^k is a nice, e.g. continuous, function on S. Let us regard f as a distribution, D^k f as a distributional derivative, and {D^k f} as the distribution given by the function D^k f on Rⁿ\S. For ϕ ∈ D we have

⟨∂f/∂x₁, ϕ⟩ = −⟨f, ∂ϕ/∂x₁⟩ = −∫ f(x) (∂ϕ/∂x₁)(x) dx
= −∫ dx₂ ··· dxₙ ∫ f(x₁, x₂, ..., xₙ) (∂ϕ/∂x₁)(x₁, x₂, ..., xₙ) dx₁ .

From Section 3.2, Example 3, it follows, with σ⁰ = σ^{(0,...,0)} and e₁ = (1, 0, ..., 0), that

⟨∂f/∂x₁, ϕ⟩ = ∫_S σ⁰(x₁, ..., xₙ) sign(e₁, n(x₁, ..., xₙ)) ϕ(x₁, ..., xₙ) dx₂ ··· dxₙ
+ ∫ dx₂ ··· dxₙ ∫_{−∞}^{∞} {∂f/∂x₁} ϕ dx₁ .
Let ds be the surface element of S at x ∈ S . One has cos θ1 dx1 ∧ ds = dn(x) ∧ ds = dx1 ∧ dx2 . . . ∧ dxn , so dx2 . . . dxn =  cos θ1  ds , θ1 being the angle . We thus obtain between e1 and n(x) 1 , . . . , xn ) ϕ(x1 , . . . , xn ) dx1 . . . dxn = σ 0 (x1 , . . . , xn ) sign e1 , n(x S
σ 0 (s) ϕ(s) cos θ1 (s) ds . S
Notice that T , ϕ = S σ 0 (s) ϕ(s) cos θ1 (s) ds (ϕ ∈ D) deﬁnes a distribution T . Symbolically we shall write T = (σ 0 cos θ1 ) δS . Hence we have for i = 1, . . . , n, in obvious notation, + * ∂f ∂f + σ 0 cos θi δS . = ∂xi ∂xi
Differentiating this expression once more gives
\[
\frac{\partial^2 f}{\partial x_i^2} = \Big\{ \frac{\partial^2 f}{\partial x_i^2} \Big\} + \frac{\partial}{\partial x_i}\big( \sigma^0 \cos\theta_i\, \delta_S \big) + \sigma^i \cos\theta_i\, \delta_S .
\]
Here $\sigma^i = \sigma^{(0,\ldots,0,1,0,\ldots,0)}$, with $1$ in the $i$th place. Let $\Delta = \sum_{i=1}^{n} \partial^2/\partial x_i^2$ be the Laplace operator. Then we have
\[
\Delta f = \{\Delta f\} + \sum_{i=1}^{n} \frac{\partial}{\partial x_i}\big( \sigma^0 \cos\theta_i\, \delta_S \big) + \sum_{i=1}^{n} \sigma^i \cos\theta_i\, \delta_S .
\]
Observe that
a. $\sum_{i=1}^{n} \sigma^i \cos\theta_i$ is the jump of $\sum_{i=1}^{n} \dfrac{\partial f}{\partial x_i} \cos\theta_i = \dfrac{\partial f}{\partial \nu}$, the derivative in the direction of $n(s)$. Call it $\sigma_\nu$.
b. $\Big\langle \sum_{i=1}^{n} \dfrac{\partial}{\partial x_i}\big( \sigma^0 \cos\theta_i\, \delta_S \big), \varphi \Big\rangle = -\displaystyle\int_S \sum_{i=1}^{n} \cos\theta_i\, \frac{\partial \varphi}{\partial x_i}\, \sigma^0\, ds = -\int_S \frac{\partial \varphi}{\partial \nu}\, \sigma^0\, ds$.
Symbolically we will denote this distribution by $(\partial/\partial\nu)(\sigma^0 \delta_S)$.

We now come to a particular case. Let $S$ be the border of an open volume $V$ and assume that $f$ is zero outside the closure $\overline{V}$ of $V$. Let $\nu = \nu(s)$ be the direction of the inner normal at $s \in S$. Set
\[
\frac{\partial f}{\partial \nu}(s) = \lim_{\substack{x \to s \\ x \in V}} \frac{\partial f}{\partial \nu}(x)
\]
and assume that this limit exists. Then one has
\[
\langle \Delta f, \varphi \rangle = \langle f, \Delta\varphi \rangle = \int_V f(x)\, \Delta\varphi(x)\, dx
\]
and
\[
\langle \Delta f, \varphi \rangle
= \langle \{\Delta f\}, \varphi \rangle + \Big\langle \frac{\partial f}{\partial \nu}\, \delta_S, \varphi \Big\rangle + \Big\langle \frac{\partial}{\partial \nu}(f \delta_S), \varphi \Big\rangle
= \int_V \Delta f(x)\, \varphi(x)\, dx + \int_S \frac{\partial f}{\partial \nu}(s)\, \varphi(s)\, ds - \int_S f(s)\, \frac{\partial \varphi}{\partial \nu}(s)\, ds .
\]
Hence we get Green's formula in $n$-dimensional space
\[
\int_V \big( f\, \Delta\varphi - \Delta f\, \varphi \big)\, dx = \int_S \Big( \varphi\, \frac{\partial f}{\partial \nu} - f\, \frac{\partial \varphi}{\partial \nu} \Big)\, ds \qquad (\varphi \in \mathcal{D}) ,
\]
where still $\nu$ is the direction of the inner normal vector. Observe that this formula is even correct for any $C^2$ function $\varphi$.
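Green's formula can be sanity-checked in the simplest case $n = 1$, where $V = (a, b)$, $S = \{a, b\}$, and the inner normal derivative is $+d/dx$ at $a$ and $-d/dx$ at $b$. The following Python sketch is our own illustration (the test functions $\sin x$ and $e^{-x^2}$ are arbitrary $C^2$ choices, and `trap` is a plain trapezoid rule):

```python
import numpy as np

def trap(y, x):
    # trapezoid rule, kept explicit for portability across NumPy versions
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

f, df, d2f = np.sin, np.cos, lambda x: -np.sin(x)
phi   = lambda x: np.exp(-x**2)
dphi  = lambda x: -2*x*np.exp(-x**2)
d2phi = lambda x: (4*x**2 - 2)*np.exp(-x**2)

a, b = -1.0, 2.0
x = np.linspace(a, b, 200001)
# left side: integral over V of (f*Laplacian(phi) - Laplacian(f)*phi)
lhs = trap(f(x)*d2phi(x) - d2f(x)*phi(x), x)
# right side: boundary terms with the INNER normal (+d/dx at a, -d/dx at b)
rhs = (phi(a)*df(a) - f(a)*dphi(a)) + (phi(b)*(-df(b)) - f(b)*(-dphi(b)))
print(lhs, rhs)   # the two numbers agree
```

The sign convention for the inner normal is exactly what makes the two boundary contributions combine into the usual integration-by-parts identity.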
2. Harmonic functions. Let $f$ be a $C^2$ function on $\mathbb{R}^n \setminus \{0\}$, only depending on the radius $r = \|x\| = \sqrt{x_1^2 + \cdots + x_n^2}$. Notice that $r = r(x_1, \ldots, x_n)$ is not differentiable at $x = 0$. The function $f$ is called a harmonic function if $\Delta f = 0$ on $\mathbb{R}^n \setminus \{0\}$. What is the general form of $f$, when writing $f(x) = F(r)$? We have for $r \neq 0$
\[
\frac{\partial f}{\partial x_i} = \frac{dF}{dr}\, \frac{\partial r}{\partial x_i} = \frac{x_i}{r}\, \frac{dF}{dr} ,
\]
\[
\frac{\partial^2 f}{\partial x_i^2} = \frac{\partial}{\partial x_i} \Big( \frac{x_i}{r}\, \frac{dF}{dr} \Big)
= \frac{x_i^2}{r^2}\, \frac{d^2 F}{dr^2} + \Big( \frac{1}{r} - \frac{x_i^2}{r^3} \Big) \frac{dF}{dr} ,
\]
hence
\[
\Delta f(x) = \frac{d^2 F}{dr^2} + \frac{n-1}{r}\, \frac{dF}{dr} .
\]
We have to solve $d^2F/dr^2 + \big((n-1)/r\big)\, dF/dr = 0$. Set $u = dF/dr$. Then $u' + [(n-1)/r]\, u = 0$, hence $u(r) = c/r^{n-1}$ for some constant $c$ and thus
\[
F(r) =
\begin{cases}
\dfrac{A}{r^{n-2}} + B & \text{if } n \neq 2 , \\[1ex]
A \log r + B & \text{if } n = 2 ,
\end{cases}
\]
$A$ and $B$ being constants.
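The radial equation and its solutions lend themselves to a direct numerical check. The sketch below (illustrative only; step size and sample radius are arbitrary choices) verifies with central finite differences that $r^{-(n-2)}$ for $n \neq 2$, and $\log r$ for $n = 2$, annihilate $F'' + \frac{n-1}{r} F'$:

```python
import numpy as np

def residual(F, r, n, h=1e-4):
    # central finite differences for F' and F''
    d1 = (F(r + h) - F(r - h)) / (2*h)
    d2 = (F(r + h) - 2*F(r) + F(r - h)) / h**2
    return d2 + (n - 1)/r * d1

residuals = []
for n in (3, 4, 5):
    F = lambda r, n=n: r**(-(n - 2))
    residuals.append(residual(F, 1.5, n))
residuals.append(residual(np.log, 1.5, 2))   # the n = 2 case
print(residuals)   # all close to 0, up to discretization error
```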
The constant functions are harmonic functions everywhere, while the functions 1/r n−2 (n > 2) and log r (n = 2) are harmonic functions outside the origin x = 0. At x = 0 they have a singularity. Both functions are however locally integrable
(see below, change to spherical coordinates), hence they define distributions. We shall compute the distributional derivative $\Delta(1/r^{n-2})$ for $n \neq 2$. For this we need some preparations.
3. Spherical coordinates and the area of the $(n-1)$-dimensional sphere $S^{n-1}$. Spherical coordinates in $\mathbb{R}^n$ $(n > 1)$ are given by
\[
\begin{aligned}
x_1 &= r \sin\theta_{n-1} \cdots \sin\theta_2 \sin\theta_1 \\
x_2 &= r \sin\theta_{n-1} \cdots \sin\theta_2 \cos\theta_1 \\
x_3 &= r \sin\theta_{n-1} \cdots \cos\theta_2 \\
&\;\;\vdots \\
x_{n-1} &= r \sin\theta_{n-1} \cos\theta_{n-2} \\
x_n &= r \cos\theta_{n-1} .
\end{aligned}
\]
Here $0 \leq r < \infty$; $0 \leq \theta_1 < 2\pi$, $0 \leq \theta_j < \pi$ $(j \neq 1)$. Let $r_j \geq 0$, $r_j^2 = x_1^2 + \cdots + x_j^2$, and assume that $r_j > 0$ for all $j \neq 1$. Then
\[
\cos\theta_j = \frac{x_{j+1}}{r_{j+1}} , \qquad
\sin\theta_j = \frac{r_j}{r_{j+1}} \quad (j \neq 1) , \qquad
\sin\theta_1 = \frac{x_1}{r_2} , \qquad r = r_n .
\]
One has $dx_1 \ldots dx_n = r^{n-1}\, J(\theta_1, \ldots, \theta_{n-1})\, dr\, d\theta_1 \ldots d\theta_{n-1}$ with
\[
J(\theta_1, \ldots, \theta_{n-1}) = \sin^{n-2}\theta_{n-1}\, \sin^{n-3}\theta_{n-2} \cdots \sin\theta_2 .
\]
These expressions can easily be verified by induction on $n$. For $n = 2$ and $n = 3$ they are well known. Let $S_{n-1}$ denote the area of $S^{n-1}$, the unit sphere in $\mathbb{R}^n$. Then
\[
S_{n-1} = \int_0^{2\pi} \int_0^{\pi} \cdots \int_0^{\pi} J(\theta_1, \ldots, \theta_{n-1})\, d\theta_1 \ldots d\theta_{n-1} .
\]
To compute $S_{n-1}$, we could use the explicit form of $J$. Easier is the following method. Recall that $\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$. Then we obtain
\[
\pi^{n/2} = \int_{\mathbb{R}^n} e^{-(x_1^2 + \cdots + x_n^2)}\, dx_1 \ldots dx_n
= \int_0^{\infty} r^{n-1} e^{-r^2}\, dr \cdot S_{n-1}
= \frac{1}{2} \int_0^{\infty} e^{-t}\, t^{\frac{n}{2}-1}\, dt \cdot S_{n-1}
= \frac{1}{2}\, S_{n-1}\, \Gamma\Big(\frac{n}{2}\Big) .
\]
Hence $S_{n-1} = (2\pi^{n/2})/\Gamma(n/2)$. If we set $S_0 = 2$ then this formula for $S_{n-1}$ is valid for $n \geq 1$. For properties of the gamma function $\Gamma$, see Section 10.2.
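The closed formula for $S_{n-1}$ is easy to check against the familiar low-dimensional values (circle circumference $2\pi$, sphere area $4\pi$, and $S_0 = 2$, the two endpoints of $[-1,1]$). The helper name `sphere_area` below is ours:

```python
import math

def sphere_area(n):
    # area of the unit sphere S^{n-1} in R^n: 2*pi^(n/2) / Gamma(n/2)
    return 2 * math.pi**(n / 2) / math.gamma(n / 2)

print(sphere_area(2), 2 * math.pi)   # circle in R^2
print(sphere_area(3), 4 * math.pi)   # ordinary sphere in R^3
print(sphere_area(1))                # S_0 = 2
```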
Green's Formula and Harmonic Functions
4. Distributional derivative of a harmonic function. We return to the computation of the distributional derivative $\Delta(1/r^{n-2})$ for $n \neq 2$. One has
\[
\Big\langle \Delta \frac{1}{r^{n-2}}, \varphi \Big\rangle = \Big\langle \frac{1}{r^{n-2}}, \Delta\varphi \Big\rangle
= \int_{\mathbb{R}^n} \frac{1}{r^{n-2}}\, \Delta\varphi(x)\, dx
= \lim_{\varepsilon \to 0} \int_{r \geq \varepsilon} \frac{1}{r^{n-2}}\, \Delta\varphi(x)\, dx \qquad (\varphi \in \mathcal{D}) .
\]
Let us apply Green's formula to the integral $\int_{r \geq \varepsilon} (1/r^{n-2})\, \Delta\varphi(x)\, dx$. Take $V = V_\varepsilon = \{x : r > \varepsilon\}$, $S = S_\varepsilon = \{x : r = \varepsilon\}$, $f(x) = 1/r^{n-2}$ $(r > \varepsilon)$. Then $\partial/\partial\nu = \partial/\partial r$. Furthermore $\Delta f = 0$ on $V_\varepsilon$. We thus obtain
\[
\int_{V_\varepsilon} \frac{1}{r^{n-2}}\, \Delta\varphi(x)\, dx
= - \int_{r=\varepsilon} \frac{1}{r^{n-2}}\, \frac{\partial\varphi}{\partial r}\, ds
+ \int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, \varphi(s)\, ds .
\]
One has
\[
\Big| \frac{\partial\varphi}{\partial r} \Big| = \Big| \sum_{i=1}^{n} \frac{x_i}{r}\, \frac{\partial\varphi}{\partial x_i} \Big|
\leq n\, \sup_{i,x} \Big| \frac{\partial\varphi}{\partial x_i}(x) \Big| .
\]
Hence
\[
\Big| \int_{r=\varepsilon} \frac{1}{r^{n-2}}\, \frac{\partial\varphi}{\partial r}\, ds \Big|
\leq n\, \sup_{i,x} \Big| \frac{\partial\varphi}{\partial x_i} \Big| \cdot \varepsilon\, S_{n-1} \to 0
\]
when $\varepsilon \to 0$. Moreover
\[
\int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, \varphi(s)\, ds
= \varphi(0) \int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, ds
+ \int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, \big( \varphi(s) - \varphi(0) \big)\, ds ,
\]
and
\[
\varphi(0) \int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, ds = -(n-2)\, S_{n-1}\, \varphi(0) .
\]
For the second summand, one has
\[
| \varphi(s) - \varphi(0) | \leq \|s\|\, \sqrt{n}\, \sup_{\substack{x \\ 1 \leq i \leq n}} \Big| \frac{\partial\varphi}{\partial x_i} \Big|
= \varepsilon\, \sqrt{n}\, \sup_{i,x} \Big| \frac{\partial\varphi}{\partial x_i} \Big|
\]
for $\|s\| = \varepsilon$. Hence
\[
\Big| \int_{r=\varepsilon} \frac{-(n-2)}{\varepsilon^{n-1}}\, \big( \varphi(s) - \varphi(0) \big)\, ds \Big| \leq \text{const.}\; \varepsilon .
\]
We may conclude
\[
\Big\langle \Delta \frac{1}{r^{n-2}}, \varphi \Big\rangle = -(n-2)\, S_{n-1}\, \varphi(0) ,
\]
or
\[
\Delta \frac{1}{r^{n-2}} = -(n-2)\, \frac{2\pi^{n/2}}{\Gamma(n/2)}\, \delta \qquad (n \neq 2) .
\]
In a similar way one can show in $\mathbb{R}^2$
\[
\Delta(\log r) = 2\pi\, \delta \qquad (n = 2) .
\]
Special Cases
\[
n = 1: \quad \Delta |x| = \frac{d^2}{dx^2}\, |x| = 2\delta ; \qquad
n = 3: \quad \Delta \frac{1}{r} = -4\pi\, \delta .
\]
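The case $n = 3$, $\Delta(1/r) = -4\pi\delta$, can be tested numerically: for a radial test function the pairing $\langle 1/r, \Delta\varphi \rangle$ reduces to a one-dimensional integral in spherical coordinates. The sketch below uses $\varphi(x) = e^{-r^2}$, which is not compactly supported but decays fast enough for the illustration; its Laplacian is $(4r^2 - 6)e^{-r^2}$:

```python
import numpy as np

def trap(y, x):
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

# <Delta(1/r), phi> = integral of (1/r) * Delta(phi) over R^3, which should equal
# -4*pi*phi(0) = -4*pi.  In spherical coordinates the volume integral becomes
# 4*pi * int_0^inf r^2 * (1/r) * (4r^2 - 6) * exp(-r^2) dr.
r = np.linspace(1e-6, 10.0, 400001)
integrand = 4*np.pi * r**2 * (1.0/r) * (4*r**2 - 6) * np.exp(-r**2)
lhs = trap(integrand, r)
print(lhs, -4*np.pi)   # both approximately -12.566
```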
3.6 Exercises

Exercise 3.6. In the $(x, y)$-plane, consider the square $ABCD$ with
\[
A = (1, 1) , \quad B = (2, 0) , \quad C = (3, 1) , \quad D = (2, 2) .
\]
Let $T$ be the distribution defined by the function that is equal to $1$ on the square and zero elsewhere. Compute in the sense of distributions
\[
\frac{\partial^2 T}{\partial y^2} - \frac{\partial^2 T}{\partial x^2} .
\]
Exercise 3.7. Show that, in the sense of distributions,
\[
\Big( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \Big)\, \frac{1}{x + iy} = 2\pi\, \delta .
\]
Exercise 3.8. Show that, in the sense of distributions, $\Delta(\log r) = 2\pi\, \delta$ in $\mathbb{R}^2$.

Exercise 3.9. In $\mathbb{R}^n$ one has, in the sense of distributions, for $n \geq 3$,
\[
\Delta \frac{1}{r^m} = \Big\{ \Delta \frac{1}{r^m} \Big\} \qquad \text{if } m < n - 2 .
\]
Prove it.
Exercise 3.10. In the $(x, t)$-plane, let
\[
E(x, t) = \frac{Y(t)}{2\sqrt{\pi t}}\, e^{-x^2/(4t)} .
\]
Show that, in the sense of distributions,
\[
\frac{\partial E}{\partial t} - \frac{\partial^2 E}{\partial x^2} = \delta(x)\, \delta(t) .
\]
Further Reading Most of the material in this chapter can be found in [9], where many more exercises are available. Harmonic functions depending only on the radius are treated in several books on physics. Harmonic polynomials form the other side of the coin: they depend only on the angles (in other words, their domain of definition is the unit sphere). These so-called spherical harmonics arise frequently in quantum mechanics. For an account of spherical harmonics, see [4], Section 7.3.
4 Multiplication and Convergence of Distributions Summary In this short chapter we deﬁne two important notions: multiplication of a distribution with a C ∞ function and a convergence principle for distributions. Multiplication with a C ∞ function is a crucial ingredient when considering differential equations with variable coefﬁcients.
Learning Targets Understanding the deﬁnition of multiplication of a distribution with a C ∞ function and its consequences. Getting familiar with the notion of convergence of a sequence or series of distributions.
4.1 Multiplication with a $C^\infty$ Function

Multiplication of two arbitrary distributions is not possible, but multiplication of a distribution with a $C^\infty$ function can be defined. Let $\alpha$ be a $C^\infty$ function on $\mathbb{R}^n$. Then it is easily verified that the mapping $\varphi \mapsto \alpha\varphi$ is a continuous linear mapping of $\mathcal{D}(\mathbb{R}^n)$ into itself: if $\varphi_j$ converges to $\varphi$ in $\mathcal{D}(\mathbb{R}^n)$, then $\alpha\varphi_j$ converges to $\alpha\varphi$ in $\mathcal{D}(\mathbb{R}^n)$ when $j$ tends to infinity. Therefore, given a distribution $T$ on $\mathbb{R}^n$, the mapping
\[
\varphi \mapsto \langle T, \alpha\varphi \rangle \qquad (\varphi \in \mathcal{D}(\mathbb{R}^n))
\]
is again a distribution on $\mathbb{R}^n$. We thus can define:

Definition 4.1. Let $\alpha \in \mathcal{E}(\mathbb{R}^n)$ and $T \in \mathcal{D}'(\mathbb{R}^n)$. Then the distribution $\alpha T$ is defined by
\[
\langle \alpha T, \varphi \rangle = \langle T, \alpha\varphi \rangle \qquad (\varphi \in \mathcal{D}(\mathbb{R}^n)) .
\]
If $f$ is a locally integrable function, then clearly $\alpha f = \{\alpha f\}$ in the common notation.

Examples on the Real Line
1. $\alpha\delta = \alpha(0)\,\delta$, in particular $x\delta = 0$.
2. $\alpha\delta' = \alpha(0)\,\delta' - \alpha'(0)\,\delta$.
The following result is important; it provides a kind of converse of the first example.

Theorem 4.2. Let $T$ be a distribution on $\mathbb{R}$. If $xT = 0$ then $T = c\,\delta$, $c$ being a constant.
Proof. Let $\varphi \in \mathcal{D}$ and let $\chi \in \mathcal{D}$ be such that $\chi = 1$ near $x = 0$. Define
\[
\psi(x) =
\begin{cases}
\dfrac{\varphi(x) - \varphi(0)\, \chi(x)}{x} & \text{if } x \neq 0 , \\[1ex]
\varphi'(0) & \text{if } x = 0 .
\end{cases}
\]
Then $\psi \in \mathcal{D}$ (use the Taylor expansion of $\varphi$ at $x = 0$, cf. Section 5.3) and $\varphi(x) = \varphi(0)\, \chi(x) + x\, \psi(x)$. Hence $\langle T, \varphi \rangle = \varphi(0)\, \langle T, \chi \rangle$ $(\varphi \in \mathcal{D})$, therefore $T = c\,\delta$ with $c = \langle T, \chi \rangle$.

Theorem 4.3. Let $T \in \mathcal{D}'(\mathbb{R}^n)$ and $\alpha \in \mathcal{E}(\mathbb{R}^n)$. Then one has for all $1 \leq i \leq n$,
\[
\frac{\partial}{\partial x_i}(\alpha T) = \frac{\partial \alpha}{\partial x_i}\, T + \alpha\, \frac{\partial T}{\partial x_i} .
\]
The proof is left to the reader.
4.2 Exercises

Exercise 4.4.
a. Show, in a similar way as in the proof of Theorem 4.2, that $T' = 0$ implies that $T = \text{const}$.
b. Let $T$ and $T'$ be regular distributions, $T = T_f$, $T' = T_g$. Assume both $f$ and $g$ are continuous. Show that $f$ is continuously differentiable and $f' = g$.

Exercise 4.5. Determine all solutions of the distribution equation $xT = 1$.
4.3 Convergence in $\mathcal{D}'$

Definition 4.6. A sequence of distributions $\{T_j\}$ converges to a distribution $T$ if $\langle T_j, \varphi \rangle$ converges to $\langle T, \varphi \rangle$ for every $\varphi \in \mathcal{D}$.

This convergence corresponds to the so-called weak topology on $\mathcal{D}'$, known from functional analysis. The space $\mathcal{D}'$ is sequentially complete: if for a given sequence of distributions $\{T_j\}$ the scalars $\langle T_j, \varphi \rangle$ tend to a limit $\langle T, \varphi \rangle$ for each $\varphi \in \mathcal{D}$ when $j$ tends to infinity, then $T \in \mathcal{D}'$. This important result follows from the Banach–Steinhaus theorem from functional analysis, see [1], Chapter III, §3, n° 6, Théorème 2. A detailed proof of this result is contained in Section 10.1.

A series $\sum_{j=0}^{\infty} T_j$ of distributions $T_j$ is said to be convergent if the sequence $\{S_N\}$ with $S_N = \sum_{j=0}^{N} T_j$ converges in $\mathcal{D}'$.

Theorem 4.7. If the locally integrable functions $f_j$ converge to a locally integrable function $f$ for $j \to \infty$, pointwise almost everywhere, and if there is a locally integrable function $g \geq 0$ such that $|f_j(x)| \leq g(x)$ almost everywhere for all $j$, then the sequence of distributions $f_j$ converges to the distribution $f$.

This follows immediately from Lebesgue's theorem on dominated convergence.

Theorem 4.8. Differentiation is a continuous operation in $\mathcal{D}'$: if $\{T_j\}$ converges to $T$, then $\{D^k T_j\}$ converges to $D^k T$ for every $n$-tuple $k$.

The proof is straightforward.

Examples
1. Let
\[
f_\varepsilon(x) =
\begin{cases}
0 & \text{if } \|x\| > \varepsilon , \\[0.5ex]
\dfrac{n}{\varepsilon^n\, S_{n-1}} & \text{if } \|x\| \leq \varepsilon
\end{cases}
\qquad (x \in \mathbb{R}^n) .
\]
Then $f_\varepsilon$ converges to the $\delta$-function when $\varepsilon \downarrow 0$.
2. The functions
\[
f_\varepsilon(x) = \frac{1}{\varepsilon^n\, (2\pi)^{n/2}}\, e^{-\|x\|^2/(2\varepsilon^2)} \qquad (x \in \mathbb{R}^n)
\]
converge to $\delta$ when $\varepsilon \downarrow 0$.
4.4 Exercises

Exercise 4.9.
a. Show that for every $\varphi \in C^\infty(\mathbb{R})$ and every interval $(a, b)$,
\[
\int_a^b \sin \lambda x\; \varphi(x)\, dx \quad \text{and} \quad \int_a^b \cos \lambda x\; \varphi(x)\, dx
\]
converge to zero when $\lambda$ tends to infinity.
b. Derive from this, by writing $\varphi(x) = \varphi(0) + \big[\varphi(x) - \varphi(0)\big]$, that the distributions $(\sin \lambda x)/x$ tend to $\pi\delta$ when $\lambda$ tends to infinity.
c. Show that, for real $\lambda$, the mapping
\[
\varphi \mapsto \mathrm{Pf} \int_{-\infty}^{\infty} \frac{\cos \lambda x}{x}\, \varphi(x)\, dx
= \lim_{\varepsilon \downarrow 0} \int_{|x| \geq \varepsilon} \frac{\cos \lambda x}{x}\, \varphi(x)\, dx \qquad (\varphi \in \mathcal{D})
\]
defines a distribution. Show also that these distributions tend to zero when $\lambda$ tends to infinity.
Exercise 4.10. Show that multiplication by a $C^\infty$ function defines a continuous linear mapping in $\mathcal{D}'$. Determine the limits in $\mathcal{D}'$, when $a \downarrow 0$, of
\[
\frac{a}{x^2 + a^2} \quad \text{and} \quad \frac{ax}{x^2 + a^2} .
\]

Further Reading This chapter is self-contained and no reference to other literature is necessary.
5 Distributions with Compact Support Summary Distributions with compact support form an important subset of all distributions. In this chapter we reveal their structure and properties. In particular the structure of distributions supported at the origin is shown.
Learning Targets Understanding the structure of distributions supported at the origin.
5.1 Definition and Properties

We recall the space $\mathcal{E}$ of $C^\infty$ functions on $\mathbb{R}^n$. The following convergence principle is defined in $\mathcal{E}$:

Definition 5.1. A sequence $\varphi_j \in \mathcal{E}$ tends to $\varphi \in \mathcal{E}$ if $D^k \varphi_j$ tends to $D^k \varphi$ $(j \to \infty)$ for all $n$-tuples $k$ of nonnegative integers, uniformly on compact subsets of $\mathbb{R}^n$. Thus, given a compact subset $K$, one has
\[
\sup_{x \in K} \big| D^k \varphi_j(x) - D^k \varphi(x) \big| \to 0 \qquad (j \to \infty)
\]
for all $n$-tuples $k$ of nonnegative integers.

In a similar way one defines a convergence principle in the space $\mathcal{E}^m$ of $C^m$ functions on $\mathbb{R}^n$ by restricting $k$ to $|k| \leq m$. Notice that $\mathcal{D} \subset \mathcal{E}$ and the injection is linear and continuous: if $\varphi_j, \varphi \in \mathcal{D}$ and $\varphi_j$ tends to $\varphi$ in $\mathcal{D}$, then $\varphi_j$ tends to $\varphi$ in $\mathcal{E}$. Let $\{\alpha_j\}$ be a sequence of functions in $\mathcal{D}$ with the property $\alpha_j(x) = 1$ on the ball $\{x \in \mathbb{R}^n : \|x\| \leq j\}$ and let $\varphi \in \mathcal{E}$. Then $\alpha_j \varphi \in \mathcal{D}$ and $\alpha_j \varphi$ tends to $\varphi$ $(j \to \infty)$ in $\mathcal{E}$. We may conclude that $\mathcal{D}$ is a dense subspace of $\mathcal{E}$.

Denote by $\mathcal{E}'$ the space of continuous linear forms $L$ on $\mathcal{E}$. Continuity is defined as in the case of distributions: if $\varphi_j$ tends to $\varphi$ in $\mathcal{E}$, then $L(\varphi_j)$ tends to $L(\varphi)$. Let $T$ be a distribution with compact support and let $\alpha \in \mathcal{D}$ be such that $\alpha(x) = 1$ for $x$ in a neighborhood of $\mathrm{Supp}\, T$. We can then extend $T$ to a linear form on $\mathcal{E}$ by $\langle L, \varphi \rangle = \langle T, \alpha\varphi \rangle$ $(\varphi \in \mathcal{E})$. This extension is obviously independent of the choice of $\alpha$. Moreover $L$ is a continuous linear form on $\mathcal{E}$ and $L = T$ on $\mathcal{D}$. Because $\mathcal{D}$ is a dense subspace of $\mathcal{E}$, this extension $L$ of $T$ to an element of $\mathcal{E}'$ is unique. We have:

Theorem 5.2. Every distribution with compact support can be uniquely extended to a continuous linear form on $\mathcal{E}$.

Conversely, any $L \in \mathcal{E}'$ is completely determined by its restriction to $\mathcal{D}$. This restriction is a distribution since $\mathcal{D}$ is continuously embedded into $\mathcal{E}$.

Theorem 5.3. The restriction of $L \in \mathcal{E}'$ to $\mathcal{D}$ is a distribution with compact support.
Proof. The proof is by contradiction. Assume that the restriction $T$ of $L$ does not have compact support. Then we can find for each natural number $m$ a function $\varphi_m \in \mathcal{D}$ with $\mathrm{Supp}\, \varphi_m \cap \{x : \|x\| < m\} = \emptyset$ and $\langle T, \varphi_m \rangle = 1$. Clearly $\varphi_m$ tends to $0$ in $\mathcal{E}$, so $L(\varphi_m)$ tends to $0$ $(m \to \infty)$, but $L(\varphi_m) = \langle T, \varphi_m \rangle = 1$ for all $m$.

Similar to Proposition 2.3 one has:

Proposition 5.4. Let $T$ be a distribution. Then $T$ has compact support if and only if there exist a constant $C > 0$, a compact subset $K$ and a positive integer $m$ such that
\[
| \langle T, \varphi \rangle | \leq C \sum_{|k| \leq m}\, \sup_{x \in K} | D^k \varphi(x) |
\]
for all $\varphi \in \mathcal{D}$.

Observe that if $T$ satisfies the above inequality, then $\mathrm{Supp}\, T \subset K$. We may also conclude that a distribution with compact support has finite order and is summable.
5.2 Distributions Supported at the Origin

Theorem 5.5. Let $T$ be a distribution on $\mathbb{R}^n$ supported at the origin. Then there exist an integer $m$ and complex scalars $c_k$ such that $T = \sum_{|k| \leq m} c_k D^k \delta$.

Proof. We know that $T$ has finite order, say $m$. By Proposition 2.7 we have $\langle T, \varphi \rangle = 0$ for all $\varphi \in \mathcal{D}$ with $D^k \varphi(0) = 0$ for $|k| \leq m$. The same then holds for the unique extension of $T$ to $\mathcal{E}$. Now write for $\varphi \in \mathcal{D}$, applying Taylor's formula (see Section 5.3),
\[
\varphi(x) = \sum_{|k| \leq m} \frac{x^k}{k!}\, D^k \varphi(0) + \psi(x) \qquad (x \in \mathbb{R}^n) ,
\]
defining $x^k = x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$, $k! = k_1!\, k_2! \cdots k_n!$, and taking $\psi \in \mathcal{E}$ such that $D^k \psi(0) = 0$ for $|k| \leq m$. Then we obtain
\[
\langle T, \varphi \rangle = \sum_{|k| \leq m} \Big\langle T, \frac{(-1)^{|k|}\, x^k}{k!} \Big\rangle\, \langle D^k \delta, \varphi \rangle \qquad (\varphi \in \mathcal{D}) ,
\]
or $T = \sum_{|k| \leq m} c_k D^k \delta$ with
\[
c_k = \Big\langle T, \frac{(-1)^{|k|}\, x^k}{k!} \Big\rangle .
\]
5.3 Taylor's Formula for $\mathbb{R}^n$

We start with Taylor's formula for $\mathbb{R}$. Let $f \in \mathcal{D}(\mathbb{R})$. Then we can write for any $m \in \mathbb{N}$
\[
f(x) = \sum_{j=0}^{m} \frac{x^j}{j!}\, f^{(j)}(0) + \frac{1}{m!} \int_0^x (x - t)^m f^{(m+1)}(t)\, dt ,
\]
which can be proved by integration by parts. The change of variable $t \to xs$ in the integral gives for the last term
\[
\frac{x^{m+1}}{m!} \int_0^1 (1 - s)^m f^{(m+1)}(xs)\, ds .
\]
This term is of the form $(x^{m+1}/m!)\, g(x)$ in which, clearly, $g$ is a $C^\infty$ function on $\mathbb{R}$, hence an element of $\mathcal{E}(\mathbb{R})$. In particular
\[
f(1) = \sum_{j=0}^{m} \frac{f^{(j)}(0)}{j!} + \frac{1}{m!} \int_0^1 (1 - s)^m f^{(m+1)}(s)\, ds . \tag{5.1}
\]
Now let $\varphi \in \mathcal{D}(\mathbb{R}^n)$ and let $t \in \mathbb{R}$. Then we can easily prove by induction on $j$,
\[
\frac{1}{j!}\, \frac{d^j}{dt^j}\, \varphi(xt) = \sum_{|k| = j} \frac{x^k}{k!}\, \big( D^k \varphi \big)(xt) .
\]
Taking $f(t) = \varphi(xt)$ in formula (5.1), we obtain Taylor's formula
\[
\varphi(x) = \sum_{|k| \leq m} \frac{x^k}{k!}\, \big( D^k \varphi \big)(0) + \sum_{|k| = m+1} \frac{x^k}{k!}\, \psi_k(x)
\]
with
\[
\psi_k(x) = (m + 1) \int_0^1 (1 - t)^m \big( D^k \varphi \big)(xt)\, dt .
\]
Clearly $\psi_k \in \mathcal{E}(\mathbb{R}^n)$.
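Formula (5.1) lends itself to a direct numerical check, e.g. with $f = \cos$ and $m = 3$, so that $f^{(4)} = \cos$ (the particular choices are ours):

```python
import math
import numpy as np

def trap(y, x):
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

m = 3
derivs_at_0 = [1.0, 0.0, -1.0, 0.0]     # cos, -sin, -cos, sin evaluated at 0
s = np.linspace(0.0, 1.0, 100001)
# remainder: (1/m!) * int_0^1 (1-s)^m f^(m+1)(s) ds, with f^(4) = cos
remainder = trap((1 - s)**m * np.cos(s), s) / math.factorial(m)
total = sum(d / math.factorial(j) for j, d in enumerate(derivs_at_0)) + remainder
print(total, math.cos(1.0))   # both approximately 0.5403
```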
5.4 Structure of a Distribution*

Theorem 5.6. Let $T$ be a distribution on $\mathbb{R}^n$ and $K$ an open bounded subset of $\mathbb{R}^n$. Then there exist a continuous function $f$ on $\mathbb{R}^n$ and an $n$-tuple $k$ (both depending on $K$) such that $T = D^k f$ on $K$.

Proof. Without loss of generality we may assume that $K$ is contained in the cube $Q = \{(x_1, \ldots, x_n) : |x_i| < 1 \text{ for all } i\}$. Let us denote by $\mathcal{D}(Q)$ the space of all $\varphi \in \mathcal{D}(\mathbb{R}^n)$ with $\mathrm{Supp}\, \varphi \subset Q$. Then, by the mean value theorem, we have
\[
| \varphi(x) | \leq \sup \Big| \frac{\partial \varphi}{\partial x_i} \Big| \qquad (x \in \mathbb{R}^n)
\]
for all $\varphi \in \mathcal{D}(Q)$ and all $i = 1, \ldots, n$. Let $R = \partial/\partial x_1 \cdots \partial/\partial x_n$ and let us denote for $y \in Q$
\[
Q_y = \{ (x_1, \ldots, x_n) : -1 < x_i \leq y_i \text{ for all } i \} .
\]
We then have the following integral representation for $\varphi \in \mathcal{D}(Q)$:
\[
\varphi(y) = \int_{Q_y} (R\varphi)(x)\, dx .
\]
For any positive integer $m$ set
\[
\| \varphi \|_m = \sup_{|k| \leq m,\; x} | D^k \varphi(x) | .
\]
Then we have for all $\varphi \in \mathcal{D}(Q)$ the inequalities
\[
\| \varphi \|_m \leq \sup | R^m \varphi | \leq \int_Q | R^{m+1} \varphi(x) |\, dx .
\]
Because $T$ is a distribution, there exist a positive constant $c$ and a positive integer $m$ such that for all $\varphi \in \mathcal{D}(Q)$
\[
| \langle T, \varphi \rangle | \leq c\, \| \varphi \|_m \leq c \int_Q | R^{m+1} \varphi(x) |\, dx .
\]
Let us introduce the function $T_1$, defined on the image of $R^{m+1}$ by
\[
T_1\big( R^{m+1} \varphi \big) = \langle T, \varphi \rangle .
\]
This is a well-defined linear function on $R^{m+1}(\mathcal{D}(Q))$. It is also continuous in the following sense:
\[
| T_1(\psi) | \leq c \int_Q | \psi(x) |\, dx .
\]
Therefore, using the Hahn–Banach theorem, we can extend $T_1$ to a continuous linear form on $L^1(Q)$. So there exists an integrable function $g$ on $Q$ such that
\[
\langle T, \varphi \rangle = T_1\big( R^{m+1} \varphi \big) = \int_Q g(x)\, \big( R^{m+1} \varphi \big)(x)\, dx
\]
for all $\varphi \in \mathcal{D}(Q)$. Setting $g(x) = 0$ outside $Q$, defining
\[
f(y) = \int_{-1}^{y_1} \cdots \int_{-1}^{y_n} g(x)\, dx_1 \ldots dx_n ,
\]
and using integration by parts, we obtain
\[
\langle T, \varphi \rangle = (-1)^n \int_{\mathbb{R}^n} f(x)\, \big( R^{m+2} \varphi \big)(x)\, dx
\]
for all $\varphi \in \mathcal{D}(Q)$. Hence $T = (-1)^{n+m+2}\, R^{m+2} f$ on $K$.

For distributions with compact support we have a global result.
Theorem 5.7. Let $T$ be a distribution with compact support. Then there exist finitely many continuous functions $f_k$ on $\mathbb{R}^n$, indexed by $n$-tuples $k$, such that
\[
T = \sum_k D^k f_k .
\]
Moreover, the support of each $f_k$ can be chosen in an arbitrary neighborhood of $\mathrm{Supp}\, T$.

Proof. Let $K$ be an open bounded set such that $\mathrm{Supp}\, T \subset K$. Select $\chi \in \mathcal{D}(\mathbb{R}^n)$ such that $\chi(x) = 1$ on a neighborhood of $\mathrm{Supp}\, T$ and $\mathrm{Supp}\, \chi \subset K$. Then by Theorem 5.6 there is a continuous function $f$ on $\mathbb{R}^n$ such that $T = D^l f$ on $K$ for some $n$-tuple $l$. Hence
\[
\langle T, \varphi \rangle = \langle T, \chi\varphi \rangle = (-1)^{|l|}\, \big\langle f, D^l (\chi\varphi) \big\rangle = \sum_{k \leq l} \langle D^k f_k, \varphi \rangle
\]
for suitable continuous functions $f_k$ with $\mathrm{Supp}\, f_k \subset K$ and for all $\varphi \in \mathcal{D}(\mathbb{R}^n)$.
Further Reading The structure theorem in Section 5.4, whose proof can be omitted at first reading, is due to M. Pevzner. Several other proofs are possible, see Section 7.7 for example. See also [10]. Distributions with compact support occur at several places later in this book.
6 Convolution of Distributions

Summary This chapter is devoted to the convolution product of distributions, a very useful tool with numerous applications in the theory of differential and integral equations. Several examples, some related to physics, are discussed. The symbolic calculus of Heaviside, one of the oldest applications of the theory of distributions, is discussed in detail.
Learning Targets Getting familiar with the notion of convolution product of distributions. Getting an impression of the impact on the theory of differential and integral equations.
6.1 Tensor Product of Distributions

Set $X = \mathbb{R}^m$, $Y = \mathbb{R}^n$, $Z = X \times Y$ $(\simeq \mathbb{R}^{m+n})$. If $f$ is a function on $X$ and $g$ one on $Y$, then we define the following function on $Z$:
\[
(f \otimes g)(x, y) = f(x)\, g(y) \qquad (x \in X,\ y \in Y) .
\]
We call $f \otimes g$ the tensor product of $f$ and $g$. If $f$ is locally integrable on $X$ and $g$ locally integrable on $Y$, then $f \otimes g$ is locally integrable on $Z$. Moreover one has for $u \in \mathcal{D}(X)$ and $v \in \mathcal{D}(Y)$,
\[
\langle f \otimes g, u \otimes v \rangle = \langle f, u \rangle\, \langle g, v \rangle .
\]
If $\varphi \in \mathcal{D}(X \times Y)$ is arbitrary, not necessarily of the form $u \otimes v$, then, by Fubini's theorem,
\[
\langle f \otimes g, \varphi \rangle = \langle f(x) \otimes g(y), \varphi(x, y) \rangle \ \text{(notation)}
= \big\langle f(x), \langle g(y), \varphi(x, y) \rangle \big\rangle = \big\langle g(y), \langle f(x), \varphi(x, y) \rangle \big\rangle .
\]
The following proposition, of which we omit the technical proof (see [10], Chapter IV, §§ 2, 3, 4), is important for the theory we shall develop.

Proposition 6.1. Let $S$ be a distribution on $X$ and $T$ one on $Y$. There exists one and only one distribution $W$ on $X \times Y$ such that
\[
\langle W, u \otimes v \rangle = \langle S, u \rangle\, \langle T, v \rangle
\]
for all $u \in \mathcal{D}(X)$, $v \in \mathcal{D}(Y)$. $W$ is called the tensor product of $S$ and $T$ and is denoted by $W = S \otimes T$. It has the following properties. For fixed $\varphi \in \mathcal{D}(X \times Y)$ set $\theta(x) = \langle T_y, \varphi(x, y) \rangle$. Then $\theta \in \mathcal{D}(X)$ and one has
\[
\langle W, \varphi \rangle = \langle S, \theta \rangle = \big\langle S_x, \langle T_y, \varphi(x, y) \rangle \big\rangle ,
\]
and also
\[
\langle W, \varphi \rangle = \big\langle T_y, \langle S_x, \varphi(x, y) \rangle \big\rangle .
\]
Proposition 6.2. $\mathrm{Supp}\, S \otimes T = \mathrm{Supp}\, S \times \mathrm{Supp}\, T$.

Proof. Set $A = \mathrm{Supp}\, S$, $B = \mathrm{Supp}\, T$.
(i) Let $\varphi \in \mathcal{D}(X \times Y)$ be such that $\mathrm{Supp}\, \varphi \cap (A \times B) = \emptyset$. There are neighborhoods $A'$ of $A$ and $B'$ of $B$ such that $\mathrm{Supp}\, \varphi \cap (A' \times B') = \emptyset$ too. For $x \in A'$ the function $y \mapsto \varphi(x, y)$ vanishes on $B'$, so, in the above notation, $\theta(x) = 0$ for $x \in A'$, hence $\langle S, \theta \rangle = 0$ and thus $\langle W, \varphi \rangle = 0$. We may conclude $\mathrm{Supp}\, W \subset A \times B$.
(ii) Let us prove the converse. Let $(x, y) \notin \mathrm{Supp}\, W$. Then we can find an open "rectangle" $P \times Q$ with center $(x, y)$ that does not intersect $\mathrm{Supp}\, W$. Now assume that for some $u \in \mathcal{D}(X)$ with $\mathrm{Supp}\, u \subset P$ one has $\langle S, u \rangle = 1$. Then $\langle T, v \rangle = 0$ for all $v \in \mathcal{D}(Y)$ with $\mathrm{Supp}\, v \subset Q$. Hence $Q \cap B = \emptyset$, and therefore $(x, y) \notin A \times B$.

Examples and More Properties
1. $D_x^k D_y^l\, (S_x \otimes T_y) = D_x^k S_x \otimes D_y^l T_y$.
2. $\delta_x \otimes \delta_y = \delta_{(x,y)}$.
3. A function on $X \times Y$ is independent of $x$ if it is of the form $1_x \otimes g(y)$, with $g$ a function on $Y$. Similarly, a distribution is said to be independent of $x$ if it is of the form $1_x \otimes T_y$ with $T$ a distribution on $Y$. One then has
\[
\langle 1_x \otimes T_y, \varphi(x, y) \rangle = \int_X \langle T_y, \varphi(x, y) \rangle\, dx = \Big\langle T_y, \int_X \varphi(x, y)\, dx \Big\rangle
\]
for $\varphi \in \mathcal{D}(X \times Y)$. We remark that the tensor product is associative:
\[
\big( S_x \otimes T_y \big) \otimes U_\xi = S_x \otimes \big( T_y \otimes U_\xi \big) .
\]
Observe that
\[
\langle S_x \otimes T_y \otimes U_\xi,\, u(x)\, v(y)\, t(\xi) \rangle = \langle S, u \rangle\, \langle T, v \rangle\, \langle U, t \rangle .
\]
4. Define
\[
Y(x_1, \ldots, x_n) =
\begin{cases}
1 & \text{if } x_i \geq 0\ (1 \leq i \leq n) \\
0 & \text{elsewhere.}
\end{cases}
\]
Then $Y(x_1, \ldots, x_n) = Y(x_1) \otimes \cdots \otimes Y(x_n)$. Moreover
\[
\frac{\partial^n Y}{\partial x_1 \cdots \partial x_n} = \delta_{(x_1, \ldots, x_n)} .
\]
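The last identity says $\langle \partial^n Y / \partial x_1 \cdots \partial x_n, \varphi \rangle = \varphi(0)$. For $n = 2$ this amounts to $\int_0^\infty \int_0^\infty \varphi_{xy}\, dx\, dy = \varphi(0,0)$, which the sketch below verifies for $\varphi(x, y) = e^{-x^2 - y^2}$ (not compactly supported, but rapidly decaying; grid sizes are our choices):

```python
import numpy as np

def trap2(z, x, y):
    # iterated trapezoid rule on a tensor grid: first over y, then over x
    zx = np.array([float(np.sum((row[1:] + row[:-1]) / 2 * np.diff(y))) for row in z])
    return float(np.sum((zx[1:] + zx[:-1]) / 2 * np.diff(x)))

x = np.linspace(0.0, 6.0, 1001)
y = np.linspace(0.0, 6.0, 1001)
X, Y = np.meshgrid(x, y, indexing="ij")
# mixed partial of phi(x,y) = exp(-x^2 - y^2):  phi_xy = 4xy exp(-x^2 - y^2)
phi_xy = 4 * X * Y * np.exp(-X**2 - Y**2)
val = trap2(phi_xy, x, y)
print(val)   # approximately 1.0 = phi(0, 0)
```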
6.2 Convolution Product of Distributions

Let $S$ and $T$ be distributions on $\mathbb{R}^n$. We shall say that the convolution product $S * T$ of $S$ and $T$ exists if there is a "canonical" extension of $S \otimes T$ to functions of the form $(\xi, \eta) \mapsto \varphi(\xi + \eta)$ $(\varphi \in \mathcal{D})$, such that
\[
\langle S * T, \varphi \rangle = \langle S_\xi \otimes T_\eta, \varphi(\xi + \eta) \rangle \qquad (\varphi \in \mathcal{D})
\]
defines a distribution.

Comments
1. $S_\xi \otimes T_\eta$ is a distribution on $\mathbb{R}^n \times \mathbb{R}^n$.
2. The function $(\xi, \eta) \mapsto \varphi(\xi + \eta)$ $(\varphi \in \mathcal{D})$ has no compact support in general. Its support is contained in a band of the form $\{(\xi, \eta) : \xi + \eta \in K\}$ with $K$ compact, a diagonal strip in the $(\xi, \eta)$-plane.
3. The distribution $S * T$ exists for example when the intersection of $\mathrm{Supp}(S \otimes T) = \mathrm{Supp}\, S \times \mathrm{Supp}\, T$ with such bands is compact (or bounded). The continuity of $S * T$ is then a straightforward consequence of relations such as $(\partial/\partial\xi_i)\, \varphi(\xi + \eta) = (\partial\varphi/\partial\xi_i)(\xi + \eta)$. In that case $T * S$ exists also and one has $S * T = T * S$.
4. Case by case one has to decide what "canonical" means.

We may conclude:

Theorem 6.3. A sufficient condition for the existence of the convolution product $S * T$ is that for every compact subset $K \subset \mathbb{R}^n$ one has
\[
\xi \in \mathrm{Supp}\, S ,\quad \eta \in \mathrm{Supp}\, T ,\quad \xi + \eta \in K \ \Longrightarrow\ \xi \text{ and } \eta \text{ remain bounded.}
\]
The convolution product $S * T$ then has a canonical definition and is commutative: $S * T = T * S$.
1.
At least one of the distributions S and T has compact support. For example: S ∗ δ = δ ∗ S = S for all S ∈ D (Rn ).
34
Convolution of Distributions
2.
(n = 1) We shall say that a subset A ⊂ R is bounded from the left if A ⊂ (a, ∞) for some a ∈ R. Similarly, A is bounded from the right if A ⊂ (−∞, b) for some b ∈ R. If Supp S and Supp T are both bounded from the left (right), then S ∗ T exists. Example: Y ∗ Y exists, with Y , as usual, the Heaviside function.
3.
(n = 4) Let Supp S be contained in the positive light cone t 2 − x 2 − y 2 − z2 ≥ 0 ,
t ≥0,
and let Supp T be contained in t ≥ 0. Then S ∗ T exists. Verify that Theorem 6.3 applies. Theorem 6.4. Let S and T be regular distributions given by the locally integrable functions f and g . Let their supports satisfy the conditions of Theorem 6.3. Then one has (i) Rn f (x −t) g(t) dt and Rn f (t) g(x −t) dt exist for almost all x , and are equal almost everywhere to, say, h(x); (ii) h is locally integrable; (iii) S ∗ T is a function, namely h. The proof can obviously be reduced to the wellknown case of L1 functions f and g with compact support. Sometimes the conditions of Theorem 6.3 are not fulﬁlled, while the convolution product nevertheless exists. Let us give two examples. 1.
1. If $f$ and $g$ are arbitrary functions in $L^1(\mathbb{R}^n)$, then $f * g$, defined by
\[
f * g\,(x) = \int_{\mathbb{R}^n} f(t)\, g(x - t)\, dt ,
\]
exists and is an element of $L^1(\mathbb{R}^n)$ again. One has $\| f * g \|_1 \leq \| f \|_1\, \| g \|_1$.
2. If $f \in L^1(\mathbb{R}^n)$, $g \in L^\infty(\mathbb{R}^n)$, then $f * g$, defined by the same formula, exists and is an element of $L^\infty(\mathbb{R}^n)$ again. It is even a continuous function. Furthermore $\| f * g \|_\infty \leq \| f \|_1\, \| g \|_\infty$.

Let now $f$ and $g$ be locally integrable on $\mathbb{R}$, $\mathrm{Supp}\, f \subset (0, \infty)$, $\mathrm{Supp}\, g \subset (0, \infty)$. Then $f * g$ exists (see special case 2) and is a function (Theorem 6.4). One has $\mathrm{Supp}(f * g) \subset (0, \infty)$ and
\[
f * g\,(x) =
\begin{cases}
0 & \text{if } x \leq 0 , \\[0.5ex]
\displaystyle \int_0^x f(x - t)\, g(t)\, dt & \text{if } x > 0 .
\end{cases}
\]
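With $f = g = Y$ (the Heaviside function) this formula gives $Y * Y\,(x) = x$ for $x > 0$, which a discrete Riemann-sum convolution reproduces (sketch; the grid step is our choice):

```python
import numpy as np

h = 1e-3
x = np.arange(0.0, 5.0, h)
Yv = np.ones_like(x)                       # Y restricted to [0, 5)
# np.convolve computes the discrete sum; multiplying by h gives a Riemann sum
conv = np.convolve(Yv, Yv)[:len(x)] * h    # approximation of (Y*Y)(x)
print(conv[2000], x[2000])                 # both approximately 2.0: Y*Y(x) = x
```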
Examples
1. Let
\[
Y_\lambda^\alpha(x) = Y(x)\, \frac{x^{\alpha-1}}{\Gamma(\alpha)}\, e^{\lambda x} \qquad (\alpha > 0,\ \lambda \in \mathbb{C}) .
\]
One then has $Y_\lambda^\alpha * Y_\lambda^\beta = Y_\lambda^{\alpha+\beta}$, applying the following formula for the beta function:
\[
B(\alpha, \beta) = \int_0^1 t^{\alpha-1}\, (1 - t)^{\beta-1}\, dt = \frac{\Gamma(\alpha)\, \Gamma(\beta)}{\Gamma(\alpha + \beta)}
\]
(see Section 10.2).
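Taking $\lambda = 0$, the identity $Y_0^\alpha * Y_0^\beta = Y_0^{\alpha+\beta}$ at a single point $x$ is exactly the beta integral after the substitution $t \to xs$. The sketch below checks it for $\alpha = 2$, $\beta = 3$ (our choice of parameters):

```python
import math
import numpy as np

def trap(y, x):
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

alpha, beta, xv = 2, 3, 1.5
t = np.linspace(0.0, xv, 200001)
# (Y_0^alpha * Y_0^beta)(xv) = int_0^xv (xv-t)^(a-1)/Gamma(a) * t^(b-1)/Gamma(b) dt
integrand = (xv - t)**(alpha - 1) / math.gamma(alpha) * t**(beta - 1) / math.gamma(beta)
lhs = trap(integrand, t)
rhs = xv**(alpha + beta - 1) / math.gamma(alpha + beta)   # Y_0^(a+b)(xv)
print(lhs, rhs)   # both approximately 0.2109
```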
2. Let
\[
G_\sigma(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)} \qquad (\sigma > 0) .
\]
Then one has $G_\sigma * G_\tau = G_{\sqrt{\sigma^2 + \tau^2}}$. $G_\sigma$ is the Gauss distribution with expectation $0$ and variance $\sigma^2$. If $G_\sigma$ is the distribution function of the stochastic variable $x$ and $G_\tau$ the one of $y$, and if the stochastic variables are independent, then $G_\sigma * G_\tau$ is the distribution function of $x + y$. Its variance is apparently $\sigma^2 + \tau^2$.
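The Gaussian convolution identity is easy to test on a grid. In the sketch below (grid parameters are ours) the `"same"` mode of `numpy.convolve` introduces a misalignment of at most one grid step, so the comparison is only up to a small tolerance:

```python
import numpy as np

def G(x, s):
    # Gauss kernel with standard deviation s
    return np.exp(-x**2 / (2*s**2)) / (s * np.sqrt(2*np.pi))

h = 2e-3
x = np.arange(-8.0, 8.0, h)
sigma, tau = 0.6, 0.8
conv = np.convolve(G(x, sigma), G(x, tau), mode="same") * h
target = G(x, np.sqrt(sigma**2 + tau**2))   # standard deviation 1.0 here
err = float(np.max(np.abs(conv - target)))
print(err)   # small compared to the peak value ~0.40
```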
3. Let
\[
P_a(x) = \frac{1}{\pi}\, \frac{a}{x^2 + a^2} \qquad (a > 0) ,
\]
the so-called Cauchy distribution. One has $P_a * P_b = P_{a+b}$. We shall prove this later on.

Theorem 6.5. Let $\alpha \in \mathcal{E}$ and let $T$ be a distribution. Assume that the distribution $T * \{\alpha\}$ exists (e.g. if the conditions of Theorem 6.3 are satisfied). Then this distribution is regular and given by a function in $\mathcal{E}$, namely $x \mapsto \langle T_t, \alpha(x - t) \rangle$. We shall write $T * \alpha$ for this function. It is called the regularization of $T$ by $\alpha$.

Proof. We shall only consider the case where $T$ and $\{\alpha\}$ satisfy the conditions of Theorem 6.3. Then the function $x \mapsto \langle T_t, \alpha(x - t) \rangle$ exists and is $C^\infty$, since $x \mapsto \beta(x)\, \langle T_t, \alpha(x - t) \rangle = \langle T_t, \beta(x)\, \alpha(x - t) \rangle$ is $C^\infty$ for every $\beta \in \mathcal{D}$, by Proposition 6.1.
We now show that $T * \{\alpha\}$ is a function: $T * \{\alpha\} = \{T * \alpha\}$. One has for $\varphi \in \mathcal{D}$
\[
\begin{aligned}
\langle T * \{\alpha\}, \varphi \rangle
&= \langle T_\xi \otimes \alpha(\eta), \varphi(\xi + \eta) \rangle
= \big\langle T_\xi, \langle \alpha(\eta), \varphi(\xi + \eta) \rangle \big\rangle \\
&= \Big\langle T_\xi, \int_{\mathbb{R}^n} \alpha(\eta)\, \varphi(\xi + \eta)\, d\eta \Big\rangle
= \Big\langle T_\xi, \int_{\mathbb{R}^n} \alpha(x - \xi)\, \varphi(x)\, dx \Big\rangle \\
&= \big\langle T_\xi, \langle \varphi(x), \alpha(x - \xi) \rangle \big\rangle
= \langle T_\xi \otimes \varphi(x), \alpha(x - \xi) \rangle .
\end{aligned}
\]
Applying Fubini's theorem, we obtain
\[
\langle T_\xi \otimes \varphi(x), \alpha(x - \xi) \rangle
= \big\langle \varphi(x), \langle T_\xi, \alpha(x - \xi) \rangle \big\rangle
= \int_{\mathbb{R}^n} \varphi(x)\, \langle T_\xi, \alpha(x - \xi) \rangle\, dx
= \int_{\mathbb{R}^n} (T * \alpha)(x)\, \varphi(x)\, dx .
\]
Hence the result.
1.
If T has compact support or T = f with f ∈ L1 . Then T ∗ 1 = T , 1.
2.
If α is a polynomial of degree ≤ m, then T ∗α is. Let us take n = 1 for simplicity. Then xk Tt , α(k) (−t) (T ∗ α)(x) = Tt , α(x − t) = k! k≤m by using Taylor’s formula for the function x α(x −t). Here T is supposed to be a distribution with compact support or T = f with x k f ∈ L1 for k = 0, 1, . . . , m.
Deﬁnition 6.6. Let T be a distribution on Rn and a ∈ Rn . The translation τa T of T over a is deﬁned by τa T , ϕ = Tξ , ϕ(ξ + a) (ϕ ∈ D) . Proposition 6.7. One has the following formulae: (i) δ ∗ T = T for any T ∈ D (Rn ); (ii) δ(a) ∗ T = τ(a) T ; δ(a) ∗ δ(b) = δ(a+b) ; (iii) (n = 1) δ ∗ T = T for any T ∈ D .
The proof is left to the reader. In a similar way one has $\delta^{(m)} * T = T^{(m)}$ on $\mathbb{R}$, and, if $D$ is a differential operator on $\mathbb{R}^n$ with constant coefficients, then $D\delta * T = DT$ for all $T \in \mathcal{D}'(\mathbb{R}^n)$. In particular, if
\[
\Delta = \Delta_n = \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2} \qquad \text{(Laplace operator in } \mathbb{R}^n\text{)}
\]
and
\[
\square = \frac{1}{c^2}\, \frac{\partial^2}{\partial t^2} - \Delta_3 \qquad \text{(differential operator of d'Alembert in } \mathbb{R}^4\text{)} ,
\]
then $\Delta\delta * T = \Delta T$ and $\square\delta * T = \square T$.

Proposition 6.8. Let $S_j$ tend to $S$ $(j \to \infty)$ in $\mathcal{D}'(\mathbb{R}^n)$ and let $T \in \mathcal{D}'(\mathbb{R}^n)$. Then $S_j * T$ tends to $S * T$ $(j \to \infty)$ if either $T$ has compact support or the $\mathrm{Supp}\, S_j$ are uniformly bounded in $\mathbb{R}^n$.

The proof is easy and left to the reader. As an application, consider a function $\varphi \in \mathcal{D}(\mathbb{R}^n)$ satisfying $\varphi \geq 0$, $\int_{\mathbb{R}^n} \varphi(x)\, dx = 1$, and set $\varphi_k(x) = k^n \varphi(kx)$ for $k = 1, 2, \ldots$. Then $\delta = \lim_{k\to\infty} \varphi_k$ and $\mathrm{Supp}\, \varphi_k$ remains uniformly bounded in $\mathbb{R}^n$. Hence for all $T \in \mathcal{D}'(\mathbb{R}^n)$ one has
\[
T = T * \delta = \lim_{k\to\infty} T * \varphi_k .
\]
Notice that $T * \varphi_k$ is a $C^\infty$ function, so any distribution is the limit, in the sense of distributions, of a sequence of $C^\infty$ functions of the form $T * \varphi$ with $\varphi \in \mathcal{D}$.

Now take $n = 1$. Then $\delta$ is also the limit of a sequence of polynomials. This follows from the Weierstrass theorem: any continuous function on a closed interval can be uniformly approximated by polynomials. Select the sequence of polynomials as follows. Write (as above) $\delta = \lim_{k\to\infty} \varphi_k$ with $\varphi_k \in \mathcal{D}$. For each $k$ choose, by the Weierstrass theorem, a polynomial $p_k$ such that $\sup_{x \in [-k,k]} | \varphi_k(x) - p_k(x) | < 1/k$. Then $\delta = \lim_{k\to\infty} p_k$. More generally, if $T \in \mathcal{D}'(\mathbb{R})$ has compact support, then, by previous results, $T$ is the limit of a sequence of polynomials (because $T * \alpha$ is a polynomial for any polynomial $\alpha$). We just mention, without proof, that this result can be extended to $n > 1$.

Let now $f \in L^1(\mathbb{R}^n)$ and let again $\{\varphi_k\}$ be the sequence of functions considered above. Then $f = \lim_{k\to\infty} f * \varphi_k$ in the sense of distributions. One can however show a much stronger result in this case, with important implications.

Lemma 6.9. Let $f \in L^1(\mathbb{R}^n)$. Then one can find for every $\varepsilon > 0$ a neighborhood $V$ of $x = 0$ in $\mathbb{R}^n$ such that $\int_{\mathbb{R}^n} | f(x - y) - f(x) |\, dx < \varepsilon$ for all $y \in V$.

Proof. Approximate $f$ in $L^1$ norm by a continuous function $\psi$ with compact support. Then the lemma follows from the uniform continuity of $\psi$.
Proposition 6.10. Let $f \in L^1(\mathbb{R}^n)$. Then $f = \lim_{k\to\infty} f * \varphi_k$ in $L^1$ norm.

Proof. One has
\[
\| f - f * \varphi_k \|_1
= \int_{\mathbb{R}^n} \Big| \int_{\mathbb{R}^n} \big( f(x) - f(x - y) \big)\, \varphi_k(y)\, dy \Big|\, dx
\leq \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} | f(x) - f(x - y) |\, \varphi_k(y)\, dx\, dy .
\]
Given $\varepsilon > 0$, this expression is less than $\varepsilon$ for $k$ large enough by Lemma 6.9. Hence the result.

Corollary 6.11. Let $f \in L^1(\mathbb{R}^n)$. If $f = 0$ as a distribution, then $f = 0$ almost everywhere.

Proof. If $f = 0$ as a distribution, then $f * \varphi_k = 0$ as a distribution for all $k$, because $\langle f * \varphi_k, \varphi \rangle = \langle f, \check{\varphi}_k * \varphi \rangle$ for all $\varphi \in \mathcal{D}(\mathbb{R}^n)$, defining $\check{\varphi}_k(x) = \varphi_k(-x)$ $(x \in \mathbb{R}^n)$. Since $f * \varphi_k$ is a continuous function, we conclude that $f * \varphi_k = 0$. Then by Proposition 6.10, $f = 0$ almost everywhere.

Corollary 6.12. Let $f$ be a locally integrable function. If $f = 0$ as a distribution, then $f = 0$ almost everywhere.

Proof. Let $\chi \in \mathcal{D}(\mathbb{R}^n)$ be arbitrary. Then $\chi f \in L^1(\mathbb{R}^n)$ and $\chi f = 0$ as a distribution. Hence $\chi f = 0$ almost everywhere by the previous corollary. Since this holds for all $\chi$, we obtain $f = 0$ almost everywhere.

We continue with some results on the support of the convolution product of two distributions.

Proposition 6.13. Let $S$ and $T$ be two distributions such that $S * T$ exists, $\mathrm{Supp}\, S \subset A$, $\mathrm{Supp}\, T \subset B$. Then $\mathrm{Supp}\, S * T$ is contained in the closure of the set $A + B$.

The proof is again left to the reader.

Examples
1. Let $\mathrm{Supp}\, S$ and $\mathrm{Supp}\, T$ be compact. Then $S * T$ has compact support, and the support is contained in $\mathrm{Supp}\, S + \mathrm{Supp}\, T$.
2. $(n = 1)$ If $\mathrm{Supp}\, S \subset (a, \infty)$, $\mathrm{Supp}\, T \subset (b, \infty)$, then $\mathrm{Supp}\, S * T \subset (a + b, \infty)$.
3. $(n = 4)$ If both $S$ and $T$ have support contained in the positive light cone $t \geq 0$, $t^2 - x^2 - y^2 - z^2 \geq 0$, then so has $S * T$.
6.3 Associativity of the Convolution Product

Let $R$, $S$, $T$ be distributions on $\mathbb{R}^n$ with supports $A$, $B$, $C$, respectively. We define the distribution $R * S * T$ by
\[
\langle R * S * T, \varphi \rangle = \langle R_\xi \otimes S_\eta \otimes T_\zeta, \varphi(\xi + \eta + \zeta) \rangle \qquad (\varphi \in \mathcal{D}) .
\]
This distribution makes sense if, for example,
\[
\xi \in A ,\quad \eta \in B ,\quad \zeta \in C ,\quad \xi + \eta + \zeta \text{ bounded}
\]
implies that $\xi$, $\eta$ and $\zeta$ are bounded. The associativity of the tensor product then gives the associativity of the convolution product:
\[
R * S * T = (R * S) * T = R * (S * T) .
\]
The convolution product may not be associative if the above condition is not satisfied. The distributions $(R * S) * T$ and $R * (S * T)$ might exist without being equal. So is $(1 * \delta') * Y = 0$ while $1 * (\delta' * Y) = 1 * \delta = 1$.

Theorem 6.14. The convolution product of several distributions makes sense, and is commutative and associative, in the following cases:
(i) All but at most one of the distributions have compact support.
(ii) $(n = 1)$ The supports of all distributions are bounded from the left (right).
(iii) $(n = 4)$ All distributions have support contained in $t \geq 0$ and, up to at most one, also in $t \geq 0$, $t^2 - x^2 - y^2 - z^2 \geq 0$.

The proof is left to the reader. Assume that $S$ and $T$ satisfy: $\xi \in \mathrm{Supp}\, S$, $\eta \in \mathrm{Supp}\, T$, $\xi + \eta$ bounded, implies that $\xi$ and $\eta$ are bounded. Then one has:

Theorem 6.15. Translation and differentiation of the convolution product $S * T$ go by translation and differentiation of one of the factors.

This follows easily from Theorem 6.14 by invoking the delta distribution and using $\tau_a T = \delta_{(a)} * T$, $DT = D\delta * T$ if $T \in \mathcal{D}'(\mathbb{R}^n)$, $D$ being a differential operator with constant coefficients on $\mathbb{R}^n$ and $a \in \mathbb{R}^n$.
Example (n = 2) ∂2 ∂2S ∂2T ∂S ∂T ∂S ∂T (S ∗ T ) = ∗T =S ∗ = ∗ = ∗ . ∂x∂y ∂x∂y ∂x∂y ∂x ∂y ∂y ∂x
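Theorem 6.15 can be illustrated numerically. The following sketch (grid sizes and the two Gaussian test functions are arbitrary choices, not from the text) checks on a grid that differentiating a convolution product equals convolving with the derivative of either factor.

```python
import numpy as np

# Numerical check: (f * g)' = f' * g = f * g' for smooth, rapidly decaying f, g.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

f = np.exp(-x**2)              # smooth factor "S"
g = np.exp(-(x - 1.0)**2 / 2)  # smooth factor "T"

conv = np.convolve(f, g, mode="same") * dx                         # f * g
d_conv = np.gradient(conv, dx)                                     # (f * g)'
fp_conv_g = np.convolve(np.gradient(f, dx), g, mode="same") * dx   # f' * g
f_conv_gp = np.convolve(f, np.gradient(g, dx), mode="same") * dx   # f * g'

# All three agree up to discretization error.
assert np.max(np.abs(d_conv - fp_conv_g)) < 1e-3
assert np.max(np.abs(d_conv - f_conv_gp)) < 1e-3
```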
6.4 Exercises

Exercise 6.16. Show that the following functions are in L¹(R):

e^{−x²},  e^{−ax²} (a > 0),  x e^{−ax²} (a > 0).

Compute
(i) e^{−x²} ∗ e^{−x²},
(ii) e^{−ax²} ∗ x e^{−ax²},
(iii) x e^{−ax²} ∗ x e^{−ax²}.
Exercise 6.17. Determine all powers Fₘ = f ∗ f ∗ ⋯ ∗ f (m times), where

f(x) = 1 if −1 < x < 1,  f(x) = 0 elsewhere.

Determine also Supp Fₘ and Fₘ.
Exercise 6.18. In the (x, y)-plane we consider the set C: 0 < y ≤ x. If f and g are functions with support contained in C, then write f ∗ g in a “suitable” form. Let χ be the characteristic function of C. Compute χ ∗ χ.
6.5 Newton Potentials and Harmonic Functions

Let μ be a continuous mass distribution in R³ and

U^μ(x₁, x₂, x₃) = ∫ μ(y₁, y₂, y₃) dy₁ dy₂ dy₃ / √((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²)

the potential belonging to μ. Generalized to Rⁿ (n > 2) we get

U^μ = ∫_{Rⁿ} μ(y) dy / ‖x − y‖^{n−2},  or  U^μ = μ ∗ 1/r^{n−2}.

Definition 6.19. The Newton potential of a distribution T on Rⁿ (n > 2) is the distribution

U^T = T ∗ 1/r^{n−2}.

In order to guarantee the existence of the convolution product, let us assume, for the time being, that T has compact support.

Theorem 6.20. U^T satisfies the Poisson equation

ΔU^T = −N T,  with N = (n − 2) · 2π^{n/2} / Γ(n/2).

The proof is easy; apply in particular Section 3.5. An elementary or fundamental solution of ΔU = V is, by definition, a solution of ΔU = δ. Hence E = −1/(N r^{n−2}) is an elementary solution of the Poisson equation. Call a distribution U harmonic if ΔU = 0.
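That E is harmonic away from the origin can be seen numerically. The sketch below (step size and sample points are arbitrary choices) applies a 7-point finite-difference Laplacian to 1/r in R³ at points with r > 0 and finds it numerically zero.

```python
import numpy as np

# Finite-difference check that 1/r is harmonic on R^3 away from the origin.
def laplacian_7pt(f, p, h=1e-3):
    # centered 7-point approximation of the Laplacian at the point p
    p = np.asarray(p, dtype=float)
    total = -6.0 * f(p)
    for axis in range(3):
        e = np.zeros(3)
        e[axis] = h
        total += f(p + e) + f(p - e)
    return total / h**2

r_inv = lambda p: 1.0 / np.linalg.norm(p)

for point in ([1.0, 0.5, -0.3], [2.0, 2.0, 1.0], [0.4, -1.2, 0.7]):
    assert abs(laplacian_7pt(r_inv, point)) < 1e-4
```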
Lemma 6.21 (Averaging theorem from potential theory). Let f be a harmonic function on Rⁿ (i.e. f is a C² function on Rⁿ with Δf = 0). Then the average of f over any sphere around x with radius R is equal to f(x):

f(x) = ∫_{‖y‖=R} f(x − y) dω_R(y) = ∫_{S^{n−1}} f(x − Rω) dω

with

dω = J(θ₁, …, θ_{n−1}) dθ₁ ⋯ dθ_{n−1} / |S^{n−1}|  (in spherical coordinates),

dω_R(y) = surface element of the sphere around x = 0 with radius R, normalized such that the total mass is 1.

Proof. Let n > 2. In order to prove this lemma, we start with a C² function f on Rⁿ, a function ϕ ∈ D(Rⁿ) and a volume V with boundary S, and apply Green's formula (Section 3.5):

(i) ∫_V (f Δϕ − ϕ Δf) dx + ∫_S (f ∂ϕ/∂ν − ϕ ∂f/∂ν) dS = 0,

with ν the inner normal on S. Suppose now f harmonic, so Δf = 0, and take V = V_R(x), a ball around x with radius R and boundary S. Take ϕ ∈ D(Rⁿ) such that ϕ = 1 on a neighborhood of V. Then (i) gives

(ii) ∫_S ∂f/∂ν dS = 0.

Choose then ϕ ∈ D(Rⁿ) such that ϕ(y) = 1/r^{n−2} for 0 < R₀ ≤ r ≤ R, and set ψ(y) = ϕ(x − y). Then (i) and (ii) imply

∫_{‖x−y‖=R} f ∂ψ/∂ν dS = ∫_{‖x−y‖=R₀} f ∂ψ/∂ν dS.

Now

∂ψ/∂ν (y) = ∂ϕ/∂ν (x − y) = −(n − 2)/r^{n−1} evaluated at x − y = −(n − 2)/‖x − y‖^{n−1}.

Hence

(1/R^{n−1}) ∫_{‖x−y‖=R} f(y) dS = (1/R₀^{n−1}) ∫_{‖x−y‖=R₀} f(y) dS,

or

∫_{S^{n−1}} f(x − Rω) dω = ∫_{S^{n−1}} f(x − R₀ω) dω.

Let finally R₀ tend to zero. Then the right-hand side tends to f(x). The proof in the cases n = 1 and n = 2 of this lemma is left to the reader.
Theorem 6.22. A harmonic distribution is a C^∞ function.

Proof. Choose α ∈ D(Rⁿ) rotation invariant and such that ∫ α(x) dx = 1. Let f be a C² function satisfying Δf = 0. According to Lemma 6.21,

f ∗ α(x) = ∫_{Rⁿ} f(x − t) α(t) dt = |S^{n−1}| ∫₀^∞ ∫_{S^{n−1}} f(x − rω) r^{n−1} α(r) dω dr
= |S^{n−1}| ∫₀^∞ α(r) r^{n−1} dr · f(x) = f(x).

Set ϕ̌(x) = ϕ(−x) for any ϕ ∈ D. Then also ϕ̌ ∈ D. Let now T be a harmonic distribution. Consider the regularization of T by ϕ̌: f = T ∗ ϕ̌. Then we have

⟨T ∗ α, ϕ⟩ = (T ∗ α) ∗ ϕ̌ (0) = (T ∗ ϕ̌) ∗ α (0) = f ∗ α (0) = f(0) = T ∗ ϕ̌ (0) = ⟨T, ϕ⟩

for any ϕ ∈ D, since f = T ∗ ϕ̌ is a harmonic function. Hence T ∗ α = T, and thus T ∈ E(Rⁿ). Notice that the convolution product is associative (and commutative) in this case, since both α and ϕ̌ have compact support.
We remark that, by applying more advanced techniques from the theory of differential equations, one can show that a harmonic function on Rn is even real analytic. The following corollary is left as an exercise. Corollary 6.23. A sequence of harmonic functions, converging uniformly on compact subsets of Rn , converges to a harmonic function.
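The averaging property of Lemma 6.21 is easy to test numerically. The sketch below (the harmonic test function x² − y², and the circles chosen, are arbitrary illustrative choices) averages a harmonic function over circles in R² and compares with the value at the center.

```python
import numpy as np

# Mean value property in R^2: the average of a harmonic function over a
# circle equals its value at the center.
def sphere_average(f, center, R, n=20000):
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    px = center[0] + R * np.cos(theta)
    py = center[1] + R * np.sin(theta)
    return np.mean(f(px, py))

f = lambda x, y: x**2 - y**2   # harmonic: Δf = 2 − 2 = 0

for center, R in (((0.3, -1.1), 0.5), ((2.0, 1.0), 3.0)):
    assert abs(sphere_average(f, center, R) - f(*center)) < 1e-8
```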
6.6 Convolution Equations We begin with a few deﬁnitions. An algebra is a vector space provided with a bilinear and associative product. A convolution algebra A is a linear subspace of D , closed under taking convolution products. The product should be associative (and commutative) and, moreover, δ should belong to A . A convolution algebra is thus in particular an algebra, even with unit element. Let us give an illustration of this notion of convolution algebra. Examples of Convolution Algebras
(i) E : the distributions with compact support on Rn . (ii) D+ : the distributions with support in [0, ∞) on R; similarly D− . (iii) In R4 , the distributions with support contained in the positive light cone t ≥ 0, t 2 − x 2 − y 2 − z 2 ≥ 0.
Let A be a convolution algebra. A convolution equation in A is an equation of the form

A ∗ X = B.

Here A, B ∈ A are given and X ∈ A has to be determined.

Theorem 6.24. The equation A ∗ X = B is solvable for any B in A if and only if A has an inverse in A, that is, an element A⁻¹ ∈ A such that

A ∗ A⁻¹ = A⁻¹ ∗ A = δ.

The inverse A⁻¹ of A is unique, and the equation has a unique solution, namely X = A⁻¹ ∗ B.

This algebraic result is easy to show; one just copies the method of proof from elementary algebra. Not every A has an inverse in general. If A is a C^∞ function with compact support, then A⁻¹ ∉ A, for A ∗ A⁻¹ is then C^∞, but on the other hand equal to δ. In principle, however, A ∗ X = B might have a solution in A for several B.

The Poisson equation ΔU = V has been considered in Section 6.5. This is a convolution equation: ΔU = Δδ ∗ U = V, with elementary solution X = −1/(4πr) in R³. If V has compact support, then all solutions of ΔU = V are

U = −1/(4πr) ∗ V + harmonic C^∞ functions.

Though Δδ and V are in E′(R³), X is not. There is no solution of ΔU = δ in E′(R³) (see later), but there is one in D′(R³). The latter space is however not a convolution algebra. There are even infinitely many solutions of ΔU = V in D′(R³).

We continue with convolution equations in D′₊.

Theorem 6.25. Consider the ordinary differential operator

D = d^m/dx^m + a₁ d^{m−1}/dx^{m−1} + ⋯ + a_{m−1} d/dx + a_m

with complex constants a₁, …, a_m. Then Dδ is invertible in D′₊ and its inverse is of the form Y Z, with Y the Heaviside function and Z the solution of DZ = 0 with initial conditions

Z(0) = Z′(0) = ⋯ = Z^{(m−2)}(0) = 0,  Z^{(m−1)}(0) = 1.

Proof. Any solution Z of DZ = 0 is an analytic function, so C^∞. We have to show that D(Y Z) = δ if Z satisfies the initial conditions of the theorem. Applying the formulae of jumps from Section 3.2, Example 3, we obtain

(Y Z)′ = Y Z′ + Z(0) δ,
(Y Z)″ = Y Z″ + Z′(0) δ + Z(0) δ′,
⋮
(Y Z)^{(m−1)} = Y Z^{(m−1)} + Z^{(m−2)}(0) δ + ⋯ + Z(0) δ^{(m−2)},
(Y Z)^{(m)} = Y Z^{(m)} + Z^{(m−1)}(0) δ + ⋯ + Z(0) δ^{(m−1)}.

Hence (Y Z)^{(k)} = Y Z^{(k)} if k ≤ m − 1 and (Y Z)^{(m)} = Y Z^{(m)} + δ, therefore D(Y Z) = Y DZ + δ = δ.

Examples
1. Let D = d/dx − λ, with λ ∈ C. One has Dδ = δ′ − λδ and (δ′ − λδ)⁻¹ = Y(x) e^{λx}.
2. Consider D = d²/dx² + ω² with ω ∈ R. Then (δ″ + ω²δ)⁻¹ = Y(x) sin ωx / ω.
3. Set D = (d/dx − λ)^m. Then (δ′ − λδ)^{−m} = Y(x) x^{m−1} e^{λx} / (m − 1)!.
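Example 1 can be checked on the level of functions: convolving the fundamental solution Y(x)e^{λx} with a right-hand side f gives u(x) = ∫₀ˣ e^{λ(x−t)} f(t) dt, which solves u′ − λu = f on [0, ∞). The sketch below (λ and f are arbitrary test choices) verifies this numerically.

```python
import numpy as np

# u = (Y(x) e^{λx}) * f solves u' − λu = f with u(0) = 0.
lam = -0.7
x = np.linspace(0.0, 5.0, 5001)
dx = x[1] - x[0]
f = np.cos(3.0 * x)

# u(x) = e^{λx} ∫_0^x e^{−λt} f(t) dt, via a cumulative trapezoidal rule
g = np.exp(-lam * x) * f
cum = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) * 0.5 * dx)))
u = np.exp(lam * x) * cum

residual = np.gradient(u, dx) - lam * u - f
# skip a few end points, where the one-sided difference is crude
assert np.max(np.abs(residual[5:-5])) < 1e-3
```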
As an application we determine the solution, in the sense of function theory, of Dz = f if D is as in Theorem 6.25 and f a given function, and we impose the initial conditions z^{(k)}(0) = z_k (0 ≤ k ≤ m − 1). We shall take z ∈ C^m and f continuous. We first compute Y z. As in the proof of Theorem 6.25, we obtain

D(Y z) = Y(Dz) + Σ_{k=1}^m c_k δ^{(k−1)}

with c_k = a_{m−k} z₀ + a_{m−k−1} z₁ + ⋯ + z_{m−k}. Hence D(Y z) = Y f + Σ_{k=1}^m c_k δ^{(k−1)}, and therefore

Y z = Y Z ∗ (Y f + Σ_{k=1}^m c_k δ^{(k−1)}),

or, for x ≥ 0,

z(x) = ∫₀ˣ Z(x − t) f(t) dt + Σ_{k=1}^m c_k Z^{(k−1)}(x).

Similarly we can proceed in D′₋ with −Y(−x). One obtains the same formula. Verify that z satisfies Dz = f with the given initial conditions.

Proposition 6.26. If A₁ and A₂ are invertible in D′₊, then so is A₁ ∗ A₂, and (A₁ ∗ A₂)⁻¹ = A₂⁻¹ ∗ A₁⁻¹.

This proposition is obvious. It has a nice application. Let D be the above differential operator and let

P(z) = z^m + a₁ z^{m−1} + ⋯ + a_{m−1} z + a_m.
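For D = d²/dx² + 1 (so m = 2, a₁ = 0, a₂ = 1) one has Z(x) = sin x, c₁ = z₁, c₂ = z₀, and the formula reads z(x) = ∫₀ˣ sin(x − t) f(t) dt + z₁ sin x + z₀ cos x. The sketch below (the data f ≡ 1, z₀, z₁ are arbitrary test choices with a known closed form) checks this numerically.

```python
import numpy as np

# Solution formula for z'' + z = f with z(0) = z0, z'(0) = z1, for f ≡ 1.
z0, z1 = 0.5, -0.2
x = np.linspace(0.0, 6.0, 601)
dx = x[1] - x[0]

# ∫_0^x sin(x − t) · 1 dt = 1 − cos x, so the formula gives:
z = (1.0 - np.cos(x)) + z1 * np.sin(x) + z0 * np.cos(x)

exact = 1.0 - 0.5 * np.cos(x) - 0.2 * np.sin(x)   # known solution of z'' + z = 1
assert np.allclose(z, exact)
assert abs(z[0] - z0) < 1e-12

# Independent check of the ODE residual by finite differences (interior points).
zpp = np.gradient(np.gradient(z, dx), dx)
assert np.max(np.abs(zpp[2:-2] + z[2:-2] - 1.0)) < 1e-2
```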
Then P(z) can be written as

P(z) = (z − z₁)(z − z₂) ⋯ (z − z_m),

with z₁, …, z_m being the complex roots of P(z) = 0. Hence

D = (d/dx − z₁)(d/dx − z₂) ⋯ (d/dx − z_m)

or

Dδ = (δ′ − z₁δ) ∗ (δ′ − z₂δ) ∗ ⋯ ∗ (δ′ − z_mδ).

Therefore

(Dδ)⁻¹ = Y(x) e^{z₁x} ∗ Y(x) e^{z₂x} ∗ ⋯ ∗ Y(x) e^{z_mx},

which is in D′₊. For example (δ′ − λδ)^m; its inverse is Y(x) e^{λx} x^{m−1}/(m − 1)!, which can easily be proved by induction on m; see also Example 3 above.
6.7 Symbolic Calculus of Heaviside

It is known that D′₊ is a commutative algebra over C with unit element δ and without zero divisors. The latter property says: if S ∗ T = 0 for S, T ∈ D′₊, then S = 0 or T = 0. For a proof we refer to [14], p. 325, Theorem 152, and [10], p. 173. Therefore, by a well-known theorem from algebra, D′₊ has a quotient field, where convolution equations can be solved in the natural way by just dividing. Let us apply this procedure, called the symbolic calculus of Heaviside. Consider again, for complex numbers a_i,

D = d^m/dx^m + a₁ d^{m−1}/dx^{m−1} + ⋯ + a_{m−1} d/dx + a_m.

We will determine the inverse of Dδ again. Let

P(z) = z^m + a₁ z^{m−1} + ⋯ + a_{m−1} z + a_m

and write

P(z) = ∏_j (z − z_j)^{k_j},

the z_j being the mutually different roots of P(z) = 0 with multiplicity k_j. Set p = δ′ and write z_j for z_jδ, 1 for δ, and just product for convolution product. Then we have to determine the inverse of ∏_j (p − z_j)^{k_j} in the quotient field of D′₊. By partial fraction decomposition we get

1 / ∏_j (z − z_j)^{k_j} = Σ_j [ c_{j,k_j}/(z − z_j)^{k_j} + ⋯ + c_{j,1}/(z − z_j) ],

the c_{j,l} being complex scalars. The inverse of (p − z_j)^m is Y(x) e^{z_j x} x^{m−1}/(m − 1)!, which is an element of D′₊ again. Hence (Dδ)⁻¹ is a sum of such expressions:

(Dδ)⁻¹ = Y(x) Σ_j [ c_{j,k_j} x^{k_j−1}/(k_j − 1)! + ⋯ + c_{j,1} ] e^{z_j x}.

Let us consider a particular case. Let D = d²/dx² + ω² with ω ∈ R. Write

1/(z² + ω²) = (1/2iω) [ 1/(z − iω) − 1/(z + iω) ].

Thus

(Dδ)⁻¹ = Y(x) (e^{iωx} − e^{−iωx})/2iω = Y(x) sin ωx / ω.
Another application concerns integral equations. Consider the following integral equation on [0, ∞):

∫₀ˣ cos(x − t) f(t) dt = g(x)  (g given, x ≥ 0).

Let us assume that f is continuous and g is C¹ for x ≥ 0. Replacing f and g by Y f and Y g, respectively, we can interpret this equation in the sense of distributions, and we actually have to solve Y(x) cos x ∗ Y f = Y g in D′₊. We have

Y(x) cos x = Y(x) (e^{ix} + e^{−ix})/2 = (1/2) [ 1/(p − i) + 1/(p + i) ] = p/(p² + 1).

The inverse is therefore

(p² + 1)/p = p + 1/p = δ′ + Y(x).

Hence

Y f = Y g ∗ (δ′ + Y(x)) = (Y g)′ + Y(x) ∫₀ˣ g(t) dt = Y g′ + g(0) δ + Y(x) ∫₀ˣ g(t) dt.

We conclude that g(0) has to be zero since we want a function solution. But this was clear a priori from the integral equation.
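So for g with g(0) = 0 the solution is f(x) = g′(x) + ∫₀ˣ g(t) dt. The sketch below (the right-hand side g(x) = x e^{−x} is an arbitrary test choice) substitutes this f back into the integral equation numerically.

```python
import numpy as np

# Check f = g' + ∫_0^x g solves ∫_0^x cos(x − t) f(t) dt = g(x), g(0) = 0.
x = np.linspace(0.0, 8.0, 4001)
dx = x[1] - x[0]
g = x * np.exp(-x)
gp = (1.0 - x) * np.exp(-x)                                          # g'
G = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) * 0.5 * dx)))  # ∫_0^x g
f = gp + G

# Substitute back at a few sample points (trapezoidal rule).
for i in (500, 1500, 3000):
    integrand = np.cos(x[i] - x[:i + 1]) * f[:i + 1]
    val = np.sum((integrand[1:] + integrand[:-1]) * 0.5 * dx)
    assert abs(val - g[i]) < 1e-4
```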
6.8 Volterra Integral Equations of the Second Kind

Consider for x ≥ 0 the equation

f(x) + ∫₀ˣ K(x − t) f(t) dt = g(x).  (6.1)

We assume (i) K and g are locally integrable, f locally integrable and unknown, (ii) K(x) = g(x) = f(x) = 0 for x ≤ 0. Then we can rewrite this equation as a convolution equation in D′₊:

f + K ∗ f = g,  or  (δ + K) ∗ f = g.

Equation (6.1) is a particular case of a Volterra integral equation of the second kind. The general form of a Volterra integral equation is:

∫₀ˣ K(x, y) f(y) dy = g(x)  (first kind),
f(x) + ∫₀ˣ K(x, y) f(y) dy = g(x)  (second kind).

Here we assume that x ≥ 0 and that K is a function of two variables on 0 ≤ y ≤ x. For an example of a Volterra integral equation of the first kind, see Section 6.7. Volterra integral equations of the second kind arise for example when considering linear ordinary differential equations with variable coefficients,

d^m u/dx^m + a₁(x) d^{m−1}u/dx^{m−1} + ⋯ + a_{m−1}(x) du/dx + a_m(x) u = g(x),

in which the a_i(x) and g(x) are supposed to be continuous. Let us impose the initial conditions

u(0) = u₀, u′(0) = u₁, …, u^{(m−1)}(0) = u_{m−1}.

Then this differential equation together with the initial conditions is equivalent with the Volterra integral equation of the second kind

v(x) + ∫₀ˣ K(x, y) v(y) dy = w(x)

with

v(x) = u^{(m)}(x),
K(x, y) = a₁(x) + a₂(x)(x − y) + a₃(x)(x − y)²/2! + ⋯ + a_m(x)(x − y)^{m−1}/(m − 1)!,
w(x) = g(x) − Σ_{l=0}^{m−1} Σ_{k=l}^{m−1} u_k a_{m−l}(x) x^{k−l}/(k − l)!.

In particular, we obtain a convolution Volterra equation of the second kind if all a_i are constants, so in case of a constant coefficient differential equation. Of course we have to restrict to [0, ∞), but a similar treatment can be given on (−∞, 0], by working in D′₋.

Example

Consider d²u/dx² + ω²u = x. Then v = u″ satisfies

v(x) + ω² ∫₀ˣ (x − y) v(y) dy = x − ω²u₀ − ω²u₁x.

Theorem 6.27. For all locally bounded, locally integrable functions K on [0, ∞), the distribution A = δ + K is invertible in D′₊, and A⁻¹ is again of the form δ + H with H a locally bounded, locally integrable function on [0, ∞).

Proof. Let us write, symbolically, δ = 1 and K = q. Then we have to determine the inverse of 1 + q, first in the quotient field of D′₊ and then, hopefully, in D′₊ itself. We have

1/(1 + q) = Σ_{k=0}^∞ (−1)^k q^k,

provided this series converges in D′₊. Set

E = δ − K + K^{∗2} − ⋯ + (−1)^k K^{∗k} + ⋯ .

Is this series convergent in D′₊? Yes: take an interval 0 ≤ x ≤ a and set M_a = ess sup_{0≤x≤a} |K(x)|. For x in this interval one has

|K^{∗2}(x)| = |∫₀ˣ K(x − t) K(t) dt| ≤ x M_a²,

hence, by induction on k,

|K^{∗k}(x)| ≤ x^{k−1} M_a^k / (k − 1)!,

and thus |K^{∗k}(x)| ≤ M_a^k a^{k−1}/(k − 1)! (k = 1, 2, …) for 0 ≤ x ≤ a. On the right-hand side we see the terms of a convergent series, with sum M_a e^{M_a a}. Therefore the series Σ_{k=1}^∞ (−1)^k K^{∗k} converges to a locally bounded, locally integrable function H on [0, ∞) by Lebesgue's theorem on dominated convergence; hence the series converges in D′₊. Thus E = δ + H and E ∈ D′₊. Verify that E ∗ A = δ.

Thus the solution of the convolution equation (6.1) is

f = (δ + H) ∗ g,  or  f(x) = g(x) + ∫₀ˣ H(x − t) g(t) dt  (x ≥ 0).

We conclude that the solution is given by a formula very much like the original equation (6.1).
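The Neumann series in the proof of Theorem 6.27 can be run on a grid. The sketch below (kernel, right-hand side, grid and truncation order are arbitrary test choices) builds H = −K + K∗K − ⋯ discretely and checks that f = g + H ∗ g solves f + K ∗ f = g.

```python
import numpy as np

# Discrete Neumann series for (δ + K)^{-1} = δ + H on a uniform grid.
x = np.linspace(0.0, 4.0, 801)
dx = x[1] - x[0]
K = np.exp(-x)      # kernel K(x) on [0, ∞)
g = np.sin(x)

def conv(a, b):
    # one-sided (causal) convolution ∫_0^x a(x − t) b(t) dt on the grid
    return np.convolve(a, b)[:len(x)] * dx

H = np.zeros_like(x)
term = -K.copy()
for _ in range(30):          # series terms decay like M^k a^{k-1} / (k-1)!
    H += term
    term = -conv(K, term)

f = g + conv(H, g)
residual = f + conv(K, f) - g
assert np.max(np.abs(residual)) < 1e-3
```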
6.9 Exercises

Exercise 6.28. Determine the inverses of the following distributions in D′₊:

δ″ − 5δ′ + 6δ;  Y + δ;  Y(x) eˣ + δ.

Exercise 6.29. Solve the following integral equation:

∫₀ˣ (x − t) cos(x − t) f(t) dt = g(x),

for g a given function and f an unknown function, both locally integrable and with support in [0, ∞).

Exercise 6.30. Let g be a function with support in [0, ∞). Find a distribution f(x) with support in [0, ∞) that satisfies the equation

∫₀ˣ (e^{−t} − sin t) f(x − t) dt = g(x).

Which conditions have to be imposed on g in order that the solution is a continuous function? Determine the solution in case g = Y, the Heaviside function.

Exercise 6.31. Solve the following integral equation:

f(x) + ∫₀ˣ cos(x − t) f(t) dt = g(x).

Here g is given, f unknown. Furthermore Supp f and Supp g are contained in [0, ∞).
Exercise 6.32. Let f(t) be the solution of the differential equation

u‴ + 2u″ + u′ + 2u = −10 cos t,

with initial conditions u(0) = 0, u′(0) = 2, u″(0) = −4. Set F(t) = Y(t)f(t). Write the differential equation for F, in the sense of distributions. Then determine F by symbolic calculus in D′₊.
6.10 Systems of Convolution Equations*

Consider the system of n equations with n unknowns X_j:

A₁₁ ∗ X₁ + A₁₂ ∗ X₂ + ⋯ + A₁ₙ ∗ Xₙ = B₁
⋮
Aₙ₁ ∗ X₁ + Aₙ₂ ∗ X₂ + ⋯ + Aₙₙ ∗ Xₙ = Bₙ,

where A_{ij} ∈ A, X_j ∈ A, B_j ∈ A. The following considerations hold for any system of linear equations over an arbitrary commutative ring with unit element (for example A). Let us write A = (A_{ij}), B = (B_j), X = (X_j). Then we have to solve A ∗ X = B.

Theorem 6.33. The equation A ∗ X = B has a solution for all B if and only if Δ = det A has an inverse in A. The solution is then unique, X = A⁻¹ ∗ B.

Proof. (i) Suppose A ∗ X = B has a solution for all B. Choose B such that B_i = δ, B_j = 0 for j ≠ i, and call the solution in this case X_i = (C_{ij}). Let i vary between 1 and n. Then

Σ_{k=1}^n A_{ik} ∗ C_{kj} = δ if i = j, and 0 if i ≠ j,

or A ∗ C = δI with C = (C_{ij}). Then det A ∗ det C = δ, hence Δ has an inverse in A.
(ii) Assume that Δ⁻¹ exists in A. Let a_{ij} be the minor of A = (A_{ij}), equal to (−1)^{i+j} times the determinant of the matrix obtained from A by deleting the j-th row and the i-th column. Set C_{ij} = a_{ij} ∗ Δ⁻¹. Then A ∗ C = C ∗ A = δI, so A ∗ X = B is solvable for any B: X = C ∗ B.

The remaining statements are easy to prove and left to the reader.
6.11 Exercises Exercise 6.34. Solve in D+ the following system of convolution equations: δ ∗ X1 + δ ∗ X2 = δ δ ∗ X1 + δ ∗ X2 = 0 .
Further Reading Rigorous proofs of the existence of the tensor product and convolution product are in [5], [10] and also in [15]. For the impact on the theory of differential equations, see [7].
7 The Fourier Transform

Summary
Fourier transformation is an important tool in mathematics, physics, computer science, chemistry, and the medical and pharmaceutical sciences. It is used for analyzing and interpreting images and sounds. In this chapter we lay the foundations of the theory. We define the Fourier transform of a function and of a distribution. Unfortunately not every distribution has a Fourier transform; we have to restrict to so-called tempered distributions. As an example we determine the tempered solutions of the heat or diffusion equation.
Learning Targets
– A thorough knowledge of the Fourier transform of an integrable function.
– Understanding the definition and properties of tempered distributions.
– How to determine a tempered solution of the heat equation.
7.1 Fourier Transform of a Function on R

For f ∈ L¹(R) we define

f̂(y) = ∫_{−∞}^∞ f(x) e^{−2πixy} dx  (y ∈ R).

We call f̂ the Fourier transform of f, also denoted by F f. We list some elementary properties without proof.
a. (c₁f₁ + c₂f₂)ˆ = c₁f̂₁ + c₂f̂₂ for c₁, c₂ ∈ C and f₁, f₂ ∈ L¹(R).
b. |f̂(y)| ≤ ‖f‖₁, f̂ is continuous and lim_{|y|→∞} f̂(y) = 0 (Riemann–Lebesgue lemma) for all f ∈ L¹(R). To prove the latter two properties, approximate f by a step function. See also Exercise 4.9 a.
c. (f ∗ g)ˆ = f̂ · ĝ for all f, g ∈ L¹(R).
d. Let f̃(x) be the complex conjugate of f(−x) (x ∈ R). Then (f̃)ˆ is the complex conjugate of f̂, for all f ∈ L¹(R).
e. Let (L_t f)(x) = f(x − t) and (M_ρ f)(x) = f(ρx), ρ > 0. Then

(L_t f)ˆ(y) = e^{−2πity} f̂(y),  [e^{2πitx} f(x)]ˆ(y) = (L_t f̂)(y),  (M_ρ f)ˆ(y) = (1/ρ) f̂(y/ρ)

for all f ∈ L¹(R).
Examples
1. Let A > 0 and Φ_A the characteristic function of the closed interval [−A, A]. Then

Φ̂_A(y) = sin 2πAy / πy  (y ≠ 0),  Φ̂_A(0) = 2A.

2. Set

Δ(x) = 1 − |x| for |x| ≤ 1,  Δ(x) = 0 for |x| > 1  (triangle function).

Then Δ = Φ_{1/2} ∗ Φ_{1/2}, so

Δ̂(y) = (sin πy / πy)²  (y ≠ 0),  Δ̂(0) = 1.

3. Let T be the trapezoidal function

T(x) = 1 if |x| ≤ 1,  T(x) = 2 − |x| if 1 < |x| ≤ 2,  T(x) = 0 if |x| ≥ 2.

Then T = Φ_{1/2} ∗ Φ_{3/2}, hence

T̂(y) = sin 3πy sin πy / (πy)²  (y ≠ 0),  T̂(0) = 3.

4. Let f(x) = e^{−a|x|}, a > 0. Then

f̂(y) = 2a / (a² + 4π²y²).

5. Let f(x) = e^{−ax²}, a > 0. Then we get by complex integration f̂(y) = √(π/a) · e^{−π²y²/a}. In particular F(e^{−πx²}) = e^{−πy²}.

We know ∫_{−∞}^∞ e^{−ax²} dx = √(π/a) for a > 0. From this result we shall deduce, applying Cauchy's theorem from complex analysis, that ∫_{−∞}^∞ e^{−a(t+ib)²} dt = √(π/a) for all a > 0 and b ∈ R. Consider the closed path W in the complex plane consisting of the line segments from −R to R (k₁), from R to R + ib (k₂), from R + ib to −R + ib (k₃) and from −R + ib to −R (k₄). Here R is a positive real number.
    −R+ib  ←—k₃—  R+ib
      │k₄            ↑k₂
     −R    —k₁—→    R
Since e^{−az²} is analytic in z, one has by Cauchy's theorem ∫_W e^{−az²} dz = 0. Let us split this integral into four pieces. One has

∫_{k₁} e^{−az²} dz = ∫_{−R}^R e^{−ax²} dx,  ∫_{k₃} e^{−az²} dz = −∫_{−R}^R e^{−a(t+ib)²} dt.

Parametrizing k₂ with z(t) = R + it, we get

∫_{k₂} e^{−az²} dz = i ∫₀^b e^{−a(R+it)²} dt.

Since

|e^{−a(R+it)²}| = e^{−a(R²−t²)} ≤ e^{−a(R²−b²)}  (0 ≤ t ≤ b),

we get

|∫_{k₂} e^{−az²} dz| ≤ b e^{−a(R²−b²)},

hence lim_{R→∞} ∫_{k₂} e^{−az²} dz = 0. In a similar way lim_{R→∞} ∫_{k₄} e^{−az²} dz = 0. Therefore

∫_{−∞}^∞ e^{−a(t+ib)²} dt = ∫_{−∞}^∞ e^{−ax²} dx = √(π/a).

We now can determine f̂:

f̂(y) = ∫_{−∞}^∞ e^{−ax²} e^{−2πixy} dx = e^{−π²y²/a} ∫_{−∞}^∞ e^{−a(x + πiy/a)²} dx = √(π/a) e^{−π²y²/a}.
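The self-reciprocity F(e^{−πx²}) = e^{−πy²} from Example 5 is easy to confirm numerically. The sketch below (grid and sample frequencies are arbitrary choices) evaluates the defining integral with the convention f̂(y) = ∫ f(x) e^{−2πixy} dx as a Riemann sum.

```python
import numpy as np

# Numerical check that the Fourier transform of e^{−πx²} is e^{−πy²}.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
f = np.exp(-np.pi * x**2)

for y in (0.0, 0.5, 1.0, 2.0):
    fhat = np.sum(f * np.exp(-2j * np.pi * x * y)) * dx
    assert abs(fhat - np.exp(-np.pi * y**2)) < 1e-8
```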
7.2 The Inversion Theorem

Theorem 7.1. Let f ∈ L¹(R) and x a point where f(x + 0) = lim_{t↓0} f(x + t) and f(x − 0) = lim_{t↓0} f(x − t) exist. Then one has

lim_{α↓0} ∫_{−∞}^∞ f̂(y) e^{2πixy} e^{−4π²αy²} dy = (f(x + 0) + f(x − 0)) / 2.

Proof. Step 1. One has

∫_{−∞}^∞ f̂(y) e^{2πixy} e^{−4π²αy²} dy = ∫∫ f(t) e^{2πi(x−t)y} e^{−4π²αy²} dt dy
= ∫_{−∞}^∞ f(t) [ ∫_{−∞}^∞ e^{−4π²αy²} e^{2πi(x−t)y} dy ] dt  (by Fubini's theorem)
= (1/(2√(πα))) ∫_{−∞}^∞ f(t) e^{−(x−t)²/4α} dt = (1/(2√(πα))) ∫_{−∞}^∞ f(x + t) e^{−t²/4α} dt.

Step 2. We now get

∫_{−∞}^∞ f̂(y) e^{2πixy} e^{−4π²αy²} dy − (f(x + 0) + f(x − 0))/2
= (1/(2√(πα))) ∫₀^∞ [ f(x + t) − f(x + 0) + f(x − t) − f(x − 0) ] e^{−t²/4α} dt
= (1/(2√(πα))) ∫₀^δ ⋯ + (1/(2√(πα))) ∫_δ^∞ ⋯ = I₁ + I₂,

where δ > 0 still has to be chosen.

Step 3. Set φ(t) = f(x + t) − f(x + 0) + f(x − t) − f(x − 0). Let ε > 0 be given. Choose δ such that |φ(t)| < ε/2 for 0 < t < δ. This is possible, since lim_{t↓0} φ(t) = 0. Then we have |I₁| ≤ ε/2.

Step 4. We now estimate I₂. We have

|I₂| ≤ (1/(2√(πα))) ∫_δ^∞ |φ(t)| e^{−t²/4α} dt ≤ (2√α/√π) ∫_δ^∞ |φ(t)|/t² dt.

Now ∫_δ^∞ |φ(t)|/t² dt is bounded, because

∫_δ^∞ |φ(t)|/t² dt ≤ ∫_δ^∞ |f(x + t)|/t² dt + ∫_δ^∞ |f(x − t)|/t² dt + ∫_δ^∞ |f(x + 0) + f(x − 0)|/t² dt
≤ (2/δ²) ‖f‖₁ + (1/δ) |f(x + 0) + f(x − 0)|.

Hence lim_{α↓0} I₂ = 0, so |I₂| < ε/2 as soon as α is small, say 0 < α < η.

Step 5. Summarizing: if 0 < α < η, then

| ∫_{−∞}^∞ f̂(y) e^{2πixy} e^{−4π²αy²} dy − (f(x + 0) + f(x − 0))/2 | < ε.
1. Let P_a (a > 0) be as in Section 6.2. Then we know by Section 7.1, Example 4, that P_a(x) = F(e^{−2πa|y|})(x). Therefore, by the inversion theorem, e^{−2πa|y|} = F(P_a)(y). Hence

F(P_a ∗ P_b)(y) = e^{−2πa|y|} e^{−2πb|y|} = e^{−2π(a+b)|y|} = F(P_{a+b})(y).

Again by the inversion theorem we get P_a ∗ P_b = P_{a+b}.

2. If f and g are in L¹(R), and if both functions are continuous, then f̂ = ĝ implies f = g. Later on we shall see that this is true without the continuity assumption, or, otherwise formulated, that F is a one-to-one linear map.
7.3 Plancherel's Theorem

Theorem 7.4. If f ∈ L¹ ∩ L², then f̂ ∈ L² and ‖f‖₂ = ‖f̂‖₂.

Proof. Let f be a function in L¹ ∩ L². Then f ∗ f̃ is in L¹ and is a continuous function. Moreover (f ∗ f̃)ˆ = |f̂|². According to Corollary 7.3 we have |f̂|² ∈ L¹, hence f̂ ∈ L². Furthermore

f ∗ f̃ (0) = ∫_{−∞}^∞ |f(x)|² dx = ∫_{−∞}^∞ |f̂(y)|² dy.

So ‖f‖₂ = ‖f̂‖₂.

The Fourier transform is therefore an isometric linear mapping, defined on the dense linear subspace L¹ ∩ L² of L². Let us extend this mapping F to L², for instance by

F f(x) = lim_{k→∞} ∫_{−k}^k f(t) e^{−2πixt} dt  (L² convergence).
We then have:

Theorem 7.5 (Plancherel's theorem). The Fourier transform is a unitary operator on L².

Proof. For any f, g ∈ L¹ ∩ L² one has

∫_{−∞}^∞ f(x) ĝ(x) dx = ∫_{−∞}^∞ f̂(x) g(x) dx.

Since both ∫ f(x)(F g)(x) dx and ∫ (F f)(x) g(x) dx are defined for f, g ∈ L², and because

| ∫_{−∞}^∞ f(x)(F g)(x) dx | ≤ ‖f‖₂ ‖F g‖₂ = ‖f‖₂ ‖g‖₂,
| ∫_{−∞}^∞ (F f)(x) g(x) dx | ≤ ‖F f‖₂ ‖g‖₂ = ‖f‖₂ ‖g‖₂,

we have for any f, g ∈ L²

∫_{−∞}^∞ f(x)(F g)(x) dx = ∫_{−∞}^∞ (F f)(x) g(x) dx.

Notice that F(L²) ⊂ L² is a closed linear subspace. We have to show that F(L²) = L². Let g ∈ L², g orthogonal to F(L²), so ∫ (F f)(x) g(x) dx = 0 for all f ∈ L². Then, by the above arguments, ∫ f(x)(F g)(x) dx = 0 for all f ∈ L², hence F(g) = 0. Since ‖g‖₂ = ‖F g‖₂, we get g = 0. So F(L²) = L².
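The isometry ‖f‖₂ = ‖f̂‖₂ can be observed numerically via the FFT. In the sketch below (the test function and grid are arbitrary choices) the discrete transform scaled by dx approximates f̂ on the frequency grid y_k = k/L, and the two L²-norms agree.

```python
import numpy as np

# Numerical Plancherel check via the discrete Fourier transform.
n, L = 4096, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = L / n
f = np.exp(-x**2) * np.cos(3.0 * x)

fhat = np.fft.fft(f) * dx     # samples of an approximation to f̂
dy = 1.0 / L                  # frequency-grid spacing

norm_f = np.sqrt(np.sum(np.abs(f)**2) * dx)
norm_fhat = np.sqrt(np.sum(np.abs(fhat)**2) * dy)
assert abs(norm_f - norm_fhat) < 1e-10
```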
7.4 Differentiability Properties

Theorem 7.6. Let f ∈ L¹(R) be continuously differentiable and let f′ be in L¹(R) too. Then (f′)ˆ(y) = (2πiy) f̂(y).

Proof. By partial integration and for y ≠ 0 we have

∫_{−A}^A f(x) e^{−2πixy} dx = [ f(x) e^{−2πixy} / (−2πiy) ]_{x=−A}^{x=A} + (1/(2πiy)) ∫_{−A}^A f′(x) e^{−2πixy} dx.

Notice that lim_{|x|→∞} f(x) exists, since f(x) = f(0) + ∫₀ˣ f′(t) dt and f′ ∈ L¹(R). Hence, because f ∈ L¹(R), lim_{|x|→∞} f(x) = 0. Then we get

∫_{−∞}^∞ f(x) e^{−2πixy} dx = (1/(2πiy)) ∫_{−∞}^∞ f′(x) e^{−2πixy} dx,

or (2πiy) f̂(y) = (f′)ˆ(y) for y ≠ 0. Since both sides are continuous in y, the theorem follows.

Corollary 7.7. Let f be a C^m function and let f^{(k)} ∈ L¹(R) for all k with 0 ≤ k ≤ m. Then

(f^{(m)})ˆ(y) = (2πiy)^m f̂(y).

Notice that for f as above, f̂(y) = o(|y|^{−m}).

Theorem 7.8. Let f ∈ L¹(R) and let (−2πix) f(x) = g(x) be in L¹(R) too. Then f̂ is continuously differentiable and (f̂)′ = ĝ.

Proof. By definition

(f̂)′(y₀) = lim_{h→0} ∫_{−∞}^∞ f(x) e^{−2πixy₀} (e^{−2πixh} − 1)/h dx.

We shall show that this limit exists and that we may interchange limit and integration. This is a consequence of Lebesgue's theorem on dominated convergence. Indeed,

| f(x) e^{−2πixy₀} (e^{−2πixh} − 1)/h | ≤ |f(x)| 2π|x| |sin 2πxθ₁(h, x) − i cos 2πxθ₂(h, x)|

with 0 ≤ θᵢ(h, x) ≤ 1 (i = 1, 2), hence bounded by 2|g(x)|. Since g ∈ L¹(R) we are done.

Corollary 7.9. Let f ∈ L¹(R), g_m(x) = (−2πix)^m f(x) and g_m ∈ L¹(R) too. Then f̂ is C^m and f̂^{(m)} = ĝ_m.

Both Corollary 7.7 and Corollary 7.9 are easily shown by induction on m.
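Theorem 7.6 is easy to check numerically. The sketch below (the Gaussian test function and sample frequencies are arbitrary choices) compares the Fourier transform of f′ with (2πiy) f̂(y), both evaluated as Riemann sums.

```python
import numpy as np

# Numerical check of (f')^(y) = (2πiy) f̂(y) for f(x) = e^{−x²}.
x = np.linspace(-15.0, 15.0, 30001)
dx = x[1] - x[0]
f = np.exp(-x**2)
fp = -2.0 * x * np.exp(-x**2)   # f'

for y in (0.3, 1.0, 1.7):
    ft = lambda h: np.sum(h * np.exp(-2j * np.pi * x * y)) * dx
    assert abs(ft(fp) - 2j * np.pi * y * ft(f)) < 1e-8
```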
7.5 The Schwartz Space S(R)

Let f be in L¹(R). Then one obviously has, applying Fubini's theorem, ⟨f̂, ϕ⟩ = ⟨f, ϕ̂⟩ for all ϕ ∈ D(R). If T is an arbitrary distribution, then a “natural” definition of T̂ would be ⟨T̂, ϕ⟩ = ⟨T, ϕ̂⟩ (ϕ ∈ D(R)). The space D(R) is however not closed under taking Fourier transforms. Indeed, if ϕ ∈ D(R), then ϕ̂ can be continued to a complex entire analytic function ϕ̂(z). This is seen as follows. Let Supp ϕ ⊂ [−A, A]. Then

ϕ̂(z) = ∫_{−A}^A ϕ(y) e^{−2πiyz} dy = Σ_{k=0}^∞ ((−2πiz)^k / k!) ∫_{−A}^A ϕ(y) y^k dy,

and this is an everywhere convergent power series. Therefore, if ϕ̂ were in D(R), then ϕ = 0. Thus we will replace D = D(R) with a new space, closed under taking Fourier transforms, the Schwartz space S = S(R).
Definition 7.10. The Schwartz space S consists of the C^∞ functions ϕ such that for all k, l ≥ 0, the functions x^l ϕ^{(k)}(x) are bounded on R.

Elements of S are called Schwartz functions, also functions that, together with all their derivatives, are “rapidly decreasing” (à décroissance rapide). Observe that D ⊂ S. An example of a Schwartz function not in D is ϕ(x) = e^{−ax²} (a > 0).

Properties of S
1. S is a complex vector space.
2. If ϕ ∈ S, then x^l ϕ^{(k)}(x) and [x^l ϕ(x)]^{(k)} belong to S as well, for all k, l ≥ 0.
3. S ⊂ L^p for all p satisfying 1 ≤ p ≤ ∞.
4. Ŝ = S.

The latter property follows from Section 7.4 (Corollaries 7.7 and 7.9). One defines on S a convergence principle as follows: say that ϕⱼ tends to zero (notation ϕⱼ → 0) when for all k, l ≥ 0,

sup_x |x^l ϕⱼ^{(k)}(x)| → 0 if j tends to infinity.

One then has a notion of continuity in S, in a similar way as in the spaces D and E.

Theorem 7.11. The Fourier transform is an isomorphism of S.

Proof. The linearity of the Fourier transform is clear. Let us show the continuity of the Fourier transform. Let {ϕⱼ} be a sequence of functions in S.
(i) Notice that (2πiy)^l ϕ̂ⱼ^{(k)}(y) = ψ̂ⱼ(y), where ψⱼ(x) = [(−2πix)^k ϕⱼ(x)]^{(l)}. Clearly ψⱼ ∈ S.
(ii) One has |(2πiy)^l ϕ̂ⱼ^{(k)}(y)| = |ψ̂ⱼ(y)| ≤ ‖ψⱼ‖₁.
(iii) The following inequalities hold:

‖ψⱼ‖₁ = ∫_{−∞}^∞ |ψⱼ(x)| dx = ∫_{−1}^1 |ψⱼ(x)| dx + ∫_{|x|≥1} |ψⱼ(x)| dx
≤ 2 sup |ψⱼ| + ∫_{|x|≥1} x²|ψⱼ(x)|/x² dx ≤ 2 sup |ψⱼ| + 2 sup |x² ψⱼ(x)|.

Now, if ϕⱼ → 0 in S, then ψⱼ → 0 in S, hence ‖ψⱼ‖₁ → 0 and therefore sup |(2πiy)^l ϕ̂ⱼ^{(k)}(y)| → 0 if j → ∞. Hence ϕ̂ⱼ → 0 in S. Thus the Fourier transform is a continuous mapping from S to itself. In a similar way one shows that the inverse Fourier transform is continuous on S. Hence the Fourier transform is an isomorphism.

Observe that item (iii) in the above proof implies that the injection S ⊂ L¹ is continuous; the same is true for S ⊂ L^p for all p satisfying 1 ≤ p ≤ ∞.
7.6 The Space of Tempered Distributions S′(R)

One has the inclusions D ⊂ S ⊂ E. The injections are continuous, the images are dense. Compare with Section 5.1.

Definition 7.12. A tempered distribution is a distribution that can be extended to a continuous linear form on S.

Let S′ = S′(R) be the space of continuous linear forms on S. Then, by the remark at the beginning of this section, S′ can be identified with the space of tempered distributions.

Examples
1. Let f ∈ L^p, 1 ≤ p ≤ ∞. Then T_f is tempered, since S ⊂ L^q for q such that 1/p + 1/q = 1, and the injection is continuous.
2. Let f be a locally integrable function that is slowly increasing (à croissance lente): there are A > 0 and an integer k ≥ 0 such that |f(x)| ≤ A|x|^k as |x| → ∞. Then T_f is tempered. This follows from the following inequalities:

| ∫_{−∞}^∞ f(x) ϕ(x) dx | ≤ ∫_{−1}^1 |f(x) ϕ(x)| dx + ∫_{|x|≥1} |f(x) ϕ(x)| dx
≤ sup |ϕ| ∫_{−1}^1 |f(x)| dx + sup_{|x|≥1} |ϕ(x) x^{k+2}| ∫_{|x|≥1} |f(x)|/|x|^{k+2} dx.

3. Every distribution T with compact support is tempered. To see this, extend T to E (Theorem 5.2) and then restrict T to S.
4. Since ϕ → ϕ′ is a continuous linear mapping from S to S, the derivative of a tempered distribution is again tempered.
5. If T is tempered and α a polynomial, then αT is tempered, since ϕ → αϕ is a continuous linear map from S into itself.
6. Let f(x) = e^{x²}. Then T_f is not tempered. Indeed, let α ∈ D, α ≥ 0, α(x) = α(−x) and ∫_{−∞}^∞ α(x) dx = 1. Let χⱼ be the characteristic function of the closed interval [−j, j] and set αⱼ = χⱼ ∗ α. Then αⱼ ∈ D and αⱼϕ → ϕ in S for all ϕ ∈ S. Choose ϕ(x) = e^{−x²}. Then ⟨T_f, αⱼϕ⟩ = ∫_{−∞}^∞ αⱼ(x) dx. But lim_{j→∞} ∫_{−∞}^∞ αⱼ(x) dx = ∞.

Similar to Propositions 2.3 and 5.4 one has:

Proposition 7.13. Let T be a distribution. Then T is tempered if and only if there exist a constant C > 0 and a positive integer m such that

|⟨T, ϕ⟩| ≤ C sup_{k,l≤m} |x^k ϕ^{(l)}(x)|

for all ϕ ∈ D.
7.7 Structure of a Tempered Distribution*

This section is devoted to the structure of tempered distributions on R.

Theorem 7.14. Let T be a tempered distribution on R. Then there exist a continuous, slowly increasing function f on R and a positive integer m such that T = f^{(m)} in the sense of distributions.

Proof. Since T is tempered, there exists a positive integer N such that

|⟨T, ϕ⟩| ≤ const · sup_{k,l≤N} |x^k ϕ^{(l)}(x)|

for all ϕ ∈ D. Consider (1 + x²)^{−N} T. We have

|⟨(1 + x²)^{−N} T, ϕ⟩| ≤ const · sup_{k,l≤N} | x^k [(1 + x²)^{−N} ϕ(x)]^{(l)} | ≤ const · sup_{k≤N} |ϕ^{(k)}(x)| / (1 + x²).

By partial integration from −∞ to x, we obtain

|ϕ^{(k)}(x)| / (1 + x²) ≤ ∫_{−∞}^∞ |2x ϕ^{(k)}(x)| / (1 + x²)² dx + ∫_{−∞}^∞ |ϕ^{(k+1)}(x)| / (1 + x²) dx.

Hence

|⟨(1 + x²)^{−N} T, ϕ⟩| ≤ const · Σ_{k≤N+1} ( ∫_{−∞}^∞ |ϕ^{(k)}(x)|² dx )^{1/2}

for all ϕ ∈ D. Let H^{N+1} be the (Sobolev) space of all functions ϕ ∈ L²(R) for which the distributional derivatives ϕ′, …, ϕ^{(N+1)} also belong to L²(R), provided with the scalar product

(ψ, ϕ) = Σ_{k=0}^{N+1} ∫_{−∞}^∞ ψ^{(k)}(x) ϕ^{(k)}(x) dx.

It is not difficult to show that H^{N+1} is a Hilbert space (see also Exercise 7.24). Therefore, (1 + x²)^{−N} T can be extended to H^{N+1}, and thus (1 + x²)^{−N} T = g for some g ∈ H^{N+1}, i.e.

⟨(1 + x²)^{−N} T, ϕ⟩ = (g, ϕ) = Σ_{k=0}^{N+1} ∫_{−∞}^∞ g^{(k)}(x) ϕ^{(k)}(x) dx.

Thus

(1 + x²)^{−N} T = Σ_{k=0}^{N+1} (−1)^k g^{(2k)}

in the sense of distributions. Now write

G(x) = ∫₀ˣ g(t) dt.

Then G is continuous, slowly increasing because |G(x)| ≤ √|x| ‖g‖₂, and G′ = g in the sense of distributions. Thus

(1 + x²)^{−N} T = Σ_{k=0}^{N+1} (−1)^k G^{(2k+1)}

on D. Hence T is of the form

T = Σ_{k=0}^{N+1} d_k G_k

for some constants d_k, while each G_k is a linear combination of terms of the form …
For ν = (n − 2)/2 and x = 2πρr one obtains

J_{(n−2)/2}(2πρr) = ( (πρr)^{(n−2)/2} / (√π Γ((n − 1)/2)) ) ∫₀^π e^{−2πirρ cos θ} sin^{n−2} θ dθ.

Therefore

Ψ(ρ) = 2π ρ^{−(n−2)/2} ∫₀^∞ r^{n/2} J_{(n−2)/2}(2πρr) Φ(r) dr  (n ≥ 2).  (7.1)

One also has: if ρ^{n−1} Ψ(ρ) ∈ L¹(0, ∞) and Φ is continuous, then

Φ(r) = 2π r^{−(n−2)/2} ∫₀^∞ ρ^{n/2} J_{(n−2)/2}(2πρr) Ψ(ρ) dρ  (n ≥ 2).  (7.2)

One calls the transform in equation (7.1) (and equation (7.2)) the Fourier–Bessel transform or Hankel transform of order (n − 2)/2. It is defined on L¹((0, ∞), r^{n−1} dr). We leave the case n = 1 to the reader. There are many books on Bessel functions, Fourier–Bessel transforms and Hankel transforms; see for example [16]. Because of their importance for mathematical physics, it is useful to study them in some detail.
7.12 The Heat or Diffusion Equation in One Dimension
The heat equation was first studied by Fourier (1768–1830) in his famous publication “Théorie analytique de la chaleur”. It may be considered the origin of what is now called Fourier theory.
Let us consider an infinitely long, thin, homogeneous, heat-conducting bar. Denote by $U(x,t)$ the temperature of the bar at place x and time t. Let c be its heat capacity, defined as follows: the heat quantity in 1 cm of the bar with temperature U is equal to $cU$. Let γ denote the heat-conduction coefficient: the heat quantity that flows from left to right at place x in one second is equal to $-\gamma\,\partial U/\partial x$, with $-\partial U/\partial x$ the temperature decay at place x. Let us assume, in addition, that there are some heat sources along the bar with heat density $\rho(x,t)$. This means that between t and $t+dt$ the piece $(x, x+dx)$ of the bar is supplied with $\rho(x,t)\,dx\,dt$ calories of heat. We assume furthermore that no other kinds of heat sources are in play (neither positive nor negative). Let us determine how U behaves as a function of x and t. The increase of heat in $(x, x+dx)$ between t and $t+dt$ is
1. $\Bigl[\gamma\,\frac{\partial U}{\partial x}(x+dx,t) - \gamma\,\frac{\partial U}{\partial x}(x,t)\Bigr]dt \sim \gamma\,\frac{\partial^2U}{\partial x^2}\,dx\,dt$ (by conduction),
2. $\rho(x,t)\,dx\,dt$ (by sources).
Together:
$$\Bigl(\gamma\,\frac{\partial^2U}{\partial x^2} + \rho(x,t)\Bigr)\,dx\,dt.$$
On the other hand this is equal to $c\,(\partial U/\partial t)\,dx\,dt$. Thus one obtains the heat equation
$$c\,\frac{\partial U}{\partial t} - \gamma\,\frac{\partial^2U}{\partial x^2} = \rho.$$
Let us assume that the heat equation is valid for distributions ρ too, e.g. for
– $\rho(x,t) = \delta(x)$: each second one unit of heat is supplied to the bar, only at place $x = 0$;
– $\rho(x,t) = \delta(x)\,\delta(t)$: one unit of heat is supplied to the bar during one second, only at $t = 0$ and only at place $x = 0$.
We shall solve the following Cauchy problem: find a function $U(x,t)$ for $t \ge 0$ which satisfies the heat equation with initial condition $U(x,0) = U_0(x)$, with $U_0$ a $C^2$ function.
We will consider the heat equation as a distribution equation and therefore set
$$\widetilde U(x,t) = \begin{cases} U(x,t) & \text{if } t \ge 0,\\ 0 & \text{if } t < 0,\end{cases}$$
and we assume that U is $C^2$ as a function of x and $C^1$ as a function of t for $t \ge 0$. We have
$$\frac{\partial^2\widetilde U}{\partial x^2} = \Bigl\{\frac{\partial^2U}{\partial x^2}\Bigr\},\qquad \frac{\partial\widetilde U}{\partial t} = \Bigl\{\frac{\partial U}{\partial t}\Bigr\} + U_0(x)\,\delta(t),$$
so that $\widetilde U$ satisfies
$$c\,\frac{\partial\widetilde U}{\partial t} - \gamma\,\frac{\partial^2\widetilde U}{\partial x^2} = \widetilde\rho(x,t) + c\,U_0(x)\,\delta(t). \tag{7.3}$$
We assume, in addition, that
– $x \mapsto \rho(x,t)$ is slowly increasing for all t,
– $x \mapsto \widetilde U(x,t)$ is slowly increasing for all t,
– $x \mapsto U_0(x)$ is slowly increasing.
Set $V(y,t) = \mathcal F_x\widetilde U(x,t)$, $\sigma(y,t) = \mathcal F_x\widetilde\rho(x,t)$, $V_0(y) = \mathcal F_x U_0(x)$. Equation (7.3) becomes, after taking the Fourier transform with respect to x,
$$c\,\frac{\partial V(y,t)}{\partial t} + 4\pi^2\gamma y^2\,V(y,t) = \sigma(y,t) + c\,V_0(y)\,\delta(t).$$
Now fix y (that is: consider the distribution $\varphi \mapsto \langle\,\cdot\,,\,\varphi(t)\otimes\psi(y)\rangle$ for fixed $\psi \in \mathcal D$). We obtain a convolution equation in $\mathcal D'_+$, namely
$$(c\,\delta' + 4\pi^2\gamma y^2\,\delta) * V(y,t) = \sigma(y,t) + c\,V_0(y)\,\delta(t).$$
Call $A = c\,\delta' + 4\pi^2\gamma y^2\,\delta$. Then $A^{-1} = (Y(t)/c)\,e^{-4\pi^2(\gamma/c)y^2t}$. We obtain the solution
$$V(y,t) = \frac1c\,\sigma(y,t) *_t Y(t)\,e^{-4\pi^2(\gamma/c)y^2t} + V_0(y)\,Y(t)\,e^{-4\pi^2(\gamma/c)y^2t}.$$
Observe that $\widetilde U(x,t)$ is recovered from $V(y,t)$ by the inverse Fourier transform in y.
Important special case: $U_0(x) = 0$, $\widetilde\rho(x,t) = \delta(x)\,\delta(t)$. Then one asks in fact for an elementary solution of equation (7.3). We have $\sigma(y,t) = \delta(t)$, $V_0(y) = 0$, hence
$$V(y,t) = \frac{Y(t)}{c}\,e^{-4\pi^2(\gamma/c)y^2t}.$$
An elementary solution of the heat equation is thus given by
$$\widetilde U(x,t) = \frac{Y(t)}{2c\sqrt{\pi(\gamma/c)t}}\;e^{-\frac{x^2}{4(\gamma/c)t}}.$$
We leave it to the reader to check that $\widetilde U$ indeed satisfies
$$c\,\frac{\partial\widetilde U}{\partial t} - \gamma\,\frac{\partial^2\widetilde U}{\partial x^2} = \delta(x)\,\delta(t).$$
If there are no heat sources, then $U(x,t)$ is given by
$$U(x,t) = c\,U_0(x) *_x E(x,t) \qquad (t > 0),$$
with
$$E(x,t) = \frac{1}{2c\sqrt{\pi(\gamma/c)t}}\;e^{-\frac{x^2}{4(\gamma/c)t}} \qquad (t > 0).$$
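As a quick numerical sanity check — ours, not in the book — one can verify by central finite differences that E satisfies the homogeneous heat equation $c\,E_t = \gamma\,E_{xx}$ for $t > 0$, and that the total heat content $c\int E\,dx$ equals the one injected unit:

```python
import math

def E(x, t, c=1.0, gamma=1.0):
    # E(x,t) = exp(-x^2/(4 (gamma/c) t)) / (2 c sqrt(pi (gamma/c) t)), for t > 0.
    a = gamma / c
    return math.exp(-x * x / (4.0 * a * t)) / (2.0 * c * math.sqrt(math.pi * a * t))

c, gamma, h = 1.0, 1.0, 1e-4
x, t = 0.7, 0.3
E_t = (E(x, t + h) - E(x, t - h)) / (2.0 * h)                  # time derivative
E_xx = (E(x + h, t) - 2.0 * E(x, t) + E(x - h, t)) / (h * h)   # second space derivative
print(c * E_t - gamma * E_xx)   # ~ 0 for t > 0

dx = 1e-3
total = c * sum(E(i * dx, t) for i in range(-8000, 8001)) * dx  # total heat content
print(total)                    # ~ 1.0
```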
We leave it as an exercise to solve the heat equation in three dimensions,
$$c\,\frac{\partial U}{\partial t} - \gamma\,\Delta U = \rho.$$
Further Reading
More on the Fourier transform of functions can be found in [4] and the references therein. More on tempered solutions of differential equations can be found, for example, in [7]. Notice that the Poisson equation has a tempered elementary solution. Is this a general phenomenon for differential equations with constant coefficients? Another, more modern, method for analyzing images and sounds is the wavelet transform. For an introduction, see [3].
8 The Laplace Transform
Summary
The Laplace transform is a useful tool for finding not necessarily tempered solutions of differential equations. We introduce it here for the real line. It turns out that there is an intimate connection with the symbolic calculus of Heaviside.
Learning Targets
Understanding how the Laplace transform works.
8.1 Laplace Transform of a Function
For any function f on $\mathbb R$ that is locally integrable on $[0,\infty)$, we define the Laplace transform $\mathcal L$ by
$$\mathcal Lf(p) = \int_0^{\infty} f(t)\,e^{-pt}\,dt \qquad (p \in \mathbb C).$$
Existence
1. If for $t \ge 0$ the function $e^{-\xi_0t}f(t)$ is in $L^1([0,\infty))$ for some $\xi_0 \in \mathbb R$, then clearly $\mathcal Lf$ exists as an absolutely convergent integral for all $p \in \mathbb C$ with $\mathrm{Re}(p) \ge \xi_0$. Furthermore, $\mathcal Lf$ is analytic for $\mathrm{Re}(p) > \xi_0$ and, in that region, we have for all positive integers m
$$(\mathcal Lf)^{(m)}(p) = \int_0^{\infty}(-t)^m f(t)\,e^{-pt}\,dt.$$
Observe that it suffices to show the latter statements for $\xi_0 = 0$ and $m = 1$. In that case we have for p with $\mathrm{Re}(p) > 0$ the inequalities
$$\Bigl|f(t)\,\frac{e^{-(p+h)t} - e^{-pt}}{h}\Bigr| = |f(t)|\,e^{-\xi t}\,\Bigl|\frac{e^{-ht}-1}{h}\Bigr| \le |f(t)|\,e^{-\xi t}\,t\,e^{t|h|} \le |f(t)|\,t\,e^{-(\xi-\varepsilon)t}$$
if $|h| < \varepsilon$ and $\xi = \mathrm{Re}(p)$. Because $\xi > 0$, one has $\xi - \varepsilon > 0$ for ε small. Set $\xi - \varepsilon = \delta$. Then there is a constant $C > 0$ such that $t \le C\,e^{\frac12\delta t}$. We may thus conclude that the above expressions are bounded by $C\,|f(t)|$ provided $|h| < \varepsilon$. Now apply Lebesgue’s theorem on dominated convergence.
2. If $f(t) = O(e^{kt})$ $(t \to \infty)$, then $\mathcal Lf$ exists for $\mathrm{Re}(p) > k$. In particular, slowly increasing functions have a Laplace transform for $\mathrm{Re}(p) > 0$. If f has compact support, then $\mathcal Lf$ is an entire analytic function.
3. If $f(t) = e^{-t^2}$, then $\mathcal Lf$ is entire. If $f(t) = e^{t^2}$, then $\mathcal Lf$ does not exist.
Examples
1. $\mathcal L1 = 1/p$, $\mathcal L(t^m) = m!/p^{m+1}$ $(\mathrm{Re}(p) > 0)$. By abuse of notation we frequently omit the argument p in the left-hand side of the equations.
2. Let $f(t) = t^{\alpha-1}$, α being a complex number with $\mathrm{Re}(\alpha) > 0$. Then $\mathcal Lf(p) = \Gamma(\alpha)/p^{\alpha}$ for $\mathrm{Re}(p) > 0$.
3. Set for $c > 0$
$$H_c(t) = \begin{cases} 1 & \text{if } t \ge c,\\ 0 & \text{elsewhere.}\end{cases}$$
Then $\mathcal L(H_c)(p) = e^{-cp}/p$ for all p with $\mathrm{Re}(p) > 0$.
4. Let $f(t) = e^{-\alpha t}$ (α complex). Then $\mathcal Lf(p) = 1/(p+\alpha)$ for $\mathrm{Re}(p) > -\mathrm{Re}(\alpha)$. In particular, for a real,
– $\mathcal L(\sin at) = \dfrac{a}{p^2+a^2}$ for $\mathrm{Re}(p) > 0$,
– $\mathcal L(\cos at) = \dfrac{p}{p^2+a^2}$ for $\mathrm{Re}(p) > 0$.
5. Set $h(t) = e^{-at}f(t)$ with $a \in \mathbb R$ and assume that $\mathcal Lf$ exists for $\mathrm{Re}(p) > \xi_0$. Then $\mathcal Lh$ exists for $\mathrm{Re}(p) > \xi_0 - a$ and $\mathcal Lh(p) = \mathcal Lf(p+a)$. In particular,
– $\mathcal L(e^{-at}\sin bt) = \dfrac{b}{(p+a)^2+b^2}$ for $\mathrm{Re}(p) > -a$,
– $\mathcal L(e^{-at}\cos bt) = \dfrac{p+a}{(p+a)^2+b^2}$ for $\mathrm{Re}(p) > -a$,
– $\mathcal L\Bigl[\dfrac{1}{\Gamma(\alpha)}\,t^{\alpha-1}e^{\lambda t}\Bigr] = \dfrac{1}{(p-\lambda)^{\alpha}}$ for $\mathrm{Re}(p) > \mathrm{Re}(\lambda)$ and $\mathrm{Re}(\alpha) > 0$.
6. Let $h(t) = f(\lambda t)$ $(\lambda > 0)$ and let $\mathcal Lf$ exist for $\mathrm{Re}(p) > \xi_0$. Then $\mathcal Lh(p)$ exists for $\mathrm{Re}(p) > \lambda\xi_0$ and $\mathcal Lh(p) = (1/\lambda)\,\mathcal Lf(p/\lambda)$.
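Formulas like those in Example 4 are easy to confirm numerically. The crude quadrature below is our own sketch (the helper name is ours), approximating $\mathcal Lf(p)$ for real $p > 0$:

```python
import math

def laplace_numeric(f, p, t_max=20.0, steps=200000):
    # Trapezoid approximation of Lf(p) = int_0^inf f(t) e^{-p t} dt, real p > 0;
    # the tail beyond t_max is negligible once e^{-p t_max} is tiny.
    h = t_max / steps
    total = 0.5 * f(0.0)
    for i in range(1, steps):
        t = i * h
        total += f(t) * math.exp(-p * t)
    return total * h

p, a = 2.0, 3.0
print(laplace_numeric(lambda t: math.sin(a * t), p))  # ~ a/(p^2 + a^2) = 3/13
print(a / (p * p + a * a))
```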
8.2 Laplace Transform of a Distribution
Let $T \in \mathcal D'_+$, thus the support of T is contained in $[0,\infty)$. Assume that there exists $\xi_0 \in \mathbb R$ such that $e^{-\xi_0t}T_t \in \mathcal S'(\mathbb R)$. Then evidently $e^{-pt}T_t \in \mathcal S'(\mathbb R)$ as well for all $p \in \mathbb C$ with $\mathrm{Re}(p) \ge \xi_0$. Let now α be a $C^\infty$ function with $\mathrm{Supp}\,\alpha$ bounded from the left and $\alpha(t) = 1$ in a neighborhood of $[0,\infty)$. Then $\alpha(t)\,e^{pt} \in \mathcal S(\mathbb R)$ for all $p \in \mathbb C$ with $\mathrm{Re}(p) < 0$. Therefore $\langle e^{-\xi_0t}T_t,\;\alpha(t)\,e^{(-p+\xi_0)t}\rangle$ exists for $\mathrm{Re}(p) > \xi_0$, commonly abbreviated as $\langle T_t,\,e^{-pt}\rangle$. We thus define for $T \in \mathcal D'_+$ as above
Definition 8.1.
$$\mathcal LT(p) = \langle T_t,\,e^{-pt}\rangle \qquad (\mathrm{Re}(p) > \xi_0).$$
Comments
1. The definition does not depend on the special choice of α.
2. $\mathcal LT$ is analytic for $\mathrm{Re}(p) > \xi_0$ (without proof; a proof can be given by using a slight modification of Corollary 7.15, performing a translation of the interval).
3. The definition coincides with the one given for functions in Section 8.1 in the case of a regular distribution.
4. For any positive integer m, $(\mathcal LT)^{(m)}(p) = \mathcal L((-t)^mT)(p)$ for $\mathrm{Re}(p) > \xi_0$.
5. $\mathcal L(\lambda T + \mu S) = \lambda\,\mathcal L(T) + \mu\,\mathcal L(S)$ for $\mathrm{Re}(p)$ large, provided both S and T admit a Laplace transform.
Examples
1. $\mathcal L\delta = 1$, $\mathcal L\delta^{(m)} = p^m$, $\mathcal L\delta_{(a)} = e^{-ap}$.
2. If $T \in \mathcal E'$ then $\mathcal LT$ is an entire analytic function.
3. Based on the formulae $\mathcal L\delta^{(m)} = p^m$, $\mathcal L(t^l) = l!/p^{l+1}$ and $\mathcal L(e^{-at}T)(p) = \mathcal LT(p+a)$, one can determine the inverse Laplace transform of any rational function $P(p)/Q(p)$ (P, Q polynomials) as a distribution in $\mathcal D'_+$. One just applies a partial fraction decomposition.
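The partial-fraction method can be illustrated numerically — a sketch of ours, with our own helper name. Take $F(p) = 1/(p(p+1)) = 1/p - 1/(p+1)$; inverting term by term gives $f(t) = Y(t)(1 - e^{-t})$, and its Laplace transform indeed reproduces F:

```python
import math

def laplace(f, p, t_max=40.0, steps=200000):
    # Trapezoid approximation of Lf(p) = int_0^inf f(t) e^{-p t} dt, real p > 0.
    h = t_max / steps
    total = 0.5 * f(0.0)
    for i in range(1, steps):
        t = i * h
        total += f(t) * math.exp(-p * t)
    return total * h

f = lambda t: 1.0 - math.exp(-t)   # inverse transform of 1/p - 1/(p+1)
p = 2.0
print(laplace(f, p))               # ~ 1/(p(p+1)) = 1/6
print(1.0 / (p * (p + 1.0)))
```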
8.3 Laplace Transform and Convolution
Let S and T be two distributions in $\mathcal D'_+$ such that $e^{-\xi t}S$ and $e^{-\xi t}T$ are tempered distributions for all $\xi \ge \xi_0$. Then according to [10], Chapter VIII (or, alternatively, using again the slight modification of Corollary 7.15), we have
$$e^{-\xi t}(S*T) \in \mathcal S' \qquad \text{for } \xi > \xi_0.$$
Moreover,
$$\mathcal L(S*T)(p) = \langle S_t \otimes T_u,\,e^{-p(t+u)}\rangle = \langle S_t,\,e^{-pt}\rangle\,\langle T_u,\,e^{-pu}\rangle = \mathcal LS(p)\cdot\mathcal LT(p)$$
for $\mathrm{Re}(p) > \xi_0$. Therefore, $\mathcal L(S*T) = \mathcal LS \cdot \mathcal LT$.
As a corollary we get:
$$\mathcal L(T^{(m)})(p) = \mathcal L(\delta^{(m)}*T)(p) = p^m\,\mathcal LT(p) \qquad \text{for } \mathrm{Re}(p) > \max(\xi_0, 0).$$
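The multiplication rule is easy to check numerically for regular distributions — a sketch of ours, with our own helper name. With $f = g = Y(t)\,e^{-t}$ one has $(f*g)(t) = t\,e^{-t}$, and $\mathcal L(f*g)(p)$ should equal $(\mathcal Lf(p))^2 = 1/(p+1)^2$:

```python
import math

def laplace(f, p, t_max=40.0, steps=200000):
    # Trapezoid approximation of Lf(p) = int_0^inf f(t) e^{-p t} dt, real p > 0.
    h = t_max / steps
    total = 0.5 * f(0.0)
    for i in range(1, steps):
        t = i * h
        total += f(t) * math.exp(-p * t)
    return total * h

f = lambda t: math.exp(-t)          # f = g = Y(t) e^{-t} on t >= 0
conv = lambda t: t * math.exp(-t)   # (f * g)(t) = t e^{-t}
p = 2.0
lhs = laplace(conv, p)              # L(f * g)(p)
rhs = laplace(f, p) ** 2            # Lf(p) * Lg(p)
print(lhs, rhs)                     # both ~ 1/(p+1)^2 = 1/9
```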
Let now f be a function on $\mathbb R$ with $f(t) = 0$ for $t < 0$. Assume that $e^{-\xi t}f(t)$ is integrable for $\xi > \xi_0$ and, in addition, that f is $C^1$ for $t > 0$, that $\lim_{t\downarrow0}f(t) = f(0)$ and that $\lim_{t\downarrow0}f'(t)$ is finite. We know then that
$$\{f\}' = \{f'\} + f(0)\,\delta,$$
in which $\{f\}'$ is the derivative of the distribution $\{f\}$. Then $\mathcal Lf'$ is clearly defined where $\mathcal L\{f\}'$ is defined, hence where $\mathcal Lf$ is defined. Furthermore,
$$\mathcal Lf'(p) = p\,\mathcal Lf(p) - f(0) \qquad (\mathrm{Re}(p) > \xi_0).$$
By induction we obtain the following generalization. Let f be m times continuously differentiable on $(0,\infty)$, and assume that $\lim_{t\downarrow0}f^{(k)}(t)$ exists for all $k \le m$. Set $\lim_{t\downarrow0}f^{(k)}(t) = f^{(k)}(0)$ $(0 \le k \le m-1)$. Then
$$\mathcal Lf^{(m)}(p) = p^m\,\mathcal Lf(p) - \bigl\{f^{(m-1)}(0) + p\,f^{(m-2)}(0) + \cdots + p^{m-1}f(0)\bigr\} \qquad (\mathrm{Re}(p) > \xi_0).$$
Examples
1. Let $f = Y$ be the Heaviside function. Then $f' = 0$ and $f(0) = 1$. Thus, as we know already, $\mathcal L(Y)(p) = 1/p$.
2. $Y(t)\sin at$ has Laplace transform $a/(p^2+a^2)$ for $\mathrm{Re}(p) > 0$. Differentiation gives: $aY(t)\cos at$ has Laplace transform $ap/(p^2+a^2)$ for $\mathrm{Re}(p) > 0$. See also Section 8.1.
3. Recall that
$$Y(t)\,e^{\lambda t}\,\frac{t^{\alpha-1}}{\Gamma(\alpha)} * Y(t)\,e^{\lambda t}\,\frac{t^{\beta-1}}{\Gamma(\beta)} = Y(t)\,e^{\lambda t}\,\frac{t^{\alpha+\beta-1}}{\Gamma(\alpha+\beta)}$$
for $\mathrm{Re}(\alpha) > 0$, $\mathrm{Re}(\beta) > 0$; see Section 6.2. Laplace transformation gives
$$\frac{1}{(p-\lambda)^{\alpha}}\cdot\frac{1}{(p-\lambda)^{\beta}} = \frac{1}{(p-\lambda)^{\alpha+\beta}} \qquad (\mathrm{Re}(p) > \mathrm{Re}(\lambda)).$$
The second relation is easier than the first one, of course. The second relation implies the first one, because of
Theorem 8.2. Let $T \in \mathcal D'_+$ satisfy $e^{-\xi t}T_t \in \mathcal S'(\mathbb R)$ for $\xi > \xi_0$. If $\mathcal LT(p) = 0$ for $\mathrm{Re}(p) > \xi_0$, then $T = 0$.
Proof. One has $\langle e^{-\xi t}T,\,\alpha(t)\,e^{-\varepsilon t}\,e^{2\pi i\eta t}\rangle = 0$ for all $\eta \in \mathbb R$ and all $\varepsilon > 0$. Here $\alpha(t)$ is chosen as in the definition of $\mathcal LT$ and ξ is fixed, $\xi > \xi_0$. Let $\varphi \in \mathcal S$. Then $\langle e^{-\xi t}T,\,\alpha(t)\,e^{-\varepsilon t}\,\widehat\varphi(\eta)\,e^{2\pi i\eta t}\rangle = 0$ for all $\eta \in \mathbb R$ and all $\varepsilon > 0$. Applying Fubini’s theorem in $\mathcal S'(\mathbb R)$ gives
$$0 = \int_{-\infty}^{\infty}\bigl\langle e^{-\xi t}T,\,\alpha(t)\,e^{-\varepsilon t}\,\widehat\varphi(\eta)\,e^{2\pi i\eta t}\bigr\rangle\,d\eta = \bigl\langle e^{-\xi t}T,\,\alpha(t)\,\varphi(t)\,e^{-\varepsilon t}\bigr\rangle$$
78
The Laplace Transform
for all $\varphi \in \mathcal S$ and all $\varepsilon > 0$. Hence $\langle e^{-(\xi+\varepsilon)t}T,\,\varphi\rangle = 0$ for all $\varphi \in \mathcal S$ and all $\varepsilon > 0$; thus, for $\varphi \in \mathcal D$, $\langle T,\varphi\rangle = \langle e^{-(\xi+\varepsilon)t}T,\,e^{(\xi+\varepsilon)t}\varphi\rangle = 0$, so $T = 0$.
“Symbolic calculus” is nothing else than taking Laplace transforms: $\mathcal L$ maps the subalgebra of $\mathcal D'_+$ generated by δ and $\delta'$ onto the algebra of polynomials in p. If the polynomial P is the image of A, $\mathcal LA = P$, and if $\mathcal LB = 1/P$ for some $B \in \mathcal D'_+$, then B is the inverse of A by Theorem 8.2.
The Abel Integral Equation
The origin of this equation is in [N. Abel, Solution de quelques problèmes à l’aide d’intégrales définies, Werke (Christiania 1881), I, pp. 11–27]. The equation plays a role in Fourier analysis on the upper half plane and other symmetric spaces. The general form of the equation is
$$\int_0^t \frac{\varphi(\tau)}{(t-\tau)^{\alpha}}\,d\tau = f(t) \qquad (t \ge 0,\; 0 < \alpha < 1),$$
a Volterra integral equation of the first kind. Abel considered only the case $\alpha = 1/2$. The solution of this equation falls outside the scope of symbolic calculus. Set $\varphi(t) = f(t) = 0$ for $t < 0$. Then the equation changes into a convolution equation in $\mathcal D'_+$,
$$Y(t)\,t^{-\alpha} * \varphi = f.$$
Let us assume that ϕ and f admit a Laplace transform. Set $\mathcal L\varphi = \Phi$, $\mathcal Lf = F$. Then we obtain
$$\frac{\Gamma(1-\alpha)}{p^{1-\alpha}}\,\Phi(p) = F(p),$$
hence
$$\Phi(p) = \frac{p\,F(p)}{\Gamma(1-\alpha)\,p^{\alpha}} = \frac{\mathcal Lf'(p) + f(0)}{\Gamma(1-\alpha)\,p^{\alpha}}.$$
Therefore
$$\varphi(t) = \frac{1}{\Gamma(1-\alpha)\,\Gamma(\alpha)}\Bigl[\,Y(t)\,t^{\alpha-1} * f'(t) + f(0)\,Y(t)\,t^{\alpha-1}\Bigr] = \frac{\sin\pi\alpha}{\pi}\biggl[\int_0^t \frac{f'(\tau)}{(t-\tau)^{1-\alpha}}\,d\tau + \frac{f(0)}{t^{1-\alpha}}\biggr]$$
for $t \ge 0$, almost everywhere. We have assumed that f is continuous for $t \ge 0$, that $f'$ exists and is continuous for $t > 0$, and that $\lim_{t\downarrow0}f'(t)$ is finite. We have also applied the formula $\Gamma(\alpha)\,\Gamma(1-\alpha) = \pi/\sin\pi\alpha$ for $0 < \alpha < 1$; see equation (10.8).
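For $\alpha = 1/2$ and $f(t) = t$ the inversion formula gives $\varphi(t) = 2\sqrt t/\pi$. Substituting this back into the integral equation is a good check; the numerical sketch below is ours (the substitution $\tau = t\sin^2\theta$ removes the endpoint singularity):

```python
import math

def abel_lhs(phi, t, steps=20000):
    # int_0^t phi(tau) (t - tau)^{-1/2} dtau, computed with tau = t sin^2(theta):
    #   = int_0^{pi/2} phi(t sin^2 th) * 2 sqrt(t) sin(th) dth   (smooth integrand)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps + 1):
        th = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * phi(t * math.sin(th) ** 2) * 2.0 * math.sqrt(t) * math.sin(th)
    return total * h

phi = lambda tau: 2.0 * math.sqrt(tau) / math.pi   # solution for f(t) = t, alpha = 1/2
print(abel_lhs(phi, 1.5))   # ~ f(1.5) = 1.5
```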
8.4 Inversion Formula for the Laplace Transform
Let f be a function satisfying $f(t) = 0$ for $t < 0$ and $e^{-\xi t}f(t) \in L^1$ for $\xi > \xi_0$. Set $F(p) = \mathcal Lf(p)$, thus
$$F(\xi + i\eta) = \int_0^{\infty} f(t)\,e^{-\xi t}\,e^{-i\eta t}\,dt \qquad (\xi > \xi_0).$$
If $\eta \mapsto F(\xi + i\eta) \in L^1$ for some $\xi > \xi_0$, then
$$f(t)\,e^{-\xi t} = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\xi + i\eta)\,e^{i\eta t}\,d\eta$$
at the points where f is continuous, or
$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\xi + i\eta)\,e^{(\xi+i\eta)t}\,d\eta = \frac{1}{2\pi i}\int_{\xi-i\infty}^{\xi+i\infty} F(p)\,e^{pt}\,dp.$$
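The contour integral can be evaluated numerically — the sketch and helper name below are ours. Take $F(p) = 1/(p+1)^3$, the Laplace transform of $f(t) = Y(t)\,t^2e^{-t}/2$; here $\eta \mapsto F(\xi+i\eta)$ is integrable for $\xi > -1$, so the inversion formula applies:

```python
import cmath, math

def bromwich(F, t, xi, eta_max=200.0, steps=40000):
    # f(t) = (1/(2 pi)) int_{-eta_max}^{eta_max} F(xi + i eta) e^{(xi + i eta) t} d eta,
    # a trapezoid approximation of the inversion integral (truncation ~ 1/eta_max^2).
    h = 2.0 * eta_max / steps
    total = 0.0 + 0.0j
    for i in range(steps + 1):
        eta = -eta_max + i * h
        w = 0.5 if i in (0, steps) else 1.0
        p = complex(xi, eta)
        total += w * F(p) * cmath.exp(p * t)
    return (total * h / (2.0 * math.pi)).real

F = lambda p: 1.0 / (p + 1.0) ** 3   # Laplace transform of Y(t) t^2 e^{-t} / 2
t = 1.0
print(bromwich(F, t, xi=0.0), t * t * math.exp(-t) / 2.0)   # both ~ 0.1839
```

One can also check that the formula returns (approximately) zero for $t < 0$, in agreement with $\mathrm{Supp}\,f \subset [0,\infty)$.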
The following theorem holds.
Theorem 8.3. An analytic function F coincides with the Laplace transform of a distribution $T \in \mathcal D'_+$ in some open right half plane if and only if there exists $c \in \mathbb R$ such that $F(p)$ is defined for $\mathrm{Re}(p) > c$ and
$$|F(p)| \le \text{polynomial in } |p| \qquad (\mathrm{Re}(p) > c).$$
Proof. “Necessary”. Let $T \in \mathcal D'_+$ be such that $e^{-ct}T_t \in \mathcal S'$ for some $c > 0$. Then $T = S + U$ with $S \in \mathcal D'_+$ of compact support, $\mathrm{Supp}\,U \subset (0,\infty)$ and $e^{-ct}U_t \in \mathcal S'$. This is evident: choose a function $\alpha \in \mathcal D$ such that $\alpha(x) = 1$ in a neighborhood of $x = 0$ and set $S = \alpha T$, $U = (1-\alpha)T$. Then $\mathcal LS(p)$ is an entire function of p of polynomial growth. To see this, compare with the proof of Theorem 7.17. Furthermore, $e^{-ct}U_t$ is of the form $f^{(m)}$ for some continuous, slowly increasing function f with $\mathrm{Supp}\,f \subset (0,\infty)$, by Corollary 7.15. Then
$$U_t = e^{ct}f^{(m)} = \sum_{k=0}^{m}(-1)^k\binom{m}{k}c^k\,(e^{ct}f)^{(m-k)}.$$
Moreover $e^{-\xi t}\,e^{ct}f(t) \in L^1$ for $\xi > c$, hence $\mathcal L(e^{ct}f)(p)$ exists and is bounded for $\mathrm{Re}(p) \ge c+1$. Therefore
$$\mathcal LU(p) = \sum_{k=0}^{m}(-1)^k\binom{m}{k}c^k\,\mathcal L\bigl((e^{ct}f)^{(m-k)}\bigr)(p) = \sum_{k=0}^{m}(-1)^k\binom{m}{k}c^k\,p^{m-k}\,\mathcal L(e^{ct}f)(p)$$
satisfies $|\mathcal LU(p)| \le$ polynomial in $|p|$ of degree m, for $\mathrm{Re}(p) > c+1$. This completes this part of the proof.
“Sufficient”. Let F be analytic and $|F(p)| \le C/|p|^2$ for $\xi = \mathrm{Re}(p) > c > 0$, C being a positive constant. Then $|F(\xi+i\eta)| \le C/(\xi^2+\eta^2)$ for $\xi > c$ and all η. Therefore,
$$f(t) = \frac{1}{2\pi i}\int_{\xi-i\infty}^{\xi+i\infty} F(p)\,e^{pt}\,dp$$
exists for $\xi > c$ and all t, and does not depend on the choice of $\xi > c$, by Cauchy’s theorem. Let us write
$$e^{-\xi t}f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\xi+i\eta)\,e^{i\eta t}\,d\eta \qquad (\xi > c).$$
Then we see that f is continuous. Moreover, if $e^{-\xi t}f(t) \in L^1$ for $\xi > c$, then $F(p) = \int_{-\infty}^{\infty} f(t)\,e^{-pt}\,dt$ for $\xi = \mathrm{Re}(p) > c$, by Corollary 7.2 on the inversion of the Fourier transform. We shall now show
(i) $f(t) = 0$ for $t < 0$,
(ii) $e^{-\xi t}f(t) \in L^1$ for $\xi > c$.
It then follows that $F = \mathcal Lf$. To show (i), observe that
$$|f(t)| \le \frac{C\,e^{\xi t}}{2\pi c}\int_{-\infty}^{\infty}\frac{c\,d\eta}{c^2+\eta^2} = \frac{C\,e^{\xi t}}{2c} \qquad (\xi > c).$$
For $t < 0$ we thus obtain $|f(t)| \le \lim_{\xi\to\infty} C\,e^{\xi t}/(2c) = 0$, hence $f(t) = 0$. To show (ii), use again $|f(t)| \le C\,e^{\xi t}/(2c)$ for all $\xi > c$, hence $|f(t)| \le C\,e^{ct}/(2c)$. Therefore
$$e^{-\xi t}\,|f(t)| \le \frac{C}{2c}\,e^{-(\xi-c)t} \in L^1 \qquad \text{for } \xi > c \text{ and } t \ge 0.$$
Let now, more generally, F be analytic and
$$|F(p)| \le \text{polynomial in } |p| \text{ of degree } m,$$
for $\xi = \mathrm{Re}(p) > c > 0$. Then $|F(p)/p^{m+2}| \le C/|p|^2$ for $\xi > c$ and some constant C. Therefore, there is a continuous function f such that (i) $f(t) = 0$ for $t < 0$, (ii) $e^{-\xi t}f(t) \in L^1$ for $\xi > c$, and
$$F(p) = p^{m+2}\,\mathcal Lf(p) \qquad (\xi > c).$$
Then $F(p) = \mathcal L(f^{(m+2)})(p)$ for $\xi > c$ or, otherwise formulated, $F = \mathcal LT$ with $T = d^{m+2}f/dt^{m+2}$, the derivative being taken in the sense of distributions.
Notice that the theorem implies that any $T \in \mathcal D'_+$ which admits a Laplace transform is, for some positive integer m, the m-th derivative (in the sense of distributions) of a continuous function f with support in $[0,\infty)$ such that $e^{-\xi_0t}f(t) \in L^1$ for some $\xi_0$.
Further Reading Further reading may be done in the “bible” of the Laplace transform [17].
9 Summable Distributions*
Summary
This chapter may be considered the completion of the classification of distributions. As indicated, it can be omitted on a first reading.
Learning Targets Understanding the structure of summable distributions.
9.1 Definition and Main Properties
We recall the definition $|k| = k_1 + \cdots + k_n$ if $k = (k_1,\ldots,k_n)$ is an n-tuple of nonnegative integers. We also recall from Section 2.2:
Definition 9.1. A summable distribution on $\mathbb R^n$ is a distribution T with the following property: there exist an integer $m \ge 0$ and a constant C satisfying
$$|\langle T,\varphi\rangle| \le C\sum_{|k|\le m}\sup|D^k\varphi| \qquad (\varphi \in \mathcal D(\mathbb R^n)).$$
The smallest number m satisfying this inequality is called the summability order of T, abbreviated sum-order(T). Observe that locally, i.e. on every open bounded subset of $\mathbb R^n$, each distribution T is a summable distribution. Moreover, any distribution with compact support is summable. A summable distribution is of finite order in the usual sense, and we have $\mathrm{order}(T) \le \text{sum-order}(T)$, with equality if T has compact support. It follows immediately from the definition that the derivative of a summable distribution is summable, with
$$\text{sum-order}\Bigl(\frac{\partial T}{\partial x_i}\Bigr) \le \text{sum-order}(T) + 1.$$
The space of summable distributions of sum-order 0 coincides with the space $M^b(\mathbb R^n)$ of bounded measures on $\mathbb R^n$, that is, with the space of measures μ with finite total mass $\int_{\mathbb R^n}|d\mu(x)|$. This is a well-known fact. First observe that a distribution of order 0 can be uniquely extended to a continuous linear form on the space $\mathcal D^0$ of continuous functions with compact support, provided with its usual topology (see [2], Chapter III), thus to a measure, using a sequence of functions $\{\varphi_k\}$ as in Section 6.2 and defining $\langle T,\varphi\rangle = \lim_{k\to\infty}\langle T,\,\varphi_k*\varphi\rangle$ for $\varphi \in \mathcal D^0$. Next apply, for example, [2], Chapter IV, § 4, Nr. 7. If $\mu \in M^b$ then the distribution $D^k\mu$ is summable with sum-order $\le |k|$. We shall show the following structure theorem:
Theorem 9.2 (L. Schwartz [10]). Let T be a distribution. Then the following conditions on T are equivalent:
1. T is summable.
2. T is a finite sum $\sum_k D^k\mu_k$ of derivatives of bounded measures $\mu_k$.
3. T is a finite sum $\sum_k D^kf_k$ of derivatives of $L^1$ functions $f_k$.
4. For every $\alpha \in \mathcal D$, $\alpha*T$ belongs to $M^b(\mathbb R^n)$.
5. For every $\alpha \in \mathcal D$, $\alpha*T$ belongs to $L^1(\mathbb R^n)$.
We need some preparations for the proof.
9.2 The Iterated Poisson Equation
In this section we will consider elementary solutions of the iterated Poisson equation
$$\Delta^l E = \delta,$$
where l is a positive integer and Δ the Laplace operator in $\mathbb R^n$. One has, similarly to the case $l = 1$ (see Section 3.5),
$$\Delta^l\,r^{2l-n} = (2l-n)(2l-2-n)\cdots(4-n)(2-n)\;2^{l-1}(l-1)!\;\frac{2\pi^{\frac n2}}{\Gamma(\frac n2)}\,\delta.$$
Thus, if $2l-n < 0$, or $2l-n \ge 0$ and n odd, then there exists a constant $B_{l,n}$ such that
$$\Delta^l\,B_{l,n}\,r^{2l-n} = \delta.$$
If now $2l-n \ge 0$ and n even, then
$$\Delta^l\,(r^{2l-n}\log r) = [[(2l-n)(2l-2-n)\cdots(4-n)(2-n)]]\;2^{l-1}(l-1)!\;\frac{2\pi^{\frac n2}}{\Gamma(\frac n2)}\,\delta,$$
where in the expression between double square brackets the factor 0 has to be omitted. Thus, there exists a constant $A_{l,n}$ such that
$$\Delta^l\,A_{l,n}\,r^{2l-n}\log r = \delta.$$
We conclude that for all l and n there exist constants $A_{l,n}$ and $B_{l,n}$ such that
$$\Delta^l\,r^{2l-n}\,(A_{l,n}\log r + B_{l,n}) = \delta.$$
Let us call this elementary solution of the iterated Poisson equation $E = E_{l,n}$. We now replace E by a parametrix γE, with $\gamma \in \mathcal D(\mathbb R^n)$, $\gamma(x) = 1$ for x in a neighborhood of 0 in $\mathbb R^n$. Then one has
$$\Delta^l(\gamma E) - \zeta = \delta \quad\text{for some } \zeta \in \mathcal D,$$
and hence for any $T \in \mathcal D'(\mathbb R^n)$
$$\Delta^l(\gamma E * T) - \zeta*T = T.$$
This result, of which the proof is straightforward, implies the following. If we take for T a bounded measure and take l so large that E is a continuous function, then every bounded measure is a ﬁnite sum of derivatives of functions in L1 (Rn ) (which are continuous and converge to zero at inﬁnity). This proves in particular the equivalence of 2. and 3. of Theorem 9.2. Moreover, replacing T with α ∗ T in the above parametrix formula easily implies the equivalence of 4. and 5.
9.3 Proof of the Main Theorem To prove Theorem 9.2, it is sufﬁcient to show the implications: 2 ⇒ 1 ⇒ 4 ⇒ 2. 2 ⇒ 1. Let T = k≤m D k μk . Then k k ≤ C μ T , ϕ = (−1) , D ϕ sup D k ϕ k k≤m
k≤m
for some constant C and all ϕ ∈ D. Hence T is summable. 1 ⇒ 4. Let α ∈ D be ﬁxed and T summable. Then T satisﬁes a relation of the form T , ϕ ≤ C sup D k ϕ (ϕ ∈ D) . k≤m
Therefore, ˇ ∗ ϕ ≤ C T ∗ α, ϕ = T , α
ˇ ∗ ϕ sup D k α
k≤m
⎛ ≤C ⎝
⎞ ˇ 1 ⎠ sup ϕ sup D k α
(ϕ ∈ D) .
k≤m
Hence T ∗ α ∈ Mb . 4 ⇒ 2. This is the most difﬁcult part. From the relation ˇ α ˇ = T ∗ α, ϕ T ∗ ϕ,
for α, ϕ ∈ D, we see, by 4, that ˇ α ˇ ≤ Cα T ∗ ϕ,
for all ϕ ∈ D with sup ϕ ≤ 1, with Cα a positive constant. By the Banach– Steinhaus theorem (or the principle of uniform boundedness) in D, see Section 10.1, we get: for any open bounded subset K of Rn there is a positive integer m and a constant CK such that ˇ α ˇ ≤ CK T ∗ ϕ, sup D k α (9.1) k≤m
Canonical Extension of a Summable Distribution
85
for all α ∈ D with Supp α ⊂ K and all ϕ ∈ D with sup ϕ ≤ 1. Let us denote by Dm the space of C m functions with compact support and by Dm (K) the subspace consisting of all α ∈ Dm with Supp α ⊂ K . Similarly, deﬁne D(K) as the space of functions α ∈ D with Supp α ⊂ K . Let us provide Dm (K) with the norm α = sup D k α α ∈ Dm (K) . k≤m
It is not difﬁcult to show that D(K) is dense in Dm (K). Indeed, taking a sequence of functions ϕk (k = 1, 2, . . .) as in Section 6.2, one sees, by applying the relation D l (ϕk ∗ ϕ) = ϕk ∗ D l ϕ for any ntuple of nonnegative integers l with l ≤ m and using the uniform continuity of D l ϕ, that any ϕ ∈ Dm (K) can be approximated in the norm topology of Dm (K) by functions of the form ϕk ∗ ϕ, which are in D(K) ˇ can be for k large. From equation (9.1) we then see that every distribution T ∗ ϕ m uniquely extended to D (K), such that equation (9.1) remains to hold. Therefore, ˇ α ˇ = T ∗ α, ϕ(ϕ ∈ D). for all α ∈ Dm (K), T ∗ α is in Mb , since still T ∗ ϕ, Now apply the parametrix formula again, taking α = γ E = γ El,n with l large enough in order that α ∈ Dm . The proof of the theorem is now complete.
9.4 Canonical Extension of a Summable Distribution The content of this section (and the next one) is based on notes by the late Erik Thomas [13]. Let T be a summable distribution of sumorder m. Denote by Bm = Bm (Rn ) the space of C m functions ϕ on Rn which are bounded as well as its derivatives D k ϕ with k ≤ m. Let us provide Bm with the norm pm (ϕ) = sup D k ϕ (ϕ ∈ Bm ) . k≤m
Then, by the Hahn–Banach theorem, T can be extended to a continuous linear form on Bm . C m Let B = ∞ and provide it with the convergence principle: a sequence m=0 B {ϕj } in B converges to zero if, for all m, the scalars pm (ϕj ) tend to zero when j tends to inﬁnity. Then the extension of T to Bm is also a continuous linear form on B. There exists a variety of extensions in general. We will now deﬁne a canonical extension of T to B. To this end we make use of a representation of T as a sum of derivatives of bounded measures T = k≤m D k μk . The number m arising in this summation may be larger than the sumorder of T . With help of this representation we can easily extend T to a continuous linear form on B, since bounded measures are naturally deﬁned on B. Indeed, the functions in B are integrable with respect to any bounded measure. We shall show that such an extension does not depend on the speciﬁc
86
Summable Distributions*
representation of T as a sum of derivatives of bounded measures, it is a canonical extension. The reason is the following. Observe that any μ ∈ Mb is concentrated, up to ε > 0, on a compact set (depending on μ ). It follows that a canonical extension has a special property, T has the bounded convergence property. A continuous linear form on Bm is said to have the bounded convergence property of order m if the following holds: if ϕj ∈ Bm , supj pm (ϕj ) < ∞ and ϕj → 0 in E m , then T , ϕj → 0 when j tends to inﬁnity. A similar deﬁnition holds in B: a continuous linear form on B has the bounded convergence property if ϕj ∈ B, supj pm (ϕj ) < ∞ for all m, and ϕj → 0 in E , then T , ϕj → 0 when j tends to inﬁnity. Relying on this property we can easily relate T with its extension to B. Indeed, let α ∈ D be such that α(x) = 1 in a neighborhood of x = 0, and set αj (x) = α(x/j). Then, for any ϕ ∈ B the functions αj ϕ tend to ϕ in E and supj pm (ϕj ) < ∞ for all m. Hence T , ϕ = limj→∞ T , ϕj . Therefore the extension of T to B does not depend on the speciﬁc representation. We call it the canonical extension. In particular one can deﬁne the total mass T , 1 of T , sometimes denoted by T (dx), which accounts for the name “summable distribution”. Furthermore, the canonical extension satisﬁes again T , ϕ ≤ C pm (ϕ) for some constant C and all ϕ ∈ B, m being now the sumorder of T . Conversely, any continuous linear form T on B gives, by restriction to D, a summable distribution. There is however only a onetoone relation between T and its restriction to D if we impose the condition: T has the bounded convergence property. The above allows one to deﬁne several operations on summable distribution which are common for bounded measures. Fourier Transform
Any summable distribution T is tempered, so has a Fourier transform. But there is more: F T is function, F T (y) = Tx , e−2π ix, y
(y ∈ Rn ) .
Here we use the canonical extension of T . Clearly F T is continuous (since any μ ∈ Mb is concentrated up to ε > 0 on a compact set) and of polynomial growth. Convolution
Let T and S be summable distributions. Then the convolution product S ∗ T exists. ˇ ∈ D). This is a good deﬁnition, since Indeed, let us deﬁne S∗T , ϕ = T , ϕ∗S(ϕ ϕ ∗ Sˇ ∈ B. Clearly S ∗ T is summable again. Due to the canonical representation of T and S as a sum of derivatives of bounded measures, we see that this convolution product is commutative and associative and furthermore, F (T ∗ S) = F (T ) · F (S) .
Rank of a Distribution
87
Let O be the space of functions f ∈ E(Rn ) such that f and the derivatives of f have at most polynomial growth. Then O operates by multiplication on the space S(Rn ) and on S (Rn ). Moreover, since the functions in O have polynomial growth, they deﬁne themselves tempered distributions. Theorem 9.3. Every T ∈ F (O) is summable. More precisely, if P is a polynomial then P T is summable. Conversely, if P T is summable for every polynomial P , then T ∈ F (O). Proof. If T = F (f ) with f ∈ O and α ∈ D, then α ∗ T = F (βf ) with β ∈ S the inverse Fourier transform of α. Since βf belongs to S we have α ∗ T ∈ S, a fortiori α ∗ T ∈ L1 . Therefore, by Theorem 9.2, T is summable. Similarly, if P is a polynomial, then P T = F (Df ) for some differential operator with constant coefﬁcients, so P T ∈ F (O). Conversely, if P T is summable for all polynomials P , then D F (T ) is continuous and of polynomial growth for all differential operators D with constant coefﬁcients, and so F (T ) belongs to O (cf. Exercise 4.4 b.).
9.5 Rank of a Distribution What is the precise relation between the sumorder of T and the number of terms in a representation of T as a sum of derivatives of bounded measures from Theorem 9.2. To answer this question, we introduce the notion of rank of T . For every ntuple of nonnegative integers k = (k1 , . . . , kn ) we set k∞ = max1≤i≤n ki . A partial differential operator with constant coefﬁcients ak is said to have rank m if it is of the form D= ak D k k∞ ≤m
and not every ak with k∞ = m vanishes. Deﬁnition 9.4. A distribution T is said to have ﬁnite rank if there exists a positive integer m and a constant C > 0 such that T , ϕ ≤ C sup D k ϕ (ϕ ∈ D) . k∞ ≤m
The smallest such m is called the rank of T . Clearly, k∞ ≤ k ≤ nk∞ . So the distributions of ﬁnite rank coincide with the summable distributions. There is however a striking difference between sumorder and rank: the sumorder may grow with the dimension while the rank remains constant. For example sumorder(Δδ) = 2n and rank(Δδ) = 2, Δ being the usual Laplacian in Rn .
88
Summable Distributions*
Deﬁnition 9.5. A molliﬁer of rank m is a bounded measure K having the following properties: 1.
D k K is a bounded measure for all ntuples k with k∞ ≤ m.
2.
There exists a differential operator with constant coefﬁcients D of rank m such that DK = δ .
Theorem 9.6. There exists molliﬁers of rank m. Let K be a molliﬁer of rank m and let D be a differential operator of rank m such that DK = δ. Then if T is any summable distribution of rank ≤ m, the summable distribution μ = K ∗ T is a bounded measure and T = Dμ . Thus for summable distributions of rank ≤ m we get a representation of the form T = D k μk , k∞ ≤m
which gives an answer to the question posed at the beginning of this section. It turns out that the rank of a distribution is an important notion. Let E(m) (Rn ) be the space of functions ϕ such that D k ϕ exists and is continuous for all k with k∞ ≤ m, and let B(m) (Rn ) be the subspace of function ϕ ∈ E(m) such that D k ϕ is bounded for all k with k∞ ≤ m. Notice that B(m) is a Banach space with norm p(m) (ϕ) = sup D k ϕ (ϕ ∈ B(m) ) . k∞ ≤m
Clearly any summable distribution T of rank ≤ m has a canonical extension to the space B(m) , having the bounded convergence property of rank m: if ϕj → 0 in E(m) and supj p(m) (ϕj ) < ∞, then T , ϕj → 0. This follows in the same way as before. Proof. (Theorem 9.6) For dimension n = 1 and rank m = 1, we set, if a > 0, L(x) = La (x) =
1 Y (x) e−x/a . a
Here Y is, as usual, the Heaviside function. Furthermore, set D =1+a
d . dx
Then DL = δ and L and L are bounded measures. Now consider the case of rank 2 in dimension n = 1 in more detail. For a > 0 set ˇa , with L ˇa (x) = (1/a) Y (−x) ex/a . Then Ka = La ∗ L K(x) = Ka (x) =
1 −x/a e . 2a
Rank of a Distribution
Set
89
d d2 d 1−a = 1 − a2 D = 1+a . dx dx dx 2
Then K, K and K are bounded measures, and ϕ = ϕ ∗ δ = ϕ ∗ DK = K ∗ Dϕ ,
DK = δ ,
ϕ = K ∗ Dϕ .
ϕ = K ∗ Dϕ ,
Therefore there exists a constant C such that for all ϕ ∈ D(R), sup ϕ ≤ C sup Dϕ, sup ϕ  ≤ C sup Dϕ, sup ϕ  ≤ C sup Dϕ
and thus, if T has rank ≤ 2, we therefore have, for some C , T , ϕ ≤ C sup Dϕ
for all ϕ ∈ D(R), but also for ϕ ∈ B(m) (R). It follows that K ∗ T , ϕ = T , K ∗ ϕ ≤ C sup DK ∗ ϕ ≤ C sup ϕ
for ϕ ∈ D(R), so that K ∗ T is a bounded measure. Let us now consider the case of dimension n > 1, but still m = 2. Set in this case K(x) = Ka (x1 ) ⊗ · · · ⊗ Ka (xn )
and D=
n / i=1
#
∂2 1−a ∂xi2 2
$ .
Then D k K is a bounded measure for k∞ ≤ 2. By the same argument as before, we see that for some C , sup D k ϕ ≤ C sup Dϕ ϕ ∈ D(Rn ) if k∞ ≤ 2, and if T has rank ≤ 2, we have again T , ϕ ≤ C sup Dϕ ϕ ∈ D(Rn ) , so that μ = K ∗ T is a bounded measure and T = Dμ . For higher rank m we take H = K ∗ · · · ∗ K (p times) if m = 2p , and H = K ∗· · ·∗K ∗L (p times K) if m is odd, m = 2p +1. Here L(x) = L(x1 )⊗· · ·⊗ L(xn ) = (1/an ) Y (x1 , . . . , xn ) · e−(x1 +···+xn )/a . We choose also the corresponding powers of the differential operators, and ﬁnd in the same way: if T has rank ≤ m, then μ = H ∗ T is a bounded measure and T = Dμ .
Further Reading An application of this chapter is a mathematically correct deﬁnition of the Feynman path integral, see [13].
10 Appendix 10.1 The Banach–Steinhaus Theorem Topological spaces We recall some facts from topology. Let X be a set. Then X is said to be a topological space if a system of subsets T has been selected in X with the following three properties:
1.
∅ ∈ T ,X ∈ T .
2.
The union of arbitrary many sets of T belongs to T again.
3.
The intersection of ﬁnitely many sets of T belongs to T again.
The elements of T are called open sets; their complements are called closed sets. Let x ∈ X. An open set U containing x is called a neighborhood of x. The intersection of two neighborhoods of x is again a neighborhood of x. A system of neighborhoods U(x) of x is called a neighborhood basis of x if every neighborhood of x contains an element of this system. Suppose that we have selected a neighborhood basis for every element of X. Then, clearly, the neighborhood basis of a specific element x has the following properties:
1. x ∈ U(x).
2. The intersection U₁(x) ∩ U₂(x) of two arbitrary neighborhoods in the basis contains an element of the basis.
3. If y ∈ U(x), then there is a neighborhood U(y) in the basis of y with U(y) ⊂ U(x).
If, conversely, for any x ∈ X a system of sets U(x) is given satisfying the above three properties, then we can easily define a topology on X such that these sets form a neighborhood basis for x, for any x. Indeed, one just calls a set open in X if it is a union of sets of the form U(x). Let X be a topological space. Then X is called a Hausdorff space if for any two points x and y with x ≠ y there exist neighborhoods U of x and V of y with U ∩ V = ∅. Let f be a mapping from a topological space X to a topological space Y. One says that f is continuous at x ∈ X if for any neighborhood V of f(x) there is a neighborhood U of x with f(U) ⊂ V. The mapping is called continuous on X if it is continuous at every point of X. This property is obviously equivalent to saying that the inverse image of any open set in Y is open in X. Let X again be a topological space and Z a subset of X. Then Z can also be seen as a topological space, a topological subspace of X, with the induced topology: open sets in Z are the intersections of open sets in X with Z.
Fréchet spaces
Let now X be a complex vector space.

Definition 10.1. A seminorm on X is a function p : X → R satisfying
(i) p(x) ≥ 0;
(ii) p(λx) = |λ| p(x);
(iii) p(x + y) ≤ p(x) + p(y) (triangle inequality)
for all x, y ∈ X and λ ∈ C.

Observe that a seminorm differs only slightly from a norm: it is allowed that p(x) = 0 for some x ≠ 0. Clearly, by (iii), p is a convex function, hence all sets of the form {x ∈ X : p(x) < c}, with c a positive number, are convex. We will consider vector spaces X provided with a countable set of seminorms p₁, p₂, …. Given x ∈ X, ε > 0 and a positive integer m, we define the sets V(x, m; ε) as follows:
 V(x, m; ε) = { y ∈ X : p_k(x − y) < ε for k = 1, 2, …, m } .
Then, clearly, the intersection of two such sets again contains a set of this form. Furthermore, if y ∈ V(x, m; ε), then there exists ε′ > 0 such that V(y, m; ε′) ⊂ V(x, m; ε). We can thus provide X with a topology by calling a subset of X open if it is a union of sets of the form V(x, m; ε). This topology has the following properties: the mappings
 (x, y) ↦ x + y ,  (λ, x) ↦ λx ,
from X × X → X and C × X → X, respectively, are continuous. The space X is said to be a topological vector space. Clearly, any x ∈ X has a neighborhood basis consisting of convex sets. Therefore X is called a locally convex (topological vector) space. The space X is a Hausdorff space if and only if for any nonzero x ∈ X there is a seminorm p_k with p_k(x) ≠ 0. From now on we shall always assume that this property holds. Let {xₙ} be a sequence of elements xₙ ∈ X. One says that {xₙ} converges to x ∈ X, in notation lim_{n→∞} xₙ = x or xₙ → x (n → ∞), if for any neighborhood V of x there exists a natural number N such that xₙ ∈ V for n ≥ N. Alternatively, one may say that {xₙ} converges to x if for any ε > 0 and any k ∈ N there is a natural number N = N(k, ε) satisfying p_k(xₙ − x) < ε for n ≥ N. Because X is a Hausdorff space, the limit x, if it exists, is uniquely determined. A sequence {xₙ} is called a Cauchy sequence if for any neighborhood V of x = 0 there is a natural number N such that xₙ − x_m ∈ V for n, m ≥ N. Alternatively, {xₙ} is a Cauchy sequence if for any ε > 0 and any k ∈ N there is a natural number N = N(k, ε) satisfying p_k(xₙ − x_m) < ε for n, m ≥ N.
The space X is said to be complete (also: sequentially complete) if every Cauchy sequence converges.

Definition 10.2. A complete locally convex Hausdorff topological vector space whose topology is defined by a countable set of seminorms is called a Fréchet space.

Examples of Fréchet spaces
1. Let K be an open bounded subset of Rⁿ and let D(K) be the space of functions ϕ ∈ D(Rⁿ) with Supp ϕ ⊂ K. Provide D(K) with the set of seminorms (actually norms in this case) given by
  p_m(ϕ) = sup_{|k| ≤ m} sup |D^k ϕ|  (ϕ ∈ D(K)) .
 Then D(K) is a Fréchet space. Notice that the seminorms form an increasing sequence in this case: p_m(ϕ) ≤ p_{m+1}(ϕ) for all m and ϕ. This property may be imposed on any Fréchet space X by passing to the new seminorms
  p′_m = p₁ + ⋯ + p_m  (m ∈ N) .
 Obviously, these new seminorms define the same topology on X.
2. The spaces E and S. For E we choose the seminorms p_{k,K}(ϕ) = sup_{x∈K} |D^k ϕ(x)|, k being an n-tuple of nonnegative integers and K a compact subset of Rⁿ. Taking for K a ball around x = 0 of radius l (l ∈ N), which suffices, we obtain a countable set of seminorms. For S we choose the seminorms
  p_{k,l}(ϕ) = sup_x |x^l D^k ϕ(x)| .
 Here both k and l are n-tuples of nonnegative integers.
The dual space
Let X be a Fréchet space with an increasing set of seminorms p_m (m = 1, 2, …). Denote by X′ the space of (complex-valued) continuous linear forms x′ on X. The value of x′ at x is usually denoted by ⟨x′, x⟩, showing a nice bilinear correspondence between X and X′:
– ⟨x′, λx + μy⟩ = λ⟨x′, x⟩ + μ⟨x′, y⟩,
– ⟨λx′ + μy′, x⟩ = λ⟨x′, x⟩ + μ⟨y′, x⟩,
for x, y ∈ X, x′, y′ ∈ X′, λ, μ ∈ C. The space X′ is called the (continuous) dual space of X.
Proposition 10.3. A linear form x′ on X is continuous at x = 0 if and only if there are m ∈ N and a constant c > 0 such that
 |⟨x′, x⟩| ≤ c p_m(x)
for all x ∈ X. In this case x′ is continuous everywhere on X.

Proof. Let x′ be continuous at x = 0. Then there is a neighborhood U(0) of x = 0 such that |⟨x′, x⟩| < 1 for x ∈ U(0). Since the set of seminorms is increasing, we may assume that U(0) is of the form {x : p_m(x) ≤ δ} for some δ > 0. Then we get for any x ∈ X and any ε > 0,
 x₁ = δx / (p_m(x) + ε) ∈ U(0) ,
thus |⟨x′, x₁⟩| < 1 and
 |⟨x′, x⟩| = |⟨x′, x₁⟩| · (p_m(x) + ε)/δ < (p_m(x) + ε)/δ .
Hence |⟨x′, x⟩| ≤ c p_m(x) for all x ∈ X with c = 1/δ, since ε was arbitrary. Conversely, any linear form satisfying a relation
 |⟨x′, x⟩| ≤ c p_m(x)  (x ∈ X)
for some m ∈ N and some constant c > 0 is clearly continuous at x = 0. But we get more: if x₀ ∈ X is arbitrary, then we obtain
 |⟨x′, x − x₀⟩| ≤ c p_m(x − x₀)  (x ∈ X) ,
hence x′ is continuous at x₀. Indeed, if ε > 0 is given, then
 |⟨x′, x⟩ − ⟨x′, x₀⟩| = |⟨x′, x − x₀⟩| < ε
if p_m(x − x₀) < ε/c, and this latter inequality defines a neighborhood of x₀.
We can also describe continuity of a linear form x′ in other ways.

Proposition 10.4. For a linear form x′ on X the following conditions are equivalent:
(i) x′ is continuous with respect to the topology of X;
(ii) there exist m ∈ N and c > 0 such that |⟨x′, x⟩| ≤ c p_m(x) for all x ∈ X;
(iii) for any sequence {xₙ} with lim_{n→∞} xₙ = x one has lim_{n→∞} ⟨x′, xₙ⟩ = ⟨x′, x⟩.
Proof. The equivalence of (i) and (ii) is the content of Proposition 10.3, so it is sufficient to show the equivalence of (ii) and (iii). Obviously, (ii) implies (iii). The converse implication is proved by contradiction. If condition (ii) is not satisfied, then, taking c = m, we can find for every m ∈ N an element x_m ∈ X with |⟨x′, x_m⟩| = 1 and p_m(x_m) < 1/m. Then lim_{m→∞} x_m = 0, because the sequence of seminorms is increasing. But |⟨x′, x_m⟩| = 1 for all m. This contradicts (iii).

There exist sufficiently many continuous linear forms on X, as the following proposition shows.

Proposition 10.5. For any x₀ ≠ 0 in X there is a continuous linear form x′ on X with ⟨x′, x₀⟩ ≠ 0.

Proof. Since x₀ ≠ 0, there exists a seminorm p_m with p_m(x₀) > 0. By the Hahn–Banach theorem, there exists a linear form x′ on X with |⟨x′, x⟩| ≤ p_m(x) for all x ∈ X and ⟨x′, x₀⟩ = p_m(x₀). By Proposition 10.4 the linear form x′ is continuous.
Metric spaces
Let X be a set. We recall:
Definition 10.6. A metric on X is a mapping d : X × X → R with the following three properties:
(i) d(x, y) ≥ 0 for all x, y ∈ X, and d(x, y) = 0 if and only if x = y;
(ii) d(x, y) = d(y, x) for all x, y ∈ X;
(iii) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X (triangle inequality).

Clearly, the balls B(x, ε) = {y ∈ X : d(x, y) < ε}, where ε is a positive number, can serve as a neighborhood basis of x ∈ X. The topology generated by these balls is called the topology associated with the metric d. The set X, provided with this topology, is then called a metric space. Let now X be a Fréchet space with an increasing set p₁ ≤ p₂ ≤ ⋯ of seminorms. We shall define a metric on X in such a way that the topology associated with the metric is the same as the original topology defined by the seminorms. The space X is then called metrizable.

Theorem 10.7. Any Fréchet space is metrizable with a translation invariant metric.

Proof. For the proof we follow [8], § 18, 2. Set
 ‖x‖ = Σ_{k=1}^{∞} 2^{−k} p_k(x) / (1 + p_k(x))
and define d by d(x, y) = ‖x − y‖ for x, y ∈ X. Then d is a translation invariant metric. To see this, observe that ‖x‖ = 0 if and only if p_k(x) = 0 for
all k, hence x = 0, by the Hausdorff property assumed earlier. Thus the first property of a metric is satisfied. Furthermore, d(x, y) = d(y, x) since ‖x‖ = ‖−x‖. The triangle inequality follows from the relation: if two real numbers a and b satisfy 0 ≤ a ≤ b, then a/(1 + a) ≤ b/(1 + b). Indeed, we then obtain
 p_k(x + y)/(1 + p_k(x + y)) ≤ (p_k(x) + p_k(y))/(1 + p_k(x) + p_k(y)) ≤ p_k(x)/(1 + p_k(x)) + p_k(y)/(1 + p_k(y))
for all x, y ∈ X. Next we have to show that the metric defines the same topology as the seminorms. For this it is sufficient to show that every neighborhood of x = 0 in the first topology contains a neighborhood of x = 0 in the second topology, and conversely.
a. The neighborhood given by the inequality ‖x‖ < 1/2^m contains the neighborhood given by p_{m+1}(x) < 1/2^{m+1}. Indeed, since p_k(x)/[1 + p_k(x)] ≤ p_k(x), we have for any x satisfying p_{m+1}(x) < 1/2^{m+1}: p₁(x) ≤ p₂(x) ≤ ⋯ ≤ p_{m+1}(x) < 1/2^{m+1}, hence ‖x‖ ≤ (1/2^{m+1}) Σ_{k=1}^{m+1} 1/2^k + Σ_{k=m+2}^{∞} 1/2^k < 1/2^{m+1} + 1/2^{m+1} = 1/2^m.
b. The neighborhood given by p_k(x) < 1/2^m contains the neighborhood given by ‖x‖ < 1/2^{m+k+1}. Indeed, for any x satisfying ‖x‖ < 1/2^{m+k+1} one has (1/2^k) · [p_k(x)/(1 + p_k(x))] < 1/2^{m+k+1}, hence p_k(x)/[1 + p_k(x)] < 1/2^{m+1}. Thus p_k(x) · (1 − 1/2^{m+1}) < 1/2^{m+1} and therefore p_k(x) < 1/2^m.
This completes the proof of the theorem.
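The metric just constructed is easy to experiment with. The sketch below (an illustration of ours, not from the text) evaluates it on the sequence space with seminorms p_k(x) = |x_k|, truncating the series at K terms (each summand is below 1, so the tail is at most 2^−K), and checks translation invariance and the triangle inequality on a sample:

```python
# Sketch of the metric of Theorem 10.7 for the space of real sequences with
# seminorms p_k(x) = |x_k|; the series is truncated at K terms (tail ≤ 2^-K).

def norm_like(x, K=50):
    # ‖x‖ = Σ_{k≥1} 2^{-k} p_k(x)/(1 + p_k(x)), truncated at k = K
    return sum(2.0 ** -k * abs(x(k)) / (1 + abs(x(k))) for k in range(1, K + 1))

def d(x, y):
    return norm_like(lambda n: x(n) - y(n))

x = lambda n: 1.0 / n
y = lambda n: 0.0
z = lambda n: (-1.0) ** n

# translation invariance: d(x + z, y + z) = d(x, y)
print(abs(d(lambda n: x(n) + z(n), lambda n: y(n) + z(n)) - d(x, y)) < 1e-12)
# triangle inequality on this sample
print(d(x, y) <= d(x, z) + d(z, y))
```

Note that ‖·‖ is not a norm (it is not homogeneous); only the translation invariance of d survives.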
Baire's theorem
The following theorem, due to Baire, plays an important role in our final result.
Theorem 10.8. If a complete metric space is the countable union of closed subsets, then at least one of these subsets contains a nonempty open subset.

Proof. Let (X, d) be a complete metric space, Sₙ closed in X (n = 1, 2, …) and X = ∪_{n=1}^{∞} Sₙ. Suppose no Sₙ contains a nonempty open subset. Then S₁ ≠ X and X∖S₁ is open. There is x₁ ∈ X∖S₁ and a ball B(x₁, ε₁) ⊂ X∖S₁ with 0 < ε₁ < 1/2. Now the ball B(x₁, ε₁) is not contained in S₂, hence (X∖S₂) ∩ B(x₁, ε₁) contains a point x₂ and a ball B(x₂, ε₂) with 0 < ε₂ < 1/4. In this way we get a sequence of balls B(xₙ, εₙ) with
 B(x₁, ε₁) ⊃ B(x₂, ε₂) ⊃ ⋯ ,  0 < εₙ < 1/2ⁿ ,  B(xₙ, εₙ) ∩ Sₙ = ∅ .
For n < m one has d(xₙ, x_m) < 1/2ⁿ, which tends to zero as n, m → ∞. The Cauchy sequence {xₙ} has a limit x ∈ X since X is complete. But
 d(xₙ, x) ≤ d(xₙ, x_m) + d(x_m, x) < εₙ + d(x_m, x) → εₙ  (m → ∞) .
So x ∈ B(xₙ, εₙ) for all n. Hence x ∉ Sₙ for all n, so x ∉ X, which is a contradiction.
The Banach–Steinhaus theorem
We now come to our final result.
Theorem 10.9. Let X be a Fréchet space with an increasing set of seminorms. Let F be a subset of X′. Suppose that for each x ∈ X the family of scalars {⟨x′, x⟩ : x′ ∈ F} is bounded. Then there exist c > 0 and a seminorm p_m such that
 |⟨x′, x⟩| ≤ c p_m(x)
for all x ∈ X and all x′ ∈ F.

Proof. For the proof we stay close to [12], Chapter III, Theorem 9.1. Since X is a complete metric space, Baire's theorem can be applied. For each positive integer n let Sₙ = {x ∈ X : |⟨x′, x⟩| ≤ n for all x′ ∈ F}. The continuity of each x′ ensures that Sₙ is closed. By hypothesis, each x belongs to some Sₙ. Thus, by Baire's theorem, at least one of the Sₙ, say S_N, contains a nonempty open set and hence contains a set of the form V(x₀, m; ρ) = {x : p_m(x − x₀) < ρ} for some x₀ ∈ S_N, ρ > 0 and m ∈ N. That is, |⟨x′, x⟩| ≤ N for all x with p_m(x − x₀) < ρ and all x′ ∈ F. Now, if y is an arbitrary element of X with p_m(y) < 1, then we have x₀ + ρy ∈ V(x₀, m; ρ) and thus |⟨x′, x₀ + ρy⟩| ≤ N for all x′ ∈ F. It follows that
 |⟨x′, ρy⟩| ≤ |⟨x′, x₀ + ρy⟩| + |⟨x′, x₀⟩| ≤ 2N .
Letting c = 2N/ρ, we have for y with p_m(y) < 1,
 |⟨x′, y⟩| = (1/ρ) |⟨x′, ρy⟩| ≤ c ,
and thus, for any ε > 0,
 |⟨x′, y⟩| = |⟨x′, y/(p_m(y) + ε)⟩| · (p_m(y) + ε) ≤ c (p_m(y) + ε) ,
hence |⟨x′, y⟩| ≤ c p_m(y) for all y ∈ X and all x′ ∈ F.
The theorem just proved is known as the Banach–Steinhaus theorem, and also as the principle of uniform boundedness.
Application
We can now show the following result, announced in Section 4.3.

Theorem 10.10. Let {T_j} be a sequence of distributions such that the limit lim_{j→∞} ⟨T_j, ϕ⟩ exists for all ϕ ∈ D. Set ⟨T, ϕ⟩ = lim_{j→∞} ⟨T_j, ϕ⟩ (ϕ ∈ D). Then T is a distribution.
Proof. Clearly T is a linear form on D. We only have to prove the continuity of T. Let K be an open bounded subset of Rⁿ and consider the Fréchet space D(K) of functions ϕ ∈ D with Supp ϕ ⊂ K and seminorms
 p_m(ϕ) = sup_{|k| ≤ m} sup |D^k ϕ|  (ϕ ∈ D(K)) .
Since lim_{j→∞} ⟨T_j, ϕ⟩ exists, the family of scalars {⟨T_j, ϕ⟩ : j = 1, 2, …} is bounded for every ϕ ∈ D(K). Therefore, by the Banach–Steinhaus theorem, there exist a positive integer m and a constant C_K > 0 such that for each j = 1, 2, …
 |⟨T_j, ϕ⟩| ≤ C_K sup_{|k| ≤ m} sup |D^k ϕ|
for all ϕ ∈ D(K), hence
 |⟨T, ϕ⟩| ≤ C_K sup_{|k| ≤ m} sup |D^k ϕ|
for all ϕ ∈ D(K). Since K was arbitrary, it follows from Proposition 2.3 that T is a distribution.
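A numerical illustration (a hypothetical example of ours, not from the text): the distributions T_j = j(δ_{1/j} − δ₀) satisfy ⟨T_j, ϕ⟩ = j(ϕ(1/j) − ϕ(0)) → ϕ′(0) for every test function ϕ, so Theorem 10.10 guarantees that the limit ϕ ↦ ϕ′(0), i.e. −δ′, is again a distribution:

```python
import math

# T_j = j(δ_{1/j} - δ_0): ⟨T_j, ϕ⟩ = j(ϕ(1/j) - ϕ(0)) → ϕ'(0) = ⟨-δ', ϕ⟩.

def T(j, phi):
    return j * (phi(1.0 / j) - phi(0.0))

def phi(x):
    # a smooth function standing in for a test function near x = 0
    return math.sin(3 * x) * math.exp(-x * x)

for j in (10, 100, 10_000):
    print(j, T(j, phi))   # tends to ϕ'(0) = 3
```

Note how the limit functional has order 1 while each T_j has order 0; the theorem only guarantees that the limit is some distribution, not one of the same order.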
10.2 The Beta and Gamma Function

We begin with the definition of the gamma function:
 Γ(λ) = ∫₀^∞ e^{−x} x^{λ−1} dx .  (10.1)
This integral converges for Re λ > 0. Expanding x^{λ−λ₀} = e^{(λ−λ₀) log x} (x > 0) into a power series, we obtain for any (small) ε > 0 the inequality
 |x^{λ−1} − x^{λ₀−1}| ≤ x^{−ε} |λ − λ₀| |x^{λ₀−1} log x|  (0 < x ≤ 1)
for all λ with |λ − λ₀| < ε. A similar inequality holds for x ≥ 1, with −ε replaced by ε in the power of x. One then easily sees, by using Lebesgue's theorem on dominated convergence, that Γ(λ) is a complex analytic function of λ for Re λ > 0 and
 Γ′(λ) = ∫₀^∞ e^{−x} x^{λ−1} log x dx  (Re λ > 0) .
Using partial integration we obtain
 Γ(λ + 1) = λ Γ(λ)  (Re λ > 0) .  (10.2)
Since Γ(1) = 1, we have Γ(n + 1) = n! for n = 0, 1, 2, ….
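The integral (10.1), the recursion (10.2) and the value Γ(n + 1) = n! are easy to check numerically. The sketch below (an illustration of ours, standard library only) approximates the truncated integral by Simpson's rule for real λ ≥ 1, where the integrand is bounded:

```python
import math

def gamma_numeric(lam, upper=80.0, n=200_000):
    # Composite Simpson approximation of (10.1), ∫_0^∞ e^{-x} x^{λ-1} dx,
    # truncated at `upper`; valid here for real λ ≥ 1 (bounded integrand).
    h = upper / n
    def f(x):
        return math.exp(-x) * x ** (lam - 1) if x > 0 else (1.0 if lam == 1 else 0.0)
    s = f(0.0) + f(upper)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3

print(gamma_numeric(5.0))                            # Γ(5) = 4! = 24
print(gamma_numeric(3.5), 2.5 * gamma_numeric(2.5))  # (10.2): Γ(λ+1) = λ Γ(λ)
```

The truncation at x = 80 is harmless, since the neglected tail is of size e^{−80} times a polynomial.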
We are looking for an analytic continuation of Γ. The above relation (10.2) implies, for Re λ > 0,
 Γ(λ) = Γ(λ + n + 1) / [(λ + n)(λ + n − 1) ⋯ (λ + 1) λ] .  (10.3)
The right-hand side has, however, a meaning for all λ with Re λ > −n − 1 minus the points λ = 0, −1, …, −n. We use this expression as definition for the analytic continuation: for every λ with λ ≠ 0, −1, −2, … we determine a natural number n with n > −Re λ − 1. Then we define Γ(λ) as in equation (10.3). Using equation (10.2) one easily verifies that this is a good definition: the answer does not depend on the particular n chosen, provided n > −Re λ − 1. It follows also that the gamma function has simple poles at the points λ = 0, −1, −2, …, with residue at λ = −k equal to (−1)^k/k!. Substituting x = t² in equation (10.1) one gets
 Γ(λ) = 2 ∫₀^∞ e^{−t²} t^{2λ−1} dt ,  (10.4)
hence Γ(1/2) = √π. We continue with the beta function, defined for Re λ > 0, Re μ > 0 by
 B(λ, μ) = ∫₀^1 x^{λ−1} (1 − x)^{μ−1} dx .  (10.5)
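Before continuing, both the pole structure of the continuation and the value Γ(1/2) = √π can be probed numerically; Python's math.gamma implements the continued function for negative non-integer arguments (a small check of ours, not from the text):

```python
import math

# Residue of the continued Γ at λ = -k should be (-1)^k / k!:
# evaluate (λ + k) Γ(λ) just next to the pole.
for k in range(5):
    lam = -k + 1e-7
    print(k, (lam + k) * math.gamma(lam), (-1) ** k / math.factorial(k))

# (10.4) with λ = 1/2 gives Γ(1/2) = 2 ∫_0^∞ e^{-t²} dt = √π:
print(math.gamma(0.5), math.sqrt(math.pi))
```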
There is a nice relation with the gamma function. One has, for Re λ > 0, Re μ > 0, using equation (10.4),
 Γ(λ) Γ(μ) = 4 ∫₀^∞ ∫₀^∞ x^{2λ−1} t^{2μ−1} e^{−(x²+t²)} dx dt .
In polar coordinates we may write the integral as
 ∫₀^∞ ∫₀^{π/2} r^{2(λ+μ)−1} e^{−r²} (cos ϕ)^{2λ−1} (sin ϕ)^{2μ−1} dϕ dr = ½ Γ(λ + μ) ∫₀^{π/2} (cos ϕ)^{2λ−1} (sin ϕ)^{2μ−1} dϕ = ¼ Γ(λ + μ) B(λ, μ) ,
by substituting u = sin² ϕ in the latter integral. Hence
 B(λ, μ) = Γ(λ) Γ(μ) / Γ(λ + μ)  (Re λ > 0, Re μ > 0) .  (10.6)
Let us consider two special cases. For λ = μ and Re λ > 0 we obtain
 B(λ, λ) = Γ(λ)² / Γ(2λ) .
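Relation (10.6) is easy to test numerically; the sketch below (an illustration of ours) compares a Simpson approximation of (10.5) with the gamma quotient, for real λ, μ ≥ 1 so that the integrand is bounded:

```python
import math

def beta_numeric(lam, mu, n=100_000):
    # Composite Simpson approximation of (10.5) on (0, 1); take λ, μ ≥ 1
    # so that the integrand is bounded at the endpoints.
    h = 1.0 / n
    def f(x):
        return x ** (lam - 1) * (1 - x) ** (mu - 1)
    s = f(0.0) + f(1.0)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3

for lam, mu in ((2.0, 2.0), (3.0, 2.0), (2.5, 1.5)):
    print(beta_numeric(lam, mu),
          math.gamma(lam) * math.gamma(mu) / math.gamma(lam + mu))
```

For instance B(2, 2) = ∫₀¹ x(1 − x) dx = 1/6 agrees with Γ(2)Γ(2)/Γ(4) = 1/6.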
On the other hand,
 B(λ, λ) = ∫₀^1 x^{λ−1} (1 − x)^{λ−1} dx = 2 ∫₀^{π/2} (cos ϕ)^{2λ−1} (sin ϕ)^{2λ−1} dϕ = 2^{2−2λ} ∫₀^{π/2} (sin 2ϕ)^{2λ−1} dϕ .
Splitting the path of integration [0, π/2] into [0, π/4] ∪ [π/4, π/2], using the relation sin 2ϕ = sin 2(π/2 − ϕ) and substituting t = sin² 2ϕ, we obtain
 2^{2−2λ} ∫₀^{π/2} (sin 2ϕ)^{2λ−1} dϕ = 2^{1−2λ} ∫₀^1 t^{λ−1} (1 − t)^{−1/2} dt = 2^{1−2λ} Γ(λ) Γ(1/2) / Γ(λ + 1/2) .
Thus
 Γ(2λ) = 2^{2λ−1} π^{−1/2} Γ(λ) Γ(λ + 1/2) .  (10.7)
This formula is called the duplication formula for the gamma function. Another important relation is the following:
 Γ(λ) Γ(1 − λ) = B(λ, 1 − λ) = π / sin πλ ,  (10.8)
first for real λ with 0 < λ < 1 and then, by analytic continuation, for all noninteger λ in the complex plane. This formula can be shown by integration over paths in the complex plane. Consider
 B(λ, 1 − λ) = ∫₀^1 x^{λ−1} (1 − x)^{−λ} dx  (0 < λ < 1) .
Substituting x = t/(1 + t) we obtain
 B(λ, 1 − λ) = ∫₀^∞ t^{λ−1}/(1 + t) dt .
To determine this integral, consider for 0 < r < 1 < ρ the closed path W in the complex plane which arises by walking from −ρ to −r along the real axis (k₁), next in negative sense along the circle |z| = r from −r back to −r (k₂), then via −k₁ back to −ρ, and then in positive sense along the circle |z| = ρ from −ρ back to −ρ (k₃) (see Figure 10.1). Let us consider the function f(z) = z^{λ−1}/(1 − z) = e^{(λ−1) log z}/(1 − z). In order to define z^{λ−1} as an analytic function, we consider the region G₁, with border consisting of k₁, the part k₂₁ of k₂ between −r and ir, the segment k₄ of the imaginary axis between ir and iρ, and the part k₃₁ of k₃ between iρ and −ρ, and the region G₂, with border consisting of −k₁, the part k₃₂ of k₃ between −ρ and iρ, −k₄ and the part k₂₂ of k₂ between ir and −r. We define
Figure 10.1. The closed path W .
z^{λ−1} now on G₁ and G₂, respectively, including the border, by means of analytic extensions f₁ and f₂, respectively, of the principal value defined in a neighborhood of z = 1, and in such a way that f₁(z) = f₂(z) for all z ∈ k₄. This can be done, for example, by means of the cuts ϕ = −(3/4)π and ϕ = (3/4)π, respectively. We obtain
– on G₁: f₁(z) = e^{(λ−1)(log |z| + i arg z)}, π/2 ≤ arg z ≤ π,
– on G₂: f₂(z) = e^{(λ−1)(log |z| + i arg z)}, −π ≤ arg z ≤ π/2.
On k₄ we have to take arg z = π/2 in both cases. We now apply the residue theorem on G₁ with g₁(z) = f₁(z)/(1 − z) and on G₂ with g₂(z) = f₂(z)/(1 − z), respectively. Observe that g₁ has no poles in G₁ and that g₂ has a simple pole at z = 1 in G₂, with residue −f₂(1) = −1. We obtain
 ∮_{∂G₁} g₁(z) dz + ∮_{∂G₂} g₂(z) dz = −2πi .
From the definitions of g₁ and g₂ it follows that
 ∫_{k₄} g₁(z) dz + ∫_{−k₄} g₂(z) dz = 0 ,
 ∫_{k₁} g₁(z) dz = ∫_{−ρ}^{−r} e^{(λ−1)(log |x| + πi)}/(1 − x) dx = e^{(λ−1)πi} ∫_r^ρ y^{λ−1}/(1 + y) dy ,
 ∫_{−k₁} g₂(z) dz = ∫_{−r}^{−ρ} e^{(λ−1)(log |x| − πi)}/(1 − x) dx = −e^{−(λ−1)πi} ∫_r^ρ y^{λ−1}/(1 + y) dy .
Furthermore, for |z| = r we have |z^{λ−1}/(z − 1)| ≤ r^{λ−1}/(1 − r), hence
 | ∫_{k₂₁} g₁(z) dz + ∫_{k₂₂} g₂(z) dz | ≤ 2π r^λ / (1 − r) ,
and, similarly,
 | ∫_{k₃₁} g₁(z) dz + ∫_{k₃₂} g₂(z) dz | ≤ 2π ρ^λ / (ρ − 1) .
Taking the limits r ↓ 0 and ρ → ∞ (recall 0 < λ < 1), we see that
 [ e^{(λ−1)πi} − e^{−(λ−1)πi} ] I + 2πi = 0  with I = B(λ, 1 − λ) ,
hence
 B(λ, 1 − λ) = ∫₀^∞ t^{λ−1}/(1 + t) dt = π / sin πλ .
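The final integral can also be verified numerically. After the substitution t = e^s the integrand decays exponentially in both directions, so a truncated Simpson rule works well (an illustrative sketch of ours, not from the text):

```python
import math

def reflection_integral(lam, lo=-120.0, hi=120.0, n=240_000):
    # ∫_0^∞ t^{λ-1}/(1+t) dt with t = e^s becomes ∫_{-∞}^{∞} e^{λs}/(1+e^s) ds;
    # the integrand decays like e^{λs} (s → -∞) and e^{(λ-1)s} (s → +∞),
    # so truncation to [lo, hi] is harmless for 0 < λ < 1.
    h = (hi - lo) / n
    def f(s):
        return math.exp(lam * s) / (1.0 + math.exp(s))
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

for lam in (0.25, 0.5, 0.75):
    print(lam, reflection_integral(lam), math.pi / math.sin(math.pi * lam))
```

For λ = 1/2 the value is π, matching Γ(1/2)² = π.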
11 Hints to the Exercises

Exercise 3.2. Use the formula of jumps (Section 3.2, Example 3). For the last equation, apply induction with respect to m.

Exercise 3.3. Write |x| = Y(x) x − Y(−x) x. Then |x|′ = Y(x) − Y(−x) by the formula of jumps, hence |x|″ = 2δ. Thus |x|^{(k)} = 2δ^{(k−2)} for k ≥ 2.

Exercise 3.4.
Applying the formula of jumps, one could take f such that
 a f″ + b f′ + c f = 0 ,  a f(0) = n ,  a f′(0) + b f(0) = m .
This system has a well-known classical solution. The special cases are easily worked out.

Exercise 3.5. The first expression is easily shown to be equal to the derivative of the distribution Y(x) log x, which itself is a regular distribution. The second expression is the derivative of the distribution
 pv ( Y(x)/x ) − δ ,
and thus defines a distribution. This is easily worked out, applying partial integration.

Exercise 3.6.
This is most easily shown by performing the transformation x = u + v, y = u − v. Then dx dy = 2 du dv and ∂²ϕ/∂x² − ∂²ϕ/∂y² = ∂²Φ/∂u∂v for ϕ ∈ D(R²), with Φ(u, v) = ϕ(u + v, u − v). One then obtains
 ⟨ ∂²T/∂x² − ∂²T/∂y² , ϕ ⟩ = 2 [ ϕ(3, 1) − ϕ(2, 2) + ϕ(1, 1) − ϕ(2, 0) ]
for ϕ ∈ D(R²).

Exercise 3.7.
Notice that
 (∂/∂x + i ∂/∂y) 1/(x + iy) = 2 (∂/∂z̄)(1/z) = 0
for z ≠ 0. Clearly, 1/(x + iy) is a locally integrable function (use polar coordinates). Let χ ∈ D(R), χ ≥ 0, χ(x) = 1 in a neighborhood of x = 0 and Supp χ ⊂ (−1, 1). Set χ_ε(x, y) = χ[(x² + y²)/ε²] for any ε > 0. Then we have for ϕ ∈ D(R²)
 ⟨ (∂/∂x + i ∂/∂y) 1/(x + iy) , ϕ ⟩ = ⟨ (∂/∂x + i ∂/∂y) 1/(x + iy) , χ_ε ϕ ⟩
for any ε > 0. Working out the right-hand side in polar coordinates and letting ε tend to zero gives the result.

Exercise 3.8. Apply Green's formula, as in Section 3.5, or just compute using polar coordinates.
Exercise 3.9.
Apply Green’s formula, as in Section 3.5.
Exercise 3.10. Show first that ∂E/∂t − ∂²E/∂x² = 0 for t ≠ 0. Then just compute, applying partial integration.

Exercise 4.4.
a. The main task is to characterize all ψ ∈ D of the form ψ = ϕ′ for some ϕ ∈ D. Such ψ are easily shown to be the functions ψ in D with ∫_{−∞}^{∞} ψ(x) dx = 0.
b. Write F(x) = ∫₀^x g(t) dt and show that F = f + const.
Exercise 4.5.
The solutions are pv(1/x) + const.
Exercise 4.9.
a. Use partial integration.
b. One has
 ∫_{−∞}^{∞} (sin λx / x) ϕ(x) dx = ∫_{−A}^{A} sin λx · [ϕ(x) − ϕ(0)]/x dx + ϕ(0) ∫_{−A}^{A} (sin λx / x) dx
if Supp ϕ ⊂ [−A, A]. The first term tends to zero by Exercise 4.4 a. The second term is equal to
 ϕ(0) ∫_{−A}^{A} (sin λx / x) dx = ϕ(0) ∫_{−λA}^{λA} (sin y / y) dy ,
which tends to π ϕ(0) if λ → ∞, since ∫_{−∞}^{∞} (sin y / y) dy = π.
c. One has
 lim_{ε↓0} ∫_{|x|≥ε} (cos λx / x) ϕ(x) dx = ∫_{−∞}^{∞} [(cos λx − 1)/x] ϕ(x) dx + ⟨ pv(1/x) , ϕ ⟩ .
Hence ϕ ↦ Pf ∫_{−∞}^{∞} (cos λx / x) ϕ(x) dx is a distribution. If Supp ϕ ⊂ [−A, A], then
 ∫_{|x|≥ε} (cos λx / x) ϕ(x) dx = ∫_{A≥|x|≥ε} (cos λx / x) ϕ(x) dx
  = ∫_{A≥|x|≥ε} (cos λx / x) [ϕ(x) − ϕ(0)] dx + ϕ(0) ∫_{A≥|x|≥ε} (cos λx / x) dx .
The latter term is equal to zero. Hence
 lim_{ε↓0} ∫_{|x|≥ε} (cos λx / x) ϕ(x) dx = ∫_{−A}^{A} cos λx · [ϕ(x) − ϕ(0)]/x dx .
This term tends to zero again if λ → ∞, by Exercise 4.4 a.

Exercise 4.10.
 lim_{a↓0} a/(x² + a²) = π δ ,  lim_{a↓0} ax/(x² + a²) = 0 .
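Both limits can be observed numerically; the sketch below (an illustration of ours, with ϕ(x) = e^{−x²} standing in for a test function) pairs the two kernels against ϕ by Simpson's rule:

```python
import math

def pair(kernel, phi, lo=-50.0, hi=50.0, n=200_000):
    # Simpson approximation of ⟨kernel, ϕ⟩ = ∫ kernel(x) ϕ(x) dx
    h = (hi - lo) / n
    s = kernel(lo) * phi(lo) + kernel(hi) * phi(hi)
    for i in range(1, n):
        x = lo + i * h
        s += (4 if i % 2 else 2) * kernel(x) * phi(x)
    return s * h / 3

phi = lambda x: math.exp(-x * x)
for a in (1.0, 0.1, 0.01):
    even = pair(lambda x: a / (x * x + a * a), phi)
    odd = pair(lambda x: a * x / (x * x + a * a), phi)
    print(a, even, odd)   # even tends to π ϕ(0) = π, odd stays near 0
```

The second pairing vanishes for every a already by symmetry (odd integrand against an even ϕ); for a general test function it still tends to 0 as a ↓ 0.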
Exercise 6.16. (i) e^{−|x|} ∗ e^{−|x|} = (1 + |y|) e^{−|y|}. For (ii) and (iii), observe that
 x e^{−ax²} = −(1/2a) (d/dx) e^{−ax²} = −(1/2a) δ′ ∗ e^{−ax²} .
Exercise 6.17. Observe that f (x) = f (−x), Fm (x) = Fm (−x) and f = δ−1 ∗ Y − δ1 ∗ Y . Then Supp Fm ⊂ [−m, m]. Exercise 6.18.
Let A be the rotation over π/4 in R²:
 A = (1/√2) ( 1 −1 ; 1 1 ) .
Then χ(x) = Y(Ax) for x ∈ R². Use this to determine f ∗ g and χ ∗ χ.

Exercise 6.28.
Apply Sections 6.6 and 6.7.
Exercise 6.29.
Idem.
Exercise 6.30.
Condition: g has to be a continuous function.
Exercise 6.31.
Apply Sections 6.7 and 6.8.
Exercise 6.32.
Set, as usual, p = δ′ and write z for zδ.
Exercise 6.34.
Apply Section 6.10.
Exercise 7.20.
 T = pv(1/x) ,  T̂ = −πi ( Y(y) − Y(−y) ) .
Exercise 7.21.
 f̂(y) = sin 2πy/(2πy) + (cos 2πy − 1)/(2πy)  (y ≠ 0) ,  f̂(0) = 1 .
f̂ is C^∞, also because T_f ∈ E′.

Exercise 7.22. F(x) = −(1/(2π²)) Pf(1/x²).
Exercise 7.23.
Take the Fourier or Laplace transform of the distributions.
Exercise 7.24.
Observe that, if f_m tends to f in L², then T_{f_m} tends to T_f in D′.
References

[1] N. Bourbaki, Espaces Vectoriels Topologiques, Hermann, Paris, 1964.
[2] N. Bourbaki, Intégration, Chapitres 1, 2, 3 et 4, Hermann, Paris, 1965.
[3] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1992.
[4] G. van Dijk, Introduction to Harmonic Analysis and Generalized Gelfand Pairs, de Gruyter, Berlin, 2009.
[5] F. G. Friedlander and M. Joshi, Introduction to the Theory of Distributions, Cambridge University Press, Cambridge, 1998.
[6] I. M. Gelfand and G. E. Shilov, Generalized Functions, Vol. 1, Academic Press, New York, 1964.
[7] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. I–IV, Springer-Verlag, Berlin, 1983–85, second edition 1990.
[8] G. Köthe, Topologische Lineare Räume, Springer-Verlag, Berlin, 1960.
[9] L. Schwartz, Méthodes Mathématiques pour les Sciences Physiques, Hermann, Paris, 1965.
[10] L. Schwartz, Théorie des Distributions, nouvelle édition, Hermann, Paris, 1978.
[11] E. M. Stein and G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, Princeton, 1971.
[12] A. E. Taylor and D. C. Lay, Introduction to Functional Analysis, second edition, John Wiley and Sons, New York, 1980.
[13] E. G. F. Thomas, Path Distributions on Sequence Spaces, preprint, 2000.
[14] E. C. Titchmarsh, Theory of Fourier Integrals, Oxford University Press, Oxford, 1937.
[15] F. Trèves, Topological Vector Spaces, Distributions and Kernels, Academic Press, New York, 1967.
[16] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Cambridge University Press, Cambridge, 1980.
[17] D. Widder, The Laplace Transform, Princeton University Press, Princeton, 1941.
Index

Abel integral equation 78
averaging theorem 41
Banach–Steinhaus theorem 23, 84, 90
Bessel function 70
bounded
– from the left 34
– from the right 34
bounded convergence property 86
canonical extension 85, 86
Cauchy distribution 35
Cauchy problem 71
Cauchy sequence 91
closed set 90
complete space 92
convergence principle 4, 26, 59
convolution algebra 42
convolution equation 43
convolution of distributions 33
derivative of a distribution 9
distribution 4
dual space 92
duplication formula 99
elementary solution 40
Fourier transform
– of a distribution 63
– of a function 52
Fourier–Bessel transform 70
Fréchet space 91
Gauss distribution 35
Green's formula 17
Hankel transform 70
harmonic distribution 40
harmonic function 17, 40
heat equation 71
Heaviside function 10
inversion theorem 54
iterated Poisson equation 83
Laplace transform
– of a distribution 75
– of a function 74
locally convex space 91
locally integrable function 5
metric 94
metric space 94
metrizable space 94
mollifier 88
neighborhood 90
neighborhood basis 90
Newton potential 40
open set 90
order of a distribution 6
Paley–Wiener theorems 65
parametrix 83
Partie finie 11
Plancherel's theorem 57
Poisson equation 40
principal value 11
radial function 69
rank of a distribution 87
rapidly decreasing function 59
Riemann–Lebesgue lemma 52
Schwartz space 59
seminorm 91
slowly increasing function 60
spherical coordinates 18
structure of a distribution 28, 61
summability order 82
summable distribution 6, 82
support
– of a distribution 8
– of a function 3
symbolic calculus 45
tempered distribution 60
tensor product
– of distributions 31
– of functions 31
test function 3
topological space 90
topological vector space 91
trapezoidal function 53
triangle function 53
uniform boundedness 96
Volterra integral equation 47