Pseudo-Differential Operators Theory and Applications Vol. 2
Managing Editor M.W. Wong (York University, Canada)
Editorial Board:
Luigi Rodino (Università di Torino, Italy)
Bert-Wolfgang Schulze (Universität Potsdam, Germany)
Johannes Sjöstrand (École Polytechnique, Palaiseau, France)
Sundaram Thangavelu (Indian Institute of Science at Bangalore, India)
Maciej Zworski (University of California at Berkeley, USA)
Pseudo-Differential Operators: Theory and Applications is a series of moderately priced graduate-level textbooks and monographs appealing to students and experts alike. Pseudo-differential operators are understood in a very broad sense and include such topics as harmonic analysis, PDE, geometry, mathematical physics, microlocal analysis, time-frequency analysis, imaging and computations. Modern trends and novel applications in mathematics, natural sciences, medicine, scientific computing, and engineering are highlighted.
Michael Ruzhansky | Ville Turunen
Pseudo-Differential Operators and Symmetries Background Analysis and Advanced Topics
Birkhäuser Basel · Boston · Berlin
Authors:
Michael Ruzhansky, Department of Mathematics, Imperial College London, 180 Queen’s Gate, London SW7 2AZ, United Kingdom, e-mail: [email protected]
Ville Turunen, Institute of Mathematics, Helsinki University of Technology, P.O. Box 1100, FI-02015 TKK, Finland, e-mail: ville.turunen@hut.fi
2000 Mathematics Subject Classification: 35Sxx, 58J40; 43A77, 43A80, 43A85
Library of Congress Control Number: 2009929498 Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at .
ISBN 978-3-7643-8513-2 Birkhäuser Verlag AG, Basel · Boston · Berlin This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use whatsoever, permission from the copyright owner must be obtained. © 2010 Birkhäuser Verlag AG Basel · Boston · Berlin P.O. Box 133, CH-4010 Basel, Switzerland Part of Springer Science+Business Media Printed on acid-free paper produced of chlorine-free pulp. TCF∞ Printed in Germany ISBN 978-3-7643-8513-2
e-ISBN 978-3-7643-8514-9
www.birkhauser.ch
Contents

Preface . . . xiii
Introduction . . . 1

Part I Foundations of Analysis

A Sets, Topology and Metrics
A.1 Sets, collections, families . . . 9
A.2 Relations, functions, equivalences and orders . . . 12
A.3 Dominoes tumbling and transfinite induction . . . 16
A.4 Axiom of Choice: equivalent formulations . . . 17
A.5 Well-Ordering Principle revisited . . . 25
A.6 Metric spaces . . . 26
A.7 Topological spaces . . . 29
A.8 Kuratowski’s closure . . . 35
A.9 Complete metric spaces . . . 40
A.10 Continuity and homeomorphisms . . . 46
A.11 Compact topological spaces . . . 49
A.12 Compact Hausdorff spaces . . . 52
A.13 Sequential compactness . . . 57
A.14 Stone–Weierstrass theorem . . . 62
A.15 Manifolds . . . 65
A.16 Connectedness and path-connectedness . . . 66
A.17 Co-induction and quotient spaces . . . 69
A.18 Induction and product spaces . . . 70
A.19 Metrisable topologies . . . 74
A.20 Topology via generalised sequences . . . 77
B Elementary Functional Analysis
B.1 Vector spaces . . . 79
B.1.1 Tensor products . . . 83
B.2 Topological vector spaces . . . 85
B.3 Locally convex spaces . . . 87
B.3.1 Topological tensor products . . . 90
B.4 Banach spaces . . . 92
B.4.1 Banach space adjoint . . . 101
B.5 Hilbert spaces . . . 103
B.5.1 Trace class, Hilbert–Schmidt, and Schatten classes . . . 111
C Measure Theory and Integration
C.1 Measures and outer measures . . . 116
C.1.1 Measuring sets . . . 116
C.1.2 Borel regularity . . . 124
C.1.3 On Lebesgue measure . . . 128
C.1.4 Lebesgue non-measurable sets . . . 133
C.2 Measurable functions . . . 134
C.2.1 Well-behaving functions . . . 134
C.2.2 Sequences of measurable functions . . . 137
C.2.3 Approximating measurable functions . . . 141
C.3 Integration . . . 143
C.3.1 Integrating simple non-negative functions . . . 144
C.3.2 Integrating non-negative functions . . . 144
C.3.3 Integration in general . . . 147
C.4 Integral as a functional . . . 152
C.4.1 Lebesgue spaces Lp(μ) . . . 152
C.4.2 Signed measures . . . 158
C.4.3 Derivatives of signed measures . . . 162
C.4.4 Integration as functional on function spaces . . . 169
C.4.5 Integration as functional on Lp(μ) . . . 170
C.4.6 Integration as functional on C(X) . . . 174
C.5 Product measure and integral . . . 181
D Algebras
D.1 Algebras . . . 191
D.2 Topological algebras . . . 196
D.3 Banach algebras . . . 200
D.4 Commutative Banach algebras . . . 207
D.5 C∗-algebras . . . 213
D.6 Appendix: Liouville’s Theorem . . . 217
Part II Commutative Symmetries

1 Fourier Analysis on Rn
1.1 Basic properties of the Fourier transform . . . 221
1.2 Useful inequalities . . . 229
1.3 Tempered distributions . . . 233
1.3.1 Fourier transform of tempered distributions . . . 233
1.3.2 Operations with distributions . . . 236
1.3.3 Approximating by smooth functions . . . 239
1.4 Distributions . . . 241
1.4.1 Localisation of Lp-spaces and distributions . . . 241
1.4.2 Convolution of distributions . . . 244
1.5 Sobolev spaces . . . 246
1.5.1 Weak derivatives and Sobolev spaces . . . 246
1.5.2 Some properties of Sobolev spaces . . . 249
1.5.3 Mollifiers . . . 250
1.5.4 Approximation of Sobolev space functions . . . 253
1.6 Interpolation . . . 255
2 Pseudo-differential Operators on Rn
2.1 Motivation and definition . . . 259
2.2 Amplitude representation of pseudo-differential operators . . . 263
2.3 Kernel representation of pseudo-differential operators . . . 264
2.4 Boundedness on L2(Rn) . . . 267
2.5 Calculus of pseudo-differential operators . . . 271
2.5.1 Composition formulae . . . 271
2.5.2 Changes of variables . . . 281
2.5.3 Principal symbol and classical symbols . . . 282
2.5.4 Calculus proof of L2-boundedness . . . 284
2.5.5 Asymptotic sums . . . 285
2.6 Applications to partial differential equations . . . 287
2.6.1 Freezing principle for PDEs . . . 288
2.6.2 Elliptic operators . . . 289
2.6.3 Sobolev spaces revisited . . . 293
3 Periodic and Discrete Analysis
3.1 Distributions and Fourier transforms on Tn and Zn . . . 298
3.2 Sobolev spaces H s(Tn) . . . 306
3.3 Discrete analysis toolkit . . . 309
3.3.1 Calculus of finite differences . . . 310
3.3.2 Discrete Taylor expansion and polynomials on Zn . . . 313
3.3.3 Several discrete inequalities . . . 319
3.3.4 Linking differences to derivatives . . . 321
3.4 Periodic Taylor expansion . . . 327
3.5 Appendix: on operators in Banach spaces . . . 329
4 Pseudo-differential Operators on Tn
4.1 Toroidal symbols . . . 335
4.1.1 Quantization of operators on Tn . . . 335
4.1.2 Toroidal symbols . . . 337
4.1.3 Toroidal amplitudes . . . 340
4.2 Pseudo-differential operators on Sobolev spaces . . . 342
4.3 Kernels of periodic pseudo-differential operators . . . 347
4.4 Asymptotic sums and amplitude operators . . . 351
4.5 Extension of toroidal symbols . . . 356
4.6 Periodisation of pseudo-differential operators . . . 360
4.7 Symbolic calculus . . . 367
4.8 Operators on L2(Tn) and Sobolev spaces . . . 374
4.9 Elliptic pseudo-differential operators on Tn . . . 376
4.10 Smoothness properties . . . 382
4.11 An application to periodic integral operators . . . 387
4.12 Toroidal wave front sets . . . 389
4.13 Fourier series operators . . . 393
4.14 Boundedness of Fourier series operators on L2(Tn) . . . 405
4.15 An application to hyperbolic equations . . . 410
5 Commutator Characterisation of Pseudo-differential Operators
5.1 Euclidean commutator characterisation . . . 413
5.2 Pseudo-differential operators on manifolds . . . 416
5.3 Commutator characterisation on closed manifolds . . . 421
5.4 Toroidal commutator characterisation . . . 423
Part III Representation Theory of Compact Groups

6 Groups
6.1 Introduction . . . 429
6.2 Groups without topology . . . 430
6.3 Group actions and representations . . . 436

7 Topological Groups
7.1 Topological groups . . . 445
7.2 Representations of topological groups . . . 449
7.3 Compact groups . . . 451
7.4 Haar measure and integral . . . 453
7.4.1 Integration on quotient spaces . . . 462
7.5 Peter–Weyl decomposition of representations . . . 465
7.6 Fourier series and trigonometric polynomials . . . 474
7.7 Convolutions . . . 478
7.8 Characters . . . 479
7.9 Induced representations . . . 482

8 Linear Lie Groups
8.1 Exponential map . . . 492
8.2 No small subgroups for Lie, please . . . 496
8.3 Lie groups and Lie algebras . . . 498
8.3.1 Universal enveloping algebra . . . 506
8.3.2 Casimir element and Laplace operator . . . 510

9 Hopf Algebras
9.1 Commutative C∗-algebras . . . 515
9.2 Hopf algebras . . . 517
Part IV Non-commutative Symmetries

10 Pseudo-differential Operators on Compact Lie Groups
10.1 Introduction . . . 529
10.2 Fourier series on compact Lie groups . . . 530
10.3 Function spaces on the unitary dual . . . 534
10.3.1 Spaces on the group G . . . 534
10.3.2 Spaces on the dual Ĝ . . . 536
10.3.3 Spaces Lp(Ĝ) . . . 546
10.4 Symbols of operators . . . 550
10.4.1 Full symbols . . . 552
10.4.2 Conjugation properties of symbols . . . 556
10.5 Boundedness of operators on L2(G) . . . 559
10.6 Taylor expansion on Lie groups . . . 561
10.7 Symbolic calculus . . . 563
10.7.1 Difference operators . . . 563
10.7.2 Commutator characterisation . . . 566
10.7.3 Calculus . . . 567
10.7.4 Leibniz formula . . . 570
10.8 Boundedness on Sobolev spaces H s(G) . . . 571
10.9 Symbol classes on compact Lie groups . . . 572
10.9.1 Some properties of symbols of Ψm(G) . . . 573
10.9.2 Symbol classes Σm(G) . . . 575
10.10 Full symbols on compact manifolds . . . 578
10.11 Operator-valued symbols . . . 579
10.11.1 Example on the torus Tn . . . 589
10.12 Appendix: integral kernels . . . 591

11 Fourier Analysis on SU(2)
11.1 Preliminaries: groups U(1), SO(2), and SO(3) . . . 595
11.1.1 Euler angles on SO(3) . . . 597
11.1.2 Partial derivatives on SO(3) . . . 598
11.1.3 Invariant integration on SO(3) . . . 598
11.2 General properties of SU(2) . . . 599
11.3 Euler angle parametrisation of SU(2) . . . 600
11.4 Quaternions . . . 603
11.4.1 Quaternions and SU(2) . . . 603
11.4.2 Quaternions and SO(3) . . . 604
11.4.3 Invariant integration on SU(2) . . . 605
11.4.4 Symplectic groups . . . 605
11.5 Lie algebra and differential operators on SU(2) . . . 607
11.6 Irreducible unitary representations of SU(2) . . . 612
11.6.1 Representations of SO(3) . . . 615
11.7 Matrix elements of representations of SU(2) . . . 616
11.8 Multiplication formulae for representations of SU(2) . . . 620
11.9 Laplacian and derivatives of representations on SU(2) . . . 624
11.10 Fourier series on SU(2) and on SO(3) . . . 629

12 Pseudo-differential Operators on SU(2)
12.1 Symbols of operators on SU(2) . . . 631
12.2 Symbols of ∂+, ∂−, ∂0 and Laplacian L . . . 634
12.3 Difference operators for symbols . . . 636
12.3.1 Difference operators on SU(2) . . . 636
12.3.2 Differences for symbols of ∂+, ∂−, ∂0 and Laplacian L . . . 640
12.3.3 Differences for aσ∂0 . . . 649
12.4 Symbol classes on SU(2) . . . 656
12.5 Pseudo-differential operators on S3 . . . 660
12.6 Appendix: infinite matrices . . . 662
13 Pseudo-differential Operators on Homogeneous Spaces
13.1 Analysis on closed manifolds . . . 667
13.2 Analysis on compact homogeneous spaces . . . 669
13.3 Analysis on K\G, K a torus . . . 673
13.4 Lifting of operators . . . 679

Bibliography . . . 683
Notation . . . 693
Index . . . 697
Preface
This monograph is devoted to the development of the theory of pseudo-differential operators on spaces with symmetries. Such spaces are the Euclidean space Rn, the torus Tn, compact Lie groups and compact homogeneous spaces. The book consists of several parts. One of our aims has been not only to present new results on pseudo-differential operators but also to show parallels between different approaches to pseudo-differential operators on different spaces. Moreover, we tried to present the material in a self-contained way to make it accessible to readers approaching it for the first time. However, the different spaces on which we develop the theory of pseudo-differential operators require different backgrounds. Thus, while operators on the Euclidean space in Chapter 2 rely on the well-known Euclidean Fourier analysis, pseudo-differential operators on the torus and on more general Lie groups in Chapters 4 and 10 require certain backgrounds in discrete analysis and in the representation theory of compact Lie groups, which we therefore present in Chapter 3 and in Part III, respectively. Moreover, anyone who wishes to work with pseudo-differential operators on Lie groups will certainly benefit from a good grasp of certain aspects of representation theory. That is why we present the main elements of this theory in Part III, largely eliminating the need for the reader to consult other sources. Similarly, the background for the theory of pseudo-differential operators on S3 and SU(2) developed in Chapter 12 can be found in Chapter 11, presented in a self-contained way suitable for immediate use. However, it was still not a simple matter to make a self-contained presentation of these theories without referring to the basics of the more general analysis.
Thus, hoping that this monograph may serve as a guide to different aspects of pseudo-differential operators, we decided to include the basics of analysis that are certainly useful for anyone working with pseudo-differential operators. Overall, we tried to supplement all the material with exercises for learning the ideas and practicing the techniques. They range from elementary problems to more challenging ones. In fact, on many occasions where other authors could say “it is easy to see” or “one can check”, we prefer to present it as an exercise. At the same time, more challenging exercises also serve as an excellent way to present further aspects of the discussed material.
We would like to thank Professor G. Vainikko, who introduced V. Turunen to pseudo-differential equations on circles [137], leading naturally to the non-commutative setting of the doctoral thesis. The thesis work was crucially influenced by a visit to M.E. Taylor in spring 2000. We are grateful to Professor M.W. Wong for suggesting that we write this monograph, to our students for giving us useful feedback on the background material of the book, and to Dr. J. Wirth for reading the manuscript and for his useful feedback and numerous comments, which led to clarifications of the presentation, especially of the material from Section 10.3. Most of the work was carried out in the pleasant atmospheres provided by Helsinki University of Technology and Imperial College London. Moreover, over the years, we have outlined substantial parts of the monograph elsewhere: in particular, we appreciate the hospitality of the University of North Carolina at Chapel Hill, the University of Torino and Osaka University. The work of M. Ruzhansky was supported in part by EPSRC grants EP/E062873/01 and EP/G007233/1. The travels of V. Turunen were financed by the Magnus Ehrnrooth Foundation, by the Vilho, Yrjö and Kalle Väisälä Foundation of the Finnish Academy of Science and Letters, and by the Finnish Cultural Foundation. Finally, our loving thanks go to our families for all the encouragement and understanding that we received while working on this monograph.
March 2009
Michael Ruzhansky, London Ville Turunen, Helsinki
Introduction
Historical notes

Pseudo-differential operators (ΨDO) can be considered as natural extensions of linear partial differential operators, with which they share many essential properties. The study of pseudo-differential operators grew out of research in the 1960s on singular integral operators; being a relatively young subject, the theory is only now reaching a stable form. Pseudo-differential operators are generalisations of linear partial differential operators, with roots entwined deep down in solving differential equations. Among the most influential predecessors of the theory of pseudo-differential operators one must mention the works of Solomon Grigorievich Mikhlin, Alberto Calderón and Antoni Szczepan Zygmund. Around 1957, anticipating novel methods, Alberto Calderón proved the local uniqueness theorem for the Cauchy problem of a partial differential equation. This proof involved the idea of studying the algebraic theory of characteristic polynomials of differential equations. Another landmark was set around 1963, when Michael Atiyah and Isadore Singer presented their celebrated index theorem. Applying operators that are nowadays recognised as pseudo-differential operators, they showed that the geometric and analytical indices of Erik Ivar Fredholm’s “Fredholm operator” on a compact manifold are equal. In particular, these successes by Calderón and Atiyah–Singer motivated the development of a comprehensive theory for these newly found tools. The Atiyah–Singer index theorem is also tied to the advent of K-theory, a significant field of study in itself. The evolution of the pseudo-differential theory was then rapid. In 1963, Peter Lax proposed some singular integral representations using Jean Baptiste Joseph Fourier’s “Fourier series”. A little later, Joseph Kohn and Louis Nirenberg presented a more useful approach with the aid of Fourier integral operators and named their representations pseudo-differential operators.
Showing that these operators form an algebra, they derived a broad theory, and their results were applied by Peter Lax and Kurt Otto Friedrichs to boundary problems of linear partial differential equations. Other related studies were conducted by Agranovich, Bokobza, Kumano-go, Schwartz, Seeley, Unterberger, and foremost, by Lars Hörmander, who coined the modern pseudo-differential theory in 1965, leading to a vast range of methods and results. The efforts of Kohn, Nirenberg, and Hörmander gave birth to symbol analysis, which is the basis of the theory of pseudo-differential operators.

It is interesting how the ideas of symbol analysis have matured over about 200 years. Already Joseph Lagrange and Augustin Cauchy studied the assignment of a characteristic polynomial to the corresponding differential operator. In the 1880s, Oliver Heaviside developed an operational calculus for the solution of ordinary differential equations met in the theory of electrical circuits. A more sophisticated problem of this kind, related to quantum mechanics, was solved by Hermann Weyl in 1927, and eventually the concept of the symbol of an operator was introduced by Solomon Grigorievich Mikhlin in 1936. After all, there is nothing new under the sun.

Since the mid-1960s, pseudo-differential operators have been widely applied in research on partial differential equations: along with new theorems, they have provided a better understanding of parts of classical analysis including, for instance, Sergei Lvovich Sobolev’s “Sobolev spaces”, potentials, George Green’s “Green functions”, fundamental solutions, and the index theory of elliptic operators. Furthermore, they appear naturally when reducing elliptic boundary value problems to the boundary. Briefly, modern mathematical analysis has gained valuable clarity with the unifying aid of pseudo-differential operators.

Fourier integral operators are more general than pseudo-differential operators, having the same status in the study of hyperbolic equations as pseudo-differential operators have with respect to elliptic equations. A natural approach to treating pseudo-differential operators on n-dimensional C∞-manifolds is to use the theory of Rn locally: this can be done, since the classes of pseudo-differential operators are invariant under smooth changes of coordinates.
However, on periodic spaces (tori) Tn, this could be a clumsy way of thinking, as the local theory is plagued with rather technical convergence and local coordinate questions. The compact group structure of the torus is important from the harmonic analysis point of view. In 1979 (and 1985) Mikhail Semenovich Agranovich (see [3]) presented an appealing formulation of pseudo-differential operators on the unit circle S1 using Fourier series. Hence, the independent study of periodic pseudo-differential operators was initiated. The equivalence of the local and global definitions of periodic pseudo-differential operators was completely proven by William McLean in 1989. By then, the global definition was widely adopted and used by Agranovich, Amosov, D.N. Arnold, Elschner, McLean, Saranen, Schmidt, Sloan, and Wendland, among others. Its effectiveness has been recognised particularly in the numerical analysis of boundary integral equations.

The literature on pseudo-differential operators is extensive. At the time of writing of this paragraph (28 January 2009), a search on MathSciNet showed 1107 entries with the words “pseudodifferential operator” in the title (among which 33 are books), 436 entries with the words “pseudo-differential operator” in the title (among which 37 are books), 3971 entries with the words “pseudodifferential operator” anywhere (among which 417 are books), and 1509 entries with the words “pseudo-differential operator” anywhere (among which 151 are books). Most of these works are devoted to the analysis on Rn and thus we have no means to give a comprehensive overview there. Hence, the emphasis of this monograph is on pseudo-differential operators on the torus, on Lie groups, and on spaces with symmetries, in which cases the literature is much more limited.
Periodic pseudo-differential operators It turns out that the pseudo-differential and periodic pseudo-differential theories are analogous, the periodic case actually being more discernible. Despite the intense research on periodic integral equations, the theory of periodic pseudo-differential operators has been difficult to find in the literature. On the other hand, the wealth of publications on general pseudo-differential operators is cumbersome for the periodic case, and it is too easy to get lost in the midst of irrelevant technical details. In the sequel the elementary properties of periodic pseudo-differential operators are studied. The prerequisites for understanding the theory are more modest than one might expect. Of course, a basic knowledge of functional analysis is necessary, but the simple central tools are Gottfried Wilhelm von Leibniz’ “Leibniz formula”, Brook Taylor’s “Taylor expansion”, and Jean Baptiste Joseph Fourier’s “Fourier transform”. In the periodic case, these familiar concepts of the classical calculus are to be expressed in discrete forms using differences and summation instead of derivatives and integration. Our working spaces will be the Sobolev spaces H s (Tn ) on the compact torus group Tn . These spaces ideally reflect smoothness properties, which are of fundamental significance for pseudo-differential operators, as the traditional operator theoretic methods fail to be satisfactory – pseudo-differential operators and periodic pseudo-differential operators do not form any reasonable normed algebra. The structure of the treatment of periodic pseudo-differential operators is the following: first, introduction of necessary functional analytic prerequisites, then development of useful tools for analysis of series and periodic functions, and after that the presentation of the theory of periodic pseudo-differential operators. The focus of the study is on symbolic analysis. 
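The discrete Leibniz formula just alluded to can be illustrated numerically. The following sketch (our illustration, not the book's notation) verifies the forward-difference Leibniz identity, the summation analogue of the classical product rule, on sample functions:

```python
# Forward difference operator on the integers: (Δf)(k) = f(k+1) - f(k).
# Checks the discrete Leibniz formula used in periodic symbolic calculus:
#   Δ(f·g)(k) = (Δf)(k)·g(k) + f(k+1)·(Δg)(k),
# which holds exactly, with no remainder term.

def delta(f):
    """Forward difference of a function defined on the integers."""
    return lambda k: f(k + 1) - f(k)

f = lambda k: k ** 2
g = lambda k: 3 * k + 1

for k in range(-5, 6):
    lhs = delta(lambda j: f(j) * g(j))(k)
    rhs = delta(f)(k) * g(k) + f(k + 1) * delta(g)(k)
    assert lhs == rhs  # discrete Leibniz formula holds exactly
```

Note the shift f(k+1) in the second term: unlike the classical Leibniz rule, the discrete version is not symmetric in f and g, a feature that propagates through the toroidal calculus.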
The techniques of the extension of symbols and the periodisation of operators allow one effectively to relate the Euclidean and the periodic theories, and to use one to derive results in the other. However, we tried to reduce a reliance on such ideas, keeping in mind the development of the subject on Lie groups where such a relation is not readily available. From this point of view, analysis on the torus can be viewed rather as a special case of analysis on a Lie group than the periodic Euclidean case.
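The two quantizations that the extension and periodisation techniques relate can be sketched as follows (up to normalisation conventions; precise definitions and symbol classes are given in Chapters 2 and 4). On Rn,

```latex
\[
  (\operatorname{Op}(a)f)(x)
  = \int_{\mathbb{R}^n} e^{2\pi i\, x\cdot\xi}\, a(x,\xi)\, \widehat{f}(\xi)\, d\xi,
\]
while on the torus $\mathbb{T}^n=\mathbb{R}^n/\mathbb{Z}^n$ the frequency variable lives on the lattice $\mathbb{Z}^n$ and the integral becomes a sum,
\[
  (\operatorname{Op}(a)f)(x)
  = \sum_{\xi\in\mathbb{Z}^n} e^{2\pi i\, x\cdot\xi}\, a(x,\xi)\, \widehat{f}(\xi),
  \qquad
  \widehat{f}(\xi) = \int_{\mathbb{T}^n} e^{-2\pi i\, x\cdot\xi} f(x)\, dx,
\]
```

so that in the toroidal calculus difference operators in ξ ∈ Zn take over the role that derivatives in ξ play in the Euclidean calculus.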
The main justification of this work on the torus, from the authors’ point of view, is the unification and development of the global theory of periodic pseudo-differential operators. It becomes evident how elegant this theory is, especially when compared to the theory on Rn; and as such, periodic pseudo-differential operators may actually serve as a nice first introduction to the general theory of pseudo-differential operators. For those who have already acquainted themselves with pseudo-differential operators this work may still offer another aspect of the analysis. Thus, there is a hope that these tools will find various uses. Although we decided not to discuss Fourier integral operators on Rn, we devote some efforts to analysing operators that we call Fourier series operators. These are analogues of Fourier integral operators on the torus and we study them in terms of the toroidal quantization. The main new difficulty here is that while pseudo-differential operators do not move the wave front sets of distributions, this is no longer the case for Fourier series operators. Thus, we are quickly forced to make extensions of functions from an integer lattice to Euclidean space on the frequency side. The analysis presented here shows certain limitations of the use of Fourier series operators; however, we succeed in establishing elements of calculus for them and discuss an application to hyperbolic partial differential equations.
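Schematically (with precise assumptions on the phase and amplitude deferred to Chapter 4), a Fourier series operator takes the form

```latex
\[
  Tf(x) = \sum_{\xi\in\mathbb{Z}^n} e^{2\pi i\,\phi(x,\xi)}\, a(x,\xi)\, \widehat{f}(\xi),
\]
```

so that, compared with a pseudo-differential operator, the linear phase x·ξ is replaced by a general real-valued phase φ(x,ξ). Since φ and a are initially defined only for ξ on the lattice Zn, the questions of extending them to ξ ∈ Rn mentioned above arise immediately.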
Pseudo-differential operators on Lie groups
Non-commutative Lie groups and homogeneous spaces play important roles in different areas of mathematics. Some fundamental examples include the spheres S^n, which are homogeneous spaces under the action of the orthogonal groups. An important special case is the three-dimensional sphere S^3, which happens to be also a group. However, while the general theory of pseudo-differential operators is available on such spaces, it presents certain limitations. First, working in local coordinates often makes it very complicated to keep track of the global geometric features. For example, the fundamental property that spheres are fixed by rotations becomes almost untraceable when looking at it in local coordinates. Another limitation is that while the local approach yields an invariant notion of the principal symbol, the full symbol is not readily available. This presents profound complications in applying the theory of pseudo-differential operators to problems on manifolds that depend on knowledge of the full symbol of an operator. In general, it is a natural idea to build pseudo-differential operators out of smooth families of convolution operators on Lie groups. There have been many works aiming at the understanding of pseudo-differential operators on Lie groups from this point of view, e.g., the works on left-invariant operators [121, 78, 40], convolution calculus on nilpotent Lie groups [77], L²-boundedness of convolution operators related to Howe’s conjecture [57, 41], and many others. However, in this work, we strive to develop the convolution approach into a symbolic quantization, which always provides a much more convenient framework for the analysis of operators. For this, our analysis of operators and their symbols
is based on the representation theory of Lie groups. This leads to a description of the full symbol of a pseudo-differential operator on a Lie group as a sequence of matrices of growing sizes equal to the dimensions of the corresponding representations of the group. We also characterise, in terms of the introduced quantizations, the standard Hörmander classes Ψ^m on Lie groups. One of the advantages of the presented approach is that we obtain a notion of full (global) symbols which matches the underlying Fourier analysis on the group in a perfect way. For a group G, such a symbol can be interpreted as a mapping defined on the space G × Ĝ, where Ĝ is the unitary dual of the compact Lie group G. In a nutshell, this analysis can be regarded as a non-commutative analogue of the Kohn–Nirenberg quantization of pseudo-differential operators that was proposed by Joseph Kohn and Louis Nirenberg in [68] in the Euclidean setting. As such, the present research is perhaps most closely related to the work of Michael Taylor [128], who, however, in his analysis used an exponential mapping to rely on pseudo-differential operators on a Lie algebra, which can be viewed as a Euclidean space with the corresponding standard theory of pseudo-differential operators. However, the approach developed in this work is different from that of [128, 129] in the sense that we rely on the group structure directly and thus are not restricted to neighbourhoods of the neutral element, thus being able to approach global symbol classes directly. Some aspects of the analysis presented in this part appeared in [99]. As an important example, the approach developed here gives us quite detailed information on the global quantization of operators on the three-dimensional sphere S^3.
More generally, we note that if we have a closed simply-connected three-dimensional manifold M, then by the recently resolved Poincaré conjecture there exists a global diffeomorphism M ≃ S^3 ≃ SU(2) that turns M into a Lie group with a group structure induced by S^3 (or by SU(2)). Thus, we can use the approach developed for SU(2) to immediately obtain the corresponding global quantization of operators on M with respect to this induced group product. In fact, all the formulae remain completely the same since the unitary dual of SU(2) (or of S^3 in the quaternionic space R^4) is mapped by this diffeomorphism as well. An interesting feature of the pseudo-differential operators from Hörmander’s classes Ψ^m on these spaces is that they have matrix-valued full symbols with a remarkably rapid off-diagonal decay property. We also introduce a general machinery with which we obtain the global quantization on homogeneous spaces using the one on the Lie group that acts on the space. Although we do not yet have general analogues of the diffeomorphic Poincaré conjecture in higher dimensions, this already covers cases when M is a convex surface or a surface with positive curvature tensor, as well as more general manifolds in terms of their Pontryagin class, etc.
Conventions
Each part or chapter of the book is preceded by a short introduction explaining the layout and conventions. However, let us mention now several conventions that hold throughout the book. Constants will usually be denoted by C (sometimes with subscripts), and their values may differ on different occasions, even when appearing in subsequent estimates. Throughout the book, the notation for the Laplace operator is L, in order not to confuse it with difference operators, which are denoted by △. In Chapters 3 and 4 we encounter a notational difficulty in that both frequencies and multi-indices are integers, with different conventions for norms than are normally used in the literature. To address this issue, there we let |α| = ‖α‖_{ℓ¹} be the ℓ¹-norm (of the multi-index α) and ‖ξ‖ = ‖ξ‖_{ℓ²} be the Euclidean ℓ²-norm (of the frequency ξ ∈ Z^n). However, in other chapters we write the more traditional |ξ| for the length of the vector ξ in R^n, and reserve the notation ‖·‖_X for a norm in a normed space X. There should be no confusion with this notation since we usually make it clear which norm we use. In Part IV, ξ = ξ(x) stands for a representation, so that we can still use the usual notation σ(x, ξ) for symbols.
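The two norm conventions above can be made concrete; the following Python sketch (the function names are ours, purely for illustration) computes the ℓ¹-norm of a multi-index and the Euclidean ℓ²-norm of a frequency.

```python
import math

def multi_index_norm(alpha):
    """|alpha|: the l1-norm, the sum of the non-negative entries of a multi-index."""
    return sum(alpha)

def frequency_norm(xi):
    """||xi||: the Euclidean l2-norm of a frequency xi in Z^n."""
    return math.sqrt(sum(k * k for k in xi))

alpha = (1, 0, 2)   # a multi-index in N_0^3
xi = (3, -4, 0)     # a frequency in Z^3
print(multi_index_norm(alpha))  # 3
print(frequency_norm(xi))       # 5.0
```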
Part I
Foundations of Analysis
Part I of the monograph contains preliminary material that could be useful for anyone working in the theory of pseudo-differential operators. The material of the book lies at the intersection of classical analysis and the representation theory of Lie groups. Aiming to make the presentation self-sufficient, we include preliminary material that may be used as a reference for concepts developed later. In any case, the material presented in this part may be used either as a reference or as an independent textbook on the foundations of analysis. Throughout the book, we assume that the reader has survived undergraduate calculus courses, so that concepts like partial derivatives and the Riemann integral are familiar. Otherwise, the prerequisites for understanding the material in this book are quite modest. We shall start with a naive version of set theory, metric spaces, topology, functional analysis, measure theory and integration in Lebesgue’s sense.
Chapter A
Sets, Topology and Metrics
First, we present the basic notations and properties of sets, used elsewhere in the book. The set theory involved is “naive”, sufficient for our purposes; for a thorough treatment, see, e.g., [46]. The sets of integer, rational, real or complex numbers will be taken for granted; we shall not construct them. Let us first list some abbreviations that we are going to use:
• “P and Q” means that both properties P and Q are true.
• “P or Q” means that at least one of the properties P and Q is true.
• “P ⇒ Q” reads “If P then Q”, meaning that “P is false or Q is true”. Equivalently “Q ⇐ P ”, i.e., “P only if Q”.
• “P ⇐⇒ Q” is “P ⇒ Q and P ⇐ Q”, reading “P if and only if Q”.
• “∃x” reads “There exists x”.
• “∃!x” reads “There exists a unique x”.
• “∀x” reads “For every x”.
• “P := Q” or “Q =: P ” reads “P is defined to be Q”.
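The reading of “P ⇒ Q” as “P is false or Q is true” can be verified on all truth assignments; a minimal Python sketch (not part of the book):

```python
from itertools import product

def implies(p, q):
    # material implication, read as "P is false or Q is true"
    return (not p) or q

# check both readings on all four truth assignments
for p, q in product([False, True], repeat=2):
    assert implies(p, q) == ((not p) or q)
    # "P if and only if Q" as the conjunction of the two implications:
    assert (p == q) == (implies(p, q) and implies(q, p))
print("both readings agree on all four truth assignments")
```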
A.1 Sets, collections, families
Naively, a set (or a collection or a family) A consists of points (or elements or members) x.
Example. Sets of points, like a collection of coins, a family of two parents and three children, a flock of sheep, a pack of wolves, or a crowd of protesters.
Example. Points in a set, like the members of a parliament, the flowers in a bundle, or the stars in a constellation.
We denote x ∈ A if the element x belongs to the set A, and x ∉ A if x does not belong to A. A set A is a subset of a set B, denoted by A ⊂ B or B ⊃ A, if ∀x : x ∈ A ⇒ x ∈ B.
Sets A, B are equal, denoted by A = B, if A ⊂ B and B ⊂ A, i.e., ∀x : x ∈ A ⇐⇒ x ∈ B. If A ⊂ B and A ≠ B then A is called a proper subset of B.
Remark A.1.1 (Notation for numbers). The sets of integer, rational, real and complex numbers are respectively Z, Q, R and C; let N = Z+ and R+ stand for the corresponding subsets of (strictly) positive numbers. Then Z+ ⊂ Z ⊂ Q ⊂ R ⊂ C. We also write N0 = N ∪ {0}.
There are various ways of expressing sets. Sometimes all the elements can be listed:
• The empty set ∅ = {} is the unique set without elements: ∀x : x ∉ ∅.
• Set {x} consists of a single element x ∈ {x}.
• Set {x, y} = {y, x} consists of the elements x and y. And so on.
Yet {x} = {x, x} = {x, x, x} etc. A set consisting of those elements for which property P holds can be denoted by {x : P (x)} = {x | P (x)}. A set consisting of finitely many elements x_1, . . . , x_n could be denoted by {x_1, . . . , x_n}
= {x_k : k ∈ {1, . . . , n}} = {x_k | k ∈ Z+ : k ≤ n} = {x_k}_{k=1}^n,
and the infinite set of positive integers by Z+ = {1, 2, 3, 4, 5, · · · }.
The power set P(X) consists of all the subsets of X:
P(X) = {A : A ⊂ X}.
Example. For the set X = {1}, we have
P(X) = {∅, {1}},
P(P(X)) = {∅, {∅}, {1}, {∅, {1}}},
and we leave it as an exercise to find P(P(P(X))), which contains 2^4 = 16 elements in this case.
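The cardinalities in this example can be checked mechanically; a Python sketch (not from the book), using frozensets so that subsets can themselves be set elements:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets, so the result can be iterated again."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

X = frozenset({1})
P1 = powerset(X)
P2 = powerset(P1)
P3 = powerset(P2)
print(len(P1), len(P2), len(P3))  # 2 4 16
```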
Example. Always at least ∅, X ∈ P(X). If x ∈ X, then {x} ∈ P(X) and {{x}} ∈ P(P(X)), with x ≠ {x} ≠ {{x}} ≠ · · · , yet x ∈ {x} ∈ {{x}} ∈ · · · . However, we shall allow neither x ∈ x nor x ∋ x; consider Russell’s paradox: given x = {a : a ∉ a}, is x ∈ x?
For A, B ⊂ X, let us define the union A ∪ B, the intersection A ∩ B and the difference A \ B by
A ∪ B := {x : x ∈ A or x ∈ B},
A ∩ B := {x : x ∈ A and x ∈ B},
A \ B := {x : x ∈ A and x ∉ B}.
The complement A^c of A in X is defined by A^c := X \ A.
Example. If A = {1, 2} and B = {2, 3} then A ∪ B = {1, 2, 3}, A ∩ B = {2} and A \ B = {1}.
Example. R \ Q is the set of irrational numbers.
Exercise A.1.2. Show that
(A ∪ B) ∪ C = A ∪ (B ∪ C),
(A ∩ B) ∩ C = A ∩ (B ∩ C),
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C),
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
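The four identities of Exercise A.1.2 can be sanity-checked on random subsets of a small universe; a Python sketch (not a proof, just a finite check):

```python
import random

U = set(range(10))  # a small universe of discourse

def random_subset():
    return {x for x in U if random.random() < 0.5}

for _ in range(100):
    A, B, C = random_subset(), random_subset(), random_subset()
    assert (A | B) | C == A | (B | C)            # associativity of union
    assert (A & B) & C == A & (B & C)            # associativity of intersection
    assert (A | B) & C == (A & C) | (B & C)      # distributivity
    assert (A & B) | C == (A | C) & (B | C)      # distributivity
print("all four identities verified on 100 random triples")
```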
Notice that in the latter two cases above, the order of the parentheses is essential. On the other hand, the associativity in the first two equalities allows us to abbreviate A ∪ B ∪ C := (A ∪ B) ∪ C and A ∩ B ∩ C := (A ∩ B) ∩ C and so on.
Definition A.1.3 (Index sets). Let I be any set and assume that for every i ∈ I we are given a set A_i. Then I is an index set for the collection of sets A_i.
Definition A.1.4 (Unions and intersections of families). For a family A ⊂ P(X), the union ⋃A and the intersection ⋂A are defined by
⋃A = ⋃_{B∈A} B := {x | ∃B ∈ A : x ∈ B},
⋂A = ⋂_{B∈A} B := {x | ∀B ∈ A : x ∈ B}.
Example. If A = {B, C} then ⋃A = B ∪ C and ⋂A = B ∩ C.
Notice that if A ⊂ B ⊂ P(X) then
∅ ⊂ ⋃A ⊂ ⋃B ⊂ X and ∅ ⊂ ⋂B ⊂ ⋂A ⊂ X.
Especially, for ∅ ⊂ P(X) we have
⋃∅ = ∅ and ⋂∅ = X. (A.1)
Notice that A ∪ B = ⋃{A, B} and A ∩ B = ⋂{A, B}. For unions (and similarly for intersections), the following notations are also commonplace:
⋃_{j∈K} A_j := ⋃ {A_j | j ∈ K},
⋃_{k=1}^n A_k := ⋃ {A_k | k ∈ Z+ : 1 ≤ k ≤ n},
⋃_{k=1}^∞ A_k := ⋃ {A_k | k ∈ Z+}.
Example. ⋂_{k=1}^3 A_k = A_1 ∩ A_2 ∩ A_3.
Exercise A.1.5 (de Morgan’s rules). Prove de Morgan’s rules:
X \ ⋃_{j∈K} A_j = ⋂_{j∈K} (X \ A_j),
X \ ⋂_{j∈K} A_j = ⋃_{j∈K} (X \ A_j).

A.2 Relations, functions, equivalences and orders
The Cartesian product of sets A and B is
A × B = {(x, y) : x ∈ A, y ∈ B},
where the elements (x, y) := {x, {x, y}} are ordered pairs: if x ≠ y then (x, y) ≠ (y, x), whereas {x, y} = {y, x}. A relation from A to B is a subset R ⊂ A × B. We write xRy if (x, y) ∈ R, saying “x is in relation R to y”; analogously, “x is not in relation R to y” means (x, y) ∉ R.
Functions. A relation f ⊂ X × Y is called a function (or a mapping) from X to Y, denoted by f : X → Y,
if for each x ∈ X there exists a unique y ∈ Y such that (x, y) ∈ f:
∀x ∈ X ∃!y ∈ Y : (x, y) ∈ f;
in this case, we write y := f(x) or x ↦ f(x) = y.
Intuitively, a function f : X → Y is a rule taking x ∈ X to f(x) ∈ Y. Functions f : X → Y and g : Y → Z yield a composition g ◦ f : X → Z by (g ◦ f)(x) := g(f(x)). The restriction of f : X → Y to A ⊂ X is f|_A : A → Y defined by f|_A(x) := f(x).
Example. The characteristic function of a set E ∈ P(X) is χ_E : X → R defined by
χ_E(x) := 1 if x ∈ E, and χ_E(x) := 0 if x ∉ E.
Definition A.2.1 (Injections, surjections, bijections). A function f : X → Y is
• an injection if f(x_1) = f(x_2) implies x_1 = x_2,
• a surjection if for every y ∈ Y there exists x ∈ X such that f(x) = y,
• and a bijection if it is both injective and surjective; in this case we may define the inverse function f⁻¹ : Y → X such that f(x) = y if and only if x = f⁻¹(y).
Definition A.2.2 (Image and preimage). A function f : X → Y begets functions
f⁺ : P(X) → P(Y), f⁺(A) = f(A) := {f(x) ∈ Y : x ∈ A},
f⁻ : P(Y) → P(X), f⁻(B) = f⁻¹(B) := {x ∈ X : f(x) ∈ B}.
Sets f(A) and f⁻¹(B) are called the image of A ⊂ X and the preimage of B ⊂ Y, respectively.
Exercise A.2.3. Let f : X → Y, A ⊂ X and B ⊂ Y. Show that A ⊂ f⁻¹(f(A)) and f(f⁻¹(B)) ⊂ B. Give examples showing that these subsets can be proper.
Exercise A.2.4. Let f : X → Y, A_0 ⊂ X, B_0 ⊂ Y, A ⊂ P(X) and B ⊂ P(Y). Show that
f(⋃A) = ⋃_{A∈A} f(A),
f(⋂A) ⊂ ⋂_{A∈A} f(A),
f(X \ A_0) ⊃ f(X) \ f(A_0),
where the subsets can be proper, while
f⁻¹(⋃B) = ⋃_{B∈B} f⁻¹(B),
f⁻¹(⋂B) = ⋂_{B∈B} f⁻¹(B),
f⁻¹(Y \ B_0) = X \ f⁻¹(B_0).
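The inclusions of Exercise A.2.3 can be tested concretely; a Python sketch (the particular f, X, A, B are hypothetical choices, not from the book), with a non-injective f making the first inclusion proper:

```python
def image(f, A):
    return {f(x) for x in A}

def preimage(f, X, B):
    return {x for x in X if f(x) in B}

X = {-2, -1, 0, 1, 2}
f = lambda x: x * x          # not injective on X
A = {1, 2}
B = {1, 9}

assert A <= preimage(f, X, image(f, A))    # A ⊂ f⁻¹(f(A))
assert A != preimage(f, X, image(f, A))    # proper: -1, -2 map into f(A) too
assert image(f, preimage(f, X, B)) <= B    # f(f⁻¹(B)) ⊂ B
print(sorted(preimage(f, X, image(f, A)))) # [-2, -1, 1, 2]
```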
These set-operation-friendly properties of f⁻¹ : P(Y) → P(X) will be encountered later in topology and measure theory.
Definition A.2.5 (Induced and co-induced families). Let f : X → Y, A ⊂ P(X) and B ⊂ P(Y). Then f is said to induce the family f⁻¹(B) ⊂ P(X) and to co-induce the family D ⊂ P(Y), where
f⁻¹(B) := {f⁻¹(B) | B ∈ B},
D := {B ⊂ Y | f⁻¹(B) ∈ A}.

Equivalences
Definition A.2.6 (Equivalence relation). A subset ∼ of X × X is an equivalence relation on X if it is
1. reflexive: x ∼ x (for all x ∈ X);
2. symmetric: if x ∼ y then y ∼ x (for all x, y ∈ X);
3. transitive: if x ∼ y and y ∼ z then x ∼ z (for all x, y, z ∈ X).
The equivalence class of x ∈ X is [x] := {y ∈ X | x ∼ y}, and the equivalence classes form the quotient space
X/∼ := {[x] | x ∈ X}.
Notice that x ∈ [x] ⊂ X, that [x] ∩ [y] = ∅ if [x] ≠ [y], and that X = ⋃_{x∈X} [x].
Example. Clearly, the identity relation = is an equivalence relation on X, and f(x) := {x} defines a natural bijection f : X → X/=.
Example. Let X and Y denote the sets of all women and men, respectively. For simplicity, we may assume the disjointness X ∩ Y = ∅. Let Isolde, Juliet ∈ X and Romeo, Tristan ∈ Y. For a, b ∈ X ∪ Y, let a ∼ b if and only if a and b are of the same gender. Then
Y = [Tristan] = [Romeo] ≠ [Juliet] = [Isolde] = X,
X ∪ Y = [Romeo] ∪ [Juliet],
(X ∪ Y)/∼ = {[Romeo], [Juliet]}.
Exercise A.2.7. Let us define a relation ∼ in the Euclidean plane R^2 by setting (x_1, x_2) ∼ (y_1, y_2) if and only if x_1 − y_1, x_2 − y_2 ∈ Z. Show that ∼ is an equivalence relation. What is the equivalence class of the origin (0, 0) ∈ R^2? What is common between a doughnut and the quotient space here?
Exercise A.2.8. Let us define a relation ∼ in the punctured Euclidean space R^3 \ {(0, 0, 0)} by setting (x_1, x_2, x_3) ∼ (y_1, y_2, y_3) if and only if (x_1, x_2, x_3) = (t y_1, t y_2, t y_3) for some t ∈ R+. Prove that ∼ is an equivalence relation. What is common between a sphere and the quotient space here?
Orders
Definition A.2.9 (Partial order). A non-empty set X is partially ordered if there is a partial order ≤ on X. That is, ≤ is a relation from X to X such that it is
1. reflexive: x ≤ x (for all x ∈ X);
2. anti-symmetric: if x ≤ y and y ≤ x then x = y (for all x, y ∈ X);
3. transitive: if x ≤ y and y ≤ z then x ≤ z (for all x, y, z ∈ X).
We say that y is greater than x (or x is less than y), denoted by x < y, if x ≤ y and x ≠ y.
Example. The set R of real numbers has the usual order ≤. Naturally, any of its non-empty subsets, e.g., Z+ ⊂ R, inherits the order. The set [−∞, +∞] = R ∪ {−∞, +∞} has the order ≤ extended from R, with the conventions −∞ ≤ x and x ≤ +∞ for every x ∈ [−∞, +∞].
Example. Let us order X = P(S) by inclusion. That is, for A, B ⊂ S, let A ≤ B if and only if A ⊂ B.
Example. Let X, Y be sets, where Y has a partial order ≤. We may introduce a new partial order for all functions f, g : X → Y by setting
f ≤ g if and only if (by definition) ∀x ∈ X : f(x) ≤ g(x).
This partial order is commonplace especially when Y = R or Y = [−∞, +∞].
Definition A.2.10 (Chains and total order). A non-empty subset K ⊂ X is a chain if x ≤ y or y ≤ x for all x, y ∈ K. The partial order is total (or linear) if the whole set X is a chain.
Example. [−∞, +∞] is a chain with the usual partial order. Thereby also its subsets are chains, e.g., R and Z+. If {A_j : j ∈ J} ⊂ P(S) is a chain then A_j ⊂ A_k or A_k ⊂ A_j for each j, k ∈ J. Moreover, P(S) is not a chain if S has more than one element.
Definition A.2.11 (Bounds). Let ≤ be a partial order on X. The sets of upper and lower bounds of A ⊂ X are defined, respectively, by
↑A := {x ∈ X | ∀a ∈ A : a ≤ x},
↓A := {x ∈ X | ∀a ∈ A : x ≤ a}.
If x ∈ A∩ ↑ A then it is the maximum of A, denoted by x = max(A). If x ∈ A∩ ↓ A then it is the minimum of A, denoted by x = min(A). If A∩ ↑ {z} = {z} then the element z ∈ A is called maximal in A. Similarly, if A∩ ↓ {z} = {z} then the element z ∈ A is called minimal in A. If sup(A) := min(↑ A) ∈ X exists, it is called the supremum of A, and if inf(A) := max(↓ A) ∈ X exists, it is the infimum of A.
Remark A.2.12. Notations like
sup_{k∈Z+} x_k = sup_{k≥1} x_k = sup {x_k : k ∈ Z+}
are quite common.
Example. The minimum in Z+ is 1, but there is no maximal element. For each A ⊂ [−∞, +∞], the infimum and the supremum exist.
Example. Let X = P(S). Then max(X) = S and min(X) = ∅. If A ⊂ X then sup(A) = ⋃A and inf(A) = ⋂A. For each x ∈ S, the element S \ {x} ∈ X is maximal in the subset X \ {S}.
Definition A.2.13 (lim sup and lim inf). Let x_k ∈ X for each k ∈ Z+. If the following suprema and infima exist, let
lim sup_{k→∞} x_k := inf { sup {x_k : k ≥ j} | j ∈ Z+ },
lim inf_{k→∞} x_k := sup { inf {x_k : k ≥ j} | j ∈ Z+ }.
Example. Let E_k ∈ P(X) for each k ∈ Z+. Then
lim sup_{k→∞} E_k = ⋂_{j=1}^∞ ⋃_{k=j}^∞ E_k,
lim inf_{k→∞} E_k = ⋃_{j=1}^∞ ⋂_{k=j}^∞ E_k.
Exercise A.2.14. Let A = lim sup_{k→∞} E_k and B = lim inf_{k→∞} E_k as in the example above. Show that
χ_A = lim sup_{k→∞} χ_{E_k} and χ_B = lim inf_{k→∞} χ_{E_k},
where χ_E : X → R is the characteristic function of E ⊂ X.
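The set-theoretic lim sup and lim inf can be explored on a finite truncation; a Python sketch (not from the book: E_k alternates between {1,2} and {2,3}, and the truncation length K and the range of j are assumptions chosen large enough to expose the tail behaviour):

```python
K = 50
E = [{1, 2} if k % 2 == 0 else {2, 3} for k in range(K)]

def limsup_sets(E, J):
    # elements lying in E_k for infinitely many k: intersect the tail unions
    return set.intersection(*[set().union(*E[j:]) for j in range(J)])

def liminf_sets(E, J):
    # elements lying in all but finitely many E_k: unite the tail intersections
    return set().union(*[set.intersection(*map(set, E[j:])) for j in range(J)])

print(sorted(limsup_sets(E, 10)))  # [1, 2, 3]  (1 and 3 recur infinitely often)
print(sorted(liminf_sets(E, 10)))  # [2]        (only 2 is eventually always there)
```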
A.3 Dominoes tumbling and transfinite induction
The principle of mathematical induction can be compared to a sequence of dominoes, falling over one after another when the first tumbles down. More precisely, if 1 ∈ S ⊂ Z+ and n ∈ S ⇒ n + 1 ∈ S for every n ∈ Z+, then S = Z+. The Transfinite Induction Principle generalises this, working on any well-ordered set.
Definition A.3.1 (Well-ordered sets). A partially ordered set X is said to be well ordered if min(A) exists whenever ∅ ≠ A ⊂ X.
Example. With its usual order, Z+ is well ordered. With their usual orders, Z, R and [−∞, +∞] are not well ordered. With the inclusion order, P(S) is not well ordered, if there is more than one element in S. Theorem A.3.2 (Transfinite Induction Principle). Let X be well ordered and S ⊂ X. Assume that for each x ∈ X it holds that x ∈ S if {y ∈ X : y < x} ⊂ S. Then S = X. Exercise A.3.3. Prove the Transfinite Induction Principle. Exercise A.3.4 (Transfinite =⇒ mathematical induction). Check that in the case X = Z+ , the Transfinite Induction Principle is the usual mathematical induction. The value of the Transfinite Induction Principle might be limited, as we have to assume the well-ordering of the underlying set. Actually, many (but not all) working mathematicians assume that every non-empty set can be well ordered, which is the so-called Well-Ordering Principle. Is such a principle likely to be true? After all, for example on sets R or P(Z+ ), can we imagine what well-orderings might look like? All the elementary tools which we use in our mathematical reasoning should be at least believable, so maybe the Well-Ordering Principle does not appear as a satisfying set theoretic axiom. Could we perhaps prove or disprove it from other, intuitively more reliable principles? We shall return to this question later.
A.4 Axiom of Choice: equivalent formulations
In this section we shall consider how to calculate the number of points in a set, and what infinity might mean in general.
Choosing. We may always choose one point out of a non-empty set, no matter how many points there are around. But sometimes we need infinitely many tasks done at once. For instance, we might want to choose a point from each of the non-empty subsets A ⊂ X in no time at all: as a tool, we need the Axiom of Choice for X.
Definition A.4.1 (Choice function). Let X ≠ ∅. A mapping f : P(X) → X is called a choice function on X if f(A) ∈ A whenever ∅ ≠ A ⊂ X.
Example. Let X = {p, q} where p ≠ q. Let f : P(X) → X be such that f(X) = p = f({p}) and f({q}) = q. Then f is a choice function on X.
The following Axiom of Choice should be considered as an axiom or a fundamental principle. In this section we discuss its implications.
Axiom A.4.2 (Axiom of Choice). For every non-empty set there exists a choice function.
Exercise A.4.3. Prove that a choice function exists on a well-ordered set. Thus the Well-Ordering Principle implies the Axiom of Choice.
The Axiom of Choice might look more convincing than the Well-Ordering Principle. Yet, we should be careful, as we are dealing with all kinds of sets, about which our intuition might be deficient. The Axiom of Choice might be plausible for X = Z+, or maybe even for X = R, but can we be sure whether it is true in general? Nevertheless, let us add Axiom A.4.2 to our set-theoretic tool box. There are plenty of equivalent formulations for the Axiom of Choice. In the sequel, we present some variants, starting with the “Axiom of Choice for Cartesian Products”.
Definition A.4.4 (Cartesian product). Let X_j be a set for each j ∈ J. The Cartesian product is defined to be
∏_{j∈J} X_j := { f | f : J → ⋃_{j∈J} X_j and ∀j ∈ J : f(j) ∈ X_j }.
If X_j = X for each j ∈ J, we write X^J := ∏_{j∈J} X_j. The elements f ∈ X^J are then functions f : J → X. Moreover, let X^n := X^{Z_n}, where Z_n := {k ∈ Z+ | k ≤ n}.
Exercise A.4.5. Give an example of a bijection g : X_1 × X_2 → ∏_{j∈{1,2}} X_j, especially in the case X × X → X^2. Thereby X_1 × X_2 can be identified with the Cartesian product ∏_{j∈{1,2}} X_j.
Exercise A.4.6. Give a bijection g : P(X) → {0, 1}^X.
The following Theorem A.4.7 is a consequence of the Axiom of Choice A.4.2. However, by Exercise A.4.8, Theorem A.4.7 also implies the Axiom of Choice, and thus it could have been taken as an axiom itself.
Theorem A.4.7 (Axiom of Choice for Cartesian Products). The Cartesian product of non-empty sets is non-empty.
Exercise A.4.8. Show that the Axiom of Choice is equivalent to the Axiom of Choice for Cartesian Products.
Theorem A.4.9 (Hausdorff Maximal Principle). Any chain is contained in a maximal chain.
Proof. Let (X, ≤) be a partially ordered set with a chain C_0 ⊂ X. Let
T := {C | C ⊂ X is a chain such that C_0 ⊂ C}.
Now C_0 ∈ T, so T ≠ ∅. Let f : P(X) → X be a choice function for X. Let us define s : T → T such that s(C) = C if C ∈ T is maximal, and if C ∈ T is not maximal then
s(C) := C ∪ { f ({x ∈ X \ C : C ∪ {x} ∈ T }) };
in this latter case, the chain s(C) is obtained by adding one element to the chain C. The claim follows if we can show that C = s(C) for some C ∈ T.
Let us call U ⊂ T a tower if
• C_0 ∈ U,
• ⋃K ∈ U for any chain K ⊂ U,
• s(U) ⊂ U, in other words: if A ∈ U then s(A) ∈ U.
For instance, T is a tower. Let V be the intersection of all towers. Clearly, V is a tower, in fact the minimal tower. It will turn out that ⋃V ∈ T is a maximal chain. This follows if we can show that V′ ⊂ V is a tower, where
V′ := {C ∈ V | ∀B ∈ V : B ⊂ C or C ⊂ B},
since the minimality would imply V′ = V. Clearly, C_0 ∈ V′, and if K ⊂ V′ is a chain then ⋃K ∈ V′. Let C ∈ V′; we have to show that s(C) ∈ V′. This follows if we can show that C′ ⊂ V is a tower, where
C′ := {A ∈ V | A ⊂ C or s(C) ⊂ A}.
Clearly, C_0 ∈ C′, and if K ⊂ C′ is a chain then ⋃K ∈ C′. Let A ∈ C′; we have to show that s(A) ∈ C′, i.e., that s(A) ⊂ C or s(C) ⊂ s(A). Since C ∈ V′, we have s(A) ⊂ C or C ⊂ s(A). Suppose the non-trivial case “C ⊂ s(A) and A ⊂ C”. Since s(A) = A ∪ {x} for some x ∈ X, we must have s(A) = C or C = A. The proof is complete.
Theorem A.4.10 (Zorn’s Lemma). A partially ordered set where every chain has an upper bound has a maximal element.
Exercise A.4.11 (Hausdorff Maximal Principle ⇐⇒ Zorn’s Lemma). Show that the Hausdorff Maximal Principle is equivalent to Zorn’s Lemma.
Theorem A.4.12 (Zorn’s Lemma =⇒ Axiom of Choice). Zorn’s Lemma implies the Axiom of Choice.
Proof. Let X be a non-empty set. Let
P := {f | f : P(A) → A is a choice function for some A ⊂ X}.
Now P ≠ ∅, because ({x} ↦ x) : P({x}) → {x} belongs to P for any x ∈ X. Let us endow P with the partial order ≤ by inclusion:
f ≤ g if and only if (by definition) f ⊂ g
(here recall that f ∈ P is a subset f ⊂ P(A) × A for some A ⊂ X). Suppose C = {f_j : j ∈ J} ⊂ P is a chain. Then it is easy to verify that
⋃C = ⋃_{j∈J} f_j ∈ P
is an upper bound for C, so according to Zorn’s Lemma there exists a maximal element f ∈ P, which is a choice function for some A ⊂ X. We have to show that A = X. On the contrary, suppose there exists B ⊂ X such that B ∉ P(A). Take x ∈ B. Then f ⊊ f ∪ {(B, x)} ∈ P, which would contradict the maximality of f. Hence f must be a choice function for A = X.
How many points? Intuitively, cardinality measures the number of the elements in a set. Cardinality is a relative concept: sets A, B are compared by whether there is an injection, a surjection or a bijection from one to another. The most interesting results concern infinite sets.
Definition A.4.13 (Cardinality). Sets A, B have the same cardinality, denoted by
|A| = |B| (or A ∼ B),
if there exists a bijection f : A → B. If there exists C ⊂ B such that |A| = |C|, we write |A| ≤ |B|. Moreover, “|A| ≤ |B| ≠ |A|” is abbreviated by |A| < |B|. The cardinality of a set A is often also denoted by card(A).
Theorem A.4.17 (Schröder–Bernstein). Let |X| ≤ |Y| and |Y| ≤ |X|. Then |X| = |Y|.
Proof. Let f : X → Y and g : Y → X be injections. Let X_0 := X and X_1 := g(Y). Define inductively {X_k : k ∈ Z+} ⊂ P(X) by X_{k+2} := g(f(X_k)). Let
X_∞ := ⋂_{k=0}^∞ X_k.
Now X_∞ ⊂ X_{k+1} ⊂ X_k for each k ≥ 0. Moreover,
X_k \ X_{k+1} ∼ X_0 \ X_1 if k is even, and X_k \ X_{k+1} ∼ X_1 \ X_2 if k is odd,
so that
X = X_∞ ∪ ⋃_{k=0}^∞ (X_k \ X_{k+1}) ∼ X_∞ ∪ ⋃_{k=0}^∞ (X_{k+1} \ X_{k+2}) = X_1 ∼ Y.
Thus X ∼ Y.
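The partition idea behind this proof can be made executable. The following Python sketch (not from the book; the sets and injections are toy choices of ours) chases each x backwards through g and f, as in the standard construction: chains stopping in X, or cycling, use f, and chains stopping in Y use g⁻¹. On finite sets any pair of two-way injections is already a bijection, so only the cycle case fires here; the stopper cases are what matter for infinite sets.

```python
def schroeder_bernstein(X, Y, f, g):
    """Given injections f : X -> Y and g : Y -> X (as dicts), build a bijection h."""
    g_inv = {g[y]: y for y in Y}
    f_inv = {f[x]: x for x in X}
    h = {}
    for x in X:
        u, in_X, stopper = x, True, None
        seen = set()
        while (u, in_X) not in seen:           # chase the chain backwards
            seen.add((u, in_X))
            if in_X:
                if u not in g_inv:             # chain starts in X
                    stopper = 'X'
                    break
                u, in_X = g_inv[u], False
            else:
                if u not in f_inv:             # chain starts in Y
                    stopper = 'Y'
                    break
                u, in_X = f_inv[u], True
        # Y-stoppers use g_inv (defined at x, since the first step succeeded);
        # X-stoppers and cycles use f
        h[x] = g_inv[x] if stopper == 'Y' else f[x]
    return h

X = {0, 1, 2, 3}
Y = {'a', 'b', 'c', 'd'}
f = {0: 'a', 1: 'b', 2: 'c', 3: 'd'}   # an injection X -> Y
g = {'a': 1, 'b': 2, 'c': 3, 'd': 0}   # an injection Y -> X
h = schroeder_bernstein(X, Y, f, g)
assert set(h.values()) == Y            # h is a bijection
print(h)
```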
The following Law of Trichotomy is equivalent to the Axiom of Choice, though we derive it only as a corollary to Zorn’s Lemma:
Theorem A.4.18 (The Law of Trichotomy). Let X, Y be sets. Then exactly one of the following holds:
|X| < |Y|, |X| = |Y|, |Y| < |X|.
Proof. Assume the non-trivial case X, Y ≠ ∅. Let us define
J := {f | A ⊂ X, f : A → Y injective}.
Clearly, J ≠ ∅. Thus we may define a partial order ≤ on J by g ≤ h ⇐⇒ g ⊂ h; notice here that g, h ⊂ X × Y. Let K ⊂ J be a chain. Then it has an upper bound ⋃K ∈ J. Hence by Zorn’s Lemma, there exists a maximal element f ∈ J. Now f : A → Y is injective, where A ⊂ X. If A = X then |X| ≤ |Y|.
If f(A) = Y then |Y| = |A| ≤ |X|. So let us suppose that A ≠ X and f(A) ≠ Y. Then take x_0 ∈ X \ A and y_0 ∈ Y \ f(A), and define g : A ∪ {x_0} → Y by
g(x) := f(x) if x ∈ A, and g(x_0) := y_0.
Then g ∈ J and f ≤ g ≠ f, which contradicts the maximality of f. Thereby A = X or f(A) = Y, meaning that
|X| ≤ |Y| or |Y| ≤ |X|.
There is no greatest cardinality: Theorem A.4.19 (No greatest cardinality). Let X be a set. Then |X| < |P(X)|. Proof. If X = ∅ then P(X) = {∅}, and the only injection from X to P(X) is then the empty relation, which is not a bijection. Assume that X = ∅. Then function f : X → P(X),
f (x) := {x}
is an injection, establishing |X| ≤ |P(X)|. To get a contradiction, assume that X ∼ P(X), so that there exists a bijection g : X → P(X). Let A := {x ∈ X : x ∈ g(x)} . Let x0 := g −1 (A). Now x0 ∈ A if and only if x0 ∈ g(x0 ) = A, which is a contradiction. Definition A.4.20 (Counting). Let A, B, C, D be sets. For n ∈ Z+ , let Zn := {k ∈ Z+ | k ≤ n} = {1, . . . , n}. We say that |∅| = 0, |Zn | = n, A is finite if |A| = n for some n ∈ Z+ ∪ {0}. B is infinite if it is not finite. C is countable if |C| ≤ |Z+ |. D is uncountable if it is not countable.
Remark A.4.21. To strive for transparency in the proofs in this section, let us forget the Law of Trichotomy, which would provide short-cuts like “|X| < |Y| if and only if |Y| ≤ |X| fails”. The reader may easily simplify parts of the reasoning using this. The reader is also encouraged to find out where we use the Axiom of Choice or some other non-trivial tools.
Proposition A.4.22. Let A, B be sets. Then |A| < |Z+| ≤ |B| if and only if A is finite and B is infinite.
Proof. Let A ≠ ∅ be finite, so A ∼ Z_n ⊂ Z+ for some n ∈ Z+. Hence |A| ≤ |Z+|. If f : Z+ → A then f(n + 1) ∈ f(Z_n), so f is not injective, in particular not bijective. Thus |Z+| ≤ |A| fails, and |A| < |Z+|. Consequently, if |Z+| ≤ |B| then B is infinite.
Let B be infinite. Take x_1 ∈ B ≠ ∅. Let A_n = {x_1, . . . , x_n} ⊂ B be a finite set. Inductively, take x_{n+1} ∈ B \ A_n ≠ ∅. Define g : Z+ → B, g(n) := x_n. Now g is injective. Hence |Z+| ≤ |B|.
Let B ⊂ Z+ be infinite. Define h : Z+ → B inductively by
h(1) := min(B), h(n + 1) := min (B \ {h(1), . . . , h(n)}).
Now h is a bijection: |B| = |Z+|. So if |A| < |Z+| then A is finite.
Proposition A.4.23. Let C, D be sets. Then |C| ≤ |Z+| < |D| if and only if C is countable and D is uncountable.
Proof. The property |C| ≤ |Z+| is just the definition of countability. Let D be uncountable, i.e., |D| ≤ |Z+| fails. By Proposition A.4.22, D is not finite, i.e., it is infinite, i.e., |Z+| ≤ |D|. Because |Z+| ≠ |D|, we have |Z+| < |D|.
Conversely, let |Z+| < |D|. By Proposition A.4.22, D is infinite, i.e., |D| < |Z+| fails. Because |Z+| ≠ |D|, we even have that |D| ≤ |Z+| fails, i.e., D is uncountable.
Remark A.4.24. Let us collect the results from Propositions A.4.22 and A.4.23: for sets A, B, C, D,
|A| < |Z+| ≤ |B| and |C| ≤ |Z+| < |D|
if and only if A is finite, B is infinite, C is countable, and D is uncountable. In the proofs we used induction, i.e., the well-ordering of Z+.
Proposition A.4.25 (Cantor). Let A_k ⊂ X be a countable subset for each k ∈ Z+. Then ⋃_{k=1}^∞ A_k is countable.
Chapter A. Sets, Topology and Metrics
Proof. We may enumerate the elements of each countable Ak: Ak := {a_{kj} : j ∈ Z+},

A1 = {a11, a12, a13, a14, · · ·},
A2 = {a21, a22, a23, a24, · · ·},
A3 = {a31, a32, a33, a34, · · ·},
A4 = {a41, a42, a43, a44, · · ·},
 ⋮

Their union is enumerated along the anti-diagonals by

⋃_{k=1}^∞ Ak = {a11, a21, a12, a31, a22, a13, a41, a32, a23, a14, · · ·} = {a_{k−j+1, j} : 1 ≤ j ≤ k, k ∈ Z+}.

(If some Ak is finite, the enumeration simply repeats elements, which does not affect countability.) □
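The anti-diagonal walk in the proof is easy to simulate; the following Python fragment (an illustration only) generates the index pairs (k − j + 1, j) in the order of the enumeration:

```python
def antidiagonal_order(n_diagonals):
    """Index pairs (k - j + 1, j), 1 <= j <= k, walked diagonal by diagonal,
    matching the enumeration a11, a21, a12, a31, a22, a13, ..."""
    return [(k - j + 1, j)
            for k in range(1, n_diagonals + 1)
            for j in range(1, k + 1)]

print(antidiagonal_order(3))
# [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3)]
```

Each pair (m, j) with m, j ∈ Z+ appears exactly once, on the anti-diagonal k = m + j − 1, which is exactly the bijection Z+ → Z+ × Z+ the proof exploits.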
Exercise A.4.26. Show that the set Q of rational numbers is countably infinite.

Exercise A.4.27 (Algebraic numbers). A number λ ∈ C is called algebraic if p(λ) = 0 for some non-zero polynomial p with integer coefficients, i.e., if

p(z) = Σ_{k=0}^{n} a_k z^k,

where n ∈ Z+, {a_k}_{k=0}^{n} ⊂ Z and a_n ≠ 0. Let A ⊂ C be the set of algebraic numbers. Show that Q ⊂ A, that A is countable, and give an example of a number λ ∈ (R ∩ A) \ Q.
Proposition A.4.28. |R| = |P(Z+ )|. Proof. Let us define f : R → P(Q),
f (x) := {r ∈ Q : r < x} .
Obviously f is injective, hence |R| ≤ |P(Q)|. By Exercise A.4.26, |Q| = |Z+|, implying |P(Q)| = |P(Z+)|. On the other hand, let us define

g : P(Z+) → R,  g(A) := Σ_{k∈A} 10^{−k}.

For instance, 0 = g(∅) ≤ g(A) ≤ g(Z+) = 1/9. Nevertheless, g is injective, implying |P(Z+)| ≤ |R|. This completes the proof. □
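The injectivity of g (distinct subsets produce distinct strings of decimal digits 0 and 1) can be checked mechanically on finite pieces of P(Z+); a small Python sketch with exact rational arithmetic, our own illustration rather than the text's argument:

```python
from fractions import Fraction
from itertools import chain, combinations

def g(A):
    """g(A) = sum over k in A of 10**(-k), computed exactly."""
    return sum((Fraction(1, 10 ** k) for k in A), Fraction(0))

# all 32 subsets of {1, ..., 5} give pairwise distinct values of g
universe = range(1, 6)
subsets = list(chain.from_iterable(combinations(universe, r) for r in range(6)))
values = [g(A) for A in subsets]
print(len(values), len(set(values)))  # 32 32: g is injective on these subsets
```

Exact fractions are used so that equality of values is tested exactly, not up to floating-point rounding.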
Exercise A.4.29. Let X be an uncountable set. Show that there exists an uncountable subset S ⊂ X such that X \ S is also uncountable.
A.5 Well-Ordering Principle revisited

Trivially, the Well-Ordering Principle implies the Axiom of Choice. Actually, there is the reverse implication, too:

Theorem A.5.1 (Well-Ordering Principle). Every non-empty set can be well ordered.

Proof. Let X ≠ ∅. Let

P := {(Aj, ≤j) | j ∈ J, Aj ⊂ X, (Aj, ≤j) well-ordered}.

Clearly, P ≠ ∅. Define a partial order ≤ on P by inclusion:

(Aj, ≤j) ≤ (Ak, ≤k)  ⟺  ≤j ⊂ ≤k.

Take a chain C ⊂ P. Let

B := ⋃_{(Aj,≤j)∈C} Aj,   ≤B := ⋃_{(Aj,≤j)∈C} ≤j.
Then (B, ≤B) ∈ P is an upper bound for the chain C ⊂ P, so there exists a maximal element (A, ≤A) ∈ P by Zorn's Lemma A.4.10. Now, if there were x ∈ X \ A, then we easily see that A ∪ {x} could be well ordered by some ≤x for which ≤A ⊂ ≤x, which would contradict the maximality of (A, ≤A). Therefore A = X has been well ordered. □

Although we already know that the Well-Ordering Principle and the Hausdorff Maximal Principle are equivalent, let us demonstrate how to use transfinite induction in a related proof:

Proposition A.5.2 (Well-Ordering Principle =⇒ Hausdorff Maximal Principle). The Well-Ordering Principle implies the Hausdorff Maximal Principle.

Proof. Let (X, ≤) be well ordered, i.e., there exists min(A) ∈ A whenever ∅ ≠ A ⊂ X. Let ≤0 be a partial order on X. Let us define f : X → P(X) by transfinite induction in the following way:

f(x) := {x}, if {x} ∪ ⋃ f({y : y < x}) is a chain with respect to ≤0;
f(x) := ∅, otherwise.

Then ⋃ f(X) ⊂ X is a maximal chain with respect to ≤0. □
Exercise A.5.3. Fill in the details in the proof of Proposition A.5.2. Remark A.5.4 (Formulations of the Axiom of Choice). Collecting earlier results and exercises, we see that the following claims are equivalent: the Axiom of Choice, the Axiom of Choice for Cartesian Products, the Hausdorff Maximal Principle, Zorn’s Lemma, and the Well-Ordering Principle. The Law of Trichotomy was derived as a corollary to these, but it is actually another equivalent formulation for the Axiom of Choice (see, e.g., [124]).
Remark A.5.5 (Continuum Hypothesis). When working in analysis, one does not often pay much attention to the underlying set-theoretic foundations. Yet, there are many deep problems involved. For instance, it can be shown that there is a smallest uncountable cardinality |Ω|, i.e., whenever S is uncountable then |Z+| < |Ω| ≤ |S|. So |Ω| ≤ |R|. A natural question is whether |Ω| = |R|. Actually, in 1900, David Hilbert proposed the so-called Continuum Hypothesis |Ω| = |R|. The Generalised Continuum Hypothesis is that if X, Y are infinite sets and |X| ≤ |Y| ≤ |P(X)| then |X| = |Y| or |Y| = |P(X)|. Without going into details, let (ZF) denote the Zermelo–Fraenkel axioms for set theory, (AC) the Axiom of Choice, and (CH) the Generalised Continuum Hypothesis. From the 1930s to the 1960s, Kurt Gödel and Paul Cohen discovered that:

1. Within (ZF) one cannot prove whether (ZF) is consistent.
2. (ZF+AC+CH) is consistent if (ZF) is consistent.
3. (AC) is independent of (ZF).
4. (CH) is independent of (ZF+AC).

The reader will be notified whenever we apply (AC) or its equivalents (which is not that often); in this book, we shall not need (CH) at all.
A.6 Metric spaces
Definition A.6.1 (Metric space). A function d : X × X → [0, ∞) is called a metric on the set X if for every x, y, z ∈ X we have

d(x, y) = 0 ⟺ x = y   (non-degeneracy);
d(x, y) = d(y, x)   (symmetry);
d(x, z) ≤ d(x, y) + d(y, z)   (triangle inequality).

Then (X, d) (or simply X when d is evident) is called a metric space. Sometimes a metric is called a distance function. When x ∈ X and r > 0,

Br(x) := {y ∈ X | d(x, y) < r}

is called the x-centered open ball of radius r. If we want to emphasise that the ball is taken with respect to the metric d, we will write Bd(x, r).

Remark A.6.2. In a metric space (X, d),

⋃_{k=1}^∞ Bk(x) = X  and  ⋂_{k=1}^∞ B_{1/k}(x) = {x}.
Example (Discrete metric). The mapping d : X × X → [0, ∞) defined by

d(x, y) := 1 if x ≠ y,  d(x, y) := 0 if x = y,

is called the discrete metric on X. Here

Br(x) = X if r > 1,  Br(x) = {x} if 0 < r ≤ 1.

Example. Normed vector spaces form a very important class of metric spaces, see Definition B.4.1.

Exercise A.6.3. For 1 ≤ p < ∞,

dp(x, y) = ‖x − y‖p := ( Σ_{j=1}^{n} |xj − yj|^p )^{1/p}

defines a metric dp : Rn × Rn → [0, ∞). The function

d∞(x, y) = max_{1≤j≤n} |xj − yj|

also turns Rn into a metric space. Unless otherwise mentioned, the space Rn is endowed with the Euclidean metric d2 (distance "as the crow flies").

Exercise A.6.4 (Sup-metric). Let a < b and let B([a, b]) be the space of all bounded functions f : [a, b] → R. Show that the function

d∞(f, g) = sup_{y∈[a,b]} |f(y) − g(y)|
turns B([a, b]) into a metric space. It is called the sup-metric.

Remark A.6.5 (Metric subspaces). If A ⊂ X and d : X × X → [0, ∞) is a metric, then the restriction d|A×A : A × A → [0, ∞) is a metric on A, with Bd|A×A(x, r) = A ∩ Bd(x, r).

Exercise A.6.6. Let a < b and let C([a, b]) be the space of all continuous functions f : [a, b] → R. Show the following statements: The function d∞(f, g) = sup_{y∈[a,b]} |f(y) − g(y)| turns (C([a, b]), d∞) into a metric subspace of (B([a, b]), d∞). The space C([a, b]) also becomes a metric space with the metric

dp(f, g) = ( ∫_a^b |f(y) − g(y)|^p dy )^{1/p},

for any 1 ≤ p < ∞. However, B([a, b]) with these dp is not a metric space.
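The metric axioms for dp and d∞ from Exercise A.6.3 can be spot-checked numerically; a Python sketch (an illustration only, since a finite check is of course no proof):

```python
import itertools
import math
import random

def d_p(x, y, p):
    """The metric d_p on R^n for 1 <= p < infinity."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_inf(x, y):
    """The metric d_infinity on R^n."""
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(0)
pts = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(12)]
for p in (1, 2, 3):
    ok = all(d_p(x, z, p) <= d_p(x, y, p) + d_p(y, z, p) + 1e-12
             for x, y, z in itertools.product(pts, repeat=3))
    print("triangle inequality for d_%s:" % p, ok)
print("triangle inequality for d_inf:",
      all(d_inf(x, z) <= d_inf(x, y) + d_inf(y, z) + 1e-12
          for x, y, z in itertools.product(pts, repeat=3)))
```

The small tolerance 1e-12 guards against floating-point rounding; the actual inequality is Minkowski's inequality, proved later in the text's framework.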
Definition A.6.7 (Diameter and bounded sets). The diameter of a set A ⊂ X in a metric space (X, d) is

diam(A) := sup {d(x, y) | x, y ∈ A},

with the convention diam(∅) = 0. A set A ⊂ X is said to be bounded if diam(A) < ∞.

Example. diam({x}) = 0, diam({x, y}) = d(x, y), and diam({x, y, z}) = max {d(x, y), d(y, z), d(x, z)}.

Exercise A.6.8. Show that diam(Br(x)) ≤ 2r, so that balls are bounded.

Definition A.6.9 (Distance between sets). The distance between sets A, B ⊂ X is

dist(A, B) := inf {d(x, y) | x ∈ A, y ∈ B},

with the convention that dist(A, ∅) = ∞.

Exercise A.6.10. Show that A ∩ Br(x) ≠ ∅ if and only if dist({x}, A) < r.

We note that the function dist(A, B) does not define a metric on the subsets of X. For example:

Exercise A.6.11. Give an example of sets A, B ⊂ R2 for which dist(A, B) = 0 even though A ∩ B = ∅. Here we naturally consider the Euclidean metric.

Exercise A.6.12. Show that a set S in a metric space (X, d) is bounded if and only if there exist some a ∈ X and r > 0 such that S ⊂ Br(a).

Lemma A.6.13. Let S be a bounded set in a metric space (X, d) and let c ∈ X. Then S ⊂ BR(c) for some R > 0.

Proof. Since S is a bounded set, there exist some a ∈ X and r > 0 such that S ⊂ Br(a). Consequently, for all x ∈ S we have d(x, c) ≤ d(x, a) + d(a, c) < r + d(a, c), so the statement follows with R = r + d(a, c). □

Proposition A.6.14. The union of finitely many bounded sets in a non-empty metric space is bounded.

Proof. Let S1, ..., Sn be bounded sets in a non-empty metric space (X, d). Let us take some c ∈ X. Then by Lemma A.6.13 there exist some Ri, i = 1, ..., n, such that Si ⊂ B_{Ri}(c). If we take R = max{R1, ..., Rn}, then we have Si ⊂ B_{Ri}(c) ⊂ BR(c), which implies that ⋃_{i=1}^n Si ⊂ BR(c) is bounded. □

Remark A.6.15. We note that the union of infinitely many bounded sets does not have to be bounded. For example, the union of the sets Si = (0, i) ⊂ R, i ∈ N, is not bounded in (R, d∞).
Usually, topological properties can be characterised with generalised sequences (or nets). Now, we briefly study this phenomenon in metric topology, where ordinary sequences suffice.

Definition A.6.16 (Sequences). A sequence in a set A is a mapping x : Z+ → A. We write xk := x(k) and x = (xk)_{k∈Z+} = (xk)_{k=1}^∞ = (x1, x2, x3, ...). Notice that x ≠ {x1, x2, x3, · · ·} = {xk : k ∈ Z+}.

Definition A.6.17 (Convergence). Let (X, d) be a metric space. A sequence x : Z+ → X converges to a point p ∈ X if lim_{k→∞} d(xk, p) = 0, i.e.,

∀ε > 0 ∃kε ∈ Z+ : k ≥ kε ⇒ d(xk, p) < ε.

In such a case, we write lim_{k→∞} xk = p, or xk → p as k → ∞, etc. Clearly, xk → p as k → ∞ if and only if

∀ε > 0 ∃N : k ≥ N ⇒ xk ∈ Bε(p).
We now collect some properties of limits. Proposition A.6.18 (Uniqueness of limits in metric spaces). Let (X, d) be a metric space. If xk → p and xk → q as k → ∞, then p = q. Proof. Let ε > 0. Since xk → p and xk → q as k → ∞, it follows that there are some numbers N1 , N2 such that d(xk , p) < ε for all k > N1 and such that d(xk , q) < ε for all k > N2 . Hence by the triangle inequality for all k > max{N1 , N2 } we have d(p, q) ≤ d(p, xk ) + d(xk , q) < 2ε. Since this conclusion is true for any ε > 0, it follows that d(p, q) = 0 and hence p = q.
A.7 Topological spaces

Previously, a metric provided a way of measuring distances between points. The branch of mathematics called topology can be thought of as a way to describe the "qualitative geography of a set" without referring to specific numerical distance values. We begin by considering those properties of metric spaces that motivate the definition of a topology, which follows after them.

Definition A.7.1 (Open sets and neighbourhoods). A set U ⊂ X in a metric space X is said to be open if for every x ∈ U there is some ε > 0 such that Bε(x) ⊂ U. For a point x ∈ X, any open set containing x is called an open neighbourhood of x.

Proposition A.7.2. Every ball Br(a) in a metric space (X, d) is open.
Proof. Let x ∈ Br(a). Then the number ε = r − d(x, a) > 0 is positive, and Bε(x) ⊂ Br(a). Indeed, for any y ∈ Bε(x) we have d(y, a) ≤ d(y, x) + d(x, a) < ε + d(x, a) = r. □

Proposition A.7.3. Let (X, d) be a metric space. Then xk → p as k → ∞ if and only if every open neighbourhood of p contains all but finitely many of the points xk.

Proof. The "if" implication is immediate because balls are open. On the other hand, let p ∈ U where U is an open set. Then there is some ε > 0 such that Bε(p) ⊂ U. Now, if xk → p as k → ∞, there is some N such that for all k > N we have xk ∈ Bε(p) ⊂ U, implying the statement. □

Definition A.7.4 (Continuous mappings in metric spaces). Let (X1, d1) and (X2, d2) be two metric spaces, let f : X1 → X2, and let a ∈ X1. Then f is said to be continuous at a if for every ε > 0 there is some δ > 0 such that d1(x, a) < δ implies d2(f(x), f(a)) < ε. The mapping f is said to be continuous (on X1) if it is continuous at all points of X1.

Example. Let X1 = C([a, b]) be equipped with the sup-metric d1, and let X2 = R be equipped with the Euclidean metric d2. Then the mapping Φ : X1 → X2 defined by Φ(h) = ∫_a^b h(y) dy is continuous.

Definition A.7.5 (Preimage). Let f : X1 → X2 be a mapping and let S ⊂ X2 be any subset of X2. Then the preimage of S under f is defined by f^{−1}(S) = {x ∈ X1 : f(x) ∈ S}.

Theorem A.7.6. Let (X1, d1), (X2, d2) be metric spaces and let f : X1 → X2. Then the following statements are equivalent:

(i) f is continuous on X1;
(ii) for every a ∈ X1 and every ball Bε(f(a)) ⊂ X2 there is a ball Bδ(a) ⊂ X1 such that Bδ(a) ⊂ f^{−1}(Bε(f(a)));
(iii) for every open set U ⊂ X2 its preimage f^{−1}(U) is open in X1.

Proof. First, let us show the equivalence of (i) and (ii). Condition (i) is equivalent to saying that for every ε > 0 there is δ > 0 such that d1(x, a) < δ implies d2(f(a), f(x)) < ε. In turn this is equivalent to saying that for every ε > 0 there is δ > 0 such that x ∈ Bδ(a) implies f(x) ∈ Bε(f(a)), which means that Bδ(a) ⊂ f^{−1}(Bε(f(a))).

To show that (ii) implies (iii), assume that f is continuous and that U ⊂ X2 is open. Take a ∈ f^{−1}(U). Then f(a) ∈ U, and since U is open there is some ε > 0 such that Bε(f(a)) ⊂ U. Consequently, by (ii), there exists δ > 0 such that Bδ(a) ⊂ f^{−1}(Bε(f(a))) ⊂ f^{−1}(U), implying that f^{−1}(U) is open.

Finally, let us show that (iii) implies (ii). We observe that by (iii), for every a ∈ X1 and every ε > 0 the set f^{−1}(Bε(f(a))) is an open set containing a. Hence there is some δ > 0 such that Bδ(a) ⊂ f^{−1}(Bε(f(a))), completing the proof. □
Theorem A.7.7. Let X be a metric space. We have the following properties of open sets in X:

(T1) ∅ and X are open sets in X.
(T2) The union of any collection of open subsets of X is open.
(T3) The intersection of a finite collection of open subsets of X is open.
Proof. It is obvious that the empty set ∅ is open. Moreover, for any x ∈ X and any ε > 0 we have Bε(x) ⊂ X, implying that X is also open. To show (T2), suppose that we have a collection {Ai}_{i∈I} of open sets in X, for an index set I. Let a ∈ ⋃_{i∈I} Ai. Then there is some j ∈ I such that a ∈ Aj, and since Aj is open there is some ε > 0 such that Bε(a) ⊂ Aj ⊂ ⋃_{i∈I} Ai, implying (T2). To show (T3), assume that A1, ..., An is a finite collection of open sets and let a ∈ ⋂_{i=1}^n Ai. It follows that for every i = 1, ..., n we have a ∈ Ai, and hence there is εi > 0 such that B_{εi}(a) ⊂ Ai. Let now ε = min{ε1, ..., εn}. Then Bε(a) ⊂ Ai for all i, and hence Bε(a) ⊂ ⋂_{i=1}^n Ai, implying that the intersection of the Ai is open. □

Definition A.7.8 (Topology). A family of sets τ ⊂ P(X) is called a topology on the set X if

1. ⋃U ∈ τ for every collection U ⊂ τ, and
2. ⋂U ∈ τ for every finite collection U ⊂ τ.

Then (X, τ) (or simply X when τ is evident) is called a topological space; a set A ⊂ X is called open (or τ-open) if A ∈ τ, and closed (or τ-closed) if X \ A ∈ τ.

Let the collection of τ-closed sets be denoted by τ* = {X \ U : U ∈ τ}. Then the axioms of the topology become naturally complemented:

1. ⋂A ∈ τ* for every collection A ⊂ τ*, and
2. ⋃A ∈ τ* for every finite collection A ⊂ τ*.

Remark A.7.9. Recall our natural conventions (A.1) for the union and the intersection of the empty family. Thereby τ ⊂ P(X) is a topology if and only if the following conditions hold:

(T1) ∅, X ∈ τ,
(T2) ⋃U ∈ τ for every non-empty collection U ⊂ τ, and
(T3) U ∩ V ∈ τ for every U, V ∈ τ.

Consequently, for any topology of X, the subsets ∅ ⊂ X and X ⊂ X are always both open and closed.

Proposition A.7.3 motivates the following notion of convergence in topological spaces.
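On a finite set, the topology axioms (T1)–(T3) can be verified exhaustively. A Python sketch (our own illustration; the set X = {1, 2, 3} and the families below are arbitrary choices):

```python
from itertools import chain, combinations

def is_topology(X, tau):
    """Check (T1)-(T3) of Remark A.7.9 for a finite family tau of frozensets."""
    tau = set(tau)
    if frozenset() not in tau or frozenset(X) not in tau:
        return False                                    # (T1)
    families = chain.from_iterable(combinations(tau, r)
                                   for r in range(1, len(tau) + 1))
    if any(frozenset().union(*fam) not in tau for fam in families):
        return False                                    # (T2): all unions
    return all(U & V in tau for U in tau for V in tau)  # (T3)

X = {1, 2, 3}
tau = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(X)}
print(is_topology(X, tau))                        # True
print(is_topology(X, tau | {frozenset({2, 3})}))  # False: {1,2} ∩ {2,3} = {2} missing
```

Since tau is finite, checking pairwise intersections and unions of all subfamilies really does cover (T2) and (T3) in full.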
Definition A.7.10 (Convergence in topological spaces). Let (X, τ) be a topological space. We say that a sequence xk converges to p as k → ∞, and write xk → p as k → ∞, if every open neighbourhood of p contains all but finitely many of the points xk.

Proposition A.7.11. Let X and Y be topological spaces and let f : X → Y be continuous. If xk → p in X as k → ∞ then f(xk) → f(p) in Y as k → ∞.

Proof. Let U be an open set in Y containing f(p). Then p ∈ f^{−1}(U) and f^{−1}(U) is open in X, implying that there is N such that xk ∈ f^{−1}(U) for all k > N. Consequently, f(xk) ∈ U for all k > N, implying that f(xk) → f(p) in Y as k → ∞. □

Corollary A.7.12. Any metric space is a topological space by Theorem A.7.7. The canonical topology of a metric space (X, d) is the family τ consisting of all sets in (X, d) which are open according to Definition A.7.1. This canonical metric topology will be denoted by τd or by τ(d). Metric convergence in (X, d) is equivalent to the topological convergence in the canonical metric topology (X, τd).

Remark A.7.13. Notice that the intersection of any finite collection of τ-open sets is τ-open. On the other hand, it may well be that a countably infinite intersection of open sets is not open. In a metric space (X, d),

⋂_{k=1}^∞ B_{1/k}(x) = {x}.
Now {x} ∈ τd if and only if {x} = Br(x) for some r > 0.

Corollary A.7.14 (Properties of closed sets). Let X be a topological space. We have the following properties of closed sets in X:

(C1) ∅ and X are closed in X.
(C2) The intersection of any collection of closed subsets of X is closed.
(C3) The union of a finite collection of closed subsets of X is closed.
Proof. Let Ai, i ∈ I, be any collection of subsets of X. The corollary follows immediately from Remark A.7.9 and de Morgan's rules

X \ ⋃_{i∈I} Ai = ⋂_{i∈I} (X \ Ai),   X \ ⋂_{i∈I} Ai = ⋃_{i∈I} (X \ Ai),

see Exercise A.1.5. □
Definition A.7.15 (Comparing metric topologies). Let d1 , d2 be two metrics on a set X. The topology τ (d1 ) defined by d1 is said to be stronger than topology τ (d2 ) defined by d2 if τ (d1 ) ⊃ τ (d2 ). In this case the topology τ (d2 ) is also said to be weaker than τ (d1 ). Metrics d1 , d2 on a set X are said to be equivalent if they define the same topology τ (d1 ) = τ (d2 ).
Proposition A.7.16 (Criterion for comparing metric topologies). Let d1, d2 be two metrics on a set X such that there is a constant C > 0 such that d2(x, y) ≤ C d1(x, y) for all x, y ∈ X. Then τ(d2) ⊂ τ(d1), i.e., every d2-open set is also d1-open. Consequently, if there is a constant C > 0 such that

C^{−1} d1(x, y) ≤ d2(x, y) ≤ C d1(x, y)   (A.2)

for all x, y ∈ X, then the metrics d1 and d2 are equivalent. Such metrics are called Lipschitz equivalent. Sometimes such metrics are called just equivalent; however, we use the term "Lipschitz" to distinguish this equivalence from the one in Definition A.7.15.

Proof. Fixing the constant C > 0 from (A.2), we observe that d1(x, y) < r implies d2(x, y) < Cr, which means that Bd1(x, r) ⊂ Bd2(x, Cr). Let now U ∈ τ(d2) and let x ∈ U. Then there is some ε > 0 such that Bd2(x, ε) ⊂ U, implying that Bd1(x, ε/C) ⊂ U. Hence U ∈ τ(d1). □

Exercise A.7.17. Prove that the metrics dp, 1 ≤ p ≤ ∞, from Exercise A.6.3 are all Lipschitz equivalent. The corresponding topology is called the Euclidean metric topology on Rn.

Definition A.7.18 (Relative topology). Let (X, τ) be a topological space and let A ⊂ X. Then we define the relative topology on A by

τA = {U ∩ A : U ∈ τ}.

Proposition A.7.19 (Relative topology is a topology). Any subset A of a topological space (X, τ), when equipped with the relative topology τA, is a topological space.

Proof. We have to check the properties (T1)–(T3) of Remark A.7.9. It is easy to see that ∅ = ∅ ∩ A ∈ τA and that A = X ∩ A ∈ τA. To show (T2), let Vi ∈ τA, i ∈ I, be a family of sets from τA. Then there exist sets Ui ∈ τ such that Vi = Ui ∩ A. Consequently, we have

⋃_{i∈I} Vi = ⋃_{i∈I} (Ui ∩ A) = (⋃_{i∈I} Ui) ∩ A ∈ τA.

To show (T3), let V1, ..., Vn be a family of sets from τA. It follows that there exist sets Ui ∈ τ such that Vi = Ui ∩ A. Consequently, we have

⋂_{i=1}^n Vi = ⋂_{i=1}^n (Ui ∩ A) = (⋂_{i=1}^n Ui) ∩ A ∈ τA,

completing the proof. □
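For Exercise A.7.17, the relevant Lipschitz bounds for the pair d2, d∞ are d∞ ≤ d2 ≤ √n · d∞ on Rn, i.e., (A.2) holds with C = √n. A numerical spot-check in Python (an illustration only; the inequalities themselves are elementary):

```python
import math
import random

def d2(x, y):
    """Euclidean metric on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    """Maximum metric on R^n."""
    return max(abs(a - b) for a, b in zip(x, y))

n = 4
C = math.sqrt(n)   # d_inf <= d2 <= sqrt(n) * d_inf on R^n
random.seed(1)
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    assert d_inf(x, y) <= d2(x, y) <= C * d_inf(x, y) + 1e-9
print("Lipschitz bounds verified with C =", C)
```

By Proposition A.7.16, these two-sided bounds show that d2 and d∞ generate the same topology on Rn.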
Remark A.7.20 (Metric subspaces). Let (X, d) be a metric space with canonical topology τ(d). Let Y ⊂ X be a subset of X and let us define dY = d|Y×Y. Then τ(dY) = τ(d)Y, i.e., the canonical topology of the metric subspace coincides with the relative topology of the metric space.

Definition A.7.21 (Product topology). Let (X1, τ1) and (X2, τ2) be topological spaces. A subset of X1 × X2 is said to be open in the product topology if it is a union of sets of the form U1 × U2, where U1 ∈ τ1, U2 ∈ τ2. The collection of all such open sets is denoted by τ1 ⊗ τ2.

Proposition A.7.22 (Product topology is a topology). The set X1 × X2 with the collection τ1 ⊗ τ2 is a topological space.

Proof. We have to check properties (T1)–(T3) of Remark A.7.9. It is easy to see that ∅ = ∅ × ∅ ∈ τ1 ⊗ τ2 and that X1 × X2 ∈ τ1 ⊗ τ2. To show (T2), assume that Aα ∈ τ1 ⊗ τ2 for all α ∈ I. Then each Aα is a union of sets of the form U1 × U2 with U1 ∈ τ1, U2 ∈ τ2. Consequently, the union ⋃_{α∈I} Aα is a union of sets of the same form and does, therefore, also belong to τ1 ⊗ τ2. To show (T3), even for n sets, assume that Ai ∈ τ1 ⊗ τ2 for all i = 1, ..., n. By definition there exist collections U^i_{αi} ∈ τ1, V^i_{αi} ∈ τ2, αi ∈ Ii, i = 1, ..., n, such that

Ai = ⋃_{αi∈Ii} (U^i_{αi} × V^i_{αi}),  i = 1, ..., n.

Consequently,

⋂_{i=1}^n Ai = ⋃_{αi∈Ii, 1≤i≤n} ((⋂_{j=1}^n U^j_{αj}) × (⋂_{j=1}^n V^j_{αj})) ∈ τ1 ⊗ τ2,

completing the proof. □
Theorem A.7.23 (Topologies on R2 ). The product topology on R × R is the Euclidean metric topology of R2 . Proof. We start by proving that every set open in the product topology of R2 is also open in the Euclidean topology of R2 . First we note that any open set in R in the Euclidean topology is a union of open intervals, i.e., every open set U can be written as U = ∪x∈U Bεx (x), where Bεx (x) is an open ball centred at x with some εx > 0. Then we note that every open rectangle in R2 is open in the Euclidean topology. Indeed, any rectangle R = (a, b) × (c, d) in R2 can be written as a union of balls, i.e., R = ∪x∈R Bεx (x), with balls Bεx (x) taken with respect to d2 , with some εx > 0, implying that R is open in the Euclidean topology of R2 . Finally, we note that any open set A in the product topology is a union of sets of the form U1 × U2 , where U1 , U2 are open in
R. Consequently, writing both U1 and U2 as unions of open intervals, we obtain that A is a union of open rectangles in R2, which we showed to be open in the Euclidean topology, implying in turn that A is also open in the Euclidean topology of R2. Conversely, let us prove that every set open in the Euclidean topology of R2 is also open in the product topology of R2. First we note that clearly every disc Bε(x) in R2 can be written as a union of open rectangles and is, therefore, open in the product topology of R2. Consequently, every open set U in the Euclidean topology can be written as U = ⋃_{x∈U} B_{εx}(x) for some εx > 0, so that it is also open in the product topology as a union of open sets. □
A.8 Kuratowski's closure
In this section we describe another approach to topology based on Kuratowski’s closure operator. This provides another (and perhaps more intuitive) approach to some notions of the previous section. Definition A.8.1 (Metric interior, closure, boundary, etc.). In a metric space (X, d), the metric closure of A ⊂ X is A = cld (A) := {x ∈ X | ∀r > 0 : A ∩ Br (x) = ∅} . In other words, x ∈ cld (A) ⇐⇒ dist({x}, A) = 0 (i.e., “x is close to A”). This is also equivalent to saying that every ball around x contains point(s) of A. The metric interior intd (A), the metric exterior extd (A) and the metric boundary ∂d (A) are defined by intd (A) := X \ cld (X \ A), extd (A) := X \ cld (A), ∂d (A) := cld (A) ∩ cld (X \ A). Notice that in this way, we have defined mappings cld , intd , extd , ∂d : P(X) → P(X). Exercise A.8.2. Let (X, d) be a metric space and A ⊂ X. Prove the following claims: intd (A) = {x ∈ X | ∃r > 0 : Br (x) ⊂ A} , ∂d (A) = cld (A) \ intd (A), X = intd (A) ∪ ∂d (A) ∪ extd (A). Consequently, prove that cld (A) is closed for any set A ⊂ X.
Definition A.8.3 (Metric topology). Let (X, d) be a metric space. Then

τd := intd(P(X)) = {intd(A) | A ⊂ X}

is called the metric topology or the family of metrically open sets. The corresponding family of metrically closed sets is

τd* := cld(P(X)) = {cld(A) | A ⊂ X}.

By the following Lemma A.8.4, we have

• a set C ⊂ X is metrically closed if and only if C = cld(C),
• a set U ⊂ X is metrically open if and only if U = intd(U).

Lemma A.8.4. Let (X, d) be a metric space and A ⊂ X. Then

cld(cld(A)) = cld(A),   (A.3)
intd(intd(A)) = intd(A).   (A.4)
Proof. Let C = cld(A). Trivially, C ⊂ cld(C). Let x ∈ cld(C). Let r > 0. Take y ∈ C ∩ Br(x), and then z ∈ A ∩ Br(y). Hence d(x, z) ≤ d(x, y) + d(y, z) < 2r, so x ∈ C. Thus (A.3) is obtained. By the definition of the metric interior, (A.3) implies (A.4). □

Definition A.8.5 (Topological interior, closure, boundary, etc.). Let τ be a topology on X. For A ⊂ X, the interior intτ(A) is the largest open subset of A, and the closure A̅ = clτ(A) is the smallest closed set containing A. That is,

intτ(A) := ⋃ {U ∈ τ | U ⊂ A},
A̅ = clτ(A) := ⋂ {S ∈ τ* | A ⊂ S}.

These define mappings intτ, clτ : P(X) → P(X). The boundary ∂τ(A) of a set A ⊂ X is defined by ∂τ(A) := clτ(A) ∩ clτ(X \ A). A set A ⊂ X is dense if clτ(A) = X. The topological space (X, τ) is separable if it has a countable dense subset. A point x ∈ X is an isolated point of a set A ⊂ X if A ∩ U = {x} for some U ∈ τ. A point y ∈ X is an accumulation point of a set B ⊂ X if (B ∩ V) \ {y} ≠ ∅ for every open V containing y. A neighbourhood of x ∈ X is any open set U ⊂ X containing x. The family of neighbourhoods of x ∈ X is denoted by

Vτ(x) := {U ∈ τ | x ∈ U}

(or simply V(x), when τ is evident).
Remark A.8.6. Intuitively, the closure clτ(A) ⊂ X contains those points that are close to A. Clearly,

τ = {intτ(A) | A ⊂ X},
τ* = {clτ(A) | A ⊂ X}.

Moreover, U ∈ τ if and only if U = intτ(U), and C ∈ τ* if and only if C = clτ(C).

Exercise A.8.7. Prove that ∂τ(A) = clτ(A) \ intτ(A).

Exercise A.8.8. Let τd be the metric topology of a metric space (X, d). Show that intd = intτd and that cld = clτd.

Proposition A.8.9 (A characterisation of open sets). Let A be a subset of a topological space X. Then A is open if and only if for every x ∈ A there is an open set Ux containing x such that Ux ⊂ A.

Proof. If A is open we can take Ux = A for every x ∈ A. Conversely, writing A = ⋃_{x∈A} Ux, by property (T2) of open sets we get that A is open if all the Ux are open. □

Proposition A.8.10 (A characterisation of closures). Let A be a subset of a topological space X. Then x ∈ A̅ if and only if every open set containing x contains a point of A.

Proof. We will prove that x ∉ A̅ if and only if there is an open set U such that x ∈ U but A ∩ U = ∅. Since A̅ is defined as the intersection of all closed sets containing A, it follows that x ∉ A̅ means that there is a closed set C such that A ⊂ C and x ∉ C. The set U = X \ C is then the required set. □

Definition A.8.11 (Closure operator). Let X be a set. A closure operator on X is a mapping c : P(X) → P(X) satisfying Kuratowski's closure axioms:

1. c(∅) = ∅,
2. A ⊂ c(A),
3. c(c(A)) = c(A),
4. c(A ∪ B) = c(A) ∪ c(B).

Instead of a closure operator c : P(X) → P(X), we could study an interior operator i : P(X) → P(X); the two are related by

i(S) = X \ c(X \ S),   c(A) = X \ i(X \ A).
Kuratowski's closure axioms become interior axioms:

1. i(X) = X,
2. i(S) ⊂ S,
3. i(i(S)) = i(S),
4. i(S ∩ T) = i(S) ∩ i(T).

Theorem A.8.12. Let (X, τ) be a topological space. Then the mappings intτ, clτ : P(X) → P(X) are interior and closure operators, respectively.

Proof. Obviously, intτ(X) = X and intτ(A) ⊂ A. Moreover, intτ(U) = U for U ∈ τ, and intτ(A) ∈ τ, because τ is a topology. Hence intτ(intτ(A)) = intτ(A). Finally, intτ(A ∩ B) ⊂ intτ(A) ⊂ A and intτ(A ∩ B) ⊂ intτ(B) ⊂ B, so that intτ(A ∩ B) ⊂ intτ(A) ∩ intτ(B); conversely, intτ(A) ∩ intτ(B) ∈ τ and intτ(A) ∩ intτ(B) ⊂ A ∩ B, so that intτ(A) ∩ intτ(B) ⊂ intτ(A ∩ B). Hence intτ(A ∩ B) = intτ(A) ∩ intτ(B). □
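Theorem A.8.12 can be tested exhaustively on a small finite topology; a Python sketch (the space X = {1, 2, 3} and the topology below are our own arbitrary choices):

```python
from itertools import combinations

# a topology on X = {1, 2, 3}, chosen arbitrarily for illustration
X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]

def interior(A):
    """int(A): the union of all open sets contained in A."""
    return frozenset().union(*(U for U in tau if U <= A))

def closure(A):
    """cl(A) = X \\ int(X \\ A)."""
    return X - interior(X - A)

subsets = [frozenset(s) for r in range(4) for s in combinations(sorted(X), r)]
assert closure(frozenset()) == frozenset()                       # axiom 1
assert all(A <= closure(A) for A in subsets)                     # axiom 2
assert all(closure(closure(A)) == closure(A) for A in subsets)   # axiom 3
assert all(closure(A | B) == closure(A) | closure(B)
           for A in subsets for B in subsets)                    # axiom 4
print("Kuratowski's closure axioms hold for cl_tau")
```

The closure is implemented through the interior, exactly as in the duality c(A) = X \ i(X \ A) above.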
Theorem A.8.13. Let i : P(X) → P(X) be an interior operator. Then the family τi = i(P(X)) = {i(A) : A ⊂ X} is a topology. Moreover, i = intτi.

Proof. First, ∅ = i(∅) ∈ τi and X = i(X) ∈ τi. Second, if A, B ∈ τi then A ∩ B = i(A) ∩ i(B) = i(A ∩ B) ∈ τi. Third, let A = {Aj : j ∈ J} ⊂ τi. Note that i is monotone: if S ⊂ T then i(S) = i(S ∩ T) = i(S) ∩ i(T) ⊂ i(T). Since Aj = i(Aj) ⊂ i(⋃A), we get

⋃A = ⋃_{j∈J} Aj = ⋃_{j∈J} i(Aj) ⊂ i(⋃A) ⊂ ⋃A.

Thus ⋃A = i(⋃A) ∈ τi. Next,

intτi(A) = ⋃ {U ∈ τi | U ⊂ A} = ⋃ {i(B) | i(B) ⊂ A, B ⊂ X}.

Here we see that i(A) ⊂ intτi(A). Moreover, if i(B) ⊂ A then i(B) = i(i(B)) ⊂ i(A), so that intτi(A) ⊂ i(A). Hence i(A) = intτi(A). □

Remark A.8.14. Above we have seen how topologies and closure operators (or interior operators) on a set are in bijective correspondence.
Exercise A.8.15. For each j ∈ J, let τj be a topology on X. Prove that τ = ⋂_{j∈J} τj is a topology. Give an example where ⋃_{j∈J} τj is not a topology.
Definition A.8.16 (Base of topology). Let (X, τ) be a topological space. A family B ⊂ P(X) is called a base (or basis) for the topology τ if any open set is a union of some members of B, i.e.,

τ = {⋃B′ : B′ ⊂ B}.

A family A ⊂ P(X) is called a subbase (or subbasis) for the topology τ if

{⋂A′ : A′ ⊂ A is finite}

is a base for the topology. A topology is called second countable if it has a countable base.

Example. Trivially, a topology τ is a base for itself, as U = ⋃{U} for every U ∈ τ. If (X, d) is a metric space then B := {Br(x) | x ∈ X, r > 0} constitutes a base for τd.

Exercise A.8.17. Let A ⊂ P(X). Show that there is a minimal topology τA on X such that A ⊂ τA: more precisely, if σ is a topology on X for which A ⊂ σ, then τA ⊂ σ.

Exercise A.8.18. Let τA be as in the previous exercise. Prove that a base for this topology is provided by

B = {⋂A′ : A′ ⊂ A ∪ {X} is finite}.

Finally, we give another proof of Corollary A.7.12, that metric spaces are topological spaces, using the introduced notions of interior and closure.

Theorem A.8.19 (Metric topology is a topology). Any metric topology is a topology.

Proof. Let τd be the metric topology of (X, d). By Lemma A.8.4, U ∈ τd if and only if U = intd(U). Now ∅, X ∈ τd, because

intd(∅) = {x ∈ X | ∃r > 0 : Br(x) ⊂ ∅} = ∅,
intd(X) = {x ∈ X | ∃r > 0 : Br(x) ⊂ X} = X.

Next, if Br(x) ⊂ U and Bs(x) ⊂ V then B_{min{r,s}}(x) ⊂ U ∩ V. Thus if U, V ∈ τd then U ∩ V ∈ τd. Finally, if Br(x) ⊂ Uk for some k ∈ J then Br(x) ⊂ ⋃_{j∈J} Uj. Thus if {Uj : j ∈ J} ⊂ τd then ⋃_{j∈J} Uj ∈ τd. □
Exercise A.8.20 (Product topology). Let X, Y be topological spaces with bases BX, BY, respectively. Show that the sets {U × V | U ∈ BX, V ∈ BY} form a base for the product topology of X × Y = {(x, y) | x ∈ X, y ∈ Y} from Definition A.7.21.

The metric topology (but not only it, cf. topological spaces with countable bases) can be characterised by limits of sequences:

Theorem A.8.21. Let (X, d) be a metric space, p ∈ X and A ⊂ X. Then p ∈ cld(A) if and only if some sequence x : Z+ → A converges to p.

Proof. Let xk → p, where xk ∈ A for each k ∈ Z+. That is,

∀ε > 0 ∃kε ∈ Z+ : k ≥ kε ⇒ xk ∈ Bd(p, ε).

Thus A ∩ Bd(p, ε) ≠ ∅ for every ε > 0, and thereby p ∈ cld(A). Conversely, let p ∈ cld(A), that is, A ∩ Br(p) ≠ ∅ for all r > 0. For each k ∈ Z+, take xk ∈ A ∩ Bd(p, 1/k). Now (xk)_{k=1}^∞ is a sequence in A converging to p, because d(xk, p) < 1/k. □
A.9 Complete metric spaces

In this section we discuss complete metric spaces, give a sample application to Fredholm integral equations using Banach's Fixed Point Theorem, and show that every metric space can be "completed", such a completion being essentially unique. Later, we will revisit this topic again to show the completeness of R and Rn in Theorem A.13.10 and Corollary A.13.11. Completeness in topological vector spaces will be discussed in Section B.2.

Definition A.9.1 (Cauchy sequences and completeness). Let (X, d) be a metric space. A sequence x : Z+ → X is a Cauchy sequence if

∀ε > 0 ∃kε ∈ Z+ : i, j ≥ kε ⇒ d(xi, xj) < ε.

A metric space is called complete if all its Cauchy sequences converge.

Example. The Euclidean metric space (Rn, d) is complete (see Corollary A.13.11), but its dense subset Qn is not (with the metric of course inherited from d). For instance, Napier's constant e ∈ R \ Q is obtained as the limit of the numbers Σ_{j=0}^{k} 1/j! ∈ Q.

Lemma A.9.2 (Properties of Cauchy sequences). We have the following properties:

(1) Every convergent sequence is a Cauchy sequence.
(2) Every Cauchy sequence is bounded.
(3) If a Cauchy sequence has a convergent subsequence, it converges to the same limit.
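The example of Napier's constant can be probed numerically; a Python sketch with exact rational arithmetic (an illustration only — the book's argument needs no computation):

```python
from fractions import Fraction

def partial_sum(k):
    """s_k = 1/0! + 1/1! + ... + 1/k!, an element of Q."""
    s, factorial = Fraction(0), 1
    for j in range(k + 1):
        if j > 0:
            factorial *= j
        s += Fraction(1, factorial)
    return s

s = [partial_sum(k) for k in range(1, 15)]
gaps = [s[i] - s[i - 1] for i in range(1, len(s))]
# consecutive gaps equal 1/k! and shrink rapidly: the sequence is Cauchy in Q,
# yet its limit e is irrational, so Q with the inherited metric is not complete
assert all(b < a for a, b in zip(gaps, gaps[1:]))
print(float(s[-1]))   # approximately 2.718281828...
```

Every s_k is a rational number, while the limit e is not; this is precisely the failure of completeness of Q inside R.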
A.9. Complete metric spaces
41
Proof. We assume that the metric space (X, d) is non-empty.

To prove (1), let x_k → p. We want to show that (x_k)_{k=1}^∞ is a Cauchy sequence. Let ε > 0. Take k_ε ∈ Z+ such that d(x_k, p) < ε if k ≥ k_ε. Let i, j ≥ k_ε. Then d(x_i, x_j) ≤ d(x_i, p) + d(p, x_j) < 2ε.

To prove (2), let (x_k)_{k=1}^∞ be a Cauchy sequence. Take ε = 1. Then there is some k such that for i, j ≥ k we have d(x_i, x_j) < 1. Let us now fix some a ∈ X. Then for i > k we have d(a, x_i) ≤ d(a, x_{k+1}) + d(x_{k+1}, x_i) < ρ + 1, with ρ = d(a, x_{k+1}). Setting R := max{d(a, x_1), . . . , d(a, x_k), ρ}, we get that x_i ∈ B_{R+1}(a) for all i.

To prove (3), let (x_n)_{n=1}^∞ be a Cauchy sequence with a convergent subsequence x_{n_i} → p ∈ X. Fix some ε > 0. Then there is some k such that for all n, m ≥ k we have d(x_n, x_m) < ε. At the same time, there is some N such that for n_i > N we have d(x_{n_i}, p) < ε. Consequently, taking some n_i ≥ max{k, N}, for all n ≥ max{k, N} we get d(x_n, p) ≤ d(x_n, x_{n_i}) + d(x_{n_i}, p) < 2ε, which means that x_n → p as n → ∞.
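The example of Napier's constant above can be checked numerically: with exact rational arithmetic, the partial sums s_k = ∑_{j=0}^{k} 1/j! form a Cauchy sequence in Q whose limit e lies outside Q. A minimal sketch (the number of terms is an arbitrary choice):

```python
from fractions import Fraction  # exact rational arithmetic in Q
import math

# Partial sums s_k = sum_{j=0}^{k} 1/j!  -- a Cauchy sequence in Q.
sums = []
s, fact = Fraction(0), 1
for j in range(21):
    if j > 0:
        fact *= j          # fact = j!
    s += Fraction(1, fact)
    sums.append(s)

# Cauchy property: consecutive gaps s_{k+1} - s_k = 1/(k+1)! shrink fast.
gaps = [float(sums[k + 1] - sums[k]) for k in range(len(sums) - 1)]

# Every s_k is rational, yet the limit e is irrational: the sequence is
# Cauchy in Q but does not converge *in* Q.
approx = float(sums[-1])   # ≈ e
```

Here `approx` agrees with `math.e` to double precision, while each `sums[k]` is an exact element of Q.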
Theorem A.9.3. Let (X, d) be a complete metric space and A ⊂ X. Then (A, d|_{A×A}) is complete if and only if A ⊂ X is closed.

Proof. Let A ⊂ X be closed. Take a Cauchy sequence x : Z+ → A. Due to the completeness of (X, d), x converges to a point p ∈ X. Now p ∈ A, because A is closed. Thus (A, d|_{A×A}) is complete.

Suppose (A, d|_{A×A}) is complete. We have to show that c_d(A) = A. Take p ∈ c_d(A). For each k ∈ Z+, take x_k ∈ A ∩ B_d(p, 1/k). Clearly x_k → p, so (x_k)_{k=1}^∞ is a Cauchy sequence in A. Due to the completeness of (A, d|_{A×A}), x_k → a for some a ∈ A. Because limits in X are unique, p = a ∈ A. Thus A = c_d(A) is closed.

We now show one application of the notion of completeness to solving integral equations.

Definition A.9.4 (Pointwise convergence of functions). Let f_n : [a, b] → R be a sequence of functions and let f : [a, b] → R. We say that f_n converges to f pointwise on [a, b] if f_n(x) → f(x) as n → ∞ for all x ∈ [a, b]. In other words, this means that

∀x ∈ [a, b] ∀ε > 0 ∃N = N(ε, x) : n > N =⇒ |f_n(x) − f(x)| < ε.
As before, by C([a, b]) we denote the space of all continuous functions f : [a, b] → R. By default we always equip it with the sup-metric d∞ .
42
Chapter A. Sets, Topology and Metrics
Exercise A.9.5. Find a sequence of continuous functions f_n ∈ C([0, 1]) such that f_n → f pointwise on [0, 1], but f : [0, 1] → R is not continuous on [0, 1].

To remedy this situation, we introduce another notion of convergence of functions:

Definition A.9.6 (Uniform convergence of functions). Let f_n : [a, b] → R be a sequence of functions and let f : [a, b] → R. We say that f_n converges to f uniformly on [a, b] if

∀ε > 0 ∃N = N(ε) ∀n > N : x ∈ [a, b] =⇒ |f_n(x) − f(x)| < ε.
The difference with pointwise convergence here is that the same index N works for all x ∈ [a, b].

Theorem A.9.7. Let f_n ∈ C([a, b]) be a sequence of continuous functions, let f : [a, b] → R, and suppose that f_n converges to f uniformly on [a, b]. Then f is continuous on [a, b].

Proof. Fix ε > 0. Since f_n → f uniformly, there is some N = N(ε) such that for all n > N and all x ∈ [a, b] we have |f_n(x) − f(x)| < ε. Let c ∈ [a, b]. We will show that f is continuous at c. Fix some n > N. Since f_n is continuous at c, there is some δ = δ(n, ε) > 0 such that |x − c| < δ implies |f_n(x) − f_n(c)| < ε. Then

|f(x) − f(c)| ≤ |f(x) − f_n(x)| + |f_n(x) − f_n(c)| + |f_n(c) − f(c)| < 3ε

for all |x − c| < δ, implying that f is continuous at c.
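A standard example for Exercise A.9.5 is f_n(x) = x^n on [0, 1]: the pointwise limit f equals 0 on [0, 1) and 1 at x = 1, hence is discontinuous, so by Theorem A.9.7 the convergence cannot be uniform. A numerical sketch (the sample points are arbitrary choices):

```python
# f_n(x) = x**n on [0,1] converges pointwise to the discontinuous limit
# f(x) = 0 for x < 1 and f(1) = 1; the convergence is not uniform.

def f_n(n, x):
    return x ** n

# Pointwise: at each fixed x < 1 the values f_n(x) tend to 0.
vals_at_half = [f_n(n, 0.5) for n in (1, 10, 50)]

# Not uniform: evaluating at the moving point x = 1 - 1/n shows that
# sup_{x in [0,1)} |f_n(x) - 0| does not tend to 0, because
# (1 - 1/n)**n -> 1/e ≈ 0.368 > 0 as n -> infinity.
sup_witness = [f_n(n, 1 - 1 / n) for n in (10, 100, 1000)]
```

At every fixed x the values shrink to 0, yet the sup-distance to the limit stays bounded away from 0, exactly the failure of the same N working for all x.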
This result extends to uniform limits of continuous functions on general topological spaces, see Exercise C.2.18.

Proposition A.9.8 (Metric uniform convergence). We have f_n → f in the metric space (C([a, b]), d_∞) if and only if f_n → f uniformly on [a, b].

Proof. Convergence f_n → f in (C([a, b]), d_∞) means that for every ε > 0 there is N such that for all n > N we have sup_{x∈[a,b]} |f_n(x) − f(x)| < ε. But this means that |f_n(x) − f(x)| < ε for all x ∈ [a, b], which is exactly uniform convergence.

Theorem A.9.9 (Completeness of continuous functions). The space C([a, b]) with the sup-metric d_∞ is complete.

Proof. Let f_n ∈ C([a, b]) be a Cauchy sequence. Fix ε > 0. Then there is some N such that for all m, n > N we have

sup_{x∈[a,b]} |f_n(x) − f_m(x)| < ε.   (A.5)
Therefore, for each x ∈ [a, b] the sequence (f_n(x))_{n=1}^∞ is a Cauchy sequence in R. Since R is complete (see Theorem A.13.10), it converges to some point in R, which we call f(x). Thus, for every x ∈ [a, b] we have f_n(x) → f(x) as n → ∞. Passing to the limit as n → ∞ in (A.5), we obtain sup_{x∈[a,b]} |f(x) − f_m(x)| ≤ ε for all m > N, which means that d_∞(f, f_m) ≤ ε. In particular, f_m → f uniformly, so f is continuous by Theorem A.9.7 and hence f ∈ C([a, b]), completing the proof.

Theorem A.9.10 (Banach's Fixed Point Theorem). Let (X, d) be a non-empty complete metric space, let k < 1 be a constant, and let f : X → X be such that

d(f(x), f(y)) ≤ k d(x, y)   (A.6)

for all x, y ∈ X. Then there exists a unique point a ∈ X such that a = f(a).

A mapping f satisfying (A.6) with some constant k < 1 is called a contraction. A point a such that a = f(a) is called a fixed point of f.

Exercise A.9.11. Show that the conditions of Theorem A.9.10 are indispensable. For example, the conclusion of Theorem A.9.10 fails if X is not complete. Show that it also fails if k ≥ 1. Finally, give an example of a function f : X → X on a non-empty complete metric space X satisfying d(f(x), f(y)) < d(x, y) instead of (A.6) such that f has no fixed point.

Proof of Theorem A.9.10. First we observe that f is continuous: indeed, if d(x, y) < ε, it follows that d(f(x), f(y)) ≤ k d(x, y) < kε < ε.

We now construct a certain Cauchy sequence, whose limit will be the required fixed point of f. Take any x_0 ∈ X. For all n ≥ 0, define x_{n+1} = f(x_n). Then for all n ≥ 1 we have d(x_{n+1}, x_n) = d(f(x_n), f(x_{n−1})) ≤ k d(x_n, x_{n−1}), implying that d(x_{n+1}, x_n) ≤ k^n d(x_1, x_0). Consequently, for n > m ≥ 1 we have

d(x_n, x_m) ≤ d(x_n, x_{n−1}) + · · · + d(x_{m+1}, x_m)
≤ (k^{n−1} + · · · + k^m) d(x_1, x_0)
≤ k^m (∑_{i=0}^∞ k^i) d(x_1, x_0)
= (k^m / (1 − k)) d(x_1, x_0).

Since k < 1, it follows that d(x_n, x_m) → 0 as n, m → ∞, which means that (x_n)_{n=1}^∞ is a Cauchy sequence. Since X is complete, x_n → a for some a ∈ X. We claim that a is a fixed point of f. Indeed, since x_n → a and f is continuous, we have f(x_n) → f(a) by Proposition A.7.11. Therefore x_{n+1} → f(a) as n → ∞, and by the uniqueness of limits in metric spaces (Proposition A.6.18) we have f(a) = a.

Finally, let us show that there is only one fixed point. Suppose that f(a) = a and f(b) = b. It follows that d(a, b) = d(f(a), f(b)) ≤ k d(a, b), and since k < 1 we must have d(a, b) = 0 and hence a = b.
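The proof above is constructive: iterating x_{n+1} = f(x_n) from any starting point converges to the fixed point, and the geometric-series estimate gives the a priori error bound d(x_n, a) ≤ k^n d(x_1, x_0)/(1 − k). A sketch for the contraction f(x) = cos(x)/2 on R, which has Lipschitz constant k = 1/2 (the choice of f and of the iteration count are illustrative assumptions, not from the text):

```python
import math

def f(x):
    # |f'(x)| = |sin(x)|/2 <= 1/2, so f is a contraction with k = 1/2
    return math.cos(x) / 2

k = 0.5
xs = [0.0]                       # arbitrary starting point x_0
for _ in range(60):              # d(x_n, a) shrinks at least like k**n
    xs.append(f(xs[-1]))
a = xs[-1]                       # numerical fixed point: a = cos(a)/2

# A priori bound from the proof: d(x_n, a) <= k**n d(x_1, x_0) / (1 - k).
n = 10
bound = k ** n * abs(xs[1] - xs[0]) / (1 - k)
error = abs(xs[n] - a)
```

After 60 iterations the residual |f(a) − a| is at machine-precision level, and the actual error at step n = 10 sits comfortably under the a priori bound.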
Corollary A.9.12 (Fredholm integral equations). Let p : [0, 1] → R be continuous, p ≥ 0, and such that ∫_0^1 p(t) dt < 1. Let g ∈ C([0, 1]). Then there exists a unique function f ∈ C([0, 1]) such that

f(x) = g(x) − ∫_0^x f(t) p(t) dt.

Proof. As usual, let us equip C([0, 1]) with the sup-metric d_∞, and let us define T : C([0, 1]) → C([0, 1]) by

(Tf)(x) = g(x) − ∫_0^x f(t) p(t) dt.

We claim that T is a contraction, which together with the completeness of C([0, 1]) in Theorem A.9.9 and Banach's Fixed Point Theorem A.9.10 implies the statement. For f_1, f_2 ∈ C([0, 1]) we have

d_∞(Tf_1, Tf_2) = sup_{x∈[0,1]} |∫_0^x (f_1(t) − f_2(t)) p(t) dt|
≤ sup_{x∈[0,1]} ∫_0^x |f_1(t) − f_2(t)| p(t) dt
= ∫_0^1 |f_1(t) − f_2(t)| p(t) dt
≤ sup_{x∈[0,1]} |f_1(x) − f_2(x)| ∫_0^1 p(t) dt
≤ k d_∞(f_1, f_2),

where k = ∫_0^1 p(t) dt < 1.
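Combined with the proof of Theorem A.9.10, this gives a practical method: iterate T from any starting function and the iterates converge in d_∞ at rate k^n. A numerical sketch with the illustrative choices p ≡ 1/2 (so k = 1/2 < 1) and g ≡ 1, discretised by the trapezoidal rule; differentiating the equation gives f' = −f/2 with f(0) = 1, so the exact solution is f(x) = e^{−x/2}:

```python
import math

# Solve f(x) = g(x) - ∫_0^x f(t) p(t) dt by iterating the contraction T
# on a uniform grid, using the trapezoidal rule for the integral.
N = 1000
h = 1.0 / N
xs = [i * h for i in range(N + 1)]
p = [0.5] * (N + 1)    # k = ∫_0^1 p = 1/2 < 1 (illustrative choice)
g = [1.0] * (N + 1)    # illustrative choice

def T(f):
    """One application of (Tf)(x) = g(x) - ∫_0^x f(t) p(t) dt."""
    integrand = [f[i] * p[i] for i in range(N + 1)]
    Tf, acc = [g[0]], 0.0
    for i in range(1, N + 1):
        acc += 0.5 * h * (integrand[i - 1] + integrand[i])  # trapezoid step
        Tf.append(g[i] - acc)
    return Tf

f = [0.0] * (N + 1)    # arbitrary starting point f_0
for _ in range(60):    # d_inf error shrinks like k**n = 2**-n
    f = T(f)

# Compare with the exact solution f(x) = exp(-x/2) for these p and g.
err = max(abs(f[i] - math.exp(-xs[i] / 2)) for i in range(N + 1))
```

The remaining error is dominated by the trapezoidal discretisation, not by the fixed-point iteration, whose contribution after 60 steps is of size 2^{−60}.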
Finally, we will show that every metric space can be "completed" to become a complete metric space, and that such a completion is essentially unique.

Definition A.9.13 (Completion). Let (X, d) be a metric space. A complete metric space X* is said to be a completion of X if X is a topological subspace of X* and if the closure of X in X* is X* (i.e., if X is dense in X*).

Remark A.9.14. The completion of a metric space can also be defined in another way: a complete metric space (X*, d*) is a completion of (X, d) if there exists an isometry ι : X → X* such that the image ι(X) is dense in X*. In the proof of Theorem A.9.15 we actually use this idea: there X* is the family of (equivalence classes of) Cauchy sequences in X, and the points of X are naturally identified with the constant sequences.

Theorem A.9.15 (Completions of metric spaces). Every metric space (X, d) has a completion. This completion is unique up to an isometry leaving X fixed.

Proof. Existence. We will construct a completion as a space of equivalence classes of Cauchy sequences in X. Thus, we call Cauchy sequences (x_n)_{n=1}^∞ and (x'_n)_{n=1}^∞ equivalent if d(x_n, x'_n) → 0 as n → ∞. One can readily see that this is an equivalence relation as in Definition A.2.6, and we define X* to be the space of equivalence classes of such Cauchy sequences. The space X* has a metric d* defined as follows. For x*, y* ∈ X*, pick some representatives (x_n)_{n=1}^∞ ∈ x* and (y_n)_{n=1}^∞ ∈ y*, and set

d*(x*, y*) := lim_{n→∞} d(x_n, y_n).   (A.7)

We first check that d* is a well-defined function on X*, namely that the limit in (A.7) exists and that it is independent of the choice of representatives of the equivalence classes x* and y*. To check that the limit exists, we use the fact that (x_n) and (y_n) are Cauchy sequences, so for n and m sufficiently large we can estimate

|d(x_n, y_n) − d(x_m, y_m)| = |d(x_n, y_n) − d(x_n, y_m) + d(x_n, y_m) − d(x_m, y_m)|
≤ |d(x_n, y_n) − d(x_n, y_m)| + |d(x_n, y_m) − d(x_m, y_m)|
≤ d(y_n, y_m) + d(x_n, x_m),

and the latter goes to zero as n, m → ∞. It follows that the sequence of real numbers (d(x_n, y_n))_{n=1}^∞ is a Cauchy sequence in R, and hence converges because R is complete by Theorem A.13.10 (which will be proved later).

Let us now show that d*(x*, y*) is independent of the choice of representatives of x* and y*. Take (x_n)_{n=1}^∞, (x'_n)_{n=1}^∞ ∈ x* and (y_n)_{n=1}^∞, (y'_n)_{n=1}^∞ ∈ y*. Then by a calculation similar to the one above,

|d(x_n, y_n) − d(x'_n, y'_n)| ≤ d(x_n, x'_n) + d(y_n, y'_n),

which implies that lim_{n→∞} d(x_n, y_n) = lim_{n→∞} d(x'_n, y'_n).

We now claim that (X*, d*) is a metric space. Non-degeneracy and symmetry in Definition A.6.1 are straightforward. The triangle inequality for d* follows from that for d: indeed, passing to the limit as n → ∞ in the inequality d(x_n, z_n) ≤ d(x_n, y_n) + d(y_n, z_n), we get d*(x*, z*) ≤ d*(x*, y*) + d*(y*, z*).

Next we will verify that X* is a completion of X. We first have to check that (X, d) is a topological subspace of (X*, d*). We observe that for every x ∈ X its equivalence class contains the convergent constant sequence (x_n = x)_{n=1}^∞, and hence any equivalent Cauchy sequence must also be convergent. Thus the class x* consists of all sequences (x_n)_{n=1}^∞ convergent to x. Now, if x, y ∈ X and (x_n)_{n=1}^∞ ∈ x*, (y_n)_{n=1}^∞ ∈ y*, we have x_n → x and y_n → y as n → ∞, and hence d(x, y) = lim_{n→∞} d(x_n, y_n) = d*(x*, y*).
Therefore, the mapping x → x* is an isometry from X to X*, and hence X is a topological subspace of X* if we identify it with its image under this isometry. Thus, in the sequel we will no longer distinguish between X and its image in X*.

We next show that X is dense in X*. Let x* ∈ X*, let ε > 0, and let (x_n)_{n=1}^∞ ∈ x*. Since (x_n) is a Cauchy sequence, there is some N such that for all n, m > N we have d(x_n, x_m) < ε. Letting m → ∞, we get d*(x_n, x*) = lim_{m→∞} d(x_n, x_m) ≤ ε for n > N. Therefore, any neighbourhood of x* contains a point of X, which means that the closure of X is X* by Proposition A.8.10.

Finally, we show that (X*, d*) is complete. First we observe that, by the construction of X*, any Cauchy sequence (x_n)_{n=1}^∞ of points of X converges to its equivalence class x* in X*. Second, for any Cauchy sequence x*_n of points in X* there is an equivalent sequence x'_n of points of X, because X is dense in X*: indeed, for every n there is a point x'_n ∈ X such that d*(x'_n, x*_n) < 1/n. The sequence x'_n is then a Cauchy sequence in X and, by the first observation, it converges to its equivalence class x* in X*. Therefore x*_n also converges to x* in (X*, d*).

Uniqueness. We want to show that if (X*, d*) and (X**, d**) are two completions of X, then there is a bijection f : X* → X** such that f(x) = x for all x ∈ X, and such that f(x*) = x**, f(y*) = y** implies d*(x*, y*) = d**(x**, y**). We define f in the following way. For x* ∈ X*, in view of the density of X in X*, there exists a sequence x_n ∈ X such that x_n → x* in (X*, d*). Therefore x_n is a Cauchy sequence in X, and since X** is also a completion of X and is complete, the sequence has a limit x** in X**, so that x_n → x** in (X**, d**). One can readily see that this x** is independent of the choice of the sequence x_n convergent to x*. We define f by setting f(x*) = x**. By construction it is clear that f(x) = x for all x ∈ X. Moreover, let x_n → x* and y_n → y* in (X*, d*), and let x_n → x** and y_n → y** in (X**, d**). Consequently,

d*(x*, y*) = lim_{n→∞} d*(x_n, y_n) = lim_{n→∞} d(x_n, y_n)
= lim_{n→∞} d**(x_n, y_n) = d**(x**, y**),

completing the proof.
A.10 Continuity and homeomorphisms
Recall that an expression like “(X, τ ) is a topological space” is often abbreviated by “X is a topological space”. In the sequel, to simplify notation, we may use the same letter c for the closure operators of different topological spaces: that is, if A ⊂ X and B ⊂ Y , c(A) is the closure in the topology of X, and c(B) is the closure in the topology of Y . If needed, we shall express which topologies are meant. In reading the following definition, recall how we have interpreted x ∈ c(A) as “x ∈ X is close to A ⊂ X”: Definition A.10.1 (Continuous mappings). A mapping f : X → Y is continuous at point x ∈ X if x ∈ c(A) =⇒ f (x) ∈ c (f (A)) for every A ⊂ X. A mapping f : X → Y is continuous if it is continuous at every point x ∈ X, i.e., f (c(A)) ⊂ c (f (A)) for every A ⊂ X. If precision is needed, we may emphasize the topologies involved and, instead of mere continuity, speak specifically about (τX , τY )-continuity. The set of continuous functions from X to Y is often denoted by C(X, Y ), with convention C(X) = C(X, R) (or C(X) = C(X, C)).
Exercise A.10.2. Let c ∈ R. Let f, g : X → R be continuous, where we use the Euclidean metric topology on R. Show that the following functions X → R are then continuous: cf, f + g, fg, |f|, max{f, g}, min{f, g} (here, e.g., max{f, g}(x) := max{f(x), g(x)}, etc.). Moreover, show that if g(x) ≠ 0 then f/g is continuous at x ∈ X.

Exercise A.10.3. Let (X1, τ1) and (X2, τ2) be topological spaces. Show that a mapping f : X1 → X2 is continuous at x ∈ X1 if and only if

∀V ∈ V_{τ2}(f(x)) ∃U ∈ V_{τ1}(x) :
f (U ) ⊂ V.
Exercise A.10.4. Let (X, d_X) and (Y, d_Y) be metric spaces, p ∈ X and f : X → Y. Show that the following conditions are equivalent:
1. f is continuous at p ∈ X (with respect to the metric topologies).
2. ∀ε > 0 ∃δ > 0 ∀w ∈ X : d_X(p, w) < δ ⇒ d_Y(f(p), f(w)) < ε.
3. f(x_k) → f(p) whenever x_k → p.

Theorem A.10.5. Let f : X → Y. Then f is continuous if and only if f^{−1}(V) ∈ τ_X for every V ∈ τ_Y.

Remark A.10.6. The continuity criterion here might be read as: "preimages of open sets are open". Sometimes this condition is taken as the definition of continuity of f. Equivalently, by taking complements, it means that "preimages of closed sets are closed".

Proof. Let us assume that "preimages of closed sets are closed". Let A ⊂ X. Then A' = f^{−1}(c(f(A))) is closed, and A ⊂ A', so c(A) ⊂ c(A') = A'. Hence f(c(A)) ⊂ f(A') ⊂ c(f(A)). The property f(c(A)) ⊂ c(f(A)) means the continuity of f : X → Y.

Conversely, let f : X → Y be continuous. Let A = f^{−1}(c(B)), where B ⊂ Y. Then f(c(A)) ⊂ c(f(A)) ⊂ c(c(B)) = c(B), so c(A) ⊂ f^{−1}(f(c(A))) ⊂ f^{−1}(c(B)) = A. Therefore c(A) = A, i.e., A is closed.
Corollary A.10.7. Let f : X → Y , and let τY be a topology on Y . Then f −1 (τY ) = {f −1 (V ) | V ∈ τY } is a topology on X. Moreover, f is (τX , τY )-continuous if and only if f −1 (τY ) ⊂ τX . Exercise A.10.8. Prove Corollary A.10.7. The topology f −1 (τY ) is called the topology induced from τY by f . Show that the relative topology on a subset A ⊂ X of a topological space X in Definition A.7.18 is induced by the identity mapping A → X.
Definition A.10.9 (Induced topology). Let F be a family of mappings f : X → Y, where (Y, τ_Y) is a topological space. Then the weakest topology on X containing the collection

⋃_{f∈F} f^{−1}(τ_Y) ⊂ P(X)

is the topology induced from τ_Y by F.

Proposition A.10.10. Let X, Y, Z be topological spaces and let f : X → Y and g : Y → Z be continuous. Then g ◦ f : X → Z is continuous.

Proof. We will use Theorem A.10.5. Let U be open in Z. Then g^{−1}(U) is open in Y and hence (g ◦ f)^{−1}(U) = f^{−1}(g^{−1}(U)) is open in X, implying that g ◦ f is continuous.

Exercise A.10.11. Prove Proposition A.10.10 directly from Definition A.10.1.

Definition A.10.12 (Homeomorphisms and topological equivalence). A bijective mapping f : X → Y is a homeomorphism if both f and f^{−1} are continuous. In this case we say that the corresponding topological spaces (X, τ_X) and (Y, τ_Y) are homeomorphic. Homeomorphic spaces are also called topologically equivalent. A property which holds in all topologically equivalent spaces is called a topological property.

Example. Any two open intervals in R are topologically equivalent. For a set X, the properties "X has five elements" or "all subsets of X are open" are topological properties.

Remark A.10.13. A homeomorphism is a topological isomorphism: homeomorphic spaces are topologically the same. As the saying goes, a topologist is a person who does not know the difference between a doughnut and a coffee cup. Let us denote briefly X ≈ Y when (X, τ_X) and (Y, τ_Y) are homeomorphic. It is easy to see that ≈ is an equivalence:

X ≈ X,
X ≈ Y =⇒ Y ≈ X,
X ≈ Y and Y ≈ Z =⇒ X ≈ Z.

Analogously, there is a concept of metric space isomorphisms: a bijective mapping f : X → Y between metric spaces (X, d_X), (Y, d_Y) is called an isometric isomorphism if d_Y(f(a), f(b)) = d_X(a, b) for every a, b ∈ X.

Example. The reader may check that (x → x/(1 + |x|)) : R ≈ (−1, 1). Using algebraic topology, one can prove that R^m ≈ R^n if and only if m = n (this is not trivial!).

Example. Any isometric isomorphism is a homeomorphism. Clearly the unbounded R and the bounded (−1, 1) are not isometrically isomorphic. An orthogonal linear operator A : R^n → R^n is an isometric isomorphism when R^n is endowed with the Euclidean norm. The forward shift operator on ℓ^p(Z) is an isometric isomorphism, but the forward shift operator on ℓ^p(N) is only a non-surjective isometry.

Exercise A.10.14. Let (X, d_X) and (Y, d_Y) be metric spaces. Recall that f : X → Y is continuous if and only if

∀a ∈ X ∀ε > 0 ∃δ > 0 ∀b ∈ X :
dX (a, b) < δ =⇒ dY (f (a), f (b)) < ε.
A function f : X → Y is uniformly continuous if ∀ε > 0 ∃δ > 0 ∀a, b ∈ X :
dX (a, b) < δ =⇒ dY (f (a), f (b)) < ε,
and Lipschitz-continuous if ∃C < ∞ ∀a, b ∈ X :
dY (f (a), f (b)) ≤ C dX (a, b).
Prove that Lipschitz-continuity implies uniform continuity, and that uniform continuity implies continuity; give examples showing that these implications cannot be reversed. Theorem A.10.15. Two metrics d1 , d2 on a set X are equivalent if and only if the identity mapping from (X, d1 ) to (X, d2 ) is a homeomorphism. Proof. Let id(x) = x be the identity mapping from (X, d1 ) to (X, d2 ). Since id−1 (U ) = U for any set U , the forward implication follows from the definition of a continuous mapping and that of equivalent metrics. On the other hand, suppose the identity map is a homeomorphism. Again, since id−1 (U ) = U we get that every set open in (X, d2 ) is open in (X, d1 ) since id is continuous. The converse is true since id−1 is also continuous.
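For Exercise A.10.14, a standard witness that the first implication cannot be reversed is x → √x on [0, 1]: it is uniformly continuous (indeed |√a − √b| ≤ √|a − b|, so δ = ε² works for every pair), but it is not Lipschitz, since the difference quotients blow up near 0. A numerical sketch (the sample points are arbitrary choices):

```python
import math

def f(x):
    return math.sqrt(x)

# Uniform continuity witness: |sqrt(a) - sqrt(b)| <= sqrt(|a - b|),
# so delta = eps**2 works independently of where the pair (a, b) sits.
pairs = [(0.0, 1e-6), (0.3, 0.3 + 1e-6), (0.9, 0.9 + 1e-6)]
moduli = [abs(f(a) - f(b)) for a, b in pairs]

# No Lipschitz constant: the difference quotient |f(a) - f(0)| / |a - 0|
# equals 1/sqrt(a), which is unbounded as a -> 0+.
quotients = [abs(f(a) - f(0)) / a for a in (1e-2, 1e-4, 1e-8)]
```

All pairs at distance 10⁻⁶ have image distance at most √(10⁻⁶) = 10⁻³, while the difference quotients grow without bound, so no single constant C can work.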
A.11 Compact topological spaces

Eventually we will mainly concentrate on compact Hausdorff spaces, but in this section we deal with more general classes of topological spaces.

Definition A.11.1 (Coverings). Let X be a set and K ⊂ X. A family U ⊂ P(X) is called a cover of K if

K ⊂ ⋃U;

if the cover U is a finite set, it is called a finite cover. A cover U of K ⊂ X has a subcover U' ⊂ U if U' itself is a cover of K. In a topological space, an open cover refers to a cover consisting of open sets.

Definition A.11.2 (Compact sets). Let (X, τ) be a topological space. A subset K ⊂ X is compact (more precisely, τ-compact) if every open cover of K has a finite subcover. We say that (X, τ) is a compact space if X itself is τ-compact. A topological space is locally compact if each of its points has a neighbourhood whose closure is compact.
Remark A.11.3. Briefly: in a topological space (X, τ), K ⊂ X is compact if and only if the following holds: given any family U ⊂ τ such that K ⊂ ⋃U, there exists a finite subfamily U' ⊂ U such that K ⊂ ⋃U'.

Remark A.11.4. Let us consider a tongue-in-cheek geographical-zoological analogue for compactness. In a space or universe (X, τ), let non-empty open sets correspond to territories of angry animals; recall the metaphor that a point x ∈ U ∈ τ is "far away from (i.e., not close to) the set X \ U". Compactness of an island K ⊂ X means that any given territorial cover U has a finite subcover U': already a finite number of beasts governs the whole island.

Example.
1. If τ1 and τ2 are topologies of X, τ1 ⊂ τ2, and (X, τ2) is a compact space, then (X, τ1) is a compact space.
2. (X, {∅, X}) is a compact space.
3. If |X| = ∞ then (X, P(X)) is not a compact space, but it is locally compact. Clearly any space with a finite topology is compact. Even though a compact topology can be of any cardinality, it is in a sense "not far away from being finite".
4. A metric space is compact if and only if it is sequentially compact (i.e., every sequence contains a converging subsequence, see Theorem A.13.4).
5. A subset X ⊂ R^n is compact if and only if it is closed and bounded (Heine–Borel Theorem A.13.7).
6. Theorem B.4.21, due to Frigyes Riesz, asserts that a closed ball in a normed vector space over C (or R) is compact (i.e., the space is locally compact) if and only if the vector space is finite-dimensional.

Of course, we may work with a complemented version of the compactness criterion in terms of closed sets:

Proposition A.11.5 (Finite intersection property). A topological space X is compact if and only if the closed sets in X have the finite intersection property, which means that any collection {F_α}_α of closed sets in X with ∩_α F_α = ∅ has a finite subcollection {F_i}_{i=1}^n ⊂ {F_α}_α such that ∩_{i=1}^n F_i = ∅.

Proof. Defining U_α = X \ F_α, we observe that the condition ∩_α F_α = ∅ means that {U_α}_α is an open covering of X. The condition that X is compact means that any such covering has some finite subcollection {U_i}_{i=1}^n with ∪_{i=1}^n U_i = X, which in turn means that ∩_{i=1}^n F_i = ∅.

Proposition A.11.6 (Characterisation of compact subspaces). Let (X, τ) be a topological space and let Y ⊂ X. The topological subspace (Y, τ_Y) is compact if and only if every collection {U_α}_{α∈I} of sets U_α ∈ τ with ⋃_{α∈I} U_α ⊃ Y has a finite subcollection that covers Y.
Proof. Assume that (Y, τ_Y) is compact and let {U_α}_{α∈I} be a collection of sets U_α ∈ τ with ⋃_{α∈I} U_α ⊃ Y. Then the collection {U_α ∩ Y}_{α∈I} is an open cover of (Y, τ_Y) and hence has a finite subcover {U_i ∩ Y}_{i=1}^n. The corresponding collection {U_i}_{i=1}^n is a finite subcollection of {U_α}_{α∈I} that covers Y.

Conversely, let {V_α}_{α∈I} ⊂ τ_Y be an open cover of Y. Then there exist sets U_α ∈ τ such that V_α = U_α ∩ Y. Consequently, {U_α}_{α∈I} ⊂ τ is a cover of Y, and by assumption it has a finite subcollection {U_i}_{i=1}^n that covers Y. The corresponding collection {V_i}_{i=1}^n is then a finite open cover of Y.

Exercise A.11.7. Show that a finite set in a topological space is compact.

Exercise A.11.8. Let x ∈ R^n and r > 0. Show that the open ball B_r(x) ⊂ R^n is not compact in the Euclidean metric topology.

Exercise A.11.9. Prove that a union of two compact sets is compact.

Proposition A.11.10. Let (X, τ) be a topological space, K ⊂ X compact and S ⊂ X closed. Then K ∩ S is compact.

Proof. Let U be an open cover of K ∩ S. Then U ∪ {X \ S} is an open cover of K, thus having a finite subcover U'. Then U' ∩ U ⊂ U is a finite subcover of K ∩ S.

Proposition A.11.11 (Some properties of compact sets). We have the following properties:
(1) A closed subset of a compact topological space is compact.
(2) A compact subset of a metric space is bounded (and closed).

Proof. To prove (1), let Y be a closed subset of a compact topological space (X, τ). Let {U_α}_{α∈I} ⊂ τ be an open cover of Y. Since Y is closed, its complement X \ Y is open, and the collection {X \ Y} ∪ {U_α}_{α∈I} is an open cover of X. Since X is compact, it has a finite subcover, and since X \ Y is disjoint from Y, removing X \ Y (if necessary) from this subcover we obtain a finite subcover of Y.

To prove (2), let Y be a compact subspace of a metric space (X, d). The collection of unit balls {B_1(y)}_{y∈Y} is an open cover of Y, and hence it has a finite subcover, say {B_1(y_i)}_{i=1}^n. Applying Proposition A.6.14 we obtain that Y must be bounded.

Proposition A.11.12. Let X be a compact space and f : X → Y continuous. Then f(X) ⊂ Y is compact.

Proof. Let V be an open cover of f(X). Then U := {f^{−1}(V) | V ∈ V} is an open cover of X, thus having a finite subcover U' = {f^{−1}(V) | V ∈ V'}, where V' ⊂ V is a finite collection. Then f(X) is covered by V' ⊂ V: if y ∈ f(X) then y = f(x) for some x ∈ X, so x ∈ f^{−1}(V_0) for some V_0 ∈ V', so y = f(x) ∈ f(f^{−1}(V_0)) ⊂ V_0.

Corollary A.11.13. Let f : X → R be a continuous mapping from a compact topological space X to R equipped with the Euclidean topology. Then f(X) is a bounded subset of R.
Theorem A.11.14 (Product of compact spaces is compact). Let X, Y be compact topological spaces. Then X × Y in the product topology is compact.

Proof. Let C = {W_α}_{α∈I} be an open cover of X × Y in the product topology. In particular, this means that each W_α is a union of "rectangles" of the form U × V, where U and V are open in X and Y, respectively. Hence for every (x, y) ∈ X × Y there is a "rectangle" U_{xy} × V_{xy} and a corresponding set W_{xy} such that (x, y) ∈ U_{xy} × V_{xy} ⊂ W_{xy} ∈ C.

For every x ∈ X, the collection {V_{xy}}_{y∈Y} is an open covering of Y, which then must have some finite subcover, which we denote by {V_{x y_i(x)}}_{i=1}^{n(x)}. The set U_x = ∩_{i=1}^{n(x)} U_{x y_i(x)} is open in X, and the collection {W_{x y_i(x)}}_{i=1}^{n(x)} is a cover of U_x × Y. In turn, the collection {U_x}_{x∈X} is an open cover of X, which then must have some finite subcover, which we denote by {U_{x_j}}_{j=1}^m.

We now claim that the collection {W_{x_j y_i(x_j)}}_{i,j} ⊂ C is a finite cover of X × Y. Indeed, for every (x, y) ∈ X × Y there is some U_{x_j} that contains x, and then there is some V_{x_j y_i(x_j)} that contains y, implying that (x, y) ∈ W_{x_j y_i(x_j)}.
Lemma A.11.15. Let (X, τ) be a compact space and S ⊂ X infinite. Then S has an accumulation point.

Proof. Recall that x ∈ X is an accumulation point of S ⊂ X if

∀U ∈ τ : x ∈ U =⇒ (S ∩ U) \ {x} ≠ ∅.

Suppose S ⊂ X has no accumulation points, i.e.,

∀x ∈ X ∃U_x ∈ τ : x ∈ U_x and S ∩ U_x ⊂ {x}.

Now U = {U_x : x ∈ X} is an open cover of X, having a finite subcover U' ⊂ U by compactness. Then

S = S ∩ ⋃U' = ⋃_{U_x ∈ U'} (S ∩ U_x).

Here the union is finite, and S ∩ U_x ⊂ {x} in each case. Thus S is finite, a contradiction.
A.12 Compact Hausdorff spaces

Next we are going to witness how beautiful compact Hausdorff topologies are. Among topological spaces, Hausdorff spaces are those where points are distinctly separated by open neighbourhoods; this happens especially in metric topology. Roughly, Hausdorff spaces have enough open sets to distinguish between any two points, while compact spaces "do not have too many open sets". Combining these two properties, compact Hausdorff spaces form a useful class of topological spaces.
Definition A.12.1 (Hausdorff spaces). A topological space (X, τ) is called a Hausdorff space if for each a, b ∈ X with a ≠ b there exist U, V ∈ τ such that a ∈ U, b ∈ V and U ∩ V = ∅.

Example.
1. If τ1 and τ2 are topologies of X, τ1 ⊂ τ2, and (X, τ1) is a Hausdorff space, then (X, τ2) is a Hausdorff space.
2. (X, P(X)) is a Hausdorff space.
3. If X has more than one point and τ = {∅, X}, then (X, τ) is not Hausdorff.
4. Clearly any metric space (X, d) is a Hausdorff space: if x, y ∈ X, x ≠ y, then B_r(x) ∩ B_r(y) = ∅ when r ≤ d(x, y)/2.
5. The distribution spaces D'(R^n), S'(R^n) and E'(R^n) are non-metrisable Hausdorff spaces.

Theorem A.12.2. In Hausdorff spaces, we have the following properties:
(1) Every convergent sequence has a unique limit.
(2) All finite sets are closed.
(3) Every topological subspace is also Hausdorff.
(4) A compact subspace of a Hausdorff space is closed.
(5) A subset of a compact Hausdorff space is compact if and only if it is closed.

Proof. To prove (1), let x_n be a sequence such that x_n → p and x_n → q as n → ∞. Assume p ≠ q. Then there exist open sets U, V such that p ∈ U, q ∈ V and U ∩ V = ∅. Consequently, there are numbers N and M such that for all n > N we have x_n ∈ U and for all n > M we have x_n ∈ V, which yields a contradiction.

To prove (2), in view of property (C3) of Corollary A.7.14 it is enough to show that one-point sets {x} in a Hausdorff topological space X are closed. For every y ∈ X \ {x} there exist open disjoint sets U_y ∋ x and V_y ∋ y. Since x ∉ V_y, it follows that V_y ⊂ X \ {x} and hence X \ {x} = ⋃_{y∈X\{x}} V_y, implying that X \ {x} is open.

To prove (3), let Y be a subset of a Hausdorff topological space (X, τ) and let τ_Y be the relative topology on Y. Let a, b ∈ Y be such that a ≠ b. Since (X, τ) is Hausdorff, there exist open disjoint sets U, V ∈ τ such that a ∈ U and b ∈ V. Consequently, a ∈ U ∩ Y ∈ τ_Y and b ∈ V ∩ Y ∈ τ_Y, and U ∩ Y and V ∩ Y are disjoint, implying that (Y, τ_Y) is Hausdorff.

To prove (4), let Y be a compact subspace of a Hausdorff topological space X. If Y = X the statement is trivial. Assuming Y ≠ X, let us take some x ∈ X \ Y. Then for every y ∈ Y there are open disjoint sets U_y ∋ x and V_y ∋ y. The collection {V_y}_{y∈Y} is a covering of Y, and hence by Proposition A.11.6 there is a finite collection V_{y_1}, . . . , V_{y_n} still covering Y. Then the set U_x = ∩_{i=1}^n U_{y_i} is open, x ∈ U_x, and U_x ∩ Y = ∅. Therefore X \ Y = ∪_{x∈X\Y} U_x is open, and hence Y is closed.

Statement (5) follows immediately from (4) and property (1) of Proposition A.11.11.
Theorem A.12.3 (Hausdorff property is a topological property). Let f : X1 → X2 be an injective and continuous mapping between topological spaces (X1, τ1) and (X2, τ2). If (X2, τ2) is Hausdorff, then (X1, τ1) is also Hausdorff. Consequently, the Hausdorff property is a topological property.

Proof. Let x, y ∈ X1 be such that x ≠ y. Since f is injective, we have f(x) ≠ f(y), and since (X2, τ2) is Hausdorff there exist open disjoint sets U, V ∈ τ2 such that f(x) ∈ U and f(y) ∈ V. Since f is continuous, the sets f^{−1}(U) and f^{−1}(V) are open disjoint neighbourhoods of x and y in X1, respectively, implying that (X1, τ1) is also Hausdorff. That the Hausdorff property is a topological property follows immediately from this.

Exercise A.12.4 (Product of Hausdorff spaces). If (X1, τ1) and (X2, τ2) are Hausdorff topological spaces, show that (X1 × X2, τ1 ⊗ τ2) is a Hausdorff topological space.

Theorem A.12.5. Let X be a Hausdorff space, A, B ⊂ X compact subsets, and A ∩ B = ∅. Then there exist open sets U, V ⊂ X such that A ⊂ U, B ⊂ V, and U ∩ V = ∅.

Proof. The proof is trivial if A = ∅ or B = ∅, so assume x ∈ A and y ∈ B. Since X is a Hausdorff space and x ≠ y, we can choose neighbourhoods U_{xy} ∈ V(x) and V_{xy} ∈ V(y) such that U_{xy} ∩ V_{xy} = ∅. The collection P = {V_{xy} | y ∈ B} is an open cover of the compact set B, so that it has a finite subcover P_x = {V_{x y_j} | 1 ≤ j ≤ n_x} ⊂ P for some n_x ∈ N. Let

U_x := ∩_{j=1}^{n_x} U_{x y_j}.

Now O = {U_x | x ∈ A} is an open cover of the compact set A, so that it has a finite subcover O' = {U_{x_i} | 1 ≤ i ≤ m} ⊂ O. Then define

U := ⋃O',   V := ∩_{i=1}^{m} ⋃P_{x_i}.

It is an easy task to check that U and V have the desired properties.
Corollary A.12.6. Let X be a compact Hausdorff space, x ∈ X, and W ∈ V(x). Then there exists U ∈ V(x) such that \overline{U} ⊂ W.
Proof. Now {x} and X \ W are closed sets in a compact space, thus they are compact. Since these sets are disjoint, there exist open disjoint sets U, V ⊂ X such that x ∈ U and X \ W ⊂ V; i.e., x ∈ U ⊂ X \ V ⊂ W. Hence
x ∈ U ⊂ \overline{U} ⊂ X \ V ⊂ W.
Proposition A.12.7. Let (X, τX) be a compact space and (Y, τY) a Hausdorff space. Any bijective continuous mapping f : X → Y is a homeomorphism.
Proof. Let U ∈ τX. Then X \ U is closed, hence compact. Consequently, f(X \ U) is compact, and due to the Hausdorff property f(X \ U) is closed. Therefore (f^{−1})^{−1}(U) = f(U) is open.
Corollary A.12.8. Let X be a set with a compact topology τ2 and a Hausdorff topology τ1. If τ1 ⊂ τ2 then τ1 = τ2.
Proof. The identity mapping (x ↦ x) : X → X is a continuous bijection from (X, τ2) to (X, τ1).
A more direct proof of the corollary. Let U ∈ τ2. Since (X, τ2) is compact and X \ U is τ2-closed, X \ U must be τ2-compact. Now τ1 ⊂ τ2, so that X \ U is τ1-compact. (X, τ1) is Hausdorff, implying that X \ U is τ1-closed, thus U ∈ τ1; this yields τ2 ⊂ τ1.
Definition A.12.9 (Separating points). A family F of mappings X → C is said to separate the points of the set X if for all x, y ∈ X with x ≠ y there exists f ∈ F such that f(x) ≠ f(y).
Definition A.12.10 (Support). The support of a function f ∈ C(X) is the set
supp(f) := \overline{{x ∈ X | f(x) ≠ 0}}.
Let f ∈ C(X) be such that 0 ≤ f ≤ 1. The notations
K ≺ f, f ≺ U
mean, respectively, that K ⊂ X is compact and χK ≤ f, and that U ⊂ X is open and supp(f) ⊂ U.
Theorem A.12.11 (Urysohn’s Lemma). Let X be a compact Hausdorff space, A, B ⊂ X closed non-empty sets, A ∩ B = ∅. Then there exist f ∈ C(X) and an open U ⊂ X \ A such that B ≺ f ≺ U. Especially, we find f such that
0 ≤ f ≤ 1, f(A) = {0}, f(B) = {1}.
Proof. The set Q ∩ [0, 1] is countably infinite; let φ : N → Q ∩ [0, 1] be a bijection satisfying φ(0) = 0 and φ(1) = 1. Choose open sets U0, U1 ⊂ X such that
A ⊂ U0 ⊂ \overline{U0} ⊂ U1 ⊂ \overline{U1} ⊂ X \ B.
Then we proceed inductively as follows: Suppose we have chosen open sets Uφ(0), Uφ(1), . . . , Uφ(n) such that
φ(i) < φ(j) ⇒ \overline{Uφ(i)} ⊂ Uφ(j).
Let us choose an open set Uφ(n+1) ⊂ X such that
φ(i) < φ(n + 1) < φ(j) ⇒ \overline{Uφ(i)} ⊂ Uφ(n+1) ⊂ \overline{Uφ(n+1)} ⊂ Uφ(j)
whenever 0 ≤ i, j ≤ n. Let us define
r < 0 ⇒ Ur := ∅, s > 1 ⇒ Us := X.
Hence for each q ∈ Q we get an open set Uq ⊂ X such that
∀r, s ∈ Q : r < s ⇒ \overline{Ur} ⊂ Us.
Let us define a function f : X → [0, 1] by
f(x) := inf{r : x ∈ Ur}.
Clearly 0 ≤ f ≤ 1, f(A) = {0} and f(B) = {1}. Let us prove that f is continuous. Take x ∈ X and ε > 0. Take r, s ∈ Q such that
f(x) − ε < r < f(x) < s < f(x) + ε;
then f is continuous at x, since x ∈ Us \ \overline{Ur} and for every y ∈ Us \ \overline{Ur} we have |f(y) − f(x)| < ε. Thus f ∈ C(X).
Corollary A.12.12. Let X be a compact space. Then C(X) separates the points of X if and only if X is Hausdorff.
Exercise A.12.13. Prove the previous corollary.
Definition A.12.14 (Partition of unity). A partition of unity on K ⊂ X in a topological space (X, τ) is a family F = {φj : X → [0, 1] | j ∈ J} of continuous functions such that
χK ≤ Σ_{j∈J} φj ≤ 1,
where the sum is required to be locally finite: for each x ∈ X there exists U ∈ V(x) such that supp(φj) ⊂ X \ U for all but finitely many φj ∈ F. Moreover, if now φj ≺ Uj for all j ∈ J, where U = {Uj : j ∈ J} is an open cover of X, then F is called a partition of unity on K subordinate to U.
Corollary A.12.15 (Partition of unity). Let U be an open cover of a compact set K ⊂ X in a Hausdorff space (X, τ). Then there exists a partition of unity on K subordinate to U.
Proof. Assume the non-trivial case K ≠ ∅. Take a finite subcover U′ = {Uj | 1 ≤ j ≤ n} ⊂ U. For x ∈ K, take j ∈ {1, . . . , n} such that x ∈ Uj; then choose Vx ∈ V(x) such that \overline{Vx} ⊂ Uj. Then O = {Vx | x ∈ K} is an open cover of K, thus having a finite subcover O′ ⊂ O. Let
Kj := ∪{ \overline{V} : V ∈ O′, \overline{V} ⊂ Uj }.
Urysohn’s Lemma provides functions fj ∈ C(X) satisfying Kj ≺ fj ≺ Uj. Again by Urysohn’s Lemma, there exists g ∈ C(X) such that
∪_{j=1}^{n} Kj ≺ g ≺ { x ∈ X : Σ_{k=1}^{n} fk(x) > 0 }.
Notice that K ⊂ ∪_{j=1}^{n} Kj. Let
φj := fj / (1 − g + Σ_{k=1}^{n} fk).
Then {φj ∈ C(X)}_{j=1}^{n} provides the desired partition of unity.
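As a quick sanity check on the normalisation φj := fj/(1 − g + Σk fk), here is a small numerical sketch in Python. Everything concrete below — the set K = [0.2, 0.8] ⊂ R, the cover, and the piecewise-linear "hat" functions standing in for the Urysohn functions fj and g — is an illustrative assumption, not part of the text.

```python
def hat(x, a, b, c, d):
    # Piecewise-linear bump: 0 outside (a, d), 1 on [b, c], linear in between.
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Urysohn-type functions K_j < f_j < U_j for the cover
# U_1 = (0, 0.6), U_2 = (0.4, 1) of K = [0.2, 0.8].
def f1(x): return hat(x, 0.05, 0.10, 0.50, 0.55)
def f2(x): return hat(x, 0.45, 0.50, 0.90, 0.95)

# g with K_1 u K_2 < g < {x : f1(x) + f2(x) > 0}: g = 1 on [0.1, 0.9],
# supported where f1 + f2 is positive.
def g(x):  return hat(x, 0.06, 0.10, 0.90, 0.94)

def phi(j, x):
    # phi_j := f_j / (1 - g + f_1 + f_2); the denominator is 1 wherever
    # all the f_k vanish (there g = 0 too), so it never vanishes.
    f = (f1, f2)[j]
    return f(x) / (1.0 - g(x) + f1(x) + f2(x))
```

On K the function g equals 1, so the denominator reduces to f1 + f2 and the φj sum to exactly 1 there, while staying between 0 and 1 everywhere.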
Exercise A.12.16. In a compact metric space (X, d), Urysohn’s Lemma is much easier to obtain: When A, B ⊂ X are closed and non-empty such that A ∩ B = ∅, define f : X → R by
f(x) := min{ 1, dist(A, {x}) / dist(A, B) }.
Show that f is continuous, 0 ≤ f ≤ 1, f(A) = {0} and f(B) = {1}.
Definition A.12.17 (Equicontinuity). Let X be a topological space. A family F of mappings f : X → C is called equicontinuous at p ∈ X if for every ε > 0 there exists a neighbourhood U ⊂ X of p such that |f(x) − f(p)| < ε whenever f ∈ F and x ∈ U.
Exercise A.12.18. Prove the following Theorem A.12.19. (Hint: a bounded sequence of numbers has a convergent subsequence. . . )
Theorem A.12.19 (Arzelà–Ascoli Theorem). Let K ⊂ Rn be compact. For each j ∈ Z+, let fj : K → C be continuous, and assume that F = {fj | j ∈ Z+} is equicontinuous on K. If F is bounded, i.e.,
sup_{x∈K, j∈Z+} |fj(x)| < ∞,
then there is a subsequence {fjk | k ∈ Z+} that converges uniformly on K.
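The formula of Exercise A.12.16 can be tried out directly on finite point sets, where the infima become minima. The following Python sketch (the sets A, B and the Euclidean plane are illustrative choices, not from the text) realises f and lets one check f(A) = {0} and f(B) = {1}.

```python
import math

def dist(p, q):
    # Euclidean distance in the plane.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def set_dist(S, T):
    # dist(S, T) = infimum over pairs; for finite sets it is a minimum.
    return min(dist(p, q) for p in S for q in T)

def urysohn(A, B):
    # f(x) = min(1, dist(A, {x}) / dist(A, B)): f = 0 on A, f = 1 on B,
    # and 0 <= f <= 1 everywhere (Exercise A.12.16).
    dAB = set_dist(A, B)
    def f(x):
        return min(1.0, set_dist(A, [x]) / dAB)
    return f
```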
A.13 Sequential compactness
In this section, a metric space (X, d) is endowed with its canonical metric topology τd . Proposition A.13.1 (Closed and bounded if compact in metric). Let (X, d) be a metric space, and let K ⊂ X be compact. Then K is closed and bounded.
Proof. Let us assume K ≠ ∅, to avoid a triviality. Let x0 ∈ X. Then U = {Bk(x0) | k ∈ Z+} is an open cover of K. Due to compactness of K, there is a subcover U′ = {Bk(x0) | k ∈ S}, where S ⊂ Z+ is finite. Now
K ⊂ ∪U′ = ∪_{k∈S} Bk(x0) = B_{max(S)}(x0).
Therefore diam(K) ≤ 2 max(S) < ∞, so K is bounded. We have to prove that K is closed. Let x ∈ X \ K. Then
V := { B_{d(x,y)/2}(y) | y ∈ K }
is an open cover of K. By compactness, there is a finite subcover
V′ = { B_{d(x,yj)/2}(yj) }_{j=1}^{n}.
Let r := min { d(x, yj)/2 }_{j=1}^{n}. Then
Br(x) ∩ K ⊂ ∪_{j=1}^{n} ( Br(x) ∩ B_{d(x,yj)/2}(yj) ) = ∅,
so x ∉ c_d(K). Thereby K = c_d(K) is closed.
Exercise A.13.2. Give an example of a bounded non-compact metric space.
Definition A.13.3 (Sequential compactness). A metric space is sequentially compact if each of its sequences has a converging subsequence. That is, given a sequence (xk)_{k=1}^{∞} in a sequentially compact metric space (X, d), there is a converging subsequence (x_{kj})_{j=1}^{∞}, where kj+1 > kj ∈ Z+ for each j ∈ Z+.
Theorem A.13.4 (Compact ⇔ sequentially compact in metric spaces). A metric space (X, d) is compact if and only if it is sequentially compact.
Proof. Let us assume that X ≠ ∅ is compact. Take a sequence (xk)_{k=1}^{∞} in X. If the set {xk : k ∈ Z+} is finite, there exists y ∈ X such that y = xk for infinitely many k ∈ Z+. Then a desired convergent subsequence is given by (y, y, y, . . .). Now assume that the set S := {xk : k ∈ Z+} is infinite, so it has an accumulation point p ∈ X by Lemma A.11.15. Take k1 ∈ Z+ such that x_{k1} ∈ S ∩ B1(p). Inductively, take kj+1 > kj ∈ Z+ such that x_{kj+1} ∈ S ∩ B_{1/j}(p). Then d(p, x_{kj+1}) < 1/j → 0 as j → ∞, so x_{kj} → p as j → ∞. We have proven that a compact metric space is sequentially compact. Now let (X, d) be sequentially compact. We want to show that its metric topology is compact. Take an open cover U = {Uα : α ∈ A} of X. We claim that
∃ε0 > 0 ∀x ∈ X ∃α ∈ A : B_{ε0}(x) ⊂ Uα. (A.8)
Let us prove this by deducing a contradiction from the logically negated assumption
∀ε > 0 ∃x ∈ X ∀α ∈ A : Bε(x) ⊄ Uα.
This would especially imply
∀k ∈ Z+ ∃xk ∈ X ∀α ∈ A : B_{1/k}(xk) ⊄ Uα.
This gives us a sequence (xk)_{k=1}^{∞}, which by sequential compactness has a subsequence (x_{kj})_{j=1}^{∞} converging to a point p ∈ X. Since U covers X, we have p ∈ U_{αp} for some αp ∈ A. Since U_{αp} is open, Bε(p) ⊂ U_{αp} for some ε > 0. But for large enough j, B_{1/kj}(x_{kj}) ⊂ Bε(p) ⊂ U_{αp}. This is a contradiction, so (A.8) must be true. Now we claim that
X can be covered with finitely many open balls of radius ε0. (A.9)
What happens if (A.9) is not true? Then take x1 ∈ X, and inductively
x_{k+1} ∈ X \ ∪_{j=1}^{k} B_{ε0}(xj) ≠ ∅,
where the non-emptiness of the set is due to the counter-assumption. Now d(xj, xk) ≥ ε0 > 0 if j ≠ k, so the sequence (xk)_{k=1}^{∞} does not have a convergent subsequence. But this would contradict the sequential compactness. Hence (A.9) must be true.
Exercise A.13.5. Think why the compactness of X follows from (A.8) and (A.9).
Exercise A.13.6. Show that a compact metric space is complete.
Corollary A.13.7 (Heine–Borel Theorem). Let Rn be endowed with its Euclidean topology. Then K ⊂ Rn is compact if and only if it is closed and bounded.
Proof. In any metric topology, compactness implies closedness and boundedness, see Proposition A.13.1. So let S ⊂ Rn be non-empty, closed and bounded. We shall prove that it is sequentially compact. Take a sequence (xk)_{k=1}^{∞} in S. By boundedness, there exist a, b ∈ R such that S ⊂ [a, b]^n =: Q1. That is, Q1 is a closed cube of sidelength b − a. Now we chop Q1 inductively into pieces. When the cube Qj has been chosen, we decompose Qj “dyadically” into a union of 2^n cubes Q_{j+1,m} (here m ∈ {1, . . . , 2^n}), whose interiors are disjoint and whose sidelengths are 2^{−j}(b − a). Choose Q_{j+1} ∈ {Q_{j+1,m} : m ∈ {1, . . . , 2^n}} such that xk ∈ Q_{j+1} for infinitely many k ∈ Z+.
We construct the convergent subsequence (x_{kj})_{j=1}^{∞} inductively. Let k1 := 1. Take kj+1 > kj ∈ Z+ such that x_{kj+1} ∈ Q_{j+1}. Now (x_{kj})_{j=1}^{∞} is a Cauchy sequence, because
Q1 ⊃ Q2 ⊃ Q3 ⊃ · · · ⊃ Qj ⊃ Q_{j+1} ⊃ · · · ,
diam(Q_{j+1}) = √n · 2^{−j}(b − a) → 0 as j → ∞.
Due to the completeness of Rn, the Cauchy sequence (x_{kj})_{j=1}^{∞} of S ⊂ Rn converges to a point p ∈ Rn. But p ∈ S, because S is closed. Thus S is sequentially compact.
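The dyadic subdivision above can be imitated numerically in one dimension: repeatedly keep a half-interval containing at least half of the surviving terms, the finite stand-in for "infinitely many". The Python sketch below is only an illustration on a finite initial segment of a sequence; the choice sin k and all names are ad hoc assumptions.

```python
import math

def bisect_cluster(x, depth):
    # Keep halving [lo, hi], retaining a half that contains at least half
    # of the surviving indices; the surviving indices form a subsequence
    # whose values all lie in an interval of length (hi - lo) / 2**depth.
    lo, hi = min(x), max(x)
    indices = list(range(len(x)))
    for _ in range(depth):
        mid = (lo + hi) / 2
        left = [i for i in indices if x[i] <= mid]
        right = [i for i in indices if x[i] > mid]
        if len(left) >= len(right):
            indices, hi = left, mid
        else:
            indices, lo = right, mid
    return indices, lo, hi

xs = [math.sin(k) for k in range(4096)]
idx, lo, hi = bisect_cluster(xs, 8)
```

After 8 halvings at least 4096/2^8 = 16 indices survive, and the selected values are squeezed into an interval 256 times shorter than the original range.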
Corollary A.13.8. Let (X, τ) be a compact topological space and f : X → R continuous. Then there exist max(f(X)), min(f(X)) ∈ R.
Proof. Assume that X ≠ ∅. By Proposition A.11.12, f(X) ⊂ R is compact. By the Heine–Borel Theorem A.13.7, equivalently f(X) ⊂ R is closed and bounded. Thereby sup(f(X)), inf(f(X)) ∈ f(X).
We note that the Heine–Borel theorem can also be proved without referring to sequential compactness. For simplicity, we show this in the one-dimensional case.
Theorem A.13.9 (Heine–Borel Theorem in 1D). Closed intervals [a, b] are compact in R in the Euclidean topology.
Proof. We will assume a < b since otherwise the statement is trivial. For an open covering C = {Uα}_{α∈I} of [a, b] let S ⊂ [a, b] be defined by
S = {x ∈ [a, b] : [a, x] can be covered by finitely many sets from C}.
The statement of the theorem will follow if we show that b ∈ S. Since S ≠ ∅ in view of a ∈ S, and since S ⊂ [a, b] is bounded, we can define c = sup S, so that c ∈ [a, b]. The statement of the theorem will follow if we show that c ∈ S and that c = b. To show that c ∈ S, we observe that since c ∈ [a, b], there is some set Uc ∈ C such that c ∈ Uc. Since Uc is open, there is some ε > 0 such that (c − ε, c] ⊂ Uc. At the same time, since c − ε < c = sup S, the closed interval [a, c − ε] can be covered by finitely many sets from C by the definition of S and c. Consequently, adding Uc to this finite collection of sets from C we obtain a finite covering of [a, c], implying that c ∈ S. To show that c = b, let us assume that c < b. As before, let Uc be such that c ∈ Uc ∈ C. Since Uc is open and c < b, there is ε > 0 such that [c, c + 2ε) ⊂ Uc. Since c ∈ S, the closed interval [a, c] can be covered by finitely many sets from C, and adding Uc to this finite collection we obtain a finite covering of [a, c + ε], which is a contradiction with c = sup S.
Theorem A.13.10 (R is complete). The real line R with the Euclidean metric is complete.
Proof. Let xn be a Cauchy sequence in R. By Lemma A.9.2, (2), the set {xn}_{n=1}^{∞} is bounded, i.e., there are some a, b ∈ R such that {xn}_{n=1}^{∞} ⊂ [a, b]. By the Heine–Borel Theorem in Corollary A.13.7 or in Theorem A.13.9 the interval [a, b] is compact, and by Exercise A.13.6 it must be complete. Therefore, xn must have a convergent subsequence, and since it is a Cauchy sequence, the whole sequence is convergent by Lemma A.9.2, (3).
Corollary A.13.11 (Rn is complete). The space Rn is complete with respect to any of the Lipschitz equivalent metrics dp, 1 ≤ p ≤ ∞.
Proof. Since all metrics dp are Lipschitz equivalent, it is enough to take one, e.g., d∞. Writing x̄k = (xk^{(1)}, . . . , xk^{(n)}) and d∞(x̄k, x̄l) = max_{1≤i≤n} |xk^{(i)} − xl^{(i)}|, we have that d∞(x̄k, x̄l) < ε implies |xk^{(i)} − xl^{(i)}| < ε for all i = 1, . . . , n. Thus, if x̄k ∈ Rn is a Cauchy sequence in Rn, it follows that xk^{(i)} is a Cauchy sequence in R for all i, and hence it has a limit, say x^{(i)}, for all i, by Theorem A.13.10. Writing x̄ = (x^{(1)}, . . . , x^{(n)}), we claim that x̄k → x̄ as k → ∞. Indeed, let ε > 0. Then for all i there is a number Ni such that k > Ni implies |xk^{(i)} − x^{(i)}| < ε. Therefore, for k > max_{1≤i≤n} Ni, we have |xk^{(i)} − x^{(i)}| < ε for all i, which means that d∞(x̄k, x̄) < ε.
Alternative proof of Theorem A.13.4. We now state several results of independent importance that will give another proof of Theorem A.13.4.
Lemma A.13.12 (Lebesgue’s covering lemma). Let C be an open covering of a sequentially compact metric space (X, d). Then there is ε > 0 such that every ball with radius ε is contained in some set from the covering C. Such an ε is called a Lebesgue number of the covering C.
Proof. Suppose that no such ε > 0 exists. It means that for every n ∈ N there is a ball B_{1/n}(xn) which is not contained in any set from C. Let (x_{nj})_{j=1}^{∞} be a convergent subsequence of (xn)_{n=1}^{∞} with some limit x ∈ X, so that x_{nj} → x as j → ∞.
Let U ∈ C be a set in C containing x. Since U is open, there is some δ > 0 such that B_{2δ}(x) ⊂ U. Now let N be one of the indices nj such that d(xN, x) < δ and such that 1/N < δ. We claim that B_{1/N}(xN) ⊂ B_{2δ}(x), which would be a contradiction with our choice of the sequence xn and the fact that B_{2δ}(x) ⊂ U. Indeed, if y ∈ B_{1/N}(xN), we have
d(y, x) ≤ d(y, xN) + d(xN, x) < 1/N + δ < 2δ,
so that y ∈ B_{2δ}(x).
Lemma A.13.13 (Sequentially compact spaces are totally bounded). Let (X, d) be a sequentially compact metric space. Then for every ε > 0 there are finitely many balls in X with radius ε that cover X.
Proof. Suppose that there is ε > 0 such that no finitely many balls in X with radius ε cover X. We will now construct a sequence of points in X with no convergent subsequence. Let x1 ∈ X be an arbitrary point. Let x2 be any point in X \ Bε(x1). Inductively, suppose we have points x1, . . . , xn ∈ X such that xj ∈ X \ ∪_{i=1}^{j−1} Bε(xi). Since the collection {Bε(xi)}_{i=1}^{n} does not cover X, we can always choose some x_{n+1} ∈ X \ ∪_{i=1}^{n} Bε(xi). All points in this sequence have the property that d(xn, xk) ≥ ε for any n ≠ k, which means that the sequence (xn)_{n=1}^{∞} can not have any convergent subsequence.
Exercise A.13.14. In general, a metric space is said to be totally bounded if
X=
nε
Bε (xj ).
j=1
Show that a metric space (X, d) is compact if and only if it is complete and totally bounded.
Alternative proof of Theorem A.13.4. Let (X, d) be a metric space. First we will prove that if X is compact it is sequentially compact. Let (xn)_{n=1}^{∞} be a sequence of points in X. Define An = {xn, x_{n+1}, x_{n+2}, . . .}, so that A1 = {xn}_{n=1}^{∞}. Let Fn = \overline{An}. Clearly Fn is a closed set, and the intersection of any finite number of sets Fn is non-empty since it contains AN for some N. Since X is compact, by the finite intersection property in Proposition A.11.5 we have ∩_{n=1}^{∞} Fn ≠ ∅. Let now x ∈ ∩_{n=1}^{∞} Fn, so that x ∈ Fn = \overline{An} for all n. Using a characterisation of closures in Proposition A.8.10, it follows that every open ball B_{1/j}(x) contains a point x_{nj} ∈ A_{nj} with nj as large as we want. Therefore, we have a subsequence {x_{nj}} of {xn} such that d(x_{nj}, x) < 1/j, which means that it is a convergent subsequence of {xn}.
Let us now prove that if X is sequentially compact it is compact. Let C be an open cover of X and let ε > 0 be its Lebesgue number according to Lemma A.13.12. By Lemma A.13.13, X is totally bounded, so that it is covered by finitely many balls {Bε(xi)}_{i=1}^{n}. Since ε is a Lebesgue number, for every i = 1, . . . , n, there is some Ui ∈ C such that Bε(xi) ⊂ Ui. Consequently, {Ui}_{i=1}^{n} must be a cover for X.
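The proof of Lemma A.13.13 implicitly runs a greedy process: keep picking a point not covered by the balls chosen so far, producing centres that are pairwise at distance ≥ ε. On a finite point set this process must stop and yields a finite ε-net. A Python sketch (the sample grid and names are arbitrary illustrative choices):

```python
import math

def eps_net(points, eps):
    # Greedily pick centres; every point left uncovered spawns a new centre.
    # By construction the chosen centres are pairwise at distance >= eps,
    # and every point ends up within distance < eps of some centre.
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers
```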
A.14 Stone–Weierstrass theorem
In the sequel we study the density of subalgebras in C(X). These results will be applied in characterising function algebras among Banach algebras. For material concerning algebras we refer to Chapter D. First we study continuous functions on [a, b] ⊂ R:
Theorem A.14.1 (Weierstrass Theorem (1885)). Polynomials are dense in C([a, b]).
Proof. Evidently, it is enough to consider the case [a, b] = [0, 1]. Let f ∈ C([0, 1]), and let g(x) = f(x) − (f(0) + (f(1) − f(0))x); then g ∈ C(R) if we define g(x) = 0 for x ∈ R \ [0, 1]. For n ∈ N let us define kn : R → [0, ∞) by
kn(x) := (1 − x^2)^n / ∫_{−1}^{1} (1 − t^2)^n dt, when |x| < 1, and kn(x) := 0, when |x| ≥ 1.
Then define Pn := g ∗ kn (the convolution of g and kn), that is
Pn(x) = ∫_{−∞}^{∞} g(x − t) kn(t) dt = ∫_{−∞}^{∞} g(t) kn(x − t) dt = ∫_{0}^{1} g(t) kn(x − t) dt,
and from this last expression we see that Pn is a polynomial on [0, 1]. Notice that Pn is real valued if f is real valued. Take any ε > 0. The function g is uniformly continuous, so that there exists δ > 0 such that
∀x, y ∈ R : |x − y| < δ ⇒ |g(x) − g(y)| < ε.
Let ‖g‖ = max_{t∈[0,1]} |g(t)|. Take x ∈ [0, 1]. Then
|Pn(x) − g(x)| = | ∫_{−∞}^{∞} g(x − t) kn(t) dt − g(x) ∫_{−∞}^{∞} kn(t) dt |
= | ∫_{−1}^{1} (g(x − t) − g(x)) kn(t) dt |
≤ ∫_{−1}^{1} |g(x − t) − g(x)| kn(t) dt
≤ ∫_{−1}^{−δ} 2‖g‖ kn(t) dt + ∫_{−δ}^{δ} ε kn(t) dt + ∫_{δ}^{1} 2‖g‖ kn(t) dt
≤ 4‖g‖ ∫_{δ}^{1} kn(t) dt + ε.
The reader may verify that ∫_{δ}^{1} kn(t) dt → 0 as n → ∞ for every δ > 0. Hence ‖Qn − f‖ → 0 as n → ∞, where Qn(x) = Pn(x) + f(0) + (f(1) − f(0))x.
Exercise A.14.2. Show that ∫_{δ}^{1} kn(t) dt → 0 as n → ∞ in the proof of the Weierstrass Theorem A.14.1.
Definition A.14.3 (Involutive subalgebras). For f : X → C let us define f* : X → C by f*(x) := \overline{f(x)}, and define |f| : X → C by |f|(x) := |f(x)|. A subalgebra A ⊂ F(X) is called involutive if f* ∈ A whenever f ∈ A.
Theorem A.14.4 (Stone–Weierstrass Theorem (1937)). Let X be a compact space. Let A ⊂ C(X) be an involutive subalgebra separating the points of X. Then A is dense in C(X).
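The kernel construction in the proof is concrete enough to test numerically. The following Python sketch approximates the integrals by Riemann sums and compares the sup-error of Qn for two values of n; the grid sizes and the test function |x − 1/2| are arbitrary choices, so this is only an illustration of the proof's formulas, not part of the text.

```python
def sup_error(f, n, xgrid=101, tgrid=800):
    # Endpoint-corrected g, extended by zero outside [0, 1].
    def g(t):
        if t < 0.0 or t > 1.0:
            return 0.0
        return f(t) - (f(0.0) + (f(1.0) - f(0.0)) * t)
    # c = integral_{-1}^{1} (1 - t^2)^n dt, by a midpoint Riemann sum.
    c = sum((1.0 - (-1.0 + 2.0 * (i + 0.5) / tgrid) ** 2) ** n
            for i in range(tgrid)) * (2.0 / tgrid)
    def k(u):
        return (1.0 - u * u) ** n / c if abs(u) < 1.0 else 0.0
    def Q(x):
        # P_n(x) = integral_0^1 g(t) k_n(x - t) dt, then add back the
        # linear part: Q_n(x) = P_n(x) + f(0) + (f(1) - f(0)) x.
        P = sum(g((i + 0.5) / tgrid) * k(x - (i + 0.5) / tgrid)
                for i in range(tgrid)) / tgrid
        return P + f(0.0) + (f(1.0) - f(0.0)) * x
    return max(abs(Q(i / (xgrid - 1)) - f(i / (xgrid - 1)))
               for i in range(xgrid))

err5 = sup_error(lambda x: abs(x - 0.5), 5)
err50 = sup_error(lambda x: abs(x - 0.5), 50)
```

The error should shrink roughly like the width ~1/√n of the kernel kn, so err50 comes out clearly smaller than err5.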
Proof. If f ∈ A then f* ∈ A, so that the real part Re f = (f + f*)/2 belongs to A. Let us define
AR := {Re f | f ∈ A};
this is an R-subalgebra of the R-algebra C(X, R) of continuous real-valued functions on X. Then A = {f + ig | f, g ∈ AR}, so that AR separates the points of X. If we can show that AR is dense in C(X, R) then A would be dense in C(X). First we have to show that \overline{AR} is closed under taking maximums and minimums. For f, g ∈ C(X, R) we define
max(f, g)(x) := max(f(x), g(x)), min(f, g)(x) := min(f(x), g(x)).
Notice that AR is an algebra over the field R. Since
max(f, g) = (f + g)/2 + |f − g|/2, min(f, g) = (f + g)/2 − |f − g|/2,
it is enough to prove that |h| ∈ \overline{AR} whenever h ∈ \overline{AR}. Let h ∈ \overline{AR}. By the Weierstrass Theorem A.14.1 there is a sequence of polynomials Pn : R → R such that Pn(x) → |x| uniformly on the interval [−‖h‖, ‖h‖]. Thereby ‖ |h| − Pn(h) ‖ → 0 as n → ∞, where Pn(h)(x) := Pn(h(x)). Since Pn(h) ∈ \overline{AR} for every n, this implies that |h| ∈ \overline{AR}. Now we know that max(f, g), min(f, g) ∈ \overline{AR} whenever f, g ∈ \overline{AR}. Now we are ready to prove that f ∈ C(X, R) can be approximated by elements of \overline{AR}. Take ε > 0 and x, y ∈ X, x ≠ y. Since AR separates the points of X, we may pick h ∈ AR such that h(x) ≠ h(y). Let gxx = f(x)1, and let
gxy(z) := f(x) (h(z) − h(y))/(h(x) − h(y)) + f(y) (h(z) − h(x))/(h(y) − h(x)).
Here gxx, gxy ∈ AR, since AR is an algebra. Furthermore,
gxy(x) = f(x), gxy(y) = f(y).
Due to the continuity of gxy, there is an open set Vxy ∈ V(y) such that
z ∈ Vxy ⇒ f(z) − ε < gxy(z).
Now {Vxy | y ∈ X} is an open cover of the compact space X, so that there is a finite subcover {V_{xyj} | 1 ≤ j ≤ n}. Define
gx := max_{1≤j≤n} g_{xyj};
gx ∈ \overline{AR}, because \overline{AR} is closed under taking maximums. Moreover,
∀z ∈ X : f(z) − ε < gx(z).
Due to the continuity of gx (and since gx(x) = f(x)), there is an open set Ux ∈ V(x) such that
z ∈ Ux ⇒ gx(z) < f(z) + ε.
Now {Ux | x ∈ X} is an open cover of the compact space X, so that there is a finite subcover {U_{xi} | 1 ≤ i ≤ m}. Define
g := min_{1≤i≤m} g_{xi};
g ∈ \overline{AR}, because \overline{AR} is closed under taking minimums. Moreover,
∀z ∈ X : g(z) < f(z) + ε. Thus
f(z) − ε < g(z) < f(z) + ε,
that is |g(z) − f(z)| < ε for every z ∈ X, i.e., ‖g − f‖ < ε. Hence AR is dense in C(X, R), implying that A is dense in C(X).
Remark A.14.5. Notice that under the assumptions of the Stone–Weierstrass Theorem, the compact space is actually a compact Hausdorff space, since continuous functions separate the points.
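For the step showing |h| lies in the closure of AR one does not even need the full Weierstrass theorem: the classical iteration p0 = 0, p_{k+1}(s) = p_k(s) + (s^2 − p_k(s)^2)/2 produces polynomials increasing uniformly to |s| on [−1, 1] with sup-error at most 2/(k+2). This is a standard alternative device, not the argument used in the text; a Python sketch evaluating the iteration on a grid:

```python
def abs_error(iterations, grid=400):
    # p_0 = 0, p_{k+1}(s) = p_k(s) + (s^2 - p_k(s)^2) / 2; each p_k is a
    # polynomial in s, and p_k increases pointwise to |s| on [-1, 1].
    xs = [-1.0 + 2.0 * i / grid for i in range(grid + 1)]
    p = [0.0] * len(xs)
    for _ in range(iterations):
        p = [pk + (s * s - pk * pk) / 2.0 for pk, s in zip(p, xs)]
    # Sup-error of p_k against |s| over the grid.
    return max(abs(abs(s) - pk) for s, pk in zip(xs, p))

e3, e30 = abs_error(3), abs_error(30)
```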
A.15 Manifolds
We now give an example of a class of Hausdorff spaces which is a starting point of geometric analysis. We will come back to this topic with more details in Section 5.2.
Definition A.15.1 (Manifold). A topological space (X, τ) is called an n-dimensional (topological) manifold if it is second countable, Hausdorff and each of its points has a neighbourhood homeomorphic to an open set of the Euclidean space Rn. If φ : U → U′ is a homeomorphism, where U ∈ τ and U′ ⊂ Rn is open, then the pair (U, φ) is called a chart on X.
Exercise A.15.2. Show that the sphere S^n = { x ∈ R^{n+1} : Σ_{j=1}^{n+1} xj^2 = 1 } is an n-dimensional manifold.
Exercise A.15.3. Let X and Y be manifolds of respective dimensions m, n. Show that X × Y is a manifold of dimension m + n.
Definition A.15.4 (Differentiable manifold). Let (X, τ) be an n-dimensional manifold. A collection A = {(Ui, φi) : i ∈ I} of charts on X is called a C^k-atlas if {Ui : i ∈ I} is a cover of X and if the mappings
x ↦ φj(φi^{−1}(x)) : φi(Ui ∩ Uj) → φj(Ui ∩ Uj)
are C^k-smooth whenever Ui ∩ Uj ≠ ∅. If there is a C^k-atlas then X is called a C^k-manifold (differentiable manifold).
A.16 Connectedness and path-connectedness
In this section we discuss the notions of connected and path-connected topological spaces and the relation between them.
Proposition A.16.1. Let (X, τ) be a topological space. Then the following statements are equivalent:
(i) There exist non-empty open subsets U, V of X such that U ∩ V = ∅ and U ∪ V = X.
(ii) There exists a non-empty subset U of X such that U is open and closed and such that U ≠ X.
(iii) There exists a continuous surjective mapping from X to the set {0, 1} equipped with the discrete topology.
Proof. Statements (i) and (ii) are equivalent if we take V = X \ U. Let us show that (i) implies (iii). Define a mapping f by f(x) = 0 for x ∈ U and f(x) = 1 for x ∈ V. Since U and V are non-empty, the mapping f is surjective. If W is any subset of {0, 1}, its preimage f^{−1}(W) is one of the sets ∅, U, V, X. Since all of them are open, f is continuous. Finally, to show that (iii) implies (i), we set U = f^{−1}(0) and V = f^{−1}(1). Since f is continuous, both sets are open. Moreover, clearly they are disjoint, U ∪ V = X, and they are non-empty because f is surjective.
Definition A.16.2 (Connected topological space). A topological space (X, τ) is said to be disconnected if it satisfies any of the equivalent properties of Proposition A.16.1. Otherwise, it is said to be connected.
Proposition A.16.3 (“Connectedness” is a topological property). Let X and Y be topological spaces and let f : X → Y be continuous. If X is connected, then f(X) is also connected. Consequently, “connectedness” is a topological property.
Proof. Suppose that f(X) = U ∪ V with U, V as in Proposition A.16.1, (i). Then X = f^{−1}(U) ∪ f^{−1}(V) and the sets f^{−1}(U), f^{−1}(V) satisfy the conditions of Proposition A.16.1, (i), yielding a contradiction.
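On a finite topological space, condition (ii) of Proposition A.16.1 can be checked by brute force: the space is disconnected exactly when some non-empty proper open set also has an open complement. A Python sketch (the encoding of a topology as a list of sets is my own convention):

```python
def is_connected(points, topology):
    # Condition (ii): disconnected iff some open U with U != {} and U != X
    # has an open complement, i.e., U is clopen and proper.
    X = frozenset(points)
    opens = {frozenset(U) for U in topology}
    return not any(U and U != X and (X - U) in opens for U in opens)

# Sierpinski space on {1, 2}: the only proper non-empty open set is {1},
# and its complement {2} is not open -- so the space is connected.
sierpinski = [set(), {1}, {1, 2}]
# The discrete two-point space is disconnected: {1} is clopen and proper.
discrete = [set(), {1}, {2}, {1, 2}]
```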
A.16. Connectedness and path-connectedness
67
Exercise A.16.4. Prove that a subset A of a topological space X is disconnected (in the relative topology) if and only if there are open sets U, V in X such that U ∩ A ≠ ∅, V ∩ A ≠ ∅, A ⊂ U ∪ V and U ∩ V ∩ A = ∅.
Proposition A.16.5 (Closures are connected). Let X be a topological space and let A ⊂ X. If A is connected, then its closure \overline{A} is also connected.
Proof. Let U and V be open sets in X such that \overline{A} ⊂ U ∪ V and U ∩ V ∩ \overline{A} = ∅. Since A ⊂ \overline{A}, we have A ⊂ U ∪ V and U ∩ V ∩ A = ∅. Since A is connected, by Exercise A.16.4 we must then have either U ∩ A = ∅ or V ∩ A = ∅, which means that either A ⊂ X \ U or A ⊂ X \ V. Since the sets X \ U and X \ V are closed in X, it follows that we have either \overline{A} ⊂ X \ U or \overline{A} ⊂ X \ V, which means that either U ∩ \overline{A} = ∅ or V ∩ \overline{A} = ∅. By Exercise A.16.4 again, it means that \overline{A} is connected.
Definition A.16.6 (Path-connected topological spaces). A topological space X is said to be path-connected if for any two points a, b ∈ X there is a path from a to b, i.e., a continuous mapping γ : [0, 1] → X such that γ(0) = a and γ(1) = b.
Theorem A.16.7 (Path-connected =⇒ connected). A path-connected topological space is connected.
Exercise A.16.8. Show that the converse is not true. For example, prove that the set X = {(0, t) : −1 ≤ t ≤ 1} ∪ {(t, sin(1/t)) : t > 0} in the relative topology of the Euclidean space R2 is connected but not path-connected.
We first prove a special case of Theorem A.16.7, namely we show that intervals in R are connected. We then reduce the general case to this one. By an interval in R we understand any open or closed or half-open, finite or infinite interval.
Theorem A.16.9 (Interval in R =⇒ connected). Every interval I in R with the Euclidean topology is connected.
Proof. We will prove it by contradiction. Suppose I = U ∪ V, where U and V are non-empty, disjoint, and open in the relative topology of I. Let u ∈ U, v ∈ V, and assume u < v. Since I is an interval we have [u, v] ⊂ I, and we write
A = {x ∈ I : u ≤ x and [u, x] ⊂ U}.
Since u ∈ A, A is non-empty, and since v ∉ U, A is bounded above by v. Thus, we can define w = sup A, and we have [u, w) ⊂ U. Since w ∈ [u, v], we also have w ∈ I = U ∪ V, so that either w ∈ U or w ∈ V. We will now show that both choices are impossible. Suppose w ∈ U. Then w < v and since U is open, there is some δ > 0 such that (w − δ, w + δ) ∩ I ⊂ U. Now, if we take some z ∈ (w, w + δ) ∩ I (which is non-empty since w < v), we have [w, z] ⊂ U, so that also [u, z] ⊂ U, contradicting w = sup A. Suppose now w ∈ V. Then u < w and since V is open, there is some δ > 0 such that (w − δ, w + δ) ∩ I ⊂ V. Now, if we take some z ∈ (w − δ, w) ∩ A, we
have (z, w] ⊂ V, so that for all x ∈ (z, w] we have [u, x] ⊄ U; hence A ∩ (z, w] = ∅, contradicting w = sup A again.
Proof of Theorem A.16.7. Let X be a path-connected topological space and let f be a continuous mapping from X to {0, 1} equipped with the discrete topology. By Proposition A.16.1 it is enough to show that f must be constant. Without loss of generality, suppose that f(x) = 0 for some x ∈ X. Let y ∈ X and let γ be a path from x to y. Then the composition mapping f ◦ γ : [0, 1] → {0, 1} is continuous. Since [0, 1] is connected by Theorem A.16.9 it follows that f ◦ γ can not be surjective, so that f(y) = f(γ(1)) = f(γ(0)) = f(x) = 0. Thus, f(y) = 0 for all y ∈ X, which means that f can not be surjective.
Theorem A.16.9 has a converse:
Theorem A.16.10 (Connected in R =⇒ interval). If I is a connected subset of R it must be an interval.
Proof. First we show that I ⊂ R is an interval if and only if for any a, c ∈ I and any b ∈ R with a < b < c we must have b ∈ I. If I is an interval the implication is trivial. Conversely, we will prove that if a ∈ I, then I ∩ [a, ∞) is [a, ∞) or [a, e] or [a, e) for some e ∈ R. If I is not bounded above, then for any b > a there is c ∈ I such that c > b. Hence b ∈ I by assumption. In this case I ∩ [a, ∞) = [a, ∞). So we may assume that I is bounded above and let e = sup I. If e = a, then I ∩ [a, ∞) = [a, a], so we may assume e > a. Then for any b with a < b < e there is some c ∈ I such that a < b < c, and hence b ∈ I by assumption. Therefore, I ∩ [a, ∞) is [a, e] or [a, e) depending on whether e ∈ I or not. Arguing in a similar way for I ∩ (−∞, a] we get that I must be an interval.
Now, suppose I is not an interval. By the above claim, there exist some a, c ∈ I and b ∉ I such that a < b < c. But then U = I ∩ (−∞, b) and V = I ∩ (b, ∞) is a decomposition of I into a union of non-empty, open disjoint sets with U ∪ V = I, contradicting the assumption that I is connected.
We will now show a converse to Theorem A.16.7, provided that we are dealing with subsets of Rn.
Theorem A.16.11 (Open connected in Rn =⇒ path-connected). Every open connected subset of Rn with the Euclidean topology is path-connected.
Proof. First we note that if we have a path γ1 from a to b and a path γ2 from b to c, we can glue them together to obtain a path from a to c, e.g., by setting γ(t) = γ1(2t) for 0 ≤ t ≤ 1/2, and γ(t) = γ2(2t − 1) for 1/2 ≤ t ≤ 1. Let A be a non-empty open connected set in Rn (the statement is trivial for the empty set). Take some a ∈ A and define
U = {b ∈ A : there is a path from a to b in A}.
We claim that U is open and closed. Indeed, if b ∈ U, we have Bε(b) ⊂ A for some ε > 0. Consequently, for any c ∈ Bε(b) we have a path from a to b in A by the definition of U, and we obviously also have a path from b to c in Bε(b) (e.g., just a straight line). Glueing these paths together, we obtain a path from a to c in A, which means that Bε(b) ⊂ U and hence U is open. To show that U is also closed, take some b ∈ A \ U. Then we have Bε(b) ⊂ A for some ε > 0. If now c ∈ Bε(b), there is a path from c to b in Bε(b). Consequently, there can be no path from a to c, because otherwise there would be a path from a to b in A. Thus, c ∈ A \ U, implying that A \ U is open. Finally, writing A = U ∪ (A \ U) as a union of two disjoint open sets, and observing that U contains a and is, therefore, non-empty, it follows that A \ U = ∅ because A is connected. But this means that A = U and hence A is path-connected.
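The gluing of paths used at the start of the proof of Theorem A.16.11 is easy to write down concretely. Here is a Python sketch with two straight-line paths in R^2; the specific paths and names are illustrative choices only.

```python
def line(a, b):
    # Straight-line path t -> (1 - t) a + t b in R^2.
    return lambda t: tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))

def glue(g1, g2):
    # gamma(t) = g1(2t) on [0, 1/2] and g2(2t - 1) on [1/2, 1];
    # continuity at t = 1/2 needs g1(1) == g2(0).
    return lambda t: g1(2 * t) if t <= 0.5 else g2(2 * t - 1)

g1 = line((0.0, 0.0), (1.0, 0.0))
g2 = line((1.0, 0.0), (1.0, 1.0))
gamma = glue(g1, g2)
```

The glued path runs from (0, 0) to (1, 1) through the corner (1, 0), matching the two pieces at t = 1/2.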
A.17 Co-induction and quotient spaces
Definition A.17.1 (Co-induced topology). Let X and J be sets, let (Xj, τj) be topological spaces for every j ∈ J, and let F = {fj : Xj → X | j ∈ J} be a family of mappings. The F-co-induced topology of X is the strongest topology τ on X such that the mappings fj are continuous for every j ∈ J.
Exercise A.17.2. Let τ be the co-induced topology from Definition A.17.1. Show that
τ = { U ⊂ X | ∀j ∈ J : fj^{−1}(U) ∈ τj }.
Definition A.17.3 (Quotient topology). Let (X, τX) be a topological space, and let ∼ be an equivalence relation on X. Let [x] := {y ∈ X | x ∼ y}, X/∼ := {[x] | x ∈ X}, and define the quotient map π : X → X/∼ by π(x) := [x]. The quotient topology of the quotient space X/∼ is the {π}-co-induced topology on X/∼.
Exercise A.17.4. Show that X/∼ is compact if X is compact.
Example. Let A be a topological vector space and J its subspace. Let us write [x] := x + J for x ∈ A. Then the quotient topology of A/J = {[x] | x ∈ A} is the topology co-induced by the family {(x ↦ [x]) : A → A/J}.
Remark A.17.5. The message of the following Exercise A.17.6 is that if our compact space X is not Hausdorff, we can factor out the inessential information that the continuous functions f : X → C do not see, to obtain a compact Hausdorff space related nicely to X.
Exercise A.17.6. Let X be a topological space, and let us define a relation R ⊂ X × X by
(x, y) ∈ R :⟺ ∀f ∈ C(X) : f(x) = f(y).
Prove:
(a) R is an equivalence relation on X.
(b) There is a natural bijection between the sets C(X) and C(X/R).
(c) X/R is a Hausdorff space.
(d) If X is a compact Hausdorff space then X ≅ X/R.
Exercise A.17.7. For A ⊂ X, let us define the equivalence relation RA by
(x, y) ∈ RA ⇐⇒ x = y or {x, y} ⊂ A.
Let X be a topological space, and let ∞ ⊂ X be a closed subset. Prove that the mapping
(x → [x]) : X \ ∞ → (X/R∞) \ {∞}
is a homeomorphism.

Finally, let us state a basic property of co-induced topologies:

Proposition A.17.8. Let X have the F-co-induced topology, and Y be a topological space. A mapping g : X → Y is continuous if and only if g ◦ f is continuous for every f ∈ F.

Proof. If g is continuous then the composed mapping g ◦ f is continuous for every f ∈ F. Conversely, suppose g ◦ fj is continuous for every fj ∈ F, fj : Xj → X. Let V ⊂ Y be open. Then
fj^{-1}(g^{-1}(V)) = (g ◦ fj)^{-1}(V) ⊂ Xj
is open for every j ∈ J; thereby g^{-1}(V) ⊂ X is open.
Corollary A.17.9. Let X, Y be topological spaces, R be an equivalence relation on X, and endow X/R with the quotient topology. A mapping f : X/R → Y is continuous if and only if (x → f ([x])) : X → Y is continuous.
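For finite spaces the quotient topology and the continuity criterion of Corollary A.17.9 can be checked by brute force. The following sketch (the four-point space, the gluing, and the map f are illustrative choices, not from the text) computes the {π}-co-induced topology and confirms that a map on the quotient is continuous exactly when its composition with the quotient map is:

```python
from itertools import combinations

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# X = {0, 1, 2, 3} with a (non-Hausdorff) topology given as a set of open sets.
X = frozenset({0, 1, 2, 3})
tau_X = {frozenset(), frozenset({0, 1}), frozenset({2, 3}), X}

# Equivalence relation: glue 0 with 1, and 2 with 3; pi is the quotient map.
pi = {0: "a", 1: "a", 2: "b", 3: "b"}
XR = frozenset(pi.values())

# Quotient ({pi}-co-induced) topology: U is open iff pi^{-1}(U) is open in X.
tau_XR = {U for U in powerset(XR)
          if frozenset(x for x in X if pi[x] in U) in tau_X}

def continuous(g, dom_topology, dom):
    # Continuity of g, treating the (finite) codomain as discrete on its values.
    values = set(g[x] for x in dom)
    return all(frozenset(x for x in dom if g[x] in V) in dom_topology
               for V in powerset(values))

f = {"a": 10, "b": 20}                  # a map on the quotient X/R
f_pi = {x: f[pi[x]] for x in X}         # its composition with pi
print(sorted(len(U) for U in tau_XR))   # sizes of the quotient-open sets
print(continuous(f, tau_XR, XR), continuous(f_pi, tau_X, X))
```

Here the quotient topology comes out discrete on the two classes, and both continuity checks agree, as Corollary A.17.9 predicts.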
A.18
Induction and product spaces
The main theorem of this section is Tihonov’s theorem which is a generalisation of Theorem A.11.14 to infinitely many sets. However, we also discuss other topologies induced by infinite families, and some of their properties.
Definition A.18.1 (Induced topology). Let X and J be sets, (Xj, τj) be topological spaces for every j ∈ J and F = {fj : X → Xj | j ∈ J} be a family of mappings. The F-induced topology of X is the weakest topology τ on X such that the mappings fj are continuous for every j ∈ J.

Example. Let (X, τX) be a topological space, A ⊂ X, and let ι : A → X be defined by ι(a) = a. Then the {ι}-induced topology on A is τX|A := {U ∩ A | U ∈ τX}. This is called the relative topology of A, see Definition A.7.18. Let f : X → Y. The restriction f|A = f ◦ ι : A → Y satisfies f|A(a) = f(a) for every a ∈ A ⊂ X.

Exercise A.18.2. Prove Tietze's Extension Theorem: Let X be a compact Hausdorff space, K ⊂ X closed and f ∈ C(K). Then there exists F ∈ C(X) such that F|K = f. (Hint: approximate F by continuous functions that would exist by Urysohn's lemma.)

Example. Let (X, τ) be a topological space. Let σ be the C(X) = C(X, τ)-induced topology, i.e., the weakest topology on X making all the τ-continuous functions continuous. Obviously, σ ⊂ τ, and C(X, σ) = C(X, τ). If (X, τ) is a compact Hausdorff space it is easy to check that σ = τ.

Example. Let X, Y be topological spaces with bases BX, BY, respectively. Recall that the product topology for X × Y = {(x, y) | x ∈ X, y ∈ Y} has a base {U × V | U ∈ BX, V ∈ BY}. This topology is actually induced by the family {pX : X × Y → X, pY : X × Y → Y}, where the coordinate projections pX and pY are defined by pX((x, y)) = x and pY((x, y)) = y.

Definition A.18.3 (Product topology). Let Xj be a set for every j ∈ J. The Cartesian product
X = ∏_{j∈J} Xj
is the set of the mappings
x : J → ⋃_{j∈J} Xj such that ∀j ∈ J : x(j) ∈ Xj.
Due to the Axiom of Choice, X is non-empty if all Xj are non-empty. The mapping
pj : X → Xj, x → xj := x(j),
is called the jth coordinate projection. Let (Xj, τj) be topological spaces. Let X := ∏_{j∈J} Xj be the Cartesian product. Then the {pj | j ∈ J}-induced topology on X is called the product topology of X. If Xj = Y for all j ∈ J, it is customary to write
∏_{j∈J} Xj = Y^J = {f | f : J → Y}.
Let us state a basic property of induced topologies:

Proposition A.18.4. Let X have the F-induced topology, and Y be a topological space. A mapping g : Y → X is continuous if and only if f ◦ g is continuous for every f ∈ F.

Proof. If g is continuous then the composed mapping f ◦ g is continuous for every f ∈ F, by Proposition A.10.10. Conversely, suppose fj ◦ g is continuous for every fj ∈ F, fj : X → Xj. Let y ∈ Y, V ⊂ X be open, g(y) ∈ V. Then there exist {fjk}_{k=1}^{n} ⊂ F and open sets Wjk ⊂ Xjk such that
g(y) ∈ ⋂_{k=1}^{n} fjk^{-1}(Wjk) ⊂ V.
Let
U := ⋂_{k=1}^{n} (fjk ◦ g)^{-1}(Wjk).
Then U ⊂ Y is open, y ∈ U, and g(U) ⊂ V; hence g : Y → X is continuous at an arbitrary point y ∈ Y, i.e., g ∈ C(Y, X).

Remark A.18.5 (Hausdorff preserved in products). It is easy to see that a Cartesian product of Hausdorff spaces is always Hausdorff: if X = ∏_{j∈J} Xj and x, y ∈ X, x ≠ y, then there exists j ∈ J such that xj ≠ yj. Therefore there are open sets Uj, Vj ⊂ Xj such that
xj ∈ Uj, yj ∈ Vj, Uj ∩ Vj = ∅.
Let U := pj^{-1}(Uj) and V := pj^{-1}(Vj). Then U, V ⊂ X are open,
x ∈ U, y ∈ V, U ∩ V = ∅.
Also compactness is preserved in products; this is stated in Tihonov's Theorem (Tychonoff's Theorem). Before proving this we introduce a tool, which can be compared with Proposition A.11.5:

Definition A.18.6 (Non-Empty Finite InterSection (NEFIS) property). Let X be a set. Let NEFIS(X) be the set of those families F ⊂ P(X) such that every finite subfamily of F has a non-empty intersection. In other words, a family F ⊂ P(X) belongs to NEFIS(X) if and only if ⋂F′ ≠ ∅ for every finite subfamily F′ ⊂ F.
Lemma A.18.7. A topological space X is compact if and only if F ∉ NEFIS(X) whenever F ⊂ P(X) is a family of closed sets satisfying ⋂F = ∅.

Proof. Let X be a set, U ⊂ P(X), and F := {X \ U | U ∈ U}. Then
⋂F = ⋂_{U∈U} (X \ U) = X \ ⋃U,
so that U is a cover of X if and only if ⋂F = ∅. Now the claim follows from the definition of compactness.
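The key identity in this proof, that the intersection of the closed complements is the complement of the union, can be sanity-checked on random finite families. A throwaway sketch (the universe and family sizes are arbitrary choices):

```python
import random

random.seed(0)
X = frozenset(range(8))

def intersection(family, universe):
    # Intersection of a family of subsets, within the given universe.
    result = set(universe)
    for S in family:
        result &= S
    return frozenset(result)

for _ in range(100):
    # A random family U of "open" subsets of X.
    U = [frozenset(x for x in X if random.random() < 0.4) for _ in range(4)]
    F = [X - S for S in U]  # the dual family of "closed" complements
    # De Morgan: intersection of F = complement of the union of U, so
    # U covers X exactly when F has empty intersection.
    covers = frozenset().union(*U) == X
    assert intersection(F, X) == X - frozenset().union(*U)
    assert covers == (len(intersection(F, X)) == 0)
print("cover/intersection duality verified")
```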
Theorem A.18.8 (Tihonov's Theorem (1935)). Let Xj be a compact space for every j ∈ J. Then X = ∏_{j∈J} Xj is compact.
Proof. To avoid the trivial case, suppose Xj ≠ ∅ for every j ∈ J. Let F ∈ NEFIS(X) be a family of closed sets. In order to prove the compactness of X we have to show that ⋂F ≠ ∅. Let P := {G ∈ NEFIS(X) | F ⊂ G}. Let us equip the set P with a partial order relation ≤:
G ≤ H ⇐⇒ G ⊂ H.
The Hausdorff Maximal Principle A.4.9 says that the chain {F} ⊂ P belongs to a maximal chain C ⊂ P. The reader may verify that G := ⋃C ∈ P is a maximal element of P. Notice that the maximal element G may contain non-closed sets. For every j ∈ J the family {pj(G) | G ∈ G} belongs to NEFIS(Xj). Define
Gj := {cl(pj(G)) | G ∈ G},
where cl denotes the closure in Xj. Clearly also Gj ∈ NEFIS(Xj), and the elements of Gj are closed sets in Xj. Since Xj is compact, we have ⋂Gj ≠ ∅. Hence, by the Axiom of Choice A.4.2, there is an element x := (xj)_{j∈J} ∈ X such that
xj ∈ ⋂Gj.
We shall show that x ∈ ⋂F, which proves Tihonov's Theorem. If Vj ⊂ Xj is a neighbourhood of xj and G ∈ G then
pj(G) ∩ Vj ≠ ∅,
because xj ∈ cl(pj(G)). Thus
G ∩ pj^{-1}(Vj) ≠ ∅
for every G ∈ G, so that G ∪ {pj^{-1}(Vj)} belongs to P; the maximality of G implies that pj^{-1}(Vj) ∈ G. Let V ∈ τX be a neighbourhood of x. Due to the definition of the product topology,
x ∈ ⋂_{k=1}^{n} pjk^{-1}(Vjk) ⊂ V
for some finite index set {jk}_{k=1}^{n} ⊂ J, where Vjk ⊂ Xjk is a neighbourhood of xjk. Due to the maximality of G, any finite intersection of members of G belongs to G, so that
⋂_{k=1}^{n} pjk^{-1}(Vjk) ∈ G.
Therefore for every G ∈ G and V ∈ VτX(x) we have G ∩ V ≠ ∅. Hence x ∈ cl(G) for every G ∈ G, yielding
x ∈ ⋂_{G∈G} cl(G) ⊂ ⋂_{F∈F} cl(F) = ⋂_{F∈F} F = ⋂F,
since F ⊂ G and the members of F are closed; so that ⋂F ≠ ∅.
Remark A.18.9. Actually, Tihonov’s Theorem A.18.8 is equivalent to the Axiom of Choice A.4.2; we shall not prove this.
A.19
Metrisable topologies
It is often very useful to know whether a topology on a space comes from some metric. Here we try to construct metrics on compact spaces. We shall learn that a compact space X is metrisable if and only if the corresponding normed algebra C(X) is separable. Metrisability is equivalent to the existence of a countable family of continuous functions separating the points of the space. As a vague analogy to manifolds, the reader may view such a countable family as a set of coordinate functions on the space. Definition A.19.1 (Metrisable topology). A topological space (X, τ ) is called metrisable if there exists a metric d on X such that the topology τ is the canonical metric topology of (X, d), i.e., if there exists a metric d on X such that τ = τd .
Example (Discrete topology). The discrete topology on the set X is the collection τ of all subsets of X. This is a metric topology corresponding to the discrete metric.

Exercise A.19.2. Let X, Y be metrisable. Prove that X × Y is metrisable, and that
(xn, yn) → (x, y) in X × Y ⇔ xn → x in X and yn → y in Y.
Remark A.19.3. There are plenty of non-metrisable topological spaces, the easiest example being X with more than one point and with τ = {∅, X}. If X is an infinite-dimensional Banach space then the weak∗-topology (see Definition B.4.35) of X′ := L(X, C) is not metrisable. The distribution spaces D′(Rn), S′(Rn) and E′(Rn) are non-metrisable topological spaces. We shall later prove that for compact Hausdorff spaces metrisability is equivalent to the existence of a countable base.

Exercise A.19.4. Show that (X, τ) is a topological space, where
τ = {U ⊂ X | U = ∅, or X \ U is finite}.
When is this topology metrisable?

Theorem A.19.5. Let (X, τ) be compact. Assume that there exists a countable family F ⊂ C(X) separating the points of X. Then (X, τ) is metrisable.

Proof. Let F = {fn}_{n=0}^{∞} ⊂ C(X) separate the points of X. We can assume that |fn| ≤ 1 for every n ∈ N; otherwise consider for instance the functions x → fn(x)/(1 + |fn(x)|). Let us define
d(x, y) := sup_{n∈N} 2^{-n} |fn(x) − fn(y)|
for every x, y ∈ X. Next we prove that d : X × X → [0, ∞) is a metric: d(x, y) = 0 ⇔ x = y, because {fn}_{n=0}^{∞} is a separating family. Clearly also d(x, y) = d(y, x) for every x, y ∈ X. Let x, y, z ∈ X. We have the triangle inequality:
d(x, z) = sup_{n∈N} 2^{-n} |fn(x) − fn(z)|
        ≤ sup_{n∈N} (2^{-n} |fn(x) − fn(y)| + 2^{-n} |fn(y) − fn(z)|)
        ≤ sup_{m∈N} 2^{-m} |fm(x) − fm(y)| + sup_{n∈N} 2^{-n} |fn(y) − fn(z)|
        = d(x, y) + d(y, z).
Hence d is a metric on X. Finally, let us prove that the metric topology coincides with the original topology, τd = τ. Let x ∈ X, ε > 0. Take N ∈ N such that 2^{-N} < ε. Define
Un := fn^{-1}(Bε(fn(x))) ∈ Vτ(x),  U := ⋂_{n=0}^{N} Un ∈ Vτ(x).
If y ∈ U then
d(x, y) = sup_{n∈N} 2^{-n} |fn(x) − fn(y)| < ε.
Indeed, for n ≤ N we have |fn(x) − fn(y)| < ε, while for n > N we have 2^{-n} |fn(x) − fn(y)| ≤ 2^{1-n} ≤ 2^{-N} < ε.
Thus x ∈ U ⊂ Bε(x) = {y ∈ X | d(x, y) < ε}. This proves that the original topology τ is finer than the metric topology τd, i.e., τd ⊂ τ. Combined with the facts that (X, τ) is compact and (X, τd) is Hausdorff, this implies that we must have τd = τ, by Corollary A.12.8.

Corollary A.19.6. Let X be a compact Hausdorff space. Then X is metrisable if and only if it has a countable basis.

Proof. Suppose X is a compact space, metrisable with a metric d. Let r > 0. Then Br = {Bd(x, r) | x ∈ X} is an open cover of X, thus having a finite subcover B′r ⊂ Br. Then B := ⋃_{n=1}^{∞} B′_{1/n} is a countable basis for X.

Conversely, suppose X is a compact Hausdorff space with a countable basis B. Then the family
C := {(B1, B2) ∈ B × B | cl(B1) ⊂ B2}
is countable. For each (B1, B2) ∈ C, Urysohn's Lemma (Theorem A.12.11) provides a function fB1B2 ∈ C(X) satisfying fB1B2(cl(B1)) = {0} and fB1B2(X \ B2) = {1}. Next we show that the countable family F = {fB1B2 : (B1, B2) ∈ C} ⊂ C(X) separates the points of X. Indeed, take x, y ∈ X, x ≠ y. Then W := X \ {y} ∈ V(x). Since X is a compact Hausdorff space, by Corollary A.12.6 there exists U ∈ V(x) such that cl(U) ⊂ W. Take B1, B2 ∈ B such that x ∈ B1 ⊂ cl(B1) ⊂ B2 ⊂ U. Then fB1B2(x) = 0 ≠ 1 = fB1B2(y). Thus X is metrisable.

Corollary A.19.7. Let X be a compact Hausdorff space. Then X is metrisable if and only if C(X) is separable.

Proof. Suppose X is a metrisable compact space. Let F ⊂ C(X) be a countable family separating the points of X (as in the proof of the previous corollary). Let G be the set of finite products of functions f for which f ∈ F ∪ F∗ ∪ {1}; the set G = {gj}_{j=0}^{∞} is countable. The linear span A of G is the involutive algebra generated by F (the smallest ∗-algebra containing F, see Definition D.5.1); due to the Stone–Weierstrass Theorem (see Theorem A.14.4), A is dense in C(X). If S ⊂ C is a countable dense set then
{λ0 1 + Σ_{j=1}^{n} λj gj | n ∈ Z+, (λj)_{j=0}^{n} ⊂ S}
is a countable dense subset of A, thereby dense in C(X).
Conversely, assume that F = {fn}_{n=0}^{∞} ⊂ C(X) is a dense subset. Take x, y ∈ X, x ≠ y. By Urysohn's Lemma (Theorem A.12.11) there exists f ∈ C(X) such that f(x) = 0 ≠ 1 = f(y). Take fn ∈ F such that ‖f − fn‖ < 1/2. Then
|fn(x)| < 1/2 and |fn(y)| > 1/2,
so that fn(x) ≠ fn(y); thus F separates the points of X.
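Corollary A.19.7 rests on the density of a countable family of functions. On X = [0, 1] one concrete witness of density (not the construction used in the proof above) is the family of Bernstein polynomials of f, whose sup-distance to f shrinks as the degree grows; a numerical sketch:

```python
import math

def bernstein(f, n, x):
    # Degree-n Bernstein polynomial of f evaluated at x in [0, 1].
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = math.exp  # some f in C([0, 1])
grid = [i / 200 for i in range(201)]
# Sup-norm error over the grid, for increasing polynomial degrees.
errs = [max(abs(bernstein(f, n, x) - f(x)) for x in grid) for n in (4, 16, 64)]
print([round(e, 4) for e in errs])
assert errs[0] > errs[1] > errs[2] and errs[2] < 0.01
```

The degrees and the test function are arbitrary; the point is only that polynomial approximants exist, in line with the Stone–Weierstrass Theorem invoked in the proof.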
Exercise A.19.8. Prove that a topological space with a countable basis is separable. Prove that a metric space has a countable basis if and only if it is separable.

Exercise A.19.9. There are non-metrisable separable compact Hausdorff spaces! Prove that
X = {f : [0, 1] → [0, 1] | x ≤ y ⇒ f(x) ≤ f(y)},
endowed with the relative topology of the product space [0, 1]^{[0,1]}, is such a space. Hint: Tihonov's Theorem.
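The metric of Theorem A.19.5 can also be probed numerically. On X = [0, 1] the (truncated) family fn(x) = x^{n+1}, with |fn| ≤ 1, separates points, and the resulting d verifies the metric axioms on random samples; a sketch with arbitrary truncation and sample sizes:

```python
import random

# A separating family on X = [0, 1]: f_n(x) = x**(n+1), each bounded by 1.
fns = [lambda x, k=k: x ** (k + 1) for k in range(20)]

def d(x, y):
    # sup (here: max over a finite truncation) of 2^{-n} |f_n(x) - f_n(y)|
    return max(2.0 ** (-n) * abs(f(x) - f(y)) for n, f in enumerate(fns))

random.seed(1)
for _ in range(1000):
    x, y, z = (random.random() for _ in range(3))
    assert d(x, y) == d(y, x)                       # symmetry
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12     # triangle inequality
assert d(0.5, 0.5) == 0.0
print("metric axioms hold on samples")
```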
A.20 Topology via generalised sequences

Definition A.20.1 (Directed set). A non-empty set J is directed if there exists a relation "≤" ⊂ J × J (where (x, y) ∈ ≤ is usually denoted by x ≤ y) such that for every x, y, z ∈ J it holds that
1. x ≤ x,
2. if x ≤ y and y ≤ z then x ≤ z,
3. there exists w ∈ J such that x ≤ w and y ≤ w.

Definition A.20.2 (Nets and convergence). A net (or a generalised sequence) in a topological space (X, τ) is a mapping (j → xj) : J → X, denoted also by (xj)_{j∈J}, where J is a directed set. If K ⊂ J is a directed set (with respect to the natural inherited relation ≤) then the net (xj)_{j∈K} is called a subnet of the net (xj)_{j∈J}. A net (xj)_{j∈J} converges to a point p ∈ X, denoted by
xj → p or lim_{j∈J} xj = p,
if for every neighbourhood U of p there exists jU ∈ J such that xj ∈ U whenever jU ≤ j.

Example. A sequence (xj)_{j∈Z+} is a net, where Z+ is directed by the usual partial order; sequences characterise topology in spaces of countable local bases, for instance metric spaces. But there are more complicated topologies, where sequences are not enough; for instance, the weak∗-topology for the dual of an infinite-dimensional topological vector space.

Exercise A.20.3 (Nets and closure). Let X be a topological space. Show that p ∈ X belongs to the closure of S ⊂ X if and only if there exists a net (xj)_{j∈J} : J → S such that xj → p.
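Definition A.20.2 covers, for example, nets of partial sums indexed by the finite subsets of a countable set, directed by inclusion; convergence of such a net is unordered summability. A small sketch (the index set, terms, and tolerance are arbitrary choices):

```python
from itertools import combinations

# Directed set J: finite subsets of {0,...,9}, ordered by inclusion.
# Net x_F := sum of 2^{-k} over k in F; its limit is the full sum.
terms = {k: 2.0 ** (-k) for k in range(10)}
limit = sum(terms.values())

eps = 0.01
# Witness F0 = {0,...,7}: the omitted tail 2^{-8} + 2^{-9} is below eps,
# so every F containing F0 has |x_F - limit| < eps.
F0 = frozenset(range(8))
others = [k for k in terms if k not in F0]
for r in range(len(others) + 1):
    for extra in combinations(others, r):
        F = F0 | frozenset(extra)
        assert abs(sum(terms[k] for k in F) - limit) < eps
print("net of partial sums is eventually within eps of its limit")
```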
Exercise A.20.4 (Nets and continuity). Show that a function f : X → Y is continuous at p ∈ X if and only if f(xj) → f(p) whenever xj → p for nets (xj)_{j∈J} in X.

Exercise A.20.5 (Nets and compactness). Show that a topological space X is compact if and only if every net in X has a converging subnet.

Exercise A.20.6. In the spirit of Exercises A.20.3, A.20.4 and A.20.5, express other topological concepts via nets.
Chapter B
Elementary Functional Analysis We assume that the reader already has knowledge of (complex) matrices, determinants, etc. In this chapter, we shall present basic machinery for dealing with vector spaces, especially Banach and Hilbert spaces. We do not go into depth in this direction as there are plenty of excellent specialised monographs available devoted to various aspects of the subject, see, e.g., [11, 35, 53, 59, 63, 70, 87, 89, 90, 116, 134, 146, 153]. However, we still make an independent presentation of a collection of results which are indispensable for anyone working in analysis, and which are useful for other parts of this book.
B.1
Vector spaces
Definition B.1.1 (Vector space). Let K ∈ {R, C}. A K-vector space (or a vector space over the field K, or a vector space if K is implicitly known) is a set V endowed with mappings ((x, y) → x + y) : V × V → V, ((λ, x) → λx) : K × V → V such that there exists an origin 0 ∈ V and such that the following properties hold: (x + y) + z = x + (y + z), x + 0 = x, x + (−1)x = 0, x + y = y + x, 1x = x, λ(μx) = (λμ)x, λ(x + y) = λx + λy, (λ + μ)x = λx + μx
for all x, y, z ∈ V and λ, μ ∈ K. We may write x + y + z := (x + y) + z and −x := (−1)x. Elements of a vector space are called vectors.

Definition B.1.2 (Convex and balanced sets). A subset C of a vector space is convex if tx + (1 − t)y ∈ C for every x, y ∈ C whenever 0 < t < 1. A subset B of a vector space is balanced if λx ∈ B for every x ∈ B whenever |λ| ≤ 1.

Example. K ∈ {R, C} is itself a vector space over K, likewise Kn with the operations (xk)_{k=1}^{n} + (yk)_{k=1}^{n} := (xk + yk)_{k=1}^{n} and λ(xk)_{k=1}^{n} := (λxk)_{k=1}^{n}.

Example. Let V be a K-vector space and X ≠ ∅. The set V^X of mappings f : X → V is a K-vector space with the pointwise operations (f + g)(x) := f(x) + g(x) and (λf)(x) := λ f(x). The vector space Kn can be naturally identified with K^X, where X = {k ∈ Z+ : k ≤ n}.

Example. Let V be a vector space such that its vector operations restricted to W ⊂ V endow this subset with the vector space structure. Then W is called a vector subspace. A vector space V always has the trivial subspaces {0} and V. The vector space V^X has, e.g., the subspace {f : X → V | ∀x ∈ K : f(x) = 0}, where K ⊂ X is a fixed subset.

Definition B.1.3 (Algebraic basis). Let V be a vector space and S ⊂ V. Let us write
Σ_{x∈S} λ(x) x = Σ_{x∈S: λ(x)≠0} λ(x) x
when λ : S → K is finitely supported, i.e., {x ∈ S : λ(x) ≠ 0} is finite. The span of a subset S of a vector space V is
span(S) := {Σ_{x∈S} λ(x) x | λ : S → K finitely supported}.
Thus span(S) is the smallest subspace containing S ⊂ V. A subset S of a K-vector space is said to be linearly independent if
Σ_{x∈S} λ(x) x = 0 ⇒ λ ≡ 0.
A subset is linearly dependent if it is not linearly independent. A subset S ⊂ V is called an algebraic basis (or a Hamel basis) of V if S is linearly independent and V = span(S).

Remark B.1.4. Let B be an algebraic basis for V. Then there exists a unique set of functions (x → ⟨x, b⟩B) : V → K such that
x = Σ_{b∈B} ⟨x, b⟩B b
for every x ∈ V. Notice that ⟨x, b⟩B ≠ 0 for at most finitely many b ∈ B. Consider this, e.g., with respect to the vector space K^X in the example before.
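In K^n the coordinate functionals ⟨·, b⟩_B of Remark B.1.4 amount to solving a linear system. A sketch with a hypothetical basis of R^3 (numpy assumed available; the basis and vector are arbitrary):

```python
import numpy as np

# A basis B of R^3 (columns of the matrix below); the coordinates
# <x, b>_B of a vector x are obtained by solving B c = x.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
x = np.array([2.0, 3.0, 5.0])
c = np.linalg.solve(B, x)        # the unique coordinates of x in basis B
assert np.allclose(B @ c, x)     # x = sum_b <x,b>_B b
print(c)
```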
Example. The canonical algebraic basis for Kn is {ek}_{k=1}^{n}, where ek = (δjk)_{j=1}^{n} with δjk = 1 if j = k and δjk = 0 otherwise.

Lemma B.1.5. Every vector space V ≠ {0} has an algebraic basis. Moreover, any two algebraic bases have the same cardinality.

Proof. Let F be the family of all linearly independent subsets of V. Now F ≠ ∅, because {x} ∈ F for every x ∈ V \ {0}. Endow F with a partial order by inclusion. Let C ⊂ F be a chain and let F := ⋃C. It is easy to check that F ∈ F is an upper bound for C. Thereby there is a maximal element M ∈ F. Obviously, M is an algebraic basis for V.

Let A, B be algebraic bases for V. The reader may prove (by induction) that card(A) = card(B) when A is finite. So suppose card(A) ≤ card(B), where A is infinite. Now card(A) = card(S), where
S := {(a, b) ∈ A × B : ⟨a, b⟩B ≠ 0}.
Assume card(A) < card(B). Thus ∃b0 ∈ B ∀a ∈ A : ⟨a, b0⟩B = 0. But then
b0 = Σ_{a∈A} ⟨b0, a⟩A a
   = Σ_{a∈A} ⟨b0, a⟩A Σ_{b∈B} ⟨a, b⟩B b
   = Σ_{b∈B} (Σ_{a∈A} ⟨b0, a⟩A ⟨a, b⟩B) b
   = Σ_{b∈B\{b0}} (Σ_{a∈A} ⟨b0, a⟩A ⟨a, b⟩B) b
   ∈ span(B \ {b0}),
contradicting the linear independence of B. Thus card(A) = card(B).
Definition B.1.6 (Algebraic dimension). By Lemma B.1.5, we may define the algebraic dimension dim(V) of a vector space V to be the cardinality of any of its algebraic bases. The vector space V is said to be finite-dimensional if dim(V) is finite, and infinite-dimensional otherwise. (Here we use the notation card(A) for the cardinality of A to avoid any confusion with the notation for norms; see Definition A.4.13.)
Definition B.1.7 (Linear operators and functionals). Let V, W be K-vector spaces. A mapping A : V → W is called a linear operator (or a linear mapping), denoted A ∈ L(V, W), if A(u + v) = A(u) + A(v) and A(λv) = λ A(v) for every u, v ∈ V and λ ∈ K. Then it is customary to write Av := A(v), and L(V) := L(V, V). A linear mapping f : V → K is called a linear functional.

Definition B.1.8 (Kernel and image). The kernel Ker(A) ⊂ V of a linear operator A : V → W is defined by Ker(A) := {u ∈ V : Au = 0}, where 0 is the origin of the vector space W. The image Im(A) ⊂ W of A is defined by Im(A) := {Au : u ∈ V}.

Exercise B.1.9. Show that Ker(A) is a vector subspace of V and that Im(A) is a vector subspace of W.

Exercise B.1.10. Let C ⊂ V be convex and A ∈ L(V, W). Show that A(C) ⊂ W is convex.

Definition B.1.11 (Spectrum of an operator). Let V be a K-vector space. Let I ∈ L(V) denote the identity operator (x → x) : V → V. The spectrum of A ∈ L(V) is
σ(A) := {λ ∈ K : λI − A is not bijective}.

Exercise B.1.12. Appealing to the Fundamental Theorem of Algebra, show that σ(A) ≠ ∅ for A ∈ L(Cn).

Exercise B.1.13. Give an example where σ(A) = ∅ ≠ σ(A²).

Exercise B.1.14. Show that σ(A) = {0} if A is nilpotent, i.e., if A^k = 0 for some k ∈ Z+.

Exercise B.1.15. Show that σ(AB) ∪ {0} = σ(BA) ∪ {0} in general, and that σ(AB) = σ(BA) if A is bijective.

Definition B.1.16 (Quotient vector space). Let M be a subspace of a K-vector space V. Let us endow the quotient set V/M := {x + M | x ∈ V} with the operations
((x + M, y + M) → x + y + M) : V/M × V/M → V/M,
((λ, y + M) → λy + M) : K × V/M → V/M.
Then it is easy to show that with these operations, this so-called quotient vector space is indeed a vector space.

Remark B.1.17. In the case of a topological vector space V (see Definition B.2.1), the quotient V/M is endowed with the quotient topology, and then V/M is a topological vector space if and only if the original subspace M ⊂ V was closed.
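Exercise B.1.15's identity σ(AB) ∪ {0} = σ(BA) ∪ {0} can be observed numerically for rectangular factors, where AB and BA even have different sizes. A sketch (numpy assumed available; the matrix shapes and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
ev_ab = np.sort_complex(np.linalg.eigvals(A @ B))   # 3 eigenvalues
ev_ba = np.sort_complex(np.linalg.eigvals(B @ A))   # 5 eigenvalues
# The nonzero spectra coincide; BA carries extra (numerically tiny) zeros.
nz_ab = ev_ab[np.abs(ev_ab) > 1e-9]
nz_ba = ev_ba[np.abs(ev_ba) > 1e-9]
assert len(nz_ab) == len(nz_ba) and np.allclose(nz_ab, nz_ba)
print(len(ev_ba) - len(nz_ba), "eigenvalues of BA are (numerically) zero")
```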
B.1.1
Tensor products
The basic idea in multilinear algebra is to linearise multilinear operators. The functional analytic foundation is provided by tensor products, which we concisely review here. We also introduce locally convex spaces and Fréchet spaces as well as Montel and nuclear spaces.

Definition B.1.18 (Bilinear mappings). Let Xj (1 ≤ j ≤ r) and V be K-vector spaces (that is, vector spaces over the field K). A mapping A : X1 × X2 → V is 2-linear (or bilinear) if x → A(x, x2) and x → A(x1, x) are linear mappings for each xj ∈ Xj. The reader may guess what conditions an r-linear mapping X1 × · · · × Xr → V satisfies.

Definition B.1.19 (Tensor product of spaces). The algebraic tensor product of K-vector spaces X1, ..., Xr is a K-vector space V endowed with an r-linear mapping i such that for every K-vector space W and for every r-linear mapping
A : X1 × · · · × Xr → W
there exists a (unique) linear mapping Ã : V → W satisfying Ã ◦ i = A. (The reader is encouraged to draw a commutative diagram involving the vector spaces and mappings i, A, Ã!) Any two tensor products for X1, ..., Xr can easily be seen to be isomorphic, so that we may denote the tensor product of these vector spaces by X1 ⊗ · · · ⊗ Xr.

In fact, such a tensor product always exists. Indeed, let X, Y be K-vector spaces. We may formally define the set B := {x ⊗ y | x ∈ X, y ∈ Y}, where x ⊗ y = a ⊗ b if and only if x = a and y = b. Let Z be the K-vector space with basis B, i.e.,
Z = span{x ⊗ y | x ∈ X, y ∈ Y}
  = {Σ_{j=0}^{n} λj (xj ⊗ yj) : n ∈ N, λj ∈ K, xj ∈ X, yj ∈ Y}.
Let
[0 ⊗ 0] := span{ α1β1 (x1 ⊗ y1) + α1β2 (x1 ⊗ y2) + α2β1 (x2 ⊗ y1) + α2β2 (x2 ⊗ y2) − (α1x1 + α2x2) ⊗ (β1y1 + β2y2) : αj, βj ∈ K, xj ∈ X, yj ∈ Y }.
For z ∈ Z, let [z] := z + [0 ⊗ 0]. The tensor product of X, Y is the K-vector space
X ⊗ Y := Z/[0 ⊗ 0] = {[z] | z ∈ Z},
where ([z1], [z2]) → [z1 + z2] and (λ, [z]) → [λz] are well-defined mappings (X ⊗ Y) × (X ⊗ Y) → X ⊗ Y and K × (X ⊗ Y) → X ⊗ Y, respectively.

Definition B.1.20 (Tensor product of operators). Let X, Y, V, W be K-vector spaces, and let A : X → V and B : Y → W be linear operators. The tensor product of A, B is the linear operator A ⊗ B : X ⊗ Y → V ⊗ W, which is the unique linear extension of the mapping x ⊗ y → Ax ⊗ By, where x ∈ X and y ∈ Y.

Example. Let X and Y be finite-dimensional K-vector spaces with bases {xi}_{i=1}^{dim(X)} and {yj}_{j=1}^{dim(Y)}, respectively. Then X ⊗ Y has a basis
{xi ⊗ yj | 1 ≤ i ≤ dim(X), 1 ≤ j ≤ dim(Y)}.
Let S be a finite set. Let F(S) be the K-vector space of functions S → K; it has a basis {δx | x ∈ S}, where δx(y) = 1 if x = y, and δx(y) = 0 otherwise. Now it is easy to see that for finite sets S1, S2 the vector spaces F(S1) ⊗ F(S2) and F(S1 × S2) are isomorphic; for fj ∈ F(Sj), we may regard f1 ⊗ f2 ∈ F(S1) ⊗ F(S2) as a function f1 ⊗ f2 ∈ F(S1 × S2) by (f1 ⊗ f2)(x1, x2) := f1(x1) f2(x2).

Definition B.1.21 (Inner product on V ⊗ W). Suppose V, W are finite-dimensional inner product spaces over K. The natural inner product for V ⊗ W is obtained by extending
⟨v1 ⊗ w1, v2 ⊗ w2⟩_{V⊗W} := ⟨v1, v2⟩_V ⟨w1, w2⟩_W.

Definition B.1.22 (Duals of tensor product spaces). The dual (V ⊗ W)′ of a tensor product space V ⊗ W is naturally identified with V′ ⊗ W′.

Alternative approach to tensor products. Now we briefly describe another approach to tensor products.

Definition B.1.23 (Algebraic tensor product). Let K ∈ {R, C}. Let X, Y be K-vector spaces, and X′, Y′ their respective algebraic duals, i.e., the spaces of linear functionals X → K and Y → K. For x ∈ X and y ∈ Y, define the bilinear functional x ⊗ y : X′ × Y′ → K by
(x ⊗ y)(x′, y′) := x′(x) y′(y).
Let B(X′, Y′) denote the space of all bilinear functionals X′ × Y′ → K. The algebraic tensor product (or simply the tensor product) X ⊗ Y is the vector subspace of B(X′, Y′) which is spanned by the set {x ⊗ y : x ∈ X, y ∈ Y}.
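In the finite-dimensional example above, ordering the basis {xi ⊗ yj} lexicographically identifies A ⊗ B with the Kronecker product of matrices. A quick numpy check of the defining relation (A ⊗ B)(x ⊗ y) = Ax ⊗ By (the matrices and vectors are arbitrary choices):

```python
import numpy as np

# In the lexicographic basis {e_i ⊗ f_j}, A ⊗ B acts as np.kron(A, B),
# and the elementary tensor x ⊗ y corresponds to np.kron(x, y).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
x = np.array([1.0, -1.0])
y = np.array([2.0, 5.0])

lhs = np.kron(A, B) @ np.kron(x, y)   # (A ⊗ B)(x ⊗ y)
rhs = np.kron(A @ x, B @ y)           # (Ax) ⊗ (By)
assert np.allclose(lhs, rhs)
print(lhs)
```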
Exercise B.1.24. Show that a : (X ⊗ Y)′ → B(X, Y) is a linear bijection, where a(f)(x, y) := f(x ⊗ y) for f ∈ (X ⊗ Y)′, x ∈ X and y ∈ Y.

Exercise B.1.25. Let X, Y, Z be K-vector spaces. Let B(X, Y; Z) denote the vector space of bilinear mappings X × Y → Z. Find a linear bijection B(X, Y; Z) → L(X ⊗ Y, Z), where L(V, Z) is the vector space of linear mappings V → Z.
B.2
Topological vector spaces
Vector spaces can be combined with topology. For the reader's convenience, if one has not encountered Banach and Hilbert spaces yet, we suggest skipping the sections on topological vector spaces and locally convex spaces at this point, and returning here later.

Definition B.2.1 (Topological vector space). A topological vector space V over a field K ∈ {R, C} is both a topological space and a vector space over K such that {0} ⊂ V is closed and such that the mappings
((λ, x) → λx) : K × V → V,
((x, y) → x + y) : V × V → V
are continuous. The dual space V′ of a topological vector space V consists of continuous linear functionals f : V → K.

Exercise B.2.2. Show that a topological vector space is a Hausdorff space.

Exercise B.2.3. Show that in a topological vector space every neighbourhood of 0 contains a balanced neighbourhood of 0.

Exercise B.2.4. Prove that a topological vector space V is metrisable if and only if it has a countable family {Uj}_{j=1}^{∞} of neighbourhoods of 0 ∈ V such that ⋂_{j=1}^{∞} Uj = {0}. Moreover, show that in this case a compatible metric d : V × V → [0, ∞) can be chosen translation-invariant in the sense that d(x + z, y + z) = d(x, y) for every x, y, z ∈ V.

Definition B.2.5 (Equicontinuity in vector space). Let X be a topological space and V a topological vector space. A family F of mappings f : X → V is called equicontinuous at p ∈ X if for every neighbourhood W ⊂ V of f(p) there exists a neighbourhood U ⊂ X of p such that f(x) ∈ W whenever f ∈ F and x ∈ U.

Remark B.2.6 (NEFIS property and compactness). Recall the Non-Empty Finite Intersection property (NEFIS) from Definition A.18.6: that is, we denote by NEFIS(X) the set of those families F ⊂ P(X) such that every finite subfamily of F has a non-empty intersection. Recall also that a topological space X is compact if and only if ⋂F ≠ ∅ whenever F ∈ NEFIS(X) consists of closed sets.

Definition B.2.7 (Small sets property). Let X be a topological vector space. A family F ⊂ P(X) is said to contain small sets if for every neighbourhood U of 0 ∈ X there exists x ∈ X and S ∈ F such that S ⊂ x + U.
Definition B.2.8 (Completeness of topological vector spaces). A subset S of a topological vector space X is called complete if ⋂F ≠ ∅ whenever F ∈ NEFIS(X) consists of closed subsets of S and contains small sets.

Exercise B.2.9 (Completeness and Cauchy nets). A net (xj)_{j∈J} in a topological vector space X is called a Cauchy net if for every neighbourhood V of 0 ∈ X there exists k = kV ∈ J such that xi − xj ∈ V whenever k ≤ i, j. Prove that S ⊂ X is complete if and only if each Cauchy net in S converges to a point in S.

Exercise B.2.10. Show that a complete subset of a topological vector space is closed, and that a closed subset of a complete topological vector space is complete.

Exercise B.2.11 (Completeness and Cartesian product). Let Xj be a topological vector space for each j ∈ J. Show that the product space X = ∏_{j∈J} Xj is complete if and only if Xj is complete for every j ∈ J.

Definition B.2.12 (Total boundedness in topological vector spaces). A subset S of a topological vector space X is totally bounded if for every neighbourhood U of 0 ∈ X there exists a finite set F ⊂ X such that S ⊂ F + U.

Exercise B.2.13 (Hausdorff Total Boundedness Theorem). Prove the following Hausdorff Total Boundedness Theorem: A subset of a topological vector space is compact if and only if it is totally bounded and complete.

Definition B.2.14 (Completion of a topological vector space). A completion of a topological vector space X is an injective open continuous linear mapping ι : X → X̃, where ι(X) is a dense subset of the complete topological vector space X̃.

Exercise B.2.15 (Existence and uniqueness of completion). Let X be a topological vector space. Show that it has a completion ι : X → X̃, and that this completion is unique in the following sense: if κ : X → Z is another completion, then the linear mapping (ι(x) → κ(x)) : ι(X) → Z has a unique continuous extension to an isomorphism X̃ → Z of topological vector spaces.

Exercise B.2.16 (Extension of continuous linear operators). Let A : X → Y be continuous and linear, where the topological vector spaces X, Y have respective completions ιX : X → X̃ and ιY : Y → Ỹ. Show that there exists a unique continuous linear mapping Ã : X̃ → Ỹ such that Ã ◦ ιX = ιY ◦ A, i.e., that the following diagram is commutative:

X  --A-->  Y
|ιX        |ιY
v          v
X̃  --Ã-->  Ỹ
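Definition B.2.12 can be explored on a sample of [0, 1] ⊂ R: a greedy construction produces, for each ε, a finite set F with S ⊂ F + Bε. A rough sketch (the sample resolution and the greedy selection rule are arbitrary choices, not from the text):

```python
def eps_net(points, eps):
    # Greedy: keep a point only if it is eps-far from all kept points;
    # then every sample point lies within eps of some kept point.
    F = []
    for p in points:
        if all(abs(p - q) >= eps for q in F):
            F.append(p)
    return F

S = [k / 1000 for k in range(1001)]          # dense sample of [0, 1]
for eps in (0.5, 0.1, 0.01):
    F = eps_net(S, eps)
    assert all(any(abs(p - q) < eps for q in F) for p in S)   # S ⊂ F + B_eps
    assert len(F) <= 2 + int(1 / eps)        # finitely many centres suffice
print("sampled [0,1] admits a finite eps-net for each eps")
```

For an unbounded set such as the integers no finite F works for small ε, which is the sense in which total boundedness fails there.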
B.3
Locally convex spaces
A locally convex space is a topological vector space where a local base for the topology can be given by convex neighbourhoods. If the reader is not familiar with Banach and Hilbert spaces yet, we suggest first examining those concepts, and returning to this section only afterwards. This is why in this section we will freely refer to Section B.4 to illustrate the introduced concepts in the simpler setting of Banach spaces. In the sequel, we present some essential results for locally convex spaces in a series of exercises of widely varying difficulty, for which the reader may find help, e.g., from [63], [89] and [134].

Definition B.3.1 (Locally convex spaces). A topological vector space V (over K) is called locally convex if for every neighbourhood U of 0 ∈ V there exists a convex neighbourhood C such that 0 ∈ C ⊂ U.

Exercise B.3.2. Show that in a locally convex space each neighbourhood of 0 contains a convex balanced neighbourhood of 0.

Exercise B.3.3. Let U be the family of all convex balanced neighbourhoods of 0 in a topological vector space V. For U ∈ U, define a so-called Minkowski functional pU : V → [0, ∞) by
pU(x) := inf{λ ∈ R+ : x/λ ∈ U}.
Show that pU is a seminorm (see Definition B.4.1). Moreover, prove that V is locally convex if and only if its topology is induced by the family {pU : V → [0, ∞) | U ∈ U}.

Definition B.3.4 (Fréchet spaces). A locally convex space having a complete (and translation-invariant) metric is called a Fréchet space.

Exercise B.3.5. Show that a locally convex space V is metrisable if and only if it has the following property: there exists a countable collection {pk}_{k=1}^{∞} of continuous seminorms pk : V → [0, ∞) such that for every x ∈ V \ {0} there exists kx ∈ Z+ satisfying pkx(x) ≠ 0 (i.e., the seminorm family separates the points of V).

Exercise B.3.6. Let k ∈ Z+ ∪ {0, ∞} and let U ⊂ Rn be an open non-empty set. Endow the space C^k(U) with a Fréchet space structure.

Exercise B.3.7. Let Ω ⊂ C be open and non-empty.
Endow the space H(Ω) ⊂ C(Ω) of analytic functions f : Ω → C with a structure of a Fréchet space.

Definition B.3.8 (Schwartz space). For f ∈ C^∞(Rⁿ) and α, β ∈ N₀ⁿ, let

p_{αβ}(f) := sup_{x∈Rⁿ} |x^β ∂_x^α f(x)|.
If pαβ (f ) < ∞ for every α, β, then f is called a rapidly decreasing smooth function. The collection of such functions is called the Schwartz space S(Rn ).
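As a small numerical aside (our own illustration, not part of the text), the seminorms p_{αβ} can be approximated on a finite grid. For the Gaussian f(x) = e^{−x²} in dimension n = 1 every seminorm is finite; for instance p_{0,1}(f) = sup_x |x e^{−x²}| = (2e)^{−1/2}. A sketch, where the grid and the explicit derivative are our own choices:

```python
import numpy as np

def schwartz_seminorm(f_deriv, beta, xs):
    """Approximate p_{alpha,beta}(f) = sup_x |x^beta (d/dx)^alpha f(x)|
    on a finite grid xs, given a callable for the alpha-th derivative."""
    return np.max(np.abs(xs**beta * f_deriv(xs)))

xs = np.linspace(-10.0, 10.0, 200001)

# Gaussian f(x) = exp(-x^2): rapidly decreasing, all seminorms finite.
f0 = lambda x: np.exp(-x**2)            # alpha = 0
f1 = lambda x: -2.0 * x * np.exp(-x**2) # alpha = 1

p00 = schwartz_seminorm(f0, 0, xs)  # sup |f| = 1, attained at x = 0
p01 = schwartz_seminorm(f0, 1, xs)  # sup |x f(x)| = (2e)^(-1/2)
```

By contrast, for g(x) = 1/(1+x²) the seminorm with β = 3 grows without bound as the grid widens, witnessing that g is smooth but not rapidly decreasing.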
Chapter B. Elementary Functional Analysis
Exercise B.3.9. Show that the Schwartz space S(Rⁿ) is a Fréchet space.

Definition B.3.10 (LF-space). A C-vector space X is called an LF-space (or a limit of Fréchet spaces) if X = ∪_{j=1}^∞ X_j, where each X_j ⊂ X_{j+1} is a subspace of X having a Fréchet space topology τ_j such that τ_j = {U ∩ X_j : U ∈ τ_{j+1}}. The topology τ of the LF-space X is then generated by the set

{x + V | x ∈ X, V ∈ C, V ∩ X_j ∈ τ_j for every j},

where C is the family of those convex subsets of X that contain 0.

Exercise B.3.11. Let τ be the topology of an LF-space X = ∪_{j=1}^∞ X_j as in Definition B.3.10. Prove that τ_j = {U ∩ X_j | U ∈ τ}. Moreover, show that a linear functional f : X → C is continuous if and only if the restriction f|_{X_j} : X_j → C is continuous for every j.

Exercise B.3.12. Let U ⊂ Rⁿ be an open non-empty set. Let D(U) consist of compactly supported C^∞-smooth functions f : U → C. Endow D(U) with an LF-space structure; this is not a Fréchet space anymore.

Definition B.3.13 (Test functions and distributions). The LF-space D(U) of Exercise B.3.12 is called the space of test functions, and a continuous linear functional f : D(U) → C is called a distribution on U ⊂ Rⁿ. The space of distributions on U is denoted by D′(U).

Exercise B.3.14 (Locally convex Hahn–Banach Theorem). Prove the following analogue of the Hahn–Banach Theorem B.4.25: Let X be a locally convex space (over K) and f : M → K be a continuous linear functional on a vector subspace M ⊂ X. Then there exists a continuous extension F : X → K such that F|_M = f.

Exercise B.3.15. Let X be a K-vector space, and suppose V is a vector space of linear functionals f : X → K that separates the points of X. Show that V induces a locally convex topology on X, and that then the dual X′ = V.

Definition B.3.16 (Weak topology). Let X be a topological vector space such that the dual X′ = L(X, K) separates the points of X. The X′-induced topology is called the weak topology of X.

Exercise B.3.17 (Closure of convex sets).
Let X be locally convex and C ⊂ X convex. Show that the closure of C is the same in both the original topology and the weak topology.

Definition B.3.18 (Weak∗-topology). Let X be a topological vector space. The weak∗-topology of the dual X′ is the topology induced by the family {x̂ | x ∈ X}, where x̂ : X′ → K is defined by x̂(f) := f(x).

Exercise B.3.19. Let x ∈ X. Show that x̂ = (f ↦ f(x)) : X′ → K is linear. Moreover, prove that if a linear functional on X′ is continuous with respect to the weak∗-topology, then it equals x̂ for some x ∈ X.
Exercise B.3.20 (Banach–Alaoglu Theorem in topological vector spaces). Prove the following generalisation of the Banach–Alaoglu Theorem B.4.36: Let X be a topological vector space. Let U ⊂ X be a neighbourhood of 0 ∈ X, and let

K := {f ∈ X′ | ∀x ∈ U : |f(x)| ≤ 1}.

Then K ⊂ X′ is compact in the weak∗-topology.

Definition B.3.21 (Convex hull). The convex hull of a subset S of a vector space X is the intersection of all convex sets that contain S. (Notice that at least X is a convex set containing S.)

Exercise B.3.22. Show that the convex hull of S is the smallest convex set that contains S.

Exercise B.3.23. Show that x ∈ X belongs to the convex hull of S if and only if x = Σ_{k=1}^n t_k x_k for some n ∈ Z⁺, where the vectors x_k ∈ S, and t_k > 0 are such that Σ_{k=1}^n t_k = 1.

Definition B.3.24 (Extreme set). Let K be a subset of a vector space X. A non-empty set E ⊂ K is called an extreme set of K if the conditions x, y ∈ K, tx + (1−t)y ∈ E for some t ∈ (0, 1) imply that x, y ∈ E. A point z ∈ K is called an extreme point of K ⊂ X if {z} is an extreme set of K (alternative characterisation: if x, y ∈ K and z = tx + (1−t)y for some 0 < t < 1 then x = y = z).

Exercise B.3.25 (Krein–Milman Theorem). Prove the following Krein–Milman Theorem: Let X be a locally convex space and K ⊂ X compact. Then K is contained in the closure of the convex hull of the set of the extreme points of K. (Hint: The first problem is the very existence of extreme points. The family of compact extreme sets of K can be ordered by inclusion, and by the Hausdorff Maximal Principle there is a maximal chain. Notice that X′ separates the points of X. . . )

Exercise B.3.26. Let K be a compact subset of a Fréchet space X. Show that the closure of the convex hull of K is compact.

Exercise B.3.27. Let f : G → X be continuous, where X is a Fréchet space and G is a compact Hausdorff space. Let μ be a finite positive Borel measure on G. Show that there exists a unique vector v ∈ X such that

φ(v) = ∫_G φ(f) dμ

for every φ ∈ X′.
Definition B.3.28 (Pettis integral). Let f : G → X, μ and v be as in Exercise B.3.27. Then the vector v ∈ X is called the Pettis integral (or the weak integral) of f with respect to μ, denoted by

v = ∫_G f dμ.
Exercise B.3.29. Let f : G → X and μ be as in Definition B.3.28. Assume that X is even a Banach space. Show that

‖∫_G f dμ‖ ≤ ∫_G ‖f‖ dμ.
Definition B.3.30 (Barreled space). A subset B of a topological vector space X is called a barrel if it is closed, balanced, convex and X = ∪_{t>0} tB. A topological vector space is called barreled if every barrel in it contains a neighbourhood of the origin.

Remark B.3.31. Notice that a barreled space is not necessarily locally convex.

Exercise B.3.32 (LF-spaces are barreled). Show that LF-spaces are barreled.

Definition B.3.33 (Heine–Borel property). A metric space is said to satisfy the Heine–Borel property if its closed and bounded sets are compact.

Definition B.3.34 (Montel space). A barreled locally convex space with the Heine–Borel property is called a Montel space.

Exercise B.3.35. Prove that C^∞(U) and D(U) are Montel spaces, where U ⊂ Rⁿ is open and non-empty.

Exercise B.3.36. Let Ω ⊂ C be open and non-empty. Show that the space H(Ω) of analytic functions on Ω is a Montel space.

Exercise B.3.37. Prove that the Schwartz space S(Rⁿ) is a Montel space.

Exercise B.3.38. Let U ⊂ Rⁿ be open and non-empty. Show that C^k(U) is not a Montel space if k ∈ N₀.
B.3.1 Topological tensor products
In this section we review topological tensor products. If the reader is interested in more details on this subject, we refer to [87] and [134].

Definition B.3.39 (Projective tensor product). Let X ⊗ Y be the algebraic tensor product of locally convex spaces X, Y. The projective tensor topology or the π-topology of X ⊗ Y is the strongest locally convex topology for which the bilinear mapping ((x, y) ↦ x ⊗ y) : X × Y → X ⊗ Y is continuous. This topological space is denoted by X ⊗_π Y, and its completion by X ⊗̂_π Y.
Exercise B.3.40. Let X, Y be locally convex spaces over C. Show that the dual of X ⊗_π Y (and also of X ⊗̂_π Y) is isomorphic to the space of continuous bilinear mappings X × Y → C.

Exercise B.3.41. Let X, Y be locally convex metrisable spaces. Show that X ⊗̂_π Y is a Fréchet space.

Exercise B.3.42. Let X, Y be locally convex metrisable barreled spaces. Show that X ⊗ Y is barreled.

Exercise B.3.43 (Projective Banach tensor product). Let X, Y be Banach spaces. For f ∈ X ⊗ Y, define

‖f‖_π := inf{ Σ_j ‖x_j‖ ‖y_j‖ : f = Σ_j x_j ⊗ y_j }.

Show that f ↦ ‖f‖_π is a norm on X ⊗ Y, and that the corresponding norm topology is the projective tensor topology.

Exercise B.3.44. Let X, Y be locally convex spaces over C. Show that the algebraic tensor product X ⊗ Y can be identified with the space B(X′, Y′) of continuous bilinear functionals X′ × Y′ → C, where X′ and Y′ are the dual spaces with weak∗-topologies.

Definition B.3.45 (Injective tensor product). Let X, Y be locally convex spaces over C. Let B̃(X′, Y′) be the space of those bilinear functionals X′ × Y′ → C that are continuous separately in each variable. Endow B̃(X′, Y′) with the topology τ of uniform convergence on the products of an equicontinuous subset of X′ and an equicontinuous subset of Y′. Interpreting X ⊗ Y ⊂ B̃(X′, Y′) as in Exercise B.3.44, let the injective tensor topology be the restriction of τ to X ⊗ Y. This topological space is denoted by X ⊗_ε Y, and its completion by X ⊗̂_ε Y.

Exercise B.3.46. Let X, Y be locally convex spaces over C. Show that the bilinear mapping ((x, y) ↦ x ⊗ y) : X × Y → X ⊗_ε Y is continuous. From this, deduce that the injective topology of X ⊗ Y is coarser than the projective topology (i.e., is a subset of the projective topology).

Exercise B.3.47. Studying the mapping ((x, y) ↦ x ⊗ y) : X × Y → X ⊗ Y, explain how the inclusion X ⊗̂_π Y ⊂ X ⊗̂_ε Y should be understood.

Exercise B.3.48 (Injective Banach tensor product). Let X, Y be Banach spaces. For f ∈ X ⊗ Y, define

‖f‖_ε := sup{ |x′ ⊗ y′(f)| : x′ ∈ X′, y′ ∈ Y′, ‖x′‖ = 1 = ‖y′‖ }.

Show that f ↦ ‖f‖_ε is a norm on X ⊗ Y, and that the corresponding norm topology is the injective tensor topology.
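For a finite-dimensional illustration (our own aside, not from the text): identifying ℝ^m ⊗ ℝ^n with m×n matrices and taking Euclidean norms on both factors, it is a standard fact that ‖·‖_π becomes the nuclear norm (the sum of the singular values) and ‖·‖_ε the spectral norm (the largest singular value). The inequality ‖f‖_ε ≤ ‖f‖_π implied by Exercise B.3.46 can then be checked numerically:

```python
import numpy as np

def projective_norm(M):
    # pi-norm for Euclidean factors = nuclear norm = sum of singular values
    return np.linalg.svd(M, compute_uv=False).sum()

def injective_norm(M):
    # epsilon-norm for Euclidean factors = spectral norm = largest singular value
    return np.linalg.svd(M, compute_uv=False).max()

M = np.diag([3.0, 4.0])  # the tensor 3 e1 (x) e1 + 4 e2 (x) e2
pi_norm, eps_norm = projective_norm(M), injective_norm(M)  # 7.0 and 4.0

# the epsilon-norm never exceeds the pi-norm (injective topology is coarser)
R = np.arange(6.0).reshape(2, 3)
```

On an elementary tensor x ⊗ y (a rank-one matrix) both norms collapse to ‖x‖ ‖y‖, consistent with both being cross-norms.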
Definition B.3.49 (Nuclear space). A locally convex space X is called nuclear if X ⊗̂_π Y = X ⊗̂_ε Y for every locally convex space Y (where the equality of sets is understood in the sense of Exercise B.3.47). In such a case, these completed tensor products are written simply X ⊗̂ Y.

Exercise B.3.50. Let X, Y be nuclear spaces, and let M, N ⊂ X be vector subspaces such that N is closed. Show that M, X/N, X × Y and X ⊗̂ Y are nuclear spaces.

Exercise B.3.51. Show that C^∞(U), D(U), S(Rⁿ), H(Ω) and their dual spaces (of distributions) are nuclear.

Exercise B.3.52. Let X, Y be Fréchet spaces and X nuclear. Show that L(X′, Y) ≅ X ⊗̂ Y, L(X, Y) ≅ X′ ⊗̂ Y, and that X′ ⊗̂ Y′ ≅ (X ⊗̂ Y)′.

Exercise B.3.53. Prove the following Schwartz Kernel Theorem B.3.55.

Remark B.3.54. In the following Schwartz Kernel Theorem B.3.55, we denote ⟨ψ, Aφ⟩ := (Aφ)(ψ), and ⟨ψ ⊗ φ, K_A⟩ := K_A(ψ ⊗ φ).

Theorem B.3.55 (Schwartz Kernel Theorem). Let U ⊂ R^m, V ⊂ Rⁿ be open and non-empty, and let A : D(U) → D′(V) be linear and continuous. Then there exists a unique distribution K_A ∈ D′(V) ⊗̂ D′(U) ≅ D′(V × U) such that

⟨ψ, Aφ⟩ = ⟨ψ ⊗ φ, K_A⟩

for every φ ∈ D(U) and ψ ∈ D(V). Moreover, if A : D(U) → C^∞(V) is continuous then it can be interpreted that K_A ∈ C^∞(V, D′(U)).

Definition B.3.56 (Schwartz kernel). The distribution K_A in Theorem B.3.55 is called the Schwartz kernel of A, written informally as

Aφ(x) = ∫_U K_A(x, y) φ(y) dy.

Exercise B.3.57. Let A : D(U) → D′(V) be continuous and linear as in Theorem B.3.55. Give necessary and sufficient conditions on A such that K_A ∈ C^∞(V × U).

Exercise B.3.58. Find variants of the Schwartz Kernel Theorem B.3.55 for Schwartz functions and for tempered distributions.
B.4 Banach spaces
Definition B.4.1 (Seminorm and norm; normed spaces). Let X be a K-vector space. A mapping p : X → R is a seminorm if

p(x + y) ≤ p(x) + p(y) and p(λx) = |λ| p(x)

for every x, y ∈ X and λ ∈ K. If p : X → R is a seminorm for which p(x) = 0 implies x = 0, then it is called a norm. Typically, a norm on X is written as x ↦ ‖x‖_X or simply ‖x‖. A vector space with a norm is called a normed space.

Example. On the vector space K, the absolute value mapping x ↦ |x| is a norm.

Exercise B.4.2. Let p : X → [0, ∞) be a seminorm on a K-vector space X and

x ∼ y ⟺ p(x − y) = 0 (by definition).

Prove the following claims:
(a) ∼ is an equivalence relation on X.
(b) The set L := {[x] : x ∈ X}, with [x] := {y ∈ X : x ∼ y}, is a K-vector space when endowed with the operations

[x] + [y] := [x + y], λ[x] := [λx]

and the norm [x] ↦ p(x).

Exercise B.4.3. Let w_j ≥ 0 for every j ∈ J. Define

Σ_{j∈J} w_j := sup{ Σ_{k∈K} w_k : K ⊂ J finite }.

Show that {j ∈ J : w_j > 0} is at most countable if Σ_{j∈J} w_j < ∞.

Exercise B.4.4. For x ∈ K^J define

‖x‖_p := (Σ_{j∈J} |x_j|^p)^{1/p} if 1 ≤ p < ∞, and ‖x‖_∞ := sup_{j∈J} |x_j|,

where x_j := x(j). Show that ℓ^p(J) := {x ∈ K^J : ‖x‖_p < ∞} is a Banach space with respect to the norm x ↦ ‖x‖_p.

Exercise B.4.5. Norms p₁, p₂ on a vector space V are called (Lipschitz) equivalent if

a⁻¹ p₁(x) ≤ p₂(x) ≤ a p₁(x)

for every x ∈ V, where a ≥ 1 is a constant. Show that any two norms on a finite-dimensional space V are equivalent. Consequently, a finite-dimensional normed space is a Banach space.

Exercise B.4.6. Let K be a compact space. Show that C(K) := {f : K → K | f continuous} is a Banach space when endowed with the norm

f ↦ ‖f‖_{C(K)} := sup_{x∈K} |f(x)|.
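As a quick numerical check (our own illustration), the ℓ^p-norms of Exercise B.4.4 on Kⁿ satisfy ‖x‖_∞ ≤ ‖x‖₂ ≤ ‖x‖₁ ≤ n ‖x‖_∞, an instance of the norm equivalence of Exercise B.4.5 in finite dimension:

```python
import numpy as np

def lp_norm(x, p):
    """The l^p norm of Exercise B.4.4 for a finite index set J."""
    x = np.abs(np.asarray(x, dtype=float))
    return x.max() if p == np.inf else (x**p).sum() ** (1.0 / p)

x = [3.0, -4.0, 0.0]
n1, n2, ninf = lp_norm(x, 1), lp_norm(x, 2), lp_norm(x, np.inf)  # 7, 5, 4
# equivalence constants on K^n:  ninf <= n2 <= n1 <= len(x) * ninf
```

In infinite dimensions no such uniform constants exist, which is exactly why the ℓ^p spaces for different p are genuinely different Banach spaces.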
Remark B.4.7. The previous exercises deal with special cases of L^p(μ), the Lebesgue p-spaces. These Banach spaces are introduced in Definition C.4.6.

Definition B.4.8 (Normed and Banach spaces). Notice that the norm metric ((x, y) ↦ ‖x − y‖) : X × X → R is a metric on X. Let τ_X denote the corresponding metric topology, called the norm topology, where the open ball centered at x ∈ X with radius r > 0 is

B_X(x, r) = B(x, r) = {y ∈ X : ‖x − y‖ < r}.

The ball B_X(0, 1) is called the open unit ball. The closed ball centered at x ∈ X with radius r > 0 is

B̄(x, r) := {y ∈ X : ‖x − y‖ ≤ r}.

Notice that here B̄(x, r) coincides with the closure of B(x, r), where S̄ refers to the norm closure of a set S ⊂ X. A Banach space is a normed space where the norm metric is complete.

Exercise B.4.9. Show that V := {x ∈ ℓ^p(J) : {j ∈ J : x_j ≠ 0} finite} is a dense normed vector subspace of ℓ^p(J).

Definition B.4.10 (Bounded linear operators). A linear mapping A : X → Y between normed spaces X, Y is called bounded, denoted A ∈ L(X, Y), if ‖Ax‖ ≤ C ‖x‖ for every x ∈ X, where C < ∞ is a constant. The norm of A ∈ L(X, Y) is

‖A‖ := sup{ ‖Ax‖ : x ∈ X, ‖x‖ ≤ 1 }.

This norm is also called the operator norm and is often denoted by ‖A‖_op. We often abbreviate L(X) := L(X, X).

Exercise B.4.11. Let A : X → Y be a linear operator between normed spaces X and Y. Show that A is bounded if and only if it is continuous.

Exercise B.4.12. Show that L(X, Y) is really a normed space.

Exercise B.4.13. Show that ‖AB‖ ≤ ‖A‖ ‖B‖ if B ∈ L(X, Y) and A ∈ L(Y, Z).

Exercise B.4.14. Show that L(X, Y) is a Banach space if Y is a Banach space.

Definition B.4.15 (Duals). Let V be a Banach space over K. The dual of V is the space

V′ = L(V, K) := {A : V → K | A bounded and linear}

endowed with the (operator) norm

‖A‖ := sup{ |A(f)| : f ∈ V, ‖f‖_V ≤ 1 }.
Exercise B.4.16. Prove that V′ is a Banach space.

Definition B.4.17 (Compact linear operator). Let X, Y be normed spaces, and let B = B̄(0, 1) = {x ∈ X : ‖x‖ ≤ 1}. A linear mapping A : X → Y is called compact, written A ∈ LC(X, Y), if the closure of A(B) ⊂ Y is compact. We also write LC(X) := LC(X, X).

Exercise B.4.18. Show that LC(X, Y) is a linear subspace of L(X, Y), and that it is closed if Y is complete.

Exercise B.4.19. Let B₀, C₀ ∈ L(X, Y) and B₁, C₁ ∈ L(Y, Z) be such that C₀, C₁ are compact. Show that C₁B₀ and B₁C₀ are compact.

Lemma B.4.20 (Almost Orthogonality Lemma [F. Riesz]). Let X be a normed space with a closed subspace Y ≠ X. For each ε > 0 there exists x_ε ∈ X such that ‖x_ε‖ = 1 and dist(x_ε, Y) ≥ 1 − ε.

Proof. Let z ∈ X \ Y and r := dist(z, Y) > 0. Take y = y_ε ∈ Y such that r ≤ ‖z − y‖ < (1 − ε)⁻¹ r. Let x_ε := (z − y)/‖z − y‖. If u ∈ Y then

‖x_ε − u‖ = ‖(z − y)/‖z − y‖ − u‖ = ‖z − (y + ‖z − y‖ u)‖ / ‖z − y‖ ≥ r / ‖z − y‖ > r / ((1 − ε)⁻¹ r) = 1 − ε,

showing that dist(x_ε, Y) ≥ 1 − ε.

Theorem B.4.21 (Riesz's Compactness Theorem). Let X be a normed space. Then X is finite-dimensional if and only if B̄(0, 1) is compact.

Proof. By the Heine–Borel Theorem, a closed set in a finite-dimensional normed space is compact if and only if it is bounded. Now let X be infinite-dimensional. Let 0 < ε < 1 and x₁ ∈ X be such that ‖x₁‖ = 1. Inductively, let Y_k := span{x_j}_{j=1}^k ≠ X, and choose x_{k+1} ∈ X \ Y_k ≠ ∅ such that ‖x_{k+1}‖ = 1 and dist(x_{k+1}, Y_k) ≥ 1 − ε. Then it is clear that the sequence (x_k)_{k=1}^∞ does not have a converging subsequence. Hence by Theorem A.13.4, B̄(0, 1) is not compact.

Remark B.4.22 (Is the identity compact?). Riesz's Compactness Theorem B.4.21 could also be stated: a normed space X is finite-dimensional if and only if the identity mapping I = (x ↦ x) : X → X is compact. This together with the results of Exercises B.4.18 and B.4.19 proves that LC(X) is a closed two-sided proper ideal of L(X), where X is a Banach space that is not finite-dimensional.
Theorem B.4.23 (Baire’s Theorem). Let (X, d) be a complete metric space and ∞ Uj ⊂ X be dense and open for each k ∈ Z+ . Then G = k=1 Uk is dense. Proof. We must show that G ∩ B(x0 , r0 ) = ∅ for any x0 ∈ X and r0 > 0. Assuming X = ∅, take x1 and r1 such that B(x1 , r1 ) ⊂ U1 ∩ B(x0 , r0 ). Inductively, we choose xk+1 and rk+1 < 1/k so that B(xk+1 , rk+1 ) ⊂ Uk+1 ∩ B(xk , rk ). Then (xk )∞ k=1 is a Cauchy sequence, thus converging to some x ∈ X by complete ness. By construction, x ∈ G ∩ B(x0 , r0 ). Exercise B.4.24 (Baire’s Theorem and interior points). Clearly, Baire’s Theorem B.4.23 is equivalent to the following: in a complete metric space, a countable union of sets without interior points is without interior points. Use this to prove that an algebraic basis of an infinite-dimensional Banach space must be uncountable. Theorem B.4.25 (Hahn–Banach Theorem). Let X be a real normed space and f : Mf → R be bounded and linear on a vector subspace Mf ⊂ X. Then there exists extension F : X → R such that F |Mf = f and F = f . Proof. Let S := {h : Mh → R
|
h linear on vector subspace Mh ⊂ X, Mf ⊂ Mh , h = f } .
Then f ∈ S = ∅. Endow S with the partial order Mg ⊂ Mh , g ≤ h ⇐⇒ g = h|Mg . Take a chain (f j )j∈J ⊂ S. Then fj ≤ h for each j ∈ J, where h ∈ S is defined so that Mh = j∈J Mfj , h|Mfj = fj . Thereby, in view of Zorn’s lemma (Theorem A.4.10), there is a maximal element F : MF → R in S. Suppose MF = X. Then take x0 ∈ X \ MF . Given a ∈ R, define Ga : MF + Rx0 → R,
Ga (u + tx0 ) = F (u) − ta.
Then Ga is bounded, linear, and Ga |MF = F . Hence Ga ≥ F = f . Could it be that Ga = F (this would contradict the maximality of F )? For any u, v ∈ MF , |F (u) − F (v)|
=
|F (u − v)|
≤ ≤
F u − v F (u + x0 + v + x0 ) .
Hence there exists a₀ ∈ R such that

F(u) − ‖F‖ ‖u + x₀‖ ≤ a₀ ≤ F(v) + ‖F‖ ‖v + x₀‖

for every u, v ∈ M_F. Thus |F(w) − a₀| ≤ ‖F‖ ‖w + x₀‖ for every w ∈ M_F. From this (assuming, non-trivially, that t ≠ 0), we get

|G_{a₀}(u + tx₀)| = |t| |F(u/t) − a₀| ≤ |t| ‖F‖ ‖u/t + x₀‖ = ‖F‖ ‖u + tx₀‖;

but this means ‖G_{a₀}‖ ≤ ‖F‖, a contradiction.
Exercise B.4.26 (Complex version of the Hahn–Banach Theorem). Prove the complex version of the Hahn–Banach Theorem: Let X be a complex normed space and f : M_f → C be bounded and linear on a vector subspace M_f ⊂ X. Then there exists an extension F : X → C such that F|_{M_f} = f and ‖F‖ = ‖f‖.

Corollary B.4.27. Let X be a normed space and x ∈ X. Then

‖x‖ = max{ |F(x)| : F ∈ L(X, K), ‖F‖ ≤ 1 }.

Corollary B.4.28 (Hahn–Banach ⟹ Riesz's Compactness Theorem). Let X be a normed space. Then B̄(0, 1) is compact if and only if X is finite-dimensional.

Proof. By the Heine–Borel Theorem, a closed set in a finite-dimensional normed space is compact if and only if it is bounded. The proof of the converse follows [28]: Suppose X is locally compact and let S¹ := {x ∈ X : ‖x‖ = 1}. Then

{ S¹ ∩ Ker(f) : f ∈ L(X, K) }

is a family of compact sets, whose intersection is empty by the Hahn–Banach Theorem. Thereby there exists {f_k}_{k=1}^n ⊂ L(X, K) such that

∩_{k=1}^n S¹ ∩ Ker(f_k) = ∅, i.e., ∩_{k=1}^n Ker(f_k) = {0}.

Since the co-dimension of each Ker(f_k) is at most 1, this implies that dim(X) ≤ n.
Theorem B.4.29 (Banach–Steinhaus Theorem, or Uniform Boundedness Principle). Let X be a Banach space, let Y be a normed space, and let {A_j}_{j∈J} ⊂ L(X, Y) be such that

sup_{j∈J} ‖A_j x‖ < ∞

for every x ∈ X. Then sup_{j∈J} ‖A_j‖ < ∞.
Proof. Let p_j(x) := ‖A_j x‖ and p(x) := sup{p_j(x) | j ∈ J}. Clearly, p, p_j : X → R are seminorms. Moreover, p_j is continuous for every j ∈ J, and we must show that also p is continuous. Since p_j is continuous for every j ∈ J, the set

U_k := {x ∈ X : p(x) > k} = ∪_{j∈J} {x ∈ X : p_j(x) > k}

is open. Now ∩_{k=1}^∞ U_k = ∅, so that by Baire's Theorem B.4.23 there exists k₀ ∈ Z⁺ for which U_{k₀} is not dense in X; actually, then U₁ is not dense either, because U₁ = k⁻¹U_k for every k ∈ Z⁺. Choose x₀ ∈ X and r₀ > 0 such that B(x₀, r₀) ⊂ X \ U₁. If z ∈ B(0, 1) then

r₀ p(z) = p(r₀ z) ≤ p(x₀ + r₀ z) + p(−x₀) ≤ 2.

Thus ‖A_j‖ ≤ 2/r₀ for every j ∈ J.
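The completeness of X is essential here. A standard counterexample (sketched below in our own encoding, not from the text) lives on the incomplete space of finitely supported sequences with the sup-norm: the functionals A_j(x) := j x_j are pointwise bounded on every fixed x, yet ‖A_j‖ = j is unbounded.

```python
def A(j, x):
    """A_j(x) = j * x_j on finitely supported sequences x : Z+ -> K,
    encoded here as dicts {index: value}; each A_j has operator norm j."""
    return j * x.get(j, 0.0)

x = {1: 1.0, 5: -2.0}                                  # a finitely supported x
pointwise = max(abs(A(j, x)) for j in range(1, 1000))  # finite for this fixed x
norms = [abs(A(j, {j: 1.0})) for j in range(1, 6)]     # ||A_j|| attained at e_j
```

For each fixed x only finitely many A_j(x) are non-zero, so sup_j |A_j(x)| < ∞; uniform boundedness fails because the space is not Banach.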
Definition B.4.30 (Open mappings). A mapping f : X → Y between topological spaces X, Y is said to be open if f(U) ⊂ Y is open for every open U ⊂ X.

Theorem B.4.31 (Open Mapping Theorem). Let A ∈ L(X, Y) be surjective, where X, Y are Banach spaces. Then A is open.

Proof. It is sufficient to show that B_Y(0, r) ⊂ A(B_X(0, 1)) for some r > 0. For each k ∈ Z⁺, the set U_k := Y \ A(B_X(0, k))‾ is open. Now ∩_{k=1}^∞ U_k = ∅, because A is surjective. By Baire's Theorem B.4.23, U_{k₀} is not dense in Y for some k₀ ∈ Z⁺; actually, then U₁ is not dense either, because A(B_X(0, 1)) = k⁻¹A(B_X(0, k)). Take y₀ ∈ Y and r₀ > 0 such that B_Y(y₀, r₀) ⊂ Y \ U₁. Now B_Y(y₀, r₀) ⊂ Y \ U₁ = A(B_X(0, 1))‾. Let ε > 0 and y ∈ B_Y(0, r₀). Take w₁, w₂ ∈ B_X(0, 1) such that

‖y₀ − Aw₁‖ < ε/2 and ‖(y₀ + y) − Aw₂‖ < ε/2,

so that ‖y − A(w₂ − w₁)‖ < ε with ‖w₂ − w₁‖ < 2. By scaling,

∀ε > 0 ∀y ∈ B_Y(0, r₀) ∃x ∈ B_X(0, 2‖y‖/r₀) : ‖y − Ax‖ < ε.

Thus if z ∈ B_Y(0, r₀), take x₀ ∈ B_X(0, 2) such that ‖z − Ax₀‖ < r₀/2. Inductively, choose x_k ∈ B_X(0, 2^{1−k}) such that

‖z − A Σ_{j=0}^k x_j‖ < 2^{−1−k} r₀.

Now Σ_{j=0}^k x_j converges, as k → ∞, to some
x ∈ B̄_X(0, 4) ⊂ B_X(0, 5), because X is complete. We have z = Ax by the continuity of A. Thereby B_Y(0, r₀) ⊂ A(B_X(0, 5)), implying B_Y(0, r₀/5) ⊂ A(B_X(0, 1)).
Corollary B.4.32 (Bounded Inverse Theorem). Let B ∈ L(X, Y) be bijective between Banach spaces X, Y. Then B⁻¹ is continuous.

Definition B.4.33 (Graph). The graph of a mapping f : X → Y is

Γ(f) := {(x, f(x)) | x ∈ X} ⊂ X × Y.

Theorem B.4.34 (Closed Graph Theorem). Let A : X → Y be a linear mapping between Banach spaces X, Y. Then A is continuous if and only if its graph is closed in X × Y.

Proof. Suppose A is continuous. Take a Cauchy sequence ((x_j, Ax_j))_{j=1}^∞ in Γ(A) ⊂ X × Y. Then (x_j)_{j=1}^∞ is a Cauchy sequence in X, thereby converging to some x ∈ X by completeness. Then Ax_j → Ax by the continuity of A. Hence (x_j, Ax_j) → (x, Ax) ∈ Γ(A); the graph is closed.

Now assume that Γ(A) ⊂ X × Y is closed. Thus the graph is a Banach subspace of X × Y. Define the mapping B := (x ↦ (x, Ax)) : X → Γ(A). It is easy to see that B is a linear bijection whose inverse ((x, Ax) ↦ x) : Γ(A) → X is continuous; by the Open Mapping Theorem this inverse is open, so B is continuous. This implies the continuity of A.

Definition B.4.35 (Weak∗-topology). Let x ↦ ‖x‖ be the norm of a normed vector space X over a field K ∈ {R, C}. The dual space X′ = L(X, K) of X is the set of bounded linear functionals f : X → K, having the norm

‖f‖ := sup{ |f(x)| : x ∈ X, ‖x‖ ≤ 1 }.

This endows X′ with a Banach space structure. However, it is often better to use a weaker topology on the dual: let us define x̂(f) := f(x) for every x ∈ X and f ∈ X′; this gives the interpretation X ⊂ X″ := L(X′, K), because |x̂(f)| = |f(x)| ≤ ‖f‖ ‖x‖. So we may treat X as a set of functions X′ → K, and we define the weak∗-topology of X′ to be the X-induced topology (see Definition A.18.1) of X′.

Theorem B.4.36 (Banach–Alaoglu Theorem). Let X be a Banach space. Then the closed unit ball

K := B̄_{X′}(0, 1) = {φ ∈ X′ : ‖φ‖_{X′} ≤ 1}

of X′ is weak∗-compact.
Proof. Due to Tihonov's Theorem A.18.8,

P := Π_{x∈X} {λ ∈ C : |λ| ≤ ‖x‖} = Π_{x∈X} D̄(0, ‖x‖)

is compact in the product topology τ_P. Any element f ∈ P is a mapping f : X → C such that |f(x)| ≤ ‖x‖. Hence K = X′ ∩ P. Let τ₁ and τ₂ be the relative topologies of K inherited from the weak∗-topology τ_{X′} of X′ and the product topology τ_P of P, respectively. We shall prove that τ₁ = τ₂ and that K ⊂ P is closed; this would show that K is a compact Hausdorff space.

First, let φ ∈ X′, f ∈ P, S ⊂ X, and δ > 0. Define

U(φ, S, δ) := {ψ ∈ X′ : x ∈ S ⇒ |ψx − φx| < δ},
V(f, S, δ) := {g ∈ P : x ∈ S ⇒ |g(x) − f(x)| < δ}.

Then

U := {U(φ, S, δ) | φ ∈ X′, S ⊂ X finite, δ > 0},
V := {V(f, S, δ) | f ∈ P, S ⊂ X finite, δ > 0}

are bases for the topologies τ_{X′} and τ_P, respectively. Clearly K ∩ U(φ, S, δ) = K ∩ V(φ, S, δ), so that the topologies τ_{X′} and τ_P agree on K, i.e., τ₁ = τ₂.

Still we have to show that K ⊂ P is closed. Let f ∈ K̄ ⊂ P. First we show that f is linear. Take x, y ∈ X, λ, μ ∈ C and δ > 0. Choose φ_δ ∈ K such that f ∈ V(φ_δ, {x, y, λx + μy}, δ). Then

|f(λx + μy) − (λf(x) + μf(y))|
≤ |f(λx + μy) − φ_δ(λx + μy)| + |φ_δ(λx + μy) − (λf(x) + μf(y))|
= |f(λx + μy) − φ_δ(λx + μy)| + |λ(φ_δ x − f(x)) + μ(φ_δ y − f(y))|
≤ |f(λx + μy) − φ_δ(λx + μy)| + |λ| |φ_δ x − f(x)| + |μ| |φ_δ y − f(y)|
≤ δ (1 + |λ| + |μ|).

This holds for every δ > 0, so that actually f(λx + μy) = λf(x) + μf(y): f is linear! Moreover, ‖f‖ ≤ 1, because |f(x)| ≤ |f(x) − φ_δ x| + |φ_δ x| ≤ δ + ‖x‖. Hence f ∈ K, and K is closed.
Remark B.4.37. The Banach–Alaoglu Theorem B.4.36 implies that a bounded weak∗ -closed subset of the dual space is a compact Hausdorff space in the relative weak∗ -topology. However, in a normed space norm-closed balls are compact if and only if the dimension is finite!
B.4.1 Banach space adjoint
We now come back to the duals of Banach spaces introduced in Definition B.4.15, and to the adjoints of operators between them. Here we give a condensed treatment to acquaint the reader with the topic.

Definition B.4.38 (Duality). Let X be a Banach space and X′ = L(X, K) its dual. For x ∈ X and x′ ∈ X′ let us write

⟨x, x′⟩ := x′(x).

We endow X′ with the norm x′ ↦ ‖x′‖ given by

‖x′‖ := sup{ |⟨x, x′⟩| : x ∈ X, ‖x‖ ≤ 1 }.

Exercise B.4.39. Let X be a Banach space and x ∈ X. Show that

‖x‖ = sup{ |⟨x, x′⟩| : x′ ∈ X′, ‖x′‖ ≤ 1 }.

Exercise B.4.40. Let X, Y be Banach spaces with respective duals X′, Y′. Let A ∈ L(X, Y). Show that there exists a unique A′ ∈ L(Y′, X′) such that

⟨Ax, y′⟩ = ⟨x, A′(y′)⟩    (B.1)

for every x ∈ X and y′ ∈ Y′. Prove also that ‖A′‖ = ‖A‖.

Definition B.4.41 (Adjoint operator). Let A ∈ L(X, Y) be as in Exercise B.4.40. Then A′ ∈ L(Y′, X′) defined by (B.1) is called the (Banach) adjoint of A.

Exercise B.4.42. Show that A ∈ L(X, Y) is compact if and only if A′ ∈ L(Y′, X′) is compact.

Definition B.4.43 (Complemented subspace). A closed subspace V of a topological vector space X is said to be complemented in X by a subspace W ⊂ X if V + W = X and V ∩ W = {0}. Then we write X = V ⊕ W, saying that X is the direct sum of V and W.
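In finite dimensions, with the duality ⟨x, x′⟩ realised by the dot product, the Banach adjoint of a matrix operator A is its transpose, and identity (B.1) can be checked numerically (an illustrative aside of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # an operator R^4 -> R^3
x = rng.standard_normal(4)
y_dual = rng.standard_normal(3)   # a functional on R^3 via the dot product

lhs = y_dual @ (A @ x)            # <Ax, y'>
rhs = (A.T @ y_dual) @ x          # <x, A'(y')>  with  A' = A^T
# (B.1): the two pairings agree for every x and y'
```

The equality ‖A′‖ = ‖A‖ is also visible here: a matrix and its transpose share the same largest singular value.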
Exercise B.4.44. Show that a closed subspace V is complemented in X if X/V is finite-dimensional.

Exercise B.4.45. Show that a finite-dimensional subspace of a locally convex space is complemented. (Hint: Hahn–Banach.)

Exercise B.4.46. Let A ∈ L(X) be compact, where X is a Banach space, and let λ be a non-zero scalar. Show that the range (λI − A)(X) = {λx − Ax : x ∈ X} is closed, that Ker(λI − A) = {x ∈ X : Ax = λx} is finite-dimensional, and that

dim(Ker(λI − A)) = dim(Ker(λI − A′)) = dim(X/((λI − A)(X))) = dim(X′/((λI − A′)(X′))).

Definition B.4.47 (Reflexive space). Let X be a Banach space and X′ = L(X, K) its dual Banach space. The second dual of X is X″ := (X′)′ = L(X′, K). It is then easy to show that we can define a linear isometry (x ↦ x″) : X → X″ onto a closed subspace of X″ by x″(f) := f(x). Thus X can be regarded as a subspace of X″. If X″ = {x″ : x ∈ X} then X is called reflexive.

Exercise B.4.48. Show that (x ↦ x″) : X → X″ in Definition B.4.47 has the claimed properties.

Exercise B.4.49. Let 1 < p < ∞. Show that ℓ^p = ℓ^p(Z⁺) is reflexive. What about ℓ¹ and ℓ^∞?

Exercise B.4.50. Show that C([0, 1]) is not reflexive.

Exercise B.4.51. Let X be a Banach space. Prove that X is reflexive if and only if its closed unit ball is compact in the weak topology. (Hint: Hahn–Banach and Banach–Alaoglu.)

Exercise B.4.52. Let V be a closed subspace of a reflexive Banach space X. Show that V and X/V are reflexive.

Exercise B.4.53. Show that X is reflexive if and only if X′ is reflexive.
B.5 Hilbert spaces
Definition B.5.1 (Inner product and Hilbert spaces). Let H be a C-vector space. A mapping ((x, y) ↦ ⟨x, y⟩) : H × H → C is an inner product if

⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩,
⟨λx, y⟩ = λ⟨x, y⟩,
⟨y, x⟩ = ⟨x, y⟩‾ (the complex conjugate),
⟨x, x⟩ ≥ 0,
⟨x, x⟩ = 0 ⇒ x = 0

for every x, y, z ∈ H and λ ∈ C. Then H endowed with the inner product is called an inner product space. An inner product defines the canonical norm ‖x‖ := ⟨x, x⟩^{1/2}; we shall soon prove that this is a norm in the usual sense. H is called a Hilbert space (or a complete inner product space) if it is a Banach space with respect to the canonical norm.

Exercise B.5.2. Show that ℓ²(J) is a Hilbert space, where ⟨x, y⟩ = Σ_{j∈J} x_j ȳ_j.

Definition B.5.3 (Orthogonality). Vectors x, y ∈ H are said to be orthogonal in an inner product space H, denoted x ⊥ y, if ⟨x, y⟩ = 0. For S ⊂ H, let

S^⊥ := {x ∈ H | ∀y ∈ S : x ⊥ y}.

Subspaces M, N ⊂ H are called orthogonal, denoted by M ⊥ N, if ⟨x, y⟩ = 0 for every x ∈ M and y ∈ N. A collection {x_α}_{α∈I} is called orthonormal if ‖x_α‖ = 1 for all α ∈ I and if ⟨x_α, x_β⟩ = 0 for all α ≠ β, α, β ∈ I.

Exercise B.5.4. Show that S^⊥ ⊂ H is a closed vector subspace, and that S ⊂ (S^⊥)^⊥. Show that if V is a closed vector subspace of H then V = (V^⊥)^⊥.

Exercise B.5.5 (Pythagoras' Theorem). Let x₁, x₂, . . ., x_n ∈ H be mutually orthogonal, i.e., assume that x_i ⊥ x_j for all i ≠ j. Prove that

‖Σ_{j=1}^n x_j‖² = Σ_{j=1}^n ‖x_j‖².

(This generalises the famous theorem of Pythagoras of Samos on triangles in the plane.)

Proposition B.5.6 (Cauchy–Schwarz inequality). Let H be an inner product space. Then

|⟨x, y⟩| ≤ ‖x‖ ‖y‖    (B.2)

for every x, y ∈ H.
Proof. We may assume that x ≠ 0 and y ≠ 0, otherwise the statement is trivial. For t ∈ R,

0 ≤ ‖x − ty‖²
= ⟨x − ty, x − ty⟩
= ⟨x, x⟩ − t⟨x, y⟩ − t⟨y, x⟩ + t²⟨y, y⟩
= ‖y‖² t² − 2t Re⟨x, y⟩ + ‖x‖²
= ‖y‖² (t − Re⟨x, y⟩/‖y‖²)² + ‖x‖² − (Re⟨x, y⟩)²/‖y‖².

Taking t = Re⟨x, y⟩/‖y‖², we get

|Re⟨x, y⟩| ≤ ‖x‖ ‖y‖

for every x, y ∈ H. Now ⟨x, y⟩ = |⟨x, y⟩| e^{iφ} for some φ ∈ R, and

|⟨x, y⟩| = ⟨e^{−iφ}x, y⟩ = Re⟨e^{−iφ}x, y⟩ ≤ ‖e^{−iφ}x‖ ‖y‖ = ‖x‖ ‖y‖.
This completes the proof.

Corollary B.5.7 (Triangle inequality). Let H be an inner product space. Then

‖x + y‖ ≤ ‖x‖ + ‖y‖.

Consequently, the canonical norm of an inner product space is a norm in the usual sense.

Proof. Now

‖x + y‖² = ⟨x + y, x + y⟩
= ⟨x, x⟩ + ⟨x, y⟩ + ⟨y, x⟩ + ⟨y, y⟩
≤ ‖x‖² + 2 ‖x‖ ‖y‖ + ‖y‖²   (by (B.2))
= (‖x‖ + ‖y‖)²,

completing the proof.
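Both inequalities are easy to test numerically (our own sketch, not from the text). With the convention ⟨x, y⟩ = Σ_j x_j ȳ_j, the inner product is numpy's `vdot` with its arguments swapped, since `vdot` conjugates its first argument:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
y = rng.standard_normal(8) + 1j * rng.standard_normal(8)

inner = np.vdot(y, x)   # <x, y> = sum_j x_j * conj(y_j)

cs_gap  = np.linalg.norm(x) * np.linalg.norm(y) - abs(inner)             # >= 0 by (B.2)
tri_gap = np.linalg.norm(x) + np.linalg.norm(y) - np.linalg.norm(x + y)  # >= 0 by B.5.7
```

Equality holds in (B.2) exactly when x and y are linearly dependent, which random vectors almost surely are not, so both gaps come out strictly positive.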
Remark B.5.8. One may naturally study R-Hilbert spaces, where the scalar field is R and the inner product takes real values. Then

⟨x, y⟩ = (‖x‖² + ‖y‖² − ‖x − y‖²) / 2.

Thus the inner product can be recovered from the norm here.
Exercise B.5.9. Prove this remark. In (C-)Hilbert spaces, prove that

⟨x, y⟩ = (‖x + y‖² − ‖x − y‖² + i ‖x + iy‖² − i ‖x − iy‖²) / 4.
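The complex polarization identity can likewise be verified numerically (an aside of ours): it recovers ⟨x, y⟩, linear in its first argument, from norms alone.

```python
import numpy as np

def inner_from_norms(x, y):
    """Recover <x, y> via the complex polarization identity of Exercise B.5.9."""
    n = np.linalg.norm
    return (n(x + y)**2 - n(x - y)**2
            + 1j * n(x + 1j*y)**2 - 1j * n(x - 1j*y)**2) / 4

rng = np.random.default_rng(2)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

direct    = np.vdot(y, x)   # <x, y> = sum_j x_j * conj(y_j)
recovered = inner_from_norms(x, y)
```

This is exactly why a norm can come from at most one inner product, foreshadowing Exercise B.5.10: when the right-hand side fails to be sesquilinear, no compatible inner product exists.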
Exercise B.5.10. Every Hilbert space is canonically a Banach space, but not vice versa: in a real Banach space, (x, y) ↦ (‖x‖² + ‖y‖² − ‖x − y‖²)/2 does not always define an inner product. Present some examples.

Lemma B.5.11. Let H be a Hilbert space. Suppose C ⊂ H is closed, convex and non-empty. Then there exists a unique z ∈ C such that ‖z‖ = inf{‖x‖ : x ∈ C}.

Proof. Let r := inf{‖x‖ : x ∈ C}. For any x, y ∈ H, the parallelogram identity

‖x + y‖² + ‖x − y‖² = 2(‖x‖² + ‖y‖²)    (B.3)

holds. Take a sequence (x_k)_{k=1}^∞ in C such that ‖x_k‖ → r as k → ∞. Now (x_j + x_k)/2 ∈ C due to convexity, so that 4r² ≤ ‖x_j + x_k‖². Hence

4r² + ‖x_j − x_k‖² ≤ ‖x_j + x_k‖² + ‖x_j − x_k‖² = 2(‖x_j‖² + ‖x_k‖²) → 4r² as j, k → ∞,

where the equality is an application of (B.3); this implies ‖x_j − x_k‖ → 0 as j, k → ∞. Thus (x_k)_{k=1}^∞ is a Cauchy sequence, converging to some z ∈ C with ‖z‖ = r (recall that H is complete and C ⊂ H is closed). If z′ ∈ C also satisfies ‖z′‖ = r then the alternating sequence (z, z′, z, z′, . . .) would be a Cauchy sequence, by the reasoning above: hence z = z′.

Exercise B.5.12 (Parallelogram identity). Show that the parallelogram identity (B.3),

‖x + y‖² + ‖x − y‖² = 2(‖x‖² + ‖y‖²),

holds for all x, y ∈ H.

Lemma B.5.13. Let M be a vector subspace of a Hilbert space H. Let ‖z‖ ≤ ‖z + u‖ for every u ∈ M. Then z ∈ M^⊥.

Proof. To get a contradiction, assume ⟨z, v⟩ ≠ 0 for some v ∈ M. Multiplying v by a scalar, we may assume that Re⟨z, v⟩ ≠ 0. If r ∈ R then

0 ≤ ‖z − rv‖² − ‖z‖² = r²‖v‖² − 2r Re⟨z, v⟩ = r(r‖v‖² − 2Re⟨z, v⟩),

but this inequality fails when r is between 0 and 2Re⟨z, v⟩/‖v‖².
Definition B.5.14 (Orthogonal projection). Let $M$ be a closed subspace of a Hilbert space $H$. Then we may define $P_M : H \to H$ so that $P_M(x) \in M$ is the point in $M$ closest to $x \in H$. The mapping $P_M$ is called the orthogonal projection onto $M$.
Chapter B. Elementary Functional Analysis
Proposition B.5.15. The operator $P_M : H \to H$ defined above is linear, and $\|P_M\| = 1$ (unless $M = \{0\}$). Moreover, $P_{M^\perp} = I - P_M$.

Proof. Let $x \in H$, $P := P_M$ and $Q := I - P$. By Definition B.5.14, $P(x) \in M$ and $\|Q(x)\| \leq \|Q(x) + u\|$ for every $u \in M$. This implies $Q(x) \in M^\perp$ by Lemma B.5.13. Let $x,y \in H$ and $\lambda,\mu \in \mathbb{C}$. Since
\begin{align*}
\lambda x &= \lambda\,\big(P(x) + Q(x)\big),\\
\mu y &= \mu\,\big(P(y) + Q(y)\big),\\
\lambda x + \mu y &= P(\lambda x + \mu y) + Q(\lambda x + \mu y),
\end{align*}
we get
\[
M \ni P(\lambda x + \mu y) - \lambda P(x) - \mu P(y)
= \lambda Q(x) + \mu Q(y) - Q(\lambda x + \mu y) \in M^\perp.
\]
This implies the linearity of $P$, because $M \cap M^\perp = \{0\}$. Finally,
\[
\|x\|^2 = \|Px + Qx\|^2 = \|Px\|^2 + \|Qx\|^2 + 2\,\mathrm{Re}\,\langle Px, Qx\rangle
= \|Px\|^2 + \|Qx\|^2;
\]
in particular, $\|Px\| \leq \|x\|$. $\square$
Remark B.5.16. We have proven that
\[
H = M \oplus M^\perp.
\]
This means that $M, M^\perp$ are closed subspaces of the Hilbert space $H$ such that $M \perp M^\perp$ and $M + M^\perp = H$.

Definition B.5.17 (Direct sum). Let $\{H_j : j \in J\}$ be a family of pairwise orthogonal closed subspaces of $H$. If the span of $\bigcup_{j\in J} H_j$ is dense in $H$ then $H$ is said to be a direct sum of $\{H_j : j \in J\}$, denoted by
\[
H = \bigoplus_{j\in J} H_j.
\]
If $H$ is a direct sum of $\{M_j\}_{j=1}^k$, we write $H = \bigoplus_{j=1}^k M_j$. Especially, $M_1 \oplus M_2 = \bigoplus_{j=1}^2 M_j$.
Remark B.5.18. If $H$ is a Hilbert space, it is easy to see that $f = (x \mapsto \langle x,y\rangle) : H \to \mathbb{C}$ is a linear functional, and $\|f\| = \|y\|$ due to the Cauchy–Schwarz inequality and to $f(y) = \|y\|^2$. Actually, there are no other kinds of bounded linear functionals on a Hilbert space:
Theorem B.5.19 (Riesz (Hilbert Space) Representation Theorem). Let $f : H \to \mathbb{C}$ be a bounded linear functional on a Hilbert space $H$. Then there exists a unique $y \in H$ such that $f(x) = \langle x,y\rangle$ for every $x \in H$. Moreover, $\|f\| = \|y\|$.

Sometimes this theorem is also called the Fréchet–Riesz (representation) theorem.

Proof. Assume the non-trivial case $f \neq 0$. Thus we may choose $u \in \mathrm{Ker}(f)^\perp$ for which $\|u\| = 1$. Pursuing a suitable representative $y \in H$, we notice that $f(u) = \langle u, \overline{f(u)}\,u\rangle$, inspiring an investigation:
\[
\langle x, \overline{f(u)}\,u\rangle - f(x)
= f(u)\,\langle x,u\rangle - f(x)\,\langle u,u\rangle
= \langle f(u)\,x - f(x)\,u,\; u\rangle = 0,
\]
since $f(u)\,x - f(x)\,u \in \mathrm{Ker}(f)$. Thus $f(x) = \langle x, \overline{f(u)}\,u\rangle$ for every $x \in H$. Furthermore, if $f(x) = \langle x,y\rangle = \langle x,z\rangle$ for every $x \in H$ then
\[
0 = f(x) - f(x) = \langle x,y\rangle - \langle x,z\rangle = \langle x,\, y-z\rangle
\overset{x=y-z}{=} \|y-z\|^2,
\]
so that $y = z$. $\square$
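In the finite-dimensional case the representing vector can be computed explicitly along the lines of the proof. The following NumPy sketch (an illustration added here, not from the book) builds $y$ from a unit vector $u \in \mathrm{Ker}(f)^\perp$ as $y = \overline{f(u)}\,u$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A bounded linear functional on C^n is f(x) = c . x for some coefficient vector c.
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
f = lambda x: c @ x

# Following the proof: take a unit vector u in Ker(f)^perp (here Ker(f)^perp
# is spanned by conj(c)), then y = conj(f(u)) * u represents f.
u = np.conj(c) / np.linalg.norm(c)
y = np.conj(f(u)) * u

inner = lambda a, b: np.vdot(b, a)  # <a, b>, conjugate-linear in b

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
assert np.allclose(f(x), inner(x, y))   # f(x) = <x, y>
assert np.isclose(np.linalg.norm(c), np.linalg.norm(y))  # ||f|| = ||y||
```

In this toy model $y$ is simply $\overline{c}$, matching the uniqueness part of the theorem.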
Definition B.5.20 (Adjoint operator). Let $H$ be a Hilbert space, $z \in H$ and $A \in \mathcal{L}(H)$. Then a bounded linear functional on $H$ is defined by $x \mapsto \langle Ax, z\rangle$, so that by Theorem B.5.19 there exists a unique vector $A^*z \in H$ satisfying $\langle Ax, z\rangle = \langle x, A^*z\rangle$ for every $x \in H$. This defines a mapping $A^* : H \to H$, which is called the adjoint of $A \in \mathcal{L}(H)$. If $A^* = A$ then $A$ is called self-adjoint.

Exercise B.5.21. Let $\lambda \in \mathbb{C}$ and $A, B \in \mathcal{L}(H)$. Show that $(\lambda A)^* = \overline{\lambda}\,A^*$, $(A+B)^* = A^* + B^*$ and $(AB)^* = B^*A^*$.

Exercise B.5.22. Show that the adjoint operator $A^* : H \to H$ of $A \in \mathcal{L}(H)$ is linear and bounded. Moreover, show that $(A^*)^* = A$, $\|A^*A\| = \|A\|^2$ and $\|A^*\| = \|A\|$.

Lemma B.5.23. Let $A^* = A \in \mathcal{L}(H)$. Then
\[
\|A\| = \sup_{x:\ \|x\|\leq 1} |\langle Ax, x\rangle|.
\]

Proof. Let $r := \sup\{|\langle Ax, x\rangle| : x \in H,\ \|x\| \leq 1\}$. Then
\[
r \overset{\text{(B.2)}}{\leq} \sup_{x:\ \|x\|\leq 1} \|Ax\|\,\|x\| \leq \|A\|.
\]
Let us assume that $Ax \neq 0$ for $\|x\| = 1$, and let $y := Ax/\|Ax\|$. Since $A^* = A$, we have $\langle Ax, y\rangle = \langle x, Ay\rangle = \overline{\langle Ay, x\rangle} \in \mathbb{R}$, so that
\begin{align*}
\|Ax\| &= \langle Ax, y\rangle\\
&\overset{A^*=A}{=} \frac{1}{4}\,\big(\langle A(x+y),\, x+y\rangle - \langle A(x-y),\, x-y\rangle\big)\\
&\leq \frac{1}{4}\,\big(|\langle A(x+y),\, x+y\rangle| + |\langle A(x-y),\, x-y\rangle|\big)\\
&\leq \frac{r}{4}\,\big(\|x+y\|^2 + \|x-y\|^2\big)\\
&\overset{\text{(B.3)}}{=} \frac{r}{2}\,\big(\|x\|^2 + \|y\|^2\big)\\
&= r.
\end{align*}
This concludes the proof. $\square$
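Lemma B.5.23 can be observed numerically for symmetric matrices, where the supremum of $|\langle Ax,x\rangle|$ over the unit ball is attained at an eigenvector of the eigenvalue of largest modulus. A NumPy sketch (an added illustration, not from the book):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # a real symmetric (hence self-adjoint) matrix

op_norm = np.linalg.norm(A, 2)  # operator norm = largest singular value

# For self-adjoint A: ||A|| = max |eigenvalue| = sup_{||x||<=1} |<Ax, x>|.
eigvals = np.linalg.eigvalsh(A)
assert np.isclose(op_norm, np.abs(eigvals).max())

# Spot-check: the quadratic form on random unit vectors never exceeds ||A||.
for _ in range(100):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    assert abs(x @ A @ x) <= op_norm + 1e-12
```

Note that the identity fails for non-self-adjoint operators: a nilpotent $2\times 2$ Jordan block has $\langle Ax,x\rangle$ bounded by $\tfrac12\|A\|$ on the unit ball.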
Lemma B.5.24. Let $H \neq \{0\}$. Let $A = A^* \in \mathcal{L}(H)$ be compact. Then there exists a non-zero $x \in H$ such that $Ax = +\|A\|\,x$ or $Ax = -\|A\|\,x$.

Proof. Assume the non-trivial case $\|A\| > 0$. By Lemma B.5.23, we may choose $\lambda \in \{\pm\|A\|\}$ to be an accumulation point of the set $\{\langle Ax, x\rangle : x \in H,\ \|x\| \leq 1\}$. For each $k \in \mathbb{Z}^+$, take $x_k \in H$ such that $\|x_k\| \leq 1$ and $\langle Ax_k, x_k\rangle \to_{k\to\infty} \lambda$. Since $A$ is compact, by Theorem A.13.4 it follows that the sequence $(Ax_k)_{k=1}^\infty$ has a convergent subsequence; omitting elements from the sequence, we may assume that $z := \lim_k Ax_k \in H$ exists. Now
\[
0 \leq \|Ax_k - \lambda x_k\|^2
= \|Ax_k\|^2 + \lambda^2\|x_k\|^2 - 2\lambda\,\langle Ax_k, x_k\rangle
\leq \|A\|^2 + \lambda^2 - 2\lambda\,\langle Ax_k, x_k\rangle \xrightarrow{k\to\infty} 0,
\]
implying that $\lim_k \lambda x_k$ exists and is equal to $\lim_k Ax_k = z$. Finally, let $x := z/\lambda$, so that by continuity $Ax = \lim_k Ax_k = \lambda x$. $\square$

Theorem B.5.25 (Diagonalisation of compact self-adjoint operators). Let $H$ be infinite-dimensional and $A^* = A \in \mathcal{L}(H)$ be compact. Then there exist $\{\lambda_k\}_{k=1}^\infty \subset \mathbb{R}$ and an orthonormal set $\{x_k\}_{k=1}^\infty \subset H$ such that $|\lambda_{k+1}| \leq |\lambda_k|$, $\lim_k \lambda_k = 0$ and
\[
Ax = \sum_{k=1}^\infty \lambda_k\,\langle x, x_k\rangle\,x_k
\]
for every $x \in H$.

Proof. By Lemma B.5.24, take $\lambda_1 \in \mathbb{R}$ and $x_1 \in H$ such that $\|x_1\| = 1$, $Ax_1 = \lambda_1 x_1$ and $\|A\| = |\lambda_1|$. Then we proceed by induction as follows. Let $H_k := \big(\{x_j\}_{j=1}^{k-1}\big)^\perp$.
Then $A_k^* = A_k := A|_{H_k} \in \mathcal{L}(H_k)$ is compact, so we may apply Lemma B.5.24 to choose $\lambda_k \in \mathbb{R}$ and $x_k \in H_k$ such that $\|x_k\| = 1$, $Ax_k = \lambda_k x_k$ and $\|A_k\| = |\lambda_k|$. Since $H$ is infinite-dimensional, we obtain an orthonormal family $\{x_k\}_{k=1}^\infty \subset H$, and $Ax_k = \lambda_k x_k$ for each $k \in \mathbb{Z}^+$, where $|\lambda_{k+1}| \leq |\lambda_k|$. Since $A$ is compact, $(Ax_k)_{k=1}^\infty$ has a converging subsequence. Actually, $(Ax_k)_{k=1}^\infty$ itself must converge and $\lambda_k \to 0$, because
\[
\|Ax_j - Ax_k\| = \|\lambda_j x_j - \lambda_k x_k\| = \sqrt{\lambda_j^2 + \lambda_k^2} \geq |\lambda_k|
\]
for every $j \neq k$. If $x \in H$ then $z_k := x - \sum_{j=1}^{k-1} \langle x, x_j\rangle\,x_j \in H_k$, and
\[
\|Az_k\| = \|A_k z_k\| \leq \|A_k\|\,\|z_k\| = |\lambda_k|\,\|z_k\| \leq |\lambda_k|\,\|x\| \xrightarrow{k\to\infty} 0,
\]
completing the proof. $\square$
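In finite dimensions every self-adjoint operator is trivially compact, and Theorem B.5.25 reduces to the spectral theorem for symmetric matrices. The expansion $Ax = \sum_k \lambda_k \langle x, x_k\rangle x_k$ can be checked directly with NumPy (an added illustration, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # symmetric = self-adjoint

# eigh returns real eigenvalues and an orthonormal eigenbasis {x_k}
lam, X = np.linalg.eigh(A)

x = rng.standard_normal(n)
# Ax = sum_k lambda_k <x, x_k> x_k, as in Theorem B.5.25
Ax = sum(lam[k] * (x @ X[:, k]) * X[:, k] for k in range(n))
assert np.allclose(A @ x, Ax)
```

The infinite-dimensional content of the theorem, invisible here, is that compactness forces the eigenvalues to accumulate only at $0$.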
Corollary B.5.26 (Hilbert–Schmidt Spectral Theorem). Let $A^* = A \in \mathcal{L}(H)$ be compact. Then $\sigma(A)$ is at most countable, and $\mathrm{Ker}(\lambda I - A)$ is finite-dimensional if $0 \neq \lambda \in \sigma(A)$. Moreover, $\sigma(A) \setminus \{0\}$ is discrete, and
\[
H = \bigoplus_{\lambda\in\sigma(A)} \mathrm{Ker}(\lambda I - A).
\]

Exercise B.5.27. Prove the Hilbert–Schmidt Spectral Theorem using Theorem B.5.25.

Definition B.5.28 (Weak topology on a Hilbert space). The weak topology of a Hilbert space $H$ is the smallest topology for which the mappings $(u \mapsto \langle u,v\rangle_H) : H \to \mathbb{C}$ are continuous for all $v \in H$.

Exercise B.5.29 (Weak = weak$^*$ in Hilbert spaces). Show that Hilbert spaces are reflexive. Prove that in a Hilbert space the weak topology is the same as the weak$^*$-topology, introduced in Definition B.4.35.

As a consequence of Exercise B.5.29 and the Banach–Alaoglu Theorem B.4.36 we obtain:

Theorem B.5.30 (Banach–Alaoglu Theorem for Hilbert spaces). Let $H$ be a Hilbert space. Its closed unit ball $B = \{v \in H : \|v\|_H \leq 1\}$ is compact in the weak topology.
Exercise B.5.31. Let $\{e_\alpha\}_{\alpha\in I}$ be an orthonormal collection in $H$ and let $x \in H$. Show that
\[
\sum_{\alpha\in I} |\langle x, e_\alpha\rangle|^2 \leq \|x\|^2. \tag{B.4}
\]
(Hint: Pythagoras' theorem.) Consequently, deduce from Exercise B.4.3 that the set of $\alpha$ such that $\langle x, e_\alpha\rangle \neq 0$ is at most countable.

We finish with the following theorem, which is important because it allows one to decompose elements into "simpler ones", something particularly useful in applications.

Theorem B.5.32 (Orthonormal sets in a Hilbert space). Let $\{e_\alpha\}_{\alpha\in I}$ be an orthonormal set in the Hilbert space $H$. Then the following conditions are equivalent:

(i) For every $x \in H$ there are only countably many $\alpha \in I$ such that $\langle x, e_\alpha\rangle \neq 0$, and the equality
\[
x = \sum_{\alpha\in I} \langle x, e_\alpha\rangle\,e_\alpha
\]
holds, where the series converges in norm, independently of any ordering of its terms.

(ii) If $\langle x, e_\alpha\rangle = 0$ for all $\alpha \in I$, then $x = 0$.

(iii) (Plancherel's identity) For every $x \in H$ it holds that $\|x\|^2 = \sum_{\alpha\in I} |\langle x, e_\alpha\rangle|^2$.

Proof. (i) $\Rightarrow$ (iii). This follows by enumerating the countably many $e_\alpha$'s with $\langle x, e_\alpha\rangle \neq 0$ by $\{e_j\}_{j=1}^\infty$, and using the identity
\[
\|x\|^2 - \sum_{j=1}^n |\langle x, e_j\rangle|^2 = \Big\|x - \sum_{j=1}^n \langle x, e_j\rangle\,e_j\Big\|^2.
\]
(iii) $\Rightarrow$ (ii) is automatic. Finally, let us show (ii) $\Rightarrow$ (i). It follows from the last part of Exercise B.5.31 that the collection of $e_\alpha$ with $\langle x, e_\alpha\rangle \neq 0$ is countable, so it can be enumerated by $\{e_j\}_{j=1}^\infty$. Now, the identity
\[
\Big\|\sum_{j=j_1}^{j_2} \langle x, e_j\rangle\,e_j\Big\|^2 = \sum_{j=j_1}^{j_2} |\langle x, e_j\rangle|^2
\]
and (B.4) imply that the right-hand side tends to $0$ as $j_1, j_2 \to \infty$. This means that the series $\sum_{j=1}^\infty \langle x, e_j\rangle\,e_j$ converges. Setting $y := x - \sum_{j=1}^\infty \langle x, e_j\rangle\,e_j$ we see that $\langle y, e_\alpha\rangle = 0$ for all $\alpha \in I$, which implies that $y = 0$. $\square$

Exercise B.5.33. Verify the identities stated in the proof.

Definition B.5.34 (Orthonormal basis). An orthonormal set satisfying the conditions of Theorem B.5.32 is called an orthonormal basis of the Hilbert space $H$. Then we have the following properties:
Theorem B.5.35 (Every Hilbert space has an orthonormal basis). Every Hilbert space $H$ has an orthonormal basis. An orthonormal basis is countable if and only if $H$ is separable, in which case any other basis is also countable.

Exercise B.5.36. Prove Theorem B.5.35: the first part follows from Zorn's lemma if we order orthonormal collections by inclusion, since a maximal element satisfies property (ii) of Theorem B.5.32. The second part follows from the Gram–Schmidt process.

Exercise B.5.37 (Gram–Schmidt orthonormalisation process). Let $\{x_k\}_{k=1}^\infty$ be a linearly independent family of vectors in a Hilbert space $H$. Let
\[
y_1 := x_1, \qquad e_k := y_k/\|y_k\|, \qquad
y_{k+1} := x_{k+1} - \sum_{j=1}^k \langle x_{k+1}, e_j\rangle\,e_j
\]
for all $k \in \mathbb{Z}^+$. Show that $\{e_k\}_{k=1}^\infty$ is an orthonormal set in $H$ such that $\mathrm{span}\,\{e_k\}_{k=1}^n = \mathrm{span}\,\{x_k\}_{k=1}^n$ for every $n \in \mathbb{Z}^+ \cup \{\infty\}$.
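The Gram–Schmidt recursion above translates line-by-line into code. The following NumPy sketch (an added illustration for real vectors; the complex case would use a conjugating inner product) implements it:

```python
import numpy as np

def gram_schmidt(xs):
    """Orthonormalise a linearly independent list of real vectors,
    following the recursion of Exercise B.5.37."""
    es = []
    for x in xs:
        # y_{k+1} = x_{k+1} - sum_j <x_{k+1}, e_j> e_j
        y = x - sum((x @ e) * e for e in es)
        es.append(y / np.linalg.norm(y))  # e_k = y_k / ||y_k||
    return es

rng = np.random.default_rng(4)
xs = [rng.standard_normal(4) for _ in range(3)]
es = gram_schmidt(xs)

# orthonormality: <e_i, e_j> = delta_ij
for i, ei in enumerate(es):
    for j, ej in enumerate(es):
        assert np.isclose(ei @ ej, float(i == j))
```

In floating-point practice one would use a modified (re-orthogonalising) variant for numerical stability, but the mathematical content is the same.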
B.5.1 Trace class, Hilbert–Schmidt, and Schatten classes

Definition B.5.38 (Trace class operators). Let $H$ be a Hilbert space with orthonormal basis $\{e_j \mid j \in J\}$. Let $A \in \mathcal{L}(H)$. Let us write
\[
\|A\|_{S_1} := \sum_{j\in J} |\langle Ae_j, e_j\rangle_H|;
\]
this is the trace norm of $A$, and the trace class is the (Banach) space
\[
S_1 = S_1(H) := \{A \in \mathcal{L}(H) : \|A\|_{S_1} < \infty\}.
\]
The trace is the linear functional $\mathrm{Tr} : S_1(H) \to \mathbb{C}$, defined by
\[
A \mapsto \sum_{j\in J} \langle Ae_j, e_j\rangle_H.
\]

Exercise B.5.39. Verify that the definition of the trace is independent of the choice of the orthonormal basis for $H$. Consequently, if $(a_{ij})_{i,j\in J}$ is the matrix representation of $A \in S_1$ with respect to the chosen basis, then $\mathrm{Tr}(A) = \sum_{j\in J} a_{jj}$.
Exercise B.5.40 (Properties of trace). Prove the following properties of the trace functional:
\begin{align*}
\mathrm{Tr}(AB) &= \mathrm{Tr}(BA),\\
\mathrm{Tr}(A^*) &= \overline{\mathrm{Tr}(A)},\\
\mathrm{Tr}(A^*A) &\geq 0,\\
\mathrm{Tr}(A \oplus B) &= \mathrm{Tr}(A) + \mathrm{Tr}(B),\\
\mathrm{Tr}(I_H) &= \dim(H),\\
\dim(H) < \infty \ &\Rightarrow\ \mathrm{Tr}(A \otimes B) = \mathrm{Tr}(A)\,\mathrm{Tr}(B).
\end{align*}

Exercise B.5.41 (Trace on a finite-dimensional space). Show that the trace on a finite-dimensional vector space is independent of the choice of inner product. Thus, the trace of a square matrix is defined to be the sum of its diagonal elements; moreover, the trace is the sum of the eigenvalues (with multiplicities counted).

Exercise B.5.42. Let $H$ be finite-dimensional. Let $f : \mathcal{L}(H) \to \mathbb{C}$ be a linear functional satisfying
\[
f(AB) = f(BA), \qquad f(A^*A) \geq 0, \qquad f(I_H) = \dim(H)
\]
for all $A, B \in \mathcal{L}(H)$. Show that $f = \mathrm{Tr}$.

Definition B.5.43 (Hilbert–Schmidt operators). The space of Hilbert–Schmidt operators is
\[
S_2 = S_2(H) := \{A \in \mathcal{L}(H) : A^*A \in S_1(H)\},
\]
and it can be endowed with a Hilbert space structure via the inner product $\langle A, B\rangle_{S_2} := \mathrm{Tr}(AB^*)$. The Hilbert–Schmidt norm is then
\[
\|A\|_{HS} = \|A\|_{S_2} := \langle A, A\rangle_{S_2}^{1/2}.
\]
The case of the Hilbert–Schmidt norm on finite-dimensional spaces will be discussed in more detail in Section 12.6.

Remark B.5.44. In general, there are inclusions $S_1 \subset S_2 \subset K \subset S_\infty$, where $S_\infty := \mathcal{L}(H)$ and $K \subset S_\infty$ is the subspace of compact linear operators. Moreover, $\|A\|_{S_\infty} \leq \|A\|_{S_2} \leq \|A\|_{S_1}$ for all $A \in S_\infty$. One can show that the dual $K' = \mathcal{L}(K, \mathbb{C})$ is isometrically isomorphic to $S_1$, and that $(S_1)'$ is isometrically isomorphic to $S_\infty$. In the latter case, it turns out that a bounded linear functional on $S_1$ is of the form $A \mapsto \mathrm{Tr}(AB)$ for some $B \in S_\infty$. These phenomena are related to properties of the sequence spaces $\ell^p = \ell^p(\mathbb{Z}^+)$. In analogy to the operator spaces, $\ell^1 \subset \ell^2 \subset c_0 \subset \ell^\infty$, where $c_0$ is the space of sequences converging to $0$, playing the counterpart of the space $K$.
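For matrices, the trace properties listed in Exercise B.5.40 can all be checked mechanically. A NumPy sketch (an added illustration, not from the book; the direct sum is modelled as a block-diagonal matrix and the tensor product as a Kronecker product):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

tr = np.trace
assert np.isclose(tr(A @ B), tr(B @ A))            # Tr(AB) = Tr(BA)
assert np.isclose(tr(A.conj().T), np.conj(tr(A)))  # Tr(A*) = conj(Tr(A))
gram = tr(A.conj().T @ A)                          # Tr(A*A) >= 0
assert gram.real >= 0 and np.isclose(gram.imag, 0)

# Direct sum A (+) B as a block-diagonal matrix:
AB = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), B]])
assert np.isclose(tr(AB), tr(A) + tr(B))

# Tensor product via the Kronecker product:
assert np.isclose(tr(np.kron(A, B)), tr(A) * tr(B))
```

The identity $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$ is the workhorse behind basis-independence of the trace (Exercise B.5.39).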
Remark B.5.45 (Schatten classes). Trace class operators $S_1$ and Hilbert–Schmidt operators $S_2$ turn out to be special cases of the Schatten classes $S_p$, $1 \leq p < \infty$. These classes can be introduced with the help of the singular values $\mu^2 \in \sigma(A^*A)$. To avoid technicalities we assume that all the operators below are compact. Thus, for $A \in \mathcal{L}(H)$ we set
\[
\|A\|_{S_p} := \Big(\sum_{\mu^2\in\sigma(A^*A)} \mu^p\Big)^{1/p}.
\]
We note that operators satisfying $\|A\|_{S_p} < \infty$ must have an at most countable spectrum $\sigma(A^*A)$ in view of Exercise B.4.3, but in our case this is automatically satisfied since we assumed that $A$ is compact. Therefore, denoting by $\mu_j^2 \in \sigma(A^*A)$ the sequence of singular values, counted with multiplicities, we have $\|A\|_{S_p} = \|\{\mu_j\}_j\|_{\ell^p}$. The Schatten class $S_p$ is then defined as the space
\[
S_p = S_p(H) := \big\{A \in \mathcal{L}(H) : \|A\|_{S_p} < \infty\big\}.
\]
With this norm, $S_p(H)$ is a Banach space, and $S_2(H)$ is a Hilbert space. In analogy to the trace class and Hilbert–Schmidt operators, one can show that actually $\|A\|_{S_p}^p = \mathrm{Tr}(|A^*A|^{p/2}) = \mathrm{Tr}(|A|^p)$ for a compact operator $A$.

Exercise B.5.46. Show that the Schatten classes $S_1$ and $S_2$ coincide with the previously defined trace class and Hilbert–Schmidt class, respectively.

Exercise B.5.47 (Hölder's inequality for Schatten classes). Show that a Schatten class $S_p$ is an ideal in $\mathcal{L}(H)$. Let $H$ be separable. Show that if $1 \leq p \leq \infty$, $\frac{1}{p} + \frac{1}{q} = 1$, $A \in S_p$ and $B \in S_q$, then $\|AB\|_{S_1} \leq \|A\|_{S_p}\,\|B\|_{S_q}$. (Hint: approximate operators by matrices.)
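For matrices, the Schatten norms are simply $\ell^p$ norms of the singular values, so the inclusions of Remark B.5.44 and the Hölder inequality of Exercise B.5.47 (here with $p = q = 2$) can be observed numerically. A NumPy sketch (an added illustration, not from the book):

```python
import numpy as np

def schatten_norm(A, p):
    """||A||_{S_p} = l^p norm of the singular values (Remark B.5.45)."""
    mu = np.linalg.svd(A, compute_uv=False)
    return np.sum(mu ** p) ** (1.0 / p)

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))

# ||A||_{S_infty} <= ||A||_{S_2} <= ||A||_{S_1}  (Remark B.5.44)
assert np.linalg.norm(A, 2) <= schatten_norm(A, 2) + 1e-12
assert schatten_norm(A, 2) <= schatten_norm(A, 1) + 1e-12

# The S_2 norm coincides with the Hilbert-Schmidt (Frobenius) norm
assert np.isclose(schatten_norm(A, 2), np.linalg.norm(A, 'fro'))

# Hoelder: ||AB||_{S_1} <= ||A||_{S_p} ||B||_{S_q}, with p = q = 2
assert schatten_norm(A @ B, 1) <= schatten_norm(A, 2) * schatten_norm(B, 2) + 1e-9
```

The chain of inequalities mirrors $\ell^1 \subset \ell^2 \subset \ell^\infty$ applied to the singular value sequence.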
Chapter C

Measure Theory and Integration

This chapter provides sufficient general information about measures and integration for the purposes of this book. The starting point is the concept of an outer measure, which "measures weights of subsets of a space". We should first consider how to sum such weights, which are either infinite or non-negative real numbers. For a finite set $K$, the notation
\[
\sum_{j\in K} a_j
\]
abbreviates the usual sum of numbers $a_j \in [0,\infty]$ over the index set $K$. The conventions here are that $a + \infty = \infty$ for all $a \in [0,\infty]$, and that
\[
\sum_{j\in\emptyset} a_j = 0.
\]
Infinite summations are defined by limits as follows:

Definition C.0.1. The sum of numbers $a_j \in [0,\infty]$ over the index set $J$ is
\[
\sum_{j\in J} a_j := \sup\Big\{\sum_{j\in K} a_j : K \subset J \text{ is finite}\Big\}.
\]

Exercise C.0.2. Let $0 < a_j < \infty$ for each $j \in J$. Suppose $\sum_{j\in J} a_j < \infty$. Show that $J$ is at most countable.

The message of Exercise C.0.2 is that for positive numbers, only countable summations are interesting. In measure theory, where summations are fundamental, such a "restriction to countability" will be encountered recurrently.
C.1 Measures and outer measures
Recall that for a set $X$, by $\mathcal{P}(X) := \{E \mid E \subset X\}$ we denote its power set, i.e., the family of all subsets of $X$. Let us write $E^c := X \setminus E = \{x \in X : x \notin E\}$ for the complement set, when the space $X$ is implicitly known from the context.
C.1.1 Measuring sets
Definition C.1.1 (Outer measure). A mapping $\psi : \mathcal{P}(X) \to [0,\infty]$ is an outer measure on a set $X \neq \emptyset$ if
\[
\psi(\emptyset) = 0, \qquad
E \subset F \ \Rightarrow\ \psi(E) \leq \psi(F), \qquad
\psi\Big(\bigcup_{j=1}^\infty E_j\Big) \leq \sum_{j=1}^\infty \psi(E_j)
\]
for every $E, F \subset X$ and $\{E_j\}_{j=1}^\infty \subset \mathcal{P}(X)$. Intuitively, an outer measure weighs the subsets of a space.

Example. Define $\psi : \mathcal{P}(X) \to [0,\infty]$ by $\psi(\emptyset) = 0$ and $\psi(E) = 1$ when $\emptyset \neq E \subset X$. This is an outer measure.

Example. Let $\psi : \mathcal{P}(X) \to [0,\infty]$, where $\psi(E)$ is the number of points in the set $E \subset X$. Such an outer measure is called a counting measure for obvious reasons.

At first sight, constructing meaningful non-trivial outer measures may appear difficult. However, there is an easy and useful method for generating outer measures out of simpler set functions, which we call measurelets:

Definition C.1.2 (Measurelets). Let $\mathcal{A} \subset \mathcal{P}(X)$ cover $X$, i.e., $X = \bigcup\mathcal{A}$. We call a mapping $m : \mathcal{A} \to [0,\infty]$ a measurelet on $X$. Members of the family $\mathcal{A}$ are called the elementary sets. A measurelet $m : \mathcal{A} \to [0,\infty]$ on $X$ generates a mapping $m^* : \mathcal{P}(X) \to [0,\infty]$ defined by
\[
m^*(E) := \inf\Big\{\sum_{A\in\mathcal{B}} m(A) : \mathcal{B} \subset \mathcal{A} \text{ is countable},\ E \subset \bigcup\mathcal{B}\Big\}.
\]

Exercise C.1.3. Let $\mathcal{A} := \{\emptyset, \mathbb{R}^2\} \cup \{S \subset \mathbb{R}^2 : S \text{ a finite union of polygons}\}$. Let us define a measurelet $A : \mathcal{A} \to [0,\infty]$ by the following informal demands:
(1) $A(\text{rectangle}) = \text{base} \cdot \text{height}$.
(2) $A(S_1 \cup S_2) = A(S_1) + A(S_2)$, if the interiors of the sets $S_1, S_2$ are disjoint.
(3) The measurelet $A$ does not change under translations or rotations of sets.
Using these rules, calculate the measurelets of a parallelogram and a triangle.
Apparently, there are plenty of measurelets: almost anything goes. Especially, outer measures are measurelets.

Theorem C.1.4. Let $m : \mathcal{A} \to [0,\infty]$ be a measurelet on a set $X$. Then $m^* : \mathcal{P}(X) \to [0,\infty]$ is an outer measure for which $m^*(A) \leq m(A)$ for every $A \in \mathcal{A}$.

Proof. Clearly, $m^* : \mathcal{P}(X) \to [0,\infty]$ is well defined, and $m^*(A) \leq m(A)$ for every $A \in \mathcal{A}$. We see that $m^*(\emptyset) = 0$, because $\sum_{A\in\emptyset} m(A) = 0$, $\emptyset \subset \mathcal{A}$ is countable, and $\emptyset \subset \bigcup\emptyset$. Next, if $E \subset F \subset X$ then $m^*(E) \leq m^*(F)$, because any cover $\{A_j\}_{j=1}^\infty$ of $F$ is also a cover of $E$. Lastly, let $\{E_j\}_{j=1}^\infty \subset \mathcal{P}(X)$. Take $\varepsilon > 0$. For each $j \geq 1$, choose $\{A_{jk}\}_{k=1}^\infty \subset \mathcal{A}$ such that
\[
E_j \subset \bigcup_{k=1}^\infty A_{jk}
\quad\text{and}\quad
m^*(E_j) + 2^{-j}\varepsilon \geq \sum_{k=1}^\infty m(A_{jk}).
\]
Then $\{A_{jk}\}_{j,k=1}^\infty \subset \mathcal{A}$ is a countable cover of $\bigcup_{j=1}^\infty E_j \subset X$, and
\[
m^*\Big(\bigcup_{j=1}^\infty E_j\Big)
\leq \sum_{j=1}^\infty \sum_{k=1}^\infty m(A_{jk})
\leq \sum_{j=1}^\infty m^*(E_j) + \varepsilon.
\]
Thus $m^*\big(\bigcup_{j=1}^\infty E_j\big) \leq \sum_{j=1}^\infty m^*(E_j)$; the proof is complete. $\square$
Definition C.1.5 (Lebesgue's outer measure). On the Euclidean space $X = \mathbb{R}^n$, let us define the partial order $\leq$ by
\[
a \leq b \overset{\text{definition}}{\Longleftrightarrow} \forall j \in \{1,\ldots,n\} : a_j \leq b_j.
\]
When $a \leq b$, let the $n$-interval be
\[
[a,b] := [a_1,b_1] \times \cdots \times [a_n,b_n] = \{x \in \mathbb{R}^n : a \leq x \leq b\}.
\]
For $\mathcal{A} = \{[a,b] : a,b \in X,\ a \leq b\}$ let us define the Lebesgue measurelet $m : \mathcal{A} \to [0,\infty]$ by
\[
m([a,b]) := \mathrm{volume}([a,b]) = \prod_{j=1}^n |a_j - b_j|.
\]
Then the generated outer measure $\lambda^* = \lambda^*_{\mathbb{R}^n} := m^* : \mathcal{P}(\mathbb{R}^n) \to [0,\infty]$ is called the Lebesgue outer measure of $\mathbb{R}^n$.

Exercise C.1.6. Give an example of an outer measure that cannot be generated by a measurelet.
Definition C.1.7 (Outer measure measurability). Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. A set $E \subset X$ is called $\psi$-measurable if
\[
\psi(S) = \psi(E \cap S) + \psi(E^c \cap S)
\]
for every $S \subset X$, where $E^c = X \setminus E$. The family of $\psi$-measurable sets is denoted by $\mathcal{M}(\psi) \subset \mathcal{P}(X)$.

Remark C.1.8. Notice that trivially $\psi(S) \leq \psi(E \cap S) + \psi(E^c \cap S)$ by the properties of the outer measure. Intuitively, a measurable set $E$ "sharply cuts" "rough" sets $S \subset X$ into two disjoint pieces, $E \cap S$ and $E^c \cap S$.

Remark C.1.9 (Non-measurability). The Axiom of Choice can be used to "construct" a subset $E \subset \mathbb{R}^n$ which is not Lebesgue measurable. We will discuss this topic in Section C.1.4.

Exercise C.1.10. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure and $E \subset X$. Define $\psi_E : \mathcal{P}(X) \to [0,\infty]$ by $\psi_E(S) := \psi(E \cap S)$. Show that $\psi_E$ is an outer measure for which $\mathcal{M}(\psi) \subset \mathcal{M}(\psi_E)$.

Lemma C.1.11. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure and $\psi(E) = 0$. Then $E \in \mathcal{M}(\psi)$.

Proof. Let $S \subset X$. Then
\[
\psi(S) \leq \psi(E \cap S) + \psi(E^c \cap S) \leq \psi(E) + \psi(S) = \psi(S),
\]
so that $\psi(S) = \psi(E \cap S) + \psi(E^c \cap S)$; the set $E$ is $\psi$-measurable. $\square$
Proof. The definition of ψ-measurability is clearly complement symmetric, so that E ∈ M(ψ) ⇐⇒ E c ∈ M(ψ). Next, it is sufficient to deal with E ∪ F , since E ∩ F = (E c ∪ F c )c . Take S ⊂ X. Then ψ(S)
≤
ψ((E ∪ F ) ∩ S) + ψ((E ∪ F )c ∩ S)
=
ψ((E ∪ F ) ∩ S) + ψ(E c ∩ F c ∩ S)
E∈M(ψ)
=
F ∈M(ψ)
=
E∈M(ψ)
=
ψ(E ∩ S) + ψ(E c ∩ F ∩ S) + ψ(E c ∩ F c ∩ S) ψ(E ∩ S) + ψ(E c ∩ S) ψ(S).
Hence ψ(S) = ψ((E ∪F )∩S)+ψ((E ∪F )c ∩S), so that E ∪F is ψ-measurable.
Exercise C.1.13. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. Let $E \subset S \subset X$, $E \in \mathcal{M}(\psi)$ and $\psi(E) < \infty$. Show that $\psi(S \setminus E) = \psi(S) - \psi(E)$.

Definition C.1.14 ($\sigma$-algebras). A family $\mathcal{M} \subset \mathcal{P}(X)$ is called a $\sigma$-algebra on $X$ (pronounced: sigma-algebra) if
1. $\bigcup\mathcal{E} \in \mathcal{M}$ for every countable collection $\mathcal{E} \subset \mathcal{M}$, and
2. $E^c \in \mathcal{M}$ for every $E \in \mathcal{M}$.

Remark C.1.15. Here, recall the conventions for the union and the intersection of the empty family: for $\mathcal{A} = \emptyset \subset \mathcal{P}(X)$, we naturally define $\bigcup\mathcal{A} = \emptyset$, but notice that $\bigcap\mathcal{A} = X$ (this is not as surprising as it might first seem). Thereby $\mathcal{M}$ is a $\sigma$-algebra on $X$ if and only if
1. $\bigcup_{j=1}^\infty E_j \in \mathcal{M}$ whenever $\{E_j\}_{j=1}^\infty \subset \mathcal{M}$,
2. $E^c \in \mathcal{M}$ for every $E \in \mathcal{M}$, and
3. $\emptyset \in \mathcal{M}$.
Thus, a $\sigma$-algebra on $X$ always contains at least the subsets $\emptyset \subset X$ and $X \subset X$.

Proposition C.1.16. Let $\mathcal{A} \subset \mathcal{P}(X)$. There exists the smallest $\sigma$-algebra $\Sigma(\mathcal{A})$ on $X$ containing $\mathcal{A}$, called the $\sigma$-algebra generated by $\mathcal{A}$.

A word of warning: there is no summation in this $\sigma$-algebra business here, even though we have used the capital sigma symbol $\Sigma$.

Exercise C.1.17. Prove Proposition C.1.16.

Definition C.1.18 (Borel $\sigma$-algebra). The Borel $\sigma$-algebra of a topological space $(X,\tau)$ is $\Sigma(\tau) \subset \mathcal{P}(X)$. The members of $\Sigma(\tau)$ are called Borel sets.

Definition C.1.19 (Disjoint family). A family $\mathcal{A} \subset \mathcal{P}(X)$ is called disjoint if $A \cap B = \emptyset$ for every $A, B \in \mathcal{A}$ for which $A \neq B$.

Remark C.1.20 (Disjointisation). In measure theory, the following "disjointisation" process often comes in handy. Let $\mathcal{M}$ be a $\sigma$-algebra and $\{E_j\}_{j=1}^\infty \subset \mathcal{M}$. Let $F_1 := E_1$ and
\[
F_{k+1} := E_{k+1} \setminus \bigcup_{j=1}^k E_j.
\]
Now $\{F_k\}_{k=1}^\infty \subset \mathcal{M}$ is a disjoint family satisfying $F_k \subset E_k$ and
\[
\bigcup_{j=1}^\infty E_j = \bigcup_{k=1}^\infty F_k.
\]
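The disjointisation process of Remark C.1.20 is a one-pass algorithm: each new set keeps only the points not seen so far. A pure-Python sketch (an added illustration, not from the book):

```python
def disjointise(Es):
    """Remark C.1.20: F_1 = E_1, F_{k+1} = E_{k+1} minus the earlier E_j's."""
    Fs, seen = [], set()
    for E in Es:
        Fs.append(set(E) - seen)
        seen |= set(E)
    return Fs

Es = [{1, 2, 3}, {2, 3, 4}, {1, 5}]
Fs = disjointise(Es)
assert Fs == [{1, 2, 3}, {4}, {5}]

# pairwise disjoint, same union, F_k a subset of E_k:
assert all(Fs[i].isdisjoint(Fs[j]) for i in range(3) for j in range(3) if i != j)
assert set().union(*Fs) == set().union(*Es)
assert all(F <= set(E) for F, E in zip(Fs, Es))
```

The three asserted properties are exactly those claimed in the remark.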
Proposition C.1.21. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. Let $\{F_k\}_{k=1}^\infty \subset \mathcal{M}(\psi)$ be disjoint. Then $\bigcup_{k=1}^\infty F_k \in \mathcal{M}(\psi)$ and
\[
\psi\Big(\bigcup_{k=1}^\infty F_k \cap S\Big) = \sum_{k=1}^\infty \psi(F_k \cap S) \tag{C.1}
\]
for every $S \subset X$, especially
\[
\psi\Big(\bigcup_{k=1}^\infty F_k\Big) = \sum_{k=1}^\infty \psi(F_k).
\]

Proof. Let $E := \bigcup_{k=1}^\infty F_k$. Take $S \subset X$. By Lemma C.1.12, $G_n := \bigcup_{k=1}^n F_k \in \mathcal{M}(\psi)$. Now
\begin{align*}
\psi(S) &\leq \psi(E \cap S) + \psi(E^c \cap S)\\
&\leq \sum_{k=1}^\infty \psi(F_k \cap S) + \psi(E^c \cap S)\\
&= \lim_{n\to\infty}\Big(\sum_{k=1}^n \psi(F_k \cap S) + \psi(E^c \cap S)\Big)\\
&\overset{\{F_k\}_{k=1}^n\subset\mathcal{M}(\psi)\ \text{disjoint}}{=} \lim_{n\to\infty}\big(\psi(G_n \cap S) + \psi(E^c \cap S)\big)\\
&\overset{E^c\subset G_n^c}{\leq} \lim_{n\to\infty}\big(\psi(G_n \cap S) + \psi(G_n^c \cap S)\big)\\
&\overset{G_n\in\mathcal{M}(\psi)}{=} \psi(S).
\end{align*}
Hence $\psi(S) = \psi(E \cap S) + \psi(E^c \cap S)$, meaning that $E \in \mathcal{M}(\psi)$. Moreover, (C.1) follows from the above chain of (in)equalities. $\square$

Corollary C.1.22. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. For each $k \geq 1$, let $E_k \in \mathcal{M}(\psi)$ be such that $E_k \subset E_{k+1}$. Then
\[
\psi\Big(\bigcup_{k=1}^\infty E_k\Big) = \lim_{k\to\infty} \psi(E_k). \tag{C.2}
\]
For each $k \geq 1$, let $F_k \in \mathcal{M}(\psi)$ be such that $F_k \supset F_{k+1}$ and $\psi(F_1) < \infty$. Then
\[
\psi\Big(\bigcap_{k=1}^\infty F_k\Big) = \lim_{k\to\infty} \psi(F_k). \tag{C.3}
\]
Proof. Let us assume that $\psi(E_k) < \infty$ for every $k \geq 1$, for otherwise the first claim is trivial. Thereby
\begin{align*}
\psi\Big(\bigcup_{k=1}^\infty E_k\Big)
&= \psi\Big(E_1 \cup \bigcup_{k=1}^\infty (E_{k+1} \setminus E_k)\Big)\\
&\overset{\text{Prop. C.1.21}}{=} \psi(E_1) + \sum_{k=1}^\infty \psi(E_{k+1} \setminus E_k)\\
&\overset{\text{Exercise C.1.13}}{=} \psi(E_1) + \lim_{n\to\infty} \sum_{k=1}^n \big(\psi(E_{k+1}) - \psi(E_k)\big)\\
&= \lim_{n\to\infty} \psi(E_{n+1}).
\end{align*}
Now
\begin{align*}
\psi(F_1)
&= \psi\Big(\bigcap_{k=1}^\infty F_k \,\cup\, \bigcup_{j=1}^\infty (F_1 \setminus F_j)\Big)\\
&\overset{\text{Prop. C.1.21}}{=} \psi\Big(\bigcap_{k=1}^\infty F_k\Big) + \psi\Big(\bigcup_{j=1}^\infty (F_1 \setminus F_j)\Big)\\
&\overset{\text{(C.2)}}{=} \psi\Big(\bigcap_{k=1}^\infty F_k\Big) + \lim_{j\to\infty} \psi(F_1 \setminus F_j)\\
&\overset{\text{Exercise C.1.13}}{=} \psi\Big(\bigcap_{k=1}^\infty F_k\Big) + \lim_{j\to\infty} \big(\psi(F_1) - \psi(F_j)\big),
\end{align*}
from which (C.3) follows, since $\psi(F_1) < \infty$. $\square$
Exercise C.1.23. Give an example of an outer measure $\psi : \mathcal{P}(X) \to [0,\infty]$ and sets $E_k \subset X$ such that $E_k \subset E_{k+1}$ for all $k \in \mathbb{Z}^+$ and
\[
\psi\Big(\bigcup_{k=1}^\infty E_k\Big) \neq \lim_{k\to\infty} \psi(E_k).
\]

Exercise C.1.24. Give an example that shows the indispensability of the assumption $\psi(F_1) < \infty$ in Corollary C.1.22. For instance, find an outer measure $\varphi : \mathcal{P}(\mathbb{Z}) \to [0,\infty]$ and a family $\{F_k\}_{k=1}^\infty \subset \mathcal{M}(\varphi)$ for which
\[
\varphi\Big(\bigcap_{k=1}^\infty F_k\Big) \neq \lim_{k\to\infty} \varphi(F_k),
\]
even though $F_k \supset F_{k+1}$ for all $k$.

Theorem C.1.25. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. Then the $\psi$-measurable sets form a $\sigma$-algebra $\mathcal{M}(\psi)$.

Proof. $\emptyset \in \mathcal{M}(\psi)$ due to Lemma C.1.11. By Lemma C.1.12, we know that $\mathcal{M}(\psi)$ is closed under taking complements. We must prove that it is closed also under taking countable unions. Let $\{E_j\}_{j=1}^\infty \subset \mathcal{M}(\psi)$. Applying the disjointisation process of Remark C.1.20, we obtain a disjoint family $\{F_k\}_{k=1}^\infty \subset \mathcal{M}(\psi)$ for which $\bigcup_{j=1}^\infty E_j = \bigcup_{k=1}^\infty F_k$. Exploiting Proposition C.1.21, the proof is concluded. $\square$

Definition C.1.26 (Measures and measure spaces). Let $\mathcal{M}$ be a $\sigma$-algebra on $X$. A mapping $\mu : \mathcal{M} \to [0,\infty]$ is called a measure on $X$ if
\[
\mu(\emptyset) = 0, \qquad
\mu\Big(\bigcup_{j=1}^\infty E_j\Big) = \sum_{j=1}^\infty \mu(E_j)
\]
whenever $\{E_j\}_{j=1}^\infty \subset \mathcal{M}$ is a disjoint family. Then the triple $(X, \mathcal{M}, \mu)$ is called a measure space; such a measure space and the corresponding measure $\mu$ are called:
• finite, if $\mu(X) < \infty$;
• probability, if $\mu$ is a finite measure with $\mu(X) = 1$;
• complete, if $F \in \mathcal{M}$ whenever there exists $E \in \mathcal{M}$ such that $F \subset E$ and $\mu(E) = 0$;
• Borel, if $\mathcal{M} = \Sigma(\tau)$, the $\sigma$-algebra of the Borel sets in a topological space $(X,\tau)$. However, sometimes the Borel condition may mean $\Sigma(\tau) \subset \mathcal{M}$ (more on this later).

Theorem C.1.27. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an outer measure. Then the restriction $\psi|_{\mathcal{M}(\psi)} : \mathcal{M}(\psi) \to [0,\infty]$ is a complete measure.

Proof. This follows by Proposition C.1.21 and Lemma C.1.11. $\square$

Exercise C.1.28. Let $\mu_k : \mathcal{M} \to [0,\infty]$ be measures for which $\mu_k(E) \leq \mu_{k+1}(E)$ for every $E \in \mathcal{M}$ (and all $k \in \mathbb{Z}^+$). Show that $\mu : \mathcal{M} \to [0,\infty]$, where
\[
\mu(E) := \lim_{k\to\infty} \mu_k(E),
\]
is a measure.

Exercise C.1.29 (Borel–Cantelli Lemma). Let $(X, \mathcal{M}, \mu)$ be a measure space, $\{E_j\}_{j=1}^\infty \subset \mathcal{M}$ and
\[
E := \big\{x \in X \mid \{j \in \mathbb{Z}^+ : x \in E_j\} \text{ is infinite}\big\}.
\]
Prove that $\mu(E) = 0$ if
\[
\sum_{j=1}^\infty \mu(E_j) < \infty.
\]
This is the so-called Borel–Cantelli Lemma.

Remark C.1.30. By Theorem C.1.4, any measure $\mu$ generates the outer measure $\mu^*$, whose restriction $\mu^*|_{\mathcal{M}(\mu^*)}$ is a complete measure, which generates an outer measure, and so on. Fortunately, this back-and-forth process terminates, as we shall see in Theorems C.1.35 and C.1.36.

Lemma C.1.31. Let $\mu : \mathcal{M} \to [0,\infty]$ be a measure on $X$. Then for every $S \subset X$ there exists $A \in \mathcal{M}$ such that $S \subset A$ and $\mu^*(S) = \mu(A)$. Consequently,
\[
\mu^*(S) = \min\{\mu(A) : S \subset A \in \mathcal{M}\}.
\]
Remark C.1.32. An outer measure $\psi : \mathcal{P}(X) \to [0,\infty]$ is called $\mathcal{M}$-regular if $\mathcal{M} \subset \mathcal{M}(\psi)$ and
\[
\forall S \subset X\ \exists A \in \mathcal{M} :\ S \subset A,\ \psi(S) = \psi(A);
\]
according to Lemma C.1.31, the outer measure $\mu^*$ generated by a measure $\mu : \mathcal{M} \to [0,\infty]$ is $\mathcal{M}$-regular.

Proof. If $S \subset X$ then
\begin{align*}
\mu^*(S)
&= \inf\Big\{\sum_{j=1}^\infty \mu(A_j) : S \subset \bigcup_{j=1}^\infty A_j,\ \{A_j\}_{j=1}^\infty \subset \mathcal{M}\Big\}\\
&\geq \inf\Big\{\mu\Big(\bigcup_{j=1}^\infty A_j\Big) : S \subset \bigcup_{j=1}^\infty A_j,\ \{A_j\}_{j=1}^\infty \subset \mathcal{M}\Big\}\\
&= \inf\{\mu(A) : S \subset A,\ A \in \mathcal{M}\}\\
&\geq \mu^*(S).
\end{align*}
Thus $\mu^*(S) = \inf\{\mu(A) : S \subset A \in \mathcal{M}\}$. For $\varepsilon > 0$, choose $A_\varepsilon \in \mathcal{M}$ such that $S \subset A_\varepsilon$ and $\mu^*(S) + \varepsilon \geq \mu(A_\varepsilon)$. Let $A_0 := \bigcap_{k=1}^\infty A_{1/k} \in \mathcal{M}$. Then $S \subset A_0$, and
\[
\mu^*(S) \leq \mu(A_0) \leq \mu(A_\varepsilon) \leq \mu^*(S) + \varepsilon
\]
implies $\mu^*(S) = \mu(A_0)$. $\square$
Exercise C.1.33. Let $\psi : \mathcal{P}(X) \to [0,\infty]$ be an $\mathcal{M}$-regular outer measure and $E \in \mathcal{M}(\psi)$. Define $\psi_E : \mathcal{P}(X) \to [0,\infty]$ by $\psi_E(S) := \psi(E \cap S)$ as in Exercise C.1.10. Show that $\psi_E$ is an $\mathcal{M}$-regular outer measure.

Exercise C.1.34. Let $(X, \mathcal{M}, \mu)$ be a measure space and $E_k \subset X$ such that $E_k \subset E_{k+1}$ for all $k \in \mathbb{Z}^+$. Show that
\[
\mu^*\Big(\bigcup_{k=1}^\infty E_k\Big) = \lim_{k\to\infty} \mu^*(E_k).
\]
Notice that this does not violate Exercise C.1.23.

Theorem C.1.35 (Carathéodory–Hahn extension). Let $\mu : \mathcal{M} \to [0,\infty]$ be a measure. Then $\mathcal{M} \subset \mathcal{M}(\mu^*)$ and $\mu = \mu^*|_{\mathcal{M}}$.

Proof. Let $E \in \mathcal{M}$. Then $\mu^*(E) = \mu(E)$, because trivially $\mu^*(E) \leq \mu(E)$ and because
\[
\mu(E) \leq \mu\Big(\bigcup_{j=1}^\infty E_j\Big) \leq \sum_{j=1}^\infty \mu(E_j)
\]
for any $\{E_j\}_{j=1}^\infty \subset \mathcal{M}$ covering $E$. To prove $\mathcal{M} \subset \mathcal{M}(\mu^*)$, we must show that
\[
\mu^*(S) = \mu^*(E \cap S) + \mu^*(E^c \cap S)
\]
for any $S \subset X$. This follows because, trivially, $\mu^*(S) \leq \mu^*(E \cap S) + \mu^*(E^c \cap S)$, and
\[
\mu^*(S)
\overset{\text{Lemma C.1.31}}{=} \inf\{\mu(A) : S \subset A \in \mathcal{M}\}
= \inf\{\mu(A \cap E) + \mu(A \cap E^c) : S \subset A \in \mathcal{M}\}
\geq \mu^*(E \cap S) + \mu^*(E^c \cap S).
\]
This concludes the proof. $\square$

Theorem C.1.36. Let $\mu : \mathcal{M} \to [0,\infty]$ be a measure. Then $\mu^* = \big(\mu^*|_{\mathcal{M}(\mu^*)}\big)^*$.
Proof. Let $\nu := \mu^*|_{\mathcal{M}(\mu^*)}$. We must show that $\nu^* = \mu^*$. Since $\mu = \mu^*|_{\mathcal{M}}$ and $\mathcal{M} \subset \mathcal{M}(\mu^*)$ by Theorem C.1.35, we see that $\mu$ is a restriction of $\nu$, and thus the investigation of Definition C.1.2 yields $\nu^* \leq \mu^*$. Moreover,
\begin{align*}
\mu^*(S) \geq \nu^*(S)
&\overset{\text{Lemma C.1.31}}{=} \inf\{\nu(A) : S \subset A \in \mathcal{M}(\mu^*)\}\\
&\overset{\text{Lemma C.1.31}}{=} \inf\{\mu(B) : S \subset A \in \mathcal{M}(\mu^*),\ A \subset B \in \mathcal{M}\}\\
&\geq \inf\{\mu(B) : S \subset B \in \mathcal{M}\}\\
&\geq \mu^*(S),
\end{align*}
so that $\mu^*(S) = \nu^*(S)$. $\square$
Remark C.1.37. In the sequel, measures are often required to be complete. This restriction is not severe, as measures can always be completed, e.g., by the Carathéodory–Hahn extension, whose naturality is proclaimed by Theorems C.1.35 and C.1.36: if $(X, \mathcal{M}, \mu)$ is a measure space, $\mathcal{N} = \mathcal{M}(\mu^*)$ and $\nu = \mu^*|_{\mathcal{N}}$, then $(X, \mathcal{N}, \nu)$ is a complete measure space such that $\mathcal{M} \subset \mathcal{N}$ and $\mu = \nu|_{\mathcal{M}}$, with $\mu^* = \nu^*$. So, from this point onwards, we may assume that a measure $\mu : \mathcal{M} \to [0,\infty]$ is already Carathéodory–Hahn complete, i.e., that $\mathcal{M} = \mathcal{M}(\mu^*)$.
C.1.2 Borel regularity

Borel measures are particularly important, providing a link with the topology of the space. We will study such measures in this section.

Definition C.1.38 (Borel regular outer measures). Let $(X, \tau)$ be a topological space and $\Sigma(\tau)$ its Borel $\sigma$-algebra. An outer measure $\psi : \mathcal{P}(X) \to [0,\infty]$ is Borel regular if it is $\Sigma(\tau)$-regular.
Definition C.1.39 (Metric outer measure). An outer measure $\psi : \mathcal{P}(X) \to [0,\infty]$ on a metric space $(X, d)$ is called a metric outer measure if it satisfies the following Carathéodory condition:
\[
\mathrm{dist}(A, B) > 0 \quad\Longrightarrow\quad \psi(A \cup B) = \psi(A) + \psi(B). \tag{C.4}
\]
This condition characterises measurability of Borel sets of a metric space:

Theorem C.1.40. Let $\tau_d$ be the metric topology of a metric space $(X, d)$. An outer measure $\psi : \mathcal{P}(X) \to [0,\infty]$ is a metric outer measure if and only if $\tau_d \subset \mathcal{M}(\psi)$.

Proof. The "if" part of the proof is left for the reader as Exercise C.1.41. Take $U \in \tau_d$. To show that $U \in \mathcal{M}(\psi)$, we need to prove
\[
\psi(A \cup B) = \psi(A) + \psi(B)
\]
when $A \subset U$ and $B \subset U^c$. We may assume that $\psi(A), \psi(B) < \infty$. For each $k \in \mathbb{Z}^+$, let
\[
A_k := \{x \in A \mid \mathrm{dist}(x, U^c) \geq 1/k\}.
\]
Then $\mathrm{dist}(A_k, B) \geq 1/k$, enabling the application of the Carathéodory condition (C.4) in
\[
\psi(A \cup B) \overset{A\supset A_k}{\geq} \psi(A_k \cup B) \overset{\text{(C.4)}}{=} \psi(A_k) + \psi(B).
\]
Clearly
\[
\psi(A_k) \leq \psi(A) \leq \psi(A_k) + \psi(A \setminus A_k),
\]
so we have to show that $\psi(A \setminus A_k) \to 0$. Here $A = \bigcup_{k=1}^\infty A_k$, since $U$ is open. Consequently,
\[
\psi(A \setminus A_k) = \psi\Big(\bigcup_{l=k}^\infty (A_{l+1} \setminus A_l)\Big) \leq \sum_{l=k}^\infty \psi(A_{l+1} \setminus A_l).
\]
Moreover, the odd- and even-indexed differences are positively separated from each other, so that by (C.4),
\[
\sum_{m=1}^\infty \psi(A_{k+2m+1} \setminus A_{k+2m})
= \psi\Big(\bigcup_{m=1}^\infty (A_{k+2m+1} \setminus A_{k+2m})\Big) \leq \psi(A),
\]
and similarly for the sum over the differences $A_{k+2m} \setminus A_{k+2m-1}$; hence $\sum_{l=1}^\infty \psi(A_{l+1} \setminus A_l) \leq 2\,\psi(A) < \infty$. Thus
\[
\psi(A \setminus A_k) \leq \sum_{l=k}^\infty \psi(A_{l+1} \setminus A_l) \xrightarrow{k\to\infty} 0,
\]
completing the proof. $\square$
for every $\varepsilon > 0$ there exist closed $F_\varepsilon \subset X$ and open $G_\varepsilon \subset X$ such that $G_\varepsilon \supset E \supset F_\varepsilon$ and $\psi(G_\varepsilon \setminus F_\varepsilon) < \varepsilon$.

Proof. Let us assume the second condition. Let $E \subset X$ be such that for each $\varepsilon > 0$ there exists a closed set $F_\varepsilon \subset X$ such that
\[
F_\varepsilon \subset E \quad\text{and}\quad \psi(E \setminus F_\varepsilon) < \varepsilon.
\]
If $F = \bigcup_{k=1}^\infty F_{1/k}$ then $E \supset F \in \Sigma(\tau_d) \subset \mathcal{M}(\psi)$, since we assume the measurability of the Borel sets. Moreover, $E \in \mathcal{M}(\psi)$, because $E = F \cup (E \setminus F)$, where $E \setminus F \in \mathcal{M}(\psi)$ due to
\[
0 \leq \psi(E \setminus F) \leq \psi(E \setminus F_{1/k}) < 1/k.
\]
for some $r > 0$. If $Q(x, r) \subset U$, take $z \in \mathbb{Q}^n \cap B_{r/2}(x)$; then $x \in Q(z, r/2) \subset Q(x, r) \subset U$. Thus
\[
U = \bigcup\big\{Q(z, 1/m) : z \in \mathbb{Q}^n,\ m \in \mathbb{Z}^+,\ Q(z, 1/m) \subset U\big\},
\]
which is measurable as a countable union of measurable sets. $\square$
Remark C.1.53. It now turns out that Lebesgue measurable sets are nearly open or closed sets:

Theorem C.1.54 (Topological approximation of Lebesgue measurable sets). Let $E \subset \mathbb{R}^n$. The following three conditions are equivalent:
1. $E \in \mathcal{M}(\lambda^*_{\mathbb{R}^n})$.
2. For every $\varepsilon > 0$ there exists an open set $U \subset \mathbb{R}^n$ such that $E \subset U$ and $\lambda^*_{\mathbb{R}^n}(U \setminus E) < \varepsilon$.
3. For every $\varepsilon > 0$ there exists a closed set $S \subset \mathbb{R}^n$ such that $S \subset E$ and $\lambda^*_{\mathbb{R}^n}(E \setminus S) < \varepsilon$.

Proof. Let us show that the first condition implies the second one. Suppose $E \subset \mathbb{R}^n$ is Lebesgue measurable. Let $\varepsilon > 0$. For a moment, assume that $\lambda_{\mathbb{R}^n}(E) < \infty$. Take a family $\{A_j\}_{j=1}^\infty$ of $n$-intervals such that
\[
E \subset \bigcup_{j=1}^\infty A_j \tag{C.5}
\]
and
\[
\sum_{j=1}^\infty \mathrm{volume}(A_j) < \lambda_{\mathbb{R}^n}(E) + \varepsilon. \tag{C.6}
\]
We may think that this is an $\varepsilon$-tight cover of $E$, and we may loosen it a bit by taking a family $\{B_j\}_{j=1}^\infty$ of $n$-intervals such that $A_j \subset \mathrm{int}(B_j)$ and
\[
\lambda_{\mathbb{R}^n}(B_j) \leq \lambda_{\mathbb{R}^n}(A_j) + 2^{-j}\varepsilon. \tag{C.7}
\]
Let $U := \bigcup_{j=1}^\infty \mathrm{int}(B_j)$. Then $U \subset \mathbb{R}^n$ is open, $E \subset U$ and
\[
\lambda_{\mathbb{R}^n}(U)
\leq \sum_{j=1}^\infty \lambda_{\mathbb{R}^n}(B_j)
\overset{\text{(C.7)}}{\leq} \sum_{j=1}^\infty \lambda_{\mathbb{R}^n}(A_j) + \varepsilon
\overset{\text{(C.6)}}{<} \lambda_{\mathbb{R}^n}(E) + 2\varepsilon.
\]
Then $\lambda^*_{\mathbb{R}^n}(S) > 0$, because $\mathbb{R}^n = \mathbb{Q}^n + S$ is the union of the countable family $\{q + S \mid q \in \mathbb{Q}^n\}$, where $\lambda^*_{\mathbb{R}^n}(q + S) = \lambda^*_{\mathbb{R}^n}(S)$. Moreover, if $0 \neq q \in \mathbb{Q}^n$ then $S \cap (q + S) = \emptyset$. By the following result, this proves the non-measurability of $S$:

Proposition C.1.57. Let $S \subset \mathbb{R}^n$ be Lebesgue measurable and $\lambda_{\mathbb{R}^n}(S) > 0$. Then there exists $\delta > 0$ such that $\lambda_{\mathbb{R}^n}\big(S \cap (x + S)\big) > 0$ whenever $\|x\|_{\mathbb{R}^n} < \delta$.

Proof. Let $0 < \varepsilon < 1$. Since $\lambda(S) > 0$, there exists an $n$-interval $I = [a,b] \subset \mathbb{R}^n$ such that $\lambda(S \cap I) \geq (1-\varepsilon)\,\lambda(I) > 0$. Let $E = S \cap I$. Then
\[
\lambda(I \setminus E) = \lambda(I) - \lambda(E) \leq \varepsilon\,\lambda(I)
\]
due to the measurability of $E$. For any $x \in \mathbb{R}^n$,
\[
I \cap (x + I) \subset \big(E \cap (x + E)\big) \cup (I \setminus E) \cup \big((x + I) \setminus (x + E)\big),
\]
so that
\[
\lambda\big(I \cap (x + I)\big)
\leq \lambda\big(E \cap (x + E)\big) + \lambda(I \setminus E) + \lambda\big((x + I) \setminus (x + E)\big)
\leq \lambda\big(E \cap (x + E)\big) + 2\varepsilon\,\lambda(I),
\]
where the last estimate follows by the translation invariance of the Lebesgue measure. The reader easily verifies that $\lim_{x\to 0} \lambda\big(I \cap (x + I)\big) = \lambda(I)$. Thus the claim follows if we choose $\varepsilon$ small enough. $\square$
Exercise C.1.58. Let $I = [a,b] \subset \mathbb{R}^n$ be an $n$-interval. Show that
\[
\lambda_{\mathbb{R}^n}\big(I \cap (x + I)\big) \xrightarrow{\|x\|_{\mathbb{R}^n}\to 0} \lambda_{\mathbb{R}^n}(I).
\]
Actually, it can be shown that in the Zermelo–Fraenkel set theory without the Axiom of Choice, one cannot prove the existence of Lebesgue non-measurable sets: see [114]. In practice, we do not have to worry much about non-Lebesgue-measurability.
C.2
Measurable functions
In topology, continuous functions were essential; in measure theory, the nice functions are the measurable ones. Before going into details, let us sketch the common framework behind both continuity and measurability. Let us say that f : X → Y induces (or pulls back) from a family B ⊂ P(Y) a new family f^∗(B) ⊂ P(X) defined by
f^∗(B) := { f⁻¹(B) ⊂ X | B ∈ B },
and f : X → Y co-induces (or pushes forward) from a family A ⊂ P(X) a new family f_∗(A) ⊂ P(Y) defined by
f_∗(A) := { B ⊂ Y | f⁻¹(B) ∈ A }.
Here, if A, B are topologies (respectively, σ-algebras) then f_∗(A), f^∗(B) are also topologies (respectively, σ-algebras), since f⁻¹ : P(Y) → P(X) preserves unions, intersections and complements.
Exercise C.2.1. Let A, B be σ-algebras. Check that f_∗(A), f^∗(B) are indeed σ-algebras.
C.2.1 Well-behaving functions
Definition C.2.2 (Measurability). Let MX, MY be σ-algebras on X and Y, respectively. A function f : X → Y is called (MX, MY)-measurable if f⁻¹(V) ∈ MX for every V ∈ MY; that is, if f^∗(MY) ⊂ MX.
Remark C.2.3. We see that measurability behaves well under composition provided that the involved σ-algebras naturally match: if f : X → Y is (M, N)-measurable and g : Y → Z is (N, O)-measurable, then g ∘ f : X → Z is (M, O)-measurable. For us, the most important case is Y = Z = [−∞, +∞] = R ∪ {−∞, +∞}, for which the canonical σ-algebra will be the collection Σ(τ∞) of Borel sets, where τ∞ ⊂ P([−∞, +∞]) is the smallest topology for which all the intervals [a, b] ⊂ [−∞, +∞] are closed.
Definition C.2.4 (Borel/Lebesgue measurability). Let M be a σ-algebra on X, and let τX be a topology on X. A function f : X → [−∞, +∞] is called
• M-measurable if it is (M, Σ(τ∞))-measurable, and
• Borel measurable if it is Σ(τX)-measurable.
A function f : Rⁿ → [−∞, +∞] is called Lebesgue measurable if it is M(λ∗Rn)-measurable.
Definition C.2.5. The characteristic function χE : X → R of a subset E ⊂ X is defined by
χE(x) := 1 if x ∈ E, and χE(x) := 0 if x ∈ Eᶜ.
Notice that χE is M-measurable if and only if E ∈ M.
Definition C.2.6. Let a ∈ R and f, g : X → [−∞, +∞]. We write
{f > a} := {x ∈ X | f(x) > a}, {f > g} := {x ∈ X | f(x) > g(x)}.
In an analogous manner one defines the sets {f < a}, {f ≥ a}, {f ≤ a}, {f = a}, {f ≠ a}, {f < g}, {f ≥ g}, {f ≤ g}, {f = g}, {f ≠ g}, and so on.
Theorem C.2.7. Let M be a σ-algebra on X and f : X → [−∞, +∞]. Then the following conditions are equivalent:
1. f is M-measurable.
2. {f > a} is measurable for each a ∈ R.
3. {f ≥ a} is measurable for each a ∈ R.
4. {f < a} is measurable for each a ∈ R.
5. {f ≤ a} is measurable for each a ∈ R.
Proof. If f is M-measurable then {f > a} = f⁻¹((a, +∞]) ∈ M, because (a, +∞] ⊂ [−∞, +∞] is a Borel set. Now suppose {f > a} ∈ M for every a ∈ R: we have to show that f is M-measurable. We notice that f is (M, D)-measurable, where
D := f_∗(M) = { B ⊂ [−∞, +∞] | f⁻¹(B) ∈ M }.
Furthermore, f is M-measurable, because Σ(τ∞) ⊂ D: for every [a, b] ⊂ [−∞, +∞] we have
f⁻¹([a, b]) = {f ≥ a} ∩ {f ≤ b} = ( ⋂_{k=1}^∞ {f > a − 1/k} ) ∩ {f > b}ᶜ ∈ M;
recall that Σ(τ∞) is the smallest σ-algebra containing every interval. Thus f is M-measurable. All the other claims have essentially similar proofs.
Remark C.2.8. Let f, g : X → [−∞, +∞] be M-measurable. By Theorem C.2.7, then {f > g} ∈ M, because
{f > g} = ⋃_{r∈Q} ( {f > r} ∩ {g < r} );
notice that here the union is countable! Similarly, also {f ≥ g}, {f < g}, {f ≤ g}, {f = g}, {f ≠ g} ∈ M.
Example. A continuous function f : X → [−∞, +∞] is Borel measurable, because {f ≥ a} ⊂ X is closed for each a ∈ R. Therefore a continuous function f : Rⁿ → [−∞, +∞] is Lebesgue measurable, because Borel sets in Rⁿ are Lebesgue measurable.
Theorem C.2.9. Let λ ∈ R and 0 < p < ∞. Let f, g : X → R be M-measurable. Then
λf, f + g, fg, |f|ᵖ, min(f, g), max(f, g) : X → R
are M-measurable. Moreover, if 0 ∉ f(X) = {f(x) : x ∈ X} then 1/f is M-measurable.
Proof. The reader may easily show that λf is M-measurable. If a ∈ R then
{f + g > a} = ⋃_{q∈Q: q>a} {f + g > q} = ⋃_{r,s∈Q: r+s>a} ( {f > r} ∩ {g > s} ),
showing that f + g is M-measurable. If a ≥ 0 then
{f² > a} = {f > √a} ∪ {f < −√a},
so that f² is M-measurable, and hence also fg = ((f + g)² − f² − g²)/2 is M-measurable; the remaining claims are proved similarly.
Theorem C.2.13. Let (X, M, μ) be a complete measure space, let f : X → [−∞, +∞] be M-measurable, and let g = f μ-almost everywhere. Then g is M-measurable.
Proof. Let N := {f ≠ g}, so that μ∗(N) = 0 and, by completeness, N ∈ M. By Theorem C.2.7, it suffices to show that {g > a} ∈ M for any a ∈ R. Notice that
{g > a} = (N ∩ {g > a}) ∪ (Nᶜ ∩ {g > a}) = (N ∩ {g > a}) ∪ (Nᶜ ∩ {f > a}).
Clearly, Nᶜ ∩ {f > a} ∈ M. Moreover, N ∩ {g > a} ∈ M, because μ is complete and μ∗(N ∩ {g > a}) ≤ μ(N) = 0.
Definition C.2.14 (Distinguishing functions?). Let (X, M, μ) be complete. Write f ∼μ g if f = g μ-almost everywhere: we may identify those functions that μ "does not distinguish". Especially, if f : X → [−∞, +∞] is such that μ({|f| = ∞}) = 0, we may identify f with g : X → R defined by
g(x) := f(x) when f(x) ∈ R, and g(x) := 0 otherwise.
C.2.2 Sequences of measurable functions
Theorem C.2.15. Let fj : X → [−∞, +∞] be M-measurable for each j ∈ Z⁺. Then
sup_{j∈Z⁺} fj, inf_{j∈Z⁺} fj, lim sup_{j→∞} fj, lim inf_{j→∞} fj
are also M-measurable.
Proof. First,
{ sup_{j∈Z⁺} fj > a } = ⋃_{j=1}^∞ {fj > a} ∈ M.
Second, the case of the infimum is handled analogously. Third, these previous cases imply the results for lim sup and lim inf.
Definition C.2.16 (Convergences). Let fj, f : X → R, where j ∈ Z⁺. Let us define various convergences fj → f in the following manner. We say that fj → f pointwise (the word "pointwise" is often omitted) if
∀x ∈ X : |fj(x) − f(x)| → 0 as j → ∞.
Saying that fj → f uniformly means
sup_{x∈X} |fj(x) − f(x)| → 0 as j → ∞.
Let (X, M, μ) be complete, fj : X → R be M-measurable and f : X → [−∞, +∞]. We say that fj → f μ-a.e. if fj → f pointwise μ-a.e. on X. Saying that fj → f μ-almost uniformly means that
∀ε > 0 ∃Aε ∈ M : (fj − f)|Aε → 0 uniformly as j → ∞, with μ(Aεᶜ) < ε.
Saying that fj → f in measure μ means
∀ε > 0 : μ∗({|fj − f| ≥ ε}) → 0 as j → ∞.
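The gap between pointwise and uniform convergence is already visible for fj(x) = xʲ on [0, 1): every value fj(x) tends to 0, while sup_{x<1} |fj(x)| = 1 for each j. A small numerical sketch (the grid size and exponents are arbitrary choices, not from the text):

```python
# f_j(x) = x**j on [0,1): pointwise limit 0, but sup_{x<1} x**j = 1 for every j,
# so the convergence is not uniform.
N = 10_000
grid = [k / N for k in range(N)]            # grid approximating [0, 1)

def f(j, x):
    return x ** j

pointwise_at_half = [f(j, 0.5) for j in (1, 10, 50)]          # decreases to 0
sup_norms = [max(f(j, x) for x in grid) for j in (1, 10, 50)]  # stays near 1

print(pointwise_at_half)   # tends to 0
print(sup_norms)           # remains close to 1
```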
Exercise C.2.17. Let functions fj : X → R be M-measurable for every j ∈ Z⁺. Show that E ∈ M, where
E := { x ∈ X : lim_{j→∞} fj(x) ∈ R exists }.
Exercise C.2.18. Let (X, τ ) be a topological space, fj ∈ C(X) for each j ∈ Z+ and fj → f uniformly. Show that f : X → R is also continuous. This extends Theorem A.9.7. Remark C.2.19. Let (X, M, μ) be as above. By Theorems C.2.13 and C.2.15, if fj → f μ-a.e. then f : X → [−∞, +∞] is M-measurable. Moreover, if fj → f in measure or fj → f almost uniformly then f is M-measurable, and f (x) ∈ R for μ-a.e. x ∈ X (by Theorem C.2.24 and Exercise C.2.20, respectively).
Exercise C.2.20. Let fj → f μ-almost uniformly. a) Show that fj → f in measure μ. b) Show that fj → f μ-almost everywhere. These implications cannot be reversed: give examples.
Exercise C.2.21. For each j ∈ Z⁺, let fj : X → R be M-measurable. Let (fj)_{j=1}^∞ be a Cauchy sequence in measure μ, that is,
∀ε > 0 : μ({|fi − fj| ≥ ε}) → 0 as i, j → ∞.
Show that there exists f : X → [−∞, +∞] such that fj → f in measure μ.
Exercise C.2.22. Let fj → f μ-almost everywhere. a) Show that fj → f in measure μ, if μ(X) < ∞. b) Give an example where μ(X) = ∞ and fj does not converge to f in measure μ; consequently, there fj does not converge to f μ-almost uniformly either, by Exercise C.2.20.
For finite measure spaces, almost everywhere convergence implies almost uniform convergence:
Theorem C.2.23 (Egorov: "finite pointwise is almost uniform"). Let (X, M, μ) be a complete finite measure space. Let fj → f μ-almost everywhere. Then fj → f almost uniformly.
Proof. Take ε > 0. We want to find Aε ∈ M such that μ(Aεᶜ) < ε and (fj − f)|Aε → 0 uniformly as j → ∞. Let
E := {|fj − f| → 0}.
Now E ∈ M and μ(Eᶜ) = 0, because fj → f μ-almost everywhere. Moreover, let
Ajk := ⋃_{i=j}^∞ {|fi − f| ≥ 1/k}.
For each fixed k, the sets Ajk decrease as j → ∞ with ⋂_{j=1}^∞ Ajk ⊂ Eᶜ, so that μ(Ajk) → 0 since μ is finite. Hence for each k ∈ Z⁺ we may take j(k) such that μ(A_{j(k),k}) < ε 2⁻ᵏ. Let Aε := X \ ⋃_{k=1}^∞ A_{j(k),k}. Then μ(Aεᶜ) < ε, and on Aε we have |fi − f| < 1/k whenever i ≥ j(k), that is, (fj − f)|Aε → 0 uniformly.
Theorem C.3.6 (Monotone Convergence Theorem). Let fk : X → [0, ∞] be M-measurable with fk ≤ fk+1 for each k ∈ Z⁺, and let fk → f pointwise. Then ∫ fk dμ → ∫ f dμ as k → ∞.
Proof. Since fk ≤ fk+1 ≤ f, the integrals ∫ fk dμ increase and lim_{k→∞} ∫ fk dμ ≤ ∫ f dμ. Conversely, let 0 < ε < 1 and take a simple measurable function s ≤ f with ∫ s dμ ≥ (1 − ε) ∫ f dμ. Let Ek := {fk > (1 − ε)s}. Since fk and s are measurable, Ek ∈ M. Furthermore,
∫ fk dμ ≥ (1 − ε) ∫ s χ_{Ek} dμ = (1 − ε) ∑_{a∈s(X)} a · μ(Ek ∩ {s = a}) → (1 − ε) ∑_{a∈s(X)} a · μ({s = a}) = (1 − ε) ∫ s dμ ≥ (1 − ε)² ∫ f dμ,
where the limit as k → ∞ is due to X = ⋃_{k=1}^∞ Ek, with Ek ⊂ Ek+1 ∈ M. Thus
lim_{k→∞} ∫ fk dμ ≥ (1 − ε)² ∫ f dμ.
Taking ε → 0, the proof is complete.
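Monotone convergence is easy to observe on a finite space with the counting measure, where integrals are plain sums and the truncations fk := min(f, k) increase pointwise to f. A sketch (the sample values are illustrative, not from the text):

```python
# Monotone convergence on the counting measure of a 10-point space:
# f_k := min(f, k) increases to f, and the integrals (finite sums) increase to sum(f).
f = [0.5 * n for n in range(10)]             # a nonnegative "function" on 10 points

def integral(g):
    return sum(g)                            # counting-measure integral

truncations = [[min(v, k) for v in f] for k in range(6)]
integrals = [integral(fk) for fk in truncations]

print(integrals[-1], integral(f))  # 22.5 22.5
```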
Corollary C.3.7. Let f, g : X → [0, ∞] be M-measurable. Then
∫ (f + g) dμ = ∫ f dμ + ∫ g dμ.
Proof. Take measurable simple functions fk, gk : X → [0, ∞) such that fk ≤ fk+1 and gk ≤ gk+1 for each k ∈ Z⁺, and fk → f and gk → g pointwise. Then fk + gk : X → [0, ∞) is measurable and simple, and
fk + gk ≤ fk+1 + gk+1 → f + g as k → ∞,
so that by the Monotone Convergence Theorem C.3.6,
∫ (f + g) dμ = lim_{k→∞} ∫ (fk + gk) dμ = lim_{k→∞} ( ∫ fk dμ + ∫ gk dμ ) (Exercise C.3.4) = ∫ f dμ + ∫ g dμ,
establishing the result.
Corollary C.3.8. Let gj : X → [0, ∞] be M-measurable for each j ∈ Z⁺. Then
∫ ∑_{j=1}^∞ gj dμ = ∑_{j=1}^∞ ∫ gj dμ.
Proof. For each k ∈ Z⁺, let us define functions fk, f : X → [0, ∞] by
fk := ∑_{j=1}^k gj and f := lim_{k→∞} fk = ∑_{j=1}^∞ gj.
These functions are measurable and fk ≤ fk+1 ≤ f, so
∫ ∑_{j=1}^∞ gj dμ = lim_{k→∞} ∫ ∑_{j=1}^k gj dμ (Monotone Convergence) = lim_{k→∞} ∑_{j=1}^k ∫ gj dμ (Corollary C.3.7) = ∑_{j=1}^∞ ∫ gj dμ,
completing the proof.
Exercise C.3.9. Let f ≥ 0 be M-measurable and ∫ f dμ < ∞. Prove that
∀ε > 0 ∃δ > 0 ∀A ∈ M : μ(A) < δ ⇒ ∫_A f dμ < ε.
Theorem C.3.10 (Fatou's lemma). Let gk : X → [0, ∞] be M-measurable for each k ∈ Z⁺. Then
∫ lim inf_{k→∞} gk dμ ≤ lim inf_{k→∞} ∫ gk dμ.
Proof. Notice that
lim inf_{k→∞} gk = sup_{k≥1} inf_{j≥k} gj.
Define fk := inf_{j≥k} gj for each k ≥ 1. Now fk : X → [0, ∞] is measurable and fk ≤ fk+1, so that sup_{k≥1} fk = lim_{k→∞} fk, and
∫ lim inf_{k→∞} gk dμ = ∫ sup_{k≥1} fk dμ = ∫ lim_{k→∞} fk dμ = lim_{k→∞} ∫ fk dμ (Monotone Convergence) = lim inf_{k→∞} ∫ fk dμ ≤ lim inf_{k→∞} ∫ gk dμ,
where the last inequality holds since fk ≤ gk. The proof is complete.
Exercise C.3.11. Sometimes ∫ lim inf_{k→∞} gk dμ < lim inf_{k→∞} ∫ gk dμ happens in Fatou's Lemma C.3.10. Find an example.
Exercise C.3.12. Actually, the Monotone Convergence Theorem C.3.6 and Fatou's Lemma C.3.10 are logically equivalent: prove this.
Exercise C.3.13 (Reverse Fatou's lemma). Prove the following reverse Fatou's lemma. Let gk : X → [0, ∞] be M-measurable for each k ∈ Z⁺. Assume that gk ≤ g for every k, where g is μ-integrable. Then
∫ lim sup_{k→∞} gk dμ ≥ lim sup_{k→∞} ∫ gk dμ.
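A "moving bump" on Z⁺ with the counting measure shows that Fatou's inequality can be strict (cf. Exercise C.3.11): with gk = χ_{{k}}, every integral ∫ gk dμ equals 1, while lim inf_k gk = 0 pointwise. A sketch, truncating Z⁺ to finitely many points for the computation:

```python
# Fatou can be strict: on Z+ with the counting measure let g_k = indicator of {k}.
# Every integral is 1, but the pointwise liminf is 0, so
# int liminf g_k = 0 < 1 = liminf int g_k.
M = 50                                        # truncation of Z+ for the demo

def g(k):
    return [1.0 if n == k else 0.0 for n in range(M)]

integrals = [sum(g(k)) for k in range(20)]    # all equal to 1
# at each point n, g_k(n) = 0 for every k > n, so the liminf over k is 0 there:
liminf_vals = [min(g(k)[n] for k in range(n + 1, n + 6)) for n in range(M - 6)]

print(min(integrals), sum(liminf_vals))  # 1.0 0.0
```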
C.3.3 Integration in general
Let f : X → [−∞, +∞] be an M-measurable function. Recall that if
I⁺ = ∫ f⁺ dμ < ∞ or I⁻ = ∫ f⁻ dμ < ∞,
then the μ-integral of f is ∫ f dμ = I⁺ − I⁻. Moreover, if both I⁺ and I⁻ are finite, f is called μ-integrable. We shall be interested mainly in μ-integrable functions.
Theorem C.3.14. Let a ∈ R and f : X → [−∞, +∞] be μ-integrable. Then
∫ af dμ = a ∫ f dμ.
Moreover, if g : X → [−∞, +∞] is μ-integrable such that f ≤ g, then ∫ f dμ ≤ ∫ g dμ. Especially,
| ∫ f dμ | ≤ ∫ |f| dμ.
Exercise C.3.15. Prove Theorem C.3.14.
Exercise C.3.16. Let E ∈ M and |f| ≤ g, where f is M-measurable and g is μ-integrable. Show that f and fχE are μ-integrable.
Exercise C.3.17 (Chebyshev's inequality). Let 0 < a < ∞, and let f : X → [−∞, +∞] be M-measurable. Prove Chebyshev's inequality
μ({|f| > a}) ≤ a⁻¹ ∫ |f| dμ. (C.11)
We continue by noticing the short-sightedness of integrals:
Lemma C.3.18. Let f, g : X → [−∞, +∞] be μ-integrable. Then:
1. If E ∈ M is such that μ(E) = 0, then ∫_E f dμ = 0.
2. If f = g μ-almost everywhere, then ∫ f dμ = ∫ g dμ.
3. If ∫ |f| dμ = 0, then f = 0 μ-almost everywhere.
Proof. First,
∫_E f⁺ dμ = ∫ f⁺ χE dμ = sup { ∫ s dμ : s ≤ f⁺ χE simple measurable } = 0
because μ(E) = 0; applying this also to f⁻ proves the first result. Next, let us suppose f = g μ-almost everywhere. Then
∫ f⁺ dμ = ∫ ( f⁺ χ_{f=g} + f⁺ χ_{f≠g} ) dμ = ∫_{f=g} f⁺ dμ + ∫_{f≠g} f⁺ dμ (Corollary C.3.7) = ∫_{f=g} f⁺ dμ,
since μ({f ≠ g}) = 0, showing that ∫ f⁺ dμ = ∫ g⁺ dμ; treating f⁻ and g⁻ in the same way establishes the second result. Finally,
μ({f ≠ 0}) = μ( ⋃_{k=1}^∞ {|f| > 1/k} ) ≤ ∑_{k=1}^∞ μ({|f| > 1/k}) = ∑_{k=1}^∞ ∫ χ_{|f|>1/k} dμ ≤ ∑_{k=1}^∞ ∫ k|f| dμ = ∑_{k=1}^∞ k ∫ |f| dμ,
so that if ∫ |f| dμ = 0, then μ({f ≠ 0}) = 0.
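Chebyshev's inequality (C.11) from Exercise C.3.17 can be spot-checked on a finite space with the counting measure, where μ({|f| > a}) is just a count of points. A sketch with arbitrary sample values:

```python
# Chebyshev's inequality (C.11) on a finite space with counting measure:
# mu({|f| > a}) <= (1/a) * int |f| dmu.
f = [-3.0, 0.5, 2.0, -0.25, 4.0, 1.0]

def markov_lhs(a):
    return sum(1 for v in f if abs(v) > a)    # mu({|f| > a})

def markov_rhs(a):
    return sum(abs(v) for v in f) / a         # a^{-1} * int |f| dmu

checks = [markov_lhs(a) <= markov_rhs(a) for a in (0.5, 1.0, 2.0, 3.5)]
print(all(checks))  # True
```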
Proposition C.3.19. Let f : X → [−∞, +∞] be μ-integrable. Then f(x) ∈ R for μ-almost every x ∈ X.
Proof. First, {f⁺ = ∞} = ⋂_{k=1}^∞ {f⁺ > k} ∈ M, because f⁺ is M-measurable. Thereby
μ({f⁺ = ∞}) = (1/k) ∫ k χ_{f⁺=∞} dμ ≤ (1/k) ∫ f⁺ dμ → 0 as k → ∞,
so that μ({f⁺ = ∞}) = 0. Similarly, μ({f⁻ = ∞}) = 0.
Remark C.3.20. By Lemma C.3.18 and Proposition C.3.19, when it comes to integration, we may identify a μ-integrable function f : X → [−∞, +∞] with the function f̃ : X → R defined by
f̃(x) := f(x) when f(x) ∈ R, and f̃(x) := 0 when |f(x)| = ∞.
We shall use this identification without any further notice.
Theorem C.3.21 (Sum is integrable). Let f, g : X → [−∞, +∞] be μ-integrable. Then f + g is μ-integrable and
∫ (f + g) dμ = ∫ f dμ + ∫ g dμ.
Proof. For integrable f, g : X → R, the function f + g : X → R is measurable. Notice that
f + g = (f⁺ − f⁻) + (g⁺ − g⁻) = (f + g)⁺ − (f + g)⁻.
Since (f + g)⁺ ≤ f⁺ + g⁺ and (f + g)⁻ ≤ f⁻ + g⁻, the integrability of f + g follows. Moreover,
(f + g)⁺ + f⁻ + g⁻ = (f + g)⁻ + f⁺ + g⁺.
By Corollary C.3.7,
∫ (f + g)⁺ dμ + ∫ f⁻ dμ + ∫ g⁻ dμ = ∫ (f + g)⁻ dμ + ∫ f⁺ dμ + ∫ g⁺ dμ,
implying
∫ (f + g) dμ = ∫ (f + g)⁺ dμ − ∫ (f + g)⁻ dμ = ∫ f⁺ dμ − ∫ f⁻ dμ + ∫ g⁺ dμ − ∫ g⁻ dμ = ∫ f dμ + ∫ g dμ.
The proof for the summation is thus complete.
Theorem C.3.22 (Lebesgue's Dominated Convergence Theorem). For each k ≥ 1, let fk : X → [−∞, +∞] be measurable and fk → f pointwise as k → ∞. Assume that |fk| ≤ g for every k ≥ 1, where g is μ-integrable. Then
∫ |fk − f| dμ → 0 and ∫ fk dμ → ∫ f dμ as k → ∞.
Proof. The functions fk, f, |fk − f| are μ-integrable, because they are measurable, g is μ-integrable, |fk|, |f| ≤ g and |fk − f| ≤ 2g. For each k ≥ 1, we define the function gk := 2g − |fk − f|. Then the functions gk ≥ 0 satisfy the assumptions of Fatou's Lemma C.3.10, yielding
∫ 2g dμ = ∫ lim inf_{k→∞} gk dμ ≤ lim inf_{k→∞} ∫ gk dμ (Fatou) = lim inf_{k→∞} ( ∫ 2g dμ − ∫ |fk − f| dμ ) = ∫ 2g dμ − lim sup_{k→∞} ∫ |fk − f| dμ.
Here we may cancel ∫ 2g dμ ∈ R, getting
lim sup_{k→∞} ∫ |fk − f| dμ ≤ 0,
so that ∫ |fk − f| dμ → 0 as k → ∞. Finally,
| ∫ fk dμ − ∫ f dμ | = | ∫ (fk − f) dμ | ≤ ∫ |fk − f| dμ → 0 as k → ∞,
which completes the proof.
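Dominated convergence can be watched numerically for fk(x) = xᵏ on [0, 1], which tends to 0 almost everywhere and is dominated by g ≡ 1; the integrals 1/(k + 1) tend to ∫ 0 dx = 0. A sketch using a midpoint-rule approximation of the integrals (grid size and exponents are arbitrary choices):

```python
# Dominated convergence: f_k(x) = x**k on [0,1] -> 0 a.e., dominated by g = 1,
# and the integrals 1/(k+1) tend to 0.
N = 2_000

def integral(h):
    # midpoint rule on [0, 1]
    return sum(h((i + 0.5) / N) for i in range(N)) / N

vals = [integral(lambda x, k=k: x ** k) for k in (1, 5, 25, 125)]
print(vals)  # decreasing towards 0, close to 1/(k+1)
```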
Remark C.3.23. It is easy to slightly generalise Lebesgue's Dominated Convergence Theorem C.3.22: the same conclusions hold even if we assume only that fk → f almost everywhere, and that |fk| ≤ g almost everywhere, where g is integrable. This is because integrals are not affected if we change the values of functions on a set of measure zero.
Exercise C.3.24 (Indispensability of an integrable dominating function). Show that in Theorem C.3.22 it is indispensable to require the μ-integrability of a dominating function g. For this, consider X = [0, 1], μ the Lebesgue measure, and the sequence (fk)_{k=1}^∞ with fk(x) = k for x ∈ (0, 1/k], and fk(x) = 0 for x ∈ (1/k, 1]. Show that the function h := sup_k fk ≥ 0 is not Lebesgue-integrable on [0, 1] (hence no dominating function here can be Lebesgue-integrable). Finally, show that the conclusion of Theorem C.3.22 fails for this sequence (fk)_{k=1}^∞.
Exercise C.3.25 (Fatou–Lebesgue Theorem). Prove the following Fatou–Lebesgue Theorem: Let (fk)_{k=1}^∞ be a sequence of M-measurable functions fk : X → R on a measure space (X, M, μ). Assume that |fk| ≤ g for every k ≥ 1, where g is μ-integrable. Then lim inf_{k→∞} fk and lim sup_{k→∞} fk are μ-integrable and we have
∫ lim inf_{k→∞} fk dμ ≤ lim inf_{k→∞} ∫ fk dμ ≤ lim sup_{k→∞} ∫ fk dμ ≤ ∫ lim sup_{k→∞} fk dμ.
Proposition C.3.26 (Riemann vs Lebesgue). Let f : R → R be Riemann-integrable on the closed interval [a, b] ⊂ R. Then f χ_{[a,b]} is Lebesgue-integrable and the Riemann- and Lebesgue-integrals coincide:
∫_a^b f(x) dx = ∫_{[a,b]} f dλR.
Exercise C.3.27 (Riemann integration). Prove Proposition C.3.26. Recall the definition of the Riemann-integral: Let g : [a, b] → R be bounded. A finite sequence Pn = (x0, ..., xn) is called a partition of [a, b] if
a = x0 < x1 < x2 < ··· < xn−1 < xn = b,
for which the lower and upper Riemann sums L(g, Pn), U(g, Pn) are defined by
U(g, Pn) = ∑_{k=1}^n ( sup_{x_{k−1} ≤ x < x_k} g(x) ) (x_k − x_{k−1}),
L(g, Pn) = ∑_{k=1}^n ( inf_{x_{k−1} ≤ x < x_k} g(x) ) (x_k − x_{k−1}).
Now L(g) ≤ U(g), where
U(g) := inf { U(g, P) : P is a partition of [a, b] },
L(g) := sup { L(g, P) : P is a partition of [a, b] }.
If L(g) = U(g), we say that g is Riemann-integrable with Riemann integral
∫_a^b g(x) dx = L(g).
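For a monotone g the infimum and supremum on each subinterval are bounded by the endpoint values, so the lower and upper Riemann sums are easy to compute, and for an equispaced partition their gap is (g(b) − g(a))(b − a)/n, which shrinks like 1/n; this is the situation of Exercise C.3.28. A sketch (the function and interval are arbitrary choices):

```python
# Upper and lower Riemann sums squeeze together for a monotone function:
# on [a,b] with n equal parts, U - L = (g(b) - g(a)) * (b - a) / n.
def riemann_sums(g, a, b, n):
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    # for monotone g, the endpoint values give the inf/sup on each subinterval
    lower = sum(min(g(xs[k - 1]), g(xs[k])) * (xs[k] - xs[k - 1]) for k in range(1, n + 1))
    upper = sum(max(g(xs[k - 1]), g(xs[k])) * (xs[k] - xs[k - 1]) for k in range(1, n + 1))
    return lower, upper

g = lambda x: x * x                 # increasing on [0, 2]; integral is 8/3
for n in (4, 16, 64):
    L, U = riemann_sums(g, 0.0, 2.0, n)
    print(n, L, U, U - L)           # U - L = 4 * 2/n shrinks like 1/n
```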
Exercise C.3.28. Prove the following ε-criterion for Riemann integrability: if for any ε > 0 there is a partition P of [a, b] such that U(g, P) − L(g, P) < ε, then g is Riemann-integrable over [a, b]. Consequently, prove that if g is monotonic on [a, b] or if g is continuous on [a, b], then g is Riemann-integrable over [a, b].
C.4 Integral as a functional

C.4.1 Lebesgue spaces Lᵖ(μ)
In the sequel, (X, M, μ) is a complete measure space. For instance, we may have M = M(μ∗).
Definition C.4.1 (Lᵖ(μ)-norms). For 1 ≤ p < ∞, the Lᵖ(μ)-norm of an M-measurable function f : X → [−∞, +∞] is
‖f‖Lᵖ(μ) := ( ∫ |f|ᵖ dμ )^{1/p},
and let
‖f‖L∞(μ) := inf { M ∈ [0, ∞] : |f| ≤ M μ-a.e. }.
Here Lᵖ is read "L-p" or "Lebesgue-p". If μ is known from the context, the notations Lᵖ = Lᵖ(X) = Lᵖ(μ) are used: e.g., Lᵖ(Rⁿ) = Lᵖ(λRn).
Remark C.4.2. The quantities ‖f‖Lᵖ(μ) are not yet norms, because the non-degeneracy fails: ‖f‖Lᵖ(μ) = 0 only implies that f = 0 μ-a.e. In fact, clearly, ‖f‖Lᵖ ∈ [0, ∞], ‖f‖Lᵖ = 0 if and only if f = 0 μ-almost everywhere, ‖λf‖Lᵖ = |λ| ‖f‖Lᵖ, and ‖f‖L¹(μ) = ∫ |f| dμ. Also, in Theorem C.4.5 we will see that ‖f + g‖Lᵖ ≤ ‖f‖Lᵖ + ‖g‖Lᵖ, so that the triangle inequality is satisfied. Thus, we will modify the construction slightly (identifying functions equal μ-a.e.) in Definition C.4.6 to make these quantities norms, so that the terminology "norm" will be justified.
Definition C.4.3 (Lebesgue conjugate). The Lebesgue conjugate of p ∈ [1, ∞] is the number p′ ∈ [1, ∞] defined by
1/p + 1/p′ = 1,
with the usual convention 1/∞ = 0.
The converse to the following theorem (the converse of Hölder's inequality) will be shown in Theorem C.4.56.
Theorem C.4.4 (Hölder's inequality). Let 1 ≤ p ≤ ∞ and q = p′. Let f, g : X → [−∞, +∞] be M-measurable. Then
‖fg‖L¹ ≤ ‖f‖Lᵖ ‖g‖Lq.
Proof. For p = 1,
‖fg‖L¹ = ∫ |f||g| dμ ≤ ( ∫ |f| dμ ) ‖g‖L∞ = ‖f‖L¹ ‖g‖L∞;
the proof for p = ∞ is symmetric. Finally, let us assume that 1 < p < ∞. We may assume the non-trivial case 0 < ‖f‖Lᵖ < ∞ and 0 < ‖g‖Lq < ∞. Then
‖fg‖L¹ = ‖f‖Lᵖ ‖g‖Lq ∫ ab dμ,
where a = |f|/‖f‖Lᵖ and b = |g|/‖g‖Lq. The concavity of the logarithm gives
ln(ab) = ln(aᵖ)/p + ln(b^q)/q ≤ ln( aᵖ/p + b^q/q ),
using 1/q = 1 − 1/p, so
∫ ab dμ ≤ ∫ ( aᵖ/p + b^q/q ) dμ = 1/p + 1/q = 1.
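Both Hölder's inequality and Minkowski's inequality (Theorem C.4.5) can be spot-checked for the counting measure, where the Lᵖ-norms are ℓᵖ-norms of finite sequences. A sketch with arbitrary vectors and the conjugate pair p = 3, q = 3/2:

```python
# Hoelder and Minkowski on the counting measure (finite sequences):
# sum|fg| <= ||f||_p ||g||_q with 1/p + 1/q = 1, and ||f+g||_p <= ||f||_p + ||g||_p.
def lp_norm(f, p):
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

f = [1.0, -2.0, 0.5, 3.0]
g = [0.25, 1.0, -1.5, 2.0]
p, q = 3.0, 1.5                                  # conjugate pair: 1/3 + 2/3 = 1

holder_ok = sum(abs(a * b) for a, b in zip(f, g)) <= lp_norm(f, p) * lp_norm(g, q)
minkowski_ok = lp_norm([a + b for a, b in zip(f, g)], p) <= lp_norm(f, p) + lp_norm(g, p)
print(holder_ok, minkowski_ok)  # True True
```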
Theorem C.4.5 (Minkowski's inequality). Let 1 ≤ p ≤ ∞. Let f, g : X → [−∞, +∞] be M-measurable. Then
‖f + g‖Lᵖ ≤ ‖f‖Lᵖ + ‖g‖Lᵖ. (C.12)
Proof. First,
‖f + g‖L¹ = ∫ |f + g| dμ ≤ ∫ |f| dμ + ∫ |g| dμ = ‖f‖L¹ + ‖g‖L¹.
Also |f + g| ≤ |f| + |g|, so that |f + g| ≤ ‖f‖L∞ + ‖g‖L∞ almost everywhere, yielding ‖f + g‖L∞ ≤ ‖f‖L∞ + ‖g‖L∞. Finally, assume that 1 < p < ∞ and let q = p′. Then
‖f + g‖ᵖLᵖ = ∫ |f + g|ᵖ dμ ≤ ∫ (|f| + |g|) |f + g|^{p−1} dμ = ‖f |f + g|^{p−1}‖L¹ + ‖g |f + g|^{p−1}‖L¹ ≤ ( ‖f‖Lᵖ + ‖g‖Lᵖ ) ‖ |f + g|^{p−1} ‖Lq (Hölder),
where
‖ |f + g|^{p−1} ‖Lq = ( ∫ |f + g|^{(p−1)q} dμ )^{1/q} = ( ∫ |f + g|ᵖ dμ )^{(p−1)/p} = ‖f + g‖^{p−1}Lᵖ,
concluding the proof.
Definition C.4.6 (Lᵖ(μ)-spaces). Let 1 ≤ p ≤ ∞ and
Vᵖ := { f : ‖f‖Lᵖ < ∞ }.
Noticing that ‖λf‖Lᵖ = |λ| ‖f‖Lᵖ for any scalar λ, and recalling Minkowski's inequality (C.12), we see that (f ↦ ‖f‖Lᵖ) : Vᵖ → [0, ∞) is a seminorm on the vector space Vᵖ. Let us define an equivalence relation ∼ on Vᵖ by
f ∼ g ⟺ ‖f − g‖Lᵖ = 0;
i.e., f ∼ g ⟺ f = g μ-almost everywhere. Let us denote the equivalence classes by
[f] := { g ∈ Vᵖ : f ∼ g }.
We obtain the quotient vector space
Lᵖ(μ) := Vᵖ/∼ = { [f] : f ∈ Vᵖ }
with the usual vector space operations. Moreover, ([f] ↦ ‖f‖Lᵖ) : Lᵖ(μ) → [0, ∞) is a norm on the vector space Lᵖ(μ). Customarily, ℓᵖ(X) := Lᵖ(μ), where μ is the counting measure, i.e., μ(E) is the number of points in the set E.
with the usual vector space operations. Moreover, ([f ] → f Lp ) : Lp (μ) → [0, ∞) is a norm on vector space Lp (μ). Customarily, p (X) := Lp (μ), where μ is the counting measure, i.e., μ(E) is the number of points in the set E. Remark C.4.7. f ∈ [f ] is a function X → [−∞, +∞], but [f ] ∈ Lp (μ) is not a function, but an equivalence class of functions. However in practice, to avoid cumbersome notation, one often identifies f and [f ], e.g., writing briefly f ∈ Lp (μ). p Definition C.4.8 (Convergence in Lp (μ)). Let f ∈ Lp and {fj }∞ j=1 ⊂ L . We say p that fj → f in L if fj − f Lp −−−→ 0. j→∞
Theorem C.4.9. Lᵖ(μ) is a Banach space.
Proof. The case p = ∞ is left as Exercise C.4.10; let us consider the case 1 ≤ p < ∞. We already know that Lᵖ(μ) is a normed space. Given a Cauchy sequence (fj)_{j=1}^∞ in Lᵖ(μ), we need a candidate f for the limit of this sequence. Now (fj)_{j=1}^∞ is a Cauchy sequence in measure μ, because
μ({|fi − fj| ≥ ε}) = μ({|fi − fj|ᵖ ≥ εᵖ}) ≤ ε⁻ᵖ ∫ |fi − fj|ᵖ dμ (Chebyshev (C.11)) = ε⁻ᵖ ‖fi − fj‖ᵖLᵖ → 0 as i, j → ∞.
Hence by Exercise C.2.21, fj → f in measure μ for an M-measurable function f. By Theorem C.2.24, fjk → f μ-almost everywhere for a subsequence (fjk)_{k=1}^∞ of (fj)_{j=1}^∞. Here f ∈ Lᵖ(μ), because
‖f‖ᵖLᵖ = ∫ |f|ᵖ dμ = ∫ lim inf_{k→∞} |fjk|ᵖ dμ ≤ lim inf_{k→∞} ∫ |fjk|ᵖ dμ (Fatou) ≤ constant < ∞,
because Cauchy sequences in a normed space are bounded. Finally, fi → f in Lᵖ(μ), because
‖fi − f‖ᵖLᵖ = ∫ |fi − f|ᵖ dμ = ∫ lim inf_{k→∞} |fi − fjk|ᵖ dμ ≤ lim inf_{k→∞} ‖fi − fjk‖ᵖLᵖ (Fatou) → 0 as i → ∞,
since (fi)_{i=1}^∞ is a Cauchy sequence. Thus Lᵖ(μ) is a Banach space for 1 ≤ p < ∞.
Exercise C.4.10. Complete the previous proof by showing that L∞(μ) is a Banach space.
Exercise C.4.11. Let 1 ≤ p < ∞ and ‖fj − f‖Lᵖ → 0, where f ∈ Lᵖ and {fj}_{j=1}^∞ ⊂ Lᵖ. Show that
∀ε > 0 ∃δ > 0 ∀j ∈ Z⁺ ∀E ∈ M : μ(E) < δ ⇒ ∫_E |fj|ᵖ dμ < ε.
Why is p = ∞ excluded here?
Lemma C.4.12. Let g ∈ Lᵖ(μ), where 1 ≤ p < ∞. Then
∀ε > 0 ∃Eg ∈ M : μ(Eg) < ∞ and ∫_{Egᶜ} |g|ᵖ dμ < ε.
Exercise C.4.13. Prove Lemma C.4.12.
Theorem C.4.14 (Vitali's Convergence Theorem). Let 1 ≤ p < ∞. Let f, fj ∈ Lᵖ(μ) for each j ∈ Z⁺. Then properties (1, 2, 3) together imply (0), and (0) implies properties (2, 3):
(0) fj → f in Lᵖ.
(1) fj → f μ-almost everywhere.
(2) ∀ε > 0 ∃E ∈ M ∀j ∈ Z⁺ : μ(E) < ∞ and ∫_{Eᶜ} |fj|ᵖ dμ < ε.
(3) ∀ε > 0 ∃δ > 0 ∀j ∈ Z⁺ ∀A ∈ M : μ(A) < δ ⇒ ∫_A |fj|ᵖ dμ < ε.
Proof. First, let us show that (1, 2, 3) together imply (0). Take ε > 0. Take δ > 0 as in (3). Take E ∈ M as in (2). Exploiting (1), Egorov's Theorem C.2.23 says that (fj − f)|E → 0 μ-almost uniformly. Hence there exists B ∈ M such that
B ⊂ E, μ(E \ B) < δ, and (fj − f)|B → 0 uniformly. (C.13)
We want to show that ‖fj − f‖Lᵖ → 0:
‖fj − f‖ᵖLᵖ = ∫ |fj − f|ᵖ dμ = ∫_B |fj − f|ᵖ dμ + ∫_{Bᶜ} |fj − f|ᵖ dμ,
and here the integral over B tends to 0 as j → ∞, by (C.13). What about the integral over Bᶜ? Since (t ↦ tᵖ) : R⁺ → R is a convex function, we have (a/2 + b/2)ᵖ ≤ aᵖ/2 + bᵖ/2, so that
∫_{Bᶜ} |fj − f|ᵖ dμ ≤ 2^{p−1} ∫_{Bᶜ} ( |fj|ᵖ + |f|ᵖ ) dμ.
Here Bᶜ = Eᶜ ∪ (E \ B); by (2) we have ∫_{Eᶜ} |fj|ᵖ dμ < ε, by (3) and μ(E \ B) < δ we have ∫_{E\B} |fj|ᵖ dμ < ε, and by Fatou's lemma (applied together with (1)) the corresponding integrals of |f|ᵖ obey the same bounds. Hence
lim sup_{j→∞} ∫_{Bᶜ} |fj − f|ᵖ dμ ≤ 2^{p+1} ε,
and since ε > 0 was arbitrary, ‖fj − f‖Lᵖ → 0, i.e., (0) holds.
Next, let us show that (0) ⇒ (2). Take ε > 0. Take jε ∈ Z⁺ such that ‖fj − f‖Lᵖ < ε^{1/p} whenever j > jε. Take Ef, Efj ∈ M as in Lemma C.4.12. Let
E := Ef ∪ ⋃_{j=1}^{jε} Efj.
Then E ∈ M and μ(E) < ∞. If j ≤ jε then
∫_{Eᶜ} |fj|ᵖ dμ ≤ ∫_{Efjᶜ} |fj|ᵖ dμ < ε.
If j > jε then
∫_{Eᶜ} |fj|ᵖ dμ ≤ ( ‖χ_{Eᶜ}(fj − f)‖Lᵖ + ‖χ_{Eᶜ} f‖Lᵖ )ᵖ (Minkowski) ≤ ( ε^{1/p} + ε^{1/p} )ᵖ,
so that ∫_{Eᶜ} |fj|ᵖ dμ ≤ 2ᵖ ε for every j ∈ Z⁺. We have shown that (0) ⇒ (2).
Finally, let us prove that (0) ⇒ (1). We have
μ({|fj − f| ≥ ε}) = μ({|fj − f|ᵖ ≥ εᵖ}) ≤ ε⁻ᵖ ∫ |fj − f|ᵖ dμ (Chebyshev) → 0 as j → ∞ by (0),
so that fj → f in measure μ. By Theorem C.2.24, there is a subsequence (fjk)_{k=1}^∞ such that fjk → f μ-almost everywhere. We have shown that (0) ⇒ (1).
Exercise C.4.15. Complete the proof of Vitali's Convergence Theorem C.4.14 by showing that (0) ⇒ (3).
Exercise C.4.16. Let 1 ≤ p ≤ ∞ and fj → f μ-a.e., where {fj}_{j=1}^∞ ⊂ Lᵖ.
(a) Let fj → g in Lᵖ. Show that f = g μ-a.e.
(b) Give an example where f ∈ Lᵖ, but fj does not converge to f in Lᵖ.
Finally, we give without proof a very useful interpolation theorem. But first we introduce:
Definition C.4.17 (Semifinite measures). A measure μ is called semifinite if for every E ∈ M with μ(E) = ∞ there exists F ∈ M such that F ⊂ E and 0 < μ(F) < ∞.
Theorem C.4.18 (M. Riesz–Thorin interpolation theorem). Let μ, ν be semifinite measures and let 1 ≤ p0, p1, q0, q1 ≤ ∞. For every 0 < t < 1 define pt and qt by
1/pt = (1 − t)/p0 + t/p1, 1/qt = (1 − t)/q0 + t/q1.
Assume that A is a linear operator such that
‖Af‖Lq0(ν) ≤ C0 ‖f‖Lp0(μ) and ‖Af‖Lq1(ν) ≤ C1 ‖f‖Lp1(μ)
for all f ∈ Lp0(μ) and f ∈ Lp1(μ), respectively. Then for all 0 < t < 1, the operator A extends to a bounded linear operator from Lpt(μ) to Lqt(ν) and we have
‖Af‖Lqt(ν) ≤ C0^{1−t} C1^t ‖f‖Lpt(μ)
C.4.2 Signed measures
Definition C.4.19 (Signed measures). Let M be a σ-algebra on X. A mapping ν : M → R is called a signed measure on X if
ν( ⋃_{j=1}^∞ Ej ) = ∑_{j=1}^∞ ν(Ej)
for any disjoint countable family {Ej}_{j=1}^∞ ⊂ M.
Example. Let μ, ν : M → [0, ∞] be finite measures on X, that is, μ(X) < ∞ and ν(X) < ∞. Then
μ − ν : M → R
is a signed measure. It will turn out that there are no other types of signed measures on X; see the Jordan decomposition result in Corollary C.4.26.
Remark C.4.20. For simplicity, and in view of the planned applications of this notion, we restrict the exposition to what may be called finite signed measures. In principle, one can allow ν : M → [−∞, +∞], assuming that only one of the infinities may be attained. The statements and the proofs remain largely similar, so we may leave this case as an exercise for an interested reader. For example, only one of the measures in Theorem C.4.25 would then be finite, etc.
Exercise C.4.21. Let (X, M, μ) be a measure space and let f : X → [−∞, +∞] be μ-integrable. Define ν : M → R by
ν(E) := ∫_E f dμ. (C.14)
Show that ν is a signed measure. Moreover, prove that ν is a (finite) measure if and only if f ≥ 0 μ-almost everywhere.
Definition C.4.22 (Variations of measures). Let ν : M → R be a signed measure. Define mappings ν⁺, ν⁻, |ν| : M → [0, ∞] by
ν⁺(E) := sup_{A∈M: A⊂E} ν(A),
ν⁻ := (−ν)⁺,
|ν| := ν⁺ + ν⁻.
The mappings ν⁺, ν⁻ are called the positive and negative variations (respectively) of ν, and the pair (ν⁺, ν⁻) is the Jordan decomposition of ν. The mapping |ν| is the total variation of ν.
Exercise C.4.23. Show that ν⁺, ν⁻, |ν| : M → [0, ∞] are measures.
Exercise C.4.24. Let ν(E) = ∫_E f dμ as in (C.14). Show that
ν⁺(E) = ∫_E f⁺ dμ and ν⁻(E) = ∫_E f⁻ dμ.
Hence here ν = ν⁺ − ν⁻; this in fact holds in general:
Theorem C.4.25. Let ν : M → R be a signed measure. Then the measures ν⁺, ν⁻ : M → [0, ∞] are finite.
Proof. By Exercise C.4.23, ν⁺ and ν⁻ are measures. Let us show that ν⁺ (and similarly ν⁻) is finite. To get a contradiction, assume that ν⁺(X) = ∞. Take E0 ∈ M such that ν(E0) ≥ 0, and take A0 ∈ {E0, X \ E0} such that ν⁺(A0) = ∞. For k ∈ Z⁺, suppose Ek, Ak ∈ M have been chosen so that ν⁺(Ak) = ∞. Take Ek+1 ∈ M such that
Ek+1 ⊂ Ak and ν(Ek+1) ≥ 1 + ν(Ek),
and take Ak+1 ∈ {Ek+1, Ak \ Ek+1} such that ν⁺(Ak+1) = ∞. Then
1. either ∃k0 ∀k ≥ k0 : Ak+1 = Ek+1,
2. or ∀k0 ∃k ≥ k0 : Ak+1 = Ak \ Ek+1.
In the first case, Ek ⊃ Ek+1 for every k ≥ k0, and
ν(E_{k0}) = ν( ⋂_{k=k0}^∞ Ek ) + ∑_{k=k0}^∞ ν(Ek \ Ek+1) = ν( ⋂_{k=k0}^∞ Ek ) + ∑_{k=k0}^∞ ( ν(Ek) − ν(Ek+1) ) = −∞,
since each term ν(Ek) − ν(Ek+1) ≤ −1; of course, this is a contradiction, excluding the first case. In the second case, take a disjoint family {E_{kj}}_{j=1}^∞, where kj+1 > kj ∈ Z⁺, so that
ν( ⋃_{j=1}^∞ E_{kj} ) = ∑_{j=1}^∞ ν(E_{kj}) = +∞,
again a contradiction; therefore ν⁺ and ν⁻ must be finite measures.
Corollary C.4.26 (Jordan Decomposition). Let ν : M → R be a signed measure. Then ν = ν⁺ − ν⁻.
Proof. Let E ∈ M. For any A ∈ M we have
ν(E) = ν(A ∩ E) + ν(Aᶜ ∩ E) ≤ ν⁺(E) − (−ν)(Aᶜ ∩ E),
and taking the supremum of (−ν)(Aᶜ ∩ E) over A ∈ M yields ν(E) ≤ ν⁺(E) − ν⁻(E). Similarly,
(−ν)(E) ≤ (−ν)⁺(E) − (−ν)⁻(E) = ν⁻(E) − ν⁺(E),
so that ν(E) ≥ ν⁺(E) − ν⁻(E).
Exercise C.4.27. Let M be a σ-algebra on X. Let M(M) be the real vector space of all signed measures ν : M → R. For ν ∈ M(M), let ‖ν‖ = |ν|(X); show that this gives a Banach space norm on M(M).
Definition C.4.28 (Hahn decomposition). A pair (P, Pᶜ) is called a Hahn decomposition of a signed measure ν : M → R if P ∈ M and
∀E ∈ M : ν(P ∩ E) ≥ 0 ≥ ν(Pᶜ ∩ E).
Then P is called a ν-positive set and Pᶜ is a ν-negative set.
Example. Let ν(E) = ∫_E f dμ as in (C.14). Then (P, Pᶜ) and (Q, Qᶜ) are Hahn decompositions of ν : M → R, where
P := {f ≥ 0}, Q := {f > 0}.
Definition C.4.29 (Mutually singular measures). The measures μ, λ : M → [0, ∞] are mutually singular, denoted by μ ⊥ λ, if there exists P ∈ M such that μ(P) = 0 = λ(Pᶜ). Here, the zero-measure condition μ(P) = 0 can be interpreted so that the measure μ does not see the set P ∈ M.
Theorem C.4.30 (Hahn Decomposition). Let ν : M → R be a signed measure. Then ν has a Hahn decomposition (P, Pᶜ). More precisely,
ν⁺(E) = +ν(P ∩ E), ν⁻(E) = −ν(Pᶜ ∩ E)
for each E ∈ M. Especially, ν⁻ ⊥ ν⁺, since ν⁻(P) = 0 = ν⁺(Pᶜ).
Proof. For each k ∈ Z⁺, take Ak ∈ M such that ν⁺(X) − ν(Ak) < 2⁻ᵏ. Then
P := lim sup_{k→∞} Ak = ⋂_{j=1}^∞ ⋃_{k=j}^∞ Ak ∈ M.
Moreover, for each j,
ν⁻(P) ≤ ∑_{k=j}^∞ ν⁻(Ak) = ∑_{k=j}^∞ ( ν⁺(Ak) − ν(Ak) ) (Corollary C.4.26) ≤ ∑_{k=j}^∞ ( ν⁺(X) − ν(Ak) ) < ∑_{k=j}^∞ 2⁻ᵏ = 2^{1−j},
so that ν⁻(P) = 0. On the other hand,
ν⁺(Pᶜ) = ν⁺( lim inf_{k→∞} Akᶜ ) = lim_{j→∞} ν⁺( ⋂_{k=j}^∞ Akᶜ ) ≤ lim_{j→∞} ν⁺(Ajᶜ) ≤ lim_{j→∞} 2⁻ʲ = 0,
since ν⁺(Ajᶜ) = ν⁺(X) − ν⁺(Aj) ≤ ν⁺(X) − ν(Aj) < 2⁻ʲ; so that ν⁺(Pᶜ) = 0. Thereby
ν(P ∩ E) = ν⁺(P ∩ E) − ν⁻(P ∩ E) (Jordan) = ν⁺(P ∩ E) + ν⁺(Pᶜ ∩ E) = ν⁺(E),
Exercise C.4.31. Let (P, Pᶜ) and (Q, Qᶜ) be two Hahn decompositions of a signed measure ν. Show that |ν|(P \ Q) = 0. The moral here is that all Hahn decompositions are "essentially the same".
Exercise C.4.32. Let ν = α − β, where α, β : M → [0, ∞] are finite measures and α ⊥ β. Show that α = ν⁺ and β = ν⁻. In this respect the Jordan decomposition is the most natural decomposition of ν as a difference of two measures.
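For a discrete signed measure ν(E) = ∑_{x∈E} w(x) on a finite set, the Jordan and Hahn decompositions are computed directly from the signs of the weights: P = {w ≥ 0} is a Hahn set, and ν⁺, ν⁻ sum the positive and negative parts of w. A sketch (the weights are illustrative choices, not from the text):

```python
# Jordan and Hahn decompositions of a discrete signed measure
# nu(E) = sum_{x in E} w(x): P = {w >= 0} is a Hahn set, and nu+ (resp. nu-)
# sums the positive (resp. negative) parts of the weights.
w = {"a": 2.0, "b": -1.5, "c": 0.5, "d": -0.25}

def nu(E):
    return sum(w[x] for x in E)

def nu_plus(E):
    return sum(max(w[x], 0.0) for x in E)     # positive variation

def nu_minus(E):
    return sum(max(-w[x], 0.0) for x in E)    # negative variation

P = {x for x in w if w[x] >= 0}               # Hahn: nu >= 0 on P, <= 0 on P^c
X = set(w)
print(nu(X), nu_plus(X) - nu_minus(X))        # Jordan: 0.75 0.75
print(nu(P), nu_plus(X))                      # nu(P) = nu+(X): 2.5 2.5
```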
C.4.3 Derivatives of signed measures
In this section we study which signed measures ν : M → R can be written in the integral form as in (C.15). The key property is the absolute continuity of ν with respect to μ, and the key result is the Radon–Nikodym Theorem C.4.38.
Definition C.4.33 (Radon–Nikodym derivative). Let (X, M, μ) be a measure space and f : X → [−∞, +∞] be μ-integrable. Let a signed measure ν : M → R be defined by
ν(E) := ∫_E f dμ. (C.15)
Then dν/dμ := f is the Radon–Nikodym derivative of ν with respect to μ.
Remark C.4.34. Actually, the Radon–Nikodym derivative dν/dμ = f is not a single integrable function X → [−∞, +∞], but rather the equivalence class
{ g : X → [−∞, +∞] | f ∼ g },
where f ∼ g ⟺ f = g μ-almost everywhere. The classical derivative of a function (a limit of a difference quotient) is connected to the Radon–Nikodym derivative in the case of the Lebesgue measure μ = λR, but this shall not be investigated here.
Definition C.4.35 (Absolutely continuous measures). A signed measure ν : M → R is absolutely continuous with respect to a measure μ : M → [0, ∞], denoted by ν ≪ μ, if
∀E ∈ M : μ(E) = 0 ⇒ ν(E) = 0.
Example. If ν(E) = ∫_E f dμ as in (C.15) then ν ≪ μ.
The following (ε, δ)-result justifies the term absolute continuity here:
Theorem C.4.36. Let ν : M → R be a signed measure and μ : M → [0, ∞] a measure. Then the following conditions are equivalent:
(a) ν ≪ μ.
(b) ∀ε > 0 ∃δ > 0 ∀E ∈ M : μ(E) < δ ⇒ |ν(E)| < ε.
Proof. The (ε, δ)-condition trivially implies ν ≪ μ. On the other hand, let us show that ν is not absolutely continuous with respect to μ when we assume that
∃ε > 0 ∀δ > 0 ∃Eδ ∈ M : μ(Eδ) < δ and |ν(Eδ)| ≥ ε.
Then
E := lim sup_{k→∞} E_{2⁻ᵏ} = ⋂_{j=1}^∞ ⋃_{k=j}^∞ E_{2⁻ᵏ} ∈ M,
and μ(E) = 0, because for every j,
μ(E) ≤ μ( ⋃_{k=j}^∞ E_{2⁻ᵏ} ) ≤ ∑_{k=j}^∞ 2⁻ᵏ = 2^{1−j}.
Now |ν|(E) > 0, because
|ν|(E) = lim_{j→∞} |ν|( ⋃_{k=j}^∞ E_{2⁻ᵏ} ) ≥ ε.
Hence ν⁺(E) > 0 or ν⁻(E) > 0, so that |ν(A)| > 0 for some A ⊂ E with A ∈ M. Here μ(A) = 0, so that ν is not absolutely continuous with respect to μ.
Exercise C.4.37. Show that the following conditions are equivalent:
1. ν ≪ μ.
2. |ν| ≪ μ.
3. ν⁺ ≪ μ and ν⁻ ≪ μ.
Theorem C.4.38 (Radon–Nikodym). Let μ : M → [0, ∞] be a finite measure and ν ≪ μ. Then there exists a Radon–Nikodym derivative dν/dμ, i.e.,
ν(E) = ∫_E (dν/dμ) dμ
for every E ∈ M.
Exercise C.4.39 (σ-finite Radon–Nikodym). A measure space is called σ-finite if it is a countable union of sets of finite measure. Generalise the Radon–Nikodym Theorem to σ-finite measure spaces. For example, if μ is σ-finite, we can find a sequence Ej ↗ X with μ(Ej) < ∞ and define
dν/dμ := sup_j d(ν|Ej)/d(μ|Ej).
Exercise C.4.40. Let ν = λRn be the Lebesgue measure, and let (X, M, μ) = (Rⁿ, M(λ∗Rn), μ), where μ is the counting measure; this measure space is not σ-finite, but ν ≪ μ. Show that ν cannot be of the form ν(E) = ∫_E f dμ. Thus there is no analogue of the Radon–Nikodym Theorem in this case.
Before proving the Radon–Nikodym Theorem C.4.38, let us deal with the essential special case of the result:
Lemma C.4.41. Let μ, ν : M → [0, ∞] be finite measures such that ν ≤ μ. Then there exists a Radon–Nikodym derivative dν/dμ, i.e.,
ν(E) = ∫_E (dν/dμ) dμ
for every E ∈ M. Moreover,
∫ g⁺ dν = ∫ g⁺ (dν/dμ) dμ (C.16)
when g + : X → [0, ∞] is M-measurable. Proof. An M-partition of a set X is a finite disjoint collection P ⊂ M, for which X = P. Let us define a partial order ≤ on the family the M-partitions by P ≤ Q if and only if for every Q ∈ Q there exists P ∈ P such that Q ⊂ P . The common refinement of M-partitions P, Q is the M-partition ↑ {P, Q} = {P ∩ Q : P ∈ P, Q ∈ Q} .
C.4. Integral as a functional
165
For an M-partition P, let us define d_P : X → R by
  d_P(x) := ν(P)/μ(P)  if x ∈ P ∈ P and μ(P) > 0,
  d_P(x) := 0  otherwise.
Then 0 ≤ d_P ≤ 1, d_P is simple and μ-integrable, and
  d_P = Σ_{P∈P} (ν(P)/μ(P)) χ_P.
The idea in the following is that the Radon–Nikodym derivative dν/dμ will be approximated by the functions d_P in the L²(μ)-sense. If P ≤ Q and E ∈ P then
  ν(E) = ∫_E d_P dμ = ∫_E d_Q dμ,    (C.17)
because
  ∫_E d_Q dμ = ∫_E Σ_{Q∈Q} (ν(Q)/μ(Q)) χ_Q dμ
             = Σ_{Q∈Q} (ν(Q)/μ(Q)) ∫ χ_{Q∩E} dμ
             = Σ_{Q∈Q: Q⊂E} (ν(Q)/μ(Q)) μ(Q)    (since E ∈ P ≤ Q)
             = ν(E).
Moreover, here
  ‖d_P‖²_{L²(μ_E)} ≤ ‖d_Q‖²_{L²(μ_E)} = ‖d_P‖²_{L²(μ_E)} + ‖d_Q − d_P‖²_{L²(μ_E)},    (C.18)
because, using that d_P = ν(E)/μ(E) is constant on E,
  ‖d_P‖²_{L²(μ_E)} + ‖d_Q − d_P‖²_{L²(μ_E)}
    = ∫_E d_P² dμ + ∫_E (d_Q − d_P)² dμ
    = ∫_E d_Q² dμ + 2 ∫_E d_P (d_P − d_Q) dμ
    = ‖d_Q‖²_{L²(μ_E)} + 2 (ν(E)/μ(E)) ∫_E ( ν(E)/μ(E) − d_Q ) dμ
    = ‖d_Q‖²_{L²(μ_E)} + 2 (ν(E)/μ(E)) ( ν(E) − ν(E) )    by (C.17)
    = ‖d_Q‖²_{L²(μ_E)}.
Now
  M := sup { ‖d_P‖²_{L²(μ)} | P is an M-partition } ≤ μ(X) < ∞,
since 0 ≤ d_P ≤ 1.
Let
  A := { x ∈ X : (dμ/d(μ + ν))(x) > 0 }.
Define measures ν₀, ν₁ : M → [0, ∞] by ν₀(E) := ν(Aᶜ ∩ E), ν₁(E) := ν(A ∩ E). Clearly ν = ν₀ + ν₁, and ν₀ ⊥ μ because
  ν₀(A) = ν(Aᶜ ∩ A) = ν(∅) = 0,
  μ(Aᶜ) = ∫_{Aᶜ} (dμ/d(μ + ν)) d(μ + ν) = 0    by (C.16).
We will now prove that ν₁ ≪ μ. Let A_k := { dμ/d(μ + ν) ≥ 1/k }. Take E ∈ M such that μ(E) = 0. Now ν₁(E) = 0, because
  ν₁(E) = ν(A ∩ E)
        ≤ (μ + ν)(A ∩ E)
        = lim_{k→∞} (μ + ν)(A_k ∩ E),
while for every k, using that k · dμ/d(μ + ν) ≥ 1 on A_k,
  (μ + ν)(A_k ∩ E) ≤ ∫_{A_k ∩ E} k (dμ/d(μ + ν)) d(μ + ν)
                   = k μ(A_k ∩ E)    (Radon–Nikodym)
                   ≤ k μ(E) = 0.
Proving the uniqueness part is left as Exercise C.4.45. □
Exercise C.4.45. Show that the Lebesgue decomposition in Theorem C.4.44 is unique.
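A supplementary concrete instance of the Lebesgue decomposition (our choice of measures, not from the text):

```latex
% Supplementary illustration. With mu = lambda_R, decompose
\[
  \nu(E) := \lambda_{\mathbb{R}}(E \cap [0,1]) + \delta_2(E)
  = \underbrace{\lambda_{\mathbb{R}}(E \cap [0,1])}_{\nu_1(E)}
  + \underbrace{\delta_2(E)}_{\nu_0(E)} .
\]
% Here nu_1 << mu with density chi_{[0,1]}, while nu_0 = delta_2 is
% concentrated on {2}, a mu-null set, so nu_0 is singular to mu:
\[
  \frac{\mathrm{d}\nu_1}{\mathrm{d}\mu} = \chi_{[0,1]},
  \qquad
  \nu_0 \perp \mu .
\]
```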
C.4.4
Integration as functional on function spaces
Assume (X, M, μ) is a measure space, possibly with topology. On function spaces like Lp(μ) or C(X) = C(X, R) (when X is, e.g., a compact Hausdorff space), integration acts as a bounded linear functional by
  f → ∫ f g dμ,
when g is a suitable weight function on X. It is natural to study necessary and sufficient conditions on g, and to ask whether all bounded linear functionals are of this form. The general functional analytic outline is as follows: Let V be a real Banach space, e.g., Lp(μ) or C(X) = C(X, R). The dual of V is the Banach space
  V′ = L(V, R) := {φ : V → R | φ bounded and linear},
endowed with the (operator) norm
  φ → ‖φ‖ := sup_{f ∈ V : ‖f‖_V ≤ 1} |φ(f)|,
see Definition B.4.15 and Exercise B.4.16. Given a “concrete” space V, we would like to discover an intuitive representation of the dual.
C.4.5
Integration as functional on Lp (μ)
Now we are going to find a concrete presentation for the dual of V = Lp(μ), where (X, M, μ) is a measure space. We shall assume that μ(X) < ∞, though often this technical assumption can be removed, since everything works for σ-finite measures just as well.

Lemma C.4.46. Let μ be a finite measure. Let 1 ≤ p ≤ ∞, and let q = p′ be its Lebesgue conjugate, i.e., 1/p + 1/q = 1. Let g ∈ Lq(μ). Then φ_g ∈ Lp(μ)′, where
  φ_g(f) := ∫ f g dμ,
and ‖φ_g‖ = ‖g‖_{Lq}.

Exercise C.4.47. Prove Lemma C.4.46.

Remark C.4.48. You may generalise Lemma C.4.46 as follows: the conclusion holds for a general measure μ if 1 < p ≤ ∞, and for a σ-finite measure μ if 1 ≤ p ≤ ∞.

Remark C.4.49. The next Theorem C.4.50 roughly says that the dual of Lp “is” Lq, under some technical assumptions. The result holds for a general measure μ if 1 < p < ∞, and for a σ-finite measure if 1 ≤ p < ∞.

Theorem C.4.50 (Dual of Lp(μ)). Let μ be a finite measure. Let 1 ≤ p < ∞, and let q = p′ be its Lebesgue conjugate. Then the mapping (g → φ_g) : Lq(μ) → Lp(μ)′ is an isometric isomorphism, i.e., Lp(μ)′ ≅ Lq(μ).

Proof. By the previous Lemma C.4.46, it suffices to show that every ψ ∈ Lp(μ)′ is of the form ψ = φ_g for some g ∈ Lq(μ). Let us define ν : M → R such that
  ν(E) := ψ(χ_E),
where χ_E ∈ Lp(μ) because μ(X) < ∞.
The idea in the proof is to show that dν/dμ ∈ Lq(μ) and that
  ψ(f) = ∫ f (dν/dμ) dμ.    (C.19)
The first step is to show that ν is a signed measure: Let {E_j}_{j=1}^∞ ⊂ M be a disjoint collection. Then ν(⋃_{j=1}^∞ E_j) = Σ_{j=1}^∞ ν(E_j), because
  | ν(⋃_{j=1}^∞ E_j) − Σ_{j=1}^k ν(E_j) | = | ψ(χ_{⋃_{j=1}^∞ E_j}) − Σ_{j=1}^k ψ(χ_{E_j}) |
    = | ψ( Σ_{j=k+1}^∞ χ_{E_j} ) |
    ≤ ‖ψ‖ ‖ Σ_{j=k+1}^∞ χ_{E_j} ‖_{Lp(μ)}
    = ‖ψ‖ μ( ⋃_{j=k+1}^∞ E_j )^{1/p} −−→_{k→∞} 0,
because μ(X) < ∞. Moreover, ν ≪ μ: if μ(E) = 0 then χ_E = 0 in Lp(μ), so that ν(E) = ψ(χ_E) = 0. Hence by the Radon–Nikodym Theorem C.4.38 the derivative dν/dμ exists, and by linearity
  ψ(h) = ∫ h (dν/dμ) dμ    (C.20)
for every simple M-measurable h : X → R.

Consider first the case p = 1 (so that q = ∞). Let M > ‖ψ‖ and let
  A_M := { |dν/dμ| > M }.
Then
  M μ(A_M) = ∫_{A_M} M dμ
    ≤ ∫_{A_M} |dν/dμ| dμ
    = ∫ χ_{A_M} sgn(dν/dμ) (dν/dμ) dμ
    = ψ( χ_{A_M} sgn(dν/dμ) )    by (C.20)
    ≤ ‖ψ‖ ‖χ_{A_M} sgn(dν/dμ)‖_{Lp(μ)}
    ≤ ‖ψ‖ μ(A_M)    since p = 1.
Since M > ‖ψ‖ and μ(A_M) < ∞, we must have μ(A_M) = 0, so ‖dν/dμ‖_{L∞(μ)} ≤ ‖ψ‖.

Now let 1 < p < ∞ (so that ∞ > q > 1). Take simple M-measurable functions h_k : X → R such that
  0 ≤ h_k(x) ≤ h_{k+1}(x) −−→_{k→∞} |dν/dμ(x)|.
Then by Fatou's lemma
  ‖dν/dμ‖^q_{Lq(μ)} = ∫ |dν/dμ|^q dμ ≤ liminf_{k→∞} ∫ h_k^q dμ,
so that dν/dμ ∈ Lq(μ) follows if we show that ‖h_k‖_{Lq(μ)} ≤ constant < ∞ for every k ∈ Z⁺:
  ‖h_k‖^q_{Lq(μ)} = ∫ h_k^q dμ
    ≤ ∫ h_k^{q−1} |dν/dμ| dμ
    = ∫ h_k^{q−1} sgn(dν/dμ) (dν/dμ) dμ
    = ψ( h_k^{q−1} sgn(dν/dμ) )    by (C.20)
    ≤ ‖ψ‖ ‖h_k^{q−1}‖_{Lp(μ)}
    = ‖ψ‖ ‖h_k‖^{q/p}_{Lq(μ)},
because p(q − 1) = q. Hence ‖h_k‖_{Lq(μ)} = ‖h_k‖^{q(1−1/p)}_{Lq(μ)} ≤ ‖ψ‖.
Finally, we have to show that (C.19) holds for f ∈ Lp(μ). Take simple M-measurable functions f_k : X → [−∞, +∞] such that f_k → f in Lp(μ). Then
  | ψ(f) − ∫ f (dν/dμ) dμ |
    = | ψ(f − f_k) + ∫ (f_k − f) (dν/dμ) dμ |    by (C.20)
    ≤ |ψ(f − f_k)| + ∫ |f_k − f| |dν/dμ| dμ
    ≤ ‖ψ‖ ‖f − f_k‖_{Lp(μ)} + ‖f_k − f‖_{Lp(μ)} ‖dν/dμ‖_{Lq(μ)}    (Hölder)
    −−→_{k→∞} 0.
Thus the proof is complete. □
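To see Theorem C.4.50 in its simplest setting, one may take the counting measure on a finite set; the following is a supplementary sketch of ours, not from the text.

```latex
% Supplementary illustration: X = {1,...,n} with counting measure mu,
% so L^p(mu) = (R^n, ||.||_p). Every psi in the dual is a pairing
\[
  \psi(f) = \sum_{j=1}^{n} f(j)\, g(j),
  \qquad g(j) := \psi(\chi_{\{j\}}),
\]
% and, exactly as in the proof, g plays the role of d(nu)/d(mu), where
% nu(E) = psi(chi_E). Moreover, by Hoelder duality,
\[
  \|\psi\| = \|g\|_{\ell^q} = \Big( \sum_{j=1}^{n} |g(j)|^{q} \Big)^{1/q},
  \qquad \frac1p + \frac1q = 1 .
\]
```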
Exercise C.4.51. Generalise Theorem C.4.50 to the case where μ is σ-finite (and 1 ≤ p < ∞).

Exercise C.4.52. Generalise Theorem C.4.50 to the case where μ is any measure and 1 < p < ∞ (so that 1 < q < ∞ also). (Hint: apply the result of Exercise C.4.51.)

Remark C.4.53. We have not dealt with the dual of L∞(μ). This case actually resembles the other Lp-cases, but is slightly different; see details, e.g., in [153]. Often, however, L∞(μ)′ ≇ L1(μ).

Exercise C.4.54. Let X = [0, 1] and μ = (λ_R)|_X. Show that there exists ψ ∈ L∞(μ)′ which is not of the form f → ∫ f g dμ for any g ∈ L1(μ). (Hint: Define a suitable bounded linear functional f → ϕ(f) for continuous functions f, and extend it to ψ using the Hahn–Banach Theorem, see Theorem B.4.25.)

Exercise C.4.55. Let (X, M, μ) be a measure space, where X is uncountable, M = {E ⊂ X : E or E^c is countable}, and μ is the counting measure. Show that there exists ψ ∈ L1(μ)′ which is not of the form f → ∫ f g dμ for any g ∈ L∞(μ). (Hint: You may use that there exists S ∈ P(X) \ M, which follows by using the Hausdorff Maximal Principle or other equivalents of the Axiom of Choice.)

Theorem C.4.56 (Converse of Hölder’s inequality). Let μ be a σ-finite measure, 1 ≤ p ≤ ∞, and 1/p + 1/q = 1. Let S be the space of all simple functions that vanish outside a set of finite measure. Let g be M-measurable such that fg ∈ L1(μ) for all f ∈ S, and such that
  M_q(g) := sup { |∫ f g dμ| : f ∈ S, ‖f‖_{Lp(μ)} = 1 }
is finite. Then g ∈ Lq(μ) and M_q(g) = ‖g‖_{Lq(μ)}.
Proof. From Hölder’s inequality (Theorem C.4.4) we have the inequality M_q(g) ≤ ‖g‖_{Lq(μ)}. For the proof of ‖g‖_{Lq(μ)} ≤ M_q(g) we follow [35]. Assume first that q < ∞. Let E_n ⊂ X be an increasing sequence of sets such that 0 < μ(E_n) < ∞ for all n, and such that ⋃_{n=1}^∞ E_n = X. Let ϕ_n be a sequence of simple functions such that ϕ_n → g pointwise and |ϕ_n| ≤ |g|, and let g_n := ϕ_n χ_{E_n}, where χ_{E_n} is the characteristic function of the set E_n. Then g_n → g pointwise, |g_n| ≤ |g| and g_n ∈ S. Define
  f_n := ‖g_n‖^{1−q}_{Lq(μ)} |g_n|^{q−1} ḡ/|g|
when g ≠ 0, and f_n := 0 when g = 0. The relation 1/p + 1/q = 1 implies (q − 1)p = q, so that ‖f_n‖_{Lp(μ)} = 1, and by Fatou’s Lemma C.3.10 we have
  ‖g‖_{Lq(μ)} ≤ liminf ‖g_n‖_{Lq(μ)} = liminf ∫ |f_n g_n| dμ ≤ liminf ∫ |f_n g| dμ = liminf | ∫ f_n g dμ | ≤ M_q(g).
The case q = ∞ is slightly different. Take ε > 0 and denote A := {x ∈ X : |g(x)| ≥ M_∞(g) + ε}. We need to show that μ(A) = 0. If μ(A) > 0, there exists some B ⊂ A such that 0 < μ(B) < ∞. Let us define f := (1/μ(B)) (ḡ/|g|) χ_B when g ≠ 0, and f := 0 when g = 0. Then ‖f‖_{L1(μ)} = 1 and | ∫ f g dμ | ≥ M_∞(g) + ε, a contradiction. □
C.4.6
Integration as functional on C(X)
Measure theory and topology have fundamental connections, as exemplified in this section. For our purposes, it is enough to study compact Hausdorff spaces, though analogous results hold for locally compact Hausdorff spaces. Let (X, τ) be a compact Hausdorff space and let C(X) = C(X, R) denote the Banach space of continuous functions f : X → R, endowed with the supremum norm
  ‖f‖ = ‖f‖_{C(X)} := sup_{x∈X} |f(x)|.
Appealing to the “geometry” of X, we are going to characterise the dual C(X)′ = L(C(X), R).

Exercise C.4.57. Let (X, τ) be a compact Hausdorff space. Actually, C(X) contains all the information about (X, τ): a set S ⊂ X is closed if and only if S = {f = 0} for some f ∈ C(X). Prove this.

Remark C.4.58. Let (X, τ) be a topological space. Recall that the vector space of signed (Borel) measures
  M(X) = M(Σ(τ)) := {ν : Σ(τ) → R | ν is a signed measure}
is a Banach space with the norm ‖ν‖ := |ν|(X).
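For example (a supplementary computation of ours, not from the text), the total variation norm of a combination of point masses:

```latex
% Supplementary illustration of the norm ||nu|| = |nu|(X).
% For distinct points a, b in X, the signed Borel measure
\[
  \nu := 2\delta_a - 3\delta_b
\]
% has Jordan decomposition nu^+ = 2 delta_a and nu^- = 3 delta_b, hence
\[
  \|\nu\| = |\nu|(X) = \nu^{+}(X) + \nu^{-}(X) = 2 + 3 = 5,
\]
% while nu(X) = -1: the total variation ignores the cancellation.
```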
Lemma C.4.59. Let ν : M → R be a signed measure on X, where τ ⊂ M. For f ∈ C(X), let
  T_ν(f) := ∫ f dν := ∫ f dν⁺ − ∫ f dν⁻.
Then T_ν ∈ C(X)′ and ‖T_ν‖ ≤ ‖ν‖.

Proof. Each f ∈ C(X) is M-measurable, as τ ⊂ M. Furthermore, ν⁺, ν⁻ : M → [0, ∞] are finite measures. Consequently, f ∈ C(X) is ν±-integrable, so T_ν(f) ∈ R is well defined, and
  |T_ν(f)| ≤ ∫ |f| dν⁺ + ∫ |f| dν⁻ ≤ ‖f‖ ( ν⁺(X) + ν⁻(X) ) = ‖f‖ ‖ν‖.
The operator T_ν : C(X) → R is linear since integration is linear. □
Theorem C.4.60 (F. Riesz’s Topological Representation Theorem). Let (X, τ) be a compact Hausdorff space. Let M(X) and T_ν ∈ C(X)′ be as above. Then (ν → T_ν) : M(X) → C(X)′ is an isometric isomorphism. In other words, bounded linear functionals on C(X) are exactly integrations with respect to signed measures, with the natural norms coinciding.

We shall soon prove the Riesz Representation Theorem C.4.60 step-wise.

Definition C.4.61 (Positive functionals). Let (X, τ) be a compact Hausdorff space. A functional T : C(X) → R is called positive if T(f) ≥ 0 whenever f ≥ 0.

Exercise C.4.62. Show that a positive linear functional T ∈ C(X)′ is bounded and that ‖T‖ = T(1), where 1 ∈ C(X) is the constant function x → 1.

Lemma C.4.63. Let T ∈ C(X)′, where (X, τ) is a compact Hausdorff space. Then there exist positive T⁺, T⁻ ∈ C(X)′ such that
  T = T⁺ − T⁻,
  ‖T‖ = ‖T⁺‖ + ‖T⁻‖.
Proof. For f = f + − f − ∈ C(X), let us define T + (f ) := T + (f + ) − T + (f − ),
where
  T⁺(g⁺) := sup { T(h⁺) | h⁺ ∈ C(X), 0 ≤ h⁺ ≤ g⁺ }.
Obviously, 0 = T(0) ≤ T⁺(g⁺) ≤ ‖T‖ ‖g⁺‖. Thereby the functional T⁺ : C(X) → R is well defined and positive. Let us show that T⁺ is linear. If 0 < λ⁺ ∈ R then
  T⁺(λ⁺ f⁺) = sup { T(h) | h ∈ C(X), 0 ≤ h ≤ λ⁺ f⁺ }
            = sup { T(λ⁺ h) | h ∈ C(X), 0 ≤ h ≤ f⁺ }
            = λ⁺ T⁺(f⁺)    (T linear);
from this we easily see that T⁺(λf) = λ T⁺(f) for every λ ∈ R and f ∈ C(X). Next, T⁺(f⁺ + g⁺) = T⁺(f⁺) + T⁺(g⁺) whenever 0 ≤ f⁺, g⁺ ∈ C(X), because
  if  0 ≤ h ≤ f⁺ + g⁺,  0 ≤ h₁ ≤ f⁺,  0 ≤ h₂ ≤ g⁺,
  then  0 ≤ h₁ + h₂ ≤ f⁺ + g⁺,  0 ≤ min(f⁺, h) ≤ f⁺,  0 ≤ h − min(f⁺, h) ≤ g⁺.
Since (f + g)⁺ + f⁻ + g⁻ = (f + g)⁻ + f⁺ + g⁺, we get
  T⁺((f + g)⁺) + T⁺(f⁻) + T⁺(g⁻) = T⁺((f + g)⁻) + T⁺(f⁺) + T⁺(g⁺),
so that
  T⁺(f + g) = T⁺((f + g)⁺) − T⁺((f + g)⁻)
            = T⁺(f⁺) − T⁺(f⁻) + T⁺(g⁺) − T⁺(g⁻)
            = T⁺(f) + T⁺(g).
Hence we have seen that T⁺ : C(X) → R is linear and positive, and that ‖T⁺‖ ≤ ‖T‖. Next, let us define T⁻ := T⁺ − T ∈ C(X)′. Then T⁻ is positive, because
  T⁻(f⁺) = sup { T(h) − T(f⁺) | h ∈ C(X) : 0 ≤ h ≤ f⁺ } ≥ 0    (choose h = f⁺).
Finally,
  ‖T‖ = ‖T⁺ − T⁻‖ ≤ ‖T⁺‖ + ‖T⁻‖
      = T⁺(1) + T⁻(1)
      = 2 T⁺(1) − T(1)
      = sup { T(2h − 1) | h ∈ C(X) : 0 ≤ h ≤ 1 }
      = sup { T(g) | g ∈ C(X) : −1 ≤ g ≤ 1 }
      ≤ ‖T‖,
so ‖T‖ = ‖T⁺‖ + ‖T⁻‖. □
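A supplementary example of this decomposition (ours, not from the text), with two evaluation points a ≠ b on a compact Hausdorff space X:

```latex
% Supplementary illustration of Lemma C.4.63. Let T(f) := f(a) - f(b).
% For 0 <= h <= f^+, Urysohn's lemma lets h(a) attain f^+(a) with h(b) = 0, so
\[
  T^{+}(f) = f(a), \qquad T^{-}(f) = (T^{+} - T)(f) = f(b).
\]
% Both are positive, and choosing f with f(a) = 1, f(b) = -1, ||f|| = 1 gives
\[
  \|T\| = 2 = \|T^{+}\| + \|T^{-}\| ,
\]
% in accordance with the lemma (and with T = T_nu for nu = delta_a - delta_b).
```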
Remark C.4.64. Recall that the support supp(f) ⊂ X of a function f ∈ C(X) is the closure of the set {f ≠ 0}. Moreover, the abbreviations
  K ≺ f,  f ≺ U
mean that 0 ≤ f ≤ 1, K ⊂ X is compact such that χ_K ≤ f, and U ⊂ X is open such that supp(f) ⊂ U.

Theorem C.4.65. Let T⁺ ∈ C(X)′ be positive, where (X, τ) is a compact Hausdorff space. Then there exists a finite Borel measure μ : Σ(τ) → [0, ∞] such that
  T⁺ f = ∫ f dμ
for every f ∈ C(X).

Proof. Write T := T⁺. Let us define a measurelet m : τ → [0, ∞] such that
  m(U) := sup { Tf | f ≺ U }.
Indeed, m(∅) = T(0) = 0. Thus m generates an outer measure m* : P(X) → [0, ∞] by
  m*(E) = inf { Σ_{j=1}^∞ m(U_j) : E ⊂ ⋃_{j=1}^∞ U_j, U_j ∈ τ }.
We have to show that μ := m*|_{Σ(τ)} is the desired measure. First,
  m*(E) = inf { m(U) : E ⊂ U ∈ τ }
follows, if we show that
  m( ⋃_{j=1}^∞ U_j ) ≤ Σ_{j=1}^∞ m(U_j).    (C.21)
So let f ≺ ⋃_{j=1}^∞ U_j. Now supp(f) ⊂ X is compact, so supp(f) ⊂ ⋃_{j=1}^n U_j for some n ∈ Z⁺. Let {g_j}_{j=1}^n be a partition of unity for which
  supp(f) ≺ Σ_{j=1}^n g_j,  g_j ≺ U_j.
Thereby
  Tf = T( f Σ_{j=1}^n g_j )
     = Σ_{j=1}^n T(f g_j)    (T linear)
     ≤ Σ_{j=1}^n m(U_j)    (since f g_j ≺ U_j)
     ≤ Σ_{j=1}^∞ m(U_j),
proving (C.21). Next, we show that τ ⊂ M(m*) by proving that m*(A ∪ B) = m*(A) + m*(B) whenever A ⊂ U ∈ τ and B ⊂ U^c; let us assume the non-trivial case m*(A), m*(B) < ∞. Given ε > 0, there exists V ∈ τ such that A ∪ B ⊂ V and m*(A ∪ B) + ε > m(V). Moreover, let
  f ≺ U ∩ V  such that  m(U ∩ V) < Tf + ε,
  g ≺ supp(f)^c ∩ V  such that  m(supp(f)^c ∩ V) < Tg + ε.
We notice that U ∈ M(m*), because
  m*(A ∪ B) + ε > m(V)
    ≥ T(f + g)    (since f + g ≺ V)
    = Tf + Tg    (T linear)
    > m(U ∩ V) + m(supp(f)^c ∩ V) − 2ε
    ≥ m*(A) + m*(B) − 2ε
    ≥ m*(A ∪ B) − 2ε.
Thus we can define the Borel measure μ := m*|_{Σ(τ)}. Notice that m(U) = μ(U), μ(X) = T(1) < ∞ and that m* is Borel-regular. If χ_E ≤ g ≤ χ_F, where g ∈ C(X) and E, F ∈ Σ(τ), then
  ∫ χ_E dμ ≤ ∫ g dμ ≤ ∫ χ_F dμ;    (C.22)
moreover,
  μ(E) ≤ Tg ≤ μ(F),    (C.23)
because
  μ(E) ≤ μ({g ≥ 1})    (as E ⊂ {g ≥ 1})
       ≤ μ({g > 1 − δ})    (for any δ > 0)
       = sup { Tf : f ≺ {g > 1 − δ} }    (as {g > 1 − δ} ∈ τ)
       ≤ (1 − δ)^{−1} Tg,
since T is positive and every f ≺ {g > 1 − δ} satisfies 0 ≤ f ≤ (1 − δ)^{−1} g; letting δ → 0 gives μ(E) ≤ Tg, and Tg ≤ μ(F) is obtained similarly.

C.5 Product measure and integral

Let ε > 0. Let {A_j × B_j}_{j=1}^∞ ⊂ A be disjoint such that
  S ∪ T ⊂ ⋃_{j=1}^∞ A_j × B_j,
  ε + m*(S ∪ T) > Σ_{j=1}^∞ m(A_j × B_j).
Let us define S_j, T_j, U_j ⊂ X × Y by
  S_j := (A_j × B_j) ∩ (A × B),
  T_j := (A_j × B_j) ∩ (A × (Y \ B)),
  U_j := (A_j × B_j) ∩ ((X \ A) × Y).
Then {S_j, T_j, U_j}_{j=1}^∞ ⊂ A is disjoint, and A_j × B_j = S_j ∪ T_j ∪ U_j. Moreover,
  S ⊂ ⋃_{j=1}^∞ S_j,  T ⊂ ⋃_{j=1}^∞ (T_j ∪ U_j),
so that
  ε + m*(S ∪ T) > Σ_{j=1}^∞ m(A_j × B_j)
                = Σ_{j=1}^∞ m(S_j) + Σ_{j=1}^∞ ( m(T_j) + m(U_j) )
                ≥ m*(S) + m*(T).
C.5. Product measure and integral
183
Thus we have shown that A × B ∈ M_{μ×ν}. Finally, if {A_j × B_j}_{j=1}^∞ ⊂ A is a cover of A × B then
  m*(A × B) ≤ m(A × B)    (trivially)
            = μ(A) ν(B)
            = ∫_X ∫_Y χ_{A×B} dν dμ
            ≤ ∫_X ∫_Y Σ_{j=1}^∞ χ_{A_j×B_j} dν dμ
            = Σ_{j=1}^∞ m(A_j × B_j)    (Monotone Convergence).
Therefore m*(A × B) = μ(A) ν(B). □
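For instance (a supplementary check of ours, not from the text), for Lebesgue measure on the line the product outer measure of a rectangle is its area:

```latex
% Supplementary illustration of m*(A x B) = mu(A) nu(B).
\[
  (\lambda_{\mathbb{R}} \times \lambda_{\mathbb{R}})\big([0,1]\times[0,2]\big)
  = \lambda_{\mathbb{R}}([0,1])\,\lambda_{\mathbb{R}}([0,2]) = 1 \cdot 2 = 2 .
\]
% Mixed products work the same way: with counting measure mu on {1,2},
\[
  (\mu \times \lambda_{\mathbb{R}})\big(\{1\} \times [0,\tfrac12]\big)
  = \mu(\{1\})\,\lambda_{\mathbb{R}}([0,\tfrac12]) = \tfrac12 .
\]
```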
Exercise C.5.5. Show that for the Lebesgue measures, λ_{R^m} × λ_{R^n} = λ_{R^{m+n}}.

Definition C.5.6. For x ∈ X, the x-slice S^x ⊂ Y of a set S ⊂ X × Y is
  S^x := {y ∈ Y | (x, y) ∈ S}.

Remark C.5.7. Let B = {R ∈ Σ(A) : R^x ∈ M_ν for all x ∈ X}. Clearly X × Y ∈ A ⊂ B. If R ∈ B then also R^c = (X × Y) \ R ∈ B, because (R^c)^x = (R^x)^c. Similarly, if {R_j}_{j=1}^∞ ⊂ B then ⋃_{j=1}^∞ R_j ∈ B, because ( ⋃_{j=1}^∞ R_j )^x = ⋃_{j=1}^∞ (R_j)^x. Thus Σ(A) ⊂ B.

Lemma C.5.8. The product outer measure m* is Σ(A)-regular: for any S ⊂ X × Y there exists R ∈ Σ(A) such that S ⊂ R and m*(S) = m*(R). Moreover, if μ, ν are finite then the x-slice R^x ∈ M_ν for every x ∈ X, x → ν(R^x) is M_μ-measurable, and
  m*(R) = ∫_X ∫_Y χ_R dν dμ.
Proof. For each k ∈ Z⁺, take a disjoint family {A_{kj} × B_{kj}}_{j=1}^∞ ⊂ A such that
  S ⊂ ⋃_{j=1}^∞ A_{kj} × B_{kj},
  m*(S) + 1/k ≥ Σ_{j=1}^∞ m(A_{kj} × B_{kj}).
Let R_n := ⋂_{k=1}^n ⋃_{j=1}^∞ (A_{kj} × B_{kj}) and R := ⋂_{n=1}^∞ R_n. Then S ⊂ R ∈ Σ(A). Moreover, we have m*(S) = m*(R), because
  m*(S) ≤ m*(R) ≤ m*( ⋃_{j=1}^∞ A_{nj} × B_{nj} ) ≤ Σ_{j=1}^∞ m(A_{nj} × B_{nj}) ≤ m*(S) + 1/n.
The set R_n is the union of a disjoint family {C_{nj} × D_{nj}}_{j=1}^∞ ⊂ A, and
  χ_{R_n^x}(y) = Σ_{j=1}^∞ χ_{C_{nj}}(x) χ_{D_{nj}}(y).
Consequently, χ_{R_n^x} : Y → R is M_ν-measurable for all x ∈ X, and
  1 ≥ χ_{R_n^x}(y) ≥ χ_{R_{n+1}^x}(y) −−→_{n→∞} χ_{R^x}(y) ≥ 0.
Setting h_n(x) := ν(R_n^x) and h(x) := ν(R^x), Lebesgue’s Dominated Convergence Theorem C.3.22 yields
  ν(Y) ≥ h_n(x) ≥ h_{n+1}(x) −−→_{n→∞} h(x) ≥ 0,
so also h : X → [0, ∞) is M_μ-measurable and, μ(X) being finite, m*(R) = ∫_X ∫_Y χ_R dν dμ follows by dominated convergence on X. □

Let us also record the distribution function formula: for an M-measurable f : X → [0, ∞],
  ∫ f dμ = ∫_{[0,∞)} μ({f > t}) dλ_R(t).
Corollary C.5.20 (Young’s inequality). Let μ, ν be σ-finite and 1 < p < ∞. Assume that K : X × Y → C is an M_{μ×ν}-measurable function satisfying
  C₁ := sup_{y∈Y} ∫_X |K(x, y)| dμ(x) < ∞,
  C₂ := sup_{x∈X} ∫_Y |K(x, y)| dν(y) < ∞.
For any u ∈ Lp(ν) define Au : X → C by
  Au(x) := ∫_Y K(x, y) u(y) dν(y).
Then
  ‖Au‖_{Lp(μ)} ≤ C₁^{1/p} C₂^{1/q} ‖u‖_{Lp(ν)},
where q is the conjugate exponent of p.

Remark C.5.21. Notice that this defines a unique bounded linear operator A : Lp(ν) → Lp(μ).
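The classical convolution inequality on R is the special case K(x, y) = k(x − y); the following specialisation is a supplementary sketch of ours, not from the text.

```latex
% Supplementary illustration: X = Y = R, mu = nu = lambda_R, and
% K(x,y) := k(x-y) with k in L^1(lambda_R). By translation invariance,
\[
  C_1 = \sup_{y} \int |k(x-y)|\,\mathrm{d}x = \|k\|_{L^1},
  \qquad
  C_2 = \sup_{x} \int |k(x-y)|\,\mathrm{d}y = \|k\|_{L^1},
\]
% so Corollary C.5.20 yields the classical Young inequality for convolutions:
\[
  \|k * u\|_{L^p} \leq C_1^{1/p} C_2^{1/q} \|u\|_{L^p} = \|k\|_{L^1}\, \|u\|_{L^p} .
\]
```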
Remark C.5.22. It is clear that we can replace sup by esssup in the definition of C₁, C₂, where the esssup would be taken with respect to ν and μ in C₁ and C₂, respectively:
  C₁ := ν-esssup_{y∈Y} ∫_X |K(x, y)| dμ(x),
  C₂ := μ-esssup_{x∈X} ∫_Y |K(x, y)| dν(y),
with the same proof.

Proof of Corollary C.5.20. First, y → K(x, y) u(y) is M_ν-measurable, and
  |Au(x)| ≤ ∫_Y |K(x, y)|^{1/p} |u(y)| · |K(x, y)|^{1/q} dν(y)
    ≤ ( ∫_Y |K(x, y)| |u(y)|^p dν(y) )^{1/p} ( ∫_Y |K(x, y)| dν(y) )^{1/q}    (Hölder)
    ≤ ( ∫_Y |K(x, y)| |u(y)|^p dν(y) )^{1/p} C₂^{1/q}.
Using this we get
  ‖Au‖^p_{Lp(μ)} = ∫_X |Au(x)|^p dμ(x)
    ≤ C₂^{p/q} ∫_X ∫_Y |K(x, y)| |u(y)|^p dν(y) dμ(x)
    = C₂^{p/q} ∫_Y |u(y)|^p ∫_X |K(x, y)| dμ(x) dν(y)    (Fubini)
    ≤ C₁ C₂^{p/q} ∫_Y |u(y)|^p dν(y)
    = ( C₁^{1/p} C₂^{1/q} )^p ‖u‖^p_{Lp(ν)},
which gives the result. □
Theorem C.5.23 (Minkowski’s inequality for integrals). Let μ, ν be σ-finite and let f : X × Y → C be an M_{μ×ν}-measurable function. Let 1 ≤ p < ∞. Then
  ( ∫ ( ∫ |f(x, y)| dν(y) )^p dμ(x) )^{1/p} ≤ ∫ ( ∫ |f(x, y)|^p dμ(x) )^{1/p} dν(y).

Proof. If p = 1 the result follows from Theorem C.5.15 by exchanging the order of integration. For 1 < p < ∞, taking g ∈ Lq(μ), 1/p + 1/q = 1, we have
  ∫ ( ∫ |f(x, y)| dν(y) ) |g(x)| dμ(x)
    = ∫ ( ∫ |f(x, y)| |g(x)| dμ(x) ) dν(y)    (Fubini)
    ≤ ∫ ( ∫ |f(x, y)|^p dμ(x) )^{1/p} dν(y) · ‖g‖_{Lq(μ)}    (Hölder).
Now the statement follows from the converse of Hölder’s inequality (Theorem C.4.56). □

As a consequence, we obtain the second part of Minkowski’s inequality for integrals:

Corollary C.5.24 (Monotonicity of the Lp-norm). Let μ, ν be σ-finite and let f : X × Y → C be an M_{μ×ν}-measurable function. Let 1 ≤ p ≤ ∞. Assume that f(·, y) ∈ Lp(μ) for ν-a.e. y, and assume that the function y → ‖f(·, y)‖_{Lp(μ)} is in L1(ν). Then f(x, ·) ∈ L1(ν) for μ-a.e. x, the function x → ∫ f(x, y) dν(y) is in Lp(μ), and
  ‖ ∫ f(·, y) dν(y) ‖_{Lp(μ)} ≤ ∫ ‖f(·, y)‖_{Lp(μ)} dν(y).

Proof. For p = ∞ the statement follows from Theorem C.3.14. For 1 ≤ p < ∞ it follows from Theorem C.5.23 and Fubini’s Theorem C.5.15. □
Chapter D
Algebras An algebra is a vector space endowed with a multiplication, satisfying some compatibility conditions. In the sequel, we are going to deal with spectral properties of algebras under various additional assumptions.
D.1 Algebras

Definition D.1.1 (Algebra). A vector space A over the field C is an algebra if there exists an element 1_A ∈ A \ {0} and a mapping A × A → A, (x, y) → xy, satisfying
  x(yz) = (xy)z,
  x(y + z) = xy + xz,  (x + y)z = xz + yz,
  λ(xy) = (λx)y = x(λy),
  1_A x = x = x 1_A,
for all x, y, z ∈ A and λ ∈ C. We briefly write xyz := x(yz). The element 1 := 1_A is called the unit of A, and an element x ∈ A is called invertible (with the unique inverse x⁻¹) if there exists x⁻¹ ∈ A such that x⁻¹x = 1 = xx⁻¹. If xy = yx for every x, y ∈ A then A is called commutative.

Remark D.1.2. Warnings: in some books the algebra axioms allow 1_A to be 0, but then the resulting algebra is simply {0}; we have omitted such a triviality. In some books the existence of a unit is omitted from the algebra axioms; what we have called an algebra is there called a unital algebra.

Example. Let us give some examples of algebras:
1. C is the most important algebra. The operations are the usual ones for complex numbers, and the unit element is 1_C = 1 ∈ C. Clearly C is a commutative algebra.
2. The algebra F(X) := {f | f : X → C} of complex-valued functions on a (finite or infinite) set X is endowed with the usual algebra structure (pointwise operations). Function algebras are commutative, because C is commutative.
3. The algebra L(V) := {A : V → V | A is linear} of linear operators on a vector space V ≠ {0} over C is endowed with the usual vector space structure and with the multiplication (A, B) → AB (composition of operators); the unit element is 1_{L(V)} = (v → v) : V → V, the identity operator on V. This algebra is non-commutative if V is at least two-dimensional.

Exercise D.1.3. Let A be an algebra and x, y ∈ A. Prove the following claims:
(a) If x, xy are invertible then y is invertible.
(b) If xy, yx are invertible then x, y are invertible.

Exercise D.1.4. Give an example of an algebra A and elements x, y ∈ A such that xy = 1_A ≠ yx. Prove that then (yx)² = yx ≠ 0. (Hint: Such an algebra is necessarily infinite-dimensional.)

Exercise D.1.5 (Commutators). In an algebra A, let [A, B] = AB − BA. If λ is a scalar and A, B, C are elements of the algebra A, show that
  [B, A] = −[A, B],
  [λA, B] = λ[A, B],
  [A + B, C] = [A, C] + [B, C],
  [AB, C] = A[B, C] + [A, C]B,
  C[A, B]C⁻¹ = [CAC⁻¹, CBC⁻¹].

Definition D.1.6 (Spectrum). Let A be an algebra. The spectrum σ(x) of an element x ∈ A is the set
  σ_A(x) = σ(x) = {λ ∈ C : λ1 − x is not invertible}.

Example. Let us give some examples of invertibility and spectra:
1. An element λ ∈ C is invertible if and only if λ ≠ 0; the inverse of an invertible λ is the usual λ⁻¹ = 1/λ. Generally, σ_C(λ) = {λ}.
2. An element f ∈ F(X) is invertible if and only if f(x) ≠ 0 for every x ∈ X. The inverse of an invertible f is g with g(x) = f(x)⁻¹. Generally, σ_{F(X)}(f) = f(X) := {f(x) | x ∈ X}.
3. An element A ∈ L(V) is invertible if and only if it is a bijection (if and only if 0 ∉ σ_{L(V)}(A)).

Exercise D.1.7. Let A be an algebra and x, y ∈ A. Prove the following claims:
(a) 1 − yx is invertible if and only if 1 − xy is invertible.
(b) σ(yx) ⊂ σ(xy) ∪ {0}.
(c) If x is invertible then σ(xy) = σ(yx).
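In the matrix algebra L(C²) the spectrum consists of the eigenvalues; the following computation is a supplementary illustration of ours, not from the text.

```latex
% Supplementary illustration: for the upper triangular matrix
\[
  A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} \in L(\mathbb{C}^2),
  \qquad
  \det(\lambda I - A) = (\lambda - 1)(\lambda - 2),
\]
% the element lambda*1 - A fails to be invertible exactly when this
% determinant vanishes, so
\[
  \sigma_{L(\mathbb{C}^2)}(A) = \{1, 2\} .
\]
```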
Definition D.1.8 (Ideals). Let A be an algebra. An ideal J ⊂ A (more precisely, a two-sided ideal) is a vector subspace J ≠ A satisfying
  ∀x ∈ A ∀y ∈ J : xy, yx ∈ J,
i.e., xJ, Jx ⊂ J for every x ∈ A. A maximal ideal is an ideal not contained in any other ideal.

Remark D.1.9. In some books our ideals are called proper ideals, and there an ideal is either a proper ideal or the whole algebra. In the case of (proper) ideals, the vector space A/J := {x + J | x ∈ A} becomes an algebra with the operation (x + J, y + J) → xy + J and the unit element 1_{A/J} := 1_A + J. It is evident that no proper ideal contains any invertible elements. We will drop the word “proper” since it is incorporated in Definition D.1.8.

Remark D.1.10. Let J ⊂ A be an ideal. Because x1 = x for every x ∈ A, we notice that 1 ∉ J (otherwise J = A). Therefore an invertible element x ∈ A cannot belong to an ideal (since otherwise 1 = x⁻¹x ∈ J).

Example. Let us give examples of ideals. Intuitively, an ideal of an algebra is a subspace resembling a multiplicative zero; consider the equations x0 = 0 = 0x.
1. Let A be an algebra. Then {0} ⊂ A is an ideal.
2. The only ideal of C is {0} ⊂ C.
3. Let X be a set, and ∅ ≠ S ⊂ X. Now
  I(S) := {f ∈ F(X) | ∀x ∈ S : f(x) = 0}
is an ideal of the function algebra F(X). If x ∈ X then I({x}) is a maximal ideal of F(X), because it is of co-dimension 1 in F(X). Notice that I(S) ⊂ I({x}) for every x ∈ S; an ideal may be contained in many different maximal ideals (cf. Krull’s Theorem D.1.13 in the sequel).
4. Let X be an infinite-dimensional Banach space. The set
  LC(X) := {A ∈ L(X) | A is compact}
of compact linear operators X → X is an ideal of the algebra L(X) of bounded linear operators X → X.

Definition D.1.11 (Semisimple algebras). The radical Rad(A) of an algebra A is the intersection of all the maximal ideals of A; A is called semisimple if Rad(A) = {0}.

Exercise D.1.12 (Ideals spanned by sets). Show that any intersection of ideals is an ideal.
Hence for any set S ⊂ A in an algebra A there exists a smallest possible ideal J ⊂ A such that S ⊂ J ; this J is called the ideal spanned by the set S. Theorem D.1.13 (W. Krull). Every ideal is contained in a maximal ideal.
Proof. Let J be an ideal of an algebra A. Let P be the set of those ideals of A that contain J. The inclusion relation is the natural partial order on P; the Hausdorff Maximal Principle (Theorem A.4.9) says that there is a maximal chain C ⊂ P. Let M := ⋃C. Clearly J ⊂ M. Let λ ∈ C, x, y ∈ M and z ∈ A. Then there exists I ∈ C such that x, y ∈ I, so that
  λx ∈ I ⊂ M,  x + y ∈ I ⊂ M,  xz, zx ∈ I ⊂ M;
moreover,
  1 ∈ ⋂_{I∈C} (A \ I) = A \ ⋃_{I∈C} I = A \ M,
so that M ≠ A. We have proven that M is an ideal. The maximality of the chain C implies that M is maximal. □

Lemma D.1.14. Let A be a commutative algebra and let M be an ideal. Then M is maximal if and only if [0] is the only non-invertible element of A/M.

Proof. Of course, here [x] means x + M, where x ∈ A. Assume that M is a maximal ideal. Take [x] ≠ [0], so that x ∉ M. Define J := Ax + M = {ax + m | a ∈ A, m ∈ M}. Then clearly M ⊂ J with M ≠ J (as x ∈ J \ M), and J is a vector subspace of A. If y ∈ A then
  Jy = yJ = yAx + yM ⊂ Ax + M = J,
so that either J is an ideal or J = A. But since the maximal ideal M is contained properly in J, we must have J = A. Thus there exist a ∈ A and m ∈ M such that ax + m = 1_A. Then
  [a][x] = 1_{A/M} = [x][a],
so [x] is invertible in A/M.

Conversely, assume that all the non-zero elements of A/M are invertible. Assume that J ⊂ A is an ideal containing M. Suppose J ≠ M, and pick x ∈ J \ M. Now [x] ≠ [0], so that for some y ∈ A we have [x][y] = [1_A]. Thereby, since x ∈ J,
  1_A ∈ xy + M ⊂ J + M ⊂ J + J = J,
which is a contradiction, since no ideal can contain invertible elements. Therefore we must have J = M, meaning that M is maximal. □

Definition D.1.15 (Quotient algebra). Let A be an algebra with an ideal J. For x ∈ A, let us denote
  [x] := x + J = {x + j | j ∈ J}.
Then the set A/J := {[x] | x ∈ A} can be endowed with a natural algebra structure. Indeed, let us define
  λ[x] := [λx],  [x] + [y] := [x + y],  [x][y] := [xy],  1_{A/J} := [1_A];
first of all, these operations are well defined, since if λ ∈ C and j, j₁, j₂ ∈ J then
  λ(x + j) = λx + λj ∈ [λx],
  (x + j₁) + (y + j₂) = (x + y) + (j₁ + j₂) ∈ [x + y],
  (x + j₁)(y + j₂) = xy + j₁y + xj₂ + j₁j₂ ∈ [xy].
Secondly, [1_A] = 1_A + J ≠ J = [0], because 1_A ∉ J. Moreover,
  (x + j₁)(1_A + j₂) = x + j₁ + xj₂ + j₁j₂ ∈ [x],
  (1_A + j₂)(x + j₁) = x + j₁ + j₂x + j₂j₁ ∈ [x].
Now the reader may verify that A/J is really an algebra; it is called the quotient algebra of A modulo J.

Remark D.1.16. Notice that A/J is commutative if A is commutative. Also notice that [0] = J is the zero element in the quotient algebra.

Definition D.1.17 (Algebra homomorphism). Let A and B be algebras. A mapping φ : A → B is called an algebra homomorphism (or simply a homomorphism) if it is a linear mapping satisfying φ(xy) = φ(x)φ(y) for every x, y ∈ A (multiplicativity) and φ(1_A) = 1_B. The set of all homomorphisms A → B is denoted by Hom(A, B). A bijective homomorphism φ : A → B is called an isomorphism, denoted by φ : A ≅ B.

Example. Let us give examples of algebra homomorphisms:
1. The only homomorphism C → C is the identity mapping, i.e., Hom(C, C) = {x → x}.
2. Let x ∈ X. Let us define the evaluation mapping φ_x : F(X) → C by f → f(x). Then φ_x ∈ Hom(F(X), C).
3. Let J be an ideal of an algebra A, and denote [x] = x + J. Then (x → [x]) ∈ Hom(A, A/J).
Exercise D.1.18. Let φ ∈ Hom(A, B). Show that if x ∈ A is invertible then φ(x) ∈ B is invertible, and that σ_B(φ(x)) ⊂ σ_A(x) for any x ∈ A.

Exercise D.1.19. Let A be the set of matrices
  ( α β )
  ( 0 α )    (α, β ∈ C).
Show that A is a commutative algebra. Classify (up to an isomorphism) all the two-dimensional algebras. (Hint: Prove that in a two-dimensional algebra either ∃x ≠ 0 : x² = 0 or ∃x ∉ {1, −1} : x² = 1.)

Proposition D.1.20. Let A and B be algebras, and φ ∈ Hom(A, B). Then φ(A) ⊂ B is a subalgebra, Ker(φ) := {x ∈ A | φ(x) = 0} is an ideal of A, and A/Ker(φ) ≅ φ(A).

Exercise D.1.21. Prove Proposition D.1.20.

Definition D.1.22 (Tensor product algebra). The tensor product algebra of a K-vector space V is the K-vector space
  T := ⊕_{m=0}^∞ ⊗^m V,
where ⊗⁰V := K and ⊗^{m+1}V := (⊗^m V) ⊗ V; the multiplication of this algebra is given by (x, y) → xy := x ⊗ y with the identifications W ⊗ K ≅ W ≅ K ⊗ W for a K-vector space W, so that the unit element 1_T ∈ T is the unit element 1 ∈ K.
D.2
Topological algebras
Definition D.2.1 (Topological algebra). A topological space A with the structure of an algebra is called a topological algebra if
1. {0} ⊂ A is a closed subset, and
2. the algebraic operations are continuous, i.e., the mappings
  ((λ, x) → λx) : C × A → A,
  ((x, y) → x + y) : A × A → A,
  ((x, y) → xy) : A × A → A
are continuous.

Remark D.2.2. Similarly, a topological vector space is a topological space and a vector space, in which {0} is a closed subset and the vector space operations (λ, x) → λx and (x, y) → x + y are continuous.
Remark D.2.3. Some books omit the assumption that {0} should be a closed set; then, e.g., any algebra A with the topology τ = {∅, A} would become a topological algebra. However, such generalisations are seldom useful. And it will turn out soon that our topological algebras are actually Hausdorff spaces! The requirement that {0} be closed puts emphasis on closed ideals and continuous homomorphisms, as we shall see later.

Example. Let us give examples of topological algebras:
1. The commutative algebra C endowed with its usual topology (given by the absolute value norm x → |x|) is a topological algebra.
2. If (X, x → ‖x‖) is a normed space, X ≠ {0}, then L(X) is a topological algebra with the norm
  ‖A‖ := sup_{x∈X: ‖x‖≤1} ‖Ax‖.
Notice that L(C) ≅ C, and that L(X) is non-commutative if dim(X) ≥ 2.
3. Let X be a set. Then F_b(X) := {f ∈ F(X) | f is bounded} is a commutative topological algebra with the supremum norm
  ‖f‖ := sup_{x∈X} |f(x)|.
Similarly, if X is a topological space then the algebra C_b(X) := {f ∈ C(X) | f is bounded} of bounded continuous functions on X is a commutative topological algebra when endowed with the supremum norm.
4. If (X, d) is a metric space then the algebra
  Lip(X) := {f : X → C | f is Lipschitz continuous and bounded}
is a commutative topological algebra with the norm
  ‖f‖ := max { sup_{x∈X} |f(x)|, sup_{x≠y} |f(x) − f(y)| / d(x, y) }.
5. E(R) := C∞(R) is a commutative topological algebra with the metric
  (f, g) → Σ_{m=1}^∞ 2^{−m} p_m(f − g) / (1 + p_m(f − g)),  where p_m(f) := max_{|x|≤m, k≤m} |f^{(k)}(x)|.
This algebra is not normable (cannot be endowed with a norm).
6. The topological dual E (R) of E(R) is the space of compactly supported distributions (see Definition 1.4.8). There the multiplication is the convolution, which is defined for nice enough f, g by ∞ f (x − y) g(y) dy. (f, g) → f ∗ g, (f ∗ g)(x) := −∞
The unit element of E(R) is the Dirac delta distribution δ0 at the origin 0 ∈ R. This is a commutative topological algebra with the weak∗ -topology, but it is not metrisable. 7. Convolution algebras of compactly supported distributions on Lie groups are non-metrisable topological algebras; such an algebra is commutative if and only if the group is commutative. Remark D.2.4. Let A be a topological algebra, U ⊂ A open, and S ⊂ A. Due to the continuity of ((λ, x) → λx) : C × A → A the set λU = {λu | u ∈ U } is open if λ = 0. Due to the continuity of ((x, y) → x + y) : A × A → A the set U + S = {u + s | u ∈ U, s ∈ S} is open. Exercise D.2.5. Show that topological algebras are Hausdorff spaces. Remark D.2.6. Notice that in the previous exercise you actually need only the continuities of the mappings (x, y) → x + y and x → −x, and the fact that {0} is a closed set. Indeed, the commutativity of the addition operation is not needed, so that you can actually prove a proposition “Topological groups are Hausdorff spaces”! Exercise D.2.7. Let {Aj | j ∈ J} be a family of topological algebras. Endow A := Aj with a structure of a topological algebra. j∈J
Exercise D.2.8. Let A be an algebra and a normed space. Prove that it is a topological algebra if and only if there exists a constant C < ∞ such that ‖xy‖ ≤ C‖x‖‖y‖ for every x, y ∈ A.
Exercise D.2.9. Let A be an algebra. The commutant of a subset S ⊂ A is
Γ(S) := {x ∈ A | ∀y ∈ S : xy = yx}.
Prove the following claims:
(a) Γ(S) ⊂ A is a subalgebra; Γ(S) is closed if A is a topological algebra.
(b) S ⊂ Γ(Γ(S)).
(c) If xy = yx for every x, y ∈ S then Γ(Γ(S)) ⊂ A is a commutative subalgebra, where σ_{Γ(Γ(S))}(z) = σ_A(z) for every z ∈ Γ(Γ(S)).
D.2. Topological algebras
Closed ideals
In topological algebras, the good ideals are the closed ones.
Example. Let A be a topological algebra; then {0} ⊂ A is a closed ideal. Let B be another topological algebra, and φ ∈ Hom(A, B) be continuous. Then it is easy to see that Ker(φ) = φ⁻¹({0}) ⊂ A is a closed ideal; this is actually a canonical example of closed ideals.
Proposition D.2.10. Let A be a topological algebra and J its ideal. Then either J̄ = A or J̄ ⊂ A is a closed ideal.
Proof. Let λ ∈ C, x, y ∈ J̄, and z ∈ A. Take V ∈ V(λx). Then there exists U ∈ V(x) such that λU ⊂ V (due to the continuity of the multiplication by a scalar). Since x ∈ J̄, we may pick x₀ ∈ J ∩ U. Now λx₀ ∈ J ∩ (λU) ⊂ J ∩ V, which proves that λx ∈ J̄. Next take W ∈ V(x + y). Then for some U ∈ V(x) and V ∈ V(y) we have U + V ⊂ W (due to the continuity of the mapping (x, y) → x + y). Since x, y ∈ J̄, we may pick x₀ ∈ J ∩ U and y₀ ∈ J ∩ V. Now x₀ + y₀ ∈ J ∩ (U + V) ⊂ J ∩ W, which proves that x + y ∈ J̄. Finally, we should show that xz, zx ∈ J̄, but this proof is so similar to the previous steps that it is left for the reader as an easy task.
Definition D.2.11 (Topology for quotient algebra). Let J be an ideal of a topological algebra A. Let τ be the topology of A. For x ∈ A, define [x] := x + J, and let [S] := {[x] | x ∈ S}. Then it is easy to check that {[U] | U ∈ τ} is a topology of the quotient algebra A/J; it is called the quotient topology.
Remark D.2.12. Let A be a topological algebra and J an ideal in A. The quotient map (x → [x]) ∈ Hom(A, A/J) is continuous: namely, if x ∈ A and [V] ∈ V_{A/J}([x]) for some V ∈ τ then U := V + J ∈ V(x) and [U] = [V].
Lemma D.2.13. Let J be an ideal of a topological algebra A. Then the algebra operations on the quotient algebra A/J are continuous.
Proof. Let us check the continuity of the multiplication in the quotient algebra: Suppose [x][y] = [xy] ∈ [W], where W ⊂ A is an open set (recall that every open set in the quotient algebra is of the form [W]).
Then xy ∈ W + J . Since A is a topological algebra, there are open sets U ∈ VA (x) and V ∈ VA (y) satisfying UV ⊂ W + J .
Now [U ] ∈ VA/J ([x]) and [V ] ∈ VA/J ([y]). Furthermore, [U ][V ] ⊂ [W ] because (U + J )(V + J ) ⊂ U V + J ⊂ W + J ; we have proven the continuity of the multiplication ([x], [y]) → [x][y]. As an easy exercise, we leave it for the reader to verify the continuity of the mappings (λ, [x]) → λ[x] and ([x], [y]) → [x] + [y]. Exercise D.2.14. Complete the previous proof by showing the continuity of the mappings (λ, [x]) → λ[x] and ([x], [y]) → [x] + [y]. With Lemma D.2.13, we conclude: Proposition D.2.15. Let J be an ideal of a topological algebra A. Then A/J is a topological algebra if and only if J is closed. Proof. If the quotient algebra is a topological algebra then {[0]} = {J } is a closed subset of A/J ; since the quotient homomorphism is a continuous mapping, J = Ker(x → [x]) ⊂ A must be a closed set. Conversely, suppose J is a closed ideal of a topological algebra A. Then we deduce that (A/J ) \ {[0]} = [A \ J ] is an open subset of the quotient algebra, so that {[0]} ⊂ A/J is closed.
Remark D.2.16. Let X be a topological vector space and M its subspace. The reader should be able to define the quotient topology for the quotient vector space X/M = {[x] := x + M | x ∈ X}. Now X/M is a topological vector space if and only if M is a closed subspace. Let M ⊂ X be a closed subspace. If d is a metric on X then there is a natural metric for X/M:
([x], [y]) → d([x], [y]) := inf_{z∈M} d(x − y, z),
and if X is a complete metric space then X/M is also complete. Moreover, if x → ‖x‖ is a norm on X then there is a natural norm for X/M:
[x] → ‖[x]‖ := inf_{z∈M} ‖x − z‖.
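The quotient norm ‖[x]‖ = inf_{z∈M} ‖x − z‖ can be made concrete in a finite-dimensional example. A sketch, assuming the Euclidean norm on R² (any norm would do in the text; the vectors are illustrative data), where the infimum is attained by orthogonal projection onto the subspace M:

```python
import numpy as np

# Quotient norm sketch: X = R^2 with the Euclidean norm, M = span{v}.
# ||[x]|| = inf_{z in M} ||x - z|| = distance from x to the line M,
# computed via orthogonal projection.
v = np.array([1.0, 1.0]) / np.sqrt(2.0)    # unit vector spanning M
x = np.array([3.0, 1.0])

proj = np.dot(x, v) * v                    # orthogonal projection onto M
quotient_norm = np.linalg.norm(x - proj)   # the infimum is attained here

# Brute-force check of the infimum over a grid of points z = t*v in M:
ts = np.linspace(-10.0, 10.0, 20001)
grid_min = min(np.linalg.norm(x - t * v) for t in ts)
assert abs(quotient_norm - grid_min) < 1e-3
```

For this data the quotient norm equals √2, the distance from (3, 1) to the diagonal line.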
D.3 Banach algebras
Definition D.3.1 (Banach algebra). An algebra A is called a Banach algebra if it is a Banach space satisfying ‖xy‖ ≤ ‖x‖‖y‖ for all x, y ∈ A and ‖1‖ = 1.
Exercise D.3.2. Let K be a compact space. Show that C(K) is a Banach algebra with the norm f → ‖f‖ = max_{x∈K} |f(x)|.
Example. Let X be a Banach space. Then the Banach space L(X) of bounded linear operators X → X is a Banach algebra if the multiplication is the composition of operators, since ‖AB‖ ≤ ‖A‖‖B‖ for every A, B ∈ L(X); the unit is the identity operator I : X → X, x → x. Actually, this is not far away from characterising all the Banach algebras:
Theorem D.3.3 (Characterisation of Banach algebras). A Banach algebra A is isometrically isomorphic to a norm-closed subalgebra of L(X) for a Banach space X.
Proof. Here X := A. For x ∈ A, let us define m(x) : A → A by m(x)y := xy. Obviously m(x) is a linear mapping, m(xy) = m(x)m(y), m(1_A) = 1_{L(A)}, and
‖m(x)‖ = sup_{y∈A: ‖y‖≤1} ‖xy‖ ≤ sup_{y∈A: ‖y‖≤1} ‖x‖‖y‖ ≤ ‖x‖ = ‖m(x)1_A‖ ≤ ‖m(x)‖‖1_A‖ = ‖m(x)‖;
briefly, m = (x → m(x)) ∈ Hom(A, L(A)) is isometric. Thereby m(A) ⊂ L(A) is a closed subspace and hence a Banach algebra.
Proposition D.3.4. A maximal ideal in a Banach algebra is closed.
Proof. In a topological algebra, the closure of an ideal is either an ideal or the whole algebra. Let M be a maximal ideal of a Banach algebra A. The set G(A) ⊂ A of all invertible elements is open, and M ∩ G(A) = ∅ (because no proper ideal contains invertible elements). Thus M ⊂ M̄ ⊂ A \ G(A), so that M̄ is an ideal containing the maximal ideal M; thus M̄ = M.
Proposition D.3.5. Let J be a closed ideal of a Banach algebra A. Then the quotient vector space A/J is a Banach algebra; moreover, A/J is commutative if A is commutative.
Proof. Let us denote [x] := x + J for x ∈ A. Since J is a closed vector subspace, the quotient space A/J is a Banach space with norm [x] → ‖[x]‖ = inf_{j∈J} ‖x + j‖.
Let x, y ∈ A and ε > 0. Then there exist i, j ∈ J such that ‖x + i‖ ≤ ‖[x]‖ + ε and ‖y + j‖ ≤ ‖[y]‖ + ε.
Now (x + i)(y + j) ∈ [xy], so that
‖[xy]‖ ≤ ‖(x + i)(y + j)‖ ≤ ‖x + i‖‖y + j‖ ≤ (‖[x]‖ + ε)(‖[y]‖ + ε) = ‖[x]‖‖[y]‖ + ε(‖[x]‖ + ‖[y]‖ + ε);
since ε > 0 is arbitrary, we have ‖[x][y]‖ ≤ ‖[x]‖‖[y]‖. Finally, ‖[1]‖ ≤ ‖1‖ = 1 and ‖[x]‖ = ‖[x][1]‖ ≤ ‖[x]‖‖[1]‖, so that we have ‖[1]‖ = 1.
Exercise D.3.6. Let A be a Banach algebra, and let x, y ∈ A satisfy x² = x, y² = y and xy = yx. Show that either x = y or ‖x − y‖ ≥ 1. Give an example of a Banach algebra A with elements x, y ∈ A such that x² = x ≠ y = y² and ‖x − y‖ < 1.
Proposition D.3.7. Let A be a Banach algebra. Then Hom(A, C) ⊂ A′ and ‖φ‖ = 1 for every φ ∈ Hom(A, C).
Proof. Let x ∈ A, ‖x‖ < 1. Let y_n := Σ_{j=0}^n x^j, where x⁰ := 1. If n > m then
‖y_n − y_m‖ = ‖x^{m+1} + ··· + xⁿ‖ ≤ ‖x‖^{m+1}(1 + ‖x‖ + ··· + ‖x‖^{n−m−1}) = ‖x‖^{m+1} (1 − ‖x‖^{n−m})/(1 − ‖x‖) → 0 as n > m → ∞;
thus (y_n)_{n=1}^∞ ⊂ A is a Cauchy sequence. There exists y = lim_{n→∞} y_n ∈ A, because A is complete. Since xⁿ → 0 and y_n(1 − x) = 1 − x^{n+1} = (1 − x)y_n, we deduce y = (1 − x)⁻¹. Suppose λ = φ(x) with |λ| > ‖x‖; now ‖λ⁻¹x‖ = |λ|⁻¹‖x‖ < 1, so that 1 − λ⁻¹x is invertible. Then
1 = φ(1) = φ((1 − λ⁻¹x)(1 − λ⁻¹x)⁻¹) = φ(1 − λ⁻¹x) φ((1 − λ⁻¹x)⁻¹) = (1 − λ⁻¹φ(x)) φ((1 − λ⁻¹x)⁻¹) = 0,
a contradiction; hence |φ(x)| ≤ ‖x‖ for all x ∈ A, that is, ‖φ‖ ≤ 1. Finally, φ(1) = 1, so that ‖φ‖ = 1.
Lemma D.3.8. Let A be a Banach algebra. The set G(A) ⊂ A of its invertible elements is open. The mapping (x → x⁻¹) : G(A) → G(A) is a homeomorphism.
Proof. Take x ∈ G(A) and h ∈ A. As in the proof of the previous proposition, we see that x − h = x(1 − x⁻¹h) is invertible if ‖x⁻¹h‖ < 1, in particular if ‖h‖ < ‖x⁻¹‖⁻¹; thus G(A) ⊂ A is open. The mapping x → x⁻¹ is clearly its own inverse. Moreover,
‖(x − h)⁻¹ − x⁻¹‖ = ‖(1 − x⁻¹h)⁻¹x⁻¹ − x⁻¹‖ ≤ ‖(1 − x⁻¹h)⁻¹ − 1‖‖x⁻¹‖ = ‖Σ_{n=1}^∞ (x⁻¹h)ⁿ‖‖x⁻¹‖ ≤ ‖h‖ Σ_{n=1}^∞ ‖x⁻¹‖^{n+1}‖h‖^{n−1} → 0 as h → 0;
hence x → x⁻¹ is a homeomorphism.
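The Neumann series (1 − x)⁻¹ = Σ_{n≥0} xⁿ for ‖x‖ < 1, used in the two proofs above, can be verified numerically in the Banach algebra of matrices with the operator norm. A sketch (the matrix is illustrative data chosen with ‖h‖ < 1):

```python
import numpy as np

# Neumann series sketch: if ||h|| < 1 (matrix 2-norm), then
# (I - h)^{-1} = sum_{n >= 0} h^n.
h = np.array([[0.2, 0.1],
              [0.0, 0.3]])
assert np.linalg.norm(h, 2) < 1            # convergence condition

I = np.eye(2)
partial = np.zeros((2, 2))
term = np.eye(2)                           # h^0 = I
for n in range(60):                        # partial sum of the series
    partial = partial + term
    term = term @ h

inv_direct = np.linalg.inv(I - h)
assert np.allclose(partial, inv_direct)    # series sum = the inverse
```

Sixty terms suffice here because ‖h‖ⁿ decays geometrically, mirroring the Cauchy-sequence estimate in the proof of Proposition D.3.7.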
Exercise D.3.9 (Topological zero divisors). Let A be a Banach algebra. We say that x ∈ A is a topological zero divisor if there exists a sequence (y_n)_{n=1}^∞ ⊂ A such that ‖y_n‖ = 1 for all n and
lim_{n→∞} x y_n = 0 = lim_{n→∞} y_n x.
(a) Show that if (x_n)_{n=1}^∞ ⊂ G(A) satisfies x_n → x ∈ ∂G(A) then ‖x_n⁻¹‖ → ∞.
(b) Using this result, show that the boundary points of G(A) are topological zero divisors.
(c) In what kind of Banach algebras is 0 the only topological zero divisor?
Theorem D.3.10 (Gelfand, 1939). Let A be a Banach algebra and x ∈ A. Then the spectrum σ(x) ⊂ C is a non-empty compact set.
Proof. Let x ∈ A. Then σ(x) belongs to the 0-centred disc of radius ‖x‖ in the complex plane: for if λ ∈ C, |λ| > ‖x‖ then 1 − λ⁻¹x is invertible, equivalently λ1 − x is invertible. The mapping g : C → A, λ → λ1 − x, is continuous; the set G(A) ⊂ A of invertible elements is open, so that C \ σ(x) = g⁻¹(G(A)) is open. Thus σ(x) ⊂ C is closed and bounded, i.e., compact by the Heine–Borel Theorem (Corollary A.13.7).
The hard part is to prove the non-emptiness of the spectrum. Let us define the resolvent mapping R : C \ σ(x) → G(A) by R(λ) := (λ1 − x)⁻¹. We know that this mapping is continuous, because it is composed of the continuous mappings (λ → λ1 − x) : C \ σ(x) → G(A) and (y → y⁻¹) : G(A) → G(A). We want to show that R is weakly holomorphic, that is, f ◦ R ∈ H(C \ σ(x)) for every f ∈ A′ = L(A, C). Let z ∈ C \ σ(x) and f ∈ A′. Since R(z)⁻¹ = R(z + h)⁻¹ − h1, we calculate
((f ◦ R)(z + h) − (f ◦ R)(z))/h = f((R(z + h) − R(z))/h) = f((R(z + h)R(z)⁻¹ − 1)R(z)/h) = f((R(z + h)(R(z + h)⁻¹ − h1) − 1)R(z)/h) = f(−R(z + h)R(z)) → f(−R(z)²) as h → 0,
because f and R are continuous; thus R is weakly holomorphic. Suppose |λ| > ‖x‖. Then
‖R(λ)‖ = ‖(λ1 − x)⁻¹‖ = |λ|⁻¹‖(1 − x/λ)⁻¹‖ = |λ|⁻¹‖Σ_{j=0}^∞ (x/λ)^j‖ ≤ |λ|⁻¹ Σ_{j=0}^∞ ‖x/λ‖^j = |λ|⁻¹ · 1/(1 − ‖x/λ‖) = 1/(|λ| − ‖x‖) → 0 as |λ| → ∞.
Thereby (f ◦ R)(λ) → 0 as |λ| → ∞
for every f ∈ A′. To get a contradiction, suppose σ(x) = ∅. Then f ◦ R ∈ H(C) is 0 by Liouville's Theorem D.6.2 for every f ∈ A′; the Hahn–Banach Theorem B.4.25 says that then R(λ) = 0 for every λ ∈ C; this is a contradiction, since R(λ) ∈ G(A) and 0 ∉ G(A). Thus σ(x) ≠ ∅.
Exercise D.3.11. Let A be a Banach algebra, x ∈ A, Ω ⊂ C an open set, and σ(x) ⊂ Ω. Then ∃δ > 0 ∀y ∈ A : ‖y‖ < δ ⇒ σ(x + y) ⊂ Ω.
Exercise D.3.12. Alternatively, in the proof of Theorem D.3.10 one could use the Neumann series
R(λ) = R(λ₀) Σ_{k=0}^∞ ((λ₀ − λ)R(λ₀))^k,
for all λ₀ ∈ C \ σ(x) and |λ − λ₀|‖R(λ₀)‖ < 1. Then R(λ) is analytic in C \ σ(x) and satisfies R(λ) → 0 as λ → ∞. Consequently, use Liouville's theorem (Theorem D.6.2) to conclude the statement.
Corollary D.3.13 (Gelfand–Mazur Theorem). Let A be a Banach algebra where 0 ∈ A is the only non-invertible element. Then A is isometrically isomorphic to C.
Proof. Take x ∈ A, x ≠ 0. Since σ(x) ≠ ∅, pick λ(x) ∈ σ(x). Then λ(x)1 − x is non-invertible, so that it must be 0; x = λ(x)1. By defining λ(0) := 0, we obtain an algebra isomorphism λ : A → C. Moreover, |λ(x)| = ‖λ(x)1‖ = ‖x‖.
Exercise D.3.14. Let A be a Banach algebra, and suppose that there exists C < ∞ such that ‖x‖‖y‖ ≤ C‖xy‖ for every x, y ∈ A. Show that A ≅ C isometrically.
Definition D.3.15 (Spectral radius). Let A be a Banach algebra. The spectral radius of x ∈ A is
ρ(x) := sup_{λ∈σ(x)} |λ|;
this is well defined, because due to Gelfand's Theorem D.3.10 the spectrum is non-empty. In other words, D̄(0, ρ(x)) = {λ ∈ C : |λ| ≤ ρ(x)} is the smallest 0-centred closed disk containing σ(x) ⊂ C. Notice that ρ(x) ≤ ‖x‖, since λ1 − x = λ(1 − x/λ) is invertible if |λ| > ‖x‖.
Theorem D.3.16 (Spectral Radius Formula (Beurling, 1938; Gelfand, 1939)). Let A be a Banach algebra, x ∈ A. Then
ρ(x) = lim_{n→∞} ‖xⁿ‖^{1/n}.
Proof. For x = 0 the claim is trivial, so let us assume that x ≠ 0. By Gelfand's Theorem D.3.10, σ(x) ≠ ∅. Let λ ∈ σ(x) and n ≥ 1. Notice that in an algebra, if both ab and ba are invertible then the elements a, b are invertible. Therefore
λⁿ1 − xⁿ = (λ1 − x) Σ_{k=0}^{n−1} λ^{n−1−k} x^k = (Σ_{k=0}^{n−1} λ^{n−1−k} x^k)(λ1 − x)
implies that λⁿ ∈ σ(xⁿ). Thus |λⁿ| ≤ ‖xⁿ‖, so that
ρ(x) = sup_{λ∈σ(x)} |λ| ≤ lim inf_{n→∞} ‖xⁿ‖^{1/n}.
Let f ∈ A′ and λ ∈ C, |λ| > ‖x‖. Then
f(R(λ)) = f((λ1 − x)⁻¹) = f(λ⁻¹(1 − λ⁻¹x)⁻¹) = f(λ⁻¹ Σ_{n=0}^∞ λ⁻ⁿxⁿ) = λ⁻¹ Σ_{n=0}^∞ f(λ⁻ⁿxⁿ).
This formula is true also when |λ| > ρ(x), because f ◦ R is holomorphic in C \ σ(x) ⊃ C \ D̄(0, ρ(x)). Hence if we define T_{λ,x,n} ∈ A″ = L(A′, C) by T_{λ,x,n}(f) := f(λ⁻ⁿxⁿ), we obtain
sup_{n∈N} |T_{λ,x,n}(f)| = sup_{n∈N} |f(λ⁻ⁿxⁿ)| < ∞   (when |λ| > ρ(x))
for every f ∈ A′; the Banach–Steinhaus Theorem B.4.29 applied to the family {T_{λ,x,n}}_{n∈N} shows that M_{λ,x} := sup_{n∈N} ‖T_{λ,x,n}‖ < ∞, so that we have
‖λ⁻ⁿxⁿ‖ = sup_{f∈A′: ‖f‖≤1} |f(λ⁻ⁿxⁿ)|   (by Hahn–Banach)
= sup_{f∈A′: ‖f‖≤1} |T_{λ,x,n}(f)| = ‖T_{λ,x,n}‖ ≤ M_{λ,x}.
Hence
‖xⁿ‖^{1/n} ≤ M_{λ,x}^{1/n} |λ| → |λ| as n → ∞,
when |λ| > ρ(x). Thus
lim sup_{n→∞} ‖xⁿ‖^{1/n} ≤ ρ(x);
collecting the results, the Spectral Radius Formula is verified.
Remark D.3.17. The Spectral Radius Formula contains startling information: the spectral radius ρ(x) is a purely algebraic quantity (though related to a topological algebra), but the quantity lim_{n→∞} ‖xⁿ‖^{1/n} relies on both algebraic and metric properties! Yet the two are equal!
Remark D.3.18. ρ(x)⁻¹ is the radius of convergence of the A-valued power series λ → Σ_{n=0}^∞ λⁿxⁿ.
Remark D.3.19. Let A be a Banach algebra and B a Banach subalgebra. If x ∈ B then σ_A(x) ⊂ σ_B(x) and the inclusion can be proper, but the spectral radii for both Banach algebras are the same, since ρ_A(x) = lim_{n→∞} ‖xⁿ‖^{1/n} = ρ_B(x).
Exercise D.3.20. Let A be a Banach algebra, x, y ∈ A. Show that ρ(xy) = ρ(yx). Show that if x ∈ A is nilpotent (i.e., x^k = 0 for some k ∈ N) then σ(x) = {0}.
Exercise D.3.21. Let A be a Banach algebra and x, y ∈ A such that xy = yx. Prove that ρ(xy) ≤ ρ(x)ρ(y).
Exercise D.3.22. In the proof of Theorem D.3.16 argue as follows. For |λ| > ‖x‖ note that the resolvent satisfies
R(λ) = λ⁻¹ Σ_{k=0}^∞ (λ⁻¹x)^k,
and this Laurent series converges for all |λ| > ρ(x). Consequently, its (Hadamard) radius of convergence satisfies ρ(x) ≤ lim inf_{n→∞} ‖xⁿ‖^{1/n}. At the same time, the convergence for |λ| > ρ(x) implies lim_{n→∞} λ⁻ⁿxⁿ = 0, which means that ‖xⁿ‖ ≤ |λ|ⁿ for large enough n.
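The Spectral Radius Formula can be watched converging numerically in the Banach algebra of matrices with the operator 2-norm, where ρ(x) is the largest eigenvalue modulus. A sketch with an illustrative non-normal matrix, chosen so that ρ(x) = 0.5 is far below ‖x‖ ≈ 10 and the limit is genuinely informative:

```python
import numpy as np

# Spectral Radius Formula sketch: rho(x) = lim_n ||x^n||^{1/n}.
x = np.array([[0.5, 10.0],
              [0.0,  0.5]])

rho = max(abs(np.linalg.eigvals(x)))       # spectral radius: 0.5
norm_x = np.linalg.norm(x, 2)              # operator norm: about 10
assert rho < norm_x                        # rho(x) <= ||x||, here strictly

n = 50
approx = np.linalg.norm(np.linalg.matrix_power(x, n), 2) ** (1.0 / n)
assert abs(approx - rho) < 0.1             # ||x^50||^{1/50} is near 0.5
```

For a nilpotent matrix (Exercise D.3.20) the same computation gives ‖xⁿ‖ = 0 for large n, consistent with σ(x) = {0}.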
D.4
Commutative Banach algebras
In this section we are interested in maximal ideals of commutative Banach algebras. We shall learn that such algebras are closely related to algebras of continuous functions on compact Hausdorff spaces: there is a natural, far from trivial, homomorphism from a commutative Banach algebra A to an algebra of functions on the set Hom(A, C), which can be endowed with a canonical topology – related mathematics is called the Gelfand theory. In the sequel, one should ponder this dilemma: which is more fundamental, a space or algebras of functions on it?
Example. Let us give examples of commutative Banach algebras:
1. Our familiar C(K), when K is a compact space.
2. L∞([0, 1]), when [0, 1] is endowed with the Lebesgue measure.
3. A(Ω) := C(Ω̄) ∩ H(Ω), when Ω ⊂ C is open and bounded (so that Ω̄ ⊂ C is compact), and H(Ω) are the holomorphic functions in Ω.
4. M(Rⁿ), the convolution algebra of complex Borel measures on Rⁿ, with the Dirac delta distribution at 0 ∈ Rⁿ as the unit element, and endowed with the total variation norm.
5. The algebra of matrices [ α β ; 0 α ], where α, β ∈ C; notice that this algebra contains nilpotent elements!
Definition D.4.1 (Spectrum and characters of an algebra). The spectrum of an algebra A is Spec(A) := Hom(A, C), i.e., the set of homomorphisms A → C; such a homomorphism is called a character of A.
Remark D.4.2. The concept of spectrum is best suited for commutative algebras, as C is a commutative algebra; here a character A → C should actually be considered as an algebra representation A → L(C). In order to fully capture the structure of a non-commutative algebra, we should study representations of type A → L(X), where the vector spaces X are multi-dimensional; for instance, if H is a Hilbert space of dimension 2 or greater then Spec(L(H)) = ∅. However, the spectrum of a commutative Banach algebra is rich, as there is a bijective correspondence between characters and maximal ideals. Moreover, the spectrum of the algebra is akin to the spectra of its elements:
Theorem D.4.3 (Gelfand, 1940). Let A be a commutative Banach algebra. Then:
(a) Every maximal ideal of A is of the form Ker(h) for some h ∈ Spec(A);
(b) Ker(h) is a maximal ideal for every h ∈ Spec(A);
(c) x ∈ A is invertible if and only if ∀h ∈ Spec(A) : h(x) ≠ 0;
(d) x ∈ A is invertible if and only if it is not in any proper ideal of A;
(e) σ(x) = {h(x) | h ∈ Spec(A)}.
Proof. (a) Let M ⊂ A be a maximal ideal; let [x] := x + M for x ∈ A. Since A is commutative and M is maximal, every non-zero element in the quotient algebra A/M is invertible. We know that M is closed, so that A/M is a Banach algebra. Due to the Gelfand–Mazur Theorem (Corollary D.3.13), there exists an isometric isomorphism λ ∈ Hom(A/M, C). Then h = (x → λ([x])) : A → C
is a character, and Ker(h) = Ker((x → [x]) : A → A/M) = M.
(b) Let h : A → C be a character. Now h is a linear mapping, so that the co-dimension of Ker(h) in A equals the dimension of h(A) ⊂ C, which clearly is 1. Any ideal of co-dimension 1 in an algebra must be maximal, so that Ker(h) is maximal.
(c) If x ∈ A is invertible and h ∈ Spec(A) then h(x) ∈ C is invertible, that is h(x) ≠ 0. For the converse, assume that x ∈ A is non-invertible. Then Ax = {ax | a ∈ A} is an ideal of A (notice that 1 = ax = xa would mean that a = x⁻¹). Hence by Krull's Theorem D.1.13, there is a maximal ideal M ⊂ A such that Ax ⊂ M. Then (a) provides a character h ∈ Spec(A) for which Ker(h) = M. Especially, h(x) = 0.
(d) This follows from (a), (b) and (c) directly.
(e) Claim (c) is equivalent to "x ∈ A is non-invertible if and only if ∃h ∈ Spec(A) : h(x) = 0", which is equivalent to "λ1 − x is non-invertible if and only if ∃h ∈ Spec(A) : h(x) = λ".
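For the finite-dimensional commutative algebra F(X) of all functions on a finite set X (pointwise operations), the content of Theorem D.4.3 can be checked by hand: the characters are exactly the point evaluations hₓ(f) = f(x), so σ(f) is the set of values of f, and f is invertible precisely when it vanishes nowhere. A small sketch (the set and function are illustrative data):

```python
# Characters of F(X) for a finite set X are the point evaluations
# h_x(f) = f(x); hence sigma(f) = {h(f) : h in Spec(F(X))} = values of f,
# and f is invertible iff 0 is not among its values (Theorem D.4.3 (c), (e)).
X = ["a", "b", "c"]                       # illustrative finite set
f = {"a": 2.0, "b": -1.0, "c": 3.0}       # an element of F(X)

characters = [lambda g, x=x: g[x] for x in X]   # point evaluations

spectrum = {h(f) for h in characters}
assert spectrum == {2.0, -1.0, 3.0}       # sigma(f) = range of f

invertible = all(h(f) != 0 for h in characters)
if invertible:                            # pointwise inverse 1/f
    f_inv = {x: 1.0 / f[x] for x in X}
    assert all(abs(f[x] * f_inv[x] - 1.0) < 1e-12 for x in X)
```

This is also the answer, in computational form, to Exercise D.4.9 below: the Gelfand transform of F(X) is essentially the identity.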
Exercise D.4.4. Let A be a Banach algebra and x, y ∈ A such that xy = yx. Prove that σ(x + y) ⊂ σ(x) + σ(y) and σ(xy) ⊂ σ(x)σ(y).
Exercise D.4.5. Let A be the algebra of those functions f : R → C of the form
f(x) = Σ_{n∈Z} f̂(n) e^{inx}, with ‖f‖ := Σ_{n∈Z} |f̂(n)| < ∞.
Show that A is a commutative Banach algebra. Show that if f ∈ A and f(x) ≠ 0 for all x ∈ R then 1/f ∈ A.
Definition D.4.6 (Gelfand transform). Let A be a commutative Banach algebra. The Gelfand transform x̂ of an element x ∈ A is the function
x̂ : Spec(A) → C,  x̂(φ) := φ(x).
Let Â := {x̂ : Spec(A) → C | x ∈ A}. The mapping
A → Â,  x → x̂,
is called the Gelfand transform of A. We endow the set Spec(A) with the Â-induced topology, called the Gelfand topology; this topological space is called the
maximal ideal space of A (for a good reason, in the light of the previous theorem). In other words, the Gelfand topology is the weakest topology on Spec(A) making every x̂ a continuous function, i.e., the weakest topology on Spec(A) for which Â ⊂ C(Spec(A)).
Theorem D.4.7 (Gelfand, 1940). Let A be a commutative Banach algebra. Then K = Spec(A) is a compact Hausdorff space in the Gelfand topology, the Gelfand transform is a continuous homomorphism A → C(K), and ‖x̂‖ = sup_{φ∈K} |x̂(φ)| = ρ(x) for every x ∈ A.
Proof. The Gelfand transform is a homomorphism, since
(λx)ˆ(φ) = φ(λx) = λφ(x) = λx̂(φ) = (λx̂)(φ),
(x + y)ˆ(φ) = φ(x + y) = φ(x) + φ(y) = x̂(φ) + ŷ(φ) = (x̂ + ŷ)(φ),
(xy)ˆ(φ) = φ(xy) = φ(x)φ(y) = x̂(φ)ŷ(φ) = (x̂ŷ)(φ),
1̂_A(φ) = φ(1_A) = 1 = 1_{C(K)}(φ),
for every λ ∈ C, x, y ∈ A and φ ∈ K. Moreover,
x̂(K) = {x̂(φ) | φ ∈ K} = {φ(x) | φ ∈ Spec(A)} = σ(x),
implying ‖x̂‖ = ρ(x) ≤ ‖x‖. Clearly K is a Hausdorff space. What about compactness? Now K = Hom(A, C) is a subset of the closed unit ball of the dual Banach space A′; by the Banach–Alaoglu Theorem B.4.36, this unit ball is compact in the weak∗-topology. Recall that the weak∗-topology of A′ is the A-induced topology, with the interpretation A ⊂ A″; thus the Gelfand topology τ_K is the relative weak∗-topology. To prove that τ_K is compact, it is sufficient to show that K ⊂ A′ is closed in the weak∗-topology. Let f ∈ A′ be in the weak∗-closure of K. We have to prove that f ∈ K, i.e.,
f(xy) = f(x)f(y)
and
f (1) = 1.
Let x, y ∈ A, ε > 0. Let S := {1, x, y, xy}. Using the notation of the proof of the Banach–Alaoglu Theorem B.4.36, U (f, S, ε) = {ψ ∈ A : z ∈ S ⇒ |ψz − f z| < ε} is a weak∗ -neighbourhood of f . Thus choose hε ∈ K ∩ U (f, S, ε). Then |1 − f (1)| = |hε (1) − f (1)| < ε;
ε > 0 being arbitrary, we have f(1) = 1. Noticing that |hε(x)| ≤ ‖x‖, we get
|f(xy) − f(x)f(y)| ≤ |f(xy) − hε(xy)| + |hε(xy) − hε(x)f(y)| + |hε(x)f(y) − f(x)f(y)|
= |f(xy) − hε(xy)| + |hε(x)| · |hε(y) − f(y)| + |hε(x) − f(x)| · |f(y)|
≤ ε (1 + ‖x‖ + |f(y)|).
This holds for every ε > 0, so that actually f(xy) = f(x)f(y); we have proven that f is a homomorphism, f ∈ K.
Exercise D.4.8 (Radicals). Let A be a commutative Banach algebra. Its radical Rad(A) is the intersection of all the maximal ideals of A. Show that
Rad(A) = Ker(x → x̂) = {x ∈ A | ρ(x) = 0},
where x → x̂ is the Gelfand transform. Show that nilpotent elements of A belong to the radical.
Exercise D.4.9. Let X be a finite set. Describe the Gelfand transform of F(X).
Exercise D.4.10. Describe the Gelfand transform of the algebra of matrices [ α β ; 0 α ], where α, β ∈ C.
Theorem D.4.11 (When is Spec(C(X)) homeomorphic to X?). Let X be a compact Hausdorff space. Then Spec(C(X)) is homeomorphic to X.
Proof. For x ∈ X, let us define the function
hx : C(X) → C,
f → f (x) (evaluation at x ∈ X).
This is clearly a homomorphism, and hence we may define the mapping φ : X → Spec(C(X)),
x → hx .
Let us prove that φ is a homeomorphism. If x, y ∈ X, x ≠ y, then Urysohn's Lemma (Theorem A.12.11) provides f ∈ C(X) such that f(x) ≠ f(y). Thereby hx(f) ≠ hy(f), yielding φ(x) = hx ≠ hy = φ(y); thus φ is injective. It is also surjective: namely, let us assume that h ∈ Spec(C(X)) \ φ(X). Now Ker(h) ⊂ C(X) is a maximal ideal, and for every x ∈ X we may choose fx ∈ Ker(h) \ Ker(hx) ⊂ C(X). Then Ux := fx⁻¹(C \ {0}) ∈ V(x), so that
U = {Ux | x ∈ X}
is an open cover of X, which due to the compactness has a finite subcover {Ux_j}_{j=1}^n ⊂ U. Since fx_j ∈ Ker(h), the function
f := Σ_{j=1}^n |fx_j|² = Σ_{j=1}^n \overline{fx_j} fx_j
belongs to Ker(h). Clearly f(x) ≠ 0 for every x ∈ X. Therefore g ∈ C(X) with g(x) = 1/f(x) is the inverse element of f; this is a contradiction, since no invertible element belongs to an ideal. Thus φ must be surjective.
We have proven that φ : X → Spec(C(X)) is a bijection. Thereby X and Spec(C(X)) can be identified as sets. The Gelfand topology of Spec(C(X)) is then identified with the C(X)-induced topology σ of X, which is weaker than the original topology τ of X. Hence φ : (X, τ) → Spec(C(X)) is continuous. Actually, σ = τ, because a continuous bijection from a compact space to a Hausdorff space is a homeomorphism, see Proposition A.12.7.
Corollary D.4.12. Let X and Y be compact Hausdorff spaces. Then the Banach algebras C(X) and C(Y) are isomorphic if and only if X is homeomorphic to Y.
Proof. By Theorem D.4.11, X ≅ Spec(C(X)) and Y ≅ Spec(C(Y)). If C(X) and C(Y) are isomorphic Banach algebras then
X ≅ Spec(C(X)) ≅ Spec(C(Y)) ≅ Y.
Conversely, a homeomorphism φ : X → Y begets a Banach algebra isomorphism Φ : C(Y ) → C(X), (Φf )(x) := f (φ(x)), as the reader easily verifies.
Exercise D.4.13. Let K be a compact Hausdorff space, ∅ ≠ S ⊂ K, and J ⊂ C(K) be an ideal. Let us define
I(S) := {f ∈ C(K) | ∀x ∈ S : f(x) = 0},
V(J) := {x ∈ K | ∀f ∈ J : f(x) = 0}.
Prove that:
(a) I(S) ⊂ C(K) is a closed ideal,
(b) V(J) ⊂ K is a closed non-empty subset,
(c) V(I(S)) = S̄ (hint: Urysohn), and
(d) I(V(J)) = J̄.
Lesson to be learned: topology of K goes hand in hand with the (closed) ideal structure of C(K).
D.5 C∗ -algebras Now we are finally in the position to abstractly characterise algebras C(X) among Banach algebras: according to Gelfand and Naimark, the category of compact Hausdorff spaces is equivalent to the category of commutative C∗ -algebras. The class of C∗ -algebras behaves nicely, and the related functional analysis adequately deserves the name “non-commutative topology”. Definition D.5.1 (Involutive algebra). An algebra A is a ∗-algebra (“star-algebra” or an involutive algebra) if there is a mapping (x → x∗ ) : A → A satisfying (λx)∗ = λx∗ ,
(x + y)∗ = x∗ + y ∗ ,
(xy)∗ = y ∗ x∗ ,
(x∗ )∗ = x,
for all x, y ∈ A and λ ∈ C; such a mapping is called an involution. In other words, an involution is a conjugate-linear anti-multiplicative self-invertible mapping A → A. A ∗-homomorphism φ : A → B between involutive algebras A and B is an algebra homomorphism satisfying φ(x∗) = φ(x)∗ for every x ∈ A. The set of all ∗-homomorphisms between ∗-algebras A and B is denoted by Hom∗(A, B).
Definition D.5.2 (C∗-algebra). A C∗-algebra A is an involutive Banach algebra such that ‖x∗x‖ = ‖x‖² for every x ∈ A.
Example. Let us consider some involutive algebras:
1. The Banach algebra C is a C∗-algebra with the involution λ → λ∗ = λ̄, i.e., the complex conjugation.
2. If K is a compact space then C(K) is a commutative C∗-algebra with the involution f → f∗ by complex conjugation, f∗(x) := \overline{f(x)}.
3. L∞([0, 1]) is a C∗-algebra, when the involution is as above.
4. A(D(0, 1)) = C(D̄(0, 1)) ∩ H(D(0, 1)) is an involutive Banach algebra with f∗(z) := \overline{f(z̄)}, but it is not a C∗-algebra. Here H(D(0, 1)) are the functions holomorphic in the unit disc.
5. The radical of a commutative C∗-algebra is always the trivial ideal {0}, and thus 0 is the only nilpotent element. Hence for instance the algebra of matrices [ α β ; 0 α ] (where α, β ∈ C) cannot be a C∗-algebra.
6. If H is a Hilbert space then L(H) is a C∗-algebra when the involution is the usual adjunction A → A∗, and clearly any norm-closed involutive subalgebra of L(H) is also a C∗-algebra. Actually, there are no others, but in the sequel we shall not prove the related Gelfand–Naimark Theorem D.5.3:
Theorem D.5.3 (Gelfand–Naimark Theorem (1943)). If A is a C∗-algebra then there exist a Hilbert space H and an isometric ∗-homomorphism of A onto a closed involutive subalgebra of L(H).
However, we shall characterise the commutative case: the Gelfand transform of a commutative C∗-algebra A will turn out to be an isometric isomorphism A → C(Spec(A)), so that A "is" the function algebra C(K) for the compact Hausdorff space K = Spec(A)! Before going into this, we prove some related results.
Proposition D.5.4. Let A be a ∗-algebra. Then 1∗ = 1, x ∈ A is invertible if and only if x∗ ∈ A is invertible, and σ(x∗) = σ(x)∗ := {λ̄ | λ ∈ σ(x)}.
Proof. First,
1∗ = 1∗ 1 = 1∗ (1∗ )∗ = (1∗ 1)∗ = (1∗ )∗ = 1;
second, (x−1 )∗ x∗ = (xx−1 )∗ = 1∗ = 1 = 1∗ = (x−1 x)∗ = x∗ (x−1 )∗ ; third,
λ̄1 − x∗ = (λ1∗)∗ − x∗ = (λ1)∗ − x∗ = (λ1 − x)∗,
which concludes the proof.
Proposition D.5.5. Let A be a C∗-algebra, and x = x∗ ∈ A. Then σ(x) ⊂ R.
Proof. Assume that λ ∈ σ(x) \ R, i.e., λ = λ₁ + iλ₂ for some λ₁, λ₂ ∈ R with λ₂ ≠ 0. Hence we may define y := (x − λ₁1)/λ₂ ∈ A. Now y∗ = y. Moreover, i ∈ σ(y), because
i1 − y = (λ1 − x)/λ₂.
Take t ∈ R. Then t + 1 ∈ σ(t1 − iy), because
(t + 1)1 − (t1 − iy) = −i(i1 − y).
Thereby
(t + 1)² ≤ ρ(t1 − iy)² ≤ ‖t1 − iy‖² = ‖(t1 − iy)∗(t1 − iy)‖   (by the C∗-property)
= ‖(t1 + iy)(t1 − iy)‖   (since t ∈ R, y∗ = y)
= ‖t²1 + y²‖ ≤ t² + ‖y‖²,
so that 2t + 1 ≤ ‖y‖² for every t ∈ R; a contradiction.
Corollary D.5.6. Let A be a C∗-algebra, φ : A → C a homomorphism, and x ∈ A. Then φ(x∗) = \overline{φ(x)}, i.e., φ is a ∗-homomorphism.
Proof. Define the "real part" and the "imaginary part" of x by
u := (x + x∗)/2,  v := (x − x∗)/(2i).
Then x = u + iv, u∗ = u, v∗ = v, and x∗ = u − iv. Since a homomorphism maps invertibles to invertibles, we have φ(u) ∈ σ(u); we know that σ(u) ⊂ R, because u∗ = u. Similarly we obtain φ(v) ∈ R. Thereby
φ(x∗) = φ(u − iv) = φ(u) − iφ(v) = \overline{φ(u) + iφ(v)} = \overline{φ(u + iv)} = \overline{φ(x)};
this means that Hom∗(A, C) = Hom(A, C).
Exercise D.5.7. Let A be a Banach algebra, B a closed subalgebra, and x ∈ B. Prove the following facts:
(a) G(B) is open and closed in G(A) ∩ B.
(b) σ_A(x) ⊂ σ_B(x) and ∂σ_B(x) ⊂ ∂σ_A(x).
(c) If C \ σ_A(x) is connected then σ_A(x) = σ_B(x).
Using the results of the exercise above, the reader can prove the following important fact on the invariance of the spectrum in C∗-algebras:
Exercise D.5.8. Let A be a C∗-algebra and B a C∗-subalgebra. Show that σ_B(x) = σ_A(x) for every x ∈ B.
Lemma D.5.9. Let A be a C∗-algebra. Then ‖x‖² = ρ(x∗x) for every x ∈ A.
Proof. By the C∗-property,
‖(x∗x)²‖ = ‖(x∗x)(x∗x)‖ = ‖(x∗x)∗(x∗x)‖ = ‖x∗x‖²,
so that by induction
‖(x∗x)^{2ⁿ}‖ = ‖x∗x‖^{2ⁿ}
for every n ∈ N. Therefore applying the Spectral Radius Formula, we get
ρ(x∗x) = lim_{n→∞} ‖(x∗x)^{2ⁿ}‖^{1/2ⁿ} = lim_{n→∞} ‖x∗x‖^{2ⁿ/2ⁿ} = ‖x∗x‖ = ‖x‖²,
the result we wanted.
Chapter D. Algebras
Proof. Let K = Spec(A). We already know that the Gelfand transform is a Banach algebra homomorphism A → C(K). Let x ∈ A and φ ∈ K. Since φ is actually a ∗-homomorphism, we get 0∗ (φ) = φ(x∗ ) = φ(x) = x (φ) = x ∗ (φ); x the Gelfand transform is a ∗-homomorphism. Now we have proven that A ⊂ C(K) is an involutive subalgebra separating the points of K. Stone–Weierstrass Theorem A.14.4 thus says that A is dense in C(K). If we can show that the Gelfand transform A → A is an isometry then we must have A = C(K): Take x ∈ A. Then Lemma ∗ x Gelfand x∗ x = x1 = ρ(x∗ x) = x2 , x2 =
i.e., x = x.
∗
Exercise D.5.12. Show that an injective ∗-homomorphism between C -algebras is an isometry. (Hint: Gelfand transform.) Exercise D.5.13. A linear functional f on a C∗ -algebra A is called positive if f (x∗ x) ≥ 0 for every x ∈ A. Show that the positive functionals separate the points of A. Exercise D.5.14. Prove that the involution of a C∗ -algebra cannot be altered without destroying the C∗ -property x∗ x = x2 . Definition D.5.15 (Normal element). An element x of a C∗ -algebra is called normal if x∗ x = xx∗ . We use the commutative Gelfand–Naimark Theorem to create the so-called continuous functional calculus at a normal element – a non-commutative C∗ algebra admits some commutative studies: Theorem D.5.16 (Functional calculus at the normal element). Let A be a C∗ algebra, and x ∈ A be a normal element. Let ι = (λ → λ) : σ(x) → C. Then there exists a unique isometric ∗-homomorphism φ : C(σ(x)) → A such that φ(ι) = x and φ(C(σ(x))) is the C∗ -algebra generated by x, i.e., the smallest C∗ -algebra containing {x}. Proof. Let B be the C∗ -algebra generated by x. Since x is normal, B is commutative. Let Gel = (y → y) : B → C(Spec(B)) be the Gelfand transform of B. The reader may easily verify that x : Spec(B) → σ(x) is a continuous bijection from a compact space to a Hausdorff space; hence it is a homeomorphism. Let us define the mapping Cx : C(σ(x)) → C(Spec(B)),
(Cx f )(h) := f ( x(h)) = f (h(x));
D.6. Appendix: Liouville’s Theorem
217
Cx can be thought as a “transpose” of x . Let us define φ = Gel−1 ◦ Cx : C(σ(x)) → B ⊂ A. Then φ : C(σ(x)) → A is obviously an isometric ∗-homomorphism. Furthermore, x) = Gel−1 (Gel(x)) = x. φ(ι) = Gel−1 (Cx (ι)) = Gel−1 ( Due to the Stone–Weierstrass Theorem A.14.4, the ∗-algebra generated by ι ∈ C(σ(x)) is dense in C(σ(x)); since the ∗-homomorphism φ maps the generator ι to the generator x, the uniqueness of φ follows Remark D.5.17. The ∗-homomorphism φ : C(σ(x)) → A above is called the (continuous) functional calculus at the normal element φ(ι) = x ∈ A. If p = (z → n n j j a z ) : C → C is a polynomial then it is natural to define p(x) := j=1 j j=1 aj x . Then actually p(x) = φ(p); hence it is natural to define f (x) := φ(f ) for every f ∈ C(σ(x)). It is easy to check that if f ∈ C(σ(x)) and h ∈ Spec(B) then f (h(x)) = h(f (x)). Exercise D.5.18. Let A be a C∗ -algebra, x ∈ A normal, f ∈ C(σ(x)), and g ∈ C(f (σ(x))). Show that σ(f (x)) = f (σ(x)) and that (g ◦ f )(x) = g(f (x)).
D.6 Appendix: Liouville's Theorem

Here we prove Liouville's Theorem D.6.2 from complex analysis, which was used in the proof of Gelfand's Theorem D.3.10.

Definition D.6.1 (Holomorphic function). Let $\Omega \subset \mathbb{C}$ be open. A function $f : \Omega \to \mathbb{C}$ is called holomorphic in $\Omega$, denoted by $f \in H(\Omega)$, if the limit
\[ f'(z) := \lim_{h \to 0} \frac{f(z+h) - f(z)}{h} \]
exists for every $z \in \Omega$. Then Cauchy's integral formula provides a power series representation
\[ f(z) = \sum_{n=0}^{\infty} c_n (z-a)^n \]
converging uniformly on the compact subsets of the disk $D(a,r) = \{ z \in \mathbb{C} : |z-a| < r \} \subset \Omega$; here $c_n = f^{(n)}(a)/n!$, where $f^{(0)} = f$ and $f^{(n+1)} = (f^{(n)})'$.

Theorem D.6.2 (Liouville's Theorem). Let $f \in H(\mathbb{C})$ be such that $|f|$ is bounded. Then $f$ is constant, i.e., $f(z) \equiv f(0)$ for every $z \in \mathbb{C}$.
Proof. Since $f \in H(\mathbb{C})$, we have a power series representation
\[ f(z) = \sum_{n=0}^{\infty} c_n z^n \]
converging uniformly on the compact sets in the complex plane. Thereby
\[ \frac{1}{2\pi} \int_0^{2\pi} |f(r\mathrm{e}^{\mathrm{i}\phi})|^2 \, \mathrm{d}\phi
= \frac{1}{2\pi} \int_0^{2\pi} \sum_{n,m} c_n \overline{c_m}\, r^{n+m} \mathrm{e}^{\mathrm{i}(n-m)\phi} \, \mathrm{d}\phi
= \sum_{n,m} c_n \overline{c_m}\, r^{n+m} \frac{1}{2\pi} \int_0^{2\pi} \mathrm{e}^{\mathrm{i}(n-m)\phi} \, \mathrm{d}\phi
= \sum_{n=0}^{\infty} |c_n|^2 r^{2n} \]
for every $r > 0$. Hence the fact
\[ \sum_{n=0}^{\infty} |c_n|^2 r^{2n} = \frac{1}{2\pi} \int_0^{2\pi} |f(r\mathrm{e}^{\mathrm{i}\phi})|^2 \, \mathrm{d}\phi \le \sup_{z \in \mathbb{C}} |f(z)|^2 < \infty \]
implies $c_n = 0$ for every $n \ge 1$; thus $f(z) \equiv c_0 = f(0)$.
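The key identity in the proof, namely that the mean of $|f(r\mathrm{e}^{\mathrm{i}\phi})|^2$ over a circle equals $\sum_n |c_n|^2 r^{2n}$, can be checked numerically. The following Python sketch is an illustration only; the polynomial coefficients and the radius are arbitrary test choices.

```python
import numpy as np

# Coefficients of the entire function f(z) = 1 + 2z + 3z^2 (a test polynomial).
c = np.array([1.0, 2.0, 3.0])
r = 0.7

# Left-hand side: mean of |f(r e^{i phi})|^2 over the circle; the uniform grid
# integrates trigonometric polynomials of low degree exactly.
phi = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
z = r * np.exp(1j * phi)
f_vals = c[0] + c[1] * z + c[2] * z**2
lhs = np.mean(np.abs(f_vals) ** 2)

# Right-hand side: sum of |c_n|^2 r^{2n}.
rhs = sum(abs(cn) ** 2 * r ** (2 * n) for n, cn in enumerate(c))

assert abs(lhs - rhs) < 1e-9
```

For a bounded entire function the right-hand side must stay bounded as $r \to \infty$, which is exactly what forces $c_n = 0$ for $n \ge 1$ in the proof.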
A more general Liouville theorem for harmonic functions will be given in Theorem 2.6.14, with a proof relying on Fourier analysis instead.
Part II
Commutative Symmetries

In Part II we present the theory of pseudo-differential operators on commutative groups. The first commutative case is the Euclidean space Rn, where the theory of pseudo-differential operators is most developed and many things may be considered well known, so here we only review basics which are useful for contrasting it with constructions on other spaces. We start by introducing elements of Fourier analysis in Chapter 1, trying to make an independent exposition of the theory, reducing references to general measure theory to a minimum. In Chapter 2 we develop the most important elements of the theory of pseudo-differential operators on Rn. There we do not aim at a comprehensive treatment since several excellent monographs are already available. Instead, we focus on aspects of the theory that have analogues on the torus in Chapter 4, and on more general (compact) Lie groups in Part IV. From this point of view Chapters 1 and 2 can be regarded as an introduction to the subject, and that is why we have taken special care there to accommodate a possibly less experienced reader.

The second commutative case is that of the torus Tn = Rn/Zn, considered in Chapter 4. On one hand, pseudo-differential operators on Tn can be viewed as a special case of periodic pseudo-differential operators on Rn, with all the consequences. However, in this way one may lose many important features of the underlying torus. On the other hand, carrying out the analysis in the intrinsic language of the underlying space is usually a more natural point of view, one that also has a chance of extending to other Lie groups that are not so intimately related to the Euclidean space. Here the literature on the general theory of periodic pseudo-differential operators in the "toroidal language" is rather non-existent and only a few results seem to be available.
This fact is quite surprising given that Fourier analysis on Tn is nothing else but the periodic Fourier transform on Rn and, as such, constitutes a starting point of applications of Fourier analysis to numerous problems in applied
mathematics and engineering. In particular, such applications (and especially real-life or computational applications) often rely on the toroidal language of the Fourier coefficients and the Fourier series rather than on the Euclidean language of the Fourier transform. Since every connected commutative Lie group G can be identified with the product G ≅ Tn × Rm, the combination of these two settings essentially exhausts all connected commutative Lie groups. Moreover, every compact (possibly disconnected) commutative group is isomorphic to the product of a torus and a finite commutative group, so that being connected is not really a restriction and it is sufficient to study the case of the torus again.

In Chapter 5 we discuss commutator characterisations of pseudo-differential operators on Rn and Tn, as well as on closed manifolds, which will be useful in the sequel. In particular, Section 5.2 contains a short introduction to pseudo-differential operators on manifolds.
Chapter 1
Fourier Analysis on Rn

In this chapter we review basic elements of Fourier analysis on $\mathbb{R}^n$. Subsequently, we introduce spaces of distributions, putting emphasis on the space of tempered distributions $\mathcal{S}'(\mathbb{R}^n)$. Finally, we discuss Sobolev spaces and the approximation of functions and distributions by smooth functions. Throughout, we fix the measure on $\mathbb{R}^n$ to be Lebesgue measure. For convenience, we may repeat a few definitions in the context of $\mathbb{R}^n$ although they may have already appeared in Chapter C on measure theory. From this point of view, the present chapter can be read essentially independently. The notation used in this chapter and also in Chapter 2 is $\langle \xi \rangle = (1 + |\xi|^2)^{1/2}$, where $|\xi| = (\xi_1^2 + \cdots + \xi_n^2)^{1/2}$, $\xi \in \mathbb{R}^n$.
1.1 Basic properties of the Fourier transform
Let $\Omega \subset \mathbb{R}^n$ be a measurable subset of $\mathbb{R}^n$. For simplicity, we may always think of $\Omega$ as being open or closed in $\mathbb{R}^n$. In this section we will mostly have $\Omega = \mathbb{R}^n$.

Definition 1.1.1 ($L^p$-spaces). Let $1 \le p < \infty$. A function $f : \Omega \to \mathbb{C}$ is said to be in $L^p(\Omega)$ if it is measurable and its norm
\[ \|f\|_{L^p(\Omega)} := \left( \int_\Omega |f(x)|^p \, \mathrm{d}x \right)^{1/p} \]
is finite. In the case $p = \infty$, $f$ is said to be in $L^\infty(\Omega)$ if it is measurable and essentially bounded, i.e., if $\|f\|_{L^\infty(\Omega)} := \operatorname{esssup}_{x \in \Omega} |f(x)| < \infty$. Here $\operatorname{esssup}_{x \in \Omega} |f(x)|$ is defined as the smallest $M$ such that $|f(x)| \le M$ for almost all $x \in \Omega$. In particular, $L^1(\Omega)$ is the space of absolutely integrable functions on $\Omega$ with $\|f\|_{L^1(\Omega)} = \int_\Omega |f(x)| \, \mathrm{d}x$. We will often abbreviate the $\|f\|_{L^p(\Omega)}$ norm by $\|f\|_{L^p}$, or by $\|f\|_p$, if the choice of $\Omega$ is clear from the context.
We note that it is customary to abuse the notation slightly by talking about functions in $L^p(\mathbb{R}^n)$ while in reality elements of $L^p(\mathbb{R}^n)$ are equivalence classes of functions that are equal almost everywhere. However, this is a minor technical issue, see Definition C.4.6 for details.

Definition 1.1.2 (Fourier transform in $L^1(\mathbb{R}^n)$). For $f \in L^1(\mathbb{R}^n)$ we define its Fourier transform by
\[ (\mathcal{F}_{\mathbb{R}^n} f)(\xi) = (\mathcal{F} f)(\xi) = \widehat{f}(\xi) := \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} f(x) \, \mathrm{d}x. \]
Remark 1.1.3. Other similar definitions are often encountered in the literature. For example, one can use $\mathrm{e}^{-\mathrm{i} x \cdot \xi}$ instead of $\mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi}$, multiply the integral by the constant $(2\pi)^{-n/2}$, etc. Changes in definitions may lead to changes in constants in formulae. It may also seem that our notation for the Fourier transform is a bit excessive. However, $\widehat{f}$ is a useful shorthand notation, while $\mathcal{F}_{\mathbb{R}^n} f$ is useful in the sequel when we want to explicitly distinguish it from the Fourier transform $\mathcal{F}_{\mathbb{T}^n} f$ for functions on the torus $\mathbb{T}^n$ considered in Chapters 3 and 4. However, in this chapter as well as in Chapter 2 we may omit the subscript and write simply $\mathcal{F}$ since there should be no confusion.

It is easy to check that $\mathcal{F} : L^1(\mathbb{R}^n) \to L^\infty(\mathbb{R}^n)$ is a bounded linear operator with norm one:
\[ \|\widehat{f}\|_\infty \le \|f\|_1. \tag{1.1} \]
Moreover, if $f \in L^1(\mathbb{R}^n)$, its Fourier transform $\widehat{f}$ is continuous, which follows from Lebesgue's dominated convergence theorem. For Lebesgue's dominated convergence theorem on general measure spaces we refer to Theorem C.3.22, but for completeness, we also state it here in a form useful to us:

Theorem 1.1.4 (Lebesgue's dominated convergence theorem). Let $(f_k)_{k=1}^\infty$ be a sequence of measurable functions on $\Omega$ such that $f_k \to f$ pointwise almost everywhere on $\Omega$ as $k \to \infty$. Suppose there is an integrable function $g \in L^1(\Omega)$ such that $|f_k| \le g$ for all $k$. Then $f$ is integrable and $\int_\Omega f \, \mathrm{d}x = \lim_{k \to \infty} \int_\Omega f_k \, \mathrm{d}x$.

Exercise 1.1.5. Prove that if $f \in L^1(\mathbb{R}^n)$ then $\widehat{f}$ is continuous everywhere.

Exercise 1.1.6. Let $u, f \in L^1(\mathbb{R}^n)$ satisfy $\mathcal{L} u = f$, where $\mathcal{L} = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}$ is the Laplace operator. Prove that $\int_{\mathbb{R}^n} f(x) \, \mathrm{d}x = 0$.

Exercise 1.1.7. Let $u, f \in L^1(\mathbb{R}^n)$ satisfy $(1 - \mathcal{L}) u = f$. Suppose that $\widehat{f}$ satisfies
\[ |\widehat{f}(\xi)| \le \frac{C}{(1+|\xi|)^{n-1}} \quad \text{for all } \xi \in \mathbb{R}^n. \]
Prove that $u$ is a bounded continuous function on $\mathbb{R}^n$.

It is quite difficult to characterise the image of the space $L^1(\mathbb{R}^n)$ under the Fourier transform. But we have the following theorem:
Theorem 1.1.8 (Riemann–Lebesgue lemma). Let $f \in L^1(\mathbb{R}^n)$. Then its Fourier transform $\widehat{f}$ is a continuous function on $\mathbb{R}^n$ vanishing at infinity, i.e., $\widehat{f}(\xi) \to 0$ as $\xi \to \infty$.

Proof. It is enough to make an explicit calculation for $f$ being a characteristic function of a cube and then use a standard limiting argument. Thus, let $f$ be the characteristic function of the unit cube, i.e., $f(x) = 1$ for $x \in [-1,1]^n$ and $f(x) = 0$ otherwise. Then
\begin{align*}
\widehat{f}(\xi) &= \int_{[-1,1]^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} \, \mathrm{d}x
= \prod_{j=1}^{n} \int_{-1}^{1} \mathrm{e}^{-2\pi \mathrm{i} x_j \xi_j} \, \mathrm{d}x_j
= \prod_{j=1}^{n} \frac{1}{-2\pi \mathrm{i} \xi_j} \mathrm{e}^{-2\pi \mathrm{i} x_j \xi_j} \Big|_{-1}^{1} \\
&= \left( \frac{\mathrm{i}}{2\pi} \right)^n \frac{1}{\xi_1 \cdots \xi_n} \prod_{j=1}^{n} \left( \mathrm{e}^{-2\pi \mathrm{i} \xi_j} - \mathrm{e}^{2\pi \mathrm{i} \xi_j} \right)
= \prod_{j=1}^{n} \frac{\sin(2\pi \xi_j)}{\pi \xi_j}.
\end{align*}
The product of the exponentials is bounded, so the whole expression tends to zero as $\xi \to \infty$ away from the coordinate axes. In case some of the $\xi_j$'s are zero, an obvious modification of this argument yields the same result.

Exercise 1.1.9. Complete the proof of Theorem 1.1.8 in the case when some of the $\xi_j$'s are zero.

Definition 1.1.10 (Multi-index notation). When working in $\mathbb{R}^n$, the following notation is extremely useful. For multi-indices $\alpha = (\alpha_1, \ldots, \alpha_n)$ and $\beta = (\beta_1, \ldots, \beta_n)$ with integer entries $\alpha_j, \beta_j \ge 0$, we define
\[ \partial^\alpha \varphi(x) = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}} \varphi(x) \quad \text{and} \quad x^\beta = x_1^{\beta_1} \cdots x_n^{\beta_n}. \]
For such multi-indices we will write $\alpha, \beta \ge 0$. For multi-indices $\alpha$ and $\beta$, $\alpha \le \beta$ means $\alpha_j \le \beta_j$ for all $j \in \{1, \ldots, n\}$. The length of the multi-index $\alpha$ will be denoted by $|\alpha| = \alpha_1 + \cdots + \alpha_n$, and $\alpha! = \alpha_1! \cdots \alpha_n!$.

The space $L^1(\mathbb{R}^n)$ has its limitations for Fourier analysis because its elements may be quite irregular. The following space is an excellent alternative because its elements are smooth and have strong decay properties, thus allowing us not to worry about the convergence of integrals, as well as allowing the use of analytic techniques such as integration by parts, etc.
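The one-dimensional factor computed in the proof of the Riemann–Lebesgue lemma, $\int_{-1}^{1} \mathrm{e}^{-2\pi \mathrm{i} x \xi} \, \mathrm{d}x = \sin(2\pi\xi)/(\pi\xi)$, can be verified by quadrature. The following Python sketch is an illustration only; the frequency and the grid size are arbitrary test choices.

```python
import numpy as np

xi = 2.3                       # an arbitrary nonzero test frequency
n = 200000                     # midpoint-rule quadrature nodes on [-1, 1]
x = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)

# Fourier transform of the characteristic function of [-1, 1] at xi.
ft = np.sum(np.exp(-2j * np.pi * x * xi)) * (2.0 / n)
closed_form = np.sin(2 * np.pi * xi) / (np.pi * xi)

assert abs(ft - closed_form) < 1e-7
```

The closed form makes the decay at infinity explicit: the transform is bounded by $1/(\pi|\xi|)$ for large $|\xi|$, in line with the lemma.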
Definition 1.1.11 (Schwartz space $\mathcal{S}(\mathbb{R}^n)$). We define the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ of rapidly decreasing functions as follows. We say that $\varphi \in \mathcal{S}(\mathbb{R}^n)$ if $\varphi$ is smooth on $\mathbb{R}^n$ and if
\[ \sup_{x \in \mathbb{R}^n} \left| x^\beta \partial^\alpha \varphi(x) \right| < \infty \]
for all multi-indices $\alpha, \beta \ge 0$.

Exercise 1.1.12. Show that a smooth function $\varphi$ is in the Schwartz space if and only if for all $\alpha \ge 0$ and $N \ge 0$ there is a constant $C_{\alpha,N}$ such that $|\partial^\alpha \varphi(x)| \le C_{\alpha,N} (1+|x|)^{-N}$ for all $x \in \mathbb{R}^n$.

The space $\mathcal{S}(\mathbb{R}^n)$ is a topological space. Let us now introduce the convergence of functions in $\mathcal{S}(\mathbb{R}^n)$.

Definition 1.1.13 (Convergence in $\mathcal{S}(\mathbb{R}^n)$). We will say that $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as $j \to \infty$, if $\varphi_j, \varphi \in \mathcal{S}(\mathbb{R}^n)$ and if
\[ \sup_{x \in \mathbb{R}^n} | x^\beta \partial^\alpha (\varphi_j - \varphi)(x) | \to 0 \ \text{ as } j \to \infty, \tag{1.2} \]
for all multi-indices $\alpha, \beta \ge 0$.

Remark 1.1.14. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ consists of the $C^\infty$-smooth functions on $\mathbb{R}^n$ that decay rapidly at infinity, i.e.,
\[ \mathcal{S}(\mathbb{R}^n) := \Big\{ \varphi \in C^\infty(\mathbb{R}^n) : p_{\alpha\beta}(\varphi) := \sup_{x \in \mathbb{R}^n} \left| x^\beta \partial^\alpha \varphi(x) \right| < \infty \ (\alpha, \beta \in \mathbb{N}_0^n) \Big\}. \]
If one is familiar with functional analysis, one can take the expressions $p_{\alpha\beta}(\varphi)$ as seminorms on the space $\mathcal{S}(\mathbb{R}^n)$, see Definition B.4.1. This collection turns $\mathcal{S}(\mathbb{R}^n)$ into a locally convex linear topological space. Moreover, it is also a Fréchet space with the natural topology induced by the seminorms $p_{\alpha\beta}$ (see Exercise B.3.9), and it is a nuclear Montel space (see Exercises B.3.37 and B.3.51).

Definition 1.1.15 (Useful notation $D^\alpha$). Since the definition of the Fourier transform contains the complex exponential, it is often convenient to use the notation $D_j = \frac{1}{2\pi \mathrm{i}} \frac{\partial}{\partial x_j}$ and $D^\alpha = D_1^{\alpha_1} \cdots D_n^{\alpha_n}$. If $D_j$ is applied to a function of $\xi$ it will obviously mean $\frac{1}{2\pi \mathrm{i}} \frac{\partial}{\partial \xi_j}$. However, there should be no confusion with this convention. If we want to additionally emphasise the variable of differentiation, we will write $D_x^\alpha$ or $D_\xi^\alpha$.

The following theorem relates multiplication with differentiation, with respect to the Fourier transform.

Theorem 1.1.16. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then $\widehat{D_j \varphi}(\xi) = \xi_j \widehat{\varphi}(\xi)$ and $\widehat{x_j \varphi}(\xi) = -D_j \widehat{\varphi}(\xi)$.

Proof. From the definition of the Fourier transform we readily see that
\[ D_j \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} (-x_j) \varphi(x) \, \mathrm{d}x. \]
This gives the second formula. Since the integrals converge absolutely, we can integrate by parts with respect to $x_j$ in the following expression to get
\[ \xi_j \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} \left( -D_j \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} \right) \varphi(x) \, \mathrm{d}x = \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} D_j \varphi(x) \, \mathrm{d}x. \]
This implies the first formula. Note that we do not get boundary terms when integrating by parts because the function $\varphi$ vanishes at infinity.

Remark 1.1.17. This theorem allows one to tackle some differential equations already. For example, let us look at the equation $\mathcal{L} u = f$ with the Laplace operator $\mathcal{L} = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}$. Taking the Fourier transform and using the theorem we arrive at the equation $-4\pi^2 |\xi|^2 \widehat{u} = \widehat{f}$. If we knew how to invert the Fourier transform we could find the solution $u = -\mathcal{F}^{-1} \frac{1}{4\pi^2 |\xi|^2} \widehat{f}$.

Corollary 1.1.18. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then
\[ \xi^\beta D_\xi^\alpha \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} D_x^\beta \left( (-x)^\alpha \varphi(x) \right) \mathrm{d}x. \]
Hence also
\[ \sup_{\xi \in \mathbb{R}^n} \left| \xi^\beta D_\xi^\alpha \widehat{\varphi}(\xi) \right| \le C \sup_{x \in \mathbb{R}^n} \left| (1+|x|)^{n+1} D_x^\beta \left( x^\alpha \varphi(x) \right) \right|, \]
with $C = \int_{\mathbb{R}^n} (1+|x|)^{-n-1} \, \mathrm{d}x < \infty$.
Here we used the following useful criterion:
Exercise 1.1.19 (Integrability criterion). Show that we have $\int_{\mathbb{R}^n} \frac{\mathrm{d}x}{(1+|x|)^\rho} < \infty$ if and only if $\rho > n$. We also have $\int_{|x| \le 1} \frac{\mathrm{d}x}{|x|^\rho} < \infty$ if and only if $\rho < n$. Both of these criteria can be easily checked by passing to polar coordinates.

Remark 1.1.20 (Fourier transform in $\mathcal{S}(\mathbb{R}^n)$). Corollary 1.1.18 implies that the Fourier transform $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^n)$ to itself. In fact, later we will show that much more is true. Let us note for now that Corollary 1.1.18 together with Lebesgue's dominated convergence theorem implies that the Fourier transform $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is continuous, i.e., $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ implies $\widehat{\varphi_j} \to \widehat{\varphi}$ in $\mathcal{S}(\mathbb{R}^n)$.

Theorem 1.1.21 (Fourier inversion formula). The Fourier transform $\mathcal{F} : \varphi \mapsto \widehat{\varphi}$ is an isomorphism of $\mathcal{S}(\mathbb{R}^n)$ into $\mathcal{S}(\mathbb{R}^n)$, whose inverse is given by
\[ \varphi(x) = \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi} \widehat{\varphi}(\xi) \, \mathrm{d}\xi. \tag{1.3} \]
This formula is called the Fourier inversion formula and the inverse Fourier transform is denoted by
\[ (\mathcal{F}_{\mathbb{R}^n}^{-1} f)(x) \equiv (\mathcal{F}^{-1} f)(x) := \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi} f(\xi) \, \mathrm{d}\xi. \]
Thus, we can say that $\mathcal{F} \circ \mathcal{F}^{-1} = \mathcal{F}^{-1} \circ \mathcal{F} = \mathrm{identity}$ on $\mathcal{S}(\mathbb{R}^n)$.
The proof of this theorem will rely on several lemmas which have a significance of their own.

Lemma 1.1.22 (Multiplication formula for the Fourier transform). Let $f, g \in L^1(\mathbb{R}^n)$. Then $\int_{\mathbb{R}^n} \widehat{f} g \, \mathrm{d}x = \int_{\mathbb{R}^n} f \widehat{g} \, \mathrm{d}x$.

Proof. We will apply Fubini's theorem. Thus,
\[ \int_{\mathbb{R}^n} \widehat{f} g \, \mathrm{d}x
= \int_{\mathbb{R}^n} \left[ \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot y} f(y) \, \mathrm{d}y \right] g(x) \, \mathrm{d}x
= \int_{\mathbb{R}^n} \left[ \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot y} g(x) \, \mathrm{d}x \right] f(y) \, \mathrm{d}y
= \int_{\mathbb{R}^n} \widehat{g} f \, \mathrm{d}y, \]
proving the lemma.
Lemma 1.1.23 (Fourier transform of Gaussian distributions). We have the equality
\[ \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi}\, \mathrm{e}^{-\pi^2 \varepsilon |x|^2} \, \mathrm{d}x = (\pi \varepsilon)^{-n/2}\, \mathrm{e}^{-|\xi|^2/\varepsilon} \]
for every $\varepsilon > 0$. By the changes $2\pi x \mapsto x$ and $\varepsilon \mapsto 2\varepsilon$ it is equivalent to
\[ \int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i} x \cdot \xi}\, \mathrm{e}^{-\varepsilon |x|^2/2} \, \mathrm{d}x = \left( \frac{2\pi}{\varepsilon} \right)^{n/2} \mathrm{e}^{-|\xi|^2/(2\varepsilon)}. \tag{1.4} \]

Proof. We will use the standard identities
\[ \int_{-\infty}^{\infty} \mathrm{e}^{-t^2/2} \, \mathrm{d}t = \sqrt{2\pi} \quad \text{and} \quad \int_{\mathbb{R}^n} \mathrm{e}^{-|x|^2/2} \, \mathrm{d}x = (2\pi)^{n/2}. \tag{1.5} \]
In fact, (1.4) will follow from the one-dimensional case, where we have
\[ \int_{-\infty}^{\infty} \mathrm{e}^{-\mathrm{i} t \tau}\, \mathrm{e}^{-t^2/2} \, \mathrm{d}t
= \mathrm{e}^{-\tau^2/2} \int_{-\infty}^{\infty} \mathrm{e}^{-(t + \mathrm{i}\tau)^2/2} \, \mathrm{d}t
= \mathrm{e}^{-\tau^2/2} \int_{-\infty}^{\infty} \mathrm{e}^{-t^2/2} \, \mathrm{d}t
= \sqrt{2\pi}\, \mathrm{e}^{-\tau^2/2}, \]
where we used Cauchy's theorem about changing the contour of integration for analytic functions, together with formula (1.5). Changing $t \mapsto \sqrt{\varepsilon}\, t$ and $\tau \mapsto \tau/\sqrt{\varepsilon}$ gives
\[ \int_{-\infty}^{\infty} \mathrm{e}^{-\mathrm{i} t \tau}\, \mathrm{e}^{-\varepsilon t^2/2} \, \mathrm{d}t = \sqrt{\frac{2\pi}{\varepsilon}}\, \mathrm{e}^{-\tau^2/(2\varepsilon)}. \]
Extending this to $n$ dimensions yields (1.4).
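The one-dimensional identity at the end of the proof can be checked by quadrature. The following Python sketch is an illustration only; the values of $\varepsilon$ and $\tau$ and the truncation of the integral to a finite interval are arbitrary test choices (the Gaussian tails beyond the interval are negligible).

```python
import numpy as np

eps, tau = 0.8, 1.5            # arbitrary test values for epsilon and tau
t = np.linspace(-30.0, 30.0, 120001)
h = t[1] - t[0]

# Left-hand side: trapezoid-type sum for the integral over the real line.
lhs = h * np.sum(np.exp(-1j * t * tau) * np.exp(-eps * t**2 / 2))
# Right-hand side: the closed form sqrt(2 pi / eps) * exp(-tau^2 / (2 eps)).
rhs = np.sqrt(2 * np.pi / eps) * np.exp(-tau**2 / (2 * eps))

assert abs(lhs - rhs) < 1e-9
```

The trapezoid rule is spectrally accurate for rapidly decaying analytic integrands such as Gaussians, which is why a moderate grid already matches the closed form to high precision.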
Proof of Theorem 1.1.21. For $\varphi \in \mathcal{S}(\mathbb{R}^n)$, we want to prove (1.3), i.e., that $\varphi(x) = \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi} \widehat{\varphi}(\xi) \, \mathrm{d}\xi$. By the Lebesgue dominated convergence theorem (Theorem 1.1.4) we can replace the right-hand side of this formula by
\begin{align*}
\int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi} \widehat{\varphi}(\xi) \, \mathrm{d}\xi
&= \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi} \widehat{\varphi}(\xi)\, \mathrm{e}^{-2\pi^2 \varepsilon |\xi|^2} \, \mathrm{d}\xi \\
&= \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} (x-y) \cdot \xi} \varphi(y)\, \mathrm{e}^{-2\pi^2 \varepsilon |\xi|^2} \, \mathrm{d}y \, \mathrm{d}\xi
&& (\text{change } y \mapsto y + x) \\
&= \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} y \cdot \xi} \varphi(y + x)\, \mathrm{e}^{-2\pi^2 \varepsilon |\xi|^2} \, \mathrm{d}y \, \mathrm{d}\xi
&& (\text{Fubini's theorem}) \\
&= \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \varphi(y + x) \left( \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} y \cdot \xi}\, \mathrm{e}^{-2\pi^2 \varepsilon |\xi|^2} \, \mathrm{d}\xi \right) \mathrm{d}y
&& (\text{F.T. of Gaussian}) \\
&= \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \varphi(y + x)\, (2\pi\varepsilon)^{-n/2}\, \mathrm{e}^{-|y|^2/(2\varepsilon)} \, \mathrm{d}y
&& (\text{change } y = \sqrt{\varepsilon}\, z) \\
&= (2\pi)^{-n/2} \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \varphi(\sqrt{\varepsilon}\, z + x)\, \mathrm{e}^{-|z|^2/2} \, \mathrm{d}z
&& (\text{use } (1.5)) \\
&= \varphi(x).
\end{align*}
This finishes the proof.
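The inversion formula can also be tested numerically: compute $\widehat{\varphi}$ by quadrature on a grid and transform back. The following Python sketch is an illustration only; the Schwartz test function $\varphi(x) = x\,\mathrm{e}^{-\pi x^2}$, the grids, and the evaluation point are arbitrary choices.

```python
import numpy as np

# Grids in x and xi; phi(x) = x * exp(-pi x^2) is a Schwartz function.
x = np.linspace(-5.0, 5.0, 1501)
hx = x[1] - x[0]
phi = x * np.exp(-np.pi * x**2)

xi = np.linspace(-5.0, 5.0, 1501)
hxi = xi[1] - xi[0]
# phi_hat(xi) = int e^{-2 pi i x xi} phi(x) dx, evaluated on the xi grid.
phi_hat = hx * np.exp(-2j * np.pi * np.outer(xi, x)) @ phi

# Invert at an arbitrary point x0 and compare with phi(x0).
x0 = 0.8
recovered = hxi * np.sum(np.exp(2j * np.pi * xi * x0) * phi_hat)

assert abs(recovered - x0 * np.exp(-np.pi * x0**2)) < 1e-8
```

Because both $\varphi$ and $\widehat{\varphi}$ decay like Gaussians, truncating the two integrals to finite intervals introduces only a negligible error here.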
Remark 1.1.24. In fact, in the proof we implicitly established the following useful relation between Fourier transforms and translations of functions. Let $h \in \mathbb{R}^n$ and define $(\tau_h f)(x) = f(x - h)$. Then we also see that
\[ \widehat{\tau_h f}(\xi)
= \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} (\tau_h f)(x) \, \mathrm{d}x
= \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} f(x - h) \, \mathrm{d}x
= \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} (y + h) \cdot \xi} f(y) \, \mathrm{d}y
= \mathrm{e}^{-2\pi \mathrm{i} h \cdot \xi} \widehat{f}(\xi), \]
where we used the change $y = x - h$.
Exercise 1.1.25 (Fourier transform and linear transformations). Let $f \in L^1(\mathbb{R}^n)$. Show that if $A \in \mathbb{R}^{n \times n}$ satisfies $\det A \ne 0$, and $B = (A^{\mathrm{T}})^{-1}$, then $\widehat{f \circ A} = |\det A|^{-1}\, \widehat{f} \circ B$. In particular, conclude that the Fourier transform commutes with rotations: if $A$ is orthogonal (i.e., $A^{\mathrm{T}} = A^* = A^{-1}$, so that $A$ defines a rotation of $\mathbb{R}^n$), then
$\widehat{f \circ A} = \widehat{f} \circ A$. Consequently, conclude also that the Fourier transform of a radial function is radial: if $f(x) = h_1(|x|)$ for some $h_1$, then $\widehat{f}(\xi) = h_2(|\xi|)$ for some $h_2$.

Definition 1.1.26 (Convolutions). For functions $f, g \in L^1(\mathbb{R}^n)$, we define their convolution by
\[ (f * g)(x) := \int_{\mathbb{R}^n} f(x - y)\, g(y) \, \mathrm{d}y. \tag{1.6} \]
It is easy to see that $f * g \in L^1(\mathbb{R}^n)$ with the norm inequality
\[ \|f * g\|_{L^1(\mathbb{R}^n)} \le \|f\|_{L^1(\mathbb{R}^n)} \|g\|_{L^1(\mathbb{R}^n)} \tag{1.7} \]
and that $f * g = g * f$. Also, in particular for $f, g \in \mathcal{S}(\mathbb{R}^n)$, the integrals are absolutely convergent and we can differentiate under the integral sign to get
\[ \partial^\alpha (f * g) = \partial^\alpha f * g = f * \partial^\alpha g. \tag{1.8} \]
Remark 1.1.27. We can note that a more rigorous way of defining the convolution would be first defining (1.6) for $f, g \in \mathcal{S}(\mathbb{R}^n)$ and then extending it to a mapping $* : L^1(\mathbb{R}^n) \times L^1(\mathbb{R}^n) \to L^1(\mathbb{R}^n)$ by (1.7), avoiding the question of convergence of the integral in (1.6) for functions from $L^1(\mathbb{R}^n)$.

Exercise 1.1.28. Prove the commutativity of convolution: if $f, g \in L^1(\mathbb{R}^n)$, then $f * g = g * f$. If $f, g \in \mathcal{S}(\mathbb{R}^n)$, prove formula (1.8).

Exercise 1.1.29. Prove the associativity of convolution: if $f, g, h \in L^1(\mathbb{R}^n)$, prove that $(f * g) * h = f * (g * h)$.

The following properties relate convolutions with Fourier transforms.

Theorem 1.1.30. Let $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$. Then we have

(i) $\int_{\mathbb{R}^n} \varphi\, \overline{\psi} \, \mathrm{d}x = \int_{\mathbb{R}^n} \widehat{\varphi}\, \overline{\widehat{\psi}} \, \mathrm{d}\xi$;

(ii) $\widehat{\varphi * \psi}(\xi) = \widehat{\varphi}(\xi)\, \widehat{\psi}(\xi)$;

(iii) $\widehat{\varphi \psi}(\xi) = (\widehat{\varphi} * \widehat{\psi})(\xi)$.
Proof. (i) Let us denote
\[ \chi(\xi) := \overline{\widehat{\psi}(\xi)} = \int_{\mathbb{R}^n} \mathrm{e}^{2\pi \mathrm{i} x \cdot \xi}\, \overline{\psi(x)} \, \mathrm{d}x = \mathcal{F}^{-1}(\overline{\psi})(\xi), \]
so that $\widehat{\chi} = \overline{\psi}$. It follows now that
\[ \int_{\mathbb{R}^n} \widehat{\varphi}\, \overline{\widehat{\psi}} = \int_{\mathbb{R}^n} \widehat{\varphi}\, \chi = \int_{\mathbb{R}^n} \varphi\, \widehat{\chi} = \int_{\mathbb{R}^n} \varphi\, \overline{\psi}, \]
where we used the multiplication formula for the Fourier transform in Lemma 1.1.22.
(ii) We can easily calculate
\begin{align*}
\widehat{\varphi * \psi}(\xi)
&= \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} (\varphi * \psi)(x) \, \mathrm{d}x
= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} \varphi(x - y)\, \psi(y) \, \mathrm{d}y \, \mathrm{d}x \\
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} (x - y) \cdot \xi} \varphi(x - y)\, \mathrm{e}^{-2\pi \mathrm{i} y \cdot \xi} \psi(y) \, \mathrm{d}y \, \mathrm{d}x \\
&= \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} z \cdot \xi} \varphi(z) \, \mathrm{d}z \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} y \cdot \xi} \psi(y) \, \mathrm{d}y
= \widehat{\varphi}(\xi)\, \widehat{\psi}(\xi),
\end{align*}
where we used the substitution $z = x - y$. We leave (iii) as Exercise 1.1.31.
Exercise 1.1.31. Prove part (iii) of Theorem 1.1.30.
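The convolution theorem, part (ii), can be checked numerically with Gaussian test functions, computing both the convolution and the transforms by quadrature on a grid. The following Python sketch is an illustration only; the test functions, the grid, and the frequency are arbitrary choices.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 4001)
h = x[1] - x[0]
phi = np.exp(-np.pi * x**2)          # Gaussian test functions
psi = np.exp(-2 * np.pi * x**2)

def ft(f, xi):
    # Fourier transform at frequency xi by a Riemann sum on the grid.
    return h * np.sum(f * np.exp(-2j * np.pi * x * xi))

# (phi * psi) sampled on the same grid: h times the discrete convolution.
conv = h * np.convolve(phi, psi, mode="same")

xi0 = 0.4
lhs = ft(conv, xi0)                  # transform of the convolution
rhs = ft(phi, xi0) * ft(psi, xi0)    # product of the transforms

assert abs(lhs - rhs) < 1e-6
```

With an odd number of symmetric grid points, `mode="same"` aligns the discrete convolution with the original grid, so `conv[i]` approximates $(\varphi * \psi)(x_i)$.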
1.2 Useful inequalities
This section will be devoted to several important inequalities which are very useful in Fourier analysis and in many types of analysis involving spaces of functions.

Proposition 1.2.1 (Cauchy's inequality). For all $a, b \in \mathbb{R}$ we have $ab \le \frac{a^2}{2} + \frac{b^2}{2}$. Moreover, for any $\varepsilon > 0$, we also have $ab \le \varepsilon a^2 + \frac{b^2}{4\varepsilon}$.

As a consequence, we immediately obtain Cauchy's inequality for functions:
\[ \int_\Omega |f(x)\, g(x)| \, \mathrm{d}x \le \frac{1}{2} \int_\Omega \left( |f(x)|^2 + |g(x)|^2 \right) \mathrm{d}x, \]
which is
\[ \|f g\|_{L^1(\Omega)} \le \frac{1}{2} \left( \|f\|_{L^2(\Omega)}^2 + \|g\|_{L^2(\Omega)}^2 \right). \]

Proof. The first inequality follows from $0 \le (a - b)^2 = a^2 - 2ab + b^2$. The second one follows if we apply the first one to $ab = (\sqrt{2\varepsilon}\, a)(b/\sqrt{2\varepsilon})$.
Proposition 1.2.2 (Cauchy–Schwarz inequality). Let $x, y \in \mathbb{R}^n$. Then we have $|x \cdot y| \le |x|\,|y|$.

Proof. For $\varepsilon > 0$, we have $0 \le |x \pm \varepsilon y|^2 = |x|^2 \pm 2\varepsilon\, x \cdot y + \varepsilon^2 |y|^2$. This implies $\pm x \cdot y \le \frac{1}{2\varepsilon} |x|^2 + \frac{\varepsilon}{2} |y|^2$. Setting $\varepsilon = \frac{|x|}{|y|}$, we obtain the required inequality, provided $y \ne 0$ (if $x = 0$ or $y = 0$ it is trivial). An alternative proof may be given as follows. We can observe that the inequality $0 \le |x + \varepsilon y|^2 = |x|^2 + 2\varepsilon\, x \cdot y + \varepsilon^2 |y|^2$ implies that the discriminant of the quadratic (in $\varepsilon$) polynomial on the right-hand side must be non-positive, which means $|x \cdot y|^2 - |x|^2 |y|^2 \le 0$.

Proposition 1.2.3 (Young's inequality). Let $1 < p, q < \infty$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad \text{for all } a, b > 0. \]
Moreover, if $\varepsilon > 0$, we have
\[ ab \le \varepsilon a^p + C(\varepsilon) b^q \quad \text{for all } a, b > 0, \]
where $C(\varepsilon) = (\varepsilon p)^{-q/p} q^{-1}$.

As a consequence, we immediately obtain that if $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, then $f g \in L^1(\Omega)$ with
\[ \|f g\|_{L^1} \le \frac{1}{p} \|f\|_{L^p}^p + \frac{1}{q} \|g\|_{L^q}^q. \]

Proof. To prove the first inequality, we will use the fact that the exponential function $x \mapsto \mathrm{e}^x$ is convex (a function $f : \mathbb{R} \to \mathbb{R}$ is called convex if $f(\tau x + (1-\tau) y) \le \tau f(x) + (1-\tau) f(y)$, for all $x, y \in \mathbb{R}$ and all $0 \le \tau \le 1$). This implies
\[ ab = \mathrm{e}^{\ln a + \ln b} = \mathrm{e}^{\frac{1}{p} \ln a^p + \frac{1}{q} \ln b^q} \le \frac{1}{p}\, \mathrm{e}^{\ln a^p} + \frac{1}{q}\, \mathrm{e}^{\ln b^q} = \frac{a^p}{p} + \frac{b^q}{q}. \]
The second inequality with $\varepsilon$ follows if we apply the first one to the product $ab = \left( (\varepsilon p)^{1/p} a \right) \left( b/(\varepsilon p)^{1/p} \right)$.

Proposition 1.2.4 (Hölder's inequality). Let $1 \le p, q \le \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$. Let $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$. Then $f g \in L^1(\Omega)$ and
\[ \|f g\|_{L^1(\Omega)} \le \|f\|_{L^p(\Omega)} \|g\|_{L^q(\Omega)}. \]
In the formulation we use the standard convention of setting $1/\infty = 0$. In the case $p = q = 2$ Hölder's inequality is often called the Cauchy–Schwarz inequality. Hölder's inequality in the setting of general measures was given in Theorem C.4.4, but here we give a short proof also in $\mathbb{R}^n$ for transparency.

Proof. In the case $p = 1$ or $p = \infty$ the inequality is obvious, so let us assume $1 < p < \infty$. Let us first consider the case when $\|f\|_{L^p} = \|g\|_{L^q} = 1$. Then by Young's inequality with $1 < p, q < \infty$, we have
\[ \|f g\|_{L^1} \le \frac{1}{p} \|f\|_{L^p}^p + \frac{1}{q} \|g\|_{L^q}^q = \frac{1}{p} + \frac{1}{q} = 1 = \|f\|_{L^p} \|g\|_{L^q}, \]
which is the desired inequality. Now, let us consider general $f, g$. We observe that we may assume that $\|f\|_{L^p} \ne 0$ and $\|g\|_{L^q} \ne 0$, since otherwise one of the functions is zero almost everywhere in $\Omega$ and Hölder's inequality becomes trivial. It follows from the considered case that
\[ \int_\Omega \left| \frac{f}{\|f\|_{L^p}} \, \frac{g}{\|g\|_{L^q}} \right| \mathrm{d}x \le 1, \]
which implies the general case by the linearity of the integral.
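On a discretisation of $\Omega$, Hölder's inequality reduces to the classical discrete inequality (the uniform grid weight $h$ acts as a measure) and can be checked directly. The following Python sketch is an illustration only; the conjugate exponents and the test functions are arbitrary choices.

```python
import numpy as np

# Discretised check of Hoelder's inequality on Omega = [0, 1].
p, q = 3.0, 1.5                  # conjugate exponents: 1/p + 1/q = 1
x = np.linspace(0.0, 1.0, 100001)
h = x[1] - x[0]
f = np.sin(3 * x) + 2.0          # arbitrary test functions
g = np.exp(-x) * (1 + x)

lhs = h * np.sum(np.abs(f * g))                                   # ||f g||_{L^1}
rhs = (h * np.sum(np.abs(f) ** p)) ** (1 / p) \
    * (h * np.sum(np.abs(g) ** q)) ** (1 / q)                     # ||f||_p ||g||_q

assert abs(1 / p + 1 / q - 1) < 1e-12
assert lhs <= rhs + 1e-12
```

The discrete inequality holds exactly (it is Hölder's inequality for the weighted counting measure), so no quadrature tolerance is needed beyond floating-point rounding.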
Proposition 1.2.5 (General Hölder's inequality). Let $1 \le p_1, \ldots, p_m \le \infty$ be such that $\frac{1}{p_1} + \cdots + \frac{1}{p_m} = 1$. Let $f_k \in L^{p_k}(\Omega)$ for all $k = 1, \ldots, m$. Then the product $f_1 \cdots f_m \in L^1(\Omega)$ and
\[ \|f_1 \cdots f_m\|_{L^1(\Omega)} \le \prod_{k=1}^{m} \|f_k\|_{L^{p_k}(\Omega)}. \]
This inequality readily follows from Hölder's inequality by induction on the number of functions.

Exercise 1.2.6. Prove Proposition 1.2.5. Formulate and prove the corresponding general version of Theorem C.4.4.

Proposition 1.2.7 (Minkowski's inequality). Let $1 \le p \le \infty$ and let $f, g \in L^p(\Omega)$. Then
\[ \|f + g\|_{L^p(\Omega)} \le \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)}. \]
In particular, this means that $\|\cdot\|_{L^p}$ satisfies the triangle inequality and is a norm, so $L^p(\Omega)$ is a normed space. Minkowski's inequality in the setting of general measures was given in Theorem C.4.5.

Proof. The cases $p = 1$ and $p = \infty$ follow from the triangle inequality for complex numbers and are, therefore, trivial. So we may assume $1 < p < \infty$. Then we have
\begin{align*}
\|f + g\|_{L^p(\Omega)}^p = \int_\Omega |f + g|^p \, \mathrm{d}x
&\le \int_\Omega |f + g|^{p-1} \left( |f| + |g| \right) \mathrm{d}x \\
&= \int_\Omega |f + g|^{p-1} |f| \, \mathrm{d}x + \int_\Omega |f + g|^{p-1} |g| \, \mathrm{d}x \\
&\le \left( \int_\Omega |f + g|^p \, \mathrm{d}x \right)^{\frac{p-1}{p}} \left[ \left( \int_\Omega |f|^p \, \mathrm{d}x \right)^{\frac{1}{p}} + \left( \int_\Omega |g|^p \, \mathrm{d}x \right)^{\frac{1}{p}} \right] \\
&= \|f + g\|_{L^p(\Omega)}^{p-1} \left( \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)} \right),
\end{align*}
where in the third line we used Hölder's inequality with exponents $p$ and $\frac{p}{p-1}$, which implies the desired inequality.
Proposition 1.2.8 (Young’s inequality for convolutions). Let 1 ≤ p ≤ ∞, f ∈ L1 (Rn ) and g ∈ Lp (Rn ). Then f ∗ g ∈ Lp (Rn ) and ||f ∗ g||Lp ≤ ||f ||L1 ||g||Lp . Proof. We will not prove it from the beginning because the proof is much shorter if we use Minkowski’s inequality for integrals in Theorem C.5.23 or the monotonicity
of the $L^p$-norm in Corollary C.5.24. Indeed, we can write
\[ \|f * g\|_{L^p} = \Big\| \int_{\mathbb{R}^n} f(y)\, g(\cdot - y) \, \mathrm{d}y \Big\|_{L^p} \le \int_{\mathbb{R}^n} |f(y)|\, \|g(\cdot - y)\|_{L^p} \, \mathrm{d}y = \|f\|_{L^1} \|g\|_{L^p}. \]
Exercise 1.2.9. Let $f \in L^1(\mathbb{R}^n)$ and $g \in C^k(\mathbb{R}^n)$ be such that $\partial^\alpha g \in L^\infty(\mathbb{R}^n)$ for all $|\alpha| \le k$. Prove that $f * g \in C^k$. Consequently, show that $\partial^\alpha (f * g) = f * \partial^\alpha g$ at all points.

Proposition 1.2.10 (General Young's inequality for convolutions). Let $1 \le p, q, r \le \infty$ be such that $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$. Let $f \in L^p(\mathbb{R}^n)$ and $g \in L^q(\mathbb{R}^n)$. Then $f * g \in L^r(\mathbb{R}^n)$ and
\[ \|f * g\|_{L^r} \le \|f\|_{L^p} \|g\|_{L^q}. \]

Proof. The proof follows by the Riesz–Thorin interpolation theorem C.4.18 from Proposition 1.2.8 and the estimate $\|f * g\|_{L^\infty} \le \|f\|_{L^p} \|g\|_{L^q}$ in the case $\frac{1}{p} + \frac{1}{q} = 1$.

Exercise 1.2.11. If $\frac{1}{p} + \frac{1}{q} = 1$, $f \in L^p(\mathbb{R}^n)$ and $g \in L^q(\mathbb{R}^n)$, prove the estimate
\[ \|f * g\|_{L^\infty} \le \|f\|_{L^p} \|g\|_{L^q}. \]
(Hint: Hölder's inequality.)

Remark 1.2.12. If $\frac{1}{p} + \frac{1}{q} = 1$, $f \in L^p(\mathbb{R}^n)$ and $g \in L^q(\mathbb{R}^n)$, one can actually show that $f * g$ is not only bounded, but also uniformly continuous. Consequently, if $1 < p, q < \infty$, then $f * g(x) \to 0$ as $x \to \infty$.

Exercise 1.2.13. Prove this remark. (Hint: for the uniform continuity use Hölder's inequality. For the second part check that the statement is obviously true for compactly supported $f$ and $g$, and then pass to the limit as the supports of $f$ and $g$ grow; this is possible in view of the uniform continuity.)

Proposition 1.2.14 (Interpolation for $L^p$-norms). Let $1 \le s \le r \le t \le \infty$ be such that $\frac{1}{r} = \frac{\theta}{s} + \frac{1 - \theta}{t}$ for some $0 \le \theta \le 1$. Let $f \in L^s(\Omega) \cap L^t(\Omega)$. Then $f \in L^r(\Omega)$ and
\[ \|f\|_{L^r(\Omega)} \le \|f\|_{L^s(\Omega)}^{\theta}\, \|f\|_{L^t(\Omega)}^{1 - \theta}. \]

Proof. To prove this, we use that $\frac{\theta r}{s} + \frac{(1 - \theta) r}{t} = 1$ and so we can apply Hölder's inequality in the following way:
\[ \int_\Omega |f|^r \, \mathrm{d}x = \int_\Omega |f|^{\theta r} |f|^{(1 - \theta) r} \, \mathrm{d}x
\le \left( \int_\Omega |f|^{\theta r \cdot \frac{s}{\theta r}} \, \mathrm{d}x \right)^{\frac{\theta r}{s}} \left( \int_\Omega |f|^{(1 - \theta) r \cdot \frac{t}{(1 - \theta) r}} \, \mathrm{d}x \right)^{\frac{(1 - \theta) r}{t}}, \]
which is the desired inequality.
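Like Hölder's inequality, the interpolation inequality holds exactly on a discretisation of $\Omega$, since it is valid on any measure space. The following Python sketch checks it for one arbitrary choice of exponents and test function; none of these choices come from the text.

```python
import numpy as np

# Check of the Lp interpolation inequality with s = 2, t = 6, theta = 1/2,
# so that 1/r = theta/s + (1-theta)/t gives r = 3.
s, t, theta = 2.0, 6.0, 0.5
r = 1.0 / (theta / s + (1 - theta) / t)

x = np.linspace(0.0, 1.0, 50001)
h = x[1] - x[0]
f = np.cos(5 * x) + 1.5          # an arbitrary test function

def lp_norm(f, p):
    # L^p norm with respect to the grid measure of weight h.
    return (h * np.sum(np.abs(f) ** p)) ** (1.0 / p)

assert abs(r - 3.0) < 1e-12
assert lp_norm(f, r) <= lp_norm(f, s) ** theta * lp_norm(f, t) ** (1 - theta) + 1e-12
```

The check mirrors the proof: the inequality for the grid measure is exactly Hölder's inequality applied to $|f|^{\theta r}$ and $|f|^{(1-\theta) r}$.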
1.3 Tempered distributions
In this section we will introduce several spaces of distributions and will extend the Fourier transform to more general spaces of functions than the $\mathcal{S}(\mathbb{R}^n)$ or $L^1(\mathbb{R}^n)$ considered in Section 1.1. The main problem with the immediate extension is that the integral in the definition of the Fourier transform in Definition 1.1.2 may no longer converge if we go beyond the space $L^1(\mathbb{R}^n)$ of integrable functions. We give preference to tempered distributions over general distributions since our main focus in this chapter is Fourier analysis.

Definition 1.3.1 (Tempered distributions $\mathcal{S}'(\mathbb{R}^n)$). We define the space of tempered distributions $\mathcal{S}'(\mathbb{R}^n)$ as the space of all continuous linear functionals on $\mathcal{S}(\mathbb{R}^n)$. This means that $u \in \mathcal{S}'(\mathbb{R}^n)$ if it is a functional $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ such that:

1. $u$ is linear, i.e., $u(\alpha\varphi + \beta\psi) = \alpha u(\varphi) + \beta u(\psi)$ for all $\alpha, \beta \in \mathbb{C}$ and all $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$;

2. $u$ is continuous, i.e., $u(\varphi_j) \to u(\varphi)$ in $\mathbb{C}$ whenever $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$.

We can also define convergence in the space $\mathcal{S}'(\mathbb{R}^n)$ of tempered distributions.¹ Let $u_j, u \in \mathcal{S}'(\mathbb{R}^n)$. We will say that $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ as $j \to \infty$ if $u_j(\varphi) \to u(\varphi)$ in $\mathbb{C}$ as $j \to \infty$, for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Functions in $\mathcal{S}(\mathbb{R}^n)$ are called the test functions for tempered distributions in $\mathcal{S}'(\mathbb{R}^n)$. Another notation for $u(\varphi)$ will be $\langle u, \varphi \rangle$. Here one can also recall the definition of the convergence $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ from (1.2), which said that $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as $j \to \infty$ if $\varphi_j, \varphi \in \mathcal{S}(\mathbb{R}^n)$ and if $\sup_{x \in \mathbb{R}^n} |x^\beta \partial^\alpha (\varphi_j - \varphi)(x)| \to 0$ as $j \to \infty$, for all multi-indices $\alpha, \beta \ge 0$.
1.3.1 Fourier transform of tempered distributions
Here we show that the Fourier transform can be extended from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$ by duality. We also establish Plancherel's and Parseval's equalities on the space $L^2(\mathbb{R}^n)$.

Definition 1.3.2 (Fourier transform of tempered distributions). If $u \in \mathcal{S}'(\mathbb{R}^n)$, we can define its (generalised) Fourier transform by setting
\[ \widehat{u}(\varphi) := u(\widehat{\varphi}), \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$.

Proposition 1.3.3 (Fourier transform on $\mathcal{S}'(\mathbb{R}^n)$). The Fourier transform from Definition 1.3.2 is well defined and continuous from $\mathcal{S}'(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$.

¹ We will not discuss here topological properties of spaces of distributions. See Remark A.19.3 for some properties, as well as Section B.3 and Section 10.12.
Proof. First, we can readily see that if $u \in \mathcal{S}'(\mathbb{R}^n)$ then also $\widehat{u} \in \mathcal{S}'(\mathbb{R}^n)$. Indeed, since $\varphi \in \mathcal{S}(\mathbb{R}^n)$, it follows that $\widehat{\varphi} \in \mathcal{S}(\mathbb{R}^n)$ and so $u(\widehat{\varphi})$ is a well-defined complex number. Moreover, $\widehat{u}$ is linear since both $u$ and the Fourier transform $\mathcal{F}$ are linear. Finally, $\widehat{u}$ is continuous because $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ implies $\widehat{\varphi_j} \to \widehat{\varphi}$ in $\mathcal{S}(\mathbb{R}^n)$ by Remark 1.1.20, and hence
\[ \widehat{u}(\varphi_j) = u(\widehat{\varphi_j}) \to u(\widehat{\varphi}) = \widehat{u}(\varphi) \]
by the continuity of $u$ from $\mathcal{S}(\mathbb{R}^n)$ to $\mathbb{C}$ and of the Fourier transform $\mathcal{F}$ as a mapping from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$ (see Corollary 1.1.18).

Now, it follows that it is also continuous as a mapping from $\mathcal{S}'(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$, i.e., $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ implies that $\widehat{u_j} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$. Indeed, if $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$, we have
\[ \widehat{u_j}(\varphi) = u_j(\widehat{\varphi}) \to u(\widehat{\varphi}) = \widehat{u}(\varphi) \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$, which means that $\widehat{u_j} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$.

Now we give two immediate but important principles for distributions.

Proposition 1.3.4 (Convergence principle). Let $X$ be a topological subspace of $\mathcal{S}'(\mathbb{R}^n)$ (i.e., convergence in $X$ implies convergence in $\mathcal{S}'(\mathbb{R}^n)$). Suppose that $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ and that $u_j \to v$ in $X$. Then $u \in X$ and $u = v$.

This statement is simply a consequence of the fact that the space $\mathcal{S}'(\mathbb{R}^n)$ is Hausdorff, hence it has the uniqueness of limits property (recall that a topological space is called Hausdorff if any two points have open disjoint neighbourhoods, i.e., open disjoint sets containing them). The convergence principle is also related to another principle which we call

Proposition 1.3.5 (Uniqueness principle for distributions). Let $u, v \in \mathcal{S}'(\mathbb{R}^n)$ and suppose that $u(\varphi) = v(\varphi)$ for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then $u = v$.

This can be reformulated by saying that if an element $o \in \mathcal{S}'(\mathbb{R}^n)$ satisfies $o(\varphi) = 0$ for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then $o$ is the zero element of $\mathcal{S}'(\mathbb{R}^n)$.

Exercise 1.3.6. Let $f \in L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, and assume that we have
\[ \int_{\mathbb{R}^n} f(x)\, \varphi(x) \, \mathrm{d}x = 0 \]
for all $\varphi \in C^\infty(\mathbb{R}^n)$ for which the integral makes sense. Prove that $f = 0$ almost everywhere. Do also a local version of this statement in Exercise 1.4.20.

Remark 1.3.7 (Functions as distributions). We can interpret functions in $L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, as tempered distributions. If $f \in L^p(\mathbb{R}^n)$, we define the functional $u_f$ by
\[ u_f(\varphi) := \int_{\mathbb{R}^n} f(x)\, \varphi(x) \, \mathrm{d}x, \tag{1.9} \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. By Hölder's inequality, we observe that $|u_f(\varphi)| \le \|f\|_{L^p} \|\varphi\|_{L^q}$, for $\frac{1}{p} + \frac{1}{q} = 1$, and hence $u_f(\varphi)$ is well defined in view of the simple inclusion
$\mathcal{S}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)$, for all $1 \le q \le \infty$. It needs to be verified that $u_f$ is a linear continuous functional on $\mathcal{S}(\mathbb{R}^n)$. It is clearly linear in $\varphi$, while its continuity follows by Hölder's inequality (Proposition 1.2.4) from
\[ |u_f(\varphi_j) - u_f(\varphi)| \le \|f\|_{L^p} \|\varphi_j - \varphi\|_{L^q} \]
and the following lemma:

Lemma 1.3.8. We have $\mathcal{S}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)$ with continuous embedding, i.e., $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ implies that $\varphi_j \to \varphi$ in $L^q(\mathbb{R}^n)$.

Exercise 1.3.9. Prove this lemma.

To summarise, any function $f \in L^p(\mathbb{R}^n)$ leads to a tempered distribution $u_f \in \mathcal{S}'(\mathbb{R}^n)$ in the canonical way given by (1.9). In this way we will view functions in $L^p(\mathbb{R}^n)$ as tempered distributions and continue to simply write $f$ instead of $u_f$. There should be no confusion with this notation since writing $f(x)$ suggests that $f$ is a function, while $f(\varphi)$ suggests that it is applied to test functions and so is viewed as the distribution $u_f$.

Remark 1.3.10 (Consistency of all definitions). With this identification, Definition 1.1.2 of the Fourier transform for functions in $L^1(\mathbb{R}^n)$ agrees with Definition 1.3.2 of the Fourier transform of tempered distributions. Indeed, let $f \in L^1(\mathbb{R}^n)$. Then we have two ways of looking at its Fourier transform:

1. We can use the first definition $\widehat{f}(\xi) = \int_{\mathbb{R}^n} \mathrm{e}^{-2\pi \mathrm{i} x \cdot \xi} f(x) \, \mathrm{d}x$, and then we know that $\widehat{f} \in L^\infty(\mathbb{R}^n)$. In this way we get $u_{\widehat{f}} \in \mathcal{S}'(\mathbb{R}^n)$.

2. We can immediately think of $f \in L^1(\mathbb{R}^n)$ as of the tempered distribution $u_f \in \mathcal{S}'(\mathbb{R}^n)$, and the second definition then produces its Fourier transform $\widehat{u_f} \in \mathcal{S}'(\mathbb{R}^n)$.

Fortunately, these two approaches are consistent and produce the same tempered distribution $u_{\widehat{f}} = \widehat{u_f} \in \mathcal{S}'(\mathbb{R}^n)$. Indeed, we have
\[ u_{\widehat{f}}(\varphi) = \int_{\mathbb{R}^n} \widehat{f} \varphi \, \mathrm{d}x = \int_{\mathbb{R}^n} f \widehat{\varphi} \, \mathrm{d}x = u_f(\widehat{\varphi}). \]
Here we used the multiplication formula for the Fourier transform in Lemma 1.1.22 and the fact that both $f \in L^1(\mathbb{R}^n)$ and $\widehat{f} \in L^\infty(\mathbb{R}^n)$ can be viewed as tempered distributions in the canonical way (see Remark 1.3.7). It follows that we have $\widehat{f}(\varphi) = f(\widehat{\varphi})$, which justifies Definition 1.3.2.

Remark 1.3.11. We note that if $u \in L^1(\mathbb{R}^n)$ and also $\widehat{u} \in L^1(\mathbb{R}^n)$, then the Fourier inversion formula in Theorem 1.1.21 holds for almost all $x \in \mathbb{R}^n$. A more general Fourier inversion formula for tempered distributions will be given in Theorem 1.3.25.

Exercise 1.3.12. Let $1 \le p \le \infty$. Show that if $f_k \to f$ in $L^p(\mathbb{R}^n)$ then $f_k \to f$ in $\mathcal{S}'(\mathbb{R}^n)$.
Chapter 1. Fourier Analysis on Rn
236
It turns out that the Fourier transform acts especially nicely on one of the spaces Lp(Rn), namely on L2(Rn), which is also a Hilbert space. These two facts lead to a very rich Fourier analysis on L2(Rn), with which we will deal only briefly.

Theorem 1.3.13 (Plancherel's and Parseval's formulae). Let u ∈ L2(Rn). Then $\widehat{u}$ ∈ L2(Rn) and
$$\|\widehat{u}\|_{L^2(\mathbb{R}^n)} = \|u\|_{L^2(\mathbb{R}^n)} \quad \text{(Plancherel's identity)}.$$
Moreover, for all u, v ∈ L2(Rn) we have
$$\int_{\mathbb{R}^n} u\,\overline{v}\,dx = \int_{\mathbb{R}^n} \widehat{u}\,\overline{\widehat{v}}\,d\xi \quad \text{(Parseval's identity)}.$$
Proof. We will use the fact (a special case of which follows from Theorem 1.3.31 to be proved later) that S(Rn) is sequentially dense in L2(Rn), i.e., that for every u ∈ L2(Rn) there exists a sequence uj ∈ S(Rn) such that uj → u in L2(Rn). Then Theorem 1.1.30, (i), with ϕ = ψ = uj − uk, implies that
$$\|\widehat{u_j} - \widehat{u_k}\|_{L^2}^2 = \|u_j - u_k\|_{L^2}^2 \to 0,$$
since uj is a convergent sequence in L2(Rn). Thus $\widehat{u_j}$ is a Cauchy sequence in the complete (Banach, see Theorem C.4.9) space L2(Rn), so it must converge to some v ∈ L2(Rn). By the continuity of the Fourier transform in S′(Rn) (see Proposition 1.3.3) we must also have $\widehat{u_j} \to \widehat{u}$ in S′(Rn). By the convergence principle for distributions in Proposition 1.3.4, we get that $\widehat{u} = v \in L^2(\mathbb{R}^n)$. Applying Theorem 1.1.30, (i), again, to ϕ = ψ = uj, we get $\|\widehat{u_j}\|_{L^2}^2 = \|u_j\|_{L^2}^2$. Passing to the limit, we get $\|\widehat{u}\|_{L^2}^2 = \|u\|_{L^2}^2$, which is Plancherel's formula.

Finally, for u, v ∈ L2(Rn), let uj, vj ∈ S(Rn) be such that uj → u and vj → v in L2(Rn). Applying Theorem 1.1.30, (i), to ϕ = uj, ψ = vj, and passing to the limit, we obtain Parseval's identity.

Corollary 1.3.14 (Hausdorff–Young inequality). Let 1 ≤ p ≤ 2 and 1/p + 1/q = 1. If u ∈ Lp(Rn), then $\widehat{u}$ ∈ Lq(Rn) and
$$\|\widehat{u}\|_{L^q(\mathbb{R}^n)} \le \|u\|_{L^p(\mathbb{R}^n)}.$$
Proof. The statement follows by the Riesz–Thorin Interpolation Theorem C.4.18 from the estimate $\|\widehat{u}\|_{L^\infty(\mathbb{R}^n)} \le \|u\|_{L^1(\mathbb{R}^n)}$ in (1.1) and Plancherel's identity $\|\widehat{u}\|_{L^2(\mathbb{R}^n)} = \|u\|_{L^2(\mathbb{R}^n)}$ in Theorem 1.3.13.
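Plancherel's and Parseval's identities have an exact discrete analogue that is easy to verify numerically. The following sketch is our own illustration, not part of the text: the naive `dft` helper is an assumption, and the unitarity constant for the discrete Fourier transform is 1/N rather than 1.

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform: X[k] = sum_n x[n] exp(-2*pi*i*n*k/N)
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75]
y = [1.0, 0.0, -0.5, 2.0, 0.3, 0.7]
X, Y = dft(x), dft(y)
N = len(x)

# Discrete Plancherel: sum |x[n]|^2 = (1/N) sum |X[k]|^2
assert abs(sum(v ** 2 for v in x) - sum(abs(c) ** 2 for c in X) / N) < 1e-9

# Discrete Parseval: sum x[n] conj(y[n]) = (1/N) sum X[k] conj(Y[k])
lhs = sum(a * b for a, b in zip(x, y))
rhs = sum(Xk * Yk.conjugate() for Xk, Yk in zip(X, Y)) / N
assert abs(lhs - rhs.real) < 1e-9 and abs(rhs.imag) < 1e-9
```

The 1/N factor mirrors the fact that the continuous Fourier transform with the kernel e^{−2πix·ξ} is exactly unitary, while the DFT is unitary only after normalisation by N^{−1/2}.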
1.3.2 Operations with distributions

Besides the Fourier transform, there are several other operations that can be extended from functions in S(Rn) to tempered distributions in S′(Rn).
For example, the partial differentiation operator ∂/∂xj can be extended to a continuous operator ∂/∂xj : S′(Rn) → S′(Rn). Indeed, for u ∈ S′(Rn) and ϕ ∈ S(Rn), let us define
$$\left(\frac{\partial}{\partial x_j}\,u\right)(\varphi) := -\,u\!\left(\frac{\partial\varphi}{\partial x_j}\right).$$
It is necessary to include the negative sign in this definition. Indeed, if u ∈ S(Rn), then the integration by parts formula and the identification of functions with distributions in Remark 1.3.7 yield
$$\left(\frac{\partial u}{\partial x_j}\right)(\varphi) = \int_{\mathbb{R}^n} \frac{\partial u}{\partial x_j}(x)\,\varphi(x)\,dx = -\int_{\mathbb{R}^n} u(x)\,\frac{\partial\varphi}{\partial x_j}(x)\,dx = -\,u\!\left(\frac{\partial\varphi}{\partial x_j}\right),$$
which explains the sign. This also shows the consistency of this definition of the derivative with the usual definition for differentiable functions.

Definition 1.3.15 (Distributional derivatives). More generally, for any multi-index α, one can define
$$(\partial^\alpha u)(\varphi) = (-1)^{|\alpha|}\,u(\partial^\alpha\varphi), \quad \text{for } \varphi \in S(\mathbb{R}^n).$$

Proposition 1.3.16. If u ∈ S′(Rn), then ∂αu ∈ S′(Rn) and the operator ∂α : S′(Rn) → S′(Rn) is continuous.

Proof. Indeed, if ϕk → ϕ in S(Rn), then clearly also ∂αϕk → ∂αϕ in S(Rn), and, therefore, (∂αu)(ϕk) = (−1)|α| u(∂αϕk) → (−1)|α| u(∂αϕ) = (∂αu)(ϕ), which means that ∂αu ∈ S′(Rn). Moreover, let uk → u in S′(Rn). Then ∂αuk(ϕ) = (−1)|α| uk(∂αϕ) → (−1)|α| u(∂αϕ) = ∂αu(ϕ) for all ϕ ∈ S(Rn), i.e., ∂α is continuous on S′(Rn).

Exercise 1.3.17. Show that if u ∈ S′(Rn), then ∂α∂βu = ∂β∂αu = ∂α+βu.

Exercise 1.3.18. Let χ : R → R be the characteristic function of the interval [−1, 1], i.e., χ(y) = 1 for −1 ≤ y ≤ 1 and χ(y) = 0 for |y| > 1. Calculate the distributional derivative χ′. Define the operator T by Tf(x) = d/dx (χ ∗ f)(x), x ∈ R, f ∈ S(R). Prove that Tf(x) = f(x + 1) − f(x − 1).

Remark 1.3.19 (Multiplication by functions). If a smooth function f ∈ C∞(Rn) and all of its derivatives are bounded by some polynomial functions, we can define the multiplication of a tempered distribution u by f by setting (fu)(ϕ) := u(fϕ). This is well defined since ϕ ∈ S(Rn) implies fϕ ∈ S(Rn).
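The identity in Exercise 1.3.18 can be sanity-checked numerically. The sketch below is our own illustration (the helper names and the choice f(t) = e^{−t²} are assumptions, not the book's): since (χ ∗ f)(x) = ∫_{x−1}^{x+1} f(t) dt, this convolution has a closed form via the error function, and its difference quotient should match f(x + 1) − f(x − 1).

```python
import math

def f(t):
    # A convenient Schwartz function with a known antiderivative (via erf)
    return math.exp(-t * t)

def conv_chi_f(x):
    # (chi * f)(x) = integral_{x-1}^{x+1} e^{-t^2} dt = (sqrt(pi)/2)(erf(x+1) - erf(x-1))
    return math.sqrt(math.pi) / 2 * (math.erf(x + 1) - math.erf(x - 1))

def Tf(x, h=1e-5):
    # T f(x) = d/dx (chi * f)(x), approximated by a central difference
    return (conv_chi_f(x + h) - conv_chi_f(x - h)) / (2 * h)

for x in [-2.0, -0.3, 0.0, 0.7, 1.8]:
    assert abs(Tf(x) - (f(x + 1) - f(x - 1))) < 1e-6
```

This reflects the distributional computation χ′ = δ₋₁ − δ₁ (shifted Dirac distributions), so that differentiating the convolution evaluates f at the two jump points.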
Exercise 1.3.20 (Hadamard's principal value). Show that log |x| is a tempered distribution on R. Let u = (d/dx) log |x|. Show that
$$u(\varphi) = \lim_{\epsilon\to 0} \int_{\mathbb{R}\setminus[-\epsilon,\epsilon]} \frac{1}{x}\,\varphi(x)\,dx$$
for all ϕ ∈ C1(R) vanishing outside a bounded set. The distribution u is called the principal value of 1/x and is denoted by p.v. 1/x.

Remark 1.3.21 (Schwartz' impossibility result). One has to be careful when multiplying distributions, as the following example shows:
$$0 = \frac{1}{x}\cdot(x\cdot\delta) = \left(\frac{1}{x}\cdot x\right)\cdot\delta = \delta,$$
where 1/x may be any inverse of x, for example p.v. 1/x. In general, distributions cannot be multiplied, as was noted by Laurent Schwartz in [104], and this is called Schwartz' impossibility result. Still, some multiplication is possible, as is demonstrated by Remark 1.3.19.

Exercise 1.3.22. Define the distribution 1/(x ± i0) by
$$\left(\frac{1}{x\pm i0}\right)(\varphi) := \lim_{\epsilon\to 0\pm} \int_{\mathbb{R}} \frac{1}{x+i\epsilon}\,\varphi(x)\,dx,$$
for ϕ ∈ S(R). Prove that
$$\frac{1}{x\pm i0} = \mathrm{p.v.}\,\frac{1}{x} \mp i\pi\delta.$$

However, as we have seen, statements on S(Rn) can usually be extended to corresponding statements on S′(Rn). This applies to the Fourier inversion formula as well.

Definition 1.3.23 (Inverse Fourier transform). Define F−1 on S′(Rn) by (F−1u)(ϕ) := u(F−1ϕ), for u ∈ S′(Rn) and ϕ ∈ S(Rn).

Exercise 1.3.24. Show that F−1 : S′(Rn) → S′(Rn) is well defined and continuous.

Theorem 1.3.25 (Fourier inversion formula for tempered distributions). The operators F and F−1 are inverse to each other on S′(Rn), i.e.,
$$\mathcal{F}\mathcal{F}^{-1} = \mathcal{F}^{-1}\mathcal{F} = \mathrm{identity} \quad \text{on } S'(\mathbb{R}^n).$$
Proof. To prove this, let u ∈ S′(Rn) and ϕ ∈ S(Rn). Then by Theorem 1.1.21 and Definitions 1.3.2 and 1.3.23, we get (FF−1u)(ϕ) = (F−1u)(Fϕ) = u(F−1Fϕ) = u(ϕ), so FF−1u = u by the uniqueness principle for distributions in Proposition 1.3.5. A similar argument applies to show that F−1F = id.
Remark 1.3.26. To give an example of these operations, let us define the Heaviside function H on R by setting
$$H(x) = \begin{cases} 0, & \text{if } x < 0, \\ 1, & \text{if } x \ge 0. \end{cases}$$
Clearly H ∈ L∞(R), so in particular it is a tempered distribution in S′(R). Let us also define the Dirac δ-distribution by setting δ(ϕ) = ϕ(0) for all ϕ ∈ S(R). It is easy to see that δ ∈ S′(R). We claim first that H′ = δ. Indeed, we have
$$H'(\varphi) = -H(\varphi') = -\int_0^\infty \varphi'(x)\,dx = \varphi(0) = \delta(\varphi),$$
hence H′ = δ by the uniqueness principle for distributions. Let us now calculate the Fourier transform of δ. According to the definitions, we have
$$\widehat{\delta}(\varphi) = \delta(\widehat{\varphi}) = \widehat{\varphi}(0) = \int_{\mathbb{R}} \varphi(x)\,dx = 1(\varphi),$$
hence $\widehat{\delta} = 1$. Here we used the fact that the constant 1 is in L∞(R), hence also a tempered distribution.

Exercise 1.3.27. Check that we also have $\widehat{1} = \delta$.
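The pairing H′(ϕ) = −H(ϕ′) = ϕ(0) can be checked numerically. The sketch below is our own illustration (the `simpson` quadrature helper and the test function ϕ(x) = e^{−x²}, truncated to [0, 10] since it decays rapidly, are assumptions, not the book's):

```python
import math

def simpson(g, a, b, n=4000):
    # Composite Simpson's rule (n must be even)
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

phi = lambda x: math.exp(-x * x)            # test function
dphi = lambda x: -2 * x * math.exp(-x * x)  # its derivative

# <H', phi> = -<H, phi'> = -int_0^infty phi'(x) dx, which should equal phi(0)
pairing = -simpson(dphi, 0.0, 10.0)
assert abs(pairing - phi(0.0)) < 1e-8
```

The exactness is no accident: −∫₀^∞ ϕ′ = ϕ(0) − ϕ(∞) = ϕ(0), which is precisely the computation in the remark.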
1.3.3 Approximating by smooth functions
It turns out that although elements of S′(Rn) can be very irregular and the space is quite large, tempered distributions can still be approximated by smooth compactly supported functions.

Definition 1.3.28 (Space C0∞(Ω)). For an open set Ω ⊂ Rn, the space C0∞(Ω) of smooth compactly supported functions is defined as the space of smooth functions ϕ : Ω → C with compact support. Here the support of ϕ is defined as the closure of the set where ϕ is non-zero, i.e., by
$$\operatorname{supp}\varphi = \overline{\{x \in \Omega : \varphi(x) \neq 0\}}.$$

Remark 1.3.29 (How large is C0∞(Ω)?). We can see that this space is non-empty. For example, if we define the function χ(t) by χ(t) = e−1/t for t > 0 and by χ(t) = 0 for t ≤ 0, then f(t) = χ(t)χ(1 − t) is a smooth compactly supported function on R. Consequently, ϕ(x) = f(x1)···f(xn) is a function in C0∞(Rn), with supp ϕ = [0, 1]n. Another example is the function ψ defined by $\psi(x) = e^{1/(|x|^2-1)}$ for |x| < 1 and by ψ(x) = 0 for |x| ≥ 1. We have ψ ∈ C0∞(Rn) with supp ψ = {|x| ≤ 1}.
Remark 1.3.30. For the functional analytic description of the topology of the space C0∞(Ω) we refer to Exercise B.3.12. It is also a nuclear Montel space, see Exercises B.3.35 and B.3.51. Although these examples are quite special, products of these functions with any other smooth function, as well as their derivatives, are all in C0∞(Rn). On the other hand, C0∞(Rn) cannot contain analytic functions, thus making it relatively small. Still, it is dense in very large spaces of functions/distributions in their respective topologies.

Theorem 1.3.31 (Sequential density of C0∞(Rn) in S′(Rn)). The space C0∞(Rn) is sequentially dense in S′(Rn), i.e., for every u ∈ S′(Rn) there exists a sequence uk ∈ C0∞(Rn) such that uk → u in S′(Rn) as k → ∞.

Lemma 1.3.32. The space C0∞(Rn) is sequentially dense in S(Rn), i.e., for every ϕ ∈ S(Rn) there exists a sequence ϕk ∈ C0∞(Rn) such that ϕk → ϕ in S(Rn) as k → ∞.

Proof. Let ϕ ∈ S(Rn). Let us fix some ψ ∈ C0∞(Rn) such that ψ = 1 in a neighbourhood of the origin, and let us define ψk(x) = ψ(x/k). Then it can be easily checked that ϕk = ψkϕ → ϕ in S(Rn) as k → ∞.

Proof of Theorem 1.3.31. Let u ∈ S′(Rn) and let ψ and ψk be as in the proof of Lemma 1.3.32. Then ψu ∈ S′(Rn) is well defined by (ψu)(ϕ) = u(ψϕ), for all ϕ ∈ S(Rn). We have that ψku → u in S′(Rn). Indeed, we have that (ψku)(ϕ) = u(ψkϕ) → u(ϕ) by Lemma 1.3.32. Similarly, we have that $\psi_k\widehat{u} \to \widehat{u}$ in S′(Rn), and hence also $\mathcal{F}^{-1}(\psi_k\widehat{u}) \to u$ in S′(Rn) because of the continuity of the Fourier transform in S′(Rn), see Proposition 1.3.3. Consequently, we have
$$u_{kj} = \psi_j\big(\mathcal{F}^{-1}(\psi_k\widehat{u})\big) \to u \quad \text{in } S'(\mathbb{R}^n) \text{ as } k, j \to \infty.$$
It remains to show that ukj ∈ C0∞(Rn). In general, let χ ∈ C0∞(Rn) and let $w = \chi\widehat{u}$. We claim that F−1w ∈ C∞(Rn). Indeed, we have
$$(\mathcal{F}^{-1}w)(\varphi) = w(\mathcal{F}^{-1}\varphi) = w_\xi\!\left(\int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\varphi(x)\,dx\right) = \int_{\mathbb{R}^n} w_\xi\!\left(e^{2\pi i x\cdot\xi}\right)\varphi(x)\,dx,$$
where we write wξ to emphasise that w acts on the test function as a function of the ξ-variable, and where we used the continuity of w and the fact that $w_\xi(e^{2\pi i x\cdot\xi}) = \widehat{u}(\chi\,e^{2\pi i x\cdot\xi})$ is well defined. Now, it follows that F−1w can be identified with the function
$$(\mathcal{F}^{-1}w)(x) = \widehat{u}_\xi\!\left(\chi(\xi)\,e^{2\pi i x\cdot\xi}\right),$$
which is smooth with respect to x. Indeed, we can note first that the right-hand side depends continuously on x because of the continuity of $\widehat{u}$ on S(Rn). Here we also use that everything is well defined since χ ∈ C0∞(Rn). Moreover, since the function χ(ξ)e2πix·ξ is compactly supported in ξ, so are its derivatives with respect to x, and hence all the derivatives of (F−1w)(x) are also continuous in x, proving the claim and the theorem.
Exercise 1.3.33. Prove that S(Rn) is sequentially dense in L2(Rn), i.e., that for every u ∈ L2(Rn) there exists a sequence uj ∈ S(Rn) such that uj → u in L2(Rn). Prove that this is also true for Lp(Rn), for all 1 ≤ p < ∞.

Exercise 1.3.34 (Uncertainty principle). Prove that C0∞(Rn) ∩ FC0∞(Rn) = {0}. (Hint: it is enough to know that polynomials are dense in L2(K) for any compact K.)

Exercise 1.3.35 (Scaling operators). For λ ∈ R, λ ≠ 0, define the mapping mλ : Rn → Rn by mλ(x) = λx.

(i) Let ϕ ∈ S(Rn). Prove that $\widehat{\varphi\circ m_\lambda}(\xi) = \lambda^{-n}\,(\widehat{\varphi}\circ m_{\lambda^{-1}})(\xi)$ for all ξ ∈ Rn.

(ii) Let u ∈ S′(Rn). Define the distribution u ∘ mλ by (u ∘ mλ)(ϕ) := λ−n u(ϕ ∘ mλ−1), for all ϕ ∈ S(Rn). Prove that this definition is consistent with S(Rn), i.e., show that if u ∈ S(Rn), so that (u ∘ mλ)(x) = u(λx), and if we identify u ∘ mλ with its canonical distribution, then we have (u ∘ mλ)(ϕ) = λ−n u(ϕ ∘ mλ−1) for all ϕ ∈ S(Rn).

(iii) Let u ∈ S′(Rn). Prove that $\widehat{u\circ m_\lambda} = \lambda^{-n}\,\widehat{u}\circ m_{\lambda^{-1}}$.
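For part (i) of Exercise 1.3.35 the computation is a single change of variables; here is a sketch for λ > 0 (an assumption for simplicity: for general λ ≠ 0 the substitution produces the factor |λ|−n):

```latex
\widehat{\varphi\circ m_\lambda}(\xi)
  = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi}\,\varphi(\lambda x)\,dx
  \overset{y=\lambda x}{=} \lambda^{-n}\int_{\mathbb{R}^n} e^{-2\pi i y\cdot(\xi/\lambda)}\,\varphi(y)\,dy
  = \lambda^{-n}\,\widehat{\varphi}\!\left(\lambda^{-1}\xi\right)
  = \lambda^{-n}\,(\widehat{\varphi}\circ m_{\lambda^{-1}})(\xi).
```

Part (iii) then follows by dualising this identity against test functions, using the definition in part (ii).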
1.4 Distributions

Since our main interest is in Fourier analysis, we started with the space S′(Rn) of tempered distributions, which allows the definition and use of the Fourier transform. However, there is a bigger space of distributions which we will sketch here. It will contain some important classes of functions that S′(Rn) does not contain. For much more comprehensive treatments of spaces of distributions and their properties we refer the reader to the monographs [8, 10, 39, 106, 105].
1.4.1 Localisation of Lp-spaces and distributions

Definition 1.4.1 (Localisations of Lp-spaces). We define local versions of the spaces Lp(Ω) as follows. We will say that f ∈ Lploc(Ω) if ϕf ∈ Lp(Ω) for all ϕ ∈ C0∞(Ω). We note that the spaces Lploc(Rn) are not subspaces of S′(Rn) since they do not encode any information on the global behaviour of functions. For example, $e^{|x|^2}$ is smooth, and hence belongs to all Lploc(Rn), 1 ≤ p ≤ ∞, but it is not in S′(Rn).

There is a natural notion of convergence in the localised spaces Lploc(Ω). Thus, we will write fm → f in Lploc(Ω) as m → ∞, if f and fm belong to Lploc(Ω) for all m, and if ϕfm → ϕf in Lp(Ω) as m → ∞, for all ϕ ∈ C0∞(Ω).

The difference between the space of distributions D′(Rn) that we are going to introduce now and the space of tempered distributions S′(Rn) is the choice of
the set C0∞(Rn) rather than S(Rn) as the space of test functions. At the same time, choosing C0∞(Ω) as test functions allows one to obtain the space D′(Ω) of distributions in Ω, rather than on the whole space Rn. The definition and facts below are only sketched, as they are similar to Definition 1.3.1.

Definition 1.4.2 (Distributions D′(Ω)). We say that ϕk → ϕ in C0∞(Ω) if ϕk, ϕ ∈ C0∞(Ω), if there is a compact set K ⊂ Ω such that supp ϕk ⊂ K for all k, and if supx∈Ω |∂α(ϕk − ϕ)(x)| → 0 for all multi-indices α. Then D′(Ω) is defined as the set of all linear continuous functionals u : C0∞(Ω) → C, i.e., all functionals u : C0∞(Ω) → C such that:

1. u is linear, i.e., u(αϕ + βψ) = αu(ϕ) + βu(ψ) for all α, β ∈ C and all ϕ, ψ ∈ C0∞(Ω);

2. u is continuous, i.e., u(ϕj) → u(ϕ) in C whenever ϕj → ϕ in C0∞(Ω).

Exercise 1.4.3 (Order of a distribution). Show that a linear operator u : C0∞(Ω) → C belongs to D′(Ω) if and only if for every compact set K ⊂ Ω there exist constants C and m such that
$$|u(\varphi)| \le C \max_{|\alpha|\le m}\,\sup_{x\in\Omega} |\partial^\alpha\varphi(x)|, \qquad (1.10)$$
for all ϕ ∈ C0∞(Ω) with supp ϕ ⊂ K. The smallest m for which (1.10) holds is called the order of u in K. The smallest m which works for all compact sets is called the order of the distribution u. Show that the δ-distribution has order zero. Find examples of distributions of infinite order.
1. u is linear, i.e., u(αϕ + βψ) = αu(ϕ) + βu(ψ) for all α, β ∈ C and all ϕ, ψ ∈ C∞(Ω);

2. u is continuous, i.e., u(ϕj) → u(ϕ) in C whenever ϕj → ϕ in C∞(Ω).

Exercise 1.4.9. Show that the restriction of u ∈ E′(Ω) to C0∞(Ω) is an injective linear mapping from E′(Ω) to D′(Ω).

Exercise 1.4.10. Show that E′(Ω) ⊂ D′(Ω) and that E′(Rn) ⊂ S′(Rn) ⊂ D′(Rn). Show also that all these inclusions are continuous.

Definition 1.4.11 (Support of a distribution). We say that u ∈ D′(Ω) is supported in the set K ⊂ Ω if u(ϕ) = 0 for all ϕ ∈ C∞(Ω) such that ϕ = 0 on K. The smallest closed set in which u is supported is called the support of u and is denoted by supp u.

Exercise 1.4.12. Formulate and prove the analogue of the criterion in Exercise 1.4.3 for compactly supported distributions in E′(Ω).

Exercise 1.4.13. Show that distributions in E′(Ω) have compact support (justifying the name of "compactly supported" distributions in Definition 1.4.8).

Exercise 1.4.14 (Distributions with compact support). Prove that if the support of u ∈ D′(Rn) is compact then u is of finite order. Prove that all compactly supported distributions belong to E′(Ω).

Exercise 1.4.15 (Distributions with point support). Prove that if a distribution u ∈ D′(Rn) of order m has support supp u = {0}, then there exist constants aα ∈ C such that $u = \sum_{|\alpha|\le m} a_\alpha\,\partial^\alpha\delta$.

Definition 1.4.16 (Singular support). The singular support of u ∈ D′(Ω) is defined as the complement of the set where u is smooth. Namely, x ∉ sing supp u if there is an open neighbourhood U of x and a smooth function f ∈ C∞(U) such that u(ϕ) = f(ϕ) for all ϕ ∈ C0∞(U).

Exercise 1.4.17. Show that if u ∈ D′(Ω) then its singular support is closed.

Exercise 1.4.18. Show that sing supp |x| = {0} and that sing supp δ = {0}.

Exercise 1.4.19. Show that if u ∈ E′(Rn), then $\widehat{u}$ is a smooth function of the so-called slow growth (i.e., $\widehat{u}(\xi)$ and all of its derivatives are of at most polynomial growth). Hint: the slow growth follows from testing u on the exponential functions eξ(x) = e2πix·ξ. Indeed, show first that $\widehat{u}(\xi) = \langle u, e_{-\xi}\rangle$ and thus
$$\partial^\alpha\widehat{u}(\xi) = \langle u, \partial_\xi^\alpha e_{-\xi}\rangle = (-2\pi i)^{|\alpha|}\langle u, x^\alpha e_{-\xi}\rangle.$$
Consequently, by an analogue of (1.10) in Exercise 1.4.3 we conclude that
$$|\partial^\alpha\widehat{u}(\xi)| \le C \sup_{|\beta|\le m,\ |x|\le R} |\partial_x^\beta(x^\alpha e_{-\xi}(x))| \le \tilde{C}(1+R)^{|\alpha|}(1+|\xi|)^m.$$

Exercise 1.4.20. Prove the following stronger version of Exercise 1.3.6. Let f ∈ L1loc(Rn) and assume that $\int_{\mathbb{R}^n} f(x)\,\varphi(x)\,dx = 0$ for all ϕ ∈ C0∞(Rn). Prove that f = 0 almost everywhere.
1.4.2 Convolution of distributions
We can write the convolution of two functions f, g ∈ S(Rn) in the following way:
$$(f*g)(x) = \int_{\mathbb{R}^n} f(z)\,g(x-z)\,dz = \int_{\mathbb{R}^n} f(z)\,(\tau_x Rg)(z)\,dz,$$
where (Rg)(x) = g(−x) and (τhg)(x) = g(x − h), so that (τxRg)(z) = (Rg)(z − x) = g(x − z). Recalling our identification of functions with distributions in Remark 1.3.7, we can write (f ∗ g)(x) = f(τxRg). This can now be extended to distributions.

Definition 1.4.21 (Convolution with a distribution). For u ∈ S′(Rn) and ϕ ∈ S(Rn), define
$$(u*\varphi)(x) := u(\tau_x R\varphi).$$
The definition makes sense since τxRϕ ∈ S(Rn) and since τx, R : S(Rn) → S(Rn) are continuous.

Corollary 1.4.22. For example, for ψ ∈ S(Rn) we have δ ∗ ψ = ψ, since for every x ∈ Rn we have (δ ∗ ψ)(x) = δ(τxRψ) = ψ(x − z)|z=0 = ψ(x).

Lemma 1.4.23. Let u ∈ S′(Rn) and ϕ ∈ S(Rn). Then u ∗ ϕ ∈ C∞(Rn).

Proof. We can observe that (u ∗ ϕ)(x) = u(τxRϕ) is continuous in x since τx : S(Rn) → S(Rn) and u : S(Rn) → C are continuous. The same applies when we look at derivatives in x, implying that u ∗ ϕ is smooth. Here we note that we are allowed to pass the limit through u since it is a continuous functional.

Exercise 1.4.24. Prove that if u, v, ϕ ∈ S(Rn) then (u ∗ v)(ϕ) = u(Rv ∗ ϕ).

Exercise 1.4.25 (Reflection of a distribution). For v ∈ S′(Rn), define its reflection Rv by (Rv)(ϕ) := v(Rϕ), for ϕ ∈ S(Rn). Prove that Rv ∈ S′(Rn). Prove also that this definition is consistent with the definition of (Rg)(x) = g(−x) for g ∈ C∞(Rn).

Exercise 1.4.26. Show that if v ∈ S′(Rn), then the mapping ϕ ↦ Rv ∗ ϕ is continuous from C∞(Rn) to C∞(Rn). Consequently, if v ∈ S′(Rn) and ϕ ∈ S(Rn), we have Rv ∗ ϕ ∈ C∞(Rn) by Lemma 1.4.23. This motivates the following:

Definition 1.4.27 (Convolution of distributions). Let u ∈ E′(Rn) and v ∈ S′(Rn). Define the convolution u ∗ v of u and v by
$$(u*v)(\varphi) := u(Rv*\varphi),$$
for all ϕ ∈ S(Rn).
Exercise 1.4.28. We see from Exercise 1.4.24 that this definition is consistent with S(Rn). Prove that if u ∈ E′(Rn) and v ∈ S′(Rn) then u ∗ v ∈ S′(Rn).

Exercise 1.4.29. Prove that if u ∈ E′(Rn) and v ∈ S′(Rn) then sing supp (u ∗ v) ⊂ sing supp u + sing supp v.

Exercise 1.4.30. Extend the notion of convolution to two distributions u, v ∈ D′(Rn) when at least one of them has compact support.

Exercise 1.4.31 (Diagonal property). Show that the convolution u ∗ v of two distributions exists if for every compact set K the intersection (supp u × supp v) ∩ {(x, y) : x + y ∈ K} is compact. This allows, for example, to take the convolution of two Heaviside functions H, yielding H ∗ H = xH.

Remark 1.4.32. Let us show an example of a calculation with distributions. Let v ∈ S′(Rn). We will show that v ∗ δ = δ ∗ v = v. Indeed, on one hand we have
$$(v*\delta)(\varphi) \overset{1.4.27}{=} v(R\delta*\varphi) = v(\delta*\varphi) \overset{1.4.22}{=} v(\varphi),$$
in view of Rδ = δ:
$$\langle R\delta, \psi\rangle \overset{1.4.25}{=} \langle\delta, R\psi\rangle = R\psi(0) = \psi(0) = \langle\delta, \psi\rangle.$$
On the other hand, we have
$$(\delta*v)(\varphi) \overset{1.4.27}{=} \delta(Rv*\varphi) = (Rv*\varphi)(0) \overset{1.4.21}{=} Rv(\tau_0 R\varphi) = Rv(R\varphi) \overset{1.4.25}{=} v(\varphi).$$
Note that in view of Exercise 1.4.30 we could have taken v ∈ D′(Rn) here.

Exercise 1.4.33. Let u ∈ S′(Rn) and v ∈ E′(Rn). Define u ∗ v as v ∗ u, i.e., u ∗ v := v ∗ u. Prove that this coincides with Definition 1.4.27 when u, v ∈ E′(Rn).

Exercise 1.4.34. Prove that the extension of (1.8) holds, i.e., that ∂α(f ∗ g) = ∂αf ∗ g = f ∗ ∂αg.

Remark 1.4.35 (Non-associativity of convolution). In Exercise 1.1.29 we formulated the associativity of the convolution. However, for distributions one has to be careful. Indeed, recalling the relation H′ = δ from Remark 1.3.26 and assuming associativity, one could "prove"
$$1 \overset{1.4.32}{=} \delta*1 \overset{1.4.32}{=} (\delta*\delta)*1 \overset{1.4.34}{=} (H*\delta')*1 \;\text{“=”}\; H*(\delta'*1) \overset{1.4.34}{=} H*(\delta*1') = H*0 = 0.$$
Exercise 1.4.36. Why does the associativity in Remark 1.4.35 fail? How could we restrict the spaces of distributions for the convolution to still be associative?

Exercise 1.4.37. Show that if u ∈ E′(Rn) and v ∈ S′(Rn), then $\widehat{u*v} = \widehat{u}\,\widehat{v}$, where the product on the right-hand side makes sense in view of Remark 1.3.19 and Exercise 1.4.19.

We now formulate a couple of useful properties of translations:

Exercise 1.4.38 (Translation is continuous in Lp(Rn)). Prove that translation is continuous in Lp(Rn), namely that the translations τx : Lp(Rn) → Lp(Rn), (τxf)(y) = f(y − x), satisfy ‖τxf − f‖Lp → 0 as x → 0, for every f ∈ Lp(Rn).

Exercise 1.4.39 (Translations of convolutions). For f, g ∈ L1(Rn), show that the convolution of f and g satisfies τx(f ∗ g) = (τxf) ∗ g = f ∗ (τxg). Can you extend this to some classes of distributions?
1.5 Sobolev spaces
In this section we discuss Sobolev spaces Lpk with integer orders k ∈ N. After introducing the necessary elements of the theory of pseudo-differential operators we will come back to this topic in Section 2.6.3 to also discuss Sobolev spaces Lps for all real s ∈ R.
1.5.1 Weak derivatives and Sobolev spaces

There is a notion of a weak derivative which is a special case of the distributional derivative from Definition 1.3.15. However, it allows a realisation in an integral form and we mention it here briefly.

Definition 1.5.1 (Weak derivative). Let Ω be an open subset of Rn and let u, v ∈ L1loc(Ω). We say that v is the αth weak partial derivative of u if
$$\int_\Omega u\,\partial^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx, \quad \text{for all } \varphi \in C_0^\infty(\Omega).$$
In this case we also write v = ∂αu. The constant (−1)|α| stands for the consistency with the corresponding definition for smooth functions when using integration by parts in Ω; it is included for the same reason as the constant (−1)|α| in Definition 1.3.15. The weak derivative defined in this way is uniquely determined:

Lemma 1.5.2. Let u ∈ L1loc(Ω). If a weak αth derivative of u exists, it is uniquely defined up to a set of measure zero.
Proof. Indeed, assume that there are two functions v, w ∈ L1loc(Ω) such that
$$\int_\Omega u\,\partial^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx = (-1)^{|\alpha|}\int_\Omega w\,\varphi\,dx,$$
for all ϕ ∈ C0∞(Ω). Then $\int_\Omega (v-w)\,\varphi\,dx = 0$ for all ϕ ∈ C0∞(Ω). A standard result from measure theory (e.g., Theorem C.4.60) now implies that v = w almost everywhere in Ω.

Exercise 1.5.3. Let us define u, v : R → R by
$$u(x) = \begin{cases} x, & \text{if } x \le 1, \\ 1, & \text{if } x > 1, \end{cases} \qquad v(x) = \begin{cases} 1, & \text{if } x \le 1, \\ 0, & \text{if } x > 1. \end{cases}$$
Prove that u′ = v weakly.

Exercise 1.5.4. Define u : R → R by
$$u(x) = \begin{cases} x, & \text{if } x \le 1, \\ 2, & \text{if } x > 1. \end{cases}$$
Prove that u has no weak derivative. Calculate the distributional derivative of u.

Exercise 1.5.5. Prove that the Dirac δ-distribution is not an element of L1loc(Rn).

There are different ways to define Sobolev spaces². Here we choose the one using weak or distributional derivatives.

Definition 1.5.6 (Sobolev spaces). Let 1 ≤ p ≤ ∞ and let k ∈ N ∪ {0}. The Sobolev space Lpk(Ω) (or W p,k(Ω)) consists of all u ∈ L1loc(Ω) such that for all multi-indices α with |α| ≤ k, ∂αu exists weakly (or distributionally) and ∂αu ∈ Lp(Ω). For u ∈ Lpk(Ω), we define
$$\|u\|_{L^p_k(\Omega)} := \Big(\sum_{|\alpha|\le k} \|\partial^\alpha u\|_{L^p}^p\Big)^{1/p} = \Big(\sum_{|\alpha|\le k} \int_\Omega |\partial^\alpha u|^p\,dx\Big)^{1/p},$$
for 1 ≤ p < ∞, and for p = ∞ we define
$$\|u\|_{L^\infty_k(\Omega)} := \max_{|\alpha|\le k}\,\operatorname{esssup}_\Omega |\partial^\alpha u|.$$
Since p ≥ 1, we know that Lploc(Ω) ⊂ L1loc(Ω), e.g., by Hölder's inequality (Proposition 1.2.4), so we note that it does not matter whether we take a weak or a distributional derivative. In the case p = 2, one often uses the notation Hk(Ω) for L2k(Ω), and in the case p = 2 and k = 0, we get H0(Ω) = L2(Ω). As usual, we identify functions in Lpk(Ω) which are equal almost everywhere (see Definition C.4.6).

² We come back to this subject in Section 2.6.3.
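Exercise 1.5.3 can be checked numerically: with a rapidly decaying test function the two integrals in Definition 1.5.1 agree. The sketch below is our own illustration (the `simpson` helper and the choice ϕ(x) = e^{−x²}, truncated to [−8, 8], are assumptions, not the book's); it splits the left-hand integral at the kink x = 1 where u changes formula.

```python
import math

def simpson(g, a, b, n=4000):
    # Composite Simpson's rule (n must be even)
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4 * sum(g(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(g(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

phi = lambda x: math.exp(-x * x)            # test function (decays fast)
dphi = lambda x: -2 * x * math.exp(-x * x)  # phi'

# int u(x) phi'(x) dx, split at x = 1: u(x) = x for x <= 1, u(x) = 1 for x > 1
lhs = simpson(lambda x: x * dphi(x), -8.0, 1.0) + simpson(dphi, 1.0, 8.0)
# -(int v(x) phi(x) dx) with v = 1 on (-inf, 1], 0 on (1, inf)
rhs = -simpson(phi, -8.0, 1.0)
assert abs(lhs - rhs) < 1e-8
```

The agreement reflects the integration by parts ∫_{−∞}^{1} x ϕ′ dx = ϕ(1) − ∫_{−∞}^{1} ϕ dx together with ∫_{1}^{∞} ϕ′ dx = −ϕ(1); the boundary terms at x = 1 cancel, which is exactly why u is weakly differentiable here (and fails to be in Exercise 1.5.4, where u jumps).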
Proposition 1.5.7. The functions ‖·‖Lpk(Ω) in Definition 1.5.6 are norms on Lpk(Ω).

Proof. Indeed, we clearly have ‖λu‖Lpk = |λ| ‖u‖Lpk, and ‖u‖Lpk = 0 if and only if u = 0 almost everywhere. For the triangle inequality, the case p = ∞ is straightforward. For 1 ≤ p < ∞ and for u, v ∈ Lpk(Ω), Minkowski's inequality (Proposition 1.2.7) implies
$$\begin{aligned}
\|u+v\|_{L^p_k} &= \Big(\sum_{|\alpha|\le k} \|\partial^\alpha u + \partial^\alpha v\|_{L^p}^p\Big)^{1/p}
\le \Big(\sum_{|\alpha|\le k} \big(\|\partial^\alpha u\|_{L^p} + \|\partial^\alpha v\|_{L^p}\big)^p\Big)^{1/p} \\
&\le \Big(\sum_{|\alpha|\le k} \|\partial^\alpha u\|_{L^p}^p\Big)^{1/p} + \Big(\sum_{|\alpha|\le k} \|\partial^\alpha v\|_{L^p}^p\Big)^{1/p}
= \|u\|_{L^p_k} + \|v\|_{L^p_k},
\end{aligned}$$
completing the proof.
We define local versions of the spaces Lpk(Ω) similarly to local versions of Lp-spaces.

Definition 1.5.8 (Localisations of Sobolev spaces). We will say that f ∈ Lpk(Ω)loc if ϕf ∈ Lpk(Ω) for all ϕ ∈ C0∞(Ω). We will write fm → f in Lpk(Ω)loc as m → ∞, if f and fm belong to Lpk(Ω)loc for all m, and if ϕfm → ϕf in Lpk(Ω) as m → ∞, for all ϕ ∈ C0∞(Ω).

Example (Example of a point singularity). An often encountered example of a function with a point singularity is u(x) = |x|−a, defined for x ∈ Ω = B(0, 1) ⊂ Rn, x ≠ 0. We may ask: for which a > 0 do we have u ∈ Lp1(Ω)? First we observe that away from the origin u is a smooth function and can be differentiated pointwise, with ∂xju = −axj|x|−a−2 and hence also |∇u(x)| = |a| |x|−a−1, x ≠ 0. In particular, |∇u| ∈ L1(Ω) for a + 1 < n (here Exercise 1.1.19 is of use), and |∇u| ∈ Lp(Ω) for (a + 1)p < n. So we must assume a + 1 < n and (a + 1)p < n.

Let us now calculate the weak (distributional) derivative of u in Ω. Let ϕ ∈ C0∞(Ω) and let ε > 0. On Ω\B(0, ε) we can integrate by parts to get
$$\int_{\Omega\setminus B(0,\epsilon)} u\,\partial_{x_j}\varphi\,dx = -\int_{\Omega\setminus B(0,\epsilon)} \partial_{x_j}u\,\varphi\,dx + \int_{\partial B(0,\epsilon)} u\,\varphi\,\nu^j\,d\sigma, \qquad (1.11)$$
where dσ is the surface measure on the sphere ∂B(0, ε) and ν = (ν1, ..., νn) is the inward pointing normal on ∂B(0, ε). Now, since u = ε−a on ∂B(0, ε), we can estimate
$$\left|\int_{\partial B(0,\epsilon)} u\,\varphi\,\nu^j\,d\sigma\right| \le \|\varphi\|_{L^\infty} \int_{\partial B(0,\epsilon)} \epsilon^{-a}\,d\sigma \le C\,\epsilon^{n-1-a} \to 0$$
as ε → 0, since a + 1 < n. Passing to the limit in the integration by parts formula (1.11), we get $\int_\Omega u\,\partial_{x_j}\varphi\,dx = -\int_\Omega \partial_{x_j}u\,\varphi\,dx$, which means that ∂xju is also the weak derivative of u. So u ∈ Lp1(Ω) if u, |∇u| ∈ Lp(Ω), which holds for (a + 1)p < n, i.e., for a < (n − p)/p.

Exercise 1.5.9. Find conditions on a in the above example for which u ∈ Lpk(Ω).
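The integrability threshold (a + 1)p < n used in the example can be seen by computing the radial integral in polar coordinates (here ωn−1 denotes the surface measure of the unit sphere; a standard computation, included for orientation):

```latex
\int_{B(0,1)} |\nabla u|^p\,dx
  = |a|^p \int_{B(0,1)} |x|^{-(a+1)p}\,dx
  = |a|^p\,\omega_{n-1} \int_0^1 r^{\,n-1-(a+1)p}\,dr < \infty
  \;\iff\; n-1-(a+1)p > -1 \;\iff\; (a+1)p < n.
```

The same computation with |∇u| replaced by u itself gives the (weaker) condition ap < n for u ∈ Lp(Ω), so the gradient condition is the binding one.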
1.5.2 Some properties of Sobolev spaces

Since Lp(Ω) ⊂ D′(Ω), we can work with u ∈ Lp(Ω) as with functions or as with distributions. In particular, we can differentiate them distributionally, etc. Moreover, as we have already seen, the equality of objects (be it functions, functionals, distributions, etc.) depends on the spaces in which the equality is considered. In Sobolev spaces we can use tools from measure theory, so we work with functions defined almost everywhere. Thus, an equality in Sobolev spaces (as in the following theorem) means pointwise equality almost everywhere.

Theorem 1.5.10 (Properties of Sobolev spaces). Let u, v ∈ Lpk(Ω), and let α be a multi-index with |α| ≤ k. Then:

(i) ∂αu ∈ Lpk−|α|(Ω), and ∂α(∂βu) = ∂β(∂αu) = ∂α+βu for all multi-indices α, β such that |α| + |β| ≤ k.

(ii) For all λ, μ ∈ C we have λu + μv ∈ Lpk(Ω) and ∂α(λu + μv) = λ∂αu + μ∂αv.

(iii) If Ω̃ is an open subset of Ω, then u ∈ Lpk(Ω̃).

(iv) If χ ∈ C0∞(Ω), then χu ∈ Lpk(Ω) and we have the Leibniz formula
$$\partial^\alpha(\chi u) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta}\,(\partial^\beta\chi)(\partial^{\alpha-\beta}u),$$
where $\binom{\alpha}{\beta} = \frac{\alpha!}{\beta!\,(\alpha-\beta)!}$ is the binomial coefficient.

(v) Lpk(Ω) is a Banach space.

Proof. Statements (i), (ii), and (iii) are easy. For example, if ϕ ∈ C0∞(Ω) then also ∂βϕ ∈ C0∞(Ω), and (i) follows from
$$\int_\Omega \partial^\alpha u\,\partial^\beta\varphi\,dx = (-1)^{|\alpha|}\int_\Omega u\,\partial^{\alpha+\beta}\varphi\,dx = (-1)^{|\alpha|+|\alpha+\beta|}\int_\Omega \partial^{\alpha+\beta}u\,\varphi\,dx,$$
since (−1)|α|+|α+β| = (−1)|β|.

Let us now show (iv). The proof will be carried out by induction on |α|. For |α| = 1, writing ⟨u, ϕ⟩ for $u(\varphi) = \int_\Omega u\varphi\,dx$, we get
$$\langle\partial^\alpha(\chi u), \varphi\rangle = (-1)^{|\alpha|}\langle u, \chi\,\partial^\alpha\varphi\rangle = -\langle u, \partial^\alpha(\chi\varphi) - (\partial^\alpha\chi)\varphi\rangle = \langle\chi\,\partial^\alpha u, \varphi\rangle + \langle(\partial^\alpha\chi)u, \varphi\rangle,$$
which is what was required. Now, suppose that the Leibniz formula is valid for all |β| ≤ l, and let us take α with |α| = l + 1. Then we can write α = β + γ with some
|β| = l and |γ| = 1. We get
$$\begin{aligned}
\langle\chi u, \partial^\alpha\varphi\rangle
&= \langle\chi u, \partial^\beta(\partial^\gamma\varphi)\rangle \\
&= (-1)^{|\beta|}\,\langle\partial^\beta(\chi u), \partial^\gamma\varphi\rangle \\
&= (-1)^{|\beta|} \sum_{\sigma\le\beta} \binom{\beta}{\sigma}\,\langle\partial^\sigma\chi\,\partial^{\beta-\sigma}u, \partial^\gamma\varphi\rangle && \text{(by induction hypothesis)} \\
&= (-1)^{|\beta|+|\gamma|} \sum_{\sigma\le\beta} \binom{\beta}{\sigma}\,\langle\partial^\gamma(\partial^\sigma\chi\,\partial^{\beta-\sigma}u), \varphi\rangle && \text{(by definition)} \\
&= (-1)^{|\alpha|} \sum_{\sigma\le\beta} \binom{\beta}{\sigma}\,\langle\partial^\rho\chi\,\partial^{\alpha-\rho}u + \partial^\sigma\chi\,\partial^{\alpha-\sigma}u, \varphi\rangle && \text{(set } \rho = \sigma+\gamma\text{)} \\
&= (-1)^{|\alpha|} \sum_{\rho\le\alpha} \binom{\alpha}{\rho}\,\langle\partial^\rho\chi\,\partial^{\alpha-\rho}u, \varphi\rangle,
\end{aligned}$$
where we used that $\binom{\beta}{\sigma} + \binom{\beta}{\rho} = \binom{\alpha-\gamma}{\rho-\gamma} + \binom{\alpha-\gamma}{\rho} = \binom{\alpha}{\rho}$.

Now let us prove (v). We have already shown in Proposition 1.5.7 that Lpk(Ω) is a normed space. Let us now show that the completeness of Lp(Ω) (Theorem C.4.9) implies the completeness of Lpk(Ω). Let um be a Cauchy sequence in Lpk(Ω). Then ∂αum is a Cauchy sequence in Lp(Ω) for any |α| ≤ k. Since Lp(Ω) is complete, there exists some uα ∈ Lp(Ω) such that ∂αum → uα in Lp(Ω). Let u = u(0,...,0), so in particular we have um → u in Lp(Ω). Let us now show that in fact u ∈ Lpk(Ω) and ∂αu = uα for all |α| ≤ k. Let ϕ ∈ C0∞(Ω). Then
$$\langle\partial^\alpha u, \varphi\rangle = (-1)^{|\alpha|}\langle u, \partial^\alpha\varphi\rangle = (-1)^{|\alpha|}\lim_{m\to\infty}\langle u_m, \partial^\alpha\varphi\rangle = \lim_{m\to\infty}\langle\partial^\alpha u_m, \varphi\rangle = \langle u_\alpha, \varphi\rangle,$$
which implies u ∈ Lpk(Ω) and ∂αu = uα. Moreover, we have ∂αum → ∂αu in Lp(Ω) for all |α| ≤ k, which means that um → u in Lpk(Ω), and hence Lpk(Ω) is complete.

Exercise 1.5.11 (An embedding theorem). Prove that if s > k + n/2 and s ∈ N then Hs(Rn) ⊂ Ck(Rn) and the inclusion is continuous. Do also Exercise 2.6.17 for a sharper version of this embedding.
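For orientation, here is a sketch of the case k = 0 of Exercise 1.5.11, under the assumption (justified in Section 2.6.3) that the Hs-norm is comparable to ‖(1 + |ξ|²)^{s/2} û‖L². By the Cauchy–Schwarz inequality,

```latex
\|\widehat{u}\|_{L^1}
  \le \Big(\int_{\mathbb{R}^n} (1+|\xi|^2)^{-s}\,d\xi\Big)^{1/2}
      \Big(\int_{\mathbb{R}^n} (1+|\xi|^2)^{s}\,|\widehat{u}(\xi)|^2\,d\xi\Big)^{1/2}
  \le C_s\,\|u\|_{H^s},
```

where the first integral is finite precisely because 2s > n. Hence û ∈ L¹(Rn), so by the Fourier inversion formula u coincides almost everywhere with a continuous bounded function and sup |u| ≤ ‖û‖L¹ ≤ Cs ‖u‖Hs. Applying this to the derivatives ∂αu with |α| ≤ k (which lie in Hs−k with s − k > n/2) gives the general case.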
1.5.3 Mollifiers
In Theorem 1.3.31 we saw that we can approximate quite irregular functions or (tempered) distributions by much more regular functions. The argument relied on
the use of Fourier analysis and worked well on $\mathbb{R}^n$. Such a technique is very powerful, as could be seen from the proof of Plancherel's formula in Theorem 1.3.13. On the other hand, when working in subsets of $\mathbb{R}^n$ we may be unable to use the Fourier transform (since for its definition we used the whole space $\mathbb{R}^n$). Thus, we want to be able to approximate functions (or distributions) by smooth functions without using Fourier techniques. This turns out to be possible using the so-called mollification of functions.

Assume for a moment that we are in $\mathbb{R}^n$ again and let us first argue very informally. Let us first look at the Fourier transform of the convolution with a $\delta$-distribution. Thus, for a function $f$ we must have $\widehat{\delta * f} = \widehat{\delta}\,\widehat{f} = \widehat{f}$, if we use that $\widehat{\delta} = 1$. Taking the inverse Fourier transform we obtain the important identity $\delta * f = f$, which will be justified formally later. Now, if we take a family of smooth functions $\eta_\epsilon$ approximating the $\delta$-distribution, i.e., if $\eta_\epsilon \to \delta$ in some sense as $\epsilon \to 0$, and if this convergence is preserved by the convolution, we should get
\[ \eta_\epsilon * f \to \delta * f = f \quad \text{as } \epsilon \to 0. \]
Now, the convolution $\eta_\epsilon * f$ may be defined locally in $\mathbb{R}^n$, and the functions $\eta_\epsilon * f$ will be smooth if the $\eta_\epsilon$ are, thus giving us a way to approximate $f$. We will now make this argument precise. For this, we will proceed in a straightforward manner by looking at the limit of $\eta_\epsilon * f$ for a suitably chosen family of functions $\eta_\epsilon$, referring neither to the $\delta$-distribution nor to the Fourier transform.

Definition 1.5.12 (Mollifiers). For an open set $\Omega \subset \mathbb{R}^n$ and $\epsilon > 0$ we define
\[ \Omega_\epsilon = \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > \epsilon\}. \]
Let us define $\eta \in C_0^\infty(\mathbb{R}^n)$ by
\[ \eta(x) = \begin{cases} C\, e^{\frac{1}{|x|^2-1}}, & \text{if } |x| < 1, \\ 0, & \text{if } |x| \ge 1, \end{cases} \]
where the constant $C$ is chosen so that $\int_{\mathbb{R}^n} \eta\,dx = 1$. Such a function $\eta$ is called a (Friedrichs) mollifier. For $\epsilon > 0$, we define
\[ \eta_\epsilon(x) = \frac{1}{\epsilon^n}\, \eta\!\left(\frac{x}{\epsilon}\right), \]
so that $\operatorname{supp}\eta_\epsilon \subset \overline{B(0,\epsilon)}$ and $\int_{\mathbb{R}^n} \eta_\epsilon\,dx = 1$.

Let $f \in L^1_{loc}(\Omega)$. A mollification of $f$ corresponding to $\eta_\epsilon$ is the family $f_\epsilon = \eta_\epsilon * f$ in $\Omega_\epsilon$, i.e.,
\[ f_\epsilon(x) = \int_\Omega \eta_\epsilon(x-y)\,f(y)\,dy = \int_{B(0,\epsilon)} \eta_\epsilon(y)\,f(x-y)\,dy, \quad \text{for } x \in \Omega_\epsilon. \]
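To make the definition concrete, here is a small numerical sketch of the one-dimensional Friedrichs mollifier; the normalising constant $C$ is computed by quadrature, and the grid parameters are arbitrary illustrative choices:

```python
import numpy as np

def eta(t):
    # Friedrichs bump: exp(1/(t^2 - 1)) inside |t| < 1, zero outside.
    out = np.zeros_like(t)
    m = np.abs(t) < 1
    out[m] = np.exp(1.0 / (t[m] ** 2 - 1.0))
    return out

x = np.linspace(-1.5, 1.5, 30001)
C = 1.0 / np.trapz(eta(x), x)       # normalising constant: integral of C*eta is 1

def eta_eps(t, eps):
    # eta_eps(t) = eps^{-1} * eta(t/eps), supported in [-eps, eps] (here n = 1).
    return C * eta(t / eps) / eps

for eps in (1.0, 0.5, 0.1):
    print(round(np.trapz(eta_eps(x, eps), x), 4))  # -> 1.0 for every eps
```

The rescaling shrinks the support to $[-\epsilon,\epsilon]$ while the total mass stays equal to $1$, which is exactly what makes $\eta_\epsilon$ an approximation of the $\delta$-distribution.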
Theorem 1.5.13 (Properties of mollifications). Let $f \in L^1_{loc}(\Omega)$. Then we have the following properties.

(i) $f_\epsilon \in C^\infty(\Omega_\epsilon)$.
(ii) $f_\epsilon \to f$ almost everywhere as $\epsilon \to 0$.
(iii) If $f \in C(\Omega)$, then $f_\epsilon \to f$ uniformly on compact subsets of $\Omega$.
(iv) $f_\epsilon \to f$ in $L^p_{loc}(\Omega)$ for all $1 \le p < \infty$.

Proof. To show (i), we can differentiate $f_\epsilon(x) = \int_\Omega \eta_\epsilon(x-y) f(y)\,dy$ under the integral sign and use the fact that $f \in L^1_{loc}(\Omega)$. The proof of (ii) will rely on the following theorem.

Theorem 1.5.14 (Lebesgue's differentiation theorem). Let $f \in L^1_{loc}(\Omega)$. Then
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)-f(x)|\,dy = 0 \quad \text{for a.e. } x \in \Omega. \]

Now, for all $x$ for which the statement of Lebesgue's differentiation theorem is true, we can estimate
\[
|f_\epsilon(x) - f(x)| = \left| \int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,(f(y)-f(x))\,dy \right|
\le \epsilon^{-n} \int_{B(x,\epsilon)} \eta\!\left(\frac{x-y}{\epsilon}\right) |f(y)-f(x)|\,dy
\le C\,\frac{1}{|B(x,\epsilon)|} \int_{B(x,\epsilon)} |f(y)-f(x)|\,dy,
\]
where the last expression goes to zero as $\epsilon \to 0$, by the choice of $x$.

For (iii), let $K$ be a compact subset of $\Omega$. Let $K_0 \subset \Omega$ be another compact set such that $K$ is contained in the interior of $K_0$. Then $f$ is uniformly continuous on $K_0$ and the limit in the Lebesgue differentiation theorem holds uniformly for $x \in K$. The same argument as in (ii) then shows that $f_\epsilon \to f$ uniformly on $K$.

Finally, to show (iv), let us choose open sets $U \subset V \subset \Omega$ such that $\overline{U} \subset V_\delta$ and $\overline{V} \subset \Omega_\delta$ for some small $\delta > 0$. Let us show first that $\|f_\epsilon\|_{L^p(U)} \le \|f\|_{L^p(V)}$ for all sufficiently small $\epsilon > 0$. Indeed, for all $x \in U$, we can estimate, using H\"older's inequality,
\[
|f_\epsilon(x)| = \left| \int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,f(y)\,dy \right|
\le \int_{B(x,\epsilon)} \eta_\epsilon^{1-1/p}(x-y)\,\eta_\epsilon^{1/p}(x-y)\,|f(y)|\,dy
\le \left( \int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,dy \right)^{1-1/p} \left( \int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,|f(y)|^p\,dy \right)^{1/p}.
\]
Since $\int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,dy = 1$, we get
\[
\int_U |f_\epsilon(x)|^p\,dx
\le \int_U \left( \int_{B(x,\epsilon)} \eta_\epsilon(x-y)\,|f(y)|^p\,dy \right) dx
\le \int_V \left( \int_{B(y,\epsilon)} \eta_\epsilon(x-y)\,dx \right) |f(y)|^p\,dy
= \int_V |f(y)|^p\,dy.
\]
Now, let $\delta > 0$ and let us choose $g \in C(\overline{V})$ such that $\|f-g\|_{L^p(V)} < \delta$ (here we use the fact that $C(\overline{V})$ is (sequentially) dense in $L^p(V)$). Then
\[
\|f_\epsilon - f\|_{L^p(U)} \le \|f_\epsilon - g_\epsilon\|_{L^p(U)} + \|g_\epsilon - g\|_{L^p(U)} + \|g - f\|_{L^p(U)}
\le 2\|f-g\|_{L^p(V)} + \|g_\epsilon - g\|_{L^p(U)}
< 2\delta + \|g_\epsilon - g\|_{L^p(U)}.
\]
Since $g_\epsilon \to g$ uniformly on the closure of $V$ by (iii), it follows that $\|f_\epsilon - f\|_{L^p(U)} \le 3\delta$ for small enough $\epsilon > 0$, completing the proof of (iv).

As a consequence of Theorem 1.5.13 we obtain:

Corollary 1.5.15. The space $C^\infty(\Omega)$ is sequentially dense in the space $C_0(\Omega)$ of all continuous functions with compact support in $\Omega$. Also, $C^\infty(\Omega)$ is sequentially dense in $L^p_{loc}(\Omega)$ for all $1 \le p < \infty$.

Exercise 1.5.16. Prove a simple but useful corollary of the Lebesgue differentiation theorem, partly explaining its name:

Corollary 1.5.17 (Corollary of the Lebesgue differentiation theorem). Let $f \in L^1_{loc}(\Omega)$. Then
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} f(y)\,dy = f(x) \quad \text{for a.e. } x \in \Omega. \]
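The convergence $f_\epsilon \to f$ in $L^p_{loc}$ of Theorem 1.5.13 (iv) can be watched numerically even for a discontinuous $f$. A rough one-dimensional sketch; the indicator function, the grid, and the discrete convolution standing in for the integral are all illustrative choices:

```python
import numpy as np

x = np.linspace(-2.0, 3.0, 5001)
dx = x[1] - x[0]
f = ((x >= 0) & (x <= 1)).astype(float)   # a rough f: indicator of [0, 1]

def bump(t):
    out = np.zeros_like(t)
    m = np.abs(t) < 1
    out[m] = np.exp(1.0 / (t[m] ** 2 - 1.0))
    return out

def mollify(f, eps):
    half = int(round(eps / dx))
    z = np.arange(-half, half + 1) * dx
    kernel = bump(z / eps)
    kernel /= kernel.sum()                # discrete analogue of the normalisation of eta_eps
    return np.convolve(f, kernel, mode="same")

errors = [np.trapz(np.abs(mollify(f, eps) - f), x) for eps in (0.4, 0.2, 0.1)]
print(errors)  # L^1 errors shrink (roughly linearly in eps)
```

The error is concentrated in the two transition layers of width $\sim\epsilon$ around the jumps of $f$, which is why it shrinks at the linear rate here.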
1.5.4 Approximation of Sobolev space functions
With the use of mollifications we can approximate functions in Sobolev spaces by smooth functions. We have a local approximation in localised Sobolev spaces $L^p_k(\Omega)_{loc}$, a global approximation in $L^p_k(\Omega)$, and further approximations depending on the regularity of the boundary of $\Omega$. Although the set $\Omega$ is bounded, we still say that an approximation in $L^p_k(\Omega)$ is global if it works up to the boundary.

Proposition 1.5.18 (Local approximation by smooth functions). Assume that $\Omega \subset \mathbb{R}^n$ is open. Let $f \in L^p_k(\Omega)$ for $1 \le p < \infty$ and $k \in \mathbb{N}\cup\{0\}$. Let $f_\epsilon = \eta_\epsilon * f$ in $\Omega_\epsilon$ be the mollification of $f$, $\epsilon > 0$. Then $f_\epsilon \in C^\infty(\Omega_\epsilon)$ and $f_\epsilon \to f$ in $L^p_k(\Omega)_{loc}$ as $\epsilon \to 0$, i.e., $f_\epsilon \to f$ in $L^p_k(K)$ as $\epsilon \to 0$ for all compact $K \subset \Omega$.

Proof. It was already proved in Theorem 1.5.13, (i), that $f_\epsilon \in C^\infty(\Omega_\epsilon)$. Since $f$ is locally integrable, we can differentiate the convolution under the integral sign to get $\partial^\alpha f_\epsilon = \eta_\epsilon * \partial^\alpha f$ in $\Omega_\epsilon$. Now, let $U$ be an open and bounded subset of $\Omega$ containing $K$. Then by Theorem 1.5.13, (iv), we get $\partial^\alpha f_\epsilon \to \partial^\alpha f$ in $L^p(U)$ as $\epsilon \to 0$, for all $|\alpha| \le k$. Hence
\[ \|f_\epsilon - f\|_{L^p_k(U)}^p = \sum_{|\alpha|\le k} \|\partial^\alpha f_\epsilon - \partial^\alpha f\|_{L^p(U)}^p \to 0 \]
as $\epsilon \to 0$, proving the statement.

Proposition 1.5.19 (Global approximation by smooth functions). Assume that $\Omega \subset \mathbb{R}^n$ is open and bounded. Let $f \in L^p_k(\Omega)$ for $1 \le p < \infty$ and $k \in \mathbb{N}\cup\{0\}$. Then there is a sequence $f_m \in C^\infty(\Omega) \cap L^p_k(\Omega)$ such that $f_m \to f$ in $L^p_k(\Omega)$.

Proof. Let us write $\Omega = \bigcup_{j=1}^\infty \Omega_j$, where $\Omega_j = \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) > 1/j\}$. Let $V_j = \Omega_{j+3}\backslash\overline{\Omega_{j+1}}$ (this definition will be very important). Take also any open $V_0$ with $\overline{V_0} \subset \Omega$ so that $\Omega = \bigcup_{j=0}^\infty V_j$. Let $\chi_j$ be a partition of unity subordinate to $V_j$, i.e., a family $\chi_j \in C_0^\infty(V_j)$ such that $0 \le \chi_j \le 1$ and $\sum_{j=0}^\infty \chi_j = 1$ in $\Omega$. Then $\chi_j f \in L^p_k(\Omega)$ and $\operatorname{supp}(\chi_j f) \subset V_j$. Let us fix some $\delta > 0$ and choose $\epsilon_j > 0$ so small that the function $f^j = \eta_{\epsilon_j} * (\chi_j f)$ is supported in $W_j = \Omega_{j+4}\backslash\overline{\Omega_j}$ and satisfies $\|f^j - \chi_j f\|_{L^p_k(\Omega)} \le \delta\,2^{-j-1}$ for all $j$. Let now $g = \sum_{j=0}^\infty f^j$. Then $g \in C^\infty(\Omega)$ since in any open set $U$ compactly contained in $\Omega$ there are only finitely many non-zero terms in the sum. Moreover, since $f = \sum_{j=0}^\infty \chi_j f$, for each such $U$ we have
\[ \|g - f\|_{L^p_k(U)} \le \sum_{j=0}^\infty \|f^j - \chi_j f\|_{L^p_k(\Omega)} \le \delta \sum_{j=0}^\infty \frac{1}{2^{j+1}} = \delta. \]
Taking the supremum over all open subsets $U$ of $\Omega$, we obtain $\|g - f\|_{L^p_k(\Omega)} \le \delta$, completing the proof.

In general, there are many versions of these results depending on the set $\Omega$, in particular on the regularity of its boundary. For example, we give here without proof the following further result. Let $\Omega$ be a bounded subset of $\mathbb{R}^n$ with $C^1$ boundary. Let $f \in L^p_k(\Omega)$ for $1 \le p < \infty$ and $k \in \mathbb{N}\cup\{0\}$. Then there is a sequence $f_m \in C^\infty(\overline{\Omega})$ such that $f_m \to f$ in $L^p_k(\Omega)$.

Finally, we use mollifiers to establish a smooth version of Urysohn's lemma in Theorem A.12.11.

Theorem 1.5.20 (Smooth Urysohn's lemma). Let $K \subset \mathbb{R}^n$ be compact and $U \subset \mathbb{R}^n$ be open such that $K \subset U$. Then there exists $f \in C_0^\infty(U)$ such that $0 \le f \le 1$, $f = 1$ on $K$ and $\operatorname{supp} f \subset U$.
Proof. First we observe that the distance $\delta := \operatorname{dist}(K, \mathbb{R}^n\backslash U) > 0$ because $K$ is compact and $\mathbb{R}^n\backslash U$ is closed. Let $V := \{x \in \mathbb{R}^n : \operatorname{dist}(x,K) < \delta/3\}$. If $\eta$ is the Friedrichs mollifier from Definition 1.5.12, then $\chi := \eta_{\delta/3}$ satisfies $\operatorname{supp}\chi \subset \{x \in \mathbb{R}^n : |x| \le \delta/3\}$ and $\int_{\mathbb{R}^n} \chi(x)\,dx = 1$. The desired function $f$ can then be obtained as $f := I_V * \chi$, where $I_V$ is the characteristic function of the set $V$. We have that $f \in C^\infty$ by Theorem 1.5.13, and $\operatorname{supp} f \subset U$ by Exercise 1.4.29; in fact, in this case the property
\[ \operatorname{supp} f \subset \operatorname{supp} I_V + \operatorname{supp}\chi \subset \overline{V} + \overline{B_{\delta/3}(0)} = \{x \in \mathbb{R}^n : d(x,K) \le 2\delta/3\} \subset U \]
can be easily checked directly. We have $0 \le f \le 1$ from its definition, and $f = 1$ on $K$ follows by a direct verification.

1.6 Interpolation

The Riesz–Thorin interpolation theorem C.4.18 was already useful in establishing various inequalities in $L^p$ (for example, it was used to prove the general Young's inequality for convolutions in Proposition 1.2.10, and the Hausdorff–Young inequality in Corollary 1.3.14). The aim of this section is to prove another very useful interpolation result: the Marcinkiewicz interpolation theorem. Here $\mu$ will stand for the Lebesgue measure on $\mathbb{R}^n$.

Definition 1.6.1 (Distribution functions). For a function $f : \mathbb{R}^n \to \mathbb{C}$ we define its distribution function $\mu_f(\lambda)$ by
\[ \mu_f(\lambda) = \mu\{x \in \mathbb{R}^n : |f(x)| \ge \lambda\}. \]

We have the following useful relation between the $L^p$-norm and the distribution function.

Theorem 1.6.2. Let $f \in L^p(\mathbb{R}^n)$. Then we have the identity
\[ \int_{\mathbb{R}^n} |f(x)|^p\,dx = p \int_0^\infty \mu_f(\lambda)\,\lambda^{p-1}\,d\lambda. \]

Proof. Let us define a measure on $\mathbb{R}$ by setting
\[ \nu((a,b]) := \mu_f(b) - \mu_f(a) = -\mu\{x \in \mathbb{R}^n : a < |f(x)| \le b\} = -\mu(|f|^{-1}((a,b])). \]
By the standard extension property of measures we can then extend $\nu$ to all Borel sets $E \subset (0,\infty)$ by setting $\nu(E) = -\mu(|f|^{-1}(E))$. We note that this definition is well posed since $|f|$ is measurable if $f$ is measurable (Theorem C.2.9). Then we claim that we have the following property for, say, integrable functions $\varphi : [0,\infty) \to \mathbb{R}$:
\[ \int_{\mathbb{R}^n} \varphi \circ |f|\,d\mu = -\int_0^\infty \varphi(\alpha)\,d\nu(\alpha). \tag{1.12} \]
Indeed, if $\varphi = \chi_{[a,b]}$ is the characteristic function of a set $[a,b]$, i.e., equal to one on $[a,b]$ and zero on its complement, then the definition of $\nu$ implies
\[ \int_{\mathbb{R}^n} \chi_{[a,b]} \circ |f|\,d\mu = \mu\{x \in \mathbb{R}^n : a \le |f(x)| \le b\} = -\int_0^\infty \chi_{[a,b]}(\alpha)\,d\nu(\alpha), \]
so (1.12) holds for characteristic functions and hence, by linearity and approximation, for integrable $\varphi$. Taking $\varphi(\alpha) = \alpha^p$ in (1.12) and integrating by parts then yields the claimed identity.

Definition 1.6.3 (Operators of weak type $(p,p)$). We say that an operator $T$ is of weak type $(p,p)$ if there is a constant $C > 0$ such that for every $\lambda > 0$ we have
\[ \mu\{x \in \mathbb{R}^n : |Tu(x)| > \lambda\} \le C\,\frac{\|u\|_{L^p}^p}{\lambda^p}. \]

Proposition 1.6.4. If $T$ is bounded from $L^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n)$ then $T$ is also of weak type $(p,p)$.

Proof. If $v \in L^1(\mathbb{R}^n)$ then for all $\rho > 0$ we have the simple estimate
\[ \rho\,\mu\{x \in \mathbb{R}^n : |v(x)| > \rho\} \le \int_{|v(x)|>\rho} |v(x)|\,d\mu(x) \le \|v\|_{L^1}. \]
Now, if we take $v(x) = |Tu(x)|^p$ and $\rho = \lambda^p$, this readily implies that $T$ is of weak type $(p,p)$.

The following theorem is extremely valuable in proving $L^p$-continuity of operators since it reduces the analysis to a weaker type of continuity for only two values of the indices.

Theorem 1.6.5 (Marcinkiewicz interpolation theorem). Let $r < q$ and assume that the operator $T$ is of weak types $(r,r)$ and $(q,q)$. Then $T$ is bounded from $L^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n)$ for all $r < p < q$.

Proof. Let $u \in L^p(\mathbb{R}^n)$. For each $\lambda > 0$ we can define functions $u_1$ and $u_2$ by $u_1(x) = u(x)$ for $|u(x)| > \lambda$ and $u_2(x) = u(x)$ for $|u(x)| \le \lambda$, setting them to be zero otherwise. Then we have the identity $u = u_1 + u_2$ and the estimates $|u_1|, |u_2| \le |u|$. It follows that
\[ \mu_{Tu}(2\lambda) \le \mu_{Tu_1}(\lambda) + \mu_{Tu_2}(\lambda) \le C_1\,\frac{\|u_1\|_{L^r}^r}{\lambda^r} + C_2\,\frac{\|u_2\|_{L^q}^q}{\lambda^q}, \]
since $T$ is of weak types $(r,r)$ and $(q,q)$. Therefore, we can estimate
\[
\int_{\mathbb{R}^n} |Tu(x)|^p\,dx = p \int_0^\infty \lambda^{p-1}\,\mu_{Tu}(\lambda)\,d\lambda
\le C_1 p \int_0^\infty \lambda^{p-1-r}\!\left( \int_{|u|>\lambda} |u(x)|^r\,dx \right) d\lambda
+ C_2 p \int_0^\infty \lambda^{p-1-q}\!\left( \int_{|u|\le\lambda} |u(x)|^q\,dx \right) d\lambda.
\]
Using Fubini's theorem, the first term on the right-hand side can be rewritten as
\[
\int_0^\infty \lambda^{p-1-r}\!\left( \int_{|u|>\lambda} |u(x)|^r\,dx \right) d\lambda
= \int_0^\infty \lambda^{p-1-r}\!\left( \int_{\mathbb{R}^n} \chi_{|u|>\lambda}\,|u(x)|^r\,dx \right) d\lambda
= \int_{\mathbb{R}^n} |u(x)|^r\!\left( \int_0^\infty \lambda^{p-1-r}\,\chi_{|u|>\lambda}\,d\lambda \right) dx
\]
\[
= \int_{\mathbb{R}^n} |u(x)|^r\!\left( \int_0^{|u(x)|} \lambda^{p-1-r}\,d\lambda \right) dx
= \frac{1}{p-r} \int_{\mathbb{R}^n} |u(x)|^r\,|u(x)|^{p-r}\,dx
= \frac{1}{p-r} \int_{\mathbb{R}^n} |u(x)|^p\,dx,
\]
where $\chi_{|u|>\lambda}$ is the characteristic function of the set $\{x \in \mathbb{R}^n : |u(x)| > \lambda\}$. Similarly, we have
\[
\int_0^\infty \lambda^{p-1-q}\!\left( \int_{|u|\le\lambda} |u(x)|^q\,dx \right) d\lambda
= \frac{1}{q-p} \int_{\mathbb{R}^n} |u(x)|^p\,dx,
\]
completing the proof.
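The layer-cake identity of Theorem 1.6.2, on which the proof above runs, is easy to test numerically. Below is a discrete check using the counting measure; the sample size, the exponent $p = 3$, and the quadrature grid are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=2000)                   # sample values; mu = counting measure
p = 3.0

lhs = np.sum(np.abs(f) ** p)                # "integral" of |f|^p

lam = np.linspace(1e-6, np.abs(f).max(), 20001)
mu_f = np.array([(np.abs(f) >= l).sum() for l in lam])   # distribution function
rhs = p * np.trapz(mu_f * lam ** (p - 1), lam)           # p * int mu_f(l) l^{p-1} dl

print(abs(lhs - rhs) / lhs < 1e-2)          # -> True
```

The quadrature error comes only from the jumps of the step function $\mu_f$, so refining the $\lambda$-grid drives the two sides together.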
As an important tool for proving various results on boundedness in $L^1(\mathbb{R}^n)$ or of weak type $(1,1)$ (which will not be used here, so it is given just for information), we have the following fundamental decomposition of integrable functions.

Theorem 1.6.6 (Calder\'on–Zygmund covering lemma). Let $u \in L^1(\mathbb{R}^n)$ and $\lambda > 0$. Then there exist $v, w_k \in L^1(\mathbb{R}^n)$ and there exists a collection of disjoint cubes $Q_k$, $k \in \mathbb{N}$, centred at some points $x_k$, such that the following properties are satisfied:
\[ u = v + \sum_{k=1}^\infty w_k, \qquad \|v\|_{L^1} + \sum_{k=1}^\infty \|w_k\|_{L^1} \le 3\|u\|_{L^1}, \]
\[ \operatorname{supp} w_k \subset Q_k, \qquad \int_{Q_k} w_k(x)\,dx = 0, \]
\[ \sum_{k=1}^\infty \mu(Q_k) \le \lambda^{-1}\|u\|_{L^1}, \qquad |v(x)| \le 2^n \lambda. \]
This theorem is one of the starting points of the harmonic analysis of operators on $L^p(\mathbb{R}^n)$, but we will not pursue this topic here; we refer to, e.g., [118] or [132] for many further aspects.
Chapter 2

Pseudo-differential Operators on $\mathbb{R}^n$

The subject of pseudo-differential operators on $\mathbb{R}^n$ is well studied and there are many excellent monographs on the subject, see, e.g., [27, 33, 55, 71, 101, 112, 130, 135, 152], as well as on the more general subjects of Fourier integral operators, microlocal analysis, and related topics, see, e.g., [30, 56, 45, 81, 113]. Therefore, here we only sketch the main elements of the theory. In this chapter, we use the notation $\langle\xi\rangle = (1+|\xi|^2)^{1/2}$.
2.1 Motivation and definition
We will start with an informal observation that if $T$ is a translation invariant linear operator on some space of functions on $\mathbb{R}^n$, then we can write
\[ T(e^{2\pi i x\cdot\xi}) = a(\xi)\,e^{2\pi i x\cdot\xi} \quad \text{for all } \xi \in \mathbb{R}^n. \tag{2.1} \]
Indeed, more explicitly, if $T$ acts on functions of the variable $y$, we can write $f(x,\xi) = T(e^{2\pi i y\cdot\xi})(x) = (Te_\xi)(x)$, where $e_\xi(x) = e^{2\pi i x\cdot\xi}$. Let $(\tau_h f)(x) = f(x-h)$ be the translation operator by $h \in \mathbb{R}^n$. We say that $T$ is translation invariant if $T\tau_h = \tau_h T$ for all $h$. By our assumptions on $T$ we get
\[ f(x+h,\xi) = T(e^{2\pi i (y+h)\cdot\xi})(x) = e^{2\pi i h\cdot\xi}\,T(e^{2\pi i y\cdot\xi})(x) = e^{2\pi i h\cdot\xi}\,f(x,\xi). \]
Now, setting $x = 0$, we get $f(h,\xi) = e^{2\pi i h\cdot\xi} f(0,\xi)$, so we obtain formula (2.1) with $a(\xi) = f(0,\xi)$. In turn, this $a(\xi)$ can be found from formula (2.1), yielding $a(\xi) = e^{-2\pi i x\cdot\xi}\,T(e^{2\pi i y\cdot\xi})(x)$. If we now formally apply $T$ to the Fourier inversion formula
\[ f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,\widehat{f}(\xi)\,d\xi \]
and use the linearity of $T$, we obtain
\[ Tf(x) = \int_{\mathbb{R}^n} T(e^{2\pi i x\cdot\xi})\,\widehat{f}(\xi)\,d\xi = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,a(\xi)\,\widehat{f}(\xi)\,d\xi. \]
This formula allows one to reduce certain properties of the operator $T$ to properties of the multiplication by the corresponding function $a(\xi)$, called the symbol of $T$. For example, continuity of $T$ on $L^2$ would reduce to the boundedness of $a(\xi)$, the composition of two operators $T_1 \circ T_2$ would reduce to the multiplication of their symbols $a_1(\xi)\,a_2(\xi)$, etc.

Pseudo-differential operators extend this construction to operators which are not necessarily translation invariant. In fact, as we saw above, we can always write $a(x,\xi) := e^{-2\pi i x\cdot\xi}\,(Te_\xi)(x)$, so that we would have $T(e^{2\pi i x\cdot\xi}) = e^{2\pi i x\cdot\xi}\,a(x,\xi)$. Consequently, reasoning as above, we could analogously arrive at the formula
\[ Tf(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,a(x,\xi)\,\widehat{f}(\xi)\,d\xi. \tag{2.2} \]
Now, in order to avoid several rather informal conclusions in the arguments above, one usually takes the opposite route and adopts formula (2.2) as the definition of the pseudo-differential operator with symbol $a(x,\xi)$. Such operators are then often denoted by $\operatorname{Op}(a)$, by $a(X,D)$, or by $T_a$. The simplest and perhaps most useful class of symbols allowing this approach to work well is the following class, denoted by $S^m_{1,0}(\mathbb{R}^n\times\mathbb{R}^n)$, or simply by $S^m(\mathbb{R}^n\times\mathbb{R}^n)$.
Now, in order to avoid several rather informal conclusions in the arguments above, one usually takes the opposite route and adopts formula (2.2) as the definition of the pseudo-differential operator with symbol a(x, ξ). Such operators are then often denoted by Op(a), by a(X, D), or by Ta . The simplest and perhaps most useful class of symbols allowing this apm (Rn × Rn ), or simply by proach to work well is the following class denoted by S1,0 S m (Rn × Rn ). Definition 2.1.1 (Symbol classes S m (Rn × Rn )). We will say that a ∈ S m (Rn × Rn ) if a = a(x, ξ) is smooth on Rn × Rn and if the estimates |∂xβ ∂ξα a(x, ξ)| ≤ Aαβ (1 + |ξ|)m−|α|
(2.3)
hold for all α, β and all x, ξ ∈ Rn . Constants Aαβ may depend on a, α, β but not on x, ξ. The operator T defined by (2.2) is called the pseudo-differential operator with symbol a. The class of operators of the form (2.2) with symbols from S m (Rn × Rn ) is denoted by Ψm (Rn × Rn ) or by Op S m (Rn × Rn ). Remark 2.1.2. We will insist on writing S m (Rn × Rn ) and not abbreviating it to S m or even to S m (Rn ). The reason is that in Chapter 4 we will want to distinguish between symbol class S m (Tn × Rn ) which will be 1-periodic symbols from S m (Rn × Rn ) and symbol class S m (Tn × Zn ) which will be the class of toroidal symbols. Remark 2.1.3 (Symbols of differential operators). Note that for partial differential operators symbols are just the characteristic polynomials. One can readily see α that the symbol of the differential operator L = a |α|≤m α (x)∂x is a(x, ξ) = α m n n |α|≤m aα (x)(2πiξ) and a ∈ S (R × R ) if the coefficients aα and all of their derivatives are smooth and bounded on Rn .
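The symbolic inequalities (2.3) can be checked numerically for concrete symbols. Below is a one-dimensional sketch for $a(\xi) = (1+4\pi^2\xi^2)^{m/2}$, where finite differences stand in for the $\xi$-derivatives; the grid and the value $m = 3/2$ are arbitrary illustrative choices:

```python
import numpy as np

# Each xi-derivative of a(xi) = (1 + 4 pi^2 xi^2)^{m/2} should gain one
# power of decay in (1 + |xi|), i.e. a belongs to S^m.
m = 1.5
xi = np.linspace(-200.0, 200.0, 400001)
a = (1.0 + 4.0 * np.pi ** 2 * xi ** 2) ** (m / 2)

h = xi[1] - xi[0]
da = np.gradient(a, h)       # finite-difference approximation of a'
d2a = np.gradient(da, h)     # and of a''

w = 1.0 + np.abs(xi)
print((np.abs(a) / w ** m).max())          # bounded: order-0 estimate
print((np.abs(da) / w ** (m - 1)).max())   # bounded: one derivative, one power gained
print((np.abs(d2a) / w ** (m - 2)).max())  # bounded: two derivatives
```

The three printed ratios stay bounded over the whole grid, which is the discrete shadow of the constants $A_{\alpha\beta}$ in (2.3).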
Remark 2.1.4 (Powers of the Laplacian). For example, the symbol of the Laplacian $\mathcal{L} = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}$ is $-4\pi^2|\xi|^2$, and it is an element of $S^2(\mathbb{R}^n\times\mathbb{R}^n)$. Consequently, for any $\mu \in \mathbb{R}$, we can define the operators $(1-\mathcal{L})^{\mu/2}$ as pseudo-differential operators with symbol $(1+4\pi^2|\xi|^2)^{\mu/2} \in S^\mu(\mathbb{R}^n\times\mathbb{R}^n)$.

Exercise 2.1.5. Let $u \in C(\mathbb{R}^n)$ satisfy $|u(x)| \le C\langle x\rangle^N$ for some constants $C, N$, where $\langle x\rangle = (1+|x|^2)^{1/2}$. Let $k > N+n$. Let us define
\[ v_k(\varphi) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi}\,u(x)\,\langle x\rangle^{-k}\,(1-\mathcal{L})^{k/2}\varphi(\xi)\,dx\,d\xi, \]
where $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Prove that $v_k \in \mathcal{S}'(\mathbb{R}^n)$. Prove that there is $v \in \mathcal{S}'(\mathbb{R}^n)$ such that $v = v_k$ for all $k > N+n$. Show that $v = \widehat{u}$.

We now proceed to establish basic properties of pseudo-differential operators.

Theorem 2.1.6 (Pseudo-differential operators on $\mathcal{S}(\mathbb{R}^n)$). Let $a \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$ and $f \in \mathcal{S}(\mathbb{R}^n)$. We define the pseudo-differential operator with symbol $a$ by
\[ a(X,D)f(x) := \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,a(x,\xi)\,\widehat{f}(\xi)\,d\xi. \tag{2.4} \]
Then $a(X,D)f \in \mathcal{S}(\mathbb{R}^n)$.

Proof. First we observe that the integral in (2.4) converges absolutely. The same is true for all of its derivatives with respect to $x$ by Lebesgue's dominated convergence theorem (Theorem 1.1.4), which implies that $a(X,D)f \in C^\infty(\mathbb{R}^n)$. Let us show now that in fact $a(X,D)f \in \mathcal{S}(\mathbb{R}^n)$. Introducing the operator $L_\xi = (1+4\pi^2|x|^2)^{-1}(I-\mathcal{L}_\xi)$ (where $\mathcal{L}_\xi$ is the Laplace operator with respect to the $\xi$-variables) with the property $L_\xi e^{2\pi i x\cdot\xi} = e^{2\pi i x\cdot\xi}$, integrating (2.4) by parts $N$ times yields
\[ a(X,D)f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,(L_\xi)^N \big[ a(x,\xi)\,\widehat{f}(\xi) \big]\,d\xi. \]
From this we get $|a(X,D)f(x)| \le C_N (1+|x|)^{-2N}$ for all $N$, so $a(X,D)f$ is rapidly decreasing. The same argument applied to the derivatives of $a(X,D)f$ shows that $a(X,D)f \in \mathcal{S}(\mathbb{R}^n)$.

The following generalisation of the symbol class $S^m(\mathbb{R}^n\times\mathbb{R}^n)$ is often useful.

Definition 2.1.7 (Symbol classes $S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$). Let $0 \le \rho, \delta \le 1$. We will say that $a \in S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ if $a = a(x,\xi)$ is smooth on $\mathbb{R}^n\times\mathbb{R}^n$ and if
\[ |\partial_x^\beta \partial_\xi^\alpha a(x,\xi)| \le A_{\alpha\beta}\,(1+|\xi|)^{m-\rho|\alpha|+\delta|\beta|} \tag{2.5} \]
for all $\alpha, \beta$ and all $x, \xi \in \mathbb{R}^n$. The constants $A_{\alpha\beta}$ may depend on $a, \alpha, \beta$ but not on $x, \xi$. The operator $T$ defined by (2.2) is called the pseudo-differential operator with symbol $a$ of order $m$ and type $(\rho,\delta)$. The class of operators of the form (2.2) with symbols from $S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ is denoted by $\Psi^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ or by $\operatorname{Op} S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$.

Definition 2.1.8 (Symbol $\sigma_A$ of an operator $A$). If $A \in \Psi^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ we denote its symbol by $\sigma_A = \sigma_A(x,\xi)$. It is well defined in view of Theorem 2.5.6 later on, which also gives a formula for $\sigma_A \in S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$.

Exercise 2.1.9. Extend the statement of Theorem 2.1.6 to operators of type $(\rho,\delta)$. Namely, let $0 \le \rho, \delta \le 1$, and let $a \in S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ and $f \in \mathcal{S}(\mathbb{R}^n)$. Show that $a(X,D)f \in \mathcal{S}(\mathbb{R}^n)$.

The following convergence criterion will be useful in the sequel. It follows directly from the Lebesgue dominated convergence theorem (Theorem 1.1.4).

Proposition 2.1.10 (Convergence criterion for pseudo-differential operators). Suppose we have a sequence of symbols $a_k \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$ which satisfies the uniform symbolic estimates
\[ |\partial_x^\beta \partial_\xi^\alpha a_k(x,\xi)| \le A_{\alpha\beta}\,(1+|\xi|)^{m-|\alpha|} \]
for all $\alpha, \beta$, all $x, \xi \in \mathbb{R}^n$, and all $k$, with constants $A_{\alpha\beta}$ independent of $x, \xi$ and $k$. Suppose that $a \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$ is such that $a_k(x,\xi)$ and all of its derivatives converge to $a(x,\xi)$ and its derivatives, respectively, pointwise as $k \to \infty$. Then $a_k(X,D)f \to a(X,D)f$ in $\mathcal{S}(\mathbb{R}^n)$ for any $f \in \mathcal{S}(\mathbb{R}^n)$.

Exercise 2.1.11. Verify the details of the proof of Proposition 2.1.10.

Remark 2.1.12. More general families of pseudo-differential operators are introduced in, e.g., [55] and [130]. Yet $S^m(\mathbb{R}^n) = S^m_{1,0}(\mathbb{R}^n)$, contained in the H\"ormander classes $S^m_{\rho,\delta}(\mathbb{R}^n)$, is definitely the most important case, and [135], [130], [55], and [118] concentrate on it. Compressed information about pseudo-differential operators and nonlinear partial differential equations can be found in [131]. The spectral properties of pseudo-differential operators are considered in [112]. We have also left out matrix-valued pseudo-differential operators.

Remark 2.1.13. The relation between operators and symbols can also be viewed as follows. Let $u \in \mathcal{S}(\mathbb{R}^n)$ and fix $s < -n/2$. The function $\psi : \mathbb{R}^n \to H^s(\mathbb{R}^n)$, $\psi(\xi) = e_\xi$, where $e_\xi(x) = e^{2\pi i x\cdot\xi}$, is Bochner-integrable (see [53]) with respect to $\widehat{u}(\xi)\,d\xi$, and therefore
\[ (Au)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,\sigma_A(x,\xi)\,\widehat{u}(\xi)\,d\xi \]
for symbols of order zero. The distribution $Au$ can be viewed as a $\sigma_A$-weighted inverse Fourier transform of $\widehat{u}$. Unfortunately, the algebra of the finite order operators on the Sobolev scale is too large to admit a fruitful symbol analysis, while the non-trivial restrictions imposed by the symbol inequalities (2.3) yield a well-behaving subalgebra.
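On a periodic grid, formula (2.2) (equivalently (2.4)) can be implemented directly with the FFT: take the discrete Fourier coefficients of $f$, multiply by $a(x,\xi)$ for each grid point $x$, and sum over the frequencies. The following sketch uses an arbitrary grid size and test function; for the $x$-independent symbol $2\pi i\xi$ it reproduces $d/dx$ exactly on band-limited functions:

```python
import numpy as np

N = 256
x = np.arange(N) / N                         # periodic grid on [0, 1)
xi = np.fft.fftfreq(N, d=1.0 / N)            # integer frequencies

def op(symbol, f):
    # Discrete version of (2.2): sum over xi of e^{2 pi i x xi} a(x, xi) fhat(xi).
    fhat = np.fft.fft(f) / N
    phases = np.exp(2j * np.pi * np.outer(x, xi))
    return (phases * symbol(x[:, None], xi[None, :]) * fhat[None, :]).sum(axis=1)

f = np.sin(2 * np.pi * x) + 0.3 * np.cos(6 * np.pi * x)
df = op(lambda X, XI: 2j * np.pi * XI, f).real        # symbol of d/dx
exact = 2 * np.pi * np.cos(2 * np.pi * x) - 1.8 * np.pi * np.sin(6 * np.pi * x)
print(np.allclose(df, exact))  # -> True
```

Replacing the symbol by an $x$-dependent one, e.g. $c(x)\,2\pi i\xi$, yields the variable-coefficient operator $c(x)\,d/dx$ with no change to the code, which is exactly the flexibility that formula (2.2) adds over Fourier multipliers.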
2.2 Amplitude representation of pseudo-differential operators
If we write out the Fourier transform in (2.4) as an integral, we obtain
\[ a(X,D)f(x) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\xi}\,a(x,\xi)\,f(y)\,dy\,d\xi. \tag{2.6} \]
However, a problem with this formula is that the $\xi$-integral does not converge absolutely even for $f \in \mathcal{S}(\mathbb{R}^n)$. To overcome this difficulty, one uses the idea of approximating $a(x,\xi)$ by symbols with compact support. To this end, let us fix some $\gamma \in C_0^\infty(\mathbb{R}^n\times\mathbb{R}^n)$ such that $\gamma = 1$ near the origin. Let us now define $a_\epsilon(x,\xi) = a(x,\xi)\,\gamma(\epsilon x, \epsilon\xi)$. Then one can readily check that $a_\epsilon \in C_0^\infty(\mathbb{R}^n\times\mathbb{R}^n)$ and that the following holds:

- if $a \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$, then $a_\epsilon \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$ uniformly in $0 < \epsilon \le 1$ (this means that the constants $A_{\alpha\beta}$ in the symbolic inequalities in Definition 2.1.1 may be chosen independent of $0 < \epsilon \le 1$);
- $a_\epsilon \to a$ pointwise as $\epsilon \to 0$, uniformly in $0 < \epsilon \le 1$. The same is true for the derivatives of $a_\epsilon$ and $a$.

It follows now from the convergence criterion in Proposition 2.1.10 that $a_\epsilon(X,D)f \to a(X,D)f$ in $\mathcal{S}(\mathbb{R}^n)$ as $\epsilon \to 0$, for all $f \in \mathcal{S}(\mathbb{R}^n)$. Here $a(X,D)f$ is defined as in (2.4). Now, formula (2.6) does make sense for $a_\epsilon \in C_0^\infty$, so we may define the double integral in (2.6) as the limit in $\mathcal{S}(\mathbb{R}^n)$ of $a_\epsilon(X,D)f$, i.e., take
\[ a(X,D)f(x) := \lim_{\epsilon\to 0} \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\xi}\,a_\epsilon(x,\xi)\,f(y)\,dy\,d\xi, \quad f \in \mathcal{S}(\mathbb{R}^n). \]

Pseudo-differential operators on $\mathcal{S}'(\mathbb{R}^n)$. Recall that we can define the $L^2$-adjoint $a(X,D)^*$ of an operator $a(X,D)$ by the formula
\[ (a(X,D)f, g)_{L^2} = (f, a(X,D)^* g)_{L^2}, \quad f, g \in \mathcal{S}(\mathbb{R}^n), \]
where
\[ (u,v)_{L^2} = \int_{\mathbb{R}^n} u(x)\,\overline{v(x)}\,dx \]
is the usual $L^2$-inner product. From (2.6) and this formula we can readily calculate that
\[ a(X,D)^* g(y) = \lim_{\epsilon\to 0} \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (y-x)\cdot\xi}\,\overline{a_\epsilon(x,\xi)}\,g(x)\,dx\,d\xi, \quad g \in \mathcal{S}(\mathbb{R}^n). \]
With the same understanding of non-convergent integrals as in (2.6), and replacing $x$ by $z$ to eliminate any confusion, we can write
\[ a(X,D)^* g(y) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (y-z)\cdot\xi}\,\overline{a(z,\xi)}\,g(z)\,dz\,d\xi, \quad g \in \mathcal{S}(\mathbb{R}^n). \tag{2.7} \]
Exercise 2.2.1. As before, by integration by parts, check that $a(X,D)^* : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is continuous.

Definition 2.2.2 (Pseudo-differential operators on $\mathcal{S}'(\mathbb{R}^n)$). Let $u \in \mathcal{S}'(\mathbb{R}^n)$. We define $a(X,D)u$ by the formula
\[ (a(X,D)u)(\varphi) := u\big(\overline{a(X,D)^*\,\overline{\varphi}}\big) \quad \text{for all } \varphi \in \mathcal{S}(\mathbb{R}^n). \]

Remark 2.2.3 (Consistency). We clearly have
\[ \overline{a(X,D)^*\,\overline{\varphi}}\,(y) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (z-y)\cdot\xi}\,a(z,\xi)\,\varphi(z)\,dz\,d\xi, \]
so if $u, \varphi \in \mathcal{S}(\mathbb{R}^n)$, we have the consistency in
\[ (a(X,D)u)(\varphi) = \int_{\mathbb{R}^n} a(X,D)u(x)\,\varphi(x)\,dx = (a(X,D)u, \overline{\varphi})_{L^2}
= (u, a(X,D)^*\,\overline{\varphi})_{L^2} = \int_{\mathbb{R}^n} u(x)\,\overline{a(X,D)^*\,\overline{\varphi}}\,(x)\,dx = u\big(\overline{a(X,D)^*\,\overline{\varphi}}\big). \]

Proposition 2.2.4. If $a \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$ and $u \in \mathcal{S}'(\mathbb{R}^n)$ then $a(X,D)u \in \mathcal{S}'(\mathbb{R}^n)$. Moreover, the operator $a(X,D) : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous.

Proof. Indeed, let $u_k \to u$ in $\mathcal{S}'(\mathbb{R}^n)$. Then we have
\[ (a(X,D)u_k)(\varphi) = u_k\big(\overline{a(X,D)^*\,\overline{\varphi}}\big) \to u\big(\overline{a(X,D)^*\,\overline{\varphi}}\big) = (a(X,D)u)(\varphi), \]
so $a(X,D)u_k \to a(X,D)u$ in $\mathcal{S}'(\mathbb{R}^n)$ and, therefore, $a(X,D) : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous.

Exercise 2.2.5. Let $0 \le \rho \le 1$ and $0 \le \delta < 1$. Show that if $a \in S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ and $u \in \mathcal{S}'(\mathbb{R}^n)$ then $a(X,D)u \in \mathcal{S}'(\mathbb{R}^n)$, and that the operator $a(X,D) : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous.
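In the discrete picture, the adjoint formula (2.7) is nothing but the conjugate transpose of the operator's matrix. A small sanity check, where the grid size and the symbol are arbitrary illustrative choices and $A[x,y] = N^{-1}\sum_\xi e^{2\pi i(x-y)\xi/N}\,a(x,\xi)$ is the matrix of the discrete operator:

```python
import numpy as np

N = 64
j = np.arange(N)                              # grid indices; x_j = j / N on the torus
xi = np.fft.fftfreq(N, d=1.0 / N)             # integer frequencies
a = (1 + 0.5 * np.sin(2 * np.pi * j[:, None] / N)) / (1 + xi[None, :] ** 2)

# Matrix of the operator: A[x, y] = (1/N) sum_xi e^{2 pi i (x - y) xi / N} a(x, xi).
phase = np.exp(2j * np.pi * np.subtract.outer(j, j)[:, :, None] * xi / N)
A = (phase * a[:, None, :]).sum(axis=2) / N

rng = np.random.default_rng(1)
f = rng.normal(size=N) + 1j * rng.normal(size=N)
g = rng.normal(size=N) + 1j * rng.normal(size=N)

inner = lambda u, v: np.vdot(v, u)            # (u, v)_{L^2} = sum u * conj(v)
Astar = A.conj().T                            # discrete analogue of formula (2.7)
print(np.allclose(inner(A @ f, g), inner(f, Astar @ g)))  # -> True
```

Expanding the entries of `A.conj().T` gives $N^{-1}\sum_\xi e^{2\pi i(y-z)\xi/N}\,\overline{a(z,\xi)}$, i.e. exactly the conjugated symbol and reversed phase of (2.7).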
2.3 Kernel representation of pseudo-differential operators
Summarising Sections 2.1 and 2.2, we can write pseudo-differential operators in different ways:
\[
a(X,D)f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,a(x,\xi)\,\widehat{f}(\xi)\,d\xi
= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\xi}\,a(x,\xi)\,f(y)\,dy\,d\xi
\]
\[
= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}\,a(x,\xi)\,f(x-z)\,dz\,d\xi
= \int_{\mathbb{R}^n} k(x,z)\,f(x-z)\,dz
= \int_{\mathbb{R}^n} K(x,y)\,f(y)\,dy,
\]
with kernels
\[ K(x,y) = k(x, x-y), \qquad k(x,z) = \int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}\,a(x,\xi)\,d\xi. \]
Theorem 2.3.1 (Kernel of a pseudo-differential operator). Let $a \in S^m(\mathbb{R}^n\times\mathbb{R}^n)$. Then the kernel $K(x,y)$ of the pseudo-differential operator $a(X,D)$ satisfies
\[ |\partial_{x,y}^\beta K(x,y)| \le C_{N\beta}\,|x-y|^{-N} \]
for $N > m+n+|\beta|$ and $x \ne y$. Thus, for $x \ne y$, the kernel $K(x,y)$ is a smooth function, rapidly decreasing as $|x-y| \to \infty$.

Proof. We notice that $k(x,\cdot)$ is the inverse Fourier transform of $a(x,\cdot)$. It follows then that $(-2\pi i z)^\alpha \partial_z^\beta k(x,z)$ is the inverse Fourier transform with respect to $\xi$ of the derivative $\partial_\xi^\alpha\big[(2\pi i\xi)^\beta a(x,\xi)\big]$, i.e.,
\[ (-2\pi i z)^\alpha\,\partial_z^\beta k(x,z) = \mathcal{F}_\xi^{-1}\Big( \partial_\xi^\alpha\big[(2\pi i\xi)^\beta a(x,\xi)\big] \Big)(z). \]
Since $(2\pi i\xi)^\beta a(x,\xi) \in S^{m+|\beta|}(\mathbb{R}^n\times\mathbb{R}^n)$ is a symbol of order $m+|\beta|$, we have
\[ \big| \partial_\xi^\alpha\big[(2\pi i\xi)^\beta a(x,\xi)\big] \big| \le C_{\alpha\beta}\,\langle\xi\rangle^{m+|\beta|-|\alpha|}. \]
Therefore, $\partial_\xi^\alpha\big[(2\pi i\xi)^\beta a(x,\xi)\big]$ is in $L^1(\mathbb{R}^n_\xi)$ with respect to $\xi$ if $|\alpha| > m+n+|\beta|$. Consequently, its inverse Fourier transform is bounded:
\[ (-2\pi i z)^\alpha\,\partial_z^\beta k(x,z) \in L^\infty(\mathbb{R}^n_z) \quad \text{for } |\alpha| > m+n+|\beta|. \]
Since taking derivatives of $k(x,z)$ with respect to $x$ does not change the argument, this implies the statement of the theorem.

As an immediate consequence of Theorem 2.3.1 we obtain information on how the singular support is mapped by a pseudo-differential operator (see Remark 4.10.8 for more details):

Corollary 2.3.2 (Singular supports). Let $A \in \Psi^m(\mathbb{R}^n\times\mathbb{R}^n)$. Then for every $u \in \mathcal{S}'(\mathbb{R}^n)$ we have
\[ \operatorname{sing\,supp} Au \subset \operatorname{sing\,supp} u. \tag{2.8} \]

Definition 2.3.3 (Local and pseudolocal operators). An operator $A$ is called pseudolocal if property (2.8) holds for all $u$. This is in analogy to the term "local", where an operator $A$ is called local if $\operatorname{supp} Au \subset \operatorname{supp} u$ for all $u$. By Corollary 2.3.2 every operator in $\Psi^m(\mathbb{R}^n\times\mathbb{R}^n)$ is pseudolocal. The converse is not true, as stated in Exercise 2.3.6.
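For the $x$-independent symbol $a(\xi) = (1+4\pi^2\xi^2)^{-1}$ in one dimension (order $-2$), the kernel is explicit, $k(z) = e^{-|z|}/2$ with this Fourier convention, which illustrates the off-diagonal decay asserted in Theorem 2.3.1. A quick quadrature check; the truncation and grid are arbitrary illustrative choices:

```python
import numpy as np

xi = np.linspace(-400.0, 400.0, 800001)
a = 1.0 / (1.0 + 4.0 * np.pi ** 2 * xi ** 2)     # symbol of order -2

# k(z) = integral of e^{2 pi i z xi} a(xi) d xi, computed by quadrature.
for z in (0.5, 1.0, 2.0, 4.0):
    k = np.trapz(np.exp(2j * np.pi * z * xi) * a, xi).real
    print(z, k)   # close to exp(-z)/2: smooth and exponentially small off the diagonal
```

Higher-order (less decaying) symbols give kernels that are singular at $z = 0$ but still decay rapidly for $z \ne 0$, which is the content of the theorem.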
Exercise 2.3.4 (Partial differential operators are local). Let $A$ be a linear differential operator
\[ Af(x) = \sum_{|\alpha|\le m} a_\alpha(x)\,\partial_x^\alpha f(x) \]
with coefficients $a_\alpha \in C^\infty(\mathbb{R}^n)$, $|\alpha| \le m$. Prove that $\operatorname{supp} Af \subset \operatorname{supp} f$ for all $f \in C^\infty(\mathbb{R}^n)$.

Exercise 2.3.5 (Peetre's theorem). Prove the converse to Exercise 2.3.4, which is known as Peetre's theorem: if $A : C^\infty(\Omega) \to C^\infty(\Omega)$ is a continuous linear operator which is local, then $A$ is a partial differential operator with smooth coefficients.

Exercise 2.3.6 (Pseudo-Peetre's theorem?). Prove that we cannot add the word "pseudo" to Peetre's theorem. Namely, a pseudolocal linear continuous operator on $C^\infty(\mathbb{R}^n)$ does not have to be a pseudo-differential operator. We refer to Section 4.10 for a further exploration of these properties.

We will now discuss an important class of operators which are usually taken to be negligible when one works with pseudo-differential operators. One of the reasons is that whenever they are applied to distributions they produce smooth functions, and so such operators can be neglected from the point of view of the analysis of singularities. However, it is important to understand these operators in order to know exactly what we are allowed to neglect.

Definition 2.3.7 (Smoothing operators). We define symbols of order $-\infty$ by setting $S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n) := \bigcap_{m\in\mathbb{R}} S^m(\mathbb{R}^n\times\mathbb{R}^n)$, so that $a \in S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n)$ if $a \in C^\infty$ and if
\[ |\partial_x^\beta \partial_\xi^\alpha a(x,\xi)| \le A_{\alpha\beta N}\,(1+|\xi|)^{-N} \]
holds for all $\alpha, \beta, N$, and all $x, \xi \in \mathbb{R}^n$. The constants $A_{\alpha\beta N}$ may depend on $a, \alpha, \beta, N$ but not on $x, \xi$. Pseudo-differential operators with symbols in $S^{-\infty}$ are called smoothing pseudo-differential operators.

Exercise 2.3.8. Show that the class $S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n)$ is independent of $\rho$ and $\delta$ in the sense that $S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n) = \bigcap_{m\in\mathbb{R}} S^m_{\rho,\delta}(\mathbb{R}^n\times\mathbb{R}^n)$ for all $\rho$ and $\delta$.

Proposition 2.3.9. Let $a \in S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n)$. Then the integral kernel $K$ of $a(X,D)$ is smooth on $\mathbb{R}^n\times\mathbb{R}^n$.

Proof. Since $a(x,\cdot) \in L^1(\mathbb{R}^n)$, we immediately get $k \in L^\infty(\mathbb{R}^n)$. Moreover,
\[ \partial_x^\beta \partial_z^\alpha k(x,z) = \int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}\,(2\pi i\xi)^\alpha\,\partial_x^\beta a(x,\xi)\,d\xi. \]
Since $(2\pi i\xi)^\alpha\,\partial_x^\beta a(x,\xi)$ is absolutely integrable, it follows from the Lebesgue dominated convergence theorem (Theorem 1.1.4) that $\partial_x^\beta \partial_z^\alpha k$ is continuous. This is true for all $\alpha, \beta$; hence $k$, and then also $K$, are smooth.

Let us write $k_x(\cdot) = k(x,\cdot)$.
Corollary 2.3.10. Let $a \in S^{-\infty}(\mathbb{R}^n\times\mathbb{R}^n)$. Then $k_x \in \mathcal{S}(\mathbb{R}^n)$. We have $a(X,D)f(x) = (k_x * f)(x)$ and, consequently, $a(X,D)f \in C^\infty(\mathbb{R}^n)$ for all $f \in \mathcal{S}'(\mathbb{R}^n)$.

We note that the convolution in the corollary is understood in the sense of distributions, see Section 1.4.2.

Proof of Corollary 2.3.10. The corollary follows from the fact that for $a \in S^{-\infty}$ we can write $a(X,D)f(x) = (f * k_x)(x)$ with $k_x(\cdot) = k(x,\cdot) \in \mathcal{S}(\mathbb{R}^n)$, so that $(a(X,D)f)(x) = f(\tau_x R k_x)$. If now $f \in \mathcal{S}'(\mathbb{R}^n)$, it follows that $a(X,D)f \in C^\infty$ because of the continuity of $f(\tau_x R k_x)$ and of all of its derivatives with respect to $x$.

Exercise 2.3.11 (Non-locality). Let $T$ be an operator defined by
\[ Tf(x) = \int_{\mathbb{R}^n} K(x,y)\,f(y)\,dy, \]
with $K \in C_0^\infty(\mathbb{R}^n\times\mathbb{R}^n)$. Prove that $T$ defines a continuous operator from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$ and from $\mathcal{S}'(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$. For operators $T$ as above with $K \not\equiv 0$, show that we can never have the property $\operatorname{supp} Tf \subset \operatorname{supp} f$ for all $f \in C^\infty(\mathbb{R}^n)$.
2.4 Boundedness on $L^2(\mathbb{R}^n)$
In this section we prove that pseudo-differential operators with symbols in $S^0(\mathbb{R}^n\times\mathbb{R}^n)$ are bounded on $L^2(\mathbb{R}^n)$. The corresponding result in Sobolev spaces will be given in Theorem 2.6.11. First we prepare the following general result, which shows that in many similar situations we only have to verify the estimate for the operator on a smaller space:

Proposition 2.4.1. Let $A : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ be a continuous linear operator such that $A(\mathcal{S}(\mathbb{R}^n)) \subset L^2(\mathbb{R}^n)$ and such that there exists $C$ for which the estimate
\[ \|Af\|_{L^2(\mathbb{R}^n)} \le C\,\|f\|_{L^2(\mathbb{R}^n)} \tag{2.9} \]
holds for all $f \in \mathcal{S}(\mathbb{R}^n)$. Then $A$ extends to a bounded linear operator from $L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^n)$, and estimate (2.9) holds for all $f \in L^2(\mathbb{R}^n)$, with the same constant $C$.

Proof. Indeed, let $f \in L^2(\mathbb{R}^n)$ and let $f_k \in \mathcal{S}(\mathbb{R}^n)$ be a sequence of rapidly decreasing functions such that $f_k \to f$ in $L^2(\mathbb{R}^n)$. Such a sequence exists because $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$ (Exercise 1.3.33). Then by (2.9) applied to $f_k - f_m$ we have
\[ \|A(f_k - f_m)\|_{L^2(\mathbb{R}^n)} \le C\,\|f_k - f_m\|_{L^2(\mathbb{R}^n)}, \]
so $Af_k$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$. By the completeness of $L^2(\mathbb{R}^n)$ (Theorem C.4.9) there is some $g \in L^2(\mathbb{R}^n)$ such that $Af_k \to g$ in $L^2(\mathbb{R}^n)$. On the other hand, $Af_k \to Af$ in $\mathcal{S}'(\mathbb{R}^n)$ because $f_k \to f$ in $L^2(\mathbb{R}^n)$ implies that $f_k \to f$ in $\mathcal{S}'(\mathbb{R}^n)$ (Exercise 1.3.12). By the uniqueness principle in Proposition 1.3.5 we have $Af = g \in L^2(\mathbb{R}^n)$. Passing to the limit in (2.9) applied to $f_k$, we get $\|Af\|_{L^2(\mathbb{R}^n)} \le C\,\|f\|_{L^2(\mathbb{R}^n)}$, with the same constant $C$, completing the proof.

There are different proofs of the $L^2$-result. For the proof of Theorem 2.4.2 below we follow [118], but an alternative proof based on the calculus will also be given later in Section 2.5.4.

Theorem 2.4.2 ($L^2$-boundedness of pseudo-differential operators). Let $a \in S^0(\mathbb{R}^n\times\mathbb{R}^n)$. Then $a(X,D)$ extends to a bounded linear operator from $L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^n)$.

Proof. First of all, we note that by the standard functional analytic argument in Proposition 2.4.1 it is sufficient to show the boundedness inequality (2.9) for $A = a(X,D)$ only for $f \in \mathcal{S}(\mathbb{R}^n)$, with constant $C$ independent of the choice of $f$. The proof of (2.9) will consist of two parts. In the first part we establish it for compactly supported (with respect to $x$) symbols, and in the second part we extend it to the general case of $a \in S^0(\mathbb{R}^n\times\mathbb{R}^n)$.

So, let us first assume that $a(x,\xi)$ has compact support with respect to $x$. This will allow us to use the Fourier transform with respect to $x$, in particular the formulae
\[ a(x,\xi) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\lambda}\,\widehat{a}(\lambda,\xi)\,d\lambda, \qquad \widehat{a}(\lambda,\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\lambda}\,a(x,\xi)\,dx, \]
with absolutely convergent integrals. We will use the fact that $a(\cdot,\xi) \in C_0^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$, so that $\widehat{a}(\cdot,\xi)$ is in the Schwartz space in the first variable. Consequently, we have $\widehat{a}(\cdot,\xi) \in \mathcal{S}(\mathbb{R}^n)$ uniformly in $\xi$. To see the uniformity, we can notice that
\[ (2\pi i\lambda)^\alpha\,\widehat{a}(\lambda,\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\lambda}\,\partial_x^\alpha a(x,\xi)\,dx, \]
and hence $|(2\pi i\lambda)^\alpha\,\widehat{a}(\lambda,\xi)| \le C_\alpha$ for all $\xi \in \mathbb{R}^n$. It follows that
\[ \sup_{\xi\in\mathbb{R}^n} |\widehat{a}(\lambda,\xi)| \le C_N\,(1+|\lambda|)^{-N} \]
for all $N$. Now we can write
\[ a(X,D)f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,a(x,\xi)\,\widehat{f}(\xi)\,d\xi
= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,e^{2\pi i x\cdot\lambda}\,\widehat{a}(\lambda,\xi)\,\widehat{f}(\xi)\,d\lambda\,d\xi
= \int_{\mathbb{R}^n} (Sf)(\lambda,x)\,d\lambda, \]
where (Sf )(λ, x) = e2πix·λ ( a(λ, D)f )(x). Here a(λ, D)f is a Fourier multiplier with symbol a(λ, ξ) independent of x, so by Plancherel’s identity Theorem 1.3.13 we get || a(λ, D)f ||L2 = ||F( a(λ, D)f )||L2 = || a(λ, ·)f||L2 ≤ sup | a(λ, ξ)| ||f||L2 ≤ CN (1 + |λ|)−N ||f ||L2 , ξ∈Rn
for all N ≥ 0. Hence we get ||a(X, D)f ||L2 ≤
||Sf (λ, ·)||L2 dλ ≤ CN (1 + |λ|)−N ||f ||L2 dλ ≤ C||f ||L2 , Rn
Rn
if we take N > n.

Now, to pass to symbols which are not necessarily compactly supported with respect to x, we will use the inequality

  ∫_{|x−x₀|≤1} |a(X, D)f(x)|² dx ≤ C_N ∫_{R^n} |f(x)|² / (1 + |x − x₀|)^N dx,   (2.10)

which holds for every x₀ ∈ R^n and for every N ≥ 0, with C_N independent of x₀ and dependent only on the constants in the symbolic inequalities for a.

Let us show first that (2.10) implies (2.9). Writing χ_{|x−x₀|≤1} for the characteristic function of the set |x − x₀| ≤ 1 and integrating (2.10) with respect to x₀ yields

  ∫_{R^n} ( ∫_{R^n} χ_{|x−x₀|≤1} |a(X, D)f(x)|² dx ) dx₀ ≤ C_N ∫_{R^n} ( ∫_{R^n} |f(x)|² / (1 + |x − x₀|)^N dx ) dx₀.

Changing the order of integration, we arrive at

  vol(B(1)) ∫_{R^n} |a(X, D)f(x)|² dx ≤ C ∫_{R^n} |f(x)|² dx,

which is (2.9).

Let us now prove (2.10), first for x₀ = 0. We can write f = f₁ + f₂, where f₁ and f₂ are smooth functions such that |f₁| ≤ |f|, |f₂| ≤ |f|, and supp f₁ ⊂ {|x| ≤ 3}, supp f₂ ⊂ {|x| ≥ 2}. We will do the estimate for f₁ first. Let us fix η ∈ C₀^∞(R^n) such that η(x) = 1 for |x| ≤ 1. Then η a(X, D) = (ηa)(X, D) is a pseudo-differential operator with symbol η(x)a(x, ξ) compactly supported in x, thus by
the first part we have

  ∫_{|x|≤1} |a(X, D)f₁(x)|² dx = ∫_{|x|≤1} |(ηa)(X, D)f₁(x)|² dx
      ≤ ∫_{R^n} |(ηa)(X, D)f₁(x)|² dx
      ≤ C ∫_{R^n} |f₁(x)|² dx
      ≤ C ∫_{|x|≤3} |f(x)|² dx,
which is the required estimate for f₁.

Let us now do the estimate for f₂. If |x| ≤ 1, then x ∉ supp f₂, so we can write

  a(X, D)f₂(x) = ∫_{|y|≥2} k(x, x − y) f₂(y) dy,

where k is the kernel of a(X, D). Since |x| ≤ 1 and |y| ≥ 2, we have |x − y| ≥ 1 and hence by Theorem 2.3.1 we can estimate |k(x, x − y)| ≤ C₁|x − y|^{−N} ≤ C₂|y|^{−N} for all N ≥ 0. Thus we can estimate

  |a(X, D)f₂(x)| ≤ C₁ ∫_{|y|≥2} |f₂(y)| / |y|^N dy
      ≤ C₂ ∫_{R^n} |f(y)| / (1 + |y|)^N dy
      ≤ C₃ ( ∫_{R^n} |f(y)|² / (1 + |y|)^N dy )^{1/2},

where we used the Cauchy–Schwarz inequality (Proposition 1.2.4) and the fact that ∫_{R^n} (1 + |y|)^{−N} dy < ∞ for N > n (Exercise 1.1.19). This in turn implies

  ∫_{|x|≤1} |a(X, D)f₂(x)|² dx ≤ C ∫_{R^n} |f(y)|² / (1 + |y|)^N dy,

which is the required estimate for f₂. These estimates for f₁ and f₂ imply (2.10) with x₀ = 0. We note that the constant depends only on the dimension and on the constants in the symbolic inequalities for a.

Let us now show (2.10) with an arbitrary x₀ ∈ R^n. Let us define a_{x₀}(x, ξ) = a(x − x₀, ξ). Then we immediately see that estimate (2.10) for a(X, D) in the ball {|x − x₀| ≤ 1} is equivalent to the same estimate for a_{x₀}(X, D) in the ball {|x| ≤ 1}. Finally we note that since the constants in the symbolic inequalities for a and a_{x₀} are the same, we obtain (2.10) with constant C_N independent of x₀. This completes the proof of Theorem 2.4.2.
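The Fourier-multiplier step in the first part of the proof, where Plancherel's identity forces ||â(λ, D)f||_{L²} ≤ sup_ξ |â(λ, ξ)| ||f||_{L²}, has a direct discrete analogue on the cyclic group Z/NZ that can be checked numerically. The sketch below is our illustration, not part of the text: it implements the discrete Fourier transform by hand and verifies the multiplier bound for one bounded symbol.

```python
import cmath
import math
import random

def dft(f):
    # discrete Fourier transform on Z/NZ, normalised so that idft(dft(f)) == f
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * x * xi / N) for x in range(N)) / N
            for xi in range(N)]

def idft(F):
    N = len(F)
    return [sum(F[xi] * cmath.exp(2j * math.pi * x * xi / N) for xi in range(N))
            for x in range(N)]

def multiplier(m, f):
    # discrete analogue of a(D)f: multiply the Fourier coefficients by the symbol m
    return idft([m(xi) * c for xi, c in enumerate(dft(f))])

def l2(f):
    return math.sqrt(sum(abs(v) ** 2 for v in f))

random.seed(0)
N = 32
f = [random.uniform(-1.0, 1.0) for _ in range(N)]
m = lambda xi: 1.0 / (1.0 + xi)  # a bounded symbol with sup |m| = 1
g = multiplier(m, f)
# Plancherel on Z/NZ gives ||m(D)f||_2 <= (sup |m|) * ||f||_2:
print(l2(g) <= l2(f) + 1e-9)
```

With this normalisation the transform is unitary up to reciprocal scale factors, so the operator norm of the multiplier is exactly sup_ξ |m(ξ)|, mirroring the continuous estimate.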
2.5 Calculus of pseudo-differential operators
In this section we establish formulae for compositions of pseudo-differential operators and for adjoint operators, and discuss the transformation of symbols under changes of variables.
2.5.1 Composition formulae

First we analyse compositions of pseudo-differential operators.

Theorem 2.5.1 (Composition of pseudo-differential operators). Let a ∈ S^{m₁}(R^n × R^n) and b ∈ S^{m₂}(R^n × R^n). Then there exists a symbol c ∈ S^{m₁+m₂}(R^n × R^n) such that c(X, D) = a(X, D) ∘ b(X, D). Moreover, we have the asymptotic formula

  c ∼ Σ_α [(2πi)^{−|α|} / α!] (∂_ξ^α a)(∂_x^α b),   (2.11)

which means that for all N > 0 we have

  c − Σ_{|α|<N} [(2πi)^{−|α|} / α!] (∂_ξ^α a)(∂_x^α b) ∈ S^{m₁+m₂−N}(R^n × R^n).
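For differential operators the sum in (2.11) is finite and the formula is exact; the following one-dimensional computation (our illustration, not taken from the text) may help fix the conventions. Recall that with the present normalisation the symbol of ∂_x is 2πiξ.

```latex
% Take n = 1, a(x,\xi) = 2\pi i\,\xi\,x (so that a(X,D) = x\,\partial_x)
% and b(x,\xi) = x^2 (so that b(X,D) is multiplication by x^2). Directly,
%   a(X,D)\,b(X,D)\,u = x\,\partial_x(x^2 u) = 2x^2 u + x^3 \partial_x u.
% Formula (2.11) terminates after \alpha = 1:
\begin{aligned}
c &= ab + \frac{(2\pi i)^{-1}}{1!}\,(\partial_\xi a)(\partial_x b)
   = 2\pi i\,\xi\,x^3 + (2\pi i)^{-1}(2\pi i\,x)(2x)\\
  &= 2\pi i\,\xi\,x^3 + 2x^2,
\end{aligned}
% so c(X,D)u = x^3\,\partial_x u + 2x^2 u, in agreement with the direct
% computation.
```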
Proposition 2.5.33 (Asymptotic sums of symbols). Let a_j ∈ S^{m_j}(R^n × R^n), j ∈ N₀, where m₀ > m₁ > m₂ > ··· and m_j → −∞ as j → ∞. Then there exists a symbol a ∈ S^{m₀}(R^n × R^n) such that

  a ∼ Σ_{j=0}^∞ a_j = a₀ + a₁ + a₂ + ···,

which means that we have

  a − Σ_{j=0}^{k−1} a_j ∈ S^{m_k}(R^n × R^n)

for all k ∈ N.
Proof. Let us fix a function χ ∈ C^∞(R^n) such that χ(ξ) = 1 for all |ξ| ≥ 1 and such that χ(ξ) = 0 for all |ξ| ≤ 1/2. Then, for some sequence τ_j increasing sufficiently fast and to be chosen later, we define

  a(x, ξ) = Σ_{j=0}^∞ a_j(x, ξ) χ(ξ/τ_j).

We note that this sum is well defined pointwise because it is in fact locally finite, since χ(ξ/τ_j) = 0 for |ξ| < τ_j/2. In order to show that a ∈ S^{m₀}(R^n × R^n) we first take a sequence τ_j such that the inequality

  |∂_x^β ∂_ξ^α [a_j(x, ξ) χ(ξ/τ_j)]| ≤ 2^{−j} (1 + |ξ|)^{m_j+1−|α|}   (2.18)

is satisfied for all |α|, |β| ≤ j. We first show that the function ξ^α ∂_ξ^α χ(ξ/τ_j) is uniformly bounded in ξ for each j. Indeed, it vanishes for |ξ| < τ_j/2 and for τ_j < |ξ|, while for τ_j/2 ≤ |ξ| ≤ τ_j it equals (ξ/τ_j)^α (∂^α χ)(ξ/τ_j), which is bounded by C. Hence |ξ^α ∂_ξ^α χ(ξ/τ_j)| ≤ C uniformly in ξ, for any given j. Using this fact, we can also estimate

  |∂_x^β ∂_ξ^α [a_j(x, ξ) χ(ξ/τ_j)]| = |Σ_{α₁+α₂=α} c_{α₁α₂} ∂_x^β ∂_ξ^{α₁} a_j(x, ξ) ∂_ξ^{α₂} χ(ξ/τ_j)|
      ≤ Σ_{α₁+α₂=α} |c_{α₁α₂}| (1 + |ξ|)^{m_j−|α₁|} (1 + |ξ|)^{−|α₂|}
      ≤ C (1 + |ξ|)^{m_j−|α|}
      = C (1 + |ξ|)^{−1} (1 + |ξ|)^{m_j+1−|α|}.

Now, the left-hand side in estimate (2.18) is zero for |ξ| < τ_j/2, so we may assume that |ξ| ≥ τ_j/2. Hence we can have C(1 + |ξ|)^{−1} ≤ C(1 + τ_j/2)^{−1} < 2^{−j} if we take τ_j sufficiently large. This implies that we can take the sum of the ∂_x^β ∂_ξ^α-derivatives in the definition of a(x, ξ), and (2.18) implies that a ∈ S^{m₀}(R^n × R^n). Finally, to show the asymptotic formula, we can write

  a − Σ_{j=0}^{k−1} a_j = Σ_{j=k}^∞ a_j(x, ξ) χ(ξ/τ_j),
and so

  |∂_x^β ∂_ξ^α [a − Σ_{j=0}^{k−1} a_j]| ≤ C (1 + |ξ|)^{m_k−|α|}.

In this argument we fix α and β first, and then use the required estimates for all j ≥ |α|, |β|. This shows that a − Σ_{j=0}^{k−1} a_j ∈ S^{m_k}(R^n × R^n), finishing the proof.

Exercise 2.5.34. Prove that Proposition 2.5.33 remains valid in (ρ, δ) classes for all ρ and δ.
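The only quantitative choice in the proof is how fast τ_j must grow. A small numerical sketch (our illustration; the cutoff profile and the choice τ_j = 2^{j+1} are ours) checks the two facts used above: the sum is locally finite in ξ, and on the support of χ(ξ/τ_j) the factor (1 + |ξ|)^{−1} is already below 2^{−j}.

```python
def tau(j):
    # one admissible choice of the sequence in the proof
    return 2.0 ** (j + 1)

def chi(t):
    # cutoff: 0 for |t| <= 1/2, 1 for |t| >= 1, linear in between
    t = abs(t)
    if t <= 0.5:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 2.0 * t - 1.0

# (1 + |xi|)^{-1} <= 2^{-j} on the support of chi(xi/tau_j), i.e., for |xi| >= tau_j/2:
gain = all(1.0 / (1.0 + tau(j) / 2.0) <= 2.0 ** (-j) for j in range(40))

# local finiteness: at a fixed frequency, only finitely many terms are non-zero
nonzero_terms = [j for j in range(50) if chi(5.0 / tau(j)) != 0.0]

print(gain, nonzero_terms)
```

With τ_j = 2^{j+1} the gain holds for every j, and at the fixed frequency ξ = 5 only the first three terms of the series survive.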
2.6 Applications to partial differential equations

The main question in the theory of partial differential equations is how to solve the equation Au = f for a given partial differential operator A and a given function f. In other words, how to find the inverse of A, i.e., an operator A⁻¹ such that

  A ∘ A⁻¹ = A⁻¹ ∘ A = I   (2.19)

is the identity operator (on some space of functions where everything is well defined). In this case the function u = A⁻¹f gives a solution to the partial differential equation Au = f.

First of all, we can observe that if the operator A is an operator with variable coefficients, in most cases it is impossible or very hard to find an explicit formula for its inverse A⁻¹ (even when it exists). However, in many questions in the theory of partial differential equations one is actually not so much interested in having a precise explicit formula for A⁻¹. Indeed, in reality one is mostly interested not in knowing the solution u to the equation Au = f explicitly, but rather in knowing some fundamental properties of u. One of the most important properties is the position and the strength of the singularities of u. Thus, the question becomes whether we can say something about the singularities of u knowing the singularities of f = Au. In this case we do not need to solve the equation Au = f exactly, but it is sufficient to know its solution modulo the class of smooth functions. Namely, instead of A⁻¹ in (2.19) one is interested in finding an "approximate" inverse of A modulo smooth functions, i.e., an operator B such that u = Bf solves the equation Au = f modulo smooth functions, i.e., such that (BA − I)f and (AB − I)f are smooth for all functions f from some class. Recalling that operators in Ψ^{−∞}(R^n × R^n) have such a property, we have the following definition, which applies to all pseudo-differential operators A:

Definition 2.6.1 (Parametrix). An operator B is called the right parametrix of A if AB − I ∈ Ψ^{−∞}(R^n × R^n). An operator C is called the left parametrix of A if CA − I ∈ Ψ^{−∞}(R^n × R^n).
Remark 2.6.2 (Left or right parametrix?). In fact, the left and right parametrix are closely related. Indeed, by definition we have AB − I = R1 and CA − I = R2 with some R1 , R2 ∈ Ψ−∞ (Rn × Rn ). Then we have C = C(AB − R1 ) = (CA)B − CR1 = B + R2 B − CR1 . If A, B, C are pseudo-differential operators of finite orders, the composition formula in Theorem 2.5.1 implies that R2 B, CR1 ∈ Ψ−∞ (Rn × Rn ), i.e., C − B is a smoothing operator. Thus, we will be mainly interested in the right parametrix B because u = Bf immediately solves the equation Au = f modulo smooth functions. We also note that since we work here modulo smoothing operators (i.e., operators in Ψ−∞ (Rn × Rn )), parametrices are obviously not unique – finding one of them is already very good because any two parametrices differ by a smoothing operator.
2.6.1 Freezing principle for PDEs
The following freezing principle provides a good and well-known motivation (see, e.g., [118]) for the use of the symbolic analysis in finding parametrices. Suppose we want to solve the following equation for an unknown function u = u(x):
  (Au)(x) := Σ_{1≤i,j≤n} a_{ij}(x) ∂²u/∂x_i∂x_j (x) = f(x),

where the matrix {a_{ij}(x)}_{i,j=1}^n is real valued, smooth, symmetric and positive definite. If we want to proceed in analogy to the Laplace equation in Remark 1.1.17, we should look for the inverse of the operator A. In the case of an operator with variable coefficients this may turn out to be difficult, so we may look for an approximate inverse B such that AB = I + E, where the error E is small in some sense. To be able to argue similarly to Remark 1.1.17, we "freeze" the operator A at x₀ to get the constant coefficient operator

  A_{x₀} = Σ_{1≤i,j≤n} a_{ij}(x₀) ∂²/∂x_i∂x_j.
Now, A_{x₀} has an exact inverse, which is the operator of multiplication by

  ( −4π² Σ_{1≤i,j≤n} a_{ij}(x₀) ξ_i ξ_j )⁻¹

on the Fourier transform side. To avoid a singularity at the origin, we introduce a cut-off function χ ∈ C^∞ which is 0 near the origin and 1 for large ξ. Then we
define

  (B_{x₀} f)(x) = ∫_{R^n} e^{2πi x·ξ} ( −4π² Σ_{1≤i,j≤n} a_{ij}(x₀) ξ_i ξ_j )⁻¹ χ(ξ) f̂(ξ) dξ.

Consequently, we can readily see that

  (A_{x₀} B_{x₀} f)(x) = ∫_{R^n} e^{2πi x·ξ} χ(ξ) f̂(ξ) dξ
      = f(x) + ∫_{R^n} e^{2πi x·ξ} (χ(ξ) − 1) f̂(ξ) dξ.

It follows that A_{x₀} B_{x₀} = I + E_{x₀}, where

  (E_{x₀} f)(x) = ∫_{R^n} e^{2πi x·ξ} (χ(ξ) − 1) f̂(ξ) dξ
is an operator of multiplication by a compactly supported function on the Fourier transform side. Writing it as a convolution with a smooth test function, we can readily see that it is a smoothing operator.

Exercise 2.6.3. Prove this.

Now, we can "unfreeze" the point x₀, expecting that the inverse B will be close to B_{x₀} for x close to x₀, and define

  (Bf)(x) = (B_x f)(x) = ∫_{R^n} e^{2πi x·ξ} ( −4π² Σ_{1≤i,j≤n} a_{ij}(x) ξ_i ξ_j )⁻¹ χ(ξ) f̂(ξ) dξ.
This does not yield a parametrix yet, but it will be clear from the composition formula that we still have AB = I + E1 with error E1 ∈ Ψ−1 (Rn × Rn ) being “smoothing of order 1”. We can then set up an iterative procedure to improve the approximation of the inverse operator relying on the calculus of the appearing operators, and to find a parametrix for A. This will be done in Theorem 2.6.7.
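On the circle T¹ the frozen, constant-coefficient picture can be made completely concrete on the Fourier side. The sketch below is our illustration (with an integer-frequency cutoff χ playing the role of the one above): for A = d²/dx² it checks that the error E = AB − I only touches the finitely many low frequencies where χ − 1 ≠ 0, so Ef is a trigonometric polynomial and hence smooth.

```python
import math

def chi(xi):
    # cutoff on integer frequencies: 0 near the origin, 1 for large |xi|
    return 0.0 if abs(xi) <= 1 else 1.0

def A_hat(f_hat):
    # A = d^2/dx^2 multiplies the coefficient of e^{2 pi i x xi} by -4 pi^2 xi^2
    return {xi: -4.0 * math.pi ** 2 * xi ** 2 * c for xi, c in f_hat.items()}

def B_hat(f_hat):
    # approximate inverse with symbol chi(xi) / (-4 pi^2 xi^2)
    return {xi: (c * chi(xi) / (-4.0 * math.pi ** 2 * xi ** 2) if chi(xi) != 0.0 else 0.0)
            for xi, c in f_hat.items()}

# a test function given by its Fourier coefficients on frequencies -8..8
f_hat = {xi: 1.0 / (1.0 + xi * xi) for xi in range(-8, 9)}
ABf = A_hat(B_hat(f_hat))
error_support = sorted(xi for xi in f_hat if abs(ABf[xi] - f_hat[xi]) > 1e-12)
print(error_support)
```

The error is supported exactly on the frequencies {−1, 0, 1} killed by the cutoff, independently of f, which is the discrete shadow of E being a smoothing operator.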
2.6.2 Elliptic operators
We will now show how we can use the calculus to "solve" elliptic partial differential equations. First, we recall the notion of ellipticity.

Definition 2.6.4 (Elliptic symbols). A symbol a ∈ S^m(R^n × R^n) is called elliptic if for some A > 0 it satisfies

  |a(x, ξ)| ≥ A|ξ|^m

for all |ξ| ≥ n₀ and all x ∈ R^n, for some n₀ > 0. We also say that the symbol a is elliptic in U ⊂ R^n if the above estimate holds for all x ∈ U. Pseudo-differential operators with elliptic symbols are also called elliptic.

Exercise 2.6.5. Show that the constant n₀ is not essential in this definition. Namely, for an elliptic symbol a ∈ S^m(R^n × R^n) show that there exists a symbol b ∈ S^m(R^n × R^n) satisfying |b(x, ξ)| ≥ c̃(1 + |ξ|)^m for all x, ξ ∈ R^n, such that b differs from a by a symbol in S^{−∞}(R^n × R^n).

Now, let L = a(X, D) be an elliptic pseudo-differential operator with symbol a ∈ S^m(R^n × R^n) (which is then also elliptic by definition). Let us introduce a cut-off function χ ∈ C^∞(R^n) such that χ(ξ) = 0 for small ξ, e.g., for |ξ| ≤ 1, and such that χ(ξ) = 1 for large ξ, e.g., for |ξ| > 2. The ellipticity of a(x, ξ) assures that it can be inverted pointwise for |ξ| ≥ 1, so we can define the symbol

  b(x, ξ) = χ(ξ) [a(x, ξ)]⁻¹.
Since a ∈ S^m(R^n × R^n) is elliptic, we easily see that b ∈ S^{−m}(R^n × R^n). If we take P₀ = b(X, D), then by the composition Theorem 2.5.1 we obtain

  LP₀ = I + E₁,   P₀L = I + E₂,

for some E₁, E₂ ∈ Ψ^{−1}(R^n × R^n). Thus, we may view P₀ as a good first approximation for a parametrix of L. In order to find a parametrix of L, we need to modify P₀ in such a way that E₁ and E₂ would be in Ψ^{−∞}(R^n × R^n). This construction can be carried out in an iterative way. Indeed, we now show that ellipticity is equivalent to invertibility in the algebra Ψ^∞(R^n × R^n)/Ψ^{−∞}(R^n × R^n):

Theorem 2.6.6 (Elliptic ⟺ ∃ Parametrix). An operator A ∈ Ψ^m(R^n × R^n) is elliptic if and only if there exists B ∈ Ψ^{−m}(R^n × R^n) such that BA ∼ I ∼ AB modulo Ψ^{−∞}(R^n × R^n).

Proof. Let σ_A and σ_B denote the symbols of A and B, respectively. Assume first that A ∈ Ψ^m(R^n × R^n) and B ∈ Ψ^{−m}(R^n × R^n) satisfy BA = I − T and AB = I − T̃ with T, T̃ ∈ Ψ^{−∞}(R^n × R^n). Then 1 − σ_{BA} = σ_T ∈ S^{−∞}(R^n × R^n), and consequently by Theorem 2.5.1 we have 1 − σ_B σ_A ∈ S^{−1}(R^n × R^n), so that

  |1 − σ_B σ_A| ≤ C⟨ξ⟩⁻¹.

Hence 1 − |σ_B| · |σ_A| ≤ C⟨ξ⟩⁻¹, or equivalently |σ_B| · |σ_A| ≥ 1 − C⟨ξ⟩⁻¹. If we choose n₀ > C, then

  |σ_B(x, ξ)| · |σ_A(x, ξ)| ≥ 1 − C n₀⁻¹ > 0

for any |ξ| ≥ n₀. Thus σ_A(x, ξ) ≠ 0 for |ξ| ≥ n₀ and

  1/|σ_A(x, ξ)| ≤ C|σ_B(x, ξ)| ≤ C⟨ξ⟩^{−m}.

Hence A is elliptic of order m. This yields the first part of the proof.
Conversely, assume that A and σ_A(x, ξ) are elliptic. We will construct the symbol b as an asymptotic sum

  b ∼ b₀ + b₁ + b₂ + ···

and then use Proposition 2.5.33 to justify this infinite sum. Then we take operators B_j with symbols b_j, and the operator B with symbol b will be the parametrix for A. We will also work with |ξ| ≥ n₀, since small ξ are not relevant for symbolic constructions. Moreover, once we have the left parametrix, we also have the right one in view of Remark 2.6.2.

First, we take b₀ = 1/σ_A, which is well defined for |ξ| ≥ n₀ in view of the ellipticity of σ_A. Then we have b₀ ∈ S^{−m}, with

  σ_{B₀A} = 1 − e₀,   where e₀ = 1 − σ_{B₀A} ∈ S^{−1}.

Then we take b₁ = e₀/σ_A ∈ S^{−m−1}, so that we have

  σ_{(B₀+B₁)A} = 1 − e₀ + σ_{B₁A} = 1 − e₁,   with e₁ = e₀ − σ_{B₁A} ∈ S^{−2}.

Inductively, we define b_j = e_{j−1}/σ_A ∈ S^{−m−j}, and we have

  σ_{(B₀+B₁+···+B_j)A} = 1 − e_j,   with e_j = e_{j−1} − σ_{B_jA} ∈ S^{−j−1}.
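For orientation, the first error term can be written out explicitly from the composition formula (2.11); this computation is ours and is only meant to make the recursion concrete.

```latex
% With b_0 = 1/\sigma_A, formula (2.11) gives
\begin{aligned}
\sigma_{B_0 A} &\sim b_0\,\sigma_A
  + (2\pi i)^{-1}\sum_{j=1}^{n} (\partial_{\xi_j} b_0)(\partial_{x_j}\sigma_A)
  + \cdots\\
 &= 1 - (2\pi i)^{-1}\sum_{j=1}^{n}
    \frac{(\partial_{\xi_j}\sigma_A)(\partial_{x_j}\sigma_A)}{\sigma_A^{2}}
  + \cdots,
\end{aligned}
% so that
e_0 = (2\pi i)^{-1}\sum_{j=1}^{n}
      \frac{(\partial_{\xi_j}\sigma_A)(\partial_{x_j}\sigma_A)}{\sigma_A^{2}}
      \;+\; (\text{terms in } S^{-2}) \;\in\; S^{-1},
% and then b_1 = e_0/\sigma_A \in S^{-m-1}. In particular e_0 vanishes to
% leading order when \sigma_A does not depend on x, as for
% constant-coefficient operators.
```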
Now, Proposition 2.5.33 shows that b ∈ S^{−m}, and it satisfies σ_{BA} = 1 − e with e ∈ S^{−∞} by its construction, completing the proof.

We now give a slightly more general statement which is useful for other purposes as well. It is a consequence of Theorem 2.6.6 and the composition Theorem 2.5.1.

Corollary 2.6.7 (Local parametrix). Let a ∈ S^m(R^n × R^n) be elliptic on an open set U ⊂ R^n, i.e., there exists some A > 0 such that |a(x, ξ)| ≥ A|ξ|^m for all x ∈ U and all |ξ| ≥ 1. Let c ∈ S^l be a symbol of order l whose support with respect to x is a compact subset of U. Then there exists a symbol b ∈ S^{l−m} such that

  b(X, D) a(X, D) = c(X, D) − e(X, D)

for some symbol e ∈ S^{−∞}.

We also have the following local version of this for partial differential operators:

Corollary 2.6.8 (Parametrix for elliptic differential operators). Let

  L = Σ_{|α|≤m} a_α(x) ∂_x^α

be an elliptic partial differential operator in an open set U ⊂ R^n. Let χ₁, χ₂, χ₃ ∈ C₀^∞(R^n) be such that χ₂ = 1 on the support of χ₁ and χ₃ = 1 on the support of χ₂. Then there is an operator P ∈ Ψ^{−m}(R^n × R^n) such that

  P(χ₂L) = χ₁I + Eχ₃,

for some E ∈ Ψ^{−∞}(R^n × R^n).
Proof. We take a(X, D) = χ₂L and c(X, D) = χ₁I in Corollary 2.6.7. Then a(X, D) is elliptic on the support of χ₂, and we can take P = b(X, D) with b ∈ S^{−m} from Corollary 2.6.7.

We will now apply this result to obtain a statement on the regularity of solutions to elliptic partial differential equations. We assume that the order m below is an integer, which is certainly true when L is a partial differential operator. However, if we take into account the discussion from the next section, we will see that the statements below are still true for any m ∈ R.

Theorem 2.6.9 (A-priori estimate). Let L ∈ Ψ^m be an elliptic pseudo-differential operator in an open set U ⊂ R^n and let Lu = f in U. Assume that f ∈ (L²_k(U))_loc. Then u ∈ (L²_{m+k}(U))_loc.

This theorem shows that if u is a solution of an elliptic partial differential equation Lu = f, then there is a local gain of m derivatives for u compared to f, where m is the order of the operator L.

Proof. Let χ₁, χ₂, χ₃ ∈ C₀^∞(U) be non-zero functions such that χ₂ = 1 on the support of χ₁ and χ₃ = 1 on the support of χ₂. Then, similar to the proof of Corollary 2.6.8, we have P(χ₂L) = χ₁I + Eχ₃, with some P ∈ Ψ^{−m}. Since f ∈ (L²_k)_loc we have P(χ₂f) ∈ (L²_{m+k})_loc. Also, E(χ₃u) ∈ C^∞, so that ||E(χ₃u)||_{L²_k} ≤ C||χ₃u||_{L²} for any k. Summarising and using P(χ₂f) = χ₁u + E(χ₃u), we obtain

  ||χ₁u||_{L²_{k+m}} ≤ C ( ||χ₂f||_{L²_k} + ||χ₃u||_{L²} ),

which implies that u ∈ (L²_{m+k})_loc in U.
Remark 2.6.10. We can observe from the proof that, via the calculus and the existence of a parametrix, the properties of the solution u are reduced to the fact that pseudo-differential operators in Ψ^{−m} map L²_k to L²_{k+m}. In fact, in this way many properties of solutions to partial differential equations are reduced to questions about general pseudo-differential operators.

In the following statement for now one can think of m and k being integers or zero such that m ≤ k, but if we adopt the definition of Sobolev spaces from Definition 2.6.15, it is valid for all m, k ∈ R. We will prove it completely in the case p = 2, and in the case p ≠ 2 we will show how to reduce it to the Lᵖ-boundedness of pseudo-differential operators.

Theorem 2.6.11 (Lᵖ_k-continuity). Let T ∈ Ψ^m(R^n × R^n) be a pseudo-differential operator of order m ∈ R, let 1 < p < ∞, and let k ∈ R. Then T extends to a bounded linear operator from the Sobolev space Lᵖ_k(R^n) to the Sobolev space Lᵖ_{k−m}(R^n).
We will prove this statement in the next section. As an immediate consequence, by the same argument as in the proof of this theorem, we also obtain

Corollary 2.6.12 (Local Lᵖ_k-continuity). Let L ∈ Ψ^m be an elliptic pseudo-differential operator in an open set U ⊂ R^n, let 1 < p < ∞, m, k ∈ R, and let Lu = f in U. Assume that f ∈ (Lᵖ_k(U))_loc. Then u ∈ (Lᵖ_{m+k}(U))_loc.

Let us briefly discuss an application of the established a priori estimates.

Definition 2.6.13 (Harmonic functions). A distribution f ∈ D′(R^n) is called harmonic if Lf = 0, where L = ∂²/∂x₁² + ··· + ∂²/∂x_n² is the usual Laplace operator.

Taking real and imaginary parts of holomorphic functions, we see that Liouville's theorem D.6.2 for holomorphic functions follows from

Theorem 2.6.14 (Liouville's theorem for harmonic functions). Every harmonic function f ∈ L^∞(R^n) is constant.

Proof. Since L is elliptic, by Theorem 2.6.9 it follows from the equation Lf = 0 that f ∈ C^∞(R^n). Taking the Fourier transform of Lf = 0 we obtain −4π²|ξ|² f̂(ξ) = 0, which means that supp f̂ ⊂ {0}. By Exercise 1.4.15 it follows that f̂ = Σ_{|α|≤m} a_α ∂^α δ. Taking the inverse Fourier transform we see that f(x) must be a polynomial. Finally, the assumption that f is bounded implies that f must be constant.
2.6.3 Sobolev spaces revisited
Up to now we defined Sobolev spaces Lᵖ_k assuming that the index k is an integer. In fact, using the calculus of pseudo-differential operators we can show that these spaces can be defined for all k ∈ R, thus allowing one to measure the regularity of functions much more precisely. In the following discussion we assume the statement on the Lᵖ-continuity of pseudo-differential operators from Theorem 2.6.22.

We recall from Definition 1.5.6 that for an integer k ∈ N we defined the Sobolev space Lᵖ_k(R^n) as the space of all f ∈ Lᵖ(R^n) such that their distributional derivatives satisfy ∂_x^α f ∈ Lᵖ(R^n) for all 0 ≤ |α| ≤ k. This space is equipped with the norm ||f||_{Lᵖ_k} = Σ_{|α|≤k} ||∂_x^α f||_{Lᵖ} (or with any equivalent norm) for 1 ≤ p < ∞, with a modification for p = ∞.

Let L = ∂²/∂x₁² + ··· + ∂²/∂x_n² be the Laplace operator, so that its symbol is equal to −4π²|ξ|². Let s ∈ R be a real number and let us consider the operators (I − L)^{s/2} ∈ Ψ^s(R^n × R^n), which are pseudo-differential operators with symbols a(x, ξ) = (1 + 4π²|ξ|²)^{s/2}.

Definition 2.6.15 (Sobolev spaces). We will say that f is in the Sobolev space Lᵖ_s(R^n), i.e., f ∈ Lᵖ_s(R^n), if (I − L)^{s/2} f ∈ Lᵖ(R^n). We equip this space with the norm ||f||_{Lᵖ_s} := ||(I − L)^{s/2} f||_{Lᵖ}.
Proposition 2.6.16. If s ∈ N is an integer, the space Lᵖ_s(R^n) coincides with the space Lᵖ_k(R^n) with k = s, with equivalence of norms.

Proof. We will use the index k for both spaces. Since the operator (I − L)^{k/2} is a pseudo-differential operator of order k, by Theorem 2.6.11 we get that it is bounded from Lᵖ_k to Lᵖ, i.e., we have

  ||(I − L)^{k/2} f||_{Lᵖ} ≤ C Σ_{|α|≤k} ||∂_x^α f||_{Lᵖ}.

Conversely, let P_α be a pseudo-differential operator defined by P_α = ∂_x^α (I − L)^{−k/2}, i.e., a pseudo-differential operator with symbol p_α(x, ξ) = (2πiξ)^α (1 + 4π²|ξ|²)^{−k/2}, independent of x. If |α| ≤ k, we get that p_α ∈ S^{|α|−k} ⊂ S⁰, so that P_α ∈ Ψ⁰(R^n × R^n) for all |α| ≤ k. By Theorem 2.6.22 the operators P_α are bounded on Lᵖ(R^n). Therefore, we obtain

  Σ_{|α|≤k} ||∂_x^α f||_{Lᵖ} = Σ_{|α|≤k} ||P_α (I − L)^{k/2} f||_{Lᵖ} ≤ C ||(I − L)^{k/2} f||_{Lᵖ},
completing the proof.
Exercise 2.6.17 (Sobolev embedding theorem). Prove that if s > k + n/2 then H^s(R^n) ⊂ C^k(R^n) and the inclusion is continuous. This gives a sharper version of Exercise 1.5.11.

Exercise 2.6.18 (Distributions as Sobolev space functions). Recall from Exercise 1.4.14 that if u ∈ E′(R^n) then u is a distribution of some finite order m. Prove that if s < −m − n/2 then u ∈ H^s. Contrast this with Exercise 2.6.17.

Exercise 2.6.19. Prove that

  S(R^n) = ∩_{s∈R} ⟨x⟩^{−s} H^s(R^n)   and   S′(R^n) = ∪_{s∈R} ⟨x⟩^{s} H^s(R^n).

Note that these equalities fail without the weights: for example, show that (sin x)/x ∈ ∩_{k∈N₀} H^k(R) but (sin x)/x ∉ S(R). The situation on the torus will be somewhat simpler, see Corollary 3.2.12.

Finally, let us justify Theorem 2.6.11. However, we will assume without proof that pseudo-differential operators of order zero are bounded on Lᵖ(R^n) for all 1 < p < ∞, see Theorem 2.6.22.

Proof of Theorem 2.6.11. Let f ∈ Lᵖ_s(R^n). By definition this means that (I − L)^{s/2} f ∈ Lᵖ(R^n). Then we can write, using the calculus of pseudo-differential operators (composition Theorem 2.5.1):

  (I − L)^{(s−m)/2} T f = [ (I − L)^{(s−m)/2} T (I − L)^{−s/2} ] (I − L)^{s/2} f ∈ Lᵖ(R^n),

since the operator (I − L)^{(s−m)/2} T (I − L)^{−s/2} is a pseudo-differential operator of order zero and is, therefore, bounded on Lᵖ(R^n), by Theorem 2.6.22 if p ≠ 2 and by Theorem 2.4.2 if p = 2.
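For p = 2, Plancherel's theorem turns the norm of Definition 2.6.15 into a weighted ℓ² sum of Fourier coefficients, and the equivalence in Proposition 2.6.16 for k = 1 becomes an exact identity: (1 + 4π²ξ²)|f̂(ξ)|² = |f̂(ξ)|² + |2πiξ f̂(ξ)|². A toy check on the circle (our illustration, with an arbitrary decaying coefficient sequence):

```python
import math

# Fourier coefficients of some test function on T^1 (any decaying sequence works)
f_hat = {xi: 1.0 / (1.0 + abs(xi)) ** 3 for xi in range(-20, 21)}

# ||(I - L)^{1/2} f||_{L^2}^2 computed on the Fourier side
lhs = sum((1.0 + 4.0 * math.pi ** 2 * xi ** 2) * c * c for xi, c in f_hat.items())

# ||f||_{L^2}^2 + ||f'||_{L^2}^2, where f' has coefficients 2*pi*i*xi*f_hat(xi)
rhs = sum(c * c for c in f_hat.values()) + \
      sum((2.0 * math.pi * xi) ** 2 * c * c for xi, c in f_hat.items())

print(abs(lhs - rhs) < 1e-9)
```

For p ≠ 2 no such identity is available, which is why the proof above instead routes everything through the Lᵖ-boundedness of order-zero operators.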
Remark 2.6.20. It is often very useful to conclude something about properties of functions in one Sobolev space knowing about their properties in another Sobolev space. One instance of such a conclusion will be used in the proof of Theorem 4.2.3 on the Sobolev boundedness of operators on the L²-space on the torus. A general Banach space setting for such conclusions will be presented in Section 3.5. Here we present without proof another instance of this phenomenon:

Theorem 2.6.21 (Rellich's theorem). Let (f_k)_{k=1}^∞ ⊂ H^s(R^n) be a uniformly bounded sequence of functions: there exists C such that ||f_k||_{H^s(R^n)} ≤ C for all k. Assume that all functions f_k are supported in a fixed compact set. Then there exists a subsequence of (f_k)_{k=1}^∞ which converges in H^σ(R^n) for all σ < s.
Remarks on Lᵖ-continuity of pseudo-differential operators. Let a ∈ S⁰(R^n × R^n). Then by Theorem 2.3.1 the integral kernel K(x, y) of the pseudo-differential operator a(X, D) satisfies the estimates

  |∂_x^α ∂_y^β K(x, y)| ≤ A_{αβ} |x − y|^{−n−|α|−|β|}

for all x ≠ y. In particular, for α = β = 0 this gives

  |K(x, y)| ≤ A |x − y|^{−n}   for all x ≠ y.   (2.20)

Moreover, if we use it for α = 0 and |β| = 1, we get

  ∫_{|x−z|≥2δ} |K(x, y) − K(x, z)| dx ≤ A   if |y − z| ≤ δ, for all δ > 0.   (2.21)

Now, if we take a general integral operator T of the form

  Tu(x) = ∫_{R^n} K(x, y) u(y) dy,

properties (2.20) and (2.21) of the kernel are the starting point of the so-called Calderón–Zygmund theory of singular integral operators. In particular, one can conclude that such operators are of weak type (1, 1), i.e., they satisfy the estimate

  μ{x ∈ R^n : |Tu(x)| > λ} ≤ C ||u||_{L¹} / λ

(see Definition 1.6.3 and the discussion following it for more details). Since we also know from Theorem 2.4.2 that operators a(X, D) ∈ Ψ⁰(R^n × R^n) are bounded on L²(R^n), and since we also know from Proposition 1.6.4 that this implies that a(X, D) is of weak type (2, 2), we get that pseudo-differential operators of order zero are of weak types (1, 1) and (2, 2). Then, by Marcinkiewicz' interpolation Theorem 1.6.5, we conclude that a(X, D) is bounded on Lᵖ(R^n) for all 1 < p < 2. By the standard duality argument, this implies that a(X, D) is bounded on Lᵖ(R^n) also for all 2 < p < ∞.
Since we also have the boundedness on L²(R^n), we obtain

Theorem 2.6.22. Let T ∈ Ψ⁰(R^n × R^n). Then T extends to a bounded operator from Lᵖ(R^n) to Lᵖ(R^n), for all 1 < p < ∞.

We note that there exist different proofs of this theorem. On one hand, it follows automatically from the Calderón–Zygmund theory of singular integral operators, which includes the pseudo-differential operators considered here if we view them as integral operators with singular kernels. There are many other proofs that can be found in monographs on pseudo-differential operators. Another, more direct method is to reduce the Lᵖ-boundedness to the question of uniform boundedness of Fourier multipliers in Lᵖ(R^n), which then follows from Hörmander's theorem on Fourier multipliers. However, in this monograph we decided not to immerse ourselves in the Lᵖ-world since our aims here are different. We can refer to [91] and to [92] for more information on the Lᵖ-boundedness of general Fourier integral operators in (ρ, δ)-classes with real and complex phase functions, respectively.
Chapter 3

Periodic and Discrete Analysis

In this chapter we will review basics of the periodic and discrete analysis which will be necessary for the development of the theory of pseudo-differential operators on the torus in Chapter 4. Our aim is to make these two chapters accessible independently for people who choose periodic pseudo-differential operators as a starting point for learning about pseudo-differential operators on R^n. This may be a fruitful idea in the sense that many technical issues disappear on the torus as opposed to R^n. Among them is the fact that often one does not need to worry about convergence of the integrals in view of the torus being compact. Moreover, the theory of distributions on the torus is much simpler than that on R^n, at least in the form required for us. The main reason is that the periodic Fourier transform takes functions on T^n = R^n/Z^n to functions on Z^n where, for example, tempered distributions become pointwise defined functions on the lattice Z^n of polynomial growth at infinity. Also, on the lattice Z^n there are no questions of regularity since all the objects are defined on a discrete set. However, there are many parallels between the Euclidean and toroidal theories of pseudo-differential operators, so looking at proofs of similar results in different chapters may be beneficial. In many cases we tried to avoid overlaps by presenting a different proof or by giving a different explanation. Therefore, we also try to make the reading self-contained and elementary, avoiding cross-references to other chapters unless they increase the didactic value of the material. Yet, being written for people working with analysis, this chapter only briefly states the related notations and facts of more general functional analysis. Supplementary material is, of course, referred to.
The reader should have a basic knowledge of Banach and Hilbert spaces (the necessary background material is provided in Chapter B); some familiarity with distributions and point set topology definitely helps (this material can be found in Chapter A and in Chapter 1 if necessary). A word of warning has to be said: in order to use the theory of periodic pseudo-differential operators as a tool, there is no demand to dwell deeply on these prerequisites. One is rather encouraged to read the appropriate theory only when
it is encountered and needed, and that is why we present a summary of necessary things here as well.

We will use the following notation in the sequel. The triangles △ and ▽ will denote the forward and backward difference operators, respectively. The Laplacian will be denoted by L to avoid any confusion. The Dirac delta at x will be denoted by δ_x and the Kronecker delta at ξ will be denoted by δ_{ξ,η}. As is common, R and C are written for the real and complex numbers, respectively, Z stands for the integers, while N = Z⁺ := {n ∈ Z | n ≥ 1} and N₀ := Z⁺ ∪ {0} are the sets of positive integers and nonnegative integers, respectively. We would also like to draw the reader's attention to the notation |α| and ‖ξ‖ in (3.2) and (3.3), respectively, that we will be using in this chapter as well as in Chapters 4 and 5. This is especially of relevance in these chapters as both multi-indices α ∈ N₀^n and frequencies ξ ∈ Z^n are integers.
3.1 Distributions and Fourier transforms on T^n and Z^n
We fix the notation for the torus as T^n = (R/Z)^n = R^n/Z^n. Often we may identify T^n with the cube [0, 1)^n ⊂ R^n, where we identify the measure on the torus with the restriction of the Euclidean measure on the cube. Functions on T^n may be thought of as those functions on R^n that are 1-periodic in each of the coordinate directions. We will often simply say that such functions are 1-periodic (instead of Z^n-periodic). More precisely, on the Euclidean space R^n we define an equivalence relation by

  x ∼ y ⟺ x − y ∈ Z^n,

where the equivalence classes are

  [x] = {y ∈ R^n : x ∼ y} = {x + k : k ∈ Z^n}.

A point x ∈ R^n is naturally mapped to a point [x] ∈ T^n, and usually there is no harm in writing x ∈ T^n instead of the actual [x] ∈ T^n. We may identify functions on T^n with Z^n-periodic functions on R^n in a natural manner, f : T^n → C being identified with g : R^n → C satisfying g(x) = f([x]) for all x ∈ R^n. In such a case we typically even write g = f and g(x) = f(x), and we might casually say things like

• "f is periodic",
• "g ∈ C^∞(T^n)" when actually "g ∈ C^∞(R^n) is periodic",
• etc.
The reader has at least been warned. Moreover, the one-dimensional torus T¹ = R¹/Z¹ is isomorphic to the circle S¹ = {z ∈ R² : ‖z‖ = 1} = {(cos(t), sin(t)) : t ∈ R} by the obvious mapping [t] ↦ (cos(2πt), sin(2πt)), so we may identify functions on T¹ with functions on S¹.

Remark 3.1.1 (What makes T¹ and T^n special?). At this point, we must emphasize how fundamental the study on the one-dimensional torus T¹ = R¹/Z¹ is. First, smooth Jordan curves, especially the one-dimensional sphere S¹, are diffeomorphic to T¹. Secondly, the theory on the n-dimensional torus T^n = R^n/Z^n sometimes reduces to the case of T¹. Furthermore, compared to the theory of pseudo-differential operators on R^n, the case of T^n is beautifully simple. This is due to the fact that T^n is a compact Abelian group (whereas R^n is only locally compact) on which the powerful aid of Fourier series is at our disposal. However, the results on R^n and T^n are somewhat alike. Many general results concerning series on the torus and their properties can be found in, e.g., [155].

To make this chapter more self-contained, let us also briefly review the multi-index notation. A vector α = (α_j)_{j=1}^n ∈ N₀^n is called a multi-index. If x = (x_j)_{j=1}^n ∈ R^n and α ∈ N₀^n, we write x^α := x₁^{α₁} ··· x_n^{α_n}. For multi-indices, α ≤ β means α_j ≤ β_j for all j ∈ {1, ..., n}. We also write β! := β₁! ··· β_n! and

  \binom{α}{β} := α! / (β! (α − β)!) = \binom{α₁}{β₁} ··· \binom{α_n}{β_n},

so that
(x + y) =
+α , β
β≤α
xα−β y β .
(3.1)
For α ∈ Nn0 and x ∈ Rn we shall write |α|
n
αj ,
(3.2)
⎞1/2 ⎛ n x2j ⎠ , := ⎝
(3.3)
:=
j=1
x
j=1
∂xα
:= ∂xα11 · · · ∂xαnn ,
∂ ∂ where ∂xj = ∂x etc. We will also use the notation Dxj = −i2π∂xj = −i2π ∂x , j √ j where i = −1 is the imaginary unit. We have chosen the notation x for the
300
Chapter 3. Periodic and Discrete Analysis
Euclidean distance in this chapter, to contrast it with |α| used for multi-indices. We also denote x := (1 + x2 )1/2 .
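Since this chapter leans heavily on multi-index manipulations, the conventions above are easy to sanity-check numerically. The following short Python sketch (an illustration of ours, not part of the book) verifies the multi-index binomial theorem (3.1) for a sample α, x and y; the helper names are our own.

```python
from itertools import product
from math import comb, prod

def multi_binom(a, b):
    # binom(α, β) = binom(α_1, β_1) ··· binom(α_n, β_n)
    return prod(comb(ai, bi) for ai, bi in zip(a, b))

def mpow(x, a):
    # x^α = x_1^{α_1} ··· x_n^{α_n}
    return prod(xi ** ai for xi, ai in zip(x, a))

alpha = (2, 3)
x, y = (1.5, -2.0), (0.5, 1.0)
lhs = mpow(tuple(xi + yi for xi, yi in zip(x, y)), alpha)
rhs = sum(
    multi_binom(alpha, beta)
    * mpow(x, tuple(a - b for a, b in zip(alpha, beta)))
    * mpow(y, beta)
    for beta in product(*(range(a + 1) for a in alpha))  # all β ≤ α
)
assert abs(lhs - rhs) < 1e-9  # (x+y)^α = Σ_{β≤α} binom(α,β) x^{α−β} y^β
```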
Exercise 3.1.2. Prove (3.1).

Exercise 3.1.3. Show that (\sum_{j=1}^{n} x_j)^m = \sum_{|α|=m} \frac{m!}{α!} x^α, where x ∈ R^n and m ∈ N_0.

Definition 3.1.4 (Periodic functions). A function f : R^n → Y is 1-periodic if f(x + k) = f(x) for every x ∈ R^n and k ∈ Z^n. We shall consider these functions to be defined on T^n = R^n/Z^n = {x + Z^n : x ∈ R^n}. The space of 1-periodic m times continuously differentiable functions is denoted by C^m(T^n), and the test functions are the elements of the space C^∞(T^n) := \bigcap_{m∈Z_+} C^m(T^n).

Remark 3.1.5. The natural inherent topology of C^∞(T^n) is induced by the seminorms that one gets by demanding the following convergence: u_j → u if and only if ∂^α u_j → ∂^α u uniformly, for all α ∈ N_0^n. Thus, e.g., by [89, 1.46] C^∞(T^n) is a Fréchet space, but it is not normable as it has the Heine–Borel property. Let S(R^n) denote the space of the Schwartz test functions from Definition 1.1.11, and let S′(R^n) be its dual, i.e., the space of the tempered distributions from Definition 1.3.1. The integer lattice Z^n plays an important role in periodic and discrete analysis.

Definition 3.1.6 (Schwartz space S(Z^n)). Let S(Z^n) denote the space of rapidly decaying functions Z^n → C. That is, ϕ ∈ S(Z^n) if for any M < ∞ there exists a constant C_{ϕ,M} such that |ϕ(ξ)| ≤ C_{ϕ,M} ⟨ξ⟩^{−M} holds for all ξ ∈ Z^n. The topology on S(Z^n) is given by the seminorms p_k, where k ∈ N_0 and p_k(ϕ) := \sup_{ξ∈Z^n} ⟨ξ⟩^k |ϕ(ξ)|.

Exercise 3.1.7 (Tempered distributions S′(Z^n)). Show that the continuous linear functionals on S(Z^n) are of the form

ϕ ↦ ⟨u, ϕ⟩ := \sum_{ξ∈Z^n} u(ξ) ϕ(ξ),

where the functions u : Z^n → C grow at most polynomially at infinity, i.e., there exist constants M < ∞ and C_{u,M} such that |u(ξ)| ≤ C_{u,M} ⟨ξ⟩^M holds for all ξ ∈ Z^n. Such distributions u : Z^n → C form the space S′(Z^n). Note that compared to S′(R^n), distributions in S′(Z^n) are pointwise well-defined functions (!) on the lattice Z^n.
To contrast the Euclidean and toroidal Fourier transforms, they will be denoted by F_{R^n} and F_{T^n}, respectively. Let F_{R^n} : S(R^n) → S(R^n) be the Euclidean Fourier transform defined by

(F_{R^n} f)(ξ) := \int_{R^n} e^{−2πi x·ξ} f(x) dx.

The mapping F_{R^n} : S(R^n) → S(R^n) is a bijection, and its inverse F_{R^n}^{−1} is given by

f(x) = \int_{R^n} e^{2πi x·ξ} (F_{R^n} f)(ξ) dξ,

see Theorem 1.1.21. As is well known, this Fourier transform can be uniquely extended to F_{R^n} : S′(R^n) → S′(R^n) by duality, see Definition 1.3.2. We refer to Section 1.1 for further details concerning the Euclidean Fourier transform.

Definition 3.1.8 (Toroidal/periodic Fourier transform). Let F_{T^n} = (f ↦ \hat{f}) : C^∞(T^n) → S(Z^n) be the toroidal Fourier transform defined by

\hat{f}(ξ) := \int_{T^n} e^{−i2π x·ξ} f(x) dx.    (3.4)

Then F_{T^n} is a bijection and its inverse F_{T^n}^{−1} : S(Z^n) → C^∞(T^n) is given by

f(x) = \sum_{ξ∈Z^n} e^{i2π x·ξ} \hat{f}(ξ),

so that for h ∈ S(Z^n) we have

(F_{T^n}^{−1} h)(x) := \sum_{ξ∈Z^n} e^{i2π x·ξ} h(ξ).

Remark 3.1.9 (Notations i2πx·ξ vs 2πix·ξ). We note that in the case of the toroidal Fourier transform we write i2πx·ξ in the exponential, with i in front, to emphasize that 2πx is now a periodic variable, and also to distinguish it from the Euclidean Fourier transform, in which case we usually write 2πi in the exponential. We will write F_{T^n} f instead of \hat{f} in this and the next chapters only if we want to emphasize that we want to take the periodic Fourier transform.

Exercise 3.1.10 (Two Fourier inversion formulae). Prove that the Fourier transform F_{T^n} : C^∞(T^n) → S(Z^n) is a bijection, that F_{T^n} : C^∞(T^n) → S(Z^n) and F_{T^n}^{−1} : S(Z^n) → C^∞(T^n) are continuous, and that

F_{T^n} ◦ F_{T^n}^{−1} : S(Z^n) → S(Z^n) and F_{T^n}^{−1} ◦ F_{T^n} : C^∞(T^n) → C^∞(T^n)

are identity mappings on S(Z^n) and C^∞(T^n), respectively.
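The toroidal transform (3.4) and its inversion are easy to experiment with. Below is a small Python sketch of ours for n = 1: a uniform Riemann sum computes (3.4) exactly when f is a trigonometric polynomial of low enough degree (by the discrete orthogonality of the exponentials), so both the coefficients and the inversion formula can be checked to machine precision. The function names are our own.

```python
import cmath

def toroidal_coefficient(f, xi, N=64):
    # \hat f(ξ) = ∫_0^1 e^{-i2πxξ} f(x) dx, approximated by a uniform Riemann sum;
    # the sum is exact when f is a trigonometric polynomial of degree < N/2
    return sum(f(k / N) * cmath.exp(-2j * cmath.pi * xi * k / N) for k in range(N)) / N

f = lambda x: 3.0 + 2.0 * cmath.exp(2j * cmath.pi * 5 * x)  # \hat f(0)=3, \hat f(5)=2
assert abs(toroidal_coefficient(f, 0) - 3.0) < 1e-12
assert abs(toroidal_coefficient(f, 5) - 2.0) < 1e-12
assert abs(toroidal_coefficient(f, 1)) < 1e-12

# Fourier inversion: f(x) = Σ_ξ \hat f(ξ) e^{i2πxξ}
x0 = 0.3
recon = sum(toroidal_coefficient(f, xi) * cmath.exp(2j * cmath.pi * xi * x0)
            for xi in range(-8, 9))
assert abs(recon - f(x0)) < 1e-10
```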
Chapter 3. Periodic and Discrete Analysis
Let us study an example of periodic distributions, the space L^2(T^n).

Definition 3.1.11 (Space L^2(T^n)). The space L^2(T^n) is a Hilbert space with the inner product

(u, v)_{L^2(T^n)} := \int_{T^n} u(x) \overline{v(x)} dx,    (3.5)

where \overline{z} is the complex conjugate of z ∈ C. The Fourier coefficients of u ∈ L^2(T^n) are

\hat{u}(ξ) = \int_{T^n} e^{−i2π x·ξ} u(x) dx   (ξ ∈ Z^n),    (3.6)

and they are well defined for all ξ due to Hölder’s inequality (Proposition 1.2.4) and the compactness of T^n.

Remark 3.1.12 (Fourier series on L^2(T^n)). The family {e_ξ : ξ ∈ Z^n} defined by

e_ξ(x) := e^{i2π x·ξ}    (3.7)

forms an orthonormal basis of L^2(T^n), which will be proved in Theorem 3.1.20. Thus the partial sums of the Fourier series \sum_{ξ∈Z^n} \hat{u}(ξ) e^{i2π x·ξ} converge to u in the L^2-norm, so that we shall identify u with its Fourier series representation:

u(x) = \sum_{ξ∈Z^n} \hat{u}(ξ) e^{i2π x·ξ}.

As before, we call \hat{u} : Z^n → C the Fourier transform of u. As a consequence of the Plancherel identity on general compact topological groups, to be proved in Corollary 7.6.7, we obtain:

Remark 3.1.13 (Plancherel’s identity). If u ∈ L^2(T^n) then \hat{u} ∈ ℓ^2(Z^n) and ‖\hat{u}‖_{ℓ^2(Z^n)} = ‖u‖_{L^2(T^n)}.

Exercise 3.1.14. Give a simple direct proof of Remark 3.1.13. (Hint: it is similar to the proof on R^n but simpler.)

Exercise 3.1.15. Show that S(Z^n) is dense in ℓ^2(Z^n).

Remark 3.1.16 (Functions e_ξ). We can observe that the functions e_ξ(x) = e^{i2π x·ξ} from (3.7) satisfy e_ξ(x + y) = e_ξ(x) e_ξ(y) and |e_ξ(x)| = 1 for all x ∈ T^n. The converse is also true, namely:

Theorem 3.1.17 (Unitary representations of T^n). If f ∈ L^1(T^n) is such that we have f(x + y) = f(x) f(y) and |f(x)| = 1 for all x, y ∈ T^n, then there exists some ξ ∈ Z^n such that f = e_ξ.

Remark 3.1.18. It is a nice exercise to show this directly, and we do it below. However, we note that, employing the more general terminology of Chapter 7, the conditions on f mean that f : T^n → U(1) is a unitary representation of T^n, automatically irreducible since it is one-dimensional. Moreover, these conditions imply that f is continuous, and hence f ∈ \widehat{T^n}, the unitary dual of T^n. Since the functions e_ξ exhaust the unitary dual by the Peter–Weyl theorem (see, e.g., Remark 7.5.17), we obtain the result.

Proof of Theorem 3.1.17. We will prove the one-dimensional case, since the general case of T^n follows from it if we look at the functions f(τ e_j), where e_j is the j-th unit basis vector of R^n. Thus x ∈ T^1, and we can think of T as periodic R. We choose λ > 0 such that Λ = \int_0^λ f(τ) dτ ≠ 0. Such λ exists because otherwise we would have f = 0 a.e. by Corollary 1.5.17 of the Lebesgue differentiation theorem, contradicting the assumptions. Consequently we can write

f(x) = Λ^{−1} \int_0^λ f(x) f(τ) dτ = Λ^{−1} \int_0^λ f(x + τ) dτ = Λ^{−1} \int_x^{x+λ} f(τ) dτ.

From this we can observe that f ∈ L^1(R) implies that f is continuous at x; since this is true for all x ∈ T, the representation above then gives f ∈ C^1(T). By induction, we get that actually f ∈ C^∞(T). Differentiating the equality above, we see that f satisfies the equation

f′(x) = Λ^{−1} (f(x + λ) − f(x)) = Λ^{−1} (f(x) f(λ) − f(x)) = C_0 f(x),

with C_0 = Λ^{−1}(f(λ) − 1). Solving this equation we find f(x) = f(0) e^{C_0 x}. Recalling that |f(0)| = 1 we get that |f(x)| = e^{(Re C_0) x}. Since |f(x)| = 1 we see that Re C_0 = 0, and thus C_0 = i2πξ for some ξ ∈ R. Finally, the fact that f is periodic implies that ξ ∈ Z. □

Exercise 3.1.19. Work out the details of the extension of the proof from T^1 to T^n. Also, show that the conclusion of Theorem 3.1.17 remains true if we replace T^n by R^n and the condition f ∈ L^1(T^n) by f ∈ L^1_{loc}(R^n), but in this case ξ ∈ R^n does not have to be in the lattice Z^n.

Theorem 3.1.20 (An orthonormal basis of L^2(T^n)). The collection {e_ξ : ξ ∈ Z^n} is an orthonormal basis of L^2(T^n).

Remark 3.1.21. Let us make some general remarks first. From the general theory of Hilbert spaces we know that L^2(T^n) has an orthonormal basis, which is countable by Theorem B.5.35 if we can check that L^2(T^n) is separable. On the other hand, a more precise conclusion is possible from the general theory if we use that T^n is a group. Indeed, Theorem 3.1.17 (see also Remark 3.1.18) implies that \widehat{T^n} ≅ {e_ξ : ξ ∈ Z^n}. Theorem 3.1.20 is then a special case of the Peter–Weyl theorem (see, e.g., Remark 7.5.17). However, at this point we give a more direct proof:

Proof of Theorem 3.1.20. It is easy to check the orthogonality property

(e_ξ, e_η)_{L^2(T^n)} = 0 for ξ ≠ η,

and the normality

(e_ξ, e_ξ)_{L^2(T^n)} = 1 for all ξ ∈ Z^n,

so the real issue is to show that we have a basis according to Definition B.5.34. We want to apply the Stone–Weierstrass theorem A.14.4 to show that the set E = span{e_ξ : ξ ∈ Z^n} is dense in C(T^n). If we have this, we can use the density of C(T^n) in L^2(T^n), so that by Theorem B.5.32 it would be a basis. We note that in fact the density of E in both C(T^n) and L^2(T^n) is a special case of Theorem 7.6.2 on general topological groups, but we give a direct short proof here. In view of the Stone–Weierstrass theorem A.14.4, all we have to show is that E is an involutive algebra separating the points of T^n. It is clear that E separates points. Finally, from the identity e_ξ e_η = e_{ξ+η} it follows that E is an algebra, which is also involutive because of the identity \overline{e_ξ} = e_{−ξ}. □

Exercise 3.1.22. Show explicitly how E separates the points of T^n, and verify the orthonormality statement in the proof.

Definition 3.1.23 (Spaces L^p(T^n)). For 1 ≤ p < ∞ let L^p(T^n) be the space of all u ∈ L^1(T^n) such that

‖u‖_{L^p(T^n)} := \Big( \int_{T^n} |u(x)|^p dx \Big)^{1/p} < ∞.

For p = ∞, let L^∞(T^n) be the space of all u ∈ L^1(T^n) such that

‖u‖_{L^∞(T^n)} := \operatorname{ess\,sup}_{x∈T^n} |u(x)| < ∞.

These are Banach spaces by Theorem C.4.9.

Corollary 3.1.24 (Hausdorff–Young inequality). Let 1 ≤ p ≤ 2 and 1/p + 1/q = 1. If u ∈ L^p(T^n) then \hat{u} ∈ ℓ^q(Z^n) and

‖\hat{u}‖_{ℓ^q(Z^n)} ≤ ‖u‖_{L^p(T^n)}.

Proof. The statement follows by the Riesz–Thorin interpolation theorem C.4.18 from the simple estimate ‖\hat{u}‖_{ℓ^∞(Z^n)} ≤ ‖u‖_{L^1(T^n)} and Plancherel’s identity ‖\hat{u}‖_{ℓ^2(Z^n)} = ‖u‖_{L^2(T^n)} in Remark 3.1.13. □

Definition 3.1.25 (Periodic distribution space D′(T^n)). The dual space D′(T^n) = L(C^∞(T^n), C) is called the space of periodic distributions. For u ∈ D′(T^n) and ϕ ∈ C^∞(T^n), we shall write u(ϕ) = ⟨u, ϕ⟩. For any ψ ∈ C^∞(T^n),

ϕ ↦ \int_{T^n} ϕ(x) ψ(x) dx

is a periodic distribution, which gives the embedding ψ ∈ C^∞(T^n) ⊂ D′(T^n). Note that the same argument also shows the embedding of the spaces L^p(T^n), 1 ≤ p ≤ ∞, into D′(T^n). Due to the test function equality ⟨∂^α ψ, ϕ⟩ = ⟨ψ, (−1)^{|α|} ∂^α ϕ⟩, it is natural to define the distributional derivatives by ⟨∂^α f, ϕ⟩ := ⟨f, (−1)^{|α|} ∂^α ϕ⟩. The topology of D′(T^n) = L(C^∞(T^n), C) is the weak∗-topology.

Remark 3.1.26 (Trigonometric polynomials). The space TrigPol(T^n) of trigonometric polynomials on the torus is defined by TrigPol(T^n) := span{e_ξ : ξ ∈ Z^n}. Thus, f ∈ TrigPol(T^n) is of the form

f(x) = \sum_{ξ∈Z^n} \hat{f}(ξ) e^{i2π x·ξ},

where \hat{f}(ξ) ≠ 0 for only finitely many ξ ∈ Z^n. In the proof of Theorem 3.1.20 we showed that TrigPol(T^n) is dense in both C(T^n) and L^2(T^n) in the corresponding norms. Now, the set of trigonometric polynomials is actually also dense in C^∞(T^n), so that a distribution is characterised by evaluating it at the vectors e_ξ for all ξ ∈ Z^n. We note that there exist linear mappings u ∈ L(span{e_ξ : ξ ∈ Z^n}, C) that do not belong to L(C^∞(T^n), C), but for which the determination of the Fourier coefficients \hat{u}(ξ) = u(e_ξ) makes sense.

Definition 3.1.27 (Fourier transform on D′(T^n)). By dualising the inverse F_{T^n}^{−1} : S(Z^n) → C^∞(T^n), the Fourier transform is extended uniquely to the mapping F_{T^n} : D′(T^n) → S′(Z^n) by the formula

⟨F_{T^n} u, ϕ⟩ := ⟨u, ι ◦ F_{T^n}^{−1} ϕ⟩,    (3.8)

where u ∈ D′(T^n), ϕ ∈ S(Z^n), and ι is defined by (ι ◦ ψ)(x) = ψ(−x).

Exercise 3.1.28. Prove that if u ∈ D′(T^n) then F_{T^n} u ∈ S′(Z^n). Note that by Exercise 3.1.7 this means in particular that F_{T^n} u is defined pointwise on Z^n.

Exercise 3.1.29 (Compatibility). Check that the extension (3.8), when restricted to C^∞(T^n), is compatible with the definition (3.4). Here, the inclusion C^∞(T^n) ⊂ D′(T^n) is interpreted in the standard way by

⟨u, ϕ⟩ = u(ϕ) = \int_{T^n} u(x) ϕ(x) dx.
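Plancherel’s identity from Remark 3.1.13 is easy to test numerically. The following Python sketch (ours, for n = 1) builds a trigonometric polynomial with prescribed Fourier coefficients and compares the grid approximation of ‖u‖²_{L²(T¹)} with Σ_ξ |û(ξ)|²; the quadrature is exact for trigonometric polynomials, so the two sides agree to rounding error.

```python
import cmath

# a trigonometric polynomial u on T^1 with prescribed Fourier coefficients
coeffs = {-3: 1j, 0: 2.0, 7: -0.5}
u = lambda x: sum(c * cmath.exp(2j * cmath.pi * xi * x) for xi, c in coeffs.items())

N = 128  # uniform grid; the quadrature below is exact for degree < N/2
l2_sq = sum(abs(u(k / N)) ** 2 for k in range(N)) / N   # ‖u‖²_{L²(T¹)}
ell2_sq = sum(abs(c) ** 2 for c in coeffs.values())     # ‖û‖²_{ℓ²(Z)}
assert abs(l2_sq - ell2_sq) < 1e-10                     # Plancherel: the two agree
```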
Remark 3.1.30 (Notice: different spaces). Observe that the spaces of functions on which the toroidal Fourier transform F_{T^n} acts are different: one is the space C^∞(T^n) of functions on the torus, while the other is the space S(Z^n) of functions on the lattice. That is why one has to be more careful on the torus, e.g., compared to the Fourier transform for distributions on R^n in Definition 1.3.2. This difference will be even more apparent in the case of compact Lie groups in Chapter 10.
Remark 3.1.31 (Bernstein’s theorem). The Fourier transform can be studied on other spaces on the torus. For example, let Λ_s(T) be the space of Hölder continuous functions of order 0 < s < 1 on the one-dimensional torus T^1, defined as

Λ_s(T) := \Big\{ f ∈ C(T) : \sup_{x,h∈T} \frac{|f(x + h) − f(x)|}{|h|^s} < ∞ \Big\}.

Then Bernstein’s theorem holds: if f ∈ Λ_s(T) with s > 1/2, then \hat{f} ∈ ℓ^1(Z). We refer to [35] for further details on the Hölder continuity on the torus.

Working on the lattice it is always useful to keep in mind the following:

Definition 3.1.32 (Dirac delta comb). The Dirac delta comb δ_{Z^n} : S(R^n) → C is defined by

⟨δ_{Z^n}, ϕ⟩ := \sum_{x∈Z^n} ϕ(x),

and the sum here is absolutely convergent.

Exercise 3.1.33. Prove that δ_{Z^n} ∈ S′(R^n).

We recall that the Dirac delta δ_x ∈ S′(R^n) at x is defined by δ_x(ϕ) = ϕ(x) for all ϕ ∈ S(R^n). It may be not surprising that we obtain the Dirac delta comb by summing up Dirac deltas over the integer lattice:

Proposition 3.1.34. We have the convergence

\sum_{x∈Z^n : ‖x‖≤j} δ_x \xrightarrow{\;S′(R^n)\;} δ_{Z^n} as j → ∞.

Proof. Let us denote P_j := \sum_{x∈Z^n : ‖x‖≤j} δ_x. If ϕ ∈ S(R^n) then

|⟨P_j − δ_{Z^n}, ϕ⟩| ≤ \sum_{x∈Z^n : ‖x‖>j} |ϕ(x)| ≤ \sum_{x∈Z^n : ‖x‖>j} c_M ⟨x⟩^{−M} \xrightarrow{j→∞} 0

for M large enough (e.g., M = n + 1), proving the claim. □

Another sequence converging to the Dirac delta comb will be shown in Proposition 4.6.8.
3.2 Sobolev spaces H^s(T^n)

Fortunately, we have rich structures to work on. The periodic Sobolev spaces H^s(T^n) that we introduce in Definition 3.2.2 are actually Hilbert spaces (and in Section 3.5 we prove several auxiliary theorems about continuity and extensions in Banach spaces that apply in our situation). Here we shall deal with periodic functions and distributions on R^n, and we shall pursue another, more applicable definition of distributions: a Hilbert topology will be given for certain distribution subspaces, which are the Sobolev spaces. It happens that every periodic distribution belongs to some of these spaces. Thus, we are attempting to create spaces which include L^2(T^n) as a special case and which would pay attention to smoothness properties of distributions.

To give an informal motivation, assume that u ∈ L^2(T^n) also satisfies ∂^α u ∈ L^2(T^n) for some α ∈ N_0^n. Then, writing ∂^α u as a Fourier series, we have

∂^α u = \sum_{ξ∈Z^n} (i2πξ)^α \hat{u}(ξ) e_ξ,

with e_ξ as in (3.7), from which by Parseval’s equality we obtain

\int_{T^n} |∂^α u(x)|^2 dx = (2π)^{2|α|} \sum_{ξ∈Z^n} |ξ^α \hat{u}(ξ)|^2;

with α = 0 this is just the L^2-norm. Let us define ⟨ξ⟩ := (1 + ‖ξ‖^2)^{1/2}, where we recall the notation ‖ξ‖ for the Euclidean norm in (3.3).

Remark 3.2.1. This function will be used for measuring decay rates, and other possible analogues for (ξ ↦ ⟨ξ⟩) : Z^n → R_+ would be 1 + ‖ξ‖, or a function equal to ‖ξ‖ for ξ ≠ 0 and to 1 for ξ = 0. The idea here is to get a function ξ ↦ ⟨ξ⟩ which behaves asymptotically like the norm ξ ↦ ‖ξ‖ when ‖ξ‖ → ∞, and which satisfies a form of Peetre’s inequality (see Proposition 3.3.31), thus vanishing nowhere.

Definition 3.2.2 (Sobolev spaces H^s(T^n)). For u ∈ D′(T^n) and s ∈ R we define the norm ‖·‖_{H^s(T^n)} by

‖u‖_{H^s(T^n)} := \Big( \sum_{ξ∈Z^n} ⟨ξ⟩^{2s} |\hat{u}(ξ)|^2 \Big)^{1/2}.    (3.9)

The Sobolev space H^s(T^n) is then the space of 1-periodic distributions u for which ‖u‖_{H^s(T^n)} < ∞. For them, we will formally write their Fourier series representation \sum_{ξ∈Z^n} \hat{u}(ξ) e^{i2π x·ξ}, and in Remark 3.2.5 we give a justification for this. Thus, such u will also be called 1-periodic distributions, represented by the formal series \sum_{ξ∈Z^n} \hat{u}(ξ) e^{i2π x·ξ}. Note that in the definition (3.9) we again take advantage of T^n: compared to R^n, the distributions on the lattice Z^n take pointwise values, see Exercise 3.1.7.

Exercise 3.2.3. For example, the 1-periodic Dirac delta δ is expressed by δ(x) = \sum_{ξ∈Z^n} e^{i2π x·ξ}, or by \hat{δ}(ξ) ≡ 1. Show that δ belongs to H^s(T^n) if and only if s < −n/2.
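The threshold s < −n/2 in Exercise 3.2.3 can be seen numerically for n = 1: since |δ̂(ξ)| ≡ 1, the square of the H^s-norm is Σ_ξ ⟨ξ⟩^{2s}, whose partial sums stabilise for s < −1/2 and keep growing at s = −1/2. The sketch below is our own illustration.

```python
def delta_hs_partial(s, R):
    # partial sum of ‖δ‖²_{H^s(T¹)} = Σ_{ξ∈Z} ⟨ξ⟩^{2s}; here (1+ξ²)^s = ⟨ξ⟩^{2s}
    return sum((1.0 + xi * xi) ** s for xi in range(-R, R + 1))

# s = -1 (< -1/2): the series converges, so doubling R changes almost nothing
assert delta_hs_partial(-1.0, 4000) - delta_hs_partial(-1.0, 2000) < 1e-3
# s = -1/2: borderline divergence; each doubling of R adds about 2·log 2
growth = delta_hs_partial(-0.5, 4000) - delta_hs_partial(-0.5, 2000)
assert growth > 1.0
```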
Exercise 3.2.4. For the function e_ξ(x) = e^{i2π x·ξ} show that ‖e_ξ‖_{H^s(T^n)} = ⟨ξ⟩^s.

Remark 3.2.5. One can readily see that the union \bigcup_{s∈R} H^s(T^n) is the dual of C^∞(T^n) in its uniform topology from Remark 3.1.5 (see Corollary 3.2.12). For the details concerning this duality we refer, e.g., to [11, Theorem 6.1]. Hence our definition of the 1-periodic distributions in Definition 3.2.2 coincides with the “official” one in view of the equality

D′(T^n) = L(C^∞(T^n), C) = \bigcup_{s∈R} H^s(T^n).    (3.10)
Proposition 3.2.6 (Sobolev spaces are Hilbert spaces). For every s ∈ R, the Sobolev space H^s(T^n) is a Hilbert space with the inner product

(u, v)_{H^s(T^n)} := \sum_{ξ∈Z^n} ⟨ξ⟩^{2s} \hat{u}(ξ) \overline{\hat{v}(ξ)}.

Proof. The spaces H^0(T^n) and H^s(T^n) are isometrically isomorphic by the canonical isomorphism ϕ_s : H^0(T^n) → H^s(T^n), defined by

ϕ_s u(x) := \sum_{ξ∈Z^n} ⟨ξ⟩^{−s} \hat{u}(ξ) e^{i2π x·ξ}.

Indeed, ϕ_s is a linear isometry between H^t(T^n) and H^{t+s}(T^n) for every t ∈ R, and it is true that ϕ_{s_1} ϕ_{s_2} = ϕ_{s_1+s_2} and ϕ_s^{−1} = ϕ_{−s}. Then the completeness of L^2(T^n) = H^0(T^n) is transferred to that of H^s(T^n) for every s ∈ R. □

Exercise 3.2.7. For k ∈ N_0 the traditional Sobolev norm ‖·‖_k is defined by

‖u‖_k := \Big( \sum_{|α|≤k} \int_{T^n} |∂^α u(x)|^2 dx \Big)^{1/2}.

Show that ‖u‖_{H^k(T^n)} ≤ ‖u‖_k ≤ C_k ‖u‖_{H^k(T^n)}, and try to find the best possible constant C_k < ∞. This resembles Definitions 1.5.6 and 2.6.15 in the case of R^n, with the equivalence of norms proved in Proposition 2.6.16.

Definition 3.2.8 (Banach and Hilbert dualities). We can define different dualities between Sobolev spaces. The Sobolev space H^{−s}(T^n) is the dual space of H^s(T^n) via the Banach duality product ⟨·, ·⟩ defined by

⟨u, v⟩ := \sum_{ξ∈Z^n} \hat{u}(ξ) \hat{v}(−ξ),

where u ∈ H^s(T^n) and v ∈ H^{−s}(T^n). Note that ⟨u, v⟩ = \int_{T^n} u(x) v(x) dx when s = 0. Accordingly, the L^2- (or H^0-) inner product

(u, v)_{H^0(T^n)} = \int_{T^n} u(x) \overline{v(x)} dx

is the Hilbert duality product, and H^s(T^n) and H^{−s}(T^n) are duals of each other with respect to this duality. If A is a linear operator between two Sobolev spaces, we shall denote its Banach and Hilbert adjoints by A^{(∗B)} and A^{(∗H)}, respectively. Often, the Banach adjoint is called the transpose of the operator A and is denoted by A^t. Then the Hilbert adjoint is simply called the adjoint and denoted by A^∗. For the relation between Banach and Hilbert adjoints see Definition 2.5.15, Exercise 2.5.16 and Remark 2.5.15.

Exercise 3.2.9 (Trigonometric polynomials are dense). Prove that the trigonometric polynomials (and hence also C^∞(T^n)) are dense in every H^s(T^n).

Exercise 3.2.10 (Embeddings are compact). Prove that the inclusion ι : H^t(T^n) → H^s(T^n) is compact for s < t.

Exercise 3.2.11 (An embedding theorem). Let m ∈ N_0 and s > m + n/2. Prove that H^s(T^n) ⊂ C^m(T^n). As a corollary, we get

Corollary 3.2.12. We have the equality \bigcap_{s∈R} H^s(T^n) = C^∞(T^n). By the duality in Definition 3.2.8 it is related to (3.10) in Remark 3.2.5. Note that the situation on R^n is somewhat more complicated, see Exercise 2.6.19.

Definition 3.2.13 (Biperiodic Sobolev spaces). The biperiodic Sobolev space H^{s,t}(T^n × T^n) (s, t ∈ R) is the subspace of biperiodic distributions having the norm ‖·‖_{s,t} defined by

‖v‖_{s,t} := \Big[ \sum_{η∈Z^n} \sum_{ξ∈Z^n} ⟨η⟩^{2s} ⟨ξ⟩^{2t} |\hat{v}(η, ξ)|^2 \Big]^{1/2},    (3.11)

where

\hat{v}(η, ξ) = \int_{T^n} \int_{T^n} e_{−η}(x) e_{−ξ}(y) v(x, y) dy dx    (3.12)

are the Fourier coefficients. It is true that the family of C^∞-smooth biperiodic functions satisfies C^∞(T^n × T^n) = \bigcap_{s,t∈R} H^{s,t}(T^n × T^n). In an obvious manner one relates all these definitions to the 1-periodic spaces T^n = R^n/Z^n.
3.3 Discrete analysis toolkit

In this section we provide tools for the study of periodic pseudo-differential operators. In fact, some of the discrete results presented date back to the 18th and 19th centuries, but seem to have been forgotten in the advent of modern numerical analysis. Global investigation of periodic functions also requires a special treatment, presented in the last subsection, as well as periodic Taylor series in Section 3.4. Defining functions on the discrete space Z^n instead of R^n, we lose the traditional limit concepts of differential calculus. However, it is worth viewing differences and sums as relatives of derivatives and integrals, and what we shall come up with is a theory that quite nicely resembles differential calculus. Therefore it is no wonder that this theory is known as the calculus of finite differences.
3.3.1 Calculus of finite differences

In this section we develop the discrete calculus which will be needed in the sequel. In particular, we will formulate and prove a discrete version of the Taylor expansion formula on the lattice Z^n. Let us first list some conventions that will be spotted in the formulae: a sum over an empty index set is 0 (an empty product is 1), 0! = 1, and heretically 0^0 = 1. When the index set is known from the context, we may even leave it out.

Definition 3.3.1 (Forward and backward differences Δ_ξ^α and \overline{Δ}_ξ^α). Let σ : Z^n → C and 1 ≤ i, j ≤ n. Let δ_j ∈ N_0^n be defined by

(δ_j)_i := 1 if i = j, and (δ_j)_i := 0 if i ≠ j.

We define the forward and backward partial difference operators Δ_{ξ_j} and \overline{Δ}_{ξ_j}, respectively, by

Δ_{ξ_j} σ(ξ) := σ(ξ + δ_j) − σ(ξ),
\overline{Δ}_{ξ_j} σ(ξ) := σ(ξ) − σ(ξ − δ_j),

and for α ∈ N_0^n define

Δ_ξ^α := Δ_{ξ_1}^{α_1} · · · Δ_{ξ_n}^{α_n},   \overline{Δ}_ξ^α := \overline{Δ}_{ξ_1}^{α_1} · · · \overline{Δ}_{ξ_n}^{α_n}.

Remark 3.3.2 (Classical relatives). Several familiar formulae from classical analysis have discrete relatives: for instance, it can be easily checked that these difference operators commute, i.e., that

Δ_ξ^α Δ_ξ^β = Δ_ξ^β Δ_ξ^α = Δ_ξ^{α+β}

for all multi-indices α, β ∈ N_0^n. Moreover,

Δ_ξ^α (sϕ + tψ)(ξ) = s Δ_ξ^α ϕ(ξ) + t Δ_ξ^α ψ(ξ),

where s and t are scalars.
Exercise 3.3.3. Prove these formulae.

Proposition 3.3.4 (Formulae for Δ_ξ^α and \overline{Δ}_ξ^α). Let φ : Z^n → C. We have

Δ_ξ^α φ(ξ) = \sum_{β≤α} (−1)^{|α−β|} \binom{α}{β} φ(ξ + β),

\overline{Δ}_ξ^α φ(ξ) = \sum_{β≤α} (−1)^{|β|} \binom{α}{β} φ(ξ − β).

Proof. Let us introduce the translation operators E_j := (I + Δ_{ξ_j}), acting on functions φ : Z^n → C by E_j φ(ξ) := (I + Δ_{ξ_j}) φ(ξ) = φ(ξ + δ_j). Let E^α := E_1^{α_1} · · · E_n^{α_n}. An application of the binomial formula is enough:

Δ_ξ^α φ(ξ) = (E − I)^α φ(ξ)
= \sum_{β≤α} (−1)^{|α−β|} \binom{α}{β} E^β φ(ξ)
= \sum_{β≤α} (−1)^{|α−β|} \binom{α}{β} φ(ξ + β).

The backward difference equality is left for the reader to prove as Exercise 3.3.5. □

Exercise 3.3.5. Notice that E_j \overline{Δ}_{ξ_j} = Δ_{ξ_j} = \overline{Δ}_{ξ_j} E_j. Complete the proof of Proposition 3.3.4.

The discrete Leibniz formula is complicated enough to have a proof of its own, and it can be compared with the Leibniz formula on R^n in Theorem 1.5.10, (iv).

Lemma 3.3.6 (Discrete Leibniz formula). Let φ, ψ : Z^n → C. Then

Δ_ξ^α (φψ)(ξ) = \sum_{β≤α} \binom{α}{β} \big( Δ_ξ^β φ(ξ) \big) Δ_ξ^{α−β} ψ(ξ + β).    (3.13)

Proof. (Another proof idea, not using induction, can be found in [117, p. 11] and [52, p. 16].) First, we have the easy check

Δ_{ξ_j}(ϕψ)(ξ) = (ϕψ)(ξ + δ_j) − (ϕψ)(ξ)
= ϕ(ξ) (ψ(ξ + δ_j) − ψ(ξ)) + (ϕ(ξ + δ_j) − ϕ(ξ)) ψ(ξ + δ_j)
= ϕ(ξ) Δ_{ξ_j} ψ(ξ) + \big( Δ_{ξ_j} ϕ(ξ) \big) ψ(ξ + δ_j).
We use this and induction on α ∈ N_0^n:

Δ_ξ^{α+δ_j}(ϕψ)(ξ) = Δ_{ξ_j} Δ_ξ^α (ϕψ)(ξ)
= Δ_{ξ_j} \sum_{β≤α} \binom{α}{β} \big( Δ_ξ^β ϕ(ξ) \big) Δ_ξ^{α−β} ψ(ξ + β)
= \sum_{β≤α} \binom{α}{β} \Big[ \big( Δ_ξ^β ϕ(ξ) \big) Δ_ξ^{α+δ_j−β} ψ(ξ + β) + \big( Δ_ξ^{β+δ_j} ϕ(ξ) \big) Δ_ξ^{α−β} ψ(ξ + β + δ_j) \Big]
\overset{(∗)}{=} \sum_{β≤α+δ_j} \Big[ \binom{α}{β − δ_j} + \binom{α}{β} \Big] \big( Δ_ξ^β ϕ(ξ) \big) Δ_ξ^{α+δ_j−β} ψ(ξ + β)
= \sum_{β≤α+δ_j} \binom{α + δ_j}{β} \big( Δ_ξ^β ϕ(ξ) \big) Δ_ξ^{α+δ_j−β} ψ(ξ + β).

In (∗) above, we used the convention that \binom{α}{γ} = 0 if γ ≰ α or if γ ∉ N_0^n. The proof is complete. □
Exercise 3.3.7. Verify that

\binom{α}{β − δ_j} + \binom{α}{β} = \binom{α + δ_j}{β}

in the proof of (3.13).

Remark 3.3.8 (Discrete product rule – notice the shifts). Notice the shift in (3.13) in the argument of ψ. For example, already the product rule becomes

Δ_{ξ_j}(ϕψ)(ξ) = ϕ(ξ) Δ_{ξ_j} ψ(ξ) + \big( Δ_{ξ_j} ϕ(ξ) \big) ψ(ξ + δ_j).

The shift is caused by the difference operator Δ_ξ, and it is characteristic of the calculus of finite differences – in classical Euclidean analysis it is not present. This shift will have its consequences for the whole theory of pseudo-differential operators on the torus in Chapter 4, especially for the formulae in the calculus.

Exercise 3.3.9. Prove the following form of the discrete Leibniz formula:

Δ_ξ^α (ϕ(ξ) ψ(ξ)) = \sum_{β≤α} \binom{α}{β} \big( Δ_ξ^β ϕ(ξ) \big) \overline{Δ}_ξ^{α−β} ψ(ξ + α).

As it is easy to guess, in the calculus of finite differences sums correspond to the integrals of classical analysis, and the theory of series (presented, e.g., in [66]) serves as an integration theory. Assuming convergence of the series below, it holds that

\sum_ξ (s ϕ(ξ) + t ψ(ξ)) = s \sum_ξ ϕ(ξ) + t \sum_ξ ψ(ξ),

and when a ≤ b on Z^1, we have an analogue of the fundamental theorem of calculus:

\sum_{ξ=a}^{b} Δ_ξ ψ(ξ) = ψ(b + 1) − ψ(a).
and when a ≤ b on Z1 , we have an analogue of the fundamental theorem of calculus: b ξ ψ(ξ) = ψ(b + 1) − ψ(a). ξ=a
Difference and partial difference equations (cf. differential and partial differential) are handled in several books concerning combinatorics or difference methods (e.g., [52]), but various mean value theorems have no straightforward interpretation here, since the functions are usually defined only on a discrete set of points (although one can use some suitable interpolation; we refer to Theorem 3.3.39 and Section 4.5). Integration by parts can be, however, translated for our purposes: Lemma 3.3.10 (Summation by parts). Assume that ϕ, ψ : Zn → C. Then α |α| ϕ(ξ) α ψ(ξ) = (−1) ϕ(ξ) ψ(ξ) ξ ξ ξ∈Zn
(3.14)
ξ∈Zn
provided that both series are absolutely convergent. Proof. Let us check the case |α| = 1: ϕ(ξ) ξj ψ(ξ) = (ψ(ξ + δj ) − ψ(ξ)) ϕ(ξ) ξ∈Zn
ξ∈Zn
=
ψ(ξ) (−ϕ(ξ) + ϕ(ξ − δj ))
ξ∈Zn
=
( − 1)1
ψ(ξ) ξj ϕ(ξ).
ξ∈Zn
For any α ∈ Nn0 the result is obtained recursively.
Exercise 3.3.11. Complete the proof of (3.14) for |α| ≥ 2.
3.3.2
Discrete Taylor expansion and polynomials on Zn
The usual polynomials θ → θα do not behave naturally with respect to differences: typically γθ θα = cαγ θα−γ for any constant cαγ . Thus let us introduce polynomials θ → θ(α) to cure this defect: Definition 3.3.12 (Discrete polynomials). For θ ∈ Zn and α ∈ Nn0 , we define (α ) (α ) (0) θ(α) = θ1 1 · · · θn n , where θj = 1 and (k+1)
θj
(k)
= θj (θj − k) = θj (θj − 1) . . . (θj − k).
Exercise 3.3.13. Show that γθ θ(α) = α(γ) θ(α−γ) , in analogy to the Euclidean case where ∂θγ θα = α(γ) θα−γ .
(3.15)
314
Chapter 3. Periodic and Discrete Analysis
Remark 3.3.14. Difference operators lessen the degree of a polynomial by 1. In the literature on numerical analysis the polynomials θ → θ(α) appear sometimes in a concealed form using the binomial coefficients: + , θ (α) . = α! θ α Next, let us consider “discrete integration”. Definition 3.3.15 (Discrete integration). For b ≥ 0, let us write Ikb
:=
and
Ik−b
:=
0≤k k, j ≥ 1, k ≥ 0,
where δp,q is the Kronecker delta, defined by δp,p = 1 and by δp,q = 0 for p = q.
3.3. Discrete analysis toolkit
323
(0)
and x(k+1) = x(x − 1) · · · (x − k) we see that it has " to # k (0) (k) (j) be Sk = δ0,k and that Sk = 1 for every k ∈ N0 . The statement Sk = 0 = j when j < 0 or j > k simply rephrases a part of the extended definition for Stirling k (j) numbers. Suppose that x(k) = j=0 Sk xj . Then
Proof. From x(0) = 1 = S0
k+1
(j)
Sk+1 xj
(x − k)x(k)
=
x(k+1)
=
Sk xk+1 − k Sk +
=
j=0 (k)
(0)
k A
(j−1)
Sk
(j)
− k Sk
B
xj
j=1 k+1 A
(j−1)
Sk
=
(j)
− k Sk
B
xj .
j=0
" # k The case of the first kind is thus concluded. No doubt = 1 for every k ∈ N0 , k " # " # k k k and clearly = δ0,k . Assume that xk = j=0 x(j) . Then 0 j k+1 " j=0
# k + 1 (j) x j
=
=
k+1
x
k j=0
=
=
xx
=
x
" # k (x − j + j) x(j) j
j
(j+1)
x
k " # k j=0
j
x(j)
5 " # k (j) +j x j
k+1 4" j=0
so that we can calculate
=
k 4" # k j=0
k
# " #5 k k +j x(j) , j−1 j
" # k by recursion. j
(j)
(j−1)
(j)
− k Sk is The general solution of the difference equation Sk+1 = Sk unknown (see [60, p. 143]), but for the second kind there is a closed form without recursion (it is easily obtained by applying Proposition 3.3.4 on Lemma 3.3.34): + , " # j j k 1 k i . (−1)j−i = j i j! i=0
324
Chapter 3. Periodic and Discrete Analysis
There are combinatorial ideas behind the Stirling numbers, as explained, + , 4 5e.g., " in # k k k [24]. The following exercise collects these ideas, using notations , , j j j of [67]: Exercise 3.3.37 (Combinatorial background). Let j, k ∈ N0 such that j ≤ k, and let S be a set with exactly k elements. Show that S has + , k k! = j j! (k − j)! subsets of exactly j elements (as usual, read: “k choose j”). Moreover, prove that 4 5 k (j) := (−1)k−j Sk j is the number of permutations of S with precisely j cycles. Finally, show that " # k j is the number of ways to partition S into j non-empty subsets, i.e., the number of the equivalence relations on S having j equivalence classes (read: “k quotient j”). Hint: exploit the recursions in Lemma 3.3.36. In the following matrices some of the Stirling numbers are presented. The index j is used for rows and k for columns: ⎛ ⎞ 1 0 0 0 0 0 ⎜ 0 1 −1 2 −6 24 ⎟ ⎜ ⎟ 5 ⎜ 0 0 1 −3 11 −50 ⎟ (j) ⎟, Sk =⎜ ⎜ 0 0 0 1 −6 35 ⎟ j,k=0 ⎜ ⎟ ⎝ 0 0 0 0 1 −10 ⎠ 0 0 0 0 0 1 ⎛ ⎞ 1 0 0 0 0 0 ⎜ 0 1 1 1 1 1 ⎟ ⎜ ⎟ +" #,5 ⎜ 0 0 1 3 7 15 ⎟ k ⎜ ⎟. =⎜ ⎟ j ⎜ 0 0 0 1 6 25 ⎟ j,k=0 ⎝ 0 0 0 0 1 10 ⎠ 0 0 0 0 0 1 Such matrices are inverses to each other: Lemma 3.3.38. Assume that i, j, N ∈ N0 such that 0 ≤ i, j ≤ N . Then N k=0
(i) Sk
" # N " # j k (k) = δi,j = Sj . k i k=0
3.3. Discrete analysis toolkit
325
Proof. Due to the symmetry, it suffices to prove only that the first sum equals δi,j : j
x =
j " # j
k
k=0
=
N " # k j
k
k=0
=
x
(k)
N
xi
i=0
=
N " # j k=0
(i) Sk xi
k
x(k)
=
i=0
N k=0
N " # N j k=0
" # (i) j Sk . k
k
(i)
Sk xi
i=0
In the sequel, we shall present two alternative definitions for periodic pseudodifferential operators. To build a bridge between these approaches (in Lemma 4.7.1), we have to know how to approximate differences by derivatives. This problem is considered, for example, in [60, p. 164–165, 189–192], where the error estimates are neglected, as well as in the treatise of the subject in [15]. Francis Hildebrand ([52, p. 123–125]) makes a notice on the estimates, but does not calculate them, and there the approximation is thoroughly presented only for degrees j = 1, 2, and without a connection to the Stirling numbers. The finest account is by Steffensen in [117, p. 60-70], where the presentation of Markoff ’s formulae is general with error terms for any degree, but still it lacks the Stirling numbers. The following theorem is in one dimension, so here ϕ(k) denotes the usual th k derivative of ϕ. Theorem 3.3.39 (Approximating differences by derivatives). There exist constants d ∞ 1 c N,j , cN,j > 0 for any N ∈ N0 and j < N such that for every ϕ ∈ C (R ) and 1 ξ ∈ R the following inequalities hold: jξ ϕ(ξ)
−
N −1 k=j
ϕ
(j)
(ξ) −
N −1 k=j
" # k ϕ(k) (ξ) j
≤
(N ) c (ξ + η) , N,j max ϕ
j! (j) k S ξ ϕ(ξ) k! k
≤
cdN,j
j! k!
(3.26)
η∈[0,j]
max
η∈[0,N −1]
ϕ(N ) (ξ + η) .
(3.27)
Proof. Note that in the following we cannot apply the discrete Taylor series, because its remainder is defined only on Z1 with respect to the variable η. The classical Taylor series does not have this disadvantage: ϕ(ξ + η) =
N −1 k=0
1 (k) 1 (N ) ϕ (ξ) η k + ϕ (θ(η)) η N . k! N!
We use the Lagrange form of the error term. Here θ(η) is some point in the segment connecting ξ and ξ + η. Assume that N > j. Applying jη at η = 0, and using
326
Chapter 3. Periodic and Discrete Analysis
Lemma 3.3.34, we get
\[
\triangle_\xi^j \varphi(\xi)
\;=\; \triangle_\eta^j \left[ \sum_{k=0}^{N-1} \frac{1}{k!}\, \varphi^{(k)}(\xi)\, \eta^k + \frac{1}{N!}\, \varphi^{(N)}(\theta(\eta))\, \eta^N \right]_{\eta=0}
\;=\; \sum_{k=j}^{N-1} \left\{ {k \atop j} \right\} \frac{j!}{k!}\, \varphi^{(k)}(\xi)
\;+\; \frac{1}{N!} \left[ \triangle_\eta^j \left( \varphi^{(N)}(\theta(\eta))\, \eta^N \right) \right]_{\eta=0}.
\tag{3.28}
\]
Using the Leibniz formula on the remainder term, we see that its absolute value is majorised by $c_{N,j}\,|\varphi^{(N)}(\theta_j)|$ for some θ_j ∈ [ξ, ξ+j], and hence (3.26) is true. For the latter inequality (3.27), the "orthogonality" of Stirling numbers (Lemma 3.3.38) and (3.28) are essential:
\[
\sum_{k=i}^{N-1} \frac{i!}{k!}\, S_k^{(i)}\, \triangle_\xi^k \varphi(\xi)
\;=\; \sum_{k=i}^{N-1} \frac{i!}{k!}\, S_k^{(i)} \sum_{j=k}^{N-1} \left\{ {j \atop k} \right\} \frac{k!}{j!}\, \varphi^{(j)}(\xi)
\;+\; \sum_{k=i}^{N-1} \frac{i!}{k!}\, S_k^{(i)}\, \frac{1}{N!} \left[ \triangle_\eta^k \left( \varphi^{(N)}(\theta(\eta))\, \eta^N \right) \right]_{\eta=0}
\]
\[
=\; \sum_{j=i}^{N-1} \frac{i!}{j!}\, \varphi^{(j)}(\xi) \sum_{k=i}^{j} \left\{ {j \atop k} \right\} S_k^{(i)}
\;+\; \sum_{k=i}^{N-1} \frac{i!}{k!}\, S_k^{(i)}\, \frac{1}{N!} \left[ \triangle_\eta^k \left( \varphi^{(N)}(\theta(\eta))\, \eta^N \right) \right]_{\eta=0}
\]
\[
=\; \varphi^{(i)}(\xi) \;+\; \sum_{k=i}^{N-1} \frac{i!}{k!}\, S_k^{(i)}\, \frac{1}{N!} \left[ \triangle_\eta^k \left( \varphi^{(N)}(\theta(\eta))\, \eta^N \right) \right]_{\eta=0},
\]
where the absolute value of the remainder part is estimated above by some $c^{d}_{N,j}\,|\varphi^{(N)}(\theta_N)|$ (cf. the proof of (3.26)). Inequality (3.27) is not actually needed in this work, but as a dual statement to (3.26) it is justified. Note that in (3.26) the maximum of |ϕ^{(N)}(ξ + η)| is taken over the interval η ∈ [0, j], whereas in (3.27) it is taken over η ∈ [0, N − 1]. □

Exercise 3.3.40. Let α, β ∈ N^n_0, ξ ∈ Z^n and ϕ ∈ C^∞(R^n). Estimate
\[
\Big| \triangle_\xi^\beta \varphi(\xi) \;-\; \sum_{\alpha:\,|\alpha|<N} \left\{ {\alpha \atop \beta} \right\} \frac{\beta!}{\alpha!}\, \partial^\alpha \varphi(\xi) \Big|.
\]
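Inequality (3.26) lends itself to a quick numerical test. In the sketch below (our illustration; the unit step size and the test function φ(ξ) = e^{ξ/3} are arbitrary choices), the jth forward difference is compared with the truncated derivative expansion.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    # Stirling numbers of the second kind, {n over k}
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def fwd_diff(phi, xi, j):
    # j-th forward difference with unit step: Delta^j phi(xi)
    return sum((-1) ** (j - i) * math.comb(j, i) * phi(xi + i) for i in range(j + 1))

phi = lambda x: math.exp(x / 3.0)                 # test function
dphi = lambda x, k: math.exp(x / 3.0) / 3.0 ** k  # its k-th derivative

def derivative_expansion(xi, j, N):
    # sum_{k=j}^{N-1} {k over j} (j!/k!) phi^(k)(xi), the main term of (3.26)
    return sum(stirling2(k, j) * math.factorial(j) / math.factorial(k) * dphi(xi, k)
               for k in range(j, N))

error = abs(fwd_diff(phi, 0.0, 2) - derivative_expansion(0.0, 2, 12))
```

With N = 12 terms the remainder in (3.26) is already far below single-precision level for this test function.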
Chapter 4. Pseudo-differential Operators on Tn
Theorem 4.3.1 (Smoothing). The following conditions are equivalent:
(i) A ∈ L(H^s(T^n), H^t(T^n)) for every s, t ∈ R.
(ii) σ_A ∈ S^{−∞}(T^n × Z^n).
(iii) There exists K_A ∈ C^∞(T^n × T^n) such that for all u ∈ C^∞(T^n) we have
\[
Au(x) = \int_{\mathbb{T}^n} K_A(x,y)\, u(y)\, dy.
\]
Proof. Assume that A satisfies (i). To obtain (ii), it is enough to prove that |∂^β_x σ_A(x, ξ)| ≤ c_{β,r} ⟨ξ⟩^{−r} for every r ∈ R, because by Proposition 3.3.4 we have formula (4.8), which we recall here:
\[
\triangle_\xi^\alpha \partial_x^\beta \sigma_A(x,\xi)
= \sum_{\gamma\le\alpha} (-1)^{|\alpha-\gamma|} \binom{\alpha}{\gamma}\, \partial_x^\beta \sigma_A(x,\xi+\gamma);
\tag{4.15}
\]
the reasoning why this is enough is left as Exercise 4.3.2. Recall that e_ξ(x) = e^{i2πx·ξ}, so that ‖e_ξ‖_{H^s(T^n)} = ⟨ξ⟩^s. We now prepare another estimate:
\[
\| e_{-\xi} f \|^2_{H^{|\beta|+t}(\mathbb{T}^n)}
= \sum_{\eta\in\mathbb{Z}^n} \langle\eta\rangle^{2|\beta|+2t}\, |\widehat{e_{-\xi}f}(\eta)|^2
\le 2^{2|\beta|+2|t|} \sum_{\eta\in\mathbb{Z}^n} \langle\eta+\xi\rangle^{2|\beta|+2t}\, \langle\xi\rangle^{2|\beta|+2|t|}\, |\widehat{f}(\eta+\xi)|^2
= 2^{2|\beta|+2|t|}\, \langle\xi\rangle^{2|\beta|+2|t|}\, \| f \|^2_{H^{|\beta|+t}(\mathbb{T}^n)},
\]
where we applied Peetre's inequality (Proposition 3.3.31) and the relation $\widehat{e_{-\xi}f}(\eta) = \widehat{f}(\eta+\xi)$. Finally, choosing t > n/2 and using the Sobolev embedding theorem (see, e.g., Exercise 2.6.17), we get
\[
\left| \partial_x^\beta \sigma_A(x,\xi) \right|
\le (2\pi)^{|\beta|} \sum_{\eta\in\mathbb{Z}^n} \langle\eta\rangle^{|\beta|}\, |\widehat{\sigma_A}(\eta,\xi)|
\le C_{\beta,t}\, \| x \mapsto \sigma_A(x,\xi) \|_{H^{|\beta|+t}(\mathbb{T}^n)}
= C_{\beta,t}\, \| e_{-\xi}\, A e_\xi \|_{H^{|\beta|+t}(\mathbb{T}^n)}
\]
\[
\le C_{\beta,t}\, \| e_{-\xi} I \|_{\mathcal{L}(H^{|\beta|+t}(\mathbb{T}^n),\,H^{|\beta|+t}(\mathbb{T}^n))}\,
\| A \|_{\mathcal{L}(H^{s}(\mathbb{T}^n),\,H^{|\beta|+t}(\mathbb{T}^n))}\, \| e_\xi \|_{H^{s}(\mathbb{T}^n)}
\le 2^{|\beta|+t}\, C_{\beta,t}\, \| A \|_{\mathcal{L}(H^{s}(\mathbb{T}^n),\,H^{|\beta|+t}(\mathbb{T}^n))}\, \langle\xi\rangle^{s+|\beta|+t},
\]
where $\widehat{\sigma_A}(\eta,\xi)$ denotes the Fourier transform of σ_A in the variable x, and
\[
C_{\beta,t} = (2\pi)^{|\beta|} \Big( \sum_{\eta\in\mathbb{Z}^n} \langle\eta\rangle^{-2t} \Big)^{1/2}.
\]
Since s ∈ R is arbitrary, we get (ii).
Let us now show that (ii) implies (iii). If σ_A ∈ S^{−∞}(T^n × Z^n), the Schwartz kernel
\[
K_A(x,y) := \sum_{\xi\in\mathbb{Z}^n} \sigma_A(x,\xi)\, e^{i2\pi(x-y)\cdot\xi}
\]
4.3. Kernels of periodic pseudo-differential operators
is in C^∞(T^n × T^n). Indeed, formally we can differentiate K_A to obtain
\[
\partial_x^\alpha \partial_y^\beta K_A(x,y)
= \sum_{\xi\in\mathbb{Z}^n} \sum_{\gamma\le\alpha} \binom{\alpha}{\gamma} (-i2\pi\xi)^\beta \left[ \partial_x^\gamma \sigma_A(x,\xi) \right] \partial_x^{\alpha-\gamma}\, e^{i2\pi(x-y)\cdot\xi};
\]
this is justified, as the convergence of the resulting series is absolute, because |∂^γ_x σ_A(x, ξ)| ≤ c_{γ,r} ⟨ξ⟩^{−r} for any r ∈ R. This gives (iii).
Finally, assume that (iii) holds. Define the amplitude a by a(x, y, ξ) := δ_{0,ξ} K_A(x, y). Now a ∈ A^{−∞}(T^n), since
\[
\left| \partial_x^\alpha \partial_y^\beta \triangle_\xi^\gamma a(x,y,\xi) \right|
\le 2^{|\gamma|} \left| \partial_x^\alpha \partial_y^\beta K_A(x,y) \right| \chi_{[-|\gamma|,|\gamma|]^n}(\xi)
\le C_{r\alpha\beta\gamma}\, \langle\xi\rangle^{-r}
\]
for every r ∈ R, where χ_{[−|γ|,|γ|]^n} is the characteristic function of the cube [−|γ|, |γ|]^n ⊂ Z^n. On the other hand,
\[
\mathrm{Op}(a)u(x)
= \sum_{\xi\in\mathbb{Z}^n} \int_{\mathbb{T}^n} e^{i2\pi(x-y)\cdot\xi}\, a(x,y,\xi)\, u(y)\, dy
= \int_{\mathbb{T}^n} K_A(x,y)\, u(y)\, dy
= Au(x).
\]
Property (i) now follows by Theorem 4.2.10. □
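On a finite grid, the equivalence of the symbol and kernel pictures in condition (iii) can be observed directly. The sketch below (our illustration, not from the text; the rapidly decaying model symbol is an arbitrary choice) quantizes a symbol on T^1 once through its Fourier series and once through the kernel K_A(x,y) = Σ_ξ σ_A(x,ξ) e^{i2π(x−y)ξ}, and compares the two.

```python
import numpy as np

M = 64                                   # grid points on the torus T^1
x = np.arange(M) / M
freqs = np.fft.fftfreq(M, d=1.0 / M)     # integer frequencies as floats

def sigma(xx, xi):                       # a smoothing (rapidly decaying) model symbol
    return np.exp(-xi ** 2) * (2.0 + np.cos(2 * np.pi * xx))

def op_via_symbol(u):
    # Au(x) = sum_xi e^{i2pi x xi} sigma(x, xi) u_hat(xi)
    uhat = np.fft.fft(u) / M             # toroidal Fourier coefficients of u
    return sum(np.exp(2j * np.pi * x * xi) * sigma(x, xi) * c
               for xi, c in zip(freqs, uhat))

def op_via_kernel(u):
    # Au(x) = int_T K_A(x,y) u(y) dy, K_A(x,y) = sum_xi sigma(x,xi) e^{i2pi(x-y)xi}
    K = sum(sigma(x[:, None], xi) * np.exp(2j * np.pi * (x[:, None] - x[None, :]) * xi)
            for xi in freqs)
    return K @ u / M                     # quadrature over the periodic grid

u = np.sin(2 * np.pi * 3 * x) + np.cos(2 * np.pi * 5 * x)
discrepancy = np.max(np.abs(op_via_symbol(u) - op_via_kernel(u)))
```

The two computations are algebraically identical rearrangements of the same finite sum, so the discrepancy is at rounding level.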
Exercise 4.3.2. In the proof above, based on (4.15), explain why it sufficed to prove that |∂^β_x σ_A(x, ξ)| ≤ c_{β,r} ⟨ξ⟩^{−r}.
Because the inclusion of a Sobolev space into a strictly larger one is compact (see Exercise 3.2.10), we also obtain:
Corollary 4.3.3. Operators from Op(S^{−∞}(T^n × Z^n)) are compact between any spaces H^s(T^n), H^t(T^n).
Unlike in the case of symbols, the correspondence between amplitudes and amplitude operators is not bijective: several different amplitudes may define the same operator. As an example, we are now going to study how the multiplication of an amplitude by
\[
\left( e^{i2\pi(y-x)} - 1 \right)^{\gamma} := \prod_{j=1}^{n} \left( e^{i2\pi(y_j-x_j)} - 1 \right)^{\gamma_j}
\tag{4.16}
\]
affects the amplitude operator. Notice that this multiplier was encountered in the biperiodic Taylor series (see Corollary 3.4.3 and Theorem 3.4.4).
Lemma 4.3.4. Let c ∈ A^m_{ρ,δ}(T^n), and define
\[
b_\gamma(x,y,\xi) := \left( e^{i2\pi(y-x)} - 1 \right)^{\gamma} c(x,y,\xi),
\]
where γ ∈ N^n_0. Then Op(b_γ) = Op(△^γ_ξ c) ∈ Op(A^{m−ρ|γ|}_{ρ,δ}(T^n)).
Proof. First we note the identity
\[
e^{i2\pi(x-y)\cdot\xi} \left( e^{i2\pi(y-x)} - 1 \right)^{\gamma}
= (-1)^{|\gamma|}\, \overline{\triangle}{}_\xi^{\gamma}\, e^{i2\pi(x-y)\cdot\xi},
\]
where $\overline{\triangle}{}_\xi^{\gamma}$ is the backward difference operator (see Definition 3.3.1); the verification is left as Exercise 4.3.5. Consequently, summation by parts (see Lemma 3.3.10) yields
\[
\mathrm{Op}(b_\gamma)u(x)
= \int_{\mathbb{T}^n} \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi(x-y)\cdot\xi} \left( e^{i2\pi(y-x)} - 1 \right)^{\gamma} c(x,y,\xi)\, u(y)\, dy
\]
\[
= \int_{\mathbb{T}^n} \Big[ \sum_{\xi\in\mathbb{Z}^n} c(x,y,\xi)\, (-1)^{|\gamma|}\, \overline{\triangle}{}_\xi^{\gamma}\, e^{i2\pi(x-y)\cdot\xi} \Big] u(y)\, dy
= \int_{\mathbb{T}^n} \Big[ \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi(x-y)\cdot\xi}\, \triangle_\xi^{\gamma} c(x,y,\xi) \Big] u(y)\, dy.
\]
Thus Op(b_γ) = Op(△^γ_ξ c), and clearly △^γ_ξ c ∈ A^{m−ρ|γ|}_{ρ,δ}(T^n), since c ∈ A^m_{ρ,δ}(T^n). □
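For n = 1 the key identity of this proof (the content of Exercise 4.3.5 below) can be verified numerically; in the sketch (our illustration; the sample points and the frequency ξ = 5 are arbitrary) the backward difference acts on the frequency variable.

```python
import math
import cmath

def bwd_diff(f, xi, gamma):
    # gamma-th backward difference in xi: (bar-Delta f)(xi) = f(xi) - f(xi - 1)
    if gamma == 0:
        return f(xi)
    return bwd_diff(f, xi, gamma - 1) - bwd_diff(f, xi - 1, gamma - 1)

x, y = 0.3, 0.71                          # arbitrary points on the torus
e = lambda xi: cmath.exp(2j * math.pi * (x - y) * xi)

checks = []
for gamma in range(5):
    lhs = e(5) * (cmath.exp(2j * math.pi * (y - x)) - 1) ** gamma
    rhs = (-1) ** gamma * bwd_diff(e, 5, gamma)
    checks.append(abs(lhs - rhs))
```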
Exercise 4.3.5. Prove that for every γ ∈ N^n_0 we have the identity
\[
e^{i2\pi(x-y)\cdot\xi} \left( e^{i2\pi(y-x)} - 1 \right)^{\gamma}
= (-1)^{|\gamma|}\, \overline{\triangle}{}_\xi^{\gamma}\, e^{i2\pi(x-y)\cdot\xi},
\]
where $\overline{\triangle}{}_\xi^{\gamma}$ is the backward difference operator from Definition 3.3.1.
Surprising or not, from the smoothness point of view the essential information content of a periodic pseudo-differential operator is contained in the behaviour of its Schwartz kernel in any neighbourhood of the diagonal x = y. This can also be seen from the local theory, once we know the equality of the operator classes Op(A^m_{ρ,δ}(T^n)) and periodic operators in Op(A^m_{ρ,δ}(R^n)); but here we give a direct proof:
Theorem 4.3.6 (Schwartz kernel). Let 0 < ρ and δ < 1. Let A = Op(a) ∈ Op(A^m_{ρ,δ}(T^n)) be expressed in the form
\[
Au(x) = \int_{\mathbb{T}^n} K_A(x,y)\, u(y)\, dy,
\]
where $K_A(x,y) = \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi(x-y)\cdot\xi}\, a(x,y,\xi)$. Then the Schwartz kernel K_A is a smooth function outside the diagonal x = y.
Proof. Let j ∈ {1, …, n}. Take ψ ∈ C^∞(T^n × T^n) such that x_j ≠ y_j for every (x, y) ∈ supp(ψ). We have to prove that (x, y) → ψ(x, y) K_A(x, y) belongs to C^∞(T^n × T^n). Define b(x, y, ξ) := ψ(x, y) a(x, y, ξ).
By Lemma 4.3.4, the amplitudes
\[
(x,y,\xi) \mapsto b(x,y,\xi)
\qquad\text{and}\qquad
(x,y,\xi) \mapsto \frac{\triangle_{\xi_j}^{k} b(x,y,\xi)}{\left( e^{i2\pi(y_j-x_j)} - 1 \right)^{k}}
\]
give the same periodic pseudo-differential operator B := Op(b). Hence B ∈ Op(A^{m−ρk}_{ρ,δ}(T^n)) for every k ∈ N_0, so that it is in Op(A^{−∞}(T^n)). Theorem 4.2.10 states that B is continuous between any Sobolev spaces, and then by Theorem 4.3.1 the kernel (x, y) → ψ(x, y) K_A(x, y) belongs to C^∞(T^n × T^n). □
Exercise 4.3.7. Derive the quantitative behaviour of the kernel K_A(x, y) near the diagonal x = y, similarly to Theorem 2.3.1.
4.4
Asymptotic sums and amplitude operators
The next theorem is a prelude to asymptotic expansions, which are the main tool in the symbolic analysis of periodic pseudo-differential operators.
Theorem 4.4.1 (Asymptotic sums of symbols). Let (m_j)_{j=0}^{∞} ⊂ R be a sequence such that m_j > m_{j+1} → −∞ as j → ∞, and let σ_j ∈ S^{m_j}_{ρ,δ}(T^n × Z^n) for all j ∈ N_0. Then there exists a toroidal symbol σ ∈ S^{m_0}_{ρ,δ}(T^n × Z^n) such that for all N ∈ N_0,
\[
\sigma \overset{m_N,\rho,\delta}{\sim} \sum_{j=0}^{N-1} \sigma_j,
\quad\text{i.e.,}\quad
\sigma - \sum_{j=0}^{N-1} \sigma_j \in S^{m_N}_{\rho,\delta}(\mathbb{T}^n\times\mathbb{Z}^n).
\]
Proof. Choose a function ϕ ∈ C^∞(R^n) satisfying ϕ(ξ) = 1 for ‖ξ‖ ≥ 1 and ϕ(ξ) = 0 for ‖ξ‖ ≤ 1/2; otherwise ϕ can be arbitrary. Take a sequence (ε_j)_{j=0}^{∞} of positive real numbers such that ε_j > ε_{j+1} → 0 (j ∈ N_0), and define ϕ_j ∈ C^∞(R^n) by ϕ_j(ξ) := ϕ(ε_j ξ). When |α| ≥ 1, the support of $\triangle_\xi^\alpha \varphi_j$ is bounded, so that by the discrete Leibniz formula (Lemma 3.3.6) we have
\[
\left| \triangle_\xi^\alpha \partial_x^\beta \left[ \varphi_j(\xi)\, \sigma_j(x,\xi) \right] \right|
\le C_{j\alpha\beta}\, \langle\xi\rangle^{m_j-\rho|\alpha|+\delta|\beta|}
\]
for some constant C_{jαβ}, since σ_j ∈ S^{m_j}_{ρ,δ}(T^n × Z^n). This means that ((x, ξ) → ϕ_j(ξ) σ_j(x, ξ)) ∈ S^{m_j}_{ρ,δ}(T^n × Z^n). Examining the support of $\triangle_\xi^\alpha \varphi_j$, we see that $\triangle_\xi^\alpha(\varphi_j(\xi)\,\sigma_j(x,\xi))$ (where α ∈ N^n_0) vanishes for any fixed ξ ∈ Z^n when j is large enough. This justifies the definition
\[
\sigma(x,\xi) := \sum_{j=0}^{\infty} \varphi_j(\xi)\, \sigma_j(x,\xi),
\tag{4.17}
\]
and clearly σ ∈ S^{m_0}_{ρ,δ}(T^n × Z^n). Furthermore,
\[
\Big| \triangle_\xi^\alpha \partial_x^\beta \Big( \sigma(x,\xi) - \sum_{j=0}^{N-1} \sigma_j(x,\xi) \Big) \Big|
\le \sum_{j=0}^{N-1} \left| \triangle_\xi^\alpha \partial_x^\beta \left\{ \left[ \varphi_j(\xi) - 1 \right] \sigma_j(x,\xi) \right\} \right|
+ \sum_{j=N}^{\infty} \left| \triangle_\xi^\alpha \partial_x^\beta \left[ \varphi_j(\xi)\, \sigma_j(x,\xi) \right] \right|.
\]
Recall that ε_j > ε_{j+1} → 0, so that the $\sum_{j=0}^{N-1}$ part of the sum vanishes whenever ⟨ξ⟩ is large. Hence this part of the sum is dominated by C_{rNαβ} ⟨ξ⟩^{−r} for any r ∈ R. The reader may verify that the $\sum_{j=N}^{\infty}$ part of the sum is majorised by C_{Nαβ} ⟨ξ⟩^{m_N − ρ|α| + δ|β|}. □
Exercise 4.4.2. In the proof of Theorem 4.4.1, estimate the support of ξ → $\triangle_\xi^\alpha \varphi_j(\xi)$ in terms of α and j. How large should ⟨ξ⟩ be for the $\sum_{j=0}^{N-1}$ part of the sum to vanish? Complete the proof by filling in the details. If necessary, consult the Euclidean version of this result, which was proved in Proposition 2.5.33.
Definition 4.4.3 (Asymptotic expansions). The formal series $\sum_{j=0}^{\infty}\sigma_j$ in Theorem 4.4.1 is called an asymptotic expansion of the symbol σ ∈ S^{m_0}_{ρ,δ}(T^n × Z^n) presented in (4.17). In this case we denote
\[
\sigma \sim \sum_{j=0}^{\infty} \sigma_j
\]
(cf. a ∼ a′, where ∼ has a different but related meaning). Respectively, $\sum_{j=0}^{\infty}\mathrm{Op}(\sigma_j)$ is an asymptotic expansion of the operator Op(σ) ∈ Op(S^{m_0}_{ρ,δ}(T^n × Z^n)), denoted Op(σ) ∼ $\sum_{j=0}^{\infty}\mathrm{Op}(\sigma_j)$.
By altering ϕ ∈ C^∞(R^n) and (ε_j)_{j=0}^{∞} in the proof of Theorem 4.4.1 we get a (possibly) different symbol τ by (4.17). Nevertheless, σ ∼ τ, which is enough in the symbol analysis of periodic pseudo-differential operators.
We are often faced with asymptotic expansions σ ∼ $\sum_{j=0}^{\infty}\sigma_j$, where
\[
\sigma_j = \sum_{\gamma\in\mathbb{N}^n_0:\ |\gamma|=j} \sigma_\gamma.
\]
In such a case we shall write
\[
\sigma \sim \sum_{\gamma\ge 0} \sigma_\gamma.
\]
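The cutoff construction in the proof of Theorem 4.4.1 can be made concrete. In the sketch below (our illustration; the dyadic choice ε_j = 2^{−j} and the model symbols of order −j are arbitrary), for each fixed ξ only finitely many terms of the sum (4.17) are nonzero, which is exactly the local finiteness used to justify the definition of σ.

```python
import math

def phi(t):
    # cutoff with phi = 0 for |t| <= 1/2 and phi = 1 for |t| >= 1
    a = abs(t)
    if a <= 0.5:
        return 0.0
    if a >= 1.0:
        return 1.0
    g = lambda u: math.exp(-1.0 / u) if u > 0 else 0.0
    s = (a - 0.5) / 0.5                  # smooth transition on 1/2 < |t| < 1
    return g(s) / (g(s) + g(1.0 - s))

eps = [2.0 ** (-j) for j in range(40)]                  # eps_j decreasing to 0
sigma_j = lambda j, xi: (1.0 + xi * xi) ** (-j / 2.0)   # model symbol of order -j

def sigma(xi):
    # the sum (4.17); locally finite since phi(eps_j * xi) = 0 once eps_j |xi| <= 1/2
    return sum(phi(eps[j] * xi) * sigma_j(j, xi) for j in range(40))

nonzero_terms = [j for j in range(40) if phi(eps[j] * 5.0) != 0.0]
value = sigma(5.0)
```

For ξ = 5 only the terms with 2^{−j}·5 > 1/2, i.e. j ≤ 3, survive.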
Remark 4.4.4 (Principal symbol). Assume that in the asymptotic expansion σ ∼ $\sum_{j=0}^{\infty}\sigma_j$ we have σ_0 ∈ S^{m_0}(T^n × Z^n) \ S^{m_1}(T^n × Z^n), i.e., σ_0 is the most significant term. It is then called the principal symbol of σ ∼ $\sum_{j=0}^{\infty}\sigma_j$. (In [130, p. 49] the class σ_A + S^{m−1}(R^n) is by definition the principal symbol of the periodic pseudo-differential operator A ∈ Op(S^m(R^n)) when l < m implies that σ_A ∉ S^l; it is important due to its invariance under smooth changes of coordinates.)
Next we present an elementary result stating that amplitude operators are merely periodic pseudo-differential operators, and we provide an effective way to calculate the symbol modulo S^{−∞}(T^n × Z^n) from an amplitude; this theorem has a fundamental status in the symbolic analysis. We give two alternative proofs for it.
Theorem 4.4.5 (Symbols of amplitude operators). Let 0 ≤ δ < ρ ≤ 1. For every toroidal amplitude a ∈ A^m_{ρ,δ}(T^n) there exists a unique toroidal symbol σ ∈ S^m_{ρ,δ}(T^n × Z^n) satisfying Op(a) = Op(σ), and σ has the following asymptotic expansion:
\[
\sigma(x,\xi) \sim \sum_{\gamma\ge 0} \frac{1}{\gamma!}\, \triangle_\xi^{\gamma}\, D_y^{(\gamma)} a(x,y,\xi)\Big|_{y=x}.
\tag{4.18}
\]
Proof. As a linear operator in Sobolev spaces, Op(a) possesses the unique symbol σ = σ_{Op(a)} (or as an operator on C^∞(T^n), see Definition 4.1.2), but at the moment we do not yet know whether σ ∈ S^m_{ρ,δ}(T^n × Z^n). By Theorem 4.1.4 the symbol is computed from
\[
\sigma(x,\xi)
= e^{-i2\pi x\cdot\xi}\, \mathrm{Op}(a)\, e_\xi(x)
= e^{-i2\pi x\cdot\xi} \sum_{\eta\in\mathbb{Z}^n} \int_{\mathbb{T}^n} e^{i2\pi(x-y)\cdot\eta}\, a(x,y,\eta)\, e^{i2\pi y\cdot\xi}\, dy.
\]
Now, we apply the discrete Taylor formula from Theorem 3.3.21 to obtain
\[
\sigma(x,\xi)
= \sum_{\eta\in\mathbb{Z}^n} \int_{\mathbb{T}^n} e^{i2\pi(x-y)\cdot(\eta-\xi)}\, a(x,y,\eta)\, dy
= \sum_{\eta\in\mathbb{Z}^n} e^{i2\pi x\cdot(\eta-\xi)}\, \hat a(x,\eta-\xi,\eta)
= \sum_{\eta\in\mathbb{Z}^n} e^{i2\pi x\cdot\eta}\, \hat a(x,\eta,\xi+\eta),
\]
where $\hat a$ denotes the Fourier transform of the amplitude with respect to the variable y, and the discrete Taylor expansion is applied in the last argument, keeping the terms with |γ| < N and collecting the rest into the remainder E_N; here the assumption ρ > δ is used. Therefore, taking enough terms in the asymptotic expansion, we can estimate $\triangle_\xi^\alpha \partial_x^\beta E_N(x,\xi)$ by any power of ⟨ξ⟩^{−1}, and since all the terms are in the necessary symbol classes, the estimate for the remainder is complete. Consequently, σ belongs to S^m_{ρ,δ}(T^n × Z^n) by equation (4.19), and Theorem 4.4.1 provides the asymptotic expansion (4.18). □
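The leading term of the expansion (4.18) is γ = 0, i.e. σ(x,ξ) = a(x,x,ξ) modulo terms of one order lower. The finite-grid sketch below (our illustration; the model amplitude of order −1 is an arbitrary choice) computes the exact toroidal symbol from σ(x,ξ) = e^{−i2πx·ξ} Op(a) e_ξ(x) and observes that the error of the γ = 0 approximation decays faster in ξ than the amplitude itself.

```python
import numpy as np

M = 128
grid = np.arange(M) / M
freqs = np.fft.fftfreq(M, d=1.0 / M)    # integer frequencies as floats

def amp(x0, y, xi):                     # model amplitude of order -1 on T^1
    return ((1.0 + xi * xi) ** (-0.5)
            * (2.0 + np.cos(2 * np.pi * y)) * (1.5 + np.sin(2 * np.pi * x0)))

def true_symbol(x0, xi0):
    # sigma(x, xi) = e^{-i2pi x xi} Op(a) e_xi(x); the y-integral is computed by
    # the trapezoidal rule, which is exact here (finitely many y-modes)
    total = 0.0 + 0.0j
    for eta in freqs:
        integ = np.mean(np.exp(2j * np.pi * grid * (xi0 - eta)) * amp(x0, grid, eta))
        total += np.exp(2j * np.pi * x0 * (eta - xi0)) * integ
    return total

x0 = 0.15
err = {xi: abs(true_symbol(x0, xi) - amp(x0, x0, xi)) for xi in (4.0, 8.0)}
```

Doubling ξ roughly quarters the error here, consistent with the first correction term being one order (here, one power of ⟨ξ⟩) below the leading one.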
Remark 4.4.6. Now we can compare the results above with the biperiodic Taylor series. Applying Theorem 4.4.5 and Lemma 4.3.4, we get
\[
a(x,y,\xi)
\;\sim\; \sum_{\gamma\ge 0} \frac{1}{\gamma!}\, \triangle_\xi^{\gamma} D_y^{(\gamma)} a(x,y,\xi)\Big|_{y=x}
\;\sim\; \sum_{\gamma\ge 0} \frac{1}{\gamma!} \left( e^{i2\pi(y-x)} - 1 \right)^{\gamma} D_z^{(\gamma)} a(x,z,\xi)\Big|_{z=x},
\]
reminding us of the series representation of Corollary 3.4.3.
Alternative proof of Theorem 4.4.5 on T^1. We invoke the biperiodic Taylor expansion for a(x, y, ξ) (see Corollary 3.4.3):
\[
a(x,y,\xi)
= \sum_{j=0}^{N-1} \frac{1}{j!} \left( e^{i2\pi(y-x)} - 1 \right)^{j} D_z^{(j)} a(x,z,\xi)\Big|_{z=x}
+ a_N(x,y,\xi) \left( e^{i2\pi(y-x)} - 1 \right)^{N}.
\]
Then we use Lemma 4.3.4 with $b_j(x,y,\xi) = \left( e^{i2\pi(y-x)} - 1 \right)^{j} c_j(x,y,\xi)$, where $c_j(x,y,\xi) = \frac{1}{j!} D_y^{(j)} a(x,y,\xi)\big|_{y=x}$, to obtain
\[
\mathrm{Op}(b_j) = \mathrm{Op}(\triangle_\xi^{j} c_j)
= \mathrm{Op}\!\left( \frac{1}{j!}\, \triangle_\xi^{j} D_y^{(j)} a(x,y,\xi)\Big|_{y=x} \right).
\]
By Lemma 4.3.4, the remainder $a_N(x,y,\xi)\left( e^{i2\pi(y-x)} - 1 \right)^{N}$ hence contributes to the operator Op(△^N_ξ a_N). Thus, in order to get the asymptotic expansion (4.18), we have to prove that a_N ∈ A^m_{ρ,δ}(T^1) for every N ∈ Z^+. From the proof of Theorem 3.4.2 we see that a_N is given by
\[
a_N(x,y,\xi) = \frac{a_{N-1}(x,y,\xi) - a_{N-1}(x,x,\xi)}{e^{i2\pi(y-x)} - 1},
\]
and that it is in C^∞(T^1 × T^1) for every ξ ∈ Z^1. Here a_N has the same order as a_{N−1} does, so that recursively a_N ∈ A^m_{ρ,δ}(T^1), since a_0 = a ∈ A^m_{ρ,δ}(T^1). This completes the proof. □
Remark 4.4.7 (Classical periodic pseudo-differential operators). The operator A ∈ Op(S^m(T^n)) is called a classical periodic pseudo-differential operator if its symbol has an asymptotic expansion
\[
\sigma_A(x,\xi) \sim \sum_{j=0}^{\infty} \sigma_j(x,\xi),
\]
where the symbols σ_j are positively homogeneous of degree m − j: they satisfy σ_j(x, ξ) = σ_j(x, ξ/‖ξ‖) ‖ξ‖^{m−j} for large ‖ξ‖. In [142] and [102], it is shown that any classical periodic pseudo-differential operator can be expressed as a sum of periodic integral operators of the type (4.44); other contributions to periodic integral operators and classical operators are made in [34], [62], [142], and [102]. The research on these operators is of interest, but in the sequel we will rather concentrate on questions of the symbolic analysis.
4.5
Extension of toroidal symbols
In the study of periodic pseudo-differential operators, some of the applications of the calculus of finite differences, for example the discrete Taylor series, can be eliminated. We are going to explain how this can be done by interpolating a symbol (x, ξ) → σ(x, ξ) in the second argument ξ in a smooth way, so that it becomes defined on T^n × R^n instead of T^n × Z^n. This process will be called an extension of the toroidal symbol. By using such extensions one can work with the familiar tools of classical analysis, yielding the same theory as before, and for some practical examples this may be more convenient than operating with differences. However, this approach is quite alien to the idea of periodic symbols, as the results can be derived using quite simple difference calculus. In addition, difference operations can easily be carried out with computers, whereas program realisations of numerical differentiation are computationally expensive and troublesome. Moreover, such an extension exploits the intricate relation between T^n and R^n and cannot be readily generalised to symbols on other compact Lie groups (thus, while very useful on T^n, it unfortunately does not provide additional intuition for operators in Part IV).
Thus, it is often useful to extend toroidal symbols from T^n × Z^n to T^n × R^n, ideally getting symbols in Hörmander's symbol classes. The case of n = 1 and (ρ, δ) = (1, 0) was considered in [141] and [102]. This extension can be done with a suitable convolution that respects the symbol inequalities. In the following, δ_{0,ξ} is the Kronecker delta at 0 ∈ Z^n, i.e., δ_{0,0} = 1, and δ_{0,ξ} = 0 if ξ ≠ 0. First we prepare the following useful functions θ, φ_α ∈ S(R^n):
Lemma 4.5.1. There exist functions φ_α ∈ S(R^n) (for each α ∈ N^n_0) and a function θ ∈ S(R^n) such that
\[
P\theta(x) := \sum_{k\in\mathbb{Z}^n} \theta(x+k) \equiv 1,
\qquad
(F_{\mathbb{R}^n}\theta)\big|_{\mathbb{Z}^n}(\xi) = \delta_{0,\xi},
\qquad\text{and}\qquad
\partial_\xi^{\alpha} (F_{\mathbb{R}^n}\theta)(\xi) = \triangle_\xi^{\alpha} \varphi_\alpha(\xi)
\]
for all ξ ∈ Z^n. The idea of this lemma may be credited to Yves Meyer [29, p. 4]. It will be used in the interpolation presented in Theorem 4.5.3.
Proof. Let us first consider the one-dimensional case. Let θ = θ_1 ∈ C^∞(R^1) be such that supp(θ_1) ⊂ (−1, 1), θ_1(−x) = θ_1(x) for x ∈ R, and θ_1(1−y) + θ_1(y) = 1 for 0 ≤ y ≤ 1; these assumptions on θ_1 are enough for us, and of course the choice is not unique. In any case, θ_1 ∈ S(R^1), so that also F_R θ_1 ∈ S(R^1). If ξ ∈ Z^1 then we have
\[
F_{\mathbb{R}}\theta_1(\xi)
= \int_{\mathbb{R}^1} \theta_1(x)\, e^{-i2\pi x\cdot\xi}\, dx
= \int_0^1 \left( \theta_1(x-1) + \theta_1(x) \right) e^{-i2\pi x\cdot\xi}\, dx
= \delta_{0,\xi}.
\]
If a desired φ_α ∈ S(R^1) exists, it must satisfy
\[
\int_{\mathbb{R}^1} e^{i2\pi x\cdot\xi}\, \partial_\xi^{\alpha} (F_{\mathbb{R}}\theta_1)(\xi)\, d\xi
= \int_{\mathbb{R}^1} e^{i2\pi x\cdot\xi}\, \triangle_\xi^{\alpha} \varphi_\alpha(\xi)\, d\xi
= \left( e^{-i2\pi x} - 1 \right)^{\alpha} \int_{\mathbb{R}^1} e^{i2\pi x\cdot\xi}\, \varphi_\alpha(\xi)\, d\xi,
\]
due to the bijectivity of F_R : S(R^1) → S(R^1). Integration by parts leads to the formula
\[
(-i2\pi x)^{\alpha}\, \theta_1(x) = \left( e^{-i2\pi x} - 1 \right)^{\alpha} (F_{\mathbb{R}}^{-1}\varphi_\alpha)(x).
\]
Thus
\[
(F_{\mathbb{R}}^{-1}\varphi_\alpha)(x) =
\begin{cases}
\left( \dfrac{i2\pi x}{1 - e^{-i2\pi x}} \right)^{\alpha} \theta_1(x), & \text{if } 0 < |x| < 1, \\
1, & \text{if } x = 0, \\
0, & \text{if } |x| \ge 1.
\end{cases}
\]
The general n-dimensional case is reduced to the one-dimensional case, since the mapping θ = (x → θ_1(x_1) θ_1(x_2) ⋯ θ_1(x_n)) ∈ S(R^n) has the desired properties. □
Remark 4.5.2 (Periodic symbols on R^n). The defining symbol inequalities for the class S^m_{ρ,δ}(T^n × R^n) of periodic symbols on R^n are
\[
\forall \alpha,\beta\in\mathbb{N}^n_0\ \exists c_{\alpha\beta}>0:\quad
\left| \partial_\xi^{\alpha} \partial_x^{\beta} \sigma(x,\xi) \right|
\le c_{\alpha\beta}\, (1+\|\xi\|)^{m-\rho|\alpha|+\delta|\beta|}.
\tag{4.20}
\]
To emphasize the difference with toroidal symbols defined on T^n × Z^n, we call them Euclidean symbols.
Lemma 4.5.1 provides us with the means to interpolate between the discrete points of Z^n in a manner that is faithful to the symbol (cf. inequalities (4.6) and (4.20)):
Theorem 4.5.3 (Toroidal vs Euclidean symbols). Let 0 < ρ ≤ 1 and 0 ≤ δ ≤ 1. A symbol ã is a toroidal symbol, ã ∈ S^m_{ρ,δ}(T^n × Z^n), if and only if there exists a Euclidean symbol a ∈ S^m_{ρ,δ}(T^n × R^n) such that ã = a|_{T^n×Z^n}. Moreover, this extended symbol a is unique modulo S^{−∞}(T^n × R^n).
The relation between the corresponding pseudo-differential operators will be given in Theorem 4.6.12. For the relation between extensions and ellipticity see Theorem 4.9.15.
Proof. Let us first prove the "if" part. Let a ∈ S^m_{ρ,δ}(T^n × R^n); in this part we can actually allow any ρ and δ, for example 0 ≤ ρ, δ ≤ 1. By the Lagrange Mean Value Theorem, if |α| = 1 then
\[
\triangle_\xi^{\alpha} \partial_x^{\beta} \tilde a(x,\xi)
= \triangle_\xi^{\alpha} \partial_x^{\beta} a(x,\xi)
= \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi)\Big|_{\xi=\eta},
\]
where η is on the line from ξ to ξ + α. By the Mean Value Theorem, for a general multi-index α ∈ N^n_0 we also have
\[
\triangle_\xi^{\alpha} \partial_x^{\beta} \tilde a(x,\xi)
= \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi)\Big|_{\xi=\eta}
\]
for some η ∈ Q := [ξ_1, ξ_1+α_1] × ⋯ × [ξ_n, ξ_n+α_n]. This can be shown by induction. Indeed, let us write α = ω + γ for some ω = δ_j. Then we can calculate
\[
\triangle_\xi^{\alpha} \partial_x^{\beta} \tilde a(x,\xi)
= \triangle_\xi^{\omega} \left( \triangle_\xi^{\gamma} \partial_x^{\beta} \tilde a \right)(x,\xi)
= \triangle_{\xi_j} \partial_\xi^{\gamma} \partial_x^{\beta} a(x,\xi)\Big|_{\xi=\zeta}
= \partial_\xi^{\gamma} \partial_x^{\beta} a(x,\zeta+\delta_j) - \partial_\xi^{\gamma} \partial_x^{\beta} a(x,\zeta)
= \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi)\Big|_{\xi=\eta}
\]
for some ζ and η, where we used the induction hypothesis in the first equality. Therefore, we can estimate
\[
\left| \triangle_\xi^{\alpha} \partial_x^{\beta} \tilde a(x,\xi) \right|
= \left| \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi)\Big|_{\xi=\eta\in Q} \right|
\le C_{\alpha\beta m}\, \langle\eta\rangle^{m-\rho|\alpha|+\delta|\beta|}
\le C'_{\alpha\beta m}\, \langle\xi\rangle^{m-\rho|\alpha|+\delta|\beta|}.
\]
Let us now prove the "only if" part. First we show the uniqueness. Let a, b ∈ S^m_{ρ,δ}(T^n × R^n) and assume that a|_{T^n×Z^n} = b|_{T^n×Z^n}. Let c = a − b. Then c ∈ S^m_{ρ,δ}(T^n × R^n) and it satisfies c|_{T^n×Z^n} = 0. If ξ ∈ R^n \ Z^n, choose η ∈ Z^n that is the nearest point (or one of the nearest points) to ξ. Then we have the first-order Taylor expansion
\[
c(x,\xi) = c(x,\eta) + \sum_{\alpha:\,|\alpha|=1} r_\alpha(x,\eta,\xi-\eta)\, (\xi-\eta)^{\alpha}
= \sum_{\alpha:\,|\alpha|=1} r_\alpha(x,\eta,\xi-\eta)\, (\xi-\eta)^{\alpha},
\]
where
\[
r_\alpha(x,\eta,\theta) = \int_0^1 \partial_\xi^{\alpha} c(x,\eta+t\theta)\, dt.
\]
Hence we have |c(x, ξ)| ≤ C ⟨ξ⟩^{m−ρ}. Continuing the argument inductively for c and its derivatives, and using that ρ > 0, we obtain the uniqueness modulo S^{−∞}(T^n × R^n).
Let us now show the existence. Let θ ∈ S(R^n) be as in Lemma 4.5.1. Define a : T^n × R^n → C by
\[
a(x,\xi) := \sum_{\eta\in\mathbb{Z}^n} (F_{\mathbb{R}^n}\theta)(\xi-\eta)\, \tilde a(x,\eta).
\tag{4.21}
\]
It is easy to see that ã = a|_{T^n×Z^n}. Furthermore, we have
\[
\left| \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi) \right|
= \Big| \sum_{\eta\in\mathbb{Z}^n} \partial_\xi^{\alpha} (F_{\mathbb{R}^n}\theta)(\xi-\eta)\, \partial_x^{\beta} \tilde a(x,\eta) \Big|
= \Big| \sum_{\eta\in\mathbb{Z}^n} \triangle_\xi^{\alpha} \varphi_\alpha(\xi-\eta)\, \partial_x^{\beta} \tilde a(x,\eta) \Big|
\]
\[
\overset{(3.14)}{=} \Big| \sum_{\eta\in\mathbb{Z}^n} \varphi_\alpha(\xi-\eta)\, (-1)^{|\alpha|}\, \overline{\triangle}{}_\eta^{\alpha}\, \partial_x^{\beta} \tilde a(x,\eta) \Big|
\le \sum_{\eta\in\mathbb{Z}^n} \left| \varphi_\alpha(\xi-\eta) \right| C_{\alpha\beta m}\, \langle\eta\rangle^{m-\rho|\alpha|+\delta|\beta|}
\]
\[
\le C_{\alpha\beta m} \Big( \sum_{\eta\in\mathbb{Z}^n} \left| \varphi_\alpha(\eta) \right| \langle\eta\rangle^{\left| m-\rho|\alpha|+\delta|\beta| \right|} \Big)\, \langle\xi\rangle^{m-\rho|\alpha|+\delta|\beta|}
\le C'_{\alpha\beta m}\, \langle\xi\rangle^{m-\rho|\alpha|+\delta|\beta|},
\]
where we used the summation by parts formula (3.14). In the last two lines we also used that φ_α ∈ S(R^n), together with the simple facts that, for p > 0, (1+‖ξ+η‖)^p ≤ (1+‖ξ‖)^p (1+‖η‖)^p and (1+‖ξ+η‖)^{−p} (1+‖η‖)^{−p} ≤ (1+‖ξ‖)^{−p} for all ξ, η ∈ R^n. Thus a ∈ S^m_{ρ,δ}(T^n × R^n). □
From now on, we can exploit inequalities (4.20), but it is good to remember that all the information was contained already in the original definition of symbols on T^n × Z^n. In a sense, the extension is arbitrary (yet unique up to order −∞), as the demands on the function θ ∈ S(R^n) were quite modest in the proof of Lemma 4.5.1.
Remark 4.5.4. The extension process can also be modified for amplitudes, to get a(x, y, ξ) (continuous ξ ∈ R^n) from ã(x, y, ξ) (discrete ξ ∈ Z^n).
Remark 4.5.5 (Extension respects ellipticity). Moreover, the extension respects ellipticity, as we will show in Theorem 4.9.15.
Exercise 4.5.6. Work out the details of the proof of Remark 4.5.4.
We also observe that the same proof yields the following limited regularity version of Theorem 4.5.3:
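In one dimension the whole extension procedure can be carried out numerically. The sketch below (our illustration; the particular bump-based θ₁ and the model symbol ⟨ξ⟩^{−1} are arbitrary choices satisfying the assumptions of Lemma 4.5.1) builds θ₁, checks the partition-of-unity and interpolation properties, and verifies that the convolution (4.21) reproduces the toroidal symbol at the integers.

```python
import numpy as np

def g(u):
    u = np.asarray(u, dtype=float)
    return np.where(u > 0, np.exp(-1.0 / np.maximum(u, 1e-12)), 0.0)

def theta1(x):
    # even, supp in (-1,1), theta1(y) + theta1(1-y) = 1 on [0,1] (cf. Lemma 4.5.1)
    u = np.abs(np.asarray(x, dtype=float))
    return 1.0 - g(u) / (g(u) + g(1.0 - u))

xs = np.linspace(-1.0, 1.0, 4097)            # quadrature grid for the FT
w = xs[1] - xs[0]

def F_theta(xi):
    # (F_R theta1)(xi) = int theta1(x) e^{-i2pi x xi} dx, trapezoidal rule
    vals = theta1(xs) * np.exp(-2j * np.pi * xs * xi)
    return w * (np.sum(vals) - 0.5 * (vals[0] + vals[-1]))

tilde_a = lambda eta: (1.0 + eta * eta) ** (-0.5)   # toroidal symbol of order -1

def extended_a(xi):
    # the extension (4.21): a(x, xi) = sum_eta (F theta)(xi - eta) tilde_a(eta)
    return sum(F_theta(xi - eta) * tilde_a(eta) for eta in range(-30, 31))

pou = float(theta1(0.3) + theta1(0.3 - 1.0))        # partition of unity at x = 0.3
delta_check = (abs(F_theta(0.0) - 1.0), abs(F_theta(2.0)))
interp_err = abs(extended_a(3.0) - tilde_a(3))
```

Since F_θ is 1 at 0 and vanishes at the other integers, the convolution interpolates: at ξ ∈ Z it returns exactly ã(ξ), while between the integers it fills in smoothly.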
Corollary 4.5.7 (Limited regularity extensions). Let the function a : T^n × R^n → C satisfy
\[
\left| \partial_\xi^{\alpha} \partial_x^{\beta} a(x,\xi) \right| \le c_{\alpha\beta}\, \langle\xi\rangle^{m-\rho|\alpha|+\delta|\beta|}
\quad\text{for all } x\in\mathbb{T}^n,\ \xi\in\mathbb{R}^n,
\tag{4.22}
\]
for all |α| ≤ N_1 and |β| ≤ N_2. Then its restriction ã := a|_{T^n×Z^n} satisfies
\[
\left| \triangle_\xi^{\alpha} \partial_x^{\beta} \tilde a(x,\xi) \right| \le c'_{\alpha\beta}\, \langle\xi\rangle^{m-\rho|\alpha|+\delta|\beta|}
\quad\text{for all } x\in\mathbb{T}^n,\ \xi\in\mathbb{Z}^n,
\tag{4.23}
\]
for all |α| ≤ N_1 and |β| ≤ N_2. Conversely, every function ã : T^n × Z^n → C satisfying (4.23) for all |α| ≤ N_1 and |β| ≤ N_2 is a restriction ã = a|_{T^n×Z^n} of some function a : T^n × R^n → C satisfying (4.22) for all |α| ≤ N_1 and |β| ≤ N_2.
4.6
Periodisation of pseudo-differential operators
In this section we describe the relation between operators with Euclidean and toroidal quantizations, and between operators corresponding to symbols a(x, ξ) and ã = a|_{T^n×Z^n}, given by the operator of periodisation of functions.
First we state a property of a pseudo-differential operator a(X, D) to map the space S(R^n) into itself, which will be of importance. The following class will be sufficient for our purposes, and the proof is straightforward.
Proposition 4.6.1. Let a = a(x, ξ) ∈ C^∞(R^n × R^n) and assume that there exist ε > 0 and N ∈ R such that for every α, β there are constants C_{αβ} and M(α, β) such that the estimate
\[
\left| \partial_x^{\alpha} \partial_\xi^{\beta} a(x,\xi) \right|
\le C_{\alpha\beta}\, \langle x\rangle^{N+(1-\varepsilon)|\beta|}\, \langle\xi\rangle^{M(\alpha,\beta)}
\]
holds for all x, ξ ∈ R^n. Then the pseudo-differential operator a(X, D) with symbol a(x, ξ) is continuous from S(R^n) to S(R^n).
Exercise 4.6.2. Prove Proposition 4.6.1.
Before analysing the relation between operators with Euclidean and toroidal quantizations, we will describe the periodisation operator that will be of importance for such analysis.
Theorem 4.6.3 (Periodisation). The periodisation Pf : R^n → C of a function f ∈ S(R^n) is defined by
\[
Pf(x) := \sum_{k\in\mathbb{Z}^n} f(x+k).
\tag{4.24}
\]
Then P : S(R^n) → C^∞(T^n) is surjective and ‖Pf‖_{L^1(T^n)} ≤ ‖f‖_{L^1(R^n)}. Moreover, we have
\[
Pf(x) = \left( F_{\mathbb{T}^n}^{-1} \left( (F_{\mathbb{R}^n} f)\big|_{\mathbb{Z}^n} \right) \right)(x)
\tag{4.25}
\]
and
\[
\sum_{k\in\mathbb{Z}^n} f(x+k) = \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, (F_{\mathbb{R}^n} f)(\xi).
\tag{4.26}
\]
Taking x = 0 in (4.26), we obtain:
Corollary 4.6.4 (Poisson summation formula). For f ∈ S(R^n) we have
\[
\sum_{k\in\mathbb{Z}^n} f(k) = \sum_{\xi\in\mathbb{Z}^n} \widehat{f}(\xi).
\]
As a consequence of the Poisson summation formula and the Fourier transform of Gaussians in Lemma 1.1.23, we obtain the so-called Jacobi identity:
Exercise 4.6.5 (Jacobi identity for Gaussians). For every ε > 0 we have
\[
\sum_{j=-\infty}^{+\infty} e^{-\pi j^2/\varepsilon}
= \sqrt{\varepsilon}\, \sum_{j=-\infty}^{+\infty} e^{-\pi j^2 \varepsilon}.
\]
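The identity is easy to check numerically; the following sketch (our illustration) truncates both theta series, whose terms decay super-exponentially.

```python
import math

def theta_sum(eps, terms=100):
    # sum_{j in Z} e^{-pi j^2 / eps}, truncated
    return sum(math.exp(-math.pi * j * j / eps) for j in range(-terms, terms + 1))

def jacobi_gap(eps):
    # Jacobi identity: theta_sum(eps) = sqrt(eps) * sum_j e^{-pi j^2 eps}
    rhs = math.sqrt(eps) * sum(math.exp(-math.pi * j * j * eps)
                               for j in range(-100, 101))
    return abs(theta_sum(eps) - rhs)

gaps = [jacobi_gap(e) for e in (0.25, 0.5, 1.0, 2.0)]
```

For small ε the left-hand series converges almost instantly while the right-hand one needs more terms; this is precisely why the identity is useful in practice.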
Remark 4.6.6. We note that by Theorem 4.6.3 we may extend the periodisation operator P to L^1(R^n), and also this extension is surjective from L^1(R^n) to L^1(T^n). This is actually rather trivial compared to the smooth case of Theorem 4.6.3, because we can find a preimage f ∈ L^1(R^n) of g ∈ L^1(T^n) under the periodisation mapping P by, for example, setting f = g|_{[0,1]^n} and f = 0 otherwise.
Exercise 4.6.7. Observe that the periodisation operator P : S(R^n) → C^∞(T^n) is dualised to P^t : D′(T^n) → S′(R^n) by the formula
\[
\langle P^t u, \varphi \rangle := \langle u, P\varphi \rangle
\quad\text{for all } \varphi \in S(\mathbb{R}^n).
\]
Indeed, if ϕ ∈ S(R^n) we have that Pϕ ∈ C^∞(T^n), so that this definition makes sense for u ∈ D′(T^n). What is the meaning of the operator P^t?
Proof of Theorem 4.6.3. The estimate ‖Pf‖_{L^1(T^n)} ≤ ‖f‖_{L^1(R^n)} is straightforward. Next, for ξ ∈ Z^n, we have
\[
F_{\mathbb{T}^n}(Pf)(\xi)
= \int_{\mathbb{T}^n} e^{-i2\pi x\cdot\xi}\, Pf(x)\, dx
= \int_{\mathbb{T}^n} e^{-i2\pi x\cdot\xi} \sum_{k\in\mathbb{Z}^n} f(x+k)\, dx
= \int_{\mathbb{R}^n} e^{-i2\pi x\cdot\xi}\, f(x)\, dx
= (F_{\mathbb{R}^n} f)(\xi).
\]
From this we can see that
\[
\sum_{k\in\mathbb{Z}^n} f(x+k)
= Pf(x)
= \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, F_{\mathbb{T}^n}(Pf)(\xi)
= \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, (F_{\mathbb{R}^n} f)(\xi),
\]
proving (4.26). Let us show the surjectivity of P : S(R^n) → C^∞(T^n). Let θ ∈ S(R^n) be the function defined in Lemma 4.5.1. Then for any g ∈ C^∞(T^n) it holds that
\[
P(g\theta)(x) = \sum_{k\in\mathbb{Z}^n} g(x+k)\,\theta(x+k) = g(x) \sum_{k\in\mathbb{Z}^n} \theta(x+k) = g(x),
\]
where gθ is the product of θ with the Z^n-periodic function g on R^n. We omit the straightforward proofs of the other claims. □
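Identity (4.25), that the Fourier coefficients of Pf are the integer samples of F_{R^n}f, can be checked directly for a Gaussian, whose Euclidean Fourier transform is known in closed form (our illustration, not from the text).

```python
import numpy as np

f = lambda x: np.exp(-np.pi * x ** 2)        # Gaussian with F_R f(xi) = e^{-pi xi^2}
M = 64
x = np.arange(M) / M

Pf = sum(f(x + k) for k in range(-20, 21))   # periodisation (4.24); tails negligible

coeff_err = max(
    abs(np.mean(Pf * np.exp(-2j * np.pi * x * xi)) - np.exp(-np.pi * xi ** 2))
    for xi in range(-3, 4)
)
```

The discrete mean over the grid computes the toroidal Fourier coefficient exactly up to aliasing, which for this Gaussian is far below machine precision.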
We saw in Proposition 3.1.34 that the Dirac delta comb δ_{Z^n} can be viewed as a sum of Dirac deltas. We can also relate it to the partial sums of Fourier coefficients:
Proposition 4.6.8. Let us define Q_j ∈ S′(R^n) by
\[
\langle Q_j, \varphi \rangle := \sum_{k\in\mathbb{Z}^n:\,|k|\le j} \int_{\mathbb{R}^n} e^{i2\pi k\cdot\xi}\, \varphi(\xi)\, d\xi
\tag{4.27}
\]
for ϕ ∈ S(R^n). Then we have the convergence Q_j → δ_{Z^n} in S′(R^n) to the Dirac delta comb.
Proof. Indeed, for all ϕ ∈ S(R^n) we have
\[
\langle Q_j, \varphi \rangle
= \sum_{k\in\mathbb{Z}^n:\,|k|\le j} \int_{\mathbb{R}^n} e^{i2\pi k\cdot\xi}\, \varphi(\xi)\, d\xi
= \sum_{k\in\mathbb{Z}^n:\,|k|\le j} (F_{\mathbb{R}^n}\varphi)(k)
\xrightarrow[j\to\infty]{} \sum_{k\in\mathbb{Z}^n} (F_{\mathbb{R}^n}\varphi)(k)
\overset{\text{Poisson}}{=} \sum_{\xi\in\mathbb{Z}^n} \varphi(\xi)
= \langle \delta_{\mathbb{Z}^n}, \varphi \rangle. \qquad\Box
\]
Remark 4.6.9 (Inflated torus). For N ∈ N we write NT^n = (R/NZ)^n, which we call the N-inflated torus, or simply an inflated torus if the value of N is not of importance. We note that in the case of the N-inflated torus NT^n we can use the periodisation operator P_N instead of P, where P_N : S(R^n) → C^∞(NT^n) can be defined by
\[
(P_N f)(x) = \left( F_{N\mathbb{T}^n}^{-1} \left( (F_{\mathbb{R}^n} f)\big|_{\frac{1}{N}\mathbb{Z}^n} \right) \right)(x),
\qquad x \in N\mathbb{T}^n.
\tag{4.28}
\]
Exercise 4.6.10. Generalise Theorem 4.6.3 to the N -inflated torus using operator PN . Let us now establish some basic properties of pseudo-differential operators with respect to periodisation.
Definition 4.6.11. We will say that a function a : R^n × R^n → C is 1-periodic (we will always mean that it is periodic with respect to the first variable x ∈ R^n) if the function x → a(x, ξ) is Z^n-periodic for all ξ. As in Theorem 4.5.3, we use the tilde to denote the restriction of a ∈ C^∞(R^n × R^n) to R^n × Z^n. If a(x, ξ) is 1-periodic, we can view it as a function on T^n × Z^n, and we write ã = a|_{T^n×Z^n}. For such ã the corresponding operator Op(ã) = ã(X, D) is defined by (4.7) in Definition 4.1.9.
Theorem 4.6.12 (Periodisation of operators). Let a = a(x, ξ) ∈ C^∞(R^n × R^n) be 1-periodic with respect to x for every ξ ∈ R^n. Assume that for every α, β ∈ N^n_0 there are constants C_{αβ} and M(α, β) such that the estimate
\[
\left| \partial_x^{\alpha} \partial_\xi^{\beta} a(x,\xi) \right| \le C_{\alpha\beta}\, \langle\xi\rangle^{M(\alpha,\beta)}
\]
holds for all x, ξ ∈ R^n. Let ã = a|_{T^n×Z^n}. Then
\[
P \circ a(X,D)f = \tilde a(X,D) \circ Pf
\tag{4.29}
\]
for all f ∈ S(R^n).
Note that it is not important in this theorem that a is in any of the symbol classes S^m_{ρ,δ}(R^n × R^n).
Combining Theorems 4.5.3 and 4.6.12 we get:
Corollary 4.6.13 (Equality of operator classes). For 0 ≤ δ ≤ 1 and 0 < ρ ≤ 1 we have
\[
\mathrm{Op}(S^m_{\rho,\delta}(\mathbb{T}^n\times\mathbb{R}^n)) = \mathrm{Op}(S^m_{\rho,\delta}(\mathbb{T}^n\times\mathbb{Z}^n)),
\]
i.e., the classes of 1-periodic pseudo-differential operators with Euclidean (Hörmander's) symbols in S^m_{ρ,δ}(T^n × R^n) and with toroidal symbols in S^m_{ρ,δ}(T^n × Z^n) coincide.
Remark 4.6.14. Note that by Proposition 4.6.1 both sides of (4.29) are well-defined functions in C^∞(T^n). Moreover, equality (4.29) can be justified for f in larger classes of functions. For example, (4.29) remains true pointwise if f ∈ C^k_0(R^n) is a C^k compactly supported function for k sufficiently large. In any case, an equality on S(R^n) allows an extension to S′(R^n) by duality.
Proof of Theorem 4.6.12. Let f ∈ S(R^n). Then we have
\[
P(a(X,D)f)(x)
= \sum_{k\in\mathbb{Z}^n} (a(X,D)f)(x+k)
= \sum_{k\in\mathbb{Z}^n} \int_{\mathbb{R}^n} e^{i2\pi(x+k)\cdot\xi}\, a(x+k,\xi)\, (F_{\mathbb{R}^n}f)(\xi)\, d\xi
= \int_{\mathbb{R}^n} \Big( \sum_{k\in\mathbb{Z}^n} e^{i2\pi k\cdot\xi} \Big)\, e^{i2\pi x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi)\, d\xi
\]
\[
= \int_{\mathbb{R}^n} \delta_{\mathbb{Z}^n}(\xi)\, e^{i2\pi x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi)\, d\xi
= \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi)
= \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, a(x,\xi)\, F_{\mathbb{T}^n}(Pf)(\xi)
= \tilde a(X,D)(Pf)(x),
\]
where δ_{Z^n} is the Dirac δ comb from Definition 3.1.32. As usual, these calculations can be justified in the sense of distributions (see Remark 4.6.15). □
Remark 4.6.15 (Distributional justification). We now give the distributional interpretation of the calculations in Theorem 4.6.12. Let us define some useful variants of the Dirac delta comb from Definition 3.1.32: for x ∈ R^n and j ∈ Z^+, let P^x, P^x_j ∈ S′(R^n) be such that
\[
\langle P^x, \varphi \rangle := \sum_{k\in\mathbb{Z}^n} \varphi(x+k),
\qquad
\langle P^x_j, \varphi \rangle := \sum_{k\in\mathbb{Z}^n:\,|k|\le j} \varphi(x+k),
\]
for ϕ ∈ S(R^n). We can easily observe that P^x_j → P^x in S′(R^n). Then we can calculate:
\[
P(a(X,D)f)(x)
= \langle P^x,\, a(X,D)f \rangle
\overset{P^x_j\to P^x}{=} \lim_{j\to\infty} \langle P^x_j,\, a(X,D)f \rangle
= \lim_{j\to\infty} \sum_{k\in\mathbb{Z}^n:\,|k|\le j} (a(X,D)f)(x+k)
\]
\[
= \lim_{j\to\infty} \sum_{k\in\mathbb{Z}^n:\,|k|\le j} \int_{\mathbb{R}^n} e^{2\pi i(x+k)\cdot\xi}\, a(x+k,\xi)\, (F_{\mathbb{R}^n}f)(\xi)\, d\xi
= \lim_{j\to\infty} \sum_{k\in\mathbb{Z}^n:\,|k|\le j} \int_{\mathbb{R}^n} e^{i2\pi k\cdot\xi}\, e^{2\pi i x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi)\, d\xi
\]
\[
\overset{Q_j \text{ from } (4.27)}{=} \lim_{j\to\infty} \left\langle Q_j,\ \xi \mapsto e^{2\pi i x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi) \right\rangle
\overset{Q_j\to\delta_{\mathbb{Z}^n}}{=} \left\langle \delta_{\mathbb{Z}^n},\ \xi \mapsto e^{2\pi i x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi) \right\rangle
\]
\[
= \sum_{\xi\in\mathbb{Z}^n} e^{2\pi i x\cdot\xi}\, a(x,\xi)\, (F_{\mathbb{R}^n}f)(\xi)
\overset{(4.25)}{=} \sum_{\xi\in\mathbb{Z}^n} e^{i2\pi x\cdot\xi}\, a(x,\xi)\, F_{\mathbb{T}^n}(Pf)(\xi)
= \tilde a(X,D)(Pf)(x).
\]
As we can see, the distributional justifications are quite natural, in the end.
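Identity (4.29) can also be observed numerically. The sketch below (our illustration; the 1-periodic model symbol and the Gaussian f are arbitrary choices satisfying the assumptions of Theorem 4.6.12) computes P(a(X,D)f) by quadrature over R and compares it with ã(X,D)(Pf) computed as a Fourier series, using (4.25) for the toroidal Fourier coefficients of Pf.

```python
import numpy as np

a = lambda x, xi: (1.5 + np.cos(2 * np.pi * x)) / np.sqrt(1.0 + xi ** 2)  # 1-periodic in x
fhat = lambda xi: np.exp(-np.pi * xi ** 2)     # f Gaussian, F_R f known exactly

xi_c = np.linspace(-8.0, 8.0, 4096)            # quadrature grid for the R^1-quantization
h = xi_c[1] - xi_c[0]

def aXD_f(x):
    # (a(X,D)f)(x) = int e^{i2pi x xi} a(x, xi) (F_R f)(xi) dxi
    x = np.asarray(x, dtype=float)[..., None]
    return h * np.sum(np.exp(2j * np.pi * x * xi_c) * a(x, xi_c) * fhat(xi_c), axis=-1)

xt = np.arange(32) / 32.0                      # grid on the torus
lhs = sum(aXD_f(xt + k) for k in range(-12, 13))         # (P o a(X,D)) f

xi_d = np.arange(-12, 13)                      # toroidal frequencies; tails ~ 0
rhs = np.sum(np.exp(2j * np.pi * xt[:, None] * xi_d) * a(xt[:, None], xi_d)
             * fhat(xi_d), axis=-1)            # tilde-a(X,D)(Pf), via (4.25)
periodisation_gap = np.max(np.abs(lhs - rhs))
```

Both truncations (of the periodisation sum and of the frequency range) are harmless here because a(X,D)f and F_R f decay rapidly.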
Let us now formulate a useful corollary of Theorem 4.6.12 that will be of importance later, in particular in composing a pseudo-differential operator with a Fourier series operator in the proof of Theorem 4.13.11. If in Theorem 4.6.12 we take a function f such that f = g|_{[0,1]^n} for some g ∈ C^∞(T^n), and f = 0 otherwise, it follows immediately that g = Pf. Adjusting this argument by shifting the cube [0, 1]^n if necessary, and shrinking the support of g to make f smooth, we obtain:
Corollary 4.6.16. Let a = a(x, ξ) be as in Theorem 4.6.12, let g ∈ C^∞(T^n), and let V be an open cube in R^n with side length equal to 1. Assume that the support of g|_V is contained in V. Then we have the equality
\[
\tilde a(X,D)\, g = P \circ a(X,D)(g|_V),
\]
where g|_V : R^n → C is defined as the restriction of g to V, equal to zero outside of V.
Exercise 4.6.17. Work out the details of the proof of Corollary 4.6.16. Especially, the fact that a is 1-periodic plays an important role.
Since we do not always have periodic symbols on R^n, it may be convenient to periodise them.
Definition 4.6.18 (Periodisation of symbols). If a(X, D) is a pseudo-differential operator with symbol a(x, ξ), by (Pa)(X, D) we will denote the pseudo-differential operator with symbol
\[
(Pa)(x,\xi) := \sum_{k\in\mathbb{Z}^n} a(x+k,\xi).
\]
This procedure makes sense if, for example, a is in L^1(R^n) with respect to the variable x.
In the following proposition we will assume that the supports of symbols and functions are contained in the cube [−1/2, 1/2]^n. We note that this is not restrictive if these functions are already compactly supported. Indeed, if the supports of a(·, ξ) and f are contained in some compact set independent of ξ, we can find some N ∈ N such that they are contained in [−N/2, N/2]^n, and then use the analysis on the N-inflated torus, with the periodisation operator P_N instead of P, defined in (4.28).
Proposition 4.6.19 (Operator with periodised symbol). Let a = a(x, ξ) ∈ C^∞(R^n × R^n) satisfy supp a ⊂ [−1/2, 1/2]^n × R^n, and be such that for every α, β ∈ N^n_0 there are constants C_{αβ} and M(α, β) such that the estimate
\[
\left| \partial_x^{\alpha} \partial_\xi^{\beta} a(x,\xi) \right| \le C_{\alpha\beta}\, \langle\xi\rangle^{M(\alpha,\beta)}
\]
holds for all x, ξ ∈ R^n. Then we have
\[
a(X,D)f = (Pa)(X,D)f + Rf
\]
for all f ∈ C^∞(R^n) with supp f ⊂ [−1/2, 1/2]^n. If moreover a ∈ S^m_{ρ,δ}(R^n × R^n) with ρ > 0, then the operator R extends to a smoothing pseudo-differential operator R : D′(R^n) → S(R^n).
Proof. By the definition we can write

(Pa)(X, D)f(x) = Σ_{k∈Zn} ∫_{Rn} e^{i2πx·ξ} a(x + k, ξ) (F_{Rn}f)(ξ) dξ,

and let us define Rf := a(X, D)f − (Pa)(X, D)f. The assumption on the support of a implies that for every x there is only one k ∈ Zn for which a(x + k, ξ) ≠ 0, so the sum consists of only one term. It follows that Rf(x) = 0 for x ∈ [−1/2, 1/2]^n, because for such x the non-zero term corresponds to k = 0. Let now x ∈ Rn ∖ [−1/2, 1/2]^n. Since then a(x, ξ) = 0 for all ξ ∈ Rn, the sum

Rf(x) = − Σ_{k∈Zn, k≠0} ∫_{Rn} ∫_{Rn} e^{i2π(x−y)·ξ} a(x + k, ξ) f(y) dy dξ

is just a single term and |x − y| > 0 on supp f, so we can integrate by parts with respect to ξ any number of times. This implies that R ∈ Ψ^{−∞}(Rn × Rn) because ρ > 0, and that Rf decays at infinity faster than any power. The proof is complete since the same argument can be applied to the derivatives of Rf as well.

Exercise 4.6.20. Work out the details for the derivatives of Rf.

Proposition 4.6.19 allows us to extend the formula of Theorem 4.6.12 to compact perturbations of periodic symbols. We will use it later when a(X, D) is a sum of a constant coefficient operator and an operator with compactly (in x) supported symbol.

Corollary 4.6.21 (Periodisation and compactly supported perturbations). Let a(X, D) be an operator with symbol a(x, ξ) = a1(x, ξ) + a0(x, ξ), where a1 is as in Theorem 4.6.12, a1 is 1-periodic in x for every ξ ∈ Rn, and a0 is as in Proposition 4.6.19, supported in [−1/2, 1/2]^n × Rn. Define

b̃(x, ξ) := ã1(x, ξ) + (Pa0)~(x, ξ),  x ∈ Tn, ξ ∈ Zn.

Then we have
P ∘ a(X, D)f = b̃(X, D) ∘ Pf + P ∘ Rf    (4.30)

for all f ∈ S(Rn), and the operator R extends to R : D′(Rn) → S(Rn), so that P ∘ R : D′(Rn) → C∞(Tn). Moreover, if a1, a0 ∈ S^m_{ρ,δ}(Rn × Rn), then b̃ ∈ S^m_{ρ,δ}(Tn × Zn).

Remark 4.6.22. Recalling Remark 4.6.14, (4.30) can be justified for larger function classes, e.g., for f ∈ C0^k(Rn) for k sufficiently large (which will be of use in Section 4.12).
Proof of Corollary 4.6.21. By Proposition 4.6.19 we can write a(X, D) = a1(X, D) + (Pa0)(X, D) + R, with R : D′(Rn) → S(Rn). Let us define b(x, ξ) := a1(x, ξ) + (Pa0)(x, ξ), so that b̃ = b|Tn×Zn. The symbol b is 1-periodic, hence for the operator b(X, D) = a1(X, D) + (Pa0)(X, D) by Theorem 4.6.12 we have

P ∘ b(X, D) = b̃(X, D) ∘ P = ã1(X, D) ∘ P + (Pa0)~(X, D) ∘ P.

Since R : D′(Rn) → S(Rn), we also have P ∘ R : D′(Rn) → C∞(Tn). Finally, if a1, a0 ∈ S^m_{ρ,δ}(Rn × Rn), then b̃ ∈ S^m_{ρ,δ}(Tn × Zn) by Theorem 4.5.3. The proof is complete.
4.7 Symbolic calculus
In this section we show that (for suitable ρ, δ) the family of periodic pseudo-differential operators is a ∗-algebra, i.e., it is closed under sums (trivially σj ∈ S^{mj}_{ρ,δ}(Tn × Zn), j = 1, 2, implies σ1 + σ2 ∈ S^{max{m1,m2}}_{ρ,δ}(Tn × Zn)), products, and taking adjoints; this algebraic structure is the key property for the applicability of periodic pseudo-differential operators. Furthermore, under these operations the degrees of operators behave as one would expect. In the proofs the symbol analysis techniques are used, leaving us with asymptotic expansions, so that there is a point to study periodic pseudo-differential operators that are invertible modulo Op(S^{−∞}(Tn × Zn)); that is, the elliptic operators, which are discussed in Section 4.9.

Recall that now there are two types of symbols, toroidal and Euclidean, in (4.6) and (4.20), yielding two alternative (toroidal and Euclidean) quantizations for operators, respectively, see Corollary 4.6.13. As usual, we emphasize this difference by writing σ ∈ S^m_{ρ,δ}(Tn × Zn) and σ ∈ S^m_{ρ,δ}(Tn × Rn), respectively.

Now we will discuss the calculus of pseudo-differential operators with toroidal symbols. For this, let us fix the notation first and recall discrete versions of derivatives from Definition 3.4.1:

D_y^{(α)} = D_{y1}^{(α1)} · · · D_{yn}^{(αn)},

where D_{yj}^{(0)} = I and

D_{yj}^{(k+1)} = D_{yj}^{(k)} ( ∂/(i2π∂yj) − kI ) = ( ∂/(i2π∂yj) ) ( ∂/(i2π∂yj) − I ) · · · ( ∂/(i2π∂yj) − kI ).    (4.31)
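The operators D^{(k)} in (4.31) are falling-factorial analogues of ordinary powers of the derivative, and the conversion between the two is governed by the Stirling numbers of the second kind via the identity m^n = Σ_k S(n, k) m(m−1)···(m−k+1). A quick integer check of this identity (a numerical aside, not from the text):

```python
def stirling2(n, k):
    # Stirling number of the second kind S(n, k), standard recurrence
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def falling(m, k):
    # falling factorial m^(k) = m (m - 1) ... (m - k + 1)
    out = 1
    for j in range(k):
        out *= m - j
    return out

# m^n = sum_k S(n, k) m^(k): ordinary powers via falling factorials
for n in range(7):
    for m in range(-5, 6):
        assert m ** n == sum(stirling2(n, k) * falling(m, k) for k in range(n + 1))
```

The same conversion, applied symbol-wise, is what translates difference operators into derivatives in the proofs below.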
Also, in this section the equivalence of asymptotic sums in Definition 4.1.19 will be of use. We now observe how the difference operator affects expansions.

Lemma 4.7.1. Let 0 ≤ δ < ρ ≤ 1. Assume that σ ∈ S^m_{ρ,δ}(Tn × Rn). Then

Σ_{γ≥0} (1/γ!) △ξ^γ D_x^{(γ)} σ(x, ξ)  ∼^{−∞}  Σ_{γ≥0} (1/γ!) ∂ξ^γ D_x^γ σ(x, ξ)  =  exp(∂ξ D_x) σ(x, ξ),    (4.32)

where exp is used in abbreviating the right-hand side of (4.32).

In the sequel we will drop the infinity sign from ∼^{−∞} and will simply write ∼.

Proof. We apply Theorem 3.3.39 in order to translate differences into derivatives, and use the definition of the Stirling numbers of the second kind:
Σ_{α≥0} (1/α!) △ξ^α D_x^{(α)} σ(x, ξ) ∼ Σ_{α≥0} (1/α!) [ Σ_{γ≥α} (α!/γ!) S(γ, α) ∂ξ^γ ] D_x^{(α)} σ(x, ξ) = …

(with S(γ, α) denoting the Stirling numbers of the second kind)

…

for every ε > 0:

A ∈ L(H^s(Tn), H^{s−l−ε}(Tn)).
Furthermore, if m > l ≥ m − (ρ − δ), we can take ε = 0 above.

Proof. Notice that the requirement l < m is not really restricting, since by Theorem 4.2.3 we already know that A ∈ L(H^q(Tn), H^{q−m}(Tn)) for every q ∈ R. Fix ε > 0 and assume for clarity that s < t (the case s > t is totally symmetric). Then, by choosing q < s small enough, the interpolation theorems

L(H^{t1}, H^{t2}) ∩ L(H^{q1}, H^{q2}) ⊂ L([H^{t1}, H^{q1}]_θ, [H^{t2}, H^{q2}]_θ),  [H^{tj}, H^{qj}]_θ = H^{θtj+(1−θ)qj}

(here 0 < θ < 1; see [72, Theorems 5.1 and 7.7]) imply that A ∈ L(H^s(Tn), H^{s−l−ε}(Tn)). Now suppose l ≥ m − (ρ − δ). With the aid of the canonical Sobolev space isomorphisms ϕ^γ and Theorem 4.7.10, we get ϕ^{s−t} A ϕ^{t−s} − A ∈ Op(S^{m−(ρ−δ)}_{ρ,δ}(Tn × Zn)). On the other hand,

ϕ^{s−t} A ϕ^{t−s} H^s(Tn) = ϕ^{s−t} A H^t(Tn) ⊂ ϕ^{s−t} H^{t−l}(Tn) = H^{s−l}(Tn).

This completes the proof.

The interpolation theorems [72, Theorems 5.1 and 7.7] of the preceding proof, enhanced with norm estimates, are significant also in the proofs of [142, Lemma 4.3] and [62, Lemma 4.1], which are important results in the analysis of periodic integral operators.

Finally, we study the connection of orders and continuity in the elementary cases when a periodic pseudo-differential operator is either a multiplier or a multiplication. The next theorem, Abel–Dini, dwells in the theory of series. We present only the proof of the divergence part, which Niels Henrik Abel solved in the 1820s.
Theorem 4.10.12 (Abel–Dini). Let dj be positive numbers and let D_N := Σ_{j=1}^N dj. Assume that (D_N)_{N=1}^∞ is divergent. Then Σ_{j=1}^∞ dj/Dj^r diverges exactly when r ≤ 1.

Proof (Abel part). (The whole proof is in [66, p. 290–291].) We assume that r ≤ 1. Since (D_N)_{N=1}^∞ diverges, it is true that for every i ∈ Z+ there exists ki ∈ Z+ such that D_i/D_{i+ki} ≤ 1/2. Hence

Σ_{j=i+1}^{i+ki} dj/Dj^r ≥ Σ_{j=i+1}^{i+ki} dj/Dj ≥ (1/D_{i+ki}) Σ_{j=i+1}^{i+ki} dj = 1 − D_i/D_{i+ki} ≥ 1/2.

Due to this, Σ_{j=1}^∞ dj/Dj^r diverges.
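A quick numerical illustration of the dichotomy (not from the text), with the hypothetical choice dj = 1 so that D_N = N: the partial sums of Σ dj/Dj grow without bound, while for r = 2 the series Σ dj/Dj² stays below 2:

```python
def partial_sum(r, N):
    # partial sums of sum_j d_j / D_j^r with d_j = 1, hence D_j = j
    return sum(1.0 / j ** r for j in range(1, N + 1))

# r = 1: the harmonic series, partial sums keep growing past any bound
assert partial_sum(1, 10_000) > partial_sum(1, 1_000) + 2.0
# r = 2: convergent, bounded (the limit is pi^2 / 6 < 2)
assert partial_sum(2, 10_000) < 2.0
```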
If we say that a sequence converges to infinity, pj → ∞, it is meant that for every C < ∞ there exists jC ∈ Z+ such that pj > C if j > jC.

Corollary 4.10.13. If (pj)_{j∈Z+} is a monotone sequence of positive real numbers converging to infinity, then there is a convergent series Σ_{j=1}^∞ cj such that Σ_{j=1}^∞ pj cj diverges.

Proof. (A modification of [66, p. 302].) Define d1 := p1, d_{j+1} := p_{j+1} − pj. Then, in the notation of the Abel–Dini theorem, D_N = Σ_{j=1}^N dj = p_N → ∞, and Σ_{j=1}^∞ dj/Dj = 1 + Σ_{j=1}^∞ (p_{j+1} − pj)/p_{j+1} diverges. Let us define

cj := (p_{j+1} − pj)/(p_{j+1} pj).

Then Σ_{j=1}^∞ cj converges, because 1/pj → 0:

Σ_{j=1}^∞ cj = Σ_{j=1}^∞ ( 1/pj − 1/p_{j+1} ) = 1/p1.

Clearly, Σ_{j=1}^∞ pj cj = Σ_{j=1}^∞ (p_{j+1} − pj)/p_{j+1} diverges.
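The construction in the proof can be run numerically; under the hypothetical choice pj = j (an assumption for illustration), cj = 1/(j(j+1)) telescopes to 1/p1 = 1, while Σ pj cj = Σ 1/(j+1) has unbounded partial sums:

```python
def c(j):
    # c_j = (p_{j+1} - p_j) / (p_{j+1} p_j) with p_j = j
    return 1.0 / ((j + 1) * j)

N = 100_000
sum_c = sum(c(j) for j in range(1, N + 1))
sum_pc = sum(j * c(j) for j in range(1, N + 1))

# telescoping: sum_{j <= N} c_j = 1/p_1 - 1/p_{N+1}
assert abs(sum_c - (1.0 - 1.0 / (N + 1))) < 1e-9
# sum p_j c_j = sum 1/(j+1): a harmonic tail, already past 10 here
assert sum_pc > 10.0
```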
We apply this to obtain the following result concerning multipliers:

Proposition 4.10.14 (Sobolev unboundedness of multipliers). Assume that σA(x, ξ) = k̂(ξ), where for every C < ∞ there exists ξ ∈ Zn such that |k̂(ξ)| > C⟨ξ⟩^l. Then A H^s(Tn) ⊄ H^{s−l}(Tn) for any s ∈ R.

Proof. Now there is a subsequence of (⟨ξ⟩^{−2l} |k̂(ξ)|²)_{ξ∈Zn} that converges to ∞ as ⟨ξ⟩ → ∞. Corollary 4.10.13 then provides the existence of a sequence (û(ξ))_{ξ∈Zn} for which Σ_{ξ∈Zn} ⟨ξ⟩^{2s} |û(ξ)|² converges, but for which Σ_{ξ∈Zn} ⟨ξ⟩^{2(s−l)} |k̂(ξ) û(ξ)|² diverges. Thus u ∈ H^s(Tn), but it is mapped to Au ∉ H^{s−l}(Tn).
Example. Proposition 4.2.3 showed that the order of an operator determines its boundedness properties on Sobolev spaces. The converse is not true. Indeed, there is no straightforward way of concluding the order of a symbol from observations about between which spaces the mapping acts. A simple demonstration of this kind of phenomenon is σ(x, ξ) := sin(ln(|ξ|)²) (when |ξ| ≥ 1, ξ ∈ R¹; the definition of σ for |ξ| < 1 is not interesting). This symbol is independent of x, and it is bounded, resulting in that Op(σ) maps H^s(T¹) into itself for every s ∈ R. On the other hand, σ defines a periodic pseudo-differential operator of degree ε for any ε > 0, as is easily verified; however, σ ∉ S⁰(T¹), because

∂ξ σ(x, ξ) = 2 (ln(|ξ|)/ξ) cos(ln(|ξ|)²)    (|ξ| > 1),

which certainly is not in O((1 + |ξ|)^{−1}).

The case of pure multiplications can be more easily and thoroughly handled:

Proposition 4.10.15. Any Sobolev space H^s(Tn) is the intersection of the local Sobolev spaces of the same order, i.e., H^s(Tn) = ∩_{x∈Tn} H^s(x) for every s ∈ R. Moreover, if ϕ ∈ C∞(Tn) is such that ϕ(x) ≠ 0, then ϕ defines an automorphism of H^s(x) by multiplication.

Proof. By Theorem 4.2.3, v ∈ H^s(Tn) implies ψv ∈ H^s(Tn) for any ψ ∈ C∞(Tn). Then assume that v ∈ H^s(x) for every x ∈ Tn, so that there exist neighbourhoods Ux of points x where v ∈ H^s(Ux). Since Tn is compact, there is a finite subcover U = {U_{x(j)}}_{j=1}^N. Since there exists a smooth partition of unity subordinate to U (see Corollary A.12.15 for a continuous partition, and then make it smooth, e.g., by mollification), and U is finite, it is true that v ∈ H^s(Tn); the first claim is proved.

Let us then show that u ↦ ϕu defines an automorphism. As above, ϕu ∈ H^s(x). By the continuity of ϕ on Tn there exists a neighbourhood U of x such that ϕ(y) ≠ 0 whenever y ∈ U, and furthermore U can be chosen so small that u ∈ H^s(U). Then take ψ ∈ C∞(Tn) such that ψ|U = 1/ϕ|U. Since ψϕu|U = u|U, the result is obtained.
4.11 An application to periodic integral operators

As an example of the symbolic analysis techniques, here we study periodic integral operators. Let A be a linear operator defined on C∞(Tn) by

Au(x) := ∫_{Tn} a(x, y) k(x − y) u(y) dy,    (4.44)

where a is a C∞-smooth biperiodic function, and k is a 1-periodic distribution. Note that when a is a function of a single variable, A is simply a convolution
operator composed with multiplication: either Au(x) = f(x) ∫_{Tn} k(x − y) u(y) dy if a(x, y) = f(x), or Au(x) = ∫_{Tn} g(y) k(x − y) u(y) dy if a(x, y) = g(y).

We are going to show that whenever A of the type (4.44) is a periodic pseudo-differential operator, it is really something like a convolution operator with multiplication, or on the Fourier side, almost a multiplier:

Theorem 4.11.1. Let ρ > 0. The operator A defined by (4.44) is a periodic pseudo-differential operator of order m if and only if the Fourier coefficients of the distribution k satisfy

∀α ∈ Nn0 ∃Cα ∈ R ∀ξ ∈ Zn :  |△ξ^α k̂(ξ)| ≤ Cα ⟨ξ⟩^{m−ρ|α|}.    (4.45)

In this case A ∈ Op(S^m_{ρ,0}(Tn × Zn)) and the symbol of A has the following asymptotic expansion:

σA(x, ξ) ∼ Σ_{|γ|≥0} (1/γ!) △ξ^γ k̂(ξ) D_y^{(γ)} a(x, y)|_{y=x}.
Proof. An amplitude of A is right in front of our eyes:

Au(x) = ∫_{Tn} u(y) a(x, y) k(x − y) dy = Σ_{ξ∈Zn} ∫_{Tn} u(y) a(x, y) k̂(ξ) e^{i2π(x−y)·ξ} dy = Op(a)u(x),

where a(x, y, ξ) = a(x, y) k̂(ξ). Certainly k̂ satisfies the estimates (4.45) if and only if a ∈ A^m_{ρ,0}(Tn). Accordingly, a yields the asymptotic expansion in view of (4.18) in Theorem 4.4.5.

Remark 4.11.2. By Theorem 4.11.1 it is readily seen that a periodic pseudo-differential operator A of the periodic integral operator form (4.44), that is,

Au(x) = ∫_{Tn} a(x, y) k(x − y) u(y) dy,    (4.46)

is elliptic if and only if k̂(ξ) is an elliptic symbol and a(x, x) ≠ 0 for all x ∈ Tn. Consequently in this case by Theorem 4.9.17 it is a Fredholm operator. The index is invariant under compact perturbations (see [55, Corollary 19.1.8], or [135, p. 99]), so that we can add to A any periodic pseudo-differential operator of strictly lower degree and still get an operator with the same index.

Exercise 4.11.3. Let A in (4.46) be elliptic. Show that Ind(A) = 0.

Theorem 4.11.1 implies that the principal symbol of the periodic integral operator in (4.44), viewed as a periodic pseudo-differential operator, is a(x, x) k̂(ξ). By combining Propositions 4.10.14 and 4.10.15 with this observation, we obtain another application to periodic integral operators:
Proposition 4.11.4. If a periodic pseudo-differential operator A is of the periodic integral operator form (4.44),

Au(x) = ∫_{Tn} u(y) a(x, y) k(x − y) dy,

where a(x, x) ≠ 0 for all x ∈ Tn and ∀C ∈ R ∃ξ ∈ Zn : |k̂(ξ)| > C⟨ξ⟩^l, then A H^s(Tn) ⊄ H^{s−l}(Tn) for all s ∈ R.

Remark 4.11.5. In [142] and [102] it is shown that any classical periodic pseudo-differential operator can be expressed as a sum of periodic integral operators of the type (4.44), see Remark 4.4.7. Other contributions to periodic integral operators and classical operators are made, e.g., in [34], [62], [142], and [102].
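In the pure convolution case a ≡ 1, Theorem 4.11.1 says that A acts on the Fourier side as the multiplier û(ξ) ↦ k̂(ξ)û(ξ). The discrete analogue of this convolution theorem can be checked with a naive DFT (a numerical aside; the grid size and sample data below are hypothetical):

```python
import cmath

def dft(v):
    # naive discrete Fourier transform on Z_N (stand-in for Fourier coefficients)
    N = len(v)
    return [sum(v[x] * cmath.exp(-2j * cmath.pi * x * xi / N) for x in range(N)) / N
            for xi in range(N)]

def circular_convolution(k, u):
    # (k * u)(x) = (1/N) sum_y k(x - y) u(y): discrete analogue of the
    # periodic convolution in (4.44) with a = 1
    N = len(u)
    return [sum(k[(x - y) % N] * u[y] for y in range(N)) / N for x in range(N)]

N = 16
k = [float((x * 7 + 3) % 5) for x in range(N)]   # arbitrary sample data
u = [float((x * x + 1) % 7) for x in range(N)]

lhs = dft(circular_convolution(k, u))
rhs = [a * b for a, b in zip(dft(k), dft(u))]
assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```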
4.12 Toroidal wave front sets
Here we shall briefly study microlocal analysis not on the cotangent bundle of the torus but on Tn × Zn, which is better suited for the Fourier series representations. Let us define mappings

π_{Rn} : Rn ∖ {0} → S^{n−1},  π_{Rn}(ξ) := ξ/‖ξ‖,
π_{Tn×Rn} : Tn × (Rn ∖ {0}) → Tn × S^{n−1},  π_{Tn×Rn}(x, ξ) := (x, ξ/‖ξ‖).

We set π_{Zn} = π_{Rn}|_{Zn} : Zn ∖ {0} → S^{n−1}.

Definition 4.12.1 (Discrete cones). We say that K ⊂ Rn ∖ {0} is a cone in Rn if ξ ∈ K and λ > 0 imply λξ ∈ K. We say that Γ ⊂ Zn ∖ {0} is a discrete cone if Γ = Zn ∩ K for some cone K in Rn; moreover, if this K is open then Γ is called an open discrete cone. The set S := π_{Rn}(Zn ∖ {0}) is the set of points with rational directions on the unit sphere.

Proposition 4.12.2. Γ ⊂ Zn ∖ {0} is a discrete cone if and only if Γ = Zn ∩ π_{Rn}^{−1}(π_{Rn}(Γ)).

Proof. We must show that if K is a cone in Rn then

Zn ∩ K = π_{Zn}^{−1} π_{Zn}(Zn ∩ K).

The inclusion “⊂” is obvious. Let us show the inclusion “⊃”. Let ξ ∈ π_{Zn}^{−1} π_{Zn}(Zn ∩ K). Then ξ ∈ Zn, so we need to show that ξ ∈ K. It follows that π_{Zn}(ξ) ∈ π_{Zn}(Zn ∩ K) = S ∩ π_{Rn}(K), which implies ξ ∈ π_{Zn}^{−1}(S ∩ π_{Rn}(K)) ⊂ K, completing the proof.
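Proposition 4.12.2 can be probed on a finite window of Z² (a sketch under the assumption K = the open first quadrant; directions are compared as primitive integer vectors rather than unit vectors, which identifies lattice points with equal π(ξ)):

```python
from math import gcd

def reduced_direction(xi):
    # the direction of xi in Z^2 \ {0} as a primitive lattice vector;
    # two lattice points get the same value iff their unit directions coincide
    a, b = xi
    g = gcd(abs(a), abs(b))
    return (a // g, b // g)

def in_K(xi):
    # K = open cone {x > 0, y > 0}, so Gamma = Z^2 ∩ K
    return xi[0] > 0 and xi[1] > 0

box = [(a, b) for a in range(-8, 9) for b in range(-8, 9) if (a, b) != (0, 0)]
gamma = {xi for xi in box if in_K(xi)}
directions_of_gamma = {reduced_direction(xi) for xi in gamma}

# Gamma = Z^2 ∩ pi^{-1}(pi(Gamma)), checked on the window:
for xi in box:
    assert (xi in gamma) == (reduced_direction(xi) in directions_of_gamma)
```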
Definition 4.12.3 (Toroidal wave front sets). Let u ∈ D′(Tn). The toroidal wave front set WF_T(u) ⊂ Tn × (Zn ∖ {0}) is defined as follows: we say that (x0, ξ0) ∈ Tn × (Zn ∖ {0}) does not belong to WF_T(u) if and only if there exist χ ∈ C∞(Tn) and an open discrete cone Γ ⊂ Zn ∖ {0} such that χ(x0) ≠ 0, ξ0 ∈ Γ and

∀N > 0 ∃CN < ∞ ∀ξ ∈ Γ :  |F_{Tn}(χu)(ξ)| ≤ CN ⟨ξ⟩^{−N};

in such a case we say that F_{Tn}(χu) decays rapidly in Γ.

We say that a pseudo-differential operator A ∈ Ψ^m(Tn × Zn) = Op S^m(Tn × Zn) is elliptic at the point (x0, ξ0) ∈ Tn × (Zn ∖ {0}) if its toroidal symbol σA : Tn × Zn → C satisfies

|σA(x0, ξ)| ≥ C ⟨ξ⟩^m

for some constant C > 0 as ξ → ∞, where ξ ∈ Γ and Γ ⊂ Zn ∖ {0} is an open discrete cone containing ξ0. Should ξ ↦ σA(x0, ξ) be rapidly decaying in an open discrete cone containing ξ0, then A is said to be smoothing at (x0, ξ0). The toroidal characteristic set of A ∈ Ψ^m(Tn × Zn) is

char_T(A) := {(x0, ξ0) ∈ Tn × (Zn ∖ {0}) : A is not elliptic at (x0, ξ0)},

and the toroidal wave front set of A is

WF_T(A) := {(x0, ξ0) ∈ Tn × (Zn ∖ {0}) : A is not smoothing at (x0, ξ0)}.

Proposition 4.12.4. We have WF_T(A) ∪ char_T(A) = Tn × (Zn ∖ {0}).

Proof. The statement follows because if (x, ξ) ∉ char_T(A), then A is elliptic at (x, ξ), and hence not smoothing there.

Exercise 4.12.5. Show that WF_T(A) = ∅ if and only if A is smoothing, i.e., maps D′(Tn) to C∞(Tn) (equivalently, the Schwartz kernel is smooth by Theorem 4.3.6).

Proposition 4.12.6. Let A, B ∈ Op S^m(Tn × Zn). Then WF_T(AB) ⊂ WF_T(A) ∩ WF_T(B).

Proof. By Theorem 4.7.10 applied to the pseudo-differential operators A and B we notice that the toroidal symbol of AB ∈ Op S^{2m}(Tn × Zn) has an asymptotic expansion

σ_{AB}(x, ξ) ∼ Σ_{α≥0} (1/α!) △ξ^α σA(x, ξ) D_x^{(α)} σB(x, ξ) ∼ Σ_{α≥0} (1/α!) ∂ξ^α σA(x, ξ) ∂x^α σB(x, ξ),

where in the latter expansion we have used smooth extensions of toroidal symbols. This expansion says that AB is smoothing at (x0, ξ0) if A or B is smoothing at (x0, ξ0).
The notion of the toroidal wave front set is compatible with the action of pseudo-differential operators:

Proposition 4.12.7 (Transformation of toroidal wave fronts). Let u ∈ D′(Tn) and A ∈ Op S^m_{ρ,δ}(Tn × Zn), where 0 ≤ ρ ≤ 1, 0 ≤ δ < 1. Then

WF_T(Au) ⊂ WF_T(u).

Especially, if ϕ ∈ C∞(Tn) does not vanish, then WF_T(ϕu) = WF_T(u).
Proof. Let F_{Tn}u decay rapidly in an open discrete cone Γ ⊂ Zn. Let us estimate

F_{Tn}(Au)(η) = Σ_{ξ∈Zn} σ̂_A(η − ξ, ξ) F_{Tn}u(ξ),  where  σ̂_A(η, ξ) = ∫_{Tn} e^{−i2πx·η} σA(x, ξ) dx.

Integration by parts yields

|σ̂_A(η, ξ)| ≤ CM ⟨η⟩^{−M} ⟨ξ⟩^{m+δM},

because σA ∈ S^m_{ρ,δ}(Tn × Zn). Due to the rapid decay of F_{Tn}u on Γ, we get

Σ_{ξ∈Γ} |σ̂_A(η − ξ, ξ)| |F_{Tn}u(ξ)| ≤ C_{M,N} Σ_{ξ∈Γ} ⟨η − ξ⟩^{−M} ⟨ξ⟩^{m+δM} ⟨ξ⟩^{−N}
≤ 2^M C_{M,N} ⟨η⟩^{−M} Σ_{ξ∈Γ} ⟨ξ⟩^{m+(1+δ)M−N}
≤ C'_M ⟨η⟩^{−M},

where we used Peetre's inequality and chose N large enough. Next, take an open discrete cone Γ1 ⊂ Γ such that η ∈ Γ1 and that ‖ω − ξ‖ ≥ C1 max{‖ω‖, ‖ξ‖} for all ω ∈ Γ1 and ξ ∈ Zn ∖ Γ (where C1 is a constant). Then ⟨ω − ξ⟩ ≥ C1 ⟨ω⟩^{1/k} ⟨ξ⟩^{1−1/k} for all k ∈ N. Notice that |F_{Tn}u(ξ)| ≤ CN ⟨ξ⟩^N for some positive N. Thereby

Σ_{ξ∈Zn∖Γ} |σ̂_A(η − ξ, ξ)| |F_{Tn}u(ξ)| ≤ C Σ_{ξ∈Zn∖Γ} ⟨η − ξ⟩^{−M} ⟨ξ⟩^{m+δM} ⟨ξ⟩^N
≤ C Σ_{ξ∈Zn∖Γ} ⟨η⟩^{−M/k} ⟨ξ⟩^{m+(δ−(k−1)/k)M+N}
≤ C_M ⟨η⟩^{−M/k},

where we chose (k − 1)/k > δ and then M large enough. Thus F_{Tn}(Au) decays rapidly in Γ1.

We will not pursue the complete analysis of toroidal wave front sets much further because most of their properties can be obtained from the known properties of the usual wave front sets and the following relation, where WF(u) stands for the usual Hörmander wave front set of a distribution u.
Theorem 4.12.8 (Characterisation of toroidal wave front sets). Let u ∈ D′(Tn). Then

WF_T(u) = (Tn × Zn) ∩ WF(u).

Proof. Without loss of generality, let u ∈ C^k(Tn) for some large k, and let χ ∈ C0∞(Rn) be such that supp(χ) ⊂ (0, 1)^n. If F_{Rn}(χu) decays rapidly in an open cone K ⊂ Rn then F_{Tn}(P(χu)) = F_{Rn}(χu)|_{Zn} decays rapidly in the open discrete cone Zn ∩ K. Hence WF_T(u) ⊂ (Tn × Zn) ∩ WF(u).

Next, we need to show that (Tn × Zn) ∖ WF_T(u) ⊂ (Tn × Zn) ∖ WF(u). Let (x0, ξ0) ∈ (Tn × Zn) ∖ WF_T(u) (where ξ0 ≠ 0). We must show that (x0, ξ0) ∉ WF(u). There exist χ ∈ C∞(Tn) (we may assume that supp(χ) ⊂ (0, 1)^n as above) and an open cone K ⊂ Rn such that χ(x0) ≠ 0, ξ0 ∈ Zn ∩ K and that F_{Tn}(P(χu)) decays rapidly in Zn ∩ K. Let K1 ⊂ Rn be an open cone such that ξ0 ∈ K1 ⊂ K and such that the closure satisfies K̄1 ⊂ K ∪ {0}. Take any function w ∈ C∞(S^{n−1}) such that

w(ω) = 1 if ω ∈ S^{n−1} ∩ K1,  w(ω) = 0 if ω ∈ S^{n−1} ∖ K.

Let a ∈ C∞(Rn × Rn) be independent of x and such that a(x, ξ) = w(ξ/‖ξ‖) whenever ‖ξ‖ ≥ 1. Then a ∈ S⁰(Rn × Rn). Let ã = a|_{Tn×Zn}, so that ã ∈ S⁰(Tn × Zn) by Theorem 4.5.3. By Corollary 4.6.21, we have

P(χ a(X, D)f) = P(χ) ã(X, D)(Pf) + P(Rf)

for all Schwartz test functions f, for a smoothing operator R : D′(Rn) → S(Rn). By Remark 4.6.22 we also have

P(χ a(X, D)(χu)) = P(χ) ã(X, D)(P(χu)) + P(R(χu)),

where the right-hand side belongs to C∞(Tn), since its Fourier coefficients decay rapidly on the whole of Zn. Therefore also P(χ a(X, D)(χu)) belongs to C∞(Tn). Thus χ a(X, D)(χu) ∈ C0∞(Rn). Let ξ ∈ K1 be such that ‖ξ‖ ≥ 1. Then we have

F_{Rn}(a(X, D)(χu))(ξ) = w(ξ/‖ξ‖) F_{Rn}(χu)(ξ) = F_{Rn}(χu)(ξ).

Thus F_{Rn}(χu) decays rapidly on K1. Therefore (x0, ξ0) does not belong to WF(u).

Exercise 4.12.9. Show that for every u ∈ D′(Tn) we have

WF_T(u) = ∩ { char_T(A) : A ∈ Ψ⁰(Tn × Zn), Au ∈ C∞(Tn) }.
4.13 Fourier series operators

In this section we consider analogues of Fourier integral operators on the torus Tn. We will call such operators Fourier series operators and study their composition formulae with pseudo-differential operators on the torus.

Definition 4.13.1 (Fourier series operators). Fourier series operators (FSO) are operators of the form

T u(x) := Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} a(x, y, ξ) u(y) dy,    (4.47)
where a ∈ C∞(Tn × Tn × Zn) is a toroidal amplitude and φ is a real-valued phase function such that the conditions of the following Remark 4.13.2 are satisfied.

Remark 4.13.2 (Phase functions). We note that if u ∈ C∞(Tn), for the function T u to be well defined on the torus we need that the integral (4.47) is 1-periodic in x. Therefore, by identifying functions on Tn with 1-periodic functions on Rn, we will require that the phase function φ : Rn × Zn → R is such that the function x ↦ e^{2πiφ(x,ξ)} is 1-periodic for all ξ ∈ Zn. Note that here it is not necessary that the function x ↦ φ(x, ξ) itself is 1-periodic.

Remark 4.13.3. Assume that the function φ : Rn × Zn → R is in C^k with respect to x for all ξ ∈ Zn. Assume also that the function x ↦ e^{2πiφ(x,ξ)} is 1-periodic for all ξ ∈ Zn. Differentiating it with respect to x we get that the functions x ↦ ∂x^α φ(x, ξ) are 1-periodic for all ξ ∈ Zn and all α ∈ Nn0 with 1 ≤ |α| ≤ k.

Remark 4.13.4. The operator T : C∞(Tn) → D′(Tn) in (4.47) can be justified in the usual way for oscillatory integrals. If we have more information on the symbol we have better properties, for example:

Proposition 4.13.5. Let φ ∈ C∞(Tn × Zn) be such that the function x ↦ e^{2πiφ(x,ξ)} is 1-periodic for all ξ ∈ Zn, and such that for some ℓ ∈ R we have

|∂x^α φ(x, ξ)| ≤ Cα ⟨ξ⟩^ℓ

for all multi-indices α, all x ∈ Tn and ξ ∈ Zn. Let a ∈ C∞(Tn × Tn × Zn) be such that there are m, δ1 ∈ R and δ2 < 1 such that for all multi-indices α, β we have

|∂x^α ∂y^β a(x, y, ξ)| ≤ Cαβ ⟨ξ⟩^{m+δ1|α|+δ2|β|}

for all x, y ∈ Tn and ξ ∈ Zn. Then the operator T in (4.47) is a well-defined continuous linear operator from C∞(Tn) to C∞(Tn).

Proof. Let u ∈ C∞(Tn) and let L_y be the Laplacian with respect to y. Expression (4.47) can be justified by integration by parts with the operator 𝓛_y := 1 − (4π²)^{−1} L_y, which satisfies ⟨ξ⟩^{−2} 𝓛_y e^{2πiy·ξ} = e^{2πiy·ξ}. Consequently, we interpret (4.47) as

T u(x) = Σ_{ξ∈Zn} ⟨ξ⟩^{−2N} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} 𝓛_y^N [a(x, y, ξ) u(y)] dy,    (4.48)
so that both the y-integral and the ξ-sum converge absolutely if N is large enough, in view of δ2 < 1. Consequently, T u is 1-periodic by our assumptions and by Remark 4.13.2, and (4.48) can be differentiated any number of times with respect to x to yield a function T u ∈ C∞(Tn) by Remark 4.13.3. Continuity of T on C∞(Tn) follows from Lebesgue's dominated convergence theorem on Tn × Zn (see Theorems C.3.22 and 1.1.4).

Remark 4.13.6. Thus, we will always interpret (4.47) as (4.48). Composition formulae of this section can be compared with those obtained in [94, 96] globally on Rn under minimal assumptions on phases and amplitudes. However, on the torus, the assumptions on the regularity or boundedness of higher-order derivatives of phases and amplitudes are redundant due to the fact that ξ ∈ Zn takes only discrete values.

We recall the notation for the toroidal version of Taylor polynomials and the corresponding derivatives introduced in (3.15) and (4.31), which will be used in the formulation of the following theorems. However, we need the following:

Definition 4.13.7 (Warning: operators (−Dy)^{(α)}). Before we define the operators (−Dy)^{(α)} below, we warn the reader that one should not formally plug the minus sign into the definition of the previously defined operators (Dy)^{(α)} in Definition 3.4.1! Please compare these operators with those in (4.31) and observe how the sign changes. With this warning in place, we define

(−Dy)^{(α)} = (−D_{y1})^{(α1)} · · · (−D_{yn})^{(αn)},    (4.49)

where (−D_{yj})^{(0)} = I and

(−D_{yj})^{(k+1)} = (−D_{yj})^{(k)} ( −∂/(2πi∂yj) − kI ) = ( −∂/(2πi∂yj) ) ( −∂/(2πi∂yj) − I ) · · · ( −∂/(2πi∂yj) − kI ).
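The warning in Definition 4.13.7 can be made concrete on the Fourier side: on e^{2πimy} the operator ∂/(2πi∂y) acts as multiplication by m, so D_y^{(k)} has eigenvalue m(m−1)···(m−k+1), while (−D_y)^{(k)} has eigenvalue (−m)(−m−1)···(−m−k+1), which differs from (−1)^k m(m−1)···(m−k+1) as soon as k ≥ 2. A small integer computation (a numerical aside, not from the text):

```python
def D_factorial(m, k):
    # eigenvalue of D^{(k)} on e^{2 pi i m y}: m (m-1) ... (m-k+1)
    out = 1
    for j in range(k):
        out *= m - j
    return out

def minusD_factorial(m, k):
    # eigenvalue of (-D)^{(k)} per (4.49): (-m)(-m-1)...(-m-k+1)
    out = 1
    for j in range(k):
        out *= -m - j
    return out

# the naive guess (-1)^k D^{(k)} agrees only for k <= 1
assert minusD_factorial(3, 1) == -D_factorial(3, 1)
assert minusD_factorial(3, 2) == (-3) * (-4)      # = 12
assert minusD_factorial(3, 2) != D_factorial(3, 2)  # 12 != 3 * 2 = 6
```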
We now study composition formulae of Fourier series operators with pseudo-differential operators.

Theorem 4.13.8 (Composition FSO∘ΨDO). Let φ : Rn × Zn → R be such that the function x ↦ e^{2πiφ(x,ξ)} is 1-periodic for all ξ ∈ Zn. Let T : C∞(Tn) → D′(Tn) be defined by

T u(x) := Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} a(x, y, ξ) u(y) dy,    (4.50)

where the toroidal amplitude a ∈ C∞(Tn × Tn × Zn) satisfies

|∂x^α ∂y^β a(x, y, ξ)| ≤ Cαβm ⟨ξ⟩^m
for all x, y ∈ Tn, ξ ∈ Zn and α, β ∈ Nn0. Let p ∈ S^ℓ(Tn × Zn) be a toroidal symbol and P = Op(p) the corresponding pseudo-differential operator. Then the composition T P has the form

T P u(x) = Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−z·ξ)} c(x, z, ξ) u(z) dz,

where

c(x, z, ξ) = Σ_{η∈Zn} ∫_{Tn} e^{2πi(y−z)·(η−ξ)} a(x, y, ξ) p(y, η) dy

satisfies

|∂x^α ∂z^β c(x, z, ξ)| ≤ Cαβmℓ ⟨ξ⟩^{m+ℓ}

for every x, z ∈ Tn, ξ ∈ Zn and α, β ∈ Nn0. Moreover, we have an asymptotic expansion

c(x, z, ξ) ∼ Σ_{α≥0} (1/α!) (−Dz)^{(α)} [ a(x, z, ξ) △ξ^α p(z, ξ) ].

Furthermore, if 0 ≤ δ < ρ ≤ 1, p ∈ S^ℓ_{ρ,δ}(Tn × Zn) and a ∈ A^m_{ρ,δ}(Tn), then c ∈ A^{m+ℓ}_{ρ,δ}(Tn).
Remark 4.13.9. We note that if T in (4.50) is a pseudo-differential operator with phase φ(x, ξ) = x · ξ and amplitude a(x, y, ξ) = a(x, ξ) independent of y, then the asymptotic expansion formula for the composition of two pseudo-differential operators T ∘ P becomes

c(x, z, ξ) ∼ Σ_{α≥0} (1/α!) a(x, ξ) (−Dz)^{(α)} △ξ^α p(z, ξ).
This is another representation for the composition compared to Theorem 4.7.10, with an amplitude realisation of the pseudo-differential operator T ∘ P, see Remark 4.7.12.

Proof of Theorem 4.13.8. Let us calculate the composition T P:

T P u(x) = Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} a(x, y, ξ) P u(y) dy
= Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} a(x, y, ξ) Σ_{η∈Zn} ∫_{Tn} e^{2πi(y−z)·η} p(y, η) u(z) dz dy
= Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−z·ξ)} c(x, z, ξ) u(z) dz,

where

c(x, z, ξ) = Σ_{η∈Zn} ∫_{Tn} e^{2πi(y−z)·(η−ξ)} a(x, y, ξ) p(y, η) dy.
Denote θ := η − ξ, so that by the discrete Taylor expansion (Theorem 3.3.21) we formally get

c(x, z, ξ) ∼ Σ_{α≥0} (1/α!) Σ_{θ∈Zn} ∫_{Tn} θ^{(α)} e^{2πi(y−z)·θ} a(x, y, ξ) △ξ^α p(y, ξ) dy
= Σ_{α≥0} (1/α!) Σ_{θ∈Zn} θ^{(α)} ∫_{Tn} e^{2πi(y−z)·θ} a(x, y, ξ) △ξ^α p(y, ξ) dy
= Σ_{α≥0} (1/α!) (−Dy)^{(α)} [ a(x, y, ξ) △ξ^α p(y, ξ) ] |_{y=z}.
Now we have to justify the asymptotic expansion. First we take a discrete Taylor expansion, and using Theorem 3.3.21 again we obtain

p(y, ξ + θ) = Σ_{|ω|<M} (1/ω!) θ^{(ω)} △ξ^ω p(y, ξ) + rM(y, ξ, θ).
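In one dimension the discrete Taylor expansion is exact for polynomials once M exceeds the degree, with θ^{(ω)} = θ(θ−1)···(θ−ω+1) and △ the forward difference; this can be checked directly (a numerical aside with a hypothetical sample polynomial):

```python
from math import factorial

def forward_diff(f, order):
    # iterated forward difference (Delta^order f)(xi)
    def d(xi, k=order):
        if k == 0:
            return f(xi)
        return d(xi + 1, k - 1) - d(xi, k - 1)
    return d

def falling(theta, w):
    # theta^(w) = theta (theta - 1) ... (theta - w + 1)
    out = 1
    for j in range(w):
        out *= theta - j
    return out

p = lambda xi: xi ** 3 - 2 * xi + 5   # sample symbol in xi, degree 3
M = 4                                  # M > deg p: the remainder vanishes

for xi in range(-3, 4):
    for theta in range(-5, 6):
        # falling(theta, w) is divisible by w!, so // is exact
        taylor = sum((falling(theta, w) // factorial(w)) * forward_diff(p, w)(xi)
                     for w in range(M))
        assert taylor == p(xi + theta)
```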
Let Q(θ) := {ν ∈ Zn : |νj| ≤ |θj| for all j = 1, …, n}, as in Theorem 3.3.21. Then by Peetre's inequality (Proposition 3.3.31) we have

|△ξ^α ∂y^β rM(y, ξ, θ)| ≤ C ⟨θ⟩^M max_{|ω|=M, ν∈Q(θ)} |△ξ^{α+ω} ∂y^β p(y, ξ + ν)|
≤ C ⟨θ⟩^M max_{ν∈Q(θ)} ⟨ξ + ν⟩^{ℓ−|α|−M}
≤ C ⟨θ⟩^M max_{ν∈Q(θ)} ⟨ν⟩^{|ℓ−|α|−M|} ⟨ξ⟩^{ℓ−|α|−M}
≤ C ⟨θ⟩^{2M+|ℓ|+|α|} ⟨ξ⟩^{ℓ−|α|−M}.
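Peetre's inequality, used above in the form ⟨ξ+ν⟩^s ≤ 2^{|s|} ⟨ν⟩^{|s|} ⟨ξ⟩^s with ⟨ξ⟩ = (1 + ‖ξ‖²)^{1/2}, can be probed on a grid of lattice points (a numerical aside; the constant 2^{|s|} is one standard normalisation):

```python
from itertools import product

def jb(xi):
    # Japanese bracket <xi> = (1 + |xi|^2)^(1/2)
    return (1.0 + sum(t * t for t in xi)) ** 0.5

# Peetre: <xi + nu>^s <= 2^|s| <nu>^|s| <xi>^s, for s of either sign
for s in (-3.0, -1.0, 0.5, 2.0):
    for xi, nu in product(product(range(-6, 7), repeat=2), repeat=2):
        lhs = jb([a + b for a, b in zip(xi, nu)]) ** s
        rhs = 2.0 ** abs(s) * jb(nu) ** abs(s) * jb(xi) ** s
        assert lhs <= rhs * (1 + 1e-12)
```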
The corresponding remainder term in the asymptotic expansion of c(x, z, ξ) is

RM(x, z, ξ) = Σ_{θ∈Zn} ∫_{Tn} e^{2πi(y−z)·θ} a(x, y, ξ) rM(y, ξ, θ) dy
= Σ_{θ∈Zn} ∫_{Tn} e^{2πi(y−z)·θ} ⟨θ⟩^{−2N} (I − (4π²)^{−1} Ly)^N [ a(x, y, ξ) rM(y, ξ, θ) ] dy,

where we integrated by parts exploiting that

(I − (4π²)^{−1} Ly) e^{2πi(y−z)·θ} = ⟨θ⟩² e^{2πi(y−z)·θ},
where Ly is the Laplacian with respect to y. Thus we get the estimate

|△ξ^α ∂x^β ∂z^γ RM(x, z, ξ)| ≤ C Σ_{θ∈Zn} ⟨θ⟩^{|γ|−2N+2M+|ℓ|+|α|} ⟨ξ⟩^{m+ℓ−M},

and we take N ∈ N so large that this sum over θ converges. Hence

|△ξ^α ∂x^β ∂z^γ RM(x, z, ξ)| ≤ C ⟨ξ⟩^{m+ℓ−M}.
This completes the proof of the first part of the theorem. Finally, we assume that a ∈ A^m_{ρ,δ}(Tn). Then also the terms in the asymptotic expansion and the remainder RM have corresponding decay properties in the ξ-differences, leading to the amplitude c ∈ A^{m+ℓ}_{ρ,δ}(Tn). This completes the proof.

Exercise 4.13.10. Work out all the details of the proof in the (ρ, δ)-case.

We now formulate the theorem about compositions of operators in the opposite order.

Theorem 4.13.11 (Composition ΨDO∘FSO). Let φ : Rn × Zn → R be such that x ↦ e^{2πiφ(x,ξ)} is 1-periodic for all ξ ∈ Zn. Let T : C∞(Tn) → D′(Tn) be such that

T u(x) := Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−y·ξ)} a(x, y, ξ) u(y) dy,

where a ∈ C∞(Tn × Tn × Zn) satisfies |∂x^α ∂y^β a(x, y, ξ)| ≤ Cαβm ⟨ξ⟩^m for all x, y ∈ Tn, ξ ∈ Zn and α, β ∈ Nn0. Assume that for some C > 0 we have

C^{−1} ⟨ξ⟩ ≤ ‖∇x φ(x, ξ)‖ ≤ C ⟨ξ⟩    (4.51)
for all x ∈ Tn, ξ ∈ Zn, and that

|∂x^α φ(x, ξ)| ≤ Cα ⟨ξ⟩,  |∂x^α △ξ^β φ(x, ξ)| ≤ Cαβ    (4.52)

for all x ∈ Tn, ξ ∈ Zn and α, β ∈ Nn0 with |β| = 1. Let p̃ ∈ S^ℓ(Tn × Zn) be a toroidal symbol and let p(x, ξ) denote an extension of p̃(x, ξ) to a symbol in S^ℓ(Tn × Rn) as given in Theorem 4.5.3. Let P = Op(p) be the corresponding pseudo-differential operator. Then

P T u(x) = Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−z·ξ)} c(x, z, ξ) u(z) dz,

where we have

|∂x^α ∂z^β c(x, z, ξ)| ≤ Cαβ ⟨ξ⟩^{m+ℓ}
for every x, z ∈ Tn, ξ ∈ Zn and α, β ∈ Nn0. Moreover, we have the asymptotic expansion

c(x, z, ξ) ∼ Σ_{α≥0} ((2πi)^{−|α|}/α!) ∂η^α p(x, η)|_{η=∇xφ(x,ξ)} ∂y^α [ e^{2πiΨ(x,y,ξ)} a(y, z, ξ) ] |_{y=x},    (4.53)

where Ψ(x, y, ξ) := φ(y, ξ) − φ(x, ξ) + (x − y) · ∇x φ(x, ξ).

Remark 4.13.12. Let us make some remarks about quantities appearing in the asymptotic expansion formula (4.53). It is geometrically reasonable to evaluate the symbol p̃(x, ξ) at the real Hamiltonian flow generated by the phase function φ of the Fourier series operator T. This is the main complication compared with pseudo-differential operators, for which we have Proposition 4.12.7. However, although a priori the symbol p̃ is defined only on Tn × Zn, we can still extend it to a symbol p(x, ξ) on Tn × Rn by Theorem 4.5.3, so that the restriction ∂η^α p(x, η)|_{η=∇xφ(x,ξ)} makes sense. We also note that the function Ψ(x, y, ξ) cannot in general be considered as a function on Tn × Tn × Zn because it may not be periodic in x and y. However, we can still observe that the derivatives ∂y^α [ e^{2πiΨ(x,y,ξ)} a(y, z, ξ) ] |_{y=x} are periodic in x and z, so all terms on the right-hand side of (4.53) are well-defined functions on Tn × Tn × Zn. In any case, for a standard theory of Fourier integral operators on Rn we refer the reader to [56].

Remark 4.13.13. In Theorem 4.13.11, we note that if φ ∈ S^1_{ρ,δ}(Rn × Rn), p̃ ∈ S^ℓ_{ρ,δ}(Tn × Zn), a ∈ A^m_{ρ,δ}(Tn), and 0 ≤ δ < ρ ≤ 1, then we also have c ∈ A^{m+ℓ}_{ρ,δ}(Tn).

Exercise 4.13.14. Prove this remark.

Proof of Theorem 4.13.11. To simplify the notation, let us drop writing the tilde on p, and denote both symbols p̃ and p by the same letter p. There should be no confusion since they coincide on Tn × Zn. Let P = Op(p). We can write
P T u(x) = Σ_{η∈Zn} ∫_{Tn} e^{2πi(x−y)·η} p(x, η) T u(y) dy
= Σ_{η∈Zn} ∫_{Tn} e^{2πi(x−y)·η} p(x, η) Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(y,ξ)−z·ξ)} a(y, z, ξ) u(z) dz dy
= Σ_{ξ∈Zn} ∫_{Tn} e^{2πi(φ(x,ξ)−z·ξ)} c(x, z, ξ) u(z) dz,

where

c(x, z, ξ) = Σ_{η∈Zn} p(x, η) ∫_{Tn} e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·η)} a(y, z, ξ) dy.
Let us fix some x ∈ Rn, with corresponding equivalence class [x] ∈ Tn which we still denote by x. Let V ⊂ Rn be an open cube with side length equal to 1, centred at x. Let χ = χ(x, y) ∈ C∞(Tn × Tn) be such that 0 ≤ χ ≤ 1, χ(x, y) = 1 for ‖x − y‖ < κ for some sufficiently small κ > 0, and such that supp χ(x, ·) ∩ V ⊂ V. The last condition means that χ(x, ·)|V ∈ C0∞(V) is supported away from the boundary of the cube V. Let c(x, z, ξ) = cI(x, z, ξ) + cII(x, z, ξ), where

cI(x, z, ξ) = Σ_{η∈Zn} ∫_{Tn} e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·η)} (1 − χ(x, y)) a(y, z, ξ) p(x, η) dy

and

cII(x, z, ξ) = Σ_{η∈Zn} ∫_{Tn} e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·η)} χ(x, y) a(y, z, ξ) p(x, η) dy.
1. Estimate on the support of 1 − χ. By making a decomposition into cones (sectors) centred at x viewed as a point in Rn, it follows that we can assume without loss of generality that the support of 1 − χ is contained in a set where C < |xj − yj|, for some 1 ≤ j ≤ n. In turn, because of the assumption on the support of χ(x, ·)|V it follows that C < |xj − yj| < 1 − C, for some C > 0. Now we are going to apply the summation by parts formula (3.14) to estimate cI(x, z, ξ). First we notice that

△_{ηj} e^{2πi(x−y)·η} = e^{2πi(x−y)·(η+ej)} − e^{2πi(x−y)·η} = e^{2πi(x−y)·η} ( e^{2πi(xj−yj)} − 1 ) ≠ 0

on supp(1 − χ). Hence by the summation by parts formula (3.14) we get that

Σ_{η∈Zn} e^{2πi(x−y)·η} p(x, η) = ( e^{2πi(xj−yj)} − 1 )^{−N1} Σ_{η∈Zn} e^{2πi(x−y)·η} △_{ηj}^{N1} p(x, η),

where the sum on the right-hand side converges absolutely for large enough N1. On the other hand, we can integrate by parts with the operator

ᵗL_y = ( 1 − (4π²)^{−1} Ly ) / ( ⟨∇y φ(y, ξ)⟩² − (2π)^{−1} i Ly φ(y, ξ) ),

where Ly is the Laplace operator with respect to y, and for which we have L_y^{N2} e^{2πiφ(y,ξ)} = e^{2πiφ(y,ξ)}. Note that in view of our assumption (4.51) on φ,

we have |⟨∇y φ(y, ξ)⟩² − (2π)^{−1} i Ly φ(y, ξ)| ≥ |∇y φ(y, ξ)|² ≥ C1 ⟨ξ⟩².
Therefore,

c^I(x, z, ξ) = Σ_{η∈Z^n} ∫_{T^n} e^{2πi(φ(y,ξ)−φ(x,ξ)+x·η)} (ᵗL_y)^{N_2} [ (e^{2πi(x_j−y_j)} − 1)^{−N_1} e^{−2πiy·η} Δ_{η_j}^{N_1} p(x, η) (1 − χ(x, y)) a(y, z, ξ) ] dy.
From the properties of amplitudes, we get

|c^I(x, z, ξ)| ≤ C Σ_{η∈Z^n} ∫_{V ∩ {c < |x_j−y_j| < 1−c}} ⟨ξ⟩^{m−2N_2} ⟨η⟩^{ℓ+2N_2−N_1} dy ≤ C ⟨ξ⟩^{−N}

for all N, if we choose large enough N_2 and then large enough N_1; here ℓ denotes the order of the symbol p. We can easily see that similar estimates work for the derivatives of c^I, completing the proof on the support of 1 − χ.

2. Estimate on the support of χ. Extending p̃ ∈ S^ℓ(T^n × Z^n) to a symbol p ∈ S^ℓ(T^n × R^n) as in Theorem 4.5.3, we will make its usual Taylor expansion at η = ∇_x φ(x, ξ), so that we have

p(x, η) = Σ_{|α|<M} ((η − ∇_x φ(x, ξ))^α / α!) ∂_ξ^α p(x, ∇_x φ(x, ξ)) + Σ_{|α|=M} C_α (η − ∇_x φ(x, ξ))^α r_α(x, ξ, η − ∇_x φ(x, ξ)),

where

r_α(x, ξ, η − ∇_x φ(x, ξ)) = ∫_0^1 (1 − t)^{M−1} ∂_ξ^α p(x, tη + (1 − t)∇_x φ(x, ξ)) dt.
Then

c^{II}(x, z, ξ) = Σ_{|α|<M} (1/α!) c_α(x, z, ξ) + Σ_{|α|=M} C_α R_α(x, z, ξ),

where

R_α(x, z, ξ) = Σ_{η∈Z^n} ∫_{T^n} e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·η)} (η − ∇_x φ(x, ξ))^α χ(x, y) r_α(x, ξ, η − ∇_x φ(x, ξ)) a(y, z, ξ) dy,

c_α(x, z, ξ) = Σ_{η∈Z^n} ∫_{T^n} e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·η)} (η − ∇_x φ(x, ξ))^α χ(x, y) a(y, z, ξ) ∂_ξ^α p(x, ∇_x φ(x, ξ)) dy.
Now using Corollary 4.6.16 we can calculate

c_α(x, z, ξ) = ∂_ξ^α p(x, ∇_x φ(x, ξ)) [D_y − ∇_x φ(x, ξ)]^α ( e^{2πi(φ(y,ξ)−φ(x,ξ))} χ(x, y) a(y, z, ξ) ) |_{y=x}
  = ∂_ξ^α p(x, ∇_x φ(x, ξ)) ∫_{R^n} ∫_V e^{2πi(x−y)·η} [η − ∇_x φ(x, ξ)]^α e^{2πi(φ(y,ξ)−φ(x,ξ))} χ(x, y) a(y, z, ξ) dy dη
  = ∂_ξ^α p(x, ∇_x φ(x, ξ)) { D_y^α [ e^{2πi(φ(y,ξ)−φ(x,ξ)+(x−y)·∇_x φ(x,ξ))} χ(x, y) a(y, z, ξ) ] } |_{y=x},

where we wrote the derivative [D_y − ∇_x φ(x, ξ)]^α as a pseudo-differential operator with symbol [η − ∇_x φ(x, ξ)]^α, x, ξ ∈ R^n, and changed the variables η → η + ∇_x φ(x, ξ). Since χ is identically equal to one for y near x, we obtain the asymptotic formula (4.53), once the remainders R_α are analysed.

3. Estimates for the remainder. Let us first write the remainder in the form

R_α(x, z, ξ) = Σ_{η∈Z^n} ∫_{T^n} e^{2πi(x−y)·η} r_α(x, ξ, η − ∇_x φ(x, ξ)) (η − ∇_x φ(x, ξ))^α g(y) dy,    (4.54)

with g(y) = e^{2πi(φ(y,ξ)−φ(x,ξ))} χ(x, y) a(y, z, ξ), which is a 1-periodic function of y. Now, we can use Corollary 4.6.16 to conclude that R_α(x, z, ξ) in (4.54) is equal to the periodisation with respect to x, R_α(x, z, ξ) = P_x S_α(x, z, ξ), where

S_α(x, z, ξ) = r_α(x, ξ, D_y − ∇_x φ(x, ξ)) (D_y − ∇_x φ(x, ξ))^α g(y) |_{y=x}
  = ∫_{R^n} ∫_V e^{2πi(x−y)·η} r_α(x, ξ, η − ∇_x φ(x, ξ)) (η − ∇_x φ(x, ξ))^α g(y) dy dη
  = ∫_{R^n} ∫_V e^{2πi(x−y)·θ} e^{2πiΨ(x,y,ξ)} θ^α χ(x, y) a(y, z, ξ) r_α(x, ξ, θ) dy dθ,

where we changed the variables by θ = η − ∇_x φ(x, ξ), where

Ψ(x, y, ξ) := φ(y, ξ) − φ(x, ξ) + (x − y) · ∇_x φ(x, ξ),

and

r_α(x, ξ, θ) = ∫_0^1 (1 − t)^{M−1} ∂_ξ^α p(x, ∇_x φ(x, ξ) + tθ) dt.

Since the periodisation P_x does not change the orders in z and ξ, it is enough to derive the required estimates for S_α(x, z, ξ).
Let ρ ∈ C_0^∞(R^n) be such that ρ(θ) = 1 for |θ| < ε/2 and ρ(θ) = 0 for |θ| > ε, for some small ε > 0 to be chosen later. We decompose S_α(x, z, ξ) = S_α^I(x, z, ξ) + S_α^{II}(x, z, ξ), where

S_α^I(x, z, ξ) = ∫_{R^n} ∫_V e^{2πi(x−y)·θ} ρ(θ/⟨ξ⟩) r_α(x, ξ, θ) D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] dy dθ,

S_α^{II}(x, z, ξ) = ∫_{R^n} ∫_V e^{2πi(x−y)·θ} (1 − ρ(θ/⟨ξ⟩)) r_α(x, ξ, θ) D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] dy dθ.

3.1. Estimate for |θ| ≤ ε⟨ξ⟩. For sufficiently small ε > 0 and for any 0 ≤ t ≤ 1, ⟨∇_x φ(x, ξ) + tθ⟩ and ⟨ξ⟩ are equivalent. Indeed, if we use the inequalities ⟨z⟩ ≤ 1 + |z| ≤ √2 ⟨z⟩, we get

⟨∇_x φ(x, ξ) + tθ⟩ ≤ 1 + |∇_x φ(x, ξ)| + |θ| ≤ (√2 C_2 + ε) ⟨ξ⟩,
√2 ⟨∇_x φ(x, ξ) + tθ⟩ ≥ 1 + |∇_x φ(x, ξ) + tθ| ≥ 1 + |∇_x φ(x, ξ)| − |θ| ≥ ⟨∇_x φ(x, ξ)⟩ − ε⟨ξ⟩ ≥ (C_1 − ε) ⟨ξ⟩,

so we will take ε < C_1. This equivalence means that for |θ| ≤ ε⟨ξ⟩ the function r_α(x, ξ, θ) is dominated by ⟨ξ⟩^{ℓ−|α|}, since p ∈ S^ℓ(T^n × R^n). We will need two auxiliary estimates. The first estimate,

|∂_θ^γ [ ρ(θ/⟨ξ⟩) r_α(x, ξ, θ) ]| ≤ C Σ_{δ≤γ} |∂_θ^δ ρ(θ/⟨ξ⟩)| |∂_θ^{γ−δ} r_α(x, ξ, θ)|
  ≤ C Σ_{δ≤γ} ⟨ξ⟩^{−|δ|} ⟨ξ⟩^{ℓ−|α|−|γ−δ|}
  ≤ C ⟨ξ⟩^{ℓ−|α|−|γ|},    (4.55)

follows from the properties of r_α. Before we state the second estimate, let us analyse the structure of ∂_y^α e^{2πiΨ(x,y,ξ)}. It has at most |α| powers of terms ∇_y φ(y, ξ) − ∇_x φ(x, ξ), possibly also multiplied by at most |α| higher-order derivatives ∂_y^δ φ(y, ξ). The product of the terms of the form ∇_y φ(y, ξ) − ∇_x φ(x, ξ) can be estimated by C(|y − x| ⟨ξ⟩)^{|α|}. The terms containing no difference ∇_y φ(y, ξ) − ∇_x φ(x, ξ) are the products of at most |α|/2 terms of the type ∂_y^δ φ(y, ξ), and the product of all such terms can be estimated by C ⟨ξ⟩^{|α|/2}. Altogether, we obtain the estimate

|∂_y^α e^{2πiΨ(x,y,ξ)}| ≤ C_α (1 + ⟨ξ⟩ |y − x|)^{|α|} ⟨ξ⟩^{|α|/2}.
The second auxiliary estimate now is

| D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] | ≤ C_α (1 + ⟨ξ⟩ |y − x|)^{|α|} ⟨ξ⟩^{|α|/2 + m}.    (4.56)

Now we are ready to prove a necessary estimate for S_α^I(x, z, ξ). Let

L_θ = (1 − (4π²)^{−1} ⟨ξ⟩² ℒ_θ) / (1 + ⟨ξ⟩² |x − y|²),    L_θ^N e^{2πi(x−y)·θ} = e^{2πi(x−y)·θ},

where ℒ_θ is the Laplace operator with respect to θ. Integration by parts with L_θ yields

S_α^I(x, z, ξ) = ∫_{R^n} ∫_V ( e^{2πi(x−y)·θ} / (1 + ⟨ξ⟩² |x − y|²)^N ) (1 − (4π²)^{−1} ⟨ξ⟩² ℒ_θ)^N { ρ(θ/⟨ξ⟩) r_α(x, ξ, θ) D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] } dy dθ
  = Σ_{|γ|≤2N} C_γ ⟨ξ⟩^{|γ|} ∫_{R^n} ∫_V ( e^{2πi(x−y)·θ} / (1 + ⟨ξ⟩² |x − y|²)^N ) D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] ∂_θ^γ [ ρ(θ/⟨ξ⟩) r_α(x, ξ, θ) ] dy dθ.

Using estimates (4.55), (4.56) and the fact that the measure of the support of the function θ ↦ ρ(θ/⟨ξ⟩) is estimated by (ε⟨ξ⟩)^n, we obtain the estimate

|S_α^I(x, z, ξ)| ≤ C Σ_{|γ|≤2N} ⟨ξ⟩^{n+|γ|+|α|/2+m} ⟨ξ⟩^{ℓ−|α|−|γ|} ∫_V (1 + ⟨ξ⟩ |y − x|)^{|α|} (1 + ⟨ξ⟩² |x − y|²)^{−N} dy ≤ C ⟨ξ⟩^{ℓ+m+n−|α|/2},

if we choose N large enough, e.g., N ≥ M = |α|. Each derivative of S_α^I(x, z, ξ) with respect to x or ξ gives an extra power of θ under the integral. Integrating by parts, this amounts to taking more y-derivatives, giving a higher power of ⟨ξ⟩. However, this is not a problem if, for the estimate for a given number of derivatives of the remainder S_α^I(x, z, ξ), we choose M = |α| sufficiently large.

3.2. Estimate for |θ| > ε⟨ξ⟩. Let us define

ω(x, y, ξ, θ) := (x − y) · θ + Ψ(x, y, ξ) = (x − y) · (∇_x φ(x, ξ) + θ) + φ(y, ξ) − φ(x, ξ).
From (4.51) and (4.52) we have

|∇_y ω| ≤ |θ| + |∇_y φ − ∇_x φ| ≤ 2C_2 (|θ| + ⟨ξ⟩),
|∇_y ω| ≥ |θ| − |∇_y φ − ∇_x φ| ≥ |θ| − C_0 |x − y| ⟨ξ⟩ ≥ (1/2)|θ| + ((ε/4) − C_0 κ) ⟨ξ⟩ ≥ C (|θ| + ⟨ξ⟩),    (4.57)

if we choose κ < ε/(4C_0), since |x − y| < κ on the support of χ in V (recall that we were free to choose κ > 0), and since |θ| ≥ (ε/2)⟨ξ⟩ on the support of 1 − ρ(θ/⟨ξ⟩). Let us write

σ_{γ_1}(x, y, ξ) := e^{−2πiΨ(x,y,ξ)} D_y^{γ_1} e^{2πiΨ(x,y,ξ)}.

For any ν we have an estimate

|∂_y^ν σ_{γ_1}(x, y, ξ)| ≤ C ⟨ξ⟩^{|γ_1|},    (4.58)

because of our assumption (4.52) that |∂_y^ν φ(y, ξ)| ≤ C_ν ⟨ξ⟩. For M = |α| > ℓ we also observe that

|r_α(x, ξ, θ)| ≤ C_α,    |∂_y^ν a(y, z, ξ)| ≤ C_ν ⟨ξ⟩^m.    (4.59)

Let us take

L_y = (2πi)^{−1} |∇_y ω|^{−2} Σ_{j=1}^n (∂_{y_j} ω) ∂_{y_j},

so that L_y e^{2πiω} = e^{2πiω}, and integrate by parts with its transpose ᵗL_y. It can be shown by induction that the operator L_y^N has the form

L_y^N = |∇_y ω|^{−4N} Σ_{|ν|≤N} P_{ν,N} ∂_y^ν,    P_{ν,N} = Σ_{|μ|=2N} c_{νμδ_j} (∇_y ω)^μ ∂_y^{δ_1} ω ⋯ ∂_y^{δ_N} ω,

where |μ| = 2N, |δ_j| ≥ 1, and Σ_{j=1}^N |δ_j| + |ν| = 2N. It follows from (4.52) and (4.57) that |P_{ν,N}| ≤ C(|θ| + ⟨ξ⟩)^{3N}, since for all δ_j we have |∂_y^{δ_j} ω| ≤ C(|θ| + ⟨ξ⟩). By the Leibniz formula we have

S_α^{II}(x, z, ξ) = ∫_{R^n} ∫_V e^{2πi(x−y)·θ} (1 − ρ(θ/⟨ξ⟩)) r_α(x, ξ, θ) D_y^α [ e^{2πiΨ(x,y,ξ)} χ(x, y) a(y, z, ξ) ] dy dθ
  = ∫_{R^n} ∫_V e^{2πiω(x,y,ξ,θ)} (1 − ρ(θ/⟨ξ⟩)) r_α(x, ξ, θ) Σ_{γ_1+γ_2+γ_3=α} σ_{γ_1}(x, y, ξ) D_y^{γ_2} χ(x, y) D_y^{γ_3} a(y, z, ξ) dy dθ
  = ∫_{R^n} ∫_V e^{2πiω(x,y,ξ,θ)} |∇_y ω|^{−4N} (1 − ρ(θ/⟨ξ⟩)) r_α(x, ξ, θ) Σ_{|ν|≤N} P_{ν,N}(x, y, ξ, θ) Σ_{γ_1+γ_2+γ_3=α} ∂_y^ν [ σ_{γ_1}(x, y, ξ) D_y^{γ_2} χ(x, y) D_y^{γ_3} a(y, z, ξ) ] dy dθ.
It follows now from (4.58) and (4.59) that

|S_α^{II}(x, z, ξ)| ≤ C ∫_{|θ| > ε⟨ξ⟩/2} (|θ| + ⟨ξ⟩)^{−N} ⟨ξ⟩^{|α|} ⟨ξ⟩^m dθ ≤ C ⟨ξ⟩^{m+|α|+n−N},

which yields the desired estimate if we take large enough N. For the derivatives of S_α^{II}(x, z, ξ), similarly to Part 3.1 for S_α^I, we get extra powers of θ, which can be taken care of by choosing large N. The proof of Theorem 4.13.11 is now complete. □

Remark 4.13.15. Note that we could also use the following asymptotic expansion for c, based on the discrete Taylor expansion from Theorem 3.3.21:

c(x, z, ξ) ∼ Σ_{α≥0} (1/α!) [Δ_ω^α p(x, ω)]_{ω=∇_x φ(x,ξ)} Σ_{θ∈Z^n} θ^{(α)} ∫_{T^n} e^{2πi(Ψ(x,y,ξ)+(x−y)·θ)} a(y, z, ξ) dy
  = Σ_{α≥0} (1/α!) [Δ_ω^α p(x, ω)]_{ω=∇_x φ(x,ξ)} Σ_{θ∈Z^n} θ^{(α)} ∫_{T^n} e^{2πi(x−y)·θ} e^{2πiΨ(x,y,ξ)} a(y, z, ξ) dy
  = Σ_{α≥0} (1/α!) [Δ_ω^α p(x, ω)]_{ω=∇_x φ(x,ξ)} D_y^{(α)} [ e^{2πiΨ(x,y,ξ)} a(y, z, ξ) ] |_{y=x}.

Exercise 4.13.16. Justify this expansion to obtain yet another composition formula.
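In one dimension the discrete Taylor expansion behind Remark 4.13.15 reduces to Newton's forward-difference formula, which is exact on polynomials. The following sketch verifies this; the cubic test polynomial is an arbitrary illustrative choice.

```python
from math import comb, factorial

# Newton's forward-difference formula (the 1D discrete Taylor expansion):
# for a polynomial p of degree < M,
#   p(theta) = sum_{k<M} theta^(k) / k! * (Delta^k p)(0),
# where theta^(k) = theta(theta-1)...(theta-k+1) is the falling factorial
# and Delta is the forward difference.
def p(n):
    return 2 * n ** 3 - 5 * n + 7

M = 4                                       # deg p = 3 < M, so the formula is exact

# iterated differences: (Delta^k p)(0) = sum_j (-1)^(k-j) C(k,j) p(j)
dk = [sum((-1) ** (k - j) * comb(k, j) * p(j) for j in range(k + 1)) for k in range(M)]

def falling(theta, k):                      # falling factorial theta^(k)
    out = 1
    for i in range(k):
        out *= theta - i
    return out

for theta in range(-5, 6):
    assert sum(falling(theta, k) * dk[k] // factorial(k) for k in range(M)) == p(theta)
```

Since θ^{(k)}/k! is the binomial coefficient C(θ, k), all arithmetic above stays in exact integers.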
4.14
Boundedness of Fourier series operators on L2 (Tn )
In Theorem 4.8.1 we proved the boundedness of operators on L²(T^n) in terms of estimates on their symbols. In particular, in applications it is important to know how many derivatives (or differences, in the present toroidal approach) of the symbol must be estimated for the boundedness of the operator. In this section we present the L²(T^n)-boundedness theorem for Fourier series operators, also paying attention to the number of derivatives required of the amplitude. However, first we need an auxiliary result which is of great importance in its own right. The following statement is a modification of the well-known Cotlar lemma, taking into account the fact that the operators in our application (Theorem 4.14.2), especially the Fourier transform on the torus, act on functions on different Hilbert spaces. The proof below follows [118, p. 280], but there is a difference in how we estimate operator norms, because we cannot immediately replace the operator S by S*S in the estimates since these operators act on functions on different spaces.
Theorem 4.14.1 (Cotlar's lemma in Hilbert spaces). Let H, G be Hilbert spaces. Assume that a family of bounded linear operators {S_j : H → G}_{j∈Z^r} and positive constants {γ(j)}_{j∈Z^r} satisfy

‖S_l* S_k‖_{H→H} ≤ [γ(l − k)]²,    ‖S_l S_k*‖_{G→G} ≤ [γ(l − k)]²,

and

A = Σ_{j∈Z^r} γ(j) < ∞.

Then the operator S = Σ_{j∈Z^r} S_j satisfies ‖S‖_{H→G} ≤ A.

Proof. First let us assume that there are only finitely many (say N) non-zero operators S_j. We want to establish an estimate uniform in N and then pass to the limit. We observe that we have the estimate ‖S‖² ≤ ‖S*S‖ for operator norms, because we can estimate

‖S‖²_{H→G} = sup_{‖f‖_H ≤ 1} (Sf, Sf)_G = sup_{‖f‖_H ≤ 1} (S*Sf, f)_H ≤ ‖S*S‖_{H→H}.

For any k ∈ N and B ∈ L(H) we have ‖B‖^{2^k} = ‖(B*B)^{2^{k−1}}‖, which follows inductively from ‖B‖² = ‖B*B‖. Thus if m = 2^{k−1} and B = S*S, then ‖S*S‖^m = ‖(S*S)^m‖, so we can conclude

‖S‖^{2m}_{H→G} ≤ ‖S*S‖^m_{H→H} = ‖(S*S)^m‖_{H→H} = ‖ Σ_{i_1,…,i_{2m}} S_{i_1}* S_{i_2} ⋯ S_{i_{2m−1}}* S_{i_{2m}} ‖_{H→H}.    (4.60)

Now, we can group products in the sum in different ways. Grouping the terms in the last product as (S_{i_1}* S_{i_2})(S_{i_3}* S_{i_4}) ⋯ (S_{i_{2m−1}}* S_{i_{2m}}), we can estimate

‖S_{i_1}* S_{i_2} ⋯ S_{i_{2m−1}}* S_{i_{2m}}‖_{H→H} ≤ γ(i_1 − i_2)² γ(i_3 − i_4)² ⋯ γ(i_{2m−1} − i_{2m})².    (4.61)

Alternatively, grouping them as S_{i_1}* (S_{i_2} S_{i_3}*) ⋯ (S_{i_{2m−2}} S_{i_{2m−1}}*) S_{i_{2m}}, we can estimate

‖S_{i_1}* S_{i_2} ⋯ S_{i_{2m−1}}* S_{i_{2m}}‖_{H→H} ≤ A² γ(i_2 − i_3)² γ(i_4 − i_5)² ⋯ γ(i_{2m−2} − i_{2m−1})².    (4.62)

Taking the geometric mean of (4.61) and (4.62) and using it in (4.60), we get the estimate

‖S‖^{2m}_{H→G} ≤ Σ_{i_1,…,i_{2m}} A γ(i_1 − i_2) γ(i_2 − i_3) ⋯ γ(i_{2m−1} − i_{2m}).

Now, taking the sum first with respect to i_1 and using that Σ_{i_1} γ(i_1 − i_2) ≤ A, then taking the sum with respect to i_2, etc., we can estimate ‖S‖^{2m}_{H→G} ≤ A^{2m} Σ_{i_{2m}} 1. Now, if there are only N non-zero S_i's, we obtain the estimate

‖S‖_{H→G} ≤ A N^{1/(2m)},

which proves the statement if we let m → ∞. Since this conclusion is uniform over N, the proof is complete. □

We recall that in the analysis in this chapter we wrote 2π in the exponential to ensure that the functions e^{2πix·ξ} are 1-periodic. In this section, the only function that occurs in the exponential is φ(x, k), and so we do not need to keep writing 2π in the exponential.

Theorem 4.14.2 (Fourier series operators on L²(T^n)). Let T : C^∞(T^n) → D′(T^n) be defined by

T u(x) = Σ_{k∈Z^n} e^{iφ(x,k)} a(x, k) (F_{T^n} u)(k),
where φ : R^n × Z^n → R and a : T^n × Z^n → C. Assume that the function x ↦ e^{iφ(x,ξ)} is 1-periodic for every ξ ∈ Z^n, and that for all |α| ≤ 2n + 1 and |β| = 1 we have

|∂_x^α a(x, k)| ≤ C    and    |∂_x^α Δ_k^β φ(x, k)| ≤ C    (4.63)

for all x ∈ T^n and k ∈ Z^n. Assume also that

|∇_x φ(x, k) − ∇_x φ(x, l)| ≥ C |k − l|    (4.64)

for all x ∈ T^n and k, l ∈ Z^n. Then T extends to a bounded linear operator from L²(T^n) to L²(T^n).

Remark 4.14.3. Note that condition (4.64) is a discrete version of the usual local graph condition for Fourier integral operators, necessary for the local L² boundedness. We also note that bounds on the higher-order differences of the phase and amplitude would follow automatically from condition (4.63). Therefore, this theorem relaxes the assumptions on the behaviour with respect to the dual variable, compared, for example, with the corresponding global result for Fourier integral operators on R^n in [95].
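The simplest instance of such an operator can be sketched numerically. Taking a(x, k) = 1 and φ(x, k) = 2πx·k + τ(k) (a pure phase modulation of the Fourier coefficients, an illustrative special case only, far weaker than the theorem), T is an exact L² isometry:

```python
import numpy as np

# 1D Fourier series operator T u(x) = sum_k e^{i phi(x,k)} a(x,k) (F u)(k)
# with a = 1 and phi(x,k) = 2 pi x k + tau(k): a phase modulation of the
# Fourier coefficients, hence an exact isometry on L^2(T^1).
N = 256
rng = np.random.default_rng(0)
u = rng.standard_normal(N)                 # samples of u on the grid j/N

uhat = np.fft.fft(u) / N                   # toroidal Fourier coefficients
k = np.fft.fftfreq(N, d=1.0 / N)           # integer frequencies, FFT ordering
tau = np.sin(3.0 * k)                      # an arbitrary real phase tau(k)

Tu = np.fft.ifft(np.exp(1j * tau) * uhat) * N   # T u evaluated on the grid

l2 = lambda v: np.sqrt(np.mean(np.abs(v) ** 2)) # discrete L^2(T^1) norm
assert abs(l2(Tu) - l2(u)) < 1e-12 * l2(u)
```

The content of Theorem 4.14.2 is that the same L² bound survives when a(x, k) and the x-dependence of the phase are merely controlled by finitely many derivatives and first differences.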
Proof of Theorem 4.14.2. Since for u : T^n → C we have ‖u‖_{L²(T^n)} = ‖F_{T^n} u‖_{ℓ²(Z^n)}, it is enough to prove that the operator

S w(x) = Σ_{k∈Z^n} e^{iφ(x,k)} a(x, k) w(k)

is bounded from ℓ²(Z^n) to L²(T^n). Let us define

S_l w(x) := e^{iφ(x,l)} a(x, l) w(l),

so that S = Σ_{l∈Z^n} S_l. From the identity

(w, S* v)_{ℓ²(Z^n)} = (S w, v)_{L²(T^n)} = ∫_{T^n} Σ_{k∈Z^n} e^{iφ(x,k)} a(x, k) w(k) \overline{v(x)} dx

we find that the adjoint S* to S is given by

(S* v)(k) = ∫_{T^n} e^{−iφ(x,k)} \overline{a(x, k)} v(x) dx,

and so we also have

(S_l* v)(m) = δ_{lm} ∫_{T^n} e^{−iφ(x,m)} \overline{a(x, m)} v(x) dx = δ_{lm} (S* v)(l).

It follows that

S_k S_l* v(x) = e^{iφ(x,k)} a(x, k) (S_l* v)(k) = δ_{lk} e^{iφ(x,k)} a(x, k) ∫_{T^n} e^{−iφ(y,k)} \overline{a(y, k)} v(y) dy = ∫_{T^n} K_{kl}(x, y) v(y) dy,

where K_{kl}(x, y) = δ_{kl} e^{i[φ(x,k)−φ(y,k)]} a(x, k) \overline{a(y, k)}. From (4.63) and the compactness of the torus it follows that the kernel K_{kl} is bounded and that ‖S_k S_l* v‖_{L²(T^n)} ≤ C δ_{kl} ‖v‖_{L²(T^n)}. In particular, we can trivially conclude that for any N ≥ 0 we have the estimate

‖S_k S_l*‖_{L²(T^n)→L²(T^n)} ≤ C_N / (1 + |k − l|^N).    (4.65)

On the other hand, we have

(S_l* S_k w)(m) = δ_{lm} ∫_{T^n} e^{−iφ(x,l)} \overline{a(x, l)} (S_k w)(x) dx
  = δ_{lm} ∫_{T^n} e^{i[φ(x,k)−φ(x,l)]} a(x, k) \overline{a(x, l)} dx · w(k)
  = Σ_{μ∈Z^n} K̃_{lk}(m, μ) w(μ),
where

K̃_{lk}(m, μ) = δ_{lm} δ_{kμ} ∫_{T^n} e^{i[φ(x,k)−φ(x,l)]} a(x, k) \overline{a(x, l)} dx.

Now, if k ≠ l, integrating by parts (2n + 1) times with the operator

(1/i) ( (∇_x φ(x, k) − ∇_x φ(x, l)) / |∇_x φ(x, k) − ∇_x φ(x, l)|² ) · ∇_x

and using the periodicity of a and ∇_x φ (so there are no boundary terms), we get the estimate

|K̃_{lk}(m, μ)| ≤ C δ_{lm} δ_{kμ} / (1 + |k − l|^{2n+1}),    (4.66)

where we also used that, by the discrete Taylor expansion (Theorem 3.3.21), the second condition in (4.63) implies that |∇_x φ(x, k) − ∇_x φ(x, l)| ≤ C |k − l| for all x ∈ T^n, k, l ∈ Z^n. Estimate (4.66) implies

sup_m Σ_μ |K̃_{lk}(m, μ)| = |K̃_{lk}(l, k)| ≤ C / (1 + |k − l|^{2n+1}),

and similarly for sup_μ Σ_m, so that we have

‖S_l* S_k‖_{ℓ²(Z^n)→ℓ²(Z^n)} ≤ C / (1 + |k − l|^{2n+1}).    (4.67)

These estimates for the norms ‖S_k S_l*‖_{L²(T^n)→L²(T^n)} and ‖S_l* S_k‖_{ℓ²(Z^n)→ℓ²(Z^n)} in (4.65) and (4.67), respectively, imply the theorem by the modification of Cotlar's lemma given in Theorem 4.14.1, which we use with H = ℓ²(Z^n) and G = L²(T^n). □

Using Theorems 4.13.8, 4.13.11, and 4.14.2, we obtain the result on the boundedness of Fourier series operators on Sobolev spaces:

Corollary 4.14.4 (Fourier series operators on Sobolev spaces). Let T : C^∞(T^n) → D′(T^n) be defined by

T u(x) = Σ_{k∈Z^n} e^{iφ(x,k)} a(x, k) û(k),

where φ : T^n × Z^n → R and a : T^n × Z^n → C. Assume that for all α and |β| = 1 we have

|∂_x^α a(x, k)| ≤ C_α ⟨k⟩^m,

as well as

|∂_x^α φ(x, k)| ≤ C_α ⟨k⟩    and    |∂_x^α Δ_k^β φ(x, k)| ≤ C_{αβ}

for all x ∈ T^n and k ∈ Z^n. Assume that for some C > 0 we have C^{−1} ⟨k⟩ ≤ |∇_x φ(x, k)| ≤ C ⟨k⟩ for all x ∈ T^n, k ∈ Z^n, and that

|∇_x φ(x, k) − ∇_x φ(x, l)| ≥ C |k − l|

for all x ∈ T^n and k, l ∈ Z^n. Then T extends to a bounded linear operator from H^s(T^n) to H^{s−m}(T^n) for all s ∈ R.

Exercise 4.14.5. Work out all the details of the proof.
4.15
An application to hyperbolic equations
In this section we briefly discuss how the toroidal analysis can be applied to construct global parametrices for hyperbolic equations on the torus, and how to embed certain problems on R^n into the torus. The finite propagation speed of singularities for solutions to hyperbolic equations allows one to cut off the equation and the Cauchy data for large x for the local analysis of singularities of solutions for bounded times. Then the problem can be embedded into T^n, or into the inflated torus NT^n (Remark 4.6.9), in order to apply the periodic analysis developed here. One of the advantages of this procedure is that, since phases and amplitudes are now only evaluated at ξ ∈ Z^n, one can apply this also to problems with low regularity in ξ, in particular to problems for weakly hyperbolic equations or systems with variable multiplicities. For example, if the principal part has constant coefficients, then the loss of regularity occurs only in ξ, so the techniques developed in this chapter can be applied.

Let a(X, D) be a pseudo-differential operator with symbol a satisfying a = a(x, ξ) ∈ S^m(R^n × R^n) (with some properties to be specified). There is no difference in the subsequent argument if a = a(t, x, ξ) also depends on t. For a function u = u(t, x) of t ∈ R and x ∈ R^n we write

a(X, D)u(t, x) = ∫_{R^n} e^{2πix·ξ} a(x, ξ) (F_{R^n} u)(t, ξ) dξ = ∫_{R^n} ∫_{R^n} e^{2πi(x−y)·ξ} a(x, ξ) u(t, y) dy dξ.

Let u(t, ·) ∈ L¹(R^n) (0 < t < t_0) be a solution to the hyperbolic problem

i ∂u/∂t (t, x) = a(X, D)u(t, x),    u(0, x) = f(x),    (4.68)

where f ∈ L¹(R^n) is compactly supported.

Assume now that a(X, D) = a_1(X, D) + a_0(X, D), where a_1(x, ξ) is 1-periodic in x and a_0(x, ξ) is compactly supported in x (assume even that supp a_0(·, ξ) ⊂ [0, 1]^n). A simple example is a constant coefficient symbol a_1(x, ξ) = a_1(ξ). Let us also assume that supp f ⊂ [0, 1]^n. We will now describe a way to periodise problem (4.68). According to Proposition 4.6.19, we can replace (4.68) by

i ∂u/∂t (t, x) = (a_1(X, D) + (Pa_0)(X, D))u(t, x) + Ru(t, x),    u(0, x) = f(x),

where the symbol a_1 + Pa_0 is periodic and R is a smoothing operator. To study the singularities of (4.68), it is sufficient to analyse the Cauchy problem

i ∂v/∂t (t, x) = (a_1(X, D) + (Pa_0)(X, D))v(t, x),    v(0, x) = f(x),

since by Duhamel's formula we have WF(u − v) = ∅. This problem can be transferred to the torus. Let w(t, x) = Pv(t, ·)(x). By Theorem 4.6.12 it solves the Cauchy problem on the torus T^n, with the operator P from (4.24) in Theorem 4.6.3:

i ∂w/∂t (t, x) = (ã_1(X, D) + (P̃a_0)(X, D))w(t, x),    w(0, x) = Pf(x).

Now, if a ∈ S¹(R^n × R^n) is of the first order, the calculus constructed in the previous sections yields the solution in the form

w(t, x) ≡ T_t f(x) = Σ_{k∈Z^n} e^{2πiφ(t,x,k)} b(t, x, k) F_{T^n}(Pf)(k),

where φ(t, x, ξ) and b(t, x, ξ) satisfy discrete analogues of the eikonal and transport equations. Here we note that F_{T^n}(Pf)(k) = (F_{R^n} f)(k). We also note that the phase φ(t, x, k) is defined for discrete values of k ∈ Z^n, so there is no issue of regularity, making this representation potentially applicable to low regularity problems and weakly hyperbolic equations.
where φ(t, x, ξ) and b(t, x, ξ) satisfy discrete analogues of the eikonal and transport equations. Here we note that FTn (Pf )(k) = (FRn f )(k). We also note that the phase φ(t, x, k) is defined for discrete values of k ∈ Zn , so there is no issue of regularity, making this representation potentially applicable to low regularity problems and weakly hyperbolic equations. Example. If the symbol a1 (x, ξ) = a1 (ξ) has constant coefficients and belongs to S 1 (Rn × Rn ), and a0 belongs to S 0 (Rn × Rn ), we can find that the phase is given by φ(t, x, k) = x · k + ta1 (k). In particular, ∇x φ(x, k) = k. Applying a(X, D) to w(t, x) = Tt f (x) and using the composition formula from Theorem 4.13.11 we obtain a(X, D)Tt f (x) = e2πi((x−z)·k+ta1 (k)) c(t, x, k) f (z) dz, k∈Zn
where c(t, x, k) ∼
Rn
(2πi)−|α| ∂ξα a(x, ξ) α!
α≥0
ξ=k
∂xα b(t, x, k),
(4.69)
since the function Ψ in Theorem 4.13.11 vanishes. From this we can find the amplitude b from the discrete version of the transport equations, the details of which we omit here. Finally, we note that we can also obtain an asymptotic expansion for the amplitude b in (4.69) in terms of the discrete differences Δ_ξ^α and the corresponding derivatives ∂_x^{(α)}, instead of the derivatives ∂_ξ^α and ∂_x^α, respectively, if we use Remark 4.13.15 instead of Theorem 4.13.11.

Exercise 4.15.1. Work out the details for the arguments above.

Remark 4.15.2 (Schrödinger equation). Let u(t, x), t ∈ R, x ∈ T^n, be the solution to a constant coefficients Schrödinger equation on the torus, i.e., let u satisfy

i ∂_t u + Lu = 0,    u|_{t=0} = f,

where L is the Laplace operator. This equation can be solved by taking the Fourier transform, and thus the Fourier series representation of the solution is

u(t, x) = e^{itL} f(x) = Σ_{ξ∈Z^n} e^{i2π(x·ξ − 2πt|ξ|²)} f̂(ξ).

This representation shows, in particular, that the solution is periodic in time. In [16, 17, 18], employing this representation, Bourgain used, for example, the equality ‖u‖⁴_{L⁴(T×T^n)} = ‖u²‖²_{L²(T×T^n)}, leading to the corresponding Strichartz estimates and global well-posedness results for nonlinear equations. We can note that, since the torus is compact, the usual dispersive estimates fail even locally in time. We will not pursue this topic further, and refer to the aforementioned papers for the details.
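The Fourier series solution formula can be implemented directly with the FFT. Under the normalisation above, the multiplier e^{−4π²it|ξ|²} equals 1 for t = 1/(2π) and every integer ξ, so the solution is time periodic with period 1/(2π); a one-dimensional numerical sketch (the initial datum is an arbitrary smooth choice):

```python
import numpy as np

# FFT sketch of the Fourier series solution on T^1,
#   u(t,x) = sum_xi e^{2 pi i (x xi - 2 pi t xi^2)} fhat(xi).
# Time periodicity with period 1/(2 pi) and conservation of the L^2 norm
# are both visible numerically.
N = 128
x = np.arange(N) / N
f = np.exp(np.cos(2 * np.pi * x))          # smooth periodic initial datum
fhat = np.fft.fft(f) / N
xi = np.fft.fftfreq(N, d=1.0 / N)

def u(t):
    return np.fft.ifft(np.exp(-4j * np.pi ** 2 * t * xi ** 2) * fhat) * N

l2 = lambda v: np.sqrt(np.mean(np.abs(v) ** 2))
assert abs(l2(u(0.37)) - l2(f)) < 1e-12 * l2(f)       # L^2 norm conserved
assert np.allclose(u(1 / (2 * np.pi)), f, atol=1e-9)  # periodicity in time
```

The time periodicity is special to the torus: on R^n the Schrödinger flow disperses instead, which is why the usual dispersive estimates fail here even locally in time.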
Chapter 5
Commutator Characterisation of Pseudo-differential Operators

On a smooth closed manifold the pseudo-differential operators can be characterised by taking commutators with vector fields, i.e., first-order partial derivatives. This approach is due to Beals ([12], 1977), Dunau ([32], 1977), and Coifman and Meyer ([23], 1978); perhaps the first to consider this kind of commutator property were Calderón and his school [21]. For other contributions, see also [26], [133] and [80]. In this chapter we present a Sobolev space version of these characterisations. This will be one of the steps in developing global quantizations of operators on Lie groups in Part IV. Indeed, a commutator characterisation in Sobolev spaces, as opposed to only L², has the advantage of allowing us to control the orders of operators. In particular, the commutators provide us with a new, quite simple way of proving the equivalence of the local and global definitions of pseudo-differential operators on a torus, and we derive related commutator characterisations for operators of general order on the scale of Sobolev spaces. The structure of the treatment is the following. First, we review the necessary pseudo-differential calculus on R^n, obtaining a commutator characterisation of local pseudo-differential operators (Theorem 5.1.4). After that, the corresponding global characterisation is given on closed manifolds (Theorem 5.3.1). Lastly, we apply this to the global symbolic analysis of periodic pseudo-differential operators on T^n (Theorem 5.4.1). Section 5.2 is devoted to a brief introduction to the necessary concepts of pseudo-differential operators on manifolds.
5.1
Euclidean commutator characterisation
In this section we discuss the case of the Euclidean space Rn . We will concentrate on the localisation of pseudo-differential operators which is just a local way to look
at pseudo-differential operators from Chapter 2, where we dealt with global analysis on R^n. The commutator characterisation of local pseudo-differential operators on R^n provided by Theorem 5.1.4 is needed in the next section for the commutator characterisation result on closed manifolds.

Definition 5.1.1 (Order of an operator on the Sobolev scale). A linear operator A : S(R^n) → S(R^n) is said to be of order m ∈ R on the Sobolev scale (H^s(R^n))_{s∈R} if it has bounded extensions A_{s,s−m} ∈ L(H^s(R^n), H^{s−m}(R^n)) for every s ∈ R. In this case, the extension is unique in the sense that the operator A has the extension A_S ∈ L(S′(R^n)) satisfying A_S|_{H^s(R^n)} = A_{s,s−m}. Thereby any of the operators A_{s,s−m} or A_S is also denoted by A. By Theorem 2.6.11, a pseudo-differential operator of order m in the class Ψ^m(R^n × R^n) is also of order m on the Sobolev scale.

Definition 5.1.2 (Local pseudo-differential operators). A linear operator A : C_0^∞(R^n) → D′(R^n) is called a local pseudo-differential operator of order m ∈ R on R^n, A ∈ Op S^m_loc(R^n), if φAψ ∈ Op S^m(R^n) for every φ, ψ ∈ C_0^∞(R^n). Naturally, here
((φAψ)u)(x) = φ(x) A(ψu)(x). In addition to the symbol inequalities (2.3) in Definition 2.1.1, there is another appealing way of characterising pseudo-differential operators, namely via commutators. This characterisation dates back to [12] by Beals, to [32] by Dunau, and to [23] by Coifman and Meyer. We present a related result, Theorem 5.1.4, about local pseudo-differential operators. First we introduce the following notation:

Definition 5.1.3 (Notation). Let us define the commutators L_j(A) := [∂_{x_j}, A] and R_k(A) := [A, M_{x_k}], where M_{x_k} is the multiplication operator (M_{x_k} f)(x) = x_k f(x). Set R^α = R_1^{α_1} ⋯ R_n^{α_n} and accordingly L^β = L_1^{β_1} ⋯ L_n^{β_n} for multi-indices α, β, with the convention L_j^0 = I = R_k^0. Finally, for a partial differential operator C on R^n, let deg(C) denote its order. By Theorem 2.6.11, deg(C) is also the order of C on the Sobolev scale. The following theorem characterises local pseudo-differential operators on R^n in terms of the orders of their commutators on the Sobolev scale:

Theorem 5.1.4 (Commutator characterisation on R^n). Let m ∈ R and let A be a linear operator defined on C_0^∞(R^n). Then the following conditions are equivalent:

(i) A ∈ Op S^m_loc(R^n).
(ii) For any φ, ψ ∈ C_0^∞(R^n), for any s ∈ R and for any sequence C = (C_j)_{j=0}^∞ ⊂ Op S¹_loc(R^n) of partial differential operators of first order, it holds that

B_0 = φAψ ∈ L(H^s(R^n), H^{s−m}(R^n)),
B_{k+1} = [B_k, C_k] ∈ L(H^s(R^n), H^{s−m+d_{C,k}}(R^n)),

where d_{C,k} = Σ_{j=0}^k (1 − deg(C_j)).

(iii) For any φ, ψ ∈ C_0^∞(R^n), for any s ∈ R and for every α, β ∈ N_0^n, it holds that

R^α L^β (φAψ) ∈ L(H^s(R^n), H^{s−(m−|α|)}(R^n)).

Remark 5.1.5. At first sight, condition (ii) in Theorem 5.1.4 may seem awkward, at least when compared to condition (iii). However, this result will be needed in the pseudo-differential analysis on manifolds, and it is crucial in the proof of Theorem 5.3.1. Also notice the similarities in the formulations of Theorems 5.1.4 and 5.3.1, and in the proofs of Theorems 5.1.4 and 5.4.1.

Proof of Theorem 5.1.4. First, let A ∈ Op S^m_loc(R^n), and fix φ, ψ ∈ C_0^∞(R^n). Then B_0 = φAψ ∈ Op S^m(R^n). Let χ ∈ C_0^∞(R^n) be such that χ(x) = 1 in a neighbourhood of the compact set supp(φ) ∪ supp(ψ) ⊂ R^n, so that B_{k+1} = [B_k, C_k] = [B_k, χC_k]. Notice that χC_k ∈ Op S^{deg(C_k)}(R^n). Hence by induction and by the composition Theorem 2.5.1 it follows that B_{k+1} ∈ Op S^{m−d_{C,k}}(R^n). This proves the implication (i) ⇒ (ii) by Theorem 2.6.11 with p = 2. It is trivial that (ii) implies (iii). Finally, let us show that (iii) implies (i). Assume (iii), and fix φ, ψ ∈ C_0^∞(R^n); we have to prove that φAψ ∈ Op S^m(R^n). Let χ ∈ C_0^∞(R^n) be such that χ(x) = 1 in a neighbourhood of the compact set supp(φ) ∪ supp(ψ) ⊂ R^n. We denote e_ξ(x) = e^{2πix·ξ}. Evidently, φAψ is of order m, and

∂_ξ^α ∂_x^β σ_{φAψ}(x, ξ) = σ_{R^α L^β(φAψ)}(x, ξ) = e^{−2πix·ξ} (R^α L^β(φAψ) e_ξ)(x) = e^{−2πix·ξ} (R^α L^β(φAψ)(χ e_ξ))(x).
If 2s > n = dim(R^n), s ∈ N, then by the Cauchy–Schwarz inequality, for u ∈ H^s(R^n) we have:

|u(x)| ≤ ∫_{R^n} |û(ξ)| dξ
  ≤ ( ∫_{R^n} (1 + |ξ|)^{−2s} dξ )^{1/2} ( ∫_{R^n} (1 + |ξ|)^{2s} |û(ξ)|² dξ )^{1/2}
  = C_s ‖u‖_{H^s(R^n)} ≤ C_s ( Σ_{|γ|≤s} ‖∂_x^γ u‖²_{H^0(R^n)} )^{1/2}.
Applied to the symbol ∂_ξ^α ∂_x^β σ_{φAψ}, this implies

|∂_ξ^α ∂_x^β σ_{φAψ}(x, ξ)| ≤ C ( Σ_{|γ|≤s} ‖∂_ξ^α ∂_x^{β+γ} σ_{φAψ}(·, ξ)‖²_{H^0(R^n)} )^{1/2}
  = C ( Σ_{|γ|≤s} ‖e_{−ξ} R^α L^{β+γ}(φAψ)(χ e_ξ)‖²_{H^0(R^n)} )^{1/2}
  ≤ C ( Σ_{|γ|≤s} ‖e_{−ξ}‖²_{L(H^0)} ‖R^α L^{β+γ}(φAψ)‖²_{L(H^{m−|α|}, H^0)} ‖χ e_ξ‖²_{H^{m−|α|}} )^{1/2}.

By a version of Peetre's inequality in (3.25) we have

∀s ∈ R ∀η, ξ ∈ R^n : (1 + |η + ξ|)^s ≤ 2^{|s|} (1 + |η|)^{|s|} (1 + |ξ|)^s

(where |ξ|, which was denoted by ‖ξ‖ in the torus chapters, is just the Euclidean norm of the vector ξ), so that we obtain

‖χ e_ξ‖_{H^{m−|α|}(R^n)} = ( ∫_{R^n} (1 + |η|)^{2(m−|α|)} |(χe_ξ)^̂(η)|² dη )^{1/2}
  = ( ∫_{R^n} (1 + |η + ξ|)^{2(m−|α|)} |χ̂(η)|² dη )^{1/2}
  ≤ 2^{|m−|α||} ‖χ‖_{H^{|m−|α||}(R^n)} (1 + |ξ|)^{m−|α|}.
5.2
Pseudo-differential operators on manifolds
Here we briefly provide a background on pseudo-differential operators on manifolds. The differential geometry needed in the study is quite simple, sufficient general reference being any text book in the field, e.g., [54]. Definition 5.2.1 (Atlases on topological spaces). Let X be a topological space. An atlas on X is a collection of pairs {(Uα , κα )}α , where all sets Uα ⊂ X are open in X, α Uα = X, and for every α the mapping κα : Uα → Rn is a homeomorphism of Uα onto an open subset of Rn ; such n is called the dimension of the chart (Uα , κα ), and pairs (Uα , κα ) are called charts of the atlas. For every two charts (Uα , κα ) and (Uβ , κβ ) with Uα ∩ Uβ = ∅, the functions καβ := κα ◦ κ−1 β : κβ (Uα ∩ Uβ ) → κα (Uα ∩ Uβ )
5.2. Pseudo-differential operators on manifolds
417
are called transition maps of the atlas. We note that each transition map καβ is a homeomorphism between open subsets of Euclidean spaces, so that the dimension n is the same for such charts. We will say that a point x ∈ X belongs to a chart (U, κ) if x ∈ U . Definition 5.2.2 (Manifolds). Let X be a Hausdorff topological space such that its topology has a countable base1 . Then X equipped with an atlas A = {(Uα , κα )}α of charts of the same dimension n is called a locally Euclidean topological space. Since n is the same for all charts, we can set dim X := n to be the dimension of X. A locally Euclidean topological space with atlas A is called a (smooth) manifold, or a C ∞ manifold, if all the transition maps of the atlas A are smooth. A manifold M is called compact if X is compact. Example. Simple examples of n-dimensional manifolds include Euclidean spaces Rn , spheres Sn , tori Tn . Remark 5.2.3. We assume that X has a countable topological base and that it is Hausdorff to ensure that there are not too many open sets and that the topology of compact manifolds is especially nice, respectively. We also note that given two atlases we can look at transition maps in the atlas which is then union. Thus, if the union of two atlases is again an atlas we will call these atlases equivalent. This leads to a notion of equivalent atlases and thus a manifold is rather an equivalence class M = (X, [A]), if we do not want to worry about which atlas to fix. However, we will avoid such technicalities because of the limited differential geometry required for our purposes. In the sequel we will often omit writing the atlas at all because on the manifolds that we are dealing with the choice of an atlas will be more or less canonical. However, an important property for us is that if (U, κ) is a chart and V ⊂ U is open, then (V, κ) is also a chart (in an equivalent atlas, hence a chart in M ). 
We also note that Hausdorff follows from the existence of an atlas, which also implies the existence of a locally countable topological base. Instead of the first countability one may directly assume the existence of a countable atlas. Definition 5.2.4 (Smooth mappings). Let f : M → N be a mapping between manifolds M = (X, A) and N = (Y, B). Let x ∈ X, let (U, κ) ∈ A be a chart in M containing x, and let (V, ψ) ∈ B be a chart in N containing f (x). By shrinking the set U if necessary we may assume that f (U ) ⊂ V. We will say that f is smooth at x ∈ X if the mapping ψ ◦ f ◦ κ−1 : κ(U ) → ψ(V )
(5.1)
is smooth. As usual, f is smooth if it is smooth at all points. The space C ∞ (M ) is the set of smooth complex-valued functions on M , and C0∞ (U ) is the set of smooth functions with compact supports in an open set U ⊂ M . If k ∈ N and if all the mappings (5.1) are in C k (κ(U )) for all charts, then we will say that f ∈ C k (M ). 1 For
a topological base see Definition A.8.16
418
Chapter 5. Commutator Characterisation of Pseudo-differential Operators
Exercise 5.2.5. Check that the definition of “f is smooth at x” does not depend on a particular choice of charts (U, κ) and (V, ψ). Remark 5.2.6 (Whitney’s embedding theorem). We will deal only with smooth manifolds. It is a fundamental fact that every compact manifold admits a smooth embedding as a submanifold of RN for sufficiently large N . An interesting question is how small can N be. In 1936, in [150], for general (also non-compact) manifolds Whitney showed that one can take N = 2n + 1 for this to be true, later also improving it to N = 2n. We will not pursue this topic here and can refer to [54] for further details, but we will revisit it in a simpler context of Lie groups in Corollary 8.0.4 as well as use it in Section 10.6. Remark 5.2.7 (Orientable manifolds). The natural n-form on Rn is given by the volume element Ω = dx1 ∧ · · · ∧ dxn which is non-degenerate. For every open U ⊂ Rn the restriction ΩU := Ω|U defines a volume element on U . A diffeomorphism F : U → V ⊂ Rn is called orientation preserving if F ∗ ΩV = f ΩU for some f ∈ C ∞ (U ) such that f > 0 everywhere. A manifold M is called orientable if it has an atlas such that all the transition maps are orientation preserving. One can show that orientable manifolds have a non-degenerate volume element, i.e., it is possible to define a smooth n-form on M which is not zero at any point. Definition 5.2.8 (Localisation of operators). If A : C ∞ (M ) → C ∞ (M ) and φ, ψ ∈ C ∞ (M ), we define the operator φAψ : C ∞ (M ) → C ∞ (M ) by ((φAψ)u)(x) = φ(x) · A(ψ · u)(x). Definition 5.2.9 (κ-transfers). If (U, κ) is a chart on M , the κ-transfer Aκ : C ∞ (κ(U )) → C ∞ (κ(U )) of an operator A : C ∞ (U ) → C ∞ (U ) is defined by Aκ u := A(u ◦ κ) ◦ κ−1 . Similarly, the κ-transfer of a function φ is φκ = φ ◦ κ−1 . Exercise 5.2.10. Prove that the transfer of a commutator is the commutator of transfers: (5.2) [A, B]κ = [Aκ , Bκ ]. 
Pseudo-differential operators on the manifold M in the Hörmander sense are defined as follows:

Definition 5.2.11 (Pseudo-differential operators on manifolds). A linear operator A : C^∞(M) → C^∞(M) is a pseudo-differential operator of order m ∈ R on M if, for every chart (U, κ) and all φ, ψ ∈ C_0^∞(U), the operator (φAψ)_κ is a pseudo-differential operator of order m on R^n. Since the class of pseudo-differential operators of order m on R^n is diffeomorphism invariant, the corresponding class on M is well defined. We denote the set of pseudo-differential operators of order m on M by Ψ^m(M).

Exercise 5.2.12. Check that the class of pseudo-differential operators of order m on R^n is diffeomorphism invariant (see Section 2.5.2).
5.2. Pseudo-differential operators on manifolds
Definition 5.2.13 (Diff(M)). Let Diff(M) be the ∗-algebra

Diff(M) = ⋃_{k=0}^∞ Diff^k(M),

where Diff^k(M) is the set of partial differential operators on M of order at most k with smooth coefficients. Here Diff^0(M) ≅ C^∞(M), and Diff^1(M) \ Diff^0(M) corresponds to the non-trivial smooth vector fields on M, i.e., the non-trivial smooth sections of the tangent bundle TM.

Definition 5.2.14 (Closed manifolds). A compact manifold without boundary is called closed. Throughout this section and further in this chapter, M will be a closed smooth orientable manifold. Then we can equip it with the volume element from Remark 5.2.7. One can think of it as a suitable pullback of the Euclidean volume n-form (the Lebesgue measure) in local charts.

Remark 5.2.15 (Spaces D(M) and D′(M)). A differential operator D ∈ Diff(M) defines a seminorm p_D on C^∞(M) by p_D(u) = sup_{x∈M} |(Du)(x)|. The seminorm family {p_D : C^∞(M) → R | D ∈ Diff(M)} induces a Fréchet space structure on C^∞(M). This test function space is denoted by D(M), and the distributions by D′(M) = L(D(M), C). In particular, similarly to Remark 1.3.7 in R^n, for u ∈ L^p(M) and ϕ ∈ C^∞(M), the duality

⟨u, ϕ⟩ := ∫_M u(x) ϕ(x) dx
gives a canonical way to identify u ∈ L^p(M) with a distribution in D′(M). Here dx stands for a volume element on M.

Definition 5.2.16 (Sobolev space H^s(M)). The Sobolev space H^s(M) (s ∈ R) is the set of those distributions u ∈ D′(M) such that (φu)_κ ∈ H^s(R^n) for every chart (U, κ) on M and every φ ∈ C_0^∞(U). Let U = {(U_j, κ_j)} be a cover of M with charts. Due to the compactness of M, we can require the cover to be finite. Fix a smooth partition of unity {(U_j, φ_j)} with respect to the cover U. We equip the Sobolev space H^s(M) with the norm

‖u‖_{H^s(M),{(U_j,κ_j,φ_j)}} := ( Σ_j ‖(φ_j u)_{κ_j}‖²_{H^s(R^n)} )^{1/2}.
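On the torus, where global Fourier analysis is available (cf. Section 5.4), the H^s norm needs no charts or partitions of unity: ‖u‖²_{H^s(T^n)} = Σ_ξ (1 + ‖ξ‖²)^s |û(ξ)|². A small numerical illustration on T¹ (the grid size and test function are arbitrary choices):

```python
import numpy as np

def sobolev_norm(u_vals, s):
    """H^s(T^1) norm of a trigonometric polynomial sampled on a uniform grid,
    via ||u||_{H^s}^2 = sum_xi (1 + xi^2)^s |u_hat(xi)|^2."""
    N = len(u_vals)
    u_hat = np.fft.fft(u_vals) / N          # Fourier coefficients on Z
    xi = np.fft.fftfreq(N, d=1.0 / N)       # integer frequencies ..., -1, 0, 1, ...
    return np.sqrt(np.sum((1.0 + xi**2) ** s * np.abs(u_hat) ** 2))

x = np.linspace(0.0, 1.0, 256, endpoint=False)
u = np.cos(2 * np.pi * 3 * x)               # Fourier coefficients 1/2 at xi = +-3
# Hence ||u||_{H^s}^2 = 2 * (1 + 9)^s * (1/2)^2 = 10^s / 2.
for s in (0, 1, 2):
    assert np.isclose(sobolev_norm(u, s), np.sqrt(10.0**s / 2.0))
```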
Exercise 5.2.17. Show that any other choice of U_j, κ_j, φ_j results in an equivalent norm. Prove that H^s(M) is a Hilbert space.

As a consequence of Corollary 1.5.15, as well as Propositions 1.5.18 and 1.5.19, we get:
Corollary 5.2.18 (Density). Let M be a closed manifold. The space C^∞(M) is sequentially dense in L^p(M) for all 1 ≤ p < ∞. Also, C^∞(M) is sequentially dense in H^s(M) for every s ∈ R.

Remark 5.2.19. The last statement is true for any s ∈ R but requires more manifold theory than is developed here. Such statements can be easily found in the literature; in the case of R^n see an even more general statement in Theorem 1.3.31. However, the reader is encouraged to provide the details of the proof of the density for all s ∈ R.

Definition 5.2.20 (Order of an operator on the Sobolev scale). A linear operator A on C^∞(M) is said to be of order m ∈ R on M if it extends boundedly from H^s(M) to H^{s−m}(M) for every s ∈ R. Thereby the operator A also has a continuous extension A_{D′} : D′(M) → D′(M). As in the case of R^n in Definition 5.1.1, all of these extensions coincide on their mutual domains, so that it is meaningful to denote any one of them by A.

Exercise 5.2.21. Prove that

C^∞(M) = ⋂_{s∈R} H^s(M)  and  D′(M) = ⋃_{s∈R} H^s(M).
Remark 5.2.22 (All operators are properly supported). We recall the notion of properly supported operators from Definition 2.5.20. Since the support of the integral kernel is closed, we immediately see that all pseudo-differential operators on a closed manifold are properly supported.

Briefly addressing the question of L^p-boundedness on compact manifolds, we formulate

Theorem 5.2.23 (Boundedness on L^p(M)). Let M be a compact manifold and let A ∈ Ψ^0(M). Then A is bounded from L^p(M) to L^p(M) for any 1 < p < ∞, and its operator norm is bounded by

‖A‖_{L(L^p(M))} ≤ C max_{|α|≤[n/2]+1, |β|≤n+1} sup_{x,ξ} |∂_x^β ∂_ξ^α a(x, ξ)| |ξ|^{|α|},

where ∂_x^β ∂_ξ^α a(x, ξ) is defined in one of some finite number of selected coordinate systems covering M.

The proof of this theorem can be carried out by reducing the problem to the corresponding L^p-boundedness statement for pseudo-differential operators in Ψ^0(R^n × R^n) with compactly supported amplitudes, which would follow from Theorem 2.6.22. However, the advantage of this theorem is that one also obtains a bound on the number of necessary derivatives (as well as a corresponding result for Theorem 2.6.22) if one reduces the problem to the L^p-multiplier problem. We refer to [130, p. 267] for further details, and to Section 13.1 for a further discussion of these concepts.
5.3 Commutator characterisation on closed manifolds
The main result of this section is Theorem 5.3.1 about the commutator characterisation (cf. Theorems 5.1.4 and 5.4.1), which was stated by Coifman and Meyer [23] in the case of 0-order operators on L²(M) (see also [32] for a kindred treatise). It will be applied in the final part of this chapter concerning periodic pseudo-differential operators (Theorem 5.4.1) and in Part IV (Theorem 10.7.7). Theorem 5.3.1 shows that pseudo-differential operators on closed manifolds can be characterised by the orders of their commutators on the Sobolev scale.

Let M be a closed manifold. Naturally, an operator D ∈ Diff^k(M) from Definition 5.2.13 is of order deg(D) = k. Observe that the algebra Diff(M) has the "almost-commuting property"

[Diff^j(M), Diff^k(M)] ⊂ Diff^{j+k−1}(M),

which follows from the Leibniz formula. Actually, more general pseudo-differential operators are also characterised by this "almost-commuting" with differential operators:

Theorem 5.3.1 (Commutator characterisation on closed manifolds). Let m ∈ R and let A : C^∞(M) → C^∞(M) be a linear operator. Then the following conditions are equivalent:

(i) A ∈ Ψ^m(M).

(ii) For any s ∈ R and any sequence D = (D_j)_{j=0}^∞ ⊂ Diff^1(M), it holds that

A_0 = A ∈ L(H^s(M), H^{s−m}(M)),
A_{k+1} = [A_k, D_k] ∈ L(H^s(M), H^{s−m+d_{D,k}}(M)),

where d_{D,k} = Σ_{j=0}^{k} (1 − deg(D_j)).
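The almost-commuting property can be checked symbolically in the simplest case: for first-order operators A = a(x)∂_x and B = b(x)∂_x on R, the second-order terms a b u″ cancel in [A, B]u, leaving the first-order operator (ab′ − ba′)∂_x. A sympy sketch (the coefficients a, b are arbitrary smooth functions):

```python
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)
a = sp.sin(x)       # arbitrary smooth coefficients
b = sp.exp(x)

def A(v):           # A = a(x) d/dx
    return a * sp.diff(v, x)

def B(v):           # B = b(x) d/dx
    return b * sp.diff(v, x)

commutator = sp.expand(A(B(u)) - B(A(u)))
# The u'' terms cancel: [A, B] = (a b' - b a') d/dx, again first order.
expected = sp.expand((a * sp.diff(b, x) - b * sp.diff(a, x)) * sp.diff(u, x))
assert sp.simplify(commutator - expected) == 0
```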
We need the following auxiliary result:

Lemma 5.3.2. Let M be a closed smooth manifold. Then there exists a smooth partition of unity with respect to a cover U of M such that U ∪ V is a chart neighbourhood whenever U, V ∈ U.

Proof. Let V be a cover of M with chart neighbourhoods. Since M is a compact metrisable space by the Whitney embedding theorem (Remark 5.2.6), the cover V has a Lebesgue number λ > 0, i.e., if S ⊂ M has small diameter, diam(S) < λ, then there exists V ∈ V such that S ⊂ V; see Lemma A.13.12. Let W be a cover of M with chart neighbourhoods of diameter less than λ/2, and choose a finite subcover U ⊂ W. Now there exists a smooth partition of unity on M with respect to U, and if U, V ∈ U intersect, then diam(U ∪ V) < λ. On the other hand, if U ∩ V = ∅, then U ∪ V is clearly a chart neighbourhood.
Proof of Theorem 5.3.1. ((i)⇒(ii)) Assume that A ∈ Ψ^m(M). Lemma 5.3.2 provides a smooth partition of unity {(U_j, φ_j)}_{j=1}^N such that U_i ∪ U_j is always a chart neighbourhood, so that the study can be localised:

A = Σ_{i=1}^N Σ_{j=1}^N φ_i A φ_j.

Let (U_i ∪ U_j, κ_{ij}) be a chart. Now φ_i, φ_j ∈ C_0^∞(U_i ∪ U_j), so that the κ_{ij}-transfer (φ_i A φ_j)_{κ_{ij}} is a pseudo-differential operator of order m on R^n, hence belonging to L(H^s(R^n), H^{s−m}(R^n)) by Theorem 2.6.11. Thereby

φ_i A φ_j = ((φ_i A φ_j)_{κ_{ij}})_{κ_{ij}^{−1}}
belongs to L(H^s(M), H^{s−m}(M)), and consequently A ∈ L(H^s(M), H^{s−m}(M)). Thus we have the inclusion Ψ^m(M) ⊂ L(H^s(M), H^{s−m}(M)). In order to get (ii), the inclusions

[Ψ^m(M), Diff^1(M)] ⊂ Ψ^m(M)  and  [Ψ^m(M), Diff^0(M)] ⊂ Ψ^{m−1}(M)

must also be proven. Let A ∈ Ψ^m(M) and D ∈ Diff^1(M), and fix an arbitrary chart (U, κ) and arbitrary functions φ, ψ ∈ C_0^∞(U). By a direct calculation,

φ[A, D]ψ = [φAψ, D] − φA[ψ, D] − [φ, D]Aψ,

so that

(φ[A, D]ψ)_κ = [(φAψ)_κ, D_κ] − (φA[ψ, D])_κ − ([φ, D]Aψ)_κ

by (5.2). Because A ∈ Ψ^m(M), Theorem 5.1.4 implies that the operators on the right-hand side of the previous equality are pseudo-differential operators of order m − (1 − deg(D)) on R^n. Therefore [A, D] ∈ Ψ^{m−(1−deg(D))}(M), proving the implication (i)⇒(ii).

((ii)⇒(i)) Let A : C^∞(M) → C^∞(M) satisfy condition (ii), and fix a chart (U, κ) on M and φ, ψ ∈ C_0^∞(U). To get (i), we have to prove that (φAψ)_κ ∈ Op S^m(R^n), which by Theorem 5.1.4 follows if we can prove the following variant of condition (ii):

(ii)′ For any s ∈ R and any sequence C = (C_j)_{j=0}^∞ ⊂ Op S^1_{loc}(R^n) of partial differential operators, it holds that

B_0 = (φAψ)_κ ∈ L(H^s(R^n), H^{s−m}(R^n)),
B_{k+1} = [B_k, C_k] ∈ L(H^s(R^n), H^{s−m+d_{C,k}}(R^n)),

where d_{C,k} = Σ_{j=0}^{k} (1 − deg(C_j)).

Indeed, B_0 = (φAψ)_κ ∈ L(H^s(R^n), H^{s−m}(R^n)) by (ii). Let χ ∈ C_0^∞(κ(U)) be such that χ(x) = 1 in a neighbourhood of the compact set supp(φ_κ) ∪ supp(ψ_κ) ⊂ R^n. Then define D = (D_j)_{j=0}^∞ ⊂ Diff^1(M) by D_j|_{M∖U} = 0 and D_j|_U = (χC_j)_{κ^{−1}}.
Then d_{D,k} ≥ d_{C,k}, and due to condition (ii) we get

B_{k+1} = [B_k, C_k] = [B_k, χC_k] = [(B_k)_{κ^{−1}}, D_k]_κ
        ∈ L(H^s(R^n), H^{s−m+d_{D,k}}(R^n)) ⊂ L(H^s(R^n), H^{s−m+d_{C,k}}(R^n)),

verifying (ii)′. Hence A ∈ Ψ^m(M).
Remark 5.3.3 (Ψ(M) is a ∗-algebra). The pseudo-differential operators on M form a ∗-algebra

Ψ(M) = ⋃_{m∈R} Ψ^m(M),

where Ψ^m(M) ⊂ L(H^s(M), H^{s−m}(M)). It is true that Diff^k(M) ⊂ Ψ^k(M), and Ψ(M) has properties analogous to those of the algebra Diff(M). Especially,

[Ψ^{m_1}(M), Ψ^{m_2}(M)] ⊂ Ψ^{m_1+m_2−1}(M).

Exercise 5.3.4 (Paracompact manifolds). Generalise the result of Lemma 5.3.2 to smooth paracompact manifolds. Recall that a Hausdorff topological space is called paracompact if every open cover admits a locally finite open refinement.
5.4 Toroidal commutator characterisation
On the torus T^n = R^n/Z^n one has a well-defined global symbol analysis of periodic operators from the class Ψ(T^n × Z^n), as developed in Chapter 4. In this section, as one application of the commutator characterisation Theorem 5.3.1, we provide a proof of the equality of operator classes Ψ(T^n × Z^n) = Ψ(T^n). For the equality of operator classes Ψ(T^n × Z^n) = Ψ(T^n × R^n) see Corollary 4.6.13, which was obtained using the extension and periodisation techniques. However, a similar application of Theorem 5.3.1 will be important on Lie groups (Theorem 10.7.7), where these other techniques are not readily available.

For 1 ≤ j, k ≤ n, let us define the operators L_j and R_k acting on periodic pseudo-differential operators by

L_j(A) := [D_{x_j}, A]  and  R_k(A) := [A, e^{i2πx_k} I].

Moreover, for α, β ∈ N_0^n, let

L^β(A) = L_1^{β_1} ··· L_n^{β_n}(A),
R^α(A) = R_1^{α_1} ··· R_n^{α_n}(A)

(here the letters L and R refer to "left" and "right"). By the composition Theorem 4.7.10, if A ∈ Op(S^m(T^n × Z^n)) then L_j(A) ∈ Op(S^m(T^n × Z^n)) and R_k(A) ∈ Op(S^{m−1}(T^n × Z^n)). Let us explain how these commutators arise, and why they are so interesting. First,

D_{x_j} σ_A(x, ξ) = D_{x_j} ( e^{−i2πx·ξ} A e_ξ(x) )
                 = e^{−i2πx·ξ} ( D_{x_j} A − ξ_j A ) e_ξ(x)
                 = e^{−i2πx·ξ} ( D_{x_j} A − A D_{x_j} ) e_ξ(x)
                 = e^{−i2πx·ξ} [D_{x_j}, A] e_ξ(x)
                 = σ_{L_j(A)}(x, ξ).

Thus the partial derivative with respect to x_j of the symbol σ_A leads to the symbol of the commutator [D_{x_j}, A]. As regards the difference, the situation is almost similar (here v_k stands for the standard kth unit basis vector of Z^n):

△_{ξ_k} σ_A(x, ξ) = σ_A(x, ξ + v_k) − σ_A(x, ξ)
                  = e^{−i2πx·(ξ+v_k)} A e_{ξ+v_k}(x) − e^{−i2πx·ξ} A e_ξ(x)
                  = e^{−i2πx_k} e^{−i2πx·ξ} ( A ∘ e^{i2πx_k} I − e^{i2πx_k} I ∘ A ) e_ξ(x)
                  = e^{−i2πx_k} e^{−i2πx·ξ} [A, e^{i2πx_k} I] e_ξ(x)
                  = e^{−i2πx_k} σ_{R_k(A)}(x, ξ).
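For a Fourier multiplier on T¹ these identities can be seen in matrix form: in the basis {e_ξ}, A is diagonal with entries a(ξ), multiplication by e^{i2πx} is the shift S e_ξ = e_{ξ+1}, and [A, S] e_ξ = (a(ξ+1) − a(ξ)) e_{ξ+1}, i.e., the commutator carries the forward difference of the symbol. A small numerical check on a truncated Fourier basis (the symbol a is an arbitrary choice):

```python
import numpy as np

# Truncated Fourier basis e_xi, xi = 0, ..., N-1, on T^1.
N = 8
xi = np.arange(N)
a = (1.0 + xi.astype(float)**2) ** 0.7   # an arbitrary "symbol" a(xi)
A = np.diag(a)                           # Fourier multiplier: A e_xi = a(xi) e_xi
S = np.eye(N, k=-1)                      # multiplication by e^{i2pi x}: S e_xi = e_{xi+1}
R = A @ S - S @ A                        # R(A) = [A, e^{i2pi x} I] in matrix form
# [A, S] e_xi = (a(xi+1) - a(xi)) e_{xi+1}: the subdiagonal of R carries
# the forward difference of the symbol.
assert np.allclose(np.diag(R, k=-1), a[1:] - a[:-1])
```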
The minor asymmetry in △_{ξ_k} σ_A(x, ξ) = e^{−i2πx_k} σ_{R_k(A)}(x, ξ) caused by the factor e^{−i2πx_k} is due to the nature of differences. In [12, p. 46–49] the pseudo-differential operators of a given degree have been characterised using analogues of these commutators, representing the differentiations of a symbol. As before, the approach on T^n is somewhat simpler:

Theorem 5.4.1 (Commutator characterisation on T^n). Let A be a linear operator on C^∞(T^n). Then we have A ∈ Op(S^m(T^n × Z^n)) if and only if

∀α, β ∈ N_0^n : L^β R^α(A) ∈ L(H^{m−|α|}(T^n), H^0(T^n)).    (5.3)

Thus the classes of periodic pseudo-differential operators and pseudo-differential operators on T^n coincide. More precisely, for any m ∈ R it holds that

Op S^m(T^n × Z^n) = Ψ^m(T^n).    (5.4)
Proof of Theorem 5.4.1 for the T¹ case. The "only if" part is trivial by Proposition 4.2.3, since Theorem 4.7.10 implies that L_1(B) ∈ Op(S^l(T¹ × Z¹)) and R_1(B) ∈ Op(S^{l−1}(T¹ × Z¹)) for any B ∈ Op(S^l(T¹ × Z¹)).

For the "if" part we have to estimate △_ξ^α ∂_x^β σ_A(x, ξ). Let us define the operator R̃_1 by R̃_1(A) := e^{−i2πx} R_1(A). Because u(x) → e^{−i2πx} u(x) is a homeomorphism from H^s(T¹) to H^s(T¹) for every s ∈ R, it is true that L_1^β R̃_1^α(A) ∈ L(H^{m−α}(T¹), H^0(T¹)). Notice that

|u(x)| ≤ Σ_{ξ∈Z¹} |û(ξ)| ≤ [ Σ_{ξ∈Z¹} ⟨ξ⟩^{−2} ]^{1/2} [ Σ_{ξ∈Z¹} |ξ û(ξ)|² ]^{1/2} = C ‖∂_x u‖_{H^0(T¹)}.

Using this we get

|△_ξ^α ∂_x^β σ_A(x, ξ)| ≤ C ‖△_ξ^α ∂_x^{β+1} σ_A(·, ξ)‖_{H^0(T¹)}
  = C ‖e_{−ξ} L_1^{β+1} R̃_1^α(A) e_ξ‖_{H^0(T¹)}
  ≤ C ‖e_{−ξ} I‖_{L(H^0,H^0)} ‖L_1^{β+1} R̃_1^α(A)‖_{L(H^{m−α},H^0)} ‖e_ξ‖_{H^{m−α}}
  = C ‖L_1^{β+1} R̃_1^α(A)‖_{L(H^{m−α}(T¹),H^0(T¹))} ⟨ξ⟩^{m−α}
  ≤ C_{αβ} ⟨ξ⟩^{m−α}.

This completes the proof of the one-dimensional case.
General proof of Theorem 5.4.1. Let us first prove the inclusion Op S^m(T^n × Z^n) ⊂ Ψ^m(T^n). We know by Proposition 4.2.3 that Op S^m(T^n × Z^n) ⊂ L(H^s(T^n), H^{s−m}(T^n)). Therefore by Theorem 5.3.1, in view of Proposition 4.2.3, it suffices to verify that

[Op S^m(T^n × Z^n), Diff^1(T^n)] ⊂ Op S^m(T^n × Z^n)    (5.5)

and that

[Op S^m(T^n × Z^n), Diff^0(T^n)] ⊂ Op S^{m−1}(T^n × Z^n).    (5.6)

This is true due to the asymptotic expansion of the composition of two periodic pseudo-differential operators (see Theorem 4.7.10). However, we present a brief independent and instructive proof of the inclusion (5.5). Let A ∈ Op S^m(T^n) and let X ∈ Diff^1(T^n), X_x = φ(x) ∂_{x_k} (1 ≤ k ≤ n). Now

σ_{[A,X]}(x, ξ) = i2πξ_k Σ_η [σ_A(x, ξ + η) − σ_A(x, ξ)] φ̂(η) e^{i2πx·η} − φ(x) (∂_{x_k} σ_A)(x, ξ).

Notice that

σ_A(x, ξ + η) − σ_A(x, ξ) = Σ_{j=1}^{n} Σ_{ω_j=(sgn(η_j)−1)/2}^{η_j−(sgn(η_j)+1)/2} sgn(η_j) △_{ξ_j} σ_A(x, ξ + η_1 δ_1 + ··· + η_{j−1} δ_{j−1} + ω_j δ_j),

where

sgn(η_j) = { 1, η_j > 0;  0, η_j = 0;  −1, η_j < 0 },

and there are at most Σ_j |η_j| < √n (1 + ‖η‖) non-zero terms in the sum. Hence, applying the ordinary Leibniz formula with respect to x, the discrete Leibniz formula with respect to ξ (Lemma 3.3.6), the inequality of Peetre (Proposition 3.3.31) and Lemma 4.2.1, we get

|△_ξ^α ∂_x^β σ_{[A,X]}(x, ξ)| ≤ C_{αβφr} (1 + ‖ξ‖) Σ_η (1 + ‖ξ‖)^{m−(|α|+1)} √n (1 + ‖η‖)^{|m−(|α|+1)|+|β|+1−r} + C_{αβφ} (1 + ‖ξ‖)^{m−|α|}.

By choosing r > |m − (|α|+1)| + |β| + 2, the series above converges, so that

|△_ξ^α ∂_x^β σ_{[A,X]}(x, ξ)| ≤ C_{αβ} (1 + ‖ξ‖)^{m−|α|}.
Hence [A, X] ∈ Op S^m(T^n × Z^n). Similarly, but with less effort, one proves (5.6). Thus A ∈ Ψ^m(T^n) by Theorem 5.3.1, and hence also (5.3) by Theorem 5.3.1.

Now assume that A ∈ Ψ^m(T^n). We have to prove that σ_A satisfies the inequalities (4.6) from Definition 4.1.7 defining the toroidal symbol class S^m_{1,0}(T^n × Z^n). We also note that A ∈ Ψ^m(T^n) implies (5.3) by Theorem 5.3.1. Let us define the transformations R̃_k by R̃_k(A) := e^{−i2πx_k} R_k(A), and set R̃^α := R̃_1^{α_1} ··· R̃_n^{α_n}, so that

△_ξ^α ∂_x^β σ_A(x, ξ) = σ_{R̃^α L^β(A)}(x, ξ).

By Theorem 5.3.1, we have R̃^α L^β(A) ∈ L(H^{m−|α|}(T^n), H^0(T^n)). Notice that

|u(x)| ≤ Σ_ξ |û(ξ)| ≤ [ Σ_ξ (1 + ‖ξ‖)^{−2s} ]^{1/2} [ Σ_ξ (1 + ‖ξ‖)^{2s} |û(ξ)|² ]^{1/2} = C_s ‖u‖_{H^s(T^n)} ≤ C_s ( Σ_{|γ|≤s} ‖∂_x^γ u‖²_{H^0(T^n)} )^{1/2},

where s ∈ N satisfies 2s > n = dim(T^n). Using this we get

|△_ξ^α ∂_x^β σ_A(x, ξ)| ≤ C ( Σ_{|γ|≤s} ‖△_ξ^α ∂_x^{β+γ} σ_A(·, ξ)‖²_{H^0(T^n)} )^{1/2}
  = C ( Σ_{|γ|≤s} ‖e_{−ξ} R̃^α L^{β+γ}(A) e_ξ‖²_{H^0(T^n)} )^{1/2}
  ≤ C ( Σ_{|γ|≤s} ‖e_{−ξ} I‖²_{L(H^0(T^n))} ‖R̃^α L^{β+γ}(A)‖²_{L(H^{m−|α|}(T^n),H^0(T^n))} ‖e_ξ‖²_{H^{m−|α|}(T^n)} )^{1/2}
  ≤ C_{αβ} (1 + ‖ξ‖)^{m−|α|},

completing the proof.
Part III
Representation Theory of Compact Groups

We might call traditional topology and measure theory by the name "commutative geometry", referring to the commutative function algebras; "non-commutative geometry" would then refer to the study of non-commutative algebras. Although the function algebras considered in the sequel are still commutative, the non-commutativity of the corresponding groups is the characteristic feature of Parts III and IV.

Here we present the necessary material on compact groups and their representations. The presentation gradually increases the availability of topological and differentiable structures, thus tracing the development from general compact groups to linear Lie groups. Moreover, we present additional material on Hopf algebras, joining the material of this part to the analysis of algebras from Chapter D. Nevertheless, we have tried to make the exposition self-contained, providing references to Part I when necessary. If the reader wants to gain a more profound knowledge of Lie groups, Lie algebras and their representations, there are many excellent monographs available on different aspects of these theories at different levels, for example [9, 19, 20, 31, 36, 37, 38, 47, 48, 49, 50, 51, 58, 61, 64, 65, 73, 74, 88, 123, 127, 147, 148, 149, 154], to mention a few.
Chapter 6
Groups

6.1 Introduction
Loosely speaking, groups encode symmetries of (geometric) objects: if we consider a space X with some specific structure (e.g., a Riemannian manifold), a symmetry of X is a bijection f : X → X preserving the natural structure involved (e.g., the Riemannian metric); here, compositions and inversions of symmetries yield new symmetries. With a handful of assumptions, the concept of a group captures the essential properties of wide classes of symmetries, and provides powerful tools for the related analysis.

Perhaps the first non-trivial group that mankind encountered was the set Z of integers; with the usual addition (x, y) → x + y and "inversion" x → −x, this is a basic example of a group. Intuitively, a group is a set G that has two mappings G × G → G and G → G generalising the properties of the integers in a simple and natural way.

We start by defining groups, and we study the mappings preserving such structures, i.e., group homomorphisms. Of special interest are representations, that is, those group homomorphisms that take values in groups of invertible linear operators on vector spaces. Representation theory is a key ingredient in the theory of groups. In this framework we study analysis on compact groups, foremost measure theory and the Fourier transform. Remarkably, for a compact group G there exists a unique translation-invariant linear functional on C(G) corresponding to a probability measure. We shall construct this so-called Haar measure, closely related to the Lebesgue measure of a Euclidean space. We shall also introduce Fourier series of functions on a group.

Groups having a smooth manifold structure (with smooth group operations) are called Lie groups, and their representation theory is especially interesting. Left-invariant first-order partial differential operators on such a group can be identified with left-invariant vector fields on the group, and the corresponding set, called the Lie algebra, is studied.
Finally, we introduce Hopf algebras and study the Gelfand theory related to them.

Remark 6.1.1 (–morphisms). If X, Y are spaces with the same kind of algebraic structure, the set Hom(X, Y) of homomorphisms consists of mappings f : X → Y respecting the structure. Bijective homomorphisms are called isomorphisms. Homomorphisms f : X → X are called endomorphisms of X, and their set is denoted by End(X) := Hom(X, X). Endomorphisms that are isomorphisms are called automorphisms, and their set is Aut(X) ⊂ End(X). If there exist zero elements 0_X, 0_Y in the respective algebraic structures X, Y, the null space or kernel of f ∈ Hom(X, Y) is

Ker(f) := {x ∈ X : f(x) = 0_Y}.

Sometimes algebraic structures might also carry, say, a topology, and then the homomorphisms are typically required to be continuous. Hence, for instance, a homomorphism f : X → Y between Banach spaces X, Y is usually assumed to be continuous and linear, denoted by f ∈ L(X, Y), unless otherwise mentioned; for short, let L(X) := L(X, X). The assumptions in theorems etc. will still be stated explicitly.

Conventions. N is the set of positive integers, Z^+ = N, N_0 = N ∪ {0}, Z is the set of integers, Q the set of rational numbers, R the set of real numbers, C the set of complex numbers, and K ∈ {R, C}.
6.2 Groups without topology
We start with groups without complications, not assuming any supplementary properties. This choice helps in understanding the purely algebraic ideas; only later will we mingle groups with other structures, e.g., topology.

Definition 6.2.1 (Groups). A group consists of a set G having an element e = e_G ∈ G and endowed with mappings

((x, y) → xy) : G × G → G,  (x → x^{−1}) : G → G

satisfying

x(yz) = (xy)z,  ex = x = xe,  x x^{−1} = e = x^{−1} x
for all x, y, z ∈ G. We may freely write xyz := x(yz) = (xy)z; the element e ∈ G is called the neutral element, and x^{−1} is the inverse of x ∈ G. If the group operations are implicitly known, we may simply say that G is a group. If xy = yx for all x, y ∈ G then G is called commutative (or Abelian).

Example. Let us give some examples of groups:

1. (Symmetric group). Let G = {f : X → X | f bijection}, where X ≠ ∅; this is a group with operations (f, g) → f ∘ g, f → f^{−1}. This group G of bijections on X is called the symmetric group of X, and it is non-commutative whenever |X| ≥ 3, where |X| is the number of elements of X. The neutral element is id_X = (x → x) : X → X.

2. The sets Z, Q, R and C are commutative groups with operations (x, y) → x + y, x → −x. The neutral element is 0 in each case.

3. Any vector space is a commutative group with operations (x, y) → x + y, x → −x; the neutral element is 0.

4. (Automorphism group Aut(V)). Let V be a vector space. The set Aut(V) of invertible linear operators V → V forms a group with operations (A, B) → AB, A → A^{−1}; this group is non-commutative when dim(V) ≥ 2. The neutral element is I = (v → v) : V → V.

5. The sets Q^× := Q \ {0}, R^× := R \ {0}, C^× := C \ {0} (more generally, the invertible elements of a unital ring) form multiplicative groups with operations (x, y) → xy (ordinary multiplication) and x → x^{−1} (as usual). The neutral element is 1 in each case.

6. (Affine group Aff(V)). The set Aff(V) = {A_a = (v → Av + a) : V → V | A ∈ Aut(V), a ∈ V} of affine mappings forms a group with operations (A_a, B_b) → (AB)_{Ab+a}, A_a → (A^{−1})_{−A^{−1}a}; this group is non-commutative when dim(V) ≥ 1. The neutral element is I_0.

7. (Product group). If G and H are groups then G × H has a natural group structure:

((g_1, h_1), (g_2, h_2)) → (g_1 g_2, h_1 h_2),  (g, h) → (g^{−1}, h^{−1}).

The neutral element is e_{G×H} := (e_G, e_H).

Exercise 6.2.2. Let G be a group and x, y ∈ G. Prove:
(a) (x^{−1})^{−1} = x.
(b) If xy = e then y = x^{−1}.
(c) (xy)^{−1} = y^{−1} x^{−1}.

Definition 6.2.3 (Finite groups). If a group has finitely many elements, it is said to be finite.
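The identities in Exercise 6.2.2 can be spot-checked in a concrete group, e.g., GL(3, R); note the reversed order in (c). A quick numerical sketch (the matrices are random):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two random elements of GL(3, R) (a random matrix is invertible with probability 1).
x = rng.standard_normal((3, 3))
y = rng.standard_normal((3, 3))
inv = np.linalg.inv

# (a) (x^{-1})^{-1} = x
assert np.allclose(inv(inv(x)), x)
# (b) if xy = e then y = x^{-1}: solve xy = I for y and compare
assert np.allclose(np.linalg.solve(x, np.eye(3)), inv(x))
# (c) (xy)^{-1} = y^{-1} x^{-1}  -- note the reversed order
assert np.allclose(inv(x @ y), inv(y) @ inv(x))
```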
Example. The symmetric group of a set consisting of n elements is called the permutation group on n elements. Such a group is finite and has n! = 1·2···n elements.

Definition 6.2.4 (Notation). Let G be a group, x ∈ G, A, B ⊂ G and n ∈ Z^+. We write

xA := {xa | a ∈ A},
Ax := {ax | a ∈ A},
AB := {ab | a ∈ A, b ∈ B},
A^0 := {e},
A^{−1} := {a^{−1} | a ∈ A},
A^{n+1} := A^n A,
A^{−n} := (A^n)^{−1}.
Definition 6.2.5 (Subgroups H < G, and normal subgroups H ◁ G). A set H ⊂ G is a subgroup of a group G, denoted by H < G, if

e ∈ H,  xy ∈ H  and  x^{−1} ∈ H

for all x, y ∈ H (hence H is a group with the inherited operations). A subgroup H < G is called normal in G if xH = Hx for all x ∈ G; then we write H ◁ G.

Remark 6.2.6. With the inherited operations, a subgroup is a group. Normal subgroups are the well-behaved ones, as exemplified later in Proposition 6.2.16 and Theorem 6.2.20. In some books normal subgroups of G are called normal divisors of G.

Exercise 6.2.7. Let H < G. Show that if H ⊂ x^{−1}Hx for every x ∈ G, then H ◁ G.

Exercise 6.2.8. Let H < G. Show that H ◁ G if and only if H = x^{−1}Hx for every x ∈ G.

Example. Let us collect some instances and facts about subgroups:

1. (Trivial subgroups). We always have the normal trivial subgroups {e} ◁ G and G ◁ G. Subgroups of a commutative group are always normal.

2. (Centre of a group). The centre satisfies Z(G) ◁ G, where

Z(G) := {z ∈ G | ∀x ∈ G : xz = zx}.

Thus the centre is the collection of elements that commute with all elements of the group.
3. If F < H and G < H then F ∩ G < H.

4. If F < H and G ◁ H then FG < H.

5. {I_a | a ∈ V} ◁ Aff(V).

6. The following two examples will be of crucial importance later, so we formulate them as Remarks 6.2.9 and 6.2.10.

Remark 6.2.9 (Groups GL(n, R), O(n), SO(n)). We have SO(n) < O(n) < GL(n, R) ≅ Aut(R^n), where the groups consist of real n × n matrices: GL(n, R) is the real general linear group, consisting of the invertible real matrices (i.e., those with non-zero determinant); O(n) is the orthogonal group, where the matrix columns (or rows) form an orthonormal basis of R^n (so that A^T = A^{−1} for A ∈ O(n), and det(A) ∈ {−1, 1}); SO(n) is the special orthogonal group, the group of rotation matrices of R^n around the origin, so that SO(n) = {A ∈ O(n) : det(A) = 1}.

Remark 6.2.10 (Groups GL(n, C), U(n), SU(n)). We have SU(n) < U(n) < GL(n, C) ≅ Aut(C^n), where the groups consist of complex n × n matrices: GL(n, C) is the complex general linear group, consisting of the invertible complex matrices (i.e., those with non-zero determinant); U(n) is the unitary group, where the matrix columns (or rows) form an orthonormal basis of C^n (so that A* = A^{−1} for A ∈ U(n), and |det(A)| = 1); SU(n) is the special unitary group, SU(n) = {A ∈ U(n) : det(A) = 1}.

Remark 6.2.11. The mapping (z → (z)) : C → C^{1×1} identifies complex numbers with complex (1 × 1)-matrices. Thereby the complex unit circle {z ∈ C : |z| = 1} is identified with the group U(1).

Definition 6.2.12 (Right quotient G/H). Let H < G. Then

x ∼ y  ⇐⇒  xH = yH

defines an equivalence relation on G, as can be easily verified. The (right) quotient of G by H is the set

G/H = {xH | x ∈ G}.

Notice that xH = yH if and only if x^{−1}y ∈ H.
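The defining relations of the matrix groups in Remarks 6.2.9 and 6.2.10 are easy to verify numerically; for instance, the QR factorisation of a random real matrix produces an element of O(n), and flipping the sign of one column lands in SO(n). A sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
# The QR factorisation of a random real matrix yields Q in O(n).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.allclose(Q.T @ Q, np.eye(n))           # Q^T = Q^{-1}
assert np.isclose(abs(np.linalg.det(Q)), 1.0)    # det(Q) in {-1, +1}

# Flipping the sign of one column lands in SO(n) if needed (det = +1).
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]
assert np.isclose(np.linalg.det(Q), 1.0)         # now Q in SO(n)

# A rotation of R^2 by angle t is the basic element of SO(2).
t = 0.3
Rot = np.array([[np.cos(t), -np.sin(t)],
                [np.sin(t),  np.cos(t)]])
assert np.allclose(Rot.T @ Rot, np.eye(2)) and np.isclose(np.linalg.det(Rot), 1.0)
```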
Similarly, we can define:

Definition 6.2.13 (Left quotient H\G). Let H < G. Then

x ∼ y  ⇐⇒  Hx = Hy

defines an equivalence relation on G. The (left) quotient of G by H is the set

H\G = {Hx | x ∈ G}.

Notice that Hx = Hy if and only if xy^{−1} ∈ H.

Remark 6.2.14 (Right for now). We will deal mostly with the right quotient G/H in Part III. However, we note that in Part IV we will actually need the left quotient H\G more. It should be a simple exercise for the reader to translate all the results from "right" to "left". Indeed, simply replacing the side from which the subgroup acts from right to left, and changing all the words from "right" to "left", should do the job, since the situation is completely symmetric. The reason for our change is that once we choose to identify the Lie algebras with the left-invariant vector fields in Part IV, it leads to a more natural analysis of pseudo-differential operators on left quotients. However, because our intuition about division may be better suited to the notation G/H, we chose to explain the basic ideas for the right quotients, keeping in mind that the situation with the left quotients is completely symmetric.

Remark 6.2.15. It is often useful to identify the points xH ∈ G/H with the sets xH ⊂ G. Also, for A ⊂ G we naturally identify the sets AH = {ah : a ∈ A, h ∈ H} ⊂ G and {aH : a ∈ A} = {{ah : h ∈ H} : a ∈ A} ⊂ G/H. This provides a nice way to treat the quotient G/H.

Proposition 6.2.16 (When is G/H a group?). Let H ◁ G be normal. Then the quotient G/H can be endowed with the group structure

(xH, yH) → xyH,  xH → x^{−1}H.
Proof. The operations are well-defined mappings (G/H) × (G/H) → G/H and G/H → G/H, respectively, since

xHyH = xyHH = xyH  (using H ◁ G and HH = H),

and

(xH)^{−1} = H^{−1}x^{−1} = Hx^{−1} = x^{−1}H  (using H^{−1} = H and H ◁ G).

The group axioms follow, since by simple calculations we have

(xH)(yH)(zH) = xyzH,
(xH)(eH) = xH = (eH)(xH),
(x^{−1}H)(xH) = H = (xH)(x^{−1}H).

Notice that e_{G/H} = e_G H = H.
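A minimal concrete model of Proposition 6.2.16 is the quotient Z/5Z (written additively), with each coset x + H represented canonically by x mod 5; the point being checked is that the operations are well defined, i.e., independent of the chosen representatives:

```python
# Cosets of the normal subgroup H = 5Z in G = Z (written additively),
# each coset x + H represented canonically by x mod 5.
H = 5

def coset(x):        # the coset x + H, as its canonical representative
    return x % H

def mult(a, b):      # (xH, yH) -> xyH, here (x+H, y+H) -> (x+y)+H
    return coset(a + b)

def inv(a):          # xH -> x^{-1}H, here -x + H
    return coset(-a)

# Well-definedness: different representatives of the same cosets
# give the same product coset.
assert mult(coset(7), coset(9)) == mult(coset(7 + 5), coset(9 - 10))
# Group axioms on the quotient:
e = coset(0)
for a in range(H):
    assert mult(a, e) == a == mult(e, a)
    assert mult(a, inv(a)) == e
```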
Definition 6.2.17 (Torus T^n as a quotient group). The quotient T^n := R^n/Z^n is called the (flat) n-dimensional torus.

Definition 6.2.18 (Homomorphisms and isomorphisms). Let G, H be groups. A mapping φ : G → H is called a homomorphism (or a group homomorphism), denoted by φ ∈ Hom(G, H), if

φ(xy) = φ(x)φ(y)

for all x, y ∈ G. The kernel of φ ∈ Hom(G, H) is

Ker(φ) := {x ∈ G | φ(x) = e_H}.

A bijective homomorphism φ ∈ Hom(G, H) is called an isomorphism, denoted by φ : G ≅ H.

Remark 6.2.19. Group homomorphisms are the natural mappings between groups, preserving the group operations. Notice especially that for a group homomorphism φ : G → H it holds that

φ(e_G) = e_H  and  φ(x^{−1}) = φ(x)^{−1}

for all x ∈ G.

Example. Examples of homomorphisms:

1. (x → e_H) ∈ Hom(G, H).
2. For y ∈ G, (x → y^{−1}xy) ∈ Hom(G, G).
3. If H ◁ G then x → xH is a surjective homomorphism G → G/H.
4. For x ∈ G, (n → x^n) ∈ Hom(Z, G).
5. If φ ∈ Hom(F, G) and ψ ∈ Hom(G, H) then ψ ∘ φ ∈ Hom(F, H).
6. T¹ ≅ U(1) ≅ SO(2).

Theorem 6.2.20. Let φ ∈ Hom(G, H) and K = Ker(φ). Then:

1. φ(G) < H.
2. K ◁ G.
3. ψ(xK) := φ(x) defines a group isomorphism ψ : G/K → φ(G).

Thus we have the commutative diagram

    G  ──φ──→  H
    │          ↑
  x → xK     y → y
    ↓          │
   G/K ──ψ──→ φ(G),

where ψ : G/K ≅ φ(G).
Proof. Let x, y ∈ G. Now φ(G) is a subgroup of H, because

e_H = φ(e_G) ∈ φ(G),
φ(x)φ(y) = φ(xy) ∈ φ(G),
φ(x^{−1})φ(x) = φ(x^{−1}x) = φ(e_G) = e_H = φ(x)φ(x^{−1});

notice that φ(x)^{−1} = φ(x^{−1}). If a, b ∈ Ker(φ) then

φ(e_G) = e_H,
φ(ab) = φ(a)φ(b) = e_H e_H = e_H,
φ(a^{−1}) = φ(a)^{−1} = e_H^{−1} = e_H,

so that K = Ker(φ) < G. If moreover x ∈ G then

φ(x^{−1}Kx) = φ(x^{−1}) φ(K) φ(x) = φ(x)^{−1} {e_H} φ(x) = {e_H},

meaning x^{−1}Kx ⊂ K. Thus K ◁ G by Exercise 6.2.8.

By Proposition 6.2.16, G/K is a group (with the natural operations). Since φ(xa) = φ(x) for every a ∈ K, ψ = (xK → φ(x)) : G/K → φ(G) is a well-defined surjection. Furthermore, ψ(xyK) = φ(xy) = φ(x)φ(y) = ψ(xK)ψ(yK), thus ψ ∈ Hom(G/K, φ(G)). Finally,

ψ(xK) = ψ(yK) ⇐⇒ φ(x) = φ(y) ⇐⇒ x^{−1}y ∈ K ⇐⇒ xK = yK,

so that ψ is injective.
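Theorem 6.2.20 can be illustrated with φ : Z → C^×, φ(x) = i^x: the image is the four-element group {1, i, −1, −i}, the kernel is K = 4Z, and ψ(x + 4Z) := i^x is an isomorphism Z/4Z → φ(Z). A minimal sketch (non-negative integer powers of 1j are computed exactly in Python, so equality tests are safe here):

```python
# phi : Z -> C^x, phi(x) = i^x; Ker(phi) = 4Z, image = {1, i, -1, -i}.
K = 4

def phi(x):
    return 1j ** x

def psi(rep):        # psi(xK) := phi(x), evaluated on a coset representative
    return phi(rep)

# psi is well defined: representatives of the same coset agree ...
assert psi(3) == psi(3 + 2 * K)
# ... it is a homomorphism Z/4Z -> phi(Z) ...
for x in range(K):
    for y in range(K):
        assert psi((x + y) % K) == psi(x) * psi(y)
# ... and it is injective on the 4 cosets, hence an isomorphism onto the image.
assert len({psi(x) for x in range(K)}) == K
```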
Exercise 6.2.21 (Universality of the permutation groups). Let G be a finite group. Show that there is a set X with finitely many elements such that G is isomorphic to a subgroup of the symmetric group of X.
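For Exercise 6.2.21, the standard construction (Cayley's theorem) lets G act on itself by left translation: each g ∈ G gives a bijection h → gh of the underlying set, and g → (h → gh) is an injective homomorphism into the symmetric group of G. A sketch for G = Z/4Z:

```python
# Cayley embedding of G = Z/4Z into the symmetric group of X = {0, 1, 2, 3}:
# each g in G acts on G by left translation h -> g + h, a bijection of X.
n = 4

def perm_of(g):
    """The permutation h -> g + h (mod n), encoded as a tuple of images."""
    return tuple((g + h) % n for h in range(n))

def compose(p, q):
    """Composition of permutations: (p o q)(h) = p(q(h))."""
    return tuple(p[q[h]] for h in range(n))

# g -> perm_of(g) is a homomorphism: perm(g1 + g2) = perm(g1) o perm(g2) ...
for g1 in range(n):
    for g2 in range(n):
        assert perm_of((g1 + g2) % n) == compose(perm_of(g1), perm_of(g2))
# ... and it is injective, so G is isomorphic to a subgroup of the symmetric group.
assert len({perm_of(g) for g in range(n)}) == n
```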
6.3 Group actions and representations

Spaces can be studied by examining their symmetry groups. On the other hand, it is fruitful to investigate groups when they act as symmetries of some nicely structured spaces. Next we study actions of groups on sets. Especially interesting group actions are the linear actions on vector spaces, providing the machinery of linear algebra; this is the fundamental idea in the representation theory of groups.

Definition 6.3.1 (Transitive actions). An action of a group G on a set M ≠ ∅ is a mapping

((x, p) → x · p) : G × M → M,

for which

x · (y · p) = (xy) · p,  e · p = p
for all x, y ∈ G and p ∈ M; the action is transitive if

∀p, q ∈ M ∃x ∈ G : x · q = p.

If M is a vector space and the mapping p → x · p is linear for each x ∈ G, the action is called linear.

Remark 6.3.2. To be precise, our action G × M → M in Definition 6.3.1 should be called a left action, to distinguish it from right actions M × G → M, which are defined in the obvious way. When G acts on M, it is useful to think of G as a (sub)group of symmetries of M. Transitivity means that M is highly symmetric: there are enough symmetries to move any point to any other point.

Example. Let us present some examples of actions:

1. On a vector space V, the group Aut(V) acts linearly by (A, v) → Av.
2. If φ ∈ Hom(G, H) then G acts on H by (x, y) → φ(x)y. Especially, G acts on G transitively by (x, y) → xy.
3. The rotation group SO(n) acts transitively on the sphere S^{n−1} := {x = (x_j)_{j=1}^n ∈ R^n | x_1² + ··· + x_n² = 1} by (A, x) → Ax.
4. If H < G and ((x, p) → x · p) : G × M → M is an action then the restriction ((x, p) → x · p) : H × M → M is an action.

Definition 6.3.3 (Isotropy subgroup). Let ((x, p) → x · p) : G × M → M be an action. The isotropy subgroup of q ∈ M is

G_q := {x ∈ G | x · q = q}.

That is, G_q ⊂ G contains those symmetries that fix the point q ∈ M.

Theorem 6.3.4. Let ((x, p) → x · p) : G × M → M be a transitive action, and let q ∈ M. Then the isotropy subgroup G_q is a subgroup for which

f_q := (xG_q → x · q) : G/G_q → M

is a bijection.

Remark 6.3.5. If G_q ◁ G then G/G_q is a group; otherwise the quotient is just a set. Notice also that the choice of q ∈ M here is essentially irrelevant.

Example. Let G = SO(3), M = S², and let q ∈ S² be the north pole (i.e., q = (0, 0, 1) ∈ R³). Then G_q < SO(3) consists of the rotations around the vertical axis (passing through the north and south poles). Since SO(3) acts transitively on S², we get a bijection SO(3)/G_q → S². The reader may think about how A ∈ SO(3) moves the north pole q ∈ S² to Aq ∈ S².
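The transitivity of SO(3) on S² can be made concrete: for p ∈ S² with spherical angles (θ, φ), the rotation A = R_z(φ)R_y(θ) satisfies Aq = p, where q is the north pole, and the isotropy subgroup G_q consists of the rotations R_z(t). A numerical sketch (the target point p is arbitrary):

```python
import numpy as np

def rot_z(t):        # rotation about the z-axis (fixes the north pole)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(t):        # rotation about the y-axis
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

q = np.array([0.0, 0.0, 1.0])                          # north pole
p = np.array([1.0, 2.0, 2.0]); p /= np.linalg.norm(p)  # arbitrary target on S^2

# Spherical angles of p: p = (sin t cos f, sin t sin f, cos t).
theta = np.arccos(p[2])
phi = np.arctan2(p[1], p[0])
A = rot_z(phi) @ rot_y(theta)                          # A in SO(3) with A q = p

assert np.allclose(A.T @ A, np.eye(3)) and np.isclose(np.linalg.det(A), 1.0)
assert np.allclose(A @ q, p)                           # transitivity in action
assert np.allclose(rot_z(1.234) @ q, q)                # the isotropy subgroup G_q fixes q
```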
Chapter 6. Groups
Proof of Theorem 6.3.4. Let a, b ∈ Gq. Then e · q = q, (ab) · q = a · (b · q) = a · q = q, and a−1 · q = a−1 · (a · q) = (a−1a) · q = e · q = q, so that Gq < G. Let x, y ∈ G. Since (xa) · q = x · (a · q) = x · q for every a ∈ Gq,
f = (xGq → x · q) : G/Gq → M
is a well-defined mapping. If x · q = y · q then (x−1y) · q = x−1 · (y · q) = x−1 · (x · q) = (x−1x) · q = e · q = q, i.e., x−1y ∈ Gq, that is, xGq = yGq; hence f is injective. Take p ∈ M. By transitivity, there exists x ∈ G such that x · q = p. Thereby f(xGq) = x · q = p, i.e., f is surjective.

Remark 6.3.6. If an action ((x, p) → x · p) : G × M → M is not transitive, it is often reasonable to study only the orbit of q ∈ M, defined by
G · q := {x · q | x ∈ G}.
Now ((x, p) → x · p) : G × (G · q) → (G · q) is transitive, and (x · q → xGq) : G · q → G/Gq is a bijection. Notice that either G · p = G · q or (G · p) ∩ (G · q) = ∅; thus the action of G cuts M into a disjoint union of “slices” (orbits).

Definition 6.3.7 (Unitary groups). Let ((v, w) → ⟨v, w⟩_H) be the inner product of a complex vector space H. Recall that the adjoint A∗ ∈ Aut(H) of A ∈ Aut(H) is defined by ⟨A∗v, w⟩_H := ⟨v, Aw⟩_H. The unitary group of H is
U(H) := {A ∈ Aut(H) | ∀v, w ∈ H : ⟨Av, Aw⟩_H = ⟨v, w⟩_H},
i.e., U(H) contains the unitary linear bijections H → H. Clearly A∗ = A−1 for A ∈ U(H). The unitary matrix group for C^n is
U(n) := {A = (a_{ij})_{i,j=1}^n ∈ GL(n, C) | A∗ = A−1},
see Remark 6.2.10; here A∗ = (ā_{ji})_{i,j=1}^n = A−1, i.e.,
∑_{k=1}^n ā_{ki} a_{kj} = δ_{ij}.
6.3. Group actions and representations
Definition 6.3.8 (Representations). A representation of a group G on a vector space V is any φ ∈ Hom(G, Aut(V)); the dimension of φ is dim(φ) := dim(V). A representation ψ ∈ Hom(G, U(H)) is called a unitary representation, and ψ ∈ Hom(G, U(n)) is called a unitary matrix representation.

Remark 6.3.9. The main idea here is that we can study a group G by using linear algebraic tools via representations φ ∈ Hom(G, Aut(V)).

Remark 6.3.10. There is a bijective correspondence between the representations of G on V and the linear actions of G on V. Indeed, if φ ∈ Hom(G, Aut(V)) then ((x, v) → φ(x)v) : G × V → V is an action of G on V. Conversely, if ((x, v) → x · v) : G × V → V is a linear action then (x → (v → x · v)) ∈ Hom(G, Aut(V)) is a representation of G on V.

Example. Let us give some examples of representations:
1. If G < Aut(V) then (A → A) ∈ Hom(G, Aut(V)).
2. If G < U(H) then (A → A) ∈ Hom(G, U(H)).
3. There is always the trivial representation (x → I) ∈ Hom(G, Aut(V)).
4. (Representations πL and πR). Let F(G) = C^G, i.e., the complex vector space of functions f : G → C. Let us define the left and right regular representations πL, πR ∈ Hom(G, Aut(F(G))) by
(πL(y)f)(x) := f(y−1x),
(πR(y)f)(x) := f(xy)
for all x, y ∈ G.
5. Let us identify the complex (1 × 1)-matrices with the complex numbers by the mapping ((z) → z) : C^{1×1} → C. Then U(1) is identified with the unit circle {z ∈ C : |z| = 1}, and (x → e^{ix·ξ}) ∈ Hom(R^n, U(1)) for all ξ ∈ R^n.
6. Analogously, (x → e^{i2πx·ξ}) ∈ Hom(R^n/Z^n, U(1)) for all ξ ∈ Z^n.
7. Let φ ∈ Hom(G, Aut(V)) and ψ ∈ Hom(G, Aut(W)), where V, W are vector spaces over the same field. Then
φ ⊕ ψ = (x → φ(x) ⊕ ψ(x)) ∈ Hom(G, Aut(V ⊕ W)),
φ ⊗ ψ = (x → φ(x) ⊗ ψ(x)) ∈ Hom(G, Aut(V ⊗ W)),
where V ⊕ W is the direct sum and V ⊗ W is the tensor product space.
8. If φ = (x → (φ(x)_{ij})_{i,j=1}^n) ∈ Hom(G, GL(n, C)) then so is the conjugate representation φ̄ = (x → (φ̄(x)_{ij})_{i,j=1}^n) ∈ Hom(G, GL(n, C)).
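For a finite cyclic group the left regular representation of item 4 can be written out directly. The following sketch (our own finite example, with G = Z/4, not taken from the text) checks the homomorphism property πL(y1y2) = πL(y1)πL(y2):

```python
# Left regular representation of Z/4 on F(G) = C^G:
# (pi_L(y) f)(x) = f(y^{-1} x), where y^{-1} x = (x - y) mod 4.
n = 4

def piL(y, f):
    return [f[(x - y) % n] for x in range(n)]

f = [1, 2, 3, 4]                       # an arbitrary f : Z/4 -> C
y1, y2 = 1, 3

lhs = piL((y1 + y2) % n, f)            # pi_L(y1 y2) f
rhs = piL(y1, piL(y2, f))              # pi_L(y1) pi_L(y2) f
print(lhs == rhs)                      # True: pi_L is a homomorphism
```

Each πL(y) merely permutes the values of f, which is why πL(y) is unitary for the averaged inner product of Exercise 6.3.20 below.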
Definition 6.3.11 (Invariant subspaces and irreducible representations). Let V be a vector space and A ∈ End(V). A subspace W ⊂ V is called A-invariant if AW ⊂ W, where AW = {Aw : w ∈ W}. Let φ ∈ Hom(G, Aut(V)). A subspace W ⊂ V is called φ-invariant if W is φ(x)-invariant for all x ∈ G (abbreviated φ(G)W ⊂ W); moreover, φ is irreducible if the only φ-invariant subspaces are the trivial subspaces {0} and V.

Remark 6.3.12 (Restricted representations). If W ⊂ V is φ-invariant for φ ∈ Hom(G, Aut(V)), we may define the restricted representation φ|W ∈ Hom(G, Aut(W)) by φ|W(x)w := φ(x)w. If φ is unitary then its restriction is also unitary.

Lemma 6.3.13. Let φ ∈ Hom(G, U(H)). Let W ⊂ H be a φ-invariant subspace. Then its orthocomplement
W⊥ = {v ∈ H | ∀w ∈ W : ⟨v, w⟩_H = 0}
is also φ-invariant.

Proof. If x ∈ G, v ∈ W⊥ and w ∈ W then
⟨φ(x)v, w⟩_H = ⟨v, φ(x)∗w⟩_H = ⟨v, φ(x)−1w⟩_H = ⟨v, φ(x−1)w⟩_H = 0,
meaning that φ(x)v ∈ W⊥.
Definition 6.3.14 (Direct sums). Let V be an inner product space and let {Vj}_{j∈J} be some family of its mutually orthogonal subspaces (i.e., ⟨vi, vj⟩_V = 0 if vi ∈ Vi, vj ∈ Vj and i ≠ j). The (algebraic) direct sum of {Vj}_{j∈J} is the subspace
W = ⊕_{j∈J} Vj := span( ∪_{j∈J} Vj ).
If Aj ∈ End(Vj) then let us define
A = ⊕_{j∈J} Aj ∈ End(W)
by Av := Aj v for all j ∈ J and v ∈ Vj. If φj ∈ Hom(G, Aut(Vj)) then we define
φ = ⊕_{j∈J} φj ∈ Hom(G, Aut(W))
by φ|Vj = φj for all j ∈ J, i.e., φ(x) := ⊕_{j∈J} φj(x) for all x ∈ G.
Remark 6.3.15. In a sense, irreducible representations are the building blocks of representations. Given a representation of a group, a fundamental task is to find its invariant subspaces, and describe the representation as a direct sum of irreducible representations. To reach this goal, we often have to assume some extra conditions, e.g., of algebraic or topological nature.

Theorem 6.3.16 (Reducing finite-dimensional representations). Let φ ∈ Hom(G, U(H)) be finite-dimensional. Then φ is a direct sum of irreducible unitary representations.

Proof (by induction). The claim is true for dim(H) = 1, since then the only subspaces of H are the trivial ones. Suppose the claim is true for representations of dimension n or less. Suppose dim(H) = n + 1. If φ is irreducible, there is nothing to prove. Hence assume that there exists a non-trivial φ-invariant subspace W ⊂ H. Then also the orthocomplement W⊥ is φ-invariant by Lemma 6.3.13. Due to the φ-invariance of the subspaces W and W⊥, we may define restricted representations φ|W ∈ Hom(G, U(W)) and φ|W⊥ ∈ Hom(G, U(W⊥)). Hence H = W ⊕ W⊥ and φ = φ|W ⊕ φ|W⊥. Moreover, dim(W) ≤ n and dim(W⊥) ≤ n; the proof is complete, since unitary representations up to dimension n are direct sums of irreducible unitary representations by the induction hypothesis.

Remark 6.3.17. By Theorem 6.3.16, finite-dimensional unitary representations can be decomposed nicely. More precisely, if φ ∈ Hom(G, U(H)) is finite-dimensional then
H = ⊕_{j=1}^k Wj,  φ = ⊕_{j=1}^k φ|Wj,
where each φ|Wj ∈ Hom(G, U(Wj)) is irreducible.

Definition 6.3.18 (Equivalent representations). A linear mapping A : V → W is an intertwining operator between representations φ ∈ Hom(G, Aut(V)) and ψ ∈ Hom(G, Aut(W)), denoted by A ∈ Hom(φ, ψ), if Aφ(x) = ψ(x)A for all x ∈ G, i.e., if the diagram

        φ(x)
    V --------→ V
    |           |
  A |           | A
    ↓   ψ(x)    ↓
    W --------→ W

commutes for every x ∈ G. If A ∈ Hom(φ, ψ) is invertible then φ and ψ are said to be equivalent, denoted by φ ∼ ψ.

Remark 6.3.19. Always 0 ∈ Hom(φ, ψ), and Hom(φ, ψ) is a vector space. Moreover, if A ∈ Hom(φ, ψ) and B ∈ Hom(ψ, ξ) then BA ∈ Hom(φ, ξ).
Exercise 6.3.20. Let G be a finite group and let F(G) be the vector space of functions f : G → C. Let
∫_G f dμG := (1/|G|) ∑_{x∈G} f(x)
when f ∈ F(G). Let us endow F(G) with the inner product
⟨f, g⟩_{L²(μG)} := ∫_G f ḡ dμG.
Define πL, πR : G → Aut(F(G)) by (πL(y)f)(x) := f(y−1x), (πR(y)f)(x) := f(xy). Show that πL and πR are equivalent unitary representations.

Exercise 6.3.21. Let G be non-commutative with |G| = 6. Endow F(G) with the inner product given in Exercise 6.3.20. Find the πL-invariant subspaces and give orthogonal bases for them.

Exercise 6.3.22 (Torus T^n). Let us endow the n-dimensional torus T^n := R^n/Z^n with the quotient group structure and with the Lebesgue measure. Let πL, πR : T^n → L(L²(T^n)) be defined by (πL(y)f)(x) := f(x − y), (πR(y)f)(x) := f(x + y) for almost every x ∈ T^n. Show that πL and πR are equivalent reducible unitary representations. Describe the minimal πL- and πR-invariant subspaces containing the function x → e^{i2πx·ξ}, where ξ ∈ Z^n.

Remark 6.3.23. One of the main results in the representation theory of groups is Schur’s Lemma 6.3.25, according to which the intertwining space Hom(φ, φ) may be rather trivial. Most of the work for such a result is carried out in the proof of the following Proposition 6.3.24:

Proposition 6.3.24. Let φ ∈ Hom(G, Aut(Vφ)) and ψ ∈ Hom(G, Aut(Vψ)) be irreducible. If A ∈ Hom(φ, ψ) then either A = 0 or A : Vφ → Vψ is invertible.

Proof. The image AVφ ⊂ Vψ of A is ψ-invariant, because
ψ(G)AVφ = Aφ(G)Vφ = AVφ,
so that either AVφ = {0} or AVφ = Vψ, as ψ is irreducible. Hence either A = 0 or A is a surjection.
The kernel Ker(A) = {v ∈ Vφ | Av = 0} is φ-invariant, since
Aφ(G)Ker(A) = ψ(G)AKer(A) = ψ(G){0} = {0},
so that either Ker(A) = {0} or Ker(A) = Vφ, as φ is irreducible. Hence either A is injective or A = 0. Thus either A = 0 or A is bijective.

Corollary 6.3.25 (Schur’s Lemma (finite-dimensional, 1905)). Let φ ∈ Hom(G, Aut(V)) be irreducible and finite-dimensional. Then Hom(φ, φ) = CI = {λI | λ ∈ C}.

Proof. Let A ∈ Hom(φ, φ). The finite-dimensional linear operator A has an eigenvalue λ ∈ C, i.e., λI − A : V → V is not invertible. On the other hand, λI − A ∈ Hom(φ, φ), so that λI − A = 0 by Proposition 6.3.24, i.e., A = λI.

Corollary 6.3.26 (Representations of commutative groups). Let G be a commutative group. Irreducible finite-dimensional representations of G are one-dimensional.

Proof. Let φ ∈ Hom(G, Aut(V)) be irreducible, dim(φ) < ∞. Due to the commutativity of G,
φ(x)φ(y) = φ(xy) = φ(yx) = φ(y)φ(x)
for all x, y ∈ G, so that φ(G) ⊂ Hom(φ, φ). By Schur’s Lemma 6.3.25, Hom(φ, φ) = CI. Hence if v ∈ V then φ(G)span{v} = span{v}, i.e., span{v} is φ-invariant. Therefore span{v} = V for any v ≠ 0, so that dim(V) = 1.
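Schur’s Lemma can be probed numerically: for an irreducible finite-dimensional representation, the matrices commuting with every φ(x) should form a one-dimensional space. In the sketch below (our own illustration, not from the text) the two generators span the standard two-dimensional real irreducible representation of S3, and the commutant dimension is computed by vectorising the commutator equations:

```python
import numpy as np

c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])          # rotation by 120 degrees
m = np.array([[1.0, 0.0], [0.0, -1.0]])  # a reflection; r and m generate S3

# X commutes with A  <=>  (I (x) A - A^T (x) I) vec(X) = 0 (column-major vec):
# vec(AX) = (I (x) A) vec(X) and vec(XA) = (A^T (x) I) vec(X).
I = np.eye(2)
M = np.vstack([np.kron(I, A) - np.kron(A.T, I) for A in (r, m)])

commutant_dim = 4 - np.linalg.matrix_rank(M)
print(commutant_dim)                     # 1: only the scalar matrices remain
```

Checking only the two generators suffices, since any matrix commuting with them commutes with the whole group they generate.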
Corollary 6.3.27. Let φ ∈ Hom(G, U(Hφ)) and ψ ∈ Hom(G, U(Hψ)) be finite-dimensional. Then φ ∼ ψ if and only if there exists an isometric isomorphism B ∈ Hom(φ, ψ).

Remark 6.3.28 (Isometries). An isometry f : M → N between metric spaces (M, dM) and (N, dN) satisfies dN(f(x), f(y)) = dM(x, y) for all x, y ∈ M.

Proof of Corollary 6.3.27. The “if”-part is trivial. Assume that φ ∼ ψ. Recall that there are direct sum decompositions
φ = ⊕_{j=1}^m φj,  ψ = ⊕_{k=1}^n ψk,
where φj, ψk are irreducible unitary representations on Hφj, Hψk, respectively. Now n = m, since φ ∼ ψ. Moreover, we may arrange the indices so that φj ∼ ψj for each j. Choose invertible Aj ∈ Hom(φj, ψj). Then Aj∗ is invertible, and Aj∗ ∈ Hom(ψj, φj): if x ∈ G, v ∈ Hφj and w ∈ Hψj then
⟨Aj∗ψj(x)w, v⟩_{Hφ} = ⟨w, ψj(x)∗Aj v⟩_{Hψ}
 = ⟨w, ψj(x−1)Aj v⟩_{Hψ}
 = ⟨w, Aj φj(x−1)v⟩_{Hψ}
 = ⟨φj(x−1)∗Aj∗w, v⟩_{Hφ}
 = ⟨φj(x)Aj∗w, v⟩_{Hφ}.
Thereby Aj∗Aj ∈ Hom(φj, φj) is invertible. By Schur’s Lemma 6.3.25, Aj∗Aj = λj I, where λj ≠ 0. Let v ∈ Hφj be such that ‖v‖_{Hφ} = 1. Then
λj = λj ‖v‖²_{Hφ} = ⟨λj v, v⟩_{Hφ} = ⟨Aj∗Aj v, v⟩_{Hφ} = ⟨Aj v, Aj v⟩_{Hψ} = ‖Aj v‖²_{Hψ} > 0,
so that we may define Bj := λj^{−1/2} Aj ∈ Hom(φj, ψj). Then the mapping Bj : Hφj → Hψj is an isometry, Bj∗Bj = I. Finally, define
B := ⊕_{j=1}^m Bj.
Clearly, B : Hφ → Hψ is an isometric bijection, and B ∈ Hom(φ, ψ).
We have now dealt with groups in general. In the sequel, by specialising to certain classes of groups, we will obtain fruitful ground for further results in representation theory.
Chapter 7
Topological Groups

A topological group is a natural amalgam of topological spaces and groups: it is a Hausdorff space with continuous group operations. Topology adds a new flavour to representation theory. Especially interesting are compact groups, where group-invariant probability measures exist. Moreover, nice-enough functions on a compact group have Fourier series expansions, which generalise the classical Fourier series of periodic functions.
7.1 Topological groups

Next we marry topology to groups.

Definition 7.1.1 (Topological groups). A group G which is also a topological space is called a topological group if {e} ⊂ G is closed and if the group operations
((x, y) → xy) : G × G → G,  (x → x−1) : G → G
are continuous.

Remark 7.1.2. The reader may wonder why we assumed that {e} ⊂ G is closed – actually, this condition is left out in some other definitions of a topological group. The good property brought by this assumption is that topological groups then automatically become Hausdorff spaces (see Exercise 7.1.3), which appeals to those who work in analysis.

Example. In the following, when not specified, the topologies and the group operations are the usual ones:
1. Any group G endowed with the so-called discrete topology P(G) = {U : U ⊂ G} is a topological group.
2. Z, Q, R and C are topological groups when the group operation is the addition and the topology is as usual.
3. Q×, R×, C× are topological groups when the group operation is the multiplication and the topology is as usual.
4. Topological vector spaces are topological groups with vector addition: such a space is both a vector space and a Hausdorff topological space such that the vector space operations are continuous.
5. Let X be a Banach space. The set AUT(X) := Aut(X) ∩ L(X) of invertible bounded linear operators X → X forms a topological group with respect to the norm topology.
6. Subgroups of topological groups are topological groups.
7. If G and H are topological groups then G × H is a topological group. Actually, Cartesian products always preserve the topological group structure.

Exercise 7.1.3. Show that a topological group is a Hausdorff space.

Lemma 7.1.4. Let G be a topological group and y ∈ G. Then
x → xy,  x → yx,  x → x−1
are homeomorphisms G → G.

Proof. The mapping
(x → xy) : G → G × G → G,  x → (x, y) → xy,
is continuous as a composition of continuous mappings. Its inverse mapping (x → xy−1) : G → G is also continuous; hence this is a homeomorphism. Similarly, (x → yx) : G → G is a homeomorphism. By definition, the group operation (x → x−1) : G → G is continuous, and it is its own inverse.

Corollary 7.1.5. If U ⊂ G is open and S ⊂ G then SU, US, U−1 ⊂ G are open.

Proposition 7.1.6. Let G be a topological group. If H < G then H̄ < G. If H ◁ G then H̄ ◁ G.

Proof. Let H < G. Trivially e ∈ H ⊂ H̄. Now H̄H̄ ⊂ cl(HH) = H̄, where the inclusion is due to the continuity of the mapping ((x, y) → xy) : G × G → G. The continuity of the inversion (x → x−1) : G → G gives
(H̄)−1 ⊂ cl(H−1) = H̄.
Thus H̄ < G. Let H ◁ G and y ∈ G. Then yH̄ = cl(yH) = cl(Hy) = H̄y; notice how the homeomorphisms (x → yx), (x → xy) : G → G were used.
Remark 7.1.7. Let H < G and S ⊂ G. For analysis on the quotient space G/H, let us recall Remark 6.2.15: the mapping (x → xH) : G → G/H identifies the sets
SH = {sh : s ∈ S, h ∈ H} ⊂ G and {sH : s ∈ S} = {{sh : h ∈ H} : s ∈ S} ⊂ G/H.

Definition 7.1.8 (Quotient topology on G/H). Let G be a topological group, H < G. The quotient topology of G/H is
τ_{G/H} := {{uH : u ∈ U} : U ⊂ G open};
in other words, τ_{G/H} is the strongest (i.e., largest) topology for which the quotient map (x → xH) : G → G/H is continuous. If U ⊂ G is open, we may identify the sets UH ⊂ G and {uH : u ∈ U} ⊂ G/H.

Proposition 7.1.9. Let G be a topological group and H < G. Then a function f : G/H → C is continuous if and only if (x → f(xH)) : G → C is continuous.

Proof. If f ∈ C(G/H) then (x → f(xH)) ∈ C(G), since it is obtained by composing f and the continuous quotient map (x → xH) : G → G/H. Now suppose (x → f(xH)) ∈ C(G). Take open V ⊂ C. Then U := (x → f(xH))−1(V) ⊂ G is open, so that f−1(V) = {uH : u ∈ U} ⊂ G/H is open. Hence f ∈ C(G/H).

Proposition 7.1.10 (When is G/H Hausdorff?). Let G be a topological group and H < G. Then G/H is a Hausdorff space if and only if H is closed.

Proof. If G/H is a Hausdorff space then H = (x → xH)−1({H}) ⊂ G is closed, because the quotient map is continuous and the singleton {H} ⊂ G/H is closed. Next suppose H is closed. Take xH, yH ∈ G/H such that xH ≠ yH. Then
S := ((a, b) → a−1b)−1(H) ⊂ G × G
is closed, since H ⊂ G is closed and ((a, b) → a−1b) : G × G → G is continuous. Now (x, y) ∉ S, so we may take open sets U ∋ x and V ∋ y such that (U × V) ∩ S = ∅. Then the sets
Ũ := {uH : u ∈ U} ⊂ G/H and Ṽ := {vH : v ∈ V} ⊂ G/H
are disjoint and open, with xH ∈ Ũ and yH ∈ Ṽ. Thus G/H is Hausdorff.

Theorem 7.1.11 (When is G/H a topological group?). Let G be a topological group and H ◁ G. Then
((xH, yH) → xyH) : (G/H) × (G/H) → G/H,
(xH → x−1H) : G/H → G/H
are continuous. Moreover, G/H is a topological group if and only if H is closed.
Proof. We know already that the operations in the theorem are well-defined group operations, because H is normal in G. Recall from Remark 7.1.7 how we may identify certain subsets of G with subsets of G/H. A neighbourhood of the point xyH ∈ G/H is then of the form UH for some open U ⊂ G with U ∋ xy. Take open U1 ∋ x and U2 ∋ y such that U1U2 ⊂ U. Then
(xH)(yH) ⊂ (U1H)(U2H) = U1U2H ⊂ UH,
so that ((xH, yH) → xyH) : (G/H) × (G/H) → G/H is continuous. A neighbourhood of the point x−1H ∈ G/H is of the form VH for some open V ⊂ G with V ∋ x−1. But V−1 ∋ x is open, and (V−1)−1 = V, so that (xH → x−1H) : G/H → G/H is continuous. Notice that e_{G/H} = H. If G/H is a topological group, then H = (x → xH)−1({e_{G/H}}) ⊂ G is closed. On the other hand, if H ◁ G is closed then (G/H) \ {e_{G/H}} is identified with the open set (G \ H)H ⊂ G, i.e., {e_{G/H}} ⊂ G/H is closed.

Definition 7.1.12 (Continuous homomorphisms). Let G1, G2 be topological groups. Let
HOM(G1, G2) := Hom(G1, G2) ∩ C(G1, G2),
i.e., the set of continuous homomorphisms G1 → G2.

Remark 7.1.13. By Theorem 7.1.11, closed normal subgroups of G correspond bijectively to continuous surjective homomorphisms from G to some other topological group (up to isomorphism).

Remark 7.1.14. Let us recall some topological concepts: A topological space is connected if the only subsets which are both closed and open are the empty set and the whole space. A non-connected space is called disconnected. The component of a point x in a topological space is the largest connected subset containing x.

Proposition 7.1.15. Let G be a topological group and Ce ⊂ G the component of e. Then Ce ◁ G is closed.

Proof. Components are always closed, and e ∈ Ce by definition. Since Ce ⊂ G is connected, also Ce × Ce ⊂ G × G is connected. By the continuity of the group operations, CeCe ⊂ G and Ce−1 ⊂ G are connected. Since e = ee ∈ CeCe, we have CeCe ⊂ Ce. And since e = e−1 ∈ Ce−1, also Ce−1 ⊂ Ce. Take y ∈ G. Then y−1Cey ⊂ G is connected, by the continuity of (x → y−1xy) : G → G. Now e = y−1ey ∈ y−1Cey, so that y−1Cey ⊂ Ce; hence Ce is normal in G.

Proposition 7.1.16. Let ((x, p) → x · p) : G × M → M be a continuous action of G on M, and let q ∈ M. If Gq and G/Gq are connected then G is connected.
Proof. Suppose G is disconnected and Gq is connected. Then there are non-empty disjoint open sets U, V ⊂ G such that G = U ∪ V. The sets
Ũ := {uGq : u ∈ U} ⊂ G/Gq and Ṽ := {vGq : v ∈ V} ⊂ G/Gq
are non-empty and open, and G/Gq = Ũ ∪ Ṽ. Take u ∈ U and v ∈ V. As a continuous image of a connected set, uGq = (x → ux)(Gq) ⊂ G is connected; moreover u = ue ∈ uGq; thereby uGq ⊂ U. In the same way we see that vGq ⊂ V. Hence Ũ ∩ Ṽ = ∅, so that G/Gq is disconnected.

Corollary 7.1.17 (When is a group connected?). If G is a topological group, H < G is connected and G/H is connected then G is connected.

Proof. Using the notation of Proposition 7.1.16, let M = G/H, q = H and x · p = xp, so that Gq = H and G/Gq = G/H.

Exercise 7.1.18 (Groups SO(n), SU(n) and U(n) are connected). Show that SO(n), SU(n) and U(n) are connected for every n ∈ Z+. How about O(n)?

Exercise 7.1.19 (Finiteness of connected components). Prove that a compact topological group can have only finitely many connected components. Consequently, conclude that a discrete compact group is finite.
7.2 Representations of topological groups

Definition 7.2.1 (Strongly continuous representations). Let G be a topological group and H be a Hilbert space. A representation φ ∈ Hom(G, U(H)) is strongly continuous if (x → φ(x)v) : G → H is continuous for all v ∈ H.

Remark 7.2.2. The strong continuity in Definition 7.2.1 means that the mapping (x → φ(x)) : G → L(H) is continuous, when L(H) ⊃ U(H) is endowed with the strong operator topology:
Aj → A strongly  ⟺ (by definition)  ∀v ∈ H : ‖Aj v − Av‖_H → 0.
Why should we not endow U(H) with the operator norm topology (which is even stronger, i.e., a larger topology)? The reason is that there are interesting unitary representations which are continuous in the strong operator topology, but not in the operator norm topology. This phenomenon is exemplified by Exercise 7.2.3:

Exercise 7.2.3. Let us define πL : R^n → U(L²(R^n)) by
(πL(y)f)(x) := f(x − y)
for almost every x ∈ R^n. Show that πL is strongly continuous, but not norm continuous.
Definition 7.2.4 (Topologically irreducible representations). A strongly continuous φ ∈ Hom(G, U(H)) is called topologically irreducible if the only closed φ-invariant subspaces are the trivial ones {0} and H.

Exercise 7.2.5. Let V be a topological vector space and let W ⊂ V be an A-invariant subspace, where A ∈ Aut(V) is continuous. Show that the closure W̄ ⊂ V is also A-invariant.

Definition 7.2.6 (Cyclic representations and cyclic vectors). A strongly continuous φ ∈ Hom(G, U(H)) is called a cyclic representation if span φ(G)v ⊂ H is dense for some v ∈ H; such v is called a cyclic vector.

Example. If φ ∈ Hom(G, U(H)) is topologically irreducible then any non-zero v ∈ H is cyclic. Indeed, if V := span φ(G)v then φ(G)V ⊂ V and consequently φ(G)V̄ ⊂ V̄, so that V̄ is φ-invariant. If v ≠ 0 then V̄ = H, because of the topological irreducibility.

Definition 7.2.7 (Representation as a direct sum). A Hilbert space H is a direct sum of closed subspaces (Hj)_{j∈J}, denoted by
H = ⊕_{j∈J} Hj,
if the subspace family is pairwise orthogonal and the linear span of the set ∪_{j∈J} Hj is dense in H. Then the vectors in H have a unique orthogonal series expansion; more precisely,
∀x ∈ H ∀j ∈ J ∃! xj ∈ Hj :  x = ∑_{j∈J} xj,  ‖x‖²_H = ∑_{j∈J} ‖xj‖²_H.
If φ ∈ Hom(G, U(H)) and each Hj is φ-invariant then φ is said to be the direct sum
φ = ⊕_{j∈J} φ|Hj,
where φ|Hj = (x → φ(x)|Hj) ∈ Hom(G, U(Hj)).

Proposition 7.2.8 (Decomposition of strongly continuous representations). Let φ ∈ Hom(G, U(H)) be strongly continuous. Then
φ = ⊕_{j∈J} φ|Hj,
where each φ|Hj is cyclic.
Proof. Let J̃ be the family of all closed φ-invariant subspaces V ⊂ H for which φ|V is cyclic. Let
S := {s ⊂ J̃ | ∀V, W ∈ s : V = W or V ⊥ W}.
It is easy to see that {{0}} ∈ S, so that S ≠ ∅. Let us introduce a partial order on S by inclusion:
s1 ≤ s2  ⟺ (by definition)  s1 ⊂ s2.
The chains in S have upper bounds: if R ⊂ S is a chain then r ≤ ∪_{s∈R} s ∈ S for all r ∈ R. Therefore, by Zorn’s Lemma, there exists a maximal element t ∈ S. Let V be the closure of the span of ∪_{W∈t} W. To get a contradiction, suppose V ≠ H. Then there exists v ∈ V⊥ \ {0}. Since span(φ(G)v) is φ-invariant, its closure W0 is also φ-invariant (see Exercise 7.2.5). Clearly W0 ⊂ V⊥, and φ|W0 has cyclic vector v, yielding s := t ∪ {W0} ∈ S, where t ≤ s and s ≠ t. This contradicts the maximality of t; thus V = H.
Exercise 7.2.9. Fill in the details in the proof of Proposition 7.2.8. Exercise 7.2.10. Assuming that H is separable, prove Proposition 7.2.8 by ordinary induction (without resorting to Zorn’s Lemma).
7.3 Compact groups

Definition 7.3.1 ((Locally) compact groups). A topological group is a (locally) compact group if it is (locally) compact as a topological space.

Remark 7.3.2. We have the following properties:
1. Any group G with the discrete topology is a locally compact group; then G is a compact group if and only if it is finite.
2. Q, Q× are not locally compact groups; R, R×, C, C× are locally compact groups, but non-compact.
3. A normed vector space is a locally compact group if and only if it is finite-dimensional.
4. O(n), SO(n), U(n), SU(n) are compact groups.
5. GL(n) is a locally compact group, but non-compact.
6. If G, H are locally compact groups then G × H is a locally compact group.
7. If {Gj}_{j∈J} is a family of compact groups then ∏_{j∈J} Gj is a compact group.
8. If G is a compact group and H < G is closed then H is a compact group.
Proposition 7.3.3. Let ((x, p) → x · p) : G × M → M be a continuous action of a compact group G on a Hausdorff space M. Let q ∈ M. Then the mapping
f := (xGq → x · q) : G/Gq → G · q
is a homeomorphism.

Proof. We already know that f is a well-defined bijection. We need to show that f is continuous. An open subset of G · q is of the form V ∩ (G · q), where V ⊂ M is open. Since the action is continuous, also (x → x · q) : G → M is continuous, so that U := (x → x · q)−1(V) ⊂ G is open. Thereby
f−1(V ∩ (G · q)) = {xGq : x ∈ U} ⊂ G/Gq
is open. Thus f is continuous. The space G is compact and the quotient map (x → xGq) : G → G/Gq is continuous and surjective, so that G/Gq is compact. From general topology we know that a continuous bijection from a compact space to a Hausdorff space is a homeomorphism (see Proposition A.12.7).

Corollary 7.3.4. If G is compact, φ ∈ HOM(G, H) and K = Ker(φ) then
ψ := (xK → φ(x)) ∈ HOM(G/K, φ(G))
is a homeomorphism.

Proof. Using the notation of Proposition 7.3.3, we have M = H, q = eH, x · p = φ(x)p, so that Gq = K, G/Gq = G/K, G · q = φ(G) and ψ = f.

Remark 7.3.5. What could happen if we drop the compactness assumption in Corollary 7.3.4? If G and H are Banach spaces, φ ∈ L(G, H) is compact and dim(φ(G)) = ∞, then ψ = (x + Ker(φ) → φ(x)) : G/Ker(φ) → φ(G) is a bounded linear bijection, but ψ−1 is not bounded. However, if φ ∈ L(G, H) is a bijection then φ−1 is bounded by the Open Mapping Theorem (Theorem B.4.31).

Definition 7.3.6 (Uniform continuity on a topological group). Let G be a topological group. A function f : G → C is uniformly continuous if for every ε > 0 there exists an open U ∋ e such that
∀x, y ∈ G : x−1y ∈ U ⇒ |f(x) − f(y)| < ε.

Exercise 7.3.7. Under what circumstances is a polynomial p : R → C uniformly continuous? Show that if a continuous function f : R → C is periodic or vanishes outside a bounded set then it is uniformly continuous.

Theorem 7.3.8. If G is a compact group and f ∈ C(G) then f is uniformly continuous.
Proof. Take ε > 0. Define the open disk D(z, r) := {w ∈ C : |w − z| < r}, where z ∈ C, r > 0. Since f is continuous, the set Vx := f−1(D(f(x), ε)) ∋ x is open. Then x−1Vx ∋ e = ee is open, so that there exist open sets U1,x, U2,x ∋ e such that U1,xU2,x ⊂ x−1Vx, by the continuity of the group multiplication. Define Ux := U1,x ∩ U2,x. Since {xUx : x ∈ G} is an open cover of the compact space G, there is a finite subcover {xjUxj}_{j=1}^n. Now the set
U := ∩_{j=1}^n Uxj ∋ e
is open. Suppose x, y ∈ G are such that x−1y ∈ U. There exists k ∈ {1, . . . , n} such that x ∈ xkUxk, so that
x, y ∈ xU ⊂ xkUxkUxk ⊂ xk(xk−1Vxk) = Vxk,
yielding |f(x) − f(y)| ≤ |f(x) − f(xk)| + |f(xk) − f(y)| < 2ε.

Exercise 7.3.9. Let G be a compact group, x ∈ G, and let A be the closure of {x^n}_{n=1}^∞. Show that A < G.
7.4 Haar measure and integral

On a group it would be natural to integrate with respect to measures that are invariant under the group operations: consider, e.g., the Lebesgue integral on R^n. However, it is not obvious whether such invariant integrals exist in general. Next we will show that on a compact group there exists a unique translation-invariant probability functional, which corresponds to the so-called Haar measure.

Definition 7.4.1 (Positive functionals). Let X be a compact Hausdorff space and K ∈ {R, C}. Then C(X, K) is a Banach space over K with the norm
f → ‖f‖_{C(X,K)} := max_{x∈X} |f(x)|.
Its dual C(X, K)′ = L(C(X, K), K) consists of the bounded linear functionals C(X, K) → K, and is endowed with the Banach space norm
L → ‖L‖_{C(X,K)′} := sup{|Lf| : f ∈ C(X, K), ‖f‖_{C(X,K)} ≤ 1}.
A functional L : C(X, K) → K is called positive if Lf ≥ 0 whenever f ≥ 0.

Exercise 7.4.2. Let X be a compact Hausdorff space. Show that a positive linear functional L : C(X, R) → R is bounded.
Chapter 7. Topological Groups
By the Riesz Representation Theorem (see Theorem C.4.65), if L ∈ C(X, K) is positive then there exists a unique positive Borel regular measure μ on X such that Lf = f dμ X
for every f ∈ C(X, K); moreover, μ(X) = LC(X,K) . For short, C(X) := C(X,C). Note that this is different from Chapter A (see, e.g., Exercise A.6.6) where we wrote C(X) for C(X, R). In the sequel, we shall construct a unique positive normalised translationinvariant measure on G. More precisely, we shall prove the following result: Theorem 7.4.3 (Haar functional). Let G be a compact group. There exists a unique positive linear functional Haar ∈ C(G) such that Haar(f ) = Haar(x → f (yx)), Haar(1) = 1, for all y ∈ G, where 1 = (x → 1) ∈ C(G). Moreover, this Haar functional satisfies Haar(f ) = Haar(x → f (xy)) = Haar(x → f (x−1 )). Remark 7.4.4 (Haar measure and integral). By the Riesz Representation Theorem (see Theorem C.4.65), the Haar functional begets a unique Borel regular probability measure μG such that Haar(f ) = f dμG = f (x) dμG (x). G
G
This μG is called the Haar measure of G. Often the Haar measure is implicitly assumed, and we may write, e.g., f (x) dx := f dμG . G
G
Obviously, 1 dμG = μG (G) = 1, f (x) dx = f (yx) dx G G = f (xy) dx = f (x−1 ) dx.
G
G
G
Thus the Haar integral Haar(f ) = G f (x) dx can be thought of as the most natural average of f ∈ C(G). In practical applications we can know usually only
7.4. Haar measure and integral
455
finitely many values of f , i.e., we are able to take only samples {f (x) : x ∈ S} for a finite set S ⊂ G. Then a natural idea for approximating Haar(f ) would be computing the finite sum f (x) α(x), x∈S
where sampling weights α(x) ≥ 0 satisfy x∈S α(x) = 1. Of course, such a sum is not usually invariant under the group operations. The problem is to find clever choices for sampling sets and weights, some sort of almost uniformly distributed unit mass on G is needed; for this end we shall introduce convolutions. Example. If G is finite then f dμG = G
1 f (x). |G| x∈G
Example (Haar measure on Tn ). For Tn = Rn /Zn , f dμTn = f (x + Zn ) dx, Tn
[0,1)n
i.e., integration with respect to the Lebesgue measure on [0, 1)n . What follows is preparation for the proof of Theorem 7.4.3. Definition 7.4.5 (Sampling measures). Let G be a compact group. A function α : G → [0, 1] is a sampling measure on G, α ∈ SMG , if supp(α) := cl {a ∈ G : α(a) = 0}
is finite and
α(a) = 1.
a∈G
The set supp(α) ⊂ G is called the support of α. Since supp(α) is finite we also have supp(α) = {a ∈ G : α(a) = 0} and, therefore, a sampling measure α ∈ SMG can be regarded as a finitely supported probability measure on G, satisfying f dα = α ˇ ∗ f (e) = f ∗ α(e), ˇ G
where α ˇ (a) := α(a−1 ). Remark 7.4.6. A sampling measure is nothing else but α= αj δaj , j
where the sum is finite, aj ∈ G, δaj is the Dirac measure at aj (i.e., a probability measure supported at aj ), and j αj = 1.
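On a finite group the Haar integral is just the uniform average, and its invariance properties can be verified directly. The sketch below (our own finite example with G = S3, not from the text) checks the identities of Remark 7.4.4:

```python
# Haar integral on the finite group S3: Haar(f) = (1/|G|) sum_x f(x),
# invariant under x -> yx, x -> xy and x -> x^{-1}.
from itertools import permutations

G = list(permutations(range(3)))
mul = lambda g, h: tuple(g[h[i]] for i in range(3))
inv = lambda g: tuple(sorted(range(3), key=lambda i: g[i]))  # inverse permutation

f = {g: (i + 1) ** 2 for i, g in enumerate(G)}   # an arbitrary f : G -> R
haar = lambda F: sum(F(x) for x in G) / len(G)

y = G[3]
same = (haar(lambda x: f[x]) == haar(lambda x: f[mul(y, x)])
        == haar(lambda x: f[mul(x, y)]) == haar(lambda x: f[inv(x)]))
print(same)                                      # True
```

The equalities hold exactly because each of x → yx, x → xy, x → x−1 merely permutes the (integer-valued) summands.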
Definition 7.4.7 (Convolutions). Let α, β ∈ SMG and f ∈ C(G, K). The convolutions α ∗ β, α ∗ f, f ∗ β : G → K are defined by
α ∗ β(b) := ∑_{a∈G} α(a) β(a−1b),
α ∗ f(x) := ∑_{a∈G} α(a) f(a−1x),
f ∗ β(x) := ∑_{b∈G} f(xb−1) β(b).
Notice that these summations are finite, as the sampling measures are supported on finite sets.

Definition 7.4.8 (Semigroups and monoids). A semigroup is a non-empty set S with an operation ((r, s) → rs) : S × S → S satisfying r(st) = (rs)t for all r, s, t ∈ S. A semigroup is commutative if rs = sr for all r, s ∈ S. Moreover, if there exists e ∈ S such that es = se = s for all s ∈ S then S is called a monoid.

Example. Z+ = {n ∈ Z : n > 0} is a commutative monoid with respect to multiplication, and a commutative semigroup with respect to addition. If V is a vector space then (End(V), (A, B) → AB) is a monoid with e = I.

Lemma 7.4.9. The structure (SMG, (α, β) → α ∗ β) is a monoid.

Exercise 7.4.10. Prove Lemma 7.4.9. How is supp(α ∗ β) related to supp(α) and supp(β)? In which case is SMG a group? Show that SMG is commutative if and only if G is commutative.

Lemma 7.4.11. If α ∈ SMG then (f → α ∗ f), (f → f ∗ α) ∈ L(C(G, K)) and
‖α ∗ f‖_{C(G,K)} ≤ ‖f‖_{C(G,K)},
‖f ∗ α‖_{C(G,K)} ≤ ‖f‖_{C(G,K)}.
Moreover, α ∗ 1 = 1 = 1 ∗ α.

Proof. Trivially, α ∗ 1 = 1. Because (x → a−1x) : G → G is a homeomorphism and the summation is finite, α ∗ f ∈ C(G, K). Linearity of f → α ∗ f is clear. Next,
|α ∗ f(x)| ≤ ∑_{a∈G} α(a)|f(a−1x)| ≤ ∑_{a∈G} α(a)‖f‖_{C(G,K)} = ‖f‖_{C(G,K)}.
Similar conclusions hold also for f ∗ α.
Definition 7.4.12. Let G be a compact group. Let us define a mapping p_G : C(G, R) → R by p_G(f) := max(f) − min(f).
7.4. Haar measure and integral
Lemma 7.4.13. If f ∈ C(G, R) and α ∈ SM_G then
\[
  \min(f) \le \min(\alpha * f) \le \max(\alpha * f) \le \max(f), \qquad
  \min(f) \le \min(f * \alpha) \le \max(f * \alpha) \le \max(f),
\]
so that p_G(α ∗ f) ≤ p_G(f) and p_G(f ∗ α) ≤ p_G(f).

Proof. Now
\[
  \min(f) = \sum_{a\in G} \alpha(a)\min(f)
  \le \min_{x\in G} \sum_{a\in G} \alpha(a)\, f(a^{-1}x) = \min(\alpha * f),
\]
\[
  \max(\alpha * f) = \max_{x\in G} \sum_{a\in G} \alpha(a)\, f(a^{-1}x)
  \le \sum_{a\in G} \alpha(a)\max(f) = \max(f),
\]
and clearly min(α ∗ f) ≤ max(α ∗ f). The proof for f ∗ α is symmetric. □
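Lemma 7.4.13 says convolution never increases the oscillation p_G, and Proposition 7.4.15 below asserts that suitable sampling measures make it arbitrarily small. A minimal numerical sketch on the finite group Z₈ (helper names are ours): repeated convolution with a two-point sampling measure flattens f towards its mean, which on a finite group is exactly the Haar integral.

```python
# On Z_8, convolve f repeatedly with the two-point sampling measure
# alpha = (delta_0 + delta_1)/2 and watch p_G(f) = max f - min f shrink:
# the oscillation-taming idea behind Proposition 7.4.15.  Names are ours.
n = 8
f = [1.0 if x == 0 else 0.0 for x in range(n)]   # far from constant
alpha = {0: 0.5, 1: 0.5}

def conv(alpha, f):
    """(alpha * f)(x) = sum_a alpha(a) f(a^{-1} x) on Z_n."""
    return [sum(w * f[(x - a) % n] for a, w in alpha.items()) for x in range(n)]

def p(f):
    return max(f) - min(f)

oscillations = [p(f)]
for _ in range(40):
    f = conv(alpha, f)
    oscillations.append(p(f))

# Lemma 7.4.13: the oscillation never increases under convolution
assert all(b <= a + 1e-12 for a, b in zip(oscillations, oscillations[1:]))
# after 40 steps, alpha^(*40) * f is nearly a constant function ...
assert oscillations[-1] < 0.05
# ... and the preserved mean is the Haar integral of f on Z_8
assert abs(sum(f) / n - 1.0 / n) < 1e-12
```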
Exercise 7.4.14. Show that p_G is a bounded seminorm on C(G, R).

Proposition 7.4.15. Let f ∈ C(G, R). For every ε > 0 there exist α, β ∈ SM_G such that p_G(α ∗ f) < ε and p_G(f ∗ β) < ε.

Remark 7.4.16. This is the decisive stage in the construction of the Haar measure. The idea is that for a non-constant f ∈ C(G) we can find sampling measures α, β that tame the oscillations of f so that α ∗ f and f ∗ β are almost constant functions. It will turn out that there exists a unique constant function Haar(f)·1 approximated by convolutions of the type α ∗ f and f ∗ β. In the sequel, notice how compactness is exploited!

Proof. Let ε > 0. By Theorem 7.3.8, a continuous function on a compact group is uniformly continuous. Thus there exists an open set U ∋ e such that |f(x) − f(y)| < ε when x^{-1}y ∈ U. We notice easily that if γ ∈ SM_G then also |γ ∗ f(x) − γ ∗ f(y)| < ε when x^{-1}y ∈ U:
\[
  |\gamma * f(x) - \gamma * f(y)|
  = \Big| \sum_{a\in G} \gamma(a)\, \big( f(a^{-1}x) - f(a^{-1}y) \big) \Big|
  \le \sum_{a\in G} \gamma(a)\, \big| f(a^{-1}x) - f(a^{-1}y) \big|
  < \varepsilon.
\]
Let ε > 0. By the uniform continuity of f_φ, take an open set U ∋ e such that ‖f_φ(a) − f_φ(b)‖_H < ε if ab^{-1} ∈ U. Thereby if xy^{-1} ∈ U then
\[
  \| f_\varphi^K(x) - f_\varphi^K(y) \|_{H_\psi}^2
  = \int_{K/H} \| f_\varphi^K(x)(k) - f_\varphi^K(y)(k) \|_H^2 \,\mathrm{d}\mu_{K/H}(kH)
  = \int_{K/H} \| f_\varphi(xk) - f_\varphi(yk) \|_H^2 \,\mathrm{d}\mu_{K/H}(kH)
  \le \varepsilon^2.
\]
Hence f_φ^K ∈ C_ψ(G, H_ψ) ⊂ Ind_ψ^G H_ψ, so that we indeed have a mapping (f_φ → f_φ^K) : C_φ(G, H) → C_ψ(G, H_ψ).

Next, we claim that f_φ → f_φ^K defines a surjective linear isometry Ind_φ^G H → Ind_ψ^G H_ψ. Isometricity follows by
\[
  \| f_\varphi^K \|_{\mathrm{Ind}_\psi^G H_\psi}^2
  = \int_{G/K} \| f_\varphi^K(x) \|_{H_\psi}^2 \,\mathrm{d}\mu_{G/K}(xK)
  = \int_{G/K} \int_{K/H} \| f_\varphi^K(x)(k) \|_H^2 \,\mathrm{d}\mu_{K/H}(kH)\,\mathrm{d}\mu_{G/K}(xK)
  = \int_{G/K} \int_{K/H} \| f_\varphi(xk) \|_H^2 \,\mathrm{d}\mu_{K/H}(kH)\,\mathrm{d}\mu_{G/K}(xK)
  = \int_{G/H} \| f_\varphi(x) \|_H^2 \,\mathrm{d}\mu_{G/H}(xH)
  = \| f_\varphi \|_{\mathrm{Ind}_\varphi^G H}^2 .
\]
How about the surjectivity? The representation space Ind_ψ^G H_ψ is the closure of C_ψ(G, H_ψ), and H_ψ is the closure of C_φ(K, H). Consequently, Ind_ψ^G H_ψ is the closure of the vector space
\[
  C_\psi(G, C_\varphi(K, H))
  := \{ g \in C(G, C(K, H)) \mid \forall x\in G\ \forall k\in K\ \forall h\in H:\
       g(xk) = \psi(k)^* g(x),\ g(x)(kh) = \varphi(h)^* g(x)(k) \}.
\]
Given g ∈ C_ψ(G, C_φ(K, H)), define f_φ ∈ C_φ(G, H) by f_φ(x) := g(x)(e). Then f_φ^K = g, because
\[
  f_\varphi^K(x)(k) = f_\varphi(xk) = g(xk)(e) = \psi(k)^* g(x)(e) = g(x)(k).
\]
Thus (f_φ → f_φ^K) : C_φ(G, H) → C_ψ(G, C_φ(K, H)) is a linear isometric bijection. Hence this mapping can be extended uniquely to a linear isometric bijection A : Ind_φ^G H → Ind_ψ^G H_ψ.

Finally, A ∈ Hom(Ind_H^G φ, Ind_K^G Ind_H^K φ), since
\[
  \big( A\, \mathrm{Ind}_H^G \varphi(y)\, f_\varphi \big)(x)
  = f_\varphi^K(y^{-1}x)
  = \big( \mathrm{Ind}_K^G \psi(y)\, f_\varphi^K \big)(x)
  = \big( \mathrm{Ind}_K^G \psi(y)\, A f_\varphi \big)(x).
\]
This completes the proof. □
Exercise 7.9.15. Let H be a closed subgroup of a compact group G. Let φ = (h → I) ∈ Hom(H, U(H)), where I = (u → u) : H → H.
a) Show that Ind_φ^G H ≅ L²(G/H, H), where the L²(G/H, H) inner product is given by
\[
  \langle f_{G/H}, g_{G/H} \rangle_{L^2(G/H,\mathcal{H})}
  := \int_{G/H} \langle f_{G/H}(xH), g_{G/H}(xH) \rangle_{\mathcal{H}} \,\mathrm{d}\mu_{G/H}(xH)
\]
when f_{G/H}, g_{G/H} ∈ C(G/H, H).
b) Let K < G be closed. Let π_K and π_G be the left regular representations of K and G, respectively. Prove that π_G ∼ Ind_K^G π_K.

Remark 7.9.16 (Multiplicity of a representation). A fundamental result for induced representations is the Frobenius Reciprocity Theorem 7.9.17, stated below without proof. Let G be a compact group and let φ ∈ Hom(G, U(H)) be strongly continuous. Let n([ξ], φ) ∈ N denote the multiplicity of [ξ] ∈ Ĝ in φ, defined as follows: if φ = ⊕_{j=1}^k φ_j, where each φ_j is a continuous irreducible unitary representation, then
\[
  n([\xi], \varphi) := \left| \{ j \in \{1, \ldots, k\} : [\varphi_j] = [\xi] \} \right| .
\]
That is, n([ξ], φ) is the number of times ξ may occur as an irreducible component in a direct sum decomposition of φ.

Theorem 7.9.17 (Frobenius Reciprocity Theorem). Let G be a compact group and H < G be closed. Let ξ, η be continuous such that [ξ] ∈ Ĝ and [η] ∈ Ĥ. Then
\[
  n\big([\xi], \mathrm{Ind}_H^G \eta\big) = n\big([\eta], \mathrm{Res}_H^G \xi\big),
\]
where Res_H^G ξ is the restriction of ξ to H (see (7.4) for the definition of Res_H^G ξ).
7.9. Induced representations
Example. Let [ξ] ∈ Ĝ, H = {e} and η = (e → I) ∈ Hom(H, U(C)). Then π_L ∼ Ind_H^G η by Exercise 7.9.15, and Ĥ = {[η]}, so that
\[
  n\big([\xi], \mathrm{Ind}_H^G \eta\big)
  = n([\xi], \pi_L)
  \overset{\text{Peter--Weyl}}{=} \dim(\xi)
  = \dim(\xi)\, n([\eta], \eta)
  = n\Big( [\eta], \bigoplus_{j=1}^{\dim(\xi)} \eta \Big)
  = n\big( [\eta], \mathrm{Res}_H^G \xi \big).
\]
As it should be, this is in accordance with the Frobenius Reciprocity Theorem 7.9.17.

Example. Let [ξ], [η] ∈ Ĝ. Then by the Frobenius Reciprocity Theorem 7.9.17,
\[
  n\big([\xi], \mathrm{Ind}_G^G \eta\big)
  = n\big([\eta], \mathrm{Res}_G^G \xi\big)
  = n([\eta], \xi)
  = \begin{cases} 1, & \text{when } [\xi] = [\eta], \\ 0, & \text{when } [\xi] \ne [\eta]. \end{cases}
\]
Let φ be a finite-dimensional continuous unitary representation of G. Then φ = ⊕_{j=1}^k ξ_j, where each ξ_j is irreducible. Thereby
\[
  \mathrm{Ind}_G^G \varphi \sim \bigoplus_{j=1}^k \mathrm{Ind}_G^G \xi_j \sim \bigoplus_{j=1}^k \xi_j \sim \varphi;
\]
in other words, induction practically does nothing in this case.
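For finite groups the multiplicities in Theorem 7.9.17 can be computed with characters, via n([ξ], φ) = ⟨χ_φ, χ_ξ⟩ and the standard induced-character formula Ind_H^G χ(g) = (1/|H|) Σ_{x : x^{-1}gx ∈ H} χ(x^{-1}gx). The following sketch checks Frobenius reciprocity for G = S₃, H = A₃ ≅ C₃, a nontrivial character η of C₃ and the two-dimensional irreducible ξ of S₃; character values are entered by hand and all helper names are ours, not the book's.

```python
# Frobenius reciprocity n([xi], Ind_H^G eta) = n([eta], Res_H^G xi), checked
# by characters for G = S3, H = A3 = C3.  Sketch; names are ours.
import cmath
from itertools import permutations

G = list(permutations(range(3)))                  # S3 as permutation tuples

def mul(p, q):                                    # (p q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    q = [0, 0, 0]
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

H = [p for p in G if sum(1 for i in range(3) if p[i] != i) != 2]  # e and 3-cycles
Hset = set(H)

w = cmath.exp(2j * cmath.pi / 3)
def chi_eta(h):   # nontrivial character of C3: e -> 1, (012) -> w, (021) -> w^2
    return {(0, 1, 2): 1, (1, 2, 0): w, (2, 0, 1): w * w}[h]

def chi_xi(g):    # character of the 2-dimensional irreducible of S3
    fixed = sum(1 for i in range(3) if g[i] == i)
    return {3: 2, 1: 0, 0: -1}[fixed]             # value depends on cycle type

def induced(chi, g):  # Ind_H^G chi (g) = (1/|H|) sum over x with x^{-1} g x in H
    vals = [mul(mul(inv(x), g), x) for x in G]
    return sum(chi(v) for v in vals if v in Hset) / len(H)

def mult(chi1, chi2, S):  # multiplicity <chi1, chi2>_S = (1/|S|) sum chi1 conj(chi2)
    return sum(chi1(s) * complex(chi2(s)).conjugate() for s in S) / len(S)

lhs = mult(lambda g: induced(chi_eta, g), chi_xi, G)   # n([xi], Ind_H^G eta)
rhs = mult(chi_xi, chi_eta, H)                         # n([eta], Res_H^G xi)
assert abs(lhs - 1) < 1e-9 and abs(rhs - 1) < 1e-9     # both equal 1, as they must
```

Indeed, Ind_{C₃}^{S₃} η is exactly the two-dimensional irreducible, so both multiplicities are 1.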
Chapter 8

Linear Lie Groups

In this chapter we study linear Lie groups, i.e., Lie groups which are closed subgroups of GL(n, C). But first, a few words about general Lie groups:

Definition 8.0.1 (Lie groups). A Lie group is a C^∞-manifold which is also a group such that the group operations are C^∞-smooth.

We will be mostly interested in non-commutative Lie groups in view of the following:

Remark 8.0.2 (Commutative Lie groups). In the introduction to Part II we mentioned that in the case of commutative groups it is sufficient to study the cases of T^n and R^n. Indeed, we have the following two facts:
• Any compact commutative Lie group is isomorphic to the product of a torus with a finite commutative group.
• Any connected commutative Lie group is isomorphic to the product of a torus and a Euclidean space. In other words, if G is a connected commutative Lie group then G ≅ T^n × R^m for some n, m.
We will not prove these facts here but refer to, e.g., [20, p. 25] for further details.

Definition 8.0.3 (Linear Lie groups). A linear Lie group is a Lie group which is a closed subgroup of GL(n, C).

There is a result stating that any compact Lie group is diffeomorphic to a linear Lie group, and thereby the matrix groups are especially interesting. In fact, we have:

Corollary 8.0.4 (Universality of unitary groups). Let G be a compact Lie group. Then there is some n ∈ N such that G is isomorphic to a subgroup of U(n).
8.1 Exponential map

The fundamental tool for studying linear Lie groups is the matrix exponential map, treated below. Let us endow C^n with the Euclidean inner product
\[
  \langle x, y \rangle_{\mathbb{C}^n} := \sum_{j=1}^n x_j \overline{y_j}.
\]
The corresponding norm is x → ‖x‖_{C^n} := ⟨x, x⟩_{C^n}^{1/2}. We identify the matrix algebra C^{n×n} with L(C^n), the algebra of linear operators C^n → C^n. Let us endow C^{n×n} ≅ L(C^n) with the operator norm
\[
  \| Y \|_{\mathcal{L}(\mathbb{C}^n)} := \sup_{x \in \mathbb{C}^n:\ \|x\|_{\mathbb{C}^n} \le 1} \| Y x \|_{\mathbb{C}^n} .
\]
Notice that ‖XY‖_{L(C^n)} ≤ ‖X‖_{L(C^n)} ‖Y‖_{L(C^n)}. For a matrix X ∈ C^{n×n}, the exponential exp(X) ∈ C^{n×n} is defined by the power series
\[
  \exp(X) := \sum_{k=0}^{\infty} \frac{1}{k!} X^k ,
\]
where X^0 := I; this series converges in the Banach space C^{n×n} ≅ L(C^n), because
\[
  \sum_{k=0}^{\infty} \frac{1}{k!} \| X^k \|_{\mathcal{L}(\mathbb{C}^n)}
  \le \sum_{k=0}^{\infty} \frac{1}{k!} \| X \|_{\mathcal{L}(\mathbb{C}^n)}^k
  = \mathrm{e}^{\| X \|_{\mathcal{L}(\mathbb{C}^n)}} < \infty .
\]
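The series definition of exp can be tried directly in floating point. A NumPy-only sketch (the helper `expm_series` is ours, not a library routine): it checks the bound ‖exp(X)‖ ≤ e^{‖X‖} coming from the convergence estimate above, and the identity exp(−X) = exp(X)^{-1} of Proposition 8.1.1 below in the special case of the commuting pair X, −X.

```python
# Truncated exponential series exp(X) ~ sum_{k<N} X^k / k!, with two checks:
# the bound ||exp X|| <= e^{||X||} from the convergence estimate, and
# exp(-X) = exp(X)^{-1}.  NumPy-only sketch; `expm_series` is our helper.
import numpy as np

def expm_series(X, terms=40):
    S = term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k          # now term == X^k / k!
        S = S + term
    return S

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) * 0.5

opnorm = lambda A: np.linalg.norm(A, 2)
E = expm_series(X)
assert opnorm(E) <= np.exp(opnorm(X)) + 1e-9              # ||exp X|| <= e^{||X||}
assert opnorm(expm_series(-X) @ E - np.eye(4)) < 1e-10    # exp(-X) exp(X) = I
```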
Proposition 8.1.1. Let X, Y ∈ C^{n×n}. If XY = YX then exp(X + Y) = exp(X) exp(Y). Therefore exp : C^{n×n} → GL(n, C) satisfies exp(−X) = exp(X)^{-1}.

Proof. Now
\[
  \exp(X+Y)
  = \lim_{l\to\infty} \sum_{k=0}^{2l} \frac{1}{k!} (X+Y)^k
  \overset{XY=YX}{=} \lim_{l\to\infty} \sum_{k=0}^{2l} \frac{1}{k!} \sum_{i=0}^{k} \frac{k!}{i!\,(k-i)!}\, X^i Y^{k-i}
  = \lim_{l\to\infty} \Bigg( \sum_{i=0}^{l} \frac{1}{i!} X^i \sum_{j=0}^{l} \frac{1}{j!} Y^j
      + \sum_{\substack{i,j:\ i+j\le 2l, \\ \max(i,j)>l}} \frac{1}{i!\,j!} X^i Y^j \Bigg)
  = \lim_{l\to\infty} \Bigg( \sum_{i=0}^{l} \frac{1}{i!} X^i \Bigg) \Bigg( \sum_{j=0}^{l} \frac{1}{j!} Y^j \Bigg)
  = \exp(X)\exp(Y),
\]
since the remainder term satisfies
\[
  \Bigg\| \sum_{\substack{i,j:\ i+j\le 2l, \\ \max(i,j)>l}} \frac{1}{i!\,j!} X^i Y^j \Bigg\|_{\mathcal{L}(\mathbb{C}^n)}
  \le \sum_{\substack{i,j:\ i+j\le 2l, \\ \max(i,j)>l}} \frac{1}{i!\,j!} \|X\|_{\mathcal{L}(\mathbb{C}^n)}^i \|Y\|_{\mathcal{L}(\mathbb{C}^n)}^j
  \le \frac{1}{(l+1)!}\, c^{2l}\, l(l+1) \xrightarrow[l\to\infty]{} 0,
\]
where c := max(1, ‖X‖_{L(C^n)}, ‖Y‖_{L(C^n)}). Consequently, I = exp(0) = exp(X) exp(−X) = exp(−X) exp(X), so that we get exp(−X) = exp(X)^{-1}. □

Exercise 8.1.2. Verify the estimates and the ranges of the summation indices in the proof of Proposition 8.1.1.

Lemma 8.1.3. Let X ∈ C^{n×n} and P ∈ GL(n, C). Then
\[
  \exp(X^T) = \exp(X)^T, \qquad
  \exp(X^*) = \exp(X)^*, \qquad
  \exp(P X P^{-1}) = P \exp(X) P^{-1}.
\]
Proof. For the adjoint X^*,
\[
  \exp(X^*) = \sum_{k=0}^{\infty} \frac{1}{k!} (X^*)^k
  = \sum_{k=0}^{\infty} \frac{1}{k!} (X^k)^*
  = \Bigg( \sum_{k=0}^{\infty} \frac{1}{k!} X^k \Bigg)^{\!*}
  = \exp(X)^*,
\]
and similarly for the transpose X^T. Finally,
\[
  \exp(P X P^{-1}) = \sum_{k=0}^{\infty} \frac{1}{k!} (P X P^{-1})^k
  = \sum_{k=0}^{\infty} \frac{1}{k!}\, P X^k P^{-1}
  = P \exp(X) P^{-1}. \qquad \Box
\]
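Lemma 8.1.3 is easy to test numerically with a production-quality matrix exponential such as `scipy.linalg.expm` (a sketch; the shift by 6I merely makes P comfortably invertible):

```python
# Lemma 8.1.3 numerically: exp commutes with transpose, adjoint and
# similarity.  Sketch using scipy.linalg.expm.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
P = 6.0 * np.eye(3) + rng.standard_normal((3, 3))   # comfortably invertible
Pinv = np.linalg.inv(P)

assert np.allclose(expm(X.T), expm(X).T)                  # exp(X^T) = exp(X)^T
assert np.allclose(expm(X.conj().T), expm(X).conj().T)    # exp(X^*) = exp(X)^*
assert np.allclose(expm(P @ X @ Pinv), P @ expm(X) @ Pinv)
```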
Proposition 8.1.4. If λ ∈ C is an eigenvalue of X ∈ C^{n×n} then e^λ is an eigenvalue of exp(X). Consequently det(exp(X)) = e^{Tr(X)}.

Proof. Choose P ∈ GL(n, C) such that Y := PXP^{-1} ∈ C^{n×n} is upper triangular; the eigenvalues of X and Y are the same, and for triangular matrices the eigenvalues are the diagonal elements. Since Y^k is upper triangular for every k ∈ N, exp(Y) is upper triangular. Moreover, (Y^k)_{jj} = (Y_{jj})^k, so that (exp(Y))_{jj} = e^{Y_{jj}}. The eigenvalues of exp(X) and exp(Y) = P exp(X) P^{-1} are the same. The determinant of a matrix is the product of its eigenvalues; the trace of a matrix is the sum of its eigenvalues; this implies the last claim. □

Remark 8.1.5. Recall that HOM(G, H) is the set of continuous homomorphisms from G to H, see Definition 7.1.12.

Theorem 8.1.6 (The form of HOM(R, GL(n, C))). We have
\[
  \mathrm{HOM}(\mathbb{R}, \mathrm{GL}(n,\mathbb{C})) = \left\{ t \mapsto \exp(tX) \mid X \in \mathbb{C}^{n\times n} \right\}.
\]
Proof. It is clear that (t → exp(tX)) ∈ HOM(R, GL(n, C)), since it is continuous and exp(sX) exp(tX) = exp((s + t)X). Let φ ∈ HOM(R, GL(n, C)). Then φ(s + t) = φ(s)φ(t) implies that
\[
  \int_0^h \varphi(s)\,\mathrm{d}s\; \varphi(t) = \int_0^h \varphi(s+t)\,\mathrm{d}s = \int_t^{t+h} \varphi(u)\,\mathrm{d}u.
\]
Recall that if ‖I − A‖_{L(C^n)} < 1 then A ∈ C^{n×n} is invertible; now
\[
  \Big\| I - \frac{1}{h} \int_0^h \varphi(s)\,\mathrm{d}s \Big\|_{\mathcal{L}(\mathbb{C}^n)}
  = \Big\| \frac{1}{h} \int_0^h (I - \varphi(s))\,\mathrm{d}s \Big\|_{\mathcal{L}(\mathbb{C}^n)}
  \le \sup_{0 \le s \le h} \| I - \varphi(s) \|_{\mathcal{L}(\mathbb{C}^n)} < 1
\]
for small enough h > 0, by the continuity of φ.

Proposition 8.1.7 (Logarithm). For A ∈ C^{n×n} with ‖I − A‖_{L(C^n)} < 1, define
\[
  \log(A) := -\sum_{l=1}^{\infty} \frac{1}{l} (I-A)^l .
\]
There exists r > 0 such that log(exp(X)) = X if ‖X‖_{L(C^n)} < r.

Proof. Let c := ‖I − A‖_{L(C^n)} < 1 for a matrix A ∈ C^{n×n}. Then
\[
  \sum_{k=1}^{\infty} \frac{1}{k} \| (I-A)^k \|_{\mathcal{L}(\mathbb{C}^n)}
  \le \sum_{k=1}^{\infty} \frac{1}{k} \| I-A \|_{\mathcal{L}(\mathbb{C}^n)}^k
  \le \sum_{k=1}^{\infty} c^k = \frac{c}{1-c} < \infty,
\]
so that log(A) is well defined. Noticing that I and A commute, we have
\[
  \exp(\log(A)) = \sum_{k=0}^{\infty} \frac{1}{k!} \Bigg( -\sum_{l=1}^{\infty} \frac{1}{l} (I-A)^l \Bigg)^{\!k} = A,
\]
because if |1 − a| < 1 for a number a ∈ C, then
\[
  \mathrm{e}^{\ln a} = \sum_{k=0}^{\infty} \frac{1}{k!} \Bigg( -\sum_{l=1}^{\infty} \frac{1}{l} (1-a)^l \Bigg)^{\!k} = a. \tag{8.1}
\]
Due to the continuity of the exponential function, there exists r > 0 such that |1 − e^x| < 1 if x ∈ C satisfies |x| < r, and then
\[
  \ln(\mathrm{e}^x) = -\sum_{l=1}^{\infty} \frac{1}{l} (1-\mathrm{e}^x)^l
  = -\sum_{l=1}^{\infty} \frac{1}{l} \Bigg( 1 - \sum_{k=0}^{\infty} \frac{1}{k!} x^k \Bigg)^{\!l} = x, \tag{8.2}
\]
so that if X ∈ C^{n×n} satisfies ‖X‖_{L(C^n)} < r then
\[
  \log(\exp(X)) = -\sum_{l=1}^{\infty} \frac{1}{l} (I-\exp(X))^l
  = -\sum_{l=1}^{\infty} \frac{1}{l} \Bigg( I - \sum_{k=0}^{\infty} \frac{1}{k!} X^k \Bigg)^{\!l} = X. \qquad \Box
\]

Exercise 8.1.8. Find an estimate for r in Proposition 8.1.7.

Exercise 8.1.9. Justify formulae (8.1) and (8.2) and their matrix forms.

Corollary 8.1.10. Let r be as above and B := {X ∈ C^{n×n} : ‖X‖_{L(C^n)} < r}. Then (X → exp(X)) : B → exp(B) is a diffeomorphism (i.e., a bijective C^∞-smooth mapping).

Proof. As exp and log are defined by power series, they are not just C^∞-smooth but also analytic. □
Lemma 8.1.11. Let X, Y ∈ C^{n×n}. Then
\[
  \exp(X+Y) = \lim_{m\to\infty} \big( \exp(X/m)\, \exp(Y/m) \big)^m
\]
and
\[
  \exp([X,Y]) = \lim_{m\to\infty} \{ \exp(X/m), \exp(Y/m) \}^{m^2},
\]
where [X, Y] := XY − YX and {a, b} := a b a^{-1} b^{-1}.

Proof. As t → 0,
\[
  \exp(tX)\exp(tY)
  = \Big( I + tX + \frac{t^2}{2} X^2 + O(t^3) \Big) \Big( I + tY + \frac{t^2}{2} Y^2 + O(t^3) \Big)
  = I + t(X+Y) + \frac{t^2}{2} \big( X^2 + 2XY + Y^2 \big) + O(t^3),
\]
so that
\[
  \{ \exp(tX), \exp(tY) \}
  = \Big( I + t(X+Y) + \frac{t^2}{2}(X^2 + 2XY + Y^2) + O(t^3) \Big)
    \Big( I - t(X+Y) + \frac{t^2}{2}(X^2 + 2XY + Y^2) + O(t^3) \Big)
  = I + t^2 (XY - YX) + O(t^3)
  = I + t^2 [X,Y] + O(t^3).
\]
Since exp is an injection in a neighbourhood of the origin 0 ∈ C^{n×n}, we have
\[
  \exp(tX)\exp(tY) = \exp\big( t(X+Y) + O(t^2) \big), \qquad
  \{ \exp(tX), \exp(tY) \} = \exp\big( t^2 [X,Y] + O(t^3) \big)
\]
as t → 0. Notice that exp(X)^m = exp(mX) for all m ∈ N. Therefore we get
\[
  \lim_{m\to\infty} \big( \exp(X/m)\exp(Y/m) \big)^m
  = \lim_{m\to\infty} \exp\big( X + Y + O(m^{-1}) \big) = \exp(X+Y),
\]
\[
  \lim_{m\to\infty} \{ \exp(X/m), \exp(Y/m) \}^{m^2}
  = \lim_{m\to\infty} \exp\big( [X,Y] + O(m^{-1}) \big) = \exp([X,Y]). \qquad \Box
\]
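Lemma 8.1.11 is also easy to probe numerically; since the convergence is only O(1/m), the tolerances below are deliberately loose (a SciPy sketch; the scales and constants are ad hoc choices of ours):

```python
# Lemma 8.1.11 numerically: the Lie product formula and the commutator
# formula, both with O(1/m) convergence.  SciPy sketch.
import numpy as np
from numpy.linalg import matrix_power
from scipy.linalg import expm

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3)) * 0.2
Y = rng.standard_normal((3, 3)) * 0.2
m = 500

# Lie product formula: (exp(X/m) exp(Y/m))^m -> exp(X + Y)
trotter = matrix_power(expm(X / m) @ expm(Y / m), m)
assert np.linalg.norm(trotter - expm(X + Y), 2) < 5e-2

# commutator formula: {exp(X/m), exp(Y/m)}^{m^2} -> exp([X, Y])
comm = expm(X / m) @ expm(Y / m) @ expm(-X / m) @ expm(-Y / m)
assert np.linalg.norm(matrix_power(comm, m * m) - expm(X @ Y - Y @ X), 2) < 1e-1
```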
8.2 No small subgroups for Lie, please

Definition 8.2.1 (“No small subgroups” property). A topological group is said to have the “no small subgroups” property if there exists a neighbourhood of the neutral element containing no non-trivial subgroups.
We shall show that this property characterises Lie groups among compact groups.

Example. Let {G_j}_{j∈J} be an infinite family of compact groups, each having more than one element. Let us consider the compact product group G := ∏_{j∈J} G_j. Let
\[
  H_j := \{ x \in G \mid \forall i \in J \setminus \{j\} :\ x_i = e_{G_i} \}.
\]
Then G_j ≅ H_j < G, and H_j is a non-trivial subgroup of G. If V ⊂ G is a neighbourhood of e ∈ G then it contains all but perhaps finitely many H_j, due to the definition of the product topology. Hence in this case G “has small subgroups” (i.e., does not have the “no small subgroups” property).

Theorem 8.2.2 (Kernels of representations). Let G be a compact group and V ⊂ G open such that e ∈ V. Then there exists φ ∈ HOM(G, U(n)) for some n ∈ Z^+ such that Ker(φ) ⊂ V.

Proof. First, {e} ⊂ G and G \ V ⊂ G are disjoint closed subsets of a compact Hausdorff space G. By Urysohn’s Lemma (Theorem A.12.11), there exists f ∈ C(G) such that f(e) = 1 and f(G \ V) = {0}. Since trigonometric polynomials are dense in C(G) by Theorem 7.6.2, we may take p ∈ TrigPol(G) such that ‖p − f‖_{C(G)} < 1/2. Then
\[
  \mathcal{H} := \mathrm{span}\, \{ \pi_R(x) p \mid x \in G \} \subset L^2(\mu_G)
\]
is a finite-dimensional vector space, and H inherits the inner product from L²(μ_G). Let A : H → C^n be a linear isometry, where n = dim(H). Let us identify U(C^n) with U(n). Define φ ∈ Hom(G, U(n)) by
\[
  \varphi(x) := A \left( \pi_R(x)|_{\mathcal{H}} \right) A^{-1}.
\]
Then φ is clearly a continuous unitary representation. For every x ∈ G \ V,
\[
  |p(x) - 0| = |p(x) - f(x)| \le \| p - f \|_{C(G)} < 1/2,
\]
so that p(x) ≠ p(e), because
\[
  |p(e) - 1| = |p(e) - f(e)| \le \| p - f \|_{C(G)} < 1/2;
\]
consequently π_R(x) p ≠ p. Thus Ker(φ) ⊂ V. □

Corollary 8.2.3 (Characterisation of linear Lie groups). Let G be a compact group. Then G has no small subgroups if and only if it is isomorphic to a linear Lie group.

Proof. Let G be a compact group without small subgroups. By Theorem 8.2.2, for some n ∈ Z^+ there exists an injective φ ∈ HOM(G, U(n)). Then (x → φ(x)) : G → φ(G) is an isomorphism, and a homeomorphism by Proposition A.12.7, because φ is continuous, G is compact and U(n) is Hausdorff. Thus φ(G) < U(n) < GL(n, C) is a compact linear Lie group.
Conversely, suppose G < GL(n, C) is closed. Recall that the mapping (X → exp(X)) : B → exp(B) is a homeomorphism, where B = {X ∈ C^{n×n} : ‖X‖_{L(C^n)} < r} for some small r > 0. Hence V := exp(B/2) ∩ G is a neighbourhood of I ∈ G. In the search for a contradiction, suppose there exists a non-trivial subgroup H < G such that A ∈ H ⊂ V and A ≠ I. Then 0 ≠ log(A) ∈ B/2, so that m log(A) ∈ B \ (B/2) for some m ∈ Z^+. Thereby
\[
  \exp(m \log(A)) = \exp(\log(A))^m = A^m \in H \subset V \subset \exp(B/2),
\]
but also
\[
  \exp(m \log(A)) \in \exp(B \setminus (B/2)) = \exp(B) \setminus \exp(B/2);
\]
this is a contradiction. □

Remark 8.2.4. Actually, it is shown above that Lie groups have no small subgroups; compactness played no role in this part of the proof.

Exercise 8.2.5. Use the Peter–Weyl Theorem 7.5.14 to provide an alternative proof for Theorem 8.2.2. Hint: For each x ∈ G \ V there exists φ_x ∈ HOM(G, U(n_x)) such that x ∉ Ker(φ_x), because. . .
8.3 Lie groups and Lie algebras

Next we deal with the representation theory of Lie groups. We introduce Lie algebras, which sometimes still bear the archaic label “infinitesimal groups”, a label that quite adequately describes their essence: a Lie algebra is a sort of locally linearised version of a Lie group.

Definition 8.3.1 (Lie algebras). A K-Lie algebra is a K-vector space V endowed with a bilinear mapping ((a, b) → [a, b]_V = [a, b]) : V × V → V satisfying [a, a] = 0 and
\[
  [a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0
\]
for all a, b, c ∈ V; the second identity is called the Jacobi identity. Notice that here [a, b] = −[b, a] for all a, b ∈ V. A vector subspace W ⊂ V of a Lie algebra V is called a Lie subalgebra if [a, b] ∈ W for all a, b ∈ W (and thus W is a Lie algebra in its own right). A linear mapping A : V₁ → V₂ between Lie algebras V₁, V₂ is called a Lie algebra homomorphism if [Aa, Ab]_{V₂} = A[a, b]_{V₁} for all a, b ∈ V₁.

Example. 1. For a K-vector space V, the trivial Lie product [a, b] := 0 gives a trivial Lie algebra.
2. A K-algebra A can be endowed with the canonical Lie product
\[
  (a, b) \mapsto [a, b] := ab - ba;
\]
this Lie algebra is denoted by Lie_K(A). Important special cases of such Lie algebras are
\[
  \mathrm{Lie}_{\mathbb{K}}(\mathbb{C}^{n\times n}) \cong \mathrm{Lie}_{\mathbb{K}}(\mathrm{End}(\mathbb{C}^n)), \qquad
  \mathrm{Lie}_{\mathbb{K}}(\mathrm{End}(V)), \qquad
  \mathrm{Lie}_{\mathbb{K}}(\mathcal{L}(X)),
\]
where X is a normed space and End(V) is the algebra of linear operators V → V on a vector space V. For short, let gl(V) := Lie_R(End(V)).
3. (Derivations of algebras). Let D(A) be the K-vector space of derivations of a K-algebra A; that is, D ∈ D(A) if it is a linear mapping A → A satisfying the Leibniz property
\[
  D(ab) = D(a)\, b + a\, D(b)
\]
for all a, b ∈ A. Then D(A) has a Lie algebra structure given by [D, E] := DE − ED. An important special case is A = C^∞(M), where M is a C^∞-manifold; if C^∞(M) is endowed with the topology of local uniform convergence for all derivatives, then D ∈ D(C^∞(M)) is continuous if and only if it is a linear first-order partial differential operator with smooth coefficients (alternatively, a smooth vector field on M).

Definition 8.3.2. The Lie algebra Lie(G) = g of a linear Lie group G is introduced in the following Theorem 8.3.3:

Theorem 8.3.3 (Lie algebras of linear Lie groups). Let G < GL(n, C) be closed. The R-vector space
\[
  \mathrm{Lie}(G) = \mathfrak{g} := \left\{ X \in \mathbb{C}^{n\times n} \mid \forall t \in \mathbb{R} :\ \exp(tX) \in G \right\}
\]
is a Lie subalgebra of the R-Lie algebra Lie_R(C^{n×n}) ≅ gl(C^n).

Proof. Let X, Y ∈ g and λ ∈ R. Trivially, exp(tλX) ∈ G for all t ∈ R, yielding λX ∈ g. Since G is closed and exp is continuous,
\[
  G \ni \big( \exp(tX/m)\, \exp(tY/m) \big)^m \xrightarrow[m\to\infty]{} \exp\big( t(X+Y) \big) \in G,
\]
\[
  G \ni \{ \exp(tX/m), \exp(Y/m) \}^{m^2} \xrightarrow[m\to\infty]{} \exp\big( t[X,Y] \big) \in G
\]
by Lemma 8.1.11. Thereby X + Y, [X, Y] ∈ g. □
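Theorem 8.3.3 can be illustrated with G = SO(3): its Lie algebra so(3) consists of the antisymmetric real 3×3 matrices, exp(tX) then lands in SO(3) for every t, and so(3) is visibly closed under sums and brackets. A SciPy sketch (the parametrisation `skew` and the test values are our own choices):

```python
# Theorem 8.3.3 for G = SO(3): antisymmetric X gives exp(tX) orthogonal with
# determinant 1, and so(3) is closed under + and [.,.].  SciPy sketch.
import numpy as np
from scipy.linalg import expm

def skew(a, b, c):
    """A generic antisymmetric 3x3 matrix, i.e., an element of so(3)."""
    return np.array([[0.0, -a, b], [a, 0.0, -c], [-b, c, 0.0]])

X, Y = skew(1.0, 0.5, -0.3), skew(-0.2, 0.7, 1.1)
for t in (0.1, 1.0, 7.3):
    R = expm(t * X)
    assert np.allclose(R.T @ R, np.eye(3))            # exp(tX) is orthogonal
    assert np.isclose(np.linalg.det(R), 1.0)          # in fact in SO(3)

bracket = X @ Y - Y @ X
assert np.allclose(bracket, -bracket.T)               # [X, Y] in so(3)
assert np.allclose(X + Y, -(X + Y).T)                 # X + Y in so(3)
```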
Exercise 8.3.4. Let X ∈ C^{n×n} be such that exp(tX) = I for all t ∈ R. Show that X = 0.

Exercise 8.3.5. Let g ⊂ C^{n×n} be the Lie algebra of a linear Lie group G < GL(n, R). Show that g ⊂ R^{n×n}.

Definition 8.3.6 (Dimension of a linear Lie group). Let G be a linear Lie group and g = Lie(G). The dimension of G is dim(G) := dim(g); if dim(g) = k then g ≅ R^k as a vector space.
Remark 8.3.7 (Exponential coordinates). From Theorem 8.1.6 it follows that
\[
  \mathrm{HOM}(\mathbb{R}, G) = \{ t \mapsto \exp(tX) \mid X \in \mathfrak{g} \}.
\]
The mapping (X → exp(X)) : g → G is a diffeomorphism in a small neighbourhood of 0 ∈ g. Hence, given a vector space basis for g ≅ R^k, a small neighbourhood of exp(0) = I ∈ G is endowed with the so-called exponential coordinates. If G is compact and connected then exp(g) = G, so that the exponential map may “wrap g around G”; we shall not prove this.

Remark 8.3.8. Informally speaking, if X, Y ∈ g are near 0 ∈ g, x := exp(X) and y := exp(Y) then x, y ∈ G are near I ∈ G and
\[
  \exp(X+Y) \approx xy, \qquad \exp([X,Y]) \approx \{x, y\} = xyx^{-1}y^{-1}.
\]
In a sense, the Lie algebra g is the infinitesimally linearised G near I ∈ G.

Remark 8.3.9 (Lie algebra as invariant vector fields). The Lie algebra g can be identified with the tangent space of G at the identity I ∈ G. Using left-translations (resp. right-translations), g can be identified with the set of left-invariant (resp. right-invariant) vector fields on G, and vector fields have a natural interpretation as first-order partial differential operators on G: For x ∈ G, X ∈ g and f ∈ C^∞(G), define
\[
  L_X f(x) := \frac{\mathrm{d}}{\mathrm{d}t} f\big( x \exp(tX) \big) \Big|_{t=0}, \qquad
  R_X f(x) := \frac{\mathrm{d}}{\mathrm{d}t} f\big( \exp(tX)\, x \big) \Big|_{t=0}.
\]
Then π_L(y) L_X f = L_X π_L(y) f and π_R(y) R_X f = R_X π_R(y) f for all y ∈ G, where π_L, π_R are the left and right regular representations of G, respectively.

Definition 8.3.10 (Abbreviations for Lie algebras). Some usual abbreviations are
  gl(n, K) = Lie(GL(n, K)),
  sl(n, K) = Lie(SL(n, K)),
  o(n) = Lie(O(n)),
  so(n) = Lie(SO(n)),
  u(n) = Lie(U(n)),
  su(n) = Lie(SU(n)),
and so on.

Exercise 8.3.11. Calculate the dimensions of the linear Lie groups mentioned in Definition 8.3.10.

Proposition 8.3.12. Let G, H be linear Lie groups having the respective Lie algebras g, h. Let ψ ∈ HOM(G, H). Then for every X ∈ g there exists a unique Y ∈ h such that ψ(exp(tX)) = exp(tY) for all t ∈ R.
Proof. Let X ∈ g. Then φ := (t → ψ(exp(tX))) : R → H is a continuous homomorphism, so that φ = (t → exp(tY)), where Y = φ′(0) ∈ h. □

Proposition 8.3.13. Let F, G, H be closed subgroups of GL(n, C), with their respective Lie algebras f, g, h. Then:
(a) H < G ⇒ h ⊂ g,
(b) the Lie algebra of F ∩ G is f ∩ g,
(c) the Lie algebra c_I of the component C_I < G of the neutral element I is g.

Proof. (a) If H < G and X ∈ h then exp(tX) ∈ H ⊂ G for all t ∈ R, so that X ∈ g.
(b) Let e be the Lie algebra of F ∩ G. By (a), e ⊂ f ∩ g. If X ∈ f ∩ g then exp(tX) ∈ F ∩ G for all t ∈ R, so that X ∈ e. Hence e = f ∩ g.
(c) By (a), c_I ⊂ g. Let X ∈ g. Now the connectedness of R (Theorem A.16.9) and the continuity of t → exp(tX) imply, by Proposition A.16.3, the connectedness of {exp(tX) : t ∈ R} ∋ exp(0) = I. Thereby {exp(tX) : t ∈ R} ⊂ C_I, so that X ∈ c_I. □
Example (Lie algebra of SL(n, K)). Let us compute the Lie algebra sl(n, K) of the linear Lie group
\[
  \mathrm{SL}(n, \mathbb{K}) = \{ A \in \mathrm{GL}(n, \mathbb{K}) \mid \det(A) = 1 \}.
\]
Notice that sl(n, K) ⊂ K^{n×n} by Exercise 8.3.5. Hence
\[
  \mathfrak{sl}(n, \mathbb{K})
  := \left\{ X \in \mathbb{K}^{n\times n} \mid \forall t \in \mathbb{R} :\ \exp(tX) \in \mathrm{SL}(n, \mathbb{K}) \right\}
  = \left\{ X \in \mathbb{K}^{n\times n} \mid \forall t \in \mathbb{R} :\ \exp(tX) \in \mathbb{K}^{n\times n},\ \det(\exp(tX)) = 1 \right\}.
\]
Let {λ_j}_{j=1}^n ⊂ C be the set of eigenvalues of X ∈ K^{n×n}. The characteristic polynomial (z → det(zI − X)) : C → C of X satisfies
\[
  \det(zI - X) = \prod_{j=1}^n (z - \lambda_j)
  = z^n - z^{n-1} \sum_{j=1}^n \lambda_j + \cdots + (-1)^n \prod_{j=1}^n \lambda_j
  = z^n - z^{n-1}\, \mathrm{Tr}(X) + \cdots + (-1)^n \det(X).
\]
We know that X is similar to an upper triangular matrix Y = PXP^{-1} for some P ∈ GL(n, K). Since
\[
  \det(zI - PXP^{-1}) = \det\big( P (zI - X) P^{-1} \big)
  = \det(P) \det(zI - X) \det(P^{-1}) = \det(zI - X),
\]
the eigenvalues of X and Y are the same, and they are on the diagonal of Y. Evidently, {e^{λ_j}}_{j=1}^n ⊂ C is the set of the eigenvalues of both exp(Y) and exp(X) = P^{-1} exp(Y) P. Since the determinant is the product of the eigenvalues and the trace is the sum of the eigenvalues, we have
\[
  \det(\exp(X)) = \prod_{j=1}^n \mathrm{e}^{\lambda_j} = \mathrm{e}^{\sum_{j=1}^n \lambda_j} = \mathrm{e}^{\mathrm{Tr}(X)}
\]
(see also Proposition 8.1.4). Therefore X ∈ sl(n, K) if and only if Tr(X) = 0 and exp(tX) ∈ K^{n×n} for all t ∈ R. Thus
\[
  \mathfrak{sl}(n, \mathbb{K}) = \left\{ X \in \mathbb{K}^{n\times n} \mid \mathrm{Tr}(X) = 0 \right\},
\]
as the reader may check. Next we ponder the relationship between Lie group and Lie algebra homomorphisms.
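The identity det(exp(X)) = e^{Tr(X)} and the resulting description of sl(n, K) are easy to confirm in floating point (a SciPy sketch with an arbitrary random matrix):

```python
# det(exp X) = e^{Tr X}, hence traceless X exponentiate into matrices of
# determinant 1: the description of sl(n) above.  SciPy sketch.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4))
assert np.isclose(np.linalg.det(expm(X)), np.exp(np.trace(X)))

X0 = X - (np.trace(X) / 4) * np.eye(4)       # project onto sl(4, R)
assert abs(np.trace(X0)) < 1e-12
for t in (0.5, 2.0, -2.0):
    assert np.isclose(np.linalg.det(expm(t * X0)), 1.0)
```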
d ψ(exp(tX)) dt
. t=0
Remark 8.3.15. Above, ψ is well defined since f := (t → ψ(exp(tX))) ∈ HOM(R, H) is of the form t → exp(tY ) for some Y ∈ h, as a consequence of Theorem 8.1.6. Moreover, Y = f (0) = ψ (X) holds, so that ψ(exp(tX)) = exp(tψ (X)). Theorem 8.3.16. Let F, G, H be linear Lie groups with respective Lie algebras f, g, h. Let φ ∈ HOM(F, G) and ψ ∈ HOM(G, H). The mapping ψ : g → h defined in Definition 8.3.14 is a Lie algebra homomorphism. Moreover, (ψ ◦ φ) = ψ φ
and
IdG = Idg ,
where IdG = (x → x) : G → G and Idg = (X → X) : g → g. Proof. Let X, Y ∈ g and λ ∈ R. Then ψ (λX)
= = =
d ψ(exp(tλX))|t=0 dt d λ ψ(exp(tX))|t=0 dt λψ (X).
8.3. Lie groups and Lie algebras
503
If t ∈ R then exp (tψ (X + Y ))
= ψ (exp(tX + tY )) m = ψ lim (exp(tX/m) exp(tY /m)) m→∞
= = =
m
lim (ψ(exp(tX/m)) ψ(exp(tY /m)))
m→∞
lim (exp(tψ (X)/m) exp(tψ (Y )/m))
m
m→∞
exp(t(ψ (X) + ψ (Y ))),
so that tψ (X + Y ) = t (ψ (X) + ψ (Y )) for small enough |t|, as we recall that exp is injective in a small neighbourhood of 0 ∈ g. Consequently, ψ : g → h is linear. Next, exp (tψ ([X, Y ])) = ψ (exp(t[X, Y ])) m2 = ψ lim {exp(tX/m), exp(tY /m)} m→∞
= =
m2
lim {exp(tψ (X)/m), exp(tψ (Y )/m)}
m→∞
exp (t[ψ (X), ψ (Y )]) ,
so that we get ψ ([X, Y ]) = [ψ (X), ψ (Y )]. Thus ψ : g → h is a Lie algebra homomorphism. If Z ∈ f then (ψ ◦ φ) (Z)
= = =
Finally,
d dt
d ψ (φ(exp(tZ))) |t=0 dt d ψ (exp(tφ (Z))) |t=0 dt ψ (φ (Z)).
exp(tX)|t=0 = X, yielding IdG = Idg .
Remark 8.3.17. Notice that isomorphic linear Lie groups must have isomorphic Lie algebras. Now we know that a continuous Lie group homomorphism ψ can naturally be linearised to get a Lie algebra homomorphism ψ , so that we have the commutative diagram ψ
G −−−−→ F ⏐ exp⏐
H, F ⏐exp ⏐
ψ
g −−−−→ h. If we are given a Lie algebra homomorphism f : g → h, does there exist φ ∈ HOM(G, H) such that φ = f ? This problem is studied in the following exercises.
Definition 8.3.18 (Simply connected spaces). A topological space X is said to be simply connected if X is path-connected and if every closed curve in X can be shrunk to a point continuously in the set X.

Exercise 8.3.19. Show that the groups SU(n) and SL(n, C) are both connected and simply connected.

Exercise 8.3.20. Show that the groups U(n) and GL(n, C) are connected but not simply connected.

Exercise 8.3.21. Let G, H be linear Lie groups such that G is simply connected. Let f : g → h be a Lie algebra homomorphism. Show that there exists φ ∈ HOM(G, H) such that φ′ = f. (This is a rather demanding task unless one knows that exp : g → G is surjective and uses Lemma 8.1.11. A proof can be found, e.g., in [37].)

Exercise 8.3.22. Related to Exercise 8.3.21, give an example of a non-simply-connected G and a homomorphism f : g → h which is not of the form f = φ′.

Lemma 8.3.23. Let g be the Lie algebra of a linear Lie group G, and
\[
  S := \left\{ \exp(X_1) \cdots \exp(X_m) \mid m \in \mathbb{Z}^+,\ \{X_j\}_{j=1}^m \subset \mathfrak{g} \right\}.
\]
Then S = C_I, the component of I ∈ G.

Proof. Now S < G is path-connected, since (t → exp(tX_1) · · · exp(tX_m)) : [0, 1] → S is continuous, connecting I ∈ S to the point exp(X_1) · · · exp(X_m) ∈ S. For a small enough neighbourhood U ⊂ g of 0 ∈ g, we have a homeomorphism (X → exp(X)) : U → exp(U). Because of
\[
  \exp(X_1) \cdots \exp(X_m) \in \exp(X_1) \cdots \exp(X_m) \exp(U) \subset S,
\]
it follows that S < G is open. But open subgroups are always closed, as the reader can easily verify. Thus S ∋ I is connected, closed and open, so that S = C_I. □

Corollary 8.3.24. Let G, H be linear Lie groups and φ, ψ ∈ HOM(G, H). Then:
(a) Lie(Ker(ψ)) = Ker(ψ′).
(b) If G is connected and φ′ = ψ′ then φ = ψ.
(c) Let H be connected; then ψ is surjective if and only if ψ′ is surjective.

Proof. (a) Ker(ψ) < G < GL(n, C) is a closed subgroup, since ψ is a continuous homomorphism. Thereby
\[
  \mathrm{Lie}(\mathrm{Ker}(\psi))
  = \left\{ X \in \mathbb{C}^{n\times n} \mid \forall t \in \mathbb{R} :\ \exp(tX) \in \mathrm{Ker}(\psi) \right\}
  = \left\{ X \in \mathbb{C}^{n\times n} \mid \forall t \in \mathbb{R} :\ \exp(t\psi'(X)) = \psi(\exp(tX)) = I \right\}
  = \left\{ X \in \mathbb{C}^{n\times n} \mid \psi'(X) = 0 \right\}
  = \mathrm{Ker}(\psi').
\]
(b) Take A ∈ G. Then A = exp(X_1) · · · exp(X_m) for some {X_j}_{j=1}^m ⊂ g by Lemma 8.3.23, so that
\[
  \varphi(A) = \exp(\varphi'(X_1)) \cdots \exp(\varphi'(X_m))
  = \exp(\psi'(X_1)) \cdots \exp(\psi'(X_m)) = \psi(A).
\]
(c) Suppose ψ′ : g → h is surjective. Let B ∈ H. Now H is connected, so that Lemma 8.3.23 says that B = exp(Y_1) · · · exp(Y_m) for some {Y_j}_{j=1}^m ⊂ h. Exploit the surjectivity of ψ′ to obtain X_j ∈ g such that ψ′(X_j) = Y_j. Then
\[
  \psi\big( \exp(X_1) \cdots \exp(X_m) \big)
  = \psi(\exp(X_1)) \cdots \psi(\exp(X_m))
  = \exp(Y_1) \cdots \exp(Y_m) = B.
\]
Conversely, suppose ψ : G → H is surjective. Trivially, ψ′(0) = 0 ∈ h; let 0 ≠ Y ∈ h. Let r₀ := r/‖Y‖, where r is as in Proposition 8.1.7; notice that if |t| < r₀ then log(exp(tY)) = tY. The surjectivity of ψ guarantees that for every t ∈ R there exists A_t ∈ G such that ψ(A_t) = exp(tY). The set R := {A_t : 0 < t < r₀} is uncountable, so that it has an accumulation point x ∈ C^{n×n}; and x ∈ G, because R ⊂ G and G ⊂ C^{n×n} is closed. Let ε > 0. Then there exist s, t ∈ ]0, r₀[ such that s ≠ t and
\[
  \| A_s - x \| < \varepsilon, \qquad \| A_t - x \| < \varepsilon, \qquad \| A_s^{-1} - x^{-1} \| < \varepsilon.
\]
Thereby
\[
  \| A_s^{-1} A_t - I \|
  = \| A_s^{-1} (A_t - A_s) \|
  \le \| A_s^{-1} \| \, \big( \| A_t - x \| + \| x - A_s \| \big)
  \le \big( \| x^{-1} \| + \varepsilon \big)\, 2\varepsilon.
\]
Hence, taking ε > 0 small enough, we may demand that ‖A_s^{-1} A_t − I‖ < 1 and ‖ψ(A_s^{-1} A_t) − I‖ < 1, where
\[
  \psi(A_s^{-1} A_t) = \psi(A_s)^{-1} \psi(A_t) = \exp((t-s)Y).
\]
Consequently,
\[
  \psi'\big( \log(A_s^{-1} A_t) \big) = (t-s) Y.
\]
Therefore ψ′( (t−s)^{-1} log(A_s^{-1} A_t) ) = Y. □

Definition 8.3.25 (Adjoint representation of Lie groups). The adjoint representation of a linear Lie group G is the mapping Ad ∈ HOM(G, Aut(g)) defined by Ad(A)X := AXA^{-1}, where A ∈ G and X ∈ g.
Remark 8.3.26. Indeed, Ad : G → Aut(g), because
\[
  \exp\big( t\, \mathrm{Ad}(A) X \big) = \exp\big( t A X A^{-1} \big) = A \exp(tX)\, A^{-1}
\]
belongs to G if A ∈ G, X ∈ g and t ∈ R. It is a homomorphism, since
\[
  \mathrm{Ad}(AB)\, X = ABX B^{-1} A^{-1} = \mathrm{Ad}(A) (BXB^{-1}) = \mathrm{Ad}(A)\,\mathrm{Ad}(B)\, X,
\]
and Ad is trivially continuous.

Exercise 8.3.27. Let g be a Lie algebra. Consider Aut(g) as a linear Lie group. Show that Lie(Aut(g)) and gl(g) are isomorphic as Lie algebras.

Definition 8.3.28 (Adjoint representation of Lie algebras). The adjoint representation of the Lie algebra g of a linear Lie group G is the differential representation ad = Ad′ : g → Lie(Aut(g)) ≅ gl(g), that is, ad(X) := Ad′(X), so that
\[
  \mathrm{ad}(X)\, Y
  = \frac{\mathrm{d}}{\mathrm{d}t} \big( \exp(tX)\, Y \exp(-tX) \big) \Big|_{t=0}
  = \Big( \big( \tfrac{\mathrm{d}}{\mathrm{d}t} \exp(tX) \big) Y \exp(-tX)
      + \exp(tX)\, Y\, \tfrac{\mathrm{d}}{\mathrm{d}t} \exp(-tX) \Big) \Big|_{t=0}
  = XY - YX = [X, Y].
\]
Remark 8.3.29. Notice that the diagram
\[
  \begin{array}{ccc}
  G & \xrightarrow{\ \mathrm{Ad}\ } & \mathrm{Aut}(\mathfrak{g}) \\
  \exp \uparrow\ & & \ \uparrow \exp \\
  \mathfrak{g} & \xrightarrow{\ \mathrm{Ad}' = \mathrm{ad}\ } & \mathrm{Lie}(\mathrm{Aut}(\mathfrak{g}))
  \end{array}
\]
commutes.
8.3.1 Universal enveloping algebra

Here we discuss the universal enveloping algebra.

Remark 8.3.30 (Universal enveloping algebra informally). We are going to study higher-order partial differential operators on G. Let g be the Lie algebra of a linear Lie group G. Next we construct a natural associative algebra U(g) generated by g modulo an ideal, which enables embedding g into U(g). Recall that g can be interpreted as the vector space of first-order left- (or right-) translation invariant partial differential operators on G. Consequently, U(g) can be interpreted as the vector space of finite-order left- (or right-) translation invariant partial differential operators on G.
Definition 8.3.31 (Universal enveloping algebra). Let g be a K-Lie algebra. Let
\[
  \mathcal{T} := \bigoplus_{m=0}^{\infty} \otimes^m \mathfrak{g}
\]
be the tensor product algebra of g, where ⊗^m g denotes the m-fold tensor product g ⊗ · · · ⊗ g; that is, T is the linear span of the elements of the form
\[
  \lambda_{00} \mathbf{1} + \sum_{m=1}^{M} \sum_{k=1}^{K_m} \lambda_{mk}\, X_{mk1} \otimes \cdots \otimes X_{mkm},
\]
where 1 is the formal unit element of T, λ_{mk} ∈ K, X_{mkj} ∈ g and M, K_m ∈ Z^+; the product of T is begotten by the tensor product, i.e.,
\[
  (X_1 \otimes \cdots \otimes X_p)(Y_1 \otimes \cdots \otimes Y_q) := X_1 \otimes \cdots \otimes X_p \otimes Y_1 \otimes \cdots \otimes Y_q
\]
is extended to a unique bilinear mapping T × T → T. Let J be the (two-sided) ideal in T spanned by the set
\[
  O := \{ X \otimes Y - Y \otimes X - [X, Y] : X, Y \in \mathfrak{g} \};
\]
i.e., J ⊂ T is the smallest vector subspace such that O ⊂ J and DE, ED ∈ J for every D ∈ J and E ∈ T (in a sense, J is a “huge zero” in T). The quotient algebra U(g) := T / J is called the universal enveloping algebra of g.

Definition 8.3.32 (Canonical mapping of a Lie algebra). Let ι : T → U(g) = T / J be the quotient mapping t → t + J. A natural interpretation is that g ⊂ T. The restricted mapping ι|_g : g → U(g) is called the canonical mapping of g.

Remark 8.3.33. Notice that ι|_g : g → Lie_K(U(g)) is a Lie algebra homomorphism: it is linear and
\[
  \iota|_{\mathfrak{g}}([X, Y])
  = \iota([X, Y]) = \iota(X \otimes Y - Y \otimes X)
  = \iota(X)\iota(Y) - \iota(Y)\iota(X)
  = \iota|_{\mathfrak{g}}(X)\, \iota|_{\mathfrak{g}}(Y) - \iota|_{\mathfrak{g}}(Y)\, \iota|_{\mathfrak{g}}(X)
  = [\iota|_{\mathfrak{g}}(X), \iota|_{\mathfrak{g}}(Y)].
\]
Theorem 8.3.34 (Universality of the enveloping algebra). Let g be a K-Lie algebra, ι|_g : g → U(g) its canonical mapping, A an associative K-algebra, and σ : g → Lie_K(A)
a Lie algebra homomorphism. Then there exists a unique algebra homomorphism σ̃ : U(g) → A satisfying σ̃(ι|_g(X)) = σ(X) for all X ∈ g, i.e., the diagram
\[
  \begin{array}{ccc}
  \mathcal{U}(\mathfrak{g}) & \xrightarrow{\ \tilde\sigma\ } & \mathcal{A} \\
  \iota|_{\mathfrak{g}} \uparrow\ & & \ \| \\
  \mathfrak{g} & \xrightarrow{\ \sigma\ } & \mathrm{Lie}_{\mathbb{K}}(\mathcal{A})
  \end{array}
\]
commutes.

Proof. Let us define a linear mapping σ₀ : T → A by
\[
  \sigma_0(X_1 \otimes \cdots \otimes X_m) := \sigma(X_1) \cdots \sigma(X_m). \tag{8.3}
\]
Then σ₀(J) = {0}, since
\[
  \sigma_0(X \otimes Y - Y \otimes X - [X, Y])
  = \sigma(X)\sigma(Y) - \sigma(Y)\sigma(X) - \sigma([X, Y])
  = \sigma(X)\sigma(Y) - \sigma(Y)\sigma(X) - [\sigma(X), \sigma(Y)] = 0.
\]
Hence if t, u ∈ T and t − u ∈ J then σ₀(t) = σ₀(u). Thereby we may define σ̃ := (t + J → σ₀(t)) : U(g) → A. Finally, it is clear that σ̃ is an algebra homomorphism making the diagram above commute. The uniqueness is clear by construction, since (8.3) must hold. □

Corollary 8.3.35 (Ado–Iwasawa Theorem). Let g be the Lie algebra of a linear Lie group G. Then the canonical mapping ι|_g : g → U(g) is injective.

Proof. Let σ = (X → X) : g → gl(n, C). Due to the universality of U(g) there exists an R-algebra homomorphism σ̃ : U(g) → C^{n×n} such that σ(X) = σ̃(ι|_g(X)) for all X ∈ g, i.e., the diagram
\[
  \begin{array}{ccc}
  \mathcal{U}(\mathfrak{g}) & \xrightarrow{\ \tilde\sigma\ } & \mathbb{C}^{n\times n} \\
  \iota|_{\mathfrak{g}} \uparrow\ & & \ \| \\
  \mathfrak{g} & \xrightarrow{\ \sigma\ } & \mathfrak{gl}(n, \mathbb{C})
  \end{array}
\]
commutes. Then ι|_g is injective because σ is injective. □

Remark 8.3.36. By the Ado–Iwasawa Theorem (Corollary 8.3.35), the Lie algebra g of a linear Lie group can be considered as a Lie subalgebra of Lie_R(U(g)).

Definition 8.3.37 (ad). Let g be a K-Lie algebra. Let us define the linear mapping ad : g → End(g) by ad(X)Z := [X, Z].
Remark 8.3.38. Let g be a K-Lie algebra and X, Y, Z ∈ g. Since

  0 = [[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y]
    = [[X, Y], Z] − ([X, [Y, Z]] − [Y, [X, Z]])
    = ad([X, Y])Z − [ad(X), ad(Y)]Z,

we notice that ad([X, Y]) = [ad(X), ad(Y)], i.e., ad is a Lie algebra homomorphism g → gl(g).

Definition 8.3.39 (Killing form and semisimple Lie groups). The Killing form of the Lie algebra g is the bilinear mapping B : g × g → K defined by

  B(X, Y) := Tr(ad(X) ad(Y))

(recall that by Exercise B.5.41, on a finite-dimensional vector space the trace can be defined independently of any inner product). A (R- or C-) Lie algebra g is called semisimple if its Killing form is non-degenerate, i.e., if

  ∀X ∈ g \ {0} ∃Y ∈ g : B(X, Y) ≠ 0;
equivalently, B is non-degenerate if the matrix (B(X_i, X_j))_{i,j=1}^n is invertible, where {X_j}_{j=1}^n ⊂ g is a vector space basis. A connected linear Lie group is called semisimple if its Lie algebra is semisimple.

Example. The linear Lie groups SL(n, K) and SO(n) are semisimple, but GL(n) is not semisimple.

Remark 8.3.40. Since Tr(ab) = Tr(ba), we have B(X, Y) = B(Y, X). We also have B(X, [Y, Z]) = B([X, Y], Z), because

  Tr(a(bc − cb)) = Tr(abc) − Tr(acb) = Tr(abc) − Tr(bac) = Tr((ab − ba)c)

yields

  B(X, [Y, Z]) = Tr(ad(X) ad([Y, Z]))
              = Tr(ad(X) [ad(Y), ad(Z)])
              = Tr([ad(X), ad(Y)] ad(Z))
              = Tr(ad([X, Y]) ad(Z))
              = B([X, Y], Z).

It can be proven that the Killing form of the Lie algebra of a compact linear Lie group is negative semi-definite, i.e., B(X, X) ≤ 0. On the other hand, if the Killing form of the Lie algebra of a Lie group is negative definite, i.e., B(X, X) < 0 whenever X ≠ 0, then the group is compact.
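A hedged numerical illustration (not from the book): for g = sl(2, R), with the standard basis H, E, F (a choice made here for the sketch; the helper names are ours), we compute B(X, Y) = Tr(ad(X) ad(Y)), check that the matrix (B(X_i, X_j)) is invertible — so sl(2, R) is semisimple — and verify the homomorphism property ad([X, Y]) = [ad(X), ad(Y)] of Remark 8.3.38.

```python
# Killing form of sl(2, R) in the basis H = [[1,0],[0,-1]], E = [[0,1],[0,0]],
# F = [[0,0],[1,0]]; pure-Python matrix helpers, square matrices as lists of rows.

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def bracket(a, b):          # commutator [a, b] = ab - ba
    return sub(mul(a, b), mul(b, a))

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
basis = [H, E, F]

def coords(m):
    # a traceless 2x2 matrix [[a, b], [c, -a]] has coordinates (a, b, c)
    return [m[0][0], m[0][1], m[1][0]]

def ad(x):
    # matrix of ad(x) = [x, .] in the basis (H, E, F); columns are the images
    cols = [coords(bracket(x, y)) for y in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def trace_prod(a, b):       # Tr(a b) without forming the product
    return sum(a[i][k] * b[k][i] for i in range(3) for k in range(3))

B = [[trace_prod(ad(x), ad(y)) for y in basis] for x in basis]
print(B)   # the Killing form matrix in the basis (H, E, F)

# ad is a Lie algebra homomorphism (Remark 8.3.38)
assert ad(bracket(H, E)) == bracket(ad(H), ad(E))

# determinant of the 3x3 matrix B (Sarrus rule); non-zero means semisimple
det = (B[0][0] * (B[1][1] * B[2][2] - B[1][2] * B[2][1])
       - B[0][1] * (B[1][0] * B[2][2] - B[1][2] * B[2][0])
       + B[0][2] * (B[1][0] * B[2][1] - B[1][1] * B[2][0]))
print(det)
```

For this basis one finds B = [[8, 0, 0], [0, 0, 4], [0, 4, 0]] with determinant −128 ≠ 0; note also that B is indefinite here, consistent with SL(2, R) being non-compact.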
8.3.2 Casimir element and Laplace operator
Here we discuss some properties of the Casimir element and the corresponding Laplace operator.

Definition 8.3.41 (Casimir element). Let g be a semisimple K-Lie algebra with a vector space basis {X_j}_{j=1}^n ⊂ g. Let B : g × g → K be the Killing form of g, and define the matrix R ∈ K^{n×n} by R_{ij} := B(X_i, X_j). Let

  X^i := Σ_{j=1}^n (R^{-1})_{ij} X_j,

so that {X^i}_{i=1}^n is another vector space basis for g. Then the Casimir element Ω ∈ U(g) of g is defined by

  Ω := Σ_{i=1}^n X_i X^i.
Remark 8.3.42. The Casimir element Ω ∈ U(g) for the Lie algebra g of a compact semisimple linear Lie group G can be considered as an elliptic linear second-order (left and right) translation invariant partial differential operator. In a sense, the Casimir operator is an analogue of the Euclidean Laplace operator

  L = Σ_{j=1}^n ∂²/∂x_j² : C^∞(R^n) → C^∞(R^n).
Such a Laplace operator can be constructed for any compact Lie group G, and with it we may define Sobolev spaces on G nicely, etc.

Theorem 8.3.43 (Properties of the Casimir element). The Casimir element of a finite-dimensional semisimple K-Lie algebra g is independent of the choice of the vector space basis {X_j}_{j=1}^n ⊂ g. Moreover, DΩ = ΩD for all D ∈ U(g).

Proof. Let {X_j}_{j=1}^n ⊂ g, R_{ij} = B(X_i, X_j) and Ω be as in Definition 8.3.41. To simplify notation, we consider only the case K = R. Let {Y_i}_{i=1}^n ⊂ g be a vector space basis of g. Then there exists A = (A_{ij})_{i,j=1}^n ∈ GL(n, R) such that

  Y_i := Σ_{j=1}^n A_{ij} X_j   (1 ≤ i ≤ n).
Then

  S := (B(Y_i, Y_j))_{i,j=1}^n
     = ( B( Σ_{k=1}^n A_{ik} X_k, Σ_{l=1}^n A_{jl} X_l ) )_{i,j=1}^n
     = ( Σ_{k,l=1}^n A_{ik} B(X_k, X_l) A_{jl} )_{i,j=1}^n
     = A R A^T;

hence S^{-1} = ((S^{-1})_{ij})_{i,j=1}^n = (A^T)^{-1} R^{-1} A^{-1}. Let us now compute the Casimir element of g with respect to the basis {Y_j}_{j=1}^n:
  Σ_{i,j=1}^n (S^{-1})_{ij} Y_i Y_j
    = Σ_{i,j=1}^n (S^{-1})_{ij} ( Σ_{k=1}^n A_{ik} X_k )( Σ_{l=1}^n A_{jl} X_l )
    = Σ_{k,l=1}^n X_k X_l Σ_{i,j=1}^n A_{ik} (S^{-1})_{ij} A_{jl}
    = Σ_{k,l=1}^n X_k X_l Σ_{i,j=1}^n (A^T)_{ki} ((A^T)^{-1} R^{-1} A^{-1})_{ij} A_{jl}
    = Σ_{k,l=1}^n X_k X_l (R^{-1})_{kl}.

Thus the definition of the Casimir element does not depend on the choice of a vector space basis. We still have to prove that Ω commutes with every D ∈ U(g). Since

  B(X^i, X_j) = Σ_{k=1}^n (R^{-1})_{ik} B(X_k, X_j) = Σ_{k=1}^n (R^{-1})_{ik} R_{kj} = δ_{ij},

we can extend (X_i, X_j) → ⟨X_i, X_j⟩_g := B(X^i, X_j) uniquely to an inner product ((X, Y) → ⟨X, Y⟩_g) : g × g → R, with respect to which the collection {X_i}_{i=1}^n is an orthonormal basis. For the Lie product (x, y) → [x, y] := xy − yx of Lie_R(U(g)) we have

  [x, yz] = [x, y] z + y [x, z],
so that for D ∈ g we get

  [D, Ω] = [D, Σ_{i=1}^n X_i X^i] = Σ_{i=1}^n ( [D, X_i] X^i + X_i [D, X^i] ).

Let c_{ij}, d_{ij} ∈ R be defined by

  [D, X_i] = Σ_{j=1}^n c_{ij} X_j,    [D, X^i] = Σ_{j=1}^n d_{ij} X^j.

Then

  c_{ij} = ⟨X_j, [D, X_i]⟩_g = B(X^j, [D, X_i]) = B([X^j, D], X_i)
         = B(−[D, X^j], X_i) = B(− Σ_{k=1}^n d_{jk} X^k, X_i)
         = − Σ_{k=1}^n d_{jk} B(X^k, X_i) = − Σ_{k=1}^n d_{jk} ⟨X_k, X_i⟩_g = −d_{ji},

so that

  [D, Ω] = Σ_{i,j=1}^n ( c_{ij} X_j X^i + d_{ij} X_i X^j )
         = Σ_{i,j=1}^n ( c_{ij} + d_{ji} ) X_j X^i
         = 0,

i.e., DΩ = ΩD for all D ∈ g. By induction, we may prove that

  [D_1 D_2 · · · D_m, Ω] = D_1 [D_2 · · · D_m, Ω] + [D_1, Ω] D_2 · · · D_m = 0

for every {D_j}_{j=1}^m ⊂ g, so that DΩ = ΩD for all D ∈ U(g).
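A hedged sketch complementing the proof (not from the book): for g = sl(2, R) we form the image of the Casimir element in the defining representation, i.e., the 2×2 matrix Σ_i σ(X_i) σ(X^i) with X^i = Σ_j (R^{-1})_{ij} X_j. The Killing matrix R = [[8, 0, 0], [0, 0, 4], [0, 4, 0]] for the basis (H, E, F) is hardcoded here (an assumption of this sketch, easily recomputed), so its inverse is [[1/8, 0, 0], [0, 0, 1/4], [0, 1/4, 0]]. The result is a scalar matrix, which in particular commutes with the whole algebra, as Theorem 8.3.43 predicts.

```python
# Casimir element of sl(2, R) evaluated in the defining 2x2 representation,
# using exact rational arithmetic.
from fractions import Fraction as Fr

def mul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def add(a, b): return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]
def sub(a, b): return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]
def bracket(a, b): return sub(mul(a, b), mul(b, a))

H = [[Fr(1), Fr(0)], [Fr(0), Fr(-1)]]
E = [[Fr(0), Fr(1)], [Fr(0), Fr(0)]]
F = [[Fr(0), Fr(0)], [Fr(1), Fr(0)]]
basis = [H, E, F]

# inverse Killing matrix in the basis (H, E, F), assumed from R = Tr(ad ad)
Rinv = [[Fr(1, 8), Fr(0), Fr(0)],
        [Fr(0), Fr(0), Fr(1, 4)],
        [Fr(0), Fr(1, 4), Fr(0)]]

def lincomb(coeffs, mats):
    out = [[Fr(0)] * 2 for _ in range(2)]
    for c, m in zip(coeffs, mats):
        out = add(out, [[c * v for v in row] for row in m])
    return out

dual = [lincomb(Rinv[i], basis) for i in range(3)]   # X^1, X^2, X^3
Omega = [[Fr(0)] * 2 for _ in range(2)]
for Xi, Xd in zip(basis, dual):
    Omega = add(Omega, mul(Xi, Xd))
print(Omega)   # the scalar matrix (3/8) * I

# Omega commutes with every basis element in this representation
for Y in basis:
    assert bracket(Omega, Y) == [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
```

Here Ω maps to (3/8)·I, so [Ω, Y] = 0 for every Y, which is the finite-dimensional shadow of the central property DΩ = ΩD in U(g).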
Exercise 8.3.44. How should the proof of Theorem 8.3.43 be modified if K = C instead of K = R?

Definition 8.3.45 (Laplace operator on G). The Casimir element from Definition 8.3.41, also denoted by L_G := Ω ∈ U(g), viewed as a second-order partial differential operator on G, is also called the Laplace operator on G. Here a vector field Y ∈ g is viewed as a differential operator Y ≡ D_Y : C^∞(G) → C^∞(G), defined by

  Y f(x) ≡ D_Y f(x) = (d/dt) f(x exp(tY)) |_{t=0}.
Remark 8.3.46. The Laplace operator L_G is a negative definite bi-invariant operator on G, by Theorem 8.3.43. If G is equipped with the unique (up to a constant) bi-invariant Riemannian metric, L_G is its Laplace–Beltrami operator. In the notation of the right and left Peter–Weyl theorem in Theorem 7.5.14 and Remark 7.5.16, we write

  H^φ := ⊕_{i=1}^{dim(φ)} H^φ_{i,·} = ⊕_{j=1}^{dim(φ)} H^φ_{·,j}.
Theorem 8.3.47 (Eigenvalues of the Laplacian on G). For every [φ] ∈ Ĝ the space H^φ is an eigenspace of L_G, and −L_G |_{H^φ} = λ_φ I for some λ_φ ≥ 0.

Proof. We will use the notation of Theorem 7.5.14. Note that by Theorem 8.3.43 the Laplace operator L_G is bi-invariant, so that it commutes with both π_R(x) and π_L(x), for all x ∈ G. Therefore, by the Peter–Weyl theorem, L_G(H^φ_{·,j}) ⊂ H^φ_{·,j} and L_G(H^φ_{i,·}) ⊂ H^φ_{i,·} for all 1 ≤ i, j ≤ dim(φ) and all [φ] ∈ Ĝ. It follows that L_G φ_ij ∈ H^φ_{i,·} ∩ H^φ_{·,j} = span(φ_ij), so that L_G φ_ij = c_ij φ_ij for some constants c_ij. Let us now determine these constants. We have

  (L_G π_R(y) φ_ij)(x) = L_G(φ_ij(xy))
                       = L_G( Σ_{k=1}^{dim(φ)} φ_ik(x) φ_kj(y) )
                       = Σ_{k=1}^{dim(φ)} c_ik φ_ik(x) φ_kj(y).
On the other hand we have (πR (y)LG φij )(x)
=
cij φij (xy)
dim(φ)
=
cij φik (x)φkj (y).
k=1
It follows now from the orthogonality Lemma 7.5.12 that cik φkj (y) = cij φkj (y), or that cik = cij for all 1 ≤ i, j, k ≤ dim(φ). A similar calculation with the left regular action πL (y) shows that ckj = cij for all 1 ≤ i, j, k ≤ dim(φ). Hence LG φij = cφij for all 1 ≤ i, j ≤ dim(φ), and since LG is negative definite, we obtain the statement with λφ := −c ≥ 0.
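A hedged numerical check of the theorem in the simplest compact group (not from the book): on G = T¹ the representation spaces are spanned by the characters e_k(x) = e^{2πikx}, and the Laplace operator is L = d²/dx², with −L e_k = (2πk)² e_k, i.e., λ_φ = (2πk)² ≥ 0. The function names below are ours; we approximate L by a second-order central difference.

```python
# Characters of T^1 as eigenfunctions of the Laplacian, checked numerically.
import cmath

def ek(k, x):
    # character e_k(x) = exp(2 pi i k x) of the torus T^1
    return cmath.exp(2j * cmath.pi * k * x)

def laplace(f, x, h=1e-4):
    # central-difference approximation of f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for k in (0, 1, 3):
    lam = (2 * cmath.pi * k) ** 2          # lambda_phi = (2 pi k)^2 >= 0
    for x in (0.1, 0.35, 0.7):
        approx = laplace(lambda t: ek(k, t), x)
        exact = -lam * ek(k, x)            # L e_k = -(2 pi k)^2 e_k
        assert abs(approx - exact) < 1e-2 * (1 + abs(exact))
print("each character e_k is an eigenfunction with eigenvalue -(2*pi*k)^2")
```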
Chapter 9

Hopf Algebras

Instead of studying a compact group G, we may consider the algebra C(G) of continuous functions G → C. The structure of the group is encoded in the function algebra, but we shall see that this approach paves the way for a more general functional analytic theory of Hopf algebras, which possess nice duality properties.
9.1 Commutative C*-algebras

Let A := C(X), where X is a compact Hausdorff space. We present¹ some fundamental results:

• All the algebra homomorphisms A → C are of the form f → f(x), where x ∈ X.

• All the closed ideals of A are of the form I(K) := {f ∈ A | f(K) = {0}}, where K ⊂ X (with the convention I(∅) := C(X)). Moreover, K = V(I(K)), where

  V(J) = ∩_{f ∈ J} f^{-1}({0});

these results follow by Urysohn’s Lemma (Theorem A.12.11).

• Linear functionals A → C are of the form

  f → ∫_X f dμ,    (9.1)

where μ is a Borel-regular measure on X; this is the Riesz Representation Theorem C.4.60.

¹ These statements follow essentially from the results in Part I.
• Probability functionals A → C are then of the form (9.1), where μ is a Borel-regular probability measure on X.

All in all, we might say that the topology and measure theory of a compact Hausdorff space X is encoded in the algebra A = C(X), with a dictionary:

  Space X                                  | Algebra A = C(X)
  -----------------------------------------+-----------------------------------------
  homeomorphism φ : X → X                  | isomorphism (f → f ∘ φ) : A → A
  point x ∈ X                              | algebra functional (f → f(x)) : A → C
  closed set in X                          | closed ideal in A
  X metrisable                             | A separable
  Borel-regular measure on X               | linear functional
  Borel-regular probability measure on X   | probability functional
  ...                                      | ...
Remark 9.1.1. In the light of the dictionary above, we are bound to ask:

1. If X is a group, how is this reflected in C(X)?
2. Could we study non-commutative algebras just like the commutative ones?

We might call the traditional topology and measure theory by the name “commutative geometry”, referring to the commutative function algebras; “non-commutative geometry” would refer to the study of non-commutative algebras. Let us now try to deal with the two questions posed above.

Answering question 1. Let G be a compact group. By Urysohn’s Lemma (Theorem A.12.11), C(G) separates the points of G, so that the associativity of the group operation ((x, y) → xy) : G × G → G is encoded by

  ∀x, y, z ∈ G  ∀f ∈ C(G) :  f((xy)z) = f(x(yz)).

Similarly,

  ∃e ∈ G  ∀x ∈ G  ∀f ∈ C(G) :  f(xe) = f(x) = f(ex)

encodes the neutral element e ∈ G. Finally,

  ∀x ∈ G  ∃x^{-1} ∈ G  ∀f ∈ C(G) :  f(x^{-1}x) = f(e) = f(xx^{-1})

encodes the inversion (x → x^{-1}) : G → G. Thereby let us define linear operators

  Δ̃ : C(G) → C(G × G),   Δ̃f(x, y) := f(xy),
  ε̃ : C(G) → C,          ε̃f := f(e),
  S̃ : C(G) → C(G),       S̃f(x) := f(x^{-1});
the interactions of these algebra homomorphisms contain all the information about the structure of the underlying group! This is a key ingredient in the Hopf algebra theory.

Answering question 2. Our algebras always have a unit element 1. An involutive C-algebra A is a C*-algebra if it has a Banach space norm satisfying

  ‖ab‖ ≤ ‖a‖ ‖b‖  and  ‖a*a‖ = ‖a‖²

for all a, b ∈ A. By Gelfand and Naimark (1943), see Theorem D.5.3, up to an isometric *-isomorphism a C*-algebra is a closed involutive subalgebra of L(H), where H is a Hilbert space; moreover, if A is a commutative unital C*-algebra then A ≅ C(X) for a compact Hausdorff space X, as explained below. The spectrum of A is the set Spec(A) of the algebra homomorphisms A → C (automatically bounded functionals!), endowed with the Gelfand topology, which is the relative weak*-topology of L(A, C). It turns out that Spec(A) is a compact Hausdorff space. For a ∈ A we define the Gelfand transform

  â : Spec(A) → C,   â(x) := x(a).

It turns out that â is continuous, and that (a → â) : A → C(Spec(A)) is an isometric *-algebra isomorphism! If B is a non-commutative C*-algebra, it still has plenty of interesting commutative C*-subalgebras, so that the Gelfand transform provides the nice tools of classical analysis on compact Hausdorff spaces in the study of the algebra. Namely, if a ∈ B is normal, i.e., a*a = aa*, then the closure of the algebraic span (polynomials) of {a, a*} is a commutative C*-subalgebra. E.g., b*b ∈ B is normal for all b ∈ B.

Synthesis of questions 1 and 2. By the Gelfand–Naimark Theorem D.5.11, the archetypal commutative C*-algebra is C(X) for a compact Hausdorff space X. In the sequel, we introduce Hopf algebras. In a sense, they are a not-necessarily-commutative analogue of C(G), where G is a compact group. We begin by formally dualising the category of algebras, to obtain the category of co-algebras. By marrying these concepts in a subtle way, we obtain the category of Hopf algebras.
9.2 Hopf algebras
The definition of a Hopf algebra is a lengthy one, yet quite natural. In the sequel, notice the evident dualities in the commutative diagrams. For C-vector spaces V, W , we define τV W : V ⊗ W → W ⊗ V by the linear extension of τV W (v ⊗ w) := w ⊗ v.
Moreover, in the sequel the identity operation (v → v) : V → V for any vector space V is denoted by I. We constantly identify the C-vector spaces V and C ⊗ V (and respectively V ⊗ C), since (λ ⊗ v) → λv defines a linear isomorphism C ⊗ V → V. In the usual definition of an algebra, the multiplication is regarded as a bilinear map. In order to use dualisation techniques for algebras, we want to linearise the multiplication. Let us therefore give a new, equivalent definition for an algebra:

Definition 9.2.1 (Reformulation of algebras). The triple (A, m, η) is an algebra (more precisely, an associative unital C-algebra) if A is a C-vector space, and

  m : A ⊗ A → A,
  η : C → A

are linear mappings such that the following diagrams commute: the associativity diagram

  A ⊗ A ⊗ A --I⊗m--> A ⊗ A
      | m⊗I              | m
      v                  v
    A ⊗ A   ----m---->   A

and the unit diagrams

  A ⊗ C --I⊗η--> A ⊗ A        A ⊗ A <--η⊗I-- C ⊗ A
    | a⊗λ → λa     | m            | m            | λ⊗a → λa
    v              v              v              v
    A ============ A              A ============ A.

The mapping m is called the multiplication and η the unit mapping; the algebra A is said to be commutative if m τ_AA = m. The unit of an algebra (A, m, η) is 1_A := η(1), and the usual abbreviation for the multiplication is ab := m(a ⊗ b). For algebras (A_1, m_1, η_1) and (A_2, m_2, η_2) the tensor product algebra (A_1 ⊗ A_2, m, η) is defined by m := (m_1 ⊗ m_2)(I ⊗ τ_{A_1 A_2} ⊗ I), i.e., (a_1 ⊗ a_2)(b_1 ⊗ b_2) = (a_1 b_1) ⊗ (a_2 b_2), and η(1) := 1_{A_1} ⊗ 1_{A_2}.

Remark 9.2.2. If an algebra A = (A, m, η) is finite-dimensional, we can formally dualise its structural mappings m and η; this inspires the concept of the co-algebra:
Definition 9.2.3 (Co-algebras). The triple (C, Δ, ε) is a co-algebra (more precisely, a co-associative co-unital C-co-algebra) if C is a C-vector space and

  Δ : C → C ⊗ C,
  ε : C → C

are linear mappings such that the following diagrams commute: the co-associativity diagram (notice the duality to the associativity diagram)

  C ⊗ C ⊗ C <--I⊗Δ-- C ⊗ C
      ^ Δ⊗I              ^ Δ
      |                  |
    C ⊗ C   <---Δ----    C

and the co-unit diagrams (notice the duality to the unit diagrams)

  C ⊗ C <--I⊗ε-- C ⊗ C        C ⊗ C --ε⊗I--> C ⊗ C
    ^ λc → c⊗λ     ^ Δ            ^ Δ            ^ λc → λ⊗c
    |              |              |              |
    C ============ C              C ============ C.

The mapping Δ is called the co-multiplication and ε the co-unit mapping; the co-algebra C is co-commutative if τ_CC Δ = Δ. For co-algebras (C_1, Δ_1, ε_1) and (C_2, Δ_2, ε_2) the tensor product co-algebra (C_1 ⊗ C_2, Δ, ε) is defined by Δ := (I ⊗ τ_{C_1 C_2} ⊗ I)(Δ_1 ⊗ Δ_2) and ε(c_1 ⊗ c_2) := ε_1(c_1) ε_2(c_2).

Example. A trivial co-algebra example: if (A, m, η) is a finite-dimensional algebra then the vector space dual A′ = L(A, C) has a natural co-algebra structure. Indeed, let us identify (A ⊗ A)′ and A′ ⊗ A′ naturally, so that m′ : A′ → A′ ⊗ A′ is the dual mapping to m : A ⊗ A → A. Let us identify C′ and C naturally, so that η′ : A′ → C is the dual mapping to η : C → A. Then (A′, m′, η′) is a co-algebra (draw the commutative diagrams!). We shall give more interesting examples of co-algebras after the definition of Hopf algebras.
Definition 9.2.4 (Convolution of linear operators). Let (B, m, η) be an algebra and (B, Δ, ε) be a co-algebra. Let L(B) denote the vector space of linear operators B → B. Let us define the convolution A∗B ∈ L(B) of linear operators A, B ∈ L(B) by A ∗ B := m(A ⊗ B)Δ. Exercise 9.2.5. Show that L(B) in Definition 9.2.4 is an algebra, when endowed with the convolution product of operators. Definition 9.2.6 (Hopf algebras). A structure (H, m, η, Δ, ε, S) is a Hopf algebra if • (H, m, η) is an algebra, • (H, Δ, ε) is a co-algebra, • Δ : H → H ⊗ H and ε : H → C are algebra homomorphisms, i.e., Δ(f g) = Δ(f )Δ(g), Δ(1H ) = 1H⊗H , ε(f g) = ε(f )ε(g), ε(1H ) = 1, • and S : H → H is a linear mapping, called the antipode, satisfying S ∗ I = ηε = I ∗ S; i.e., I ∈ L(H) and S ∈ L(H) are inverses to each other in the convolution algebra L(H). For Hopf algebras (H1 , m1 , η1 , Δ1 , ε1 , S1 ) and (H2 , m2 , η2 , Δ2 , ε2 , S2 ) we define the tensor product Hopf algebra (H1 ⊗ H2 , m, η, Δ, ε, S) such that (H1 ⊗ H2 , m, η) is the usual tensor product algebra, (H1 ⊗ H2 , Δ, ε) is the usual tensor product co-algebra, and S := SH1 ⊗ SH2 . Exercise 9.2.7 (Uniqueness of the antipode). Let (H, m, η, Δ, ε, Sj ) be Hopf algebras, where j ∈ {1, 2}. Show that S1 = S2 .
Remark 9.2.8 (Commutative diagrams for Hopf algebras). Notice that we now have the multiplication and co-multiplication diagram

  H ⊗ H   ------Δm------>   H ⊗ H
    | Δ⊗Δ                      ^ m⊗m
    v                          |
  H ⊗ H ⊗ H ⊗ H --I⊗τ_HH⊗I--> H ⊗ H ⊗ H ⊗ H,

the co-multiplication and unit diagram

    H   <---η---   C
    | Δ            ‖
    v
  H ⊗ H <--η⊗η-- C ⊗ C,

the multiplication and co-unit diagram

    H   ---ε--->   C
    ^ m            ‖
    |
  H ⊗ H --ε⊗ε--> C ⊗ C,

and the “everyone with the antipode” diagrams

    H   ---ηε--->   H
    | Δ             ^ m
    v               |
  H ⊗ H --I⊗S, S⊗I--> H ⊗ H.

Example (A monoid co-algebra example). Let G be a finite group and F(G) be the C-vector space of functions G → C. Notice that F(G) ⊗ F(G) and F(G × G) are naturally isomorphic by

  ( Σ_{j=1}^m f_j ⊗ g_j )(x, y) := Σ_{j=1}^m f_j(x) g_j(y).
Then we can define mappings Δ : F(G) → F(G) ⊗ F(G) and ε : F(G) → C by Δf (x, y) := f (xy),
εf := f (e).
In the next example we show that (F(G), Δ, ε) is a co-algebra. But there is still more structure in the group to exploit: let us define an operator S : F(G) → F(G) by (Sf )(x) := f (x−1 ). . .
Example (Hopf algebra for a finite group). Let G be a finite group. Now F(G) from the previous example has the structure of a commutative Hopf algebra; it is co-commutative if and only if G is a commutative group. The algebra mappings are given by

  η(λ)(x) := λ,   m(f ⊗ g)(x) := f(x) g(x)

for all λ ∈ C, x ∈ G and f, g ∈ F(G). Notice that the identification F(G × G) ≅ F(G) ⊗ F(G) gives the interpretation (ma)(x) = a(x, x) for a ∈ F(G × G). Clearly (F(G), m, η) is a commutative algebra. Let x, y, z ∈ G and f, g ∈ F(G). Then

  ((Δ ⊗ I)Δf)(x, y, z) = (Δf)(xy, z)
                       = f((xy)z)
                       = f(x(yz))
                       = (Δf)(x, yz)
                       = ((I ⊗ Δ)Δf)(x, y, z),

so that (Δ ⊗ I)Δ = (I ⊗ Δ)Δ. Next, (ε ⊗ I)Δ ≅ I ≅ (I ⊗ ε)Δ, because

  (m(ηε ⊗ I)Δf)(x) = ((ηε ⊗ I)Δf)(x, x) = Δf(e, x) = f(ex) = f(x)
                   = f(xe) = · · · = (m(I ⊗ ηε)Δf)(x).

Thereby (F(G), Δ, ε) is a co-algebra. Moreover,

  ε(f g) = (f g)(e) = f(e) g(e) = ε(f) ε(g),   ε(1_{F(G)}) = 1_{F(G)}(e) = 1,

so that ε : F(G) → C is an algebra homomorphism. The co-multiplication Δ : F(G) → F(G) ⊗ F(G) ≅ F(G × G) is an algebra homomorphism, because

  Δ(f g)(x, y) = (f g)(xy) = f(xy) g(xy) = (Δf)(x, y) (Δg)(x, y),
  Δ(1_{F(G)})(x, y) = 1_{F(G)}(xy) = 1 = 1_{F(G×G)}(x, y) ≅ (1_{F(G)} ⊗ 1_{F(G)})(x, y).

Finally,

  ((I ∗ S)f)(x) = (m(I ⊗ S)Δf)(x) = ((I ⊗ S)Δf)(x, x) = (Δf)(x, x^{-1})
               = f(xx^{-1}) = f(e) = εf = · · · = ((S ∗ I)f)(x),

so that I ∗ S = ηε = S ∗ I. Thereby F(G) can be endowed with a Hopf algebra structure.
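A hedged brute-force verification of the identities above (not from the book): for the symmetric group S₃, realised as permutation tuples (our representation choice), we check pointwise the co-associativity pattern Δf(xy, z) = Δf(x, yz), the co-unit law Δf(e, x) = f(x) = Δf(x, e), and the antipode law Δf(x, x^{-1}) = f(e) for an arbitrary f ∈ F(G).

```python
# Pointwise check of the F(G) Hopf identities for G = S_3.
from itertools import permutations
import random

G = list(permutations(range(3)))            # S_3 as tuples
e = (0, 1, 2)                               # identity permutation

def op(p, q):                               # composition (p after q)
    return tuple(p[q[i]] for i in range(3))

def inv(p):                                 # inverse permutation
    q = [0, 0, 0]
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

random.seed(0)
f = {x: random.random() for x in G}         # an arbitrary f in F(G)
Df = lambda x, y: f[op(x, y)]               # Delta f (x, y) = f(xy)

for x in G:
    for y in G:
        for z in G:
            # co-associativity: ((Delta⊗I)Delta f) = ((I⊗Delta)Delta f)
            assert Df(op(x, y), z) == Df(x, op(y, z))
for x in G:
    assert Df(e, x) == f[x] == Df(x, e)     # co-unit law
    assert Df(x, inv(x)) == f[e]            # (I * S) f = eta eps f
print("F(S_3) satisfies the co-algebra and antipode identities pointwise")
```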
Example (Hopf algebra for a compact group). Let G be a compact group. We shall endow the dense subalgebra H := TrigPol(G) ⊂ C(G) of trigonometric polynomials with a natural structure of a commutative Hopf algebra; H will be co-commutative if and only if G is commutative. Actually, if G is a finite group then F(G) = TrigPol(G) = C(G). For a compact group G, it can be shown that here H ⊗ H ≅ TrigPol(G × G), where the isomorphism is given by

  ( Σ_{j=1}^m f_j ⊗ g_j )(x, y) := Σ_{j=1}^m f_j(x) g_j(y).

The algebra structure (H, m, η) is the usual one for the trigonometric polynomials, i.e., m(f ⊗ g) := f g and η(λ) = λ1, where 1(x) = 1 for all x ∈ G. By the Peter–Weyl Theorem 7.5.14, the C-vector space H is spanned by

  { φ_ij : φ = (φ_ij)_{i,j=1}^{dim(φ)}, [φ] ∈ Ĝ }.

Let us define the co-multiplication Δ : H → H ⊗ H by

  Δφ_ij := Σ_{k=1}^{dim(φ)} φ_ik ⊗ φ_kj;

we then see that

  (Δφ_ij)(x, y) = Σ_{k=1}^{dim(φ)} (φ_ik ⊗ φ_kj)(x, y) = Σ_{k=1}^{dim(φ)} φ_ik(x) φ_kj(y) = φ_ij(xy).

The co-unit ε : H → C is defined by εf := f(e), and the antipode S : H → H by (Sf)(x) := f(x^{-1}).

Exercise 9.2.9. In the example about H = TrigPol(G) above, check the validity of the Hopf algebra axioms.
Theorem 9.2.10 (Commutative C*-algebras and Hopf algebras). Let H be a commutative C*-algebra. If (H, m, η, Δ, ε, S) is a finite-dimensional Hopf algebra then there exists a Hopf algebra isomorphism H ≅ C(G), where G is a finite group and C(G) is endowed with the Hopf algebra structure given above.

Proof. Let G := Spec(H) = HOM(H, C). As H is a commutative C*-algebra, it is isometrically *-isomorphic to the C*-algebra C(G) via the Gelfand transform

  (f → f̂) : H → C(G),   f̂(x) := x(f).

The space G must be finite, because dim(C(G)) = dim(H) < ∞. Now e := ε ∈ G, because ε : H → C is an algebra homomorphism. This e ∈ G will turn out to be the neutral element of our group. Let x, y ∈ G. We identify the spaces C ⊗ C and C, and get an algebra homomorphism x ⊗ y : H ⊗ H → C ⊗ C ≅ C. Now Δ : H → H ⊗ H is an algebra homomorphism, so that (x ⊗ y)Δ : H → C is an algebra homomorphism! Let us denote xy := (x ⊗ y)Δ, so that xy ∈ G. This defines the group operation ((x, y) → xy) : G × G → G!

Inversion x → x^{-1} will be defined via the antipode S : H → H. We shall show that for a commutative Hopf algebra, the antipode is an algebra isomorphism. First we prove that S(1_H) = 1_H:

  S1_H = m(1_H ⊗ S1_H) = m(I ⊗ S)(1_H ⊗ 1_H) = m(I ⊗ S)Δ1_H
       = (I ∗ S)1_H = ηε1_H = 1_H.

Then we show that S(gh) = S(h)S(g), where g, h ∈ H, gh := m(g ⊗ h). Let us use the so-called Sweedler notation

  Δf =: Σ f_(1) ⊗ f_(2) =: f_(1) ⊗ f_(2);

consequently

  (Δ ⊗ I)Δf = (Δ ⊗ I)(f_(1) ⊗ f_(2)) = f_(1)(1) ⊗ f_(1)(2) ⊗ f_(2),
  (I ⊗ Δ)Δf = (I ⊗ Δ)(f_(1) ⊗ f_(2)) = f_(1) ⊗ f_(2)(1) ⊗ f_(2)(2),

and due to the co-associativity we may re-index as follows:

  (Δ ⊗ I)Δf =: f_(1) ⊗ f_(2) ⊗ f_(3) := (I ⊗ Δ)Δf

(notice that, e.g., f_(2) appears in different meanings above – this is just a matter of notation!). Then

  S(gh) = S(ε((gh)_(1)) (gh)_(2))
        = ε((gh)_(1)) S((gh)_(2))
        = ε(g_(1) h_(1)) S(g_(2) h_(2))
        = ε(g_(1)) ε(h_(1)) S(g_(2) h_(2))
        = ε(g_(1)) S(h_(1)(1)) h_(1)(2) S(g_(2) h_(2))
        = ε(g_(1)) S(h_(1)) h_(2) S(g_(2) h_(3))
        = S(h_(1)) ε(g_(1)) h_(2) S(g_(2) h_(3))
        = S(h_(1)) S(g_(1)(1)) g_(1)(2) h_(2) S(g_(2) h_(3))
        = S(h_(1)) S(g_(1)) g_(2) h_(2) S(g_(3) h_(3))
        = S(h_(1)) S(g_(1)) (gh)_(2) S((gh)_(3))
        = S(h_(1)) S(g_(1)) ε((gh)_(2))
        = S(h_(1)) S(g_(1)) ε(g_(2) h_(2))
        = S(h_(1)) S(g_(1)) ε(g_(2)) ε(h_(2))
        = S(h_(1) ε(h_(2))) S(g_(1) ε(g_(2)))
        = S(h) S(g);

this computation can be compared to

  (xy)^{-1} = e (xy)^{-1}
            = y^{-1} y (xy)^{-1}
            = y^{-1} e y (xy)^{-1}
            = y^{-1} x^{-1} x y (xy)^{-1}
            = y^{-1} x^{-1} e
            = y^{-1} x^{-1}

for x, y ∈ G! Since H is commutative, we have proven that S : H → H is an algebra homomorphism. Thereby xS : H → C is an algebra homomorphism. Let us denote x^{-1} := xS ∈ G, which is the inverse of x ∈ G! We leave it for the reader to show that (G, (x, y) → xy, x → x^{-1}) is indeed a group.

Exercise 9.2.11. Finish the proof of Theorem 9.2.10.
Exercise 9.2.12 (Universal enveloping algebra as a Hopf algebra). Let g be a Lie algebra and U(g) its universal enveloping algebra. For X ∈ g, extend the definitions

  ΔX := X ⊗ 1_{U(g)} + 1_{U(g)} ⊗ X,   εX := 0,   SX := −X,

so that you obtain a Hopf algebra structure (U(g), m, η, Δ, ε, S).

Exercise 9.2.13. Let (H, m, η, Δ, ε, S) be a finite-dimensional Hopf algebra.
(a) Endow the dual H′ = L(H, C) with a natural Hopf algebra structure via the duality (f, φ) → ⟨f, φ⟩_H := φ(f), where f ∈ H, φ ∈ H′.
(b) If G is a finite group and H = F(G), what are the Hopf algebra operations for H′?
(c) With a suitable choice for H, give an example of a non-commutative non-co-commutative Hopf algebra H ⊗ H′.

Exercise 9.2.14 (M.E. Sweedler’s example). Let (H, m, η) be the algebra spanned by the set {1, g, x, gx}, where 1 is the unit element and g² = 1, x² = 0 and xg = −gx. Let us define algebra homomorphisms ε : H → C and Δ : H → H ⊗ H by

  Δ(g) := g ⊗ g,   Δ(x) := x ⊗ 1 + g ⊗ x,
  ε(g) := 1,       ε(x) := 0.

Let us define a linear mapping S : H → H by

  S(1) := 1,   S(g) := g,   S(x) := −gx,   S(gx) := −x.

Show that (H, m, η, Δ, ε, S) is a non-commutative non-co-commutative Hopf algebra.

Remark 9.2.15. In Exercise 9.2.14, a nice concrete matrix example can be given. Let us define A ∈ C^{2×2} by

  A := [ 0  1 ]
       [ 1  0 ].

Let g, x ∈ C^{4×4} be given by the block matrices

  g := [ A   0 ]        x := [ 0  I_{C²} ]
       [ 0  −A ],            [ 0    0    ].

Then it is easy to see that H = span{I_{C⁴}, g, x, gx} is a four-dimensional subalgebra of C^{4×4} such that g² = I_{C⁴}, x² = 0 and xg = −gx.
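A hedged check of the matrix model just described (only the matrices themselves come from the remark; the helper names are ours): we write out g and x as explicit 4×4 matrices and verify the defining relations g² = I, x² = 0 and xg = −gx of Sweedler's four-dimensional algebra.

```python
# Sweedler's example: block matrices g = [[A,0],[0,-A]] with A = [[0,1],[1,0]],
# and x = [[0, I],[0, 0]], written out as 4x4 integer matrices.
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neg(a):
    return [[-v for v in row] for row in a]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

g = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, -1],
     [0, 0, -1, 0]]
x = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

assert mul(g, g) == I4                              # g^2 = I
assert mul(x, x) == [[0] * 4 for _ in range(4)]     # x^2 = 0
assert mul(x, g) == neg(mul(g, x))                  # xg = -gx
print("Sweedler relations hold for the matrix model")
```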
Part IV

Non-commutative Symmetries

In this part, we develop a non-commutative quantization of pseudo-differential operators on compact Lie groups. The idea is that it can be constructed in a way that runs more or less parallel to the Kohn–Nirenberg quantization of operators on R^n that was presented in Chapter 2, and to the toroidal quantization of operators on T^n that was developed in Chapter 4. The main advantage of such an approach is that once the basic notions and definitions are understood, one can see and enjoy a lot of features which are already familiar from the commutative analysis. The introduced matrix-valued full symbols turn out to have a number of interesting properties. The main difference with the toroidal quantization here is that, due to the non-commutativity of the group, symbols become matrix-valued, with sizes depending on the dimensions of the unitary irreducible representations of the group, which are all finite-dimensional because the group is compact. Among other things, the introduced approach provides a characterisation of the Hörmander class of pseudo-differential operators on a compact Lie group G using a global quantization of operators, thus relying on the representation theory rather than on the usual expressions in local coordinate charts. This yields a notion of the full symbol of an operator as a mapping defined globally on G × Ĝ, where Ĝ is the unitary dual of G. As such, this presents an advantage over the local theory, where only the notion of the principal symbol can be defined globally. In the case of the torus G = T^n, we naturally have G × Ĝ ≅ T^n × Z^n, and we recapture the notion of a toroidal symbol introduced in Chapter 4, where symbols are scalar-valued (or 1 × 1 matrix-valued) because all the unitary irreducible representations of the torus are one-dimensional. As an important example, the approach developed here will give us quite detailed information on the global quantization of operators on the three-dimensional sphere S³.
More generally, we note that if we have a closed simply-connected three-dimensional manifold M, then by the recently resolved Poincaré conjecture there is a global diffeomorphism M ≃ S³ ≃ SU(2) that turns M into a Lie group with a group structure induced by S³ (or by SU(2)). Thus, we can use the approach developed for SU(2) to immediately obtain the corresponding global quantization of operators on M with respect to this induced group product. In fact, all the formulae remain completely the same since the unitary dual of SU(2) (or of S³ in the quaternionic R⁴) is mapped by this diffeomorphism as well; for an example of this construction in the case of S³ ≃ SU(2) see Section 12.5. The choice of the group structure on M may be not unique and is not canonical, but after using the machinery that we develop for SU(2), the corresponding quantization can be described entirely in terms of M; for an example compare Theorem 12.5.3 for S³ and Theorem 12.4.3 for SU(2). In this sense, as different quantizations of operators exist already on R^n depending on the choice of the underlying structure (e.g., the Kohn–Nirenberg quantization, Weyl quantizations, etc.), the possibility to choose different group products on M resembles this. Due to space limitations, we postpone the detailed analysis of operators on the higher-dimensional spheres S^n ≃ SO(n + 1)/SO(n) viewed as homogeneous spaces. However, we will introduce a general machinery with which to obtain the global quantization on homogeneous spaces using the one on the Lie group that acts on the space. Although we do not have general analogues of the diffeomorphic Poincaré conjecture in higher dimensions, this will cover cases when M is a convex surface or a surface with positive curvature tensor, as well as more general manifolds in terms of their Pontryagin class, etc. Thus, the cases of the three-dimensional sphere S³ and the Lie group SU(2) are analysed in detail in Chapter 12. There we show that pseudo-differential operators from Hörmander’s classes Ψ^m(SU(2)) and Ψ^m(S³) have matrix-valued symbols with a remarkable rapid off-diagonal decay property.
In Chapter 11 we develop the necessary foundations of this analysis on SU(2) which together with Chapter 12 provides a more detailed example of the quantization from Chapter 10. Finally, in Chapter 13 we give an application of these constructions to analyse pseudo-differential operators on homogeneous spaces.
Chapter 10

Pseudo-differential Operators on Compact Lie Groups

10.1 Introduction

In this chapter we develop a global theory of pseudo-differential operators on general compact Lie groups. As usual, S^m_{1,0}(R^n × R^n) ⊂ C^∞(R^n × R^n) refers to the Euclidean space symbol class, defined by the symbol inequalities

  |∂_ξ^α ∂_x^β p(x, ξ)| ≤ C (1 + |ξ|)^{m−|α|}    (10.1)

for all multi-indices α, β ∈ N₀^n, N₀ = {0} ∪ N, where the constant C is independent of x, ξ ∈ R^n but may depend on α, β, p, m. On a compact Lie group G we define the class Ψ^m(G) to be the usual Hörmander class of pseudo-differential operators of order m. Thus, the operator A belongs to Ψ^m(G) if in (all) local coordinates the operator A is a pseudo-differential operator on R^n with some symbol p(x, ξ) satisfying estimates (10.1), see Definition 5.2.11. Of course, the symbol p depends on the local coordinate systems.

It is a natural idea to build pseudo-differential operators out of smooth families of convolution operators on Lie groups. In this work, we strive to develop the convolution approach into a symbolic quantization, which always provides a much more convenient framework for the analysis of operators. For this, our analysis of operators and their symbols is based on the representation theory of Lie groups. This leads to the description of the full symbols of pseudo-differential operators on Lie groups as sequences of matrices of growing size equal to the dimension of the corresponding representation of the group. Moreover, the analysis is global and is not confined to neighbourhoods of the neutral element since it does not rely on the exponential map and its properties. We also characterise, in terms of the introduced quantizations, the standard Hörmander classes Ψ^m(G) on Lie groups. One of the advantages of the presented approach is that we obtain a notion of full (global) symbols, compared with only principal symbols available in the standard theory via localisations.
In our analysis on a Lie group G, at some point we have to make a choice whether to work with left- or right-convolution kernels. Since left-invariant operators on C^∞(G) correspond to right-convolutions f → f ∗ k, once we decide to identify the Lie algebra g of G with the left-invariant vector fields on G, it becomes most natural to work with right-convolution kernels in the sequel, and to define symbols as we do in Definition 10.4.3. It is also known that globally defined symbols of pseudo-differential operators can be introduced on manifolds in the presence of a connection which allows one to use a suitable globally defined phase function, see, e.g., [151, 100, 109]. However, on compact Lie groups the use of the group structure allows one to develop a theory parallel to those of R^n and T^n owing to the fact that the Fourier analysis is well adapted to the underlying representation theory. Some elements of such a theory were discussed in [128, 129] as well as in the PhD thesis of V. Turunen [139]. However, here we present the finite-dimensional symbols and we do not rely on the exponential mapping, thus providing a genuinely global analysis in terms of the Lie group itself. We also note that the case of the compact commutative group T^n is also recovered from this general point of view; however, a more advanced analysis is possible in this case because of the close relation between T^n and R^n.

Unless specified otherwise, in this chapter G will stand for a general compact Lie group, and dμ_G will stand for the (normalised) Haar measure on G, i.e., the unique regular Borel probability measure which is left-translation-invariant:

  ∫_G f(x) dμ_G(x) = ∫_G f(yx) dμ_G(x)

for all f ∈ C(G) and y ∈ G. Then also

  ∫_G f(x) dμ_G(x) = ∫_G f(xy) dμ_G(x) = ∫_G f(x^{-1}) dμ_G(x),

see Remark 7.4.4. Usually we abbreviate dμ_G(x) to dx since this should cause no confusion.
10.2
Fourier series on compact Lie groups
We begin with the Fourier series on a compact group G. Let Rep(G) denote the set of all strongly Definition 10.2.1 (Rep(G) and G). continuous irreducible unitary representations of G. In the sequel, whenever we mention unitary representations (of a compact Lie group G), we always mean strongly continuous irreducible unitary representations, which are then automat denote the unitary dual of G, i.e., the set of equivalence ically smooth. Let G classes of irreducible unitary representations from Rep(G), see Definitions 6.3.18 denote the equivalence class of an irreducible unitary repreand 7.5.7. Let [ξ] ∈ G sentation ξ : G → U(Hξ ); the representation space Hξ is finite-dimensional since G is compact (see Corollary 7.5.6), and we set dim(ξ) = dim Hξ .
10.2. Fourier series on compact Lie groups
531
We will always equip a compact Lie group G with the Haar measure μG , i.e., the uniquely determined bi-invariant Borel regular probability measure, see Remark 7.4.4. For simplicity, we will write Lp (G) for Lp (μG ), G f dx for G f dμG , etc. First we collect several definitions scattered over previous chapters in different forms. Definition 10.2.2 (Fourier coefficients). Let us define the Fourier coefficient f(ξ) ∈ End(Hξ ) of f ∈ L1 (G) by f (x) ξ(x)∗ dx; (10.2) f(ξ) := G
more precisely, (f(ξ)u, v)Hξ =
G
f (x) (ξ(x)∗ u, v)Hξ dx =
G
f (x) (u, ξ(x)v)Hξ dx
for all u, v ∈ Hξ , where (·, ·)Hξ is the inner product of Hξ . Remark 10.2.3. Notice that ξ(x)∗ = ξ(x)−1 = ξ(x−1 ). Remark 10.2.4 (Fourier coefficients on Tn as a group). Let G = Tn . Let us naturally identify End(C) with C, and U(C) with {z ∈ C : |z| = 1}. For each k ∈ Zn , we define ek : G → U(C) by ek (x) := ei2πx·k . Then f (x) e−i2πx·k dx f(k) := f(ek ) = Tn
is the usual Fourier coefficient of f ∈ L1 (Tn ). Remark 10.2.5 (Intertwining isomorphisms). Let U ∈ Hom(η, ξ) be an intertwining isomorphism, i.e., let U : Hη → Hξ be a bijective unitary linear mapping such that U η(x) = ξ(x)U for all x ∈ G. Then we have f(η) = U −1 f(ξ) U ∈ End(Hη ).
(10.3)
Proposition 10.2.6 (Inner automorphisms). For u ∈ G, consider the inner automorphisms φu = (x → u−1 xu) : G → G. Then for all ξ ∈ Rep(G) we have f ◦ φu (ξ) = ξ(u) f(ξ) ξ(u)∗ .
(10.4)
Proof. We can calculate f ◦ φu (ξ) = f (u−1 xu) ξ(x)∗ dx = f (x) ξ(uxu−1 )∗ dx G G ∗ = ξ(u) f (x) ξ(x) dx ξ(u)∗ = ξ(u) f(ξ) ξ(u)∗ , G
which gives (10.4).
532
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
Proposition 10.2.7 (Convolutions). If f, g ∈ L1 (G) then f ∗ g = g f. Proof. If ξ ∈ Rep(G) then f ∗ g(ξ)
(f ∗ g)(x) ξ(x)∗ dx = f (xy −1 )g(y) dy ξ(x)∗ dx G G ∗ = g(y) ξ (y) f (xy −1 ) ξ(xy −1 )∗ dx dy
=
G
G
G
g(ξ) f(ξ),
=
completing the proof.
Remark 10.2.8. The product g f in Proposition 10.2.7 usually differs from f g because f(ξ), g(ξ) ∈ End(Hξ ) are operators unless G is commutative when they are scalars (see Corollary 6.3.26) and hence commute. This order exchange is due to the definition of the Fourier coefficients (10.2), where we chose the integration of the function with respect to ξ(x)∗ instead of ξ(x). This choice actually serves us well, as we chose to identify the Lie algebra g with left-invariant vector fields on the Lie group G: namely, a left-invariant continuous linear operator A : C ∞ (G) → C ∞ (G) can be presented as a right-convolution operator Ca = (f → f ∗ a), resulting in convenient expressions like a b f. C a Cb f = However, in Remark 10.4.13 we will still explain what would happen had we chosen another definition for the Fourier transform. Proposition 10.2.9 (Differentiating the convolution). Let Y ∈ g and let DY : C ∞ (G) → C ∞ (G) be defined by DY f (x) =
d f (x exp(tY )) dt
. t=0
Let f, g ∈ C ∞ (G). Then DY (f ∗ g) = f ∗ DY g. Proof. We have DY (f ∗ g)(x) =
f (y) G
d g(y −1 x exp(tY )) dt
dy = f ∗ DY g(x). t=0
We now summarise properties of the Fourier series as a corollary to the Peter–Weyl Theorem 7.5.14:
10.2. Fourier series on compact Lie groups
533
Corollary 10.2.10 (Fourier series). If ξ : G → U(d) is a unitary matrix representation then f(ξ) = f (x) ξ(x)∗ dx ∈ Cd×d G
has matrix elements f(ξ)mn =
f (x) ξ(x)nm dx ∈ C, 1 ≤ m, n ≤ d. G
If here f ∈ L2 (G) then f(ξ)mn = (f, ξ(x)nm )L2 (G) ,
(10.5)
and by the Peter–Weyl Theorem 7.5.14 we have f (x) = dim(ξ) Tr ξ(x) f(ξ) [ξ]∈G
=
dim(ξ)
[ξ]∈G
d
ξ(x)nm f(ξ)mn
(10.6)
m,n=1
for almost every x ∈ G, where the summation is understood so that from each class we pick just (any) one representative ξ ∈ [ξ]. The particular choice of a [ξ] ∈ G representation from the representation class is irrelevant due to formula (10.3) and the presence of the trace in (10.6). The convergence in (10.6) is not only pointwise almost everywhere on G but also in the space L2 (G). Example. For f ∈ L2 (Tn ), we get f (x) =
ei2πx·k f(k),
k∈Zn
where f(k) = with C1×1 .
Tn
f (x) e−i2πx·k dx is as in Remark 10.2.4. Here C was identified
Finally, we record a useful formula for representations: Remark 10.2.11. Let e ∈ G be the neutral element of G and let ξ be a unitary matrix representation of G. The unitarity of the representation ξ implies the identity ξ(x−1 )mk ξ(x)kn = ξ(x)km ξ(x)kn . δmn = ξ(e)mn = ξ(x−1 x)mn = k
Similarly, δmn =
k
ξ(x)mk ξ(x)nk .
k
Here, as usual, δmn is the Kronecker delta: δmn = 1 for m = n, and δmn = 0 otherwise.
534
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
10.3
Function spaces on the unitary dual
In this section we lay down a functional analytic foundation concerning the func tion spaces that will be useful in the sequel. In particular, distribution space S (G) is of importance since it provides a distributional interpretation for series on G.
10.3.1
Spaces on the group G
We recall Definition 8.3.45 of the Laplace operator LG on a Lie group G: Remark 10.3.1 (Laplace operator L = LG ). The Laplace operator LG is a secondorder negative definite bi-invariant partial differential operator on G corresponding to the Casimir element of the universal enveloping algebra U(g) of the Lie algebra g of G. If G is equipped with the unique (up to a constant) bi-invariant Riemannian metric, LG is its Laplace–Beltrami operator. We will often denote the Laplace operator simply by L if there is no need to emphasize the group G in the notation. We refer to Section 8.3.2 for a discussion of its main properties. In Definition 5.2.4 we defined C k mappings on a manifold. These can be also characterised globally: Exercise 10.3.2. Let n = dim G and let {Yj }nj=1 be a basis of the Lie algebra g of G. Show that f ∈ C k (G) if and only if ∂ α f ∈ C(G) for all ∂ α = Y1α1 · · · Ynαn for all |α| ≤ k, or if and only if Lf ∈ C(G) for all L ∈ U(g) of degree ≤ k. Exercise 10.3.3. Show that f ∈ C ∞ (G) if and only if (−LG )k f ∈ C(G) for all k ∈ N. Show that f ∈ C ∞ (G) if and only if Lf ∈ C(G) for all L ∈ U(g). (Hint: use a priori estimates from Theorem 2.6.9.) We can recall from Remark 5.2.15 the definition of the space D (M ) of distributions on a compact manifold M . Let us be more precise in the case of a compact Lie group G: Definition 10.3.4 (Distributions D (G)). We define the space of distributions D (G) as the space of all continuous linear functionals on C ∞ (G). This means that u ∈ D (G) if it is a functional u : C ∞ (G) → C such that 1. u is linear, i.e., u(αϕ + βψ) = αu(ϕ) + βu(ψ) for all α, β ∈ C and all ϕ, ψ ∈ C ∞ (G); 2. u is continuous, i.e., u(ϕj ) → u(ϕ) in C whenever ϕj → ϕ in C ∞ (G), as j → ∞. Here1 ϕj → ϕ in C ∞ (G) if, e.g.,2 ∂ α ϕj → ∂ α ϕ for all ∂ α ∈ U(g), as j → ∞. We define the convergence in the space D (G) as follows. Let uj , u ∈ D (G). We will say that uj → u in D (G) as j → ∞ if uj (ϕ) → u(ϕ) in C as j → ∞, for all ϕ ∈ C ∞ (G). 1 For
a general setting on compact manifolds see Remark 5.2.15. 10.3.2 provides more options.
2 Exercise
10.3. Function spaces on the unitary dual
535
Definition 10.3.5 (Duality ·, ·G ). Let u ∈ D (G) and ϕ ∈ C ∞ (G). We write u, ϕG := u(ϕ). If u ∈ Lp (G), 1 ≤ p ≤ ∞, we can identify u with a distribution in D (G) (which we will continue to denote by u) in a canonical way by u, ϕG = u(ϕ) :=
u(x) ϕ(x) dx, G
where dx is the Haar measure on G. Exercise 10.3.6. Let 1 ≤ p ≤ ∞. Show that if uj → u in Lp (G) as j → ∞ then uj → u in D (G) as j → ∞. Remark 10.3.7 (Derivations in D (G)). Similar to operations on distributions in Rn described in Section 1.3.2, we can define different operations on distributions on G. For example, for Y ∈ g, we can differentiate u ∈ D (G) with respect to the vector field Y by defining (Y u)(ϕ) := −u(Y ϕ), for all ϕ ∈ C ∞ (G). Here the derivative Y ϕ = DY ϕ is as in Proposition 10.2.9. Similarly, if ∂ α ∈ U(g) is a differential operator of order |α|, we define (∂ α u)(ϕ) := (−1)|α| u(∂ α ϕ), for all ϕ ∈ C ∞ (G). Exercise 10.3.8. Show that ∂ α u ∈ D (G) and that ∂ α : D (G) → D (G) is continuous. Definition 10.3.9 (Sobolev space H s (G)). First let us note that the Laplacian L = LG is symmetric and I − L is positive. Set Ξ := (I − L)1/2 . Then Ξs ∈ L(C ∞ (G)) and Ξs ∈ L(D (G)) for every s ∈ R. Let us define (f, g)H s (G) := (Ξs f, Ξs g)L2 (G) (f, g ∈ C ∞ (G)). The completion of C ∞ (G) with respect to the norm f → f H s (G) = (f, f )H s (G) gives us the Sobolev space H s (G) of order s ∈ R. This is the same space as that in Definition 5.2.16 on general manifolds, or as the Sobolev space obtained using any smooth partition of unity on the compact manifold G, by Corollary 5.2.18. The operator Ξr is a Sobolev space isomorphism H s (G) → H s−r (G) for every r, s ∈ R. 1/2
Exercise 10.3.10. Show that if Y ∈ g, then the differentiation with respect to Y is a bounded linear operator from H k (G) to H k−1 (G) for all k ∈ N. Extend this to k ∈ R.
536
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
Remark 10.3.11 (Sobolev spaces and C ∞ (G)). We have H 2k (G) = C ∞ (G).
(10.7)
k∈N
This can be seen locally, since H 2k (G) = domain((−LG )k ) and LG is elliptic, so that (10.7) follows from the local a priori estimates in Theorem 2.6.9. Since the analysis of Sobolev spaces is closely intertwined with spaces on G, we study them also in the next section, in particular we refer to Remark 10.3.24 for their characterisation on the Fourier transform side.
10.3.2 Spaces on the dual G Since we will be mostly using the “right” Peter–Weyl Theorem (Theorem 7.5.14), we may simplify the notation slightly, also adopting it to the analysis of pseudodifferential operators from the next sections. Thus, the space < dim(ξ) dim(ξ) ξij : ξ = (ξij )i,j=1 , [ξ] ∈ G is an orthonormal basis for L2 (G), and the space Hξ := span{ξij : 1 ≤ i, j ≤ dim(ξ)} ⊂ L2 (G) is πR -invariant, ξ ∼ πR |Hξ , and L2 (G) =
-
Hξ .
[ξ]∈G
By choosing a unitary matrix representation from each equivalence class [ξ] ∈ G, ξ dim(ξ)×dim(ξ) by choosing a basis in the linear space Hξ . we can identify H with C Remark 10.3.12 (Spaces Hξ and Hξ ). We would like to point out a difference between spaces Hξ and Hξ to eliminate any confusion. Recall that if ξ ∈ Rep(G), then ξ is a mapping ξ : G → U(Hξ ), where Hξ is the representation space of ξ, with dim Hξ = dim(ξ). On the other hand, the space Hξ ⊂ L2 (G) is the span of the matrix elements of ξ and dim Hξ = (dim(ξ))2 . In the notation of the right and left Peter–Weyl Theorems in Theorem 7.5.14 and Remark 7.5.16, we have -
dim(ξ)
Hξ =
i=1
-
dim(ξ) ξ Hi,· =
ξ H·,j .
j=1
Informally, spaces Hξ can be viewed as “columns/rows” of Hξ , because for example ξ(x)v ∈ Hξ for every v ∈ Hξ . We recall the important property of the Laplace operator on spaces Hξ from Theorem 8.3.47:
10.3. Function spaces on the unitary dual
537
the space Theorem 10.3.13 (Eigenvalues of the Laplacian on G). For every [ξ] ∈ G ξ H is an eigenspace of LG and −LG |Hξ = λξ I, for some λξ ≥ 0. Exercise 10.3.14. Show that Hξ ⊂ C ∞ (G) for all ξ ∈ Rep(G). (Hint: Use Theorem 10.3.13 and the ellipticity of LG .) which we can now From Definition 7.6.11 we recall the Hilbert space L2 (G) describe as follows. But first, we can look at the space of all mappings on G: consists of all mappings Definition 10.3.15 (Space M(G)). The space M(G) → F :G
L(Hξ ) ⊂
[ξ]∈G
∞
Cm×m ,
m=1
In matrix representations, we can view satisfying F ([ξ]) ∈ L(Hξ ) for every [ξ] ∈ G. F ([ξ]) ∈ Cdim(ξ)×dim(ξ) as a dim(ξ) × dim(ξ) matrix. consists of all mappings F ∈ M(G) such that The space L2 (G) 2 ||F ||2L2 (G) dim(ξ) F ([ξ])HS < ∞, := [ξ]∈G
where ||F ([ξ])||HS =
< Tr (F ([ξ]) F ([ξ])∗ )
stands for the Hilbert–Schmidt norm of the linear operator F ([ξ]), see Definition B.5.43. Thus, the space = L2 (G)
⎧ ⎨ ⎩
→ F :G
L(Hξ ), F : [ξ] → L(Hξ ) :
[ξ]∈G
||F ||2L2 (G) :=
2
dim(ξ) F ([ξ])HS
[ξ]∈G
⎫ ⎬ 0 such that the inequality dim(ξ) ≤ Cξ
dim G 2
holds for all ξ ∈ Rep(G). Proof. We note by Theorem 10.3.13 that ξ is an eigenvalue of the first-order elliptic operator (1 − LG )1/2 . The corresponding eigenspace Hξ has the dimension dim(ξ)2 . Denoting by n = dim G, the Weyl formula for the counting function of the eigenvalues of (1 − LG )1/2 yields
dim(ξ)2 = C0 λn + O(λn−1 )
ξ≤λ 2
n
as λ → ∞. This implies the estimate dim(ξ) ≤ Cξ for large ξ, implying the statement. consists of all mappings H ∈ Definition 10.3.20 (Space S(G)). The space S(G) M(G) such that for all k ∈ N we have pk (H) :=
k
dim(ξ) ξ ||H(ξ)||HS < ∞.
(10.10)
[ξ]∈G
converges to H ∈ S(G) in S(G), and write Hj → H We will say that Hj ∈ S(G) in S(G) as j → ∞, if pk (Hj − H) → 0 as j → ∞ for all k ∈ N, i.e., if
k
dim(ξ) ξ ||Hj (ξ) − H(ξ)||HS → 0 as j → ∞,
[ξ]∈G
for all k ∈ N. In particular, the folWe can take different families of seminorms on S(G). lowing equivalence will be of importance since it provides a more direct relation with Sobolev spaces on G:
540
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
let pk (H) be as in Proposition 10.3.21 (Seminorms on S(G)). For H ∈ M(G), (10.10), and let us define a family qk (H) by ⎛ qk (H) := ⎝
⎞1/2 dim(ξ) ξ ||H(ξ)||2HS ⎠ k
.
[ξ]∈G
Then pk (H) < ∞ for all k ∈ N if and only if qk (H) < ∞ for all k ∈ N . We claim that pk (H) ≤ Cq4k (H) if k is large enough, and Proof. Let H ∈ M(G). older’s that q2k (H) ≤ pk (H). Indeed, by the Cauchy–Schwartz inequality (i.e., H¨ we inequality similar to that in Lemma 3.3.28 where we use the discreteness of G) can estimate −k 2k (dim(ξ))1/2 ξ (dim(ξ))1/2 ξ ||H(ξ)||HS pk (H) = [ξ]∈G
⎛ ⎝
≤
⎞1/2 ⎛
dim(ξ) ξ
−2k ⎠
[ξ]∈G
≤
⎝
⎞1/2 dim(ξ) ξ
4k
||H(ξ)||2HS ⎠
[ξ]∈G
Cq4k (H),
if we choose k large enough and use Proposition 10.3.19. Conversely, we have 2k (dim(ξ))2 ξ ||H(ξ)||2HS pk (H)2 = [ξ]=[η]∈G
+ ≥
k
k
dim(ξ) dim(η) ξ η ||H(ξ)||HS ||H(η)||HS
[ξ]=[η]
dim(ξ) ξ
2k
||H(ξ)||2HS
[ξ]∈G
= q2k (H)2 . As a corollary of Proposition 10.3.21 we have we have Corollary 10.3.22. For H ∈ M(G),
−k
dim(ξ) ξ
||H(ξ)||HS < ∞
[ξ]∈G
for some k ∈ N if and only if −l dim(ξ) ξ ||H(ξ)||2HS < ∞ [ξ]∈G
for some l ∈ N.
10.3. Function spaces on the unitary dual
541
Let us now summarise properties of the Fourier transform on L2 (G). Theorem 10.3.23 (Fourier inversion). Let G be a compact Lie group. The Fourier The transform f → FG f = f defines a surjective isometry L2 (G) → L2 (G). inverse Fourier transform is given by −1 H)(x) = dim(ξ) Tr (ξ(x) H(ξ)), (10.11) (FG [ξ]∈G
and we have
−1 FG ◦ FG = id
−1 FG ◦ FG = id
and
respectively. Moreover, the Fourier transform FG is unitary, on L2 (G) and L2 (G), ∗ −1 we have FG = FG , and for any H ∈ S(G) −1 −1 (FG ∗ H)(x) = FG (H ∗ )(x−1 ) = FG H (x) for all x ∈ G, where H ∗ (ξ) := H(ξ)∗ for all ξ ∈ Rep(G). Proof. In Theorem 7.6.13 we have already shown that the Fourier transform is By Corollary 7.6.10 the inverse Fourier a surjective isometry L2 (G) → L2 (G). transform is given by (10.11). Let us show the last part. Let f ∈ C ∞ (G) and Then we have H ∈ S(G). (f, FG ∗ H)L2 (G) = (FG f, H)L2 (G) dim(ξ) Tr f(ξ) H(ξ)∗ = +
=
f (x) G
= G
f (x) ξ ∗ (x) dx H(ξ)∗
dim(ξ) Tr
[ξ]∈G
=
[ξ]∈G
,
G
⎧ ⎨ ⎩
[ξ]∈G
⎫ ⎬ dim(ξ) Tr ξ(x−1 ) H(ξ)∗ dx ⎭
−1 f (x) FG (H ∗ )(x−1 ) dx,
−1 which implies FG ∗ H(x) = FG (H ∗ )(x−1 ). Finally, the unitarity of the Fourier transform follows from continuing the calculation: , + dim(ξ) Tr f (x) ξ ∗ (x) dx H(ξ)∗ (f, FG ∗ H)L2 (G) = [ξ]∈G
f (x)
= G
G
⎧ ⎨ ⎩
[ξ]∈G
dim(ξ) Tr (ξ ∗ (x) H(ξ)∗ )
⎫ ⎬ ⎭
dx
542
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
=
f (x) G
= G
⎧ ⎨ ⎩
[ξ]∈G
⎫ ⎬ dim(ξ)Tr (ξ(x) H(ξ)) dx ⎭
−1 f (x) FG H (x) dx.
Remark 10.3.24 (Sobolev space H s (G)). On the Fourier transform side, in view of Theorem 10.3.23, the Sobolev space H s (G) can be characterised by s H s (G) = {f ∈ D (G) : ξ f(ξ) ∈ L2 (G)}.
We also note that since G is compact, for s = 2k and k ∈ N, we have that f ∈ H 2k (G) if (−LG )k f ∈ L2 (G). By Theorem 10.3.25, the Fourier transform FG is a continuous bijection from H 2k (G) to the space ⎧ ⎫ ⎨ ⎬ 2k : F ∈ M(G) dim(ξ) ξ ||F (ξ)||2HS < ∞ . ⎩ ⎭ [ξ]∈G
in preparation We now analyse the Fourier transforms on C ∞ (G) and S(G) for their extension to the spaces of distributions. Theorem 10.3.25 (Fourier transform on C ∞ (G) and S(G)). The Fourier transform and its inverse F −1 : S(G) → C ∞ (G) are continuous, FG : C ∞ (G) → S(G) G −1 −1 respectively. ◦ FG = id and FG ◦ FG = id on C ∞ (G) and S(G), satisfying FG Proof. By (10.7) in Remark 10.3.11, writing C ∞ (G) = k∈N H 2k (G), the smoothness f ∈ C ∞ (G) is equivalent to f ∈ H 2k (G) for all k ∈ N, by Remark 10.3.24. This means that 2k dim(ξ) ξ ||f||2HS < ∞ [ξ]∈G
Consequently, fj → f in C ∞ (G) implies for all k ∈ N. Hence FG f ∈ S(G). 2k that fj → f in H (G) for all k ∈ N. Taking the Fourier transform we see that q2k (FG fj − FG f ) → 0 as j → ∞ for all k ∈ N, which implies that FG fj → FG f Inverting this argument implies the continuity of the inverse Fourier in S(G). −1 to C ∞ (G). The last part of the theorem follows from transform FG from S(G) Theorem 10.3.23. is a Montel nuclear space). The space S(G) is a Montel Corollary 10.3.26 (S(G) space and a nuclear space. is homeoProof. This follows from the same properties of C ∞ (G) because S(G) morphic to C ∞ (G), a homeomorphism given by the Fourier transform. See Exercises B.3.35 and B.3.51.
10.3. Function spaces on the unitary dual
543
of slowly increasing or tempered Definition 10.3.27 (Space S (G)). The space S (G) for distributions on the unitary dual G is defined as the space of all H ∈ M(G) which there exists some k ∈ N such that −k dim(ξ) ξ ||H(ξ)||HS < ∞. [ξ]∈G
is defined as follows. We will say that Hj ∈ S (G) The convergence in S (G) converges to H ∈ S (G) in S (G) as j → ∞, if there exists some k ∈ N such that −k dim(ξ) ξ ||Hj (ξ) − H(ξ)||HS → 0 [ξ]∈G
as j → ∞. Lemma 10.3.28 (Trace and Hilbert–Schmidt norm). Let H be a Hilbert space and let A, B ∈ S2 (H) be Hilbert–Schmidt operators5 . Then |Tr(AB)| ≤ ||A||HS ||B||HS . Proof. By the Cauchy–Schwartz inequality for the (Hilbert–Schmidt) inner product on S2 (H) we have |Tr(AB)| = |A, B ∗ HS | ≤ ||A||HS ||B||HS ,
proving the required estimate. and h ∈ S(G). We write Definition 10.3.29 (Duality ·, ·G ). Let H ∈ S (G) H, hG :=
dim(ξ) Tr (H(ξ) h(ξ)).
(10.12)
[ξ]∈G
The sum is well defined in view of dim(ξ) |Tr (H(ξ) h(ξ))| H, hG ≤ [ξ]∈G
≤
dim(ξ) H(ξ)HS h(ξ)HS
[ξ]∈G
⎛
≤⎝
⎞1/2 ⎛ dim(ξ) ξ
[ξ]∈G
< ∞, 5 Recall
Definition B.5.43.
−k
H(ξ)HS ⎠ 2
⎝
[ξ]∈G
⎞1/2 dim(ξ) ξ h(ξ)HS ⎠ k
2
544
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
which is finite in view of Proposition 10.3.21 and Corollary 10.3.22. Here we used Lemma 10.3.28 and the Cauchy–Schwarz inequality (see Lemma 3.3.28) in the and estimate. The bracket ·, ·G in (10.12) introduces the duality between S (G) is the dual space to S(G). so that S (G) S(G) is in S (G)). The space S(G) Proposition 10.3.30 (Sequential density of S(G) sequentially dense in S (G). Proof. We use the standard approximation in sequence spaces by cutting the se For ξ ∈ G we define quence to obtain its approximation. Thus, let H ∈ S (G). " H(ξ), if ξ ≤ j, Hj (ξ) := 0, if ξ > j. Let k ∈ N be such that
dim(ξ) ξ
−k
||H(ξ)||HS < ∞.
(10.13)
[ξ]∈G
for all j, and Hj → H in S (G) as j → ∞, because Clearly Hj ∈ S(G) −k −k dim(ξ) ξ ||Hj (ξ) − H(ξ)||HS = dim(ξ) ξ ||H(ξ)||HS → 0 [ξ]∈G
[ξ]∈G,ξ>j
as j → 0, in view of the convergence of the series in (10.13) and the fact that the eigenvalues of the Laplacian on G are increasing to infinity. We now establish a relation of the duality brackets ·, ·G and ·, ·G with the Fourier transform and its inverse. This will allow us to extend the actions of the Fourier transforms to the spaces of distributions on G and G. Proposition 10.3.31 (Dualities and Fourier transforms). Let ϕ ∈ C ∞ (G), h ∈ Then we have the identities S(G). 8 9 −1 FG ϕ, hG = ϕ, ι ◦ FG h G and
9 8 −1 FG h, ϕ G = h, FG (ι ◦ ϕ)G ,
where (ι ◦ ϕ)(x) = ϕ(x−1 ). Proof. We can calculate FG ϕ, hG =
dim(ξ) Tr (ϕ(ξ) h(ξ))
[ξ]∈G
=
[ξ]∈G
+ dim(ξ) Tr G
,
ϕ(x) ξ ∗ (x) dx h(ξ)
10.3. Function spaces on the unitary dual
⎧ ⎨
=
ϕ(x)
⎩
G (10.11)
[ξ]∈G
545
⎫ ⎬ dim(ξ) Tr ξ(x−1 ) h(ξ) dx ⎭
−1 = ϕ(x) (FG h)(x−1 ) dx G 8 9 −1 = ϕ, ι ◦ FG h G.
Similarly, we have 8
−1 h, ϕ FG
9
=
G
−1 (FG h)(x) ϕ(x) dx ⎫ ⎧ ⎨ ⎬ dim(ξ) Tr (ξ(x) h(ξ)) ϕ(x) dx ⎭ G⎩ [ξ]∈G +" # , dim(ξ) Tr ξ(x−1 ) ϕ(x−1 ) dx h(ξ) G
=
=
G
[ξ]∈G
=
+" dim(ξ) Tr G
[ξ]∈G
=
# , ξ (x) (ι ◦ ϕ)(x) dx h(ξ) ∗
dim(ξ) Tr (ι1 ◦ ϕ(ξ) h(ξ))
[ξ]∈G
=
h, FG (ι ◦ ϕ)G ,
completing the proof.
Proposition 10.3.31 motivates the following definitions: Definition 10.3.32 (Fourier transforms on D (G) and S (G)). For u ∈ D (G), we by ∈ S (G) define FG u ≡ u 8 9 −1 FG u, hG := u, ι ◦ FG h G, (10.14) For H ∈ S (G), we define F −1 H ∈ D (G) by for all h ∈ S(G). G 8 −1 9 FG H, ϕ G := H, FG (ι ◦ ϕ)G ,
(10.15)
for all ϕ ∈ C ∞ (G). Theorem 10.3.33 (Well-defined and continuous). For u ∈ D (G) and H ∈ S (G), −1 their forward and inverse Fourier transforms FG u ∈ S (G) and FG H ∈ D (G) and F −1 : S (G) → are well defined. Moreover, the mappings FG : D (G) → S (G) G D (G) are continuous, and −1 FG ◦ FG = id
respectively. on D (G) and S (G),
and
−1 FG ◦ FG = id
546
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
its inverse Fourier transform satisfies F −1 h ∈ C ∞ (G) by Proof. For h ∈ S(G), G −1 h ∈ C ∞ (G), so that the definition in (10.14) Theorem 10.3.25. Therefore, ι ◦ FG −1 → C ∞ (G) and ι : C ∞ (G) → : S(G) makes sense. Moreover, the mappings FG implying C ∞ (G) are continuous, so that FG u is a continuous functional on S(G), that FG9 : D 8(G) → S (G) that FG u ∈ S (G). Let us now show 8 9 is continuous. −1 −1 h G → u, ι ◦ FG h G as j → ∞ for Indeed, if uj → u in D (G), then uj , ι ◦ FG implying that FG uj → FG u in S (G). every h ∈ S(G), −1 is similar, as well as the The proof that FG H ∈ D (G) for every H ∈ S (G) −1 → D (G), and are left as Exercise 10.3.34. continuity of F : S (G) G
−1 To show that FG ◦ FG = id on D (G), we take u ∈ D (G) and ϕ ∈ C ∞ (G), and calculate 9 8 −1 (10.15) = FG u, FG (ι ◦ ϕ)G (FG ◦ FG )u, ϕ G 8 9 (10.14) −1 = u, ι ◦ FG (FG (ι ◦ ϕ)) G Theorem 10.3.25
u, ϕG .
=
and h ∈ S(G), we have Similarly, for H ∈ S (G) 8 9 8 −1 9 (10.14) −1 −1 (FG ◦ FG )H, h G = FG H, ι ◦ FG h G 8 9 (10.15) −1 = H, FG ι ◦ ι ◦ FG h G Theorem 10.3.25
H, hG ,
=
completing the proof.
−1 H∈ Exercise 10.3.34. Complete the proof of Theorem 10.3.33 by showing that FG −1 D (G) for every H ∈ S (G) and that FG : S (G) → D (G) is continuous.
Corollary 10.3.35 (Sequential density of C ∞ (G) in D (G)). The space C ∞ (G) is sequentially dense in D (G). Proof. The statement follows from Proposition 10.3.30 saying that the space S(G) is sequentially dense in S (G), and properties of the Fourier transform from Theorems 10.3.25 and 10.3.33.
10.3.3
Spaces Lp (G)
≡ Definition 10.3.36 (Spaces Lp (G)). For 1 ≤ p < ∞, we will write Lp (G) 2 1 p( p − 2 ) p for the space of all H ∈ S (G) such that G, dim ⎛ ⎝ ||H||Lp (G) :=
[ξ]∈G
⎞1/p (dim(ξ))p( p − 2 ) ||H(ξ)||pHS ⎠ 2
1
< ∞.
10.3. Function spaces on the unitary dual
547
≡ ∞ G, dim−1/2 for the space of all H ∈ S (G) For p = ∞, we will write L∞ (G) such that −1/2 ||H||L∞ (G) ||H(ξ)||HS < ∞. := sup (dim(ξ)) [ξ]∈G
but the notation p G, dimp( p2 − 12 ) can be also used We will usually write Lp (G) to emphasize that these spaces have a structure of weighted sequence spaces on with the weights given by the powers of the dimensions of the the discrete set G, representations. are Banach spaces for all 1 ≤ p ≤ ∞. Exercise 10.3.37. Prove that spaces Lp (G) = 2 G, = dim1 and L1 (G) Remark 10.3.38. Two important cases of L2 (G) dim3/2 are defined by the norms 1 G, ⎛ ⎝ ||H||L2 (G) :=
⎞1/2 dim(ξ) ||H(ξ)||2HS ⎠
,
[ξ]∈G
which is already familiar from (10.8), and by (dim(ξ))3/2 ||H(ξ)||HS . ||H||L1 (G) := [ξ]∈G
We first recall a result on We now discuss several properties of spaces Lp (G). the interpolation of weighted spaces from [14, Theorem 5.5.1]: Theorem 10.3.39 (Interpolation of weighted spaces). Let us write dμ0 (x) = w0 (x) dμ(x), dμ1 (x) = w1 (x) dμ(x), and write Lp (w) = Lp (w dμ) for the weight w. Suppose that 0 < p0 , p1 < ∞. Then (Lp0 (w0 ), Lp1 (w1 ))θ,p = Lp (w), where 0 < θ < 1,
1 p
=
1−θ p0
+
θ p1 ,
p(1−θ)/p0
and w = w0
pθ/p1
w1
.
From this we obtain: spaces). Let 1 ≤ p0 , p1 < ∞. Then Proposition 10.3.40 (Interpolation of Lp (G) Lp1 (G) = Lp (G), Lp0 (G), θ,p
where 0 < θ < 1 and
1 p
=
1−θ p0
+
θ p1 .
= Proof. The statement follows from Theorem 10.3.39 if we regard Lp (G) 2 1 p − with the weight given dim ( p 2 ) as a weighted sequence space over G p G, 2 1 by dim(ξ)p( p − 2 ) .
548
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
Lemma 10.3.41 (Hilbert–Schmidt norm of representations). If ξ ∈ Rep(G), then < ||ξ(x)||HS = dim(ξ) for every x ∈ G. Proof. We have ||ξ(x)||HS = (Tr (ξ(x) ξ(x)∗ ))
1/2
1/2 < = Tr Idim(ξ) = dim(ξ),
by the unitarity of the representation ξ.
The Fourier transProposition 10.3.42 (Fourier transforms on L1 (G) and L1 (G)). 1 ∞ form FG is a linear bounded operator from L (G) to L (G) satisfying ||f||L∞ (G) ≤ ||f ||L1 (G) . −1 to is a linear bounded operator from L1 (G) The inverse Fourier transform FG ∞ L (G) satisfying −1 H||L∞ (G) ≤ ||H||L1 (G) ||FG . Proof. Using f(ξ) = G f (x) ξ(x)∗ dx, by Lemma 10.3.41 we get ||f (ξ)||HS ≤ |f (x)| ||ξ(x)∗ ||HS dx ≤ (dim(ξ))1/2 ||f ||L1 (G) . G
Therefore, −1/2 ||f (ξ)||HS ≤ ||f ||L1 (G) . ||f||L∞ (G) = sup (dim(ξ)) [ξ]∈G
−1 H)(x) = [ξ]∈G dim(ξ) Tr (ξ(x) H(ξ)), by Lemma On the other hand, using (FG 10.3.28 and Lemma 10.3.41 we have −1 H)(x)| ≤ dim(ξ) ||ξ(x)||HS ||H(ξ)||HS |(FG [ξ]∈G
=
(dim(ξ))3/2 ||H(ξ)||HS
[ξ]∈G
=
||H||L1 (G) ,
−1 from which we get ||FG H||L∞ (G) ≤ ||H||L1 (G) .
Theorem 10.3.43 (Hausdorff–Young inequality). Let 1 ≤ p ≤ 2 and p1 + 1q = 1. Then ||f|| q ≤ ||f ||Lp (G) and ||F −1 H||Lq (G) ≤ Let f ∈ Lp (G) and H ∈ Lp (G). G L (G) ||H||Lp (G) .
10.3. Function spaces on the unitary dual
549
Theorem 10.3.43 follows from the L1 → L∞ and L2 → L2 boundedness in Proposition 10.3.42 and in Proposition 10.3.17, respectively, by the following interpolation theorem in [14, Corollary 5.5.4] (which is also a consequence of Theorem 10.3.39): Theorem 10.3.44 (Stein–Weiss interpolation). Let 1 ≤ p0 , p1 , q0 , q1 < ∞ and let I0 dν), T : Lp1 (U, w1 dμ) → Lq1 (V, w I1 dν), T : Lp0 (U, w0 dμ) → Lq0 (V, w with norms M0 and M1 , respectively. Then * dν) T : Lp (U, w dμ) → Lq (V, w with norm M ≤ M01−θ M1θ , where and w *=w I0
p(1−θ)/p0
w I1
pθ/p1
1 p
θ = 1−θ p0 + p1 ,
1 q
p(1−θ)/p0
θ = 1−θ q0 + q1 , w = w0
pθ/p1
w1
.
We now turn to the duality between spaces Lp (G): Let 1 ≤ p < ∞ and Theorem 10.3.45 (Duality of Lp (G)). p q L (G) = L (G).
1 p
+
1 q
= 1. Then
Proof. The duality is given by the bracket ·, ·G in Definition 10.3.29: H, hG :=
dim(ξ) Tr (H(ξ) h(ξ)).
[ξ]∈G
and h ∈ Lq (G), using Lemma Assume first 1 < p < ∞. Then, if H ∈ Lp (G) 10.3.28 we get H, hG ≤
dim(ξ) H(ξ)HS h(ξ)HS
[ξ]∈G
=
(dim(ξ)) p − 2 H(ξ)HS (dim(ξ)) q − 2 h(ξ)HS 2
1
2
[ξ]∈G
⎛
≤⎝
⎞1/p
(dim(ξ))p(
2 1 p−2
)
p H(ξ)HS ⎠
[ξ]∈G
⎛
×⎝
⎞1/q (dim(ξ))q(
[ξ]∈G
= ||H||Lp (G) ||h||Lq (G) ,
2 1 q−2
)
q h(ξ)HS ⎠
1
550
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
where we also used the discrete H¨older inequality (Lemma 3.3.28). Let now p = 1. In this case we have (dim(ξ))3/2 H(ξ)HS (dim(ξ))−1/2 h(ξ)HS H, hG ≤ [ξ]∈G
≤
||H||L1 (G) ||h||L∞ (G) .
We leave the other part of the proof as an exercise.
Remark 10.3.46 (Sobolev spaces Lpk (G)). If we use difference operators α from k ∈ N, on the unitary Definition 10.7.1, we can also define Sobolev spaces Lpk (G), dual G by = H ∈ Lp (G) : α H ∈ Lp (G) for all |α| ≤ k . Lpk (G)
10.4
Symbols of operators
Let G be a compact Lie group. Let us endow D(G) = C ∞ (G) with the usual test function topology (which is the uniform topology of C ∞ (G); we refer the reader to Section 10.12 for some additional information on the topics of distributions and Schwartz kernels if more introduction is desirable). For a continuous linear operator A : C ∞ (G) → C ∞ (G), let KA , LA , RA ∈ D (G × G) denote respectively the Schwartz, left-convolution and right-convolution kernels, i.e., Af (x) = KA (x, y) f (y) dy G = LA (x, xy −1 ) f (y) dy G = f (y) RA (x, y −1 x) dy (10.16) G
in the sense of distributions. To simplify the notation in the sequel, we will often write integrals in the sense of distributions, with a standard distributional interpretation. Proposition 10.4.1 (Relations between kernels). We have RA (x, y) = LA (x, xyx−1 ), as well as RA (x, y) = KA (x, xy −1 ) and LA (x, y) = KA (x, y −1 x), with the standard distributional interpretation.
(10.17)
10.4. Symbols of operators
551
Proof. Equality (10.17) follows directly from (10.16). The proof of the last two equalities is just a change of variables. Indeed, (10.16) implies that KA (x, y) = RA (x, y −1 x). Denoting v = y −1 x, we have y = xv −1 , so that KA (x, xv −1 ) = RA (x, v). Similarly, KA (x, y) = LA (x, xy −1 ) from (10.16) and the change w = xy −1 yield y = w−1 x, and hence KA (x, w−1 x) = LA (x, w). We also note that left-invariant operators on C ∞ (G) correspond to right-convolutions f → f ∗ k. Since we identify the Lie algebra g of G with the left-invariant vector fields on G, it will be most natural to study right-convolution kernels in the sequel. Let us explain this in more detail: Remark 10.4.2 (Left or right?). For g ∈ D (G), define the respective left-convolution and right-convolution operators l(f ), r(f ) : C ∞ (G) → C ∞ (G) by := f ∗ g, := g ∗ f.
l(f )g r(f )g
In this notation, the relation between left- and right-convolution kernels of these convolution operators in the notation of (10.16) becomes Ll(f ) (x, y) = f (y) = Rr(f ) (x, y). Also, if Y ∈ g, then Proposition 10.2.9 implies that DY l(f ) = l(f )DY . Let the respective left and right regular representations of G be denoted by πL , πR : G → U(L2 (G)), i.e., πL (x)f (y) πR (x)f (y)
= =
f (x−1 y), f (yx).
Operator A is left-invariant if
πL (x) A = A πL (x), πR (x) A = A πR (x),
right-invariant if for every x ∈ G. Notice that A is
⇐⇒ ⇐⇒
left-invariant right-invariant
right-convolution, left-convolution.
Indeed, we have, for example, [πR (x)l(f )g](z)
(f ∗ g)(zx) = f (y) g(y −1 zx) dy G = f (y) (πR (x)g)(y −1 z) dy =
G
= =
[f ∗ πR (x)g](z) [l(f )πR (x)g](z),
so that πR (x)l(f ) = l(f )πR (x), and similarly πL (x)r(f ) = r(f )πL (x).
552
Chapter 10. Pseudo-differential Operators on Compact Lie Groups
The Lie algebras are often (but not always) identified with left-invariant vector fields, which are right-convolutions, that is why our starting choice in the sequel are right-convolution kernels. We refer to Remark 10.4.13 for a further discussion of left and right.
10.4.1 Full symbols We now define symbols of operators on G. Definition 10.4.3 (Symbols of operators on G). Let ξ : G → U (Hξ ) be an irreducible unitary representation. The symbol of a linear continuous operator A : C ∞ (G) → C ∞ (G) at x ∈ G and ξ ∈ Rep(G) is defined as σA (x, ξ) := rx (ξ) ∈ End(Hξ ), where rx (y) = RA (x, y) is the right-convolution kernel of A as in (10.16). Hence σA (x, ξ) = RA (x, y) ξ(y)∗ dy G
in the sense of distributions, and by Corollary 10.2.10 the right-convolution kernel can be regained from the symbol as well:

R_A(x, y) = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(y) σ_A(x, ξ)),   (10.18)

where this equality is interpreted distributionally. We now show that operator A can be represented by its symbol:

Theorem 10.4.4 (Quantization of operators). Let σ_A be the symbol of a continuous linear operator A : C^∞(G) → C^∞(G). Then

Af(x) = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) σ_A(x, ξ) f̂(ξ))   (10.19)
for every f ∈ C^∞(G) and x ∈ G.

Proof. Let us define a right-convolution operator A_{x₀} ∈ L(C^∞(G)) by its kernel R_A(x₀, y) = r_{x₀}(y), i.e., by

A_{x₀} f(x) := ∫_G f(y) r_{x₀}(y⁻¹x) dy = (f ∗ r_{x₀})(x).

Thus σ_{A_{x₀}}(x, ξ) = r̂_{x₀}(ξ) = σ_A(x₀, ξ),
10.4. Symbols of operators
so that by (10.6) we have

A_{x₀} f(x) = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) (A_{x₀}f)^(ξ))
           = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) σ_A(x₀, ξ) f̂(ξ)),
where we used that (f ∗ r_{x₀})^(ξ) = r̂_{x₀}(ξ) f̂(ξ) by Proposition 10.2.7. This implies the result, because Af(x) = A_x f(x) for each fixed x. □

Definition 10.4.3 and Theorem 10.4.4 justify the following notation:

Definition 10.4.5 (Pseudo-differential operators). For a symbol σ_A, the corresponding operator A defined by (10.19) will also be denoted by Op(σ_A). The operator defined by formula (10.19) will be called the pseudo-differential operator with symbol σ_A.

If we fix representations to be matrix representations, we can express all the formulae above in matrix components. Thus, if ξ : G → U(dim(ξ)) are irreducible unitary matrix representations, then
Af(x) = Σ_{[ξ]∈Ĝ} dim(ξ) Σ_{m,n=1}^{dim(ξ)} ξ(x)_{nm} ( Σ_{k=1}^{dim(ξ)} σ_A(x, ξ)_{mk} f̂(ξ)_{kn} ),
and as a consequence of (10.18) and Corollary 10.2.10 we also have formally:

R_A(x, y) = Σ_{[ξ]∈Ĝ} dim(ξ) Σ_{m,n=1}^{dim(ξ)} ξ(y)_{nm} σ_A(x, ξ)_{mn}.   (10.20)
Alternatively, setting Aξ(x)_{mn} := (A(ξ_{mn}))(x), we have

σ_A(x, ξ)_{mn} = Σ_{k=1}^{dim(ξ)} ξ̄_{km}(x) (Aξ_{kn})(x),   (10.21)

1 ≤ m, n ≤ dim(ξ), which follows from the following theorem:

Theorem 10.4.6 (Formula for the symbol via representations). Let σ_A be the symbol of a continuous linear operator A : C^∞(G) → C^∞(G). Then for all x ∈ G and ξ ∈ Rep(G) we have

σ_A(x, ξ) = ξ(x)* (Aξ)(x).   (10.22)
Proof. Working with matrix representations ξ : G → U(dim(ξ)), we have

Σ_{k=1}^{dim(ξ)} ξ̄_{km}(x) (Aξ_{kn})(x)
  =^{(10.19)} Σ_k ξ̄_{km}(x) Σ_{[η]∈Ĝ} dim(η) Tr(η(x) σ_A(x, η) ξ̂_{kn}(η))
  = Σ_k ξ̄_{km}(x) Σ_{[η]∈Ĝ} dim(η) Σ_{i,j,l} η(x)_{ij} σ_A(x, η)_{jl} ξ̂_{kn}(η)_{li}
  = Σ_{k,j} ξ̄_{km}(x) ξ(x)_{kj} σ_A(x, ξ)_{jn}
  =^{Rem. 10.2.4} σ_A(x, ξ)_{mn},
where we take η = ξ if η ∈ [ξ] in the sum, so that ξ̂_{kn}(η)_{li} = ⟨ξ_{kn}, η_{il}⟩_{L²} by (10.5), which equals 1/dim(ξ) if ξ = η, k = i and n = l, and zero otherwise. □

Remark 10.4.7 (Formula for symbol on Tⁿ). Since in the case of the torus G = Tⁿ by Remark 10.2.4 representations of Tⁿ are given by e_k(x) = e^{i2πx·k}, k ∈ Zⁿ, formula (10.22) gives the formula for the toroidal symbol

σ_A(x, k) := σ_A(x, e_k) = e^{−i2πx·k} (A e^{i2πx·k})(x),  k ∈ Zⁿ,

as in Theorem 4.1.4.

Remark 10.4.8 (Symbol of the Laplace operator). We note that by Theorem 10.3.13 the symbol of the Laplace operator L = L_G on G is σ_L(x, ξ) = −λ_{[ξ]} I_{dim ξ}, where I_{dim ξ} is the identity mapping on H_ξ and λ_{[ξ]} are the eigenvalues of −L.

Remark 10.4.9 (Symbol as a mapping on G × Ĝ). The symbol of A ∈ L(C^∞(G)) is a mapping

σ_A : G × Rep(G) → ⋃_{ξ∈Rep(G)} End(H_ξ),

where σ_A(x, ξ) ∈ End(H_ξ) for every x ∈ G and ξ ∈ Rep(G). However, it can be viewed as a mapping on the space G × Ĝ. Indeed, let ξ, η ∈ Rep(G) be equivalent via an intertwining isomorphism U ∈ Hom(ξ, η): i.e., such that there exists a linear unitary bijection U : H_ξ → H_η such that U η(x) = ξ(x) U for every x ∈ G, that is, η(x) = U⁻¹ ξ(x) U. Then by Remark 10.2.5 we have f̂(η) = U⁻¹ f̂(ξ) U, and hence also

σ_A(x, η) = U⁻¹ σ_A(x, ξ) U.

Therefore, taking any representation from the same class [ξ] ∈ Ĝ leads to the same operator A in view of the trace in formula (10.19). In this sense we may think that the symbol σ_A is defined on G × Ĝ instead of G × Rep(G).
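As a concrete sanity check (ours, not the book's), formulas (10.22) and (10.19) can be verified numerically on the finite cyclic group Z_N, where every irreducible unitary representation is a one-dimensional character, so symbols are scalar-valued and the quantization sum has dim(ξ) = 1:

```python
import numpy as np

# Toy verification on Z_N: compute the symbol of an arbitrary operator A via
# sigma_A(x, k) = e_k(x)^* (A e_k)(x)  (cf. (10.22)), then reconstruct A f via
# Af(x) = sum_k e_k(x) sigma_A(x, k) f_hat(k)  (cf. (10.19)).
N = 8
x = np.arange(N)
rng = np.random.default_rng(1)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

def e(k):                       # the character e_k(x) = exp(2 pi i k x / N)
    return np.exp(2j * np.pi * k * x / N)

# symbol via (10.22)
sigma = np.array([np.conj(e(k)) * (A @ e(k)) for k in range(N)]).T   # (x, k)

# Fourier coefficient with normalised Haar measure: f_hat(k) = mean_x f(x) e_k(x)^*
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
fhat = np.array([np.mean(f * np.conj(e(k))) for k in range(N)])

# quantization (10.19)
Af = sum(e(k) * sigma[:, k] * fhat[k] for k in range(N))
print(np.allclose(Af, A @ f))   # True: the symbol recovers the operator
```

This confirms, in the simplest commutative model, that the full symbol determines the operator exactly.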
Remark 10.4.10 (Symbol of right-convolution). Notice that if A = (f → f ∗ a), then R_A(x, y) = a(y) and hence σ_A(x, ξ) = â(ξ), so that (Af)^(ξ) = â(ξ) f̂(ξ) = σ_A(x, ξ) f̂(ξ).

Proposition 10.4.11 (Symbol of left-convolution). If B = (f → b ∗ f) is the left-convolution operator, then

L_B(x, y) = b(y),   R_B(x, y) = L_B(x, xyx⁻¹) = b(xyx⁻¹),

and the symbol of B is given by σ_B(x, ξ) = ξ(x)* b̂(ξ) ξ(x).

Exercise 10.4.12. Prove Proposition 10.4.11. (Hint: use (10.4) and (10.17).)

Remark 10.4.13 (What if we started with left-convolution kernels?). What if we had chosen right-invariant vector fields and corresponding left-convolution operators as the starting point of the Fourier analysis? Let us define another "Fourier transform" by

f̃(ξ) := ∫_G f(x) ξ(x) dx.
Then (f ∗ g)~(ξ) = f̃(ξ) g̃(ξ), and a continuous linear operator A : C^∞(G) → C^∞(G) can be presented by

Af(x) = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) σ_A(x, ξ) f̂(ξ))
      = Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x)* σ̃_A(x, ξ) f̃(ξ)),

where

σ̃_A(x, ξ) = (y ↦ L_A(x, y))~(ξ) = ξ(x) (A(ξ*))(x).

In the coming symbol considerations this left–right choice is encountered, e.g., as follows:

σ_{AB}(x, ξ) ∼ σ_A(x, ξ) σ_B(x, ξ) + · · ·   if we use right-convolutions,
σ̃_{AB}(x, ξ) ∼ σ̃_B(x, ξ) σ̃_A(x, ξ) + · · ·   if we use left-convolutions.
There is an explicit link between the left–right cases. We refer to Section 10.11 for a further discussion of these issues for the operator-valued symbols. At the same time, we note that these choices already determine the need to work with right actions on homogeneous spaces in Chapter 13, so that the homogeneous spaces there are K\G instead of G/K, see Remark 13.2.5 for the discussion of this issue.
Remark 10.4.14. We have now associated a unique full symbol σ_A to each continuous linear operator A : C^∞(G) → C^∞(G). Here σ_A(x, ξ) : H_ξ → H_ξ is a linear operator for each x ∈ G and each irreducible unitary representation ξ : G → U(H_ξ). The correspondence A ↦ σ_A is linear in the sense that

σ_{A+B}(x, ξ) = σ_A(x, ξ) + σ_B(x, ξ)  and  σ_{λA}(x, ξ) = λ σ_A(x, ξ),

where λ ∈ C. However, σ_{AB}(x, ξ) is not usually σ_A(x, ξ) σ_B(x, ξ) (unless B is a right-convolution operator, so that the symbol σ_B(x, ξ) = b̂(ξ) does not depend on the variable x ∈ G). A composition formula will be established in Theorem 10.7.8 below.
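The failure of exact multiplicativity, and its validity for right-convolutions, can be seen numerically in our toy Z_N model (an illustration of ours, where all symbols are scalar-valued):

```python
import numpy as np

# On Z_N: sigma_{AB} = sigma_A * sigma_B holds exactly when B is a
# convolution operator (x-independent symbol), but fails for a generic B.
N = 8
x = np.arange(N)
rng = np.random.default_rng(2)

def e(k):
    return np.exp(2j * np.pi * k * x / N)

def symbol(M):
    # sigma_M(x, k) = e_k(x)^* (M e_k)(x), cf. (10.22)
    return np.array([np.conj(e(k)) * (M @ e(k)) for k in range(N)]).T

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# B = convolution with a: a circulant matrix, symbol independent of x
a = rng.standard_normal(N)
B = np.array([[a[(i - j) % N] for j in range(N)] for i in range(N)])
print(np.allclose(symbol(A @ B), symbol(A) * symbol(B)))   # True

# for a generic (non-convolution) B2 the product formula is only asymptotic
B2 = rng.standard_normal((N, N))
print(np.allclose(symbol(A @ B2), symbol(A) * symbol(B2)))  # False
```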
10.4.2
Conjugation properties of symbols
In the sequel, we will need conjugation properties of symbols, which we will now analyse for this purpose.

Definition 10.4.15 (φ-pushforwards). Let φ : G → G be a diffeomorphism, f ∈ C^∞(G), and A : C^∞(G) → C^∞(G) continuous and linear. Then the φ-pushforwards f_φ ∈ C^∞(G) and A_φ : C^∞(G) → C^∞(G) are defined by

f_φ := f ◦ φ⁻¹,
A_φ f := (A(f_{φ⁻¹}))_φ = A(f ◦ φ) ◦ φ⁻¹.
Notice that A_{φ◦ψ} = (A_ψ)_φ.

Exercise 10.4.16. Using the local theory of pseudo-differential operators, show that A ∈ Ψ^μ(G) if and only if A_φ ∈ Ψ^μ(G).

Definition 10.4.17. For u ∈ G, let u_L, u_R : G → G be defined by

u_L(x) := ux  and  u_R(x) := xu.

Then (u_L)⁻¹ = (u⁻¹)_L and (u_R)⁻¹ = (u⁻¹)_R. The inner automorphism φ_u : G → G defined in Proposition 10.2.6 by φ_u(x) := u⁻¹xu satisfies

φ_u = (u⁻¹)_L ◦ u_R = u_R ◦ (u⁻¹)_L.
Proposition 10.4.18. Let u ∈ G, B = A_{u_L}, C = A_{u_R} and F = A_{φ_u}. Then we have the following relations between symbols:

σ_B(x, ξ) = σ_A(u⁻¹x, ξ),
σ_C(x, ξ) = ξ(u)* σ_A(xu⁻¹, ξ) ξ(u),
σ_F(x, ξ) = ξ(u)* σ_A(uxu⁻¹, ξ) ξ(u).
Especially, if A = (f → f ∗ a), i.e., σ_A(x, ξ) = â(ξ), then

σ_B(x, ξ) = â(ξ),
σ_C(x, ξ) = ξ(u)* â(ξ) ξ(u) = σ_F(x, ξ).
Proof. We notice that F = C_{(u⁻¹)_L}, so it suffices to consider only the operators B and C. For the operator B = A_{u_L}, we get

Bf(x) = ∫_G f(z) R_B(x, z⁻¹x) dz
      = A(f ◦ u_L)(u_L⁻¹(x))
      = ∫_G f(uy) R_A(u⁻¹x, y⁻¹u⁻¹x) dy
      = ∫_G f(z) R_A(u⁻¹x, z⁻¹x) dz,

so R_B(x, y) = R_A(u⁻¹x, y), yielding σ_B(x, ξ) = σ_A(u⁻¹x, ξ). For the operator C = A_{u_R}, we have

Cf(x) = ∫_G f(z) R_C(x, z⁻¹x) dz
      = A(f ◦ u_R)(u_R⁻¹(x))
      = ∫_G f(yu) R_A(xu⁻¹, y⁻¹xu⁻¹) dy
      = ∫_G f(z) R_A(xu⁻¹, uz⁻¹xu⁻¹) dz,

so that R_C(x, y) = R_A(xu⁻¹, uyu⁻¹), yielding

σ_C(x, ξ) = ∫_G R_C(x, y) ξ(y)* dy
          = ∫_G R_A(xu⁻¹, uyu⁻¹) ξ(y)* dy
          = ∫_G R_A(xu⁻¹, z) ξ(u⁻¹zu)* dz
          = ∫_G R_A(xu⁻¹, z) ξ(u)* ξ(z)* ξ(u) dz
          = ξ(u)* σ_A(xu⁻¹, ξ) ξ(u),

completing the proof. □
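Proposition 10.4.18 can be checked numerically in our toy Z_N model, written additively (so u_L(x) = u + x); since Z_N is abelian with scalar representations, u_R = u_L and the conjugation ξ(u)* · ξ(u) is trivial, and the proposition predicts σ_{A_{u_L}}(x, k) = σ_A(x − u, k):

```python
import numpy as np

# Check that the symbol of the translation pushforward A_{u_L} is the
# translated symbol, on the cyclic group Z_N (illustrative model only).
N, u = 8, 3
x = np.arange(N)
rng = np.random.default_rng(3)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

def e(k):
    return np.exp(2j * np.pi * k * x / N)

def symbol(M):
    return np.array([np.conj(e(k)) * (M @ e(k)) for k in range(N)]).T  # (x, k)

# A_{u_L} f = A(f o u_L) o u_L^{-1}; with (S f)(x) = f(x + u) this is S^{-1} A S
S = np.array([[1.0 if j == (i + u) % N else 0.0 for j in range(N)]
              for i in range(N)])
B = S.T @ A @ S          # S is a permutation, so S^{-1} = S^T

# sigma_B(x, k) = sigma_A(x - u, k): a roll of the symbol in the x-variable
print(np.allclose(symbol(B), np.roll(symbol(A), u, axis=0)))   # True
```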
Let us now record how push-forwards by translation affect vector fields.

Lemma 10.4.19 (Push-forwards of vector fields). Let u ∈ G, Y ∈ g, and let E = D_Y : C^∞(G) → C^∞(G) be defined by

D_Y f(x) = (d/dt) f(x exp(tY)) |_{t=0}.   (10.23)

Then E_{u_R} = E_{φ_u} = D_{u⁻¹Yu}, i.e.,

D_Y(f ◦ u_R)(xu⁻¹) = D_Y(f ◦ φ_u)(uxu⁻¹) = D_{u⁻¹Yu} f(x).
Proof. We have

E_{u_R} f(x) = E(f ◦ u_R)(xu⁻¹)
            = (d/dt) (f ◦ u_R)(xu⁻¹ exp(tY)) |_{t=0}
            = (d/dt) f(xu⁻¹ exp(tY) u) |_{t=0}
            = (d/dt) f(x exp(t u⁻¹Yu)) |_{t=0}
            = D_{u⁻¹Yu} f(x).

Due to the left-invariance, we have E_{u_L} = E, so that E_{φ_u} = (E_{(u⁻¹)_L})_{u_R} = E_{u_R} = D_{u⁻¹Yu}.
For more transparency, we also calculate directly:

E_{φ_u} f(x) = E(f ◦ φ_u)(uxu⁻¹)
            = (d/dt) (f ◦ φ_u)(uxu⁻¹ exp(tY)) |_{t=0}
            = (d/dt) f(xu⁻¹ exp(tY) u) |_{t=0}
            = (d/dt) f(x exp(t u⁻¹Yu)) |_{t=0}
            = D_{u⁻¹Yu} f(x),

yielding the same result. □
Remark 10.4.20 (Symbol of iD_Y can be diagonalised). Notice first that the complex vector field iD_Y is symmetric:

(iD_Y f, g)_{L²(G)} = ∫_G (iD_Y f)(x) ḡ(x) dx
                    = −i ∫_G f(x) D_Y ḡ(x) dx
                    = (f, iD_Y g)_{L²(G)}.
Hence it is always possible to choose a representative ξ ∈ Rep(G) from each [ξ] ∈ Ĝ such that σ_{iD_Y}(x, ξ) is a diagonal matrix diag(λ₁, …, λ_{dim(ξ)}) with diagonal entries λ_j ∈ R, which follows because symmetric matrices can be diagonalised by unitary matrices. Notice that then the commutator of symbols also satisfies

[σ_{iD_Y}, σ_A](x, ξ)_{mn} = (λ_m − λ_n) σ_A(x, ξ)_{mn}.
10.5
Boundedness of operators on L2 (G)
In this section we will state some natural conditions on the symbol of an operator A : C^∞(G) → C^∞(G) to guarantee its boundedness on L²(G). Recall first that the Hilbert–Schmidt inner product of matrices is defined as a special case of Definition B.5.43:

Definition 10.5.1 (Hilbert–Schmidt inner product). The Hilbert–Schmidt inner product of A, B ∈ C^{m×n} is

⟨A, B⟩_HS := Tr(B*A) = Σ_{i=1}^{m} Σ_{j=1}^{n} B̄_{ij} A_{ij},

with the corresponding norm ‖A‖_HS := ⟨A, A⟩_HS^{1/2}, and the operator norm

‖A‖_op := sup{ ‖Ax‖_{ℓ²} : x ∈ C^{n×1}, ‖x‖_{ℓ²} ≤ 1 } = ‖A‖_{ℓ²→ℓ²},

where ‖x‖_{ℓ²} = (Σ_{j=1}^{n} |x_j|²)^{1/2} is the usual Euclidean norm. Let A, B ∈ C^{n×n}. Then by Theorem 12.6.1 proved in Section 12.6 we have

‖AB‖_HS ≤ ‖A‖_op ‖B‖_HS.

Moreover, we also have

‖A‖_op = sup{ ‖AX‖_HS : X ∈ C^{n×n}, ‖X‖_HS ≤ 1 }.

By this, taking the Fourier transform of the convolution and using Plancherel's formula (Corollary 7.6.10), by Proposition 10.2.7 we get:

Proposition 10.5.2 (Operator norm of convolutions). We have

‖g ↦ f ∗ g‖_{L(L²(G))} = ‖g ↦ g ∗ f‖_{L(L²(G))} = sup_{ξ∈Rep(G)} ‖f̂(ξ)‖_op.   (10.24)

We also note that ‖f̂(ξ)‖_op = ‖f̂(η)‖_op if [ξ] = [η] ∈ Ĝ.
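Both the norm inequality above and the convolution norm identity (10.24) are easy to check numerically; the following sketch (our own illustration) does so with numpy, modelling the convolution operator on Z_N as a circulant matrix whose eigenvalues are the discrete Fourier coefficients:

```python
import numpy as np

# ||AB||_HS <= ||A||_op ||B||_HS for random complex matrices,
# and the L^2 operator norm of convolution on Z_N equals sup_k |f_hat(k)|.
rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

hs = lambda M: np.linalg.norm(M, 'fro')   # Hilbert-Schmidt (Frobenius) norm
op = lambda M: np.linalg.norm(M, 2)       # operator (spectral) norm

print(hs(A @ B) <= op(A) * hs(B) + 1e-9)  # True

# convolution g -> f * g on Z_N is the circulant matrix C with C[i,j] = f[i-j];
# its eigenvalues are the DFT values of f, so ||C||_op = max_k |fft(f)[k]|
N = 8
f = rng.standard_normal(N)
C = np.array([[f[(i - j) % N] for j in range(N)] for i in range(N)])
print(np.isclose(op(C), np.abs(np.fft.fft(f)).max()))   # True: (10.24) on Z_N
```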
We now extend this property to operators that are not necessarily left- or right-invariant. First we introduce derivatives of higher order on the Lie group G:

Definition 10.5.3 (Operators ∂^α on G). Let {Y_j}_{j=1}^{dim(G)} be a basis for the Lie algebra of G, and let ∂_j be the left-invariant vector fields corresponding to Y_j, ∂_j = D_{Y_j}, as in (10.23). For α ∈ N₀ⁿ, let us denote ∂^α = ∂₁^{α₁} · · · ∂ₙ^{αₙ}. Sometimes we denote these operators by ∂_x^α.

Remark 10.5.4 (Orderings). We note that unless G is commutative, the operators ∂_j do not in general commute. Thus, when we talk about "all operators ∂^α", we mean that we take these operators in all orderings. However, if we fix a certain ordering of the Y_j's, the commutator of a general ∂^α with the ∂^α taken in this particular ordering is an operator of lower order (this can be easily seen either from the simple properties of commutators in Exercise D.1.5 or from the general composition Theorem 10.7.9). The commutator is again a combination of operators of the form ∂^β with |β| ≤ |α| − 1. Thus, since usually we require some property to hold, for example, "for all ∂^α with |α| ≤ N", we can rely iteratively on the fact that the assumption is already satisfied for ∂^β, thus making this ordering issue less important.

Theorem 10.5.5 (Boundedness of operators on L²(G)). Let G be a compact Lie group of dimension n and let k be an integer such that k > n/2. Let σ_A be the symbol of a linear continuous operator A : C^∞(G) → C^∞(G). Assume that there is a constant C such that

‖∂_x^α σ_A(x, ξ)‖_op ≤ C

for all x ∈ G, all ξ ∈ Rep(G), and all |α| ≤ k. Then A extends to a bounded operator from L²(G) to L²(G).

Proof. Let Af(x) = (f ∗ r_A(x))(x), where r_A(x)(y) = R_A(x, y) is the right-convolution kernel of A. Let A_y f(x) = (f ∗ r_A(y))(x), so that A_x f(x) = Af(x). Then

‖Af‖²_{L²(G)} = ∫_G |A_x f(x)|² dx ≤ ∫_G sup_{y∈G} |A_y f(x)|² dx,

and by an application of the Sobolev embedding theorem we get

sup_{y∈G} |A_y f(x)|² ≤ C Σ_{|α|≤k} ∫_G |∂_y^α A_y f(x)|² dy.
Therefore, using the Fubini theorem to change the order of integration, we obtain

‖Af‖²_{L²(G)} ≤ C Σ_{|α|≤k} ∫_G ∫_G |∂_y^α A_y f(x)|² dx dy
             ≤ C Σ_{|α|≤k} sup_{y∈G} ∫_G |∂_y^α A_y f(x)|² dx
             = C Σ_{|α|≤k} sup_{y∈G} ‖∂_y^α A_y f‖²_{L²(G)}
             ≤ C Σ_{|α|≤k} sup_{y∈G} ‖f ↦ f ∗ ∂_y^α r_A(y)‖²_{L(L²(G))} ‖f‖²_{L²(G)}
             ≤^{(10.24)} C Σ_{|α|≤k} sup_{y∈G} sup_{[ξ]∈Ĝ} ‖∂_y^α σ_A(y, ξ)‖²_op ‖f‖²_{L²(G)},

where the last inequality holds due to (10.24). This completes the proof. □
10.6
Taylor expansion on Lie groups
As Taylor polynomial expansions are useful in obtaining symbolic calculus on Rⁿ, we would like to have analogous expansions on a group G. Here, the Taylor expansion formula on G will be obtained by embedding G into some R^m, using the Taylor expansion formula in R^m, and then restricting it back to G. Let U ⊂ R^m be an open neighbourhood of some point e ∈ R^m. The N-th order Taylor polynomial P_N f : R^m → C of f ∈ C^∞(U) at e is given by

P_N f(x) = Σ_{α∈N₀^m : |α|≤N} (1/α!) (x − e)^α ∂_x^α f(e).

Then the remainder E_N f := f − P_N f satisfies

E_N f(x) = Σ_{|α|=N+1} (x − e)^α f_α(x)

for some functions f_α ∈ C^∞(U). In particular, E_N f(x) = O(|x − e|^{N+1}) as x → e.
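A quick one-dimensional numerical illustration of the remainder estimate (our own example, not from the text): for f = exp, e = 0 and N = 2, the remainder divided by x³ stays bounded and approaches f′′′(0)/3! = 1/6 as x → 0.

```python
import numpy as np

# E_2 f(x) = exp(x) - (1 + x + x^2/2) behaves like x^3/6 near 0
f = np.exp
coeffs = [1.0, 1.0, 0.5]                   # f(0), f'(0), f''(0)/2!
P2 = lambda x: sum(c * x**j for j, c in enumerate(coeffs))

xs = np.array([1e-1, 1e-2, 1e-3])
ratios = (f(xs) - P2(xs)) / xs**3
print(np.allclose(ratios, 1/6, atol=0.02))   # True: remainder = O(x^3)
```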
Let G be a compact Lie group; we would like to approximate a smooth function u : G → C using a Taylor polynomial type expansion nearby the neutral element e ∈ G. By Corollary 8.0.4 we may assume that G is a closed subgroup of GL(n, R) ⊂ R^{n×n}, the group of real invertible (n × n)-matrices, and thus a closed submanifold of the Euclidean space of dimension m = n². This embedding of G into R^m will be denoted by x ↦ x̄, and the image of G under this embedding will still be denoted by G. Also, if x ∈ G, we may still write x for x̄ to simplify the notation. Let U ⊂ R^m be a small enough open neighbourhood of G ⊂ R^m such that for each x ∈ U there exists a unique nearest point p(x) ∈ G (with respect to the Euclidean distance). For u ∈ C^∞(G) we define f ∈ C^∞(U) by f(x) := u(p(x)).
The effect is that f is constant in the directions perpendicular to G. As above, we may define the Euclidean Taylor polynomial P_N f : R^m → C at e ∈ G ⊂ R^m. Let us define P_N u : G → C as the restriction, P_N u := P_N f |_G. We call P_N u ∈ C^∞(G) a Taylor polynomial of u of order N at e ∈ G. Then for x ∈ G, we have

u(x) − P_N u(x) = Σ_{|α|=N+1} (x − e)^α u_α(x)

for some functions u_α ∈ C^∞(G), where we set (x − e)^α := (x̄ − ē)^α. There should be no confusion with this notation because there is no subtraction on the group G, so subtracting group elements means subtracting them when they are embedded in a higher-dimensional linear space. Taylor polynomials on G are given by

P_N u(x) = Σ_{|α|≤N} (1/α!) (x − e)^α ∂_x^{(α)} u(e),

where we set

∂_x^{(α)} u(e) := ∂_x^α f(e).   (10.25)
Remark 10.6.1. We note that in this way we can obtain different forms of the Taylor series. For example, it may depend on the embedding of G into GL(n, R), on the choice of the coordinates in Rⁿ × Rⁿ, etc.

Let us now consider the example of G = SU(2). Recall the quaternionic identification

(x₀1 + x₁i + x₂j + x₃k ↦ (x₀, x₁, x₂, x₃)) : H → R⁴,

to be discussed in detail in Section 11.4. Moreover, there is the identification H ⊃ S³ ≅ SU(2), given by

x = (x₀, x₁, x₂, x₃) ↦ [ x₁₁ x₁₂ ; x₂₁ x₂₂ ] = [ x₀+ix₃  x₁+ix₂ ; −x₁+ix₂  x₀−ix₃ ] = x̄.

Hence we identify (1, 0, 0, 0) ∈ R⁴ with the neutral element of SU(2).

Remark 10.6.2. Notice that the functions

q₊(x) = x₁₂ = x₁ + ix₂,
q₋(x) = x₂₁ = −x₁ + ix₂,
q₀(x) = x₁₁ − x₂₂ = 2ix₃

also vanish at the identity element of SU(2).
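The quaternionic identification is easy to verify numerically (a sketch of ours): a unit quaternion maps to a unitary matrix of determinant 1, and q₊, q₋, q₀ vanish at the neutral element (1, 0, 0, 0).

```python
import numpy as np

# (x0, x1, x2, x3) on S^3 -> [[x0+i x3, x1+i x2], [-x1+i x2, x0-i x3]] in SU(2)
def to_su2(x0, x1, x2, x3):
    return np.array([[x0 + 1j * x3, x1 + 1j * x2],
                     [-x1 + 1j * x2, x0 - 1j * x3]])

rng = np.random.default_rng(5)
q = rng.standard_normal(4)
q /= np.linalg.norm(q)                 # random unit quaternion
X = to_su2(*q)
print(np.allclose(X @ X.conj().T, np.eye(2)))   # True: unitary
print(np.isclose(np.linalg.det(X), 1.0))        # True: det = 1

# q_+, q_-, q_0 at the neutral element e = (1, 0, 0, 0)
E = to_su2(1.0, 0.0, 0.0, 0.0)
q_plus, q_minus, q_0 = E[0, 1], E[1, 0], E[0, 0] - E[1, 1]
print(q_plus == 0 and q_minus == 0 and q_0 == 0)   # True
```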
A function u ∈ C^∞(S³) can be extended to f ∈ C^∞(U) = C^∞(R⁴ \ {0}) by f(x) := u(x/‖x‖). Therefore, we obtain P_N u ∈ C^∞(S³),

P_N u(x) := Σ_{|α|≤N} (1/α!) (x̄ − ē)^α ∂_x^α f(e),

where e = (1, 0, 0, 0). Expressing this in terms of x ∈ SU(2), we obtain Taylor polynomials for x ∈ SU(2) in the form

P_N u(x) = Σ_{|α|≤N} (1/α!) (x − e)^α ∂_x^{(α)} u(e),

where we write ∂_x^{(α)} u(e) := ∂_x^α f(e), and where

(x − e)^α := (x̄ − ē)^α = (x₀ − 1)^{α₁} x₁^{α₂} x₂^{α₃} x₃^{α₄}
  = ((x₁₁ + x₂₂)/2 − 1)^{α₁} ((x₁₂ − x₂₁)/2)^{α₂} ((x₁₂ + x₂₁)/(2i))^{α₃} ((x₁₁ − x₂₂)/(2i))^{α₄}.

This gives an example of possible Taylor monomials on SU(2).
10.7
Symbolic calculus
In this section, we study global symbols of pseudo-differential operators on compact Lie groups, as defined in Definition 10.4.3. We also derive elements of the calculus in quite general classes of symbols. For this, we first introduce difference operators acting on symbols in the ξ-variable. These are analogues of the ∂ξ -derivatives in Rn and of the difference operators ξ on Tn , and are obtained by the multiplication by “coordinate functions” on the Fourier transform side.
10.7.1
Difference operators
As explained in Section 10.6, smooth functions on a group G can be approximated by Taylor polynomial type expansions. More precisely, there exist partial differential operators ∂_x^{(α)} of order |α| on G such that for every u ∈ C^∞(G) we have

u(x) = Σ_{|α|≤N} (1/α!) q_α(x⁻¹) ∂_x^{(α)} u(e) + Σ_{|α|=N+1} q_α(x⁻¹) u_α(x)
     ∼ Σ_{α≥0} (1/α!) q_α(x⁻¹) ∂_x^{(α)} u(e)   (10.26)

in a neighbourhood of e ∈ G, where u_α ∈ C^∞(G), and q_α ∈ C^∞(G) satisfy q_{α+β} = q_α q_β, and ∂_x^{(α)} are as in (10.25). Moreover, here q₀ ≡ 1, and q_α(e) = 0 if |α| ≥ 1.

Definition 10.7.1 (Difference operators Δ_ξ^α). Let us define difference operators Δ_ξ^α acting on Fourier coefficients by

Δ_ξ^α f̂(ξ) := (q_α f)^(ξ).

Notice that Δ_ξ^{α+β} = Δ_ξ^α Δ_ξ^β.
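On the torus (or, in the following toy sketch of ours, on the cyclic group Z_N) the effect of multiplying by a "coordinate function" q vanishing at the identity is literally a finite difference on the Fourier side, which is what lowers the order of a symbol by one:

```python
import numpy as np

# On Z_N with q(y) = exp(2 pi i y / N) - 1 (which vanishes at y = 0),
# the map f_hat -> (q f)^ is a first difference in the frequency variable:
#     (q f)^(k) = f_hat(k - 1) - f_hat(k).
N = 16
y = np.arange(N)
q = np.exp(2j * np.pi * y / N) - 1

def fourier(f):
    # f_hat(k) = sum_y f(y) exp(-2 pi i k y / N)
    return np.fft.fft(f)

rng = np.random.default_rng(6)
f = rng.standard_normal(N)

lhs = fourier(q * f)
rhs = np.roll(fourier(f), 1) - fourier(f)   # f_hat(k-1) - f_hat(k)
print(np.allclose(lhs, rhs))   # True
```

For instance, applied to the first-order symbol k ↦ k, this difference produces a constant, i.e., a symbol of order zero.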
Remark 10.7.2. The technical choice of writing q_α(x⁻¹) in (10.26) is dictated by our desire to make the asymptotic formulae in Theorems 10.7.8 and 10.7.10 look similar to the familiar Euclidean formulae in Rⁿ, and by an obvious freedom in selecting among different forms of Taylor polynomials q_α, see Remark 10.6.1. For example, on SU(2), if we work with operators Δ₊, Δ₋, Δ₀ defined in (12.14)–(12.16), we can choose the form of the Taylor expansion (10.26) adapted to the functions q₊, q₋, q₀. On SU(2), we can observe that q₊(x⁻¹) = −q₋(x), q₋(x⁻¹) = −q₊(x), q₀(x⁻¹) = −q₀(x), so that for |α| = 1 the functions q_α(x) and q_α(x⁻¹) are linear combinations of q₊, q₋, q₀. In terms of the quaternionic identification, these are the functions from Remark 10.6.2. Taylor monomials (x − e)^α from the previous section, when restricted to SU(2), can be expressed in terms of the functions q₊, q₋, q₀. For an argument of this type we refer to the proof of Lemma 12.4.5.

Remark 10.7.3 (Differences reduce the order of symbols). In Theorem 12.3.6 we will apply the differences to the symbols of specific differential operators on SU(2). In general, on a compact Lie group G, a difference operator of order |γ| applied to a symbol of a partial differential operator of order N gives a symbol of order N − |γ|. More precisely:

Proposition 10.7.4 (Differences for symbols of differential operators). Let

D = Σ_{|α|≤N} c_α(x) ∂_x^α   (10.27)

be a partial differential operator with coefficients c_α ∈ C^∞(G), and ∂_x^α as in Definition 10.5.3. For q ∈ C^∞(G) such that q(e) = 0, we define the difference operator Δ_q acting on symbols by

Δ_q f̂(ξ) := (qf)^(ξ).

Then we obtain

Δ_q σ_D(x, ξ) = Σ_{|α|≤N} c_α(x) Σ_{β≤α} \binom{α}{β} (−1)^{|β|} (∂_x^β q)(e) σ_{∂_x^{α−β}}(x, ξ),   (10.28)
which is a symbol of a partial differential operator of order at most N − 1. More precisely, if q has a zero of order M at e ∈ G, then Op(Δ_q σ_D) is of order N − M.

Proof. Let D in (10.27) be a partial differential operator, where c_α ∈ C^∞(G) and ∂_x^α : D(G) → D(G) is left-invariant of order |α|. If |α| = 1 and φ, ψ ∈ C^∞(G), then we have the Leibniz property

∂_x^α(φψ) = φ (∂_x^α ψ) + (∂_x^α φ) ψ,

leading to

0 = ∫_G φ(x) ∂_x^α ψ(x) dx + ∫_G ∂_x^α φ(x) ψ(x) dx.
More generally, for |α| ∈ N₀, φ ∈ C^∞(G) and f ∈ D′(G), we have for the distributional derivatives

∫_G φ(x) ∂_x^α f(x) dx = (−1)^{|α|} ∫_G ∂_x^α φ(x) f(x) dx,

with a standard distributional interpretation. Recall that the right-convolution kernel R_A ∈ D′(G × G) of a continuous linear operator A : D(G) → D(G) satisfies

Aφ(x) = ∫_G φ(y) R_A(x, y⁻¹x) dy.
For instance, informally,

φ(x) = ∫_G φ(y) δ_e(y⁻¹x) dy = ∫_G φ(y) δ_x(y) dy,

where δ_p ∈ D′(G) is the Dirac delta distribution at p ∈ G. Notice that

∂_x^α φ(x) = (−1)^{|α|} ∫_G (∂_y^α φ)(xy⁻¹) δ_e(y) dy = ∫_G φ(xy⁻¹) ∂_y^α δ_e(y) dy.
The right-convolution kernel of the operator D from (10.27) is given by

R_D(x, y) = Σ_{|α|≤N} c_α(x) ∂_y^α δ_e(y).
Let Dq : C (G) → C (G) be defined by σDq (x, ξ) := q σD (x, ξ), i.e., RDq (x, y) = q(y) RD (x, y). Then Dq = Op(σDq ) is a differential operator: φ(xy −1 ) q(y) cα (x) ∂yα δe (y) dy Dq φ(x) = G
=
|α|≤N
(−1)|α| cα (x)
=
|α|≤N
∂yα φ(xy −1 ) q(y) δe (y) dy
G
|α|≤N
(−1)|α| cα (x)
+α , β≤α
β
(−1)|α−β| (∂xβ q)(e) ∂xα−β φ(x).
Thus

Δ_q σ_D(x, ξ) = Σ_{|α|≤N} c_α(x) Σ_{β≤α} \binom{α}{β} (−1)^{|β|} (∂_x^β q)(e) σ_{∂_x^{α−β}}(x, ξ).

Hence if q has a zero of order M at e ∈ G, then D_q is of order N − M. □
Exercise 10.7.5. Provide the distributional interpretation of all the steps in the proof of Proposition 10.7.4.
10.7.2
Commutator characterisation
Definition 10.7.6 (Operator classes A_k^m(M)). For a closed compact manifold M, let A_0^m(M) denote the set of those continuous linear operators A : C^∞(M) → C^∞(M) which are bounded from H^m(M) to L²(M). Recursively define A_{k+1}^m(M) ⊂ A_k^m(M) such that A ∈ A_k^m(M) belongs to A_{k+1}^m(M) if and only if [A, D] = AD − DA ∈ A_k^m(M) for every smooth vector field D on M.

We now recall a variant of the commutator characterisation of pseudo-differential operators given in Theorem 5.3.1, which assures that the behaviour of commutators in Sobolev spaces characterises pseudo-differential operators:

Theorem 10.7.7. A continuous linear operator A : C^∞(M) → C^∞(M) belongs to Ψ^m(M) if and only if A ∈ ⋂_{k=0}^∞ A_k^m(M).
We note that in such a characterisation on a compact Lie group M = G, it suffices to consider vector fields of the form D = M_φ ∂_x, where M_φ f := φf is multiplication by φ ∈ C^∞(G), and ∂_x is left-invariant. Notice that

[A, M_φ ∂_x] = M_φ [A, ∂_x] + [A, M_φ] ∂_x,

where [A, M_φ]f = A(φf) − φAf. Hence we need to consider compositions M_φ A, A M_φ, A ◦ ∂_x and ∂_x ◦ A. First, we observe that

σ_{M_φ A}(x, ξ) = φ(x) σ_A(x, ξ),   (10.29)
σ_{A◦∂_x}(x, ξ) = σ_A(x, ξ) σ_{∂_x}(x, ξ),   (10.30)
σ_{∂_x◦A}(x, ξ) = σ_{∂_x}(x, ξ) σ_A(x, ξ) + (∂_x σ_A)(x, ξ),   (10.31)
where σ_{∂_x}(x, ξ) is independent of x ∈ G. Here (10.29) and (10.30) are straightforward, and (10.31) follows by the Leibniz formula:

∂_x ◦ Af(x) = ∂_x Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) σ_A(x, ξ) f̂(ξ))
  = Σ_{[ξ]∈Ĝ} dim(ξ) Tr((∂_x ξ)(x) σ_A(x, ξ) f̂(ξ))
  + Σ_{[ξ]∈Ĝ} dim(ξ) Tr(ξ(x) ∂_x σ_A(x, ξ) f̂(ξ)),
where we used that σ_{∂_x}(x, ξ) = ξ(x)* (∂_x ξ)(x) by Theorem 10.4.6 to obtain (10.31). Next we claim that we have the formula

σ_{A M_φ}(x, ξ) ∼ Σ_{α≥0} (1/α!) (Δ_ξ^α σ_A)(x, ξ) ∂_x^{(α)} φ(x),

where ∂_x^{(α)} are certain partial differential operators of order |α|. This follows from the general composition formula in Theorem 10.7.8.
10.7.3
Calculus
Here we discuss elements of the symbolic calculus of operators. First we recall the fundamental quantity ⟨ξ⟩ from Definition 10.3.18 that will allow us to introduce the orders of operators. We note that this scale ⟨ξ⟩ on Ĝ is determined by the eigenvalues of the Laplace operator L on G. We now formulate the result on compositions:

Theorem 10.7.8 (Composition formula I). Let m₁, m₂ ∈ R and ρ > δ ≥ 0. Let A, B : C^∞(G) → C^∞(G) be continuous and linear, and let their symbols satisfy

‖Δ_ξ^α σ_A(x, ξ)‖_op ≤ C_α ⟨ξ⟩^{m₁−ρ|α|},
‖∂_x^β σ_B(x, ξ)‖_op ≤ C_β ⟨ξ⟩^{m₂+δ|β|},

for all multi-indices α and β, uniformly in x ∈ G and [ξ] ∈ Ĝ. Then

σ_{AB}(x, ξ) ∼ Σ_{α≥0} (1/α!) (Δ_ξ^α σ_A)(x, ξ) ∂_x^{(α)} σ_B(x, ξ),   (10.32)

where the asymptotic expansion means that for every N ∈ N we have

‖ σ_{AB}(x, ξ) − Σ_{|α|<N} (1/α!) (Δ_ξ^α σ_A)(x, ξ) ∂_x^{(α)} σ_B(x, ξ) ‖_op